This is the start of the stable review cycle for the 5.15.140 release. There are 297 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 26 Nov 2023 17:19:17 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.140-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 5.15.140-rc1
Saravana Kannan saravanak@google.com driver core: Release all resources during unbind before updating device links
Vicki Pfau vi@endrift.com Input: xpad - add VID for Turtle Beach controllers
Steven Rostedt (Google) rostedt@goodmis.org tracing: Have trace_event_file have ref counters
Michael Ellerman mpe@ellerman.id.au powerpc/powernv: Fix fortify source warnings in opal-prd.c
Jens Axboe axboe@kernel.dk io_uring/fdinfo: lock SQ thread while retrieving thread cpu/pid
Lewis Huang lewis.huang@amd.com drm/amd/display: Change the DMCUB mailbox memory location from FB to inbox
Christian König christian.koenig@amd.com drm/amdgpu: fix error handling in amdgpu_bo_list_get()
Alex Deucher alexander.deucher@amd.com drm/amdgpu: don't use ATRM for external devices
Kunwu Chan chentao@kylinos.cn drm/i915: Fix potential spectre vulnerability
Bas Nieuwenhuizen bas@basnieuwenhuizen.nl drm/amd/pm: Handle non-terminated overdrive commands.
Jan Kara jack@suse.cz ext4: properly sync file size update after O_SYNC direct IO
Kemeng Shi shikemeng@huaweicloud.com ext4: add missed brelse in update_backups
Kemeng Shi shikemeng@huaweicloud.com ext4: remove gdb backup copy for meta bg in setup_new_flex_group_blocks
Zhang Yi yi.zhang@huawei.com ext4: correct the start block of counting reserved clusters
Kemeng Shi shikemeng@huaweicloud.com ext4: correct return value of ext4_convert_meta_bg
Kemeng Shi shikemeng@huaweicloud.com ext4: correct offset of gdb backup in non meta_bg group to update_backups
Max Kellermann max.kellermann@ionos.com ext4: apply umask if ACL support is disabled
Heiner Kallweit hkallweit1@gmail.com Revert "net: r8169: Disable multicast filter for RTL8168H and RTL8107E"
Bryan O'Donoghue bryan.odonoghue@linaro.org media: qcom: camss: Fix missing vfe_lite clocks check
Bryan O'Donoghue bryan.odonoghue@linaro.org media: qcom: camss: Fix VFE-17x vfe_disable_output()
Bryan O'Donoghue bryan.odonoghue@linaro.org media: qcom: camss: Fix vfe_get() error jump
Bryan O'Donoghue bryan.odonoghue@linaro.org media: qcom: camss: Fix pm_domain_on sequence in probe
Victor Shih victor.shih@genesyslogic.com.tw mmc: sdhci-pci-gli: GL9750: Mask the replay timer timeout of AER
ChunHao Lin hau@realtek.com r8169: fix network lost after resume on DASH systems
Roman Gushchin roman.gushchin@linux.dev mm: kmem: drop __GFP_NOFAIL when allocating objcg vectors
Victor Shih victor.shih@genesyslogic.com.tw mmc: sdhci-pci-gli: A workaround to allow GL9750 to enter ASPM L1.2
Nam Cao namcaov@gmail.com riscv: kprobes: allow writing to x0
Mahmoud Adam mngyadam@amazon.com nfsd: fix file memleak on client_opens_release
Sakari Ailus sakari.ailus@linux.intel.com media: ccs: Correctly initialise try compose rectangle
Vikash Garodia quic_vgarodia@quicinc.com media: venus: hfi: add checks to handle capabilities from firmware
Vikash Garodia quic_vgarodia@quicinc.com media: venus: hfi: fix the check to handle session buffer requirement
Vikash Garodia quic_vgarodia@quicinc.com media: venus: hfi_parser: Add check to keep the number of codecs within range
Sean Young sean@mess.org media: sharp: fix sharp encoding
Sean Young sean@mess.org media: lirc: drop trailing space from scancode transmit
Su Hui suhui@nfschina.com f2fs: avoid format-overflow warning
Heiner Kallweit hkallweit1@gmail.com i2c: i801: fix potential race in i801_block_transaction_byte_by_byte
Klaus Kudielka klaus.kudielka@gmail.com net: phylink: initialize carrier state at creation
Alexander Sverdlin alexander.sverdlin@siemens.com net: dsa: lan9303: consequently nested-lock physical MDIO
Andrew Lunn andrew@lunn.ch net: ethtool: Fix documentation of ethtool_sprintf()
Harald Freudenberger freude@linux.ibm.com s390/ap: fix AP bus crash on early config change callback invocation
Tam Nguyen tamnguyenchi@os.amperecomputing.com i2c: designware: Disable TX_EMPTY irq while waiting for block length byte
Darren Hart darren@os.amperecomputing.com sbsa_gwdt: Calculate timeout with 64-bit math
Ondrej Mosnacek omosnace@redhat.com lsm: fix default return value for inode_getsecctx
Ondrej Mosnacek omosnace@redhat.com lsm: fix default return value for vm_enough_memory
Robert Marko robert.marko@sartura.hr Revert "i2c: pxa: move to generic GPIO recovery"
Johnathan Mantey johnathanx.mantey@intel.com Revert ncsi: Propagate carrier gain/loss events to the NCSI controller
Gaurav Batra gbatra@linux.vnet.ibm.com powerpc/pseries/iommu: enable_ddw incorrectly returns direct mapping for SR-IOV device
Alexey Kardashevskiy aik@ozlabs.ru powerpc/pseries/ddw: simplify enable_ddw()
Vignesh Viswanathan quic_viswanat@quicinc.com arm64: dts: qcom: ipq6018: Fix tcsr_mutex register size
Krzysztof Kozlowski krzysztof.kozlowski@linaro.org arm64: dts: qcom: ipq6018: switch TCSR mutex to MMIO
Namjae Jeon linkinjeon@kernel.org ksmbd: fix slab out of bounds write in smb_inherit_dacl()
Guan Wentao guanwentao@uniontech.com Bluetooth: btusb: Add 0bda:b85b for Fn-Link RTL8852BE
Masum Reza masumrezarock100@gmail.com Bluetooth: btusb: Add RTW8852BE device 13d3:3570 to device tables
Larry Finger Larry.Finger@lwfinger.net bluetooth: Add device 13d3:3571 to device tables
Larry Finger Larry.Finger@lwfinger.net bluetooth: Add device 0bda:887b to device tables
Artem Lukyanov dukzcry@ya.ru Bluetooth: btusb: Add Realtek RTL8852BE support ID 0x0cb8:0xc559
Christian Marangi ansuelsmth@gmail.com cpufreq: stats: Fix buffer overflow detection in trans_stats()
Mark Brown broonie@kernel.org regmap: Ensure range selector registers are updated after cache sync
Pavel Krasavin pkrasavin@imaqliq.com tty: serial: meson: fix hard LOCKUP on crtscts mode
Lad Prabhakar prabhakar.mahadev-lad.rj@bp.renesas.com serial: meson: Use platform_get_irq() to get the interrupt
Chandradeep Dey codesigning@chandradeepdey.com ALSA: hda/realtek - Enable internal speaker of ASUS K6500ZC
Kailang Yang kailang@realtek.com ALSA: hda/realtek - Add Dell ALC295 to pin fall back table
Takashi Iwai tiwai@suse.de ALSA: info: Fix potential deadlock at disconnection
Basavaraj Natikar Basavaraj.Natikar@amd.com xhci: Enable RPM on controllers that support low-power states
Helge Deller deller@gmx.de parisc/power: Fix power soft-off when running on qemu
Helge Deller deller@gmx.de parisc/pgtable: Do not drop upper 5 address bits of physical address
Helge Deller deller@gmx.de parisc: Prevent booting 64-bit kernels on PA1.x machines
Frank Li Frank.Li@nxp.com i3c: master: svc: fix SDA keep low when polling IBIWON timeout happen
Frank Li Frank.Li@nxp.com i3c: master: svc: fix check wrong status register in irq handler
Frank Li Frank.Li@nxp.com i3c: master: svc: fix ibi may not return mandatory data byte
Frank Li Frank.Li@nxp.com i3c: master: svc: fix wrong data return when IBI happen during start frame
Frank Li Frank.Li@nxp.com i3c: master: svc: fix race condition in ibi work thread
Joshua Yeong joshua.yeong@starfivetech.com i3c: master: cdns: Fix reading status register
Linus Walleij linus.walleij@linaro.org mtd: cfi_cmdset_0001: Byte swap OTP info
Zi Yan ziy@nvidia.com mm/memory_hotplug: use pfn math in place of direct struct page manipulation
Zi Yan ziy@nvidia.com mm/cma: use nth_page() in place of direct struct page manipulation
Heiko Carstens hca@linux.ibm.com s390/cmma: fix handling of swapper_pg_dir and invalid_pg_dir
Heiko Carstens hca@linux.ibm.com s390/cmma: fix detection of DAT pages
Heiko Carstens hca@linux.ibm.com s390/cmma: fix initial kernel address space page table walk
Alain Volmat alain.volmat@foss.st.com dmaengine: stm32-mdma: correct desc prep when channel running
Sanjuán García, Jorge Jorge.SanjuanGarcia@duagon.com mcb: fix error handling for different scenarios when parsing
Steven Rostedt (Google) rostedt@goodmis.org tracing: Have the user copy of synthetic event address use correct context
Benjamin Bara benjamin.bara@skidata.com i2c: core: Run atomic i2c xfer when !preemptible
Benjamin Bara benjamin.bara@skidata.com kernel/reboot: emergency_restart: Set correct system_state
Eric Biggers ebiggers@google.com quota: explicitly forbid quota files from being encrypted
Zhihao Cheng chengzhihao1@huawei.com jbd2: fix potential data lost in recovering journal raced with synchronizing fs bdev
Krzysztof Kozlowski krzysztof.kozlowski@linaro.org ASoC: codecs: wsa-macro: fix uninitialized stack variables with name prefix
Ilpo Järvinen ilpo.jarvinen@linux.intel.com selftests/resctrl: Reduce failures due to outliers in MBA/MBM tests
Ilpo Järvinen ilpo.jarvinen@linux.intel.com selftests/resctrl: Remove duplicate feature check from CMT test
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: split async and sync catchall in two functions
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: remove catchall element in GC sync path
Uwe Kleine-König u.kleine-koenig@pengutronix.de PCI: keystone: Don't discard .probe() callback
Uwe Kleine-König u.kleine-koenig@pengutronix.de PCI: keystone: Don't discard .remove() callback
Jarkko Sakkinen jarkko@kernel.org KEYS: trusted: Rollback init_trusted() consistently
Herve Codina herve.codina@bootlin.com genirq/generic_chip: Make irq_remove_generic_chip() irqdomain aware
Rong Chen rong.chen@amlogic.com mmc: meson-gx: Remove setting of CMD_CFG_ERROR
Johan Hovold johan+linaro@kernel.org wifi: ath11k: fix htt pktlog locking
Johan Hovold johan+linaro@kernel.org wifi: ath11k: fix dfs radar event locking
Johan Hovold johan+linaro@kernel.org wifi: ath11k: fix temperature event locking
Mimi Zohar zohar@linux.ibm.com ima: detect changes to the backing overlay file
Amir Goldstein amir73il@gmail.com ima: annotate iint mutex to avoid lockdep false positive warnings
Vasily Khoruzhick anarsoul@gmail.com ACPI: FPDT: properly handle invalid FPDT subtables
Kathiravan Thirumoorthy quic_kathirav@quicinc.com firmware: qcom_scm: use 64-bit calling convention only when client is 64-bit
Josef Bacik josef@toxicpanda.com btrfs: don't arbitrarily slow down delalloc if we're committing
Catalin Marinas catalin.marinas@arm.com rcu: kmemleak: Ignore kmemleak false positives when RCU-freeing objects
Brian Geffon bgeffon@google.com PM: hibernate: Clean up sync_read handling in snapshot_write_next()
Brian Geffon bgeffon@google.com PM: hibernate: Use __get_safe_page() rather than touching the list
Vignesh Viswanathan quic_viswanat@quicinc.com arm64: dts: qcom: ipq6018: Fix hwlock index for SMEM
Joel Fernandes (Google) joel@joelfernandes.org rcu/tree: Defer setting of jiffies during stall reset
Chuck Lever chuck.lever@oracle.com svcrdma: Drop connection after an RDMA Read error
Ajay Singh ajay.kathat@microchip.com wifi: wilc1000: use vmm_table as array in wilc struct
Uwe Kleine-König u.kleine-koenig@pengutronix.de PCI: exynos: Don't discard .remove() callback
Heiner Kallweit hkallweit1@gmail.com PCI/ASPM: Fix L1 substate handling in aspm_attr_store_common()
Nitin Yadav n-yadav@ti.com mmc: sdhci_am654: fix start loop index for TAP value parsing
Dan Carpenter dan.carpenter@linaro.org mmc: vub300: fix an error code
Kathiravan Thirumoorthy quic_kathirav@quicinc.com clk: qcom: ipq6018: drop the CLK_SET_RATE_PARENT flag from PLL clocks
Kathiravan Thirumoorthy quic_kathirav@quicinc.com clk: qcom: ipq8074: drop the CLK_SET_RATE_PARENT flag from PLL clocks
Gustavo A. R. Silva gustavoars@kernel.org clk: socfpga: Fix undefined behavior bug in struct stratix10_clock_data
Helge Deller deller@gmx.de parisc/power: Add power soft-off when running on qemu
Helge Deller deller@gmx.de parisc/pdc: Add width field to struct pdc_model
Nathan Chancellor nathan@kernel.org arm64: Restrict CPU_BIG_ENDIAN to GNU as or LLVM IAS 15.x or newer
Werner Sembach wse@tuxedocomputers.com ACPI: resource: Do IRQ override on TongFang GMxXGxx
Krister Johansen kjlx@templeofstupid.com watchdog: move softlockup_panic back to early_param
Lukas Wunner lukas@wunner.de PCI/sysfs: Protect driver's D3cold preference from user space
David Woodhouse dwmw@amazon.co.uk hvc/xen: fix event channel handling for secondary consoles
David Woodhouse dwmw@amazon.co.uk hvc/xen: fix error path in xen_hvc_init() to always register frontend driver
David Woodhouse dwmw@amazon.co.uk hvc/xen: fix console unplug
Muhammad Usama Anjum usama.anjum@collabora.com tty/sysrq: replace smp_processor_id() with get_cpu()
Paul Moore paul@paul-moore.com audit: don't WARN_ON_ONCE(!current->mm) in audit_exe_compare()
Paul Moore paul@paul-moore.com audit: don't take task_lock() in audit_exe_compare() code path
Maciej S. Szmigiero maciej.szmigiero@oracle.com KVM: x86: Ignore MSR_AMD64_TW_CFG access
Nicolas Saenz Julienne nsaenz@amazon.com KVM: x86: hyper-v: Don't auto-enable stimer on write from user-space
Pu Wen puwen@hygon.cn x86/cpu/hygon: Fix the CPU topology evaluation for real
Roxana Nicolescu roxana.nicolescu@canonical.com crypto: x86/sha - load modules based on CPU features
Quinn Tran qutran@marvell.com scsi: qla2xxx: Fix system crash due to bad pointer access
Chandrakanth patil chandrakanth.patil@broadcom.com scsi: megaraid_sas: Increase register read retry rount from 3 to 30 for selected registers
Ranjan Kumar ranjan.kumar@broadcom.com scsi: mpt3sas: Fix loop logic
Shung-Hsi Yu shung-hsi.yu@suse.com bpf: Fix precision tracking for BPF_ALU | BPF_TO_BE | BPF_END
Hao Sun sunhao.th@gmail.com bpf: Fix check_stack_write_fixed_off() to correctly spill imm
Kees Cook keescook@chromium.org randstruct: Fix gcc-plugin performance mode to stay in group
Nicholas Piggin npiggin@gmail.com powerpc/perf: Fix disabling BHRB and instruction sampling
Vikash Garodia quic_vgarodia@quicinc.com media: venus: hfi: add checks to perform sanity on queue pointers
Harshit Mogalapalli harshit.m.mogalapalli@oracle.com i915/perf: Fix NULL deref bugs with drm_dbg() calls
Li Zetao lizetao1@huawei.com xfs: Fix unreferenced object reported by kmemleak in xfs_sysfs_init()
Zeng Heng zengheng4@huawei.com xfs: fix memory leak in xfs_errortag_init
Guo Xuenan guoxuenan@huawei.com xfs: fix exception caused by unexpected illegal bestcount in leaf dir
Darrick J. Wong djwong@kernel.org xfs: avoid a UAF when log intent item recovery fails
hexiaole hexiaole@kylinos.cn xfs: fix inode reservation space for removing transaction
Chandan Babu R chandan.babu@oracle.com xfs: Fix false ENOSPC when performing direct write on a delalloc extent in cow fork
Gao Xiang hsiangkao@linux.alibaba.com xfs: add missing cmap->br_state = XFS_EXT_NORM update
Darrick J. Wong djwong@kernel.org xfs: fix intermittent hang during quotacheck
Darrick J. Wong djwong@kernel.org xfs: don't leak memory when attr fork loading fails
Darrick J. Wong djwong@kernel.org xfs: fix use-after-free in xattr node block inactivation
Zhang Yi yi.zhang@huawei.com xfs: flush inode gc workqueue before clearing agi bucket
Darrick J. Wong djwong@kernel.org xfs: prevent a UAF when log IO errors race with unmount
Kaixu Xia kaixuxia@tencent.com xfs: use invalidate_lock to check the state of mmap_lock
Darrick J. Wong djwong@kernel.org xfs: convert buf_cancel_table allocation to kmalloc_array
Darrick J. Wong djwong@kernel.org xfs: don't leak xfs_buf_cancel structures when recovery fails
Darrick J. Wong djwong@kernel.org xfs: refactor buffer cancellation table allocation
Ekaterina Esina eesina@astralinux.ru cifs: fix check of rc in function generate_smb3signingkey
Anastasia Belova abelova@astralinux.ru cifs: spnego: add ';' in HOST_KEY_LEN
Chen Yu yu.c.chen@intel.com tools/power/turbostat: Enable the C-state Pre-wake printing
Zhang Rui rui.zhang@intel.com tools/power/turbostat: Fix a knl bug
Vlad Buslov vladbu@nvidia.com macvlan: Don't propagate promisc change to lower dev in passthru
Rahul Rameshbabu rrameshbabu@nvidia.com net/mlx5e: Check return value of snprintf writing to fw_version buffer for representors
Saeed Mahameed saeedm@nvidia.com net/mlx5e: Reduce the size of icosq_str
Vlad Buslov vladbu@nvidia.com net/mlx5e: Fix pedit endianness
Paul Blakey paulb@nvidia.com net/mlx5e: Refactor mod header management API
Roi Dayan roid@nvidia.com net/mlx5e: Move mod hdr allocation to a single place
Roi Dayan roid@nvidia.com net/mlx5e: Remove incorrect addition of action fwd flag
Gavin Li gavinl@nvidia.com net/mlx5e: fix double free of encap_header in update funcs
Dust Li dust.li@linux.alibaba.com net/mlx5e: fix double free of encap_header
Baruch Siach baruch@tkos.co.il net: stmmac: fix rx budget limit check
Dan Carpenter dan.carpenter@linaro.org netfilter: nf_tables: fix pointer math issue in nft_byteorder_eval()
Florian Westphal fw@strlen.de netfilter: nf_tables: add and use BE register load-store helpers
Florian Westphal fw@strlen.de netfilter: nf_tables: use the correct get/put helpers
Linkui Xiao xiaolinkui@kylinos.cn netfilter: nf_conntrack_bridge: initialize err to 0
Eric Dumazet edumazet@google.com af_unix: fix use-after-free in unix_stream_read_actor()
Linus Walleij linus.walleij@linaro.org net: ethernet: cortina: Fix MTU max setting
Linus Walleij linus.walleij@linaro.org net: ethernet: cortina: Handle large frames
Linus Walleij linus.walleij@linaro.org net: ethernet: cortina: Fix max RX frame define
Eric Dumazet edumazet@google.com bonding: stop the device in bond_setup_by_slave()
Eric Dumazet edumazet@google.com ptp: annotate data-race around q->head and q->tail
Juergen Gross jgross@suse.com xen/events: fix delayed eoi list handling
Willem de Bruijn willemb@google.com ppp: limit MRU to 64K
Shigeru Yoshida syoshida@redhat.com tipc: Fix kernel-infoleak due to uninitialized TLV value
Jijie Shao shaojijie@huawei.com net: hns3: fix VF wrong speed and duplex issue
Jijie Shao shaojijie@huawei.com net: hns3: fix VF reset fail issue
Yonglong Liu liuyonglong@huawei.com net: hns3: fix variable may not initialized problem in hns3_init_mac_addr()
Jian Shen shenjian15@huawei.com net: hns3: fix incorrect capability bit display for copper port
Yonglong Liu liuyonglong@huawei.com net: hns3: add barrier in vf mailbox reply process
Jie Wang wangjie125@huawei.com net: hns3: add byte order conversion for PF to VF mailbox message
Jian Shen shenjian15@huawei.com net: hns3: refine the definition for struct hclge_pf_to_vf_msg
Jian Shen shenjian15@huawei.com net: hns3: fix add VLAN fail issue
Shigeru Yoshida syoshida@redhat.com tty: Fix uninit-value access in ppp_sync_receive()
Eric Dumazet edumazet@google.com ipvlan: add ipvlan_route_v6_outbound() helper
Stanislav Fomichev sdf@google.com net: set SOCK_RCU_FREE before inserting socket into hashtable
Martin KaFai Lau kafai@fb.com net: inet: Retire port only listening_hash
Martin KaFai Lau kafai@fb.com net: inet: Open code inet_hash2 and inet_unhash2
Martin KaFai Lau kafai@fb.com net: inet: Remove count from inet_listen_hashbucket
Florian Westphal fw@strlen.de mptcp: listen diag dump support
Florian Westphal fw@strlen.de mptcp: diag: switch to context structure
Andreas Gruenbacher agruenba@redhat.com gfs2: Silence "suspicious RCU usage in gfs2_permission" warning
felix fuzhen5@huawei.com SUNRPC: Fix RPC client cleaned up the freed pipefs dentries
Olga Kornievskaia kolga@netapp.com NFSv4.1: fix SP4_MACH_CRED protection for pnfs IO
Dan Carpenter dan.carpenter@linaro.org SUNRPC: Add an IS_ERR() check back to where it was
Marc Zyngier maz@kernel.org gpio: Add helpers to ease the transition towards immutable irq_chip
Marc Zyngier maz@kernel.org gpio: Expose the gpiochip_irq_re[ql]res helpers
Marc Zyngier maz@kernel.org gpio: Don't fiddle with irqchips marked as immutable
Trond Myklebust trond.myklebust@hammerspace.com SUNRPC: ECONNRESET might require a rebind
Marek Szyprowski m.szyprowski@samsung.com media: cec: meson: always include meson sub-directory in Makefile
Pratyush Yadav p.yadav@ti.com media: cadence: csi2rx: Unregister v4l2 async notifier
Finn Thain fthain@linux-m68k.org sched/core: Optimize in_task() and in_interrupt() a bit
Steven Rostedt (VMware) rostedt@goodmis.org tracing/perf: Add interrupt_context_level() helper
Steven Rostedt (VMware) rostedt@goodmis.org tracing: Reuse logic from perf's get_recursion_context()
Miri Korenblit miriam.rachel.korenblit@intel.com wifi: iwlwifi: Use FW rate for non-data frames
Dan Carpenter dan.carpenter@linaro.org pwm: Fix double shift bug
Vitaly Prosyak vitaly.prosyak@amd.com drm/amdgpu: fix software pci_unplug on some chips
Zongmin Zhou zhouzongmin@kylinos.cn drm/qxl: prevent memory leak
Tony Lindgren tony@atomide.com ASoC: ti: omap-mcbsp: Fix runtime PM underflow warnings
Philipp Stanner pstanner@redhat.com i2c: dev: copy userspace array safely
Douglas Anderson dianders@chromium.org kgdb: Flush console before entering kgdb on panic
Wayne Lin wayne.lin@amd.com drm/amd/display: Avoid NULL dereference of timing generator
Takashi Iwai tiwai@suse.de media: imon: fix access to invalid resource for the second interface
Sakari Ailus sakari.ailus@linux.intel.com media: ccs: Fix driver quirk struct documentation
Ilpo Järvinen ilpo.jarvinen@linux.intel.com media: cobalt: Use FIELD_GET() to extract Link Width
Al Viro viro@zeniv.linux.org.uk gfs2: fix an oops in gfs2_permission
Bob Peterson rpeterso@redhat.com gfs2: ignore negated quota changes
Hans Verkuil hverkuil-cisco@xs4all.nl media: vivid: avoid integer overflow
Rajeshwar R Shinde coolrrsh@gmail.com media: gspca: cpia1: shift-out-of-bounds in set_flicker
Billy Tsai billy_tsai@aspeedtech.com i3c: master: mipi-i3c-hci: Fix a kernel panic for accessing DAT_data.
zhenwei pi pizhenwei@bytedance.com virtio-blk: fix implicit overflow on virtio_max_dma_size
Axel Lin axel.lin@ingics.com i2c: sun6i-p2wi: Prevent potential division by zero
Jarkko Nikula jarkko.nikula@linux.intel.com i3c: mipi-i3c-hci: Fix out of bounds access in hci_dma_irq_handler
Dominique Martinet asmadeus@codewreck.org 9p: v9fs_listxattr: fix %s null argument warning
Marco Elver elver@google.com 9p/trans_fd: Annotate data-racy writes to file::f_flags
Hardik Gajjar hgajjar@de.adit-jv.com usb: gadget: f_ncm: Always set current gadget in ncm_bind()
Yi Yang yiyang13@huawei.com tty: vcc: Add check for kstrdup() in vcc_probe()
Yuezhang Mo Yuezhang.Mo@sony.com exfat: support handle zero-size directory
Jiri Kosina jkosina@suse.cz HID: Add quirk for Dell Pro Wireless Keyboard and Mouse KM5221W
Bjorn Helgaas bhelgaas@google.com PCI: Use FIELD_GET() in Sapphire RX 5600 XT Pulse quirk
Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com misc: pci_endpoint_test: Add Device ID for R-Car S4-8 PCIe controller
Bartosz Pawlowski bartosz.pawlowski@intel.com PCI: Disable ATS for specific Intel IPU E2000 devices
Bartosz Pawlowski bartosz.pawlowski@intel.com PCI: Extract ATS disabling to a helper function
Ilpo Järvinen ilpo.jarvinen@linux.intel.com PCI: Use FIELD_GET() to extract Link Width
Wenchao Hao haowenchao2@huawei.com scsi: libfc: Fix potential NULL pointer dereference in fc_lport_ptp_setup()
Ilpo Järvinen ilpo.jarvinen@linux.intel.com atm: iphase: Do PCI error checks on own line
Ilpo Järvinen ilpo.jarvinen@linux.intel.com PCI: tegra194: Use FIELD_GET()/FIELD_PREP() with Link Width fields
Cezary Rojewski cezary.rojewski@intel.com ALSA: hda: Fix possible null-ptr-deref when assigning a stream
Vincent Whitchurch vincent.whitchurch@axis.com ARM: 9320/1: fix stack depot IRQ stack filter
Mikhail Khvainitski me@khvoinitsky.org HID: lenovo: Detect quirk-free fw on cptkbd and stop applying workaround
Manas Ghandat ghandatmanas@gmail.com jfs: fix array-index-out-of-bounds in diAlloc
Manas Ghandat ghandatmanas@gmail.com jfs: fix array-index-out-of-bounds in dbFindLeaf
Juntong Deng juntong.deng@outlook.com fs/jfs: Add validity check for db_maxag and db_agpref
Juntong Deng juntong.deng@outlook.com fs/jfs: Add check for negative db_l2nbperpage
Tyrel Datwyler tyreld@linux.ibm.com scsi: ibmvfc: Remove BUG_ON in the case of an empty event pool
Yihang Li liyihang9@huawei.com scsi: hisi_sas: Set debugfs_dir pointer to NULL after removing debugfs
Ilpo Järvinen ilpo.jarvinen@linux.intel.com RDMA/hfi1: Use FIELD_GET() to extract Link Width
Lu Jialin lujialin4@huawei.com crypto: pcrypt - Fix hungtask for PADATA_RESET
Richard Fitzgerald rf@opensource.cirrus.com ASoC: soc-card: Add storage for PCI SSID
zhujun2 zhujun2@cmss.chinamobile.com selftests/efivarfs: create-read: fix a resource leak
Laurentiu Tudor laurentiu.tudor@nxp.com arm64: dts: ls208xa: use a pseudo-bus to constrain usb dma size
Qu Huang qu.huang@linux.dev drm/amdgpu: Fix a null pointer access when the smc_rreg pointer is NULL
Jesse Zhang jesse.zhang@amd.com drm/amdkfd: Fix shift out-of-bounds issue
Ondrej Jirman megi@xff.cz drm/panel: st7703: Pick different reset sequence
Ma Ke make_ruc2021@163.com drm/amdgpu/vkms: fix a possible null pointer dereference
Ma Ke make_ruc2021@163.com drm/panel/panel-tpo-tpg110: fix a possible null pointer dereference
Ma Ke make_ruc2021@163.com drm/panel: fix a possible null pointer dereference
Stanley.Yang Stanley.Yang@amd.com drm/amdgpu: Fix potential null pointer derefernce
Mario Limonciello mario.limonciello@amd.com drm/amd: Fix UBSAN array-index-out-of-bounds for Polaris and Tonga
Mario Limonciello mario.limonciello@amd.com drm/amd: Fix UBSAN array-index-out-of-bounds for SMU7
Jani Nikula jani.nikula@intel.com drm/msm/dp: skip validity check for DP CTS EDID checksum
Philipp Stanner pstanner@redhat.com drm: vmwgfx_surface.c: copy user-array safely
Philipp Stanner pstanner@redhat.com kernel: watch_queue: copy user-array safely
Philipp Stanner pstanner@redhat.com kernel: kexec: copy user-array safely
Philipp Stanner pstanner@redhat.com string.h: add array-wrappers for (v)memdup_user()
Wenjing Liu wenjing.liu@amd.com drm/amd/display: use full update for clip size increase of large plane source
Xiaogang Chen xiaogang.chen@amd.com drm/amdkfd: Fix a race condition of vram buffer unref in svm code
baozhu.liu lucas.liu@siengine.com drm/komeda: drop all currently held locks if deadlock happens
Olli Asikainen olli.asikainen@gmail.com platform/x86: thinkpad_acpi: Add battery quirk for Thinkpad X120e
ZhengHan Wang wzhmmmmm@gmail.com Bluetooth: Fix double free in hci_conn_cleanup
youwan Wang wangyouwan@126.com Bluetooth: btusb: Add date->evt_skb is NULL check
Douglas Anderson dianders@chromium.org wifi: ath10k: Don't touch the CE interrupt registers after power up
Eric Dumazet edumazet@google.com net: annotate data-races around sk->sk_dst_pending_confirm
Eric Dumazet edumazet@google.com net: annotate data-races around sk->sk_tx_queue_mapping
Dmitry Antipov dmantipov@yandex.ru wifi: ath10k: fix clang-specific fortify warning
Dmitry Antipov dmantipov@yandex.ru wifi: ath9k: fix clang-specific fortify warnings
Kumar Kartikeya Dwivedi memxor@gmail.com bpf: Detect IP == ksym.end as part of BPF program
Sieng-Piaw Liew liew.s.piaw@gmail.com atl1c: Work around the DMA RX overflow issue
Ping-Ke Shih pkshih@realtek.com wifi: mac80211: don't return unset power in ieee80211_get_tx_power()
Dmitry Antipov dmantipov@yandex.ru wifi: mac80211_hwsim: fix clang-specific fortify warning
Mike Rapoport (IBM) rppt@kernel.org x86/mm: Drop the 4 MB restriction on minimal NUMA node memory size
Frederic Weisbecker frederic@kernel.org workqueue: Provide one lock class key per work_on_cpu() callsite
Ronald Wahl ronald.wahl@raritan.com clocksource/drivers/timer-atmel-tcb: Fix initialization on SAM9 hardware
Jacky Bai ping.bai@nxp.com clocksource/drivers/timer-imx-gpt: Fix potential memory leak
Shuai Xue xueshuai@linux.alibaba.com perf/core: Bail out early if the request AUX area is out of bound
John Stultz jstultz@google.com locking/ww_mutex/test: Fix potential workqueue corruption
-------------
Diffstat:
Makefile | 4 +- arch/arm/include/asm/exception.h | 4 - arch/arm64/Kconfig | 2 + arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi | 46 ++-- arch/arm64/boot/dts/qcom/ipq6018.dtsi | 15 +- arch/parisc/include/uapi/asm/pdc.h | 1 + arch/parisc/kernel/entry.S | 7 +- arch/parisc/kernel/head.S | 5 +- arch/powerpc/perf/core-book3s.c | 5 +- arch/powerpc/platforms/powernv/opal-prd.c | 17 +- arch/powerpc/platforms/pseries/iommu.c | 19 +- arch/riscv/kernel/probes/simulate-insn.c | 2 +- arch/s390/mm/page-states.c | 25 ++- arch/x86/crypto/sha1_ssse3_glue.c | 12 ++ arch/x86/crypto/sha256_ssse3_glue.c | 12 ++ arch/x86/include/asm/msr-index.h | 1 + arch/x86/include/asm/numa.h | 7 - arch/x86/kernel/cpu/hygon.c | 8 +- arch/x86/kvm/hyperv.c | 10 +- arch/x86/kvm/x86.c | 2 + arch/x86/mm/numa.c | 7 - crypto/pcrypt.c | 4 + drivers/acpi/acpi_fpdt.c | 45 +++- drivers/acpi/resource.c | 12 ++ drivers/atm/iphase.c | 20 +- drivers/base/dd.c | 4 +- drivers/base/regmap/regcache.c | 30 +++ drivers/block/virtio_blk.c | 4 +- drivers/bluetooth/btusb.c | 15 ++ drivers/clk/qcom/gcc-ipq6018.c | 6 - drivers/clk/qcom/gcc-ipq8074.c | 6 - drivers/clk/socfpga/stratix10-clk.h | 4 +- drivers/clocksource/timer-atmel-tcb.c | 1 + drivers/clocksource/timer-imx-gpt.c | 18 +- drivers/cpufreq/cpufreq_stats.c | 14 +- drivers/dma/stm32-mdma.c | 4 +- drivers/firmware/qcom_scm.c | 7 + drivers/gpio/gpiolib.c | 13 +- drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c | 5 + drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 6 + drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 +- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 9 +- drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c | 2 + drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 13 +- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 13 +- drivers/gpu/drm/amd/display/dc/core/dc.c | 12 +- drivers/gpu/drm/amd/display/dc/core/dc_stream.c | 4 +- drivers/gpu/drm/amd/display/dc/dc.h | 5 + drivers/gpu/drm/amd/display/dmub/dmub_srv.h | 22 +- drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c | 32 ++- drivers/gpu/drm/amd/include/pptable.h | 4 +- drivers/gpu/drm/amd/pm/amdgpu_pm.c | 8 +- .../gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h | 16 +- .../drm/arm/display/komeda/komeda_pipeline_state.c | 9 +- drivers/gpu/drm/i915/gem/i915_gem_context.c | 1 + drivers/gpu/drm/i915/i915_perf.c | 15 +- drivers/gpu/drm/msm/dp/dp_panel.c | 21 +- drivers/gpu/drm/panel/panel-arm-versatile.c | 2 + drivers/gpu/drm/panel/panel-sitronix-st7703.c | 25 +-- drivers/gpu/drm/panel/panel-tpo-tpg110.c | 2 + drivers/gpu/drm/qxl/qxl_display.c | 3 + drivers/gpu/drm/vmwgfx/vmwgfx_surface.c | 4 +- drivers/hid/hid-ids.h | 1 + drivers/hid/hid-lenovo.c | 68 ++++-- drivers/hid/hid-quirks.c | 1 + drivers/i2c/busses/i2c-designware-master.c | 19 +- drivers/i2c/busses/i2c-i801.c | 19 +- drivers/i2c/busses/i2c-pxa.c | 76 ++++++- drivers/i2c/busses/i2c-sun6i-p2wi.c | 5 + drivers/i2c/i2c-core.h | 2 +- drivers/i2c/i2c-dev.c | 4 +- drivers/i3c/master/i3c-master-cdns.c | 6 +- drivers/i3c/master/mipi-i3c-hci/dat_v1.c | 29 ++- drivers/i3c/master/mipi-i3c-hci/dma.c | 2 +- drivers/i3c/master/svc-i3c-master.c | 45 +++- drivers/infiniband/hw/hfi1/pcie.c | 9 +- drivers/input/joystick/xpad.c | 1 + drivers/mcb/mcb-core.c | 1 + drivers/mcb/mcb-parse.c | 2 +- drivers/media/cec/platform/Makefile | 2 +- drivers/media/i2c/ccs/ccs-core.c | 2 +- drivers/media/i2c/ccs/ccs-quirk.h | 4 +- drivers/media/pci/cobalt/cobalt-driver.c | 11 +- drivers/media/platform/cadence/cdns-csi2rx.c | 7 +- drivers/media/platform/qcom/camss/camss-vfe-170.c | 22 +- drivers/media/platform/qcom/camss/camss-vfe.c | 5 +- drivers/media/platform/qcom/camss/camss.c | 12 +- drivers/media/platform/qcom/venus/hfi_msgs.c | 2 +- drivers/media/platform/qcom/venus/hfi_parser.c | 15 ++ drivers/media/platform/qcom/venus/hfi_venus.c | 10 + drivers/media/rc/imon.c | 6 + drivers/media/rc/ir-sharp-decoder.c | 8 +- drivers/media/rc/lirc_dev.c | 6 +- drivers/media/test-drivers/vivid/vivid-rds-gen.c | 2 +- drivers/media/usb/gspca/cpia1.c | 3 + drivers/misc/pci_endpoint_test.c | 4 + drivers/mmc/host/meson-gx-mmc.c | 1 - drivers/mmc/host/sdhci-pci-gli.c | 22 ++ drivers/mmc/host/sdhci_am654.c | 2 +- drivers/mmc/host/vub300.c | 1 + drivers/mtd/chips/cfi_cmdset_0001.c | 20 +- drivers/net/bonding/bond_main.c | 6 + drivers/net/dsa/lan9303_mdio.c | 4 +- drivers/net/ethernet/atheros/atl1c/atl1c.h | 3 - drivers/net/ethernet/atheros/atl1c/atl1c_main.c | 67 ++---- drivers/net/ethernet/cortina/gemini.c | 45 ++-- drivers/net/ethernet/cortina/gemini.h | 4 +- drivers/net/ethernet/hisilicon/hns3/hclge_mbx.h | 47 ++++- drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 2 +- .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 33 ++- .../net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c | 62 +++--- .../ethernet/hisilicon/hns3/hns3pf/hclge_trace.h | 2 +- .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 29 ++- .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.h | 3 +- .../ethernet/hisilicon/hns3/hns3vf/hclgevf_mbx.c | 87 +++++--- .../ethernet/hisilicon/hns3/hns3vf/hclgevf_trace.h | 2 +- .../net/ethernet/mellanox/mlx5/core/en/mod_hdr.c | 47 +++++ .../net/ethernet/mellanox/mlx5/core/en/mod_hdr.h | 13 ++ .../ethernet/mellanox/mlx5/core/en/reporter_rx.c | 4 +- .../net/ethernet/mellanox/mlx5/core/en/tc/sample.c | 5 +- drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c | 25 +-- .../net/ethernet/mellanox/mlx5/core/en/tc_tun.c | 30 ++- drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 12 +- drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 234 +++++++++------------ drivers/net/ethernet/mellanox/mlx5/core/en_tc.h | 5 - .../ethernet/mellanox/mlx5/core/esw/indir_table.c | 5 +- drivers/net/ethernet/realtek/r8169_main.c | 10 +- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 2 +- drivers/net/ipvlan/ipvlan_core.c | 41 ++-- drivers/net/macvlan.c | 2 +- drivers/net/phy/phylink.c | 1 + drivers/net/ppp/ppp_synctty.c | 6 +- drivers/net/wireless/ath/ath10k/debug.c | 2 +- drivers/net/wireless/ath/ath10k/snoc.c | 18 +- drivers/net/wireless/ath/ath11k/dp_rx.c | 8 +- drivers/net/wireless/ath/ath11k/wmi.c | 12 +- drivers/net/wireless/ath/ath9k/debug.c | 2 +- drivers/net/wireless/ath/ath9k/htc_drv_debug.c | 2 +- drivers/net/wireless/intel/iwlwifi/mvm/tx.c | 14 +- drivers/net/wireless/mac80211_hwsim.c | 2 +- drivers/net/wireless/microchip/wilc1000/wlan.c | 2 +- drivers/parisc/power.c | 16 +- drivers/pci/controller/dwc/pci-exynos.c | 4 +- drivers/pci/controller/dwc/pci-keystone.c | 8 +- drivers/pci/controller/dwc/pcie-tegra194.c | 9 +- drivers/pci/pci-acpi.c | 2 +- drivers/pci/pci-sysfs.c | 10 +- drivers/pci/pci.c | 13 +- drivers/pci/pcie/aspm.c | 2 + drivers/pci/quirks.c | 35 ++- drivers/platform/x86/thinkpad_acpi.c | 1 + drivers/ptp/ptp_chardev.c | 3 +- drivers/ptp/ptp_clock.c | 5 +- drivers/ptp/ptp_private.h | 8 +- drivers/ptp/ptp_sysfs.c | 3 +- drivers/s390/crypto/ap_bus.c | 4 + drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 13 +- drivers/scsi/ibmvscsi/ibmvfc.c | 124 ++++++++++- drivers/scsi/libfc/fc_lport.c | 6 + drivers/scsi/megaraid/megaraid_sas_base.c | 4 +- drivers/scsi/mpt3sas/mpt3sas_base.c | 4 +- drivers/scsi/qla2xxx/qla_os.c | 12 +- drivers/tty/hvc/hvc_xen.c | 39 +++- drivers/tty/serial/meson_uart.c | 25 ++- drivers/tty/sysrq.c | 3 +- drivers/tty/vcc.c | 16 +- drivers/usb/gadget/function/f_ncm.c | 27 +-- drivers/usb/host/xhci-pci.c | 4 +- drivers/watchdog/sbsa_gwdt.c | 4 +- drivers/xen/events/events_base.c | 4 +- fs/9p/xattr.c | 5 +- fs/btrfs/delalloc-space.c | 3 - fs/cifs/cifs_spnego.c | 4 +- fs/cifs/smb2transport.c | 5 +- fs/exfat/namei.c | 29 ++- fs/ext4/acl.h | 5 + fs/ext4/extents_status.c | 4 +- fs/ext4/file.c | 153 ++++++-------- fs/ext4/resize.c | 23 +- fs/f2fs/compress.c | 2 +- fs/gfs2/inode.c | 14 +- fs/gfs2/quota.c | 11 + fs/gfs2/super.c | 2 +- fs/jbd2/recovery.c | 8 + fs/jfs/jfs_dmap.c | 23 +- fs/jfs/jfs_imap.c | 5 +- fs/ksmbd/smbacl.c | 29 ++- fs/nfs/nfs4proc.c | 5 +- fs/nfsd/nfs4state.c | 2 +- fs/overlayfs/super.c | 2 +- fs/proc/proc_sysctl.c | 1 - fs/quota/dquot.c | 14 ++ fs/xfs/libxfs/xfs_dir2_leaf.c | 9 +- fs/xfs/libxfs/xfs_inode_fork.c | 1 + fs/xfs/libxfs/xfs_log_recover.h | 14 +- fs/xfs/libxfs/xfs_trans_resv.c | 2 +- fs/xfs/xfs_attr_inactive.c | 8 +- fs/xfs/xfs_buf_item_recover.c | 66 ++++++ fs/xfs/xfs_error.c | 9 +- fs/xfs/xfs_inode.c | 4 +- fs/xfs/xfs_log.c | 9 +- fs/xfs/xfs_log_priv.h | 3 - fs/xfs/xfs_log_recover.c | 44 ++-- fs/xfs/xfs_qm.c | 7 + fs/xfs/xfs_reflink.c | 197 ++++++++++++++--- fs/xfs/xfs_sysfs.h | 7 +- include/linux/ethtool.h | 4 +- include/linux/gpio/driver.h | 16 ++ include/linux/irq.h | 2 + include/linux/lsm_hook_defs.h | 4 +- include/linux/preempt.h | 36 +++- include/linux/pwm.h | 4 +- include/linux/string.h | 40 ++++ include/linux/sunrpc/clnt.h | 1 + include/linux/trace_events.h | 4 + include/linux/trace_recursion.h | 8 +- include/linux/workqueue.h | 46 +++- include/net/inet_connection_sock.h | 2 - include/net/inet_hashtables.h | 42 +--- include/net/netfilter/nf_tables.h | 19 +- include/net/sock.h | 26 ++- include/sound/soc-card.h | 37 ++++ include/sound/soc.h | 11 + io_uring/io_uring.c | 18 +- kernel/audit_watch.c | 9 +- kernel/bpf/core.c | 6 +- kernel/bpf/verifier.c | 9 +- kernel/debug/debug_core.c | 3 + kernel/events/internal.h | 7 +- kernel/events/ring_buffer.c | 6 + kernel/irq/debugfs.c | 1 + kernel/irq/generic-chip.c | 25 ++- kernel/kexec.c | 2 +- kernel/locking/test-ww_mutex.c | 20 +- kernel/padata.c | 2 +- kernel/power/snapshot.c | 16 +- kernel/rcu/tree.c | 21 ++ kernel/rcu/tree.h | 4 + kernel/rcu/tree_stall.h | 20 +- kernel/reboot.c | 1 + kernel/trace/ring_buffer.c | 9 +- kernel/trace/trace.c | 15 ++ kernel/trace/trace.h | 3 + kernel/trace/trace_events.c | 43 ++-- kernel/trace/trace_events_filter.c | 3 + kernel/trace/trace_events_synth.c | 2 +- kernel/watch_queue.c | 2 +- kernel/watchdog.c | 7 + kernel/workqueue.c | 20 +- mm/cma.c | 2 +- mm/memcontrol.c | 3 +- mm/memory_hotplug.c | 2 +- net/9p/client.c | 2 +- net/9p/trans_fd.c | 13 +- net/bluetooth/hci_conn.c | 6 +- net/bluetooth/hci_sysfs.c | 23 +- net/bridge/netfilter/nf_conntrack_bridge.c | 2 +- net/bridge/netfilter/nft_meta_bridge.c | 2 +- net/core/sock.c | 2 +- net/dccp/proto.c | 1 - net/ipv4/inet_diag.c | 5 +- net/ipv4/inet_hashtables.c | 121 +++-------- net/ipv4/tcp.c | 1 - net/ipv4/tcp_ipv4.c | 21 +- net/ipv4/tcp_output.c | 2 +- net/ipv6/inet6_hashtables.c | 5 +- net/mac80211/cfg.c | 4 + net/mptcp/mptcp_diag.c | 105 ++++++++- net/ncsi/ncsi-aen.c | 5 - net/netfilter/nf_tables_api.c | 53 +++-- net/netfilter/nft_byteorder.c | 6 +- net/netfilter/nft_meta.c | 2 +- net/netfilter/nft_osf.c | 2 +- net/netfilter/nft_socket.c | 8 +- net/netfilter/nft_tproxy.c | 6 +- net/netfilter/nft_xfrm.c | 8 +- net/sunrpc/clnt.c | 7 +- net/sunrpc/rpcb_clnt.c | 4 + net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 3 +- net/tipc/netlink_compat.c | 1 + net/unix/af_unix.c | 9 +- scripts/gcc-plugins/randomize_layout_plugin.c | 11 +- security/integrity/iint.c | 48 ++++- security/integrity/ima/ima_api.c | 5 + security/integrity/ima/ima_main.c | 16 +- security/integrity/integrity.h | 2 + security/keys/trusted-keys/trusted_core.c | 20 +- sound/core/info.c | 21 +- sound/hda/hdac_stream.c | 6 +- sound/pci/hda/patch_realtek.c | 20 +- sound/soc/codecs/lpass-wsa-macro.c | 3 + sound/soc/ti/omap-mcbsp.c | 6 +- tools/power/x86/turbostat/turbostat.c | 3 +- tools/testing/selftests/efivarfs/create-read.c | 2 + tools/testing/selftests/resctrl/cmt_test.c | 3 - tools/testing/selftests/resctrl/mba_test.c | 2 +- tools/testing/selftests/resctrl/mbm_test.c | 2 +- 298 files changed, 2886 insertions(+), 1445 deletions(-)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: John Stultz jstultz@google.com
[ Upstream commit bccdd808902f8c677317cec47c306e42b93b849e ]
In some cases running with the test-ww_mutex code, I was seeing odd behavior where sometimes it seemed flush_workqueue was returning before all the work threads were finished.
Often this would cause strange crashes as the mutexes would be freed while they were being used.
Looking at the code, there is a lifetime problem as the controlling thread that spawns the work allocates the "struct stress" structures that are passed to the workqueue threads. Then when the workqueue threads are finished, they free the stress struct that was passed to them.
Unfortunately the workqueue work_struct node is in the stress struct. Which means the work_struct is freed before the work thread returns and while flush_workqueue is waiting.
It seems like a better idea to have the controlling thread both allocate and free the stress structures, so that we can be sure we don't corrupt the workqueue by freeing the structure prematurely.
So this patch reworks the test to do so, and with this change I no longer see the early flush_workqueue returns.
Signed-off-by: John Stultz jstultz@google.com Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lore.kernel.org/r/20230922043616.19282-3-jstultz@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/locking/test-ww_mutex.c | 20 ++++++++++++-------- 1 file changed, 12 insertions(+), 8 deletions(-)
diff --git a/kernel/locking/test-ww_mutex.c b/kernel/locking/test-ww_mutex.c index 3e82f449b4ff7..da36997d8742c 100644 --- a/kernel/locking/test-ww_mutex.c +++ b/kernel/locking/test-ww_mutex.c @@ -426,7 +426,6 @@ static void stress_inorder_work(struct work_struct *work) } while (!time_after(jiffies, stress->timeout));
kfree(order); - kfree(stress); }
struct reorder_lock { @@ -491,7 +490,6 @@ static void stress_reorder_work(struct work_struct *work) list_for_each_entry_safe(ll, ln, &locks, link) kfree(ll); kfree(order); - kfree(stress); }
static void stress_one_work(struct work_struct *work) @@ -512,8 +510,6 @@ static void stress_one_work(struct work_struct *work) break; } } while (!time_after(jiffies, stress->timeout)); - - kfree(stress); }
#define STRESS_INORDER BIT(0) @@ -524,15 +520,24 @@ static void stress_one_work(struct work_struct *work) static int stress(int nlocks, int nthreads, unsigned int flags) { struct ww_mutex *locks; - int n; + struct stress *stress_array; + int n, count;
locks = kmalloc_array(nlocks, sizeof(*locks), GFP_KERNEL); if (!locks) return -ENOMEM;
+ stress_array = kmalloc_array(nthreads, sizeof(*stress_array), + GFP_KERNEL); + if (!stress_array) { + kfree(locks); + return -ENOMEM; + } + for (n = 0; n < nlocks; n++) ww_mutex_init(&locks[n], &ww_class);
+ count = 0; for (n = 0; nthreads; n++) { struct stress *stress; void (*fn)(struct work_struct *work); @@ -556,9 +561,7 @@ static int stress(int nlocks, int nthreads, unsigned int flags) if (!fn) continue;
- stress = kmalloc(sizeof(*stress), GFP_KERNEL); - if (!stress) - break; + stress = &stress_array[count++];
INIT_WORK(&stress->work, fn); stress->locks = locks; @@ -573,6 +576,7 @@ static int stress(int nlocks, int nthreads, unsigned int flags)
for (n = 0; n < nlocks; n++) ww_mutex_destroy(&locks[n]); + kfree(stress_array); kfree(locks);
return 0;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shuai Xue xueshuai@linux.alibaba.com
[ Upstream commit 54aee5f15b83437f23b2b2469bcf21bdd9823916 ]
When perf-record with a large AUX area, e.g 4GB, it fails with:
#perf record -C 0 -m ,4G -e arm_spe_0// -- sleep 1 failed to mmap with 12 (Cannot allocate memory)
and it reveals a WARNING with __alloc_pages():
------------[ cut here ]------------ WARNING: CPU: 44 PID: 17573 at mm/page_alloc.c:5568 __alloc_pages+0x1ec/0x248 Call trace: __alloc_pages+0x1ec/0x248 __kmalloc_large_node+0xc0/0x1f8 __kmalloc_node+0x134/0x1e8 rb_alloc_aux+0xe0/0x298 perf_mmap+0x440/0x660 mmap_region+0x308/0x8a8 do_mmap+0x3c0/0x528 vm_mmap_pgoff+0xf4/0x1b8 ksys_mmap_pgoff+0x18c/0x218 __arm64_sys_mmap+0x38/0x58 invoke_syscall+0x50/0x128 el0_svc_common.constprop.0+0x58/0x188 do_el0_svc+0x34/0x50 el0_svc+0x34/0x108 el0t_64_sync_handler+0xb8/0xc0 el0t_64_sync+0x1a4/0x1a8
'rb->aux_pages' allocated by kcalloc() is a pointer array which is used to maintains AUX trace pages. The allocated page for this array is physically contiguous (and virtually contiguous) with an order of 0..MAX_ORDER. If the size of pointer array crosses the limitation set by MAX_ORDER, it reveals a WARNING.
So bail out early with -ENOMEM if the request AUX area is out of bound, e.g.:
#perf record -C 0 -m ,4G -e arm_spe_0// -- sleep 1 failed to mmap with 12 (Cannot allocate memory)
Signed-off-by: Shuai Xue xueshuai@linux.alibaba.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/events/ring_buffer.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/kernel/events/ring_buffer.c b/kernel/events/ring_buffer.c index f40da32f5e753..6808873555f0d 100644 --- a/kernel/events/ring_buffer.c +++ b/kernel/events/ring_buffer.c @@ -696,6 +696,12 @@ int rb_alloc_aux(struct perf_buffer *rb, struct perf_event *event, watermark = 0; }
+ /* + * kcalloc_node() is unable to allocate buffer if the size is larger + * than: PAGE_SIZE << MAX_ORDER; directly bail out in this case. + */ + if (get_order((unsigned long)nr_pages * sizeof(void *)) > MAX_ORDER) + return -ENOMEM; rb->aux_pages = kcalloc_node(nr_pages, sizeof(void *), GFP_KERNEL, node); if (!rb->aux_pages)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jacky Bai ping.bai@nxp.com
[ Upstream commit 8051a993ce222a5158bccc6ac22ace9253dd71cb ]
Fix coverity Issue CID 250382: Resource leak (RESOURCE_LEAK). Add kfree when error return.
Signed-off-by: Jacky Bai ping.bai@nxp.com Reviewed-by: Peng Fan peng.fan@nxp.com Signed-off-by: Daniel Lezcano daniel.lezcano@linaro.org Link: https://lore.kernel.org/r/20231009083922.1942971-1-ping.bai@nxp.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clocksource/timer-imx-gpt.c | 18 +++++++++++++----- 1 file changed, 13 insertions(+), 5 deletions(-)
diff --git a/drivers/clocksource/timer-imx-gpt.c b/drivers/clocksource/timer-imx-gpt.c index 7b2c70f2f353b..fabff69e52e58 100644 --- a/drivers/clocksource/timer-imx-gpt.c +++ b/drivers/clocksource/timer-imx-gpt.c @@ -454,12 +454,16 @@ static int __init mxc_timer_init_dt(struct device_node *np, enum imx_gpt_type t return -ENOMEM;
imxtm->base = of_iomap(np, 0); - if (!imxtm->base) - return -ENXIO; + if (!imxtm->base) { + ret = -ENXIO; + goto err_kfree; + }
imxtm->irq = irq_of_parse_and_map(np, 0); - if (imxtm->irq <= 0) - return -EINVAL; + if (imxtm->irq <= 0) { + ret = -EINVAL; + goto err_kfree; + }
imxtm->clk_ipg = of_clk_get_by_name(np, "ipg");
@@ -472,11 +476,15 @@ static int __init mxc_timer_init_dt(struct device_node *np, enum imx_gpt_type t
ret = _mxc_timer_init(imxtm); if (ret) - return ret; + goto err_kfree;
initialized = 1;
return 0; + +err_kfree: + kfree(imxtm); + return ret; }
static int __init imx1_timer_init_dt(struct device_node *np)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ronald Wahl ronald.wahl@raritan.com
[ Upstream commit 6d3bc4c02d59996d1d3180d8ed409a9d7d5900e0 ]
On SAM9 hardware two cascaded 16 bit timers are used to form a 32 bit high resolution timer that is used as scheduler clock when the kernel has been configured that way (CONFIG_ATMEL_CLOCKSOURCE_TCB).
The driver initially triggers a reset-to-zero of the two timers but this reset is only performed on the next rising clock. For the first timer this is ok - it will be in the next 60ns (16MHz clock). For the chained second timer this will only happen after the first timer overflows, i.e. after 2^16 clocks (~4ms with a 16MHz clock). So with other words the scheduler clock resets to 0 after the first 2^16 clock cycles.
It looks like that the scheduler does not like this and behaves wrongly over its lifetime, e.g. some tasks are scheduled with a long delay. Why that is and if there are additional requirements for this behaviour has not been further analysed.
There is a simple fix for resetting the second timer as well when the first timer is reset and this is to set the ATMEL_TC_ASWTRG_SET bit in the Channel Mode register (CMR) of the first timer. This will also rise the TIOA line (clock input of the second timer) when a software trigger respective SYNC is issued.
Signed-off-by: Ronald Wahl ronald.wahl@raritan.com Acked-by: Alexandre Belloni alexandre.belloni@bootlin.com Signed-off-by: Daniel Lezcano daniel.lezcano@linaro.org Link: https://lore.kernel.org/r/20231007161803.31342-1-rwahl@gmx.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clocksource/timer-atmel-tcb.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/clocksource/timer-atmel-tcb.c b/drivers/clocksource/timer-atmel-tcb.c index 27af17c995900..2a90c92a9182a 100644 --- a/drivers/clocksource/timer-atmel-tcb.c +++ b/drivers/clocksource/timer-atmel-tcb.c @@ -315,6 +315,7 @@ static void __init tcb_setup_dual_chan(struct atmel_tc *tc, int mck_divisor_idx) writel(mck_divisor_idx /* likely divide-by-8 */ | ATMEL_TC_WAVE | ATMEL_TC_WAVESEL_UP /* free-run */ + | ATMEL_TC_ASWTRG_SET /* TIOA0 rises at software trigger */ | ATMEL_TC_ACPA_SET /* TIOA0 rises at 0 */ | ATMEL_TC_ACPC_CLEAR, /* (duty cycle 50%) */ tcaddr + ATMEL_TC_REG(0, CMR));
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Frederic Weisbecker frederic@kernel.org
[ Upstream commit 265f3ed077036f053981f5eea0b5b43e7c5b39ff ]
All callers of work_on_cpu() share the same lock class key for all the functions queued. As a result the workqueue related locking scenario for a function A may be spuriously accounted as an inversion against the locking scenario of function B such as in the following model:
long A(void *arg) { mutex_lock(&mutex); mutex_unlock(&mutex); }
long B(void *arg) { }
void launchA(void) { work_on_cpu(0, A, NULL); }
void launchB(void) { mutex_lock(&mutex); work_on_cpu(1, B, NULL); mutex_unlock(&mutex); }
launchA and launchB running concurrently have no chance to deadlock. However the above can be reported by lockdep as a possible locking inversion because the works containing A() and B() are treated as belonging to the same locking class.
The following shows an existing example of such a spurious lockdep splat:
====================================================== WARNING: possible circular locking dependency detected 6.6.0-rc1-00065-g934ebd6e5359 #35409 Not tainted ------------------------------------------------------ kworker/0:1/9 is trying to acquire lock: ffffffff9bc72f30 (cpu_hotplug_lock){++++}-{0:0}, at: _cpu_down+0x57/0x2b0
but task is already holding lock: ffff9e3bc0057e60 ((work_completion)(&wfc.work)){+.+.}-{0:0}, at: process_scheduled_works+0x216/0x500
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #2 ((work_completion)(&wfc.work)){+.+.}-{0:0}: __flush_work+0x83/0x4e0 work_on_cpu+0x97/0xc0 rcu_nocb_cpu_offload+0x62/0xb0 rcu_nocb_toggle+0xd0/0x1d0 kthread+0xe6/0x120 ret_from_fork+0x2f/0x40 ret_from_fork_asm+0x1b/0x30
-> #1 (rcu_state.barrier_mutex){+.+.}-{3:3}: __mutex_lock+0x81/0xc80 rcu_nocb_cpu_deoffload+0x38/0xb0 rcu_nocb_toggle+0x144/0x1d0 kthread+0xe6/0x120 ret_from_fork+0x2f/0x40 ret_from_fork_asm+0x1b/0x30
-> #0 (cpu_hotplug_lock){++++}-{0:0}: __lock_acquire+0x1538/0x2500 lock_acquire+0xbf/0x2a0 percpu_down_write+0x31/0x200 _cpu_down+0x57/0x2b0 __cpu_down_maps_locked+0x10/0x20 work_for_cpu_fn+0x15/0x20 process_scheduled_works+0x2a7/0x500 worker_thread+0x173/0x330 kthread+0xe6/0x120 ret_from_fork+0x2f/0x40 ret_from_fork_asm+0x1b/0x30
other info that might help us debug this:
Chain exists of: cpu_hotplug_lock --> rcu_state.barrier_mutex --> (work_completion)(&wfc.work)
Possible unsafe locking scenario:
CPU0 CPU1 ---- ---- lock((work_completion)(&wfc.work)); lock(rcu_state.barrier_mutex); lock((work_completion)(&wfc.work)); lock(cpu_hotplug_lock);
*** DEADLOCK ***
2 locks held by kworker/0:1/9: #0: ffff900481068b38 ((wq_completion)events){+.+.}-{0:0}, at: process_scheduled_works+0x212/0x500 #1: ffff9e3bc0057e60 ((work_completion)(&wfc.work)){+.+.}-{0:0}, at: process_scheduled_works+0x216/0x500
stack backtrace: CPU: 0 PID: 9 Comm: kworker/0:1 Not tainted 6.6.0-rc1-00065-g934ebd6e5359 #35409 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba5276e321-prebuilt.qemu.org 04/01/2014 Workqueue: events work_for_cpu_fn Call Trace: rcu-torture: rcu_torture_read_exit: Start of episode <TASK> dump_stack_lvl+0x4a/0x80 check_noncircular+0x132/0x150 __lock_acquire+0x1538/0x2500 lock_acquire+0xbf/0x2a0 ? _cpu_down+0x57/0x2b0 percpu_down_write+0x31/0x200 ? _cpu_down+0x57/0x2b0 _cpu_down+0x57/0x2b0 __cpu_down_maps_locked+0x10/0x20 work_for_cpu_fn+0x15/0x20 process_scheduled_works+0x2a7/0x500 worker_thread+0x173/0x330 ? __pfx_worker_thread+0x10/0x10 kthread+0xe6/0x120 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x2f/0x40 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1b/0x30 </TASK
Fix this with providing one lock class key per work_on_cpu() caller.
Reported-and-tested-by: Paul E. McKenney paulmck@kernel.org Signed-off-by: Frederic Weisbecker frederic@kernel.org Signed-off-by: Tejun Heo tj@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/workqueue.h | 46 +++++++++++++++++++++++++++++++++------ kernel/workqueue.c | 20 ++++++++++------- 2 files changed, 51 insertions(+), 15 deletions(-)
diff --git a/include/linux/workqueue.h b/include/linux/workqueue.h index 20a47eb94b0f3..1e96680f50230 100644 --- a/include/linux/workqueue.h +++ b/include/linux/workqueue.h @@ -222,18 +222,16 @@ static inline unsigned int work_static(struct work_struct *work) { return 0; } * to generate better code. */ #ifdef CONFIG_LOCKDEP -#define __INIT_WORK(_work, _func, _onstack) \ +#define __INIT_WORK_KEY(_work, _func, _onstack, _key) \ do { \ - static struct lock_class_key __key; \ - \ __init_work((_work), _onstack); \ (_work)->data = (atomic_long_t) WORK_DATA_INIT(); \ - lockdep_init_map(&(_work)->lockdep_map, "(work_completion)"#_work, &__key, 0); \ + lockdep_init_map(&(_work)->lockdep_map, "(work_completion)"#_work, (_key), 0); \ INIT_LIST_HEAD(&(_work)->entry); \ (_work)->func = (_func); \ } while (0) #else -#define __INIT_WORK(_work, _func, _onstack) \ +#define __INIT_WORK_KEY(_work, _func, _onstack, _key) \ do { \ __init_work((_work), _onstack); \ (_work)->data = (atomic_long_t) WORK_DATA_INIT(); \ @@ -242,12 +240,22 @@ static inline unsigned int work_static(struct work_struct *work) { return 0; } } while (0) #endif
+#define __INIT_WORK(_work, _func, _onstack) \ + do { \ + static __maybe_unused struct lock_class_key __key; \ + \ + __INIT_WORK_KEY(_work, _func, _onstack, &__key); \ + } while (0) + #define INIT_WORK(_work, _func) \ __INIT_WORK((_work), (_func), 0)
#define INIT_WORK_ONSTACK(_work, _func) \ __INIT_WORK((_work), (_func), 1)
+#define INIT_WORK_ONSTACK_KEY(_work, _func, _key) \ + __INIT_WORK_KEY((_work), (_func), 1, _key) + #define __INIT_DELAYED_WORK(_work, _func, _tflags) \ do { \ INIT_WORK(&(_work)->work, (_func)); \ @@ -632,8 +640,32 @@ static inline long work_on_cpu_safe(int cpu, long (*fn)(void *), void *arg) return fn(arg); } #else -long work_on_cpu(int cpu, long (*fn)(void *), void *arg); -long work_on_cpu_safe(int cpu, long (*fn)(void *), void *arg); +long work_on_cpu_key(int cpu, long (*fn)(void *), + void *arg, struct lock_class_key *key); +/* + * A new key is defined for each caller to make sure the work + * associated with the function doesn't share its locking class. + */ +#define work_on_cpu(_cpu, _fn, _arg) \ +({ \ + static struct lock_class_key __key; \ + \ + work_on_cpu_key(_cpu, _fn, _arg, &__key); \ +}) + +long work_on_cpu_safe_key(int cpu, long (*fn)(void *), + void *arg, struct lock_class_key *key); + +/* + * A new key is defined for each caller to make sure the work + * associated with the function doesn't share its locking class. + */ +#define work_on_cpu_safe(_cpu, _fn, _arg) \ +({ \ + static struct lock_class_key __key; \ + \ + work_on_cpu_safe_key(_cpu, _fn, _arg, &__key); \ +}) #endif /* CONFIG_SMP */
#ifdef CONFIG_FREEZER diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 19868cf588779..962ee27ec7d70 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -5209,50 +5209,54 @@ static void work_for_cpu_fn(struct work_struct *work) }
/** - * work_on_cpu - run a function in thread context on a particular cpu + * work_on_cpu_key - run a function in thread context on a particular cpu * @cpu: the cpu to run on * @fn: the function to run * @arg: the function arg + * @key: The lock class key for lock debugging purposes * * It is up to the caller to ensure that the cpu doesn't go offline. * The caller must not hold any locks which would prevent @fn from completing. * * Return: The value @fn returns. */ -long work_on_cpu(int cpu, long (*fn)(void *), void *arg) +long work_on_cpu_key(int cpu, long (*fn)(void *), + void *arg, struct lock_class_key *key) { struct work_for_cpu wfc = { .fn = fn, .arg = arg };
- INIT_WORK_ONSTACK(&wfc.work, work_for_cpu_fn); + INIT_WORK_ONSTACK_KEY(&wfc.work, work_for_cpu_fn, key); schedule_work_on(cpu, &wfc.work); flush_work(&wfc.work); destroy_work_on_stack(&wfc.work); return wfc.ret; } -EXPORT_SYMBOL_GPL(work_on_cpu); +EXPORT_SYMBOL_GPL(work_on_cpu_key);
/** - * work_on_cpu_safe - run a function in thread context on a particular cpu + * work_on_cpu_safe_key - run a function in thread context on a particular cpu * @cpu: the cpu to run on * @fn: the function to run * @arg: the function argument + * @key: The lock class key for lock debugging purposes * * Disables CPU hotplug and calls work_on_cpu(). The caller must not hold * any locks which would prevent @fn from completing. * * Return: The value @fn returns. */ -long work_on_cpu_safe(int cpu, long (*fn)(void *), void *arg) +long work_on_cpu_safe_key(int cpu, long (*fn)(void *), + void *arg, struct lock_class_key *key) { long ret = -ENODEV;
cpus_read_lock(); if (cpu_online(cpu)) - ret = work_on_cpu(cpu, fn, arg); + ret = work_on_cpu_key(cpu, fn, arg, key); cpus_read_unlock(); return ret; } -EXPORT_SYMBOL_GPL(work_on_cpu_safe); +EXPORT_SYMBOL_GPL(work_on_cpu_safe_key); #endif /* CONFIG_SMP */
#ifdef CONFIG_FREEZER
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mike Rapoport (IBM) rppt@kernel.org
[ Upstream commit a1e2b8b36820d8c91275f207e77e91645b7c6836 ]
Qi Zheng reported crashes in a production environment and provided a simplified example as a reproducer:
| For example, if we use Qemu to start a two NUMA node kernel, | one of the nodes has 2M memory (less than NODE_MIN_SIZE), | and the other node has 2G, then we will encounter the | following panic: | | BUG: kernel NULL pointer dereference, address: 0000000000000000 | <...> | RIP: 0010:_raw_spin_lock_irqsave+0x22/0x40 | <...> | Call Trace: | <TASK> | deactivate_slab() | bootstrap() | kmem_cache_init() | start_kernel() | secondary_startup_64_no_verify()
The crashes happen because of inconsistency between the nodemask that has nodes with less than 4MB as memoryless, and the actual memory fed into the core mm.
The commit:
9391a3f9c7f1 ("[PATCH] x86_64: Clear more state when ignoring empty node in SRAT parsing")
... that introduced minimal size of a NUMA node does not explain why a node size cannot be less than 4MB and what boot failures this restriction might fix.
Fixes have been submitted to the core MM code to tighten up the memory topologies it accepts and to not crash on weird input:
mm: page_alloc: skip memoryless nodes entirely mm: memory_hotplug: drop memoryless node from fallback lists
Andrew has accepted them into the -mm tree, but there are no stable SHA1's yet.
This patch drops the limitation for minimal node size on x86:
- which works around the crash without the fixes to the core MM. - makes x86 topologies less weird, - removes an arbitrary and undocumented limitation on NUMA topologies.
[ mingo: Improved changelog clarity. ]
Reported-by: Qi Zheng zhengqi.arch@bytedance.com Tested-by: Mario Casquero mcasquer@redhat.com Signed-off-by: Mike Rapoport (IBM) rppt@kernel.org Signed-off-by: Ingo Molnar mingo@kernel.org Acked-by: David Hildenbrand david@redhat.com Acked-by: Michal Hocko mhocko@suse.com Cc: Dave Hansen dave.hansen@linux.intel.com Cc: Rik van Riel riel@surriel.com Link: https://lore.kernel.org/r/ZS+2qqjEO5/867br@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/include/asm/numa.h | 7 ------- arch/x86/mm/numa.c | 7 ------- 2 files changed, 14 deletions(-)
diff --git a/arch/x86/include/asm/numa.h b/arch/x86/include/asm/numa.h index e3bae2b60a0db..ef2844d691735 100644 --- a/arch/x86/include/asm/numa.h +++ b/arch/x86/include/asm/numa.h @@ -12,13 +12,6 @@
#define NR_NODE_MEMBLKS (MAX_NUMNODES*2)
-/* - * Too small node sizes may confuse the VM badly. Usually they - * result from BIOS bugs. So dont recognize nodes as standalone - * NUMA entities that have less than this amount of RAM listed: - */ -#define NODE_MIN_SIZE (4*1024*1024) - extern int numa_off;
/* diff --git a/arch/x86/mm/numa.c b/arch/x86/mm/numa.c index e360c6892a584..1a1c0c242f272 100644 --- a/arch/x86/mm/numa.c +++ b/arch/x86/mm/numa.c @@ -601,13 +601,6 @@ static int __init numa_register_memblks(struct numa_meminfo *mi) if (start >= end) continue;
- /* - * Don't confuse VM with a node that doesn't have the - * minimum amount of memory: - */ - if (end && (end - start) < NODE_MIN_SIZE) - continue; - alloc_node_data(nid); }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Antipov dmantipov@yandex.ru
[ Upstream commit cbaccdc42483c65016f1bae89128c08dc17cfb2a ]
When compiling with clang 16.0.6 and CONFIG_FORTIFY_SOURCE=y, I've noticed the following (somewhat confusing due to absence of an actual source code location):
In file included from drivers/net/wireless/virtual/mac80211_hwsim.c:18: In file included from ./include/linux/slab.h:16: In file included from ./include/linux/gfp.h:7: In file included from ./include/linux/mmzone.h:8: In file included from ./include/linux/spinlock.h:56: In file included from ./include/linux/preempt.h:79: In file included from ./arch/x86/include/asm/preempt.h:9: In file included from ./include/linux/thread_info.h:60: In file included from ./arch/x86/include/asm/thread_info.h:53: In file included from ./arch/x86/include/asm/cpufeature.h:5: In file included from ./arch/x86/include/asm/processor.h:23: In file included from ./arch/x86/include/asm/msr.h:11: In file included from ./arch/x86/include/asm/cpumask.h:5: In file included from ./include/linux/cpumask.h:12: In file included from ./include/linux/bitmap.h:11: In file included from ./include/linux/string.h:254: ./include/linux/fortify-string.h:592:4: warning: call to '__read_overflow2_field' declared with 'warning' attribute: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning] __read_overflow2_field(q_size_field, size);
The compiler actually complains on 'mac80211_hwsim_get_et_strings()' where fortification logic inteprets call to 'memcpy()' as an attempt to copy the whole 'mac80211_hwsim_gstrings_stats' array from its first member and so issues an overread warning. This warning may be silenced by passing an address of the whole array and not the first member to 'memcpy()'.
Signed-off-by: Dmitry Antipov dmantipov@yandex.ru Link: https://lore.kernel.org/r/20230829094140.234636-1-dmantipov@yandex.ru Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/mac80211_hwsim.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wireless/mac80211_hwsim.c b/drivers/net/wireless/mac80211_hwsim.c index 6eb3c845640bd..7d73502586839 100644 --- a/drivers/net/wireless/mac80211_hwsim.c +++ b/drivers/net/wireless/mac80211_hwsim.c @@ -2615,7 +2615,7 @@ static void mac80211_hwsim_get_et_strings(struct ieee80211_hw *hw, u32 sset, u8 *data) { if (sset == ETH_SS_STATS) - memcpy(data, *mac80211_hwsim_gstrings_stats, + memcpy(data, mac80211_hwsim_gstrings_stats, sizeof(mac80211_hwsim_gstrings_stats)); }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ping-Ke Shih pkshih@realtek.com
[ Upstream commit e160ab85166e77347d0cbe5149045cb25e83937f ]
We can get a UBSAN warning if ieee80211_get_tx_power() returns the INT_MIN value mac80211 internally uses for "unset power level".
UBSAN: signed-integer-overflow in net/wireless/nl80211.c:3816:5 -2147483648 * 100 cannot be represented in type 'int' CPU: 0 PID: 20433 Comm: insmod Tainted: G WC OE Call Trace: dump_stack+0x74/0x92 ubsan_epilogue+0x9/0x50 handle_overflow+0x8d/0xd0 __ubsan_handle_mul_overflow+0xe/0x10 nl80211_send_iface+0x688/0x6b0 [cfg80211] [...] cfg80211_register_wdev+0x78/0xb0 [cfg80211] cfg80211_netdev_notifier_call+0x200/0x620 [cfg80211] [...] ieee80211_if_add+0x60e/0x8f0 [mac80211] ieee80211_register_hw+0xda5/0x1170 [mac80211]
In this case, simply return an error instead, to indicate that no data is available.
Cc: Zong-Zhe Yang kevin_yang@realtek.com Signed-off-by: Ping-Ke Shih pkshih@realtek.com Link: https://lore.kernel.org/r/20230203023636.4418-1-pkshih@realtek.com Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/cfg.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/net/mac80211/cfg.c b/net/mac80211/cfg.c index 4fa216a108ae8..02bd90a537058 100644 --- a/net/mac80211/cfg.c +++ b/net/mac80211/cfg.c @@ -2762,6 +2762,10 @@ static int ieee80211_get_tx_power(struct wiphy *wiphy, else *dbm = sdata->vif.bss_conf.txpower;
+ /* INT_MIN indicates no power level was set yet */ + if (*dbm == INT_MIN) + return -EINVAL; + return 0; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sieng-Piaw Liew liew.s.piaw@gmail.com
[ Upstream commit 86565682e9053e5deb128193ea9e88531bbae9cf ]
This is based on alx driver commit 881d0327db37 ("net: alx: Work around the DMA RX overflow issue").
The alx and atl1c drivers had RX overflow error which was why a custom allocator was created to avoid certain addresses. The simpler workaround then created for alx driver, but not for atl1c due to lack of tester.
Instead of using a custom allocator, check the allocated skb address and use skb_reserve() to move away from problematic 0x...fc0 address.
Tested on AR8131 on Acer 4540.
Signed-off-by: Sieng-Piaw Liew liew.s.piaw@gmail.com Link: https://lore.kernel.org/r/20230912010711.12036-1-liew.s.piaw@gmail.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/atheros/atl1c/atl1c.h | 3 - .../net/ethernet/atheros/atl1c/atl1c_main.c | 67 +++++-------------- 2 files changed, 16 insertions(+), 54 deletions(-)
diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c.h b/drivers/net/ethernet/atheros/atl1c/atl1c.h index 43d821fe7a542..63ba64dbb7310 100644 --- a/drivers/net/ethernet/atheros/atl1c/atl1c.h +++ b/drivers/net/ethernet/atheros/atl1c/atl1c.h @@ -504,15 +504,12 @@ struct atl1c_rrd_ring { u16 next_to_use; u16 next_to_clean; struct napi_struct napi; - struct page *rx_page; - unsigned int rx_page_offset; };
/* board specific private data structure */ struct atl1c_adapter { struct net_device *netdev; struct pci_dev *pdev; - unsigned int rx_frag_size; struct atl1c_hw hw; struct atl1c_hw_stats hw_stats; struct mii_if_info mii; /* MII interface info */ diff --git a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c index dad21b4fbc0bc..c6f621c0ca836 100644 --- a/drivers/net/ethernet/atheros/atl1c/atl1c_main.c +++ b/drivers/net/ethernet/atheros/atl1c/atl1c_main.c @@ -493,15 +493,10 @@ static int atl1c_set_mac_addr(struct net_device *netdev, void *p) static void atl1c_set_rxbufsize(struct atl1c_adapter *adapter, struct net_device *dev) { - unsigned int head_size; int mtu = dev->mtu;
adapter->rx_buffer_len = mtu > AT_RX_BUF_SIZE ? roundup(mtu + ETH_HLEN + ETH_FCS_LEN + VLAN_HLEN, 8) : AT_RX_BUF_SIZE; - - head_size = SKB_DATA_ALIGN(adapter->rx_buffer_len + NET_SKB_PAD + NET_IP_ALIGN) + - SKB_DATA_ALIGN(sizeof(struct skb_shared_info)); - adapter->rx_frag_size = roundup_pow_of_two(head_size); }
static netdev_features_t atl1c_fix_features(struct net_device *netdev, @@ -974,7 +969,6 @@ static void atl1c_init_ring_ptrs(struct atl1c_adapter *adapter) static void atl1c_free_ring_resources(struct atl1c_adapter *adapter) { struct pci_dev *pdev = adapter->pdev; - int i;
dma_free_coherent(&pdev->dev, adapter->ring_header.size, adapter->ring_header.desc, adapter->ring_header.dma); @@ -987,12 +981,6 @@ static void atl1c_free_ring_resources(struct atl1c_adapter *adapter) kfree(adapter->tpd_ring[0].buffer_info); adapter->tpd_ring[0].buffer_info = NULL; } - for (i = 0; i < adapter->rx_queue_count; ++i) { - if (adapter->rrd_ring[i].rx_page) { - put_page(adapter->rrd_ring[i].rx_page); - adapter->rrd_ring[i].rx_page = NULL; - } - } }
/** @@ -1764,48 +1752,11 @@ static inline void atl1c_rx_checksum(struct atl1c_adapter *adapter, skb_checksum_none_assert(skb); }
-static struct sk_buff *atl1c_alloc_skb(struct atl1c_adapter *adapter, - u32 queue, bool napi_mode) -{ - struct atl1c_rrd_ring *rrd_ring = &adapter->rrd_ring[queue]; - struct sk_buff *skb; - struct page *page; - - if (adapter->rx_frag_size > PAGE_SIZE) { - if (likely(napi_mode)) - return napi_alloc_skb(&rrd_ring->napi, - adapter->rx_buffer_len); - else - return netdev_alloc_skb_ip_align(adapter->netdev, - adapter->rx_buffer_len); - } - - page = rrd_ring->rx_page; - if (!page) { - page = alloc_page(GFP_ATOMIC); - if (unlikely(!page)) - return NULL; - rrd_ring->rx_page = page; - rrd_ring->rx_page_offset = 0; - } - - skb = build_skb(page_address(page) + rrd_ring->rx_page_offset, - adapter->rx_frag_size); - if (likely(skb)) { - skb_reserve(skb, NET_SKB_PAD + NET_IP_ALIGN); - rrd_ring->rx_page_offset += adapter->rx_frag_size; - if (rrd_ring->rx_page_offset >= PAGE_SIZE) - rrd_ring->rx_page = NULL; - else - get_page(page); - } - return skb; -} - static int atl1c_alloc_rx_buffer(struct atl1c_adapter *adapter, u32 queue, bool napi_mode) { struct atl1c_rfd_ring *rfd_ring = &adapter->rfd_ring[queue]; + struct atl1c_rrd_ring *rrd_ring = &adapter->rrd_ring[queue]; struct pci_dev *pdev = adapter->pdev; struct atl1c_buffer *buffer_info, *next_info; struct sk_buff *skb; @@ -1824,13 +1775,27 @@ static int atl1c_alloc_rx_buffer(struct atl1c_adapter *adapter, u32 queue, while (next_info->flags & ATL1C_BUFFER_FREE) { rfd_desc = ATL1C_RFD_DESC(rfd_ring, rfd_next_to_use);
- skb = atl1c_alloc_skb(adapter, queue, napi_mode); + /* When DMA RX address is set to something like + * 0x....fc0, it will be very likely to cause DMA + * RFD overflow issue. + * + * To work around it, we apply rx skb with 64 bytes + * longer space, and offset the address whenever + * 0x....fc0 is detected. + */ + if (likely(napi_mode)) + skb = napi_alloc_skb(&rrd_ring->napi, adapter->rx_buffer_len + 64); + else + skb = netdev_alloc_skb(adapter->netdev, adapter->rx_buffer_len + 64); if (unlikely(!skb)) { if (netif_msg_rx_err(adapter)) dev_warn(&pdev->dev, "alloc rx buffer failed\n"); break; }
+ if (((unsigned long)skb->data & 0xfff) == 0xfc0) + skb_reserve(skb, 64); + /* * Make buffer alignment 2 beyond a 16 byte boundary * this will result in a 16 byte aligned IP header after
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kumar Kartikeya Dwivedi memxor@gmail.com
[ Upstream commit 66d9111f3517f85ef2af0337ece02683ce0faf21 ]
Now that bpf_throw kfunc is the first such call instruction that has noreturn semantics within the verifier, this also kicks in dead code elimination in unprecedented ways. For one, any instruction following a bpf_throw call will never be marked as seen. Moreover, if a callchain ends up throwing, any instructions after the call instruction to the eventually throwing subprog in callers will also never be marked as seen.
The tempting way to fix this would be to emit extra 'int3' instructions which bump the jited_len of a program, and ensure that during runtime when a program throws, we can discover its boundaries even if the call instruction to bpf_throw (or to subprogs that always throw) is emitted as the final instruction in the program.
An example of such a program would be this:
do_something(): ... r0 = 0 exit
foo(): r1 = 0 call bpf_throw r0 = 0 exit
bar(cond): if r1 != 0 goto pc+2 call do_something exit call foo r0 = 0 // Never seen by verifier exit //
main(ctx): r1 = ... call bar r0 = 0 exit
Here, if we do end up throwing, the stacktrace would be the following:
bpf_throw foo bar main
In bar, the final instruction emitted will be the call to foo, as such, the return address will be the subsequent instruction (which the JIT emits as int3 on x86). This will end up lying outside the jited_len of the program, thus, when unwinding, we will fail to discover the return address as belonging to any program and end up in a panic due to the unreliable stack unwinding of BPF programs that we never expect.
To remedy this case, make bpf_prog_ksym_find treat IP == ksym.end as part of the BPF program, so that is_bpf_text_address returns true when such a case occurs, and we are able to unwind reliably when the final instruction ends up being a call instruction.
Signed-off-by: Kumar Kartikeya Dwivedi memxor@gmail.com Link: https://lore.kernel.org/r/20230912233214.1518551-12-memxor@gmail.com Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/core.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/kernel/bpf/core.c b/kernel/bpf/core.c index f7c27c1cc593b..36c2896ee45f4 100644 --- a/kernel/bpf/core.c +++ b/kernel/bpf/core.c @@ -605,7 +605,11 @@ static __always_inline int bpf_tree_comp(void *key, struct latch_tree_node *n)
if (val < ksym->start) return -1; - if (val >= ksym->end) + /* Ensure that we detect return addresses as part of the program, when + * the final instruction is a call for a program part of the stack + * trace. Therefore, do val > ksym->end instead of val >= ksym->end. + */ + if (val > ksym->end) return 1;
return 0;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Antipov dmantipov@yandex.ru
[ Upstream commit 95f97fe0ac974467ab4da215985a32b2fdf48af0 ]
When compiling with clang 16.0.6 and CONFIG_FORTIFY_SOURCE=y, I've noticed the following (somewhat confusing due to absence of an actual source code location):
In file included from drivers/net/wireless/ath/ath9k/debug.c:17: In file included from ./include/linux/slab.h:16: In file included from ./include/linux/gfp.h:7: In file included from ./include/linux/mmzone.h:8: In file included from ./include/linux/spinlock.h:56: In file included from ./include/linux/preempt.h:79: In file included from ./arch/x86/include/asm/preempt.h:9: In file included from ./include/linux/thread_info.h:60: In file included from ./arch/x86/include/asm/thread_info.h:53: In file included from ./arch/x86/include/asm/cpufeature.h:5: In file included from ./arch/x86/include/asm/processor.h:23: In file included from ./arch/x86/include/asm/msr.h:11: In file included from ./arch/x86/include/asm/cpumask.h:5: In file included from ./include/linux/cpumask.h:12: In file included from ./include/linux/bitmap.h:11: In file included from ./include/linux/string.h:254: ./include/linux/fortify-string.h:592:4: warning: call to '__read_overflow2_field' declared with 'warning' attribute: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning] __read_overflow2_field(q_size_field, size);
In file included from drivers/net/wireless/ath/ath9k/htc_drv_debug.c:17: In file included from drivers/net/wireless/ath/ath9k/htc.h:20: In file included from ./include/linux/module.h:13: In file included from ./include/linux/stat.h:19: In file included from ./include/linux/time.h:60: In file included from ./include/linux/time32.h:13: In file included from ./include/linux/timex.h:67: In file included from ./arch/x86/include/asm/timex.h:5: In file included from ./arch/x86/include/asm/processor.h:23: In file included from ./arch/x86/include/asm/msr.h:11: In file included from ./arch/x86/include/asm/cpumask.h:5: In file included from ./include/linux/cpumask.h:12: In file included from ./include/linux/bitmap.h:11: In file included from ./include/linux/string.h:254: ./include/linux/fortify-string.h:592:4: warning: call to '__read_overflow2_field' declared with 'warning' attribute: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning] __read_overflow2_field(q_size_field, size);
The compiler actually complains on 'ath9k_get_et_strings()' and 'ath9k_htc_get_et_strings()' due to the same reason: fortification logic inteprets call to 'memcpy()' as an attempt to copy the whole array from it's first member and so issues an overread warning. These warnings may be silenced by passing an address of the whole array and not the first member to 'memcpy()'.
Signed-off-by: Dmitry Antipov dmantipov@yandex.ru Acked-by: Toke Høiland-Jørgensen toke@toke.dk Signed-off-by: Kalle Valo quic_kvalo@quicinc.com Link: https://lore.kernel.org/r/20230829093856.234584-1-dmantipov@yandex.ru Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath9k/debug.c | 2 +- drivers/net/wireless/ath/ath9k/htc_drv_debug.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/wireless/ath/ath9k/debug.c b/drivers/net/wireless/ath/ath9k/debug.c index 4c81b1d7f4171..6a043a49dfe6f 100644 --- a/drivers/net/wireless/ath/ath9k/debug.c +++ b/drivers/net/wireless/ath/ath9k/debug.c @@ -1284,7 +1284,7 @@ void ath9k_get_et_strings(struct ieee80211_hw *hw, u32 sset, u8 *data) { if (sset == ETH_SS_STATS) - memcpy(data, *ath9k_gstrings_stats, + memcpy(data, ath9k_gstrings_stats, sizeof(ath9k_gstrings_stats)); }
diff --git a/drivers/net/wireless/ath/ath9k/htc_drv_debug.c b/drivers/net/wireless/ath/ath9k/htc_drv_debug.c index c55aab01fff5d..e79bbcd3279af 100644 --- a/drivers/net/wireless/ath/ath9k/htc_drv_debug.c +++ b/drivers/net/wireless/ath/ath9k/htc_drv_debug.c @@ -428,7 +428,7 @@ void ath9k_htc_get_et_strings(struct ieee80211_hw *hw, u32 sset, u8 *data) { if (sset == ETH_SS_STATS) - memcpy(data, *ath9k_htc_gstrings_stats, + memcpy(data, ath9k_htc_gstrings_stats, sizeof(ath9k_htc_gstrings_stats)); }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Antipov dmantipov@yandex.ru
[ Upstream commit cb4c132ebfeac5962f7258ffc831caa0c4dada1a ]
When compiling with clang 16.0.6 and CONFIG_FORTIFY_SOURCE=y, I've noticed the following (somewhat confusing due to absence of an actual source code location):
In file included from drivers/net/wireless/ath/ath10k/debug.c:8: In file included from ./include/linux/module.h:13: In file included from ./include/linux/stat.h:19: In file included from ./include/linux/time.h:60: In file included from ./include/linux/time32.h:13: In file included from ./include/linux/timex.h:67: In file included from ./arch/x86/include/asm/timex.h:5: In file included from ./arch/x86/include/asm/processor.h:23: In file included from ./arch/x86/include/asm/msr.h:11: In file included from ./arch/x86/include/asm/cpumask.h:5: In file included from ./include/linux/cpumask.h:12: In file included from ./include/linux/bitmap.h:11: In file included from ./include/linux/string.h:254: ./include/linux/fortify-string.h:592:4: warning: call to '__read_overflow2_field' declared with 'warning' attribute: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning] __read_overflow2_field(q_size_field, size);
The compiler actually complains on 'ath10k_debug_get_et_strings()' where fortification logic inteprets call to 'memcpy()' as an attempt to copy the whole 'ath10k_gstrings_stats' array from it's first member and so issues an overread warning. This warning may be silenced by passing an address of the whole array and not the first member to 'memcpy()'.
Signed-off-by: Dmitry Antipov dmantipov@yandex.ru Acked-by: Jeff Johnson quic_jjohnson@quicinc.com Signed-off-by: Kalle Valo quic_kvalo@quicinc.com Link: https://lore.kernel.org/r/20230829093652.234537-1-dmantipov@yandex.ru Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath10k/debug.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wireless/ath/ath10k/debug.c b/drivers/net/wireless/ath/ath10k/debug.c index 39378e3f9b2bb..6e1b65b8ae656 100644 --- a/drivers/net/wireless/ath/ath10k/debug.c +++ b/drivers/net/wireless/ath/ath10k/debug.c @@ -1139,7 +1139,7 @@ void ath10k_debug_get_et_strings(struct ieee80211_hw *hw, u32 sset, u8 *data) { if (sset == ETH_SS_STATS) - memcpy(data, *ath10k_gstrings_stats, + memcpy(data, ath10k_gstrings_stats, sizeof(ath10k_gstrings_stats)); }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 0bb4d124d34044179b42a769a0c76f389ae973b6 ]
This field can be read or written without socket lock being held.
Add annotations to avoid load-store tearing.
Signed-off-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/sock.h | 20 ++++++++++++++++---- 1 file changed, 16 insertions(+), 4 deletions(-)
diff --git a/include/net/sock.h b/include/net/sock.h index 640bd7a367779..d148dc95c9e9c 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -1923,21 +1923,33 @@ static inline void sk_tx_queue_set(struct sock *sk, int tx_queue) /* sk_tx_queue_mapping accept only upto a 16-bit value */ if (WARN_ON_ONCE((unsigned short)tx_queue >= USHRT_MAX)) return; - sk->sk_tx_queue_mapping = tx_queue; + /* Paired with READ_ONCE() in sk_tx_queue_get() and + * other WRITE_ONCE() because socket lock might be not held. + */ + WRITE_ONCE(sk->sk_tx_queue_mapping, tx_queue); }
#define NO_QUEUE_MAPPING USHRT_MAX
static inline void sk_tx_queue_clear(struct sock *sk) { - sk->sk_tx_queue_mapping = NO_QUEUE_MAPPING; + /* Paired with READ_ONCE() in sk_tx_queue_get() and + * other WRITE_ONCE() because socket lock might be not held. + */ + WRITE_ONCE(sk->sk_tx_queue_mapping, NO_QUEUE_MAPPING); }
static inline int sk_tx_queue_get(const struct sock *sk) { - if (sk && sk->sk_tx_queue_mapping != NO_QUEUE_MAPPING) - return sk->sk_tx_queue_mapping; + if (sk) { + /* Paired with WRITE_ONCE() in sk_tx_queue_clear() + * and sk_tx_queue_set(). + */ + int val = READ_ONCE(sk->sk_tx_queue_mapping);
+ if (val != NO_QUEUE_MAPPING) + return val; + } return -1; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit eb44ad4e635132754bfbcb18103f1dcb7058aedd ]
This field can be read or written without socket lock being held.
Add annotations to avoid load-store tearing.
Signed-off-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/sock.h | 6 +++--- net/core/sock.c | 2 +- net/ipv4/tcp_output.c | 2 +- 3 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/include/net/sock.h b/include/net/sock.h index d148dc95c9e9c..e19eebaf59f73 100644 --- a/include/net/sock.h +++ b/include/net/sock.h @@ -2083,7 +2083,7 @@ static inline void __dst_negative_advice(struct sock *sk) if (ndst != dst) { rcu_assign_pointer(sk->sk_dst_cache, ndst); sk_tx_queue_clear(sk); - sk->sk_dst_pending_confirm = 0; + WRITE_ONCE(sk->sk_dst_pending_confirm, 0); } } } @@ -2100,7 +2100,7 @@ __sk_dst_set(struct sock *sk, struct dst_entry *dst) struct dst_entry *old_dst;
sk_tx_queue_clear(sk); - sk->sk_dst_pending_confirm = 0; + WRITE_ONCE(sk->sk_dst_pending_confirm, 0); old_dst = rcu_dereference_protected(sk->sk_dst_cache, lockdep_sock_is_held(sk)); rcu_assign_pointer(sk->sk_dst_cache, dst); @@ -2113,7 +2113,7 @@ sk_dst_set(struct sock *sk, struct dst_entry *dst) struct dst_entry *old_dst;
sk_tx_queue_clear(sk); - sk->sk_dst_pending_confirm = 0; + WRITE_ONCE(sk->sk_dst_pending_confirm, 0); old_dst = xchg((__force struct dst_entry **)&sk->sk_dst_cache, dst); dst_release(old_dst); } diff --git a/net/core/sock.c b/net/core/sock.c index 8faa0f9cc0839..662cd6d54ac70 100644 --- a/net/core/sock.c +++ b/net/core/sock.c @@ -557,7 +557,7 @@ struct dst_entry *__sk_dst_check(struct sock *sk, u32 cookie) INDIRECT_CALL_INET(dst->ops->check, ip6_dst_check, ipv4_dst_check, dst, cookie) == NULL) { sk_tx_queue_clear(sk); - sk->sk_dst_pending_confirm = 0; + WRITE_ONCE(sk->sk_dst_pending_confirm, 0); RCU_INIT_POINTER(sk->sk_dst_cache, NULL); dst_release(dst); return NULL; diff --git a/net/ipv4/tcp_output.c b/net/ipv4/tcp_output.c index 9d5e652c9bba1..8032ccb69463e 100644 --- a/net/ipv4/tcp_output.c +++ b/net/ipv4/tcp_output.c @@ -1318,7 +1318,7 @@ static int __tcp_transmit_skb(struct sock *sk, struct sk_buff *skb, skb->destructor = skb_is_tcp_pure_ack(skb) ? __sock_wfree : tcp_wfree; refcount_add(skb->truesize, &sk->sk_wmem_alloc);
- skb_set_dst_pending_confirm(skb, sk->sk_dst_pending_confirm); + skb_set_dst_pending_confirm(skb, READ_ONCE(sk->sk_dst_pending_confirm));
/* Build TCP header and checksum it. */ th = (struct tcphdr *)skb->data;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Douglas Anderson dianders@chromium.org
[ Upstream commit 170c75d43a77dc937c58f07ecf847ba1b42ab74e ]
As talked about in commit d66d24ac300c ("ath10k: Keep track of which interrupts fired, don't poll them"), if we access the copy engine register at a bad time then ath10k can go boom. However, it's not necessarily easy to know when it's safe to access them.
The ChromeOS test labs saw a crash that looked like this at shutdown/reboot time (on a chromeos-5.15 kernel, but likely the problem could also reproduce upstream):
Internal error: synchronous external abort: 96000010 [#1] PREEMPT SMP ... CPU: 4 PID: 6168 Comm: reboot Not tainted 5.15.111-lockdep-19350-g1d624fe6758f #1 010b9b233ab055c27c6dc88efb0be2f4e9e86f51 Hardware name: Google Kingoftown (DT) ... pc : ath10k_snoc_read32+0x50/0x74 [ath10k_snoc] lr : ath10k_snoc_read32+0x24/0x74 [ath10k_snoc] ... Call trace: ath10k_snoc_read32+0x50/0x74 [ath10k_snoc ...] ath10k_ce_disable_interrupt+0x190/0x65c [ath10k_core ...] ath10k_ce_disable_interrupts+0x8c/0x120 [ath10k_core ...] ath10k_snoc_hif_stop+0x78/0x660 [ath10k_snoc ...] ath10k_core_stop+0x13c/0x1ec [ath10k_core ...] ath10k_halt+0x398/0x5b0 [ath10k_core ...] ath10k_stop+0xfc/0x1a8 [ath10k_core ...] drv_stop+0x148/0x6b4 [mac80211 ...] ieee80211_stop_device+0x70/0x80 [mac80211 ...] ieee80211_do_stop+0x10d8/0x15b0 [mac80211 ...] ieee80211_stop+0x144/0x1a0 [mac80211 ...] __dev_close_many+0x1e8/0x2c0 dev_close_many+0x198/0x33c dev_close+0x140/0x210 cfg80211_shutdown_all_interfaces+0xc8/0x1e0 [cfg80211 ...] ieee80211_remove_interfaces+0x118/0x5c4 [mac80211 ...] ieee80211_unregister_hw+0x64/0x1f4 [mac80211 ...] ath10k_mac_unregister+0x4c/0xf0 [ath10k_core ...] ath10k_core_unregister+0x80/0xb0 [ath10k_core ...] ath10k_snoc_free_resources+0xb8/0x1ec [ath10k_snoc ...] ath10k_snoc_shutdown+0x98/0xd0 [ath10k_snoc ...] platform_shutdown+0x7c/0xa0 device_shutdown+0x3e0/0x58c kernel_restart_prepare+0x68/0xa0 kernel_restart+0x28/0x7c
Though there's no known way to reproduce the problem, it makes sense that it would be the same issue where we're trying to access copy engine registers when it's not allowed.
Let's fix this by changing how we "disable" the interrupts. Instead of tweaking the copy engine registers we'll just use disable_irq() and enable_irq(). Then we'll configure the interrupts once at power up time.
Tested-on: WCN3990 hw1.0 SNOC WLAN.HL.3.2.2.c10-00754-QCAHLSWMTPL-1
Signed-off-by: Douglas Anderson dianders@chromium.org Signed-off-by: Kalle Valo quic_kvalo@quicinc.com Link: https://lore.kernel.org/r/20230630151842.1.If764ede23c4e09a43a842771c2ddf996... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath10k/snoc.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-)
diff --git a/drivers/net/wireless/ath/ath10k/snoc.c b/drivers/net/wireless/ath/ath10k/snoc.c index 73fe77e7824b4..439df8a404d86 100644 --- a/drivers/net/wireless/ath/ath10k/snoc.c +++ b/drivers/net/wireless/ath/ath10k/snoc.c @@ -828,12 +828,20 @@ static void ath10k_snoc_hif_get_default_pipe(struct ath10k *ar,
static inline void ath10k_snoc_irq_disable(struct ath10k *ar) { - ath10k_ce_disable_interrupts(ar); + struct ath10k_snoc *ar_snoc = ath10k_snoc_priv(ar); + int id; + + for (id = 0; id < CE_COUNT_MAX; id++) + disable_irq(ar_snoc->ce_irqs[id].irq_line); }
static inline void ath10k_snoc_irq_enable(struct ath10k *ar) { - ath10k_ce_enable_interrupts(ar); + struct ath10k_snoc *ar_snoc = ath10k_snoc_priv(ar); + int id; + + for (id = 0; id < CE_COUNT_MAX; id++) + enable_irq(ar_snoc->ce_irqs[id].irq_line); }
static void ath10k_snoc_rx_pipe_cleanup(struct ath10k_snoc_pipe *snoc_pipe) @@ -1089,6 +1097,8 @@ static int ath10k_snoc_hif_power_up(struct ath10k *ar, goto err_free_rri; }
+ ath10k_ce_enable_interrupts(ar); + return 0;
err_free_rri: @@ -1253,8 +1263,8 @@ static int ath10k_snoc_request_irq(struct ath10k *ar)
for (id = 0; id < CE_COUNT_MAX; id++) { ret = request_irq(ar_snoc->ce_irqs[id].irq_line, - ath10k_snoc_per_engine_handler, 0, - ce_name[id], ar); + ath10k_snoc_per_engine_handler, + IRQF_NO_AUTOEN, ce_name[id], ar); if (ret) { ath10k_err(ar, "failed to register IRQ handler for CE %d: %d\n",
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: youwan Wang wangyouwan@126.com
[ Upstream commit 624820f7c8826dd010e8b1963303c145f99816e9 ]
fix crash because of null pointers
[ 6104.969662] BUG: kernel NULL pointer dereference, address: 00000000000000c8 [ 6104.969667] #PF: supervisor read access in kernel mode [ 6104.969668] #PF: error_code(0x0000) - not-present page [ 6104.969670] PGD 0 P4D 0 [ 6104.969673] Oops: 0000 [#1] SMP NOPTI [ 6104.969684] RIP: 0010:btusb_mtk_hci_wmt_sync+0x144/0x220 [btusb] [ 6104.969688] RSP: 0018:ffffb8d681533d48 EFLAGS: 00010246 [ 6104.969689] RAX: 0000000000000000 RBX: ffff8ad560bb2000 RCX: 0000000000000006 [ 6104.969691] RDX: 0000000000000000 RSI: ffffb8d681533d08 RDI: 0000000000000000 [ 6104.969692] RBP: ffffb8d681533d70 R08: 0000000000000001 R09: 0000000000000001 [ 6104.969694] R10: 0000000000000001 R11: 00000000fa83b2da R12: ffff8ad461d1d7c0 [ 6104.969695] R13: 0000000000000000 R14: ffff8ad459618c18 R15: ffffb8d681533d90 [ 6104.969697] FS: 00007f5a1cab9d40(0000) GS:ffff8ad578200000(0000) knlGS:00000 [ 6104.969699] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 6104.969700] CR2: 00000000000000c8 CR3: 000000018620c001 CR4: 0000000000760ef0 [ 6104.969701] PKRU: 55555554 [ 6104.969702] Call Trace: [ 6104.969708] btusb_mtk_shutdown+0x44/0x80 [btusb] [ 6104.969732] hci_dev_do_close+0x470/0x5c0 [bluetooth] [ 6104.969748] hci_rfkill_set_block+0x56/0xa0 [bluetooth] [ 6104.969753] rfkill_set_block+0x92/0x160 [ 6104.969755] rfkill_fop_write+0x136/0x1e0 [ 6104.969759] __vfs_write+0x18/0x40 [ 6104.969761] vfs_write+0xdf/0x1c0 [ 6104.969763] ksys_write+0xb1/0xe0 [ 6104.969765] __x64_sys_write+0x1a/0x20 [ 6104.969769] do_syscall_64+0x51/0x180 [ 6104.969771] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 6104.969773] RIP: 0033:0x7f5a21f18fef [ 6104.9] RSP: 002b:00007ffeefe39010 EFLAGS: 00000293 ORIG_RAX: 0000000000000001 [ 6104.969780] RAX: ffffffffffffffda RBX: 000055c10a7560a0 RCX: 00007f5a21f18fef [ 6104.969781] RDX: 0000000000000008 RSI: 00007ffeefe39060 RDI: 0000000000000012 [ 6104.969782] RBP: 00007ffeefe39060 R08: 0000000000000000 R09: 0000000000000017 [ 6104.969784] R10: 00007ffeefe38d97 R11: 0000000000000293 R12: 0000000000000002 [ 6104.969785] R13: 00007ffeefe39220 R14: 00007ffeefe391a0 R15: 000055c10a72acf0
Signed-off-by: youwan Wang wangyouwan@126.com Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bluetooth/btusb.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c index c01d02f41bcb3..91a08892df223 100644 --- a/drivers/bluetooth/btusb.c +++ b/drivers/bluetooth/btusb.c @@ -2497,6 +2497,9 @@ static int btusb_mtk_hci_wmt_sync(struct hci_dev *hdev, goto err_free_wc; }
+ if (data->evt_skb == NULL) + goto err_free_wc; + /* Parse and handle the return WMT event */ wmt_evt = (struct btmtk_hci_wmt_evt *)data->evt_skb->data; if (wmt_evt->whdr.op != hdr->op) {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: ZhengHan Wang wzhmmmmm@gmail.com
[ Upstream commit a85fb91e3d728bdfc80833167e8162cce8bc7004 ]
syzbot reports a slab use-after-free in hci_conn_hash_flush [1]. After releasing an object using hci_conn_del_sysfs in the hci_conn_cleanup function, releasing the same object again using the hci_dev_put and hci_conn_put functions causes a double free. Here's a simplified flow:
hci_conn_del_sysfs: hci_dev_put put_device kobject_put kref_put kobject_release kobject_cleanup kfree_const kfree(name)
hci_dev_put: ... kfree(name)
hci_conn_put: put_device ... kfree(name)
This patch drop the hci_dev_put and hci_conn_put function call in hci_conn_cleanup function, because the object is freed in hci_conn_del_sysfs function.
This patch also fixes the refcounting in hci_conn_add_sysfs() and hci_conn_del_sysfs() to take into account device_add() failures.
This fixes CVE-2023-28464.
Link: https://syzkaller.appspot.com/bug?id=1bb51491ca5df96a5f724899d1dbb87afda6141... [1]
Signed-off-by: ZhengHan Wang wzhmmmmm@gmail.com Co-developed-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/bluetooth/hci_conn.c | 6 ++---- net/bluetooth/hci_sysfs.c | 23 ++++++++++++----------- 2 files changed, 14 insertions(+), 15 deletions(-)
diff --git a/net/bluetooth/hci_conn.c b/net/bluetooth/hci_conn.c index 5f1e388c2b951..ce538dbe89d13 100644 --- a/net/bluetooth/hci_conn.c +++ b/net/bluetooth/hci_conn.c @@ -135,13 +135,11 @@ static void hci_conn_cleanup(struct hci_conn *conn) hdev->notify(hdev, HCI_NOTIFY_CONN_DEL); }
- hci_conn_del_sysfs(conn); - debugfs_remove_recursive(conn->debugfs);
- hci_dev_put(hdev); + hci_conn_del_sysfs(conn);
- hci_conn_put(conn); + hci_dev_put(hdev); }
static void le_scan_cleanup(struct work_struct *work) diff --git a/net/bluetooth/hci_sysfs.c b/net/bluetooth/hci_sysfs.c index 08542dfc2dc53..633b82d542728 100644 --- a/net/bluetooth/hci_sysfs.c +++ b/net/bluetooth/hci_sysfs.c @@ -33,7 +33,7 @@ void hci_conn_init_sysfs(struct hci_conn *conn) { struct hci_dev *hdev = conn->hdev;
- BT_DBG("conn %p", conn); + bt_dev_dbg(hdev, "conn %p", conn);
conn->dev.type = &bt_link; conn->dev.class = bt_class; @@ -46,27 +46,30 @@ void hci_conn_add_sysfs(struct hci_conn *conn) { struct hci_dev *hdev = conn->hdev;
- BT_DBG("conn %p", conn); + bt_dev_dbg(hdev, "conn %p", conn);
if (device_is_registered(&conn->dev)) return;
dev_set_name(&conn->dev, "%s:%d", hdev->name, conn->handle);
- if (device_add(&conn->dev) < 0) { + if (device_add(&conn->dev) < 0) bt_dev_err(hdev, "failed to register connection device"); - return; - } - - hci_dev_hold(hdev); }
void hci_conn_del_sysfs(struct hci_conn *conn) { struct hci_dev *hdev = conn->hdev;
- if (!device_is_registered(&conn->dev)) + bt_dev_dbg(hdev, "conn %p", conn); + + if (!device_is_registered(&conn->dev)) { + /* If device_add() has *not* succeeded, use *only* put_device() + * to drop the reference count. + */ + put_device(&conn->dev); return; + }
while (1) { struct device *dev; @@ -78,9 +81,7 @@ void hci_conn_del_sysfs(struct hci_conn *conn) put_device(dev); }
- device_del(&conn->dev); - - hci_dev_put(hdev); + device_unregister(&conn->dev); }
static void bt_host_release(struct device *dev)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Olli Asikainen olli.asikainen@gmail.com
[ Upstream commit 916646758aea81a143ce89103910f715ed923346 ]
Thinkpad X120e also needs this battery quirk.
Signed-off-by: Olli Asikainen olli.asikainen@gmail.com Link: https://lore.kernel.org/r/20231024190922.2742-1-olli.asikainen@gmail.com Reviewed-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/thinkpad_acpi.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/platform/x86/thinkpad_acpi.c b/drivers/platform/x86/thinkpad_acpi.c index 3dc055ce6e61b..99c19a0b91513 100644 --- a/drivers/platform/x86/thinkpad_acpi.c +++ b/drivers/platform/x86/thinkpad_acpi.c @@ -9766,6 +9766,7 @@ static const struct tpacpi_quirk battery_quirk_table[] __initconst = { * Individual addressing is broken on models that expose the * primary battery as BAT1. */ + TPACPI_Q_LNV('8', 'F', true), /* Thinkpad X120e */ TPACPI_Q_LNV('J', '7', true), /* B5400 */ TPACPI_Q_LNV('J', 'I', true), /* Thinkpad 11e */ TPACPI_Q_LNV3('R', '0', 'B', true), /* Thinkpad 11e gen 3 */
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: baozhu.liu lucas.liu@siengine.com
[ Upstream commit 19ecbe8325a2a7ffda5ff4790955b84eaccba49f ]
If komeda_pipeline_unbound_components() returns -EDEADLK, it means that a deadlock happened in the locking context. Currently, komeda is not dealing with the deadlock properly,producing the following output when CONFIG_DEBUG_WW_MUTEX_SLOWPATH is enabled:
------------[ cut here ]------------ [ 26.103984] WARNING: CPU: 2 PID: 345 at drivers/gpu/drm/arm/display/komeda/komeda_pipeline_state.c:1248 komeda_release_unclaimed_resources+0x13c/0x170 [ 26.117453] Modules linked in: [ 26.120511] CPU: 2 PID: 345 Comm: composer@2.1-se Kdump: loaded Tainted: G W 5.10.110-SE-SDK1.8-dirty #16 [ 26.131374] Hardware name: Siengine Se1000 Evaluation board (DT) [ 26.137379] pstate: 20400009 (nzCv daif +PAN -UAO -TCO BTYPE=--) [ 26.143385] pc : komeda_release_unclaimed_resources+0x13c/0x170 [ 26.149301] lr : komeda_release_unclaimed_resources+0xbc/0x170 [ 26.155130] sp : ffff800017b8b8d0 [ 26.158442] pmr_save: 000000e0 [ 26.161493] x29: ffff800017b8b8d0 x28: ffff000cf2f96200 [ 26.166805] x27: ffff000c8f5a8800 x26: 0000000000000000 [ 26.172116] x25: 0000000000000038 x24: ffff8000116a0140 [ 26.177428] x23: 0000000000000038 x22: ffff000cf2f96200 [ 26.182739] x21: ffff000cfc300300 x20: ffff000c8ab77080 [ 26.188051] x19: 0000000000000003 x18: 0000000000000000 [ 26.193362] x17: 0000000000000000 x16: 0000000000000000 [ 26.198672] x15: b400e638f738ba38 x14: 0000000000000000 [ 26.203983] x13: 0000000106400a00 x12: 0000000000000000 [ 26.209294] x11: 0000000000000000 x10: 0000000000000000 [ 26.214604] x9 : ffff800012f80000 x8 : ffff000ca3308000 [ 26.219915] x7 : 0000000ff3000000 x6 : ffff80001084034c [ 26.225226] x5 : ffff800017b8bc40 x4 : 000000000000000f [ 26.230536] x3 : ffff000ca3308000 x2 : 0000000000000000 [ 26.235847] x1 : 0000000000000000 x0 : ffffffffffffffdd [ 26.241158] Call trace: [ 26.243604] komeda_release_unclaimed_resources+0x13c/0x170 [ 26.249175] komeda_crtc_atomic_check+0x68/0xf0 [ 26.253706] drm_atomic_helper_check_planes+0x138/0x1f4 [ 26.258929] komeda_kms_check+0x284/0x36c [ 26.262939] drm_atomic_check_only+0x40c/0x714 [ 26.267381] drm_atomic_nonblocking_commit+0x1c/0x60 [ 26.272344] drm_mode_atomic_ioctl+0xa3c/0xb8c [ 26.276787] drm_ioctl_kernel+0xc4/0x120 [ 26.280708] drm_ioctl+0x268/0x534 [ 26.284109] __arm64_sys_ioctl+0xa8/0xf0 [ 26.288030] el0_svc_common.constprop.0+0x80/0x240 [ 26.292817] do_el0_svc+0x24/0x90 [ 26.296132] el0_svc+0x20/0x30 [ 26.299185] el0_sync_handler+0xe8/0xf0 [ 26.303018] el0_sync+0x1a4/0x1c0 [ 26.306330] irq event stamp: 0 [ 26.309384] hardirqs last enabled at (0): [<0000000000000000>] 0x0 [ 26.315650] hardirqs last disabled at (0): [<ffff800010056d34>] copy_process+0x5d0/0x183c [ 26.323825] softirqs last enabled at (0): [<ffff800010056d34>] copy_process+0x5d0/0x183c [ 26.331997] softirqs last disabled at (0): [<0000000000000000>] 0x0 [ 26.338261] ---[ end trace 20ae984fa860184a ]--- [ 26.343021] ------------[ cut here ]------------ [ 26.347646] WARNING: CPU: 3 PID: 345 at drivers/gpu/drm/drm_modeset_lock.c:228 drm_modeset_drop_locks+0x84/0x90 [ 26.357727] Modules linked in: [ 26.360783] CPU: 3 PID: 345 Comm: composer@2.1-se Kdump: loaded Tainted: G W 5.10.110-SE-SDK1.8-dirty #16 [ 26.371645] Hardware name: Siengine Se1000 Evaluation board (DT) [ 26.377647] pstate: 20400009 (nzCv daif +PAN -UAO -TCO BTYPE=--) [ 26.383649] pc : drm_modeset_drop_locks+0x84/0x90 [ 26.388351] lr : drm_mode_atomic_ioctl+0x860/0xb8c [ 26.393137] sp : ffff800017b8bb10 [ 26.396447] pmr_save: 000000e0 [ 26.399497] x29: ffff800017b8bb10 x28: 0000000000000001 [ 26.404807] x27: 0000000000000038 x26: 0000000000000002 [ 26.410115] x25: ffff000cecbefa00 x24: ffff000cf2f96200 [ 26.415423] x23: 0000000000000001 x22: 0000000000000018 [ 26.420731] x21: 0000000000000001 x20: ffff800017b8bc10 [ 26.426039] x19: 0000000000000000 x18: 0000000000000000 [ 26.431347] x17: 0000000002e8bf2c x16: 0000000002e94c6b [ 26.436655] x15: 0000000002ea48b9 x14: ffff8000121f0300 [ 26.441963] x13: 0000000002ee2ca8 x12: ffff80001129cae0 [ 26.447272] x11: ffff800012435000 x10: ffff000ed46b5e88 [ 26.452580] x9 : ffff000c9935e600 x8 : 0000000000000000 [ 26.457888] x7 : 000000008020001e x6 : 000000008020001f [ 26.463196] x5 : ffff80001085fbe0 x4 : fffffe0033a59f20 [ 26.468504] x3 : 000000008020001e x2 : 0000000000000000 [ 26.473813] x1 : 0000000000000000 x0 : ffff000c8f596090 [ 26.479122] Call trace: [ 26.481566] drm_modeset_drop_locks+0x84/0x90 [ 26.485918] drm_mode_atomic_ioctl+0x860/0xb8c [ 26.490359] drm_ioctl_kernel+0xc4/0x120 [ 26.494278] drm_ioctl+0x268/0x534 [ 26.497677] __arm64_sys_ioctl+0xa8/0xf0 [ 26.501598] el0_svc_common.constprop.0+0x80/0x240 [ 26.506384] do_el0_svc+0x24/0x90 [ 26.509697] el0_svc+0x20/0x30 [ 26.512748] el0_sync_handler+0xe8/0xf0 [ 26.516580] el0_sync+0x1a4/0x1c0 [ 26.519891] irq event stamp: 0 [ 26.522943] hardirqs last enabled at (0): [<0000000000000000>] 0x0 [ 26.529207] hardirqs last disabled at (0): [<ffff800010056d34>] copy_process+0x5d0/0x183c [ 26.537379] softirqs last enabled at (0): [<ffff800010056d34>] copy_process+0x5d0/0x183c [ 26.545550] softirqs last disabled at (0): [<0000000000000000>] 0x0 [ 26.551812] ---[ end trace 20ae984fa860184b ]---
According to the call trace information,it can be located to be WARN_ON(IS_ERR(c_st)) in the komeda_pipeline_unbound_components function; Then follow the function. komeda_pipeline_unbound_components -> komeda_component_get_state_and_set_user -> komeda_pipeline_get_state_and_set_crtc -> komeda_pipeline_get_state ->drm_atomic_get_private_obj_state -> drm_atomic_get_private_obj_state -> drm_modeset_lock
komeda_pipeline_unbound_components -> komeda_component_get_state_and_set_user -> komeda_component_get_state -> drm_atomic_get_private_obj_state -> drm_modeset_lock
ret = drm_modeset_lock(&obj->lock, state->acquire_ctx); if (ret) return ERR_PTR(ret); Here it return -EDEADLK.
deal with the deadlock as suggested by [1], using the function drm_modeset_backoff(). [1] https://docs.kernel.org/gpu/drm-kms.html?highlight=kms#kms-locking
Therefore, handling this problem can be solved by adding return -EDEADLK back to the drm_modeset_backoff processing flow in the drm_mode_atomic_ioctl function.
Signed-off-by: baozhu.liu lucas.liu@siengine.com Signed-off-by: menghui.huang menghui.huang@siengine.com Reviewed-by: Liviu Dudau liviu.dudau@arm.com Signed-off-by: Liviu Dudau liviu.dudau@arm.com Link: https://patchwork.freedesktop.org/patch/msgid/20230804013117.6870-1-menghui.... Signed-off-by: Sasha Levin sashal@kernel.org --- .../gpu/drm/arm/display/komeda/komeda_pipeline_state.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/arm/display/komeda/komeda_pipeline_state.c b/drivers/gpu/drm/arm/display/komeda/komeda_pipeline_state.c index e672b9cffee3c..88b58153f9d66 100644 --- a/drivers/gpu/drm/arm/display/komeda/komeda_pipeline_state.c +++ b/drivers/gpu/drm/arm/display/komeda/komeda_pipeline_state.c @@ -1223,7 +1223,7 @@ int komeda_build_display_data_flow(struct komeda_crtc *kcrtc, return 0; }
-static void +static int komeda_pipeline_unbound_components(struct komeda_pipeline *pipe, struct komeda_pipeline_state *new) { @@ -1243,8 +1243,12 @@ komeda_pipeline_unbound_components(struct komeda_pipeline *pipe, c = komeda_pipeline_get_component(pipe, id); c_st = komeda_component_get_state_and_set_user(c, drm_st, NULL, new->crtc); + if (PTR_ERR(c_st) == -EDEADLK) + return -EDEADLK; WARN_ON(IS_ERR(c_st)); } + + return 0; }
/* release unclaimed pipeline resource */ @@ -1266,9 +1270,8 @@ int komeda_release_unclaimed_resources(struct komeda_pipeline *pipe, if (WARN_ON(IS_ERR_OR_NULL(st))) return -EINVAL;
- komeda_pipeline_unbound_components(pipe, st); + return komeda_pipeline_unbound_components(pipe, st);
- return 0; }
/* Since standalong disabled components must be disabled separately and in the
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xiaogang Chen xiaogang.chen@amd.com
[ Upstream commit 709c348261618da7ed89d6c303e2ceb9e453ba74 ]
prange->svm_bo unref can happen in both mmu callback and a callback after migrate to system ram. Both are async call in different tasks. Sync svm_bo unref operation to avoid random "use-after-free".
Signed-off-by: Xiaogang Chen xiaogang.chen@amd.com Reviewed-by: Philip Yang Philip.Yang@amd.com Reviewed-by: Jesse Zhang Jesse.Zhang@amd.com Tested-by: Jesse Zhang Jesse.Zhang@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 11 +++++++++-- 1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c index 86135ca33e5be..5ffbf9ab643b8 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c @@ -550,8 +550,15 @@ svm_range_vram_node_new(struct amdgpu_device *adev, struct svm_range *prange,
void svm_range_vram_node_free(struct svm_range *prange) { - svm_range_bo_unref(prange->svm_bo); - prange->ttm_res = NULL; + /* serialize prange->svm_bo unref */ + mutex_lock(&prange->lock); + /* prange->svm_bo has not been unref */ + if (prange->ttm_res) { + prange->ttm_res = NULL; + mutex_unlock(&prange->lock); + svm_range_bo_unref(prange->svm_bo); + } else + mutex_unlock(&prange->lock); }
struct amdgpu_device *
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wenjing Liu wenjing.liu@amd.com
[ Upstream commit 05b78277ef0efc1deebc8a22384fffec29a3676e ]
[why] Clip size increase will increase viewport, which could cause us to switch to MPC combine. If we skip full update, we are not able to change to MPC combine in fast update. This will cause corruption showing on the video plane.
[how] treat clip size increase of a surface larger than 5k as a full update.
Reviewed-by: Jun Lei jun.lei@amd.com Acked-by: Aurabindo Pillai aurabindo.pillai@amd.com Signed-off-by: Wenjing Liu wenjing.liu@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/dc/core/dc.c | 12 ++++++++++-- drivers/gpu/drm/amd/display/dc/dc.h | 5 +++++ 2 files changed, 15 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c index ffe7479a047d8..3919e75fec16d 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c @@ -886,7 +886,8 @@ static bool dc_construct(struct dc *dc, /* set i2c speed if not done by the respective dcnxxx__resource.c */ if (dc->caps.i2c_speed_in_khz_hdcp == 0) dc->caps.i2c_speed_in_khz_hdcp = dc->caps.i2c_speed_in_khz; - + if (dc->caps.max_optimizable_video_width == 0) + dc->caps.max_optimizable_video_width = 5120; dc->clk_mgr = dc_clk_mgr_create(dc->ctx, dc->res_pool->pp_smu, dc->res_pool->dccg); if (!dc->clk_mgr) goto fail; @@ -2053,6 +2054,7 @@ static enum surface_update_type get_plane_info_update_type(const struct dc_surfa }
static enum surface_update_type get_scaling_info_update_type( + const struct dc *dc, const struct dc_surface_update *u) { union surface_update_flags *update_flags = &u->surface->update_flags; @@ -2087,6 +2089,12 @@ static enum surface_update_type get_scaling_info_update_type( update_flags->bits.clock_change = 1; }
+ if (u->scaling_info->src_rect.width > dc->caps.max_optimizable_video_width && + (u->scaling_info->clip_rect.width > u->surface->clip_rect.width || + u->scaling_info->clip_rect.height > u->surface->clip_rect.height)) + /* Changing clip size of a large surface may result in MPC slice count change */ + update_flags->bits.bandwidth_change = 1; + if (u->scaling_info->src_rect.x != u->surface->src_rect.x || u->scaling_info->src_rect.y != u->surface->src_rect.y || u->scaling_info->clip_rect.x != u->surface->clip_rect.x @@ -2124,7 +2132,7 @@ static enum surface_update_type det_surface_update(const struct dc *dc, type = get_plane_info_update_type(u); elevate_update_type(&overall_type, type);
- type = get_scaling_info_update_type(u); + type = get_scaling_info_update_type(dc, u); elevate_update_type(&overall_type, type);
if (u->flip_addr) diff --git a/drivers/gpu/drm/amd/display/dc/dc.h b/drivers/gpu/drm/amd/display/dc/dc.h index e0f58fab5e8ed..09a8726c26399 100644 --- a/drivers/gpu/drm/amd/display/dc/dc.h +++ b/drivers/gpu/drm/amd/display/dc/dc.h @@ -164,6 +164,11 @@ struct dc_caps { uint32_t dmdata_alloc_size; unsigned int max_cursor_size; unsigned int max_video_width; + /* + * max video plane width that can be safely assumed to be always + * supported by single DPP pipe. + */ + unsigned int max_optimizable_video_width; unsigned int min_horizontal_blanking_period; int linear_pitch_alignment; bool dcc_const_color;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Philipp Stanner pstanner@redhat.com
[ Upstream commit 313ebe47d75558511aa1237b6e35c663b5c0ec6f ]
Currently, user array duplications are sometimes done without an overflow check. Sometimes the checks are done manually; sometimes the array size is calculated with array_size() and sometimes by calculating n * size directly in code.
Introduce wrappers for arrays for memdup_user() and vmemdup_user() to provide a standardized and safe way for duplicating user arrays.
This is both for new code as well as replacing usage of (v)memdup_user() in existing code that uses, e.g., n * size to calculate array sizes.
Suggested-by: David Airlie airlied@redhat.com Signed-off-by: Philipp Stanner pstanner@redhat.com Reviewed-by: Andy Shevchenko andy.shevchenko@gmail.com Reviewed-by: Kees Cook keescook@chromium.org Reviewed-by: Zack Rusin zackr@vmware.com Signed-off-by: Dave Airlie airlied@redhat.com Link: https://patchwork.freedesktop.org/patch/msgid/20230920123612.16914-3-pstanne... Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/string.h | 40 ++++++++++++++++++++++++++++++++++++++++ 1 file changed, 40 insertions(+)
diff --git a/include/linux/string.h b/include/linux/string.h index d68097b4f600b..3b9f5abe5ee83 100644 --- a/include/linux/string.h +++ b/include/linux/string.h @@ -5,7 +5,9 @@ #include <linux/compiler.h> /* for inline */ #include <linux/types.h> /* for size_t */ #include <linux/stddef.h> /* for NULL */ +#include <linux/err.h> /* for ERR_PTR() */ #include <linux/errno.h> /* for E2BIG */ +#include <linux/overflow.h> /* for check_mul_overflow() */ #include <linux/stdarg.h> #include <uapi/linux/string.h>
@@ -14,6 +16,44 @@ extern void *memdup_user(const void __user *, size_t); extern void *vmemdup_user(const void __user *, size_t); extern void *memdup_user_nul(const void __user *, size_t);
+/** + * memdup_array_user - duplicate array from user space + * @src: source address in user space + * @n: number of array members to copy + * @size: size of one array member + * + * Return: an ERR_PTR() on failure. Result is physically + * contiguous, to be freed by kfree(). + */ +static inline void *memdup_array_user(const void __user *src, size_t n, size_t size) +{ + size_t nbytes; + + if (check_mul_overflow(n, size, &nbytes)) + return ERR_PTR(-EOVERFLOW); + + return memdup_user(src, nbytes); +} + +/** + * vmemdup_array_user - duplicate array from user space + * @src: source address in user space + * @n: number of array members to copy + * @size: size of one array member + * + * Return: an ERR_PTR() on failure. Result may be not + * physically contiguous. Use kvfree() to free. + */ +static inline void *vmemdup_array_user(const void __user *src, size_t n, size_t size) +{ + size_t nbytes; + + if (check_mul_overflow(n, size, &nbytes)) + return ERR_PTR(-EOVERFLOW); + + return vmemdup_user(src, nbytes); +} + /* * Include machine specific inline routines */
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Philipp Stanner pstanner@redhat.com
[ Upstream commit 569c8d82f95eb5993c84fb61a649a9c4ddd208b3 ]
Currently, there is no overflow-check with memdup_user().
Use the new function memdup_array_user() instead of memdup_user() for duplicating the user-space array safely.
Suggested-by: David Airlie airlied@redhat.com Signed-off-by: Philipp Stanner pstanner@redhat.com Acked-by: Baoquan He bhe@redhat.com Reviewed-by: Kees Cook keescook@chromium.org Reviewed-by: Zack Rusin zackr@vmware.com Signed-off-by: Dave Airlie airlied@redhat.com Link: https://patchwork.freedesktop.org/patch/msgid/20230920123612.16914-4-pstanne... Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/kexec.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/kexec.c b/kernel/kexec.c index cb8e6e6f983c7..5ff1dcc4acb78 100644 --- a/kernel/kexec.c +++ b/kernel/kexec.c @@ -240,7 +240,7 @@ SYSCALL_DEFINE4(kexec_load, unsigned long, entry, unsigned long, nr_segments, ((flags & KEXEC_ARCH_MASK) != KEXEC_ARCH_DEFAULT)) return -EINVAL;
- ksegments = memdup_user(segments, nr_segments * sizeof(ksegments[0])); + ksegments = memdup_array_user(segments, nr_segments, sizeof(ksegments[0])); if (IS_ERR(ksegments)) return PTR_ERR(ksegments);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Philipp Stanner pstanner@redhat.com
[ Upstream commit ca0776571d3163bd03b3e8c9e3da936abfaecbf6 ]
Currently, there is no overflow-check with memdup_user().
Use the new function memdup_array_user() instead of memdup_user() for duplicating the user-space array safely.
Suggested-by: David Airlie airlied@redhat.com Signed-off-by: Philipp Stanner pstanner@redhat.com Reviewed-by: Kees Cook keescook@chromium.org Reviewed-by: Zack Rusin zackr@vmware.com Signed-off-by: Dave Airlie airlied@redhat.com Link: https://patchwork.freedesktop.org/patch/msgid/20230920123612.16914-5-pstanne... Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/watch_queue.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/watch_queue.c b/kernel/watch_queue.c index 54cbaa9711398..ae31bf8d2feb1 100644 --- a/kernel/watch_queue.c +++ b/kernel/watch_queue.c @@ -338,7 +338,7 @@ long watch_queue_set_filter(struct pipe_inode_info *pipe, filter.__reserved != 0) return -EINVAL;
- tf = memdup_user(_filter->filters, filter.nr_filters * sizeof(*tf)); + tf = memdup_array_user(_filter->filters, filter.nr_filters, sizeof(*tf)); if (IS_ERR(tf)) return PTR_ERR(tf);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Philipp Stanner pstanner@redhat.com
[ Upstream commit 06ab64a0d836ac430c5f94669710a78aa43942cb ]
Currently, there is no overflow-check with memdup_user().
Use the new function memdup_array_user() instead of memdup_user() for duplicating the user-space array safely.
Suggested-by: David Airlie airlied@redhat.com Signed-off-by: Philipp Stanner pstanner@redhat.com Reviewed-by: Kees Cook keescook@chromium.org Reviewed-by: Zack Rusin zackr@vmware.com Signed-off-by: Dave Airlie airlied@redhat.com Link: https://patchwork.freedesktop.org/patch/msgid/20230920123612.16914-7-pstanne... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/vmwgfx/vmwgfx_surface.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c b/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c index 5d53a5f9d1237..872af7d4b3fc9 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_surface.c @@ -807,9 +807,9 @@ int vmw_surface_define_ioctl(struct drm_device *dev, void *data, metadata->num_sizes = num_sizes; user_srf->size = size; metadata->sizes = - memdup_user((struct drm_vmw_size __user *)(unsigned long) + memdup_array_user((struct drm_vmw_size __user *)(unsigned long) req->size_addr, - sizeof(*metadata->sizes) * metadata->num_sizes); + metadata->num_sizes, sizeof(*metadata->sizes)); if (IS_ERR(metadata->sizes)) { ret = PTR_ERR(metadata->sizes); goto out_no_sizes;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jani Nikula jani.nikula@intel.com
[ Upstream commit a251c9d8e30833b260101edb9383b176ee2b7cb1 ]
The DP CTS test for EDID last block checksum expects the checksum for the last block, invalid or not. Skip the validity check.
For the most part (*), the EDIDs returned by drm_get_edid() will be valid anyway, and there's the CTS workaround to get the checksum for completely invalid EDIDs. See commit 7948fe12d47a ("drm/msm/dp: return correct edid checksum after corrupted edid checksum read").
This lets us remove one user of drm_edid_block_valid() with hopes the function can be removed altogether in the future.
(*) drm_get_edid() ignores checksum errors on CTA extensions.
Cc: Abhinav Kumar quic_abhinavk@quicinc.com Cc: Dmitry Baryshkov dmitry.baryshkov@linaro.org Cc: Kuogee Hsieh khsieh@codeaurora.org Cc: Marijn Suijten marijn.suijten@somainline.org Cc: Rob Clark robdclark@gmail.com Cc: Sean Paul sean@poorly.run Cc: Stephen Boyd swboyd@chromium.org Cc: linux-arm-msm@vger.kernel.org Cc: freedreno@lists.freedesktop.org Signed-off-by: Jani Nikula jani.nikula@intel.com Reviewed-by: Stephen Boyd swboyd@chromium.org Reviewed-by: Abhinav Kumar quic_abhinavk@quicinc.com Reviewed-by: Kuogee Hsieh quic_khsieh@quicinc.com Patchwork: https://patchwork.freedesktop.org/patch/555361/ Link: https://lore.kernel.org/r/20230901142034.580802-1-jani.nikula@intel.com Signed-off-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/msm/dp/dp_panel.c | 21 ++------------------- 1 file changed, 2 insertions(+), 19 deletions(-)
diff --git a/drivers/gpu/drm/msm/dp/dp_panel.c b/drivers/gpu/drm/msm/dp/dp_panel.c index 62b742e701d2c..f9d31069f4848 100644 --- a/drivers/gpu/drm/msm/dp/dp_panel.c +++ b/drivers/gpu/drm/msm/dp/dp_panel.c @@ -263,26 +263,9 @@ int dp_panel_get_modes(struct dp_panel *dp_panel,
static u8 dp_panel_get_edid_checksum(struct edid *edid) { - struct edid *last_block; - u8 *raw_edid; - bool is_edid_corrupt = false; + edid += edid->extensions;
- if (!edid) { - DRM_ERROR("invalid edid input\n"); - return 0; - } - - raw_edid = (u8 *)edid; - raw_edid += (edid->extensions * EDID_LENGTH); - last_block = (struct edid *)raw_edid; - - /* block type extension */ - drm_edid_block_valid(raw_edid, 1, false, &is_edid_corrupt); - if (!is_edid_corrupt) - return last_block->checksum; - - DRM_ERROR("Invalid block, no checksum\n"); - return 0; + return edid->checksum; }
void dp_panel_handle_sink_request(struct dp_panel *dp_panel)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mario Limonciello mario.limonciello@amd.com
[ Upstream commit 760efbca74a405dc439a013a5efaa9fadc95a8c3 ]
For pptable structs that use flexible array sizes, use flexible arrays.
Suggested-by: Felix Held felix.held@amd.com Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2874 Signed-off-by: Mario Limonciello mario.limonciello@amd.com Acked-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/include/pptable.h | 4 ++-- drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/include/pptable.h b/drivers/gpu/drm/amd/include/pptable.h index 0b6a057e0a4c4..5aac8d545bdc6 100644 --- a/drivers/gpu/drm/amd/include/pptable.h +++ b/drivers/gpu/drm/amd/include/pptable.h @@ -78,7 +78,7 @@ typedef struct _ATOM_PPLIB_THERMALCONTROLLER typedef struct _ATOM_PPLIB_STATE { UCHAR ucNonClockStateIndex; - UCHAR ucClockStateIndices[1]; // variable-sized + UCHAR ucClockStateIndices[]; // variable-sized } ATOM_PPLIB_STATE;
@@ -473,7 +473,7 @@ typedef struct _ATOM_PPLIB_STATE_V2 /** * Driver will read the first ucNumDPMLevels in this array */ - UCHAR clockInfoIndex[1]; + UCHAR clockInfoIndex[]; } ATOM_PPLIB_STATE_V2;
typedef struct _StateArray{ diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h index b0ac4d121adca..41444e27bfc0c 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h @@ -179,7 +179,7 @@ typedef struct _ATOM_Tonga_MCLK_Dependency_Record { typedef struct _ATOM_Tonga_MCLK_Dependency_Table { UCHAR ucRevId; UCHAR ucNumEntries; /* Number of entries. */ - ATOM_Tonga_MCLK_Dependency_Record entries[1]; /* Dynamically allocate entries. */ + ATOM_Tonga_MCLK_Dependency_Record entries[]; /* Dynamically allocate entries. */ } ATOM_Tonga_MCLK_Dependency_Table;
typedef struct _ATOM_Tonga_SCLK_Dependency_Record { @@ -194,7 +194,7 @@ typedef struct _ATOM_Tonga_SCLK_Dependency_Record { typedef struct _ATOM_Tonga_SCLK_Dependency_Table { UCHAR ucRevId; UCHAR ucNumEntries; /* Number of entries. */ - ATOM_Tonga_SCLK_Dependency_Record entries[1]; /* Dynamically allocate entries. */ + ATOM_Tonga_SCLK_Dependency_Record entries[]; /* Dynamically allocate entries. */ } ATOM_Tonga_SCLK_Dependency_Table;
typedef struct _ATOM_Polaris_SCLK_Dependency_Record {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mario Limonciello mario.limonciello@amd.com
[ Upstream commit 0f0e59075b5c22f1e871fbd508d6e4f495048356 ]
For pptable structs that use flexible array sizes, use flexible arrays.
Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2036742 Signed-off-by: Mario Limonciello mario.limonciello@amd.com Acked-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h index 41444e27bfc0c..e0e40b054c08b 100644 --- a/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h +++ b/drivers/gpu/drm/amd/pm/powerplay/hwmgr/pptable_v1_0.h @@ -164,7 +164,7 @@ typedef struct _ATOM_Tonga_State { typedef struct _ATOM_Tonga_State_Array { UCHAR ucRevId; UCHAR ucNumEntries; /* Number of entries. */ - ATOM_Tonga_State entries[1]; /* Dynamically allocate entries. */ + ATOM_Tonga_State entries[]; /* Dynamically allocate entries. */ } ATOM_Tonga_State_Array;
typedef struct _ATOM_Tonga_MCLK_Dependency_Record { @@ -210,7 +210,7 @@ typedef struct _ATOM_Polaris_SCLK_Dependency_Record { typedef struct _ATOM_Polaris_SCLK_Dependency_Table { UCHAR ucRevId; UCHAR ucNumEntries; /* Number of entries. */ - ATOM_Polaris_SCLK_Dependency_Record entries[1]; /* Dynamically allocate entries. */ + ATOM_Polaris_SCLK_Dependency_Record entries[]; /* Dynamically allocate entries. */ } ATOM_Polaris_SCLK_Dependency_Table;
typedef struct _ATOM_Tonga_PCIE_Record { @@ -222,7 +222,7 @@ typedef struct _ATOM_Tonga_PCIE_Record { typedef struct _ATOM_Tonga_PCIE_Table { UCHAR ucRevId; UCHAR ucNumEntries; /* Number of entries. */ - ATOM_Tonga_PCIE_Record entries[1]; /* Dynamically allocate entries. */ + ATOM_Tonga_PCIE_Record entries[]; /* Dynamically allocate entries. */ } ATOM_Tonga_PCIE_Table;
typedef struct _ATOM_Polaris10_PCIE_Record { @@ -235,7 +235,7 @@ typedef struct _ATOM_Polaris10_PCIE_Record { typedef struct _ATOM_Polaris10_PCIE_Table { UCHAR ucRevId; UCHAR ucNumEntries; /* Number of entries. */ - ATOM_Polaris10_PCIE_Record entries[1]; /* Dynamically allocate entries. */ + ATOM_Polaris10_PCIE_Record entries[]; /* Dynamically allocate entries. */ } ATOM_Polaris10_PCIE_Table;
@@ -252,7 +252,7 @@ typedef struct _ATOM_Tonga_MM_Dependency_Record { typedef struct _ATOM_Tonga_MM_Dependency_Table { UCHAR ucRevId; UCHAR ucNumEntries; /* Number of entries. */ - ATOM_Tonga_MM_Dependency_Record entries[1]; /* Dynamically allocate entries. */ + ATOM_Tonga_MM_Dependency_Record entries[]; /* Dynamically allocate entries. */ } ATOM_Tonga_MM_Dependency_Table;
typedef struct _ATOM_Tonga_Voltage_Lookup_Record { @@ -265,7 +265,7 @@ typedef struct _ATOM_Tonga_Voltage_Lookup_Record { typedef struct _ATOM_Tonga_Voltage_Lookup_Table { UCHAR ucRevId; UCHAR ucNumEntries; /* Number of entries. */ - ATOM_Tonga_Voltage_Lookup_Record entries[1]; /* Dynamically allocate entries. */ + ATOM_Tonga_Voltage_Lookup_Record entries[]; /* Dynamically allocate entries. */ } ATOM_Tonga_Voltage_Lookup_Table;
typedef struct _ATOM_Tonga_Fan_Table {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stanley.Yang Stanley.Yang@amd.com
[ Upstream commit 80285ae1ec8717b597b20de38866c29d84d321a1 ]
The amdgpu_ras_get_context may return NULL if device not support ras feature, so add check before using.
Signed-off-by: Stanley.Yang Stanley.Yang@amd.com Reviewed-by: Tao Zhou tao.zhou1@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index f57334fff7fc8..19e32f38a4c45 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -5116,7 +5116,8 @@ int amdgpu_device_gpu_recover(struct amdgpu_device *adev, * Flush RAM to disk so that after reboot * the user can read log and see why the system rebooted. */ - if (need_emergency_restart && amdgpu_ras_get_context(adev)->reboot) { + if (need_emergency_restart && amdgpu_ras_get_context(adev) && + amdgpu_ras_get_context(adev)->reboot) { DRM_WARN("Emergency reboot.");
ksys_sync_helper();
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ma Ke make_ruc2021@163.com
[ Upstream commit 924e5814d1f84e6fa5cb19c6eceb69f066225229 ]
In versatile_panel_get_modes(), the return value of drm_mode_duplicate() is assigned to mode, which will lead to a NULL pointer dereference on failure of drm_mode_duplicate(). Add a check to avoid npd.
Signed-off-by: Ma Ke make_ruc2021@163.com Reviewed-by: Neil Armstrong neil.armstrong@linaro.org Link: https://lore.kernel.org/r/20231007033105.3997998-1-make_ruc2021@163.com Signed-off-by: Neil Armstrong neil.armstrong@linaro.org Link: https://patchwork.freedesktop.org/patch/msgid/20231007033105.3997998-1-make_... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/panel/panel-arm-versatile.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/panel/panel-arm-versatile.c b/drivers/gpu/drm/panel/panel-arm-versatile.c index abb0788843c60..503ecea72c5ea 100644 --- a/drivers/gpu/drm/panel/panel-arm-versatile.c +++ b/drivers/gpu/drm/panel/panel-arm-versatile.c @@ -267,6 +267,8 @@ static int versatile_panel_get_modes(struct drm_panel *panel, connector->display_info.bus_flags = vpanel->panel_type->bus_flags;
mode = drm_mode_duplicate(connector->dev, &vpanel->panel_type->mode); + if (!mode) + return -ENOMEM; drm_mode_set_name(mode); mode->type = DRM_MODE_TYPE_DRIVER | DRM_MODE_TYPE_PREFERRED;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ma Ke make_ruc2021@163.com
[ Upstream commit f22def5970c423ea7f87d5247bd0ef91416b0658 ]
In tpg110_get_modes(), the return value of drm_mode_duplicate() is assigned to mode, which will lead to a NULL pointer dereference on failure of drm_mode_duplicate(). Add a check to avoid npd.
Signed-off-by: Ma Ke make_ruc2021@163.com Reviewed-by: Neil Armstrong neil.armstrong@linaro.org Link: https://lore.kernel.org/r/20231009090446.4043798-1-make_ruc2021@163.com Signed-off-by: Neil Armstrong neil.armstrong@linaro.org Link: https://patchwork.freedesktop.org/patch/msgid/20231009090446.4043798-1-make_... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/panel/panel-tpo-tpg110.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/panel/panel-tpo-tpg110.c b/drivers/gpu/drm/panel/panel-tpo-tpg110.c index e3791dad6830c..3360e7ccb0a7d 100644 --- a/drivers/gpu/drm/panel/panel-tpo-tpg110.c +++ b/drivers/gpu/drm/panel/panel-tpo-tpg110.c @@ -379,6 +379,8 @@ static int tpg110_get_modes(struct drm_panel *panel, connector->display_info.bus_flags = tpg->panel_mode->bus_flags;
mode = drm_mode_duplicate(connector->dev, &tpg->panel_mode->mode); + if (!mode) + return -ENOMEM; drm_mode_set_name(mode); mode->type = DRM_MODE_TYPE_DRIVER | DRM_MODE_TYPE_PREFERRED;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ma Ke make_ruc2021@163.com
[ Upstream commit cd90511557fdfb394bb4ac4c3b539b007383914c ]
In amdgpu_vkms_conn_get_modes(), the return value of drm_cvt_mode() is assigned to mode, which will lead to a NULL pointer dereference on failure of drm_cvt_mode(). Add a check to avoid null pointer dereference.
Signed-off-by: Ma Ke make_ruc2021@163.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c index 4e8274de8fc0c..083f9c637a82e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_vkms.c @@ -238,6 +238,8 @@ static int amdgpu_vkms_conn_get_modes(struct drm_connector *connector)
for (i = 0; i < ARRAY_SIZE(common_modes); i++) { mode = drm_cvt_mode(dev, common_modes[i].w, common_modes[i].h, 60, false, false, false); + if (!mode) + continue; drm_mode_probed_add(connector, mode); }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ondrej Jirman megi@xff.cz
[ Upstream commit d12d635bb03c7cb4830acb641eb176ee9ff2aa89 ]
Switching to a different reset sequence, enabling IOVCC before enabling VCC.
There also needs to be a delay after enabling the supplies and before deasserting the reset. The datasheet specifies 1ms after the supplies reach the required voltage. Use 10-20ms to also give the power supplies some time to reach the required voltage, too.
This fixes intermittent panel initialization failures and screen corruption during resume from sleep on panel xingbangda,xbd599 (e.g. used in PinePhone).
Signed-off-by: Ondrej Jirman megi@xff.cz Signed-off-by: Frank Oltmanns frank@oltmanns.dev Reported-by: Samuel Holland samuel@sholland.org Reviewed-by: Guido Günther agx@sigxcpu.org Tested-by: Guido Günther agx@sigxcpu.org Signed-off-by: Guido Günther agx@sigxcpu.org Link: https://patchwork.freedesktop.org/patch/msgid/20230211171748.36692-2-frank@o... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/panel/panel-sitronix-st7703.c | 25 ++++++++++--------- 1 file changed, 13 insertions(+), 12 deletions(-)
diff --git a/drivers/gpu/drm/panel/panel-sitronix-st7703.c b/drivers/gpu/drm/panel/panel-sitronix-st7703.c index a2c303e5732c0..f50cc70e6337c 100644 --- a/drivers/gpu/drm/panel/panel-sitronix-st7703.c +++ b/drivers/gpu/drm/panel/panel-sitronix-st7703.c @@ -428,29 +428,30 @@ static int st7703_prepare(struct drm_panel *panel) return 0;
dev_dbg(ctx->dev, "Resetting the panel\n"); - ret = regulator_enable(ctx->vcc); + gpiod_set_value_cansleep(ctx->reset_gpio, 1); + + ret = regulator_enable(ctx->iovcc); if (ret < 0) { - dev_err(ctx->dev, "Failed to enable vcc supply: %d\n", ret); + dev_err(ctx->dev, "Failed to enable iovcc supply: %d\n", ret); return ret; } - ret = regulator_enable(ctx->iovcc); + + ret = regulator_enable(ctx->vcc); if (ret < 0) { - dev_err(ctx->dev, "Failed to enable iovcc supply: %d\n", ret); - goto disable_vcc; + dev_err(ctx->dev, "Failed to enable vcc supply: %d\n", ret); + regulator_disable(ctx->iovcc); + return ret; }
- gpiod_set_value_cansleep(ctx->reset_gpio, 1); - usleep_range(20, 40); + /* Give power supplies time to stabilize before deasserting reset. */ + usleep_range(10000, 20000); + gpiod_set_value_cansleep(ctx->reset_gpio, 0); - msleep(20); + usleep_range(15000, 20000);
ctx->prepared = true;
return 0; - -disable_vcc: - regulator_disable(ctx->vcc); - return ret; }
static int st7703_get_modes(struct drm_panel *panel,
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jesse Zhang jesse.zhang@amd.com
[ Upstream commit 282c1d793076c2edac6c3db51b7e8ed2b41d60a5 ]
[ 567.613292] shift exponent 255 is too large for 64-bit type 'long unsigned int' [ 567.614498] CPU: 5 PID: 238 Comm: kworker/5:1 Tainted: G OE 6.2.0-34-generic #34~22.04.1-Ubuntu [ 567.614502] Hardware name: AMD Splinter/Splinter-RPL, BIOS WS43927N_871 09/25/2023 [ 567.614504] Workqueue: events send_exception_work_handler [amdgpu] [ 567.614748] Call Trace: [ 567.614750] <TASK> [ 567.614753] dump_stack_lvl+0x48/0x70 [ 567.614761] dump_stack+0x10/0x20 [ 567.614763] __ubsan_handle_shift_out_of_bounds+0x156/0x310 [ 567.614769] ? srso_alias_return_thunk+0x5/0x7f [ 567.614773] ? update_sd_lb_stats.constprop.0+0xf2/0x3c0 [ 567.614780] svm_range_split_by_granularity.cold+0x2b/0x34 [amdgpu] [ 567.615047] ? srso_alias_return_thunk+0x5/0x7f [ 567.615052] svm_migrate_to_ram+0x185/0x4d0 [amdgpu] [ 567.615286] do_swap_page+0x7b6/0xa30 [ 567.615291] ? srso_alias_return_thunk+0x5/0x7f [ 567.615294] ? __free_pages+0x119/0x130 [ 567.615299] handle_pte_fault+0x227/0x280 [ 567.615303] __handle_mm_fault+0x3c0/0x720 [ 567.615311] handle_mm_fault+0x119/0x330 [ 567.615314] ? lock_mm_and_find_vma+0x44/0x250 [ 567.615318] do_user_addr_fault+0x1a9/0x640 [ 567.615323] exc_page_fault+0x81/0x1b0 [ 567.615328] asm_exc_page_fault+0x27/0x30 [ 567.615332] RIP: 0010:__get_user_8+0x1c/0x30
Signed-off-by: Jesse Zhang jesse.zhang@amd.com Suggested-by: Philip Yang Philip.Yang@amd.com Reviewed-by: Yifan Zhang yifan1.zhang@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdkfd/kfd_svm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c index 5ffbf9ab643b8..2cbe8ea16f24a 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_svm.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_svm.c @@ -698,7 +698,7 @@ svm_range_apply_attrs(struct kfd_process *p, struct svm_range *prange, prange->flags &= ~attrs[i].value; break; case KFD_IOCTL_SVM_ATTR_GRANULARITY: - prange->granularity = attrs[i].value; + prange->granularity = min_t(uint32_t, attrs[i].value, 0x3F); break; default: WARN_ONCE(1, "svm_range_check_attrs wasn't called?");
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qu Huang qu.huang@linux.dev
[ Upstream commit 5104fdf50d326db2c1a994f8b35dcd46e63ae4ad ]
In certain types of chips, such as VEGA20, reading the amdgpu_regs_smc file could result in an abnormal null pointer access when the smc_rreg pointer is NULL. Below are the steps to reproduce this issue and the corresponding exception log:
1. Navigate to the directory: /sys/kernel/debug/dri/0 2. Execute command: cat amdgpu_regs_smc 3. Exception Log:: [4005007.702554] BUG: kernel NULL pointer dereference, address: 0000000000000000 [4005007.702562] #PF: supervisor instruction fetch in kernel mode [4005007.702567] #PF: error_code(0x0010) - not-present page [4005007.702570] PGD 0 P4D 0 [4005007.702576] Oops: 0010 [#1] SMP NOPTI [4005007.702581] CPU: 4 PID: 62563 Comm: cat Tainted: G OE 5.15.0-43-generic #46-Ubunt u [4005007.702590] RIP: 0010:0x0 [4005007.702598] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. [4005007.702600] RSP: 0018:ffffa82b46d27da0 EFLAGS: 00010206 [4005007.702605] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffa82b46d27e68 [4005007.702609] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff9940656e0000 [4005007.702612] RBP: ffffa82b46d27dd8 R08: 0000000000000000 R09: ffff994060c07980 [4005007.702615] R10: 0000000000020000 R11: 0000000000000000 R12: 00007f5e06753000 [4005007.702618] R13: ffff9940656e0000 R14: ffffa82b46d27e68 R15: 00007f5e06753000 [4005007.702622] FS: 00007f5e0755b740(0000) GS:ffff99479d300000(0000) knlGS:0000000000000000 [4005007.702626] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [4005007.702629] CR2: ffffffffffffffd6 CR3: 00000003253fc000 CR4: 00000000003506e0 [4005007.702633] Call Trace: [4005007.702636] <TASK> [4005007.702640] amdgpu_debugfs_regs_smc_read+0xb0/0x120 [amdgpu] [4005007.703002] full_proxy_read+0x5c/0x80 [4005007.703011] vfs_read+0x9f/0x1a0 [4005007.703019] ksys_read+0x67/0xe0 [4005007.703023] __x64_sys_read+0x19/0x20 [4005007.703028] do_syscall_64+0x5c/0xc0 [4005007.703034] ? do_user_addr_fault+0x1e3/0x670 [4005007.703040] ? exit_to_user_mode_prepare+0x37/0xb0 [4005007.703047] ? irqentry_exit_to_user_mode+0x9/0x20 [4005007.703052] ? irqentry_exit+0x19/0x30 [4005007.703057] ? exc_page_fault+0x89/0x160 [4005007.703062] ? asm_exc_page_fault+0x8/0x30 [4005007.703068] entry_SYSCALL_64_after_hwframe+0x44/0xae [4005007.703075] RIP: 0033:0x7f5e07672992 [4005007.703079] Code: c0 e9 b2 fe ff ff 50 48 8d 3d fa b2 0c 00 e8 c5 1d 02 00 0f 1f 44 00 00 f3 0f 1e fa 64 8b 04 25 18 00 00 00 85 c0 75 10 0f 05 <48> 3d 00 f0 ff ff 77 56 c3 0f 1f 44 00 00 48 83 e c 28 48 89 54 24 [4005007.703083] RSP: 002b:00007ffe03097898 EFLAGS: 00000246 ORIG_RAX: 0000000000000000 [4005007.703088] RAX: ffffffffffffffda RBX: 0000000000020000 RCX: 00007f5e07672992 [4005007.703091] RDX: 0000000000020000 RSI: 00007f5e06753000 RDI: 0000000000000003 [4005007.703094] RBP: 00007f5e06753000 R08: 00007f5e06752010 R09: 00007f5e06752010 [4005007.703096] R10: 0000000000000022 R11: 0000000000000246 R12: 0000000000022000 [4005007.703099] R13: 0000000000000003 R14: 0000000000020000 R15: 0000000000020000 [4005007.703105] </TASK> [4005007.703107] Modules linked in: nf_tables libcrc32c nfnetlink algif_hash af_alg binfmt_misc nls_ iso8859_1 ipmi_ssif ast intel_rapl_msr intel_rapl_common drm_vram_helper drm_ttm_helper amd64_edac t tm edac_mce_amd kvm_amd ccp mac_hid k10temp kvm acpi_ipmi ipmi_si rapl sch_fq_codel ipmi_devintf ipm i_msghandler msr parport_pc ppdev lp parport mtd pstore_blk efi_pstore ramoops pstore_zone reed_solo mon ip_tables x_tables autofs4 ib_uverbs ib_core amdgpu(OE) amddrm_ttm_helper(OE) amdttm(OE) iommu_v 2 amd_sched(OE) amdkcl(OE) drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops cec rc_core drm igb ahci xhci_pci libahci i2c_piix4 i2c_algo_bit xhci_pci_renesas dca [4005007.703184] CR2: 0000000000000000 [4005007.703188] ---[ end trace ac65a538d240da39 ]--- [4005007.800865] RIP: 0010:0x0 [4005007.800871] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6. [4005007.800874] RSP: 0018:ffffa82b46d27da0 EFLAGS: 00010206 [4005007.800878] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffa82b46d27e68 [4005007.800881] RDX: 0000000000000001 RSI: 0000000000000000 RDI: ffff9940656e0000 [4005007.800883] RBP: ffffa82b46d27dd8 R08: 0000000000000000 R09: ffff994060c07980 [4005007.800886] R10: 0000000000020000 R11: 0000000000000000 R12: 00007f5e06753000 [4005007.800888] R13: ffff9940656e0000 R14: ffffa82b46d27e68 R15: 00007f5e06753000 [4005007.800891] FS: 00007f5e0755b740(0000) GS:ffff99479d300000(0000) knlGS:0000000000000000 [4005007.800895] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [4005007.800898] CR2: ffffffffffffffd6 CR3: 00000003253fc000 CR4: 00000000003506e0
Signed-off-by: Qu Huang qu.huang@linux.dev Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c index 348629ea0e153..beb199d13451b 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_debugfs.c @@ -458,6 +458,9 @@ static ssize_t amdgpu_debugfs_regs_smc_read(struct file *f, char __user *buf, ssize_t result = 0; int r;
+ if (!adev->smc_rreg) + return -EPERM; + if (size & 0x3 || *pos & 0x3) return -EINVAL;
@@ -517,6 +520,9 @@ static ssize_t amdgpu_debugfs_regs_smc_write(struct file *f, const char __user * ssize_t result = 0; int r;
+ if (!adev->smc_wreg) + return -EPERM; + if (size & 0x3 || *pos & 0x3) return -EINVAL;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Laurentiu Tudor laurentiu.tudor@nxp.com
[ Upstream commit b39d5016456871a88f5cd141914a5043591b46f3 ]
Wrap the usb controllers in an intermediate simple-bus and use it to constrain the dma address size of these usb controllers to the 40b that they generate toward the interconnect. This is required because the SoC uses 48b address sizes and this mismatch would lead to smmu context faults [1] because the usb generates 40b addresses while the smmu page tables are populated with 48b wide addresses.
[1] xhci-hcd xhci-hcd.0.auto: xHCI Host Controller xhci-hcd xhci-hcd.0.auto: new USB bus registered, assigned bus number 1 xhci-hcd xhci-hcd.0.auto: hcc params 0x0220f66d hci version 0x100 quirks 0x0000000002000010 xhci-hcd xhci-hcd.0.auto: irq 108, io mem 0x03100000 xhci-hcd xhci-hcd.0.auto: xHCI Host Controller xhci-hcd xhci-hcd.0.auto: new USB bus registered, assigned bus number 2 xhci-hcd xhci-hcd.0.auto: Host supports USB 3.0 SuperSpeed arm-smmu 5000000.iommu: Unhandled context fault: fsr=0x402, iova=0xffffffb000, fsynr=0x0, cbfrsynra=0xc01, cb=3
Signed-off-by: Laurentiu Tudor laurentiu.tudor@nxp.com Signed-off-by: Shawn Guo shawnguo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../arm64/boot/dts/freescale/fsl-ls208xa.dtsi | 46 +++++++++++-------- 1 file changed, 27 insertions(+), 19 deletions(-)
diff --git a/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi b/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi index 12e59777363fe..9bb360db6b195 100644 --- a/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi +++ b/arch/arm64/boot/dts/freescale/fsl-ls208xa.dtsi @@ -1179,26 +1179,34 @@ sata1: sata@3210000 { dma-coherent; };
- usb0: usb@3100000 { - status = "disabled"; - compatible = "snps,dwc3"; - reg = <0x0 0x3100000 0x0 0x10000>; - interrupts = <0 80 0x4>; /* Level high type */ - dr_mode = "host"; - snps,quirk-frame-length-adjustment = <0x20>; - snps,dis_rxdet_inp3_quirk; - snps,incr-burst-type-adjustment = <1>, <4>, <8>, <16>; - }; + bus: bus { + #address-cells = <2>; + #size-cells = <2>; + compatible = "simple-bus"; + ranges; + dma-ranges = <0x0 0x0 0x0 0x0 0x100 0x00000000>; + + usb0: usb@3100000 { + compatible = "snps,dwc3"; + reg = <0x0 0x3100000 0x0 0x10000>; + interrupts = <0 80 0x4>; /* Level high type */ + dr_mode = "host"; + snps,quirk-frame-length-adjustment = <0x20>; + snps,dis_rxdet_inp3_quirk; + snps,incr-burst-type-adjustment = <1>, <4>, <8>, <16>; + status = "disabled"; + };
- usb1: usb@3110000 { - status = "disabled"; - compatible = "snps,dwc3"; - reg = <0x0 0x3110000 0x0 0x10000>; - interrupts = <0 81 0x4>; /* Level high type */ - dr_mode = "host"; - snps,quirk-frame-length-adjustment = <0x20>; - snps,dis_rxdet_inp3_quirk; - snps,incr-burst-type-adjustment = <1>, <4>, <8>, <16>; + usb1: usb@3110000 { + compatible = "snps,dwc3"; + reg = <0x0 0x3110000 0x0 0x10000>; + interrupts = <0 81 0x4>; /* Level high type */ + dr_mode = "host"; + snps,quirk-frame-length-adjustment = <0x20>; + snps,dis_rxdet_inp3_quirk; + snps,incr-burst-type-adjustment = <1>, <4>, <8>, <16>; + status = "disabled"; + }; };
ccn@4000000 {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: zhujun2 zhujun2@cmss.chinamobile.com
[ Upstream commit 3f6f8a8c5e11a9b384a36df4f40f0c9a653b6975 ]
The opened file should be closed in main(), otherwise resource leak will occur that this problem was discovered by code reading
Signed-off-by: zhujun2 zhujun2@cmss.chinamobile.com Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/efivarfs/create-read.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/tools/testing/selftests/efivarfs/create-read.c b/tools/testing/selftests/efivarfs/create-read.c index 9674a19396a32..7bc7af4eb2c17 100644 --- a/tools/testing/selftests/efivarfs/create-read.c +++ b/tools/testing/selftests/efivarfs/create-read.c @@ -32,8 +32,10 @@ int main(int argc, char **argv) rc = read(fd, buf, sizeof(buf)); if (rc != 0) { fprintf(stderr, "Reading a new var should return EOF\n"); + close(fd); return EXIT_FAILURE; }
+ close(fd); return EXIT_SUCCESS; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Richard Fitzgerald rf@opensource.cirrus.com
[ Upstream commit 47f56e38a199bd45514b8e0142399cba4feeaf1a ]
Add members to struct snd_soc_card to store the PCI subsystem ID (SSID) of the soundcard.
The PCI specification provides two registers to store a vendor-specific SSID that can be read by drivers to uniquely identify a particular "soundcard". This is defined in the PCI specification to distinguish products that use the same silicon (and therefore have the same silicon ID) so that product-specific differences can be applied.
PCI only defines 0xFFFF as an invalid value. 0x0000 is not defined as invalid. So the usual pattern of zero-filling the struct and then assuming a zero value unset will not work. A flag is included to indicate when the SSID information has been filled in.
Unlike DMI information, which has a free-format entirely up to the vendor, the PCI SSID has a strictly defined format and a registry of vendor IDs.
It is usual in Windows drivers that the SSID is used as the sole identifier of the specific end-product and the Windows driver contains tables mapping that to information about the hardware setup, rather than using ACPI properties.
This SSID is important information for ASoC components that need to apply hardware-specific configuration on PCI-based systems.
As the SSID is a generic part of the PCI specification and is treated as identifying the "soundcard", it is reasonable to include this information in struct snd_soc_card, instead of components inventing their own custom ways to pass this information around.
Signed-off-by: Richard Fitzgerald rf@opensource.cirrus.com Reviewed-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Link: https://lore.kernel.org/r/20230912163207.3498161-2-rf@opensource.cirrus.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/sound/soc-card.h | 37 +++++++++++++++++++++++++++++++++++++ include/sound/soc.h | 11 +++++++++++ 2 files changed, 48 insertions(+)
diff --git a/include/sound/soc-card.h b/include/sound/soc-card.h index 4f2cc4fb56b7f..9a5429260ece5 100644 --- a/include/sound/soc-card.h +++ b/include/sound/soc-card.h @@ -40,6 +40,43 @@ int snd_soc_card_add_dai_link(struct snd_soc_card *card, void snd_soc_card_remove_dai_link(struct snd_soc_card *card, struct snd_soc_dai_link *dai_link);
+#ifdef CONFIG_PCI +static inline void snd_soc_card_set_pci_ssid(struct snd_soc_card *card, + unsigned short vendor, + unsigned short device) +{ + card->pci_subsystem_vendor = vendor; + card->pci_subsystem_device = device; + card->pci_subsystem_set = true; +} + +static inline int snd_soc_card_get_pci_ssid(struct snd_soc_card *card, + unsigned short *vendor, + unsigned short *device) +{ + if (!card->pci_subsystem_set) + return -ENOENT; + + *vendor = card->pci_subsystem_vendor; + *device = card->pci_subsystem_device; + + return 0; +} +#else /* !CONFIG_PCI */ +static inline void snd_soc_card_set_pci_ssid(struct snd_soc_card *card, + unsigned short vendor, + unsigned short device) +{ +} + +static inline int snd_soc_card_get_pci_ssid(struct snd_soc_card *card, + unsigned short *vendor, + unsigned short *device) +{ + return -ENOENT; +} +#endif /* CONFIG_PCI */ + /* device driver data */ static inline void snd_soc_card_set_drvdata(struct snd_soc_card *card, void *data) diff --git a/include/sound/soc.h b/include/sound/soc.h index 5872a8864f3b6..3f0369aae2faf 100644 --- a/include/sound/soc.h +++ b/include/sound/soc.h @@ -880,6 +880,17 @@ struct snd_soc_card { #ifdef CONFIG_DMI char dmi_longname[80]; #endif /* CONFIG_DMI */ + +#ifdef CONFIG_PCI + /* + * PCI does not define 0 as invalid, so pci_subsystem_set indicates + * whether a value has been written to these fields. + */ + unsigned short pci_subsystem_vendor; + unsigned short pci_subsystem_device; + bool pci_subsystem_set; +#endif /* CONFIG_PCI */ + char topology_shortname[32];
struct device *dev;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lu Jialin lujialin4@huawei.com
[ Upstream commit 8f4f68e788c3a7a696546291258bfa5fdb215523 ]
We found a hungtask bug in test_aead_vec_cfg as follows:
INFO: task cryptomgr_test:391009 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. Call trace: __switch_to+0x98/0xe0 __schedule+0x6c4/0xf40 schedule+0xd8/0x1b4 schedule_timeout+0x474/0x560 wait_for_common+0x368/0x4e0 wait_for_completion+0x20/0x30 wait_for_completion+0x20/0x30 test_aead_vec_cfg+0xab4/0xd50 test_aead+0x144/0x1f0 alg_test_aead+0xd8/0x1e0 alg_test+0x634/0x890 cryptomgr_test+0x40/0x70 kthread+0x1e0/0x220 ret_from_fork+0x10/0x18 Kernel panic - not syncing: hung_task: blocked tasks
For padata_do_parallel, when the return err is 0 or -EBUSY, it will call wait_for_completion(&wait->completion) in test_aead_vec_cfg. In normal case, aead_request_complete() will be called in pcrypt_aead_serial and the return err is 0 for padata_do_parallel. But, when pinst->flags is PADATA_RESET, the return err is -EBUSY for padata_do_parallel, and it won't call aead_request_complete(). Therefore, test_aead_vec_cfg will hung at wait_for_completion(&wait->completion), which will cause hungtask.
The problem comes as following: (padata_do_parallel) | rcu_read_lock_bh(); | err = -EINVAL; | (padata_replace) | pinst->flags |= PADATA_RESET; err = -EBUSY | if (pinst->flags & PADATA_RESET) | rcu_read_unlock_bh() | return err
In order to resolve the problem, we replace the return err -EBUSY with -EAGAIN, which means parallel_data is changing, and the caller should call it again.
v3: remove retry and just change the return err. v2: introduce padata_try_do_parallel() in pcrypt_aead_encrypt and pcrypt_aead_decrypt to solve the hungtask.
Signed-off-by: Lu Jialin lujialin4@huawei.com Signed-off-by: Guo Zihua guozihua@huawei.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Sasha Levin sashal@kernel.org --- crypto/pcrypt.c | 4 ++++ kernel/padata.c | 2 +- 2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/crypto/pcrypt.c b/crypto/pcrypt.c index 9d10b846ccf73..005a36cb21bc4 100644 --- a/crypto/pcrypt.c +++ b/crypto/pcrypt.c @@ -117,6 +117,8 @@ static int pcrypt_aead_encrypt(struct aead_request *req) err = padata_do_parallel(ictx->psenc, padata, &ctx->cb_cpu); if (!err) return -EINPROGRESS; + if (err == -EBUSY) + return -EAGAIN;
return err; } @@ -164,6 +166,8 @@ static int pcrypt_aead_decrypt(struct aead_request *req) err = padata_do_parallel(ictx->psdec, padata, &ctx->cb_cpu); if (!err) return -EINPROGRESS; + if (err == -EBUSY) + return -EAGAIN;
return err; } diff --git a/kernel/padata.c b/kernel/padata.c index c6025a48fb49e..47f146f061fb1 100644 --- a/kernel/padata.c +++ b/kernel/padata.c @@ -194,7 +194,7 @@ int padata_do_parallel(struct padata_shell *ps, *cb_cpu = cpu; }
- err = -EBUSY; + err = -EBUSY; if ((pinst->flags & PADATA_RESET)) goto out;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
[ Upstream commit 8bf7187d978610b9e327a3d92728c8864a575ebd ]
Use FIELD_GET() to extract PCIe Negotiated Link Width field instead of custom masking and shifting, and remove extract_width() which only wraps that FIELD_GET().
Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Link: https://lore.kernel.org/r/20230919125648.1920-2-ilpo.jarvinen@linux.intel.co... Reviewed-by: Jonathan Cameron Jonathan.Cameron@huawei.com Reviewed-by: Dean Luick dean.luick@cornelisnetworks.com Signed-off-by: Leon Romanovsky leon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/infiniband/hw/hfi1/pcie.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-)
diff --git a/drivers/infiniband/hw/hfi1/pcie.c b/drivers/infiniband/hw/hfi1/pcie.c index a0802332c8cb3..5395cf56fbd90 100644 --- a/drivers/infiniband/hw/hfi1/pcie.c +++ b/drivers/infiniband/hw/hfi1/pcie.c @@ -3,6 +3,7 @@ * Copyright(c) 2015 - 2019 Intel Corporation. */
+#include <linux/bitfield.h> #include <linux/pci.h> #include <linux/io.h> #include <linux/delay.h> @@ -212,12 +213,6 @@ static u32 extract_speed(u16 linkstat) return speed; }
-/* return the PCIe link speed from the given link status */ -static u32 extract_width(u16 linkstat) -{ - return (linkstat & PCI_EXP_LNKSTA_NLW) >> PCI_EXP_LNKSTA_NLW_SHIFT; -} - /* read the link status and set dd->{lbus_width,lbus_speed,lbus_info} */ static void update_lbus_info(struct hfi1_devdata *dd) { @@ -230,7 +225,7 @@ static void update_lbus_info(struct hfi1_devdata *dd) return; }
- dd->lbus_width = extract_width(linkstat); + dd->lbus_width = FIELD_GET(PCI_EXP_LNKSTA_NLW, linkstat); dd->lbus_speed = extract_speed(linkstat); snprintf(dd->lbus_info, sizeof(dd->lbus_info), "PCIe,%uMHz,x%u", dd->lbus_speed, dd->lbus_width);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yihang Li liyihang9@huawei.com
[ Upstream commit 6de426f9276c448e2db7238911c97fb157cb23be ]
If init debugfs failed during device registration due to memory allocation failure, debugfs_remove_recursive() is called, after which debugfs_dir is not set to NULL. debugfs_remove_recursive() will be called again during device removal. As a result, illegal pointer is accessed.
[ 1665.467244] hisi_sas_v3_hw 0000:b4:02.0: failed to init debugfs! ... [ 1669.836708] Unable to handle kernel NULL pointer dereference at virtual address 00000000000000a0 [ 1669.872669] pc : down_write+0x24/0x70 [ 1669.876315] lr : down_write+0x1c/0x70 [ 1669.879961] sp : ffff000036f53a30 [ 1669.883260] x29: ffff000036f53a30 x28: ffffa027c31549f8 [ 1669.888547] x27: ffffa027c3140000 x26: 0000000000000000 [ 1669.893834] x25: ffffa027bf37c270 x24: ffffa027bf37c270 [ 1669.899122] x23: ffff0000095406b8 x22: ffff0000095406a8 [ 1669.904408] x21: 0000000000000000 x20: ffffa027bf37c310 [ 1669.909695] x19: 00000000000000a0 x18: ffff8027dcd86f10 [ 1669.914982] x17: 0000000000000000 x16: 0000000000000000 [ 1669.920268] x15: 0000000000000000 x14: ffffa0274014f870 [ 1669.925555] x13: 0000000000000040 x12: 0000000000000228 [ 1669.930842] x11: 0000000000000020 x10: 0000000000000bb0 [ 1669.936129] x9 : ffff000036f537f0 x8 : ffff80273088ca10 [ 1669.941416] x7 : 000000000000001d x6 : 00000000ffffffff [ 1669.946702] x5 : ffff000008a36310 x4 : ffff80273088be00 [ 1669.951989] x3 : ffff000009513e90 x2 : 0000000000000000 [ 1669.957276] x1 : 00000000000000a0 x0 : ffffffff00000001 [ 1669.962563] Call trace: [ 1669.965000] down_write+0x24/0x70 [ 1669.968301] debugfs_remove_recursive+0x5c/0x1b0 [ 1669.972905] hisi_sas_debugfs_exit+0x24/0x30 [hisi_sas_main] [ 1669.978541] hisi_sas_v3_remove+0x130/0x150 [hisi_sas_v3_hw] [ 1669.984175] pci_device_remove+0x48/0xd8 [ 1669.988082] device_release_driver_internal+0x1b4/0x250 [ 1669.993282] device_release_driver+0x28/0x38 [ 1669.997534] pci_stop_bus_device+0x84/0xb8 [ 1670.001611] pci_stop_and_remove_bus_device_locked+0x24/0x40 [ 1670.007244] remove_store+0xfc/0x140 [ 1670.010802] dev_attr_store+0x44/0x60 [ 1670.014448] sysfs_kf_write+0x58/0x80 [ 1670.018095] kernfs_fop_write+0xe8/0x1f0 [ 1670.022000] __vfs_write+0x60/0x190 [ 1670.025472] vfs_write+0xac/0x1c0 [ 1670.028771] ksys_write+0x6c/0xd8 [ 1670.032071] __arm64_sys_write+0x24/0x30 [ 1670.035977] el0_svc_common+0x78/0x130 [ 1670.039710] el0_svc_handler+0x38/0x78 [ 1670.043442] el0_svc+0x8/0xc
To fix this, set debugfs_dir to NULL after debugfs_remove_recursive().
Signed-off-by: Yihang Li liyihang9@huawei.com Signed-off-by: Xingui Yang yangxingui@huawei.com Signed-off-by: Xiang Chen chenxiang66@hisilicon.com Link: https://lore.kernel.org/r/1694571327-78697-2-git-send-email-chenxiang66@hisi... Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/hisi_sas/hisi_sas_v3_hw.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c index b8a12d3ad5f27..d1c07e7cb60df 100644 --- a/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c +++ b/drivers/scsi/hisi_sas/hisi_sas_v3_hw.c @@ -4718,6 +4718,12 @@ static void debugfs_bist_init_v3_hw(struct hisi_hba *hisi_hba) hisi_hba->debugfs_bist_linkrate = SAS_LINK_RATE_1_5_GBPS; }
+static void debugfs_exit_v3_hw(struct hisi_hba *hisi_hba) +{ + debugfs_remove_recursive(hisi_hba->debugfs_dir); + hisi_hba->debugfs_dir = NULL; +} + static void debugfs_init_v3_hw(struct hisi_hba *hisi_hba) { struct device *dev = hisi_hba->dev; @@ -4741,18 +4747,13 @@ static void debugfs_init_v3_hw(struct hisi_hba *hisi_hba)
for (i = 0; i < hisi_sas_debugfs_dump_count; i++) { if (debugfs_alloc_v3_hw(hisi_hba, i)) { - debugfs_remove_recursive(hisi_hba->debugfs_dir); + debugfs_exit_v3_hw(hisi_hba); dev_dbg(dev, "failed to init debugfs!\n"); break; } } }
-static void debugfs_exit_v3_hw(struct hisi_hba *hisi_hba) -{ - debugfs_remove_recursive(hisi_hba->debugfs_dir); -} - static int hisi_sas_v3_probe(struct pci_dev *pdev, const struct pci_device_id *id) {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tyrel Datwyler tyreld@linux.ibm.com
[ Upstream commit b39f2d10b86d0af353ea339e5815820026bca48f ]
In practice the driver should never send more commands than are allocated to a queue's event pool. In the unlikely event that this happens, the code asserts a BUG_ON, and in the case that the kernel is not configured to crash on panic returns a junk event pointer from the empty event list causing things to spiral from there. This BUG_ON is a historical artifact of the ibmvfc driver first being upstreamed, and it is well known now that the use of BUG_ON is bad practice except in the most unrecoverable scenario. There is nothing about this scenario that prevents the driver from recovering and carrying on.
Remove the BUG_ON in question from ibmvfc_get_event() and return a NULL pointer in the case of an empty event pool. Update all call sites to ibmvfc_get_event() to check for a NULL pointer and perfrom the appropriate failure or recovery action.
Signed-off-by: Tyrel Datwyler tyreld@linux.ibm.com Link: https://lore.kernel.org/r/20230921225435.3537728-2-tyreld@linux.ibm.com Reviewed-by: Brian King brking@linux.vnet.ibm.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/ibmvscsi/ibmvfc.c | 124 ++++++++++++++++++++++++++++++++- 1 file changed, 122 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/ibmvscsi/ibmvfc.c b/drivers/scsi/ibmvscsi/ibmvfc.c index d169ba772450f..85444ca1ae21b 100644 --- a/drivers/scsi/ibmvscsi/ibmvfc.c +++ b/drivers/scsi/ibmvscsi/ibmvfc.c @@ -1518,7 +1518,11 @@ static struct ibmvfc_event *ibmvfc_get_event(struct ibmvfc_queue *queue) unsigned long flags;
spin_lock_irqsave(&queue->l_lock, flags); - BUG_ON(list_empty(&queue->free)); + if (list_empty(&queue->free)) { + ibmvfc_log(queue->vhost, 4, "empty event pool on queue:%ld\n", queue->hwq_id); + spin_unlock_irqrestore(&queue->l_lock, flags); + return NULL; + } evt = list_entry(queue->free.next, struct ibmvfc_event, queue_list); atomic_set(&evt->free, 0); list_del(&evt->queue_list); @@ -1947,9 +1951,15 @@ static int ibmvfc_queuecommand(struct Scsi_Host *shost, struct scsi_cmnd *cmnd) if (vhost->using_channels) { scsi_channel = hwq % vhost->scsi_scrqs.active_queues; evt = ibmvfc_get_event(&vhost->scsi_scrqs.scrqs[scsi_channel]); + if (!evt) + return SCSI_MLQUEUE_HOST_BUSY; + evt->hwq = hwq % vhost->scsi_scrqs.active_queues; - } else + } else { evt = ibmvfc_get_event(&vhost->crq); + if (!evt) + return SCSI_MLQUEUE_HOST_BUSY; + }
ibmvfc_init_event(evt, ibmvfc_scsi_done, IBMVFC_CMD_FORMAT); evt->cmnd = cmnd; @@ -2037,6 +2047,11 @@ static int ibmvfc_bsg_timeout(struct bsg_job *job)
vhost->aborting_passthru = 1; evt = ibmvfc_get_event(&vhost->crq); + if (!evt) { + spin_unlock_irqrestore(vhost->host->host_lock, flags); + return -ENOMEM; + } + ibmvfc_init_event(evt, ibmvfc_bsg_timeout_done, IBMVFC_MAD_FORMAT);
tmf = &evt->iu.tmf; @@ -2095,6 +2110,10 @@ static int ibmvfc_bsg_plogi(struct ibmvfc_host *vhost, unsigned int port_id) goto unlock_out;
evt = ibmvfc_get_event(&vhost->crq); + if (!evt) { + rc = -ENOMEM; + goto unlock_out; + } ibmvfc_init_event(evt, ibmvfc_sync_completion, IBMVFC_MAD_FORMAT); plogi = &evt->iu.plogi; memset(plogi, 0, sizeof(*plogi)); @@ -2213,6 +2232,11 @@ static int ibmvfc_bsg_request(struct bsg_job *job) }
evt = ibmvfc_get_event(&vhost->crq); + if (!evt) { + spin_unlock_irqrestore(vhost->host->host_lock, flags); + rc = -ENOMEM; + goto out; + } ibmvfc_init_event(evt, ibmvfc_sync_completion, IBMVFC_MAD_FORMAT); mad = &evt->iu.passthru;
@@ -2301,6 +2325,11 @@ static int ibmvfc_reset_device(struct scsi_device *sdev, int type, char *desc) else evt = ibmvfc_get_event(&vhost->crq);
+ if (!evt) { + spin_unlock_irqrestore(vhost->host->host_lock, flags); + return -ENOMEM; + } + ibmvfc_init_event(evt, ibmvfc_sync_completion, IBMVFC_CMD_FORMAT); tmf = ibmvfc_init_vfc_cmd(evt, sdev); iu = ibmvfc_get_fcp_iu(vhost, tmf); @@ -2504,6 +2533,8 @@ static struct ibmvfc_event *ibmvfc_init_tmf(struct ibmvfc_queue *queue, struct ibmvfc_tmf *tmf;
evt = ibmvfc_get_event(queue); + if (!evt) + return NULL; ibmvfc_init_event(evt, ibmvfc_sync_completion, IBMVFC_MAD_FORMAT);
tmf = &evt->iu.tmf; @@ -2560,6 +2591,11 @@ static int ibmvfc_cancel_all_mq(struct scsi_device *sdev, int type)
if (found_evt && vhost->logged_in) { evt = ibmvfc_init_tmf(&queues[i], sdev, type); + if (!evt) { + spin_unlock(queues[i].q_lock); + spin_unlock_irqrestore(vhost->host->host_lock, flags); + return -ENOMEM; + } evt->sync_iu = &queues[i].cancel_rsp; ibmvfc_send_event(evt, vhost, default_timeout); list_add_tail(&evt->cancel, &cancelq); @@ -2773,6 +2809,10 @@ static int ibmvfc_abort_task_set(struct scsi_device *sdev)
if (vhost->state == IBMVFC_ACTIVE) { evt = ibmvfc_get_event(&vhost->crq); + if (!evt) { + spin_unlock_irqrestore(vhost->host->host_lock, flags); + return -ENOMEM; + } ibmvfc_init_event(evt, ibmvfc_sync_completion, IBMVFC_CMD_FORMAT); tmf = ibmvfc_init_vfc_cmd(evt, sdev); iu = ibmvfc_get_fcp_iu(vhost, tmf); @@ -4029,6 +4069,12 @@ static void ibmvfc_tgt_send_prli(struct ibmvfc_target *tgt)
kref_get(&tgt->kref); evt = ibmvfc_get_event(&vhost->crq); + if (!evt) { + ibmvfc_set_tgt_action(tgt, IBMVFC_TGT_ACTION_NONE); + kref_put(&tgt->kref, ibmvfc_release_tgt); + __ibmvfc_reset_host(vhost); + return; + } vhost->discovery_threads++; ibmvfc_init_event(evt, ibmvfc_tgt_prli_done, IBMVFC_MAD_FORMAT); evt->tgt = tgt; @@ -4136,6 +4182,12 @@ static void ibmvfc_tgt_send_plogi(struct ibmvfc_target *tgt) kref_get(&tgt->kref); tgt->logo_rcvd = 0; evt = ibmvfc_get_event(&vhost->crq); + if (!evt) { + ibmvfc_set_tgt_action(tgt, IBMVFC_TGT_ACTION_NONE); + kref_put(&tgt->kref, ibmvfc_release_tgt); + __ibmvfc_reset_host(vhost); + return; + } vhost->discovery_threads++; ibmvfc_set_tgt_action(tgt, IBMVFC_TGT_ACTION_INIT_WAIT); ibmvfc_init_event(evt, ibmvfc_tgt_plogi_done, IBMVFC_MAD_FORMAT); @@ -4212,6 +4264,8 @@ static struct ibmvfc_event *__ibmvfc_tgt_get_implicit_logout_evt(struct ibmvfc_t
kref_get(&tgt->kref); evt = ibmvfc_get_event(&vhost->crq); + if (!evt) + return NULL; ibmvfc_init_event(evt, done, IBMVFC_MAD_FORMAT); evt->tgt = tgt; mad = &evt->iu.implicit_logout; @@ -4239,6 +4293,13 @@ static void ibmvfc_tgt_implicit_logout(struct ibmvfc_target *tgt) vhost->discovery_threads++; evt = __ibmvfc_tgt_get_implicit_logout_evt(tgt, ibmvfc_tgt_implicit_logout_done); + if (!evt) { + vhost->discovery_threads--; + ibmvfc_set_tgt_action(tgt, IBMVFC_TGT_ACTION_NONE); + kref_put(&tgt->kref, ibmvfc_release_tgt); + __ibmvfc_reset_host(vhost); + return; + }
ibmvfc_set_tgt_action(tgt, IBMVFC_TGT_ACTION_INIT_WAIT); if (ibmvfc_send_event(evt, vhost, default_timeout)) { @@ -4378,6 +4439,12 @@ static void ibmvfc_tgt_move_login(struct ibmvfc_target *tgt)
kref_get(&tgt->kref); evt = ibmvfc_get_event(&vhost->crq); + if (!evt) { + ibmvfc_set_tgt_action(tgt, IBMVFC_TGT_ACTION_DEL_RPORT); + kref_put(&tgt->kref, ibmvfc_release_tgt); + __ibmvfc_reset_host(vhost); + return; + } vhost->discovery_threads++; ibmvfc_set_tgt_action(tgt, IBMVFC_TGT_ACTION_INIT_WAIT); ibmvfc_init_event(evt, ibmvfc_tgt_move_login_done, IBMVFC_MAD_FORMAT); @@ -4544,6 +4611,14 @@ static void ibmvfc_adisc_timeout(struct timer_list *t) vhost->abort_threads++; kref_get(&tgt->kref); evt = ibmvfc_get_event(&vhost->crq); + if (!evt) { + tgt_err(tgt, "Failed to get cancel event for ADISC.\n"); + vhost->abort_threads--; + kref_put(&tgt->kref, ibmvfc_release_tgt); + __ibmvfc_reset_host(vhost); + spin_unlock_irqrestore(vhost->host->host_lock, flags); + return; + } ibmvfc_init_event(evt, ibmvfc_tgt_adisc_cancel_done, IBMVFC_MAD_FORMAT);
evt->tgt = tgt; @@ -4594,6 +4669,12 @@ static void ibmvfc_tgt_adisc(struct ibmvfc_target *tgt)
kref_get(&tgt->kref); evt = ibmvfc_get_event(&vhost->crq); + if (!evt) { + ibmvfc_set_tgt_action(tgt, IBMVFC_TGT_ACTION_NONE); + kref_put(&tgt->kref, ibmvfc_release_tgt); + __ibmvfc_reset_host(vhost); + return; + } vhost->discovery_threads++; ibmvfc_init_event(evt, ibmvfc_tgt_adisc_done, IBMVFC_MAD_FORMAT); evt->tgt = tgt; @@ -4697,6 +4778,12 @@ static void ibmvfc_tgt_query_target(struct ibmvfc_target *tgt)
kref_get(&tgt->kref); evt = ibmvfc_get_event(&vhost->crq); + if (!evt) { + ibmvfc_set_tgt_action(tgt, IBMVFC_TGT_ACTION_NONE); + kref_put(&tgt->kref, ibmvfc_release_tgt); + __ibmvfc_reset_host(vhost); + return; + } vhost->discovery_threads++; evt->tgt = tgt; ibmvfc_init_event(evt, ibmvfc_tgt_query_target_done, IBMVFC_MAD_FORMAT); @@ -4869,6 +4956,13 @@ static void ibmvfc_discover_targets(struct ibmvfc_host *vhost) { struct ibmvfc_discover_targets *mad; struct ibmvfc_event *evt = ibmvfc_get_event(&vhost->crq); + int level = IBMVFC_DEFAULT_LOG_LEVEL; + + if (!evt) { + ibmvfc_log(vhost, level, "Discover Targets failed: no available events\n"); + ibmvfc_hard_reset_host(vhost); + return; + }
ibmvfc_init_event(evt, ibmvfc_discover_targets_done, IBMVFC_MAD_FORMAT); mad = &evt->iu.discover_targets; @@ -4946,8 +5040,15 @@ static void ibmvfc_channel_setup(struct ibmvfc_host *vhost) struct ibmvfc_scsi_channels *scrqs = &vhost->scsi_scrqs; unsigned int num_channels = min(vhost->client_scsi_channels, vhost->max_vios_scsi_channels); + int level = IBMVFC_DEFAULT_LOG_LEVEL; int i;
+ if (!evt) { + ibmvfc_log(vhost, level, "Channel Setup failed: no available events\n"); + ibmvfc_hard_reset_host(vhost); + return; + } + memset(setup_buf, 0, sizeof(*setup_buf)); if (num_channels == 0) setup_buf->flags = cpu_to_be32(IBMVFC_CANCEL_CHANNELS); @@ -5009,6 +5110,13 @@ static void ibmvfc_channel_enquiry(struct ibmvfc_host *vhost) { struct ibmvfc_channel_enquiry *mad; struct ibmvfc_event *evt = ibmvfc_get_event(&vhost->crq); + int level = IBMVFC_DEFAULT_LOG_LEVEL; + + if (!evt) { + ibmvfc_log(vhost, level, "Channel Enquiry failed: no available events\n"); + ibmvfc_hard_reset_host(vhost); + return; + }
ibmvfc_init_event(evt, ibmvfc_channel_enquiry_done, IBMVFC_MAD_FORMAT); mad = &evt->iu.channel_enquiry; @@ -5131,6 +5239,12 @@ static void ibmvfc_npiv_login(struct ibmvfc_host *vhost) struct ibmvfc_npiv_login_mad *mad; struct ibmvfc_event *evt = ibmvfc_get_event(&vhost->crq);
+ if (!evt) { + ibmvfc_dbg(vhost, "NPIV Login failed: no available events\n"); + ibmvfc_hard_reset_host(vhost); + return; + } + ibmvfc_gather_partition_info(vhost); ibmvfc_set_login_info(vhost); ibmvfc_init_event(evt, ibmvfc_npiv_login_done, IBMVFC_MAD_FORMAT); @@ -5195,6 +5309,12 @@ static void ibmvfc_npiv_logout(struct ibmvfc_host *vhost) struct ibmvfc_event *evt;
evt = ibmvfc_get_event(&vhost->crq); + if (!evt) { + ibmvfc_dbg(vhost, "NPIV Logout failed: no available events\n"); + ibmvfc_hard_reset_host(vhost); + return; + } + ibmvfc_init_event(evt, ibmvfc_npiv_logout_done, IBMVFC_MAD_FORMAT);
mad = &evt->iu.npiv_logout;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Juntong Deng juntong.deng@outlook.com
[ Upstream commit 525b861a008143048535011f3816d407940f4bfa ]
l2nbperpage is log2(number of blks per page), and the minimum legal value should be 0, not negative.
In the case of l2nbperpage being negative, an error will occur when subsequently used as shift exponent.
Syzbot reported this bug:
UBSAN: shift-out-of-bounds in fs/jfs/jfs_dmap.c:799:12 shift exponent -16777216 is negative
Reported-by: syzbot+debee9ab7ae2b34b0307@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=debee9ab7ae2b34b0307 Signed-off-by: Juntong Deng juntong.deng@outlook.com Signed-off-by: Dave Kleikamp dave.kleikamp@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/jfs/jfs_dmap.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/jfs/jfs_dmap.c b/fs/jfs/jfs_dmap.c index da4f9c3b714fe..a700950429c5f 100644 --- a/fs/jfs/jfs_dmap.c +++ b/fs/jfs/jfs_dmap.c @@ -180,7 +180,8 @@ int dbMount(struct inode *ipbmap) bmp->db_nfree = le64_to_cpu(dbmp_le->dn_nfree);
bmp->db_l2nbperpage = le32_to_cpu(dbmp_le->dn_l2nbperpage); - if (bmp->db_l2nbperpage > L2PSIZE - L2MINBLOCKSIZE) { + if (bmp->db_l2nbperpage > L2PSIZE - L2MINBLOCKSIZE || + bmp->db_l2nbperpage < 0) { err = -EINVAL; goto err_release_metapage; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Juntong Deng juntong.deng@outlook.com
[ Upstream commit 64933ab7b04881c6c18b21ff206c12278341c72e ]
Both db_maxag and db_agpref are used as the index of the db_agfree array, but there is currently no validity check for db_maxag and db_agpref, which can lead to errors.
The following is related bug reported by Syzbot:
UBSAN: array-index-out-of-bounds in fs/jfs/jfs_dmap.c:639:20 index 7936 is out of range for type 'atomic_t[128]'
Add checking that the values of db_maxag and db_agpref are valid indexes for the db_agfree array.
Reported-by: syzbot+38e876a8aa44b7115c76@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=38e876a8aa44b7115c76 Signed-off-by: Juntong Deng juntong.deng@outlook.com Signed-off-by: Dave Kleikamp dave.kleikamp@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/jfs/jfs_dmap.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/fs/jfs/jfs_dmap.c b/fs/jfs/jfs_dmap.c index a700950429c5f..217a673b751ef 100644 --- a/fs/jfs/jfs_dmap.c +++ b/fs/jfs/jfs_dmap.c @@ -195,6 +195,12 @@ int dbMount(struct inode *ipbmap) bmp->db_maxlevel = le32_to_cpu(dbmp_le->dn_maxlevel); bmp->db_maxag = le32_to_cpu(dbmp_le->dn_maxag); bmp->db_agpref = le32_to_cpu(dbmp_le->dn_agpref); + if (bmp->db_maxag >= MAXAG || bmp->db_maxag < 0 || + bmp->db_agpref >= MAXAG || bmp->db_agpref < 0) { + err = -EINVAL; + goto err_release_metapage; + } + bmp->db_aglevel = le32_to_cpu(dbmp_le->dn_aglevel); bmp->db_agheight = le32_to_cpu(dbmp_le->dn_agheight); bmp->db_agwidth = le32_to_cpu(dbmp_le->dn_agwidth);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Manas Ghandat ghandatmanas@gmail.com
[ Upstream commit 22cad8bc1d36547cdae0eef316c47d917ce3147c ]
Currently while searching for dmtree_t for sufficient free blocks there is an array out of bounds while getting element in tp->dm_stree. To add the required check for out of bound we first need to determine the type of dmtree. Thus added an extra parameter to dbFindLeaf so that the type of tree can be determined and the required check can be applied.
Reported-by: syzbot+aea1ad91e854d0a83e04@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=aea1ad91e854d0a83e04 Signed-off-by: Manas Ghandat ghandatmanas@gmail.com Signed-off-by: Dave Kleikamp dave.kleikamp@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/jfs/jfs_dmap.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/fs/jfs/jfs_dmap.c b/fs/jfs/jfs_dmap.c index 217a673b751ef..5b01026fff9bf 100644 --- a/fs/jfs/jfs_dmap.c +++ b/fs/jfs/jfs_dmap.c @@ -87,7 +87,7 @@ static int dbAllocCtl(struct bmap * bmp, s64 nblocks, int l2nb, s64 blkno, static int dbExtend(struct inode *ip, s64 blkno, s64 nblocks, s64 addnblocks); static int dbFindBits(u32 word, int l2nb); static int dbFindCtl(struct bmap * bmp, int l2nb, int level, s64 * blkno); -static int dbFindLeaf(dmtree_t * tp, int l2nb, int *leafidx); +static int dbFindLeaf(dmtree_t *tp, int l2nb, int *leafidx, bool is_ctl); static int dbFreeBits(struct bmap * bmp, struct dmap * dp, s64 blkno, int nblocks); static int dbFreeDmap(struct bmap * bmp, struct dmap * dp, s64 blkno, @@ -1785,7 +1785,7 @@ static int dbFindCtl(struct bmap * bmp, int l2nb, int level, s64 * blkno) * dbFindLeaf() returns the index of the leaf at which * free space was found. */ - rc = dbFindLeaf((dmtree_t *) dcp, l2nb, &leafidx); + rc = dbFindLeaf((dmtree_t *) dcp, l2nb, &leafidx, true);
/* release the buffer. */ @@ -2032,7 +2032,7 @@ dbAllocDmapLev(struct bmap * bmp, * free space. if sufficient free space is found, dbFindLeaf() * returns the index of the leaf at which free space was found. */ - if (dbFindLeaf((dmtree_t *) & dp->tree, l2nb, &leafidx)) + if (dbFindLeaf((dmtree_t *) &dp->tree, l2nb, &leafidx, false)) return -ENOSPC;
if (leafidx < 0) @@ -2996,14 +2996,18 @@ static void dbAdjTree(dmtree_t * tp, int leafno, int newval) * leafidx - return pointer to be set to the index of the leaf * describing at least l2nb free blocks if sufficient * free blocks are found. + * is_ctl - determines if the tree is of type ctl * * RETURN VALUES: * 0 - success * -ENOSPC - insufficient free blocks. */ -static int dbFindLeaf(dmtree_t * tp, int l2nb, int *leafidx) +static int dbFindLeaf(dmtree_t *tp, int l2nb, int *leafidx, bool is_ctl) { int ti, n = 0, k, x = 0; + int max_size; + + max_size = is_ctl ? CTLTREESIZE : TREESIZE;
/* first check the root of the tree to see if there is * sufficient free space. @@ -3024,6 +3028,8 @@ static int dbFindLeaf(dmtree_t * tp, int l2nb, int *leafidx) /* sufficient free space found. move to the next * level (or quit if this is the last level). */ + if (x + n > max_size) + return -ENOSPC; if (l2nb <= tp->dmt_stree[x + n]) break; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Manas Ghandat ghandatmanas@gmail.com
[ Upstream commit 05d9ea1ceb62a55af6727a69269a4fd310edf483 ]
Currently there is not check against the agno of the iag while allocating new inodes to avoid fragmentation problem. Added the check which is required.
Reported-by: syzbot+79d792676d8ac050949f@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=79d792676d8ac050949f Signed-off-by: Manas Ghandat ghandatmanas@gmail.com Signed-off-by: Dave Kleikamp dave.kleikamp@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/jfs/jfs_imap.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/jfs/jfs_imap.c b/fs/jfs/jfs_imap.c index 4899663996d81..6ed2e1d4c894f 100644 --- a/fs/jfs/jfs_imap.c +++ b/fs/jfs/jfs_imap.c @@ -1320,7 +1320,7 @@ diInitInode(struct inode *ip, int iagno, int ino, int extno, struct iag * iagp) int diAlloc(struct inode *pip, bool dir, struct inode *ip) { int rc, ino, iagno, addext, extno, bitno, sword; - int nwords, rem, i, agno; + int nwords, rem, i, agno, dn_numag; u32 mask, inosmap, extsmap; struct inode *ipimap; struct metapage *mp; @@ -1356,6 +1356,9 @@ int diAlloc(struct inode *pip, bool dir, struct inode *ip)
/* get the ag number of this iag */ agno = BLKTOAG(JFS_IP(pip)->agstart, JFS_SBI(pip->i_sb)); + dn_numag = JFS_SBI(pip->i_sb)->bmap->db_numag; + if (agno < 0 || agno > dn_numag) + return -EIO;
if (atomic_read(&JFS_SBI(pip->i_sb)->bmap->db_active[agno])) { /*
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mikhail Khvainitski me@khvoinitsky.org
[ Upstream commit 46a0a2c96f0f47628190f122c2e3d879e590bcbe ]
Built-in firmware of cptkbd handles scrolling by itself (when middle button is pressed) but with issues: it does not support horizontal and hi-res scrolling and upon middle button release it sends middle button click even if there was a scrolling event. Commit 3cb5ff0220e3 ("HID: lenovo: Hide middle-button press until release") workarounds last issue but it's impossible to workaround scrolling-related issues without firmware modification.
Likely, Dennis Schneider has reverse engineered the firmware and provided an instruction on how to patch it [1]. However, aforementioned workaround prevents userspace (libinput) from knowing exact moment when middle button has been pressed down and performing "On-Button scrolling". This commit detects correctly-behaving patched firmware if cursor movement events has been received during middle button being pressed and stops applying workaround for this device.
Link: https://hohlerde.org/rauch/en/elektronik/projekte/tpkbd-fix/ [1]
Signed-off-by: Mikhail Khvainitski me@khvoinitsky.org Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hid/hid-lenovo.c | 68 ++++++++++++++++++++++++++-------------- 1 file changed, 45 insertions(+), 23 deletions(-)
diff --git a/drivers/hid/hid-lenovo.c b/drivers/hid/hid-lenovo.c index 93b1f935e526e..901c1959efed4 100644 --- a/drivers/hid/hid-lenovo.c +++ b/drivers/hid/hid-lenovo.c @@ -50,7 +50,12 @@ struct lenovo_drvdata { int select_right; int sensitivity; int press_speed; - u8 middlebutton_state; /* 0:Up, 1:Down (undecided), 2:Scrolling */ + /* 0: Up + * 1: Down (undecided) + * 2: Scrolling + * 3: Patched firmware, disable workaround + */ + u8 middlebutton_state; bool fn_lock; };
@@ -529,31 +534,48 @@ static int lenovo_event_cptkbd(struct hid_device *hdev, { struct lenovo_drvdata *cptkbd_data = hid_get_drvdata(hdev);
- /* "wheel" scroll events */ - if (usage->type == EV_REL && (usage->code == REL_WHEEL || - usage->code == REL_HWHEEL)) { - /* Scroll events disable middle-click event */ - cptkbd_data->middlebutton_state = 2; - return 0; - } + if (cptkbd_data->middlebutton_state != 3) { + /* REL_X and REL_Y events during middle button pressed + * are only possible on patched, bug-free firmware + * so set middlebutton_state to 3 + * to never apply workaround anymore + */ + if (cptkbd_data->middlebutton_state == 1 && + usage->type == EV_REL && + (usage->code == REL_X || usage->code == REL_Y)) { + cptkbd_data->middlebutton_state = 3; + /* send middle button press which was hold before */ + input_event(field->hidinput->input, + EV_KEY, BTN_MIDDLE, 1); + input_sync(field->hidinput->input); + }
- /* Middle click events */ - if (usage->type == EV_KEY && usage->code == BTN_MIDDLE) { - if (value == 1) { - cptkbd_data->middlebutton_state = 1; - } else if (value == 0) { - if (cptkbd_data->middlebutton_state == 1) { - /* No scrolling inbetween, send middle-click */ - input_event(field->hidinput->input, - EV_KEY, BTN_MIDDLE, 1); - input_sync(field->hidinput->input); - input_event(field->hidinput->input, - EV_KEY, BTN_MIDDLE, 0); - input_sync(field->hidinput->input); + /* "wheel" scroll events */ + if (usage->type == EV_REL && (usage->code == REL_WHEEL || + usage->code == REL_HWHEEL)) { + /* Scroll events disable middle-click event */ + cptkbd_data->middlebutton_state = 2; + return 0; + } + + /* Middle click events */ + if (usage->type == EV_KEY && usage->code == BTN_MIDDLE) { + if (value == 1) { + cptkbd_data->middlebutton_state = 1; + } else if (value == 0) { + if (cptkbd_data->middlebutton_state == 1) { + /* No scrolling inbetween, send middle-click */ + input_event(field->hidinput->input, + EV_KEY, BTN_MIDDLE, 1); + input_sync(field->hidinput->input); + input_event(field->hidinput->input, + EV_KEY, BTN_MIDDLE, 0); + input_sync(field->hidinput->input); + } + cptkbd_data->middlebutton_state = 0; } - cptkbd_data->middlebutton_state = 0; + return 1; } - return 1; }
return 0;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vincent Whitchurch vincent.whitchurch@axis.com
[ Upstream commit b0150014878c32197cfa66e3e2f79e57f66babc0 ]
Place IRQ handlers such as gic_handle_irq() in the irqentry section even if FUNCTION_GRAPH_TRACER is not enabled. Without this, the stack depot's filter_irq_stacks() does not correctly filter out IRQ stacks in those configurations, which hampers deduplication and eventually leads to "Stack depot reached limit capacity" splats with KASAN.
A similar fix was done for arm64 in commit f6794950f0e5ba37e3bbed ("arm64: set __exception_irq_entry with __irq_entry as a default").
Link: https://lore.kernel.org/r/20230803-arm-irqentry-v1-1-8aad8e260b1c@axis.com
Signed-off-by: Vincent Whitchurch vincent.whitchurch@axis.com Signed-off-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/include/asm/exception.h | 4 ---- 1 file changed, 4 deletions(-)
diff --git a/arch/arm/include/asm/exception.h b/arch/arm/include/asm/exception.h index 58e039a851af0..3c82975d46db3 100644 --- a/arch/arm/include/asm/exception.h +++ b/arch/arm/include/asm/exception.h @@ -10,10 +10,6 @@
#include <linux/interrupt.h>
-#ifdef CONFIG_FUNCTION_GRAPH_TRACER #define __exception_irq_entry __irq_entry -#else -#define __exception_irq_entry -#endif
#endif /* __ASM_ARM_EXCEPTION_H */
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Cezary Rojewski cezary.rojewski@intel.com
[ Upstream commit f93dc90c2e8ed664985e366aa6459ac83cdab236 ]
While AudioDSP drivers assign streams exclusively of HOST or LINK type, nothing blocks a user to attempt to assign a COUPLED stream. As supplied substream instance may be a stub, what is the case when code-loading, such scenario ends with null-ptr-deref.
Signed-off-by: Cezary Rojewski cezary.rojewski@intel.com Link: https://lore.kernel.org/r/20231006102857.749143-2-cezary.rojewski@intel.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/hda/hdac_stream.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/sound/hda/hdac_stream.c b/sound/hda/hdac_stream.c index eea22cf72aefd..ec95d0449bfe9 100644 --- a/sound/hda/hdac_stream.c +++ b/sound/hda/hdac_stream.c @@ -320,8 +320,10 @@ struct hdac_stream *snd_hdac_stream_assign(struct hdac_bus *bus, struct hdac_stream *res = NULL;
/* make a non-zero unique key for the substream */ - int key = (substream->pcm->device << 16) | (substream->number << 2) | - (substream->stream + 1); + int key = (substream->number << 2) | (substream->stream + 1); + + if (substream->pcm) + key |= (substream->pcm->device << 16);
spin_lock_irq(&bus->reg_lock); list_for_each_entry(azx_dev, &bus->stream_list, list) {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
[ Upstream commit 759574abd78e3b47ec45bbd31a64e8832cf73f97 ]
Use FIELD_GET() to extract PCIe Negotiated Link Width field instead of custom masking and shifting.
Similarly, change custom code that misleadingly used PCI_EXP_LNKSTA_NLW_SHIFT to prepare value for PCI_EXP_LNKCAP write to use FIELD_PREP() with correct field define (PCI_EXP_LNKCAP_MLW).
Link: https://lore.kernel.org/r/20230919125648.1920-5-ilpo.jarvinen@linux.intel.co... Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/controller/dwc/pcie-tegra194.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/drivers/pci/controller/dwc/pcie-tegra194.c b/drivers/pci/controller/dwc/pcie-tegra194.c index 765abe0732282..2f82da76e3711 100644 --- a/drivers/pci/controller/dwc/pcie-tegra194.c +++ b/drivers/pci/controller/dwc/pcie-tegra194.c @@ -7,6 +7,7 @@ * Author: Vidya Sagar vidyas@nvidia.com */
+#include <linux/bitfield.h> #include <linux/clk.h> #include <linux/debugfs.h> #include <linux/delay.h> @@ -328,8 +329,7 @@ static void apply_bad_link_workaround(struct pcie_port *pp) */ val = dw_pcie_readw_dbi(pci, pcie->pcie_cap_base + PCI_EXP_LNKSTA); if (val & PCI_EXP_LNKSTA_LBMS) { - current_link_width = (val & PCI_EXP_LNKSTA_NLW) >> - PCI_EXP_LNKSTA_NLW_SHIFT; + current_link_width = FIELD_GET(PCI_EXP_LNKSTA_NLW, val); if (pcie->init_link_width > current_link_width) { dev_warn(pci->dev, "PCIe link is bad, width reduced\n"); val = dw_pcie_readw_dbi(pci, pcie->pcie_cap_base + @@ -731,8 +731,7 @@ static void tegra_pcie_enable_system_interrupts(struct pcie_port *pp)
val_w = dw_pcie_readw_dbi(&pcie->pci, pcie->pcie_cap_base + PCI_EXP_LNKSTA); - pcie->init_link_width = (val_w & PCI_EXP_LNKSTA_NLW) >> - PCI_EXP_LNKSTA_NLW_SHIFT; + pcie->init_link_width = FIELD_GET(PCI_EXP_LNKSTA_NLW, val_w);
val_w = dw_pcie_readw_dbi(&pcie->pci, pcie->pcie_cap_base + PCI_EXP_LNKCTL); @@ -889,7 +888,7 @@ static int tegra_pcie_dw_host_init(struct pcie_port *pp) /* Configure Max lane width from DT */ val = dw_pcie_readl_dbi(pci, pcie->pcie_cap_base + PCI_EXP_LNKCAP); val &= ~PCI_EXP_LNKCAP_MLW; - val |= (pcie->num_lanes << PCI_EXP_LNKSTA_NLW_SHIFT); + val |= FIELD_PREP(PCI_EXP_LNKCAP_MLW, pcie->num_lanes); dw_pcie_writel_dbi(pci, pcie->pcie_cap_base + PCI_EXP_LNKCAP, val);
config_gen3_gen4_eq_presets(pcie);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
[ Upstream commit c28742447ca9879b52fbaf022ad844f0ffcd749c ]
In get_esi() PCI errors are checked inside line-split "if" conditions (in addition to the file not following the coding style). To make the code in get_esi() more readable, fix the coding style and use the usual error handling pattern with a separate variable.
In addition, initialization of 'error' variable at declaration is not needed.
No functional changes intended.
Link: https://lore.kernel.org/r/20230911125354.25501-4-ilpo.jarvinen@linux.intel.c... Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/atm/iphase.c | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-)
diff --git a/drivers/atm/iphase.c b/drivers/atm/iphase.c index bc8e8d9f176b2..ce56306eeb6ce 100644 --- a/drivers/atm/iphase.c +++ b/drivers/atm/iphase.c @@ -2293,19 +2293,21 @@ static int get_esi(struct atm_dev *dev) static int reset_sar(struct atm_dev *dev) { IADEV *iadev; - int i, error = 1; + int i, error; unsigned int pci[64]; iadev = INPH_IA_DEV(dev); - for(i=0; i<64; i++) - if ((error = pci_read_config_dword(iadev->pci, - i*4, &pci[i])) != PCIBIOS_SUCCESSFUL) - return error; + for (i = 0; i < 64; i++) { + error = pci_read_config_dword(iadev->pci, i * 4, &pci[i]); + if (error != PCIBIOS_SUCCESSFUL) + return error; + } writel(0, iadev->reg+IPHASE5575_EXT_RESET); - for(i=0; i<64; i++) - if ((error = pci_write_config_dword(iadev->pci, - i*4, pci[i])) != PCIBIOS_SUCCESSFUL) - return error; + for (i = 0; i < 64; i++) { + error = pci_write_config_dword(iadev->pci, i * 4, pci[i]); + if (error != PCIBIOS_SUCCESSFUL) + return error; + } udelay(5); return 0; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wenchao Hao haowenchao2@huawei.com
[ Upstream commit 4df105f0ce9f6f30cda4e99f577150d23f0c9c5f ]
fc_lport_ptp_setup() did not check the return value of fc_rport_create() which can return NULL and would cause a NULL pointer dereference. Address this issue by checking return value of fc_rport_create() and log error message on fc_rport_create() failed.
Signed-off-by: Wenchao Hao haowenchao2@huawei.com Link: https://lore.kernel.org/r/20231011130350.819571-1-haowenchao2@huawei.com Reviewed-by: Simon Horman horms@kernel.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/libfc/fc_lport.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/scsi/libfc/fc_lport.c b/drivers/scsi/libfc/fc_lport.c index 19cd4a95d354d..d158c5eff059b 100644 --- a/drivers/scsi/libfc/fc_lport.c +++ b/drivers/scsi/libfc/fc_lport.c @@ -241,6 +241,12 @@ static void fc_lport_ptp_setup(struct fc_lport *lport, } mutex_lock(&lport->disc.disc_mutex); lport->ptp_rdata = fc_rport_create(lport, remote_fid); + if (!lport->ptp_rdata) { + printk(KERN_WARNING "libfc: Failed to setup lport 0x%x\n", + lport->port_id); + mutex_unlock(&lport->disc.disc_mutex); + return; + } kref_get(&lport->ptp_rdata->kref); lport->ptp_rdata->ids.port_name = remote_wwpn; lport->ptp_rdata->ids.node_name = remote_wwnn;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
[ Upstream commit d1f9b39da4a5347150246871325190018cda8cb3 ]
Use FIELD_GET() to extract PCIe Negotiated and Maximum Link Width fields instead of custom masking and shifting.
Link: https://lore.kernel.org/r/20230919125648.1920-7-ilpo.jarvinen@linux.intel.co... Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com [bhelgaas: drop duplicate include of <linux/bitfield.h>] Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/pci-sysfs.c | 5 ++--- drivers/pci/pci.c | 5 ++--- 2 files changed, 4 insertions(+), 6 deletions(-)
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c index f2909ae93f2f8..c271720c7f86f 100644 --- a/drivers/pci/pci-sysfs.c +++ b/drivers/pci/pci-sysfs.c @@ -12,7 +12,7 @@ * Modeled after usb's driverfs.c */
- +#include <linux/bitfield.h> #include <linux/kernel.h> #include <linux/sched.h> #include <linux/pci.h> @@ -208,8 +208,7 @@ static ssize_t current_link_width_show(struct device *dev, if (err) return -EINVAL;
- return sysfs_emit(buf, "%u\n", - (linkstat & PCI_EXP_LNKSTA_NLW) >> PCI_EXP_LNKSTA_NLW_SHIFT); + return sysfs_emit(buf, "%u\n", FIELD_GET(PCI_EXP_LNKSTA_NLW, linkstat)); } static DEVICE_ATTR_RO(current_link_width);
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 244c1c2e08767..371ba983b4084 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -6071,8 +6071,7 @@ u32 pcie_bandwidth_available(struct pci_dev *dev, struct pci_dev **limiting_dev, pcie_capability_read_word(dev, PCI_EXP_LNKSTA, &lnksta);
next_speed = pcie_link_speed[lnksta & PCI_EXP_LNKSTA_CLS]; - next_width = (lnksta & PCI_EXP_LNKSTA_NLW) >> - PCI_EXP_LNKSTA_NLW_SHIFT; + next_width = FIELD_GET(PCI_EXP_LNKSTA_NLW, lnksta);
next_bw = next_width * PCIE_SPEED2MBS_ENC(next_speed);
@@ -6144,7 +6143,7 @@ enum pcie_link_width pcie_get_width_cap(struct pci_dev *dev)
pcie_capability_read_dword(dev, PCI_EXP_LNKCAP, &lnkcap); if (lnkcap) - return (lnkcap & PCI_EXP_LNKCAP_MLW) >> 4; + return FIELD_GET(PCI_EXP_LNKCAP_MLW, lnkcap);
return PCIE_LNK_WIDTH_UNKNOWN; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bartosz Pawlowski bartosz.pawlowski@intel.com
[ Upstream commit f18b1137d38c091cc8c16365219f0a1d4a30b3d1 ]
Introduce quirk_no_ats() helper function to provide a standard way to disable ATS capability in PCI quirks.
Suggested-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Link: https://lore.kernel.org/r/20230908143606.685930-2-bartosz.pawlowski@intel.co... Signed-off-by: Bartosz Pawlowski bartosz.pawlowski@intel.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/quirks.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 5955e682c4348..30efa1ee595d3 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -5379,6 +5379,12 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SERVERWORKS, 0x0420, quirk_no_ext_tags); DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_SERVERWORKS, 0x0422, quirk_no_ext_tags);
#ifdef CONFIG_PCI_ATS +static void quirk_no_ats(struct pci_dev *pdev) +{ + pci_info(pdev, "disabling ATS\n"); + pdev->ats_cap = 0; +} + /* * Some devices require additional driver setup to enable ATS. Don't use * ATS for those devices as ATS will be enabled before the driver has had a @@ -5392,14 +5398,10 @@ static void quirk_amd_harvest_no_ats(struct pci_dev *pdev) (pdev->subsystem_device == 0xce19 || pdev->subsystem_device == 0xcc10 || pdev->subsystem_device == 0xcc08)) - goto no_ats; - else - return; + quirk_no_ats(pdev); + } else { + quirk_no_ats(pdev); } - -no_ats: - pci_info(pdev, "disabling ATS\n"); - pdev->ats_cap = 0; }
/* AMD Stoney platform GPU */
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bartosz Pawlowski bartosz.pawlowski@intel.com
[ Upstream commit a18615b1cfc04f00548c60eb9a77e0ce56e848fd ]
Due to a hardware issue in A and B steppings of Intel IPU E2000, it expects wrong endianness in ATS invalidation message body. This problem can lead to outdated translations being returned as valid and finally cause system instability.
To prevent such issues, add quirk_intel_e2000_no_ats() to disable ATS for vulnerable IPU E2000 devices.
Link: https://lore.kernel.org/r/20230908143606.685930-3-bartosz.pawlowski@intel.co... Signed-off-by: Bartosz Pawlowski bartosz.pawlowski@intel.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Reviewed-by: Alexander Lobakin aleksander.lobakin@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/quirks.c | 19 +++++++++++++++++++ 1 file changed, 19 insertions(+)
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 30efa1ee595d3..5d8768cd7c50a 100644 --- a/drivers/pci/quirks.c +++ b/drivers/pci/quirks.c @@ -5424,6 +5424,25 @@ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x7347, quirk_amd_harvest_no_ats); DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x734f, quirk_amd_harvest_no_ats); /* AMD Raven platform iGPU */ DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_ATI, 0x15d8, quirk_amd_harvest_no_ats); + +/* + * Intel IPU E2000 revisions before C0 implement incorrect endianness + * in ATS Invalidate Request message body. Disable ATS for those devices. + */ +static void quirk_intel_e2000_no_ats(struct pci_dev *pdev) +{ + if (pdev->revision < 0x20) + quirk_no_ats(pdev); +} +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1451, quirk_intel_e2000_no_ats); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1452, quirk_intel_e2000_no_ats); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1453, quirk_intel_e2000_no_ats); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1454, quirk_intel_e2000_no_ats); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1455, quirk_intel_e2000_no_ats); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1457, quirk_intel_e2000_no_ats); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x1459, quirk_intel_e2000_no_ats); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x145a, quirk_intel_e2000_no_ats); +DECLARE_PCI_FIXUP_FINAL(PCI_VENDOR_ID_INTEL, 0x145c, quirk_intel_e2000_no_ats); #endif /* CONFIG_PCI_ATS */
/* Freescale PCIe doesn't support MSI in RC mode */
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com
[ Upstream commit 6c4b39937f4e65688ea294725ae432b2565821ff ]
Add Renesas R8A779F0 in pci_device_id table so that pci-epf-test can be used for testing PCIe EP on R-Car S4-8.
Link: https://lore.kernel.org/linux-pci/20231018085631.1121289-16-yoshihiro.shimod... Signed-off-by: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com Signed-off-by: Krzysztof Wilczyński kwilczynski@kernel.org Acked-by: Manivannan Sadhasivam mani@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/misc/pci_endpoint_test.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/misc/pci_endpoint_test.c b/drivers/misc/pci_endpoint_test.c index 3382cf4905ded..0223e96aae47c 100644 --- a/drivers/misc/pci_endpoint_test.c +++ b/drivers/misc/pci_endpoint_test.c @@ -81,6 +81,7 @@ #define PCI_DEVICE_ID_RENESAS_R8A774B1 0x002b #define PCI_DEVICE_ID_RENESAS_R8A774C0 0x002d #define PCI_DEVICE_ID_RENESAS_R8A774E1 0x0025 +#define PCI_DEVICE_ID_RENESAS_R8A779F0 0x0031
static DEFINE_IDA(pci_endpoint_test_ida);
@@ -996,6 +997,9 @@ static const struct pci_device_id pci_endpoint_test_tbl[] = { { PCI_DEVICE(PCI_VENDOR_ID_RENESAS, PCI_DEVICE_ID_RENESAS_R8A774B1),}, { PCI_DEVICE(PCI_VENDOR_ID_RENESAS, PCI_DEVICE_ID_RENESAS_R8A774C0),}, { PCI_DEVICE(PCI_VENDOR_ID_RENESAS, PCI_DEVICE_ID_RENESAS_R8A774E1),}, + { PCI_DEVICE(PCI_VENDOR_ID_RENESAS, PCI_DEVICE_ID_RENESAS_R8A779F0), + .driver_data = (kernel_ulong_t)&default_data, + }, { PCI_DEVICE(PCI_VENDOR_ID_TI, PCI_DEVICE_ID_TI_J721E), .driver_data = (kernel_ulong_t)&j721e_data, },
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bjorn Helgaas bhelgaas@google.com
[ Upstream commit 04e82fa5951ca66495d7b05665eff673aa3852b4 ]
Use FIELD_GET() to remove dependences on the field position, i.e., the shift value. No functional change intended.
Separate because this isn't as trivial as the other FIELD_GET() changes.
See 907830b0fc9e ("PCI: Add a REBAR size quirk for Sapphire RX 5600 XT Pulse")
Link: https://lore.kernel.org/r/20231010204436.1000644-3-helgaas@kernel.org Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Reviewed-by: Jonathan Cameron Jonathan.Cameron@huawei.com Reviewed-by: Kuppuswamy Sathyanarayanan sathyanarayanan.kuppuswamy@linux.intel.com Cc: Nirmoy Das nirmoy.das@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/pci.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 371ba983b4084..cc3f620b73bd7 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -3649,14 +3649,14 @@ u32 pci_rebar_get_possible_sizes(struct pci_dev *pdev, int bar) return 0;
pci_read_config_dword(pdev, pos + PCI_REBAR_CAP, &cap); - cap &= PCI_REBAR_CAP_SIZES; + cap = FIELD_GET(PCI_REBAR_CAP_SIZES, cap);
/* Sapphire RX 5600 XT Pulse has an invalid cap dword for BAR 0 */ if (pdev->vendor == PCI_VENDOR_ID_ATI && pdev->device == 0x731f && - bar == 0 && cap == 0x7000) - cap = 0x3f000; + bar == 0 && cap == 0x700) + return 0x3f00;
- return cap >> 4; + return cap; } EXPORT_SYMBOL(pci_rebar_get_possible_sizes);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jiri Kosina jkosina@suse.cz
[ Upstream commit 62cc9c3cb3ec1bf31cc116146185ed97b450836a ]
This device needs ALWAYS_POLL quirk, otherwise it keeps reconnecting indefinitely.
Reported-by: Robert Ayrapetyan robert.ayrapetyan@gmail.com Signed-off-by: Jiri Kosina jkosina@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hid/hid-ids.h | 1 + drivers/hid/hid-quirks.c | 1 + 2 files changed, 2 insertions(+)
diff --git a/drivers/hid/hid-ids.h b/drivers/hid/hid-ids.h index 5fceefb3c707e..caca5d6e95d64 100644 --- a/drivers/hid/hid-ids.h +++ b/drivers/hid/hid-ids.h @@ -349,6 +349,7 @@
#define USB_VENDOR_ID_DELL 0x413c #define USB_DEVICE_ID_DELL_PIXART_USB_OPTICAL_MOUSE 0x301a +#define USB_DEVICE_ID_DELL_PRO_WIRELESS_KM5221W 0x4503
#define USB_VENDOR_ID_DELORME 0x1163 #define USB_DEVICE_ID_DELORME_EARTHMATE 0x0100 diff --git a/drivers/hid/hid-quirks.c b/drivers/hid/hid-quirks.c index 96ca7d981ee20..225138a39d323 100644 --- a/drivers/hid/hid-quirks.c +++ b/drivers/hid/hid-quirks.c @@ -66,6 +66,7 @@ static const struct hid_device_id hid_quirks[] = { { HID_USB_DEVICE(USB_VENDOR_ID_CORSAIR, USB_DEVICE_ID_CORSAIR_STRAFE), HID_QUIRK_NO_INIT_REPORTS | HID_QUIRK_ALWAYS_POLL }, { HID_USB_DEVICE(USB_VENDOR_ID_CREATIVELABS, USB_DEVICE_ID_CREATIVE_SB_OMNI_SURROUND_51), HID_QUIRK_NOGET }, { HID_USB_DEVICE(USB_VENDOR_ID_DELL, USB_DEVICE_ID_DELL_PIXART_USB_OPTICAL_MOUSE), HID_QUIRK_ALWAYS_POLL }, + { HID_USB_DEVICE(USB_VENDOR_ID_DELL, USB_DEVICE_ID_DELL_PRO_WIRELESS_KM5221W), HID_QUIRK_ALWAYS_POLL }, { HID_USB_DEVICE(USB_VENDOR_ID_DMI, USB_DEVICE_ID_DMI_ENC), HID_QUIRK_NOGET }, { HID_USB_DEVICE(USB_VENDOR_ID_DRACAL_RAPHNET, USB_DEVICE_ID_RAPHNET_2NES2SNES), HID_QUIRK_MULTI_INPUT }, { HID_USB_DEVICE(USB_VENDOR_ID_DRACAL_RAPHNET, USB_DEVICE_ID_RAPHNET_4NES4SNES), HID_QUIRK_MULTI_INPUT },
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yuezhang Mo Yuezhang.Mo@sony.com
[ Upstream commit dab48b8f2fe7264d51ec9eed0adea0fe3c78830a ]
After repairing a corrupted file system with exfatprogs' fsck.exfat, zero-size directories may result. It is also possible to create zero-size directories in other exFAT implementation, such as Paragon ufsd dirver.
As described in the specification, the lower directory size limits is 0 bytes.
Without this commit, sub-directories and files cannot be created under a zero-size directory, and it cannot be removed.
Signed-off-by: Yuezhang Mo Yuezhang.Mo@sony.com Reviewed-by: Andy Wu Andy.Wu@sony.com Reviewed-by: Aoyama Wataru wataru.aoyama@sony.com Signed-off-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/exfat/namei.c | 29 ++++++++++++++++++++++------- 1 file changed, 22 insertions(+), 7 deletions(-)
diff --git a/fs/exfat/namei.c b/fs/exfat/namei.c index b22d6c984f8c7..cfa46d8cf5b39 100644 --- a/fs/exfat/namei.c +++ b/fs/exfat/namei.c @@ -330,14 +330,20 @@ static int exfat_find_empty_entry(struct inode *inode, if (exfat_check_max_dentries(inode)) return -ENOSPC;
- /* we trust p_dir->size regardless of FAT type */ - if (exfat_find_last_cluster(sb, p_dir, &last_clu)) - return -EIO; - /* * Allocate new cluster to this directory */ - exfat_chain_set(&clu, last_clu + 1, 0, p_dir->flags); + if (ei->start_clu != EXFAT_EOF_CLUSTER) { + /* we trust p_dir->size regardless of FAT type */ + if (exfat_find_last_cluster(sb, p_dir, &last_clu)) + return -EIO; + + exfat_chain_set(&clu, last_clu + 1, 0, p_dir->flags); + } else { + /* This directory is empty */ + exfat_chain_set(&clu, EXFAT_EOF_CLUSTER, 0, + ALLOC_NO_FAT_CHAIN); + }
/* allocate a cluster */ ret = exfat_alloc_cluster(inode, 1, &clu, IS_DIRSYNC(inode)); @@ -347,6 +353,11 @@ static int exfat_find_empty_entry(struct inode *inode, if (exfat_zeroed_cluster(inode, clu.dir)) return -EIO;
+ if (ei->start_clu == EXFAT_EOF_CLUSTER) { + ei->start_clu = clu.dir; + p_dir->dir = clu.dir; + } + /* append to the FAT chain */ if (clu.flags != p_dir->flags) { /* no-fat-chain bit is disabled, @@ -644,7 +655,7 @@ static int exfat_find(struct inode *dir, struct qstr *qname, info->type = exfat_get_entry_type(ep); info->attr = le16_to_cpu(ep->dentry.file.attr); info->size = le64_to_cpu(ep2->dentry.stream.valid_size); - if ((info->type == TYPE_FILE) && (info->size == 0)) { + if (info->size == 0) { info->flags = ALLOC_NO_FAT_CHAIN; info->start_clu = EXFAT_EOF_CLUSTER; } else { @@ -891,6 +902,9 @@ static int exfat_check_dir_empty(struct super_block *sb,
dentries_per_clu = sbi->dentries_per_clu;
+ if (p_dir->dir == EXFAT_EOF_CLUSTER) + return 0; + exfat_chain_dup(&clu, p_dir);
while (clu.dir != EXFAT_EOF_CLUSTER) { @@ -1274,7 +1288,8 @@ static int __exfat_rename(struct inode *old_parent_inode, }
/* Free the clusters if new_inode is a dir(as if exfat_rmdir) */ - if (new_entry_type == TYPE_DIR) { + if (new_entry_type == TYPE_DIR && + new_ei->start_clu != EXFAT_EOF_CLUSTER) { /* new_ei, new_clu_to_free */ struct exfat_chain new_clu_to_free;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yi Yang yiyang13@huawei.com
[ Upstream commit d81ffb87aaa75f842cd7aa57091810353755b3e6 ]
Add check for the return value of kstrdup() and return the error, if it fails in order to avoid NULL pointer dereference.
Signed-off-by: Yi Yang yiyang13@huawei.com Reviewed-by: Jiri Slaby jirislaby@kernel.org Link: https://lore.kernel.org/r/20230904035220.48164-1-yiyang13@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/tty/vcc.c | 16 +++++++++++++--- 1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/drivers/tty/vcc.c b/drivers/tty/vcc.c index e11383ae1e7e3..71356d9684bac 100644 --- a/drivers/tty/vcc.c +++ b/drivers/tty/vcc.c @@ -578,18 +578,22 @@ static int vcc_probe(struct vio_dev *vdev, const struct vio_device_id *id) return -ENOMEM;
name = kstrdup(dev_name(&vdev->dev), GFP_KERNEL); + if (!name) { + rv = -ENOMEM; + goto free_port; + }
rv = vio_driver_init(&port->vio, vdev, VDEV_CONSOLE_CON, vcc_versions, ARRAY_SIZE(vcc_versions), NULL, name); if (rv) - goto free_port; + goto free_name;
port->vio.debug = vcc_dbg_vio; vcc_ldc_cfg.debug = vcc_dbg_ldc;
rv = vio_ldc_alloc(&port->vio, &vcc_ldc_cfg, port); if (rv) - goto free_port; + goto free_name;
spin_lock_init(&port->lock);
@@ -623,6 +627,11 @@ static int vcc_probe(struct vio_dev *vdev, const struct vio_device_id *id) goto unreg_tty; } port->domain = kstrdup(domain, GFP_KERNEL); + if (!port->domain) { + rv = -ENOMEM; + goto unreg_tty; + } +
mdesc_release(hp);
@@ -652,8 +661,9 @@ static int vcc_probe(struct vio_dev *vdev, const struct vio_device_id *id) vcc_table_remove(port->index); free_ldc: vio_ldc_free(&port->vio); -free_port: +free_name: kfree(name); +free_port: kfree(port);
return rv;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hardik Gajjar hgajjar@de.adit-jv.com
[ Upstream commit a04224da1f3424b2c607b12a3bd1f0e302fb8231 ]
Previously, gadget assignment to the net device occurred exclusively during the initial binding attempt.
Nevertheless, the gadget pointer could change during bind/unbind cycles due to various conditions, including the unloading/loading of the UDC device driver or the detachment/reconnection of an OTG-capable USB hub device.
This patch relocates the gether_set_gadget() function out from ncm_opts->bound condition check, ensuring that the correct gadget is assigned during each bind request.
The provided logs demonstrate the consistency of ncm_opts throughout the power cycle, while the gadget may change.
* OTG hub connected during boot up and assignment of gadget and ncm_opts pointer
[ 2.366301] usb 2-1.5: New USB device found, idVendor=2996, idProduct=0105 [ 2.366304] usb 2-1.5: New USB device strings: Mfr=1, Product=2, SerialNumber=3 [ 2.366306] usb 2-1.5: Product: H2H Bridge [ 2.366308] usb 2-1.5: Manufacturer: Aptiv [ 2.366309] usb 2-1.5: SerialNumber: 13FEB2021 [ 2.427989] usb 2-1.5: New USB device found, VID=2996, PID=0105 [ 2.428959] dabridge 2-1.5:1.0: dabridge 2-4 total endpoints=5, 0000000093a8d681 [ 2.429710] dabridge 2-1.5:1.0: P(0105) D(22.06.22) F(17.3.16) H(1.1) high-speed [ 2.429714] dabridge 2-1.5:1.0: Hub 2-2 P(0151) V(06.87) [ 2.429956] dabridge 2-1.5:1.0: All downstream ports in host mode
[ 2.430093] gadget 000000003c414d59 ------> gadget pointer
* NCM opts and associated gadget pointer during First ncm_bind
[ 34.763929] NCM opts 00000000aa304ac9 [ 34.763930] NCM gadget 000000003c414d59
* OTG capable hub disconnecte or assume driver unload.
[ 97.203114] usb 2-1: USB disconnect, device number 2 [ 97.203118] usb 2-1.1: USB disconnect, device number 3 [ 97.209217] usb 2-1.5: USB disconnect, device number 4 [ 97.230990] dabr_udc deleted
* Reconnect the OTG hub or load driver assaign new gadget pointer.
[ 111.534035] usb 2-1.1: New USB device found, idVendor=2996, idProduct=0120, bcdDevice= 6.87 [ 111.534038] usb 2-1.1: New USB device strings: Mfr=1, Product=2, SerialNumber=3 [ 111.534040] usb 2-1.1: Product: Vendor [ 111.534041] usb 2-1.1: Manufacturer: Aptiv [ 111.534042] usb 2-1.1: SerialNumber: Superior [ 111.535175] usb 2-1.1: New USB device found, VID=2996, PID=0120 [ 111.610995] usb 2-1.5: new high-speed USB device number 8 using xhci-hcd [ 111.630052] usb 2-1.5: New USB device found, idVendor=2996, idProduct=0105, bcdDevice=21.02 [ 111.630055] usb 2-1.5: New USB device strings: Mfr=1, Product=2, SerialNumber=3 [ 111.630057] usb 2-1.5: Product: H2H Bridge [ 111.630058] usb 2-1.5: Manufacturer: Aptiv [ 111.630059] usb 2-1.5: SerialNumber: 13FEB2021 [ 111.687464] usb 2-1.5: New USB device found, VID=2996, PID=0105 [ 111.690375] dabridge 2-1.5:1.0: dabridge 2-8 total endpoints=5, 000000000d87c961 [ 111.691172] dabridge 2-1.5:1.0: P(0105) D(22.06.22) F(17.3.16) H(1.1) high-speed [ 111.691176] dabridge 2-1.5:1.0: Hub 2-6 P(0151) V(06.87) [ 111.691646] dabridge 2-1.5:1.0: All downstream ports in host mode
[ 111.692298] gadget 00000000dc72f7a9 --------> new gadget ptr on connect
* NCM opts and associated gadget pointer during second ncm_bind
[ 113.271786] NCM opts 00000000aa304ac9 -----> same opts ptr used during first bind [ 113.271788] NCM gadget 00000000dc72f7a9 ----> however new gaget ptr, that will not set in net_device due to ncm_opts->bound = true
Signed-off-by: Hardik Gajjar hgajjar@de.adit-jv.com Link: https://lore.kernel.org/r/20231020153324.82794-1-hgajjar@de.adit-jv.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/gadget/function/f_ncm.c | 27 +++++++++++---------------- 1 file changed, 11 insertions(+), 16 deletions(-)
diff --git a/drivers/usb/gadget/function/f_ncm.c b/drivers/usb/gadget/function/f_ncm.c index aabaedb2e0691..bd095ae569edd 100644 --- a/drivers/usb/gadget/function/f_ncm.c +++ b/drivers/usb/gadget/function/f_ncm.c @@ -1429,7 +1429,7 @@ static int ncm_bind(struct usb_configuration *c, struct usb_function *f) struct usb_composite_dev *cdev = c->cdev; struct f_ncm *ncm = func_to_ncm(f); struct usb_string *us; - int status; + int status = 0; struct usb_ep *ep; struct f_ncm_opts *ncm_opts;
@@ -1447,22 +1447,17 @@ static int ncm_bind(struct usb_configuration *c, struct usb_function *f) f->os_desc_table[0].os_desc = &ncm_opts->ncm_os_desc; }
- /* - * in drivers/usb/gadget/configfs.c:configfs_composite_bind() - * configurations are bound in sequence with list_for_each_entry, - * in each configuration its functions are bound in sequence - * with list_for_each_entry, so we assume no race condition - * with regard to ncm_opts->bound access - */ - if (!ncm_opts->bound) { - mutex_lock(&ncm_opts->lock); - gether_set_gadget(ncm_opts->net, cdev->gadget); + mutex_lock(&ncm_opts->lock); + gether_set_gadget(ncm_opts->net, cdev->gadget); + if (!ncm_opts->bound) status = gether_register_netdev(ncm_opts->net); - mutex_unlock(&ncm_opts->lock); - if (status) - goto fail; - ncm_opts->bound = true; - } + mutex_unlock(&ncm_opts->lock); + + if (status) + goto fail; + + ncm_opts->bound = true; + us = usb_gstrings_attach(cdev, ncm_strings, ARRAY_SIZE(ncm_string_defs)); if (IS_ERR(us)) {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marco Elver elver@google.com
[ Upstream commit 355f074609dbf3042900ea9d30fcd2b0c323a365 ]
syzbot reported:
| BUG: KCSAN: data-race in p9_fd_create / p9_fd_create | | read-write to 0xffff888130fb3d48 of 4 bytes by task 15599 on cpu 0: | p9_fd_open net/9p/trans_fd.c:842 [inline] | p9_fd_create+0x210/0x250 net/9p/trans_fd.c:1092 | p9_client_create+0x595/0xa70 net/9p/client.c:1010 | v9fs_session_init+0xf9/0xd90 fs/9p/v9fs.c:410 | v9fs_mount+0x69/0x630 fs/9p/vfs_super.c:123 | legacy_get_tree+0x74/0xd0 fs/fs_context.c:611 | vfs_get_tree+0x51/0x190 fs/super.c:1519 | do_new_mount+0x203/0x660 fs/namespace.c:3335 | path_mount+0x496/0xb30 fs/namespace.c:3662 | do_mount fs/namespace.c:3675 [inline] | __do_sys_mount fs/namespace.c:3884 [inline] | [...] | | read-write to 0xffff888130fb3d48 of 4 bytes by task 15563 on cpu 1: | p9_fd_open net/9p/trans_fd.c:842 [inline] | p9_fd_create+0x210/0x250 net/9p/trans_fd.c:1092 | p9_client_create+0x595/0xa70 net/9p/client.c:1010 | v9fs_session_init+0xf9/0xd90 fs/9p/v9fs.c:410 | v9fs_mount+0x69/0x630 fs/9p/vfs_super.c:123 | legacy_get_tree+0x74/0xd0 fs/fs_context.c:611 | vfs_get_tree+0x51/0x190 fs/super.c:1519 | do_new_mount+0x203/0x660 fs/namespace.c:3335 | path_mount+0x496/0xb30 fs/namespace.c:3662 | do_mount fs/namespace.c:3675 [inline] | __do_sys_mount fs/namespace.c:3884 [inline] | [...] | | value changed: 0x00008002 -> 0x00008802
Within p9_fd_open(), O_NONBLOCK is added to f_flags of the read and write files. This may happen concurrently if e.g. mounting process modifies the fd in another thread.
Mark the plain read-modify-writes as intentional data-races, with the assumption that the result of executing the accesses concurrently will always result in the same result despite the accesses themselves not being atomic.
Reported-by: syzbot+e441aeeb422763cc5511@syzkaller.appspotmail.com Signed-off-by: Marco Elver elver@google.com Link: https://lore.kernel.org/r/ZO38mqkS0TYUlpFp@elver.google.com Signed-off-by: Dominique Martinet asmadeus@codewreck.org Message-ID: 20231025103445.1248103-1-asmadeus@codewreck.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/9p/trans_fd.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-)
diff --git a/net/9p/trans_fd.c b/net/9p/trans_fd.c index f359cfdc1858f..b44b77d3b35d1 100644 --- a/net/9p/trans_fd.c +++ b/net/9p/trans_fd.c @@ -835,14 +835,21 @@ static int p9_fd_open(struct p9_client *client, int rfd, int wfd) goto out_free_ts; if (!(ts->rd->f_mode & FMODE_READ)) goto out_put_rd; - /* prevent workers from hanging on IO when fd is a pipe */ - ts->rd->f_flags |= O_NONBLOCK; + /* Prevent workers from hanging on IO when fd is a pipe. + * It's technically possible for userspace or concurrent mounts to + * modify this flag concurrently, which will likely result in a + * broken filesystem. However, just having bad flags here should + * not crash the kernel or cause any other sort of bug, so mark this + * particular data race as intentional so that tooling (like KCSAN) + * can allow it and detect further problems. + */ + data_race(ts->rd->f_flags |= O_NONBLOCK); ts->wr = fget(wfd); if (!ts->wr) goto out_put_rd; if (!(ts->wr->f_mode & FMODE_WRITE)) goto out_put_wr; - ts->wr->f_flags |= O_NONBLOCK; + data_race(ts->wr->f_flags |= O_NONBLOCK);
client->trans = ts; client->status = Connected;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dominique Martinet asmadeus@codewreck.org
[ Upstream commit 9b5c6281838fc84683dd99b47302d81fce399918 ]
W=1 warns about null argument to kprintf: In file included from fs/9p/xattr.c:12: In function ‘v9fs_xattr_get’, inlined from ‘v9fs_listxattr’ at fs/9p/xattr.c:142:9: include/net/9p/9p.h:55:2: error: ‘%s’ directive argument is null [-Werror=format-overflow=] 55 | _p9_debug(level, __func__, fmt, ##__VA_ARGS__) | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Use an empty string instead of : - this is ok 9p-wise because p9pdu_vwritef serializes a null string and an empty string the same way (one '0' word for length) - since this degrades the print statements, add new single quotes for xattr's name delimter (Old: "file = (null)", new: "file = ''")
Link: https://lore.kernel.org/r/20231008060138.517057-1-suhui@nfschina.com Suggested-by: Su Hui suhui@nfschina.com Signed-off-by: Dominique Martinet asmadeus@codewreck.org Acked-by: Christian Schoenebeck linux_oss@crudebyte.com Message-ID: 20231025103445.1248103-2-asmadeus@codewreck.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/9p/xattr.c | 5 +++-- net/9p/client.c | 2 +- 2 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/fs/9p/xattr.c b/fs/9p/xattr.c index ee331845e2c7a..31799ac10e33a 100644 --- a/fs/9p/xattr.c +++ b/fs/9p/xattr.c @@ -73,7 +73,7 @@ ssize_t v9fs_xattr_get(struct dentry *dentry, const char *name, struct p9_fid *fid; int ret;
- p9_debug(P9_DEBUG_VFS, "name = %s value_len = %zu\n", + p9_debug(P9_DEBUG_VFS, "name = '%s' value_len = %zu\n", name, buffer_size); fid = v9fs_fid_lookup(dentry); if (IS_ERR(fid)) @@ -144,7 +144,8 @@ int v9fs_fid_xattr_set(struct p9_fid *fid, const char *name,
ssize_t v9fs_listxattr(struct dentry *dentry, char *buffer, size_t buffer_size) { - return v9fs_xattr_get(dentry, NULL, buffer, buffer_size); + /* Txattrwalk with an empty string lists xattrs instead */ + return v9fs_xattr_get(dentry, "", buffer, buffer_size); }
static int v9fs_xattr_handler_get(const struct xattr_handler *handler, diff --git a/net/9p/client.c b/net/9p/client.c index 9fdcaa956c008..ead458486fdcf 100644 --- a/net/9p/client.c +++ b/net/9p/client.c @@ -2020,7 +2020,7 @@ struct p9_fid *p9_client_xattrwalk(struct p9_fid *file_fid, goto error; } p9_debug(P9_DEBUG_9P, - ">>> TXATTRWALK file_fid %d, attr_fid %d name %s\n", + ">>> TXATTRWALK file_fid %d, attr_fid %d name '%s'\n", file_fid->fid, attr_fid->fid, attr_name);
req = p9_client_rpc(clnt, P9_TXATTRWALK, "dds",
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jarkko Nikula jarkko.nikula@linux.intel.com
[ Upstream commit 45a832f989e520095429589d5b01b0c65da9b574 ]
Do not loop over ring headers in hci_dma_irq_handler() that are not allocated and enabled in hci_dma_init(). Otherwise out of bounds access will occur from rings->headers[i] access when i >= number of allocated ring headers.
Signed-off-by: Jarkko Nikula jarkko.nikula@linux.intel.com Link: https://lore.kernel.org/r/20230921055704.1087277-5-jarkko.nikula@linux.intel... Signed-off-by: Alexandre Belloni alexandre.belloni@bootlin.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i3c/master/mipi-i3c-hci/dma.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/i3c/master/mipi-i3c-hci/dma.c b/drivers/i3c/master/mipi-i3c-hci/dma.c index af873a9be0507..dd2dc00399600 100644 --- a/drivers/i3c/master/mipi-i3c-hci/dma.c +++ b/drivers/i3c/master/mipi-i3c-hci/dma.c @@ -734,7 +734,7 @@ static bool hci_dma_irq_handler(struct i3c_hci *hci, unsigned int mask) unsigned int i; bool handled = false;
- for (i = 0; mask && i < 8; i++) { + for (i = 0; mask && i < rings->total; i++) { struct hci_rh_data *rh; u32 status;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Axel Lin axel.lin@ingics.com
[ Upstream commit 5ac61d26b8baff5b2e5a9f3dc1ef63297e4b53e7 ]
Make sure we don't OOPS in case clock-frequency is set to 0 in a DT. The variable set here is later used as a divisor.
Signed-off-by: Axel Lin axel.lin@ingics.com Acked-by: Boris Brezillon boris.brezillon@free-electrons.com Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-sun6i-p2wi.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/i2c/busses/i2c-sun6i-p2wi.c b/drivers/i2c/busses/i2c-sun6i-p2wi.c index 9e3483f507ff5..f2ed13b551088 100644 --- a/drivers/i2c/busses/i2c-sun6i-p2wi.c +++ b/drivers/i2c/busses/i2c-sun6i-p2wi.c @@ -201,6 +201,11 @@ static int p2wi_probe(struct platform_device *pdev) return -EINVAL; }
+ if (clk_freq == 0) { + dev_err(dev, "clock-frequency is set to 0 in DT\n"); + return -EINVAL; + } + if (of_get_child_count(np) > 1) { dev_err(dev, "P2WI only supports one slave device\n"); return -EINVAL;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: zhenwei pi pizhenwei@bytedance.com
[ Upstream commit fafb51a67fb883eb2dde352539df939a251851be ]
The following codes have an implicit conversion from size_t to u32: (u32)max_size = (size_t)virtio_max_dma_size(vdev);
This may lead overflow, Ex (size_t)4G -> (u32)0. Once virtio_max_dma_size() has a larger size than U32_MAX, use U32_MAX instead.
Signed-off-by: zhenwei pi pizhenwei@bytedance.com Message-Id: 20230904061045.510460-1-pizhenwei@bytedance.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/virtio_blk.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c index d2ba849bb8d19..affeca0dbc7ea 100644 --- a/drivers/block/virtio_blk.c +++ b/drivers/block/virtio_blk.c @@ -743,6 +743,7 @@ static int virtblk_probe(struct virtio_device *vdev) u16 min_io_size; u8 physical_block_exp, alignment_offset; unsigned int queue_depth; + size_t max_dma_size;
if (!vdev->config->get) { dev_err(&vdev->dev, "%s failure: config access disabled\n", @@ -844,7 +845,8 @@ static int virtblk_probe(struct virtio_device *vdev) /* No real sector limit. */ blk_queue_max_hw_sectors(q, -1U);
- max_size = virtio_max_dma_size(vdev); + max_dma_size = virtio_max_dma_size(vdev); + max_size = max_dma_size > U32_MAX ? U32_MAX : max_dma_size;
/* Host can optionally specify maximum segment size and number of * segments. */
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Billy Tsai billy_tsai@aspeedtech.com
[ Upstream commit b53e9758a31c683fc8615df930262192ed5f034b ]
The `i3c_master_bus_init` function may attach the I2C devices before the I3C bus initialization. In this flow, the DAT `alloc_entry`` will be used before the DAT `init`. Additionally, if the `i3c_master_bus_init` fails, the DAT `cleanup` will execute before the device is detached, which will execue DAT `free_entry` function. The above scenario can cause the driver to use DAT_data when it is NULL.
Signed-off-by: Billy Tsai billy_tsai@aspeedtech.com Link: https://lore.kernel.org/r/20231023080237.560936-1-billy_tsai@aspeedtech.com Signed-off-by: Alexandre Belloni alexandre.belloni@bootlin.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i3c/master/mipi-i3c-hci/dat_v1.c | 29 ++++++++++++++++-------- 1 file changed, 19 insertions(+), 10 deletions(-)
diff --git a/drivers/i3c/master/mipi-i3c-hci/dat_v1.c b/drivers/i3c/master/mipi-i3c-hci/dat_v1.c index 97bb49ff5b53b..47b9b4d4ed3fc 100644 --- a/drivers/i3c/master/mipi-i3c-hci/dat_v1.c +++ b/drivers/i3c/master/mipi-i3c-hci/dat_v1.c @@ -64,15 +64,17 @@ static int hci_dat_v1_init(struct i3c_hci *hci) return -EOPNOTSUPP; }
- /* use a bitmap for faster free slot search */ - hci->DAT_data = bitmap_zalloc(hci->DAT_entries, GFP_KERNEL); - if (!hci->DAT_data) - return -ENOMEM; - - /* clear them */ - for (dat_idx = 0; dat_idx < hci->DAT_entries; dat_idx++) { - dat_w0_write(dat_idx, 0); - dat_w1_write(dat_idx, 0); + if (!hci->DAT_data) { + /* use a bitmap for faster free slot search */ + hci->DAT_data = bitmap_zalloc(hci->DAT_entries, GFP_KERNEL); + if (!hci->DAT_data) + return -ENOMEM; + + /* clear them */ + for (dat_idx = 0; dat_idx < hci->DAT_entries; dat_idx++) { + dat_w0_write(dat_idx, 0); + dat_w1_write(dat_idx, 0); + } }
return 0; @@ -87,7 +89,13 @@ static void hci_dat_v1_cleanup(struct i3c_hci *hci) static int hci_dat_v1_alloc_entry(struct i3c_hci *hci) { unsigned int dat_idx; + int ret;
+ if (!hci->DAT_data) { + ret = hci_dat_v1_init(hci); + if (ret) + return ret; + } dat_idx = find_first_zero_bit(hci->DAT_data, hci->DAT_entries); if (dat_idx >= hci->DAT_entries) return -ENOENT; @@ -103,7 +111,8 @@ static void hci_dat_v1_free_entry(struct i3c_hci *hci, unsigned int dat_idx) { dat_w0_write(dat_idx, 0); dat_w1_write(dat_idx, 0); - __clear_bit(dat_idx, hci->DAT_data); + if (hci->DAT_data) + __clear_bit(dat_idx, hci->DAT_data); }
static void hci_dat_v1_set_dynamic_addr(struct i3c_hci *hci,
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rajeshwar R Shinde coolrrsh@gmail.com
[ Upstream commit 099be1822d1f095433f4b08af9cc9d6308ec1953 ]
Syzkaller reported the following issue: UBSAN: shift-out-of-bounds in drivers/media/usb/gspca/cpia1.c:1031:27 shift exponent 245 is too large for 32-bit type 'int'
When the value of the variable "sd->params.exposure.gain" exceeds the number of bits in an integer, a shift-out-of-bounds error is reported. It is triggered because the variable "currentexp" cannot be left-shifted by more than the number of bits in an integer. In order to avoid invalid range during left-shift, the conditional expression is added.
Reported-by: syzbot+e27f3dbdab04e43b9f73@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/20230818164522.12806-1-coolrrsh@gmail.com Link: https://syzkaller.appspot.com/bug?extid=e27f3dbdab04e43b9f73 Signed-off-by: Rajeshwar R Shinde coolrrsh@gmail.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/usb/gspca/cpia1.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/media/usb/gspca/cpia1.c b/drivers/media/usb/gspca/cpia1.c index 46ed95483e222..5f5fa851ca640 100644 --- a/drivers/media/usb/gspca/cpia1.c +++ b/drivers/media/usb/gspca/cpia1.c @@ -18,6 +18,7 @@
#include <linux/input.h> #include <linux/sched/signal.h> +#include <linux/bitops.h>
#include "gspca.h"
@@ -1028,6 +1029,8 @@ static int set_flicker(struct gspca_dev *gspca_dev, int on, int apply) sd->params.exposure.expMode = 2; sd->exposure_status = EXPOSURE_NORMAL; } + if (sd->params.exposure.gain >= BITS_PER_TYPE(currentexp)) + return -EINVAL; currentexp = currentexp << sd->params.exposure.gain; sd->params.exposure.gain = 0; /* round down current exposure to nearest value */
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hans Verkuil hverkuil-cisco@xs4all.nl
[ Upstream commit 4567ebf8e8f9546b373e78e3b7d584cc30b62028 ]
Fixes these compiler warnings:
drivers/media/test-drivers/vivid/vivid-rds-gen.c: In function 'vivid_rds_gen_fill': drivers/media/test-drivers/vivid/vivid-rds-gen.c:147:56: warning: '.' directive output may be truncated writing 1 byte into a region of size between 0 and 3 [-Wformat-truncation=] 147 | snprintf(rds->psname, sizeof(rds->psname), "%6d.%1d", | ^ drivers/media/test-drivers/vivid/vivid-rds-gen.c:147:52: note: directive argument in the range [0, 9] 147 | snprintf(rds->psname, sizeof(rds->psname), "%6d.%1d", | ^~~~~~~~~ drivers/media/test-drivers/vivid/vivid-rds-gen.c:147:9: note: 'snprintf' output between 9 and 12 bytes into a destination of size 9 147 | snprintf(rds->psname, sizeof(rds->psname), "%6d.%1d", | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 148 | freq / 16, ((freq & 0xf) * 10) / 16); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Acked-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/test-drivers/vivid/vivid-rds-gen.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/media/test-drivers/vivid/vivid-rds-gen.c b/drivers/media/test-drivers/vivid/vivid-rds-gen.c index b5b104ee64c99..c57771119a34b 100644 --- a/drivers/media/test-drivers/vivid/vivid-rds-gen.c +++ b/drivers/media/test-drivers/vivid/vivid-rds-gen.c @@ -145,7 +145,7 @@ void vivid_rds_gen_fill(struct vivid_rds_gen *rds, unsigned freq, rds->ta = alt; rds->ms = true; snprintf(rds->psname, sizeof(rds->psname), "%6d.%1d", - freq / 16, ((freq & 0xf) * 10) / 16); + (freq / 16) % 1000000, (((freq & 0xf) * 10) / 16) % 10); if (alt) strscpy(rds->radiotext, " The Radio Data System can switch between different Radio Texts ",
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bob Peterson rpeterso@redhat.com
[ Upstream commit 4c6a08125f2249531ec01783a5f4317d7342add5 ]
When lots of quota changes are made, there may be cases in which an inode's quota information is increased and then decreased, such as when blocks are added to a file, then deleted from it. If the timing is right, function do_qc can add pending quota changes to a transaction, then later, another call to do_qc can negate those changes, resulting in a net gain of 0. The quota_change information is recorded in the qc buffer (and qd element of the inode as well). The buffer is added to the transaction by the first call to do_qc, but a subsequent call changes the value from non-zero back to zero. At that point it's too late to remove the buffer_head from the transaction. Later, when the quota sync code is called, the zero-change qd element is discovered and flagged as an assert warning. If the fs is mounted with errors=panic, the kernel will panic.
This is usually seen when files are truncated and the quota changes are negated by punch_hole/truncate which uses gfs2_quota_hold and gfs2_quota_unhold rather than block allocations that use gfs2_quota_lock and gfs2_quota_unlock which automatically do quota sync.
This patch solves the problem by adding a check to qd_check_sync such that net-zero quota changes already added to the transaction are no longer deemed necessary to be synced, and skipped.
In this case references are taken for the qd and the slot from do_qc so those need to be put. The normal sequence of events for a normal non-zero quota change is as follows:
gfs2_quota_change do_qc qd_hold slot_hold
Later, when the changes are to be synced:
gfs2_quota_sync qd_fish qd_check_sync gets qd ref via lockref_get_not_dead do_sync do_qc(QC_SYNC) qd_put lockref_put_or_lock qd_unlock qd_put lockref_put_or_lock
In the net-zero change case, we add a check to qd_check_sync so it puts the qd and slot references acquired in gfs2_quota_change and skip the unneeded sync.
Signed-off-by: Bob Peterson rpeterso@redhat.com Signed-off-by: Andreas Gruenbacher agruenba@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/gfs2/quota.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/fs/gfs2/quota.c b/fs/gfs2/quota.c index dc77080a82bbf..c381580095baf 100644 --- a/fs/gfs2/quota.c +++ b/fs/gfs2/quota.c @@ -431,6 +431,17 @@ static int qd_check_sync(struct gfs2_sbd *sdp, struct gfs2_quota_data *qd, (sync_gen && (qd->qd_sync_gen >= *sync_gen))) return 0;
+ /* + * If qd_change is 0 it means a pending quota change was negated. + * We should not sync it, but we still have a qd reference and slot + * reference taken by gfs2_quota_change -> do_qc that need to be put. + */ + if (!qd->qd_change && test_and_clear_bit(QDF_CHANGE, &qd->qd_flags)) { + slot_put(qd); + qd_put(qd); + return 0; + } + if (!lockref_get_not_dead(&qd->qd_lockref)) return 0;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Al Viro viro@zeniv.linux.org.uk
[ Upstream commit 0abd1557e21c617bd13fc18f7725fc6363c05913 ]
In RCU mode, we might race with gfs2_evict_inode(), which zeroes ->i_gl. Freeing of the object it points to is RCU-delayed, so if we manage to fetch the pointer before it's been replaced with NULL, we are fine. Check if we'd fetched NULL and treat that as "bail out and tell the caller to get out of RCU mode".
Signed-off-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Andreas Gruenbacher agruenba@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/gfs2/inode.c | 11 +++++++++-- fs/gfs2/super.c | 2 +- 2 files changed, 10 insertions(+), 3 deletions(-)
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c index 97ee17843b4d0..682418d9c8e72 100644 --- a/fs/gfs2/inode.c +++ b/fs/gfs2/inode.c @@ -1850,14 +1850,21 @@ int gfs2_permission(struct user_namespace *mnt_userns, struct inode *inode, { struct gfs2_inode *ip; struct gfs2_holder i_gh; + struct gfs2_glock *gl; int error;
gfs2_holder_mark_uninitialized(&i_gh); ip = GFS2_I(inode); - if (gfs2_glock_is_locked_by_me(ip->i_gl) == NULL) { + gl = rcu_dereference(ip->i_gl); + if (unlikely(!gl)) { + /* inode is getting torn down, must be RCU mode */ + WARN_ON_ONCE(!(mask & MAY_NOT_BLOCK)); + return -ECHILD; + } + if (gfs2_glock_is_locked_by_me(gl) == NULL) { if (mask & MAY_NOT_BLOCK) return -ECHILD; - error = gfs2_glock_nq_init(ip->i_gl, LM_ST_SHARED, LM_FLAG_ANY, &i_gh); + error = gfs2_glock_nq_init(gl, LM_ST_SHARED, LM_FLAG_ANY, &i_gh); if (error) return error; } diff --git a/fs/gfs2/super.c b/fs/gfs2/super.c index 51b44da4a0d64..268651ac9fc84 100644 --- a/fs/gfs2/super.c +++ b/fs/gfs2/super.c @@ -1436,7 +1436,7 @@ static void gfs2_evict_inode(struct inode *inode) wait_on_bit_io(&ip->i_flags, GIF_GLOP_PENDING, TASK_UNINTERRUPTIBLE); gfs2_glock_add_to_lru(ip->i_gl); gfs2_glock_put_eventually(ip->i_gl); - ip->i_gl = NULL; + rcu_assign_pointer(ip->i_gl, NULL); } }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
[ Upstream commit f301fedbeecfdce91cb898d6fa5e62f269801fee ]
Use FIELD_GET() to extract PCIe Negotiated and Maximum Link Width fields instead of custom masking and shifting.
Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Reviewed-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/pci/cobalt/cobalt-driver.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/media/pci/cobalt/cobalt-driver.c b/drivers/media/pci/cobalt/cobalt-driver.c index 16af58f2f93cc..f9cee061517bd 100644 --- a/drivers/media/pci/cobalt/cobalt-driver.c +++ b/drivers/media/pci/cobalt/cobalt-driver.c @@ -8,6 +8,7 @@ * All rights reserved. */
+#include <linux/bitfield.h> #include <linux/delay.h> #include <media/i2c/adv7604.h> #include <media/i2c/adv7842.h> @@ -210,17 +211,17 @@ void cobalt_pcie_status_show(struct cobalt *cobalt) pcie_capability_read_word(pci_dev, PCI_EXP_LNKSTA, &stat); cobalt_info("PCIe link capability 0x%08x: %s per lane and %u lanes\n", capa, get_link_speed(capa), - (capa & PCI_EXP_LNKCAP_MLW) >> 4); + FIELD_GET(PCI_EXP_LNKCAP_MLW, capa)); cobalt_info("PCIe link control 0x%04x\n", ctrl); cobalt_info("PCIe link status 0x%04x: %s per lane and %u lanes\n", stat, get_link_speed(stat), - (stat & PCI_EXP_LNKSTA_NLW) >> 4); + FIELD_GET(PCI_EXP_LNKSTA_NLW, stat));
/* Bus */ pcie_capability_read_dword(pci_bus_dev, PCI_EXP_LNKCAP, &capa); cobalt_info("PCIe bus link capability 0x%08x: %s per lane and %u lanes\n", capa, get_link_speed(capa), - (capa & PCI_EXP_LNKCAP_MLW) >> 4); + FIELD_GET(PCI_EXP_LNKCAP_MLW, capa));
/* Slot */ pcie_capability_read_dword(pci_dev, PCI_EXP_SLTCAP, &capa); @@ -239,7 +240,7 @@ static unsigned pcie_link_get_lanes(struct cobalt *cobalt) if (!pci_is_pcie(pci_dev)) return 0; pcie_capability_read_word(pci_dev, PCI_EXP_LNKSTA, &link); - return (link & PCI_EXP_LNKSTA_NLW) >> 4; + return FIELD_GET(PCI_EXP_LNKSTA_NLW, link); }
static unsigned pcie_bus_link_get_lanes(struct cobalt *cobalt) @@ -250,7 +251,7 @@ static unsigned pcie_bus_link_get_lanes(struct cobalt *cobalt) if (!pci_is_pcie(pci_dev)) return 0; pcie_capability_read_dword(pci_dev, PCI_EXP_LNKCAP, &link); - return (link & PCI_EXP_LNKCAP_MLW) >> 4; + return FIELD_GET(PCI_EXP_LNKCAP_MLW, link); }
static void msi_config_show(struct cobalt *cobalt, struct pci_dev *pci_dev)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sakari Ailus sakari.ailus@linux.intel.com
[ Upstream commit 441b5c63d71ec9ec5453328f7e83384ecc1dddd9 ]
Fix documentation for struct ccs_quirk, a device specific struct for managing deviations from the standard. The flags field was drifted away from where it should have been.
Signed-off-by: Sakari Ailus sakari.ailus@linux.intel.com Reviewed-by: Laurent Pinchart laurent.pinchart@ideasonboard.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/i2c/ccs/ccs-quirk.h | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/media/i2c/ccs/ccs-quirk.h b/drivers/media/i2c/ccs/ccs-quirk.h index 5838fcda92fd4..0b1a64958d714 100644 --- a/drivers/media/i2c/ccs/ccs-quirk.h +++ b/drivers/media/i2c/ccs/ccs-quirk.h @@ -32,12 +32,10 @@ struct ccs_sensor; * @reg: Pointer to the register to access * @value: Register value, set by the caller on write, or * by the quirk on read - * - * @flags: Quirk flags - * * @return: 0 on success, -ENOIOCTLCMD if no register * access may be done by the caller (default read * value is zero), else negative error code on error + * @flags: Quirk flags */ struct ccs_quirk { int (*limits)(struct ccs_sensor *sensor);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takashi Iwai tiwai@suse.de
[ Upstream commit a1766a4fd83befa0b34d932d532e7ebb7fab1fa7 ]
imon driver probes two USB interfaces, and at the probe of the second interface, the driver assumes blindly that the first interface got bound with the same imon driver. It's usually true, but it's still possible that the first interface is bound with another driver via a malformed descriptor. Then it may lead to a memory corruption, as spotted by syzkaller; imon driver accesses the data from drvdata as struct imon_context object although it's a completely different one that was assigned by another driver.
This patch adds a sanity check -- whether the first interface is really bound with the imon driver or not -- for avoiding the problem above at the probe time.
Reported-by: syzbot+59875ffef5cb9c9b29e9@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/000000000000a838aa0603cc74d6@google.com/ Tested-by: Ricardo B. Marliere ricardo@marliere.net Link: https://lore.kernel.org/r/20230922005152.163640-1-ricardo@marliere.net Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sean Young sean@mess.org Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/rc/imon.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/media/rc/imon.c b/drivers/media/rc/imon.c index 72e4bb0fb71ec..4e7c3d889d5ce 100644 --- a/drivers/media/rc/imon.c +++ b/drivers/media/rc/imon.c @@ -2427,6 +2427,12 @@ static int imon_probe(struct usb_interface *interface, goto fail; }
+ if (first_if->dev.driver != interface->dev.driver) { + dev_err(&interface->dev, "inconsistent driver matching\n"); + ret = -EINVAL; + goto fail; + } + if (ifnum == 0) { ictx = imon_init_intf0(interface, id); if (!ictx) {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wayne Lin wayne.lin@amd.com
[ Upstream commit b1904ed480cee3f9f4036ea0e36d139cb5fee2d6 ]
[Why & How] Check whether assigned timing generator is NULL or not before accessing its funcs to prevent NULL dereference.
Reviewed-by: Jun Lei jun.lei@amd.com Acked-by: Hersen Wu hersenxs.wu@amd.com Signed-off-by: Wayne Lin wayne.lin@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/dc/core/dc_stream.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c b/drivers/gpu/drm/amd/display/dc/core/dc_stream.c index f0f54f4d3d9bc..5dd57cf170f51 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_stream.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_stream.c @@ -562,7 +562,7 @@ uint32_t dc_stream_get_vblank_counter(const struct dc_stream_state *stream) for (i = 0; i < MAX_PIPES; i++) { struct timing_generator *tg = res_ctx->pipe_ctx[i].stream_res.tg;
- if (res_ctx->pipe_ctx[i].stream != stream) + if (res_ctx->pipe_ctx[i].stream != stream || !tg) continue;
return tg->funcs->get_frame_count(tg); @@ -621,7 +621,7 @@ bool dc_stream_get_scanoutpos(const struct dc_stream_state *stream, for (i = 0; i < MAX_PIPES; i++) { struct timing_generator *tg = res_ctx->pipe_ctx[i].stream_res.tg;
- if (res_ctx->pipe_ctx[i].stream != stream) + if (res_ctx->pipe_ctx[i].stream != stream || !tg) continue;
tg->funcs->get_scanoutpos(tg,
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Douglas Anderson dianders@chromium.org
[ Upstream commit dd712d3d45807db9fcae28a522deee85c1f2fde6 ]
When entering kdb/kgdb on a kernel panic, it was be observed that the console isn't flushed before the `kdb` prompt came up. Specifically, when using the buddy lockup detector on arm64 and running: echo HARDLOCKUP > /sys/kernel/debug/provoke-crash/DIRECT
I could see: [ 26.161099] lkdtm: Performing direct entry HARDLOCKUP [ 32.499881] watchdog: Watchdog detected hard LOCKUP on cpu 6 [ 32.552865] Sending NMI from CPU 5 to CPUs 6: [ 32.557359] NMI backtrace for cpu 6 ... [backtrace for cpu 6] ... [ 32.558353] NMI backtrace for cpu 5 ... [backtrace for cpu 5] ... [ 32.867471] Sending NMI from CPU 5 to CPUs 0-4,7: [ 32.872321] NMI backtrace forP cpuANC: Hard LOCKUP
Entering kdb (current=..., pid 0) on processor 5 due to Keyboard Entry [5]kdb>
As you can see, backtraces for the other CPUs start printing and get interleaved with the kdb PANIC print.
Let's replicate the commands to flush the console in the kdb panic entry point to avoid this.
Signed-off-by: Douglas Anderson dianders@chromium.org Link: https://lore.kernel.org/r/20230822131945.1.I5b460ae8f954e4c4f628a373d6e74713... Signed-off-by: Daniel Thompson daniel.thompson@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/debug/debug_core.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/kernel/debug/debug_core.c b/kernel/debug/debug_core.c index 7beceb447211d..f40ca4f09afce 100644 --- a/kernel/debug/debug_core.c +++ b/kernel/debug/debug_core.c @@ -1018,6 +1018,9 @@ void kgdb_panic(const char *msg) if (panic_timeout) return;
+ debug_locks_off(); + console_flush_on_panic(CONSOLE_FLUSH_PENDING); + if (dbg_kdb_mode) kdb_printf("PANIC: %s\n", msg);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Philipp Stanner pstanner@redhat.com
[ Upstream commit cc9c54232f04aef3a5d7f64a0ece7df00f1aaa3d ]
i2c-dev.c utilizes memdup_user() to copy a userspace array. This is done without an overflow check.
Use the new wrapper memdup_array_user() to copy the array more safely.
Suggested-by: Dave Airlie airlied@redhat.com Signed-off-by: Philipp Stanner pstanner@redhat.com Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/i2c-dev.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/i2c/i2c-dev.c b/drivers/i2c/i2c-dev.c index 6fd2b6718b086..9fefceb3a95d4 100644 --- a/drivers/i2c/i2c-dev.c +++ b/drivers/i2c/i2c-dev.c @@ -450,8 +450,8 @@ static long i2cdev_ioctl(struct file *file, unsigned int cmd, unsigned long arg) if (rdwr_arg.nmsgs > I2C_RDWR_IOCTL_MAX_MSGS) return -EINVAL;
- rdwr_pa = memdup_user(rdwr_arg.msgs, - rdwr_arg.nmsgs * sizeof(struct i2c_msg)); + rdwr_pa = memdup_array_user(rdwr_arg.msgs, + rdwr_arg.nmsgs, sizeof(struct i2c_msg)); if (IS_ERR(rdwr_pa)) return PTR_ERR(rdwr_pa);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tony Lindgren tony@atomide.com
[ Upstream commit fbb74e56378d8306f214658e3d525a8b3f000c5a ]
We need to check for an active device as otherwise we get warnings for some mcbsp instances for "Runtime PM usage count underflow!".
Reported-by: Andreas Kemnade andreas@kemnade.info Signed-off-by: Tony Lindgren tony@atomide.com Link: https://lore.kernel.org/r/20231030052340.13415-1-tony@atomide.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/ti/omap-mcbsp.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/sound/soc/ti/omap-mcbsp.c b/sound/soc/ti/omap-mcbsp.c index 4479d74f0a458..81d2be87e9739 100644 --- a/sound/soc/ti/omap-mcbsp.c +++ b/sound/soc/ti/omap-mcbsp.c @@ -74,14 +74,16 @@ static int omap2_mcbsp_set_clks_src(struct omap_mcbsp *mcbsp, u8 fck_src_id) return -EINVAL; }
- pm_runtime_put_sync(mcbsp->dev); + if (mcbsp->active) + pm_runtime_put_sync(mcbsp->dev);
r = clk_set_parent(mcbsp->fclk, fck_src); if (r) dev_err(mcbsp->dev, "CLKS: could not clk_set_parent() to %s\n", src);
- pm_runtime_get_sync(mcbsp->dev); + if (mcbsp->active) + pm_runtime_get_sync(mcbsp->dev);
clk_put(fck_src);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zongmin Zhou zhouzongmin@kylinos.cn
[ Upstream commit 0e8b9f258baed25f1c5672613699247c76b007b5 ]
The allocated memory for qdev->dumb_heads should be released in qxl_destroy_monitors_object before qxl suspend. otherwise,qxl_create_monitors_object will be called to reallocate memory for qdev->dumb_heads after qxl resume, it will cause memory leak.
Signed-off-by: Zongmin Zhou zhouzongmin@kylinos.cn Link: https://lore.kernel.org/r/20230801025309.4049813-1-zhouzongmin@kylinos.cn Reviewed-by: Dave Airlie airlied@redhat.com Signed-off-by: Maxime Ripard mripard@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/qxl/qxl_display.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/qxl/qxl_display.c b/drivers/gpu/drm/qxl/qxl_display.c index 9e0a1e8360117..dc04412784a0d 100644 --- a/drivers/gpu/drm/qxl/qxl_display.c +++ b/drivers/gpu/drm/qxl/qxl_display.c @@ -1221,6 +1221,9 @@ int qxl_destroy_monitors_object(struct qxl_device *qdev) if (!qdev->monitors_config_bo) return 0;
+ kfree(qdev->dumb_heads); + qdev->dumb_heads = NULL; + qdev->monitors_config = NULL; qdev->ram_header->monitors_config = 0;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vitaly Prosyak vitaly.prosyak@amd.com
[ Upstream commit 4638e0c29a3f2294d5de0d052a4b8c9f33ccb957 ]
When software 'pci unplug' using IGT is executed we got a sysfs directory entry is NULL for differant ras blocks like hdp, umc, etc. Before call 'sysfs_remove_file_from_group' and 'sysfs_remove_group' check that 'sd' is not NULL.
[ +0.000001] RIP: 0010:sysfs_remove_group+0x83/0x90 [ +0.000002] Code: 31 c0 31 d2 31 f6 31 ff e9 9a a8 b4 00 4c 89 e7 e8 f2 a2 ff ff eb c2 49 8b 55 00 48 8b 33 48 c7 c7 80 65 94 82 e8 cd 82 bb ff <0f> 0b eb cc 66 0f 1f 84 00 00 00 00 00 90 90 90 90 90 90 90 90 90 [ +0.000001] RSP: 0018:ffffc90002067c90 EFLAGS: 00010246 [ +0.000002] RAX: 0000000000000000 RBX: ffffffff824ea180 RCX: 0000000000000000 [ +0.000001] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000 [ +0.000001] RBP: ffffc90002067ca8 R08: 0000000000000000 R09: 0000000000000000 [ +0.000001] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [ +0.000001] R13: ffff88810a395f48 R14: ffff888101aab0d0 R15: 0000000000000000 [ +0.000001] FS: 00007f5ddaa43a00(0000) GS:ffff88841e800000(0000) knlGS:0000000000000000 [ +0.000002] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ +0.000001] CR2: 00007f8ffa61ba50 CR3: 0000000106432000 CR4: 0000000000350ef0 [ +0.000001] Call Trace: [ +0.000001] <TASK> [ +0.000001] ? show_regs+0x72/0x90 [ +0.000002] ? sysfs_remove_group+0x83/0x90 [ +0.000002] ? __warn+0x8d/0x160 [ +0.000001] ? sysfs_remove_group+0x83/0x90 [ +0.000001] ? report_bug+0x1bb/0x1d0 [ +0.000003] ? handle_bug+0x46/0x90 [ +0.000001] ? exc_invalid_op+0x19/0x80 [ +0.000002] ? asm_exc_invalid_op+0x1b/0x20 [ +0.000003] ? sysfs_remove_group+0x83/0x90 [ +0.000001] dpm_sysfs_remove+0x61/0x70 [ +0.000002] device_del+0xa3/0x3d0 [ +0.000002] ? ktime_get_mono_fast_ns+0x46/0xb0 [ +0.000002] device_unregister+0x18/0x70 [ +0.000001] i2c_del_adapter+0x26d/0x330 [ +0.000002] arcturus_i2c_control_fini+0x25/0x50 [amdgpu] [ +0.000236] smu_sw_fini+0x38/0x260 [amdgpu] [ +0.000241] amdgpu_device_fini_sw+0x116/0x670 [amdgpu] [ +0.000186] ? mutex_lock+0x13/0x50 [ +0.000003] amdgpu_driver_release_kms+0x16/0x40 [amdgpu] [ +0.000192] drm_minor_release+0x4f/0x80 [drm] [ +0.000025] drm_release+0xfe/0x150 [drm] [ +0.000027] __fput+0x9f/0x290 [ +0.000002] ____fput+0xe/0x20 [ +0.000002] task_work_run+0x61/0xa0 [ +0.000002] exit_to_user_mode_prepare+0x150/0x170 [ +0.000002] syscall_exit_to_user_mode+0x2a/0x50
Cc: Hawking Zhang hawking.zhang@amd.com Cc: Luben Tuikov luben.tuikov@amd.com Cc: Alex Deucher alexander.deucher@amd.com Cc: Christian Koenig christian.koenig@amd.com Signed-off-by: Vitaly Prosyak vitaly.prosyak@amd.com Reviewed-by: Luben Tuikov luben.tuikov@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c index 96a8fd0ca1df3..439ea256ed252 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_ras.c @@ -1192,7 +1192,8 @@ static void amdgpu_ras_sysfs_remove_bad_page_node(struct amdgpu_device *adev) { struct amdgpu_ras *con = amdgpu_ras_get_context(adev);
- sysfs_remove_file_from_group(&adev->dev->kobj, + if (adev->dev->kobj.sd) + sysfs_remove_file_from_group(&adev->dev->kobj, &con->badpages_attr.attr, RAS_FS_NAME); } @@ -1209,7 +1210,8 @@ static int amdgpu_ras_sysfs_remove_feature_node(struct amdgpu_device *adev) .attrs = attrs, };
- sysfs_remove_group(&adev->dev->kobj, &group); + if (adev->dev->kobj.sd) + sysfs_remove_group(&adev->dev->kobj, &group);
return 0; } @@ -1257,7 +1259,8 @@ int amdgpu_ras_sysfs_remove(struct amdgpu_device *adev, if (!obj || !obj->attr_inuse) return -EINVAL;
- sysfs_remove_file_from_group(&adev->dev->kobj, + if (adev->dev->kobj.sd) + sysfs_remove_file_from_group(&adev->dev->kobj, &obj->sysfs_attr.attr, RAS_FS_NAME); obj->attr_inuse = 0;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@linaro.org
[ Upstream commit d27abbfd4888d79dd24baf50e774631046ac4732 ]
These enums are passed to set/test_bit(). The set/test_bit() functions take a bit number instead of a shifted value. Passing a shifted value is a double shift bug like doing BIT(BIT(1)). The double shift bug doesn't cause a problem here because we are only checking 0 and 1 but if the value was 5 or above then it can lead to a buffer overflow.
Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Reviewed-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Reviewed-by: Sam Protsenko semen.protsenko@linaro.org Signed-off-by: Thierry Reding thierry.reding@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/pwm.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/linux/pwm.h b/include/linux/pwm.h index c7bfa64aeb142..03c42e742dfe7 100644 --- a/include/linux/pwm.h +++ b/include/linux/pwm.h @@ -44,8 +44,8 @@ struct pwm_args { };
enum { - PWMF_REQUESTED = 1 << 0, - PWMF_EXPORTED = 1 << 1, + PWMF_REQUESTED = 0, + PWMF_EXPORTED = 1, };
/*
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miri Korenblit miriam.rachel.korenblit@intel.com
[ Upstream commit 499d02790495958506a64f37ceda7e97345a50a8 ]
Currently we are setting the rate in the tx cmd for mgmt frames (e.g. during connection establishment). This was problematic when sending mgmt frames in eSR mode, as we don't know what link this frame will be sent on (This is decided by the FW), so we don't know what is the lowest rate. Fix this by not setting the rate in tx cmd and rely on FW to choose the right one. Set rate only for injected frames with fixed rate, or when no sta is given. Also set for important frames (EAPOL etc.) the High Priority flag.
Fixes: 055b22e770dd ("iwlwifi: mvm: Set Tx rate and flags when there is not station") Signed-off-by: Miri Korenblit miriam.rachel.korenblit@intel.com Signed-off-by: Gregory Greenman gregory.greenman@intel.com Link: https://lore.kernel.org/r/20230913145231.6c7e59620ee0.I6eaed3ccdd6dd62b9e664... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/intel/iwlwifi/mvm/tx.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/tx.c b/drivers/net/wireless/intel/iwlwifi/mvm/tx.c index 4375da00f7cf0..08dd227bad4b1 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/tx.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/tx.c @@ -479,16 +479,20 @@ iwl_mvm_set_tx_params(struct iwl_mvm *mvm, struct sk_buff *skb, flags |= IWL_TX_FLAGS_ENCRYPT_DIS;
/* - * For data packets rate info comes from the fw. Only - * set rate/antenna during connection establishment or in case - * no station is given. + * For data and mgmt packets rate info comes from the fw. Only + * set rate/antenna for injected frames with fixed rate, or + * when no sta is given. */ - if (!sta || !ieee80211_is_data(hdr->frame_control) || - mvmsta->sta_state < IEEE80211_STA_AUTHORIZED) { + if (unlikely(!sta || + info->control.flags & IEEE80211_TX_CTRL_RATE_INJECT)) { flags |= IWL_TX_FLAGS_CMD_RATE; rate_n_flags = iwl_mvm_get_tx_rate_n_flags(mvm, info, sta, hdr->frame_control); + } else if (!ieee80211_is_data(hdr->frame_control) || + mvmsta->sta_state < IEEE80211_STA_AUTHORIZED) { + /* These are important frames */ + flags |= IWL_TX_FLAGS_HIGH_PRI; }
if (mvm->trans->trans_cfg->device_family >=
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt (VMware) rostedt@goodmis.org
[ Upstream commit 9b84fadc444de5456ab5f5487e2108311c724c3f ]
Instead of having branches that adds noise to the branch prediction, use the addition logic to set the bit for the level of interrupt context that the state is currently in. This copies the logic from perf's get_recursion_context() function.
Link: https://lore.kernel.org/all/20211015161702.GF174703@worktop.programming.kick...
Suggested-by: Peter Zijlstra peterz@infradead.org Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Stable-dep-of: 87c3a5893e86 ("sched/core: Optimize in_task() and in_interrupt() a bit") Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/trace_recursion.h | 11 ++++++----- kernel/trace/ring_buffer.c | 12 ++++++------ 2 files changed, 12 insertions(+), 11 deletions(-)
diff --git a/include/linux/trace_recursion.h b/include/linux/trace_recursion.h index fe95f09225266..00acd7dca7a7d 100644 --- a/include/linux/trace_recursion.h +++ b/include/linux/trace_recursion.h @@ -117,12 +117,13 @@ enum { static __always_inline int trace_get_context_bit(void) { unsigned long pc = preempt_count(); + unsigned char bit = 0;
- if (!(pc & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET))) - return TRACE_CTX_NORMAL; - else - return pc & NMI_MASK ? TRACE_CTX_NMI : - pc & HARDIRQ_MASK ? TRACE_CTX_IRQ : TRACE_CTX_SOFTIRQ; + bit += !!(pc & (NMI_MASK)); + bit += !!(pc & (NMI_MASK | HARDIRQ_MASK)); + bit += !!(pc & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET)); + + return TRACE_CTX_NORMAL - bit; }
#ifdef CONFIG_FTRACE_RECORD_RECURSION diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c index e5dc7b5a261c6..c3c9960c9f27b 100644 --- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -3250,13 +3250,13 @@ trace_recursive_lock(struct ring_buffer_per_cpu *cpu_buffer) { unsigned int val = cpu_buffer->current_context; unsigned long pc = preempt_count(); - int bit; + int bit = 0;
- if (!(pc & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET))) - bit = RB_CTX_NORMAL; - else - bit = pc & NMI_MASK ? RB_CTX_NMI : - pc & HARDIRQ_MASK ? RB_CTX_IRQ : RB_CTX_SOFTIRQ; + bit += !!(pc & (NMI_MASK)); + bit += !!(pc & (NMI_MASK | HARDIRQ_MASK)); + bit += !!(pc & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET)); + + bit = RB_CTX_NORMAL - bit;
if (unlikely(val & (1 << (bit + cpu_buffer->nest)))) { /*
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt (VMware) rostedt@goodmis.org
[ Upstream commit 91ebe8bcbff9d2ff21303e73bf7434f39a98b255 ]
Now that there are three different instances of doing the addition trick to the preempt_count() and NMI_MASK, HARDIRQ_MASK and SOFTIRQ_OFFSET macros, it deserves a helper function defined in the preempt.h header.
Add the interrupt_context_level() helper and replace the three instances that do that logic with it.
Link: https://lore.kernel.org/all/20211015142541.4badd8a9@gandalf.local.home/
Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Stable-dep-of: 87c3a5893e86 ("sched/core: Optimize in_task() and in_interrupt() a bit") Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/preempt.h | 21 +++++++++++++++++++++ include/linux/trace_recursion.h | 7 +------ kernel/events/internal.h | 7 +------ kernel/trace/ring_buffer.c | 7 +------ 4 files changed, 24 insertions(+), 18 deletions(-)
diff --git a/include/linux/preempt.h b/include/linux/preempt.h index 4d244e295e855..b32e3dabe28bd 100644 --- a/include/linux/preempt.h +++ b/include/linux/preempt.h @@ -77,6 +77,27 @@ /* preempt_count() and related functions, depends on PREEMPT_NEED_RESCHED */ #include <asm/preempt.h>
+/** + * interrupt_context_level - return interrupt context level + * + * Returns the current interrupt context level. + * 0 - normal context + * 1 - softirq context + * 2 - hardirq context + * 3 - NMI context + */ +static __always_inline unsigned char interrupt_context_level(void) +{ + unsigned long pc = preempt_count(); + unsigned char level = 0; + + level += !!(pc & (NMI_MASK)); + level += !!(pc & (NMI_MASK | HARDIRQ_MASK)); + level += !!(pc & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET)); + + return level; +} + #define nmi_count() (preempt_count() & NMI_MASK) #define hardirq_count() (preempt_count() & HARDIRQ_MASK) #ifdef CONFIG_PREEMPT_RT diff --git a/include/linux/trace_recursion.h b/include/linux/trace_recursion.h index 00acd7dca7a7d..816d7a0d2aad6 100644 --- a/include/linux/trace_recursion.h +++ b/include/linux/trace_recursion.h @@ -116,12 +116,7 @@ enum {
static __always_inline int trace_get_context_bit(void) { - unsigned long pc = preempt_count(); - unsigned char bit = 0; - - bit += !!(pc & (NMI_MASK)); - bit += !!(pc & (NMI_MASK | HARDIRQ_MASK)); - bit += !!(pc & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET)); + unsigned char bit = interrupt_context_level();
return TRACE_CTX_NORMAL - bit; } diff --git a/kernel/events/internal.h b/kernel/events/internal.h index aa23ffdaf819f..5150d5f84c033 100644 --- a/kernel/events/internal.h +++ b/kernel/events/internal.h @@ -210,12 +210,7 @@ DEFINE_OUTPUT_COPY(__output_copy_user, arch_perf_out_copy_user)
static inline int get_recursion_context(int *recursion) { - unsigned int pc = preempt_count(); - unsigned char rctx = 0; - - rctx += !!(pc & (NMI_MASK)); - rctx += !!(pc & (NMI_MASK | HARDIRQ_MASK)); - rctx += !!(pc & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET)); + unsigned char rctx = interrupt_context_level();
if (recursion[rctx]) return -1; diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c index c3c9960c9f27b..a930a9d7d834d 100644 --- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -3249,12 +3249,7 @@ static __always_inline int trace_recursive_lock(struct ring_buffer_per_cpu *cpu_buffer) { unsigned int val = cpu_buffer->current_context; - unsigned long pc = preempt_count(); - int bit = 0; - - bit += !!(pc & (NMI_MASK)); - bit += !!(pc & (NMI_MASK | HARDIRQ_MASK)); - bit += !!(pc & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET)); + int bit = interrupt_context_level();
bit = RB_CTX_NORMAL - bit;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Finn Thain fthain@linux-m68k.org
[ Upstream commit 87c3a5893e865739ce78aa7192d36011022e0af7 ]
Except on x86, preempt_count is always accessed with READ_ONCE(). Repeated invocations in macros like irq_count() produce repeated loads. These redundant instructions appear in various fast paths. In the one shown below, for example, irq_count() is evaluated during kernel entry if !tick_nohz_full_cpu(smp_processor_id()).
0001ed0a <irq_enter_rcu>: 1ed0a: 4e56 0000 linkw %fp,#0 1ed0e: 200f movel %sp,%d0 1ed10: 0280 ffff e000 andil #-8192,%d0 1ed16: 2040 moveal %d0,%a0 1ed18: 2028 0008 movel %a0@(8),%d0 1ed1c: 0680 0001 0000 addil #65536,%d0 1ed22: 2140 0008 movel %d0,%a0@(8) 1ed26: 082a 0001 000f btst #1,%a2@(15) 1ed2c: 670c beqs 1ed3a <irq_enter_rcu+0x30> 1ed2e: 2028 0008 movel %a0@(8),%d0 1ed32: 2028 0008 movel %a0@(8),%d0 1ed36: 2028 0008 movel %a0@(8),%d0 1ed3a: 4e5e unlk %fp 1ed3c: 4e75 rts
This patch doesn't prevent the pointless btst and beqs instructions above, but it does eliminate 2 of the 3 pointless move instructions here and elsewhere.
On x86, preempt_count is per-cpu data and the problem does not arise presumably because the compiler is free to optimize more effectively.
This patch was tested on m68k and x86. I was expecting no changes to object code for x86 and mostly that's what I saw. However, there were a few places where code generation was perturbed for some reason.
The performance issue addressed here is minor on uniprocessor m68k. I got a 0.01% improvement from this patch for a simple "find /sys -false" benchmark. For architectures and workloads susceptible to cache line bounce the improvement is expected to be larger. The only SMP architecture I have is x86, and as x86 unaffected I have not done any further measurements.
Fixes: 15115830c887 ("preempt: Cleanup the macro maze a bit") Signed-off-by: Finn Thain fthain@linux-m68k.org Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lore.kernel.org/r/0a403120a682a525e6db2d81d1a3ffcc137c3742.169475683... Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/preempt.h | 15 +++++++++++++-- 1 file changed, 13 insertions(+), 2 deletions(-)
diff --git a/include/linux/preempt.h b/include/linux/preempt.h index b32e3dabe28bd..9c4534a69a8f7 100644 --- a/include/linux/preempt.h +++ b/include/linux/preempt.h @@ -98,14 +98,21 @@ static __always_inline unsigned char interrupt_context_level(void) return level; }
+/* + * These macro definitions avoid redundant invocations of preempt_count() + * because such invocations would result in redundant loads given that + * preempt_count() is commonly implemented with READ_ONCE(). + */ + #define nmi_count() (preempt_count() & NMI_MASK) #define hardirq_count() (preempt_count() & HARDIRQ_MASK) #ifdef CONFIG_PREEMPT_RT # define softirq_count() (current->softirq_disable_cnt & SOFTIRQ_MASK) +# define irq_count() ((preempt_count() & (NMI_MASK | HARDIRQ_MASK)) | softirq_count()) #else # define softirq_count() (preempt_count() & SOFTIRQ_MASK) +# define irq_count() (preempt_count() & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_MASK)) #endif -#define irq_count() (nmi_count() | hardirq_count() | softirq_count())
/* * Macros to retrieve the current execution context: @@ -118,7 +125,11 @@ static __always_inline unsigned char interrupt_context_level(void) #define in_nmi() (nmi_count()) #define in_hardirq() (hardirq_count()) #define in_serving_softirq() (softirq_count() & SOFTIRQ_OFFSET) -#define in_task() (!(in_nmi() | in_hardirq() | in_serving_softirq())) +#ifdef CONFIG_PREEMPT_RT +# define in_task() (!((preempt_count() & (NMI_MASK | HARDIRQ_MASK)) | in_serving_softirq())) +#else +# define in_task() (!(preempt_count() & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET))) +#endif
/* * The following macros are deprecated and should not be used in new code:
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pratyush Yadav p.yadav@ti.com
[ Upstream commit b2701715301a49b53d05c7d43f3fedc3b8743bfc ]
The notifier is added to the global notifier list when registered. When the module is removed, the struct csi2rx_priv in which the notifier is embedded, is destroyed. As a result the notifier list has a reference to a notifier that no longer exists. This causes invalid memory accesses when the list is iterated over. Similar for when the probe fails. Unregister and clean up the notifier to avoid this.
Fixes: 1fc3b37f34f6 ("media: v4l: cadence: Add Cadence MIPI-CSI2 RX driver")
Signed-off-by: Pratyush Yadav p.yadav@ti.com Tested-by: Julien Massot julien.massot@collabora.com Reviewed-by: Laurent Pinchart laurent.pinchart@ideasonboard.com Reviewed-by: Tomi Valkeinen tomi.valkeinen@ideasonboard.com Reviewed-by: Maxime Ripard mripard@kernel.org Signed-off-by: Jai Luthra j-luthra@ti.com Signed-off-by: Sakari Ailus sakari.ailus@linux.intel.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/platform/cadence/cdns-csi2rx.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
--- a/drivers/media/platform/cadence/cdns-csi2rx.c +++ b/drivers/media/platform/cadence/cdns-csi2rx.c @@ -407,8 +407,10 @@ static int csi2rx_parse_dt(struct csi2rx fwh, struct v4l2_async_subdev); of_node_put(ep); - if (IS_ERR(asd)) + if (IS_ERR(asd)) { + v4l2_async_notifier_cleanup(&csi2rx->notifier); return PTR_ERR(asd); + }
csi2rx->notifier.ops = &csi2rx_notifier_ops;
@@ -471,6 +473,7 @@ static int csi2rx_probe(struct platform_ return 0;
err_cleanup: + v4l2_async_notifier_unregister(&csi2rx->notifier); v4l2_async_notifier_cleanup(&csi2rx->notifier); err_free_priv: kfree(csi2rx); @@ -481,6 +484,8 @@ static int csi2rx_remove(struct platform { struct csi2rx_priv *csi2rx = platform_get_drvdata(pdev);
+ v4l2_async_notifier_unregister(&csi2rx->notifier); + v4l2_async_notifier_cleanup(&csi2rx->notifier); v4l2_async_unregister_subdev(&csi2rx->subdev); kfree(csi2rx);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marek Szyprowski m.szyprowski@samsung.com
[ Upstream commit 94e27fbeca27d8c772fc2bc807730aaee5886055 ]
'meson' directory contains two separate drivers, so it should be added to Makefile compilation hierarchy unconditionally, because otherwise the meson-ao-cec-g12a won't be compiled if meson-ao-cec is not selected.
Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com Fixes: 4be5e8648b0c ("media: move CEC platform drivers to a separate directory") Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/cec/platform/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/media/cec/platform/Makefile b/drivers/media/cec/platform/Makefile index ea6f8ee8161c9..e5e441faa0baa 100644 --- a/drivers/media/cec/platform/Makefile +++ b/drivers/media/cec/platform/Makefile @@ -6,7 +6,7 @@ # Please keep it in alphabetic order obj-$(CONFIG_CEC_CROS_EC) += cros-ec/ obj-$(CONFIG_CEC_GPIO) += cec-gpio/ -obj-$(CONFIG_CEC_MESON_AO) += meson/ +obj-y += meson/ obj-$(CONFIG_CEC_SAMSUNG_S5P) += s5p/ obj-$(CONFIG_CEC_SECO) += seco/ obj-$(CONFIG_CEC_STI) += sti/
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Trond Myklebust trond.myklebust@hammerspace.com
[ Upstream commit 4b09ca1508a60be30b2e3940264e93d7aeb5c97e ]
If connect() is returning ECONNRESET, it usually means that nothing is listening on that port. If so, a rebind might be required in order to obtain the new port on which the RPC service is listening.
Fixes: fd01b2597941 ("SUNRPC: ECONNREFUSED should cause a rebind.") Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/sunrpc/clnt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index 7756c62e0c3ed..fc37f314a09dd 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -2088,6 +2088,7 @@ call_connect_status(struct rpc_task *task) task->tk_status = 0; switch (status) { case -ECONNREFUSED: + case -ECONNRESET: /* A positive refusal suggests a rebind is needed. */ if (RPC_IS_SOFTCONN(task)) break; @@ -2096,7 +2097,6 @@ call_connect_status(struct rpc_task *task) goto out_retry; } fallthrough; - case -ECONNRESET: case -ECONNABORTED: case -ENETDOWN: case -ENETUNREACH:
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marc Zyngier maz@kernel.org
[ Upstream commit 6c846d026d490b2383d395bc8e7b06336219667b ]
In order to move away from gpiolib messing with the internals of unsuspecting irqchips, add a flag by which irqchips advertise that they are not to be messed with, and do solemnly swear that they correctly call into the gpiolib helpers when required.
Also nudge the users into converting their drivers to the new model.
Reviewed-by: Andy Shevchenko andy.shevchenko@gmail.com Reviewed-by: Bartosz Golaszewski brgl@bgdev.pl Signed-off-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20220419141846.598305-2-maz@kernel.org Stable-dep-of: dc3115e6c5d9 ("hid: cp2112: Fix IRQ shutdown stopping polling for all IRQs on chip") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpio/gpiolib.c | 7 ++++++- include/linux/irq.h | 2 ++ kernel/irq/debugfs.c | 1 + 3 files changed, 9 insertions(+), 1 deletion(-)
diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c index f9fdd117c654c..e572c30a202ad 100644 --- a/drivers/gpio/gpiolib.c +++ b/drivers/gpio/gpiolib.c @@ -1483,6 +1483,11 @@ static void gpiochip_set_irq_hooks(struct gpio_chip *gc) { struct irq_chip *irqchip = gc->irq.chip;
+ if (irqchip->flags & IRQCHIP_IMMUTABLE) + return; + + chip_warn(gc, "not an immutable chip, please consider fixing it!\n"); + if (!irqchip->irq_request_resources && !irqchip->irq_release_resources) { irqchip->irq_request_resources = gpiochip_irq_reqres; @@ -1650,7 +1655,7 @@ static void gpiochip_irqchip_remove(struct gpio_chip *gc) irq_domain_remove(gc->irq.domain); }
- if (irqchip) { + if (irqchip && !(irqchip->flags & IRQCHIP_IMMUTABLE)) { if (irqchip->irq_request_resources == gpiochip_irq_reqres) { irqchip->irq_request_resources = NULL; irqchip->irq_release_resources = NULL; diff --git a/include/linux/irq.h b/include/linux/irq.h index f9e6449fbbbae..296ef3b7d7afa 100644 --- a/include/linux/irq.h +++ b/include/linux/irq.h @@ -570,6 +570,7 @@ struct irq_chip { * IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND: Invokes __enable_irq()/__disable_irq() for wake irqs * in the suspend path if they are in disabled state * IRQCHIP_AFFINITY_PRE_STARTUP: Default affinity update before startup + * IRQCHIP_IMMUTABLE: Don't ever change anything in this chip */ enum { IRQCHIP_SET_TYPE_MASKED = (1 << 0), @@ -583,6 +584,7 @@ enum { IRQCHIP_SUPPORTS_NMI = (1 << 8), IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND = (1 << 9), IRQCHIP_AFFINITY_PRE_STARTUP = (1 << 10), + IRQCHIP_IMMUTABLE = (1 << 11), };
#include <linux/irqdesc.h> diff --git a/kernel/irq/debugfs.c b/kernel/irq/debugfs.c index e4cff358b437e..7ff52d94b42c0 100644 --- a/kernel/irq/debugfs.c +++ b/kernel/irq/debugfs.c @@ -58,6 +58,7 @@ static const struct irq_bit_descr irqchip_flags[] = { BIT_MASK_DESCR(IRQCHIP_SUPPORTS_LEVEL_MSI), BIT_MASK_DESCR(IRQCHIP_SUPPORTS_NMI), BIT_MASK_DESCR(IRQCHIP_ENABLE_WAKEUP_ON_SUSPEND), + BIT_MASK_DESCR(IRQCHIP_IMMUTABLE), };
static void
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marc Zyngier maz@kernel.org
[ Upstream commit 704f08753b6dcd0e08c1953af0b2c7f3fac87111 ]
The GPIO subsystem has a couple of internal helpers to manage resources on behalf of the irqchip. Expose them so that GPIO drivers can use them directly.
Reviewed-by: Andy Shevchenko andy.shevchenko@gmail.com Reviewed-by: Bartosz Golaszewski brgl@bgdev.pl Signed-off-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20220419141846.598305-3-maz@kernel.org Stable-dep-of: dc3115e6c5d9 ("hid: cp2112: Fix IRQ shutdown stopping polling for all IRQs on chip") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpio/gpiolib.c | 6 ++++-- include/linux/gpio/driver.h | 4 ++++ 2 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/gpio/gpiolib.c b/drivers/gpio/gpiolib.c index e572c30a202ad..57e726d65904b 100644 --- a/drivers/gpio/gpiolib.c +++ b/drivers/gpio/gpiolib.c @@ -1431,19 +1431,21 @@ static int gpiochip_to_irq(struct gpio_chip *gc, unsigned int offset) return irq_create_mapping(domain, offset); }
-static int gpiochip_irq_reqres(struct irq_data *d) +int gpiochip_irq_reqres(struct irq_data *d) { struct gpio_chip *gc = irq_data_get_irq_chip_data(d);
return gpiochip_reqres_irq(gc, d->hwirq); } +EXPORT_SYMBOL(gpiochip_irq_reqres);
-static void gpiochip_irq_relres(struct irq_data *d) +void gpiochip_irq_relres(struct irq_data *d) { struct gpio_chip *gc = irq_data_get_irq_chip_data(d);
gpiochip_relres_irq(gc, d->hwirq); } +EXPORT_SYMBOL(gpiochip_irq_relres);
static void gpiochip_irq_mask(struct irq_data *d) { diff --git a/include/linux/gpio/driver.h b/include/linux/gpio/driver.h index 65df2ce96f0b1..b241fc23ff3a2 100644 --- a/include/linux/gpio/driver.h +++ b/include/linux/gpio/driver.h @@ -595,6 +595,10 @@ void gpiochip_relres_irq(struct gpio_chip *gc, unsigned int offset); void gpiochip_disable_irq(struct gpio_chip *gc, unsigned int offset); void gpiochip_enable_irq(struct gpio_chip *gc, unsigned int offset);
+/* irq_data versions of the above */ +int gpiochip_irq_reqres(struct irq_data *data); +void gpiochip_irq_relres(struct irq_data *data); + /* Line status inquiry for drivers */ bool gpiochip_line_is_open_drain(struct gpio_chip *gc, unsigned int offset); bool gpiochip_line_is_open_source(struct gpio_chip *gc, unsigned int offset);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marc Zyngier maz@kernel.org
[ Upstream commit 36b78aae4bfee749bbde73be570796bfd0f56bec ]
Add a couple of new helpers to make it slightly simpler to convert drivers to immutable irq_chip structures:
- GPIOCHIP_IRQ_RESOURCE_HELPERS populates the irq_chip structure with the resource management callbacks
- gpio_irq_chip_set_chip() populates the gpio_irq_chip.chip structure, avoiding the proliferation of ugly casts
Reviewed-by: Andy Shevchenko andy.shevchenko@gmail.com Reviewed-by: Bartosz Golaszewski brgl@bgdev.pl Signed-off-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20220419141846.598305-4-maz@kernel.org Stable-dep-of: dc3115e6c5d9 ("hid: cp2112: Fix IRQ shutdown stopping polling for all IRQs on chip") Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/gpio/driver.h | 12 ++++++++++++ 1 file changed, 12 insertions(+)
diff --git a/include/linux/gpio/driver.h b/include/linux/gpio/driver.h index b241fc23ff3a2..91f60d1e3eb31 100644 --- a/include/linux/gpio/driver.h +++ b/include/linux/gpio/driver.h @@ -599,6 +599,18 @@ void gpiochip_enable_irq(struct gpio_chip *gc, unsigned int offset); int gpiochip_irq_reqres(struct irq_data *data); void gpiochip_irq_relres(struct irq_data *data);
+/* Paste this in your irq_chip structure */ +#define GPIOCHIP_IRQ_RESOURCE_HELPERS \ + .irq_request_resources = gpiochip_irq_reqres, \ + .irq_release_resources = gpiochip_irq_relres + +static inline void gpio_irq_chip_set_chip(struct gpio_irq_chip *girq, + const struct irq_chip *chip) +{ + /* Yes, dropping const is ugly, but it isn't like we have a choice */ + girq->chip = (struct irq_chip *)chip; +} + /* Line status inquiry for drivers */ bool gpiochip_line_is_open_drain(struct gpio_chip *gc, unsigned int offset); bool gpiochip_line_is_open_source(struct gpio_chip *gc, unsigned int offset);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@linaro.org
[ Upstream commit 4f3ed837186fc0d2722ba8d2457a594322e9c2ef ]
This IS_ERR() check was deleted during in a cleanup because, at the time, the rpcb_call_async() function could not return an error pointer. That changed in commit 25cf32ad5dba ("SUNRPC: Handle allocation failure in rpc_new_task()") and now it can return an error pointer. Put the check back.
A related revert was done in commit 13bd90141804 ("Revert "SUNRPC: Remove unreachable error condition"").
Fixes: 037e910b52b0 ("SUNRPC: Remove unreachable error condition in rpcb_getport_async()") Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/sunrpc/rpcb_clnt.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/net/sunrpc/rpcb_clnt.c b/net/sunrpc/rpcb_clnt.c index 647b323cc1d56..638b14f28101e 100644 --- a/net/sunrpc/rpcb_clnt.c +++ b/net/sunrpc/rpcb_clnt.c @@ -746,6 +746,10 @@ void rpcb_getport_async(struct rpc_task *task)
child = rpcb_call_async(rpcb_clnt, map, proc); rpc_release_client(rpcb_clnt); + if (IS_ERR(child)) { + /* rpcb_map_release() has freed the arguments */ + return; + }
xprt->stat.bind_count++; rpc_put_task(child);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Olga Kornievskaia kolga@netapp.com
[ Upstream commit 5cc7688bae7f0757c39c1d3dfdd827b724061067 ]
If the client is doing pnfs IO and Kerberos is configured and EXCHANGEID successfully negotiated SP4_MACH_CRED and WRITE/COMMIT are on the list of state protected operations, then we need to make sure to choose the DS's rpc_client structure instead of the MDS's one.
Fixes: fb91fb0ee7b2 ("NFS: Move call to nfs4_state_protect_write() to nfs4_write_setup()") Signed-off-by: Olga Kornievskaia kolga@netapp.com Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfs/nfs4proc.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c index 565d11a21f5e2..d65af9a60c35c 100644 --- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -5608,7 +5608,7 @@ static void nfs4_proc_write_setup(struct nfs_pgio_header *hdr,
msg->rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_WRITE]; nfs4_init_sequence(&hdr->args.seq_args, &hdr->res.seq_res, 0, 0); - nfs4_state_protect_write(server->nfs_client, clnt, msg, hdr); + nfs4_state_protect_write(hdr->ds_clp ? hdr->ds_clp : server->nfs_client, clnt, msg, hdr); }
static void nfs4_proc_commit_rpc_prepare(struct rpc_task *task, struct nfs_commit_data *data) @@ -5649,7 +5649,8 @@ static void nfs4_proc_commit_setup(struct nfs_commit_data *data, struct rpc_mess data->res.server = server; msg->rpc_proc = &nfs4_procedures[NFSPROC4_CLNT_COMMIT]; nfs4_init_sequence(&data->args.seq_args, &data->res.seq_res, 1, 0); - nfs4_state_protect(server->nfs_client, NFS_SP4_MACH_CRED_COMMIT, clnt, msg); + nfs4_state_protect(data->ds_clp ? data->ds_clp : server->nfs_client, + NFS_SP4_MACH_CRED_COMMIT, clnt, msg); }
static int _nfs4_proc_commit(struct file *dst, struct nfs_commitargs *args,
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: felix fuzhen5@huawei.com
[ Upstream commit bfca5fb4e97c46503ddfc582335917b0cc228264 ]
RPC client pipefs dentries cleanup is in separated rpc_remove_pipedir() workqueue,which takes care about pipefs superblock locking. In some special scenarios, when kernel frees the pipefs sb of the current client and immediately alloctes a new pipefs sb, rpc_remove_pipedir function would misjudge the existence of pipefs sb which is not the one it used to hold. As a result, the rpc_remove_pipedir would clean the released freed pipefs dentries.
To fix this issue, rpc_remove_pipedir should check whether the current pipefs sb is consistent with the original pipefs sb.
This error can be catched by KASAN: ========================================================= [ 250.497700] BUG: KASAN: slab-use-after-free in dget_parent+0x195/0x200 [ 250.498315] Read of size 4 at addr ffff88800a2ab804 by task kworker/0:18/106503 [ 250.500549] Workqueue: events rpc_free_client_work [ 250.501001] Call Trace: [ 250.502880] kasan_report+0xb6/0xf0 [ 250.503209] ? dget_parent+0x195/0x200 [ 250.503561] dget_parent+0x195/0x200 [ 250.503897] ? __pfx_rpc_clntdir_depopulate+0x10/0x10 [ 250.504384] rpc_rmdir_depopulate+0x1b/0x90 [ 250.504781] rpc_remove_client_dir+0xf5/0x150 [ 250.505195] rpc_free_client_work+0xe4/0x230 [ 250.505598] process_one_work+0x8ee/0x13b0 ... [ 22.039056] Allocated by task 244: [ 22.039390] kasan_save_stack+0x22/0x50 [ 22.039758] kasan_set_track+0x25/0x30 [ 22.040109] __kasan_slab_alloc+0x59/0x70 [ 22.040487] kmem_cache_alloc_lru+0xf0/0x240 [ 22.040889] __d_alloc+0x31/0x8e0 [ 22.041207] d_alloc+0x44/0x1f0 [ 22.041514] __rpc_lookup_create_exclusive+0x11c/0x140 [ 22.041987] rpc_mkdir_populate.constprop.0+0x5f/0x110 [ 22.042459] rpc_create_client_dir+0x34/0x150 [ 22.042874] rpc_setup_pipedir_sb+0x102/0x1c0 [ 22.043284] rpc_client_register+0x136/0x4e0 [ 22.043689] rpc_new_client+0x911/0x1020 [ 22.044057] rpc_create_xprt+0xcb/0x370 [ 22.044417] rpc_create+0x36b/0x6c0 ... [ 22.049524] Freed by task 0: [ 22.049803] kasan_save_stack+0x22/0x50 [ 22.050165] kasan_set_track+0x25/0x30 [ 22.050520] kasan_save_free_info+0x2b/0x50 [ 22.050921] __kasan_slab_free+0x10e/0x1a0 [ 22.051306] kmem_cache_free+0xa5/0x390 [ 22.051667] rcu_core+0x62c/0x1930 [ 22.051995] __do_softirq+0x165/0x52a [ 22.052347] [ 22.052503] Last potentially related work creation: [ 22.052952] kasan_save_stack+0x22/0x50 [ 22.053313] __kasan_record_aux_stack+0x8e/0xa0 [ 22.053739] __call_rcu_common.constprop.0+0x6b/0x8b0 [ 22.054209] dentry_free+0xb2/0x140 [ 22.054540] __dentry_kill+0x3be/0x540 [ 22.054900] shrink_dentry_list+0x199/0x510 [ 22.055293] shrink_dcache_parent+0x190/0x240 [ 22.055703] do_one_tree+0x11/0x40 [ 22.056028] shrink_dcache_for_umount+0x61/0x140 [ 22.056461] generic_shutdown_super+0x70/0x590 [ 22.056879] kill_anon_super+0x3a/0x60 [ 22.057234] rpc_kill_sb+0x121/0x200
Fixes: 0157d021d23a ("SUNRPC: handle RPC client pipefs dentries by network namespace aware routines") Signed-off-by: felix fuzhen5@huawei.com Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/sunrpc/clnt.h | 1 + net/sunrpc/clnt.c | 5 ++++- 2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h index 9fcf5ffc4f9ad..71ec22b1df860 100644 --- a/include/linux/sunrpc/clnt.h +++ b/include/linux/sunrpc/clnt.h @@ -83,6 +83,7 @@ struct rpc_clnt { }; const struct cred *cl_cred; unsigned int cl_max_connect; /* max number of transports not to the same IP */ + struct super_block *pipefs_sb; };
/* diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c index fc37f314a09dd..af1ca707c3d35 100644 --- a/net/sunrpc/clnt.c +++ b/net/sunrpc/clnt.c @@ -111,7 +111,8 @@ static void rpc_clnt_remove_pipedir(struct rpc_clnt *clnt)
pipefs_sb = rpc_get_sb_net(net); if (pipefs_sb) { - __rpc_clnt_remove_pipedir(clnt); + if (pipefs_sb == clnt->pipefs_sb) + __rpc_clnt_remove_pipedir(clnt); rpc_put_sb_net(net); } } @@ -151,6 +152,8 @@ rpc_setup_pipedir(struct super_block *pipefs_sb, struct rpc_clnt *clnt) { struct dentry *dentry;
+ clnt->pipefs_sb = pipefs_sb; + if (clnt->cl_program->pipe_dir_name != NULL) { dentry = rpc_setup_pipedir_sb(pipefs_sb, clnt); if (IS_ERR(dentry))
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andreas Gruenbacher agruenba@redhat.com
[ Upstream commit 074d7306a4fe22fcac0b53f699f92757ab1cee99 ]
Commit 0abd1557e21c added rcu_dereference() for dereferencing ip->i_gl in gfs2_permission. This now causes lockdep to complain when gfs2_permission is called in non-RCU context:
WARNING: suspicious RCU usage in gfs2_permission
Switch to rcu_dereference_check() and check for the MAY_NOT_BLOCK flag to shut up lockdep when we know that dereferencing ip->i_gl is safe.
Fixes: 0abd1557e21c ("gfs2: fix an oops in gfs2_permission") Reported-by: syzbot+3e5130844b0c0e2b4948@syzkaller.appspotmail.com Signed-off-by: Andreas Gruenbacher agruenba@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/gfs2/inode.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/fs/gfs2/inode.c b/fs/gfs2/inode.c index 682418d9c8e72..462e957eda8be 100644 --- a/fs/gfs2/inode.c +++ b/fs/gfs2/inode.c @@ -1848,6 +1848,7 @@ static const char *gfs2_get_link(struct dentry *dentry, int gfs2_permission(struct user_namespace *mnt_userns, struct inode *inode, int mask) { + int may_not_block = mask & MAY_NOT_BLOCK; struct gfs2_inode *ip; struct gfs2_holder i_gh; struct gfs2_glock *gl; @@ -1855,14 +1856,14 @@ int gfs2_permission(struct user_namespace *mnt_userns, struct inode *inode,
gfs2_holder_mark_uninitialized(&i_gh); ip = GFS2_I(inode); - gl = rcu_dereference(ip->i_gl); + gl = rcu_dereference_check(ip->i_gl, !may_not_block); if (unlikely(!gl)) { /* inode is getting torn down, must be RCU mode */ - WARN_ON_ONCE(!(mask & MAY_NOT_BLOCK)); + WARN_ON_ONCE(!may_not_block); return -ECHILD; } if (gfs2_glock_is_locked_by_me(gl) == NULL) { - if (mask & MAY_NOT_BLOCK) + if (may_not_block) return -ECHILD; error = gfs2_glock_nq_init(gl, LM_ST_SHARED, LM_FLAG_ANY, &i_gh); if (error)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Westphal fw@strlen.de
[ Upstream commit 6b9ea5c81ea2bed80dc98a38d475124a87e7ab5d ]
Raw access to cb->arg[] is deprecated, use a context structure.
Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Mat Martineau mathew.j.martineau@linux.intel.com Signed-off-by: David S. Miller davem@davemloft.net Stable-dep-of: 871019b22d1b ("net: set SOCK_RCU_FREE before inserting socket into hashtable") Signed-off-by: Sasha Levin sashal@kernel.org --- net/mptcp/mptcp_diag.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/net/mptcp/mptcp_diag.c b/net/mptcp/mptcp_diag.c index 292374fb07792..fb98b438b2c90 100644 --- a/net/mptcp/mptcp_diag.c +++ b/net/mptcp/mptcp_diag.c @@ -66,20 +66,28 @@ static int mptcp_diag_dump_one(struct netlink_callback *cb, return err; }
+struct mptcp_diag_ctx { + long s_slot; + long s_num; +}; + static void mptcp_diag_dump(struct sk_buff *skb, struct netlink_callback *cb, const struct inet_diag_req_v2 *r) { bool net_admin = netlink_net_capable(cb->skb, CAP_NET_ADMIN); + struct mptcp_diag_ctx *diag_ctx = (void *)cb->ctx; struct net *net = sock_net(skb->sk); struct inet_diag_dump_data *cb_data; struct mptcp_sock *msk; struct nlattr *bc;
+ BUILD_BUG_ON(sizeof(cb->ctx) < sizeof(*diag_ctx)); + cb_data = cb->data; bc = cb_data->inet_diag_nla_bc;
- while ((msk = mptcp_token_iter_next(net, &cb->args[0], &cb->args[1])) != - NULL) { + while ((msk = mptcp_token_iter_next(net, &diag_ctx->s_slot, + &diag_ctx->s_num)) != NULL) { struct inet_sock *inet = (struct inet_sock *)msk; struct sock *sk = (struct sock *)msk; int ret = 0; @@ -101,7 +109,7 @@ static void mptcp_diag_dump(struct sk_buff *skb, struct netlink_callback *cb, sock_put(sk); if (ret < 0) { /* will retry on the same position */ - cb->args[1]--; + diag_ctx->s_num--; break; } cond_resched();
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Westphal fw@strlen.de
[ Upstream commit 4fa39b701ce9be7ec2169f7fba4f8dc1a3b92aac ]
makes 'ss -Ml' show mptcp listen sockets.
Iterate over the tcp listen sockets and pick those that have mptcp ulp info attached.
mptcp_diag_get_info() is modified to prefer msk->first for mptcp sockets in listen state. This reports accurate number for recv and send queue (pending / max connection backlog counters).
Sample output: ss -Mil State Recv-Q Send-Q Local Address:Port Peer Address:Port LISTEN 0 20 127.0.0.1:12000 0.0.0.0:* subflows_max:2
Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Mat Martineau mathew.j.martineau@linux.intel.com Signed-off-by: David S. Miller davem@davemloft.net Stable-dep-of: 871019b22d1b ("net: set SOCK_RCU_FREE before inserting socket into hashtable") Signed-off-by: Sasha Levin sashal@kernel.org --- net/mptcp/mptcp_diag.c | 91 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 91 insertions(+)
diff --git a/net/mptcp/mptcp_diag.c b/net/mptcp/mptcp_diag.c index fb98b438b2c90..4d8625d0b179a 100644 --- a/net/mptcp/mptcp_diag.c +++ b/net/mptcp/mptcp_diag.c @@ -69,8 +69,83 @@ static int mptcp_diag_dump_one(struct netlink_callback *cb, struct mptcp_diag_ctx { long s_slot; long s_num; + unsigned int l_slot; + unsigned int l_num; };
+static void mptcp_diag_dump_listeners(struct sk_buff *skb, struct netlink_callback *cb, + const struct inet_diag_req_v2 *r, + bool net_admin) +{ + struct inet_diag_dump_data *cb_data = cb->data; + struct mptcp_diag_ctx *diag_ctx = (void *)cb->ctx; + struct nlattr *bc = cb_data->inet_diag_nla_bc; + struct net *net = sock_net(skb->sk); + int i; + + for (i = diag_ctx->l_slot; i < INET_LHTABLE_SIZE; i++) { + struct inet_listen_hashbucket *ilb; + struct hlist_nulls_node *node; + struct sock *sk; + int num = 0; + + ilb = &tcp_hashinfo.listening_hash[i]; + + rcu_read_lock(); + spin_lock(&ilb->lock); + sk_nulls_for_each(sk, node, &ilb->nulls_head) { + const struct mptcp_subflow_context *ctx = mptcp_subflow_ctx(sk); + struct inet_sock *inet = inet_sk(sk); + int ret; + + if (num < diag_ctx->l_num) + goto next_listen; + + if (!ctx || strcmp(inet_csk(sk)->icsk_ulp_ops->name, "mptcp")) + goto next_listen; + + sk = ctx->conn; + if (!sk || !net_eq(sock_net(sk), net)) + goto next_listen; + + if (r->sdiag_family != AF_UNSPEC && + sk->sk_family != r->sdiag_family) + goto next_listen; + + if (r->id.idiag_sport != inet->inet_sport && + r->id.idiag_sport) + goto next_listen; + + if (!refcount_inc_not_zero(&sk->sk_refcnt)) + goto next_listen; + + ret = sk_diag_dump(sk, skb, cb, r, bc, net_admin); + + sock_put(sk); + + if (ret < 0) { + spin_unlock(&ilb->lock); + rcu_read_unlock(); + diag_ctx->l_slot = i; + diag_ctx->l_num = num; + return; + } + diag_ctx->l_num = num + 1; + num = 0; +next_listen: + ++num; + } + spin_unlock(&ilb->lock); + rcu_read_unlock(); + + cond_resched(); + diag_ctx->l_num = 0; + } + + diag_ctx->l_num = 0; + diag_ctx->l_slot = i; +} + static void mptcp_diag_dump(struct sk_buff *skb, struct netlink_callback *cb, const struct inet_diag_req_v2 *r) { @@ -114,6 +189,9 @@ static void mptcp_diag_dump(struct sk_buff *skb, struct netlink_callback *cb, } cond_resched(); } + + if ((r->idiag_states & TCPF_LISTEN) && r->id.idiag_dport == 0) + mptcp_diag_dump_listeners(skb, cb, r, net_admin); }
static void mptcp_diag_get_info(struct sock *sk, struct inet_diag_msg *r, @@ -127,6 +205,19 @@ static void mptcp_diag_get_info(struct sock *sk, struct inet_diag_msg *r,
r->idiag_rqueue = sk_rmem_alloc_get(sk); r->idiag_wqueue = sk_wmem_alloc_get(sk); + + if (inet_sk_state_load(sk) == TCP_LISTEN) { + struct sock *lsk = READ_ONCE(msk->first); + + if (lsk) { + /* override with settings from tcp listener, + * so Send-Q will show accept queue. + */ + r->idiag_rqueue = READ_ONCE(lsk->sk_ack_backlog); + r->idiag_wqueue = READ_ONCE(lsk->sk_max_ack_backlog); + } + } + if (!info) return;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Martin KaFai Lau kafai@fb.com
[ Upstream commit 8ea1eebb49a2dfee1dce621a638cc1626e542392 ]
After commit 0ee58dad5b06 ("net: tcp6: prefer listeners bound to an address") and commit d9fbc7f6431f ("net: tcp: prefer listeners bound to an address"), the count is no longer used. This patch removes it.
Signed-off-by: Martin KaFai Lau kafai@fb.com Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Stable-dep-of: 871019b22d1b ("net: set SOCK_RCU_FREE before inserting socket into hashtable") Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/inet_hashtables.h | 1 - net/ipv4/inet_hashtables.c | 6 ------ 2 files changed, 7 deletions(-)
diff --git a/include/net/inet_hashtables.h b/include/net/inet_hashtables.h index 53c22b64e9724..405670d7661da 100644 --- a/include/net/inet_hashtables.h +++ b/include/net/inet_hashtables.h @@ -111,7 +111,6 @@ struct inet_bind_hashbucket { #define LISTENING_NULLS_BASE (1U << 29) struct inet_listen_hashbucket { spinlock_t lock; - unsigned int count; union { struct hlist_head head; struct hlist_nulls_head nulls_head; diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c index 2936676f86eb8..8e0451248fc05 100644 --- a/net/ipv4/inet_hashtables.c +++ b/net/ipv4/inet_hashtables.c @@ -209,7 +209,6 @@ static void inet_hash2(struct inet_hashinfo *h, struct sock *sk) else hlist_add_head_rcu(&inet_csk(sk)->icsk_listen_portaddr_node, &ilb2->head); - ilb2->count++; spin_unlock(&ilb2->lock); }
@@ -225,7 +224,6 @@ static void inet_unhash2(struct inet_hashinfo *h, struct sock *sk)
spin_lock(&ilb2->lock); hlist_del_init_rcu(&inet_csk(sk)->icsk_listen_portaddr_node); - ilb2->count--; spin_unlock(&ilb2->lock); }
@@ -652,7 +650,6 @@ int __inet_hash(struct sock *sk, struct sock *osk) else __sk_nulls_add_node_rcu(sk, &ilb->nulls_head); inet_hash2(hashinfo, sk); - ilb->count++; sock_set_flag(sk, SOCK_RCU_FREE); sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1); unlock: @@ -684,7 +681,6 @@ static void __inet_unhash(struct sock *sk, struct inet_listen_hashbucket *ilb) struct inet_hashinfo *hashinfo = sk->sk_prot->h.hashinfo;
inet_unhash2(hashinfo, sk); - ilb->count--; } __sk_nulls_del_node_init_rcu(sk); sock_prot_inuse_add(sock_net(sk), sk->sk_prot, -1); @@ -867,7 +863,6 @@ void inet_hashinfo_init(struct inet_hashinfo *h) spin_lock_init(&h->listening_hash[i].lock); INIT_HLIST_NULLS_HEAD(&h->listening_hash[i].nulls_head, i + LISTENING_NULLS_BASE); - h->listening_hash[i].count = 0; }
h->lhash2 = NULL; @@ -881,7 +876,6 @@ static void init_hashinfo_lhash2(struct inet_hashinfo *h) for (i = 0; i <= h->lhash2_mask; i++) { spin_lock_init(&h->lhash2[i].lock); INIT_HLIST_HEAD(&h->lhash2[i].head); - h->lhash2[i].count = 0; } }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Martin KaFai Lau kafai@fb.com
[ Upstream commit e8d0059000b20c4745c5b6a713f6adb269cff8ff ]
This patch folds lhash2 related functions into __inet_hash and inet_unhash. This will make the removal of the listening_hash in a latter patch easier to review.
First, this patch folds inet_hash2 into __inet_hash.
For unhash, the current call sequence is like inet_unhash() => __inet_unhash() => inet_unhash2(). The specific testing cases in __inet_unhash() are mostly related to TCP_LISTEN sk and its caller inet_unhash() already has the TCP_LISTEN test, so this patch folds both __inet_unhash() and inet_unhash2() into inet_unhash().
Note that all listening_hash users also have lhash2 initialized, so the !h->lhash2 check is no longer needed.
Signed-off-by: Martin KaFai Lau kafai@fb.com Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Stable-dep-of: 871019b22d1b ("net: set SOCK_RCU_FREE before inserting socket into hashtable") Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/inet_hashtables.c | 88 ++++++++++++++------------------------ 1 file changed, 33 insertions(+), 55 deletions(-)
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c index 8e0451248fc05..637d806090b00 100644 --- a/net/ipv4/inet_hashtables.c +++ b/net/ipv4/inet_hashtables.c @@ -193,40 +193,6 @@ inet_lhash2_bucket_sk(struct inet_hashinfo *h, struct sock *sk) return inet_lhash2_bucket(h, hash); }
-static void inet_hash2(struct inet_hashinfo *h, struct sock *sk) -{ - struct inet_listen_hashbucket *ilb2; - - if (!h->lhash2) - return; - - ilb2 = inet_lhash2_bucket_sk(h, sk); - - spin_lock(&ilb2->lock); - if (sk->sk_reuseport && sk->sk_family == AF_INET6) - hlist_add_tail_rcu(&inet_csk(sk)->icsk_listen_portaddr_node, - &ilb2->head); - else - hlist_add_head_rcu(&inet_csk(sk)->icsk_listen_portaddr_node, - &ilb2->head); - spin_unlock(&ilb2->lock); -} - -static void inet_unhash2(struct inet_hashinfo *h, struct sock *sk) -{ - struct inet_listen_hashbucket *ilb2; - - if (!h->lhash2 || - WARN_ON_ONCE(hlist_unhashed(&inet_csk(sk)->icsk_listen_portaddr_node))) - return; - - ilb2 = inet_lhash2_bucket_sk(h, sk); - - spin_lock(&ilb2->lock); - hlist_del_init_rcu(&inet_csk(sk)->icsk_listen_portaddr_node); - spin_unlock(&ilb2->lock); -} - static inline int compute_score(struct sock *sk, struct net *net, const unsigned short hnum, const __be32 daddr, const int dif, const int sdif) @@ -626,6 +592,7 @@ static int inet_reuseport_add_sock(struct sock *sk, int __inet_hash(struct sock *sk, struct sock *osk) { struct inet_hashinfo *hashinfo = sk->sk_prot->h.hashinfo; + struct inet_listen_hashbucket *ilb2; struct inet_listen_hashbucket *ilb; int err = 0;
@@ -637,22 +604,29 @@ int __inet_hash(struct sock *sk, struct sock *osk) } WARN_ON(!sk_unhashed(sk)); ilb = &hashinfo->listening_hash[inet_sk_listen_hashfn(sk)]; + ilb2 = inet_lhash2_bucket_sk(hashinfo, sk);
spin_lock(&ilb->lock); + spin_lock(&ilb2->lock); if (sk->sk_reuseport) { err = inet_reuseport_add_sock(sk, ilb); if (err) goto unlock; } if (IS_ENABLED(CONFIG_IPV6) && sk->sk_reuseport && - sk->sk_family == AF_INET6) + sk->sk_family == AF_INET6) { + hlist_add_tail_rcu(&inet_csk(sk)->icsk_listen_portaddr_node, + &ilb2->head); __sk_nulls_add_node_tail_rcu(sk, &ilb->nulls_head); - else + } else { + hlist_add_head_rcu(&inet_csk(sk)->icsk_listen_portaddr_node, + &ilb2->head); __sk_nulls_add_node_rcu(sk, &ilb->nulls_head); - inet_hash2(hashinfo, sk); + } sock_set_flag(sk, SOCK_RCU_FREE); sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1); unlock: + spin_unlock(&ilb2->lock); spin_unlock(&ilb->lock);
return err; @@ -670,22 +644,6 @@ int inet_hash(struct sock *sk) } EXPORT_SYMBOL_GPL(inet_hash);
-static void __inet_unhash(struct sock *sk, struct inet_listen_hashbucket *ilb) -{ - if (sk_unhashed(sk)) - return; - - if (rcu_access_pointer(sk->sk_reuseport_cb)) - reuseport_stop_listen_sock(sk); - if (ilb) { - struct inet_hashinfo *hashinfo = sk->sk_prot->h.hashinfo; - - inet_unhash2(hashinfo, sk); - } - __sk_nulls_del_node_init_rcu(sk); - sock_prot_inuse_add(sock_net(sk), sk->sk_prot, -1); -} - void inet_unhash(struct sock *sk) { struct inet_hashinfo *hashinfo = sk->sk_prot->h.hashinfo; @@ -694,20 +652,40 @@ void inet_unhash(struct sock *sk) return;
if (sk->sk_state == TCP_LISTEN) { + struct inet_listen_hashbucket *ilb2; struct inet_listen_hashbucket *ilb;
ilb = &hashinfo->listening_hash[inet_sk_listen_hashfn(sk)]; + ilb2 = inet_lhash2_bucket_sk(hashinfo, sk); /* Don't disable bottom halves while acquiring the lock to * avoid circular locking dependency on PREEMPT_RT. */ spin_lock(&ilb->lock); - __inet_unhash(sk, ilb); + spin_lock(&ilb2->lock); + if (sk_unhashed(sk)) { + spin_unlock(&ilb2->lock); + spin_unlock(&ilb->lock); + return; + } + + if (rcu_access_pointer(sk->sk_reuseport_cb)) + reuseport_stop_listen_sock(sk); + + hlist_del_init_rcu(&inet_csk(sk)->icsk_listen_portaddr_node); + __sk_nulls_del_node_init_rcu(sk); + sock_prot_inuse_add(sock_net(sk), sk->sk_prot, -1); + spin_unlock(&ilb2->lock); spin_unlock(&ilb->lock); } else { spinlock_t *lock = inet_ehash_lockp(hashinfo, sk->sk_hash);
spin_lock_bh(lock); - __inet_unhash(sk, NULL); + if (sk_unhashed(sk)) { + spin_unlock_bh(lock); + return; + } + __sk_nulls_del_node_init_rcu(sk); + sock_prot_inuse_add(sock_net(sk), sk->sk_prot, -1); spin_unlock_bh(lock); } }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Martin KaFai Lau kafai@fb.com
[ Upstream commit cae3873c5b3a4fcd9706fb461ff4e91bdf1f0120 ]
The listen sk is currently stored in two hash tables, listening_hash (hashed by port) and lhash2 (hashed by port and address).
After commit 0ee58dad5b06 ("net: tcp6: prefer listeners bound to an address") and commit d9fbc7f6431f ("net: tcp: prefer listeners bound to an address"), the TCP-SYN lookup fast path does not use listening_hash.
The commit 05c0b35709c5 ("tcp: seq_file: Replace listening_hash with lhash2") also moved the seq_file (/proc/net/tcp) iteration usage from listening_hash to lhash2.
There are still a few listening_hash usages left. One of them is inet_reuseport_add_sock() which uses the listening_hash to search a listen sk during the listen() system call. This turns out to be very slow on use cases that listen on many different VIPs at a popular port (e.g. 443). [ On top of the slowness in adding to the tail in the IPv6 case ]. The latter patch has a selftest to demonstrate this case.
This patch takes this chance to move all remaining listening_hash usages to lhash2 and then retire listening_hash.
Since most changes need to be done together, it is hard to cut the listening_hash to lhash2 switch into small patches. The changes in this patch is highlighted here for the review purpose.
1. Because of the listening_hash removal, lhash2 can use the sk->sk_nulls_node instead of the icsk->icsk_listen_portaddr_node. This will also keep the sk_unhashed() check to work as is after stop adding sk to listening_hash.
The union is removed from inet_listen_hashbucket because only nulls_head is needed.
2. icsk->icsk_listen_portaddr_node and its helpers are removed.
3. The current lhash2 users needs to iterate with sk_nulls_node instead of icsk_listen_portaddr_node.
One case is in the inet[6]_lhash2_lookup().
Another case is the seq_file iterator in tcp_ipv4.c. One thing to note is sk_nulls_next() is needed because the old inet_lhash2_for_each_icsk_continue() does a "next" first before iterating.
4. Move the remaining listening_hash usage to lhash2
inet_reuseport_add_sock() which this series is trying to improve.
inet_diag.c and mptcp_diag.c are the final two remaining use cases and is moved to lhash2 now also.
Signed-off-by: Martin KaFai Lau kafai@fb.com Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Stable-dep-of: 871019b22d1b ("net: set SOCK_RCU_FREE before inserting socket into hashtable") Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/inet_connection_sock.h | 2 -- include/net/inet_hashtables.h | 41 +------------------------- net/dccp/proto.c | 1 - net/ipv4/inet_diag.c | 5 ++-- net/ipv4/inet_hashtables.c | 47 ++++++------------------------ net/ipv4/tcp.c | 1 - net/ipv4/tcp_ipv4.c | 21 ++++++------- net/ipv6/inet6_hashtables.c | 5 ++-- net/mptcp/mptcp_diag.c | 4 +-- 9 files changed, 26 insertions(+), 101 deletions(-)
diff --git a/include/net/inet_connection_sock.h b/include/net/inet_connection_sock.h index 695ed45841f06..d31a18824cd5c 100644 --- a/include/net/inet_connection_sock.h +++ b/include/net/inet_connection_sock.h @@ -66,7 +66,6 @@ struct inet_connection_sock_af_ops { * @icsk_ulp_ops Pluggable ULP control hook * @icsk_ulp_data ULP private data * @icsk_clean_acked Clean acked data hook - * @icsk_listen_portaddr_node hash to the portaddr listener hashtable * @icsk_ca_state: Congestion control state * @icsk_retransmits: Number of unrecovered [RTO] timeouts * @icsk_pending: Scheduled timer event @@ -96,7 +95,6 @@ struct inet_connection_sock { const struct tcp_ulp_ops *icsk_ulp_ops; void __rcu *icsk_ulp_data; void (*icsk_clean_acked)(struct sock *sk, u32 acked_seq); - struct hlist_node icsk_listen_portaddr_node; unsigned int (*icsk_sync_mss)(struct sock *sk, u32 pmtu); __u8 icsk_ca_state:5, icsk_ca_initialized:1, diff --git a/include/net/inet_hashtables.h b/include/net/inet_hashtables.h index 405670d7661da..a7a8e66a1bad0 100644 --- a/include/net/inet_hashtables.h +++ b/include/net/inet_hashtables.h @@ -111,10 +111,7 @@ struct inet_bind_hashbucket { #define LISTENING_NULLS_BASE (1U << 29) struct inet_listen_hashbucket { spinlock_t lock; - union { - struct hlist_head head; - struct hlist_nulls_head nulls_head; - }; + struct hlist_nulls_head nulls_head; };
/* This is for listening sockets, thus all sockets which possess wildcards. */ @@ -142,32 +139,8 @@ struct inet_hashinfo { /* The 2nd listener table hashed by local port and address */ unsigned int lhash2_mask; struct inet_listen_hashbucket *lhash2; - - /* All the above members are written once at bootup and - * never written again _or_ are predominantly read-access. - * - * Now align to a new cache line as all the following members - * might be often dirty. - */ - /* All sockets in TCP_LISTEN state will be in listening_hash. - * This is the only table where wildcard'd TCP sockets can - * exist. listening_hash is only hashed by local port number. - * If lhash2 is initialized, the same socket will also be hashed - * to lhash2 by port and address. - */ - struct inet_listen_hashbucket listening_hash[INET_LHTABLE_SIZE] - ____cacheline_aligned_in_smp; };
-#define inet_lhash2_for_each_icsk_continue(__icsk) \ - hlist_for_each_entry_continue(__icsk, icsk_listen_portaddr_node) - -#define inet_lhash2_for_each_icsk(__icsk, list) \ - hlist_for_each_entry(__icsk, list, icsk_listen_portaddr_node) - -#define inet_lhash2_for_each_icsk_rcu(__icsk, list) \ - hlist_for_each_entry_rcu(__icsk, list, icsk_listen_portaddr_node) - static inline struct inet_listen_hashbucket * inet_lhash2_bucket(struct inet_hashinfo *h, u32 hash) { @@ -218,23 +191,11 @@ static inline u32 inet_bhashfn(const struct net *net, const __u16 lport, void inet_bind_hash(struct sock *sk, struct inet_bind_bucket *tb, const unsigned short snum);
-/* These can have wildcards, don't try too hard. */ -static inline u32 inet_lhashfn(const struct net *net, const unsigned short num) -{ - return (num + net_hash_mix(net)) & (INET_LHTABLE_SIZE - 1); -} - -static inline int inet_sk_listen_hashfn(const struct sock *sk) -{ - return inet_lhashfn(sock_net(sk), inet_sk(sk)->inet_num); -} - /* Caller must disable local BH processing. */ int __inet_inherit_port(const struct sock *sk, struct sock *child);
void inet_put_port(struct sock *sk);
-void inet_hashinfo_init(struct inet_hashinfo *h); void inet_hashinfo2_init(struct inet_hashinfo *h, const char *name, unsigned long numentries, int scale, unsigned long low_limit, diff --git a/net/dccp/proto.c b/net/dccp/proto.c index 0b0567a692a8f..1b285a57c7aab 100644 --- a/net/dccp/proto.c +++ b/net/dccp/proto.c @@ -1131,7 +1131,6 @@ static int __init dccp_init(void)
BUILD_BUG_ON(sizeof(struct dccp_skb_cb) > sizeof_field(struct sk_buff, cb)); - inet_hashinfo_init(&dccp_hashinfo); rc = inet_hashinfo2_init_mod(&dccp_hashinfo); if (rc) goto out_fail; diff --git a/net/ipv4/inet_diag.c b/net/ipv4/inet_diag.c index ae70e07c52445..09cabed358fd0 100644 --- a/net/ipv4/inet_diag.c +++ b/net/ipv4/inet_diag.c @@ -1028,12 +1028,13 @@ void inet_diag_dump_icsk(struct inet_hashinfo *hashinfo, struct sk_buff *skb, if (!(idiag_states & TCPF_LISTEN) || r->id.idiag_dport) goto skip_listen_ht;
- for (i = s_i; i < INET_LHTABLE_SIZE; i++) { + for (i = s_i; i <= hashinfo->lhash2_mask; i++) { struct inet_listen_hashbucket *ilb; struct hlist_nulls_node *node;
num = 0; - ilb = &hashinfo->listening_hash[i]; + ilb = &hashinfo->lhash2[i]; + spin_lock(&ilb->lock); sk_nulls_for_each(sk, node, &ilb->nulls_head) { struct inet_sock *inet = inet_sk(sk); diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c index 637d806090b00..a673f4ec1b429 100644 --- a/net/ipv4/inet_hashtables.c +++ b/net/ipv4/inet_hashtables.c @@ -246,12 +246,11 @@ static struct sock *inet_lhash2_lookup(struct net *net, const __be32 daddr, const unsigned short hnum, const int dif, const int sdif) { - struct inet_connection_sock *icsk; struct sock *sk, *result = NULL; + struct hlist_nulls_node *node; int score, hiscore = 0;
- inet_lhash2_for_each_icsk_rcu(icsk, &ilb2->head) { - sk = (struct sock *)icsk; + sk_nulls_for_each_rcu(sk, node, &ilb2->nulls_head) { score = compute_score(sk, net, hnum, daddr, dif, sdif); if (score > hiscore) { result = lookup_reuseport(net, sk, skb, doff, @@ -593,7 +592,6 @@ int __inet_hash(struct sock *sk, struct sock *osk) { struct inet_hashinfo *hashinfo = sk->sk_prot->h.hashinfo; struct inet_listen_hashbucket *ilb2; - struct inet_listen_hashbucket *ilb; int err = 0;
if (sk->sk_state != TCP_LISTEN) { @@ -603,31 +601,23 @@ int __inet_hash(struct sock *sk, struct sock *osk) return 0; } WARN_ON(!sk_unhashed(sk)); - ilb = &hashinfo->listening_hash[inet_sk_listen_hashfn(sk)]; ilb2 = inet_lhash2_bucket_sk(hashinfo, sk);
- spin_lock(&ilb->lock); spin_lock(&ilb2->lock); if (sk->sk_reuseport) { - err = inet_reuseport_add_sock(sk, ilb); + err = inet_reuseport_add_sock(sk, ilb2); if (err) goto unlock; } if (IS_ENABLED(CONFIG_IPV6) && sk->sk_reuseport && - sk->sk_family == AF_INET6) { - hlist_add_tail_rcu(&inet_csk(sk)->icsk_listen_portaddr_node, - &ilb2->head); - __sk_nulls_add_node_tail_rcu(sk, &ilb->nulls_head); - } else { - hlist_add_head_rcu(&inet_csk(sk)->icsk_listen_portaddr_node, - &ilb2->head); - __sk_nulls_add_node_rcu(sk, &ilb->nulls_head); - } + sk->sk_family == AF_INET6) + __sk_nulls_add_node_tail_rcu(sk, &ilb2->nulls_head); + else + __sk_nulls_add_node_rcu(sk, &ilb2->nulls_head); sock_set_flag(sk, SOCK_RCU_FREE); sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1); unlock: spin_unlock(&ilb2->lock); - spin_unlock(&ilb->lock);
return err; } @@ -653,29 +643,23 @@ void inet_unhash(struct sock *sk)
if (sk->sk_state == TCP_LISTEN) { struct inet_listen_hashbucket *ilb2; - struct inet_listen_hashbucket *ilb;
- ilb = &hashinfo->listening_hash[inet_sk_listen_hashfn(sk)]; ilb2 = inet_lhash2_bucket_sk(hashinfo, sk); /* Don't disable bottom halves while acquiring the lock to * avoid circular locking dependency on PREEMPT_RT. */ - spin_lock(&ilb->lock); spin_lock(&ilb2->lock); if (sk_unhashed(sk)) { spin_unlock(&ilb2->lock); - spin_unlock(&ilb->lock); return; }
if (rcu_access_pointer(sk->sk_reuseport_cb)) reuseport_stop_listen_sock(sk);
- hlist_del_init_rcu(&inet_csk(sk)->icsk_listen_portaddr_node); __sk_nulls_del_node_init_rcu(sk); sock_prot_inuse_add(sock_net(sk), sk->sk_prot, -1); spin_unlock(&ilb2->lock); - spin_unlock(&ilb->lock); } else { spinlock_t *lock = inet_ehash_lockp(hashinfo, sk->sk_hash);
@@ -833,27 +817,14 @@ int inet_hash_connect(struct inet_timewait_death_row *death_row, } EXPORT_SYMBOL_GPL(inet_hash_connect);
-void inet_hashinfo_init(struct inet_hashinfo *h) -{ - int i; - - for (i = 0; i < INET_LHTABLE_SIZE; i++) { - spin_lock_init(&h->listening_hash[i].lock); - INIT_HLIST_NULLS_HEAD(&h->listening_hash[i].nulls_head, - i + LISTENING_NULLS_BASE); - } - - h->lhash2 = NULL; -} -EXPORT_SYMBOL_GPL(inet_hashinfo_init); - static void init_hashinfo_lhash2(struct inet_hashinfo *h) { int i;
for (i = 0; i <= h->lhash2_mask; i++) { spin_lock_init(&h->lhash2[i].lock); - INIT_HLIST_HEAD(&h->lhash2[i].head); + INIT_HLIST_NULLS_HEAD(&h->lhash2[i].nulls_head, + i + LISTENING_NULLS_BASE); } }
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 6dcb77a2bde60..86dff7abdfd69 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -4554,7 +4554,6 @@ void __init tcp_init(void) timer_setup(&tcp_orphan_timer, tcp_orphan_update, TIMER_DEFERRABLE); mod_timer(&tcp_orphan_timer, jiffies + TCP_ORPHAN_TIMER_PERIOD);
- inet_hashinfo_init(&tcp_hashinfo); inet_hashinfo2_init(&tcp_hashinfo, "tcp_listen_portaddr_hash", thash_entries, 21, /* one slot per 2 MB*/ 0, 64 * 1024); diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index f89cb184649ec..0666be6b9ec93 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -2343,16 +2343,15 @@ static void *listening_get_first(struct seq_file *seq) st->offset = 0; for (; st->bucket <= tcp_hashinfo.lhash2_mask; st->bucket++) { struct inet_listen_hashbucket *ilb2; - struct inet_connection_sock *icsk; + struct hlist_nulls_node *node; struct sock *sk;
ilb2 = &tcp_hashinfo.lhash2[st->bucket]; - if (hlist_empty(&ilb2->head)) + if (hlist_nulls_empty(&ilb2->nulls_head)) continue;
spin_lock(&ilb2->lock); - inet_lhash2_for_each_icsk(icsk, &ilb2->head) { - sk = (struct sock *)icsk; + sk_nulls_for_each(sk, node, &ilb2->nulls_head) { if (seq_sk_match(seq, sk)) return sk; } @@ -2371,15 +2370,14 @@ static void *listening_get_next(struct seq_file *seq, void *cur) { struct tcp_iter_state *st = seq->private; struct inet_listen_hashbucket *ilb2; - struct inet_connection_sock *icsk; + struct hlist_nulls_node *node; struct sock *sk = cur;
++st->num; ++st->offset;
- icsk = inet_csk(sk); - inet_lhash2_for_each_icsk_continue(icsk) { - sk = (struct sock *)icsk; + sk = sk_nulls_next(sk); + sk_nulls_for_each_from(sk, node) { if (seq_sk_match(seq, sk)) return sk; } @@ -2788,16 +2786,15 @@ static unsigned int bpf_iter_tcp_listening_batch(struct seq_file *seq, { struct bpf_tcp_iter_state *iter = seq->private; struct tcp_iter_state *st = &iter->state; - struct inet_connection_sock *icsk; + struct hlist_nulls_node *node; unsigned int expected = 1; struct sock *sk;
sock_hold(start_sk); iter->batch[iter->end_sk++] = start_sk;
- icsk = inet_csk(start_sk); - inet_lhash2_for_each_icsk_continue(icsk) { - sk = (struct sock *)icsk; + sk = sk_nulls_next(start_sk); + sk_nulls_for_each_from(sk, node) { if (seq_sk_match(seq, sk)) { if (iter->end_sk < iter->max_sk) { sock_hold(sk); diff --git a/net/ipv6/inet6_hashtables.c b/net/ipv6/inet6_hashtables.c index b4a5e01e12016..c40cbdfc6247f 100644 --- a/net/ipv6/inet6_hashtables.c +++ b/net/ipv6/inet6_hashtables.c @@ -138,12 +138,11 @@ static struct sock *inet6_lhash2_lookup(struct net *net, const __be16 sport, const struct in6_addr *daddr, const unsigned short hnum, const int dif, const int sdif) { - struct inet_connection_sock *icsk; struct sock *sk, *result = NULL; + struct hlist_nulls_node *node; int score, hiscore = 0;
- inet_lhash2_for_each_icsk_rcu(icsk, &ilb2->head) { - sk = (struct sock *)icsk; + sk_nulls_for_each_rcu(sk, node, &ilb2->nulls_head) { score = compute_score(sk, net, hnum, daddr, dif, sdif); if (score > hiscore) { result = lookup_reuseport(net, sk, skb, doff, diff --git a/net/mptcp/mptcp_diag.c b/net/mptcp/mptcp_diag.c index 4d8625d0b179a..520ee65850553 100644 --- a/net/mptcp/mptcp_diag.c +++ b/net/mptcp/mptcp_diag.c @@ -83,13 +83,13 @@ static void mptcp_diag_dump_listeners(struct sk_buff *skb, struct netlink_callba struct net *net = sock_net(skb->sk); int i;
- for (i = diag_ctx->l_slot; i < INET_LHTABLE_SIZE; i++) { + for (i = diag_ctx->l_slot; i <= tcp_hashinfo.lhash2_mask; i++) { struct inet_listen_hashbucket *ilb; struct hlist_nulls_node *node; struct sock *sk; int num = 0;
- ilb = &tcp_hashinfo.listening_hash[i]; + ilb = &tcp_hashinfo.lhash2[i];
rcu_read_lock(); spin_lock(&ilb->lock);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stanislav Fomichev sdf@google.com
[ Upstream commit 871019b22d1bcc9fab2d1feba1b9a564acbb6e99 ]
We've started to see the following kernel traces:
WARNING: CPU: 83 PID: 0 at net/core/filter.c:6641 sk_lookup+0x1bd/0x1d0
Call Trace: <IRQ> __bpf_skc_lookup+0x10d/0x120 bpf_sk_lookup+0x48/0xd0 bpf_sk_lookup_tcp+0x19/0x20 bpf_prog_<redacted>+0x37c/0x16a3 cls_bpf_classify+0x205/0x2e0 tcf_classify+0x92/0x160 __netif_receive_skb_core+0xe52/0xf10 __netif_receive_skb_list_core+0x96/0x2b0 napi_complete_done+0x7b5/0xb70 <redacted>_poll+0x94/0xb0 net_rx_action+0x163/0x1d70 __do_softirq+0xdc/0x32e asm_call_irq_on_stack+0x12/0x20 </IRQ> do_softirq_own_stack+0x36/0x50 do_softirq+0x44/0x70
__inet_hash can race with lockless (rcu) readers on the other cpus:
__inet_hash __sk_nulls_add_node_rcu <- (bpf triggers here) sock_set_flag(SOCK_RCU_FREE)
Let's move the SOCK_RCU_FREE part up a bit, before we are inserting the socket into hashtables. Note, that the race is really harmless; the bpf callers are handling this situation (where listener socket doesn't have SOCK_RCU_FREE set) correctly, so the only annoyance is a WARN_ONCE.
More details from Eric regarding SOCK_RCU_FREE timeline:
Commit 3b24d854cb35 ("tcp/dccp: do not touch listener sk_refcnt under synflood") added SOCK_RCU_FREE. At that time, the precise location of sock_set_flag(sk, SOCK_RCU_FREE) did not matter, because the thread calling __inet_hash() owns a reference on sk. SOCK_RCU_FREE was only tested at dismantle time.
Commit 6acc9b432e67 ("bpf: Add helper to retrieve socket in BPF") started checking SOCK_RCU_FREE _after_ the lookup to infer whether the refcount has been taken care of.
Fixes: 6acc9b432e67 ("bpf: Add helper to retrieve socket in BPF") Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: Stanislav Fomichev sdf@google.com Reviewed-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/inet_hashtables.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c index a673f4ec1b429..b4e0120af9c2b 100644 --- a/net/ipv4/inet_hashtables.c +++ b/net/ipv4/inet_hashtables.c @@ -609,12 +609,12 @@ int __inet_hash(struct sock *sk, struct sock *osk) if (err) goto unlock; } + sock_set_flag(sk, SOCK_RCU_FREE); if (IS_ENABLED(CONFIG_IPV6) && sk->sk_reuseport && sk->sk_family == AF_INET6) __sk_nulls_add_node_tail_rcu(sk, &ilb2->nulls_head); else __sk_nulls_add_node_rcu(sk, &ilb2->nulls_head); - sock_set_flag(sk, SOCK_RCU_FREE); sock_prot_inuse_add(sock_net(sk), sk->sk_prot, 1); unlock: spin_unlock(&ilb2->lock);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 18f039428c7df183b09c69ebf10ffd4e521035d2 ]
Inspired by syzbot reports using a stack of multiple ipvlan devices.
Reduce stack size needed in ipvlan_process_v6_outbound() by moving the flowi6 struct used for the route lookup in an non inlined helper. ipvlan_route_v6_outbound() needs 120 bytes on the stack, immediately reclaimed.
Also make sure ipvlan_process_v4_outbound() is not inlined.
We might also have to lower MAX_NEST_DEV, because only syzbot uses setups with more than four stacked devices.
BUG: TASK stack guard page was hit at ffffc9000e803ff8 (stack is ffffc9000e804000..ffffc9000e808000) stack guard page: 0000 [#1] SMP KASAN CPU: 0 PID: 13442 Comm: syz-executor.4 Not tainted 6.1.52-syzkaller #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023 RIP: 0010:kasan_check_range+0x4/0x2a0 mm/kasan/generic.c:188 Code: 48 01 c6 48 89 c7 e8 db 4e c1 03 31 c0 5d c3 cc 0f 0b eb 02 0f 0b b8 ea ff ff ff 5d c3 cc 00 00 cc cc 00 00 cc cc 55 48 89 e5 <41> 57 41 56 41 55 41 54 53 b0 01 48 85 f6 0f 84 a4 01 00 00 48 89 RSP: 0018:ffffc9000e804000 EFLAGS: 00010246 RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff817e5bf2 RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffffffff887c6568 RBP: ffffc9000e804000 R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: dffffc0000000001 R12: 1ffff92001d0080c R13: dffffc0000000000 R14: ffffffff87e6b100 R15: 0000000000000000 FS: 00007fd0c55826c0(0000) GS:ffff8881f6800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffc9000e803ff8 CR3: 0000000170ef7000 CR4: 00000000003506f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <#DF> </#DF> <TASK> [<ffffffff81f281d1>] __kasan_check_read+0x11/0x20 mm/kasan/shadow.c:31 [<ffffffff817e5bf2>] instrument_atomic_read include/linux/instrumented.h:72 [inline] [<ffffffff817e5bf2>] _test_bit include/asm-generic/bitops/instrumented-non-atomic.h:141 [inline] [<ffffffff817e5bf2>] cpumask_test_cpu include/linux/cpumask.h:506 [inline] [<ffffffff817e5bf2>] cpu_online include/linux/cpumask.h:1092 [inline] [<ffffffff817e5bf2>] trace_lock_acquire include/trace/events/lock.h:24 [inline] [<ffffffff817e5bf2>] lock_acquire+0xe2/0x590 kernel/locking/lockdep.c:5632 [<ffffffff8563221e>] rcu_lock_acquire+0x2e/0x40 include/linux/rcupdate.h:306 [<ffffffff8561464d>] rcu_read_lock include/linux/rcupdate.h:747 [inline] [<ffffffff8561464d>] ip6_pol_route+0x15d/0x1440 net/ipv6/route.c:2221 [<ffffffff85618120>] ip6_pol_route_output+0x50/0x80 net/ipv6/route.c:2606 [<ffffffff856f65b5>] pol_lookup_func include/net/ip6_fib.h:584 [inline] [<ffffffff856f65b5>] fib6_rule_lookup+0x265/0x620 net/ipv6/fib6_rules.c:116 [<ffffffff85618009>] ip6_route_output_flags_noref+0x2d9/0x3a0 net/ipv6/route.c:2638 [<ffffffff8561821a>] ip6_route_output_flags+0xca/0x340 net/ipv6/route.c:2651 [<ffffffff838bd5a3>] ip6_route_output include/net/ip6_route.h:100 [inline] [<ffffffff838bd5a3>] ipvlan_process_v6_outbound drivers/net/ipvlan/ipvlan_core.c:473 [inline] [<ffffffff838bd5a3>] ipvlan_process_outbound drivers/net/ipvlan/ipvlan_core.c:529 [inline] [<ffffffff838bd5a3>] ipvlan_xmit_mode_l3 drivers/net/ipvlan/ipvlan_core.c:602 [inline] [<ffffffff838bd5a3>] ipvlan_queue_xmit+0xc33/0x1be0 drivers/net/ipvlan/ipvlan_core.c:677 [<ffffffff838c2909>] ipvlan_start_xmit+0x49/0x100 drivers/net/ipvlan/ipvlan_main.c:229 [<ffffffff84d03900>] netdev_start_xmit include/linux/netdevice.h:4966 [inline] [<ffffffff84d03900>] xmit_one net/core/dev.c:3644 [inline] [<ffffffff84d03900>] dev_hard_start_xmit+0x320/0x980 net/core/dev.c:3660 [<ffffffff84d080e2>] __dev_queue_xmit+0x16b2/0x3370 net/core/dev.c:4324 [<ffffffff855ce4cd>] dev_queue_xmit include/linux/netdevice.h:3067 [inline] [<ffffffff855ce4cd>] neigh_hh_output include/net/neighbour.h:529 [inline] [<ffffffff855ce4cd>] neigh_output include/net/neighbour.h:543 [inline] [<ffffffff855ce4cd>] ip6_finish_output2+0x160d/0x1ae0 net/ipv6/ip6_output.c:139 [<ffffffff855b8616>] __ip6_finish_output net/ipv6/ip6_output.c:200 [inline] [<ffffffff855b8616>] ip6_finish_output+0x6c6/0xb10 net/ipv6/ip6_output.c:211 [<ffffffff855b7e3c>] NF_HOOK_COND include/linux/netfilter.h:298 [inline] [<ffffffff855b7e3c>] ip6_output+0x2bc/0x3d0 net/ipv6/ip6_output.c:232 [<ffffffff8575d27f>] dst_output include/net/dst.h:444 [inline] [<ffffffff8575d27f>] ip6_local_out+0x10f/0x140 net/ipv6/output_core.c:161 [<ffffffff838bdae4>] ipvlan_process_v6_outbound drivers/net/ipvlan/ipvlan_core.c:483 [inline] [<ffffffff838bdae4>] ipvlan_process_outbound drivers/net/ipvlan/ipvlan_core.c:529 [inline] [<ffffffff838bdae4>] ipvlan_xmit_mode_l3 drivers/net/ipvlan/ipvlan_core.c:602 [inline] [<ffffffff838bdae4>] ipvlan_queue_xmit+0x1174/0x1be0 drivers/net/ipvlan/ipvlan_core.c:677 [<ffffffff838c2909>] ipvlan_start_xmit+0x49/0x100 drivers/net/ipvlan/ipvlan_main.c:229 [<ffffffff84d03900>] netdev_start_xmit include/linux/netdevice.h:4966 [inline] [<ffffffff84d03900>] xmit_one net/core/dev.c:3644 [inline] [<ffffffff84d03900>] dev_hard_start_xmit+0x320/0x980 net/core/dev.c:3660 [<ffffffff84d080e2>] __dev_queue_xmit+0x16b2/0x3370 net/core/dev.c:4324 [<ffffffff855ce4cd>] dev_queue_xmit include/linux/netdevice.h:3067 [inline] [<ffffffff855ce4cd>] neigh_hh_output include/net/neighbour.h:529 [inline] [<ffffffff855ce4cd>] neigh_output include/net/neighbour.h:543 [inline] [<ffffffff855ce4cd>] ip6_finish_output2+0x160d/0x1ae0 net/ipv6/ip6_output.c:139 [<ffffffff855b8616>] __ip6_finish_output net/ipv6/ip6_output.c:200 [inline] [<ffffffff855b8616>] ip6_finish_output+0x6c6/0xb10 net/ipv6/ip6_output.c:211 [<ffffffff855b7e3c>] NF_HOOK_COND include/linux/netfilter.h:298 [inline] [<ffffffff855b7e3c>] ip6_output+0x2bc/0x3d0 net/ipv6/ip6_output.c:232 [<ffffffff8575d27f>] dst_output include/net/dst.h:444 [inline] [<ffffffff8575d27f>] ip6_local_out+0x10f/0x140 net/ipv6/output_core.c:161 [<ffffffff838bdae4>] ipvlan_process_v6_outbound drivers/net/ipvlan/ipvlan_core.c:483 [inline] [<ffffffff838bdae4>] ipvlan_process_outbound drivers/net/ipvlan/ipvlan_core.c:529 [inline] [<ffffffff838bdae4>] ipvlan_xmit_mode_l3 drivers/net/ipvlan/ipvlan_core.c:602 [inline] [<ffffffff838bdae4>] ipvlan_queue_xmit+0x1174/0x1be0 drivers/net/ipvlan/ipvlan_core.c:677 [<ffffffff838c2909>] ipvlan_start_xmit+0x49/0x100 drivers/net/ipvlan/ipvlan_main.c:229 [<ffffffff84d03900>] netdev_start_xmit include/linux/netdevice.h:4966 [inline] [<ffffffff84d03900>] xmit_one net/core/dev.c:3644 [inline] [<ffffffff84d03900>] dev_hard_start_xmit+0x320/0x980 net/core/dev.c:3660 [<ffffffff84d080e2>] __dev_queue_xmit+0x16b2/0x3370 net/core/dev.c:4324 [<ffffffff855ce4cd>] dev_queue_xmit include/linux/netdevice.h:3067 [inline] [<ffffffff855ce4cd>] neigh_hh_output include/net/neighbour.h:529 [inline] [<ffffffff855ce4cd>] neigh_output include/net/neighbour.h:543 [inline] [<ffffffff855ce4cd>] ip6_finish_output2+0x160d/0x1ae0 net/ipv6/ip6_output.c:139 [<ffffffff855b8616>] __ip6_finish_output net/ipv6/ip6_output.c:200 [inline] [<ffffffff855b8616>] ip6_finish_output+0x6c6/0xb10 net/ipv6/ip6_output.c:211 [<ffffffff855b7e3c>] NF_HOOK_COND include/linux/netfilter.h:298 [inline] [<ffffffff855b7e3c>] ip6_output+0x2bc/0x3d0 net/ipv6/ip6_output.c:232 [<ffffffff8575d27f>] dst_output include/net/dst.h:444 [inline] [<ffffffff8575d27f>] ip6_local_out+0x10f/0x140 net/ipv6/output_core.c:161 [<ffffffff838bdae4>] ipvlan_process_v6_outbound drivers/net/ipvlan/ipvlan_core.c:483 [inline] [<ffffffff838bdae4>] ipvlan_process_outbound drivers/net/ipvlan/ipvlan_core.c:529 [inline] [<ffffffff838bdae4>] ipvlan_xmit_mode_l3 drivers/net/ipvlan/ipvlan_core.c:602 [inline] [<ffffffff838bdae4>] ipvlan_queue_xmit+0x1174/0x1be0 drivers/net/ipvlan/ipvlan_core.c:677 [<ffffffff838c2909>] ipvlan_start_xmit+0x49/0x100 drivers/net/ipvlan/ipvlan_main.c:229 [<ffffffff84d03900>] netdev_start_xmit include/linux/netdevice.h:4966 [inline] [<ffffffff84d03900>] xmit_one net/core/dev.c:3644 [inline] [<ffffffff84d03900>] dev_hard_start_xmit+0x320/0x980 net/core/dev.c:3660 [<ffffffff84d080e2>] __dev_queue_xmit+0x16b2/0x3370 net/core/dev.c:4324 [<ffffffff855ce4cd>] dev_queue_xmit include/linux/netdevice.h:3067 [inline] [<ffffffff855ce4cd>] neigh_hh_output include/net/neighbour.h:529 [inline] [<ffffffff855ce4cd>] neigh_output include/net/neighbour.h:543 [inline] [<ffffffff855ce4cd>] ip6_finish_output2+0x160d/0x1ae0 net/ipv6/ip6_output.c:139 [<ffffffff855b8616>] __ip6_finish_output net/ipv6/ip6_output.c:200 [inline] [<ffffffff855b8616>] ip6_finish_output+0x6c6/0xb10 net/ipv6/ip6_output.c:211 [<ffffffff855b7e3c>] NF_HOOK_COND include/linux/netfilter.h:298 [inline] [<ffffffff855b7e3c>] ip6_output+0x2bc/0x3d0 net/ipv6/ip6_output.c:232 [<ffffffff8575d27f>] dst_output include/net/dst.h:444 [inline] [<ffffffff8575d27f>] ip6_local_out+0x10f/0x140 net/ipv6/output_core.c:161 [<ffffffff838bdae4>] ipvlan_process_v6_outbound drivers/net/ipvlan/ipvlan_core.c:483 [inline] [<ffffffff838bdae4>] ipvlan_process_outbound drivers/net/ipvlan/ipvlan_core.c:529 [inline] [<ffffffff838bdae4>] ipvlan_xmit_mode_l3 drivers/net/ipvlan/ipvlan_core.c:602 [inline] [<ffffffff838bdae4>] ipvlan_queue_xmit+0x1174/0x1be0 drivers/net/ipvlan/ipvlan_core.c:677 [<ffffffff838c2909>] ipvlan_start_xmit+0x49/0x100 drivers/net/ipvlan/ipvlan_main.c:229 [<ffffffff84d03900>] netdev_start_xmit include/linux/netdevice.h:4966 [inline] [<ffffffff84d03900>] xmit_one net/core/dev.c:3644 [inline] [<ffffffff84d03900>] dev_hard_start_xmit+0x320/0x980 net/core/dev.c:3660 [<ffffffff84d080e2>] __dev_queue_xmit+0x16b2/0x3370 net/core/dev.c:4324 [<ffffffff84d4a65e>] dev_queue_xmit include/linux/netdevice.h:3067 [inline] [<ffffffff84d4a65e>] neigh_resolve_output+0x64e/0x750 net/core/neighbour.c:1560 [<ffffffff855ce503>] neigh_output include/net/neighbour.h:545 [inline] [<ffffffff855ce503>] ip6_finish_output2+0x1643/0x1ae0 net/ipv6/ip6_output.c:139 [<ffffffff855b8616>] __ip6_finish_output net/ipv6/ip6_output.c:200 [inline] [<ffffffff855b8616>] ip6_finish_output+0x6c6/0xb10 net/ipv6/ip6_output.c:211 [<ffffffff855b7e3c>] NF_HOOK_COND include/linux/netfilter.h:298 [inline] [<ffffffff855b7e3c>] ip6_output+0x2bc/0x3d0 net/ipv6/ip6_output.c:232 [<ffffffff855b9ce4>] dst_output include/net/dst.h:444 [inline] [<ffffffff855b9ce4>] NF_HOOK include/linux/netfilter.h:309 [inline] [<ffffffff855b9ce4>] ip6_xmit+0x11a4/0x1b20 net/ipv6/ip6_output.c:352 [<ffffffff8597984e>] sctp_v6_xmit+0x9ae/0x1230 net/sctp/ipv6.c:250 [<ffffffff8594623e>] sctp_packet_transmit+0x25de/0x2bc0 net/sctp/output.c:653 [<ffffffff858f5142>] sctp_packet_singleton+0x202/0x310 net/sctp/outqueue.c:783 [<ffffffff858ea411>] sctp_outq_flush_ctrl net/sctp/outqueue.c:914 [inline] [<ffffffff858ea411>] sctp_outq_flush+0x661/0x3d40 net/sctp/outqueue.c:1212 [<ffffffff858f02f9>] sctp_outq_uncork+0x79/0xb0 net/sctp/outqueue.c:764 [<ffffffff8589f060>] sctp_side_effects net/sctp/sm_sideeffect.c:1199 [inline] [<ffffffff8589f060>] sctp_do_sm+0x55c0/0x5c30 net/sctp/sm_sideeffect.c:1170 [<ffffffff85941567>] sctp_primitive_ASSOCIATE+0x97/0xc0 net/sctp/primitive.c:73 [<ffffffff859408b2>] sctp_sendmsg_to_asoc+0xf62/0x17b0 net/sctp/socket.c:1839 [<ffffffff85910b5e>] sctp_sendmsg+0x212e/0x33b0 net/sctp/socket.c:2029 [<ffffffff8544d559>] inet_sendmsg+0x149/0x310 net/ipv4/af_inet.c:849 [<ffffffff84c6c4d2>] sock_sendmsg_nosec net/socket.c:716 [inline] [<ffffffff84c6c4d2>] sock_sendmsg net/socket.c:736 [inline] [<ffffffff84c6c4d2>] ____sys_sendmsg+0x572/0x8c0 net/socket.c:2504 [<ffffffff84c6ca91>] ___sys_sendmsg net/socket.c:2558 [inline] [<ffffffff84c6ca91>] __sys_sendmsg+0x271/0x360 net/socket.c:2587 [<ffffffff84c6cbff>] __do_sys_sendmsg net/socket.c:2596 [inline] [<ffffffff84c6cbff>] __se_sys_sendmsg net/socket.c:2594 [inline] [<ffffffff84c6cbff>] __x64_sys_sendmsg+0x7f/0x90 net/socket.c:2594 [<ffffffff85b32553>] do_syscall_x64 arch/x86/entry/common.c:51 [inline] [<ffffffff85b32553>] do_syscall_64+0x53/0x80 arch/x86/entry/common.c:84 [<ffffffff85c00087>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
Fixes: 2ad7bf363841 ("ipvlan: Initial check-in of the IPVLAN driver.") Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: Eric Dumazet edumazet@google.com Cc: Mahesh Bandewar maheshb@google.com Cc: Willem de Bruijn willemb@google.com Reviewed-by: Willem de Bruijn willemb@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ipvlan/ipvlan_core.c | 41 +++++++++++++++++++------------- 1 file changed, 25 insertions(+), 16 deletions(-)
diff --git a/drivers/net/ipvlan/ipvlan_core.c b/drivers/net/ipvlan/ipvlan_core.c index 905542df3b682..5aa9217240d53 100644 --- a/drivers/net/ipvlan/ipvlan_core.c +++ b/drivers/net/ipvlan/ipvlan_core.c @@ -412,7 +412,7 @@ struct ipvl_addr *ipvlan_addr_lookup(struct ipvl_port *port, void *lyr3h, return addr; }
-static int ipvlan_process_v4_outbound(struct sk_buff *skb) +static noinline_for_stack int ipvlan_process_v4_outbound(struct sk_buff *skb) { const struct iphdr *ip4h = ip_hdr(skb); struct net_device *dev = skb->dev; @@ -454,13 +454,11 @@ static int ipvlan_process_v4_outbound(struct sk_buff *skb) }
#if IS_ENABLED(CONFIG_IPV6) -static int ipvlan_process_v6_outbound(struct sk_buff *skb) + +static noinline_for_stack int +ipvlan_route_v6_outbound(struct net_device *dev, struct sk_buff *skb) { const struct ipv6hdr *ip6h = ipv6_hdr(skb); - struct net_device *dev = skb->dev; - struct net *net = dev_net(dev); - struct dst_entry *dst; - int err, ret = NET_XMIT_DROP; struct flowi6 fl6 = { .flowi6_oif = dev->ifindex, .daddr = ip6h->daddr, @@ -470,27 +468,38 @@ static int ipvlan_process_v6_outbound(struct sk_buff *skb) .flowi6_mark = skb->mark, .flowi6_proto = ip6h->nexthdr, }; + struct dst_entry *dst; + int err;
- dst = ip6_route_output(net, NULL, &fl6); - if (dst->error) { - ret = dst->error; + dst = ip6_route_output(dev_net(dev), NULL, &fl6); + err = dst->error; + if (err) { dst_release(dst); - goto err; + return err; } skb_dst_set(skb, dst); + return 0; +} + +static int ipvlan_process_v6_outbound(struct sk_buff *skb) +{ + struct net_device *dev = skb->dev; + int err, ret = NET_XMIT_DROP; + + err = ipvlan_route_v6_outbound(dev, skb); + if (unlikely(err)) { + DEV_STATS_INC(dev, tx_errors); + kfree_skb(skb); + return err; + }
memset(IP6CB(skb), 0, sizeof(*IP6CB(skb)));
- err = ip6_local_out(net, skb->sk, skb); + err = ip6_local_out(dev_net(dev), skb->sk, skb); if (unlikely(net_xmit_eval(err))) DEV_STATS_INC(dev, tx_errors); else ret = NET_XMIT_SUCCESS; - goto out; -err: - DEV_STATS_INC(dev, tx_errors); - kfree_skb(skb); -out: return ret; } #else
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shigeru Yoshida syoshida@redhat.com
[ Upstream commit 719639853d88071dfdfd8d9971eca9c283ff314c ]
KMSAN reported the following uninit-value access issue:
===================================================== BUG: KMSAN: uninit-value in ppp_sync_input drivers/net/ppp/ppp_synctty.c:690 [inline] BUG: KMSAN: uninit-value in ppp_sync_receive+0xdc9/0xe70 drivers/net/ppp/ppp_synctty.c:334 ppp_sync_input drivers/net/ppp/ppp_synctty.c:690 [inline] ppp_sync_receive+0xdc9/0xe70 drivers/net/ppp/ppp_synctty.c:334 tiocsti+0x328/0x450 drivers/tty/tty_io.c:2295 tty_ioctl+0x808/0x1920 drivers/tty/tty_io.c:2694 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:871 [inline] __se_sys_ioctl+0x211/0x400 fs/ioctl.c:857 __x64_sys_ioctl+0x97/0xe0 fs/ioctl.c:857 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b
Uninit was created at: __alloc_pages+0x75d/0xe80 mm/page_alloc.c:4591 __alloc_pages_node include/linux/gfp.h:238 [inline] alloc_pages_node include/linux/gfp.h:261 [inline] __page_frag_cache_refill+0x9a/0x2c0 mm/page_alloc.c:4691 page_frag_alloc_align+0x91/0x5d0 mm/page_alloc.c:4722 page_frag_alloc include/linux/gfp.h:322 [inline] __netdev_alloc_skb+0x215/0x6d0 net/core/skbuff.c:728 netdev_alloc_skb include/linux/skbuff.h:3225 [inline] dev_alloc_skb include/linux/skbuff.h:3238 [inline] ppp_sync_input drivers/net/ppp/ppp_synctty.c:669 [inline] ppp_sync_receive+0x237/0xe70 drivers/net/ppp/ppp_synctty.c:334 tiocsti+0x328/0x450 drivers/tty/tty_io.c:2295 tty_ioctl+0x808/0x1920 drivers/tty/tty_io.c:2694 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:871 [inline] __se_sys_ioctl+0x211/0x400 fs/ioctl.c:857 __x64_sys_ioctl+0x97/0xe0 fs/ioctl.c:857 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b
CPU: 0 PID: 12950 Comm: syz-executor.1 Not tainted 6.6.0-14500-g1c41041124bd #10 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014 =====================================================
ppp_sync_input() checks the first 2 bytes of the data are PPP_ALLSTATIONS and PPP_UI. However, if the data length is 1 and the first byte is PPP_ALLSTATIONS, an access to an uninitialized value occurs when checking PPP_UI. This patch resolves this issue by checking the data length.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Shigeru Yoshida syoshida@redhat.com Reviewed-by: Simon Horman horms@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ppp/ppp_synctty.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ppp/ppp_synctty.c b/drivers/net/ppp/ppp_synctty.c index af3e048695b66..e37faed81937f 100644 --- a/drivers/net/ppp/ppp_synctty.c +++ b/drivers/net/ppp/ppp_synctty.c @@ -699,7 +699,7 @@ ppp_sync_input(struct syncppp *ap, const unsigned char *buf,
/* strip address/control field if present */ p = skb->data; - if (p[0] == PPP_ALLSTATIONS && p[1] == PPP_UI) { + if (skb->len >= 2 && p[0] == PPP_ALLSTATIONS && p[1] == PPP_UI) { /* chop off address/control */ if (skb->len < 3) goto err;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jian Shen shenjian15@huawei.com
[ Upstream commit 472a2ff63efb30234cbf6b2cdaf8117f21b4f8bc ]
The hclge_sync_vlan_filter is called in periodic task, trying to remove VLAN from vlan_del_fail_bmap. It can be concurrence with VLAN adding operation from user. So once user failed to delete a VLAN id, and add it again soon, it may be removed by the periodic task, which may cause the software configuration being inconsistent with hardware. So add mutex handling to avoid this.
user hns3 driver
periodic task │ add vlan 10 ───── hns3_vlan_rx_add_vid │ │ (suppose success) │ │ │ del vlan 10 ───── hns3_vlan_rx_kill_vid │ │ (suppose fail,add to │ │ vlan_del_fail_bmap) │ │ │ add vlan 10 ───── hns3_vlan_rx_add_vid │ (suppose success) │ foreach vlan_del_fail_bmp del vlan 10
Fixes: fe4144d47eef ("net: hns3: sync VLAN filter entries when kill VLAN ID failed") Signed-off-by: Jian Shen shenjian15@huawei.com Signed-off-by: Jijie Shao shaojijie@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../hisilicon/hns3/hns3pf/hclge_main.c | 28 +++++++++++++------ .../hisilicon/hns3/hns3vf/hclgevf_main.c | 11 ++++++-- 2 files changed, 29 insertions(+), 10 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c index ca59e1cd992e5..dba3cf15b48e1 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c @@ -10196,8 +10196,6 @@ static void hclge_rm_vport_vlan_table(struct hclge_vport *vport, u16 vlan_id, struct hclge_vport_vlan_cfg *vlan, *tmp; struct hclge_dev *hdev = vport->back;
- mutex_lock(&hdev->vport_lock); - list_for_each_entry_safe(vlan, tmp, &vport->vlan_list, node) { if (vlan->vlan_id == vlan_id) { if (is_write_tbl && vlan->hd_tbl_status) @@ -10212,8 +10210,6 @@ static void hclge_rm_vport_vlan_table(struct hclge_vport *vport, u16 vlan_id, break; } } - - mutex_unlock(&hdev->vport_lock); }
void hclge_rm_vport_all_vlan_table(struct hclge_vport *vport, bool is_del_list) @@ -10618,11 +10614,16 @@ int hclge_set_vlan_filter(struct hnae3_handle *handle, __be16 proto, * handle mailbox. Just record the vlan id, and remove it after * reset finished. */ + mutex_lock(&hdev->vport_lock); if ((test_bit(HCLGE_STATE_RST_HANDLING, &hdev->state) || test_bit(HCLGE_STATE_RST_FAIL, &hdev->state)) && is_kill) { set_bit(vlan_id, vport->vlan_del_fail_bmap); + mutex_unlock(&hdev->vport_lock); return -EBUSY; + } else if (!is_kill && test_bit(vlan_id, vport->vlan_del_fail_bmap)) { + clear_bit(vlan_id, vport->vlan_del_fail_bmap); } + mutex_unlock(&hdev->vport_lock);
/* when port base vlan enabled, we use port base vlan as the vlan * filter entry. In this case, we don't update vlan filter table @@ -10637,17 +10638,22 @@ int hclge_set_vlan_filter(struct hnae3_handle *handle, __be16 proto, }
if (!ret) { - if (!is_kill) + if (!is_kill) { hclge_add_vport_vlan_table(vport, vlan_id, writen_to_tbl); - else if (is_kill && vlan_id != 0) + } else if (is_kill && vlan_id != 0) { + mutex_lock(&hdev->vport_lock); hclge_rm_vport_vlan_table(vport, vlan_id, false); + mutex_unlock(&hdev->vport_lock); + } } else if (is_kill) { /* when remove hw vlan filter failed, record the vlan id, * and try to remove it from hw later, to be consistence * with stack */ + mutex_lock(&hdev->vport_lock); set_bit(vlan_id, vport->vlan_del_fail_bmap); + mutex_unlock(&hdev->vport_lock); }
hclge_set_vport_vlan_fltr_change(vport); @@ -10687,6 +10693,7 @@ static void hclge_sync_vlan_filter(struct hclge_dev *hdev) int i, ret, sync_cnt = 0; u16 vlan_id;
+ mutex_lock(&hdev->vport_lock); /* start from vport 1 for PF is always alive */ for (i = 0; i < hdev->num_alloc_vport; i++) { struct hclge_vport *vport = &hdev->vport[i]; @@ -10697,21 +10704,26 @@ static void hclge_sync_vlan_filter(struct hclge_dev *hdev) ret = hclge_set_vlan_filter_hw(hdev, htons(ETH_P_8021Q), vport->vport_id, vlan_id, true); - if (ret && ret != -EINVAL) + if (ret && ret != -EINVAL) { + mutex_unlock(&hdev->vport_lock); return; + }
clear_bit(vlan_id, vport->vlan_del_fail_bmap); hclge_rm_vport_vlan_table(vport, vlan_id, false); hclge_set_vport_vlan_fltr_change(vport);
sync_cnt++; - if (sync_cnt >= HCLGE_MAX_SYNC_COUNT) + if (sync_cnt >= HCLGE_MAX_SYNC_COUNT) { + mutex_unlock(&hdev->vport_lock); return; + }
vlan_id = find_first_bit(vport->vlan_del_fail_bmap, VLAN_N_VID); } } + mutex_unlock(&hdev->vport_lock);
hclge_sync_vlan_fltr_state(hdev); } diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c index bc140e3620d6c..8adc682f624f9 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c @@ -1710,6 +1710,8 @@ static int hclgevf_set_vlan_filter(struct hnae3_handle *handle, test_bit(HCLGEVF_STATE_RST_FAIL, &hdev->state)) && is_kill) { set_bit(vlan_id, hdev->vlan_del_fail_bmap); return -EBUSY; + } else if (!is_kill && test_bit(vlan_id, hdev->vlan_del_fail_bmap)) { + clear_bit(vlan_id, hdev->vlan_del_fail_bmap); }
hclgevf_build_send_msg(&send_msg, HCLGE_MBX_SET_VLAN, @@ -1737,20 +1739,25 @@ static void hclgevf_sync_vlan_filter(struct hclgevf_dev *hdev) int ret, sync_cnt = 0; u16 vlan_id;
+ if (bitmap_empty(hdev->vlan_del_fail_bmap, VLAN_N_VID)) + return; + + rtnl_lock(); vlan_id = find_first_bit(hdev->vlan_del_fail_bmap, VLAN_N_VID); while (vlan_id != VLAN_N_VID) { ret = hclgevf_set_vlan_filter(handle, htons(ETH_P_8021Q), vlan_id, true); if (ret) - return; + break;
clear_bit(vlan_id, hdev->vlan_del_fail_bmap); sync_cnt++; if (sync_cnt >= HCLGEVF_MAX_SYNC_COUNT) - return; + break;
vlan_id = find_first_bit(hdev->vlan_del_fail_bmap, VLAN_N_VID); } + rtnl_unlock(); }
static int hclgevf_en_hw_strip_rxvtag(struct hnae3_handle *handle, bool enable)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jian Shen shenjian15@huawei.com
[ Upstream commit 6fde96df0447a29ab785de4fcb229e5543f0cbf7 ]
The struct hclge_pf_to_vf_msg is used for mailbox message from PF to VF, including both response and request. But its definition can only indicate respone, which makes the message data copy in function hclge_send_mbx_msg() unreadable. So refine it by edding a general message definition into it.
Signed-off-by: Jian Shen shenjian15@huawei.com Signed-off-by: Guangbin Huang huangguangbin2@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Stable-dep-of: ac92c0a9a060 ("net: hns3: add barrier in vf mailbox reply process") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hclge_mbx.h | 17 +++++++++++++---- .../ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c | 2 +- 2 files changed, 14 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hclge_mbx.h b/drivers/net/ethernet/hisilicon/hns3/hclge_mbx.h index c2bd2584201f8..c4603a70ed60b 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hclge_mbx.h +++ b/drivers/net/ethernet/hisilicon/hns3/hclge_mbx.h @@ -132,10 +132,19 @@ struct hclge_vf_to_pf_msg {
struct hclge_pf_to_vf_msg { u16 code; - u16 vf_mbx_msg_code; - u16 vf_mbx_msg_subcode; - u16 resp_status; - u8 resp_data[HCLGE_MBX_MAX_RESP_DATA_SIZE]; + union { + /* used for mbx response */ + struct { + u16 vf_mbx_msg_code; + u16 vf_mbx_msg_subcode; + u16 resp_status; + u8 resp_data[HCLGE_MBX_MAX_RESP_DATA_SIZE]; + }; + /* used for general mbx */ + struct { + u8 msg_data[HCLGE_MBX_MAX_MSG_SIZE]; + }; + }; };
struct hclge_mbx_vf_to_pf_cmd { diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c index 4a5b11b6fed3f..dac4ac425481c 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c @@ -108,7 +108,7 @@ static int hclge_send_mbx_msg(struct hclge_vport *vport, u8 *msg, u16 msg_len, resp_pf_to_vf->msg_len = msg_len; resp_pf_to_vf->msg.code = mbx_opcode;
- memcpy(&resp_pf_to_vf->msg.vf_mbx_msg_code, msg, msg_len); + memcpy(resp_pf_to_vf->msg.msg_data, msg, msg_len);
trace_hclge_pf_mbx_send(hdev, resp_pf_to_vf);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jie Wang wangjie125@huawei.com
[ Upstream commit 767975e582c50b39d633f6e1c4bb99cc1f156efb ]
Currently, hns3 mailbox processing between PF and VF missed to convert message byte order and use data type u16 instead of __le16 for mailbox data process. These processes may cause problems between different architectures.
So this patch uses __le16/__le32 data type to define mailbox data structures. To be compatible with old hns3 driver, these structures use one-byte alignment. Then byte order conversions are added to mailbox messages from PF to VF.
Signed-off-by: Jie Wang wangjie125@huawei.com Signed-off-by: Guangbin Huang huangguangbin2@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Stable-dep-of: ac92c0a9a060 ("net: hns3: add barrier in vf mailbox reply process") Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/hisilicon/hns3/hclge_mbx.h | 36 +++++++-- .../hisilicon/hns3/hns3pf/hclge_mbx.c | 60 +++++++------- .../hisilicon/hns3/hns3pf/hclge_trace.h | 2 +- .../hisilicon/hns3/hns3vf/hclgevf_main.c | 4 +- .../hisilicon/hns3/hns3vf/hclgevf_main.h | 2 +- .../hisilicon/hns3/hns3vf/hclgevf_mbx.c | 80 +++++++++++-------- .../hisilicon/hns3/hns3vf/hclgevf_trace.h | 2 +- 7 files changed, 109 insertions(+), 77 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hclge_mbx.h b/drivers/net/ethernet/hisilicon/hns3/hclge_mbx.h index c4603a70ed60b..277d6d657c429 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hclge_mbx.h +++ b/drivers/net/ethernet/hisilicon/hns3/hclge_mbx.h @@ -131,13 +131,13 @@ struct hclge_vf_to_pf_msg { };
struct hclge_pf_to_vf_msg { - u16 code; + __le16 code; union { /* used for mbx response */ struct { - u16 vf_mbx_msg_code; - u16 vf_mbx_msg_subcode; - u16 resp_status; + __le16 vf_mbx_msg_code; + __le16 vf_mbx_msg_subcode; + __le16 resp_status; u8 resp_data[HCLGE_MBX_MAX_RESP_DATA_SIZE]; }; /* used for general mbx */ @@ -154,7 +154,7 @@ struct hclge_mbx_vf_to_pf_cmd { u8 rsv1[1]; u8 msg_len; u8 rsv2; - u16 match_id; + __le16 match_id; struct hclge_vf_to_pf_msg msg; };
@@ -165,7 +165,7 @@ struct hclge_mbx_pf_to_vf_cmd { u8 rsv[3]; u8 msg_len; u8 rsv1; - u16 match_id; + __le16 match_id; struct hclge_pf_to_vf_msg msg; };
@@ -175,6 +175,28 @@ struct hclge_vf_rst_cmd { u8 rsv[22]; };
+#pragma pack(1) +struct hclge_mbx_link_status { + __le16 link_status; + __le32 speed; + __le16 duplex; + u8 flag; +}; + +struct hclge_mbx_link_mode { + __le16 idx; + __le64 link_mode; +}; + +struct hclge_mbx_port_base_vlan { + __le16 state; + __le16 vlan_proto; + __le16 qos; + __le16 vlan_tag; +}; + +#pragma pack() + /* used by VF to store the received Async responses from PF */ struct hclgevf_mbx_arq_ring { #define HCLGE_MBX_MAX_ARQ_MSG_SIZE 8 @@ -183,7 +205,7 @@ struct hclgevf_mbx_arq_ring { u32 head; u32 tail; atomic_t count; - u16 msg_q[HCLGE_MBX_MAX_ARQ_MSG_NUM][HCLGE_MBX_MAX_ARQ_MSG_SIZE]; + __le16 msg_q[HCLGE_MBX_MAX_ARQ_MSG_NUM][HCLGE_MBX_MAX_ARQ_MSG_SIZE]; };
#define hclge_mbx_ring_ptr_move_crq(crq) \ diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c index dac4ac425481c..5182051e5414d 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c @@ -56,17 +56,19 @@ static int hclge_gen_resp_to_vf(struct hclge_vport *vport, resp_pf_to_vf->msg_len = vf_to_pf_req->msg_len; resp_pf_to_vf->match_id = vf_to_pf_req->match_id;
- resp_pf_to_vf->msg.code = HCLGE_MBX_PF_VF_RESP; - resp_pf_to_vf->msg.vf_mbx_msg_code = vf_to_pf_req->msg.code; - resp_pf_to_vf->msg.vf_mbx_msg_subcode = vf_to_pf_req->msg.subcode; + resp_pf_to_vf->msg.code = cpu_to_le16(HCLGE_MBX_PF_VF_RESP); + resp_pf_to_vf->msg.vf_mbx_msg_code = + cpu_to_le16(vf_to_pf_req->msg.code); + resp_pf_to_vf->msg.vf_mbx_msg_subcode = + cpu_to_le16(vf_to_pf_req->msg.subcode); resp = hclge_errno_to_resp(resp_msg->status); if (resp < SHRT_MAX) { - resp_pf_to_vf->msg.resp_status = resp; + resp_pf_to_vf->msg.resp_status = cpu_to_le16(resp); } else { dev_warn(&hdev->pdev->dev, "failed to send response to VF, response status %u is out-of-bound\n", resp); - resp_pf_to_vf->msg.resp_status = EIO; + resp_pf_to_vf->msg.resp_status = cpu_to_le16(EIO); }
if (resp_msg->len > 0) @@ -106,7 +108,7 @@ static int hclge_send_mbx_msg(struct hclge_vport *vport, u8 *msg, u16 msg_len,
resp_pf_to_vf->dest_vfid = dest_vfid; resp_pf_to_vf->msg_len = msg_len; - resp_pf_to_vf->msg.code = mbx_opcode; + resp_pf_to_vf->msg.code = cpu_to_le16(mbx_opcode);
memcpy(resp_pf_to_vf->msg.msg_data, msg, msg_len);
@@ -124,8 +126,8 @@ static int hclge_send_mbx_msg(struct hclge_vport *vport, u8 *msg, u16 msg_len, int hclge_inform_reset_assert_to_vf(struct hclge_vport *vport) { struct hclge_dev *hdev = vport->back; + __le16 msg_data; u16 reset_type; - u8 msg_data[2]; u8 dest_vfid;
BUILD_BUG_ON(HNAE3_MAX_RESET > U16_MAX); @@ -139,10 +141,10 @@ int hclge_inform_reset_assert_to_vf(struct hclge_vport *vport) else reset_type = HNAE3_VF_FUNC_RESET;
- memcpy(&msg_data[0], &reset_type, sizeof(u16)); + msg_data = cpu_to_le16(reset_type);
/* send this requested info to VF */ - return hclge_send_mbx_msg(vport, msg_data, sizeof(msg_data), + return hclge_send_mbx_msg(vport, (u8 *)&msg_data, sizeof(msg_data), HCLGE_MBX_ASSERTING_RESET, dest_vfid); }
@@ -338,16 +340,14 @@ int hclge_push_vf_port_base_vlan_info(struct hclge_vport *vport, u8 vfid, u16 state, struct hclge_vlan_info *vlan_info) { -#define MSG_DATA_SIZE 8 + struct hclge_mbx_port_base_vlan base_vlan;
- u8 msg_data[MSG_DATA_SIZE]; + base_vlan.state = cpu_to_le16(state); + base_vlan.vlan_proto = cpu_to_le16(vlan_info->vlan_proto); + base_vlan.qos = cpu_to_le16(vlan_info->qos); + base_vlan.vlan_tag = cpu_to_le16(vlan_info->vlan_tag);
- memcpy(&msg_data[0], &state, sizeof(u16)); - memcpy(&msg_data[2], &vlan_info->vlan_proto, sizeof(u16)); - memcpy(&msg_data[4], &vlan_info->qos, sizeof(u16)); - memcpy(&msg_data[6], &vlan_info->vlan_tag, sizeof(u16)); - - return hclge_send_mbx_msg(vport, msg_data, sizeof(msg_data), + return hclge_send_mbx_msg(vport, (u8 *)&base_vlan, sizeof(base_vlan), HCLGE_MBX_PUSH_VLAN_INFO, vfid); }
@@ -487,10 +487,9 @@ int hclge_push_vf_link_status(struct hclge_vport *vport) #define HCLGE_VF_LINK_STATE_UP 1U #define HCLGE_VF_LINK_STATE_DOWN 0U
+ struct hclge_mbx_link_status link_info; struct hclge_dev *hdev = vport->back; u16 link_status; - u8 msg_data[9]; - u16 duplex;
/* mac.link can only be 0 or 1 */ switch (vport->vf_info.link_state) { @@ -506,14 +505,13 @@ int hclge_push_vf_link_status(struct hclge_vport *vport) break; }
- duplex = hdev->hw.mac.duplex; - memcpy(&msg_data[0], &link_status, sizeof(u16)); - memcpy(&msg_data[2], &hdev->hw.mac.speed, sizeof(u32)); - memcpy(&msg_data[6], &duplex, sizeof(u16)); - msg_data[8] = HCLGE_MBX_PUSH_LINK_STATUS_EN; + link_info.link_status = cpu_to_le16(link_status); + link_info.speed = cpu_to_le32(hdev->hw.mac.speed); + link_info.duplex = cpu_to_le16(hdev->hw.mac.duplex); + link_info.flag = HCLGE_MBX_PUSH_LINK_STATUS_EN;
/* send this requested info to VF */ - return hclge_send_mbx_msg(vport, msg_data, sizeof(msg_data), + return hclge_send_mbx_msg(vport, (u8 *)&link_info, sizeof(link_info), HCLGE_MBX_LINK_STAT_CHANGE, vport->vport_id); }
@@ -521,22 +519,22 @@ static void hclge_get_link_mode(struct hclge_vport *vport, struct hclge_mbx_vf_to_pf_cmd *mbx_req) { #define HCLGE_SUPPORTED 1 + struct hclge_mbx_link_mode link_mode; struct hclge_dev *hdev = vport->back; unsigned long advertising; unsigned long supported; unsigned long send_data; - u8 msg_data[10] = {}; u8 dest_vfid;
advertising = hdev->hw.mac.advertising[0]; supported = hdev->hw.mac.supported[0]; dest_vfid = mbx_req->mbx_src_vfid; - msg_data[0] = mbx_req->msg.data[0]; - - send_data = msg_data[0] == HCLGE_SUPPORTED ? supported : advertising; + send_data = mbx_req->msg.data[0] == HCLGE_SUPPORTED ? supported : + advertising; + link_mode.idx = cpu_to_le16((u16)mbx_req->msg.data[0]); + link_mode.link_mode = cpu_to_le64(send_data);
- memcpy(&msg_data[2], &send_data, sizeof(unsigned long)); - hclge_send_mbx_msg(vport, msg_data, sizeof(msg_data), + hclge_send_mbx_msg(vport, (u8 *)&link_mode, sizeof(link_mode), HCLGE_MBX_LINK_STAT_MODE, dest_vfid); }
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_trace.h b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_trace.h index 5b0b71bd61200..8510b88d49820 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_trace.h +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_trace.h @@ -62,7 +62,7 @@ TRACE_EVENT(hclge_pf_mbx_send,
TP_fast_assign( __entry->vfid = req->dest_vfid; - __entry->code = req->msg.code; + __entry->code = le16_to_cpu(req->msg.code); __assign_str(pciname, pci_name(hdev->pdev)); __assign_str(devname, &hdev->vport[0].nic.kinfo.netdev->name); memcpy(__entry->mbx_data, req, diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c index 8adc682f624f9..69913af880a40 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c @@ -3816,7 +3816,7 @@ static void hclgevf_get_regs(struct hnae3_handle *handle, u32 *version, }
void hclgevf_update_port_base_vlan_info(struct hclgevf_dev *hdev, u16 state, - u8 *port_base_vlan_info, u8 data_size) + struct hclge_mbx_port_base_vlan *port_base_vlan) { struct hnae3_handle *nic = &hdev->nic; struct hclge_vf_to_pf_msg send_msg; @@ -3841,7 +3841,7 @@ void hclgevf_update_port_base_vlan_info(struct hclgevf_dev *hdev, u16 state, /* send msg to PF and wait update port based vlan info */ hclgevf_build_send_msg(&send_msg, HCLGE_MBX_SET_VLAN, HCLGE_MBX_PORT_BASE_VLAN_CFG); - memcpy(send_msg.data, port_base_vlan_info, data_size); + memcpy(send_msg.data, port_base_vlan, sizeof(*port_base_vlan)); ret = hclgevf_send_mbx_msg(hdev, &send_msg, false, NULL, 0); if (!ret) { if (state == HNAE3_PORT_BASE_VLAN_DISABLE) diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.h b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.h index f6f736c0091c0..e16068264fa77 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.h +++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.h @@ -355,5 +355,5 @@ void hclgevf_update_speed_duplex(struct hclgevf_dev *hdev, u32 speed, void hclgevf_reset_task_schedule(struct hclgevf_dev *hdev); void hclgevf_mbx_task_schedule(struct hclgevf_dev *hdev); void hclgevf_update_port_base_vlan_info(struct hclgevf_dev *hdev, u16 state, - u8 *port_base_vlan_info, u8 data_size); + struct hclge_mbx_port_base_vlan *port_base_vlan); #endif diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_mbx.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_mbx.c index c5ac6ecf36e10..df6e9b8b26e0f 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_mbx.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_mbx.c @@ -121,7 +121,7 @@ int hclgevf_send_mbx_msg(struct hclgevf_dev *hdev, if (need_resp) { mutex_lock(&hdev->mbx_resp.mbx_mutex); hclgevf_reset_mbx_resp_status(hdev); - req->match_id = hdev->mbx_resp.match_id; + req->match_id = cpu_to_le16(hdev->mbx_resp.match_id); status = hclgevf_cmd_send(&hdev->hw, &desc, 1); if (status) { dev_err(&hdev->pdev->dev, @@ -159,27 +159,29 @@ static bool hclgevf_cmd_crq_empty(struct hclgevf_hw *hw) static void hclgevf_handle_mbx_response(struct hclgevf_dev *hdev, struct hclge_mbx_pf_to_vf_cmd *req) { + u16 vf_mbx_msg_subcode = le16_to_cpu(req->msg.vf_mbx_msg_subcode); + u16 vf_mbx_msg_code = le16_to_cpu(req->msg.vf_mbx_msg_code); struct hclgevf_mbx_resp_status *resp = &hdev->mbx_resp; + u16 resp_status = le16_to_cpu(req->msg.resp_status); + u16 match_id = le16_to_cpu(req->match_id);
if (resp->received_resp) dev_warn(&hdev->pdev->dev, - "VF mbx resp flag not clear(%u)\n", - req->msg.vf_mbx_msg_code); - - resp->origin_mbx_msg = - (req->msg.vf_mbx_msg_code << 16); - resp->origin_mbx_msg |= req->msg.vf_mbx_msg_subcode; - resp->resp_status = - hclgevf_resp_to_errno(req->msg.resp_status); + "VF mbx resp flag not clear(%u)\n", + vf_mbx_msg_code); + + resp->origin_mbx_msg = (vf_mbx_msg_code << 16); + resp->origin_mbx_msg |= vf_mbx_msg_subcode; + resp->resp_status = hclgevf_resp_to_errno(resp_status); memcpy(resp->additional_info, req->msg.resp_data, HCLGE_MBX_MAX_RESP_DATA_SIZE * sizeof(u8)); - if (req->match_id) { + if (match_id) { /* If match_id is not zero, it means PF support match_id. * if the match_id is right, VF get the right response, or * ignore the response. and driver will clear hdev->mbx_resp * when send next message which need response. */ - if (req->match_id == resp->match_id) + if (match_id == resp->match_id) resp->received_resp = true; } else { resp->received_resp = true; @@ -196,7 +198,7 @@ static void hclgevf_handle_mbx_msg(struct hclgevf_dev *hdev, HCLGE_MBX_MAX_ARQ_MSG_NUM) { dev_warn(&hdev->pdev->dev, "Async Q full, dropping msg(%u)\n", - req->msg.code); + le16_to_cpu(req->msg.code)); return; }
@@ -215,6 +217,7 @@ void hclgevf_mbx_handler(struct hclgevf_dev *hdev) struct hclgevf_cmq_ring *crq; struct hclgevf_desc *desc; u16 flag; + u16 code;
crq = &hdev->hw.cmq.crq;
@@ -228,10 +231,11 @@ void hclgevf_mbx_handler(struct hclgevf_dev *hdev) req = (struct hclge_mbx_pf_to_vf_cmd *)desc->data;
flag = le16_to_cpu(crq->desc[crq->next_to_use].flag); + code = le16_to_cpu(req->msg.code); if (unlikely(!hnae3_get_bit(flag, HCLGEVF_CMDQ_RX_OUTVLD_B))) { dev_warn(&hdev->pdev->dev, "dropped invalid mailbox message, code = %u\n", - req->msg.code); + code);
/* dropping/not processing this invalid message */ crq->desc[crq->next_to_use].flag = 0; @@ -247,7 +251,7 @@ void hclgevf_mbx_handler(struct hclgevf_dev *hdev) * timeout and simultaneously queue the async messages for later * prcessing in context of mailbox task i.e. the slow path. */ - switch (req->msg.code) { + switch (code) { case HCLGE_MBX_PF_VF_RESP: hclgevf_handle_mbx_response(hdev, req); break; @@ -261,7 +265,7 @@ void hclgevf_mbx_handler(struct hclgevf_dev *hdev) default: dev_err(&hdev->pdev->dev, "VF received unsupported(%u) mbx msg from PF\n", - req->msg.code); + code); break; } crq->desc[crq->next_to_use].flag = 0; @@ -283,14 +287,18 @@ static void hclgevf_parse_promisc_info(struct hclgevf_dev *hdev,
void hclgevf_mbx_async_handler(struct hclgevf_dev *hdev) { + struct hclge_mbx_port_base_vlan *vlan_info; + struct hclge_mbx_link_status *link_info; + struct hclge_mbx_link_mode *link_mode; enum hnae3_reset_type reset_type; u16 link_status, state; - u16 *msg_q, *vlan_info; + __le16 *msg_q; + u16 opcode; u8 duplex; u32 speed; u32 tail; u8 flag; - u8 idx; + u16 idx;
tail = hdev->arq.tail;
@@ -303,13 +311,14 @@ void hclgevf_mbx_async_handler(struct hclgevf_dev *hdev) }
msg_q = hdev->arq.msg_q[hdev->arq.head]; - - switch (msg_q[0]) { + opcode = le16_to_cpu(msg_q[0]); + switch (opcode) { case HCLGE_MBX_LINK_STAT_CHANGE: - link_status = msg_q[1]; - memcpy(&speed, &msg_q[2], sizeof(speed)); - duplex = (u8)msg_q[4]; - flag = (u8)msg_q[5]; + link_info = (struct hclge_mbx_link_status *)(msg_q + 1); + link_status = le16_to_cpu(link_info->link_status); + speed = le32_to_cpu(link_info->speed); + duplex = (u8)le16_to_cpu(link_info->duplex); + flag = link_info->flag;
/* update upper layer with new link link status */ hclgevf_update_speed_duplex(hdev, speed, duplex); @@ -321,13 +330,14 @@ void hclgevf_mbx_async_handler(struct hclgevf_dev *hdev)
break; case HCLGE_MBX_LINK_STAT_MODE: - idx = (u8)msg_q[1]; + link_mode = (struct hclge_mbx_link_mode *)(msg_q + 1); + idx = le16_to_cpu(link_mode->idx); if (idx) - memcpy(&hdev->hw.mac.supported, &msg_q[2], - sizeof(unsigned long)); + hdev->hw.mac.supported = + le64_to_cpu(link_mode->link_mode); else - memcpy(&hdev->hw.mac.advertising, &msg_q[2], - sizeof(unsigned long)); + hdev->hw.mac.advertising = + le64_to_cpu(link_mode->link_mode); break; case HCLGE_MBX_ASSERTING_RESET: /* PF has asserted reset hence VF should go in pending @@ -335,25 +345,27 @@ void hclgevf_mbx_async_handler(struct hclgevf_dev *hdev) * has been completely reset. After this stack should * eventually be re-initialized. */ - reset_type = (enum hnae3_reset_type)msg_q[1]; + reset_type = + (enum hnae3_reset_type)le16_to_cpu(msg_q[1]); set_bit(reset_type, &hdev->reset_pending); set_bit(HCLGEVF_RESET_PENDING, &hdev->reset_state); hclgevf_reset_task_schedule(hdev);
break; case HCLGE_MBX_PUSH_VLAN_INFO: - state = msg_q[1]; - vlan_info = &msg_q[1]; + vlan_info = + (struct hclge_mbx_port_base_vlan *)(msg_q + 1); + state = le16_to_cpu(vlan_info->state); hclgevf_update_port_base_vlan_info(hdev, state, - (u8 *)vlan_info, 8); + vlan_info); break; case HCLGE_MBX_PUSH_PROMISC_INFO: - hclgevf_parse_promisc_info(hdev, msg_q[1]); + hclgevf_parse_promisc_info(hdev, le16_to_cpu(msg_q[1])); break; default: dev_err(&hdev->pdev->dev, "fetched unsupported(%u) message from arq\n", - msg_q[0]); + opcode); break; }
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_trace.h b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_trace.h index e4bfb6191fef5..5d4895bb57a17 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_trace.h +++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_trace.h @@ -29,7 +29,7 @@ TRACE_EVENT(hclge_vf_mbx_get,
TP_fast_assign( __entry->vfid = req->dest_vfid; - __entry->code = req->msg.code; + __entry->code = le16_to_cpu(req->msg.code); __assign_str(pciname, pci_name(hdev->pdev)); __assign_str(devname, &hdev->nic.kinfo.netdev->name); memcpy(__entry->mbx_data, req,
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yonglong Liu liuyonglong@huawei.com
[ Upstream commit ac92c0a9a0603fb448e60f38e63302e4eebb8035 ]
In hclgevf_mbx_handler() and hclgevf_get_mbx_resp() functions, there is a typical store-store and load-load scenario between received_resp and additional_info. This patch adds barrier to fix the problem.
Fixes: 4671042f1ef0 ("net: hns3: add match_id to check mailbox response from PF to VF") Signed-off-by: Yonglong Liu liuyonglong@huawei.com Signed-off-by: Jijie Shao shaojijie@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_mbx.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_mbx.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_mbx.c index df6e9b8b26e0f..608a14fc27acc 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_mbx.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_mbx.c @@ -60,6 +60,9 @@ static int hclgevf_get_mbx_resp(struct hclgevf_dev *hdev, u16 code0, u16 code1, i++; }
+ /* ensure additional_info will be seen after received_resp */ + smp_rmb(); + if (i >= HCLGEVF_MAX_TRY_TIMES) { dev_err(&hdev->pdev->dev, "VF could not get mbx(%u,%u) resp(=%d) from PF in %d tries\n", @@ -175,6 +178,10 @@ static void hclgevf_handle_mbx_response(struct hclgevf_dev *hdev, resp->resp_status = hclgevf_resp_to_errno(resp_status); memcpy(resp->additional_info, req->msg.resp_data, HCLGE_MBX_MAX_RESP_DATA_SIZE * sizeof(u8)); + + /* ensure additional_info will be seen before setting received_resp */ + smp_wmb(); + if (match_id) { /* If match_id is not zero, it means PF support match_id. * if the match_id is right, VF get the right response, or
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jian Shen shenjian15@huawei.com
[ Upstream commit 75b247b57d8b71bcb679e4cb37d0db104848806c ]
Currently, the FEC capability bit is default set for device version V2. It's incorrect for the copper port. Eventhough it doesn't make the nic work abnormal, but the capability information display in debugfs may confuse user. So clear it when driver get the port type inforamtion.
Fixes: 433ccce83504 ("net: hns3: use FEC capability queried from firmware") Signed-off-by: Jian Shen shenjian15@huawei.com Signed-off-by: Jijie Shao shaojijie@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c index dba3cf15b48e1..7c28c74c1de92 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c @@ -11713,6 +11713,7 @@ static int hclge_init_ae_dev(struct hnae3_ae_dev *ae_dev) goto err_msi_irq_uninit;
if (hdev->hw.mac.media_type == HNAE3_MEDIA_TYPE_COPPER) { + clear_bit(HNAE3_DEV_SUPPORT_FEC_B, ae_dev->caps); if (hnae3_dev_phy_imp_supported(hdev)) ret = hclge_update_tp_port_info(hdev); else
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yonglong Liu liuyonglong@huawei.com
[ Upstream commit dbd2f3b20c6ae425665b6975d766e3653d453e73 ]
When a VF is calling hns3_init_mac_addr(), get_mac_addr() may return fail, then the value of mac_addr_temp is not initialized.
Fixes: 76ad4f0ee747 ("net: hns3: Add support of HNS3 Ethernet Driver for hip08 SoC") Signed-off-by: Yonglong Liu liuyonglong@huawei.com Signed-off-by: Jijie Shao shaojijie@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c index fde1ff3580458..60e610ab976d4 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c @@ -4915,7 +4915,7 @@ static int hns3_init_mac_addr(struct net_device *netdev) struct hns3_nic_priv *priv = netdev_priv(netdev); char format_mac_addr[HNAE3_FORMAT_MAC_ADDR_LEN]; struct hnae3_handle *h = priv->ae_handle; - u8 mac_addr_temp[ETH_ALEN]; + u8 mac_addr_temp[ETH_ALEN] = {0}; int ret = 0;
if (h->ae_algo->ops->get_mac_addr)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jijie Shao shaojijie@huawei.com
[ Upstream commit 65e98bb56fa3ce2edb400930c05238c9b380500e ]
Currently the reset process in hns3 and firmware watchdog init process is asynchronous. We think firmware watchdog initialization is completed before VF clear the interrupt source. However, firmware initialization may not complete early. So VF will receive multiple reset interrupts and fail to reset.
So we add delay before VF interrupt source and 5 ms delay is enough to avoid second reset interrupt.
Fixes: 427900d27d86 ("net: hns3: fix the timing issue of VF clearing interrupt sources") Signed-off-by: Jijie Shao shaojijie@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 14 +++++++++++++- .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.h | 1 + 2 files changed, 14 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c index 69913af880a40..880feeac06375 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c @@ -2487,8 +2487,18 @@ static enum hclgevf_evt_cause hclgevf_check_evt_cause(struct hclgevf_dev *hdev, return HCLGEVF_VECTOR0_EVENT_OTHER; }
+static void hclgevf_reset_timer(struct timer_list *t) +{ + struct hclgevf_dev *hdev = from_timer(hdev, t, reset_timer); + + hclgevf_clear_event_cause(hdev, HCLGEVF_VECTOR0_EVENT_RST); + hclgevf_reset_task_schedule(hdev); +} + static irqreturn_t hclgevf_misc_irq_handle(int irq, void *data) { +#define HCLGEVF_RESET_DELAY 5 + enum hclgevf_evt_cause event_cause; struct hclgevf_dev *hdev = data; u32 clearval; @@ -2500,7 +2510,8 @@ static irqreturn_t hclgevf_misc_irq_handle(int irq, void *data)
switch (event_cause) { case HCLGEVF_VECTOR0_EVENT_RST: - hclgevf_reset_task_schedule(hdev); + mod_timer(&hdev->reset_timer, + jiffies + msecs_to_jiffies(HCLGEVF_RESET_DELAY)); break; case HCLGEVF_VECTOR0_EVENT_MBX: hclgevf_mbx_handler(hdev); @@ -3477,6 +3488,7 @@ static int hclgevf_init_hdev(struct hclgevf_dev *hdev) HCLGEVF_DRIVER_NAME);
hclgevf_task_schedule(hdev, round_jiffies_relative(HZ)); + timer_setup(&hdev->reset_timer, hclgevf_reset_timer, 0);
return 0;
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.h b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.h index e16068264fa77..5c7538ca36a76 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.h +++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.h @@ -281,6 +281,7 @@ struct hclgevf_dev { enum hnae3_reset_type reset_level; unsigned long reset_pending; enum hnae3_reset_type reset_type; + struct timer_list reset_timer;
#define HCLGEVF_RESET_REQUESTED 0 #define HCLGEVF_RESET_PENDING 1
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jijie Shao shaojijie@huawei.com
[ Upstream commit dff655e82faffc287d4a72a59f66fa120bf904e4 ]
If PF is down, firmware will returns 10 Mbit/s rate and half-duplex mode when PF queries the port information from firmware.
After imp reset command is executed, PF status changes to down, and PF will query link status and updates port information from firmware in a periodic scheduled task.
However, there is a low probability that port information is updated when PF is down, and then PF link status changes to up. In this case, PF synchronizes incorrect rate and duplex mode to VF.
This patch fixes it by updating port information before PF synchronizes the rate and duplex to the VF when PF changes to up.
Fixes: 18b6e31f8bf4 ("net: hns3: PF add support for pushing link status to VFs") Signed-off-by: Jijie Shao shaojijie@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c index 7c28c74c1de92..9e33f0f0b75dd 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c @@ -72,6 +72,7 @@ static void hclge_sync_promisc_mode(struct hclge_dev *hdev); static void hclge_sync_fd_table(struct hclge_dev *hdev); static int hclge_mac_link_status_wait(struct hclge_dev *hdev, int link_ret, int wait_cnt); +static int hclge_update_port_info(struct hclge_dev *hdev);
static struct hnae3_ae_algo ae_algo;
@@ -2950,6 +2951,9 @@ static void hclge_update_link_status(struct hclge_dev *hdev)
if (state != hdev->hw.mac.link) { hdev->hw.mac.link = state; + if (state == HCLGE_LINK_STATUS_UP) + hclge_update_port_info(hdev); + client->ops->link_status_change(handle, state); hclge_config_mac_tnl_int(hdev, state); if (rclient && rclient->ops->link_status_change)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shigeru Yoshida syoshida@redhat.com
[ Upstream commit fb317eb23b5ee4c37b0656a9a52a3db58d9dd072 ]
KMSAN reported the following kernel-infoleak issue:
===================================================== BUG: KMSAN: kernel-infoleak in instrument_copy_to_user include/linux/instrumented.h:114 [inline] BUG: KMSAN: kernel-infoleak in copy_to_user_iter lib/iov_iter.c:24 [inline] BUG: KMSAN: kernel-infoleak in iterate_ubuf include/linux/iov_iter.h:29 [inline] BUG: KMSAN: kernel-infoleak in iterate_and_advance2 include/linux/iov_iter.h:245 [inline] BUG: KMSAN: kernel-infoleak in iterate_and_advance include/linux/iov_iter.h:271 [inline] BUG: KMSAN: kernel-infoleak in _copy_to_iter+0x4ec/0x2bc0 lib/iov_iter.c:186 instrument_copy_to_user include/linux/instrumented.h:114 [inline] copy_to_user_iter lib/iov_iter.c:24 [inline] iterate_ubuf include/linux/iov_iter.h:29 [inline] iterate_and_advance2 include/linux/iov_iter.h:245 [inline] iterate_and_advance include/linux/iov_iter.h:271 [inline] _copy_to_iter+0x4ec/0x2bc0 lib/iov_iter.c:186 copy_to_iter include/linux/uio.h:197 [inline] simple_copy_to_iter net/core/datagram.c:532 [inline] __skb_datagram_iter.5+0x148/0xe30 net/core/datagram.c:420 skb_copy_datagram_iter+0x52/0x210 net/core/datagram.c:546 skb_copy_datagram_msg include/linux/skbuff.h:3960 [inline] netlink_recvmsg+0x43d/0x1630 net/netlink/af_netlink.c:1967 sock_recvmsg_nosec net/socket.c:1044 [inline] sock_recvmsg net/socket.c:1066 [inline] __sys_recvfrom+0x476/0x860 net/socket.c:2246 __do_sys_recvfrom net/socket.c:2264 [inline] __se_sys_recvfrom net/socket.c:2260 [inline] __x64_sys_recvfrom+0x130/0x200 net/socket.c:2260 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b
Uninit was created at: slab_post_alloc_hook+0x103/0x9e0 mm/slab.h:768 slab_alloc_node mm/slub.c:3478 [inline] kmem_cache_alloc_node+0x5f7/0xb50 mm/slub.c:3523 kmalloc_reserve+0x13c/0x4a0 net/core/skbuff.c:560 __alloc_skb+0x2fd/0x770 net/core/skbuff.c:651 alloc_skb include/linux/skbuff.h:1286 [inline] tipc_tlv_alloc net/tipc/netlink_compat.c:156 [inline] tipc_get_err_tlv+0x90/0x5d0 net/tipc/netlink_compat.c:170 tipc_nl_compat_recv+0x1042/0x15d0 net/tipc/netlink_compat.c:1324 genl_family_rcv_msg_doit net/netlink/genetlink.c:972 [inline] genl_family_rcv_msg net/netlink/genetlink.c:1052 [inline] genl_rcv_msg+0x1220/0x12c0 net/netlink/genetlink.c:1067 netlink_rcv_skb+0x4a4/0x6a0 net/netlink/af_netlink.c:2545 genl_rcv+0x41/0x60 net/netlink/genetlink.c:1076 netlink_unicast_kernel net/netlink/af_netlink.c:1342 [inline] netlink_unicast+0xf4b/0x1230 net/netlink/af_netlink.c:1368 netlink_sendmsg+0x1242/0x1420 net/netlink/af_netlink.c:1910 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg net/socket.c:745 [inline] ____sys_sendmsg+0x997/0xd60 net/socket.c:2588 ___sys_sendmsg+0x271/0x3b0 net/socket.c:2642 __sys_sendmsg net/socket.c:2671 [inline] __do_sys_sendmsg net/socket.c:2680 [inline] __se_sys_sendmsg net/socket.c:2678 [inline] __x64_sys_sendmsg+0x2fa/0x4a0 net/socket.c:2678 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b
Bytes 34-35 of 36 are uninitialized Memory access of size 36 starts at ffff88802d464a00 Data copied to user address 00007ff55033c0a0
CPU: 0 PID: 30322 Comm: syz-executor.0 Not tainted 6.6.0-14500-g1c41041124bd #10 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-1.fc38 04/01/2014 =====================================================
tipc_add_tlv() puts TLV descriptor and value onto `skb`. This size is calculated with TLV_SPACE() macro. It adds the size of struct tlv_desc and the length of TLV value passed as an argument, and aligns the result to a multiple of TLV_ALIGNTO, i.e., a multiple of 4 bytes.
If the size of struct tlv_desc plus the length of TLV value is not aligned, the current implementation leaves the remaining bytes uninitialized. This is the cause of the above kernel-infoleak issue.
This patch resolves this issue by clearing data up to an aligned size.
Fixes: d0796d1ef63d ("tipc: convert legacy nl bearer dump to nl compat") Signed-off-by: Shigeru Yoshida syoshida@redhat.com Reviewed-by: Simon Horman horms@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/tipc/netlink_compat.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/net/tipc/netlink_compat.c b/net/tipc/netlink_compat.c index ce00f271ca6b2..116a97e301443 100644 --- a/net/tipc/netlink_compat.c +++ b/net/tipc/netlink_compat.c @@ -101,6 +101,7 @@ static int tipc_add_tlv(struct sk_buff *skb, u16 type, void *data, u16 len) return -EMSGSIZE;
skb_put(skb, TLV_SPACE(len)); + memset(tlv, 0, TLV_SPACE(len)); tlv->tlv_type = htons(type); tlv->tlv_len = htons(TLV_LENGTH(len)); if (len && data)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Willem de Bruijn willemb@google.com
[ Upstream commit c0a2a1b0d631fc460d830f52d06211838874d655 ]
ppp_sync_ioctl allows setting device MRU, but does not sanity check this input.
Limit to a sane upper bound of 64KB.
No implementation I could find generates larger than 64KB frames. RFC 2823 mentions an upper bound of PPP over SDL of 64KB based on the 16-bit length field. Other protocols will be smaller, such as PPPoE (9KB jumbo frame) and PPPoA (18190 maximum CPCS-SDU size, RFC 2364). PPTP and L2TP encapsulate in IP.
Syzbot managed to trigger alloc warning in __alloc_pages:
if (WARN_ON_ONCE_GFP(order > MAX_ORDER, gfp))
WARNING: CPU: 1 PID: 37 at mm/page_alloc.c:4544 __alloc_pages+0x3ab/0x4a0 mm/page_alloc.c:4544
__alloc_skb+0x12b/0x330 net/core/skbuff.c:651 __netdev_alloc_skb+0x72/0x3f0 net/core/skbuff.c:715 netdev_alloc_skb include/linux/skbuff.h:3225 [inline] dev_alloc_skb include/linux/skbuff.h:3238 [inline] ppp_sync_input drivers/net/ppp/ppp_synctty.c:669 [inline] ppp_sync_receive+0xff/0x680 drivers/net/ppp/ppp_synctty.c:334 tty_ldisc_receive_buf+0x14c/0x180 drivers/tty/tty_buffer.c:390 tty_port_default_receive_buf+0x70/0xb0 drivers/tty/tty_port.c:37 receive_buf drivers/tty/tty_buffer.c:444 [inline] flush_to_ldisc+0x261/0x780 drivers/tty/tty_buffer.c:494 process_one_work+0x884/0x15c0 kernel/workqueue.c:2630
With call
ioctl$PPPIOCSMRU1(r1, 0x40047452, &(0x7f0000000100)=0x5e6417a8)
Similar code exists in other drivers that implement ppp_channel_ops ioctl PPPIOCSMRU. Those might also be in scope. Notably excluded from this are pppol2tp_ioctl and pppoe_ioctl.
This code goes back to the start of git history.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Reported-by: syzbot+6177e1f90d92583bcc58@syzkaller.appspotmail.com Signed-off-by: Willem de Bruijn willemb@google.com Reviewed-by: Eric Dumazet edumazet@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ppp/ppp_synctty.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/net/ppp/ppp_synctty.c b/drivers/net/ppp/ppp_synctty.c index e37faed81937f..692c558beed54 100644 --- a/drivers/net/ppp/ppp_synctty.c +++ b/drivers/net/ppp/ppp_synctty.c @@ -464,6 +464,10 @@ ppp_sync_ioctl(struct ppp_channel *chan, unsigned int cmd, unsigned long arg) case PPPIOCSMRU: if (get_user(val, (int __user *) argp)) break; + if (val > U16_MAX) { + err = -EINVAL; + break; + } if (val < PPP_MRU) val = PPP_MRU; ap->mru = val;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Juergen Gross jgross@suse.com
[ Upstream commit 47d970204054f859f35a2237baa75c2d84fcf436 ]
When delaying eoi handling of events, the related elements are queued into the percpu lateeoi list. In case the list isn't empty, the elements should be sorted by the time when eoi handling is to happen.
Unfortunately a new element will never be queued at the start of the list, even if it has a handling time lower than all other list elements.
Fix that by handling that case the same way as for an empty list.
Fixes: e99502f76271 ("xen/events: defer eoi in case of excessive number of events") Reported-by: Jan Beulich jbeulich@suse.com Signed-off-by: Juergen Gross jgross@suse.com Reviewed-by: Oleksandr Tyshchenko oleksandr_tyshchenko@epam.com Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/xen/events/events_base.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c index 9339f2aad5679..ee691b20d4a3f 100644 --- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -599,7 +599,9 @@ static void lateeoi_list_add(struct irq_info *info)
spin_lock_irqsave(&eoi->eoi_list_lock, flags);
- if (list_empty(&eoi->eoi_list)) { + elem = list_first_entry_or_null(&eoi->eoi_list, struct irq_info, + eoi_list); + if (!elem || info->eoi_time < elem->eoi_time) { list_add(&info->eoi_list, &eoi->eoi_list); mod_delayed_work_on(info->eoi_cpu, system_wq, &eoi->delayed, delay);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 73bde5a3294853947252cd9092a3517c7cb0cd2d ]
As I was working on a syzbot report, I found that KCSAN would probably complain that reading q->head or q->tail without barriers could lead to invalid results.
Add corresponding READ_ONCE() and WRITE_ONCE() to avoid load-store tearing.
Fixes: d94ba80ebbea ("ptp: Added a brand new class driver for ptp clocks.") Signed-off-by: Eric Dumazet edumazet@google.com Acked-by: Richard Cochran richardcochran@gmail.com Link: https://lore.kernel.org/r/20231109174859.3995880-1-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ptp/ptp_chardev.c | 3 ++- drivers/ptp/ptp_clock.c | 5 +++-- drivers/ptp/ptp_private.h | 8 ++++++-- drivers/ptp/ptp_sysfs.c | 3 ++- 4 files changed, 13 insertions(+), 6 deletions(-)
diff --git a/drivers/ptp/ptp_chardev.c b/drivers/ptp/ptp_chardev.c index af3bc65c4595d..9311f3d09c8fc 100644 --- a/drivers/ptp/ptp_chardev.c +++ b/drivers/ptp/ptp_chardev.c @@ -487,7 +487,8 @@ ssize_t ptp_read(struct posix_clock *pc,
for (i = 0; i < cnt; i++) { event[i] = queue->buf[queue->head]; - queue->head = (queue->head + 1) % PTP_MAX_TIMESTAMPS; + /* Paired with READ_ONCE() in queue_cnt() */ + WRITE_ONCE(queue->head, (queue->head + 1) % PTP_MAX_TIMESTAMPS); }
spin_unlock_irqrestore(&queue->lock, flags); diff --git a/drivers/ptp/ptp_clock.c b/drivers/ptp/ptp_clock.c index 8a652a367625b..e70c6dec3a3a3 100644 --- a/drivers/ptp/ptp_clock.c +++ b/drivers/ptp/ptp_clock.c @@ -56,10 +56,11 @@ static void enqueue_external_timestamp(struct timestamp_event_queue *queue, dst->t.sec = seconds; dst->t.nsec = remainder;
+ /* Both WRITE_ONCE() are paired with READ_ONCE() in queue_cnt() */ if (!queue_free(queue)) - queue->head = (queue->head + 1) % PTP_MAX_TIMESTAMPS; + WRITE_ONCE(queue->head, (queue->head + 1) % PTP_MAX_TIMESTAMPS);
- queue->tail = (queue->tail + 1) % PTP_MAX_TIMESTAMPS; + WRITE_ONCE(queue->tail, (queue->tail + 1) % PTP_MAX_TIMESTAMPS);
spin_unlock_irqrestore(&queue->lock, flags); } diff --git a/drivers/ptp/ptp_private.h b/drivers/ptp/ptp_private.h index dba6be4770670..b336c12bb6976 100644 --- a/drivers/ptp/ptp_private.h +++ b/drivers/ptp/ptp_private.h @@ -74,9 +74,13 @@ struct ptp_vclock { * that a writer might concurrently increment the tail does not * matter, since the queue remains nonempty nonetheless. */ -static inline int queue_cnt(struct timestamp_event_queue *q) +static inline int queue_cnt(const struct timestamp_event_queue *q) { - int cnt = q->tail - q->head; + /* + * Paired with WRITE_ONCE() in enqueue_external_timestamp(), + * ptp_read(), extts_fifo_show(). + */ + int cnt = READ_ONCE(q->tail) - READ_ONCE(q->head); return cnt < 0 ? PTP_MAX_TIMESTAMPS + cnt : cnt; }
diff --git a/drivers/ptp/ptp_sysfs.c b/drivers/ptp/ptp_sysfs.c index 9233bfedeb174..0bdfdd4bb0fa2 100644 --- a/drivers/ptp/ptp_sysfs.c +++ b/drivers/ptp/ptp_sysfs.c @@ -79,7 +79,8 @@ static ssize_t extts_fifo_show(struct device *dev, qcnt = queue_cnt(queue); if (qcnt) { event = queue->buf[queue->head]; - queue->head = (queue->head + 1) % PTP_MAX_TIMESTAMPS; + /* Paired with READ_ONCE() in queue_cnt() */ + WRITE_ONCE(queue->head, (queue->head + 1) % PTP_MAX_TIMESTAMPS); } spin_unlock_irqrestore(&queue->lock, flags);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 3cffa2ddc4d3fcf70cde361236f5a614f81a09b2 ]
Commit 9eed321cde22 ("net: lapbether: only support ethernet devices") has been able to keep syzbot away from net/lapb, until today.
In the following splat [1], the issue is that a lapbether device has been created on a bonding device without members. Then adding a non ARPHRD_ETHER member forced the bonding master to change its type.
The fix is to make sure we call dev_close() in bond_setup_by_slave() so that the potential linked lapbether devices (or any other devices having assumptions on the physical device) are removed.
A similar bug has been addressed in commit 40baec225765 ("bonding: fix panic on non-ARPHRD_ETHER enslave failure")
[1] skbuff: skb_under_panic: text:ffff800089508810 len:44 put:40 head:ffff0000c78e7c00 data:ffff0000c78e7bea tail:0x16 end:0x140 dev:bond0 kernel BUG at net/core/skbuff.c:192 ! Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP Modules linked in: CPU: 0 PID: 6007 Comm: syz-executor383 Not tainted 6.6.0-rc3-syzkaller-gbf6547d8715b #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/04/2023 pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : skb_panic net/core/skbuff.c:188 [inline] pc : skb_under_panic+0x13c/0x140 net/core/skbuff.c:202 lr : skb_panic net/core/skbuff.c:188 [inline] lr : skb_under_panic+0x13c/0x140 net/core/skbuff.c:202 sp : ffff800096a06aa0 x29: ffff800096a06ab0 x28: ffff800096a06ba0 x27: dfff800000000000 x26: ffff0000ce9b9b50 x25: 0000000000000016 x24: ffff0000c78e7bea x23: ffff0000c78e7c00 x22: 000000000000002c x21: 0000000000000140 x20: 0000000000000028 x19: ffff800089508810 x18: ffff800096a06100 x17: 0000000000000000 x16: ffff80008a629a3c x15: 0000000000000001 x14: 1fffe00036837a32 x13: 0000000000000000 x12: 0000000000000000 x11: 0000000000000201 x10: 0000000000000000 x9 : cb50b496c519aa00 x8 : cb50b496c519aa00 x7 : 0000000000000001 x6 : 0000000000000001 x5 : ffff800096a063b8 x4 : ffff80008e280f80 x3 : ffff8000805ad11c x2 : 0000000000000001 x1 : 0000000100000201 x0 : 0000000000000086 Call trace: skb_panic net/core/skbuff.c:188 [inline] skb_under_panic+0x13c/0x140 net/core/skbuff.c:202 skb_push+0xf0/0x108 net/core/skbuff.c:2446 ip6gre_header+0xbc/0x738 net/ipv6/ip6_gre.c:1384 dev_hard_header include/linux/netdevice.h:3136 [inline] lapbeth_data_transmit+0x1c4/0x298 drivers/net/wan/lapbether.c:257 lapb_data_transmit+0x8c/0xb0 net/lapb/lapb_iface.c:447 lapb_transmit_buffer+0x178/0x204 net/lapb/lapb_out.c:149 lapb_send_control+0x220/0x320 net/lapb/lapb_subr.c:251 __lapb_disconnect_request+0x9c/0x17c net/lapb/lapb_iface.c:326 lapb_device_event+0x288/0x4e0 net/lapb/lapb_iface.c:492 notifier_call_chain+0x1a4/0x510 kernel/notifier.c:93 raw_notifier_call_chain+0x3c/0x50 kernel/notifier.c:461 call_netdevice_notifiers_info net/core/dev.c:1970 [inline] call_netdevice_notifiers_extack net/core/dev.c:2008 [inline] call_netdevice_notifiers net/core/dev.c:2022 [inline] __dev_close_many+0x1b8/0x3c4 net/core/dev.c:1508 dev_close_many+0x1e0/0x470 net/core/dev.c:1559 dev_close+0x174/0x250 net/core/dev.c:1585 lapbeth_device_event+0x2e4/0x958 drivers/net/wan/lapbether.c:466 notifier_call_chain+0x1a4/0x510 kernel/notifier.c:93 raw_notifier_call_chain+0x3c/0x50 kernel/notifier.c:461 call_netdevice_notifiers_info net/core/dev.c:1970 [inline] call_netdevice_notifiers_extack net/core/dev.c:2008 [inline] call_netdevice_notifiers net/core/dev.c:2022 [inline] __dev_close_many+0x1b8/0x3c4 net/core/dev.c:1508 dev_close_many+0x1e0/0x470 net/core/dev.c:1559 dev_close+0x174/0x250 net/core/dev.c:1585 bond_enslave+0x2298/0x30cc drivers/net/bonding/bond_main.c:2332 bond_do_ioctl+0x268/0xc64 drivers/net/bonding/bond_main.c:4539 dev_ifsioc+0x754/0x9ac dev_ioctl+0x4d8/0xd34 net/core/dev_ioctl.c:786 sock_do_ioctl+0x1d4/0x2d0 net/socket.c:1217 sock_ioctl+0x4e8/0x834 net/socket.c:1322 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:871 [inline] __se_sys_ioctl fs/ioctl.c:857 [inline] __arm64_sys_ioctl+0x14c/0x1c8 fs/ioctl.c:857 __invoke_syscall arch/arm64/kernel/syscall.c:37 [inline] invoke_syscall+0x98/0x2b8 arch/arm64/kernel/syscall.c:51 el0_svc_common+0x130/0x23c arch/arm64/kernel/syscall.c:136 do_el0_svc+0x48/0x58 arch/arm64/kernel/syscall.c:155 el0_svc+0x58/0x16c arch/arm64/kernel/entry-common.c:678 el0t_64_sync_handler+0x84/0xfc arch/arm64/kernel/entry-common.c:696 el0t_64_sync+0x190/0x194 arch/arm64/kernel/entry.S:591 Code: aa1803e6 aa1903e7 a90023f5 94785b8b (d4210000)
Fixes: 872254dd6b1f ("net/bonding: Enable bonding to enslave non ARPHRD_ETHER") Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: Eric Dumazet edumazet@google.com Acked-by: Jay Vosburgh jay.vosburgh@canonical.com Reviewed-by: Hangbin Liu liuhangbin@gmail.com Link: https://lore.kernel.org/r/20231109180102.4085183-1-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/bonding/bond_main.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 80e42852ffefb..9aed194d308d6 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -1473,6 +1473,10 @@ static void bond_compute_features(struct bonding *bond) static void bond_setup_by_slave(struct net_device *bond_dev, struct net_device *slave_dev) { + bool was_up = !!(bond_dev->flags & IFF_UP); + + dev_close(bond_dev); + bond_dev->header_ops = slave_dev->header_ops;
bond_dev->type = slave_dev->type; @@ -1487,6 +1491,8 @@ static void bond_setup_by_slave(struct net_device *bond_dev, bond_dev->flags &= ~(IFF_BROADCAST | IFF_MULTICAST); bond_dev->flags |= (IFF_POINTOPOINT | IFF_NOARP); } + if (was_up) + dev_open(bond_dev, NULL); }
/* On bonding slaves other than the currently active slave, suppress
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Linus Walleij linus.walleij@linaro.org
[ Upstream commit 510e35fb931ffc3b100e5d5ae4595cd3beca9f1a ]
Enumerator 3 is 1548 bytes according to the datasheet. Not 1542.
Fixes: 4d5ae32f5e1e ("net: ethernet: Add a driver for Gemini gigabit ethernet") Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: Linus Walleij linus.walleij@linaro.org Reviewed-by: Vladimir Oltean olteanv@gmail.com Link: https://lore.kernel.org/r/20231109-gemini-largeframe-fix-v4-1-6e611528db08@l... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/cortina/gemini.c | 4 ++-- drivers/net/ethernet/cortina/gemini.h | 2 +- 2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/cortina/gemini.c b/drivers/net/ethernet/cortina/gemini.c index d0ba5ca862cf5..daab31e5bcbae 100644 --- a/drivers/net/ethernet/cortina/gemini.c +++ b/drivers/net/ethernet/cortina/gemini.c @@ -432,8 +432,8 @@ static const struct gmac_max_framelen gmac_maxlens[] = { .val = CONFIG0_MAXLEN_1536, }, { - .max_l3_len = 1542, - .val = CONFIG0_MAXLEN_1542, + .max_l3_len = 1548, + .val = CONFIG0_MAXLEN_1548, }, { .max_l3_len = 9212, diff --git a/drivers/net/ethernet/cortina/gemini.h b/drivers/net/ethernet/cortina/gemini.h index 9fdf77d5eb374..99efb11557436 100644 --- a/drivers/net/ethernet/cortina/gemini.h +++ b/drivers/net/ethernet/cortina/gemini.h @@ -787,7 +787,7 @@ union gmac_config0 { #define CONFIG0_MAXLEN_1536 0 #define CONFIG0_MAXLEN_1518 1 #define CONFIG0_MAXLEN_1522 2 -#define CONFIG0_MAXLEN_1542 3 +#define CONFIG0_MAXLEN_1548 3 #define CONFIG0_MAXLEN_9k 4 /* 9212 */ #define CONFIG0_MAXLEN_10k 5 /* 10236 */ #define CONFIG0_MAXLEN_1518__6 6
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Linus Walleij linus.walleij@linaro.org
[ Upstream commit d4d0c5b4d279bfe3585fbd806efefd3e51c82afa ]
The Gemini ethernet controller provides hardware checksumming for frames up to 1514 bytes including ethernet headers but not FCS.
If we start sending bigger frames (after first bumping up the MTU on both interfaces sending and receiving the frames), truncated packets start to appear on the target such as in this tcpdump resulting from ping -s 1474:
23:34:17.241983 14:d6:4d:a8:3c:4f (oui Unknown) > bc:ae:c5:6b:a8:3d (oui Unknown), ethertype IPv4 (0x0800), length 1514: truncated-ip - 2 bytes missing! (tos 0x0, ttl 64, id 32653, offset 0, flags [DF], proto ICMP (1), length 1502) OpenWrt.lan > Fecusia: ICMP echo request, id 1672, seq 50, length 1482
If we bypass the hardware checksumming and provide a software fallback, everything starts working fine up to the max TX MTU of 2047 bytes, for example ping -s2000 192.168.1.2:
00:44:29.587598 bc:ae:c5:6b:a8:3d (oui Unknown) > 14:d6:4d:a8:3c:4f (oui Unknown), ethertype IPv4 (0x0800), length 2042: (tos 0x0, ttl 64, id 51828, offset 0, flags [none], proto ICMP (1), length 2028) Fecusia > OpenWrt.lan: ICMP echo reply, id 1683, seq 4, length 2008
The bit enabling to bypass hardware checksum (or any of the "TSS" bits) are undocumented in the hardware reference manual. The entire hardware checksum unit appears undocumented. The conclusion that we need to use the "bypass" bit was found by trial-and-error.
Since no hardware checksum will happen, we slot in a software checksum fallback.
Check for the condition where we need to compute checksum on the skb with either hardware or software using == CHECKSUM_PARTIAL instead of != CHECKSUM_NONE which is an incomplete check according to <linux/skbuff.h>.
On the D-Link DIR-685 router this fixes a bug on the conduit interface to the RTL8366RB DSA switch: as the switch needs to add space for its tag it increases the MTU on the conduit interface to 1504 and that means that when the router sends packages of 1500 bytes these get an extra 4 bytes of DSA tag and the transfer fails because of the erroneous hardware checksumming, affecting such basic functionality as the LuCI web interface.
Fixes: 4d5ae32f5e1e ("net: ethernet: Add a driver for Gemini gigabit ethernet") Signed-off-by: Linus Walleij linus.walleij@linaro.org Reviewed-by: Vladimir Oltean olteanv@gmail.com Link: https://lore.kernel.org/r/20231109-gemini-largeframe-fix-v4-2-6e611528db08@l... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/cortina/gemini.c | 24 +++++++++++++++++++++++- 1 file changed, 23 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/cortina/gemini.c b/drivers/net/ethernet/cortina/gemini.c index daab31e5bcbae..fbd83330ca787 100644 --- a/drivers/net/ethernet/cortina/gemini.c +++ b/drivers/net/ethernet/cortina/gemini.c @@ -1145,6 +1145,7 @@ static int gmac_map_tx_bufs(struct net_device *netdev, struct sk_buff *skb, dma_addr_t mapping; unsigned short mtu; void *buffer; + int ret;
mtu = ETH_HLEN; mtu += netdev->mtu; @@ -1159,9 +1160,30 @@ static int gmac_map_tx_bufs(struct net_device *netdev, struct sk_buff *skb, word3 |= mtu; }
- if (skb->ip_summed != CHECKSUM_NONE) { + if (skb->len >= ETH_FRAME_LEN) { + /* Hardware offloaded checksumming isn't working on frames + * bigger than 1514 bytes. A hypothesis about this is that the + * checksum buffer is only 1518 bytes, so when the frames get + * bigger they get truncated, or the last few bytes get + * overwritten by the FCS. + * + * Just use software checksumming and bypass on bigger frames. + */ + if (skb->ip_summed == CHECKSUM_PARTIAL) { + ret = skb_checksum_help(skb); + if (ret) + return ret; + } + word1 |= TSS_BYPASS_BIT; + } else if (skb->ip_summed == CHECKSUM_PARTIAL) { int tcp = 0;
+ /* We do not switch off the checksumming on non TCP/UDP + * frames: as is shown from tests, the checksumming engine + * is smart enough to see that a frame is not actually TCP + * or UDP and then just pass it through without any changes + * to the frame. + */ if (skb->protocol == htons(ETH_P_IP)) { word1 |= TSS_IP_CHKSUM_BIT; tcp = ip_hdr(skb)->protocol == IPPROTO_TCP;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Linus Walleij linus.walleij@linaro.org
[ Upstream commit dc6c0bfbaa947dd7976e30e8c29b10c868b6fa42 ]
The RX max frame size is over 10000 for the Gemini ethernet, but the TX max frame size is actually just 2047 (0x7ff after checking the datasheet). Reflect this in what we offer to Linux, cap the MTU at the TX max frame minus ethernet headers.
We delete the code disabling the hardware checksum for large MTUs as netdev->mtu can no longer be larger than netdev->max_mtu meaning the if()-clause in gmac_fix_features() is never true.
Fixes: 4d5ae32f5e1e ("net: ethernet: Add a driver for Gemini gigabit ethernet") Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: Linus Walleij linus.walleij@linaro.org Reviewed-by: Vladimir Oltean olteanv@gmail.com Link: https://lore.kernel.org/r/20231109-gemini-largeframe-fix-v4-3-6e611528db08@l... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/cortina/gemini.c | 17 ++++------------- drivers/net/ethernet/cortina/gemini.h | 2 +- 2 files changed, 5 insertions(+), 14 deletions(-)
diff --git a/drivers/net/ethernet/cortina/gemini.c b/drivers/net/ethernet/cortina/gemini.c index fbd83330ca787..675c6dda45e24 100644 --- a/drivers/net/ethernet/cortina/gemini.c +++ b/drivers/net/ethernet/cortina/gemini.c @@ -2000,15 +2000,6 @@ static int gmac_change_mtu(struct net_device *netdev, int new_mtu) return 0; }
-static netdev_features_t gmac_fix_features(struct net_device *netdev, - netdev_features_t features) -{ - if (netdev->mtu + ETH_HLEN + VLAN_HLEN > MTU_SIZE_BIT_MASK) - features &= ~GMAC_OFFLOAD_FEATURES; - - return features; -} - static int gmac_set_features(struct net_device *netdev, netdev_features_t features) { @@ -2230,7 +2221,6 @@ static const struct net_device_ops gmac_351x_ops = { .ndo_set_mac_address = gmac_set_mac_address, .ndo_get_stats64 = gmac_get_stats64, .ndo_change_mtu = gmac_change_mtu, - .ndo_fix_features = gmac_fix_features, .ndo_set_features = gmac_set_features, };
@@ -2480,11 +2470,12 @@ static int gemini_ethernet_port_probe(struct platform_device *pdev)
netdev->hw_features = GMAC_OFFLOAD_FEATURES; netdev->features |= GMAC_OFFLOAD_FEATURES | NETIF_F_GRO; - /* We can handle jumbo frames up to 10236 bytes so, let's accept - * payloads of 10236 bytes minus VLAN and ethernet header + /* We can receive jumbo frames up to 10236 bytes but only + * transmit 2047 bytes so, let's accept payloads of 2047 + * bytes minus VLAN and ethernet header */ netdev->min_mtu = ETH_MIN_MTU; - netdev->max_mtu = 10236 - VLAN_ETH_HLEN; + netdev->max_mtu = MTU_SIZE_BIT_MASK - VLAN_ETH_HLEN;
port->freeq_refill = 0; netif_napi_add(netdev, &port->napi, gmac_napi_poll, NAPI_POLL_WEIGHT); diff --git a/drivers/net/ethernet/cortina/gemini.h b/drivers/net/ethernet/cortina/gemini.h index 99efb11557436..24bb989981f23 100644 --- a/drivers/net/ethernet/cortina/gemini.h +++ b/drivers/net/ethernet/cortina/gemini.h @@ -502,7 +502,7 @@ union gmac_txdesc_3 { #define SOF_BIT 0x80000000 #define EOF_BIT 0x40000000 #define EOFIE_BIT BIT(29) -#define MTU_SIZE_BIT_MASK 0x1fff +#define MTU_SIZE_BIT_MASK 0x7ff /* Max MTU 2047 bytes */
/* GMAC Tx Descriptor */ struct gmac_txdesc {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 4b7b492615cf3017190f55444f7016812b66611d ]
syzbot reported the following crash [1]
After releasing unix socket lock, u->oob_skb can be changed by another thread. We must temporarily increase skb refcount to make sure this other thread will not free the skb under us.
[1]
BUG: KASAN: slab-use-after-free in unix_stream_read_actor+0xa7/0xc0 net/unix/af_unix.c:2866 Read of size 4 at addr ffff88801f3b9cc4 by task syz-executor107/5297
CPU: 1 PID: 5297 Comm: syz-executor107 Not tainted 6.6.0-syzkaller-15910-gb8e3a87a627b #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106 print_address_description mm/kasan/report.c:364 [inline] print_report+0xc4/0x620 mm/kasan/report.c:475 kasan_report+0xda/0x110 mm/kasan/report.c:588 unix_stream_read_actor+0xa7/0xc0 net/unix/af_unix.c:2866 unix_stream_recv_urg net/unix/af_unix.c:2587 [inline] unix_stream_read_generic+0x19a5/0x2480 net/unix/af_unix.c:2666 unix_stream_recvmsg+0x189/0x1b0 net/unix/af_unix.c:2903 sock_recvmsg_nosec net/socket.c:1044 [inline] sock_recvmsg+0xe2/0x170 net/socket.c:1066 ____sys_recvmsg+0x21f/0x5c0 net/socket.c:2803 ___sys_recvmsg+0x115/0x1a0 net/socket.c:2845 __sys_recvmsg+0x114/0x1e0 net/socket.c:2875 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b RIP: 0033:0x7fc67492c559 Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 51 18 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48 RSP: 002b:00007fc6748ab228 EFLAGS: 00000246 ORIG_RAX: 000000000000002f RAX: ffffffffffffffda RBX: 000000000000001c RCX: 00007fc67492c559 RDX: 0000000040010083 RSI: 0000000020000140 RDI: 0000000000000004 RBP: 00007fc6749b6348 R08: 00007fc6748ab6c0 R09: 00007fc6748ab6c0 R10: 0000000000000000 R11: 0000000000000246 R12: 00007fc6749b6340 R13: 00007fc6749b634c R14: 00007ffe9fac52a0 R15: 00007ffe9fac5388 </TASK>
Allocated by task 5295: kasan_save_stack+0x33/0x50 mm/kasan/common.c:45 kasan_set_track+0x25/0x30 mm/kasan/common.c:52 __kasan_slab_alloc+0x81/0x90 mm/kasan/common.c:328 kasan_slab_alloc include/linux/kasan.h:188 [inline] slab_post_alloc_hook mm/slab.h:763 [inline] slab_alloc_node mm/slub.c:3478 [inline] kmem_cache_alloc_node+0x180/0x3c0 mm/slub.c:3523 __alloc_skb+0x287/0x330 net/core/skbuff.c:641 alloc_skb include/linux/skbuff.h:1286 [inline] alloc_skb_with_frags+0xe4/0x710 net/core/skbuff.c:6331 sock_alloc_send_pskb+0x7e4/0x970 net/core/sock.c:2780 sock_alloc_send_skb include/net/sock.h:1884 [inline] queue_oob net/unix/af_unix.c:2147 [inline] unix_stream_sendmsg+0xb5f/0x10a0 net/unix/af_unix.c:2301 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0xd5/0x180 net/socket.c:745 ____sys_sendmsg+0x6ac/0x940 net/socket.c:2584 ___sys_sendmsg+0x135/0x1d0 net/socket.c:2638 __sys_sendmsg+0x117/0x1e0 net/socket.c:2667 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b
Freed by task 5295: kasan_save_stack+0x33/0x50 mm/kasan/common.c:45 kasan_set_track+0x25/0x30 mm/kasan/common.c:52 kasan_save_free_info+0x2b/0x40 mm/kasan/generic.c:522 ____kasan_slab_free mm/kasan/common.c:236 [inline] ____kasan_slab_free+0x15b/0x1b0 mm/kasan/common.c:200 kasan_slab_free include/linux/kasan.h:164 [inline] slab_free_hook mm/slub.c:1800 [inline] slab_free_freelist_hook+0x114/0x1e0 mm/slub.c:1826 slab_free mm/slub.c:3809 [inline] kmem_cache_free+0xf8/0x340 mm/slub.c:3831 kfree_skbmem+0xef/0x1b0 net/core/skbuff.c:1015 __kfree_skb net/core/skbuff.c:1073 [inline] consume_skb net/core/skbuff.c:1288 [inline] consume_skb+0xdf/0x170 net/core/skbuff.c:1282 queue_oob net/unix/af_unix.c:2178 [inline] unix_stream_sendmsg+0xd49/0x10a0 net/unix/af_unix.c:2301 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0xd5/0x180 net/socket.c:745 ____sys_sendmsg+0x6ac/0x940 net/socket.c:2584 ___sys_sendmsg+0x135/0x1d0 net/socket.c:2638 __sys_sendmsg+0x117/0x1e0 net/socket.c:2667 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b
The buggy address belongs to the object at ffff88801f3b9c80 which belongs to the cache skbuff_head_cache of size 240 The buggy address is located 68 bytes inside of freed 240-byte region [ffff88801f3b9c80, ffff88801f3b9d70)
The buggy address belongs to the physical page: page:ffffea00007cee40 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1f3b9 flags: 0xfff00000000800(slab|node=0|zone=1|lastcpupid=0x7ff) page_type: 0xffffffff() raw: 00fff00000000800 ffff888142a60640 dead000000000122 0000000000000000 raw: 0000000000000000 00000000000c000c 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected page_owner tracks the page as allocated page last allocated via order 0, migratetype Unmovable, gfp_mask 0x12cc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY), pid 5299, tgid 5283 (syz-executor107), ts 103803840339, free_ts 103600093431 set_page_owner include/linux/page_owner.h:31 [inline] post_alloc_hook+0x2cf/0x340 mm/page_alloc.c:1537 prep_new_page mm/page_alloc.c:1544 [inline] get_page_from_freelist+0xa25/0x36c0 mm/page_alloc.c:3312 __alloc_pages+0x1d0/0x4a0 mm/page_alloc.c:4568 alloc_pages_mpol+0x258/0x5f0 mm/mempolicy.c:2133 alloc_slab_page mm/slub.c:1870 [inline] allocate_slab+0x251/0x380 mm/slub.c:2017 new_slab mm/slub.c:2070 [inline] ___slab_alloc+0x8c7/0x1580 mm/slub.c:3223 __slab_alloc.constprop.0+0x56/0xa0 mm/slub.c:3322 __slab_alloc_node mm/slub.c:3375 [inline] slab_alloc_node mm/slub.c:3468 [inline] kmem_cache_alloc_node+0x132/0x3c0 mm/slub.c:3523 __alloc_skb+0x287/0x330 net/core/skbuff.c:641 alloc_skb include/linux/skbuff.h:1286 [inline] alloc_skb_with_frags+0xe4/0x710 net/core/skbuff.c:6331 sock_alloc_send_pskb+0x7e4/0x970 net/core/sock.c:2780 sock_alloc_send_skb include/net/sock.h:1884 [inline] queue_oob net/unix/af_unix.c:2147 [inline] unix_stream_sendmsg+0xb5f/0x10a0 net/unix/af_unix.c:2301 sock_sendmsg_nosec net/socket.c:730 [inline] __sock_sendmsg+0xd5/0x180 net/socket.c:745 ____sys_sendmsg+0x6ac/0x940 net/socket.c:2584 ___sys_sendmsg+0x135/0x1d0 net/socket.c:2638 __sys_sendmsg+0x117/0x1e0 net/socket.c:2667 page last free stack trace: reset_page_owner include/linux/page_owner.h:24 [inline] free_pages_prepare mm/page_alloc.c:1137 [inline] free_unref_page_prepare+0x4f8/0xa90 mm/page_alloc.c:2347 free_unref_page+0x33/0x3b0 mm/page_alloc.c:2487 __unfreeze_partials+0x21d/0x240 mm/slub.c:2655 qlink_free mm/kasan/quarantine.c:168 [inline] qlist_free_all+0x6a/0x170 mm/kasan/quarantine.c:187 kasan_quarantine_reduce+0x18e/0x1d0 mm/kasan/quarantine.c:294 __kasan_slab_alloc+0x65/0x90 mm/kasan/common.c:305 kasan_slab_alloc include/linux/kasan.h:188 [inline] slab_post_alloc_hook mm/slab.h:763 [inline] slab_alloc_node mm/slub.c:3478 [inline] slab_alloc mm/slub.c:3486 [inline] __kmem_cache_alloc_lru mm/slub.c:3493 [inline] kmem_cache_alloc+0x15d/0x380 mm/slub.c:3502 vm_area_dup+0x21/0x2f0 kernel/fork.c:500 __split_vma+0x17d/0x1070 mm/mmap.c:2365 split_vma mm/mmap.c:2437 [inline] vma_modify+0x25d/0x450 mm/mmap.c:2472 vma_modify_flags include/linux/mm.h:3271 [inline] mprotect_fixup+0x228/0xc80 mm/mprotect.c:635 do_mprotect_pkey+0x852/0xd60 mm/mprotect.c:809 __do_sys_mprotect mm/mprotect.c:830 [inline] __se_sys_mprotect mm/mprotect.c:827 [inline] __x64_sys_mprotect+0x78/0xb0 mm/mprotect.c:827 do_syscall_x64 arch/x86/entry/common.c:51 [inline] do_syscall_64+0x3f/0x110 arch/x86/entry/common.c:82 entry_SYSCALL_64_after_hwframe+0x63/0x6b
Memory state around the buggy address: ffff88801f3b9b80: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88801f3b9c00: fb fb fb fb fb fb fc fc fc fc fc fc fc fc fc fc
ffff88801f3b9c80: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^ ffff88801f3b9d00: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fc fc ffff88801f3b9d80: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
Fixes: 876c14ad014d ("af_unix: fix holding spinlock in oob handling") Reported-and-tested-by: syzbot+7a2d546fa43e49315ed3@syzkaller.appspotmail.com Signed-off-by: Eric Dumazet edumazet@google.com Cc: Rao Shoaib rao.shoaib@oracle.com Reviewed-by: Rao shoaib rao.shoaib@oracle.com Link: https://lore.kernel.org/r/20231113134938.168151-1-edumazet@google.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/unix/af_unix.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 748769f4ba058..16b04e553a6c8 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -2529,15 +2529,16 @@ static int unix_stream_recv_urg(struct unix_stream_read_state *state)
if (!(state->flags & MSG_PEEK)) WRITE_ONCE(u->oob_skb, NULL); - + else + skb_get(oob_skb); unix_state_unlock(sk);
chunk = state->recv_actor(oob_skb, 0, chunk, state);
- if (!(state->flags & MSG_PEEK)) { + if (!(state->flags & MSG_PEEK)) UNIXCB(oob_skb).consumed += 1; - kfree_skb(oob_skb); - } + + consume_skb(oob_skb);
mutex_unlock(&u->iolock);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Linkui Xiao xiaolinkui@kylinos.cn
[ Upstream commit a44af08e3d4d7566eeea98d7a29fe06e7b9de944 ]
K2CI reported a problem:
consume_skb(skb); return err; [nf_br_ip_fragment() error] uninitialized symbol 'err'.
err is not initialized, because returning 0 is expected, initialize err to 0.
Fixes: 3c171f496ef5 ("netfilter: bridge: add connection tracking system") Reported-by: k2ci kernel-bot@kylinos.cn Signed-off-by: Linkui Xiao xiaolinkui@kylinos.cn Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/bridge/netfilter/nf_conntrack_bridge.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/bridge/netfilter/nf_conntrack_bridge.c b/net/bridge/netfilter/nf_conntrack_bridge.c index fdbed31585553..d14b2dbbd1dfb 100644 --- a/net/bridge/netfilter/nf_conntrack_bridge.c +++ b/net/bridge/netfilter/nf_conntrack_bridge.c @@ -36,7 +36,7 @@ static int nf_br_ip_fragment(struct net *net, struct sock *sk, ktime_t tstamp = skb->tstamp; struct ip_frag_state state; struct iphdr *iph; - int err; + int err = 0;
/* for offloaded checksums cleanup checksum before fragmentation */ if (skb->ip_summed == CHECKSUM_PARTIAL &&
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Westphal fw@strlen.de
[ Upstream commit d86473bf2ff39c05d4a6701c8aec66a16af0d410 ]
Switch to be16/32 and u16/32 respectively. No code changes here, the functions do the same thing, this is just for sparse checkers' sake.
objdiff shows no changes.
Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Stable-dep-of: c301f0981fdd ("netfilter: nf_tables: fix pointer math issue in nft_byteorder_eval()") Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nft_byteorder.c | 3 ++- net/netfilter/nft_osf.c | 2 +- net/netfilter/nft_socket.c | 8 ++++---- net/netfilter/nft_xfrm.c | 8 ++++---- 4 files changed, 11 insertions(+), 10 deletions(-)
diff --git a/net/netfilter/nft_byteorder.c b/net/netfilter/nft_byteorder.c index 7b0b8fecb2205..d3e1467e576fb 100644 --- a/net/netfilter/nft_byteorder.c +++ b/net/netfilter/nft_byteorder.c @@ -44,7 +44,8 @@ void nft_byteorder_eval(const struct nft_expr *expr, case NFT_BYTEORDER_NTOH: for (i = 0; i < priv->len / 8; i++) { src64 = nft_reg_load64(&src[i]); - nft_reg_store64(&dst[i], be64_to_cpu(src64)); + nft_reg_store64(&dst[i], + be64_to_cpu((__force __be64)src64)); } break; case NFT_BYTEORDER_HTON: diff --git a/net/netfilter/nft_osf.c b/net/netfilter/nft_osf.c index 720dc9fba6d4f..c9c124200a4db 100644 --- a/net/netfilter/nft_osf.c +++ b/net/netfilter/nft_osf.c @@ -99,7 +99,7 @@ static int nft_osf_dump(struct sk_buff *skb, const struct nft_expr *expr) if (nla_put_u8(skb, NFTA_OSF_TTL, priv->ttl)) goto nla_put_failure;
- if (nla_put_be32(skb, NFTA_OSF_FLAGS, ntohl(priv->flags))) + if (nla_put_u32(skb, NFTA_OSF_FLAGS, ntohl((__force __be32)priv->flags))) goto nla_put_failure;
if (nft_dump_register(skb, NFTA_OSF_DREG, priv->dreg)) diff --git a/net/netfilter/nft_socket.c b/net/netfilter/nft_socket.c index 9ad9cc0d1d27c..1725e7349f3d9 100644 --- a/net/netfilter/nft_socket.c +++ b/net/netfilter/nft_socket.c @@ -162,7 +162,7 @@ static int nft_socket_init(const struct nft_ctx *ctx, return -EOPNOTSUPP; }
- priv->key = ntohl(nla_get_u32(tb[NFTA_SOCKET_KEY])); + priv->key = ntohl(nla_get_be32(tb[NFTA_SOCKET_KEY])); switch(priv->key) { case NFT_SOCKET_TRANSPARENT: case NFT_SOCKET_WILDCARD: @@ -178,7 +178,7 @@ static int nft_socket_init(const struct nft_ctx *ctx, if (!tb[NFTA_SOCKET_LEVEL]) return -EINVAL;
- level = ntohl(nla_get_u32(tb[NFTA_SOCKET_LEVEL])); + level = ntohl(nla_get_be32(tb[NFTA_SOCKET_LEVEL])); if (level > 255) return -EOPNOTSUPP;
@@ -200,12 +200,12 @@ static int nft_socket_dump(struct sk_buff *skb, { const struct nft_socket *priv = nft_expr_priv(expr);
- if (nla_put_u32(skb, NFTA_SOCKET_KEY, htonl(priv->key))) + if (nla_put_be32(skb, NFTA_SOCKET_KEY, htonl(priv->key))) return -1; if (nft_dump_register(skb, NFTA_SOCKET_DREG, priv->dreg)) return -1; if (priv->key == NFT_SOCKET_CGROUPV2 && - nla_put_u32(skb, NFTA_SOCKET_LEVEL, htonl(priv->level))) + nla_put_be32(skb, NFTA_SOCKET_LEVEL, htonl(priv->level))) return -1; return 0; } diff --git a/net/netfilter/nft_xfrm.c b/net/netfilter/nft_xfrm.c index cbbbc4ecad3ae..3553f89fd057f 100644 --- a/net/netfilter/nft_xfrm.c +++ b/net/netfilter/nft_xfrm.c @@ -50,7 +50,7 @@ static int nft_xfrm_get_init(const struct nft_ctx *ctx, return -EOPNOTSUPP; }
- priv->key = ntohl(nla_get_u32(tb[NFTA_XFRM_KEY])); + priv->key = ntohl(nla_get_be32(tb[NFTA_XFRM_KEY])); switch (priv->key) { case NFT_XFRM_KEY_REQID: case NFT_XFRM_KEY_SPI: @@ -132,13 +132,13 @@ static void nft_xfrm_state_get_key(const struct nft_xfrm *priv, WARN_ON_ONCE(1); break; case NFT_XFRM_KEY_DADDR_IP4: - *dest = state->id.daddr.a4; + *dest = (__force __u32)state->id.daddr.a4; return; case NFT_XFRM_KEY_DADDR_IP6: memcpy(dest, &state->id.daddr.in6, sizeof(struct in6_addr)); return; case NFT_XFRM_KEY_SADDR_IP4: - *dest = state->props.saddr.a4; + *dest = (__force __u32)state->props.saddr.a4; return; case NFT_XFRM_KEY_SADDR_IP6: memcpy(dest, &state->props.saddr.in6, sizeof(struct in6_addr)); @@ -147,7 +147,7 @@ static void nft_xfrm_state_get_key(const struct nft_xfrm *priv, *dest = state->props.reqid; return; case NFT_XFRM_KEY_SPI: - *dest = state->id.spi; + *dest = (__force __u32)state->id.spi; return; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Westphal fw@strlen.de
[ Upstream commit 7278b3c1e4ebf6f9c4cda07600f19824857c81fe ]
Same as the existing ones, no conversions. This is just for sparse sake only so that we no longer mix be16/u16 and be32/u32 types.
Alternative is to add __force __beX in various places, but this seems nicer.
objdiff shows no changes.
Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Stable-dep-of: c301f0981fdd ("netfilter: nf_tables: fix pointer math issue in nft_byteorder_eval()") Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/netfilter/nf_tables.h | 15 +++++++++++++++ net/bridge/netfilter/nft_meta_bridge.c | 2 +- net/netfilter/nft_tproxy.c | 6 +++--- 3 files changed, 19 insertions(+), 4 deletions(-)
diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h index a0b47f2b896e1..df91b9f422551 100644 --- a/include/net/netfilter/nf_tables.h +++ b/include/net/netfilter/nf_tables.h @@ -144,11 +144,26 @@ static inline void nft_reg_store16(u32 *dreg, u16 val) *(u16 *)dreg = val; }
+static inline void nft_reg_store_be16(u32 *dreg, __be16 val) +{ + nft_reg_store16(dreg, (__force __u16)val); +} + static inline u16 nft_reg_load16(const u32 *sreg) { return *(u16 *)sreg; }
+static inline __be16 nft_reg_load_be16(const u32 *sreg) +{ + return (__force __be16)nft_reg_load16(sreg); +} + +static inline __be32 nft_reg_load_be32(const u32 *sreg) +{ + return *(__force __be32 *)sreg; +} + static inline void nft_reg_store64(u32 *dreg, u64 val) { put_unaligned(val, (u64 *)dreg); diff --git a/net/bridge/netfilter/nft_meta_bridge.c b/net/bridge/netfilter/nft_meta_bridge.c index 97805ec424c19..1967fd063cfb7 100644 --- a/net/bridge/netfilter/nft_meta_bridge.c +++ b/net/bridge/netfilter/nft_meta_bridge.c @@ -53,7 +53,7 @@ static void nft_meta_bridge_get_eval(const struct nft_expr *expr, goto err;
br_vlan_get_proto(br_dev, &p_proto); - nft_reg_store16(dest, htons(p_proto)); + nft_reg_store_be16(dest, htons(p_proto)); return; } default: diff --git a/net/netfilter/nft_tproxy.c b/net/netfilter/nft_tproxy.c index 9fea90ed79d44..e9679cb4afbe6 100644 --- a/net/netfilter/nft_tproxy.c +++ b/net/netfilter/nft_tproxy.c @@ -52,11 +52,11 @@ static void nft_tproxy_eval_v4(const struct nft_expr *expr, skb->dev, NF_TPROXY_LOOKUP_ESTABLISHED);
if (priv->sreg_addr) - taddr = regs->data[priv->sreg_addr]; + taddr = nft_reg_load_be32(®s->data[priv->sreg_addr]); taddr = nf_tproxy_laddr4(skb, taddr, iph->daddr);
if (priv->sreg_port) - tport = nft_reg_load16(®s->data[priv->sreg_port]); + tport = nft_reg_load_be16(®s->data[priv->sreg_port]); if (!tport) tport = hp->dest;
@@ -124,7 +124,7 @@ static void nft_tproxy_eval_v6(const struct nft_expr *expr, taddr = *nf_tproxy_laddr6(skb, &taddr, &iph->daddr);
if (priv->sreg_port) - tport = nft_reg_load16(®s->data[priv->sreg_port]); + tport = nft_reg_load_be16(®s->data[priv->sreg_port]); if (!tport) tport = hp->dest;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@linaro.org
[ Upstream commit c301f0981fdd3fd1ffac6836b423c4d7a8e0eb63 ]
The problem is in nft_byteorder_eval() where we are iterating through a loop and writing to dst[0], dst[1], dst[2] and so on... On each iteration we are writing 8 bytes. But dst[] is an array of u32 so each element only has space for 4 bytes. That means that every iteration overwrites part of the previous element.
I spotted this bug while reviewing commit caf3ef7468f7 ("netfilter: nf_tables: prevent OOB access in nft_byteorder_eval") which is a related issue. I think that the reason we have not detected this bug in testing is that most of time we only write one element.
Fixes: ce1e7989d989 ("netfilter: nft_byteorder: provide 64bit le/be conversion") Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/netfilter/nf_tables.h | 4 ++-- net/netfilter/nft_byteorder.c | 5 +++-- net/netfilter/nft_meta.c | 2 +- 3 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h index df91b9f422551..8e9c5bc1a9e69 100644 --- a/include/net/netfilter/nf_tables.h +++ b/include/net/netfilter/nf_tables.h @@ -164,9 +164,9 @@ static inline __be32 nft_reg_load_be32(const u32 *sreg) return *(__force __be32 *)sreg; }
-static inline void nft_reg_store64(u32 *dreg, u64 val) +static inline void nft_reg_store64(u64 *dreg, u64 val) { - put_unaligned(val, (u64 *)dreg); + put_unaligned(val, dreg); }
static inline u64 nft_reg_load64(const u32 *sreg) diff --git a/net/netfilter/nft_byteorder.c b/net/netfilter/nft_byteorder.c index d3e1467e576fb..adf208b7929fd 100644 --- a/net/netfilter/nft_byteorder.c +++ b/net/netfilter/nft_byteorder.c @@ -38,13 +38,14 @@ void nft_byteorder_eval(const struct nft_expr *expr,
switch (priv->size) { case 8: { + u64 *dst64 = (void *)dst; u64 src64;
switch (priv->op) { case NFT_BYTEORDER_NTOH: for (i = 0; i < priv->len / 8; i++) { src64 = nft_reg_load64(&src[i]); - nft_reg_store64(&dst[i], + nft_reg_store64(&dst64[i], be64_to_cpu((__force __be64)src64)); } break; @@ -52,7 +53,7 @@ void nft_byteorder_eval(const struct nft_expr *expr, for (i = 0; i < priv->len / 8; i++) { src64 = (__force __u64) cpu_to_be64(nft_reg_load64(&src[i])); - nft_reg_store64(&dst[i], src64); + nft_reg_store64(&dst64[i], src64); } break; } diff --git a/net/netfilter/nft_meta.c b/net/netfilter/nft_meta.c index 14412f69a34e8..35aba304a25b9 100644 --- a/net/netfilter/nft_meta.c +++ b/net/netfilter/nft_meta.c @@ -63,7 +63,7 @@ nft_meta_get_eval_time(enum nft_meta_keys key, { switch (key) { case NFT_META_TIME_NS: - nft_reg_store64(dest, ktime_get_real_ns()); + nft_reg_store64((u64 *)dest, ktime_get_real_ns()); break; case NFT_META_TIME_DAY: nft_reg_store8(dest, nft_meta_weekday());
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Baruch Siach baruch@tkos.co.il
[ Upstream commit fa02de9e75889915b554eda1964a631fd019973b ]
The while loop condition verifies 'count < limit'. Neither value change before the 'count >= limit' check. As is this check is dead code. But code inspection reveals a code path that modifies 'count' and then goto 'drain_data' and back to 'read_again'. So there is a need to verify count value sanity after 'read_again'.
Move 'read_again' up to fix the count limit check.
Fixes: ec222003bd94 ("net: stmmac: Prepare to add Split Header support") Signed-off-by: Baruch Siach baruch@tkos.co.il Reviewed-by: Serge Semin fancer.lancer@gmail.com Link: https://lore.kernel.org/r/d9486296c3b6b12ab3a0515fcd47d56447a07bfc.169989737... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index a43628dd1f4c2..2b4c30a5ffcd9 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -5165,10 +5165,10 @@ static int stmmac_rx(struct stmmac_priv *priv, int limit, u32 queue) len = 0; }
+read_again: if (count >= limit) break;
-read_again: buf1_len = 0; buf2_len = 0; entry = next_entry;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dust Li dust.li@linux.alibaba.com
[ Upstream commit 6f9b1a0731662648949a1c0587f6acb3b7f8acf1 ]
When mlx5_packet_reformat_alloc() fails, the encap_header allocated in mlx5e_tc_tun_create_header_ipv4{6} will be released within it. However, e->encap_header is already set to the previously freed encap_header before mlx5_packet_reformat_alloc(). As a result, the later mlx5e_encap_put() will free e->encap_header again, causing a double free issue.
mlx5e_encap_put() --> mlx5e_encap_dealloc() --> kfree(e->encap_header)
This happens when cmd: MLX5_CMD_OP_ALLOC_PACKET_REFORMAT_CONTEXT fail.
This patch fix it by not setting e->encap_header until mlx5_packet_reformat_alloc() success.
Fixes: d589e785baf5e ("net/mlx5e: Allow concurrent creation of encap entries") Reported-by: Cruz Zhao cruzzhao@linux.alibaba.com Reported-by: Tianchen Ding dtcccc@linux.alibaba.com Signed-off-by: Dust Li dust.li@linux.alibaba.com Reviewed-by: Wojciech Drewek wojciech.drewek@intel.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c index d90c6dc41c9f4..44071592bd6e2 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c @@ -294,9 +294,6 @@ int mlx5e_tc_tun_create_header_ipv4(struct mlx5e_priv *priv, if (err) goto destroy_neigh_entry;
- e->encap_size = ipv4_encap_size; - e->encap_header = encap_header; - if (!(nud_state & NUD_VALID)) { neigh_event_send(attr.n, NULL); /* the encap entry will be made valid on neigh update event @@ -316,6 +313,8 @@ int mlx5e_tc_tun_create_header_ipv4(struct mlx5e_priv *priv, goto destroy_neigh_entry; }
+ e->encap_size = ipv4_encap_size; + e->encap_header = encap_header; e->flags |= MLX5_ENCAP_ENTRY_VALID; mlx5e_rep_queue_neigh_stats_work(netdev_priv(attr.out_dev)); mlx5e_route_lookup_ipv4_put(&attr); @@ -559,9 +558,6 @@ int mlx5e_tc_tun_create_header_ipv6(struct mlx5e_priv *priv, if (err) goto destroy_neigh_entry;
- e->encap_size = ipv6_encap_size; - e->encap_header = encap_header; - if (!(nud_state & NUD_VALID)) { neigh_event_send(attr.n, NULL); /* the encap entry will be made valid on neigh update event @@ -581,6 +577,8 @@ int mlx5e_tc_tun_create_header_ipv6(struct mlx5e_priv *priv, goto destroy_neigh_entry; }
+ e->encap_size = ipv6_encap_size; + e->encap_header = encap_header; e->flags |= MLX5_ENCAP_ENTRY_VALID; mlx5e_rep_queue_neigh_stats_work(netdev_priv(attr.out_dev)); mlx5e_route_lookup_ipv6_put(&attr);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gavin Li gavinl@nvidia.com
[ Upstream commit 3a4aa3cb83563df942be49d145ee3b7ddf17d6bb ]
Follow up to the previous patch to fix the same issue for mlx5e_tc_tun_update_header_ipv4{6} when mlx5_packet_reformat_alloc() fails.
When mlx5_packet_reformat_alloc() fails, the encap_header allocated in mlx5e_tc_tun_update_header_ipv4{6} will be released within it. However, e->encap_header is already set to the previously freed encap_header before mlx5_packet_reformat_alloc(). As a result, the later mlx5e_encap_put() will free e->encap_header again, causing a double free issue.
mlx5e_encap_put() --> mlx5e_encap_dealloc() --> kfree(e->encap_header)
This patch fix it by not setting e->encap_header until mlx5_packet_reformat_alloc() success.
Fixes: a54e20b4fcae ("net/mlx5e: Add basic TC tunnel set action for SRIOV offloads") Signed-off-by: Gavin Li gavinl@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Link: https://lore.kernel.org/r/20231114215846.5902-7-saeed@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../ethernet/mellanox/mlx5/core/en/tc_tun.c | 20 +++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c index 44071592bd6e2..303e6e7a5c448 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_tun.c @@ -397,16 +397,12 @@ int mlx5e_tc_tun_update_header_ipv4(struct mlx5e_priv *priv, if (err) goto free_encap;
- e->encap_size = ipv4_encap_size; - kfree(e->encap_header); - e->encap_header = encap_header; - if (!(nud_state & NUD_VALID)) { neigh_event_send(attr.n, NULL); /* the encap entry will be made valid on neigh update event * and not used before that. */ - goto release_neigh; + goto free_encap; }
memset(&reformat_params, 0, sizeof(reformat_params)); @@ -420,6 +416,10 @@ int mlx5e_tc_tun_update_header_ipv4(struct mlx5e_priv *priv, goto free_encap; }
+ e->encap_size = ipv4_encap_size; + kfree(e->encap_header); + e->encap_header = encap_header; + e->flags |= MLX5_ENCAP_ENTRY_VALID; mlx5e_rep_queue_neigh_stats_work(netdev_priv(attr.out_dev)); mlx5e_route_lookup_ipv4_put(&attr); @@ -660,16 +660,12 @@ int mlx5e_tc_tun_update_header_ipv6(struct mlx5e_priv *priv, if (err) goto free_encap;
- e->encap_size = ipv6_encap_size; - kfree(e->encap_header); - e->encap_header = encap_header; - if (!(nud_state & NUD_VALID)) { neigh_event_send(attr.n, NULL); /* the encap entry will be made valid on neigh update event * and not used before that. */ - goto release_neigh; + goto free_encap; }
memset(&reformat_params, 0, sizeof(reformat_params)); @@ -683,6 +679,10 @@ int mlx5e_tc_tun_update_header_ipv6(struct mlx5e_priv *priv, goto free_encap; }
+ e->encap_size = ipv6_encap_size; + kfree(e->encap_header); + e->encap_header = encap_header; + e->flags |= MLX5_ENCAP_ENTRY_VALID; mlx5e_rep_queue_neigh_stats_work(netdev_priv(attr.out_dev)); mlx5e_route_lookup_ipv6_put(&attr);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Roi Dayan roid@nvidia.com
[ Upstream commit 475fb86ac941f75da127c19d8e8b282d33de9784 ]
A user is expected to explicit request a fwd or drop action. It is not correct to implicit add a fwd action for the user, when modify header action flag exists.
Signed-off-by: Roi Dayan roid@nvidia.com Reviewed-by: Maor Dickman maord@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Stable-dep-of: 0c101a23ca7e ("net/mlx5e: Fix pedit endianness") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 3 --- 1 file changed, 3 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c index d123d9b4adf5e..d13ffba138934 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c @@ -3639,9 +3639,6 @@ static int parse_tc_nic_actions(struct mlx5e_priv *priv, return -EOPNOTSUPP; }
- if (attr->action & MLX5_FLOW_CONTEXT_ACTION_MOD_HDR) - attr->action |= MLX5_FLOW_CONTEXT_ACTION_FWD_DEST; - if (!actions_match_supported(priv, flow_action, parse_attr, flow, extack)) return -EOPNOTSUPP;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Roi Dayan roid@nvidia.com
[ Upstream commit d9581e2fa73fadba187b2e62e05306e24e8a1ded ]
Move mod hdr allocation chunk from parse_tc_fdb_actions() and parse_tc_nic_actions() to a shared function.
Signed-off-by: Roi Dayan roid@nvidia.com Reviewed-by: Maor Dickman maord@nvidia.com Reviewed-by: Oz Shlomo ozsh@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Stable-dep-of: 0c101a23ca7e ("net/mlx5e: Fix pedit endianness") Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/mellanox/mlx5/core/en_tc.c | 87 +++++++++++-------- 1 file changed, 51 insertions(+), 36 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c index d13ffba138934..433602f871bd4 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c @@ -3502,10 +3502,50 @@ static int validate_goto_chain(struct mlx5e_priv *priv, return 0; }
-static int parse_tc_nic_actions(struct mlx5e_priv *priv, - struct flow_action *flow_action, +static int +actions_prepare_mod_hdr_actions(struct mlx5e_priv *priv, struct mlx5e_tc_flow *flow, + struct mlx5_flow_attr *attr, + struct pedit_headers_action *hdrs, struct netlink_ext_ack *extack) +{ + struct mlx5e_tc_flow_parse_attr *parse_attr = attr->parse_attr; + enum mlx5_flow_namespace_type ns_type; + int err; + + if (!hdrs[TCA_PEDIT_KEY_EX_CMD_SET].pedits && + !hdrs[TCA_PEDIT_KEY_EX_CMD_ADD].pedits) + return 0; + + ns_type = get_flow_name_space(flow); + + err = alloc_tc_pedit_action(priv, ns_type, parse_attr, hdrs, + &attr->action, extack); + if (err) + return err; + + /* In case all pedit actions are skipped, remove the MOD_HDR flag. */ + if (parse_attr->mod_hdr_acts.num_actions > 0) + return 0; + + attr->action &= ~MLX5_FLOW_CONTEXT_ACTION_MOD_HDR; + dealloc_mod_hdr_actions(&parse_attr->mod_hdr_acts); + + if (ns_type != MLX5_FLOW_NAMESPACE_FDB) + return 0; + + if (!((attr->action & MLX5_FLOW_CONTEXT_ACTION_VLAN_POP) || + (attr->action & MLX5_FLOW_CONTEXT_ACTION_VLAN_PUSH))) + attr->esw_attr->split_count = 0; + + return 0; +} + +static int +parse_tc_nic_actions(struct mlx5e_priv *priv, + struct flow_action *flow_action, + struct mlx5e_tc_flow *flow, + struct netlink_ext_ack *extack) { struct mlx5e_tc_flow_parse_attr *parse_attr; struct mlx5_flow_attr *attr = flow->attr; @@ -3617,21 +3657,6 @@ static int parse_tc_nic_actions(struct mlx5e_priv *priv, } }
- if (hdrs[TCA_PEDIT_KEY_EX_CMD_SET].pedits || - hdrs[TCA_PEDIT_KEY_EX_CMD_ADD].pedits) { - err = alloc_tc_pedit_action(priv, MLX5_FLOW_NAMESPACE_KERNEL, - parse_attr, hdrs, &action, extack); - if (err) - return err; - /* in case all pedit actions are skipped, remove the MOD_HDR - * flag. - */ - if (parse_attr->mod_hdr_acts.num_actions == 0) { - action &= ~MLX5_FLOW_CONTEXT_ACTION_MOD_HDR; - dealloc_mod_hdr_actions(&parse_attr->mod_hdr_acts); - } - } - attr->action = action;
if (attr->dest_chain && parse_attr->mirred_ifindex[0]) { @@ -3639,6 +3664,10 @@ static int parse_tc_nic_actions(struct mlx5e_priv *priv, return -EOPNOTSUPP; }
+ err = actions_prepare_mod_hdr_actions(priv, flow, attr, hdrs, extack); + if (err) + return err; + if (!actions_match_supported(priv, flow_action, parse_attr, flow, extack)) return -EOPNOTSUPP;
@@ -4192,26 +4221,12 @@ static int parse_tc_fdb_actions(struct mlx5e_priv *priv, return err; }
- if (hdrs[TCA_PEDIT_KEY_EX_CMD_SET].pedits || - hdrs[TCA_PEDIT_KEY_EX_CMD_ADD].pedits) { - err = alloc_tc_pedit_action(priv, MLX5_FLOW_NAMESPACE_FDB, - parse_attr, hdrs, &action, extack); - if (err) - return err; - /* in case all pedit actions are skipped, remove the MOD_HDR - * flag. we might have set split_count either by pedit or - * pop/push. if there is no pop/push either, reset it too. - */ - if (parse_attr->mod_hdr_acts.num_actions == 0) { - action &= ~MLX5_FLOW_CONTEXT_ACTION_MOD_HDR; - dealloc_mod_hdr_actions(&parse_attr->mod_hdr_acts); - if (!((action & MLX5_FLOW_CONTEXT_ACTION_VLAN_POP) || - (action & MLX5_FLOW_CONTEXT_ACTION_VLAN_PUSH))) - esw_attr->split_count = 0; - } - } - attr->action = action; + + err = actions_prepare_mod_hdr_actions(priv, flow, attr, hdrs, extack); + if (err) + return err; + if (!actions_match_supported(priv, flow_action, parse_attr, flow, extack)) return -EOPNOTSUPP;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paul Blakey paulb@nvidia.com
[ Upstream commit 2c0e5cf5206ecd5da3c6bc5799671c2172713d71 ]
For all mod hdr related functions to reside in a single self contained component (mod_hdr.c), refactor alloc() and add get_id() so that user won't rely on internal implementation, and move both to mod_hdr component.
Rename the prefix to mlx5e_mod_hdr_* as other mod hdr functions.
Signed-off-by: Paul Blakey paulb@nvidia.com Reviewed-by: Oz Shlomo ozsh@nvidia.com Reviewed-by: Roi Dayan roid@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Stable-dep-of: 0c101a23ca7e ("net/mlx5e: Fix pedit endianness") Signed-off-by: Sasha Levin sashal@kernel.org --- .../ethernet/mellanox/mlx5/core/en/mod_hdr.c | 47 ++++++++++ .../ethernet/mellanox/mlx5/core/en/mod_hdr.h | 13 +++ .../mellanox/mlx5/core/en/tc/sample.c | 5 +- .../ethernet/mellanox/mlx5/core/en/tc_ct.c | 25 ++---- .../net/ethernet/mellanox/mlx5/core/en_tc.c | 90 ++++--------------- .../net/ethernet/mellanox/mlx5/core/en_tc.h | 5 -- .../mellanox/mlx5/core/esw/indir_table.c | 5 +- 7 files changed, 90 insertions(+), 100 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/mod_hdr.c b/drivers/net/ethernet/mellanox/mlx5/core/en/mod_hdr.c index 7edde4d536fda..19d05fb4aab2e 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/mod_hdr.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/mod_hdr.c @@ -155,3 +155,50 @@ struct mlx5_modify_hdr *mlx5e_mod_hdr_get(struct mlx5e_mod_hdr_handle *mh) return mh->modify_hdr; }
+char * +mlx5e_mod_hdr_alloc(struct mlx5_core_dev *mdev, int namespace, + struct mlx5e_tc_mod_hdr_acts *mod_hdr_acts) +{ + int new_num_actions, max_hw_actions; + size_t new_sz, old_sz; + void *ret; + + if (mod_hdr_acts->num_actions < mod_hdr_acts->max_actions) + goto out; + + max_hw_actions = mlx5e_mod_hdr_max_actions(mdev, namespace); + new_num_actions = min(max_hw_actions, + mod_hdr_acts->actions ? + mod_hdr_acts->max_actions * 2 : 1); + if (mod_hdr_acts->max_actions == new_num_actions) + return ERR_PTR(-ENOSPC); + + new_sz = MLX5_MH_ACT_SZ * new_num_actions; + old_sz = mod_hdr_acts->max_actions * MLX5_MH_ACT_SZ; + + ret = krealloc(mod_hdr_acts->actions, new_sz, GFP_KERNEL); + if (!ret) + return ERR_PTR(-ENOMEM); + + memset(ret + old_sz, 0, new_sz - old_sz); + mod_hdr_acts->actions = ret; + mod_hdr_acts->max_actions = new_num_actions; + +out: + return mod_hdr_acts->actions + (mod_hdr_acts->num_actions * MLX5_MH_ACT_SZ); +} + +void +mlx5e_mod_hdr_dealloc(struct mlx5e_tc_mod_hdr_acts *mod_hdr_acts) +{ + kfree(mod_hdr_acts->actions); + mod_hdr_acts->actions = NULL; + mod_hdr_acts->num_actions = 0; + mod_hdr_acts->max_actions = 0; +} + +char * +mlx5e_mod_hdr_get_item(struct mlx5e_tc_mod_hdr_acts *mod_hdr_acts, int pos) +{ + return mod_hdr_acts->actions + (pos * MLX5_MH_ACT_SZ); +} diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/mod_hdr.h b/drivers/net/ethernet/mellanox/mlx5/core/en/mod_hdr.h index 33b23d8f91828..b8cd1a7a31be6 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/mod_hdr.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/mod_hdr.h @@ -15,6 +15,11 @@ struct mlx5e_tc_mod_hdr_acts { void *actions; };
+char *mlx5e_mod_hdr_alloc(struct mlx5_core_dev *mdev, int namespace, + struct mlx5e_tc_mod_hdr_acts *mod_hdr_acts); +void mlx5e_mod_hdr_dealloc(struct mlx5e_tc_mod_hdr_acts *mod_hdr_acts); +char *mlx5e_mod_hdr_get_item(struct mlx5e_tc_mod_hdr_acts *mod_hdr_acts, int pos); + struct mlx5e_mod_hdr_handle * mlx5e_mod_hdr_attach(struct mlx5_core_dev *mdev, struct mod_hdr_tbl *tbl, @@ -28,4 +33,12 @@ struct mlx5_modify_hdr *mlx5e_mod_hdr_get(struct mlx5e_mod_hdr_handle *mh); void mlx5e_mod_hdr_tbl_init(struct mod_hdr_tbl *tbl); void mlx5e_mod_hdr_tbl_destroy(struct mod_hdr_tbl *tbl);
+static inline int mlx5e_mod_hdr_max_actions(struct mlx5_core_dev *mdev, int namespace) +{ + if (namespace == MLX5_FLOW_NAMESPACE_FDB) /* FDB offloading */ + return MLX5_CAP_ESW_FLOWTABLE_FDB(mdev, max_modify_header_actions); + else /* namespace is MLX5_FLOW_NAMESPACE_KERNEL - NIC offloading */ + return MLX5_CAP_FLOWTABLE_NIC_RX(mdev, max_modify_header_actions); +} + #endif /* __MLX5E_EN_MOD_HDR_H__ */ diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/sample.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/sample.c index 6552ecee3f9b9..d08723a444e3f 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc/sample.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc/sample.c @@ -5,6 +5,7 @@ #include <net/psample.h> #include "en/mapping.h" #include "en/tc/post_act.h" +#include "en/mod_hdr.h" #include "sample.h" #include "eswitch.h" #include "en_tc.h" @@ -255,12 +256,12 @@ sample_modify_hdr_get(struct mlx5_core_dev *mdev, u32 obj_id, goto err_modify_hdr; }
- dealloc_mod_hdr_actions(&mod_acts); + mlx5e_mod_hdr_dealloc(&mod_acts); return modify_hdr;
err_modify_hdr: err_post_act: - dealloc_mod_hdr_actions(&mod_acts); + mlx5e_mod_hdr_dealloc(&mod_acts); err_set_regc0: return ERR_PTR(err); } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c index 94200f2dd92b0..80a49d7af05d6 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/tc_ct.c @@ -609,22 +609,15 @@ mlx5_tc_ct_entry_create_nat(struct mlx5_tc_ct_priv *ct_priv, struct flow_action *flow_action = &flow_rule->action; struct mlx5_core_dev *mdev = ct_priv->dev; struct flow_action_entry *act; - size_t action_size; char *modact; int err, i;
- action_size = MLX5_UN_SZ_BYTES(set_add_copy_action_in_auto); - flow_action_for_each(i, act, flow_action) { switch (act->id) { case FLOW_ACTION_MANGLE: { - err = alloc_mod_hdr_actions(mdev, ct_priv->ns_type, - mod_acts); - if (err) - return err; - - modact = mod_acts->actions + - mod_acts->num_actions * action_size; + modact = mlx5e_mod_hdr_alloc(mdev, ct_priv->ns_type, mod_acts); + if (IS_ERR(modact)) + return PTR_ERR(modact);
err = mlx5_tc_ct_parse_mangle_to_mod_act(act, modact); if (err) @@ -707,11 +700,11 @@ mlx5_tc_ct_entry_create_mod_hdr(struct mlx5_tc_ct_priv *ct_priv, attr->modify_hdr = mlx5e_mod_hdr_get(*mh); }
- dealloc_mod_hdr_actions(&mod_acts); + mlx5e_mod_hdr_dealloc(&mod_acts); return 0;
err_mapping: - dealloc_mod_hdr_actions(&mod_acts); + mlx5e_mod_hdr_dealloc(&mod_acts); mlx5_put_label_mapping(ct_priv, attr->ct_attr.ct_labels_id); return err; } @@ -1463,7 +1456,7 @@ static int tc_ct_pre_ct_add_rules(struct mlx5_ct_ft *ct_ft, } pre_ct->miss_rule = rule;
- dealloc_mod_hdr_actions(&pre_mod_acts); + mlx5e_mod_hdr_dealloc(&pre_mod_acts); kvfree(spec); return 0;
@@ -1472,7 +1465,7 @@ static int tc_ct_pre_ct_add_rules(struct mlx5_ct_ft *ct_ft, err_flow_rule: mlx5_modify_header_dealloc(dev, pre_ct->modify_hdr); err_mapping: - dealloc_mod_hdr_actions(&pre_mod_acts); + mlx5e_mod_hdr_dealloc(&pre_mod_acts); kvfree(spec); return err; } @@ -1872,14 +1865,14 @@ __mlx5_tc_ct_flow_offload(struct mlx5_tc_ct_priv *ct_priv, }
attr->ct_attr.ct_flow = ct_flow; - dealloc_mod_hdr_actions(&pre_mod_acts); + mlx5e_mod_hdr_dealloc(&pre_mod_acts);
return ct_flow->pre_ct_rule;
err_insert_orig: mlx5_modify_header_dealloc(priv->mdev, pre_ct_attr->modify_hdr); err_mapping: - dealloc_mod_hdr_actions(&pre_mod_acts); + mlx5e_mod_hdr_dealloc(&pre_mod_acts); mlx5_chains_put_chain_mapping(ct_priv->chains, ct_flow->chain_mapping); err_get_chain: kfree(ct_flow->pre_ct_attr); diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c index 433602f871bd4..39fa0fa21e33c 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c @@ -71,7 +71,6 @@ #include "lag_mp.h"
#define nic_chains(priv) ((priv)->fs.tc.chains) -#define MLX5_MH_ACT_SZ MLX5_UN_SZ_BYTES(set_add_copy_action_in_auto)
#define MLX5E_TC_TABLE_NUM_GROUPS 4 #define MLX5E_TC_TABLE_MAX_GROUP_SIZE BIT(18) @@ -209,12 +208,9 @@ mlx5e_tc_match_to_reg_set_and_get_id(struct mlx5_core_dev *mdev, char *modact; int err;
- err = alloc_mod_hdr_actions(mdev, ns, mod_hdr_acts); - if (err) - return err; - - modact = mod_hdr_acts->actions + - (mod_hdr_acts->num_actions * MLX5_MH_ACT_SZ); + modact = mlx5e_mod_hdr_alloc(mdev, ns, mod_hdr_acts); + if (IS_ERR(modact)) + return PTR_ERR(modact);
/* Firmware has 5bit length field and 0 means 32bits */ if (mlen == 32) @@ -316,7 +312,7 @@ void mlx5e_tc_match_to_reg_mod_hdr_change(struct mlx5_core_dev *mdev, int mlen = mlx5e_tc_attr_to_reg_mappings[type].mlen; char *modact;
- modact = mod_hdr_acts->actions + (act_id * MLX5_MH_ACT_SZ); + modact = mlx5e_mod_hdr_get_item(mod_hdr_acts, act_id);
/* Firmware has 5bit length field and 0 means 32bits */ if (mlen == 32) @@ -1059,7 +1055,7 @@ mlx5e_tc_add_nic_flow(struct mlx5e_priv *priv,
if (attr->action & MLX5_FLOW_CONTEXT_ACTION_MOD_HDR) { err = mlx5e_attach_mod_hdr(priv, flow, parse_attr); - dealloc_mod_hdr_actions(&parse_attr->mod_hdr_acts); + mlx5e_mod_hdr_dealloc(&parse_attr->mod_hdr_acts); if (err) return err; } @@ -1557,7 +1553,7 @@ static void mlx5e_tc_del_fdb_flow(struct mlx5e_priv *priv, mlx5_tc_ct_match_del(get_ct_priv(priv), &flow->attr->ct_attr);
if (attr->action & MLX5_FLOW_CONTEXT_ACTION_MOD_HDR) { - dealloc_mod_hdr_actions(&attr->parse_attr->mod_hdr_acts); + mlx5e_mod_hdr_dealloc(&attr->parse_attr->mod_hdr_acts); if (vf_tun && attr->modify_hdr) mlx5_modify_header_dealloc(priv->mdev, attr->modify_hdr); else @@ -2803,13 +2799,12 @@ static int offload_pedit_fields(struct mlx5e_priv *priv, struct netlink_ext_ack *extack) { struct pedit_headers *set_masks, *add_masks, *set_vals, *add_vals; - int i, action_size, first, last, next_z; void *headers_c, *headers_v, *action, *vals_p; u32 *s_masks_p, *a_masks_p, s_mask, a_mask; struct mlx5e_tc_mod_hdr_acts *mod_acts; - struct mlx5_fields *f; unsigned long mask, field_mask; - int err; + int i, first, last, next_z; + struct mlx5_fields *f; u8 cmd;
mod_acts = &parse_attr->mod_hdr_acts; @@ -2821,8 +2816,6 @@ static int offload_pedit_fields(struct mlx5e_priv *priv, set_vals = &hdrs[0].vals; add_vals = &hdrs[1].vals;
- action_size = MLX5_UN_SZ_BYTES(set_add_copy_action_in_auto); - for (i = 0; i < ARRAY_SIZE(fields); i++) { bool skip;
@@ -2890,18 +2883,16 @@ static int offload_pedit_fields(struct mlx5e_priv *priv, return -EOPNOTSUPP; }
- err = alloc_mod_hdr_actions(priv->mdev, namespace, mod_acts); - if (err) { + action = mlx5e_mod_hdr_alloc(priv->mdev, namespace, mod_acts); + if (IS_ERR(action)) { NL_SET_ERR_MSG_MOD(extack, "too many pedit actions, can't offload"); mlx5_core_warn(priv->mdev, "mlx5: parsed %d pedit actions, can't do more\n", mod_acts->num_actions); - return err; + return PTR_ERR(action); }
- action = mod_acts->actions + - (mod_acts->num_actions * action_size); MLX5_SET(set_action_in, action, action_type, cmd); MLX5_SET(set_action_in, action, field, f->field);
@@ -2931,57 +2922,6 @@ static int offload_pedit_fields(struct mlx5e_priv *priv, return 0; }
-static int mlx5e_flow_namespace_max_modify_action(struct mlx5_core_dev *mdev, - int namespace) -{ - if (namespace == MLX5_FLOW_NAMESPACE_FDB) /* FDB offloading */ - return MLX5_CAP_ESW_FLOWTABLE_FDB(mdev, max_modify_header_actions); - else /* namespace is MLX5_FLOW_NAMESPACE_KERNEL - NIC offloading */ - return MLX5_CAP_FLOWTABLE_NIC_RX(mdev, max_modify_header_actions); -} - -int alloc_mod_hdr_actions(struct mlx5_core_dev *mdev, - int namespace, - struct mlx5e_tc_mod_hdr_acts *mod_hdr_acts) -{ - int action_size, new_num_actions, max_hw_actions; - size_t new_sz, old_sz; - void *ret; - - if (mod_hdr_acts->num_actions < mod_hdr_acts->max_actions) - return 0; - - action_size = MLX5_UN_SZ_BYTES(set_add_copy_action_in_auto); - - max_hw_actions = mlx5e_flow_namespace_max_modify_action(mdev, - namespace); - new_num_actions = min(max_hw_actions, - mod_hdr_acts->actions ? - mod_hdr_acts->max_actions * 2 : 1); - if (mod_hdr_acts->max_actions == new_num_actions) - return -ENOSPC; - - new_sz = action_size * new_num_actions; - old_sz = mod_hdr_acts->max_actions * action_size; - ret = krealloc(mod_hdr_acts->actions, new_sz, GFP_KERNEL); - if (!ret) - return -ENOMEM; - - memset(ret + old_sz, 0, new_sz - old_sz); - mod_hdr_acts->actions = ret; - mod_hdr_acts->max_actions = new_num_actions; - - return 0; -} - -void dealloc_mod_hdr_actions(struct mlx5e_tc_mod_hdr_acts *mod_hdr_acts) -{ - kfree(mod_hdr_acts->actions); - mod_hdr_acts->actions = NULL; - mod_hdr_acts->num_actions = 0; - mod_hdr_acts->max_actions = 0; -} - static const struct pedit_headers zero_masks = {};
static int @@ -3004,7 +2944,7 @@ parse_pedit_to_modify_hdr(struct mlx5e_priv *priv, goto out_err; }
- if (!mlx5e_flow_namespace_max_modify_action(priv->mdev, namespace)) { + if (!mlx5e_mod_hdr_max_actions(priv->mdev, namespace)) { NL_SET_ERR_MSG_MOD(extack, "The pedit offload action is not supported"); goto out_err; @@ -3096,7 +3036,7 @@ static int alloc_tc_pedit_action(struct mlx5e_priv *priv, int namespace, return 0;
out_dealloc_parsed_actions: - dealloc_mod_hdr_actions(&parse_attr->mod_hdr_acts); + mlx5e_mod_hdr_dealloc(&parse_attr->mod_hdr_acts); return err; }
@@ -3529,7 +3469,7 @@ actions_prepare_mod_hdr_actions(struct mlx5e_priv *priv, return 0;
attr->action &= ~MLX5_FLOW_CONTEXT_ACTION_MOD_HDR; - dealloc_mod_hdr_actions(&parse_attr->mod_hdr_acts); + mlx5e_mod_hdr_dealloc(&parse_attr->mod_hdr_acts);
if (ns_type != MLX5_FLOW_NAMESPACE_FDB) return 0; @@ -4613,7 +4553,7 @@ mlx5e_add_nic_flow(struct mlx5e_priv *priv,
err_free: flow_flag_set(flow, FAILED); - dealloc_mod_hdr_actions(&parse_attr->mod_hdr_acts); + mlx5e_mod_hdr_dealloc(&parse_attr->mod_hdr_acts); mlx5e_flow_put(priv, flow); out: return err; diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.h b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.h index f48af82781f88..26a85a11eb6ca 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.h @@ -244,11 +244,6 @@ int mlx5e_tc_add_flow_mod_hdr(struct mlx5e_priv *priv, struct mlx5e_tc_flow *flow, struct mlx5_flow_attr *attr);
-int alloc_mod_hdr_actions(struct mlx5_core_dev *mdev, - int namespace, - struct mlx5e_tc_mod_hdr_acts *mod_hdr_acts); -void dealloc_mod_hdr_actions(struct mlx5e_tc_mod_hdr_acts *mod_hdr_acts); - struct mlx5e_tc_flow; u32 mlx5e_tc_get_flow_tun_id(struct mlx5e_tc_flow *flow);
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/esw/indir_table.c b/drivers/net/ethernet/mellanox/mlx5/core/esw/indir_table.c index 425c91814b34f..c275fe028b6d8 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/esw/indir_table.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/esw/indir_table.c @@ -14,6 +14,7 @@ #include "fs_core.h" #include "esw/indir_table.h" #include "lib/fs_chains.h" +#include "en/mod_hdr.h"
#define MLX5_ESW_INDIR_TABLE_SIZE 128 #define MLX5_ESW_INDIR_TABLE_RECIRC_IDX_MAX (MLX5_ESW_INDIR_TABLE_SIZE - 2) @@ -226,7 +227,7 @@ static int mlx5_esw_indir_table_rule_get(struct mlx5_eswitch *esw, goto err_handle; }
- dealloc_mod_hdr_actions(&mod_acts); + mlx5e_mod_hdr_dealloc(&mod_acts); rule->handle = handle; rule->vni = esw_attr->rx_tun_attr->vni; rule->mh = flow_act.modify_hdr; @@ -243,7 +244,7 @@ static int mlx5_esw_indir_table_rule_get(struct mlx5_eswitch *esw, mlx5_modify_header_dealloc(esw->dev, flow_act.modify_hdr); err_mod_hdr_alloc: err_mod_hdr_regc1: - dealloc_mod_hdr_actions(&mod_acts); + mlx5e_mod_hdr_dealloc(&mod_acts); err_mod_hdr_regc0: err_ethertype: kfree(rule);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vlad Buslov vladbu@nvidia.com
[ Upstream commit 0c101a23ca7eaf00eef1328eefb04b3a93401cc8 ]
Referenced commit addressed endianness issue in mlx5 pedit implementation in ad hoc manner instead of systematically treating integer values according to their types which left pedit fields of sizes not equal to 4 and where the bytes being modified are not least significant ones broken on big endian machines since wrong bits will be consumed during parsing which leads to following example error when applying pedit to source and destination MAC addresses:
[Wed Oct 18 12:52:42 2023] mlx5_core 0001:00:00.1 p1v3_r: attempt to offload an unsupported field (cmd 0) [Wed Oct 18 12:52:42 2023] mask: 00000000330c5b68: 00 00 00 00 ff ff 00 00 00 00 ff ff 00 00 00 00 ................ [Wed Oct 18 12:52:42 2023] mask: 0000000017d22fd9: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ [Wed Oct 18 12:52:42 2023] mask: 000000008186d717: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ [Wed Oct 18 12:52:42 2023] mask: 0000000029eb6149: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ [Wed Oct 18 12:52:42 2023] mask: 000000007ed103e4: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ [Wed Oct 18 12:52:42 2023] mask: 00000000db8101a6: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ [Wed Oct 18 12:52:42 2023] mask: 00000000ec3c08a9: 00 00 00 00 00 00 00 00 00 00 00 00 ............
Treat masks and values of pedit and filter match as network byte order, refactor pointers to them to void pointers instead of confusing u32 pointers and only cast to pointer-to-integer when reading a value from them. Treat pedit mlx5_fields->field_mask as host byte order according to its type u32, change the constants in fields array accordingly.
Fixes: 82198d8bcdef ("net/mlx5e: Fix endianness when calculating pedit mask first bit") Signed-off-by: Vlad Buslov vladbu@nvidia.com Reviewed-by: Gal Pressman gal@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Link: https://lore.kernel.org/r/20231114215846.5902-8-saeed@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/mellanox/mlx5/core/en_tc.c | 60 ++++++++++--------- 1 file changed, 32 insertions(+), 28 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c index 39fa0fa21e33c..78538a15c097a 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c @@ -2764,7 +2764,7 @@ static struct mlx5_fields fields[] = { OFFLOAD(DIPV6_31_0, 32, U32_MAX, ip6.daddr.s6_addr32[3], 0, dst_ipv4_dst_ipv6.ipv6_layout.ipv6[12]), OFFLOAD(IPV6_HOPLIMIT, 8, U8_MAX, ip6.hop_limit, 0, ttl_hoplimit), - OFFLOAD(IP_DSCP, 16, 0xc00f, ip6, 0, ip_dscp), + OFFLOAD(IP_DSCP, 16, 0x0fc0, ip6, 0, ip_dscp),
OFFLOAD(TCP_SPORT, 16, U16_MAX, tcp.source, 0, tcp_sport), OFFLOAD(TCP_DPORT, 16, U16_MAX, tcp.dest, 0, tcp_dport), @@ -2775,21 +2775,31 @@ static struct mlx5_fields fields[] = { OFFLOAD(UDP_DPORT, 16, U16_MAX, udp.dest, 0, udp_dport), };
-static unsigned long mask_to_le(unsigned long mask, int size) +static u32 mask_field_get(void *mask, struct mlx5_fields *f) { - __be32 mask_be32; - __be16 mask_be16; - - if (size == 32) { - mask_be32 = (__force __be32)(mask); - mask = (__force unsigned long)cpu_to_le32(be32_to_cpu(mask_be32)); - } else if (size == 16) { - mask_be32 = (__force __be32)(mask); - mask_be16 = *(__be16 *)&mask_be32; - mask = (__force unsigned long)cpu_to_le16(be16_to_cpu(mask_be16)); + switch (f->field_bsize) { + case 32: + return be32_to_cpu(*(__be32 *)mask) & f->field_mask; + case 16: + return be16_to_cpu(*(__be16 *)mask) & (u16)f->field_mask; + default: + return *(u8 *)mask & (u8)f->field_mask; } +}
- return mask; +static void mask_field_clear(void *mask, struct mlx5_fields *f) +{ + switch (f->field_bsize) { + case 32: + *(__be32 *)mask &= ~cpu_to_be32(f->field_mask); + break; + case 16: + *(__be16 *)mask &= ~cpu_to_be16((u16)f->field_mask); + break; + default: + *(u8 *)mask &= ~(u8)f->field_mask; + break; + } } static int offload_pedit_fields(struct mlx5e_priv *priv, int namespace, @@ -2800,11 +2810,12 @@ static int offload_pedit_fields(struct mlx5e_priv *priv, { struct pedit_headers *set_masks, *add_masks, *set_vals, *add_vals; void *headers_c, *headers_v, *action, *vals_p; - u32 *s_masks_p, *a_masks_p, s_mask, a_mask; struct mlx5e_tc_mod_hdr_acts *mod_acts; - unsigned long mask, field_mask; + void *s_masks_p, *a_masks_p; int i, first, last, next_z; struct mlx5_fields *f; + unsigned long mask; + u32 s_mask, a_mask; u8 cmd;
mod_acts = &parse_attr->mod_hdr_acts; @@ -2820,15 +2831,11 @@ static int offload_pedit_fields(struct mlx5e_priv *priv, bool skip;
f = &fields[i]; - /* avoid seeing bits set from previous iterations */ - s_mask = 0; - a_mask = 0; - s_masks_p = (void *)set_masks + f->offset; a_masks_p = (void *)add_masks + f->offset;
- s_mask = *s_masks_p & f->field_mask; - a_mask = *a_masks_p & f->field_mask; + s_mask = mask_field_get(s_masks_p, f); + a_mask = mask_field_get(a_masks_p, f);
if (!s_mask && !a_mask) /* nothing to offload here */ continue; @@ -2855,22 +2862,20 @@ static int offload_pedit_fields(struct mlx5e_priv *priv, match_mask, f->field_bsize)) skip = true; /* clear to denote we consumed this field */ - *s_masks_p &= ~f->field_mask; + mask_field_clear(s_masks_p, f); } else { cmd = MLX5_ACTION_TYPE_ADD; mask = a_mask; vals_p = (void *)add_vals + f->offset; /* add 0 is no change */ - if ((*(u32 *)vals_p & f->field_mask) == 0) + if (!mask_field_get(vals_p, f)) skip = true; /* clear to denote we consumed this field */ - *a_masks_p &= ~f->field_mask; + mask_field_clear(a_masks_p, f); } if (skip) continue;
- mask = mask_to_le(mask, f->field_bsize); - first = find_first_bit(&mask, f->field_bsize); next_z = find_next_zero_bit(&mask, f->field_bsize, first); last = find_last_bit(&mask, f->field_bsize); @@ -2897,10 +2902,9 @@ static int offload_pedit_fields(struct mlx5e_priv *priv, MLX5_SET(set_action_in, action, field, f->field);
if (cmd == MLX5_ACTION_TYPE_SET) { + unsigned long field_mask = f->field_mask; int start;
- field_mask = mask_to_le(f->field_mask, f->field_bsize); - /* if field is bit sized it can start not from first bit */ start = find_first_bit(&field_mask, f->field_bsize);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Saeed Mahameed saeedm@nvidia.com
[ Upstream commit dce94142842e119b982c27c1b62bd20890c7fd21 ]
icosq_str size is unnecessarily too long, and it causes a build warning -Wformat-truncation with W=1. Looking closely, It doesn't need to be 255B, hence this patch reduces the size to 32B which should be more than enough to host the string: "ICOSQ: 0x%x, ".
While here, add a missing space in the formatted string.
This fixes the following build warning:
$ KCFLAGS='-Wall -Werror' $ make O=/tmp/kbuild/linux W=1 -s -j12 drivers/net/ethernet/mellanox/mlx5/core/
drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c: In function 'mlx5e_reporter_rx_timeout': drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c:718:56: error: ', CQ: 0x' directive output may be truncated writing 8 bytes into a region of size between 0 and 255 [-Werror=format-truncation=] 718 | "RX timeout on channel: %d, %sRQ: 0x%x, CQ: 0x%x", | ^~~~~~~~ drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c:717:9: note: 'snprintf' output between 43 and 322 bytes into a destination of size 288 717 | snprintf(err_str, sizeof(err_str), | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 718 | "RX timeout on channel: %d, %sRQ: 0x%x, CQ: 0x%x", | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 719 | rq->ix, icosq_str, rq->rqn, rq->cq.mcq.cqn); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fixes: 521f31af004a ("net/mlx5e: Allow RQ outside of channel context") Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... Signed-off-by: Saeed Mahameed saeedm@nvidia.com Link: https://lore.kernel.org/r/20231114215846.5902-14-saeed@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c index 899a9a73eef68..a4c12c5bb0dc5 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/reporter_rx.c @@ -655,11 +655,11 @@ static int mlx5e_rx_reporter_dump(struct devlink_health_reporter *reporter,
void mlx5e_reporter_rx_timeout(struct mlx5e_rq *rq) { - char icosq_str[MLX5E_REPORTER_PER_Q_MAX_LEN] = {}; char err_str[MLX5E_REPORTER_PER_Q_MAX_LEN]; struct mlx5e_icosq *icosq = rq->icosq; struct mlx5e_priv *priv = rq->priv; struct mlx5e_err_ctx err_ctx = {}; + char icosq_str[32] = {};
err_ctx.ctx = rq; err_ctx.recover = mlx5e_rx_reporter_timeout_recover; @@ -668,7 +668,7 @@ void mlx5e_reporter_rx_timeout(struct mlx5e_rq *rq) if (icosq) snprintf(icosq_str, sizeof(icosq_str), "ICOSQ: 0x%x, ", icosq->sqn); snprintf(err_str, sizeof(err_str), - "RX timeout on channel: %d, %sRQ: 0x%x, CQ: 0x%x", + "RX timeout on channel: %d, %s RQ: 0x%x, CQ: 0x%x", rq->ix, icosq_str, rq->rqn, rq->cq.mcq.cqn);
mlx5e_health_report(priv, priv->rx_reporter, err_str, &err_ctx);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rahul Rameshbabu rrameshbabu@nvidia.com
[ Upstream commit 1b2bd0c0264febcd8d47209079a6671c38e6558b ]
Treat the operation as an error case when the return value is equivalent to the size of the name buffer. Failed to write null terminator to the name buffer, making the string malformed and should not be used. Provide a string with only the firmware version when forming the string with the board id fails. This logic for representors is identical to normal flow with ethtool.
Without check, will trigger -Wformat-truncation with W=1.
drivers/net/ethernet/mellanox/mlx5/core/en_rep.c: In function 'mlx5e_rep_get_drvinfo': drivers/net/ethernet/mellanox/mlx5/core/en_rep.c:78:31: warning: '%.16s' directive output may be truncated writing up to 16 bytes into a region of size between 13 and 22 [-Wformat-truncation=] 78 | "%d.%d.%04d (%.16s)", | ^~~~~ drivers/net/ethernet/mellanox/mlx5/core/en_rep.c:77:9: note: 'snprintf' output between 12 and 37 bytes into a destination of size 32 77 | snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version), | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 78 | "%d.%d.%04d (%.16s)", | ~~~~~~~~~~~~~~~~~~~~~ 79 | fw_rev_maj(mdev), fw_rev_min(mdev), | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 80 | fw_rev_sub(mdev), mdev->board_id); | ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Fixes: cf83c8fdcd47 ("net/mlx5e: Add missing ethtool driver info for representors") Link: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i... Signed-off-by: Rahul Rameshbabu rrameshbabu@nvidia.com Reviewed-by: Dragos Tatulea dtatulea@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Link: https://lore.kernel.org/r/20231114215846.5902-16-saeed@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/en_rep.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c index 3d614bf5cff9e..7a00faa62d993 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rep.c @@ -66,13 +66,17 @@ static void mlx5e_rep_get_drvinfo(struct net_device *dev, { struct mlx5e_priv *priv = netdev_priv(dev); struct mlx5_core_dev *mdev = priv->mdev; + int count;
strlcpy(drvinfo->driver, mlx5e_rep_driver_name, sizeof(drvinfo->driver)); - snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version), - "%d.%d.%04d (%.16s)", - fw_rev_maj(mdev), fw_rev_min(mdev), - fw_rev_sub(mdev), mdev->board_id); + count = snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version), + "%d.%d.%04d (%.16s)", fw_rev_maj(mdev), + fw_rev_min(mdev), fw_rev_sub(mdev), mdev->board_id); + if (count == sizeof(drvinfo->fw_version)) + snprintf(drvinfo->fw_version, sizeof(drvinfo->fw_version), + "%d.%d.%04d", fw_rev_maj(mdev), + fw_rev_min(mdev), fw_rev_sub(mdev)); }
static const struct counter_desc sw_rep_stats_desc[] = {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vlad Buslov vladbu@nvidia.com
[ Upstream commit 7e1caeace0418381f36b3aa8403dfd82fc57fc53 ]
Macvlan device in passthru mode sets its lower device promiscuous mode according to its MACVLAN_FLAG_NOPROMISC flag instead of synchronizing it to its own promiscuity setting. However, macvlan_change_rx_flags() function doesn't check the mode before propagating such changes to the lower device which can cause net_device->promiscuity counter overflow as illustrated by reproduction example [0] and resulting dmesg log [1]. Fix the issue by first verifying the mode in macvlan_change_rx_flags() function before propagating promiscuous mode change to the lower device.
[0]: ip link add macvlan1 link enp8s0f0 type macvlan mode passthru ip link set macvlan1 promisc on ip l set dev macvlan1 up ip link set macvlan1 promisc off ip l set dev macvlan1 down ip l set dev macvlan1 up
[1]: [ 5156.281724] macvlan1: entered promiscuous mode [ 5156.285467] mlx5_core 0000:08:00.0 enp8s0f0: entered promiscuous mode [ 5156.287639] macvlan1: left promiscuous mode [ 5156.288339] mlx5_core 0000:08:00.0 enp8s0f0: left promiscuous mode [ 5156.290907] mlx5_core 0000:08:00.0 enp8s0f0: entered promiscuous mode [ 5156.317197] mlx5_core 0000:08:00.0 enp8s0f0: promiscuity touches roof, set promiscuity failed. promiscuity feature of device might be broken.
Fixes: efdbd2b30caa ("macvlan: Propagate promiscuity setting to lower devices.") Reviewed-by: Gal Pressman gal@nvidia.com Signed-off-by: Vlad Buslov vladbu@nvidia.com Reviewed-by: Jiri Pirko jiri@nvidia.com Link: https://lore.kernel.org/r/20231114175915.1649154-1-vladbu@nvidia.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/macvlan.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c index 3dd1528dde028..6f0b6c924d724 100644 --- a/drivers/net/macvlan.c +++ b/drivers/net/macvlan.c @@ -770,7 +770,7 @@ static void macvlan_change_rx_flags(struct net_device *dev, int change) if (dev->flags & IFF_UP) { if (change & IFF_ALLMULTI) dev_set_allmulti(lowerdev, dev->flags & IFF_ALLMULTI ? 1 : -1); - if (change & IFF_PROMISC) + if (!macvlan_passthru(vlan->port) && change & IFF_PROMISC) dev_set_promiscuity(lowerdev, dev->flags & IFF_PROMISC ? 1 : -1);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhang Rui rui.zhang@intel.com
[ Upstream commit 137f01b3529d292a68d22e9681e2f903c768f790 ]
MSR_KNL_CORE_C6_RESIDENCY should be evaluated only if 1. this is KNL platform AND 2. need to get C6 residency or need to calculate C1 residency
Fix the broken logic introduced by commit 1e9042b9c8d4 ("tools/power turbostat: Fix CPU%C1 display value").
Fixes: 1e9042b9c8d4 ("tools/power turbostat: Fix CPU%C1 display value") Signed-off-by: Zhang Rui rui.zhang@intel.com Reviewed-by: Len Brown len.brown@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/power/x86/turbostat/turbostat.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c index a3197efe52c63..045d0db88755f 100644 --- a/tools/power/x86/turbostat/turbostat.c +++ b/tools/power/x86/turbostat/turbostat.c @@ -2100,7 +2100,7 @@ int get_counters(struct thread_data *t, struct core_data *c, struct pkg_data *p) if ((DO_BIC(BIC_CPU_c6) || soft_c1_residency_display(BIC_CPU_c6)) && !do_knl_cstates) { if (get_msr(cpu, MSR_CORE_C6_RESIDENCY, &c->c6)) return -7; - } else if (do_knl_cstates || soft_c1_residency_display(BIC_CPU_c6)) { + } else if (do_knl_cstates && soft_c1_residency_display(BIC_CPU_c6)) { if (get_msr(cpu, MSR_KNL_CORE_C6_RESIDENCY, &c->c6)) return -7; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chen Yu yu.c.chen@intel.com
[ Upstream commit b61b7d8c4c22c4298a50ae5d0ee88facb85ce665 ]
Currently the C-state Pre-wake will not be printed due to the probe has not been invoked. Invoke the probe function accordingly.
Fixes: aeb01e6d71ff ("tools/power turbostat: Print the C-state Pre-wake settings") Signed-off-by: Chen Yu yu.c.chen@intel.com Reviewed-by: Zhang Rui rui.zhang@intel.com Reviewed-by: Len Brown len.brown@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/power/x86/turbostat/turbostat.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c index 045d0db88755f..65ada8065cfc2 100644 --- a/tools/power/x86/turbostat/turbostat.c +++ b/tools/power/x86/turbostat/turbostat.c @@ -5543,6 +5543,7 @@ void process_cpuid() rapl_probe(family, model); perf_limit_reasons_probe(family, model); automatic_cstate_conversion_probe(family, model); + prewake_cstate_probe(family, model);
check_tcc_offset(model_orig);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Anastasia Belova abelova@astralinux.ru
[ Upstream commit ff31ba19d732efb9aca3633935d71085e68d5076 ]
"host=" should start with ';' (as in cifs_get_spnego_key) So its length should be 6.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Reviewed-by: Paulo Alcantara (SUSE) pc@manguebit.com Fixes: 7c9c3760b3a5 ("[CIFS] add constants for string lengths of keynames in SPNEGO upcall string") Signed-off-by: Anastasia Belova abelova@astralinux.ru Co-developed-by: Ekaterina Esina eesina@astralinux.ru Signed-off-by: Ekaterina Esina eesina@astralinux.ru Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cifs/cifs_spnego.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/cifs/cifs_spnego.c b/fs/cifs/cifs_spnego.c index 353bd0dd70260..66b4413b94f7f 100644 --- a/fs/cifs/cifs_spnego.c +++ b/fs/cifs/cifs_spnego.c @@ -64,8 +64,8 @@ struct key_type cifs_spnego_key_type = { * strlen(";sec=ntlmsspi") */ #define MAX_MECH_STR_LEN 13
-/* strlen of "host=" */ -#define HOST_KEY_LEN 5 +/* strlen of ";host=" */ +#define HOST_KEY_LEN 6
/* strlen of ";ip4=" or ";ip6=" */ #define IP_KEY_LEN 5
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ekaterina Esina eesina@astralinux.ru
[ Upstream commit 181724fc72486dec2bec8803459be05b5162aaa8 ]
Remove extra check after condition, add check after generating key for encryption. The check is needed to return non zero rc before rewriting it with generating key for decryption.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Reviewed-by: Paulo Alcantara (SUSE) pc@manguebit.com Fixes: d70e9fa55884 ("cifs: try opening channels after mounting") Signed-off-by: Ekaterina Esina eesina@astralinux.ru Co-developed-by: Anastasia Belova abelova@astralinux.ru Signed-off-by: Anastasia Belova abelova@astralinux.ru Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cifs/smb2transport.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/fs/cifs/smb2transport.c b/fs/cifs/smb2transport.c index 390cc5e8c7467..0f2e0ce84a03f 100644 --- a/fs/cifs/smb2transport.c +++ b/fs/cifs/smb2transport.c @@ -430,6 +430,8 @@ generate_smb3signingkey(struct cifs_ses *ses, ptriplet->encryption.context, ses->smb3encryptionkey, SMB3_ENC_DEC_KEY_SIZE); + if (rc) + return rc; rc = generate_key(ses, ptriplet->decryption.label, ptriplet->decryption.context, ses->smb3decryptionkey, @@ -438,9 +440,6 @@ generate_smb3signingkey(struct cifs_ses *ses, return rc; }
- if (rc) - return rc; - #ifdef CONFIG_CIFS_DEBUG_DUMP_KEYS cifs_dbg(VFS, "%s: dumping generated AES session keys\n", __func__); /*
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Darrick J. Wong djwong@kernel.org
[ Upstream commit 2723234923b3294dbcf6019c288c87465e927ed4 ]
Move the code that allocates and frees the buffer cancellation tables used by log recovery into the file that actually uses the tables. This is a precursor to some cleanups and a memory leak fix.
( backport: dependency of 8db074bd84df5ccc88bff3f8f900f66f4b8349fa )
Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Dave Chinner dchinner@redhat.com Signed-off-by: Dave Chinner david@fromorbit.com Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/libxfs/xfs_log_recover.h | 14 +++++----- fs/xfs/xfs_buf_item_recover.c | 47 +++++++++++++++++++++++++++++++++ fs/xfs/xfs_log_priv.h | 3 --- fs/xfs/xfs_log_recover.c | 32 +++++++--------------- 4 files changed, 64 insertions(+), 32 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_log_recover.h b/fs/xfs/libxfs/xfs_log_recover.h index ff69a00008176..b8b65a6e9b1ec 100644 --- a/fs/xfs/libxfs/xfs_log_recover.h +++ b/fs/xfs/libxfs/xfs_log_recover.h @@ -108,12 +108,6 @@ struct xlog_recover {
#define ITEM_TYPE(i) (*(unsigned short *)(i)->ri_buf[0].i_addr)
-/* - * This is the number of entries in the l_buf_cancel_table used during - * recovery. - */ -#define XLOG_BC_TABLE_SIZE 64 - #define XLOG_RECOVER_CRCPASS 0 #define XLOG_RECOVER_PASS1 1 #define XLOG_RECOVER_PASS2 2 @@ -126,5 +120,13 @@ int xlog_recover_iget(struct xfs_mount *mp, xfs_ino_t ino, struct xfs_inode **ipp); void xlog_recover_release_intent(struct xlog *log, unsigned short intent_type, uint64_t intent_id); +void xlog_alloc_buf_cancel_table(struct xlog *log); +void xlog_free_buf_cancel_table(struct xlog *log); + +#ifdef DEBUG +void xlog_check_buf_cancel_table(struct xlog *log); +#else +#define xlog_check_buf_cancel_table(log) do { } while (0) +#endif
#endif /* __XFS_LOG_RECOVER_H__ */ diff --git a/fs/xfs/xfs_buf_item_recover.c b/fs/xfs/xfs_buf_item_recover.c index e04e44ef14c6d..dc099b2f4984c 100644 --- a/fs/xfs/xfs_buf_item_recover.c +++ b/fs/xfs/xfs_buf_item_recover.c @@ -23,6 +23,15 @@ #include "xfs_dir2.h" #include "xfs_quota.h"
+/* + * This is the number of entries in the l_buf_cancel_table used during + * recovery. + */ +#define XLOG_BC_TABLE_SIZE 64 + +#define XLOG_BUF_CANCEL_BUCKET(log, blkno) \ + ((log)->l_buf_cancel_table + ((uint64_t)blkno % XLOG_BC_TABLE_SIZE)) + /* * This structure is used during recovery to record the buf log items which * have been canceled and should not be replayed. @@ -1003,3 +1012,41 @@ const struct xlog_recover_item_ops xlog_buf_item_ops = { .commit_pass1 = xlog_recover_buf_commit_pass1, .commit_pass2 = xlog_recover_buf_commit_pass2, }; + +#ifdef DEBUG +void +xlog_check_buf_cancel_table( + struct xlog *log) +{ + int i; + + for (i = 0; i < XLOG_BC_TABLE_SIZE; i++) + ASSERT(list_empty(&log->l_buf_cancel_table[i])); +} +#endif + +void +xlog_alloc_buf_cancel_table( + struct xlog *log) +{ + int i; + + ASSERT(log->l_buf_cancel_table == NULL); + + log->l_buf_cancel_table = kmem_zalloc(XLOG_BC_TABLE_SIZE * + sizeof(struct list_head), + 0); + for (i = 0; i < XLOG_BC_TABLE_SIZE; i++) + INIT_LIST_HEAD(&log->l_buf_cancel_table[i]); +} + +void +xlog_free_buf_cancel_table( + struct xlog *log) +{ + if (!log->l_buf_cancel_table) + return; + + kmem_free(log->l_buf_cancel_table); + log->l_buf_cancel_table = NULL; +} diff --git a/fs/xfs/xfs_log_priv.h b/fs/xfs/xfs_log_priv.h index f3d68ca39f45c..03393595676f4 100644 --- a/fs/xfs/xfs_log_priv.h +++ b/fs/xfs/xfs_log_priv.h @@ -454,9 +454,6 @@ struct xlog { struct rw_semaphore l_incompat_users; };
-#define XLOG_BUF_CANCEL_BUCKET(log, blkno) \ - ((log)->l_buf_cancel_table + ((uint64_t)blkno % XLOG_BC_TABLE_SIZE)) - /* * Bits for operational state */ diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c index 581aeb288b32b..18d8eebc2d445 100644 --- a/fs/xfs/xfs_log_recover.c +++ b/fs/xfs/xfs_log_recover.c @@ -3248,7 +3248,7 @@ xlog_do_log_recovery( xfs_daddr_t head_blk, xfs_daddr_t tail_blk) { - int error, i; + int error;
ASSERT(head_blk != tail_blk);
@@ -3256,37 +3256,23 @@ xlog_do_log_recovery( * First do a pass to find all of the cancelled buf log items. * Store them in the buf_cancel_table for use in the second pass. */ - log->l_buf_cancel_table = kmem_zalloc(XLOG_BC_TABLE_SIZE * - sizeof(struct list_head), - 0); - for (i = 0; i < XLOG_BC_TABLE_SIZE; i++) - INIT_LIST_HEAD(&log->l_buf_cancel_table[i]); + xlog_alloc_buf_cancel_table(log);
error = xlog_do_recovery_pass(log, head_blk, tail_blk, XLOG_RECOVER_PASS1, NULL); - if (error != 0) { - kmem_free(log->l_buf_cancel_table); - log->l_buf_cancel_table = NULL; - return error; - } + if (error != 0) + goto out_cancel; + /* * Then do a second pass to actually recover the items in the log. * When it is complete free the table of buf cancel items. */ error = xlog_do_recovery_pass(log, head_blk, tail_blk, XLOG_RECOVER_PASS2, NULL); -#ifdef DEBUG - if (!error) { - int i; - - for (i = 0; i < XLOG_BC_TABLE_SIZE; i++) - ASSERT(list_empty(&log->l_buf_cancel_table[i])); - } -#endif /* DEBUG */ - - kmem_free(log->l_buf_cancel_table); - log->l_buf_cancel_table = NULL; - + if (!error) + xlog_check_buf_cancel_table(log); +out_cancel: + xlog_free_buf_cancel_table(log); return error; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Darrick J. Wong djwong@kernel.org
[ Upstream commit 8db074bd84df5ccc88bff3f8f900f66f4b8349fa ]
If log recovery fails, we free the memory used by the buffer cancellation buckets, but we don't actually traverse each bucket list to free the individual xfs_buf_cancel objects. This leads to a memory leak, as reported by kmemleak in xfs/051:
unreferenced object 0xffff888103629560 (size 32): comm "mount", pid 687045, jiffies 4296935916 (age 10.752s) hex dump (first 32 bytes): 08 d3 0a 01 00 00 00 00 08 00 00 00 01 00 00 00 ................ d0 f5 0b 92 81 88 ff ff 80 64 64 25 81 88 ff ff .........dd%.... backtrace: [<ffffffffa0317c83>] kmem_alloc+0x73/0x140 [xfs] [<ffffffffa03234a9>] xlog_recover_buf_commit_pass1+0x139/0x200 [xfs] [<ffffffffa032dc27>] xlog_recover_commit_trans+0x307/0x350 [xfs] [<ffffffffa032df15>] xlog_recovery_process_trans+0xa5/0xe0 [xfs] [<ffffffffa032e12d>] xlog_recover_process_data+0x8d/0x140 [xfs] [<ffffffffa032e49d>] xlog_do_recovery_pass+0x19d/0x740 [xfs] [<ffffffffa032f22d>] xlog_do_log_recovery+0x6d/0x150 [xfs] [<ffffffffa032f343>] xlog_do_recover+0x33/0x1d0 [xfs] [<ffffffffa032faba>] xlog_recover+0xda/0x190 [xfs] [<ffffffffa03194bc>] xfs_log_mount+0x14c/0x360 [xfs] [<ffffffffa030bfed>] xfs_mountfs+0x50d/0xa60 [xfs] [<ffffffffa03124b5>] xfs_fs_fill_super+0x6a5/0x950 [xfs] [<ffffffff812b92a5>] get_tree_bdev+0x175/0x280 [<ffffffff812b7c3a>] vfs_get_tree+0x1a/0x80 [<ffffffff812e366f>] path_mount+0x6ff/0xaa0 [<ffffffff812e3b13>] __x64_sys_mount+0x103/0x140
Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Dave Chinner dchinner@redhat.com Signed-off-by: Dave Chinner david@fromorbit.com Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_buf_item_recover.c | 13 +++++++++++++ 1 file changed, 13 insertions(+)
diff --git a/fs/xfs/xfs_buf_item_recover.c b/fs/xfs/xfs_buf_item_recover.c index dc099b2f4984c..635f7f8ed9c2d 100644 --- a/fs/xfs/xfs_buf_item_recover.c +++ b/fs/xfs/xfs_buf_item_recover.c @@ -1044,9 +1044,22 @@ void xlog_free_buf_cancel_table( struct xlog *log) { + int i; + if (!log->l_buf_cancel_table) return;
+ for (i = 0; i < XLOG_BC_TABLE_SIZE; i++) { + struct xfs_buf_cancel *bc; + + while ((bc = list_first_entry_or_null( + &log->l_buf_cancel_table[i], + struct xfs_buf_cancel, bc_list))) { + list_del(&bc->bc_list); + kmem_free(bc); + } + } + kmem_free(log->l_buf_cancel_table); log->l_buf_cancel_table = NULL; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Darrick J. Wong djwong@kernel.org
[ Upstream commit 910bbdf2f4d7df46781bc9b723048f5ebed3d0d7 ]
While we're messing around with how recovery allocates and frees the buffer cancellation table, convert the allocation to use kmalloc_array instead of the old kmem_alloc APIs, and make it handle a null return, even though that's not likely.
Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Dave Chinner dchinner@redhat.com Signed-off-by: Dave Chinner david@fromorbit.com Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/libxfs/xfs_log_recover.h | 2 +- fs/xfs/xfs_buf_item_recover.c | 14 ++++++++++---- fs/xfs/xfs_log_recover.c | 4 +++- 3 files changed, 14 insertions(+), 6 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_log_recover.h b/fs/xfs/libxfs/xfs_log_recover.h index b8b65a6e9b1ec..81a065b0b5710 100644 --- a/fs/xfs/libxfs/xfs_log_recover.h +++ b/fs/xfs/libxfs/xfs_log_recover.h @@ -120,7 +120,7 @@ int xlog_recover_iget(struct xfs_mount *mp, xfs_ino_t ino, struct xfs_inode **ipp); void xlog_recover_release_intent(struct xlog *log, unsigned short intent_type, uint64_t intent_id); -void xlog_alloc_buf_cancel_table(struct xlog *log); +int xlog_alloc_buf_cancel_table(struct xlog *log); void xlog_free_buf_cancel_table(struct xlog *log);
#ifdef DEBUG diff --git a/fs/xfs/xfs_buf_item_recover.c b/fs/xfs/xfs_buf_item_recover.c index 635f7f8ed9c2d..31cbe7deebfaf 100644 --- a/fs/xfs/xfs_buf_item_recover.c +++ b/fs/xfs/xfs_buf_item_recover.c @@ -1025,19 +1025,25 @@ xlog_check_buf_cancel_table( } #endif
-void +int xlog_alloc_buf_cancel_table( struct xlog *log) { + void *p; int i;
ASSERT(log->l_buf_cancel_table == NULL);
- log->l_buf_cancel_table = kmem_zalloc(XLOG_BC_TABLE_SIZE * - sizeof(struct list_head), - 0); + p = kmalloc_array(XLOG_BC_TABLE_SIZE, sizeof(struct list_head), + GFP_KERNEL); + if (!p) + return -ENOMEM; + + log->l_buf_cancel_table = p; for (i = 0; i < XLOG_BC_TABLE_SIZE; i++) INIT_LIST_HEAD(&log->l_buf_cancel_table[i]); + + return 0; }
void diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c index 18d8eebc2d445..aeb01d4c0423b 100644 --- a/fs/xfs/xfs_log_recover.c +++ b/fs/xfs/xfs_log_recover.c @@ -3256,7 +3256,9 @@ xlog_do_log_recovery( * First do a pass to find all of the cancelled buf log items. * Store them in the buf_cancel_table for use in the second pass. */ - xlog_alloc_buf_cancel_table(log); + error = xlog_alloc_buf_cancel_table(log); + if (error) + return error;
error = xlog_do_recovery_pass(log, head_blk, tail_blk, XLOG_RECOVER_PASS1, NULL);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kaixu Xia kaixuxia@tencent.com
[ Upstream commit 82af88063961da9425924d9aec3fb67a4ebade3e ]
We should use invalidate_lock and XFS_MMAPLOCK_SHARED to check the state of mmap_lock rw_semaphore in xfs_isilocked(), rather than i_rwsem and XFS_IOLOCK_SHARED.
Fixes: 2433480a7e1d ("xfs: Convert to use invalidate_lock") Signed-off-by: Kaixu Xia kaixuxia@tencent.com Reviewed-by: Dave Chinner dchinner@redhat.com Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_inode.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c index b2ea853182141..df64b902842dd 100644 --- a/fs/xfs/xfs_inode.c +++ b/fs/xfs/xfs_inode.c @@ -378,8 +378,8 @@ xfs_isilocked( }
if (lock_flags & (XFS_MMAPLOCK_EXCL|XFS_MMAPLOCK_SHARED)) { - return __xfs_rwsem_islocked(&VFS_I(ip)->i_rwsem, - (lock_flags & XFS_IOLOCK_SHARED)); + return __xfs_rwsem_islocked(&VFS_I(ip)->i_mapping->invalidate_lock, + (lock_flags & XFS_MMAPLOCK_SHARED)); }
if (lock_flags & (XFS_IOLOCK_EXCL | XFS_IOLOCK_SHARED)) {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Darrick J. Wong djwong@kernel.org
[ Upstream commit 7561cea5dbb97fecb952548a0fb74fb105bf4664 ]
KASAN reported the following use after free bug when running generic/475:
XFS (dm-0): Mounting V5 Filesystem XFS (dm-0): Starting recovery (logdev: internal) XFS (dm-0): Ending recovery (logdev: internal) Buffer I/O error on dev dm-0, logical block 20639616, async page read Buffer I/O error on dev dm-0, logical block 20639617, async page read XFS (dm-0): log I/O error -5 XFS (dm-0): Filesystem has been shut down due to log error (0x2). XFS (dm-0): Unmounting Filesystem XFS (dm-0): Please unmount the filesystem and rectify the problem(s). ================================================================== BUG: KASAN: use-after-free in do_raw_spin_lock+0x246/0x270 Read of size 4 at addr ffff888109dd84c4 by task 3:1H/136
CPU: 3 PID: 136 Comm: 3:1H Not tainted 5.19.0-rc4-xfsx #rc4 8e53ab5ad0fddeb31cee5e7063ff9c361915a9c4 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014 Workqueue: xfs-log/dm-0 xlog_ioend_work [xfs] Call Trace: <TASK> dump_stack_lvl+0x34/0x44 print_report.cold+0x2b8/0x661 ? do_raw_spin_lock+0x246/0x270 kasan_report+0xab/0x120 ? do_raw_spin_lock+0x246/0x270 do_raw_spin_lock+0x246/0x270 ? rwlock_bug.part.0+0x90/0x90 xlog_force_shutdown+0xf6/0x370 [xfs 4ad76ae0d6add7e8183a553e624c31e9ed567318] xlog_ioend_work+0x100/0x190 [xfs 4ad76ae0d6add7e8183a553e624c31e9ed567318] process_one_work+0x672/0x1040 worker_thread+0x59b/0xec0 ? __kthread_parkme+0xc6/0x1f0 ? process_one_work+0x1040/0x1040 ? process_one_work+0x1040/0x1040 kthread+0x29e/0x340 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork+0x1f/0x30 </TASK>
Allocated by task 154099: kasan_save_stack+0x1e/0x40 __kasan_kmalloc+0x81/0xa0 kmem_alloc+0x8d/0x2e0 [xfs] xlog_cil_init+0x1f/0x540 [xfs] xlog_alloc_log+0xd1e/0x1260 [xfs] xfs_log_mount+0xba/0x640 [xfs] xfs_mountfs+0xf2b/0x1d00 [xfs] xfs_fs_fill_super+0x10af/0x1910 [xfs] get_tree_bdev+0x383/0x670 vfs_get_tree+0x7d/0x240 path_mount+0xdb7/0x1890 __x64_sys_mount+0x1fa/0x270 do_syscall_64+0x2b/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0
Freed by task 154151: kasan_save_stack+0x1e/0x40 kasan_set_track+0x21/0x30 kasan_set_free_info+0x20/0x30 ____kasan_slab_free+0x110/0x190 slab_free_freelist_hook+0xab/0x180 kfree+0xbc/0x310 xlog_dealloc_log+0x1b/0x2b0 [xfs] xfs_unmountfs+0x119/0x200 [xfs] xfs_fs_put_super+0x6e/0x2e0 [xfs] generic_shutdown_super+0x12b/0x3a0 kill_block_super+0x95/0xd0 deactivate_locked_super+0x80/0x130 cleanup_mnt+0x329/0x4d0 task_work_run+0xc5/0x160 exit_to_user_mode_prepare+0xd4/0xe0 syscall_exit_to_user_mode+0x1d/0x40 entry_SYSCALL_64_after_hwframe+0x46/0xb0
This appears to be a race between the unmount process, which frees the CIL and waits for in-flight iclog IO; and the iclog IO completion. When generic/475 runs, it starts fsstress in the background, waits a few seconds, and substitutes a dm-error device to simulate a disk falling out of a machine. If the fsstress encounters EIO on a pure data write, it will exit but the filesystem will still be online.
The next thing the test does is unmount the filesystem, which tries to clean the log, free the CIL, and wait for iclog IO completion. If an iclog was being written when the dm-error switch occurred, it can race with log unmounting as follows:
Thread 1 Thread 2
xfs_log_unmount xfs_log_clean xfs_log_quiesce xlog_ioend_work <observe error> xlog_force_shutdown test_and_set_bit(XLOG_IOERROR) xfs_log_force <log is shut down, nop> xfs_log_umount_write <log is shut down, nop> xlog_dealloc_log xlog_cil_destroy <wait for iclogs> spin_lock(&log->l_cilp->xc_push_lock) <KABOOM>
Therefore, free the CIL after waiting for the iclogs to complete. I /think/ this race has existed for quite a few years now, though I don't remember the ~2014 era logging code well enough to know if it was a real threat then or if the actual race was exposed only more recently.
Fixes: ac983517ec59 ("xfs: don't sleep in xlog_cil_force_lsn on shutdown") Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Dave Chinner dchinner@redhat.com Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_log.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/fs/xfs/xfs_log.c b/fs/xfs/xfs_log.c index 0fb7d05ca308d..eba295f666acc 100644 --- a/fs/xfs/xfs_log.c +++ b/fs/xfs/xfs_log.c @@ -2061,8 +2061,6 @@ xlog_dealloc_log( xlog_in_core_t *iclog, *next_iclog; int i;
- xlog_cil_destroy(log); - /* * Cycle all the iclogbuf locks to make sure all log IO completion * is done before we tear down these buffers. @@ -2074,6 +2072,13 @@ xlog_dealloc_log( iclog = iclog->ic_next; }
+ /* + * Destroy the CIL after waiting for iclog IO completion because an + * iclog EIO error will try to shut down the log, which accesses the + * CIL to wake up the waiters. + */ + xlog_cil_destroy(log); + iclog = log->l_iclog; for (i = 0; i < log->l_iclog_bufs; i++) { next_iclog = iclog->ic_next;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhang Yi yi.zhang@huawei.com
[ Upstream commit 04a98a036cf8b810dda172a9dcfcbd783bf63655 ]
In the procedure of recover AGI unlinked lists, if something bad happenes on one of the unlinked inode in the bucket list, we would call xlog_recover_clear_agi_bucket() to clear the whole unlinked bucket list, not the unlinked inodes after the bad one. If we have already added some inodes to the gc workqueue before the bad inode in the list, we could get below error when freeing those inodes, and finaly fail to complete the log recover procedure.
XFS (ram0): Internal error xfs_iunlink_remove at line 2456 of file fs/xfs/xfs_inode.c. Caller xfs_ifree+0xb0/0x360 [xfs]
The problem is xlog_recover_clear_agi_bucket() clear the bucket list, so the gc worker fail to check the agino in xfs_verify_agino(). Fix this by flush workqueue before clearing the bucket.
Fixes: ab23a7768739 ("xfs: per-cpu deferred inode inactivation queues") Signed-off-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: Dave Chinner dchinner@redhat.com Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Dave Chinner david@fromorbit.com Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_log_recover.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c index aeb01d4c0423b..04961ebf16ea2 100644 --- a/fs/xfs/xfs_log_recover.c +++ b/fs/xfs/xfs_log_recover.c @@ -2739,6 +2739,7 @@ xlog_recover_process_one_iunlink( * Call xlog_recover_clear_agi_bucket() to perform a transaction to * clear the inode pointer in the bucket. */ + xfs_inodegc_flush(mp); xlog_recover_clear_agi_bucket(mp, agno, bucket); return NULLAGINO; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Darrick J. Wong djwong@kernel.org
[ Upstream commit 95ff0363f3f6ae70c21a0f2b0603e54438e5988b ]
The kernel build robot reported a UAF error while running xfs/433 (edited somewhat for brevity):
BUG: KASAN: use-after-free in xfs_attr3_node_inactive (fs/xfs/xfs_attr_inactive.c:214) xfs Read of size 4 at addr ffff88820ac2bd44 by task kworker/0:2/139
CPU: 0 PID: 139 Comm: kworker/0:2 Tainted: G S 5.19.0-rc2-00004-g7cf2b0f9611b #1 Hardware name: Hewlett-Packard p6-1451cx/2ADA, BIOS 8.15 02/05/2013 Workqueue: xfs-inodegc/sdb4 xfs_inodegc_worker [xfs] Call Trace: <TASK> dump_stack_lvl (lib/dump_stack.c:107 (discriminator 1)) print_address_description+0x1f/0x200 print_report.cold (mm/kasan/report.c:430) kasan_report (mm/kasan/report.c:162 mm/kasan/report.c:493) xfs_attr3_node_inactive (fs/xfs/xfs_attr_inactive.c:214) xfs xfs_attr3_root_inactive (fs/xfs/xfs_attr_inactive.c:296) xfs xfs_attr_inactive (fs/xfs/xfs_attr_inactive.c:371) xfs xfs_inactive (fs/xfs/xfs_inode.c:1781) xfs xfs_inodegc_worker (fs/xfs/xfs_icache.c:1837 fs/xfs/xfs_icache.c:1860) xfs process_one_work worker_thread kthread ret_from_fork </TASK>
Allocated by task 139: kasan_save_stack (mm/kasan/common.c:39) __kasan_slab_alloc (mm/kasan/common.c:45 mm/kasan/common.c:436 mm/kasan/common.c:469) kmem_cache_alloc (mm/slab.h:750 mm/slub.c:3214 mm/slub.c:3222 mm/slub.c:3229 mm/slub.c:3239) _xfs_buf_alloc (include/linux/instrumented.h:86 include/linux/atomic/atomic-instrumented.h:41 fs/xfs/xfs_buf.c:232) xfs xfs_buf_get_map (fs/xfs/xfs_buf.c:660) xfs xfs_buf_read_map (fs/xfs/xfs_buf.c:777) xfs xfs_trans_read_buf_map (fs/xfs/xfs_trans_buf.c:289) xfs xfs_da_read_buf (fs/xfs/libxfs/xfs_da_btree.c:2652) xfs xfs_da3_node_read (fs/xfs/libxfs/xfs_da_btree.c:392) xfs xfs_attr3_root_inactive (fs/xfs/xfs_attr_inactive.c:272) xfs xfs_attr_inactive (fs/xfs/xfs_attr_inactive.c:371) xfs xfs_inactive (fs/xfs/xfs_inode.c:1781) xfs xfs_inodegc_worker (fs/xfs/xfs_icache.c:1837 fs/xfs/xfs_icache.c:1860) xfs process_one_work worker_thread kthread ret_from_fork
Freed by task 139: kasan_save_stack (mm/kasan/common.c:39) kasan_set_track (mm/kasan/common.c:45) kasan_set_free_info (mm/kasan/generic.c:372) __kasan_slab_free (mm/kasan/common.c:368 mm/kasan/common.c:328 mm/kasan/common.c:374) kmem_cache_free (mm/slub.c:1753 mm/slub.c:3507 mm/slub.c:3524) xfs_buf_rele (fs/xfs/xfs_buf.c:1040) xfs xfs_attr3_node_inactive (fs/xfs/xfs_attr_inactive.c:210) xfs xfs_attr3_root_inactive (fs/xfs/xfs_attr_inactive.c:296) xfs xfs_attr_inactive (fs/xfs/xfs_attr_inactive.c:371) xfs xfs_inactive (fs/xfs/xfs_inode.c:1781) xfs xfs_inodegc_worker (fs/xfs/xfs_icache.c:1837 fs/xfs/xfs_icache.c:1860) xfs process_one_work worker_thread kthread ret_from_fork
I reproduced this for my own satisfaction, and got the same report, along with an extra morsel:
The buggy address belongs to the object at ffff88802103a800 which belongs to the cache xfs_buf of size 432 The buggy address is located 396 bytes inside of 432-byte region [ffff88802103a800, ffff88802103a9b0)
I tracked this code down to:
error = xfs_trans_get_buf(*trans, mp->m_ddev_targp, child_blkno, XFS_FSB_TO_BB(mp, mp->m_attr_geo->fsbcount), 0, &child_bp); if (error) return error; error = bp->b_error;
That doesn't look right -- I think this should be dereferencing child_bp, not bp. Looking through the codebase history, I think this was added by commit 2911edb653b9 ("xfs: remove the mappedbno argument to xfs_da_get_buf"), which replaced a call to xfs_da_get_buf with the current call to xfs_trans_get_buf. Not sure why we trans_brelse'd @bp earlier in the function, but I'm guessing it's to avoid pinning too many buffers in memory while we inactivate the bottom of the attr tree. Hence we now have to get the buffer back.
I /think/ this was supposed to check child_bp->b_error and fail the rest of the invalidation if child_bp had experienced any kind of IO or corruption error. I bet the xfs_da3_node_read earlier in the loop will catch most cases of incoming on-disk corruption which makes this check mostly moot unless someone corrupts the buffer and the AIL pushes it out to disk while the buffer's unlocked.
In the first case we'll never get to the bad check, and in the second case the AIL will shut down the log, at which point there's no reason to check b_error. Remove the check, and null out @bp to avoid this problem in the future.
Cc: hch@lst.de Reported-by: kernel test robot oliver.sang@intel.com Fixes: 2911edb653b9 ("xfs: remove the mappedbno argument to xfs_da_get_buf") Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Dave Chinner dchinner@redhat.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_attr_inactive.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/fs/xfs/xfs_attr_inactive.c b/fs/xfs/xfs_attr_inactive.c index 2b5da6218977c..2afa6d9a7f8f6 100644 --- a/fs/xfs/xfs_attr_inactive.c +++ b/fs/xfs/xfs_attr_inactive.c @@ -158,6 +158,7 @@ xfs_attr3_node_inactive( } child_fsb = be32_to_cpu(ichdr.btree[0].before); xfs_trans_brelse(*trans, bp); /* no locks for later trans */ + bp = NULL;
/* * If this is the node level just above the leaves, simply loop @@ -211,12 +212,8 @@ xfs_attr3_node_inactive( &child_bp); if (error) return error; - error = bp->b_error; - if (error) { - xfs_trans_brelse(*trans, child_bp); - return error; - } xfs_trans_binval(*trans, child_bp); + child_bp = NULL;
/* * If we're not done, re-read the parent to get the next @@ -233,6 +230,7 @@ xfs_attr3_node_inactive( bp->b_addr); child_fsb = be32_to_cpu(phdr.btree[i + 1].before); xfs_trans_brelse(*trans, bp); + bp = NULL; } /* * Atomically commit the whole invalidate stuff.
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Darrick J. Wong djwong@kernel.org
[ Upstream commit c78c2d0903183a41beb90c56a923e30f90fa91b9 ]
I observed the following evidence of a memory leak while running xfs/399 from the xfs fsck test suite (edited for brevity):
XFS (sde): Metadata corruption detected at xfs_attr_shortform_verify_struct.part.0+0x7b/0xb0 [xfs], inode 0x1172 attr fork XFS: Assertion failed: ip->i_af.if_u1.if_data == NULL, file: fs/xfs/libxfs/xfs_inode_fork.c, line: 315 ------------[ cut here ]------------ WARNING: CPU: 2 PID: 91635 at fs/xfs/xfs_message.c:104 assfail+0x46/0x4a [xfs] CPU: 2 PID: 91635 Comm: xfs_scrub Tainted: G W 5.19.0-rc7-xfsx #rc7 6e6475eb29fd9dda3181f81b7ca7ff961d277a40 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014 RIP: 0010:assfail+0x46/0x4a [xfs] Call Trace: <TASK> xfs_ifork_zap_attr+0x7c/0xb0 xfs_iformat_attr_fork+0x86/0x110 xfs_inode_from_disk+0x41d/0x480 xfs_iget+0x389/0xd70 xfs_bulkstat_one_int+0x5b/0x540 xfs_bulkstat_iwalk+0x1e/0x30 xfs_iwalk_ag_recs+0xd1/0x160 xfs_iwalk_run_callbacks+0xb9/0x180 xfs_iwalk_ag+0x1d8/0x2e0 xfs_iwalk+0x141/0x220 xfs_bulkstat+0x105/0x180 xfs_ioc_bulkstat.constprop.0.isra.0+0xc5/0x130 xfs_file_ioctl+0xa5f/0xef0 __x64_sys_ioctl+0x82/0xa0 do_syscall_64+0x2b/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0
This newly-added assertion checks that there aren't any incore data structures hanging off the incore fork when we're trying to reset its contents. From the call trace, it is evident that iget was trying to construct an incore inode from the ondisk inode, but the attr fork verifier failed and we were trying to undo all the memory allocations that we had done earlier.
The three assertions in xfs_ifork_zap_attr check that the caller has already called xfs_idestroy_fork, which clearly has not been done here. As the zap function then zeroes the pointers, we've effectively leaked the memory.
The shortest change would have been to insert an extra call to xfs_idestroy_fork, but it makes more sense to bundle the _idestroy_fork call into _zap_attr, since all other callsites call _idestroy_fork immediately prior to calling _zap_attr. IOWs, it eliminates one way to fail.
Note: This change only applies cleanly to 2ed5b09b3e8f, since we just reworked the attr fork lifetime. However, I think this memory leak has existed since 0f45a1b20cd8, since the chain xfs_iformat_attr_fork -> xfs_iformat_local -> xfs_init_local_fork will allocate ifp->if_u1.if_data, but if xfs_ifork_verify_local_attr fails, xfs_iformat_attr_fork will free i_afp without freeing any of the stuff hanging off i_afp. The solution for older kernels I think is to add the missing call to xfs_idestroy_fork just prior to calling kmem_cache_free.
Found by fuzzing a.sfattr.hdr.totsize = lastbit in xfs/399.
[ backport note: did not include refactoring of xfs_idestroy_fork into xfs_ifork_zap_attr, simply added the missing call as suggested in the commit for backports ]
Fixes: 2ed5b09b3e8f ("xfs: make inode attribute forks a permanent part of struct xfs_inode") Probably-Fixes: 0f45a1b20cd8 ("xfs: improve local fork verification") Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Dave Chinner dchinner@redhat.com Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/libxfs/xfs_inode_fork.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/fs/xfs/libxfs/xfs_inode_fork.c b/fs/xfs/libxfs/xfs_inode_fork.c index 20095233d7bc0..c1f965af8432d 100644 --- a/fs/xfs/libxfs/xfs_inode_fork.c +++ b/fs/xfs/libxfs/xfs_inode_fork.c @@ -330,6 +330,7 @@ xfs_iformat_attr_fork( }
if (error) { + xfs_idestroy_fork(ip->i_afp); kmem_cache_free(xfs_ifork_zone, ip->i_afp); ip->i_afp = NULL; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Darrick J. Wong djwong@kernel.org
[ Upstream commit f0c2d7d2abca24d19831c99edea458704fac8087 ]
Every now and then, I see the following hang during mount time quotacheck when running fstests. Turning on KASAN seems to make it happen somewhat more frequently. I've edited the backtrace for brevity.
XFS (sdd): Quotacheck needed: Please wait. XFS: Assertion failed: bp->b_flags & _XBF_DELWRI_Q, file: fs/xfs/xfs_buf.c, line: 2411 ------------[ cut here ]------------ WARNING: CPU: 0 PID: 1831409 at fs/xfs/xfs_message.c:104 assfail+0x46/0x4a [xfs] CPU: 0 PID: 1831409 Comm: mount Tainted: G W 5.19.0-rc6-xfsx #rc6 09911566947b9f737b036b4af85e399e4b9aef64 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014 RIP: 0010:assfail+0x46/0x4a [xfs] Code: a0 8f 41 a0 e8 45 fe ff ff 8a 1d 2c 36 10 00 80 fb 01 76 0f 0f b6 f3 48 c7 c7 c0 f0 4f a0 e8 10 f0 02 e1 80 e3 01 74 02 0f 0b <0f> 0b 5b c3 48 8d 45 10 48 89 e2 4c 89 e6 48 89 1c 24 48 89 44 24 RSP: 0018:ffffc900078c7b30 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffff8880099ac000 RCX: 000000007fffffff RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffffffffa0418fa0 RBP: ffff8880197bc1c0 R08: 0000000000000000 R09: 000000000000000a R10: 000000000000000a R11: f000000000000000 R12: ffffc900078c7d20 R13: 00000000fffffff5 R14: ffffc900078c7d20 R15: 0000000000000000 FS: 00007f0449903800(0000) GS:ffff88803ec00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00005610ada631f0 CR3: 0000000014dd8002 CR4: 00000000001706f0 Call Trace: <TASK> xfs_buf_delwri_pushbuf+0x150/0x160 [xfs 4561f5b32c9bfb874ec98d58d0719464e1f87368] xfs_qm_flush_one+0xd6/0x130 [xfs 4561f5b32c9bfb874ec98d58d0719464e1f87368] xfs_qm_dquot_walk.isra.0+0x109/0x1e0 [xfs 4561f5b32c9bfb874ec98d58d0719464e1f87368] xfs_qm_quotacheck+0x319/0x490 [xfs 4561f5b32c9bfb874ec98d58d0719464e1f87368] xfs_qm_mount_quotas+0x65/0x2c0 [xfs 4561f5b32c9bfb874ec98d58d0719464e1f87368] xfs_mountfs+0x6b5/0xab0 [xfs 4561f5b32c9bfb874ec98d58d0719464e1f87368] xfs_fs_fill_super+0x781/0x990 [xfs 4561f5b32c9bfb874ec98d58d0719464e1f87368] get_tree_bdev+0x175/0x280 vfs_get_tree+0x1a/0x80 path_mount+0x6f5/0xaa0 __x64_sys_mount+0x103/0x140 do_syscall_64+0x2b/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0
I /think/ this can happen if xfs_qm_flush_one is racing with xfs_qm_dquot_isolate (i.e. dquot reclaim) when the second function has taken the dquot flush lock but xfs_qm_dqflush hasn't yet locked the dquot buffer, let alone queued it to the delwri list. In this case, flush_one will fail to get the dquot flush lock, but it can lock the incore buffer, but xfs_buf_delwri_pushbuf will then trip over this ASSERT, which checks that the buffer isn't on a delwri list. The hang results because the _delwri_submit_buffers ignores non DELWRI_Q buffers, which means that xfs_buf_iowait waits forever for an IO that has not yet been scheduled.
AFAICT, a reasonable solution here is to detect a dquot buffer that is not on a DELWRI list, drop it, and return -EAGAIN to try the flush again. It's not /that/ big of a deal if quotacheck writes the dquot buffer repeatedly before we even set QUOTA_CHKD.
Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Dave Chinner dchinner@redhat.com Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_qm.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/fs/xfs/xfs_qm.c b/fs/xfs/xfs_qm.c index 623244650a2f0..792736e29a37a 100644 --- a/fs/xfs/xfs_qm.c +++ b/fs/xfs/xfs_qm.c @@ -1244,6 +1244,13 @@ xfs_qm_flush_one( error = -EINVAL; goto out_unlock; } + + if (!(bp->b_flags & _XBF_DELWRI_Q)) { + error = -EAGAIN; + xfs_buf_relse(bp); + goto out_unlock; + } + xfs_buf_unlock(bp);
xfs_buf_delwri_pushbuf(bp, buffer_list);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gao Xiang hsiangkao@linux.alibaba.com
[ Upstream commit 1a39ae415c1be1e46f5b3f97d438c7c4adc22b63 ]
COW extents are already converted into written real extents after xfs_reflink_convert_cow_locked(), therefore cmap->br_state should reflect it.
Otherwise, there is another necessary unwritten convertion triggered in xfs_dio_write_end_io() for direct I/O cases.
Signed-off-by: Gao Xiang hsiangkao@linux.alibaba.com Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_reflink.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 36832e4bc803c..628ce65d02bb5 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -425,7 +425,10 @@ xfs_reflink_allocate_cow( if (!convert_now || cmap->br_state == XFS_EXT_NORM) return 0; trace_xfs_reflink_convert_cow(ip, cmap); - return xfs_reflink_convert_cow_locked(ip, offset_fsb, count_fsb); + error = xfs_reflink_convert_cow_locked(ip, offset_fsb, count_fsb); + if (!error) + cmap->br_state = XFS_EXT_NORM; + return error;
out_trans_cancel: xfs_trans_cancel(tp);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chandan Babu R chandan.babu@oracle.com
[ Upstream commit d62113303d691bcd8d0675ae4ac63e7769afc56c ]
On a higly fragmented filesystem a Direct IO write can fail with -ENOSPC error even though the filesystem has sufficient number of free blocks.
This occurs if the file offset range on which the write operation is being performed has a delalloc extent in the cow fork and this delalloc extent begins much before the Direct IO range.
In such a scenario, xfs_reflink_allocate_cow() invokes xfs_bmapi_write() to allocate the blocks mapped by the delalloc extent. The extent thus allocated may not cover the beginning of file offset range on which the Direct IO write was issued. Hence xfs_reflink_allocate_cow() ends up returning -ENOSPC.
The following script reliably recreates the bug described above.
#!/usr/bin/bash
device=/dev/loop0 shortdev=$(basename $device)
mntpnt=/mnt/ file1=${mntpnt}/file1 file2=${mntpnt}/file2 fragmentedfile=${mntpnt}/fragmentedfile punchprog=/root/repos/xfstests-dev/src/punch-alternating
errortag=/sys/fs/xfs/${shortdev}/errortag/bmap_alloc_minlen_extent
umount $device > /dev/null 2>&1
echo "Create FS" mkfs.xfs -f -m reflink=1 $device > /dev/null 2>&1 if [[ $? != 0 ]]; then echo "mkfs failed." exit 1 fi
echo "Mount FS" mount $device $mntpnt > /dev/null 2>&1 if [[ $? != 0 ]]; then echo "mount failed." exit 1 fi
echo "Create source file" xfs_io -f -c "pwrite 0 32M" $file1 > /dev/null 2>&1
sync
echo "Create Reflinked file" xfs_io -f -c "reflink $file1" $file2 &>/dev/null
echo "Set cowextsize" xfs_io -c "cowextsize 16M" $file1 > /dev/null 2>&1
echo "Fragment FS" xfs_io -f -c "pwrite 0 64M" $fragmentedfile > /dev/null 2>&1 sync $punchprog $fragmentedfile
echo "Allocate block sized extent from now onwards" echo -n 1 > $errortag
echo "Create 16MiB delalloc extent in CoW fork" xfs_io -c "pwrite 0 4k" $file1 > /dev/null 2>&1
sync
echo "Direct I/O write at offset 12k" xfs_io -d -c "pwrite 12k 8k" $file1
This commit fixes the bug by invoking xfs_bmapi_write() in a loop until disk blocks are allocated for atleast the starting file offset of the Direct IO write range.
Fixes: 3c68d44a2b49 ("xfs: allocate direct I/O COW blocks in iomap_begin") Reported-and-Root-caused-by: Wengang Wang wen.gang.wang@oracle.com Signed-off-by: Chandan Babu R chandan.babu@oracle.com Reviewed-by: Darrick J. Wong djwong@kernel.org [djwong: slight editing to make the locking less grody, and fix some style things] Signed-off-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_reflink.c | 198 +++++++++++++++++++++++++++++++++++-------- 1 file changed, 163 insertions(+), 35 deletions(-)
diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c index 628ce65d02bb5..793bdf5ac2f76 100644 --- a/fs/xfs/xfs_reflink.c +++ b/fs/xfs/xfs_reflink.c @@ -340,9 +340,41 @@ xfs_find_trim_cow_extent( return 0; }
-/* Allocate all CoW reservations covering a range of blocks in a file. */ -int -xfs_reflink_allocate_cow( +static int +xfs_reflink_convert_unwritten( + struct xfs_inode *ip, + struct xfs_bmbt_irec *imap, + struct xfs_bmbt_irec *cmap, + bool convert_now) +{ + xfs_fileoff_t offset_fsb = imap->br_startoff; + xfs_filblks_t count_fsb = imap->br_blockcount; + int error; + + /* + * cmap might larger than imap due to cowextsize hint. + */ + xfs_trim_extent(cmap, offset_fsb, count_fsb); + + /* + * COW fork extents are supposed to remain unwritten until we're ready + * to initiate a disk write. For direct I/O we are going to write the + * data and need the conversion, but for buffered writes we're done. + */ + if (!convert_now || cmap->br_state == XFS_EXT_NORM) + return 0; + + trace_xfs_reflink_convert_cow(ip, cmap); + + error = xfs_reflink_convert_cow_locked(ip, offset_fsb, count_fsb); + if (!error) + cmap->br_state = XFS_EXT_NORM; + + return error; +} + +static int +xfs_reflink_fill_cow_hole( struct xfs_inode *ip, struct xfs_bmbt_irec *imap, struct xfs_bmbt_irec *cmap, @@ -351,25 +383,12 @@ xfs_reflink_allocate_cow( bool convert_now) { struct xfs_mount *mp = ip->i_mount; - xfs_fileoff_t offset_fsb = imap->br_startoff; - xfs_filblks_t count_fsb = imap->br_blockcount; struct xfs_trans *tp; - int nimaps, error = 0; - bool found; xfs_filblks_t resaligned; - xfs_extlen_t resblks = 0; - - ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL)); - if (!ip->i_cowfp) { - ASSERT(!xfs_is_reflink_inode(ip)); - xfs_ifork_init_cow(ip); - } - - error = xfs_find_trim_cow_extent(ip, imap, cmap, shared, &found); - if (error || !*shared) - return error; - if (found) - goto convert; + xfs_extlen_t resblks; + int nimaps; + int error; + bool found;
resaligned = xfs_aligned_fsb_count(imap->br_startoff, imap->br_blockcount, xfs_get_cowextsz_hint(ip)); @@ -385,17 +404,17 @@ xfs_reflink_allocate_cow(
*lockmode = XFS_ILOCK_EXCL;
- /* - * Check for an overlapping extent again now that we dropped the ilock. - */ error = xfs_find_trim_cow_extent(ip, imap, cmap, shared, &found); if (error || !*shared) goto out_trans_cancel; + if (found) { xfs_trans_cancel(tp); goto convert; }
+ ASSERT(cmap->br_startoff > imap->br_startoff); + /* Allocate the entire reservation as unwritten blocks. */ nimaps = 1; error = xfs_bmapi_write(tp, ip, imap->br_startoff, imap->br_blockcount, @@ -415,26 +434,135 @@ xfs_reflink_allocate_cow( */ if (nimaps == 0) return -ENOSPC; + convert: - xfs_trim_extent(cmap, offset_fsb, count_fsb); - /* - * COW fork extents are supposed to remain unwritten until we're ready - * to initiate a disk write. For direct I/O we are going to write the - * data and need the conversion, but for buffered writes we're done. - */ - if (!convert_now || cmap->br_state == XFS_EXT_NORM) - return 0; - trace_xfs_reflink_convert_cow(ip, cmap); - error = xfs_reflink_convert_cow_locked(ip, offset_fsb, count_fsb); - if (!error) - cmap->br_state = XFS_EXT_NORM; + return xfs_reflink_convert_unwritten(ip, imap, cmap, convert_now); + +out_trans_cancel: + xfs_trans_cancel(tp); return error; +} + +static int +xfs_reflink_fill_delalloc( + struct xfs_inode *ip, + struct xfs_bmbt_irec *imap, + struct xfs_bmbt_irec *cmap, + bool *shared, + uint *lockmode, + bool convert_now) +{ + struct xfs_mount *mp = ip->i_mount; + struct xfs_trans *tp; + int nimaps; + int error; + bool found; + + do { + xfs_iunlock(ip, *lockmode); + *lockmode = 0; + + error = xfs_trans_alloc_inode(ip, &M_RES(mp)->tr_write, 0, 0, + false, &tp); + if (error) + return error; + + *lockmode = XFS_ILOCK_EXCL; + + error = xfs_find_trim_cow_extent(ip, imap, cmap, shared, + &found); + if (error || !*shared) + goto out_trans_cancel; + + if (found) { + xfs_trans_cancel(tp); + break; + } + + ASSERT(isnullstartblock(cmap->br_startblock) || + cmap->br_startblock == DELAYSTARTBLOCK); + + /* + * Replace delalloc reservation with an unwritten extent. + */ + nimaps = 1; + error = xfs_bmapi_write(tp, ip, cmap->br_startoff, + cmap->br_blockcount, + XFS_BMAPI_COWFORK | XFS_BMAPI_PREALLOC, 0, + cmap, &nimaps); + if (error) + goto out_trans_cancel; + + xfs_inode_set_cowblocks_tag(ip); + error = xfs_trans_commit(tp); + if (error) + return error; + + /* + * Allocation succeeded but the requested range was not even + * partially satisfied? Bail out! + */ + if (nimaps == 0) + return -ENOSPC; + } while (cmap->br_startoff + cmap->br_blockcount <= imap->br_startoff); + + return xfs_reflink_convert_unwritten(ip, imap, cmap, convert_now);
out_trans_cancel: xfs_trans_cancel(tp); return error; }
+/* Allocate all CoW reservations covering a range of blocks in a file. */ +int +xfs_reflink_allocate_cow( + struct xfs_inode *ip, + struct xfs_bmbt_irec *imap, + struct xfs_bmbt_irec *cmap, + bool *shared, + uint *lockmode, + bool convert_now) +{ + int error; + bool found; + + ASSERT(xfs_isilocked(ip, XFS_ILOCK_EXCL)); + if (!ip->i_cowfp) { + ASSERT(!xfs_is_reflink_inode(ip)); + xfs_ifork_init_cow(ip); + } + + error = xfs_find_trim_cow_extent(ip, imap, cmap, shared, &found); + if (error || !*shared) + return error; + + /* CoW fork has a real extent */ + if (found) + return xfs_reflink_convert_unwritten(ip, imap, cmap, + convert_now); + + /* + * CoW fork does not have an extent and data extent is shared. + * Allocate a real extent in the CoW fork. + */ + if (cmap->br_startoff > imap->br_startoff) + return xfs_reflink_fill_cow_hole(ip, imap, cmap, shared, + lockmode, convert_now); + + /* + * CoW fork has a delalloc reservation. Replace it with a real extent. + * There may or may not be a data fork mapping. + */ + if (isnullstartblock(cmap->br_startblock) || + cmap->br_startblock == DELAYSTARTBLOCK) + return xfs_reflink_fill_delalloc(ip, imap, cmap, shared, + lockmode, convert_now); + + /* Shouldn't get here. */ + ASSERT(0); + return -EFSCORRUPTED; +} + /* * Cancel CoW reservations for some block range of an inode. *
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: hexiaole hexiaole@kylinos.cn
[ Upstream commit 031d166f968efba6e4f091ff75d0bb5206bb3918 ]
In 'fs/xfs/libxfs/xfs_trans_resv.c', the comment for transaction of removing a directory entry writes:
/* fs/xfs/libxfs/xfs_trans_resv.c begin */ /* * For removing a directory entry we can modify: * the parent directory inode: inode size * the removed inode: inode size ... xfs_calc_remove_reservation( struct xfs_mount *mp) { return XFS_DQUOT_LOGRES(mp) + xfs_calc_iunlink_add_reservation(mp) + max((xfs_calc_inode_res(mp, 1) + ... /* fs/xfs/libxfs/xfs_trans_resv.c end */
There has 2 inode size of space to be reserverd, but the actual code for inode reservation space writes.
There only count for 1 inode size to be reserved in 'xfs_calc_inode_res(mp, 1)', rather than 2.
Signed-off-by: hexiaole hexiaole@kylinos.cn Reviewed-by: Darrick J. Wong djwong@kernel.org [djwong: remove redundant code citations] Signed-off-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/libxfs/xfs_trans_resv.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/xfs/libxfs/xfs_trans_resv.c b/fs/xfs/libxfs/xfs_trans_resv.c index 5e300daa25593..2db9d9d123444 100644 --- a/fs/xfs/libxfs/xfs_trans_resv.c +++ b/fs/xfs/libxfs/xfs_trans_resv.c @@ -423,7 +423,7 @@ xfs_calc_remove_reservation( { return XFS_DQUOT_LOGRES(mp) + xfs_calc_iunlink_add_reservation(mp) + - max((xfs_calc_inode_res(mp, 1) + + max((xfs_calc_inode_res(mp, 2) + xfs_calc_buf_res(XFS_DIROP_LOG_COUNT(mp), XFS_FSB_TO_B(mp, 1))), (xfs_calc_buf_res(4, mp->m_sb.sb_sectsize) +
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Darrick J. Wong djwong@kernel.org
[ Upstream commit 97cf79677ecb50a38517253ae2fd705849a7e51a ]
KASAN reported a UAF bug when I was running xfs/235:
BUG: KASAN: use-after-free in xlog_recover_process_intents+0xa77/0xae0 [xfs] Read of size 8 at addr ffff88804391b360 by task mount/5680
CPU: 2 PID: 5680 Comm: mount Not tainted 6.0.0-xfsx #6.0.0 77e7b52a4943a975441e5ac90a5ad7748b7867f6 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x34/0x44 print_report.cold+0x2cc/0x682 kasan_report+0xa3/0x120 xlog_recover_process_intents+0xa77/0xae0 [xfs fb841c7180aad3f8359438576e27867f5795667e] xlog_recover_finish+0x7d/0x970 [xfs fb841c7180aad3f8359438576e27867f5795667e] xfs_log_mount_finish+0x2d7/0x5d0 [xfs fb841c7180aad3f8359438576e27867f5795667e] xfs_mountfs+0x11d4/0x1d10 [xfs fb841c7180aad3f8359438576e27867f5795667e] xfs_fs_fill_super+0x13d5/0x1a80 [xfs fb841c7180aad3f8359438576e27867f5795667e] get_tree_bdev+0x3da/0x6e0 vfs_get_tree+0x7d/0x240 path_mount+0xdd3/0x17d0 __x64_sys_mount+0x1fa/0x270 do_syscall_64+0x2b/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0 RIP: 0033:0x7ff5bc069eae Code: 48 8b 0d 85 1f 0f 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 49 89 ca b8 a5 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 52 1f 0f 00 f7 d8 64 89 01 48 RSP: 002b:00007ffe433fd448 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5 RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007ff5bc069eae RDX: 00005575d7213290 RSI: 00005575d72132d0 RDI: 00005575d72132b0 RBP: 00005575d7212fd0 R08: 00005575d7213230 R09: 00005575d7213fe0 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 00005575d7213290 R14: 00005575d72132b0 R15: 00005575d7212fd0 </TASK>
Allocated by task 5680: kasan_save_stack+0x1e/0x40 __kasan_slab_alloc+0x66/0x80 kmem_cache_alloc+0x152/0x320 xfs_rui_init+0x17a/0x1b0 [xfs] xlog_recover_rui_commit_pass2+0xb9/0x2e0 [xfs] xlog_recover_items_pass2+0xe9/0x220 [xfs] xlog_recover_commit_trans+0x673/0x900 [xfs] xlog_recovery_process_trans+0xbe/0x130 [xfs] xlog_recover_process_data+0x103/0x2a0 [xfs] xlog_do_recovery_pass+0x548/0xc60 [xfs] xlog_do_log_recovery+0x62/0xc0 [xfs] xlog_do_recover+0x73/0x480 [xfs] xlog_recover+0x229/0x460 [xfs] xfs_log_mount+0x284/0x640 [xfs] xfs_mountfs+0xf8b/0x1d10 [xfs] xfs_fs_fill_super+0x13d5/0x1a80 [xfs] get_tree_bdev+0x3da/0x6e0 vfs_get_tree+0x7d/0x240 path_mount+0xdd3/0x17d0 __x64_sys_mount+0x1fa/0x270 do_syscall_64+0x2b/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0
Freed by task 5680: kasan_save_stack+0x1e/0x40 kasan_set_track+0x21/0x30 kasan_set_free_info+0x20/0x30 ____kasan_slab_free+0x144/0x1b0 slab_free_freelist_hook+0xab/0x180 kmem_cache_free+0x1f1/0x410 xfs_rud_item_release+0x33/0x80 [xfs] xfs_trans_free_items+0xc3/0x220 [xfs] xfs_trans_cancel+0x1fa/0x590 [xfs] xfs_rui_item_recover+0x913/0xd60 [xfs] xlog_recover_process_intents+0x24e/0xae0 [xfs] xlog_recover_finish+0x7d/0x970 [xfs] xfs_log_mount_finish+0x2d7/0x5d0 [xfs] xfs_mountfs+0x11d4/0x1d10 [xfs] xfs_fs_fill_super+0x13d5/0x1a80 [xfs] get_tree_bdev+0x3da/0x6e0 vfs_get_tree+0x7d/0x240 path_mount+0xdd3/0x17d0 __x64_sys_mount+0x1fa/0x270 do_syscall_64+0x2b/0x80 entry_SYSCALL_64_after_hwframe+0x46/0xb0
The buggy address belongs to the object at ffff88804391b300 which belongs to the cache xfs_rui_item of size 688 The buggy address is located 96 bytes inside of 688-byte region [ffff88804391b300, ffff88804391b5b0)
The buggy address belongs to the physical page: page:ffffea00010e4600 refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff888043919320 pfn:0x43918 head:ffffea00010e4600 order:2 compound_mapcount:0 compound_pincount:0 flags: 0x4fff80000010200(slab|head|node=1|zone=1|lastcpupid=0xfff) raw: 04fff80000010200 0000000000000000 dead000000000122 ffff88807f0eadc0 raw: ffff888043919320 0000000080140010 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected
Memory state around the buggy address: ffff88804391b200: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffff88804391b280: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff88804391b300: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^ ffff88804391b380: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ffff88804391b400: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb ==================================================================
The test fuzzes an rmap btree block and starts writer threads to induce a filesystem shutdown on the corrupt block. When the filesystem is remounted, recovery will try to replay the committed rmap intent item, but the corruption problem causes the recovery transaction to fail. Cancelling the transaction frees the RUD, which frees the RUI that we recovered.
When we return to xlog_recover_process_intents, @lip is now a dangling pointer, and we cannot use it to find the iop_recover method for the tracepoint. Hence we must store the item ops before calling ->iop_recover if we want to give it to the tracepoint so that the trace data will tell us exactly which intent item failed.
Signed-off-by: Darrick J. Wong djwong@kernel.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_log_recover.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c index 04961ebf16ea2..3d844a250b710 100644 --- a/fs/xfs/xfs_log_recover.c +++ b/fs/xfs/xfs_log_recover.c @@ -2560,6 +2560,7 @@ xlog_recover_process_intents( for (lip = xfs_trans_ail_cursor_first(ailp, &cur, 0); lip != NULL; lip = xfs_trans_ail_cursor_next(ailp, &cur)) { + const struct xfs_item_ops *ops; /* * We're done when we see something other than an intent. * There should be no intents left in the AIL now. @@ -2584,13 +2585,17 @@ xlog_recover_process_intents( * deferred ops, you /must/ attach them to the capture list in * the recover routine or else those subsequent intents will be * replayed in the wrong order! + * + * The recovery function can free the log item, so we must not + * access lip after it returns. */ spin_unlock(&ailp->ail_lock); - error = lip->li_ops->iop_recover(lip, &capture_list); + ops = lip->li_ops; + error = ops->iop_recover(lip, &capture_list); spin_lock(&ailp->ail_lock); if (error) { trace_xlog_intent_recovery_failed(log->l_mp, error, - lip->li_ops->iop_recover); + ops->iop_recover); break; } }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guo Xuenan guoxuenan@huawei.com
[ Upstream commit 13cf24e00665c9751951a422756d975812b71173 ]
For leaf dir, In most cases, there should be as many bestfree slots as the dir data blocks that can fit under i_size (except for [1]).
Root cause is we don't examin the number bestfree slots, when the slots number less than dir data blocks, if we need to allocate new dir data block and update the bestfree array, we will use the dir block number as index to assign bestfree array, while we did not check the leaf buf boundary which may cause UAF or other memory access problem. This issue can also triggered with test cases xfs/473 from fstests.
According to Dave Chinner & Darrick's suggestion, adding buffer verifier to detect this abnormal situation in time. Simplify the testcase for fstest xfs/554 [1]
The error log is shown as follows: ================================================================== BUG: KASAN: use-after-free in xfs_dir2_leaf_addname+0x1995/0x1ac0 Write of size 2 at addr ffff88810168b000 by task touch/1552 CPU: 5 PID: 1552 Comm: touch Not tainted 6.0.0-rc3+ #101 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x4d/0x66 print_report.cold+0xf6/0x691 kasan_report+0xa8/0x120 xfs_dir2_leaf_addname+0x1995/0x1ac0 xfs_dir_createname+0x58c/0x7f0 xfs_create+0x7af/0x1010 xfs_generic_create+0x270/0x5e0 path_openat+0x270b/0x3450 do_filp_open+0x1cf/0x2b0 do_sys_openat2+0x46b/0x7a0 do_sys_open+0xb7/0x130 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7fe4d9e9312b Code: 25 00 00 41 00 3d 00 00 41 00 74 4b 64 8b 04 25 18 00 00 00 85 c0 75 67 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 91 00 00 00 48 8b 4c 24 28 64 48 33 0c 25 RSP: 002b:00007ffda4c16c20 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fe4d9e9312b RDX: 0000000000000941 RSI: 00007ffda4c17f33 RDI: 00000000ffffff9c RBP: 00007ffda4c17f33 R08: 0000000000000000 R09: 0000000000000000 R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000941 R13: 00007fe4d9f631a4 R14: 00007ffda4c17f33 R15: 0000000000000000 </TASK>
The buggy address belongs to the physical page: page:ffffea000405a2c0 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x10168b flags: 0x2fffff80000000(node=0|zone=2|lastcpupid=0x1fffff) raw: 002fffff80000000 ffffea0004057788 ffffea000402dbc8 0000000000000000 raw: 0000000000000000 0000000000170000 00000000ffffffff 0000000000000000 page dumped because: kasan: bad access detected
Memory state around the buggy address: ffff88810168af00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ffff88810168af80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff88810168b000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
^ ffff88810168b080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ffff88810168b100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ================================================================== Disabling lock debugging due to kernel taint 00000000: 58 44 44 33 5b 53 35 c2 00 00 00 00 00 00 00 78 XDD3[S5........x XFS (sdb): Internal error xfs_dir2_data_use_free at line 1200 of file fs/xfs/libxfs/xfs_dir2_data.c. Caller xfs_dir2_data_use_free+0x28a/0xeb0 CPU: 5 PID: 1552 Comm: touch Tainted: G B 6.0.0-rc3+ Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014 Call Trace: <TASK> dump_stack_lvl+0x4d/0x66 xfs_corruption_error+0x132/0x150 xfs_dir2_data_use_free+0x198/0xeb0 xfs_dir2_leaf_addname+0xa59/0x1ac0 xfs_dir_createname+0x58c/0x7f0 xfs_create+0x7af/0x1010 xfs_generic_create+0x270/0x5e0 path_openat+0x270b/0x3450 do_filp_open+0x1cf/0x2b0 do_sys_openat2+0x46b/0x7a0 do_sys_open+0xb7/0x130 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7fe4d9e9312b Code: 25 00 00 41 00 3d 00 00 41 00 74 4b 64 8b 04 25 18 00 00 00 85 c0 75 67 44 89 e2 48 89 ee bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 0f 87 91 00 00 00 48 8b 4c 24 28 64 48 33 0c 25 RSP: 002b:00007ffda4c16c20 EFLAGS: 00000246 ORIG_RAX: 0000000000000101 RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007fe4d9e9312b RDX: 0000000000000941 RSI: 00007ffda4c17f46 RDI: 00000000ffffff9c RBP: 00007ffda4c17f46 R08: 0000000000000000 R09: 0000000000000001 R10: 00000000000001b6 R11: 0000000000000246 R12: 0000000000000941 R13: 00007fe4d9f631a4 R14: 00007ffda4c17f46 R15: 0000000000000000 </TASK> XFS (sdb): Corruption detected. Unmount and run xfs_repair
[1] https://lore.kernel.org/all/20220928095355.2074025-1-guoxuenan@huawei.com/ Reviewed-by: Hou Tao houtao1@huawei.com Signed-off-by: Guo Xuenan guoxuenan@huawei.com Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/libxfs/xfs_dir2_leaf.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/fs/xfs/libxfs/xfs_dir2_leaf.c b/fs/xfs/libxfs/xfs_dir2_leaf.c index d9b66306a9a77..cb9e950a911d8 100644 --- a/fs/xfs/libxfs/xfs_dir2_leaf.c +++ b/fs/xfs/libxfs/xfs_dir2_leaf.c @@ -146,6 +146,8 @@ xfs_dir3_leaf_check_int( xfs_dir2_leaf_tail_t *ltp; int stale; int i; + bool isleaf1 = (hdr->magic == XFS_DIR2_LEAF1_MAGIC || + hdr->magic == XFS_DIR3_LEAF1_MAGIC);
ltp = xfs_dir2_leaf_tail_p(geo, leaf);
@@ -158,8 +160,7 @@ xfs_dir3_leaf_check_int( return __this_address;
/* Leaves and bests don't overlap in leaf format. */ - if ((hdr->magic == XFS_DIR2_LEAF1_MAGIC || - hdr->magic == XFS_DIR3_LEAF1_MAGIC) && + if (isleaf1 && (char *)&hdr->ents[hdr->count] > (char *)xfs_dir2_leaf_bests_p(ltp)) return __this_address;
@@ -175,6 +176,10 @@ xfs_dir3_leaf_check_int( } if (hdr->ents[i].address == cpu_to_be32(XFS_DIR2_NULL_DATAPTR)) stale++; + if (isleaf1 && xfs_dir2_dataptr_to_db(geo, + be32_to_cpu(hdr->ents[i].address)) >= + be32_to_cpu(ltp->bestcount)) + return __this_address; } if (hdr->stale != stale) return __this_address;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zeng Heng zengheng4@huawei.com
[ Upstream commit cf4f4c12dea7a977a143c8fe5af1740b7f9876f8 ]
When `xfs_sysfs_init` returns failed, `mp->m_errortag` needs to free. Otherwise kmemleak would report memory leak after mounting xfs image:
unreferenced object 0xffff888101364900 (size 192): comm "mount", pid 13099, jiffies 4294915218 (age 335.207s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ backtrace: [<00000000f08ad25c>] __kmalloc+0x41/0x1b0 [<00000000dca9aeb6>] kmem_alloc+0xfd/0x430 [<0000000040361882>] xfs_errortag_init+0x20/0x110 [<00000000b384a0f6>] xfs_mountfs+0x6ea/0x1a30 [<000000003774395d>] xfs_fs_fill_super+0xe10/0x1a80 [<000000009cf07b6c>] get_tree_bdev+0x3e7/0x700 [<00000000046b5426>] vfs_get_tree+0x8e/0x2e0 [<00000000952ec082>] path_mount+0xf8c/0x1990 [<00000000beb1f838>] do_mount+0xee/0x110 [<000000000e9c41bb>] __x64_sys_mount+0x14b/0x1f0 [<00000000f7bb938e>] do_syscall_64+0x3b/0x90 [<000000003fcd67a9>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
Fixes: c68401011522 ("xfs: expose errortag knobs via sysfs") Signed-off-by: Zeng Heng zengheng4@huawei.com Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_error.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/fs/xfs/xfs_error.c b/fs/xfs/xfs_error.c index 81c445e9489bd..b0ccec92e015d 100644 --- a/fs/xfs/xfs_error.c +++ b/fs/xfs/xfs_error.c @@ -224,13 +224,18 @@ int xfs_errortag_init( struct xfs_mount *mp) { + int ret; + mp->m_errortag = kmem_zalloc(sizeof(unsigned int) * XFS_ERRTAG_MAX, KM_MAYFAIL); if (!mp->m_errortag) return -ENOMEM;
- return xfs_sysfs_init(&mp->m_errortag_kobj, &xfs_errortag_ktype, - &mp->m_kobj, "errortag"); + ret = xfs_sysfs_init(&mp->m_errortag_kobj, &xfs_errortag_ktype, + &mp->m_kobj, "errortag"); + if (ret) + kmem_free(mp->m_errortag); + return ret; }
void
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Li Zetao lizetao1@huawei.com
[ Upstream commit d08af40340cad0e025d643c3982781a8f99d5032 ]
kmemleak reported a sequence of memory leaks, and one of them indicated we failed to free a pointer: comm "mount", pid 19610, jiffies 4297086464 (age 60.635s) hex dump (first 8 bytes): 73 64 61 00 81 88 ff ff sda..... backtrace: [<00000000d77f3e04>] kstrdup_const+0x46/0x70 [<00000000e51fa804>] kobject_set_name_vargs+0x2f/0xb0 [<00000000247cd595>] kobject_init_and_add+0xb0/0x120 [<00000000f9139aaf>] xfs_mountfs+0x367/0xfc0 [<00000000250d3caf>] xfs_fs_fill_super+0xa16/0xdc0 [<000000008d873d38>] get_tree_bdev+0x256/0x390 [<000000004881f3fa>] vfs_get_tree+0x41/0xf0 [<000000008291ab52>] path_mount+0x9b3/0xdd0 [<0000000022ba8f2d>] __x64_sys_mount+0x190/0x1d0
As mentioned in kobject_init_and_add() comment, if this function returns an error, kobject_put() must be called to properly clean up the memory associated with the object. Apparently, xfs_sysfs_init() does not follow such a requirement. When kobject_init_and_add() returns an error, the space of kobj->kobject.name alloced by kstrdup_const() is unfree, which will cause the above stack.
Fix it by adding kobject_put() when kobject_init_and_add returns an error.
Fixes: a31b1d3d89e4 ("xfs: add xfs_mount sysfs kobject") Signed-off-by: Li Zetao lizetao1@huawei.com Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Leah Rumancik leah.rumancik@gmail.com Acked-by: Chandan Babu R chandanbabu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/xfs/xfs_sysfs.h | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/fs/xfs/xfs_sysfs.h b/fs/xfs/xfs_sysfs.h index 43585850f1546..513095e353a5b 100644 --- a/fs/xfs/xfs_sysfs.h +++ b/fs/xfs/xfs_sysfs.h @@ -33,10 +33,15 @@ xfs_sysfs_init( const char *name) { struct kobject *parent; + int err;
parent = parent_kobj ? &parent_kobj->kobject : NULL; init_completion(&kobj->complete); - return kobject_init_and_add(&kobj->kobject, ktype, parent, "%s", name); + err = kobject_init_and_add(&kobj->kobject, ktype, parent, "%s", name); + if (err) + kobject_put(&kobj->kobject); + + return err; }
static inline void
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Harshit Mogalapalli harshit.m.mogalapalli@oracle.com
commit 471aa951bf1206d3c10d0daa67005b8e4db4ff83 upstream.
When i915 perf interface is not available dereferencing it will lead to NULL dereferences.
As returning -ENOTSUPP is pretty clear return when perf interface is not available.
Fixes: 2fec539112e8 ("i915/perf: Replace DRM_DEBUG with driver specific drm_dbg call") Suggested-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Signed-off-by: Harshit Mogalapalli harshit.m.mogalapalli@oracle.com Reviewed-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Cc: stable@vger.kernel.org # v6.0+ Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20231027172822.2753059-1-harsh... [tursulin: added stable tag] (cherry picked from commit 36f27350ff745bd228ab04d7845dfbffc177a889) Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/i915_perf.c | 15 +++------------ 1 file changed, 3 insertions(+), 12 deletions(-)
--- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -3795,11 +3795,8 @@ int i915_perf_open_ioctl(struct drm_devi u32 known_open_flags; int ret;
- if (!perf->i915) { - drm_dbg(&perf->i915->drm, - "i915 perf interface not available for this system\n"); + if (!perf->i915) return -ENOTSUPP; - }
known_open_flags = I915_PERF_FLAG_FD_CLOEXEC | I915_PERF_FLAG_FD_NONBLOCK | @@ -4090,11 +4087,8 @@ int i915_perf_add_config_ioctl(struct dr struct i915_oa_reg *regs; int err, id;
- if (!perf->i915) { - drm_dbg(&perf->i915->drm, - "i915 perf interface not available for this system\n"); + if (!perf->i915) return -ENOTSUPP; - }
if (!perf->metrics_kobj) { drm_dbg(&perf->i915->drm, @@ -4256,11 +4250,8 @@ int i915_perf_remove_config_ioctl(struct struct i915_oa_config *oa_config; int ret;
- if (!perf->i915) { - drm_dbg(&perf->i915->drm, - "i915 perf interface not available for this system\n"); + if (!perf->i915) return -ENOTSUPP; - }
if (i915_perf_stream_paranoid && !perfmon_capable()) { drm_dbg(&perf->i915->drm,
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vikash Garodia quic_vgarodia@quicinc.com
commit 5e538fce33589da6d7cb2de1445b84d3a8a692f7 upstream.
Read and write pointers are used to track the packet index in the memory shared between video driver and firmware. There is a possibility of OOB access if the read or write pointer goes beyond the queue memory size. Add checks for the read and write pointer to avoid OOB access.
Cc: stable@vger.kernel.org Fixes: d96d3f30c0f2 ("[media] media: venus: hfi: add Venus HFI files") Signed-off-by: Vikash Garodia quic_vgarodia@quicinc.com Signed-off-by: Stanimir Varbanov stanimir.k.varbanov@gmail.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/platform/qcom/venus/hfi_venus.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
--- a/drivers/media/platform/qcom/venus/hfi_venus.c +++ b/drivers/media/platform/qcom/venus/hfi_venus.c @@ -205,6 +205,11 @@ static int venus_write_queue(struct venu
new_wr_idx = wr_idx + dwords; wr_ptr = (u32 *)(queue->qmem.kva + (wr_idx << 2)); + + if (wr_ptr < (u32 *)queue->qmem.kva || + wr_ptr > (u32 *)(queue->qmem.kva + queue->qmem.size - sizeof(*wr_ptr))) + return -EINVAL; + if (new_wr_idx < qsize) { memcpy(wr_ptr, packet, dwords << 2); } else { @@ -272,6 +277,11 @@ static int venus_read_queue(struct venus }
rd_ptr = (u32 *)(queue->qmem.kva + (rd_idx << 2)); + + if (rd_ptr < (u32 *)queue->qmem.kva || + rd_ptr > (u32 *)(queue->qmem.kva + queue->qmem.size - sizeof(*rd_ptr))) + return -EINVAL; + dwords = *rd_ptr >> 2; if (!dwords) return -EINVAL;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicholas Piggin npiggin@gmail.com
commit ea142e590aec55ba40c5affb4d49e68c713c63dc upstream.
When the PMU is disabled, MMCRA is not updated to disable BHRB and instruction sampling. This can lead to those features remaining enabled, which can slow down a real or emulated CPU.
Fixes: 1cade527f6e9 ("powerpc/perf: BHRB control to disable BHRB logic when not used") Cc: stable@vger.kernel.org # v5.9+ Signed-off-by: Nicholas Piggin npiggin@gmail.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://msgid.link/20231018153423.298373-1-npiggin@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/perf/core-book3s.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
--- a/arch/powerpc/perf/core-book3s.c +++ b/arch/powerpc/perf/core-book3s.c @@ -1342,8 +1342,7 @@ static void power_pmu_disable(struct pmu /* * Disable instruction sampling if it was enabled */ - if (cpuhw->mmcr.mmcra & MMCRA_SAMPLE_ENABLE) - val &= ~MMCRA_SAMPLE_ENABLE; + val &= ~MMCRA_SAMPLE_ENABLE;
/* Disable BHRB via mmcra (BHRBRD) for p10 */ if (ppmu->flags & PPMU_ARCH_31) @@ -1354,7 +1353,7 @@ static void power_pmu_disable(struct pmu * instruction sampling or BHRB. */ if (val != mmcra) { - mtspr(SPRN_MMCRA, mmcra); + mtspr(SPRN_MMCRA, val); mb(); isync(); }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kees Cook keescook@chromium.org
commit 381fdb73d1e2a48244de7260550e453d1003bb8e upstream.
The performance mode of the gcc-plugin randstruct was shuffling struct members outside of the cache-line groups. Limit the range to the specified group indexes.
Cc: linux-hardening@vger.kernel.org Cc: stable@vger.kernel.org Reported-by: Lukas Loidolt e1634039@student.tuwien.ac.at Closes: https://lore.kernel.org/all/f3ca77f0-e414-4065-83a5-ae4c4d25545d@student.tuw... Fixes: 313dd1b62921 ("gcc-plugins: Add the randstruct plugin") Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- scripts/gcc-plugins/randomize_layout_plugin.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
--- a/scripts/gcc-plugins/randomize_layout_plugin.c +++ b/scripts/gcc-plugins/randomize_layout_plugin.c @@ -209,12 +209,14 @@ static void partition_struct(tree *field
static void performance_shuffle(tree *newtree, unsigned long length, ranctx *prng_state) { - unsigned long i, x; + unsigned long i, x, index; struct partition_group size_group[length]; unsigned long num_groups = 0; unsigned long randnum;
partition_struct(newtree, length, (struct partition_group *)&size_group, &num_groups); + + /* FIXME: this group shuffle is currently a no-op. */ for (i = num_groups - 1; i > 0; i--) { struct partition_group tmp; randnum = ranval(prng_state) % (i + 1); @@ -224,11 +226,14 @@ static void performance_shuffle(tree *ne }
for (x = 0; x < num_groups; x++) { - for (i = size_group[x].start + size_group[x].length - 1; i > size_group[x].start; i--) { + for (index = size_group[x].length - 1; index > 0; index--) { tree tmp; + + i = size_group[x].start + index; if (DECL_BIT_FIELD_TYPE(newtree[i])) continue; - randnum = ranval(prng_state) % (i + 1); + randnum = ranval(prng_state) % (index + 1); + randnum += size_group[x].start; // we could handle this case differently if desired if (DECL_BIT_FIELD_TYPE(newtree[randnum])) continue;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hao Sun sunhao.th@gmail.com
commit 811c363645b33e6e22658634329e95f383dfc705 upstream.
In check_stack_write_fixed_off(), imm value is cast to u32 before being spilled to the stack. Therefore, the sign information is lost, and the range information is incorrect when load from the stack again.
For the following prog: 0: r2 = r10 1: *(u64*)(r2 -40) = -44 2: r0 = *(u64*)(r2 - 40) 3: if r0 s<= 0xa goto +2 4: r0 = 1 5: exit 6: r0 = 0 7: exit
The verifier gives: func#0 @0 0: R1=ctx(off=0,imm=0) R10=fp0 0: (bf) r2 = r10 ; R2_w=fp0 R10=fp0 1: (7a) *(u64 *)(r2 -40) = -44 ; R2_w=fp0 fp-40_w=4294967252 2: (79) r0 = *(u64 *)(r2 -40) ; R0_w=4294967252 R2_w=fp0 fp-40_w=4294967252 3: (c5) if r0 s< 0xa goto pc+2 mark_precise: frame0: last_idx 3 first_idx 0 subseq_idx -1 mark_precise: frame0: regs=r0 stack= before 2: (79) r0 = *(u64 *)(r2 -40) 3: R0_w=4294967252 4: (b7) r0 = 1 ; R0_w=1 5: (95) exit verification time 7971 usec stack depth 40 processed 6 insns (limit 1000000) max_states_per_insn 0 total_states 0 peak_states 0 mark_read 0
So remove the incorrect cast, since imm field is declared as s32, and __mark_reg_known() takes u64, so imm would be correctly sign extended by compiler.
Fixes: ecdf985d7615 ("bpf: track immediate values written to stack by BPF_ST instruction") Cc: stable@vger.kernel.org Signed-off-by: Hao Sun sunhao.th@gmail.com Acked-by: Shung-Hsi Yu shung-hsi.yu@suse.com Acked-by: Eduard Zingerman eddyz87@gmail.com Link: https://lore.kernel.org/r/20231101-fix-check-stack-write-v3-1-f05c2b1473d5@g... Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/bpf/verifier.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -2885,7 +2885,7 @@ static int check_stack_write_fixed_off(s insn->imm != 0 && env->bpf_capable) { struct bpf_reg_state fake_reg = {};
- __mark_reg_known(&fake_reg, (u32)insn->imm); + __mark_reg_known(&fake_reg, insn->imm); fake_reg.type = SCALAR_VALUE; save_register_state(state, spi, &fake_reg, size); } else if (reg && is_spillable_regtype(reg->type)) {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shung-Hsi Yu shung-hsi.yu@suse.com
commit 291d044fd51f8484066300ee42afecf8c8db7b3a upstream.
BPF_END and BPF_NEG has a different specification for the source bit in the opcode compared to other ALU/ALU64 instructions, and is either reserved or use to specify the byte swap endianness. In both cases the source bit does not encode source operand location, and src_reg is a reserved field.
backtrack_insn() currently does not differentiate BPF_END and BPF_NEG from other ALU/ALU64 instructions, which leads to r0 being incorrectly marked as precise when processing BPF_ALU | BPF_TO_BE | BPF_END instructions. This commit teaches backtrack_insn() to correctly mark precision for such case.
While precise tracking of BPF_NEG and other BPF_END instructions are correct and does not need fixing, this commit opt to process all BPF_NEG and BPF_END instructions within the same if-clause to better align with current convention used in the verifier (e.g. check_alu_op).
Fixes: b5dc0163d8fd ("bpf: precise scalar_value tracking") Cc: stable@vger.kernel.org Reported-by: Mohamed Mahmoud mmahmoud@redhat.com Closes: https://lore.kernel.org/r/87jzrrwptf.fsf@toke.dk Tested-by: Toke Høiland-Jørgensen toke@redhat.com Tested-by: Tao Lyu tao.lyu@epfl.ch Acked-by: Eduard Zingerman eddyz87@gmail.com Signed-off-by: Shung-Hsi Yu shung-hsi.yu@suse.com Link: https://lore.kernel.org/r/20231102053913.12004-2-shung-hsi.yu@suse.com Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/bpf/verifier.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
--- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -2189,7 +2189,12 @@ static int backtrack_insn(struct bpf_ver if (class == BPF_ALU || class == BPF_ALU64) { if (!(*reg_mask & dreg)) return 0; - if (opcode == BPF_MOV) { + if (opcode == BPF_END || opcode == BPF_NEG) { + /* sreg is reserved and unused + * dreg still need precision before this insn + */ + return 0; + } else if (opcode == BPF_MOV) { if (BPF_SRC(insn->code) == BPF_X) { /* dreg = sreg * dreg needs precision after this insn
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ranjan Kumar ranjan.kumar@broadcom.com
commit 3c978492c333f0c08248a8d51cecbe5eb5f617c9 upstream.
The retry loop continues to iterate until the count reaches 30, even after receiving the correct value. Exit loop when a non-zero value is read.
Fixes: 4ca10f3e3174 ("scsi: mpt3sas: Perform additional retries if doorbell read returns 0") Cc: stable@vger.kernel.org Signed-off-by: Ranjan Kumar ranjan.kumar@broadcom.com Link: https://lore.kernel.org/r/20231020105849.6350-1-ranjan.kumar@broadcom.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/mpt3sas/mpt3sas_base.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c @@ -224,8 +224,8 @@ _base_readl_ext_retry(const volatile voi
for (i = 0 ; i < 30 ; i++) { ret_val = readl(addr); - if (ret_val == 0) - continue; + if (ret_val != 0) + break; }
return ret_val;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chandrakanth patil chandrakanth.patil@broadcom.com
commit 8e3ed9e786511ad800c33605ed904b9de49323cf upstream.
In BMC environments with concurrent access to multiple registers, certain registers occasionally yield a value of 0 even after 3 retries due to hardware errata. As a fix, we have extended the retry count from 3 to 30.
The same errata applies to the mpt3sas driver, and a similar patch has been accepted. Please find more details in the mpt3sas patch reference link.
Link: https://lore.kernel.org/r/20230829090020.5417-2-ranjan.kumar@broadcom.com Fixes: 272652fcbf1a ("scsi: megaraid_sas: add retry logic in megasas_readl") Cc: stable@vger.kernel.org Signed-off-by: Chandrakanth patil chandrakanth.patil@broadcom.com Signed-off-by: Sumit Saxena sumit.saxena@broadcom.com Link: https://lore.kernel.org/r/20231003110021.168862-2-chandrakanth.patil@broadco... Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/megaraid/megaraid_sas_base.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/scsi/megaraid/megaraid_sas_base.c +++ b/drivers/scsi/megaraid/megaraid_sas_base.c @@ -263,13 +263,13 @@ u32 megasas_readl(struct megasas_instanc * Fusion registers could intermittently return all zeroes. * This behavior is transient in nature and subsequent reads will * return valid value. As a workaround in driver, retry readl for - * upto three times until a non-zero value is read. + * up to thirty times until a non-zero value is read. */ if (instance->adapter_type == AERO_SERIES) { do { ret_val = readl(addr); i++; - } while (ret_val == 0 && i < 3); + } while (ret_val == 0 && i < 30); return ret_val; } else { return readl(addr);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Quinn Tran qutran@marvell.com
commit 19597cad64d608aa8ac2f8aef50a50187a565223 upstream.
User experiences system crash when running AER error injection. The perturbation causes the abort-all-I/O path to trigger. The driver assumes all I/O on this path is FCP only. If there is both NVMe & FCP traffic, a system crash happens. Add additional check to see if I/O is FCP or not before access.
PID: 999019 TASK: ff35d769f24722c0 CPU: 53 COMMAND: "kworker/53:1" 0 [ff3f78b964847b58] machine_kexec at ffffffffae86973d 1 [ff3f78b964847ba8] __crash_kexec at ffffffffae9be29d 2 [ff3f78b964847c70] crash_kexec at ffffffffae9bf528 3 [ff3f78b964847c78] oops_end at ffffffffae8282ab 4 [ff3f78b964847c98] exc_page_fault at ffffffffaf2da502 5 [ff3f78b964847cc0] asm_exc_page_fault at ffffffffaf400b62 [exception RIP: qla2x00_abort_srb+444] RIP: ffffffffc07b5f8c RSP: ff3f78b964847d78 RFLAGS: 00010046 RAX: 0000000000000282 RBX: ff35d74a0195a200 RCX: ff35d76886fd03a0 RDX: 0000000000000001 RSI: ffffffffc07c5ec8 RDI: ff35d74a0195a200 RBP: ff35d76913d22080 R8: ff35d7694d103200 R9: ff35d7694d103200 R10: 0000000100000000 R11: ffffffffb05d6630 R12: 0000000000010000 R13: ff3f78b964847df8 R14: ff35d768d8754000 R15: ff35d768877248e0 ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018 6 [ff3f78b964847d70] qla2x00_abort_srb at ffffffffc07b5f84 [qla2xxx] 7 [ff3f78b964847de0] __qla2x00_abort_all_cmds at ffffffffc07b6238 [qla2xxx] 8 [ff3f78b964847e38] qla2x00_abort_all_cmds at ffffffffc07ba635 [qla2xxx] 9 [ff3f78b964847e58] qla2x00_terminate_rport_io at ffffffffc08145eb [qla2xxx] 10 [ff3f78b964847e70] fc_terminate_rport_io at ffffffffc045987e [scsi_transport_fc] 11 [ff3f78b964847e88] process_one_work at ffffffffae914f15 12 [ff3f78b964847ed0] worker_thread at ffffffffae9154c0 13 [ff3f78b964847f10] kthread at ffffffffae91c456 14 [ff3f78b964847f50] ret_from_fork at ffffffffae8036ef
Cc: stable@vger.kernel.org Fixes: f45bca8c5052 ("scsi: qla2xxx: Fix double scsi_done for abort path") Signed-off-by: Quinn Tran qutran@marvell.com Signed-off-by: Nilesh Javali njavali@marvell.com Link: https://lore.kernel.org/r/20231030064912.37912-1-njavali@marvell.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/qla2xxx/qla_os.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
--- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -1823,8 +1823,16 @@ static void qla2x00_abort_srb(struct qla }
spin_lock_irqsave(qp->qp_lock_ptr, *flags); - if (ret_cmd && blk_mq_request_started(scsi_cmd_to_rq(cmd))) - sp->done(sp, res); + switch (sp->type) { + case SRB_SCSI_CMD: + if (ret_cmd && blk_mq_request_started(scsi_cmd_to_rq(cmd))) + sp->done(sp, res); + break; + default: + if (ret_cmd) + sp->done(sp, res); + break; + } } else { sp->done(sp, res); }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Roxana Nicolescu roxana.nicolescu@canonical.com
commit 1c43c0f1f84aa59dfc98ce66f0a67b2922aa7f9d upstream.
x86 optimized crypto modules are built as modules rather than build-in and they are not loaded when the crypto API is initialized, resulting in the generic builtin module (sha1-generic) being used instead.
It was discovered when creating a sha1/sha256 checksum of a 2Gb file by using kcapi-tools because it would take significantly longer than creating a sha512 checksum of the same file. trace-cmd showed that for sha1/256 the generic module was used, whereas for sha512 the optimized module was used instead.
Add module aliases() for these x86 optimized crypto modules based on CPU feature bits so udev gets a chance to load them later in the boot process. This resulted in ~3x decrease in the real-time execution of kcapi-dsg.
Fix is inspired from commit aa031b8f702e ("crypto: x86/sha512 - load based on CPU features") where a similar fix was done for sha512.
Cc: stable@vger.kernel.org # 5.15+ Suggested-by: Dimitri John Ledkov dimitri.ledkov@canonical.com Suggested-by: Julian Andres Klode julian.klode@canonical.com Signed-off-by: Roxana Nicolescu roxana.nicolescu@canonical.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/crypto/sha1_ssse3_glue.c | 12 ++++++++++++ arch/x86/crypto/sha256_ssse3_glue.c | 12 ++++++++++++ 2 files changed, 24 insertions(+)
diff --git a/arch/x86/crypto/sha1_ssse3_glue.c b/arch/x86/crypto/sha1_ssse3_glue.c index 44340a1139e0..959afa705e95 100644 --- a/arch/x86/crypto/sha1_ssse3_glue.c +++ b/arch/x86/crypto/sha1_ssse3_glue.c @@ -24,8 +24,17 @@ #include <linux/types.h> #include <crypto/sha1.h> #include <crypto/sha1_base.h> +#include <asm/cpu_device_id.h> #include <asm/simd.h>
+static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL), + X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL), + X86_MATCH_FEATURE(X86_FEATURE_SSSE3, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static int sha1_update(struct shash_desc *desc, const u8 *data, unsigned int len, sha1_block_fn *sha1_xform) { @@ -301,6 +310,9 @@ static inline void unregister_sha1_ni(void) { }
static int __init sha1_ssse3_mod_init(void) { + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + if (register_sha1_ssse3()) goto fail;
diff --git a/arch/x86/crypto/sha256_ssse3_glue.c b/arch/x86/crypto/sha256_ssse3_glue.c index 3a5f6be7dbba..d25235f0ccaf 100644 --- a/arch/x86/crypto/sha256_ssse3_glue.c +++ b/arch/x86/crypto/sha256_ssse3_glue.c @@ -38,11 +38,20 @@ #include <crypto/sha2.h> #include <crypto/sha256_base.h> #include <linux/string.h> +#include <asm/cpu_device_id.h> #include <asm/simd.h>
asmlinkage void sha256_transform_ssse3(struct sha256_state *state, const u8 *data, int blocks);
+static const struct x86_cpu_id module_cpu_ids[] = { + X86_MATCH_FEATURE(X86_FEATURE_AVX2, NULL), + X86_MATCH_FEATURE(X86_FEATURE_AVX, NULL), + X86_MATCH_FEATURE(X86_FEATURE_SSSE3, NULL), + {} +}; +MODULE_DEVICE_TABLE(x86cpu, module_cpu_ids); + static int _sha256_update(struct shash_desc *desc, const u8 *data, unsigned int len, sha256_block_fn *sha256_xform) { @@ -366,6 +375,9 @@ static inline void unregister_sha256_ni(void) { }
static int __init sha256_ssse3_mod_init(void) { + if (!x86_match_cpu(module_cpu_ids)) + return -ENODEV; + if (register_sha256_ssse3()) goto fail;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pu Wen puwen@hygon.cn
commit ee545b94d39a00c93dc98b1dbcbcf731d2eadeb4 upstream.
Hygon processors with a model ID > 3 have CPUID leaf 0xB correctly populated and don't need the fixed package ID shift workaround. The fixup is also incorrect when running in a guest.
Fixes: e0ceeae708ce ("x86/CPU/hygon: Fix phys_proc_id calculation logic for multi-die processors") Signed-off-by: Pu Wen puwen@hygon.cn Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/tencent_594804A808BD93A4EBF50A994F228E3A7F07@qq.co... Link: https://lore.kernel.org/r/20230814085112.089607918@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/cpu/hygon.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/arch/x86/kernel/cpu/hygon.c +++ b/arch/x86/kernel/cpu/hygon.c @@ -86,8 +86,12 @@ static void hygon_get_topology(struct cp if (!err) c->x86_coreid_bits = get_count_order(c->x86_max_cores);
- /* Socket ID is ApicId[6] for these processors. */ - c->phys_proc_id = c->apicid >> APICID_SOCKET_ID_BIT; + /* + * Socket ID is ApicId[6] for the processors with model <= 0x3 + * when running on host. + */ + if (!boot_cpu_has(X86_FEATURE_HYPERVISOR) && c->x86_model <= 0x3) + c->phys_proc_id = c->apicid >> APICID_SOCKET_ID_BIT;
cacheinfo_hygon_init_llc_id(c, cpu); } else if (cpu_has(c, X86_FEATURE_NODEID_MSR)) {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicolas Saenz Julienne nsaenz@amazon.com
commit d6800af51c76b6dae20e6023bbdc9b3da3ab5121 upstream.
Don't apply the stimer's counter side effects when modifying its value from user-space, as this may trigger spurious interrupts.
For example: - The stimer is configured in auto-enable mode. - The stimer's count is set and the timer enabled. - The stimer expires, an interrupt is injected. - The VM is live migrated. - The stimer config and count are deserialized, auto-enable is ON, the stimer is re-enabled. - The stimer expires right away, and injects an unwarranted interrupt.
Cc: stable@vger.kernel.org Fixes: 1f4b34f825e8 ("kvm/x86: Hyper-V SynIC timers") Signed-off-by: Nicolas Saenz Julienne nsaenz@amazon.com Reviewed-by: Vitaly Kuznetsov vkuznets@redhat.com Link: https://lore.kernel.org/r/20231017155101.40677-1-nsaenz@amazon.com Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/hyperv.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
--- a/arch/x86/kvm/hyperv.c +++ b/arch/x86/kvm/hyperv.c @@ -701,10 +701,12 @@ static int stimer_set_count(struct kvm_v
stimer_cleanup(stimer); stimer->count = count; - if (stimer->count == 0) - stimer->config.enable = 0; - else if (stimer->config.auto_enable) - stimer->config.enable = 1; + if (!host) { + if (stimer->count == 0) + stimer->config.enable = 0; + else if (stimer->config.auto_enable) + stimer->config.enable = 1; + }
if (stimer->config.enable) stimer_mark_pending(stimer, false);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Maciej S. Szmigiero maciej.szmigiero@oracle.com
commit 2770d4722036d6bd24bcb78e9cd7f6e572077d03 upstream.
Hyper-V enabled Windows Server 2022 KVM VM cannot be started on Zen1 Ryzen since it crashes at boot with SYSTEM_THREAD_EXCEPTION_NOT_HANDLED + STATUS_PRIVILEGED_INSTRUCTION (in other words, because of an unexpected #GP in the guest kernel).
This is because Windows tries to set bit 8 in MSR_AMD64_TW_CFG and can't handle receiving a #GP when doing so.
Give this MSR the same treatment that commit 2e32b7190641 ("x86, kvm: Add MSR_AMD64_BU_CFG2 to the list of ignored MSRs") gave MSR_AMD64_BU_CFG2 under justification that this MSR is baremetal-relevant only. Although apparently it was then needed for Linux guests, not Windows as in this case.
With this change, the aforementioned guest setup is able to finish booting successfully.
This issue can be reproduced either on a Summit Ridge Ryzen (with just "-cpu host") or on a Naples EPYC (with "-cpu host,stepping=1" since EPYC is ordinarily stepping 2).
Alternatively, userspace could solve the problem by using MSR filters, but forcing every userspace to define a filter isn't very friendly and doesn't add much, if any, value. The only potential hiccup is if one of these "baremetal-only" MSRs ever requires actual emulation and/or has F/M/S specific behavior. But if that happens, then KVM can still punt *that* handling to userspace since userspace MSR filters "win" over KVM's default handling.
Signed-off-by: Maciej S. Szmigiero maciej.szmigiero@oracle.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/1ce85d9c7c9e9632393816cf19c902e0a3f411f1.169773140... [sean: call out MSR filtering alternative] Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/msr-index.h | 1 + arch/x86/kvm/x86.c | 2 ++ 2 files changed, 3 insertions(+)
--- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -511,6 +511,7 @@ #define MSR_AMD64_CPUID_FN_1 0xc0011004 #define MSR_AMD64_LS_CFG 0xc0011020 #define MSR_AMD64_DC_CFG 0xc0011022 +#define MSR_AMD64_TW_CFG 0xc0011023
#define MSR_AMD64_DE_CFG 0xc0011029 #define MSR_AMD64_DE_CFG_LFENCE_SERIALIZE_BIT 1 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -3393,6 +3393,7 @@ int kvm_set_msr_common(struct kvm_vcpu * case MSR_AMD64_PATCH_LOADER: case MSR_AMD64_BU_CFG2: case MSR_AMD64_DC_CFG: + case MSR_AMD64_TW_CFG: case MSR_F15H_EX_CFG: break;
@@ -3733,6 +3734,7 @@ int kvm_get_msr_common(struct kvm_vcpu * case MSR_AMD64_BU_CFG2: case MSR_IA32_PERF_CTL: case MSR_AMD64_DC_CFG: + case MSR_AMD64_TW_CFG: case MSR_F15H_EX_CFG: /* * Intel Sandy Bridge CPUs must support the RAPL (running average power
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paul Moore paul@paul-moore.com
commit 47846d51348dd62e5231a83be040981b17c955fa upstream.
The get_task_exe_file() function locks the given task with task_lock() which when used inside audit_exe_compare() can cause deadlocks on systems that generate audit records when the task_lock() is held. We resolve this problem with two changes: ignoring those cases where the task being audited is not the current task, and changing our approach to obtaining the executable file struct to not require task_lock().
With the intent of the audit exe filter being to filter on audit events generated by processes started by the specified executable, it makes sense that we would only want to use the exe filter on audit records associated with the currently executing process, e.g. @current. If we are asked to filter records using a non-@current task_struct we can safely ignore the exe filter without negatively impacting the admin's expectations for the exe filter.
Knowing that we only have to worry about filtering the currently executing task in audit_exe_compare() we can do away with the task_lock() and call get_mm_exe_file() with @current->mm directly.
Cc: stable@vger.kernel.org Fixes: 5efc244346f9 ("audit: fix exe_file access in audit_exe_compare") Reported-by: Andreas Steinmetz anstein99@googlemail.com Reviewed-by: John Johansen john.johanse@canonical.com Reviewed-by: Mateusz Guzik mjguzik@gmail.com Signed-off-by: Paul Moore paul@paul-moore.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/audit_watch.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
--- a/kernel/audit_watch.c +++ b/kernel/audit_watch.c @@ -527,11 +527,18 @@ int audit_exe_compare(struct task_struct unsigned long ino; dev_t dev;
- exe_file = get_task_exe_file(tsk); + /* only do exe filtering if we are recording @current events/records */ + if (tsk != current) + return 0; + + if (WARN_ON_ONCE(!current->mm)) + return 0; + exe_file = get_mm_exe_file(current->mm); if (!exe_file) return 0; ino = file_inode(exe_file)->i_ino; dev = file_inode(exe_file)->i_sb->s_dev; fput(exe_file); + return audit_mark_compare(mark, ino, dev); }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paul Moore paul@paul-moore.com
commit 969d90ec212bae4b45bf9d21d7daa30aa6cf055e upstream.
eBPF can end up calling into the audit code from some odd places, and some of these places don't have @current set properly so we end up tripping the `WARN_ON_ONCE(!current->mm)` near the top of `audit_exe_compare()`. While the basic `!current->mm` check is good, the `WARN_ON_ONCE()` results in some scary console messages so let's drop that and just do the regular `!current->mm` check to avoid problems.
Cc: stable@vger.kernel.org Fixes: 47846d51348d ("audit: don't take task_lock() in audit_exe_compare() code path") Reported-by: Artem Savkov asavkov@redhat.com Signed-off-by: Paul Moore paul@paul-moore.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/audit_watch.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/audit_watch.c +++ b/kernel/audit_watch.c @@ -531,7 +531,7 @@ int audit_exe_compare(struct task_struct if (tsk != current) return 0;
- if (WARN_ON_ONCE(!current->mm)) + if (!current->mm) return 0; exe_file = get_mm_exe_file(current->mm); if (!exe_file)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Muhammad Usama Anjum usama.anjum@collabora.com
commit dd976a97d15b47656991e185a94ef42a0fa5cfd4 upstream.
The smp_processor_id() shouldn't be called from preemptible code. Instead use get_cpu() and put_cpu() which disables preemption in addition to getting the processor id. Enable preemption back after calling schedule_work() to make sure that the work gets scheduled on all cores other than the current core. We want to avoid a scenario where current core's stack trace is printed multiple times and one core's stack trace isn't printed because of scheduling of current task.
This fixes the following bug:
[ 119.143590] sysrq: Show backtrace of all active CPUs [ 119.143902] BUG: using smp_processor_id() in preemptible [00000000] code: bash/873 [ 119.144586] caller is debug_smp_processor_id+0x20/0x30 [ 119.144827] CPU: 6 PID: 873 Comm: bash Not tainted 5.10.124-dirty #3 [ 119.144861] Hardware name: QEMU QEMU Virtual Machine, BIOS 2023.05-1 07/22/2023 [ 119.145053] Call trace: [ 119.145093] dump_backtrace+0x0/0x1a0 [ 119.145122] show_stack+0x18/0x70 [ 119.145141] dump_stack+0xc4/0x11c [ 119.145159] check_preemption_disabled+0x100/0x110 [ 119.145175] debug_smp_processor_id+0x20/0x30 [ 119.145195] sysrq_handle_showallcpus+0x20/0xc0 [ 119.145211] __handle_sysrq+0x8c/0x1a0 [ 119.145227] write_sysrq_trigger+0x94/0x12c [ 119.145247] proc_reg_write+0xa8/0xe4 [ 119.145266] vfs_write+0xec/0x280 [ 119.145282] ksys_write+0x6c/0x100 [ 119.145298] __arm64_sys_write+0x20/0x30 [ 119.145315] el0_svc_common.constprop.0+0x78/0x1e4 [ 119.145332] do_el0_svc+0x24/0x8c [ 119.145348] el0_svc+0x10/0x20 [ 119.145364] el0_sync_handler+0x134/0x140 [ 119.145381] el0_sync+0x180/0x1c0
Cc: jirislaby@kernel.org Cc: stable@vger.kernel.org Fixes: 47cab6a722d4 ("debug lockups: Improve lockup detection, fix generic arch fallback") Signed-off-by: Muhammad Usama Anjum usama.anjum@collabora.com Link: https://lore.kernel.org/r/20231009162021.3607632-1-usama.anjum@collabora.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/sysrq.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/tty/sysrq.c +++ b/drivers/tty/sysrq.c @@ -263,13 +263,14 @@ static void sysrq_handle_showallcpus(int if (in_hardirq()) regs = get_irq_regs();
- pr_info("CPU%d:\n", smp_processor_id()); + pr_info("CPU%d:\n", get_cpu()); if (regs) show_regs(regs); else show_stack(NULL, NULL, KERN_INFO);
schedule_work(&sysrq_showallcpus); + put_cpu(); } }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Woodhouse dwmw@amazon.co.uk
commit a30badfd7c13fc8763a9e10c5a12ba7f81515a55 upstream.
On unplug of a Xen console, xencons_disconnect_backend() unconditionally calls free_irq() via unbind_from_irqhandler(), causing a warning of freeing an already-free IRQ:
(qemu) device_del con1 [ 32.050919] ------------[ cut here ]------------ [ 32.050942] Trying to free already-free IRQ 33 [ 32.050990] WARNING: CPU: 0 PID: 51 at kernel/irq/manage.c:1895 __free_irq+0x1d4/0x330
It should be using evtchn_put() to tear down the event channel binding, and let the Linux IRQ side of it be handled by notifier_del_irq() through the HVC code.
On which topic... xencons_disconnect_backend() should call hvc_remove() *first*, rather than tearing down the event channel and grant mapping while they are in use. And then the IRQ is guaranteed to be freed by the time it's torn down by evtchn_put().
Since evtchn_put() also closes the actual event channel, avoid calling xenbus_free_evtchn() except in the failure path where the IRQ was not successfully set up.
However, calling hvc_remove() at the start of xencons_disconnect_backend() still isn't early enough. An unplug request is indicated by the backend setting its state to XenbusStateClosing, which triggers a notification to xencons_backend_changed(), which... does nothing except set its own frontend state directly to XenbusStateClosed without *actually* tearing down the HVC device or, you know, making sure it isn't actively in use.
So the backend sees the guest frontend set its state to XenbusStateClosed and stops servicing the interrupt... and the guest spins for ever in the domU_write_console() function waiting for the ring to drain.
Fix that one by calling hvc_remove() from xencons_backend_changed() before signalling to the backend that it's OK to proceed with the removal.
Tested with 'dd if=/dev/zero of=/dev/hvc1' while telling Qemu to remove the console device.
Signed-off-by: David Woodhouse dwmw@amazon.co.uk Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231020161529.355083-4-dwmw2@infradead.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/hvc/hvc_xen.c | 32 ++++++++++++++++++++++++-------- 1 file changed, 24 insertions(+), 8 deletions(-)
--- a/drivers/tty/hvc/hvc_xen.c +++ b/drivers/tty/hvc/hvc_xen.c @@ -377,18 +377,21 @@ void xen_console_resume(void) #ifdef CONFIG_HVC_XEN_FRONTEND static void xencons_disconnect_backend(struct xencons_info *info) { - if (info->irq > 0) - unbind_from_irqhandler(info->irq, NULL); - info->irq = 0; + if (info->hvc != NULL) + hvc_remove(info->hvc); + info->hvc = NULL; + if (info->irq > 0) { + evtchn_put(info->evtchn); + info->irq = 0; + info->evtchn = 0; + } + /* evtchn_put() will also close it so this is only an error path */ if (info->evtchn > 0) xenbus_free_evtchn(info->xbdev, info->evtchn); info->evtchn = 0; if (info->gntref > 0) gnttab_free_grant_references(info->gntref); info->gntref = 0; - if (info->hvc != NULL) - hvc_remove(info->hvc); - info->hvc = NULL; }
static void xencons_free(struct xencons_info *info) @@ -553,10 +556,23 @@ static void xencons_backend_changed(stru if (dev->state == XenbusStateClosed) break; fallthrough; /* Missed the backend's CLOSING state */ - case XenbusStateClosing: + case XenbusStateClosing: { + struct xencons_info *info = dev_get_drvdata(&dev->dev);; + + /* + * Don't tear down the evtchn and grant ref before the other + * end has disconnected, but do stop userspace from trying + * to use the device before we allow the backend to close. + */ + if (info->hvc) { + hvc_remove(info->hvc); + info->hvc = NULL; + } + xenbus_frontend_closed(dev); break; } + } }
static const struct xenbus_device_id xencons_ids[] = { @@ -615,7 +631,7 @@ static int __init xen_hvc_init(void) list_del(&info->list); spin_unlock_irqrestore(&xencons_lock, flags); if (info->irq) - unbind_from_irqhandler(info->irq, NULL); + evtchn_put(info->evtchn); kfree(info); return r; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Woodhouse dwmw@amazon.co.uk
commit 2704c9a5593f4a47620c12dad78838ca62b52f48 upstream.
The xen_hvc_init() function should always register the frontend driver, even when there's no primary console — as there may be secondary consoles. (Qemu can always add secondary consoles, but only the toolstack can add the primary because it's special.)
Signed-off-by: David Woodhouse dwmw@amazon.co.uk Reviewed-by: Juergen Gross jgross@suse.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231020161529.355083-3-dwmw2@infradead.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/hvc/hvc_xen.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/drivers/tty/hvc/hvc_xen.c +++ b/drivers/tty/hvc/hvc_xen.c @@ -603,7 +603,7 @@ static int __init xen_hvc_init(void) ops = &dom0_hvc_ops; r = xen_initial_domain_console_init(); if (r < 0) - return r; + goto register_fe; info = vtermno_to_xencons(HVC_COOKIE); } else { ops = &domU_hvc_ops; @@ -612,7 +612,7 @@ static int __init xen_hvc_init(void) else r = xen_pv_console_init(); if (r < 0) - return r; + goto register_fe;
info = vtermno_to_xencons(HVC_COOKIE); info->irq = bind_evtchn_to_irq_lateeoi(info->evtchn); @@ -637,6 +637,7 @@ static int __init xen_hvc_init(void) }
r = 0; + register_fe: #ifdef CONFIG_HVC_XEN_FRONTEND r = xenbus_register_frontend(&xencons_driver); #endif
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Woodhouse dwmw@amazon.co.uk
commit ef5dd8ec88ac11e8e353164407d55b73c988b369 upstream.
The xencons_connect_backend() function allocates a local interdomain event channel with xenbus_alloc_evtchn(), then calls bind_interdomain_evtchn_to_irq_lateeoi() to bind to that port# on the *remote* domain.
That doesn't work very well:
(qemu) device_add xen-console,id=con1,chardev=pty0 [ 44.323872] xenconsole console-1: 2 xenbus_dev_probe on device/console/1 [ 44.323995] xenconsole: probe of console-1 failed with error -2
Fix it to use bind_evtchn_to_irq_lateeoi(), which does the right thing by just binding that *local* event channel to an irq. The backend will do the interdomain binding.
This didn't affect the primary console because the setup for that is special — the toolstack allocates the guest event channel and the guest discovers it with HVMOP_get_param.
Fixes: fe415186b43d ("xen/console: harden hvc_xen against event channel storms") Signed-off-by: David Woodhouse dwmw@amazon.co.uk Reviewed-by: Juergen Gross jgross@suse.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231020161529.355083-2-dwmw2@infradead.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/tty/hvc/hvc_xen.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/tty/hvc/hvc_xen.c +++ b/drivers/tty/hvc/hvc_xen.c @@ -436,7 +436,7 @@ static int xencons_connect_backend(struc if (ret) return ret; info->evtchn = evtchn; - irq = bind_interdomain_evtchn_to_irq_lateeoi(dev, evtchn); + irq = bind_evtchn_to_irq_lateeoi(evtchn); if (irq < 0) return irq; info->irq = irq;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lukas Wunner lukas@wunner.de
commit 70b70a4307cccebe91388337b1c85735ce4de6ff upstream.
struct pci_dev contains two flags which govern whether the device may suspend to D3cold:
* no_d3cold provides an opt-out for drivers (e.g. if a device is known to not wake from D3cold)
* d3cold_allowed provides an opt-out for user space (default is true, user space may set to false)
Since commit 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend"), the user space setting overwrites the driver setting. Essentially user space is trusted to know better than the driver whether D3cold is working.
That feels unsafe and wrong. Assume that the change was introduced inadvertently and do not overwrite no_d3cold when d3cold_allowed is modified. Instead, consider d3cold_allowed in addition to no_d3cold when choosing a suspend state for the device.
That way, user space may opt out of D3cold if the driver hasn't, but it may no longer force an opt in if the driver has opted out.
Fixes: 9d26d3a8f1b0 ("PCI: Put PCIe ports into D3 during suspend") Link: https://lore.kernel.org/r/b8a7f4af2b73f6b506ad8ddee59d747cbf834606.169502536... Signed-off-by: Lukas Wunner lukas@wunner.de Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Mika Westerberg mika.westerberg@linux.intel.com Reviewed-by: Mario Limonciello mario.limonciello@amd.com Cc: stable@vger.kernel.org # v4.8+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/pci-acpi.c | 2 +- drivers/pci/pci-sysfs.c | 5 +---- 2 files changed, 2 insertions(+), 5 deletions(-)
--- a/drivers/pci/pci-acpi.c +++ b/drivers/pci/pci-acpi.c @@ -910,7 +910,7 @@ static pci_power_t acpi_pci_choose_state { int acpi_state, d_max;
- if (pdev->no_d3cold) + if (pdev->no_d3cold || !pdev->d3cold_allowed) d_max = ACPI_STATE_D3_HOT; else d_max = ACPI_STATE_D3_COLD; --- a/drivers/pci/pci-sysfs.c +++ b/drivers/pci/pci-sysfs.c @@ -508,10 +508,7 @@ static ssize_t d3cold_allowed_store(stru return -EINVAL;
pdev->d3cold_allowed = !!val; - if (pdev->d3cold_allowed) - pci_d3cold_enable(pdev); - else - pci_d3cold_disable(pdev); + pci_bridge_d3_update(pdev);
pm_runtime_resume(dev);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krister Johansen kjlx@templeofstupid.com
commit 8b793bcda61f6c3ed4f5b2ded7530ef6749580cb upstream.
Setting softlockup_panic from do_sysctl_args() causes it to take effect later in boot. The lockup detector is enabled before SMP is brought online, but do_sysctl_args runs afterwards. If a user wants to set softlockup_panic on boot and have it trigger should a softlockup occur during onlining of the non-boot processors, they could do this prior to commit f117955a2255 ("kernel/watchdog.c: convert {soft/hard}lockup boot parameters to sysctl aliases"). However, after this commit the value of softlockup_panic is set too late to be of help for this type of problem. Restore the prior behavior.
Signed-off-by: Krister Johansen kjlx@templeofstupid.com Cc: stable@vger.kernel.org Fixes: f117955a2255 ("kernel/watchdog.c: convert {soft/hard}lockup boot parameters to sysctl aliases") Signed-off-by: Luis Chamberlain mcgrof@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/proc/proc_sysctl.c | 1 - kernel/watchdog.c | 7 +++++++ 2 files changed, 7 insertions(+), 1 deletion(-)
--- a/fs/proc/proc_sysctl.c +++ b/fs/proc/proc_sysctl.c @@ -1765,7 +1765,6 @@ static const struct sysctl_alias sysctl_ {"hung_task_panic", "kernel.hung_task_panic" }, {"numa_zonelist_order", "vm.numa_zonelist_order" }, {"softlockup_all_cpu_backtrace", "kernel.softlockup_all_cpu_backtrace" }, - {"softlockup_panic", "kernel.softlockup_panic" }, { } };
--- a/kernel/watchdog.c +++ b/kernel/watchdog.c @@ -183,6 +183,13 @@ static DEFINE_PER_CPU(unsigned long, hrt static DEFINE_PER_CPU(unsigned long, hrtimer_interrupts_saved); static unsigned long soft_lockup_nmi_warn;
+static int __init softlockup_panic_setup(char *str) +{ + softlockup_panic = simple_strtoul(str, NULL, 0); + return 1; +} +__setup("softlockup_panic=", softlockup_panic_setup); + static int __init nowatchdog_setup(char *str) { watchdog_user_enabled = 0;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Werner Sembach wse@tuxedocomputers.com
commit 0da9eccde3270b832c059ad618bf66e510c75d33 upstream.
The TongFang GMxXGxx/TUXEDO Stellaris/Pollaris Gen5 needs IRQ overriding for the keyboard to work.
Adding an entry for this laptop to the override_table makes the internal keyboard functional.
Signed-off-by: Werner Sembach wse@tuxedocomputers.com Cc: All applicable stable@vger.kernel.org Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/acpi/resource.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
--- a/drivers/acpi/resource.c +++ b/drivers/acpi/resource.c @@ -468,6 +468,18 @@ static const struct dmi_system_id mainge } }, { + /* TongFang GMxXGxx/TUXEDO Polaris 15 Gen5 AMD */ + .matches = { + DMI_MATCH(DMI_BOARD_NAME, "GMxXGxx"), + }, + }, + { + /* TongFang GM6XGxX/TUXEDO Stellaris 16 Gen5 AMD */ + .matches = { + DMI_MATCH(DMI_BOARD_NAME, "GM6XGxX"), + }, + }, + { .ident = "MAINGEAR Vector Pro 2 17", .matches = { DMI_MATCH(DMI_SYS_VENDOR, "Micro Electronics Inc"),
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nathan Chancellor nathan@kernel.org
commit 146a15b873353f8ac28dc281c139ff611a3c4848 upstream.
Prior to LLVM 15.0.0, LLVM's integrated assembler would incorrectly byte-swap NOP when compiling for big-endian, and the resulting series of bytes happened to match the encoding of FNMADD S21, S30, S0, S0.
This went unnoticed until commit:
34f66c4c4d5518c1 ("arm64: Use a positive cpucap for FP/SIMD")
Prior to that commit, the kernel would always enable the use of FPSIMD early in boot when __cpu_setup() initialized CPACR_EL1, and so usage of FNMADD within the kernel was not detected, but could result in the corruption of user or kernel FPSIMD state.
After that commit, the instructions happen to trap during boot prior to FPSIMD being detected and enabled, e.g.
| Unhandled 64-bit el1h sync exception on CPU0, ESR 0x000000001fe00000 -- ASIMD | CPU: 0 PID: 0 Comm: swapper Not tainted 6.6.0-rc3-00013-g34f66c4c4d55 #1 | Hardware name: linux,dummy-virt (DT) | pstate: 400000c9 (nZcv daIF -PAN -UAO -TCO -DIT -SSBS BTYPE=--) | pc : __pi_strcmp+0x1c/0x150 | lr : populate_properties+0xe4/0x254 | sp : ffffd014173d3ad0 | x29: ffffd014173d3af0 x28: fffffbfffddffcb8 x27: 0000000000000000 | x26: 0000000000000058 x25: fffffbfffddfe054 x24: 0000000000000008 | x23: fffffbfffddfe000 x22: fffffbfffddfe000 x21: fffffbfffddfe044 | x20: ffffd014173d3b70 x19: 0000000000000001 x18: 0000000000000005 | x17: 0000000000000010 x16: 0000000000000000 x15: 00000000413e7000 | x14: 0000000000000000 x13: 0000000000001bcc x12: 0000000000000000 | x11: 00000000d00dfeed x10: ffffd414193f2cd0 x9 : 0000000000000000 | x8 : 0101010101010101 x7 : ffffffffffffffc0 x6 : 0000000000000000 | x5 : 0000000000000000 x4 : 0101010101010101 x3 : 000000000000002a | x2 : 0000000000000001 x1 : ffffd014171f2988 x0 : fffffbfffddffcb8 | Kernel panic - not syncing: Unhandled exception | CPU: 0 PID: 0 Comm: swapper Not tainted 6.6.0-rc3-00013-g34f66c4c4d55 #1 | Hardware name: linux,dummy-virt (DT) | Call trace: | dump_backtrace+0xec/0x108 | show_stack+0x18/0x2c | dump_stack_lvl+0x50/0x68 | dump_stack+0x18/0x24 | panic+0x13c/0x340 | el1t_64_irq_handler+0x0/0x1c | el1_abort+0x0/0x5c | el1h_64_sync+0x64/0x68 | __pi_strcmp+0x1c/0x150 | unflatten_dt_nodes+0x1e8/0x2d8 | __unflatten_device_tree+0x5c/0x15c | unflatten_device_tree+0x38/0x50 | setup_arch+0x164/0x1e0 | start_kernel+0x64/0x38c | __primary_switched+0xbc/0xc4
Restrict CONFIG_CPU_BIG_ENDIAN to a known good assembler, which is either GNU as or LLVM's IAS 15.0.0 and newer, which contains the linked commit.
Closes: https://github.com/ClangBuiltLinux/linux/issues/1948 Link: https://github.com/llvm/llvm-project/commit/1379b150991f70a5782e9a143c2ba530... Signed-off-by: Nathan Chancellor nathan@kernel.org Cc: stable@vger.kernel.org Acked-by: Mark Rutland mark.rutland@arm.com Link: https://lore.kernel.org/r/20231025-disable-arm64-be-ias-b4-llvm-15-v1-1-b252... Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/Kconfig | 2 ++ 1 file changed, 2 insertions(+)
--- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -1153,6 +1153,8 @@ choice config CPU_BIG_ENDIAN bool "Build big-endian kernel" depends on !LD_IS_LLD || LLD_VERSION >= 130000 + # https://github.com/llvm/llvm-project/commit/1379b150991f70a5782e9a143c2ba530... + depends on AS_IS_GNU || AS_VERSION >= 150000 help Say Y if you plan on running a kernel with a big-endian userspace.
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Helge Deller deller@gmx.de
commit 6240553b52c475d9fc9674de0521b77e692f3764 upstream.
PDC2.0 specifies the additional PSW-bit field.
Signed-off-by: Helge Deller deller@gmx.de Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/parisc/include/uapi/asm/pdc.h | 1 + 1 file changed, 1 insertion(+)
--- a/arch/parisc/include/uapi/asm/pdc.h +++ b/arch/parisc/include/uapi/asm/pdc.h @@ -465,6 +465,7 @@ struct pdc_model { /* for PDC_MODEL */ unsigned long arch_rev; unsigned long pot_key; unsigned long curr_key; + unsigned long width; /* default of PSW_W bit (1=enabled) */ };
struct pdc_cache_cf { /* for PDC_CACHE (I/D-caches) */
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Helge Deller deller@gmx.de
commit d0c219472980d15f5cbc5c8aec736848bda3f235 upstream.
Signed-off-by: Helge Deller deller@gmx.de Cc: stable@vger.kernel.org # v6.0+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/parisc/power.c | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-)
--- a/drivers/parisc/power.c +++ b/drivers/parisc/power.c @@ -197,6 +197,14 @@ static struct notifier_block parisc_pani .priority = INT_MAX, };
+/* qemu soft power-off function */ +static int qemu_power_off(struct sys_off_data *data) +{ + /* this turns the system off via SeaBIOS */ + *(int *)data->cb_data = 0; + pdc_soft_power_button(1); + return NOTIFY_DONE; +}
static int __init power_init(void) { @@ -226,7 +234,13 @@ static int __init power_init(void) soft_power_reg); }
- power_task = kthread_run(kpowerswd, (void*)soft_power_reg, KTHREAD_NAME); + power_task = NULL; + if (running_on_qemu && soft_power_reg) + register_sys_off_handler(SYS_OFF_MODE_POWER_OFF, SYS_OFF_PRIO_DEFAULT, + qemu_power_off, (void *)soft_power_reg); + else + power_task = kthread_run(kpowerswd, (void*)soft_power_reg, + KTHREAD_NAME); if (IS_ERR(power_task)) { printk(KERN_ERR DRIVER_NAME ": thread creation failed. Driver not loaded.\n"); pdc_soft_power_button(0);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gustavo A. R. Silva gustavoars@kernel.org
commit d761bb01c85b22d5b44abe283eb89019693f6595 upstream.
`struct clk_hw_onecell_data` is a flexible structure, which means that it contains flexible-array member at the bottom, in this case array `hws`:
include/linux/clk-provider.h: 1380 struct clk_hw_onecell_data { 1381 unsigned int num; 1382 struct clk_hw *hws[] __counted_by(num); 1383 };
This could potentially lead to an overwrite of the objects following `clk_data` in `struct stratix10_clock_data`, in this case `void __iomem *base;` at run-time:
drivers/clk/socfpga/stratix10-clk.h: 9 struct stratix10_clock_data { 10 struct clk_hw_onecell_data clk_data; 11 void __iomem *base; 12 };
There are currently three different places where memory is allocated for `struct stratix10_clock_data`, including the flex-array `hws` in `struct clk_hw_onecell_data`:
drivers/clk/socfpga/clk-agilex.c: 469 clk_data = devm_kzalloc(dev, struct_size(clk_data, clk_data.hws, 470 num_clks), GFP_KERNEL);
drivers/clk/socfpga/clk-agilex.c: 509 clk_data = devm_kzalloc(dev, struct_size(clk_data, clk_data.hws, 510 num_clks), GFP_KERNEL);
drivers/clk/socfpga/clk-s10.c: 400 clk_data = devm_kzalloc(dev, struct_size(clk_data, clk_data.hws, 401 num_clks), GFP_KERNEL);
I'll use just one of them to describe the issue. See below.
Notice that a total of 440 bytes are allocated for flexible-array member `hws` at line 469:
include/dt-bindings/clock/agilex-clock.h: 70 #define AGILEX_NUM_CLKS 55
drivers/clk/socfpga/clk-agilex.c: 459 struct stratix10_clock_data *clk_data; 460 void __iomem *base; ... 466 467 num_clks = AGILEX_NUM_CLKS; 468 469 clk_data = devm_kzalloc(dev, struct_size(clk_data, clk_data.hws, 470 num_clks), GFP_KERNEL);
`struct_size(clk_data, clk_data.hws, num_clks)` above translates to sizeof(struct stratix10_clock_data) + sizeof(struct clk_hw *) * 55 == 16 + 8 * 55 == 16 + 440 ^^^ | allocated bytes for flex-array `hws`
474 for (i = 0; i < num_clks; i++) 475 clk_data->clk_data.hws[i] = ERR_PTR(-ENOENT); 476 477 clk_data->base = base;
and then some data is written into both `hws` and `base` objects.
Fix this by placing the declaration of object `clk_data` at the end of `struct stratix10_clock_data`. Also, add a comment to make it clear that this object must always be last in the structure.
-Wflex-array-member-not-at-end is coming in GCC-14, and we are getting ready to enable it globally.
Fixes: ba7e258425ac ("clk: socfpga: Convert to s10/agilex/n5x to use clk_hw") Cc: stable@vger.kernel.org Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: Gustavo A. R. Silva gustavoars@kernel.org Link: https://lore.kernel.org/r/1da736106d8e0806aeafa6e471a13ced490eae22.169811781... Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/clk/socfpga/stratix10-clk.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/clk/socfpga/stratix10-clk.h b/drivers/clk/socfpga/stratix10-clk.h index 75234e0783e1..83fe4eb3133c 100644 --- a/drivers/clk/socfpga/stratix10-clk.h +++ b/drivers/clk/socfpga/stratix10-clk.h @@ -7,8 +7,10 @@ #define __STRATIX10_CLK_H
struct stratix10_clock_data { - struct clk_hw_onecell_data clk_data; void __iomem *base; + + /* Must be last */ + struct clk_hw_onecell_data clk_data; };
struct stratix10_pll_clock {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kathiravan Thirumoorthy quic_kathirav@quicinc.com
commit e641a070137dd959932c7c222e000d9d941167a2 upstream.
GPLL, NSS crypto PLL clock rates are fixed and shouldn't be scaled based on the request from dependent clocks. Doing so will result in the unexpected behaviour. So drop the CLK_SET_RATE_PARENT flag from the PLL clocks.
Cc: stable@vger.kernel.org Fixes: b8e7e519625f ("clk: qcom: ipq8074: add remaining PLL’s") Signed-off-by: Kathiravan Thirumoorthy quic_kathirav@quicinc.com Link: https://lore.kernel.org/r/20230913-gpll_cleanup-v2-1-c8ceb1a37680@quicinc.co... Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/clk/qcom/gcc-ipq8074.c | 6 ------ 1 file changed, 6 deletions(-)
--- a/drivers/clk/qcom/gcc-ipq8074.c +++ b/drivers/clk/qcom/gcc-ipq8074.c @@ -418,7 +418,6 @@ static struct clk_fixed_factor gpll0_out }, .num_parents = 1, .ops = &clk_fixed_factor_ops, - .flags = CLK_SET_RATE_PARENT, }, };
@@ -465,7 +464,6 @@ static struct clk_alpha_pll_postdiv gpll }, .num_parents = 1, .ops = &clk_alpha_pll_postdiv_ro_ops, - .flags = CLK_SET_RATE_PARENT, }, };
@@ -498,7 +496,6 @@ static struct clk_alpha_pll_postdiv gpll }, .num_parents = 1, .ops = &clk_alpha_pll_postdiv_ro_ops, - .flags = CLK_SET_RATE_PARENT, }, };
@@ -532,7 +529,6 @@ static struct clk_alpha_pll_postdiv gpll }, .num_parents = 1, .ops = &clk_alpha_pll_postdiv_ro_ops, - .flags = CLK_SET_RATE_PARENT, }, };
@@ -546,7 +542,6 @@ static struct clk_fixed_factor gpll6_out }, .num_parents = 1, .ops = &clk_fixed_factor_ops, - .flags = CLK_SET_RATE_PARENT, }, };
@@ -611,7 +606,6 @@ static struct clk_alpha_pll_postdiv nss_ }, .num_parents = 1, .ops = &clk_alpha_pll_postdiv_ro_ops, - .flags = CLK_SET_RATE_PARENT, }, };
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kathiravan Thirumoorthy quic_kathirav@quicinc.com
commit 99cd4935cb972d0aafb16838bb2aeadbcaf196ce upstream.
GPLL, NSS crypto PLL clock rates are fixed and shouldn't be scaled based on the request from dependent clocks. Doing so will result in the unexpected behaviour. So drop the CLK_SET_RATE_PARENT flag from the PLL clocks.
Cc: stable@vger.kernel.org Fixes: d9db07f088af ("clk: qcom: Add ipq6018 Global Clock Controller support") Signed-off-by: Kathiravan Thirumoorthy quic_kathirav@quicinc.com Reviewed-by: Konrad Dybcio konrad.dybcio@linaro.org Link: https://lore.kernel.org/r/20230913-gpll_cleanup-v2-2-c8ceb1a37680@quicinc.co... Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/clk/qcom/gcc-ipq6018.c | 6 ------ 1 file changed, 6 deletions(-)
--- a/drivers/clk/qcom/gcc-ipq6018.c +++ b/drivers/clk/qcom/gcc-ipq6018.c @@ -75,7 +75,6 @@ static struct clk_fixed_factor gpll0_out &gpll0_main.clkr.hw }, .num_parents = 1, .ops = &clk_fixed_factor_ops, - .flags = CLK_SET_RATE_PARENT, }, };
@@ -89,7 +88,6 @@ static struct clk_alpha_pll_postdiv gpll &gpll0_main.clkr.hw }, .num_parents = 1, .ops = &clk_alpha_pll_postdiv_ro_ops, - .flags = CLK_SET_RATE_PARENT, }, };
@@ -164,7 +162,6 @@ static struct clk_alpha_pll_postdiv gpll &gpll6_main.clkr.hw }, .num_parents = 1, .ops = &clk_alpha_pll_postdiv_ro_ops, - .flags = CLK_SET_RATE_PARENT, }, };
@@ -195,7 +192,6 @@ static struct clk_alpha_pll_postdiv gpll &gpll4_main.clkr.hw }, .num_parents = 1, .ops = &clk_alpha_pll_postdiv_ro_ops, - .flags = CLK_SET_RATE_PARENT, }, };
@@ -246,7 +242,6 @@ static struct clk_alpha_pll_postdiv gpll &gpll2_main.clkr.hw }, .num_parents = 1, .ops = &clk_alpha_pll_postdiv_ro_ops, - .flags = CLK_SET_RATE_PARENT, }, };
@@ -277,7 +272,6 @@ static struct clk_alpha_pll_postdiv nss_ &nss_crypto_pll_main.clkr.hw }, .num_parents = 1, .ops = &clk_alpha_pll_postdiv_ro_ops, - .flags = CLK_SET_RATE_PARENT, }, };
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@linaro.org
commit b44f9da81783fda72632ef9b0d05ea3f3ca447a5 upstream.
This error path should return -EINVAL instead of success.
Fixes: 88095e7b473a ("mmc: Add new VUB300 USB-to-SD/SDIO/MMC driver") Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/0769d30c-ad80-421b-bf5d-7d6f5d85604e@moroto.mounta... Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/vub300.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/mmc/host/vub300.c +++ b/drivers/mmc/host/vub300.c @@ -2311,6 +2311,7 @@ static int vub300_probe(struct usb_inter vub300->read_only = (0x0010 & vub300->system_port_status.port_flags) ? 1 : 0; } else { + retval = -EINVAL; goto error5; } usb_set_intfdata(interface, vub300);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nitin Yadav n-yadav@ti.com
commit 71956d0cb56c1e5f9feeb4819db87a076418e930 upstream.
ti,otap-del-sel-legacy/ti,itap-del-sel-legacy passed from DT are currently ignored for all SD/MMC and eMMC modes. Fix this by making start loop index to MMC_TIMING_LEGACY.
Fixes: 8ee5fc0e0b3b ("mmc: sdhci_am654: Update OTAPDLY writes") Signed-off-by: Nitin Yadav n-yadav@ti.com Acked-by: Adrian Hunter adrian.hunter@intel.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231026061458.1116276-1-n-yadav@ti.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/sdhci_am654.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/mmc/host/sdhci_am654.c +++ b/drivers/mmc/host/sdhci_am654.c @@ -600,7 +600,7 @@ static int sdhci_am654_get_otap_delay(st return 0; }
- for (i = MMC_TIMING_MMC_HS; i <= MMC_TIMING_MMC_HS400; i++) { + for (i = MMC_TIMING_LEGACY; i <= MMC_TIMING_MMC_HS400; i++) {
ret = device_property_read_u32(dev, td[i].otap_binding, &sdhci_am654->otap_del_sel[i]);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Heiner Kallweit hkallweit1@gmail.com
commit 8e37372ad0bea4c9b4712d9943f6ae96cff9491f upstream.
aspm_attr_store_common(), which handles sysfs control of ASPM, has the same problem as fb097dcd5a28 ("PCI/ASPM: Disable only ASPM_STATE_L1 when driver disables L1"): disabling L1 adds only ASPM_L1 (but not any of the L1.x substates) to the "aspm_disable" mask.
Enabling one substate, e.g., L1.1, via sysfs removes ASPM_L1 from the disable mask. Since disabling L1 via sysfs doesn't add any of the substates to the disable mask, enabling L1.1 actually enables *all* the substates.
In this scenario:
- Write 0 to "l1_aspm" to disable L1 - Write 1 to "l1_1_aspm" to enable L1.1
the intention is to disable L1 and all L1.x substates, then enable just L1.1, but in fact, *all* L1.x substates are enabled.
Fix this by explicitly disabling all the L1.x substates when disabling L1.
Fixes: 72ea91afbfb0 ("PCI/ASPM: Add sysfs attributes for controlling ASPM link states") Link: https://lore.kernel.org/r/6ba7dd79-9cfe-4ed0-a002-d99cb842f361@gmail.com Signed-off-by: Heiner Kallweit hkallweit1@gmail.com [bhelgaas: commit log] Signed-off-by: Bjorn Helgaas bhelgaas@google.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/pcie/aspm.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/pci/pcie/aspm.c +++ b/drivers/pci/pcie/aspm.c @@ -1231,6 +1231,8 @@ static ssize_t aspm_attr_store_common(st link->aspm_disable &= ~ASPM_STATE_L1; } else { link->aspm_disable |= state; + if (state & ASPM_STATE_L1) + link->aspm_disable |= ASPM_STATE_L1SS; }
pcie_config_aspm_link(link, policy_to_aspm_state(link));
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Uwe Kleine-König u.kleine-koenig@pengutronix.de
commit 83a939f0fdc208ff3639dd3d42ac9b3c35607fd2 upstream.
With CONFIG_PCI_EXYNOS=y and exynos_pcie_remove() marked with __exit, the function is discarded from the driver. In this case a bound device can still get unbound, e.g via sysfs. Then no cleanup code is run resulting in resource leaks or worse.
The right thing to do is do always have the remove callback available. This fixes the following warning by modpost:
WARNING: modpost: drivers/pci/controller/dwc/pci-exynos: section mismatch in reference: exynos_pcie_driver+0x8 (section: .data) -> exynos_pcie_remove (section: .exit.text)
(with ARCH=x86_64 W=1 allmodconfig).
Fixes: 340cba6092c2 ("pci: Add PCIe driver for Samsung Exynos") Link: https://lore.kernel.org/r/20231001170254.2506508-2-u.kleine-koenig@pengutron... Signed-off-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Alim Akhtar alim.akhtar@samsung.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/dwc/pci-exynos.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/pci/controller/dwc/pci-exynos.c +++ b/drivers/pci/controller/dwc/pci-exynos.c @@ -377,7 +377,7 @@ fail_probe: return ret; }
-static int __exit exynos_pcie_remove(struct platform_device *pdev) +static int exynos_pcie_remove(struct platform_device *pdev) { struct exynos_pcie *ep = platform_get_drvdata(pdev);
@@ -433,7 +433,7 @@ static const struct of_device_id exynos_
static struct platform_driver exynos_pcie_driver = { .probe = exynos_pcie_probe, - .remove = __exit_p(exynos_pcie_remove), + .remove = exynos_pcie_remove, .driver = { .name = "exynos-pcie", .of_match_table = exynos_pcie_of_match,
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ajay Singh ajay.kathat@microchip.com
commit 05ac1a198a63ad66bf5ae8b7321407c102d40ef3 upstream.
Enabling KASAN and running some iperf tests raises some memory issues with vmm_table:
BUG: KASAN: slab-out-of-bounds in wilc_wlan_handle_txq+0x6ac/0xdb4 Write of size 4 at addr c3a61540 by task wlan0-tx/95
KASAN detects that we are writing data beyond range allocated to vmm_table. There is indeed a mismatch between the size passed to allocator in wilc_wlan_init, and the range of possible indexes used later: allocation size is missing a multiplication by sizeof(u32)
Fixes: 40b717bfcefa ("wifi: wilc1000: fix DMA on stack objects") Cc: stable@vger.kernel.org Signed-off-by: Ajay Singh ajay.kathat@microchip.com Signed-off-by: Alexis Lothoré alexis.lothore@bootlin.com Reviewed-by: Michael Walle mwalle@kernel.org Reviewed-by: Jeff Johnson quic_jjohnson@quicinc.com Signed-off-by: Kalle Valo kvalo@kernel.org Link: https://lore.kernel.org/r/20231017-wilc1000_tx_oops-v3-1-b2155f1f7bee@bootli... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/microchip/wilc1000/wlan.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/wireless/microchip/wilc1000/wlan.c +++ b/drivers/net/wireless/microchip/wilc1000/wlan.c @@ -1458,7 +1458,7 @@ int wilc_wlan_init(struct net_device *de }
if (!wilc->vmm_table) - wilc->vmm_table = kzalloc(WILC_VMM_TBL_SIZE, GFP_KERNEL); + wilc->vmm_table = kcalloc(WILC_VMM_TBL_SIZE, sizeof(u32), GFP_KERNEL);
if (!wilc->vmm_table) { ret = -ENOBUFS;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chuck Lever chuck.lever@oracle.com
commit 197115ebf358cb440c73e868b2a0a5ef728decc6 upstream.
When an RPC Call message cannot be pulled from the client, that is a message loss, by definition. Close the connection to trigger the client to resend.
Cc: stable@vger.kernel.org Reviewed-by: Tom Talpey tom@talpey.com Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c +++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c @@ -852,7 +852,8 @@ out_readfail: if (ret == -EINVAL) svc_rdma_send_error(rdma_xprt, ctxt, ret); svc_rdma_recv_ctxt_put(rdma_xprt, ctxt); - return ret; + svc_xprt_deferred_close(xprt); + return -ENOTCONN;
out_backchannel: svc_rdma_handle_bc_reply(rqstp, ctxt);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joel Fernandes (Google) joel@joelfernandes.org
commit b96e7a5fa0ba9cda32888e04f8f4bac42d49a7f8 upstream.
There are instances where rcu_cpu_stall_reset() is called when jiffies did not get a chance to update for a long time. Before jiffies is updated, the CPU stall detector can go off triggering false-positives where a just-started grace period appears to be ages old. In the past, we disabled stall detection in rcu_cpu_stall_reset() however this got changed [1]. This is resulting in false-positives in KGDB usecase [2].
Fix this by deferring the update of jiffies to the third run of the FQS loop. This is more robust, as, even if rcu_cpu_stall_reset() is called just before jiffies is read, we would end up pushing out the jiffies read by 3 more FQS loops. Meanwhile the CPU stall detection will be delayed and we will not get any false positives.
[1] https://lore.kernel.org/all/20210521155624.174524-2-senozhatsky@chromium.org... [2] https://lore.kernel.org/all/20230814020045.51950-2-chenhuacai@loongson.cn/
Tested with rcutorture.cpu_stall option as well to verify stall behavior with/without patch.
Tested-by: Huacai Chen chenhuacai@loongson.cn Reported-by: Binbin Zhou zhoubinbin@loongson.cn Closes: https://lore.kernel.org/all/20230814020045.51950-2-chenhuacai@loongson.cn/ Suggested-by: Paul McKenney paulmck@kernel.org Cc: Sergey Senozhatsky senozhatsky@chromium.org Cc: Thomas Gleixner tglx@linutronix.de Cc: stable@vger.kernel.org Fixes: a80be428fbc1 ("rcu: Do not disable GP stall detection in rcu_cpu_stall_reset()") Signed-off-by: Joel Fernandes (Google) joel@joelfernandes.org Signed-off-by: Paul E. McKenney paulmck@kernel.org Signed-off-by: Frederic Weisbecker frederic@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/rcu/tree.c | 12 ++++++++++++ kernel/rcu/tree.h | 4 ++++ kernel/rcu/tree_stall.h | 20 ++++++++++++++++++-- 3 files changed, 34 insertions(+), 2 deletions(-)
--- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -1906,10 +1906,22 @@ static bool rcu_gp_fqs_check_wake(int *g */ static void rcu_gp_fqs(bool first_time) { + int nr_fqs = READ_ONCE(rcu_state.nr_fqs_jiffies_stall); struct rcu_node *rnp = rcu_get_root();
WRITE_ONCE(rcu_state.gp_activity, jiffies); WRITE_ONCE(rcu_state.n_force_qs, rcu_state.n_force_qs + 1); + + WARN_ON_ONCE(nr_fqs > 3); + /* Only countdown nr_fqs for stall purposes if jiffies moves. */ + if (nr_fqs) { + if (nr_fqs == 1) { + WRITE_ONCE(rcu_state.jiffies_stall, + jiffies + rcu_jiffies_till_stall_check()); + } + WRITE_ONCE(rcu_state.nr_fqs_jiffies_stall, --nr_fqs); + } + if (first_time) { /* Collect dyntick-idle snapshots. */ force_qs_rnp(dyntick_save_progress_counter); --- a/kernel/rcu/tree.h +++ b/kernel/rcu/tree.h @@ -351,6 +351,10 @@ struct rcu_state { /* in jiffies. */ unsigned long jiffies_stall; /* Time at which to check */ /* for CPU stalls. */ + int nr_fqs_jiffies_stall; /* Number of fqs loops after + * which read jiffies and set + * jiffies_stall. Stall + * warnings disabled if !0. */ unsigned long jiffies_resched; /* Time at which to resched */ /* a reluctant CPU. */ unsigned long n_force_qs_gpstart; /* Snapshot of n_force_qs at */ --- a/kernel/rcu/tree_stall.h +++ b/kernel/rcu/tree_stall.h @@ -121,12 +121,17 @@ static void panic_on_rcu_stall(void) /** * rcu_cpu_stall_reset - restart stall-warning timeout for current grace period * + * To perform the reset request from the caller, disable stall detection until + * 3 fqs loops have passed. This is required to ensure a fresh jiffies is + * loaded. It should be safe to do from the fqs loop as enough timer + * interrupts and context switches should have passed. + * * The caller must disable hard irqs. */ void rcu_cpu_stall_reset(void) { - WRITE_ONCE(rcu_state.jiffies_stall, - jiffies + rcu_jiffies_till_stall_check()); + WRITE_ONCE(rcu_state.nr_fqs_jiffies_stall, 3); + WRITE_ONCE(rcu_state.jiffies_stall, ULONG_MAX); }
////////////////////////////////////////////////////////////////////////////// @@ -142,6 +147,7 @@ static void record_gp_stall_check_time(v WRITE_ONCE(rcu_state.gp_start, j); j1 = rcu_jiffies_till_stall_check(); smp_mb(); // ->gp_start before ->jiffies_stall and caller's ->gp_seq. + WRITE_ONCE(rcu_state.nr_fqs_jiffies_stall, 0); WRITE_ONCE(rcu_state.jiffies_stall, j + j1); rcu_state.jiffies_resched = j + j1 / 2; rcu_state.n_force_qs_gpstart = READ_ONCE(rcu_state.n_force_qs); @@ -662,6 +668,16 @@ static void check_cpu_stall(struct rcu_d !rcu_gp_in_progress()) return; rcu_stall_kick_kthreads(); + + /* + * Check if it was requested (via rcu_cpu_stall_reset()) that the FQS + * loop has to set jiffies to ensure a non-stale jiffies value. This + * is required to have good jiffies value after coming out of long + * breaks of jiffies updates. Not doing so can cause false positives. + */ + if (READ_ONCE(rcu_state.nr_fqs_jiffies_stall) > 0) + return; + j = jiffies;
/*
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vignesh Viswanathan quic_viswanat@quicinc.com
commit 95d97b111e1e184b0c8656137033ed64f2cf21e4 upstream.
SMEM uses lock index 3 of the TCSR Mutex hwlock for allocations in SMEM region shared by the Host and FW.
Fix the SMEM hwlock index to 3 for IPQ6018.
Cc: stable@vger.kernel.org Fixes: 5bf635621245 ("arm64: dts: ipq6018: Add a few device nodes") Signed-off-by: Vignesh Viswanathan quic_viswanat@quicinc.com Acked-by: Konrad Dybcio konrad.dybcio@linaro.org Link: https://lore.kernel.org/r/20230904172516.479866-3-quic_viswanat@quicinc.com Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/boot/dts/qcom/ipq6018.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm64/boot/dts/qcom/ipq6018.dtsi +++ b/arch/arm64/boot/dts/qcom/ipq6018.dtsi @@ -175,7 +175,7 @@ smem { compatible = "qcom,smem"; memory-region = <&smem_region>; - hwlocks = <&tcsr_mutex 0>; + hwlocks = <&tcsr_mutex 3>; };
soc: soc {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Brian Geffon bgeffon@google.com
commit f0c7183008b41e92fa676406d87f18773724b48b upstream.
We found at least one situation where the safe pages list was empty and get_buffer() would gladly try to use a NULL pointer.
Signed-off-by: Brian Geffon bgeffon@google.com Fixes: 8357376d3df2 ("[PATCH] swsusp: Improve handling of highmem") Cc: All applicable stable@vger.kernel.org Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/power/snapshot.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
--- a/kernel/power/snapshot.c +++ b/kernel/power/snapshot.c @@ -2414,8 +2414,9 @@ static void *get_highmem_page_buffer(str pbe->copy_page = tmp; } else { /* Copy of the page will be stored in normal memory */ - kaddr = safe_pages_list; - safe_pages_list = safe_pages_list->next; + kaddr = __get_safe_page(ca->gfp_mask); + if (!kaddr) + return ERR_PTR(-ENOMEM); pbe->copy_page = virt_to_page(kaddr); } pbe->next = highmem_pblist; @@ -2595,8 +2596,9 @@ static void *get_buffer(struct memory_bi return ERR_PTR(-ENOMEM); } pbe->orig_address = page_address(page); - pbe->address = safe_pages_list; - safe_pages_list = safe_pages_list->next; + pbe->address = __get_safe_page(ca->gfp_mask); + if (!pbe->address) + return ERR_PTR(-ENOMEM); pbe->next = restore_pblist; restore_pblist = pbe; return pbe->address;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Brian Geffon bgeffon@google.com
commit d08970df1980476f27936e24d452550f3e9e92e1 upstream.
In snapshot_write_next(), sync_read is set and unset in three different spots unnecessiarly. As a result there is a subtle bug where the first page after the meta data has been loaded unconditionally sets sync_read to 0. If this first PFN was actually a highmem page, then the returned buffer will be the global "buffer," and the page needs to be loaded synchronously.
That is, I'm not sure we can always assume the following to be safe:
handle->buffer = get_buffer(&orig_bm, &ca); handle->sync_read = 0;
Because get_buffer() can call get_highmem_page_buffer() which can return 'buffer'.
The easiest way to address this is just set sync_read before snapshot_write_next() returns if handle->buffer == buffer.
Signed-off-by: Brian Geffon bgeffon@google.com Fixes: 8357376d3df2 ("[PATCH] swsusp: Improve handling of highmem") Cc: All applicable stable@vger.kernel.org [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/power/snapshot.c | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-)
--- a/kernel/power/snapshot.c +++ b/kernel/power/snapshot.c @@ -2629,8 +2629,6 @@ int snapshot_write_next(struct snapshot_ if (handle->cur > 1 && handle->cur > nr_meta_pages + nr_copy_pages) return 0;
- handle->sync_read = 1; - if (!handle->cur) { if (!buffer) /* This makes the buffer be freed by swsusp_free() */ @@ -2666,7 +2664,6 @@ int snapshot_write_next(struct snapshot_ memory_bm_position_reset(&orig_bm); restore_pblist = NULL; handle->buffer = get_buffer(&orig_bm, &ca); - handle->sync_read = 0; if (IS_ERR(handle->buffer)) return PTR_ERR(handle->buffer); } @@ -2676,9 +2673,8 @@ int snapshot_write_next(struct snapshot_ handle->buffer = get_buffer(&orig_bm, &ca); if (IS_ERR(handle->buffer)) return PTR_ERR(handle->buffer); - if (handle->buffer != buffer) - handle->sync_read = 0; } + handle->sync_read = (handle->buffer == buffer); handle->cur++; return PAGE_SIZE; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Catalin Marinas catalin.marinas@arm.com
commit 5f98fd034ca6fd1ab8c91a3488968a0e9caaabf6 upstream.
Since the actual slab freeing is deferred when calling kvfree_rcu(), so is the kmemleak_free() callback informing kmemleak of the object deletion. From the perspective of the kvfree_rcu() caller, the object is freed and it may remove any references to it. Since kmemleak does not scan RCU internal data storing the pointer, it will report such objects as leaks during the grace period.
Tell kmemleak to ignore such objects on the kvfree_call_rcu() path. Note that the tiny RCU implementation does not have such issue since the objects can be tracked from the rcu_ctrlblk structure.
Signed-off-by: Catalin Marinas catalin.marinas@arm.com Reported-by: Christoph Paasch cpaasch@apple.com Closes: https://lore.kernel.org/all/F903A825-F05F-4B77-A2B5-7356282FBA2C@apple.com/ Cc: stable@vger.kernel.org Tested-by: Christoph Paasch cpaasch@apple.com Reviewed-by: Paul E. McKenney paulmck@kernel.org Signed-off-by: Joel Fernandes (Google) joel@joelfernandes.org Signed-off-by: Frederic Weisbecker frederic@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/rcu/tree.c | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/kernel/rcu/tree.c +++ b/kernel/rcu/tree.c @@ -31,6 +31,7 @@ #include <linux/bitops.h> #include <linux/export.h> #include <linux/completion.h> +#include <linux/kmemleak.h> #include <linux/moduleparam.h> #include <linux/panic.h> #include <linux/panic_notifier.h> @@ -3609,6 +3610,14 @@ void kvfree_call_rcu(struct rcu_head *he
WRITE_ONCE(krcp->count, krcp->count + 1);
+ /* + * The kvfree_rcu() caller considers the pointer freed at this point + * and likely removes any references to it. Since the actual slab + * freeing (and kmemleak_free()) is deferred, tell kmemleak to ignore + * this object (no scanning or false positives reporting). + */ + kmemleak_ignore(ptr); + // Set timer to drain after KFREE_DRAIN_JIFFIES. if (rcu_scheduler_active == RCU_SCHEDULER_RUNNING && !krcp->monitor_todo) {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Josef Bacik josef@toxicpanda.com
commit 11aeb97b45ad2e0040cbb2a589bc403152526345 upstream.
We have a random schedule_timeout() if the current transaction is committing, which seems to be a holdover from the original delalloc reservation code.
Remove this, we have the proper flushing stuff, we shouldn't be hoping for random timing things to make everything work. This just induces latency for no reason.
CC: stable@vger.kernel.org # 5.4+ Signed-off-by: Josef Bacik josef@toxicpanda.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/delalloc-space.c | 3 --- 1 file changed, 3 deletions(-)
--- a/fs/btrfs/delalloc-space.c +++ b/fs/btrfs/delalloc-space.c @@ -312,9 +312,6 @@ int btrfs_delalloc_reserve_metadata(stru } else { if (current->journal_info) flush = BTRFS_RESERVE_FLUSH_LIMIT; - - if (btrfs_transaction_in_commit(fs_info)) - schedule_timeout(1); }
num_bytes = ALIGN(num_bytes, fs_info->sectorsize);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kathiravan Thirumoorthy quic_kathirav@quicinc.com
commit 3337a6fea25370d3d244ec6bb38c71ee86fcf837 upstream.
Per the "SMC calling convention specification", the 64-bit calling convention can only be used when the client is 64-bit. Whereas the 32-bit calling convention can be used by either a 32-bit or a 64-bit client.
Currently during SCM probe, irrespective of the client, 64-bit calling convention is made, which is incorrect and may lead to the undefined behaviour when the client is 32-bit. Let's fix it.
Cc: stable@vger.kernel.org Fixes: 9a434cee773a ("firmware: qcom_scm: Dynamically support SMCCC and legacy conventions") Reviewed-By: Elliot Berman quic_eberman@quicinc.com Signed-off-by: Kathiravan Thirumoorthy quic_kathirav@quicinc.com Link: https://lore.kernel.org/r/20230925-scm-v3-1-8790dff6a749@quicinc.com Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/firmware/qcom_scm.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/drivers/firmware/qcom_scm.c +++ b/drivers/firmware/qcom_scm.c @@ -137,6 +137,12 @@ static enum qcom_scm_convention __get_co return qcom_scm_convention;
/* + * Per the "SMC calling convention specification", the 64-bit calling + * convention can only be used when the client is 64-bit, otherwise + * system will encounter the undefined behaviour. + */ +#if IS_ENABLED(CONFIG_ARM64) + /* * Device isn't required as there is only one argument - no device * needed to dma_map_single to secure world */ @@ -156,6 +162,7 @@ static enum qcom_scm_convention __get_co forced = true; goto found; } +#endif
probed_convention = SMC_CONVENTION_ARM_32; ret = __scm_smc_call(NULL, &desc, probed_convention, &res, true);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vasily Khoruzhick anarsoul@gmail.com
commit a83c68a3bf7c418c9a46693c63c638852b0c1f4e upstream.
Buggy BIOSes may have invalid FPDT subtables, e.g. on my hardware:
S3PT subtable:
7F20FE30: 53 33 50 54 24 00 00 00-00 00 00 00 00 00 18 01 *S3PT$...........* 7F20FE40: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00 *................* 7F20FE50: 00 00 00 00
Here the first record has zero length.
FBPT subtable:
7F20FE50: 46 42 50 54-3C 00 00 00 46 42 50 54 *....FBPT<...FBPT* 7F20FE60: 02 00 30 02 00 00 00 00-00 00 00 00 00 00 00 00 *..0.............* 7F20FE70: 2A A6 BC 6E 0B 00 00 00-1A 44 41 70 0B 00 00 00 **..n.....DAp....* 7F20FE80: 00 00 00 00 00 00 00 00-00 00 00 00 00 00 00 00 *................*
And here FBPT table has FBPT signature repeated instead of the first record.
Current code will be looping indefinitely due to zero length records, so break out of the loop if record length is zero.
While we are here, add proper handling for fpdt_process_subtable() failures.
Fixes: d1eb86e59be0 ("ACPI: tables: introduce support for FPDT table") Cc: All applicable stable@vger.kernel.org Signed-off-by: Vasily Khoruzhick anarsoul@gmail.com [ rjw: Comment edit, added empty code lines ] Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/acpi/acpi_fpdt.c | 45 +++++++++++++++++++++++++++++++++++++-------- 1 file changed, 37 insertions(+), 8 deletions(-)
--- a/drivers/acpi/acpi_fpdt.c +++ b/drivers/acpi/acpi_fpdt.c @@ -194,12 +194,19 @@ static int fpdt_process_subtable(u64 add record_header = (void *)subtable_header + offset; offset += record_header->length;
+ if (!record_header->length) { + pr_err(FW_BUG "Zero-length record found in FPTD.\n"); + result = -EINVAL; + goto err; + } + switch (record_header->type) { case RECORD_S3_RESUME: if (subtable_type != SUBTABLE_S3PT) { pr_err(FW_BUG "Invalid record %d for subtable %s\n", record_header->type, signature); - return -EINVAL; + result = -EINVAL; + goto err; } if (record_resume) { pr_err("Duplicate resume performance record found.\n"); @@ -208,7 +215,7 @@ static int fpdt_process_subtable(u64 add record_resume = (struct resume_performance_record *)record_header; result = sysfs_create_group(fpdt_kobj, &resume_attr_group); if (result) - return result; + goto err; break; case RECORD_S3_SUSPEND: if (subtable_type != SUBTABLE_S3PT) { @@ -223,13 +230,14 @@ static int fpdt_process_subtable(u64 add record_suspend = (struct suspend_performance_record *)record_header; result = sysfs_create_group(fpdt_kobj, &suspend_attr_group); if (result) - return result; + goto err; break; case RECORD_BOOT: if (subtable_type != SUBTABLE_FBPT) { pr_err(FW_BUG "Invalid %d for subtable %s\n", record_header->type, signature); - return -EINVAL; + result = -EINVAL; + goto err; } if (record_boot) { pr_err("Duplicate boot performance record found.\n"); @@ -238,7 +246,7 @@ static int fpdt_process_subtable(u64 add record_boot = (struct boot_performance_record *)record_header; result = sysfs_create_group(fpdt_kobj, &boot_attr_group); if (result) - return result; + goto err; break;
default: @@ -247,6 +255,18 @@ static int fpdt_process_subtable(u64 add } } return 0; + +err: + if (record_boot) + sysfs_remove_group(fpdt_kobj, &boot_attr_group); + + if (record_suspend) + sysfs_remove_group(fpdt_kobj, &suspend_attr_group); + + if (record_resume) + sysfs_remove_group(fpdt_kobj, &resume_attr_group); + + return result; }
static int __init acpi_init_fpdt(void) @@ -255,6 +275,7 @@ static int __init acpi_init_fpdt(void) struct acpi_table_header *header; struct fpdt_subtable_entry *subtable; u32 offset = sizeof(*header); + int result;
status = acpi_get_table(ACPI_SIG_FPDT, 0, &header);
@@ -263,8 +284,8 @@ static int __init acpi_init_fpdt(void)
fpdt_kobj = kobject_create_and_add("fpdt", acpi_kobj); if (!fpdt_kobj) { - acpi_put_table(header); - return -ENOMEM; + result = -ENOMEM; + goto err_nomem; }
while (offset < header->length) { @@ -272,8 +293,10 @@ static int __init acpi_init_fpdt(void) switch (subtable->type) { case SUBTABLE_FBPT: case SUBTABLE_S3PT: - fpdt_process_subtable(subtable->address, + result = fpdt_process_subtable(subtable->address, subtable->type); + if (result) + goto err_subtable; break; default: /* Other types are reserved in ACPI 6.4 spec. */ @@ -282,6 +305,12 @@ static int __init acpi_init_fpdt(void) offset += sizeof(*subtable); } return 0; +err_subtable: + kobject_put(fpdt_kobj); + +err_nomem: + acpi_put_table(header); + return result; }
fs_initcall(acpi_init_fpdt);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Amir Goldstein amir73il@gmail.com
commit e044374a8a0a99e46f4e6d6751d3042b6d9cc12e upstream.
It is not clear that IMA should be nested at all, but as long is it measures files both on overlayfs and on underlying fs, we need to annotate the iint mutex to avoid lockdep false positives related to IMA + overlayfs, same as overlayfs annotates the inode mutex.
Reported-and-tested-by: syzbot+b42fe626038981fb7bfa@syzkaller.appspotmail.com Signed-off-by: Amir Goldstein amir73il@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Mimi Zohar zohar@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- security/integrity/iint.c | 48 +++++++++++++++++++++++++++++++++++----------- 1 file changed, 37 insertions(+), 11 deletions(-)
--- a/security/integrity/iint.c +++ b/security/integrity/iint.c @@ -66,9 +66,32 @@ struct integrity_iint_cache *integrity_i return iint; }
-static void iint_free(struct integrity_iint_cache *iint) +#define IMA_MAX_NESTING (FILESYSTEM_MAX_STACK_DEPTH+1) + +/* + * It is not clear that IMA should be nested at all, but as long is it measures + * files both on overlayfs and on underlying fs, we need to annotate the iint + * mutex to avoid lockdep false positives related to IMA + overlayfs. + * See ovl_lockdep_annotate_inode_mutex_key() for more details. + */ +static inline void iint_lockdep_annotate(struct integrity_iint_cache *iint, + struct inode *inode) +{ +#ifdef CONFIG_LOCKDEP + static struct lock_class_key iint_mutex_key[IMA_MAX_NESTING]; + + int depth = inode->i_sb->s_stack_depth; + + if (WARN_ON_ONCE(depth < 0 || depth >= IMA_MAX_NESTING)) + depth = 0; + + lockdep_set_class(&iint->mutex, &iint_mutex_key[depth]); +#endif +} + +static void iint_init_always(struct integrity_iint_cache *iint, + struct inode *inode) { - kfree(iint->ima_hash); iint->ima_hash = NULL; iint->version = 0; iint->flags = 0UL; @@ -80,6 +103,14 @@ static void iint_free(struct integrity_i iint->ima_creds_status = INTEGRITY_UNKNOWN; iint->evm_status = INTEGRITY_UNKNOWN; iint->measured_pcrs = 0; + mutex_init(&iint->mutex); + iint_lockdep_annotate(iint, inode); +} + +static void iint_free(struct integrity_iint_cache *iint) +{ + kfree(iint->ima_hash); + mutex_destroy(&iint->mutex); kmem_cache_free(iint_cache, iint); }
@@ -112,6 +143,8 @@ struct integrity_iint_cache *integrity_i if (!iint) return NULL;
+ iint_init_always(iint, inode); + write_lock(&integrity_iint_lock);
p = &integrity_iint_tree.rb_node; @@ -161,25 +194,18 @@ void integrity_inode_free(struct inode * iint_free(iint); }
-static void init_once(void *foo) +static void iint_init_once(void *foo) { struct integrity_iint_cache *iint = (struct integrity_iint_cache *) foo;
memset(iint, 0, sizeof(*iint)); - iint->ima_file_status = INTEGRITY_UNKNOWN; - iint->ima_mmap_status = INTEGRITY_UNKNOWN; - iint->ima_bprm_status = INTEGRITY_UNKNOWN; - iint->ima_read_status = INTEGRITY_UNKNOWN; - iint->ima_creds_status = INTEGRITY_UNKNOWN; - iint->evm_status = INTEGRITY_UNKNOWN; - mutex_init(&iint->mutex); }
static int __init integrity_iintcache_init(void) { iint_cache = kmem_cache_create("iint_cache", sizeof(struct integrity_iint_cache), - 0, SLAB_PANIC, init_once); + 0, SLAB_PANIC, iint_init_once); return 0; } DEFINE_LSM(integrity) = {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mimi Zohar zohar@linux.ibm.com
commit b836c4d29f2744200b2af41e14bf50758dddc818 upstream.
Commit 18b44bc5a672 ("ovl: Always reevaluate the file signature for IMA") forced signature re-evaulation on every file access.
Instead of always re-evaluating the file's integrity, detect a change to the backing file, by comparing the cached file metadata with the backing file's metadata. Verifying just the i_version has not changed is insufficient. In addition save and compare the i_ino and s_dev as well.
Reviewed-by: Amir Goldstein amir73il@gmail.com Tested-by: Eric Snowberg eric.snowberg@oracle.com Tested-by: Raul E Rangel rrangel@chromium.org Cc: stable@vger.kernel.org Signed-off-by: Mimi Zohar zohar@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/overlayfs/super.c | 2 +- security/integrity/ima/ima_api.c | 5 +++++ security/integrity/ima/ima_main.c | 16 +++++++++++++++- security/integrity/integrity.h | 2 ++ 4 files changed, 23 insertions(+), 2 deletions(-)
--- a/fs/overlayfs/super.c +++ b/fs/overlayfs/super.c @@ -2140,7 +2140,7 @@ static int ovl_fill_super(struct super_b ovl_trusted_xattr_handlers; sb->s_fs_info = ofs; sb->s_flags |= SB_POSIXACL; - sb->s_iflags |= SB_I_SKIP_SYNC | SB_I_IMA_UNVERIFIABLE_SIGNATURE; + sb->s_iflags |= SB_I_SKIP_SYNC;
err = -ENOMEM; root_dentry = ovl_get_root(sb, upperpath.dentry, oe); --- a/security/integrity/ima/ima_api.c +++ b/security/integrity/ima/ima_api.c @@ -216,6 +216,7 @@ int ima_collect_measurement(struct integ { const char *audit_cause = "failed"; struct inode *inode = file_inode(file); + struct inode *real_inode = d_real_inode(file_dentry(file)); const char *filename = file->f_path.dentry->d_name.name; int result = 0; int length; @@ -266,6 +267,10 @@ int ima_collect_measurement(struct integ iint->ima_hash = tmpbuf; memcpy(iint->ima_hash, &hash, length); iint->version = i_version; + if (real_inode != inode) { + iint->real_ino = real_inode->i_ino; + iint->real_dev = real_inode->i_sb->s_dev; + }
/* Possibly temporary failure due to type of read (eg. O_DIRECT) */ if (!result) --- a/security/integrity/ima/ima_main.c +++ b/security/integrity/ima/ima_main.c @@ -26,6 +26,7 @@ #include <linux/ima.h> #include <linux/iversion.h> #include <linux/fs.h> +#include <linux/iversion.h>
#include "ima.h"
@@ -202,7 +203,7 @@ static int process_measurement(struct fi u32 secid, char *buf, loff_t size, int mask, enum ima_hooks func) { - struct inode *inode = file_inode(file); + struct inode *backing_inode, *inode = file_inode(file); struct integrity_iint_cache *iint = NULL; struct ima_template_desc *template_desc = NULL; char *pathbuf = NULL; @@ -278,6 +279,19 @@ static int process_measurement(struct fi iint->measured_pcrs = 0; }
+ /* Detect and re-evaluate changes made to the backing file. */ + backing_inode = d_real_inode(file_dentry(file)); + if (backing_inode != inode && + (action & IMA_DO_MASK) && (iint->flags & IMA_DONE_MASK)) { + if (!IS_I_VERSION(backing_inode) || + backing_inode->i_sb->s_dev != iint->real_dev || + backing_inode->i_ino != iint->real_ino || + !inode_eq_iversion(backing_inode, iint->version)) { + iint->flags &= ~IMA_DONE_MASK; + iint->measured_pcrs = 0; + } + } + /* Determine if already appraised/measured based on bitmask * (IMA_MEASURE, IMA_MEASURED, IMA_XXXX_APPRAISE, IMA_XXXX_APPRAISED, * IMA_AUDIT, IMA_AUDITED) --- a/security/integrity/integrity.h +++ b/security/integrity/integrity.h @@ -131,6 +131,8 @@ struct integrity_iint_cache { unsigned long flags; unsigned long measured_pcrs; unsigned long atomic_flags; + unsigned long real_ino; + dev_t real_dev; enum integrity_status ima_file_status:4; enum integrity_status ima_mmap_status:4; enum integrity_status ima_bprm_status:4;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan+linaro@kernel.org
commit 1a5352a81b4720ba43d9c899974e3bddf7ce0ce8 upstream.
The ath11k active pdevs are protected by RCU but the temperature event handling code calling ath11k_mac_get_ar_by_pdev_id() was not marked as a read-side critical section as reported by RCU lockdep:
============================= WARNING: suspicious RCU usage 6.6.0-rc6 #7 Not tainted ----------------------------- drivers/net/wireless/ath/ath11k/mac.c:638 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1 no locks held by swapper/0/0. ... Call trace: ... lockdep_rcu_suspicious+0x16c/0x22c ath11k_mac_get_ar_by_pdev_id+0x194/0x1b0 [ath11k] ath11k_wmi_tlv_op_rx+0xa84/0x2c1c [ath11k] ath11k_htc_rx_completion_handler+0x388/0x510 [ath11k]
Mark the code in question as an RCU read-side critical section to avoid any potential use-after-free issues.
Tested-on: WCN6855 hw2.1 PCI WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.23
Fixes: a41d10348b01 ("ath11k: add thermal sensor device support") Cc: stable@vger.kernel.org # 5.7 Signed-off-by: Johan Hovold johan+linaro@kernel.org Acked-by: Jeff Johnson quic_jjohnson@quicinc.com Signed-off-by: Kalle Valo quic_kvalo@quicinc.com Link: https://lore.kernel.org/r/20231019153115.26401-2-johan+linaro@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/ath/ath11k/wmi.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/drivers/net/wireless/ath/ath11k/wmi.c +++ b/drivers/net/wireless/ath/ath11k/wmi.c @@ -6855,15 +6855,19 @@ ath11k_wmi_pdev_temperature_event(struct ath11k_dbg(ab, ATH11K_DBG_WMI, "pdev temperature ev temp %d pdev_id %d\n", ev->temp, ev->pdev_id);
+ rcu_read_lock(); + ar = ath11k_mac_get_ar_by_pdev_id(ab, ev->pdev_id); if (!ar) { ath11k_warn(ab, "invalid pdev id in pdev temperature ev %d", ev->pdev_id); - kfree(tb); - return; + goto exit; }
ath11k_thermal_event_temperature(ar, ev->temp);
+exit: + rcu_read_unlock(); + kfree(tb); }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan+linaro@kernel.org
commit 3b6c14833165f689cc5928574ebafe52bbce5f1e upstream.
The ath11k active pdevs are protected by RCU but the DFS radar event handling code calling ath11k_mac_get_ar_by_pdev_id() was not marked as a read-side critical section.
Mark the code in question as an RCU read-side critical section to avoid any potential use-after-free issues.
Compile tested only.
Fixes: d5c65159f289 ("ath11k: driver for Qualcomm IEEE 802.11ax devices") Cc: stable@vger.kernel.org # 5.6 Acked-by: Jeff Johnson quic_jjohnson@quicinc.com Signed-off-by: Johan Hovold johan+linaro@kernel.org Signed-off-by: Kalle Valo quic_kvalo@quicinc.com Link: https://lore.kernel.org/r/20231019153115.26401-3-johan+linaro@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/ath/ath11k/wmi.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/net/wireless/ath/ath11k/wmi.c +++ b/drivers/net/wireless/ath/ath11k/wmi.c @@ -6809,6 +6809,8 @@ ath11k_wmi_pdev_dfs_radar_detected_event ev->detector_id, ev->segment_id, ev->timestamp, ev->is_chirp, ev->freq_offset, ev->sidx);
+ rcu_read_lock(); + ar = ath11k_mac_get_ar_by_pdev_id(ab, ev->pdev_id);
if (!ar) { @@ -6826,6 +6828,8 @@ ath11k_wmi_pdev_dfs_radar_detected_event ieee80211_radar_detected(ar->hw);
exit: + rcu_read_unlock(); + kfree(tb); }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan+linaro@kernel.org
commit 3f77c7d605b29df277d77e9ee75d96e7ad145d2d upstream.
The ath11k active pdevs are protected by RCU but the htt pktlog handling code calling ath11k_mac_get_ar_by_pdev_id() was not marked as a read-side critical section.
Mark the code in question as an RCU read-side critical section to avoid any potential use-after-free issues.
Compile tested only.
Fixes: d5c65159f289 ("ath11k: driver for Qualcomm IEEE 802.11ax devices") Cc: stable@vger.kernel.org # 5.6 Signed-off-by: Johan Hovold johan+linaro@kernel.org Signed-off-by: Kalle Valo quic_kvalo@quicinc.com Link: https://lore.kernel.org/r/20231019112521.2071-1-johan+linaro@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/ath/ath11k/dp_rx.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/drivers/net/wireless/ath/ath11k/dp_rx.c +++ b/drivers/net/wireless/ath/ath11k/dp_rx.c @@ -1603,14 +1603,20 @@ static void ath11k_htt_pktlog(struct ath u8 pdev_id;
pdev_id = FIELD_GET(HTT_T2H_PPDU_STATS_INFO_PDEV_ID, data->hdr); + + rcu_read_lock(); + ar = ath11k_mac_get_ar_by_pdev_id(ab, pdev_id); if (!ar) { ath11k_warn(ab, "invalid pdev id %d on htt pktlog\n", pdev_id); - return; + goto out; }
trace_ath11k_htt_pktlog(ar, data->payload, hdr->size, ar->ab->pktlog_defs_checksum); + +out: + rcu_read_unlock(); }
static void ath11k_htt_backpressure_event_handler(struct ath11k_base *ab,
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rong Chen rong.chen@amlogic.com
commit 57925e16c9f7d18012bcf45bfa658f92c087981a upstream.
For the t7 and older SoC families, the CMD_CFG_ERROR has no effect. Starting from SoC family C3, setting this bit without SG LINK data address will cause the controller to generate an IRQ and stop working.
To fix it, don't set the bit CMD_CFG_ERROR anymore.
Fixes: 18f92bc02f17 ("mmc: meson-gx: make sure the descriptor is stopped on errors") Signed-off-by: Rong Chen rong.chen@amlogic.com Reviewed-by: Jerome Brunet jbrunet@baylibre.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231026073156.2868310-1-rong.chen@amlogic.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/meson-gx-mmc.c | 1 - 1 file changed, 1 deletion(-)
--- a/drivers/mmc/host/meson-gx-mmc.c +++ b/drivers/mmc/host/meson-gx-mmc.c @@ -811,7 +811,6 @@ static void meson_mmc_start_cmd(struct m
cmd_cfg |= FIELD_PREP(CMD_CFG_CMD_INDEX_MASK, cmd->opcode); cmd_cfg |= CMD_CFG_OWNER; /* owned by CPU */ - cmd_cfg |= CMD_CFG_ERROR; /* stop in case of error */
meson_mmc_set_response_bits(cmd, &cmd_cfg);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Herve Codina herve.codina@bootlin.com
commit 5e7afb2eb7b2a7c81e9f608cbdf74a07606fd1b5 upstream.
irq_remove_generic_chip() calculates the Linux interrupt number for removing the handler and interrupt chip based on gc::irq_base as a linear function of the bit positions of set bits in the @msk argument.
When the generic chip is present in an irq domain, i.e. created with a call to irq_alloc_domain_generic_chips(), gc::irq_base contains not the base Linux interrupt number. It contains the base hardware interrupt for this chip. It is set to 0 for the first chip in the domain, 0 + N for the next chip, where $N is the number of hardware interrupts per chip.
That means the Linux interrupt number cannot be calculated based on gc::irq_base for irqdomain based chips without a domain map lookup, which is currently missing.
Rework the code to take the irqdomain case into account and calculate the Linux interrupt number by a irqdomain lookup of the domain specific hardware interrupt number.
[ tglx: Massage changelog. Reshuffle the logic and add a proper comment. ]
Fixes: cfefd21e693d ("genirq: Add chip suspend and resume callbacks") Signed-off-by: Herve Codina herve.codina@bootlin.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231024150335.322282-1-herve.codina@bootlin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/irq/generic-chip.c | 25 +++++++++++++++++++------ 1 file changed, 19 insertions(+), 6 deletions(-)
--- a/kernel/irq/generic-chip.c +++ b/kernel/irq/generic-chip.c @@ -541,21 +541,34 @@ EXPORT_SYMBOL_GPL(irq_setup_alt_chip); void irq_remove_generic_chip(struct irq_chip_generic *gc, u32 msk, unsigned int clr, unsigned int set) { - unsigned int i = gc->irq_base; + unsigned int i, virq;
raw_spin_lock(&gc_lock); list_del(&gc->list); raw_spin_unlock(&gc_lock);
- for (; msk; msk >>= 1, i++) { + for (i = 0; msk; msk >>= 1, i++) { if (!(msk & 0x01)) continue;
+ /* + * Interrupt domain based chips store the base hardware + * interrupt number in gc::irq_base. Otherwise gc::irq_base + * contains the base Linux interrupt number. + */ + if (gc->domain) { + virq = irq_find_mapping(gc->domain, gc->irq_base + i); + if (!virq) + continue; + } else { + virq = gc->irq_base + i; + } + /* Remove handler first. That will mask the irq line */ - irq_set_handler(i, NULL); - irq_set_chip(i, &no_irq_chip); - irq_set_chip_data(i, NULL); - irq_modify_status(i, clr, set); + irq_set_handler(virq, NULL); + irq_set_chip(virq, &no_irq_chip); + irq_set_chip_data(virq, NULL); + irq_modify_status(virq, clr, set); } } EXPORT_SYMBOL_GPL(irq_remove_generic_chip);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jarkko Sakkinen jarkko@kernel.org
commit 31de287345f41bbfaec36a5c8cbdba035cf76442 upstream.
Do bind neither static calls nor trusted_key_exit() before a successful init, in order to maintain a consistent state. In addition, depart the init_trusted() in the case of a real error (i.e. getting back something else than -ENODEV).
Reported-by: Linus Torvalds torvalds@linux-foundation.org Closes: https://lore.kernel.org/linux-integrity/CAHk-=whOPoLaWM8S8GgoOPT7a2+nMH5h3TL... Cc: stable@vger.kernel.org # v5.13+ Fixes: 5d0682be3189 ("KEYS: trusted: Add generic trusted keys framework") Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- security/keys/trusted-keys/trusted_core.c | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-)
--- a/security/keys/trusted-keys/trusted_core.c +++ b/security/keys/trusted-keys/trusted_core.c @@ -354,17 +354,17 @@ static int __init init_trusted(void) if (!get_random) get_random = kernel_get_random;
- static_call_update(trusted_key_seal, - trusted_key_sources[i].ops->seal); - static_call_update(trusted_key_unseal, - trusted_key_sources[i].ops->unseal); - static_call_update(trusted_key_get_random, - get_random); - trusted_key_exit = trusted_key_sources[i].ops->exit; - migratable = trusted_key_sources[i].ops->migratable; - ret = trusted_key_sources[i].ops->init(); - if (!ret) + if (!ret) { + static_call_update(trusted_key_seal, trusted_key_sources[i].ops->seal); + static_call_update(trusted_key_unseal, trusted_key_sources[i].ops->unseal); + static_call_update(trusted_key_get_random, get_random); + + trusted_key_exit = trusted_key_sources[i].ops->exit; + migratable = trusted_key_sources[i].ops->migratable; + } + + if (!ret || ret != -ENODEV) break; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Uwe Kleine-König u.kleine-koenig@pengutronix.de
commit 200bddbb3f5202bbce96444fdc416305de14f547 upstream.
With CONFIG_PCIE_KEYSTONE=y and ks_pcie_remove() marked with __exit, the function is discarded from the driver. In this case a bound device can still get unbound, e.g via sysfs. Then no cleanup code is run resulting in resource leaks or worse.
The right thing to do is do always have the remove callback available. Note that this driver cannot be compiled as a module, so ks_pcie_remove() was always discarded before this change and modpost couldn't warn about this issue. Furthermore the __ref annotation also prevents a warning.
Fixes: 0c4ffcfe1fbc ("PCI: keystone: Add TI Keystone PCIe driver") Link: https://lore.kernel.org/r/20231001170254.2506508-4-u.kleine-koenig@pengutron... Signed-off-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Signed-off-by: Bjorn Helgaas bhelgaas@google.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/dwc/pci-keystone.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/pci/controller/dwc/pci-keystone.c +++ b/drivers/pci/controller/dwc/pci-keystone.c @@ -1284,7 +1284,7 @@ err_link: return ret; }
-static int __exit ks_pcie_remove(struct platform_device *pdev) +static int ks_pcie_remove(struct platform_device *pdev) { struct keystone_pcie *ks_pcie = platform_get_drvdata(pdev); struct device_link **link = ks_pcie->link; @@ -1302,7 +1302,7 @@ static int __exit ks_pcie_remove(struct
static struct platform_driver ks_pcie_driver __refdata = { .probe = ks_pcie_probe, - .remove = __exit_p(ks_pcie_remove), + .remove = ks_pcie_remove, .driver = { .name = "keystone-pcie", .of_match_table = of_match_ptr(ks_pcie_of_match),
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Uwe Kleine-König u.kleine-koenig@pengutronix.de
commit 7994db905c0fd692cf04c527585f08a91b560144 upstream.
The __init annotation makes the ks_pcie_probe() function disappear after booting completes. However a device can also be bound later. In that case, we try to call ks_pcie_probe(), but the backing memory is likely already overwritten.
The right thing to do is do always have the probe callback available. Note that the (wrong) __refdata annotation prevented this issue to be noticed by modpost.
Fixes: 0c4ffcfe1fbc ("PCI: keystone: Add TI Keystone PCIe driver") Link: https://lore.kernel.org/r/20231001170254.2506508-5-u.kleine-koenig@pengutron... Signed-off-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Signed-off-by: Bjorn Helgaas bhelgaas@google.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/dwc/pci-keystone.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/pci/controller/dwc/pci-keystone.c +++ b/drivers/pci/controller/dwc/pci-keystone.c @@ -1080,7 +1080,7 @@ static const struct of_device_id ks_pcie { }, };
-static int __init ks_pcie_probe(struct platform_device *pdev) +static int ks_pcie_probe(struct platform_device *pdev) { const struct dw_pcie_host_ops *host_ops; const struct dw_pcie_ep_ops *ep_ops; @@ -1300,7 +1300,7 @@ static int ks_pcie_remove(struct platfor return 0; }
-static struct platform_driver ks_pcie_driver __refdata = { +static struct platform_driver ks_pcie_driver = { .probe = ks_pcie_probe, .remove = ks_pcie_remove, .driver = {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pablo Neira Ayuso pablo@netfilter.org
[ Upstream commit 93995bf4af2c5a99e2a87f0cd5ce547d31eb7630 ]
The expired catchall element is not deactivated and removed from GC sync path. This path holds mutex so just call nft_setelem_data_deactivate() and nft_setelem_catchall_remove() before queueing the GC work.
Fixes: 4a9e12ea7e70 ("netfilter: nft_set_pipapo: call nft_trans_gc_queue_sync() in catchall GC") Reported-by: lonial con kongln9170@gmail.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_tables_api.c | 26 +++++++++++++++++++++----- 1 file changed, 21 insertions(+), 5 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index 8f12e83280cbd..e9d0c6c8e0b12 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -6049,6 +6049,12 @@ static int nft_setelem_deactivate(const struct net *net, return ret; }
+static void nft_setelem_catchall_destroy(struct nft_set_elem_catchall *catchall) +{ + list_del_rcu(&catchall->list); + kfree_rcu(catchall, rcu); +} + static void nft_setelem_catchall_remove(const struct net *net, const struct nft_set *set, const struct nft_set_elem *elem) @@ -6057,8 +6063,7 @@ static void nft_setelem_catchall_remove(const struct net *net,
list_for_each_entry_safe(catchall, next, &set->catchall_list, list) { if (catchall->elem == elem->priv) { - list_del_rcu(&catchall->list); - kfree_rcu(catchall, rcu); + nft_setelem_catchall_destroy(catchall); break; } } @@ -9046,11 +9051,12 @@ static struct nft_trans_gc *nft_trans_gc_catchall(struct nft_trans_gc *gc, unsigned int gc_seq, bool sync) { - struct nft_set_elem_catchall *catchall; + struct nft_set_elem_catchall *catchall, *next; const struct nft_set *set = gc->set; + struct nft_elem_priv *elem_priv; struct nft_set_ext *ext;
- list_for_each_entry_rcu(catchall, &set->catchall_list, list) { + list_for_each_entry_safe(catchall, next, &set->catchall_list, list) { ext = nft_set_elem_ext(set, catchall->elem);
if (!nft_set_elem_expired(ext)) @@ -9068,7 +9074,17 @@ static struct nft_trans_gc *nft_trans_gc_catchall(struct nft_trans_gc *gc, if (!gc) return NULL;
- nft_trans_gc_elem_add(gc, catchall->elem); + elem_priv = catchall->elem; + if (sync) { + struct nft_set_elem elem = { + .priv = elem_priv, + }; + + nft_setelem_data_deactivate(gc->net, gc->set, &elem); + nft_setelem_catchall_destroy(catchall); + } + + nft_trans_gc_elem_add(gc, elem_priv); }
return gc;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pablo Neira Ayuso pablo@netfilter.org
[ Upstream commit 8837ba3e58ea1e3d09ae36db80b1e80853aada95 ]
list_for_each_entry_safe() does not work for the async case which runs under RCU, therefore, split GC logic for catchall in two functions instead, one for each of the sync and async GC variants.
The catchall sync GC variant never sees a _DEAD bit set on ever, thus, this handling is removed in such case, moreover, allocate GC sync batch via GFP_KERNEL.
Fixes: 93995bf4af2c ("netfilter: nf_tables: remove catchall element in GC sync path") Reported-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_tables_api.c | 61 ++++++++++++++++++----------------- 1 file changed, 32 insertions(+), 29 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index e9d0c6c8e0b12..bf0bd44f2fb3a 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -9047,16 +9047,14 @@ void nft_trans_gc_queue_sync_done(struct nft_trans_gc *trans) call_rcu(&trans->rcu, nft_trans_gc_trans_free); }
-static struct nft_trans_gc *nft_trans_gc_catchall(struct nft_trans_gc *gc, - unsigned int gc_seq, - bool sync) +struct nft_trans_gc *nft_trans_gc_catchall_async(struct nft_trans_gc *gc, + unsigned int gc_seq) { - struct nft_set_elem_catchall *catchall, *next; + struct nft_set_elem_catchall *catchall; const struct nft_set *set = gc->set; - struct nft_elem_priv *elem_priv; struct nft_set_ext *ext;
- list_for_each_entry_safe(catchall, next, &set->catchall_list, list) { + list_for_each_entry_rcu(catchall, &set->catchall_list, list) { ext = nft_set_elem_ext(set, catchall->elem);
if (!nft_set_elem_expired(ext)) @@ -9066,39 +9064,44 @@ static struct nft_trans_gc *nft_trans_gc_catchall(struct nft_trans_gc *gc,
nft_set_elem_dead(ext); dead_elem: - if (sync) - gc = nft_trans_gc_queue_sync(gc, GFP_ATOMIC); - else - gc = nft_trans_gc_queue_async(gc, gc_seq, GFP_ATOMIC); - + gc = nft_trans_gc_queue_async(gc, gc_seq, GFP_ATOMIC); if (!gc) return NULL;
- elem_priv = catchall->elem; - if (sync) { - struct nft_set_elem elem = { - .priv = elem_priv, - }; - - nft_setelem_data_deactivate(gc->net, gc->set, &elem); - nft_setelem_catchall_destroy(catchall); - } - - nft_trans_gc_elem_add(gc, elem_priv); + nft_trans_gc_elem_add(gc, catchall->elem); }
return gc; }
-struct nft_trans_gc *nft_trans_gc_catchall_async(struct nft_trans_gc *gc, - unsigned int gc_seq) -{ - return nft_trans_gc_catchall(gc, gc_seq, false); -} - struct nft_trans_gc *nft_trans_gc_catchall_sync(struct nft_trans_gc *gc) { - return nft_trans_gc_catchall(gc, 0, true); + struct nft_set_elem_catchall *catchall, *next; + const struct nft_set *set = gc->set; + struct nft_set_elem elem; + struct nft_set_ext *ext; + + WARN_ON_ONCE(!lockdep_commit_lock_is_held(gc->net)); + + list_for_each_entry_safe(catchall, next, &set->catchall_list, list) { + ext = nft_set_elem_ext(set, catchall->elem); + + if (!nft_set_elem_expired(ext)) + continue; + + gc = nft_trans_gc_queue_sync(gc, GFP_KERNEL); + if (!gc) + return NULL; + + memset(&elem, 0, sizeof(elem)); + elem.priv = catchall->elem; + + nft_setelem_data_deactivate(gc->net, gc->set, &elem); + nft_setelem_catchall_destroy(catchall); + nft_trans_gc_elem_add(gc, elem.priv); + } + + return gc; }
static void nf_tables_module_autoload_cleanup(struct net *net)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
commit 030b48fb2cf045dead8ee2c5ead560930044c029 upstream.
The test runner run_cmt_test() in resctrl_tests.c checks for CMT feature and does not run cmt_resctrl_val() if CMT is not supported. Then cmt_resctrl_val() also check is CMT is supported.
Remove the duplicated feature check for CMT from cmt_resctrl_val().
Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Tested-by: Shaopeng Tan tan.shaopeng@jp.fujitsu.com Reviewed-by: Reinette Chatre reinette.chatre@intel.com Reviewed-by: Shaopeng Tan tan.shaopeng@jp.fujitsu.com Cc: stable@vger.kernel.org Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/resctrl/cmt_test.c | 3 --- 1 file changed, 3 deletions(-)
--- a/tools/testing/selftests/resctrl/cmt_test.c +++ b/tools/testing/selftests/resctrl/cmt_test.c @@ -91,9 +91,6 @@ int cmt_resctrl_val(int cpu_no, int n, c if (ret) return ret;
- if (!validate_resctrl_feature_request(CMT_STR)) - return -1; - ret = get_cbm_mask("L3", cbm_mask); if (ret) return ret;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilpo Järvinen ilpo.jarvinen@linux.intel.com
commit ef43c30858754d99373a63dff33280a9969b49bc upstream.
The initial value of 5% chosen for the maximum allowed percentage difference between resctrl mbm value and IMC mbm value in
commit 06bd03a57f8c ("selftests/resctrl: Fix MBA/MBM results reporting format") was "randomly chosen value" (as admitted by the changelog).
When running tests in our lab across a large number platforms, 5% difference upper bound for success seems a bit on the low side for the MBA and MBM tests. Some platforms produce outliers that are slightly above that, typically 6-7%, which leads MBA/MBM test frequently failing.
Replace the "randomly chosen value" with a success bound that is based on those measurements across large number of platforms by relaxing the MBA/MBM success bound to 8%. The relaxed bound removes the failures due the frequent outliers.
Fixed commit description style error during merge: Shuah Khan skhan@linuxfoundation.org
Fixes: 06bd03a57f8c ("selftests/resctrl: Fix MBA/MBM results reporting format") Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Tested-by: Shaopeng Tan tan.shaopeng@jp.fujitsu.com Reviewed-by: Reinette Chatre reinette.chatre@intel.com Reviewed-by: Shaopeng Tan tan.shaopeng@jp.fujitsu.com Cc: stable@vger.kernel.org Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/resctrl/mba_test.c | 2 +- tools/testing/selftests/resctrl/mbm_test.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
--- a/tools/testing/selftests/resctrl/mba_test.c +++ b/tools/testing/selftests/resctrl/mba_test.c @@ -12,7 +12,7 @@
#define RESULT_FILE_NAME "result_mba" #define NUM_OF_RUNS 5 -#define MAX_DIFF_PERCENT 5 +#define MAX_DIFF_PERCENT 8 #define ALLOCATION_MAX 100 #define ALLOCATION_MIN 10 #define ALLOCATION_STEP 10 --- a/tools/testing/selftests/resctrl/mbm_test.c +++ b/tools/testing/selftests/resctrl/mbm_test.c @@ -11,7 +11,7 @@ #include "resctrl.h"
#define RESULT_FILE_NAME "result_mbm" -#define MAX_DIFF_PERCENT 5 +#define MAX_DIFF_PERCENT 8 #define NUM_OF_RUNS 5
static int
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org
commit 72151ad0cba8a07df90130ff62c979520d71f23b upstream.
Driver compares widget name in wsa_macro_spk_boost_event() widget event callback, however it does not handle component's name prefix. This leads to using uninitialized stack variables as registers and register values. Handle gracefully such case.
Fixes: 2c4066e5d428 ("ASoC: codecs: lpass-wsa-macro: add dapm widgets and route") Cc: stable@vger.kernel.org Signed-off-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Link: https://lore.kernel.org/r/20231003155422.801160-1-krzysztof.kozlowski@linaro... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/codecs/lpass-wsa-macro.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/sound/soc/codecs/lpass-wsa-macro.c +++ b/sound/soc/codecs/lpass-wsa-macro.c @@ -1678,6 +1678,9 @@ static int wsa_macro_spk_boost_event(str boost_path_cfg1 = CDC_WSA_RX1_RX_PATH_CFG1; reg = CDC_WSA_RX1_RX_PATH_CTL; reg_mix = CDC_WSA_RX1_RX_PATH_MIX_CTL; + } else { + dev_warn(component->dev, "Incorrect widget name in the driver\n"); + return -EINVAL; }
switch (event) {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhihao Cheng chengzhihao1@huawei.com
commit 61187fce8600e8ef90e601be84f9d0f3222c1206 upstream.
JBD2 makes sure journal data is fallen on fs device by sync_blockdev(), however, other process could intercept the EIO information from bdev's mapping, which leads journal recovering successful even EIO occurs during data written back to fs device.
We found this problem in our product, iscsi + multipath is chosen for block device of ext4. Unstable network may trigger kpartx to rescan partitions in device mapper layer. Detailed process is shown as following:
mount kpartx irq jbd2_journal_recover do_one_pass memcpy(nbh->b_data, obh->b_data) // copy data to fs dev from journal mark_buffer_dirty // mark bh dirty vfs_read generic_file_read_iter // dio filemap_write_and_wait_range __filemap_fdatawrite_range do_writepages block_write_full_folio submit_bh_wbc >> EIO occurs in disk << end_buffer_async_write mark_buffer_write_io_error mapping_set_error set_bit(AS_EIO, &mapping->flags) // set! filemap_check_errors test_and_clear_bit(AS_EIO, &mapping->flags) // clear! err2 = sync_blockdev filemap_write_and_wait filemap_check_errors test_and_clear_bit(AS_EIO, &mapping->flags) // false err2 = 0
Filesystem is mounted successfully even data from journal is failed written into disk, and ext4/ocfs2 could become corrupted.
Fix it by comparing the wb_err state in fs block device before recovering and after recovering.
A reproducer can be found in the kernel bugzilla referenced below.
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217888 Cc: stable@vger.kernel.org Signed-off-by: Zhihao Cheng chengzhihao1@huawei.com Signed-off-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/20230919012525.1783108-1-chengzhihao1@huawei.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/jbd2/recovery.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/fs/jbd2/recovery.c +++ b/fs/jbd2/recovery.c @@ -283,6 +283,8 @@ int jbd2_journal_recover(journal_t *jour journal_superblock_t * sb;
struct recovery_info info; + errseq_t wb_err; + struct address_space *mapping;
memset(&info, 0, sizeof(info)); sb = journal->j_superblock; @@ -300,6 +302,9 @@ int jbd2_journal_recover(journal_t *jour return 0; }
+ wb_err = 0; + mapping = journal->j_fs_dev->bd_inode->i_mapping; + errseq_check_and_advance(&mapping->wb_err, &wb_err); err = do_one_pass(journal, &info, PASS_SCAN); if (!err) err = do_one_pass(journal, &info, PASS_REVOKE); @@ -320,6 +325,9 @@ int jbd2_journal_recover(journal_t *jour err2 = sync_blockdev(journal->j_fs_dev); if (!err) err = err2; + err2 = errseq_check_and_advance(&mapping->wb_err, &wb_err); + if (!err) + err = err2; /* Make sure all replayed data is on permanent storage */ if (journal->j_flags & JBD2_BARRIER) { err2 = blkdev_issue_flush(journal->j_fs_dev);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@google.com
commit d3cc1b0be258191d6360c82ea158c2972f8d3991 upstream.
Since commit d7e7b9af104c ("fscrypt: stop using keyrings subsystem for fscrypt_master_key"), xfstest generic/270 causes a WARNING when run on f2fs with test_dummy_encryption in the mount options:
$ kvm-xfstests -c f2fs/encrypt generic/270 [...] WARNING: CPU: 1 PID: 2453 at fs/crypto/keyring.c:240 fscrypt_destroy_keyring+0x1f5/0x260
The cause of the WARNING is that not all encrypted inodes have been evicted before fscrypt_destroy_keyring() is called, which violates an assumption. This happens because the test uses an external quota file, which gets automatically encrypted due to test_dummy_encryption.
Encryption of quota files has never really been supported. On ext4, ext4_quota_read() does not decrypt the data, so encrypted quota files are always considered invalid on ext4. On f2fs, f2fs_quota_read() uses the pagecache, so trying to use an encrypted quota file gets farther, resulting in the issue described above being possible. But this was never intended to be possible, and there is no use case for it.
Therefore, make the quota support layer explicitly reject using IS_ENCRYPTED inodes when quotaon is attempted.
Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Jan Kara jack@suse.cz Message-Id: 20230905003227.326998-1-ebiggers@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/quota/dquot.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
--- a/fs/quota/dquot.c +++ b/fs/quota/dquot.c @@ -2396,6 +2396,20 @@ static int vfs_setup_quota_inode(struct if (sb_has_quota_loaded(sb, type)) return -EBUSY;
+ /* + * Quota files should never be encrypted. They should be thought of as + * filesystem metadata, not user data. New-style internal quota files + * cannot be encrypted by users anyway, but old-style external quota + * files could potentially be incorrectly created in an encrypted + * directory, hence this explicit check. Some reasons why encrypted + * quota files don't work include: (1) some filesystems that support + * encryption don't handle it in their quota_read and quota_write, and + * (2) cleaning up encrypted quota files at unmount would need special + * consideration, as quota files are cleaned up later than user files. + */ + if (IS_ENCRYPTED(inode)) + return -EINVAL; + dqopt->files[type] = igrab(inode); if (!dqopt->files[type]) return -EIO;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Benjamin Bara benjamin.bara@skidata.com
commit 60466c067927abbcaff299845abd4b7069963139 upstream.
As the emergency restart does not call kernel_restart_prepare(), the system_state stays in SYSTEM_RUNNING.
Since bae1d3a05a8b, this hinders i2c_in_atomic_xfer_mode() from becoming active, and therefore might lead to avoidable warnings in the restart handlers, e.g.:
[ 12.667612] WARNING: CPU: 1 PID: 1 at kernel/rcu/tree_plugin.h:318 rcu_note_context_switch+0x33c/0x6b0 [ 12.676926] Voluntary context switch within RCU read-side critical section! ... [ 12.742376] schedule_timeout from wait_for_completion_timeout+0x90/0x114 [ 12.749179] wait_for_completion_timeout from tegra_i2c_wait_completion+0x40/0x70 ... [ 12.994527] atomic_notifier_call_chain from machine_restart+0x34/0x58 [ 13.001050] machine_restart from panic+0x2a8/0x32c
Avoid these by setting the correct system_state.
Fixes: bae1d3a05a8b ("i2c: core: remove use of in_atomic()") Cc: stable@vger.kernel.org # v5.2+ Reviewed-by: Dmitry Osipenko dmitry.osipenko@collabora.com Tested-by: Nishanth Menon nm@ti.com Signed-off-by: Benjamin Bara benjamin.bara@skidata.com Link: https://lore.kernel.org/r/20230327-tegra-pmic-reboot-v7-1-18699d5dcd76@skida... Signed-off-by: Lee Jones lee@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/reboot.c | 1 + 1 file changed, 1 insertion(+)
--- a/kernel/reboot.c +++ b/kernel/reboot.c @@ -65,6 +65,7 @@ EXPORT_SYMBOL_GPL(pm_power_off_prepare); void emergency_restart(void) { kmsg_dump(KMSG_DUMP_EMERG); + system_state = SYSTEM_RESTART; machine_emergency_restart(); } EXPORT_SYMBOL_GPL(emergency_restart);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Benjamin Bara benjamin.bara@skidata.com
commit aa49c90894d06e18a1ee7c095edbd2f37c232d02 upstream.
Since bae1d3a05a8b, i2c transfers are non-atomic if preemption is disabled. However, non-atomic i2c transfers require preemption (e.g. in wait_for_completion() while waiting for the DMA).
panic() calls preempt_disable_notrace() before calling emergency_restart(). Therefore, if an i2c device is used for the restart, the xfer should be atomic. This avoids warnings like:
[ 12.667612] WARNING: CPU: 1 PID: 1 at kernel/rcu/tree_plugin.h:318 rcu_note_context_switch+0x33c/0x6b0 [ 12.676926] Voluntary context switch within RCU read-side critical section! ... [ 12.742376] schedule_timeout from wait_for_completion_timeout+0x90/0x114 [ 12.749179] wait_for_completion_timeout from tegra_i2c_wait_completion+0x40/0x70 ... [ 12.994527] atomic_notifier_call_chain from machine_restart+0x34/0x58 [ 13.001050] machine_restart from panic+0x2a8/0x32c
Use !preemptible() instead, which is basically the same check as pre-v5.2.
Fixes: bae1d3a05a8b ("i2c: core: remove use of in_atomic()") Cc: stable@vger.kernel.org # v5.2+ Suggested-by: Dmitry Osipenko dmitry.osipenko@collabora.com Acked-by: Wolfram Sang wsa@kernel.org Reviewed-by: Dmitry Osipenko dmitry.osipenko@collabora.com Tested-by: Nishanth Menon nm@ti.com Signed-off-by: Benjamin Bara benjamin.bara@skidata.com Link: https://lore.kernel.org/r/20230327-tegra-pmic-reboot-v7-2-18699d5dcd76@skida... Signed-off-by: Lee Jones lee@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i2c/i2c-core.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/i2c/i2c-core.h +++ b/drivers/i2c/i2c-core.h @@ -29,7 +29,7 @@ int i2c_dev_irq_from_resources(const str */ static inline bool i2c_in_atomic_xfer_mode(void) { - return system_state > SYSTEM_RUNNING && irqs_disabled(); + return system_state > SYSTEM_RUNNING && !preemptible(); }
static inline int __i2c_lock_bus_helper(struct i2c_adapter *adap)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt (Google) rostedt@goodmis.org
commit 4f7969bcd6d33042d62e249b41b5578161e4c868 upstream.
A synthetic event is created by the synthetic event interface that can read both user or kernel address memory. In reality, it reads any arbitrary memory location from within the kernel. If the address space is in USER (where CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE is set) then it uses strncpy_from_user_nofault() to copy strings otherwise it uses strncpy_from_kernel_nofault().
But since both functions use the same variable there's no annotation to what that variable is (ie. __user). This makes sparse complain.
Quiet sparse by typecasting the strncpy_from_user_nofault() variable to a __user pointer.
Link: https://lore.kernel.org/linux-trace-kernel/20231031151033.73c42e23@gandalf.l...
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Mark Rutland mark.rutland@arm.com Fixes: 0934ae9977c2 ("tracing: Fix reading strings from synthetic events"); Reported-by: kernel test robot lkp@intel.com Closes: https://lore.kernel.org/oe-kbuild-all/202311010013.fm8WTxa5-lkp@intel.com/ Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_events_synth.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/trace/trace_events_synth.c +++ b/kernel/trace/trace_events_synth.c @@ -454,7 +454,7 @@ static unsigned int trace_string(struct
#ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE if ((unsigned long)str_val < TASK_SIZE) - ret = strncpy_from_user_nofault(str_field, str_val, STR_VAR_LEN_MAX); + ret = strncpy_from_user_nofault(str_field, (const void __user *)str_val, STR_VAR_LEN_MAX); else #endif ret = strncpy_from_kernel_nofault(str_field, str_val, STR_VAR_LEN_MAX);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sanjuán García, Jorge Jorge.SanjuanGarcia@duagon.com
commit 63ba2d07b4be72b94216d20561f43e1150b25d98 upstream.
chameleon_parse_gdd() may fail for different reasons and end up in the err tag. Make sure we at least always free the mcb_device allocated with mcb_alloc_dev().
If mcb_device_register() fails, make sure to give up the reference in the same place the device was added.
Fixes: 728ac3389296 ("mcb: mcb-parse: fix error handing in chameleon_parse_gdd()") Cc: stable stable@kernel.org Reviewed-by: Jose Javier Rodriguez Barbarin JoseJavier.Rodriguez@duagon.com Signed-off-by: Jorge Sanjuan Garcia jorge.sanjuangarcia@duagon.com Link: https://lore.kernel.org/r/20231019141434.57971-2-jorge.sanjuangarcia@duagon.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mcb/mcb-core.c | 1 + drivers/mcb/mcb-parse.c | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/mcb/mcb-core.c +++ b/drivers/mcb/mcb-core.c @@ -246,6 +246,7 @@ int mcb_device_register(struct mcb_bus * return 0;
out: + put_device(&dev->dev);
return ret; } --- a/drivers/mcb/mcb-parse.c +++ b/drivers/mcb/mcb-parse.c @@ -106,7 +106,7 @@ static int chameleon_parse_gdd(struct mc return 0;
err: - put_device(&mdev->dev); + mcb_free_dev(mdev);
return ret; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alain Volmat alain.volmat@foss.st.com
commit 03f25d53b145bc2f7ccc82fc04e4482ed734f524 upstream.
In case of the prep descriptor while the channel is already running, the CCR register value stored into the channel could already have its EN bit set. This would lead to a bad transfer since, at start transfer time, enabling the channel while other registers aren't yet properly set. To avoid this, ensure to mask the CCR_EN bit when storing the ccr value into the mdma channel structure.
Fixes: a4ffb13c8946 ("dmaengine: Add STM32 MDMA driver") Signed-off-by: Alain Volmat alain.volmat@foss.st.com Signed-off-by: Amelie Delaunay amelie.delaunay@foss.st.com Cc: stable@vger.kernel.org Tested-by: Alain Volmat alain.volmat@foss.st.com Link: https://lore.kernel.org/r/20231009082450.452877-1-amelie.delaunay@foss.st.co... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/dma/stm32-mdma.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/dma/stm32-mdma.c +++ b/drivers/dma/stm32-mdma.c @@ -509,7 +509,7 @@ static int stm32_mdma_set_xfer_param(str src_maxburst = chan->dma_config.src_maxburst; dst_maxburst = chan->dma_config.dst_maxburst;
- ccr = stm32_mdma_read(dmadev, STM32_MDMA_CCR(chan->id)); + ccr = stm32_mdma_read(dmadev, STM32_MDMA_CCR(chan->id)) & ~STM32_MDMA_CCR_EN; ctcr = stm32_mdma_read(dmadev, STM32_MDMA_CTCR(chan->id)); ctbr = stm32_mdma_read(dmadev, STM32_MDMA_CTBR(chan->id));
@@ -937,7 +937,7 @@ stm32_mdma_prep_dma_memcpy(struct dma_ch if (!desc) return NULL;
- ccr = stm32_mdma_read(dmadev, STM32_MDMA_CCR(chan->id)); + ccr = stm32_mdma_read(dmadev, STM32_MDMA_CCR(chan->id)) & ~STM32_MDMA_CCR_EN; ctcr = stm32_mdma_read(dmadev, STM32_MDMA_CTCR(chan->id)); ctbr = stm32_mdma_read(dmadev, STM32_MDMA_CTBR(chan->id)); cbndtr = stm32_mdma_read(dmadev, STM32_MDMA_CBNDTR(chan->id));
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Heiko Carstens hca@linux.ibm.com
commit 16ba44826a04834d3eeeda4b731c2ea3481062b7 upstream.
If the cmma no-dat feature is available the kernel page tables are walked to identify and mark all pages which are used for address translation (all region, segment, and page tables). In a subsequent loop all other pages are marked as "no-dat" pages with the ESSA instruction.
This information is visible to the hypervisor, so that the hypervisor can optimize purging of guest TLB entries. The initial loop however does not cover the complete kernel address space. This can result in pages being marked as not being used for dynamic address translation, even though they are. In turn guest TLB entries incorrectly may not be purged.
Fix this by adjusting the end address of the kernel address range being walked.
Cc: stable@vger.kernel.org Reviewed-by: Claudio Imbrenda imbrenda@linux.ibm.com Reviewed-by: Alexander Gordeev agordeev@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/s390/mm/page-states.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-)
--- a/arch/s390/mm/page-states.c +++ b/arch/s390/mm/page-states.c @@ -161,15 +161,22 @@ static void mark_kernel_p4d(pgd_t *pgd,
static void mark_kernel_pgd(void) { - unsigned long addr, next; + unsigned long addr, next, max_addr; struct page *page; pgd_t *pgd; int i;
addr = 0; + /* + * Figure out maximum virtual address accessible with the + * kernel ASCE. This is required to keep the page table walker + * from accessing non-existent entries. + */ + max_addr = (S390_lowcore.kernel_asce.val & _ASCE_TYPE_MASK) >> 2; + max_addr = 1UL << (max_addr * 11 + 31); pgd = pgd_offset_k(addr); do { - next = pgd_addr_end(addr, MODULES_END); + next = pgd_addr_end(addr, max_addr); if (pgd_none(*pgd)) continue; if (!pgd_folded(*pgd)) { @@ -178,7 +185,7 @@ static void mark_kernel_pgd(void) set_bit(PG_arch_1, &page[i].flags); } mark_kernel_p4d(pgd, addr, next); - } while (pgd++, addr = next, addr != MODULES_END); + } while (pgd++, addr = next, addr != max_addr); }
void __init cmma_init_nodat(void)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Heiko Carstens hca@linux.ibm.com
commit 44d93045247661acbd50b1629e62f415f2747577 upstream.
If the cmma no-dat feature is available the kernel page tables are walked to identify and mark all pages which are used for address translation (all region, segment, and page tables). In a subsequent loop all other pages are marked as "no-dat" pages with the ESSA instruction.
This information is visible to the hypervisor, so that the hypervisor can optimize purging of guest TLB entries. The initial loop however is incorrect: only the first three of the four pages which belong to segment and region tables will be marked as being used for DAT. The last page is incorrectly marked as no-dat.
This can result in incorrect guest TLB flushes.
Fix this by simply marking all four pages.
Cc: stable@vger.kernel.org Reviewed-by: Claudio Imbrenda imbrenda@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/s390/mm/page-states.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/arch/s390/mm/page-states.c +++ b/arch/s390/mm/page-states.c @@ -131,7 +131,7 @@ static void mark_kernel_pud(p4d_t *p4d, continue; if (!pud_folded(*pud)) { page = phys_to_page(pud_val(*pud)); - for (i = 0; i < 3; i++) + for (i = 0; i < 4; i++) set_bit(PG_arch_1, &page[i].flags); } mark_kernel_pmd(pud, addr, next); @@ -152,7 +152,7 @@ static void mark_kernel_p4d(pgd_t *pgd, continue; if (!p4d_folded(*p4d)) { page = phys_to_page(p4d_val(*p4d)); - for (i = 0; i < 3; i++) + for (i = 0; i < 4; i++) set_bit(PG_arch_1, &page[i].flags); } mark_kernel_pud(p4d, addr, next); @@ -181,7 +181,7 @@ static void mark_kernel_pgd(void) continue; if (!pgd_folded(*pgd)) { page = phys_to_page(pgd_val(*pgd)); - for (i = 0; i < 3; i++) + for (i = 0; i < 4; i++) set_bit(PG_arch_1, &page[i].flags); } mark_kernel_p4d(pgd, addr, next);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Heiko Carstens hca@linux.ibm.com
commit 84bb41d5df48868055d159d9247b80927f1f70f9 upstream.
If the cmma no-dat feature is available the kernel page tables are walked to identify and mark all pages which are used for address translation (all region, segment, and page tables). In a subsequent loop all other pages are marked as "no-dat" pages with the ESSA instruction.
This information is visible to the hypervisor, so that the hypervisor can optimize purging of guest TLB entries. All pages used for swapper_pg_dir and invalid_pg_dir are incorrectly marked as no-dat, which in turn can result in incorrect guest TLB flushes.
Fix this by marking those pages correctly as being used for DAT.
Cc: stable@vger.kernel.org Reviewed-by: Claudio Imbrenda imbrenda@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/s390/mm/page-states.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/arch/s390/mm/page-states.c +++ b/arch/s390/mm/page-states.c @@ -198,6 +198,12 @@ void __init cmma_init_nodat(void) return; /* Mark pages used in kernel page tables */ mark_kernel_pgd(); + page = virt_to_page(&swapper_pg_dir); + for (i = 0; i < 4; i++) + set_bit(PG_arch_1, &page[i].flags); + page = virt_to_page(&invalid_pg_dir); + for (i = 0; i < 4; i++) + set_bit(PG_arch_1, &page[i].flags);
/* Set all kernel pages not used for page tables to stable/no-dat */ for_each_mem_pfn_range(i, MAX_NUMNODES, &start, &end, NULL) {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zi Yan ziy@nvidia.com
commit 2e7cfe5cd5b6b0b98abf57a3074885979e187c1c upstream.
Patch series "Use nth_page() in place of direct struct page manipulation", v3.
On SPARSEMEM without VMEMMAP, struct page is not guaranteed to be contiguous, since each memory section's memmap might be allocated independently. hugetlb pages can go beyond a memory section size, thus direct struct page manipulation on hugetlb pages/subpages might give wrong struct page. Kernel provides nth_page() to do the manipulation properly. Use that whenever code can see hugetlb pages.
This patch (of 5):
When dealing with hugetlb pages, manipulating struct page pointers directly can get to wrong struct page, since struct page is not guaranteed to be contiguous on SPARSEMEM without VMEMMAP. Use nth_page() to handle it properly.
Without the fix, page_kasan_tag_reset() could reset wrong page tags, causing a wrong kasan result. No related bug is reported. The fix comes from code inspection.
Link: https://lkml.kernel.org/r/20230913201248.452081-1-zi.yan@sent.com Link: https://lkml.kernel.org/r/20230913201248.452081-2-zi.yan@sent.com Fixes: 2813b9c02962 ("kasan, mm, arm64: tag non slab memory allocated via pagealloc") Signed-off-by: Zi Yan ziy@nvidia.com Reviewed-by: Muchun Song songmuchun@bytedance.com Cc: David Hildenbrand david@redhat.com Cc: Matthew Wilcox (Oracle) willy@infradead.org Cc: Mike Kravetz mike.kravetz@oracle.com Cc: Mike Rapoport (IBM) rppt@kernel.org Cc: Thomas Bogendoerfer tsbogend@alpha.franken.de Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/cma.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/cma.c +++ b/mm/cma.c @@ -503,7 +503,7 @@ struct page *cma_alloc(struct cma *cma, */ if (page) { for (i = 0; i < count; i++) - page_kasan_tag_reset(page + i); + page_kasan_tag_reset(nth_page(page, i)); }
if (ret && !no_warn) {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zi Yan ziy@nvidia.com
commit 1640a0ef80f6d572725f5b0330038c18e98ea168 upstream.
When dealing with hugetlb pages, manipulating struct page pointers directly can get to wrong struct page, since struct page is not guaranteed to be contiguous on SPARSEMEM without VMEMMAP. Use pfn calculation to handle it properly.
Without the fix, a wrong number of page might be skipped. Since skip cannot be negative, scan_movable_page() will end early and might miss a movable page with -ENOENT. This might fail offline_pages(). No bug is reported. The fix comes from code inspection.
Link: https://lkml.kernel.org/r/20230913201248.452081-4-zi.yan@sent.com Fixes: eeb0efd071d8 ("mm,memory_hotplug: fix scan_movable_pages() for gigantic hugepages") Signed-off-by: Zi Yan ziy@nvidia.com Reviewed-by: Muchun Song songmuchun@bytedance.com Acked-by: David Hildenbrand david@redhat.com Cc: Matthew Wilcox (Oracle) willy@infradead.org Cc: Mike Kravetz mike.kravetz@oracle.com Cc: Mike Rapoport (IBM) rppt@kernel.org Cc: Thomas Bogendoerfer tsbogend@alpha.franken.de Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/memory_hotplug.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -1677,7 +1677,7 @@ static int scan_movable_pages(unsigned l */ if (HPageMigratable(head)) goto found; - skip = compound_nr(head) - (page - head); + skip = compound_nr(head) - (pfn - page_to_pfn(head)); pfn += skip - 1; } return -ENOENT;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Linus Walleij linus.walleij@linaro.org
commit 565fe150624ee77dc63a735cc1b3bff5101f38a3 upstream.
Currently the offset into the device when looking for OTP bits can go outside of the address of the MTD NOR devices, and if that memory isn't readable, bad things happen on the IXP4xx (added prints that illustrate the problem before the crash):
cfi_intelext_otp_walk walk OTP on chip 0 start at reg_prot_offset 0x00000100 ixp4xx_copy_from copy from 0x00000100 to 0xc880dd78 cfi_intelext_otp_walk walk OTP on chip 0 start at reg_prot_offset 0x12000000 ixp4xx_copy_from copy from 0x12000000 to 0xc880dd78 8<--- cut here --- Unable to handle kernel paging request at virtual address db000000 [db000000] *pgd=00000000 (...)
This happens in this case because the IXP4xx is big endian and the 32- and 16-bit fields in the struct cfi_intelext_otpinfo are not properly byteswapped. Compare to how the code in read_pri_intelext() byteswaps the fields in struct cfi_pri_intelext.
Adding a small byte swapping loop for the OTP in read_pri_intelext() and the crash goes away.
The problem went unnoticed for many years until I enabled CONFIG_MTD_OTP on the IXP4xx as well, triggering the bug.
Cc: stable@vger.kernel.org Reviewed-by: Nicolas Pitre nico@fluxnic.net Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com Link: https://lore.kernel.org/linux-mtd/20231020-mtd-otp-byteswap-v4-1-0d132c06aa9... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mtd/chips/cfi_cmdset_0001.c | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-)
--- a/drivers/mtd/chips/cfi_cmdset_0001.c +++ b/drivers/mtd/chips/cfi_cmdset_0001.c @@ -421,9 +421,25 @@ read_pri_intelext(struct map_info *map, extra_size = 0;
/* Protection Register info */ - if (extp->NumProtectionFields) + if (extp->NumProtectionFields) { + struct cfi_intelext_otpinfo *otp = + (struct cfi_intelext_otpinfo *)&extp->extra[0]; + extra_size += (extp->NumProtectionFields - 1) * - sizeof(struct cfi_intelext_otpinfo); + sizeof(struct cfi_intelext_otpinfo); + + if (extp_size >= sizeof(*extp) + extra_size) { + int i; + + /* Do some byteswapping if necessary */ + for (i = 0; i < extp->NumProtectionFields - 1; i++) { + otp->ProtRegAddr = le32_to_cpu(otp->ProtRegAddr); + otp->FactGroups = le16_to_cpu(otp->FactGroups); + otp->UserGroups = le16_to_cpu(otp->UserGroups); + otp++; + } + } + } }
if (extp->MinorVersion >= '1') {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joshua Yeong joshua.yeong@starfivetech.com
commit 4bd8405257da717cd556f99e5fb68693d12c9766 upstream.
IBIR_DEPTH and CMDR_DEPTH should read from status0 instead of status1.
Cc: stable@vger.kernel.org Fixes: 603f2bee2c54 ("i3c: master: Add driver for Cadence IP") Signed-off-by: Joshua Yeong joshua.yeong@starfivetech.com Reviewed-by: Miquel Raynal miquel.raynal@bootlin.com Link: https://lore.kernel.org/r/20230913031743.11439-2-joshua.yeong@starfivetech.c... Signed-off-by: Alexandre Belloni alexandre.belloni@bootlin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i3c/master/i3c-master-cdns.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/i3c/master/i3c-master-cdns.c +++ b/drivers/i3c/master/i3c-master-cdns.c @@ -192,7 +192,7 @@ #define SLV_STATUS1_HJ_DIS BIT(18) #define SLV_STATUS1_MR_DIS BIT(17) #define SLV_STATUS1_PROT_ERR BIT(16) -#define SLV_STATUS1_DA(x) (((s) & GENMASK(15, 9)) >> 9) +#define SLV_STATUS1_DA(s) (((s) & GENMASK(15, 9)) >> 9) #define SLV_STATUS1_HAS_DA BIT(8) #define SLV_STATUS1_DDR_RX_FULL BIT(7) #define SLV_STATUS1_DDR_TX_FULL BIT(6) @@ -1624,13 +1624,13 @@ static int cdns_i3c_master_probe(struct /* Device ID0 is reserved to describe this master. */ master->maxdevs = CONF_STATUS0_DEVS_NUM(val); master->free_rr_slots = GENMASK(master->maxdevs, 1); + master->caps.ibirfifodepth = CONF_STATUS0_IBIR_DEPTH(val); + master->caps.cmdrfifodepth = CONF_STATUS0_CMDR_DEPTH(val);
val = readl(master->regs + CONF_STATUS1); master->caps.cmdfifodepth = CONF_STATUS1_CMD_DEPTH(val); master->caps.rxfifodepth = CONF_STATUS1_RX_DEPTH(val); master->caps.txfifodepth = CONF_STATUS1_TX_DEPTH(val); - master->caps.ibirfifodepth = CONF_STATUS0_IBIR_DEPTH(val); - master->caps.cmdrfifodepth = CONF_STATUS0_CMDR_DEPTH(val);
spin_lock_init(&master->ibi.lock); master->ibi.num_slots = CONF_STATUS1_IBI_HW_RES(val);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Frank Li Frank.Li@nxp.com
commit 6bf3fc268183816856c96b8794cd66146bc27b35 upstream.
The ibi work thread operates asynchronously with other transfers, such as svc_i3c_master_priv_xfers(). Introduce mutex protection to ensure the completion of the entire i3c/i2c transaction.
Fixes: dd3c52846d59 ("i3c: master: svc: Add Silvaco I3C master driver") Cc: stable@vger.kernel.org Reviewed-by: Miquel Raynal miquel.raynal@bootlin.com Signed-off-by: Frank Li Frank.Li@nxp.com Link: https://lore.kernel.org/r/20231023161658.3890811-2-Frank.Li@nxp.com Signed-off-by: Alexandre Belloni alexandre.belloni@bootlin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i3c/master/svc-i3c-master.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
--- a/drivers/i3c/master/svc-i3c-master.c +++ b/drivers/i3c/master/svc-i3c-master.c @@ -165,6 +165,7 @@ struct svc_i3c_xfer { * @ibi.slots: Available IBI slots * @ibi.tbq_slot: To be queued IBI slot * @ibi.lock: IBI lock + * @lock: Transfer lock, protect between IBI work thread and callbacks from master */ struct svc_i3c_master { struct i3c_master_controller base; @@ -192,6 +193,7 @@ struct svc_i3c_master { /* Prevent races within IBI handlers */ spinlock_t lock; } ibi; + struct mutex lock; };
/** @@ -345,6 +347,7 @@ static void svc_i3c_master_ibi_work(stru u32 status, val; int ret;
+ mutex_lock(&master->lock); /* Acknowledge the incoming interrupt with the AUTOIBI mechanism */ writel(SVC_I3C_MCTRL_REQUEST_AUTO_IBI | SVC_I3C_MCTRL_IBIRESP_AUTO, @@ -421,6 +424,7 @@ static void svc_i3c_master_ibi_work(stru
reenable_ibis: svc_i3c_master_enable_interrupts(master, SVC_I3C_MINT_SLVSTART); + mutex_unlock(&master->lock); }
static irqreturn_t svc_i3c_master_irq_handler(int irq, void *dev_id) @@ -1095,9 +1099,11 @@ static int svc_i3c_master_send_bdcast_cc cmd->read_len = 0; cmd->continued = false;
+ mutex_lock(&master->lock); svc_i3c_master_enqueue_xfer(master, xfer); if (!wait_for_completion_timeout(&xfer->comp, msecs_to_jiffies(1000))) svc_i3c_master_dequeue_xfer(master, xfer); + mutex_unlock(&master->lock);
ret = xfer->ret; kfree(buf); @@ -1141,9 +1147,11 @@ static int svc_i3c_master_send_direct_cc cmd->read_len = read_len; cmd->continued = false;
+ mutex_lock(&master->lock); svc_i3c_master_enqueue_xfer(master, xfer); if (!wait_for_completion_timeout(&xfer->comp, msecs_to_jiffies(1000))) svc_i3c_master_dequeue_xfer(master, xfer); + mutex_unlock(&master->lock);
ret = xfer->ret; svc_i3c_master_free_xfer(xfer); @@ -1197,9 +1205,11 @@ static int svc_i3c_master_priv_xfers(str cmd->continued = (i + 1) < nxfers; }
+ mutex_lock(&master->lock); svc_i3c_master_enqueue_xfer(master, xfer); if (!wait_for_completion_timeout(&xfer->comp, msecs_to_jiffies(1000))) svc_i3c_master_dequeue_xfer(master, xfer); + mutex_unlock(&master->lock);
ret = xfer->ret; svc_i3c_master_free_xfer(xfer); @@ -1235,9 +1245,11 @@ static int svc_i3c_master_i2c_xfers(stru cmd->continued = (i + 1 < nxfers); }
+ mutex_lock(&master->lock); svc_i3c_master_enqueue_xfer(master, xfer); if (!wait_for_completion_timeout(&xfer->comp, msecs_to_jiffies(1000))) svc_i3c_master_dequeue_xfer(master, xfer); + mutex_unlock(&master->lock);
ret = xfer->ret; svc_i3c_master_free_xfer(xfer); @@ -1407,6 +1419,8 @@ static int svc_i3c_master_probe(struct p
INIT_WORK(&master->hj_work, svc_i3c_master_hj_work); INIT_WORK(&master->ibi_work, svc_i3c_master_ibi_work); + mutex_init(&master->lock); + ret = devm_request_irq(dev, master->irq, svc_i3c_master_irq_handler, IRQF_NO_SUSPEND, "svc-i3c-irq", master); if (ret)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Frank Li Frank.Li@nxp.com
commit 5e5e3c92e748a6d859190e123b9193cf4911fcca upstream.
┌─────┐ ┏──┐ ┏──┐ ┏──┐ ┏──┐ ┏──┐ ┏──┐ ┏──┐ ┏──┐ ┌───── SCL: ┘ └─────┛ └──┛ └──┛ └──┛ └──┛ └──┛ └──┛ └──┛ └──┘ ───┐ ┌─────┐ ┌─────┐ ┌───────────┐ SDA: └───────────────────────┘ └─────┘ └─────┘ └───── xxx╱ ╲╱ ╲╱ ╲╱ ╲╱ ╲ : xxx╲IBI ╱╲ Addr(0x0a) ╱╲ RW ╱╲NACK╱╲ S ╱
If an In-Band Interrupt (IBI) occurs and IBI work thread is not immediately scheduled, when svc_i3c_master_priv_xfers() initiates the I3C transfer and attempts to send address 0x7e, the target interprets it as an IBI handler and returns the target address 0x0a.
However, svc_i3c_master_priv_xfers() does not handle this case and proceeds with other transfers, resulting in incorrect data being returned.
Add IBIWON check in svc_i3c_master_xfer(). In case this situation occurs, return a failure to the driver.
Fixes: dd3c52846d59 ("i3c: master: svc: Add Silvaco I3C master driver") Cc: stable@vger.kernel.org Reviewed-by: Miquel Raynal miquel.raynal@bootlin.com Signed-off-by: Frank Li Frank.Li@nxp.com Link: https://lore.kernel.org/r/20231023161658.3890811-3-Frank.Li@nxp.com Signed-off-by: Alexandre Belloni alexandre.belloni@bootlin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i3c/master/svc-i3c-master.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+)
--- a/drivers/i3c/master/svc-i3c-master.c +++ b/drivers/i3c/master/svc-i3c-master.c @@ -930,6 +930,9 @@ static int svc_i3c_master_xfer(struct sv u32 reg; int ret;
+ /* clean SVC_I3C_MINT_IBIWON w1c bits */ + writel(SVC_I3C_MINT_IBIWON, master->regs + SVC_I3C_MSTATUS); + writel(SVC_I3C_MCTRL_REQUEST_START_ADDR | xfer_type | SVC_I3C_MCTRL_IBIRESP_NACK | @@ -943,6 +946,23 @@ static int svc_i3c_master_xfer(struct sv if (ret) goto emit_stop;
+ /* + * According to I3C spec ver 1.1.1, 5.1.2.2.3 Consequence of Controller Starting a Frame + * with I3C Target Address. + * + * The I3C Controller normally should start a Frame, the Address may be arbitrated, and so + * the Controller shall monitor to see whether an In-Band Interrupt request, a Controller + * Role Request (i.e., Secondary Controller requests to become the Active Controller), or + * a Hot-Join Request has been made. + * + * If missed IBIWON check, the wrong data will be return. When IBIWON happen, return failure + * and yield the above events handler. + */ + if (SVC_I3C_MSTATUS_IBIWON(reg)) { + ret = -ENXIO; + goto emit_stop; + } + if (rnw) ret = svc_i3c_master_read(master, in, xfer_len); else
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Frank Li Frank.Li@nxp.com
commit c85e209b799f12d18a90ae6353b997b1bb1274a5 upstream.
MSTATUS[RXPEND] is only updated after the data transfer cycle started. This creates an issue when the I3C clock is slow, and the CPU is running fast enough that MSTATUS[RXPEND] may not be updated when the code reaches checking point. As a result, mandatory data can be missed.
Add a wait for MSTATUS[COMPLETE] to ensure that all mandatory data is already in FIFO. It also works without mandatory data.
Fixes: dd3c52846d59 ("i3c: master: svc: Add Silvaco I3C master driver") Cc: stable@vger.kernel.org Reviewed-by: Miquel Raynal miquel.raynal@bootlin.com Signed-off-by: Frank Li Frank.Li@nxp.com Link: https://lore.kernel.org/r/20231023161658.3890811-4-Frank.Li@nxp.com Signed-off-by: Alexandre Belloni alexandre.belloni@bootlin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i3c/master/svc-i3c-master.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/drivers/i3c/master/svc-i3c-master.c +++ b/drivers/i3c/master/svc-i3c-master.c @@ -294,6 +294,7 @@ static int svc_i3c_master_handle_ibi(str struct i3c_ibi_slot *slot; unsigned int count; u32 mdatactrl; + int ret, val; u8 *buf;
slot = i3c_generic_ibi_get_free_slot(data->ibi_pool); @@ -303,6 +304,13 @@ static int svc_i3c_master_handle_ibi(str slot->len = 0; buf = slot->data;
+ ret = readl_relaxed_poll_timeout(master->regs + SVC_I3C_MSTATUS, val, + SVC_I3C_MSTATUS_COMPLETE(val), 0, 1000); + if (ret) { + dev_err(master->dev, "Timeout when polling for COMPLETE\n"); + return ret; + } + while (SVC_I3C_MSTATUS_RXPEND(readl(master->regs + SVC_I3C_MSTATUS)) && slot->len < SVC_I3C_FIFO_SIZE) { mdatactrl = readl(master->regs + SVC_I3C_MDATACTRL);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Frank Li Frank.Li@nxp.com
commit 225d5ef048c4ed01a475c95d94833bd7dd61072d upstream.
svc_i3c_master_irq_handler() wrongly checks register SVC_I3C_MINTMASKED. It should be SVC_I3C_MSTATUS.
Fixes: dd3c52846d59 ("i3c: master: svc: Add Silvaco I3C master driver") Cc: stable@vger.kernel.org Reviewed-by: Miquel Raynal miquel.raynal@bootlin.com Signed-off-by: Frank Li Frank.Li@nxp.com Link: https://lore.kernel.org/r/20231023161658.3890811-5-Frank.Li@nxp.com Signed-off-by: Alexandre Belloni alexandre.belloni@bootlin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i3c/master/svc-i3c-master.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/i3c/master/svc-i3c-master.c +++ b/drivers/i3c/master/svc-i3c-master.c @@ -438,7 +438,7 @@ reenable_ibis: static irqreturn_t svc_i3c_master_irq_handler(int irq, void *dev_id) { struct svc_i3c_master *master = (struct svc_i3c_master *)dev_id; - u32 active = readl(master->regs + SVC_I3C_MINTMASKED); + u32 active = readl(master->regs + SVC_I3C_MSTATUS);
if (!SVC_I3C_MSTATUS_SLVSTART(active)) return IRQ_NONE;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Frank Li Frank.Li@nxp.com
commit dfd7cd6aafdb1f5ba93828e97e56b38304b23a05 upstream.
Upon IBIWON timeout, the SDA line will always be kept low if we don't emit a stop. Calling svc_i3c_master_emit_stop() there will let the bus return to idle state.
Fixes: dd3c52846d59 ("i3c: master: svc: Add Silvaco I3C master driver") Cc: stable@vger.kernel.org Reviewed-by: Miquel Raynal miquel.raynal@bootlin.com Signed-off-by: Frank Li Frank.Li@nxp.com Link: https://lore.kernel.org/r/20231023161658.3890811-6-Frank.Li@nxp.com Signed-off-by: Alexandre Belloni alexandre.belloni@bootlin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i3c/master/svc-i3c-master.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/i3c/master/svc-i3c-master.c +++ b/drivers/i3c/master/svc-i3c-master.c @@ -366,6 +366,7 @@ static void svc_i3c_master_ibi_work(stru SVC_I3C_MSTATUS_IBIWON(val), 0, 1000); if (ret) { dev_err(master->dev, "Timeout when polling for IBIWON\n"); + svc_i3c_master_emit_stop(master); goto reenable_ibis; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Helge Deller deller@gmx.de
commit a406b8b424fa01f244c1aab02ba186258448c36b upstream.
Bail out early with error message when trying to boot a 64-bit kernel on 32-bit machines. This fixes the previous commit to include the check for true 64-bit kernels as well.
Signed-off-by: Helge Deller deller@gmx.de Fixes: 591d2108f3abc ("parisc: Add runtime check to prevent PA2.0 kernels on PA1.x machines") Cc: stable@vger.kernel.org # v6.0+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/parisc/kernel/head.S | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
--- a/arch/parisc/kernel/head.S +++ b/arch/parisc/kernel/head.S @@ -69,9 +69,8 @@ $bss_loop: stw,ma %arg2,4(%r1) stw,ma %arg3,4(%r1)
-#if !defined(CONFIG_64BIT) && defined(CONFIG_PA20) - /* This 32-bit kernel was compiled for PA2.0 CPUs. Check current CPU - * and halt kernel if we detect a PA1.x CPU. */ +#if defined(CONFIG_PA20) + /* check for 64-bit capable CPU as required by current kernel */ ldi 32,%r10 mtctl %r10,%cr11 .level 2.0
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Helge Deller deller@gmx.de
commit 166b0110d1ee53290bd11618df6e3991c117495a upstream.
When calculating the pfn for the iitlbt/idtlbt instruction, do not drop the upper 5 address bits. This doesn't seem to have an effect on physical hardware which uses less physical address bits, but in qemu the missing bits are visible.
Signed-off-by: Helge Deller deller@gmx.de Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/parisc/kernel/entry.S | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)
--- a/arch/parisc/kernel/entry.S +++ b/arch/parisc/kernel/entry.S @@ -497,13 +497,13 @@ * to a CPU TLB 4k PFN (4k => 12 bits to shift) */ #define PAGE_ADD_SHIFT (PAGE_SHIFT-12) #define PAGE_ADD_HUGE_SHIFT (REAL_HPAGE_SHIFT-12) + #define PFN_START_BIT (63-ASM_PFN_PTE_SHIFT+(63-58)-PAGE_ADD_SHIFT)
/* Drop prot bits and convert to page addr for iitlbt and idtlbt */ .macro convert_for_tlb_insert20 pte,tmp #ifdef CONFIG_HUGETLB_PAGE copy \pte,\tmp - extrd,u \tmp,(63-ASM_PFN_PTE_SHIFT)+(63-58)+PAGE_ADD_SHIFT,\ - 64-PAGE_SHIFT-PAGE_ADD_SHIFT,\pte + extrd,u \tmp,PFN_START_BIT,PFN_START_BIT+1,\pte
depdi _PAGE_SIZE_ENCODING_DEFAULT,63,\ (63-58)+PAGE_ADD_SHIFT,\pte @@ -511,8 +511,7 @@ depdi _HUGE_PAGE_SIZE_ENCODING_DEFAULT,63,\ (63-58)+PAGE_ADD_HUGE_SHIFT,\pte #else /* Huge pages disabled */ - extrd,u \pte,(63-ASM_PFN_PTE_SHIFT)+(63-58)+PAGE_ADD_SHIFT,\ - 64-PAGE_SHIFT-PAGE_ADD_SHIFT,\pte + extrd,u \pte,PFN_START_BIT,PFN_START_BIT+1,\pte depdi _PAGE_SIZE_ENCODING_DEFAULT,63,\ (63-58)+PAGE_ADD_SHIFT,\pte #endif
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Helge Deller deller@gmx.de
commit 6ad6e15a9c46b8f0932cd99724f26f3db4db1cdf upstream.
Firmware returns the physical address of the power switch, so need to use gsc_writel() instead of direct memory access.
Fixes: d0c219472980 ("parisc/power: Add power soft-off when running on qemu") Signed-off-by: Helge Deller deller@gmx.de Cc: stable@vger.kernel.org # v6.0+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/parisc/power.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/parisc/power.c +++ b/drivers/parisc/power.c @@ -201,7 +201,7 @@ static struct notifier_block parisc_pani static int qemu_power_off(struct sys_off_data *data) { /* this turns the system off via SeaBIOS */ - *(int *)data->cb_data = 0; + gsc_writel(0, (unsigned long) data->cb_data); pdc_soft_power_button(1); return NOTIFY_DONE; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Basavaraj Natikar Basavaraj.Natikar@amd.com
commit a5d6264b638efeca35eff72177fd28d149e0764b upstream.
Use the low-power states of the underlying platform to enable runtime PM. If the platform doesn't support runtime D3, then enabling default RPM will result in the controller malfunctioning, as in the case of hotplug devices not being detected because of a failed interrupt generation.
Cc: Mario Limonciello mario.limonciello@amd.com Signed-off-by: Basavaraj Natikar Basavaraj.Natikar@amd.com Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/20231019102924.2797346-16-mathias.nyman@linux.inte... Cc: Oleksandr Natalenko oleksandr@natalenko.name Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci-pci.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/usb/host/xhci-pci.c +++ b/drivers/usb/host/xhci-pci.c @@ -509,7 +509,9 @@ static int xhci_pci_probe(struct pci_dev /* USB-2 and USB-3 roothubs initialized, allow runtime pm suspend */ pm_runtime_put_noidle(&dev->dev);
- if (xhci->quirks & XHCI_DEFAULT_PM_RUNTIME_ALLOW) + if (pci_choose_state(dev, PMSG_SUSPEND) == PCI_D0) + pm_runtime_forbid(&dev->dev); + else if (xhci->quirks & XHCI_DEFAULT_PM_RUNTIME_ALLOW) pm_runtime_allow(&dev->dev);
dma_set_max_seg_size(&dev->dev, UINT_MAX);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takashi Iwai tiwai@suse.de
commit c7a60651953359f98dbf24b43e1bf561e1573ed4 upstream.
As reported recently, ALSA core info helper may cause a deadlock at the forced device disconnection during the procfs operation.
The proc_remove() (that is called from the snd_card_disconnect() helper) has a synchronization of the pending procfs accesses via wait_for_completion(). Meanwhile, ALSA procfs helper takes the global mutex_lock(&info_mutex) at both the proc_open callback and snd_card_info_disconnect() helper. Since the proc_open can't finish due to the mutex lock, wait_for_completion() never returns, either, hence it deadlocks.
TASK#1 TASK#2 proc_reg_open() takes use_pde() snd_info_text_entry_open() snd_card_disconnect() snd_info_card_disconnect() takes mutex_lock(&info_mutex) proc_remove() wait_for_completion(unused_pde) ... waiting task#1 closes mutex_lock(&info_mutex) => DEADLOCK
This patch is a workaround for avoiding the deadlock scenario above.
The basic strategy is to move proc_remove() call outside the mutex lock. proc_remove() can work gracefully without extra locking, and it can delete the tree recursively alone. So, we call proc_remove() at snd_info_card_disconnection() at first, then delete the rest resources recursively within the info_mutex lock.
After the change, the function snd_info_disconnect() doesn't do disconnection by itself any longer, but it merely clears the procfs pointer. So rename the function to snd_info_clear_entries() for avoiding confusion.
The similar change is applied to snd_info_free_entry(), too. Since the proc_remove() is called only conditionally with the non-NULL entry->p, it's skipped after the snd_info_clear_entries() call.
Reported-by: Shinhyung Kang s47.kang@samsung.com Closes: https://lore.kernel.org/r/664457955.21699345385931.JavaMail.epsvc@epcpadp4 Reviewed-by: Jaroslav Kysela perex@perex.cz Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231109141954.4283-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/core/info.c | 21 +++++++++++++-------- 1 file changed, 13 insertions(+), 8 deletions(-)
--- a/sound/core/info.c +++ b/sound/core/info.c @@ -56,7 +56,7 @@ struct snd_info_private_data { };
static int snd_info_version_init(void); -static void snd_info_disconnect(struct snd_info_entry *entry); +static void snd_info_clear_entries(struct snd_info_entry *entry);
/*
@@ -569,11 +569,16 @@ void snd_info_card_disconnect(struct snd { if (!card) return; - mutex_lock(&info_mutex); + proc_remove(card->proc_root_link); - card->proc_root_link = NULL; if (card->proc_root) - snd_info_disconnect(card->proc_root); + proc_remove(card->proc_root->p); + + mutex_lock(&info_mutex); + if (card->proc_root) + snd_info_clear_entries(card->proc_root); + card->proc_root_link = NULL; + card->proc_root = NULL; mutex_unlock(&info_mutex); }
@@ -745,15 +750,14 @@ struct snd_info_entry *snd_info_create_c } EXPORT_SYMBOL(snd_info_create_card_entry);
-static void snd_info_disconnect(struct snd_info_entry *entry) +static void snd_info_clear_entries(struct snd_info_entry *entry) { struct snd_info_entry *p;
if (!entry->p) return; list_for_each_entry(p, &entry->children, list) - snd_info_disconnect(p); - proc_remove(entry->p); + snd_info_clear_entries(p); entry->p = NULL; }
@@ -770,8 +774,9 @@ void snd_info_free_entry(struct snd_info if (!entry) return; if (entry->p) { + proc_remove(entry->p); mutex_lock(&info_mutex); - snd_info_disconnect(entry); + snd_info_clear_entries(entry); mutex_unlock(&info_mutex); }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kailang Yang kailang@realtek.com
commit 4b21a669ca21ed8f24ef4530b2918be5730114de upstream.
Add ALC295 to pin fall back table. Remove 5 pin quirks for Dell ALC295. ALC295 was only support MIC2 for external MIC function. ALC295 assigned model "ALC269_FIXUP_DELL1_MIC_NO_PRESENCE" for pin fall back table. It was assigned wrong model. So, let's remove it.
Fixes: fbc571290d9f ("ALSA: hda/realtek - Fixed Headphone Mic can't record on Dell platform") Signed-off-by: Kailang Yang kailang@realtek.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/7c1998e873834df98d59bd7e0d08c72e@realtek.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 19 +++---------------- 1 file changed, 3 insertions(+), 16 deletions(-)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -9963,22 +9963,6 @@ static const struct snd_hda_pin_quirk al {0x12, 0x90a60130}, {0x17, 0x90170110}, {0x21, 0x03211020}), - SND_HDA_PIN_QUIRK(0x10ec0295, 0x1028, "Dell", ALC269_FIXUP_DELL4_MIC_NO_PRESENCE, - {0x14, 0x90170110}, - {0x21, 0x04211020}), - SND_HDA_PIN_QUIRK(0x10ec0295, 0x1028, "Dell", ALC269_FIXUP_DELL4_MIC_NO_PRESENCE, - {0x14, 0x90170110}, - {0x21, 0x04211030}), - SND_HDA_PIN_QUIRK(0x10ec0295, 0x1028, "Dell", ALC269_FIXUP_DELL1_MIC_NO_PRESENCE, - ALC295_STANDARD_PINS, - {0x17, 0x21014020}, - {0x18, 0x21a19030}), - SND_HDA_PIN_QUIRK(0x10ec0295, 0x1028, "Dell", ALC269_FIXUP_DELL1_MIC_NO_PRESENCE, - ALC295_STANDARD_PINS, - {0x17, 0x21014040}, - {0x18, 0x21a19050}), - SND_HDA_PIN_QUIRK(0x10ec0295, 0x1028, "Dell", ALC269_FIXUP_DELL1_MIC_NO_PRESENCE, - ALC295_STANDARD_PINS), SND_HDA_PIN_QUIRK(0x10ec0298, 0x1028, "Dell", ALC298_FIXUP_DELL1_MIC_NO_PRESENCE, ALC298_STANDARD_PINS, {0x17, 0x90170110}), @@ -10022,6 +10006,9 @@ static const struct snd_hda_pin_quirk al SND_HDA_PIN_QUIRK(0x10ec0289, 0x1028, "Dell", ALC269_FIXUP_DELL4_MIC_NO_PRESENCE, {0x19, 0x40000000}, {0x1b, 0x40000000}), + SND_HDA_PIN_QUIRK(0x10ec0295, 0x1028, "Dell", ALC269_FIXUP_DELL4_MIC_NO_PRESENCE, + {0x19, 0x40000000}, + {0x1b, 0x40000000}), SND_HDA_PIN_QUIRK(0x10ec0256, 0x1028, "Dell", ALC255_FIXUP_DELL1_MIC_NO_PRESENCE, {0x19, 0x40000000}, {0x1a, 0x40000000}),
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chandradeep Dey codesigning@chandradeepdey.com
commit 713f040cd22285fcc506f40a0d259566e6758c3c upstream.
Apply the already existing quirk chain ALC294_FIXUP_ASUS_SPK to enable the internal speaker of ASUS K6500ZC.
Signed-off-by: Chandradeep Dey codesigning@chandradeepdey.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/NizcVHQ--3-9@chandradeepdey.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 1 + 1 file changed, 1 insertion(+)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -9113,6 +9113,7 @@ static const struct snd_pci_quirk alc269 SND_PCI_QUIRK(0x1043, 0x10a1, "ASUS UX391UA", ALC294_FIXUP_ASUS_SPK), SND_PCI_QUIRK(0x1043, 0x10c0, "ASUS X540SA", ALC256_FIXUP_ASUS_MIC), SND_PCI_QUIRK(0x1043, 0x10d0, "ASUS X540LA/X540LJ", ALC255_FIXUP_ASUS_MIC_NO_PRESENCE), + SND_PCI_QUIRK(0x1043, 0x10d3, "ASUS K6500ZC", ALC294_FIXUP_ASUS_SPK), SND_PCI_QUIRK(0x1043, 0x115d, "Asus 1015E", ALC269_FIXUP_LIMIT_INT_MIC_BOOST), SND_PCI_QUIRK(0x1043, 0x11c0, "ASUS X556UR", ALC255_FIXUP_ASUS_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1043, 0x125e, "ASUS Q524UQK", ALC255_FIXUP_ASUS_MIC_NO_PRESENCE),
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lad Prabhakar prabhakar.mahadev-lad.rj@bp.renesas.com
[ Upstream commit 5b68061983471470d4109bac776145245f06bc09 ]
platform_get_resource(pdev, IORESOURCE_IRQ, ..) relies on static allocation of IRQ resources in DT core code, this causes an issue when using hierarchical interrupt domains using "interrupts" property in the node as this bypasses the hierarchical setup and messes up the irq chaining.
In preparation for removal of static setup of IRQ resource from DT core code use platform_get_irq().
Signed-off-by: Lad Prabhakar prabhakar.mahadev-lad.rj@bp.renesas.com Link: https://lore.kernel.org/r/20211224142917.6966-5-prabhakar.mahadev-lad.rj@bp.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Stable-dep-of: 2a1d728f20ed ("tty: serial: meson: fix hard LOCKUP on crtscts mode") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/tty/serial/meson_uart.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/tty/serial/meson_uart.c b/drivers/tty/serial/meson_uart.c index 62e6c1af13445..b6e8db0ddf065 100644 --- a/drivers/tty/serial/meson_uart.c +++ b/drivers/tty/serial/meson_uart.c @@ -726,10 +726,11 @@ static int meson_uart_probe_clocks(struct platform_device *pdev,
static int meson_uart_probe(struct platform_device *pdev) { - struct resource *res_mem, *res_irq; + struct resource *res_mem; struct uart_port *port; u32 fifosize = 64; /* Default is 64, 128 for EE UART_0 */ int ret = 0; + int irq;
if (pdev->dev.of_node) pdev->id = of_alias_get_id(pdev->dev.of_node, "serial"); @@ -752,9 +753,9 @@ static int meson_uart_probe(struct platform_device *pdev) if (!res_mem) return -ENODEV;
- res_irq = platform_get_resource(pdev, IORESOURCE_IRQ, 0); - if (!res_irq) - return -ENODEV; + irq = platform_get_irq(pdev, 0); + if (irq < 0) + return irq;
of_property_read_u32(pdev->dev.of_node, "fifo-size", &fifosize);
@@ -779,7 +780,7 @@ static int meson_uart_probe(struct platform_device *pdev) port->iotype = UPIO_MEM; port->mapbase = res_mem->start; port->mapsize = resource_size(res_mem); - port->irq = res_irq->start; + port->irq = irq; port->flags = UPF_BOOT_AUTOCONF | UPF_LOW_LATENCY; port->has_sysrq = IS_ENABLED(CONFIG_SERIAL_MESON_CONSOLE); port->dev = &pdev->dev;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pavel Krasavin pkrasavin@imaqliq.com
[ Upstream commit 2a1d728f20edeee7f26dc307ed9df4e0d23947ab ]
There might be hard lockup if we set crtscts mode on port without RTS/CTS configured:
# stty -F /dev/ttyAML6 crtscts; echo 1 > /dev/ttyAML6; echo 2 > /dev/ttyAML6 [ 95.890386] rcu: INFO: rcu_preempt detected stalls on CPUs/tasks: [ 95.890857] rcu: 3-...0: (201 ticks this GP) idle=e33c/1/0x4000000000000000 softirq=5844/5846 fqs=4984 [ 95.900212] rcu: (detected by 2, t=21016 jiffies, g=7753, q=296 ncpus=4) [ 95.906972] Task dump for CPU 3: [ 95.910178] task:bash state:R running task stack:0 pid:205 ppid:1 flags:0x00000202 [ 95.920059] Call trace: [ 95.922485] __switch_to+0xe4/0x168 [ 95.925951] 0xffffff8003477508 [ 95.974379] watchdog: Watchdog detected hard LOCKUP on cpu 3 [ 95.974424] Modules linked in: 88x2cs(O) rtc_meson_vrtc
Possible solution would be to not allow to setup crtscts on such port.
Tested on S905X3 based board.
Fixes: ff7693d079e5 ("ARM: meson: serial: add MesonX SoC on-chip uart driver") Cc: stable@vger.kernel.org Signed-off-by: Pavel Krasavin pkrasavin@imaqliq.com Reviewed-by: Neil Armstrong neil.armstrong@linaro.org Reviewed-by: Dmitry Rokosov ddrokosov@salutedevices.com
v6: stable tag added v5: https://lore.kernel.org/lkml/OF43DA36FF.2BD3BB21-ON00258A47.005A8125-00258A4... added missed Reviewed-by tags, Fixes tag added according to Dmitry and Neil notes v4: https://lore.kernel.org/lkml/OF55521400.7512350F-ON00258A47.003F7254-00258A4... More correct patch subject according to Jiri's note v3: https://lore.kernel.org/lkml/OF6CF5FFA0.CCFD0E8E-ON00258A46.00549EDF-00258A4... "From:" line added to the mail v2: https://lore.kernel.org/lkml/OF950BEF72.7F425944-ON00258A46.00488A76-00258A4... braces for single statement removed according to Dmitry's note v1: https://lore.kernel.org/lkml/OF28B2B8C9.5BC0CD28-ON00258A46.0037688F-00258A4... Link: https://lore.kernel.org/r/OF66360032.51C36182-ON00258A48.003F656B-00258A48.0...
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/tty/serial/meson_uart.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/drivers/tty/serial/meson_uart.c b/drivers/tty/serial/meson_uart.c index b6e8db0ddf065..7e653d681ac01 100644 --- a/drivers/tty/serial/meson_uart.c +++ b/drivers/tty/serial/meson_uart.c @@ -368,10 +368,14 @@ static void meson_uart_set_termios(struct uart_port *port, else val |= AML_UART_STOP_BIT_1SB;
- if (cflags & CRTSCTS) - val &= ~AML_UART_TWO_WIRE_EN; - else + if (cflags & CRTSCTS) { + if (port->flags & UPF_HARD_FLOW) + val &= ~AML_UART_TWO_WIRE_EN; + else + termios->c_cflag &= ~CRTSCTS; + } else { val |= AML_UART_TWO_WIRE_EN; + }
writel(val, port->membase + AML_UART_CONTROL);
@@ -731,6 +735,7 @@ static int meson_uart_probe(struct platform_device *pdev) u32 fifosize = 64; /* Default is 64, 128 for EE UART_0 */ int ret = 0; int irq; + bool has_rtscts;
if (pdev->dev.of_node) pdev->id = of_alias_get_id(pdev->dev.of_node, "serial"); @@ -758,6 +763,7 @@ static int meson_uart_probe(struct platform_device *pdev) return irq;
of_property_read_u32(pdev->dev.of_node, "fifo-size", &fifosize); + has_rtscts = of_property_read_bool(pdev->dev.of_node, "uart-has-rtscts");
if (meson_ports[pdev->id]) { dev_err(&pdev->dev, "port %d already allocated\n", pdev->id); @@ -782,6 +788,8 @@ static int meson_uart_probe(struct platform_device *pdev) port->mapsize = resource_size(res_mem); port->irq = irq; port->flags = UPF_BOOT_AUTOCONF | UPF_LOW_LATENCY; + if (has_rtscts) + port->flags |= UPF_HARD_FLOW; port->has_sysrq = IS_ENABLED(CONFIG_SERIAL_MESON_CONSOLE); port->dev = &pdev->dev; port->line = pdev->id;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mark Brown broonie@kernel.org
[ Upstream commit 0ec7731655de196bc1e4af99e495b38778109d22 ]
When we sync the register cache we do so with the cache bypassed in order to avoid overhead from writing the synced values back into the cache. If the regmap has ranges and the selector register for those ranges is in a register which is cached this has the unfortunate side effect of meaning that the physical and cached copies of the selector register can be out of sync after a cache sync. The cache will have whatever the selector was when the sync started and the hardware will have the selector for the register that was synced last.
Fix this by rewriting all cached selector registers after every sync, ensuring that the hardware and cache have the same content. This will result in extra writes that wouldn't otherwise be needed but is simple so hopefully robust. We don't read from the hardware since not all devices have physical read support.
Given that nobody noticed this until now it is likely that we are rarely if ever hitting this case.
Reported-by: Hector Martin marcan@marcan.st Cc: stable@vger.kernel.org Signed-off-by: Mark Brown broonie@kernel.org Link: https://lore.kernel.org/r/20231026-regmap-fix-selector-sync-v1-1-633ded82770... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/base/regmap/regcache.c | 30 ++++++++++++++++++++++++++++++ 1 file changed, 30 insertions(+)
diff --git a/drivers/base/regmap/regcache.c b/drivers/base/regmap/regcache.c index 0b517a83c4493..b04e8c90aca20 100644 --- a/drivers/base/regmap/regcache.c +++ b/drivers/base/regmap/regcache.c @@ -325,6 +325,11 @@ static int regcache_default_sync(struct regmap *map, unsigned int min, return 0; }
+static int rbtree_all(const void *key, const struct rb_node *node) +{ + return 0; +} + /** * regcache_sync - Sync the register cache with the hardware. * @@ -342,6 +347,7 @@ int regcache_sync(struct regmap *map) unsigned int i; const char *name; bool bypass; + struct rb_node *node;
if (WARN_ON(map->cache_type == REGCACHE_NONE)) return -EINVAL; @@ -386,6 +392,30 @@ int regcache_sync(struct regmap *map) map->async = false; map->cache_bypass = bypass; map->no_sync_defaults = false; + + /* + * If we did any paging with cache bypassed and a cached + * paging register then the register and cache state might + * have gone out of sync, force writes of all the paging + * registers. + */ + rb_for_each(node, 0, &map->range_tree, rbtree_all) { + struct regmap_range_node *this = + rb_entry(node, struct regmap_range_node, node); + + /* If there's nothing in the cache there's nothing to sync */ + ret = regcache_read(map, this->selector_reg, &i); + if (ret != 0) + continue; + + ret = _regmap_write(map, this->selector_reg, i); + if (ret != 0) { + dev_err(map->dev, "Failed to write %x = %x: %d\n", + this->selector_reg, i, ret); + break; + } + } + map->unlock(map->lock_arg);
regmap_async_complete(map);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christian Marangi ansuelsmth@gmail.com
[ Upstream commit ea167a7fc2426f7685c3735e104921c1a20a6d3f ]
Commit 3c0897c180c6 ("cpufreq: Use scnprintf() for avoiding potential buffer overflow") switched from snprintf to the more secure scnprintf but never updated the exit condition for PAGE_SIZE.
As the commit say and as scnprintf document, what scnprintf returns what is actually written not counting the '\0' end char. This results in the case of len exceeding the size, len set to PAGE_SIZE - 1, as it can be written at max PAGE_SIZE - 1 (as '\0' is not counted)
Because of len is never set to PAGE_SIZE, the function never break early, never prints the warning and never return -EFBIG.
Fix this by changing the condition to PAGE_SIZE - 1 to correctly trigger the error.
Cc: 5.10+ stable@vger.kernel.org # 5.10+ Fixes: 3c0897c180c6 ("cpufreq: Use scnprintf() for avoiding potential buffer overflow") Signed-off-by: Christian Marangi ansuelsmth@gmail.com [ rjw: Subject and changelog edits ] Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/cpufreq/cpufreq_stats.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/cpufreq/cpufreq_stats.c b/drivers/cpufreq/cpufreq_stats.c index 1570d6f3e75d3..6e57df7a2249f 100644 --- a/drivers/cpufreq/cpufreq_stats.c +++ b/drivers/cpufreq/cpufreq_stats.c @@ -131,25 +131,25 @@ static ssize_t show_trans_table(struct cpufreq_policy *policy, char *buf) len += scnprintf(buf + len, PAGE_SIZE - len, " From : To\n"); len += scnprintf(buf + len, PAGE_SIZE - len, " : "); for (i = 0; i < stats->state_num; i++) { - if (len >= PAGE_SIZE) + if (len >= PAGE_SIZE - 1) break; len += scnprintf(buf + len, PAGE_SIZE - len, "%9u ", stats->freq_table[i]); } - if (len >= PAGE_SIZE) - return PAGE_SIZE; + if (len >= PAGE_SIZE - 1) + return PAGE_SIZE - 1;
len += scnprintf(buf + len, PAGE_SIZE - len, "\n");
for (i = 0; i < stats->state_num; i++) { - if (len >= PAGE_SIZE) + if (len >= PAGE_SIZE - 1) break;
len += scnprintf(buf + len, PAGE_SIZE - len, "%9u: ", stats->freq_table[i]);
for (j = 0; j < stats->state_num; j++) { - if (len >= PAGE_SIZE) + if (len >= PAGE_SIZE - 1) break;
if (pending) @@ -159,12 +159,12 @@ static ssize_t show_trans_table(struct cpufreq_policy *policy, char *buf)
len += scnprintf(buf + len, PAGE_SIZE - len, "%9u ", count); } - if (len >= PAGE_SIZE) + if (len >= PAGE_SIZE - 1) break; len += scnprintf(buf + len, PAGE_SIZE - len, "\n"); }
- if (len >= PAGE_SIZE) { + if (len >= PAGE_SIZE - 1) { pr_warn_once("cpufreq transition table exceeds PAGE_SIZE. Disabling\n"); return -EFBIG; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Artem Lukyanov dukzcry@ya.ru
[ Upstream commit 393b4916b7b5b94faf5c6a7c68df1c62d17e4f38 ]
Add the support ID(0x0cb8, 0xc559) to usb_device_id table for Realtek RTL8852BE.
The device info from /sys/kernel/debug/usb/devices as below.
T: Bus=03 Lev=01 Prnt=01 Port=02 Cnt=01 Dev#= 2 Spd=12 MxCh= 0 D: Ver= 1.00 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=0cb8 ProdID=c559 Rev= 0.00 S: Manufacturer=Realtek S: Product=Bluetooth Radio S: SerialNumber=00e04c000001 C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=500mA I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=1ms E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms E: Ad=82(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms
Signed-off-by: Artem Lukyanov dukzcry@ya.ru Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Stable-dep-of: da06ff1f585e ("Bluetooth: btusb: Add 0bda:b85b for Fn-Link RTL8852BE") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bluetooth/btusb.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c index 91a08892df223..c1ce5592921af 100644 --- a/drivers/bluetooth/btusb.c +++ b/drivers/bluetooth/btusb.c @@ -436,6 +436,10 @@ static const struct usb_device_id blacklist_table[] = { { USB_DEVICE(0x13d3, 0x3586), .driver_info = BTUSB_REALTEK | BTUSB_WIDEBAND_SPEECH },
+ /* Realtek 8852BE Bluetooth devices */ + { USB_DEVICE(0x0cb8, 0xc559), .driver_info = BTUSB_REALTEK | + BTUSB_WIDEBAND_SPEECH }, + /* Realtek Bluetooth devices */ { USB_VENDOR_AND_INTERFACE_INFO(0x0bda, 0xe0, 0x01, 0x01), .driver_info = BTUSB_REALTEK },
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Larry Finger Larry.Finger@lwfinger.net
[ Upstream commit 730a1d1a93a3e30c3723f87af97a8517334b2203 ]
This device is part of a Realtek RTW8852BE chip.
The device table entry is as follows:
T: Bus=03 Lev=01 Prnt=01 Port=12 Cnt=02 Dev#= 3 Spd=12 MxCh= 0 D: Ver= 1.00 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=0bda ProdID=887b Rev= 0.00 S: Manufacturer=Realtek S: Product=Bluetooth Radio S: SerialNumber=00e04c000001 C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=500mA I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=1ms E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms E: Ad=82(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms
Signed-off-by: Larry Finger Larry.Finger@lwfinger.net Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Stable-dep-of: da06ff1f585e ("Bluetooth: btusb: Add 0bda:b85b for Fn-Link RTL8852BE") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bluetooth/btusb.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c index c1ce5592921af..5b905f1501578 100644 --- a/drivers/bluetooth/btusb.c +++ b/drivers/bluetooth/btusb.c @@ -439,6 +439,8 @@ static const struct usb_device_id blacklist_table[] = { /* Realtek 8852BE Bluetooth devices */ { USB_DEVICE(0x0cb8, 0xc559), .driver_info = BTUSB_REALTEK | BTUSB_WIDEBAND_SPEECH }, + { USB_DEVICE(0x0bda, 0x887b), .driver_info = BTUSB_REALTEK | + BTUSB_WIDEBAND_SPEECH },
/* Realtek Bluetooth devices */ { USB_VENDOR_AND_INTERFACE_INFO(0x0bda, 0xe0, 0x01, 0x01),
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Larry Finger Larry.Finger@lwfinger.net
[ Upstream commit 069f534247bb6db4f8c2c2ea8e9155abf495c37e ]
This device is part of a Realtek RTW8852BE chip. The device table is as follows:
T: Bus=03 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#= 2 Spd=12 MxCh= 0 D: Ver= 1.00 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=13d3 ProdID=3571 Rev= 0.00 S: Manufacturer=Realtek S: Product=Bluetooth Radio S: SerialNumber=00e04c000001 C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=500mA I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=1ms E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms E: Ad=82(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms
Signed-off-by: Larry Finger Larry.Finger@lwfinger.net Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Stable-dep-of: da06ff1f585e ("Bluetooth: btusb: Add 0bda:b85b for Fn-Link RTL8852BE") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bluetooth/btusb.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c index 5b905f1501578..363642eda5323 100644 --- a/drivers/bluetooth/btusb.c +++ b/drivers/bluetooth/btusb.c @@ -441,6 +441,8 @@ static const struct usb_device_id blacklist_table[] = { BTUSB_WIDEBAND_SPEECH }, { USB_DEVICE(0x0bda, 0x887b), .driver_info = BTUSB_REALTEK | BTUSB_WIDEBAND_SPEECH }, + { USB_DEVICE(0x13d3, 0x3571), .driver_info = BTUSB_REALTEK | + BTUSB_WIDEBAND_SPEECH },
/* Realtek Bluetooth devices */ { USB_VENDOR_AND_INTERFACE_INFO(0x0bda, 0xe0, 0x01, 0x01),
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Masum Reza masumrezarock100@gmail.com
[ Upstream commit 02be109d3a405dbc4d53fb4b4473d7a113548088 ]
This device is used in TP-Link TX20E WiFi+Bluetooth adapter.
Relevant information in /sys/kernel/debug/usb/devices about the Bluetooth device is listed as the below.
T: Bus=01 Lev=01 Prnt=01 Port=08 Cnt=01 Dev#= 2 Spd=12 MxCh= 0 D: Ver= 1.00 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=13d3 ProdID=3570 Rev= 0.00 S: Manufacturer=Realtek S: Product=Bluetooth Radio S: SerialNumber=00e04c000001 C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=500mA I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=1ms E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms E: Ad=82(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms
Signed-off-by: Masum Reza masumrezarock100@gmail.com Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Stable-dep-of: da06ff1f585e ("Bluetooth: btusb: Add 0bda:b85b for Fn-Link RTL8852BE") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bluetooth/btusb.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c index 363642eda5323..ea0ab93097905 100644 --- a/drivers/bluetooth/btusb.c +++ b/drivers/bluetooth/btusb.c @@ -441,6 +441,8 @@ static const struct usb_device_id blacklist_table[] = { BTUSB_WIDEBAND_SPEECH }, { USB_DEVICE(0x0bda, 0x887b), .driver_info = BTUSB_REALTEK | BTUSB_WIDEBAND_SPEECH }, + { USB_DEVICE(0x13d3, 0x3570), .driver_info = BTUSB_REALTEK | + BTUSB_WIDEBAND_SPEECH }, { USB_DEVICE(0x13d3, 0x3571), .driver_info = BTUSB_REALTEK | BTUSB_WIDEBAND_SPEECH },
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guan Wentao guanwentao@uniontech.com
[ Upstream commit da06ff1f585ea784c79f80e7fab0e0c4ebb49c1c ]
Add PID/VID 0bda:b85b for Realtek RTL8852BE USB bluetooth part. The PID/VID was reported by the patch last year. [1] Some SBCs like rockpi 5B A8 module contains the device. And it`s founded in website. [2] [3]
Here is the device tables in /sys/kernel/debug/usb/devices .
T: Bus=07 Lev=01 Prnt=01 Port=01 Cnt=01 Dev#= 2 Spd=12 MxCh= 0 D: Ver= 1.00 Cls=e0(wlcon) Sub=01 Prot=01 MxPS=64 #Cfgs= 1 P: Vendor=0bda ProdID=b85b Rev= 0.00 S: Manufacturer=Realtek S: Product=Bluetooth Radio S: SerialNumber=00e04c000001 C:* #Ifs= 2 Cfg#= 1 Atr=e0 MxPwr=500mA I:* If#= 0 Alt= 0 #EPs= 3 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=81(I) Atr=03(Int.) MxPS= 16 Ivl=1ms E: Ad=02(O) Atr=02(Bulk) MxPS= 64 Ivl=0ms E: Ad=82(I) Atr=02(Bulk) MxPS= 64 Ivl=0ms I:* If#= 1 Alt= 0 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 0 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 0 Ivl=1ms I: If#= 1 Alt= 1 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 9 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 9 Ivl=1ms I: If#= 1 Alt= 2 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 17 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 17 Ivl=1ms I: If#= 1 Alt= 3 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 25 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 25 Ivl=1ms I: If#= 1 Alt= 4 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 33 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 33 Ivl=1ms I: If#= 1 Alt= 5 #EPs= 2 Cls=e0(wlcon) Sub=01 Prot=01 Driver=btusb E: Ad=03(O) Atr=01(Isoc) MxPS= 49 Ivl=1ms E: Ad=83(I) Atr=01(Isoc) MxPS= 49 Ivl=1ms
Link: https://lore.kernel.org/all/20220420052402.19049-1-tangmeng@uniontech.com/ [1] Link: https://forum.radxa.com/t/bluetooth-on-ubuntu/13051/4 [2] Link: https://ubuntuforums.org/showthread.php?t=2489527 [3]
Cc: stable@vger.kernel.org Signed-off-by: Meng Tang tangmeng@uniontech.com Signed-off-by: Guan Wentao guanwentao@uniontech.com Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bluetooth/btusb.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/bluetooth/btusb.c b/drivers/bluetooth/btusb.c index ea0ab93097905..a862f859f7a50 100644 --- a/drivers/bluetooth/btusb.c +++ b/drivers/bluetooth/btusb.c @@ -441,6 +441,8 @@ static const struct usb_device_id blacklist_table[] = { BTUSB_WIDEBAND_SPEECH }, { USB_DEVICE(0x0bda, 0x887b), .driver_info = BTUSB_REALTEK | BTUSB_WIDEBAND_SPEECH }, + { USB_DEVICE(0x0bda, 0xb85b), .driver_info = BTUSB_REALTEK | + BTUSB_WIDEBAND_SPEECH }, { USB_DEVICE(0x13d3, 0x3570), .driver_info = BTUSB_REALTEK | BTUSB_WIDEBAND_SPEECH }, { USB_DEVICE(0x13d3, 0x3571), .driver_info = BTUSB_REALTEK |
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Namjae Jeon linkinjeon@kernel.org
[ Upstream commit eebff19acaa35820cb09ce2ccb3d21bee2156ffb ]
slab out-of-bounds write is caused by that offsets is bigger than pntsd allocation size. This patch add the check to validate 3 offsets using allocation size.
Reported-by: zdi-disclosures@trendmicro.com # ZDI-CAN-22271 Cc: stable@vger.kernel.org Signed-off-by: Namjae Jeon linkinjeon@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ksmbd/smbacl.c | 29 ++++++++++++++++++++++++++--- 1 file changed, 26 insertions(+), 3 deletions(-)
diff --git a/fs/ksmbd/smbacl.c b/fs/ksmbd/smbacl.c index 3781bca2c8fc4..83f805248a814 100644 --- a/fs/ksmbd/smbacl.c +++ b/fs/ksmbd/smbacl.c @@ -1105,6 +1105,7 @@ int smb_inherit_dacl(struct ksmbd_conn *conn, struct smb_acl *pdacl; struct smb_sid *powner_sid = NULL, *pgroup_sid = NULL; int powner_sid_size = 0, pgroup_sid_size = 0, pntsd_size; + int pntsd_alloc_size;
if (parent_pntsd->osidoffset) { powner_sid = (struct smb_sid *)((char *)parent_pntsd + @@ -1117,9 +1118,10 @@ int smb_inherit_dacl(struct ksmbd_conn *conn, pgroup_sid_size = 1 + 1 + 6 + (pgroup_sid->num_subauth * 4); }
- pntsd = kzalloc(sizeof(struct smb_ntsd) + powner_sid_size + - pgroup_sid_size + sizeof(struct smb_acl) + - nt_size, GFP_KERNEL); + pntsd_alloc_size = sizeof(struct smb_ntsd) + powner_sid_size + + pgroup_sid_size + sizeof(struct smb_acl) + nt_size; + + pntsd = kzalloc(pntsd_alloc_size, GFP_KERNEL); if (!pntsd) { rc = -ENOMEM; goto free_aces_base; @@ -1134,6 +1136,27 @@ int smb_inherit_dacl(struct ksmbd_conn *conn, pntsd->gsidoffset = parent_pntsd->gsidoffset; pntsd->dacloffset = parent_pntsd->dacloffset;
+ if ((u64)le32_to_cpu(pntsd->osidoffset) + powner_sid_size > + pntsd_alloc_size) { + rc = -EINVAL; + kfree(pntsd); + goto free_aces_base; + } + + if ((u64)le32_to_cpu(pntsd->gsidoffset) + pgroup_sid_size > + pntsd_alloc_size) { + rc = -EINVAL; + kfree(pntsd); + goto free_aces_base; + } + + if ((u64)le32_to_cpu(pntsd->dacloffset) + sizeof(struct smb_acl) + nt_size > + pntsd_alloc_size) { + rc = -EINVAL; + kfree(pntsd); + goto free_aces_base; + } + if (pntsd->osidoffset) { struct smb_sid *owner_sid = (struct smb_sid *)((char *)pntsd + le32_to_cpu(pntsd->osidoffset));
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org
[ Upstream commit f5e303aefc06b7508d7a490f9a2d80e4dc134c70 ]
The TCSR mutex bindings allow device to be described only with address space (so it uses MMIO, not syscon regmap). This seems reasonable as TCSR mutex is actually a dedicated IO address space and it also fixes DT schema checks:
qcom/ipq6018-cp01-c1.dtb: hwlock: 'reg' is a required property qcom/ipq6018-cp01-c1.dtb: hwlock: 'syscon' does not match any of the regexes: 'pinctrl-[0-9]+'
Signed-off-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Signed-off-by: Bjorn Andersson andersson@kernel.org Link: https://lore.kernel.org/r/20220909092035.223915-12-krzysztof.kozlowski@linar... Stable-dep-of: 72fc3d58b87b ("arm64: dts: qcom: ipq6018: Fix tcsr_mutex register size") Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/qcom/ipq6018.dtsi | 13 ++++--------- 1 file changed, 4 insertions(+), 9 deletions(-)
diff --git a/arch/arm64/boot/dts/qcom/ipq6018.dtsi b/arch/arm64/boot/dts/qcom/ipq6018.dtsi index 13651a5ac5a69..43339557d7e5a 100644 --- a/arch/arm64/boot/dts/qcom/ipq6018.dtsi +++ b/arch/arm64/boot/dts/qcom/ipq6018.dtsi @@ -129,12 +129,6 @@ }; };
- tcsr_mutex: hwlock { - compatible = "qcom,tcsr-mutex"; - syscon = <&tcsr_mutex_regs 0 0x80>; - #hwlock-cells = <1>; - }; - pmuv8: pmu { compatible = "arm,cortex-a53-pmu"; interrupts = <GIC_PPI 7 (GIC_CPU_MASK_SIMPLE(4) | @@ -253,9 +247,10 @@ #reset-cells = <1>; };
- tcsr_mutex_regs: syscon@1905000 { - compatible = "syscon"; - reg = <0x0 0x01905000 0x0 0x8000>; + tcsr_mutex: hwlock@1905000 { + compatible = "qcom,ipq6018-tcsr-mutex", "qcom,tcsr-mutex"; + reg = <0x0 0x01905000 0x0 0x1000>; + #hwlock-cells = <1>; };
tcsr: syscon@1937000 {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vignesh Viswanathan quic_viswanat@quicinc.com
[ Upstream commit 72fc3d58b87b0d622039c6299b89024fbb7b420f ]
IPQ6018's TCSR Mutex HW lock register has 32 locks of size 4KB each. Total size of the TCSR Mutex registers is 128KB.
Fix size of the tcsr_mutex hwlock register to 0x20000.
Changes in v2: - Drop change to remove qcom,ipq6018-tcsr-mutex compatible string - Added Fixes and stable tags
Cc: stable@vger.kernel.org Fixes: 5bf635621245 ("arm64: dts: ipq6018: Add a few device nodes") Signed-off-by: Vignesh Viswanathan quic_viswanat@quicinc.com Reviewed-by: Konrad Dybcio konrad.dybcio@linaro.org Link: https://lore.kernel.org/r/20230905095535.1263113-2-quic_viswanat@quicinc.com Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/qcom/ipq6018.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/boot/dts/qcom/ipq6018.dtsi b/arch/arm64/boot/dts/qcom/ipq6018.dtsi index 43339557d7e5a..dde6fde10f8d3 100644 --- a/arch/arm64/boot/dts/qcom/ipq6018.dtsi +++ b/arch/arm64/boot/dts/qcom/ipq6018.dtsi @@ -249,7 +249,7 @@
tcsr_mutex: hwlock@1905000 { compatible = "qcom,ipq6018-tcsr-mutex", "qcom,tcsr-mutex"; - reg = <0x0 0x01905000 0x0 0x1000>; + reg = <0x0 0x01905000 0x0 0x20000>; #hwlock-cells = <1>; };
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexey Kardashevskiy aik@ozlabs.ru
[ Upstream commit fb4ee2b30cd09e95524640149e4ee0d7f22c3e7b ]
This drops rather useless ddw_enabled flag as direct_mapping implies it anyway.
While at this, fix indents in enable_ddw().
This should not cause any behavioral change.
Signed-off-by: Alexey Kardashevskiy aik@ozlabs.ru Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20211108040320.3857636-3-aik@ozlabs.ru Stable-dep-of: 3bf983e4e93c ("powerpc/pseries/iommu: enable_ddw incorrectly returns direct mapping for SR-IOV device") Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/platforms/pseries/iommu.c | 11 ++++------- 1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c index ec5d84b4958c5..aa5f8074e9b10 100644 --- a/arch/powerpc/platforms/pseries/iommu.c +++ b/arch/powerpc/platforms/pseries/iommu.c @@ -1241,7 +1241,6 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn) u32 ddw_avail[DDW_APPLICABLE_SIZE]; struct dma_win *window; struct property *win64; - bool ddw_enabled = false; struct failed_ddw_pdn *fpdn; bool default_win_removed = false, direct_mapping = false; bool pmem_present; @@ -1256,7 +1255,6 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn)
if (find_existing_ddw(pdn, &dev->dev.archdata.dma_offset, &len)) { direct_mapping = (len >= max_ram_len); - ddw_enabled = true; goto out_unlock; }
@@ -1411,8 +1409,8 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn) dev_info(&dev->dev, "failed to map DMA window for %pOF: %d\n", dn, ret);
- /* Make sure to clean DDW if any TCE was set*/ - clean_dma_window(pdn, win64->value); + /* Make sure to clean DDW if any TCE was set*/ + clean_dma_window(pdn, win64->value); goto out_del_list; } } else { @@ -1459,7 +1457,6 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn) spin_unlock(&dma_win_list_lock);
dev->dev.archdata.dma_offset = win_addr; - ddw_enabled = true; goto out_unlock;
out_del_list: @@ -1495,10 +1492,10 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn) * as RAM, then we failed to create a window to cover persistent * memory and need to set the DMA limit. */ - if (pmem_present && ddw_enabled && direct_mapping && len == max_ram_len) + if (pmem_present && direct_mapping && len == max_ram_len) dev->dev.bus_dma_limit = dev->dev.archdata.dma_offset + (1ULL << len);
- return ddw_enabled && direct_mapping; + return direct_mapping; }
static void pci_dma_dev_setup_pSeriesLP(struct pci_dev *dev)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gaurav Batra gbatra@linux.vnet.ibm.com
[ Upstream commit 3bf983e4e93ce8e6d69e9d63f52a66ec0856672e ]
When a device is initialized, the driver invokes dma_supported() twice - first for streaming mappings followed by coherent mappings. For an SR-IOV device, default window is deleted and DDW created. With vPMEM enabled, TCE mappings are dynamically created for both vPMEM and SR-IOV device. There are no direct mappings.
First time when dma_supported() is called with 64 bit mask, DDW is created and marked as dynamic window. The second time dma_supported() is called, enable_ddw() finds existing window for the device and incorrectly returns it as "direct mapping".
This only happens when size of DDW is big enough to map max LPAR memory.
This results in streaming TCEs to not get dynamically mapped, since code incorrently assumes these are already pre-mapped. The adapter initially comes up but goes down due to EEH.
Fixes: 381ceda88c4c ("powerpc/pseries/iommu: Make use of DDW for indirect mapping") Cc: stable@vger.kernel.org # v5.15+ Signed-off-by: Gaurav Batra gbatra@linux.vnet.ibm.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://msgid.link/20231003030802.47914-1-gbatra@linux.vnet.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/platforms/pseries/iommu.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/powerpc/platforms/pseries/iommu.c b/arch/powerpc/platforms/pseries/iommu.c index aa5f8074e9b10..bee61292de23b 100644 --- a/arch/powerpc/platforms/pseries/iommu.c +++ b/arch/powerpc/platforms/pseries/iommu.c @@ -891,7 +891,8 @@ static int remove_ddw(struct device_node *np, bool remove_prop, const char *win_ return 0; }
-static bool find_existing_ddw(struct device_node *pdn, u64 *dma_addr, int *window_shift) +static bool find_existing_ddw(struct device_node *pdn, u64 *dma_addr, int *window_shift, + bool *direct_mapping) { struct dma_win *window; const struct dynamic_dma_window_prop *dma64; @@ -904,6 +905,7 @@ static bool find_existing_ddw(struct device_node *pdn, u64 *dma_addr, int *windo dma64 = window->prop; *dma_addr = be64_to_cpu(dma64->dma_base); *window_shift = be32_to_cpu(dma64->window_shift); + *direct_mapping = window->direct; found = true; break; } @@ -1253,10 +1255,8 @@ static bool enable_ddw(struct pci_dev *dev, struct device_node *pdn)
mutex_lock(&dma_win_init_mutex);
- if (find_existing_ddw(pdn, &dev->dev.archdata.dma_offset, &len)) { - direct_mapping = (len >= max_ram_len); + if (find_existing_ddw(pdn, &dev->dev.archdata.dma_offset, &len, &direct_mapping)) goto out_unlock; - }
/* * If we already went through this for a previous function of
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johnathan Mantey johnathanx.mantey@intel.com
commit 9e2e7efbbbff69d8340abb56d375dd79d1f5770f upstream.
This reverts commit 3780bb29311eccb7a1c9641032a112eed237f7e3.
The cited commit introduced unwanted behavior.
The intent for the commit was to be able to detect carrier loss/gain for just the NIC connected to the BMC. The unwanted effect is a carrier loss for auxiliary paths also causes the BMC to lose carrier. The BMC never regains carrier despite the secondary NIC regaining a link.
This change, when merged, needs to be backported to stable kernels. 5.4-stable, 5.10-stable, 5.15-stable, 6.1-stable, 6.5-stable
Fixes: 3780bb29311e ("ncsi: Propagate carrier gain/loss events to the NCSI controller") CC: stable@vger.kernel.org Signed-off-by: Johnathan Mantey johnathanx.mantey@intel.com Reviewed-by: Simon Horman horms@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ncsi/ncsi-aen.c | 5 ----- 1 file changed, 5 deletions(-)
--- a/net/ncsi/ncsi-aen.c +++ b/net/ncsi/ncsi-aen.c @@ -89,11 +89,6 @@ static int ncsi_aen_handler_lsc(struct n if ((had_link == has_link) || chained) return 0;
- if (had_link) - netif_carrier_off(ndp->ndev.dev); - else - netif_carrier_on(ndp->ndev.dev); - if (!ndp->multi_package && !nc->package->multi_channel) { if (had_link) { ndp->flags |= NCSI_DEV_RESHUFFLE;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Robert Marko robert.marko@sartura.hr
commit 7b211c7671212cad0b83603c674838c7e824d845 upstream.
This reverts commit 0b01392c18b9993a584f36ace1d61118772ad0ca.
Conversion of PXA to generic I2C recovery, makes the I2C bus completely lock up if recovery pinctrl is present in the DT and I2C recovery is enabled.
So, until the generic I2C recovery can also work with PXA lets revert to have working I2C and I2C recovery again.
Signed-off-by: Robert Marko robert.marko@sartura.hr Cc: stable@vger.kernel.org # 5.11+ Acked-by: Andi Shyti andi.shyti@kernel.org Acked-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Acked-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i2c/busses/i2c-pxa.c | 76 ++++++++++++++++++++++++++++++++++++++----- 1 file changed, 68 insertions(+), 8 deletions(-)
--- a/drivers/i2c/busses/i2c-pxa.c +++ b/drivers/i2c/busses/i2c-pxa.c @@ -264,6 +264,9 @@ struct pxa_i2c { u32 hs_mask;
struct i2c_bus_recovery_info recovery; + struct pinctrl *pinctrl; + struct pinctrl_state *pinctrl_default; + struct pinctrl_state *pinctrl_recovery; };
#define _IBMR(i2c) ((i2c)->reg_ibmr) @@ -1302,12 +1305,13 @@ static void i2c_pxa_prepare_recovery(str */ gpiod_set_value(i2c->recovery.scl_gpiod, ibmr & IBMR_SCLS); gpiod_set_value(i2c->recovery.sda_gpiod, ibmr & IBMR_SDAS); + + WARN_ON(pinctrl_select_state(i2c->pinctrl, i2c->pinctrl_recovery)); }
static void i2c_pxa_unprepare_recovery(struct i2c_adapter *adap) { struct pxa_i2c *i2c = adap->algo_data; - struct i2c_bus_recovery_info *bri = adap->bus_recovery_info; u32 isr;
/* @@ -1321,7 +1325,7 @@ static void i2c_pxa_unprepare_recovery(s i2c_pxa_do_reset(i2c); }
- WARN_ON(pinctrl_select_state(bri->pinctrl, bri->pins_default)); + WARN_ON(pinctrl_select_state(i2c->pinctrl, i2c->pinctrl_default));
dev_dbg(&i2c->adap.dev, "recovery: IBMR 0x%08x ISR 0x%08x\n", readl(_IBMR(i2c)), readl(_ISR(i2c))); @@ -1343,20 +1347,76 @@ static int i2c_pxa_init_recovery(struct if (IS_ENABLED(CONFIG_I2C_PXA_SLAVE)) return 0;
- bri->pinctrl = devm_pinctrl_get(dev); - if (PTR_ERR(bri->pinctrl) == -ENODEV) { - bri->pinctrl = NULL; + i2c->pinctrl = devm_pinctrl_get(dev); + if (PTR_ERR(i2c->pinctrl) == -ENODEV) + i2c->pinctrl = NULL; + if (IS_ERR(i2c->pinctrl)) + return PTR_ERR(i2c->pinctrl); + + if (!i2c->pinctrl) + return 0; + + i2c->pinctrl_default = pinctrl_lookup_state(i2c->pinctrl, + PINCTRL_STATE_DEFAULT); + i2c->pinctrl_recovery = pinctrl_lookup_state(i2c->pinctrl, "recovery"); + + if (IS_ERR(i2c->pinctrl_default) || IS_ERR(i2c->pinctrl_recovery)) { + dev_info(dev, "missing pinmux recovery information: %ld %ld\n", + PTR_ERR(i2c->pinctrl_default), + PTR_ERR(i2c->pinctrl_recovery)); + return 0; + } + + /* + * Claiming GPIOs can influence the pinmux state, and may glitch the + * I2C bus. Do this carefully. + */ + bri->scl_gpiod = devm_gpiod_get(dev, "scl", GPIOD_OUT_HIGH_OPEN_DRAIN); + if (bri->scl_gpiod == ERR_PTR(-EPROBE_DEFER)) + return -EPROBE_DEFER; + if (IS_ERR(bri->scl_gpiod)) { + dev_info(dev, "missing scl gpio recovery information: %pe\n", + bri->scl_gpiod); + return 0; + } + + /* + * We have SCL. Pull SCL low and wait a bit so that SDA glitches + * have no effect. + */ + gpiod_direction_output(bri->scl_gpiod, 0); + udelay(10); + bri->sda_gpiod = devm_gpiod_get(dev, "sda", GPIOD_OUT_HIGH_OPEN_DRAIN); + + /* Wait a bit in case of a SDA glitch, and then release SCL. */ + udelay(10); + gpiod_direction_output(bri->scl_gpiod, 1); + + if (bri->sda_gpiod == ERR_PTR(-EPROBE_DEFER)) + return -EPROBE_DEFER; + + if (IS_ERR(bri->sda_gpiod)) { + dev_info(dev, "missing sda gpio recovery information: %pe\n", + bri->sda_gpiod); return 0; } - if (IS_ERR(bri->pinctrl)) - return PTR_ERR(bri->pinctrl);
bri->prepare_recovery = i2c_pxa_prepare_recovery; bri->unprepare_recovery = i2c_pxa_unprepare_recovery; + bri->recover_bus = i2c_generic_scl_recovery;
i2c->adap.bus_recovery_info = bri;
- return 0; + /* + * Claiming GPIOs can change the pinmux state, which confuses the + * pinctrl since pinctrl's idea of the current setting is unaffected + * by the pinmux change caused by claiming the GPIO. Work around that + * by switching pinctrl to the GPIO state here. We do it this way to + * avoid glitching the I2C bus. + */ + pinctrl_select_state(i2c->pinctrl, i2c->pinctrl_recovery); + + return pinctrl_select_state(i2c->pinctrl, i2c->pinctrl_default); }
static int i2c_pxa_probe(struct platform_device *dev)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ondrej Mosnacek omosnace@redhat.com
commit 866d648059d5faf53f1cd960b43fe8365ad93ea7 upstream.
1 is the return value that implements a "no-op" hook, not 0.
Cc: stable@vger.kernel.org Fixes: 98e828a0650f ("security: Refactor declaration of LSM hooks") Signed-off-by: Ondrej Mosnacek omosnace@redhat.com Signed-off-by: Paul Moore paul@paul-moore.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/lsm_hook_defs.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/include/linux/lsm_hook_defs.h +++ b/include/linux/lsm_hook_defs.h @@ -48,7 +48,7 @@ LSM_HOOK(int, 0, quota_on, struct dentry LSM_HOOK(int, 0, syslog, int type) LSM_HOOK(int, 0, settime, const struct timespec64 *ts, const struct timezone *tz) -LSM_HOOK(int, 0, vm_enough_memory, struct mm_struct *mm, long pages) +LSM_HOOK(int, 1, vm_enough_memory, struct mm_struct *mm, long pages) LSM_HOOK(int, 0, bprm_creds_for_exec, struct linux_binprm *bprm) LSM_HOOK(int, 0, bprm_creds_from_file, struct linux_binprm *bprm, struct file *file) LSM_HOOK(int, 0, bprm_check_security, struct linux_binprm *bprm)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ondrej Mosnacek omosnace@redhat.com
commit b36995b8609a5a8fe5cf259a1ee768fcaed919f8 upstream.
-EOPNOTSUPP is the return value that implements a "no-op" hook, not 0.
Without this fix having only the BPF LSM enabled (with no programs attached) can cause uninitialized variable reads in nfsd4_encode_fattr(), because the BPF hook returns 0 without touching the 'ctxlen' variable and the corresponding 'contextlen' variable in nfsd4_encode_fattr() remains uninitialized, yet being treated as valid based on the 0 return value.
Cc: stable@vger.kernel.org Fixes: 98e828a0650f ("security: Refactor declaration of LSM hooks") Reported-by: Benjamin Coddington bcodding@redhat.com Signed-off-by: Ondrej Mosnacek omosnace@redhat.com Signed-off-by: Paul Moore paul@paul-moore.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/lsm_hook_defs.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/include/linux/lsm_hook_defs.h +++ b/include/linux/lsm_hook_defs.h @@ -265,7 +265,7 @@ LSM_HOOK(void, LSM_RET_VOID, release_sec LSM_HOOK(void, LSM_RET_VOID, inode_invalidate_secctx, struct inode *inode) LSM_HOOK(int, 0, inode_notifysecctx, struct inode *inode, void *ctx, u32 ctxlen) LSM_HOOK(int, 0, inode_setsecctx, struct dentry *dentry, void *ctx, u32 ctxlen) -LSM_HOOK(int, 0, inode_getsecctx, struct inode *inode, void **ctx, +LSM_HOOK(int, -EOPNOTSUPP, inode_getsecctx, struct inode *inode, void **ctx, u32 *ctxlen)
#if defined(CONFIG_SECURITY) && defined(CONFIG_WATCH_QUEUE)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Darren Hart darren@os.amperecomputing.com
commit 5d6aa89bba5bd6af2580f872b57f438dab883738 upstream.
Commit abd3ac7902fb ("watchdog: sbsa: Support architecture version 1") introduced new timer math for watchdog revision 1 with the 48 bit offset register.
The gwdt->clk and timeout are u32, but the argument being calculated is u64. Without a cast, the compiler performs u32 operations, truncating intermediate steps, resulting in incorrect values.
A watchdog revision 1 implementation with a gwdt->clk of 1GHz and a timeout of 600s writes 3647256576 to the one shot watchdog instead of 300000000000, resulting in the watchdog firing in 3.6s instead of 600s.
Force u64 math by casting the first argument (gwdt->clk) as a u64. Make the order of operations explicit with parenthesis.
Fixes: abd3ac7902fb ("watchdog: sbsa: Support architecture version 1") Reported-by: Vanshidhar Konda vanshikonda@os.amperecomputing.com Signed-off-by: Darren Hart darren@os.amperecomputing.com Cc: Wim Van Sebroeck wim@linux-watchdog.org Cc: Guenter Roeck linux@roeck-us.net Cc: linux-watchdog@vger.kernel.org Cc: linux-kernel@vger.kernel.org Cc: linux-arm-kernel@lists.infradead.org Cc: stable@vger.kernel.org # 5.14.x Reviewed-by: Guenter Roeck linux@roeck-us.net Link: https://lore.kernel.org/r/7d1713c5ffab19b0f3de796d82df19e8b1f340de.169528612... Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Wim Van Sebroeck wim@linux-watchdog.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/watchdog/sbsa_gwdt.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/watchdog/sbsa_gwdt.c +++ b/drivers/watchdog/sbsa_gwdt.c @@ -153,14 +153,14 @@ static int sbsa_gwdt_set_timeout(struct timeout = clamp_t(unsigned int, timeout, 1, wdd->max_hw_heartbeat_ms / 1000);
if (action) - sbsa_gwdt_reg_write(gwdt->clk * timeout, gwdt); + sbsa_gwdt_reg_write((u64)gwdt->clk * timeout, gwdt); else /* * In the single stage mode, The first signal (WS0) is ignored, * the timeout is (WOR * 2), so the WOR should be configured * to half value of timeout. */ - sbsa_gwdt_reg_write(gwdt->clk / 2 * timeout, gwdt); + sbsa_gwdt_reg_write(((u64)gwdt->clk / 2) * timeout, gwdt);
return 0; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tam Nguyen tamnguyenchi@os.amperecomputing.com
commit e8183fa10c25c7b3c20670bf2b430ddcc1ee03c0 upstream.
During SMBus block data read process, we have seen high interrupt rate because of TX_EMPTY irq status while waiting for block length byte (the first data byte after the address phase). The interrupt handler does not do anything because the internal state is kept as STATUS_WRITE_IN_PROGRESS. Hence, we should disable TX_EMPTY IRQ until I2C DesignWare receives first data byte from I2C device, then re-enable it to resume SMBus transaction.
It takes 0.789 ms for host to receive data length from slave. Without the patch, i2c_dw_isr() is called 99 times by TX_EMPTY interrupt. And it is none after applying the patch.
Cc: stable@vger.kernel.org Co-developed-by: Chuong Tran chuong@os.amperecomputing.com Signed-off-by: Chuong Tran chuong@os.amperecomputing.com Signed-off-by: Tam Nguyen tamnguyenchi@os.amperecomputing.com Acked-by: Jarkko Nikula jarkko.nikula@linux.intel.com Reviewed-by: Serge Semin fancer.lancer@gmail.com Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i2c/busses/i2c-designware-master.c | 19 ++++++++++++++++--- 1 file changed, 16 insertions(+), 3 deletions(-)
--- a/drivers/i2c/busses/i2c-designware-master.c +++ b/drivers/i2c/busses/i2c-designware-master.c @@ -456,10 +456,16 @@ i2c_dw_xfer_msg(struct dw_i2c_dev *dev)
/* * Because we don't know the buffer length in the - * I2C_FUNC_SMBUS_BLOCK_DATA case, we can't stop - * the transaction here. + * I2C_FUNC_SMBUS_BLOCK_DATA case, we can't stop the + * transaction here. Also disable the TX_EMPTY IRQ + * while waiting for the data length byte to avoid the + * bogus interrupts flood. */ - if (buf_len > 0 || flags & I2C_M_RECV_LEN) { + if (flags & I2C_M_RECV_LEN) { + dev->status |= STATUS_WRITE_IN_PROGRESS; + intr_mask &= ~DW_IC_INTR_TX_EMPTY; + break; + } else if (buf_len > 0) { /* more bytes to be written */ dev->status |= STATUS_WRITE_IN_PROGRESS; break; @@ -495,6 +501,13 @@ i2c_dw_recv_len(struct dw_i2c_dev *dev, msgs[dev->msg_read_idx].len = len; msgs[dev->msg_read_idx].flags &= ~I2C_M_RECV_LEN;
+ /* + * Received buffer length, re-enable TX_EMPTY interrupt + * to resume the SMBUS transaction. + */ + regmap_update_bits(dev->map, DW_IC_INTR_MASK, DW_IC_INTR_TX_EMPTY, + DW_IC_INTR_TX_EMPTY); + return len; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Harald Freudenberger freude@linux.ibm.com
commit e14aec23025eeb1f2159ba34dbc1458467c4c347 upstream.
Fix kernel crash in AP bus code caused by very early invocation of the config change callback function via SCLP.
After a fresh IML of the machine the crypto cards are still offline and will get switched online only with activation of any LPAR which has the card in it's configuration. A crypto card coming online is reported to the LPAR via SCLP and the AP bus offers a callback function to get this kind of information. However, it may happen that the callback is invoked before the AP bus init function is complete. As the callback triggers a synchronous AP bus scan, the scan may already run but some internal states are not initialized by the AP bus init function resulting in a crash like this:
[ 11.635859] Unable to handle kernel pointer dereference in virtual kernel address space [ 11.635861] Failing address: 0000000000000000 TEID: 0000000000000887 [ 11.635862] Fault in home space mode while using kernel ASCE. [ 11.635864] AS:00000000894c4007 R3:00000001fece8007 S:00000001fece7800 P:000000000000013d [ 11.635879] Oops: 0004 ilc:1 [#1] SMP [ 11.635882] Modules linked in: [ 11.635884] CPU: 5 PID: 42 Comm: kworker/5:0 Not tainted 6.6.0-rc3-00003-g4dbf7cdc6b42 #12 [ 11.635886] Hardware name: IBM 3931 A01 751 (LPAR) [ 11.635887] Workqueue: events_long ap_scan_bus [ 11.635891] Krnl PSW : 0704c00180000000 0000000000000000 (0x0) [ 11.635895] R:0 T:1 IO:1 EX:1 Key:0 M:1 W:0 P:0 AS:3 CC:0 PM:0 RI:0 EA:3 [ 11.635897] Krnl GPRS: 0000000001000a00 0000000000000000 0000000000000006 0000000089591940 [ 11.635899] 0000000080000000 0000000000000a00 0000000000000000 0000000000000000 [ 11.635901] 0000000081870c00 0000000089591000 000000008834e4e2 0000000002625a00 [ 11.635903] 0000000081734200 0000038000913c18 000000008834c6d6 0000038000913ac8 [ 11.635906] Krnl Code:>0000000000000000: 0000 illegal [ 11.635906] 0000000000000002: 0000 illegal [ 11.635906] 0000000000000004: 0000 illegal [ 11.635906] 0000000000000006: 0000 illegal [ 11.635906] 0000000000000008: 0000 illegal [ 11.635906] 000000000000000a: 0000 illegal [ 11.635906] 000000000000000c: 0000 illegal [ 11.635906] 000000000000000e: 0000 illegal [ 11.635915] Call Trace: [ 11.635916] [<0000000000000000>] 0x0 [ 11.635918] [<000000008834e4e2>] ap_queue_init_state+0x82/0xb8 [ 11.635921] [<000000008834ba1c>] ap_scan_domains+0x6fc/0x740 [ 11.635923] [<000000008834c092>] ap_scan_adapter+0x632/0x8b0 [ 11.635925] [<000000008834c3e4>] ap_scan_bus+0xd4/0x288 [ 11.635927] [<00000000879a33ba>] process_one_work+0x19a/0x410 [ 11.635930] Discipline DIAG cannot be used without z/VM [ 11.635930] [<00000000879a3a2c>] worker_thread+0x3fc/0x560 [ 11.635933] [<00000000879aea60>] kthread+0x120/0x128 [ 11.635936] [<000000008792afa4>] __ret_from_fork+0x3c/0x58 [ 11.635938] [<00000000885ebe62>] ret_from_fork+0xa/0x30 [ 11.635942] Last Breaking-Event-Address: [ 11.635942] [<000000008834c6d4>] ap_wait+0xcc/0x148
This patch improves the ap_bus_force_rescan() function which is invoked by the config change callback by checking if a first initial AP bus scan has been done. If not, the force rescan request is simple ignored. Anyhow it does not make sense to trigger AP bus re-scans even before the very first bus scan is complete.
Cc: stable@vger.kernel.org Reviewed-by: Holger Dengler dengler@linux.ibm.com Signed-off-by: Harald Freudenberger freude@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/s390/crypto/ap_bus.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/s390/crypto/ap_bus.c +++ b/drivers/s390/crypto/ap_bus.c @@ -955,6 +955,10 @@ EXPORT_SYMBOL(ap_driver_unregister);
void ap_bus_force_rescan(void) { + /* Only trigger AP bus scans after the initial scan is done */ + if (atomic64_read(&ap_scan_bus_count) <= 0) + return; + /* processing a asynchronous bus rescan */ del_timer(&ap_config_timer); queue_work(system_long_wq, &ap_scan_work);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andrew Lunn andrew@lunn.ch
commit f55d8e60f10909dbc5524e261041e1d28d7d20d8 upstream.
This function takes a pointer to a pointer, unlike sprintf() which is passed a plain pointer. Fix up the documentation to make this clear.
Fixes: 7888fe53b706 ("ethtool: Add common function for filling out strings") Cc: Alexander Duyck alexanderduyck@fb.com Cc: Justin Stitt justinstitt@google.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Lunn andrew@lunn.ch Reviewed-by: Justin Stitt justinstitt@google.com Link: https://lore.kernel.org/r/20231028192511.100001-1-andrew@lunn.ch Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/ethtool.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/include/linux/ethtool.h +++ b/include/linux/ethtool.h @@ -781,10 +781,10 @@ int ethtool_get_phc_vclocks(struct net_d
/** * ethtool_sprintf - Write formatted string to ethtool string data - * @data: Pointer to start of string to update + * @data: Pointer to a pointer to the start of string to update * @fmt: Format of string to write * - * Write formatted string to data. Update data to point at start of + * Write formatted string to *data. Update *data to point at start of * next string. */ extern __printf(2, 3) void ethtool_sprintf(u8 **data, const char *fmt, ...);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Sverdlin alexander.sverdlin@siemens.com
commit 5a22fbcc10f3f7d94c5d88afbbffa240a3677057 upstream.
When LAN9303 is MDIO-connected two callchains exist into mdio->bus->write():
1. switch ports 1&2 ("physical" PHYs):
virtual (switch-internal) MDIO bus (lan9303_switch_ops->phy_{read|write})-> lan9303_mdio_phy_{read|write} -> mdiobus_{read|write}_nested
2. LAN9303 virtual PHY:
virtual MDIO bus (lan9303_phy_{read|write}) -> lan9303_virt_phy_reg_{read|write} -> regmap -> lan9303_mdio_{read|write}
If the latter functions just take mutex_lock(&sw_dev->device->bus->mdio_lock) it triggers a LOCKDEP false-positive splat. It's false-positive because the first mdio_lock in the second callchain above belongs to virtual MDIO bus, the second mdio_lock belongs to physical MDIO bus.
Consequent annotation in lan9303_mdio_{read|write} as nested lock (similar to lan9303_mdio_phy_{read|write}, it's the same physical MDIO bus) prevents the following splat:
WARNING: possible circular locking dependency detected 5.15.71 #1 Not tainted ------------------------------------------------------ kworker/u4:3/609 is trying to acquire lock: ffff000011531c68 (lan9303_mdio:131:(&lan9303_mdio_regmap_config)->lock){+.+.}-{3:3}, at: regmap_lock_mutex but task is already holding lock: ffff0000114c44d8 (&bus->mdio_lock){+.+.}-{3:3}, at: mdiobus_read which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&bus->mdio_lock){+.+.}-{3:3}: lock_acquire __mutex_lock mutex_lock_nested lan9303_mdio_read _regmap_read regmap_read lan9303_probe lan9303_mdio_probe mdio_probe really_probe __driver_probe_device driver_probe_device __device_attach_driver bus_for_each_drv __device_attach device_initial_probe bus_probe_device deferred_probe_work_func process_one_work worker_thread kthread ret_from_fork -> #0 (lan9303_mdio:131:(&lan9303_mdio_regmap_config)->lock){+.+.}-{3:3}: __lock_acquire lock_acquire.part.0 lock_acquire __mutex_lock mutex_lock_nested regmap_lock_mutex regmap_read lan9303_phy_read dsa_slave_phy_read __mdiobus_read mdiobus_read get_phy_device mdiobus_scan __mdiobus_register dsa_register_switch lan9303_probe lan9303_mdio_probe mdio_probe really_probe __driver_probe_device driver_probe_device __device_attach_driver bus_for_each_drv __device_attach device_initial_probe bus_probe_device deferred_probe_work_func process_one_work worker_thread kthread ret_from_fork other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&bus->mdio_lock); lock(lan9303_mdio:131:(&lan9303_mdio_regmap_config)->lock); lock(&bus->mdio_lock); lock(lan9303_mdio:131:(&lan9303_mdio_regmap_config)->lock); *** DEADLOCK *** 5 locks held by kworker/u4:3/609: #0: ffff000002842938 ((wq_completion)events_unbound){+.+.}-{0:0}, at: process_one_work #1: ffff80000bacbd60 (deferred_probe_work){+.+.}-{0:0}, at: process_one_work #2: ffff000007645178 (&dev->mutex){....}-{3:3}, at: __device_attach #3: ffff8000096e6e78 (dsa2_mutex){+.+.}-{3:3}, at: dsa_register_switch #4: ffff0000114c44d8 (&bus->mdio_lock){+.+.}-{3:3}, at: mdiobus_read stack backtrace: CPU: 1 PID: 609 Comm: kworker/u4:3 Not tainted 5.15.71 #1 Workqueue: events_unbound deferred_probe_work_func Call trace: dump_backtrace show_stack dump_stack_lvl dump_stack print_circular_bug check_noncircular __lock_acquire lock_acquire.part.0 lock_acquire __mutex_lock mutex_lock_nested regmap_lock_mutex regmap_read lan9303_phy_read dsa_slave_phy_read __mdiobus_read mdiobus_read get_phy_device mdiobus_scan __mdiobus_register dsa_register_switch lan9303_probe lan9303_mdio_probe ...
Cc: stable@vger.kernel.org Fixes: dc7005831523 ("net: dsa: LAN9303: add MDIO managed mode support") Signed-off-by: Alexander Sverdlin alexander.sverdlin@siemens.com Reviewed-by: Andrew Lunn andrew@lunn.ch Link: https://lore.kernel.org/r/20231027065741.534971-1-alexander.sverdlin@siemens... Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/dsa/lan9303_mdio.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/net/dsa/lan9303_mdio.c +++ b/drivers/net/dsa/lan9303_mdio.c @@ -32,7 +32,7 @@ static int lan9303_mdio_write(void *ctx, struct lan9303_mdio *sw_dev = (struct lan9303_mdio *)ctx;
reg <<= 2; /* reg num to offset */ - mutex_lock(&sw_dev->device->bus->mdio_lock); + mutex_lock_nested(&sw_dev->device->bus->mdio_lock, MDIO_MUTEX_NESTED); lan9303_mdio_real_write(sw_dev->device, reg, val & 0xffff); lan9303_mdio_real_write(sw_dev->device, reg + 2, (val >> 16) & 0xffff); mutex_unlock(&sw_dev->device->bus->mdio_lock); @@ -50,7 +50,7 @@ static int lan9303_mdio_read(void *ctx, struct lan9303_mdio *sw_dev = (struct lan9303_mdio *)ctx;
reg <<= 2; /* reg num to offset */ - mutex_lock(&sw_dev->device->bus->mdio_lock); + mutex_lock_nested(&sw_dev->device->bus->mdio_lock, MDIO_MUTEX_NESTED); *val = lan9303_mdio_real_read(sw_dev->device, reg); *val |= (lan9303_mdio_real_read(sw_dev->device, reg + 2) << 16); mutex_unlock(&sw_dev->device->bus->mdio_lock);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Klaus Kudielka klaus.kudielka@gmail.com
commit 02d5fdbf4f2b8c406f7a4c98fa52aa181a11d733 upstream.
Background: Turris Omnia (Armada 385); eth2 (mvneta) connected to SFP bus; SFP module is present, but no fiber connected, so definitely no carrier.
After booting, eth2 is down, but netdev LED trigger surprisingly reports link active. Then, after "ip link set eth2 up", the link indicator goes away - as I would have expected it from the beginning.
It turns out, that the default carrier state after netdev creation is "carrier ok". Some ethernet drivers explicitly call netif_carrier_off during probing, others (like mvneta) don't - which explains the current behaviour: only when the device is brought up, phylink_start calls netif_carrier_off.
Fix this for all drivers using phylink, by calling netif_carrier_off in phylink_create.
Fixes: 089381b27abe ("leds: initial support for Turris Omnia LEDs") Cc: stable@vger.kernel.org Suggested-by: Andrew Lunn andrew@lunn.ch Signed-off-by: Klaus Kudielka klaus.kudielka@gmail.com Reviewed-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/phy/phylink.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/net/phy/phylink.c +++ b/drivers/net/phy/phylink.c @@ -853,6 +853,7 @@ struct phylink *phylink_create(struct ph pl->config = config; if (config->type == PHYLINK_NETDEV) { pl->netdev = to_net_dev(config->dev); + netif_carrier_off(pl->netdev); } else if (config->type == PHYLINK_DEV) { pl->dev = config->dev; } else {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Heiner Kallweit hkallweit1@gmail.com
commit f78ca48a8ba9cdec96e8839351e49eec3233b177 upstream.
Currently we set SMBHSTCNT_LAST_BYTE only after the host has started receiving the last byte. If we get e.g. preempted before setting SMBHSTCNT_LAST_BYTE, the host may be finished with receiving the byte before SMBHSTCNT_LAST_BYTE is set. Therefore change the code to set SMBHSTCNT_LAST_BYTE before writing SMBHSTSTS_BYTE_DONE for the byte before the last byte. Now the code is also consistent with what we do in i801_isr_byte_done().
Reported-by: Jean Delvare jdelvare@suse.com Closes: https://lore.kernel.org/linux-i2c/20230828152747.09444625@endymion.delvare/ Cc: stable@vger.kernel.org Acked-by: Andi Shyti andi.shyti@kernel.org Signed-off-by: Heiner Kallweit hkallweit1@gmail.com Reviewed-by: Jean Delvare jdelvare@suse.de Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i2c/busses/i2c-i801.c | 19 +++++++++---------- 1 file changed, 9 insertions(+), 10 deletions(-)
--- a/drivers/i2c/busses/i2c-i801.c +++ b/drivers/i2c/busses/i2c-i801.c @@ -702,15 +702,11 @@ static int i801_block_transaction_byte_b return i801_check_post(priv, result ? priv->status : -ETIMEDOUT); }
- for (i = 1; i <= len; i++) { - if (i == len && read_write == I2C_SMBUS_READ) - smbcmd |= SMBHSTCNT_LAST_BYTE; - outb_p(smbcmd, SMBHSTCNT(priv)); - - if (i == 1) - outb_p(inb(SMBHSTCNT(priv)) | SMBHSTCNT_START, - SMBHSTCNT(priv)); + if (len == 1 && read_write == I2C_SMBUS_READ) + smbcmd |= SMBHSTCNT_LAST_BYTE; + outb_p(smbcmd | SMBHSTCNT_START, SMBHSTCNT(priv));
+ for (i = 1; i <= len; i++) { status = i801_wait_byte_done(priv); if (status) goto exit; @@ -733,9 +729,12 @@ static int i801_block_transaction_byte_b data->block[0] = len; }
- /* Retrieve/store value in SMBBLKDAT */ - if (read_write == I2C_SMBUS_READ) + if (read_write == I2C_SMBUS_READ) { data->block[i] = inb_p(SMBBLKDAT(priv)); + if (i == len - 1) + outb_p(smbcmd | SMBHSTCNT_LAST_BYTE, SMBHSTCNT(priv)); + } + if (read_write == I2C_SMBUS_WRITE && i+1 <= len) outb_p(data->block[i+1], SMBBLKDAT(priv));
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Su Hui suhui@nfschina.com
commit e0d4e8acb3789c5a8651061fbab62ca24a45c063 upstream.
With gcc and W=1 option, there's a warning like this:
fs/f2fs/compress.c: In function ‘f2fs_init_page_array_cache’: fs/f2fs/compress.c:1984:47: error: ‘%u’ directive writing between 1 and 7 bytes into a region of size between 5 and 8 [-Werror=format-overflow=] 1984 | sprintf(slab_name, "f2fs_page_array_entry-%u:%u", MAJOR(dev), MINOR(dev)); | ^~
String "f2fs_page_array_entry-%u:%u" can up to 35. The first "%u" can up to 4 and the second "%u" can up to 7, so total size is "24 + 4 + 7 = 35". slab_name's size should be 35 rather than 32.
Cc: stable@vger.kernel.org Signed-off-by: Su Hui suhui@nfschina.com Reviewed-by: Chao Yu chao@kernel.org Signed-off-by: Jaegeuk Kim jaegeuk@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/f2fs/compress.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/f2fs/compress.c +++ b/fs/f2fs/compress.c @@ -1906,7 +1906,7 @@ void f2fs_destroy_compress_inode(struct int f2fs_init_page_array_cache(struct f2fs_sb_info *sbi) { dev_t dev = sbi->sb->s_bdev->bd_dev; - char slab_name[32]; + char slab_name[35];
sprintf(slab_name, "f2fs_page_array_entry-%u:%u", MAJOR(dev), MINOR(dev));
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Young sean@mess.org
commit c8a489f820179fb12251e262b50303c29de991ac upstream.
When transmitting, infrared drivers expect an odd number of samples; iow without a trailing space. No problems have been observed so far, so this is just belt and braces.
Fixes: 9b6192589be7 ("media: lirc: implement scancode sending") Cc: stable@vger.kernel.org Signed-off-by: Sean Young sean@mess.org Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/rc/lirc_dev.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/drivers/media/rc/lirc_dev.c +++ b/drivers/media/rc/lirc_dev.c @@ -287,7 +287,11 @@ static ssize_t lirc_transmit(struct file if (ret < 0) goto out_kfree_raw;
- count = ret; + /* drop trailing space */ + if (!(ret % 2)) + count = ret - 1; + else + count = ret;
txbuf = kmalloc_array(count, sizeof(unsigned int), GFP_KERNEL); if (!txbuf) {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Young sean@mess.org
commit 4f7efc71891462ab7606da7039f480d7c1584a13 upstream.
The Sharp protocol[1] encoding has incorrect timings for bit space.
[1] https://www.sbprojects.net/knowledge/ir/sharp.php
Fixes: d35afc5fe097 ("[media] rc: ir-sharp-decoder: Add encode capability") Cc: stable@vger.kernel.org Reported-by: Joe Ferner joe.m.ferner@gmail.com Closes: https://sourceforge.net/p/lirc/mailman/message/38604507/ Signed-off-by: Sean Young sean@mess.org Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/rc/ir-sharp-decoder.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
--- a/drivers/media/rc/ir-sharp-decoder.c +++ b/drivers/media/rc/ir-sharp-decoder.c @@ -15,7 +15,9 @@ #define SHARP_UNIT 40 /* us */ #define SHARP_BIT_PULSE (8 * SHARP_UNIT) /* 320us */ #define SHARP_BIT_0_PERIOD (25 * SHARP_UNIT) /* 1ms (680us space) */ -#define SHARP_BIT_1_PERIOD (50 * SHARP_UNIT) /* 2ms (1680ms space) */ +#define SHARP_BIT_1_PERIOD (50 * SHARP_UNIT) /* 2ms (1680us space) */ +#define SHARP_BIT_0_SPACE (17 * SHARP_UNIT) /* 680us space */ +#define SHARP_BIT_1_SPACE (42 * SHARP_UNIT) /* 1680us space */ #define SHARP_ECHO_SPACE (1000 * SHARP_UNIT) /* 40 ms */ #define SHARP_TRAILER_SPACE (125 * SHARP_UNIT) /* 5 ms (even longer) */
@@ -168,8 +170,8 @@ static const struct ir_raw_timings_pd ir .header_pulse = 0, .header_space = 0, .bit_pulse = SHARP_BIT_PULSE, - .bit_space[0] = SHARP_BIT_0_PERIOD, - .bit_space[1] = SHARP_BIT_1_PERIOD, + .bit_space[0] = SHARP_BIT_0_SPACE, + .bit_space[1] = SHARP_BIT_1_SPACE, .trailer_pulse = SHARP_BIT_PULSE, .trailer_space = SHARP_ECHO_SPACE, .msb_first = 1,
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vikash Garodia quic_vgarodia@quicinc.com
commit 0768a9dd809ef52440b5df7dce5a1c1c7e97abbd upstream.
Supported codec bitmask is populated from the payload from venus firmware. There is a possible case when all the bits in the codec bitmask is set. In such case, core cap for decoder is filled and MAX_CODEC_NUM is utilized. Now while filling the caps for encoder, it can lead to access the caps array beyong 32 index. Hence leading to OOB write. The fix counts the supported encoder and decoder. If the count is more than max, then it skips accessing the caps.
Cc: stable@vger.kernel.org Fixes: 1a73374a04e5 ("media: venus: hfi_parser: add common capability parser") Signed-off-by: Vikash Garodia quic_vgarodia@quicinc.com Signed-off-by: Stanimir Varbanov stanimir.k.varbanov@gmail.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/platform/qcom/venus/hfi_parser.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/media/platform/qcom/venus/hfi_parser.c +++ b/drivers/media/platform/qcom/venus/hfi_parser.c @@ -19,6 +19,9 @@ static void init_codecs(struct venus_cor struct hfi_plat_caps *caps = core->caps, *cap; unsigned long bit;
+ if (hweight_long(core->dec_codecs) + hweight_long(core->enc_codecs) > MAX_CODEC_NUM) + return; + for_each_set_bit(bit, &core->dec_codecs, MAX_CODEC_NUM) { cap = &caps[core->codecs_count++]; cap->codec = BIT(bit);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vikash Garodia quic_vgarodia@quicinc.com
commit b18e36dfd6c935da60a971310374f3dfec3c82e1 upstream.
Buffer requirement, for different buffer type, comes from video firmware. While copying these requirements, there is an OOB possibility when the payload from firmware is more than expected size. Fix the check to avoid the OOB possibility.
Cc: stable@vger.kernel.org Fixes: 09c2845e8fe4 ("[media] media: venus: hfi: add Host Firmware Interface (HFI)") Reviewed-by: Nathan Hebert nhebert@chromium.org Signed-off-by: Vikash Garodia quic_vgarodia@quicinc.com Signed-off-by: Stanimir Varbanov stanimir.k.varbanov@gmail.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/platform/qcom/venus/hfi_msgs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/media/platform/qcom/venus/hfi_msgs.c +++ b/drivers/media/platform/qcom/venus/hfi_msgs.c @@ -367,7 +367,7 @@ session_get_prop_buf_req(struct hfi_msg_ memcpy(&bufreq[idx], buf_req, sizeof(*bufreq)); idx++;
- if (idx > HFI_BUFFER_TYPE_MAX) + if (idx >= HFI_BUFFER_TYPE_MAX) return HFI_ERR_SESSION_INVALID_PARAMETER;
req_bytes -= sizeof(struct hfi_buffer_requirements);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vikash Garodia quic_vgarodia@quicinc.com
commit 8d0b89398b7ebc52103e055bf36b60b045f5258f upstream.
The hfi parser, parses the capabilities received from venus firmware and copies them to core capabilities. Consider below api, for example, fill_caps - In this api, caps in core structure gets updated with the number of capabilities received in firmware data payload. If the same api is called multiple times, there is a possibility of copying beyond the max allocated size in core caps. Similar possibilities in fill_raw_fmts and fill_profile_level functions.
Cc: stable@vger.kernel.org Fixes: 1a73374a04e5 ("media: venus: hfi_parser: add common capability parser") Signed-off-by: Vikash Garodia quic_vgarodia@quicinc.com Signed-off-by: Stanimir Varbanov stanimir.k.varbanov@gmail.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/platform/qcom/venus/hfi_parser.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
--- a/drivers/media/platform/qcom/venus/hfi_parser.c +++ b/drivers/media/platform/qcom/venus/hfi_parser.c @@ -89,6 +89,9 @@ static void fill_profile_level(struct hf { const struct hfi_profile_level *pl = data;
+ if (cap->num_pl + num >= HFI_MAX_PROFILE_COUNT) + return; + memcpy(&cap->pl[cap->num_pl], pl, num * sizeof(*pl)); cap->num_pl += num; } @@ -114,6 +117,9 @@ fill_caps(struct hfi_plat_caps *cap, con { const struct hfi_capability *caps = data;
+ if (cap->num_caps + num >= MAX_CAP_ENTRIES) + return; + memcpy(&cap->caps[cap->num_caps], caps, num * sizeof(*caps)); cap->num_caps += num; } @@ -140,6 +146,9 @@ static void fill_raw_fmts(struct hfi_pla { const struct raw_formats *formats = fmts;
+ if (cap->num_fmts + num_fmts >= MAX_FMT_ENTRIES) + return; + memcpy(&cap->fmts[cap->num_fmts], formats, num_fmts * sizeof(*formats)); cap->num_fmts += num_fmts; } @@ -162,6 +171,9 @@ parse_raw_formats(struct venus_core *cor rawfmts[i].buftype = fmt->buffer_type; i++;
+ if (i >= MAX_FMT_ENTRIES) + return; + if (pinfo->num_planes > MAX_PLANES) break;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sakari Ailus sakari.ailus@linux.intel.com
commit 724ff68e968b19d786870d333f9952bdd6b119cb upstream.
Initialise the try sink compose rectangle size to the sink compose rectangle for binner and scaler sub-devices. This was missed due to the faulty condition that lead to the compose rectangles to be initialised for the pixel array sub-device where it is not relevant.
Fixes: ccfc97bdb5ae ("[media] smiapp: Add driver") Cc: stable@vger.kernel.org Signed-off-by: Sakari Ailus sakari.ailus@linux.intel.com Reviewed-by: Laurent Pinchart laurent.pinchart@ideasonboard.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/i2c/ccs/ccs-core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/media/i2c/ccs/ccs-core.c +++ b/drivers/media/i2c/ccs/ccs-core.c @@ -3089,7 +3089,7 @@ static int ccs_open(struct v4l2_subdev * try_fmt->code = sensor->internal_csi_format->code; try_fmt->field = V4L2_FIELD_NONE;
- if (ssd != sensor->pixel_array) + if (ssd == sensor->pixel_array) continue;
try_comp = v4l2_subdev_get_try_compose(sd, fh->state, i);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mahmoud Adam mngyadam@amazon.com
commit bc1b5acb40201a0746d68a7d7cfc141899937f4f upstream.
seq_release should be called to free the allocated seq_file
Cc: stable@vger.kernel.org # v5.3+ Signed-off-by: Mahmoud Adam mngyadam@amazon.com Reviewed-by: Jeff Layton jlayton@kernel.org Fixes: 78599c42ae3c ("nfsd4: add file to display list of client's opens") Reviewed-by: NeilBrown neilb@suse.de Tested-by: Jeff Layton jlayton@kernel.org Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfsd/nfs4state.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -2686,7 +2686,7 @@ static int client_opens_release(struct i
/* XXX: alternatively, we could get/drop in seq start/stop */ drop_client(clp); - return 0; + return seq_release(inode, file); }
static const struct file_operations client_states_fops = {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nam Cao namcaov@gmail.com
commit 8cb22bec142624d21bc85ff96b7bad10b6220e6a upstream.
Instructions can write to x0, so we should simulate these instructions normally.
Currently, the kernel hangs if an instruction who writes to x0 is simulated.
Fixes: c22b0bcb1dd0 ("riscv: Add kprobes supported") Cc: stable@vger.kernel.org Signed-off-by: Nam Cao namcaov@gmail.com Reviewed-by: Charlie Jenkins charlie@rivosinc.com Acked-by: Guo Ren guoren@kernel.org Link: https://lore.kernel.org/r/20230829182500.61875-1-namcaov@gmail.com Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/riscv/kernel/probes/simulate-insn.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/riscv/kernel/probes/simulate-insn.c b/arch/riscv/kernel/probes/simulate-insn.c index d3099d67816d..6c166029079c 100644 --- a/arch/riscv/kernel/probes/simulate-insn.c +++ b/arch/riscv/kernel/probes/simulate-insn.c @@ -24,7 +24,7 @@ static inline bool rv_insn_reg_set_val(struct pt_regs *regs, u32 index, unsigned long val) { if (index == 0) - return false; + return true; else if (index <= 31) *((unsigned long *)regs + index) = val; else
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Victor Shih victor.shih@genesyslogic.com.tw
commit d7133797e9e1b72fd89237f68cb36d745599ed86 upstream.
When GL9750 enters ASPM L1 sub-states, it will stay at L1.1 and will not enter L1.2. The workaround is to toggle PM state to allow GL9750 to enter ASPM L1.2.
Signed-off-by: Victor Shih victor.shih@genesyslogic.com.tw Link: https://lore.kernel.org/r/20230912091710.7797-1-victorshihgli@gmail.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/sdhci-pci-gli.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
--- a/drivers/mmc/host/sdhci-pci-gli.c +++ b/drivers/mmc/host/sdhci-pci-gli.c @@ -23,6 +23,9 @@ #define GLI_9750_WT_EN_ON 0x1 #define GLI_9750_WT_EN_OFF 0x0
+#define PCI_GLI_9750_PM_CTRL 0xFC +#define PCI_GLI_9750_PM_STATE GENMASK(1, 0) + #define SDHCI_GLI_9750_CFG2 0x848 #define SDHCI_GLI_9750_CFG2_L1DLY GENMASK(28, 24) #define GLI_9750_CFG2_L1DLY_VALUE 0x1F @@ -421,8 +424,12 @@ static void sdhci_gl9750_set_clock(struc
static void gl9750_hw_setting(struct sdhci_host *host) { + struct sdhci_pci_slot *slot = sdhci_priv(host); + struct pci_dev *pdev; u32 value;
+ pdev = slot->chip->pdev; + gl9750_wt_on(host);
value = sdhci_readl(host, SDHCI_GLI_9750_CFG2); @@ -432,6 +439,13 @@ static void gl9750_hw_setting(struct sdh GLI_9750_CFG2_L1DLY_VALUE); sdhci_writel(host, value, SDHCI_GLI_9750_CFG2);
+ /* toggle PM state to allow GL9750 to enter ASPM L1.2 */ + pci_read_config_dword(pdev, PCI_GLI_9750_PM_CTRL, &value); + value |= PCI_GLI_9750_PM_STATE; + pci_write_config_dword(pdev, PCI_GLI_9750_PM_CTRL, value); + value &= ~PCI_GLI_9750_PM_STATE; + pci_write_config_dword(pdev, PCI_GLI_9750_PM_CTRL, value); + gl9750_wt_off(host); }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Roman Gushchin roman.gushchin@linux.dev
commit 24948e3b7b12e0031a6edb4f49bbb9fb2ad1e4e9 upstream.
Objcg vectors attached to slab pages to store slab object ownership information are allocated using gfp flags for the original slab allocation. Depending on slab page order and the size of slab objects, objcg vector can take several pages.
If the original allocation was done with the __GFP_NOFAIL flag, it triggered a warning in the page allocation code. Indeed, order > 1 pages should not been allocated with the __GFP_NOFAIL flag.
Fix this by simply dropping the __GFP_NOFAIL flag when allocating the objcg vector. It effectively allows to skip the accounting of a single slab object under a heavy memory pressure.
An alternative would be to implement the mechanism to fallback to order-0 allocations for accounting metadata, which is also not perfect because it will increase performance penalty and memory footprint of the kernel memory accounting under memory pressure.
Link: https://lkml.kernel.org/r/ZUp8ZFGxwmCx4ZFr@P9FQF9L96D.corp.robot.car Signed-off-by: Roman Gushchin roman.gushchin@linux.dev Reported-by: Christoph Lameter cl@linux.com Closes: https://lkml.kernel.org/r/6b42243e-f197-600a-5d22-56bd728a5ad8@gentwo.org Acked-by: Shakeel Butt shakeelb@google.com Cc: Matthew Wilcox willy@infradead.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/memcontrol.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -2822,7 +2822,8 @@ retry: * Moreover, it should not come from DMA buffer and is not readily * reclaimable. So those GFP bits should be masked off. */ -#define OBJCGS_CLEAR_MASK (__GFP_DMA | __GFP_RECLAIMABLE | __GFP_ACCOUNT) +#define OBJCGS_CLEAR_MASK (__GFP_DMA | __GFP_RECLAIMABLE | \ + __GFP_ACCOUNT | __GFP_NOFAIL)
int memcg_alloc_page_obj_cgroups(struct page *page, struct kmem_cache *s, gfp_t gfp, bool new_page)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: ChunHao Lin hau@realtek.com
commit 868c3b95afef4883bfb66c9397482da6840b5baf upstream.
Device that support DASH may be reseted or powered off during suspend. So driver needs to handle DASH during system suspend and resume. Or DASH firmware will influence device behavior and causes network lost.
Fixes: b646d90053f8 ("r8169: magic.") Cc: stable@vger.kernel.org Reviewed-by: Heiner Kallweit hkallweit1@gmail.com Signed-off-by: ChunHao Lin hau@realtek.com Link: https://lore.kernel.org/r/20231109173400.4573-3-hau@realtek.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/realtek/r8169_main.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -4714,10 +4714,16 @@ static void rtl8169_down(struct rtl8169_ rtl8169_cleanup(tp, true);
rtl_prepare_power_down(tp); + + if (tp->dash_type != RTL_DASH_NONE) + rtl8168_driver_stop(tp); }
static void rtl8169_up(struct rtl8169_private *tp) { + if (tp->dash_type != RTL_DASH_NONE) + rtl8168_driver_start(tp); + pci_set_master(tp->pci_dev); phy_init_hw(tp->phydev); phy_resume(tp->phydev);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Victor Shih victor.shih@genesyslogic.com.tw
commit 015c9cbcf0ad709079117d27c2094a46e0eadcdb upstream.
Due to a flaw in the hardware design, the GL9750 replay timer frequently times out when ASPM is enabled. As a result, the warning messages will often appear in the system log when the system accesses the GL9750 PCI config. Therefore, the replay timer timeout must be masked.
Fixes: d7133797e9e1 ("mmc: sdhci-pci-gli: A workaround to allow GL9750 to enter ASPM L1.2") Signed-off-by: Victor Shih victor.shih@genesyslogic.com.tw Acked-by: Adrian Hunter adrian.hunter@intel.com Acked-by: Kai-Heng Feng kai.heng.geng@canonical.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20231107095741.8832-2-victorshihgli@gmail.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/sdhci-pci-gli.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/drivers/mmc/host/sdhci-pci-gli.c +++ b/drivers/mmc/host/sdhci-pci-gli.c @@ -26,6 +26,9 @@ #define PCI_GLI_9750_PM_CTRL 0xFC #define PCI_GLI_9750_PM_STATE GENMASK(1, 0)
+#define PCI_GLI_9750_CORRERR_MASK 0x214 +#define PCI_GLI_9750_CORRERR_MASK_REPLAY_TIMER_TIMEOUT BIT(12) + #define SDHCI_GLI_9750_CFG2 0x848 #define SDHCI_GLI_9750_CFG2_L1DLY GENMASK(28, 24) #define GLI_9750_CFG2_L1DLY_VALUE 0x1F @@ -446,6 +449,11 @@ static void gl9750_hw_setting(struct sdh value &= ~PCI_GLI_9750_PM_STATE; pci_write_config_dword(pdev, PCI_GLI_9750_PM_CTRL, value);
+ /* mask the replay timer timeout of AER */ + pci_read_config_dword(pdev, PCI_GLI_9750_CORRERR_MASK, &value); + value |= PCI_GLI_9750_CORRERR_MASK_REPLAY_TIMER_TIMEOUT; + pci_write_config_dword(pdev, PCI_GLI_9750_CORRERR_MASK, value); + gl9750_wt_off(host); }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bryan O'Donoghue bryan.odonoghue@linaro.org
commit 7405116519ad70b8c7340359bfac8db8279e7ce4 upstream.
We need to make sure camss_configure_pd() happens before camss_register_entities() as the vfe_get() path relies on the pointer provided by camss_configure_pd().
Fix the ordering sequence in probe to ensure the pointers vfe_get() demands are present by the time camss_register_entities() runs.
In order to facilitate backporting to stable kernels I've moved the configure_pd() call pretty early on the probe() function so that irrespective of the existence of the old error handling jump labels this patch should still apply to -next circa Aug 2023 to v5.13 inclusive.
Fixes: 2f6f8af67203 ("media: camss: Refactor VFE power domain toggling") Cc: stable@vger.kernel.org Signed-off-by: Bryan O'Donoghue bryan.odonoghue@linaro.org Reviewed-by: Laurent Pinchart laurent.pinchart@ideasonboard.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/platform/qcom/camss/camss.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
--- a/drivers/media/platform/qcom/camss/camss.c +++ b/drivers/media/platform/qcom/camss/camss.c @@ -1369,6 +1369,12 @@ static int camss_probe(struct platform_d goto err_cleanup; }
+ ret = camss_configure_pd(camss); + if (ret < 0) { + dev_err(dev, "Failed to configure power domains: %d\n", ret); + goto err_cleanup; + } + ret = camss_init_subdevices(camss); if (ret < 0) goto err_cleanup; @@ -1421,12 +1427,6 @@ static int camss_probe(struct platform_d } }
- ret = camss_configure_pd(camss); - if (ret < 0) { - dev_err(dev, "Failed to configure power domains: %d\n", ret); - return ret; - } - pm_runtime_enable(dev);
return 0;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bryan O'Donoghue bryan.odonoghue@linaro.org
commit 26bda3da00c3edef727a6acb00ed2eb4b22f8723 upstream.
Right now it is possible to do a vfe_get() with the internal reference count at 1. If vfe_check_clock_rates() returns non-zero then we will leave the reference count as-is and
run: - pm_runtime_put_sync() - vfe->ops->pm_domain_off()
skip: - camss_disable_clocks()
Subsequent vfe_put() calls will when the ref-count is non-zero unconditionally run:
- pm_runtime_put_sync() - vfe->ops->pm_domain_off() - camss_disable_clocks()
vfe_get() should not attempt to roll-back on error when the ref-count is non-zero as the upper layers will still do their own vfe_put() operations.
vfe_put() will drop the reference count and do the necessary power domain release, the cleanup jumps in vfe_get() should only be run when the ref-count is zero.
[ 50.095796] CPU: 7 PID: 3075 Comm: cam Not tainted 6.3.2+ #80 [ 50.095798] Hardware name: LENOVO 21BXCTO1WW/21BXCTO1WW, BIOS N3HET82W (1.54 ) 05/26/2023 [ 50.095799] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 50.095802] pc : refcount_warn_saturate+0xf4/0x148 [ 50.095804] lr : refcount_warn_saturate+0xf4/0x148 [ 50.095805] sp : ffff80000c7cb8b0 [ 50.095806] x29: ffff80000c7cb8b0 x28: ffff16ecc0e3fc10 x27: 0000000000000000 [ 50.095810] x26: 0000000000000000 x25: 0000000000020802 x24: 0000000000000000 [ 50.095813] x23: ffff16ecc7360640 x22: 00000000ffffffff x21: 0000000000000005 [ 50.095815] x20: ffff16ed175f4400 x19: ffffb4d9852942a8 x18: ffffffffffffffff [ 50.095818] x17: ffffb4d9852d4a48 x16: ffffb4d983da5db8 x15: ffff80000c7cb320 [ 50.095821] x14: 0000000000000001 x13: 2e656572662d7265 x12: 7466612d65737520 [ 50.095823] x11: 00000000ffffefff x10: ffffb4d9850cebf0 x9 : ffffb4d9835cf954 [ 50.095826] x8 : 0000000000017fe8 x7 : c0000000ffffefff x6 : 0000000000057fa8 [ 50.095829] x5 : ffff16f813fe3d08 x4 : 0000000000000000 x3 : ffff621e8f4d2000 [ 50.095832] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff16ed32119040 [ 50.095835] Call trace: [ 50.095836] refcount_warn_saturate+0xf4/0x148 [ 50.095838] device_link_put_kref+0x84/0xc8 [ 50.095843] device_link_del+0x38/0x58 [ 50.095846] vfe_pm_domain_off+0x3c/0x50 [qcom_camss] [ 50.095860] vfe_put+0x114/0x140 [qcom_camss] [ 50.095869] csid_set_power+0x2c8/0x408 [qcom_camss] [ 50.095878] pipeline_pm_power_one+0x164/0x170 [videodev] [ 50.095896] pipeline_pm_power+0xc4/0x110 [videodev] [ 50.095909] v4l2_pipeline_pm_use+0x5c/0xa0 [videodev] [ 50.095923] v4l2_pipeline_pm_get+0x1c/0x30 [videodev] [ 50.095937] video_open+0x7c/0x100 [qcom_camss] [ 50.095945] v4l2_open+0x84/0x130 [videodev] [ 50.095960] chrdev_open+0xc8/0x250 [ 50.095964] do_dentry_open+0x1bc/0x498 [ 50.095966] vfs_open+0x34/0x40 [ 50.095968] path_openat+0xb44/0xf20 [ 50.095971] do_filp_open+0xa4/0x160 [ 50.095974] do_sys_openat2+0xc8/0x188 [ 50.095975] __arm64_sys_openat+0x6c/0xb8 [ 50.095977] invoke_syscall+0x50/0x128 [ 50.095982] el0_svc_common.constprop.0+0x4c/0x100 [ 50.095985] do_el0_svc+0x40/0xa8 [ 50.095988] el0_svc+0x2c/0x88 [ 50.095991] el0t_64_sync_handler+0xf4/0x120 [ 50.095994] el0t_64_sync+0x190/0x198 [ 50.095996] ---[ end trace 0000000000000000 ]---
Fixes: 779096916dae ("media: camss: vfe: Fix runtime PM imbalance on error") Cc: stable@vger.kernel.org Signed-off-by: Bryan O'Donoghue bryan.odonoghue@linaro.org Reviewed-by: Laurent Pinchart laurent.pinchart@ideasonboard.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/platform/qcom/camss/camss-vfe.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/media/platform/qcom/camss/camss-vfe.c +++ b/drivers/media/platform/qcom/camss/camss-vfe.c @@ -607,7 +607,7 @@ static int vfe_get(struct vfe_device *vf } else { ret = vfe_check_clock_rates(vfe); if (ret < 0) - goto error_pm_runtime_get; + goto error_pm_domain; } vfe->power_count++;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bryan O'Donoghue bryan.odonoghue@linaro.org
commit 3143ad282fc08bf995ee73e32a9e40c527bf265d upstream.
There are two problems with the current vfe_disable_output() routine.
Firstly we rightly use a spinlock to protect output->gen2.active_num everywhere except for in the IDLE timeout path of vfe_disable_output(). Even if that is not racy "in practice" somehow it is by happenstance not by design.
Secondly we do not get consistent behaviour from this routine. On sc8280xp 50% of the time I get "VFE idle timeout - resetting". In this case the subsequent capture will succeed. The other 50% of the time, we don't hit the idle timeout, never do the VFE reset and subsequent captures stall indefinitely.
Rewrite the vfe_disable_output() routine to
- Quiesce write masters with vfe_wm_stop() - Set active_num = 0
remembering to hold the spinlock when we do so followed by
- Reset the VFE
Testing on sc8280xp and sdm845 shows this to be a valid fix.
Fixes: 7319cdf189bb ("media: camss: Add support for VFE hardware version Titan 170") Cc: stable@vger.kernel.org Signed-off-by: Bryan O'Donoghue bryan.odonoghue@linaro.org Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/platform/qcom/camss/camss-vfe-170.c | 22 +++------------------- 1 file changed, 3 insertions(+), 19 deletions(-)
--- a/drivers/media/platform/qcom/camss/camss-vfe-170.c +++ b/drivers/media/platform/qcom/camss/camss-vfe-170.c @@ -7,7 +7,6 @@ * Copyright (C) 2020-2021 Linaro Ltd. */
-#include <linux/delay.h> #include <linux/interrupt.h> #include <linux/io.h> #include <linux/iopoll.h> @@ -498,35 +497,20 @@ static int vfe_enable_output(struct vfe_ return 0; }
-static int vfe_disable_output(struct vfe_line *line) +static void vfe_disable_output(struct vfe_line *line) { struct vfe_device *vfe = to_vfe(line); struct vfe_output *output = &line->output; unsigned long flags; unsigned int i; - bool done; - int timeout = 0; - - do { - spin_lock_irqsave(&vfe->output_lock, flags); - done = !output->gen2.active_num; - spin_unlock_irqrestore(&vfe->output_lock, flags); - usleep_range(10000, 20000); - - if (timeout++ == 100) { - dev_err(vfe->camss->dev, "VFE idle timeout - resetting\n"); - vfe_reset(vfe); - output->gen2.active_num = 0; - return 0; - } - } while (!done);
spin_lock_irqsave(&vfe->output_lock, flags); for (i = 0; i < output->wm_num; i++) vfe_wm_stop(vfe, output->wm_idx[i]); + output->gen2.active_num = 0; spin_unlock_irqrestore(&vfe->output_lock, flags);
- return 0; + vfe_reset(vfe); }
/*
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bryan O'Donoghue bryan.odonoghue@linaro.org
commit b6e1bdca463a932c1ac02caa7d3e14bf39288e0c upstream.
check_clock doesn't account for vfe_lite which means that vfe_lite will never get validated by this routine. Add the clock name to the expected set to remediate.
Fixes: 7319cdf189bb ("media: camss: Add support for VFE hardware version Titan 170") Cc: stable@vger.kernel.org Signed-off-by: Bryan O'Donoghue bryan.odonoghue@linaro.org Reviewed-by: Laurent Pinchart laurent.pinchart@ideasonboard.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/platform/qcom/camss/camss-vfe.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/media/platform/qcom/camss/camss-vfe.c +++ b/drivers/media/platform/qcom/camss/camss-vfe.c @@ -533,7 +533,8 @@ static int vfe_check_clock_rates(struct struct camss_clock *clock = &vfe->clock[i];
if (!strcmp(clock->name, "vfe0") || - !strcmp(clock->name, "vfe1")) { + !strcmp(clock->name, "vfe1") || + !strcmp(clock->name, "vfe_lite")) { u64 min_rate = 0; unsigned long rate;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Heiner Kallweit hkallweit1@gmail.com
commit 6a26310273c323380da21eb23fcfd50e31140913 upstream.
This reverts commit efa5f1311c4998e9e6317c52bc5ee93b3a0f36df.
I couldn't reproduce the reported issue. What I did, based on a pcap packet log provided by the reporter: - Used same chip version (RTL8168h) - Set MAC address to the one used on the reporters system - Replayed the EAPOL unicast packet that, according to the reporter, was filtered out by the mc filter. The packet was properly received.
Therefore the root cause of the reported issue seems to be somewhere else. Disabling mc filtering completely for the most common chip version is a quite big hammer. Therefore revert the change and wait for further analysis results from the reporter.
Cc: stable@vger.kernel.org Signed-off-by: Heiner Kallweit hkallweit1@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/realtek/r8169_main.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
--- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -2553,9 +2553,7 @@ static void rtl_set_rx_mode(struct net_d rx_mode &= ~AcceptMulticast; } else if (netdev_mc_count(dev) > MC_FILTER_LIMIT || dev->flags & IFF_ALLMULTI || - tp->mac_version == RTL_GIGA_MAC_VER_35 || - tp->mac_version == RTL_GIGA_MAC_VER_46 || - tp->mac_version == RTL_GIGA_MAC_VER_48) { + tp->mac_version == RTL_GIGA_MAC_VER_35) { /* accept all multicasts */ } else if (netdev_mc_empty(dev)) { rx_mode &= ~AcceptMulticast;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Max Kellermann max.kellermann@ionos.com
commit 484fd6c1de13b336806a967908a927cc0356e312 upstream.
The function ext4_init_acl() calls posix_acl_create() which is responsible for applying the umask. But without CONFIG_EXT4_FS_POSIX_ACL, ext4_init_acl() is an empty inline function, and nobody applies the umask.
This fixes a bug which causes the umask to be ignored with O_TMPFILE on ext4:
https://github.com/MusicPlayerDaemon/MPD/issues/558 https://bugs.gentoo.org/show_bug.cgi?id=686142#c3 https://bugzilla.kernel.org/show_bug.cgi?id=203625
Reviewed-by: "J. Bruce Fields" bfields@redhat.com Cc: stable@vger.kernel.org Signed-off-by: Max Kellermann max.kellermann@ionos.com Link: https://lore.kernel.org/r/20230919081824.1096619-1-max.kellermann@ionos.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/acl.h | 5 +++++ 1 file changed, 5 insertions(+)
--- a/fs/ext4/acl.h +++ b/fs/ext4/acl.h @@ -68,6 +68,11 @@ extern int ext4_init_acl(handle_t *, str static inline int ext4_init_acl(handle_t *handle, struct inode *inode, struct inode *dir) { + /* usually, the umask is applied by posix_acl_create(), but if + ext4 ACL support is disabled at compile time, we need to do + it here, because posix_acl_create() will never be called */ + inode->i_mode &= ~current_umask(); + return 0; } #endif /* CONFIG_EXT4_FS_POSIX_ACL */
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kemeng Shi shikemeng@huaweicloud.com
commit 31f13421c004a420c0e9d288859c9ea9259ea0cc upstream.
Commit 0aeaa2559d6d5 ("ext4: fix corruption when online resizing a 1K bigalloc fs") found that primary superblock's offset in its group is not equal to offset of backup superblock in its group when block size is 1K and bigalloc is enabled. As group descriptor blocks are right after superblock, we can't pass block number of gdb to update_backups for the same reason.
The root casue of the issue above is that leading 1K padding block is count as data block offset for primary block while backup block has no padding block offset in its group.
Remove padding data block count to fix the issue for gdb backups.
For meta_bg case, update_backups treat blk_off as block number, do no conversion in this case.
Signed-off-by: Kemeng Shi shikemeng@huaweicloud.com Reviewed-by: Theodore Ts'o tytso@mit.edu Link: https://lore.kernel.org/r/20230826174712.4059355-2-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/resize.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/fs/ext4/resize.c +++ b/fs/ext4/resize.c @@ -1555,6 +1555,8 @@ exit_journal: int gdb_num_end = ((group + flex_gd->count - 1) / EXT4_DESC_PER_BLOCK(sb)); int meta_bg = ext4_has_feature_meta_bg(sb); + sector_t padding_blocks = meta_bg ? 0 : sbi->s_sbh->b_blocknr - + ext4_group_first_block_no(sb, 0); sector_t old_gdb = 0;
update_backups(sb, ext4_group_first_block_no(sb, 0), @@ -1566,8 +1568,8 @@ exit_journal: gdb_num); if (old_gdb == gdb_bh->b_blocknr) continue; - update_backups(sb, gdb_bh->b_blocknr, gdb_bh->b_data, - gdb_bh->b_size, meta_bg); + update_backups(sb, gdb_bh->b_blocknr - padding_blocks, + gdb_bh->b_data, gdb_bh->b_size, meta_bg); old_gdb = gdb_bh->b_blocknr; } }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kemeng Shi shikemeng@huaweicloud.com
commit 48f1551592c54f7d8e2befc72a99ff4e47f7dca0 upstream.
Avoid to ignore error in "err".
Signed-off-by: Kemeng Shi shikemeng@huaweicloud.com Link: https://lore.kernel.org/r/20230826174712.4059355-4-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/resize.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
--- a/fs/ext4/resize.c +++ b/fs/ext4/resize.c @@ -1938,9 +1938,7 @@ static int ext4_convert_meta_bg(struct s
errout: ret = ext4_journal_stop(handle); - if (!err) - err = ret; - return ret; + return err ? err : ret;
invalid_resize_inode: ext4_error(sb, "corrupted/inconsistent resize inode");
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhang Yi yi.zhang@huawei.com
commit 40ea98396a3659062267d1fe5f99af4f7e4f05e3 upstream.
When big allocate feature is enabled, we need to count and update reserved clusters before removing a delayed only extent_status entry. {init|count|get}_rsvd() have already done this, but the start block number of this counting isn't correct in the following case.
lblk end | | v v ------------------------- | | orig_es ------------------------- ^ ^ len1 is 0 | len2 |
If the start block of the orig_es entry founded is bigger than lblk, we passed lblk as start block to count_rsvd(), but the length is correct, finally, the range to be counted is offset. This patch fix this by passing the start blocks to 'orig_es->lblk + len1'.
Signed-off-by: Zhang Yi yi.zhang@huawei.com Cc: stable@kernel.org Link: https://lore.kernel.org/r/20230824092619.1327976-2-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o tytso@mit.edu Reviewed-by: Jan Kara jack@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/extents_status.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/ext4/extents_status.c +++ b/fs/ext4/extents_status.c @@ -1366,8 +1366,8 @@ retry: } } if (count_reserved) - count_rsvd(inode, lblk, orig_es.es_len - len1 - len2, - &orig_es, &rc); + count_rsvd(inode, orig_es.es_lblk + len1, + orig_es.es_len - len1 - len2, &orig_es, &rc); goto out_get_reserved; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kemeng Shi shikemeng@huaweicloud.com
commit 40dd7953f4d606c280074f10d23046b6812708ce upstream.
Wrong check of gdb backup in meta bg as following: first_group is the first group of meta_bg which contains target group, so target group is always >= first_group. We check if target group has gdb backup by comparing first_group with [group + 1] and [group + EXT4_DESC_PER_BLOCK(sb) - 1]. As group >= first_group, then [group + N] is
first_group. So no copy of gdb backup in meta bg is done in
setup_new_flex_group_blocks.
No need to do gdb backup copy in meta bg from setup_new_flex_group_blocks as we always copy updated gdb block to backups at end of ext4_flex_group_add as following:
ext4_flex_group_add /* no gdb backup copy for meta bg any more */ setup_new_flex_group_blocks
/* update current group number */ ext4_update_super sbi->s_groups_count += flex_gd->count;
/* * if group in meta bg contains backup is added, the primary gdb block * of the meta bg will be copy to backup in new added group here. */ for (; gdb_num <= gdb_num_end; gdb_num++) update_backups(...)
In summary, we can remove wrong gdb backup copy code in setup_new_flex_group_blocks.
Signed-off-by: Kemeng Shi shikemeng@huaweicloud.com Reviewed-by: Theodore Ts'o tytso@mit.edu Link: https://lore.kernel.org/r/20230826174712.4059355-5-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/resize.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-)
--- a/fs/ext4/resize.c +++ b/fs/ext4/resize.c @@ -556,13 +556,8 @@ static int setup_new_flex_group_blocks(s if (meta_bg == 0 && !ext4_bg_has_super(sb, group)) goto handle_itb;
- if (meta_bg == 1) { - ext4_group_t first_group; - first_group = ext4_meta_bg_first_group(sb, group); - if (first_group != group + 1 && - first_group != group + EXT4_DESC_PER_BLOCK(sb) - 1) - goto handle_itb; - } + if (meta_bg == 1) + goto handle_itb;
block = start + ext4_bg_has_super(sb, group); /* Copy all of the GDT blocks into the backup in this group */
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kemeng Shi shikemeng@huaweicloud.com
commit 9adac8b01f4be28acd5838aade42b8daa4f0b642 upstream.
add missed brelse in update_backups
Signed-off-by: Kemeng Shi shikemeng@huaweicloud.com Reviewed-by: Theodore Ts'o tytso@mit.edu Link: https://lore.kernel.org/r/20230826174712.4059355-3-shikemeng@huaweicloud.com Signed-off-by: Theodore Ts'o tytso@mit.edu Cc: stable@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/resize.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/fs/ext4/resize.c +++ b/fs/ext4/resize.c @@ -1160,8 +1160,10 @@ static void update_backups(struct super_ ext4_group_first_block_no(sb, group)); BUFFER_TRACE(bh, "get_write_access"); if ((err = ext4_journal_get_write_access(handle, sb, bh, - EXT4_JTR_NONE))) + EXT4_JTR_NONE))) { + brelse(bh); break; + } lock_buffer(bh); memcpy(bh->b_data, data, size); if (rest)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
commit 91562895f8030cb9a0470b1db49de79346a69f91 upstream.
Gao Xiang has reported that on ext4 O_SYNC direct IO does not properly sync file size update and thus if we crash at unfortunate moment, the file can have smaller size although O_SYNC IO has reported successful completion. The problem happens because update of on-disk inode size is handled in ext4_dio_write_iter() *after* iomap_dio_rw() (and thus dio_complete() in particular) has returned and generic_file_sync() gets called by dio_complete(). Fix the problem by handling on-disk inode size update directly in our ->end_io completion handler.
References: https://lore.kernel.org/all/02d18236-26ef-09b0-90ad-030c4fe3ee20@linux.aliba... Reported-by: Gao Xiang hsiangkao@linux.alibaba.com CC: stable@vger.kernel.org Fixes: 378f32bab371 ("ext4: introduce direct I/O write using iomap infrastructure") Signed-off-by: Jan Kara jack@suse.cz Tested-by: Joseph Qi joseph.qi@linux.alibaba.com Reviewed-by: "Ritesh Harjani (IBM)" ritesh.list@gmail.com Link: https://lore.kernel.org/r/20231013121350.26872-1-jack@suse.cz Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/file.c | 153 ++++++++++++++++++++++++--------------------------------- 1 file changed, 65 insertions(+), 88 deletions(-)
--- a/fs/ext4/file.c +++ b/fs/ext4/file.c @@ -279,80 +279,38 @@ out: }
static ssize_t ext4_handle_inode_extension(struct inode *inode, loff_t offset, - ssize_t written, size_t count) + ssize_t count) { handle_t *handle; - bool truncate = false; - u8 blkbits = inode->i_blkbits; - ext4_lblk_t written_blk, end_blk; - int ret; - - /* - * Note that EXT4_I(inode)->i_disksize can get extended up to - * inode->i_size while the I/O was running due to writeback of delalloc - * blocks. But, the code in ext4_iomap_alloc() is careful to use - * zeroed/unwritten extents if this is possible; thus we won't leave - * uninitialized blocks in a file even if we didn't succeed in writing - * as much as we intended. - */ - WARN_ON_ONCE(i_size_read(inode) < EXT4_I(inode)->i_disksize); - if (offset + count <= EXT4_I(inode)->i_disksize) { - /* - * We need to ensure that the inode is removed from the orphan - * list if it has been added prematurely, due to writeback of - * delalloc blocks. - */ - if (!list_empty(&EXT4_I(inode)->i_orphan) && inode->i_nlink) { - handle = ext4_journal_start(inode, EXT4_HT_INODE, 2); - - if (IS_ERR(handle)) { - ext4_orphan_del(NULL, inode); - return PTR_ERR(handle); - } - - ext4_orphan_del(handle, inode); - ext4_journal_stop(handle); - } - - return written; - } - - if (written < 0) - goto truncate;
+ lockdep_assert_held_write(&inode->i_rwsem); handle = ext4_journal_start(inode, EXT4_HT_INODE, 2); - if (IS_ERR(handle)) { - written = PTR_ERR(handle); - goto truncate; - } + if (IS_ERR(handle)) + return PTR_ERR(handle);
- if (ext4_update_inode_size(inode, offset + written)) { - ret = ext4_mark_inode_dirty(handle, inode); + if (ext4_update_inode_size(inode, offset + count)) { + int ret = ext4_mark_inode_dirty(handle, inode); if (unlikely(ret)) { - written = ret; ext4_journal_stop(handle); - goto truncate; + return ret; } }
- /* - * We may need to truncate allocated but not written blocks beyond EOF. - */ - written_blk = ALIGN(offset + written, 1 << blkbits); - end_blk = ALIGN(offset + count, 1 << blkbits); - if (written_blk < end_blk && ext4_can_truncate(inode)) - truncate = true; - - /* - * Remove the inode from the orphan list if it has been extended and - * everything went OK. - */ - if (!truncate && inode->i_nlink) + if (inode->i_nlink) ext4_orphan_del(handle, inode); ext4_journal_stop(handle);
- if (truncate) { -truncate: + return count; +} + +/* + * Clean up the inode after DIO or DAX extending write has completed and the + * inode size has been updated using ext4_handle_inode_extension(). + */ +static void ext4_inode_extension_cleanup(struct inode *inode, ssize_t count) +{ + lockdep_assert_held_write(&inode->i_rwsem); + if (count < 0) { ext4_truncate_failed_write(inode); /* * If the truncate operation failed early, then the inode may @@ -361,9 +319,28 @@ truncate: */ if (inode->i_nlink) ext4_orphan_del(NULL, inode); + return; } + /* + * If i_disksize got extended due to writeback of delalloc blocks while + * the DIO was running we could fail to cleanup the orphan list in + * ext4_handle_inode_extension(). Do it now. + */ + if (!list_empty(&EXT4_I(inode)->i_orphan) && inode->i_nlink) { + handle_t *handle = ext4_journal_start(inode, EXT4_HT_INODE, 2);
- return written; + if (IS_ERR(handle)) { + /* + * The write has successfully completed. Not much to + * do with the error here so just cleanup the orphan + * list and hope for the best. + */ + ext4_orphan_del(NULL, inode); + return; + } + ext4_orphan_del(handle, inode); + ext4_journal_stop(handle); + } }
static int ext4_dio_write_end_io(struct kiocb *iocb, ssize_t size, @@ -372,31 +349,22 @@ static int ext4_dio_write_end_io(struct loff_t pos = iocb->ki_pos; struct inode *inode = file_inode(iocb->ki_filp);
+ if (!error && size && flags & IOMAP_DIO_UNWRITTEN) + error = ext4_convert_unwritten_extents(NULL, inode, pos, size); if (error) return error; - - if (size && flags & IOMAP_DIO_UNWRITTEN) { - error = ext4_convert_unwritten_extents(NULL, inode, pos, size); - if (error < 0) - return error; - } /* - * If we are extending the file, we have to update i_size here before - * page cache gets invalidated in iomap_dio_rw(). Otherwise racing - * buffered reads could zero out too much from page cache pages. Update - * of on-disk size will happen later in ext4_dio_write_iter() where - * we have enough information to also perform orphan list handling etc. - * Note that we perform all extending writes synchronously under - * i_rwsem held exclusively so i_size update is safe here in that case. - * If the write was not extending, we cannot see pos > i_size here - * because operations reducing i_size like truncate wait for all - * outstanding DIO before updating i_size. + * Note that EXT4_I(inode)->i_disksize can get extended up to + * inode->i_size while the I/O was running due to writeback of delalloc + * blocks. But the code in ext4_iomap_alloc() is careful to use + * zeroed/unwritten extents if this is possible; thus we won't leave + * uninitialized blocks in a file even if we didn't succeed in writing + * as much as we intended. */ - pos += size; - if (pos > i_size_read(inode)) - i_size_write(inode, pos); - - return 0; + WARN_ON_ONCE(i_size_read(inode) < READ_ONCE(EXT4_I(inode)->i_disksize)); + if (pos + size <= READ_ONCE(EXT4_I(inode)->i_disksize)) + return size; + return ext4_handle_inode_extension(inode, pos, size); }
static const struct iomap_dio_ops ext4_dio_write_ops = { @@ -572,9 +540,16 @@ static ssize_t ext4_dio_write_iter(struc 0); if (ret == -ENOTBLK) ret = 0; - - if (extend) - ret = ext4_handle_inode_extension(inode, offset, ret, count); + if (extend) { + /* + * We always perform extending DIO write synchronously so by + * now the IO is completed and ext4_handle_inode_extension() + * was called. Cleanup the inode in case of error or race with + * writeback of delalloc blocks. + */ + WARN_ON_ONCE(ret == -EIOCBQUEUED); + ext4_inode_extension_cleanup(inode, ret); + }
out: if (ilock_shared) @@ -655,8 +630,10 @@ ext4_dax_write_iter(struct kiocb *iocb,
ret = dax_iomap_rw(iocb, from, &ext4_iomap_ops);
- if (extend) - ret = ext4_handle_inode_extension(inode, offset, ret, count); + if (extend) { + ret = ext4_handle_inode_extension(inode, offset, ret); + ext4_inode_extension_cleanup(inode, ret); + } out: inode_unlock(inode); if (ret > 0)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bas Nieuwenhuizen bas@basnieuwenhuizen.nl
commit 08e9ebc75b5bcfec9d226f9e16bab2ab7b25a39a upstream.
The incoming strings might not be terminated by a newline or a 0.
(found while testing a program that just wrote the string itself, causing a crash)
Cc: stable@vger.kernel.org Fixes: e3933f26b657 ("drm/amd/pp: Add edit/commit/show OD clock/voltage support in sysfs") Signed-off-by: Bas Nieuwenhuizen bas@basnieuwenhuizen.nl Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/pm/amdgpu_pm.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/drivers/gpu/drm/amd/pm/amdgpu_pm.c +++ b/drivers/gpu/drm/amd/pm/amdgpu_pm.c @@ -807,7 +807,7 @@ static ssize_t amdgpu_set_pp_od_clk_volt if (adev->in_suspend && !adev->in_runpm) return -EPERM;
- if (count > 127) + if (count > 127 || count == 0) return -EINVAL;
if (*buf == 's') @@ -827,7 +827,8 @@ static ssize_t amdgpu_set_pp_od_clk_volt else return -EINVAL;
- memcpy(buf_cpy, buf, count+1); + memcpy(buf_cpy, buf, count); + buf_cpy[count] = 0;
tmp_str = buf_cpy;
@@ -844,6 +845,9 @@ static ssize_t amdgpu_set_pp_od_clk_volt return -EINVAL; parameter_size++;
+ if (!tmp_str) + break; + while (isspace(*tmp_str)) tmp_str++; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kunwu Chan chentao@kylinos.cn
commit 1a8e9bad6ef563c28ab0f8619628d5511be55431 upstream.
Fix smatch warning: drivers/gpu/drm/i915/gem/i915_gem_context.c:847 set_proto_ctx_sseu() warn: potential spectre issue 'pc->user_engines' [r] (local cap)
Fixes: d4433c7600f7 ("drm/i915/gem: Use the proto-context to handle create parameters (v5)") Cc: stable@vger.kernel.org # v5.15+ Signed-off-by: Kunwu Chan chentao@kylinos.cn Reviewed-by: Tvrtko Ursulin tvrtko.ursulin@linux.intel.com Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20231103110922.430122-1-tvrtko... (cherry picked from commit 27b086382c22efb7e0a16442f7bdc2e120108ef3) Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -642,6 +642,7 @@ static int set_proto_ctx_sseu(struct drm if (idx >= pc->num_user_engines) return -EINVAL;
+ idx = array_index_nospec(idx, pc->num_user_engines); pe = &pc->user_engines[idx];
/* Only render engine supports RPCS configuration. */
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Deucher alexander.deucher@amd.com
commit 432e664e7c98c243fab4c3c95bd463bea3aeed28 upstream.
The ATRM ACPI method is for fetching the dGPU vbios rom image on laptops and all-in-one systems. It should not be used for external add in cards. If the dGPU is thunderbolt connected, don't try ATRM.
v2: pci_is_thunderbolt_attached only works for Intel. Use pdev->external_facing instead. v3: dev_is_removable() seems to be what we want
Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2925 Reviewed-by: Mario Limonciello mario.limonciello@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bios.c @@ -29,6 +29,7 @@ #include "amdgpu.h" #include "atom.h"
+#include <linux/device.h> #include <linux/pci.h> #include <linux/slab.h> #include <linux/acpi.h> @@ -289,6 +290,10 @@ static bool amdgpu_atrm_get_bios(struct if (adev->flags & AMD_IS_APU) return false;
+ /* ATRM is for on-platform devices only */ + if (dev_is_removable(&adev->pdev->dev)) + return false; + while ((pdev = pci_get_class(PCI_CLASS_DISPLAY_VGA << 8, pdev)) != NULL) { dhandle = ACPI_HANDLE(&pdev->dev); if (!dhandle)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christian König christian.koenig@amd.com
commit 12f76050d8d4d10dab96333656b821bd4620d103 upstream.
We should not leak the pointer where we couldn't grab the reference on to the caller because it can be that the error handling still tries to put the reference then.
Signed-off-by: Christian König christian.koenig@amd.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_bo_list.c @@ -178,6 +178,7 @@ int amdgpu_bo_list_get(struct amdgpu_fpr }
rcu_read_unlock(); + *result = NULL; return -ENOENT; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lewis Huang lewis.huang@amd.com
commit 5911d02cac70d7fb52009fbd37423e63f8f6f9bc upstream.
[WHY] Flush command sent to DMCUB spends more time for execution on a dGPU than on an APU. This causes cursor lag when using high refresh rate mouses.
[HOW] 1. Change the DMCUB mailbox memory location from FB to inbox. 2. Only change windows memory to inbox.
Cc: Mario Limonciello mario.limonciello@amd.com Cc: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Reviewed-by: Nicholas Kazlauskas nicholas.kazlauskas@amd.com Acked-by: Alex Hung alex.hung@amd.com Signed-off-by: Lewis Huang lewis.huang@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 13 ++++---- drivers/gpu/drm/amd/display/dmub/dmub_srv.h | 22 +++++++++------ drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c | 32 ++++++++++++++++------ 3 files changed, 45 insertions(+), 22 deletions(-)
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -1911,7 +1911,7 @@ static int dm_dmub_sw_init(struct amdgpu struct dmub_srv_create_params create_params; struct dmub_srv_region_params region_params; struct dmub_srv_region_info region_info; - struct dmub_srv_fb_params fb_params; + struct dmub_srv_memory_params memory_params; struct dmub_srv_fb_info *fb_info; struct dmub_srv *dmub_srv; const struct dmcub_firmware_header_v1_0 *hdr; @@ -2021,6 +2021,7 @@ static int dm_dmub_sw_init(struct amdgpu adev->dm.dmub_fw->data + le32_to_cpu(hdr->header.ucode_array_offset_bytes) + PSP_HEADER_BYTES; + region_params.is_mailbox_in_inbox = false;
status = dmub_srv_calc_region_info(dmub_srv, ®ion_params, ®ion_info); @@ -2042,10 +2043,10 @@ static int dm_dmub_sw_init(struct amdgpu return r;
/* Rebase the regions on the framebuffer address. */ - memset(&fb_params, 0, sizeof(fb_params)); - fb_params.cpu_addr = adev->dm.dmub_bo_cpu_addr; - fb_params.gpu_addr = adev->dm.dmub_bo_gpu_addr; - fb_params.region_info = ®ion_info; + memset(&memory_params, 0, sizeof(memory_params)); + memory_params.cpu_fb_addr = adev->dm.dmub_bo_cpu_addr; + memory_params.gpu_fb_addr = adev->dm.dmub_bo_gpu_addr; + memory_params.region_info = ®ion_info;
adev->dm.dmub_fb_info = kzalloc(sizeof(*adev->dm.dmub_fb_info), GFP_KERNEL); @@ -2057,7 +2058,7 @@ static int dm_dmub_sw_init(struct amdgpu return -ENOMEM; }
- status = dmub_srv_calc_fb_info(dmub_srv, &fb_params, fb_info); + status = dmub_srv_calc_mem_info(dmub_srv, &memory_params, fb_info); if (status != DMUB_STATUS_OK) { DRM_ERROR("Error calculating DMUB FB info: %d\n", status); return -EINVAL; --- a/drivers/gpu/drm/amd/display/dmub/dmub_srv.h +++ b/drivers/gpu/drm/amd/display/dmub/dmub_srv.h @@ -166,6 +166,7 @@ struct dmub_srv_region_params { uint32_t vbios_size; const uint8_t *fw_inst_const; const uint8_t *fw_bss_data; + bool is_mailbox_in_inbox; };
/** @@ -185,20 +186,25 @@ struct dmub_srv_region_params { */ struct dmub_srv_region_info { uint32_t fb_size; + uint32_t inbox_size; uint8_t num_regions; struct dmub_region regions[DMUB_WINDOW_TOTAL]; };
/** - * struct dmub_srv_fb_params - parameters used for driver fb setup + * struct dmub_srv_memory_params - parameters used for driver fb setup * @region_info: region info calculated by dmub service - * @cpu_addr: base cpu address for the framebuffer - * @gpu_addr: base gpu virtual address for the framebuffer + * @cpu_fb_addr: base cpu address for the framebuffer + * @cpu_inbox_addr: base cpu address for the gart + * @gpu_fb_addr: base gpu virtual address for the framebuffer + * @gpu_inbox_addr: base gpu virtual address for the gart */ -struct dmub_srv_fb_params { +struct dmub_srv_memory_params { const struct dmub_srv_region_info *region_info; - void *cpu_addr; - uint64_t gpu_addr; + void *cpu_fb_addr; + void *cpu_inbox_addr; + uint64_t gpu_fb_addr; + uint64_t gpu_inbox_addr; };
/** @@ -496,8 +502,8 @@ dmub_srv_calc_region_info(struct dmub_sr * DMUB_STATUS_OK - success * DMUB_STATUS_INVALID - unspecified error */ -enum dmub_status dmub_srv_calc_fb_info(struct dmub_srv *dmub, - const struct dmub_srv_fb_params *params, +enum dmub_status dmub_srv_calc_mem_info(struct dmub_srv *dmub, + const struct dmub_srv_memory_params *params, struct dmub_srv_fb_info *out);
/** --- a/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c +++ b/drivers/gpu/drm/amd/display/dmub/src/dmub_srv.c @@ -318,7 +318,7 @@ dmub_srv_calc_region_info(struct dmub_sr uint32_t fw_state_size = DMUB_FW_STATE_SIZE; uint32_t trace_buffer_size = DMUB_TRACE_BUFFER_SIZE; uint32_t scratch_mem_size = DMUB_SCRATCH_MEM_SIZE; - + uint32_t previous_top = 0; if (!dmub->sw_init) return DMUB_STATUS_INVALID;
@@ -343,8 +343,15 @@ dmub_srv_calc_region_info(struct dmub_sr bios->base = dmub_align(stack->top, 256); bios->top = bios->base + params->vbios_size;
- mail->base = dmub_align(bios->top, 256); - mail->top = mail->base + DMUB_MAILBOX_SIZE; + if (params->is_mailbox_in_inbox) { + mail->base = 0; + mail->top = mail->base + DMUB_MAILBOX_SIZE; + previous_top = bios->top; + } else { + mail->base = dmub_align(bios->top, 256); + mail->top = mail->base + DMUB_MAILBOX_SIZE; + previous_top = mail->top; + }
fw_info = dmub_get_fw_meta_info(params);
@@ -363,7 +370,7 @@ dmub_srv_calc_region_info(struct dmub_sr dmub->fw_version = fw_info->fw_version; }
- trace_buff->base = dmub_align(mail->top, 256); + trace_buff->base = dmub_align(previous_top, 256); trace_buff->top = trace_buff->base + dmub_align(trace_buffer_size, 64);
fw_state->base = dmub_align(trace_buff->top, 256); @@ -374,11 +381,14 @@ dmub_srv_calc_region_info(struct dmub_sr
out->fb_size = dmub_align(scratch_mem->top, 4096);
+ if (params->is_mailbox_in_inbox) + out->inbox_size = dmub_align(mail->top, 4096); + return DMUB_STATUS_OK; }
-enum dmub_status dmub_srv_calc_fb_info(struct dmub_srv *dmub, - const struct dmub_srv_fb_params *params, +enum dmub_status dmub_srv_calc_mem_info(struct dmub_srv *dmub, + const struct dmub_srv_memory_params *params, struct dmub_srv_fb_info *out) { uint8_t *cpu_base; @@ -393,8 +403,8 @@ enum dmub_status dmub_srv_calc_fb_info(s if (params->region_info->num_regions != DMUB_NUM_WINDOWS) return DMUB_STATUS_INVALID;
- cpu_base = (uint8_t *)params->cpu_addr; - gpu_base = params->gpu_addr; + cpu_base = (uint8_t *)params->cpu_fb_addr; + gpu_base = params->gpu_fb_addr;
for (i = 0; i < DMUB_NUM_WINDOWS; ++i) { const struct dmub_region *reg = @@ -402,6 +412,12 @@ enum dmub_status dmub_srv_calc_fb_info(s
out->fb[i].cpu_addr = cpu_base + reg->base; out->fb[i].gpu_addr = gpu_base + reg->base; + + if (i == DMUB_WINDOW_4_MAILBOX && params->cpu_inbox_addr != 0) { + out->fb[i].cpu_addr = (uint8_t *)params->cpu_inbox_addr + reg->base; + out->fb[i].gpu_addr = params->gpu_inbox_addr + reg->base; + } + out->fb[i].size = reg->top - reg->base; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jens Axboe axboe@kernel.dk
commit 7644b1a1c9a7ae8ab99175989bfc8676055edb46 upstream.
We could race with SQ thread exit, and if we do, we'll hit a NULL pointer dereference when the thread is cleared. Grab the SQPOLL data lock before attempting to get the task cpu and pid for fdinfo, this ensures we have a stable view of it.
Cc: stable@vger.kernel.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=218032 Reviewed-by: Gabriel Krisman Bertazi krisman@suse.de Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: He Gao hegao@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- io_uring/io_uring.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-)
--- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -10411,7 +10411,7 @@ static int io_uring_show_cred(struct seq
static void __io_uring_show_fdinfo(struct io_ring_ctx *ctx, struct seq_file *m) { - struct io_sq_data *sq = NULL; + int sq_pid = -1, sq_cpu = -1; bool has_lock; int i;
@@ -10424,13 +10424,19 @@ static void __io_uring_show_fdinfo(struc has_lock = mutex_trylock(&ctx->uring_lock);
if (has_lock && (ctx->flags & IORING_SETUP_SQPOLL)) { - sq = ctx->sq_data; - if (!sq->thread) - sq = NULL; + struct io_sq_data *sq = ctx->sq_data; + + if (mutex_trylock(&sq->lock)) { + if (sq->thread) { + sq_pid = task_pid_nr(sq->thread); + sq_cpu = task_cpu(sq->thread); + } + mutex_unlock(&sq->lock); + } }
- seq_printf(m, "SqThread:\t%d\n", sq ? task_pid_nr(sq->thread) : -1); - seq_printf(m, "SqThreadCpu:\t%d\n", sq ? task_cpu(sq->thread) : -1); + seq_printf(m, "SqThread:\t%d\n", sq_pid); + seq_printf(m, "SqThreadCpu:\t%d\n", sq_cpu); seq_printf(m, "UserFiles:\t%u\n", ctx->nr_user_files); for (i = 0; has_lock && i < ctx->nr_user_files; i++) { struct file *f = io_file_from_index(ctx, i);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Ellerman mpe@ellerman.id.au
commit feea65a338e52297b68ceb688eaf0ffc50310a83 upstream.
As reported by Mahesh & Aneesh, opal_prd_msg_notifier() triggers a FORTIFY_SOURCE warning:
memcpy: detected field-spanning write (size 32) of single field "&item->msg" at arch/powerpc/platforms/powernv/opal-prd.c:355 (size 4) WARNING: CPU: 9 PID: 660 at arch/powerpc/platforms/powernv/opal-prd.c:355 opal_prd_msg_notifier+0x174/0x188 [opal_prd] NIP opal_prd_msg_notifier+0x174/0x188 [opal_prd] LR opal_prd_msg_notifier+0x170/0x188 [opal_prd] Call Trace: opal_prd_msg_notifier+0x170/0x188 [opal_prd] (unreliable) notifier_call_chain+0xc0/0x1b0 atomic_notifier_call_chain+0x2c/0x40 opal_message_notify+0xf4/0x2c0
This happens because the copy is targeting item->msg, which is only 4 bytes in size, even though the enclosing item was allocated with extra space following the msg.
To fix the warning define struct opal_prd_msg with a union of the header and a flex array, and have the memcpy target the flex array.
Reported-by: "Aneesh Kumar K.V" aneesh.kumar@linux.ibm.com Reported-by: Mahesh Salgaonkar mahesh@linux.ibm.com Tested-by: Mahesh Salgaonkar mahesh@linux.ibm.com Reviewed-by: Mahesh Salgaonkar mahesh@linux.ibm.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://msgid.link/20230821142820.497107-1-mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/platforms/powernv/opal-prd.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-)
--- a/arch/powerpc/platforms/powernv/opal-prd.c +++ b/arch/powerpc/platforms/powernv/opal-prd.c @@ -24,13 +24,20 @@ #include <linux/uaccess.h>
+struct opal_prd_msg { + union { + struct opal_prd_msg_header header; + DECLARE_FLEX_ARRAY(u8, data); + }; +}; + /* * The msg member must be at the end of the struct, as it's followed by the * message data. */ struct opal_prd_msg_queue_item { - struct list_head list; - struct opal_prd_msg_header msg; + struct list_head list; + struct opal_prd_msg msg; };
static struct device_node *prd_node; @@ -156,7 +163,7 @@ static ssize_t opal_prd_read(struct file int rc;
/* we need at least a header's worth of data */ - if (count < sizeof(item->msg)) + if (count < sizeof(item->msg.header)) return -EINVAL;
if (*ppos) @@ -186,7 +193,7 @@ static ssize_t opal_prd_read(struct file return -EINTR; }
- size = be16_to_cpu(item->msg.size); + size = be16_to_cpu(item->msg.header.size); if (size > count) { err = -EINVAL; goto err_requeue; @@ -352,7 +359,7 @@ static int opal_prd_msg_notifier(struct if (!item) return -ENOMEM;
- memcpy(&item->msg, msg->params, msg_size); + memcpy(&item->msg.data, msg->params, msg_size);
spin_lock_irqsave(&opal_prd_msg_queue_lock, flags); list_add_tail(&item->list, &opal_prd_msg_queue);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt (Google) rostedt@goodmis.org
commit bb32500fb9b78215e4ef6ee8b4345c5f5d7eafb4 upstream.
The following can crash the kernel:
# cd /sys/kernel/tracing # echo 'p:sched schedule' > kprobe_events # exec 5>>events/kprobes/sched/enable # > kprobe_events # exec 5>&-
The above commands:
1. Change directory to the tracefs directory 2. Create a kprobe event (doesn't matter what one) 3. Open bash file descriptor 5 on the enable file of the kprobe event 4. Delete the kprobe event (removes the files too) 5. Close the bash file descriptor 5
The above causes a crash!
BUG: kernel NULL pointer dereference, address: 0000000000000028 #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP PTI CPU: 6 PID: 877 Comm: bash Not tainted 6.5.0-rc4-test-00008-g2c6b6b1029d4-dirty #186 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014 RIP: 0010:tracing_release_file_tr+0xc/0x50
What happens here is that the kprobe event creates a trace_event_file "file" descriptor that represents the file in tracefs to the event. It maintains state of the event (is it enabled for the given instance?). Opening the "enable" file gets a reference to the event "file" descriptor via the open file descriptor. When the kprobe event is deleted, the file is also deleted from the tracefs system which also frees the event "file" descriptor.
But as the tracefs file is still opened by user space, it will not be totally removed until the final dput() is called on it. But this is not true with the event "file" descriptor that is already freed. If the user does a write to or simply closes the file descriptor it will reference the event "file" descriptor that was just freed, causing a use-after-free bug.
To solve this, add a ref count to the event "file" descriptor as well as a new flag called "FREED". The "file" will not be freed until the last reference is released. But the FREE flag will be set when the event is removed to prevent any more modifications to that event from happening, even if there's still a reference to the event "file" descriptor.
Link: https://lore.kernel.org/linux-trace-kernel/20231031000031.1e705592@gandalf.l... Link: https://lore.kernel.org/linux-trace-kernel/20231031122453.7a48b923@gandalf.l...
Cc: stable@vger.kernel.org Cc: Mark Rutland mark.rutland@arm.com Fixes: f5ca233e2e66d ("tracing: Increase trace array ref count on enable and filter files") Reported-by: Beau Belgrave beaub@linux.microsoft.com Tested-by: Beau Belgrave beaub@linux.microsoft.com Reviewed-by: Masami Hiramatsu (Google) mhiramat@kernel.org Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/trace_events.h | 4 +++ kernel/trace/trace.c | 15 ++++++++++++ kernel/trace/trace.h | 3 ++ kernel/trace/trace_events.c | 43 ++++++++++++++++++++++++------------- kernel/trace/trace_events_filter.c | 3 ++ 5 files changed, 53 insertions(+), 15 deletions(-)
--- a/include/linux/trace_events.h +++ b/include/linux/trace_events.h @@ -468,6 +468,7 @@ enum { EVENT_FILE_FL_TRIGGER_COND_BIT, EVENT_FILE_FL_PID_FILTER_BIT, EVENT_FILE_FL_WAS_ENABLED_BIT, + EVENT_FILE_FL_FREED_BIT, };
extern struct trace_event_file *trace_get_event_file(const char *instance, @@ -606,6 +607,7 @@ extern int __kprobe_event_add_fields(str * TRIGGER_COND - When set, one or more triggers has an associated filter * PID_FILTER - When set, the event is filtered based on pid * WAS_ENABLED - Set when enabled to know to clear trace on module removal + * FREED - File descriptor is freed, all fields should be considered invalid */ enum { EVENT_FILE_FL_ENABLED = (1 << EVENT_FILE_FL_ENABLED_BIT), @@ -619,6 +621,7 @@ enum { EVENT_FILE_FL_TRIGGER_COND = (1 << EVENT_FILE_FL_TRIGGER_COND_BIT), EVENT_FILE_FL_PID_FILTER = (1 << EVENT_FILE_FL_PID_FILTER_BIT), EVENT_FILE_FL_WAS_ENABLED = (1 << EVENT_FILE_FL_WAS_ENABLED_BIT), + EVENT_FILE_FL_FREED = (1 << EVENT_FILE_FL_FREED_BIT), };
struct trace_event_file { @@ -647,6 +650,7 @@ struct trace_event_file { * caching and such. Which is mostly OK ;-) */ unsigned long flags; + atomic_t ref; /* ref count for opened files */ atomic_t sm_ref; /* soft-mode reference counter */ atomic_t tm_ref; /* trigger-mode reference counter */ }; --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -4900,6 +4900,20 @@ int tracing_open_file_tr(struct inode *i if (ret) return ret;
+ mutex_lock(&event_mutex); + + /* Fail if the file is marked for removal */ + if (file->flags & EVENT_FILE_FL_FREED) { + trace_array_put(file->tr); + ret = -ENODEV; + } else { + event_file_get(file); + } + + mutex_unlock(&event_mutex); + if (ret) + return ret; + filp->private_data = inode->i_private;
return 0; @@ -4910,6 +4924,7 @@ int tracing_release_file_tr(struct inode struct trace_event_file *file = inode->i_private;
trace_array_put(file->tr); + event_file_put(file);
return 0; } --- a/kernel/trace/trace.h +++ b/kernel/trace/trace.h @@ -1620,6 +1620,9 @@ extern int register_event_command(struct extern int unregister_event_command(struct event_command *cmd); extern int register_trigger_hist_enable_disable_cmds(void);
+extern void event_file_get(struct trace_event_file *file); +extern void event_file_put(struct trace_event_file *file); + /** * struct event_trigger_ops - callbacks for trace event triggers * --- a/kernel/trace/trace_events.c +++ b/kernel/trace/trace_events.c @@ -969,26 +969,38 @@ static void remove_subsystem(struct trac } }
-static void remove_event_file_dir(struct trace_event_file *file) +void event_file_get(struct trace_event_file *file) { - struct dentry *dir = file->dir; - struct dentry *child; + atomic_inc(&file->ref); +}
- if (dir) { - spin_lock(&dir->d_lock); /* probably unneeded */ - list_for_each_entry(child, &dir->d_subdirs, d_child) { - if (d_really_is_positive(child)) /* probably unneeded */ - d_inode(child)->i_private = NULL; - } - spin_unlock(&dir->d_lock); +void event_file_put(struct trace_event_file *file) +{ + if (WARN_ON_ONCE(!atomic_read(&file->ref))) { + if (file->flags & EVENT_FILE_FL_FREED) + kmem_cache_free(file_cachep, file); + return; + }
- tracefs_remove(dir); + if (atomic_dec_and_test(&file->ref)) { + /* Count should only go to zero when it is freed */ + if (WARN_ON_ONCE(!(file->flags & EVENT_FILE_FL_FREED))) + return; + kmem_cache_free(file_cachep, file); } +} + +static void remove_event_file_dir(struct trace_event_file *file) +{ + struct dentry *dir = file->dir; + + tracefs_remove(dir);
list_del(&file->list); remove_subsystem(file->system); free_event_filter(file->filter); - kmem_cache_free(file_cachep, file); + file->flags |= EVENT_FILE_FL_FREED; + event_file_put(file); }
/* @@ -1361,7 +1373,7 @@ event_enable_read(struct file *filp, cha flags = file->flags; mutex_unlock(&event_mutex);
- if (!file) + if (!file || flags & EVENT_FILE_FL_FREED) return -ENODEV;
if (flags & EVENT_FILE_FL_ENABLED && @@ -1399,7 +1411,7 @@ event_enable_write(struct file *filp, co ret = -ENODEV; mutex_lock(&event_mutex); file = event_file_data(filp); - if (likely(file)) + if (likely(file && !(file->flags & EVENT_FILE_FL_FREED))) ret = ftrace_event_enable_disable(file, val); mutex_unlock(&event_mutex); break; @@ -1668,7 +1680,7 @@ event_filter_read(struct file *filp, cha
mutex_lock(&event_mutex); file = event_file_data(filp); - if (file) + if (file && !(file->flags & EVENT_FILE_FL_FREED)) print_event_filter(file, s); mutex_unlock(&event_mutex);
@@ -2784,6 +2796,7 @@ trace_create_new_event(struct trace_even atomic_set(&file->tm_ref, 0); INIT_LIST_HEAD(&file->triggers); list_add(&file->list, &tr->events); + event_file_get(file);
return file; } --- a/kernel/trace/trace_events_filter.c +++ b/kernel/trace/trace_events_filter.c @@ -1872,6 +1872,9 @@ int apply_event_filter(struct trace_even struct event_filter *filter = NULL; int err;
+ if (file->flags & EVENT_FILE_FL_FREED) + return -ENODEV; + if (!strcmp(strstrip(filter_string), "0")) { filter_disable(file); filter = event_filter(file);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vicki Pfau vi@endrift.com
commit 1999a6b12a3b5c8953fc9ec74863ebc75a1b851d upstream.
This adds support for the Turtle Beach REACT-R and Recon Xbox controllers
Signed-off-by: Vicki Pfau vi@endrift.com Link: https://lore.kernel.org/r/20230225012147.276489-4-vi@endrift.com Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/input/joystick/xpad.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/input/joystick/xpad.c +++ b/drivers/input/joystick/xpad.c @@ -449,6 +449,7 @@ static const struct usb_device_id xpad_t XPAD_XBOX360_VENDOR(0x0f0d), /* Hori Controllers */ XPAD_XBOXONE_VENDOR(0x0f0d), /* Hori Controllers */ XPAD_XBOX360_VENDOR(0x1038), /* SteelSeries Controllers */ + XPAD_XBOXONE_VENDOR(0x10f5), /* Turtle Beach Controllers */ XPAD_XBOX360_VENDOR(0x11c9), /* Nacon GC100XF */ XPAD_XBOX360_VENDOR(0x11ff), /* PXN V900 */ XPAD_XBOX360_VENDOR(0x1209), /* Ardwiino Controllers */
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Saravana Kannan saravanak@google.com
commit 2e84dc37920012b458e9458b19fc4ed33f81bc74 upstream.
This commit fixes a bug in commit 9ed9895370ae ("driver core: Functional dependencies tracking support") where the device link status was incorrectly updated in the driver unbind path before all the device's resources were released.
Fixes: 9ed9895370ae ("driver core: Functional dependencies tracking support") Cc: stable stable@kernel.org Reported-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Closes: https://lore.kernel.org/all/20231014161721.f4iqyroddkcyoefo@pengutronix.de/ Signed-off-by: Saravana Kannan saravanak@google.com Cc: Thierry Reding thierry.reding@gmail.com Cc: Yang Yingliang yangyingliang@huawei.com Cc: Andy Shevchenko andriy.shevchenko@linux.intel.com Cc: Mark Brown broonie@kernel.org Cc: Matti Vaittinen mazziesaccount@gmail.com Cc: James Clark james.clark@arm.com Acked-by: "Rafael J. Wysocki" rafael@kernel.org Tested-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Acked-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Link: https://lore.kernel.org/r/20231018013851.3303928-1-saravanak@google.com Signed-off-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/base/dd.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -1228,8 +1228,6 @@ static void __device_release_driver(stru else if (drv->remove) drv->remove(dev);
- device_links_driver_cleanup(dev); - devres_release_all(dev); arch_teardown_dma_ops(dev); kfree(dev->dma_range_map); @@ -1241,6 +1239,8 @@ static void __device_release_driver(stru pm_runtime_reinit(dev); dev_pm_set_driver_flags(dev, 0);
+ device_links_driver_cleanup(dev); + klist_remove(&dev->p->knode_driver); device_pm_check_callbacks(dev); if (dev->bus)
Hello!
On 24/11/23 11:50 a. m., Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.140 release. There are 297 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 26 Nov 2023 17:19:17 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.140-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
There are problems with PA-RISC:
-----8<----- /builds/linux/drivers/parisc/power.c:201:34: warning: 'struct sys_off_data' declared inside parameter list will not be visible outside of this definition or declaration 201 | static int qemu_power_off(struct sys_off_data *data) | ^~~~~~~~~~~~ /builds/linux/drivers/parisc/power.c: In function 'qemu_power_off': /builds/linux/drivers/parisc/power.c:204:43: error: invalid use of undefined type 'struct sys_off_data' 204 | gsc_writel(0, (unsigned long) data->cb_data); | ^~ /builds/linux/drivers/parisc/power.c: In function 'power_init': /builds/linux/drivers/parisc/power.c:239:17: error: implicit declaration of function 'register_sys_off_handler'; did you mean 'register_restart_handler'? [-Werror=implicit-function-declaration] 239 | register_sys_off_handler(SYS_OFF_MODE_POWER_OFF, SYS_OFF_PRIO_DEFAULT, | ^~~~~~~~~~~~~~~~~~~~~~~~ | register_restart_handler /builds/linux/drivers/parisc/power.c:239:42: error: 'SYS_OFF_MODE_POWER_OFF' undeclared (first use in this function); did you mean 'SYSTEM_POWER_OFF'? 239 | register_sys_off_handler(SYS_OFF_MODE_POWER_OFF, SYS_OFF_PRIO_DEFAULT, | ^~~~~~~~~~~~~~~~~~~~~~ | SYSTEM_POWER_OFF /builds/linux/drivers/parisc/power.c:239:42: note: each undeclared identifier is reported only once for each function it appears in /builds/linux/drivers/parisc/power.c:239:66: error: 'SYS_OFF_PRIO_DEFAULT' undeclared (first use in this function) 239 | register_sys_off_handler(SYS_OFF_MODE_POWER_OFF, SYS_OFF_PRIO_DEFAULT, | ^~~~~~~~~~~~~~~~~~~~ cc1: some warnings being treated as errors make[3]: *** [/builds/linux/scripts/Makefile.build:289: drivers/parisc/power.o] Error 1 ----->8-----
That's allnoconfig with GCC 11; defconfig and tinyconfig fail just like that.
Bisection points to:
commit 065a7d0b92c0f1ef4160e2129d835eb6093cc675 Author: Helge Deller deller@gmx.de Date: Tue Oct 17 22:19:53 2023 +0200
parisc/power: Add power soft-off when running on qemu
commit d0c219472980d15f5cbc5c8aec736848bda3f235 upstream.
Then there's this failure on System/390:
-----8<----- /builds/linux/arch/s390/mm/page-states.c: In function 'mark_kernel_pgd': /builds/linux/arch/s390/mm/page-states.c:175:38: error: request for member 'val' in something not a structure or union max_addr = (S390_lowcore.kernel_asce.val & _ASCE_TYPE_MASK) >> 2; ^ In file included from /builds/linux/arch/s390/include/asm/page.h:186, from /builds/linux/arch/s390/include/asm/thread_info.h:26, from /builds/linux/include/linux/thread_info.h:60, from /builds/linux/arch/s390/include/asm/preempt.h:6, from /builds/linux/include/linux/preempt.h:78, from /builds/linux/include/linux/spinlock.h:55, from /builds/linux/include/linux/mmzone.h:8, from /builds/linux/include/linux/gfp.h:6, from /builds/linux/include/linux/mm.h:10, from /builds/linux/arch/s390/mm/page-states.c:13: /builds/linux/arch/s390/mm/page-states.c: In function 'cmma_init_nodat': /builds/linux/arch/s390/mm/page-states.c:204:23: error: 'invalid_pg_dir' undeclared (first use in this function); did you mean 'is_valid_bugaddr'? page = virt_to_page(&invalid_pg_dir); ^~~~~~~~~~~~~~ /builds/linux/include/asm-generic/memory_model.h:25:40: note: in definition of macro '__pfn_to_page' #define __pfn_to_page(pfn) (vmemmap + (pfn)) ^~~ /builds/linux/arch/s390/include/asm/page.h:176:29: note: in expansion of macro 'phys_to_pfn' #define virt_to_pfn(kaddr) (phys_to_pfn(__pa(kaddr))) ^~~~~~~~~~~ /builds/linux/arch/s390/include/asm/page.h:176:41: note: in expansion of macro '__pa' #define virt_to_pfn(kaddr) (phys_to_pfn(__pa(kaddr))) ^~~~ /builds/linux/arch/s390/include/asm/page.h:179:41: note: in expansion of macro 'virt_to_pfn' #define virt_to_page(kaddr) pfn_to_page(virt_to_pfn(kaddr)) ^~~~~~~~~~~ /builds/linux/arch/s390/mm/page-states.c:204:9: note: in expansion of macro 'virt_to_page' page = virt_to_page(&invalid_pg_dir); ^~~~~~~~~~~~ /builds/linux/arch/s390/mm/page-states.c:204:23: note: each undeclared identifier is reported only once for each function it appears in page = virt_to_page(&invalid_pg_dir); ^~~~~~~~~~~~~~ /builds/linux/include/asm-generic/memory_model.h:25:40: note: in definition of macro '__pfn_to_page' #define __pfn_to_page(pfn) (vmemmap + (pfn)) ^~~ /builds/linux/arch/s390/include/asm/page.h:176:29: note: in expansion of macro 'phys_to_pfn' #define virt_to_pfn(kaddr) (phys_to_pfn(__pa(kaddr))) ^~~~~~~~~~~ /builds/linux/arch/s390/include/asm/page.h:176:41: note: in expansion of macro '__pa' #define virt_to_pfn(kaddr) (phys_to_pfn(__pa(kaddr))) ^~~~ /builds/linux/arch/s390/include/asm/page.h:179:41: note: in expansion of macro 'virt_to_pfn' #define virt_to_page(kaddr) pfn_to_page(virt_to_pfn(kaddr)) ^~~~~~~~~~~ /builds/linux/arch/s390/mm/page-states.c:204:9: note: in expansion of macro 'virt_to_page' page = virt_to_page(&invalid_pg_dir); ^~~~~~~~~~~~ make[3]: *** [/builds/linux/scripts/Makefile.build:289: arch/s390/mm/page-states.o] Error 1 ----->8-----
Bisection in this case points to:
commit d247caa47712c9cc36f25ec744f3b5dba90c3334 Author: Heiko Carstens hca@linux.ibm.com Date: Tue Oct 17 21:07:03 2023 +0200
s390/cmma: fix initial kernel address space page table walk
commit 16ba44826a04834d3eeeda4b731c2ea3481062b7 upstream.
Reproducers:
tuxmake --runtime podman --target-arch parisc --toolchain gcc-11 --kconfig allnoconfig
and
tuxmake --runtime podman --target-arch s390 --toolchain gcc-8 --kconfig allnoconfig
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
Greetings!
Daniel Díaz daniel.diaz@linaro.org
On 11/25/23 00:21, Daniel Díaz wrote:
On 24/11/23 11:50 a. m., Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.140 release.
... There are problems with PA-RISC:
-----8<----- /builds/linux/drivers/parisc/power.c:201:34: warning: 'struct sys_off_data' declared inside parameter list will not be visible outside of this definition or declaration ...
Bisection points to:
commit 065a7d0b92c0f1ef4160e2129d835eb6093cc675 Author: Helge Deller deller@gmx.de Date: Tue Oct 17 22:19:53 2023 +0200
parisc/power: Add power soft-off when running on qemu commit d0c219472980d15f5cbc5c8aec736848bda3f235 upstream.
Right.
I already asked Greg to drop two patches from all queues where kernel < 6.0:
- parisc-power-add-power-soft-off-when-running-on-qemu.patch - parisc-power-fix-power-soft-off-when-running-on-qemu.patch
Helge
Hello!
On 24/11/23 11:50 a. m., Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.140 release. There are 297 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 26 Nov 2023 17:19:17 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.140-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
We are noticing a regression with ltp-syscalls' preadv03:
-----8<----- preadv03 preadv03 preadv03_64 preadv03_64 preadv03.c:102: TINFO: Using block size 512 preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'a' expectedly preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'a' expectedly preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'b' expectedly preadv03.c:102: TINFO: Using block size 512 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:66: TFAIL: preadv(O_DIRECT) read 0 bytes, expected 512 preadv03.c:102: TINFO: Using block size 512 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:66: TFAIL: preadv(O_DIRECT) read 0 bytes, expected 512 preadv03.c:102: TINFO: Using block size 512 preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'a' expectedly preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'a' expectedly preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'b' expectedly preadv03.c:102: TINFO: Using block size 512 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:66: TFAIL: preadv(O_DIRECT) read 0 bytes, expected 512 preadv03.c:102: TINFO: Using block size 512 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:66: TFAIL: preadv(O_DIRECT) read 0 bytes, expected 512 ----->8-----
This is seen in the following environments: * dragonboard-845c * juno-64k_page_size * qemu-arm64 * qemu-armv7 * qemu-i386 * qemu-x86_64 * x86_64-clang
and on the following RC's: * v5.10.202-rc1 * v5.15.140-rc1 * v6.1.64-rc1
(Note that the list might not be complete, because some branches failed to execute completely due to build issues reported elsewhere.)
Bisection in linux-5.15.y pointed to:
commit db85c7fff122c14bc5755e47b51fbfafae660235 Author: Jan Kara jack@suse.cz Date: Fri Oct 13 14:13:50 2023 +0200
ext4: properly sync file size update after O_SYNC direct IO
commit 91562895f8030cb9a0470b1db49de79346a69f91 upstream.
Reverting that commit made the test pass.
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
Greetings!
Daniel Díaz daniel.diaz@linaro.org
On Fri, Nov 24, 2023 at 11:45:09PM -0600, Daniel Díaz wrote:
Hello!
On 24/11/23 11:50 a. m., Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.140 release. There are 297 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 26 Nov 2023 17:19:17 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.140-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
We are noticing a regression with ltp-syscalls' preadv03:
-----8<----- preadv03 preadv03 preadv03_64 preadv03_64 preadv03.c:102: TINFO: Using block size 512 preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'a' expectedly preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'a' expectedly preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'b' expectedly preadv03.c:102: TINFO: Using block size 512 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:66: TFAIL: preadv(O_DIRECT) read 0 bytes, expected 512 preadv03.c:102: TINFO: Using block size 512 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:66: TFAIL: preadv(O_DIRECT) read 0 bytes, expected 512 preadv03.c:102: TINFO: Using block size 512 preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'a' expectedly preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'a' expectedly preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'b' expectedly preadv03.c:102: TINFO: Using block size 512 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:66: TFAIL: preadv(O_DIRECT) read 0 bytes, expected 512 preadv03.c:102: TINFO: Using block size 512 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:66: TFAIL: preadv(O_DIRECT) read 0 bytes, expected 512 ----->8-----
This is seen in the following environments:
- dragonboard-845c
- juno-64k_page_size
- qemu-arm64
- qemu-armv7
- qemu-i386
- qemu-x86_64
- x86_64-clang
and on the following RC's:
- v5.10.202-rc1
- v5.15.140-rc1
- v6.1.64-rc1
(Note that the list might not be complete, because some branches failed to execute completely due to build issues reported elsewhere.)
Bisection in linux-5.15.y pointed to:
commit db85c7fff122c14bc5755e47b51fbfafae660235 Author: Jan Kara jack@suse.cz Date: Fri Oct 13 14:13:50 2023 +0200
ext4: properly sync file size update after O_SYNC direct IO commit 91562895f8030cb9a0470b1db49de79346a69f91 upstream.
Reverting that commit made the test pass.
Odd. I'll go drop that from 5.10.y and 5.15.y now, thanks.
greg k-h
Hello!
On Fri 24-11-23 23:45:09, Daniel Díaz wrote:
On 24/11/23 11:50 a. m., Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.140 release. There are 297 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 26 Nov 2023 17:19:17 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.140-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
We are noticing a regression with ltp-syscalls' preadv03:
Thanks for report!
-----8<----- preadv03 preadv03 preadv03_64 preadv03_64 preadv03.c:102: TINFO: Using block size 512 preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'a' expectedly preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'a' expectedly preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'b' expectedly preadv03.c:102: TINFO: Using block size 512 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:66: TFAIL: preadv(O_DIRECT) read 0 bytes, expected 512 preadv03.c:102: TINFO: Using block size 512 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:66: TFAIL: preadv(O_DIRECT) read 0 bytes, expected 512 preadv03.c:102: TINFO: Using block size 512 preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'a' expectedly preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'a' expectedly preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'b' expectedly preadv03.c:102: TINFO: Using block size 512 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:66: TFAIL: preadv(O_DIRECT) read 0 bytes, expected 512 preadv03.c:102: TINFO: Using block size 512 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:66: TFAIL: preadv(O_DIRECT) read 0 bytes, expected 512 ----->8-----
This is seen in the following environments:
- dragonboard-845c
- juno-64k_page_size
- qemu-arm64
- qemu-armv7
- qemu-i386
- qemu-x86_64
- x86_64-clang
and on the following RC's:
- v5.10.202-rc1
- v5.15.140-rc1
- v6.1.64-rc1
Hum, even in 6.1? That's odd. Can you please test whether current upstream vanilla kernel works for you with this test? Thanks!
Honza
Hello!
On Mon, 27 Nov 2023 at 09:56, Jan Kara jack@suse.cz wrote:
Hello!
On Fri 24-11-23 23:45:09, Daniel Díaz wrote:
On 24/11/23 11:50 a. m., Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.140 release. There are 297 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 26 Nov 2023 17:19:17 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.140-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
We are noticing a regression with ltp-syscalls' preadv03:
Thanks for report!
-----8<----- preadv03 preadv03 preadv03_64 preadv03_64 preadv03.c:102: TINFO: Using block size 512 preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'a' expectedly preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'a' expectedly preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'b' expectedly preadv03.c:102: TINFO: Using block size 512 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:66: TFAIL: preadv(O_DIRECT) read 0 bytes, expected 512 preadv03.c:102: TINFO: Using block size 512 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:66: TFAIL: preadv(O_DIRECT) read 0 bytes, expected 512 preadv03.c:102: TINFO: Using block size 512 preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'a' expectedly preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'a' expectedly preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'b' expectedly preadv03.c:102: TINFO: Using block size 512 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:66: TFAIL: preadv(O_DIRECT) read 0 bytes, expected 512 preadv03.c:102: TINFO: Using block size 512 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:66: TFAIL: preadv(O_DIRECT) read 0 bytes, expected 512 ----->8-----
This is seen in the following environments:
- dragonboard-845c
- juno-64k_page_size
- qemu-arm64
- qemu-armv7
- qemu-i386
- qemu-x86_64
- x86_64-clang
and on the following RC's:
- v5.10.202-rc1
- v5.15.140-rc1
- v6.1.64-rc1
Hum, even in 6.1? That's odd. Can you please test whether current upstream vanilla kernel works for you with this test? Thanks!
Yes, this is working for us on mainline and next: https://qa-reports.linaro.org/lkft/linux-mainline-master/tests/ltp-syscalls/... https://qa-reports.linaro.org/lkft/linux-next-master/tests/ltp-syscalls/prea... c.fr. 6.1: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.1.y/tests/ltp-sys...
Greetings!
Daniel Díaz daniel.diaz@linaro.org
Hello!
On Mon 27-11-23 11:32:12, Daniel Díaz wrote:
On Mon, 27 Nov 2023 at 09:56, Jan Kara jack@suse.cz wrote:
On Fri 24-11-23 23:45:09, Daniel Díaz wrote:
On 24/11/23 11:50 a. m., Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.140 release. There are 297 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 26 Nov 2023 17:19:17 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.140-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
We are noticing a regression with ltp-syscalls' preadv03:
Thanks for report!
-----8<----- preadv03 preadv03 preadv03_64 preadv03_64 preadv03.c:102: TINFO: Using block size 512 preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'a' expectedly preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'a' expectedly preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'b' expectedly preadv03.c:102: TINFO: Using block size 512 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:66: TFAIL: preadv(O_DIRECT) read 0 bytes, expected 512 preadv03.c:102: TINFO: Using block size 512 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:66: TFAIL: preadv(O_DIRECT) read 0 bytes, expected 512 preadv03.c:102: TINFO: Using block size 512 preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'a' expectedly preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'a' expectedly preadv03.c:87: TPASS: preadv(O_DIRECT) read 512 bytes successfully with content 'b' expectedly preadv03.c:102: TINFO: Using block size 512 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:66: TFAIL: preadv(O_DIRECT) read 0 bytes, expected 512 preadv03.c:102: TINFO: Using block size 512 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:77: TFAIL: Buffer wrong at 0 have 62 expected 61 preadv03.c:66: TFAIL: preadv(O_DIRECT) read 0 bytes, expected 512 ----->8-----
This is seen in the following environments:
- dragonboard-845c
- juno-64k_page_size
- qemu-arm64
- qemu-armv7
- qemu-i386
- qemu-x86_64
- x86_64-clang
and on the following RC's:
- v5.10.202-rc1
- v5.15.140-rc1
- v6.1.64-rc1
Hum, even in 6.1? That's odd. Can you please test whether current upstream vanilla kernel works for you with this test? Thanks!
Yes, this is working for us on mainline and next: https://qa-reports.linaro.org/lkft/linux-mainline-master/tests/ltp-syscalls/... https://qa-reports.linaro.org/lkft/linux-next-master/tests/ltp-syscalls/prea... c.fr. 6.1: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.1.y/tests/ltp-sys...
Greetings!
So I've got back to this and the failure is a subtle interaction between iomap code and ext4 code. In particular that fact that commit 936e114a245b6 ("iomap: update ki_pos a little later in iomap_dio_complete") is not in stable causes that file position is not updated after direct IO write and thus we direct IO writes are ending in wrong locations effectively corrupting data. The subtle detail is that before this commit if ->end_io handler returns non-zero value (which the new ext4 ->end_io handler does), file pos doesn't get updated, after this commit it doesn't get updated only if the return value is < 0.
The commit got merged in 6.5-rc1 so all stable kernels that have 91562895f803 ("ext4: properly sync file size update after O_SYNC direct IO") before 6.5 are corrupting data - I've noticed at least 6.1 is still carrying the problematic commit. Greg, please take out the commit from all stable kernels before 6.5 as soon as possible, we'll figure out proper backport once user data are not being corrupted anymore. Thanks!
Honza
On Tue, Dec 05, 2023 at 01:21:22PM +0100, Jan Kara wrote: [ ... ]
So I've got back to this and the failure is a subtle interaction between iomap code and ext4 code. In particular that fact that commit 936e114a245b6 ("iomap: update ki_pos a little later in iomap_dio_complete") is not in stable causes that file position is not updated after direct IO write and thus we direct IO writes are ending in wrong locations effectively corrupting data. The subtle detail is that before this commit if ->end_io handler returns non-zero value (which the new ext4 ->end_io handler does), file pos doesn't get updated, after this commit it doesn't get updated only if the return value is < 0.
The commit got merged in 6.5-rc1 so all stable kernels that have 91562895f803 ("ext4: properly sync file size update after O_SYNC direct IO") before 6.5 are corrupting data - I've noticed at least 6.1 is still carrying the problematic commit. Greg, please take out the commit from all stable kernels before 6.5 as soon as possible, we'll figure out proper backport once user data are not being corrupted anymore. Thanks!
Thanks a lot for the update.
Turns out this is causing a regression in chromeos-6.1, and reverting the offending patch fixes the problem. I suspect anyone running v6.1.64+ may have a problem.
Guenter
On Tue, Dec 05, 2023 at 09:55:08AM -0800, Guenter Roeck wrote:
On Tue, Dec 05, 2023 at 01:21:22PM +0100, Jan Kara wrote: [ ... ]
So I've got back to this and the failure is a subtle interaction between iomap code and ext4 code. In particular that fact that commit 936e114a245b6 ("iomap: update ki_pos a little later in iomap_dio_complete") is not in stable causes that file position is not updated after direct IO write and thus we direct IO writes are ending in wrong locations effectively corrupting data. The subtle detail is that before this commit if ->end_io handler returns non-zero value (which the new ext4 ->end_io handler does), file pos doesn't get updated, after this commit it doesn't get updated only if the return value is < 0.
The commit got merged in 6.5-rc1 so all stable kernels that have 91562895f803 ("ext4: properly sync file size update after O_SYNC direct IO") before 6.5 are corrupting data - I've noticed at least 6.1 is still carrying the problematic commit. Greg, please take out the commit from all stable kernels before 6.5 as soon as possible, we'll figure out proper backport once user data are not being corrupted anymore. Thanks!
Thanks a lot for the update.
Turns out this is causing a regression in chromeos-6.1, and reverting the offending patch fixes the problem. I suspect anyone running v6.1.64+ may have a problem.
Jan, thanks for the report, and Guenter, thanks for letting me know as well. I'll go queue up the fix now and push out new -rc releases.
greg k-h
Hi!
So I've got back to this and the failure is a subtle interaction between iomap code and ext4 code. In particular that fact that commit 936e114a245b6 ("iomap: update ki_pos a little later in iomap_dio_complete") is not in stable causes that file position is not updated after direct IO write and thus we direct IO writes are ending in wrong locations effectively corrupting data. The subtle detail is that before this commit if ->end_io handler returns non-zero value (which the new ext4 ->end_io handler does), file pos doesn't get updated, after this commit it doesn't get updated only if the return value is < 0.
The commit got merged in 6.5-rc1 so all stable kernels that have 91562895f803 ("ext4: properly sync file size update after O_SYNC direct IO") before 6.5 are corrupting data - I've noticed at least 6.1 is still carrying the problematic commit. Greg, please take out the commit from all stable kernels before 6.5 as soon as possible, we'll figure out proper backport once user data are not being corrupted anymore. Thanks!
Thanks a lot for the update.
Turns out this is causing a regression in chromeos-6.1, and reverting the offending patch fixes the problem. I suspect anyone running v6.1.64+ may have a problem.
Jan, thanks for the report, and Guenter, thanks for letting me know as well. I'll go queue up the fix now and push out new -rc releases.
Would someone have a brief summary here? I see 6.1.66 is out but I don't see any "Fixes: 91562895f803" tags.
Plus, what is the severity of this? It is "data being corrupted when using O_SYNC|O_DIRECT" or does metadata somehow get corrupted, too?
Thanks and best regards, Pavel
On Mon 11-12-23 09:28:40, Pavel Machek wrote:
So I've got back to this and the failure is a subtle interaction between iomap code and ext4 code. In particular that fact that commit 936e114a245b6 ("iomap: update ki_pos a little later in iomap_dio_complete") is not in stable causes that file position is not updated after direct IO write and thus we direct IO writes are ending in wrong locations effectively corrupting data. The subtle detail is that before this commit if ->end_io handler returns non-zero value (which the new ext4 ->end_io handler does), file pos doesn't get updated, after this commit it doesn't get updated only if the return value is < 0.
The commit got merged in 6.5-rc1 so all stable kernels that have 91562895f803 ("ext4: properly sync file size update after O_SYNC direct IO") before 6.5 are corrupting data - I've noticed at least 6.1 is still carrying the problematic commit. Greg, please take out the commit from all stable kernels before 6.5 as soon as possible, we'll figure out proper backport once user data are not being corrupted anymore. Thanks!
Thanks a lot for the update.
Turns out this is causing a regression in chromeos-6.1, and reverting the offending patch fixes the problem. I suspect anyone running v6.1.64+ may have a problem.
Jan, thanks for the report, and Guenter, thanks for letting me know as well. I'll go queue up the fix now and push out new -rc releases.
Would someone have a brief summary here? I see 6.1.66 is out but I don't see any "Fixes: 91562895f803" tags.
Plus, what is the severity of this? It is "data being corrupted when using O_SYNC|O_DIRECT" or does metadata somehow get corrupted, too?
It is pure data corruption happening for ext4 direct IO writes because they do not properly update current file position after the write.
Honza
linux-stable-mirror@lists.linaro.org