This is the start of the stable review cycle for the 6.12.54 release. There are 277 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 19 Oct 2025 14:50:59 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.12.54-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.12.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 6.12.54-rc1
Kai Vehmanen kai.vehmanen@linux.intel.com ASoC: SOF: ipc4-pcm: fix start offset calculation for chain DMA
Olga Kornievskaia okorniev@redhat.com nfsd: fix access checking for NLM under XPRTSEC policies
Olga Kornievskaia okorniev@redhat.com nfsd: fix __fh_verify for localio
Ian Rogers irogers@google.com perf test stat: Avoid hybrid assumption when virtualized
K Prateek Nayak kprateek.nayak@amd.com sched/fair: Block delayed tasks on throttled hierarchy during dequeue
Jan Kara jack@suse.cz writeback: Avoid excessively long inode switching times
Jan Kara jack@suse.cz writeback: Avoid softlockup when switching many inodes
Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp cramfs: Verify inode mode when loading from disk
Lichen Liu lichliu@redhat.com fs: Add 'initramfs_options' to set initramfs mount options
Oleg Nesterov oleg@redhat.com pid: make __task_pid_nr_ns(ns => NULL) safe for zombie callers
gaoxiang17 gaoxiang17@xiaomi.com pid: Add a judgment for ns null in pid_nr_ns
Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp minixfs: Verify inode mode when loading from disk
Miklos Szeredi mszeredi@redhat.com copy_file_range: limit size if in compat mode
Lucas Zampieri lzampier@redhat.com irqchip/sifive-plic: Avoid interrupt ID 0 handling during suspend/resume
Hongbo Li lihongbo22@huawei.com irqchip/sifive-plic: Make use of __assign_bit()
Ilya Leoshkevich iii@linux.ibm.com s390/bpf: Write back tail call counter for BPF_TRAMP_F_CALL_ORIG
Ilya Leoshkevich iii@linux.ibm.com s390/bpf: Write back tail call counter for BPF_PSEUDO_CALL
Ilya Leoshkevich iii@linux.ibm.com s390/bpf: Describe the frame using a struct instead of constants
Ilya Leoshkevich iii@linux.ibm.com s390/bpf: Centralize frame offset calculations
Lance Yang lance.yang@linux.dev mm/rmap: fix soft-dirty and uffd-wp bit loss when remapping zero-filled mTHP subpage to shared zeropage
Guenter Roeck linux@roeck-us.net ipmi: Fix handling of messages with provided receive message pointer
Corey Minyard corey@minyard.net ipmi: Rework user message limit handling
Matthieu Baerts (NGI0) matttbe@kernel.org mptcp: pm: in-kernel: usable client side with C-flag
Rafael J. Wysocki rafael.j.wysocki@intel.com ACPI: property: Do not pass NULL handles to acpi_attach_data()
Rafael J. Wysocki rafael.j.wysocki@intel.com ACPI: property: Add code comments explaining what is going on
Rafael J. Wysocki rafael.j.wysocki@intel.com ACPI: property: Disregard references in data-only subnode lists
Rafael J. Wysocki rafael.j.wysocki@intel.com ACPI: battery: Add synchronization between interface updates
Andy Shevchenko andriy.shevchenko@linux.intel.com ACPI: battery: Check for error code from devm_mutex_init() call
Thomas Weißschuh linux@weissschuh.net ACPI: battery: initialize mutexes through devm_ APIs
Thomas Weißschuh linux@weissschuh.net ACPI: battery: allocate driver data through devm_ APIs
Olga Kornievskaia okorniev@redhat.com nfsd: unregister with rpcbind when deleting a transport
NeilBrown neilb@suse.de nfsd: don't use sv_nrthreads in connection limiting calculations.
NeilBrown neilb@suse.de nfsd: refine and rename NFSD_MAY_LOCK
Chuck Lever chuck.lever@oracle.com NFSD: Replace use of NFSD_MAY_LOCK in nfsd4_lock()
Pali Rohár pali@kernel.org nfsd: Fix NFSD_MAY_BYPASS_GSS and NFSD_MAY_BYPASS_GSS_ON_ROOT
Sean Christopherson seanjc@google.com x86/kvm: Force legacy PCI hole to UC when overriding MTRRs for TDX/SNP
Kirill A. Shutemov kirill.shutemov@linux.intel.com x86/mtrr: Rename mtrr_overwrite_state() to guest_force_mtrr_state()
Catalin Marinas catalin.marinas@arm.com arm64: mte: Do not flag the zero page as PG_mte_tagged
Christian Brauner brauner@kernel.org statmount: don't call path_put() under namespace semaphore
Borislav Petkov (AMD) bp@alien8.de KVM: x86: Advertise SRSO_USER_KERNEL_NO to userspace
Rafael J. Wysocki rafael.j.wysocki@intel.com cpufreq: Make drivers using CPUFREQ_ETERNAL specify transition latency
Qu Wenruo wqu@suse.com btrfs: fix the incorrect max_bytes value for find_lock_delalloc_range()
Hans de Goede hansg@kernel.org mfd: intel_soc_pmic_chtdc_ti: Set use_single_read regmap_config flag
Andy Shevchenko andriy.shevchenko@linux.intel.com mfd: intel_soc_pmic_chtdc_ti: Drop unneeded assignment for cache_type
Hans de Goede hdegoede@redhat.com mfd: intel_soc_pmic_chtdc_ti: Fix invalid regmap-config max_register value
Kai Vehmanen kai.vehmanen@linux.intel.com ASoC: SOF: ipc4-pcm: fix delay calculation when DSP resamples
Peter Ujfalusi peter.ujfalusi@linux.intel.com ASoC: SOF: ipc4-pcm: Enable delay reporting for ChainDMA streams
Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com PCI: endpoint: pci-epf-test: Add NULL check for DMA channels before release
Wang Jiang jiangwang@kylinos.cn PCI: endpoint: Remove surplus return statement from pci_epf_test_clean_dma_chan()
Donet Tom donettom@linux.ibm.com mm/ksm: fix incorrect KSM counter handling in mm_struct during fork
Yuan Chen chenyuan@kylinos.cn tracing: Fix race condition in kprobe initialization causing NULL pointer dereference
Phillip Lougher phillip@squashfs.org.uk Squashfs: reject negative file sizes in squashfs_read_inode()
Phillip Lougher phillip@squashfs.org.uk Squashfs: add additional inode sanity checking
Edward Adam Davis eadavis@qq.com media: mc: Clear minor number before put device
Lance Yang lance.yang@linux.dev selftests/mm: skip soft-dirty tests when CONFIG_MEM_SOFT_DIRTY is disabled
Nathan Chancellor nathan@kernel.org lib/crypto/curve25519-hacl64: Disable KASAN with clang-17 and older
Jan Kara jack@suse.cz ext4: free orphan info with kvfree
Huacai Chen chenhuacai@kernel.org ACPICA: Allow to skip Global Lock initialization
Deepanshu Kartikey kartikey406@gmail.com ext4: validate ea_ino and size in check_xattrs
Ahmet Eray Karadag eraykrdg1@gmail.com ext4: guard against EA inode refcount underflow in xattr update
Zhang Yi yi.zhang@huawei.com ext4: fix an off-by-one issue during moving extents
Theodore Ts'o tytso@mit.edu ext4: avoid potential buffer over-read in parse_apply_sb_mount_options()
Ojaswin Mujoo ojaswin@linux.ibm.com ext4: correctly handle queries for metadata mappings
Yongjian Sun sunyongjian1@huawei.com ext4: increase i_disksize to offset + len in ext4_update_disksize_before_punch()
Jan Kara jack@suse.cz ext4: verify orphan file size is not too big
Baokun Li libaokun1@huawei.com ext4: add ext4_sb_bread_nofail() helper function for ext4_free_branches()
Olga Kornievskaia okorniev@redhat.com nfsd: nfserr_jukebox in nlm_fopen should lead to a retry
Thorsten Blum thorsten.blum@linux.dev NFSD: Fix destination buffer size in nfsd4_ssc_setup_dul()
SeongJae Park sj@kernel.org mm/damon/lru_sort: use param_ctx for damon_attrs staging
SeongJae Park sj@kernel.org mm/damon/vaddr: do not repeat pte_offset_map_lock() until success
Li RongQing lirongqing@baidu.com mm/hugetlb: early exit from hugetlb_pages_alloc_boot() when max_huge_pages=0
Thadeu Lima de Souza Cascardo cascardo@igalia.com mm/page_alloc: only set ALLOC_HIGHATOMIC for __GPF_HIGH allocations
Lance Yang lance.yang@linux.dev mm/thp: fix MTE tag mismatch when replacing zero-filled subpages
Nick Morrow morrownr@gmail.com wifi: mt76: mt7921u: Add VID/PID for Netgear A7500
Nick Morrow morrownr@gmail.com wifi: mt76: mt7925u: Add VID/PID for Netgear A9000
Muhammad Usama Anjum usama.anjum@collabora.com wifi: ath11k: HAL SRNG: don't deinitialize and re-initialize again
Suren Baghdasaryan surenb@google.com slab: mark slab->obj_exts allocation failures unconditionally
Suren Baghdasaryan surenb@google.com slab: prevent warnings when slab obj_exts vector allocation fails
Heiko Carstens hca@linux.ibm.com s390: Add -Wno-pointer-sign to KBUILD_CFLAGS_DECOMPRESSOR
Jaehoon Kim jhkim@linux.ibm.com s390/dasd: Return BLK_STS_INVAL for EINVAL from do_dasd_request
Jaehoon Kim jhkim@linux.ibm.com s390/dasd: enforce dma_alignment to ensure proper buffer validation
Matthieu Baerts (NGI0) matttbe@kernel.org selftests: mptcp: join: validate C-flag + def limit
Sean Christopherson seanjc@google.com x86/umip: Fix decoding of register forms of 0F 01 (SGDT and SIDT aliases)
Sean Christopherson seanjc@google.com x86/umip: Check that the instruction opcode is at least two bytes
Xin Li (Intel) xin@zytor.com x86/fred: Remove ENDBR64 from FRED entry points
Santhosh Kumar K s-k6@ti.com spi: cadence-quadspi: Fix cqspi_setup_flash()
Pratyush Yadav pratyush@kernel.org spi: cadence-quadspi: Flush posted register writes before DAC access
Pratyush Yadav pratyush@kernel.org spi: cadence-quadspi: Flush posted register writes before INDAC access
Niklas Cassel cassel@kernel.org PCI: tegra194: Reset BARs when running in PCIe endpoint mode
Vidya Sagar vidyas@nvidia.com PCI: tegra194: Handle errors in BPMP response
Niklas Cassel cassel@kernel.org PCI: tegra194: Fix broken tegra_pcie_ep_raise_msi_irq()
Marek Vasut marek.vasut+renesas@mailbox.org PCI: rcar-host: Convert struct rcar_msi mask_lock into raw spinlock
Marek Vasut marek.vasut+renesas@mailbox.org PCI: rcar-host: Drop PMSR spinlock
Marek Vasut marek.vasut+renesas@mailbox.org PCI: rcar-gen4: Fix PHY initialization
Siddharth Vadapalli s-vadapalli@ti.com PCI: keystone: Use devm_request_irq() to free "ks-pcie-error-irq" on exit
Siddharth Vadapalli s-vadapalli@ti.com PCI: j721e: Fix programming sequence of "strap" settings
Lukas Wunner lukas@wunner.de PCI/AER: Support errors introduced by PCIe r6.0
Niklas Schnelle schnelle@linux.ibm.com PCI/AER: Fix missing uevent on recovery when a reset is requested
Lukas Wunner lukas@wunner.de PCI/ERR: Fix uevent on failure to recover
Niklas Schnelle schnelle@linux.ibm.com PCI/IOV: Add PCI rescan-remove locking when enabling/disabling SR-IOV
Brian Norris briannorris@google.com PCI/sysfs: Ensure devices are powered for config reads
Marek Vasut marek.vasut+renesas@mailbox.org PCI: tegra: Convert struct tegra_msi mask_lock into raw spinlock
Jani Nurminen jani.nurminen@windriver.com PCI: xilinx-nwl: Fix ECAM programming
Sean Christopherson seanjc@google.com rseq/selftests: Use weak symbol reference, not definition, to link with glibc
Esben Haabendal esben@geanix.com rtc: interface: Fix long-standing race when setting alarm
Esben Haabendal esben@geanix.com rtc: interface: Ensure alarm irq is enabled when UIE is enabled
Zhen Ni zhen.ni@easystack.cn memory: samsung: exynos-srom: Fix of_iomap leak in exynos_srom_probe
Rex Chen rex.chen_1@nxp.com mmc: mmc_spi: multiple block read remove read crc ack
Rex Chen rex.chen_1@nxp.com mmc: core: SPI mode remove cmd7
Linus Walleij linus.walleij@linaro.org mtd: rawnand: fsmc: Default to autodetect buswidth
Alexander Lobakin aleksander.lobakin@intel.com xsk: Harden userspace-supplied xdp_desc validation
Miaoqian Lin linmq006@gmail.com xtensa: simdisk: add input size check in proc_write_simdisk
Ma Ke make24@iscas.ac.cn sparc: fix error handling in scan_one_device()
Anthony Yznaga anthony.yznaga@oracle.com sparc64: fix hugetlb for sun4u
Eric Biggers ebiggers@kernel.org sctp: Fix MAC comparison to be constant-time
Abinash Singh abinashsinghlalotra@gmail.com scsi: sd: Fix build warning in sd_revalidate_disk()
Thorsten Blum thorsten.blum@linux.dev scsi: hpsa: Fix potential memory leak in hpsa_big_passthru_ioctl()
Harshit Agarwal harshit@nutanix.com sched/deadline: Fix race in push_dl_task()
Corey Minyard corey@minyard.net Revert "ipmi: fix msg stack when IPMI is disconnected"
Jisheng Zhang jszhang@kernel.org pwm: berlin: Fix wrong register in suspend/resume
Nam Cao namcao@linutronix.de powerpc/pseries/msi: Fix potential underflow and leak issue
Nam Cao namcao@linutronix.de powerpc/powernv/pci: Fix underflow and leak issue
Dzmitry Sankouski dsankouski@gmail.com power: supply: max77976_charger: fix constant current reporting
Krzysztof Kozlowski krzysztof.kozlowski@linaro.org pinctrl: samsung: Drop unused S3C24xx driver data
Georg Gottleuber ggo@tuxedocomputers.com nvme-pci: Add TUXEDO IBS Gen8 to Samsung sleep quirk
John David Anglin dave.anglin@bell.net parisc: Remove spurious if statement from raw_copy_from_user()
Sam James sam@gentoo.org parisc: don't reference obsolete termio struct for TC* constants
Askar Safin safinaskar@zohomail.com openat2: don't trigger automounts with RESOLVE_NO_XDEV
Ma Ke make24@iscas.ac.cn of: unittest: Fix device reference count leak in of_unittest_pci_node_verify
Li Chen me@linux.beauty loop: fix backing file reference leak on validation error
Johan Hovold johan@kernel.org lib/genalloc: fix device leak in of_gen_pool_get()
Eric Biggers ebiggers@kernel.org KEYS: trusted_tpm1: Compare HMAC values in constant time
Oleg Nesterov oleg@redhat.com kernel/sys.c: fix the racy usage of task_lock(tsk->group_leader) in sys_prlimit64() paths
Lu Baolu baolu.lu@linux.intel.com iommu/vt-d: PRS isn't usable if PDS isn't supported
Sean Nyekjaer sean@geanix.com iio: imu: inv_icm42600: Drop redundant pm_runtime reinitialization in resume
Huacai Chen chenhuacai@kernel.org init: handle bootloader identifier in kernel parameters
Sean Anderson sean.anderson@linux.dev iio: xilinx-ams: Unmask interrupts after updating alarms
Sean Anderson sean.anderson@linux.dev iio: xilinx-ams: Fix AMS_ALARM_THR_DIRECT_MASK
Michael Hennerich michael.hennerich@analog.com iio: frequency: adf4350: Fix prescaler usage.
Qianfeng Rong rongqianfeng@vivo.com iio: dac: ad5421: use int type to store negative error codes
Qianfeng Rong rongqianfeng@vivo.com iio: dac: ad5360: use int type to store negative error codes
Aleksandar Gerasimovski aleksandar.gerasimovski@belden.com iio/adc/pac1934: fix channel disable configuration
Darrick J. Wong djwong@kernel.org fuse: fix livelock in synchronous file put from fuseblk workers
Miklos Szeredi mszeredi@redhat.com fuse: fix possibly missing fuse_copy_finish() call in fuse_notify()
Shashank A P shashank.ap@samsung.com fs: quota: create dedicated workqueue for quota_release_work
Haoxiang Li haoxiang_li2024@163.com fs/ntfs3: Fix a resource leak bug in wnd_extend()
Finn Thain fthain@linux-m68k.org fbdev: Fix logic error in "offb" name match
Nam Cao namcao@linutronix.de eventpoll: Replace rwlock with spinlock
Thomas Fourier fourier.thomas@gmail.com crypto: rockchip - Fix dma_unmap_sg() nents value
Thomas Fourier fourier.thomas@gmail.com crypto: atmel - Fix dma_unmap_sg() direction
Thomas Fourier fourier.thomas@gmail.com crypto: aspeed - Fix dma_unmap_sg() direction
Rafael J. Wysocki rafael.j.wysocki@intel.com cpufreq: intel_pstate: Fix object lifecycle issue in update_qos_request()
Simon Schuster schuster.simon@siemens-energy.com copy_sighand: Handle architectures where sizeof(unsigned long) < sizeof(u64)
Abel Vesa abel.vesa@linaro.org clk: qcom: tcsrcc-x1e80100: Set the bi_tcxo as parent to eDP refclk
Adam Xue zxue@semtech.com bus: mhi: host: Do not use uninitialized 'dev' pointer in mhi_init_irq_setup()
Sumit Kumar sumit.kumar@oss.qualcomm.com bus: mhi: ep: Fix chained transfer handling in read path
Anderson Nascimento anderson@allelesecurity.com btrfs: avoid potential out-of-bounds in btrfs_encode_fh()
Yu Kuai yukuai3@huawei.com blk-crypto: fix missing blktrace bio split events
Fangzhi Zuo Jerry.Zuo@amd.com drm/amd/display: Enable Dynamic DTBCLK Switch
Matthew Auld matthew.auld@intel.com drm/xe/uapi: loosen used tracking restriction
Shuhao Fu sfual@cse.ust.hk drm/nouveau: fix bad ret code in nouveau_bo_move_prep
Marek Vasut marek.vasut+renesas@mailbox.org drm/rcar-du: dsi: Fix 1/2/3 lane support
Jann Horn jannh@google.com drm/panthor: Fix memory leak in panthor_ioctl_group_create()
Ma Ke make24@iscas.ac.cn media: lirc: Fix error handling in lirc_register()
Jai Luthra jai.luthra@ideasonboard.com media: ti: j721e-csi2rx: Fix source subdev link creation
Jai Luthra jai.luthra@ideasonboard.com media: ti: j721e-csi2rx: Use devm_of_platform_populate
Hans Verkuil hverkuil+cisco@kernel.org media: vivid: fix disappearing <Vendor Command With ID> messages
Stephan Gerhold stephan.gerhold@linaro.org media: venus: firmware: Use correct reset sequence for IRIS2
Arnd Bergmann arnd@arndb.de media: s5p-mfc: remove an unused/uninitialized variable
David Lechner dlechner@baylibre.com media: pci: mg4b: fix uninitialized iio scan data
Thomas Fourier fourier.thomas@gmail.com media: pci: ivtv: Add missing check after DMA map
Laurent Pinchart laurent.pinchart@ideasonboard.com media: mc: Fix MUST_CONNECT handling for pads with no links
Qianfeng Rong rongqianfeng@vivo.com media: i2c: mt9v111: fix incorrect type for ret
Thomas Fourier fourier.thomas@gmail.com media: cx18: Add missing check after DMA map
Randy Dunlap rdunlap@infradead.org media: cec: extron-da-hd-4k-plus: drop external-module make commands
Johan Hovold johan@kernel.org firmware: meson_sm: fix device leak at probe
Jason Andryuk jason.andryuk@amd.com xen/events: Update virq_to_irq on migration
Jason Andryuk jason.andryuk@amd.com xen/events: Return -EEXIST for bound VIRQs
Lukas Wunner lukas@wunner.de xen/manage: Fix suspend error path
Jason Andryuk jason.andryuk@amd.com xen/events: Cleanup find_virq() return codes
Michael Riesch michael.riesch@collabora.com dt-bindings: phy: rockchip-inno-csi-dphy: make power-domains non-required
Robin Murphy robin.murphy@arm.com perf/arm-cmn: Fix CMN S3 DTM offset
Miaoqian Lin linmq006@gmail.com ARM: OMAP2+: pm33xx-core: ix device node reference leaks in amx3_idle_init
Alexander Sverdlin alexander.sverdlin@siemens.com ARM: AM33xx: Implement TI advisory 1.0.36 (EMU0/EMU1 pins state on reset)
Yang Shi yang@os.amperecomputing.com arm64: kprobes: call set_memory_rox() for kprobe page
Vibhore Vardhan vibhore@ti.com arm64: dts: ti: k3-am62a-main: Fix main padcfg length
Aleksandrs Vinarskis alex.vinarskis@gmail.com arm64: dts: qcom: x1e80100-pmics: Disable pm8010 by default
Stephan Gerhold stephan.gerhold@linaro.org arm64: dts: qcom: sdm845: Fix slimbam num-channels/ees
Stephan Gerhold stephan.gerhold@linaro.org arm64: dts: qcom: msm8939: Add missing MDSS reset
Stephan Gerhold stephan.gerhold@linaro.org arm64: dts: qcom: msm8916: Add missing MDSS reset
Amir Mohammad Jahangirzad a.jahangirzad@gmail.com ACPI: debug: fix signedness issues in read/write helpers
Daniel Tang danielzgtg.opensource@gmail.com ACPI: TAD: Add missing sysfs_remove_group() for ACPI_TAD_RT
Rafael J. Wysocki rafael.j.wysocki@intel.com ACPI: property: Fix buffer properties extraction for subnodes
Nathan Chancellor nathan@kernel.org s390/vmlinux.lds.S: Move .vmlinux.info to end of allocatable sections
Alexey Gladkov legion@kernel.org s390: vmlinux.lds.S: Reorder sections
KaFai Wan kafai.wan@linux.dev bpf: Avoid RCU context warning when unpinning htab with internal structs
Bartosz Golaszewski bartosz.golaszewski@linaro.org gpio: wcd934x: mark the GPIO controller as sleeping
Gunnar Kudrjavets gunnarku@amazon.com tpm_tis: Fix incorrect arguments in tpm_tis_probe_irq_single
Pali Rohár pali@kernel.org cifs: Query EA $LXMOD in cifs_query_path_info() for WSL reparse points
Paulo Alcantara pc@manguebit.org smb: client: fix missing timestamp updates after utime(2)
Fushuai Wang wangfushuai@baidu.com cifs: Fix copy_to_iter return value check
Herbert Xu herbert@gondor.apana.org.au crypto: essiv - Check ssize for decryption and in-place encryption
Florian Westphal fw@strlen.de selftests: netfilter: query conntrack state to check for port clash resolution
Eric Woudstra ericwouds@gmail.com bridge: br_vlan_fill_forward_path_pvid: use br_vlan_group_rcu()
Fernando Fernandez Mancera fmancera@suse.de netfilter: nft_objref: validate objref and objrefmap expressions
Timur Kristóf timur.kristof@gmail.com drm/amd/display: Properly disable scaling on DCE6
Timur Kristóf timur.kristof@gmail.com drm/amd/display: Properly clear SCL_*_FILTER_CONTROL on DCE6
Timur Kristóf timur.kristof@gmail.com drm/amd/display: Add missing DCE6 SCL_HORZ_FILTER_INIT* SRIs
Alex Deucher alexander.deucher@amd.com drm/amdgpu: Add additional DCE6 SCL registers
Jason-JH Lin jason-jh.lin@mediatek.com mailbox: mtk-cmdq: Remove pm_runtime APIs from cmdq_mbox_send_data()
Sakari Ailus sakari.ailus@linux.intel.com mailbox: mtk-cmdq: Switch to pm_runtime_put_autosuspend()
Sakari Ailus sakari.ailus@linux.intel.com mailbox: mtk-cmdq-mailbox: Switch to __pm_runtime_put_autosuspend()
Daniel Borkmann daniel@iogearbox.net bpf: Fix metadata_dst leak __bpf_redirect_neigh_v{4,6}
Harini T harini.t@amd.com mailbox: zynqmp-ipi: Fix SGI cleanup on unbind
Harini T harini.t@amd.com mailbox: zynqmp-ipi: Fix out-of-bounds access in mailbox cleanup loop
Harini T harini.t@amd.com mailbox: zynqmp-ipi: Remove dev.parent check in zynqmp_ipi_free_mboxes
Harini T harini.t@amd.com mailbox: zynqmp-ipi: Remove redundant mbox_controller_unregister() call
Eric Dumazet edumazet@google.com tcp: take care of zero tp->window_clamp in tcp_set_rcvlowat()
Leo Yan leo.yan@arm.com perf python: split Clang options when invoking Popen
Leo Yan leo.yan@arm.com tools build: Align warning options with perf
Erick Karanja karanja99erick@gmail.com net: fsl_pq_mdio: Fix device node reference leak in fsl_pq_mdio_probe
Haotian Zhang vulab@iscas.ac.cn ice: ice_adapter: release xa entry on adapter allocation failure
Duoming Zhou duoming@zju.edu.cn net: mscc: ocelot: Fix use-after-free caused by cyclic delayed work
Kuniyuki Iwashima kuniyu@google.com tcp: Don't call reqsk_fastopen_remove() in tcp_conn_request().
Alexandr Sapozhnikov alsp705@gmail.com net/sctp: fix a null dereference in sctp_disposition sctp_sf_do_5_1D_ce()
Ian Forbes ian.forbes@broadcom.com drm/vmwgfx: Fix copy-paste typo in validation
Ian Forbes ian.forbes@broadcom.com drm/vmwgfx: Fix Use-after-free in validation
Zack Rusin zack.rusin@broadcom.com drm/vmwgfx: Fix a null-ptr access in the cursor snooper
Vineeth Vijayan vneethv@linux.ibm.com s390/cio: Update purge function to unregister the unused subchannels
Shuicheng Lin shuicheng.lin@intel.com drm/xe/hw_engine_group: Fix double write lock release in error path
Dan Carpenter dan.carpenter@linaro.org net/mlx4: prevent potential use after free in mlx4_en_do_uc_filter()
Peter Ujfalusi peter.ujfalusi@linux.intel.com ASoC: SOF: Intel: Read the LLP via the associated Link DMA channel
Huacai Chen chenhuacai@kernel.org LoongArch: Init acpi_gbl_use_global_lock to false
Tiezhu Yang yangtiezhu@loongson.cn LoongArch: Add cflag -fno-isolate-erroneous-paths-dereference
Peter Ujfalusi peter.ujfalusi@linux.intel.com ASoC: SOF: Intel: hda-pcm: Place the constraint on period time instead of buffer time
Peter Ujfalusi peter.ujfalusi@linux.intel.com ASoC: SOF: ipc4-topology: Account for different ChainDMA host buffer size
Peter Ujfalusi peter.ujfalusi@linux.intel.com ASoC: SOF: ipc4-topology: Correct the minimum host DMA buffer size
Duoming Zhou duoming@zju.edu.cn scsi: mvsas: Fix use-after-free bugs in mvs_work_queue
Aaron Kling webgeek1234@gmail.com cpufreq: tegra186: Set target frequency for all cpus in policy
Fedor Pchelkin pchelkin@ispras.ru clk: tegra: do not overallocate memory for bpmp clocks
Alok Tiwari alok.a.tiwari@oracle.com clk: nxp: Fix pll0 rate check condition in LPC18xx CGU driver
Brian Masney bmasney@redhat.com clk: nxp: lpc18xx-cgu: convert from round_rate() to determine_rate()
Chen-Yu Tsai wenst@chromium.org clk: mediatek: clk-mux: Do not pass flags to clk_mux_determine_rate_flags()
AngeloGioacchino Del Regno angelogioacchino.delregno@collabora.com clk: mediatek: mt8195-infra_ao: Fix parent for infra_ao_hdmi_26m
Ian Rogers irogers@google.com perf evsel: Ensure the fallback message is always written to
Namhyung Kim namhyung@kernel.org perf tools: Add fallback for exclude_guest
James Clark james.clark@linaro.org perf test: Add a test for default perf stat command
Ian Rogers irogers@google.com perf test: Don't leak workload gopipe in PERF_RECORD_*
Leo Yan leo.yan@arm.com perf session: Fix handling when buffer exceeds 2 GiB
Ian Rogers irogers@google.com perf test shell lbr: Avoid failures with perf event paranoia
Namhyung Kim namhyung@kernel.org perf test: Update sysfs path for core PMU caps
Ilkka Koskinen ilkka@os.amperecomputing.com perf vendor events arm64 AmpereOneX: Fix typo - should be l1d_cache_access_prefetches
Leo Yan leo.yan@arm.com perf arm_spe: Correct memory level for remote access
Leo Yan leo.yan@arm.com perf arm-spe: Rename the common data source encoding
Leo Yan leo.yan@arm.com perf arm_spe: Correct setting remote access
Clément Le Goffic clement.legoffic@foss.st.com rtc: optee: fix memory leak on driver removal
Rob Herring (Arm) robh@kernel.org rtc: x1205: Fix Xicor X1205 vendor prefix
Yunseong Kim ysk@kzalloc.com perf util: Fix compression checks returning -1 as bool
Yuan CHen chenyuan@kylinos.cn clk: renesas: cpg-mssr: Fix memory leak in cpg_mssr_reserved_init()
Brian Masney bmasney@redhat.com clk: at91: peripheral: fix return value
Dan Carpenter dan.carpenter@linaro.org clk: qcom: common: Fix NULL vs IS_ERR() check in qcom_cc_icc_register()
Ian Rogers irogers@google.com libperf event: Ensure tracing data is multiple of 8 sized
Ian Rogers irogers@google.com perf evsel: Avoid container_of on a NULL leader
Ian Rogers irogers@google.com perf test trace_btf_enum: Skip if permissions are insufficient
Ian Rogers irogers@google.com perf disasm: Avoid undefined behavior in incrementing NULL
Varad Gautam varadgautam@google.com asm-generic/io.h: Skip trace helpers if rwmmio events are disabled
Tomi Valkeinen tomi.valkeinen@ideasonboard.com media: v4l2-subdev: Fix alloc failure check in v4l2_subdev_call_state_try()
Michael Hennerich michael.hennerich@analog.com iio: frequency: adf4350: Fix ADF4350_REG3_12BIT_CLKDIV_MODE
Sean Christopherson seanjc@google.com KVM: SVM: Emulate PERF_CNTR_GLOBAL_STATUS_SET for PerfMonV2
Petr Tesarik ptesarik@suse.com dma-mapping: fix direction in dma_alloc direction traces
Toke Høiland-Jørgensen toke@redhat.com page_pool: Fix PP_MAGIC_MASK to avoid crashing on some 32-bit arches
Zhen Ni zhen.ni@easystack.cn clocksource/drivers/clps711x: Fix resource leaks in error paths
Christian Brauner brauner@kernel.org listmount: don't call path_put() under namespace semaphore
Thomas Gleixner tglx@linutronix.de rseq: Protect event mask against membarrier IPI
Omar Sandoval osandov@fb.com arm64: map [_text, _stext) virtual address range non-executable+read-only
Aleksa Sarai cyphar@cyphar.com fscontext: do not consume log entries when returning -EMSGSIZE
Thomas Weißschuh thomas.weissschuh@linutronix.de fs: always return zero on success from replace_fd()
-------------
Diffstat:
Documentation/admin-guide/kernel-parameters.txt | 3 + .../bindings/phy/rockchip-inno-csi-dphy.yaml | 15 +- Makefile | 4 +- arch/arm/mach-omap2/am33xx-restart.c | 36 ++ arch/arm/mach-omap2/pm33xx-core.c | 6 +- arch/arm64/boot/dts/qcom/msm8916.dtsi | 2 + arch/arm64/boot/dts/qcom/msm8939.dtsi | 2 + arch/arm64/boot/dts/qcom/sdm845.dtsi | 4 +- arch/arm64/boot/dts/qcom/x1e80100-pmics.dtsi | 2 + arch/arm64/boot/dts/ti/k3-am62a-main.dtsi | 2 +- arch/arm64/kernel/cpufeature.c | 10 +- arch/arm64/kernel/mte.c | 3 +- arch/arm64/kernel/pi/map_kernel.c | 6 + arch/arm64/kernel/probes/kprobes.c | 12 + arch/arm64/kernel/setup.c | 4 +- arch/arm64/mm/init.c | 2 +- arch/arm64/mm/mmu.c | 14 +- arch/loongarch/Makefile | 2 +- arch/loongarch/kernel/setup.c | 1 + arch/parisc/include/uapi/asm/ioctls.h | 8 +- arch/parisc/lib/memcpy.c | 1 - arch/powerpc/platforms/powernv/pci-ioda.c | 2 +- arch/powerpc/platforms/pseries/msi.c | 2 +- arch/s390/Makefile | 1 + arch/s390/kernel/vmlinux.lds.S | 54 +-- arch/s390/net/bpf_jit.h | 55 --- arch/s390/net/bpf_jit_comp.c | 139 ++++--- arch/sparc/kernel/of_device_32.c | 1 + arch/sparc/kernel/of_device_64.c | 1 + arch/sparc/mm/hugetlbpage.c | 20 + arch/x86/entry/entry_64_fred.S | 2 +- arch/x86/hyperv/ivm.c | 2 +- arch/x86/include/asm/msr-index.h | 1 + arch/x86/include/asm/mtrr.h | 10 +- arch/x86/kernel/cpu/mtrr/generic.c | 6 +- arch/x86/kernel/cpu/mtrr/mtrr.c | 2 +- arch/x86/kernel/kvm.c | 21 +- arch/x86/kernel/umip.c | 15 +- arch/x86/kvm/cpuid.c | 2 +- arch/x86/kvm/pmu.c | 5 + arch/x86/kvm/svm/pmu.c | 1 + arch/x86/kvm/x86.c | 2 + arch/x86/xen/enlighten_pv.c | 4 +- arch/xtensa/platforms/iss/simdisk.c | 6 +- block/blk-crypto-fallback.c | 3 + crypto/essiv.c | 14 +- drivers/acpi/acpi_dbg.c | 26 +- drivers/acpi/acpi_tad.c | 3 + drivers/acpi/acpica/evglock.c | 4 + drivers/acpi/battery.c | 60 +-- drivers/acpi/property.c | 139 ++++--- drivers/block/loop.c | 8 +- drivers/bus/mhi/ep/main.c | 37 +- drivers/bus/mhi/host/init.c | 5 +- drivers/char/ipmi/ipmi_kcs_sm.c | 16 +- drivers/char/ipmi/ipmi_msghandler.c | 422 ++++++++++----------- drivers/char/tpm/tpm_tis_core.c | 4 +- drivers/clk/at91/clk-peripheral.c | 7 +- drivers/clk/mediatek/clk-mt8195-infra_ao.c | 2 +- drivers/clk/mediatek/clk-mux.c | 4 +- drivers/clk/nxp/clk-lpc18xx-cgu.c | 20 +- drivers/clk/qcom/common.c | 4 +- drivers/clk/qcom/tcsrcc-x1e80100.c | 4 + drivers/clk/renesas/renesas-cpg-mssr.c | 7 +- drivers/clk/tegra/clk-bpmp.c | 2 +- drivers/clocksource/clps711x-timer.c | 23 +- drivers/cpufreq/cpufreq-dt.c | 2 +- drivers/cpufreq/imx6q-cpufreq.c | 2 +- drivers/cpufreq/intel_pstate.c | 8 +- drivers/cpufreq/mediatek-cpufreq-hw.c | 2 +- drivers/cpufreq/scmi-cpufreq.c | 2 +- drivers/cpufreq/scpi-cpufreq.c | 2 +- drivers/cpufreq/spear-cpufreq.c | 2 +- drivers/cpufreq/tegra186-cpufreq.c | 8 +- drivers/crypto/aspeed/aspeed-hace-crypto.c | 2 +- drivers/crypto/atmel-tdes.c | 2 +- drivers/crypto/rockchip/rk3288_crypto_ahash.c | 2 +- drivers/firmware/meson/meson_sm.c | 7 +- drivers/gpio/gpio-wcd934x.c | 2 +- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 4 + drivers/gpu/drm/amd/display/dc/dce/dce_transform.c | 21 +- drivers/gpu/drm/amd/display/dc/dce/dce_transform.h | 4 + .../gpu/drm/amd/include/asic_reg/dce/dce_6_0_d.h | 7 + .../drm/amd/include/asic_reg/dce/dce_6_0_sh_mask.h | 2 + drivers/gpu/drm/nouveau/nouveau_bo.c | 2 +- drivers/gpu/drm/panthor/panthor_drv.c | 11 +- drivers/gpu/drm/renesas/rcar-du/rcar_mipi_dsi.c | 5 +- .../gpu/drm/renesas/rcar-du/rcar_mipi_dsi_regs.h | 8 +- drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c | 17 +- drivers/gpu/drm/vmwgfx/vmwgfx_validation.c | 6 +- drivers/gpu/drm/xe/xe_hw_engine_group.c | 6 +- drivers/gpu/drm/xe/xe_query.c | 15 +- drivers/iio/adc/pac1934.c | 20 +- drivers/iio/adc/xilinx-ams.c | 47 ++- drivers/iio/dac/ad5360.c | 2 +- drivers/iio/dac/ad5421.c | 2 +- drivers/iio/frequency/adf4350.c | 20 +- drivers/iio/imu/inv_icm42600/inv_icm42600_core.c | 4 - drivers/iommu/intel/iommu.c | 2 +- drivers/irqchip/irq-sifive-plic.c | 13 +- drivers/mailbox/mtk-cmdq-mailbox.c | 12 +- drivers/mailbox/zynqmp-ipi-mailbox.c | 24 +- .../media/cec/usb/extron-da-hd-4k-plus/Makefile | 6 - drivers/media/i2c/mt9v111.c | 2 +- drivers/media/mc/mc-devnode.c | 6 +- drivers/media/mc/mc-entity.c | 2 +- drivers/media/pci/cx18/cx18-queue.c | 13 +- drivers/media/pci/ivtv/ivtv-irq.c | 2 +- drivers/media/pci/ivtv/ivtv-yuv.c | 8 +- drivers/media/pci/mgb4/mgb4_trigger.c | 2 +- drivers/media/platform/qcom/venus/firmware.c | 8 +- .../platform/samsung/s5p-mfc/s5p_mfc_cmd_v6.c | 35 +- .../media/platform/ti/j721e-csi2rx/j721e-csi2rx.c | 9 +- drivers/media/rc/lirc_dev.c | 9 +- drivers/media/test-drivers/vivid/vivid-cec.c | 12 +- drivers/memory/samsung/exynos-srom.c | 10 +- drivers/mfd/intel_soc_pmic_chtdc_ti.c | 5 +- drivers/mmc/core/sdio.c | 6 +- drivers/mmc/host/mmc_spi.c | 2 +- drivers/mtd/nand/raw/fsmc_nand.c | 6 +- drivers/net/ethernet/freescale/fsl_pq_mdio.c | 2 + drivers/net/ethernet/intel/ice/ice_adapter.c | 10 +- drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 2 +- drivers/net/ethernet/mscc/ocelot_stats.c | 2 +- drivers/net/wireless/ath/ath11k/core.c | 6 +- drivers/net/wireless/ath/ath11k/hal.c | 16 + drivers/net/wireless/ath/ath11k/hal.h | 1 + drivers/net/wireless/mediatek/mt76/mt7921/usb.c | 3 + drivers/net/wireless/mediatek/mt76/mt7925/usb.c | 3 + drivers/nvme/host/pci.c | 2 + drivers/of/unittest.c | 1 + drivers/pci/controller/cadence/pci-j721e.c | 25 ++ drivers/pci/controller/dwc/pci-keystone.c | 4 +- drivers/pci/controller/dwc/pcie-rcar-gen4.c | 2 +- drivers/pci/controller/dwc/pcie-tegra194.c | 32 +- drivers/pci/controller/pci-tegra.c | 27 +- drivers/pci/controller/pcie-rcar-host.c | 40 +- drivers/pci/controller/pcie-xilinx-nwl.c | 7 +- drivers/pci/endpoint/functions/pci-epf-test.c | 19 +- drivers/pci/iov.c | 5 + drivers/pci/pci-driver.c | 1 + drivers/pci/pci-sysfs.c | 20 +- drivers/pci/pcie/aer.c | 12 +- drivers/pci/pcie/err.c | 8 +- drivers/perf/arm-cmn.c | 9 +- drivers/pinctrl/samsung/pinctrl-samsung.h | 4 - drivers/power/supply/max77976_charger.c | 12 +- drivers/pwm/pwm-berlin.c | 4 +- drivers/rtc/interface.c | 27 ++ drivers/rtc/rtc-optee.c | 1 + drivers/rtc/rtc-x1205.c | 2 +- drivers/s390/block/dasd.c | 17 +- drivers/s390/cio/device.c | 37 +- drivers/scsi/hpsa.c | 21 +- drivers/scsi/mvsas/mv_init.c | 2 +- drivers/scsi/sd.c | 50 +-- drivers/spi/spi-cadence-quadspi.c | 18 +- drivers/video/fbdev/core/fb_cmdline.c | 2 +- drivers/xen/events/events_base.c | 37 +- drivers/xen/manage.c | 3 +- fs/btrfs/export.c | 8 +- fs/btrfs/extent_io.c | 14 +- fs/cramfs/inode.c | 11 +- fs/eventpoll.c | 139 ++----- fs/ext4/ext4.h | 2 + fs/ext4/fsmap.c | 14 +- fs/ext4/indirect.c | 2 +- fs/ext4/inode.c | 10 +- fs/ext4/move_extent.c | 2 +- fs/ext4/orphan.c | 17 +- fs/ext4/super.c | 26 +- fs/ext4/xattr.c | 19 +- fs/file.c | 5 +- fs/fs-writeback.c | 32 +- fs/fsopen.c | 70 ++-- fs/fuse/dev.c | 2 +- fs/fuse/file.c | 8 +- fs/minix/inode.c | 8 +- fs/namei.c | 8 + fs/namespace.c | 106 ++++-- fs/nfs/callback.c | 4 - fs/nfs/callback_xdr.c | 1 + fs/nfsd/export.c | 24 +- fs/nfsd/export.h | 3 +- fs/nfsd/lockd.c | 28 +- fs/nfsd/netns.h | 4 +- fs/nfsd/nfs4proc.c | 4 +- fs/nfsd/nfs4state.c | 6 +- fs/nfsd/nfs4xdr.c | 2 +- fs/nfsd/nfsfh.c | 22 +- fs/nfsd/trace.h | 2 +- fs/nfsd/vfs.c | 14 +- fs/nfsd/vfs.h | 2 +- fs/ntfs3/bitmap.c | 1 + fs/quota/dquot.c | 10 +- fs/read_write.c | 14 +- fs/smb/client/smb1ops.c | 62 ++- fs/smb/client/smb2inode.c | 22 +- fs/smb/client/smb2ops.c | 10 +- fs/squashfs/inode.c | 24 +- include/acpi/acpixf.h | 6 + include/asm-generic/io.h | 98 +++-- include/linux/cpufreq.h | 3 + include/linux/iio/frequency/adf4350.h | 2 +- include/linux/ksm.h | 8 +- include/linux/mm.h | 22 +- include/linux/rseq.h | 11 +- include/linux/sunrpc/svc.h | 2 +- include/linux/sunrpc/svc_xprt.h | 19 + include/media/v4l2-subdev.h | 30 +- include/trace/events/dma.h | 1 + init/main.c | 12 + kernel/bpf/inode.c | 4 +- kernel/fork.c | 2 +- kernel/pid.c | 5 +- kernel/rseq.c | 10 +- kernel/sched/deadline.c | 73 ++-- kernel/sched/fair.c | 9 +- kernel/sys.c | 22 +- kernel/trace/trace_fprobe.c | 11 +- kernel/trace/trace_kprobe.c | 11 +- kernel/trace/trace_probe.h | 9 +- kernel/trace/trace_uprobe.c | 12 +- lib/crypto/Makefile | 4 + lib/genalloc.c | 5 +- mm/damon/lru_sort.c | 2 +- mm/damon/vaddr.c | 8 +- mm/huge_memory.c | 15 +- mm/hugetlb.c | 3 + mm/migrate.c | 23 +- mm/page_alloc.c | 2 +- mm/slab.h | 8 +- mm/slub.c | 3 +- net/bridge/br_vlan.c | 2 +- net/core/filter.c | 2 + net/core/page_pool.c | 76 ++-- net/ipv4/tcp.c | 5 +- net/ipv4/tcp_input.c | 1 - net/mptcp/pm.c | 7 +- net/mptcp/pm_netlink.c | 50 ++- net/mptcp/protocol.h | 8 + net/netfilter/nft_objref.c | 39 ++ net/sctp/sm_make_chunk.c | 3 +- net/sctp/sm_statefuns.c | 6 +- net/sunrpc/svc_xprt.c | 45 ++- net/sunrpc/svcsock.c | 2 + net/xdp/xsk_queue.h | 45 ++- security/keys/trusted-keys/trusted_tpm1.c | 7 +- sound/soc/sof/intel/hda-pcm.c | 29 +- sound/soc/sof/intel/hda-stream.c | 29 +- sound/soc/sof/ipc4-pcm.c | 138 +++++-- sound/soc/sof/ipc4-topology.c | 16 +- sound/soc/sof/ipc4-topology.h | 10 +- tools/build/feature/Makefile | 4 +- tools/lib/perf/include/perf/event.h | 1 + tools/perf/builtin-stat.c | 18 +- .../arch/arm64/ampere/ampereonex/metrics.json | 10 +- tools/perf/tests/perf-record.c | 4 + tools/perf/tests/shell/record_lbr.sh | 29 +- tools/perf/tests/shell/stat.sh | 29 ++ tools/perf/tests/shell/trace_btf_enum.sh | 11 + tools/perf/util/arm-spe-decoder/arm-spe-decoder.h | 18 +- tools/perf/util/arm-spe.c | 34 +- tools/perf/util/disasm.c | 7 +- tools/perf/util/evsel.c | 31 +- tools/perf/util/lzma.c | 2 +- tools/perf/util/session.c | 2 +- tools/perf/util/setup.py | 5 +- tools/perf/util/zlib.c | 2 +- tools/testing/selftests/mm/madv_populate.c | 21 +- tools/testing/selftests/mm/soft-dirty.c | 5 +- tools/testing/selftests/mm/vm_util.c | 77 ++++ tools/testing/selftests/mm/vm_util.h | 1 + tools/testing/selftests/net/mptcp/mptcp_join.sh | 11 + .../selftests/net/netfilter/nf_nat_edemux.sh | 58 ++- tools/testing/selftests/rseq/rseq.c | 8 +- 276 files changed, 2812 insertions(+), 1556 deletions(-)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Weißschuh thomas.weissschuh@linutronix.de
commit 708c04a5c2b78e22f56e2350de41feba74dfccd9 upstream.
replace_fd() returns the number of the new file descriptor through the return value of do_dup2(). However its callers never care about the specific returned number. In fact the caller in receive_fd_replace() treats any non-zero return value as an error and therefore never calls __receive_sock() for most file descriptors, which is a bug.
To fix the bug in receive_fd_replace() and to avoid the same issue happening in future callers, signal success through a plain zero.
Suggested-by: Al Viro viro@zeniv.linux.org.uk Link: https://lore.kernel.org/lkml/20250801220215.GS222315@ZenIV/ Fixes: 173817151b15 ("fs: Expand __receive_fd() to accept existing fd") Fixes: 42eb0d54c08a ("fs: split receive_fd_replace from __receive_fd") Cc: stable@vger.kernel.org Signed-off-by: Thomas Weißschuh thomas.weissschuh@linutronix.de Link: https://lore.kernel.org/20250805-fix-receive_fd_replace-v3-1-b72ba8b34bac@li... Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/file.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/fs/file.c +++ b/fs/file.c @@ -1262,7 +1262,10 @@ int replace_fd(unsigned fd, struct file err = expand_files(files, fd); if (unlikely(err < 0)) goto out_unlock; - return do_dup2(files, file, fd, flags); + err = do_dup2(files, file, fd, flags); + if (err < 0) + return err; + return 0;
out_unlock: spin_unlock(&files->file_lock);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aleksa Sarai cyphar@cyphar.com
commit 72d271a7baa7062cb27e774ac37c5459c6d20e22 upstream.
Userspace generally expects APIs that return -EMSGSIZE to allow for them to adjust their buffer size and retry the operation. However, the fscontext log would previously clear the message even in the -EMSGSIZE case.
Given that it is very cheap for us to check whether the buffer is too small before we remove the message from the ring buffer, let's just do that instead. While we're at it, refactor some fscontext_read() into a separate helper to make the ring buffer logic a bit easier to read.
Fixes: 007ec26cdc9f ("vfs: Implement logging through fs_context") Cc: David Howells dhowells@redhat.com Cc: stable@vger.kernel.org # v5.2+ Signed-off-by: Aleksa Sarai cyphar@cyphar.com Link: https://lore.kernel.org/20250807-fscontext-log-cleanups-v3-1-8d91d6242dc3@cy... Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/fsopen.c | 70 ++++++++++++++++++++++++++++++++---------------------------- 1 file changed, 38 insertions(+), 32 deletions(-)
--- a/fs/fsopen.c +++ b/fs/fsopen.c @@ -18,50 +18,56 @@ #include "internal.h" #include "mount.h"
+static inline const char *fetch_message_locked(struct fc_log *log, size_t len, + bool *need_free) +{ + const char *p; + int index; + + if (unlikely(log->head == log->tail)) + return ERR_PTR(-ENODATA); + + index = log->tail & (ARRAY_SIZE(log->buffer) - 1); + p = log->buffer[index]; + if (unlikely(strlen(p) > len)) + return ERR_PTR(-EMSGSIZE); + + log->buffer[index] = NULL; + *need_free = log->need_free & (1 << index); + log->need_free &= ~(1 << index); + log->tail++; + + return p; +} + /* * Allow the user to read back any error, warning or informational messages. + * Only one message is returned for each read(2) call. */ static ssize_t fscontext_read(struct file *file, char __user *_buf, size_t len, loff_t *pos) { struct fs_context *fc = file->private_data; - struct fc_log *log = fc->log.log; - unsigned int logsize = ARRAY_SIZE(log->buffer); - ssize_t ret; - char *p; + ssize_t err; + const char *p __free(kfree) = NULL, *message; bool need_free; - int index, n; - - ret = mutex_lock_interruptible(&fc->uapi_mutex); - if (ret < 0) - return ret; - - if (log->head == log->tail) { - mutex_unlock(&fc->uapi_mutex); - return -ENODATA; - } + int n;
- index = log->tail & (logsize - 1); - p = log->buffer[index]; - need_free = log->need_free & (1 << index); - log->buffer[index] = NULL; - log->need_free &= ~(1 << index); - log->tail++; + err = mutex_lock_interruptible(&fc->uapi_mutex); + if (err < 0) + return err; + message = fetch_message_locked(fc->log.log, len, &need_free); mutex_unlock(&fc->uapi_mutex); + if (IS_ERR(message)) + return PTR_ERR(message);
- ret = -EMSGSIZE; - n = strlen(p); - if (n > len) - goto err_free; - ret = -EFAULT; - if (copy_to_user(_buf, p, n) != 0) - goto err_free; - ret = n; - -err_free: if (need_free) - kfree(p); - return ret; + p = message; + + n = strlen(message); + if (copy_to_user(_buf, message, n)) + return -EFAULT; + return n; }
static int fscontext_release(struct inode *inode, struct file *file)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Omar Sandoval osandov@fb.com
commit 5973a62efa34c80c9a4e5eac1fca6f6209b902af upstream.
Since the referenced fixes commit, the kernel's .text section is only mapped starting from _stext; the region [_text, _stext) is omitted. As a result, other vmalloc/vmap allocations may use the virtual addresses nominally in the range [_text, _stext). This address reuse confuses multiple things:
1. crash_prepare_elf64_headers() sets up a segment in /proc/vmcore mapping the entire range [_text, _end) to [__pa_symbol(_text), __pa_symbol(_end)). Reading an address in [_text, _stext) from /proc/vmcore therefore gives the incorrect result. 2. Tools doing symbolization (either by reading /proc/kallsyms or based on the vmlinux ELF file) will incorrectly identify vmalloc/vmap allocations in [_text, _stext) as kernel symbols.
In practice, both of these issues affect the drgn debugger. Specifically, there were cases where the vmap IRQ stacks for some CPUs were allocated in [_text, _stext). As a result, drgn could not get the stack trace for a crash in an IRQ handler because the core dump contained invalid data for the IRQ stack address. The stack addresses were also symbolized as being in the _text symbol.
Fix this by bringing back the mapping of [_text, _stext), but now make it non-executable and read-only. This prevents other allocations from using it while still achieving the original goal of not mapping unpredictable data as executable. Other than the changed protection, this is effectively a revert of the fixes commit.
Fixes: e2a073dde921 ("arm64: omit [_text, _stext) from permanent kernel mapping") Cc: stable@vger.kernel.org Signed-off-by: Omar Sandoval osandov@fb.com Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/kernel/pi/map_kernel.c | 6 ++++++ arch/arm64/kernel/setup.c | 4 ++-- arch/arm64/mm/init.c | 2 +- arch/arm64/mm/mmu.c | 14 +++++++++----- 4 files changed, 18 insertions(+), 8 deletions(-)
--- a/arch/arm64/kernel/pi/map_kernel.c +++ b/arch/arm64/kernel/pi/map_kernel.c @@ -78,6 +78,12 @@ static void __init map_kernel(u64 kaslr_ twopass |= enable_scs; prot = twopass ? data_prot : text_prot;
+ /* + * [_stext, _text) isn't executed after boot and contains some + * non-executable, unpredictable data, so map it non-executable. + */ + map_segment(init_pg_dir, &pgdp, va_offset, _text, _stext, data_prot, + false, root_level); map_segment(init_pg_dir, &pgdp, va_offset, _stext, _etext, prot, !twopass, root_level); map_segment(init_pg_dir, &pgdp, va_offset, __start_rodata, --- a/arch/arm64/kernel/setup.c +++ b/arch/arm64/kernel/setup.c @@ -213,7 +213,7 @@ static void __init request_standard_reso unsigned long i = 0; size_t res_size;
- kernel_code.start = __pa_symbol(_stext); + kernel_code.start = __pa_symbol(_text); kernel_code.end = __pa_symbol(__init_begin - 1); kernel_data.start = __pa_symbol(_sdata); kernel_data.end = __pa_symbol(_end - 1); @@ -281,7 +281,7 @@ u64 cpu_logical_map(unsigned int cpu)
void __init __no_sanitize_address setup_arch(char **cmdline_p) { - setup_initial_init_mm(_stext, _etext, _edata, _end); + setup_initial_init_mm(_text, _etext, _edata, _end);
*cmdline_p = boot_command_line;
--- a/arch/arm64/mm/init.c +++ b/arch/arm64/mm/init.c @@ -300,7 +300,7 @@ void __init arm64_memblock_init(void) * Register the kernel text, kernel data, initrd, and initial * pagetables with memblock. */ - memblock_reserve(__pa_symbol(_stext), _end - _stext); + memblock_reserve(__pa_symbol(_text), _end - _text); if (IS_ENABLED(CONFIG_BLK_DEV_INITRD) && phys_initrd_size) { /* the generic initrd code expects virtual addresses */ initrd_start = __phys_to_virt(phys_initrd_start); --- a/arch/arm64/mm/mmu.c +++ b/arch/arm64/mm/mmu.c @@ -561,8 +561,8 @@ void __init mark_linear_text_alias_ro(vo /* * Remove the write permissions from the linear alias of .text/.rodata */ - update_mapping_prot(__pa_symbol(_stext), (unsigned long)lm_alias(_stext), - (unsigned long)__init_begin - (unsigned long)_stext, + update_mapping_prot(__pa_symbol(_text), (unsigned long)lm_alias(_text), + (unsigned long)__init_begin - (unsigned long)_text, PAGE_KERNEL_RO); }
@@ -623,7 +623,7 @@ static inline void arm64_kfence_map_pool static void __init map_mem(pgd_t *pgdp) { static const u64 direct_map_end = _PAGE_END(VA_BITS_MIN); - phys_addr_t kernel_start = __pa_symbol(_stext); + phys_addr_t kernel_start = __pa_symbol(_text); phys_addr_t kernel_end = __pa_symbol(__init_begin); phys_addr_t start, end; phys_addr_t early_kfence_pool; @@ -670,7 +670,7 @@ static void __init map_mem(pgd_t *pgdp) }
/* - * Map the linear alias of the [_stext, __init_begin) interval + * Map the linear alias of the [_text, __init_begin) interval * as non-executable now, and remove the write permission in * mark_linear_text_alias_ro() below (which will be called after * alternative patching has completed). This makes the contents @@ -697,6 +697,10 @@ void mark_rodata_ro(void) WRITE_ONCE(rodata_is_rw, false); update_mapping_prot(__pa_symbol(__start_rodata), (unsigned long)__start_rodata, section_size, PAGE_KERNEL_RO); + /* mark the range between _text and _stext as read only. */ + update_mapping_prot(__pa_symbol(_text), (unsigned long)_text, + (unsigned long)_stext - (unsigned long)_text, + PAGE_KERNEL_RO); }
static void __init declare_vma(struct vm_struct *vma, @@ -767,7 +771,7 @@ static void __init declare_kernel_vmas(v { static struct vm_struct vmlinux_seg[KERNEL_SEGMENT_COUNT];
- declare_vma(&vmlinux_seg[0], _stext, _etext, VM_NO_GUARD); + declare_vma(&vmlinux_seg[0], _text, _etext, VM_NO_GUARD); declare_vma(&vmlinux_seg[1], __start_rodata, __inittext_begin, VM_NO_GUARD); declare_vma(&vmlinux_seg[2], __inittext_begin, __inittext_end, VM_NO_GUARD); declare_vma(&vmlinux_seg[3], __initdata_begin, __initdata_end, VM_NO_GUARD);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner tglx@linutronix.de
commit 6eb350a2233100a283f882c023e5ad426d0ed63b upstream.
rseq_need_restart() reads and clears task::rseq_event_mask with preemption disabled to guard against the scheduler.
But membarrier() uses an IPI and sets the PREEMPT bit in the event mask from the IPI, which leaves that RMW operation unprotected.
Use guard(irq) if CONFIG_MEMBARRIER is enabled to fix that.
Fixes: 2a36ab717e8f ("rseq/membarrier: Add MEMBARRIER_CMD_PRIVATE_EXPEDITED_RSEQ") Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Boqun Feng boqun.feng@gmail.com Reviewed-by: Mathieu Desnoyers mathieu.desnoyers@efficios.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/rseq.h | 11 ++++++++--- kernel/rseq.c | 10 +++++----- 2 files changed, 13 insertions(+), 8 deletions(-)
--- a/include/linux/rseq.h +++ b/include/linux/rseq.h @@ -7,6 +7,12 @@ #include <linux/preempt.h> #include <linux/sched.h>
+#ifdef CONFIG_MEMBARRIER +# define RSEQ_EVENT_GUARD irq +#else +# define RSEQ_EVENT_GUARD preempt +#endif + /* * Map the event mask on the user-space ABI enum rseq_cs_flags * for direct mask checks. @@ -41,9 +47,8 @@ static inline void rseq_handle_notify_re static inline void rseq_signal_deliver(struct ksignal *ksig, struct pt_regs *regs) { - preempt_disable(); - __set_bit(RSEQ_EVENT_SIGNAL_BIT, ¤t->rseq_event_mask); - preempt_enable(); + scoped_guard(RSEQ_EVENT_GUARD) + __set_bit(RSEQ_EVENT_SIGNAL_BIT, ¤t->rseq_event_mask); rseq_handle_notify_resume(ksig, regs); }
--- a/kernel/rseq.c +++ b/kernel/rseq.c @@ -255,12 +255,12 @@ static int rseq_need_restart(struct task
/* * Load and clear event mask atomically with respect to - * scheduler preemption. + * scheduler preemption and membarrier IPIs. */ - preempt_disable(); - event_mask = t->rseq_event_mask; - t->rseq_event_mask = 0; - preempt_enable(); + scoped_guard(RSEQ_EVENT_GUARD) { + event_mask = t->rseq_event_mask; + t->rseq_event_mask = 0; + }
return !!event_mask; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christian Brauner brauner@kernel.org
commit c1f86d0ac322c7e77f6f8dbd216c65d39358ffc0 upstream.
Massage listmount() and make sure we don't call path_put() under the namespace semaphore. If we put the last reference we're fscked.
Fixes: b4c2bea8ceaa ("add listmount(2) syscall") Cc: stable@vger.kernel.org # v6.8+ Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/namespace.c | 87 ++++++++++++++++++++++++++++++++++++++------------------- 1 file changed, 59 insertions(+), 28 deletions(-)
--- a/fs/namespace.c +++ b/fs/namespace.c @@ -5411,23 +5411,34 @@ retry: return ret; }
-static ssize_t do_listmount(struct mnt_namespace *ns, u64 mnt_parent_id, - u64 last_mnt_id, u64 *mnt_ids, size_t nr_mnt_ids, - bool reverse) +struct klistmount { + u64 last_mnt_id; + u64 mnt_parent_id; + u64 *kmnt_ids; + u32 nr_mnt_ids; + struct mnt_namespace *ns; + struct path root; +}; + +static ssize_t do_listmount(struct klistmount *kls, bool reverse) { - struct path root __free(path_put) = {}; + struct mnt_namespace *ns = kls->ns; + u64 mnt_parent_id = kls->mnt_parent_id; + u64 last_mnt_id = kls->last_mnt_id; + u64 *mnt_ids = kls->kmnt_ids; + size_t nr_mnt_ids = kls->nr_mnt_ids; struct path orig; struct mount *r, *first; ssize_t ret;
rwsem_assert_held(&namespace_sem);
- ret = grab_requested_root(ns, &root); + ret = grab_requested_root(ns, &kls->root); if (ret) return ret;
if (mnt_parent_id == LSMT_ROOT) { - orig = root; + orig = kls->root; } else { orig.mnt = lookup_mnt_in_ns(mnt_parent_id, ns); if (!orig.mnt) @@ -5439,7 +5450,7 @@ static ssize_t do_listmount(struct mnt_n * Don't trigger audit denials. We just want to determine what * mounts to show users. */ - if (!is_path_reachable(real_mount(orig.mnt), orig.dentry, &root) && + if (!is_path_reachable(real_mount(orig.mnt), orig.dentry, &kls->root) && !ns_capable_noaudit(ns->user_ns, CAP_SYS_ADMIN)) return -EPERM;
@@ -5472,14 +5483,45 @@ static ssize_t do_listmount(struct mnt_n return ret; }
+static void __free_klistmount_free(const struct klistmount *kls) +{ + path_put(&kls->root); + kvfree(kls->kmnt_ids); + mnt_ns_release(kls->ns); +} + +static inline int prepare_klistmount(struct klistmount *kls, struct mnt_id_req *kreq, + size_t nr_mnt_ids) +{ + + u64 last_mnt_id = kreq->param; + + /* The first valid unique mount id is MNT_UNIQUE_ID_OFFSET + 1. */ + if (last_mnt_id != 0 && last_mnt_id <= MNT_UNIQUE_ID_OFFSET) + return -EINVAL; + + kls->last_mnt_id = last_mnt_id; + + kls->nr_mnt_ids = nr_mnt_ids; + kls->kmnt_ids = kvmalloc_array(nr_mnt_ids, sizeof(*kls->kmnt_ids), + GFP_KERNEL_ACCOUNT); + if (!kls->kmnt_ids) + return -ENOMEM; + + kls->ns = grab_requested_mnt_ns(kreq); + if (!kls->ns) + return -ENOENT; + + kls->mnt_parent_id = kreq->mnt_id; + return 0; +} + SYSCALL_DEFINE4(listmount, const struct mnt_id_req __user *, req, u64 __user *, mnt_ids, size_t, nr_mnt_ids, unsigned int, flags) { - u64 *kmnt_ids __free(kvfree) = NULL; + struct klistmount kls __free(klistmount_free) = {}; const size_t maxcount = 1000000; - struct mnt_namespace *ns __free(mnt_ns_release) = NULL; struct mnt_id_req kreq; - u64 last_mnt_id; ssize_t ret;
if (flags & ~LISTMOUNT_REVERSE) @@ -5500,31 +5542,20 @@ SYSCALL_DEFINE4(listmount, const struct if (ret) return ret;
- last_mnt_id = kreq.param; - /* The first valid unique mount id is MNT_UNIQUE_ID_OFFSET + 1. */ - if (last_mnt_id != 0 && last_mnt_id <= MNT_UNIQUE_ID_OFFSET) - return -EINVAL; - - kmnt_ids = kvmalloc_array(nr_mnt_ids, sizeof(*kmnt_ids), - GFP_KERNEL_ACCOUNT); - if (!kmnt_ids) - return -ENOMEM; - - ns = grab_requested_mnt_ns(&kreq); - if (!ns) - return -ENOENT; + ret = prepare_klistmount(&kls, &kreq, nr_mnt_ids); + if (ret) + return ret;
- if (kreq.mnt_ns_id && (ns != current->nsproxy->mnt_ns) && - !ns_capable_noaudit(ns->user_ns, CAP_SYS_ADMIN)) + if (kreq.mnt_ns_id && (kls.ns != current->nsproxy->mnt_ns) && + !ns_capable_noaudit(kls.ns->user_ns, CAP_SYS_ADMIN)) return -ENOENT;
scoped_guard(rwsem_read, &namespace_sem) - ret = do_listmount(ns, kreq.mnt_id, last_mnt_id, kmnt_ids, - nr_mnt_ids, (flags & LISTMOUNT_REVERSE)); + ret = do_listmount(&kls, (flags & LISTMOUNT_REVERSE)); if (ret <= 0) return ret;
- if (copy_to_user(mnt_ids, kmnt_ids, ret * sizeof(*mnt_ids))) + if (copy_to_user(mnt_ids, kls.kmnt_ids, ret * sizeof(*mnt_ids))) return -EFAULT;
return ret;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhen Ni zhen.ni@easystack.cn
commit cd32e596f02fc981674573402c1138f616df1728 upstream.
The current implementation of clps711x_timer_init() has multiple error paths that directly return without releasing the base I/O memory mapped via of_iomap(). Fix of_iomap leaks in error paths.
Fixes: 04410efbb6bc ("clocksource/drivers/clps711x: Convert init function to return error") Fixes: 2a6a8e2d9004 ("clocksource/drivers/clps711x: Remove board support") Signed-off-by: Zhen Ni zhen.ni@easystack.cn Signed-off-by: Daniel Lezcano daniel.lezcano@linaro.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250814123324.1516495-1-zhen.ni@easystack.cn Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/clocksource/clps711x-timer.c | 23 ++++++++++++++++------- 1 file changed, 16 insertions(+), 7 deletions(-)
--- a/drivers/clocksource/clps711x-timer.c +++ b/drivers/clocksource/clps711x-timer.c @@ -78,24 +78,33 @@ static int __init clps711x_timer_init(st unsigned int irq = irq_of_parse_and_map(np, 0); struct clk *clock = of_clk_get(np, 0); void __iomem *base = of_iomap(np, 0); + int ret = 0;
if (!base) return -ENOMEM; - if (!irq) - return -EINVAL; - if (IS_ERR(clock)) - return PTR_ERR(clock); + if (!irq) { + ret = -EINVAL; + goto unmap_io; + } + if (IS_ERR(clock)) { + ret = PTR_ERR(clock); + goto unmap_io; + }
switch (of_alias_get_id(np, "timer")) { case CLPS711X_CLKSRC_CLOCKSOURCE: clps711x_clksrc_init(clock, base); break; case CLPS711X_CLKSRC_CLOCKEVENT: - return _clps711x_clkevt_init(clock, base, irq); + ret = _clps711x_clkevt_init(clock, base, irq); + break; default: - return -EINVAL; + ret = -EINVAL; + break; }
- return 0; +unmap_io: + iounmap(base); + return ret; } TIMER_OF_DECLARE(clps711x, "cirrus,ep7209-timer", clps711x_timer_init);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Toke Høiland-Jørgensen toke@redhat.com
commit 95920c2ed02bde551ab654e9749c2ca7bc3100e0 upstream.
Helge reported that the introduction of PP_MAGIC_MASK let to crashes on boot on his 32-bit parisc machine. The cause of this is the mask is set too wide, so the page_pool_page_is_pp() incurs false positives which crashes the machine.
Just disabling the check in page_pool_is_pp() will lead to the page_pool code itself malfunctioning; so instead of doing this, this patch changes the define for PP_DMA_INDEX_BITS to avoid mistaking arbitrary kernel pointers for page_pool-tagged pages.
The fix relies on the kernel pointers that alias with the pp_magic field always being above PAGE_OFFSET. With this assumption, we can use the lowest bit of the value of PAGE_OFFSET as the upper bound of the PP_DMA_INDEX_MASK, which should avoid the false positives.
Because we cannot rely on PAGE_OFFSET always being a compile-time constant, nor on it always being >0, we fall back to disabling the dma_index storage when there are not enough bits available. This leaves us in the situation we were in before the patch in the Fixes tag, but only on a subset of architecture configurations. This seems to be the best we can do until the transition to page types in complete for page_pool pages.
v2: - Make sure there's at least 8 bits available and that the PAGE_OFFSET bit calculation doesn't wrap
Link: https://lore.kernel.org/all/aMNJMFa5fDalFmtn@p100/ Fixes: ee62ce7a1d90 ("page_pool: Track DMA-mapped pages and unmap them when destroying the pool") Cc: stable@vger.kernel.org # 6.15+ Tested-by: Helge Deller deller@gmx.de Signed-off-by: Toke Høiland-Jørgensen toke@redhat.com Reviewed-by: Mina Almasry almasrymina@google.com Tested-by: Helge Deller deller@gmx.de Link: https://patch.msgid.link/20250930114331.675412-1-toke@redhat.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/mm.h | 22 ++++++++------ net/core/page_pool.c | 76 +++++++++++++++++++++++++++++++++++---------------- 2 files changed, 66 insertions(+), 32 deletions(-)
--- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -4318,14 +4318,13 @@ static inline void pgalloc_tag_copy(stru * since this value becomes part of PP_SIGNATURE; meaning we can just use the * space between the PP_SIGNATURE value (without POISON_POINTER_DELTA), and the * lowest bits of POISON_POINTER_DELTA. On arches where POISON_POINTER_DELTA is - * 0, we make sure that we leave the two topmost bits empty, as that guarantees - * we won't mistake a valid kernel pointer for a value we set, regardless of the - * VMSPLIT setting. + * 0, we use the lowest bit of PAGE_OFFSET as the boundary if that value is + * known at compile-time. * - * Altogether, this means that the number of bits available is constrained by - * the size of an unsigned long (at the upper end, subtracting two bits per the - * above), and the definition of PP_SIGNATURE (with or without - * POISON_POINTER_DELTA). + * If the value of PAGE_OFFSET is not known at compile time, or if it is too + * small to leave at least 8 bits available above PP_SIGNATURE, we define the + * number of bits to be 0, which turns off the DMA index tracking altogether + * (see page_pool_register_dma_index()). */ #define PP_DMA_INDEX_SHIFT (1 + __fls(PP_SIGNATURE - POISON_POINTER_DELTA)) #if POISON_POINTER_DELTA > 0 @@ -4334,8 +4333,13 @@ static inline void pgalloc_tag_copy(stru */ #define PP_DMA_INDEX_BITS MIN(32, __ffs(POISON_POINTER_DELTA) - PP_DMA_INDEX_SHIFT) #else -/* Always leave out the topmost two; see above. */ -#define PP_DMA_INDEX_BITS MIN(32, BITS_PER_LONG - PP_DMA_INDEX_SHIFT - 2) +/* Use the lowest bit of PAGE_OFFSET if there's at least 8 bits available; see above */ +#define PP_DMA_INDEX_MIN_OFFSET (1 << (PP_DMA_INDEX_SHIFT + 8)) +#define PP_DMA_INDEX_BITS ((__builtin_constant_p(PAGE_OFFSET) && \ + PAGE_OFFSET >= PP_DMA_INDEX_MIN_OFFSET && \ + !(PAGE_OFFSET & (PP_DMA_INDEX_MIN_OFFSET - 1))) ? \ + MIN(32, __ffs(PAGE_OFFSET) - PP_DMA_INDEX_SHIFT) : 0) + #endif
#define PP_DMA_INDEX_MASK GENMASK(PP_DMA_INDEX_BITS + PP_DMA_INDEX_SHIFT - 1, \ --- a/net/core/page_pool.c +++ b/net/core/page_pool.c @@ -462,11 +462,60 @@ page_pool_dma_sync_for_device(const stru } }
+static int page_pool_register_dma_index(struct page_pool *pool, + netmem_ref netmem, gfp_t gfp) +{ + int err = 0; + u32 id; + + if (unlikely(!PP_DMA_INDEX_BITS)) + goto out; + + if (in_softirq()) + err = xa_alloc(&pool->dma_mapped, &id, netmem_to_page(netmem), + PP_DMA_INDEX_LIMIT, gfp); + else + err = xa_alloc_bh(&pool->dma_mapped, &id, netmem_to_page(netmem), + PP_DMA_INDEX_LIMIT, gfp); + if (err) { + WARN_ONCE(err != -ENOMEM, "couldn't track DMA mapping, please report to netdev@"); + goto out; + } + + netmem_set_dma_index(netmem, id); +out: + return err; +} + +static int page_pool_release_dma_index(struct page_pool *pool, + netmem_ref netmem) +{ + struct page *old, *page = netmem_to_page(netmem); + unsigned long id; + + if (unlikely(!PP_DMA_INDEX_BITS)) + return 0; + + id = netmem_get_dma_index(netmem); + if (!id) + return -1; + + if (in_softirq()) + old = xa_cmpxchg(&pool->dma_mapped, id, page, NULL, 0); + else + old = xa_cmpxchg_bh(&pool->dma_mapped, id, page, NULL, 0); + if (old != page) + return -1; + + netmem_set_dma_index(netmem, 0); + + return 0; +} + static bool page_pool_dma_map(struct page_pool *pool, netmem_ref netmem, gfp_t gfp) { dma_addr_t dma; int err; - u32 id;
/* Setup DMA mapping: use 'struct page' area for storing DMA-addr * since dma_addr_t can be either 32 or 64 bits and does not always fit @@ -485,18 +534,10 @@ static bool page_pool_dma_map(struct pag goto unmap_failed; }
- if (in_softirq()) - err = xa_alloc(&pool->dma_mapped, &id, netmem_to_page(netmem), - PP_DMA_INDEX_LIMIT, gfp); - else - err = xa_alloc_bh(&pool->dma_mapped, &id, netmem_to_page(netmem), - PP_DMA_INDEX_LIMIT, gfp); - if (err) { - WARN_ONCE(err != -ENOMEM, "couldn't track DMA mapping, please report to netdev@"); + err = page_pool_register_dma_index(pool, netmem, gfp); + if (err) goto unset_failed; - }
- netmem_set_dma_index(netmem, id); page_pool_dma_sync_for_device(pool, netmem, pool->p.max_len);
return true; @@ -669,8 +710,6 @@ void page_pool_clear_pp_info(netmem_ref static __always_inline void __page_pool_release_page_dma(struct page_pool *pool, netmem_ref netmem) { - struct page *old, *page = netmem_to_page(netmem); - unsigned long id; dma_addr_t dma;
if (!pool->dma_map) @@ -679,15 +718,7 @@ static __always_inline void __page_pool_ */ return;
- id = netmem_get_dma_index(netmem); - if (!id) - return; - - if (in_softirq()) - old = xa_cmpxchg(&pool->dma_mapped, id, page, NULL, 0); - else - old = xa_cmpxchg_bh(&pool->dma_mapped, id, page, NULL, 0); - if (old != page) + if (page_pool_release_dma_index(pool, netmem)) return;
dma = page_pool_get_dma_addr_netmem(netmem); @@ -697,7 +728,6 @@ static __always_inline void __page_pool_ PAGE_SIZE << pool->p.order, pool->p.dma_dir, DMA_ATTR_SKIP_CPU_SYNC | DMA_ATTR_WEAK_ORDERING); page_pool_set_dma_addr_netmem(netmem, 0); - netmem_set_dma_index(netmem, 0); }
/* Disconnects a page (from a page_pool). API users can have a need
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Petr Tesarik ptesarik@suse.com
commit 16abbabc004bedeeaa702e11913da9d4fa70e63a upstream.
Set __entry->dir to the actual "dir" parameter of all trace events in dma_alloc_class. This struct member was left uninitialized by mistake.
Signed-off-by: Petr Tesarik ptesarik@suse.com Fixes: 3afff779a725 ("dma-mapping: trace dma_alloc/free direction") Cc: stable@vger.kernel.org Reviewed-by: Sean Anderson sean.anderson@linux.dev Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com Link: https://lore.kernel.org/r/20251001061028.412258-1-ptesarik@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/trace/events/dma.h | 1 + 1 file changed, 1 insertion(+)
--- a/include/trace/events/dma.h +++ b/include/trace/events/dma.h @@ -136,6 +136,7 @@ DECLARE_EVENT_CLASS(dma_alloc_class, __entry->dma_addr = dma_addr; __entry->size = size; __entry->flags = flags; + __entry->dir = dir; __entry->attrs = attrs; ),
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson seanjc@google.com
commit 68e61f6fd65610e73b17882f86fedfd784d99229 upstream.
Emulate PERF_CNTR_GLOBAL_STATUS_SET when PerfMonV2 is enumerated to the guest, as the MSR is supposed to exist in all AMD v2 PMUs.
Fixes: 4a2771895ca6 ("KVM: x86/svm/pmu: Add AMD PerfMonV2 support") Cc: stable@vger.kernel.org Cc: Sandipan Das sandipan.das@amd.com Link: https://lore.kernel.org/r/20250711172746.1579423-1-seanjc@google.com Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/msr-index.h | 1 + arch/x86/kvm/pmu.c | 5 +++++ arch/x86/kvm/svm/pmu.c | 1 + arch/x86/kvm/x86.c | 2 ++ 4 files changed, 9 insertions(+)
--- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -722,6 +722,7 @@ #define MSR_AMD64_PERF_CNTR_GLOBAL_STATUS 0xc0000300 #define MSR_AMD64_PERF_CNTR_GLOBAL_CTL 0xc0000301 #define MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR 0xc0000302 +#define MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_SET 0xc0000303
/* AMD Last Branch Record MSRs */ #define MSR_AMD64_LBR_SELECT 0xc000010e --- a/arch/x86/kvm/pmu.c +++ b/arch/x86/kvm/pmu.c @@ -650,6 +650,7 @@ int kvm_pmu_get_msr(struct kvm_vcpu *vcp msr_info->data = pmu->global_ctrl; break; case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR: + case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_SET: case MSR_CORE_PERF_GLOBAL_OVF_CTRL: msr_info->data = 0; break; @@ -711,6 +712,10 @@ int kvm_pmu_set_msr(struct kvm_vcpu *vcp if (!msr_info->host_initiated) pmu->global_status &= ~data; break; + case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_SET: + if (!msr_info->host_initiated) + pmu->global_status |= data & ~pmu->global_status_rsvd; + break; default: kvm_pmu_mark_pmc_in_use(vcpu, msr_info->index); return kvm_pmu_call(set_msr)(vcpu, msr_info); --- a/arch/x86/kvm/svm/pmu.c +++ b/arch/x86/kvm/svm/pmu.c @@ -113,6 +113,7 @@ static bool amd_is_valid_msr(struct kvm_ case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS: case MSR_AMD64_PERF_CNTR_GLOBAL_CTL: case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR: + case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_SET: return pmu->version > 1; default: if (msr > MSR_F15H_PERF_CTR5 && --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -364,6 +364,7 @@ static const u32 msrs_to_save_pmu[] = { MSR_AMD64_PERF_CNTR_GLOBAL_CTL, MSR_AMD64_PERF_CNTR_GLOBAL_STATUS, MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR, + MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_SET, };
static u32 msrs_to_save[ARRAY_SIZE(msrs_to_save_base) + @@ -7449,6 +7450,7 @@ static void kvm_probe_msr_to_save(u32 ms case MSR_AMD64_PERF_CNTR_GLOBAL_CTL: case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS: case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_CLR: + case MSR_AMD64_PERF_CNTR_GLOBAL_STATUS_SET: if (!kvm_cpu_cap_has(X86_FEATURE_PERFMON_V2)) return; break;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Hennerich michael.hennerich@analog.com
commit 1d8fdabe19267338f29b58f968499e5b55e6a3b6 upstream.
The clk div bits (2 bits wide) do not start in bit 16 but in bit 15. Fix it accordingly.
Fixes: e31166f0fd48 ("iio: frequency: New driver for Analog Devices ADF4350/ADF4351 Wideband Synthesizers") Signed-off-by: Michael Hennerich michael.hennerich@analog.com Signed-off-by: Nuno Sá nuno.sa@analog.com Link: https://patch.msgid.link/20250829-adf4350-fix-v2-2-0bf543ba797d@analog.com Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/iio/frequency/adf4350.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/include/linux/iio/frequency/adf4350.h +++ b/include/linux/iio/frequency/adf4350.h @@ -51,7 +51,7 @@
/* REG3 Bit Definitions */ #define ADF4350_REG3_12BIT_CLKDIV(x) ((x) << 3) -#define ADF4350_REG3_12BIT_CLKDIV_MODE(x) ((x) << 16) +#define ADF4350_REG3_12BIT_CLKDIV_MODE(x) ((x) << 15) #define ADF4350_REG3_12BIT_CSR_EN (1 << 18) #define ADF4351_REG3_CHARGE_CANCELLATION_EN (1 << 21) #define ADF4351_REG3_ANTI_BACKLASH_3ns_EN (1 << 22)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tomi Valkeinen tomi.valkeinen@ideasonboard.com
commit f37df9a0eb5e43fcfe02cbaef076123dc0d79c7e upstream.
v4l2_subdev_call_state_try() macro allocates a subdev state with __v4l2_subdev_state_alloc(), but does not check the returned value. If __v4l2_subdev_state_alloc fails, it returns an ERR_PTR, and that would cause v4l2_subdev_call_state_try() to crash.
Add proper error handling to v4l2_subdev_call_state_try().
Signed-off-by: Tomi Valkeinen tomi.valkeinen@ideasonboard.com Fixes: 982c0487185b ("media: subdev: Add v4l2_subdev_call_state_try() macro") Reported-by: Dan Carpenter dan.carpenter@linaro.org Closes: https://lore.kernel.org/all/aJTNtpDUbTz7eyJc%40stanley.mountain/ Cc: stable@vger.kernel.org Reviewed-by: Dan Carpenter dan.carpenter@linaro.org Signed-off-by: Sakari Ailus sakari.ailus@linux.intel.com Signed-off-by: Hans Verkuil hverkuil+cisco@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/media/v4l2-subdev.h | 30 +++++++++++++++++------------- 1 file changed, 17 insertions(+), 13 deletions(-)
--- a/include/media/v4l2-subdev.h +++ b/include/media/v4l2-subdev.h @@ -1952,19 +1952,23 @@ extern const struct v4l2_subdev_ops v4l2 * * Note: only legacy non-MC drivers may need this macro. */ -#define v4l2_subdev_call_state_try(sd, o, f, args...) \ - ({ \ - int __result; \ - static struct lock_class_key __key; \ - const char *name = KBUILD_BASENAME \ - ":" __stringify(__LINE__) ":state->lock"; \ - struct v4l2_subdev_state *state = \ - __v4l2_subdev_state_alloc(sd, name, &__key); \ - v4l2_subdev_lock_state(state); \ - __result = v4l2_subdev_call(sd, o, f, state, ##args); \ - v4l2_subdev_unlock_state(state); \ - __v4l2_subdev_state_free(state); \ - __result; \ +#define v4l2_subdev_call_state_try(sd, o, f, args...) \ + ({ \ + int __result; \ + static struct lock_class_key __key; \ + const char *name = KBUILD_BASENAME \ + ":" __stringify(__LINE__) ":state->lock"; \ + struct v4l2_subdev_state *state = \ + __v4l2_subdev_state_alloc(sd, name, &__key); \ + if (IS_ERR(state)) { \ + __result = PTR_ERR(state); \ + } else { \ + v4l2_subdev_lock_state(state); \ + __result = v4l2_subdev_call(sd, o, f, state, ##args); \ + v4l2_subdev_unlock_state(state); \ + __v4l2_subdev_state_free(state); \ + } \ + __result; \ })
/**
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Varad Gautam varadgautam@google.com
commit 8327bd4fcb6c1dab01ce5c6ff00b42496836dcd2 upstream.
With `CONFIG_TRACE_MMIO_ACCESS=y`, the `{read,write}{b,w,l,q}{_relaxed}()` mmio accessors unconditionally call `log_{post_}{read,write}_mmio()` helpers, which in turn call the ftrace ops for `rwmmio` trace events
This adds a performance penalty per mmio accessor call, even when `rwmmio` events are disabled at runtime (~80% overhead on local measurement).
Guard these with `tracepoint_enabled()`.
Signed-off-by: Varad Gautam varadgautam@google.com Fixes: 210031971cdd ("asm-generic/io: Add logging support for MMIO accessors") Cc: stable@vger.kernel.org Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/asm-generic/io.h | 98 +++++++++++++++++++++++++++++++---------------- 1 file changed, 66 insertions(+), 32 deletions(-)
--- a/include/asm-generic/io.h +++ b/include/asm-generic/io.h @@ -75,6 +75,7 @@ #if IS_ENABLED(CONFIG_TRACE_MMIO_ACCESS) && !(defined(__DISABLE_TRACE_MMIO__)) #include <linux/tracepoint-defs.h>
+#define rwmmio_tracepoint_enabled(tracepoint) tracepoint_enabled(tracepoint) DECLARE_TRACEPOINT(rwmmio_write); DECLARE_TRACEPOINT(rwmmio_post_write); DECLARE_TRACEPOINT(rwmmio_read); @@ -91,6 +92,7 @@ void log_post_read_mmio(u64 val, u8 widt
#else
+#define rwmmio_tracepoint_enabled(tracepoint) false static inline void log_write_mmio(u64 val, u8 width, volatile void __iomem *addr, unsigned long caller_addr, unsigned long caller_addr0) {} static inline void log_post_write_mmio(u64 val, u8 width, volatile void __iomem *addr, @@ -189,11 +191,13 @@ static inline u8 readb(const volatile vo { u8 val;
- log_read_mmio(8, addr, _THIS_IP_, _RET_IP_); + if (rwmmio_tracepoint_enabled(rwmmio_read)) + log_read_mmio(8, addr, _THIS_IP_, _RET_IP_); __io_br(); val = __raw_readb(addr); __io_ar(val); - log_post_read_mmio(val, 8, addr, _THIS_IP_, _RET_IP_); + if (rwmmio_tracepoint_enabled(rwmmio_post_read)) + log_post_read_mmio(val, 8, addr, _THIS_IP_, _RET_IP_); return val; } #endif @@ -204,11 +208,13 @@ static inline u16 readw(const volatile v { u16 val;
- log_read_mmio(16, addr, _THIS_IP_, _RET_IP_); + if (rwmmio_tracepoint_enabled(rwmmio_read)) + log_read_mmio(16, addr, _THIS_IP_, _RET_IP_); __io_br(); val = __le16_to_cpu((__le16 __force)__raw_readw(addr)); __io_ar(val); - log_post_read_mmio(val, 16, addr, _THIS_IP_, _RET_IP_); + if (rwmmio_tracepoint_enabled(rwmmio_post_read)) + log_post_read_mmio(val, 16, addr, _THIS_IP_, _RET_IP_); return val; } #endif @@ -219,11 +225,13 @@ static inline u32 readl(const volatile v { u32 val;
- log_read_mmio(32, addr, _THIS_IP_, _RET_IP_); + if (rwmmio_tracepoint_enabled(rwmmio_read)) + log_read_mmio(32, addr, _THIS_IP_, _RET_IP_); __io_br(); val = __le32_to_cpu((__le32 __force)__raw_readl(addr)); __io_ar(val); - log_post_read_mmio(val, 32, addr, _THIS_IP_, _RET_IP_); + if (rwmmio_tracepoint_enabled(rwmmio_post_read)) + log_post_read_mmio(val, 32, addr, _THIS_IP_, _RET_IP_); return val; } #endif @@ -235,11 +243,13 @@ static inline u64 readq(const volatile v { u64 val;
- log_read_mmio(64, addr, _THIS_IP_, _RET_IP_); + if (rwmmio_tracepoint_enabled(rwmmio_read)) + log_read_mmio(64, addr, _THIS_IP_, _RET_IP_); __io_br(); val = __le64_to_cpu((__le64 __force)__raw_readq(addr)); __io_ar(val); - log_post_read_mmio(val, 64, addr, _THIS_IP_, _RET_IP_); + if (rwmmio_tracepoint_enabled(rwmmio_post_read)) + log_post_read_mmio(val, 64, addr, _THIS_IP_, _RET_IP_); return val; } #endif @@ -249,11 +259,13 @@ static inline u64 readq(const volatile v #define writeb writeb static inline void writeb(u8 value, volatile void __iomem *addr) { - log_write_mmio(value, 8, addr, _THIS_IP_, _RET_IP_); + if (rwmmio_tracepoint_enabled(rwmmio_write)) + log_write_mmio(value, 8, addr, _THIS_IP_, _RET_IP_); __io_bw(); __raw_writeb(value, addr); __io_aw(); - log_post_write_mmio(value, 8, addr, _THIS_IP_, _RET_IP_); + if (rwmmio_tracepoint_enabled(rwmmio_post_write)) + log_post_write_mmio(value, 8, addr, _THIS_IP_, _RET_IP_); } #endif
@@ -261,11 +273,13 @@ static inline void writeb(u8 value, vola #define writew writew static inline void writew(u16 value, volatile void __iomem *addr) { - log_write_mmio(value, 16, addr, _THIS_IP_, _RET_IP_); + if (rwmmio_tracepoint_enabled(rwmmio_write)) + log_write_mmio(value, 16, addr, _THIS_IP_, _RET_IP_); __io_bw(); __raw_writew((u16 __force)cpu_to_le16(value), addr); __io_aw(); - log_post_write_mmio(value, 16, addr, _THIS_IP_, _RET_IP_); + if (rwmmio_tracepoint_enabled(rwmmio_post_write)) + log_post_write_mmio(value, 16, addr, _THIS_IP_, _RET_IP_); } #endif
@@ -273,11 +287,13 @@ static inline void writew(u16 value, vol #define writel writel static inline void writel(u32 value, volatile void __iomem *addr) { - log_write_mmio(value, 32, addr, _THIS_IP_, _RET_IP_); + if (rwmmio_tracepoint_enabled(rwmmio_write)) + log_write_mmio(value, 32, addr, _THIS_IP_, _RET_IP_); __io_bw(); __raw_writel((u32 __force)__cpu_to_le32(value), addr); __io_aw(); - log_post_write_mmio(value, 32, addr, _THIS_IP_, _RET_IP_); + if (rwmmio_tracepoint_enabled(rwmmio_post_write)) + log_post_write_mmio(value, 32, addr, _THIS_IP_, _RET_IP_); } #endif
@@ -286,11 +302,13 @@ static inline void writel(u32 value, vol #define writeq writeq static inline void writeq(u64 value, volatile void __iomem *addr) { - log_write_mmio(value, 64, addr, _THIS_IP_, _RET_IP_); + if (rwmmio_tracepoint_enabled(rwmmio_write)) + log_write_mmio(value, 64, addr, _THIS_IP_, _RET_IP_); __io_bw(); __raw_writeq((u64 __force)__cpu_to_le64(value), addr); __io_aw(); - log_post_write_mmio(value, 64, addr, _THIS_IP_, _RET_IP_); + if (rwmmio_tracepoint_enabled(rwmmio_post_write)) + log_post_write_mmio(value, 64, addr, _THIS_IP_, _RET_IP_); } #endif #endif /* CONFIG_64BIT */ @@ -306,9 +324,11 @@ static inline u8 readb_relaxed(const vol { u8 val;
- log_read_mmio(8, addr, _THIS_IP_, _RET_IP_); + if (rwmmio_tracepoint_enabled(rwmmio_read)) + log_read_mmio(8, addr, _THIS_IP_, _RET_IP_); val = __raw_readb(addr); - log_post_read_mmio(val, 8, addr, _THIS_IP_, _RET_IP_); + if (rwmmio_tracepoint_enabled(rwmmio_post_read)) + log_post_read_mmio(val, 8, addr, _THIS_IP_, _RET_IP_); return val; } #endif @@ -319,9 +339,11 @@ static inline u16 readw_relaxed(const vo { u16 val;
- log_read_mmio(16, addr, _THIS_IP_, _RET_IP_); + if (rwmmio_tracepoint_enabled(rwmmio_read)) + log_read_mmio(16, addr, _THIS_IP_, _RET_IP_); val = __le16_to_cpu((__le16 __force)__raw_readw(addr)); - log_post_read_mmio(val, 16, addr, _THIS_IP_, _RET_IP_); + if (rwmmio_tracepoint_enabled(rwmmio_post_read)) + log_post_read_mmio(val, 16, addr, _THIS_IP_, _RET_IP_); return val; } #endif @@ -332,9 +354,11 @@ static inline u32 readl_relaxed(const vo { u32 val;
- log_read_mmio(32, addr, _THIS_IP_, _RET_IP_); + if (rwmmio_tracepoint_enabled(rwmmio_read)) + log_read_mmio(32, addr, _THIS_IP_, _RET_IP_); val = __le32_to_cpu((__le32 __force)__raw_readl(addr)); - log_post_read_mmio(val, 32, addr, _THIS_IP_, _RET_IP_); + if (rwmmio_tracepoint_enabled(rwmmio_post_read)) + log_post_read_mmio(val, 32, addr, _THIS_IP_, _RET_IP_); return val; } #endif @@ -345,9 +369,11 @@ static inline u64 readq_relaxed(const vo { u64 val;
- log_read_mmio(64, addr, _THIS_IP_, _RET_IP_); + if (rwmmio_tracepoint_enabled(rwmmio_read)) + log_read_mmio(64, addr, _THIS_IP_, _RET_IP_); val = __le64_to_cpu((__le64 __force)__raw_readq(addr)); - log_post_read_mmio(val, 64, addr, _THIS_IP_, _RET_IP_); + if (rwmmio_tracepoint_enabled(rwmmio_post_read)) + log_post_read_mmio(val, 64, addr, _THIS_IP_, _RET_IP_); return val; } #endif @@ -356,9 +382,11 @@ static inline u64 readq_relaxed(const vo #define writeb_relaxed writeb_relaxed static inline void writeb_relaxed(u8 value, volatile void __iomem *addr) { - log_write_mmio(value, 8, addr, _THIS_IP_, _RET_IP_); + if (rwmmio_tracepoint_enabled(rwmmio_write)) + log_write_mmio(value, 8, addr, _THIS_IP_, _RET_IP_); __raw_writeb(value, addr); - log_post_write_mmio(value, 8, addr, _THIS_IP_, _RET_IP_); + if (rwmmio_tracepoint_enabled(rwmmio_post_write)) + log_post_write_mmio(value, 8, addr, _THIS_IP_, _RET_IP_); } #endif
@@ -366,9 +394,11 @@ static inline void writeb_relaxed(u8 val #define writew_relaxed writew_relaxed static inline void writew_relaxed(u16 value, volatile void __iomem *addr) { - log_write_mmio(value, 16, addr, _THIS_IP_, _RET_IP_); + if (rwmmio_tracepoint_enabled(rwmmio_write)) + log_write_mmio(value, 16, addr, _THIS_IP_, _RET_IP_); __raw_writew((u16 __force)cpu_to_le16(value), addr); - log_post_write_mmio(value, 16, addr, _THIS_IP_, _RET_IP_); + if (rwmmio_tracepoint_enabled(rwmmio_post_write)) + log_post_write_mmio(value, 16, addr, _THIS_IP_, _RET_IP_); } #endif
@@ -376,9 +406,11 @@ static inline void writew_relaxed(u16 va #define writel_relaxed writel_relaxed static inline void writel_relaxed(u32 value, volatile void __iomem *addr) { - log_write_mmio(value, 32, addr, _THIS_IP_, _RET_IP_); + if (rwmmio_tracepoint_enabled(rwmmio_write)) + log_write_mmio(value, 32, addr, _THIS_IP_, _RET_IP_); __raw_writel((u32 __force)__cpu_to_le32(value), addr); - log_post_write_mmio(value, 32, addr, _THIS_IP_, _RET_IP_); + if (rwmmio_tracepoint_enabled(rwmmio_post_write)) + log_post_write_mmio(value, 32, addr, _THIS_IP_, _RET_IP_); } #endif
@@ -386,9 +418,11 @@ static inline void writel_relaxed(u32 va #define writeq_relaxed writeq_relaxed static inline void writeq_relaxed(u64 value, volatile void __iomem *addr) { - log_write_mmio(value, 64, addr, _THIS_IP_, _RET_IP_); + if (rwmmio_tracepoint_enabled(rwmmio_write)) + log_write_mmio(value, 64, addr, _THIS_IP_, _RET_IP_); __raw_writeq((u64 __force)__cpu_to_le64(value), addr); - log_post_write_mmio(value, 64, addr, _THIS_IP_, _RET_IP_); + if (rwmmio_tracepoint_enabled(rwmmio_post_write)) + log_post_write_mmio(value, 64, addr, _THIS_IP_, _RET_IP_); } #endif
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ian Rogers irogers@google.com
[ Upstream commit 78d853512d6f979cf0cc41566e4f6cd82995ff34 ]
Incrementing NULL is undefined behavior and triggers ubsan during the perf annotate test.
Split a compound statement over two lines to avoid this.
Fixes: 98f69a573c668a18 ("perf annotate: Split out util/disasm.c") Reviewed-by: Collin Funk collin.funk1@gmail.com Reviewed-by: James Clark james.clark@linaro.org Reviewed-by: Kuan-Wei Chiu visitorckw@gmail.com Signed-off-by: Ian Rogers irogers@google.com Acked-by: Namhyung Kim namhyung@kernel.org Cc: Adrian Hunter adrian.hunter@intel.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Athira Rajeev atrajeev@linux.ibm.com Cc: Blake Jones blakejones@google.com Cc: Chun-Tse Shao ctshao@google.com Cc: Howard Chu howardchu95@gmail.com Cc: Ingo Molnar mingo@redhat.com Cc: Jan Polensky japo@linux.ibm.com Cc: Jiri Olsa jolsa@kernel.org Cc: Kan Liang kan.liang@linux.intel.com Cc: Li Huafei lihuafei1@huawei.com Cc: Mark Rutland mark.rutland@arm.com Cc: Nam Cao namcao@linutronix.de Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Steinar H. Gunderson sesse@google.com Cc: Thomas Gleixner tglx@linutronix.de Link: https://lore.kernel.org/r/20250821163820.1132977-2-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/disasm.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/disasm.c b/tools/perf/util/disasm.c index 648e8d87ef194..a228a7ba30caa 100644 --- a/tools/perf/util/disasm.c +++ b/tools/perf/util/disasm.c @@ -389,13 +389,16 @@ static int jump__parse(struct arch *arch, struct ins_operands *ops, struct map_s * skip over possible up to 2 operands to get to address, e.g.: * tbnz w0, #26, ffff0000083cd190 <security_file_permission+0xd0> */ - if (c++ != NULL) { + if (c != NULL) { + c++; ops->target.addr = strtoull(c, NULL, 16); if (!ops->target.addr) { c = strchr(c, ','); c = validate_comma(c, ops); - if (c++ != NULL) + if (c != NULL) { + c++; ops->target.addr = strtoull(c, NULL, 16); + } } } else { ops->target.addr = strtoull(ops->raw, NULL, 16);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ian Rogers irogers@google.com
[ Upstream commit 4bd5bd8dbd41a208fb73afb65bda6f38e2b5a637 ]
Modify test behavior to skip if BPF calls fail with "Operation not permitted".
Fixes: d66763fed30f0bd8 ("perf test trace_btf_enum: Add regression test for the BTF augmentation of enums in 'perf trace'") Reviewed-by: James Clark james.clark@linaro.org Signed-off-by: Ian Rogers irogers@google.com Acked-by: Namhyung Kim namhyung@kernel.org Cc: Adrian Hunter adrian.hunter@intel.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Athira Rajeev atrajeev@linux.ibm.com Cc: Blake Jones blakejones@google.com Cc: Chun-Tse Shao ctshao@google.com Cc: Collin Funk collin.funk1@gmail.com Cc: Howard Chu howardchu95@gmail.com Cc: Ingo Molnar mingo@redhat.com Cc: Jan Polensky japo@linux.ibm.com Cc: Jiri Olsa jolsa@kernel.org Cc: Kan Liang kan.liang@linux.intel.com Cc: Li Huafei lihuafei1@huawei.com Cc: Mark Rutland mark.rutland@arm.com Cc: Nam Cao namcao@linutronix.de Cc: Peter Zijlstra peterz@infradead.org Cc: Steinar H. Gunderson sesse@google.com Cc: Thomas Gleixner tglx@linutronix.de Link: https://lore.kernel.org/r/20250821163820.1132977-3-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/tests/shell/trace_btf_enum.sh | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/tools/perf/tests/shell/trace_btf_enum.sh b/tools/perf/tests/shell/trace_btf_enum.sh index 8d1e6bbeac906..1447d7425f381 100755 --- a/tools/perf/tests/shell/trace_btf_enum.sh +++ b/tools/perf/tests/shell/trace_btf_enum.sh @@ -23,6 +23,14 @@ check_vmlinux() { fi }
+check_permissions() { + if perf trace -e $syscall $TESTPROG 2>&1 | grep -q "Operation not permitted" + then + echo "trace+enum test [Skipped permissions]" + err=2 + fi +} + trace_landlock() { echo "Tracing syscall ${syscall}"
@@ -54,6 +62,9 @@ trace_non_syscall() { }
check_vmlinux +if [ $err = 0 ]; then + check_permissions +fi
if [ $err = 0 ]; then trace_landlock
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ian Rogers irogers@google.com
[ Upstream commit 2354479026d726954ff86ce82f4b649637319661 ]
An evsel should typically have a leader of itself, however, in tests like 'Sample parsing' a NULL leader may occur and the container_of will return a corrupt pointer.
Avoid this with an explicit NULL test.
Fixes: fba7c86601e2e42d ("libperf: Move 'leader' from tools/perf to perf_evsel::leader") Reviewed-by: James Clark james.clark@linaro.org Signed-off-by: Ian Rogers irogers@google.com Acked-by: Namhyung Kim namhyung@kernel.org Cc: Adrian Hunter adrian.hunter@intel.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Athira Rajeev atrajeev@linux.ibm.com Cc: Blake Jones blakejones@google.com Cc: Chun-Tse Shao ctshao@google.com Cc: Collin Funk collin.funk1@gmail.com Cc: Howard Chu howardchu95@gmail.com Cc: Ingo Molnar mingo@redhat.com Cc: Jan Polensky japo@linux.ibm.com Cc: Jiri Olsa jolsa@kernel.org Cc: Kan Liang kan.liang@linux.intel.com Cc: Li Huafei lihuafei1@huawei.com Cc: Mark Rutland mark.rutland@arm.com Cc: Nam Cao namcao@linutronix.de Cc: Peter Zijlstra peterz@infradead.org Cc: Steinar H. Gunderson sesse@google.com Cc: Thomas Gleixner tglx@linutronix.de Link: https://lore.kernel.org/r/20250821163820.1132977-4-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/evsel.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c index 6d7249cc1a993..b3de8ce559998 100644 --- a/tools/perf/util/evsel.c +++ b/tools/perf/util/evsel.c @@ -3497,6 +3497,8 @@ bool evsel__is_hybrid(const struct evsel *evsel)
struct evsel *evsel__leader(const struct evsel *evsel) { + if (evsel->core.leader == NULL) + return NULL; return container_of(evsel->core.leader, struct evsel, core); }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ian Rogers irogers@google.com
[ Upstream commit b39c915a4f365cce6bdc0e538ed95d31823aea8f ]
Perf's synthetic-events.c will ensure 8-byte alignment of tracing data, writing it after a perf_record_header_tracing_data event.
Add padding to struct perf_record_header_tracing_data to make it 16-byte rather than 12-byte sized.
Fixes: 055c67ed39887c55 ("perf tools: Move event synthesizing routines to separate .c file") Reviewed-by: James Clark james.clark@linaro.org Signed-off-by: Ian Rogers irogers@google.com Acked-by: Namhyung Kim namhyung@kernel.org Tested-by: Arnaldo Carvalho de Melo acme@redhat.com Cc: Adrian Hunter adrian.hunter@intel.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Athira Rajeev atrajeev@linux.ibm.com Cc: Blake Jones blakejones@google.com Cc: Chun-Tse Shao ctshao@google.com Cc: Collin Funk collin.funk1@gmail.com Cc: Howard Chu howardchu95@gmail.com Cc: Ingo Molnar mingo@redhat.com Cc: Jan Polensky japo@linux.ibm.com Cc: Jiri Olsa jolsa@kernel.org Cc: Kan Liang kan.liang@linux.intel.com Cc: Li Huafei lihuafei1@huawei.com Cc: Mark Rutland mark.rutland@arm.com Cc: Nam Cao namcao@linutronix.de Cc: Peter Zijlstra peterz@infradead.org Cc: Steinar H. Gunderson sesse@google.com Cc: Thomas Gleixner tglx@linutronix.de Link: https://lore.kernel.org/r/20250821163820.1132977-6-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/lib/perf/include/perf/event.h | 1 + 1 file changed, 1 insertion(+)
diff --git a/tools/lib/perf/include/perf/event.h b/tools/lib/perf/include/perf/event.h index 37bb7771d9143..32b75c0326c93 100644 --- a/tools/lib/perf/include/perf/event.h +++ b/tools/lib/perf/include/perf/event.h @@ -291,6 +291,7 @@ struct perf_record_header_event_type { struct perf_record_header_tracing_data { struct perf_event_header header; __u32 size; + __u32 pad; };
#define PERF_RECORD_MISC_BUILD_ID_SIZE (1 << 15)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@linaro.org
[ Upstream commit 1e50f5c9965252ed6657b8692cd7366784d60616 ]
The devm_clk_hw_get_clk() function doesn't return NULL, it returns error pointers. Update the checking to match.
Fixes: 8737ec830ee3 ("clk: qcom: common: Add interconnect clocks support") Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Reviewed-by: Imran Shaik imran.shaik@oss.qualcomm.com Reviewed-by: Konrad Dybcio konrad.dybcio@oss.qualcomm.com Reviewed-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Link: https://lore.kernel.org/r/aLaPwL2gFS85WsfD@stanley.mountain Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/qcom/common.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/clk/qcom/common.c b/drivers/clk/qcom/common.c index 33cc1f73c69d1..cab2dbb7a8d49 100644 --- a/drivers/clk/qcom/common.c +++ b/drivers/clk/qcom/common.c @@ -273,8 +273,8 @@ static int qcom_cc_icc_register(struct device *dev, icd[i].slave_id = desc->icc_hws[i].slave_id; hws = &desc->clks[desc->icc_hws[i].clk_id]->hw; icd[i].clk = devm_clk_hw_get_clk(dev, hws, "icc"); - if (!icd[i].clk) - return dev_err_probe(dev, -ENOENT, + if (IS_ERR(icd[i].clk)) + return dev_err_probe(dev, PTR_ERR(icd[i].clk), "(%d) clock entry is null\n", i); icd[i].name = clk_hw_get_name(hws); }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Brian Masney bmasney@redhat.com
[ Upstream commit 47b13635dabc14f1c2fdcaa5468b47ddadbdd1b5 ]
determine_rate() is expected to return an error code, or 0 on success. clk_sam9x5_peripheral_determine_rate() has a branch that returns the parent rate on a certain case. This is the behavior of round_rate(), so let's go ahead and fix this by setting req->rate.
Fixes: b4c115c76184f ("clk: at91: clk-peripheral: add support for changeable parent rate") Reviewed-by: Alexander Sverdlin alexander.sverdlin@gmail.com Acked-by: Nicolas Ferre nicolas.ferre@microchip.com Signed-off-by: Brian Masney bmasney@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/at91/clk-peripheral.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/clk/at91/clk-peripheral.c b/drivers/clk/at91/clk-peripheral.c index c173a44c800aa..629f050a855aa 100644 --- a/drivers/clk/at91/clk-peripheral.c +++ b/drivers/clk/at91/clk-peripheral.c @@ -279,8 +279,11 @@ static int clk_sam9x5_peripheral_determine_rate(struct clk_hw *hw, long best_diff = LONG_MIN; u32 shift;
- if (periph->id < PERIPHERAL_ID_MIN || !periph->range.max) - return parent_rate; + if (periph->id < PERIPHERAL_ID_MIN || !periph->range.max) { + req->rate = parent_rate; + + return 0; + }
/* Fist step: check the available dividers. */ for (shift = 0; shift <= PERIPHERAL_MAX_SHIFT; shift++) {
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yuan CHen chenyuan@kylinos.cn
[ Upstream commit cc55fc58fc1b7f405003fd2ecf79e74653461f0b ]
In case of krealloc_array() failure, the current error handling just returns from the function without freeing the original array. Fix this memory leak by freeing the original array.
Fixes: 6aa1754764901668 ("clk: renesas: cpg-mssr: Ignore all clocks assigned to non-Linux system") Signed-off-by: Yuan CHen chenyuan@kylinos.cn Reviewed-by: Geert Uytterhoeven geert+renesas@glider.be Link: https://patch.msgid.link/20250908012810.4767-1-chenyuan_fl@163.com Signed-off-by: Geert Uytterhoeven geert+renesas@glider.be Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/renesas/renesas-cpg-mssr.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/clk/renesas/renesas-cpg-mssr.c b/drivers/clk/renesas/renesas-cpg-mssr.c index 0f27c33192e10..112ed81f648ee 100644 --- a/drivers/clk/renesas/renesas-cpg-mssr.c +++ b/drivers/clk/renesas/renesas-cpg-mssr.c @@ -1012,6 +1012,7 @@ static int __init cpg_mssr_reserved_init(struct cpg_mssr_priv *priv,
of_for_each_phandle(&it, rc, node, "clocks", "#clock-cells", -1) { int idx; + unsigned int *new_ids;
if (it.node != priv->np) continue; @@ -1022,11 +1023,13 @@ static int __init cpg_mssr_reserved_init(struct cpg_mssr_priv *priv, if (args[0] != CPG_MOD) continue;
- ids = krealloc_array(ids, (num + 1), sizeof(*ids), GFP_KERNEL); - if (!ids) { + new_ids = krealloc_array(ids, (num + 1), sizeof(*ids), GFP_KERNEL); + if (!new_ids) { of_node_put(it.node); + kfree(ids); return -ENOMEM; } + ids = new_ids;
if (priv->reg_layout == CLK_REG_LAYOUT_RZ_A) idx = MOD_CLK_PACK_10(args[1]); /* for DEF_MOD_STB() */
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yunseong Kim ysk@kzalloc.com
[ Upstream commit 43fa1141e2c1af79c91aaa4df03e436c415a6fc3 ]
The lzma_is_compressed and gzip_is_compressed functions are declared to return a "bool" type, but in case of an error (e.g., file open failure), they incorrectly returned -1.
A bool type is a boolean value that is either true or false. Returning -1 for a bool return type can lead to unexpected behavior and may violate strict type-checking in some compilers.
Fix the return value to be false in error cases, ensuring the function adheres to its declared return type improves for preventing potential bugs related to type mismatch.
Fixes: 4b57fd44b61beb51 ("perf tools: Add lzma_is_compressed function") Reviewed-by: Ian Rogers irogers@google.com Signed-off-by: Yunseong Kim ysk@kzalloc.com Cc: Adrian Hunter adrian.hunter@intel.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Jiri Olsa jolsa@kernel.org Cc: Kan Liang kan.liang@linux.intel.com Cc: Namhyung Kim namhyung@kernel.org Cc: Stephen Brennan stephen.s.brennan@oracle.com Link: https://lore.kernel.org/r/20250822162506.316844-3-ysk@kzalloc.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/lzma.c | 2 +- tools/perf/util/zlib.c | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/lzma.c b/tools/perf/util/lzma.c index af9a97612f9df..f61574d1581e3 100644 --- a/tools/perf/util/lzma.c +++ b/tools/perf/util/lzma.c @@ -113,7 +113,7 @@ bool lzma_is_compressed(const char *input) ssize_t rc;
if (fd < 0) - return -1; + return false;
rc = read(fd, buf, sizeof(buf)); close(fd); diff --git a/tools/perf/util/zlib.c b/tools/perf/util/zlib.c index 78d2297c1b674..1f7c065230599 100644 --- a/tools/perf/util/zlib.c +++ b/tools/perf/util/zlib.c @@ -88,7 +88,7 @@ bool gzip_is_compressed(const char *input) ssize_t rc;
if (fd < 0) - return -1; + return false;
rc = read(fd, buf, sizeof(buf)); close(fd);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rob Herring (Arm) robh@kernel.org
[ Upstream commit 606d19ee37de3a72f1b6e95a4ea544f6f20dbb46 ]
The vendor for the X1205 RTC is not Xircom, but Xicor which was acquired by Intersil. Since the I2C subsystem drops the vendor prefix for driver matching, the vendor prefix hasn't mattered.
Fixes: 6875404fdb44 ("rtc: x1205: Add DT probing support") Signed-off-by: Rob Herring (Arm) robh@kernel.org Reviewed-by: Linus Walleij linus.walleij@linaro.org Link: https://lore.kernel.org/r/20250821215703.869628-2-robh@kernel.org Signed-off-by: Alexandre Belloni alexandre.belloni@bootlin.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/rtc/rtc-x1205.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/rtc/rtc-x1205.c b/drivers/rtc/rtc-x1205.c index 4bcd7ca32f27b..b8a0fccef14e0 100644 --- a/drivers/rtc/rtc-x1205.c +++ b/drivers/rtc/rtc-x1205.c @@ -669,7 +669,7 @@ static const struct i2c_device_id x1205_id[] = { MODULE_DEVICE_TABLE(i2c, x1205_id);
static const struct of_device_id x1205_dt_ids[] = { - { .compatible = "xircom,x1205", }, + { .compatible = "xicor,x1205", }, {}, }; MODULE_DEVICE_TABLE(of, x1205_dt_ids);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Clément Le Goffic clement.legoffic@foss.st.com
[ Upstream commit a531350d2fe58f7fc4516e555f22391dee94efd9 ]
Fix a memory leak in case of driver removal. Free the shared memory used for arguments exchanges between kernel and OP-TEE RTC PTA.
Fixes: 81c2f059ab90 ("rtc: optee: add RTC driver for OP-TEE RTC PTA") Signed-off-by: Clément Le Goffic clement.legoffic@foss.st.com Link: https://lore.kernel.org/r/20250715-upstream-optee-rtc-v1-1-e0fdf8aae545@foss... Signed-off-by: Alexandre Belloni alexandre.belloni@bootlin.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/rtc/rtc-optee.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/rtc/rtc-optee.c b/drivers/rtc/rtc-optee.c index 9f8b5d4a8f6b6..6b77c122fdc10 100644 --- a/drivers/rtc/rtc-optee.c +++ b/drivers/rtc/rtc-optee.c @@ -320,6 +320,7 @@ static int optee_rtc_remove(struct device *dev) { struct optee_rtc *priv = dev_get_drvdata(dev);
+ tee_shm_free(priv->shm); tee_client_close_session(priv->ctx, priv->session_id); tee_client_close_context(priv->ctx);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Leo Yan leo.yan@arm.com
[ Upstream commit 039fd0634a0629132432632d7ac9a14915406b5c ]
Set the mem_remote field for a remote access to appropriately represent the event.
Fixes: a89dbc9b988f3ba8 ("perf arm-spe: Set sample's data source field") Reviewed-by: James Clark james.clark@linaro.org Signed-off-by: Leo Yan leo.yan@arm.com Cc: Adrian Hunter adrian.hunter@intel.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Ali Saidi alisaidi@amazon.com Cc: German Gomez german.gomez@arm.com Cc: Ian Rogers irogers@google.com Cc: Jiri Olsa jolsa@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Will Deacon will@kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/arm-spe.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c index 2c06f2a85400e..921b1c6e11379 100644 --- a/tools/perf/util/arm-spe.c +++ b/tools/perf/util/arm-spe.c @@ -514,7 +514,7 @@ static void arm_spe__synth_data_source_generic(const struct arm_spe_record *reco }
if (record->type & ARM_SPE_REMOTE_ACCESS) - data_src->mem_lvl |= PERF_MEM_LVL_REM_CCE1; + data_src->mem_remote = PERF_MEM_REMOTE_REMOTE; }
static u64 arm_spe__synth_data_source(const struct arm_spe_record *record, u64 midr)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Leo Yan leo.yan@arm.com
[ Upstream commit 50b8f1d5bf4ad7f09ef8012ccf5f94f741df827b ]
The Neoverse CPUs follow the common data source encoding, and other CPU variants can share the same format.
Rename the CPU list and data source definitions as common data source names. This change prepares for appending more CPU variants.
Signed-off-by: Leo Yan leo.yan@arm.com Reviewed-by: James Clark james.clark@linaro.org Link: https://lore.kernel.org/r/20241003185322.192357-3-leo.yan@arm.com Signed-off-by: Namhyung Kim namhyung@kernel.org Stable-dep-of: cb300e351505 ("perf arm_spe: Correct memory level for remote access") Signed-off-by: Sasha Levin sashal@kernel.org --- .../util/arm-spe-decoder/arm-spe-decoder.h | 18 ++++++------ tools/perf/util/arm-spe.c | 28 +++++++++---------- 2 files changed, 23 insertions(+), 23 deletions(-)
diff --git a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h index 1443c28545a94..358c611eeddbb 100644 --- a/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h +++ b/tools/perf/util/arm-spe-decoder/arm-spe-decoder.h @@ -56,15 +56,15 @@ enum arm_spe_op_type { ARM_SPE_OP_BR_INDIRECT = 1 << 17, };
-enum arm_spe_neoverse_data_source { - ARM_SPE_NV_L1D = 0x0, - ARM_SPE_NV_L2 = 0x8, - ARM_SPE_NV_PEER_CORE = 0x9, - ARM_SPE_NV_LOCAL_CLUSTER = 0xa, - ARM_SPE_NV_SYS_CACHE = 0xb, - ARM_SPE_NV_PEER_CLUSTER = 0xc, - ARM_SPE_NV_REMOTE = 0xd, - ARM_SPE_NV_DRAM = 0xe, +enum arm_spe_common_data_source { + ARM_SPE_COMMON_DS_L1D = 0x0, + ARM_SPE_COMMON_DS_L2 = 0x8, + ARM_SPE_COMMON_DS_PEER_CORE = 0x9, + ARM_SPE_COMMON_DS_LOCAL_CLUSTER = 0xa, + ARM_SPE_COMMON_DS_SYS_CACHE = 0xb, + ARM_SPE_COMMON_DS_PEER_CLUSTER = 0xc, + ARM_SPE_COMMON_DS_REMOTE = 0xd, + ARM_SPE_COMMON_DS_DRAM = 0xe, };
struct arm_spe_record { diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c index 921b1c6e11379..4754d247dbe73 100644 --- a/tools/perf/util/arm-spe.c +++ b/tools/perf/util/arm-spe.c @@ -411,15 +411,15 @@ static int arm_spe__synth_instruction_sample(struct arm_spe_queue *speq, return arm_spe_deliver_synth_event(spe, speq, event, &sample); }
-static const struct midr_range neoverse_spe[] = { +static const struct midr_range common_ds_encoding_cpus[] = { MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N1), MIDR_ALL_VERSIONS(MIDR_NEOVERSE_N2), MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V1), {}, };
-static void arm_spe__synth_data_source_neoverse(const struct arm_spe_record *record, - union perf_mem_data_src *data_src) +static void arm_spe__synth_data_source_common(const struct arm_spe_record *record, + union perf_mem_data_src *data_src) { /* * Even though four levels of cache hierarchy are possible, no known @@ -441,17 +441,17 @@ static void arm_spe__synth_data_source_neoverse(const struct arm_spe_record *rec }
switch (record->source) { - case ARM_SPE_NV_L1D: + case ARM_SPE_COMMON_DS_L1D: data_src->mem_lvl = PERF_MEM_LVL_L1 | PERF_MEM_LVL_HIT; data_src->mem_lvl_num = PERF_MEM_LVLNUM_L1; data_src->mem_snoop = PERF_MEM_SNOOP_NONE; break; - case ARM_SPE_NV_L2: + case ARM_SPE_COMMON_DS_L2: data_src->mem_lvl = PERF_MEM_LVL_L2 | PERF_MEM_LVL_HIT; data_src->mem_lvl_num = PERF_MEM_LVLNUM_L2; data_src->mem_snoop = PERF_MEM_SNOOP_NONE; break; - case ARM_SPE_NV_PEER_CORE: + case ARM_SPE_COMMON_DS_PEER_CORE: data_src->mem_lvl = PERF_MEM_LVL_L2 | PERF_MEM_LVL_HIT; data_src->mem_lvl_num = PERF_MEM_LVLNUM_L2; data_src->mem_snoopx = PERF_MEM_SNOOPX_PEER; @@ -460,8 +460,8 @@ static void arm_spe__synth_data_source_neoverse(const struct arm_spe_record *rec * We don't know if this is L1, L2 but we do know it was a cache-2-cache * transfer, so set SNOOPX_PEER */ - case ARM_SPE_NV_LOCAL_CLUSTER: - case ARM_SPE_NV_PEER_CLUSTER: + case ARM_SPE_COMMON_DS_LOCAL_CLUSTER: + case ARM_SPE_COMMON_DS_PEER_CLUSTER: data_src->mem_lvl = PERF_MEM_LVL_L3 | PERF_MEM_LVL_HIT; data_src->mem_lvl_num = PERF_MEM_LVLNUM_L3; data_src->mem_snoopx = PERF_MEM_SNOOPX_PEER; @@ -469,7 +469,7 @@ static void arm_spe__synth_data_source_neoverse(const struct arm_spe_record *rec /* * System cache is assumed to be L3 */ - case ARM_SPE_NV_SYS_CACHE: + case ARM_SPE_COMMON_DS_SYS_CACHE: data_src->mem_lvl = PERF_MEM_LVL_L3 | PERF_MEM_LVL_HIT; data_src->mem_lvl_num = PERF_MEM_LVLNUM_L3; data_src->mem_snoop = PERF_MEM_SNOOP_HIT; @@ -478,13 +478,13 @@ static void arm_spe__synth_data_source_neoverse(const struct arm_spe_record *rec * We don't know what level it hit in, except it came from the other * socket */ - case ARM_SPE_NV_REMOTE: + case ARM_SPE_COMMON_DS_REMOTE: data_src->mem_lvl = PERF_MEM_LVL_REM_CCE1; data_src->mem_lvl_num = PERF_MEM_LVLNUM_ANY_CACHE; data_src->mem_remote = PERF_MEM_REMOTE_REMOTE; data_src->mem_snoopx = PERF_MEM_SNOOPX_PEER; break; - case ARM_SPE_NV_DRAM: + case ARM_SPE_COMMON_DS_DRAM: data_src->mem_lvl = PERF_MEM_LVL_LOC_RAM | PERF_MEM_LVL_HIT; data_src->mem_lvl_num = PERF_MEM_LVLNUM_RAM; data_src->mem_snoop = PERF_MEM_SNOOP_NONE; @@ -520,7 +520,7 @@ static void arm_spe__synth_data_source_generic(const struct arm_spe_record *reco static u64 arm_spe__synth_data_source(const struct arm_spe_record *record, u64 midr) { union perf_mem_data_src data_src = { .mem_op = PERF_MEM_OP_NA }; - bool is_neoverse = is_midr_in_range_list(midr, neoverse_spe); + bool is_common = is_midr_in_range_list(midr, common_ds_encoding_cpus);
/* Only synthesize data source for LDST operations */ if (!is_ldst_op(record->op)) @@ -533,8 +533,8 @@ static u64 arm_spe__synth_data_source(const struct arm_spe_record *record, u64 m else return 0;
- if (is_neoverse) - arm_spe__synth_data_source_neoverse(record, &data_src); + if (is_common) + arm_spe__synth_data_source_common(record, &data_src); else arm_spe__synth_data_source_generic(record, &data_src);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Leo Yan leo.yan@arm.com
[ Upstream commit cb300e3515057fb555983ce47e8acc86a5c69c3c ]
For remote accesses, the data source packet does not contain information about the memory level. To avoid misinformation, set the memory level to NA (Not Available).
Fixes: 4e6430cbb1a9f1dc ("perf arm-spe: Use SPE data source for neoverse cores") Reviewed-by: James Clark james.clark@linaro.org Signed-off-by: Leo Yan leo.yan@arm.com Cc: Adrian Hunter adrian.hunter@intel.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Ali Saidi alisaidi@amazon.com Cc: German Gomez german.gomez@arm.com Cc: Ian Rogers irogers@google.com Cc: Jiri Olsa jolsa@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Will Deacon will@kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/arm-spe.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/arm-spe.c b/tools/perf/util/arm-spe.c index 4754d247dbe73..9890b17241c34 100644 --- a/tools/perf/util/arm-spe.c +++ b/tools/perf/util/arm-spe.c @@ -479,8 +479,8 @@ static void arm_spe__synth_data_source_common(const struct arm_spe_record *recor * socket */ case ARM_SPE_COMMON_DS_REMOTE: - data_src->mem_lvl = PERF_MEM_LVL_REM_CCE1; - data_src->mem_lvl_num = PERF_MEM_LVLNUM_ANY_CACHE; + data_src->mem_lvl = PERF_MEM_LVL_NA; + data_src->mem_lvl_num = PERF_MEM_LVLNUM_NA; data_src->mem_remote = PERF_MEM_REMOTE_REMOTE; data_src->mem_snoopx = PERF_MEM_SNOOPX_PEER; break;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilkka Koskinen ilkka@os.amperecomputing.com
[ Upstream commit 97996580da08f06f8b09a86f3384ed9fa7a52e32 ]
Add missing 'h' to l1d_cache_access_prefetces
Also fix a couple of typos and use consistent term in brief descriptions
Fixes: 16438b652b464ef7 ("perf vendor events arm64 AmpereOneX: Add core PMU events and metrics") Reviewed-by: James Clark james.clark@linaro.org Signed-off-by: Ilkka Koskinen ilkka@os.amperecomputing.com Cc: Adrian Hunter adrian.hunter@intel.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Ian Rogers irogers@google.com Cc: Ilkka Koskinen ilkka@os.amperecomputing.com Cc: Ingo Molnar mingo@redhat.com Cc: Jiri Olsa jolsa@kernel.org Cc: John Garry john.g.garry@oracle.com Cc: Kan Liang kan.liang@linux.intel.com Cc: Leo Yan leo.yan@linux.dev Cc: Mark Rutland mark.rutland@arm.com Cc: Mike Leach mike.leach@linaro.org Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Will Deacon will@kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../arch/arm64/ampere/ampereonex/metrics.json | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/tools/perf/pmu-events/arch/arm64/ampere/ampereonex/metrics.json b/tools/perf/pmu-events/arch/arm64/ampere/ampereonex/metrics.json index 5228f94a793f9..6817cac149e0b 100644 --- a/tools/perf/pmu-events/arch/arm64/ampere/ampereonex/metrics.json +++ b/tools/perf/pmu-events/arch/arm64/ampere/ampereonex/metrics.json @@ -113,7 +113,7 @@ { "MetricName": "load_store_spec_rate", "MetricExpr": "LDST_SPEC / INST_SPEC", - "BriefDescription": "The rate of load or store instructions speculatively executed to overall instructions speclatively executed", + "BriefDescription": "The rate of load or store instructions speculatively executed to overall instructions speculatively executed", "MetricGroup": "Operation_Mix", "ScaleUnit": "100percent of operations" }, @@ -132,7 +132,7 @@ { "MetricName": "pc_write_spec_rate", "MetricExpr": "PC_WRITE_SPEC / INST_SPEC", - "BriefDescription": "The rate of software change of the PC speculatively executed to overall instructions speclatively executed", + "BriefDescription": "The rate of software change of the PC speculatively executed to overall instructions speculatively executed", "MetricGroup": "Operation_Mix", "ScaleUnit": "100percent of operations" }, @@ -195,14 +195,14 @@ { "MetricName": "stall_frontend_cache_rate", "MetricExpr": "STALL_FRONTEND_CACHE / CPU_CYCLES", - "BriefDescription": "Proportion of cycles stalled and no ops delivered from frontend and cache miss", + "BriefDescription": "Proportion of cycles stalled and no operations delivered from frontend and cache miss", "MetricGroup": "Stall", "ScaleUnit": "100percent of cycles" }, { "MetricName": "stall_frontend_tlb_rate", "MetricExpr": "STALL_FRONTEND_TLB / CPU_CYCLES", - "BriefDescription": "Proportion of cycles stalled and no ops delivered from frontend and TLB miss", + "BriefDescription": "Proportion of cycles stalled and no operations delivered from frontend and TLB miss", "MetricGroup": "Stall", "ScaleUnit": "100percent of cycles" }, @@ -391,7 +391,7 @@ "ScaleUnit": "100percent of cache acceses" }, { - "MetricName": "l1d_cache_access_prefetces", + "MetricName": "l1d_cache_access_prefetches", "MetricExpr": "L1D_CACHE_PRFM / L1D_CACHE", "BriefDescription": "L1D cache access - prefetch", "MetricGroup": "Cache",
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Namhyung Kim namhyung@kernel.org
[ Upstream commit b9228817127430356c929847f111197776201225 ]
While CPU is a system device, it'd be better to use a path for event_source devices when it checks PMU capability.
Reviewed-by: Ian Rogers irogers@google.com Signed-off-by: Namhyung Kim namhyung@kernel.org Cc: Adrian Hunter adrian.hunter@intel.com Cc: Ingo Molnar mingo@kernel.org Cc: Jiri Olsa jolsa@kernel.org Cc: Kan Liang kan.liang@linux.intel.com Cc: Peter Zijlstra peterz@infradead.org Link: https://lore.kernel.org/r/20250509213017.204343-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Stable-dep-of: 48314d20fe46 ("perf test shell lbr: Avoid failures with perf event paranoia") Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/tests/shell/record_lbr.sh | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/perf/tests/shell/record_lbr.sh b/tools/perf/tests/shell/record_lbr.sh index 32314641217e6..ab6f2825f7903 100755 --- a/tools/perf/tests/shell/record_lbr.sh +++ b/tools/perf/tests/shell/record_lbr.sh @@ -4,7 +4,8 @@
set -e
-if [ ! -f /sys/devices/cpu/caps/branches ] && [ ! -f /sys/devices/cpu_core/caps/branches ] +if [ ! -f /sys/bus/event_source/devices/cpu/caps/branches ] && + [ ! -f /sys/bus/event_source/devices/cpu_core/caps/branches ] then echo "Skip: only x86 CPUs support LBR" exit 2
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ian Rogers irogers@google.com
[ Upstream commit 48314d20fe467d6653783cbf5536cb2fcc9bdd7c ]
When not running as root and with higher perf event paranoia values the perf record LBR tests could fail rather than skipping the problematic tests.
Add the sensitivity to the test and confirm it passes with paranoia values from -1 to 2.
Committer testing:
Testing with '$ perf test -vv lbr', i.e. as non root, and then comparing the output shows the mentioned errors before this patch:
acme@x1:~$ grep -m1 "model name" /proc/cpuinfo model name : 13th Gen Intel(R) Core(TM) i7-1365U acme@x1:~$
Before:
132: perf record LBR tests : Skip
After:
132: perf record LBR tests : Ok
Fixes: 32559b99e0f59070 ("perf test: Add set of perf record LBR tests") Signed-off-by: Ian Rogers irogers@google.com Cc: Adrian Hunter adrian.hunter@intel.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Chun-Tse Shao ctshao@google.com Cc: Howard Chu howardchu95@gmail.com Cc: Ingo Molnar mingo@redhat.com Cc: James Clark james.clark@linaro.org Cc: Jiri Olsa jolsa@kernel.org Cc: Kan Liang kan.liang@linux.intel.com Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/tests/shell/record_lbr.sh | 26 ++++++++++++++++++++------ 1 file changed, 20 insertions(+), 6 deletions(-)
diff --git a/tools/perf/tests/shell/record_lbr.sh b/tools/perf/tests/shell/record_lbr.sh index ab6f2825f7903..5984ca9b78f8a 100755 --- a/tools/perf/tests/shell/record_lbr.sh +++ b/tools/perf/tests/shell/record_lbr.sh @@ -4,6 +4,10 @@
set -e
+ParanoidAndNotRoot() { + [ "$(id -u)" != 0 ] && [ "$(cat /proc/sys/kernel/perf_event_paranoid)" -gt $1 ] +} + if [ ! -f /sys/bus/event_source/devices/cpu/caps/branches ] && [ ! -f /sys/bus/event_source/devices/cpu_core/caps/branches ] then @@ -23,6 +27,7 @@ cleanup() { }
trap_cleanup() { + echo "Unexpected signal in ${FUNCNAME[1]}" cleanup exit 1 } @@ -123,8 +128,11 @@ lbr_test "-j ind_call" "any indirect call" 2 lbr_test "-j ind_jmp" "any indirect jump" 100 lbr_test "-j call" "direct calls" 2 lbr_test "-j ind_call,u" "any indirect user call" 100 -lbr_test "-a -b" "system wide any branch" 2 -lbr_test "-a -j any_call" "system wide any call" 2 +if ! ParanoidAndNotRoot 1 +then + lbr_test "-a -b" "system wide any branch" 2 + lbr_test "-a -j any_call" "system wide any call" 2 +fi
# Parallel parallel_lbr_test "-b" "parallel any branch" 100 & @@ -141,10 +149,16 @@ parallel_lbr_test "-j call" "parallel direct calls" 100 & pid6=$! parallel_lbr_test "-j ind_call,u" "parallel any indirect user call" 100 & pid7=$! -parallel_lbr_test "-a -b" "parallel system wide any branch" 100 & -pid8=$! -parallel_lbr_test "-a -j any_call" "parallel system wide any call" 100 & -pid9=$! +if ParanoidAndNotRoot 1 +then + pid8= + pid9= +else + parallel_lbr_test "-a -b" "parallel system wide any branch" 100 & + pid8=$! + parallel_lbr_test "-a -j any_call" "parallel system wide any call" 100 & + pid9=$! +fi
for pid in $pid1 $pid2 $pid3 $pid4 $pid5 $pid6 $pid7 $pid8 $pid9 do
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Leo Yan leo.yan@arm.com
[ Upstream commit c17dda8013495d8132c976cbf349be9949d0fbd1 ]
If a user specifies an AUX buffer larger than 2 GiB, the returned size may exceed 0x80000000. Since the err variable is defined as a signed 32-bit integer, such a value overflows and becomes negative.
As a result, the perf record command reports an error:
0x146e8 [0x30]: failed to process type: 71 [Unknown error 183711232]
Change the type of the err variable to a signed 64-bit integer to accommodate large buffer sizes correctly.
Fixes: d5652d865ea734a1 ("perf session: Add ability to skip 4GiB or more") Reported-by: Tamas Zsoldos tamas.zsoldos@arm.com Signed-off-by: Leo Yan leo.yan@arm.com Acked-by: Namhyung Kim namhyung@kernel.org Cc: Adrian Hunter adrian.hunter@intel.com Cc: Ian Rogers irogers@google.com Cc: Jiri Olsa jolsa@kernel.org Link: https://lore.kernel.org/r/20250808-perf_fix_big_buffer_size-v1-1-45f45444a9a... Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/session.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c index dbaf07bf6c5fb..89e5354fa094c 100644 --- a/tools/perf/util/session.c +++ b/tools/perf/util/session.c @@ -1371,7 +1371,7 @@ static s64 perf_session__process_user_event(struct perf_session *session, const struct perf_tool *tool = session->tool; struct perf_sample sample = { .time = 0, }; int fd = perf_data__fd(session->data); - int err; + s64 err;
if (event->header.type != PERF_RECORD_COMPRESSED || perf_tool__compressed_is_stub(tool)) dump_event(session->evlist, event, file_offset, &sample, file_path);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ian Rogers irogers@google.com
[ Upstream commit 48918cacefd226af44373e914e63304927c0e7dc ]
The test starts a workload and then opens events. If the events fail to open, for example because of perf_event_paranoid, the gopipe of the workload is leaked and the file descriptor leak check fails when the test exits. To avoid this cancel the workload when opening the events fails.
Before: ``` $ perf test -vv 7 7: PERF_RECORD_* events & perf_sample fields: --- start --- test child forked, pid 1189568 Using CPUID GenuineIntel-6-B7-1 ------------------------------------------------------------ perf_event_attr: type 0 (PERF_TYPE_HARDWARE) config 0xa00000000 (cpu_atom/PERF_COUNT_HW_CPU_CYCLES/) disabled 1 ------------------------------------------------------------ sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 sys_perf_event_open failed, error -13 ------------------------------------------------------------ perf_event_attr: type 0 (PERF_TYPE_HARDWARE) config 0xa00000000 (cpu_atom/PERF_COUNT_HW_CPU_CYCLES/) disabled 1 exclude_kernel 1 ------------------------------------------------------------ sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 = 3 ------------------------------------------------------------ perf_event_attr: type 0 (PERF_TYPE_HARDWARE) config 0x400000000 (cpu_core/PERF_COUNT_HW_CPU_CYCLES/) disabled 1 ------------------------------------------------------------ sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 sys_perf_event_open failed, error -13 ------------------------------------------------------------ perf_event_attr: type 0 (PERF_TYPE_HARDWARE) config 0x400000000 (cpu_core/PERF_COUNT_HW_CPU_CYCLES/) disabled 1 exclude_kernel 1 ------------------------------------------------------------ sys_perf_event_open: pid 0 cpu -1 group_fd -1 flags 0x8 = 3 Attempt to add: software/cpu-clock/ ..after resolving event: software/config=0/ cpu-clock -> software/cpu-clock/ ------------------------------------------------------------ perf_event_attr: type 1 (PERF_TYPE_SOFTWARE) size 136 config 0x9 (PERF_COUNT_SW_DUMMY) sample_type IP|TID|TIME|CPU read_format ID|LOST disabled 1 inherit 1 mmap 1 comm 1 enable_on_exec 1 task 1 sample_id_all 1 mmap2 1 comm_exec 1 ksymbol 1 bpf_event 1 { wakeup_events, wakeup_watermark } 1 ------------------------------------------------------------ sys_perf_event_open: pid 1189569 cpu 0 group_fd -1 flags 0x8 sys_perf_event_open failed, error -13 perf_evlist__open: Permission denied ---- end(-2) ---- Leak of file descriptor 6 that opened: 'pipe:[14200347]' ---- unexpected signal (6) ---- iFailed to read build ID for //anon Failed to read build ID for //anon Failed to read build ID for //anon Failed to read build ID for //anon Failed to read build ID for //anon Failed to read build ID for //anon Failed to read build ID for //anon Failed to read build ID for //anon Failed to read build ID for //anon Failed to read build ID for //anon Failed to read build ID for //anon Failed to read build ID for //anon Failed to read build ID for //anon Failed to read build ID for //anon Failed to read build ID for //anon Failed to read build ID for //anon Failed to read build ID for //anon Failed to read build ID for //anon Failed to read build ID for //anon Failed to read build ID for //anon Failed to read build ID for //anon Failed to read build ID for //anon Failed to read build ID for //anon Failed to read build ID for //anon Failed to read build ID for //anon Failed to read build ID for //anon Failed to read build ID for //anon Failed to read build ID for //anon Failed to read build ID for //anon Failed to read build ID for //anon Failed to read build ID for //anon Failed to read build ID for //anon #0 0x565358f6666e in child_test_sig_handler builtin-test.c:311 #1 0x7f29ce849df0 in __restore_rt libc_sigaction.c:0 #2 0x7f29ce89e95c in __pthread_kill_implementation pthread_kill.c:44 #3 0x7f29ce849cc2 in raise raise.c:27 #4 0x7f29ce8324ac in abort abort.c:81 #5 0x565358f662d4 in check_leaks builtin-test.c:226 #6 0x565358f6682e in run_test_child builtin-test.c:344 #7 0x565358ef7121 in start_command run-command.c:128 #8 0x565358f67273 in start_test builtin-test.c:545 #9 0x565358f6771d in __cmd_test builtin-test.c:647 #10 0x565358f682bd in cmd_test builtin-test.c:849 #11 0x565358ee5ded in run_builtin perf.c:349 #12 0x565358ee6085 in handle_internal_command perf.c:401 #13 0x565358ee61de in run_argv perf.c:448 #14 0x565358ee6527 in main perf.c:555 #15 0x7f29ce833ca8 in __libc_start_call_main libc_start_call_main.h:74 #16 0x7f29ce833d65 in __libc_start_main@@GLIBC_2.34 libc-start.c:128 #17 0x565358e391c1 in _start perf[851c1] 7: PERF_RECORD_* events & perf_sample fields : FAILED! ```
After: ``` $ perf test 7 7: PERF_RECORD_* events & perf_sample fields : Skip (permissions) ```
Fixes: 16d00fee703866c6 ("perf tests: Move test__PERF_RECORD into separate object") Signed-off-by: Ian Rogers irogers@google.com Tested-by: Arnaldo Carvalho de Melo acme@redhat.com Cc: Adrian Hunter adrian.hunter@intel.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Athira Rajeev atrajeev@linux.ibm.com Cc: Chun-Tse Shao ctshao@google.com Cc: Howard Chu howardchu95@gmail.com Cc: Ingo Molnar mingo@redhat.com Cc: James Clark james.clark@linaro.org Cc: Jiri Olsa jolsa@kernel.org Cc: Kan Liang kan.liang@linux.intel.com Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/tests/perf-record.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/tools/perf/tests/perf-record.c b/tools/perf/tests/perf-record.c index 1c4feec1adff1..6e7f053006b4f 100644 --- a/tools/perf/tests/perf-record.c +++ b/tools/perf/tests/perf-record.c @@ -115,6 +115,7 @@ static int test__PERF_RECORD(struct test_suite *test __maybe_unused, int subtest if (err < 0) { pr_debug("sched__get_first_possible_cpu: %s\n", str_error_r(errno, sbuf, sizeof(sbuf))); + evlist__cancel_workload(evlist); goto out_delete_evlist; }
@@ -126,6 +127,7 @@ static int test__PERF_RECORD(struct test_suite *test __maybe_unused, int subtest if (sched_setaffinity(evlist->workload.pid, cpu_mask_size, &cpu_mask) < 0) { pr_debug("sched_setaffinity: %s\n", str_error_r(errno, sbuf, sizeof(sbuf))); + evlist__cancel_workload(evlist); goto out_delete_evlist; }
@@ -137,6 +139,7 @@ static int test__PERF_RECORD(struct test_suite *test __maybe_unused, int subtest if (err < 0) { pr_debug("perf_evlist__open: %s\n", str_error_r(errno, sbuf, sizeof(sbuf))); + evlist__cancel_workload(evlist); goto out_delete_evlist; }
@@ -149,6 +152,7 @@ static int test__PERF_RECORD(struct test_suite *test __maybe_unused, int subtest if (err < 0) { pr_debug("evlist__mmap: %s\n", str_error_r(errno, sbuf, sizeof(sbuf))); + evlist__cancel_workload(evlist); goto out_delete_evlist; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: James Clark james.clark@linaro.org
[ Upstream commit 65d11821910bd910a2b4b5b005360d036c76ecef ]
Test that one cycles event is opened for each core PMU when "perf stat" is run without arguments.
The event line can either be output as "pmu/cycles/" or just "cycles" if there is only one PMU. Include 2 spaces for padding in the one PMU case to avoid matching when the word cycles is included in metric descriptions.
Acked-by: Namhyung Kim namhyung@kernel.org Acked-by: Kan Liang kan.liang@linux.intel.com Signed-off-by: James Clark james.clark@linaro.org Cc: Yang Jihong yangjihong@bytedance.com Cc: Dominique Martinet asmadeus@codewreck.org Cc: Colin Ian King colin.i.king@gmail.com Cc: Howard Chu howardchu95@gmail.com Cc: Ze Gao zegao2021@gmail.com Cc: Yicong Yang yangyicong@hisilicon.com Cc: Weilin Wang weilin.wang@intel.com Cc: Will Deacon will@kernel.org Cc: Mike Leach mike.leach@linaro.org Cc: Jing Zhang renyu.zj@linux.alibaba.com Cc: Yang Li yang.lee@linux.alibaba.com Cc: Leo Yan leo.yan@linux.dev Cc: ak@linux.intel.com Cc: Athira Rajeev atrajeev@linux.vnet.ibm.com Cc: linux-arm-kernel@lists.infradead.org Cc: Yanteng Si siyanteng@loongson.cn Cc: Sun Haiyong sunhaiyong@loongson.cn Cc: John Garry john.g.garry@oracle.com Link: https://lore.kernel.org/r/20240926144851.245903-8-james.clark@linaro.org Signed-off-by: Namhyung Kim namhyung@kernel.org Stable-dep-of: 24937ee839e4 ("perf evsel: Ensure the fallback message is always written to") Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/tests/shell/stat.sh | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+)
diff --git a/tools/perf/tests/shell/stat.sh b/tools/perf/tests/shell/stat.sh index 3f1e67795490a..c6df7eec96b98 100755 --- a/tools/perf/tests/shell/stat.sh +++ b/tools/perf/tests/shell/stat.sh @@ -146,6 +146,30 @@ test_cputype() { echo "cputype test [Success]" }
+test_hybrid() { + # Test the default stat command on hybrid devices opens one cycles event for + # each CPU type. + echo "hybrid test" + + # Count the number of core PMUs, assume minimum of 1 + pmus=$(ls /sys/bus/event_source/devices/*/cpus 2>/dev/null | wc -l) + if [ "$pmus" -lt 1 ] + then + pmus=1 + fi + + # Run default Perf stat + cycles_events=$(perf stat -- true 2>&1 | grep -E "/cycles/| cycles " | wc -l) + + if [ "$pmus" -ne "$cycles_events" ] + then + echo "hybrid test [Found $pmus PMUs but $cycles_events cycles events. Failed]" + err=1 + return + fi + echo "hybrid test [Success]" +} + test_default_stat test_stat_record_report test_stat_record_script @@ -153,4 +177,5 @@ test_stat_repeat_weak_groups test_topdown_groups test_topdown_weak_groups test_cputype +test_hybrid exit $err
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Namhyung Kim namhyung@kernel.org
[ Upstream commit bb6e7cb11d97ce1957894d30d13bfad3e8bfefe9 ]
Commit 7b100989b4f6bce70 ("perf evlist: Remove __evlist__add_default") changed to parse "cycles:P" event instead of creating a new cycles event for perf record. But it also changed the way how modifiers are handled so it doesn't set the exclude_guest bit by default.
It seems Apple M1 PMU requires exclude_guest set and returns EOPNOTSUPP if not. Let's add a fallback so that it can work with default events.
Also update perf stat hybrid tests to handle possible u or H modifiers.
Reviewed-by: Ian Rogers irogers@google.com Reviewed-by: James Clark james.clark@linaro.org Reviewed-by: Ravi Bangoria ravi.bangoria@amd.com Acked-by: Kan Liang kan.liang@linux.intel.com Cc: James Clark james.clark@arm.com Cc: Atish Patra atishp@atishpatra.org Cc: Mingwei Zhang mizhang@google.com Cc: Kajol Jain kjain@linux.ibm.com Cc: Thomas Richter tmricht@linux.ibm.com Cc: Palmer Dabbelt palmer@rivosinc.com Link: https://lore.kernel.org/r/20241016062359.264929-2-namhyung@kernel.org Fixes: 7b100989b4f6bce70 ("perf evlist: Remove __evlist__add_default") Signed-off-by: Namhyung Kim namhyung@kernel.org Stable-dep-of: 24937ee839e4 ("perf evsel: Ensure the fallback message is always written to") Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/builtin-stat.c | 18 +++++++++++++++--- tools/perf/tests/shell/stat.sh | 2 +- tools/perf/util/evsel.c | 21 +++++++++++++++++++++ 3 files changed, 37 insertions(+), 4 deletions(-)
diff --git a/tools/perf/builtin-stat.c b/tools/perf/builtin-stat.c index 628c61397d2d3..b578930ed76a4 100644 --- a/tools/perf/builtin-stat.c +++ b/tools/perf/builtin-stat.c @@ -639,8 +639,7 @@ static enum counter_recovery stat_handle_error(struct evsel *counter) * (behavior changed with commit b0a873e). */ if (errno == EINVAL || errno == ENOSYS || - errno == ENOENT || errno == EOPNOTSUPP || - errno == ENXIO) { + errno == ENOENT || errno == ENXIO) { if (verbose > 0) ui__warning("%s event is not supported by the kernel.\n", evsel__name(counter)); @@ -658,7 +657,7 @@ static enum counter_recovery stat_handle_error(struct evsel *counter) if (verbose > 0) ui__warning("%s\n", msg); return COUNTER_RETRY; - } else if (target__has_per_thread(&target) && + } else if (target__has_per_thread(&target) && errno != EOPNOTSUPP && evsel_list->core.threads && evsel_list->core.threads->err_thread != -1) { /* @@ -679,6 +678,19 @@ static enum counter_recovery stat_handle_error(struct evsel *counter) return COUNTER_SKIP; }
+ if (errno == EOPNOTSUPP) { + if (verbose > 0) { + ui__warning("%s event is not supported by the kernel.\n", + evsel__name(counter)); + } + counter->supported = false; + counter->errored = true; + + if ((evsel__leader(counter) != counter) || + !(counter->core.leader->nr_members > 1)) + return COUNTER_SKIP; + } + evsel__open_strerror(counter, &target, errno, msg, sizeof(msg)); ui__error("%s\n", msg);
diff --git a/tools/perf/tests/shell/stat.sh b/tools/perf/tests/shell/stat.sh index c6df7eec96b98..c4bef71568970 100755 --- a/tools/perf/tests/shell/stat.sh +++ b/tools/perf/tests/shell/stat.sh @@ -159,7 +159,7 @@ test_hybrid() { fi
# Run default Perf stat - cycles_events=$(perf stat -- true 2>&1 | grep -E "/cycles/| cycles " | wc -l) + cycles_events=$(perf stat -- true 2>&1 | grep -E "/cycles/[uH]*| cycles[:uH]* " -c)
if [ "$pmus" -ne "$cycles_events" ] then diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c index b3de8ce559998..70c4e06da7a03 100644 --- a/tools/perf/util/evsel.c +++ b/tools/perf/util/evsel.c @@ -3255,6 +3255,27 @@ bool evsel__fallback(struct evsel *evsel, struct target *target, int err, evsel->core.attr.exclude_kernel = 1; evsel->core.attr.exclude_hv = 1;
+ return true; + } else if (err == EOPNOTSUPP && !evsel->core.attr.exclude_guest && + !evsel->exclude_GH) { + const char *name = evsel__name(evsel); + char *new_name; + const char *sep = ":"; + + /* Is there already the separator in the name. */ + if (strchr(name, '/') || + (strchr(name, ':') && !evsel->is_libpfm_event)) + sep = ""; + + if (asprintf(&new_name, "%s%sH", name, sep) < 0) + return false; + + free(evsel->name); + evsel->name = new_name; + /* Apple M1 requires exclude_guest */ + scnprintf(msg, msgsize, "trying to fall back to excluding guest samples"); + evsel->core.attr.exclude_guest = 1; + return true; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ian Rogers irogers@google.com
[ Upstream commit 24937ee839e4bbc097acde73eeed67812bad2d99 ]
The fallback message is unconditionally printed in places like record__open().
If no fallback is attempted this can lead to printing uninitialized data, crashes, etc.
Fixes: c0a54341c0e89333 ("perf evsel: Introduce event fallback method") Signed-off-by: Ian Rogers irogers@google.com Cc: Adrian Hunter adrian.hunter@intel.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Howard Chu howardchu95@gmail.com Cc: Ingo Molnar mingo@redhat.com Cc: James Clark james.clark@linaro.org Cc: Jiri Olsa jolsa@kernel.org Cc: Kan Liang kan.liang@linux.intel.com Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/evsel.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/tools/perf/util/evsel.c b/tools/perf/util/evsel.c index 70c4e06da7a03..dda107b12b8c6 100644 --- a/tools/perf/util/evsel.c +++ b/tools/perf/util/evsel.c @@ -3237,7 +3237,7 @@ bool evsel__fallback(struct evsel *evsel, struct target *target, int err,
/* If event has exclude user then don't exclude kernel. */ if (evsel->core.attr.exclude_user) - return false; + goto no_fallback;
/* Is there already the separator in the name. */ if (strchr(name, '/') || @@ -3245,7 +3245,7 @@ bool evsel__fallback(struct evsel *evsel, struct target *target, int err, sep = "";
if (asprintf(&new_name, "%s%su", name, sep) < 0) - return false; + goto no_fallback;
free(evsel->name); evsel->name = new_name; @@ -3268,17 +3268,19 @@ bool evsel__fallback(struct evsel *evsel, struct target *target, int err, sep = "";
if (asprintf(&new_name, "%s%sH", name, sep) < 0) - return false; + goto no_fallback;
free(evsel->name); evsel->name = new_name; /* Apple M1 requires exclude_guest */ - scnprintf(msg, msgsize, "trying to fall back to excluding guest samples"); + scnprintf(msg, msgsize, "Trying to fall back to excluding guest samples"); evsel->core.attr.exclude_guest = 1;
return true; } - +no_fallback: + scnprintf(msg, msgsize, "No fallback found for '%s' for error %d", + evsel__name(evsel), err); return false; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: AngeloGioacchino Del Regno angelogioacchino.delregno@collabora.com
[ Upstream commit 6c4c26b624790098988c1034541087e3e5ed5bed ]
The infrastructure gate for the HDMI specific crystal needs the top_hdmi_xtal clock to be configured in order to ungate the 26m clock to the HDMI IP, and it wouldn't work without.
Reparent the infra_ao_hdmi_26m clock to top_hdmi_xtal to fix that.
Fixes: e2edf59dec0b ("clk: mediatek: Add MT8195 infrastructure clock support") Signed-off-by: AngeloGioacchino Del Regno angelogioacchino.delregno@collabora.com Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/mediatek/clk-mt8195-infra_ao.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/clk/mediatek/clk-mt8195-infra_ao.c b/drivers/clk/mediatek/clk-mt8195-infra_ao.c index bb648a88e43af..ad47fdb234607 100644 --- a/drivers/clk/mediatek/clk-mt8195-infra_ao.c +++ b/drivers/clk/mediatek/clk-mt8195-infra_ao.c @@ -103,7 +103,7 @@ static const struct mtk_gate infra_ao_clks[] = { GATE_INFRA_AO0(CLK_INFRA_AO_CQ_DMA_FPC, "infra_ao_cq_dma_fpc", "fpc", 28), GATE_INFRA_AO0(CLK_INFRA_AO_UART5, "infra_ao_uart5", "top_uart", 29), /* INFRA_AO1 */ - GATE_INFRA_AO1(CLK_INFRA_AO_HDMI_26M, "infra_ao_hdmi_26m", "clk26m", 0), + GATE_INFRA_AO1(CLK_INFRA_AO_HDMI_26M, "infra_ao_hdmi_26m", "top_hdmi_xtal", 0), GATE_INFRA_AO1(CLK_INFRA_AO_SPI0, "infra_ao_spi0", "top_spi", 1), GATE_INFRA_AO1(CLK_INFRA_AO_MSDC0, "infra_ao_msdc0", "top_msdc50_0_hclk", 2), GATE_INFRA_AO1(CLK_INFRA_AO_MSDC1, "infra_ao_msdc1", "top_axi", 4),
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chen-Yu Tsai wenst@chromium.org
[ Upstream commit 5e121370a7ad3414c7f3a77002e2b18abe5c6fe1 ]
The `flags` in |struct mtk_mux| are core clk flags, not mux clk flags. Passing one to the other is wrong.
Since there aren't any actual users adding CLK_MUX_* flags, just drop it for now.
Fixes: b05ea3314390 ("clk: mediatek: clk-mux: Add .determine_rate() callback") Signed-off-by: Chen-Yu Tsai wenst@chromium.org Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/mediatek/clk-mux.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/clk/mediatek/clk-mux.c b/drivers/clk/mediatek/clk-mux.c index 60990296450bb..9a12e58230bed 100644 --- a/drivers/clk/mediatek/clk-mux.c +++ b/drivers/clk/mediatek/clk-mux.c @@ -146,9 +146,7 @@ static int mtk_clk_mux_set_parent_setclr_lock(struct clk_hw *hw, u8 index) static int mtk_clk_mux_determine_rate(struct clk_hw *hw, struct clk_rate_request *req) { - struct mtk_clk_mux *mux = to_mtk_clk_mux(hw); - - return clk_mux_determine_rate_flags(hw, req, mux->data->flags); + return clk_mux_determine_rate_flags(hw, req, 0); }
const struct clk_ops mtk_mux_clr_set_upd_ops = {
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Brian Masney bmasney@redhat.com
[ Upstream commit b46a3d323a5b7942e65025254c13801d0f475f02 ]
The round_rate() clk ops is deprecated, so migrate this driver from round_rate() to determine_rate() using the Coccinelle semantic patch on the cover letter of this series.
Signed-off-by: Brian Masney bmasney@redhat.com Stable-dep-of: 1624dead9a4d ("clk: nxp: Fix pll0 rate check condition in LPC18xx CGU driver") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/nxp/clk-lpc18xx-cgu.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-)
diff --git a/drivers/clk/nxp/clk-lpc18xx-cgu.c b/drivers/clk/nxp/clk-lpc18xx-cgu.c index 81efa885069b2..30e0b283ca608 100644 --- a/drivers/clk/nxp/clk-lpc18xx-cgu.c +++ b/drivers/clk/nxp/clk-lpc18xx-cgu.c @@ -370,23 +370,25 @@ static unsigned long lpc18xx_pll0_recalc_rate(struct clk_hw *hw, return 0; }
-static long lpc18xx_pll0_round_rate(struct clk_hw *hw, unsigned long rate, - unsigned long *prate) +static int lpc18xx_pll0_determine_rate(struct clk_hw *hw, + struct clk_rate_request *req) { unsigned long m;
- if (*prate < rate) { + if (req->best_parent_rate < req->rate) { pr_warn("%s: pll dividers not supported\n", __func__); return -EINVAL; }
- m = DIV_ROUND_UP_ULL(*prate, rate * 2); + m = DIV_ROUND_UP_ULL(req->best_parent_rate, req->rate * 2); if (m <= 0 && m > LPC18XX_PLL0_MSEL_MAX) { - pr_warn("%s: unable to support rate %lu\n", __func__, rate); + pr_warn("%s: unable to support rate %lu\n", __func__, req->rate); return -EINVAL; }
- return 2 * *prate * m; + req->rate = 2 * req->best_parent_rate * m; + + return 0; }
static int lpc18xx_pll0_set_rate(struct clk_hw *hw, unsigned long rate, @@ -443,7 +445,7 @@ static int lpc18xx_pll0_set_rate(struct clk_hw *hw, unsigned long rate,
static const struct clk_ops lpc18xx_pll0_ops = { .recalc_rate = lpc18xx_pll0_recalc_rate, - .round_rate = lpc18xx_pll0_round_rate, + .determine_rate = lpc18xx_pll0_determine_rate, .set_rate = lpc18xx_pll0_set_rate, };
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alok Tiwari alok.a.tiwari@oracle.com
[ Upstream commit 1624dead9a4d288a594fdf19735ebfe4bb567cb8 ]
The conditional check for the PLL0 multiplier 'm' used a logical AND instead of OR, making the range check ineffective. This patch replaces && with || to correctly reject invalid values of 'm' that are either less than or equal to 0 or greater than LPC18XX_PLL0_MSEL_MAX.
This ensures proper bounds checking during clk rate setting and rounding.
Fixes: b04e0b8fd544 ("clk: add lpc18xx cgu clk driver") Signed-off-by: Alok Tiwari alok.a.tiwari@oracle.com [sboyd@kernel.org: 'm' is unsigned so remove < condition] Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/nxp/clk-lpc18xx-cgu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/clk/nxp/clk-lpc18xx-cgu.c b/drivers/clk/nxp/clk-lpc18xx-cgu.c index 30e0b283ca608..b9e204d63a972 100644 --- a/drivers/clk/nxp/clk-lpc18xx-cgu.c +++ b/drivers/clk/nxp/clk-lpc18xx-cgu.c @@ -381,7 +381,7 @@ static int lpc18xx_pll0_determine_rate(struct clk_hw *hw, }
m = DIV_ROUND_UP_ULL(req->best_parent_rate, req->rate * 2); - if (m <= 0 && m > LPC18XX_PLL0_MSEL_MAX) { + if (m == 0 || m > LPC18XX_PLL0_MSEL_MAX) { pr_warn("%s: unable to support rate %lu\n", __func__, req->rate); return -EINVAL; } @@ -404,7 +404,7 @@ static int lpc18xx_pll0_set_rate(struct clk_hw *hw, unsigned long rate, }
m = DIV_ROUND_UP_ULL(parent_rate, rate * 2); - if (m <= 0 && m > LPC18XX_PLL0_MSEL_MAX) { + if (m == 0 || m > LPC18XX_PLL0_MSEL_MAX) { pr_warn("%s: unable to support rate %lu\n", __func__, rate); return -EINVAL; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Fedor Pchelkin pchelkin@ispras.ru
[ Upstream commit 49ef6491106209c595476fc122c3922dfd03253f ]
struct tegra_bpmp::clocks is a pointer to a dynamically allocated array of pointers to 'struct tegra_bpmp_clk'.
But the size of the allocated area is calculated like it is an array containing actual 'struct tegra_bpmp_clk' objects - it's not true, there are just pointers.
Found by Linux Verification Center (linuxtesting.org) with Svace static analysis tool.
Fixes: 2db12b15c6f3 ("clk: tegra: Register clocks from root to leaf") Signed-off-by: Fedor Pchelkin pchelkin@ispras.ru Signed-off-by: Stephen Boyd sboyd@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/tegra/clk-bpmp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/clk/tegra/clk-bpmp.c b/drivers/clk/tegra/clk-bpmp.c index 7bfba0afd7783..4ec408c3a26aa 100644 --- a/drivers/clk/tegra/clk-bpmp.c +++ b/drivers/clk/tegra/clk-bpmp.c @@ -635,7 +635,7 @@ static int tegra_bpmp_register_clocks(struct tegra_bpmp *bpmp,
bpmp->num_clocks = count;
- bpmp->clocks = devm_kcalloc(bpmp->dev, count, sizeof(struct tegra_bpmp_clk), GFP_KERNEL); + bpmp->clocks = devm_kcalloc(bpmp->dev, count, sizeof(*bpmp->clocks), GFP_KERNEL); if (!bpmp->clocks) return -ENOMEM;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aaron Kling webgeek1234@gmail.com
[ Upstream commit 0b1bb980fd7cae126ee3d59f817068a13e321b07 ]
The original commit set all cores in a cluster to a shared policy, but did not update set_target to apply a frequency change to all cores for the policy. This caused most cores to remain stuck at their boot frequency.
Fixes: be4ae8c19492 ("cpufreq: tegra186: Share policy per cluster") Signed-off-by: Aaron Kling webgeek1234@gmail.com Reviewed-by: Mikko Perttunen mperttunen@nvidia.com Signed-off-by: Viresh Kumar viresh.kumar@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/cpufreq/tegra186-cpufreq.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/cpufreq/tegra186-cpufreq.c b/drivers/cpufreq/tegra186-cpufreq.c index 7b8fcfa55038b..39186008afbfd 100644 --- a/drivers/cpufreq/tegra186-cpufreq.c +++ b/drivers/cpufreq/tegra186-cpufreq.c @@ -86,10 +86,14 @@ static int tegra186_cpufreq_set_target(struct cpufreq_policy *policy, { struct tegra186_cpufreq_data *data = cpufreq_get_driver_data(); struct cpufreq_frequency_table *tbl = policy->freq_table + index; - unsigned int edvd_offset = data->cpus[policy->cpu].edvd_offset; + unsigned int edvd_offset; u32 edvd_val = tbl->driver_data; + u32 cpu;
- writel(edvd_val, data->regs + edvd_offset); + for_each_cpu(cpu, policy->cpus) { + edvd_offset = data->cpus[cpu].edvd_offset; + writel(edvd_val, data->regs + edvd_offset); + }
return 0; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Duoming Zhou duoming@zju.edu.cn
[ Upstream commit 60cd16a3b7439ccb699d0bf533799eeb894fd217 ]
During the detaching of Marvell's SAS/SATA controller, the original code calls cancel_delayed_work() in mvs_free() to cancel the delayed work item mwq->work_q. However, if mwq->work_q is already running, the cancel_delayed_work() may fail to cancel it. This can lead to use-after-free scenarios where mvs_free() frees the mvs_info while mvs_work_queue() is still executing and attempts to access the already-freed mvs_info.
A typical race condition is illustrated below:
CPU 0 (remove) | CPU 1 (delayed work callback) mvs_pci_remove() | mvs_free() | mvs_work_queue() cancel_delayed_work() | kfree(mvi) | | mvi-> // UAF
Replace cancel_delayed_work() with cancel_delayed_work_sync() to ensure that the delayed work item is properly canceled and any executing delayed work item completes before the mvs_info is deallocated.
This bug was found by static analysis.
Fixes: 20b09c2992fe ("[SCSI] mvsas: add support for 94xx; layout change; bug fixes") Signed-off-by: Duoming Zhou duoming@zju.edu.cn Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/mvsas/mv_init.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/scsi/mvsas/mv_init.c b/drivers/scsi/mvsas/mv_init.c index 020037cbf0d91..655cca5fe7ccb 100644 --- a/drivers/scsi/mvsas/mv_init.c +++ b/drivers/scsi/mvsas/mv_init.c @@ -124,7 +124,7 @@ static void mvs_free(struct mvs_info *mvi) if (mvi->shost) scsi_host_put(mvi->shost); list_for_each_entry(mwq, &mvi->wq_list, entry) - cancel_delayed_work(&mwq->work_q); + cancel_delayed_work_sync(&mwq->work_q); kfree(mvi->rsvd_tags); kfree(mvi); }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Ujfalusi peter.ujfalusi@linux.intel.com
[ Upstream commit a7fe5ff832d61d9393095bc3dd5f06f4af7da3c1 ]
The firmware has changed the minimum host buffer size from 2 periods to 4 periods (1 period is 1ms) which was missed by the kernel side.
Adjust the SOF_IPC4_MIN_DMA_BUFFER_SIZE to 4 ms to align with firmware.
Link: https://github.com/thesofproject/sof/commit/f0a14a3f410735db18a79eb7a5f40dc4... Fixes: 594c1bb9ff73 ("ASoC: SOF: ipc4-topology: Do not parse the DMA_BUFFER_SIZE token") Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com Reviewed-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Reviewed-by: Kai Vehmanen kai.vehmanen@linux.intel.com Reviewed-by: Bard Liao yung-chuan.liao@linux.intel.com Link: https://patch.msgid.link/20251002135752.2430-2-peter.ujfalusi@linux.intel.co... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/sof/ipc4-topology.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/sound/soc/sof/ipc4-topology.h b/sound/soc/sof/ipc4-topology.h index f4dc499c0ffe5..187e186ba4f8f 100644 --- a/sound/soc/sof/ipc4-topology.h +++ b/sound/soc/sof/ipc4-topology.h @@ -60,8 +60,8 @@
#define SOF_IPC4_INVALID_NODE_ID 0xffffffff
-/* FW requires minimum 2ms DMA buffer size */ -#define SOF_IPC4_MIN_DMA_BUFFER_SIZE 2 +/* FW requires minimum 4ms DMA buffer size */ +#define SOF_IPC4_MIN_DMA_BUFFER_SIZE 4
/* * The base of multi-gateways. Multi-gateways addressing starts from
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Ujfalusi peter.ujfalusi@linux.intel.com
[ Upstream commit 3dcf683bf1062d69014fe81b90d285c7eb85ca8a ]
For ChainDMA the firmware allocates 5ms host buffer instead of the standard 4ms which should be taken into account when setting the constraint on the buffer size.
Fixes: 842bb8b62cc6 ("ASoC: SOF: ipc4-topology: Save the DMA maximum burst size for PCMs") Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com Reviewed-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Reviewed-by: Kai Vehmanen kai.vehmanen@linux.intel.com Reviewed-by: Bard Liao yung-chuan.liao@linux.intel.com Link: https://patch.msgid.link/20251002135752.2430-3-peter.ujfalusi@linux.intel.co... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/sof/ipc4-topology.c | 9 +++++++-- sound/soc/sof/ipc4-topology.h | 3 +++ 2 files changed, 10 insertions(+), 2 deletions(-)
diff --git a/sound/soc/sof/ipc4-topology.c b/sound/soc/sof/ipc4-topology.c index f82db7f2a6b7e..ac836d0d20de0 100644 --- a/sound/soc/sof/ipc4-topology.c +++ b/sound/soc/sof/ipc4-topology.c @@ -519,8 +519,13 @@ static int sof_ipc4_widget_setup_pcm(struct snd_sof_widget *swidget) swidget->tuples, swidget->num_tuples, sizeof(u32), 1); /* Set default DMA buffer size if it is not specified in topology */ - if (!sps->dsp_max_burst_size_in_ms) - sps->dsp_max_burst_size_in_ms = SOF_IPC4_MIN_DMA_BUFFER_SIZE; + if (!sps->dsp_max_burst_size_in_ms) { + struct snd_sof_widget *pipe_widget = swidget->spipe->pipe_widget; + struct sof_ipc4_pipeline *pipeline = pipe_widget->private; + + sps->dsp_max_burst_size_in_ms = pipeline->use_chain_dma ? + SOF_IPC4_CHAIN_DMA_BUFFER_SIZE : SOF_IPC4_MIN_DMA_BUFFER_SIZE; + } } else { /* Capture data is copied from DSP to host in 1ms bursts */ spcm->stream[dir].dsp_max_burst_size_in_ms = 1; diff --git a/sound/soc/sof/ipc4-topology.h b/sound/soc/sof/ipc4-topology.h index 187e186ba4f8f..adb52a1eff85f 100644 --- a/sound/soc/sof/ipc4-topology.h +++ b/sound/soc/sof/ipc4-topology.h @@ -63,6 +63,9 @@ /* FW requires minimum 4ms DMA buffer size */ #define SOF_IPC4_MIN_DMA_BUFFER_SIZE 4
+/* ChainDMA in fw uses 5ms DMA buffer */ +#define SOF_IPC4_CHAIN_DMA_BUFFER_SIZE 5 + /* * The base of multi-gateways. Multi-gateways addressing starts from * ALH_MULTI_GTW_BASE and there are ALH_MULTI_GTW_COUNT multi-sources
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Ujfalusi peter.ujfalusi@linux.intel.com
[ Upstream commit 45ad27d9a6f7c620d8bbc80be3bab1faf37dfa0a ]
Instead of constraining the ALSA buffer time to be double of the firmware host buffer size, it is better to set it for the period time. This will implicitly constrain the buffer time to a safe value (num_periods is at least 2) and prohibits applications to set smaller period size than what will be covered by the initial DMA burst.
Fixes: fe76d2e75a6d ("ASoC: SOF: Intel: hda-pcm: Use dsp_max_burst_size_in_ms to place constraint") Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com Reviewed-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Reviewed-by: Kai Vehmanen kai.vehmanen@linux.intel.com Reviewed-by: Bard Liao yung-chuan.liao@linux.intel.com Link: https://patch.msgid.link/20251002135752.2430-4-peter.ujfalusi@linux.intel.co... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/sof/intel/hda-pcm.c | 29 +++++++++++++++++++++-------- 1 file changed, 21 insertions(+), 8 deletions(-)
diff --git a/sound/soc/sof/intel/hda-pcm.c b/sound/soc/sof/intel/hda-pcm.c index f6e24edd7adbe..898e4fcde2dde 100644 --- a/sound/soc/sof/intel/hda-pcm.c +++ b/sound/soc/sof/intel/hda-pcm.c @@ -29,6 +29,8 @@ #define SDnFMT_BITS(x) ((x) << 4) #define SDnFMT_CHAN(x) ((x) << 0)
+#define HDA_MAX_PERIOD_TIME_HEADROOM 10 + static bool hda_always_enable_dmi_l1; module_param_named(always_enable_dmi_l1, hda_always_enable_dmi_l1, bool, 0444); MODULE_PARM_DESC(always_enable_dmi_l1, "SOF HDA always enable DMI l1"); @@ -276,19 +278,30 @@ int hda_dsp_pcm_open(struct snd_sof_dev *sdev, * On playback start the DMA will transfer dsp_max_burst_size_in_ms * amount of data in one initial burst to fill up the host DMA buffer. * Consequent DMA burst sizes are shorter and their length can vary. - * To make sure that userspace allocate large enough ALSA buffer we need - * to place a constraint on the buffer time. + * To avoid immediate xrun by the initial burst we need to place + * constraint on the period size (via PERIOD_TIME) to cover the size of + * the host buffer. + * We need to add headroom of max 10ms as the firmware needs time to + * settle to the 1ms pacing and initially it can run faster for few + * internal periods. * * On capture the DMA will transfer 1ms chunks. - * - * Exact dsp_max_burst_size_in_ms constraint is racy, so set the - * constraint to a minimum of 2x dsp_max_burst_size_in_ms. */ - if (spcm->stream[direction].dsp_max_burst_size_in_ms) + if (spcm->stream[direction].dsp_max_burst_size_in_ms) { + unsigned int period_time = spcm->stream[direction].dsp_max_burst_size_in_ms; + + /* + * add headroom over the maximum burst size to cover the time + * needed for the DMA pace to settle. + * Limit the headroom time to HDA_MAX_PERIOD_TIME_HEADROOM + */ + period_time += min(period_time, HDA_MAX_PERIOD_TIME_HEADROOM); + snd_pcm_hw_constraint_minmax(substream->runtime, - SNDRV_PCM_HW_PARAM_BUFFER_TIME, - spcm->stream[direction].dsp_max_burst_size_in_ms * USEC_PER_MSEC * 2, + SNDRV_PCM_HW_PARAM_PERIOD_TIME, + period_time * USEC_PER_MSEC, UINT_MAX); + }
/* binding pcm substream to hda stream */ substream->runtime->private_data = &dsp_stream->hstream;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tiezhu Yang yangtiezhu@loongson.cn
[ Upstream commit abb2a5572264b425e6dd9c213b735a82ab0ca68a ]
Currently, when compiling with GCC, there is no "break 7" instruction for zero division due to using the option -mno-check-zero-division, but the compiler still generates "break 0" instruction for zero division.
Here is a simple example:
$ cat test.c int div(int a) { return a / 0; } $ gcc -O2 -S test.c -o test.s
GCC generates "break 0" on LoongArch and "ud2" on x86, objtool decodes "ud2" as INSN_BUG for x86, so decode "break 0" as INSN_BUG can fix the objtool warnings for LoongArch, but this is not the intention.
When decoding "break 0" as INSN_TRAP in the previous commit, the aim is to handle "break 0" as a trap. The generated "break 0" for zero division by GCC is not proper, it should generate a break instruction with proper bug type, so add the GCC option -fno-isolate-erroneous-paths-dereference to avoid generating the unexpected "break 0" instruction for now.
Reported-by: kernel test robot lkp@intel.com Closes: https://lore.kernel.org/r/202509200413.7uihAxJ5-lkp@intel.com/ Fixes: baad7830ee9a ("objtool/LoongArch: Mark types based on break immediate code") Suggested-by: WANG Rui wangrui@loongson.cn Signed-off-by: Tiezhu Yang yangtiezhu@loongson.cn Signed-off-by: Huacai Chen chenhuacai@loongson.cn Signed-off-by: Sasha Levin sashal@kernel.org --- arch/loongarch/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/loongarch/Makefile b/arch/loongarch/Makefile index 567bd122a9ee4..c14ae21075c5f 100644 --- a/arch/loongarch/Makefile +++ b/arch/loongarch/Makefile @@ -114,7 +114,7 @@ KBUILD_RUSTFLAGS_KERNEL += -Crelocation-model=pie LDFLAGS_vmlinux += -static -pie --no-dynamic-linker -z notext $(call ld-option, --apply-dynamic-relocs) endif
-cflags-y += $(call cc-option, -mno-check-zero-division) +cflags-y += $(call cc-option, -mno-check-zero-division -fno-isolate-erroneous-paths-dereference)
ifndef CONFIG_KASAN cflags-y += -fno-builtin-memcpy -fno-builtin-memmove -fno-builtin-memset
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Huacai Chen chenhuacai@loongson.cn
[ Upstream commit 98662be7ef20d2b88b598f72e7ce9b6ac26a40f9 ]
Init acpi_gbl_use_global_lock to false, in order to void error messages during boot phase:
ACPI Error: Could not enable GlobalLock event (20240827/evxfevnt-182) ACPI Error: No response from Global Lock hardware, disabling lock (20240827/evglock-59)
Fixes: 628c3bb40e9a8cefc0a6 ("LoongArch: Add boot and setup routines") Signed-off-by: Huacai Chen chenhuacai@loongson.cn Signed-off-by: Sasha Levin sashal@kernel.org --- arch/loongarch/kernel/setup.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/loongarch/kernel/setup.c b/arch/loongarch/kernel/setup.c index 1fa6a604734ef..2ceb198ae8c80 100644 --- a/arch/loongarch/kernel/setup.c +++ b/arch/loongarch/kernel/setup.c @@ -354,6 +354,7 @@ void __init platform_init(void)
#ifdef CONFIG_ACPI acpi_table_upgrade(); + acpi_gbl_use_global_lock = false; acpi_gbl_use_default_register_widths = false; acpi_boot_table_init(); #endif
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Ujfalusi peter.ujfalusi@linux.intel.com
[ Upstream commit aaab61de1f1e44a2ab527e935474e2e03a0f6b08 ]
It is allowed to mix Link and Host DMA channels in a way that their index is different. In this case we would read the LLP from a channel which is not used or used for other operation.
Such case can be reproduced on cAVS2.5 or ACE1 platforms with soundwire configuration: playback to SDW would take Host channel 0 (stream_tag 1) and no Link DMA used Second playback to HDMI (HDA) would use Host channel 1 (stream_tag 2) and Link channel 0 (stream_tag 1).
In this case reading the LLP from channel 2 is incorrect as that is not the Link channel used for the HDMI playback.
To correct this, we should look up the BE and get the channel used on the Link side.
Fixes: 67b182bea08a ("ASoC: SOF: Intel: hda: Implement get_stream_position (Linear Link Position)") Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com Reviewed-by: Kai Vehmanen kai.vehmanen@linux.intel.com Reviewed-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Link: https://patch.msgid.link/20251002074719.2084-6-peter.ujfalusi@linux.intel.co... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/sof/intel/hda-stream.c | 29 +++++++++++++++++++++++++++-- 1 file changed, 27 insertions(+), 2 deletions(-)
diff --git a/sound/soc/sof/intel/hda-stream.c b/sound/soc/sof/intel/hda-stream.c index 24f3cc7676142..2be0d02f9cf9b 100644 --- a/sound/soc/sof/intel/hda-stream.c +++ b/sound/soc/sof/intel/hda-stream.c @@ -1103,10 +1103,35 @@ u64 hda_dsp_get_stream_llp(struct snd_sof_dev *sdev, struct snd_soc_component *component, struct snd_pcm_substream *substream) { - struct hdac_stream *hstream = substream->runtime->private_data; - struct hdac_ext_stream *hext_stream = stream_to_hdac_ext_stream(hstream); + struct snd_soc_pcm_runtime *rtd = snd_soc_substream_to_rtd(substream); + struct snd_soc_pcm_runtime *be_rtd = NULL; + struct hdac_ext_stream *hext_stream; + struct snd_soc_dai *cpu_dai; + struct snd_soc_dpcm *dpcm; u32 llp_l, llp_u;
+ /* + * The LLP needs to be read from the Link DMA used for this FE as it is + * allowed to use any combination of Link and Host channels + */ + for_each_dpcm_be(rtd, substream->stream, dpcm) { + if (dpcm->fe != rtd) + continue; + + be_rtd = dpcm->be; + } + + if (!be_rtd) + return 0; + + cpu_dai = snd_soc_rtd_to_cpu(be_rtd, 0); + if (!cpu_dai) + return 0; + + hext_stream = snd_soc_dai_get_dma_data(cpu_dai, substream); + if (!hext_stream) + return 0; + /* * The pplc_addr have been calculated during probe in * hda_dsp_stream_init():
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@linaro.org
[ Upstream commit 4f0d91ba72811fd5dd577bcdccd7fed649aae62c ]
Print "entry->mac" before freeing "entry". The "entry" pointer is freed with kfree_rcu() so it's unlikely that we would trigger this in real life, but it's safer to re-order it.
Fixes: cc5387f7346a ("net/mlx4_en: Add unicast MAC filtering") Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Reviewed-by: Tariq Toukan tariqt@nvidia.com Link: https://patch.msgid.link/aNvMHX4g8RksFFvV@stanley.mountain Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c index 281b34af0bb48..3160a1a2d2ab3 100644 --- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c +++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c @@ -1180,9 +1180,9 @@ static void mlx4_en_do_uc_filter(struct mlx4_en_priv *priv, mlx4_unregister_mac(mdev->dev, priv->port, mac);
hlist_del_rcu(&entry->hlist); - kfree_rcu(entry, rcu); en_dbg(DRV, priv, "Removed MAC %pM on port:%d\n", entry->mac, priv->port); + kfree_rcu(entry, rcu); ++removed; } }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shuicheng Lin shuicheng.lin@intel.com
[ Upstream commit 08fdfd260e641da203f80aff8d3ed19c5ecceb7d ]
In xe_hw_engine_group_get_mode(), a write lock is acquired before calling switch_mode(), which in turn invokes xe_hw_engine_group_suspend_faulting_lr_jobs().
On failure inside xe_hw_engine_group_suspend_faulting_lr_jobs(), the write lock is released there, and then again in xe_hw_engine_group_get_mode(), leading to a double release.
Fix this by keeping both acquire and release operation in xe_hw_engine_group_get_mode().
Fixes: 770bd1d34113 ("drm/xe/hw_engine_group: Ensure safe transition between execution modes") Cc: Francois Dugast francois.dugast@intel.com Signed-off-by: Shuicheng Lin shuicheng.lin@intel.com Reviewed-by: Francois Dugast francois.dugast@intel.com Link: https://lore.kernel.org/r/20250925023145.1203004-2-shuicheng.lin@intel.com Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com (cherry picked from commit 662d98b8b373007fa1b08ba93fee11f6fd3e387c) Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/xe/xe_hw_engine_group.c | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/xe/xe_hw_engine_group.c b/drivers/gpu/drm/xe/xe_hw_engine_group.c index 82750520a90a5..f14a3cc7d1173 100644 --- a/drivers/gpu/drm/xe/xe_hw_engine_group.c +++ b/drivers/gpu/drm/xe/xe_hw_engine_group.c @@ -237,17 +237,13 @@ static int xe_hw_engine_group_suspend_faulting_lr_jobs(struct xe_hw_engine_group
err = q->ops->suspend_wait(q); if (err) - goto err_suspend; + return err; }
if (need_resume) xe_hw_engine_group_resume_faulting_lr_jobs(group);
return 0; - -err_suspend: - up_write(&group->mode_sem); - return err; }
/**
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vineeth Vijayan vneethv@linux.ibm.com
[ Upstream commit 9daa5a8795865f9a3c93d8d1066785b07ded6073 ]
Starting with 'commit 2297791c92d0 ("s390/cio: dont unregister subchannel from child-drivers")', cio no longer unregisters subchannels when the attached device is invalid or unavailable.
As an unintended side-effect, the cio_ignore purge function no longer removes subchannels for devices on the cio_ignore list if no CCW device is attached. This situation occurs when a CCW device is non-operational or unavailable
To ensure the same outcome of the purge function as when the current cio_ignore list had been active during boot, update the purge function to remove I/O subchannels without working CCW devices if the associated device number is found on the cio_ignore list.
Fixes: 2297791c92d0 ("s390/cio: dont unregister subchannel from child-drivers") Suggested-by: Peter Oberparleiter oberpar@linux.ibm.com Reviewed-by: Peter Oberparleiter oberpar@linux.ibm.com Signed-off-by: Vineeth Vijayan vneethv@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/s390/cio/device.c | 37 ++++++++++++++++++++++++------------- 1 file changed, 24 insertions(+), 13 deletions(-)
diff --git a/drivers/s390/cio/device.c b/drivers/s390/cio/device.c index 9498825d9c7a5..858a9e171351b 100644 --- a/drivers/s390/cio/device.c +++ b/drivers/s390/cio/device.c @@ -1318,23 +1318,34 @@ void ccw_device_schedule_recovery(void) spin_unlock_irqrestore(&recovery_lock, flags); }
-static int purge_fn(struct device *dev, void *data) +static int purge_fn(struct subchannel *sch, void *data) { - struct ccw_device *cdev = to_ccwdev(dev); - struct ccw_dev_id *id = &cdev->private->dev_id; - struct subchannel *sch = to_subchannel(cdev->dev.parent); + struct ccw_device *cdev;
- spin_lock_irq(cdev->ccwlock); - if (is_blacklisted(id->ssid, id->devno) && - (cdev->private->state == DEV_STATE_OFFLINE) && - (atomic_cmpxchg(&cdev->private->onoff, 0, 1) == 0)) { - CIO_MSG_EVENT(3, "ccw: purging 0.%x.%04x\n", id->ssid, - id->devno); + spin_lock_irq(&sch->lock); + if (sch->st != SUBCHANNEL_TYPE_IO || !sch->schib.pmcw.dnv) + goto unlock; + + if (!is_blacklisted(sch->schid.ssid, sch->schib.pmcw.dev)) + goto unlock; + + cdev = sch_get_cdev(sch); + if (cdev) { + if (cdev->private->state != DEV_STATE_OFFLINE) + goto unlock; + + if (atomic_cmpxchg(&cdev->private->onoff, 0, 1) != 0) + goto unlock; ccw_device_sched_todo(cdev, CDEV_TODO_UNREG); - css_sched_sch_todo(sch, SCH_TODO_UNREG); atomic_set(&cdev->private->onoff, 0); } - spin_unlock_irq(cdev->ccwlock); + + css_sched_sch_todo(sch, SCH_TODO_UNREG); + CIO_MSG_EVENT(3, "ccw: purging 0.%x.%04x%s\n", sch->schid.ssid, + sch->schib.pmcw.dev, cdev ? "" : " (no cdev)"); + +unlock: + spin_unlock_irq(&sch->lock); /* Abort loop in case of pending signal. */ if (signal_pending(current)) return -EINTR; @@ -1350,7 +1361,7 @@ static int purge_fn(struct device *dev, void *data) int ccw_purge_blacklisted(void) { CIO_MSG_EVENT(2, "ccw: purging blacklisted devices\n"); - bus_for_each_dev(&ccw_bus_type, NULL, NULL, purge_fn); + for_each_subchannel_staged(purge_fn, NULL, NULL); return 0; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zack Rusin zack.rusin@broadcom.com
[ Upstream commit 5ac2c0279053a2c5265d46903432fb26ae2d0da2 ]
Check that the resource which is converted to a surface exists before trying to use the cursor snooper on it.
vmw_cmd_res_check allows explicit invalid (SVGA3D_INVALID_ID) identifiers because some svga commands accept SVGA3D_INVALID_ID to mean "no surface", unfortunately functions that accept the actual surfaces as objects might (and in case of the cursor snooper, do not) be able to handle null objects. Make sure that we validate not only the identifier (via the vmw_cmd_res_check) but also check that the actual resource exists before trying to do something with it.
Fixes unchecked null-ptr reference in the snooping code.
Signed-off-by: Zack Rusin zack.rusin@broadcom.com Fixes: c0951b797e7d ("drm/vmwgfx: Refactor resource management") Reported-by: Kuzey Arda Bulut kuzeyardabulut@gmail.com Cc: Broadcom internal kernel review list bcm-kernel-feedback-list@broadcom.com Cc: dri-devel@lists.freedesktop.org Reviewed-by: Ian Forbes ian.forbes@broadcom.com Link: https://lore.kernel.org/r/20250917153655.1968583-1-zack.rusin@broadcom.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c index ea741bc4ac3fc..8b72848bb25cd 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_execbuf.c @@ -1515,6 +1515,7 @@ static int vmw_cmd_dma(struct vmw_private *dev_priv, SVGA3dCmdHeader *header) { struct vmw_bo *vmw_bo = NULL; + struct vmw_resource *res; struct vmw_surface *srf = NULL; VMW_DECLARE_CMD_VAR(*cmd, SVGA3dCmdSurfaceDMA); int ret; @@ -1550,18 +1551,24 @@ static int vmw_cmd_dma(struct vmw_private *dev_priv,
dirty = (cmd->body.transfer == SVGA3D_WRITE_HOST_VRAM) ? VMW_RES_DIRTY_SET : 0; - ret = vmw_cmd_res_check(dev_priv, sw_context, vmw_res_surface, - dirty, user_surface_converter, - &cmd->body.host.sid, NULL); + ret = vmw_cmd_res_check(dev_priv, sw_context, vmw_res_surface, dirty, + user_surface_converter, &cmd->body.host.sid, + NULL); if (unlikely(ret != 0)) { if (unlikely(ret != -ERESTARTSYS)) VMW_DEBUG_USER("could not find surface for DMA.\n"); return ret; }
- srf = vmw_res_to_srf(sw_context->res_cache[vmw_res_surface].res); + res = sw_context->res_cache[vmw_res_surface].res; + if (!res) { + VMW_DEBUG_USER("Invalid DMA surface.\n"); + return -EINVAL; + }
- vmw_kms_cursor_snoop(srf, sw_context->fp->tfile, &vmw_bo->tbo, header); + srf = vmw_res_to_srf(res); + vmw_kms_cursor_snoop(srf, sw_context->fp->tfile, &vmw_bo->tbo, + header);
return 0; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ian Forbes ian.forbes@broadcom.com
[ Upstream commit dfe1323ab3c8a4dd5625ebfdba44dc47df84512a ]
Nodes stored in the validation duplicates hashtable come from an arena allocator that is cleared at the end of vmw_execbuf_process. All nodes are expected to be cleared in vmw_validation_drop_ht but this node escaped because its resource was destroyed prematurely.
Fixes: 64ad2abfe9a6 ("drm/vmwgfx: Adapt validation code for reference-free lookups") Reported-by: Kuzey Arda Bulut kuzeyardabulut@gmail.com Signed-off-by: Ian Forbes ian.forbes@broadcom.com Reviewed-by: Zack Rusin zack.rusin@broadcom.com Signed-off-by: Zack Rusin zack.rusin@broadcom.com Link: https://lore.kernel.org/r/20250926195427.1405237-1-ian.forbes@broadcom.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/vmwgfx/vmwgfx_validation.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c index e7625b3f71e0e..c18dc414a047f 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c @@ -309,8 +309,10 @@ int vmw_validation_add_resource(struct vmw_validation_context *ctx, hash_add_rcu(ctx->sw_context->res_ht, &node->hash.head, node->hash.key); } node->res = vmw_resource_reference_unless_doomed(res); - if (!node->res) + if (!node->res) { + hash_del_rcu(&node->hash.head); return -ESRCH; + }
node->first_usage = 1; if (!res->dev_priv->has_mob) {
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ian Forbes ian.forbes@broadcom.com
[ Upstream commit 228c5d44dffe8c293cd2d2f0e7ea45e64565b1c4 ]
'entry' should be 'val' which is the loop iterator.
Fixes: 9e931f2e0970 ("drm/vmwgfx: Refactor resource validation hashtable to use linux/hashtable implementation.") Signed-off-by: Ian Forbes ian.forbes@broadcom.com Reviewed-by: Zack Rusin zack.rusin@broadcom.com Signed-off-by: Zack Rusin zack.rusin@broadcom.com Link: https://lore.kernel.org/r/20250926195427.1405237-2-ian.forbes@broadcom.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/vmwgfx/vmwgfx_validation.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c index c18dc414a047f..80e8bbc3021d1 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_validation.c @@ -639,7 +639,7 @@ void vmw_validation_drop_ht(struct vmw_validation_context *ctx) hash_del_rcu(&val->hash.head);
list_for_each_entry(val, &ctx->resource_ctx_list, head) - hash_del_rcu(&entry->hash.head); + hash_del_rcu(&val->hash.head);
ctx->sw_context = NULL; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexandr Sapozhnikov alsp705@gmail.com
[ Upstream commit 2f3119686ef50319490ccaec81a575973da98815 ]
If new_asoc->peer.adaptation_ind=0 and sctp_ulpevent_make_authkey=0 and sctp_ulpevent_make_authkey() returns 0, then the variable ai_ev remains zero and the zero will be dereferenced in the sctp_ulpevent_free() function.
Signed-off-by: Alexandr Sapozhnikov alsp705@gmail.com Acked-by: Xin Long lucien.xin@gmail.com Fixes: 30f6ebf65bc4 ("sctp: add SCTP_AUTH_NO_AUTH type for AUTHENTICATION_EVENT") Link: https://patch.msgid.link/20251002091448.11-1-alsp705@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/sctp/sm_statefuns.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c index a0524ba8d7878..93cac73472c79 100644 --- a/net/sctp/sm_statefuns.c +++ b/net/sctp/sm_statefuns.c @@ -885,7 +885,8 @@ enum sctp_disposition sctp_sf_do_5_1D_ce(struct net *net, return SCTP_DISPOSITION_CONSUME;
nomem_authev: - sctp_ulpevent_free(ai_ev); + if (ai_ev) + sctp_ulpevent_free(ai_ev); nomem_aiev: sctp_ulpevent_free(ev); nomem_ev:
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima kuniyu@google.com
[ Upstream commit 2e7cbbbe3d61c63606994b7ff73c72537afe2e1c ]
syzbot reported the splat below in tcp_conn_request(). [0]
If a listener is close()d while a TFO socket is being processed in tcp_conn_request(), inet_csk_reqsk_queue_add() does not set reqsk->sk and calls inet_child_forget(), which calls tcp_disconnect() for the TFO socket.
After the cited commit, tcp_disconnect() calls reqsk_fastopen_remove(), where reqsk_put() is called due to !reqsk->sk.
Then, reqsk_fastopen_remove() in tcp_conn_request() decrements the last req->rsk_refcnt and frees reqsk, and __reqsk_free() at the drop_and_free label causes the refcount underflow for the listener and double-free of the reqsk.
Let's remove reqsk_fastopen_remove() in tcp_conn_request().
Note that other callers make sure tp->fastopen_rsk is not NULL.
[0]: refcount_t: underflow; use-after-free. WARNING: CPU: 12 PID: 5563 at lib/refcount.c:28 refcount_warn_saturate (lib/refcount.c:28) Modules linked in: CPU: 12 UID: 0 PID: 5563 Comm: syz-executor Not tainted syzkaller #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/12/2025 RIP: 0010:refcount_warn_saturate (lib/refcount.c:28) Code: ab e8 8e b4 98 ff 0f 0b c3 cc cc cc cc cc 80 3d a4 e4 d6 01 00 75 9c c6 05 9b e4 d6 01 01 48 c7 c7 e8 df fb ab e8 6a b4 98 ff <0f> 0b e9 03 5b 76 00 cc 80 3d 7d e4 d6 01 00 0f 85 74 ff ff ff c6 RSP: 0018:ffffa79fc0304a98 EFLAGS: 00010246 RAX: d83af4db1c6b3900 RBX: ffff9f65c7a69020 RCX: d83af4db1c6b3900 RDX: 0000000000000000 RSI: 00000000ffff7fff RDI: ffffffffac78a280 RBP: 000000009d781b60 R08: 0000000000007fff R09: ffffffffac6ca280 R10: 0000000000017ffd R11: 0000000000000004 R12: ffff9f65c7b4f100 R13: ffff9f65c7d23c00 R14: ffff9f65c7d26000 R15: ffff9f65c7a64ef8 FS: 00007f9f962176c0(0000) GS:ffff9f65fcf00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000200000000180 CR3: 000000000dbbe006 CR4: 0000000000372ef0 Call Trace: <IRQ> tcp_conn_request (./include/linux/refcount.h:400 ./include/linux/refcount.h:432 ./include/linux/refcount.h:450 ./include/net/sock.h:1965 ./include/net/request_sock.h:131 net/ipv4/tcp_input.c:7301) tcp_rcv_state_process (net/ipv4/tcp_input.c:6708) tcp_v6_do_rcv (net/ipv6/tcp_ipv6.c:1670) tcp_v6_rcv (net/ipv6/tcp_ipv6.c:1906) ip6_protocol_deliver_rcu (net/ipv6/ip6_input.c:438) ip6_input (net/ipv6/ip6_input.c:500) ipv6_rcv (net/ipv6/ip6_input.c:311) __netif_receive_skb (net/core/dev.c:6104) process_backlog (net/core/dev.c:6456) __napi_poll (net/core/dev.c:7506) net_rx_action (net/core/dev.c:7569 net/core/dev.c:7696) handle_softirqs (kernel/softirq.c:579) do_softirq (kernel/softirq.c:480) </IRQ>
Fixes: 45c8a6cc2bcd ("tcp: Clear tcp_sk(sk)->fastopen_rsk in tcp_disconnect().") Reported-by: syzkaller syzkaller@googlegroups.com Signed-off-by: Kuniyuki Iwashima kuniyu@google.com Link: https://patch.msgid.link/20251001233755.1340927-1-kuniyu@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/tcp_input.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 30f4375f8431b..4c8d84fc27ca3 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -7338,7 +7338,6 @@ int tcp_conn_request(struct request_sock_ops *rsk_ops, &foc, TCP_SYNACK_FASTOPEN, skb); /* Add the child socket directly into the accept queue */ if (!inet_csk_reqsk_queue_add(sk, req, fastopen_sk)) { - reqsk_fastopen_remove(fastopen_sk, req, false); bh_unlock_sock(fastopen_sk); sock_put(fastopen_sk); goto drop_and_free;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Duoming Zhou duoming@zju.edu.cn
[ Upstream commit bc9ea787079671cb19a8b25ff9f02be5ef6bfcf5 ]
The origin code calls cancel_delayed_work() in ocelot_stats_deinit() to cancel the cyclic delayed work item ocelot->stats_work. However, cancel_delayed_work() may fail to cancel the work item if it is already executing. While destroy_workqueue() does wait for all pending work items in the work queue to complete before destroying the work queue, it cannot prevent the delayed work item from being rescheduled within the ocelot_check_stats_work() function. This limitation exists because the delayed work item is only enqueued into the work queue after its timer expires. Before the timer expiration, destroy_workqueue() has no visibility of this pending work item. Once the work queue appears empty, destroy_workqueue() proceeds with destruction. When the timer eventually expires, the delayed work item gets queued again, leading to the following warning:
workqueue: cannot queue ocelot_check_stats_work on wq ocelot-switch-stats WARNING: CPU: 2 PID: 0 at kernel/workqueue.c:2255 __queue_work+0x875/0xaf0 ... RIP: 0010:__queue_work+0x875/0xaf0 ... RSP: 0018:ffff88806d108b10 EFLAGS: 00010086 RAX: 0000000000000000 RBX: 0000000000000101 RCX: 0000000000000027 RDX: 0000000000000027 RSI: 0000000000000004 RDI: ffff88806d123e88 RBP: ffffffff813c3170 R08: 0000000000000000 R09: ffffed100da247d2 R10: ffffed100da247d1 R11: ffff88806d123e8b R12: ffff88800c00f000 R13: ffff88800d7285c0 R14: ffff88806d0a5580 R15: ffff88800d7285a0 FS: 0000000000000000(0000) GS:ffff8880e5725000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007fe18e45ea10 CR3: 0000000005e6c000 CR4: 00000000000006f0 Call Trace: <IRQ> ? kasan_report+0xc6/0xf0 ? __pfx_delayed_work_timer_fn+0x10/0x10 ? __pfx_delayed_work_timer_fn+0x10/0x10 call_timer_fn+0x25/0x1c0 __run_timer_base.part.0+0x3be/0x8c0 ? __pfx_delayed_work_timer_fn+0x10/0x10 ? rcu_sched_clock_irq+0xb06/0x27d0 ? __pfx___run_timer_base.part.0+0x10/0x10 ? try_to_wake_up+0xb15/0x1960 ? _raw_spin_lock_irq+0x80/0xe0 ? __pfx__raw_spin_lock_irq+0x10/0x10 tmigr_handle_remote_up+0x603/0x7e0 ? __pfx_tmigr_handle_remote_up+0x10/0x10 ? sched_balance_trigger+0x1c0/0x9f0 ? sched_tick+0x221/0x5a0 ? _raw_spin_lock_irq+0x80/0xe0 ? __pfx__raw_spin_lock_irq+0x10/0x10 ? tick_nohz_handler+0x339/0x440 ? __pfx_tmigr_handle_remote_up+0x10/0x10 __walk_groups.isra.0+0x42/0x150 tmigr_handle_remote+0x1f4/0x2e0 ? __pfx_tmigr_handle_remote+0x10/0x10 ? ktime_get+0x60/0x140 ? lapic_next_event+0x11/0x20 ? clockevents_program_event+0x1d4/0x2a0 ? hrtimer_interrupt+0x322/0x780 handle_softirqs+0x16a/0x550 irq_exit_rcu+0xaf/0xe0 sysvec_apic_timer_interrupt+0x70/0x80 </IRQ> ...
The following diagram reveals the cause of the above warning:
CPU 0 (remove) | CPU 1 (delayed work callback) mscc_ocelot_remove() | ocelot_deinit() | ocelot_check_stats_work() ocelot_stats_deinit() | cancel_delayed_work()| ... | queue_delayed_work() destroy_workqueue() | (wait a time) | __queue_work() //UAF
The above scenario actually constitutes a UAF vulnerability.
The ocelot_stats_deinit() is only invoked when initialization failure or resource destruction, so we must ensure that any delayed work items cannot be rescheduled.
Replace cancel_delayed_work() with disable_delayed_work_sync() to guarantee proper cancellation of the delayed work item and ensure completion of any currently executing work before the workqueue is deallocated.
A deadlock concern was considered: ocelot_stats_deinit() is called in a process context and is not holding any locks that the delayed work item might also need. Therefore, the use of the _sync() variant is safe here.
This bug was identified through static analysis. To reproduce the issue and validate the fix, I simulated ocelot-switch device by writing a kernel module and prepared the necessary resources for the virtual ocelot-switch device's probe process. Then, removing the virtual device will trigger the mscc_ocelot_remove() function, which in turn destroys the workqueue.
Fixes: a556c76adc05 ("net: mscc: Add initial Ocelot switch support") Signed-off-by: Duoming Zhou duoming@zju.edu.cn Link: https://patch.msgid.link/20251001011149.55073-1-duoming@zju.edu.cn Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mscc/ocelot_stats.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mscc/ocelot_stats.c b/drivers/net/ethernet/mscc/ocelot_stats.c index c018783757fb2..858d4cbc83d3f 100644 --- a/drivers/net/ethernet/mscc/ocelot_stats.c +++ b/drivers/net/ethernet/mscc/ocelot_stats.c @@ -984,6 +984,6 @@ int ocelot_stats_init(struct ocelot *ocelot)
void ocelot_stats_deinit(struct ocelot *ocelot) { - cancel_delayed_work(&ocelot->stats_work); + disable_delayed_work_sync(&ocelot->stats_work); destroy_workqueue(ocelot->stats_queue); }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Haotian Zhang vulab@iscas.ac.cn
[ Upstream commit 2db687f3469dbc5c59bc53d55acafd75d530b497 ]
When ice_adapter_new() fails, the reserved XArray entry created by xa_insert() is not released. This causes subsequent insertions at the same index to return -EBUSY, potentially leading to NULL pointer dereferences.
Reorder the operations as suggested by Przemek Kitszel: 1. Check if adapter already exists (xa_load) 2. Reserve the XArray slot (xa_reserve) 3. Allocate the adapter (ice_adapter_new) 4. Store the adapter (xa_store)
Fixes: 0f0023c649c7 ("ice: do not init struct ice_adapter more times than needed") Suggested-by: Przemek Kitszel przemyslaw.kitszel@intel.com Suggested-by: Jacob Keller jacob.e.keller@intel.com Signed-off-by: Haotian Zhang vulab@iscas.ac.cn Reviewed-by: Aleksandr Loktionov aleksandr.loktionov@intel.com Reviewed-by: Jacob Keller jacob.e.keller@intel.com Link: https://patch.msgid.link/20251001115336.1707-1-vulab@iscas.ac.cn Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ice/ice_adapter.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_adapter.c b/drivers/net/ethernet/intel/ice/ice_adapter.c index 10285995c9edd..cc4798026182e 100644 --- a/drivers/net/ethernet/intel/ice/ice_adapter.c +++ b/drivers/net/ethernet/intel/ice/ice_adapter.c @@ -98,19 +98,21 @@ struct ice_adapter *ice_adapter_get(struct pci_dev *pdev)
index = ice_adapter_xa_index(pdev); scoped_guard(mutex, &ice_adapters_mutex) { - err = xa_insert(&ice_adapters, index, NULL, GFP_KERNEL); - if (err == -EBUSY) { - adapter = xa_load(&ice_adapters, index); + adapter = xa_load(&ice_adapters, index); + if (adapter) { refcount_inc(&adapter->refcount); WARN_ON_ONCE(adapter->index != ice_adapter_index(pdev)); return adapter; } + err = xa_reserve(&ice_adapters, index, GFP_KERNEL); if (err) return ERR_PTR(err);
adapter = ice_adapter_new(pdev); - if (!adapter) + if (!adapter) { + xa_release(&ice_adapters, index); return ERR_PTR(-ENOMEM); + } xa_store(&ice_adapters, index, adapter, GFP_KERNEL); } return adapter;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Erick Karanja karanja99erick@gmail.com
[ Upstream commit 521405cb54cd2812bbb6dedd5afc14bca1e7e98a ]
Add missing of_node_put call to release device node tbi obtained via for_each_child_of_node.
Fixes: afae5ad78b342 ("net/fsl_pq_mdio: streamline probing of MDIO nodes") Signed-off-by: Erick Karanja karanja99erick@gmail.com Link: https://patch.msgid.link/20251002174617.960521-1-karanja99erick@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/freescale/fsl_pq_mdio.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/freescale/fsl_pq_mdio.c b/drivers/net/ethernet/freescale/fsl_pq_mdio.c index 026f7270a54de..fb5d12a87872e 100644 --- a/drivers/net/ethernet/freescale/fsl_pq_mdio.c +++ b/drivers/net/ethernet/freescale/fsl_pq_mdio.c @@ -479,10 +479,12 @@ static int fsl_pq_mdio_probe(struct platform_device *pdev) "missing 'reg' property in node %pOF\n", tbi); err = -EBUSY; + of_node_put(tbi); goto error; } set_tbipa(*prop, pdev, data->get_tbipa, priv->map, &res); + of_node_put(tbi); } }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Leo Yan leo.yan@arm.com
[ Upstream commit 53d067feb8c4f16d1f24ce3f4df4450bb18c555f ]
The feature test programs are built without enabling '-Wall -Werror' options. As a result, a feature may appear to be available, but later building in perf can fail with stricter checks.
Make the feature test program use the same warning options as perf.
Fixes: 1925459b4d92 ("tools build: Fix feature Makefile issues with 'O='") Signed-off-by: Leo Yan leo.yan@arm.com Reviewed-by: Ian Rogers irogers@google.com Link: https://lore.kernel.org/r/20251006-perf_build_android_ndk-v3-1-4305590795b2@... Cc: Palmer Dabbelt palmer@dabbelt.com Cc: Albert Ou aou@eecs.berkeley.edu Cc: Alexandre Ghiti alex@ghiti.fr Cc: Nick Desaulniers nick.desaulniers+lkml@gmail.com Cc: Justin Stitt justinstitt@google.com Cc: Bill Wendling morbo@google.com Cc: Adrian Hunter adrian.hunter@intel.com Cc: Arnaldo Carvalho de Melo acme@kernel.org Cc: Jiri Olsa jolsa@kernel.org Cc: Namhyung Kim namhyung@kernel.org Cc: Nathan Chancellor nathan@kernel.org Cc: James Clark james.clark@linaro.org Cc: linux-riscv@lists.infradead.org Cc: llvm@lists.linux.dev Cc: Paul Walmsley paul.walmsley@sifive.com Cc: linux-kernel@vger.kernel.org Cc: linux-perf-users@vger.kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/build/feature/Makefile | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/build/feature/Makefile b/tools/build/feature/Makefile index 1658596188bf8..592ca17b4b74a 100644 --- a/tools/build/feature/Makefile +++ b/tools/build/feature/Makefile @@ -327,10 +327,10 @@ $(OUTPUT)test-libcapstone.bin: $(BUILD) # -lcapstone provided by $(FEATURE_CHECK_LDFLAGS-libcapstone)
$(OUTPUT)test-compile-32.bin: - $(CC) -m32 -o $@ test-compile.c + $(CC) -m32 -Wall -Werror -o $@ test-compile.c
$(OUTPUT)test-compile-x32.bin: - $(CC) -mx32 -o $@ test-compile.c + $(CC) -mx32 -Wall -Werror -o $@ test-compile.c
$(OUTPUT)test-zlib.bin: $(BUILD) -lz
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Leo Yan leo.yan@arm.com
[ Upstream commit c6a43bc3e8f6102a47da0d2e53428d08f00172fb ]
When passing a list to subprocess.Popen, each element maps to one argv token. Current code bundles multiple Clang flags into a single element, something like:
cmd = ['clang', '--target=x86_64-linux-gnu -fintegrated-as -Wno-cast-function-type-mismatch', 'test-hello.c']
So Clang only sees one long, invalid option instead of separate flags, as a result, the script cannot capture any log via PIPE.
Fix this by using shlex.split() to separate the string so each option becomes its own argv element. The fixed list will be:
cmd = ['clang', '--target=x86_64-linux-gnu', '-fintegrated-as', '-Wno-cast-function-type-mismatch', 'test-hello.c']
Fixes: 09e6f9f98370 ("perf python: Fix splitting CC into compiler and options") Signed-off-by: Leo Yan leo.yan@arm.com Reviewed-by: Ian Rogers irogers@google.com Link: https://lore.kernel.org/r/20251006-perf_build_android_ndk-v3-2-4305590795b2@... Cc: Palmer Dabbelt palmer@dabbelt.com Cc: Albert Ou aou@eecs.berkeley.edu Cc: Alexandre Ghiti alex@ghiti.fr Cc: Nick Desaulniers nick.desaulniers+lkml@gmail.com Cc: Justin Stitt justinstitt@google.com Cc: Bill Wendling morbo@google.com Cc: Adrian Hunter adrian.hunter@intel.com Cc: Arnaldo Carvalho de Melo acme@kernel.org Cc: Jiri Olsa jolsa@kernel.org Cc: Namhyung Kim namhyung@kernel.org Cc: Nathan Chancellor nathan@kernel.org Cc: James Clark james.clark@linaro.org Cc: linux-riscv@lists.infradead.org Cc: llvm@lists.linux.dev Cc: Paul Walmsley paul.walmsley@sifive.com Cc: linux-kernel@vger.kernel.org Cc: linux-perf-users@vger.kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/setup.py | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/tools/perf/util/setup.py b/tools/perf/util/setup.py index 649550e9b7aa8..abb567de3e8a9 100644 --- a/tools/perf/util/setup.py +++ b/tools/perf/util/setup.py @@ -1,6 +1,7 @@ from os import getenv, path from subprocess import Popen, PIPE from re import sub +import shlex
cc = getenv("CC")
@@ -16,7 +17,9 @@ cc_is_clang = b"clang version" in Popen([cc, "-v"], stderr=PIPE).stderr.readline src_feature_tests = getenv('srctree') + '/tools/build/feature'
def clang_has_option(option): - cc_output = Popen([cc, cc_options + option, path.join(src_feature_tests, "test-hello.c") ], stderr=PIPE).stderr.readlines() + cmd = shlex.split(f"{cc} {cc_options} {option}") + cmd.append(path.join(src_feature_tests, "test-hello.c")) + cc_output = Popen(cmd, stderr=PIPE).stderr.readlines() return [o for o in cc_output if ((b"unknown argument" in o) or (b"is not supported" in o) or (b"unknown warning option" in o))] == [ ]
if cc_is_clang:
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
[ Upstream commit 21b29e74ffe5a6c851c235bb80bf5ee26292c67b ]
Some applications (like selftests/net/tcp_mmap.c) call SO_RCVLOWAT on their listener, before accept().
This has an unfortunate effect on wscale selection in tcp_select_initial_window() during 3WHS.
For instance, tcp_mmap was negotiating wscale 4, regardless of tcp_rmem[2] and sysctl_rmem_max.
Do not change tp->window_clamp if it is zero or bigger than our computed value.
Zero value is special, it allows tcp_select_initial_window() to enable autotuning.
Note that SO_RCVLOWAT use on listener is probably not wise, because tp->scaling_ratio has a default value, possibly wrong.
Fixes: d1361840f8c5 ("tcp: fix SO_RCVLOWAT and RCVBUF autotuning") Signed-off-by: Eric Dumazet edumazet@google.com Reviewed-by: Kuniyuki Iwashima kuniyu@google.com Reviewed-by: Neal Cardwell ncardwell@google.com Link: https://patch.msgid.link/20251003184119.2526655-1-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/tcp.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c index 739931aabb4e3..795ffa62cc0e6 100644 --- a/net/ipv4/tcp.c +++ b/net/ipv4/tcp.c @@ -1735,6 +1735,7 @@ EXPORT_SYMBOL(tcp_peek_len); /* Make sure sk_rcvbuf is big enough to satisfy SO_RCVLOWAT hint */ int tcp_set_rcvlowat(struct sock *sk, int val) { + struct tcp_sock *tp = tcp_sk(sk); int space, cap;
if (sk->sk_userlocks & SOCK_RCVBUF_LOCK) @@ -1753,7 +1754,9 @@ int tcp_set_rcvlowat(struct sock *sk, int val) space = tcp_space_from_win(sk, val); if (space > sk->sk_rcvbuf) { WRITE_ONCE(sk->sk_rcvbuf, space); - WRITE_ONCE(tcp_sk(sk)->window_clamp, val); + + if (tp->window_clamp && tp->window_clamp < val) + WRITE_ONCE(tp->window_clamp, val); } return 0; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Harini T harini.t@amd.com
[ Upstream commit 341867f730d3d3bb54491ee64e8b1a0c446656e7 ]
The controller is registered using the device-managed function 'devm_mbox_controller_register()'. As documented in mailbox.c, this ensures the devres framework automatically calls mbox_controller_unregister() when device_unregister() is invoked, making the explicit call unnecessary.
Remove redundant mbox_controller_unregister() call as device_unregister() handles controller cleanup.
Fixes: 4981b82ba2ff ("mailbox: ZynqMP IPI mailbox controller") Signed-off-by: Harini T harini.t@amd.com Reviewed-by: Peng Fan peng.fan@nxp.com Signed-off-by: Jassi Brar jassisinghbrar@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mailbox/zynqmp-ipi-mailbox.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/drivers/mailbox/zynqmp-ipi-mailbox.c b/drivers/mailbox/zynqmp-ipi-mailbox.c index d59fcb74b3479..388f418aebb72 100644 --- a/drivers/mailbox/zynqmp-ipi-mailbox.c +++ b/drivers/mailbox/zynqmp-ipi-mailbox.c @@ -894,7 +894,6 @@ static void zynqmp_ipi_free_mboxes(struct zynqmp_ipi_pdata *pdata) for (; i >= 0; i--) { ipi_mbox = &pdata->ipi_mboxes[i]; if (ipi_mbox->dev.parent) { - mbox_controller_unregister(&ipi_mbox->mbox); if (device_is_registered(&ipi_mbox->dev)) device_unregister(&ipi_mbox->dev); }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Harini T harini.t@amd.com
[ Upstream commit 019e3f4550fc7d319a7fd03eff487255f8e8aecd ]
The ipi_mbox->dev.parent check is unreliable proxy for registration status as it fails to protect against probe failures that occur after the parent is assigned but before device_register() completes.
device_is_registered() is the canonical and robust method to verify the registration status.
Remove ipi_mbox->dev.parent check in zynqmp_ipi_free_mboxes().
Fixes: 4981b82ba2ff ("mailbox: ZynqMP IPI mailbox controller") Signed-off-by: Harini T harini.t@amd.com Reviewed-by: Peng Fan peng.fan@nxp.com Signed-off-by: Jassi Brar jassisinghbrar@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mailbox/zynqmp-ipi-mailbox.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/mailbox/zynqmp-ipi-mailbox.c b/drivers/mailbox/zynqmp-ipi-mailbox.c index 388f418aebb72..411bbc10edeaf 100644 --- a/drivers/mailbox/zynqmp-ipi-mailbox.c +++ b/drivers/mailbox/zynqmp-ipi-mailbox.c @@ -893,10 +893,8 @@ static void zynqmp_ipi_free_mboxes(struct zynqmp_ipi_pdata *pdata) i = pdata->num_mboxes; for (; i >= 0; i--) { ipi_mbox = &pdata->ipi_mboxes[i]; - if (ipi_mbox->dev.parent) { - if (device_is_registered(&ipi_mbox->dev)) - device_unregister(&ipi_mbox->dev); - } + if (device_is_registered(&ipi_mbox->dev)) + device_unregister(&ipi_mbox->dev); } }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Harini T harini.t@amd.com
[ Upstream commit 0aead8197fc1a85b0a89646e418feb49a564b029 ]
The cleanup loop was starting at the wrong array index, causing out-of-bounds access. Start the loop at the correct index for zero-indexed arrays to prevent accessing memory beyond the allocated array bounds.
Fixes: 4981b82ba2ff ("mailbox: ZynqMP IPI mailbox controller") Signed-off-by: Harini T harini.t@amd.com Reviewed-by: Peng Fan peng.fan@nxp.com Signed-off-by: Jassi Brar jassisinghbrar@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mailbox/zynqmp-ipi-mailbox.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/mailbox/zynqmp-ipi-mailbox.c b/drivers/mailbox/zynqmp-ipi-mailbox.c index 411bbc10edeaf..fab897b5724c1 100644 --- a/drivers/mailbox/zynqmp-ipi-mailbox.c +++ b/drivers/mailbox/zynqmp-ipi-mailbox.c @@ -890,7 +890,7 @@ static void zynqmp_ipi_free_mboxes(struct zynqmp_ipi_pdata *pdata) if (pdata->irq < MAX_SGI) xlnx_mbox_cleanup_sgi(pdata);
- i = pdata->num_mboxes; + i = pdata->num_mboxes - 1; for (; i >= 0; i--) { ipi_mbox = &pdata->ipi_mboxes[i]; if (device_is_registered(&ipi_mbox->dev))
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Harini T harini.t@amd.com
[ Upstream commit bb160e791ab15b89188a7a19589b8e11f681bef3 ]
The driver incorrectly determines SGI vs SPI interrupts by checking IRQ number < 16, which fails with dynamic IRQ allocation. During unbind, this causes improper SGI cleanup leading to kernel crash.
Add explicit irq_type field to pdata for reliable identification of SGI interrupts (type-2) and only clean up SGI resources when appropriate.
Fixes: 6ffb1635341b ("mailbox: zynqmp: handle SGI for shared IPI") Signed-off-by: Harini T harini.t@amd.com Reviewed-by: Peng Fan peng.fan@nxp.com Signed-off-by: Jassi Brar jassisinghbrar@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mailbox/zynqmp-ipi-mailbox.c | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/drivers/mailbox/zynqmp-ipi-mailbox.c b/drivers/mailbox/zynqmp-ipi-mailbox.c index fab897b5724c1..5bf81b60e9e67 100644 --- a/drivers/mailbox/zynqmp-ipi-mailbox.c +++ b/drivers/mailbox/zynqmp-ipi-mailbox.c @@ -62,7 +62,8 @@ #define DST_BIT_POS 9U #define SRC_BITMASK GENMASK(11, 8)
-#define MAX_SGI 16 +/* Macro to represent SGI type for IPI IRQs */ +#define IPI_IRQ_TYPE_SGI 2
/* * Module parameters @@ -121,6 +122,7 @@ struct zynqmp_ipi_mbox { * @dev: device pointer corresponding to the Xilinx ZynqMP * IPI agent * @irq: IPI agent interrupt ID + * @irq_type: IPI SGI or SPI IRQ type * @method: IPI SMC or HVC is going to be used * @local_id: local IPI agent ID * @virq_sgi: IRQ number mapped to SGI @@ -130,6 +132,7 @@ struct zynqmp_ipi_mbox { struct zynqmp_ipi_pdata { struct device *dev; int irq; + unsigned int irq_type; unsigned int method; u32 local_id; int virq_sgi; @@ -887,7 +890,7 @@ static void zynqmp_ipi_free_mboxes(struct zynqmp_ipi_pdata *pdata) struct zynqmp_ipi_mbox *ipi_mbox; int i;
- if (pdata->irq < MAX_SGI) + if (pdata->irq_type == IPI_IRQ_TYPE_SGI) xlnx_mbox_cleanup_sgi(pdata);
i = pdata->num_mboxes - 1; @@ -956,14 +959,16 @@ static int zynqmp_ipi_probe(struct platform_device *pdev) dev_err(dev, "failed to parse interrupts\n"); goto free_mbox_dev; } - ret = out_irq.args[1]; + + /* Use interrupt type to distinguish SGI and SPI interrupts */ + pdata->irq_type = out_irq.args[0];
/* * If Interrupt number is in SGI range, then request SGI else request * IPI system IRQ. */ - if (ret < MAX_SGI) { - pdata->irq = ret; + if (pdata->irq_type == IPI_IRQ_TYPE_SGI) { + pdata->irq = out_irq.args[1]; ret = xlnx_mbox_init_sgi(pdev, pdata->irq, pdata); if (ret) goto free_mbox_dev;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Borkmann daniel@iogearbox.net
[ Upstream commit 23f3770e1a53e6c7a553135011f547209e141e72 ]
Cilium has a BPF egress gateway feature which forces outgoing K8s Pod traffic to pass through dedicated egress gateways which then SNAT the traffic in order to interact with stable IPs outside the cluster.
The traffic is directed to the gateway via vxlan tunnel in collect md mode. A recent BPF change utilized the bpf_redirect_neigh() helper to forward packets after the arrival and decap on vxlan, which turned out over time that the kmalloc-256 slab usage in kernel was ever-increasing.
The issue was that vxlan allocates the metadata_dst object and attaches it through a fake dst entry to the skb. The latter was never released though given bpf_redirect_neigh() was merely setting the new dst entry via skb_dst_set() without dropping an existing one first.
Fixes: b4ab31414970 ("bpf: Add redirect_neigh helper as redirect drop-in") Reported-by: Yusuke Suzuki yusuke.suzuki@isovalent.com Reported-by: Julian Wiedmann jwi@isovalent.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Cc: Martin KaFai Lau martin.lau@kernel.org Cc: Jakub Kicinski kuba@kernel.org Cc: Jordan Rife jrife@google.com Reviewed-by: Simon Horman horms@kernel.org Reviewed-by: Jordan Rife jrife@google.com Reviewed-by: Jakub Kicinski kuba@kernel.org Reviewed-by: Martin KaFai Lau martin.lau@kernel.org Link: https://lore.kernel.org/r/20251003073418.291171-1-daniel@iogearbox.net Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/core/filter.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/net/core/filter.c b/net/core/filter.c index c850e5d6cbd87..fef4d85fee008 100644 --- a/net/core/filter.c +++ b/net/core/filter.c @@ -2289,6 +2289,7 @@ static int __bpf_redirect_neigh_v6(struct sk_buff *skb, struct net_device *dev, if (IS_ERR(dst)) goto out_drop;
+ skb_dst_drop(skb); skb_dst_set(skb, dst); } else if (nh->nh_family != AF_INET6) { goto out_drop; @@ -2397,6 +2398,7 @@ static int __bpf_redirect_neigh_v4(struct sk_buff *skb, struct net_device *dev, goto out_drop; }
+ skb_dst_drop(skb); skb_dst_set(skb, &rt->dst); }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sakari Ailus sakari.ailus@linux.intel.com
[ Upstream commit 08fb6d8ff900a1d06ef2f4a6ded45cdaa7140c01 ]
pm_runtime_put_autosuspend() will soon be changed to include a call to pm_runtime_mark_last_busy(). This patch switches the current users to __pm_runtime_put_autosuspend() which will continue to have the functionality of old pm_runtime_put_autosuspend().
Signed-off-by: Sakari Ailus sakari.ailus@linux.intel.com Reviewed-by: Matthias Brugger matthias.bgg@gmail.com Signed-off-by: Jassi Brar jassisinghbrar@gmail.com Stable-dep-of: 3f39f5652037 ("mailbox: mtk-cmdq: Remove pm_runtime APIs from cmdq_mbox_send_data()") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mailbox/mtk-cmdq-mailbox.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/mailbox/mtk-cmdq-mailbox.c b/drivers/mailbox/mtk-cmdq-mailbox.c index d24f71819c3d6..4e093b58fb7b5 100644 --- a/drivers/mailbox/mtk-cmdq-mailbox.c +++ b/drivers/mailbox/mtk-cmdq-mailbox.c @@ -390,7 +390,7 @@ static int cmdq_mbox_send_data(struct mbox_chan *chan, void *data)
task = kzalloc(sizeof(*task), GFP_ATOMIC); if (!task) { - pm_runtime_put_autosuspend(cmdq->mbox.dev); + __pm_runtime_put_autosuspend(cmdq->mbox.dev); return -ENOMEM; }
@@ -440,7 +440,7 @@ static int cmdq_mbox_send_data(struct mbox_chan *chan, void *data) list_move_tail(&task->list_entry, &thread->task_busy_list);
pm_runtime_mark_last_busy(cmdq->mbox.dev); - pm_runtime_put_autosuspend(cmdq->mbox.dev); + __pm_runtime_put_autosuspend(cmdq->mbox.dev);
return 0; } @@ -488,7 +488,7 @@ static void cmdq_mbox_shutdown(struct mbox_chan *chan) spin_unlock_irqrestore(&thread->chan->lock, flags);
pm_runtime_mark_last_busy(cmdq->mbox.dev); - pm_runtime_put_autosuspend(cmdq->mbox.dev); + __pm_runtime_put_autosuspend(cmdq->mbox.dev); }
static int cmdq_mbox_flush(struct mbox_chan *chan, unsigned long timeout) @@ -528,7 +528,7 @@ static int cmdq_mbox_flush(struct mbox_chan *chan, unsigned long timeout) out: spin_unlock_irqrestore(&thread->chan->lock, flags); pm_runtime_mark_last_busy(cmdq->mbox.dev); - pm_runtime_put_autosuspend(cmdq->mbox.dev); + __pm_runtime_put_autosuspend(cmdq->mbox.dev);
return 0;
@@ -543,7 +543,7 @@ static int cmdq_mbox_flush(struct mbox_chan *chan, unsigned long timeout) return -EFAULT; } pm_runtime_mark_last_busy(cmdq->mbox.dev); - pm_runtime_put_autosuspend(cmdq->mbox.dev); + __pm_runtime_put_autosuspend(cmdq->mbox.dev); return 0; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sakari Ailus sakari.ailus@linux.intel.com
[ Upstream commit 472f8a3fccbb579cb98c1821da4cb9cbd51ee3e4 ]
__pm_runtime_put_autosuspend() was meant to be used by callers that needed to put the Runtime PM usage_count without marking the device's last busy timestamp. It was however seen that the Runtime PM autosuspend related functions should include that call. Thus switch the driver to use pm_runtime_put_autosuspend().
Signed-off-by: Sakari Ailus sakari.ailus@linux.intel.com Signed-off-by: Jassi Brar jassisinghbrar@gmail.com Stable-dep-of: 3f39f5652037 ("mailbox: mtk-cmdq: Remove pm_runtime APIs from cmdq_mbox_send_data()") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mailbox/mtk-cmdq-mailbox.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/mailbox/mtk-cmdq-mailbox.c b/drivers/mailbox/mtk-cmdq-mailbox.c index 4e093b58fb7b5..d24f71819c3d6 100644 --- a/drivers/mailbox/mtk-cmdq-mailbox.c +++ b/drivers/mailbox/mtk-cmdq-mailbox.c @@ -390,7 +390,7 @@ static int cmdq_mbox_send_data(struct mbox_chan *chan, void *data)
task = kzalloc(sizeof(*task), GFP_ATOMIC); if (!task) { - __pm_runtime_put_autosuspend(cmdq->mbox.dev); + pm_runtime_put_autosuspend(cmdq->mbox.dev); return -ENOMEM; }
@@ -440,7 +440,7 @@ static int cmdq_mbox_send_data(struct mbox_chan *chan, void *data) list_move_tail(&task->list_entry, &thread->task_busy_list);
pm_runtime_mark_last_busy(cmdq->mbox.dev); - __pm_runtime_put_autosuspend(cmdq->mbox.dev); + pm_runtime_put_autosuspend(cmdq->mbox.dev);
return 0; } @@ -488,7 +488,7 @@ static void cmdq_mbox_shutdown(struct mbox_chan *chan) spin_unlock_irqrestore(&thread->chan->lock, flags);
pm_runtime_mark_last_busy(cmdq->mbox.dev); - __pm_runtime_put_autosuspend(cmdq->mbox.dev); + pm_runtime_put_autosuspend(cmdq->mbox.dev); }
static int cmdq_mbox_flush(struct mbox_chan *chan, unsigned long timeout) @@ -528,7 +528,7 @@ static int cmdq_mbox_flush(struct mbox_chan *chan, unsigned long timeout) out: spin_unlock_irqrestore(&thread->chan->lock, flags); pm_runtime_mark_last_busy(cmdq->mbox.dev); - __pm_runtime_put_autosuspend(cmdq->mbox.dev); + pm_runtime_put_autosuspend(cmdq->mbox.dev);
return 0;
@@ -543,7 +543,7 @@ static int cmdq_mbox_flush(struct mbox_chan *chan, unsigned long timeout) return -EFAULT; } pm_runtime_mark_last_busy(cmdq->mbox.dev); - __pm_runtime_put_autosuspend(cmdq->mbox.dev); + pm_runtime_put_autosuspend(cmdq->mbox.dev); return 0; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jason-JH Lin jason-jh.lin@mediatek.com
[ Upstream commit 3f39f56520374cf56872644acf9afcc618a4b674 ]
pm_runtime_get_sync() and pm_runtime_put_autosuspend() were previously called in cmdq_mbox_send_data(), which is under a spinlock in msg_submit() (mailbox.c). This caused lockdep warnings such as "sleeping function called from invalid context" when running with lockdebug enabled.
The BUG report: BUG: sleeping function called from invalid context at drivers/base/power/runtime.c:1164 in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 3616, name: kworker/u17:3 preempt_count: 1, expected: 0 RCU nest depth: 0, expected: 0 INFO: lockdep is turned off. irq event stamp: 0 CPU: 1 PID: 3616 Comm: kworker/u17:3 Not tainted 6.1.87-lockdep-14133-g26e933aca785 #1 Hardware name: Google Ciri sku0/unprovisioned board (DT) Workqueue: imgsys_runner imgsys_runner_func Call trace: dump_backtrace+0x100/0x120 show_stack+0x20/0x2c dump_stack_lvl+0x84/0xb4 dump_stack+0x18/0x48 __might_resched+0x354/0x4c0 __might_sleep+0x98/0xe4 __pm_runtime_resume+0x70/0x124 cmdq_mbox_send_data+0xe4/0xb1c msg_submit+0x194/0x2dc mbox_send_message+0x190/0x330 imgsys_cmdq_sendtask+0x1618/0x2224 imgsys_runner_func+0xac/0x11c process_one_work+0x638/0xf84 worker_thread+0x808/0xcd0 kthread+0x24c/0x324 ret_from_fork+0x10/0x20
Additionally, pm_runtime_put_autosuspend() should be invoked from the GCE IRQ handler to ensure the hardware has actually completed its work.
To resolve these issues, remove the pm_runtime calls from cmdq_mbox_send_data() and delegate power management responsibilities to the client driver.
Fixes: 8afe816b0c99 ("mailbox: mtk-cmdq-mailbox: Implement Runtime PM with autosuspend") Signed-off-by: Jason-JH Lin jason-jh.lin@mediatek.com Signed-off-by: Jassi Brar jassisinghbrar@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mailbox/mtk-cmdq-mailbox.c | 12 +----------- 1 file changed, 1 insertion(+), 11 deletions(-)
diff --git a/drivers/mailbox/mtk-cmdq-mailbox.c b/drivers/mailbox/mtk-cmdq-mailbox.c index d24f71819c3d6..38ab35157c85f 100644 --- a/drivers/mailbox/mtk-cmdq-mailbox.c +++ b/drivers/mailbox/mtk-cmdq-mailbox.c @@ -379,20 +379,13 @@ static int cmdq_mbox_send_data(struct mbox_chan *chan, void *data) struct cmdq *cmdq = dev_get_drvdata(chan->mbox->dev); struct cmdq_task *task; unsigned long curr_pa, end_pa; - int ret;
/* Client should not flush new tasks if suspended. */ WARN_ON(cmdq->suspended);
- ret = pm_runtime_get_sync(cmdq->mbox.dev); - if (ret < 0) - return ret; - task = kzalloc(sizeof(*task), GFP_ATOMIC); - if (!task) { - pm_runtime_put_autosuspend(cmdq->mbox.dev); + if (!task) return -ENOMEM; - }
task->cmdq = cmdq; INIT_LIST_HEAD(&task->list_entry); @@ -439,9 +432,6 @@ static int cmdq_mbox_send_data(struct mbox_chan *chan, void *data) } list_move_tail(&task->list_entry, &thread->task_busy_list);
- pm_runtime_mark_last_busy(cmdq->mbox.dev); - pm_runtime_put_autosuspend(cmdq->mbox.dev); - return 0; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Deucher alexander.deucher@amd.com
[ Upstream commit 507296328b36ffd00ec1f4fde5b8acafb7222ec7 ]
Fixes: 102b2f587ac8 ("drm/amd/display: dce_transform: DCE6 Scaling Horizontal Filter Init (v2)") Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/include/asic_reg/dce/dce_6_0_d.h | 7 +++++++ drivers/gpu/drm/amd/include/asic_reg/dce/dce_6_0_sh_mask.h | 2 ++ 2 files changed, 9 insertions(+)
diff --git a/drivers/gpu/drm/amd/include/asic_reg/dce/dce_6_0_d.h b/drivers/gpu/drm/amd/include/asic_reg/dce/dce_6_0_d.h index 9de01ae574c03..067eddd9c62d8 100644 --- a/drivers/gpu/drm/amd/include/asic_reg/dce/dce_6_0_d.h +++ b/drivers/gpu/drm/amd/include/asic_reg/dce/dce_6_0_d.h @@ -4115,6 +4115,7 @@ #define mmSCL0_SCL_COEF_RAM_CONFLICT_STATUS 0x1B55 #define mmSCL0_SCL_COEF_RAM_SELECT 0x1B40 #define mmSCL0_SCL_COEF_RAM_TAP_DATA 0x1B41 +#define mmSCL0_SCL_SCALER_ENABLE 0x1B42 #define mmSCL0_SCL_CONTROL 0x1B44 #define mmSCL0_SCL_DEBUG 0x1B6A #define mmSCL0_SCL_DEBUG2 0x1B69 @@ -4144,6 +4145,7 @@ #define mmSCL1_SCL_COEF_RAM_CONFLICT_STATUS 0x1E55 #define mmSCL1_SCL_COEF_RAM_SELECT 0x1E40 #define mmSCL1_SCL_COEF_RAM_TAP_DATA 0x1E41 +#define mmSCL1_SCL_SCALER_ENABLE 0x1E42 #define mmSCL1_SCL_CONTROL 0x1E44 #define mmSCL1_SCL_DEBUG 0x1E6A #define mmSCL1_SCL_DEBUG2 0x1E69 @@ -4173,6 +4175,7 @@ #define mmSCL2_SCL_COEF_RAM_CONFLICT_STATUS 0x4155 #define mmSCL2_SCL_COEF_RAM_SELECT 0x4140 #define mmSCL2_SCL_COEF_RAM_TAP_DATA 0x4141 +#define mmSCL2_SCL_SCALER_ENABLE 0x4142 #define mmSCL2_SCL_CONTROL 0x4144 #define mmSCL2_SCL_DEBUG 0x416A #define mmSCL2_SCL_DEBUG2 0x4169 @@ -4202,6 +4205,7 @@ #define mmSCL3_SCL_COEF_RAM_CONFLICT_STATUS 0x4455 #define mmSCL3_SCL_COEF_RAM_SELECT 0x4440 #define mmSCL3_SCL_COEF_RAM_TAP_DATA 0x4441 +#define mmSCL3_SCL_SCALER_ENABLE 0x4442 #define mmSCL3_SCL_CONTROL 0x4444 #define mmSCL3_SCL_DEBUG 0x446A #define mmSCL3_SCL_DEBUG2 0x4469 @@ -4231,6 +4235,7 @@ #define mmSCL4_SCL_COEF_RAM_CONFLICT_STATUS 0x4755 #define mmSCL4_SCL_COEF_RAM_SELECT 0x4740 #define mmSCL4_SCL_COEF_RAM_TAP_DATA 0x4741 +#define mmSCL4_SCL_SCALER_ENABLE 0x4742 #define mmSCL4_SCL_CONTROL 0x4744 #define mmSCL4_SCL_DEBUG 0x476A #define mmSCL4_SCL_DEBUG2 0x4769 @@ -4260,6 +4265,7 @@ #define mmSCL5_SCL_COEF_RAM_CONFLICT_STATUS 0x4A55 #define mmSCL5_SCL_COEF_RAM_SELECT 0x4A40 #define mmSCL5_SCL_COEF_RAM_TAP_DATA 0x4A41 +#define mmSCL5_SCL_SCALER_ENABLE 0x4A42 #define mmSCL5_SCL_CONTROL 0x4A44 #define mmSCL5_SCL_DEBUG 0x4A6A #define mmSCL5_SCL_DEBUG2 0x4A69 @@ -4287,6 +4293,7 @@ #define mmSCL_COEF_RAM_CONFLICT_STATUS 0x1B55 #define mmSCL_COEF_RAM_SELECT 0x1B40 #define mmSCL_COEF_RAM_TAP_DATA 0x1B41 +#define mmSCL_SCALER_ENABLE 0x1B42 #define mmSCL_CONTROL 0x1B44 #define mmSCL_DEBUG 0x1B6A #define mmSCL_DEBUG2 0x1B69 diff --git a/drivers/gpu/drm/amd/include/asic_reg/dce/dce_6_0_sh_mask.h b/drivers/gpu/drm/amd/include/asic_reg/dce/dce_6_0_sh_mask.h index bd8085ec54ed5..da5596fbfdcb3 100644 --- a/drivers/gpu/drm/amd/include/asic_reg/dce/dce_6_0_sh_mask.h +++ b/drivers/gpu/drm/amd/include/asic_reg/dce/dce_6_0_sh_mask.h @@ -8648,6 +8648,8 @@ #define REGAMMA_LUT_INDEX__REGAMMA_LUT_INDEX__SHIFT 0x00000000 #define REGAMMA_LUT_WRITE_EN_MASK__REGAMMA_LUT_WRITE_EN_MASK_MASK 0x00000007L #define REGAMMA_LUT_WRITE_EN_MASK__REGAMMA_LUT_WRITE_EN_MASK__SHIFT 0x00000000 +#define SCL_SCALER_ENABLE__SCL_SCALE_EN_MASK 0x00000001L +#define SCL_SCALER_ENABLE__SCL_SCALE_EN__SHIFT 0x00000000 #define SCL_ALU_CONTROL__SCL_ALU_DISABLE_MASK 0x00000001L #define SCL_ALU_CONTROL__SCL_ALU_DISABLE__SHIFT 0x00000000 #define SCL_BYPASS_CONTROL__SCL_BYPASS_MODE_MASK 0x00000003L
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Timur Kristóf timur.kristof@gmail.com
[ Upstream commit d60f9c45d1bff7e20ecd57492ef7a5e33c94a37c ]
Without these, it's impossible to program these registers.
Fixes: 102b2f587ac8 ("drm/amd/display: dce_transform: DCE6 Scaling Horizontal Filter Init (v2)") Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Timur Kristóf timur.kristof@gmail.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/dc/dce/dce_transform.h | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_transform.h b/drivers/gpu/drm/amd/display/dc/dce/dce_transform.h index cbce194ec7b82..ff746fba850bc 100644 --- a/drivers/gpu/drm/amd/display/dc/dce/dce_transform.h +++ b/drivers/gpu/drm/amd/display/dc/dce/dce_transform.h @@ -155,6 +155,8 @@ SRI(SCL_COEF_RAM_TAP_DATA, SCL, id), \ SRI(VIEWPORT_START, SCL, id), \ SRI(VIEWPORT_SIZE, SCL, id), \ + SRI(SCL_HORZ_FILTER_INIT_RGB_LUMA, SCL, id), \ + SRI(SCL_HORZ_FILTER_INIT_CHROMA, SCL, id), \ SRI(SCL_HORZ_FILTER_SCALE_RATIO, SCL, id), \ SRI(SCL_VERT_FILTER_SCALE_RATIO, SCL, id), \ SRI(SCL_VERT_FILTER_INIT, SCL, id), \
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Timur Kristóf timur.kristof@gmail.com
[ Upstream commit c0aa7cf49dd6cb302fe28e7183992b772cb7420c ]
Previously, the code would set a bit field which didn't exist on DCE6 so it would be effectively a no-op.
Fixes: b70aaf5586f2 ("drm/amd/display: dce_transform: add DCE6 specific macros,functions") Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Timur Kristóf timur.kristof@gmail.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/dc/dce/dce_transform.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_transform.c b/drivers/gpu/drm/amd/display/dc/dce/dce_transform.c index 2b1673d69ea83..e5c2fb134d14d 100644 --- a/drivers/gpu/drm/amd/display/dc/dce/dce_transform.c +++ b/drivers/gpu/drm/amd/display/dc/dce/dce_transform.c @@ -527,8 +527,7 @@ static void dce60_transform_set_scaler( if (coeffs_v != xfm_dce->filter_v || coeffs_h != xfm_dce->filter_h) { /* 4. Program vertical filters */ if (xfm_dce->filter_v == NULL) - REG_SET(SCL_VERT_FILTER_CONTROL, 0, - SCL_V_2TAP_HARDCODE_COEF_EN, 0); + REG_WRITE(SCL_VERT_FILTER_CONTROL, 0); program_multi_taps_filter( xfm_dce, data->taps.v_taps, @@ -542,8 +541,7 @@ static void dce60_transform_set_scaler(
/* 5. Program horizontal filters */ if (xfm_dce->filter_h == NULL) - REG_SET(SCL_HORZ_FILTER_CONTROL, 0, - SCL_H_2TAP_HARDCODE_COEF_EN, 0); + REG_WRITE(SCL_HORZ_FILTER_CONTROL, 0); program_multi_taps_filter( xfm_dce, data->taps.h_taps,
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Timur Kristóf timur.kristof@gmail.com
[ Upstream commit a7dc87f3448bea5ebe054f14e861074b9c289c65 ]
SCL_SCALER_ENABLE can be used to enable/disable the scaler on DCE6. Program it to 0 when scaling isn't used, 1 when used. Additionally, clear some other registers when scaling is disabled and program the SCL_UPDATE register as recommended.
This fixes visible glitches for users whose BIOS sets up a mode with scaling at boot, which DC was unable to clean up.
Fixes: b70aaf5586f2 ("drm/amd/display: dce_transform: add DCE6 specific macros,functions") Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Timur Kristóf timur.kristof@gmail.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../gpu/drm/amd/display/dc/dce/dce_transform.c | 15 +++++++++++---- .../gpu/drm/amd/display/dc/dce/dce_transform.h | 2 ++ 2 files changed, 13 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_transform.c b/drivers/gpu/drm/amd/display/dc/dce/dce_transform.c index e5c2fb134d14d..1ab5ae9b5ea51 100644 --- a/drivers/gpu/drm/amd/display/dc/dce/dce_transform.c +++ b/drivers/gpu/drm/amd/display/dc/dce/dce_transform.c @@ -154,10 +154,13 @@ static bool dce60_setup_scaling_configuration( REG_SET(SCL_BYPASS_CONTROL, 0, SCL_BYPASS_MODE, 0);
if (data->taps.h_taps + data->taps.v_taps <= 2) { - /* Set bypass */ - - /* DCE6 has no SCL_MODE register, skip scale mode programming */ + /* Disable scaler functionality */ + REG_WRITE(SCL_SCALER_ENABLE, 0);
+ /* Clear registers that can cause glitches even when the scaler is off */ + REG_WRITE(SCL_TAP_CONTROL, 0); + REG_WRITE(SCL_AUTOMATIC_MODE_CONTROL, 0); + REG_WRITE(SCL_F_SHARP_CONTROL, 0); return false; }
@@ -165,7 +168,7 @@ static bool dce60_setup_scaling_configuration( SCL_H_NUM_OF_TAPS, data->taps.h_taps - 1, SCL_V_NUM_OF_TAPS, data->taps.v_taps - 1);
- /* DCE6 has no SCL_MODE register, skip scale mode programming */ + REG_WRITE(SCL_SCALER_ENABLE, 1);
/* DCE6 has no SCL_BOUNDARY_MODE bit, skip replace out of bound pixels */
@@ -502,6 +505,8 @@ static void dce60_transform_set_scaler( REG_SET(DC_LB_MEM_SIZE, 0, DC_LB_MEM_SIZE, xfm_dce->lb_memory_size);
+ REG_WRITE(SCL_UPDATE, 0x00010000); + /* Clear SCL_F_SHARP_CONTROL value to 0 */ REG_WRITE(SCL_F_SHARP_CONTROL, 0);
@@ -564,6 +569,8 @@ static void dce60_transform_set_scaler( /* DCE6 has no SCL_COEF_UPDATE_COMPLETE bit to flip to new coefficient memory */
/* DCE6 DATA_FORMAT register does not support ALPHA_EN */ + + REG_WRITE(SCL_UPDATE, 0); } #endif
diff --git a/drivers/gpu/drm/amd/display/dc/dce/dce_transform.h b/drivers/gpu/drm/amd/display/dc/dce/dce_transform.h index ff746fba850bc..eb716e8337e23 100644 --- a/drivers/gpu/drm/amd/display/dc/dce/dce_transform.h +++ b/drivers/gpu/drm/amd/display/dc/dce/dce_transform.h @@ -155,6 +155,7 @@ SRI(SCL_COEF_RAM_TAP_DATA, SCL, id), \ SRI(VIEWPORT_START, SCL, id), \ SRI(VIEWPORT_SIZE, SCL, id), \ + SRI(SCL_SCALER_ENABLE, SCL, id), \ SRI(SCL_HORZ_FILTER_INIT_RGB_LUMA, SCL, id), \ SRI(SCL_HORZ_FILTER_INIT_CHROMA, SCL, id), \ SRI(SCL_HORZ_FILTER_SCALE_RATIO, SCL, id), \ @@ -592,6 +593,7 @@ struct dce_transform_registers { uint32_t SCL_VERT_FILTER_SCALE_RATIO; uint32_t SCL_HORZ_FILTER_INIT; #if defined(CONFIG_DRM_AMD_DC_SI) + uint32_t SCL_SCALER_ENABLE; uint32_t SCL_HORZ_FILTER_INIT_RGB_LUMA; uint32_t SCL_HORZ_FILTER_INIT_CHROMA; #endif
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Fernando Fernandez Mancera fmancera@suse.de
[ Upstream commit f359b809d54c6e3dd1d039b97e0b68390b0e53e4 ]
Referencing a synproxy stateful object from OUTPUT hook causes kernel crash due to infinite recursive calls:
BUG: TASK stack guard page was hit at 000000008bda5b8c (stack is 000000003ab1c4a5..00000000494d8b12) [...] Call Trace: __find_rr_leaf+0x99/0x230 fib6_table_lookup+0x13b/0x2d0 ip6_pol_route+0xa4/0x400 fib6_rule_lookup+0x156/0x240 ip6_route_output_flags+0xc6/0x150 __nf_ip6_route+0x23/0x50 synproxy_send_tcp_ipv6+0x106/0x200 synproxy_send_client_synack_ipv6+0x1aa/0x1f0 nft_synproxy_do_eval+0x263/0x310 nft_do_chain+0x5a8/0x5f0 [nf_tables nft_do_chain_inet+0x98/0x110 nf_hook_slow+0x43/0xc0 __ip6_local_out+0xf0/0x170 ip6_local_out+0x17/0x70 synproxy_send_tcp_ipv6+0x1a2/0x200 synproxy_send_client_synack_ipv6+0x1aa/0x1f0 [...]
Implement objref and objrefmap expression validate functions.
Currently, only NFT_OBJECT_SYNPROXY object type requires validation. This will also handle a jump to a chain using a synproxy object from the OUTPUT hook.
Now when trying to reference a synproxy object in the OUTPUT hook, nft will produce the following error:
synproxy_crash.nft: Error: Could not process rule: Operation not supported synproxy name mysynproxy ^^^^^^^^^^^^^^^^^^^^^^^^
Fixes: ee394f96ad75 ("netfilter: nft_synproxy: add synproxy stateful object support") Reported-by: Georg Pfuetzenreuter georg.pfuetzenreuter@suse.com Closes: https://bugzilla.suse.com/1250237 Signed-off-by: Fernando Fernandez Mancera fmancera@suse.de Reviewed-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nft_objref.c | 39 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 39 insertions(+)
diff --git a/net/netfilter/nft_objref.c b/net/netfilter/nft_objref.c index 8ee66a86c3bc7..1a62e384766a7 100644 --- a/net/netfilter/nft_objref.c +++ b/net/netfilter/nft_objref.c @@ -22,6 +22,35 @@ void nft_objref_eval(const struct nft_expr *expr, obj->ops->eval(obj, regs, pkt); }
+static int nft_objref_validate_obj_type(const struct nft_ctx *ctx, u32 type) +{ + unsigned int hooks; + + switch (type) { + case NFT_OBJECT_SYNPROXY: + if (ctx->family != NFPROTO_IPV4 && + ctx->family != NFPROTO_IPV6 && + ctx->family != NFPROTO_INET) + return -EOPNOTSUPP; + + hooks = (1 << NF_INET_LOCAL_IN) | (1 << NF_INET_FORWARD); + + return nft_chain_validate_hooks(ctx->chain, hooks); + default: + break; + } + + return 0; +} + +static int nft_objref_validate(const struct nft_ctx *ctx, + const struct nft_expr *expr) +{ + struct nft_object *obj = nft_objref_priv(expr); + + return nft_objref_validate_obj_type(ctx, obj->ops->type->type); +} + static int nft_objref_init(const struct nft_ctx *ctx, const struct nft_expr *expr, const struct nlattr * const tb[]) @@ -93,6 +122,7 @@ static const struct nft_expr_ops nft_objref_ops = { .activate = nft_objref_activate, .deactivate = nft_objref_deactivate, .dump = nft_objref_dump, + .validate = nft_objref_validate, .reduce = NFT_REDUCE_READONLY, };
@@ -197,6 +227,14 @@ static void nft_objref_map_destroy(const struct nft_ctx *ctx, nf_tables_destroy_set(ctx, priv->set); }
+static int nft_objref_map_validate(const struct nft_ctx *ctx, + const struct nft_expr *expr) +{ + const struct nft_objref_map *priv = nft_expr_priv(expr); + + return nft_objref_validate_obj_type(ctx, priv->set->objtype); +} + static const struct nft_expr_ops nft_objref_map_ops = { .type = &nft_objref_type, .size = NFT_EXPR_SIZE(sizeof(struct nft_objref_map)), @@ -206,6 +244,7 @@ static const struct nft_expr_ops nft_objref_map_ops = { .deactivate = nft_objref_map_deactivate, .destroy = nft_objref_map_destroy, .dump = nft_objref_map_dump, + .validate = nft_objref_map_validate, .reduce = NFT_REDUCE_READONLY, };
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Woudstra ericwouds@gmail.com
[ Upstream commit bbf0c98b3ad9edaea1f982de6c199cc11d3b7705 ]
net/bridge/br_private.h:1627 suspicious rcu_dereference_protected() usage! other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1 7 locks held by socat/410: #0: ffff88800d7a9c90 (sk_lock-AF_INET){+.+.}-{0:0}, at: inet_stream_connect+0x43/0xa0 #1: ffffffff9a779900 (rcu_read_lock){....}-{1:3}, at: __ip_queue_xmit+0x62/0x1830 [..] #6: ffffffff9a779900 (rcu_read_lock){....}-{1:3}, at: nf_hook.constprop.0+0x8a/0x440
Call Trace: lockdep_rcu_suspicious.cold+0x4f/0xb1 br_vlan_fill_forward_path_pvid+0x32c/0x410 [bridge] br_fill_forward_path+0x7a/0x4d0 [bridge]
Use to correct helper, non _rcu variant requires RTNL mutex.
Fixes: bcf2766b1377 ("net: bridge: resolve forwarding path for VLAN tag actions in bridge devices") Signed-off-by: Eric Woudstra ericwouds@gmail.com Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/bridge/br_vlan.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c index f2efb58d152bc..13d6c3f51c29d 100644 --- a/net/bridge/br_vlan.c +++ b/net/bridge/br_vlan.c @@ -1455,7 +1455,7 @@ void br_vlan_fill_forward_path_pvid(struct net_bridge *br, if (!br_opt_get(br, BROPT_VLAN_ENABLED)) return;
- vg = br_vlan_group(br); + vg = br_vlan_group_rcu(br);
if (idx >= 0 && ctx->vlan[idx].proto == br->vlan_proto) {
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Westphal fw@strlen.de
[ Upstream commit e84945bdc619ed4243ba4298dbb8ca2062026474 ]
Jakub reported this self test flaking occasionally (fails, but passes on re-run) on debug kernels.
This is because the test checks for elapsed time to determine if both connections were established in parallel.
Rework this to no longer depend on timing. Use busywait helper to check that both sockets have moved to established state and then query the conntrack engine for the two entries.
Reported-by: Jakub Kicinski kuba@kernel.org Closes: https://lore.kernel.org/netfilter-devel/20250926163318.40d1a502@kernel.org/ Fixes: 117e149e26d1 ("selftests: netfilter: test nat source port clash resolution interaction with tcp early demux") Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Sasha Levin sashal@kernel.org --- .../selftests/net/netfilter/nf_nat_edemux.sh | 58 +++++++++++++------ 1 file changed, 41 insertions(+), 17 deletions(-)
diff --git a/tools/testing/selftests/net/netfilter/nf_nat_edemux.sh b/tools/testing/selftests/net/netfilter/nf_nat_edemux.sh index 1014551dd7694..6731fe1eaf2e9 100755 --- a/tools/testing/selftests/net/netfilter/nf_nat_edemux.sh +++ b/tools/testing/selftests/net/netfilter/nf_nat_edemux.sh @@ -17,9 +17,31 @@ cleanup()
checktool "socat -h" "run test without socat" checktool "iptables --version" "run test without iptables" +checktool "conntrack --version" "run test without conntrack"
trap cleanup EXIT
+connect_done() +{ + local ns="$1" + local port="$2" + + ip netns exec "$ns" ss -nt -o state established "dport = :$port" | grep -q "$port" +} + +check_ctstate() +{ + local ns="$1" + local dp="$2" + + if ! ip netns exec "$ns" conntrack --get -s 192.168.1.2 -d 192.168.1.1 -p tcp \ + --sport 10000 --dport "$dp" --state ESTABLISHED > /dev/null 2>&1;then + echo "FAIL: Did not find expected state for dport $2" + ip netns exec "$ns" bash -c 'conntrack -L; conntrack -S; ss -nt' + ret=1 + fi +} + setup_ns ns1 ns2
# Connect the namespaces using a veth pair @@ -44,15 +66,18 @@ socatpid=$! ip netns exec "$ns2" sysctl -q net.ipv4.ip_local_port_range="10000 10000"
# add a virtual IP using DNAT -ip netns exec "$ns2" iptables -t nat -A OUTPUT -d 10.96.0.1/32 -p tcp --dport 443 -j DNAT --to-destination 192.168.1.1:5201 +ip netns exec "$ns2" iptables -t nat -A OUTPUT -d 10.96.0.1/32 -p tcp --dport 443 -j DNAT --to-destination 192.168.1.1:5201 || exit 1
# ... and route it to the other namespace ip netns exec "$ns2" ip route add 10.96.0.1 via 192.168.1.1
-# add a persistent connection from the other namespace -ip netns exec "$ns2" socat -t 10 - TCP:192.168.1.1:5201 > /dev/null & +# listener should be up by now, wait if it isn't yet. +wait_local_port_listen "$ns1" 5201 tcp
-sleep 1 +# add a persistent connection from the other namespace +sleep 10 | ip netns exec "$ns2" socat -t 10 - TCP:192.168.1.1:5201 > /dev/null & +cpid0=$! +busywait "$BUSYWAIT_TIMEOUT" connect_done "$ns2" "5201"
# ip daddr:dport will be rewritten to 192.168.1.1 5201 # NAT must reallocate source port 10000 because @@ -71,26 +96,25 @@ fi ip netns exec "$ns1" iptables -t nat -A PREROUTING -p tcp --dport 5202 -j REDIRECT --to-ports 5201 ip netns exec "$ns1" iptables -t nat -A PREROUTING -p tcp --dport 5203 -j REDIRECT --to-ports 5201
-sleep 5 | ip netns exec "$ns2" socat -t 5 -u STDIN TCP:192.168.1.1:5202,connect-timeout=5 >/dev/null & +sleep 5 | ip netns exec "$ns2" socat -T 5 -u STDIN TCP:192.168.1.1:5202,connect-timeout=5 >/dev/null & +cpid1=$!
-# if connect succeeds, client closes instantly due to EOF on stdin. -# if connect hangs, it will time out after 5s. -echo | ip netns exec "$ns2" socat -t 3 -u STDIN TCP:192.168.1.1:5203,connect-timeout=5 >/dev/null & +sleep 5 | ip netns exec "$ns2" socat -T 5 -u STDIN TCP:192.168.1.1:5203,connect-timeout=5 >/dev/null & cpid2=$!
-time_then=$(date +%s) -wait $cpid2 -rv=$? -time_now=$(date +%s) +busywait "$BUSYWAIT_TIMEOUT" connect_done "$ns2" 5202 +busywait "$BUSYWAIT_TIMEOUT" connect_done "$ns2" 5203
-# Check how much time has elapsed, expectation is for -# 'cpid2' to connect and then exit (and no connect delay). -delta=$((time_now - time_then)) +check_ctstate "$ns1" 5202 +check_ctstate "$ns1" 5203
-if [ $delta -lt 2 ] && [ $rv -eq 0 ]; then +kill $socatpid $cpid0 $cpid1 $cpid2 +socatpid=0 + +if [ $ret -eq 0 ]; then echo "PASS: could connect to service via redirected ports" else - echo "FAIL: socat cannot connect to service via redirect ($delta seconds elapsed, returned $rv)" + echo "FAIL: socat cannot connect to service via redirect" ret=1 fi
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Herbert Xu herbert@gondor.apana.org.au
[ Upstream commit 6bb73db6948c2de23e407fe1b7ef94bf02b7529f ]
Move the ssize check to the start in essiv_aead_crypt so that it's also checked for decryption and in-place encryption.
Reported-by: Muhammad Alifa Ramdhan ramdhan@starlabs.sg Fixes: be1eb7f78aa8 ("crypto: essiv - create wrapper template for ESSIV generation") Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Sasha Levin sashal@kernel.org --- crypto/essiv.c | 14 ++++++-------- 1 file changed, 6 insertions(+), 8 deletions(-)
diff --git a/crypto/essiv.c b/crypto/essiv.c index e63fc6442e320..aa7ebc1235274 100644 --- a/crypto/essiv.c +++ b/crypto/essiv.c @@ -186,9 +186,14 @@ static int essiv_aead_crypt(struct aead_request *req, bool enc) const struct essiv_tfm_ctx *tctx = crypto_aead_ctx(tfm); struct essiv_aead_request_ctx *rctx = aead_request_ctx(req); struct aead_request *subreq = &rctx->aead_req; + int ivsize = crypto_aead_ivsize(tfm); + int ssize = req->assoclen - ivsize; struct scatterlist *src = req->src; int err;
+ if (ssize < 0) + return -EINVAL; + crypto_cipher_encrypt_one(tctx->essiv_cipher, req->iv, req->iv);
/* @@ -198,19 +203,12 @@ static int essiv_aead_crypt(struct aead_request *req, bool enc) */ rctx->assoc = NULL; if (req->src == req->dst || !enc) { - scatterwalk_map_and_copy(req->iv, req->dst, - req->assoclen - crypto_aead_ivsize(tfm), - crypto_aead_ivsize(tfm), 1); + scatterwalk_map_and_copy(req->iv, req->dst, ssize, ivsize, 1); } else { u8 *iv = (u8 *)aead_request_ctx(req) + tctx->ivoffset; - int ivsize = crypto_aead_ivsize(tfm); - int ssize = req->assoclen - ivsize; struct scatterlist *sg; int nents;
- if (ssize < 0) - return -EINVAL; - nents = sg_nents_for_len(req->src, ssize); if (nents < 0) return -EINVAL;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Fushuai Wang wangfushuai@baidu.com
[ Upstream commit 0cc380d0e1d36b8f2703379890e90f896f68e9e8 ]
The return value of copy_to_iter() function will never be negative, it is the number of bytes copied, or zero if nothing was copied. Update the check to treat 0 as an error, and return -1 in that case.
Fixes: d08089f649a0 ("cifs: Change the I/O paths to use an iterator rather than a page list") Acked-by: Tom Talpey tom@talpey.com Reviewed-by: David Howells dhowells@redhat.com Signed-off-by: Fushuai Wang wangfushuai@baidu.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/smb/client/smb2ops.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/fs/smb/client/smb2ops.c b/fs/smb/client/smb2ops.c index c946c3a09245c..1b30035d02bc5 100644 --- a/fs/smb/client/smb2ops.c +++ b/fs/smb/client/smb2ops.c @@ -4657,7 +4657,7 @@ handle_read_data(struct TCP_Server_Info *server, struct mid_q_entry *mid, unsigned int pad_len; struct cifs_io_subrequest *rdata = mid->callback_data; struct smb2_hdr *shdr = (struct smb2_hdr *)buf; - int length; + size_t copied; bool use_rdma_mr = false;
if (shdr->Command != SMB2_READ) { @@ -4770,10 +4770,10 @@ handle_read_data(struct TCP_Server_Info *server, struct mid_q_entry *mid, } else if (buf_len >= data_offset + data_len) { /* read response payload is in buf */ WARN_ONCE(buffer, "read data can be either in buf or in buffer"); - length = copy_to_iter(buf + data_offset, data_len, &rdata->subreq.io_iter); - if (length < 0) - return length; - rdata->got_bytes = data_len; + copied = copy_to_iter(buf + data_offset, data_len, &rdata->subreq.io_iter); + if (copied == 0) + return -EIO; + rdata->got_bytes = copied; } else { /* read response payload cannot be in both buf and pages */ WARN_ONCE(1, "buf can not contain only a part of read data");
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paulo Alcantara pc@manguebit.org
[ Upstream commit b95cd1bdf5aa9221c98fc9259014b8bb8d1829d7 ]
Don't reuse open handle when changing timestamps to prevent the server from disabling automatic timestamp updates as per MS-FSA 2.1.4.17.
---8<--- import os import time
filename = '/mnt/foo'
def print_stat(prefix): st = os.stat(filename) print(prefix, ': ', time.ctime(st.st_atime), time.ctime(st.st_ctime))
fd = os.open(filename, os.O_CREAT|os.O_TRUNC|os.O_WRONLY, 0o644) print_stat('old') os.utime(fd, None) time.sleep(2) os.write(fd, b'foo') os.close(fd) time.sleep(2) print_stat('new') ---8<---
Before patch:
$ mount.cifs //srv/share /mnt -o ... $ python3 run.py old : Fri Oct 3 14:01:21 2025 Fri Oct 3 14:01:21 2025 new : Fri Oct 3 14:01:21 2025 Fri Oct 3 14:01:21 2025
After patch:
$ mount.cifs //srv/share /mnt -o ... $ python3 run.py old : Fri Oct 3 17:03:34 2025 Fri Oct 3 17:03:34 2025 new : Fri Oct 3 17:03:36 2025 Fri Oct 3 17:03:36 2025
Fixes: b6f2a0f89d7e ("cifs: for compound requests, use open handle if possible") Signed-off-by: Paulo Alcantara (Red Hat) pc@manguebit.org Cc: Frank Sorenson sorenson@redhat.com Reviewed-by: David Howells dhowells@redhat.com Cc: linux-cifs@vger.kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/smb/client/smb2inode.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/fs/smb/client/smb2inode.c b/fs/smb/client/smb2inode.c index 104a563dc317f..cb049bc70e0cb 100644 --- a/fs/smb/client/smb2inode.c +++ b/fs/smb/client/smb2inode.c @@ -1220,31 +1220,33 @@ int smb2_set_file_info(struct inode *inode, const char *full_path, FILE_BASIC_INFO *buf, const unsigned int xid) { - struct cifs_open_parms oparms; + struct kvec in_iov = { .iov_base = buf, .iov_len = sizeof(*buf), }; struct cifs_sb_info *cifs_sb = CIFS_SB(inode->i_sb); + struct cifsFileInfo *cfile = NULL; + struct cifs_open_parms oparms; struct tcon_link *tlink; struct cifs_tcon *tcon; - struct cifsFileInfo *cfile; - struct kvec in_iov = { .iov_base = buf, .iov_len = sizeof(*buf), }; - int rc; - - if ((buf->CreationTime == 0) && (buf->LastAccessTime == 0) && - (buf->LastWriteTime == 0) && (buf->ChangeTime == 0) && - (buf->Attributes == 0)) - return 0; /* would be a no op, no sense sending this */ + int rc = 0;
tlink = cifs_sb_tlink(cifs_sb); if (IS_ERR(tlink)) return PTR_ERR(tlink); tcon = tlink_tcon(tlink);
- cifs_get_writable_path(tcon, full_path, FIND_WR_ANY, &cfile); + if ((buf->CreationTime == 0) && (buf->LastAccessTime == 0) && + (buf->LastWriteTime == 0) && (buf->ChangeTime == 0)) { + if (buf->Attributes == 0) + goto out; /* would be a no op, no sense sending this */ + cifs_get_writable_path(tcon, full_path, FIND_WR_ANY, &cfile); + } + oparms = CIFS_OPARMS(cifs_sb, tcon, full_path, FILE_WRITE_ATTRIBUTES, FILE_OPEN, 0, ACL_NO_MODE); rc = smb2_compound_op(xid, tcon, cifs_sb, full_path, &oparms, &in_iov, &(int){SMB2_OP_SET_INFO}, 1, cfile, NULL, NULL, NULL); +out: cifs_put_tlink(tlink); return rc; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pali Rohár pali@kernel.org
[ Upstream commit 057ac50638bcece64b3b436d3a61b70ed6c01a34 ]
EA $LXMOD is required for WSL non-symlink reparse points.
Fixes: ef86ab131d91 ("cifs: Fix querying of WSL CHR and BLK reparse points over SMB1") Signed-off-by: Pali Rohár pali@kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/smb/client/smb1ops.c | 62 +++++++++++++++++++++++++++++++++++++++-- 1 file changed, 60 insertions(+), 2 deletions(-)
diff --git a/fs/smb/client/smb1ops.c b/fs/smb/client/smb1ops.c index 0385a514f59e9..4670c3e451a75 100644 --- a/fs/smb/client/smb1ops.c +++ b/fs/smb/client/smb1ops.c @@ -671,14 +671,72 @@ static int cifs_query_path_info(const unsigned int xid, }
#ifdef CONFIG_CIFS_XATTR + /* + * For non-symlink WSL reparse points it is required to fetch + * EA $LXMOD which contains in its S_DT part the mandatory file type. + */ + if (!rc && data->reparse_point) { + struct smb2_file_full_ea_info *ea; + u32 next = 0; + + ea = (struct smb2_file_full_ea_info *)data->wsl.eas; + do { + ea = (void *)((u8 *)ea + next); + next = le32_to_cpu(ea->next_entry_offset); + } while (next); + if (le16_to_cpu(ea->ea_value_length)) { + ea->next_entry_offset = cpu_to_le32(ALIGN(sizeof(*ea) + + ea->ea_name_length + 1 + + le16_to_cpu(ea->ea_value_length), 4)); + ea = (void *)((u8 *)ea + le32_to_cpu(ea->next_entry_offset)); + } + + rc = CIFSSMBQAllEAs(xid, tcon, full_path, SMB2_WSL_XATTR_MODE, + &ea->ea_data[SMB2_WSL_XATTR_NAME_LEN + 1], + SMB2_WSL_XATTR_MODE_SIZE, cifs_sb); + if (rc == SMB2_WSL_XATTR_MODE_SIZE) { + ea->next_entry_offset = cpu_to_le32(0); + ea->flags = 0; + ea->ea_name_length = SMB2_WSL_XATTR_NAME_LEN; + ea->ea_value_length = cpu_to_le16(SMB2_WSL_XATTR_MODE_SIZE); + memcpy(&ea->ea_data[0], SMB2_WSL_XATTR_MODE, SMB2_WSL_XATTR_NAME_LEN + 1); + data->wsl.eas_len += ALIGN(sizeof(*ea) + SMB2_WSL_XATTR_NAME_LEN + 1 + + SMB2_WSL_XATTR_MODE_SIZE, 4); + rc = 0; + } else if (rc >= 0) { + /* It is an error if EA $LXMOD has wrong size. */ + rc = -EINVAL; + } else { + /* + * In all other cases ignore error if fetching + * of EA $LXMOD failed. It is needed only for + * non-symlink WSL reparse points and wsl_to_fattr() + * handle the case when EA is missing. + */ + rc = 0; + } + } + /* * For WSL CHR and BLK reparse points it is required to fetch * EA $LXDEV which contains major and minor device numbers. */ if (!rc && data->reparse_point) { struct smb2_file_full_ea_info *ea; + u32 next = 0;
ea = (struct smb2_file_full_ea_info *)data->wsl.eas; + do { + ea = (void *)((u8 *)ea + next); + next = le32_to_cpu(ea->next_entry_offset); + } while (next); + if (le16_to_cpu(ea->ea_value_length)) { + ea->next_entry_offset = cpu_to_le32(ALIGN(sizeof(*ea) + + ea->ea_name_length + 1 + + le16_to_cpu(ea->ea_value_length), 4)); + ea = (void *)((u8 *)ea + le32_to_cpu(ea->next_entry_offset)); + } + rc = CIFSSMBQAllEAs(xid, tcon, full_path, SMB2_WSL_XATTR_DEV, &ea->ea_data[SMB2_WSL_XATTR_NAME_LEN + 1], SMB2_WSL_XATTR_DEV_SIZE, cifs_sb); @@ -688,8 +746,8 @@ static int cifs_query_path_info(const unsigned int xid, ea->ea_name_length = SMB2_WSL_XATTR_NAME_LEN; ea->ea_value_length = cpu_to_le16(SMB2_WSL_XATTR_DEV_SIZE); memcpy(&ea->ea_data[0], SMB2_WSL_XATTR_DEV, SMB2_WSL_XATTR_NAME_LEN + 1); - data->wsl.eas_len = sizeof(*ea) + SMB2_WSL_XATTR_NAME_LEN + 1 + - SMB2_WSL_XATTR_DEV_SIZE; + data->wsl.eas_len += ALIGN(sizeof(*ea) + SMB2_WSL_XATTR_NAME_LEN + 1 + + SMB2_WSL_XATTR_MODE_SIZE, 4); rc = 0; } else if (rc >= 0) { /* It is an error if EA $LXDEV has wrong size. */
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Gunnar Kudrjavets gunnarku@amazon.com
[ Upstream commit 8a81236f2cb0882c7ea6c621ce357f7f3f601fe5 ]
The tpm_tis_write8() call specifies arguments in wrong order. Should be (data, addr, value) not (data, value, addr). The initial correct order was changed during the major refactoring when the code was split.
Fixes: 41a5e1cf1fe1 ("tpm/tpm_tis: Split tpm_tis driver into a core and TCG TIS compliant phy") Signed-off-by: Gunnar Kudrjavets gunnarku@amazon.com Reviewed-by: Justinien Bouron jbouron@amazon.com Reviewed-by: Jarkko Sakkinen jarkko@kernel.org Reviewed-by: Paul Menzel pmenzel@molgen.mpg.de Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/char/tpm/tpm_tis_core.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c index ed0d3d8449b30..59e992dc65c4c 100644 --- a/drivers/char/tpm/tpm_tis_core.c +++ b/drivers/char/tpm/tpm_tis_core.c @@ -977,8 +977,8 @@ static int tpm_tis_probe_irq_single(struct tpm_chip *chip, u32 intmask, * will call disable_irq which undoes all of the above. */ if (!(chip->flags & TPM_CHIP_FLAG_IRQ)) { - tpm_tis_write8(priv, original_int_vec, - TPM_INT_VECTOR(priv->locality)); + tpm_tis_write8(priv, TPM_INT_VECTOR(priv->locality), + original_int_vec); rc = -1; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Bartosz Golaszewski bartosz.golaszewski@linaro.org
[ Upstream commit b5f8aa8d4bde0cf3e4595af5a536da337e5f1c78 ]
The slimbus regmap passed to the GPIO driver down from MFD does not use fast_io. This means a mutex is used for locking and thus this GPIO chip must not be used in atomic context. Change the can_sleep switch in struct gpio_chip to true.
Fixes: 59c324683400 ("gpio: wcd934x: Add support to wcd934x gpio controller") Signed-off-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpio/gpio-wcd934x.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpio/gpio-wcd934x.c b/drivers/gpio/gpio-wcd934x.c index cfa7b0a50c8e3..03b16b8f639ad 100644 --- a/drivers/gpio/gpio-wcd934x.c +++ b/drivers/gpio/gpio-wcd934x.c @@ -102,7 +102,7 @@ static int wcd_gpio_probe(struct platform_device *pdev) chip->base = -1; chip->ngpio = WCD934X_NPINS; chip->label = dev_name(dev); - chip->can_sleep = false; + chip->can_sleep = true;
return devm_gpiochip_add_data(dev, chip, data); }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: KaFai Wan kafai.wan@linux.dev
[ Upstream commit 4f375ade6aa9f37fd72d7a78682f639772089eed ]
When unpinning a BPF hash table (htab or htab_lru) that contains internal structures (timer, workqueue, or task_work) in its values, a BUG warning is triggered: BUG: sleeping function called from invalid context at kernel/bpf/hashtab.c:244 in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 14, name: ksoftirqd/0 ...
The issue arises from the interaction between BPF object unpinning and RCU callback mechanisms: 1. BPF object unpinning uses ->free_inode() which schedules cleanup via call_rcu(), deferring the actual freeing to an RCU callback that executes within the RCU_SOFTIRQ context. 2. During cleanup of hash tables containing internal structures, htab_map_free_internal_structs() is invoked, which includes cond_resched() or cond_resched_rcu() calls to yield the CPU during potentially long operations.
However, cond_resched() or cond_resched_rcu() cannot be safely called from atomic RCU softirq context, leading to the BUG warning when attempting to reschedule.
Fix this by changing from ->free_inode() to ->destroy_inode() and rename bpf_free_inode() to bpf_destroy_inode() for BPF objects (prog, map, link). This allows direct inode freeing without RCU callback scheduling, avoiding the invalid context warning.
Reported-by: Le Chen tom2cat@sjtu.edu.cn Closes: https://lore.kernel.org/all/1444123482.1827743.1750996347470.JavaMail.zimbra... Fixes: 68134668c17f ("bpf: Add map side support for bpf timers.") Suggested-by: Alexei Starovoitov ast@kernel.org Signed-off-by: KaFai Wan kafai.wan@linux.dev Acked-by: Yonghong Song yonghong.song@linux.dev Link: https://lore.kernel.org/r/20251008102628.808045-2-kafai.wan@linux.dev Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/inode.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/bpf/inode.c b/kernel/bpf/inode.c index 9aaf5124648bd..746b5644d9a19 100644 --- a/kernel/bpf/inode.c +++ b/kernel/bpf/inode.c @@ -775,7 +775,7 @@ static int bpf_show_options(struct seq_file *m, struct dentry *root) return 0; }
-static void bpf_free_inode(struct inode *inode) +static void bpf_destroy_inode(struct inode *inode) { enum bpf_type type;
@@ -790,7 +790,7 @@ const struct super_operations bpf_super_ops = { .statfs = simple_statfs, .drop_inode = generic_delete_inode, .show_options = bpf_show_options, - .free_inode = bpf_free_inode, + .destroy_inode = bpf_destroy_inode, };
enum {
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexey Gladkov legion@kernel.org
[ Upstream commit 8d18ef04f940a8d336fe7915b5ea419c3eb0c0a6 ]
In the upcoming changes, the ELF_DETAILS macro will be extended with the ".modinfo" section, which will cause an error:
s390x-linux-ld: .tmp_vmlinux1: warning: allocated section `.modinfo' not in segment s390x-linux-ld: .tmp_vmlinux2: warning: allocated section `.modinfo' not in segment s390x-linux-ld: vmlinux.unstripped: warning: allocated section `.modinfo' not in segment
This happens because the .vmlinux.info use :NONE to override the default segment and tell the linker to not put the section in any segment at all.
To avoid this, we need to change the sections order that will be placed in the default segment.
Cc: Heiko Carstens hca@linux.ibm.com Cc: Vasily Gorbik gor@linux.ibm.com Cc: Alexander Gordeev agordeev@linux.ibm.com Cc: linux-s390@vger.kernel.org Reported-by: kernel test robot lkp@intel.com Closes: https://lore.kernel.org/oe-kbuild-all/202506062053.zbkFBEnJ-lkp@intel.com/ Signed-off-by: Alexey Gladkov legion@kernel.org Acked-by: Heiko Carstens hca@linux.ibm.com Link: https://patch.msgid.link/20d40a7a3a053ba06a54155e777dcde7fdada1db.1758182101... Signed-off-by: Nathan Chancellor nathan@kernel.org Stable-dep-of: 9338d660b79a ("s390/vmlinux.lds.S: Move .vmlinux.info to end of allocatable sections") Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/kernel/vmlinux.lds.S | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/arch/s390/kernel/vmlinux.lds.S b/arch/s390/kernel/vmlinux.lds.S index ff1ddba96352a..3f2f90e38808c 100644 --- a/arch/s390/kernel/vmlinux.lds.S +++ b/arch/s390/kernel/vmlinux.lds.S @@ -202,6 +202,11 @@ SECTIONS . = ALIGN(PAGE_SIZE); _end = . ;
+ /* Debugging sections. */ + STABS_DEBUG + DWARF_DEBUG + ELF_DETAILS + /* * uncompressed image info used by the decompressor * it should match struct vmlinux_info @@ -232,11 +237,6 @@ SECTIONS #endif } :NONE
- /* Debugging sections. */ - STABS_DEBUG - DWARF_DEBUG - ELF_DETAILS - /* * Make sure that the .got.plt is either completely empty or it * contains only the three reserved double words.
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nathan Chancellor nathan@kernel.org
[ Upstream commit 9338d660b79a0dfe4eb3fe9bd748054cded87d4f ]
When building s390 defconfig with binutils older than 2.32, there are several warnings during the final linking stage:
s390-linux-ld: .tmp_vmlinux1: warning: allocated section `.got.plt' not in segment s390-linux-ld: .tmp_vmlinux2: warning: allocated section `.got.plt' not in segment s390-linux-ld: vmlinux.unstripped: warning: allocated section `.got.plt' not in segment s390-linux-objcopy: vmlinux: warning: allocated section `.got.plt' not in segment s390-linux-objcopy: st7afZyb: warning: allocated section `.got.plt' not in segment
binutils commit afca762f598 ("S/390: Improve partial relro support for 64 bit") [1] in 2.32 changed where .got.plt is emitted, avoiding the warning.
The :NONE in the .vmlinux.info output section description changes the segment for subsequent allocated sections. Move .vmlinux.info right above the discards section to place all other sections in the previously defined segment, .data.
Fixes: 30226853d6ec ("s390: vmlinux.lds.S: explicitly handle '.got' and '.plt' sections") Link: https://sourceware.org/git/?p=binutils-gdb.git%3Ba=commit%3Bh=afca762f598d45... [1] Acked-by: Alexander Gordeev agordeev@linux.ibm.com Acked-by: Alexey Gladkov legion@kernel.org Acked-by: Nicolas Schier nsc@kernel.org Link: https://patch.msgid.link/20251008-kbuild-fix-modinfo-regressions-v1-3-9fc776... Signed-off-by: Nathan Chancellor nathan@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/kernel/vmlinux.lds.S | 44 +++++++++++++++++----------------- 1 file changed, 22 insertions(+), 22 deletions(-)
diff --git a/arch/s390/kernel/vmlinux.lds.S b/arch/s390/kernel/vmlinux.lds.S index 3f2f90e38808c..0d18d3267bc4f 100644 --- a/arch/s390/kernel/vmlinux.lds.S +++ b/arch/s390/kernel/vmlinux.lds.S @@ -207,6 +207,28 @@ SECTIONS DWARF_DEBUG ELF_DETAILS
+ /* + * Make sure that the .got.plt is either completely empty or it + * contains only the three reserved double words. + */ + .got.plt : { + *(.got.plt) + } + ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0x18, "Unexpected GOT/PLT entries detected!") + + /* + * Sections that should stay zero sized, which is safer to + * explicitly check instead of blindly discarding. + */ + .plt : { + *(.plt) *(.plt.*) *(.iplt) *(.igot .igot.plt) + } + ASSERT(SIZEOF(.plt) == 0, "Unexpected run-time procedure linkages detected!") + .rela.dyn : { + *(.rela.*) *(.rela_*) + } + ASSERT(SIZEOF(.rela.dyn) == 0, "Unexpected run-time relocations (.rela) detected!") + /* * uncompressed image info used by the decompressor * it should match struct vmlinux_info @@ -237,28 +259,6 @@ SECTIONS #endif } :NONE
- /* - * Make sure that the .got.plt is either completely empty or it - * contains only the three reserved double words. - */ - .got.plt : { - *(.got.plt) - } - ASSERT(SIZEOF(.got.plt) == 0 || SIZEOF(.got.plt) == 0x18, "Unexpected GOT/PLT entries detected!") - - /* - * Sections that should stay zero sized, which is safer to - * explicitly check instead of blindly discarding. - */ - .plt : { - *(.plt) *(.plt.*) *(.iplt) *(.igot .igot.plt) - } - ASSERT(SIZEOF(.plt) == 0, "Unexpected run-time procedure linkages detected!") - .rela.dyn : { - *(.rela.*) *(.rela_*) - } - ASSERT(SIZEOF(.rela.dyn) == 0, "Unexpected run-time relocations (.rela) detected!") - /* Sections to be discarded */ DISCARDS /DISCARD/ : {
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rafael J. Wysocki rafael.j.wysocki@intel.com
commit d0759b10989c5c5aae3d455458c9fc4e8cc694f7 upstream.
The ACPI handle passed to acpi_extract_properties() as the first argument represents the ACPI namespace scope in which to look for objects returning buffers associated with buffer properties.
For _DSD objects located immediately under ACPI devices, this handle is the same as the handle of the device object holding the _DSD, but for data-only subnodes it is not so.
First of all, data-only subnodes are represented by objects that cannot hold other objects in their scopes (like control methods). Therefore a data-only subnode handle cannot be used for completing relative pathname segments, so the current code in in acpi_nondev_subnode_extract() passing a data-only subnode handle to acpi_extract_properties() is invalid.
Moreover, a data-only subnode of device A may be represented by an object located in the scope of device B (which kind of makes sense, for instance, if A is a B's child). In that case, the scope in question would be the one of device B. In other words, the scope mentioned above is the same as the scope used for subnode object lookup in acpi_nondev_subnode_extract().
Accordingly, rearrange that function to use the same scope for the extraction of properties and subnode object lookup.
Fixes: 103e10c69c61 ("ACPI: property: Add support for parsing buffer property UUID") Cc: 6.0+ stable@vger.kernel.org # 6.0+ Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Reviewed-by: Sakari Ailus sakari.ailus@linux.intel.com Tested-by: Sakari Ailus sakari.ailus@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/acpi/property.c | 30 +++++++++++------------------- 1 file changed, 11 insertions(+), 19 deletions(-)
--- a/drivers/acpi/property.c +++ b/drivers/acpi/property.c @@ -83,6 +83,7 @@ static bool acpi_nondev_subnode_extract( struct fwnode_handle *parent) { struct acpi_data_node *dn; + acpi_handle scope = NULL; bool result;
if (acpi_graph_ignore_port(handle)) @@ -98,27 +99,18 @@ static bool acpi_nondev_subnode_extract( INIT_LIST_HEAD(&dn->data.properties); INIT_LIST_HEAD(&dn->data.subnodes);
- result = acpi_extract_properties(handle, desc, &dn->data); + /* + * The scope for the completion of relative pathname segments and + * subnode object lookup is the one of the namespace node (device) + * containing the object that has returned the package. That is, it's + * the scope of that object's parent device. + */ + if (handle) + acpi_get_parent(handle, &scope);
- if (handle) { - acpi_handle scope; - acpi_status status; - - /* - * The scope for the subnode object lookup is the one of the - * namespace node (device) containing the object that has - * returned the package. That is, it's the scope of that - * object's parent. - */ - status = acpi_get_parent(handle, &scope); - if (ACPI_SUCCESS(status) - && acpi_enumerate_nondev_subnodes(scope, desc, &dn->data, - &dn->fwnode)) - result = true; - } else if (acpi_enumerate_nondev_subnodes(NULL, desc, &dn->data, - &dn->fwnode)) { + result = acpi_extract_properties(scope, desc, &dn->data); + if (acpi_enumerate_nondev_subnodes(scope, desc, &dn->data, &dn->fwnode)) result = true; - }
if (result) { dn->handle = handle;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Tang danielzgtg.opensource@gmail.com
commit 4aac453deca0d9c61df18d968f8864c3ae7d3d8d upstream.
Previously, after `rmmod acpi_tad`, `modprobe acpi_tad` would fail with this dmesg:
sysfs: cannot create duplicate filename '/devices/platform/ACPI000E:00/time' Call Trace: <TASK> dump_stack_lvl+0x6c/0x90 dump_stack+0x10/0x20 sysfs_warn_dup+0x8b/0xa0 sysfs_add_file_mode_ns+0x122/0x130 internal_create_group+0x1dd/0x4c0 sysfs_create_group+0x13/0x20 acpi_tad_probe+0x147/0x1f0 [acpi_tad] platform_probe+0x42/0xb0 </TASK> acpi-tad ACPI000E:00: probe with driver acpi-tad failed with error -17
Fixes: 3230b2b3c1ab ("ACPI: TAD: Add low-level support for real time capability") Signed-off-by: Daniel Tang danielzgtg.opensource@gmail.com Reviewed-by: Mika Westerberg mika.westerberg@linux.intel.com Link: https://patch.msgid.link/2881298.hMirdbgypa@daniel-desktop3 Cc: 5.2+ stable@vger.kernel.org # 5.2+ Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/acpi/acpi_tad.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/acpi/acpi_tad.c +++ b/drivers/acpi/acpi_tad.c @@ -565,6 +565,9 @@ static void acpi_tad_remove(struct platf
pm_runtime_get_sync(dev);
+ if (dd->capabilities & ACPI_TAD_RT) + sysfs_remove_group(&dev->kobj, &acpi_tad_time_attr_group); + if (dd->capabilities & ACPI_TAD_DC_WAKE) sysfs_remove_group(&dev->kobj, &acpi_tad_dc_attr_group);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Amir Mohammad Jahangirzad a.jahangirzad@gmail.com
commit 496f9372eae14775e0524e83e952814691fe850a upstream.
In the ACPI debugger interface, the helper functions for read and write operations use "int" as the length parameter data type. When a large "size_t count" is passed from the file operations, this cast to "int" results in truncation and a negative value due to signed integer representation.
Logically, this negative number propagates to the min() calculation, where it is selected over the positive buffer space value, leading to unexpected behavior. Subsequently, when this negative value is used in copy_to_user() or copy_from_user(), it is interpreted as a large positive value due to the unsigned nature of the size parameter in these functions, causing the copy operations to attempt handling sizes far beyond the intended buffer limits.
Address the issue by: - Changing the length parameters in acpi_aml_read_user() and acpi_aml_write_user() from "int" to "size_t", aligning with the expected unsigned size semantics. - Updating return types and local variables in acpi_aml_read() and acpi_aml_write() to "ssize_t" for consistency with kernel file operation conventions. - Using "size_t" for the "n" variable to ensure calculations remain unsigned. - Using min_t() for circ_count_to_end() and circ_space_to_end() to ensure type-safe comparisons and prevent integer overflow.
Signed-off-by: Amir Mohammad Jahangirzad a.jahangirzad@gmail.com Link: https://patch.msgid.link/20250923013113.20615-1-a.jahangirzad@gmail.com [ rjw: Changelog tweaks, local variable definitions ordering adjustments ] Fixes: 8cfb0cdf07e2 ("ACPI / debugger: Add IO interface to access debugger functionalities") Cc: 4.5+ stable@vger.kernel.org # 4.5+ Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/acpi/acpi_dbg.c | 26 +++++++++++++------------- 1 file changed, 13 insertions(+), 13 deletions(-)
--- a/drivers/acpi/acpi_dbg.c +++ b/drivers/acpi/acpi_dbg.c @@ -569,11 +569,11 @@ static int acpi_aml_release(struct inode return 0; }
-static int acpi_aml_read_user(char __user *buf, int len) +static ssize_t acpi_aml_read_user(char __user *buf, size_t len) { - int ret; struct circ_buf *crc = &acpi_aml_io.out_crc; - int n; + ssize_t ret; + size_t n; char *p;
ret = acpi_aml_lock_read(crc, ACPI_AML_OUT_USER); @@ -582,7 +582,7 @@ static int acpi_aml_read_user(char __use /* sync head before removing logs */ smp_rmb(); p = &crc->buf[crc->tail]; - n = min(len, circ_count_to_end(crc)); + n = min_t(size_t, len, circ_count_to_end(crc)); if (copy_to_user(buf, p, n)) { ret = -EFAULT; goto out; @@ -599,8 +599,8 @@ out: static ssize_t acpi_aml_read(struct file *file, char __user *buf, size_t count, loff_t *ppos) { - int ret = 0; - int size = 0; + ssize_t ret = 0; + ssize_t size = 0;
if (!count) return 0; @@ -639,11 +639,11 @@ again: return size > 0 ? size : ret; }
-static int acpi_aml_write_user(const char __user *buf, int len) +static ssize_t acpi_aml_write_user(const char __user *buf, size_t len) { - int ret; struct circ_buf *crc = &acpi_aml_io.in_crc; - int n; + ssize_t ret; + size_t n; char *p;
ret = acpi_aml_lock_write(crc, ACPI_AML_IN_USER); @@ -652,7 +652,7 @@ static int acpi_aml_write_user(const cha /* sync tail before inserting cmds */ smp_mb(); p = &crc->buf[crc->head]; - n = min(len, circ_space_to_end(crc)); + n = min_t(size_t, len, circ_space_to_end(crc)); if (copy_from_user(p, buf, n)) { ret = -EFAULT; goto out; @@ -663,14 +663,14 @@ static int acpi_aml_write_user(const cha ret = n; out: acpi_aml_unlock_fifo(ACPI_AML_IN_USER, ret >= 0); - return n; + return ret; }
static ssize_t acpi_aml_write(struct file *file, const char __user *buf, size_t count, loff_t *ppos) { - int ret = 0; - int size = 0; + ssize_t ret = 0; + ssize_t size = 0;
if (!count) return 0;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stephan Gerhold stephan.gerhold@linaro.org
commit 99b78773c2ae55dcc01025f94eae8ce9700ae985 upstream.
On most MSM8916 devices (aside from the DragonBoard 410c), the bootloader already initializes the display to show the boot splash screen. In this situation, MDSS is already configured and left running when starting Linux. To avoid side effects from the bootloader configuration, the MDSS reset can be specified in the device tree to start again with a clean hardware state.
The reset for MDSS is currently missing in msm8916.dtsi, which causes errors when the MDSS driver tries to re-initialize the registers:
dsi_err_worker: status=6 dsi_err_worker: status=6 dsi_err_worker: status=6 ...
It turns out that we have always indirectly worked around this by building the MDSS driver as a module. Before v6.17, the power domain was temporarily turned off until the module was loaded, long enough to clear the register contents. In v6.17, power domains are not turned off during boot until sync_state() happens, so this is no longer working. Even before v6.17 this resulted in broken behavior, but notably only when the MDSS driver was built-in instead of a module.
Cc: stable@vger.kernel.org Fixes: 305410ffd1b2 ("arm64: dts: msm8916: Add display support") Signed-off-by: Stephan Gerhold stephan.gerhold@linaro.org Reviewed-by: Dmitry Baryshkov dmitry.baryshkov@oss.qualcomm.com Reviewed-by: Konrad Dybcio konrad.dybcio@oss.qualcomm.com Link: https://lore.kernel.org/r/20250915-msm8916-resets-v1-1-a5c705df0c45@linaro.o... Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/boot/dts/qcom/msm8916.dtsi | 2 ++ 1 file changed, 2 insertions(+)
--- a/arch/arm64/boot/dts/qcom/msm8916.dtsi +++ b/arch/arm64/boot/dts/qcom/msm8916.dtsi @@ -1540,6 +1540,8 @@
interrupts = <GIC_SPI 72 IRQ_TYPE_LEVEL_HIGH>;
+ resets = <&gcc GCC_MDSS_BCR>; + interrupt-controller; #interrupt-cells = <1>;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stephan Gerhold stephan.gerhold@linaro.org
commit f73c82c855e186e9b67125e3eee743960320e43c upstream.
On most MSM8939 devices, the bootloader already initializes the display to show the boot splash screen. In this situation, MDSS is already configured and left running when starting Linux. To avoid side effects from the bootloader configuration, the MDSS reset can be specified in the device tree to start again with a clean hardware state.
The reset for MDSS is currently missing in msm8939.dtsi, which causes errors when the MDSS driver tries to re-initialize the registers:
dsi_err_worker: status=6 dsi_err_worker: status=6 dsi_err_worker: status=6 ...
It turns out that we have always indirectly worked around this by building the MDSS driver as a module. Before v6.17, the power domain was temporarily turned off until the module was loaded, long enough to clear the register contents. In v6.17, power domains are not turned off during boot until sync_state() happens, so this is no longer working. Even before v6.17 this resulted in broken behavior, but notably only when the MDSS driver was built-in instead of a module.
Cc: stable@vger.kernel.org Fixes: 61550c6c156c ("arm64: dts: qcom: Add msm8939 SoC") Signed-off-by: Stephan Gerhold stephan.gerhold@linaro.org Reviewed-by: Dmitry Baryshkov dmitry.baryshkov@oss.qualcomm.com Reviewed-by: Konrad Dybcio konrad.dybcio@oss.qualcomm.com Link: https://lore.kernel.org/r/20250915-msm8916-resets-v1-2-a5c705df0c45@linaro.o... Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/boot/dts/qcom/msm8939.dtsi | 2 ++ 1 file changed, 2 insertions(+)
--- a/arch/arm64/boot/dts/qcom/msm8939.dtsi +++ b/arch/arm64/boot/dts/qcom/msm8939.dtsi @@ -1218,6 +1218,8 @@
power-domains = <&gcc MDSS_GDSC>;
+ resets = <&gcc GCC_MDSS_BCR>; + #address-cells = <1>; #size-cells = <1>; #interrupt-cells = <1>;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stephan Gerhold stephan.gerhold@linaro.org
commit 316294bb6695a43a9181973ecd4e6fb3e576a9f7 upstream.
Reading the hardware registers of the &slimbam on RB3 reveals that the BAM supports only 23 pipes (channels) and supports 4 EEs instead of 2. This hasn't caused problems so far since nothing is using the extra channels, but attempting to use them would lead to crashes.
The bam_dma driver might warn in the future if the num-channels in the DT are wrong, so correct the properties in the DT to avoid future regressions.
Cc: stable@vger.kernel.org Fixes: 27ca1de07dc3 ("arm64: dts: qcom: sdm845: add slimbus nodes") Signed-off-by: Stephan Gerhold stephan.gerhold@linaro.org Reviewed-by: Dmitry Baryshkov dmitry.baryshkov@oss.qualcomm.com Link: https://lore.kernel.org/r/20250821-sdm845-slimbam-channels-v1-1-498f7d46b9ee... Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/boot/dts/qcom/sdm845.dtsi | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/arm64/boot/dts/qcom/sdm845.dtsi +++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi @@ -5365,11 +5365,11 @@ compatible = "qcom,bam-v1.7.4", "qcom,bam-v1.7.0"; qcom,controlled-remotely; reg = <0 0x17184000 0 0x2a000>; - num-channels = <31>; + num-channels = <23>; interrupts = <GIC_SPI 164 IRQ_TYPE_LEVEL_HIGH>; #dma-cells = <1>; qcom,ee = <1>; - qcom,num-ees = <2>; + qcom,num-ees = <4>; iommus = <&apps_smmu 0x1806 0x0>; };
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aleksandrs Vinarskis alex.vinarskis@gmail.com
commit b9a185198f96259311543b30d884d8c01da913f7 upstream.
pm8010 is a camera specific PMIC, and may not be present on some devices. These may instead use a dedicated vreg for this purpose (Dell XPS 9345, Dell Inspiron..) or use USB webcam instead of a MIPI one alltogether (Lenovo Thinbook 16, Lenovo Yoga..).
Disable pm8010 by default, let platforms that actually have one onboard enable it instead.
Cc: stable@vger.kernel.org Fixes: 2559e61e7ef4 ("arm64: dts: qcom: x1e80100-pmics: Add the missing PMICs") Reviewed-by: Bryan O'Donoghue bryan.odonoghue@linaro.org Reviewed-by: Johan Hovold johan+linaro@kernel.org Reviewed-by: Konrad Dybcio konrad.dybcio@oss.qualcomm.com Signed-off-by: Aleksandrs Vinarskis alex.vinarskis@gmail.com Link: https://lore.kernel.org/r/20250701183625.1968246-2-alex.vinarskis@gmail.com Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/boot/dts/qcom/x1e80100-pmics.dtsi | 2 ++ 1 file changed, 2 insertions(+)
--- a/arch/arm64/boot/dts/qcom/x1e80100-pmics.dtsi +++ b/arch/arm64/boot/dts/qcom/x1e80100-pmics.dtsi @@ -475,6 +475,8 @@ #address-cells = <1>; #size-cells = <0>;
+ status = "disabled"; + pm8010_temp_alarm: temp-alarm@2400 { compatible = "qcom,spmi-temp-alarm"; reg = <0x2400>;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vibhore Vardhan vibhore@ti.com
commit 4c4e48afb6d85c1a8f9fdbae1fdf17ceef4a6f5b upstream.
The main pad configuration register region starts with the register MAIN_PADCFG_CTRL_MMR_CFG0_PADCONFIG0 with address 0x000f4000 and ends with the MAIN_PADCFG_CTRL_MMR_CFG0_PADCONFIG150 register with address 0x000f4258, as a result of which, total size of the region is 0x25c instead of 0x2ac.
Reference Docs TRM (AM62A) - https://www.ti.com/lit/ug/spruj16b/spruj16b.pdf TRM (AM62D) - https://www.ti.com/lit/ug/sprujd4/sprujd4.pdf
Fixes: 5fc6b1b62639c ("arm64: dts: ti: Introduce AM62A7 family of SoCs") Cc: stable@vger.kernel.org Signed-off-by: Vibhore Vardhan vibhore@ti.com Signed-off-by: Paresh Bhagat p-bhagat@ti.com Reviewed-by: Siddharth Vadapalli s-vadapalli@ti.com Link: https://patch.msgid.link/20250903062513.813925-2-p-bhagat@ti.com Signed-off-by: Nishanth Menon nm@ti.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/boot/dts/ti/k3-am62a-main.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi +++ b/arch/arm64/boot/dts/ti/k3-am62a-main.dtsi @@ -258,7 +258,7 @@
main_pmx0: pinctrl@f4000 { compatible = "pinctrl-single"; - reg = <0x00 0xf4000 0x00 0x2ac>; + reg = <0x00 0xf4000 0x00 0x25c>; #pinctrl-cells = <1>; pinctrl-single,register-width = <32>; pinctrl-single,function-mask = <0xffffffff>;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yang Shi yang@os.amperecomputing.com
commit 195a1b7d8388c0ec2969a39324feb8bebf9bb907 upstream.
The kprobe page is allocated by execmem allocator with ROX permission. It needs to call set_memory_rox() to set proper permission for the direct map too. It was missed.
Fixes: 10d5e97c1bf8 ("arm64: use PAGE_KERNEL_ROX directly in alloc_insn_page") Cc: stable@vger.kernel.org Signed-off-by: Yang Shi yang@os.amperecomputing.com Reviewed-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/kernel/probes/kprobes.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
--- a/arch/arm64/kernel/probes/kprobes.c +++ b/arch/arm64/kernel/probes/kprobes.c @@ -10,6 +10,7 @@
#define pr_fmt(fmt) "kprobes: " fmt
+#include <linux/execmem.h> #include <linux/extable.h> #include <linux/kasan.h> #include <linux/kernel.h> @@ -41,6 +42,17 @@ DEFINE_PER_CPU(struct kprobe_ctlblk, kpr static void __kprobes post_kprobe_handler(struct kprobe *, struct kprobe_ctlblk *, struct pt_regs *);
+void *alloc_insn_page(void) +{ + void *addr; + + addr = execmem_alloc(EXECMEM_KPROBES, PAGE_SIZE); + if (!addr) + return NULL; + set_memory_rox((unsigned long)addr, 1); + return addr; +} + static void __kprobes arch_prepare_ss_slot(struct kprobe *p) { kprobe_opcode_t *addr = p->ainsn.api.insn;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Sverdlin alexander.sverdlin@siemens.com
commit 8a6506e1ba0d2b831729808d958aae77604f12f9 upstream.
There is an issue possible where TI AM33xx SoCs do not boot properly after a reset if EMU0/EMU1 pins were used as GPIO and have been driving low level actively prior to reset [1].
"Advisory 1.0.36 EMU0 and EMU1: Terminals Must be Pulled High Before ICEPick Samples
The state of the EMU[1:0] terminals are latched during reset to determine ICEPick boot mode. For normal device operation, these terminals must be pulled up to a valid high logic level ( > VIH min) before ICEPick samples the state of these terminals, which occurs [five CLK_M_OSC clock cycles - 10 ns] after the falling edge of WARMRSTn.
Many applications may not require the secondary GPIO function of the EMU[1:0] terminals. In this case, they would only be connected to pull-up resistors, which ensures they are always high when ICEPick samples. However, some applications may need to use these terminals as GPIO where they could be driven low before reset is asserted. This usage of the EMU[1:0] terminals may require special attention to ensure the terminals are allowed to return to a valid high-logic level before ICEPick samples the state of these terminals.
When any device reset is asserted, the pin mux mode of EMU[1:0] terminals configured to operate as GPIO (mode 7) will change back to EMU input (mode 0) on the falling edge of WARMRSTn. This only provides a short period of time for the terminals to return high if driven low before reset is asserted...
If the EMU[1:0] terminals are configured to operate as GPIO, the product should be designed such these terminals can be pulled to a valid high-logic level within 190 ns after the falling edge of WARMRSTn."
We've noticed this problem with custom am335x hardware in combination with recently implemented cold reset method (commit 6521f6a195c70 ("ARM: AM33xx: PRM: Implement REBOOT_COLD")). It looks like the problem can affect other HW, for instance AM335x Chiliboard, because the latter has LEDs on GPIO3_7/GPIO3_8 as well.
One option would be to check if the pins are in GPIO mode and either switch to output active high, or switch to input and poll until the external pull-ups have brought the pins to the desired high state. But fighting with GPIO driver for these pins is probably not the most straight forward approch in a reboot handler.
Fortunately we can easily control pinmuxing here and rely on the external pull-ups. TI recommends 4k7 external pull up resistors [2] and even with quite conservative estimation for pin capacity (1 uF should never happen) the required delay shall not exceed 5ms.
[1] Link: https://www.ti.com/lit/pdf/sprz360 [2] Link: https://e2e.ti.com/support/processors-group/processors/f/processors-forum/86...
Cc: stable@vger.kernel.org Signed-off-by: Alexander Sverdlin alexander.sverdlin@siemens.com Link: https://lore.kernel.org/r/20250717152708.487891-1-alexander.sverdlin@siemens... Signed-off-by: Kevin Hilman khilman@baylibre.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/mach-omap2/am33xx-restart.c | 36 +++++++++++++++++++++++++++++++++++ 1 file changed, 36 insertions(+)
--- a/arch/arm/mach-omap2/am33xx-restart.c +++ b/arch/arm/mach-omap2/am33xx-restart.c @@ -2,12 +2,46 @@ /* * am33xx-restart.c - Code common to all AM33xx machines. */ +#include <dt-bindings/pinctrl/am33xx.h> +#include <linux/delay.h> #include <linux/kernel.h> #include <linux/reboot.h>
#include "common.h" +#include "control.h" #include "prm.h"
+/* + * Advisory 1.0.36 EMU0 and EMU1: Terminals Must be Pulled High Before + * ICEPick Samples + * + * If EMU0/EMU1 pins have been used as GPIO outputs and actively driving low + * level, the device might not reboot in normal mode. We are in a bad position + * to override GPIO state here, so just switch the pins into EMU input mode + * (that's what reset will do anyway) and wait a bit, because the state will be + * latched 190 ns after reset. + */ +static void am33xx_advisory_1_0_36(void) +{ + u32 emu0 = omap_ctrl_readl(AM335X_PIN_EMU0); + u32 emu1 = omap_ctrl_readl(AM335X_PIN_EMU1); + + /* If both pins are in EMU mode, nothing to do */ + if (!(emu0 & 7) && !(emu1 & 7)) + return; + + /* Switch GPIO3_7/GPIO3_8 into EMU0/EMU1 modes respectively */ + omap_ctrl_writel(emu0 & ~7, AM335X_PIN_EMU0); + omap_ctrl_writel(emu1 & ~7, AM335X_PIN_EMU1); + + /* + * Give pull-ups time to load the pin/PCB trace capacity. + * 5 ms shall be enough to load 1 uF (would be huge capacity for these + * pins) with TI-recommended 4k7 external pull-ups. + */ + mdelay(5); +} + /** * am33xx_restart - trigger a software restart of the SoC * @mode: the "reboot mode", see arch/arm/kernel/{setup,process}.c @@ -18,6 +52,8 @@ */ void am33xx_restart(enum reboot_mode mode, const char *cmd) { + am33xx_advisory_1_0_36(); + /* TODO: Handle cmd if necessary */ prm_reboot_mode = mode;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miaoqian Lin linmq006@gmail.com
commit 74139a64e8cedb6d971c78d5d17384efeced1725 upstream.
Add missing of_node_put() calls to release device node references obtained via of_parse_phandle().
Fixes: 06ee7a950b6a ("ARM: OMAP2+: pm33xx-core: Add cpuidle_ops for am335x/am437x") Cc: stable@vger.kernel.org Signed-off-by: Miaoqian Lin linmq006@gmail.com Link: https://lore.kernel.org/r/20250902075943.2408832-1-linmq006@gmail.com Signed-off-by: Kevin Hilman khilman@baylibre.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/mach-omap2/pm33xx-core.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/arch/arm/mach-omap2/pm33xx-core.c +++ b/arch/arm/mach-omap2/pm33xx-core.c @@ -388,12 +388,15 @@ static int __init amx3_idle_init(struct if (!state_node) break;
- if (!of_device_is_available(state_node)) + if (!of_device_is_available(state_node)) { + of_node_put(state_node); continue; + }
if (i == CPUIDLE_STATE_MAX) { pr_warn("%s: cpuidle states reached max possible\n", __func__); + of_node_put(state_node); break; }
@@ -403,6 +406,7 @@ static int __init amx3_idle_init(struct states[state_count].wfi_flags |= WFI_FLAG_WAKE_M3 | WFI_FLAG_FLUSH_CACHE;
+ of_node_put(state_node); state_count++; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Robin Murphy robin.murphy@arm.com
commit b3fe1c83a56f3cb7c475747ee1c6ec5a9dd5f60e upstream.
CMN S3's DTM offset is different between r0px and r1p0, and it turns out this was not a error in the earlier documentation, but does actually exist in the design. Lovely.
Cc: stable@vger.kernel.org Fixes: 0dc2f4963f7e ("perf/arm-cmn: Support CMN S3") Signed-off-by: Robin Murphy robin.murphy@arm.com Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/perf/arm-cmn.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
--- a/drivers/perf/arm-cmn.c +++ b/drivers/perf/arm-cmn.c @@ -65,7 +65,7 @@ /* PMU registers occupy the 3rd 4KB page of each node's region */ #define CMN_PMU_OFFSET 0x2000 /* ...except when they don't :( */ -#define CMN_S3_DTM_OFFSET 0xa000 +#define CMN_S3_R1_DTM_OFFSET 0xa000 #define CMN_S3_PMU_OFFSET 0xd900
/* For most nodes, this is all there is */ @@ -233,6 +233,9 @@ enum cmn_revision { REV_CMN700_R1P0, REV_CMN700_R2P0, REV_CMN700_R3P0, + REV_CMNS3_R0P0 = 0, + REV_CMNS3_R0P1, + REV_CMNS3_R1P0, REV_CI700_R0P0 = 0, REV_CI700_R1P0, REV_CI700_R2P0, @@ -425,8 +428,8 @@ static enum cmn_model arm_cmn_model(cons static int arm_cmn_pmu_offset(const struct arm_cmn *cmn, const struct arm_cmn_node *dn) { if (cmn->part == PART_CMN_S3) { - if (dn->type == CMN_TYPE_XP) - return CMN_S3_DTM_OFFSET; + if (cmn->rev >= REV_CMNS3_R1P0 && dn->type == CMN_TYPE_XP) + return CMN_S3_R1_DTM_OFFSET; return CMN_S3_PMU_OFFSET; } return CMN_PMU_OFFSET;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Riesch michael.riesch@collabora.com
commit c254815b02673cc77a84103c4c0d6197bd90c0ef upstream.
There are variants of the Rockchip Innosilicon CSI DPHY (e.g., the RK3568 variant) that are powered on by default as they are part of the ALIVE power domain. Remove 'power-domains' from the required properties in order to avoid false positives.
Fixes: 22c8e0a69b7f ("dt-bindings: phy: add compatible for rk356x to rockchip-inno-csi-dphy") Cc: stable@kernel.org Reviewed-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Signed-off-by: Michael Riesch michael.riesch@collabora.com Link: https://lore.kernel.org/r/20250616-rk3588-csi-dphy-v4-2-a4f340a7f0cf@collabo... Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/devicetree/bindings/phy/rockchip-inno-csi-dphy.yaml | 15 +++++++++- 1 file changed, 14 insertions(+), 1 deletion(-)
--- a/Documentation/devicetree/bindings/phy/rockchip-inno-csi-dphy.yaml +++ b/Documentation/devicetree/bindings/phy/rockchip-inno-csi-dphy.yaml @@ -57,11 +57,24 @@ required: - clocks - clock-names - '#phy-cells' - - power-domains - resets - reset-names - rockchip,grf
+allOf: + - if: + properties: + compatible: + contains: + enum: + - rockchip,px30-csi-dphy + - rockchip,rk1808-csi-dphy + - rockchip,rk3326-csi-dphy + - rockchip,rk3368-csi-dphy + then: + required: + - power-domains + additionalProperties: false
examples:
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jason Andryuk jason.andryuk@amd.com
commit 08df2d7dd4ab2db8a172d824cda7872d5eca460a upstream.
rc is overwritten by the evtchn_status hypercall in each iteration, so the return value will be whatever the last iteration is. This could incorrectly return success even if the event channel was not found. Change to an explicit -ENOENT for an un-found virq and return 0 on a successful match.
Fixes: 62cc5fc7b2e0 ("xen/pv-on-hvm kexec: rebind virqs to existing eventchannel ports") Cc: stable@vger.kernel.org Signed-off-by: Jason Andryuk jason.andryuk@amd.com Reviewed-by: Jan Beulich jbeulich@suse.com Reviewed-by: Juergen Gross jgross@suse.com Signed-off-by: Juergen Gross jgross@suse.com Message-ID: 20250828003604.8949-2-jason.andryuk@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/xen/events/events_base.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
--- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -1324,10 +1324,11 @@ static int find_virq(unsigned int virq, { struct evtchn_status status; evtchn_port_t port; - int rc = -ENOENT;
memset(&status, 0, sizeof(status)); for (port = 0; port < xen_evtchn_max_channels(); port++) { + int rc; + status.dom = DOMID_SELF; status.port = port; rc = HYPERVISOR_event_channel_op(EVTCHNOP_status, &status); @@ -1337,10 +1338,10 @@ static int find_virq(unsigned int virq, continue; if (status.u.virq == virq && status.vcpu == xen_vcpu_nr(cpu)) { *evtchn = port; - break; + return 0; } } - return rc; + return -ENOENT; }
/**
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lukas Wunner lukas@wunner.de
commit f770c3d858687252f1270265ba152d5c622e793f upstream.
The device power management API has the following asymmetry: * dpm_suspend_start() does not clean up on failure (it requires a call to dpm_resume_end()) * dpm_suspend_end() does clean up on failure (it does not require a call to dpm_resume_start())
The asymmetry was introduced by commit d8f3de0d2412 ("Suspend-related patches for 2.6.27") in June 2008: It removed a call to device_resume() from device_suspend() (which was later renamed to dpm_suspend_start()).
When Xen began using the device power management API in May 2008 with commit 0e91398f2a5d ("xen: implement save/restore"), the asymmetry did not yet exist. But since it was introduced, a call to dpm_resume_end() is missing in the error path of dpm_suspend_start(). Fix it.
Fixes: d8f3de0d2412 ("Suspend-related patches for 2.6.27") Signed-off-by: Lukas Wunner lukas@wunner.de Cc: stable@vger.kernel.org # v2.6.27 Reviewed-by: "Rafael J. Wysocki (Intel)" rafael@kernel.org Signed-off-by: Juergen Gross jgross@suse.com Message-ID: 22453676d1ddcebbe81641bb68ddf587fee7e21e.1756990799.git.lukas@wunner.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/xen/manage.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/xen/manage.c +++ b/drivers/xen/manage.c @@ -116,7 +116,7 @@ static void do_suspend(void) err = dpm_suspend_start(PMSG_FREEZE); if (err) { pr_err("%s: dpm_suspend_start %d\n", __func__, err); - goto out_thaw; + goto out_resume_end; }
printk(KERN_DEBUG "suspending xenstore...\n"); @@ -156,6 +156,7 @@ out_resume: else xs_suspend_cancel();
+out_resume_end: dpm_resume_end(si.cancelled ? PMSG_THAW : PMSG_RESTORE);
out_thaw:
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jason Andryuk jason.andryuk@amd.com
commit 07ce121d93a5e5fb2440a24da3dbf408fcee978e upstream.
Change find_virq() to return -EEXIST when a VIRQ is bound to a different CPU than the one passed in. With that, remove the BUG_ON() from bind_virq_to_irq() to propogate the error upwards.
Some VIRQs are per-cpu, but others are per-domain or global. Those must be bound to CPU0 and can then migrate elsewhere. The lookup for per-domain and global will probably fail when migrated off CPU 0, especially when the current CPU is tracked. This now returns -EEXIST instead of BUG_ON().
A second call to bind a per-domain or global VIRQ is not expected, but make it non-fatal to avoid trying to look up the irq, since we don't know which per_cpu(virq_to_irq) it will be in.
Cc: stable@vger.kernel.org Signed-off-by: Jason Andryuk jason.andryuk@amd.com Reviewed-by: Juergen Gross jgross@suse.com Signed-off-by: Juergen Gross jgross@suse.com Message-ID: 20250828003604.8949-3-jason.andryuk@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/xen/events/events_base.c | 19 ++++++++++++++----- 1 file changed, 14 insertions(+), 5 deletions(-)
--- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -1320,10 +1320,12 @@ int bind_interdomain_evtchn_to_irq_latee } EXPORT_SYMBOL_GPL(bind_interdomain_evtchn_to_irq_lateeoi);
-static int find_virq(unsigned int virq, unsigned int cpu, evtchn_port_t *evtchn) +static int find_virq(unsigned int virq, unsigned int cpu, evtchn_port_t *evtchn, + bool percpu) { struct evtchn_status status; evtchn_port_t port; + bool exists = false;
memset(&status, 0, sizeof(status)); for (port = 0; port < xen_evtchn_max_channels(); port++) { @@ -1336,12 +1338,16 @@ static int find_virq(unsigned int virq, continue; if (status.status != EVTCHNSTAT_virq) continue; - if (status.u.virq == virq && status.vcpu == xen_vcpu_nr(cpu)) { + if (status.u.virq != virq) + continue; + if (status.vcpu == xen_vcpu_nr(cpu)) { *evtchn = port; return 0; + } else if (!percpu) { + exists = true; } } - return -ENOENT; + return exists ? -EEXIST : -ENOENT; }
/** @@ -1388,8 +1394,11 @@ int bind_virq_to_irq(unsigned int virq, evtchn = bind_virq.port; else { if (ret == -EEXIST) - ret = find_virq(virq, cpu, &evtchn); - BUG_ON(ret < 0); + ret = find_virq(virq, cpu, &evtchn, percpu); + if (ret) { + __unbind_from_irq(info, info->irq); + goto out; + } }
ret = xen_irq_info_virq_setup(info, cpu, evtchn, virq);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jason Andryuk jason.andryuk@amd.com
commit 3fcc8e146935415d69ffabb5df40ecf50e106131 upstream.
VIRQs come in 3 flavors, per-VPU, per-domain, and global, and the VIRQs are tracked in per-cpu virq_to_irq arrays.
Per-domain and global VIRQs must be bound on CPU 0, and bind_virq_to_irq() sets the per_cpu virq_to_irq at registration time Later, the interrupt can migrate, and info->cpu is updated. When calling __unbind_from_irq(), the per-cpu virq_to_irq is cleared for a different cpu. If bind_virq_to_irq() is called again with CPU 0, the stale irq is returned. There won't be any irq_info for the irq, so things break.
Make xen_rebind_evtchn_to_cpu() update the per_cpu virq_to_irq mappings to keep them update to date with the current cpu. This ensures the correct virq_to_irq is cleared in __unbind_from_irq().
Fixes: e46cdb66c8fc ("xen: event channels") Cc: stable@vger.kernel.org Signed-off-by: Jason Andryuk jason.andryuk@amd.com Reviewed-by: Juergen Gross jgross@suse.com Signed-off-by: Juergen Gross jgross@suse.com Message-ID: 20250828003604.8949-4-jason.andryuk@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/xen/events/events_base.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-)
--- a/drivers/xen/events/events_base.c +++ b/drivers/xen/events/events_base.c @@ -1803,9 +1803,20 @@ static int xen_rebind_evtchn_to_cpu(stru * virq or IPI channel, which don't actually need to be rebound. Ignore * it, but don't do the xenlinux-level rebind in that case. */ - if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) >= 0) + if (HYPERVISOR_event_channel_op(EVTCHNOP_bind_vcpu, &bind_vcpu) >= 0) { + int old_cpu = info->cpu; + bind_evtchn_to_cpu(info, tcpu, false);
+ if (info->type == IRQT_VIRQ) { + int virq = info->u.virq; + int irq = per_cpu(virq_to_irq, old_cpu)[virq]; + + per_cpu(virq_to_irq, old_cpu)[virq] = -1; + per_cpu(virq_to_irq, tcpu)[virq] = irq; + } + } + do_unmask(info, EVT_MASK_REASON_TEMPORARY);
return 0;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan@kernel.org
commit 8ece3173f87df03935906d0c612c2aeda9db92ca upstream.
Make sure to drop the reference to the secure monitor device taken by of_find_device_by_node() when looking up its driver data on behalf of other drivers (e.g. during probe).
Note that holding a reference to the platform device does not prevent its driver data from going away so there is no point in keeping the reference after the helper returns.
Fixes: 8cde3c2153e8 ("firmware: meson_sm: Rework driver as a proper platform driver") Cc: stable@vger.kernel.org # 5.5 Cc: Carlo Caione ccaione@baylibre.com Signed-off-by: Johan Hovold johan@kernel.org Acked-by: Martin Blumenstingl martin.blumenstingl@googlemail.com Link: https://lore.kernel.org/r/20250725074019.8765-1-johan@kernel.org Signed-off-by: Neil Armstrong neil.armstrong@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/firmware/meson/meson_sm.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
--- a/drivers/firmware/meson/meson_sm.c +++ b/drivers/firmware/meson/meson_sm.c @@ -232,11 +232,16 @@ EXPORT_SYMBOL(meson_sm_call_write); struct meson_sm_firmware *meson_sm_get(struct device_node *sm_node) { struct platform_device *pdev = of_find_device_by_node(sm_node); + struct meson_sm_firmware *fw;
if (!pdev) return NULL;
- return platform_get_drvdata(pdev); + fw = platform_get_drvdata(pdev); + + put_device(&pdev->dev); + + return fw; } EXPORT_SYMBOL_GPL(meson_sm_get);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Randy Dunlap rdunlap@infradead.org
commit d5d12cc03e501c38925e544abe44d5bfe23dc930 upstream.
Delete the external-module style Makefile commands. They are not needed for in-tree modules.
This is the only Makefile in the kernel tree (aside from tools/ and samples/) that uses this Makefile style.
The same files are built with or without this patch.
Fixes: 056f2821b631 ("media: cec: extron-da-hd-4k-plus: add the Extron DA HD 4K Plus CEC driver") Signed-off-by: Randy Dunlap rdunlap@infradead.org Cc: stable@vger.kernel.org Signed-off-by: Hans Verkuil hverkuil+cisco@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/cec/usb/extron-da-hd-4k-plus/Makefile | 6 ------ 1 file changed, 6 deletions(-)
diff --git a/drivers/media/cec/usb/extron-da-hd-4k-plus/Makefile b/drivers/media/cec/usb/extron-da-hd-4k-plus/Makefile index 2e8f7f60263f..08d58524419f 100644 --- a/drivers/media/cec/usb/extron-da-hd-4k-plus/Makefile +++ b/drivers/media/cec/usb/extron-da-hd-4k-plus/Makefile @@ -1,8 +1,2 @@ extron-da-hd-4k-plus-cec-objs := extron-da-hd-4k-plus.o cec-splitter.o obj-$(CONFIG_USB_EXTRON_DA_HD_4K_PLUS_CEC) := extron-da-hd-4k-plus-cec.o - -all: - $(MAKE) -C $(KDIR) M=$(shell pwd) modules - -install: - $(MAKE) -C $(KDIR) M=$(shell pwd) modules_install
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Fourier fourier.thomas@gmail.com
commit 23b53639a793477326fd57ed103823a8ab63084f upstream.
The DMA map functions can fail and should be tested for errors. If the mapping fails, dealloc buffers, and return.
Fixes: 1c1e45d17b66 ("V4L/DVB (7786): cx18: new driver for the Conexant CX23418 MPEG encoder chip") Cc: stable@vger.kernel.org Signed-off-by: Thomas Fourier fourier.thomas@gmail.com Signed-off-by: Hans Verkuil hverkuil+cisco@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/pci/cx18/cx18-queue.c | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-)
--- a/drivers/media/pci/cx18/cx18-queue.c +++ b/drivers/media/pci/cx18/cx18-queue.c @@ -379,15 +379,22 @@ int cx18_stream_alloc(struct cx18_stream break; }
+ buf->dma_handle = dma_map_single(&s->cx->pci_dev->dev, + buf->buf, s->buf_size, + s->dma); + if (dma_mapping_error(&s->cx->pci_dev->dev, buf->dma_handle)) { + kfree(buf->buf); + kfree(mdl); + kfree(buf); + break; + } + INIT_LIST_HEAD(&mdl->list); INIT_LIST_HEAD(&mdl->buf_list); mdl->id = s->mdl_base_idx; /* a somewhat safe value */ cx18_enqueue(s, mdl, &s->q_idle);
INIT_LIST_HEAD(&buf->list); - buf->dma_handle = dma_map_single(&s->cx->pci_dev->dev, - buf->buf, s->buf_size, - s->dma); cx18_buf_sync_for_cpu(s, buf); list_add_tail(&buf->list, &s->buf_pool); }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qianfeng Rong rongqianfeng@vivo.com
commit bacd713145443dce7764bb2967d30832a95e5ec8 upstream.
Change "ret" from unsigned int to int type in mt9v111_calc_frame_rate() to store negative error codes or zero returned by __mt9v111_hw_reset() and other functions.
Storing the negative error codes in unsigned type, doesn't cause an issue at runtime but it's ugly as pants.
No effect on runtime.
Signed-off-by: Qianfeng Rong rongqianfeng@vivo.com Fixes: aab7ed1c3927 ("media: i2c: Add driver for Aptina MT9V111") Cc: stable@vger.kernel.org Reviewed-by: Jacopo Mondi jacopo.mondi@ideasonboard.com Signed-off-by: Sakari Ailus sakari.ailus@linux.intel.com Signed-off-by: Hans Verkuil hverkuil+cisco@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/i2c/mt9v111.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/media/i2c/mt9v111.c +++ b/drivers/media/i2c/mt9v111.c @@ -532,8 +532,8 @@ static int mt9v111_calc_frame_rate(struc static int mt9v111_hw_config(struct mt9v111_dev *mt9v111) { struct i2c_client *c = mt9v111->client; - unsigned int ret; u16 outfmtctrl2; + int ret;
/* Force device reset. */ ret = __mt9v111_hw_reset(mt9v111);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Laurent Pinchart laurent.pinchart@ideasonboard.com
commit eec81250219a209b863f11d02128ec1dd8e20877 upstream.
Commit b3decc5ce7d7 ("media: mc: Expand MUST_CONNECT flag to always require an enabled link") expanded the meaning of the MUST_CONNECT flag to require an enabled link in all cases. To do so, the link exploration code was expanded to cover unconnected pads, in order to reject those that have the MUST_CONNECT flag set. The implementation was however incorrect, ignoring unconnected pads instead of ignoring connected pads. Fix it.
Reported-by: Martin Kepplinger-Novaković martink@posteo.de Closes: https://lore.kernel.org/linux-media/20250205172957.182362-1-martink@posteo.d... Reported-by: Maud Spierings maudspierings@gocontroll.com Closes: https://lore.kernel.org/linux-media/20250818-imx8_isi-v1-1-e9cfe994c435@goco... Fixes: b3decc5ce7d7 ("media: mc: Expand MUST_CONNECT flag to always require an enabled link") Cc: stable@vger.kernel.org # 6.1 Signed-off-by: Laurent Pinchart laurent.pinchart@ideasonboard.com Tested-by: Maud Spierings maudspierings@gocontroll.com Tested-by: Martin Kepplinger-Novaković martink@posteo.de Signed-off-by: Sakari Ailus sakari.ailus@linux.intel.com Signed-off-by: Hans Verkuil hverkuil+cisco@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/mc/mc-entity.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/media/mc/mc-entity.c +++ b/drivers/media/mc/mc-entity.c @@ -691,7 +691,7 @@ done: * (already discovered through iterating over links) and pads * not internally connected. */ - if (origin == local || !local->num_links || + if (origin == local || local->num_links || !media_entity_has_pad_interdep(origin->entity, origin->index, local->index)) continue;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Fourier fourier.thomas@gmail.com
commit 1069a4fe637d0e3e4c163e3f8df9be306cc299b4 upstream.
The DMA map functions can fail and should be tested for errors. If the mapping fails, free blanking_ptr and set it to 0. As 0 is a valid DMA address, use blanking_ptr to test if the DMA address is set.
Fixes: 1a0adaf37c30 ("V4L/DVB (5345): ivtv driver for Conexant cx23416/cx23415 MPEG encoder/decoder") Cc: stable@vger.kernel.org Signed-off-by: Thomas Fourier fourier.thomas@gmail.com Signed-off-by: Hans Verkuil hverkuil+cisco@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/pci/ivtv/ivtv-irq.c | 2 +- drivers/media/pci/ivtv/ivtv-yuv.c | 8 +++++++- 2 files changed, 8 insertions(+), 2 deletions(-)
--- a/drivers/media/pci/ivtv/ivtv-irq.c +++ b/drivers/media/pci/ivtv/ivtv-irq.c @@ -351,7 +351,7 @@ void ivtv_dma_stream_dec_prepare(struct
/* Insert buffer block for YUV if needed */ if (s->type == IVTV_DEC_STREAM_TYPE_YUV && f->offset_y) { - if (yi->blanking_dmaptr) { + if (yi->blanking_ptr) { s->sg_pending[idx].src = yi->blanking_dmaptr; s->sg_pending[idx].dst = offset; s->sg_pending[idx].size = 720 * 16; --- a/drivers/media/pci/ivtv/ivtv-yuv.c +++ b/drivers/media/pci/ivtv/ivtv-yuv.c @@ -125,7 +125,7 @@ static int ivtv_yuv_prep_user_dma(struct ivtv_udma_fill_sg_array(dma, y_buffer_offset, uv_buffer_offset, y_size);
/* If we've offset the y plane, ensure top area is blanked */ - if (f->offset_y && yi->blanking_dmaptr) { + if (f->offset_y && yi->blanking_ptr) { dma->SGarray[dma->SG_length].size = cpu_to_le32(720*16); dma->SGarray[dma->SG_length].src = cpu_to_le32(yi->blanking_dmaptr); dma->SGarray[dma->SG_length].dst = cpu_to_le32(IVTV_DECODER_OFFSET + yuv_offset[frame]); @@ -929,6 +929,12 @@ static void ivtv_yuv_init(struct ivtv *i yi->blanking_dmaptr = dma_map_single(&itv->pdev->dev, yi->blanking_ptr, 720 * 16, DMA_TO_DEVICE); + if (dma_mapping_error(&itv->pdev->dev, yi->blanking_dmaptr)) { + kfree(yi->blanking_ptr); + yi->blanking_ptr = NULL; + yi->blanking_dmaptr = 0; + IVTV_DEBUG_WARN("Failed to dma_map yuv blanking buffer\n"); + } } else { yi->blanking_dmaptr = 0; IVTV_DEBUG_WARN("Failed to allocate yuv blanking buffer\n");
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Lechner dlechner@baylibre.com
commit c0d3f6969bb4d72476cfe7ea9263831f1c283704 upstream.
Fix potential leak of uninitialized stack data to userspace by ensuring that the `scan` structure is zeroed before use.
Fixes: 0ab13674a9bd ("media: pci: mgb4: Added Digiteq Automotive MGB4 driver") Cc: stable@vger.kernel.org Signed-off-by: David Lechner dlechner@baylibre.com Reviewed-by: Jonathan Cameron jonathan.cameron@huawei.com Signed-off-by: Hans Verkuil hverkuil+cisco@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/pci/mgb4/mgb4_trigger.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/media/pci/mgb4/mgb4_trigger.c +++ b/drivers/media/pci/mgb4/mgb4_trigger.c @@ -91,7 +91,7 @@ static irqreturn_t trigger_handler(int i struct { u32 data; s64 ts __aligned(8); - } scan; + } scan = { };
scan.data = mgb4_read_reg(&st->mgbdev->video, 0xA0); mgb4_write_reg(&st->mgbdev->video, 0xA0, scan.data);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arnd Bergmann arnd@arndb.de
commit 7fa37ba25a1dfc084e24ea9acc14bf1fad8af14c upstream.
The s5p_mfc_cmd_args structure in the v6 driver is never used, not initialized to anything other than zero, but as of clang-21 this causes a warning:
drivers/media/platform/samsung/s5p-mfc/s5p_mfc_cmd_v6.c:45:7: error: variable 'h2r_args' is uninitialized when passed as a const pointer argument here [-Werror,-Wuninitialized-const-pointer] 45 | &h2r_args); | ^~~~~~~~
Just remove this for simplicity. Since the function is also called through a callback, this does require adding a trivial wrapper with the correct prototype.
Fixes: f96f3cfa0bb8 ("[media] s5p-mfc: Update MFC v4l2 driver to support MFC6.x") Cc: stable@vger.kernel.org Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Hans Verkuil hverkuil+cisco@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/platform/samsung/s5p-mfc/s5p_mfc_cmd_v6.c | 35 +++++----------- 1 file changed, 13 insertions(+), 22 deletions(-)
--- a/drivers/media/platform/samsung/s5p-mfc/s5p_mfc_cmd_v6.c +++ b/drivers/media/platform/samsung/s5p-mfc/s5p_mfc_cmd_v6.c @@ -14,8 +14,7 @@ #include "s5p_mfc_opr.h" #include "s5p_mfc_cmd_v6.h"
-static int s5p_mfc_cmd_host2risc_v6(struct s5p_mfc_dev *dev, int cmd, - const struct s5p_mfc_cmd_args *args) +static int s5p_mfc_cmd_host2risc_v6(struct s5p_mfc_dev *dev, int cmd) { mfc_debug(2, "Issue the command: %d\n", cmd);
@@ -31,7 +30,6 @@ static int s5p_mfc_cmd_host2risc_v6(stru
static int s5p_mfc_sys_init_cmd_v6(struct s5p_mfc_dev *dev) { - struct s5p_mfc_cmd_args h2r_args; const struct s5p_mfc_buf_size_v6 *buf_size = dev->variant->buf_size->priv; int ret;
@@ -41,33 +39,23 @@ static int s5p_mfc_sys_init_cmd_v6(struc
mfc_write(dev, dev->ctx_buf.dma, S5P_FIMV_CONTEXT_MEM_ADDR_V6); mfc_write(dev, buf_size->dev_ctx, S5P_FIMV_CONTEXT_MEM_SIZE_V6); - return s5p_mfc_cmd_host2risc_v6(dev, S5P_FIMV_H2R_CMD_SYS_INIT_V6, - &h2r_args); + return s5p_mfc_cmd_host2risc_v6(dev, S5P_FIMV_H2R_CMD_SYS_INIT_V6); }
static int s5p_mfc_sleep_cmd_v6(struct s5p_mfc_dev *dev) { - struct s5p_mfc_cmd_args h2r_args; - - memset(&h2r_args, 0, sizeof(struct s5p_mfc_cmd_args)); - return s5p_mfc_cmd_host2risc_v6(dev, S5P_FIMV_H2R_CMD_SLEEP_V6, - &h2r_args); + return s5p_mfc_cmd_host2risc_v6(dev, S5P_FIMV_H2R_CMD_SLEEP_V6); }
static int s5p_mfc_wakeup_cmd_v6(struct s5p_mfc_dev *dev) { - struct s5p_mfc_cmd_args h2r_args; - - memset(&h2r_args, 0, sizeof(struct s5p_mfc_cmd_args)); - return s5p_mfc_cmd_host2risc_v6(dev, S5P_FIMV_H2R_CMD_WAKEUP_V6, - &h2r_args); + return s5p_mfc_cmd_host2risc_v6(dev, S5P_FIMV_H2R_CMD_WAKEUP_V6); }
/* Open a new instance and get its number */ static int s5p_mfc_open_inst_cmd_v6(struct s5p_mfc_ctx *ctx) { struct s5p_mfc_dev *dev = ctx->dev; - struct s5p_mfc_cmd_args h2r_args; int codec_type;
mfc_debug(2, "Requested codec mode: %d\n", ctx->codec_mode); @@ -129,23 +117,20 @@ static int s5p_mfc_open_inst_cmd_v6(stru mfc_write(dev, ctx->ctx.size, S5P_FIMV_CONTEXT_MEM_SIZE_V6); mfc_write(dev, 0, S5P_FIMV_D_CRC_CTRL_V6); /* no crc */
- return s5p_mfc_cmd_host2risc_v6(dev, S5P_FIMV_H2R_CMD_OPEN_INSTANCE_V6, - &h2r_args); + return s5p_mfc_cmd_host2risc_v6(dev, S5P_FIMV_H2R_CMD_OPEN_INSTANCE_V6); }
/* Close instance */ static int s5p_mfc_close_inst_cmd_v6(struct s5p_mfc_ctx *ctx) { struct s5p_mfc_dev *dev = ctx->dev; - struct s5p_mfc_cmd_args h2r_args; int ret = 0;
dev->curr_ctx = ctx->num; if (ctx->state != MFCINST_FREE) { mfc_write(dev, ctx->inst_no, S5P_FIMV_INSTANCE_ID_V6); ret = s5p_mfc_cmd_host2risc_v6(dev, - S5P_FIMV_H2R_CMD_CLOSE_INSTANCE_V6, - &h2r_args); + S5P_FIMV_H2R_CMD_CLOSE_INSTANCE_V6); } else { ret = -EINVAL; } @@ -153,9 +138,15 @@ static int s5p_mfc_close_inst_cmd_v6(str return ret; }
+static int s5p_mfc_cmd_host2risc_v6_args(struct s5p_mfc_dev *dev, int cmd, + const struct s5p_mfc_cmd_args *ignored) +{ + return s5p_mfc_cmd_host2risc_v6(dev, cmd); +} + /* Initialize cmd function pointers for MFC v6 */ static const struct s5p_mfc_hw_cmds s5p_mfc_cmds_v6 = { - .cmd_host2risc = s5p_mfc_cmd_host2risc_v6, + .cmd_host2risc = s5p_mfc_cmd_host2risc_v6_args, .sys_init_cmd = s5p_mfc_sys_init_cmd_v6, .sleep_cmd = s5p_mfc_sleep_cmd_v6, .wakeup_cmd = s5p_mfc_wakeup_cmd_v6,
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stephan Gerhold stephan.gerhold@linaro.org
commit 93f213b444a40f1e7a4383b499b65e782dcb14b9 upstream.
When starting venus with the "no_tz" code path, IRIS2 needs the same boot/reset sequence as IRIS2_1. This is because most of the registers were moved to the "wrapper_tz_base", which is already defined for both IRIS2 and IRIS2_1 inside core.c. Add IRIS2 to the checks inside firmware.c as well to make sure that it uses the correct reset sequence.
Both IRIS2 and IRIS2_1 are HFI v6 variants, so the correct sequence was used before commit c38610f8981e ("media: venus: firmware: Sanitize per-VPU-version").
Fixes: c38610f8981e ("media: venus: firmware: Sanitize per-VPU-version") Cc: stable@vger.kernel.org Signed-off-by: Stephan Gerhold stephan.gerhold@linaro.org Reviewed-by: Vikash Garodia quic_vgarodia@quicinc.com Reviewed-by: Dikshita Agarwal quic_dikshita@quicinc.com Reviewed-by: Bryan O'Donoghue bryan.odonoghue@linaro.org Reviewed-by: Dmitry Baryshkov dmitry.baryshkov@oss.qualcomm.com [bod: Fixed commit log IRIS -> IRIS2] Signed-off-by: Bryan O'Donoghue bod@kernel.org Signed-off-by: Hans Verkuil hverkuil+cisco@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/platform/qcom/venus/firmware.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/drivers/media/platform/qcom/venus/firmware.c +++ b/drivers/media/platform/qcom/venus/firmware.c @@ -30,7 +30,7 @@ static void venus_reset_cpu(struct venus u32 fw_size = core->fw.mapped_mem_size; void __iomem *wrapper_base;
- if (IS_IRIS2_1(core)) + if (IS_IRIS2(core) || IS_IRIS2_1(core)) wrapper_base = core->wrapper_tz_base; else wrapper_base = core->wrapper_base; @@ -42,7 +42,7 @@ static void venus_reset_cpu(struct venus writel(fw_size, wrapper_base + WRAPPER_NONPIX_START_ADDR); writel(fw_size, wrapper_base + WRAPPER_NONPIX_END_ADDR);
- if (IS_IRIS2_1(core)) { + if (IS_IRIS2(core) || IS_IRIS2_1(core)) { /* Bring XTSS out of reset */ writel(0, wrapper_base + WRAPPER_TZ_XTSS_SW_RESET); } else { @@ -68,7 +68,7 @@ int venus_set_hw_state(struct venus_core if (resume) { venus_reset_cpu(core); } else { - if (IS_IRIS2_1(core)) + if (IS_IRIS2(core) || IS_IRIS2_1(core)) writel(WRAPPER_XTSS_SW_RESET_BIT, core->wrapper_tz_base + WRAPPER_TZ_XTSS_SW_RESET); else @@ -181,7 +181,7 @@ static int venus_shutdown_no_tz(struct v void __iomem *wrapper_base = core->wrapper_base; void __iomem *wrapper_tz_base = core->wrapper_tz_base;
- if (IS_IRIS2_1(core)) { + if (IS_IRIS2(core) || IS_IRIS2_1(core)) { /* Assert the reset to XTSS */ reg = readl(wrapper_tz_base + WRAPPER_TZ_XTSS_SW_RESET); reg |= WRAPPER_XTSS_SW_RESET_BIT;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hans Verkuil hverkuil+cisco@kernel.org
commit 4bd8a6147645480d550242ff816b4c7ba160e5b7 upstream.
The vivid driver supports the <Vendor Command With ID> message, but if the Vendor ID of the received message didn't match the Vendor ID of the CEC Adapter, then it ignores it (good) and returns 0 (bad).
It should return -ENOMSG to indicate that other followers should be asked to handle it. Return code 0 means that the driver handled it, which is wrong in this case.
As a result, userspace followers never get the chance to process such a message.
Refactor the code a bit to have the function return -ENOMSG at the end, drop the default case, and ensure that the message handlers return 0.
That way 0 is only returned if the message is actually handled in the vivid_received() function.
Fixes: 812765cd6954 ("media: vivid: add <Vendor Command With ID> support") Cc: stable@vger.kernel.org Signed-off-by: Hans Verkuil hverkuil+cisco@kernel.org Signed-off-by: Mauro Carvalho Chehab mchehab+huawei@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/test-drivers/vivid/vivid-cec.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/drivers/media/test-drivers/vivid/vivid-cec.c b/drivers/media/test-drivers/vivid/vivid-cec.c index 356a988dd6a1..2d15fdd5d999 100644 --- a/drivers/media/test-drivers/vivid/vivid-cec.c +++ b/drivers/media/test-drivers/vivid/vivid-cec.c @@ -327,7 +327,7 @@ static int vivid_received(struct cec_adapter *adap, struct cec_msg *msg) char osd[14];
if (!cec_is_sink(adap)) - return -ENOMSG; + break; cec_ops_set_osd_string(msg, &disp_ctl, osd); switch (disp_ctl) { case CEC_OP_DISP_CTL_DEFAULT: @@ -348,7 +348,7 @@ static int vivid_received(struct cec_adapter *adap, struct cec_msg *msg) cec_transmit_msg(adap, &reply, false); break; } - break; + return 0; } case CEC_MSG_VENDOR_COMMAND_WITH_ID: { u32 vendor_id; @@ -379,7 +379,7 @@ static int vivid_received(struct cec_adapter *adap, struct cec_msg *msg) if (size == 1) { // Ignore even op values if (!(vendor_cmd[0] & 1)) - break; + return 0; reply.len = msg->len; memcpy(reply.msg + 1, msg->msg + 1, msg->len - 1); reply.msg[msg->len - 1]++; @@ -388,12 +388,10 @@ static int vivid_received(struct cec_adapter *adap, struct cec_msg *msg) CEC_OP_ABORT_INVALID_OP); } cec_transmit_msg(adap, &reply, false); - break; + return 0; } - default: - return -ENOMSG; } - return 0; + return -ENOMSG; }
static const struct cec_adap_ops vivid_cec_adap_ops = {
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jai Luthra jai.luthra@ideasonboard.com
commit 072799db233f9de90a62be54c1e59275c2db3969 upstream.
Ensure that we clean up the platform bus when we remove this driver.
This fixes a crash seen when reloading the module for the child device with the parent not yet reloaded.
Fixes: b4a3d877dc92 ("media: ti: Add CSI2RX support for J721E") Cc: stable@vger.kernel.org Reviewed-by: Devarsh Thakkar devarsht@ti.com Tested-by: Yemike Abhilash Chandra y-abhilashchandra@ti.com (on SK-AM68) Signed-off-by: Jai Luthra jai.luthra@ideasonboard.com Signed-off-by: Sakari Ailus sakari.ailus@linux.intel.com Signed-off-by: Hans Verkuil hverkuil+cisco@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/platform/ti/j721e-csi2rx/j721e-csi2rx.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/media/platform/ti/j721e-csi2rx/j721e-csi2rx.c +++ b/drivers/media/platform/ti/j721e-csi2rx/j721e-csi2rx.c @@ -1121,7 +1121,7 @@ static int ti_csi2rx_probe(struct platfo if (ret) goto err_vb2q;
- ret = of_platform_populate(csi->dev->of_node, NULL, NULL, csi->dev); + ret = devm_of_platform_populate(csi->dev); if (ret) { dev_err(csi->dev, "Failed to create children: %d\n", ret); goto err_subdev;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jai Luthra jai.luthra@ideasonboard.com
commit 3e743cd0a73246219da117ee99061aad51c4748c upstream.
We don't use OF ports and remote-endpoints to connect the CSI2RX bridge and this device in the device tree, thus it is wrong to use v4l2_create_fwnode_links_to_pad() to create the media graph link between the two.
It works out on accident, as neither the source nor the sink implement the .get_fwnode_pad() callback, and the framework helper falls back on using the first source and sink pads to create the link between them.
Instead, manually create the media link from the first source pad of the bridge to the first sink pad of the J721E CSI2RX.
Fixes: b4a3d877dc92 ("media: ti: Add CSI2RX support for J721E") Cc: stable@vger.kernel.org Reviewed-by: Devarsh Thakkar devarsht@ti.com Tested-by: Yemike Abhilash Chandra y-abhilashchandra@ti.com (on SK-AM68) Signed-off-by: Jai Luthra jai.luthra@ideasonboard.com Signed-off-by: Sakari Ailus sakari.ailus@linux.intel.com Signed-off-by: Hans Verkuil hverkuil+cisco@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/platform/ti/j721e-csi2rx/j721e-csi2rx.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
--- a/drivers/media/platform/ti/j721e-csi2rx/j721e-csi2rx.c +++ b/drivers/media/platform/ti/j721e-csi2rx/j721e-csi2rx.c @@ -52,6 +52,8 @@ #define DRAIN_TIMEOUT_MS 50 #define DRAIN_BUFFER_SIZE SZ_32K
+#define CSI2RX_BRIDGE_SOURCE_PAD 1 + struct ti_csi2rx_fmt { u32 fourcc; /* Four character code. */ u32 code; /* Mbus code. */ @@ -426,8 +428,9 @@ static int csi_async_notifier_complete(s if (ret) return ret;
- ret = v4l2_create_fwnode_links_to_pad(csi->source, &csi->pad, - MEDIA_LNK_FL_IMMUTABLE | MEDIA_LNK_FL_ENABLED); + ret = media_create_pad_link(&csi->source->entity, CSI2RX_BRIDGE_SOURCE_PAD, + &vdev->entity, csi->pad.index, + MEDIA_LNK_FL_IMMUTABLE | MEDIA_LNK_FL_ENABLED);
if (ret) { video_unregister_device(vdev);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ma Ke make24@iscas.ac.cn
commit 4f4098c57e139ad972154077fb45c3e3141555dd upstream.
When cdev_device_add() failed, calling put_device() to explicitly release dev->lirc_dev. Otherwise, it could cause the fault of the reference count.
Found by code review.
Cc: stable@vger.kernel.org Fixes: a6ddd4fecbb0 ("media: lirc: remove last remnants of lirc kapi") Signed-off-by: Ma Ke make24@iscas.ac.cn Signed-off-by: Sean Young sean@mess.org Signed-off-by: Hans Verkuil hverkuil+cisco@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/rc/lirc_dev.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
--- a/drivers/media/rc/lirc_dev.c +++ b/drivers/media/rc/lirc_dev.c @@ -736,11 +736,11 @@ int lirc_register(struct rc_dev *dev)
cdev_init(&dev->lirc_cdev, &lirc_fops);
+ get_device(&dev->dev); + err = cdev_device_add(&dev->lirc_cdev, &dev->lirc_dev); if (err) - goto out_ida; - - get_device(&dev->dev); + goto out_put_device;
switch (dev->driver_type) { case RC_DRIVER_SCANCODE: @@ -764,7 +764,8 @@ int lirc_register(struct rc_dev *dev)
return 0;
-out_ida: +out_put_device: + put_device(&dev->lirc_dev); ida_free(&lirc_ida, minor); return err; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jann Horn jannh@google.com
commit ca2a6abdaee43808034cdb218428d2ed85fd3db8 upstream.
When bailing out due to group_priority_permit() failure, the queue_args need to be freed. Fix it by rearranging the function to use the goto-on-error pattern, such that the success case flows straight without indentation while error cases jump forward to cleanup.
Cc: stable@vger.kernel.org Fixes: 5f7762042f8a ("drm/panthor: Restrict high priorities on group_create") Signed-off-by: Jann Horn jannh@google.com Reviewed-by: Boris Brezillon boris.brezillon@collabora.com Reviewed-by: Liviu Dudau liviu.dudau@arm.com Reviewed-by: Steven Price steven.price@arm.com Signed-off-by: Steven Price steven.price@arm.com Link: https://lore.kernel.org/r/20241113-panthor-fix-gcq-bailout-v1-1-654307254d68... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/panthor/panthor_drv.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
--- a/drivers/gpu/drm/panthor/panthor_drv.c +++ b/drivers/gpu/drm/panthor/panthor_drv.c @@ -1032,14 +1032,15 @@ static int panthor_ioctl_group_create(st
ret = group_priority_permit(file, args->priority); if (ret) - return ret; + goto out;
ret = panthor_group_create(pfile, args, queue_args); - if (ret >= 0) { - args->group_handle = ret; - ret = 0; - } + if (ret < 0) + goto out; + args->group_handle = ret; + ret = 0;
+out: kvfree(queue_args); return ret; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marek Vasut marek.vasut+renesas@mailbox.org
commit d83f1d19c898ac1b54ae64d1c950f5beff801982 upstream.
Remove fixed PPI lane count setup. The R-Car DSI host is capable of operating in 1..4 DSI lane mode. Remove the hard-coded 4-lane configuration from PPI register settings and instead configure the PPI lane count according to lane count information already obtained by this driver instance.
Configure TXSETR register to match PPI lane count. The R-Car V4H Reference Manual R19UH0186EJ0121 Rev.1.21 section 67.2.2.3 Tx Set Register (TXSETR), field LANECNT description indicates that the TXSETR register LANECNT bitfield lane count must be configured such, that it matches lane count configuration in PPISETR register DLEN bitfield. Make sure the LANECNT and DLEN bitfields are configured to match.
Fixes: 155358310f01 ("drm: rcar-du: Add R-Car DSI driver") Cc: stable@vger.kernel.org Signed-off-by: Marek Vasut marek.vasut+renesas@mailbox.org Reviewed-by: Tomi Valkeinen tomi.valkeinen+renesas@ideasonboard.com Link: https://lore.kernel.org/r/20250813210840.97621-1-marek.vasut+renesas@mailbox... Signed-off-by: Tomi Valkeinen tomi.valkeinen@ideasonboard.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/renesas/rcar-du/rcar_mipi_dsi.c | 5 ++++- drivers/gpu/drm/renesas/rcar-du/rcar_mipi_dsi_regs.h | 8 ++++---- 2 files changed, 8 insertions(+), 5 deletions(-)
--- a/drivers/gpu/drm/renesas/rcar-du/rcar_mipi_dsi.c +++ b/drivers/gpu/drm/renesas/rcar-du/rcar_mipi_dsi.c @@ -576,7 +576,10 @@ static int rcar_mipi_dsi_startup(struct udelay(10); rcar_mipi_dsi_clr(dsi, CLOCKSET1, CLOCKSET1_UPDATEPLL);
- ppisetr = PPISETR_DLEN_3 | PPISETR_CLEN; + rcar_mipi_dsi_clr(dsi, TXSETR, TXSETR_LANECNT_MASK); + rcar_mipi_dsi_set(dsi, TXSETR, dsi->lanes - 1); + + ppisetr = ((BIT(dsi->lanes) - 1) & PPISETR_DLEN_MASK) | PPISETR_CLEN; rcar_mipi_dsi_write(dsi, PPISETR, ppisetr);
rcar_mipi_dsi_set(dsi, PHYSETUP, PHYSETUP_SHUTDOWNZ); --- a/drivers/gpu/drm/renesas/rcar-du/rcar_mipi_dsi_regs.h +++ b/drivers/gpu/drm/renesas/rcar-du/rcar_mipi_dsi_regs.h @@ -12,6 +12,9 @@ #define LINKSR_LPBUSY (1 << 1) #define LINKSR_HSBUSY (1 << 0)
+#define TXSETR 0x100 +#define TXSETR_LANECNT_MASK (0x3 << 0) + /* * Video Mode Register */ @@ -80,10 +83,7 @@ * PHY-Protocol Interface (PPI) Registers */ #define PPISETR 0x700 -#define PPISETR_DLEN_0 (0x1 << 0) -#define PPISETR_DLEN_1 (0x3 << 0) -#define PPISETR_DLEN_2 (0x7 << 0) -#define PPISETR_DLEN_3 (0xf << 0) +#define PPISETR_DLEN_MASK (0xf << 0) #define PPISETR_CLEN (1 << 8)
#define PPICLCR 0x710
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shuhao Fu sfual@cse.ust.hk
commit e4bea919584ff292c9156cf7d641a2ab3cbe27b0 upstream.
In `nouveau_bo_move_prep`, if `nouveau_mem_map` fails, an error code should be returned. Currently, it returns zero even if vmm addr is not correctly mapped.
Cc: stable@vger.kernel.org Reviewed-by: Petr Vorel pvorel@suse.cz Signed-off-by: Shuhao Fu sfual@cse.ust.hk Fixes: 9ce523cc3bf2 ("drm/nouveau: separate buffer object backing memory from nvkm structures") Signed-off-by: Danilo Krummrich dakr@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/nouveau/nouveau_bo.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c +++ b/drivers/gpu/drm/nouveau/nouveau_bo.c @@ -852,7 +852,7 @@ done: nvif_vmm_put(vmm, &old_mem->vma[1]); nvif_vmm_put(vmm, &old_mem->vma[0]); } - return 0; + return ret; }
static int
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthew Auld matthew.auld@intel.com
commit 2d1684a077d62fddfac074052c162ec6573a34e1 upstream.
Currently this is hidden behind perfmon_capable() since this is technically an info leak, given that this is a system wide metric. However the granularity reported here is always PAGE_SIZE aligned, which matches what the core kernel is already willing to expose to userspace if querying how many free RAM pages there are on the system, and that doesn't need any special privileges. In addition other drm drivers seem happy to expose this.
The motivation here if with oneAPI where they want to use the system wide 'used' reporting here, so not the per-client fdinfo stats. This has also come up with some perf overlay applications wanting this information.
Fixes: 1105ac15d2a1 ("drm/xe/uapi: restrict system wide accounting") Signed-off-by: Matthew Auld matthew.auld@intel.com Cc: Thomas Hellström thomas.hellstrom@linux.intel.com Cc: Joshua Santosh joshua.santosh.ranjan@intel.com Cc: José Roberto de Souza jose.souza@intel.com Cc: Matthew Brost matthew.brost@intel.com Cc: Rodrigo Vivi rodrigo.vivi@intel.com Cc: stable@vger.kernel.org # v6.8+ Acked-by: Rodrigo Vivi rodrigo.vivi@intel.com Reviewed-by: Lucas De Marchi lucas.demarchi@intel.com Link: https://lore.kernel.org/r/20250919122052.420979-2-matthew.auld@intel.com (cherry picked from commit 4d0b035fd6dae8ee48e9c928b10f14877e595356) Signed-off-by: Lucas De Marchi lucas.demarchi@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/xe/xe_query.c | 15 ++++++--------- 1 file changed, 6 insertions(+), 9 deletions(-)
--- a/drivers/gpu/drm/xe/xe_query.c +++ b/drivers/gpu/drm/xe/xe_query.c @@ -277,8 +277,7 @@ static int query_mem_regions(struct xe_d mem_regions->mem_regions[0].instance = 0; mem_regions->mem_regions[0].min_page_size = PAGE_SIZE; mem_regions->mem_regions[0].total_size = man->size << PAGE_SHIFT; - if (perfmon_capable()) - mem_regions->mem_regions[0].used = ttm_resource_manager_usage(man); + mem_regions->mem_regions[0].used = ttm_resource_manager_usage(man); mem_regions->num_mem_regions = 1;
for (i = XE_PL_VRAM0; i <= XE_PL_VRAM1; ++i) { @@ -294,13 +293,11 @@ static int query_mem_regions(struct xe_d mem_regions->mem_regions[mem_regions->num_mem_regions].total_size = man->size;
- if (perfmon_capable()) { - xe_ttm_vram_get_used(man, - &mem_regions->mem_regions - [mem_regions->num_mem_regions].used, - &mem_regions->mem_regions - [mem_regions->num_mem_regions].cpu_visible_used); - } + xe_ttm_vram_get_used(man, + &mem_regions->mem_regions + [mem_regions->num_mem_regions].used, + &mem_regions->mem_regions + [mem_regions->num_mem_regions].cpu_visible_used);
mem_regions->mem_regions[mem_regions->num_mem_regions].cpu_visible_size = xe_ttm_vram_get_cpu_visible_size(man);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Fangzhi Zuo Jerry.Zuo@amd.com
commit 5949e7c4890c3cf65e783c83c355b95e21f10dba upstream.
[WHAT] Since dcn35, DTBCLK can be disabled when no DP2 sink connected for power saving purpose.
Cc: Mario Limonciello mario.limonciello@amd.com Cc: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Reviewed-by: Aurabindo Pillai aurabindo.pillai@amd.com Signed-off-by: Fangzhi Zuo Jerry.Zuo@amd.com Signed-off-by: Alex Hung alex.hung@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -2027,6 +2027,10 @@ static int amdgpu_dm_init(struct amdgpu_
init_data.flags.disable_ips_in_vpb = 0;
+ /* DCN35 and above supports dynamic DTBCLK switch */ + if (amdgpu_ip_version(adev, DCE_HWIP, 0) >= IP_VERSION(3, 5, 0)) + init_data.flags.allow_0_dtb_clk = true; + /* Enable DWB for tested platforms only */ if (amdgpu_ip_version(adev, DCE_HWIP, 0) >= IP_VERSION(3, 0, 0)) init_data.num_virtual_links = 1;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yu Kuai yukuai3@huawei.com
commit 06d712d297649f48ebf1381d19bd24e942813b37 upstream.
trace_block_split() is missing, resulting in blktrace inability to catch BIO split events and making it harder to analyze the BIO sequence.
Cc: stable@vger.kernel.org Fixes: 488f6682c832 ("block: blk-crypto-fallback for Inline Encryption") Signed-off-by: Yu Kuai yukuai3@huawei.com Reviewed-by: Bart Van Assche bvanassche@acm.org Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- block/blk-crypto-fallback.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/block/blk-crypto-fallback.c +++ b/block/blk-crypto-fallback.c @@ -18,6 +18,7 @@ #include <linux/module.h> #include <linux/random.h> #include <linux/scatterlist.h> +#include <trace/events/block.h>
#include "blk-cgroup.h" #include "blk-crypto-internal.h" @@ -230,7 +231,9 @@ static bool blk_crypto_fallback_split_bi bio->bi_status = BLK_STS_RESOURCE; return false; } + bio_chain(split_bio, bio); + trace_block_split(split_bio, bio->bi_iter.bi_sector); submit_bio_noacct(bio); *bio_ptr = split_bio; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Anderson Nascimento anderson@allelesecurity.com
commit dff4f9ff5d7f289e4545cc936362e01ed3252742 upstream.
The function btrfs_encode_fh() does not properly account for the three cases it handles.
Before writing to the file handle (fh), the function only returns to the user BTRFS_FID_SIZE_NON_CONNECTABLE (5 dwords, 20 bytes) or BTRFS_FID_SIZE_CONNECTABLE (8 dwords, 32 bytes).
However, when a parent exists and the root ID of the parent and the inode are different, the function writes BTRFS_FID_SIZE_CONNECTABLE_ROOT (10 dwords, 40 bytes).
If *max_len is not large enough, this write goes out of bounds because BTRFS_FID_SIZE_CONNECTABLE_ROOT is greater than BTRFS_FID_SIZE_CONNECTABLE originally returned.
This results in an 8-byte out-of-bounds write at fid->parent_root_objectid = parent_root_id.
A previous attempt to fix this issue was made but was lost.
https://lore.kernel.org/all/4CADAEEC020000780001B32C@vpn.id2.novell.com/
Although this issue does not seem to be easily triggerable, it is a potential memory corruption bug that should be fixed. This patch resolves the issue by ensuring the function returns the appropriate size for all three cases and validates that *max_len is large enough before writing any data.
Fixes: be6e8dc0ba84 ("NFS support for btrfs - v3") CC: stable@vger.kernel.org # 3.0+ Signed-off-by: Anderson Nascimento anderson@allelesecurity.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/export.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/fs/btrfs/export.c +++ b/fs/btrfs/export.c @@ -23,7 +23,11 @@ static int btrfs_encode_fh(struct inode int type;
if (parent && (len < BTRFS_FID_SIZE_CONNECTABLE)) { - *max_len = BTRFS_FID_SIZE_CONNECTABLE; + if (btrfs_root_id(BTRFS_I(inode)->root) != + btrfs_root_id(BTRFS_I(parent)->root)) + *max_len = BTRFS_FID_SIZE_CONNECTABLE_ROOT; + else + *max_len = BTRFS_FID_SIZE_CONNECTABLE; return FILEID_INVALID; } else if (len < BTRFS_FID_SIZE_NON_CONNECTABLE) { *max_len = BTRFS_FID_SIZE_NON_CONNECTABLE; @@ -45,6 +49,8 @@ static int btrfs_encode_fh(struct inode parent_root_id = btrfs_root_id(BTRFS_I(parent)->root);
if (parent_root_id != fid->root_objectid) { + if (*max_len < BTRFS_FID_SIZE_CONNECTABLE_ROOT) + return FILEID_INVALID; fid->parent_root_objectid = parent_root_id; len = BTRFS_FID_SIZE_CONNECTABLE_ROOT; type = FILEID_BTRFS_WITH_PARENT_ROOT;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sumit Kumar sumit.kumar@oss.qualcomm.com
commit f5225a34bd8f9f64eec37f6ae1461289aaa3eb86 upstream.
The mhi_ep_read_channel function incorrectly assumes the End of Transfer (EOT) bit is present for each packet in a chained transactions, causing it to advance mhi_chan->rd_offset beyond wr_offset during host-to-device transfers when EOT has not yet arrived. This leads to access of unmapped host memory, causing IOMMU faults and processing of stale TREs.
Modify the loop condition to ensure mhi_queue is not empty, allowing the function to process only valid TREs up to the current write pointer to prevent premature reads and ensure safe traversal of chained TREs.
Due to this change, buf_left needs to be removed from the while loop condition to avoid exiting prematurely before reading the ring completely, and also remove write_offset since it will always be zero because the new cache buffer is allocated every time.
Fixes: 5301258899773 ("bus: mhi: ep: Add support for reading from the host") Co-developed-by: Akhil Vinod akhil.vinod@oss.qualcomm.com Signed-off-by: Akhil Vinod akhil.vinod@oss.qualcomm.com Signed-off-by: Sumit Kumar sumit.kumar@oss.qualcomm.com [mani: reworded description slightly] Signed-off-by: Manivannan Sadhasivam manivannan.sadhasivam@oss.qualcomm.com Reviewed-by: Krishna Chaitanya Chundru krishna.chundru@oss.qualcomm.com Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250910-final_chained-v3-1-ec77c9d88ace@oss.qualco... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/bus/mhi/ep/main.c | 37 ++++++++++++------------------------- 1 file changed, 12 insertions(+), 25 deletions(-)
--- a/drivers/bus/mhi/ep/main.c +++ b/drivers/bus/mhi/ep/main.c @@ -403,17 +403,13 @@ static int mhi_ep_read_channel(struct mh { struct mhi_ep_chan *mhi_chan = &mhi_cntrl->mhi_chan[ring->ch_id]; struct device *dev = &mhi_cntrl->mhi_dev->dev; - size_t tr_len, read_offset, write_offset; + size_t tr_len, read_offset; struct mhi_ep_buf_info buf_info = {}; u32 len = MHI_EP_DEFAULT_MTU; struct mhi_ring_element *el; - bool tr_done = false; void *buf_addr; - u32 buf_left; int ret;
- buf_left = len; - do { /* Don't process the transfer ring if the channel is not in RUNNING state */ if (mhi_chan->state != MHI_CH_STATE_RUNNING) { @@ -426,24 +422,23 @@ static int mhi_ep_read_channel(struct mh /* Check if there is data pending to be read from previous read operation */ if (mhi_chan->tre_bytes_left) { dev_dbg(dev, "TRE bytes remaining: %u\n", mhi_chan->tre_bytes_left); - tr_len = min(buf_left, mhi_chan->tre_bytes_left); + tr_len = min(len, mhi_chan->tre_bytes_left); } else { mhi_chan->tre_loc = MHI_TRE_DATA_GET_PTR(el); mhi_chan->tre_size = MHI_TRE_DATA_GET_LEN(el); mhi_chan->tre_bytes_left = mhi_chan->tre_size;
- tr_len = min(buf_left, mhi_chan->tre_size); + tr_len = min(len, mhi_chan->tre_size); }
read_offset = mhi_chan->tre_size - mhi_chan->tre_bytes_left; - write_offset = len - buf_left;
buf_addr = kmem_cache_zalloc(mhi_cntrl->tre_buf_cache, GFP_KERNEL); if (!buf_addr) return -ENOMEM;
buf_info.host_addr = mhi_chan->tre_loc + read_offset; - buf_info.dev_addr = buf_addr + write_offset; + buf_info.dev_addr = buf_addr; buf_info.size = tr_len; buf_info.cb = mhi_ep_read_completion; buf_info.cb_buf = buf_addr; @@ -459,16 +454,12 @@ static int mhi_ep_read_channel(struct mh goto err_free_buf_addr; }
- buf_left -= tr_len; mhi_chan->tre_bytes_left -= tr_len;
- if (!mhi_chan->tre_bytes_left) { - if (MHI_TRE_DATA_GET_IEOT(el)) - tr_done = true; - + if (!mhi_chan->tre_bytes_left) mhi_chan->rd_offset = (mhi_chan->rd_offset + 1) % ring->ring_size; - } - } while (buf_left && !tr_done); + /* Read until the some buffer is left or the ring becomes not empty */ + } while (!mhi_ep_queue_is_empty(mhi_chan->mhi_dev, DMA_TO_DEVICE));
return 0;
@@ -502,15 +493,11 @@ static int mhi_ep_process_ch_ring(struct mhi_chan->xfer_cb(mhi_chan->mhi_dev, &result); } else { /* UL channel */ - do { - ret = mhi_ep_read_channel(mhi_cntrl, ring); - if (ret < 0) { - dev_err(&mhi_chan->mhi_dev->dev, "Failed to read channel\n"); - return ret; - } - - /* Read until the ring becomes empty */ - } while (!mhi_ep_queue_is_empty(mhi_chan->mhi_dev, DMA_TO_DEVICE)); + ret = mhi_ep_read_channel(mhi_cntrl, ring); + if (ret < 0) { + dev_err(&mhi_chan->mhi_dev->dev, "Failed to read channel\n"); + return ret; + } }
return 0;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Adam Xue zxue@semtech.com
commit d0856a6dff57f95cc5d2d74e50880f01697d0cc4 upstream.
In mhi_init_irq_setup, the device pointer used for dev_err() was not initialized. Use the pointer from mhi_cntrl instead.
Fixes: b0fc0167f254 ("bus: mhi: core: Allow shared IRQ for event rings") Fixes: 3000f85b8f47 ("bus: mhi: core: Add support for basic PM operations") Signed-off-by: Adam Xue zxue@semtech.com [mani: reworded subject/description and CCed stable] Signed-off-by: Manivannan Sadhasivam manivannan.sadhasivam@oss.qualcomm.com Reviewed-by: Krishna Chaitanya Chundru krishna.chundru@oss.qualcomm.com Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250905174118.38512-1-zxue@semtech.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/bus/mhi/host/init.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
--- a/drivers/bus/mhi/host/init.c +++ b/drivers/bus/mhi/host/init.c @@ -194,7 +194,6 @@ void mhi_deinit_free_irq(struct mhi_cont int mhi_init_irq_setup(struct mhi_controller *mhi_cntrl) { struct mhi_event *mhi_event = mhi_cntrl->mhi_event; - struct device *dev = &mhi_cntrl->mhi_dev->dev; unsigned long irq_flags = IRQF_SHARED | IRQF_NO_SUSPEND; int i, ret;
@@ -221,7 +220,7 @@ int mhi_init_irq_setup(struct mhi_contro continue;
if (mhi_event->irq >= mhi_cntrl->nr_irqs) { - dev_err(dev, "irq %d not available for event ring\n", + dev_err(mhi_cntrl->cntrl_dev, "irq %d not available for event ring\n", mhi_event->irq); ret = -EINVAL; goto error_request; @@ -232,7 +231,7 @@ int mhi_init_irq_setup(struct mhi_contro irq_flags, "mhi", mhi_event); if (ret) { - dev_err(dev, "Error requesting irq:%d for ev:%d\n", + dev_err(mhi_cntrl->cntrl_dev, "Error requesting irq:%d for ev:%d\n", mhi_cntrl->irq[mhi_event->irq], i); goto error_request; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Abel Vesa abel.vesa@linaro.org
commit 57c8e9da3dfe606b918d8f193837ebf2213a9545 upstream.
All the other ref clocks provided by this driver have the bi_tcxo as parent. The eDP refclk is the only one without a parent, leading to reporting its rate as 0. So set its parent to bi_tcxo, just like the rest of the refclks.
Cc: stable@vger.kernel.org # v6.9 Fixes: 06aff116199c ("clk: qcom: Add TCSR clock driver for x1e80100") Signed-off-by: Abel Vesa abel.vesa@linaro.org Reviewed-by: Dmitry Baryshkov dmitry.baryshkov@oss.qualcomm.com Link: https://lore.kernel.org/r/20250730-clk-qcom-tcsrcc-x1e80100-parent-edp-refcl... Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/clk/qcom/tcsrcc-x1e80100.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/clk/qcom/tcsrcc-x1e80100.c +++ b/drivers/clk/qcom/tcsrcc-x1e80100.c @@ -29,6 +29,10 @@ static struct clk_branch tcsr_edp_clkref .enable_mask = BIT(0), .hw.init = &(const struct clk_init_data) { .name = "tcsr_edp_clkref_en", + .parent_data = &(const struct clk_parent_data){ + .index = DT_BI_TCXO_PAD, + }, + .num_parents = 1, .ops = &clk_branch2_ops, }, },
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Simon Schuster schuster.simon@siemens-energy.com
commit 04ff48239f46e8b493571e260bd0e6c3a6400371 upstream.
With the introduction of clone3 in commit 7f192e3cd316 ("fork: add clone3") the effective bit width of clone_flags on all architectures was increased from 32-bit to 64-bit. However, the signature of the copy_* helper functions (e.g., copy_sighand) used by copy_process was not adapted.
As such, they truncate the flags on any 32-bit architectures that supports clone3 (arc, arm, csky, m68k, microblaze, mips32, openrisc, parisc32, powerpc32, riscv32, x86-32 and xtensa).
For copy_sighand with CLONE_CLEAR_SIGHAND being an actual u64 constant, this triggers an observable bug in kernel selftest clone3_clear_sighand:
if (clone_flags & CLONE_CLEAR_SIGHAND)
in function copy_sighand within fork.c will always fail given:
unsigned long /* == uint32_t */ clone_flags #define CLONE_CLEAR_SIGHAND 0x100000000ULL
This commit fixes the bug by always passing clone_flags to copy_sighand via their declared u64 type, invariant of architecture-dependent integer sizes.
Fixes: b612e5df4587 ("clone3: add CLONE_CLEAR_SIGHAND") Cc: stable@vger.kernel.org # linux-5.5+ Signed-off-by: Simon Schuster schuster.simon@siemens-energy.com Link: https://lore.kernel.org/20250901-nios2-implement-clone3-v2-1-53fcf5577d57@si... Acked-by: David Hildenbrand david@redhat.com Reviewed-by: Lorenzo Stoakes lorenzo.stoakes@oracle.com Reviewed-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/fork.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/fork.c +++ b/kernel/fork.c @@ -1807,7 +1807,7 @@ static int copy_files(unsigned long clon return 0; }
-static int copy_sighand(unsigned long clone_flags, struct task_struct *tsk) +static int copy_sighand(u64 clone_flags, struct task_struct *tsk) { struct sighand_struct *sig;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rafael J. Wysocki rafael.j.wysocki@intel.com
commit 69e5d50fcf4093fb3f9f41c4f931f12c2ca8c467 upstream.
The cpufreq_cpu_put() call in update_qos_request() takes place too early because the latter subsequently calls freq_qos_update_request() that indirectly accesses the policy object in question through the QoS request object passed to it.
Fortunately, update_qos_request() is called under intel_pstate_driver_lock, so this issue does not matter for changing the intel_pstate operation mode, but it theoretically can cause a crash to occur on CPU device hot removal (which currently can only happen in virt, but it is formally supported nevertheless).
Address this issue by modifying update_qos_request() to drop the reference to the policy later.
Fixes: da5c504c7aae ("cpufreq: intel_pstate: Implement QoS supported freq constraints") Cc: 5.4+ stable@vger.kernel.org # 5.4+ Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Reviewed-by: Zihuan Zhang zhangzihuan@kylinos.cn Link: https://patch.msgid.link/2255671.irdbgypaU6@rafael.j.wysocki Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/cpufreq/intel_pstate.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
--- a/drivers/cpufreq/intel_pstate.c +++ b/drivers/cpufreq/intel_pstate.c @@ -1582,10 +1582,10 @@ static void update_qos_request(enum freq continue;
req = policy->driver_data; - cpufreq_cpu_put(policy); - - if (!req) + if (!req) { + cpufreq_cpu_put(policy); continue; + }
if (hwp_active) intel_pstate_get_hwp_cap(cpu); @@ -1601,6 +1601,8 @@ static void update_qos_request(enum freq
if (freq_qos_update_request(req, freq) < 0) pr_warn("Failed to update freq constraint: CPU%d\n", i); + + cpufreq_cpu_put(policy); } }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Fourier fourier.thomas@gmail.com
commit 838d2d51513e6d2504a678e906823cfd2ecaaa22 upstream.
It seems like everywhere in this file, when the request is not bidirectionala, req->src is mapped with DMA_TO_DEVICE and req->dst is mapped with DMA_FROM_DEVICE.
Fixes: 62f58b1637b7 ("crypto: aspeed - add HACE crypto driver") Cc: stable@vger.kernel.org Signed-off-by: Thomas Fourier fourier.thomas@gmail.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/crypto/aspeed/aspeed-hace-crypto.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/crypto/aspeed/aspeed-hace-crypto.c +++ b/drivers/crypto/aspeed/aspeed-hace-crypto.c @@ -346,7 +346,7 @@ free_req:
} else { dma_unmap_sg(hace_dev->dev, req->dst, rctx->dst_nents, - DMA_TO_DEVICE); + DMA_FROM_DEVICE); dma_unmap_sg(hace_dev->dev, req->src, rctx->src_nents, DMA_TO_DEVICE); }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Fourier fourier.thomas@gmail.com
commit f5d643156ef62216955c119216d2f3815bd51cb1 upstream.
It seems like everywhere in this file, dd->in_sg is mapped with DMA_TO_DEVICE and dd->out_sg is mapped with DMA_FROM_DEVICE.
Fixes: 13802005d8f2 ("crypto: atmel - add Atmel DES/TDES driver") Cc: stable@vger.kernel.org Signed-off-by: Thomas Fourier fourier.thomas@gmail.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/crypto/atmel-tdes.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/crypto/atmel-tdes.c +++ b/drivers/crypto/atmel-tdes.c @@ -512,7 +512,7 @@ static int atmel_tdes_crypt_start(struct
if (err && (dd->flags & TDES_FLAGS_FAST)) { dma_unmap_sg(dd->dev, dd->in_sg, 1, DMA_TO_DEVICE); - dma_unmap_sg(dd->dev, dd->out_sg, 1, DMA_TO_DEVICE); + dma_unmap_sg(dd->dev, dd->out_sg, 1, DMA_FROM_DEVICE); }
return err;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Fourier fourier.thomas@gmail.com
commit 21140e5caf019e4a24e1ceabcaaa16bd693b393f upstream.
The dma_unmap_sg() functions should be called with the same nents as the dma_map_sg(), not the value the map function returned.
Fixes: 57d67c6e8219 ("crypto: rockchip - rework by using crypto_engine") Cc: stable@vger.kernel.org Signed-off-by: Thomas Fourier fourier.thomas@gmail.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/crypto/rockchip/rk3288_crypto_ahash.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/crypto/rockchip/rk3288_crypto_ahash.c +++ b/drivers/crypto/rockchip/rk3288_crypto_ahash.c @@ -252,7 +252,7 @@ static void rk_hash_unprepare(struct cry struct rk_ahash_rctx *rctx = ahash_request_ctx(areq); struct rk_crypto_info *rkc = rctx->dev;
- dma_unmap_sg(rkc->dev, areq->src, rctx->nrsg, DMA_TO_DEVICE); + dma_unmap_sg(rkc->dev, areq->src, sg_nents(areq->src), DMA_TO_DEVICE); }
static int rk_hash_run(struct crypto_engine *engine, void *breq)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nam Cao namcao@linutronix.de
commit 0c43094f8cc9d3d99d835c0ac9c4fe1ccc62babd upstream.
The ready event list of an epoll object is protected by read-write semaphore:
- The consumer (waiter) acquires the write lock and takes items. - the producer (waker) takes the read lock and adds items.
The point of this design is enabling epoll to scale well with large number of producers, as multiple producers can hold the read lock at the same time.
Unfortunately, this implementation may cause scheduling priority inversion problem. Suppose the consumer has higher scheduling priority than the producer. The consumer needs to acquire the write lock, but may be blocked by the producer holding the read lock. Since read-write semaphore does not support priority-boosting for the readers (even with CONFIG_PREEMPT_RT=y), we have a case of priority inversion: a higher priority consumer is blocked by a lower priority producer. This problem was reported in [1].
Furthermore, this could also cause stall problem, as described in [2].
Fix this problem by replacing rwlock with spinlock.
This reduces the event bandwidth, as the producers now have to contend with each other for the spinlock. According to the benchmark from https://github.com/rouming/test-tools/blob/master/stress-epoll.c:
On 12 x86 CPUs: Before After Diff threads events/ms events/ms 8 7162 4956 -31% 16 8733 5383 -38% 32 7968 5572 -30% 64 10652 5739 -46% 128 11236 5931 -47%
On 4 riscv CPUs: Before After Diff threads events/ms events/ms 8 2958 2833 -4% 16 3323 3097 -7% 32 3451 3240 -6% 64 3554 3178 -11% 128 3601 3235 -10%
Although the numbers look bad, it should be noted that this benchmark creates multiple threads who do nothing except constantly generating new epoll events, thus contention on the spinlock is high. For real workload, the event rate is likely much lower, and the performance drop is not as bad.
Using another benchmark (perf bench epoll wait) where spinlock contention is lower, improvement is even observed on x86:
On 12 x86 CPUs: Before: Averaged 110279 operations/sec (+- 1.09%), total secs = 8 After: Averaged 114577 operations/sec (+- 2.25%), total secs = 8
On 4 riscv CPUs: Before: Averaged 175767 operations/sec (+- 0.62%), total secs = 8 After: Averaged 167396 operations/sec (+- 0.23%), total secs = 8
In conclusion, no one is likely to be upset over this change. After all, spinlock was used originally for years, and the commit which converted to rwlock didn't mention a real workload, just that the benchmark numbers are nice.
This patch is not exactly the revert of commit a218cc491420 ("epoll: use rwlock in order to reduce ep_poll_callback() contention"), because git revert conflicts in some places which are not obvious on the resolution. This patch is intended to be backported, therefore go with the obvious approach:
- Replace rwlock_t with spinlock_t one to one
- Delete list_add_tail_lockless() and chain_epi_lockless(). These were introduced to allow producers to concurrently add items to the list. But now that spinlock no longer allows producers to touch the event list concurrently, these two functions are not necessary anymore.
Fixes: a218cc491420 ("epoll: use rwlock in order to reduce ep_poll_callback() contention") Signed-off-by: Nam Cao namcao@linutronix.de Link: https://lore.kernel.org/ec92458ea357ec503c737ead0f10b2c6e4c37d47.1752581388.... Tested-by: K Prateek Nayak kprateek.nayak@amd.com Cc: stable@vger.kernel.org Reported-by: Frederic Weisbecker frederic@kernel.org Closes: https://lore.kernel.org/linux-rt-users/20210825132754.GA895675@lothringen/ [1] Reported-by: Valentin Schneider vschneid@redhat.com Closes: https://lore.kernel.org/linux-rt-users/xhsmhttqvnall.mognet@vschneid.remote.... [2] Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/eventpoll.c | 139 ++++++++++----------------------------------------------- 1 file changed, 26 insertions(+), 113 deletions(-)
--- a/fs/eventpoll.c +++ b/fs/eventpoll.c @@ -46,10 +46,10 @@ * * 1) epnested_mutex (mutex) * 2) ep->mtx (mutex) - * 3) ep->lock (rwlock) + * 3) ep->lock (spinlock) * * The acquire order is the one listed above, from 1 to 3. - * We need a rwlock (ep->lock) because we manipulate objects + * We need a spinlock (ep->lock) because we manipulate objects * from inside the poll callback, that might be triggered from * a wake_up() that in turn might be called from IRQ context. * So we can't sleep inside the poll callback and hence we need @@ -195,7 +195,7 @@ struct eventpoll { struct list_head rdllist;
/* Lock which protects rdllist and ovflist */ - rwlock_t lock; + spinlock_t lock;
/* RB tree root used to store monitored fd structs */ struct rb_root_cached rbr; @@ -713,10 +713,10 @@ static void ep_start_scan(struct eventpo * in a lockless way. */ lockdep_assert_irqs_enabled(); - write_lock_irq(&ep->lock); + spin_lock_irq(&ep->lock); list_splice_init(&ep->rdllist, txlist); WRITE_ONCE(ep->ovflist, NULL); - write_unlock_irq(&ep->lock); + spin_unlock_irq(&ep->lock); }
static void ep_done_scan(struct eventpoll *ep, @@ -724,7 +724,7 @@ static void ep_done_scan(struct eventpol { struct epitem *epi, *nepi;
- write_lock_irq(&ep->lock); + spin_lock_irq(&ep->lock); /* * During the time we spent inside the "sproc" callback, some * other events might have been queued by the poll callback. @@ -765,7 +765,7 @@ static void ep_done_scan(struct eventpol wake_up(&ep->wq); }
- write_unlock_irq(&ep->lock); + spin_unlock_irq(&ep->lock); }
static void ep_get(struct eventpoll *ep) @@ -839,10 +839,10 @@ static bool __ep_remove(struct eventpoll
rb_erase_cached(&epi->rbn, &ep->rbr);
- write_lock_irq(&ep->lock); + spin_lock_irq(&ep->lock); if (ep_is_linked(epi)) list_del_init(&epi->rdllink); - write_unlock_irq(&ep->lock); + spin_unlock_irq(&ep->lock);
wakeup_source_unregister(ep_wakeup_source(epi)); /* @@ -1123,7 +1123,7 @@ static int ep_alloc(struct eventpoll **p return -ENOMEM;
mutex_init(&ep->mtx); - rwlock_init(&ep->lock); + spin_lock_init(&ep->lock); init_waitqueue_head(&ep->wq); init_waitqueue_head(&ep->poll_wait); INIT_LIST_HEAD(&ep->rdllist); @@ -1211,99 +1211,9 @@ struct file *get_epoll_tfile_raw_ptr(str #endif /* CONFIG_KCMP */
/* - * Adds a new entry to the tail of the list in a lockless way, i.e. - * multiple CPUs are allowed to call this function concurrently. - * - * Beware: it is necessary to prevent any other modifications of the - * existing list until all changes are completed, in other words - * concurrent list_add_tail_lockless() calls should be protected - * with a read lock, where write lock acts as a barrier which - * makes sure all list_add_tail_lockless() calls are fully - * completed. - * - * Also an element can be locklessly added to the list only in one - * direction i.e. either to the tail or to the head, otherwise - * concurrent access will corrupt the list. - * - * Return: %false if element has been already added to the list, %true - * otherwise. - */ -static inline bool list_add_tail_lockless(struct list_head *new, - struct list_head *head) -{ - struct list_head *prev; - - /* - * This is simple 'new->next = head' operation, but cmpxchg() - * is used in order to detect that same element has been just - * added to the list from another CPU: the winner observes - * new->next == new. - */ - if (!try_cmpxchg(&new->next, &new, head)) - return false; - - /* - * Initially ->next of a new element must be updated with the head - * (we are inserting to the tail) and only then pointers are atomically - * exchanged. XCHG guarantees memory ordering, thus ->next should be - * updated before pointers are actually swapped and pointers are - * swapped before prev->next is updated. - */ - - prev = xchg(&head->prev, new); - - /* - * It is safe to modify prev->next and new->prev, because a new element - * is added only to the tail and new->next is updated before XCHG. - */ - - prev->next = new; - new->prev = prev; - - return true; -} - -/* - * Chains a new epi entry to the tail of the ep->ovflist in a lockless way, - * i.e. multiple CPUs are allowed to call this function concurrently. - * - * Return: %false if epi element has been already chained, %true otherwise. - */ -static inline bool chain_epi_lockless(struct epitem *epi) -{ - struct eventpoll *ep = epi->ep; - - /* Fast preliminary check */ - if (epi->next != EP_UNACTIVE_PTR) - return false; - - /* Check that the same epi has not been just chained from another CPU */ - if (cmpxchg(&epi->next, EP_UNACTIVE_PTR, NULL) != EP_UNACTIVE_PTR) - return false; - - /* Atomically exchange tail */ - epi->next = xchg(&ep->ovflist, epi); - - return true; -} - -/* * This is the callback that is passed to the wait queue wakeup * mechanism. It is called by the stored file descriptors when they * have events to report. - * - * This callback takes a read lock in order not to contend with concurrent - * events from another file descriptor, thus all modifications to ->rdllist - * or ->ovflist are lockless. Read lock is paired with the write lock from - * ep_start/done_scan(), which stops all list modifications and guarantees - * that lists state is seen correctly. - * - * Another thing worth to mention is that ep_poll_callback() can be called - * concurrently for the same @epi from different CPUs if poll table was inited - * with several wait queues entries. Plural wakeup from different CPUs of a - * single wait queue is serialized by wq.lock, but the case when multiple wait - * queues are used should be detected accordingly. This is detected using - * cmpxchg() operation. */ static int ep_poll_callback(wait_queue_entry_t *wait, unsigned mode, int sync, void *key) { @@ -1314,7 +1224,7 @@ static int ep_poll_callback(wait_queue_e unsigned long flags; int ewake = 0;
- read_lock_irqsave(&ep->lock, flags); + spin_lock_irqsave(&ep->lock, flags);
ep_set_busy_poll_napi_id(epi);
@@ -1343,12 +1253,15 @@ static int ep_poll_callback(wait_queue_e * chained in ep->ovflist and requeued later on. */ if (READ_ONCE(ep->ovflist) != EP_UNACTIVE_PTR) { - if (chain_epi_lockless(epi)) + if (epi->next == EP_UNACTIVE_PTR) { + epi->next = READ_ONCE(ep->ovflist); + WRITE_ONCE(ep->ovflist, epi); ep_pm_stay_awake_rcu(epi); + } } else if (!ep_is_linked(epi)) { /* In the usual case, add event to ready list. */ - if (list_add_tail_lockless(&epi->rdllink, &ep->rdllist)) - ep_pm_stay_awake_rcu(epi); + list_add_tail(&epi->rdllink, &ep->rdllist); + ep_pm_stay_awake_rcu(epi); }
/* @@ -1381,7 +1294,7 @@ static int ep_poll_callback(wait_queue_e pwake++;
out_unlock: - read_unlock_irqrestore(&ep->lock, flags); + spin_unlock_irqrestore(&ep->lock, flags);
/* We have to call this outside the lock */ if (pwake) @@ -1716,7 +1629,7 @@ static int ep_insert(struct eventpoll *e }
/* We have to drop the new item inside our item list to keep track of it */ - write_lock_irq(&ep->lock); + spin_lock_irq(&ep->lock);
/* record NAPI ID of new item if present */ ep_set_busy_poll_napi_id(epi); @@ -1733,7 +1646,7 @@ static int ep_insert(struct eventpoll *e pwake++; }
- write_unlock_irq(&ep->lock); + spin_unlock_irq(&ep->lock);
/* We have to call this outside the lock */ if (pwake) @@ -1797,7 +1710,7 @@ static int ep_modify(struct eventpoll *e * list, push it inside. */ if (ep_item_poll(epi, &pt, 1)) { - write_lock_irq(&ep->lock); + spin_lock_irq(&ep->lock); if (!ep_is_linked(epi)) { list_add_tail(&epi->rdllink, &ep->rdllist); ep_pm_stay_awake(epi); @@ -1808,7 +1721,7 @@ static int ep_modify(struct eventpoll *e if (waitqueue_active(&ep->poll_wait)) pwake++; } - write_unlock_irq(&ep->lock); + spin_unlock_irq(&ep->lock); }
/* We have to call this outside the lock */ @@ -2041,7 +1954,7 @@ static int ep_poll(struct eventpoll *ep, init_wait(&wait); wait.func = ep_autoremove_wake_function;
- write_lock_irq(&ep->lock); + spin_lock_irq(&ep->lock); /* * Barrierless variant, waitqueue_active() is called under * the same lock on wakeup ep_poll_callback() side, so it @@ -2060,7 +1973,7 @@ static int ep_poll(struct eventpoll *ep, if (!eavail) __add_wait_queue_exclusive(&ep->wq, &wait);
- write_unlock_irq(&ep->lock); + spin_unlock_irq(&ep->lock);
if (!eavail) timed_out = !schedule_hrtimeout_range(to, slack, @@ -2075,7 +1988,7 @@ static int ep_poll(struct eventpoll *ep, eavail = 1;
if (!list_empty_careful(&wait.entry)) { - write_lock_irq(&ep->lock); + spin_lock_irq(&ep->lock); /* * If the thread timed out and is not on the wait queue, * it means that the thread was woken up after its @@ -2086,7 +1999,7 @@ static int ep_poll(struct eventpoll *ep, if (timed_out) eavail = list_empty(&wait.entry); __remove_wait_queue(&ep->wq, &wait); - write_unlock_irq(&ep->lock); + spin_unlock_irq(&ep->lock); } } }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Finn Thain fthain@linux-m68k.org
commit 15df28699b28d6b49dc305040c4e26a9553df07a upstream.
A regression was reported to me recently whereby /dev/fb0 had disappeared from a PowerBook G3 Series "Wallstreet". The problem shows up when the "video=ofonly" parameter is passed to the kernel, which is what the bootloader does when "no video driver" is selected. The cause of the problem is the "offb" string comparison, which got mangled when it got refactored. Fix it.
Cc: stable@vger.kernel.org Fixes: 93604a5ade3a ("fbdev: Handle video= parameter in video/cmdline.c") Reported-and-tested-by: Stan Johnson userm57@yahoo.com Signed-off-by: Finn Thain fthain@linux-m68k.org Signed-off-by: Helge Deller deller@gmx.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/video/fbdev/core/fb_cmdline.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/video/fbdev/core/fb_cmdline.c +++ b/drivers/video/fbdev/core/fb_cmdline.c @@ -40,7 +40,7 @@ int fb_get_options(const char *name, cha bool enabled;
if (name) - is_of = strncmp(name, "offb", 4); + is_of = !strncmp(name, "offb", 4);
enabled = __video_get_options(name, &options, is_of);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Haoxiang Li haoxiang_li2024@163.com
commit d68318471aa2e16222ebf492883e05a2d72b9b17 upstream.
Add put_bh() to decrease the refcount of 'bh' after the job is finished, preventing a resource leak.
Fixes: 3f3b442b5ad2 ("fs/ntfs3: Add bitmap") Cc: stable@vger.kernel.org Signed-off-by: Haoxiang Li haoxiang_li2024@163.com Signed-off-by: Konstantin Komarov almaz.alexandrovich@paragon-software.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ntfs3/bitmap.c | 1 + 1 file changed, 1 insertion(+)
--- a/fs/ntfs3/bitmap.c +++ b/fs/ntfs3/bitmap.c @@ -1399,6 +1399,7 @@ int wnd_extend(struct wnd_bitmap *wnd, s mark_buffer_dirty(bh); unlock_buffer(bh); /* err = sync_dirty_buffer(bh); */ + put_bh(bh);
b0 = 0; bits -= op;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shashank A P shashank.ap@samsung.com
commit 72b7ceca857f38a8ca7c5629feffc63769638974 upstream.
There is a kernel panic due to WARN_ONCE when panic_on_warn is set.
This issue occurs when writeback is triggered due to sync call for an opened file(ie, writeback reason is WB_REASON_SYNC). When f2fs balance is needed at sync path, flush for quota_release_work is triggered. By default quota_release_work is queued to "events_unbound" queue which does not have WQ_MEM_RECLAIM flag. During f2fs balance "writeback" workqueue tries to flush quota_release_work causing kernel panic due to MEM_RECLAIM flag mismatch errors.
This patch creates dedicated workqueue with WQ_MEM_RECLAIM flag for work quota_release_work.
------------[ cut here ]------------ WARNING: CPU: 4 PID: 14867 at kernel/workqueue.c:3721 check_flush_dependency+0x13c/0x148 Call trace: check_flush_dependency+0x13c/0x148 __flush_work+0xd0/0x398 flush_delayed_work+0x44/0x5c dquot_writeback_dquots+0x54/0x318 f2fs_do_quota_sync+0xb8/0x1a8 f2fs_write_checkpoint+0x3cc/0x99c f2fs_gc+0x190/0x750 f2fs_balance_fs+0x110/0x168 f2fs_write_single_data_page+0x474/0x7dc f2fs_write_data_pages+0x7d0/0xd0c do_writepages+0xe0/0x2f4 __writeback_single_inode+0x44/0x4ac writeback_sb_inodes+0x30c/0x538 wb_writeback+0xf4/0x440 wb_workfn+0x128/0x5d4 process_scheduled_works+0x1c4/0x45c worker_thread+0x32c/0x3e8 kthread+0x11c/0x1b0 ret_from_fork+0x10/0x20 Kernel panic - not syncing: kernel: panic_on_warn set ...
Fixes: ac6f420291b3 ("quota: flush quota_release_work upon quota writeback") CC: stable@vger.kernel.org Signed-off-by: Shashank A P shashank.ap@samsung.com Link: https://patch.msgid.link/20250901092905.2115-1-shashank.ap@samsung.com Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/quota/dquot.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)
--- a/fs/quota/dquot.c +++ b/fs/quota/dquot.c @@ -163,6 +163,9 @@ static struct quota_module_name module_n /* SLAB cache for dquot structures */ static struct kmem_cache *dquot_cachep;
+/* workqueue for work quota_release_work*/ +static struct workqueue_struct *quota_unbound_wq; + void register_quota_format(struct quota_format_type *fmt) { spin_lock(&dq_list_lock); @@ -882,7 +885,7 @@ void dqput(struct dquot *dquot) put_releasing_dquots(dquot); atomic_dec(&dquot->dq_count); spin_unlock(&dq_list_lock); - queue_delayed_work(system_unbound_wq, "a_release_work, 1); + queue_delayed_work(quota_unbound_wq, "a_release_work, 1); } EXPORT_SYMBOL(dqput);
@@ -3042,6 +3045,11 @@ static int __init dquot_init(void)
shrinker_register(dqcache_shrinker);
+ quota_unbound_wq = alloc_workqueue("quota_events_unbound", + WQ_UNBOUND | WQ_MEM_RECLAIM, WQ_MAX_ACTIVE); + if (!quota_unbound_wq) + panic("Cannot create quota_unbound_wq\n"); + return 0; } fs_initcall(dquot_init);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miklos Szeredi mszeredi@redhat.com
commit 0b563aad1c0a05dc7d123f68a9f82f79de206dad upstream.
In case of FUSE_NOTIFY_RESEND and FUSE_NOTIFY_INC_EPOCH fuse_copy_finish() isn't called.
Fix by always calling fuse_copy_finish() after fuse_notify(). It's a no-op if called a second time.
Fixes: 760eac73f9f6 ("fuse: Introduce a new notification type for resend pending requests") Fixes: 2396356a945b ("fuse: add more control over cache invalidation behaviour") Cc: stable@vger.kernel.org # v6.9 Reviewed-by: Joanne Koong joannelkoong@gmail.com Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/fuse/dev.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/fuse/dev.c +++ b/fs/fuse/dev.c @@ -1989,7 +1989,7 @@ static ssize_t fuse_dev_do_write(struct */ if (!oh.unique) { err = fuse_notify(fc, oh.error, nbytes - sizeof(oh), cs); - goto out; + goto copy_finish; }
err = -EINVAL;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Darrick J. Wong djwong@kernel.org
commit 26e5c67deb2e1f42a951f022fdf5b9f7eb747b01 upstream.
I observed a hang when running generic/323 against a fuseblk server. This test opens a file, initiates a lot of AIO writes to that file descriptor, and closes the file descriptor before the writes complete. Unsurprisingly, the AIO exerciser threads are mostly stuck waiting for responses from the fuseblk server:
# cat /proc/372265/task/372313/stack [<0>] request_wait_answer+0x1fe/0x2a0 [fuse] [<0>] __fuse_simple_request+0xd3/0x2b0 [fuse] [<0>] fuse_do_getattr+0xfc/0x1f0 [fuse] [<0>] fuse_file_read_iter+0xbe/0x1c0 [fuse] [<0>] aio_read+0x130/0x1e0 [<0>] io_submit_one+0x542/0x860 [<0>] __x64_sys_io_submit+0x98/0x1a0 [<0>] do_syscall_64+0x37/0xf0 [<0>] entry_SYSCALL_64_after_hwframe+0x4b/0x53
But the /weird/ part is that the fuseblk server threads are waiting for responses from itself:
# cat /proc/372210/task/372232/stack [<0>] request_wait_answer+0x1fe/0x2a0 [fuse] [<0>] __fuse_simple_request+0xd3/0x2b0 [fuse] [<0>] fuse_file_put+0x9a/0xd0 [fuse] [<0>] fuse_release+0x36/0x50 [fuse] [<0>] __fput+0xec/0x2b0 [<0>] task_work_run+0x55/0x90 [<0>] syscall_exit_to_user_mode+0xe9/0x100 [<0>] do_syscall_64+0x43/0xf0 [<0>] entry_SYSCALL_64_after_hwframe+0x4b/0x53
The fuseblk server is fuse2fs so there's nothing all that exciting in the server itself. So why is the fuse server calling fuse_file_put? The commit message for the fstest sheds some light on that:
"By closing the file descriptor before calling io_destroy, you pretty much guarantee that the last put on the ioctx will be done in interrupt context (during I/O completion).
Aha. AIO fgets a new struct file from the fd when it queues the ioctx. The completion of the FUSE_WRITE command from userspace causes the fuse server to call the AIO completion function. The completion puts the struct file, queuing a delayed fput to the fuse server task. When the fuse server task returns to userspace, it has to run the delayed fput, which in the case of a fuseblk server, it does synchronously.
Sending the FUSE_RELEASE command sychronously from fuse server threads is a bad idea because a client program can initiate enough simultaneous AIOs such that all the fuse server threads end up in delayed_fput, and now there aren't any threads left to handle the queued fuse commands.
Fix this by only using asynchronous fputs when closing files, and leave a comment explaining why.
Cc: stable@vger.kernel.org # v2.6.38 Fixes: 5a18ec176c934c ("fuse: fix hang of single threaded fuseblk filesystem") Signed-off-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/fuse/file.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -355,8 +355,14 @@ void fuse_file_release(struct inode *ino * Make the release synchronous if this is a fuseblk mount, * synchronous RELEASE is allowed (and desirable) in this case * because the server can be trusted not to screw up. + * + * Always use the asynchronous file put because the current thread + * might be the fuse server. This can happen if a process starts some + * aio and closes the fd before the aio completes. Since aio takes its + * own ref to the file, the IO completion has to drop the ref, which is + * how the fuse server can end up closing its clients' files. */ - fuse_file_put(ff, ff->fm->fc->destroy); + fuse_file_put(ff, false); }
void fuse_release_common(struct file *file, bool isdir)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aleksandar Gerasimovski aleksandar.gerasimovski@belden.com
commit 3c63ba1c430af1c0dcd68dd36f2246980621dcba upstream.
There are two problems with the chip configuration in this driver: - First, is that writing 12 bytes (ARRAY_SIZE(regs)) would anyhow lead to a config overflow due to HW auto increment implementation in the chip. - Second, the i2c_smbus_write_block_data write ends up in writing unexpected value to the channel_dis register, this is because the smbus size that is 0x03 in this case gets written to the register. The PAC1931/2/3/4 data sheet does not really specify that block write is indeed supported.
This problem is probably not visible on PAC1934 version where all channels are used as the chip is properly configured by luck, but in our case whenusing PAC1931 this leads to nonfunctional device.
Fixes: 0fb528c8255b (iio: adc: adding support for PAC193x) Suggested-by: Rene Straub mailto:rene.straub@belden.com Signed-off-by: Aleksandar Gerasimovski aleksandar.gerasimovski@belden.com Reviewed-by: Marius Cristea marius.cristea@microchip.com Link: https://patch.msgid.link/20250811130904.2481790-1-aleksandar.gerasimovski@be... Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/adc/pac1934.c | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-)
--- a/drivers/iio/adc/pac1934.c +++ b/drivers/iio/adc/pac1934.c @@ -88,6 +88,7 @@ #define PAC1934_VPOWER_3_ADDR 0x19 #define PAC1934_VPOWER_4_ADDR 0x1A #define PAC1934_REFRESH_V_REG_ADDR 0x1F +#define PAC1934_SLOW_REG_ADDR 0x20 #define PAC1934_CTRL_STAT_REGS_ADDR 0x1C #define PAC1934_PID_REG_ADDR 0xFD #define PAC1934_MID_REG_ADDR 0xFE @@ -1265,8 +1266,23 @@ static int pac1934_chip_configure(struct /* no SLOW triggered REFRESH, clear POR */ regs[PAC1934_SLOW_REG_OFF] = 0;
- ret = i2c_smbus_write_block_data(client, PAC1934_CTRL_STAT_REGS_ADDR, - ARRAY_SIZE(regs), (u8 *)regs); + /* + * Write the three bytes sequentially, as the device does not support + * block write. + */ + ret = i2c_smbus_write_byte_data(client, PAC1934_CTRL_STAT_REGS_ADDR, + regs[PAC1934_CHANNEL_DIS_REG_OFF]); + if (ret) + return ret; + + ret = i2c_smbus_write_byte_data(client, + PAC1934_CTRL_STAT_REGS_ADDR + PAC1934_NEG_PWR_REG_OFF, + regs[PAC1934_NEG_PWR_REG_OFF]); + if (ret) + return ret; + + ret = i2c_smbus_write_byte_data(client, PAC1934_SLOW_REG_ADDR, + regs[PAC1934_SLOW_REG_OFF]); if (ret) return ret;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qianfeng Rong rongqianfeng@vivo.com
commit f9381ece76de999a2065d5b4fdd87fa17883978c upstream.
Change the 'ret' variable in ad5360_update_ctrl() from unsigned int to int, as it needs to store either negative error codes or zero returned by ad5360_write_unlocked().
Fixes: a3e2940c24d3 ("staging:iio:dac: Add AD5360 driver") Signed-off-by: Qianfeng Rong rongqianfeng@vivo.com Reviewed-by: Andy Shevchenko andriy.shevchenko@intel.com Link: https://patch.msgid.link/20250901135726.17601-2-rongqianfeng@vivo.com Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/dac/ad5360.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/iio/dac/ad5360.c +++ b/drivers/iio/dac/ad5360.c @@ -262,7 +262,7 @@ static int ad5360_update_ctrl(struct iio unsigned int clr) { struct ad5360_state *st = iio_priv(indio_dev); - unsigned int ret; + int ret;
mutex_lock(&st->lock);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qianfeng Rong rongqianfeng@vivo.com
commit 3379c900320954d768ed9903691fb2520926bbe3 upstream.
Change the 'ret' variable in ad5421_update_ctrl() from unsigned int to int, as it needs to store either negative error codes or zero returned by ad5421_write_unlocked().
Fixes: 5691b23489db ("staging:iio:dac: Add AD5421 driver") Signed-off-by: Qianfeng Rong rongqianfeng@vivo.com Reviewed-by: Andy Shevchenko andriy.shevchenko@intel.com Link: https://patch.msgid.link/20250901135726.17601-3-rongqianfeng@vivo.com Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/dac/ad5421.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/iio/dac/ad5421.c +++ b/drivers/iio/dac/ad5421.c @@ -186,7 +186,7 @@ static int ad5421_update_ctrl(struct iio unsigned int clr) { struct ad5421_state *st = iio_priv(indio_dev); - unsigned int ret; + int ret;
mutex_lock(&st->lock);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Hennerich michael.hennerich@analog.com
commit 33d7ecbf69aa7dd4145e3b77962bcb8759eede3d upstream.
The ADF4350/1 features a programmable dual-modulus prescaler of 4/5 or 8/9. When set to 4/5, the maximum RF frequency allowed is 3 GHz. Therefore, when operating the ADF4351 above 3 GHz, this must be set to 8/9. In this context not the RF output frequency is meant - it's the VCO frequency.
Therefore move the prescaler selection after we derived the VCO frequency from the desired RF output frequency.
This BUG may have caused PLL lock instabilities when operating the VCO at the very high range close to 4.4 GHz.
Fixes: e31166f0fd48 ("iio: frequency: New driver for Analog Devices ADF4350/ADF4351 Wideband Synthesizers") Signed-off-by: Michael Hennerich michael.hennerich@analog.com Signed-off-by: Nuno Sá nuno.sa@analog.com Reviewed-by: Andy Shevchenko andy@kernel.org Link: https://patch.msgid.link/20250829-adf4350-fix-v2-1-0bf543ba797d@analog.com Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/frequency/adf4350.c | 20 +++++++++++++------- 1 file changed, 13 insertions(+), 7 deletions(-)
--- a/drivers/iio/frequency/adf4350.c +++ b/drivers/iio/frequency/adf4350.c @@ -149,6 +149,19 @@ static int adf4350_set_freq(struct adf43 if (freq > ADF4350_MAX_OUT_FREQ || freq < st->min_out_freq) return -EINVAL;
+ st->r4_rf_div_sel = 0; + + /* + * !\TODO: The below computation is making sure we get a power of 2 + * shift (st->r4_rf_div_sel) so that freq becomes higher or equal to + * ADF4350_MIN_VCO_FREQ. This might be simplified with fls()/fls_long() + * and friends. + */ + while (freq < ADF4350_MIN_VCO_FREQ) { + freq <<= 1; + st->r4_rf_div_sel++; + } + if (freq > ADF4350_MAX_FREQ_45_PRESC) { prescaler = ADF4350_REG1_PRESCALER; mdiv = 75; @@ -157,13 +170,6 @@ static int adf4350_set_freq(struct adf43 mdiv = 23; }
- st->r4_rf_div_sel = 0; - - while (freq < ADF4350_MIN_VCO_FREQ) { - freq <<= 1; - st->r4_rf_div_sel++; - } - /* * Allow a predefined reference division factor * if not set, compute our own
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Anderson sean.anderson@linux.dev
commit 1315cc2dbd5034f566e20ddce4d675cb9e6d4ddd upstream.
AMS_ALARM_THR_DIRECT_MASK should be bit 0, not bit 1. This would cause hysteresis to be enabled with a lower threshold of -28C. The temperature alarm would never deassert even if the temperature dropped below the upper threshold.
Fixes: d5c70627a794 ("iio: adc: Add Xilinx AMS driver") Signed-off-by: Sean Anderson sean.anderson@linux.dev Reviewed-by: O'Griofa, Conall conall.ogriofa@amd.com Tested-by: Erim, Salih Salih.Erim@amd.com Acked-by: Erim, Salih Salih.Erim@amd.com Link: https://patch.msgid.link/20250715003058.2035656-1-sean.anderson@linux.dev Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/adc/xilinx-ams.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/iio/adc/xilinx-ams.c +++ b/drivers/iio/adc/xilinx-ams.c @@ -118,7 +118,7 @@ #define AMS_ALARM_THRESHOLD_OFF_10 0x10 #define AMS_ALARM_THRESHOLD_OFF_20 0x20
-#define AMS_ALARM_THR_DIRECT_MASK BIT(1) +#define AMS_ALARM_THR_DIRECT_MASK BIT(0) #define AMS_ALARM_THR_MIN 0x0000 #define AMS_ALARM_THR_MAX (BIT(16) - 1)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Anderson sean.anderson@linux.dev
commit feb500c7ae7a198db4d2757901bce562feeefa5e upstream.
To convert level-triggered alarms into edge-triggered IIO events, alarms are masked when they are triggered. To ensure we catch subsequent alarms, we then periodically poll to see if the alarm is still active. If it isn't, we unmask it. Active but masked alarms are stored in current_masked_alarm.
If an active alarm is disabled, it will remain set in current_masked_alarm until ams_unmask_worker clears it. If the alarm is re-enabled before ams_unmask_worker runs, then it will never be cleared from current_masked_alarm. This will prevent the alarm event from being pushed even if the alarm is still active.
Fix this by recalculating current_masked_alarm immediately when enabling or disabling alarms.
Fixes: d5c70627a794 ("iio: adc: Add Xilinx AMS driver") Signed-off-by: Sean Anderson sean.anderson@linux.dev Reviewed-by: O'Griofa, Conall conall.ogriofa@amd.com Tested-by: Erim, Salih Salih.Erim@amd.com Acked-by: Erim, Salih Salih.Erim@amd.com Link: https://patch.msgid.link/20250715002847.2035228-1-sean.anderson@linux.dev Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/adc/xilinx-ams.c | 45 +++++++++++++++++++++++-------------------- 1 file changed, 25 insertions(+), 20 deletions(-)
--- a/drivers/iio/adc/xilinx-ams.c +++ b/drivers/iio/adc/xilinx-ams.c @@ -389,6 +389,29 @@ static void ams_update_pl_alarm(struct a ams_pl_update_reg(ams, AMS_REG_CONFIG3, AMS_REGCFG3_ALARM_MASK, cfg); }
+static void ams_unmask(struct ams *ams) +{ + unsigned int status, unmask; + + status = readl(ams->base + AMS_ISR_0); + + /* Clear those bits which are not active anymore */ + unmask = (ams->current_masked_alarm ^ status) & ams->current_masked_alarm; + + /* Clear status of disabled alarm */ + unmask |= ams->intr_mask; + + ams->current_masked_alarm &= status; + + /* Also clear those which are masked out anyway */ + ams->current_masked_alarm &= ~ams->intr_mask; + + /* Clear the interrupts before we unmask them */ + writel(unmask, ams->base + AMS_ISR_0); + + ams_update_intrmask(ams, ~AMS_ALARM_MASK, ~AMS_ALARM_MASK); +} + static void ams_update_alarm(struct ams *ams, unsigned long alarm_mask) { unsigned long flags; @@ -401,6 +424,7 @@ static void ams_update_alarm(struct ams
spin_lock_irqsave(&ams->intr_lock, flags); ams_update_intrmask(ams, AMS_ISR0_ALARM_MASK, ~alarm_mask); + ams_unmask(ams); spin_unlock_irqrestore(&ams->intr_lock, flags); }
@@ -1035,28 +1059,9 @@ static void ams_handle_events(struct iio static void ams_unmask_worker(struct work_struct *work) { struct ams *ams = container_of(work, struct ams, ams_unmask_work.work); - unsigned int status, unmask;
spin_lock_irq(&ams->intr_lock); - - status = readl(ams->base + AMS_ISR_0); - - /* Clear those bits which are not active anymore */ - unmask = (ams->current_masked_alarm ^ status) & ams->current_masked_alarm; - - /* Clear status of disabled alarm */ - unmask |= ams->intr_mask; - - ams->current_masked_alarm &= status; - - /* Also clear those which are masked out anyway */ - ams->current_masked_alarm &= ~ams->intr_mask; - - /* Clear the interrupts before we unmask them */ - writel(unmask, ams->base + AMS_ISR_0); - - ams_update_intrmask(ams, ~AMS_ALARM_MASK, ~AMS_ALARM_MASK); - + ams_unmask(ams); spin_unlock_irq(&ams->intr_lock);
/* If still pending some alarm re-trigger the timer */
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Huacai Chen chenhuacai@loongson.cn
commit e416f0ed3c500c05c55fb62ee62662717b1c7f71 upstream.
BootLoaders (Grub, LILO, etc) may pass an identifier such as "BOOT_IMAGE= /boot/vmlinuz-x.y.z" to kernel parameters. But these identifiers are not recognized by the kernel itself so will be passed to userspace. However user space init program also don't recognize it.
KEXEC/KDUMP (kexec-tools) may also pass an identifier such as "kexec" on some architectures.
We cannot change BootLoader's behavior, because this behavior exists for many years, and there are already user space programs search BOOT_IMAGE= in /proc/cmdline to obtain the kernel image locations:
https://github.com/linuxdeepin/deepin-ab-recovery/blob/master/util.go (search getBootOptions) https://github.com/linuxdeepin/deepin-ab-recovery/blob/master/main.go (search getKernelReleaseWithBootOption) So the the best way is handle (ignore) it by the kernel itself, which can avoid such boot warnings (if we use something like init=/bin/bash, bootloader identifier can even cause a crash):
Kernel command line: BOOT_IMAGE=(hd0,1)/vmlinuz-6.x root=/dev/sda3 ro console=tty Unknown kernel command line parameters "BOOT_IMAGE=(hd0,1)/vmlinuz-6.x", will be passed to user space.
[chenhuacai@loongson.cn: use strstarts()] Link: https://lkml.kernel.org/r/20250815090120.1569947-1-chenhuacai@loongson.cn Link: https://lkml.kernel.org/r/20250721101343.3283480-1-chenhuacai@loongson.cn Signed-off-by: Huacai Chen chenhuacai@loongson.cn Cc: Al Viro viro@zeniv.linux.org.uk Cc: Christian Brauner brauner@kernel.org Cc: Jan Kara jack@suse.cz Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- init/main.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
--- a/init/main.c +++ b/init/main.c @@ -543,6 +543,12 @@ static int __init unknown_bootoption(cha const char *unused, void *arg) { size_t len = strlen(param); + /* + * Well-known bootloader identifiers: + * 1. LILO/Grub pass "BOOT_IMAGE=..."; + * 2. kexec/kdump (kexec-tools) pass "kexec". + */ + const char *bootloader[] = { "BOOT_IMAGE=", "kexec", NULL };
/* Handle params aliased to sysctls */ if (sysctl_is_alias(param)) @@ -550,6 +556,12 @@ static int __init unknown_bootoption(cha
repair_env_string(param, val);
+ /* Handle bootloader identifier */ + for (int i = 0; bootloader[i]; i++) { + if (strstarts(param, bootloader[i])) + return 0; + } + /* Handle obsolete-style parameters */ if (obsolete_checksetup(param)) return 0;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Nyekjaer sean@geanix.com
commit a95a0b4e471a6d8860f40c6ac8f1cad9dde3189a upstream.
Remove unnecessary calls to pm_runtime_disable(), pm_runtime_set_active(), and pm_runtime_enable() from the resume path. These operations are not required here and can interfere with proper pm_runtime state handling, especially when resuming from a pm_runtime suspended state.
Fixes: 31c24c1e93c3 ("iio: imu: inv_icm42600: add core of new inv_icm42600 driver") Signed-off-by: Sean Nyekjaer sean@geanix.com Link: https://patch.msgid.link/20250901-icm42pmreg-v3-2-ef1336246960@geanix.com Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/imu/inv_icm42600/inv_icm42600_core.c | 4 ---- 1 file changed, 4 deletions(-)
--- a/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c +++ b/drivers/iio/imu/inv_icm42600/inv_icm42600_core.c @@ -847,10 +847,6 @@ static int inv_icm42600_resume(struct de if (ret) goto out_unlock;
- pm_runtime_disable(dev); - pm_runtime_set_active(dev); - pm_runtime_enable(dev); - /* restore sensors state */ ret = inv_icm42600_set_pwr_mgmt0(st, st->suspended.gyro, st->suspended.accel,
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lu Baolu baolu.lu@linux.intel.com
commit 5ef7e24c742038a5d8c626fdc0e3a21834358341 upstream.
The specification, Section 7.10, "Software Steps to Drain Page Requests & Responses," requires software to submit an Invalidation Wait Descriptor (inv_wait_dsc) with the Page-request Drain (PD=1) flag set, along with the Invalidation Wait Completion Status Write flag (SW=1). It then waits for the Invalidation Wait Descriptor's completion.
However, the PD field in the Invalidation Wait Descriptor is optional, as stated in Section 6.5.2.9, "Invalidation Wait Descriptor":
"Page-request Drain (PD): Remapping hardware implementations reporting Page-request draining as not supported (PDS = 0 in ECAP_REG) treat this field as reserved."
This implies that if the IOMMU doesn't support the PDS capability, software can't drain page requests and group responses as expected.
Do not enable PCI/PRI if the IOMMU doesn't support PDS.
Reported-by: Joel Granados joel.granados@kernel.org Closes: https://lore.kernel.org/r/20250909-jag-pds-v1-1-ad8cba0e494e@kernel.org Fixes: 66ac4db36f4c ("iommu/vt-d: Add page request draining support") Cc: stable@vger.kernel.org Signed-off-by: Lu Baolu baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20250915062946.120196-1-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel joerg.roedel@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iommu/intel/iommu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -3952,7 +3952,7 @@ static struct iommu_device *intel_iommu_ }
if (info->ats_supported && ecap_prs(iommu->ecap) && - pci_pri_supported(pdev)) + ecap_pds(iommu->ecap) && pci_pri_supported(pdev)) info->pri_supported = 1; } }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Oleg Nesterov oleg@redhat.com
commit a15f37a40145c986cdf289a4b88390f35efdecc4 upstream.
The usage of task_lock(tsk->group_leader) in sys_prlimit64()->do_prlimit() path is very broken.
sys_prlimit64() does get_task_struct(tsk) but this only protects task_struct itself. If tsk != current and tsk is not a leader, this process can exit/exec and task_lock(tsk->group_leader) may use the already freed task_struct.
Another problem is that sys_prlimit64() can race with mt-exec which changes ->group_leader. In this case do_prlimit() may take the wrong lock, or (worse) ->group_leader may change between task_lock() and task_unlock().
Change sys_prlimit64() to take tasklist_lock when necessary. This is not nice, but I don't see a better fix for -stable.
Link: https://lkml.kernel.org/r/20250915120917.GA27702@redhat.com Fixes: 18c91bb2d872 ("prlimit: do not grab the tasklist_lock") Signed-off-by: Oleg Nesterov oleg@redhat.com Cc: Christian Brauner brauner@kernel.org Cc: Jiri Slaby jirislaby@kernel.org Cc: Mateusz Guzik mjguzik@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sys.c | 22 ++++++++++++++++++++-- 1 file changed, 20 insertions(+), 2 deletions(-)
--- a/kernel/sys.c +++ b/kernel/sys.c @@ -1698,6 +1698,7 @@ SYSCALL_DEFINE4(prlimit64, pid_t, pid, u struct rlimit old, new; struct task_struct *tsk; unsigned int checkflags = 0; + bool need_tasklist; int ret;
if (old_rlim) @@ -1724,8 +1725,25 @@ SYSCALL_DEFINE4(prlimit64, pid_t, pid, u get_task_struct(tsk); rcu_read_unlock();
- ret = do_prlimit(tsk, resource, new_rlim ? &new : NULL, - old_rlim ? &old : NULL); + need_tasklist = !same_thread_group(tsk, current); + if (need_tasklist) { + /* + * Ensure we can't race with group exit or de_thread(), + * so tsk->group_leader can't be freed or changed until + * read_unlock(tasklist_lock) below. + */ + read_lock(&tasklist_lock); + if (!pid_alive(tsk)) + ret = -ESRCH; + } + + if (!ret) { + ret = do_prlimit(tsk, resource, new_rlim ? &new : NULL, + old_rlim ? &old : NULL); + } + + if (need_tasklist) + read_unlock(&tasklist_lock);
if (!ret && old_rlim) { rlim_to_rlim64(&old, &old64);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@kernel.org
commit eed0e3d305530066b4fc5370107cff8ef1a0d229 upstream.
To prevent timing attacks, HMAC value comparison needs to be constant time. Replace the memcmp() with the correct function, crypto_memneq().
[For the Fixes commit I used the commit that introduced the memcmp(). It predates the introduction of crypto_memneq(), but it was still a bug at the time even though a helper function didn't exist yet.]
Fixes: d00a1c72f7f4 ("keys: add new trusted key-type") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers ebiggers@kernel.org Reviewed-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- security/keys/trusted-keys/trusted_tpm1.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
--- a/security/keys/trusted-keys/trusted_tpm1.c +++ b/security/keys/trusted-keys/trusted_tpm1.c @@ -7,6 +7,7 @@ */
#include <crypto/hash_info.h> +#include <crypto/utils.h> #include <linux/init.h> #include <linux/slab.h> #include <linux/parser.h> @@ -241,7 +242,7 @@ int TSS_checkhmac1(unsigned char *buffer if (ret < 0) goto out;
- if (memcmp(testhmac, authdata, SHA1_DIGEST_SIZE)) + if (crypto_memneq(testhmac, authdata, SHA1_DIGEST_SIZE)) ret = -EINVAL; out: kfree_sensitive(sdesc); @@ -334,7 +335,7 @@ static int TSS_checkhmac2(unsigned char TPM_NONCE_SIZE, ononce, 1, continueflag1, 0, 0); if (ret < 0) goto out; - if (memcmp(testhmac1, authdata1, SHA1_DIGEST_SIZE)) { + if (crypto_memneq(testhmac1, authdata1, SHA1_DIGEST_SIZE)) { ret = -EINVAL; goto out; } @@ -343,7 +344,7 @@ static int TSS_checkhmac2(unsigned char TPM_NONCE_SIZE, ononce, 1, continueflag2, 0, 0); if (ret < 0) goto out; - if (memcmp(testhmac2, authdata2, SHA1_DIGEST_SIZE)) + if (crypto_memneq(testhmac2, authdata2, SHA1_DIGEST_SIZE)) ret = -EINVAL; out: kfree_sensitive(sdesc);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johan Hovold johan@kernel.org
commit 1260cbcffa608219fc9188a6cbe9c45a300ef8b5 upstream.
Make sure to drop the reference taken when looking up the genpool platform device in of_gen_pool_get() before returning the pool.
Note that holding a reference to a device does typically not prevent its devres managed resources from being released so there is no point in keeping the reference.
Link: https://lkml.kernel.org/r/20250924080207.18006-1-johan@kernel.org Fixes: 9375db07adea ("genalloc: add devres support, allow to find a managed pool by device") Signed-off-by: Johan Hovold johan@kernel.org Cc: Philipp Zabel p.zabel@pengutronix.de Cc: Vladimir Zapolskiy vz@mleia.com Cc: stable@vger.kernel.org [3.10+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- lib/genalloc.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/lib/genalloc.c +++ b/lib/genalloc.c @@ -899,8 +899,11 @@ struct gen_pool *of_gen_pool_get(struct if (!name) name = of_node_full_name(np_pool); } - if (pdev) + if (pdev) { pool = gen_pool_get(&pdev->dev, name); + put_device(&pdev->dev); + } + of_node_put(np_pool);
return pool;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Li Chen me@linux.beauty
commit 98b7bf54338b797e3a11e8178ce0e806060d8fa3 upstream.
loop_change_fd() and loop_configure() call loop_check_backing_file() to validate the new backing file. If validation fails, the reference acquired by fget() was not dropped, leaking a file reference.
Fix this by calling fput(file) before returning the error.
Cc: stable@vger.kernel.org Cc: Markus Elfring Markus.Elfring@web.de CC: Yang Erkun yangerkun@huawei.com Cc: Ming Lei ming.lei@redhat.com Cc: Yu Kuai yukuai1@huaweicloud.com Fixes: f5c84eff634b ("loop: Add sanity check for read/write_iter") Signed-off-by: Li Chen chenl311@chinatelecom.cn Reviewed-by: Ming Lei ming.lei@redhat.com Reviewed-by: Yang Erkun yangerkun@huawei.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/block/loop.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -536,8 +536,10 @@ static int loop_change_fd(struct loop_de return -EBADF;
error = loop_check_backing_file(file); - if (error) + if (error) { + fput(file); return error; + }
/* suppress uevents while reconfiguring the device */ dev_set_uevent_suppress(disk_to_dev(lo->lo_disk), 1); @@ -973,8 +975,10 @@ static int loop_configure(struct loop_de return -EBADF;
error = loop_check_backing_file(file); - if (error) + if (error) { + fput(file); return error; + }
is_loop = is_loop_device(file);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ma Ke make24@iscas.ac.cn
commit a8de554774ae48efbe48ace79f8badae2daa2bf1 upstream.
In of_unittest_pci_node_verify(), when the add parameter is false, device_find_any_child() obtains a reference to a child device. This function implicitly calls get_device() to increment the device's reference count before returning the pointer. However, the caller fails to properly release this reference by calling put_device(), leading to a device reference count leak. Add put_device() in the else branch immediately after child_dev is no longer needed.
As the comment of device_find_any_child states: "NOTE: you will need to drop the reference with put_device() after use".
Found by code review.
Cc: stable@vger.kernel.org Fixes: 26409dd04589 ("of: unittest: Add pci_dt_testdrv pci driver") Signed-off-by: Ma Ke make24@iscas.ac.cn Signed-off-by: Rob Herring (Arm) robh@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/of/unittest.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/of/unittest.c +++ b/drivers/of/unittest.c @@ -4191,6 +4191,7 @@ static int of_unittest_pci_node_verify(s unittest(!np, "Child device tree node is not removed\n"); child_dev = device_find_any_child(&pdev->dev); unittest(!child_dev, "Child device is not removed\n"); + put_device(child_dev); }
failed:
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Askar Safin safinaskar@zohomail.com
commit 042a60680de43175eb4df0977ff04a4eba9da082 upstream.
openat2 had a bug: if we pass RESOLVE_NO_XDEV, then openat2 doesn't traverse through automounts, but may still trigger them. (See the link for full bug report with reproducer.)
This commit fixes this bug.
Link: https://lore.kernel.org/linux-fsdevel/20250817075252.4137628-1-safinaskar@zo... Fixes: fddb5d430ad9fa91b49b1 ("open: introduce openat2(2) syscall") Reviewed-by: Aleksa Sarai cyphar@cyphar.com Cc: stable@vger.kernel.org Signed-off-by: Askar Safin safinaskar@zohomail.com Link: https://lore.kernel.org/20250825181233.2464822-5-safinaskar@zohomail.com Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/namei.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/fs/namei.c +++ b/fs/namei.c @@ -1388,6 +1388,10 @@ static int follow_automount(struct path dentry->d_inode) return -EISDIR;
+ /* No need to trigger automounts if mountpoint crossing is disabled. */ + if (lookup_flags & LOOKUP_NO_XDEV) + return -EXDEV; + if (count && (*count)++ >= MAXSYMLINKS) return -ELOOP;
@@ -1411,6 +1415,10 @@ static int __traverse_mounts(struct path /* Allow the filesystem to manage the transit without i_mutex * being held. */ if (flags & DCACHE_MANAGE_TRANSIT) { + if (lookup_flags & LOOKUP_NO_XDEV) { + ret = -EXDEV; + break; + } ret = path->dentry->d_op->d_manage(path, false); flags = smp_load_acquire(&path->dentry->d_flags); if (ret < 0)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sam James sam@gentoo.org
commit 8ec5a066f88f89bd52094ba18792b34c49dcd55a upstream.
Similar in nature to ab107276607af90b13a5994997e19b7b9731e251. glibc-2.42 drops the legacy termio struct, but the ioctls.h header still defines some TC* constants in terms of termio (via sizeof). Hardcode the values instead.
This fixes building Python for example, which falls over like: ./Modules/termios.c:1119:16: error: invalid application of 'sizeof' to incomplete type 'struct termio'
Link: https://bugs.gentoo.org/961769 Link: https://bugs.gentoo.org/962600 Co-authored-by: Stian Halseth stian@itx.no Cc: stable@vger.kernel.org Signed-off-by: Sam James sam@gentoo.org Signed-off-by: Helge Deller deller@gmx.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/parisc/include/uapi/asm/ioctls.h | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/arch/parisc/include/uapi/asm/ioctls.h +++ b/arch/parisc/include/uapi/asm/ioctls.h @@ -10,10 +10,10 @@ #define TCSETS _IOW('T', 17, struct termios) /* TCSETATTR */ #define TCSETSW _IOW('T', 18, struct termios) /* TCSETATTRD */ #define TCSETSF _IOW('T', 19, struct termios) /* TCSETATTRF */ -#define TCGETA _IOR('T', 1, struct termio) -#define TCSETA _IOW('T', 2, struct termio) -#define TCSETAW _IOW('T', 3, struct termio) -#define TCSETAF _IOW('T', 4, struct termio) +#define TCGETA 0x40125401 +#define TCSETA 0x80125402 +#define TCSETAW 0x80125403 +#define TCSETAF 0x80125404 #define TCSBRK _IO('T', 5) #define TCXONC _IO('T', 6) #define TCFLSH _IO('T', 7)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: John David Anglin dave.anglin@bell.net
commit 16794e524d310780163fdd49d0bf7fac30f8dbc8 upstream.
Accidently introduced in commit 91428ca9320e.
Signed-off-by: John David Anglin dave.anglin@bell.net Signed-off-by: Helge Deller deller@gmx.de Fixes: 91428ca9320e ("parisc: Check region is readable by user in raw_copy_from_user()") Cc: stable@vger.kernel.org # v5.12+ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/parisc/lib/memcpy.c | 1 - 1 file changed, 1 deletion(-)
--- a/arch/parisc/lib/memcpy.c +++ b/arch/parisc/lib/memcpy.c @@ -41,7 +41,6 @@ unsigned long raw_copy_from_user(void *d mtsp(get_kernel_space(), SR_TEMP2);
/* Check region is user accessible */ - if (start) while (start < end) { if (!prober_user(SR_TEMP1, start)) { newlen = (start - (unsigned long) src);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Georg Gottleuber ggo@tuxedocomputers.com
commit eeaed48980a7aeb0d3d8b438185d4b5a66154ff9 upstream.
On the TUXEDO InfinityBook S Gen8, a Samsung 990 Evo NVMe leads to a high power consumption in s2idle sleep (3.5 watts).
This patch applies 'Force No Simple Suspend' quirk to achieve a sleep with a lower power consumption, typically around 1 watts.
Signed-off-by: Georg Gottleuber ggo@tuxedocomputers.com Signed-off-by: Werner Sembach wse@tuxedocomputers.com Cc: stable@vger.kernel.org Signed-off-by: Keith Busch kbusch@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/nvme/host/pci.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -3140,10 +3140,12 @@ static unsigned long check_vendor_combin * Exclude Samsung 990 Evo from NVME_QUIRK_SIMPLE_SUSPEND * because of high power consumption (> 2 Watt) in s2idle * sleep. Only some boards with Intel CPU are affected. + * (Note for testing: Samsung 990 Evo Plus has same PCI ID) */ if (dmi_match(DMI_BOARD_NAME, "DN50Z-140HC-YD") || dmi_match(DMI_BOARD_NAME, "GMxPXxx") || dmi_match(DMI_BOARD_NAME, "GXxMRXx") || + dmi_match(DMI_BOARD_NAME, "NS5X_NS7XAU") || dmi_match(DMI_BOARD_NAME, "PH4PG31") || dmi_match(DMI_BOARD_NAME, "PH4PRX1_PH6PRX1") || dmi_match(DMI_BOARD_NAME, "PH6PG01_PH6PG71"))
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org
commit 358253fa8179ab4217ac283b56adde0174186f87 upstream.
Drop unused declarations after S3C24xx SoC family removal in the commit 61b7f8920b17 ("ARM: s3c: remove all s3c24xx support").
Fixes: 1ea35b355722 ("ARM: s3c: remove s3c24xx specific hacks") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250830111657.126190-3-krzysztof.kozlowski@linaro... Signed-off-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pinctrl/samsung/pinctrl-samsung.h | 4 ---- 1 file changed, 4 deletions(-)
--- a/drivers/pinctrl/samsung/pinctrl-samsung.h +++ b/drivers/pinctrl/samsung/pinctrl-samsung.h @@ -393,10 +393,6 @@ extern const struct samsung_pinctrl_of_m extern const struct samsung_pinctrl_of_match_data fsd_of_data; extern const struct samsung_pinctrl_of_match_data gs101_of_data; extern const struct samsung_pinctrl_of_match_data s3c64xx_of_data; -extern const struct samsung_pinctrl_of_match_data s3c2412_of_data; -extern const struct samsung_pinctrl_of_match_data s3c2416_of_data; -extern const struct samsung_pinctrl_of_match_data s3c2440_of_data; -extern const struct samsung_pinctrl_of_match_data s3c2450_of_data; extern const struct samsung_pinctrl_of_match_data s5pv210_of_data;
#endif /* __PINCTRL_SAMSUNG_H */
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dzmitry Sankouski dsankouski@gmail.com
commit ee6cd8f3e28ee5a929c3b67c01a350f550f9b73a upstream.
CHARGE_CONTROL_LIMIT is a wrong property to report charge current limit, because `CHARGE_*` attributes represents capacity, not current. The correct attribute to report and set charge current limit is CONSTANT_CHARGE_CURRENT.
Rename CHARGE_CONTROL_LIMIT to CONSTANT_CHARGE_CURRENT.
Cc: stable@vger.kernel.org Fixes: 715ecbc10d6a ("power: supply: max77976: add Maxim MAX77976 charger driver") Signed-off-by: Dzmitry Sankouski dsankouski@gmail.com Signed-off-by: Sebastian Reichel sebastian.reichel@collabora.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/power/supply/max77976_charger.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
--- a/drivers/power/supply/max77976_charger.c +++ b/drivers/power/supply/max77976_charger.c @@ -292,10 +292,10 @@ static int max77976_get_property(struct case POWER_SUPPLY_PROP_ONLINE: err = max77976_get_online(chg, &val->intval); break; - case POWER_SUPPLY_PROP_CHARGE_CONTROL_LIMIT_MAX: + case POWER_SUPPLY_PROP_CONSTANT_CHARGE_CURRENT_MAX: val->intval = MAX77976_CHG_CC_MAX; break; - case POWER_SUPPLY_PROP_CHARGE_CONTROL_LIMIT: + case POWER_SUPPLY_PROP_CONSTANT_CHARGE_CURRENT: err = max77976_get_integer(chg, CHG_CC, MAX77976_CHG_CC_MIN, MAX77976_CHG_CC_MAX, @@ -330,7 +330,7 @@ static int max77976_set_property(struct int err = 0;
switch (psp) { - case POWER_SUPPLY_PROP_CHARGE_CONTROL_LIMIT: + case POWER_SUPPLY_PROP_CONSTANT_CHARGE_CURRENT: err = max77976_set_integer(chg, CHG_CC, MAX77976_CHG_CC_MIN, MAX77976_CHG_CC_MAX, @@ -355,7 +355,7 @@ static int max77976_property_is_writeabl enum power_supply_property psp) { switch (psp) { - case POWER_SUPPLY_PROP_CHARGE_CONTROL_LIMIT: + case POWER_SUPPLY_PROP_CONSTANT_CHARGE_CURRENT: case POWER_SUPPLY_PROP_INPUT_CURRENT_LIMIT: return true; default: @@ -368,8 +368,8 @@ static enum power_supply_property max779 POWER_SUPPLY_PROP_CHARGE_TYPE, POWER_SUPPLY_PROP_HEALTH, POWER_SUPPLY_PROP_ONLINE, - POWER_SUPPLY_PROP_CHARGE_CONTROL_LIMIT, - POWER_SUPPLY_PROP_CHARGE_CONTROL_LIMIT_MAX, + POWER_SUPPLY_PROP_CONSTANT_CHARGE_CURRENT, + POWER_SUPPLY_PROP_CONSTANT_CHARGE_CURRENT_MAX, POWER_SUPPLY_PROP_INPUT_CURRENT_LIMIT, POWER_SUPPLY_PROP_MODEL_NAME, POWER_SUPPLY_PROP_MANUFACTURER,
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nam Cao namcao@linutronix.de
commit a39087905af9ffecaa237a918a2c03a04e479934 upstream.
pnv_irq_domain_alloc() allocates interrupts at parent's interrupt domain. If it fails in the progress, all allocated interrupts are freed.
The number of successfully allocated interrupts so far is stored "i". However, "i - 1" interrupts are freed. This is broken:
- One interrupt is not be freed
- If "i" is zero, "i - 1" wraps around
Correct the number of freed interrupts to "i".
Fixes: 0fcfe2247e75 ("powerpc/powernv/pci: Add MSI domains") Signed-off-by: Nam Cao namcao@linutronix.de Cc: stable@vger.kernel.org Reviewed-by: Cédric Le Goater clg@redhat.com Signed-off-by: Madhavan Srinivasan maddy@linux.ibm.com Link: https://patch.msgid.link/70f8debe8688e0b467367db769b71c20146a836d.1754300646... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/platforms/powernv/pci-ioda.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/powerpc/platforms/powernv/pci-ioda.c +++ b/arch/powerpc/platforms/powernv/pci-ioda.c @@ -1897,7 +1897,7 @@ static int pnv_irq_domain_alloc(struct i return 0;
out: - irq_domain_free_irqs_parent(domain, virq, i - 1); + irq_domain_free_irqs_parent(domain, virq, i); msi_bitmap_free_hwirqs(&phb->msi_bmp, hwirq, nr_irqs); return ret; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nam Cao namcao@linutronix.de
commit 3443ff3be6e59b80d74036bb39f5b6409eb23cc9 upstream.
pseries_irq_domain_alloc() allocates interrupts at parent's interrupt domain. If it fails in the progress, all allocated interrupts are freed.
The number of successfully allocated interrupts so far is stored "i". However, "i - 1" interrupts are freed. This is broken:
- One interrupt is not be freed
- If "i" is zero, "i - 1" wraps around
Correct the number of freed interrupts to 'i'.
Fixes: a5f3d2c17b07 ("powerpc/pseries/pci: Add MSI domains") Signed-off-by: Nam Cao namcao@linutronix.de Cc: stable@vger.kernel.org Reviewed-by: Cédric Le Goater clg@redhat.com Signed-off-by: Madhavan Srinivasan maddy@linux.ibm.com Link: https://patch.msgid.link/a980067f2b256bf716b4cd713bc1095966eed8cd.1754300646... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/platforms/pseries/msi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/powerpc/platforms/pseries/msi.c +++ b/arch/powerpc/platforms/pseries/msi.c @@ -592,7 +592,7 @@ static int pseries_irq_domain_alloc(stru
out: /* TODO: handle RTAS cleanup in ->msi_finish() ? */ - irq_domain_free_irqs_parent(domain, virq, i - 1); + irq_domain_free_irqs_parent(domain, virq, i); return ret; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jisheng Zhang jszhang@kernel.org
commit 3a4b9d027e4061766f618292df91760ea64a1fcc upstream.
The 'enable' register should be BERLIN_PWM_EN rather than BERLIN_PWM_ENABLE, otherwise, the driver accesses wrong address, there will be cpu exception then kernel panic during suspend/resume.
Fixes: bbf0722c1c66 ("pwm: berlin: Add suspend/resume support") Signed-off-by: Jisheng Zhang jszhang@kernel.org Link: https://lore.kernel.org/r/20250819114224.31825-1-jszhang@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Uwe Kleine-König ukleinek@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pwm/pwm-berlin.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/pwm/pwm-berlin.c +++ b/drivers/pwm/pwm-berlin.c @@ -234,7 +234,7 @@ static int berlin_pwm_suspend(struct dev for (i = 0; i < chip->npwm; i++) { struct berlin_pwm_channel *channel = &bpc->channel[i];
- channel->enable = berlin_pwm_readl(bpc, i, BERLIN_PWM_ENABLE); + channel->enable = berlin_pwm_readl(bpc, i, BERLIN_PWM_EN); channel->ctrl = berlin_pwm_readl(bpc, i, BERLIN_PWM_CONTROL); channel->duty = berlin_pwm_readl(bpc, i, BERLIN_PWM_DUTY); channel->tcnt = berlin_pwm_readl(bpc, i, BERLIN_PWM_TCNT); @@ -262,7 +262,7 @@ static int berlin_pwm_resume(struct devi berlin_pwm_writel(bpc, i, channel->ctrl, BERLIN_PWM_CONTROL); berlin_pwm_writel(bpc, i, channel->duty, BERLIN_PWM_DUTY); berlin_pwm_writel(bpc, i, channel->tcnt, BERLIN_PWM_TCNT); - berlin_pwm_writel(bpc, i, channel->enable, BERLIN_PWM_ENABLE); + berlin_pwm_writel(bpc, i, channel->enable, BERLIN_PWM_EN); }
return 0;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Corey Minyard corey@minyard.net
commit 5d09ee1bec870263f4ace439402ea840503b503b upstream.
This reverts commit c608966f3f9c2dca596967501d00753282b395fc.
This patch has a subtle bug that can cause the IPMI driver to go into an infinite loop if the BMC misbehaves in a certain way. Apparently certain BMCs do misbehave this way because several reports have come in recently about this.
Signed-off-by: Corey Minyard corey@minyard.net Tested-by: Eric Hagberg ehagberg@janestreet.com Cc: stable@vger.kernel.org # 6.2 Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/char/ipmi/ipmi_kcs_sm.c | 16 +++++----------- 1 file changed, 5 insertions(+), 11 deletions(-)
--- a/drivers/char/ipmi/ipmi_kcs_sm.c +++ b/drivers/char/ipmi/ipmi_kcs_sm.c @@ -122,10 +122,10 @@ struct si_sm_data { unsigned long error0_timeout; };
-static unsigned int init_kcs_data_with_state(struct si_sm_data *kcs, - struct si_sm_io *io, enum kcs_states state) +static unsigned int init_kcs_data(struct si_sm_data *kcs, + struct si_sm_io *io) { - kcs->state = state; + kcs->state = KCS_IDLE; kcs->io = io; kcs->write_pos = 0; kcs->write_count = 0; @@ -140,12 +140,6 @@ static unsigned int init_kcs_data_with_s return 2; }
-static unsigned int init_kcs_data(struct si_sm_data *kcs, - struct si_sm_io *io) -{ - return init_kcs_data_with_state(kcs, io, KCS_IDLE); -} - static inline unsigned char read_status(struct si_sm_data *kcs) { return kcs->io->inputb(kcs->io, 1); @@ -276,7 +270,7 @@ static int start_kcs_transaction(struct if (size > MAX_KCS_WRITE_SIZE) return IPMI_REQ_LEN_EXCEEDED_ERR;
- if (kcs->state != KCS_IDLE) { + if ((kcs->state != KCS_IDLE) && (kcs->state != KCS_HOSED)) { dev_warn(kcs->io->dev, "KCS in invalid state %d\n", kcs->state); return IPMI_NOT_IN_MY_STATE_ERR; } @@ -501,7 +495,7 @@ static enum si_sm_result kcs_event(struc }
if (kcs->state == KCS_HOSED) { - init_kcs_data_with_state(kcs, kcs->io, KCS_ERROR0); + init_kcs_data(kcs, kcs->io); return SI_SM_HOSED; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Harshit Agarwal harshit@nutanix.com
commit 8fd5485fb4f3d9da3977fd783fcb8e5452463420 upstream.
When a CPU chooses to call push_dl_task and picks a task to push to another CPU's runqueue then it will call find_lock_later_rq method which would take a double lock on both CPUs' runqueues. If one of the locks aren't readily available, it may lead to dropping the current runqueue lock and reacquiring both the locks at once. During this window it is possible that the task is already migrated and is running on some other CPU. These cases are already handled. However, if the task is migrated and has already been executed and another CPU is now trying to wake it up (ttwu) such that it is queued again on the runqeue (on_rq is 1) and also if the task was run by the same CPU, then the current checks will pass even though the task was migrated out and is no longer in the pushable tasks list. Please go through the original rt change for more details on the issue.
To fix this, after the lock is obtained inside the find_lock_later_rq, it ensures that the task is still at the head of pushable tasks list. Also removed some checks that are no longer needed with the addition of this new check. However, the new check of pushable tasks list only applies when find_lock_later_rq is called by push_dl_task. For the other caller i.e. dl_task_offline_migration, existing checks are used.
Signed-off-by: Harshit Agarwal harshit@nutanix.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Juri Lelli juri.lelli@redhat.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250408045021.3283624-1-harshit@nutanix.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/deadline.c | 73 ++++++++++++++++++++++++++++++++---------------- 1 file changed, 49 insertions(+), 24 deletions(-)
--- a/kernel/sched/deadline.c +++ b/kernel/sched/deadline.c @@ -2617,6 +2617,25 @@ static int find_later_rq(struct task_str return -1; }
+static struct task_struct *pick_next_pushable_dl_task(struct rq *rq) +{ + struct task_struct *p; + + if (!has_pushable_dl_tasks(rq)) + return NULL; + + p = __node_2_pdl(rb_first_cached(&rq->dl.pushable_dl_tasks_root)); + + WARN_ON_ONCE(rq->cpu != task_cpu(p)); + WARN_ON_ONCE(task_current(rq, p)); + WARN_ON_ONCE(p->nr_cpus_allowed <= 1); + + WARN_ON_ONCE(!task_on_rq_queued(p)); + WARN_ON_ONCE(!dl_task(p)); + + return p; +} + /* Locks the rq it finds */ static struct rq *find_lock_later_rq(struct task_struct *task, struct rq *rq) { @@ -2644,12 +2663,37 @@ static struct rq *find_lock_later_rq(str
/* Retry if something changed. */ if (double_lock_balance(rq, later_rq)) { - if (unlikely(task_rq(task) != rq || + /* + * double_lock_balance had to release rq->lock, in the + * meantime, task may no longer be fit to be migrated. + * Check the following to ensure that the task is + * still suitable for migration: + * 1. It is possible the task was scheduled, + * migrate_disabled was set and then got preempted, + * so we must check the task migration disable + * flag. + * 2. The CPU picked is in the task's affinity. + * 3. For throttled task (dl_task_offline_migration), + * check the following: + * - the task is not on the rq anymore (it was + * migrated) + * - the task is not on CPU anymore + * - the task is still a dl task + * - the task is not queued on the rq anymore + * 4. For the non-throttled task (push_dl_task), the + * check to ensure that this task is still at the + * head of the pushable tasks list is enough. + */ + if (unlikely(is_migration_disabled(task) || !cpumask_test_cpu(later_rq->cpu, &task->cpus_mask) || - task_on_cpu(rq, task) || - !dl_task(task) || - is_migration_disabled(task) || - !task_on_rq_queued(task))) { + (task->dl.dl_throttled && + (task_rq(task) != rq || + task_on_cpu(rq, task) || + !dl_task(task) || + !task_on_rq_queued(task))) || + (!task->dl.dl_throttled && + task != pick_next_pushable_dl_task(rq)))) { + double_unlock_balance(rq, later_rq); later_rq = NULL; break; @@ -2672,25 +2716,6 @@ static struct rq *find_lock_later_rq(str return later_rq; }
-static struct task_struct *pick_next_pushable_dl_task(struct rq *rq) -{ - struct task_struct *p; - - if (!has_pushable_dl_tasks(rq)) - return NULL; - - p = __node_2_pdl(rb_first_cached(&rq->dl.pushable_dl_tasks_root)); - - WARN_ON_ONCE(rq->cpu != task_cpu(p)); - WARN_ON_ONCE(task_current(rq, p)); - WARN_ON_ONCE(p->nr_cpus_allowed <= 1); - - WARN_ON_ONCE(!task_on_rq_queued(p)); - WARN_ON_ONCE(!dl_task(p)); - - return p; -} - /* * See if the non running -deadline tasks on this rq * can be sent to some other CPU where they can preempt
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thorsten Blum thorsten.blum@linux.dev
commit b81296591c567b12d3873b05a37b975707959b94 upstream.
Replace kmalloc() followed by copy_from_user() with memdup_user() to fix a memory leak that occurs when copy_from_user(buff[sg_used],,) fails and the 'cleanup1:' path does not free the memory for 'buff[sg_used]'. Using memdup_user() avoids this by freeing the memory internally.
Since memdup_user() already allocates memory, use kzalloc() in the else branch instead of manually zeroing 'buff[sg_used]' using memset(0).
Cc: stable@vger.kernel.org Fixes: edd163687ea5 ("[SCSI] hpsa: add driver for HP Smart Array controllers.") Signed-off-by: Thorsten Blum thorsten.blum@linux.dev Acked-by: Don Brace don.brace@microchip.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/hpsa.c | 21 ++++++++++++--------- 1 file changed, 12 insertions(+), 9 deletions(-)
--- a/drivers/scsi/hpsa.c +++ b/drivers/scsi/hpsa.c @@ -6528,18 +6528,21 @@ static int hpsa_big_passthru_ioctl(struc while (left) { sz = (left > ioc->malloc_size) ? ioc->malloc_size : left; buff_size[sg_used] = sz; - buff[sg_used] = kmalloc(sz, GFP_KERNEL); - if (buff[sg_used] == NULL) { - status = -ENOMEM; - goto cleanup1; - } + if (ioc->Request.Type.Direction & XFER_WRITE) { - if (copy_from_user(buff[sg_used], data_ptr, sz)) { - status = -EFAULT; + buff[sg_used] = memdup_user(data_ptr, sz); + if (IS_ERR(buff[sg_used])) { + status = PTR_ERR(buff[sg_used]); + goto cleanup1; + } + } else { + buff[sg_used] = kzalloc(sz, GFP_KERNEL); + if (!buff[sg_used]) { + status = -ENOMEM; goto cleanup1; } - } else - memset(buff[sg_used], 0, sz); + } + left -= sz; data_ptr += sz; sg_used++;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Abinash Singh abinashsinghlalotra@gmail.com
commit b5f717b31b5e478398740db8aee2ecbc4dd72bf3 upstream.
A build warning was triggered due to excessive stack usage in sd_revalidate_disk():
drivers/scsi/sd.c: In function ‘sd_revalidate_disk.isra’: drivers/scsi/sd.c:3824:1: warning: the frame size of 1160 bytes is larger than 1024 bytes [-Wframe-larger-than=]
This is caused by a large local struct queue_limits (~400B) allocated on the stack. Replacing it with a heap allocation using kmalloc() significantly reduces frame usage. Kernel stack is limited (~8 KB), and allocating large structs on the stack is discouraged. As the function already performs heap allocations (e.g. for buffer), this change fits well.
Fixes: 804e498e0496 ("sd: convert to the atomic queue limits API") Cc: stable@vger.kernel.org Reviewed-by: Bart Van Assche bvanassche@acm.org Signed-off-by: Abinash Singh abinashsinghlalotra@gmail.com Link: https://lore.kernel.org/r/20250825183940.13211-2-abinashsinghlalotra@gmail.c... Reviewed-by: Damien Le Moal dlemoal@kernel.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/sd.c | 50 ++++++++++++++++++++++++++++---------------------- 1 file changed, 28 insertions(+), 22 deletions(-)
--- a/drivers/scsi/sd.c +++ b/drivers/scsi/sd.c @@ -3695,10 +3695,10 @@ static int sd_revalidate_disk(struct gen struct scsi_disk *sdkp = scsi_disk(disk); struct scsi_device *sdp = sdkp->device; sector_t old_capacity = sdkp->capacity; - struct queue_limits lim; - unsigned char *buffer; + struct queue_limits *lim = NULL; + unsigned char *buffer = NULL; unsigned int dev_max; - int err; + int err = 0;
SCSI_LOG_HLQUEUE(3, sd_printk(KERN_INFO, sdkp, "sd_revalidate_disk\n")); @@ -3710,6 +3710,10 @@ static int sd_revalidate_disk(struct gen if (!scsi_device_online(sdp)) goto out;
+ lim = kmalloc(sizeof(*lim), GFP_KERNEL); + if (!lim) + goto out; + buffer = kmalloc(SD_BUF_SIZE, GFP_KERNEL); if (!buffer) { sd_printk(KERN_WARNING, sdkp, "sd_revalidate_disk: Memory " @@ -3719,14 +3723,14 @@ static int sd_revalidate_disk(struct gen
sd_spinup_disk(sdkp);
- lim = queue_limits_start_update(sdkp->disk->queue); + *lim = queue_limits_start_update(sdkp->disk->queue);
/* * Without media there is no reason to ask; moreover, some devices * react badly if we do. */ if (sdkp->media_present) { - sd_read_capacity(sdkp, &lim, buffer); + sd_read_capacity(sdkp, lim, buffer); /* * Some USB/UAS devices return generic values for mode pages * until the media has been accessed. Trigger a READ operation @@ -3740,17 +3744,17 @@ static int sd_revalidate_disk(struct gen * cause this to be updated correctly and any device which * doesn't support it should be treated as rotational. */ - lim.features |= (BLK_FEAT_ROTATIONAL | BLK_FEAT_ADD_RANDOM); + lim->features |= (BLK_FEAT_ROTATIONAL | BLK_FEAT_ADD_RANDOM);
if (scsi_device_supports_vpd(sdp)) { sd_read_block_provisioning(sdkp); - sd_read_block_limits(sdkp, &lim); + sd_read_block_limits(sdkp, lim); sd_read_block_limits_ext(sdkp); - sd_read_block_characteristics(sdkp, &lim); - sd_zbc_read_zones(sdkp, &lim, buffer); + sd_read_block_characteristics(sdkp, lim); + sd_zbc_read_zones(sdkp, lim, buffer); }
- sd_config_discard(sdkp, &lim, sd_discard_mode(sdkp)); + sd_config_discard(sdkp, lim, sd_discard_mode(sdkp));
sd_print_capacity(sdkp, old_capacity);
@@ -3760,47 +3764,46 @@ static int sd_revalidate_disk(struct gen sd_read_app_tag_own(sdkp, buffer); sd_read_write_same(sdkp, buffer); sd_read_security(sdkp, buffer); - sd_config_protection(sdkp, &lim); + sd_config_protection(sdkp, lim); }
/* * We now have all cache related info, determine how we deal * with flush requests. */ - sd_set_flush_flag(sdkp, &lim); + sd_set_flush_flag(sdkp, lim);
/* Initial block count limit based on CDB TRANSFER LENGTH field size. */ dev_max = sdp->use_16_for_rw ? SD_MAX_XFER_BLOCKS : SD_DEF_XFER_BLOCKS;
/* Some devices report a maximum block count for READ/WRITE requests. */ dev_max = min_not_zero(dev_max, sdkp->max_xfer_blocks); - lim.max_dev_sectors = logical_to_sectors(sdp, dev_max); + lim->max_dev_sectors = logical_to_sectors(sdp, dev_max);
if (sd_validate_min_xfer_size(sdkp)) - lim.io_min = logical_to_bytes(sdp, sdkp->min_xfer_blocks); + lim->io_min = logical_to_bytes(sdp, sdkp->min_xfer_blocks); else - lim.io_min = 0; + lim->io_min = 0;
/* * Limit default to SCSI host optimal sector limit if set. There may be * an impact on performance for when the size of a request exceeds this * host limit. */ - lim.io_opt = sdp->host->opt_sectors << SECTOR_SHIFT; + lim->io_opt = sdp->host->opt_sectors << SECTOR_SHIFT; if (sd_validate_opt_xfer_size(sdkp, dev_max)) { - lim.io_opt = min_not_zero(lim.io_opt, + lim->io_opt = min_not_zero(lim->io_opt, logical_to_bytes(sdp, sdkp->opt_xfer_blocks)); }
sdkp->first_scan = 0;
set_capacity_and_notify(disk, logical_to_sectors(sdp, sdkp->capacity)); - sd_config_write_same(sdkp, &lim); - kfree(buffer); + sd_config_write_same(sdkp, lim);
- err = queue_limits_commit_update_frozen(sdkp->disk->queue, &lim); + err = queue_limits_commit_update_frozen(sdkp->disk->queue, lim); if (err) - return err; + goto out;
/* * Query concurrent positioning ranges after @@ -3819,7 +3822,10 @@ static int sd_revalidate_disk(struct gen set_capacity_and_notify(disk, 0);
out: - return 0; + kfree(buffer); + kfree(lim); + + return err; }
/**
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Biggers ebiggers@kernel.org
commit dd91c79e4f58fbe2898dac84858033700e0e99fb upstream.
To prevent timing attacks, MACs need to be compared in constant time. Use the appropriate helper function for this.
Fixes: bbd0d59809f9 ("[SCTP]: Implement the receive and verification of AUTH chunk") Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Cc: stable@vger.kernel.org Signed-off-by: Eric Biggers ebiggers@kernel.org Link: https://patch.msgid.link/20250818205426.30222-3-ebiggers@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sctp/sm_make_chunk.c | 3 ++- net/sctp/sm_statefuns.c | 3 ++- 2 files changed, 4 insertions(+), 2 deletions(-)
--- a/net/sctp/sm_make_chunk.c +++ b/net/sctp/sm_make_chunk.c @@ -31,6 +31,7 @@ #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <crypto/hash.h> +#include <crypto/utils.h> #include <linux/types.h> #include <linux/kernel.h> #include <linux/ip.h> @@ -1796,7 +1797,7 @@ struct sctp_association *sctp_unpack_coo } }
- if (memcmp(digest, cookie->signature, SCTP_SIGNATURE_SIZE)) { + if (crypto_memneq(digest, cookie->signature, SCTP_SIGNATURE_SIZE)) { *error = -SCTP_IERROR_BAD_SIG; goto fail; } --- a/net/sctp/sm_statefuns.c +++ b/net/sctp/sm_statefuns.c @@ -30,6 +30,7 @@
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+#include <crypto/utils.h> #include <linux/types.h> #include <linux/kernel.h> #include <linux/ip.h> @@ -4417,7 +4418,7 @@ static enum sctp_ierror sctp_sf_authenti sh_key, GFP_ATOMIC);
/* Discard the packet if the digests do not match */ - if (memcmp(save_digest, digest, sig_len)) { + if (crypto_memneq(save_digest, digest, sig_len)) { kfree(save_digest); return SCTP_IERROR_BAD_SIG; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Anthony Yznaga anthony.yznaga@oracle.com
commit 6fd44a481b3c6111e4801cec964627791d0f3ec5 upstream.
An attempt to exercise sparc hugetlb code in a sun4u-based guest running under qemu results in the guest hanging due to being stuck in a trap loop. This is due to invalid hugetlb TTEs being installed that do not have the expected _PAGE_PMD_HUGE and page size bits set. Although the breakage has gone apparently unnoticed for several years, fix it now so there is the option to exercise sparc hugetlb code under qemu. This can be useful because sun4v support in qemu does not support linux guests currently and sun4v-based hardware resources may not be readily available.
Fix tested with a 6.15.2 and 6.16-rc6 kernels by running libhugetlbfs tests on a qemu guest running Debian 13.
Fixes: c7d9f77d33a7 ("sparc64: Multi-page size support") Cc: stable@vger.kernel.org Signed-off-by: Anthony Yznaga anthony.yznaga@oracle.com Tested-by: John Paul Adrian Glaubitz glaubitz@physik.fu-berlin.de Reviewed-by: John Paul Adrian Glaubitz glaubitz@physik.fu-berlin.de Reviewed-by: Andreas Larsson andreas@gaisler.com Link: https://lore.kernel.org/r/20250716012446.10357-1-anthony.yznaga@oracle.com Signed-off-by: Andreas Larsson andreas@gaisler.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/sparc/mm/hugetlbpage.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+)
--- a/arch/sparc/mm/hugetlbpage.c +++ b/arch/sparc/mm/hugetlbpage.c @@ -130,6 +130,26 @@ hugetlb_get_unmapped_area(struct file *f
static pte_t sun4u_hugepage_shift_to_tte(pte_t entry, unsigned int shift) { + unsigned long hugepage_size = _PAGE_SZ4MB_4U; + + pte_val(entry) = pte_val(entry) & ~_PAGE_SZALL_4U; + + switch (shift) { + case HPAGE_256MB_SHIFT: + hugepage_size = _PAGE_SZ256MB_4U; + pte_val(entry) |= _PAGE_PMD_HUGE; + break; + case HPAGE_SHIFT: + pte_val(entry) |= _PAGE_PMD_HUGE; + break; + case HPAGE_64K_SHIFT: + hugepage_size = _PAGE_SZ64K_4U; + break; + default: + WARN_ONCE(1, "unsupported hugepage shift=%u\n", shift); + } + + pte_val(entry) = pte_val(entry) | hugepage_size; return entry; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ma Ke make24@iscas.ac.cn
commit 302c04110f0ce70d25add2496b521132548cd408 upstream.
Once of_device_register() failed, we should call put_device() to decrement reference count for cleanup. Or it could cause memory leak. So fix this by calling put_device(), then the name can be freed in kobject_cleanup().
Calling path: of_device_register() -> of_device_add() -> device_add(). As comment of device_add() says, 'if device_add() succeeds, you should call device_del() when you want to get rid of it. If device_add() has not succeeded, use only put_device() to drop the reference count'.
Found by code review.
Cc: stable@vger.kernel.org Fixes: cf44bbc26cf1 ("[SPARC]: Beginnings of generic of_device framework.") Signed-off-by: Ma Ke make24@iscas.ac.cn Reviewed-by: Andreas Larsson andreas@gaisler.com Signed-off-by: Andreas Larsson andreas@gaisler.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/sparc/kernel/of_device_32.c | 1 + arch/sparc/kernel/of_device_64.c | 1 + 2 files changed, 2 insertions(+)
--- a/arch/sparc/kernel/of_device_32.c +++ b/arch/sparc/kernel/of_device_32.c @@ -387,6 +387,7 @@ static struct platform_device * __init s
if (of_device_register(op)) { printk("%pOF: Could not register of device.\n", dp); + put_device(&op->dev); kfree(op); op = NULL; } --- a/arch/sparc/kernel/of_device_64.c +++ b/arch/sparc/kernel/of_device_64.c @@ -677,6 +677,7 @@ static struct platform_device * __init s
if (of_device_register(op)) { printk("%pOF: Could not register of device.\n", dp); + put_device(&op->dev); kfree(op); op = NULL; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miaoqian Lin linmq006@gmail.com
commit 5d5f08fd0cd970184376bee07d59f635c8403f63 upstream.
A malicious user could pass an arbitrarily bad value to memdup_user_nul(), potentially causing kernel crash.
This follows the same pattern as commit ee76746387f6 ("netdevsim: prevent bad user input in nsim_dev_health_break_write()")
Fixes: b6c7e873daf7 ("xtensa: ISS: add host file-based simulated disk") Fixes: 16e5c1fc3604 ("convert a bunch of open-coded instances of memdup_user_nul()") Cc: stable@vger.kernel.org Signed-off-by: Miaoqian Lin linmq006@gmail.com Message-Id: 20250829083015.1992751-1-linmq006@gmail.com Signed-off-by: Max Filippov jcmvbkbc@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/xtensa/platforms/iss/simdisk.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/arch/xtensa/platforms/iss/simdisk.c +++ b/arch/xtensa/platforms/iss/simdisk.c @@ -230,10 +230,14 @@ static ssize_t proc_read_simdisk(struct static ssize_t proc_write_simdisk(struct file *file, const char __user *buf, size_t count, loff_t *ppos) { - char *tmp = memdup_user_nul(buf, count); + char *tmp; struct simdisk *dev = pde_data(file_inode(file)); int err;
+ if (count == 0 || count > PAGE_SIZE) + return -EINVAL; + + tmp = memdup_user_nul(buf, count); if (IS_ERR(tmp)) return PTR_ERR(tmp);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Lobakin aleksander.lobakin@intel.com
commit 07ca98f906a403637fc5e513a872a50ef1247f3b upstream.
Turned out certain clearly invalid values passed in xdp_desc from userspace can pass xp_{,un}aligned_validate_desc() and then lead to UBs or just invalid frames to be queued for xmit.
desc->len close to ``U32_MAX`` with a non-zero pool->tx_metadata_len can cause positive integer overflow and wraparound, the same way low enough desc->addr with a non-zero pool->tx_metadata_len can cause negative integer overflow. Both scenarios can then pass the validation successfully. This doesn't happen with valid XSk applications, but can be used to perform attacks.
Always promote desc->len to ``u64`` first to exclude positive overflows of it. Use explicit check_{add,sub}_overflow() when validating desc->addr (which is ``u64`` already).
bloat-o-meter reports a little growth of the code size:
add/remove: 0/0 grow/shrink: 2/1 up/down: 60/-16 (44) Function old new delta xskq_cons_peek_desc 299 330 +31 xsk_tx_peek_release_desc_batch 973 1002 +29 xsk_generic_xmit 3148 3132 -16
but hopefully this doesn't hurt the performance much.
Fixes: 341ac980eab9 ("xsk: Support tx_metadata_len") Cc: stable@vger.kernel.org # 6.8+ Signed-off-by: Alexander Lobakin aleksander.lobakin@intel.com Reviewed-by: Jason Xing kerneljasonxing@gmail.com Reviewed-by: Maciej Fijalkowski maciej.fijalkowski@intel.com Link: https://lore.kernel.org/r/20251008165659.4141318-1-aleksander.lobakin@intel.... Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/xdp/xsk_queue.h | 45 +++++++++++++++++++++++++++++++++++---------- 1 file changed, 35 insertions(+), 10 deletions(-)
--- a/net/xdp/xsk_queue.h +++ b/net/xdp/xsk_queue.h @@ -143,14 +143,24 @@ static inline bool xp_unused_options_set static inline bool xp_aligned_validate_desc(struct xsk_buff_pool *pool, struct xdp_desc *desc) { - u64 addr = desc->addr - pool->tx_metadata_len; - u64 len = desc->len + pool->tx_metadata_len; - u64 offset = addr & (pool->chunk_size - 1); + u64 len = desc->len; + u64 addr, offset;
- if (!desc->len) + if (!len) return false;
- if (offset + len > pool->chunk_size) + /* Can overflow if desc->addr < pool->tx_metadata_len */ + if (check_sub_overflow(desc->addr, pool->tx_metadata_len, &addr)) + return false; + + offset = addr & (pool->chunk_size - 1); + + /* + * Can't overflow: @offset is guaranteed to be < ``U32_MAX`` + * (pool->chunk_size is ``u32``), @len is guaranteed + * to be <= ``U32_MAX``. + */ + if (offset + len + pool->tx_metadata_len > pool->chunk_size) return false;
if (addr >= pool->addrs_cnt) @@ -158,27 +168,42 @@ static inline bool xp_aligned_validate_d
if (xp_unused_options_set(desc->options)) return false; + return true; }
static inline bool xp_unaligned_validate_desc(struct xsk_buff_pool *pool, struct xdp_desc *desc) { - u64 addr = xp_unaligned_add_offset_to_addr(desc->addr) - pool->tx_metadata_len; - u64 len = desc->len + pool->tx_metadata_len; + u64 len = desc->len; + u64 addr, end;
- if (!desc->len) + if (!len) return false;
+ /* Can't overflow: @len is guaranteed to be <= ``U32_MAX`` */ + len += pool->tx_metadata_len; if (len > pool->chunk_size) return false;
- if (addr >= pool->addrs_cnt || addr + len > pool->addrs_cnt || - xp_desc_crosses_non_contig_pg(pool, addr, len)) + /* Can overflow if desc->addr is close to 0 */ + if (check_sub_overflow(xp_unaligned_add_offset_to_addr(desc->addr), + pool->tx_metadata_len, &addr)) + return false; + + if (addr >= pool->addrs_cnt) + return false; + + /* Can overflow if pool->addrs_cnt is high enough */ + if (check_add_overflow(addr, len, &end) || end > pool->addrs_cnt) + return false; + + if (xp_desc_crosses_non_contig_pg(pool, addr, len)) return false;
if (xp_unused_options_set(desc->options)) return false; + return true; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Linus Walleij linus.walleij@linaro.org
commit b8df622cf7f6808c85764e681847150ed6d85f3d upstream.
If you don't specify buswidth 2 (16 bits) in the device tree, FSMC doesn't even probe anymore:
fsmc-nand 10100000.flash: FSMC device partno 090, manufacturer 80, revision 00, config 00 nand: device found, Manufacturer ID: 0x20, Chip ID: 0xb1 nand: ST Micro 10100000.flash nand: bus width 8 instead of 16 bits nand: No NAND device found fsmc-nand 10100000.flash: probe with driver fsmc-nand failed with error -22
With this patch to use autodetection unless buswidth is specified, the device is properly detected again:
fsmc-nand 10100000.flash: FSMC device partno 090, manufacturer 80, revision 00, config 00 nand: device found, Manufacturer ID: 0x20, Chip ID: 0xb1 nand: ST Micro NAND 128MiB 1,8V 16-bit nand: 128 MiB, SLC, erase size: 128 KiB, page size: 2048, OOB size: 64 fsmc-nand 10100000.flash: Using 1-bit HW ECC scheme Scanning device for bad blocks
I don't know where or how this happened, I think some change in the nand core.
Cc: stable@vger.kernel.org Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mtd/nand/raw/fsmc_nand.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/drivers/mtd/nand/raw/fsmc_nand.c +++ b/drivers/mtd/nand/raw/fsmc_nand.c @@ -876,10 +876,14 @@ static int fsmc_nand_probe_config_dt(str if (!of_property_read_u32(np, "bank-width", &val)) { if (val == 2) { nand->options |= NAND_BUSWIDTH_16; - } else if (val != 1) { + } else if (val == 1) { + nand->options |= NAND_BUSWIDTH_AUTO; + } else { dev_err(&pdev->dev, "invalid bank-width %u\n", val); return -EINVAL; } + } else { + nand->options |= NAND_BUSWIDTH_AUTO; }
if (of_property_read_bool(np, "nand-skip-bbtscan"))
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rex Chen rex.chen_1@nxp.com
commit fec40f44afdabcbc4a7748e4278f30737b54bb1a upstream.
SPI mode doesn't support cmd7, so remove it in mmc_sdio_alive() and confirm if sdio is active by checking CCCR register value is available or not.
Signed-off-by: Rex Chen rex.chen_1@nxp.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250728082230.1037917-2-rex.chen_1@nxp.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/core/sdio.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/drivers/mmc/core/sdio.c +++ b/drivers/mmc/core/sdio.c @@ -945,7 +945,11 @@ static void mmc_sdio_remove(struct mmc_h */ static int mmc_sdio_alive(struct mmc_host *host) { - return mmc_select_card(host->card); + if (!mmc_host_is_spi(host)) + return mmc_select_card(host->card); + else + return mmc_io_rw_direct(host->card, 0, 0, SDIO_CCCR_CCCR, 0, + NULL); }
/*
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rex Chen rex.chen_1@nxp.com
commit fef12d9f5bcf7e2b19a7cf1295c6abd5642dd241 upstream.
For multiple block read, the current implementation, transfer packet includes cmd53 + cmd53 response + block nums*(1byte token + block length bytes payload + 2bytes CRC + 1byte transfer), the last 1byte transfer of every block is not needed, so remove it.
Why doesn't multiple block read need CRC ack? For read operation, host side get the payload and CRC value, then will only check the CRC value to confirm if the data is correct or not, but not send CRC ack to card. If the data is correct, save it, or discard it and retransmit if data is error, so the last 1byte transfer of every block make no sense.
What's the side effect of this 1byte transfer? As the SPI is full duplex, if add this redundant 1byte transfer, SDIO card side take it as the token of next block, then all the next sub blocks sequence distort.
Signed-off-by: Rex Chen rex.chen_1@nxp.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250728082230.1037917-3-rex.chen_1@nxp.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/mmc_spi.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/mmc/host/mmc_spi.c +++ b/drivers/mmc/host/mmc_spi.c @@ -563,7 +563,7 @@ mmc_spi_setup_data_message(struct mmc_sp * the next token (next data block, or STOP_TRAN). We can try to * minimize I/O ops by using a single read to collect end-of-busy. */ - if (multiple || write) { + if (write) { t = &host->early_status; memset(t, 0, sizeof(*t)); t->len = write ? sizeof(scratch->status) : 1;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhen Ni zhen.ni@easystack.cn
commit 6744085079e785dae5f7a2239456135407c58b25 upstream.
The of_platform_populate() call at the end of the function has a possible failure path, causing a resource leak.
Replace of_iomap() with devm_platform_ioremap_resource() to ensure automatic cleanup of srom->reg_base.
This issue was detected by smatch static analysis: drivers/memory/samsung/exynos-srom.c:155 exynos_srom_probe()warn: 'srom->reg_base' from of_iomap() not released on lines: 155.
Fixes: 8ac2266d8831 ("memory: samsung: exynos-srom: Add support for bank configuration") Cc: stable@vger.kernel.org Signed-off-by: Zhen Ni zhen.ni@easystack.cn Link: https://lore.kernel.org/r/20250806025538.306593-1-zhen.ni@easystack.cn Signed-off-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/memory/samsung/exynos-srom.c | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-)
--- a/drivers/memory/samsung/exynos-srom.c +++ b/drivers/memory/samsung/exynos-srom.c @@ -121,20 +121,18 @@ static int exynos_srom_probe(struct plat return -ENOMEM;
srom->dev = dev; - srom->reg_base = of_iomap(np, 0); - if (!srom->reg_base) { + srom->reg_base = devm_platform_ioremap_resource(pdev, 0); + if (IS_ERR(srom->reg_base)) { dev_err(&pdev->dev, "iomap of exynos srom controller failed\n"); - return -ENOMEM; + return PTR_ERR(srom->reg_base); }
platform_set_drvdata(pdev, srom);
srom->reg_offset = exynos_srom_alloc_reg_dump(exynos_srom_offsets, ARRAY_SIZE(exynos_srom_offsets)); - if (!srom->reg_offset) { - iounmap(srom->reg_base); + if (!srom->reg_offset) return -ENOMEM; - }
for_each_child_of_node(np, child) { if (exynos_srom_configure_bank(srom, child)) {
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Esben Haabendal esben@geanix.com
commit 9db26d5855d0374d4652487bfb5aacf40821c469 upstream.
When setting a normal alarm, user-space is responsible for using RTC_AIE_ON/RTC_AIE_OFF to control if alarm irq should be enabled.
But when RTC_UIE_ON is used, interrupts must be enabled so that the requested irq events are generated. When RTC_UIE_OFF is used, alarm irq is disabled if there are no other alarms queued, so this commit brings symmetry to that.
Signed-off-by: Esben Haabendal esben@geanix.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250516-rtc-uie-irq-fixes-v2-5-3de8e530a39e@geani... Signed-off-by: Alexandre Belloni alexandre.belloni@bootlin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/rtc/interface.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/drivers/rtc/interface.c +++ b/drivers/rtc/interface.c @@ -594,6 +594,10 @@ int rtc_update_irq_enable(struct rtc_dev rtc->uie_rtctimer.node.expires = ktime_add(now, onesec); rtc->uie_rtctimer.period = ktime_set(1, 0); err = rtc_timer_enqueue(rtc, &rtc->uie_rtctimer); + if (!err && rtc->ops && rtc->ops->alarm_irq_enable) + err = rtc->ops->alarm_irq_enable(rtc->dev.parent, 1); + if (err) + goto out; } else { rtc_timer_remove(rtc, &rtc->uie_rtctimer); }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Esben Haabendal esben@geanix.com
commit 795cda8338eab036013314dbc0b04aae728880ab upstream.
As described in the old comment dating back to commit 6610e0893b8b ("RTC: Rework RTC code to use timerqueue for events") from 2010, we have been living with a race window when setting alarm with an expiry in the near future (i.e. next second). With 1 second resolution, it can happen that the second ticks after the check for the timer having expired, but before the alarm is actually set. When this happen, no alarm IRQ is generated, at least not with some RTC chips (isl12022 is an example of this).
With UIE RTC timer being implemented on top of alarm irq, being re-armed every second, UIE will occasionally fail to work, as an alarm irq lost due to this race will stop the re-arming loop.
For now, I have limited the additional expiry check to only be done for alarms set to next seconds. I expect it should be good enough, although I don't know if we can now for sure that systems with loads could end up causing the same problems for alarms set 2 seconds or even longer in the future.
I haven't been able to reproduce the problem with this check in place.
Cc: stable@vger.kernel.org Signed-off-by: Esben Haabendal esben@geanix.com Link: https://lore.kernel.org/r/20250516-rtc-uie-irq-fixes-v2-1-3de8e530a39e@geani... Signed-off-by: Alexandre Belloni alexandre.belloni@bootlin.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/rtc/interface.c | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+)
--- a/drivers/rtc/interface.c +++ b/drivers/rtc/interface.c @@ -443,6 +443,29 @@ static int __rtc_set_alarm(struct rtc_de else err = rtc->ops->set_alarm(rtc->dev.parent, alarm);
+ /* + * Check for potential race described above. If the waiting for next + * second, and the second just ticked since the check above, either + * + * 1) It ticked after the alarm was set, and an alarm irq should be + * generated. + * + * 2) It ticked before the alarm was set, and alarm irq most likely will + * not be generated. + * + * While we cannot easily check for which of these two scenarios we + * are in, we can return -ETIME to signal that the timer has already + * expired, which is true in both cases. + */ + if ((scheduled - now) <= 1) { + err = __rtc_read_time(rtc, &tm); + if (err) + return err; + now = rtc_tm_to_time64(&tm); + if (scheduled <= now) + return -ETIME; + } + trace_rtc_set_alarm(rtc_tm_to_time64(&alarm->time), err); return err; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson seanjc@google.com
commit a001cd248ab244633c5fabe4f7c707e13fc1d1cc upstream.
Add "extern" to the glibc-defined weak rseq symbols to convert the rseq selftest's usage from weak symbol definitions to weak symbol _references_. Effectively re-defining the glibc symbols wreaks havoc when building with -fno-common, e.g. generates segfaults when running multi-threaded programs, as dynamically linked applications end up with multiple versions of the symbols.
Building with -fcommon, which until recently has the been the default for GCC and clang, papers over the bug by allowing the linker to resolve the weak/tentative definition to glibc's "real" definition.
Note, the symbol itself (or rather its address), not the value of the symbol, is set to 0/NULL for unresolved weak symbol references, as the symbol doesn't exist and thus can't have a value. Check for a NULL rseq size pointer to handle the scenario where the test is statically linked against a libc that doesn't support rseq in any capacity.
Fixes: 3bcbc20942db ("selftests/rseq: Play nice with binaries statically linked against glibc 2.35+") Reported-by: Thomas Gleixner tglx@linutronix.de Suggested-by: Florian Weimer fweimer@redhat.com Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Mathieu Desnoyers mathieu.desnoyers@efficios.com Cc: stable@vger.kernel.org Closes: https://lore.kernel.org/all/87frdoybk4.ffs@tglx Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/rseq/rseq.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/tools/testing/selftests/rseq/rseq.c +++ b/tools/testing/selftests/rseq/rseq.c @@ -40,9 +40,9 @@ * Define weak versions to play nice with binaries that are statically linked * against a libc that doesn't support registering its own rseq. */ -__weak ptrdiff_t __rseq_offset; -__weak unsigned int __rseq_size; -__weak unsigned int __rseq_flags; +extern __weak ptrdiff_t __rseq_offset; +extern __weak unsigned int __rseq_size; +extern __weak unsigned int __rseq_flags;
static const ptrdiff_t *libc_rseq_offset_p = &__rseq_offset; static const unsigned int *libc_rseq_size_p = &__rseq_size; @@ -198,7 +198,7 @@ void rseq_init(void) * libc not having registered a restartable sequence. Try to find the * symbols if that's the case. */ - if (!*libc_rseq_size_p) { + if (!libc_rseq_size_p || !*libc_rseq_size_p) { libc_rseq_offset_p = dlsym(RTLD_NEXT, "__rseq_offset"); libc_rseq_size_p = dlsym(RTLD_NEXT, "__rseq_size"); libc_rseq_flags_p = dlsym(RTLD_NEXT, "__rseq_flags");
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jani Nurminen jani.nurminen@windriver.com
commit 98a4f5b7359205ced1b6a626df3963bf7c5e5052 upstream.
When PCIe has been set up by the bootloader, the ecam_size field in the E_ECAM_CONTROL register already contains a value.
The driver previously programmed it to 0xc (for 16 busses; 16 MB), but bumped to 0x10 (for 256 busses; 256 MB) by the commit 2fccd11518f1 ("PCI: xilinx-nwl: Modify ECAM size to enable support for 256 buses").
Regardless of what the bootloader has programmed, the driver ORs in a new maximal value without doing a proper RMW sequence. This can lead to problems.
For example, if the bootloader programs in 0xc and the driver uses 0x10, the ORed result is 0x1c, which is beyond the ecam_max_size limit of 0x10 (from E_ECAM_CAPABILITIES).
Avoid the problems by doing a proper RMW.
Fixes: 2fccd11518f1 ("PCI: xilinx-nwl: Modify ECAM size to enable support for 256 buses") Signed-off-by: Jani Nurminen jani.nurminen@windriver.com [mani: added stable tag] Signed-off-by: Manivannan Sadhasivam mani@kernel.org Signed-off-by: Bjorn Helgaas bhelgaas@google.com Cc: stable@vger.kernel.org Link: https://patch.msgid.link/e83a2af2-af0b-4670-bcf5-ad408571c2b0@windriver.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/pcie-xilinx-nwl.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
--- a/drivers/pci/controller/pcie-xilinx-nwl.c +++ b/drivers/pci/controller/pcie-xilinx-nwl.c @@ -721,9 +721,10 @@ static int nwl_pcie_bridge_init(struct n nwl_bridge_writel(pcie, nwl_bridge_readl(pcie, E_ECAM_CONTROL) | E_ECAM_CR_ENABLE, E_ECAM_CONTROL);
- nwl_bridge_writel(pcie, nwl_bridge_readl(pcie, E_ECAM_CONTROL) | - (NWL_ECAM_MAX_SIZE << E_ECAM_SIZE_SHIFT), - E_ECAM_CONTROL); + ecam_val = nwl_bridge_readl(pcie, E_ECAM_CONTROL); + ecam_val &= ~E_ECAM_SIZE_LOC; + ecam_val |= NWL_ECAM_MAX_SIZE << E_ECAM_SIZE_SHIFT; + nwl_bridge_writel(pcie, ecam_val, E_ECAM_CONTROL);
nwl_bridge_writel(pcie, lower_32_bits(pcie->phys_ecam_base), E_ECAM_BASE_LO);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marek Vasut marek.vasut+renesas@mailbox.org
commit 26fda92d3b56bf44a02bcb4001c5a5548e0ae8ee upstream.
The tegra_msi_irq_unmask() function may be called from a PCI driver request_threaded_irq() function. This triggers kernel/irq/manage.c __setup_irq() which locks raw spinlock &desc->lock descriptor lock and with that descriptor lock held, calls tegra_msi_irq_unmask().
Since the &desc->lock descriptor lock is a raw spinlock, and the tegra_msi .mask_lock is not a raw spinlock, this setup triggers 'BUG: Invalid wait context' with CONFIG_PROVE_RAW_LOCK_NESTING=y.
Use scoped_guard() to simplify the locking.
Fixes: 2c99e55f7955 ("PCI: tegra: Convert to MSI domains") Reported-by: Geert Uytterhoeven geert+renesas@glider.be Closes: https://patchwork.kernel.org/project/linux-pci/patch/20250909162707.13927-2-... Signed-off-by: Marek Vasut marek.vasut+renesas@mailbox.org Signed-off-by: Manivannan Sadhasivam mani@kernel.org Signed-off-by: Bjorn Helgaas bhelgaas@google.com Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250922150811.88450-1-marek.vasut+renesas@mailbox.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/pci-tegra.c | 27 +++++++++++++-------------- 1 file changed, 13 insertions(+), 14 deletions(-)
--- a/drivers/pci/controller/pci-tegra.c +++ b/drivers/pci/controller/pci-tegra.c @@ -14,6 +14,7 @@ */
#include <linux/clk.h> +#include <linux/cleanup.h> #include <linux/debugfs.h> #include <linux/delay.h> #include <linux/export.h> @@ -269,7 +270,7 @@ struct tegra_msi { DECLARE_BITMAP(used, INT_PCI_MSI_NR); struct irq_domain *domain; struct mutex map_lock; - spinlock_t mask_lock; + raw_spinlock_t mask_lock; void *virt; dma_addr_t phys; int irq; @@ -1604,14 +1605,13 @@ static void tegra_msi_irq_mask(struct ir struct tegra_msi *msi = irq_data_get_irq_chip_data(d); struct tegra_pcie *pcie = msi_to_pcie(msi); unsigned int index = d->hwirq / 32; - unsigned long flags; u32 value;
- spin_lock_irqsave(&msi->mask_lock, flags); - value = afi_readl(pcie, AFI_MSI_EN_VEC(index)); - value &= ~BIT(d->hwirq % 32); - afi_writel(pcie, value, AFI_MSI_EN_VEC(index)); - spin_unlock_irqrestore(&msi->mask_lock, flags); + scoped_guard(raw_spinlock_irqsave, &msi->mask_lock) { + value = afi_readl(pcie, AFI_MSI_EN_VEC(index)); + value &= ~BIT(d->hwirq % 32); + afi_writel(pcie, value, AFI_MSI_EN_VEC(index)); + } }
static void tegra_msi_irq_unmask(struct irq_data *d) @@ -1619,14 +1619,13 @@ static void tegra_msi_irq_unmask(struct struct tegra_msi *msi = irq_data_get_irq_chip_data(d); struct tegra_pcie *pcie = msi_to_pcie(msi); unsigned int index = d->hwirq / 32; - unsigned long flags; u32 value;
- spin_lock_irqsave(&msi->mask_lock, flags); - value = afi_readl(pcie, AFI_MSI_EN_VEC(index)); - value |= BIT(d->hwirq % 32); - afi_writel(pcie, value, AFI_MSI_EN_VEC(index)); - spin_unlock_irqrestore(&msi->mask_lock, flags); + scoped_guard(raw_spinlock_irqsave, &msi->mask_lock) { + value = afi_readl(pcie, AFI_MSI_EN_VEC(index)); + value |= BIT(d->hwirq % 32); + afi_writel(pcie, value, AFI_MSI_EN_VEC(index)); + } }
static void tegra_compose_msi_msg(struct irq_data *data, struct msi_msg *msg) @@ -1736,7 +1735,7 @@ static int tegra_pcie_msi_setup(struct t int err;
mutex_init(&msi->map_lock); - spin_lock_init(&msi->mask_lock); + raw_spin_lock_init(&msi->mask_lock);
if (IS_ENABLED(CONFIG_PCI_MSI)) { err = tegra_allocate_domains(msi);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Brian Norris briannorris@google.com
commit 48991e4935078b05f80616c75d1ee2ea3ae18e58 upstream.
The "max_link_width", "current_link_speed", "current_link_width", "secondary_bus_number", and "subordinate_bus_number" sysfs files all access config registers, but they don't check the runtime PM state. If the device is in D3cold or a parent bridge is suspended, we may see -EINVAL, bogus values, or worse, depending on implementation details.
Wrap these access in pci_config_pm_runtime_{get,put}() like most of the rest of the similar sysfs attributes.
Notably, "max_link_speed" does not access config registers; it returns a cached value since d2bd39c0456b ("PCI: Store all PCIe Supported Link Speeds").
Fixes: 56c1af4606f0 ("PCI: Add sysfs max_link_speed/width, current_link_speed/width, etc") Signed-off-by: Brian Norris briannorris@google.com Signed-off-by: Brian Norris briannorris@chromium.org Signed-off-by: Bjorn Helgaas bhelgaas@google.com Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250924095711.v2.1.Ibb5b6ca1e2c059e04ec53140cd98a4... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/pci-sysfs.c | 20 +++++++++++++++++++- 1 file changed, 19 insertions(+), 1 deletion(-)
--- a/drivers/pci/pci-sysfs.c +++ b/drivers/pci/pci-sysfs.c @@ -200,8 +200,14 @@ static ssize_t max_link_width_show(struc struct device_attribute *attr, char *buf) { struct pci_dev *pdev = to_pci_dev(dev); + ssize_t ret;
- return sysfs_emit(buf, "%u\n", pcie_get_width_cap(pdev)); + /* We read PCI_EXP_LNKCAP, so we need the device to be accessible. */ + pci_config_pm_runtime_get(pdev); + ret = sysfs_emit(buf, "%u\n", pcie_get_width_cap(pdev)); + pci_config_pm_runtime_put(pdev); + + return ret; } static DEVICE_ATTR_RO(max_link_width);
@@ -213,7 +219,10 @@ static ssize_t current_link_speed_show(s int err; enum pci_bus_speed speed;
+ pci_config_pm_runtime_get(pci_dev); err = pcie_capability_read_word(pci_dev, PCI_EXP_LNKSTA, &linkstat); + pci_config_pm_runtime_put(pci_dev); + if (err) return -EINVAL;
@@ -230,7 +239,10 @@ static ssize_t current_link_width_show(s u16 linkstat; int err;
+ pci_config_pm_runtime_get(pci_dev); err = pcie_capability_read_word(pci_dev, PCI_EXP_LNKSTA, &linkstat); + pci_config_pm_runtime_put(pci_dev); + if (err) return -EINVAL;
@@ -246,7 +258,10 @@ static ssize_t secondary_bus_number_show u8 sec_bus; int err;
+ pci_config_pm_runtime_get(pci_dev); err = pci_read_config_byte(pci_dev, PCI_SECONDARY_BUS, &sec_bus); + pci_config_pm_runtime_put(pci_dev); + if (err) return -EINVAL;
@@ -262,7 +277,10 @@ static ssize_t subordinate_bus_number_sh u8 sub_bus; int err;
+ pci_config_pm_runtime_get(pci_dev); err = pci_read_config_byte(pci_dev, PCI_SUBORDINATE_BUS, &sub_bus); + pci_config_pm_runtime_put(pci_dev); + if (err) return -EINVAL;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Niklas Schnelle schnelle@linux.ibm.com
commit 05703271c3cdcc0f2a8cf6ebdc45892b8ca83520 upstream.
Before disabling SR-IOV via config space accesses to the parent PF, sriov_disable() first removes the PCI devices representing the VFs.
Since commit 9d16947b7583 ("PCI: Add global pci_lock_rescan_remove()") such removal operations are serialized against concurrent remove and rescan using the pci_rescan_remove_lock. No such locking was ever added in sriov_disable() however. In particular when commit 18f9e9d150fc ("PCI/IOV: Factor out sriov_add_vfs()") factored out the PCI device removal into sriov_del_vfs() there was still no locking around the pci_iov_remove_virtfn() calls.
On s390 the lack of serialization in sriov_disable() may cause double remove and list corruption with the below (amended) trace being observed:
PSW: 0704c00180000000 0000000c914e4b38 (klist_put+56) GPRS: 000003800313fb48 0000000000000000 0000000100000001 0000000000000001 00000000f9b520a8 0000000000000000 0000000000002fbd 00000000f4cc9480 0000000000000001 0000000000000000 0000000000000000 0000000180692828 00000000818e8000 000003800313fe2c 000003800313fb20 000003800313fad8 #0 [3800313fb20] device_del at c9158ad5c #1 [3800313fb88] pci_remove_bus_device at c915105ba #2 [3800313fbd0] pci_iov_remove_virtfn at c9152f198 #3 [3800313fc28] zpci_iov_remove_virtfn at c90fb67c0 #4 [3800313fc60] zpci_bus_remove_device at c90fb6104 #5 [3800313fca0] __zpci_event_availability at c90fb3dca #6 [3800313fd08] chsc_process_sei_nt0 at c918fe4a2 #7 [3800313fd60] crw_collect_info at c91905822 #8 [3800313fe10] kthread at c90feb390 #9 [3800313fe68] __ret_from_fork at c90f6aa64 #10 [3800313fe98] ret_from_fork at c9194f3f2.
This is because in addition to sriov_disable() removing the VFs, the platform also generates hot-unplug events for the VFs. This being the reverse operation to the hotplug events generated by sriov_enable() and handled via pdev->no_vf_scan. And while the event processing takes pci_rescan_remove_lock and checks whether the struct pci_dev still exists, the lack of synchronization makes this checking racy.
Other races may also be possible of course though given that this lack of locking persisted so long observable races seem very rare. Even on s390 the list corruption was only observed with certain devices since the platform events are only triggered by config accesses after the removal, so as long as the removal finished synchronously they would not race. Either way the locking is missing so fix this by adding it to the sriov_del_vfs() helper.
Just like PCI rescan-remove, locking is also missing in sriov_add_vfs() including for the error case where pci_stop_and_remove_bus_device() is called without the PCI rescan-remove lock being held. Even in the non-error case, adding new PCI devices and buses should be serialized via the PCI rescan-remove lock. Add the necessary locking.
Fixes: 18f9e9d150fc ("PCI/IOV: Factor out sriov_add_vfs()") Signed-off-by: Niklas Schnelle schnelle@linux.ibm.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Benjamin Block bblock@linux.ibm.com Reviewed-by: Farhan Ali alifm@linux.ibm.com Reviewed-by: Julian Ruess julianr@linux.ibm.com Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250826-pci_fix_sriov_disable-v1-1-2d0bc938f2a3@li... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/iov.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/drivers/pci/iov.c +++ b/drivers/pci/iov.c @@ -581,15 +581,18 @@ static int sriov_add_vfs(struct pci_dev if (dev->no_vf_scan) return 0;
+ pci_lock_rescan_remove(); for (i = 0; i < num_vfs; i++) { rc = pci_iov_add_virtfn(dev, i); if (rc) goto failed; } + pci_unlock_rescan_remove(); return 0; failed: while (i--) pci_iov_remove_virtfn(dev, i); + pci_unlock_rescan_remove();
return rc; } @@ -709,8 +712,10 @@ static void sriov_del_vfs(struct pci_dev struct pci_sriov *iov = dev->sriov; int i;
+ pci_lock_rescan_remove(); for (i = 0; i < iov->num_VFs; i++) pci_iov_remove_virtfn(dev, i); + pci_unlock_rescan_remove(); }
static void sriov_disable(struct pci_dev *dev)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lukas Wunner lukas@wunner.de
commit 1cbc5e25fb70e942a7a735a1f3d6dd391afc9b29 upstream.
Upon failure to recover from a PCIe error through AER, DPC or EDR, a uevent is sent to inform user space about disconnection of the bridge whose subordinate devices failed to recover.
However the bridge itself is not disconnected. Instead, a uevent should be sent for each of the subordinate devices.
Only if the "bridge" happens to be a Root Complex Event Collector or Integrated Endpoint does it make sense to send a uevent for it (because there are no subordinate devices).
Right now if there is a mix of subordinate devices with and without pci_error_handlers, a BEGIN_RECOVERY event is sent for those with pci_error_handlers but no FAILED_RECOVERY event is ever sent for them afterwards. Fix it.
Fixes: 856e1eb9bdd4 ("PCI/AER: Add uevents in AER and EEH error/resume") Signed-off-by: Lukas Wunner lukas@wunner.de Signed-off-by: Bjorn Helgaas bhelgaas@google.com Cc: stable@vger.kernel.org # v4.16+ Link: https://patch.msgid.link/68fc527a380821b5d861dd554d2ce42cb739591c.1755008151... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/pcie/err.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/drivers/pci/pcie/err.c +++ b/drivers/pci/pcie/err.c @@ -108,6 +108,12 @@ static int report_normal_detected(struct return report_error_detected(dev, pci_channel_io_normal, data); }
+static int report_perm_failure_detected(struct pci_dev *dev, void *data) +{ + pci_uevent_ers(dev, PCI_ERS_RESULT_DISCONNECT); + return 0; +} + static int report_mmio_enabled(struct pci_dev *dev, void *data) { struct pci_driver *pdrv; @@ -269,7 +275,7 @@ pci_ers_result_t pcie_do_recovery(struct failed: pci_walk_bridge(bridge, pci_pm_runtime_put, NULL);
- pci_uevent_ers(bridge, PCI_ERS_RESULT_DISCONNECT); + pci_walk_bridge(bridge, report_perm_failure_detected, NULL);
/* TODO: Should kernel panic here? */ pci_info(bridge, "device recovery failed\n");
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Niklas Schnelle schnelle@linux.ibm.com
commit bbf7d0468d0da71d76cc6ec9bc8a224325d07b6b upstream.
Since commit 7b42d97e99d3 ("PCI/ERR: Always report current recovery status for udev") AER uses the result of error_detected() as parameter to pci_uevent_ers(). As pci_uevent_ers() however does not handle PCI_ERS_RESULT_NEED_RESET this results in a missing uevent for the beginning of recovery if drivers request a reset. Fix this by treating PCI_ERS_RESULT_NEED_RESET as beginning recovery.
Fixes: 7b42d97e99d3 ("PCI/ERR: Always report current recovery status for udev") Signed-off-by: Niklas Schnelle schnelle@linux.ibm.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Lukas Wunner lukas@wunner.de Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250807-add_err_uevents-v5-1-adf85b0620b0@linux.ib... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/pci-driver.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/pci/pci-driver.c +++ b/drivers/pci/pci-driver.c @@ -1600,6 +1600,7 @@ void pci_uevent_ers(struct pci_dev *pdev switch (err_type) { case PCI_ERS_RESULT_NONE: case PCI_ERS_RESULT_CAN_RECOVER: + case PCI_ERS_RESULT_NEED_RESET: envp[idx++] = "ERROR_EVENT=BEGIN_RECOVERY"; envp[idx++] = "DEVICE_ONLINE=0"; break;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lukas Wunner lukas@wunner.de
commit 6633875250b38b18b8638cf01e695de031c71f02 upstream.
PCIe r6.0 defined five additional errors in the Uncorrectable Error Status, Mask and Severity Registers (PCIe r7.0 sec 7.8.4.2ff).
lspci has been supporting them since commit 144b0911cc0b ("ls-ecaps: extend decode support for more fields for AER CE and UE status"):
https://git.kernel.org/pub/scm/utils/pciutils/pciutils.git/commit/?id=144b09...
Amend the AER driver to recognize them as well, instead of logging them as "Unknown Error Bit".
Signed-off-by: Lukas Wunner lukas@wunner.de Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Kuppuswamy Sathyanarayanan sathyanarayanan.kuppuswamy@linux.intel.com Cc: stable@vger.kernel.org Link: https://patch.msgid.link/21f1875b18d4078c99353378f37dcd6b994f6d4e.1756301211... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/pcie/aer.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
--- a/drivers/pci/pcie/aer.c +++ b/drivers/pci/pcie/aer.c @@ -38,7 +38,7 @@ #define AER_ERROR_SOURCES_MAX 128
#define AER_MAX_TYPEOF_COR_ERRS 16 /* as per PCI_ERR_COR_STATUS */ -#define AER_MAX_TYPEOF_UNCOR_ERRS 27 /* as per PCI_ERR_UNCOR_STATUS*/ +#define AER_MAX_TYPEOF_UNCOR_ERRS 32 /* as per PCI_ERR_UNCOR_STATUS*/
struct aer_err_source { u32 status; /* PCI_ERR_ROOT_STATUS */ @@ -510,11 +510,11 @@ static const char *aer_uncorrectable_err "AtomicOpBlocked", /* Bit Position 24 */ "TLPBlockedErr", /* Bit Position 25 */ "PoisonTLPBlocked", /* Bit Position 26 */ - NULL, /* Bit Position 27 */ - NULL, /* Bit Position 28 */ - NULL, /* Bit Position 29 */ - NULL, /* Bit Position 30 */ - NULL, /* Bit Position 31 */ + "DMWrReqBlocked", /* Bit Position 27 */ + "IDECheck", /* Bit Position 28 */ + "MisIDETLP", /* Bit Position 29 */ + "PCRC_CHECK", /* Bit Position 30 */ + "TLPXlatBlocked", /* Bit Position 31 */ };
static const char *aer_agent_string[] = {
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Siddharth Vadapalli s-vadapalli@ti.com
commit f842d3313ba179d4005096357289c7ad09cec575 upstream.
The Cadence PCIe Controller integrated in the TI K3 SoCs supports both Root-Complex and Endpoint modes of operation. The Glue Layer allows "strapping" the Mode of operation of the Controller, the Link Speed and the Link Width. This is enabled by programming the "PCIEn_CTRL" register (n corresponds to the PCIe instance) within the CTRL_MMR memory-mapped register space. The "reset-values" of the registers are also different depending on the mode of operation.
Since the PCIe Controller latches onto the "reset-values" immediately after being powered on, if the Glue Layer configuration is not done while the PCIe Controller is off, it will result in the PCIe Controller latching onto the wrong "reset-values". In practice, this will show up as a wrong representation of the PCIe Controller's capability structures in the PCIe Configuration Space. Some such capabilities which are supported by the PCIe Controller in the Root-Complex mode but are incorrectly latched onto as being unsupported are: - Link Bandwidth Notification - Alternate Routing ID (ARI) Forwarding Support - Next capability offset within Advanced Error Reporting (AER) capability
Fix this by powering off the PCIe Controller before programming the "strap" settings and powering it on after that. The runtime PM APIs namely pm_runtime_put_sync() and pm_runtime_get_sync() will decrement and increment the usage counter respectively, causing GENPD to power off and power on the PCIe Controller.
Fixes: f3e25911a430 ("PCI: j721e: Add TI J721E PCIe driver") Signed-off-by: Siddharth Vadapalli s-vadapalli@ti.com Signed-off-by: Manivannan Sadhasivam mani@kernel.org Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250908120828.1471776-1-s-vadapalli@ti.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/cadence/pci-j721e.c | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+)
--- a/drivers/pci/controller/cadence/pci-j721e.c +++ b/drivers/pci/controller/cadence/pci-j721e.c @@ -277,6 +277,25 @@ static int j721e_pcie_ctrl_init(struct j if (!ret) offset = args.args[0];
+ /* + * The PCIe Controller's registers have different "reset-values" + * depending on the "strap" settings programmed into the PCIEn_CTRL + * register within the CTRL_MMR memory-mapped register space. + * The registers latch onto a "reset-value" based on the "strap" + * settings sampled after the PCIe Controller is powered on. + * To ensure that the "reset-values" are sampled accurately, power + * off the PCIe Controller before programming the "strap" settings + * and power it on after that. The runtime PM APIs namely + * pm_runtime_put_sync() and pm_runtime_get_sync() will decrement and + * increment the usage counter respectively, causing GENPD to power off + * and power on the PCIe Controller. + */ + ret = pm_runtime_put_sync(dev); + if (ret < 0) { + dev_err(dev, "Failed to power off PCIe Controller\n"); + return ret; + } + ret = j721e_pcie_set_mode(pcie, syscon, offset); if (ret < 0) { dev_err(dev, "Failed to set pci mode\n"); @@ -295,6 +314,12 @@ static int j721e_pcie_ctrl_init(struct j return ret; }
+ ret = pm_runtime_get_sync(dev); + if (ret < 0) { + dev_err(dev, "Failed to power on PCIe Controller\n"); + return ret; + } + /* Enable ACSPCIE refclk output if the optional property exists */ syscon = syscon_regmap_lookup_by_phandle_optional(node, "ti,syscon-acspcie-proxy-ctrl");
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Siddharth Vadapalli s-vadapalli@ti.com
commit e51d05f523e43ce5d2bad957943a2b14f68078cd upstream.
Commit under Fixes introduced the IRQ handler for "ks-pcie-error-irq". The interrupt is acquired using "request_irq()" but is never freed if the driver exits due to an error. Although the section in the driver that invokes "request_irq()" has moved around over time, the issue hasn't been addressed until now.
Fix this by using "devm_request_irq()" which automatically frees the interrupt if the driver exits.
Fixes: 025dd3daeda7 ("PCI: keystone: Add error IRQ handler") Reported-by: Jiri Slaby jirislaby@kernel.org Closes: https://lore.kernel.org/r/3d3a4b52-e343-42f3-9d69-94c259812143@kernel.org Signed-off-by: Siddharth Vadapalli s-vadapalli@ti.com Signed-off-by: Manivannan Sadhasivam mani@kernel.org Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250912100802.3136121-2-s-vadapalli@ti.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/dwc/pci-keystone.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/pci/controller/dwc/pci-keystone.c +++ b/drivers/pci/controller/dwc/pci-keystone.c @@ -1202,8 +1202,8 @@ static int ks_pcie_probe(struct platform if (irq < 0) return irq;
- ret = request_irq(irq, ks_pcie_err_irq_handler, IRQF_SHARED, - "ks-pcie-error-irq", ks_pcie); + ret = devm_request_irq(dev, irq, ks_pcie_err_irq_handler, IRQF_SHARED, + "ks-pcie-error-irq", ks_pcie); if (ret < 0) { dev_err(dev, "failed to request error IRQ %d\n", irq);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marek Vasut marek.vasut+renesas@mailbox.org
commit d96ac5bdc52b271b4f8ac0670a203913666b8758 upstream.
R-Car V4H Reference Manual R19UH0186EJ0130 Rev.1.30 Apr. 21, 2025 page 4581 Figure 104.3b Initial Setting of PCIEC(example), middle of the figure indicates that fourth write into register 0x148 [2:0] is 0x3 or GENMASK(1, 0). The current code writes GENMASK(11, 0) which is a typo. Fix the typo.
Fixes: faf5a975ee3b ("PCI: rcar-gen4: Add support for R-Car V4H") Signed-off-by: Marek Vasut marek.vasut+renesas@mailbox.org Signed-off-by: Manivannan Sadhasivam mani@kernel.org Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250806192548.133140-1-marek.vasut+renesas@mailbox... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/dwc/pcie-rcar-gen4.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/pci/controller/dwc/pcie-rcar-gen4.c +++ b/drivers/pci/controller/dwc/pcie-rcar-gen4.c @@ -723,7 +723,7 @@ static int rcar_gen4_pcie_ltssm_control( rcar_gen4_pcie_phy_reg_update_bits(rcar, 0x148, GENMASK(23, 22), BIT(22)); rcar_gen4_pcie_phy_reg_update_bits(rcar, 0x148, GENMASK(18, 16), GENMASK(17, 16)); rcar_gen4_pcie_phy_reg_update_bits(rcar, 0x148, GENMASK(7, 6), BIT(6)); - rcar_gen4_pcie_phy_reg_update_bits(rcar, 0x148, GENMASK(2, 0), GENMASK(11, 0)); + rcar_gen4_pcie_phy_reg_update_bits(rcar, 0x148, GENMASK(2, 0), GENMASK(1, 0)); rcar_gen4_pcie_phy_reg_update_bits(rcar, 0x1d4, GENMASK(16, 15), GENMASK(16, 15)); rcar_gen4_pcie_phy_reg_update_bits(rcar, 0x514, BIT(26), BIT(26)); rcar_gen4_pcie_phy_reg_update_bits(rcar, 0x0f8, BIT(16), 0);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marek Vasut marek.vasut+renesas@mailbox.org
commit 0a8f173d9dad13930d5888505dc4c4fd6a1d4262 upstream.
The pmsr_lock spinlock used to be necessary to synchronize access to the PMSR register, because that access could have been triggered from either config space access in rcar_pcie_config_access() or an exception handler rcar_pcie_aarch32_abort_handler().
The rcar_pcie_aarch32_abort_handler() case is no longer applicable since commit 6e36203bc14c ("PCI: rcar: Use PCI_SET_ERROR_RESPONSE after read which triggered an exception"), which performs more accurate, controlled invocation of the exception, and a fixup.
This leaves rcar_pcie_config_access() as the only call site from which rcar_pcie_wakeup() is called. The rcar_pcie_config_access() can only be called from the controller struct pci_ops .read and .write callbacks, and those are serialized in drivers/pci/access.c using raw spinlock 'pci_lock' . It should be noted that CONFIG_PCI_LOCKLESS_CONFIG is never set on this platform.
Since the 'pci_lock' is a raw spinlock , and the 'pmsr_lock' is not a raw spinlock, this constellation triggers 'BUG: Invalid wait context' with CONFIG_PROVE_RAW_LOCK_NESTING=y .
Remove the pmsr_lock to fix the locking.
Fixes: a115b1bd3af0 ("PCI: rcar: Add L1 link state fix into data abort hook") Reported-by: Duy Nguyen duy.nguyen.rh@renesas.com Reported-by: Thuan Nguyen thuan.nguyen-hong@banvien.com.vn Signed-off-by: Marek Vasut marek.vasut+renesas@mailbox.org Signed-off-by: Manivannan Sadhasivam mani@kernel.org Reviewed-by: Geert Uytterhoeven geert+renesas@glider.be Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250909162707.13927-1-marek.vasut+renesas@mailbox.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/pcie-rcar-host.c | 13 ++----------- 1 file changed, 2 insertions(+), 11 deletions(-)
--- a/drivers/pci/controller/pcie-rcar-host.c +++ b/drivers/pci/controller/pcie-rcar-host.c @@ -51,20 +51,13 @@ struct rcar_pcie_host { int (*phy_init_fn)(struct rcar_pcie_host *host); };
-static DEFINE_SPINLOCK(pmsr_lock); - static int rcar_pcie_wakeup(struct device *pcie_dev, void __iomem *pcie_base) { - unsigned long flags; u32 pmsr, val; int ret = 0;
- spin_lock_irqsave(&pmsr_lock, flags); - - if (!pcie_base || pm_runtime_suspended(pcie_dev)) { - ret = -EINVAL; - goto unlock_exit; - } + if (!pcie_base || pm_runtime_suspended(pcie_dev)) + return -EINVAL;
pmsr = readl(pcie_base + PMSR);
@@ -86,8 +79,6 @@ static int rcar_pcie_wakeup(struct devic writel(L1FAEG | PMEL1RX, pcie_base + PMSR); }
-unlock_exit: - spin_unlock_irqrestore(&pmsr_lock, flags); return ret; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marek Vasut marek.vasut+renesas@mailbox.org
commit 5ed35b4d490d8735021cce9b715b62a418310864 upstream.
The rcar_msi_irq_unmask() function may be called from a PCI driver request_threaded_irq() function. This triggers kernel/irq/manage.c __setup_irq() which locks raw spinlock &desc->lock descriptor lock and with that descriptor lock held, calls rcar_msi_irq_unmask().
Since the &desc->lock descriptor lock is a raw spinlock, and the rcar_msi .mask_lock is not a raw spinlock, this setup triggers 'BUG: Invalid wait context' with CONFIG_PROVE_RAW_LOCK_NESTING=y.
Use scoped_guard() to simplify the locking.
Fixes: 83ed8d4fa656 ("PCI: rcar: Convert to MSI domains") Reported-by: Duy Nguyen duy.nguyen.rh@renesas.com Reported-by: Thuan Nguyen thuan.nguyen-hong@banvien.com.vn Signed-off-by: Marek Vasut marek.vasut+renesas@mailbox.org Signed-off-by: Manivannan Sadhasivam mani@kernel.org Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Geert Uytterhoeven geert+renesas@glider.be Acked-by: Marc Zyngier maz@kernel.org Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250909162707.13927-2-marek.vasut+renesas@mailbox.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/pcie-rcar-host.c | 27 +++++++++++++-------------- 1 file changed, 13 insertions(+), 14 deletions(-)
--- a/drivers/pci/controller/pcie-rcar-host.c +++ b/drivers/pci/controller/pcie-rcar-host.c @@ -12,6 +12,7 @@ */
#include <linux/bitops.h> +#include <linux/cleanup.h> #include <linux/clk.h> #include <linux/clk-provider.h> #include <linux/delay.h> @@ -37,7 +38,7 @@ struct rcar_msi { DECLARE_BITMAP(used, INT_PCI_MSI_NR); struct irq_domain *domain; struct mutex map_lock; - spinlock_t mask_lock; + raw_spinlock_t mask_lock; int irq1; int irq2; }; @@ -625,28 +626,26 @@ static void rcar_msi_irq_mask(struct irq { struct rcar_msi *msi = irq_data_get_irq_chip_data(d); struct rcar_pcie *pcie = &msi_to_host(msi)->pcie; - unsigned long flags; u32 value;
- spin_lock_irqsave(&msi->mask_lock, flags); - value = rcar_pci_read_reg(pcie, PCIEMSIIER); - value &= ~BIT(d->hwirq); - rcar_pci_write_reg(pcie, value, PCIEMSIIER); - spin_unlock_irqrestore(&msi->mask_lock, flags); + scoped_guard(raw_spinlock_irqsave, &msi->mask_lock) { + value = rcar_pci_read_reg(pcie, PCIEMSIIER); + value &= ~BIT(d->hwirq); + rcar_pci_write_reg(pcie, value, PCIEMSIIER); + } }
static void rcar_msi_irq_unmask(struct irq_data *d) { struct rcar_msi *msi = irq_data_get_irq_chip_data(d); struct rcar_pcie *pcie = &msi_to_host(msi)->pcie; - unsigned long flags; u32 value;
- spin_lock_irqsave(&msi->mask_lock, flags); - value = rcar_pci_read_reg(pcie, PCIEMSIIER); - value |= BIT(d->hwirq); - rcar_pci_write_reg(pcie, value, PCIEMSIIER); - spin_unlock_irqrestore(&msi->mask_lock, flags); + scoped_guard(raw_spinlock_irqsave, &msi->mask_lock) { + value = rcar_pci_read_reg(pcie, PCIEMSIIER); + value |= BIT(d->hwirq); + rcar_pci_write_reg(pcie, value, PCIEMSIIER); + } }
static void rcar_compose_msi_msg(struct irq_data *data, struct msi_msg *msg) @@ -756,7 +755,7 @@ static int rcar_pcie_enable_msi(struct r int err;
mutex_init(&msi->map_lock); - spin_lock_init(&msi->mask_lock); + raw_spin_lock_init(&msi->mask_lock);
err = of_address_to_resource(dev->of_node, 0, &res); if (err)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Niklas Cassel cassel@kernel.org
commit b640d42a6ac9ba01abe65ec34f7c73aaf6758ab8 upstream.
The pci_epc_raise_irq() supplies a MSI or MSI-X interrupt number in range (1-N), as per the pci_epc_raise_irq() kdoc, where N is 32 for MSI.
But tegra_pcie_ep_raise_msi_irq() incorrectly uses the interrupt number as the MSI vector. This causes wrong MSI vector to be triggered, leading to the failure of PCI endpoint Kselftest MSI_TEST test case.
To fix this issue, convert the interrupt number to MSI vector.
Fixes: c57247f940e8 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194") Signed-off-by: Niklas Cassel cassel@kernel.org Signed-off-by: Manivannan Sadhasivam mani@kernel.org Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250922140822.519796-6-cassel@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/dwc/pcie-tegra194.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/pci/controller/dwc/pcie-tegra194.c +++ b/drivers/pci/controller/dwc/pcie-tegra194.c @@ -1954,10 +1954,10 @@ static int tegra_pcie_ep_raise_intx_irq(
static int tegra_pcie_ep_raise_msi_irq(struct tegra_pcie_dw *pcie, u16 irq) { - if (unlikely(irq > 31)) + if (unlikely(irq > 32)) return -EINVAL;
- appl_writel(pcie, BIT(irq), APPL_MSI_CTRL_1); + appl_writel(pcie, BIT(irq - 1), APPL_MSI_CTRL_1);
return 0; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vidya Sagar vidyas@nvidia.com
commit f8c9ad46b00453a8c075453f3745f8d263f44834 upstream.
The return value from tegra_bpmp_transfer() indicates the success or failure of the IPC transaction with BPMP. If the transaction succeeded, we also need to check the actual command's result code.
If we don't have error handling for tegra_bpmp_transfer(), we will set the pcie->ep_state to EP_STATE_ENABLED even when the tegra_bpmp_transfer() command fails. Thus, the pcie->ep_state will get out of sync with reality, and any further PERST# assert + deassert will be a no-op and will not trigger the hardware initialization sequence.
This is because pex_ep_event_pex_rst_deassert() checks the current pcie->ep_state, and does nothing if the current state is already EP_STATE_ENABLED.
Thus, it is important to have error handling for tegra_bpmp_transfer(), such that the pcie->ep_state can not get out of sync with reality, so that we will try to initialize the hardware not only during the first PERST# assert + deassert, but also during any succeeding PERST# assert + deassert.
One example where this fix is needed is when using a rock5b as host. During the initial PERST# assert + deassert (triggered by the bootloader on the rock5b) pex_ep_event_pex_rst_deassert() will get called, but for some unknown reason, the tegra_bpmp_transfer() call to initialize the PHY fails. Once Linux has been loaded on the rock5b, the PCIe driver will once again assert + deassert PERST#. However, without tegra_bpmp_transfer() error handling, this second PERST# assert + deassert will not trigger the hardware initialization sequence.
With tegra_bpmp_transfer() error handling, the second PERST# assert + deassert will once again trigger the hardware to be initialized and this time the tegra_bpmp_transfer() succeeds.
Fixes: c57247f940e8 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194") Signed-off-by: Vidya Sagar vidyas@nvidia.com [cassel: improve commit log] Signed-off-by: Niklas Cassel cassel@kernel.org Signed-off-by: Manivannan Sadhasivam mani@kernel.org Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Jon Hunter jonathanh@nvidia.com Acked-by: Thierry Reding treding@nvidia.com Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250922140822.519796-8-cassel@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/dwc/pcie-tegra194.c | 18 ++++++++++++++++-- 1 file changed, 16 insertions(+), 2 deletions(-)
--- a/drivers/pci/controller/dwc/pcie-tegra194.c +++ b/drivers/pci/controller/dwc/pcie-tegra194.c @@ -1205,6 +1205,7 @@ static int tegra_pcie_bpmp_set_ctrl_stat struct mrq_uphy_response resp; struct tegra_bpmp_message msg; struct mrq_uphy_request req; + int err;
/* * Controller-5 doesn't need to have its state set by BPMP-FW in @@ -1227,7 +1228,13 @@ static int tegra_pcie_bpmp_set_ctrl_stat msg.rx.data = &resp; msg.rx.size = sizeof(resp);
- return tegra_bpmp_transfer(pcie->bpmp, &msg); + err = tegra_bpmp_transfer(pcie->bpmp, &msg); + if (err) + return err; + if (msg.rx.ret) + return -EINVAL; + + return 0; }
static int tegra_pcie_bpmp_set_pll_state(struct tegra_pcie_dw *pcie, @@ -1236,6 +1243,7 @@ static int tegra_pcie_bpmp_set_pll_state struct mrq_uphy_response resp; struct tegra_bpmp_message msg; struct mrq_uphy_request req; + int err;
memset(&req, 0, sizeof(req)); memset(&resp, 0, sizeof(resp)); @@ -1255,7 +1263,13 @@ static int tegra_pcie_bpmp_set_pll_state msg.rx.data = &resp; msg.rx.size = sizeof(resp);
- return tegra_bpmp_transfer(pcie->bpmp, &msg); + err = tegra_bpmp_transfer(pcie->bpmp, &msg); + if (err) + return err; + if (msg.rx.ret) + return -EINVAL; + + return 0; }
static void tegra_pcie_downstream_dev_to_D0(struct tegra_pcie_dw *pcie)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Niklas Cassel cassel@kernel.org
commit 42f9c66a6d0cc45758dab77233c5460e1cf003df upstream.
Tegra already defines all BARs except BAR0 as BAR_RESERVED. This is sufficient for pci-epf-test to not allocate backing memory and to not call set_bar() for those BARs. However, marking a BAR as BAR_RESERVED does not mean that the BAR gets disabled.
The host side driver, pci_endpoint_test, simply does an ioremap for all enabled BARs and will run tests against all enabled BARs, so it will run tests against the BARs marked as BAR_RESERVED.
After running the BAR tests (which will write to all enabled BARs), the inbound address translation is broken. This is because the tegra controller exposes the ATU Port Logic Structure in BAR4, so when BAR4 is written, the inbound address translation settings get overwritten.
To avoid this, implement the dw_pcie_ep_ops .init() callback and start off by disabling all BARs (pci-epf-test will later enable/configure BARs that are not defined as BAR_RESERVED).
This matches the behavior of other PCIe endpoint drivers: dra7xx, imx6, layerscape-ep, artpec6, dw-rockchip, qcom-ep, rcar-gen4, and uniphier-ep.
With this, the PCI endpoint kselftest test case CONSECUTIVE_BAR_TEST (which was specifically made to detect address translation issues) passes.
Fixes: c57247f940e8 ("PCI: tegra: Add support for PCIe endpoint mode in Tegra194") Signed-off-by: Niklas Cassel cassel@kernel.org Signed-off-by: Manivannan Sadhasivam mani@kernel.org Signed-off-by: Bjorn Helgaas bhelgaas@google.com Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250922140822.519796-7-cassel@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/dwc/pcie-tegra194.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
--- a/drivers/pci/controller/dwc/pcie-tegra194.c +++ b/drivers/pci/controller/dwc/pcie-tegra194.c @@ -1954,6 +1954,15 @@ static irqreturn_t tegra_pcie_ep_pex_rst return IRQ_HANDLED; }
+static void tegra_pcie_ep_init(struct dw_pcie_ep *ep) +{ + struct dw_pcie *pci = to_dw_pcie_from_ep(ep); + enum pci_barno bar; + + for (bar = 0; bar < PCI_STD_NUM_BARS; bar++) + dw_pcie_ep_reset_bar(pci, bar); +}; + static int tegra_pcie_ep_raise_intx_irq(struct tegra_pcie_dw *pcie, u16 irq) { /* Tegra194 supports only INTA */ @@ -2030,6 +2039,7 @@ tegra_pcie_ep_get_features(struct dw_pci }
static const struct dw_pcie_ep_ops pcie_ep_ops = { + .init = tegra_pcie_ep_init, .raise_irq = tegra_pcie_ep_raise_irq, .get_features = tegra_pcie_ep_get_features, };
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pratyush Yadav pratyush@kernel.org
commit 29e0b471ccbd674d20d4bbddea1a51e7105212c5 upstream.
cqspi_indirect_read_execute() and cqspi_indirect_write_execute() first set the enable bit on APB region and then start reading/writing to the AHB region. On TI K3 SoCs these regions lie on different endpoints. This means that the order of the two operations is not guaranteed, and they might be reordered at the interconnect level.
It is possible for the AHB write to be executed before the APB write to enable the indirect controller, causing the transaction to be invalid and the write erroring out. Read back the APB region write before accessing the AHB region to make sure the write got flushed and the race condition is eliminated.
Fixes: 140623410536 ("mtd: spi-nor: Add driver for Cadence Quad SPI Flash Controller") CC: stable@vger.kernel.org Reviewed-by: Pratyush Yadav pratyush@kernel.org Signed-off-by: Pratyush Yadav pratyush@kernel.org Signed-off-by: Santhosh Kumar K s-k6@ti.com Message-ID: 20250905185958.3575037-2-s-k6@ti.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/spi/spi-cadence-quadspi.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/spi/spi-cadence-quadspi.c +++ b/drivers/spi/spi-cadence-quadspi.c @@ -754,6 +754,7 @@ static int cqspi_indirect_read_execute(s reinit_completion(&cqspi->transfer_complete); writel(CQSPI_REG_INDIRECTRD_START_MASK, reg_base + CQSPI_REG_INDIRECTRD); + readl(reg_base + CQSPI_REG_INDIRECTRD); /* Flush posted write. */
while (remaining > 0) { if (use_irq && @@ -1058,6 +1059,8 @@ static int cqspi_indirect_write_execute( reinit_completion(&cqspi->transfer_complete); writel(CQSPI_REG_INDIRECTWR_START_MASK, reg_base + CQSPI_REG_INDIRECTWR); + readl(reg_base + CQSPI_REG_INDIRECTWR); /* Flush posted write. */ + /* * As per 66AK2G02 TRM SPRUHY8F section 11.15.5.3 Indirect Access * Controller programming sequence, couple of cycles of
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pratyush Yadav pratyush@kernel.org
commit 1ad55767e77a853c98752ed1e33b68049a243bd7 upstream.
cqspi_read_setup() and cqspi_write_setup() program the address width as the last step in the setup. This is likely to be immediately followed by a DAC region read/write. On TI K3 SoCs the DAC region is on a different endpoint from the register region. This means that the order of the two operations is not guaranteed, and they might be reordered at the interconnect level. It is possible that the DAC read/write goes through before the address width update goes through. In this situation if the previous command used a different address width the OSPI command is sent with the wrong number of address bytes, resulting in an invalid command and undefined behavior.
Read back the size register to make sure the write gets flushed before accessing the DAC region.
Fixes: 140623410536 ("mtd: spi-nor: Add driver for Cadence Quad SPI Flash Controller") CC: stable@vger.kernel.org Reviewed-by: Pratyush Yadav pratyush@kernel.org Signed-off-by: Pratyush Yadav pratyush@kernel.org Signed-off-by: Santhosh Kumar K s-k6@ti.com Message-ID: 20250905185958.3575037-3-s-k6@ti.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/spi/spi-cadence-quadspi.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/spi/spi-cadence-quadspi.c +++ b/drivers/spi/spi-cadence-quadspi.c @@ -712,6 +712,7 @@ static int cqspi_read_setup(struct cqspi reg &= ~CQSPI_REG_SIZE_ADDRESS_MASK; reg |= (op->addr.nbytes - 1); writel(reg, reg_base + CQSPI_REG_SIZE); + readl(reg_base + CQSPI_REG_SIZE); /* Flush posted write. */ return 0; }
@@ -1034,6 +1035,7 @@ static int cqspi_write_setup(struct cqsp reg &= ~CQSPI_REG_SIZE_ADDRESS_MASK; reg |= (op->addr.nbytes - 1); writel(reg, reg_base + CQSPI_REG_SIZE); + readl(reg_base + CQSPI_REG_SIZE); /* Flush posted write. */ return 0; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Santhosh Kumar K s-k6@ti.com
commit 858d4d9e0a9d6b64160ef3c824f428c9742172c4 upstream.
The 'max_cs' stores the largest chip select number. It should only be updated when the current 'cs' is greater than existing 'max_cs'. So, fix the condition accordingly.
Also, return failure if there are no flash device declared.
Fixes: 0f3841a5e115 ("spi: cadence-qspi: report correct number of chip-select") CC: stable@vger.kernel.org Reviewed-by: Pratyush Yadav pratyush@kernel.org Reviewed-by: Théo Lebrun theo.lebrun@bootlin.com Signed-off-by: Santhosh Kumar K s-k6@ti.com Message-ID: 20250905185958.3575037-4-s-k6@ti.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/spi/spi-cadence-quadspi.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
--- a/drivers/spi/spi-cadence-quadspi.c +++ b/drivers/spi/spi-cadence-quadspi.c @@ -1673,12 +1673,10 @@ static const struct spi_controller_mem_c
static int cqspi_setup_flash(struct cqspi_st *cqspi) { - unsigned int max_cs = cqspi->num_chipselect - 1; struct platform_device *pdev = cqspi->pdev; struct device *dev = &pdev->dev; struct cqspi_flash_pdata *f_pdata; - unsigned int cs; - int ret; + int ret, cs, max_cs = -1;
/* Get flash device data */ for_each_available_child_of_node_scoped(dev->of_node, np) { @@ -1691,10 +1689,10 @@ static int cqspi_setup_flash(struct cqsp if (cs >= cqspi->num_chipselect) { dev_err(dev, "Chip select %d out of range.\n", cs); return -EINVAL; - } else if (cs < max_cs) { - max_cs = cs; }
+ max_cs = max_t(int, cs, max_cs); + f_pdata = &cqspi->f_pdata[cs]; f_pdata->cqspi = cqspi; f_pdata->cs = cs; @@ -1704,6 +1702,11 @@ static int cqspi_setup_flash(struct cqsp return ret; }
+ if (max_cs < 0) { + dev_err(dev, "No flash device declared\n"); + return -ENODEV; + } + cqspi->num_chipselect = max_cs + 1; return 0; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xin Li (Intel) xin@zytor.com
commit 3da01ffe1aeaa0d427ab5235ba735226670a80d9 upstream.
The FRED specification has been changed in v9.0 to state that there is no need for FRED event handlers to begin with ENDBR64, because in the presence of supervisor indirect branch tracking, FRED event delivery does not enter the WAIT_FOR_ENDBRANCH state.
As a result, remove ENDBR64 from FRED entry points.
Then add ANNOTATE_NOENDBR to indicate that FRED entry points will never be used for indirect calls to suppress an objtool warning.
This change implies that any indirect CALL/JMP to FRED entry points causes #CP in the presence of supervisor indirect branch tracking.
Credit goes to Jennifer Miller jmill@asu.edu and other contributors from Arizona State University whose research shows that placing ENDBR at entry points has negative value thus led to this change.
Note: This is obviously an incompatible change to the FRED architecture. But, it's OK because there no FRED systems out in the wild today. All production hardware and late pre-production hardware will follow the FRED v9 spec and be compatible with this approach.
[ dhansen: add note to changelog about incompatibility ]
Fixes: 14619d912b65 ("x86/fred: FRED entry/exit and dispatch code") Signed-off-by: Xin Li (Intel) xin@zytor.com Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Reviewed-by: H. Peter Anvin (Intel) hpa@zytor.com Reviewed-by: Andrew Cooper andrew.cooper3@citrix.com Link: https://lore.kernel.org/linux-hardening/Z60NwR4w%2F28Z7XUa@ubun/ Cc:stable@vger.kernel.org Link: https://lore.kernel.org/all/20250716063320.1337818-1-xin%40zytor.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/entry/entry_64_fred.S | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/entry/entry_64_fred.S +++ b/arch/x86/entry/entry_64_fred.S @@ -16,7 +16,7 @@
.macro FRED_ENTER UNWIND_HINT_END_OF_STACK - ENDBR + ANNOTATE_NOENDBR PUSH_AND_CLEAR_REGS movq %rsp, %rdi /* %rdi -> pt_regs */ .endm
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson seanjc@google.com
commit 32278c677947ae2f042c9535674a7fff9a245dd3 upstream.
When checking for a potential UMIP violation on #GP, verify the decoder found at least two opcode bytes to avoid false positives when the kernel encounters an unknown instruction that starts with 0f. Because the array of opcode.bytes is zero-initialized by insn_init(), peeking at bytes[1] will misinterpret garbage as a potential SLDT or STR instruction, and can incorrectly trigger emulation.
E.g. if a VPALIGNR instruction
62 83 c5 05 0f 08 ff vpalignr xmm17{k5},xmm23,XMMWORD PTR [r8],0xff
hits a #GP, the kernel emulates it as STR and squashes the #GP (and corrupts the userspace code stream).
Arguably the check should look for exactly two bytes, but no three byte opcodes use '0f 00 xx' or '0f 01 xx' as an escape, i.e. it should be impossible to get a false positive if the first two opcode bytes match '0f 00' or '0f 01'. Go with a more conservative check with respect to the existing code to minimize the chances of breaking userspace, e.g. due to decoder weirdness.
Analyzed by Nick Bray ncbray@google.com.
Fixes: 1e5db223696a ("x86/umip: Add emulation code for UMIP instructions") Reported-by: Dan Snyder dansnyder@google.com Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/umip.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/arch/x86/kernel/umip.c +++ b/arch/x86/kernel/umip.c @@ -156,8 +156,8 @@ static int identify_insn(struct insn *in if (!insn->modrm.nbytes) return -EINVAL;
- /* All the instructions of interest start with 0x0f. */ - if (insn->opcode.bytes[0] != 0xf) + /* The instructions of interest have 2-byte opcodes: 0F 00 or 0F 01. */ + if (insn->opcode.nbytes < 2 || insn->opcode.bytes[0] != 0xf) return -EINVAL;
if (insn->opcode.bytes[1] == 0x1) {
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson seanjc@google.com
commit 27b1fd62012dfe9d3eb8ecde344d7aa673695ecf upstream.
Filter out the register forms of 0F 01 when determining whether or not to emulate in response to a potential UMIP violation #GP, as SGDT and SIDT only accept memory operands. The register variants of 0F 01 are used to encode instructions for things like VMX and SGX, i.e. not checking the Mod field would cause the kernel to incorrectly emulate on #GP, e.g. due to a CPL violation on VMLAUNCH.
Fixes: 1e5db223696a ("x86/umip: Add emulation code for UMIP instructions") Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/umip.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
--- a/arch/x86/kernel/umip.c +++ b/arch/x86/kernel/umip.c @@ -163,8 +163,19 @@ static int identify_insn(struct insn *in if (insn->opcode.bytes[1] == 0x1) { switch (X86_MODRM_REG(insn->modrm.value)) { case 0: + /* The reg form of 0F 01 /0 encodes VMX instructions. */ + if (X86_MODRM_MOD(insn->modrm.value) == 3) + return -EINVAL; + return UMIP_INST_SGDT; case 1: + /* + * The reg form of 0F 01 /1 encodes MONITOR/MWAIT, + * STAC/CLAC, and ENCLS. + */ + if (X86_MODRM_MOD(insn->modrm.value) == 3) + return -EINVAL; + return UMIP_INST_SIDT; case 4: return UMIP_INST_SMSW;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthieu Baerts (NGI0) matttbe@kernel.org
commit 008385efd05e04d8dff299382df2e8be0f91d8a0 upstream.
The previous commit adds an exception for the C-flag case. The 'mptcp_join.sh' selftest is extended to validate this case.
In this subtest, there is a typical CDN deployment with a client where MPTCP endpoints have been 'automatically' configured:
- the server set net.mptcp.allow_join_initial_addr_port=0
- the client has multiple 'subflow' endpoints, and the default limits: not accepting ADD_ADDRs.
Without the parent patch, the client is not able to establish new subflows using its 'subflow' endpoints. The parent commit fixes that.
The 'Fixes' tag here below is the same as the one from the previous commit: this patch here is not fixing anything wrong in the selftests, but it validates the previous fix for an issue introduced by this commit ID.
Fixes: df377be38725 ("mptcp: add deny_join_id0 in mptcp_options_received") Cc: stable@vger.kernel.org Reviewed-by: Geliang Tang geliang@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Link: https://patch.msgid.link/20250925-net-next-mptcp-c-flag-laminar-v1-2-ad126cc... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/net/mptcp/mptcp_join.sh | 11 +++++++++++ 1 file changed, 11 insertions(+)
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh @@ -3160,6 +3160,17 @@ deny_join_id0_tests() run_tests $ns1 $ns2 10.0.1.1 chk_join_nr 1 1 1 fi + + # default limits, server deny join id 0 + signal + if reset_with_allow_join_id0 "default limits, server deny join id 0" 0 1; then + pm_nl_set_limits $ns1 0 2 + pm_nl_set_limits $ns2 0 2 + pm_nl_add_endpoint $ns1 10.0.2.1 flags signal + pm_nl_add_endpoint $ns2 10.0.3.2 flags subflow + pm_nl_add_endpoint $ns2 10.0.4.2 flags subflow + run_tests $ns1 $ns2 10.0.1.1 + chk_join_nr 2 2 2 + fi }
fullmesh_tests()
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jaehoon Kim jhkim@linux.ibm.com
commit 130e6de62107116eba124647116276266be0f84c upstream.
The block layer validates buffer alignment using the device's dma_alignment value. If dma_alignment is smaller than logical_block_size(bp_block) -1, misaligned buffer incorrectly pass validation and propagate to the lower-level driver.
This patch adjusts dma_alignment to be at least logical_block_size -1, ensuring that misalignment buffers are properly rejected at the block layer and do not reach the DASD driver unnecessarily.
Fixes: 2a07bb64d801 ("s390/dasd: Remove DMA alignment") Reviewed-by: Stefan Haberland sth@linux.ibm.com Cc: stable@vger.kernel.org #6.11+ Signed-off-by: Jaehoon Kim jhkim@linux.ibm.com Signed-off-by: Stefan Haberland sth@linux.ibm.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/s390/block/dasd.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/drivers/s390/block/dasd.c +++ b/drivers/s390/block/dasd.c @@ -332,6 +332,11 @@ static int dasd_state_basic_to_ready(str lim.max_dev_sectors = device->discipline->max_sectors(block); lim.max_hw_sectors = lim.max_dev_sectors; lim.logical_block_size = block->bp_block; + /* + * Adjust dma_alignment to match block_size - 1 + * to ensure proper buffer alignment checks in the block layer. + */ + lim.dma_alignment = lim.logical_block_size - 1;
if (device->discipline->has_discard) { unsigned int max_bytes;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jaehoon Kim jhkim@linux.ibm.com
commit 8f4ed0ce4857ceb444174503fc9058720d4faaa1 upstream.
Currently, if CCW request creation fails with -EINVAL, the DASD driver returns BLK_STS_IOERR to the block layer.
This can happen, for example, when a user-space application such as QEMU passes a misaligned buffer, but the original cause of the error is masked as a generic I/O error.
This patch changes the behavior so that -EINVAL is returned as BLK_STS_INVAL, allowing user space to properly detect alignment issues instead of interpreting them as I/O errors.
Reviewed-by: Stefan Haberland sth@linux.ibm.com Cc: stable@vger.kernel.org #6.11+ Signed-off-by: Jaehoon Kim jhkim@linux.ibm.com Signed-off-by: Stefan Haberland sth@linux.ibm.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/s390/block/dasd.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-)
--- a/drivers/s390/block/dasd.c +++ b/drivers/s390/block/dasd.c @@ -3117,12 +3117,14 @@ static blk_status_t do_dasd_request(stru PTR_ERR(cqr) == -ENOMEM || PTR_ERR(cqr) == -EAGAIN) { rc = BLK_STS_RESOURCE; - goto out; + } else if (PTR_ERR(cqr) == -EINVAL) { + rc = BLK_STS_INVAL; + } else { + DBF_DEV_EVENT(DBF_ERR, basedev, + "CCW creation failed (rc=%ld) on request %p", + PTR_ERR(cqr), req); + rc = BLK_STS_IOERR; } - DBF_DEV_EVENT(DBF_ERR, basedev, - "CCW creation failed (rc=%ld) on request %p", - PTR_ERR(cqr), req); - rc = BLK_STS_IOERR; goto out; } /*
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Heiko Carstens hca@linux.ibm.com
commit fa7a0a53eeb7e16402f82c3d5a9ef4bf5efe9357 upstream.
If the decompressor is compiled with clang this can lead to the following warning:
In file included from arch/s390/boot/startup.c:4: ... In file included from ./include/linux/pgtable.h:6: ./arch/s390/include/asm/pgtable.h:2065:48: warning: passing 'unsigned long *' to parameter of type 'long *' converts between pointers to integer types with different sign [-Wpointer-sign] 2065 | value = __atomic64_or_barrier(PGSTE_PCL_BIT, ptr);
Add -Wno-pointer-sign to the decompressor compile flags, like it is also done for the kernel. This is similar to what was done for x86 to address the same problem [1].
[1] commit dca5203e3fe2 ("x86/boot: Add -Wno-pointer-sign to KBUILD_CFLAGS")
Cc: stable@vger.kernel.org Reported-by: Gerd Bayer gbayer@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Reviewed-by: Alexander Gordeev agordeev@linux.ibm.com Signed-off-by: Alexander Gordeev agordeev@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/s390/Makefile | 1 + 1 file changed, 1 insertion(+)
--- a/arch/s390/Makefile +++ b/arch/s390/Makefile @@ -25,6 +25,7 @@ endif KBUILD_CFLAGS_DECOMPRESSOR := $(CLANG_FLAGS) -m64 -O2 -mpacked-stack -std=gnu11 KBUILD_CFLAGS_DECOMPRESSOR += -DDISABLE_BRANCH_PROFILING -D__NO_FORTIFY KBUILD_CFLAGS_DECOMPRESSOR += -D__DECOMPRESSOR +KBUILD_CFLAGS_DECOMPRESSOR += -Wno-pointer-sign KBUILD_CFLAGS_DECOMPRESSOR += -fno-delete-null-pointer-checks -msoft-float -mbackchain KBUILD_CFLAGS_DECOMPRESSOR += -fno-asynchronous-unwind-tables KBUILD_CFLAGS_DECOMPRESSOR += -ffreestanding
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Suren Baghdasaryan surenb@google.com
commit 4038016397da5c1cebb10e7c85a36d06123724a8 upstream.
When object extension vector allocation fails, we set slab->obj_exts to OBJEXTS_ALLOC_FAIL to indicate the failure. Later, once the vector is successfully allocated, we will use this flag to mark codetag references stored in that vector as empty to avoid codetag warnings.
slab_obj_exts() used to retrieve the slab->obj_exts vector pointer checks slab->obj_exts for being either NULL or a pointer with MEMCG_DATA_OBJEXTS bit set. However it does not handle the case when slab->obj_exts equals OBJEXTS_ALLOC_FAIL. Add the missing condition to avoid extra warning.
Fixes: 09c46563ff6d ("codetag: debug: introduce OBJEXTS_ALLOC_FAIL to mark failed slab_ext allocations") Reported-by: Shakeel Butt shakeel.butt@linux.dev Closes: https://lore.kernel.org/all/jftidhymri2af5u3xtcqry3cfu6aqzte3uzlznhlaylgrdzt... Signed-off-by: Suren Baghdasaryan surenb@google.com Cc: stable@vger.kernel.org # v6.10+ Acked-by: Shakeel Butt shakeel.butt@linux.dev Signed-off-by: Vlastimil Babka vbabka@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/slab.h | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/mm/slab.h +++ b/mm/slab.h @@ -572,8 +572,12 @@ static inline struct slabobj_ext *slab_o unsigned long obj_exts = READ_ONCE(slab->obj_exts);
#ifdef CONFIG_MEMCG - VM_BUG_ON_PAGE(obj_exts && !(obj_exts & MEMCG_DATA_OBJEXTS), - slab_page(slab)); + /* + * obj_exts should be either NULL, a valid pointer with + * MEMCG_DATA_OBJEXTS bit set or be equal to OBJEXTS_ALLOC_FAIL. + */ + VM_BUG_ON_PAGE(obj_exts && !(obj_exts & MEMCG_DATA_OBJEXTS) && + obj_exts != OBJEXTS_ALLOC_FAIL, slab_page(slab)); VM_BUG_ON_PAGE(obj_exts & MEMCG_DATA_KMEM, slab_page(slab)); #endif return (struct slabobj_ext *)(obj_exts & ~OBJEXTS_FLAGS_MASK);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Suren Baghdasaryan surenb@google.com
commit f7381b9116407ba2a429977c80ff8df953ea9354 upstream.
alloc_slab_obj_exts() should mark failed obj_exts vector allocations independent on whether the vector is being allocated for a new or an existing slab. Current implementation skips doing this for existing slabs. Fix this by marking failed allocations unconditionally.
Fixes: 09c46563ff6d ("codetag: debug: introduce OBJEXTS_ALLOC_FAIL to mark failed slab_ext allocations") Reported-by: Shakeel Butt shakeel.butt@linux.dev Closes: https://lore.kernel.org/all/avhakjldsgczmq356gkwmvfilyvf7o6temvcmtt5lqd4fhp5... Signed-off-by: Suren Baghdasaryan surenb@google.com Cc: stable@vger.kernel.org # v6.10+ Acked-by: Shakeel Butt shakeel.butt@linux.dev Signed-off-by: Vlastimil Babka vbabka@suse.cz Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/slub.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/mm/slub.c +++ b/mm/slub.c @@ -1999,8 +1999,7 @@ int alloc_slab_obj_exts(struct slab *sla slab_nid(slab)); if (!vec) { /* Mark vectors which failed to allocate */ - if (new_slab) - mark_failed_objexts_alloc(slab); + mark_failed_objexts_alloc(slab);
return -ENOMEM; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Muhammad Usama Anjum usama.anjum@collabora.com
commit 32be3ca4cf78b309dfe7ba52fe2d7cc3c23c5634 upstream.
Don't deinitialize and reinitialize the HAL helpers. The dma memory is deallocated and there is high possibility that we'll not be able to get the same memory allocated from dma when there is high memory pressure.
Tested-on: WCN6855 hw2.0 PCI WLAN.HSP.1.1-03926.13-QCAHSPSWPL_V2_SILICONZ_CE-2.52297.6
Fixes: d5c65159f289 ("ath11k: driver for Qualcomm IEEE 802.11ax devices") Cc: stable@vger.kernel.org Cc: Baochen Qiang baochen.qiang@oss.qualcomm.com Reviewed-by: Baochen Qiang baochen.qiang@oss.qualcomm.com Signed-off-by: Muhammad Usama Anjum usama.anjum@collabora.com Link: https://patch.msgid.link/20250722053121.1145001-1-usama.anjum@collabora.com Signed-off-by: Jeff Johnson jeff.johnson@oss.qualcomm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/ath/ath11k/core.c | 6 +----- drivers/net/wireless/ath/ath11k/hal.c | 16 ++++++++++++++++ drivers/net/wireless/ath/ath11k/hal.h | 1 + 3 files changed, 18 insertions(+), 5 deletions(-)
--- a/drivers/net/wireless/ath/ath11k/core.c +++ b/drivers/net/wireless/ath/ath11k/core.c @@ -1936,14 +1936,10 @@ static int ath11k_core_reconfigure_on_cr mutex_unlock(&ab->core_lock);
ath11k_dp_free(ab); - ath11k_hal_srng_deinit(ab); + ath11k_hal_srng_clear(ab);
ab->free_vdev_map = (1LL << (ab->num_radios * TARGET_NUM_VDEVS(ab))) - 1;
- ret = ath11k_hal_srng_init(ab); - if (ret) - return ret; - clear_bit(ATH11K_FLAG_CRASH_FLUSH, &ab->dev_flags);
ret = ath11k_core_qmi_firmware_ready(ab); --- a/drivers/net/wireless/ath/ath11k/hal.c +++ b/drivers/net/wireless/ath/ath11k/hal.c @@ -1383,6 +1383,22 @@ void ath11k_hal_srng_deinit(struct ath11 } EXPORT_SYMBOL(ath11k_hal_srng_deinit);
+void ath11k_hal_srng_clear(struct ath11k_base *ab) +{ + /* No need to memset rdp and wrp memory since each individual + * segment would get cleared in ath11k_hal_srng_src_hw_init() + * and ath11k_hal_srng_dst_hw_init(). + */ + memset(ab->hal.srng_list, 0, + sizeof(ab->hal.srng_list)); + memset(ab->hal.shadow_reg_addr, 0, + sizeof(ab->hal.shadow_reg_addr)); + ab->hal.avail_blk_resource = 0; + ab->hal.current_blk_index = 0; + ab->hal.num_shadow_reg_configured = 0; +} +EXPORT_SYMBOL(ath11k_hal_srng_clear); + void ath11k_hal_dump_srng_stats(struct ath11k_base *ab) { struct hal_srng *srng; --- a/drivers/net/wireless/ath/ath11k/hal.h +++ b/drivers/net/wireless/ath/ath11k/hal.h @@ -965,6 +965,7 @@ int ath11k_hal_srng_setup(struct ath11k_ struct hal_srng_params *params); int ath11k_hal_srng_init(struct ath11k_base *ath11k); void ath11k_hal_srng_deinit(struct ath11k_base *ath11k); +void ath11k_hal_srng_clear(struct ath11k_base *ab); void ath11k_hal_dump_srng_stats(struct ath11k_base *ab); void ath11k_hal_srng_get_shadow_config(struct ath11k_base *ab, u32 **cfg, u32 *len);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nick Morrow morrownr@gmail.com
commit f6159b2051e157550d7609e19d04471609c6050b upstream.
Add VID/PID 0846/9072 for recently released Netgear A9000.
Signed-off-by: Nick Morrow morrownr@gmail.com Cc: stable@vger.kernel.org Link: https://patch.msgid.link/7afd3c3c-e7cf-4bd9-801d-bdfc76def506@gmail.com Signed-off-by: Felix Fietkau nbd@nbd.name Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/mediatek/mt76/mt7925/usb.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/net/wireless/mediatek/mt76/mt7925/usb.c +++ b/drivers/net/wireless/mediatek/mt76/mt7925/usb.c @@ -12,6 +12,9 @@ static const struct usb_device_id mt7925u_device_table[] = { { USB_DEVICE_AND_INTERFACE_INFO(0x0e8d, 0x7925, 0xff, 0xff, 0xff), .driver_info = (kernel_ulong_t)MT7925_FIRMWARE_WM }, + /* Netgear, Inc. A9000 */ + { USB_DEVICE_AND_INTERFACE_INFO(0x0846, 0x9072, 0xff, 0xff, 0xff), + .driver_info = (kernel_ulong_t)MT7925_FIRMWARE_WM }, { }, };
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nick Morrow morrownr@gmail.com
commit fc6627ca8a5f811b601aea74e934cf8a048c88ac upstream.
Add VID/PID 0846/9065 for Netgear A7500.
Reported-by: Autumn Dececco autumndececco@gmail.com Tested-by: Autumn Dececco autumndececco@gmail.com Signed-off-by: Nick Morrow morrownr@gmail.com Cc: stable@vger.kernel.org Acked-by: Lorenzo Bianconi lorenzo@kernel.org Link: https://patch.msgid.link/80bacfd6-6073-4ce5-be32-ae9580832337@gmail.com Signed-off-by: Felix Fietkau nbd@nbd.name Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/mediatek/mt76/mt7921/usb.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/net/wireless/mediatek/mt76/mt7921/usb.c +++ b/drivers/net/wireless/mediatek/mt76/mt7921/usb.c @@ -21,6 +21,9 @@ static const struct usb_device_id mt7921 /* Netgear, Inc. [A8000,AXE3000] */ { USB_DEVICE_AND_INTERFACE_INFO(0x0846, 0x9060, 0xff, 0xff, 0xff), .driver_info = (kernel_ulong_t)MT7921_FIRMWARE_WM }, + /* Netgear, Inc. A7500 */ + { USB_DEVICE_AND_INTERFACE_INFO(0x0846, 0x9065, 0xff, 0xff, 0xff), + .driver_info = (kernel_ulong_t)MT7921_FIRMWARE_WM }, /* TP-Link TXE50UH */ { USB_DEVICE_AND_INTERFACE_INFO(0x35bc, 0x0107, 0xff, 0xff, 0xff), .driver_info = (kernel_ulong_t)MT7921_FIRMWARE_WM },
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lance Yang lance.yang@linux.dev
commit 1ce6473d17e78e3cb9a40147658231731a551828 upstream.
When both THP and MTE are enabled, splitting a THP and replacing its zero-filled subpages with the shared zeropage can cause MTE tag mismatch faults in userspace.
Remapping zero-filled subpages to the shared zeropage is unsafe, as the zeropage has a fixed tag of zero, which may not match the tag expected by the userspace pointer.
KSM already avoids this problem by using memcmp_pages(), which on arm64 intentionally reports MTE-tagged pages as non-identical to prevent unsafe merging.
As suggested by David[1], this patch adopts the same pattern, replacing the memchr_inv() byte-level check with a call to pages_identical(). This leverages existing architecture-specific logic to determine if a page is truly identical to the shared zeropage.
Having both the THP shrinker and KSM rely on pages_identical() makes the design more future-proof, IMO. Instead of handling quirks in generic code, we just let the architecture decide what makes two pages identical.
[1] https://lore.kernel.org/all/ca2106a3-4bb2-4457-81af-301fd99fbef4@redhat.com
Link: https://lkml.kernel.org/r/20250922021458.68123-1-lance.yang@linux.dev Fixes: b1f202060afe ("mm: remap unused subpages to shared zeropage when splitting isolated thp") Signed-off-by: Lance Yang lance.yang@linux.dev Reported-by: Qun-wei Lin Qun-wei.Lin@mediatek.com Closes: https://lore.kernel.org/all/a7944523fcc3634607691c35311a5d59d1a3f8d4.camel@m... Suggested-by: David Hildenbrand david@redhat.com Acked-by: Zi Yan ziy@nvidia.com Acked-by: David Hildenbrand david@redhat.com Acked-by: Usama Arif usamaarif642@gmail.com Reviewed-by: Catalin Marinas catalin.marinas@arm.com Reviewed-by: Wei Yang richard.weiyang@gmail.com Cc: Alistair Popple apopple@nvidia.com Cc: andrew.yang andrew.yang@mediatek.com Cc: Baolin Wang baolin.wang@linux.alibaba.com Cc: Barry Song baohua@kernel.org Cc: Byungchul Park byungchul@sk.com Cc: Charlie Jenkins charlie@rivosinc.com Cc: Chinwen Chang chinwen.chang@mediatek.com Cc: Dev Jain dev.jain@arm.com Cc: Domenico Cerasuolo cerasuolodomenico@gmail.com Cc: Gregory Price gourry@gourry.net Cc: "Huang, Ying" ying.huang@linux.alibaba.com Cc: Hugh Dickins hughd@google.com Cc: Johannes Weiner hannes@cmpxchg.org Cc: Joshua Hahn joshua.hahnjy@gmail.com Cc: Kairui Song ryncsn@gmail.com Cc: Kalesh Singh kaleshsingh@google.com Cc: Liam Howlett liam.howlett@oracle.com Cc: Lorenzo Stoakes lorenzo.stoakes@oracle.com Cc: Mariano Pache npache@redhat.com Cc: Mathew Brost matthew.brost@intel.com Cc: Matthew Wilcox (Oracle) willy@infradead.org Cc: Mike Rapoport rppt@kernel.org Cc: Palmer Dabbelt palmer@rivosinc.com Cc: Rakie Kim rakie.kim@sk.com Cc: Rik van Riel riel@surriel.com Cc: Roman Gushchin roman.gushchin@linux.dev Cc: Ryan Roberts ryan.roberts@arm.com Cc: Samuel Holland samuel.holland@sifive.com Cc: Shakeel Butt shakeel.butt@linux.dev Cc: Suren Baghdasaryan surenb@google.com Cc: Yu Zhao yuzhao@google.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/huge_memory.c | 15 +++------------ mm/migrate.c | 8 +------- 2 files changed, 4 insertions(+), 19 deletions(-)
--- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -3715,32 +3715,23 @@ static unsigned long deferred_split_coun static bool thp_underused(struct folio *folio) { int num_zero_pages = 0, num_filled_pages = 0; - void *kaddr; int i;
if (khugepaged_max_ptes_none == HPAGE_PMD_NR - 1) return false;
for (i = 0; i < folio_nr_pages(folio); i++) { - kaddr = kmap_local_folio(folio, i * PAGE_SIZE); - if (!memchr_inv(kaddr, 0, PAGE_SIZE)) { - num_zero_pages++; - if (num_zero_pages > khugepaged_max_ptes_none) { - kunmap_local(kaddr); + if (pages_identical(folio_page(folio, i), ZERO_PAGE(0))) { + if (++num_zero_pages > khugepaged_max_ptes_none) return true; - } } else { /* * Another path for early exit once the number * of non-zero filled pages exceeds threshold. */ - num_filled_pages++; - if (num_filled_pages >= HPAGE_PMD_NR - khugepaged_max_ptes_none) { - kunmap_local(kaddr); + if (++num_filled_pages >= HPAGE_PMD_NR - khugepaged_max_ptes_none) return false; - } } - kunmap_local(kaddr); } return false; } --- a/mm/migrate.c +++ b/mm/migrate.c @@ -202,9 +202,7 @@ static bool try_to_map_unused_to_zeropag unsigned long idx) { struct page *page = folio_page(folio, idx); - bool contains_data; pte_t newpte; - void *addr;
if (PageCompound(page)) return false; @@ -221,11 +219,7 @@ static bool try_to_map_unused_to_zeropag * this subpage has been non present. If the subpage is only zero-filled * then map it to the shared zeropage. */ - addr = kmap_local_page(page); - contains_data = memchr_inv(addr, 0, PAGE_SIZE); - kunmap_local(addr); - - if (contains_data) + if (!pages_identical(page, ZERO_PAGE(0))) return false;
newpte = pte_mkspecial(pfn_pte(my_zero_pfn(pvmw->address),
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thadeu Lima de Souza Cascardo cascardo@igalia.com
commit 6a204d4b14c99232e05d35305c27ebce1c009840 upstream.
Commit 524c48072e56 ("mm/page_alloc: rename ALLOC_HIGH to ALLOC_MIN_RESERVE") is the start of a series that explains how __GFP_HIGH, which implies ALLOC_MIN_RESERVE, is going to be used instead of __GFP_ATOMIC for high atomic reserves.
Commit eb2e2b425c69 ("mm/page_alloc: explicitly record high-order atomic allocations in alloc_flags") introduced ALLOC_HIGHATOMIC for such allocations of order higher than 0. It still used __GFP_ATOMIC, though.
Then, commit 1ebbb21811b7 ("mm/page_alloc: explicitly define how __GFP_HIGH non-blocking allocations accesses reserves") just turned that check for !__GFP_DIRECT_RECLAIM, ignoring that high atomic reserves were expected to test for __GFP_HIGH.
This leads to high atomic reserves being added for high-order GFP_NOWAIT allocations and others that clear __GFP_DIRECT_RECLAIM, which is unexpected. Later, those reserves lead to 0-order allocations going to the slow path and starting reclaim.
From /proc/pagetypeinfo, without the patch:
Node 0, zone DMA, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0 Node 0, zone DMA32, type HighAtomic 1 8 10 9 7 3 0 0 0 0 0 Node 0, zone Normal, type HighAtomic 64 20 12 5 0 0 0 0 0 0 0
With the patch:
Node 0, zone DMA, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0 Node 0, zone DMA32, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0 Node 0, zone Normal, type HighAtomic 0 0 0 0 0 0 0 0 0 0 0
Link: https://lkml.kernel.org/r/20250814172245.1259625-1-cascardo@igalia.com Fixes: 1ebbb21811b7 ("mm/page_alloc: explicitly define how __GFP_HIGH non-blocking allocations accesses reserves") Signed-off-by: Thadeu Lima de Souza Cascardo cascardo@igalia.com Tested-by: Helen Koike koike@igalia.com Reviewed-by: Vlastimil Babka vbabka@suse.cz Tested-by: Sergey Senozhatsky senozhatsky@chromium.org Acked-by: Michal Hocko mhocko@suse.com Cc: Mel Gorman mgorman@techsingularity.net Cc: Matthew Wilcox willy@infradead.org Cc: NeilBrown neilb@suse.de Cc: Thierry Reding thierry.reding@gmail.com Cc: Brendan Jackman jackmanb@google.com Cc: Johannes Weiner hannes@cmpxchg.org Cc: Suren Baghdasaryan surenb@google.com Cc: Zi Yan ziy@nvidia.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/page_alloc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -4052,7 +4052,7 @@ gfp_to_alloc_flags(gfp_t gfp_mask, unsig if (!(gfp_mask & __GFP_NOMEMALLOC)) { alloc_flags |= ALLOC_NON_BLOCK;
- if (order > 0) + if (order > 0 && (alloc_flags & ALLOC_MIN_RESERVE)) alloc_flags |= ALLOC_HIGHATOMIC; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Li RongQing lirongqing@baidu.com
commit b322e88b3d553e85b4e15779491c70022783faa4 upstream.
Optimize hugetlb_pages_alloc_boot() to return immediately when max_huge_pages is 0, avoiding unnecessary CPU cycles and the below log message when hugepages aren't configured in the kernel command line. [ 3.702280] HugeTLB: allocation took 0ms with hugepage_allocation_threads=32
Link: https://lkml.kernel.org/r/20250814102333.4428-1-lirongqing@baidu.com Signed-off-by: Li RongQing lirongqing@baidu.com Reviewed-by: Dev Jain dev.jain@arm.com Tested-by: Dev Jain dev.jain@arm.com Reviewed-by: Jane Chu jane.chu@oracle.com Acked-by: David Hildenbrand david@redhat.com Cc: Muchun Song muchun.song@linux.dev Cc: Oscar Salvador osalvador@suse.de Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/hugetlb.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -3453,6 +3453,9 @@ static void __init hugetlb_hstate_alloc_ initialized = true; }
+ if (!h->max_huge_pages) + return; + /* do node specific alloc */ if (hugetlb_hstate_alloc_pages_specific_nodes(h)) return;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: SeongJae Park sj@kernel.org
commit b93af2cc8e036754c0d9970d9ddc47f43cc94b9f upstream.
DAMON's virtual address space operation set implementation (vaddr) calls pte_offset_map_lock() inside the page table walk callback function. This is for reading and writing page table accessed bits. If pte_offset_map_lock() fails, it retries by returning the page table walk callback function with ACTION_AGAIN.
pte_offset_map_lock() can continuously fail if the target is a pmd migration entry, though. Hence it could cause an infinite page table walk if the migration cannot be done until the page table walk is finished. This indeed caused a soft lockup when CPU hotplugging and DAMON were running in parallel.
Avoid the infinite loop by simply not retrying the page table walk. DAMON is promising only a best-effort accuracy, so missing access to such pages is no problem.
Link: https://lkml.kernel.org/r/20250930004410.55228-1-sj@kernel.org Fixes: 7780d04046a2 ("mm/pagewalkers: ACTION_AGAIN if pte_offset_map_lock() fails") Signed-off-by: SeongJae Park sj@kernel.org Reported-by: Xinyu Zheng zhengxinyu6@huawei.com Closes: https://lore.kernel.org/20250918030029.2652607-1-zhengxinyu6@huawei.com Acked-by: Hugh Dickins hughd@google.com Cc: stable@vger.kernel.org [6.5+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/damon/vaddr.c | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-)
--- a/mm/damon/vaddr.c +++ b/mm/damon/vaddr.c @@ -324,10 +324,8 @@ static int damon_mkold_pmd_entry(pmd_t * }
pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl); - if (!pte) { - walk->action = ACTION_AGAIN; + if (!pte) return 0; - } if (!pte_present(ptep_get(pte))) goto out; damon_ptep_mkold(pte, walk->vma, addr); @@ -479,10 +477,8 @@ regular_page: #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl); - if (!pte) { - walk->action = ACTION_AGAIN; + if (!pte) return 0; - } ptent = ptep_get(pte); if (!pte_present(ptent)) goto out;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: SeongJae Park sj@kernel.org
commit e18190b7e97e9db6546390e6e0ceddae606892b2 upstream.
damon_lru_sort_apply_parameters() allocates a new DAMON context, stages user-specified DAMON parameters on it, and commits to running DAMON context at once, using damon_commit_ctx(). The code is, however, directly updating the monitoring attributes of the running context. And the attributes are over-written by later damon_commit_ctx() call. This means that the monitoring attributes parameters are not really working. Fix the wrong use of the parameter context.
Link: https://lkml.kernel.org/r/20250916031549.115326-1-sj@kernel.org Fixes: a30969436428 ("mm/damon/lru_sort: use damon_commit_ctx()") Signed-off-by: SeongJae Park sj@kernel.org Reviewed-by: Joshua Hahn joshua.hahnjy@gmail.com Cc: Joshua Hahn joshua.hahnjy@gmail.com Cc: stable@vger.kernel.org [6.11+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/damon/lru_sort.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/damon/lru_sort.c +++ b/mm/damon/lru_sort.c @@ -203,7 +203,7 @@ static int damon_lru_sort_apply_paramete goto out; }
- err = damon_set_attrs(ctx, &damon_lru_sort_mon_attrs); + err = damon_set_attrs(param_ctx, &damon_lru_sort_mon_attrs); if (err) goto out;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thorsten Blum thorsten.blum@linux.dev
commit ab1c282c010c4f327bd7addc3c0035fd8e3c1721 upstream.
Commit 5304877936c0 ("NFSD: Fix strncpy() fortify warning") replaced strncpy(,, sizeof(..)) with strlcpy(,, sizeof(..) - 1), but strlcpy() already guaranteed NUL-termination of the destination buffer and subtracting one byte potentially truncated the source string.
The incorrect size was then carried over in commit 72f78ae00a8e ("NFSD: move from strlcpy with unused retval to strscpy") when switching from strlcpy() to strscpy().
Fix this off-by-one error by using the full size of the destination buffer again.
Cc: stable@vger.kernel.org Fixes: 5304877936c0 ("NFSD: Fix strncpy() fortify warning") Signed-off-by: Thorsten Blum thorsten.blum@linux.dev Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfsd/nfs4proc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/nfsd/nfs4proc.c +++ b/fs/nfsd/nfs4proc.c @@ -1372,7 +1372,7 @@ try_again: return 0; } if (work) { - strscpy(work->nsui_ipaddr, ipaddr, sizeof(work->nsui_ipaddr) - 1); + strscpy(work->nsui_ipaddr, ipaddr, sizeof(work->nsui_ipaddr)); refcount_set(&work->nsui_refcnt, 2); work->nsui_busy = true; list_add_tail(&work->nsui_list, &nn->nfsd_ssc_mount_list);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Olga Kornievskaia okorniev@redhat.com
commit a082e4b4d08a4a0e656d90c2c05da85f23e6d0c9 upstream.
When v3 NLM request finds a conflicting delegation, it triggers a delegation recall and nfsd_open fails with EAGAIN. nfsd_open then translates EAGAIN into nfserr_jukebox. In nlm_fopen, instead of returning nlm_failed for when there is a conflicting delegation, drop this NLM request so that the client retries. Once delegation is recalled and if a local lock is claimed, a retry would lead to nfsd returning a nlm_lck_blocked error or a successful nlm lock.
Fixes: d343fce148a4 ("[PATCH] knfsd: Allow lockd to drop replies as appropriate") Cc: stable@vger.kernel.org # v6.6 Signed-off-by: Olga Kornievskaia okorniev@redhat.com Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfsd/lockd.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+)
--- a/fs/nfsd/lockd.c +++ b/fs/nfsd/lockd.c @@ -48,6 +48,21 @@ nlm_fopen(struct svc_rqst *rqstp, struct switch (nfserr) { case nfs_ok: return 0; + case nfserr_jukebox: + /* this error can indicate a presence of a conflicting + * delegation to an NLM lock request. Options are: + * (1) For now, drop this request and make the client + * retry. When delegation is returned, client's lock retry + * will complete. + * (2) NLM4_DENIED as per "spec" signals to the client + * that the lock is unavailable now but client can retry. + * Linux client implementation does not. It treats + * NLM4_DENIED same as NLM4_FAILED and errors the request. + * (3) For the future, treat this as blocked lock and try + * to callback when the delegation is returned but might + * not have a proper lock request to block on. + */ + fallthrough; case nfserr_dropit: return nlm_drop_reply; case nfserr_stale:
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Baokun Li libaokun1@huawei.com
commit d8b90e6387a74bcb1714c8d1e6a782ff709de9a9 upstream.
The implicit __GFP_NOFAIL flag in ext4_sb_bread() was removed in commit 8a83ac54940d ("ext4: call bdev_getblk() from sb_getblk_gfp()"), meaning the function can now fail under memory pressure.
Most callers of ext4_sb_bread() propagate the error to userspace and do not remount the filesystem read-only. However, ext4_free_branches() handles ext4_sb_bread() failure by remounting the filesystem read-only.
This implies that an ext3 filesystem (mounted via the ext4 driver) could be forcibly remounted read-only due to a transient page allocation failure, which is unacceptable.
To mitigate this, introduce a new helper function, ext4_sb_bread_nofail(), which explicitly uses __GFP_NOFAIL, and use it in ext4_free_branches().
Fixes: 8a83ac54940d ("ext4: call bdev_getblk() from sb_getblk_gfp()") Cc: stable@kernel.org Signed-off-by: Baokun Li libaokun1@huawei.com Reviewed-by: Jan Kara jack@suse.cz Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/ext4.h | 2 ++ fs/ext4/indirect.c | 2 +- fs/ext4/super.c | 9 +++++++++ 3 files changed, 12 insertions(+), 1 deletion(-)
--- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -3107,6 +3107,8 @@ extern struct buffer_head *ext4_sb_bread sector_t block, blk_opf_t op_flags); extern struct buffer_head *ext4_sb_bread_unmovable(struct super_block *sb, sector_t block); +extern struct buffer_head *ext4_sb_bread_nofail(struct super_block *sb, + sector_t block); extern void ext4_read_bh_nowait(struct buffer_head *bh, blk_opf_t op_flags, bh_end_io_t *end_io, bool simu_fail); extern int ext4_read_bh(struct buffer_head *bh, blk_opf_t op_flags, --- a/fs/ext4/indirect.c +++ b/fs/ext4/indirect.c @@ -1025,7 +1025,7 @@ static void ext4_free_branches(handle_t }
/* Go read the buffer for the next level down */ - bh = ext4_sb_bread(inode->i_sb, nr, 0); + bh = ext4_sb_bread_nofail(inode->i_sb, nr);
/* * A read failure? Report error and clear slot --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -266,6 +266,15 @@ struct buffer_head *ext4_sb_bread_unmova return __ext4_sb_bread_gfp(sb, block, 0, gfp); }
+struct buffer_head *ext4_sb_bread_nofail(struct super_block *sb, + sector_t block) +{ + gfp_t gfp = mapping_gfp_constraint(sb->s_bdev->bd_mapping, + ~__GFP_FS) | __GFP_MOVABLE | __GFP_NOFAIL; + + return __ext4_sb_bread_gfp(sb, block, 0, gfp); +} + void ext4_sb_breadahead_unmovable(struct super_block *sb, sector_t block) { struct buffer_head *bh = bdev_getblk(sb->s_bdev, block,
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
commit 0a6ce20c156442a4ce2a404747bb0fb05d54eeb3 upstream.
In principle orphan file can be arbitrarily large. However orphan replay needs to traverse it all and we also pin all its buffers in memory. Thus filesystems with absurdly large orphan files can lead to big amounts of memory consumed. Limit orphan file size to a sane value and also use kvmalloc() for allocating array of block descriptor structures to avoid large order allocations for sane but large orphan files.
Reported-by: syzbot+0b92850d68d9b12934f5@syzkaller.appspotmail.com Fixes: 02f310fcf47f ("ext4: Speedup ext4 orphan inode handling") Cc: stable@kernel.org Signed-off-by: Jan Kara jack@suse.cz Message-ID: 20250909112206.10459-2-jack@suse.cz Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/orphan.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-)
--- a/fs/ext4/orphan.c +++ b/fs/ext4/orphan.c @@ -584,9 +584,20 @@ int ext4_init_orphan_info(struct super_b ext4_msg(sb, KERN_ERR, "get orphan inode failed"); return PTR_ERR(inode); } + /* + * This is just an artificial limit to prevent corrupted fs from + * consuming absurd amounts of memory when pinning blocks of orphan + * file in memory. + */ + if (inode->i_size > 8 << 20) { + ext4_msg(sb, KERN_ERR, "orphan file too big: %llu", + (unsigned long long)inode->i_size); + ret = -EFSCORRUPTED; + goto out_put; + } oi->of_blocks = inode->i_size >> sb->s_blocksize_bits; oi->of_csum_seed = EXT4_I(inode)->i_csum_seed; - oi->of_binfo = kmalloc_array(oi->of_blocks, + oi->of_binfo = kvmalloc_array(oi->of_blocks, sizeof(struct ext4_orphan_block), GFP_KERNEL); if (!oi->of_binfo) {
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yongjian Sun sunyongjian1@huawei.com
commit 9d80eaa1a1d37539224982b76c9ceeee736510b9 upstream.
After running a stress test combined with fault injection, we performed fsck -a followed by fsck -fn on the filesystem image. During the second pass, fsck -fn reported:
Inode 131512, end of extent exceeds allowed value (logical block 405, physical block 1180540, len 2)
This inode was not in the orphan list. Analysis revealed the following call chain that leads to the inconsistency:
ext4_da_write_end() //does not update i_disksize ext4_punch_hole() //truncate folio, keep size ext4_page_mkwrite() ext4_block_page_mkwrite() ext4_block_write_begin() ext4_get_block() //insert written extent without update i_disksize journal commit echo 1 > /sys/block/xxx/device/delete
da-write path updates i_size but does not update i_disksize. Then ext4_punch_hole truncates the da-folio yet still leaves i_disksize unchanged(in the ext4_update_disksize_before_punch function, the condition offset + len < size is met). Then ext4_page_mkwrite sees ext4_nonda_switch return 1 and takes the nodioread_nolock path, the folio about to be written has just been punched out, and it’s offset sits beyond the current i_disksize. This may result in a written extent being inserted, but again does not update i_disksize. If the journal gets committed and then the block device is yanked, we might run into this. It should be noted that replacing ext4_punch_hole with ext4_zero_range in the call sequence may also trigger this issue, as neither will update i_disksize under these circumstances.
To fix this, we can modify ext4_update_disksize_before_punch to increase i_disksize to min(i_size, offset + len) when both i_size and (offset + len) are greater than i_disksize.
Cc: stable@kernel.org Signed-off-by: Yongjian Sun sunyongjian1@huawei.com Reviewed-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: Jan Kara jack@suse.cz Reviewed-by: Baokun Li libaokun1@huawei.com Message-ID: 20250911133024.1841027-1-sunyongjian@huaweicloud.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/inode.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -3859,7 +3859,11 @@ int ext4_can_truncate(struct inode *inod * We have to make sure i_disksize gets properly updated before we truncate * page cache due to hole punching or zero range. Otherwise i_disksize update * can get lost as it may have been postponed to submission of writeback but - * that will never happen after we truncate page cache. + * that will never happen if we remove the folio containing i_size from the + * page cache. Also if we punch hole within i_size but above i_disksize, + * following ext4_page_mkwrite() may mistakenly allocate written blocks over + * the hole and thus introduce allocated blocks beyond i_disksize which is + * not allowed (e2fsck would complain in case of crash). */ int ext4_update_disksize_before_punch(struct inode *inode, loff_t offset, loff_t len) @@ -3870,9 +3874,11 @@ int ext4_update_disksize_before_punch(st loff_t size = i_size_read(inode);
WARN_ON(!inode_is_locked(inode)); - if (offset > size || offset + len < size) + if (offset > size) return 0;
+ if (offset + len < size) + size = offset + len; if (EXT4_I(inode)->i_disksize >= size) return 0;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ojaswin Mujoo ojaswin@linux.ibm.com
commit 46c22a8bb4cb03211da1100d7ee4a2005bf77c70 upstream.
Currently, our handling of metadata is _ambiguous_ in some scenarios, that is, we end up returning unknown if the range only covers the mapping partially.
For example, in the following case:
$ xfs_io -c fsmap -d
0: 254:16 [0..7]: static fs metadata 8 1: 254:16 [8..15]: special 102:1 8 2: 254:16 [16..5127]: special 102:2 5112 3: 254:16 [5128..5255]: special 102:3 128 4: 254:16 [5256..5383]: special 102:4 128 5: 254:16 [5384..70919]: inodes 65536 6: 254:16 [70920..70967]: unknown 48 ...
$ xfs_io -c fsmap -d 24 33
0: 254:16 [24..39]: unknown 16 <--- incomplete reporting
$ xfs_io -c fsmap -d 24 33 (With patch)
0: 254:16 [16..5127]: special 102:2 5112
This is because earlier in ext4_getfsmap_meta_helper, we end up ignoring any extent that starts before our queried range, but overlaps it. While the man page [1] is a bit ambiguous on this, this fix makes the output make more sense since we are anyways returning an "unknown" extent. This is also consistent to how XFS does it:
$ xfs_io -c fsmap -d
... 6: 254:16 [104..127]: free space 24 7: 254:16 [128..191]: inodes 64 ...
$ xfs_io -c fsmap -d 137 150
0: 254:16 [128..191]: inodes 64 <-- full extent returned
[1] https://man7.org/linux/man-pages/man2/ioctl_getfsmap.2.html
Reported-by: Ritesh Harjani (IBM) ritesh.list@gmail.com Cc: stable@kernel.org Signed-off-by: Ojaswin Mujoo ojaswin@linux.ibm.com Message-ID: 023f37e35ee280cd9baac0296cbadcbe10995cab.1757058211.git.ojaswin@linux.ibm.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/fsmap.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-)
--- a/fs/ext4/fsmap.c +++ b/fs/ext4/fsmap.c @@ -74,7 +74,8 @@ static int ext4_getfsmap_dev_compare(con static bool ext4_getfsmap_rec_before_low_key(struct ext4_getfsmap_info *info, struct ext4_fsmap *rec) { - return rec->fmr_physical < info->gfi_low.fmr_physical; + return rec->fmr_physical + rec->fmr_length <= + info->gfi_low.fmr_physical; }
/* @@ -200,15 +201,18 @@ static int ext4_getfsmap_meta_helper(str ext4_group_first_block_no(sb, agno)); fs_end = fs_start + EXT4_C2B(sbi, len);
- /* Return relevant extents from the meta_list */ + /* + * Return relevant extents from the meta_list. We emit all extents that + * partially/fully overlap with the query range + */ list_for_each_entry_safe(p, tmp, &info->gfi_meta_list, fmr_list) { - if (p->fmr_physical < info->gfi_next_fsblk) { + if (p->fmr_physical + p->fmr_length <= info->gfi_next_fsblk) { list_del(&p->fmr_list); kfree(p); continue; } - if (p->fmr_physical <= fs_start || - p->fmr_physical + p->fmr_length <= fs_end) { + if (p->fmr_physical <= fs_end && + p->fmr_physical + p->fmr_length > fs_start) { /* Emit the retained free extent record if present */ if (info->gfi_lastfree.fmr_owner) { error = ext4_getfsmap_helper(sb, info,
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Theodore Ts'o tytso@mit.edu
commit 8ecb790ea8c3fc69e77bace57f14cf0d7c177bd8 upstream.
Unlike other strings in the ext4 superblock, we rely on tune2fs to make sure s_mount_opts is NUL terminated. Harden parse_apply_sb_mount_options() by treating s_mount_opts as a potential __nonstring.
Cc: stable@vger.kernel.org Fixes: 8b67f04ab9de ("ext4: Add mount options in superblock") Reviewed-by: Jan Kara jack@suse.cz Reviewed-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Theodore Ts'o tytso@mit.edu Message-ID: 20250916-tune2fs-v2-1-d594dc7486f0@mit.edu Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/super.c | 17 +++++------------ 1 file changed, 5 insertions(+), 12 deletions(-)
--- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -2493,7 +2493,7 @@ static int parse_apply_sb_mount_options( struct ext4_fs_context *m_ctx) { struct ext4_sb_info *sbi = EXT4_SB(sb); - char *s_mount_opts = NULL; + char s_mount_opts[65]; struct ext4_fs_context *s_ctx = NULL; struct fs_context *fc = NULL; int ret = -ENOMEM; @@ -2501,15 +2501,11 @@ static int parse_apply_sb_mount_options( if (!sbi->s_es->s_mount_opts[0]) return 0;
- s_mount_opts = kstrndup(sbi->s_es->s_mount_opts, - sizeof(sbi->s_es->s_mount_opts), - GFP_KERNEL); - if (!s_mount_opts) - return ret; + strscpy_pad(s_mount_opts, sbi->s_es->s_mount_opts);
fc = kzalloc(sizeof(struct fs_context), GFP_KERNEL); if (!fc) - goto out_free; + return -ENOMEM;
s_ctx = kzalloc(sizeof(struct ext4_fs_context), GFP_KERNEL); if (!s_ctx) @@ -2541,11 +2537,8 @@ parse_failed: ret = 0;
out_free: - if (fc) { - ext4_fc_free(fc); - kfree(fc); - } - kfree(s_mount_opts); + ext4_fc_free(fc); + kfree(fc); return ret; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhang Yi yi.zhang@huawei.com
commit 12e803c8827d049ae8f2c743ef66ab87ae898375 upstream.
During the movement of a written extent, mext_page_mkuptodate() is called to read data in the range [from, to) into the page cache and to update the corresponding buffers. Therefore, we should not wait on any buffer whose start offset is >= 'to'. Otherwise, it will return -EIO and fail the extents movement.
$ for i in `seq 3 -1 0`; \ do xfs_io -fs -c "pwrite -b 1024 $((i * 1024)) 1024" /mnt/foo; \ done $ umount /mnt && mount /dev/pmem1s /mnt # drop cache $ e4defrag /mnt/foo e4defrag 1.47.0 (5-Feb-2023) ext4 defragmentation for /mnt/foo [1/1]/mnt/foo: 0% [ NG ] Success: [0/1]
Cc: stable@kernel.org Fixes: a40759fb16ae ("ext4: remove array of buffer_heads from mext_page_mkuptodate()") Signed-off-by: Zhang Yi yi.zhang@huawei.com Reviewed-by: Jan Kara jack@suse.cz Message-ID: 20250912105841.1886799-1-yi.zhang@huaweicloud.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/move_extent.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/ext4/move_extent.c +++ b/fs/ext4/move_extent.c @@ -225,7 +225,7 @@ static int mext_page_mkuptodate(struct f do { if (bh_offset(bh) + blocksize <= from) continue; - if (bh_offset(bh) > to) + if (bh_offset(bh) >= to) break; wait_on_buffer(bh); if (buffer_uptodate(bh))
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ahmet Eray Karadag eraykrdg1@gmail.com
commit 57295e835408d8d425bef58da5253465db3d6888 upstream.
syzkaller found a path where ext4_xattr_inode_update_ref() reads an EA inode refcount that is already <= 0 and then applies ref_change (often -1). That lets the refcount underflow and we proceed with a bogus value, triggering errors like:
EXT4-fs error: EA inode <n> ref underflow: ref_count=-1 ref_change=-1 EXT4-fs warning: ea_inode dec ref err=-117
Make the invariant explicit: if the current refcount is non-positive, treat this as on-disk corruption, emit ext4_error_inode(), and fail the operation with -EFSCORRUPTED instead of updating the refcount. Delete the WARN_ONCE() as negative refcounts are now impossible; keep error reporting in ext4_error_inode().
This prevents the underflow and the follow-on orphan/cleanup churn.
Reported-by: syzbot+0be4f339a8218d2a5bb1@syzkaller.appspotmail.com Fixes: https://syzbot.org/bug?extid=0be4f339a8218d2a5bb1 Cc: stable@kernel.org Co-developed-by: Albin Babu Varghese albinbabuvarghese20@gmail.com Signed-off-by: Albin Babu Varghese albinbabuvarghese20@gmail.com Signed-off-by: Ahmet Eray Karadag eraykrdg1@gmail.com Message-ID: 20250920021342.45575-1-eraykrdg1@gmail.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/xattr.c | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-)
--- a/fs/ext4/xattr.c +++ b/fs/ext4/xattr.c @@ -1036,7 +1036,7 @@ static int ext4_xattr_inode_update_ref(h int ref_change) { struct ext4_iloc iloc; - s64 ref_count; + u64 ref_count; int ret;
inode_lock_nested(ea_inode, I_MUTEX_XATTR); @@ -1046,13 +1046,17 @@ static int ext4_xattr_inode_update_ref(h goto out;
ref_count = ext4_xattr_inode_get_ref(ea_inode); + if ((ref_count == 0 && ref_change < 0) || (ref_count == U64_MAX && ref_change > 0)) { + ext4_error_inode(ea_inode, __func__, __LINE__, 0, + "EA inode %lu ref wraparound: ref_count=%lld ref_change=%d", + ea_inode->i_ino, ref_count, ref_change); + ret = -EFSCORRUPTED; + goto out; + } ref_count += ref_change; ext4_xattr_inode_set_ref(ea_inode, ref_count);
if (ref_change > 0) { - WARN_ONCE(ref_count <= 0, "EA inode %lu ref_count=%lld", - ea_inode->i_ino, ref_count); - if (ref_count == 1) { WARN_ONCE(ea_inode->i_nlink, "EA inode %lu i_nlink=%u", ea_inode->i_ino, ea_inode->i_nlink); @@ -1061,9 +1065,6 @@ static int ext4_xattr_inode_update_ref(h ext4_orphan_del(handle, ea_inode); } } else { - WARN_ONCE(ref_count < 0, "EA inode %lu ref_count=%lld", - ea_inode->i_ino, ref_count); - if (ref_count == 0) { WARN_ONCE(ea_inode->i_nlink != 1, "EA inode %lu i_nlink=%u",
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Deepanshu Kartikey kartikey406@gmail.com
commit 44d2a72f4d64655f906ba47a5e108733f59e6f28 upstream.
During xattr block validation, check_xattrs() processes xattr entries without validating that entries claiming to use EA inodes have non-zero sizes. Corrupted filesystems may contain xattr entries where e_value_size is zero but e_value_inum is non-zero, indicating invalid xattr data.
Add validation in check_xattrs() to detect this corruption pattern early and return -EFSCORRUPTED, preventing invalid xattr entries from causing issues throughout the ext4 codebase.
Cc: stable@kernel.org Suggested-by: Theodore Ts'o tytso@mit.edu Reported-by: syzbot+4c9d23743a2409b80293@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?extid=4c9d23743a2409b80293 Signed-off-by: Deepanshu Kartikey kartikey406@gmail.com Signed-off-by: Theodore Ts'o tytso@mit.edu Message-ID: 20250923133245.1091761-1-kartikey406@gmail.com Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/xattr.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/fs/ext4/xattr.c +++ b/fs/ext4/xattr.c @@ -251,6 +251,10 @@ check_xattrs(struct inode *inode, struct err_str = "invalid ea_ino"; goto errout; } + if (ea_ino && !size) { + err_str = "invalid size in ea xattr"; + goto errout; + } if (size > EXT4_XATTR_SIZE_MAX) { err_str = "e_value size too large"; goto errout;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Huacai Chen chenhuacai@loongson.cn
commit feb8ae81b2378b75a99c81d315602ac8918ed382 upstream.
Introduce acpi_gbl_use_global_lock, which allows to skip the Global Lock initialization. This is useful for systems without Global Lock (such as loong_arch), so as to avoid error messages during boot phase:
ACPI Error: Could not enable global_lock event (20240827/evxfevnt-182) ACPI Error: No response from Global Lock hardware, disabling lock (20240827/evglock-59)
Link: https://github.com/acpica/acpica/commit/463cb0fe Signed-off-by: Huacai Chen chenhuacai@loongson.cn Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Cc: Huacai Chen chenhuacai@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/acpi/acpica/evglock.c | 4 ++++ include/acpi/acpixf.h | 6 ++++++ 2 files changed, 10 insertions(+)
--- a/drivers/acpi/acpica/evglock.c +++ b/drivers/acpi/acpica/evglock.c @@ -42,6 +42,10 @@ acpi_status acpi_ev_init_global_lock_han return_ACPI_STATUS(AE_OK); }
+ if (!acpi_gbl_use_global_lock) { + return_ACPI_STATUS(AE_OK); + } + /* Attempt installation of the global lock handler */
status = acpi_install_fixed_event_handler(ACPI_EVENT_GLOBAL, --- a/include/acpi/acpixf.h +++ b/include/acpi/acpixf.h @@ -214,6 +214,12 @@ ACPI_INIT_GLOBAL(u8, acpi_gbl_osi_data, ACPI_INIT_GLOBAL(u8, acpi_gbl_reduced_hardware, FALSE);
/* + * ACPI Global Lock is mainly used for systems with SMM, so no-SMM systems + * (such as loong_arch) may not have and not use Global Lock. + */ +ACPI_INIT_GLOBAL(u8, acpi_gbl_use_global_lock, TRUE); + +/* * Maximum timeout for While() loop iterations before forced method abort. * This mechanism is intended to prevent infinite loops during interpreter * execution within a host kernel.
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
commit 971843c511c3c2f6eda96c6b03442913bfee6148 upstream.
Orphan info is now getting allocated with kvmalloc_array(). Free it with kvfree() instead of kfree() to avoid complaints from mm.
Reported-by: Chris Mason clm@meta.com Fixes: 0a6ce20c1564 ("ext4: verify orphan file size is not too big") Cc: stable@vger.kernel.org Signed-off-by: Jan Kara jack@suse.cz Message-ID: 20251007134936.7291-2-jack@suse.cz Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/orphan.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/ext4/orphan.c +++ b/fs/ext4/orphan.c @@ -513,7 +513,7 @@ void ext4_release_orphan_info(struct sup return; for (i = 0; i < oi->of_blocks; i++) brelse(oi->of_binfo[i].ob_bh); - kfree(oi->of_binfo); + kvfree(oi->of_binfo); }
static struct ext4_orphan_block_tail *ext4_orphan_block_tail( @@ -638,7 +638,7 @@ int ext4_init_orphan_info(struct super_b out_free: for (i--; i >= 0; i--) brelse(oi->of_binfo[i].ob_bh); - kfree(oi->of_binfo); + kvfree(oi->of_binfo); out_put: iput(inode); return ret;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nathan Chancellor nathan@kernel.org
commit 2f13daee2a72bb962f5fd356c3a263a6f16da965 upstream.
After commit 6f110a5e4f99 ("Disable SLUB_TINY for build testing"), which causes CONFIG_KASAN to be enabled in allmodconfig again, arm64 allmodconfig builds with clang-17 and older show an instance of -Wframe-larger-than (which breaks the build with CONFIG_WERROR=y):
lib/crypto/curve25519-hacl64.c:757:6: error: stack frame size (2336) exceeds limit (2048) in 'curve25519_generic' [-Werror,-Wframe-larger-than] 757 | void curve25519_generic(u8 mypublic[CURVE25519_KEY_SIZE], | ^
When KASAN is disabled, the stack usage is roughly quartered:
lib/crypto/curve25519-hacl64.c:757:6: error: stack frame size (608) exceeds limit (128) in 'curve25519_generic' [-Werror,-Wframe-larger-than] 757 | void curve25519_generic(u8 mypublic[CURVE25519_KEY_SIZE], | ^
Using '-Rpass-analysis=stack-frame-layout' shows the following variables and many, many 8-byte spills when KASAN is enabled:
Offset: [SP-144], Type: Variable, Align: 8, Size: 40 Offset: [SP-464], Type: Variable, Align: 8, Size: 320 Offset: [SP-784], Type: Variable, Align: 8, Size: 320 Offset: [SP-864], Type: Variable, Align: 32, Size: 80 Offset: [SP-896], Type: Variable, Align: 32, Size: 32 Offset: [SP-1016], Type: Variable, Align: 8, Size: 120
When KASAN is disabled, there are still spills but not at many and the variables list is smaller:
Offset: [SP-192], Type: Variable, Align: 32, Size: 80 Offset: [SP-224], Type: Variable, Align: 32, Size: 32 Offset: [SP-344], Type: Variable, Align: 8, Size: 120
Disable KASAN for this file when using clang-17 or older to avoid blowing out the stack, clearing up the warning.
Signed-off-by: Nathan Chancellor nathan@kernel.org Acked-by: "Jason A. Donenfeld" Jason@zx2c4.com Acked-by: Ard Biesheuvel ardb@kernel.org Link: https://lore.kernel.org/r/20250609-curve25519-hacl64-disable-kasan-clang-v1-... Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- lib/crypto/Makefile | 4 ++++ 1 file changed, 4 insertions(+)
--- a/lib/crypto/Makefile +++ b/lib/crypto/Makefile @@ -33,6 +33,10 @@ obj-$(CONFIG_CRYPTO_LIB_CURVE25519_GENER libcurve25519-generic-y := curve25519-fiat32.o libcurve25519-generic-$(CONFIG_ARCH_SUPPORTS_INT128) := curve25519-hacl64.o libcurve25519-generic-y += curve25519-generic.o +# clang versions prior to 18 may blow out the stack with KASAN +ifeq ($(call clang-min-version, 180000),) +KASAN_SANITIZE_curve25519-hacl64.o := n +endif
obj-$(CONFIG_CRYPTO_LIB_CURVE25519) += libcurve25519.o libcurve25519-y += curve25519.o
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lance Yang lance.yang@linux.dev
commit 0389c305ef56cbadca4cbef44affc0ec3213ed30 upstream.
The madv_populate and soft-dirty kselftests currently fail on systems where CONFIG_MEM_SOFT_DIRTY is disabled.
Introduce a new helper softdirty_supported() into vm_util.c/h to ensure tests are properly skipped when the feature is not enabled.
Link: https://lkml.kernel.org/r/20250917133137.62802-1-lance.yang@linux.dev Fixes: 9f3265db6ae8 ("selftests: vm: add test for Soft-Dirty PTE bit") Signed-off-by: Lance Yang lance.yang@linux.dev Acked-by: David Hildenbrand david@redhat.com Suggested-by: David Hildenbrand david@redhat.com Cc: Lorenzo Stoakes lorenzo.stoakes@oracle.com Cc: Shuah Khan shuah@kernel.org Cc: Gabriel Krisman Bertazi krisman@collabora.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/mm/madv_populate.c | 21 ------- tools/testing/selftests/mm/soft-dirty.c | 5 + tools/testing/selftests/mm/vm_util.c | 77 +++++++++++++++++++++++++++++ tools/testing/selftests/mm/vm_util.h | 1 4 files changed, 84 insertions(+), 20 deletions(-)
--- a/tools/testing/selftests/mm/madv_populate.c +++ b/tools/testing/selftests/mm/madv_populate.c @@ -264,23 +264,6 @@ static void test_softdirty(void) munmap(addr, SIZE); }
-static int system_has_softdirty(void) -{ - /* - * There is no way to check if the kernel supports soft-dirty, other - * than by writing to a page and seeing if the bit was set. But the - * tests are intended to check that the bit gets set when it should, so - * doing that check would turn a potentially legitimate fail into a - * skip. Fortunately, we know for sure that arm64 does not support - * soft-dirty. So for now, let's just use the arch as a corse guide. - */ -#if defined(__aarch64__) - return 0; -#else - return 1; -#endif -} - int main(int argc, char **argv) { int nr_tests = 16; @@ -288,7 +271,7 @@ int main(int argc, char **argv)
pagesize = getpagesize();
- if (system_has_softdirty()) + if (softdirty_supported()) nr_tests += 5;
ksft_print_header(); @@ -300,7 +283,7 @@ int main(int argc, char **argv) test_holes(); test_populate_read(); test_populate_write(); - if (system_has_softdirty()) + if (softdirty_supported()) test_softdirty();
err = ksft_get_fail_cnt(); --- a/tools/testing/selftests/mm/soft-dirty.c +++ b/tools/testing/selftests/mm/soft-dirty.c @@ -193,8 +193,11 @@ int main(int argc, char **argv) int pagesize;
ksft_print_header(); - ksft_set_plan(15);
+ if (!softdirty_supported()) + ksft_exit_skip("soft-dirty is not support\n"); + + ksft_set_plan(15); pagemap_fd = open(PAGEMAP_FILE_PATH, O_RDONLY); if (pagemap_fd < 0) ksft_exit_fail_msg("Failed to open %s\n", PAGEMAP_FILE_PATH); --- a/tools/testing/selftests/mm/vm_util.c +++ b/tools/testing/selftests/mm/vm_util.c @@ -193,6 +193,42 @@ err_out: return rss_anon; }
+char *__get_smap_entry(void *addr, const char *pattern, char *buf, size_t len) +{ + int ret; + FILE *fp; + char *entry = NULL; + char addr_pattern[MAX_LINE_LENGTH]; + + ret = snprintf(addr_pattern, MAX_LINE_LENGTH, "%08lx-", + (unsigned long)addr); + if (ret >= MAX_LINE_LENGTH) + ksft_exit_fail_msg("%s: Pattern is too long\n", __func__); + + fp = fopen(SMAP_FILE_PATH, "r"); + if (!fp) + ksft_exit_fail_msg("%s: Failed to open file %s\n", __func__, + SMAP_FILE_PATH); + + if (!check_for_pattern(fp, addr_pattern, buf, len)) + goto err_out; + + /* Fetch the pattern in the same block */ + if (!check_for_pattern(fp, pattern, buf, len)) + goto err_out; + + /* Trim trailing newline */ + entry = strchr(buf, '\n'); + if (entry) + *entry = '\0'; + + entry = buf + strlen(pattern); + +err_out: + fclose(fp); + return entry; +} + bool __check_huge(void *addr, char *pattern, int nr_hpages, uint64_t hpage_size) { @@ -384,3 +420,44 @@ unsigned long get_free_hugepages(void) fclose(f); return fhp; } + +static bool check_vmflag(void *addr, const char *flag) +{ + char buffer[MAX_LINE_LENGTH]; + const char *flags; + size_t flaglen; + + flags = __get_smap_entry(addr, "VmFlags:", buffer, sizeof(buffer)); + if (!flags) + ksft_exit_fail_msg("%s: No VmFlags for %p\n", __func__, addr); + + while (true) { + flags += strspn(flags, " "); + + flaglen = strcspn(flags, " "); + if (!flaglen) + return false; + + if (flaglen == strlen(flag) && !memcmp(flags, flag, flaglen)) + return true; + + flags += flaglen; + } +} + +bool softdirty_supported(void) +{ + char *addr; + bool supported = false; + const size_t pagesize = getpagesize(); + + /* New mappings are expected to be marked with VM_SOFTDIRTY (sd). */ + addr = mmap(0, pagesize, PROT_READ | PROT_WRITE, + MAP_ANONYMOUS | MAP_PRIVATE, 0, 0); + if (!addr) + ksft_exit_fail_msg("mmap failed\n"); + + supported = check_vmflag(addr, "sd"); + munmap(addr, pagesize); + return supported; +} --- a/tools/testing/selftests/mm/vm_util.h +++ b/tools/testing/selftests/mm/vm_util.h @@ -53,6 +53,7 @@ int uffd_unregister(int uffd, void *addr int uffd_register_with_ioctls(int uffd, void *addr, uint64_t len, bool miss, bool wp, bool minor, uint64_t *ioctls); unsigned long get_free_hugepages(void); +bool softdirty_supported(void);
/* * On ppc64 this will only work with radix 2M hugepage size
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Edward Adam Davis eadavis@qq.com
[ Upstream commit 8cfc8cec1b4da88a47c243a11f384baefd092a50 ]
The device minor should not be cleared after the device is released.
Fixes: 9e14868dc952 ("media: mc: Clear minor number reservation at unregistration time") Cc: stable@vger.kernel.org Reported-by: syzbot+031d0cfd7c362817963f@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=031d0cfd7c362817963f Tested-by: syzbot+031d0cfd7c362817963f@syzkaller.appspotmail.com Signed-off-by: Edward Adam Davis eadavis@qq.com Signed-off-by: Sakari Ailus sakari.ailus@linux.intel.com Signed-off-by: Hans Verkuil hverkuil+cisco@kernel.org [ moved clear_bit from media_devnode_release callback to media_devnode_unregister before put_device ] Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/mc/mc-devnode.c | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-)
--- a/drivers/media/mc/mc-devnode.c +++ b/drivers/media/mc/mc-devnode.c @@ -50,11 +50,6 @@ static void media_devnode_release(struct { struct media_devnode *devnode = to_media_devnode(cd);
- mutex_lock(&media_devnode_lock); - /* Mark device node number as free */ - clear_bit(devnode->minor, media_devnode_nums); - mutex_unlock(&media_devnode_lock); - /* Release media_devnode and perform other cleanups as needed. */ if (devnode->release) devnode->release(devnode); @@ -281,6 +276,7 @@ void media_devnode_unregister(struct med /* Delete the cdev on this minor as well */ cdev_device_del(&devnode->cdev, &devnode->dev); devnode->media_dev = NULL; + clear_bit(devnode->minor, media_devnode_nums); mutex_unlock(&media_devnode_lock);
put_device(&devnode->dev);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Phillip Lougher phillip@squashfs.org.uk
[ Upstream commit 9ee94bfbe930a1b39df53fa2d7b31141b780eb5a ]
Patch series "Squashfs: performance improvement and a sanity check".
This patchset adds an additional sanity check when reading regular file inodes, and adds support for SEEK_DATA/SEEK_HOLE lseek() whence values.
This patch (of 2):
Add an additional sanity check when reading regular file inodes.
A regular file if the file size is an exact multiple of the filesystem block size cannot have a fragment. This is because by definition a fragment block stores tailends which are not a whole block in size.
Link: https://lkml.kernel.org/r/20250923220652.568416-1-phillip@squashfs.org.uk Link: https://lkml.kernel.org/r/20250923220652.568416-2-phillip@squashfs.org.uk Signed-off-by: Phillip Lougher phillip@squashfs.org.uk Signed-off-by: Andrew Morton akpm@linux-foundation.org Stable-dep-of: 9f1c14c1de1b ("Squashfs: reject negative file sizes in squashfs_read_inode()") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/squashfs/inode.c | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-)
--- a/fs/squashfs/inode.c +++ b/fs/squashfs/inode.c @@ -140,8 +140,17 @@ int squashfs_read_inode(struct inode *in if (err < 0) goto failed_read;
+ inode->i_size = le32_to_cpu(sqsh_ino->file_size); frag = le32_to_cpu(sqsh_ino->fragment); if (frag != SQUASHFS_INVALID_FRAG) { + /* + * the file cannot have a fragment (tailend) and have a + * file size a multiple of the block size + */ + if ((inode->i_size & (msblk->block_size - 1)) == 0) { + err = -EINVAL; + goto failed_read; + } frag_offset = le32_to_cpu(sqsh_ino->offset); frag_size = squashfs_frag_lookup(sb, frag, &frag_blk); if (frag_size < 0) { @@ -155,7 +164,6 @@ int squashfs_read_inode(struct inode *in }
set_nlink(inode, 1); - inode->i_size = le32_to_cpu(sqsh_ino->file_size); inode->i_fop = &generic_ro_fops; inode->i_mode |= S_IFREG; inode->i_blocks = ((inode->i_size - 1) >> 9) + 1; @@ -184,8 +192,17 @@ int squashfs_read_inode(struct inode *in if (err < 0) goto failed_read;
+ inode->i_size = le64_to_cpu(sqsh_ino->file_size); frag = le32_to_cpu(sqsh_ino->fragment); if (frag != SQUASHFS_INVALID_FRAG) { + /* + * the file cannot have a fragment (tailend) and have a + * file size a multiple of the block size + */ + if ((inode->i_size & (msblk->block_size - 1)) == 0) { + err = -EINVAL; + goto failed_read; + } frag_offset = le32_to_cpu(sqsh_ino->offset); frag_size = squashfs_frag_lookup(sb, frag, &frag_blk); if (frag_size < 0) { @@ -200,7 +217,6 @@ int squashfs_read_inode(struct inode *in
xattr_id = le32_to_cpu(sqsh_ino->xattr); set_nlink(inode, le32_to_cpu(sqsh_ino->nlink)); - inode->i_size = le64_to_cpu(sqsh_ino->file_size); inode->i_op = &squashfs_inode_ops; inode->i_fop = &generic_ro_fops; inode->i_mode |= S_IFREG;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Phillip Lougher phillip@squashfs.org.uk
[ Upstream commit 9f1c14c1de1bdde395f6cc893efa4f80a2ae3b2b ]
Syskaller reports a "WARNING in ovl_copy_up_file" in overlayfs.
This warning is ultimately caused because the underlying Squashfs file system returns a file with a negative file size.
This commit checks for a negative file size and returns EINVAL.
[phillip@squashfs.org.uk: only need to check 64 bit quantity] Link: https://lkml.kernel.org/r/20250926222305.110103-1-phillip@squashfs.org.uk Link: https://lkml.kernel.org/r/20250926215935.107233-1-phillip@squashfs.org.uk Fixes: 6545b246a2c8 ("Squashfs: inode operations") Signed-off-by: Phillip Lougher phillip@squashfs.org.uk Reported-by: syzbot+f754e01116421e9754b9@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/68d580e5.a00a0220.303701.0019.GAE@google.com/ Cc: Amir Goldstein amir73il@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/squashfs/inode.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/fs/squashfs/inode.c +++ b/fs/squashfs/inode.c @@ -193,6 +193,10 @@ int squashfs_read_inode(struct inode *in goto failed_read;
inode->i_size = le64_to_cpu(sqsh_ino->file_size); + if (inode->i_size < 0) { + err = -EINVAL; + goto failed_read; + } frag = le32_to_cpu(sqsh_ino->fragment); if (frag != SQUASHFS_INVALID_FRAG) { /*
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yuan Chen chenyuan@kylinos.cn
[ Upstream commit 9cf9aa7b0acfde7545c1a1d912576e9bab28dc6f ]
There is a critical race condition in kprobe initialization that can lead to NULL pointer dereference and kernel crash.
[1135630.084782] Unable to handle kernel paging request at virtual address 0000710a04630000 ... [1135630.260314] pstate: 404003c9 (nZcv DAIF +PAN -UAO) [1135630.269239] pc : kprobe_perf_func+0x30/0x260 [1135630.277643] lr : kprobe_dispatcher+0x44/0x60 [1135630.286041] sp : ffffaeff4977fa40 [1135630.293441] x29: ffffaeff4977fa40 x28: ffffaf015340e400 [1135630.302837] x27: 0000000000000000 x26: 0000000000000000 [1135630.312257] x25: ffffaf029ed108a8 x24: ffffaf015340e528 [1135630.321705] x23: ffffaeff4977fc50 x22: ffffaeff4977fc50 [1135630.331154] x21: 0000000000000000 x20: ffffaeff4977fc50 [1135630.340586] x19: ffffaf015340e400 x18: 0000000000000000 [1135630.349985] x17: 0000000000000000 x16: 0000000000000000 [1135630.359285] x15: 0000000000000000 x14: 0000000000000000 [1135630.368445] x13: 0000000000000000 x12: 0000000000000000 [1135630.377473] x11: 0000000000000000 x10: 0000000000000000 [1135630.386411] x9 : 0000000000000000 x8 : 0000000000000000 [1135630.395252] x7 : 0000000000000000 x6 : 0000000000000000 [1135630.403963] x5 : 0000000000000000 x4 : 0000000000000000 [1135630.412545] x3 : 0000710a04630000 x2 : 0000000000000006 [1135630.421021] x1 : ffffaeff4977fc50 x0 : 0000710a04630000 [1135630.429410] Call trace: [1135630.434828] kprobe_perf_func+0x30/0x260 [1135630.441661] kprobe_dispatcher+0x44/0x60 [1135630.448396] aggr_pre_handler+0x70/0xc8 [1135630.454959] kprobe_breakpoint_handler+0x140/0x1e0 [1135630.462435] brk_handler+0xbc/0xd8 [1135630.468437] do_debug_exception+0x84/0x138 [1135630.475074] el1_dbg+0x18/0x8c [1135630.480582] security_file_permission+0x0/0xd0 [1135630.487426] vfs_write+0x70/0x1c0 [1135630.493059] ksys_write+0x5c/0xc8 [1135630.498638] __arm64_sys_write+0x24/0x30 [1135630.504821] el0_svc_common+0x78/0x130 [1135630.510838] el0_svc_handler+0x38/0x78 [1135630.516834] el0_svc+0x8/0x1b0
kernel/trace/trace_kprobe.c: 1308 0xffff3df8995039ec <kprobe_perf_func+0x2c>: ldr x21, [x24,#120] include/linux/compiler.h: 294 0xffff3df8995039f0 <kprobe_perf_func+0x30>: ldr x1, [x21,x0]
kernel/trace/trace_kprobe.c 1308: head = this_cpu_ptr(call->perf_events); 1309: if (hlist_empty(head)) 1310: return 0;
crash> struct trace_event_call -o struct trace_event_call { ... [120] struct hlist_head *perf_events; //(call->perf_event) ... }
crash> struct trace_event_call ffffaf015340e528 struct trace_event_call { ... perf_events = 0xffff0ad5fa89f088, //this value is correct, but x21 = 0 ... }
Race Condition Analysis:
The race occurs between kprobe activation and perf_events initialization:
CPU0 CPU1 ==== ==== perf_kprobe_init perf_trace_event_init tp_event->perf_events = list;(1) tp_event->class->reg (2)← KPROBE ACTIVE Debug exception triggers ... kprobe_dispatcher kprobe_perf_func (tk->tp.flags & TP_FLAG_PROFILE) head = this_cpu_ptr(call->perf_events)(3) (perf_events is still NULL)
Problem: 1. CPU0 executes (1) assigning tp_event->perf_events = list 2. CPU0 executes (2) enabling kprobe functionality via class->reg() 3. CPU1 triggers and reaches kprobe_dispatcher 4. CPU1 checks TP_FLAG_PROFILE - condition passes (step 2 completed) 5. CPU1 calls kprobe_perf_func() and crashes at (3) because call->perf_events is still NULL
CPU1 sees that kprobe functionality is enabled but does not see that perf_events has been assigned.
Add pairing read and write memory barriers to guarantee that if CPU1 sees that kprobe functionality is enabled, it must also see that perf_events has been assigned.
Link: https://lore.kernel.org/all/20251001022025.44626-1-chenyuan_fl@163.com/
Fixes: 50d780560785 ("tracing/kprobes: Add probe handler dispatcher to support perf and ftrace concurrent use") Cc: stable@vger.kernel.org Signed-off-by: Yuan Chen chenyuan@kylinos.cn Signed-off-by: Masami Hiramatsu (Google) mhiramat@kernel.org [ Adjust context ] Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_fprobe.c | 11 +++++++---- kernel/trace/trace_kprobe.c | 11 +++++++---- kernel/trace/trace_probe.h | 9 +++++++-- kernel/trace/trace_uprobe.c | 12 ++++++++---- 4 files changed, 29 insertions(+), 14 deletions(-)
--- a/kernel/trace/trace_fprobe.c +++ b/kernel/trace/trace_fprobe.c @@ -343,12 +343,14 @@ static int fentry_dispatcher(struct fpro void *entry_data) { struct trace_fprobe *tf = container_of(fp, struct trace_fprobe, fp); + unsigned int flags = trace_probe_load_flag(&tf->tp); int ret = 0;
- if (trace_probe_test_flag(&tf->tp, TP_FLAG_TRACE)) + if (flags & TP_FLAG_TRACE) fentry_trace_func(tf, entry_ip, regs); + #ifdef CONFIG_PERF_EVENTS - if (trace_probe_test_flag(&tf->tp, TP_FLAG_PROFILE)) + if (flags & TP_FLAG_PROFILE) ret = fentry_perf_func(tf, entry_ip, regs); #endif return ret; @@ -360,11 +362,12 @@ static void fexit_dispatcher(struct fpro void *entry_data) { struct trace_fprobe *tf = container_of(fp, struct trace_fprobe, fp); + unsigned int flags = trace_probe_load_flag(&tf->tp);
- if (trace_probe_test_flag(&tf->tp, TP_FLAG_TRACE)) + if (flags & TP_FLAG_TRACE) fexit_trace_func(tf, entry_ip, ret_ip, regs, entry_data); #ifdef CONFIG_PERF_EVENTS - if (trace_probe_test_flag(&tf->tp, TP_FLAG_PROFILE)) + if (flags & TP_FLAG_PROFILE) fexit_perf_func(tf, entry_ip, ret_ip, regs, entry_data); #endif } --- a/kernel/trace/trace_kprobe.c +++ b/kernel/trace/trace_kprobe.c @@ -1799,14 +1799,15 @@ static int kprobe_register(struct trace_ static int kprobe_dispatcher(struct kprobe *kp, struct pt_regs *regs) { struct trace_kprobe *tk = container_of(kp, struct trace_kprobe, rp.kp); + unsigned int flags = trace_probe_load_flag(&tk->tp); int ret = 0;
raw_cpu_inc(*tk->nhit);
- if (trace_probe_test_flag(&tk->tp, TP_FLAG_TRACE)) + if (flags & TP_FLAG_TRACE) kprobe_trace_func(tk, regs); #ifdef CONFIG_PERF_EVENTS - if (trace_probe_test_flag(&tk->tp, TP_FLAG_PROFILE)) + if (flags & TP_FLAG_PROFILE) ret = kprobe_perf_func(tk, regs); #endif return ret; @@ -1818,6 +1819,7 @@ kretprobe_dispatcher(struct kretprobe_in { struct kretprobe *rp = get_kretprobe(ri); struct trace_kprobe *tk; + unsigned int flags;
/* * There is a small chance that get_kretprobe(ri) returns NULL when @@ -1830,10 +1832,11 @@ kretprobe_dispatcher(struct kretprobe_in tk = container_of(rp, struct trace_kprobe, rp); raw_cpu_inc(*tk->nhit);
- if (trace_probe_test_flag(&tk->tp, TP_FLAG_TRACE)) + flags = trace_probe_load_flag(&tk->tp); + if (flags & TP_FLAG_TRACE) kretprobe_trace_func(tk, ri, regs); #ifdef CONFIG_PERF_EVENTS - if (trace_probe_test_flag(&tk->tp, TP_FLAG_PROFILE)) + if (flags & TP_FLAG_PROFILE) kretprobe_perf_func(tk, ri, regs); #endif return 0; /* We don't tweak kernel, so just return 0 */ --- a/kernel/trace/trace_probe.h +++ b/kernel/trace/trace_probe.h @@ -269,16 +269,21 @@ struct event_file_link { struct list_head list; };
+static inline unsigned int trace_probe_load_flag(struct trace_probe *tp) +{ + return smp_load_acquire(&tp->event->flags); +} + static inline bool trace_probe_test_flag(struct trace_probe *tp, unsigned int flag) { - return !!(tp->event->flags & flag); + return !!(trace_probe_load_flag(tp) & flag); }
static inline void trace_probe_set_flag(struct trace_probe *tp, unsigned int flag) { - tp->event->flags |= flag; + smp_store_release(&tp->event->flags, tp->event->flags | flag); }
static inline void trace_probe_clear_flag(struct trace_probe *tp, --- a/kernel/trace/trace_uprobe.c +++ b/kernel/trace/trace_uprobe.c @@ -1531,6 +1531,7 @@ static int uprobe_dispatcher(struct upro struct trace_uprobe *tu; struct uprobe_dispatch_data udd; struct uprobe_cpu_buffer *ucb = NULL; + unsigned int flags; int ret = 0;
tu = container_of(con, struct trace_uprobe, consumer); @@ -1545,11 +1546,12 @@ static int uprobe_dispatcher(struct upro if (WARN_ON_ONCE(!uprobe_cpu_buffer)) return 0;
- if (trace_probe_test_flag(&tu->tp, TP_FLAG_TRACE)) + flags = trace_probe_load_flag(&tu->tp); + if (flags & TP_FLAG_TRACE) ret |= uprobe_trace_func(tu, regs, &ucb);
#ifdef CONFIG_PERF_EVENTS - if (trace_probe_test_flag(&tu->tp, TP_FLAG_PROFILE)) + if (flags & TP_FLAG_PROFILE) ret |= uprobe_perf_func(tu, regs, &ucb); #endif uprobe_buffer_put(ucb); @@ -1562,6 +1564,7 @@ static int uretprobe_dispatcher(struct u struct trace_uprobe *tu; struct uprobe_dispatch_data udd; struct uprobe_cpu_buffer *ucb = NULL; + unsigned int flags;
tu = container_of(con, struct trace_uprobe, consumer);
@@ -1573,11 +1576,12 @@ static int uretprobe_dispatcher(struct u if (WARN_ON_ONCE(!uprobe_cpu_buffer)) return 0;
- if (trace_probe_test_flag(&tu->tp, TP_FLAG_TRACE)) + flags = trace_probe_load_flag(&tu->tp); + if (flags & TP_FLAG_TRACE) uretprobe_trace_func(tu, func, regs, &ucb);
#ifdef CONFIG_PERF_EVENTS - if (trace_probe_test_flag(&tu->tp, TP_FLAG_PROFILE)) + if (flags & TP_FLAG_PROFILE) uretprobe_perf_func(tu, func, regs, &ucb); #endif uprobe_buffer_put(ucb);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Donet Tom donettom@linux.ibm.com
[ Upstream commit 4d6fc29f36341d7795db1d1819b4c15fe9be7b23 ]
Patch series "mm/ksm: Fix incorrect accounting of KSM counters during fork", v3.
The first patch in this series fixes the incorrect accounting of KSM counters such as ksm_merging_pages, ksm_rmap_items, and the global ksm_zero_pages during fork.
The following patch add a selftest to verify the ksm_merging_pages counter was updated correctly during fork.
Test Results ============ Without the first patch ----------------------- # [RUN] test_fork_ksm_merging_page_count not ok 10 ksm_merging_page in child: 32
With the first patch -------------------- # [RUN] test_fork_ksm_merging_page_count ok 10 ksm_merging_pages is not inherited after fork
This patch (of 2):
Currently, the KSM-related counters in `mm_struct`, such as `ksm_merging_pages`, `ksm_rmap_items`, and `ksm_zero_pages`, are inherited by the child process during fork. This results in inconsistent accounting.
When a process uses KSM, identical pages are merged and an rmap item is created for each merged page. The `ksm_merging_pages` and `ksm_rmap_items` counters are updated accordingly. However, after a fork, these counters are copied to the child while the corresponding rmap items are not. As a result, when the child later triggers an unmerge, there are no rmap items present in the child, so the counters remain stale, leading to incorrect accounting.
A similar issue exists with `ksm_zero_pages`, which maintains both a global counter and a per-process counter. During fork, the per-process counter is inherited by the child, but the global counter is not incremented. Since the child also references zero pages, the global counter should be updated as well. Otherwise, during zero-page unmerge, both the global and per-process counters are decremented, causing the global counter to become inconsistent.
To fix this, ksm_merging_pages and ksm_rmap_items are reset to 0 during fork, and the global ksm_zero_pages counter is updated with the per-process ksm_zero_pages value inherited by the child. This ensures that KSM statistics remain accurate and reflect the activity of each process correctly.
Link: https://lkml.kernel.org/r/cover.1758648700.git.donettom@linux.ibm.com Link: https://lkml.kernel.org/r/7b9870eb67ccc0d79593940d9dbd4a0b39b5d396.175864870... Fixes: 7609385337a4 ("ksm: count ksm merging pages for each process") Fixes: cb4df4cae4f2 ("ksm: count allocated ksm rmap_items for each process") Fixes: e2942062e01d ("ksm: count all zero pages placed by KSM") Signed-off-by: Donet Tom donettom@linux.ibm.com Reviewed-by: Chengming Zhou chengming.zhou@linux.dev Acked-by: David Hildenbrand david@redhat.com Cc: Aboorva Devarajan aboorvad@linux.ibm.com Cc: David Hildenbrand david@redhat.com Cc: Donet Tom donettom@linux.ibm.com Cc: "Ritesh Harjani (IBM)" ritesh.list@gmail.com Cc: Wei Yang richard.weiyang@gmail.com Cc: xu xin xu.xin16@zte.com.cn Cc: stable@vger.kernel.org [6.6+] Signed-off-by: Andrew Morton akpm@linux-foundation.org [ replaced mm_flags_test() calls with test_bit() ] Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/ksm.h | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/include/linux/ksm.h +++ b/include/linux/ksm.h @@ -57,8 +57,14 @@ static inline long mm_ksm_zero_pages(str static inline void ksm_fork(struct mm_struct *mm, struct mm_struct *oldmm) { /* Adding mm to ksm is best effort on fork. */ - if (test_bit(MMF_VM_MERGEABLE, &oldmm->flags)) + if (test_bit(MMF_VM_MERGEABLE, &oldmm->flags)) { + long nr_ksm_zero_pages = atomic_long_read(&mm->ksm_zero_pages); + + mm->ksm_merging_pages = 0; + mm->ksm_rmap_items = 0; + atomic_long_add(nr_ksm_zero_pages, &ksm_zero_pages); __ksm_enter(mm); + } }
static inline int ksm_execve(struct mm_struct *mm)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wang Jiang jiangwang@kylinos.cn
[ Upstream commit 9b80bdb10aee04ce7289896e6bdad13e33972636 ]
Remove a surplus return statement from the void function that has been added in the commit commit 8353813c88ef ("PCI: endpoint: Enable DMA tests for endpoints with DMA capabilities").
Especially, as an empty return statements at the end of a void functions serve little purpose.
This fixes the following checkpatch.pl script warning:
WARNING: void function return statements are not generally useful #296: FILE: drivers/pci/endpoint/functions/pci-epf-test.c:296: + return; +}
Link: https://lore.kernel.org/r/tencent_F250BEE2A65745A524E2EFE70CF615CA8F06@qq.co... Signed-off-by: Wang Jiang jiangwang@kylinos.cn [kwilczynski: commit log] Signed-off-by: Krzysztof Wilczyński kwilczynski@kernel.org Stable-dep-of: 85afa9ea122d ("PCI: endpoint: pci-epf-test: Add NULL check for DMA channels before release") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/endpoint/functions/pci-epf-test.c | 2 -- 1 file changed, 2 deletions(-)
--- a/drivers/pci/endpoint/functions/pci-epf-test.c +++ b/drivers/pci/endpoint/functions/pci-epf-test.c @@ -291,8 +291,6 @@ static void pci_epf_test_clean_dma_chan(
dma_release_channel(epf_test->dma_chan_rx); epf_test->dma_chan_rx = NULL; - - return; }
static void pci_epf_test_print_rate(struct pci_epf_test *epf_test,
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com
[ Upstream commit 85afa9ea122dd9d4a2ead104a951d318975dcd25 ]
The fields dma_chan_tx and dma_chan_rx of the struct pci_epf_test can be NULL even after EPF initialization. Then it is prudent to check that they have non-NULL values before releasing the channels. Add the checks in pci_epf_test_clean_dma_chan().
Without the checks, NULL pointer dereferences happen and they can lead to a kernel panic in some cases:
Unable to handle kernel NULL pointer dereference at virtual address 0000000000000050 Call trace: dma_release_channel+0x2c/0x120 (P) pci_epf_test_epc_deinit+0x94/0xc0 [pci_epf_test] pci_epc_deinit_notify+0x74/0xc0 tegra_pcie_ep_pex_rst_irq+0x250/0x5d8 irq_thread_fn+0x34/0xb8 irq_thread+0x18c/0x2e8 kthread+0x14c/0x210 ret_from_fork+0x10/0x20
Fixes: 8353813c88ef ("PCI: endpoint: Enable DMA tests for endpoints with DMA capabilities") Fixes: 5ebf3fc59bd2 ("PCI: endpoint: functions/pci-epf-test: Add DMA support to transfer data") Signed-off-by: Shin'ichiro Kawasaki shinichiro.kawasaki@wdc.com [mani: trimmed the stack trace] Signed-off-by: Manivannan Sadhasivam mani@kernel.org Reviewed-by: Damien Le Moal dlemoal@kernel.org Reviewed-by: Krzysztof Wilczyński kwilczynski@kernel.org Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250916025756.34807-1-shinichiro.kawasaki@wdc.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/endpoint/functions/pci-epf-test.c | 17 +++++++++++------ 1 file changed, 11 insertions(+), 6 deletions(-)
--- a/drivers/pci/endpoint/functions/pci-epf-test.c +++ b/drivers/pci/endpoint/functions/pci-epf-test.c @@ -282,15 +282,20 @@ static void pci_epf_test_clean_dma_chan( if (!epf_test->dma_supported) return;
- dma_release_channel(epf_test->dma_chan_tx); - if (epf_test->dma_chan_tx == epf_test->dma_chan_rx) { + if (epf_test->dma_chan_tx) { + dma_release_channel(epf_test->dma_chan_tx); + if (epf_test->dma_chan_tx == epf_test->dma_chan_rx) { + epf_test->dma_chan_tx = NULL; + epf_test->dma_chan_rx = NULL; + return; + } epf_test->dma_chan_tx = NULL; - epf_test->dma_chan_rx = NULL; - return; }
- dma_release_channel(epf_test->dma_chan_rx); - epf_test->dma_chan_rx = NULL; + if (epf_test->dma_chan_rx) { + dma_release_channel(epf_test->dma_chan_rx); + epf_test->dma_chan_rx = NULL; + } }
static void pci_epf_test_print_rate(struct pci_epf_test *epf_test,
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Ujfalusi peter.ujfalusi@linux.intel.com
[ Upstream commit a1d203d390e04798ccc1c3c06019cd4411885d6d ]
All streams (currently) which is configured to use ChainDMA can only work on Link/host DMA pairs where the link side position can be access via host registers (like HDA on CAVS 2.5 platforms).
Since the firmware does not provide time_info for ChainDMA, unlike for HDA stream, the kernel should calculate the start and end offsets that is needed for the delay calculation.
With this small change we can report accurate delays when the stream is configured to use ChainDMA.
Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com Reviewed-by: Bard Liao yung-chuan.liao@linux.intel.com Reviewed-by: Liam Girdwood liam.r.girdwood@intel.com Reviewed-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Link: https://patch.msgid.link/20250619102848.12389-1-peter.ujfalusi@linux.intel.c... Signed-off-by: Mark Brown broonie@kernel.org Stable-dep-of: bcd1383516bb ("ASoC: SOF: ipc4-pcm: fix delay calculation when DSP resamples") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/sof/ipc4-pcm.c | 49 ++++++++++++++++++++++++++++++++++++++---- sound/soc/sof/ipc4-topology.c | 6 ++--- sound/soc/sof/ipc4-topology.h | 1 3 files changed, 49 insertions(+), 7 deletions(-)
--- a/sound/soc/sof/ipc4-pcm.c +++ b/sound/soc/sof/ipc4-pcm.c @@ -409,9 +409,33 @@ static int sof_ipc4_trigger_pipelines(st * If use_chain_dma attribute is set we proceed to chained DMA * trigger function that handles the rest for the substream. */ - if (pipeline->use_chain_dma) - return sof_ipc4_chain_dma_trigger(sdev, spcm, substream->stream, - pipeline_list, state, cmd); + if (pipeline->use_chain_dma) { + struct sof_ipc4_timestamp_info *time_info; + + time_info = sof_ipc4_sps_to_time_info(&spcm->stream[substream->stream]); + + ret = sof_ipc4_chain_dma_trigger(sdev, spcm, substream->stream, + pipeline_list, state, cmd); + if (ret || !time_info) + return ret; + + if (state == SOF_IPC4_PIPE_PAUSED) { + /* + * Record the DAI position for delay reporting + * To handle multiple pause/resume/xrun we need to add + * the positions to simulate how the firmware behaves + */ + u64 pos = snd_sof_pcm_get_dai_frame_counter(sdev, component, + substream); + + time_info->stream_end_offset += pos; + } else if (state == SOF_IPC4_PIPE_RESET) { + /* Reset the end offset as the stream is stopped */ + time_info->stream_end_offset = 0; + } + + return 0; + }
/* allocate memory for the pipeline data */ trigger_list = kzalloc(struct_size(trigger_list, pipeline_instance_ids, @@ -924,8 +948,24 @@ static int sof_ipc4_get_stream_start_off if (!host_copier || !dai_copier) return -EINVAL;
- if (host_copier->data.gtw_cfg.node_id == SOF_IPC4_INVALID_NODE_ID) + if (host_copier->data.gtw_cfg.node_id == SOF_IPC4_INVALID_NODE_ID) { return -EINVAL; + } else if (host_copier->data.gtw_cfg.node_id == SOF_IPC4_CHAIN_DMA_NODE_ID) { + /* + * While the firmware does not supports time_info reporting for + * streams using ChainDMA, it is granted that ChainDMA can only + * be used on Host+Link pairs where the link position is + * accessible from the host side. + * + * Enable delay calculation in case of ChainDMA via host + * accessible registers. + * + * The ChainDMA uses 2x 1ms ping-pong buffer, dai side starts + * when 1ms data is available + */ + time_info->stream_start_offset = substream->runtime->rate / MSEC_PER_SEC; + goto out; + }
node_index = SOF_IPC4_NODE_INDEX(host_copier->data.gtw_cfg.node_id); offset = offsetof(struct sof_ipc4_fw_registers, pipeline_regs) + node_index * sizeof(ppl_reg); @@ -943,6 +983,7 @@ static int sof_ipc4_get_stream_start_off time_info->stream_end_offset = ppl_reg.stream_end_offset; do_div(time_info->stream_end_offset, dai_sample_size);
+out: /* * Calculate the wrap boundary need to be used for delay calculation * The host counter is in bytes, it will wrap earlier than the frames --- a/sound/soc/sof/ipc4-topology.c +++ b/sound/soc/sof/ipc4-topology.c @@ -1782,10 +1782,10 @@ sof_ipc4_prepare_copier_module(struct sn pipeline->msg.extension |= SOF_IPC4_GLB_EXT_CHAIN_DMA_FIFO_SIZE(fifo_size);
/* - * Chain DMA does not support stream timestamping, set node_id to invalid - * to skip the code in sof_ipc4_get_stream_start_offset(). + * Chain DMA does not support stream timestamping, but it + * can use the host side registers for delay calculation. */ - copier_data->gtw_cfg.node_id = SOF_IPC4_INVALID_NODE_ID; + copier_data->gtw_cfg.node_id = SOF_IPC4_CHAIN_DMA_NODE_ID;
return 0; } --- a/sound/soc/sof/ipc4-topology.h +++ b/sound/soc/sof/ipc4-topology.h @@ -58,6 +58,7 @@
#define SOF_IPC4_DMA_DEVICE_MAX_COUNT 16
+#define SOF_IPC4_CHAIN_DMA_NODE_ID 0x7fffffff #define SOF_IPC4_INVALID_NODE_ID 0xffffffff
/* FW requires minimum 4ms DMA buffer size */
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kai Vehmanen kai.vehmanen@linux.intel.com
[ Upstream commit bcd1383516bb5a6f72b2d1e7f7ad42c4a14837d1 ]
When the sampling rates going in (host) and out (dai) from the DSP are different, the IPC4 delay reporting does not work correctly. Add support for this case by scaling the all raw position values to a common timebase before calculating real-time delay for the PCM.
Cc: stable@vger.kernel.org Fixes: 0ea06680dfcb ("ASoC: SOF: ipc4-pcm: Correct the delay calculation") Signed-off-by: Kai Vehmanen kai.vehmanen@linux.intel.com Reviewed-by: Péter Ujfalusi peter.ujfalusi@linux.intel.com Reviewed-by: Bard Liao yung-chuan.liao@linux.intel.com Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com Link: https://patch.msgid.link/20251002074719.2084-2-peter.ujfalusi@linux.intel.co... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/sof/ipc4-pcm.c | 83 +++++++++++++++++++++++++++++++++++------------ 1 file changed, 62 insertions(+), 21 deletions(-)
--- a/sound/soc/sof/ipc4-pcm.c +++ b/sound/soc/sof/ipc4-pcm.c @@ -19,12 +19,14 @@ * struct sof_ipc4_timestamp_info - IPC4 timestamp info * @host_copier: the host copier of the pcm stream * @dai_copier: the dai copier of the pcm stream - * @stream_start_offset: reported by fw in memory window (converted to frames) - * @stream_end_offset: reported by fw in memory window (converted to frames) + * @stream_start_offset: reported by fw in memory window (converted to + * frames at host_copier sampling rate) + * @stream_end_offset: reported by fw in memory window (converted to + * frames at host_copier sampling rate) * @llp_offset: llp offset in memory window - * @boundary: wrap boundary should be used for the LLP frame counter * @delay: Calculated and stored in pointer callback. The stored value is - * returned in the delay callback. + * returned in the delay callback. Expressed in frames at host copier + * sampling rate. */ struct sof_ipc4_timestamp_info { struct sof_ipc4_copier *host_copier; @@ -33,7 +35,6 @@ struct sof_ipc4_timestamp_info { u64 stream_end_offset; u32 llp_offset;
- u64 boundary; snd_pcm_sframes_t delay; };
@@ -48,6 +49,16 @@ struct sof_ipc4_pcm_stream_priv { bool chain_dma_allocated; };
+/* + * Modulus to use to compare host and link position counters. The sampling + * rates may be different, so the raw hardware counters will wrap + * around at different times. To calculate differences, use + * DELAY_BOUNDARY as a common modulus. This value must be smaller than + * the wrap-around point of any hardware counter, and larger than any + * valid delay measurement. + */ +#define DELAY_BOUNDARY U32_MAX + static inline struct sof_ipc4_timestamp_info * sof_ipc4_sps_to_time_info(struct snd_sof_pcm_stream *sps) { @@ -933,6 +944,35 @@ static int sof_ipc4_pcm_hw_params(struct return 0; }
+static u64 sof_ipc4_frames_dai_to_host(struct sof_ipc4_timestamp_info *time_info, u64 value) +{ + u64 dai_rate, host_rate; + + if (!time_info->dai_copier || !time_info->host_copier) + return value; + + /* + * copiers do not change sampling rate, so we can use the + * out_format independently of stream direction + */ + dai_rate = time_info->dai_copier->data.out_format.sampling_frequency; + host_rate = time_info->host_copier->data.out_format.sampling_frequency; + + if (!dai_rate || !host_rate || dai_rate == host_rate) + return value; + + /* take care not to overflow u64, rates can be up to 768000 */ + if (value > U32_MAX) { + value = div64_u64(value, dai_rate); + value *= host_rate; + } else { + value *= host_rate; + value = div64_u64(value, dai_rate); + } + + return value; +} + static int sof_ipc4_get_stream_start_offset(struct snd_sof_dev *sdev, struct snd_pcm_substream *substream, struct snd_sof_pcm_stream *sps, @@ -983,14 +1023,13 @@ static int sof_ipc4_get_stream_start_off time_info->stream_end_offset = ppl_reg.stream_end_offset; do_div(time_info->stream_end_offset, dai_sample_size);
+ /* convert to host frame time */ + time_info->stream_start_offset = + sof_ipc4_frames_dai_to_host(time_info, time_info->stream_start_offset); + time_info->stream_end_offset = + sof_ipc4_frames_dai_to_host(time_info, time_info->stream_end_offset); + out: - /* - * Calculate the wrap boundary need to be used for delay calculation - * The host counter is in bytes, it will wrap earlier than the frames - * based link counter. - */ - time_info->boundary = div64_u64(~((u64)0), - frames_to_bytes(substream->runtime, 1)); /* Initialize the delay value to 0 (no delay) */ time_info->delay = 0;
@@ -1033,6 +1072,8 @@ static int sof_ipc4_pcm_pointer(struct s
/* For delay calculation we need the host counter */ host_cnt = snd_sof_pcm_get_host_byte_counter(sdev, component, substream); + + /* Store the original value to host_ptr */ host_ptr = host_cnt;
/* convert the host_cnt to frames */ @@ -1051,6 +1092,8 @@ static int sof_ipc4_pcm_pointer(struct s sof_mailbox_read(sdev, time_info->llp_offset, &llp, sizeof(llp)); dai_cnt = ((u64)llp.reading.llp_u << 32) | llp.reading.llp_l; } + + dai_cnt = sof_ipc4_frames_dai_to_host(time_info, dai_cnt); dai_cnt += time_info->stream_end_offset;
/* In two cases dai dma counter is not accurate @@ -1084,8 +1127,9 @@ static int sof_ipc4_pcm_pointer(struct s dai_cnt -= time_info->stream_start_offset; }
- /* Wrap the dai counter at the boundary where the host counter wraps */ - div64_u64_rem(dai_cnt, time_info->boundary, &dai_cnt); + /* Convert to a common base before comparisons */ + dai_cnt &= DELAY_BOUNDARY; + host_cnt &= DELAY_BOUNDARY;
if (substream->stream == SNDRV_PCM_STREAM_PLAYBACK) { head_cnt = host_cnt; @@ -1095,14 +1139,11 @@ static int sof_ipc4_pcm_pointer(struct s tail_cnt = host_cnt; }
- if (head_cnt < tail_cnt) { - time_info->delay = time_info->boundary - tail_cnt + head_cnt; - goto out; - } - - time_info->delay = head_cnt - tail_cnt; + if (unlikely(head_cnt < tail_cnt)) + time_info->delay = DELAY_BOUNDARY - tail_cnt + head_cnt; + else + time_info->delay = head_cnt - tail_cnt;
-out: /* * Convert the host byte counter to PCM pointer which wraps in buffer * and it is in frames
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit 70e997e0107e5ed85c1a3ef2adfccbe351c29d71 ]
The max_register = 128 setting in the regmap config is not valid.
The Intel Dollar Cove TI PMIC has an eeprom unlock register at address 0x88 and a number of EEPROM registers at 0xF?. Increase max_register to 0xff so that these registers can be accessed.
Signed-off-by: Hans de Goede hdegoede@redhat.com Reviewed-by: Andy Shevchenko andy@kernel.org Link: https://lore.kernel.org/r/20241208150028.325349-1-hdegoede@redhat.com Signed-off-by: Lee Jones lee@kernel.org Stable-dep-of: 64e0d839c589 ("mfd: intel_soc_pmic_chtdc_ti: Set use_single_read regmap_config flag") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mfd/intel_soc_pmic_chtdc_ti.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/mfd/intel_soc_pmic_chtdc_ti.c +++ b/drivers/mfd/intel_soc_pmic_chtdc_ti.c @@ -81,7 +81,7 @@ static struct mfd_cell chtdc_ti_dev[] = static const struct regmap_config chtdc_ti_regmap_config = { .reg_bits = 8, .val_bits = 8, - .max_register = 128, + .max_register = 0xff, .cache_type = REGCACHE_NONE, };
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
[ Upstream commit 9eb99c08508714906db078b5efbe075329a3fb06 ]
REGCACHE_NONE is the default type of the cache when not provided. Drop unneeded explicit assignment to it.
Note, it's defined to 0, and if ever be redefined, it will break literally a lot of the drivers, so it very unlikely to happen.
Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Reviewed-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20250129152823.1802273-1-andriy.shevchenko@linux.i... Signed-off-by: Lee Jones lee@kernel.org Stable-dep-of: 64e0d839c589 ("mfd: intel_soc_pmic_chtdc_ti: Set use_single_read regmap_config flag") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mfd/intel_soc_pmic_chtdc_ti.c | 1 - 1 file changed, 1 deletion(-)
--- a/drivers/mfd/intel_soc_pmic_chtdc_ti.c +++ b/drivers/mfd/intel_soc_pmic_chtdc_ti.c @@ -82,7 +82,6 @@ static const struct regmap_config chtdc_ .reg_bits = 8, .val_bits = 8, .max_register = 0xff, - .cache_type = REGCACHE_NONE, };
static const struct regmap_irq chtdc_ti_irqs[] = {
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hans de Goede hansg@kernel.org
[ Upstream commit 64e0d839c589f4f2ecd2e3e5bdb5cee6ba6bade9 ]
Testing has shown that reading multiple registers at once (for 10-bit ADC values) does not work. Set the use_single_read regmap_config flag to make regmap split these for us.
This should fix temperature opregion accesses done by drivers/acpi/pmic/intel_pmic_chtdc_ti.c and is also necessary for the upcoming drivers for the ADC and battery MFD cells.
Fixes: 6bac0606fdba ("mfd: Add support for Cherry Trail Dollar Cove TI PMIC") Cc: stable@vger.kernel.org Reviewed-by: Andy Shevchenko andy@kernel.org Signed-off-by: Hans de Goede hansg@kernel.org Link: https://lore.kernel.org/r/20250804133240.312383-1-hansg@kernel.org Signed-off-by: Lee Jones lee@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mfd/intel_soc_pmic_chtdc_ti.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/mfd/intel_soc_pmic_chtdc_ti.c +++ b/drivers/mfd/intel_soc_pmic_chtdc_ti.c @@ -82,6 +82,8 @@ static const struct regmap_config chtdc_ .reg_bits = 8, .val_bits = 8, .max_register = 0xff, + /* The hardware does not support reading multiple registers at once */ + .use_single_read = true, };
static const struct regmap_irq chtdc_ti_irqs[] = {
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qu Wenruo wqu@suse.com
[ Upstream commit 7b26da407420e5054e3f06c5d13271697add9423 ]
[BUG] With my local branch to enable bs > ps support for btrfs, sometimes I hit the following ASSERT() inside submit_one_sector():
ASSERT(block_start != EXTENT_MAP_HOLE);
Please note that it's not yet possible to hit this ASSERT() in the wild yet, as it requires btrfs bs > ps support, which is not even in the development branch.
But on the other hand, there is also a very low chance to hit above ASSERT() with bs < ps cases, so this is an existing bug affect not only the incoming bs > ps support but also the existing bs < ps support.
[CAUSE] Firstly that ASSERT() means we're trying to submit a dirty block but without a real extent map nor ordered extent map backing it.
Furthermore with extra debugging, the folio triggering such ASSERT() is always larger than the fs block size in my bs > ps case. (8K block size, 4K page size)
After some more debugging, the ASSERT() is trigger by the following sequence:
extent_writepage() | We got a 32K folio (4 fs blocks) at file offset 0, and the fs block | size is 8K, page size is 4K. | And there is another 8K folio at file offset 32K, which is also | dirty. | So the filemap layout looks like the following: | | "||" is the filio boundary in the filemap. | "//| is the dirty range. | | 0 8K 16K 24K 32K 40K | |////////| |//////////////////////||////////| | |- writepage_delalloc() | |- find_lock_delalloc_range() for [0, 8K) | | Now range [0, 8K) is properly locked. | | | |- find_lock_delalloc_range() for [16K, 40K) | | |- btrfs_find_delalloc_range() returned range [16K, 40K) | | |- lock_delalloc_folios() locked folio 0 successfully | | | | | | The filemap range [32K, 40K) got dropped from filemap. | | | | | |- lock_delalloc_folios() failed with -EAGAIN on folio 32K | | | As the folio at 32K is dropped. | | | | | |- loops = 1; | | |- max_bytes = PAGE_SIZE; | | |- goto again; | | | This will re-do the lookup for dirty delalloc ranges. | | | | | |- btrfs_find_delalloc_range() called with @max_bytes == 4K | | | This is smaller than block size, so | | | btrfs_find_delalloc_range() is unable to return any range. | | - return false; | | | - Now only range [0, 8K) has an OE for it, but for dirty range | [16K, 32K) it's dirty without an OE. | This breaks the assumption that writepage_delalloc() will find | and lock all dirty ranges inside the folio. | |- extent_writepage_io() |- submit_one_sector() for [0, 8K) | Succeeded | |- submit_one_sector() for [16K, 24K) Triggering the ASSERT(), as there is no OE, and the original extent map is a hole.
Please note that, this also exposed the same problem for bs < ps support. E.g. with 64K page size and 4K block size.
If we failed to lock a folio, and falls back into the "loops = 1;" branch, we will re-do the search using 64K as max_bytes. Which may fail again to lock the next folio, and exit early without handling all dirty blocks inside the folio.
[FIX] Instead of using the fixed size PAGE_SIZE as @max_bytes, use @sectorsize, so that we are ensured to find and lock any remaining blocks inside the folio.
And since we're here, add an extra ASSERT() to before calling btrfs_find_delalloc_range() to make sure the @max_bytes is at least no smaller than a block to avoid false negative.
Cc: stable@vger.kernel.org # 5.15+ Signed-off-by: Qu Wenruo wqu@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/extent_io.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-)
--- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -355,6 +355,13 @@ again: /* step one, find a bunch of delalloc bytes starting at start */ delalloc_start = *start; delalloc_end = 0; + + /* + * If @max_bytes is smaller than a block, btrfs_find_delalloc_range() can + * return early without handling any dirty ranges. + */ + ASSERT(max_bytes >= fs_info->sectorsize); + found = btrfs_find_delalloc_range(tree, &delalloc_start, &delalloc_end, max_bytes, &cached_state); if (!found || delalloc_end <= *start || delalloc_start > orig_end) { @@ -385,13 +392,14 @@ again: delalloc_end); ASSERT(!ret || ret == -EAGAIN); if (ret == -EAGAIN) { - /* some of the folios are gone, lets avoid looping by - * shortening the size of the delalloc range we're searching + /* + * Some of the folios are gone, lets avoid looping by + * shortening the size of the delalloc range we're searching. */ free_extent_state(cached_state); cached_state = NULL; if (!loops) { - max_bytes = PAGE_SIZE; + max_bytes = fs_info->sectorsize; loops = 1; goto again; } else {
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: "Rafael J. Wysocki" rafael.j.wysocki@intel.com
[ Upstream commit f97aef092e199c10a3da96ae79b571edd5362faa ]
Commit a755d0e2d41b ("cpufreq: Honour transition_latency over transition_delay_us") caused platforms where cpuinfo.transition_latency is CPUFREQ_ETERNAL to get a very large transition latency whereas previously it had been capped at 10 ms (and later at 2 ms).
This led to a user-observable regression between 6.6 and 6.12 as described by Shawn:
"The dbs sampling_rate was 10000 us on 6.6 and suddently becomes 6442450 us (4294967295 / 1000 * 1.5) on 6.12 for these platforms because the default transition delay was dropped [...].
It slows down dbs governor's reacting to CPU loading change dramatically. Also, as transition_delay_us is used by schedutil governor as rate_limit_us, it shows a negative impact on device idle power consumption, because the device gets slightly less time in the lowest OPP."
Evidently, the expectation of the drivers using CPUFREQ_ETERNAL as cpuinfo.transition_latency was that it would be capped by the core, but they may as well return a default transition latency value instead of CPUFREQ_ETERNAL and the core need not do anything with it.
Accordingly, introduce CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS and make all of the drivers in question use it instead of CPUFREQ_ETERNAL. Also update the related Rust binding.
Fixes: a755d0e2d41b ("cpufreq: Honour transition_latency over transition_delay_us") Closes: https://lore.kernel.org/linux-pm/20250922125929.453444-1-shawnguo2@yeah.net/ Reported-by: Shawn Guo shawnguo@kernel.org Reviewed-by: Mario Limonciello (AMD) superm1@kernel.org Reviewed-by: Jie Zhan zhanjie9@hisilicon.com Acked-by: Viresh Kumar viresh.kumar@linaro.org Cc: 6.6+ stable@vger.kernel.org # 6.6+ Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Link: https://patch.msgid.link/2264949.irdbgypaU6@rafael.j.wysocki [ rjw: Fix typo in new symbol name, drop redundant type cast from Rust binding ] Tested-by: Shawn Guo shawnguo@kernel.org # with cpufreq-dt driver Reviewed-by: Qais Yousef qyousef@layalina.io Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com [ omitted Rust changes ] Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/cpufreq/cpufreq-dt.c | 2 +- drivers/cpufreq/imx6q-cpufreq.c | 2 +- drivers/cpufreq/mediatek-cpufreq-hw.c | 2 +- drivers/cpufreq/scmi-cpufreq.c | 2 +- drivers/cpufreq/scpi-cpufreq.c | 2 +- drivers/cpufreq/spear-cpufreq.c | 2 +- include/linux/cpufreq.h | 3 +++ 7 files changed, 9 insertions(+), 6 deletions(-)
--- a/drivers/cpufreq/cpufreq-dt.c +++ b/drivers/cpufreq/cpufreq-dt.c @@ -110,7 +110,7 @@ static int cpufreq_init(struct cpufreq_p
transition_latency = dev_pm_opp_get_max_transition_latency(cpu_dev); if (!transition_latency) - transition_latency = CPUFREQ_ETERNAL; + transition_latency = CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS;
cpumask_copy(policy->cpus, priv->cpus); policy->driver_data = priv; --- a/drivers/cpufreq/imx6q-cpufreq.c +++ b/drivers/cpufreq/imx6q-cpufreq.c @@ -443,7 +443,7 @@ soc_opp_out: }
if (of_property_read_u32(np, "clock-latency", &transition_latency)) - transition_latency = CPUFREQ_ETERNAL; + transition_latency = CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS;
/* * Calculate the ramp time for max voltage change in the --- a/drivers/cpufreq/mediatek-cpufreq-hw.c +++ b/drivers/cpufreq/mediatek-cpufreq-hw.c @@ -238,7 +238,7 @@ static int mtk_cpufreq_hw_cpu_init(struc
latency = readl_relaxed(data->reg_bases[REG_FREQ_LATENCY]) * 1000; if (!latency) - latency = CPUFREQ_ETERNAL; + latency = CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS;
policy->cpuinfo.transition_latency = latency; policy->fast_switch_possible = true; --- a/drivers/cpufreq/scmi-cpufreq.c +++ b/drivers/cpufreq/scmi-cpufreq.c @@ -280,7 +280,7 @@ static int scmi_cpufreq_init(struct cpuf
latency = perf_ops->transition_latency_get(ph, domain); if (!latency) - latency = CPUFREQ_ETERNAL; + latency = CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS;
policy->cpuinfo.transition_latency = latency;
--- a/drivers/cpufreq/scpi-cpufreq.c +++ b/drivers/cpufreq/scpi-cpufreq.c @@ -157,7 +157,7 @@ static int scpi_cpufreq_init(struct cpuf
latency = scpi_ops->get_transition_latency(cpu_dev); if (!latency) - latency = CPUFREQ_ETERNAL; + latency = CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS;
policy->cpuinfo.transition_latency = latency;
--- a/drivers/cpufreq/spear-cpufreq.c +++ b/drivers/cpufreq/spear-cpufreq.c @@ -183,7 +183,7 @@ static int spear_cpufreq_probe(struct pl
if (of_property_read_u32(np, "clock-latency", &spear_cpufreq.transition_latency)) - spear_cpufreq.transition_latency = CPUFREQ_ETERNAL; + spear_cpufreq.transition_latency = CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS;
cnt = of_property_count_u32_elems(np, "cpufreq_tbl"); if (cnt <= 0) { --- a/include/linux/cpufreq.h +++ b/include/linux/cpufreq.h @@ -32,6 +32,9 @@ */
#define CPUFREQ_ETERNAL (-1) + +#define CPUFREQ_DEFAULT_TRANSITION_LATENCY_NS NSEC_PER_MSEC + #define CPUFREQ_NAME_LEN 16 /* Print length for names. Extra 1 space for accommodating '\n' in prints */ #define CPUFREQ_NAME_PLEN (CPUFREQ_NAME_LEN + 1)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: "Borislav Petkov (AMD)" bp@alien8.de
[ Upstream commit 716f86b523d8ec3c17015ee0b03135c7aa6f2f08 ]
SRSO_USER_KERNEL_NO denotes whether the CPU is affected by SRSO across user/kernel boundaries. Advertise it to guest userspace.
Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Reviewed-by: Nikolay Borisov nik.borisov@suse.com Link: https://lore.kernel.org/r/20241202120416.6054-3-bp@kernel.org Signed-off-by: Harshit Mogalapalli harshit.m.mogalapalli@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Boris Ostrovsky suggested that we backport this commit to 6.12.y as we have commit: 6f0f23ef76be ("KVM: x86: Add IBPB_BRTYPE support") in 6.12.y
Hi borislav: Can you please ACK before stable maintainers pick this ? Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kvm/cpuid.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -816,7 +816,7 @@ void kvm_set_cpu_caps(void) F(NO_NESTED_DATA_BP) | F(LFENCE_RDTSC) | 0 /* SmmPgCfgLock */ | F(VERW_CLEAR) | F(NULL_SEL_CLR_BASE) | F(AUTOIBRS) | 0 /* PrefetchCtlMsr */ | - F(WRMSR_XX_BASE_NS) + F(WRMSR_XX_BASE_NS) | F(SRSO_USER_KERNEL_NO) );
kvm_cpu_cap_check_and_set(X86_FEATURE_SBPB);
Hi Greg,
On 17/10/25 20:24, Greg Kroah-Hartman wrote:
6.12-stable review patch. If anyone has any objections, please let me know.
From: "Borislav Petkov (AMD)" bp@alien8.de
[ Upstream commit 716f86b523d8ec3c17015ee0b03135c7aa6f2f08 ]
SRSO_USER_KERNEL_NO denotes whether the CPU is affected by SRSO across user/kernel boundaries. Advertise it to guest userspace.
Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Reviewed-by: Nikolay Borisov nik.borisov@suse.com Link: https://lore.kernel.org/r/20241202120416.6054-3-bp@kernel.org Signed-off-by: Harshit Mogalapalli harshit.m.mogalapalli@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Boris Ostrovsky suggested that we backport this commit to 6.12.y as we have commit: 6f0f23ef76be ("KVM: x86: Add IBPB_BRTYPE support") in 6.12.y
Hi borislav: Can you please ACK before stable maintainers pick this ? Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
^^^ I think we should skip this part after first ---
Also, I haven't yet got ACK from Borislav, so should we defer ?
Thanks, Harshit
arch/x86/kvm/cpuid.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -816,7 +816,7 @@ void kvm_set_cpu_caps(void) F(NO_NESTED_DATA_BP) | F(LFENCE_RDTSC) | 0 /* SmmPgCfgLock */ | F(VERW_CLEAR) | F(NULL_SEL_CLR_BASE) | F(AUTOIBRS) | 0 /* PrefetchCtlMsr */ |
F(WRMSR_XX_BASE_NS)
);F(WRMSR_XX_BASE_NS) | F(SRSO_USER_KERNEL_NO)
kvm_cpu_cap_check_and_set(X86_FEATURE_SBPB);
On Fri, Oct 17, 2025 at 10:11:36PM +0530, Harshit Mogalapalli wrote:
Also, I haven't yet got ACK from Borislav, so should we defer ?
I assume you're in much better position than me to verify those bits are actually exported and visible in the guest. So you don't really need my ACK.
Thx.
Hi Borislav,
On 17/10/25 22:23, Borislav Petkov wrote:
On Fri, Oct 17, 2025 at 10:11:36PM +0530, Harshit Mogalapalli wrote:
Also, I haven't yet got ACK from Borislav, so should we defer ?
I assume you're in much better position than me to verify those bits are actually exported and visible in the guest. So you don't really need my ACK.
Thanks for the reply!
Boris Ostrovsky verified it(thanks Boris for that), so we will go with taking this patch to 6.12.y.
Greg: Summary: Let's just trim the text after cut-off line "---" if possible.
Thanks, Harshit
Thx.
On Fri, Oct 17, 2025 at 10:36:50PM +0530, Harshit Mogalapalli wrote:
Hi Borislav,
On 17/10/25 22:23, Borislav Petkov wrote:
On Fri, Oct 17, 2025 at 10:11:36PM +0530, Harshit Mogalapalli wrote:
Also, I haven't yet got ACK from Borislav, so should we defer ?
I assume you're in much better position than me to verify those bits are actually exported and visible in the guest. So you don't really need my ACK.
Thanks for the reply!
Boris Ostrovsky verified it(thanks Boris for that), so we will go with taking this patch to 6.12.y.
Greg: Summary: Let's just trim the text after cut-off line "---" if possible.
Oops, yes, will go fix that up, thanks.
greg k-h
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christian Brauner brauner@kernel.org
[ Upstream commit e8c84e2082e69335f66c8ade4895e80ec270d7c4 ]
Massage statmount() and make sure we don't call path_put() under the namespace semaphore. If we put the last reference we're fscked.
Fixes: 46eae99ef733 ("add statmount(2) syscall") Cc: stable@vger.kernel.org # v6.8+ Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/namespace.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-)
--- a/fs/namespace.c +++ b/fs/namespace.c @@ -5200,7 +5200,6 @@ static int grab_requested_root(struct mn static int do_statmount(struct kstatmount *s, u64 mnt_id, u64 mnt_ns_id, struct mnt_namespace *ns) { - struct path root __free(path_put) = {}; struct mount *m; int err;
@@ -5212,7 +5211,7 @@ static int do_statmount(struct kstatmoun if (!s->mnt) return -ENOENT;
- err = grab_requested_root(ns, &root); + err = grab_requested_root(ns, &s->root); if (err) return err;
@@ -5221,15 +5220,13 @@ static int do_statmount(struct kstatmoun * mounts to show users. */ m = real_mount(s->mnt); - if (!is_path_reachable(m, m->mnt.mnt_root, &root) && + if (!is_path_reachable(m, m->mnt.mnt_root, &s->root) && !ns_capable_noaudit(ns->user_ns, CAP_SYS_ADMIN)) return -EPERM;
err = security_sb_statfs(s->mnt->mnt_root); if (err) return err; - - s->root = root; if (s->mask & STATMOUNT_SB_BASIC) statmount_sb_basic(s);
@@ -5406,6 +5403,7 @@ retry: if (!ret) ret = copy_statmount_to_user(ks); kvfree(ks->seq.buf); + path_put(&ks->root); if (retry_statmount(ret, &seq_size)) goto retry; return ret;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Catalin Marinas catalin.marinas@arm.com
[ Upstream commit f620d66af3165838bfa845dcf9f5f9b4089bf508 ]
Commit 68d54ceeec0e ("arm64: mte: Allow PTRACE_PEEKMTETAGS access to the zero page") attempted to fix ptrace() reading of tags from the zero page by marking it as PG_mte_tagged during cpu_enable_mte(). The same commit also changed the ptrace() tag access permission check to the VM_MTE vma flag while turning the page flag test into a WARN_ON_ONCE().
Attempting to set the PG_mte_tagged flag early with CONFIG_DEFERRED_STRUCT_PAGE_INIT enabled may either hang (after commit d77e59a8fccd "arm64: mte: Lock a page for MTE tag initialisation") or have the flags cleared later during page_alloc_init_late(). In addition, pages_identical() -> memcmp_pages() will reject any comparison with the zero page as it is marked as tagged.
Partially revert the above commit to avoid setting PG_mte_tagged on the zero page. Update the __access_remote_tags() warning on untagged pages to ignore the zero page since it is known to have the tags initialised.
Note that all user mapping of the zero page are marked as pte_special(). The arm64 set_pte_at() will not call mte_sync_tags() on such pages, so PG_mte_tagged will remain cleared.
Signed-off-by: Catalin Marinas catalin.marinas@arm.com Fixes: 68d54ceeec0e ("arm64: mte: Allow PTRACE_PEEKMTETAGS access to the zero page") Reported-by: Gergely Kovacs Gergely.Kovacs2@arm.com Cc: stable@vger.kernel.org # 5.10.x Cc: Will Deacon will@kernel.org Cc: David Hildenbrand david@redhat.com Cc: Lance Yang lance.yang@linux.dev Acked-by: Lance Yang lance.yang@linux.dev Reviewed-by: David Hildenbrand david@redhat.com Tested-by: Lance Yang lance.yang@linux.dev Signed-off-by: Will Deacon will@kernel.org [ Adjust context ] Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/kernel/cpufeature.c | 10 +++++++--- arch/arm64/kernel/mte.c | 3 ++- 2 files changed, 9 insertions(+), 4 deletions(-)
--- a/arch/arm64/kernel/cpufeature.c +++ b/arch/arm64/kernel/cpufeature.c @@ -2279,17 +2279,21 @@ static void bti_enable(const struct arm6 #ifdef CONFIG_ARM64_MTE static void cpu_enable_mte(struct arm64_cpu_capabilities const *cap) { + static bool cleared_zero_page = false; + sysreg_clear_set(sctlr_el1, 0, SCTLR_ELx_ATA | SCTLR_EL1_ATA0);
mte_cpu_setup();
/* * Clear the tags in the zero page. This needs to be done via the - * linear map which has the Tagged attribute. + * linear map which has the Tagged attribute. Since this page is + * always mapped as pte_special(), set_pte_at() will not attempt to + * clear the tags or set PG_mte_tagged. */ - if (try_page_mte_tagging(ZERO_PAGE(0))) { + if (!cleared_zero_page) { + cleared_zero_page = true; mte_clear_page_tags(lm_alias(empty_zero_page)); - set_page_mte_tagged(ZERO_PAGE(0)); }
kasan_init_hw_tags_cpu(); --- a/arch/arm64/kernel/mte.c +++ b/arch/arm64/kernel/mte.c @@ -428,7 +428,8 @@ static int __access_remote_tags(struct m put_page(page); break; } - WARN_ON_ONCE(!page_mte_tagged(page)); + + WARN_ON_ONCE(!page_mte_tagged(page) && !is_zero_page(page));
/* limit access to the end of the page */ offset = offset_in_page(addr);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: "Kirill A. Shutemov" kirill.shutemov@linux.intel.com
[ Upstream commit 6a5abeea9c72e1d2c538622b4cf66c80cc816fd3 ]
Rename the helper to better reflect its function.
Suggested-by: Dave Hansen dave.hansen@intel.com Signed-off-by: Kirill A. Shutemov kirill.shutemov@linux.intel.com Signed-off-by: Dave Hansen dave.hansen@linux.intel.com Acked-by: Dave Hansen dave.hansen@intel.com Link: https://lore.kernel.org/all/20241202073139.448208-1-kirill.shutemov%40linux.... Stable-dep-of: 0dccbc75e18d ("x86/kvm: Force legacy PCI hole to UC when overriding MTRRs for TDX/SNP") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/hyperv/ivm.c | 2 +- arch/x86/include/asm/mtrr.h | 10 +++++----- arch/x86/kernel/cpu/mtrr/generic.c | 6 +++--- arch/x86/kernel/cpu/mtrr/mtrr.c | 2 +- arch/x86/kernel/kvm.c | 2 +- arch/x86/xen/enlighten_pv.c | 4 ++-- 6 files changed, 13 insertions(+), 13 deletions(-)
--- a/arch/x86/hyperv/ivm.c +++ b/arch/x86/hyperv/ivm.c @@ -681,7 +681,7 @@ void __init hv_vtom_init(void) x86_platform.guest.enc_status_change_finish = hv_vtom_set_host_visibility;
/* Set WB as the default cache mode. */ - mtrr_overwrite_state(NULL, 0, MTRR_TYPE_WRBACK); + guest_force_mtrr_state(NULL, 0, MTRR_TYPE_WRBACK); }
#endif /* defined(CONFIG_AMD_MEM_ENCRYPT) || defined(CONFIG_INTEL_TDX_GUEST) */ --- a/arch/x86/include/asm/mtrr.h +++ b/arch/x86/include/asm/mtrr.h @@ -58,8 +58,8 @@ struct mtrr_state_type { */ # ifdef CONFIG_MTRR void mtrr_bp_init(void); -void mtrr_overwrite_state(struct mtrr_var_range *var, unsigned int num_var, - mtrr_type def_type); +void guest_force_mtrr_state(struct mtrr_var_range *var, unsigned int num_var, + mtrr_type def_type); extern u8 mtrr_type_lookup(u64 addr, u64 end, u8 *uniform); extern void mtrr_save_fixed_ranges(void *); extern void mtrr_save_state(void); @@ -75,9 +75,9 @@ void mtrr_disable(void); void mtrr_enable(void); void mtrr_generic_set_state(void); # else -static inline void mtrr_overwrite_state(struct mtrr_var_range *var, - unsigned int num_var, - mtrr_type def_type) +static inline void guest_force_mtrr_state(struct mtrr_var_range *var, + unsigned int num_var, + mtrr_type def_type) { }
--- a/arch/x86/kernel/cpu/mtrr/generic.c +++ b/arch/x86/kernel/cpu/mtrr/generic.c @@ -423,7 +423,7 @@ void __init mtrr_copy_map(void) }
/** - * mtrr_overwrite_state - set static MTRR state + * guest_force_mtrr_state - set static MTRR state for a guest * * Used to set MTRR state via different means (e.g. with data obtained from * a hypervisor). @@ -436,8 +436,8 @@ void __init mtrr_copy_map(void) * @num_var: length of the @var array * @def_type: default caching type */ -void mtrr_overwrite_state(struct mtrr_var_range *var, unsigned int num_var, - mtrr_type def_type) +void guest_force_mtrr_state(struct mtrr_var_range *var, unsigned int num_var, + mtrr_type def_type) { unsigned int i;
--- a/arch/x86/kernel/cpu/mtrr/mtrr.c +++ b/arch/x86/kernel/cpu/mtrr/mtrr.c @@ -625,7 +625,7 @@ void mtrr_save_state(void) static int __init mtrr_init_finalize(void) { /* - * Map might exist if mtrr_overwrite_state() has been called or if + * Map might exist if guest_force_mtrr_state() has been called or if * mtrr_enabled() returns true. */ mtrr_copy_map(); --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -983,7 +983,7 @@ static void __init kvm_init_platform(voi x86_platform.apic_post_init = kvm_apic_init;
/* Set WB as the default cache mode for SEV-SNP and TDX */ - mtrr_overwrite_state(NULL, 0, MTRR_TYPE_WRBACK); + guest_force_mtrr_state(NULL, 0, MTRR_TYPE_WRBACK); }
#if defined(CONFIG_AMD_MEM_ENCRYPT) --- a/arch/x86/xen/enlighten_pv.c +++ b/arch/x86/xen/enlighten_pv.c @@ -171,7 +171,7 @@ static void __init xen_set_mtrr_data(voi
/* Only overwrite MTRR state if any MTRR could be got from Xen. */ if (reg) - mtrr_overwrite_state(var, reg, MTRR_TYPE_UNCACHABLE); + guest_force_mtrr_state(var, reg, MTRR_TYPE_UNCACHABLE); #endif }
@@ -195,7 +195,7 @@ static void __init xen_pv_init_platform( if (xen_initial_domain()) xen_set_mtrr_data(); else - mtrr_overwrite_state(NULL, 0, MTRR_TYPE_WRBACK); + guest_force_mtrr_state(NULL, 0, MTRR_TYPE_WRBACK);
/* Adjust nr_cpu_ids before "enumeration" happens */ xen_smp_count_cpus();
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson seanjc@google.com
[ Upstream commit 0dccbc75e18df85399a71933d60b97494110f559 ]
When running as an SNP or TDX guest under KVM, force the legacy PCI hole, i.e. memory between Top of Lower Usable DRAM and 4GiB, to be mapped as UC via a forced variable MTRR range.
In most KVM-based setups, legacy devices such as the HPET and TPM are enumerated via ACPI. ACPI enumeration includes a Memory32Fixed entry, and optionally a SystemMemory descriptor for an OperationRegion, e.g. if the device needs to be accessed via a Control Method.
If a SystemMemory entry is present, then the kernel's ACPI driver will auto-ioremap the region so that it can be accessed at will. However, the ACPI spec doesn't provide a way to enumerate the memory type of SystemMemory regions, i.e. there's no way to tell software that a region must be mapped as UC vs. WB, etc. As a result, Linux's ACPI driver always maps SystemMemory regions using ioremap_cache(), i.e. as WB on x86.
The dedicated device drivers however, e.g. the HPET driver and TPM driver, want to map their associated memory as UC or WC, as accessing PCI devices using WB is unsupported.
On bare metal and non-CoCO, the conflicting requirements "work" as firmware configures the PCI hole (and other device memory) to be UC in the MTRRs. So even though the ACPI mappings request WB, they are forced to UC- in the kernel's tracking due to the kernel properly handling the MTRR overrides, and thus are compatible with the drivers' requested WC/UC-.
With force WB MTRRs on SNP and TDX guests, the ACPI mappings get their requested WB if the ACPI mappings are established before the dedicated driver code attempts to initialize the device. E.g. if acpi_init() runs before the corresponding device driver is probed, ACPI's WB mapping will "win", and result in the driver's ioremap() failing because the existing WB mapping isn't compatible with the requested WC/UC-.
E.g. when a TPM is emulated by the hypervisor (ignoring the security implications of relying on what is allegedly an untrusted entity to store measurements), the TPM driver will request UC and fail:
[ 1.730459] ioremap error for 0xfed40000-0xfed45000, requested 0x2, got 0x0 [ 1.732780] tpm_tis MSFT0101:00: probe with driver tpm_tis failed with error -12
Note, the '0x2' and '0x0' values refer to "enum page_cache_mode", not x86's memtypes (which frustratingly are an almost pure inversion; 2 == WB, 0 == UC). E.g. tracing mapping requests for TPM TIS yields:
Mapping TPM TIS with req_type = 0 WARNING: CPU: 22 PID: 1 at arch/x86/mm/pat/memtype.c:530 memtype_reserve+0x2ab/0x460 Modules linked in: CPU: 22 UID: 0 PID: 1 Comm: swapper/0 Tainted: G W 6.16.0-rc7+ #2 VOLUNTARY Tainted: [W]=WARN Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 05/29/2025 RIP: 0010:memtype_reserve+0x2ab/0x460 __ioremap_caller+0x16d/0x3d0 ioremap_cache+0x17/0x30 x86_acpi_os_ioremap+0xe/0x20 acpi_os_map_iomem+0x1f3/0x240 acpi_os_map_memory+0xe/0x20 acpi_ex_system_memory_space_handler+0x273/0x440 acpi_ev_address_space_dispatch+0x176/0x4c0 acpi_ex_access_region+0x2ad/0x530 acpi_ex_field_datum_io+0xa2/0x4f0 acpi_ex_extract_from_field+0x296/0x3e0 acpi_ex_read_data_from_field+0xd1/0x460 acpi_ex_resolve_node_to_value+0x2ee/0x530 acpi_ex_resolve_to_value+0x1f2/0x540 acpi_ds_evaluate_name_path+0x11b/0x190 acpi_ds_exec_end_op+0x456/0x960 acpi_ps_parse_loop+0x27a/0xa50 acpi_ps_parse_aml+0x226/0x600 acpi_ps_execute_method+0x172/0x3e0 acpi_ns_evaluate+0x175/0x5f0 acpi_evaluate_object+0x213/0x490 acpi_evaluate_integer+0x6d/0x140 acpi_bus_get_status+0x93/0x150 acpi_add_single_object+0x43a/0x7c0 acpi_bus_check_add+0x149/0x3a0 acpi_bus_check_add_1+0x16/0x30 acpi_ns_walk_namespace+0x22c/0x360 acpi_walk_namespace+0x15c/0x170 acpi_bus_scan+0x1dd/0x200 acpi_scan_init+0xe5/0x2b0 acpi_init+0x264/0x5b0 do_one_initcall+0x5a/0x310 kernel_init_freeable+0x34f/0x4f0 kernel_init+0x1b/0x200 ret_from_fork+0x186/0x1b0 ret_from_fork_asm+0x1a/0x30 </TASK>
The above traces are from a Google-VMM based VM, but the same behavior happens with a QEMU based VM that is modified to add a SystemMemory range for the TPM TIS address space.
The only reason this doesn't cause problems for HPET, which appears to require a SystemMemory region, is because HPET gets special treatment via x86_init.timers.timer_init(), and so gets a chance to create its UC- mapping before acpi_init() clobbers things. Disabling the early call to hpet_time_init() yields the same behavior for HPET:
[ 0.318264] ioremap error for 0xfed00000-0xfed01000, requested 0x2, got 0x0
Hack around the ACPI gap by forcing the legacy PCI hole to UC when overriding the (virtual) MTRRs for CoCo guest, so that ioremap handling of MTRRs naturally kicks in and forces the ACPI mappings to be UC.
Note, the requested/mapped memtype doesn't actually matter in terms of accessing the device. In practically every setup, legacy PCI devices are emulated by the hypervisor, and accesses are intercepted and handled as emulated MMIO, i.e. never access physical memory and thus don't have an effective memtype.
Even in a theoretical setup where such devices are passed through by the host, i.e. point at real MMIO memory, it is KVM's (as the hypervisor) responsibility to force the memory to be WC/UC, e.g. via EPT memtype under TDX or real hardware MTRRs under SNP. Not doing so cannot work, and the hypervisor is highly motivated to do the right thing as letting the guest access hardware MMIO with WB would likely result in a variety of fatal #MCs.
In other words, forcing the range to be UC is all about coercing the kernel's tracking into thinking that it has established UC mappings, so that the ioremap code doesn't reject mappings from e.g. the TPM driver and thus prevent the driver from loading and the device from functioning.
Note #2, relying on guest firmware to handle this scenario, e.g. by setting virtual MTRRs and then consuming them in Linux, is not a viable option, as the virtual MTRR state is managed by the untrusted hypervisor, and because OVMF at least has stopped programming virtual MTRRs when running as a TDX guest.
Link: https://lore.kernel.org/all/8137d98e-8825-415b-9282-1d2a115bb51a@linux.intel... Fixes: 8e690b817e38 ("x86/kvm: Override default caching mode for SEV-SNP and TDX") Cc: stable@vger.kernel.org Cc: Peter Gonda pgonda@google.com Cc: Vitaly Kuznetsov vkuznets@redhat.com Cc: Tom Lendacky thomas.lendacky@amd.com Cc: Jürgen Groß jgross@suse.com Cc: Korakit Seemakhupt korakit@google.com Cc: Jianxiong Gao jxgao@google.com Cc: Nikolay Borisov nik.borisov@suse.com Suggested-by: Binbin Wu binbin.wu@linux.intel.com Reviewed-by: Binbin Wu binbin.wu@linux.intel.com Tested-by: Korakit Seemakhupt korakit@google.com Link: https://lore.kernel.org/r/20250828005249.39339-1-seanjc@google.com Signed-off-by: Sean Christopherson seanjc@google.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/kvm.c | 21 +++++++++++++++++++-- 1 file changed, 19 insertions(+), 2 deletions(-)
--- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -933,6 +933,19 @@ static void kvm_sev_hc_page_enc_status(u
static void __init kvm_init_platform(void) { + u64 tolud = PFN_PHYS(e820__end_of_low_ram_pfn()); + /* + * Note, hardware requires variable MTRR ranges to be power-of-2 sized + * and naturally aligned. But when forcing guest MTRR state, Linux + * doesn't program the forced ranges into hardware. Don't bother doing + * the math to generate a technically-legal range. + */ + struct mtrr_var_range pci_hole = { + .base_lo = tolud | X86_MEMTYPE_UC, + .mask_lo = (u32)(~(SZ_4G - tolud - 1)) | MTRR_PHYSMASK_V, + .mask_hi = (BIT_ULL(boot_cpu_data.x86_phys_bits) - 1) >> 32, + }; + if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT) && kvm_para_has_feature(KVM_FEATURE_MIGRATION_CONTROL)) { unsigned long nr_pages; @@ -982,8 +995,12 @@ static void __init kvm_init_platform(voi kvmclock_init(); x86_platform.apic_post_init = kvm_apic_init;
- /* Set WB as the default cache mode for SEV-SNP and TDX */ - guest_force_mtrr_state(NULL, 0, MTRR_TYPE_WRBACK); + /* + * Set WB as the default cache mode for SEV-SNP and TDX, with a single + * UC range for the legacy PCI hole, e.g. so that devices that expect + * to get UC/WC mappings don't get surprised with WB. + */ + guest_force_mtrr_state(&pci_hole, 1, MTRR_TYPE_WRBACK); }
#if defined(CONFIG_AMD_MEM_ENCRYPT)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pali Rohár pali@kernel.org
[ Upstream commit bb4f07f2409c26c01e97e6f9b432545f353e3b66 ]
Currently NFSD_MAY_BYPASS_GSS and NFSD_MAY_BYPASS_GSS_ON_ROOT do not bypass only GSS, but bypass any method. This is a problem specially for NFS3 AUTH_NULL-only exports.
The purpose of NFSD_MAY_BYPASS_GSS_ON_ROOT is described in RFC 2623, section 2.3.2, to allow mounting NFS2/3 GSS-only export without authentication. So few procedures which do not expose security risk used during mount time can be called also with AUTH_NONE or AUTH_SYS, to allow client mount operation to finish successfully.
The problem with current implementation is that for AUTH_NULL-only exports, the NFSD_MAY_BYPASS_GSS_ON_ROOT is active also for NFS3 AUTH_UNIX mount attempts which confuse NFS3 clients, and make them think that AUTH_UNIX is enabled and is working. Linux NFS3 client never switches from AUTH_UNIX to AUTH_NONE on active mount, which makes the mount inaccessible.
Fix the NFSD_MAY_BYPASS_GSS and NFSD_MAY_BYPASS_GSS_ON_ROOT implementation and really allow to bypass only exports which have enabled some real authentication (GSS, TLS, or any other).
The result would be: For AUTH_NULL-only export if client attempts to do mount with AUTH_UNIX flavor then it will receive access errors, which instruct client that AUTH_UNIX flavor is not usable and will either try other auth flavor (AUTH_NULL if enabled) or fails mount procedure. Similarly if client attempt to do mount with AUTH_NULL flavor and only AUTH_UNIX flavor is enabled then the client will receive access error.
This should fix problems with AUTH_NULL-only or AUTH_UNIX-only exports if client attempts to mount it with other auth flavor (e.g. with AUTH_NULL for AUTH_UNIX-only export, or with AUTH_UNIX for AUTH_NULL-only export).
Signed-off-by: Pali Rohár pali@kernel.org Reviewed-by: NeilBrown neilb@suse.de Signed-off-by: Chuck Lever chuck.lever@oracle.com Stable-dep-of: 898374fdd7f0 ("nfsd: unregister with rpcbind when deleting a transport") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfsd/export.c | 21 ++++++++++++++++++++- fs/nfsd/export.h | 3 ++- fs/nfsd/nfs4proc.c | 2 +- fs/nfsd/nfs4xdr.c | 2 +- fs/nfsd/nfsfh.c | 9 ++++++--- fs/nfsd/vfs.c | 2 +- 6 files changed, 31 insertions(+), 8 deletions(-)
--- a/fs/nfsd/export.c +++ b/fs/nfsd/export.c @@ -1078,12 +1078,14 @@ static struct svc_export *exp_find(struc * check_nfsd_access - check if access to export is allowed. * @exp: svc_export that is being accessed. * @rqstp: svc_rqst attempting to access @exp (will be NULL for LOCALIO). + * @may_bypass_gss: reduce strictness of authorization check * * Return values: * %nfs_ok if access is granted, or * %nfserr_wrongsec if access is denied */ -__be32 check_nfsd_access(struct svc_export *exp, struct svc_rqst *rqstp) +__be32 check_nfsd_access(struct svc_export *exp, struct svc_rqst *rqstp, + bool may_bypass_gss) { struct exp_flavor_info *f, *end = exp->ex_flavors + exp->ex_nflavors; struct svc_xprt *xprt; @@ -1140,6 +1142,23 @@ ok: if (nfsd4_spo_must_allow(rqstp)) return nfs_ok;
+ /* Some calls may be processed without authentication + * on GSS exports. For example NFS2/3 calls on root + * directory, see section 2.3.2 of rfc 2623. + * For "may_bypass_gss" check that export has really + * enabled some flavor with authentication (GSS or any + * other) and also check that the used auth flavor is + * without authentication (none or sys). + */ + if (may_bypass_gss && ( + rqstp->rq_cred.cr_flavor == RPC_AUTH_NULL || + rqstp->rq_cred.cr_flavor == RPC_AUTH_UNIX)) { + for (f = exp->ex_flavors; f < end; f++) { + if (f->pseudoflavor >= RPC_AUTH_DES) + return 0; + } + } + denied: return nfserr_wrongsec; } --- a/fs/nfsd/export.h +++ b/fs/nfsd/export.h @@ -101,7 +101,8 @@ struct svc_expkey {
struct svc_cred; int nfsexp_flags(struct svc_cred *cred, struct svc_export *exp); -__be32 check_nfsd_access(struct svc_export *exp, struct svc_rqst *rqstp); +__be32 check_nfsd_access(struct svc_export *exp, struct svc_rqst *rqstp, + bool may_bypass_gss);
/* * Function declarations --- a/fs/nfsd/nfs4proc.c +++ b/fs/nfsd/nfs4proc.c @@ -2799,7 +2799,7 @@ nfsd4_proc_compound(struct svc_rqst *rqs
if (current_fh->fh_export && need_wrongsec_check(rqstp)) - op->status = check_nfsd_access(current_fh->fh_export, rqstp); + op->status = check_nfsd_access(current_fh->fh_export, rqstp, false); } encode_op: if (op->status == nfserr_replay_me) { --- a/fs/nfsd/nfs4xdr.c +++ b/fs/nfsd/nfs4xdr.c @@ -3784,7 +3784,7 @@ nfsd4_encode_entry4_fattr(struct nfsd4_r nfserr = nfserrno(err); goto out_put; } - nfserr = check_nfsd_access(exp, cd->rd_rqstp); + nfserr = check_nfsd_access(exp, cd->rd_rqstp, false); if (nfserr) goto out_put;
--- a/fs/nfsd/nfsfh.c +++ b/fs/nfsd/nfsfh.c @@ -320,6 +320,7 @@ __fh_verify(struct svc_rqst *rqstp, { struct nfsd_net *nn = net_generic(net, nfsd_net_id); struct svc_export *exp = NULL; + bool may_bypass_gss = false; struct dentry *dentry; __be32 error;
@@ -367,8 +368,10 @@ __fh_verify(struct svc_rqst *rqstp, * which clients virtually always use auth_sys for, * even while using RPCSEC_GSS for NFS. */ - if (access & NFSD_MAY_LOCK || access & NFSD_MAY_BYPASS_GSS) + if (access & NFSD_MAY_LOCK) goto skip_pseudoflavor_check; + if (access & NFSD_MAY_BYPASS_GSS) + may_bypass_gss = true; /* * Clients may expect to be able to use auth_sys during mount, * even if they use gss for everything else; see section 2.3.2 @@ -376,9 +379,9 @@ __fh_verify(struct svc_rqst *rqstp, */ if (access & NFSD_MAY_BYPASS_GSS_ON_ROOT && exp->ex_path.dentry == dentry) - goto skip_pseudoflavor_check; + may_bypass_gss = true;
- error = check_nfsd_access(exp, rqstp); + error = check_nfsd_access(exp, rqstp, may_bypass_gss); if (error) goto out;
--- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -321,7 +321,7 @@ nfsd_lookup(struct svc_rqst *rqstp, stru err = nfsd_lookup_dentry(rqstp, fhp, name, len, &exp, &dentry); if (err) return err; - err = check_nfsd_access(exp, rqstp); + err = check_nfsd_access(exp, rqstp, false); if (err) goto out; /*
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chuck Lever chuck.lever@oracle.com
[ Upstream commit 6640556b0c80edc66d6f50abe53f00311a873536 ]
NFSv4 LOCK operations should not avoid the set of authorization checks that apply to all other NFSv4 operations. Also, the "no_auth_nlm" export option should apply only to NLM LOCK requests. It's not necessary or sensible to apply it to NFSv4 LOCK operations.
Instead, set no permission bits when calling fh_verify(). Subsequent stateid processing handles authorization checks.
Reported-by: NeilBrown neilb@suse.de Signed-off-by: Chuck Lever chuck.lever@oracle.com Stable-dep-of: 898374fdd7f0 ("nfsd: unregister with rpcbind when deleting a transport") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfsd/nfs4state.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
--- a/fs/nfsd/nfs4state.c +++ b/fs/nfsd/nfs4state.c @@ -7998,11 +7998,9 @@ nfsd4_lock(struct svc_rqst *rqstp, struc if (check_lock_length(lock->lk_offset, lock->lk_length)) return nfserr_inval;
- if ((status = fh_verify(rqstp, &cstate->current_fh, - S_IFREG, NFSD_MAY_LOCK))) { - dprintk("NFSD: nfsd4_lock: permission denied!\n"); + status = fh_verify(rqstp, &cstate->current_fh, S_IFREG, 0); + if (status != nfs_ok) return status; - } sb = cstate->current_fh.fh_dentry->d_sb;
if (lock->lk_is_new) {
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: NeilBrown neilb@suse.de
[ Upstream commit 4cc9b9f2bf4dfe13fe573da978e626e2248df388 ]
NFSD_MAY_LOCK means a few different things. - it means that GSS is not required. - it means that with NFSEXP_NOAUTHNLM, authentication is not required - it means that OWNER_OVERRIDE is allowed.
None of these are specific to locking, they are specific to the NLM protocol. So: - rename to NFSD_MAY_NLM - set NFSD_MAY_OWNER_OVERRIDE and NFSD_MAY_BYPASS_GSS in nlm_fopen() so that NFSD_MAY_NLM doesn't need to imply these. - move the test on NFSEXP_NOAUTHNLM out of nfsd_permission() and into fh_verify where other special-case tests on the MAY flags happen. nfsd_permission() can be called from other places than fh_verify(), but none of these will have NFSD_MAY_NLM.
Signed-off-by: NeilBrown neilb@suse.de Signed-off-by: Chuck Lever chuck.lever@oracle.com Stable-dep-of: 898374fdd7f0 ("nfsd: unregister with rpcbind when deleting a transport") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfsd/lockd.c | 13 +++++++++++-- fs/nfsd/nfsfh.c | 12 ++++-------- fs/nfsd/trace.h | 2 +- fs/nfsd/vfs.c | 12 +----------- fs/nfsd/vfs.h | 2 +- 5 files changed, 18 insertions(+), 23 deletions(-)
--- a/fs/nfsd/lockd.c +++ b/fs/nfsd/lockd.c @@ -38,11 +38,20 @@ nlm_fopen(struct svc_rqst *rqstp, struct memcpy(&fh.fh_handle.fh_raw, f->data, f->size); fh.fh_export = NULL;
+ /* + * Allow BYPASS_GSS as some client implementations use AUTH_SYS + * for NLM even when GSS is used for NFS. + * Allow OWNER_OVERRIDE as permission might have been changed + * after the file was opened. + * Pass MAY_NLM so that authentication can be completely bypassed + * if NFSEXP_NOAUTHNLM is set. Some older clients use AUTH_NULL + * for NLM requests. + */ access = (mode == O_WRONLY) ? NFSD_MAY_WRITE : NFSD_MAY_READ; - access |= NFSD_MAY_LOCK; + access |= NFSD_MAY_NLM | NFSD_MAY_OWNER_OVERRIDE | NFSD_MAY_BYPASS_GSS; nfserr = nfsd_open(rqstp, &fh, S_IFREG, access, filp); fh_put(&fh); - /* We return nlm error codes as nlm doesn't know + /* We return nlm error codes as nlm doesn't know * about nfsd, but nfsd does know about nlm.. */ switch (nfserr) { --- a/fs/nfsd/nfsfh.c +++ b/fs/nfsd/nfsfh.c @@ -363,13 +363,10 @@ __fh_verify(struct svc_rqst *rqstp, if (error) goto out;
- /* - * pseudoflavor restrictions are not enforced on NLM, - * which clients virtually always use auth_sys for, - * even while using RPCSEC_GSS for NFS. - */ - if (access & NFSD_MAY_LOCK) - goto skip_pseudoflavor_check; + if ((access & NFSD_MAY_NLM) && (exp->ex_flags & NFSEXP_NOAUTHNLM)) + /* NLM is allowed to fully bypass authentication */ + goto out; + if (access & NFSD_MAY_BYPASS_GSS) may_bypass_gss = true; /* @@ -385,7 +382,6 @@ __fh_verify(struct svc_rqst *rqstp, if (error) goto out;
-skip_pseudoflavor_check: /* Finally, check access permissions. */ error = nfsd_permission(cred, exp, dentry, access); out: --- a/fs/nfsd/trace.h +++ b/fs/nfsd/trace.h @@ -79,7 +79,7 @@ DEFINE_NFSD_XDR_ERR_EVENT(cant_encode); { NFSD_MAY_READ, "READ" }, \ { NFSD_MAY_SATTR, "SATTR" }, \ { NFSD_MAY_TRUNC, "TRUNC" }, \ - { NFSD_MAY_LOCK, "LOCK" }, \ + { NFSD_MAY_NLM, "NLM" }, \ { NFSD_MAY_OWNER_OVERRIDE, "OWNER_OVERRIDE" }, \ { NFSD_MAY_LOCAL_ACCESS, "LOCAL_ACCESS" }, \ { NFSD_MAY_BYPASS_GSS_ON_ROOT, "BYPASS_GSS_ON_ROOT" }, \ --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -2519,7 +2519,7 @@ nfsd_permission(struct svc_cred *cred, s (acc & NFSD_MAY_EXEC)? " exec" : "", (acc & NFSD_MAY_SATTR)? " sattr" : "", (acc & NFSD_MAY_TRUNC)? " trunc" : "", - (acc & NFSD_MAY_LOCK)? " lock" : "", + (acc & NFSD_MAY_NLM)? " nlm" : "", (acc & NFSD_MAY_OWNER_OVERRIDE)? " owneroverride" : "", inode->i_mode, IS_IMMUTABLE(inode)? " immut" : "", @@ -2544,16 +2544,6 @@ nfsd_permission(struct svc_cred *cred, s if ((acc & NFSD_MAY_TRUNC) && IS_APPEND(inode)) return nfserr_perm;
- if (acc & NFSD_MAY_LOCK) { - /* If we cannot rely on authentication in NLM requests, - * just allow locks, otherwise require read permission, or - * ownership - */ - if (exp->ex_flags & NFSEXP_NOAUTHNLM) - return 0; - else - acc = NFSD_MAY_READ | NFSD_MAY_OWNER_OVERRIDE; - } /* * The file owner always gets access permission for accesses that * would normally be checked at open time. This is to make --- a/fs/nfsd/vfs.h +++ b/fs/nfsd/vfs.h @@ -20,7 +20,7 @@ #define NFSD_MAY_READ 0x004 /* == MAY_READ */ #define NFSD_MAY_SATTR 0x008 #define NFSD_MAY_TRUNC 0x010 -#define NFSD_MAY_LOCK 0x020 +#define NFSD_MAY_NLM 0x020 /* request is from lockd */ #define NFSD_MAY_MASK 0x03f
/* extra hints to permission and open routines: */
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: NeilBrown neilb@suse.de
[ Upstream commit eccbbc7c00a5aae5e704d4002adfaf4c3fa4b30d ]
The heuristic for limiting the number of incoming connections to nfsd currently uses sv_nrthreads - allowing more connections if more threads were configured.
A future patch will allow number of threads to grow dynamically so that there will be no need to configure sv_nrthreads. So we need a different solution for limiting connections.
It isn't clear what problem is solved by limiting connections (as mentioned in a code comment) but the most likely problem is a connection storm - many connections that are not doing productive work. These will be closed after about 6 minutes already but it might help to slow down a storm.
This patch adds a per-connection flag XPT_PEER_VALID which indicates that the peer has presented a filehandle for which it has some sort of access. i.e the peer is known to be trusted in some way. We now only count connections which have NOT been determined to be valid. There should be relative few of these at any given time.
If the number of non-validated peer exceed a limit - currently 64 - we close the oldest non-validated peer to avoid having too many of these useless connections.
Note that this patch significantly changes the meaning of the various configuration parameters for "max connections". The next patch will remove all of these.
Signed-off-by: NeilBrown neilb@suse.de Reviewed-by: Jeff Layton jlayton@kernel.org Signed-off-by: Chuck Lever chuck.lever@oracle.com Stable-dep-of: 898374fdd7f0 ("nfsd: unregister with rpcbind when deleting a transport") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfs/callback.c | 4 ---- fs/nfs/callback_xdr.c | 1 + fs/nfsd/netns.h | 4 ++-- fs/nfsd/nfsfh.c | 2 ++ include/linux/sunrpc/svc.h | 2 +- include/linux/sunrpc/svc_xprt.h | 16 ++++++++++++++++ net/sunrpc/svc_xprt.c | 32 ++++++++++++++++---------------- 7 files changed, 38 insertions(+), 23 deletions(-)
--- a/fs/nfs/callback.c +++ b/fs/nfs/callback.c @@ -211,10 +211,6 @@ static struct svc_serv *nfs_callback_cre return ERR_PTR(-ENOMEM); } cb_info->serv = serv; - /* As there is only one thread we need to over-ride the - * default maximum of 80 connections - */ - serv->sv_maxconn = 1024; dprintk("nfs_callback_create_svc: service created\n"); return serv; } --- a/fs/nfs/callback_xdr.c +++ b/fs/nfs/callback_xdr.c @@ -984,6 +984,7 @@ static __be32 nfs4_callback_compound(str nfs_put_client(cps.clp); goto out_invalidcred; } + svc_xprt_set_valid(rqstp->rq_xprt); }
cps.minorversion = hdr_arg.minorversion; --- a/fs/nfsd/netns.h +++ b/fs/nfsd/netns.h @@ -129,8 +129,8 @@ struct nfsd_net { unsigned char writeverf[8];
/* - * Max number of connections this nfsd container will allow. Defaults - * to '0' which is means that it bases this on the number of threads. + * Max number of non-validated connections this nfsd container + * will allow. Defaults to '0' gets mapped to 64. */ unsigned int max_connections;
--- a/fs/nfsd/nfsfh.c +++ b/fs/nfsd/nfsfh.c @@ -382,6 +382,8 @@ __fh_verify(struct svc_rqst *rqstp, if (error) goto out;
+ svc_xprt_set_valid(rqstp->rq_xprt); + /* Finally, check access permissions. */ error = nfsd_permission(cred, exp, dentry, access); out: --- a/include/linux/sunrpc/svc.h +++ b/include/linux/sunrpc/svc.h @@ -81,7 +81,7 @@ struct svc_serv { unsigned int sv_xdrsize; /* XDR buffer size */ struct list_head sv_permsocks; /* all permanent sockets */ struct list_head sv_tempsocks; /* all temporary sockets */ - int sv_tmpcnt; /* count of temporary sockets */ + int sv_tmpcnt; /* count of temporary "valid" sockets */ struct timer_list sv_temptimer; /* timer for aging temporary sockets */
char * sv_name; /* service name */ --- a/include/linux/sunrpc/svc_xprt.h +++ b/include/linux/sunrpc/svc_xprt.h @@ -99,8 +99,24 @@ enum { XPT_HANDSHAKE, /* xprt requests a handshake */ XPT_TLS_SESSION, /* transport-layer security established */ XPT_PEER_AUTH, /* peer has been authenticated */ + XPT_PEER_VALID, /* peer has presented a filehandle that + * it has access to. It is NOT counted + * in ->sv_tmpcnt. + */ };
+static inline void svc_xprt_set_valid(struct svc_xprt *xpt) +{ + if (test_bit(XPT_TEMP, &xpt->xpt_flags) && + !test_and_set_bit(XPT_PEER_VALID, &xpt->xpt_flags)) { + struct svc_serv *serv = xpt->xpt_server; + + spin_lock(&serv->sv_lock); + serv->sv_tmpcnt -= 1; + spin_unlock(&serv->sv_lock); + } +} + static inline void unregister_xpt_user(struct svc_xprt *xpt, struct svc_xpt_user *u) { spin_lock(&xpt->xpt_lock); --- a/net/sunrpc/svc_xprt.c +++ b/net/sunrpc/svc_xprt.c @@ -606,7 +606,8 @@ int svc_port_is_privileged(struct sockad }
/* - * Make sure that we don't have too many active connections. If we have, + * Make sure that we don't have too many connections that have not yet + * demonstrated that they have access to the NFS server. If we have, * something must be dropped. It's not clear what will happen if we allow * "too many" connections, but when dealing with network-facing software, * we have to code defensively. Here we do that by imposing hard limits. @@ -625,27 +626,25 @@ int svc_port_is_privileged(struct sockad */ static void svc_check_conn_limits(struct svc_serv *serv) { - unsigned int limit = serv->sv_maxconn ? serv->sv_maxconn : - (serv->sv_nrthreads+3) * 20; + unsigned int limit = serv->sv_maxconn ? serv->sv_maxconn : 64;
if (serv->sv_tmpcnt > limit) { - struct svc_xprt *xprt = NULL; + struct svc_xprt *xprt = NULL, *xprti; spin_lock_bh(&serv->sv_lock); if (!list_empty(&serv->sv_tempsocks)) { - /* Try to help the admin */ - net_notice_ratelimited("%s: too many open connections, consider increasing the %s\n", - serv->sv_name, serv->sv_maxconn ? - "max number of connections" : - "number of threads"); /* * Always select the oldest connection. It's not fair, - * but so is life + * but nor is life. */ - xprt = list_entry(serv->sv_tempsocks.prev, - struct svc_xprt, - xpt_list); - set_bit(XPT_CLOSE, &xprt->xpt_flags); - svc_xprt_get(xprt); + list_for_each_entry_reverse(xprti, &serv->sv_tempsocks, + xpt_list) { + if (!test_bit(XPT_PEER_VALID, &xprti->xpt_flags)) { + xprt = xprti; + set_bit(XPT_CLOSE, &xprt->xpt_flags); + svc_xprt_get(xprt); + break; + } + } } spin_unlock_bh(&serv->sv_lock);
@@ -1039,7 +1038,8 @@ static void svc_delete_xprt(struct svc_x
spin_lock_bh(&serv->sv_lock); list_del_init(&xprt->xpt_list); - if (test_bit(XPT_TEMP, &xprt->xpt_flags)) + if (test_bit(XPT_TEMP, &xprt->xpt_flags) && + !test_bit(XPT_PEER_VALID, &xprt->xpt_flags)) serv->sv_tmpcnt--; spin_unlock_bh(&serv->sv_lock);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Olga Kornievskaia okorniev@redhat.com
[ Upstream commit 898374fdd7f06fa4c4a66e8be3135efeae6128d5 ]
When a listener is added, a part of creation of transport also registers program/port with rpcbind. However, when the listener is removed, while transport goes away, rpcbind still has the entry for that port/type.
When deleting the transport, unregister with rpcbind when appropriate.
---v2 created a new xpt_flag XPT_RPCB_UNREG to mark TCP and UDP transport and at xprt destroy send rpcbind unregister if flag set.
Suggested-by: Chuck Lever chuck.lever@oracle.com Fixes: d093c9089260 ("nfsd: fix management of listener transports") Cc: stable@vger.kernel.org Signed-off-by: Olga Kornievskaia okorniev@redhat.com Reviewed-by: Jeff Layton jlayton@kernel.org Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/sunrpc/svc_xprt.h | 3 +++ net/sunrpc/svc_xprt.c | 13 +++++++++++++ net/sunrpc/svcsock.c | 2 ++ 3 files changed, 18 insertions(+)
--- a/include/linux/sunrpc/svc_xprt.h +++ b/include/linux/sunrpc/svc_xprt.h @@ -103,6 +103,9 @@ enum { * it has access to. It is NOT counted * in ->sv_tmpcnt. */ + XPT_RPCB_UNREG, /* transport that needs unregistering + * with rpcbind (TCP, UDP) on destroy + */ };
static inline void svc_xprt_set_valid(struct svc_xprt *xpt) --- a/net/sunrpc/svc_xprt.c +++ b/net/sunrpc/svc_xprt.c @@ -1028,6 +1028,19 @@ static void svc_delete_xprt(struct svc_x struct svc_serv *serv = xprt->xpt_server; struct svc_deferred_req *dr;
+ /* unregister with rpcbind for when transport type is TCP or UDP. + */ + if (test_bit(XPT_RPCB_UNREG, &xprt->xpt_flags)) { + struct svc_sock *svsk = container_of(xprt, struct svc_sock, + sk_xprt); + struct socket *sock = svsk->sk_sock; + + if (svc_register(serv, xprt->xpt_net, sock->sk->sk_family, + sock->sk->sk_protocol, 0) < 0) + pr_warn("failed to unregister %s with rpcbind\n", + xprt->xpt_class->xcl_name); + } + if (test_and_set_bit(XPT_DEAD, &xprt->xpt_flags)) return;
--- a/net/sunrpc/svcsock.c +++ b/net/sunrpc/svcsock.c @@ -837,6 +837,7 @@ static void svc_udp_init(struct svc_sock /* data might have come in before data_ready set up */ set_bit(XPT_DATA, &svsk->sk_xprt.xpt_flags); set_bit(XPT_CHNGBUF, &svsk->sk_xprt.xpt_flags); + set_bit(XPT_RPCB_UNREG, &svsk->sk_xprt.xpt_flags);
/* make sure we get destination address info */ switch (svsk->sk_sk->sk_family) { @@ -1357,6 +1358,7 @@ static void svc_tcp_init(struct svc_sock if (sk->sk_state == TCP_LISTEN) { strcpy(svsk->sk_xprt.xpt_remotebuf, "listener"); set_bit(XPT_LISTENER, &svsk->sk_xprt.xpt_flags); + set_bit(XPT_RPCB_UNREG, &svsk->sk_xprt.xpt_flags); sk->sk_data_ready = svc_tcp_listen_data_ready; set_bit(XPT_CONN, &svsk->sk_xprt.xpt_flags); } else {
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Weißschuh linux@weissschuh.net
[ Upstream commit 909dfc60692331e1599d5e28a8f08a611f353aef ]
Simplify the cleanup logic a bit.
Signed-off-by: Thomas Weißschuh linux@weissschuh.net Link: https://patch.msgid.link/20240904-acpi-battery-cleanups-v1-2-a3bf74f22d40@we... Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Stable-dep-of: 399dbcadc01e ("ACPI: battery: Add synchronization between interface updates") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/acpi/battery.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
--- a/drivers/acpi/battery.c +++ b/drivers/acpi/battery.c @@ -1218,7 +1218,7 @@ static int acpi_battery_add(struct acpi_ if (device->dep_unmet) return -EPROBE_DEFER;
- battery = kzalloc(sizeof(struct acpi_battery), GFP_KERNEL); + battery = devm_kzalloc(&device->dev, sizeof(*battery), GFP_KERNEL); if (!battery) return -ENOMEM; battery->device = device; @@ -1256,7 +1256,6 @@ fail: sysfs_remove_battery(battery); mutex_destroy(&battery->lock); mutex_destroy(&battery->sysfs_lock); - kfree(battery);
return result; } @@ -1279,7 +1278,6 @@ static void acpi_battery_remove(struct a
mutex_destroy(&battery->lock); mutex_destroy(&battery->sysfs_lock); - kfree(battery); }
#ifdef CONFIG_PM_SLEEP
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Weißschuh linux@weissschuh.net
[ Upstream commit 0710c1ce50455ed0db91bffa0eebbaa4f69b1773 ]
Simplify the cleanup logic a bit.
Signed-off-by: Thomas Weißschuh linux@weissschuh.net Link: https://patch.msgid.link/20240904-acpi-battery-cleanups-v1-3-a3bf74f22d40@we... Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Stable-dep-of: 399dbcadc01e ("ACPI: battery: Add synchronization between interface updates") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/acpi/battery.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-)
--- a/drivers/acpi/battery.c +++ b/drivers/acpi/battery.c @@ -1225,8 +1225,8 @@ static int acpi_battery_add(struct acpi_ strscpy(acpi_device_name(device), ACPI_BATTERY_DEVICE_NAME); strscpy(acpi_device_class(device), ACPI_BATTERY_CLASS); device->driver_data = battery; - mutex_init(&battery->lock); - mutex_init(&battery->sysfs_lock); + devm_mutex_init(&device->dev, &battery->lock); + devm_mutex_init(&device->dev, &battery->sysfs_lock); if (acpi_has_method(battery->device->handle, "_BIX")) set_bit(ACPI_BATTERY_XINFO_PRESENT, &battery->flags);
@@ -1254,8 +1254,6 @@ fail_pm: unregister_pm_notifier(&battery->pm_nb); fail: sysfs_remove_battery(battery); - mutex_destroy(&battery->lock); - mutex_destroy(&battery->sysfs_lock);
return result; } @@ -1275,9 +1273,6 @@ static void acpi_battery_remove(struct a device_init_wakeup(&device->dev, 0); unregister_pm_notifier(&battery->pm_nb); sysfs_remove_battery(battery); - - mutex_destroy(&battery->lock); - mutex_destroy(&battery->sysfs_lock); }
#ifdef CONFIG_PM_SLEEP
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
[ Upstream commit 815daedc318b2f9f1b956d0631377619a0d69d96 ]
Even if it's not critical, the avoidance of checking the error code from devm_mutex_init() call today diminishes the point of using devm variant of it. Tomorrow it may even leak something. Add the missed check.
Fixes: 0710c1ce5045 ("ACPI: battery: initialize mutexes through devm_ APIs") Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Reviewed-by: Thomas Weißschuh linux@weissschuh.net Link: https://patch.msgid.link/20241030162754.2110946-1-andriy.shevchenko@linux.in... [ rjw: Added 2 empty code lines ] Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Stable-dep-of: 399dbcadc01e ("ACPI: battery: Add synchronization between interface updates") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/acpi/battery.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/drivers/acpi/battery.c +++ b/drivers/acpi/battery.c @@ -1225,8 +1225,14 @@ static int acpi_battery_add(struct acpi_ strscpy(acpi_device_name(device), ACPI_BATTERY_DEVICE_NAME); strscpy(acpi_device_class(device), ACPI_BATTERY_CLASS); device->driver_data = battery; - devm_mutex_init(&device->dev, &battery->lock); - devm_mutex_init(&device->dev, &battery->sysfs_lock); + result = devm_mutex_init(&device->dev, &battery->lock); + if (result) + return result; + + result = devm_mutex_init(&device->dev, &battery->sysfs_lock); + if (result) + return result; + if (acpi_has_method(battery->device->handle, "_BIX")) set_bit(ACPI_BATTERY_XINFO_PRESENT, &battery->flags);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: "Rafael J. Wysocki" rafael.j.wysocki@intel.com
[ Upstream commit 399dbcadc01ebf0035f325eaa8c264f8b5cd0a14 ]
There is no synchronization between different code paths in the ACPI battery driver that update its sysfs interface or its power supply class device interface. In some cases this results to functional failures due to race conditions.
One example of this is when two ACPI notifications:
- ACPI_BATTERY_NOTIFY_STATUS (0x80) - ACPI_BATTERY_NOTIFY_INFO (0x81)
are triggered (by the platform firmware) in a row with a little delay in between after removing and reinserting a laptop battery. Both notifications cause acpi_battery_update() to be called and if the delay between them is sufficiently small, sysfs_add_battery() can be re-entered before battery->bat is set which leads to a duplicate sysfs entry error:
sysfs: cannot create duplicate filename '/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0A:00/power_supply/BAT1' CPU: 1 UID: 0 PID: 185 Comm: kworker/1:4 Kdump: loaded Not tainted 6.12.38+deb13-amd64 #1 Debian 6.12.38-1 Hardware name: Gateway NV44 /SJV40-MV , BIOS V1.3121 04/08/2009 Workqueue: kacpi_notify acpi_os_execute_deferred Call Trace: <TASK> dump_stack_lvl+0x5d/0x80 sysfs_warn_dup.cold+0x17/0x23 sysfs_create_dir_ns+0xce/0xe0 kobject_add_internal+0xba/0x250 kobject_add+0x96/0xc0 ? get_device_parent+0xde/0x1e0 device_add+0xe2/0x870 __power_supply_register.part.0+0x20f/0x3f0 ? wake_up_q+0x4e/0x90 sysfs_add_battery+0xa4/0x1d0 [battery] acpi_battery_update+0x19e/0x290 [battery] acpi_battery_notify+0x50/0x120 [battery] acpi_ev_notify_dispatch+0x49/0x70 acpi_os_execute_deferred+0x1a/0x30 process_one_work+0x177/0x330 worker_thread+0x251/0x390 ? __pfx_worker_thread+0x10/0x10 kthread+0xd2/0x100 ? __pfx_kthread+0x10/0x10 ret_from_fork+0x34/0x50 ? __pfx_kthread+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK> kobject: kobject_add_internal failed for BAT1 with -EEXIST, don't try to register things with the same name in the same directory.
There are also other scenarios in which analogous issues may occur.
Address this by using a common lock in all of the code paths leading to updates of driver interfaces: ACPI Notify () handler, system resume callback and post-resume notification, device addition and removal.
This new lock replaces sysfs_lock that has been used only in sysfs_remove_battery() which now is going to be always called under the new lock, so it doesn't need any internal locking any more.
Fixes: 10666251554c ("ACPI: battery: Install Notify() handler directly") Closes: https://lore.kernel.org/linux-acpi/20250910142653.313360-1-luogf2025@163.com... Reported-by: GuangFei Luo luogf2025@163.com Tested-by: GuangFei Luo luogf2025@163.com Cc: 6.6+ stable@vger.kernel.org # 6.6+ Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/acpi/battery.c | 43 +++++++++++++++++++++++++++++-------------- 1 file changed, 29 insertions(+), 14 deletions(-)
--- a/drivers/acpi/battery.c +++ b/drivers/acpi/battery.c @@ -92,7 +92,7 @@ enum {
struct acpi_battery { struct mutex lock; - struct mutex sysfs_lock; + struct mutex update_lock; struct power_supply *bat; struct power_supply_desc bat_desc; struct acpi_device *device; @@ -903,15 +903,12 @@ static int sysfs_add_battery(struct acpi
static void sysfs_remove_battery(struct acpi_battery *battery) { - mutex_lock(&battery->sysfs_lock); - if (!battery->bat) { - mutex_unlock(&battery->sysfs_lock); + if (!battery->bat) return; - } + battery_hook_remove_battery(battery); power_supply_unregister(battery->bat); battery->bat = NULL; - mutex_unlock(&battery->sysfs_lock); }
static void find_battery(const struct dmi_header *dm, void *private) @@ -1071,6 +1068,9 @@ static void acpi_battery_notify(acpi_han
if (!battery) return; + + guard(mutex)(&battery->update_lock); + old = battery->bat; /* * On Acer Aspire V5-573G notifications are sometimes triggered too @@ -1093,21 +1093,22 @@ static void acpi_battery_notify(acpi_han }
static int battery_notify(struct notifier_block *nb, - unsigned long mode, void *_unused) + unsigned long mode, void *_unused) { struct acpi_battery *battery = container_of(nb, struct acpi_battery, pm_nb); - int result;
- switch (mode) { - case PM_POST_HIBERNATION: - case PM_POST_SUSPEND: + if (mode == PM_POST_SUSPEND || mode == PM_POST_HIBERNATION) { + guard(mutex)(&battery->update_lock); + if (!acpi_battery_present(battery)) return 0;
if (battery->bat) { acpi_battery_refresh(battery); } else { + int result; + result = acpi_battery_get_info(battery); if (result) return result; @@ -1119,7 +1120,6 @@ static int battery_notify(struct notifie
acpi_battery_init_alarm(battery); acpi_battery_get_state(battery); - break; }
return 0; @@ -1197,6 +1197,8 @@ static int acpi_battery_update_retry(str { int retry, ret;
+ guard(mutex)(&battery->update_lock); + for (retry = 5; retry; retry--) { ret = acpi_battery_update(battery, false); if (!ret) @@ -1207,6 +1209,13 @@ static int acpi_battery_update_retry(str return ret; }
+static void sysfs_battery_cleanup(struct acpi_battery *battery) +{ + guard(mutex)(&battery->update_lock); + + sysfs_remove_battery(battery); +} + static int acpi_battery_add(struct acpi_device *device) { int result = 0; @@ -1229,7 +1238,7 @@ static int acpi_battery_add(struct acpi_ if (result) return result;
- result = devm_mutex_init(&device->dev, &battery->sysfs_lock); + result = devm_mutex_init(&device->dev, &battery->update_lock); if (result) return result;
@@ -1259,7 +1268,7 @@ fail_pm: device_init_wakeup(&device->dev, 0); unregister_pm_notifier(&battery->pm_nb); fail: - sysfs_remove_battery(battery); + sysfs_battery_cleanup(battery);
return result; } @@ -1278,6 +1287,9 @@ static void acpi_battery_remove(struct a
device_init_wakeup(&device->dev, 0); unregister_pm_notifier(&battery->pm_nb); + + guard(mutex)(&battery->update_lock); + sysfs_remove_battery(battery); }
@@ -1295,6 +1307,9 @@ static int acpi_battery_resume(struct de return -EINVAL;
battery->update_time = 0; + + guard(mutex)(&battery->update_lock); + acpi_battery_update(battery, true); return 0; }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: "Rafael J. Wysocki" rafael.j.wysocki@intel.com
[ Upstream commit d06118fe9b03426484980ed4c189a8c7b99fa631 ]
Data-only subnode links following the ACPI data subnode GUID in a _DSD package are expected to point to named objects returning _DSD-equivalent packages. If a reference to such an object is used in the target field of any of those links, that object will be evaluated in place (as a named object) and its return data will be embedded in the outer _DSD package.
For this reason, it is not expected to see a subnode link with the target field containing a local reference (that would mean pointing to a device or another object that cannot be evaluated in place and therefore cannot return a _DSD-equivalent package).
Accordingly, simplify the code parsing data-only subnode links to simply print a message when it encounters a local reference in the target field of one of those links.
Moreover, since acpi_nondev_subnode_data_ok() would only have one caller after the change above, fold it into that caller.
Link: https://lore.kernel.org/linux-acpi/CAJZ5v0jVeSrDO6hrZhKgRZrH=FpGD4vNUjFD8hV9... Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Reviewed-by: Sakari Ailus sakari.ailus@linux.intel.com Tested-by: Sakari Ailus sakari.ailus@linux.intel.com Stable-dep-of: baf60d5cb8bc ("ACPI: property: Do not pass NULL handles to acpi_attach_data()") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/acpi/property.c | 51 ++++++++++++++++++++---------------------------- 1 file changed, 22 insertions(+), 29 deletions(-)
--- a/drivers/acpi/property.c +++ b/drivers/acpi/property.c @@ -124,32 +124,12 @@ static bool acpi_nondev_subnode_extract( return false; }
-static bool acpi_nondev_subnode_data_ok(acpi_handle handle, - const union acpi_object *link, - struct list_head *list, - struct fwnode_handle *parent) -{ - struct acpi_buffer buf = { ACPI_ALLOCATE_BUFFER }; - acpi_status status; - - status = acpi_evaluate_object_typed(handle, NULL, NULL, &buf, - ACPI_TYPE_PACKAGE); - if (ACPI_FAILURE(status)) - return false; - - if (acpi_nondev_subnode_extract(buf.pointer, handle, link, list, - parent)) - return true; - - ACPI_FREE(buf.pointer); - return false; -} - static bool acpi_nondev_subnode_ok(acpi_handle scope, const union acpi_object *link, struct list_head *list, struct fwnode_handle *parent) { + struct acpi_buffer buf = { ACPI_ALLOCATE_BUFFER }; acpi_handle handle; acpi_status status;
@@ -161,7 +141,17 @@ static bool acpi_nondev_subnode_ok(acpi_ if (ACPI_FAILURE(status)) return false;
- return acpi_nondev_subnode_data_ok(handle, link, list, parent); + status = acpi_evaluate_object_typed(handle, NULL, NULL, &buf, + ACPI_TYPE_PACKAGE); + if (ACPI_FAILURE(status)) + return false; + + if (acpi_nondev_subnode_extract(buf.pointer, handle, link, list, + parent)) + return true; + + ACPI_FREE(buf.pointer); + return false; }
static bool acpi_add_nondev_subnodes(acpi_handle scope, @@ -174,7 +164,6 @@ static bool acpi_add_nondev_subnodes(acp
for (i = 0; i < links->package.count; i++) { union acpi_object *link, *desc; - acpi_handle handle; bool result;
link = &links->package.elements[i]; @@ -186,22 +175,26 @@ static bool acpi_add_nondev_subnodes(acp if (link->package.elements[0].type != ACPI_TYPE_STRING) continue;
- /* The second one may be a string, a reference or a package. */ + /* The second one may be a string or a package. */ switch (link->package.elements[1].type) { case ACPI_TYPE_STRING: result = acpi_nondev_subnode_ok(scope, link, list, parent); break; - case ACPI_TYPE_LOCAL_REFERENCE: - handle = link->package.elements[1].reference.handle; - result = acpi_nondev_subnode_data_ok(handle, link, list, - parent); - break; case ACPI_TYPE_PACKAGE: desc = &link->package.elements[1]; result = acpi_nondev_subnode_extract(desc, NULL, link, list, parent); break; + case ACPI_TYPE_LOCAL_REFERENCE: + /* + * It is not expected to see any local references in + * the links package because referencing a named object + * should cause it to be evaluated in place. + */ + acpi_handle_info(scope, "subnode %s: Unexpected reference\n", + link->package.elements[0].string.pointer); + fallthrough; default: result = false; break;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: "Rafael J. Wysocki" rafael.j.wysocki@intel.com
[ Upstream commit 737c3a09dcf69ba2814f3674947ccaec1861c985 ]
In some places in the ACPI device properties handling code, it is unclear why the code is what it is. Some assumptions are not documented and some pieces of code are based on knowledge that is not mentioned anywhere.
Add code comments explaining these things.
Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Reviewed-by: Sakari Ailus sakari.ailus@linux.intel.com Tested-by: Sakari Ailus sakari.ailus@linux.intel.com Stable-dep-of: baf60d5cb8bc ("ACPI: property: Do not pass NULL handles to acpi_attach_data()") Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/acpi/property.c | 46 ++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 44 insertions(+), 2 deletions(-)
--- a/drivers/acpi/property.c +++ b/drivers/acpi/property.c @@ -108,7 +108,18 @@ static bool acpi_nondev_subnode_extract( if (handle) acpi_get_parent(handle, &scope);
+ /* + * Extract properties from the _DSD-equivalent package pointed to by + * desc and use scope (if not NULL) for the completion of relative + * pathname segments. + * + * The extracted properties will be held in the new data node dn. + */ result = acpi_extract_properties(scope, desc, &dn->data); + /* + * Look for subnodes in the _DSD-equivalent package pointed to by desc + * and create child nodes of dn if there are any. + */ if (acpi_enumerate_nondev_subnodes(scope, desc, &dn->data, &dn->fwnode)) result = true;
@@ -133,6 +144,12 @@ static bool acpi_nondev_subnode_ok(acpi_ acpi_handle handle; acpi_status status;
+ /* + * If the scope is unknown, the _DSD-equivalent package being parsed + * was embedded in an outer _DSD-equivalent package as a result of + * direct evaluation of an object pointed to by a reference. In that + * case, using a pathname as the target object pointer is invalid. + */ if (!scope) return false;
@@ -162,6 +179,10 @@ static bool acpi_add_nondev_subnodes(acp bool ret = false; int i;
+ /* + * Every element in the links package is expected to represent a link + * to a non-device node in a tree containing device-specific data. + */ for (i = 0; i < links->package.count; i++) { union acpi_object *link, *desc; bool result; @@ -171,17 +192,38 @@ static bool acpi_add_nondev_subnodes(acp if (link->package.count != 2) continue;
- /* The first one must be a string. */ + /* The first one (the key) must be a string. */ if (link->package.elements[0].type != ACPI_TYPE_STRING) continue;
- /* The second one may be a string or a package. */ + /* The second one (the target) may be a string or a package. */ switch (link->package.elements[1].type) { case ACPI_TYPE_STRING: + /* + * The string is expected to be a full pathname or a + * pathname segment relative to the given scope. That + * pathname is expected to point to an object returning + * a package that contains _DSD-equivalent information. + */ result = acpi_nondev_subnode_ok(scope, link, list, parent); break; case ACPI_TYPE_PACKAGE: + /* + * This happens when a reference is used in AML to + * point to the target. Since the target is expected + * to be a named object, a reference to it will cause it + * to be avaluated in place and its return package will + * be embedded in the links package at the location of + * the reference. + * + * The target package is expected to contain _DSD- + * equivalent information, but the scope in which it + * is located in the original AML is unknown. Thus + * it cannot contain pathname segments represented as + * strings because there is no way to build full + * pathnames out of them. + */ desc = &link->package.elements[1]; result = acpi_nondev_subnode_extract(desc, NULL, link, list, parent);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: "Rafael J. Wysocki" rafael.j.wysocki@intel.com
[ Upstream commit baf60d5cb8bc6b85511c5df5f0ad7620bb66d23c ]
In certain circumstances, the ACPI handle of a data-only node may be NULL, in which case it does not make sense to attempt to attach that node to an ACPI namespace object, so update the code to avoid attempts to do so.
This prevents confusing and unuseful error messages from being printed.
Also document the fact that the ACPI handle of a data-only node may be NULL and when that happens in a code comment. In addition, make acpi_add_nondev_subnodes() print a diagnostic message for each data-only node with an unknown ACPI namespace scope.
Fixes: 1d52f10917a7 ("ACPI: property: Tie data nodes to acpi handles") Cc: 6.0+ stable@vger.kernel.org # 6.0+ Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Reviewed-by: Sakari Ailus sakari.ailus@linux.intel.com Tested-by: Sakari Ailus sakari.ailus@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/acpi/property.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
--- a/drivers/acpi/property.c +++ b/drivers/acpi/property.c @@ -124,6 +124,10 @@ static bool acpi_nondev_subnode_extract( result = true;
if (result) { + /* + * This will be NULL if the desc package is embedded in an outer + * _DSD-equivalent package and its scope cannot be determined. + */ dn->handle = handle; dn->data.pointer = desc; list_add_tail(&dn->sibling, list); @@ -224,6 +228,8 @@ static bool acpi_add_nondev_subnodes(acp * strings because there is no way to build full * pathnames out of them. */ + acpi_handle_debug(scope, "subnode %s: Unknown scope\n", + link->package.elements[0].string.pointer); desc = &link->package.elements[1]; result = acpi_nondev_subnode_extract(desc, NULL, link, list, parent); @@ -396,6 +402,9 @@ static void acpi_untie_nondev_subnodes(s struct acpi_data_node *dn;
list_for_each_entry(dn, &data->subnodes, sibling) { + if (!dn->handle) + continue; + acpi_detach_data(dn->handle, acpi_nondev_subnode_tag);
acpi_untie_nondev_subnodes(&dn->data); @@ -410,6 +419,9 @@ static bool acpi_tie_nondev_subnodes(str acpi_status status; bool ret;
+ if (!dn->handle) + continue; + status = acpi_attach_data(dn->handle, acpi_nondev_subnode_tag, dn); if (ACPI_FAILURE(status) && status != AE_ALREADY_EXISTS) { acpi_handle_err(dn->handle, "Can't tag data node\n");
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: "Matthieu Baerts (NGI0)" matttbe@kernel.org
commit 4b1ff850e0c1aacc23e923ed22989b827b9808f9 upstream.
When servers set the C-flag in their MP_CAPABLE to tell clients not to create subflows to the initial address and port, clients will likely not use their other endpoints. That's because the in-kernel path-manager uses the 'subflow' endpoints to create subflows only to the initial address and port.
If the limits have not been modified to accept ADD_ADDR, the client doesn't try to establish new subflows. If the limits accept ADD_ADDR, the routing routes will be used to select the source IP.
The C-flag is typically set when the server is operating behind a legacy Layer 4 load balancer, or using anycast IP address. Clients having their different 'subflow' endpoints setup, don't end up creating multiple subflows as expected, and causing some deployment issues.
A special case is then added here: when servers set the C-flag in the MPC and directly sends an ADD_ADDR, this single ADD_ADDR is accepted. The 'subflows' endpoints will then be used with this new remote IP and port. This exception is only allowed when the ADD_ADDR is sent immediately after the 3WHS, and makes the client switching to the 'fully established' mode. After that, 'select_local_address()' will not be able to find any subflows, because 'id_avail_bitmap' will be filled in mptcp_pm_create_subflow_or_signal_addr(), when switching to 'fully established' mode.
Fixes: df377be38725 ("mptcp: add deny_join_id0 in mptcp_options_received") Cc: stable@vger.kernel.org Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/536 Reviewed-by: Geliang Tang geliang@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Link: https://patch.msgid.link/20250925-net-next-mptcp-c-flag-laminar-v1-1-ad126cc... Signed-off-by: Jakub Kicinski kuba@kernel.org [ Conflict in pm.c, because commit 498d7d8b75f1 ("mptcp: pm: remove '_nl' from mptcp_pm_nl_is_init_remote_addr") renamed an helper in the context, and it is not in this version. The same new code can be applied at the same place. Conflict in pm_kernel.c, because the modified code has been moved from pm_netlink.c to pm_kernel.c in commit 8617e85e04bd ("mptcp: pm: split in-kernel PM specific code"), which is not in this version. The resolution is easy: simply by applying the patch where 'pm_kernel.c' has been replaced 'pm_netlink.c'. 'patch --merge' managed to apply this modified patch without creating any conflicts. ] Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mptcp/pm.c | 7 ++++-- net/mptcp/pm_netlink.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++- net/mptcp/protocol.h | 8 +++++++ 3 files changed, 62 insertions(+), 3 deletions(-)
--- a/net/mptcp/pm.c +++ b/net/mptcp/pm.c @@ -226,9 +226,12 @@ void mptcp_pm_add_addr_received(const st } else { __MPTCP_INC_STATS(sock_net((struct sock *)msk), MPTCP_MIB_ADDADDRDROP); } - /* id0 should not have a different address */ + /* - id0 should not have a different address + * - special case for C-flag: linked to fill_local_addresses_vec() + */ } else if ((addr->id == 0 && !mptcp_pm_nl_is_init_remote_addr(msk, addr)) || - (addr->id > 0 && !READ_ONCE(pm->accept_addr))) { + (addr->id > 0 && !READ_ONCE(pm->accept_addr) && + !mptcp_pm_add_addr_c_flag_case(msk))) { mptcp_pm_announce_addr(msk, addr, true); mptcp_pm_add_addr_send_ack(msk); } else if (mptcp_pm_schedule_work(msk, MPTCP_PM_ADD_ADDR_RECEIVED)) { --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -674,10 +674,12 @@ static unsigned int fill_local_addresses struct mptcp_addr_info mpc_addr; struct pm_nl_pernet *pernet; unsigned int subflows_max; + bool c_flag_case; int i = 0;
pernet = pm_nl_get_pernet_from_msk(msk); subflows_max = mptcp_pm_get_subflows_max(msk); + c_flag_case = remote->id && mptcp_pm_add_addr_c_flag_case(msk);
mptcp_local_address((struct sock_common *)msk, &mpc_addr);
@@ -690,12 +692,27 @@ static unsigned int fill_local_addresses continue;
if (msk->pm.subflows < subflows_max) { + bool is_id0; + locals[i].addr = entry->addr; locals[i].flags = entry->flags; locals[i].ifindex = entry->ifindex;
+ is_id0 = mptcp_addresses_equal(&locals[i].addr, + &mpc_addr, + locals[i].addr.port); + + if (c_flag_case && + (entry->flags & MPTCP_PM_ADDR_FLAG_SUBFLOW)) { + __clear_bit(locals[i].addr.id, + msk->pm.id_avail_bitmap); + + if (!is_id0) + msk->pm.local_addr_used++; + } + /* Special case for ID0: set the correct ID */ - if (mptcp_addresses_equal(&locals[i].addr, &mpc_addr, locals[i].addr.port)) + if (is_id0) locals[i].addr.id = 0;
msk->pm.subflows++; @@ -704,6 +721,37 @@ static unsigned int fill_local_addresses } rcu_read_unlock();
+ /* Special case: peer sets the C flag, accept one ADD_ADDR if default + * limits are used -- accepting no ADD_ADDR -- and use subflow endpoints + */ + if (!i && c_flag_case) { + unsigned int local_addr_max = mptcp_pm_get_local_addr_max(msk); + + while (msk->pm.local_addr_used < local_addr_max && + msk->pm.subflows < subflows_max) { + struct mptcp_pm_local *local = &locals[i]; + + if (!select_local_address(pernet, msk, local)) + break; + + __clear_bit(local->addr.id, msk->pm.id_avail_bitmap); + + if (!mptcp_pm_addr_families_match(sk, &local->addr, + remote)) + continue; + + if (mptcp_addresses_equal(&local->addr, &mpc_addr, + local->addr.port)) + continue; + + msk->pm.local_addr_used++; + msk->pm.subflows++; + i++; + } + + return i; + } + /* If the array is empty, fill in the single * 'IPADDRANY' local address */ --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -1172,6 +1172,14 @@ static inline void mptcp_pm_close_subflo spin_unlock_bh(&msk->pm.lock); }
+static inline bool mptcp_pm_add_addr_c_flag_case(struct mptcp_sock *msk) +{ + return READ_ONCE(msk->pm.remote_deny_join_id0) && + msk->pm.local_addr_used == 0 && + mptcp_pm_get_add_addr_accept_max(msk) == 0 && + msk->pm.subflows < mptcp_pm_get_subflows_max(msk); +} + void mptcp_sockopt_sync_locked(struct mptcp_sock *msk, struct sock *ssk);
static inline struct mptcp_ext *mptcp_get_ext(const struct sk_buff *skb)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Corey Minyard corey@minyard.net
commit b52da4054ee0bf9ecb44996f2c83236ff50b3812 upstream
This patch required quite a bit of work to backport due to a number of unrelated changes that do not make sense to backport. This has been run against my test suite and passes all tests.
The limit on the number of user messages had a number of issues, improper counting in some cases and a use after free.
Restructure how this is all done to handle more in the receive message allocation routine, so all refcouting and user message limit counts are done in that routine. It's a lot cleaner and safer.
Reported-by: Gilles BULOZ gilles.buloz@kontron.com Closes: https://lore.kernel.org/lkml/aLsw6G0GyqfpKs2S@mail.minyard.net/ Fixes: 8e76741c3d8b ("ipmi: Add a limit on the number of users that may use IPMI") Cc: stable@vger.kernel.org # 4.19 Signed-off-by: Corey Minyard corey@minyard.net Tested-by: Gilles BULOZ gilles.buloz@kontron.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/char/ipmi/ipmi_msghandler.c | 415 +++++++++++++++++------------------- 1 file changed, 198 insertions(+), 217 deletions(-)
--- a/drivers/char/ipmi/ipmi_msghandler.c +++ b/drivers/char/ipmi/ipmi_msghandler.c @@ -39,7 +39,9 @@
#define IPMI_DRIVER_VERSION "39.2"
-static struct ipmi_recv_msg *ipmi_alloc_recv_msg(void); +static struct ipmi_recv_msg *ipmi_alloc_recv_msg(struct ipmi_user *user); +static void ipmi_set_recv_msg_user(struct ipmi_recv_msg *msg, + struct ipmi_user *user); static int ipmi_init_msghandler(void); static void smi_recv_work(struct work_struct *t); static void handle_new_recv_msgs(struct ipmi_smi *intf); @@ -939,13 +941,11 @@ static int deliver_response(struct ipmi_ * risk. At this moment, simply skip it in that case. */ ipmi_free_recv_msg(msg); - atomic_dec(&msg->user->nr_msgs); } else { int index; struct ipmi_user *user = acquire_ipmi_user(msg->user, &index);
if (user) { - atomic_dec(&user->nr_msgs); user->handler->ipmi_recv_hndl(msg, user->handler_data); release_ipmi_user(user, index); } else { @@ -1634,8 +1634,7 @@ int ipmi_set_gets_events(struct ipmi_use spin_unlock_irqrestore(&intf->events_lock, flags);
list_for_each_entry_safe(msg, msg2, &msgs, link) { - msg->user = user; - kref_get(&user->refcount); + ipmi_set_recv_msg_user(msg, user); deliver_local_response(intf, msg); }
@@ -2309,22 +2308,15 @@ static int i_ipmi_request(struct ipmi_us struct ipmi_recv_msg *recv_msg; int rv = 0;
- if (user) { - if (atomic_add_return(1, &user->nr_msgs) > max_msgs_per_user) { - /* Decrement will happen at the end of the routine. */ - rv = -EBUSY; - goto out; - } - } - - if (supplied_recv) + if (supplied_recv) { recv_msg = supplied_recv; - else { - recv_msg = ipmi_alloc_recv_msg(); - if (recv_msg == NULL) { - rv = -ENOMEM; - goto out; - } + recv_msg->user = user; + if (user) + atomic_inc(&user->nr_msgs); + } else { + recv_msg = ipmi_alloc_recv_msg(user); + if (IS_ERR(recv_msg)) + return PTR_ERR(recv_msg); } recv_msg->user_msg_data = user_msg_data;
@@ -2335,8 +2327,7 @@ static int i_ipmi_request(struct ipmi_us if (smi_msg == NULL) { if (!supplied_recv) ipmi_free_recv_msg(recv_msg); - rv = -ENOMEM; - goto out; + return -ENOMEM; } }
@@ -2346,10 +2337,6 @@ static int i_ipmi_request(struct ipmi_us goto out_err; }
- recv_msg->user = user; - if (user) - /* The put happens when the message is freed. */ - kref_get(&user->refcount); recv_msg->msgid = msgid; /* * Store the message to send in the receive message so timeout @@ -2378,8 +2365,10 @@ static int i_ipmi_request(struct ipmi_us
if (rv) { out_err: - ipmi_free_smi_msg(smi_msg); - ipmi_free_recv_msg(recv_msg); + if (!supplied_smi) + ipmi_free_smi_msg(smi_msg); + if (!supplied_recv) + ipmi_free_recv_msg(recv_msg); } else { dev_dbg(intf->si_dev, "Send: %*ph\n", smi_msg->data_size, smi_msg->data); @@ -2388,9 +2377,6 @@ out_err: } rcu_read_unlock();
-out: - if (rv && user) - atomic_dec(&user->nr_msgs); return rv; }
@@ -3882,7 +3868,7 @@ static int handle_ipmb_get_msg_cmd(struc unsigned char chan; struct ipmi_user *user = NULL; struct ipmi_ipmb_addr *ipmb_addr; - struct ipmi_recv_msg *recv_msg; + struct ipmi_recv_msg *recv_msg = NULL;
if (msg->rsp_size < 10) { /* Message not big enough, just ignore it. */ @@ -3903,9 +3889,8 @@ static int handle_ipmb_get_msg_cmd(struc rcvr = find_cmd_rcvr(intf, netfn, cmd, chan); if (rcvr) { user = rcvr->user; - kref_get(&user->refcount); - } else - user = NULL; + recv_msg = ipmi_alloc_recv_msg(user); + } rcu_read_unlock();
if (user == NULL) { @@ -3940,47 +3925,41 @@ static int handle_ipmb_get_msg_cmd(struc rv = -1; } rcu_read_unlock(); - } else { - recv_msg = ipmi_alloc_recv_msg(); - if (!recv_msg) { - /* - * We couldn't allocate memory for the - * message, so requeue it for handling - * later. - */ - rv = 1; - kref_put(&user->refcount, free_user); - } else { - /* Extract the source address from the data. */ - ipmb_addr = (struct ipmi_ipmb_addr *) &recv_msg->addr; - ipmb_addr->addr_type = IPMI_IPMB_ADDR_TYPE; - ipmb_addr->slave_addr = msg->rsp[6]; - ipmb_addr->lun = msg->rsp[7] & 3; - ipmb_addr->channel = msg->rsp[3] & 0xf; + } else if (!IS_ERR(recv_msg)) { + /* Extract the source address from the data. */ + ipmb_addr = (struct ipmi_ipmb_addr *) &recv_msg->addr; + ipmb_addr->addr_type = IPMI_IPMB_ADDR_TYPE; + ipmb_addr->slave_addr = msg->rsp[6]; + ipmb_addr->lun = msg->rsp[7] & 3; + ipmb_addr->channel = msg->rsp[3] & 0xf;
- /* - * Extract the rest of the message information - * from the IPMB header. - */ - recv_msg->user = user; - recv_msg->recv_type = IPMI_CMD_RECV_TYPE; - recv_msg->msgid = msg->rsp[7] >> 2; - recv_msg->msg.netfn = msg->rsp[4] >> 2; - recv_msg->msg.cmd = msg->rsp[8]; - recv_msg->msg.data = recv_msg->msg_data; + /* + * Extract the rest of the message information + * from the IPMB header. + */ + recv_msg->recv_type = IPMI_CMD_RECV_TYPE; + recv_msg->msgid = msg->rsp[7] >> 2; + recv_msg->msg.netfn = msg->rsp[4] >> 2; + recv_msg->msg.cmd = msg->rsp[8]; + recv_msg->msg.data = recv_msg->msg_data;
- /* - * We chop off 10, not 9 bytes because the checksum - * at the end also needs to be removed. - */ - recv_msg->msg.data_len = msg->rsp_size - 10; - memcpy(recv_msg->msg_data, &msg->rsp[9], - msg->rsp_size - 10); - if (deliver_response(intf, recv_msg)) - ipmi_inc_stat(intf, unhandled_commands); - else - ipmi_inc_stat(intf, handled_commands); - } + /* + * We chop off 10, not 9 bytes because the checksum + * at the end also needs to be removed. + */ + recv_msg->msg.data_len = msg->rsp_size - 10; + memcpy(recv_msg->msg_data, &msg->rsp[9], + msg->rsp_size - 10); + if (deliver_response(intf, recv_msg)) + ipmi_inc_stat(intf, unhandled_commands); + else + ipmi_inc_stat(intf, handled_commands); + } else { + /* + * We couldn't allocate memory for the message, so + * requeue it for handling later. + */ + rv = 1; }
return rv; @@ -3993,7 +3972,7 @@ static int handle_ipmb_direct_rcv_cmd(st int rv = 0; struct ipmi_user *user = NULL; struct ipmi_ipmb_direct_addr *daddr; - struct ipmi_recv_msg *recv_msg; + struct ipmi_recv_msg *recv_msg = NULL; unsigned char netfn = msg->rsp[0] >> 2; unsigned char cmd = msg->rsp[3];
@@ -4002,9 +3981,8 @@ static int handle_ipmb_direct_rcv_cmd(st rcvr = find_cmd_rcvr(intf, netfn, cmd, 0); if (rcvr) { user = rcvr->user; - kref_get(&user->refcount); - } else - user = NULL; + recv_msg = ipmi_alloc_recv_msg(user); + } rcu_read_unlock();
if (user == NULL) { @@ -4031,44 +4009,38 @@ static int handle_ipmb_direct_rcv_cmd(st rv = -1; } rcu_read_unlock(); - } else { - recv_msg = ipmi_alloc_recv_msg(); - if (!recv_msg) { - /* - * We couldn't allocate memory for the - * message, so requeue it for handling - * later. - */ - rv = 1; - kref_put(&user->refcount, free_user); - } else { - /* Extract the source address from the data. */ - daddr = (struct ipmi_ipmb_direct_addr *)&recv_msg->addr; - daddr->addr_type = IPMI_IPMB_DIRECT_ADDR_TYPE; - daddr->channel = 0; - daddr->slave_addr = msg->rsp[1]; - daddr->rs_lun = msg->rsp[0] & 3; - daddr->rq_lun = msg->rsp[2] & 3; + } else if (!IS_ERR(recv_msg)) { + /* Extract the source address from the data. */ + daddr = (struct ipmi_ipmb_direct_addr *)&recv_msg->addr; + daddr->addr_type = IPMI_IPMB_DIRECT_ADDR_TYPE; + daddr->channel = 0; + daddr->slave_addr = msg->rsp[1]; + daddr->rs_lun = msg->rsp[0] & 3; + daddr->rq_lun = msg->rsp[2] & 3;
- /* - * Extract the rest of the message information - * from the IPMB header. - */ - recv_msg->user = user; - recv_msg->recv_type = IPMI_CMD_RECV_TYPE; - recv_msg->msgid = (msg->rsp[2] >> 2); - recv_msg->msg.netfn = msg->rsp[0] >> 2; - recv_msg->msg.cmd = msg->rsp[3]; - recv_msg->msg.data = recv_msg->msg_data; - - recv_msg->msg.data_len = msg->rsp_size - 4; - memcpy(recv_msg->msg_data, msg->rsp + 4, - msg->rsp_size - 4); - if (deliver_response(intf, recv_msg)) - ipmi_inc_stat(intf, unhandled_commands); - else - ipmi_inc_stat(intf, handled_commands); - } + /* + * Extract the rest of the message information + * from the IPMB header. + */ + recv_msg->recv_type = IPMI_CMD_RECV_TYPE; + recv_msg->msgid = (msg->rsp[2] >> 2); + recv_msg->msg.netfn = msg->rsp[0] >> 2; + recv_msg->msg.cmd = msg->rsp[3]; + recv_msg->msg.data = recv_msg->msg_data; + + recv_msg->msg.data_len = msg->rsp_size - 4; + memcpy(recv_msg->msg_data, msg->rsp + 4, + msg->rsp_size - 4); + if (deliver_response(intf, recv_msg)) + ipmi_inc_stat(intf, unhandled_commands); + else + ipmi_inc_stat(intf, handled_commands); + } else { + /* + * We couldn't allocate memory for the message, so + * requeue it for handling later. + */ + rv = 1; }
return rv; @@ -4182,7 +4154,7 @@ static int handle_lan_get_msg_cmd(struct unsigned char chan; struct ipmi_user *user = NULL; struct ipmi_lan_addr *lan_addr; - struct ipmi_recv_msg *recv_msg; + struct ipmi_recv_msg *recv_msg = NULL;
if (msg->rsp_size < 12) { /* Message not big enough, just ignore it. */ @@ -4203,9 +4175,8 @@ static int handle_lan_get_msg_cmd(struct rcvr = find_cmd_rcvr(intf, netfn, cmd, chan); if (rcvr) { user = rcvr->user; - kref_get(&user->refcount); - } else - user = NULL; + recv_msg = ipmi_alloc_recv_msg(user); + } rcu_read_unlock();
if (user == NULL) { @@ -4217,49 +4188,44 @@ static int handle_lan_get_msg_cmd(struct * them to be freed. */ rv = 0; - } else { - recv_msg = ipmi_alloc_recv_msg(); - if (!recv_msg) { - /* - * We couldn't allocate memory for the - * message, so requeue it for handling later. - */ - rv = 1; - kref_put(&user->refcount, free_user); - } else { - /* Extract the source address from the data. */ - lan_addr = (struct ipmi_lan_addr *) &recv_msg->addr; - lan_addr->addr_type = IPMI_LAN_ADDR_TYPE; - lan_addr->session_handle = msg->rsp[4]; - lan_addr->remote_SWID = msg->rsp[8]; - lan_addr->local_SWID = msg->rsp[5]; - lan_addr->lun = msg->rsp[9] & 3; - lan_addr->channel = msg->rsp[3] & 0xf; - lan_addr->privilege = msg->rsp[3] >> 4; + } else if (!IS_ERR(recv_msg)) { + /* Extract the source address from the data. */ + lan_addr = (struct ipmi_lan_addr *) &recv_msg->addr; + lan_addr->addr_type = IPMI_LAN_ADDR_TYPE; + lan_addr->session_handle = msg->rsp[4]; + lan_addr->remote_SWID = msg->rsp[8]; + lan_addr->local_SWID = msg->rsp[5]; + lan_addr->lun = msg->rsp[9] & 3; + lan_addr->channel = msg->rsp[3] & 0xf; + lan_addr->privilege = msg->rsp[3] >> 4;
- /* - * Extract the rest of the message information - * from the IPMB header. - */ - recv_msg->user = user; - recv_msg->recv_type = IPMI_CMD_RECV_TYPE; - recv_msg->msgid = msg->rsp[9] >> 2; - recv_msg->msg.netfn = msg->rsp[6] >> 2; - recv_msg->msg.cmd = msg->rsp[10]; - recv_msg->msg.data = recv_msg->msg_data; + /* + * Extract the rest of the message information + * from the IPMB header. + */ + recv_msg->recv_type = IPMI_CMD_RECV_TYPE; + recv_msg->msgid = msg->rsp[9] >> 2; + recv_msg->msg.netfn = msg->rsp[6] >> 2; + recv_msg->msg.cmd = msg->rsp[10]; + recv_msg->msg.data = recv_msg->msg_data;
- /* - * We chop off 12, not 11 bytes because the checksum - * at the end also needs to be removed. - */ - recv_msg->msg.data_len = msg->rsp_size - 12; - memcpy(recv_msg->msg_data, &msg->rsp[11], - msg->rsp_size - 12); - if (deliver_response(intf, recv_msg)) - ipmi_inc_stat(intf, unhandled_commands); - else - ipmi_inc_stat(intf, handled_commands); - } + /* + * We chop off 12, not 11 bytes because the checksum + * at the end also needs to be removed. + */ + recv_msg->msg.data_len = msg->rsp_size - 12; + memcpy(recv_msg->msg_data, &msg->rsp[11], + msg->rsp_size - 12); + if (deliver_response(intf, recv_msg)) + ipmi_inc_stat(intf, unhandled_commands); + else + ipmi_inc_stat(intf, handled_commands); + } else { + /* + * We couldn't allocate memory for the message, so + * requeue it for handling later. + */ + rv = 1; }
return rv; @@ -4281,7 +4247,7 @@ static int handle_oem_get_msg_cmd(struct unsigned char chan; struct ipmi_user *user = NULL; struct ipmi_system_interface_addr *smi_addr; - struct ipmi_recv_msg *recv_msg; + struct ipmi_recv_msg *recv_msg = NULL;
/* * We expect the OEM SW to perform error checking @@ -4310,9 +4276,8 @@ static int handle_oem_get_msg_cmd(struct rcvr = find_cmd_rcvr(intf, netfn, cmd, chan); if (rcvr) { user = rcvr->user; - kref_get(&user->refcount); - } else - user = NULL; + recv_msg = ipmi_alloc_recv_msg(user); + } rcu_read_unlock();
if (user == NULL) { @@ -4325,48 +4290,42 @@ static int handle_oem_get_msg_cmd(struct */
rv = 0; - } else { - recv_msg = ipmi_alloc_recv_msg(); - if (!recv_msg) { - /* - * We couldn't allocate memory for the - * message, so requeue it for handling - * later. - */ - rv = 1; - kref_put(&user->refcount, free_user); - } else { - /* - * OEM Messages are expected to be delivered via - * the system interface to SMS software. We might - * need to visit this again depending on OEM - * requirements - */ - smi_addr = ((struct ipmi_system_interface_addr *) - &recv_msg->addr); - smi_addr->addr_type = IPMI_SYSTEM_INTERFACE_ADDR_TYPE; - smi_addr->channel = IPMI_BMC_CHANNEL; - smi_addr->lun = msg->rsp[0] & 3; - - recv_msg->user = user; - recv_msg->user_msg_data = NULL; - recv_msg->recv_type = IPMI_OEM_RECV_TYPE; - recv_msg->msg.netfn = msg->rsp[0] >> 2; - recv_msg->msg.cmd = msg->rsp[1]; - recv_msg->msg.data = recv_msg->msg_data; + } else if (!IS_ERR(recv_msg)) { + /* + * OEM Messages are expected to be delivered via + * the system interface to SMS software. We might + * need to visit this again depending on OEM + * requirements + */ + smi_addr = ((struct ipmi_system_interface_addr *) + &recv_msg->addr); + smi_addr->addr_type = IPMI_SYSTEM_INTERFACE_ADDR_TYPE; + smi_addr->channel = IPMI_BMC_CHANNEL; + smi_addr->lun = msg->rsp[0] & 3; + + recv_msg->user_msg_data = NULL; + recv_msg->recv_type = IPMI_OEM_RECV_TYPE; + recv_msg->msg.netfn = msg->rsp[0] >> 2; + recv_msg->msg.cmd = msg->rsp[1]; + recv_msg->msg.data = recv_msg->msg_data;
- /* - * The message starts at byte 4 which follows the - * Channel Byte in the "GET MESSAGE" command - */ - recv_msg->msg.data_len = msg->rsp_size - 4; - memcpy(recv_msg->msg_data, &msg->rsp[4], - msg->rsp_size - 4); - if (deliver_response(intf, recv_msg)) - ipmi_inc_stat(intf, unhandled_commands); - else - ipmi_inc_stat(intf, handled_commands); - } + /* + * The message starts at byte 4 which follows the + * Channel Byte in the "GET MESSAGE" command + */ + recv_msg->msg.data_len = msg->rsp_size - 4; + memcpy(recv_msg->msg_data, &msg->rsp[4], + msg->rsp_size - 4); + if (deliver_response(intf, recv_msg)) + ipmi_inc_stat(intf, unhandled_commands); + else + ipmi_inc_stat(intf, handled_commands); + } else { + /* + * We couldn't allocate memory for the message, so + * requeue it for handling later. + */ + rv = 1; }
return rv; @@ -4425,8 +4384,8 @@ static int handle_read_event_rsp(struct if (!user->gets_events) continue;
- recv_msg = ipmi_alloc_recv_msg(); - if (!recv_msg) { + recv_msg = ipmi_alloc_recv_msg(user); + if (IS_ERR(recv_msg)) { rcu_read_unlock(); list_for_each_entry_safe(recv_msg, recv_msg2, &msgs, link) { @@ -4445,8 +4404,6 @@ static int handle_read_event_rsp(struct deliver_count++;
copy_event_into_recv_msg(recv_msg, msg); - recv_msg->user = user; - kref_get(&user->refcount); list_add_tail(&recv_msg->link, &msgs); } srcu_read_unlock(&intf->users_srcu, index); @@ -4462,8 +4419,8 @@ static int handle_read_event_rsp(struct * No one to receive the message, put it in queue if there's * not already too many things in the queue. */ - recv_msg = ipmi_alloc_recv_msg(); - if (!recv_msg) { + recv_msg = ipmi_alloc_recv_msg(NULL); + if (IS_ERR(recv_msg)) { /* * We couldn't allocate memory for the * message, so requeue it for handling @@ -5155,27 +5112,51 @@ static void free_recv_msg(struct ipmi_re kfree(msg); }
-static struct ipmi_recv_msg *ipmi_alloc_recv_msg(void) +static struct ipmi_recv_msg *ipmi_alloc_recv_msg(struct ipmi_user *user) { struct ipmi_recv_msg *rv;
+ if (user) { + if (atomic_add_return(1, &user->nr_msgs) > max_msgs_per_user) { + atomic_dec(&user->nr_msgs); + return ERR_PTR(-EBUSY); + } + } + rv = kmalloc(sizeof(struct ipmi_recv_msg), GFP_ATOMIC); - if (rv) { - rv->user = NULL; - rv->done = free_recv_msg; - atomic_inc(&recv_msg_inuse_count); + if (!rv) { + if (user) + atomic_dec(&user->nr_msgs); + return ERR_PTR(-ENOMEM); } + + rv->user = user; + rv->done = free_recv_msg; + if (user) + kref_get(&user->refcount); + atomic_inc(&recv_msg_inuse_count); return rv; }
void ipmi_free_recv_msg(struct ipmi_recv_msg *msg) { - if (msg->user && !oops_in_progress) + if (msg->user && !oops_in_progress) { + atomic_dec(&msg->user->nr_msgs); kref_put(&msg->user->refcount, free_user); + } msg->done(msg); } EXPORT_SYMBOL(ipmi_free_recv_msg);
+static void ipmi_set_recv_msg_user(struct ipmi_recv_msg *msg, + struct ipmi_user *user) +{ + WARN_ON_ONCE(msg->user); /* User should not be set. */ + msg->user = user; + atomic_inc(&user->nr_msgs); + kref_get(&user->refcount); +} + static atomic_t panic_done_count = ATOMIC_INIT(0);
static void dummy_smi_done_handler(struct ipmi_smi_msg *msg)
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guenter Roeck linux@roeck-us.net
commit e2c69490dda5d4c9f1bfbb2898989c8f3530e354 upstream
Prior to commit b52da4054ee0 ("ipmi: Rework user message limit handling"), i_ipmi_request() used to increase the user reference counter if the receive message is provided by the caller of IPMI API functions. This is no longer the case. However, ipmi_free_recv_msg() is still called and decreases the reference counter. This results in the reference counter reaching zero, the user data pointer is released, and all kinds of interesting crashes are seen.
Fix the problem by increasing user reference counter if the receive message has been provided by the caller.
Fixes: b52da4054ee0 ("ipmi: Rework user message limit handling") Reported-by: Eric Dumazet edumazet@google.com Cc: Eric Dumazet edumazet@google.com Cc: Greg Thelen gthelen@google.com Signed-off-by: Guenter Roeck linux@roeck-us.net Message-ID: 20251006201857.3433837-1-linux@roeck-us.net Signed-off-by: Corey Minyard corey@minyard.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/char/ipmi/ipmi_msghandler.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/char/ipmi/ipmi_msghandler.c +++ b/drivers/char/ipmi/ipmi_msghandler.c @@ -2311,8 +2311,11 @@ static int i_ipmi_request(struct ipmi_us if (supplied_recv) { recv_msg = supplied_recv; recv_msg->user = user; - if (user) + if (user) { atomic_inc(&user->nr_msgs); + /* The put happens when the message is freed. */ + kref_get(&user->refcount); + } } else { recv_msg = ipmi_alloc_recv_msg(user); if (IS_ERR(recv_msg))
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lance Yang lance.yang@linux.dev
commit 9658d698a8a83540bf6a6c80d13c9a61590ee985 upstream.
When splitting an mTHP and replacing a zero-filled subpage with the shared zeropage, try_to_map_unused_to_zeropage() currently drops several important PTE bits.
For userspace tools like CRIU, which rely on the soft-dirty mechanism for incremental snapshots, losing the soft-dirty bit means modified pages are missed, leading to inconsistent memory state after restore.
As pointed out by David, the more critical uffd-wp bit is also dropped. This breaks the userfaultfd write-protection mechanism, causing writes to be silently missed by monitoring applications, which can lead to data corruption.
Preserve both the soft-dirty and uffd-wp bits from the old PTE when creating the new zeropage mapping to ensure they are correctly tracked.
Link: https://lkml.kernel.org/r/20250930081040.80926-1-lance.yang@linux.dev Fixes: b1f202060afe ("mm: remap unused subpages to shared zeropage when splitting isolated thp") Signed-off-by: Lance Yang lance.yang@linux.dev Suggested-by: David Hildenbrand david@redhat.com Suggested-by: Dev Jain dev.jain@arm.com Acked-by: David Hildenbrand david@redhat.com Reviewed-by: Dev Jain dev.jain@arm.com Acked-by: Zi Yan ziy@nvidia.com Reviewed-by: Liam R. Howlett Liam.Howlett@oracle.com Reviewed-by: Harry Yoo harry.yoo@oracle.com Cc: Alistair Popple apopple@nvidia.com Cc: Baolin Wang baolin.wang@linux.alibaba.com Cc: Barry Song baohua@kernel.org Cc: Byungchul Park byungchul@sk.com Cc: Gregory Price gourry@gourry.net Cc: "Huang, Ying" ying.huang@linux.alibaba.com Cc: Jann Horn jannh@google.com Cc: Joshua Hahn joshua.hahnjy@gmail.com Cc: Lorenzo Stoakes lorenzo.stoakes@oracle.com Cc: Mariano Pache npache@redhat.com Cc: Mathew Brost matthew.brost@intel.com Cc: Peter Xu peterx@redhat.com Cc: Rakie Kim rakie.kim@sk.com Cc: Rik van Riel riel@surriel.com Cc: Ryan Roberts ryan.roberts@arm.com Cc: Usama Arif usamaarif642@gmail.com Cc: Vlastimil Babka vbabka@suse.cz Cc: Yu Zhao yuzhao@google.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/migrate.c | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-)
--- a/mm/migrate.c +++ b/mm/migrate.c @@ -198,8 +198,7 @@ bool isolate_folio_to_list(struct folio }
static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw, - struct folio *folio, - unsigned long idx) + struct folio *folio, pte_t old_pte, unsigned long idx) { struct page *page = folio_page(folio, idx); pte_t newpte; @@ -208,7 +207,7 @@ static bool try_to_map_unused_to_zeropag return false; VM_BUG_ON_PAGE(!PageAnon(page), page); VM_BUG_ON_PAGE(!PageLocked(page), page); - VM_BUG_ON_PAGE(pte_present(*pvmw->pte), page); + VM_BUG_ON_PAGE(pte_present(old_pte), page);
if (folio_test_mlocked(folio) || (pvmw->vma->vm_flags & VM_LOCKED) || mm_forbids_zeropage(pvmw->vma->vm_mm)) @@ -224,6 +223,12 @@ static bool try_to_map_unused_to_zeropag
newpte = pte_mkspecial(pfn_pte(my_zero_pfn(pvmw->address), pvmw->vma->vm_page_prot)); + + if (pte_swp_soft_dirty(old_pte)) + newpte = pte_mksoft_dirty(newpte); + if (pte_swp_uffd_wp(old_pte)) + newpte = pte_mkuffd_wp(newpte); + set_pte_at(pvmw->vma->vm_mm, pvmw->address, pvmw->pte, newpte);
dec_mm_counter(pvmw->vma->vm_mm, mm_counter(folio)); @@ -266,13 +271,13 @@ static bool remove_migration_pte(struct continue; } #endif + old_pte = ptep_get(pvmw.pte); if (rmap_walk_arg->map_unused_to_zeropage && - try_to_map_unused_to_zeropage(&pvmw, folio, idx)) + try_to_map_unused_to_zeropage(&pvmw, folio, old_pte, idx)) continue;
folio_get(folio); pte = mk_pte(new, READ_ONCE(vma->vm_page_prot)); - old_pte = ptep_get(pvmw.pte);
entry = pte_to_swp_entry(old_pte); if (!is_migration_entry_young(entry))
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilya Leoshkevich iii@linux.ibm.com
commit b2268d550d20ff860bddfe3a91b1aec00414689a upstream.
The calculation of the distance from %r15 to the caller-allocated portion of the stack frame is copy-pasted into multiple places in the JIT code.
Move it to bpf_jit_prog() and save the result into bpf_jit::frame_off, so that the other parts of the JIT can use it.
Signed-off-by: Ilya Leoshkevich iii@linux.ibm.com Link: https://lore.kernel.org/r/20250624121501.50536-2-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/s390/net/bpf_jit_comp.c | 56 +++++++++++++++++++------------------------ 1 file changed, 26 insertions(+), 30 deletions(-)
--- a/arch/s390/net/bpf_jit_comp.c +++ b/arch/s390/net/bpf_jit_comp.c @@ -56,6 +56,7 @@ struct bpf_jit { int prologue_plt; /* Start of prologue hotpatch PLT */ int kern_arena; /* Pool offset of kernel arena address */ u64 user_arena; /* User arena address */ + u32 frame_off; /* Offset of frame from %r15 */ };
#define SEEN_MEM BIT(0) /* use mem[] for temporary storage */ @@ -421,12 +422,9 @@ static void save_regs(struct bpf_jit *ji /* * Restore registers from "rs" (register start) to "re" (register end) on stack */ -static void restore_regs(struct bpf_jit *jit, u32 rs, u32 re, u32 stack_depth) +static void restore_regs(struct bpf_jit *jit, u32 rs, u32 re) { - u32 off = STK_OFF_R6 + (rs - 6) * 8; - - if (jit->seen & SEEN_STACK) - off += STK_OFF + stack_depth; + u32 off = jit->frame_off + STK_OFF_R6 + (rs - 6) * 8;
if (rs == re) /* lg %rs,off(%r15) */ @@ -470,8 +468,7 @@ static int get_end(u16 seen_regs, int st * Save and restore clobbered registers (6-15) on stack. * We save/restore registers in chunks with gap >= 2 registers. */ -static void save_restore_regs(struct bpf_jit *jit, int op, u32 stack_depth, - u16 extra_regs) +static void save_restore_regs(struct bpf_jit *jit, int op, u16 extra_regs) { u16 seen_regs = jit->seen_regs | extra_regs; const int last = 15, save_restore_size = 6; @@ -494,7 +491,7 @@ static void save_restore_regs(struct bpf if (op == REGS_SAVE) save_regs(jit, rs, re); else - restore_regs(jit, rs, re, stack_depth); + restore_regs(jit, rs, re); re++; } while (re <= last); } @@ -561,8 +558,7 @@ static void bpf_jit_plt(struct bpf_plt * * Save registers and create stack frame if necessary. * See stack frame layout description in "bpf_jit.h"! */ -static void bpf_jit_prologue(struct bpf_jit *jit, struct bpf_prog *fp, - u32 stack_depth) +static void bpf_jit_prologue(struct bpf_jit *jit, struct bpf_prog *fp) { /* No-op for hotpatching */ /* brcl 0,prologue_plt */ @@ -595,7 +591,7 @@ static void bpf_jit_prologue(struct bpf_ jit->seen_regs |= NVREGS; } else { /* Save registers */ - save_restore_regs(jit, REGS_SAVE, stack_depth, + save_restore_regs(jit, REGS_SAVE, fp->aux->exception_boundary ? NVREGS : 0); } /* Setup literal pool */ @@ -617,8 +613,8 @@ static void bpf_jit_prologue(struct bpf_ EMIT4(0xb9040000, REG_W1, REG_15); /* la %bfp,STK_160_UNUSED(%r15) (BPF frame pointer) */ EMIT4_DISP(0x41000000, BPF_REG_FP, REG_15, STK_160_UNUSED); - /* aghi %r15,-STK_OFF */ - EMIT4_IMM(0xa70b0000, REG_15, -(STK_OFF + stack_depth)); + /* aghi %r15,-frame_off */ + EMIT4_IMM(0xa70b0000, REG_15, -jit->frame_off); /* stg %w1,152(%r15) (backchain) */ EMIT6_DISP_LH(0xe3000000, 0x0024, REG_W1, REG_0, REG_15, 152); @@ -665,13 +661,13 @@ static void call_r1(struct bpf_jit *jit) /* * Function epilogue */ -static void bpf_jit_epilogue(struct bpf_jit *jit, u32 stack_depth) +static void bpf_jit_epilogue(struct bpf_jit *jit) { jit->exit_ip = jit->prg; /* Load exit code: lgr %r2,%b0 */ EMIT4(0xb9040000, REG_2, BPF_REG_0); /* Restore registers */ - save_restore_regs(jit, REGS_RESTORE, stack_depth, 0); + save_restore_regs(jit, REGS_RESTORE, 0); if (nospec_uses_trampoline()) { jit->r14_thunk_ip = jit->prg; /* Generate __s390_indirect_jump_r14 thunk */ @@ -862,7 +858,7 @@ static int sign_extend(struct bpf_jit *j * stack space for the large switch statement. */ static noinline int bpf_jit_insn(struct bpf_jit *jit, struct bpf_prog *fp, - int i, bool extra_pass, u32 stack_depth) + int i, bool extra_pass) { struct bpf_insn *insn = &fp->insnsi[i]; s32 branch_oc_off = insn->off; @@ -1783,9 +1779,9 @@ static noinline int bpf_jit_insn(struct * Note 2: We assume that the verifier does not let us call the * main program, which clears the tail call counter on entry. */ - /* mvc STK_OFF_TCCNT(4,%r15),N(%r15) */ + /* mvc STK_OFF_TCCNT(4,%r15),frame_off+STK_OFF_TCCNT(%r15) */ _EMIT6(0xd203f000 | STK_OFF_TCCNT, - 0xf000 | (STK_OFF_TCCNT + STK_OFF + stack_depth)); + 0xf000 | (jit->frame_off + STK_OFF_TCCNT));
/* Sign-extend the kfunc arguments. */ if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) { @@ -1836,10 +1832,7 @@ static noinline int bpf_jit_insn(struct * goto out; */
- if (jit->seen & SEEN_STACK) - off = STK_OFF_TCCNT + STK_OFF + stack_depth; - else - off = STK_OFF_TCCNT; + off = jit->frame_off + STK_OFF_TCCNT; /* lhi %w0,1 */ EMIT4_IMM(0xa7080000, REG_W0, 1); /* laal %w1,%w0,off(%r15) */ @@ -1869,7 +1862,7 @@ static noinline int bpf_jit_insn(struct /* * Restore registers before calling function */ - save_restore_regs(jit, REGS_RESTORE, stack_depth, 0); + save_restore_regs(jit, REGS_RESTORE, 0);
/* * goto *(prog->bpf_func + tail_call_start); @@ -2161,7 +2154,7 @@ static int bpf_set_addr(struct bpf_jit * * Compile eBPF program into s390x code */ static int bpf_jit_prog(struct bpf_jit *jit, struct bpf_prog *fp, - bool extra_pass, u32 stack_depth) + bool extra_pass) { int i, insn_count, lit32_size, lit64_size; u64 kern_arena; @@ -2170,24 +2163,28 @@ static int bpf_jit_prog(struct bpf_jit * jit->lit64 = jit->lit64_start; jit->prg = 0; jit->excnt = 0; + if (is_first_pass(jit) || (jit->seen & SEEN_STACK)) + jit->frame_off = STK_OFF + round_up(fp->aux->stack_depth, 8); + else + jit->frame_off = 0;
kern_arena = bpf_arena_get_kern_vm_start(fp->aux->arena); if (kern_arena) jit->kern_arena = _EMIT_CONST_U64(kern_arena); jit->user_arena = bpf_arena_get_user_vm_start(fp->aux->arena);
- bpf_jit_prologue(jit, fp, stack_depth); + bpf_jit_prologue(jit, fp); if (bpf_set_addr(jit, 0) < 0) return -1; for (i = 0; i < fp->len; i += insn_count) { - insn_count = bpf_jit_insn(jit, fp, i, extra_pass, stack_depth); + insn_count = bpf_jit_insn(jit, fp, i, extra_pass); if (insn_count < 0) return -1; /* Next instruction address */ if (bpf_set_addr(jit, i + insn_count) < 0) return -1; } - bpf_jit_epilogue(jit, stack_depth); + bpf_jit_epilogue(jit);
lit32_size = jit->lit32 - jit->lit32_start; lit64_size = jit->lit64 - jit->lit64_start; @@ -2263,7 +2260,6 @@ static struct bpf_binary_header *bpf_jit */ struct bpf_prog *bpf_int_jit_compile(struct bpf_prog *fp) { - u32 stack_depth = round_up(fp->aux->stack_depth, 8); struct bpf_prog *tmp, *orig_fp = fp; struct bpf_binary_header *header; struct s390_jit_data *jit_data; @@ -2316,7 +2312,7 @@ struct bpf_prog *bpf_int_jit_compile(str * - 3: Calculate program size and addrs array */ for (pass = 1; pass <= 3; pass++) { - if (bpf_jit_prog(&jit, fp, extra_pass, stack_depth)) { + if (bpf_jit_prog(&jit, fp, extra_pass)) { fp = orig_fp; goto free_addrs; } @@ -2330,7 +2326,7 @@ struct bpf_prog *bpf_int_jit_compile(str goto free_addrs; } skip_init_ctx: - if (bpf_jit_prog(&jit, fp, extra_pass, stack_depth)) { + if (bpf_jit_prog(&jit, fp, extra_pass)) { bpf_jit_binary_free(header); fp = orig_fp; goto free_addrs;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilya Leoshkevich iii@linux.ibm.com
commit e26d523edf2a62b142d2dd2dd9b87f61ed92f33a upstream.
Currently the caller-allocated portion of the stack frame is described using constants, hardcoded values, and an ASCII drawing, making it harder than necessary to ensure that everything is in sync.
Declare a struct and use offsetof() and offsetofend() macros to refer to various values stored within the frame.
Signed-off-by: Ilya Leoshkevich iii@linux.ibm.com Link: https://lore.kernel.org/r/20250624121501.50536-3-iii@linux.ibm.com Signed-off-by: Alexei Starovoitov ast@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/s390/net/bpf_jit.h | 55 ---------------------------------- arch/s390/net/bpf_jit_comp.c | 69 +++++++++++++++++++++++++++++-------------- 2 files changed, 47 insertions(+), 77 deletions(-) delete mode 100644 arch/s390/net/bpf_jit.h
--- a/arch/s390/net/bpf_jit.h +++ /dev/null @@ -1,55 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0 */ -/* - * BPF Jit compiler defines - * - * Copyright IBM Corp. 2012,2015 - * - * Author(s): Martin Schwidefsky schwidefsky@de.ibm.com - * Michael Holzheu holzheu@linux.vnet.ibm.com - */ - -#ifndef __ARCH_S390_NET_BPF_JIT_H -#define __ARCH_S390_NET_BPF_JIT_H - -#ifndef __ASSEMBLY__ - -#include <linux/filter.h> -#include <linux/types.h> - -#endif /* __ASSEMBLY__ */ - -/* - * Stackframe layout (packed stack): - * - * ^ high - * +---------------+ | - * | old backchain | | - * +---------------+ | - * | r15 - r6 | | - * +---------------+ | - * | 4 byte align | | - * | tail_call_cnt | | - * BFP -> +===============+ | - * | | | - * | BPF stack | | - * | | | - * R15+160 -> +---------------+ | - * | new backchain | | - * R15+152 -> +---------------+ | - * | + 152 byte SA | | - * R15 -> +---------------+ + low - * - * We get 160 bytes stack space from calling function, but only use - * 12 * 8 byte for old backchain, r15..r6, and tail_call_cnt. - * - * The stack size used by the BPF program ("BPF stack" above) is passed - * via "aux->stack_depth". - */ -#define STK_SPACE_ADD (160) -#define STK_160_UNUSED (160 - 12 * 8) -#define STK_OFF (STK_SPACE_ADD - STK_160_UNUSED) - -#define STK_OFF_R6 (160 - 11 * 8) /* Offset of r6 on stack */ -#define STK_OFF_TCCNT (160 - 12 * 8) /* Offset of tail_call_cnt on stack */ - -#endif /* __ARCH_S390_NET_BPF_JIT_H */ --- a/arch/s390/net/bpf_jit_comp.c +++ b/arch/s390/net/bpf_jit_comp.c @@ -32,7 +32,6 @@ #include <asm/set_memory.h> #include <asm/text-patching.h> #include <asm/unwind.h> -#include "bpf_jit.h"
struct bpf_jit { u32 seen; /* Flags to remember seen eBPF instructions */ @@ -56,7 +55,7 @@ struct bpf_jit { int prologue_plt; /* Start of prologue hotpatch PLT */ int kern_arena; /* Pool offset of kernel arena address */ u64 user_arena; /* User arena address */ - u32 frame_off; /* Offset of frame from %r15 */ + u32 frame_off; /* Offset of struct bpf_prog from %r15 */ };
#define SEEN_MEM BIT(0) /* use mem[] for temporary storage */ @@ -405,11 +404,25 @@ static void jit_fill_hole(void *area, un }
/* + * Caller-allocated part of the frame. + * Thanks to packed stack, its otherwise unused initial part can be used for + * the BPF stack and for the next frame. + */ +struct prog_frame { + u64 unused[8]; + /* BPF stack starts here and grows towards 0 */ + u32 tail_call_cnt; + u32 pad; + u64 r6[10]; /* r6 - r15 */ + u64 backchain; +} __packed; + +/* * Save registers from "rs" (register start) to "re" (register end) on stack */ static void save_regs(struct bpf_jit *jit, u32 rs, u32 re) { - u32 off = STK_OFF_R6 + (rs - 6) * 8; + u32 off = offsetof(struct prog_frame, r6) + (rs - 6) * 8;
if (rs == re) /* stg %rs,off(%r15) */ @@ -424,7 +437,7 @@ static void save_regs(struct bpf_jit *ji */ static void restore_regs(struct bpf_jit *jit, u32 rs, u32 re) { - u32 off = jit->frame_off + STK_OFF_R6 + (rs - 6) * 8; + u32 off = jit->frame_off + offsetof(struct prog_frame, r6) + (rs - 6) * 8;
if (rs == re) /* lg %rs,off(%r15) */ @@ -556,10 +569,12 @@ static void bpf_jit_plt(struct bpf_plt * * Emit function prologue * * Save registers and create stack frame if necessary. - * See stack frame layout description in "bpf_jit.h"! + * Stack frame layout is described by struct prog_frame. */ static void bpf_jit_prologue(struct bpf_jit *jit, struct bpf_prog *fp) { + BUILD_BUG_ON(sizeof(struct prog_frame) != STACK_FRAME_OVERHEAD); + /* No-op for hotpatching */ /* brcl 0,prologue_plt */ EMIT6_PCREL_RILC(0xc0040000, 0, jit->prologue_plt); @@ -567,8 +582,9 @@ static void bpf_jit_prologue(struct bpf_
if (!bpf_is_subprog(fp)) { /* Initialize the tail call counter in the main program. */ - /* xc STK_OFF_TCCNT(4,%r15),STK_OFF_TCCNT(%r15) */ - _EMIT6(0xd703f000 | STK_OFF_TCCNT, 0xf000 | STK_OFF_TCCNT); + /* xc tail_call_cnt(4,%r15),tail_call_cnt(%r15) */ + _EMIT6(0xd703f000 | offsetof(struct prog_frame, tail_call_cnt), + 0xf000 | offsetof(struct prog_frame, tail_call_cnt)); } else { /* * Skip the tail call counter initialization in subprograms. @@ -611,13 +627,15 @@ static void bpf_jit_prologue(struct bpf_ if (is_first_pass(jit) || (jit->seen & SEEN_STACK)) { /* lgr %w1,%r15 (backchain) */ EMIT4(0xb9040000, REG_W1, REG_15); - /* la %bfp,STK_160_UNUSED(%r15) (BPF frame pointer) */ - EMIT4_DISP(0x41000000, BPF_REG_FP, REG_15, STK_160_UNUSED); + /* la %bfp,unused_end(%r15) (BPF frame pointer) */ + EMIT4_DISP(0x41000000, BPF_REG_FP, REG_15, + offsetofend(struct prog_frame, unused)); /* aghi %r15,-frame_off */ EMIT4_IMM(0xa70b0000, REG_15, -jit->frame_off); - /* stg %w1,152(%r15) (backchain) */ + /* stg %w1,backchain(%r15) */ EMIT6_DISP_LH(0xe3000000, 0x0024, REG_W1, REG_0, - REG_15, 152); + REG_15, + offsetof(struct prog_frame, backchain)); } }
@@ -1779,9 +1797,10 @@ static noinline int bpf_jit_insn(struct * Note 2: We assume that the verifier does not let us call the * main program, which clears the tail call counter on entry. */ - /* mvc STK_OFF_TCCNT(4,%r15),frame_off+STK_OFF_TCCNT(%r15) */ - _EMIT6(0xd203f000 | STK_OFF_TCCNT, - 0xf000 | (jit->frame_off + STK_OFF_TCCNT)); + /* mvc tail_call_cnt(4,%r15),frame_off+tail_call_cnt(%r15) */ + _EMIT6(0xd203f000 | offsetof(struct prog_frame, tail_call_cnt), + 0xf000 | (jit->frame_off + + offsetof(struct prog_frame, tail_call_cnt)));
/* Sign-extend the kfunc arguments. */ if (insn->src_reg == BPF_PSEUDO_KFUNC_CALL) { @@ -1832,7 +1851,8 @@ static noinline int bpf_jit_insn(struct * goto out; */
- off = jit->frame_off + STK_OFF_TCCNT; + off = jit->frame_off + + offsetof(struct prog_frame, tail_call_cnt); /* lhi %w0,1 */ EMIT4_IMM(0xa7080000, REG_W0, 1); /* laal %w1,%w0,off(%r15) */ @@ -2164,7 +2184,9 @@ static int bpf_jit_prog(struct bpf_jit * jit->prg = 0; jit->excnt = 0; if (is_first_pass(jit) || (jit->seen & SEEN_STACK)) - jit->frame_off = STK_OFF + round_up(fp->aux->stack_depth, 8); + jit->frame_off = sizeof(struct prog_frame) - + offsetofend(struct prog_frame, unused) + + round_up(fp->aux->stack_depth, 8); else jit->frame_off = 0;
@@ -2647,9 +2669,10 @@ static int __arch_prepare_bpf_trampoline /* stg %r1,backchain_off(%r15) */ EMIT6_DISP_LH(0xe3000000, 0x0024, REG_1, REG_0, REG_15, tjit->backchain_off); - /* mvc tccnt_off(4,%r15),stack_size+STK_OFF_TCCNT(%r15) */ + /* mvc tccnt_off(4,%r15),stack_size+tail_call_cnt(%r15) */ _EMIT6(0xd203f000 | tjit->tccnt_off, - 0xf000 | (tjit->stack_size + STK_OFF_TCCNT)); + 0xf000 | (tjit->stack_size + + offsetof(struct prog_frame, tail_call_cnt))); /* stmg %r2,%rN,fwd_reg_args_off(%r15) */ if (nr_reg_args) EMIT6_DISP_LH(0xeb000000, 0x0024, REG_2, @@ -2786,8 +2809,9 @@ static int __arch_prepare_bpf_trampoline (nr_stack_args * sizeof(u64) - 1) << 16 | tjit->stack_args_off, 0xf000 | tjit->orig_stack_args_off); - /* mvc STK_OFF_TCCNT(4,%r15),tccnt_off(%r15) */ - _EMIT6(0xd203f000 | STK_OFF_TCCNT, 0xf000 | tjit->tccnt_off); + /* mvc tail_call_cnt(4,%r15),tccnt_off(%r15) */ + _EMIT6(0xd203f000 | offsetof(struct prog_frame, tail_call_cnt), + 0xf000 | tjit->tccnt_off); /* lgr %r1,%r8 */ EMIT4(0xb9040000, REG_1, REG_8); /* %r1() */ @@ -2844,8 +2868,9 @@ static int __arch_prepare_bpf_trampoline if (flags & (BPF_TRAMP_F_CALL_ORIG | BPF_TRAMP_F_RET_FENTRY_RET)) EMIT6_DISP_LH(0xe3000000, 0x0004, REG_2, REG_0, REG_15, tjit->retval_off); - /* mvc stack_size+STK_OFF_TCCNT(4,%r15),tccnt_off(%r15) */ - _EMIT6(0xd203f000 | (tjit->stack_size + STK_OFF_TCCNT), + /* mvc stack_size+tail_call_cnt(4,%r15),tccnt_off(%r15) */ + _EMIT6(0xd203f000 | (tjit->stack_size + + offsetof(struct prog_frame, tail_call_cnt)), 0xf000 | tjit->tccnt_off); /* aghi %r15,stack_size */ EMIT4_IMM(0xa70b0000, REG_15, tjit->stack_size);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilya Leoshkevich iii@linux.ibm.com
commit c861a6b147137d10b5ff88a2c492ba376cd1b8b0 upstream.
The tailcall_bpf2bpf_hierarchy_1 test hangs on s390. Its call graph is as follows:
entry() subprog_tail() bpf_tail_call_static(0) -> entry + tail_call_start subprog_tail() bpf_tail_call_static(0) -> entry + tail_call_start
entry() copies its tail call counter to the subprog_tail()'s frame, which then increments it. However, the incremented result is discarded, leading to an astronomically large number of tail calls.
Fix by writing the incremented counter back to the entry()'s frame.
Fixes: dd691e847d28 ("s390/bpf: Implement bpf_jit_supports_subprog_tailcalls()") Signed-off-by: Ilya Leoshkevich iii@linux.ibm.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20250813121016.163375-3-iii@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/s390/net/bpf_jit_comp.c | 23 ++++++++++++++++------- 1 file changed, 16 insertions(+), 7 deletions(-)
--- a/arch/s390/net/bpf_jit_comp.c +++ b/arch/s390/net/bpf_jit_comp.c @@ -1789,13 +1789,6 @@ static noinline int bpf_jit_insn(struct jit->seen |= SEEN_FUNC; /* * Copy the tail call counter to where the callee expects it. - * - * Note 1: The callee can increment the tail call counter, but - * we do not load it back, since the x86 JIT does not do this - * either. - * - * Note 2: We assume that the verifier does not let us call the - * main program, which clears the tail call counter on entry. */ /* mvc tail_call_cnt(4,%r15),frame_off+tail_call_cnt(%r15) */ _EMIT6(0xd203f000 | offsetof(struct prog_frame, tail_call_cnt), @@ -1822,6 +1815,22 @@ static noinline int bpf_jit_insn(struct call_r1(jit); /* lgr %b0,%r2: load return value into %b0 */ EMIT4(0xb9040000, BPF_REG_0, REG_2); + + /* + * Copy the potentially updated tail call counter back. + */ + + if (insn->src_reg == BPF_PSEUDO_CALL) + /* + * mvc frame_off+tail_call_cnt(%r15), + * tail_call_cnt(4,%r15) + */ + _EMIT6(0xd203f000 | (jit->frame_off + + offsetof(struct prog_frame, + tail_call_cnt)), + 0xf000 | offsetof(struct prog_frame, + tail_call_cnt)); + break; } case BPF_JMP | BPF_TAIL_CALL: {
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ilya Leoshkevich iii@linux.ibm.com
commit bc3905a71f02511607d3ccf732360580209cac4c upstream.
The tailcall_bpf2bpf_hierarchy_fentry test hangs on s390. Its call graph is as follows:
entry() subprog_tail() trampoline() fentry() the rest of subprog_tail() # via BPF_TRAMP_F_CALL_ORIG return to entry()
The problem is that the rest of subprog_tail() increments the tail call counter, but the trampoline discards the incremented value. This results in an astronomically large number of tail calls.
Fix by making the trampoline write the incremented tail call counter back.
Fixes: 528eb2cb87bc ("s390/bpf: Implement arch_prepare_bpf_trampoline()") Signed-off-by: Ilya Leoshkevich iii@linux.ibm.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20250813121016.163375-4-iii@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/s390/net/bpf_jit_comp.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/arch/s390/net/bpf_jit_comp.c +++ b/arch/s390/net/bpf_jit_comp.c @@ -2828,6 +2828,9 @@ static int __arch_prepare_bpf_trampoline /* stg %r2,retval_off(%r15) */ EMIT6_DISP_LH(0xe3000000, 0x0024, REG_2, REG_0, REG_15, tjit->retval_off); + /* mvc tccnt_off(%r15),tail_call_cnt(4,%r15) */ + _EMIT6(0xd203f000 | tjit->tccnt_off, + 0xf000 | offsetof(struct prog_frame, tail_call_cnt));
im->ip_after_call = jit->prg_buf + jit->prg;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hongbo Li lihongbo22@huawei.com
[ Upstream commit 40d7af5375a4e27d8576d9d11954ac213d06f09e ]
Replace the open coded
if (foo) __set_bit(n, bar); else __clear_bit(n, bar);
with __assign_bit(). No functional change intended.
Signed-off-by: Hongbo Li lihongbo22@huawei.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Palmer Dabbelt palmer@rivosinc.com Link: https://lore.kernel.org/all/20240902130824.2878644-1-lihongbo22@huawei.com Stable-dep-of: f75e07bf5226 ("irqchip/sifive-plic: Avoid interrupt ID 0 handling during suspend/resume") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/irqchip/irq-sifive-plic.c | 9 ++++----- 1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c index 36dbcf2d728a5..bf69a4802b71e 100644 --- a/drivers/irqchip/irq-sifive-plic.c +++ b/drivers/irqchip/irq-sifive-plic.c @@ -252,11 +252,10 @@ static int plic_irq_suspend(void)
priv = per_cpu_ptr(&plic_handlers, smp_processor_id())->priv;
- for (i = 0; i < priv->nr_irqs; i++) - if (readl(priv->regs + PRIORITY_BASE + i * PRIORITY_PER_ID)) - __set_bit(i, priv->prio_save); - else - __clear_bit(i, priv->prio_save); + for (i = 0; i < priv->nr_irqs; i++) { + __assign_bit(i, priv->prio_save, + readl(priv->regs + PRIORITY_BASE + i * PRIORITY_PER_ID)); + }
for_each_cpu(cpu, cpu_present_mask) { struct plic_handler *handler = per_cpu_ptr(&plic_handlers, cpu);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lucas Zampieri lzampier@redhat.com
[ Upstream commit f75e07bf5226da640fa99a0594687c780d9bace4 ]
According to the PLIC specification[1], global interrupt sources are assigned small unsigned integer identifiers beginning at the value 1. An interrupt ID of 0 is reserved to mean "no interrupt".
The current plic_irq_resume() and plic_irq_suspend() functions incorrectly start the loop from index 0, which accesses the register space for the reserved interrupt ID 0.
Change the loop to start from index 1, skipping the reserved interrupt ID 0 as per the PLIC specification.
This prevents potential undefined behavior when accessing the reserved register space during suspend/resume cycles.
Fixes: e80f0b6a2cf3 ("irqchip/irq-sifive-plic: Add syscore callbacks for hibernation") Co-developed-by: Jia Wang wangjia@ultrarisc.com Signed-off-by: Jia Wang wangjia@ultrarisc.com Co-developed-by: Charles Mirabile cmirabil@redhat.com Signed-off-by: Charles Mirabile cmirabil@redhat.com Signed-off-by: Lucas Zampieri lzampier@redhat.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Link: https://github.com/riscv/riscv-plic-spec/releases/tag/1.0.0 Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/irqchip/irq-sifive-plic.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/irqchip/irq-sifive-plic.c b/drivers/irqchip/irq-sifive-plic.c index bf69a4802b71e..9c4af7d588463 100644 --- a/drivers/irqchip/irq-sifive-plic.c +++ b/drivers/irqchip/irq-sifive-plic.c @@ -252,7 +252,8 @@ static int plic_irq_suspend(void)
priv = per_cpu_ptr(&plic_handlers, smp_processor_id())->priv;
- for (i = 0; i < priv->nr_irqs; i++) { + /* irq ID 0 is reserved */ + for (i = 1; i < priv->nr_irqs; i++) { __assign_bit(i, priv->prio_save, readl(priv->regs + PRIORITY_BASE + i * PRIORITY_PER_ID)); } @@ -283,7 +284,8 @@ static void plic_irq_resume(void)
priv = per_cpu_ptr(&plic_handlers, smp_processor_id())->priv;
- for (i = 0; i < priv->nr_irqs; i++) { + /* irq ID 0 is reserved */ + for (i = 1; i < priv->nr_irqs; i++) { index = BIT_WORD(i); writel((priv->prio_save[index] & BIT_MASK(i)) ? 1 : 0, priv->regs + PRIORITY_BASE + i * PRIORITY_PER_ID);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miklos Szeredi mszeredi@redhat.com
[ Upstream commit f8f59a2c05dc16d19432e3154a9ac7bc385f4b92 ]
If the process runs in 32-bit compat mode, copy_file_range results can be in the in-band error range. In this case limit copy length to MAX_RW_COUNT to prevent a signed overflow.
Reported-by: Florian Weimer fweimer@redhat.com Closes: https://lore.kernel.org/all/lhuh5ynl8z5.fsf@oldenburg.str.redhat.com/ Signed-off-by: Miklos Szeredi mszeredi@redhat.com Link: https://lore.kernel.org/20250813151107.99856-1-mszeredi@redhat.com Reviewed-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/read_write.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/fs/read_write.c b/fs/read_write.c index befec0b5c537a..46408bab92385 100644 --- a/fs/read_write.c +++ b/fs/read_write.c @@ -1600,6 +1600,13 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in, if (len == 0) return 0;
+ /* + * Make sure return value doesn't overflow in 32bit compat mode. Also + * limit the size for all cases except when calling ->copy_file_range(). + */ + if (splice || !file_out->f_op->copy_file_range || in_compat_syscall()) + len = min_t(size_t, MAX_RW_COUNT, len); + file_start_write(file_out);
/* @@ -1613,9 +1620,7 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in, len, flags); } else if (!splice && file_in->f_op->remap_file_range && samesb) { ret = file_in->f_op->remap_file_range(file_in, pos_in, - file_out, pos_out, - min_t(loff_t, MAX_RW_COUNT, len), - REMAP_FILE_CAN_SHORTEN); + file_out, pos_out, len, REMAP_FILE_CAN_SHORTEN); /* fallback to splice */ if (ret <= 0) splice = true; @@ -1648,8 +1653,7 @@ ssize_t vfs_copy_file_range(struct file *file_in, loff_t pos_in, * to splicing from input file, while file_start_write() is held on * the output file on a different sb. */ - ret = do_splice_direct(file_in, &pos_in, file_out, &pos_out, - min_t(size_t, len, MAX_RW_COUNT), 0); + ret = do_splice_direct(file_in, &pos_in, file_out, &pos_out, len, 0); done: if (ret > 0) { fsnotify_access(file_in);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp
[ Upstream commit 73861970938ad1323eb02bbbc87f6fbd1e5bacca ]
The inode mode loaded from corrupted disk can be invalid. Do like what commit 0a9e74051313 ("isofs: Verify inode mode when loading from disk") does.
Reported-by: syzbot syzbot+895c23f6917da440ed0d@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=895c23f6917da440ed0d Signed-off-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Link: https://lore.kernel.org/ec982681-84b8-4624-94fa-8af15b77cbd2@I-love.SAKURA.n... Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/minix/inode.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/fs/minix/inode.c b/fs/minix/inode.c index f007e389d5d29..fc01f9dc8c391 100644 --- a/fs/minix/inode.c +++ b/fs/minix/inode.c @@ -491,8 +491,14 @@ void minix_set_inode(struct inode *inode, dev_t rdev) inode->i_op = &minix_symlink_inode_operations; inode_nohighmem(inode); inode->i_mapping->a_ops = &minix_aops; - } else + } else if (S_ISCHR(inode->i_mode) || S_ISBLK(inode->i_mode) || + S_ISFIFO(inode->i_mode) || S_ISSOCK(inode->i_mode)) { init_special_inode(inode, inode->i_mode, rdev); + } else { + printk(KERN_DEBUG "MINIX-fs: Invalid file type 0%04o for inode %lu.\n", + inode->i_mode, inode->i_ino); + make_bad_inode(inode); + } }
/*
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: gaoxiang17 gaoxiang17@xiaomi.com
[ Upstream commit 006568ab4c5ca2309ceb36fa553e390b4aa9c0c7 ]
__task_pid_nr_ns ns = task_active_pid_ns(current); pid_nr_ns(rcu_dereference(*task_pid_ptr(task, type)), ns); if (pid && ns->level <= pid->level) {
Sometimes null is returned for task_active_pid_ns. Then it will trigger kernel panic in pid_nr_ns.
For example: Unable to handle kernel NULL pointer dereference at virtual address 0000000000000058 Mem abort info: ESR = 0x0000000096000007 EC = 0x25: DABT (current EL), IL = 32 bits SET = 0, FnV = 0 EA = 0, S1PTW = 0 FSC = 0x07: level 3 translation fault Data abort info: ISV = 0, ISS = 0x00000007, ISS2 = 0x00000000 CM = 0, WnR = 0, TnD = 0, TagAccess = 0 GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 user pgtable: 4k pages, 39-bit VAs, pgdp=00000002175aa000 [0000000000000058] pgd=08000002175ab003, p4d=08000002175ab003, pud=08000002175ab003, pmd=08000002175be003, pte=0000000000000000 pstate: 834000c5 (Nzcv daIF +PAN -UAO +TCO +DIT -SSBS BTYPE=--) pc : __task_pid_nr_ns+0x74/0xd0 lr : __task_pid_nr_ns+0x24/0xd0 sp : ffffffc08001bd10 x29: ffffffc08001bd10 x28: ffffffd4422b2000 x27: 0000000000000001 x26: ffffffd442821168 x25: ffffffd442821000 x24: 00000f89492eab31 x23: 00000000000000c0 x22: ffffff806f5693c0 x21: ffffff806f5693c0 x20: 0000000000000001 x19: 0000000000000000 x18: 0000000000000000 x17: 00000000529c6ef0 x16: 00000000529c6ef0 x15: 00000000023a1adc x14: 0000000000000003 x13: 00000000007ef6d8 x12: 001167c391c78800 x11: 00ffffffffffffff x10: 0000000000000000 x9 : 0000000000000001 x8 : ffffff80816fa3c0 x7 : 0000000000000000 x6 : 49534d702d535449 x5 : ffffffc080c4c2c0 x4 : ffffffd43ee128c8 x3 : ffffffd43ee124dc x2 : 0000000000000000 x1 : 0000000000000001 x0 : ffffff806f5693c0 Call trace: __task_pid_nr_ns+0x74/0xd0 ... __handle_irq_event_percpu+0xd4/0x284 handle_irq_event+0x48/0xb0 handle_fasteoi_irq+0x160/0x2d8 generic_handle_domain_irq+0x44/0x60 gic_handle_irq+0x4c/0x114 call_on_irq_stack+0x3c/0x74 do_interrupt_handler+0x4c/0x84 el1_interrupt+0x34/0x58 el1h_64_irq_handler+0x18/0x24 el1h_64_irq+0x68/0x6c account_kernel_stack+0x60/0x144 exit_task_stack_account+0x1c/0x80 do_exit+0x7e4/0xaf8 ... get_signal+0x7bc/0x8d8 do_notify_resume+0x128/0x828 el0_svc+0x6c/0x70 el0t_64_sync_handler+0x68/0xbc el0t_64_sync+0x1a8/0x1ac Code: 35fffe54 911a02a8 f9400108 b4000128 (b9405a69) ---[ end trace 0000000000000000 ]--- Kernel panic - not syncing: Oops: Fatal exception in interrupt
Signed-off-by: gaoxiang17 gaoxiang17@xiaomi.com Link: https://lore.kernel.org/20250802022123.3536934-1-gxxa03070307@gmail.com Reviewed-by: Baoquan He bhe@redhat.com Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/pid.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/pid.c b/kernel/pid.c index 2715afb77eab8..b80c3bfb58d07 100644 --- a/kernel/pid.c +++ b/kernel/pid.c @@ -487,7 +487,7 @@ pid_t pid_nr_ns(struct pid *pid, struct pid_namespace *ns) struct upid *upid; pid_t nr = 0;
- if (pid && ns->level <= pid->level) { + if (pid && ns && ns->level <= pid->level) { upid = &pid->numbers[ns->level]; if (upid->ns == ns) nr = upid->nr;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Oleg Nesterov oleg@redhat.com
[ Upstream commit abdfd4948e45c51b19162cf8b3f5003f8f53c9b9 ]
task_pid_vnr(another_task) will crash if the caller was already reaped. The pid_alive(current) check can't really help, the parent/debugger can call release_task() right after this check.
This also means that even task_ppid_nr_ns(current, NULL) is not safe, pid_alive() only ensures that it is safe to dereference ->real_parent.
Change __task_pid_nr_ns() to ensure ns != NULL.
Originally-by: 高翔 gaoxiang17@xiaomi.com Link: https://lore.kernel.org/all/20250802022123.3536934-1-gxxa03070307@gmail.com/ Signed-off-by: Oleg Nesterov oleg@redhat.com Link: https://lore.kernel.org/20250810173604.GA19991@redhat.com Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/pid.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/kernel/pid.c b/kernel/pid.c index b80c3bfb58d07..8fdc3a5f87c7d 100644 --- a/kernel/pid.c +++ b/kernel/pid.c @@ -510,7 +510,8 @@ pid_t __task_pid_nr_ns(struct task_struct *task, enum pid_type type, rcu_read_lock(); if (!ns) ns = task_active_pid_ns(current); - nr = pid_nr_ns(rcu_dereference(*task_pid_ptr(task, type)), ns); + if (ns) + nr = pid_nr_ns(rcu_dereference(*task_pid_ptr(task, type)), ns); rcu_read_unlock();
return nr;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Lichen Liu lichliu@redhat.com
[ Upstream commit 278033a225e13ec21900f0a92b8351658f5377f2 ]
When CONFIG_TMPFS is enabled, the initial root filesystem is a tmpfs. By default, a tmpfs mount is limited to using 50% of the available RAM for its content. This can be problematic in memory-constrained environments, particularly during a kdump capture.
In a kdump scenario, the capture kernel boots with a limited amount of memory specified by the 'crashkernel' parameter. If the initramfs is large, it may fail to unpack into the tmpfs rootfs due to insufficient space. This is because to get X MB of usable space in tmpfs, 2*X MB of memory must be available for the mount. This leads to an OOM failure during the early boot process, preventing a successful crash dump.
This patch introduces a new kernel command-line parameter, initramfs_options, which allows passing specific mount options directly to the rootfs when it is first mounted. This gives users control over the rootfs behavior.
For example, a user can now specify initramfs_options=size=75% to allow the tmpfs to use up to 75% of the available memory. This can significantly reduce the memory pressure for kdump.
Consider a practical example:
To unpack a 48MB initramfs, the tmpfs needs 48MB of usable space. With the default 50% limit, this requires a memory pool of 96MB to be available for the tmpfs mount. The total memory requirement is therefore approximately: 16MB (vmlinuz) + 48MB (loaded initramfs) + 48MB (unpacked kernel) + 96MB (for tmpfs) + 12MB (runtime overhead) ≈ 220MB.
By using initramfs_options=size=75%, the memory pool required for the 48MB tmpfs is reduced to 48MB / 0.75 = 64MB. This reduces the total memory requirement by 32MB (96MB - 64MB), allowing the kdump to succeed with a smaller crashkernel size, such as 192MB.
An alternative approach of reusing the existing rootflags parameter was considered. However, a new, dedicated initramfs_options parameter was chosen to avoid altering the current behavior of rootflags (which applies to the final root filesystem) and to prevent any potential regressions.
Also add documentation for the new kernel parameter "initramfs_options"
This approach is inspired by prior discussions and patches on the topic. Ref: https://www.lightofdawn.org/blog/?viewDetailed=00128 Ref: https://landley.net/notes-2015.html#01-01-2015 Ref: https://lkml.org/lkml/2021/6/29/783 Ref: https://www.kernel.org/doc/html/latest/filesystems/ramfs-rootfs-initramfs.ht...
Signed-off-by: Lichen Liu lichliu@redhat.com Link: https://lore.kernel.org/20250815121459.3391223-1-lichliu@redhat.com Tested-by: Rob Landley rob@landley.net Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/admin-guide/kernel-parameters.txt | 3 +++ fs/namespace.c | 11 ++++++++++- 2 files changed, 13 insertions(+), 1 deletion(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 8724c2c580b88..e88505e945d52 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -5923,6 +5923,9 @@
rootflags= [KNL] Set root filesystem mount option string
+ initramfs_options= [KNL] + Specify mount options for for the initramfs mount. + rootfstype= [KNL] Set root filesystem type
rootwait [KNL] Wait (indefinitely) for root device to show up. diff --git a/fs/namespace.c b/fs/namespace.c index 7606969412493..f5e46c2595b11 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -65,6 +65,15 @@ static int __init set_mphash_entries(char *str) } __setup("mphash_entries=", set_mphash_entries);
+static char * __initdata initramfs_options; +static int __init initramfs_options_setup(char *str) +{ + initramfs_options = str; + return 1; +} + +__setup("initramfs_options=", initramfs_options_setup); + static u64 event; static DEFINE_IDA(mnt_id_ida); static DEFINE_IDA(mnt_group_ida); @@ -5566,7 +5575,7 @@ static void __init init_mount_tree(void) struct mnt_namespace *ns; struct path root;
- mnt = vfs_kern_mount(&rootfs_fs_type, 0, "rootfs", NULL); + mnt = vfs_kern_mount(&rootfs_fs_type, 0, "rootfs", initramfs_options); if (IS_ERR(mnt)) panic("Can't create rootfs");
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp
[ Upstream commit 7f9d34b0a7cb93d678ee7207f0634dbf79e47fe5 ]
The inode mode loaded from corrupted disk can be invalid. Do like what commit 0a9e74051313 ("isofs: Verify inode mode when loading from disk") does.
Reported-by: syzbot syzbot+895c23f6917da440ed0d@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=895c23f6917da440ed0d Signed-off-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Link: https://lore.kernel.org/429b3ef1-13de-4310-9a8e-c2dc9a36234a@I-love.SAKURA.n... Acked-by: Nicolas Pitre nico@fluxnic.net Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cramfs/inode.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/fs/cramfs/inode.c b/fs/cramfs/inode.c index b84d1747a0205..e7d192f7ab3b4 100644 --- a/fs/cramfs/inode.c +++ b/fs/cramfs/inode.c @@ -117,9 +117,18 @@ static struct inode *get_cramfs_inode(struct super_block *sb, inode_nohighmem(inode); inode->i_data.a_ops = &cramfs_aops; break; - default: + case S_IFCHR: + case S_IFBLK: + case S_IFIFO: + case S_IFSOCK: init_special_inode(inode, cramfs_inode->mode, old_decode_dev(cramfs_inode->size)); + break; + default: + printk(KERN_DEBUG "CRAMFS: Invalid file type 0%04o for inode %lu.\n", + inode->i_mode, inode->i_ino); + iget_failed(inode); + return ERR_PTR(-EIO); }
inode->i_mode = cramfs_inode->mode;
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
[ Upstream commit 66c14dccd810d42ec5c73bb8a9177489dfd62278 ]
process_inode_switch_wbs_work() can be switching over 100 inodes to a different cgroup. Since switching an inode requires counting all dirty & under-writeback pages in the address space of each inode, this can take a significant amount of time. Add a possibility to reschedule after processing each inode to avoid softlockups.
Acked-by: Tejun Heo tj@kernel.org Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/fs-writeback.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c index 4ae226778d646..eff778dc0386c 100644 --- a/fs/fs-writeback.c +++ b/fs/fs-writeback.c @@ -503,6 +503,7 @@ static void inode_switch_wbs_work_fn(struct work_struct *work) */ down_read(&bdi->wb_switch_rwsem);
+ inodep = isw->inodes; /* * By the time control reaches here, RCU grace period has passed * since I_WB_SWITCH assertion and all wb stat update transactions @@ -513,6 +514,7 @@ static void inode_switch_wbs_work_fn(struct work_struct *work) * gives us exclusion against all wb related operations on @inode * including IO list manipulations and stat updates. */ +relock: if (old_wb < new_wb) { spin_lock(&old_wb->list_lock); spin_lock_nested(&new_wb->list_lock, SINGLE_DEPTH_NESTING); @@ -521,10 +523,17 @@ static void inode_switch_wbs_work_fn(struct work_struct *work) spin_lock_nested(&old_wb->list_lock, SINGLE_DEPTH_NESTING); }
- for (inodep = isw->inodes; *inodep; inodep++) { + while (*inodep) { WARN_ON_ONCE((*inodep)->i_wb != old_wb); if (inode_do_switch_wbs(*inodep, old_wb, new_wb)) nr_switched++; + inodep++; + if (*inodep && need_resched()) { + spin_unlock(&new_wb->list_lock); + spin_unlock(&old_wb->list_lock); + cond_resched(); + goto relock; + } }
spin_unlock(&new_wb->list_lock);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
[ Upstream commit 9a6ebbdbd41235ea3bc0c4f39e2076599b8113cc ]
With lazytime mount option enabled we can be switching many dirty inodes on cgroup exit to the parent cgroup. The numbers observed in practice when systemd slice of a large cron job exits can easily reach hundreds of thousands or millions. The logic in inode_do_switch_wbs() which sorts the inode into appropriate place in b_dirty list of the target wb however has linear complexity in the number of dirty inodes thus overall time complexity of switching all the inodes is quadratic leading to workers being pegged for hours consuming 100% of the CPU and switching inodes to the parent wb.
Simple reproducer of the issue: FILES=10000 # Filesystem mounted with lazytime mount option MNT=/mnt/ echo "Creating files and switching timestamps" for (( j = 0; j < 50; j ++ )); do mkdir $MNT/dir$j for (( i = 0; i < $FILES; i++ )); do echo "foo" >$MNT/dir$j/file$i done touch -a -t 202501010000 $MNT/dir$j/file* done wait echo "Syncing and flushing" sync echo 3 >/proc/sys/vm/drop_caches
echo "Reading all files from a cgroup" mkdir /sys/fs/cgroup/unified/mycg1 || exit echo $$ >/sys/fs/cgroup/unified/mycg1/cgroup.procs || exit for (( j = 0; j < 50; j ++ )); do cat /mnt/dir$j/file* >/dev/null & done wait echo "Switching wbs" # Now rmdir the cgroup after the script exits
We need to maintain b_dirty list ordering to keep writeback happy so instead of sorting inode into appropriate place just append it at the end of the list and clobber dirtied_time_when. This may result in inode writeback starting later after cgroup switch however cgroup switches are rare so it shouldn't matter much. Since the cgroup had write access to the inode, there are no practical concerns of the possible DoS issues.
Acked-by: Tejun Heo tj@kernel.org Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/fs-writeback.c | 21 +++++++++++---------- 1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c index eff778dc0386c..28edfad85c628 100644 --- a/fs/fs-writeback.c +++ b/fs/fs-writeback.c @@ -446,22 +446,23 @@ static bool inode_do_switch_wbs(struct inode *inode, * Transfer to @new_wb's IO list if necessary. If the @inode is dirty, * the specific list @inode was on is ignored and the @inode is put on * ->b_dirty which is always correct including from ->b_dirty_time. - * The transfer preserves @inode->dirtied_when ordering. If the @inode - * was clean, it means it was on the b_attached list, so move it onto - * the b_attached list of @new_wb. + * If the @inode was clean, it means it was on the b_attached list, so + * move it onto the b_attached list of @new_wb. */ if (!list_empty(&inode->i_io_list)) { inode->i_wb = new_wb;
if (inode->i_state & I_DIRTY_ALL) { - struct inode *pos; - - list_for_each_entry(pos, &new_wb->b_dirty, i_io_list) - if (time_after_eq(inode->dirtied_when, - pos->dirtied_when)) - break; + /* + * We need to keep b_dirty list sorted by + * dirtied_time_when. However properly sorting the + * inode in the list gets too expensive when switching + * many inodes. So just attach inode at the end of the + * dirty list and clobber the dirtied_time_when. + */ + inode->dirtied_time_when = jiffies; inode_io_list_move_locked(inode, new_wb, - pos->i_io_list.prev); + &new_wb->b_dirty); } else { inode_cgwb_move_to_attached(inode, new_wb); }
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: K Prateek Nayak kprateek.nayak@amd.com
Dequeuing a fair task on a throttled hierarchy returns early on encountering a throttled cfs_rq since the throttle path has already dequeued the hierarchy above and has adjusted the h_nr_* accounting till the root cfs_rq.
dequeue_entities() crucially misses calling __block_task() for delayed tasks being dequeued on the throttled hierarchies, but this was mostly harmless until commit b7ca5743a260 ("sched/core: Tweak wait_task_inactive() to force dequeue sched_delayed tasks") since all existing cases would re-enqueue the task if task_on_rq_queued() returned true and the task would eventually be blocked at pick after the hierarchy was unthrottled.
wait_task_inactive() is special as it expects the delayed task on throttled hierarchy to reach the blocked state on dequeue but since __block_task() is never called, task_on_rq_queued() continues to return true. Furthermore, since the task is now off the hierarchy, the pick never reaches it to fully block the task even after unthrottle leading to wait_task_inactive() looping endlessly.
Remedy this by calling __block_task() if a delayed task is being dequeued on a throttled hierarchy.
This fix is only required for stabled kernels implementing delay dequeue (>= v6.12) before v6.18 since upstream commit e1fad12dcb66 ("sched/fair: Switch to task based throttle model") indirectly fixes this by removing the early return conditions in dequeue_entities() as part of the per-task throttle feature.
Cc: stable@vger.kernel.org Reported-by: Matt Fleming matt@readmodwrite.com Closes: https://lore.kernel.org/all/20250925133310.1843863-1-matt@readmodwrite.com/ Fixes: b7ca5743a260 ("sched/core: Tweak wait_task_inactive() to force dequeue sched_delayed tasks") Tested-by: Matt Fleming mfleming@cloudflare.com Signed-off-by: K Prateek Nayak kprateek.nayak@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/sched/fair.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
--- a/kernel/sched/fair.c +++ b/kernel/sched/fair.c @@ -7187,6 +7187,7 @@ static int dequeue_entities(struct rq *r int h_nr_delayed = 0; struct cfs_rq *cfs_rq; u64 slice = 0; + int ret = 0;
if (entity_is_task(se)) { p = task_of(se); @@ -7218,7 +7219,7 @@ static int dequeue_entities(struct rq *r
/* end evaluation on encountering a throttled cfs_rq */ if (cfs_rq_throttled(cfs_rq)) - return 0; + goto out;
/* Don't dequeue parent if it has other entities besides us */ if (cfs_rq->load.weight) { @@ -7261,7 +7262,7 @@ static int dequeue_entities(struct rq *r
/* end evaluation on encountering a throttled cfs_rq */ if (cfs_rq_throttled(cfs_rq)) - return 0; + goto out; }
sub_nr_running(rq, h_nr_queued); @@ -7273,6 +7274,8 @@ static int dequeue_entities(struct rq *r if (unlikely(!was_sched_idle && sched_idle_rq(rq))) rq->next_balance = jiffies;
+ ret = 1; +out: if (p && task_delayed) { SCHED_WARN_ON(!task_sleep); SCHED_WARN_ON(p->on_rq != 1); @@ -7288,7 +7291,7 @@ static int dequeue_entities(struct rq *r __block_task(rq, p); }
- return 1; + return ret; }
/*
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ian Rogers irogers@google.com
commit f9c506fb69bdcfb9d7138281378129ff037f2aa1 upstream.
The cycles event will fallback to task-clock in the hybrid test when running virtualized. Change the test to not fail for this.
Fixes: 65d11821910bd910 ("perf test: Add a test for default perf stat command") Reviewed-by: James Clark james.clark@linaro.org Signed-off-by: Ian Rogers irogers@google.com Acked-by: Namhyung Kim namhyung@kernel.org Cc: Adrian Hunter adrian.hunter@intel.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Ingo Molnar mingo@redhat.com Cc: Jiri Olsa jolsa@kernel.org Cc: Kan Liang kan.liang@linux.intel.com Cc: Mark Rutland mark.rutland@arm.com Cc: Peter Zijlstra peterz@infradead.org Link: https://lore.kernel.org/r/20241212173354.9860-1-irogers@google.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/perf/tests/shell/stat.sh | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/tools/perf/tests/shell/stat.sh +++ b/tools/perf/tests/shell/stat.sh @@ -161,7 +161,11 @@ test_hybrid() { # Run default Perf stat cycles_events=$(perf stat -- true 2>&1 | grep -E "/cycles/[uH]*| cycles[:uH]* " -c)
- if [ "$pmus" -ne "$cycles_events" ] + # The expectation is that default output will have a cycles events on each + # hybrid PMU. In situations with no cycles PMU events, like virtualized, this + # can fall back to task-clock and so the end count may be 0. Fail if neither + # condition holds. + if [ "$pmus" -ne "$cycles_events" ] && [ "0" -ne "$cycles_events" ] then echo "hybrid test [Found $pmus PMUs but $cycles_events cycles events. Failed]" err=1
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Olga Kornievskaia okorniev@redhat.com
commit d9d6b74e4be989f919498798fa40df37a74b5bb0 upstream.
__fh_verify() added a call to svc_xprt_set_valid() to help do connection management but during LOCALIO path rqstp argument is NULL, leading to NULL pointer dereferencing and a crash.
Fixes: eccbbc7c00a5 ("nfsd: don't use sv_nrthreads in connection limiting calculations.") Signed-off-by: Olga Kornievskaia okorniev@redhat.com Reviewed-by: Jeff Layton jlayton@kernel.org Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfsd/nfsfh.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/fs/nfsd/nfsfh.c +++ b/fs/nfsd/nfsfh.c @@ -381,8 +381,9 @@ __fh_verify(struct svc_rqst *rqstp, error = check_nfsd_access(exp, rqstp, may_bypass_gss); if (error) goto out; - - svc_xprt_set_valid(rqstp->rq_xprt); + /* During LOCALIO call to fh_verify will be called with a NULL rqstp */ + if (rqstp) + svc_xprt_set_valid(rqstp->rq_xprt);
/* Finally, check access permissions. */ error = nfsd_permission(cred, exp, dentry, access);
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Olga Kornievskaia okorniev@redhat.com
commit 0813c5f01249dbc32ccbc68d27a24fde5bf2901c upstream.
When an export policy with xprtsec policy is set with "tls" and/or "mtls", but an NFS client is doing a v3 xprtsec=tls mount, then NLM locking calls fail with an error because there is currently no support for NLM with TLS.
Until such support is added, allow NLM calls under TLS-secured policy.
Fixes: 4cc9b9f2bf4d ("nfsd: refine and rename NFSD_MAY_LOCK") Cc: stable@vger.kernel.org Signed-off-by: Olga Kornievskaia okorniev@redhat.com Reviewed-by: NeilBrown neil@brown.name Reviewed-by: Jeff Layton jlayton@kernel.org Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfsd/export.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/fs/nfsd/export.c +++ b/fs/nfsd/export.c @@ -1115,7 +1115,8 @@ __be32 check_nfsd_access(struct svc_expo test_bit(XPT_PEER_AUTH, &xprt->xpt_flags)) goto ok; } - goto denied; + if (!may_bypass_gss) + goto denied;
ok: /* legacy gss-only clients are always OK: */
On 10/17/25 10:54 AM, Greg Kroah-Hartman wrote:
6.12-stable review patch. If anyone has any objections, please let me know.
No objection, but a question:
This set of patches seems to be a pre-requisite for "nfsd: decouple the xprtsec policy check from check_nfsd_access()", which FAILED to apply to v6.12 yesterday. Are you planning to apply that one next to v6.12?
From: Olga Kornievskaia okorniev@redhat.com
commit 0813c5f01249dbc32ccbc68d27a24fde5bf2901c upstream.
When an export policy with xprtsec policy is set with "tls" and/or "mtls", but an NFS client is doing a v3 xprtsec=tls mount, then NLM locking calls fail with an error because there is currently no support for NLM with TLS.
Until such support is added, allow NLM calls under TLS-secured policy.
Fixes: 4cc9b9f2bf4d ("nfsd: refine and rename NFSD_MAY_LOCK") Cc: stable@vger.kernel.org Signed-off-by: Olga Kornievskaia okorniev@redhat.com Reviewed-by: NeilBrown neil@brown.name Reviewed-by: Jeff Layton jlayton@kernel.org Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
fs/nfsd/export.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/fs/nfsd/export.c +++ b/fs/nfsd/export.c @@ -1115,7 +1115,8 @@ __be32 check_nfsd_access(struct svc_expo test_bit(XPT_PEER_AUTH, &xprt->xpt_flags)) goto ok; }
- goto denied;
- if (!may_bypass_gss)
goto denied;
ok: /* legacy gss-only clients are always OK: */
6.12-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kai Vehmanen kai.vehmanen@linux.intel.com
commit bace10b59624e6bd8d68bc9304357f292f1b3dcf upstream.
Assumption that chain DMA module starts the link DMA when 1ms of data is available from host is not correct. Instead the firmware chain DMA module fills the link DMA with initial buffer of zeroes and the host and link DMAs are started at the same time.
This results in a small error in delay calculation. This can become a more severe problem if host DMA has delays that exceed 1ms. This results in negative delay to be calculated and bogus values reported to applications. This can confuse some applications like alsa_conformance_test.
Fix the issue by correctly calculating the firmware chain DMA preamble size and initializing the start offset to this value.
Cc: stable@vger.kernel.org Fixes: a1d203d390e0 ("ASoC: SOF: ipc4-pcm: Enable delay reporting for ChainDMA streams") Signed-off-by: Kai Vehmanen kai.vehmanen@linux.intel.com Reviewed-by: Péter Ujfalusi peter.ujfalusi@linux.intel.com Reviewed-by: Bard Liao yung-chuan.liao@linux.intel.com Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com Link: https://patch.msgid.link/20251002074719.2084-3-peter.ujfalusi@linux.intel.co... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/soc/sof/ipc4-pcm.c | 14 ++++++++++---- sound/soc/sof/ipc4-topology.c | 1 - sound/soc/sof/ipc4-topology.h | 2 ++ 3 files changed, 12 insertions(+), 5 deletions(-)
--- a/sound/soc/sof/ipc4-pcm.c +++ b/sound/soc/sof/ipc4-pcm.c @@ -992,7 +992,7 @@ static int sof_ipc4_get_stream_start_off return -EINVAL; } else if (host_copier->data.gtw_cfg.node_id == SOF_IPC4_CHAIN_DMA_NODE_ID) { /* - * While the firmware does not supports time_info reporting for + * While the firmware does not support time_info reporting for * streams using ChainDMA, it is granted that ChainDMA can only * be used on Host+Link pairs where the link position is * accessible from the host side. @@ -1000,10 +1000,16 @@ static int sof_ipc4_get_stream_start_off * Enable delay calculation in case of ChainDMA via host * accessible registers. * - * The ChainDMA uses 2x 1ms ping-pong buffer, dai side starts - * when 1ms data is available + * The ChainDMA prefills the link DMA with a preamble + * of zero samples. Set the stream start offset based + * on size of the preamble (driver provided fifo size + * multiplied by 2.5). We add 1ms of margin as the FW + * will align the buffer size to DMA hardware + * alignment that is not known to host. */ - time_info->stream_start_offset = substream->runtime->rate / MSEC_PER_SEC; + int pre_ms = SOF_IPC4_CHAIN_DMA_BUF_SIZE_MS * 5 / 2 + 1; + + time_info->stream_start_offset = pre_ms * substream->runtime->rate / MSEC_PER_SEC; goto out; }
--- a/sound/soc/sof/ipc4-topology.c +++ b/sound/soc/sof/ipc4-topology.c @@ -32,7 +32,6 @@ MODULE_PARM_DESC(ipc4_ignore_cpc,
#define SOF_IPC4_GAIN_PARAM_ID 0 #define SOF_IPC4_TPLG_ABI_SIZE 6 -#define SOF_IPC4_CHAIN_DMA_BUF_SIZE_MS 2
static DEFINE_IDA(alh_group_ida); static DEFINE_IDA(pipeline_ida); --- a/sound/soc/sof/ipc4-topology.h +++ b/sound/soc/sof/ipc4-topology.h @@ -250,6 +250,8 @@ struct sof_ipc4_dma_stream_ch_map { #define SOF_IPC4_DMA_METHOD_HDA 1 #define SOF_IPC4_DMA_METHOD_GPDMA 2 /* defined for consistency but not used */
+#define SOF_IPC4_CHAIN_DMA_BUF_SIZE_MS 2 + /** * struct sof_ipc4_dma_config: DMA configuration * @dma_method: HDAudio or GPDMA
On Fri, 17 Oct 2025 16:50:07 +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.12.54 release. There are 277 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 19 Oct 2025 14:50:59 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.12.54-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.12.y and the diffstat can be found below.
thanks,
greg k-h
All tests passing for Tegra ...
Test results for stable-v6.12: 10 builds: 10 pass, 0 fail 28 boots: 28 pass, 0 fail 120 tests: 120 pass, 0 fail
Linux version: 6.12.54-rc1-g6122296b30b6 Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra186-p3509-0000+p3636-0001, tegra194-p2972-0000, tegra194-p3509-0000+p3668-0000, tegra20-ventana, tegra210-p2371-2180, tegra210-p3450-0000, tegra30-cardhu-a04
Tested-by: Jon Hunter jonathanh@nvidia.com
Jon
Hi!
This is the start of the stable review cycle for the 6.12.54 release. There are 277 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
CIP testing did not find any problems here:
https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/tree/linux-6...
Tested-by: Pavel Machek (CIP) pavel@denx.de
Best regards, Pavel
On Fri, Oct 17, 2025 at 11:16 AM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.12.54 release. There are 277 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 19 Oct 2025 14:50:59 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.12.54-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.12.y and the diffstat can be found below.
thanks,
greg k-h
Builds successfully. Boots and works on qemu and Dell XPS 15 9520 w/ Intel Core i7-12600H
Tested-by: Brett Mastbergen bmastbergen@ciq.com
Thanks, Brett
The kernel, bpf tool, perf tool, and kselftest builds fine for v6.12.54-rc1 on x86 and arm64 Azure VM.
Tested-by: Hardik Garg hargar@linux.microsoft.com
Thanks, Hardik
On 10/17/25 08:50, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.12.54 release. There are 277 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 19 Oct 2025 14:50:59 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.12.54-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.12.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
Am 17.10.2025 um 16:50 schrieb Greg Kroah-Hartman:
This is the start of the stable review cycle for the 6.12.54 release. There are 277 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Builds, boots and works on my 2-socket Ivy Bridge Xeon E5-2697 v2 server. No dmesg oddities or regressions found.
Tested-by: Peter Schneider pschneider1968@googlemail.com
Beste Grüße, Peter Schneider
On 10/17/25 07:50, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.12.54 release. There are 277 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 19 Oct 2025 14:50:59 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.12.54-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.12.y and the diffstat can be found below.
thanks,
greg k-h
On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels, build tested on BMIPS_GENERIC:
Tested-by: Florian Fainelli florian.fainelli@broadcom.com
On Fri, 17 Oct 2025 at 20:45, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.12.54 release. There are 277 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 19 Oct 2025 14:50:59 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.12.54-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.12.y and the diffstat can be found below.
thanks,
greg k-h
The following kernel crash noticed on the stable-rc 6.12.54-rc1 while running LTP syscalls listmount04 test case.
This is a known regression on the Linux next and reported [1] and fixed [2].
This was caused by, listmount: don't call path_put() under namespace semaphore commit c1f86d0ac322c7e77f6f8dbd216c65d39358ffc0 upstream.
And there is a follow up patch to fix this.
mount: handle NULL values in mnt_ns_release() [ Upstream commit 6c7ca6a02f8f9549a438a08a23c6327580ecf3d6 ]
When calling in listmount() mnt_ns_release() may be passed a NULL pointer. Handle that case gracefully.
Christian Brauner brauner@kernel.org
First seen on 6.12.54-rc1 Good: v6.12.53 Bad: 6.12.54-rc1
Regression Analysis: - New regression? yes - Reproducibility? yes
Test regression: 6.12.54-rc1 Internal error: Oops: mnt_ns_release __arm64_sys_listmount (fs/namespace.c:5526)
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
### LTP syscalls failures: ltp-syscalls/listmount04 ltp-syscalls/madvise06 ltp-syscalls/sendmsg03 ltp-syscalls/sendto03 ltp-syscalls/setsockopt05 ltp-syscalls/setsockopt09 ltp-syscalls/timerfd_settime02 ltp-syscalls/wait403 ltp-containers/userns08
### LTP test log listmount04 [ 3587.449309] <LAVA_SIGNAL_STARTTC listmount04> Received signal: <STARTTC> listmount04 tst_buffers.c:57: TINFO: Test is using guarded buffers tst_test.c:2021: TINFO: LTP version: 20250930 tst_test.c:2024: TINFO: Tested kernel: 6.12.54-rc1 #1 SMP PREEMPT @1760715935 aarch64 tst_kconfig.c:88: TINFO: Parsing kernel config '/proc/config.gz' tst_kconfig.c:676: TINFO: CONFIG_TRACE_IRQFLAGS kernel option detected which might slow the execution tst_test.c:1842: TINFO: Overall timeout per run is 0h 21m 36s [ 3587.464366] <LAVA_SIGNAL_ENDTC listmount04> Received signal: <ENDTC> listmount04 tst_test.c:1920: TBROK: Test killed by SIGSEGV! Summary: passed 0 failed 0 broken 1 skipped 0 warnings 0 [ 3587.523917] <LAVA_SIGNAL_TESTCASE TEST_CASE_ID=listmount04 RESULT=fail>
### Test cash log listmount04: [ 1440.660118] /usr/local/bin/kirk[418]: listmount04: start (command: listmount04) [ 1440.761870] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000080 [ 1440.762768] Mem abort info: [ 1440.763156] ESR = 0x0000000096000004 [ 1440.763722] EC = 0x25: DABT (current EL), IL = 32 bits [ 1440.764204] SET = 0, FnV = 0 [ 1440.764486] EA = 0, S1PTW = 0 [ 1440.764883] FSC = 0x04: level 0 translation fault [ 1440.765393] Data abort info: [ 1440.765795] ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 [ 1440.766288] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 [ 1440.766738] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 [ 1440.767213] user pgtable: 4k pages, 48-bit VAs, pgdp=000000000d4c1000 [ 1440.767819] [0000000000000080] pgd=0000000000000000, p4d=0000000000000000 [ 1440.768448] Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP [ 1440.769002] Modules linked in: tun overlay btrfs xor xor_neon raid6_pq zstd_compress libcrc32c snd_soc_hdmi_codec hantro_vpu dw_hdmi_cec dw_hdmi_i2s_audio brcmfmac rockchipdrm v4l2_h264 dw_mipi_dsi v4l2_vp9 hci_uart brcmutil analogix_dp crct10dif_ce btqca v4l2_jpeg panfrost dw_hdmi v4l2_mem2mem btbcm snd_soc_simple_card snd_soc_audio_graph_card snd_soc_spdif_tx cec gpu_sched snd_soc_simple_card_utils bluetooth cfg80211 drm_display_helper videobuf2_v4l2 drm_shmem_helper snd_soc_rockchip_i2s videobuf2_dma_contig pwrseq_core phy_rockchip_pcie drm_dma_helper rtc_rk808 videobuf2_memops drm_kms_helper videobuf2_common rfkill snd_soc_es8316 rockchip_saradc industrialio_triggered_buffer rockchip_thermal kfifo_buf pcie_rockchip_host coresight_cpu_debug drm fuse backlight ip_tables x_tables [ 1440.775190] CPU: 3 UID: 0 PID: 131415 Comm: listmount04 Not tainted 6.12.54-rc1 #1 [ 1440.775866] Hardware name: Radxa ROCK Pi 4B (DT) [ 1440.776277] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 1440.776893] pc : mnt_ns_release (arch/arm64/include/asm/atomic_ll_sc.h:96 arch/arm64/include/asm/atomic.h:51 include/linux/atomic/atomic-arch-fallback.h:944 include/linux/atomic/atomic-instrumented.h:401 include/linux/refcount.h:264 include/linux/refcount.h:307 include/linux/refcount.h:325 fs/namespace.c:156) [ 1440.777267] lr : __arm64_sys_listmount (fs/namespace.c:? fs/namespace.c:5569 fs/namespace.c:5526 fs/namespace.c:5526) [ 1440.777694] sp : ffff80008d663d30 [ 1440.777987] x29: ffff80008d663d30 x28: ffff0000bba30000 x27: 0000000000000000 [ 1440.778622] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000 [ 1440.779256] x23: 0000000000000000 x22: 0000000000000020 x21: fffffffffffffff2 [ 1440.779890] x20: 0000000000000100 x19: 0000aaaab5ee1110 x18: 0000000000000000 [ 1440.780524] x17: 0000000000000000 x16: 0000000000000000 x15: 0000000000000000 [ 1440.781158] x14: 0000000000000000 x13: ffff80008d660000 x12: ffff80008d664000 [ 1440.781791] x11: 0000000000000000 x10: 0000000000000001 x9 : ffff80008044cdf0 [ 1440.782425] x8 : 0000000000000080 x7 : 0000000000000000 x6 : 0000000000000000 [ 1440.783059] x5 : 0000000000000000 x4 : 0000000000000000 x3 : ffff80008d663e00 [ 1440.783692] x2 : ffff800081700c70 x1 : ffff80008d663d50 x0 : 0000000000000000 [ 1440.784326] Call trace: [ 1440.784545] mnt_ns_release (arch/arm64/include/asm/atomic_ll_sc.h:96 arch/arm64/include/asm/atomic.h:51 include/linux/atomic/atomic-arch-fallback.h:944 include/linux/atomic/atomic-instrumented.h:401 include/linux/refcount.h:264 include/linux/refcount.h:307 include/linux/refcount.h:325 fs/namespace.c:156) [ 1440.784882] __arm64_sys_listmount (fs/namespace.c:? fs/namespace.c:5569 fs/namespace.c:5526 fs/namespace.c:5526) [ 1440.785278] invoke_syscall (arch/arm64/kernel/syscall.c:50) [ 1440.785618] el0_svc_common (include/linux/thread_info.h:127 arch/arm64/kernel/syscall.c:140) [ 1440.785948] do_el0_svc (arch/arm64/kernel/syscall.c:152) [ 1440.786247] el0_svc (arch/arm64/kernel/entry-common.c:165) [ 1440.786524] el0t_64_sync_handler (arch/arm64/kernel/entry-common.c:789) [ 1440.786904] el0t_64_sync (arch/arm64/kernel/entry.S:598) [ 1440.787238] Code: aa1303e0 14000019 5280002a f9800111 (885f7d09) All code ======== 0: aa1303e0 mov x0, x19 4: 14000019 b 0x68 8: 5280002a mov w10, #0x1 // #1 c: f9800111 prfm pstl1strm, [x8] 10:* 885f7d09 ldxr w9, [x8] <-- trapping instruction
Code starting with the faulting instruction =========================================== 0: 885f7d09 ldxr w9, [x8] [ 1440.787776] ---[ end trace 0000000000000000 ]---
## Lore link, [1] https://lore.kernel.org/all/CA+G9fYueO8kP8mXVNmbHkyrFPKpt-onPfeyNXLuLGGjiO1W... [2] https://lore.kernel.org/all/20251017145215.505418259@linuxfoundation.org/
## Build * kernel: 6.12.54-rc1 * git: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git * git commit: 6122296b30b695962026ca4d1b434cae639373e0 * git describe: v6.12.53-278-g6122296b30b6 * test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.12.y/build/v6.12....
## Build * Test log: https://lkft.validation.linaro.org/scheduler/job/8496757#L6787 * Test details: https://regressions.linaro.org/lkft/linux-stable-rc-linux-6.12.y/v6.12.53-27... * Build plan: https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/34CRZ9uzNjZ... * Build link: https://storage.tuxsuite.com/public/linaro/lkft/builds/34CRXmuzdt1HaZluq4cBw... * Kernel config: https://storage.tuxsuite.com/public/linaro/lkft/builds/34CRXmuzdt1HaZluq4cBw...
-- Linaro LKFT
# Librecast Test Results
020/020 [ OK ] liblcrq 010/010 [ OK ] libmld 120/120 [ OK ] liblibrecast
CPU/kernel: Linux auntie 6.12.54-rc1-g6122296b30b6 #111 SMP PREEMPT_DYNAMIC Sat Oct 18 07:55:01 -00 2025 x86_64 AMD Ryzen 9 9950X 16-Core Processor AuthenticAMD GNU/Linux
Tested-by: Brett A C Sheffield bacs@librecast.net
Hi Greg,
On 17/10/25 20:20, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.12.54 release. There are 277 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
No problems seen on x86_64 and aarch64 with our testing.
Tested-by: Harshit Mogalapalli harshit.m.mogalapalli@oracle.com
Thanks, Harshit
Hii Greg.
On Fri, Oct 17, 2025 at 8:46 PM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.12.54 release. There are 277 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 19 Oct 2025 14:50:59 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.12.54-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.12.y and the diffstat can be found below.
thanks,
greg k-h
Build and boot tested 6.12.54-rc1 using qemu-x86_64. The kernel was successfully built and booted in a virtualized environment without any issues.
Build kernel: 6.12.54-rc1 git: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git git commit: 6122296b30b695962026ca4d1b434cae639373e0
Tested-by: Dileep Malepu dileep.debian@gmail.com
Best regards Dileep Malepu.
On Fri, 17 Oct 2025 16:50:07 +0200 Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.12.54 release. There are 277 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Sun, 19 Oct 2025 14:50:59 +0000. Anything received after that time might be too late.
Boot-tested under QEMU for Rust x86_64, arm64 and riscv64; built-tested for loongarch64:
Tested-by: Miguel Ojeda ojeda@kernel.org
Thanks!
Cheers, Miguel
linux-stable-mirror@lists.linaro.org