This is the start of the stable review cycle for the 5.15.133 release. There are 110 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 22 Sep 2023 11:28:09 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.133-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 5.15.133-rc1
Melissa Wen mwen@igalia.com drm/amd/display: enable cursor degamma for DCN3+ DRM legacy gamma
Jamal Hadi Salim jhs@mojatatu.com net/sched: Retire rsvp classifier
Christian König christian.koenig@amd.com drm/amdgpu: fix amdgpu_cs_p1_user_fence
Yifan Zhang yifan1.zhang@amd.com drm/amd/display: fix the white screen issue when >= 64GB DRAM
Shida Zhang zhangshida@kylinos.cn ext4: fix rec_len verify error
Damien Le Moal dlemoal@kernel.org scsi: pm8001: Setup IRQs on resume
Junxiao Bi junxiao.bi@oracle.com scsi: megaraid_sas: Fix deadlock on firmware crashdump
Niklas Cassel niklas.cassel@wdc.com ata: libata: disallow dev-initiated LPM transitions to unsupported states
Tommy Huang tommy_huang@aspeedtech.com i2c: aspeed: Reset the i2c controller when timeout occurs
Steven Rostedt (Google) rostedt@goodmis.org tracefs: Add missing lockdown check to tracefs_create_dir()
Jeff Layton jlayton@kernel.org nfsd: fix change_info in NFSv4 RENAME replies
Steven Rostedt (Google) rostedt@goodmis.org tracing: Have option files inc the trace array ref count
Steven Rostedt (Google) rostedt@goodmis.org tracing: Have current_trace inc the trace array ref count
Steven Rostedt (Google) rostedt@goodmis.org tracing: Have tracing_max_latency inc the trace array ref count
Filipe Manana fdmanana@suse.com btrfs: release path before inode lookup during the ino lookup ioctl
Filipe Manana fdmanana@suse.com btrfs: fix lockdep splat and potential deadlock after failure running delayed items
Amir Goldstein amir73il@gmail.com ovl: fix incorrect fdput() on aio completion
Amir Goldstein amir73il@gmail.com ovl: fix failed copyup of fileattr on a symlink
Christian Brauner brauner@kernel.org attr: block mode changes of symlinks
Nigel Croxon ncroxon@redhat.com md/raid1: fix error: ISO C90 forbids mixed declarations
Arnd Bergmann arnd@arndb.de samples/hw_breakpoint: fix building without module unloading
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: GC transaction race with netns dismantle
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: fix GC transaction races with netns and netlink event exit path
Florian Westphal fw@strlen.de netfilter: nf_tables: fix kdoc warnings after gc rework
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: remove busy mark and gc batch API
Pablo Neira Ayuso pablo@netfilter.org netfilter: nft_set_hash: mark set element as dead when deleting from packet path
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: adapt set backend to use GC transaction API
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: GC transaction API to avoid race with control plane
Florian Westphal fw@strlen.de netfilter: nf_tables: make validation state per table
Song Liu song@kernel.org x86/purgatory: Remove LTO flags
Kirill A. Shutemov kirill.shutemov@linux.intel.com x86/boot/compressed: Reserve more memory for page tables
Jinjie Ruan ruanjinjie@huawei.com scsi: lpfc: Fix the NULL vs IS_ERR() bug for debugfs_create_file()
Masami Hiramatsu (Google) mhiramat@kernel.org selftests: tracing: Fix to unmount tracefs for recovering environment
Jinjie Ruan ruanjinjie@huawei.com scsi: qla2xxx: Fix NULL vs IS_ERR() bug for debugfs_create_dir()
Jinjie Ruan ruanjinjie@huawei.com drm: gm12u320: Fix the timeout usage for usb_bulk_msg()
Anand Jain anand.jain@oracle.com btrfs: compare the correct fsid/metadata_uuid in btrfs_validate_super
Anand Jain anand.jain@oracle.com btrfs: add a helper to read the superblock metadata_uuid
Josef Bacik josef@toxicpanda.com btrfs: move btrfs_pinned_by_swapfile prototype into volumes.h
Namhyung Kim namhyung@kernel.org perf test shell stat_bpf_counters: Fix test on Intel
James Clark james.clark@arm.com perf test: Remove bash construct from stat_bpf_counters.sh test
Namhyung Kim namhyung@kernel.org perf build: Update build rule for generated files
Ian Rogers rogers.email@gmail.com perf jevents: Switch build to use jevents.py
Tiezhu Yang yangtiezhu@loongson.cn MIPS: Use "grep -E" instead of "egrep"
William Zhang william.zhang@broadcom.com mtd: rawnand: brcmnand: Fix ECC level field setting for v7.2 controller
Florian Fainelli f.fainelli@gmail.com mtd: rawnand: brcmnand: Allow SoC to provide I/O operations
Zhang Yi yi.zhang@huawei.com jbd2: correct the end of the journal recovery scan range
Jan Kara jack@suse.cz jbd2: rename jbd_debug() to jbd2_debug()
Ritesh Harjani riteshh@linux.ibm.com jbd2: kill t_handle_lock transaction spinlock
Ritesh Harjani riteshh@linux.ibm.com jbd2: fix use-after-free of transaction_t race
Ritesh Harjani riteshh@linux.ibm.com jbd2: refactor wait logic for transaction updates into a common function
John Ogness john.ogness@linutronix.de printk: Consolidate console deferred printing
Rob Clark robdclark@chromium.org interconnect: Fix locking for runpm vs reclaim
Zhen Lei thunder.leizhen@huawei.com kobject: Add sanity check for kset->kobj.ktype in kset_register()
Sakari Ailus sakari.ailus@linux.intel.com media: pci: ipu3-cio2: Initialise timing struct to avoid a compiler warning
Xu Yang xu.yang_2@nxp.com usb: ehci: add workaround for chipidea PORTSC.PEC bug
Christophe Leroy christophe.leroy@csgroup.eu serial: cpm_uart: Avoid suspicious locking
Konstantin Shelekhin k.shelekhin@yadro.com scsi: target: iscsi: Fix buffer overflow in lio_target_nacl_info_show()
Chenyuan Mi michenyuan@huawei.com tools: iio: iio_generic_buffer: Fix some integer type and calculation
Ma Ke make_ruc2021@163.com usb: gadget: fsl_qe_udc: validate endpoint index for ch9 udc
Xiaolei Wang xiaolei.wang@windriver.com usb: cdns3: Put the cdns set active part outside the spin lock
Hans Verkuil hverkuil-cisco@xs4all.nl media: pci: cx23885: replace BUG with error return
Hans Verkuil hverkuil-cisco@xs4all.nl media: tuners: qt1010: replace BUG_ON with a regular error
Zhang Shurong zhang_shurong@foxmail.com media: dvb-usb-v2: gl861: Fix null-ptr-deref in gl861_i2c_master_xfer
Zhang Shurong zhang_shurong@foxmail.com media: az6007: Fix null-ptr-deref in az6007_i2c_xfer()
Zhang Shurong zhang_shurong@foxmail.com media: anysee: fix null-ptr-deref in anysee_master_xfer
Zhang Shurong zhang_shurong@foxmail.com media: af9005: Fix null-ptr-deref in af9005_i2c_xfer
Zhang Shurong zhang_shurong@foxmail.com media: dw2102: Fix null-ptr-deref in dw2102_i2c_transfer()
Zhang Shurong zhang_shurong@foxmail.com media: dvb-usb-v2: af9035: Fix null-ptr-deref in af9035_i2c_master_xfer
Yong-Xuan Wang yongxuan.wang@sifive.com PCI: fu740: Set the number of MSI vectors
ruanjinjie ruanjinjie@huawei.com powerpc/pseries: fix possible memory leak in ibmebus_bus_init()
Mårten Lindahl marten.lindahl@axis.com ARM: 9317/1: kexec: Make smp stop calls asynchronous
Liu Shixin via Jfs-discussion jfs-discussion@lists.sourceforge.net jfs: fix invalid free of JFS_IP(ipimap)->i_imap in diUnmount
Andrew Kanner andrew.kanner@gmail.com fs/jfs: prevent double-free in dbUnmount() after failed jfs_remount()
Georg Ottinger g.ottinger@gmx.at ext2: fix datatype of block number in ext2_xattr_set2()
Zhang Shurong zhang_shurong@foxmail.com md: raid1: fix potential OOB in raid1_remove_disk()
Tony Lindgren tony@atomide.com bus: ti-sysc: Configure uart quirks for k3 SoC
Tuo Li islituo@gmail.com drm/exynos: fix a possible null-pointer dereference due to data race in exynos_drm_crtc_atomic_disable()
Leo Chen sancchen@amd.com drm/amd/display: Blocking invalid 420 modes on HDMI TMDS for DCN31
Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com ALSA: hda: intel-dsp-cfg: add LunarLake support
Rong Tao rongtao@cestc.cn samples/hw_breakpoint: Fix kernel BUG 'invalid opcode: 0000'
Krzysztof Kozlowski krzysztof.kozlowski@linaro.org arm64: dts: qcom: sm8250-edo: correct ramoops pmsg-size
Krzysztof Kozlowski krzysztof.kozlowski@linaro.org arm64: dts: qcom: sm8150-kumano: correct ramoops pmsg-size
Krzysztof Kozlowski krzysztof.kozlowski@linaro.org arm64: dts: qcom: sm6125-pdx201: correct ramoops pmsg-size
Marek Vasut marex@denx.de drm/bridge: tc358762: Instruct DSI host to generate HSE packets
Hao Luo haoluo@google.com libbpf: Free btf_vmlinux when closing bpf_object
Johannes Berg johannes.berg@intel.com wifi: mac80211_hwsim: drop short frames
GONG, Ruiqi gongruiqi1@huawei.com netfilter: ebtables: fix fortify warnings in size_entry_mwt()
Johannes Berg johannes.berg@intel.com wifi: mac80211: check S1G action frame size
GONG, Ruiqi gongruiqi1@huawei.com alx: fix OOB-read compiler warning
Giulio Benetti giulio.benetti@benettiengineering.com mmc: sdhci-esdhc-imx: improve ESDHC_FLAG_ERR010450
Alexander Steffen Alexander.Steffen@infineon.com tpm_tis: Resend command to recover from data transfer errors
Mark O'Donovan shiftee@posteo.net crypto: lib/mpi - avoid null pointer deref in mpi_cmp_ui()
Dmitry Antipov dmantipov@yandex.ru wifi: wil6210: fix fortify warnings
Dmitry Antipov dmantipov@yandex.ru wifi: mwifiex: fix fortify warning
Dongliang Mu dzm91@hust.edu.cn wifi: ath9k: fix printk specifier
Dmitry Antipov dmantipov@yandex.ru wifi: ath9k: fix fortify warnings
Azeem Shaikh azeemshaikh38@gmail.com crypto: lrw,xts - Replace strlcpy with strscpy
Jiri Pirko jiri@nvidia.com devlink: remove reload failed checks in params get/set callbacks
Mario Limonciello mario.limonciello@amd.com ACPI: x86: s2idle: Catch multiple ACPI_TYPE_PACKAGE objects
Tomislav Novak tnovak@meta.com hw_breakpoint: fix single-stepping when using bpf_overflow_handler
Xu Yang xu.yang_2@nxp.com perf/imx_ddr: speed up overflow frequency of cycle
Yicong Yang yangyicong@hisilicon.com perf/smmuv3: Enable HiSilicon Erratum 162001900 quirk for HIP08/09
Jiri Slaby (SUSE) jirislaby@kernel.org ACPI: video: Add backlight=native DMI quirk for Lenovo Ideapad Z470
Paul E. McKenney paulmck@kernel.org scftorture: Forgive memory-allocation failure if KASAN
Zqiang qiang.zhang1211@gmail.com rcuscale: Move rcu_scale_writer() schedule_timeout_uninterruptible() to _idle()
Wander Lairson Costa wander@redhat.com kernel/fork: beware of __put_task_struct() calling context
Abhishek Mainkar abmainkar@nvidia.com ACPICA: Add AML_NO_OPERAND_RESOLVE flag to Timer
Will Shiu Will.Shiu@mediatek.com locks: fix KASAN: use-after-free in trace_event_raw_event_filelock_lock
Qu Wenruo wqu@suse.com btrfs: output extra debug info if we failed to find an inline backref
Fedor Pchelkin pchelkin@ispras.ru autofs: fix memory leak of waitqueues in autofs_catatonic_mode
-------------
Diffstat:
Documentation/arm64/silicon-errata.rst | 3 + Makefile | 4 +- arch/arm/kernel/hw_breakpoint.c | 8 +- arch/arm/kernel/machine_kexec.c | 14 +- .../dts/qcom/sm6125-sony-xperia-seine-pdx201.dts | 2 +- .../boot/dts/qcom/sm8150-sony-xperia-kumano.dtsi | 2 +- .../boot/dts/qcom/sm8250-sony-xperia-edo.dtsi | 2 +- arch/arm64/kernel/hw_breakpoint.c | 4 +- arch/mips/Makefile | 2 +- arch/mips/vdso/Makefile | 2 +- arch/powerpc/platforms/pseries/ibmebus.c | 1 + arch/x86/boot/compressed/ident_map_64.c | 8 + arch/x86/include/asm/boot.h | 45 +- arch/x86/purgatory/Makefile | 4 + crypto/lrw.c | 6 +- crypto/xts.c | 6 +- drivers/acpi/acpica/psopcode.c | 2 +- drivers/acpi/arm64/iort.c | 5 +- drivers/acpi/video_detect.c | 9 + drivers/acpi/x86/s2idle.c | 6 + drivers/ata/ahci.c | 9 + drivers/ata/libata-sata.c | 19 +- drivers/bus/ti-sysc.c | 2 + drivers/char/tpm/tpm_tis_core.c | 15 +- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 18 +- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 21 +- .../amd/display/dc/dml/dcn31/display_mode_vba_31.c | 4 +- drivers/gpu/drm/bridge/tc358762.c | 2 +- drivers/gpu/drm/exynos/exynos_drm_crtc.c | 5 +- drivers/gpu/drm/tiny/gm12u320.c | 10 +- drivers/i2c/busses/i2c-aspeed.c | 7 +- drivers/interconnect/core.c | 8 +- drivers/md/raid1.c | 3 + drivers/media/pci/cx23885/cx23885-video.c | 2 +- drivers/media/pci/intel/ipu3/ipu3-cio2-main.c | 2 +- drivers/media/tuners/qt1010.c | 11 +- drivers/media/usb/dvb-usb-v2/af9035.c | 14 +- drivers/media/usb/dvb-usb-v2/anysee.c | 2 +- drivers/media/usb/dvb-usb-v2/az6007.c | 8 + drivers/media/usb/dvb-usb-v2/gl861.c | 2 +- drivers/media/usb/dvb-usb/af9005.c | 5 + drivers/media/usb/dvb-usb/dw2102.c | 24 + drivers/mmc/host/sdhci-esdhc-imx.c | 7 +- drivers/mtd/nand/raw/brcmnand/brcmnand.c | 102 ++- drivers/mtd/nand/raw/brcmnand/brcmnand.h | 29 + drivers/net/ethernet/atheros/alx/ethtool.c | 5 +- drivers/net/wireless/ath/ath9k/ahb.c | 4 +- drivers/net/wireless/ath/ath9k/mac.h | 6 +- drivers/net/wireless/ath/ath9k/pci.c | 4 +- drivers/net/wireless/ath/ath9k/xmit.c | 4 +- drivers/net/wireless/ath/wil6210/txrx.c | 2 +- drivers/net/wireless/ath/wil6210/txrx.h | 6 +- drivers/net/wireless/ath/wil6210/txrx_edma.c | 2 +- drivers/net/wireless/ath/wil6210/txrx_edma.h | 6 +- drivers/net/wireless/mac80211_hwsim.c | 7 +- drivers/net/wireless/marvell/mwifiex/tdls.c | 9 +- drivers/pci/controller/dwc/pcie-fu740.c | 1 + drivers/perf/arm_smmuv3_pmu.c | 46 +- drivers/perf/fsl_imx8_ddr_perf.c | 21 + drivers/scsi/lpfc/lpfc_debugfs.c | 14 +- drivers/scsi/megaraid/megaraid_sas.h | 2 +- drivers/scsi/megaraid/megaraid_sas_base.c | 21 +- drivers/scsi/pm8001/pm8001_init.c | 51 +- drivers/scsi/qla2xxx/qla_dfs.c | 6 +- drivers/target/iscsi/iscsi_target_configfs.c | 54 +- drivers/tty/serial/cpm_uart/cpm_uart_core.c | 13 +- drivers/usb/cdns3/cdns3-plat.c | 3 +- drivers/usb/cdns3/cdnsp-pci.c | 3 +- drivers/usb/cdns3/core.c | 15 +- drivers/usb/cdns3/core.h | 7 +- drivers/usb/gadget/udc/fsl_qe_udc.c | 2 + drivers/usb/host/ehci-hcd.c | 8 +- drivers/usb/host/ehci-hub.c | 10 +- drivers/usb/host/ehci.h | 10 + fs/attr.c | 20 +- fs/autofs/waitq.c | 3 +- fs/btrfs/ctree.h | 2 - fs/btrfs/delayed-inode.c | 19 +- fs/btrfs/disk-io.c | 8 +- fs/btrfs/extent-tree.c | 5 + fs/btrfs/ioctl.c | 8 +- fs/btrfs/volumes.c | 8 + fs/btrfs/volumes.h | 3 + fs/ext2/xattr.c | 4 +- fs/ext4/namei.c | 26 +- fs/jbd2/checkpoint.c | 6 +- fs/jbd2/commit.c | 49 +- fs/jbd2/journal.c | 34 +- fs/jbd2/recovery.c | 42 +- fs/jbd2/revoke.c | 8 +- fs/jbd2/transaction.c | 114 +-- fs/jfs/jfs_dmap.c | 1 + fs/jfs/jfs_imap.c | 1 + fs/locks.c | 2 
+- fs/nfsd/nfs4proc.c | 4 +- fs/overlayfs/copy_up.c | 3 +- fs/overlayfs/file.c | 9 +- fs/tracefs/inode.c | 3 + include/linux/acpi_iort.h | 1 + include/linux/jbd2.h | 11 +- include/linux/libata.h | 4 + include/linux/perf_event.h | 22 +- include/linux/sched/task.h | 28 +- include/net/netfilter/nf_tables.h | 124 ++-- include/uapi/linux/netfilter_bridge/ebtables.h | 14 +- kernel/fork.c | 8 + kernel/printk/printk.c | 35 +- kernel/printk/printk_safe.c | 9 +- kernel/rcu/rcuscale.c | 2 +- kernel/scftorture.c | 6 +- kernel/trace/trace.c | 41 +- lib/kobject.c | 5 + lib/mpi/mpi-cmp.c | 8 +- net/bridge/netfilter/ebtables.c | 3 +- net/core/devlink.c | 4 +- net/mac80211/rx.c | 4 + net/netfilter/nf_tables_api.c | 374 +++++++--- net/netfilter/nft_set_hash.c | 83 ++- net/netfilter/nft_set_pipapo.c | 50 +- net/netfilter/nft_set_rbtree.c | 144 ++-- net/sched/Kconfig | 28 - net/sched/Makefile | 2 - net/sched/cls_rsvp.c | 24 - net/sched/cls_rsvp.h | 776 --------------------- net/sched/cls_rsvp6.c | 24 - samples/hw_breakpoint/data_breakpoint.c | 4 +- sound/hda/intel-dsp-config.c | 8 + tools/build/Makefile.build | 10 + tools/iio/iio_generic_buffer.c | 17 +- tools/lib/bpf/libbpf.c | 1 + tools/perf/Makefile.config | 19 + tools/perf/Makefile.perf | 1 + tools/perf/pmu-events/Build | 19 +- tools/perf/pmu-events/empty-pmu-events.c | 158 +++++ tools/perf/tests/shell/stat_bpf_counters.sh | 6 +- tools/testing/selftests/ftrace/ftracetest | 8 + 136 files changed, 1680 insertions(+), 1595 deletions(-)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Fedor Pchelkin pchelkin@ispras.ru
[ Upstream commit ccbe77f7e45dfb4420f7f531b650c00c6e9c7507 ]
Syzkaller reports a memory leak:
BUG: memory leak unreferenced object 0xffff88810b279e00 (size 96): comm "syz-executor399", pid 3631, jiffies 4294964921 (age 23.870s) hex dump (first 32 bytes): 00 00 00 00 00 00 00 00 08 9e 27 0b 81 88 ff ff ..........'..... 08 9e 27 0b 81 88 ff ff 00 00 00 00 00 00 00 00 ..'............. backtrace: [<ffffffff814cfc90>] kmalloc_trace+0x20/0x90 mm/slab_common.c:1046 [<ffffffff81bb75ca>] kmalloc include/linux/slab.h:576 [inline] [<ffffffff81bb75ca>] autofs_wait+0x3fa/0x9a0 fs/autofs/waitq.c:378 [<ffffffff81bb88a7>] autofs_do_expire_multi+0xa7/0x3e0 fs/autofs/expire.c:593 [<ffffffff81bb8c33>] autofs_expire_multi+0x53/0x80 fs/autofs/expire.c:619 [<ffffffff81bb6972>] autofs_root_ioctl_unlocked+0x322/0x3b0 fs/autofs/root.c:897 [<ffffffff81bb6a95>] autofs_root_ioctl+0x25/0x30 fs/autofs/root.c:910 [<ffffffff81602a9c>] vfs_ioctl fs/ioctl.c:51 [inline] [<ffffffff81602a9c>] __do_sys_ioctl fs/ioctl.c:870 [inline] [<ffffffff81602a9c>] __se_sys_ioctl fs/ioctl.c:856 [inline] [<ffffffff81602a9c>] __x64_sys_ioctl+0xfc/0x140 fs/ioctl.c:856 [<ffffffff84608225>] do_syscall_x64 arch/x86/entry/common.c:50 [inline] [<ffffffff84608225>] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 [<ffffffff84800087>] entry_SYSCALL_64_after_hwframe+0x63/0xcd
autofs_wait_queue structs should be freed if their wait_ctr becomes zero. Otherwise they will be lost.
In this case an AUTOFS_IOC_EXPIRE_MULTI ioctl is done, then a new waitqueue struct is allocated in autofs_wait(), its initial wait_ctr equals 2. After that wait_event_killable() is interrupted (it returns -ERESTARTSYS), so that the 'wq->name.name == NULL' condition may not be satisfied. Actually, this condition can be satisfied when autofs_wait_release() or autofs_catatonic_mode() is called and, what is also important, wait_ctr is decremented in those places. Upon the exit of autofs_wait(), wait_ctr is decremented to 1. Then the unmounting process begins: kill_sb calls autofs_catatonic_mode(), which should have freed the waitqueues, but it only decrements its usage counter to zero, which is not correct behaviour.
edit:imk This description is of course not correct. The umount performed as a result of an expire is a umount of a mount that has been automounted, it's not the autofs mount itself. They happen independently, usually after everything mounted within the autofs file system has been expired away. If everything hasn't been expired away the automount daemon can still exit leaving mounts in place. But expires done in both cases will result in a notification that calls autofs_wait_release() with a result status. The problem case is the summary execution of the automount daemon. In this case any waiting processes won't be woken up until either they are terminated or the mount is umounted. end edit: imk
So in catatonic mode we should free waitqueues whose counter becomes zero.
edit: imk Initially I was concerned that the calling of autofs_wait_release() and autofs_catatonic_mode() was not mutually exclusive but that can't be the case (obviously) because the queue entry (or entries) is removed from the list when either of these two functions are called. Consequently the wait entry will be freed by only one of these functions or by the woken process in autofs_wait() depending on the order of the calls. end edit: imk
Reported-by: syzbot+5e53f70e69ff0c0a1c0c@syzkaller.appspotmail.com Suggested-by: Takeshi Misawa jeliantsurux@gmail.com Signed-off-by: Fedor Pchelkin pchelkin@ispras.ru Signed-off-by: Alexey Khoroshilov khoroshilov@ispras.ru Signed-off-by: Ian Kent raven@themaw.net Cc: Matthew Wilcox willy@infradead.org Cc: Andrei Vagin avagin@gmail.com Cc: autofs@vger.kernel.org Cc: linux-kernel@vger.kernel.org Message-Id: 169112719161.7590.6700123246297365841.stgit@donald.themaw.net Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/autofs/waitq.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/fs/autofs/waitq.c b/fs/autofs/waitq.c
index 54c1f8b8b0757..efdc76732faed 100644
--- a/fs/autofs/waitq.c
+++ b/fs/autofs/waitq.c
@@ -32,8 +32,9 @@ void autofs_catatonic_mode(struct autofs_sb_info *sbi)
 		wq->status = -ENOENT; /* Magic is gone - report failure */
 		kfree(wq->name.name - wq->offset);
 		wq->name.name = NULL;
-		wq->wait_ctr--;
 		wake_up_interruptible(&wq->queue);
+		if (!--wq->wait_ctr)
+			kfree(wq);
 		wq = nwq;
 	}
 	fput(sbi->pipe);	/* Close the pipe */
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Qu Wenruo wqu@suse.com
[ Upstream commit 7f72f50547b7af4ddf985b07fc56600a4deba281 ]
[BUG] Syzbot reported several warnings triggered inside lookup_inline_extent_backref().
[CAUSE] As usual, the reproducer doesn't reliably trigger locally here, but at least we know the WARN_ON() is triggered when an inline backref can not be found, and it can only be triggered when @insert is true. (I.e. inserting a new inline backref, which means the backref should already exist)
[ENHANCEMENT] After the WARN_ON(), dump all the parameters and the extent tree leaf to help debug.
Link: https://syzkaller.appspot.com/bug?extid=d6f9ff86c1d804ba2bc6 Signed-off-by: Qu Wenruo wqu@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/extent-tree.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index 597cc2607481c..48f2de789b755 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -860,6 +860,11 @@ int lookup_inline_extent_backref(struct btrfs_trans_handle *trans,
 		err = -ENOENT;
 		goto out;
 	} else if (WARN_ON(ret)) {
+		btrfs_print_leaf(path->nodes[0]);
+		btrfs_err(fs_info,
+"extent item not found for insert, bytenr %llu num_bytes %llu parent %llu root_objectid %llu owner %llu offset %llu",
+			  bytenr, num_bytes, parent, root_objectid, owner,
+			  offset);
 		err = -EIO;
 		goto out;
 	}
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Will Shiu Will.Shiu@mediatek.com
[ Upstream commit 74f6f5912693ce454384eaeec48705646a21c74f ]
As the following backtrace shows, the struct file_lock request in posix_lock_inode() is freed before the ftrace tracepoint uses it. Moving the tracepoint call ahead of the free path fixes the use-after-free issue.
[name:report&]=============================================== BUG:KASAN: use-after-free in trace_event_raw_event_filelock_lock+0x80/0x12c [name:report&]Read at addr f6ffff8025622620 by task NativeThread/16753 [name:report_hw_tags&]Pointer tag: [f6], memory tag: [fe] [name:report&] BT: Hardware name: MT6897 (DT) Call trace: dump_backtrace+0xf8/0x148 show_stack+0x18/0x24 dump_stack_lvl+0x60/0x7c print_report+0x2c8/0xa08 kasan_report+0xb0/0x120 __do_kernel_fault+0xc8/0x248 do_bad_area+0x30/0xdc do_tag_check_fault+0x1c/0x30 do_mem_abort+0x58/0xbc el1_abort+0x3c/0x5c el1h_64_sync_handler+0x54/0x90 el1h_64_sync+0x68/0x6c trace_event_raw_event_filelock_lock+0x80/0x12c posix_lock_inode+0xd0c/0xd60 do_lock_file_wait+0xb8/0x190 fcntl_setlk+0x2d8/0x440 ... [name:report&] [name:report&]Allocated by task 16752: ... slab_post_alloc_hook+0x74/0x340 kmem_cache_alloc+0x1b0/0x2f0 posix_lock_inode+0xb0/0xd60 ... [name:report&] [name:report&]Freed by task 16752: ... kmem_cache_free+0x274/0x5b0 locks_dispose_list+0x3c/0x148 posix_lock_inode+0xc40/0xd60 do_lock_file_wait+0xb8/0x190 fcntl_setlk+0x2d8/0x440 do_fcntl+0x150/0xc18 ...
Signed-off-by: Will Shiu Will.Shiu@mediatek.com Signed-off-by: Jeff Layton jlayton@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/locks.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/locks.c b/fs/locks.c
index 881fd16905c61..4899a4666f24d 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1339,6 +1339,7 @@ static int posix_lock_inode(struct inode *inode, struct file_lock *request,
 out:
 	spin_unlock(&ctx->flc_lock);
 	percpu_up_read(&file_rwsem);
+	trace_posix_lock_inode(inode, request, error);
 	/*
 	 * Free any unused locks.
 	 */
@@ -1347,7 +1348,6 @@ static int posix_lock_inode(struct inode *inode, struct file_lock *request,
 	if (new_fl2)
 		locks_free_lock(new_fl2);
 	locks_dispose_list(&dispose);
-	trace_posix_lock_inode(inode, request, error);
 
 	return error;
 }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Abhishek Mainkar abmainkar@nvidia.com
[ Upstream commit 3a21ffdbc825e0919db9da0e27ee5ff2cc8a863e ]
ACPICA commit 90310989a0790032f5a0140741ff09b545af4bc5
According to section 19.6.134 of the ACPI specification, no argument is required to be passed to the ASL Timer instruction. To take care of this no-argument case, the AML_NO_OPERAND_RESOLVE flag is added to the ASL Timer instruction opcode.
When the ASL Timer instruction is interpreted by the ACPI interpreter, an error is reported. After adding the AML_NO_OPERAND_RESOLVE flag to the ASL Timer instruction opcode, the issue is no longer observed.
============================================================= UBSAN: array-index-out-of-bounds in acpica/dswexec.c:401:12 index -1 is out of range for type 'union acpi_operand_object *[9]' CPU: 37 PID: 1678 Comm: cat Not tainted 6.0.0-dev-th500-6.0.y-1+bcf8c46459e407-generic-64k HW name: NVIDIA BIOS v1.1.1-d7acbfc-dirty 12/19/2022 Call trace: dump_backtrace+0xe0/0x130 show_stack+0x20/0x60 dump_stack_lvl+0x68/0x84 dump_stack+0x18/0x34 ubsan_epilogue+0x10/0x50 __ubsan_handle_out_of_bounds+0x80/0x90 acpi_ds_exec_end_op+0x1bc/0x6d8 acpi_ps_parse_loop+0x57c/0x618 acpi_ps_parse_aml+0x1e0/0x4b4 acpi_ps_execute_method+0x24c/0x2b8 acpi_ns_evaluate+0x3a8/0x4bc acpi_evaluate_object+0x15c/0x37c acpi_evaluate_integer+0x54/0x15c show_power+0x8c/0x12c [acpi_power_meter]
Link: https://github.com/acpica/acpica/commit/90310989 Signed-off-by: Abhishek Mainkar abmainkar@nvidia.com Signed-off-by: Bob Moore robert.moore@intel.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/acpica/psopcode.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/acpi/acpica/psopcode.c b/drivers/acpi/acpica/psopcode.c
index 3e80eb1a5f35c..590326c200a28 100644
--- a/drivers/acpi/acpica/psopcode.c
+++ b/drivers/acpi/acpica/psopcode.c
@@ -603,7 +603,7 @@ const struct acpi_opcode_info acpi_gbl_aml_op_info[AML_NUM_OPCODES] = {
 
 /* 7E */ ACPI_OP("Timer", ARGP_TIMER_OP, ARGI_TIMER_OP, ACPI_TYPE_ANY,
 		 AML_CLASS_EXECUTE, AML_TYPE_EXEC_0A_0T_1R,
-		 AML_FLAGS_EXEC_0A_0T_1R),
+		 AML_FLAGS_EXEC_0A_0T_1R | AML_NO_OPERAND_RESOLVE),
 
 /* ACPI 5.0 opcodes */
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wander Lairson Costa wander@redhat.com
[ Upstream commit d243b34459cea30cfe5f3a9b2feb44e7daff9938 ]
Under PREEMPT_RT, __put_task_struct() indirectly acquires sleeping locks. Therefore, it can't be called from a non-preemptible context.
One practical example is the splat inside inactive_task_timer(), which is called in an interrupt context:
CPU: 1 PID: 2848 Comm: life Kdump: loaded Tainted: G W --------- Hardware name: HP ProLiant DL388p Gen8, BIOS P70 07/15/2012 Call Trace: dump_stack_lvl+0x57/0x7d mark_lock_irq.cold+0x33/0xba mark_lock+0x1e7/0x400 mark_usage+0x11d/0x140 __lock_acquire+0x30d/0x930 lock_acquire.part.0+0x9c/0x210 rt_spin_lock+0x27/0xe0 refill_obj_stock+0x3d/0x3a0 kmem_cache_free+0x357/0x560 inactive_task_timer+0x1ad/0x340 __run_hrtimer+0x8a/0x1a0 __hrtimer_run_queues+0x91/0x130 hrtimer_interrupt+0x10f/0x220 __sysvec_apic_timer_interrupt+0x7b/0xd0 sysvec_apic_timer_interrupt+0x4f/0xd0 asm_sysvec_apic_timer_interrupt+0x12/0x20 RIP: 0033:0x7fff196bf6f5
Instead of calling __put_task_struct() directly, we defer it using call_rcu(). A more natural approach would use a workqueue, but since in PREEMPT_RT, we can't allocate dynamic memory from atomic context, the code would become more complex because we would need to put the work_struct instance in the task_struct and initialize it when we allocate a new task_struct.
The issue is reproducible with stress-ng:
while true; do stress-ng --sched deadline --sched-period 1000000000 \ --sched-runtime 800000000 --sched-deadline \ 1000000000 --mmapfork 23 -t 20 done
Reported-by: Hu Chunyu chuhu@redhat.com Suggested-by: Oleg Nesterov oleg@redhat.com Suggested-by: Valentin Schneider vschneid@redhat.com Suggested-by: Peter Zijlstra peterz@infradead.org Signed-off-by: Wander Lairson Costa wander@redhat.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lore.kernel.org/r/20230614122323.37957-2-wander@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/sched/task.h | 28 +++++++++++++++++++++++++++- kernel/fork.c | 8 ++++++++ 2 files changed, 35 insertions(+), 1 deletion(-)
diff --git a/include/linux/sched/task.h b/include/linux/sched/task.h index d23977e9035d4..0c2d008099151 100644 --- a/include/linux/sched/task.h +++ b/include/linux/sched/task.h @@ -108,10 +108,36 @@ static inline struct task_struct *get_task_struct(struct task_struct *t) }
extern void __put_task_struct(struct task_struct *t); +extern void __put_task_struct_rcu_cb(struct rcu_head *rhp);
static inline void put_task_struct(struct task_struct *t) { - if (refcount_dec_and_test(&t->usage)) + if (!refcount_dec_and_test(&t->usage)) + return; + + /* + * under PREEMPT_RT, we can't call put_task_struct + * in atomic context because it will indirectly + * acquire sleeping locks. + * + * call_rcu() will schedule delayed_put_task_struct_rcu() + * to be called in process context. + * + * __put_task_struct() is called when + * refcount_dec_and_test(&t->usage) succeeds. + * + * This means that it can't "conflict" with + * put_task_struct_rcu_user() which abuses ->rcu the same + * way; rcu_users has a reference so task->usage can't be + * zero after rcu_users 1 -> 0 transition. + * + * delayed_free_task() also uses ->rcu, but it is only called + * when it fails to fork a process. Therefore, there is no + * way it can conflict with put_task_struct(). + */ + if (IS_ENABLED(CONFIG_PREEMPT_RT) && !preemptible()) + call_rcu(&t->rcu, __put_task_struct_rcu_cb); + else __put_task_struct(t); }
diff --git a/kernel/fork.c b/kernel/fork.c index ace0717c71e27..753e641f617bd 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -764,6 +764,14 @@ void __put_task_struct(struct task_struct *tsk) } EXPORT_SYMBOL_GPL(__put_task_struct);
+void __put_task_struct_rcu_cb(struct rcu_head *rhp) +{ + struct task_struct *task = container_of(rhp, struct task_struct, rcu); + + __put_task_struct(task); +} +EXPORT_SYMBOL_GPL(__put_task_struct_rcu_cb); + void __init __weak arch_task_cache_init(void) { }
/*
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zqiang qiang.zhang1211@gmail.com
[ Upstream commit e60c122a1614b4f65b29a7bef9d83b9fd30e937a ]
The rcuscale.holdoff module parameter can be used to delay the start of rcu_scale_writer() kthread. However, the hung-task timeout will trigger when the timeout specified by rcuscale.holdoff is greater than hung_task_timeout_secs:
runqemu kvm nographic slirp qemuparams="-smp 4 -m 2048M" bootparams="rcuscale.shutdown=0 rcuscale.holdoff=300"
[ 247.071753] INFO: task rcu_scale_write:59 blocked for more than 122 seconds. [ 247.072529] Not tainted 6.4.0-rc1-00134-gb9ed6de8d4ff #7 [ 247.073400] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 247.074331] task:rcu_scale_write state:D stack:30144 pid:59 ppid:2 flags:0x00004000 [ 247.075346] Call Trace: [ 247.075660] <TASK> [ 247.075965] __schedule+0x635/0x1280 [ 247.076448] ? __pfx___schedule+0x10/0x10 [ 247.076967] ? schedule_timeout+0x2dc/0x4d0 [ 247.077471] ? __pfx_lock_release+0x10/0x10 [ 247.078018] ? enqueue_timer+0xe2/0x220 [ 247.078522] schedule+0x84/0x120 [ 247.078957] schedule_timeout+0x2e1/0x4d0 [ 247.079447] ? __pfx_schedule_timeout+0x10/0x10 [ 247.080032] ? __pfx_rcu_scale_writer+0x10/0x10 [ 247.080591] ? __pfx_process_timeout+0x10/0x10 [ 247.081163] ? __pfx_sched_set_fifo_low+0x10/0x10 [ 247.081760] ? __pfx_rcu_scale_writer+0x10/0x10 [ 247.082287] rcu_scale_writer+0x6b1/0x7f0 [ 247.082773] ? mark_held_locks+0x29/0xa0 [ 247.083252] ? __pfx_rcu_scale_writer+0x10/0x10 [ 247.083865] ? __pfx_rcu_scale_writer+0x10/0x10 [ 247.084412] kthread+0x179/0x1c0 [ 247.084759] ? __pfx_kthread+0x10/0x10 [ 247.085098] ret_from_fork+0x2c/0x50 [ 247.085433] </TASK>
This commit therefore replaces schedule_timeout_uninterruptible() with schedule_timeout_idle().
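For context (not part of the patch): the substitution helps because schedule_timeout_idle() sleeps in TASK_IDLE state, which the hung-task detector ignores since it only matches a state of exactly TASK_UNINTERRUPTIBLE. A minimal sketch of the relevant pieces, paraphrased from include/linux/sched.h and kernel/time/timer.c:

    /* Uninterruptible sleep that contributes neither to the load average
     * nor to hung-task reports. */
    #define TASK_IDLE	(TASK_UNINTERRUPTIBLE | TASK_NOLOAD)

    signed long __sched schedule_timeout_idle(signed long timeout)
    {
    	__set_current_state(TASK_IDLE);
    	return schedule_timeout(timeout);
    }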
Signed-off-by: Zqiang qiang.zhang1211@gmail.com Signed-off-by: Paul E. McKenney paulmck@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/rcu/rcuscale.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/rcu/rcuscale.c b/kernel/rcu/rcuscale.c
index 57ec414710bbc..a83cb29d37607 100644
--- a/kernel/rcu/rcuscale.c
+++ b/kernel/rcu/rcuscale.c
@@ -402,7 +402,7 @@ rcu_scale_writer(void *arg)
 	sched_set_fifo_low(current);
 
 	if (holdoff)
-		schedule_timeout_uninterruptible(holdoff * HZ);
+		schedule_timeout_idle(holdoff * HZ);
 
 	/*
 	 * Wait until rcu_end_inkernel_boot() is called for normal GP tests
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Paul E. McKenney paulmck@kernel.org
[ Upstream commit 013608cd0812bdb21fc26d39ed8fdd2fc76e8b9b ]
Kernels built with CONFIG_KASAN=y quarantine newly freed memory in order to better detect use-after-free errors. However, this can exhaust memory more quickly in allocator-heavy tests, which can result in spurious scftorture failure. This commit therefore forgives memory-allocation failure in kernels built with CONFIG_KASAN=y, but continues counting the errors for use in detailed test-result analyses.
Signed-off-by: Paul E. McKenney paulmck@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/scftorture.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/kernel/scftorture.c b/kernel/scftorture.c index 27286d99e0c28..41006eef003f6 100644 --- a/kernel/scftorture.c +++ b/kernel/scftorture.c @@ -175,7 +175,8 @@ static void scf_torture_stats_print(void) scfs.n_all_wait += scf_stats_p[i].n_all_wait; } if (atomic_read(&n_errs) || atomic_read(&n_mb_in_errs) || - atomic_read(&n_mb_out_errs) || atomic_read(&n_alloc_errs)) + atomic_read(&n_mb_out_errs) || + (!IS_ENABLED(CONFIG_KASAN) && atomic_read(&n_alloc_errs))) bangstr = "!!! "; pr_alert("%s %sscf_invoked_count %s: %lld resched: %lld single: %lld/%lld single_ofl: %lld/%lld single_rpc: %lld single_rpc_ofl: %lld many: %lld/%lld all: %lld/%lld ", SCFTORT_FLAG, bangstr, isdone ? "VER" : "ver", invoked_count, scfs.n_resched, @@ -327,7 +328,8 @@ static void scftorture_invoke_one(struct scf_statistics *scfp, struct torture_ra preempt_disable(); if (scfsp->scfs_prim == SCF_PRIM_SINGLE || scfsp->scfs_wait) { scfcp = kmalloc(sizeof(*scfcp), GFP_ATOMIC); - if (WARN_ON_ONCE(!scfcp)) { + if (!scfcp) { + WARN_ON_ONCE(!IS_ENABLED(CONFIG_KASAN)); atomic_inc(&n_alloc_errs); } else { scfcp->scfc_cpu = -1;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jiri Slaby (SUSE) jirislaby@kernel.org
[ Upstream commit 96b709be183c56293933ef45b8b75f8af268c6de ]
The Lenovo Ideapad Z470 predates Windows 8, so it defaults to using acpi_video for backlight control. But this is not functional on this model.
Add a DMI quirk to use the native backlight interface which works.
Link: https://bugzilla.suse.com/show_bug.cgi?id=1208724 Signed-off-by: Jiri Slaby (SUSE) jirislaby@kernel.org Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/video_detect.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c
index 038542b3a80a7..5afc42d52e49b 100644
--- a/drivers/acpi/video_detect.c
+++ b/drivers/acpi/video_detect.c
@@ -307,6 +307,15 @@ static const struct dmi_system_id video_detect_dmi_table[] = {
 		DMI_MATCH(DMI_BOARD_NAME, "Lenovo IdeaPad S405"),
 		},
 	},
+	{
+	 /* https://bugzilla.suse.com/show_bug.cgi?id=1208724 */
+	 .callback = video_detect_force_native,
+	 /* Lenovo Ideapad Z470 */
+	 .matches = {
+		DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
+		DMI_MATCH(DMI_PRODUCT_VERSION, "IdeaPad Z470"),
+		},
+	},
 	{
 	 /* https://bugzilla.redhat.com/show_bug.cgi?id=1187004 */
 	 .callback = video_detect_force_native,
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yicong Yang yangyicong@hisilicon.com
[ Upstream commit 0242737dc4eb9f6e9a5ea594b3f93efa0b12f28d ]
Some HiSilicon SMMU PMCGs suffer from erratum 162001900: the PMU disable control sometimes fails to disable the counters. This leads to wrong or inaccurate data, because before we enable the counters they are still counting for the event used in the last perf session.
This patch tries to fix this by hardening the global disable process. Before disabling the PMU, write an invalid event type (0xffff) to forcibly stop the counters, and correspondingly restore each event on pmu::pmu_enable().
Signed-off-by: Yicong Yang yangyicong@hisilicon.com Link: https://lore.kernel.org/r/20230814124012.58013-1-yangyicong@huawei.com Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/arm64/silicon-errata.rst | 3 ++ drivers/acpi/arm64/iort.c | 5 ++- drivers/perf/arm_smmuv3_pmu.c | 46 +++++++++++++++++++++++++- include/linux/acpi_iort.h | 1 + 4 files changed, 53 insertions(+), 2 deletions(-)
diff --git a/Documentation/arm64/silicon-errata.rst b/Documentation/arm64/silicon-errata.rst index 83a75e16e54de..d2f90ecc426f9 100644 --- a/Documentation/arm64/silicon-errata.rst +++ b/Documentation/arm64/silicon-errata.rst @@ -172,6 +172,9 @@ stable kernels. +----------------+-----------------+-----------------+-----------------------------+ | Hisilicon | Hip08 SMMU PMCG | #162001800 | N/A | +----------------+-----------------+-----------------+-----------------------------+ +| Hisilicon | Hip08 SMMU PMCG | #162001900 | N/A | +| | Hip09 SMMU PMCG | | | ++----------------+-----------------+-----------------+-----------------------------+ +----------------+-----------------+-----------------+-----------------------------+ | Qualcomm Tech. | Kryo/Falkor v1 | E1003 | QCOM_FALKOR_ERRATUM_1003 | +----------------+-----------------+-----------------+-----------------------------+ diff --git a/drivers/acpi/arm64/iort.c b/drivers/acpi/arm64/iort.c index f2f8f05662deb..3dd74ad95d01b 100644 --- a/drivers/acpi/arm64/iort.c +++ b/drivers/acpi/arm64/iort.c @@ -1381,7 +1381,10 @@ static void __init arm_smmu_v3_pmcg_init_resources(struct resource *res, static struct acpi_platform_list pmcg_plat_info[] __initdata = { /* HiSilicon Hip08 Platform */ {"HISI ", "HIP08 ", 0, ACPI_SIG_IORT, greater_than_or_equal, - "Erratum #162001800", IORT_SMMU_V3_PMCG_HISI_HIP08}, + "Erratum #162001800, Erratum #162001900", IORT_SMMU_V3_PMCG_HISI_HIP08}, + /* HiSilicon Hip09 Platform */ + {"HISI ", "HIP09 ", 0, ACPI_SIG_IORT, greater_than_or_equal, + "Erratum #162001900", IORT_SMMU_V3_PMCG_HISI_HIP09}, { } };
diff --git a/drivers/perf/arm_smmuv3_pmu.c b/drivers/perf/arm_smmuv3_pmu.c index 5933ad151f869..1ef683bb40b64 100644 --- a/drivers/perf/arm_smmuv3_pmu.c +++ b/drivers/perf/arm_smmuv3_pmu.c @@ -96,6 +96,7 @@ #define SMMU_PMCG_PA_SHIFT 12
#define SMMU_PMCG_EVCNTR_RDONLY BIT(0) +#define SMMU_PMCG_HARDEN_DISABLE BIT(1)
static int cpuhp_state_num;
@@ -140,6 +141,20 @@ static inline void smmu_pmu_enable(struct pmu *pmu) writel(SMMU_PMCG_CR_ENABLE, smmu_pmu->reg_base + SMMU_PMCG_CR); }
+static int smmu_pmu_apply_event_filter(struct smmu_pmu *smmu_pmu, + struct perf_event *event, int idx); + +static inline void smmu_pmu_enable_quirk_hip08_09(struct pmu *pmu) +{ + struct smmu_pmu *smmu_pmu = to_smmu_pmu(pmu); + unsigned int idx; + + for_each_set_bit(idx, smmu_pmu->used_counters, smmu_pmu->num_counters) + smmu_pmu_apply_event_filter(smmu_pmu, smmu_pmu->events[idx], idx); + + smmu_pmu_enable(pmu); +} + static inline void smmu_pmu_disable(struct pmu *pmu) { struct smmu_pmu *smmu_pmu = to_smmu_pmu(pmu); @@ -148,6 +163,22 @@ static inline void smmu_pmu_disable(struct pmu *pmu) writel(0, smmu_pmu->reg_base + SMMU_PMCG_IRQ_CTRL); }
+static inline void smmu_pmu_disable_quirk_hip08_09(struct pmu *pmu) +{ + struct smmu_pmu *smmu_pmu = to_smmu_pmu(pmu); + unsigned int idx; + + /* + * The global disable of PMU sometimes fail to stop the counting. + * Harden this by writing an invalid event type to each used counter + * to forcibly stop counting. + */ + for_each_set_bit(idx, smmu_pmu->used_counters, smmu_pmu->num_counters) + writel(0xffff, smmu_pmu->reg_base + SMMU_PMCG_EVTYPER(idx)); + + smmu_pmu_disable(pmu); +} + static inline void smmu_pmu_counter_set_value(struct smmu_pmu *smmu_pmu, u32 idx, u64 value) { @@ -747,7 +778,10 @@ static void smmu_pmu_get_acpi_options(struct smmu_pmu *smmu_pmu) switch (model) { case IORT_SMMU_V3_PMCG_HISI_HIP08: /* HiSilicon Erratum 162001800 */ - smmu_pmu->options |= SMMU_PMCG_EVCNTR_RDONLY; + smmu_pmu->options |= SMMU_PMCG_EVCNTR_RDONLY | SMMU_PMCG_HARDEN_DISABLE; + break; + case IORT_SMMU_V3_PMCG_HISI_HIP09: + smmu_pmu->options |= SMMU_PMCG_HARDEN_DISABLE; break; }
@@ -836,6 +870,16 @@ static int smmu_pmu_probe(struct platform_device *pdev)
smmu_pmu_get_acpi_options(smmu_pmu);
+ /* + * For platforms suffer this quirk, the PMU disable sometimes fails to + * stop the counters. This will leads to inaccurate or error counting. + * Forcibly disable the counters with these quirk handler. + */ + if (smmu_pmu->options & SMMU_PMCG_HARDEN_DISABLE) { + smmu_pmu->pmu.pmu_enable = smmu_pmu_enable_quirk_hip08_09; + smmu_pmu->pmu.pmu_disable = smmu_pmu_disable_quirk_hip08_09; + } + /* Pick one CPU to be the preferred one to use */ smmu_pmu->on_cpu = raw_smp_processor_id(); WARN_ON(irq_set_affinity(smmu_pmu->irq, cpumask_of(smmu_pmu->on_cpu))); diff --git a/include/linux/acpi_iort.h b/include/linux/acpi_iort.h index f1f0842a2cb2b..43082bd44a999 100644 --- a/include/linux/acpi_iort.h +++ b/include/linux/acpi_iort.h @@ -21,6 +21,7 @@ */ #define IORT_SMMU_V3_PMCG_GENERIC 0x00000000 /* Generic SMMUv3 PMCG */ #define IORT_SMMU_V3_PMCG_HISI_HIP08 0x00000001 /* HiSilicon HIP08 PMCG */ +#define IORT_SMMU_V3_PMCG_HISI_HIP09 0x00000002 /* HiSilicon HIP09 PMCG */
int iort_register_domain_token(int trans_id, phys_addr_t base, struct fwnode_handle *fw_node);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xu Yang xu.yang_2@nxp.com
[ Upstream commit e89ecd8368860bf05437eabd07d292c316221cfc ]
For i.MX8MP, we cannot ensure that the cycle counter overflows at least 4 times as often as the other events. Because the byte counters count for any configured event, they overflow more often, and if a byte counter overflows the related counters stop, since they share COUNTER_CNTL. We can speed up the cycle counter's overflow frequency by setting the counter parameter (CP) field of the cycle counter. This way the byte counters no longer stop counting while no interrupt has arrived, and they can be fetched or updated on each cycle counter overflow interrupt.
Because we initialize the CP field to shorten counter0's time to overflow, the cycle counter starts counting from a fixed/base value each time. We need to remove that base from the result too, so that we get a precise result from the cycle counter.
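A rough sanity check of the numbers, under the assumption (mine, not stated explicitly in the patch) that the CP byte seeds the top byte of the 32-bit cycle counter, which the 0x0FFFFFFF read-back mask suggests: with CP = 0xf0 the counter restarts at 0xF0000000 and wraps after 0x10000000 (2^28) cycles instead of 2^32, i.e. 16 times as often; masking the value read back with CYCLES_COUNTER_MASK (0x0FFFFFFF) then strips that seed before it is accumulated into event->count.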
Signed-off-by: Xu Yang xu.yang_2@nxp.com Reviewed-by: Frank Li Frank.Li@nxp.com Link: https://lore.kernel.org/r/20230811015438.1999307-1-xu.yang_2@nxp.com Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/perf/fsl_imx8_ddr_perf.c | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+)
diff --git a/drivers/perf/fsl_imx8_ddr_perf.c b/drivers/perf/fsl_imx8_ddr_perf.c index 4daa782c48df0..6f6bc0a446ff6 100644 --- a/drivers/perf/fsl_imx8_ddr_perf.c +++ b/drivers/perf/fsl_imx8_ddr_perf.c @@ -28,6 +28,8 @@ #define CNTL_CLEAR_MASK 0xFFFFFFFD #define CNTL_OVER_MASK 0xFFFFFFFE
+#define CNTL_CP_SHIFT 16 +#define CNTL_CP_MASK (0xFF << CNTL_CP_SHIFT) #define CNTL_CSV_SHIFT 24 #define CNTL_CSV_MASK (0xFFU << CNTL_CSV_SHIFT)
@@ -35,6 +37,8 @@ #define EVENT_CYCLES_COUNTER 0 #define NUM_COUNTERS 4
+/* For removing bias if cycle counter CNTL.CP is set to 0xf0 */ +#define CYCLES_COUNTER_MASK 0x0FFFFFFF #define AXI_MASKING_REVERT 0xffff0000 /* AXI_MASKING(MSB 16bits) + AXI_ID(LSB 16bits) */
#define to_ddr_pmu(p) container_of(p, struct ddr_pmu, pmu) @@ -429,6 +433,17 @@ static void ddr_perf_counter_enable(struct ddr_pmu *pmu, int config, writel(0, pmu->base + reg); val = CNTL_EN | CNTL_CLEAR; val |= FIELD_PREP(CNTL_CSV_MASK, config); + + /* + * On i.MX8MP we need to bias the cycle counter to overflow more often. + * We do this by initializing bits [23:16] of the counter value via the + * COUNTER_CTRL Counter Parameter (CP) field. + */ + if (pmu->devtype_data->quirks & DDR_CAP_AXI_ID_FILTER_ENHANCED) { + if (counter == EVENT_CYCLES_COUNTER) + val |= FIELD_PREP(CNTL_CP_MASK, 0xf0); + } + writel(val, pmu->base + reg); } else { /* Disable counter */ @@ -468,6 +483,12 @@ static void ddr_perf_event_update(struct perf_event *event) int ret;
new_raw_count = ddr_perf_read_counter(pmu, counter); + /* Remove the bias applied in ddr_perf_counter_enable(). */ + if (pmu->devtype_data->quirks & DDR_CAP_AXI_ID_FILTER_ENHANCED) { + if (counter == EVENT_CYCLES_COUNTER) + new_raw_count &= CYCLES_COUNTER_MASK; + } + local64_add(new_raw_count, &event->count);
/*
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tomislav Novak tnovak@meta.com
[ Upstream commit d11a69873d9a7435fe6a48531e165ab80a8b1221 ]
Arm platforms use is_default_overflow_handler() to determine if the hw_breakpoint code should single-step over the breakpoint trigger or let the custom handler deal with it.
Since bpf_overflow_handler() currently isn't recognized as a default handler, attaching a BPF program to a PERF_TYPE_BREAKPOINT event causes it to keep firing (the instruction triggering the data abort exception is never skipped). For example:
# bpftrace -e 'watchpoint:0x10000:4:w { print("hit") }' -c ./test Attaching 1 probe... hit hit [...] ^C
(./test performs a single 4-byte store to 0x10000)
This patch replaces the check with uses_default_overflow_handler(), which accounts for the bpf_overflow_handler() case by also testing if one of the perf_event_output functions gets invoked indirectly, via orig_overflow_handler.
Signed-off-by: Tomislav Novak tnovak@meta.com Tested-by: Samuel Gosselin sgosselin@google.com # arm64 Reviewed-by: Catalin Marinas catalin.marinas@arm.com Acked-by: Alexei Starovoitov ast@kernel.org Link: https://lore.kernel.org/linux-arm-kernel/20220923203644.2731604-1-tnovak@fb.... Link: https://lore.kernel.org/r/20230605191923.1219974-1-tnovak@meta.com Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/kernel/hw_breakpoint.c | 8 ++++---- arch/arm64/kernel/hw_breakpoint.c | 4 ++-- include/linux/perf_event.h | 22 +++++++++++++++++++--- 3 files changed, 25 insertions(+), 9 deletions(-)
diff --git a/arch/arm/kernel/hw_breakpoint.c b/arch/arm/kernel/hw_breakpoint.c index b1423fb130ea4..8f1fa7aac31fb 100644 --- a/arch/arm/kernel/hw_breakpoint.c +++ b/arch/arm/kernel/hw_breakpoint.c @@ -626,7 +626,7 @@ int hw_breakpoint_arch_parse(struct perf_event *bp, hw->address &= ~alignment_mask; hw->ctrl.len <<= offset;
- if (is_default_overflow_handler(bp)) { + if (uses_default_overflow_handler(bp)) { /* * Mismatch breakpoints are required for single-stepping * breakpoints. @@ -798,7 +798,7 @@ static void watchpoint_handler(unsigned long addr, unsigned int fsr, * Otherwise, insert a temporary mismatch breakpoint so that * we can single-step over the watchpoint trigger. */ - if (!is_default_overflow_handler(wp)) + if (!uses_default_overflow_handler(wp)) continue; step: enable_single_step(wp, instruction_pointer(regs)); @@ -811,7 +811,7 @@ static void watchpoint_handler(unsigned long addr, unsigned int fsr, info->trigger = addr; pr_debug("watchpoint fired: address = 0x%x\n", info->trigger); perf_bp_event(wp, regs); - if (is_default_overflow_handler(wp)) + if (uses_default_overflow_handler(wp)) enable_single_step(wp, instruction_pointer(regs)); }
@@ -886,7 +886,7 @@ static void breakpoint_handler(unsigned long unknown, struct pt_regs *regs) info->trigger = addr; pr_debug("breakpoint fired: address = 0x%x\n", addr); perf_bp_event(bp, regs); - if (is_default_overflow_handler(bp)) + if (uses_default_overflow_handler(bp)) enable_single_step(bp, addr); goto unlock; } diff --git a/arch/arm64/kernel/hw_breakpoint.c b/arch/arm64/kernel/hw_breakpoint.c index 2a7f21314cde6..c30fa24458328 100644 --- a/arch/arm64/kernel/hw_breakpoint.c +++ b/arch/arm64/kernel/hw_breakpoint.c @@ -654,7 +654,7 @@ static int breakpoint_handler(unsigned long unused, unsigned long esr, perf_bp_event(bp, regs);
/* Do we need to handle the stepping? */ - if (is_default_overflow_handler(bp)) + if (uses_default_overflow_handler(bp)) step = 1; unlock: rcu_read_unlock(); @@ -733,7 +733,7 @@ static u64 get_distance_from_watchpoint(unsigned long addr, u64 val, static int watchpoint_report(struct perf_event *wp, unsigned long addr, struct pt_regs *regs) { - int step = is_default_overflow_handler(wp); + int step = uses_default_overflow_handler(wp); struct arch_hw_breakpoint *info = counter_arch_bp(wp);
info->trigger = addr; diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h index 014eb0a963fcb..5806fc4dc7e59 100644 --- a/include/linux/perf_event.h +++ b/include/linux/perf_event.h @@ -1084,15 +1084,31 @@ extern int perf_event_output(struct perf_event *event, struct pt_regs *regs);
static inline bool -is_default_overflow_handler(struct perf_event *event) +__is_default_overflow_handler(perf_overflow_handler_t overflow_handler) { - if (likely(event->overflow_handler == perf_event_output_forward)) + if (likely(overflow_handler == perf_event_output_forward)) return true; - if (unlikely(event->overflow_handler == perf_event_output_backward)) + if (unlikely(overflow_handler == perf_event_output_backward)) return true; return false; }
+#define is_default_overflow_handler(event) \ + __is_default_overflow_handler((event)->overflow_handler) + +#ifdef CONFIG_BPF_SYSCALL +static inline bool uses_default_overflow_handler(struct perf_event *event) +{ + if (likely(is_default_overflow_handler(event))) + return true; + + return __is_default_overflow_handler(event->orig_overflow_handler); +} +#else +#define uses_default_overflow_handler(event) \ + is_default_overflow_handler(event) +#endif + extern void perf_event_header__init_id(struct perf_event_header *header, struct perf_sample_data *data,
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mario Limonciello mario.limonciello@amd.com
[ Upstream commit 883cf0d4cf288313b71146ddebdf5d647b76c78b ]
If a badly constructed firmware includes multiple `ACPI_TYPE_PACKAGE` objects while evaluating the AMD LPS0 _DSM, there will be a memory leak. Explicitly guard against this.
Suggested-by: Bjorn Helgaas helgaas@kernel.org Signed-off-by: Mario Limonciello mario.limonciello@amd.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/x86/s2idle.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/acpi/x86/s2idle.c b/drivers/acpi/x86/s2idle.c
index 946e0160ad3bf..ada989670d9b8 100644
--- a/drivers/acpi/x86/s2idle.c
+++ b/drivers/acpi/x86/s2idle.c
@@ -111,6 +111,12 @@ static void lpi_device_get_constraints_amd(void)
 		union acpi_object *package = &out_obj->package.elements[i];
 
 		if (package->type == ACPI_TYPE_PACKAGE) {
+			if (lpi_constraints_table) {
+				acpi_handle_err(lps0_device_handle,
+						"Duplicate constraints list\n");
+				goto free_acpi_buffer;
+			}
+
 			lpi_constraints_table = kcalloc(package->package.count,
							sizeof(*lpi_constraints_table),
							GFP_KERNEL);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jiri Pirko jiri@nvidia.com
[ Upstream commit 633d76ad01ad0321a1ace3e5cc4fed06753d7ac4 ]
The checks in question were introduced by: commit 6b4db2e528f6 ("devlink: Fix use-after-free after a failed reload"). That fixed an issue of reload with mlxsw driver.
Back then, that was a valid fix, because there was a limitation in place that prevented drivers from registering/unregistering params when devlink instance was registered.
It was possible to do the fix differently by changing drivers to register/unregister params in appropriate places, making sure the ops operate only on memory which is allocated and initialized. But that, as a dependency, would require removing the limitation mentioned above.
Eventually, this limitation was lifted by: commit 1d18bb1a4ddd ("devlink: allow registering parameters after the instance")
Also, the alternative fix (which also fixed another issue) was done by: commit 74cbc3c03c82 ("mlxsw: spectrum_acl_tcam: Move devlink param to TCAM code").
Therefore, the checks are no longer relevant. Each driver should make sure to have the params registered only when the memory the ops are working with is allocated and initialized.
So remove the checks.
Signed-off-by: Jiri Pirko jiri@nvidia.com Reviewed-by: Ido Schimmel idosch@nvidia.com Reviewed-by: Jakub Kicinski kuba@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/core/devlink.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/core/devlink.c b/net/core/devlink.c
index b4d7a7f749c18..db76c55e1a6d7 100644
--- a/net/core/devlink.c
+++ b/net/core/devlink.c
@@ -4413,7 +4413,7 @@ static int devlink_param_get(struct devlink *devlink,
 			     const struct devlink_param *param,
 			     struct devlink_param_gset_ctx *ctx)
 {
-	if (!param->get || devlink->reload_failed)
+	if (!param->get)
 		return -EOPNOTSUPP;
 	return param->get(devlink, param->id, ctx);
 }
@@ -4422,7 +4422,7 @@ static int devlink_param_set(struct devlink *devlink,
 			     const struct devlink_param *param,
 			     struct devlink_param_gset_ctx *ctx)
 {
-	if (!param->set || devlink->reload_failed)
+	if (!param->set)
 		return -EOPNOTSUPP;
 	return param->set(devlink, param->id, ctx);
 }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Azeem Shaikh azeemshaikh38@gmail.com
[ Upstream commit babb80b3ecc6f40c962e13c654ebcd27f25ee327 ]
strlcpy() reads the entire source buffer first. This read may exceed the destination size limit. This is both inefficient and can lead to linear read overflows if a source string is not NUL-terminated [1]. In an effort to remove strlcpy() completely [2], replace strlcpy() here with strscpy().
Direct replacement is safe here since return value of -errno is used to check for truncation instead of sizeof(dest).
[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#strlcpy [2] https://github.com/KSPP/linux/issues/89
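For readers less familiar with the two return conventions, a minimal sketch (dst, name and the -ENAMETOOLONG choice are illustrative, not taken from the patch):

    char dst[8];
    ssize_t len;

    /* strlcpy() returns strlen(name), which can exceed the buffer size,
     * so truncation has to be detected by comparing against sizeof(). */
    if (strlcpy(dst, name, sizeof(dst)) >= sizeof(dst))
    	return -ENAMETOOLONG;

    /* strscpy() returns the number of characters copied (excluding the
     * terminating NUL) or -E2BIG on truncation, so a single signed
     * comparison is enough. */
    len = strscpy(dst, name, sizeof(dst));
    if (len < 0)
    	return -ENAMETOOLONG;

This is why the hunks below can drop the 'len >= sizeof(...)' check and keep only 'len < 2'.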
Signed-off-by: Azeem Shaikh azeemshaikh38@gmail.com Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Sasha Levin sashal@kernel.org --- crypto/lrw.c | 6 +++--- crypto/xts.c | 6 +++--- 2 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/crypto/lrw.c b/crypto/lrw.c index bcf09fbc750af..80d9076e42e0b 100644 --- a/crypto/lrw.c +++ b/crypto/lrw.c @@ -357,10 +357,10 @@ static int lrw_create(struct crypto_template *tmpl, struct rtattr **tb) * cipher name. */ if (!strncmp(cipher_name, "ecb(", 4)) { - unsigned len; + int len;
- len = strlcpy(ecb_name, cipher_name + 4, sizeof(ecb_name)); - if (len < 2 || len >= sizeof(ecb_name)) + len = strscpy(ecb_name, cipher_name + 4, sizeof(ecb_name)); + if (len < 2) goto err_free_inst;
if (ecb_name[len - 1] != ')') diff --git a/crypto/xts.c b/crypto/xts.c index de6cbcf69bbd6..b05020657cdc8 100644 --- a/crypto/xts.c +++ b/crypto/xts.c @@ -396,10 +396,10 @@ static int xts_create(struct crypto_template *tmpl, struct rtattr **tb) * cipher name. */ if (!strncmp(cipher_name, "ecb(", 4)) { - unsigned len; + int len;
- len = strlcpy(ctx->name, cipher_name + 4, sizeof(ctx->name)); - if (len < 2 || len >= sizeof(ctx->name)) + len = strscpy(ctx->name, cipher_name + 4, sizeof(ctx->name)); + if (len < 2) goto err_free_inst;
if (ctx->name[len - 1] != ')')
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Antipov dmantipov@yandex.ru
[ Upstream commit 810e41cebb6c6e394f2068f839e1a3fc745a5dcc ]
When compiling with gcc 13.1 and CONFIG_FORTIFY_SOURCE=y, I've noticed the following:
In function ‘fortify_memcpy_chk’, inlined from ‘ath_tx_complete_aggr’ at drivers/net/wireless/ath/ath9k/xmit.c:556:4, inlined from ‘ath_tx_process_buffer’ at drivers/net/wireless/ath/ath9k/xmit.c:773:3: ./include/linux/fortify-string.h:529:25: warning: call to ‘__read_overflow2_field’ declared with attribute warning: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning] 529 | __read_overflow2_field(q_size_field, size); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In function ‘fortify_memcpy_chk’, inlined from ‘ath_tx_count_frames’ at drivers/net/wireless/ath/ath9k/xmit.c:473:3, inlined from ‘ath_tx_complete_aggr’ at drivers/net/wireless/ath/ath9k/xmit.c:572:2, inlined from ‘ath_tx_process_buffer’ at drivers/net/wireless/ath/ath9k/xmit.c:773:3: ./include/linux/fortify-string.h:529:25: warning: call to ‘__read_overflow2_field’ declared with attribute warning: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning] 529 | __read_overflow2_field(q_size_field, size); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In both cases, the compiler complains on:
memcpy(ba, &ts->ba_low, WME_BA_BMP_SIZE >> 3);
which is the legal way to copy both the 'ba_low' and the following 'ba_high' members of 'struct ath_tx_status' at once (that is, issue one 8-byte 'memcpy()' for two 4-byte fields). Since the fortification logic seems to interpret this trick as an attempt to overread the 4-byte 'ba_low', silence the relevant warnings by using the convenient 'struct_group()' quirk.
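For reference, a simplified sketch of what the struct_group() annotation used below roughly expands to (the real macro lives in include/linux/stddef.h and also handles tags and attributes; the struct name here is made up):

#include <linux/types.h>

/*
 * The members stay addressable by their old names through the anonymous
 * struct, and the named mirror struct makes the pair addressable as one
 * 8-byte object, so memcpy(ba, &ts->ba, 8) reads a full-sized field and
 * fortify no longer warns.
 */
struct ath_tx_status_sketch {
	/* ... */
	union {
		struct {
			u32 ba_low;
			u32 ba_high;
		};
		struct {
			u32 ba_low;
			u32 ba_high;
		} ba;
	};
	/* ... */
};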
Suggested-by: Johannes Berg johannes@sipsolutions.net Signed-off-by: Dmitry Antipov dmantipov@yandex.ru Acked-by: Toke Høiland-Jørgensen toke@toke.dk Signed-off-by: Kalle Valo quic_kvalo@quicinc.com Link: https://lore.kernel.org/r/20230620080855.396851-2-dmantipov@yandex.ru Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath9k/mac.h | 6 ++++-- drivers/net/wireless/ath/ath9k/xmit.c | 4 ++-- 2 files changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/net/wireless/ath/ath9k/mac.h b/drivers/net/wireless/ath/ath9k/mac.h index fd6aa49adadfe..9b00e77a6fc3c 100644 --- a/drivers/net/wireless/ath/ath9k/mac.h +++ b/drivers/net/wireless/ath/ath9k/mac.h @@ -113,8 +113,10 @@ struct ath_tx_status { u8 qid; u16 desc_id; u8 tid; - u32 ba_low; - u32 ba_high; + struct_group(ba, + u32 ba_low; + u32 ba_high; + ); u32 evm0; u32 evm1; u32 evm2; diff --git a/drivers/net/wireless/ath/ath9k/xmit.c b/drivers/net/wireless/ath/ath9k/xmit.c index 6555abf02f18b..84c68aefc171a 100644 --- a/drivers/net/wireless/ath/ath9k/xmit.c +++ b/drivers/net/wireless/ath/ath9k/xmit.c @@ -421,7 +421,7 @@ static void ath_tx_count_frames(struct ath_softc *sc, struct ath_buf *bf, isaggr = bf_isaggr(bf); if (isaggr) { seq_st = ts->ts_seqnum; - memcpy(ba, &ts->ba_low, WME_BA_BMP_SIZE >> 3); + memcpy(ba, &ts->ba, WME_BA_BMP_SIZE >> 3); }
while (bf) { @@ -504,7 +504,7 @@ static void ath_tx_complete_aggr(struct ath_softc *sc, struct ath_txq *txq, if (isaggr && txok) { if (ts->ts_flags & ATH9K_TX_BA) { seq_st = ts->ts_seqnum; - memcpy(ba, &ts->ba_low, WME_BA_BMP_SIZE >> 3); + memcpy(ba, &ts->ba, WME_BA_BMP_SIZE >> 3); } else { /* * AR5416 can become deaf/mute when BA
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dongliang Mu dzm91@hust.edu.cn
[ Upstream commit 061115fbfb2ce5870c9a004d68dc63138c07c782 ]
Smatch reports:
ath_pci_probe() warn: argument 4 to %lx specifier is cast from pointer
ath_ahb_probe() warn: argument 4 to %lx specifier is cast from pointer
Fix it by modifying %lx to %p in the printk format string.
Note that with this change, the pointer address will be printed as a hashed value by default. This is appropriate because the kernel should not leak kernel pointers to user space in an informational message. If someone wants to see the real address for debugging purposes, this can be achieved with the no_hash_pointers kernel option.
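Illustration only (the helper name is hypothetical; %px is shown purely for contrast):

#include <linux/printk.h>

static void show_mem_pointer(const void *mem)
{
	pr_info("mem=0x%p\n", mem);	/* hashed value by default */
	pr_info("mem=0x%px\n", mem);	/* raw address, debugging only */
}

Booting with no_hash_pointers makes %p print the raw value as well.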
Signed-off-by: Dongliang Mu dzm91@hust.edu.cn Acked-by: Toke Høiland-Jørgensen toke@toke.dk Signed-off-by: Kalle Valo quic_kvalo@quicinc.com Link: https://lore.kernel.org/r/20230723040403.296723-1-dzm91@hust.edu.cn Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath9k/ahb.c | 4 ++-- drivers/net/wireless/ath/ath9k/pci.c | 4 ++-- 2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/net/wireless/ath/ath9k/ahb.c b/drivers/net/wireless/ath/ath9k/ahb.c index cdefb8e2daf14..05fb76a4e144e 100644 --- a/drivers/net/wireless/ath/ath9k/ahb.c +++ b/drivers/net/wireless/ath/ath9k/ahb.c @@ -136,8 +136,8 @@ static int ath_ahb_probe(struct platform_device *pdev)
ah = sc->sc_ah; ath9k_hw_name(ah, hw_name, sizeof(hw_name)); - wiphy_info(hw->wiphy, "%s mem=0x%lx, irq=%d\n", - hw_name, (unsigned long)mem, irq); + wiphy_info(hw->wiphy, "%s mem=0x%p, irq=%d\n", + hw_name, mem, irq);
return 0;
diff --git a/drivers/net/wireless/ath/ath9k/pci.c b/drivers/net/wireless/ath/ath9k/pci.c index a074e23013c58..f0e3901e8182a 100644 --- a/drivers/net/wireless/ath/ath9k/pci.c +++ b/drivers/net/wireless/ath/ath9k/pci.c @@ -988,8 +988,8 @@ static int ath_pci_probe(struct pci_dev *pdev, const struct pci_device_id *id) sc->sc_ah->msi_reg = 0;
ath9k_hw_name(sc->sc_ah, hw_name, sizeof(hw_name)); - wiphy_info(hw->wiphy, "%s mem=0x%lx, irq=%d\n", - hw_name, (unsigned long)sc->mem, pdev->irq); + wiphy_info(hw->wiphy, "%s mem=0x%p, irq=%d\n", + hw_name, sc->mem, pdev->irq);
return 0;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Antipov dmantipov@yandex.ru
[ Upstream commit dcce94b80a954a8968ff29fafcfb066d6197fa9a ]
When compiling with gcc 13.1 and CONFIG_FORTIFY_SOURCE=y, I've noticed the following:
In function ‘fortify_memcpy_chk’, inlined from ‘mwifiex_construct_tdls_action_frame’ at drivers/net/wireless/marvell/mwifiex/tdls.c:765:3, inlined from ‘mwifiex_send_tdls_action_frame’ at drivers/net/wireless/marvell/mwifiex/tdls.c:856:6: ./include/linux/fortify-string.h:529:25: warning: call to ‘__read_overflow2_field’ declared with attribute warning: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning] 529 | __read_overflow2_field(q_size_field, size); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The compiler actually complains on:
memmove(pos + ETH_ALEN, &mgmt->u.action.category, sizeof(mgmt->u.action.u.tdls_discover_resp));
and it happens because the fortification logic interprets this as an attempt to overread the 1-byte 'u.action.category' member of 'struct ieee80211_mgmt'. To silence this warning, it is enough to pass the address of 'u.action' itself instead of the address of its first member.
This also fixes an improper usage of 'sizeof()'. Since 'skb' is extended with 'sizeof(mgmt->u.action.u.tdls_discover_resp) + 1' bytes (where 1 is actually 'sizeof(mgmt->u.action.category)'), I assume that the same number of bytes should be copied.
Suggested-by: Brian Norris briannorris@chromium.org Signed-off-by: Dmitry Antipov dmantipov@yandex.ru Reviewed-by: Brian Norris briannorris@chromium.org Signed-off-by: Kalle Valo kvalo@kernel.org Link: https://lore.kernel.org/r/20230629085115.180499-2-dmantipov@yandex.ru Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/marvell/mwifiex/tdls.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/net/wireless/marvell/mwifiex/tdls.c b/drivers/net/wireless/marvell/mwifiex/tdls.c index 97bb87c3676bb..6c60621b6cccb 100644 --- a/drivers/net/wireless/marvell/mwifiex/tdls.c +++ b/drivers/net/wireless/marvell/mwifiex/tdls.c @@ -735,6 +735,7 @@ mwifiex_construct_tdls_action_frame(struct mwifiex_private *priv, int ret; u16 capab; struct ieee80211_ht_cap *ht_cap; + unsigned int extra; u8 radio, *pos;
capab = priv->curr_bss_params.bss_descriptor.cap_info_bitmap; @@ -753,7 +754,10 @@ mwifiex_construct_tdls_action_frame(struct mwifiex_private *priv,
switch (action_code) { case WLAN_PUB_ACTION_TDLS_DISCOVER_RES: - skb_put(skb, sizeof(mgmt->u.action.u.tdls_discover_resp) + 1); + /* See the layout of 'struct ieee80211_mgmt'. */ + extra = sizeof(mgmt->u.action.u.tdls_discover_resp) + + sizeof(mgmt->u.action.category); + skb_put(skb, extra); mgmt->u.action.category = WLAN_CATEGORY_PUBLIC; mgmt->u.action.u.tdls_discover_resp.action_code = WLAN_PUB_ACTION_TDLS_DISCOVER_RES; @@ -762,8 +766,7 @@ mwifiex_construct_tdls_action_frame(struct mwifiex_private *priv, mgmt->u.action.u.tdls_discover_resp.capability = cpu_to_le16(capab); /* move back for addr4 */ - memmove(pos + ETH_ALEN, &mgmt->u.action.category, - sizeof(mgmt->u.action.u.tdls_discover_resp)); + memmove(pos + ETH_ALEN, &mgmt->u.action, extra); /* init address 4 */ eth_broadcast_addr(pos);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Antipov dmantipov@yandex.ru
[ Upstream commit 1ad8237e971630c66a1a6194491e0837b64d00e0 ]
When compiling with gcc 13.1 and CONFIG_FORTIFY_SOURCE=y, I've noticed the following:
In function ‘fortify_memcpy_chk’, inlined from ‘wil_rx_crypto_check_edma’ at drivers/net/wireless/ath/wil6210/txrx_edma.c:566:2: ./include/linux/fortify-string.h:529:25: warning: call to ‘__read_overflow2_field’ declared with attribute warning: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning] 529 | __read_overflow2_field(q_size_field, size); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
where the compiler complains on:
const u8 *pn; ... pn = (u8 *)&st->ext.pn_15_0; ... memcpy(cc->pn, pn, IEEE80211_GCMP_PN_LEN);
and:
In function ‘fortify_memcpy_chk’, inlined from ‘wil_rx_crypto_check’ at drivers/net/wireless/ath/wil6210/txrx.c:684:2: ./include/linux/fortify-string.h:529:25: warning: call to ‘__read_overflow2_field’ declared with attribute warning: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Wattribute-warning] 529 | __read_overflow2_field(q_size_field, size); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
where the compiler complains on:
const u8 *pn = (u8 *)&d->mac.pn_15_0; ... memcpy(cc->pn, pn, IEEE80211_GCMP_PN_LEN);
In both cases, the fortification logic interprets the 'memcpy()' as a 6-byte overread of the 2-byte field 'pn_15_0' of 'struct wil_rx_status_extension' and of 'struct vring_rx_mac', respectively. To silence these warnings, the last two fields of the aforementioned structures are grouped using the 'struct_group_attr(pn, __packed, ...)' quirk.
Signed-off-by: Dmitry Antipov dmantipov@yandex.ru Signed-off-by: Kalle Valo quic_kvalo@quicinc.com Link: https://lore.kernel.org/r/20230621093711.80118-1-dmantipov@yandex.ru Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/wil6210/txrx.c | 2 +- drivers/net/wireless/ath/wil6210/txrx.h | 6 ++++-- drivers/net/wireless/ath/wil6210/txrx_edma.c | 2 +- drivers/net/wireless/ath/wil6210/txrx_edma.h | 6 ++++-- 4 files changed, 10 insertions(+), 6 deletions(-)
diff --git a/drivers/net/wireless/ath/wil6210/txrx.c b/drivers/net/wireless/ath/wil6210/txrx.c index cc830c795b33c..5b2de4f3fa0bd 100644 --- a/drivers/net/wireless/ath/wil6210/txrx.c +++ b/drivers/net/wireless/ath/wil6210/txrx.c @@ -666,7 +666,7 @@ static int wil_rx_crypto_check(struct wil6210_priv *wil, struct sk_buff *skb) struct wil_tid_crypto_rx *c = mc ? &s->group_crypto_rx : &s->tid_crypto_rx[tid]; struct wil_tid_crypto_rx_single *cc = &c->key_id[key_id]; - const u8 *pn = (u8 *)&d->mac.pn_15_0; + const u8 *pn = (u8 *)&d->mac.pn;
if (!cc->key_set) { wil_err_ratelimited(wil, diff --git a/drivers/net/wireless/ath/wil6210/txrx.h b/drivers/net/wireless/ath/wil6210/txrx.h index 1f4c8ec75be87..0f6f6b62bfc9a 100644 --- a/drivers/net/wireless/ath/wil6210/txrx.h +++ b/drivers/net/wireless/ath/wil6210/txrx.h @@ -343,8 +343,10 @@ struct vring_rx_mac { u32 d0; u32 d1; u16 w4; - u16 pn_15_0; - u32 pn_47_16; + struct_group_attr(pn, __packed, + u16 pn_15_0; + u32 pn_47_16; + ); } __packed;
/* Rx descriptor - DMA part diff --git a/drivers/net/wireless/ath/wil6210/txrx_edma.c b/drivers/net/wireless/ath/wil6210/txrx_edma.c index 201c8c35e0c9e..1ba1f21ebea26 100644 --- a/drivers/net/wireless/ath/wil6210/txrx_edma.c +++ b/drivers/net/wireless/ath/wil6210/txrx_edma.c @@ -548,7 +548,7 @@ static int wil_rx_crypto_check_edma(struct wil6210_priv *wil, s = &wil->sta[cid]; c = mc ? &s->group_crypto_rx : &s->tid_crypto_rx[tid]; cc = &c->key_id[key_id]; - pn = (u8 *)&st->ext.pn_15_0; + pn = (u8 *)&st->ext.pn;
if (!cc->key_set) { wil_err_ratelimited(wil, diff --git a/drivers/net/wireless/ath/wil6210/txrx_edma.h b/drivers/net/wireless/ath/wil6210/txrx_edma.h index c736f7413a35f..ee90e225bb050 100644 --- a/drivers/net/wireless/ath/wil6210/txrx_edma.h +++ b/drivers/net/wireless/ath/wil6210/txrx_edma.h @@ -330,8 +330,10 @@ struct wil_rx_status_extension { u32 d0; u32 d1; __le16 seq_num; /* only lower 12 bits */ - u16 pn_15_0; - u32 pn_47_16; + struct_group_attr(pn, __packed, + u16 pn_15_0; + u32 pn_47_16; + ); } __packed;
struct wil_rx_status_extended {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mark O'Donovan shiftee@posteo.net
[ Upstream commit 9e47a758b70167c9301d2b44d2569f86c7796f2d ]
During NVMeTCP authentication a controller can trigger a kernel oops by specifying the 8192-bit Diffie-Hellman group and passing a correctly sized, but zeroed, Diffie-Hellman value. mpi_cmp_ui() was detecting this if the second parameter was 0, but 1 is passed from dh_is_pubkey_valid(). This causes the NULL pointer u->d to be dereferenced towards the end of mpi_cmp_ui().
Signed-off-by: Mark O'Donovan shiftee@posteo.net Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Sasha Levin sashal@kernel.org --- lib/mpi/mpi-cmp.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/lib/mpi/mpi-cmp.c b/lib/mpi/mpi-cmp.c index c4cfa3ff05818..0835b6213235e 100644 --- a/lib/mpi/mpi-cmp.c +++ b/lib/mpi/mpi-cmp.c @@ -25,8 +25,12 @@ int mpi_cmp_ui(MPI u, unsigned long v) mpi_limb_t limb = v;
mpi_normalize(u); - if (!u->nlimbs && !limb) - return 0; + if (u->nlimbs == 0) { + if (v == 0) + return 0; + else + return -1; + } if (u->sign) return -1; if (u->nlimbs > 1)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexander Steffen Alexander.Steffen@infineon.com
[ Upstream commit 280db21e153d8810ce3b93640c63ae922bcb9e8e ]
Similar to the transmission of TPM responses, also the transmission of TPM commands may become corrupted. Instead of aborting when detecting such issues, try resending the command again.
Signed-off-by: Alexander Steffen Alexander.Steffen@infineon.com Reviewed-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/char/tpm/tpm_tis_core.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/drivers/char/tpm/tpm_tis_core.c b/drivers/char/tpm/tpm_tis_core.c index d7c440ac465f3..b3452259d6e0b 100644 --- a/drivers/char/tpm/tpm_tis_core.c +++ b/drivers/char/tpm/tpm_tis_core.c @@ -469,10 +469,17 @@ static int tpm_tis_send_main(struct tpm_chip *chip, const u8 *buf, size_t len) int rc; u32 ordinal; unsigned long dur; - - rc = tpm_tis_send_data(chip, buf, len); - if (rc < 0) - return rc; + unsigned int try; + + for (try = 0; try < TPM_RETRY; try++) { + rc = tpm_tis_send_data(chip, buf, len); + if (rc >= 0) + /* Data transfer done successfully */ + break; + else if (rc != -EIO) + /* Data transfer failed, not recoverable */ + return rc; + }
/* go and do it */ rc = tpm_tis_write8(priv, TPM_STS(priv->locality), TPM_STS_GO);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Giulio Benetti giulio.benetti@benettiengineering.com
[ Upstream commit 5ae4b0d8875caa44946e579420c7fd5740d58653 ]
Errata ERR010450 only shows up if the voltage is 1.8V; if the device is supplied with 3.3V the errata can be ignored. So check whether the quirk SDHCI_QUIRK2_NO_1_8_V is set before limiting the frequency.
Cc: Jim Reinhart jimr@tekvox.com Cc: James Autry jautry@tekvox.com Cc: Matthew Maron matthewm@tekvox.com Signed-off-by: Giulio Benetti giulio.benetti@benettiengineering.com Acked-by: Haibo Chen haibo.chen@nxp.com Acked-by: Adrian Hunter adrian.hunter@intel.com Link: https://lore.kernel.org/r/20230811214853.8623-1-giulio.benetti@benettiengine... Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mmc/host/sdhci-esdhc-imx.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/mmc/host/sdhci-esdhc-imx.c b/drivers/mmc/host/sdhci-esdhc-imx.c index a6aa33dcd2a2e..d8a4080712365 100644 --- a/drivers/mmc/host/sdhci-esdhc-imx.c +++ b/drivers/mmc/host/sdhci-esdhc-imx.c @@ -171,8 +171,8 @@ #define ESDHC_FLAG_HS400 BIT(9) /* * The IP has errata ERR010450 - * uSDHC: Due to the I/O timing limit, for SDR mode, SD card clock can't - * exceed 150MHz, for DDR mode, SD card clock can't exceed 45MHz. + * uSDHC: At 1.8V due to the I/O timing limit, for SDR mode, SD card + * clock can't exceed 150MHz, for DDR mode, SD card clock can't exceed 45MHz. */ #define ESDHC_FLAG_ERR010450 BIT(10) /* The IP supports HS400ES mode */ @@ -917,7 +917,8 @@ static inline void esdhc_pltfm_set_clock(struct sdhci_host *host, | ESDHC_CLOCK_MASK); sdhci_writel(host, temp, ESDHC_SYSTEM_CONTROL);
- if (imx_data->socdata->flags & ESDHC_FLAG_ERR010450) { + if ((imx_data->socdata->flags & ESDHC_FLAG_ERR010450) && + (!(host->quirks2 & SDHCI_QUIRK2_NO_1_8_V))) { unsigned int max_clock;
max_clock = imx_data->is_ddr ? 45000000 : 150000000;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: GONG, Ruiqi gongruiqi1@huawei.com
[ Upstream commit 3a198c95c95da10ad844cbeade2fe40bdf14c411 ]
The following message shows up when compiling with W=1:
In function ‘fortify_memcpy_chk’, inlined from ‘alx_get_ethtool_stats’ at drivers/net/ethernet/atheros/alx/ethtool.c:297:2: ./include/linux/fortify-string.h:592:4: error: call to ‘__read_overflow2_field’ declared with attribute warning: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Werror=attribute-warning] 592 | __read_overflow2_field(q_size_field, size); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In order to get the alx stats altogether, alx_get_ethtool_stats() reads beyond hw->stats.rx_ok. Fix this warning by directly copying hw->stats, and refactor the unnecessarily complicated BUILD_BUG_ON() while at it.
Signed-off-by: GONG, Ruiqi gongruiqi1@huawei.com Reviewed-by: Simon Horman horms@kernel.org Link: https://lore.kernel.org/r/20230821013218.1614265-1-gongruiqi@huaweicloud.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/atheros/alx/ethtool.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/atheros/alx/ethtool.c b/drivers/net/ethernet/atheros/alx/ethtool.c index b716adacd8159..7f6b69a523676 100644 --- a/drivers/net/ethernet/atheros/alx/ethtool.c +++ b/drivers/net/ethernet/atheros/alx/ethtool.c @@ -292,9 +292,8 @@ static void alx_get_ethtool_stats(struct net_device *netdev, spin_lock(&alx->stats_lock);
alx_update_hw_stats(hw); - BUILD_BUG_ON(sizeof(hw->stats) - offsetof(struct alx_hw_stats, rx_ok) < - ALX_NUM_STATS * sizeof(u64)); - memcpy(data, &hw->stats.rx_ok, ALX_NUM_STATS * sizeof(u64)); + BUILD_BUG_ON(sizeof(hw->stats) != ALX_NUM_STATS * sizeof(u64)); + memcpy(data, &hw->stats, sizeof(hw->stats));
spin_unlock(&alx->stats_lock); }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johannes Berg johannes.berg@intel.com
[ Upstream commit 19e4a47ee74718a22e963e8a647c8c3bfe8bb05c ]
Before checking the action code, check that it even exists in the frame.
Reported-by: syzbot+be9c824e6f269d608288@syzkaller.appspotmail.com Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/rx.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/net/mac80211/rx.c b/net/mac80211/rx.c index 175ead6b19cb4..26943c93f14c4 100644 --- a/net/mac80211/rx.c +++ b/net/mac80211/rx.c @@ -3557,6 +3557,10 @@ ieee80211_rx_h_action(struct ieee80211_rx_data *rx) break; goto queue; case WLAN_CATEGORY_S1G: + if (len < offsetofend(typeof(*mgmt), + u.action.u.s1g.action_code)) + break; + switch (mgmt->u.action.u.s1g.action_code) { case WLAN_S1G_TWT_SETUP: case WLAN_S1G_TWT_TEARDOWN:
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: GONG, Ruiqi gongruiqi1@huawei.com
[ Upstream commit a7ed3465daa240bdf01a5420f64336fee879c09d ]
When compiling with gcc 13 and CONFIG_FORTIFY_SOURCE=y, the following warning appears:
In function ‘fortify_memcpy_chk’, inlined from ‘size_entry_mwt’ at net/bridge/netfilter/ebtables.c:2118:2: ./include/linux/fortify-string.h:592:25: error: call to ‘__read_overflow2_field’ declared with attribute warning: detected read beyond size of field (2nd parameter); maybe use struct_group()? [-Werror=attribute-warning] 592 | __read_overflow2_field(q_size_field, size); | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
The compiler is complaining:
memcpy(&offsets[1], &entry->watchers_offset, sizeof(offsets) - sizeof(offsets[0]));
where memcpy reads beyond &entry->watchers_offset to copy {watchers,target,next}_offset altogether into offsets[]. Silence the warning by wrapping these three up via struct_group().
Signed-off-by: GONG, Ruiqi gongruiqi1@huawei.com Reviewed-by: Gustavo A. R. Silva gustavoars@kernel.org Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Sasha Levin sashal@kernel.org --- include/uapi/linux/netfilter_bridge/ebtables.h | 14 ++++++++------ net/bridge/netfilter/ebtables.c | 3 +-- 2 files changed, 9 insertions(+), 8 deletions(-)
diff --git a/include/uapi/linux/netfilter_bridge/ebtables.h b/include/uapi/linux/netfilter_bridge/ebtables.h index a494cf43a7552..b0caad82b6937 100644 --- a/include/uapi/linux/netfilter_bridge/ebtables.h +++ b/include/uapi/linux/netfilter_bridge/ebtables.h @@ -182,12 +182,14 @@ struct ebt_entry { unsigned char sourcemsk[ETH_ALEN]; unsigned char destmac[ETH_ALEN]; unsigned char destmsk[ETH_ALEN]; - /* sizeof ebt_entry + matches */ - unsigned int watchers_offset; - /* sizeof ebt_entry + matches + watchers */ - unsigned int target_offset; - /* sizeof ebt_entry + matches + watchers + target */ - unsigned int next_offset; + __struct_group(/* no tag */, offsets, /* no attrs */, + /* sizeof ebt_entry + matches */ + unsigned int watchers_offset; + /* sizeof ebt_entry + matches + watchers */ + unsigned int target_offset; + /* sizeof ebt_entry + matches + watchers + target */ + unsigned int next_offset; + ); unsigned char elems[0] __attribute__ ((aligned (__alignof__(struct ebt_replace)))); };
diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c index a09b2fc11c80e..c0389199c0dcb 100644 --- a/net/bridge/netfilter/ebtables.c +++ b/net/bridge/netfilter/ebtables.c @@ -2114,8 +2114,7 @@ static int size_entry_mwt(const struct ebt_entry *entry, const unsigned char *ba return ret;
offsets[0] = sizeof(struct ebt_entry); /* matches come first */ - memcpy(&offsets[1], &entry->watchers_offset, - sizeof(offsets) - sizeof(offsets[0])); + memcpy(&offsets[1], &entry->offsets, sizeof(entry->offsets));
if (state->buf_kern_start) { buf_start = state->buf_kern_start + state->buf_kern_offset;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johannes Berg johannes.berg@intel.com
[ Upstream commit fba360a047d5eeeb9d4b7c3a9b1c8308980ce9a6 ]
While technically some control frames like ACK are shorter and end after Address 1, such frames shouldn't be forwarded through wmediumd or similar userspace, so require the full 3-address header to avoid accessing invalid memory if shorter frames are passed in.
Reported-by: syzbot+b2645b5bf1512b81fa22@syzkaller.appspotmail.com Reviewed-by: Jeff Johnson quic_jjohnson@quicinc.com Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/mac80211_hwsim.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/net/wireless/mac80211_hwsim.c b/drivers/net/wireless/mac80211_hwsim.c index c3c3b5aa87b0d..6eb3c845640bd 100644 --- a/drivers/net/wireless/mac80211_hwsim.c +++ b/drivers/net/wireless/mac80211_hwsim.c @@ -3693,14 +3693,15 @@ static int hwsim_cloned_frame_received_nl(struct sk_buff *skb_2, frame_data_len = nla_len(info->attrs[HWSIM_ATTR_FRAME]); frame_data = (void *)nla_data(info->attrs[HWSIM_ATTR_FRAME]);
+ if (frame_data_len < sizeof(struct ieee80211_hdr_3addr) || + frame_data_len > IEEE80211_MAX_DATA_LEN) + goto err; + /* Allocate new skb here */ skb = alloc_skb(frame_data_len, GFP_KERNEL); if (skb == NULL) goto err;
- if (frame_data_len > IEEE80211_MAX_DATA_LEN) - goto err; - /* Copy the data */ skb_put_data(skb, frame_data, frame_data_len);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hao Luo haoluo@google.com
[ Upstream commit 29d67fdebc42af6466d1909c60fdd1ef4f3e5240 ]
I hit a memory leak when testing bpf_program__set_attach_target(). Basically, set_attach_target() may allocate btf_vmlinux, for example, when setting attach target for bpf_iter programs. But btf_vmlinux is freed only in bpf_object_load(), which means if we only open bpf object but not load it, setting attach target may leak btf_vmlinux.
So let's free btf_vmlinux in bpf_object__close() anyway.
Signed-off-by: Hao Luo haoluo@google.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20230822193840.1509809-1-haoluo@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/lib/bpf/libbpf.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index f87a15bbf53b3..9b8a0fe0eb1c3 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -7559,6 +7559,7 @@ void bpf_object__close(struct bpf_object *obj) bpf_object__elf_finish(obj); bpf_object__unload(obj); btf__free(obj->btf); + btf__free(obj->btf_vmlinux); btf_ext__free(obj->btf_ext);
for (i = 0; i < obj->nr_maps; i++)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marek Vasut marex@denx.de
[ Upstream commit 362fa8f6e6a05089872809f4465bab9d011d05b3 ]
This bridge seems to need the HSE packet, otherwise the image is shifted up and corrupted at the bottom. This makes the bridge work with Samsung DSIM on i.MX8MM and i.MX8MP.
Signed-off-by: Marek Vasut marex@denx.de Reviewed-by: Sam Ravnborg sam@ravnborg.org Signed-off-by: Robert Foss rfoss@kernel.org Link: https://patchwork.freedesktop.org/patch/msgid/20230615201902.566182-3-marex@... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/bridge/tc358762.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/bridge/tc358762.c b/drivers/gpu/drm/bridge/tc358762.c index 1bfdfc6affafe..21c57d3435687 100644 --- a/drivers/gpu/drm/bridge/tc358762.c +++ b/drivers/gpu/drm/bridge/tc358762.c @@ -224,7 +224,7 @@ static int tc358762_probe(struct mipi_dsi_device *dsi) dsi->lanes = 1; dsi->format = MIPI_DSI_FMT_RGB888; dsi->mode_flags = MIPI_DSI_MODE_VIDEO | MIPI_DSI_MODE_VIDEO_SYNC_PULSE | - MIPI_DSI_MODE_LPM; + MIPI_DSI_MODE_LPM | MIPI_DSI_MODE_VIDEO_HSE;
ret = tc358762_parse_dt(ctx); if (ret < 0)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org
[ Upstream commit c42f5452de6ad2599c6e5e2a64c180a4ac835d27 ]
There is no 'msg-size' property in ramoops, so assume the intention was 'pmsg-size':
sm6125-sony-xperia-seine-pdx201.dtb: ramoops@ffc00000: Unevaluated properties are not allowed ('msg-size' was unexpected)
Signed-off-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Reviewed-by: Konrad Dybcio konrad.dybcio@linaro.org Link: https://lore.kernel.org/r/20230618114442.140185-3-krzysztof.kozlowski@linaro... Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/qcom/sm6125-sony-xperia-seine-pdx201.dts | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/boot/dts/qcom/sm6125-sony-xperia-seine-pdx201.dts b/arch/arm64/boot/dts/qcom/sm6125-sony-xperia-seine-pdx201.dts index 47f8e5397ebba..8625fb3a7f0a7 100644 --- a/arch/arm64/boot/dts/qcom/sm6125-sony-xperia-seine-pdx201.dts +++ b/arch/arm64/boot/dts/qcom/sm6125-sony-xperia-seine-pdx201.dts @@ -74,7 +74,7 @@ pstore_mem: ramoops@ffc00000 { reg = <0x0 0xffc40000 0x0 0xc0000>; record-size = <0x1000>; console-size = <0x40000>; - msg-size = <0x20000 0x20000>; + pmsg-size = <0x20000>; };
cmdline_mem: memory@ffd00000 {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org
[ Upstream commit 4e6b942f092653ebcdbbc0819b2d1f08ab415bdc ]
There is no 'msg-size' property in ramoops, so assume the intention was 'pmsg-size':
sm8150-sony-xperia-kumano-griffin.dtb: ramoops@ffc00000: Unevaluated properties are not allowed ('msg-size' was unexpected)
Signed-off-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Reviewed-by: Konrad Dybcio konrad.dybcio@linaro.org Link: https://lore.kernel.org/r/20230618114442.140185-6-krzysztof.kozlowski@linaro... Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/qcom/sm8150-sony-xperia-kumano.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/boot/dts/qcom/sm8150-sony-xperia-kumano.dtsi b/arch/arm64/boot/dts/qcom/sm8150-sony-xperia-kumano.dtsi index 04c71f74ab72d..c9aa7764fc59a 100644 --- a/arch/arm64/boot/dts/qcom/sm8150-sony-xperia-kumano.dtsi +++ b/arch/arm64/boot/dts/qcom/sm8150-sony-xperia-kumano.dtsi @@ -127,7 +127,7 @@ ramoops@ffc00000 { reg = <0x0 0xffc00000 0x0 0x100000>; record-size = <0x1000>; console-size = <0x40000>; - msg-size = <0x20000 0x20000>; + pmsg-size = <0x20000>; ecc-size = <16>; no-map; };
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org
[ Upstream commit 7dc3606f91427414d00a2fb09e6e0e32c14c2093 ]
There is no 'msg-size' property in ramoops, so assume the intention was 'pmsg-size':
sm8250-sony-xperia-edo-pdx206.dtb: ramoops@ffc00000: Unevaluated properties are not allowed ('msg-size' was unexpected)
Signed-off-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Reviewed-by: Konrad Dybcio konrad.dybcio@linaro.org Link: https://lore.kernel.org/r/20230618114442.140185-7-krzysztof.kozlowski@linaro... Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/boot/dts/qcom/sm8250-sony-xperia-edo.dtsi | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/boot/dts/qcom/sm8250-sony-xperia-edo.dtsi b/arch/arm64/boot/dts/qcom/sm8250-sony-xperia-edo.dtsi index e622cbe167b0d..aeec0b6a1d7d2 100644 --- a/arch/arm64/boot/dts/qcom/sm8250-sony-xperia-edo.dtsi +++ b/arch/arm64/boot/dts/qcom/sm8250-sony-xperia-edo.dtsi @@ -126,7 +126,7 @@ ramoops@ffc00000 { reg = <0x0 0xffc00000 0x0 0x100000>; record-size = <0x1000>; console-size = <0x40000>; - msg-size = <0x20000 0x20000>; + pmsg-size = <0x20000>; ecc-size = <16>; no-map; };
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rong Tao rongtao@cestc.cn
[ Upstream commit 910e230d5f1bb72c54532e94fbb1705095c7bab6 ]
Macro symbol_put(x) is defined as __symbol_put(__stringify(x)), so with

ksym_name = "jiffies"

the call

symbol_put(ksym_name)

will be resolved as

__symbol_put("ksym_name")

which is clearly wrong. So symbol_put() must be replaced with __symbol_put().
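A minimal sketch of the expansion (the stringify helpers shown only approximate include/linux/stringify.h and include/linux/module.h; the demo function is made up):

#define __stringify_1(x)	#x
#define __stringify(x)		__stringify_1(x)
#define symbol_put(x)		__symbol_put(__stringify(x))

static char *ksym_name = "jiffies";

static void put_ksym(void)
{
	symbol_put(ksym_name);		/* expands to __symbol_put("ksym_name") */
	__symbol_put(ksym_name);	/* passes "jiffies", as intended */
}

symbol_put() only does the right thing when the symbol itself is passed as the argument, e.g. symbol_put(jiffies).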
When we uninstall hw_breakpoint.ko (rmmod), a kernel bug occurs with the following error:
[11381.854152] kernel BUG at kernel/module/main.c:779! [11381.854159] invalid opcode: 0000 [#2] PREEMPT SMP PTI [11381.854163] CPU: 8 PID: 59623 Comm: rmmod Tainted: G D OE 6.2.9-200.fc37.x86_64 #1 [11381.854167] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./B360M-HDV, BIOS P3.20 10/23/2018 [11381.854169] RIP: 0010:__symbol_put+0xa2/0xb0 [11381.854175] Code: 00 e8 92 d2 f7 ff 65 8b 05 c3 2f e6 78 85 c0 74 1b 48 8b 44 24 30 65 48 2b 04 25 28 00 00 00 75 12 48 83 c4 38 c3 cc cc cc cc <0f> 0b 0f 1f 44 00 00 eb de e8 c0 df d8 00 90 90 90 90 90 90 90 90 [11381.854178] RSP: 0018:ffffad8ec6ae7dd0 EFLAGS: 00010246 [11381.854181] RAX: 0000000000000000 RBX: ffffffffc1fd1240 RCX: 000000000000000c [11381.854184] RDX: 000000000000006b RSI: ffffffffc02bf7c7 RDI: ffffffffc1fd001c [11381.854186] RBP: 000055a38b76e7c8 R08: ffffffff871ccfe0 R09: 0000000000000000 [11381.854188] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000 [11381.854190] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [11381.854192] FS: 00007fbf7c62c740(0000) GS:ffff8c5badc00000(0000) knlGS:0000000000000000 [11381.854195] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [11381.854197] CR2: 000055a38b7793f8 CR3: 0000000363e1e001 CR4: 00000000003726e0 [11381.854200] DR0: ffffffffb3407980 DR1: 0000000000000000 DR2: 0000000000000000 [11381.854202] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [11381.854204] Call Trace: [11381.854207] <TASK> [11381.854212] s_module_exit+0xc/0xff0 [symbol_getput] [11381.854219] __do_sys_delete_module.constprop.0+0x198/0x2f0 [11381.854225] do_syscall_64+0x58/0x80 [11381.854231] ? exit_to_user_mode_prepare+0x180/0x1f0 [11381.854237] ? syscall_exit_to_user_mode+0x17/0x40 [11381.854241] ? do_syscall_64+0x67/0x80 [11381.854245] ? syscall_exit_to_user_mode+0x17/0x40 [11381.854248] ? do_syscall_64+0x67/0x80 [11381.854252] ? exc_page_fault+0x70/0x170 [11381.854256] entry_SYSCALL_64_after_hwframe+0x72/0xdc
Signed-off-by: Rong Tao rongtao@cestc.cn Reviewed-by: Petr Mladek pmladek@suse.com Signed-off-by: Luis Chamberlain mcgrof@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- samples/hw_breakpoint/data_breakpoint.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/samples/hw_breakpoint/data_breakpoint.c b/samples/hw_breakpoint/data_breakpoint.c index 418c46fe5ffc3..9debd128b2ab8 100644 --- a/samples/hw_breakpoint/data_breakpoint.c +++ b/samples/hw_breakpoint/data_breakpoint.c @@ -70,7 +70,7 @@ static int __init hw_break_module_init(void) static void __exit hw_break_module_exit(void) { unregister_wide_hw_breakpoint(sample_hbp); - symbol_put(ksym_name); + __symbol_put(ksym_name); printk(KERN_INFO "HW Breakpoint for %s write uninstalled\n", ksym_name); }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com
[ Upstream commit d2852b8c045ebd31d753b06f2810df5be30ed56a ]
One more PCI ID for the road.
Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Reviewed-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Reviewed-by: Bard Liao yung-chuan.liao@linux.intel.com Link: https://lore.kernel.org/r/20230802150105.24604-5-pierre-louis.bossart@linux.... Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/hda/intel-dsp-config.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/sound/hda/intel-dsp-config.c b/sound/hda/intel-dsp-config.c index 513eadcc38d90..c69d069b3f2b6 100644 --- a/sound/hda/intel-dsp-config.c +++ b/sound/hda/intel-dsp-config.c @@ -385,6 +385,14 @@ static const struct config_entry config_table[] = { }, #endif
+/* Lunar Lake */ +#if IS_ENABLED(CONFIG_SND_SOC_SOF_LUNARLAKE) + /* Lunarlake-P */ + { + .flags = FLAG_SOF | FLAG_SOF_ONLY_IF_DMIC_OR_SOUNDWIRE, + .device = PCI_DEVICE_ID_INTEL_HDA_LNL_P, + }, +#endif };
static const struct config_entry *snd_intel_dsp_find_config
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Leo Chen sancchen@amd.com
[ Upstream commit 026a71babf48efb6b9884a3a66fa31aec9e1ea54 ]
[Why & How] HDMI TMDS does not have ODM support. Filtering 420 modes that exceed the 4096 FMT limitation on DCN31 will resolve intermittent corruption issues.
Reviewed-by: Nicholas Kazlauskas nicholas.kazlauskas@amd.com Acked-by: Tom Chung chiahsuan.chung@amd.com Signed-off-by: Leo Chen sancchen@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c index aa0507e017926..c5e0de4f77b31 100644 --- a/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c +++ b/drivers/gpu/drm/amd/display/dc/dml/dcn31/display_mode_vba_31.c @@ -4162,7 +4162,9 @@ void dml31_ModeSupportAndSystemConfigurationFull(struct display_mode_lib *mode_l } if (v->OutputFormat[k] == dm_420 && v->HActive[k] > DCN31_MAX_FMT_420_BUFFER_WIDTH && v->ODMCombineEnablePerState[i][k] != dm_odm_combine_mode_4to1) { - if (v->HActive[k] / 2 > DCN31_MAX_FMT_420_BUFFER_WIDTH) { + if (v->Output[k] == dm_hdmi) { + FMTBufferExceeded = true; + } else if (v->HActive[k] / 2 > DCN31_MAX_FMT_420_BUFFER_WIDTH) { v->ODMCombineEnablePerState[i][k] = dm_odm_combine_mode_4to1; v->PlaneRequiredDISPCLK = v->PlaneRequiredDISPCLKWithODMCombine4To1;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tuo Li islituo@gmail.com
[ Upstream commit 2e63972a2de14482d0eae1a03a73e379f1c3f44c ]
The variable crtc->state->event is often protected by the lock crtc->dev->event_lock when it is accessed. However, it is accessed as a condition of an if statement in exynos_drm_crtc_atomic_disable() without holding the lock:
if (crtc->state->event && !crtc->state->active)
However, if crtc->state->event is changed to NULL by another thread right after the condition of the if statement is checked and found true, a null-pointer dereference can occur in drm_crtc_send_vblank_event():
e->pipe = pipe;
To fix this possible null-pointer dereference caused by data race, the spin lock coverage is extended to protect the if statement as well as the function call to drm_crtc_send_vblank_event().
Reported-by: BassCheck bass@buaa.edu.cn Link: https://sites.google.com/view/basscheck/home Signed-off-by: Tuo Li islituo@gmail.com Reviewed-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Added relevant link. Signed-off-by: Inki Dae inki.dae@samsung.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/exynos/exynos_drm_crtc.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/exynos/exynos_drm_crtc.c b/drivers/gpu/drm/exynos/exynos_drm_crtc.c index 4153f302de7c4..d19e796c20613 100644 --- a/drivers/gpu/drm/exynos/exynos_drm_crtc.c +++ b/drivers/gpu/drm/exynos/exynos_drm_crtc.c @@ -39,13 +39,12 @@ static void exynos_drm_crtc_atomic_disable(struct drm_crtc *crtc, if (exynos_crtc->ops->atomic_disable) exynos_crtc->ops->atomic_disable(exynos_crtc);
+ spin_lock_irq(&crtc->dev->event_lock); if (crtc->state->event && !crtc->state->active) { - spin_lock_irq(&crtc->dev->event_lock); drm_crtc_send_vblank_event(crtc, crtc->state->event); - spin_unlock_irq(&crtc->dev->event_lock); - crtc->state->event = NULL; } + spin_unlock_irq(&crtc->dev->event_lock); }
static int exynos_crtc_atomic_check(struct drm_crtc *crtc,
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tony Lindgren tony@atomide.com
[ Upstream commit 03a711d3cb83692733f865312f49e665c49de6de ]
Enable the uart quirks similar to the earlier SoCs. Let's assume we are likely going to need a k3 specific quirk mask separate from the earlier SoCs, so let's not start changing the revision register mask at this point.
Note that SYSC_QUIRK_LEGACY_IDLE will be needed until we can remove the need for pm_runtime_irq_safe() from 8250_omap driver.
Reviewed-by: Nishanth Menon nm@ti.com Signed-off-by: Tony Lindgren tony@atomide.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bus/ti-sysc.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/bus/ti-sysc.c b/drivers/bus/ti-sysc.c index 436c0f3563d79..74ab6cf031ce0 100644 --- a/drivers/bus/ti-sysc.c +++ b/drivers/bus/ti-sysc.c @@ -1504,6 +1504,8 @@ static const struct sysc_revision_quirk sysc_revision_quirks[] = { SYSC_QUIRK_SWSUP_SIDLE | SYSC_QUIRK_LEGACY_IDLE), SYSC_QUIRK("uart", 0, 0x50, 0x54, 0x58, 0x47422e03, 0xffffffff, SYSC_QUIRK_SWSUP_SIDLE | SYSC_QUIRK_LEGACY_IDLE), + SYSC_QUIRK("uart", 0, 0x50, 0x54, 0x58, 0x47424e03, 0xffffffff, + SYSC_QUIRK_SWSUP_SIDLE | SYSC_QUIRK_LEGACY_IDLE),
/* Quirks that need to be set based on the module address */ SYSC_QUIRK("mcpdm", 0x40132000, 0, 0x10, -ENODEV, 0x50000800, 0xffffffff,
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhang Shurong zhang_shurong@foxmail.com
[ Upstream commit 8b0472b50bcf0f19a5119b00a53b63579c8e1e4d ]
If rdev->raid_disk is greater than mddev->raid_disks, there will be an out-of-bounds access in raid1_remove_disk(). We have already found similar reports as follows:
1) commit d17f744e883b ("md-raid10: fix KASAN warning")
2) commit 1ebc2cec0b7d ("dm raid: fix KASAN warning in raid5_remove_disk")
Fix this bug by checking whether the "number" variable is valid.
Signed-off-by: Zhang Shurong zhang_shurong@foxmail.com Reviewed-by: Yu Kuai yukuai3@huawei.com Link: https://lore.kernel.org/r/tencent_0D24426FAC6A21B69AC0C03CE4143A508F09@qq.co... Signed-off-by: Song Liu song@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/raid1.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c index 084bfea6ad316..5360d6ed16e05 100644 --- a/drivers/md/raid1.c +++ b/drivers/md/raid1.c @@ -1820,6 +1820,10 @@ static int raid1_remove_disk(struct mddev *mddev, struct md_rdev *rdev) struct r1conf *conf = mddev->private; int err = 0; int number = rdev->raid_disk; + + if (unlikely(number >= conf->raid_disks)) + goto abort; + struct raid1_info *p = conf->mirrors + number;
if (rdev != p->rdev)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Georg Ottinger g.ottinger@gmx.at
[ Upstream commit e88076348425b7d0491c8c98d8732a7df8de7aa3 ]
I run a small server that uses external hard drives for backups. The backup software I use uses ext2 filesystems with 4KiB block size and the server is running SELinux and therefore relies on xattr. I recently upgraded the hard drives from 4TB to 12TB models. I noticed that after transferring some TBs I got a filesystem error "Freeing blocks not in datazone - block = 18446744071529317386, count = 1" and the backup process stopped. Trying to fix the fs with e2fsck resulted in a completely corrupted fs. The error probably came from ext2_free_blocks(), and because of the large number 18e19 this problem immediately looked like some kind of integer overflow. Whereas the 4TB fs was about 1e9 blocks, the new 12TB is about 3e9 blocks. So, searching the ext2 code, I came across the line in fs/ext2/xattr.c:745 where ext2_new_block() is called and the resulting block number is stored in the variable block as an int datatype. If a block with a block number greater than INT32_MAX is returned, this variable overflows and the call to sb_getblk() at line fs/ext2/xattr.c:750 fails, then the call to ext2_free_blocks() produces the error.
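A standalone illustration of the truncation (userspace C; the block count is made up but in the range of a 12TB filesystem with 4KiB blocks):

#include <stdio.h>

int main(void)
{
	unsigned long block = 3000000000UL;	/* ~3e9 blocks, > INT32_MAX */
	int as_int = (int)block;		/* what the old int variable held */

	/* On common 64-bit Linux targets this prints:
	 *   3000000000 -> -1294967296
	 * so the block number handed to sb_getblk() is garbage.
	 */
	printf("%lu -> %d\n", block, as_int);
	return 0;
}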
Signed-off-by: Georg Ottinger g.ottinger@gmx.at Signed-off-by: Jan Kara jack@suse.cz Message-Id: 20230815100340.22121-1-g.ottinger@gmx.at Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ext2/xattr.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/ext2/xattr.c b/fs/ext2/xattr.c index 841fa6d9d744b..f1dc11dab0d88 100644 --- a/fs/ext2/xattr.c +++ b/fs/ext2/xattr.c @@ -694,10 +694,10 @@ ext2_xattr_set2(struct inode *inode, struct buffer_head *old_bh, /* We need to allocate a new block */ ext2_fsblk_t goal = ext2_group_first_block_no(sb, EXT2_I(inode)->i_block_group); - int block = ext2_new_block(inode, goal, &error); + ext2_fsblk_t block = ext2_new_block(inode, goal, &error); if (error) goto cleanup; - ea_idebug(inode, "creating block %d", block); + ea_idebug(inode, "creating block %lu", block);
new_bh = sb_getblk(sb, block); if (unlikely(!new_bh)) {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andrew Kanner andrew.kanner@gmail.com
[ Upstream commit cade5397e5461295f3cb87880534b6a07cafa427 ]
Syzkaller reported the following issue: ================================================================== BUG: KASAN: double-free in slab_free mm/slub.c:3787 [inline] BUG: KASAN: double-free in __kmem_cache_free+0x71/0x110 mm/slub.c:3800 Free of addr ffff888086408000 by task syz-executor.4/12750 [...] Call Trace: <TASK> [...] kasan_report_invalid_free+0xac/0xd0 mm/kasan/report.c:482 ____kasan_slab_free+0xfb/0x120 kasan_slab_free include/linux/kasan.h:177 [inline] slab_free_hook mm/slub.c:1781 [inline] slab_free_freelist_hook+0x12e/0x1a0 mm/slub.c:1807 slab_free mm/slub.c:3787 [inline] __kmem_cache_free+0x71/0x110 mm/slub.c:3800 dbUnmount+0xf4/0x110 fs/jfs/jfs_dmap.c:264 jfs_umount+0x248/0x3b0 fs/jfs/jfs_umount.c:87 jfs_put_super+0x86/0x190 fs/jfs/super.c:194 generic_shutdown_super+0x130/0x310 fs/super.c:492 kill_block_super+0x79/0xd0 fs/super.c:1386 deactivate_locked_super+0xa7/0xf0 fs/super.c:332 cleanup_mnt+0x494/0x520 fs/namespace.c:1291 task_work_run+0x243/0x300 kernel/task_work.c:179 resume_user_mode_work include/linux/resume_user_mode.h:49 [inline] exit_to_user_mode_loop+0x124/0x150 kernel/entry/common.c:171 exit_to_user_mode_prepare+0xb2/0x140 kernel/entry/common.c:203 __syscall_exit_to_user_mode_work kernel/entry/common.c:285 [inline] syscall_exit_to_user_mode+0x26/0x60 kernel/entry/common.c:296 do_syscall_64+0x49/0xb0 arch/x86/entry/common.c:86 entry_SYSCALL_64_after_hwframe+0x63/0xcd [...] </TASK>
Allocated by task 13352: kasan_save_stack mm/kasan/common.c:45 [inline] kasan_set_track+0x3d/0x60 mm/kasan/common.c:52 ____kasan_kmalloc mm/kasan/common.c:371 [inline] __kasan_kmalloc+0x97/0xb0 mm/kasan/common.c:380 kmalloc include/linux/slab.h:580 [inline] dbMount+0x54/0x980 fs/jfs/jfs_dmap.c:164 jfs_mount+0x1dd/0x830 fs/jfs/jfs_mount.c:121 jfs_fill_super+0x590/0xc50 fs/jfs/super.c:556 mount_bdev+0x26c/0x3a0 fs/super.c:1359 legacy_get_tree+0xea/0x180 fs/fs_context.c:610 vfs_get_tree+0x88/0x270 fs/super.c:1489 do_new_mount+0x289/0xad0 fs/namespace.c:3145 do_mount fs/namespace.c:3488 [inline] __do_sys_mount fs/namespace.c:3697 [inline] __se_sys_mount+0x2d3/0x3c0 fs/namespace.c:3674 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd
Freed by task 13352: kasan_save_stack mm/kasan/common.c:45 [inline] kasan_set_track+0x3d/0x60 mm/kasan/common.c:52 kasan_save_free_info+0x27/0x40 mm/kasan/generic.c:518 ____kasan_slab_free+0xd6/0x120 mm/kasan/common.c:236 kasan_slab_free include/linux/kasan.h:177 [inline] slab_free_hook mm/slub.c:1781 [inline] slab_free_freelist_hook+0x12e/0x1a0 mm/slub.c:1807 slab_free mm/slub.c:3787 [inline] __kmem_cache_free+0x71/0x110 mm/slub.c:3800 dbUnmount+0xf4/0x110 fs/jfs/jfs_dmap.c:264 jfs_mount_rw+0x545/0x740 fs/jfs/jfs_mount.c:247 jfs_remount+0x3db/0x710 fs/jfs/super.c:454 reconfigure_super+0x3bc/0x7b0 fs/super.c:935 vfs_fsconfig_locked fs/fsopen.c:254 [inline] __do_sys_fsconfig fs/fsopen.c:439 [inline] __se_sys_fsconfig+0xad5/0x1060 fs/fsopen.c:314 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd [...]
JFS_SBI(ipbmap->i_sb)->bmap wasn't set to NULL after kfree() in dbUnmount().
Syzkaller uses fault injection to reproduce this KASAN double-free warning. The issue is triggered if either diMount() or dbMount() fail in jfs_remount(), since diUnmount() or dbUnmount() have already happened in that case - the same memory is then freed again on the next jfs_umount() or jfs_remount().
Tested on both upstream and jfs-next by syzkaller.
Reported-and-tested-by: syzbot+6a93efb725385bc4b2e9@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/000000000000471f2d05f1ce8bad@google.com/T/ Link: https://syzkaller.appspot.com/bug?extid=6a93efb725385bc4b2e9 Signed-off-by: Andrew Kanner andrew.kanner@gmail.com Signed-off-by: Dave Kleikamp dave.kleikamp@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/jfs/jfs_dmap.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/fs/jfs/jfs_dmap.c b/fs/jfs/jfs_dmap.c index f235a3d270a01..da4f9c3b714fe 100644 --- a/fs/jfs/jfs_dmap.c +++ b/fs/jfs/jfs_dmap.c @@ -269,6 +269,7 @@ int dbUnmount(struct inode *ipbmap, int mounterror)
/* free the memory for the in-memory bmap. */ kfree(bmp); + JFS_SBI(ipbmap->i_sb)->bmap = NULL;
return (0); }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Liu Shixin via Jfs-discussion jfs-discussion@lists.sourceforge.net
[ Upstream commit 6e2bda2c192d0244b5a78b787ef20aa10cb319b7 ]
syzbot found an invalid-free in diUnmount:
BUG: KASAN: double-free in slab_free mm/slub.c:3661 [inline] BUG: KASAN: double-free in __kmem_cache_free+0x71/0x110 mm/slub.c:3674 Free of addr ffff88806f410000 by task syz-executor131/3632
CPU: 0 PID: 3632 Comm: syz-executor131 Not tainted 6.1.0-rc7-syzkaller-00012-gca57f02295f1 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x1b1/0x28e lib/dump_stack.c:106 print_address_description+0x74/0x340 mm/kasan/report.c:284 print_report+0x107/0x1f0 mm/kasan/report.c:395 kasan_report_invalid_free+0xac/0xd0 mm/kasan/report.c:460 ____kasan_slab_free+0xfb/0x120 kasan_slab_free include/linux/kasan.h:177 [inline] slab_free_hook mm/slub.c:1724 [inline] slab_free_freelist_hook+0x12e/0x1a0 mm/slub.c:1750 slab_free mm/slub.c:3661 [inline] __kmem_cache_free+0x71/0x110 mm/slub.c:3674 diUnmount+0xef/0x100 fs/jfs/jfs_imap.c:195 jfs_umount+0x108/0x370 fs/jfs/jfs_umount.c:63 jfs_put_super+0x86/0x190 fs/jfs/super.c:194 generic_shutdown_super+0x130/0x310 fs/super.c:492 kill_block_super+0x79/0xd0 fs/super.c:1428 deactivate_locked_super+0xa7/0xf0 fs/super.c:332 cleanup_mnt+0x494/0x520 fs/namespace.c:1186 task_work_run+0x243/0x300 kernel/task_work.c:179 exit_task_work include/linux/task_work.h:38 [inline] do_exit+0x664/0x2070 kernel/exit.c:820 do_group_exit+0x1fd/0x2b0 kernel/exit.c:950 __do_sys_exit_group kernel/exit.c:961 [inline] __se_sys_exit_group kernel/exit.c:959 [inline] __x64_sys_exit_group+0x3b/0x40 kernel/exit.c:959 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd [...]
JFS_IP(ipimap)->i_imap is not set to NULL after being freed in diUnmount(). If jfs_remount() frees JFS_IP(ipimap)->i_imap but then fails in diMount(), JFS_IP(ipimap)->i_imap will be freed once again. Fix this problem by setting JFS_IP(ipimap)->i_imap to NULL after the free.
Reported-by: syzbot+90a11e6b1e810785c6ff@syzkaller.appspotmail.com Signed-off-by: Liu Shixin liushixin2@huawei.com Signed-off-by: Dave Kleikamp dave.kleikamp@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/jfs/jfs_imap.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/fs/jfs/jfs_imap.c b/fs/jfs/jfs_imap.c index 799d3837e7c2b..4899663996d81 100644 --- a/fs/jfs/jfs_imap.c +++ b/fs/jfs/jfs_imap.c @@ -193,6 +193,7 @@ int diUnmount(struct inode *ipimap, int mounterror) * free in-memory control structure */ kfree(imap); + JFS_IP(ipimap)->i_imap = NULL;
return (0); }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mårten Lindahl marten.lindahl@axis.com
[ Upstream commit 8922ba71c969d2a0c01a94372a71477d879470de ]
If a panic is triggered by an hrtimer interrupt, all online CPUs will be notified and set offline. But as highlighted by commit 19dbdcb8039c ("smp: Warn on function calls from softirq context"), this synchronous call should not be made with interrupts disabled:
softdog: Initiating panic Kernel panic - not syncing: Software Watchdog Timer expired WARNING: CPU: 1 PID: 0 at kernel/smp.c:753 smp_call_function_many_cond unwind_backtrace: show_stack dump_stack_lvl __warn warn_slowpath_fmt smp_call_function_many_cond smp_call_function crash_smp_send_stop.part.0 machine_crash_shutdown __crash_kexec panic softdog_fire __hrtimer_run_queues hrtimer_interrupt
Make the smp call for machine_crash_nonpanic_core() asynchronous.
Signed-off-by: Mårten Lindahl marten.lindahl@axis.com Signed-off-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/kernel/machine_kexec.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/arch/arm/kernel/machine_kexec.c b/arch/arm/kernel/machine_kexec.c index f567032a09c0b..6d1938d1b4df7 100644 --- a/arch/arm/kernel/machine_kexec.c +++ b/arch/arm/kernel/machine_kexec.c @@ -92,16 +92,28 @@ void machine_crash_nonpanic_core(void *unused) } }
+static DEFINE_PER_CPU(call_single_data_t, cpu_stop_csd) = + CSD_INIT(machine_crash_nonpanic_core, NULL); + void crash_smp_send_stop(void) { static int cpus_stopped; unsigned long msecs; + call_single_data_t *csd; + int cpu, this_cpu = raw_smp_processor_id();
if (cpus_stopped) return;
atomic_set(&waiting_for_crash_ipi, num_online_cpus() - 1); - smp_call_function(machine_crash_nonpanic_core, NULL, false); + for_each_online_cpu(cpu) { + if (cpu == this_cpu) + continue; + + csd = &per_cpu(cpu_stop_csd, cpu); + smp_call_function_single_async(cpu, csd); + } + msecs = 1000; /* Wait at most a second for the other cpus to stop */ while ((atomic_read(&waiting_for_crash_ipi) > 0) && msecs) { mdelay(1);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: ruanjinjie ruanjinjie@huawei.com
[ Upstream commit afda85b963c12947e298ad85d757e333aa40fd74 ]
If device_register() returns an error in ibmebus_bus_init(), the name of the kobject, which was allocated by dev_set_name() in device_add(), is leaked.
As the comment of device_add() says, the caller should use put_device() to drop the reference count that was set in device_initialize() when device_add() fails, so the name can be freed in kobject_cleanup().
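A sketch of the general pattern the fix follows (the function name is illustrative, not the ibmebus code):

#include <linux/device.h>

static int register_my_bus_device(struct device *dev)
{
	int err;

	err = device_register(dev);	/* device_initialize() + device_add() */
	if (err) {
		/* Never free dev directly here: drop the reference so the
		 * release path frees the kobject name from dev_set_name().
		 */
		put_device(dev);
		return err;
	}

	return 0;
}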
Signed-off-by: ruanjinjie ruanjinjie@huawei.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://msgid.link/20221110011929.3709774-1-ruanjinjie@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/platforms/pseries/ibmebus.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/powerpc/platforms/pseries/ibmebus.c b/arch/powerpc/platforms/pseries/ibmebus.c index 7ee3ed7d6cc21..6936ffee253b2 100644 --- a/arch/powerpc/platforms/pseries/ibmebus.c +++ b/arch/powerpc/platforms/pseries/ibmebus.c @@ -451,6 +451,7 @@ static int __init ibmebus_bus_init(void) if (err) { printk(KERN_WARNING "%s: device_register returned %i\n", __func__, err); + put_device(&ibmebus_bus_device); bus_unregister(&ibmebus_bus_type);
return err;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yong-Xuan Wang yongxuan.wang@sifive.com
[ Upstream commit 551a60e1225e71fff8efd9390204c505b0870e0f ]
The iMSI-RX module of the DW PCIe controller provides multiple sets of MSI_CTRL_INT_i_* registers, and each set is capable of handling 32 MSI interrupts. However, the fu740 PCIe controller driver only enabled one set of MSI_CTRL_INT_i_* registers, as the total number of supported interrupts was not specified.
Set the supported number of MSI vectors to enable all the MSI_CTRL_INT_i_* registers on the fu740 PCIe core, allowing the system to fully utilize the available MSI interrupts.
Link: https://lore.kernel.org/r/20230807055621.2431-1-yongxuan.wang@sifive.com Signed-off-by: Yong-Xuan Wang yongxuan.wang@sifive.com Signed-off-by: Lorenzo Pieralisi lpieralisi@kernel.org Reviewed-by: Serge Semin fancer.lancer@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/controller/dwc/pcie-fu740.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/pci/controller/dwc/pcie-fu740.c b/drivers/pci/controller/dwc/pcie-fu740.c index 78d002be4f821..f6c71c1b657b6 100644 --- a/drivers/pci/controller/dwc/pcie-fu740.c +++ b/drivers/pci/controller/dwc/pcie-fu740.c @@ -301,6 +301,7 @@ static int fu740_pcie_probe(struct platform_device *pdev) pci->dev = dev; pci->ops = &dw_pcie_ops; pci->pp.ops = &fu740_pcie_host_ops; + pci->pp.num_vectors = MAX_MSI_IRQS;
/* SiFive specific region: mgmt */ afp->mgmt_base = devm_platform_ioremap_resource_byname(pdev, "mgmt");
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhang Shurong zhang_shurong@foxmail.com
[ Upstream commit 7bf744f2de0a848fb1d717f5831b03db96feae89 ]
In af9035_i2c_master_xfer, msg is controlled by the user. When msg[i].buf is NULL and msg[i].len is zero, the earlier checks on msg[i].buf pass, so malicious data can still reach af9035_i2c_master_xfer. If msg[i].buf[0] is then accessed without a sanity check, a NULL pointer dereference occurs. Add a check on msg[i].len to prevent the crash.
Similar commit: commit 0ed554fd769a ("media: dvb-usb: az6027: fix null-ptr-deref in az6027_i2c_xfer()")
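As a rough illustration of the hardening pattern shared by this series of fixes (a sketch only; example_xfer() and the minimum lengths are assumptions, not the driver's actual code):

#include <linux/i2c.h>

/* Sketch: never dereference msg[i].buf before the corresponding msg[i].len
 * has been checked, because userspace can submit zero-length messages with
 * NULL buffers through i2c-dev.
 */
static int example_xfer(struct i2c_adapter *adap, struct i2c_msg *msg, int num)
{
	u32 reg;

	if (num < 1 || msg[0].len < 3)
		return -EOPNOTSUPP;	/* too short to hold a register address */

	reg = msg[0].buf[0] << 16 | msg[0].buf[1] << 8 | msg[0].buf[2];
	/* ... perform the transfer for 'reg' ... */

	return num;
}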
Signed-off-by: Zhang Shurong zhang_shurong@foxmail.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Sasha Levin sashal@kernel.org [ moved variable declaration to fix build issues in older kernels - gregkh ] Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/usb/dvb-usb-v2/af9035.c | 14 ++++++++------ 1 file changed, 8 insertions(+), 6 deletions(-)
--- a/drivers/media/usb/dvb-usb-v2/af9035.c +++ b/drivers/media/usb/dvb-usb-v2/af9035.c @@ -270,6 +270,7 @@ static int af9035_i2c_master_xfer(struct struct dvb_usb_device *d = i2c_get_adapdata(adap); struct state *state = d_to_priv(d); int ret; + u32 reg;
if (mutex_lock_interruptible(&d->i2c_mutex) < 0) return -EAGAIN; @@ -322,8 +323,10 @@ static int af9035_i2c_master_xfer(struct ret = -EOPNOTSUPP; } else if ((msg[0].addr == state->af9033_i2c_addr[0]) || (msg[0].addr == state->af9033_i2c_addr[1])) { + if (msg[0].len < 3 || msg[1].len < 1) + return -EOPNOTSUPP; /* demod access via firmware interface */ - u32 reg = msg[0].buf[0] << 16 | msg[0].buf[1] << 8 | + reg = msg[0].buf[0] << 16 | msg[0].buf[1] << 8 | msg[0].buf[2];
if (msg[0].addr == state->af9033_i2c_addr[1]) @@ -381,17 +384,16 @@ static int af9035_i2c_master_xfer(struct ret = -EOPNOTSUPP; } else if ((msg[0].addr == state->af9033_i2c_addr[0]) || (msg[0].addr == state->af9033_i2c_addr[1])) { + if (msg[0].len < 3) + return -EOPNOTSUPP; /* demod access via firmware interface */ - u32 reg = msg[0].buf[0] << 16 | msg[0].buf[1] << 8 | + reg = msg[0].buf[0] << 16 | msg[0].buf[1] << 8 | msg[0].buf[2];
if (msg[0].addr == state->af9033_i2c_addr[1]) reg |= 0x100000;
- ret = (msg[0].len >= 3) ? af9035_wr_regs(d, reg, - &msg[0].buf[3], - msg[0].len - 3) - : -EOPNOTSUPP; + ret = af9035_wr_regs(d, reg, &msg[0].buf[3], msg[0].len - 3); } else { /* I2C write */ u8 buf[MAX_XFER_SIZE];
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhang Shurong zhang_shurong@foxmail.com
[ Upstream commit 5ae544d94abc8ff77b1b9bf8774def3fa5689b5b ]
In dw2102_i2c_transfer, msg is controlled by the user. When msg[i].buf is NULL and msg[i].len is zero, the earlier checks on msg[i].buf pass, so malicious data can still reach dw2102_i2c_transfer. If msg[i].buf[0] is then accessed without a sanity check, a NULL pointer dereference occurs. Add a check on msg[i].len to prevent the crash.
Similar commit: commit 950e252cb469 ("[media] dw2102: limit messages to buffer size")
Signed-off-by: Zhang Shurong zhang_shurong@foxmail.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/usb/dvb-usb/dw2102.c | 24 ++++++++++++++++++++++++ 1 file changed, 24 insertions(+)
diff --git a/drivers/media/usb/dvb-usb/dw2102.c b/drivers/media/usb/dvb-usb/dw2102.c index 1ed62a80067c6..253d13bdb63e5 100644 --- a/drivers/media/usb/dvb-usb/dw2102.c +++ b/drivers/media/usb/dvb-usb/dw2102.c @@ -128,6 +128,10 @@ static int dw2102_i2c_transfer(struct i2c_adapter *adap, struct i2c_msg msg[],
switch (num) { case 2: + if (msg[0].len < 1) { + num = -EOPNOTSUPP; + break; + } /* read stv0299 register */ value = msg[0].buf[0];/* register */ for (i = 0; i < msg[1].len; i++) { @@ -139,6 +143,10 @@ static int dw2102_i2c_transfer(struct i2c_adapter *adap, struct i2c_msg msg[], case 1: switch (msg[0].addr) { case 0x68: + if (msg[0].len < 2) { + num = -EOPNOTSUPP; + break; + } /* write to stv0299 register */ buf6[0] = 0x2a; buf6[1] = msg[0].buf[0]; @@ -148,6 +156,10 @@ static int dw2102_i2c_transfer(struct i2c_adapter *adap, struct i2c_msg msg[], break; case 0x60: if (msg[0].flags == 0) { + if (msg[0].len < 4) { + num = -EOPNOTSUPP; + break; + } /* write to tuner pll */ buf6[0] = 0x2c; buf6[1] = 5; @@ -159,6 +171,10 @@ static int dw2102_i2c_transfer(struct i2c_adapter *adap, struct i2c_msg msg[], dw210x_op_rw(d->udev, 0xb2, 0, 0, buf6, 7, DW210X_WRITE_MSG); } else { + if (msg[0].len < 1) { + num = -EOPNOTSUPP; + break; + } /* read from tuner */ dw210x_op_rw(d->udev, 0xb5, 0, 0, buf6, 1, DW210X_READ_MSG); @@ -166,12 +182,20 @@ static int dw2102_i2c_transfer(struct i2c_adapter *adap, struct i2c_msg msg[], } break; case (DW2102_RC_QUERY): + if (msg[0].len < 2) { + num = -EOPNOTSUPP; + break; + } dw210x_op_rw(d->udev, 0xb8, 0, 0, buf6, 2, DW210X_READ_MSG); msg[0].buf[0] = buf6[0]; msg[0].buf[1] = buf6[1]; break; case (DW2102_VOLTAGE_CTRL): + if (msg[0].len < 1) { + num = -EOPNOTSUPP; + break; + } buf6[0] = 0x30; buf6[1] = msg[0].buf[0]; dw210x_op_rw(d->udev, 0xb2, 0, 0,
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhang Shurong zhang_shurong@foxmail.com
[ Upstream commit f4ee84f27625ce1fdf41e8483fa0561a1b837d10 ]
In af9005_i2c_xfer, msg is controlled by the user. When msg[i].buf is NULL and msg[i].len is zero, the earlier checks on msg[i].buf pass, so malicious data can still reach af9005_i2c_xfer. If msg[i].buf[0] is then accessed without a sanity check, a NULL pointer dereference occurs. Add a check on msg[i].len to prevent the crash.
Similar commit: commit 0ed554fd769a ("media: dvb-usb: az6027: fix null-ptr-deref in az6027_i2c_xfer()")
Signed-off-by: Zhang Shurong zhang_shurong@foxmail.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/usb/dvb-usb/af9005.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/media/usb/dvb-usb/af9005.c b/drivers/media/usb/dvb-usb/af9005.c index b6a2436d16e97..9af54fcbed1de 100644 --- a/drivers/media/usb/dvb-usb/af9005.c +++ b/drivers/media/usb/dvb-usb/af9005.c @@ -422,6 +422,10 @@ static int af9005_i2c_xfer(struct i2c_adapter *adap, struct i2c_msg msg[], if (ret == 0) ret = 2; } else { + if (msg[0].len < 2) { + ret = -EOPNOTSUPP; + goto unlock; + } /* write one or more registers */ reg = msg[0].buf[0]; addr = msg[0].addr; @@ -431,6 +435,7 @@ static int af9005_i2c_xfer(struct i2c_adapter *adap, struct i2c_msg msg[], ret = 1; }
+unlock: mutex_unlock(&d->i2c_mutex); return ret; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhang Shurong zhang_shurong@foxmail.com
[ Upstream commit c30411266fd67ea3c02a05c157231654d5a3bdc9 ]
In anysee_master_xfer, msg is controlled by the user. When msg[i].buf is NULL and msg[i].len is zero, the earlier checks on msg[i].buf pass, so malicious data can still reach anysee_master_xfer. If msg[i].buf[0] is then accessed without a sanity check, a NULL pointer dereference occurs. Add a check on msg[i].len to prevent the crash.
Similar commit: commit 0ed554fd769a ("media: dvb-usb: az6027: fix null-ptr-deref in az6027_i2c_xfer()")
Signed-off-by: Zhang Shurong zhang_shurong@foxmail.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl [hverkuil: add spaces around +] Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/usb/dvb-usb-v2/anysee.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/media/usb/dvb-usb-v2/anysee.c b/drivers/media/usb/dvb-usb-v2/anysee.c index aa45b5d263f6b..a1235d0cce92f 100644 --- a/drivers/media/usb/dvb-usb-v2/anysee.c +++ b/drivers/media/usb/dvb-usb-v2/anysee.c @@ -202,7 +202,7 @@ static int anysee_master_xfer(struct i2c_adapter *adap, struct i2c_msg *msg,
while (i < num) { if (num > i + 1 && (msg[i+1].flags & I2C_M_RD)) { - if (msg[i].len > 2 || msg[i+1].len > 60) { + if (msg[i].len != 2 || msg[i + 1].len > 60) { ret = -EOPNOTSUPP; break; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhang Shurong zhang_shurong@foxmail.com
[ Upstream commit 1047f9343011f2cedc73c64829686206a7e9fc3f ]
In az6007_i2c_xfer, msg is controlled by the user. When msg[i].buf is NULL and msg[i].len is zero, the earlier checks on msg[i].buf pass, so malicious data can still reach az6007_i2c_xfer. If msg[i].buf[0] is then accessed without a sanity check, a NULL pointer dereference occurs. Add a check on msg[i].len to prevent the crash.
Similar commit: commit 0ed554fd769a ("media: dvb-usb: az6027: fix null-ptr-deref in az6027_i2c_xfer()")
Signed-off-by: Zhang Shurong zhang_shurong@foxmail.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/usb/dvb-usb-v2/az6007.c | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/drivers/media/usb/dvb-usb-v2/az6007.c b/drivers/media/usb/dvb-usb-v2/az6007.c index 7524c90f5da61..6cbfe75791c21 100644 --- a/drivers/media/usb/dvb-usb-v2/az6007.c +++ b/drivers/media/usb/dvb-usb-v2/az6007.c @@ -788,6 +788,10 @@ static int az6007_i2c_xfer(struct i2c_adapter *adap, struct i2c_msg msgs[], if (az6007_xfer_debug) printk(KERN_DEBUG "az6007: I2C W addr=0x%x len=%d\n", addr, msgs[i].len); + if (msgs[i].len < 1) { + ret = -EIO; + goto err; + } req = AZ6007_I2C_WR; index = msgs[i].buf[0]; value = addr | (1 << 8); @@ -802,6 +806,10 @@ static int az6007_i2c_xfer(struct i2c_adapter *adap, struct i2c_msg msgs[], if (az6007_xfer_debug) printk(KERN_DEBUG "az6007: I2C R addr=0x%x len=%d\n", addr, msgs[i].len); + if (msgs[i].len < 1) { + ret = -EIO; + goto err; + } req = AZ6007_I2C_RD; index = msgs[i].buf[0]; value = addr;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhang Shurong zhang_shurong@foxmail.com
[ Upstream commit b97719a66970601cd3151a3e2020f4454a1c4ff6 ]
In gl861_i2c_master_xfer, msg is controlled by the user. When msg[i].buf is NULL and msg[i].len is zero, the earlier checks on msg[i].buf pass, so malicious data can still reach gl861_i2c_master_xfer. If msg[i].buf[0] is then accessed without a sanity check, a NULL pointer dereference occurs. Add a check on msg[i].len to prevent the crash.
Similar commit: commit 0ed554fd769a ("media: dvb-usb: az6027: fix null-ptr-deref in az6027_i2c_xfer()")
Signed-off-by: Zhang Shurong zhang_shurong@foxmail.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/usb/dvb-usb-v2/gl861.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/media/usb/dvb-usb-v2/gl861.c b/drivers/media/usb/dvb-usb-v2/gl861.c index 0c434259c36f1..c71e7b93476de 100644 --- a/drivers/media/usb/dvb-usb-v2/gl861.c +++ b/drivers/media/usb/dvb-usb-v2/gl861.c @@ -120,7 +120,7 @@ static int gl861_i2c_master_xfer(struct i2c_adapter *adap, struct i2c_msg msg[], } else if (num == 2 && !(msg[0].flags & I2C_M_RD) && (msg[1].flags & I2C_M_RD)) { /* I2C write + read */ - if (msg[0].len > 1 || msg[1].len > sizeof(ctx->buf)) { + if (msg[0].len != 1 || msg[1].len > sizeof(ctx->buf)) { ret = -EOPNOTSUPP; goto err; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hans Verkuil hverkuil-cisco@xs4all.nl
[ Upstream commit ee630b29ea44d1851bb6c903f400956604834463 ]
BUG_ON is unnecessary here, and in addition it confuses smatch. Replacing it with an error return helps resolve this smatch warning:
drivers/media/tuners/qt1010.c:350 qt1010_init() error: buffer overflow 'i2c_data' 34 <= 34
Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/tuners/qt1010.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/media/tuners/qt1010.c b/drivers/media/tuners/qt1010.c index 60931367b82ca..48fc79cd40273 100644 --- a/drivers/media/tuners/qt1010.c +++ b/drivers/media/tuners/qt1010.c @@ -345,11 +345,12 @@ static int qt1010_init(struct dvb_frontend *fe) else valptr = &tmpval;
- BUG_ON(i >= ARRAY_SIZE(i2c_data) - 1); - - err = qt1010_init_meas1(priv, i2c_data[i+1].reg, - i2c_data[i].reg, - i2c_data[i].val, valptr); + if (i >= ARRAY_SIZE(i2c_data) - 1) + err = -EIO; + else + err = qt1010_init_meas1(priv, i2c_data[i + 1].reg, + i2c_data[i].reg, + i2c_data[i].val, valptr); i++; break; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hans Verkuil hverkuil-cisco@xs4all.nl
[ Upstream commit 2e1796fd4904fdd6062a8e4589778ea899ea0c8d ]
It was completely unnecessary to use BUG in buffer_prepare(). Just replace it with an error return. This also fixes a smatch warning:
drivers/media/pci/cx23885/cx23885-video.c:422 buffer_prepare() error: uninitialized symbol 'ret'.
Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/pci/cx23885/cx23885-video.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/media/pci/cx23885/cx23885-video.c b/drivers/media/pci/cx23885/cx23885-video.c index b01499f810697..6851e01da1c5b 100644 --- a/drivers/media/pci/cx23885/cx23885-video.c +++ b/drivers/media/pci/cx23885/cx23885-video.c @@ -413,7 +413,7 @@ static int buffer_prepare(struct vb2_buffer *vb) dev->height >> 1); break; default: - BUG(); + return -EINVAL; /* should not happen */ } dprintk(2, "[%p/%d] buffer_init - %dx%d %dbpp 0x%08x - dma=0x%08lx\n", buf, buf->vb.vb2_buf.index,
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xiaolei Wang xiaolei.wang@windriver.com
[ Upstream commit 2319b9c87fe243327285f2fefd7374ffd75a65fc ]
The resume path may sleep, so it cannot run in atomic context. Since pm_runtime_set_active() will resume suppliers, move the set-active call outside the spin lock, which is only used to protect the struct cdns data structure; otherwise the kernel will report the following warning:
BUG: sleeping function called from invalid context at drivers/base/power/runtime.c:1163
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 651, name: sh
preempt_count: 1, expected: 0
RCU nest depth: 0, expected: 0
CPU: 0 PID: 651 Comm: sh Tainted: G WC 6.1.20 #1
Hardware name: Freescale i.MX8QM MEK (DT)
Call trace:
 dump_backtrace.part.0+0xe0/0xf0
 show_stack+0x18/0x30
 dump_stack_lvl+0x64/0x80
 dump_stack+0x1c/0x38
 __might_resched+0x1fc/0x240
 __might_sleep+0x68/0xc0
 __pm_runtime_resume+0x9c/0xe0
 rpm_get_suppliers+0x68/0x1b0
 __pm_runtime_set_status+0x298/0x560
 cdns_resume+0xb0/0x1c0
 cdns3_controller_resume.isra.0+0x1e0/0x250
 cdns3_plat_resume+0x28/0x40
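A minimal sketch of the locking rule behind this change; the names are illustrative and assume a driver-private spinlock that only guards driver state, not the actual cdns3 code:

#include <linux/pm_runtime.h>
#include <linux/spinlock.h>

/* Sketch only: pm_runtime_set_active() may resume suppliers and sleep,
 * so it must run after the spinlock (taken with IRQs off) is released.
 */
static void example_resume(struct device *dev, spinlock_t *lock, bool set_active)
{
	unsigned long flags;

	spin_lock_irqsave(lock, flags);
	/* update only the driver's own bookkeeping here */
	spin_unlock_irqrestore(lock, flags);

	if (set_active) {
		pm_runtime_disable(dev);
		pm_runtime_set_active(dev);
		pm_runtime_enable(dev);
	}
}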
Signed-off-by: Xiaolei Wang xiaolei.wang@windriver.com Acked-by: Peter Chen peter.chen@kernel.org Link: https://lore.kernel.org/r/20230616021952.1025854-1-xiaolei.wang@windriver.co... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/cdns3/cdns3-plat.c | 3 ++- drivers/usb/cdns3/cdnsp-pci.c | 3 ++- drivers/usb/cdns3/core.c | 15 +++++++++++---- drivers/usb/cdns3/core.h | 7 +++++-- 4 files changed, 20 insertions(+), 8 deletions(-)
diff --git a/drivers/usb/cdns3/cdns3-plat.c b/drivers/usb/cdns3/cdns3-plat.c index 4d0f027e5bd3a..9cb647203dcf2 100644 --- a/drivers/usb/cdns3/cdns3-plat.c +++ b/drivers/usb/cdns3/cdns3-plat.c @@ -256,9 +256,10 @@ static int cdns3_controller_resume(struct device *dev, pm_message_t msg) cdns3_set_platform_suspend(cdns->dev, false, false);
spin_lock_irqsave(&cdns->lock, flags); - cdns_resume(cdns, !PMSG_IS_AUTO(msg)); + cdns_resume(cdns); cdns->in_lpm = false; spin_unlock_irqrestore(&cdns->lock, flags); + cdns_set_active(cdns, !PMSG_IS_AUTO(msg)); if (cdns->wakeup_pending) { cdns->wakeup_pending = false; enable_irq(cdns->wakeup_irq); diff --git a/drivers/usb/cdns3/cdnsp-pci.c b/drivers/usb/cdns3/cdnsp-pci.c index 29f433c5a6f3f..a85db23fa19f2 100644 --- a/drivers/usb/cdns3/cdnsp-pci.c +++ b/drivers/usb/cdns3/cdnsp-pci.c @@ -210,8 +210,9 @@ static int __maybe_unused cdnsp_pci_resume(struct device *dev) int ret;
spin_lock_irqsave(&cdns->lock, flags); - ret = cdns_resume(cdns, 1); + ret = cdns_resume(cdns); spin_unlock_irqrestore(&cdns->lock, flags); + cdns_set_active(cdns, 1);
return ret; } diff --git a/drivers/usb/cdns3/core.c b/drivers/usb/cdns3/core.c index dbcdf3b24b477..7b20d2d5c262e 100644 --- a/drivers/usb/cdns3/core.c +++ b/drivers/usb/cdns3/core.c @@ -522,9 +522,8 @@ int cdns_suspend(struct cdns *cdns) } EXPORT_SYMBOL_GPL(cdns_suspend);
-int cdns_resume(struct cdns *cdns, u8 set_active) +int cdns_resume(struct cdns *cdns) { - struct device *dev = cdns->dev; enum usb_role real_role; bool role_changed = false; int ret = 0; @@ -556,15 +555,23 @@ int cdns_resume(struct cdns *cdns, u8 set_active) if (cdns->roles[cdns->role]->resume) cdns->roles[cdns->role]->resume(cdns, cdns_power_is_lost(cdns));
+ return 0; +} +EXPORT_SYMBOL_GPL(cdns_resume); + +void cdns_set_active(struct cdns *cdns, u8 set_active) +{ + struct device *dev = cdns->dev; + if (set_active) { pm_runtime_disable(dev); pm_runtime_set_active(dev); pm_runtime_enable(dev); }
- return 0; + return; } -EXPORT_SYMBOL_GPL(cdns_resume); +EXPORT_SYMBOL_GPL(cdns_set_active); #endif /* CONFIG_PM_SLEEP */
MODULE_AUTHOR("Peter Chen peter.chen@nxp.com"); diff --git a/drivers/usb/cdns3/core.h b/drivers/usb/cdns3/core.h index ab0cb68acd239..1b6631cdf5dec 100644 --- a/drivers/usb/cdns3/core.h +++ b/drivers/usb/cdns3/core.h @@ -125,10 +125,13 @@ int cdns_init(struct cdns *cdns); int cdns_remove(struct cdns *cdns);
#ifdef CONFIG_PM_SLEEP -int cdns_resume(struct cdns *cdns, u8 set_active); +int cdns_resume(struct cdns *cdns); int cdns_suspend(struct cdns *cdns); +void cdns_set_active(struct cdns *cdns, u8 set_active); #else /* CONFIG_PM_SLEEP */ -static inline int cdns_resume(struct cdns *cdns, u8 set_active) +static inline int cdns_resume(struct cdns *cdns) +{ return 0; } +static inline int cdns_set_active(struct cdns *cdns, u8 set_active) { return 0; } static inline int cdns_suspend(struct cdns *cdns) { return 0; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ma Ke make_ruc2021@163.com
[ Upstream commit ce9daa2efc0872a9a68ea51dc8000df05893ef2e ]
Verify the array bound to ensure that the host cannot manipulate the index to point past the end of the endpoint array.
Signed-off-by: Ma Ke make_ruc2021@163.com Acked-by: Li Yang leoyang.li@nxp.com Link: https://lore.kernel.org/r/20230628081511.186850-1-make_ruc2021@163.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/gadget/udc/fsl_qe_udc.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/usb/gadget/udc/fsl_qe_udc.c b/drivers/usb/gadget/udc/fsl_qe_udc.c index 15db7a3868fe4..aff4050f96dd6 100644 --- a/drivers/usb/gadget/udc/fsl_qe_udc.c +++ b/drivers/usb/gadget/udc/fsl_qe_udc.c @@ -1956,6 +1956,8 @@ static void ch9getstatus(struct qe_udc *udc, u8 request_type, u16 value, } else if ((request_type & USB_RECIP_MASK) == USB_RECIP_ENDPOINT) { /* Get endpoint status */ int pipe = index & USB_ENDPOINT_NUMBER_MASK; + if (pipe >= USB_MAX_ENDPOINTS) + goto stall; struct qe_ep *target_ep = &udc->eps[pipe]; u16 usep;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chenyuan Mi michenyuan@huawei.com
[ Upstream commit 49d736313d0975ddeb156f4f59801da833f78b30 ]
In size_from_channelarray(), the return value 'bytes' is defined as int. However, the calculation of 'bytes' in this function is designed around unsigned int, so it is necessary to change the type of 'bytes' to unsigned int to avoid integer overflow.
size_from_channelarray() is called from main(); its return value is directly multiplied by 'buf_len' and then used as the malloc() parameter. 'buf_len' is completely controllable by the user, so the multiplication may overflow and allocate an unexpectedly small area.
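The guard added by the patch follows the usual divide-back overflow check; a hedged userspace sketch (alloc_scan_buffer() is an illustrative helper, not part of the tool):

#include <stdio.h>
#include <stdlib.h>

/* Sketch: detect wrap-around before handing scan_size * buf_len to malloc(). */
static void *alloc_scan_buffer(unsigned int scan_size, unsigned long buf_len)
{
	size_t total = (size_t)scan_size * buf_len;

	if (scan_size > 0 && total / scan_size != buf_len) {
		fprintf(stderr, "scan_size * buf_len overflowed\n");
		return NULL;
	}

	return malloc(total);
}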
Signed-off-by: Chenyuan Mi michenyuan@huawei.com Link: https://lore.kernel.org/r/20230725092407.62545-1-michenyuan@huawei.com Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/iio/iio_generic_buffer.c | 17 +++++++++++++---- 1 file changed, 13 insertions(+), 4 deletions(-)
diff --git a/tools/iio/iio_generic_buffer.c b/tools/iio/iio_generic_buffer.c index f8deae4e26a15..44bbf80f0cfdd 100644 --- a/tools/iio/iio_generic_buffer.c +++ b/tools/iio/iio_generic_buffer.c @@ -51,9 +51,9 @@ enum autochan { * Has the side effect of filling the channels[i].location values used * in processing the buffer output. **/ -static int size_from_channelarray(struct iio_channel_info *channels, int num_channels) +static unsigned int size_from_channelarray(struct iio_channel_info *channels, int num_channels) { - int bytes = 0; + unsigned int bytes = 0; int i = 0;
while (i < num_channels) { @@ -348,7 +348,7 @@ int main(int argc, char **argv) ssize_t read_size; int dev_num = -1, trig_num = -1; char *buffer_access = NULL; - int scan_size; + unsigned int scan_size; int noevents = 0; int notrigger = 0; char *dummy; @@ -674,7 +674,16 @@ int main(int argc, char **argv) }
scan_size = size_from_channelarray(channels, num_channels); - data = malloc(scan_size * buf_len); + + size_t total_buf_len = scan_size * buf_len; + + if (scan_size > 0 && total_buf_len / scan_size != buf_len) { + ret = -EFAULT; + perror("Integer overflow happened when calculate scan_size * buf_len"); + goto error; + } + + data = malloc(total_buf_len); if (!data) { ret = -ENOMEM; goto error;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Konstantin Shelekhin k.shelekhin@yadro.com
[ Upstream commit 801f287c93ff95582b0a2d2163f12870a2f076d4 ]
The function lio_target_nacl_info_show() uses sprintf() in a loop to print details for every iSCSI connection in a session without checking for the buffer length. With enough iSCSI connections it's possible to overflow the buffer provided by configfs and corrupt the memory.
This patch replaces sprintf() with sysfs_emit_at(), which checks the buffer boundaries.
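For reference, a minimal sketch of the sysfs_emit_at() accumulation pattern used by the conversion; example_show() is illustrative, not the LIO code:

#include <linux/sysfs.h>

/* Sketch: sysfs_emit_at() clamps output to the PAGE_SIZE buffer that
 * configfs/sysfs passes to ->show(), unlike open-coded sprintf(page + rb).
 */
static ssize_t example_show(char *page)
{
	ssize_t rb = 0;
	int i;

	for (i = 0; i < 1000; i++)	/* arbitrarily many records */
		rb += sysfs_emit_at(page, rb, "record %d\n", i);

	return rb;	/* never exceeds PAGE_SIZE */
}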
Signed-off-by: Konstantin Shelekhin k.shelekhin@yadro.com Link: https://lore.kernel.org/r/20230722152657.168859-2-k.shelekhin@yadro.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/target/iscsi/iscsi_target_configfs.c | 54 ++++++++++---------- 1 file changed, 27 insertions(+), 27 deletions(-)
diff --git a/drivers/target/iscsi/iscsi_target_configfs.c b/drivers/target/iscsi/iscsi_target_configfs.c index f4a24fa5058e6..df399110fbf11 100644 --- a/drivers/target/iscsi/iscsi_target_configfs.c +++ b/drivers/target/iscsi/iscsi_target_configfs.c @@ -507,102 +507,102 @@ static ssize_t lio_target_nacl_info_show(struct config_item *item, char *page) spin_lock_bh(&se_nacl->nacl_sess_lock); se_sess = se_nacl->nacl_sess; if (!se_sess) { - rb += sprintf(page+rb, "No active iSCSI Session for Initiator" + rb += sysfs_emit_at(page, rb, "No active iSCSI Session for Initiator" " Endpoint: %s\n", se_nacl->initiatorname); } else { sess = se_sess->fabric_sess_ptr;
- rb += sprintf(page+rb, "InitiatorName: %s\n", + rb += sysfs_emit_at(page, rb, "InitiatorName: %s\n", sess->sess_ops->InitiatorName); - rb += sprintf(page+rb, "InitiatorAlias: %s\n", + rb += sysfs_emit_at(page, rb, "InitiatorAlias: %s\n", sess->sess_ops->InitiatorAlias);
- rb += sprintf(page+rb, + rb += sysfs_emit_at(page, rb, "LIO Session ID: %u ISID: 0x%6ph TSIH: %hu ", sess->sid, sess->isid, sess->tsih); - rb += sprintf(page+rb, "SessionType: %s\n", + rb += sysfs_emit_at(page, rb, "SessionType: %s\n", (sess->sess_ops->SessionType) ? "Discovery" : "Normal"); - rb += sprintf(page+rb, "Session State: "); + rb += sysfs_emit_at(page, rb, "Session State: "); switch (sess->session_state) { case TARG_SESS_STATE_FREE: - rb += sprintf(page+rb, "TARG_SESS_FREE\n"); + rb += sysfs_emit_at(page, rb, "TARG_SESS_FREE\n"); break; case TARG_SESS_STATE_ACTIVE: - rb += sprintf(page+rb, "TARG_SESS_STATE_ACTIVE\n"); + rb += sysfs_emit_at(page, rb, "TARG_SESS_STATE_ACTIVE\n"); break; case TARG_SESS_STATE_LOGGED_IN: - rb += sprintf(page+rb, "TARG_SESS_STATE_LOGGED_IN\n"); + rb += sysfs_emit_at(page, rb, "TARG_SESS_STATE_LOGGED_IN\n"); break; case TARG_SESS_STATE_FAILED: - rb += sprintf(page+rb, "TARG_SESS_STATE_FAILED\n"); + rb += sysfs_emit_at(page, rb, "TARG_SESS_STATE_FAILED\n"); break; case TARG_SESS_STATE_IN_CONTINUE: - rb += sprintf(page+rb, "TARG_SESS_STATE_IN_CONTINUE\n"); + rb += sysfs_emit_at(page, rb, "TARG_SESS_STATE_IN_CONTINUE\n"); break; default: - rb += sprintf(page+rb, "ERROR: Unknown Session" + rb += sysfs_emit_at(page, rb, "ERROR: Unknown Session" " State!\n"); break; }
- rb += sprintf(page+rb, "---------------------[iSCSI Session" + rb += sysfs_emit_at(page, rb, "---------------------[iSCSI Session" " Values]-----------------------\n"); - rb += sprintf(page+rb, " CmdSN/WR : CmdSN/WC : ExpCmdSN" + rb += sysfs_emit_at(page, rb, " CmdSN/WR : CmdSN/WC : ExpCmdSN" " : MaxCmdSN : ITT : TTT\n"); max_cmd_sn = (u32) atomic_read(&sess->max_cmd_sn); - rb += sprintf(page+rb, " 0x%08x 0x%08x 0x%08x 0x%08x" + rb += sysfs_emit_at(page, rb, " 0x%08x 0x%08x 0x%08x 0x%08x" " 0x%08x 0x%08x\n", sess->cmdsn_window, (max_cmd_sn - sess->exp_cmd_sn) + 1, sess->exp_cmd_sn, max_cmd_sn, sess->init_task_tag, sess->targ_xfer_tag); - rb += sprintf(page+rb, "----------------------[iSCSI" + rb += sysfs_emit_at(page, rb, "----------------------[iSCSI" " Connections]-------------------------\n");
spin_lock(&sess->conn_lock); list_for_each_entry(conn, &sess->sess_conn_list, conn_list) { - rb += sprintf(page+rb, "CID: %hu Connection" + rb += sysfs_emit_at(page, rb, "CID: %hu Connection" " State: ", conn->cid); switch (conn->conn_state) { case TARG_CONN_STATE_FREE: - rb += sprintf(page+rb, + rb += sysfs_emit_at(page, rb, "TARG_CONN_STATE_FREE\n"); break; case TARG_CONN_STATE_XPT_UP: - rb += sprintf(page+rb, + rb += sysfs_emit_at(page, rb, "TARG_CONN_STATE_XPT_UP\n"); break; case TARG_CONN_STATE_IN_LOGIN: - rb += sprintf(page+rb, + rb += sysfs_emit_at(page, rb, "TARG_CONN_STATE_IN_LOGIN\n"); break; case TARG_CONN_STATE_LOGGED_IN: - rb += sprintf(page+rb, + rb += sysfs_emit_at(page, rb, "TARG_CONN_STATE_LOGGED_IN\n"); break; case TARG_CONN_STATE_IN_LOGOUT: - rb += sprintf(page+rb, + rb += sysfs_emit_at(page, rb, "TARG_CONN_STATE_IN_LOGOUT\n"); break; case TARG_CONN_STATE_LOGOUT_REQUESTED: - rb += sprintf(page+rb, + rb += sysfs_emit_at(page, rb, "TARG_CONN_STATE_LOGOUT_REQUESTED\n"); break; case TARG_CONN_STATE_CLEANUP_WAIT: - rb += sprintf(page+rb, + rb += sysfs_emit_at(page, rb, "TARG_CONN_STATE_CLEANUP_WAIT\n"); break; default: - rb += sprintf(page+rb, + rb += sysfs_emit_at(page, rb, "ERROR: Unknown Connection State!\n"); break; }
- rb += sprintf(page+rb, " Address %pISc %s", &conn->login_sockaddr, + rb += sysfs_emit_at(page, rb, " Address %pISc %s", &conn->login_sockaddr, (conn->network_transport == ISCSI_TCP) ? "TCP" : "SCTP"); - rb += sprintf(page+rb, " StatSN: 0x%08x\n", + rb += sysfs_emit_at(page, rb, " StatSN: 0x%08x\n", conn->stat_sn); } spin_unlock(&sess->conn_lock);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christophe Leroy christophe.leroy@csgroup.eu
[ Upstream commit 36ef11d311f405e55ad8e848c19b212ff71ef536 ]
  CHECK   drivers/tty/serial/cpm_uart/cpm_uart_core.c
drivers/tty/serial/cpm_uart/cpm_uart_core.c:1271:39: warning: context imbalance in 'cpm_uart_console_write' - unexpected unlock
Although 'nolock' is not expected to change, sparse finds the following form suspicious:
	if (unlikely(nolock)) {
		local_irq_save(flags);
	} else {
		spin_lock_irqsave(&pinfo->port.lock, flags);
	}

	cpm_uart_early_write(pinfo, s, count, true);

	if (unlikely(nolock)) {
		local_irq_restore(flags);
	} else {
		spin_unlock_irqrestore(&pinfo->port.lock, flags);
	}

Rewrite it in a more obvious form:

	if (unlikely(oops_in_progress)) {
		local_irq_save(flags);
		cpm_uart_early_write(pinfo, s, count, true);
		local_irq_restore(flags);
	} else {
		spin_lock_irqsave(&pinfo->port.lock, flags);
		cpm_uart_early_write(pinfo, s, count, true);
		spin_unlock_irqrestore(&pinfo->port.lock, flags);
	}
Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu Link: https://lore.kernel.org/r/f7da5cdc9287960185829cfef681a7d8614efa1f.169106870... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/tty/serial/cpm_uart/cpm_uart_core.c | 13 ++++--------- 1 file changed, 4 insertions(+), 9 deletions(-)
diff --git a/drivers/tty/serial/cpm_uart/cpm_uart_core.c b/drivers/tty/serial/cpm_uart/cpm_uart_core.c index db07d6a5d764d..2335edd516847 100644 --- a/drivers/tty/serial/cpm_uart/cpm_uart_core.c +++ b/drivers/tty/serial/cpm_uart/cpm_uart_core.c @@ -1276,19 +1276,14 @@ static void cpm_uart_console_write(struct console *co, const char *s, { struct uart_cpm_port *pinfo = &cpm_uart_ports[co->index]; unsigned long flags; - int nolock = oops_in_progress;
- if (unlikely(nolock)) { + if (unlikely(oops_in_progress)) { local_irq_save(flags); - } else { - spin_lock_irqsave(&pinfo->port.lock, flags); - } - - cpm_uart_early_write(pinfo, s, count, true); - - if (unlikely(nolock)) { + cpm_uart_early_write(pinfo, s, count, true); local_irq_restore(flags); } else { + spin_lock_irqsave(&pinfo->port.lock, flags); + cpm_uart_early_write(pinfo, s, count, true); spin_unlock_irqrestore(&pinfo->port.lock, flags); } }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xu Yang xu.yang_2@nxp.com
[ Upstream commit dda4b60ed70bd670eefda081f70c0cb20bbeb1fa ]
Some NXP processors using ChipIdea IP have a bug when frame babble is detected.
As per 4.15.1.1.1 Serial Bus Babble: A babble condition also exists if IN transaction is in progress at High-speed SOF2 point. This is called frame babble. The host controller must disable the port to which the frame babble is detected.
The USB controller has disabled the port (PE cleared) and has asserted USBERRINT when frame babble is detected, but PEC is not asserted. Therefore, the SW isn't aware that the port has been disabled. The SW then keeps sending packets to this port, but all of the transfers fail.
This workaround first asserts PCD in SW when USBERRINT is detected, and then judges whether a port change has really occurred by polling the root hub status. Because PEC doesn't get asserted in our case, this patch also asserts it in SW when specific conditions are satisfied.
Signed-off-by: Xu Yang xu.yang_2@nxp.com Acked-by: Peter Chen peter.chen@kernel.org Link: https://lore.kernel.org/r/20230809024432.535160-1-xu.yang_2@nxp.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/host/ehci-hcd.c | 8 ++++++-- drivers/usb/host/ehci-hub.c | 10 +++++++++- drivers/usb/host/ehci.h | 10 ++++++++++ 3 files changed, 25 insertions(+), 3 deletions(-)
diff --git a/drivers/usb/host/ehci-hcd.c b/drivers/usb/host/ehci-hcd.c index 1440803216297..02044d45edded 100644 --- a/drivers/usb/host/ehci-hcd.c +++ b/drivers/usb/host/ehci-hcd.c @@ -755,10 +755,14 @@ static irqreturn_t ehci_irq (struct usb_hcd *hcd)
/* normal [4.15.1.2] or error [4.15.1.1] completion */ if (likely ((status & (STS_INT|STS_ERR)) != 0)) { - if (likely ((status & STS_ERR) == 0)) + if (likely ((status & STS_ERR) == 0)) { INCR(ehci->stats.normal); - else + } else { + /* Force to check port status */ + if (ehci->has_ci_pec_bug) + status |= STS_PCD; INCR(ehci->stats.error); + } bh = 1; }
diff --git a/drivers/usb/host/ehci-hub.c b/drivers/usb/host/ehci-hub.c index c4f6a2559a987..0350c03dc97a1 100644 --- a/drivers/usb/host/ehci-hub.c +++ b/drivers/usb/host/ehci-hub.c @@ -674,7 +674,8 @@ ehci_hub_status_data (struct usb_hcd *hcd, char *buf)
if ((temp & mask) != 0 || test_bit(i, &ehci->port_c_suspend) || (ehci->reset_done[i] && time_after_eq( - jiffies, ehci->reset_done[i]))) { + jiffies, ehci->reset_done[i])) + || ehci_has_ci_pec_bug(ehci, temp)) { if (i < 7) buf [0] |= 1 << (i + 1); else @@ -874,6 +875,13 @@ int ehci_hub_control( if (temp & PORT_PEC) status |= USB_PORT_STAT_C_ENABLE << 16;
+ if (ehci_has_ci_pec_bug(ehci, temp)) { + status |= USB_PORT_STAT_C_ENABLE << 16; + ehci_info(ehci, + "PE is cleared by HW port:%d PORTSC:%08x\n", + wIndex + 1, temp); + } + if ((temp & PORT_OCC) && (!ignore_oc && !ehci->spurious_oc)){ status |= USB_PORT_STAT_C_OVERCURRENT << 16;
diff --git a/drivers/usb/host/ehci.h b/drivers/usb/host/ehci.h index fdd073cc053b8..9888ca5f5f36f 100644 --- a/drivers/usb/host/ehci.h +++ b/drivers/usb/host/ehci.h @@ -207,6 +207,7 @@ struct ehci_hcd { /* one per controller */ unsigned has_fsl_port_bug:1; /* FreeScale */ unsigned has_fsl_hs_errata:1; /* Freescale HS quirk */ unsigned has_fsl_susp_errata:1; /* NXP SUSP quirk */ + unsigned has_ci_pec_bug:1; /* ChipIdea PEC bug */ unsigned big_endian_mmio:1; unsigned big_endian_desc:1; unsigned big_endian_capbase:1; @@ -706,6 +707,15 @@ ehci_port_speed(struct ehci_hcd *ehci, unsigned int portsc) */ #define ehci_has_fsl_susp_errata(e) ((e)->has_fsl_susp_errata)
+/* + * Some Freescale/NXP processors using ChipIdea IP have a bug in which + * disabling the port (PE is cleared) does not cause PEC to be asserted + * when frame babble is detected. + */ +#define ehci_has_ci_pec_bug(e, portsc) \ + ((e)->has_ci_pec_bug && ((e)->command & CMD_PSE) \ + && !(portsc & PORT_PEC) && !(portsc & PORT_PE)) + /* * While most USB host controllers implement their registers in * little-endian format, a minority (celleb companion chip) implement
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sakari Ailus sakari.ailus@linux.intel.com
[ Upstream commit 9d7531be3085a8f013cf173ccc4e72e3cf493538 ]
Initialise timing struct in cio2_hw_init() to zero in order to avoid a compiler warning. The warning was a false positive.
Reported-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Sakari Ailus sakari.ailus@linux.intel.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/pci/intel/ipu3/ipu3-cio2-main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/media/pci/intel/ipu3/ipu3-cio2-main.c b/drivers/media/pci/intel/ipu3/ipu3-cio2-main.c index 3a8af3936e93a..162ab089124f3 100644 --- a/drivers/media/pci/intel/ipu3/ipu3-cio2-main.c +++ b/drivers/media/pci/intel/ipu3/ipu3-cio2-main.c @@ -345,7 +345,7 @@ static int cio2_hw_init(struct cio2_device *cio2, struct cio2_queue *q) void __iomem *const base = cio2->base; u8 lanes, csi2bus = q->csi2.port; u8 sensor_vc = SENSOR_VIR_CH_DFLT; - struct cio2_csi2_timing timing; + struct cio2_csi2_timing timing = { 0 }; int i, r;
fmt = cio2_find_format(NULL, &q->subdev_fmt.code);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhen Lei thunder.leizhen@huawei.com
[ Upstream commit 4d0fe8c52bb3029d83e323c961221156ab98680b ]
When I register a kset in the following way:

	static struct kset my_kset;
	kobject_set_name(&my_kset.kobj, "my_kset");
	ret = kset_register(&my_kset);
A null pointer dereference exception occurs:

[ 4453.568337] Unable to handle kernel NULL pointer dereference at \
virtual address 0000000000000028
... ...
[ 4453.810361] Call trace:
[ 4453.813062]  kobject_get_ownership+0xc/0x34
[ 4453.817493]  kobject_add_internal+0x98/0x274
[ 4453.822005]  kset_register+0x5c/0xb4
[ 4453.825820]  my_kobj_init+0x44/0x1000 [my_kset]
... ...
This happens because I didn't initialize my_kset.kobj.ktype.
According to the description in Documentation/core-api/kobject.rst: - A ktype is the type of object that embeds a kobject. Every structure that embeds a kobject needs a corresponding ktype.
So add a sanity check to make sure kset->kobj.ktype is not NULL.
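A hedged sketch of a registration that satisfies the new check; my_kset, my_ktype and the empty release callback are illustrative:

#include <linux/kobject.h>

static void my_release(struct kobject *kobj)
{
	/* nothing dynamically allocated in this sketch */
}

static struct kobj_type my_ktype = {
	.release = my_release,
};

static struct kset my_kset;

static int __init my_kset_init(void)
{
	my_kset.kobj.ktype = &my_ktype;	/* required by the new sanity check */
	kobject_set_name(&my_kset.kobj, "my_kset");
	return kset_register(&my_kset);
}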
Signed-off-by: Zhen Lei thunder.leizhen@huawei.com Link: https://lore.kernel.org/r/20230805084114.1298-2-thunder.leizhen@huaweicloud.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- lib/kobject.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/lib/kobject.c b/lib/kobject.c index 184a3dab26991..b6ccb4cced635 100644 --- a/lib/kobject.c +++ b/lib/kobject.c @@ -882,6 +882,11 @@ int kset_register(struct kset *k) if (!k) return -EINVAL;
+ if (!k->kobj.ktype) { + pr_err("must have a ktype to be initialized properly!\n"); + return -EINVAL; + } + kset_init(k); err = kobject_add_internal(&k->kobj); if (err)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Rob Clark robdclark@chromium.org
[ Upstream commit af42269c3523492d71ebbe11fefae2653e9cdc78 ]
For cases where icc_set_bw() can be called in call paths that could deadlock against shrinker/reclaim, such as runpm resume, we need to decouple the icc locking. Introduce a new icc_bw_lock for cases where we need to serialize bw aggregation and update, to decouple that from paths that require memory allocation such as node/link creation/destruction.
Fixes this lockdep splat:
====================================================== WARNING: possible circular locking dependency detected 6.2.0-rc8-debug+ #554 Not tainted ------------------------------------------------------ ring0/132 is trying to acquire lock: ffffff80871916d0 (&gmu->lock){+.+.}-{3:3}, at: a6xx_pm_resume+0xf0/0x234
but task is already holding lock: ffffffdb5aee57e8 (dma_fence_map){++++}-{0:0}, at: msm_job_run+0x68/0x150
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #4 (dma_fence_map){++++}-{0:0}: __dma_fence_might_wait+0x74/0xc0 dma_resv_lockdep+0x1f4/0x2f4 do_one_initcall+0x104/0x2bc kernel_init_freeable+0x344/0x34c kernel_init+0x30/0x134 ret_from_fork+0x10/0x20
-> #3 (mmu_notifier_invalidate_range_start){+.+.}-{0:0}: fs_reclaim_acquire+0x80/0xa8 slab_pre_alloc_hook.constprop.0+0x40/0x25c __kmem_cache_alloc_node+0x60/0x1cc __kmalloc+0xd8/0x100 topology_parse_cpu_capacity+0x8c/0x178 get_cpu_for_node+0x88/0xc4 parse_cluster+0x1b0/0x28c parse_cluster+0x8c/0x28c init_cpu_topology+0x168/0x188 smp_prepare_cpus+0x24/0xf8 kernel_init_freeable+0x18c/0x34c kernel_init+0x30/0x134 ret_from_fork+0x10/0x20
-> #2 (fs_reclaim){+.+.}-{0:0}: __fs_reclaim_acquire+0x3c/0x48 fs_reclaim_acquire+0x54/0xa8 slab_pre_alloc_hook.constprop.0+0x40/0x25c __kmem_cache_alloc_node+0x60/0x1cc __kmalloc+0xd8/0x100 kzalloc.constprop.0+0x14/0x20 icc_node_create_nolock+0x4c/0xc4 icc_node_create+0x38/0x58 qcom_icc_rpmh_probe+0x1b8/0x248 platform_probe+0x70/0xc4 really_probe+0x158/0x290 __driver_probe_device+0xc8/0xe0 driver_probe_device+0x44/0x100 __driver_attach+0xf8/0x108 bus_for_each_dev+0x78/0xc4 driver_attach+0x2c/0x38 bus_add_driver+0xd0/0x1d8 driver_register+0xbc/0xf8 __platform_driver_register+0x30/0x3c qnoc_driver_init+0x24/0x30 do_one_initcall+0x104/0x2bc kernel_init_freeable+0x344/0x34c kernel_init+0x30/0x134 ret_from_fork+0x10/0x20
-> #1 (icc_lock){+.+.}-{3:3}: __mutex_lock+0xcc/0x3c8 mutex_lock_nested+0x30/0x44 icc_set_bw+0x88/0x2b4 _set_opp_bw+0x8c/0xd8 _set_opp+0x19c/0x300 dev_pm_opp_set_opp+0x84/0x94 a6xx_gmu_resume+0x18c/0x804 a6xx_pm_resume+0xf8/0x234 adreno_runtime_resume+0x2c/0x38 pm_generic_runtime_resume+0x30/0x44 __rpm_callback+0x15c/0x174 rpm_callback+0x78/0x7c rpm_resume+0x318/0x524 __pm_runtime_resume+0x78/0xbc adreno_load_gpu+0xc4/0x17c msm_open+0x50/0x120 drm_file_alloc+0x17c/0x228 drm_open_helper+0x74/0x118 drm_open+0xa0/0x144 drm_stub_open+0xd4/0xe4 chrdev_open+0x1b8/0x1e4 do_dentry_open+0x2f8/0x38c vfs_open+0x34/0x40 path_openat+0x64c/0x7b4 do_filp_open+0x54/0xc4 do_sys_openat2+0x9c/0x100 do_sys_open+0x50/0x7c __arm64_sys_openat+0x28/0x34 invoke_syscall+0x8c/0x128 el0_svc_common.constprop.0+0xa0/0x11c do_el0_svc+0xac/0xbc el0_svc+0x48/0xa0 el0t_64_sync_handler+0xac/0x13c el0t_64_sync+0x190/0x194
-> #0 (&gmu->lock){+.+.}-{3:3}: __lock_acquire+0xe00/0x1060 lock_acquire+0x1e0/0x2f8 __mutex_lock+0xcc/0x3c8 mutex_lock_nested+0x30/0x44 a6xx_pm_resume+0xf0/0x234 adreno_runtime_resume+0x2c/0x38 pm_generic_runtime_resume+0x30/0x44 __rpm_callback+0x15c/0x174 rpm_callback+0x78/0x7c rpm_resume+0x318/0x524 __pm_runtime_resume+0x78/0xbc pm_runtime_get_sync.isra.0+0x14/0x20 msm_gpu_submit+0x58/0x178 msm_job_run+0x78/0x150 drm_sched_main+0x290/0x370 kthread+0xf0/0x100 ret_from_fork+0x10/0x20
other info that might help us debug this:
Chain exists of: &gmu->lock --> mmu_notifier_invalidate_range_start --> dma_fence_map
Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  lock(dma_fence_map);
                               lock(mmu_notifier_invalidate_range_start);
                               lock(dma_fence_map);
  lock(&gmu->lock);
*** DEADLOCK ***
2 locks held by ring0/132: #0: ffffff8087191170 (&gpu->lock){+.+.}-{3:3}, at: msm_job_run+0x64/0x150 #1: ffffffdb5aee57e8 (dma_fence_map){++++}-{0:0}, at: msm_job_run+0x68/0x150
stack backtrace: CPU: 7 PID: 132 Comm: ring0 Not tainted 6.2.0-rc8-debug+ #554 Hardware name: Google Lazor (rev1 - 2) with LTE (DT) Call trace: dump_backtrace.part.0+0xb4/0xf8 show_stack+0x20/0x38 dump_stack_lvl+0x9c/0xd0 dump_stack+0x18/0x34 print_circular_bug+0x1b4/0x1f0 check_noncircular+0x78/0xac __lock_acquire+0xe00/0x1060 lock_acquire+0x1e0/0x2f8 __mutex_lock+0xcc/0x3c8 mutex_lock_nested+0x30/0x44 a6xx_pm_resume+0xf0/0x234 adreno_runtime_resume+0x2c/0x38 pm_generic_runtime_resume+0x30/0x44 __rpm_callback+0x15c/0x174 rpm_callback+0x78/0x7c rpm_resume+0x318/0x524 __pm_runtime_resume+0x78/0xbc pm_runtime_get_sync.isra.0+0x14/0x20 msm_gpu_submit+0x58/0x178 msm_job_run+0x78/0x150 drm_sched_main+0x290/0x370 kthread+0xf0/0x100 ret_from_fork+0x10/0x20
Signed-off-by: Rob Clark robdclark@chromium.org Link: https://lore.kernel.org/r/20230807171148.210181-7-robdclark@gmail.com Signed-off-by: Georgi Djakov djakov@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/interconnect/core.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/interconnect/core.c b/drivers/interconnect/core.c index 14d785e5629e6..df88d9e9fb551 100644 --- a/drivers/interconnect/core.c +++ b/drivers/interconnect/core.c @@ -29,6 +29,7 @@ static LIST_HEAD(icc_providers); static int providers_count; static bool synced_state; static DEFINE_MUTEX(icc_lock); +static DEFINE_MUTEX(icc_bw_lock); static struct dentry *icc_debugfs_dir;
static void icc_summary_show_one(struct seq_file *s, struct icc_node *n) @@ -632,7 +633,7 @@ int icc_set_bw(struct icc_path *path, u32 avg_bw, u32 peak_bw) if (WARN_ON(IS_ERR(path) || !path->num_nodes)) return -EINVAL;
- mutex_lock(&icc_lock); + mutex_lock(&icc_bw_lock);
old_avg = path->reqs[0].avg_bw; old_peak = path->reqs[0].peak_bw; @@ -664,7 +665,7 @@ int icc_set_bw(struct icc_path *path, u32 avg_bw, u32 peak_bw) apply_constraints(path); }
- mutex_unlock(&icc_lock); + mutex_unlock(&icc_bw_lock);
trace_icc_set_bw_end(path, ret);
@@ -967,6 +968,7 @@ void icc_node_add(struct icc_node *node, struct icc_provider *provider) return;
mutex_lock(&icc_lock); + mutex_lock(&icc_bw_lock);
node->provider = provider; list_add_tail(&node->node_list, &provider->nodes); @@ -992,6 +994,7 @@ void icc_node_add(struct icc_node *node, struct icc_provider *provider) node->avg_bw = 0; node->peak_bw = 0;
+ mutex_unlock(&icc_bw_lock); mutex_unlock(&icc_lock); } EXPORT_SYMBOL_GPL(icc_node_add); @@ -1119,6 +1122,7 @@ void icc_sync_state(struct device *dev) return;
mutex_lock(&icc_lock); + mutex_lock(&icc_bw_lock); synced_state = true; list_for_each_entry(p, &icc_providers, provider_list) { dev_dbg(p->dev, "interconnect provider is in synced state\n");
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: John Ogness john.ogness@linutronix.de
[ Upstream commit 696ffaf50e1f8dbc66223ff614473f945f5fb8d8 ]
Printing to consoles can be deferred for several reasons:
- explicitly with printk_deferred() - printk() in NMI context - recursive printk() calls
The current implementation is not consistent. For printk_deferred(), irq work is scheduled twice. For NMI and recursive printk() calls, panic CPU suppression and caller delays are not properly enforced.
Correct these inconsistencies by consolidating the deferred printing code so that vprintk_deferred() is the top-level function for deferred printing and vprintk_emit() will perform whichever irq_work queueing is appropriate.
Also add kerneldoc for wake_up_klogd() and defer_console_output() to clarify their differences and appropriate usage.
Signed-off-by: John Ogness john.ogness@linutronix.de Reviewed-by: Sergey Senozhatsky senozhatsky@chromium.org Reviewed-by: Petr Mladek pmladek@suse.com Signed-off-by: Petr Mladek pmladek@suse.com Link: https://lore.kernel.org/r/20230717194607.145135-6-john.ogness@linutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/printk/printk.c | 35 ++++++++++++++++++++++++++++------- kernel/printk/printk_safe.c | 9 ++------- 2 files changed, 30 insertions(+), 14 deletions(-)
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 8d856b7c2e5af..8b110b245d92c 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -2269,7 +2269,11 @@ asmlinkage int vprintk_emit(int facility, int level, preempt_enable(); }
- wake_up_klogd(); + if (in_sched) + defer_console_output(); + else + wake_up_klogd(); + return printed_len; } EXPORT_SYMBOL(vprintk_emit); @@ -3277,11 +3281,33 @@ static void __wake_up_klogd(int val) preempt_enable(); }
+/** + * wake_up_klogd - Wake kernel logging daemon + * + * Use this function when new records have been added to the ringbuffer + * and the console printing of those records has already occurred or is + * known to be handled by some other context. This function will only + * wake the logging daemon. + * + * Context: Any context. + */ void wake_up_klogd(void) { __wake_up_klogd(PRINTK_PENDING_WAKEUP); }
+/** + * defer_console_output - Wake kernel logging daemon and trigger + * console printing in a deferred context + * + * Use this function when new records have been added to the ringbuffer, + * this context is responsible for console printing those records, but + * the current context is not allowed to perform the console printing. + * Trigger an irq_work context to perform the console printing. This + * function also wakes the logging daemon. + * + * Context: Any context. + */ void defer_console_output(void) { /* @@ -3298,12 +3324,7 @@ void printk_trigger_flush(void)
int vprintk_deferred(const char *fmt, va_list args) { - int r; - - r = vprintk_emit(0, LOGLEVEL_SCHED, NULL, fmt, args); - defer_console_output(); - - return r; + return vprintk_emit(0, LOGLEVEL_SCHED, NULL, fmt, args); }
int _printk_deferred(const char *fmt, ...) diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c index ef0f9a2044da1..6d10927a07d83 100644 --- a/kernel/printk/printk_safe.c +++ b/kernel/printk/printk_safe.c @@ -38,13 +38,8 @@ asmlinkage int vprintk(const char *fmt, va_list args) * Use the main logbuf even in NMI. But avoid calling console * drivers that might have their own locks. */ - if (this_cpu_read(printk_context) || in_nmi()) { - int len; - - len = vprintk_store(0, LOGLEVEL_DEFAULT, NULL, fmt, args); - defer_console_output(); - return len; - } + if (this_cpu_read(printk_context) || in_nmi()) + return vprintk_deferred(fmt, args);
/* No obstacles. */ return vprintk_default(fmt, args);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ritesh Harjani riteshh@linux.ibm.com
[ Upstream commit 4f98186848707f530669238d90e0562d92a78aab ]
No functional change in this patch. It only refactors the common piece of code that waits for t_updates to finish into a common function named jbd2_journal_wait_updates(journal_t *).
Signed-off-by: Ritesh Harjani riteshh@linux.ibm.com Reviewed-by: Jan Kara jack@suse.cz Link: https://lore.kernel.org/r/8c564f70f4b2591171677a2a74fccb22a7b6c3a4.164241699... Signed-off-by: Theodore Ts'o tytso@mit.edu Stable-dep-of: 2dfba3bb40ad ("jbd2: correct the end of the journal recovery scan range") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/jbd2/commit.c | 19 +++------------- fs/jbd2/transaction.c | 53 ++++++++++++++++++++++++++----------------- include/linux/jbd2.h | 4 +++- 3 files changed, 38 insertions(+), 38 deletions(-)
diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c index 20294c1bbeab7..2705850ca6460 100644 --- a/fs/jbd2/commit.c +++ b/fs/jbd2/commit.c @@ -484,22 +484,9 @@ void jbd2_journal_commit_transaction(journal_t *journal) stats.run.rs_running = jbd2_time_diff(commit_transaction->t_start, stats.run.rs_locked);
- spin_lock(&commit_transaction->t_handle_lock); - while (atomic_read(&commit_transaction->t_updates)) { - DEFINE_WAIT(wait); + // waits for any t_updates to finish + jbd2_journal_wait_updates(journal);
- prepare_to_wait(&journal->j_wait_updates, &wait, - TASK_UNINTERRUPTIBLE); - if (atomic_read(&commit_transaction->t_updates)) { - spin_unlock(&commit_transaction->t_handle_lock); - write_unlock(&journal->j_state_lock); - schedule(); - write_lock(&journal->j_state_lock); - spin_lock(&commit_transaction->t_handle_lock); - } - finish_wait(&journal->j_wait_updates, &wait); - } - spin_unlock(&commit_transaction->t_handle_lock); commit_transaction->t_state = T_SWITCH;
J_ASSERT (atomic_read(&commit_transaction->t_outstanding_credits) <= @@ -819,7 +806,7 @@ void jbd2_journal_commit_transaction(journal_t *journal) commit_transaction->t_state = T_COMMIT_DFLUSH; write_unlock(&journal->j_state_lock);
- /* + /* * If the journal is not located on the file system device, * then we must flush the file system device before we issue * the commit record diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c index 62e68c5b8ec3d..69aed72c8206d 100644 --- a/fs/jbd2/transaction.c +++ b/fs/jbd2/transaction.c @@ -449,7 +449,7 @@ static int start_this_handle(journal_t *journal, handle_t *handle, }
/* OK, account for the buffers that this operation expects to - * use and add the handle to the running transaction. + * use and add the handle to the running transaction. */ update_t_max_wait(transaction, ts); handle->h_transaction = transaction; @@ -836,6 +836,35 @@ int jbd2_journal_restart(handle_t *handle, int nblocks) } EXPORT_SYMBOL(jbd2_journal_restart);
+/* + * Waits for any outstanding t_updates to finish. + * This is called with write j_state_lock held. + */ +void jbd2_journal_wait_updates(journal_t *journal) +{ + transaction_t *commit_transaction = journal->j_running_transaction; + + if (!commit_transaction) + return; + + spin_lock(&commit_transaction->t_handle_lock); + while (atomic_read(&commit_transaction->t_updates)) { + DEFINE_WAIT(wait); + + prepare_to_wait(&journal->j_wait_updates, &wait, + TASK_UNINTERRUPTIBLE); + if (atomic_read(&commit_transaction->t_updates)) { + spin_unlock(&commit_transaction->t_handle_lock); + write_unlock(&journal->j_state_lock); + schedule(); + write_lock(&journal->j_state_lock); + spin_lock(&commit_transaction->t_handle_lock); + } + finish_wait(&journal->j_wait_updates, &wait); + } + spin_unlock(&commit_transaction->t_handle_lock); +} + /** * jbd2_journal_lock_updates () - establish a transaction barrier. * @journal: Journal to establish a barrier on. @@ -863,27 +892,9 @@ void jbd2_journal_lock_updates(journal_t *journal) write_lock(&journal->j_state_lock); }
- /* Wait until there are no running updates */ - while (1) { - transaction_t *transaction = journal->j_running_transaction; - - if (!transaction) - break; + /* Wait until there are no running t_updates */ + jbd2_journal_wait_updates(journal);
- spin_lock(&transaction->t_handle_lock); - prepare_to_wait(&journal->j_wait_updates, &wait, - TASK_UNINTERRUPTIBLE); - if (!atomic_read(&transaction->t_updates)) { - spin_unlock(&transaction->t_handle_lock); - finish_wait(&journal->j_wait_updates, &wait); - break; - } - spin_unlock(&transaction->t_handle_lock); - write_unlock(&journal->j_state_lock); - schedule(); - finish_wait(&journal->j_wait_updates, &wait); - write_lock(&journal->j_state_lock); - } write_unlock(&journal->j_state_lock);
/* diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h index ade8a6d7acff9..03d8ba98cbeb7 100644 --- a/include/linux/jbd2.h +++ b/include/linux/jbd2.h @@ -594,7 +594,7 @@ struct transaction_s */ unsigned long t_log_start;
- /* + /* * Number of buffers on the t_buffers list [j_list_lock, no locks * needed for jbd2 thread] */ @@ -1538,6 +1538,8 @@ extern int jbd2_journal_flush(journal_t *journal, unsigned int flags); extern void jbd2_journal_lock_updates (journal_t *); extern void jbd2_journal_unlock_updates (journal_t *);
+void jbd2_journal_wait_updates(journal_t *); + extern journal_t * jbd2_journal_init_dev(struct block_device *bdev, struct block_device *fs_dev, unsigned long long start, int len, int bsize);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ritesh Harjani riteshh@linux.ibm.com
[ Upstream commit cc16eecae687912238ee6efbff71ad31e2bc414e ]
jbd2_journal_wait_updates() is called with j_state_lock held. But if there is a commit in progress, this transaction might get committed and freed via jbd2_journal_commit_transaction() -> jbd2_journal_free_transaction() when we release j_state_lock. So check journal->j_running_transaction every time we release and re-acquire j_state_lock to avoid a use-after-free issue.
Link: https://lore.kernel.org/r/948c2fed518ae739db6a8f7f83f1d58b504f87d0.164449710... Fixes: 4f98186848707f53 ("jbd2: refactor wait logic for transaction updates into a common function") Cc: stable@kernel.org Reported-and-tested-by: syzbot+afa2ca5171d93e44b348@syzkaller.appspotmail.com Reviewed-by: Jan Kara jack@suse.cz Signed-off-by: Ritesh Harjani riteshh@linux.ibm.com Signed-off-by: Theodore Ts'o tytso@mit.edu Stable-dep-of: 2dfba3bb40ad ("jbd2: correct the end of the journal recovery scan range") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/jbd2/transaction.c | 41 +++++++++++++++++++++++++---------------- 1 file changed, 25 insertions(+), 16 deletions(-)
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c index 69aed72c8206d..59292b82d4424 100644 --- a/fs/jbd2/transaction.c +++ b/fs/jbd2/transaction.c @@ -842,27 +842,38 @@ EXPORT_SYMBOL(jbd2_journal_restart); */ void jbd2_journal_wait_updates(journal_t *journal) { - transaction_t *commit_transaction = journal->j_running_transaction; + DEFINE_WAIT(wait);
- if (!commit_transaction) - return; + while (1) { + /* + * Note that the running transaction can get freed under us if + * this transaction is getting committed in + * jbd2_journal_commit_transaction() -> + * jbd2_journal_free_transaction(). This can only happen when we + * release j_state_lock -> schedule() -> acquire j_state_lock. + * Hence we should everytime retrieve new j_running_transaction + * value (after j_state_lock release acquire cycle), else it may + * lead to use-after-free of old freed transaction. + */ + transaction_t *transaction = journal->j_running_transaction;
- spin_lock(&commit_transaction->t_handle_lock); - while (atomic_read(&commit_transaction->t_updates)) { - DEFINE_WAIT(wait); + if (!transaction) + break;
+ spin_lock(&transaction->t_handle_lock); prepare_to_wait(&journal->j_wait_updates, &wait, - TASK_UNINTERRUPTIBLE); - if (atomic_read(&commit_transaction->t_updates)) { - spin_unlock(&commit_transaction->t_handle_lock); - write_unlock(&journal->j_state_lock); - schedule(); - write_lock(&journal->j_state_lock); - spin_lock(&commit_transaction->t_handle_lock); + TASK_UNINTERRUPTIBLE); + if (!atomic_read(&transaction->t_updates)) { + spin_unlock(&transaction->t_handle_lock); + finish_wait(&journal->j_wait_updates, &wait); + break; } + spin_unlock(&transaction->t_handle_lock); + write_unlock(&journal->j_state_lock); + schedule(); finish_wait(&journal->j_wait_updates, &wait); + write_lock(&journal->j_state_lock); } - spin_unlock(&commit_transaction->t_handle_lock); }
/** @@ -877,8 +888,6 @@ void jbd2_journal_wait_updates(journal_t *journal) */ void jbd2_journal_lock_updates(journal_t *journal) { - DEFINE_WAIT(wait); - jbd2_might_wait_for_commit(journal);
write_lock(&journal->j_state_lock);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ritesh Harjani riteshh@linux.ibm.com
[ Upstream commit f7f497cb702462e8505ff3d8d4e7722ad95626a1 ]
This patch kills t_handle_lock transaction spinlock completely from jbd2.
To explain the reasoning: there are currently three sites at which this spinlock is used.
1. jbd2_journal_wait_updates()

a. Based on careful code review it can be seen that we don't need this lock here, since we wait for any currently ongoing updates based on the atomic variable t_updates, and we anyway don't take t_handle_lock while in stop_this_handle(), i.e.:
      write_lock(&journal->j_state_lock()
      jbd2_journal_wait_updates()                    stop_this_handle()
      while (atomic_read(txn->t_updates) {           |
          DEFINE_WAIT(wait);                         |
          prepare_to_wait();                         |
          if (atomic_read(txn->t_updates)            if (atomic_dec_and_test(txn->t_updates))
              write_unlock(&journal->j_state_lock);
              schedule();                            wake_up()
              write_lock(&journal->j_state_lock);
              finish_wait();
      }
      txn->t_state = T_COMMIT
      write_unlock(&journal->j_state_lock);
   b. Also note that between atomic_inc(&txn->t_updates) in start_this_handle()
      and jbd2_journal_wait_updates(), the synchronization happens via the
      read_lock(journal->j_state_lock) taken in start_this_handle().
2. jbd2_journal_extend()

   a. jbd2_journal_extend() is called with the handle of each process from
      task_struct, so no lock is required when updating member fields of
      handle_t.
   b. For member fields of h_transaction, all updates happen only via atomic
      APIs (which also run within read_lock()), so there is no need for this
      transaction spinlock.
3. update_t_max_wait()

   Based on Jan's suggestion, the lock here can be carefully removed using the
   atomic cmpxchg API. Note that there can be several processes waiting for a
   new transaction to be allocated and started. Only one of them will succeed
   in taking write_lock() and allocating a new txn; after that, all of the
   processes update t_max_wait (the maximum transaction wait time). This can be
   done with the method below, without taking any locks, using atomic cmpxchg.
   For more details refer to [1].
   new = get_new_val();
   old = READ_ONCE(ptr->max_val);
   while (old < new)
           old = cmpxchg(&ptr->max_val, old, new);
[1]: https://lwn.net/Articles/849237/
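For readers who want to try the idiom outside the kernel, here is a minimal userspace sketch using the GCC/Clang __sync_val_compare_and_swap() builtin, whose value-returning semantics match cmpxchg(); the names and values are illustrative only, not the jbd2 code.

    /* Lockless "record the maximum" via value-returning compare-and-swap. */
    #include <stdio.h>

    static unsigned long max_wait;      /* shared maximum, like t_max_wait */

    static void update_max(unsigned long new)
    {
        unsigned long old = __atomic_load_n(&max_wait, __ATOMIC_RELAXED);

        /*
         * If another thread raced and installed a larger value, the CAS
         * fails and returns that value; the loop condition then prevents
         * overwriting a bigger maximum with a smaller one.
         */
        while (old < new)
            old = __sync_val_compare_and_swap(&max_wait, old, new);
    }

    int main(void)
    {
        update_max(10);
        update_max(7);                  /* ignored: below current maximum */
        update_max(42);
        printf("max_wait = %lu\n", max_wait);   /* prints 42 */
        return 0;
    }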
Suggested-by: Jan Kara jack@suse.cz
Signed-off-by: Ritesh Harjani riteshh@linux.ibm.com
Reviewed-by: Jan Kara jack@suse.cz
Link: https://lore.kernel.org/r/d89e599658b4a1f3893a48c6feded200073037fc.164499207...
Signed-off-by: Theodore Ts'o tytso@mit.edu
Stable-dep-of: 2dfba3bb40ad ("jbd2: correct the end of the journal recovery scan range")
Signed-off-by: Sasha Levin sashal@kernel.org
---
 fs/jbd2/transaction.c | 28 +++++++++-------------------
 include/linux/jbd2.h  |  3 ---
 2 files changed, 9 insertions(+), 22 deletions(-)
diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c index 59292b82d4424..b31145b2bb6bf 100644 --- a/fs/jbd2/transaction.c +++ b/fs/jbd2/transaction.c @@ -107,7 +107,6 @@ static void jbd2_get_transaction(journal_t *journal, transaction->t_start_time = ktime_get(); transaction->t_tid = journal->j_transaction_sequence++; transaction->t_expires = jiffies + journal->j_commit_interval; - spin_lock_init(&transaction->t_handle_lock); atomic_set(&transaction->t_updates, 0); atomic_set(&transaction->t_outstanding_credits, jbd2_descriptor_blocks_per_trans(journal) + @@ -139,24 +138,21 @@ static void jbd2_get_transaction(journal_t *journal, /* * Update transaction's maximum wait time, if debugging is enabled. * - * In order for t_max_wait to be reliable, it must be protected by a - * lock. But doing so will mean that start_this_handle() can not be - * run in parallel on SMP systems, which limits our scalability. So - * unless debugging is enabled, we no longer update t_max_wait, which - * means that maximum wait time reported by the jbd2_run_stats - * tracepoint will always be zero. + * t_max_wait is carefully updated here with use of atomic compare exchange. + * Note that there could be multiplre threads trying to do this simultaneously + * hence using cmpxchg to avoid any use of locks in this case. */ static inline void update_t_max_wait(transaction_t *transaction, unsigned long ts) { #ifdef CONFIG_JBD2_DEBUG + unsigned long oldts, newts; if (jbd2_journal_enable_debug && time_after(transaction->t_start, ts)) { - ts = jbd2_time_diff(ts, transaction->t_start); - spin_lock(&transaction->t_handle_lock); - if (ts > transaction->t_max_wait) - transaction->t_max_wait = ts; - spin_unlock(&transaction->t_handle_lock); + newts = jbd2_time_diff(ts, transaction->t_start); + oldts = READ_ONCE(transaction->t_max_wait); + while (oldts < newts) + oldts = cmpxchg(&transaction->t_max_wait, oldts, newts); } #endif } @@ -690,7 +686,6 @@ int jbd2_journal_extend(handle_t *handle, int nblocks, int revoke_records) DIV_ROUND_UP( handle->h_revoke_credits_requested, journal->j_revoke_records_per_block); - spin_lock(&transaction->t_handle_lock); wanted = atomic_add_return(nblocks, &transaction->t_outstanding_credits);
@@ -698,7 +693,7 @@ int jbd2_journal_extend(handle_t *handle, int nblocks, int revoke_records) jbd_debug(3, "denied handle %p %d blocks: " "transaction too large\n", handle, nblocks); atomic_sub(nblocks, &transaction->t_outstanding_credits); - goto unlock; + goto error_out; }
trace_jbd2_handle_extend(journal->j_fs_dev->bd_dev, @@ -714,8 +709,6 @@ int jbd2_journal_extend(handle_t *handle, int nblocks, int revoke_records) result = 0;
jbd_debug(3, "extended handle %p by %d\n", handle, nblocks); -unlock: - spin_unlock(&transaction->t_handle_lock); error_out: read_unlock(&journal->j_state_lock); return result; @@ -860,15 +853,12 @@ void jbd2_journal_wait_updates(journal_t *journal) if (!transaction) break;
- spin_lock(&transaction->t_handle_lock); prepare_to_wait(&journal->j_wait_updates, &wait, TASK_UNINTERRUPTIBLE); if (!atomic_read(&transaction->t_updates)) { - spin_unlock(&transaction->t_handle_lock); finish_wait(&journal->j_wait_updates, &wait); break; } - spin_unlock(&transaction->t_handle_lock); write_unlock(&journal->j_state_lock); schedule(); finish_wait(&journal->j_wait_updates, &wait); diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h index 03d8ba98cbeb7..f29fda630b2de 100644 --- a/include/linux/jbd2.h +++ b/include/linux/jbd2.h @@ -554,9 +554,6 @@ struct transaction_chp_stats_s { * ->j_list_lock * * j_state_lock - * ->t_handle_lock - * - * j_state_lock * ->j_list_lock (journal_unmap_buffer) * */
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
[ Upstream commit cb3b3bf22cf33707d684e74207908ba0ef3b6467 ]
The name of jbd_debug() is confusing, as all functions inside jbd2 have the jbd2_ prefix. Rename jbd_debug() to jbd2_debug(). No functional changes.
Signed-off-by: Jan Kara jack@suse.cz
Reviewed-by: Lukas Czerner lczerner@redhat.com
Link: https://lore.kernel.org/r/20220608112355.4397-2-jack@suse.cz
Signed-off-by: Theodore Ts'o tytso@mit.edu
Stable-dep-of: 2dfba3bb40ad ("jbd2: correct the end of the journal recovery scan range")
Signed-off-by: Sasha Levin sashal@kernel.org
---
 fs/jbd2/checkpoint.c  |  6 +++---
 fs/jbd2/commit.c      | 30 +++++++++++++++---------------
 fs/jbd2/journal.c     | 34 +++++++++++++++++-----------------
 fs/jbd2/recovery.c    | 30 +++++++++++++++---------------
 fs/jbd2/revoke.c      |  8 ++++----
 fs/jbd2/transaction.c | 26 +++++++++++++-------------
 include/linux/jbd2.h  |  4 ++--
 7 files changed, 69 insertions(+), 69 deletions(-)
diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c index 95d5bb7d825a6..f033ac807013c 100644 --- a/fs/jbd2/checkpoint.c +++ b/fs/jbd2/checkpoint.c @@ -165,7 +165,7 @@ int jbd2_log_do_checkpoint(journal_t *journal) tid_t this_tid; int result, batch_count = 0;
- jbd_debug(1, "Start checkpoint\n"); + jbd2_debug(1, "Start checkpoint\n");
/* * First thing: if there are any transactions in the log which @@ -174,7 +174,7 @@ int jbd2_log_do_checkpoint(journal_t *journal) */ result = jbd2_cleanup_journal_tail(journal); trace_jbd2_checkpoint(journal, result); - jbd_debug(1, "cleanup_journal_tail returned %d\n", result); + jbd2_debug(1, "cleanup_journal_tail returned %d\n", result); if (result <= 0) return result;
@@ -725,5 +725,5 @@ void __jbd2_journal_drop_transaction(journal_t *journal, transaction_t *transact
trace_jbd2_drop_transaction(journal, transaction);
- jbd_debug(1, "Dropping transaction %d, all done\n", transaction->t_tid); + jbd2_debug(1, "Dropping transaction %d, all done\n", transaction->t_tid); } diff --git a/fs/jbd2/commit.c b/fs/jbd2/commit.c index 2705850ca6460..e058ef1839377 100644 --- a/fs/jbd2/commit.c +++ b/fs/jbd2/commit.c @@ -419,7 +419,7 @@ void jbd2_journal_commit_transaction(journal_t *journal)
/* Do we need to erase the effects of a prior jbd2_journal_flush? */ if (journal->j_flags & JBD2_FLUSHED) { - jbd_debug(3, "super block updated\n"); + jbd2_debug(3, "super block updated\n"); mutex_lock_io(&journal->j_checkpoint_mutex); /* * We hold j_checkpoint_mutex so tail cannot change under us. @@ -433,7 +433,7 @@ void jbd2_journal_commit_transaction(journal_t *journal) REQ_SYNC); mutex_unlock(&journal->j_checkpoint_mutex); } else { - jbd_debug(3, "superblock not updated\n"); + jbd2_debug(3, "superblock not updated\n"); }
J_ASSERT(journal->j_running_transaction != NULL); @@ -465,7 +465,7 @@ void jbd2_journal_commit_transaction(journal_t *journal) commit_transaction = journal->j_running_transaction;
trace_jbd2_start_commit(journal, commit_transaction); - jbd_debug(1, "JBD2: starting commit of transaction %d\n", + jbd2_debug(1, "JBD2: starting commit of transaction %d\n", commit_transaction->t_tid);
write_lock(&journal->j_state_lock); @@ -538,7 +538,7 @@ void jbd2_journal_commit_transaction(journal_t *journal) __jbd2_journal_clean_checkpoint_list(journal, false); spin_unlock(&journal->j_list_lock);
- jbd_debug(3, "JBD2: commit phase 1\n"); + jbd2_debug(3, "JBD2: commit phase 1\n");
/* * Clear revoked flag to reflect there is no revoked buffers @@ -571,7 +571,7 @@ void jbd2_journal_commit_transaction(journal_t *journal) wake_up_all(&journal->j_wait_transaction_locked); write_unlock(&journal->j_state_lock);
- jbd_debug(3, "JBD2: commit phase 2a\n"); + jbd2_debug(3, "JBD2: commit phase 2a\n");
/* * Now start flushing things to disk, in the order they appear @@ -584,7 +584,7 @@ void jbd2_journal_commit_transaction(journal_t *journal) blk_start_plug(&plug); jbd2_journal_write_revoke_records(commit_transaction, &log_bufs);
- jbd_debug(3, "JBD2: commit phase 2b\n"); + jbd2_debug(3, "JBD2: commit phase 2b\n");
/* * Way to go: we have now written out all of the data for a @@ -640,7 +640,7 @@ void jbd2_journal_commit_transaction(journal_t *journal) if (!descriptor) { J_ASSERT (bufs == 0);
- jbd_debug(4, "JBD2: get descriptor\n"); + jbd2_debug(4, "JBD2: get descriptor\n");
descriptor = jbd2_journal_get_descriptor_buffer( commit_transaction, @@ -650,7 +650,7 @@ void jbd2_journal_commit_transaction(journal_t *journal) continue; }
- jbd_debug(4, "JBD2: got buffer %llu (%p)\n", + jbd2_debug(4, "JBD2: got buffer %llu (%p)\n", (unsigned long long)descriptor->b_blocknr, descriptor->b_data); tagp = &descriptor->b_data[sizeof(journal_header_t)]; @@ -735,7 +735,7 @@ void jbd2_journal_commit_transaction(journal_t *journal) commit_transaction->t_buffers == NULL || space_left < tag_bytes + 16 + csum_size) {
- jbd_debug(4, "JBD2: Submit %d IOs\n", bufs); + jbd2_debug(4, "JBD2: Submit %d IOs\n", bufs);
/* Write an end-of-descriptor marker before submitting the IOs. "tag" still points to @@ -837,7 +837,7 @@ void jbd2_journal_commit_transaction(journal_t *journal) so we incur less scheduling load. */
- jbd_debug(3, "JBD2: commit phase 3\n"); + jbd2_debug(3, "JBD2: commit phase 3\n");
while (!list_empty(&io_bufs)) { struct buffer_head *bh = list_entry(io_bufs.prev, @@ -880,7 +880,7 @@ void jbd2_journal_commit_transaction(journal_t *journal)
J_ASSERT (commit_transaction->t_shadow_list == NULL);
- jbd_debug(3, "JBD2: commit phase 4\n"); + jbd2_debug(3, "JBD2: commit phase 4\n");
/* Here we wait for the revoke record and descriptor record buffers */ while (!list_empty(&log_bufs)) { @@ -904,7 +904,7 @@ void jbd2_journal_commit_transaction(journal_t *journal) if (err) jbd2_journal_abort(journal, err);
- jbd_debug(3, "JBD2: commit phase 5\n"); + jbd2_debug(3, "JBD2: commit phase 5\n"); write_lock(&journal->j_state_lock); J_ASSERT(commit_transaction->t_state == T_COMMIT_DFLUSH); commit_transaction->t_state = T_COMMIT_JFLUSH; @@ -943,7 +943,7 @@ void jbd2_journal_commit_transaction(journal_t *journal) transaction can be removed from any checkpoint list it was on before. */
- jbd_debug(3, "JBD2: commit phase 6\n"); + jbd2_debug(3, "JBD2: commit phase 6\n");
J_ASSERT(list_empty(&commit_transaction->t_inode_list)); J_ASSERT(commit_transaction->t_buffers == NULL); @@ -1120,7 +1120,7 @@ void jbd2_journal_commit_transaction(journal_t *journal)
/* Done with this transaction! */
- jbd_debug(3, "JBD2: commit phase 7\n"); + jbd2_debug(3, "JBD2: commit phase 7\n");
J_ASSERT(commit_transaction->t_state == T_COMMIT_JFLUSH);
@@ -1162,7 +1162,7 @@ void jbd2_journal_commit_transaction(journal_t *journal) journal->j_fc_cleanup_callback(journal, 1, commit_transaction->t_tid);
trace_jbd2_end_commit(journal, commit_transaction); - jbd_debug(1, "JBD2: commit %d complete, head %d\n", + jbd2_debug(1, "JBD2: commit %d complete, head %d\n", journal->j_commit_sequence, journal->j_tail_sequence);
write_lock(&journal->j_state_lock); diff --git a/fs/jbd2/journal.c b/fs/jbd2/journal.c index 580d2fdfe21f5..11fbc9b6ec5cb 100644 --- a/fs/jbd2/journal.c +++ b/fs/jbd2/journal.c @@ -203,11 +203,11 @@ static int kjournald2(void *arg) if (journal->j_flags & JBD2_UNMOUNT) goto end_loop;
- jbd_debug(1, "commit_sequence=%u, commit_request=%u\n", + jbd2_debug(1, "commit_sequence=%u, commit_request=%u\n", journal->j_commit_sequence, journal->j_commit_request);
if (journal->j_commit_sequence != journal->j_commit_request) { - jbd_debug(1, "OK, requests differ\n"); + jbd2_debug(1, "OK, requests differ\n"); write_unlock(&journal->j_state_lock); del_timer_sync(&journal->j_commit_timer); jbd2_journal_commit_transaction(journal); @@ -222,7 +222,7 @@ static int kjournald2(void *arg) * good idea, because that depends on threads that may * be already stopped. */ - jbd_debug(1, "Now suspending kjournald2\n"); + jbd2_debug(1, "Now suspending kjournald2\n"); write_unlock(&journal->j_state_lock); try_to_freeze(); write_lock(&journal->j_state_lock); @@ -252,7 +252,7 @@ static int kjournald2(void *arg) finish_wait(&journal->j_wait_commit, &wait); }
- jbd_debug(1, "kjournald2 wakes\n"); + jbd2_debug(1, "kjournald2 wakes\n");
/* * Were we woken up by a commit wakeup event? @@ -260,7 +260,7 @@ static int kjournald2(void *arg) transaction = journal->j_running_transaction; if (transaction && time_after_eq(jiffies, transaction->t_expires)) { journal->j_commit_request = transaction->t_tid; - jbd_debug(1, "woke because of timeout\n"); + jbd2_debug(1, "woke because of timeout\n"); } goto loop;
@@ -268,7 +268,7 @@ static int kjournald2(void *arg) del_timer_sync(&journal->j_commit_timer); journal->j_task = NULL; wake_up(&journal->j_wait_done_commit); - jbd_debug(1, "Journal thread exiting.\n"); + jbd2_debug(1, "Journal thread exiting.\n"); write_unlock(&journal->j_state_lock); return 0; } @@ -500,7 +500,7 @@ int __jbd2_log_start_commit(journal_t *journal, tid_t target) */
journal->j_commit_request = target; - jbd_debug(1, "JBD2: requesting commit %u/%u\n", + jbd2_debug(1, "JBD2: requesting commit %u/%u\n", journal->j_commit_request, journal->j_commit_sequence); journal->j_running_transaction->t_requested = jiffies; @@ -705,7 +705,7 @@ int jbd2_log_wait_commit(journal_t *journal, tid_t tid) } #endif while (tid_gt(tid, journal->j_commit_sequence)) { - jbd_debug(1, "JBD2: want %u, j_commit_sequence=%u\n", + jbd2_debug(1, "JBD2: want %u, j_commit_sequence=%u\n", tid, journal->j_commit_sequence); read_unlock(&journal->j_state_lock); wake_up(&journal->j_wait_commit); @@ -1123,7 +1123,7 @@ int __jbd2_update_log_tail(journal_t *journal, tid_t tid, unsigned long block) freed += journal->j_last - journal->j_first;
trace_jbd2_update_log_tail(journal, tid, block, freed); - jbd_debug(1, + jbd2_debug(1, "Cleaning journal tail from %u to %u (offset %lu), " "freeing %lu\n", journal->j_tail_sequence, tid, block, freed); @@ -1498,7 +1498,7 @@ journal_t *jbd2_journal_init_inode(struct inode *inode) return NULL; }
- jbd_debug(1, "JBD2: inode %s/%ld, size %lld, bits %d, blksize %ld\n", + jbd2_debug(1, "JBD2: inode %s/%ld, size %lld, bits %d, blksize %ld\n", inode->i_sb->s_id, inode->i_ino, (long long) inode->i_size, inode->i_sb->s_blocksize_bits, inode->i_sb->s_blocksize);
@@ -1577,7 +1577,7 @@ static int journal_reset(journal_t *journal) * attempting a write to a potential-readonly device. */ if (sb->s_start == 0) { - jbd_debug(1, "JBD2: Skipping superblock update on recovered sb " + jbd2_debug(1, "JBD2: Skipping superblock update on recovered sb " "(start %ld, seq %u, errno %d)\n", journal->j_tail, journal->j_tail_sequence, journal->j_errno); @@ -1680,7 +1680,7 @@ int jbd2_journal_update_sb_log_tail(journal_t *journal, tid_t tail_tid, }
BUG_ON(!mutex_is_locked(&journal->j_checkpoint_mutex)); - jbd_debug(1, "JBD2: updating superblock (start %lu, seq %u)\n", + jbd2_debug(1, "JBD2: updating superblock (start %lu, seq %u)\n", tail_block, tail_tid);
lock_buffer(journal->j_sb_buffer); @@ -1721,7 +1721,7 @@ static void jbd2_mark_journal_empty(journal_t *journal, int write_op) return; }
- jbd_debug(1, "JBD2: Marking journal as empty (seq %u)\n", + jbd2_debug(1, "JBD2: Marking journal as empty (seq %u)\n", journal->j_tail_sequence);
sb->s_sequence = cpu_to_be32(journal->j_tail_sequence); @@ -1867,7 +1867,7 @@ void jbd2_journal_update_sb_errno(journal_t *journal) errcode = journal->j_errno; if (errcode == -ESHUTDOWN) errcode = 0; - jbd_debug(1, "JBD2: updating superblock error (errno %d)\n", errcode); + jbd2_debug(1, "JBD2: updating superblock error (errno %d)\n", errcode); sb->s_errno = cpu_to_be32(errcode);
jbd2_write_superblock(journal, REQ_SYNC | REQ_FUA); @@ -2339,7 +2339,7 @@ int jbd2_journal_set_features(journal_t *journal, unsigned long compat, compat & JBD2_FEATURE_COMPAT_CHECKSUM) compat &= ~JBD2_FEATURE_COMPAT_CHECKSUM;
- jbd_debug(1, "Setting new features 0x%lx/0x%lx/0x%lx\n", + jbd2_debug(1, "Setting new features 0x%lx/0x%lx/0x%lx\n", compat, ro, incompat);
sb = journal->j_superblock; @@ -2408,7 +2408,7 @@ void jbd2_journal_clear_features(journal_t *journal, unsigned long compat, { journal_superblock_t *sb;
- jbd_debug(1, "Clear features 0x%lx/0x%lx/0x%lx\n", + jbd2_debug(1, "Clear features 0x%lx/0x%lx/0x%lx\n", compat, ro, incompat);
sb = journal->j_superblock; @@ -2865,7 +2865,7 @@ static struct journal_head *journal_alloc_journal_head(void) #endif ret = kmem_cache_zalloc(jbd2_journal_head_cache, GFP_NOFS); if (!ret) { - jbd_debug(1, "out of memory for journal_head\n"); + jbd2_debug(1, "out of memory for journal_head\n"); pr_notice_ratelimited("ENOMEM in %s, retrying.\n", __func__); ret = kmem_cache_zalloc(jbd2_journal_head_cache, GFP_NOFS | __GFP_NOFAIL); diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c index 3c5dd010e39d2..18f525f7f4063 100644 --- a/fs/jbd2/recovery.c +++ b/fs/jbd2/recovery.c @@ -245,11 +245,11 @@ static int fc_do_one_pass(journal_t *journal, return 0;
while (next_fc_block <= journal->j_fc_last) { - jbd_debug(3, "Fast commit replay: next block %ld\n", + jbd2_debug(3, "Fast commit replay: next block %ld\n", next_fc_block); err = jread(&bh, journal, next_fc_block); if (err) { - jbd_debug(3, "Fast commit replay: read error\n"); + jbd2_debug(3, "Fast commit replay: read error\n"); break; }
@@ -264,7 +264,7 @@ static int fc_do_one_pass(journal_t *journal, }
if (err) - jbd_debug(3, "Fast commit replay failed, err = %d\n", err); + jbd2_debug(3, "Fast commit replay failed, err = %d\n", err);
return err; } @@ -298,7 +298,7 @@ int jbd2_journal_recover(journal_t *journal) */
if (!sb->s_start) { - jbd_debug(1, "No recovery required, last transaction %d\n", + jbd2_debug(1, "No recovery required, last transaction %d\n", be32_to_cpu(sb->s_sequence)); journal->j_transaction_sequence = be32_to_cpu(sb->s_sequence) + 1; return 0; @@ -310,10 +310,10 @@ int jbd2_journal_recover(journal_t *journal) if (!err) err = do_one_pass(journal, &info, PASS_REPLAY);
- jbd_debug(1, "JBD2: recovery, exit status %d, " + jbd2_debug(1, "JBD2: recovery, exit status %d, " "recovered transactions %u to %u\n", err, info.start_transaction, info.end_transaction); - jbd_debug(1, "JBD2: Replayed %d and revoked %d/%d blocks\n", + jbd2_debug(1, "JBD2: Replayed %d and revoked %d/%d blocks\n", info.nr_replays, info.nr_revoke_hits, info.nr_revokes);
/* Restart the log at the next transaction ID, thus invalidating @@ -363,7 +363,7 @@ int jbd2_journal_skip_recovery(journal_t *journal) #ifdef CONFIG_JBD2_DEBUG int dropped = info.end_transaction - be32_to_cpu(journal->j_superblock->s_sequence); - jbd_debug(1, + jbd2_debug(1, "JBD2: ignoring %d transaction%s from the journal.\n", dropped, (dropped == 1) ? "" : "s"); #endif @@ -485,7 +485,7 @@ static int do_one_pass(journal_t *journal, if (pass == PASS_SCAN) info->start_transaction = first_commit_ID;
- jbd_debug(1, "Starting recovery pass %d\n", pass); + jbd2_debug(1, "Starting recovery pass %d\n", pass);
/* * Now we walk through the log, transaction by transaction, @@ -511,7 +511,7 @@ static int do_one_pass(journal_t *journal, if (tid_geq(next_commit_ID, info->end_transaction)) break;
- jbd_debug(2, "Scanning for sequence ID %u at %lu/%lu\n", + jbd2_debug(2, "Scanning for sequence ID %u at %lu/%lu\n", next_commit_ID, next_log_block, jbd2_has_feature_fast_commit(journal) ? journal->j_fc_last : journal->j_last); @@ -520,7 +520,7 @@ static int do_one_pass(journal_t *journal, * either the next descriptor block or the final commit * record. */
- jbd_debug(3, "JBD2: checking block %ld\n", next_log_block); + jbd2_debug(3, "JBD2: checking block %ld\n", next_log_block); err = jread(&bh, journal, next_log_block); if (err) goto failed; @@ -543,7 +543,7 @@ static int do_one_pass(journal_t *journal,
blocktype = be32_to_cpu(tmp->h_blocktype); sequence = be32_to_cpu(tmp->h_sequence); - jbd_debug(3, "Found magic %d, sequence %d\n", + jbd2_debug(3, "Found magic %d, sequence %d\n", blocktype, sequence);
if (sequence != next_commit_ID) { @@ -576,7 +576,7 @@ static int do_one_pass(journal_t *journal, goto failed; } need_check_commit_time = true; - jbd_debug(1, + jbd2_debug(1, "invalid descriptor block found in %lu\n", next_log_block); } @@ -759,7 +759,7 @@ static int do_one_pass(journal_t *journal, * It likely does not belong to same journal, * just end this recovery with success. */ - jbd_debug(1, "JBD2: Invalid checksum ignored in transaction %u, likely stale data\n", + jbd2_debug(1, "JBD2: Invalid checksum ignored in transaction %u, likely stale data\n", next_commit_ID); brelse(bh); goto done; @@ -827,7 +827,7 @@ static int do_one_pass(journal_t *journal, if (pass == PASS_SCAN && !jbd2_descriptor_block_csum_verify(journal, bh->b_data)) { - jbd_debug(1, "JBD2: invalid revoke block found in %lu\n", + jbd2_debug(1, "JBD2: invalid revoke block found in %lu\n", next_log_block); need_check_commit_time = true; } @@ -846,7 +846,7 @@ static int do_one_pass(journal_t *journal, continue;
default: - jbd_debug(3, "Unrecognised magic %d, end of scan.\n", + jbd2_debug(3, "Unrecognised magic %d, end of scan.\n", blocktype); brelse(bh); goto done; diff --git a/fs/jbd2/revoke.c b/fs/jbd2/revoke.c index fa608788b93d7..4556e46890244 100644 --- a/fs/jbd2/revoke.c +++ b/fs/jbd2/revoke.c @@ -398,7 +398,7 @@ int jbd2_journal_revoke(handle_t *handle, unsigned long long blocknr, } handle->h_revoke_credits--;
- jbd_debug(2, "insert revoke for block %llu, bh_in=%p\n",blocknr, bh_in); + jbd2_debug(2, "insert revoke for block %llu, bh_in=%p\n",blocknr, bh_in); err = insert_revoke_hash(journal, blocknr, handle->h_transaction->t_tid); BUFFER_TRACE(bh_in, "exit"); @@ -428,7 +428,7 @@ int jbd2_journal_cancel_revoke(handle_t *handle, struct journal_head *jh) int did_revoke = 0; /* akpm: debug */ struct buffer_head *bh = jh2bh(jh);
- jbd_debug(4, "journal_head %p, cancelling revoke\n", jh); + jbd2_debug(4, "journal_head %p, cancelling revoke\n", jh);
/* Is the existing Revoke bit valid? If so, we trust it, and * only perform the full cancel if the revoke bit is set. If @@ -444,7 +444,7 @@ int jbd2_journal_cancel_revoke(handle_t *handle, struct journal_head *jh) if (need_cancel) { record = find_revoke_record(journal, bh->b_blocknr); if (record) { - jbd_debug(4, "cancelled existing revoke on " + jbd2_debug(4, "cancelled existing revoke on " "blocknr %llu\n", (unsigned long long)bh->b_blocknr); spin_lock(&journal->j_revoke_lock); list_del(&record->hash); @@ -560,7 +560,7 @@ void jbd2_journal_write_revoke_records(transaction_t *transaction, } if (descriptor) flush_descriptor(journal, descriptor, offset); - jbd_debug(1, "Wrote %d revoke records\n", count); + jbd2_debug(1, "Wrote %d revoke records\n", count); }
/* diff --git a/fs/jbd2/transaction.c b/fs/jbd2/transaction.c index b31145b2bb6bf..c2125203ef2d9 100644 --- a/fs/jbd2/transaction.c +++ b/fs/jbd2/transaction.c @@ -374,7 +374,7 @@ static int start_this_handle(journal_t *journal, handle_t *handle, return -ENOMEM; }
- jbd_debug(3, "New handle %p going live.\n", handle); + jbd2_debug(3, "New handle %p going live.\n", handle);
/* * We need to hold j_state_lock until t_updates has been incremented, @@ -454,7 +454,7 @@ static int start_this_handle(journal_t *journal, handle_t *handle, handle->h_start_jiffies = jiffies; atomic_inc(&transaction->t_updates); atomic_inc(&transaction->t_handle_count); - jbd_debug(4, "Handle %p given %d credits (total %d, free %lu)\n", + jbd2_debug(4, "Handle %p given %d credits (total %d, free %lu)\n", handle, blocks, atomic_read(&transaction->t_outstanding_credits), jbd2_log_space_left(journal)); @@ -675,7 +675,7 @@ int jbd2_journal_extend(handle_t *handle, int nblocks, int revoke_records)
/* Don't extend a locked-down transaction! */ if (transaction->t_state != T_RUNNING) { - jbd_debug(3, "denied handle %p %d blocks: " + jbd2_debug(3, "denied handle %p %d blocks: " "transaction not running\n", handle, nblocks); goto error_out; } @@ -690,7 +690,7 @@ int jbd2_journal_extend(handle_t *handle, int nblocks, int revoke_records) &transaction->t_outstanding_credits);
if (wanted > journal->j_max_transaction_buffers) { - jbd_debug(3, "denied handle %p %d blocks: " + jbd2_debug(3, "denied handle %p %d blocks: " "transaction too large\n", handle, nblocks); atomic_sub(nblocks, &transaction->t_outstanding_credits); goto error_out; @@ -708,7 +708,7 @@ int jbd2_journal_extend(handle_t *handle, int nblocks, int revoke_records) handle->h_revoke_credits_requested += revoke_records; result = 0;
- jbd_debug(3, "extended handle %p by %d\n", handle, nblocks); + jbd2_debug(3, "extended handle %p by %d\n", handle, nblocks); error_out: read_unlock(&journal->j_state_lock); return result; @@ -796,7 +796,7 @@ int jbd2__journal_restart(handle_t *handle, int nblocks, int revoke_records, * First unlink the handle from its current transaction, and start the * commit on that. */ - jbd_debug(2, "restarting handle %p\n", handle); + jbd2_debug(2, "restarting handle %p\n", handle); stop_this_handle(handle); handle->h_transaction = NULL;
@@ -980,7 +980,7 @@ do_get_write_access(handle_t *handle, struct journal_head *jh,
journal = transaction->t_journal;
- jbd_debug(5, "journal_head %p, force_copy %d\n", jh, force_copy); + jbd2_debug(5, "journal_head %p, force_copy %d\n", jh, force_copy);
JBUFFER_TRACE(jh, "entry"); repeat: @@ -1280,7 +1280,7 @@ int jbd2_journal_get_create_access(handle_t *handle, struct buffer_head *bh) struct journal_head *jh = jbd2_journal_add_journal_head(bh); int err;
- jbd_debug(5, "journal_head %p\n", jh); + jbd2_debug(5, "journal_head %p\n", jh); err = -EROFS; if (is_handle_aborted(handle)) goto out; @@ -1503,7 +1503,7 @@ int jbd2_journal_dirty_metadata(handle_t *handle, struct buffer_head *bh) * of the running transaction. */ jh = bh2jh(bh); - jbd_debug(5, "journal_head %p\n", jh); + jbd2_debug(5, "journal_head %p\n", jh); JBUFFER_TRACE(jh, "entry");
/* @@ -1836,7 +1836,7 @@ int jbd2_journal_stop(handle_t *handle) pid_t pid;
if (--handle->h_ref > 0) { - jbd_debug(4, "h_ref %d -> %d\n", handle->h_ref + 1, + jbd2_debug(4, "h_ref %d -> %d\n", handle->h_ref + 1, handle->h_ref); if (is_handle_aborted(handle)) return -EIO; @@ -1856,7 +1856,7 @@ int jbd2_journal_stop(handle_t *handle) if (is_handle_aborted(handle)) err = -EIO;
- jbd_debug(4, "Handle %p going down\n", handle); + jbd2_debug(4, "Handle %p going down\n", handle); trace_jbd2_handle_stats(journal->j_fs_dev->bd_dev, tid, handle->h_type, handle->h_line_no, jiffies - handle->h_start_jiffies, @@ -1934,7 +1934,7 @@ int jbd2_journal_stop(handle_t *handle) * completes the commit thread, it just doesn't write * anything to disk. */
- jbd_debug(2, "transaction too old, requesting commit for " + jbd2_debug(2, "transaction too old, requesting commit for " "handle %p\n", handle); /* This is non-blocking */ jbd2_log_start_commit(journal, tid); @@ -2678,7 +2678,7 @@ static int jbd2_journal_file_inode(handle_t *handle, struct jbd2_inode *jinode, return -EROFS; journal = transaction->t_journal;
- jbd_debug(4, "Adding inode %lu, tid:%d\n", jinode->i_vfs_inode->i_ino, + jbd2_debug(4, "Adding inode %lu, tid:%d\n", jinode->i_vfs_inode->i_ino, transaction->t_tid);
spin_lock(&journal->j_list_lock); diff --git a/include/linux/jbd2.h b/include/linux/jbd2.h index f29fda630b2de..d19f527ade3bb 100644 --- a/include/linux/jbd2.h +++ b/include/linux/jbd2.h @@ -58,10 +58,10 @@ extern ushort jbd2_journal_enable_debug; void __jbd2_debug(int level, const char *file, const char *func, unsigned int line, const char *fmt, ...);
-#define jbd_debug(n, fmt, a...) \ +#define jbd2_debug(n, fmt, a...) \ __jbd2_debug((n), __FILE__, __func__, __LINE__, (fmt), ##a) #else -#define jbd_debug(n, fmt, a...) no_printk(fmt, ##a) +#define jbd2_debug(n, fmt, a...) no_printk(fmt, ##a) #endif
extern void *jbd2_alloc(size_t size, gfp_t flags);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zhang Yi yi.zhang@huawei.com
[ Upstream commit 2dfba3bb40ad8536b9fa802364f2d40da31aa88e ]
We got the filesystem inconsistency issue below while running the generic/475 I/O failure pressure test with the fast_commit feature enabled.
Symlink /p3/d3/d1c/d6c/dd6/dce/l101 (inode #132605) is invalid.
If the fast_commit feature is enabled, a special fast_commit journal area is appended to the end of the normal journal area. journal->j_last points to the first unused block behind the normal journal area instead of the whole log area, and journal->j_fc_last points to the first unused block behind the fast_commit journal area. While doing journal recovery, do_one_pass(PASS_SCAN) should first scan the normal journal area and wrap around to the first block once it meets journal->j_last, but the wrap() macro misuses journal->j_fc_last, so the recovery could not read the next magic block (perhaps a commit block) and would mistakenly end early, missing tN and every transaction after it in the following example. Finally, it could lead to filesystem inconsistency.
 | normal journal area                             | fast commit area |
 +-------------------------------------------------+------------------+
 | tN(rere) | tN+1 |~| tN-x |...| tN-1 | tN(front)  |       ....       |
 +-------------------------------------------------+------------------+
 /                                                 /                  /
 start                                   journal->j_last     journal->j_fc_last
This patch fixes it by using the correct end point, journal->j_last.
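To make the effect concrete, the toy program below replays the wrap() arithmetic with made-up block numbers (j_first = 1, j_last = 100, j_fc_last = 120): with the old j_fc_last boundary a scan position past the normal area does not wrap back, so the scanner walks into the fast-commit area and gives up, while with j_last it wraps correctly. The numbers are purely illustrative.

    /* Toy replay of jbd2's wrap() arithmetic with hypothetical block numbers. */
    #include <stdio.h>

    #define wrap(last, first, var)                      \
        do {                                            \
            if ((var) >= (last))                        \
                (var) -= ((last) - (first));            \
        } while (0)

    int main(void)
    {
        unsigned long j_first = 1, j_last = 100, j_fc_last = 120;
        unsigned long next_log_block = 105;     /* just past the normal area */
        unsigned long buggy = next_log_block, fixed = next_log_block;

        wrap(j_fc_last, j_first, buggy);        /* old behaviour: no wrap yet */
        wrap(j_last, j_first, fixed);           /* fixed behaviour: wraps to 6 */

        printf("buggy wrap point: next block %lu (reads fast-commit area)\n", buggy);
        printf("fixed wrap point: next block %lu (back in normal area)\n", fixed);
        return 0;
    }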
Fixes: 5b849b5f96b4 ("jbd2: fast commit recovery path")
Cc: stable@kernel.org
Reported-by: Theodore Ts'o tytso@mit.edu
Link: https://lore.kernel.org/linux-ext4/20230613043120.GB1584772@mit.edu/
Signed-off-by: Zhang Yi yi.zhang@huawei.com
Reviewed-by: Jan Kara jack@suse.cz
Link: https://lore.kernel.org/r/20230626073322.3956567-1-yi.zhang@huaweicloud.com
Signed-off-by: Theodore Ts'o tytso@mit.edu
Signed-off-by: Sasha Levin sashal@kernel.org
---
 fs/jbd2/recovery.c | 12 +++---------
 1 file changed, 3 insertions(+), 9 deletions(-)
diff --git a/fs/jbd2/recovery.c b/fs/jbd2/recovery.c index 18f525f7f4063..cce36a76fd021 100644 --- a/fs/jbd2/recovery.c +++ b/fs/jbd2/recovery.c @@ -224,12 +224,8 @@ static int count_tags(journal_t *journal, struct buffer_head *bh) /* Make sure we wrap around the log correctly! */ #define wrap(journal, var) \ do { \ - unsigned long _wrap_last = \ - jbd2_has_feature_fast_commit(journal) ? \ - (journal)->j_fc_last : (journal)->j_last; \ - \ - if (var >= _wrap_last) \ - var -= (_wrap_last - (journal)->j_first); \ + if (var >= (journal)->j_last) \ + var -= ((journal)->j_last - (journal)->j_first); \ } while (0)
static int fc_do_one_pass(journal_t *journal, @@ -512,9 +508,7 @@ static int do_one_pass(journal_t *journal, break;
jbd2_debug(2, "Scanning for sequence ID %u at %lu/%lu\n", - next_commit_ID, next_log_block, - jbd2_has_feature_fast_commit(journal) ? - journal->j_fc_last : journal->j_last); + next_commit_ID, next_log_block, journal->j_last);
/* Skip over each chunk of the transaction looking * either the next descriptor block or the final commit
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Fainelli f.fainelli@gmail.com
[ Upstream commit 25f97138f8c225dbf365b428a94d7b30a6daefb3 ]
Allow a brcmnand_soc instance to provide a custom set of I/O operations, which we will require when using this driver on a BCMA bus that does not provide directly memory-mapped I/O. Update nand_{read,write}_reg accordingly to use the SoC operations if provided.
To minimize the penalty on other SoCs which do support standard MMIO accesses, we use a static key that is disabled by default and gets enabled only if an SoC implementation does provide I/O operations.
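As a rough sketch of the mechanism, the idea looks like the snippet below. It uses the real <linux/jump_label.h> API, but the surrounding names are hypothetical and it is not meant to build as a standalone module; it only illustrates why the MMIO fast path stays effectively branch-free.

    /* Kernel-style sketch of static-key dispatch; illustrative only. */
    #include <linux/jump_label.h>
    #include <linux/io.h>
    #include <linux/types.h>

    static DEFINE_STATIC_KEY_FALSE(has_custom_io_ops);

    struct my_soc_ops {                         /* hypothetical */
        u32 (*read_reg)(u32 offset);
    };

    static const struct my_soc_ops *soc_ops;    /* hypothetical */
    static void __iomem *mmio_base;

    static u32 read_reg(u32 offset)
    {
        /* Compiles to a patched nop while the key is disabled, so
         * MMIO-only SoCs keep falling straight through to readl. */
        if (static_branch_unlikely(&has_custom_io_ops))
            return soc_ops->read_reg(offset);
        return readl_relaxed(mmio_base + offset);
    }

    static void register_soc_ops(const struct my_soc_ops *ops)
    {
        soc_ops = ops;
        /* Flip the key once at probe time for non-MMIO SoCs. */
        static_branch_enable(&has_custom_io_ops);
    }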
Signed-off-by: Florian Fainelli f.fainelli@gmail.com
Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com
Link: https://lore.kernel.org/linux-mtd/20220107184614.2670254-3-f.fainelli@gmail....
Stable-dep-of: 2ec2839a9062 ("mtd: rawnand: brcmnand: Fix ECC level field setting for v7.2 controller")
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/mtd/nand/raw/brcmnand/brcmnand.c | 28 ++++++++++++++++++++++++++--
 drivers/mtd/nand/raw/brcmnand/brcmnand.h | 29 +++++++++++++++++++++++++++++
 2 files changed, 55 insertions(+), 2 deletions(-)
diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.c b/drivers/mtd/nand/raw/brcmnand/brcmnand.c index c1afadb50eecc..f20a3154a0adb 100644 --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c @@ -25,6 +25,7 @@ #include <linux/of.h> #include <linux/of_platform.h> #include <linux/slab.h> +#include <linux/static_key.h> #include <linux/list.h> #include <linux/log2.h>
@@ -207,6 +208,8 @@ enum {
struct brcmnand_host;
+static DEFINE_STATIC_KEY_FALSE(brcmnand_soc_has_ops_key); + struct brcmnand_controller { struct device *dev; struct nand_controller controller; @@ -592,15 +595,25 @@ enum { INTFC_CTLR_READY = BIT(31), };
+static inline bool brcmnand_non_mmio_ops(struct brcmnand_controller *ctrl) +{ + return static_branch_unlikely(&brcmnand_soc_has_ops_key); +} + static inline u32 nand_readreg(struct brcmnand_controller *ctrl, u32 offs) { + if (brcmnand_non_mmio_ops(ctrl)) + return brcmnand_soc_read(ctrl->soc, offs); return brcmnand_readl(ctrl->nand_base + offs); }
static inline void nand_writereg(struct brcmnand_controller *ctrl, u32 offs, u32 val) { - brcmnand_writel(val, ctrl->nand_base + offs); + if (brcmnand_non_mmio_ops(ctrl)) + brcmnand_soc_write(ctrl->soc, val, offs); + else + brcmnand_writel(val, ctrl->nand_base + offs); }
static int brcmnand_revision_init(struct brcmnand_controller *ctrl) @@ -766,13 +779,18 @@ static inline void brcmnand_rmw_reg(struct brcmnand_controller *ctrl,
static inline u32 brcmnand_read_fc(struct brcmnand_controller *ctrl, int word) { + if (brcmnand_non_mmio_ops(ctrl)) + return brcmnand_soc_read(ctrl->soc, BRCMNAND_NON_MMIO_FC_ADDR); return __raw_readl(ctrl->nand_fc + word * 4); }
static inline void brcmnand_write_fc(struct brcmnand_controller *ctrl, int word, u32 val) { - __raw_writel(val, ctrl->nand_fc + word * 4); + if (brcmnand_non_mmio_ops(ctrl)) + brcmnand_soc_write(ctrl->soc, val, BRCMNAND_NON_MMIO_FC_ADDR); + else + __raw_writel(val, ctrl->nand_fc + word * 4); }
static inline void edu_writel(struct brcmnand_controller *ctrl, @@ -3034,6 +3052,12 @@ int brcmnand_probe(struct platform_device *pdev, struct brcmnand_soc *soc) dev_set_drvdata(dev, ctrl); ctrl->dev = dev;
+ /* Enable the static key if the soc provides I/O operations indicating + * that a non-memory mapped IO access path must be used + */ + if (brcmnand_soc_has_ops(ctrl->soc)) + static_branch_enable(&brcmnand_soc_has_ops_key); + init_completion(&ctrl->done); init_completion(&ctrl->dma_done); init_completion(&ctrl->edu_done); diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.h b/drivers/mtd/nand/raw/brcmnand/brcmnand.h index eb498fbe505ec..f1f93d85f50d2 100644 --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.h +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.h @@ -11,12 +11,25 @@
struct platform_device; struct dev_pm_ops; +struct brcmnand_io_ops; + +/* Special register offset constant to intercept a non-MMIO access + * to the flash cache register space. This is intentionally large + * not to overlap with an existing offset. + */ +#define BRCMNAND_NON_MMIO_FC_ADDR 0xffffffff
struct brcmnand_soc { bool (*ctlrdy_ack)(struct brcmnand_soc *soc); void (*ctlrdy_set_enabled)(struct brcmnand_soc *soc, bool en); void (*prepare_data_bus)(struct brcmnand_soc *soc, bool prepare, bool is_param); + const struct brcmnand_io_ops *ops; +}; + +struct brcmnand_io_ops { + u32 (*read_reg)(struct brcmnand_soc *soc, u32 offset); + void (*write_reg)(struct brcmnand_soc *soc, u32 val, u32 offset); };
static inline void brcmnand_soc_data_bus_prepare(struct brcmnand_soc *soc, @@ -58,6 +71,22 @@ static inline void brcmnand_writel(u32 val, void __iomem *addr) writel_relaxed(val, addr); }
+static inline bool brcmnand_soc_has_ops(struct brcmnand_soc *soc) +{ + return soc && soc->ops && soc->ops->read_reg && soc->ops->write_reg; +} + +static inline u32 brcmnand_soc_read(struct brcmnand_soc *soc, u32 offset) +{ + return soc->ops->read_reg(soc, offset); +} + +static inline void brcmnand_soc_write(struct brcmnand_soc *soc, u32 val, + u32 offset) +{ + soc->ops->write_reg(soc, val, offset); +} + int brcmnand_probe(struct platform_device *pdev, struct brcmnand_soc *soc); int brcmnand_remove(struct platform_device *pdev);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: William Zhang william.zhang@broadcom.com
[ Upstream commit 2ec2839a9062db8a592525a3fdabd42dcd9a3a9b ]
The v7.2 controller has a different ECC level field size and shift in the ACC control register than its predecessor and successor controllers, so it needs to be set specifically.
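In practical terms the difference is only where the ECC level field lands in the ACC_CONTROL word. The tiny userspace calculation below uses the shift constants introduced by the patch (16 for most revisions, 13 for v7.2) with an arbitrary example level; it is illustrative only.

    /* Where an example ECC level lands in ACC_CONTROL, per controller rev. */
    #include <stdio.h>

    #define ACC_CONTROL_ECC_SHIFT      16   /* most controller revisions */
    #define ACC_CONTROL_ECC_EXT_SHIFT  13   /* v7.2 only */

    int main(void)
    {
        unsigned int ecc_level = 20;        /* arbitrary illustrative value */

        printf("generic: 0x%08x\n", ecc_level << ACC_CONTROL_ECC_SHIFT);
        printf("v7.2:    0x%08x\n", ecc_level << ACC_CONTROL_ECC_EXT_SHIFT);
        return 0;
    }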
Fixes: decba6d47869 ("mtd: brcmnand: Add v7.2 controller support")
Signed-off-by: William Zhang william.zhang@broadcom.com
Reviewed-by: Florian Fainelli florian.fainelli@broadcom.com
Cc: stable@vger.kernel.org
Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com
Link: https://lore.kernel.org/linux-mtd/20230706182909.79151-2-william.zhang@broad...
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/mtd/nand/raw/brcmnand/brcmnand.c | 74 +++++++++++++-----------
 1 file changed, 41 insertions(+), 33 deletions(-)
diff --git a/drivers/mtd/nand/raw/brcmnand/brcmnand.c b/drivers/mtd/nand/raw/brcmnand/brcmnand.c index f20a3154a0adb..c1e7ab13e7773 100644 --- a/drivers/mtd/nand/raw/brcmnand/brcmnand.c +++ b/drivers/mtd/nand/raw/brcmnand/brcmnand.c @@ -271,6 +271,7 @@ struct brcmnand_controller { const unsigned int *page_sizes; unsigned int page_size_shift; unsigned int max_oob; + u32 ecc_level_shift; u32 features;
/* for low-power standby/resume only */ @@ -595,6 +596,34 @@ enum { INTFC_CTLR_READY = BIT(31), };
+/*********************************************************************** + * NAND ACC CONTROL bitfield + * + * Some bits have remained constant throughout hardware revision, while + * others have shifted around. + ***********************************************************************/ + +/* Constant for all versions (where supported) */ +enum { + /* See BRCMNAND_HAS_CACHE_MODE */ + ACC_CONTROL_CACHE_MODE = BIT(22), + + /* See BRCMNAND_HAS_PREFETCH */ + ACC_CONTROL_PREFETCH = BIT(23), + + ACC_CONTROL_PAGE_HIT = BIT(24), + ACC_CONTROL_WR_PREEMPT = BIT(25), + ACC_CONTROL_PARTIAL_PAGE = BIT(26), + ACC_CONTROL_RD_ERASED = BIT(27), + ACC_CONTROL_FAST_PGM_RDIN = BIT(28), + ACC_CONTROL_WR_ECC = BIT(30), + ACC_CONTROL_RD_ECC = BIT(31), +}; + +#define ACC_CONTROL_ECC_SHIFT 16 +/* Only for v7.2 */ +#define ACC_CONTROL_ECC_EXT_SHIFT 13 + static inline bool brcmnand_non_mmio_ops(struct brcmnand_controller *ctrl) { return static_branch_unlikely(&brcmnand_soc_has_ops_key); @@ -732,6 +761,12 @@ static int brcmnand_revision_init(struct brcmnand_controller *ctrl) else if (of_property_read_bool(ctrl->dev->of_node, "brcm,nand-has-wp")) ctrl->features |= BRCMNAND_HAS_WP;
+ /* v7.2 has different ecc level shift in the acc register */ + if (ctrl->nand_version == 0x0702) + ctrl->ecc_level_shift = ACC_CONTROL_ECC_EXT_SHIFT; + else + ctrl->ecc_level_shift = ACC_CONTROL_ECC_SHIFT; + return 0; }
@@ -920,30 +955,6 @@ static inline int brcmnand_cmd_shift(struct brcmnand_controller *ctrl) return 0; }
-/*********************************************************************** - * NAND ACC CONTROL bitfield - * - * Some bits have remained constant throughout hardware revision, while - * others have shifted around. - ***********************************************************************/ - -/* Constant for all versions (where supported) */ -enum { - /* See BRCMNAND_HAS_CACHE_MODE */ - ACC_CONTROL_CACHE_MODE = BIT(22), - - /* See BRCMNAND_HAS_PREFETCH */ - ACC_CONTROL_PREFETCH = BIT(23), - - ACC_CONTROL_PAGE_HIT = BIT(24), - ACC_CONTROL_WR_PREEMPT = BIT(25), - ACC_CONTROL_PARTIAL_PAGE = BIT(26), - ACC_CONTROL_RD_ERASED = BIT(27), - ACC_CONTROL_FAST_PGM_RDIN = BIT(28), - ACC_CONTROL_WR_ECC = BIT(30), - ACC_CONTROL_RD_ECC = BIT(31), -}; - static inline u32 brcmnand_spare_area_mask(struct brcmnand_controller *ctrl) { if (ctrl->nand_version == 0x0702) @@ -956,18 +967,15 @@ static inline u32 brcmnand_spare_area_mask(struct brcmnand_controller *ctrl) return GENMASK(4, 0); }
-#define NAND_ACC_CONTROL_ECC_SHIFT 16 -#define NAND_ACC_CONTROL_ECC_EXT_SHIFT 13 - static inline u32 brcmnand_ecc_level_mask(struct brcmnand_controller *ctrl) { u32 mask = (ctrl->nand_version >= 0x0600) ? 0x1f : 0x0f;
- mask <<= NAND_ACC_CONTROL_ECC_SHIFT; + mask <<= ACC_CONTROL_ECC_SHIFT;
/* v7.2 includes additional ECC levels */ - if (ctrl->nand_version >= 0x0702) - mask |= 0x7 << NAND_ACC_CONTROL_ECC_EXT_SHIFT; + if (ctrl->nand_version == 0x0702) + mask |= 0x7 << ACC_CONTROL_ECC_EXT_SHIFT;
return mask; } @@ -981,8 +989,8 @@ static void brcmnand_set_ecc_enabled(struct brcmnand_host *host, int en)
if (en) { acc_control |= ecc_flags; /* enable RD/WR ECC */ - acc_control |= host->hwcfg.ecc_level - << NAND_ACC_CONTROL_ECC_SHIFT; + acc_control &= ~brcmnand_ecc_level_mask(ctrl); + acc_control |= host->hwcfg.ecc_level << ctrl->ecc_level_shift; } else { acc_control &= ~ecc_flags; /* disable RD/WR ECC */ acc_control &= ~brcmnand_ecc_level_mask(ctrl); @@ -2582,7 +2590,7 @@ static int brcmnand_set_cfg(struct brcmnand_host *host, tmp &= ~brcmnand_ecc_level_mask(ctrl); tmp &= ~brcmnand_spare_area_mask(ctrl); if (ctrl->nand_version >= 0x0302) { - tmp |= cfg->ecc_level << NAND_ACC_CONTROL_ECC_SHIFT; + tmp |= cfg->ecc_level << ctrl->ecc_level_shift; tmp |= cfg->spare_area_size; } nand_writereg(ctrl, acc_control_offs, tmp);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tiezhu Yang yangtiezhu@loongson.cn
[ Upstream commit d42f0c6ad502c9f612410e125ebdf290cce8bdc3 ]
The latest version of grep claims that egrep is now obsolete, so the build now contains warnings that look like:

  egrep: warning: egrep is obsolescent; using grep -E

Fix this up by moving the related files to use "grep -E" instead.
Here are the steps to install the latest grep:
  wget http://ftp.gnu.org/gnu/grep/grep-3.8.tar.gz
  tar xf grep-3.8.tar.gz
  cd grep-3.8 && ./configure && make
  sudo make install
  export PATH=/usr/local/bin:$PATH
Signed-off-by: Tiezhu Yang yangtiezhu@loongson.cn
Signed-off-by: Thomas Bogendoerfer tsbogend@alpha.franken.de
Stable-dep-of: 4fe4a6374c4d ("MIPS: Only fiddle with CHECKFLAGS if `need-compiler'")
Signed-off-by: Sasha Levin sashal@kernel.org
---
 arch/mips/Makefile      | 2 +-
 arch/mips/vdso/Makefile | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/mips/Makefile b/arch/mips/Makefile index 151e98698f763..3830217fab414 100644 --- a/arch/mips/Makefile +++ b/arch/mips/Makefile @@ -323,7 +323,7 @@ KBUILD_LDFLAGS += -m $(ld-emul)
ifdef need-compiler CHECKFLAGS += $(shell $(CC) $(KBUILD_CFLAGS) -dM -E -x c /dev/null | \ - egrep -vw '__GNUC_(MINOR_|PATCHLEVEL_)?_' | \ + grep -E -vw '__GNUC_(MINOR_|PATCHLEVEL_)?_' | \ sed -e "s/^#define /-D'/" -e "s/ /'='/" -e "s/$$/'/" -e 's/$$/&&/g') endif
diff --git a/arch/mips/vdso/Makefile b/arch/mips/vdso/Makefile index 1b2ea34c3d3bb..ed090ef30757c 100644 --- a/arch/mips/vdso/Makefile +++ b/arch/mips/vdso/Makefile @@ -68,7 +68,7 @@ KCOV_INSTRUMENT := n
# Check that we don't have PIC 'jalr t9' calls left quiet_cmd_vdso_mips_check = VDSOCHK $@ - cmd_vdso_mips_check = if $(OBJDUMP) --disassemble $@ | egrep -h "jalr.*t9" > /dev/null; \ + cmd_vdso_mips_check = if $(OBJDUMP) --disassemble $@ | grep -E -h "jalr.*t9" > /dev/null; \ then (echo >&2 "$@: PIC 'jalr t9' calls are not supported"; \ rm -f $@; /bin/false); fi
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ian Rogers rogers.email@gmail.com
[ Upstream commit 00facc760903be6675870c2749e2cd72140e396e ]
Generate pmu-events.c using jevents.py rather than the binary built from jevents.c.
Add a new config variable NO_JEVENTS that is set when there is no architecture JSON or no appropriate python interpreter is present.
When NO_JEVENTS is defined the file pmu-events/empty-pmu-events.c is copied and used as the pmu-events.c file.
Signed-off-by: Ian Rogers irogers@google.com
Tested-by: John Garry john.garry@huawei.com
Cc: Alexander Shishkin alexander.shishkin@linux.intel.com
Cc: Ananth Narayan ananth.narayan@amd.com
Cc: Andi Kleen ak@linux.intel.com
Cc: Andrew Kilroy andrew.kilroy@arm.com
Cc: Caleb Biggers caleb.biggers@intel.com
Cc: Felix Fietkau nbd@nbd.name
Cc: Ian Rogers rogers.email@gmail.com
Cc: Ingo Molnar mingo@redhat.com
Cc: James Clark james.clark@arm.com
Cc: Jiri Olsa jolsa@kernel.org
Cc: Kajol Jain kjain@linux.ibm.com
Cc: Kan Liang kan.liang@linux.intel.com
Cc: Kshipra Bopardikar kshipra.bopardikar@intel.com
Cc: Like Xu likexu@tencent.com
Cc: Mark Rutland mark.rutland@arm.com
Cc: Mathieu Poirier mathieu.poirier@linaro.org
Cc: Namhyung Kim namhyung@kernel.org
Cc: Nick Forrington nick.forrington@arm.com
Cc: Paul Clarke pc@us.ibm.com
Cc: Perry Taylor perry.taylor@intel.com
Cc: Peter Zijlstra peterz@infradead.org
Cc: Qi Liu liuqi115@huawei.com
Cc: Ravi Bangoria ravi.bangoria@amd.com
Cc: Sandipan Das sandipan.das@amd.com
Cc: Santosh Shukla santosh.shukla@amd.com
Cc: Stephane Eranian eranian@google.com
Cc: Will Deacon will@kernel.org
Cc: Xing Zhengjun zhengjun.xing@linux.intel.com
Link: https://lore.kernel.org/r/20220629182505.406269-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com
Stable-dep-of: 7822a8913f4c ("perf build: Update build rule for generated files")
Signed-off-by: Sasha Levin sashal@kernel.org
---
 tools/perf/Makefile.config               |  19 +++
 tools/perf/Makefile.perf                 |   1 +
 tools/perf/pmu-events/Build              |  13 +-
 tools/perf/pmu-events/empty-pmu-events.c | 158 +++++++++++++++++++++++
 4 files changed, 189 insertions(+), 2 deletions(-)
 create mode 100644 tools/perf/pmu-events/empty-pmu-events.c
diff --git a/tools/perf/Makefile.config b/tools/perf/Makefile.config index 973c0d5ed8d8b..e1077c4d30fff 100644 --- a/tools/perf/Makefile.config +++ b/tools/perf/Makefile.config @@ -857,6 +857,25 @@ else endif endif
+ifneq ($(NO_JEVENTS),1) + ifeq ($(wildcard pmu-events/arch/$(SRCARCH)/mapfile.csv),) + NO_JEVENTS := 1 + endif +endif +ifneq ($(NO_JEVENTS),1) + NO_JEVENTS := 0 + ifndef PYTHON + $(warning No python interpreter disabling jevent generation) + NO_JEVENTS := 1 + else + # jevents.py uses f-strings present in Python 3.6 released in Dec. 2016. + JEVENTS_PYTHON_GOOD := $(shell $(PYTHON) -c 'import sys;print("1" if(sys.version_info.major >= 3 and sys.version_info.minor >= 6) else "0")' 2> /dev/null) + ifneq ($(JEVENTS_PYTHON_GOOD), 1) + $(warning Python interpreter too old (older than 3.6) disabling jevent generation) + NO_JEVENTS := 1 + endif + endif +endif
ifndef NO_LIBBFD ifeq ($(feature-libbfd), 1) diff --git a/tools/perf/Makefile.perf b/tools/perf/Makefile.perf index b856afa6eb52e..b82f2d89d74c4 100644 --- a/tools/perf/Makefile.perf +++ b/tools/perf/Makefile.perf @@ -649,6 +649,7 @@ JEVENTS := $(OUTPUT)pmu-events/jevents JEVENTS_IN := $(OUTPUT)pmu-events/jevents-in.o
PMU_EVENTS_IN := $(OUTPUT)pmu-events/pmu-events-in.o +export NO_JEVENTS
export JEVENTS
diff --git a/tools/perf/pmu-events/Build b/tools/perf/pmu-events/Build index a055dee6a46af..5ec5ce8c31bab 100644 --- a/tools/perf/pmu-events/Build +++ b/tools/perf/pmu-events/Build @@ -9,10 +9,19 @@ JSON = $(shell [ -d $(JDIR) ] && \ JDIR_TEST = pmu-events/arch/test JSON_TEST = $(shell [ -d $(JDIR_TEST) ] && \ find $(JDIR_TEST) -name '*.json') +JEVENTS_PY = pmu-events/jevents.py
# # Locate/process JSON files in pmu-events/arch/ # directory and create tables in pmu-events.c. # -$(OUTPUT)pmu-events/pmu-events.c: $(JSON) $(JSON_TEST) $(JEVENTS) - $(Q)$(call echo-cmd,gen)$(JEVENTS) $(SRCARCH) pmu-events/arch $(OUTPUT)pmu-events/pmu-events.c $(V) + +ifeq ($(NO_JEVENTS),1) +$(OUTPUT)pmu-events/pmu-events.c: pmu-events/empty-pmu-events.c + $(call rule_mkdir) + $(Q)$(call echo-cmd,gen)cp $< $@ +else +$(OUTPUT)pmu-events/pmu-events.c: $(JSON) $(JSON_TEST) $(JEVENTS_PY) + $(call rule_mkdir) + $(Q)$(call echo-cmd,gen)$(PYTHON) $(JEVENTS_PY) $(SRCARCH) pmu-events/arch $@ +endif diff --git a/tools/perf/pmu-events/empty-pmu-events.c b/tools/perf/pmu-events/empty-pmu-events.c new file mode 100644 index 0000000000000..77e655c6f1162 --- /dev/null +++ b/tools/perf/pmu-events/empty-pmu-events.c @@ -0,0 +1,158 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * An empty pmu-events.c file used when there is no architecture json files in + * arch or when the jevents.py script cannot be run. + * + * The test cpu/soc is provided for testing. + */ +#include "pmu-events/pmu-events.h" + +static const struct pmu_event pme_test_soc_cpu[] = { + { + .name = "l3_cache_rd", + .event = "event=0x40", + .desc = "L3 cache access, read", + .topic = "cache", + .long_desc = "Attributable Level 3 cache access, read", + }, + { + .name = "segment_reg_loads.any", + .event = "event=0x6,period=200000,umask=0x80", + .desc = "Number of segment register loads", + .topic = "other", + }, + { + .name = "dispatch_blocked.any", + .event = "event=0x9,period=200000,umask=0x20", + .desc = "Memory cluster signals to block micro-op dispatch for any reason", + .topic = "other", + }, + { + .name = "eist_trans", + .event = "event=0x3a,period=200000,umask=0x0", + .desc = "Number of Enhanced Intel SpeedStep(R) Technology (EIST) transitions", + .topic = "other", + }, + { + .name = "uncore_hisi_ddrc.flux_wcmd", + .event = "event=0x2", + .desc = "DDRC write commands. Unit: hisi_sccl,ddrc ", + .topic = "uncore", + .long_desc = "DDRC write commands", + .pmu = "hisi_sccl,ddrc", + }, + { + .name = "unc_cbo_xsnp_response.miss_eviction", + .event = "event=0x22,umask=0x81", + .desc = "A cross-core snoop resulted from L3 Eviction which misses in some processor core. Unit: uncore_cbox ", + .topic = "uncore", + .long_desc = "A cross-core snoop resulted from L3 Eviction which misses in some processor core", + .pmu = "uncore_cbox", + }, + { + .name = "event-hyphen", + .event = "event=0xe0,umask=0x00", + .desc = "UNC_CBO_HYPHEN. Unit: uncore_cbox ", + .topic = "uncore", + .long_desc = "UNC_CBO_HYPHEN", + .pmu = "uncore_cbox", + }, + { + .name = "event-two-hyph", + .event = "event=0xc0,umask=0x00", + .desc = "UNC_CBO_TWO_HYPH. Unit: uncore_cbox ", + .topic = "uncore", + .long_desc = "UNC_CBO_TWO_HYPH", + .pmu = "uncore_cbox", + }, + { + .name = "uncore_hisi_l3c.rd_hit_cpipe", + .event = "event=0x7", + .desc = "Total read hits. Unit: hisi_sccl,l3c ", + .topic = "uncore", + .long_desc = "Total read hits", + .pmu = "hisi_sccl,l3c", + }, + { + .name = "uncore_imc_free_running.cache_miss", + .event = "event=0x12", + .desc = "Total cache misses. Unit: uncore_imc_free_running ", + .topic = "uncore", + .long_desc = "Total cache misses", + .pmu = "uncore_imc_free_running", + }, + { + .name = "uncore_imc.cache_hits", + .event = "event=0x34", + .desc = "Total cache hits. 
Unit: uncore_imc ", + .topic = "uncore", + .long_desc = "Total cache hits", + .pmu = "uncore_imc", + }, + { + .name = "bp_l1_btb_correct", + .event = "event=0x8a", + .desc = "L1 BTB Correction", + .topic = "branch", + }, + { + .name = "bp_l2_btb_correct", + .event = "event=0x8b", + .desc = "L2 BTB Correction", + .topic = "branch", + }, + { + .name = 0, + .event = 0, + .desc = 0, + }, +}; + +const struct pmu_events_map pmu_events_map[] = { + { + .cpuid = "testcpu", + .version = "v1", + .type = "core", + .table = pme_test_soc_cpu, + }, + { + .cpuid = 0, + .version = 0, + .type = 0, + .table = 0, + }, +}; + +static const struct pmu_event pme_test_soc_sys[] = { + { + .name = "sys_ddr_pmu.write_cycles", + .event = "event=0x2b", + .desc = "ddr write-cycles event. Unit: uncore_sys_ddr_pmu ", + .compat = "v8", + .topic = "uncore", + .pmu = "uncore_sys_ddr_pmu", + }, + { + .name = "sys_ccn_pmu.read_cycles", + .event = "config=0x2c", + .desc = "ccn read-cycles event. Unit: uncore_sys_ccn_pmu ", + .compat = "0x01", + .topic = "uncore", + .pmu = "uncore_sys_ccn_pmu", + }, + { + .name = 0, + .event = 0, + .desc = 0, + }, +}; + +const struct pmu_sys_events pmu_sys_event_tables[] = { + { + .table = pme_test_soc_sys, + .name = "pme_test_soc_sys", + }, + { + .table = 0 + }, +};
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Namhyung Kim namhyung@kernel.org
[ Upstream commit 7822a8913f4c51c7d1aff793b525d60c3384fb5b ]
bison and flex generate C files from the source (.y and .l) files. When the O= option is used, they are saved in a separate directory, but the default build rule assumes the .c files are in the source directory. So it might read an invalid file if there are generated files from an old version. The same is true for the pmu-events files.
For example, the following command would cause a build failure:
  $ git checkout v6.3
  $ make -C tools/perf           # build in the same directory
  $ git checkout v6.5-rc2
  $ mkdir build                  # create a build directory
  $ make -C tools/perf O=build   # build in a different directory, but it
                                 # refers to files in the source directory
Let's update the build rule to specify those cases explicitly to depend on the files in the output directory.
Note that it's not a complete fix and it needs the next patch for the include path too.
Fixes: 80eeb67fe577aa76 ("perf jevents: Program to convert JSON file")
Signed-off-by: Namhyung Kim namhyung@kernel.org
Cc: Adrian Hunter adrian.hunter@intel.com
Cc: Andi Kleen ak@linux.intel.com
Cc: Anup Sharma anupnewsmail@gmail.com
Cc: Ian Rogers irogers@google.com
Cc: Ingo Molnar mingo@kernel.org
Cc: Jiri Olsa jolsa@kernel.org
Cc: Peter Zijlstra peterz@infradead.org
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20230728022447.1323563-1-namhyung@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 tools/build/Makefile.build  | 10 ++++++++++
 tools/perf/pmu-events/Build |  6 ++++++
 2 files changed, 16 insertions(+)
diff --git a/tools/build/Makefile.build b/tools/build/Makefile.build index 715092fc6a239..0f0aba16bdee7 100644 --- a/tools/build/Makefile.build +++ b/tools/build/Makefile.build @@ -116,6 +116,16 @@ $(OUTPUT)%.s: %.c FORCE $(call rule_mkdir) $(call if_changed_dep,cc_s_c)
+# bison and flex files are generated in the OUTPUT directory +# so it needs a separate rule to depend on them properly +$(OUTPUT)%-bison.o: $(OUTPUT)%-bison.c FORCE + $(call rule_mkdir) + $(call if_changed_dep,$(host)cc_o_c) + +$(OUTPUT)%-flex.o: $(OUTPUT)%-flex.c FORCE + $(call rule_mkdir) + $(call if_changed_dep,$(host)cc_o_c) + # Gather build data: # obj-y - list of build objects # subdir-y - list of directories to nest diff --git a/tools/perf/pmu-events/Build b/tools/perf/pmu-events/Build index 5ec5ce8c31bab..ea8c41f9c7398 100644 --- a/tools/perf/pmu-events/Build +++ b/tools/perf/pmu-events/Build @@ -25,3 +25,9 @@ $(OUTPUT)pmu-events/pmu-events.c: $(JSON) $(JSON_TEST) $(JEVENTS_PY) $(call rule_mkdir) $(Q)$(call echo-cmd,gen)$(PYTHON) $(JEVENTS_PY) $(SRCARCH) pmu-events/arch $@ endif + +# pmu-events.c file is generated in the OUTPUT directory so it needs a +# separate rule to depend on it properly +$(OUTPUT)pmu-events/pmu-events.o: $(PMU_EVENTS_C) + $(call rule_mkdir) + $(call if_changed_dep,cc_o_c)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: James Clark james.clark@arm.com
[ Upstream commit c8b947642d2339ce74c6a1ce56726089539f48d9 ]
Currently the test skips with an error because == only works in bash:
  $ ./perf test 91 -v
  Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc
  91: perf stat --bpf-counters test                                   :
  --- start ---
  test child forked, pid 44586
  ./tests/shell/stat_bpf_counters.sh: 26: [: -v: unexpected operator
  test child finished with -2
  ---- end ----
  perf stat --bpf-counters test: Skip
Changing == to = does the same thing, but doesn't result in an error:
  ./perf test 91 -v
  Couldn't bump rlimit(MEMLOCK), failures may take place when creating BPF maps, etc
  91: perf stat --bpf-counters test                                   :
  --- start ---
  test child forked, pid 45833
  Skipping: --bpf-counters not supported
    Error: unknown option `bpf-counters'
  [...]
  test child finished with -2
  ---- end ----
  perf stat --bpf-counters test: Skip
Signed-off-by: James Clark james.clark@arm.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Florian Fainelli f.fainelli@gmail.com Cc: Ian Rogers irogers@google.com Cc: Jiri Olsa jolsa@redhat.com Cc: John Fastabend john.fastabend@gmail.com Cc: KP Singh kpsingh@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Martin KaFai Lau kafai@fb.com Cc: Namhyung Kim namhyung@kernel.org Cc: Song Liu songliubraving@fb.com Cc: Sumanth Korikkar sumanthk@linux.ibm.com Cc: Thomas Richter tmricht@linux.ibm.com Cc: Yonghong Song yhs@fb.com Cc: bpf@vger.kernel.org Cc: netdev@vger.kernel.org Link: https://lore.kernel.org/r/20211028134828.65774-2-james.clark@arm.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Stable-dep-of: 68ca249c964f ("perf test shell stat_bpf_counters: Fix test on Intel") Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/tests/shell/stat_bpf_counters.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/tests/shell/stat_bpf_counters.sh b/tools/perf/tests/shell/stat_bpf_counters.sh index 2aed20dc22625..13473aeba489c 100755 --- a/tools/perf/tests/shell/stat_bpf_counters.sh +++ b/tools/perf/tests/shell/stat_bpf_counters.sh @@ -23,7 +23,7 @@ compare_number()
# skip if --bpf-counters is not supported if ! perf stat --bpf-counters true > /dev/null 2>&1; then - if [ "$1" == "-v" ]; then + if [ "$1" = "-v" ]; then echo "Skipping: --bpf-counters not supported" perf --no-pager stat --bpf-counters true || true fi
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Namhyung Kim namhyung@kernel.org
[ Upstream commit 68ca249c964f520af7f8763e22f12bd26b57b870 ]
As of now, bpf counters (bperf) don't support event groups. But the default perf stat includes topdown metrics if supported (on recent Intel machines), which require groups. That makes perf stat exit:
  $ sudo perf stat --bpf-counter true
  bpf managed perf events do not yet support groups.
Actually the test explicitly uses only the cycles event, but it failed to pass that option when checking whether the command is available.
Fixes: 2c0cb9f56020d2ea ("perf test: Add a shell test for 'perf stat --bpf-counters' new option") Reviewed-by: Song Liu song@kernel.org Signed-off-by: Namhyung Kim namhyung@kernel.org Cc: Adrian Hunter adrian.hunter@intel.com Cc: Ian Rogers irogers@google.com Cc: Ingo Molnar mingo@kernel.org Cc: Jiri Olsa jolsa@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: bpf@vger.kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230825164152.165610-2-namhyung@kernel.org Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/tests/shell/stat_bpf_counters.sh | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/perf/tests/shell/stat_bpf_counters.sh b/tools/perf/tests/shell/stat_bpf_counters.sh index 13473aeba489c..6bf24b85294c7 100755 --- a/tools/perf/tests/shell/stat_bpf_counters.sh +++ b/tools/perf/tests/shell/stat_bpf_counters.sh @@ -22,10 +22,10 @@ compare_number() }
# skip if --bpf-counters is not supported -if ! perf stat --bpf-counters true > /dev/null 2>&1; then +if ! perf stat -e cycles --bpf-counters true > /dev/null 2>&1; then if [ "$1" = "-v" ]; then echo "Skipping: --bpf-counters not supported" - perf --no-pager stat --bpf-counters true || true + perf --no-pager stat -e cycles --bpf-counters true || true fi exit 2 fi
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Josef Bacik josef@toxicpanda.com
[ Upstream commit c2e79e865b87c2920a3cd39de69c35f2bc758a51 ]
This is defined in volumes.c, move the prototype into volumes.h.
Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Reviewed-by: Anand Jain anand.jain@oracle.com Signed-off-by: Josef Bacik josef@toxicpanda.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Stable-dep-of: 6bfe3959b0e7 ("btrfs: compare the correct fsid/metadata_uuid in btrfs_validate_super") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/ctree.h | 2 -- fs/btrfs/volumes.h | 2 ++ 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 02d3ee6c7d9b0..1467bf439cb48 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -536,8 +536,6 @@ struct btrfs_swapfile_pin { int bg_extent_count; };
-bool btrfs_pinned_by_swapfile(struct btrfs_fs_info *fs_info, void *ptr); - enum { BTRFS_FS_BARRIER, BTRFS_FS_CLOSING_START, diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h index b49fa784e5ba3..1f6461b568659 100644 --- a/fs/btrfs/volumes.h +++ b/fs/btrfs/volumes.h @@ -622,4 +622,6 @@ const char *btrfs_bg_type_to_raid_name(u64 flags); int btrfs_verify_dev_extents(struct btrfs_fs_info *fs_info); int btrfs_repair_one_zone(struct btrfs_fs_info *fs_info, u64 logical);
+bool btrfs_pinned_by_swapfile(struct btrfs_fs_info *fs_info, void *ptr); + #endif
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Anand Jain anand.jain@oracle.com
[ Upstream commit 4844c3664a72d36cc79752cb651c78860b14c240 ]
In some cases we need to read the fsid from the superblock when the metadata_uuid is not set, and the metadata_uuid otherwise. So add a helper.
Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Tested-by: Guilherme G. Piccoli gpiccoli@igalia.com Signed-off-by: Anand Jain anand.jain@oracle.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Stable-dep-of: 6bfe3959b0e7 ("btrfs: compare the correct fsid/metadata_uuid in btrfs_validate_super") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/volumes.c | 8 ++++++++ fs/btrfs/volumes.h | 1 + 2 files changed, 9 insertions(+)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 0e9236a745b81..56ea2ec26436f 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -709,6 +709,14 @@ static int btrfs_open_one_device(struct btrfs_fs_devices *fs_devices, return -EINVAL; }
+u8 *btrfs_sb_fsid_ptr(struct btrfs_super_block *sb) +{ + bool has_metadata_uuid = (btrfs_super_incompat_flags(sb) & + BTRFS_FEATURE_INCOMPAT_METADATA_UUID); + + return has_metadata_uuid ? sb->metadata_uuid : sb->fsid; +} + /* * Handle scanned device having its CHANGING_FSID_V2 flag set and the fs_devices * being created with a disk that has already completed its fsid change. Such diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h index 1f6461b568659..eb91d6eb78ceb 100644 --- a/fs/btrfs/volumes.h +++ b/fs/btrfs/volumes.h @@ -623,5 +623,6 @@ int btrfs_verify_dev_extents(struct btrfs_fs_info *fs_info); int btrfs_repair_one_zone(struct btrfs_fs_info *fs_info, u64 logical);
bool btrfs_pinned_by_swapfile(struct btrfs_fs_info *fs_info, void *ptr); +u8 *btrfs_sb_fsid_ptr(struct btrfs_super_block *sb);
#endif
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Anand Jain anand.jain@oracle.com
[ Upstream commit 6bfe3959b0e7a526f5c64747801a8613f002f05a ]
The function btrfs_validate_super() should verify the metadata_uuid in the provided superblock argument, because all its callers expect it to do that.
Such as in the following stacks:
  write_all_supers()
     sb = fs_info->super_for_commit;
     btrfs_validate_write_super(.., sb)
       btrfs_validate_super(.., sb, ..)

  scrub_one_super()
     btrfs_validate_super(.., sb, ..)

  And check_dev_super()
     btrfs_validate_super(.., sb, ..)
However, it currently verifies fs_info::super_copy::metadata_uuid instead. Fix this by using the metadata_uuid from the superblock argument.
CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Tested-by: Guilherme G. Piccoli gpiccoli@igalia.com Signed-off-by: Anand Jain anand.jain@oracle.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/disk-io.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c index 6e0fdfd98f234..f0654fe80b346 100644 --- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -2605,13 +2605,11 @@ int btrfs_validate_super(struct btrfs_fs_info *fs_info, ret = -EINVAL; }
- if (btrfs_fs_incompat(fs_info, METADATA_UUID) && - memcmp(fs_info->fs_devices->metadata_uuid, - fs_info->super_copy->metadata_uuid, BTRFS_FSID_SIZE)) { + if (memcmp(fs_info->fs_devices->metadata_uuid, btrfs_sb_fsid_ptr(sb), + BTRFS_FSID_SIZE) != 0) { btrfs_err(fs_info, "superblock metadata_uuid doesn't match metadata uuid of fs_devices: %pU != %pU", - fs_info->super_copy->metadata_uuid, - fs_info->fs_devices->metadata_uuid); + btrfs_sb_fsid_ptr(sb), fs_info->fs_devices->metadata_uuid); ret = -EINVAL; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jinjie Ruan ruanjinjie@huawei.com
[ Upstream commit 7583028d359db3cd0072badcc576b4f9455fd27a ]
The timeout argument of usb_bulk_msg() is already in milliseconds; it is converted to jiffies internally by msecs_to_jiffies() in usb_start_wait_urb(). So fix the usage by dropping the redundant msecs_to_jiffies() from the timeout macros.
And as Hans suggested, also remove msecs_to_jiffies() from the IDLE_TIMEOUT macro to keep things consistent, and use msecs_to_jiffies(IDLE_TIMEOUT) at the place where the value really needs to be in jiffies.
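To illustrate the distinction, here is a minimal sketch that is not taken from the driver (the endpoint number and the *_MS constants are made-up values): usb_bulk_msg() takes its timeout in milliseconds and does the jiffies conversion internally, while jiffies-based interfaces such as queue_delayed_work() still need msecs_to_jiffies():

#include <linux/jiffies.h>
#include <linux/usb.h>
#include <linux/workqueue.h>

#define EXAMPLE_CMD_TIMEOUT_MS	200
#define EXAMPLE_IDLE_TIMEOUT_MS	2000

static int example_send_cmd(struct usb_device *udev, void *cmd, int len)
{
	int actual_len;

	/* usb_bulk_msg() wants milliseconds and converts to jiffies itself. */
	return usb_bulk_msg(udev, usb_sndbulkpipe(udev, 1), cmd, len,
			    &actual_len, EXAMPLE_CMD_TIMEOUT_MS);
}

static void example_rearm(struct delayed_work *dwork)
{
	/* queue_delayed_work() wants jiffies, so the conversion belongs here. */
	queue_delayed_work(system_long_wq, dwork,
			   msecs_to_jiffies(EXAMPLE_IDLE_TIMEOUT_MS));
}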
Fixes: e4f86e437164 ("drm: Add Grain Media GM12U320 driver v2") Signed-off-by: Jinjie Ruan ruanjinjie@huawei.com Suggested-by: Hans de Goede hdegoede@redhat.com Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Thomas Zimmermann tzimmermann@suse.de Link: https://patchwork.freedesktop.org/patch/msgid/20230904021421.1663892-1-ruanj... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/tiny/gm12u320.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/tiny/gm12u320.c b/drivers/gpu/drm/tiny/gm12u320.c index 6bc0c298739cc..9985a4419bb0c 100644 --- a/drivers/gpu/drm/tiny/gm12u320.c +++ b/drivers/gpu/drm/tiny/gm12u320.c @@ -67,10 +67,10 @@ MODULE_PARM_DESC(eco_mode, "Turn on Eco mode (less bright, more silent)"); #define READ_STATUS_SIZE 13 #define MISC_VALUE_SIZE 4
-#define CMD_TIMEOUT msecs_to_jiffies(200) -#define DATA_TIMEOUT msecs_to_jiffies(1000) -#define IDLE_TIMEOUT msecs_to_jiffies(2000) -#define FIRST_FRAME_TIMEOUT msecs_to_jiffies(2000) +#define CMD_TIMEOUT 200 +#define DATA_TIMEOUT 1000 +#define IDLE_TIMEOUT 2000 +#define FIRST_FRAME_TIMEOUT 2000
#define MISC_REQ_GET_SET_ECO_A 0xff #define MISC_REQ_GET_SET_ECO_B 0x35 @@ -386,7 +386,7 @@ static void gm12u320_fb_update_work(struct work_struct *work) * switches back to showing its logo. */ queue_delayed_work(system_long_wq, &gm12u320->fb_update.work, - IDLE_TIMEOUT); + msecs_to_jiffies(IDLE_TIMEOUT));
return; err:
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jinjie Ruan ruanjinjie@huawei.com
[ Upstream commit d0b0822e32dbae80bbcb3cc86f34d28539d913df ]
Since both debugfs_create_dir() and debugfs_create_file() return an ERR_PTR() on failure and never NULL, use IS_ERR() instead of checking for NULL.
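For reference, a minimal sketch of the resulting pattern (the "example" names below are assumptions, not qla2xxx code): debugfs creation helpers return either a valid dentry or an ERR_PTR(), so the failure check must be IS_ERR() rather than a NULL test:

#include <linux/debugfs.h>
#include <linux/err.h>
#include <linux/printk.h>

static struct dentry *example_dir;

static void example_debugfs_setup(void)
{
	example_dir = debugfs_create_dir("example", NULL);
	if (IS_ERR(example_dir)) {	/* an ERR_PTR() on failure, never NULL */
		pr_warn("unable to create debugfs directory\n");
		example_dir = NULL;
		return;
	}
}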
Fixes: 1e98fb0f9208 ("scsi: qla2xxx: Setup debugfs entries for remote ports") Signed-off-by: Jinjie Ruan ruanjinjie@huawei.com Link: https://lore.kernel.org/r/20230831140930.3166359-1-ruanjinjie@huawei.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qla2xxx/qla_dfs.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/scsi/qla2xxx/qla_dfs.c b/drivers/scsi/qla2xxx/qla_dfs.c index aa9d69e5274d8..af921fd150d1e 100644 --- a/drivers/scsi/qla2xxx/qla_dfs.c +++ b/drivers/scsi/qla2xxx/qla_dfs.c @@ -116,7 +116,7 @@ qla2x00_dfs_create_rport(scsi_qla_host_t *vha, struct fc_port *fp)
sprintf(wwn, "pn-%016llx", wwn_to_u64(fp->port_name)); fp->dfs_rport_dir = debugfs_create_dir(wwn, vha->dfs_rport_root); - if (!fp->dfs_rport_dir) + if (IS_ERR(fp->dfs_rport_dir)) return; if (NVME_TARGET(vha->hw, fp)) debugfs_create_file("dev_loss_tmo", 0600, fp->dfs_rport_dir, @@ -615,14 +615,14 @@ qla2x00_dfs_setup(scsi_qla_host_t *vha) if (IS_QLA27XX(ha) || IS_QLA83XX(ha) || IS_QLA28XX(ha)) { ha->tgt.dfs_naqp = debugfs_create_file("naqp", 0400, ha->dfs_dir, vha, &dfs_naqp_ops); - if (!ha->tgt.dfs_naqp) { + if (IS_ERR(ha->tgt.dfs_naqp)) { ql_log(ql_log_warn, vha, 0xd011, "Unable to create debugFS naqp node.\n"); goto out; } } vha->dfs_rport_root = debugfs_create_dir("rports", ha->dfs_dir); - if (!vha->dfs_rport_root) { + if (IS_ERR(vha->dfs_rport_root)) { ql_log(ql_log_warn, vha, 0xd012, "Unable to create debugFS rports node.\n"); goto out;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Masami Hiramatsu (Google) mhiramat@kernel.org
[ Upstream commit 7e021da80f48582171029714f8a487347f29dddb ]
Fix ftracetest to unmount tracefs if the script itself mounted it, so that the original system environment is restored. If tracefs was already mounted, this does nothing.
Suggested-by: Mark Brown broonie@kernel.org Link: https://lore.kernel.org/all/29fce076-746c-4650-8358-b4e0fa215cf7@sirena.org.... Fixes: cbd965bde74c ("ftrace/selftests: Return the skip code when tracing directory not configured in kernel") Signed-off-by: Masami Hiramatsu (Google) mhiramat@kernel.org Reviewed-by: Steven Rostedt (Google) rostedt@goodmis.org Reviewed-by: Mark Brown broonie@kernel.org Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/ftrace/ftracetest | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/tools/testing/selftests/ftrace/ftracetest b/tools/testing/selftests/ftrace/ftracetest index 8ec1922e974eb..55314cd197ab9 100755 --- a/tools/testing/selftests/ftrace/ftracetest +++ b/tools/testing/selftests/ftrace/ftracetest @@ -30,6 +30,9 @@ err_ret=1 # kselftest skip code is 4 err_skip=4
+# umount required +UMOUNT_DIR="" + # cgroup RT scheduling prevents chrt commands from succeeding, which # induces failures in test wakeup tests. Disable for the duration of # the tests. @@ -44,6 +47,9 @@ setup() {
cleanup() { echo $sched_rt_runtime_orig > $sched_rt_runtime + if [ -n "${UMOUNT_DIR}" ]; then + umount ${UMOUNT_DIR} ||: + fi }
errexit() { # message @@ -155,11 +161,13 @@ if [ -z "$TRACING_DIR" ]; then mount -t tracefs nodev /sys/kernel/tracing || errexit "Failed to mount /sys/kernel/tracing" TRACING_DIR="/sys/kernel/tracing" + UMOUNT_DIR=${TRACING_DIR} # If debugfs exists, then so does /sys/kernel/debug elif [ -d "/sys/kernel/debug" ]; then mount -t debugfs nodev /sys/kernel/debug || errexit "Failed to mount /sys/kernel/debug" TRACING_DIR="/sys/kernel/debug/tracing" + UMOUNT_DIR=${TRACING_DIR} else err_ret=$err_skip errexit "debugfs and tracefs are not configured in this kernel"
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jinjie Ruan ruanjinjie@huawei.com
[ Upstream commit 7dcc683db3639eadd11bf0d59a09088a43de5e22 ]
Since debugfs_create_file() returns ERR_PTR and never NULL, use IS_ERR() to check the return value.
Fixes: 2fcbc569b9f5 ("scsi: lpfc: Make debugfs ktime stats generic for NVME and SCSI") Fixes: 4c47efc140fa ("scsi: lpfc: Move SCSI and NVME Stats to hardware queue structures") Fixes: 6a828b0f6192 ("scsi: lpfc: Support non-uniform allocation of MSIX vectors to hardware queues") Fixes: 95bfc6d8ad86 ("scsi: lpfc: Make FW logging dynamically configurable") Fixes: 9f77870870d8 ("scsi: lpfc: Add debugfs support for cm framework buffers") Fixes: c490850a0947 ("scsi: lpfc: Adapt partitioned XRI lists to efficient sharing") Signed-off-by: Jinjie Ruan ruanjinjie@huawei.com Link: https://lore.kernel.org/r/20230906030809.2847970-1-ruanjinjie@huawei.com Reviewed-by: Justin Tee justin.tee@broadcom.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/lpfc/lpfc_debugfs.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/drivers/scsi/lpfc/lpfc_debugfs.c b/drivers/scsi/lpfc/lpfc_debugfs.c index 560b2504e674d..a15ad76aee211 100644 --- a/drivers/scsi/lpfc/lpfc_debugfs.c +++ b/drivers/scsi/lpfc/lpfc_debugfs.c @@ -6058,7 +6058,7 @@ lpfc_debugfs_initialize(struct lpfc_vport *vport) phba->hba_debugfs_root, phba, &lpfc_debugfs_op_multixripools); - if (!phba->debug_multixri_pools) { + if (IS_ERR(phba->debug_multixri_pools)) { lpfc_printf_vlog(vport, KERN_ERR, LOG_INIT, "0527 Cannot create debugfs multixripools\n"); goto debug_failed; @@ -6070,7 +6070,7 @@ lpfc_debugfs_initialize(struct lpfc_vport *vport) debugfs_create_file(name, S_IFREG | 0644, phba->hba_debugfs_root, phba, &lpfc_cgn_buffer_op); - if (!phba->debug_cgn_buffer) { + if (IS_ERR(phba->debug_cgn_buffer)) { lpfc_printf_vlog(vport, KERN_ERR, LOG_INIT, "6527 Cannot create debugfs " "cgn_buffer\n"); @@ -6083,7 +6083,7 @@ lpfc_debugfs_initialize(struct lpfc_vport *vport) debugfs_create_file(name, S_IFREG | 0644, phba->hba_debugfs_root, phba, &lpfc_rx_monitor_op); - if (!phba->debug_rx_monitor) { + if (IS_ERR(phba->debug_rx_monitor)) { lpfc_printf_vlog(vport, KERN_ERR, LOG_INIT, "6528 Cannot create debugfs " "rx_monitor\n"); @@ -6096,7 +6096,7 @@ lpfc_debugfs_initialize(struct lpfc_vport *vport) debugfs_create_file(name, 0644, phba->hba_debugfs_root, phba, &lpfc_debugfs_ras_log); - if (!phba->debug_ras_log) { + if (IS_ERR(phba->debug_ras_log)) { lpfc_printf_vlog(vport, KERN_ERR, LOG_INIT, "6148 Cannot create debugfs" " ras_log\n"); @@ -6117,7 +6117,7 @@ lpfc_debugfs_initialize(struct lpfc_vport *vport) debugfs_create_file(name, S_IFREG | 0644, phba->hba_debugfs_root, phba, &lpfc_debugfs_op_lockstat); - if (!phba->debug_lockstat) { + if (IS_ERR(phba->debug_lockstat)) { lpfc_printf_vlog(vport, KERN_ERR, LOG_INIT, "4610 Can't create debugfs lockstat\n"); goto debug_failed; @@ -6346,7 +6346,7 @@ lpfc_debugfs_initialize(struct lpfc_vport *vport) debugfs_create_file(name, 0644, vport->vport_debugfs_root, vport, &lpfc_debugfs_op_scsistat); - if (!vport->debug_scsistat) { + if (IS_ERR(vport->debug_scsistat)) { lpfc_printf_vlog(vport, KERN_ERR, LOG_INIT, "4611 Cannot create debugfs scsistat\n"); goto debug_failed; @@ -6357,7 +6357,7 @@ lpfc_debugfs_initialize(struct lpfc_vport *vport) debugfs_create_file(name, 0644, vport->vport_debugfs_root, vport, &lpfc_debugfs_op_ioktime); - if (!vport->debug_ioktime) { + if (IS_ERR(vport->debug_ioktime)) { lpfc_printf_vlog(vport, KERN_ERR, LOG_INIT, "0815 Cannot create debugfs ioktime\n"); goto debug_failed;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kirill A. Shutemov kirill.shutemov@linux.intel.com
[ Upstream commit f530ee95b72e77b09c141c4b1a4b94d1199ffbd9 ]
The decompressor has a hard limit on the number of page tables it can allocate. This limit is defined at compile-time and will cause boot failure if it is reached.
The kernel is very strict and calculates the limit precisely for the worst-case scenario based on the current configuration. However, it is easy to forget to adjust the limit when a new use-case arises. The worst-case scenario is rarely encountered during sanity checks.
In the case of enabling 5-level paging, a use-case was overlooked. The limit needs to be increased by one to accommodate the additional level. This oversight went unnoticed until Aaron attempted to run the kernel via kexec with 5-level paging and unaccepted memory enabled.
Update the worst-case calculations to include 5-level paging.
To address this issue, let's allocate some extra space for page tables. 128K should be sufficient for any use-case. The logic can be simplified by using a single value for all kernel configurations.
[ Also add a warning, should this memory run low - by Dave Hansen. ]
Fixes: 34bbb0009f3b ("x86/boot/compressed: Enable 5-level paging during decompression stage") Reported-by: Aaron Lu aaron.lu@intel.com Signed-off-by: Kirill A. Shutemov kirill.shutemov@linux.intel.com Signed-off-by: Ingo Molnar mingo@kernel.org Link: https://lore.kernel.org/r/20230915070221.10266-1-kirill.shutemov@linux.intel... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/boot/compressed/ident_map_64.c | 8 +++++ arch/x86/include/asm/boot.h | 45 +++++++++++++++++-------- 2 files changed, 39 insertions(+), 14 deletions(-)
diff --git a/arch/x86/boot/compressed/ident_map_64.c b/arch/x86/boot/compressed/ident_map_64.c index f7213d0943b82..575d881ff86e2 100644 --- a/arch/x86/boot/compressed/ident_map_64.c +++ b/arch/x86/boot/compressed/ident_map_64.c @@ -67,6 +67,14 @@ static void *alloc_pgt_page(void *context) return NULL; }
+ /* Consumed more tables than expected? */ + if (pages->pgt_buf_offset == BOOT_PGT_SIZE_WARN) { + debug_putstr("pgt_buf running low in " __FILE__ "\n"); + debug_putstr("Need to raise BOOT_PGT_SIZE?\n"); + debug_putaddr(pages->pgt_buf_offset); + debug_putaddr(pages->pgt_buf_size); + } + entry = pages->pgt_buf + pages->pgt_buf_offset; pages->pgt_buf_offset += PAGE_SIZE;
diff --git a/arch/x86/include/asm/boot.h b/arch/x86/include/asm/boot.h index 9191280d9ea31..215d37f7dde8a 100644 --- a/arch/x86/include/asm/boot.h +++ b/arch/x86/include/asm/boot.h @@ -40,23 +40,40 @@ #ifdef CONFIG_X86_64 # define BOOT_STACK_SIZE 0x4000
+/* + * Used by decompressor's startup_32() to allocate page tables for identity + * mapping of the 4G of RAM in 4-level paging mode: + * - 1 level4 table; + * - 1 level3 table; + * - 4 level2 table that maps everything with 2M pages; + * + * The additional level5 table needed for 5-level paging is allocated from + * trampoline_32bit memory. + */ # define BOOT_INIT_PGT_SIZE (6*4096) -# ifdef CONFIG_RANDOMIZE_BASE + /* - * Assuming all cross the 512GB boundary: - * 1 page for level4 - * (2+2)*4 pages for kernel, param, cmd_line, and randomized kernel - * 2 pages for first 2M (video RAM: CONFIG_X86_VERBOSE_BOOTUP). - * Total is 19 pages. + * Total number of page tables kernel_add_identity_map() can allocate, + * including page tables consumed by startup_32(). + * + * Worst-case scenario: + * - 5-level paging needs 1 level5 table; + * - KASLR needs to map kernel, boot_params, cmdline and randomized kernel, + * assuming all of them cross 256T boundary: + * + 4*2 level4 table; + * + 4*2 level3 table; + * + 4*2 level2 table; + * - X86_VERBOSE_BOOTUP needs to map the first 2M (video RAM): + * + 1 level4 table; + * + 1 level3 table; + * + 1 level2 table; + * Total: 28 tables + * + * Add 4 spare table in case decompressor touches anything beyond what is + * accounted above. Warn if it happens. */ -# ifdef CONFIG_X86_VERBOSE_BOOTUP -# define BOOT_PGT_SIZE (19*4096) -# else /* !CONFIG_X86_VERBOSE_BOOTUP */ -# define BOOT_PGT_SIZE (17*4096) -# endif -# else /* !CONFIG_RANDOMIZE_BASE */ -# define BOOT_PGT_SIZE BOOT_INIT_PGT_SIZE -# endif +# define BOOT_PGT_SIZE_WARN (28*4096) +# define BOOT_PGT_SIZE (32*4096)
#else /* !CONFIG_X86_64 */ # define BOOT_STACK_SIZE 0x1000
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Song Liu song@kernel.org
[ Upstream commit 75b2f7e4c9e0fd750a5a27ca9736d1daa7a3762a ]
-flto* implies -ffunction-sections. With LTO enabled, ld.lld generates multiple .text sections for purgatory.ro:
  $ readelf -S purgatory.ro | grep " .text"
  [ 1] .text             PROGBITS  0000000000000000  00000040
  [ 7] .text.purgatory   PROGBITS  0000000000000000  000020e0
  [ 9] .text.warn        PROGBITS  0000000000000000  000021c0
  [13] .text.sha256_upda PROGBITS  0000000000000000  000022f0
  [15] .text.sha224_upda PROGBITS  0000000000000000  00002be0
  [17] .text.sha256_fina PROGBITS  0000000000000000  00002bf0
  [19] .text.sha224_fina PROGBITS  0000000000000000  00002cc0
This causes WARNING from kexec_purgatory_setup_sechdrs():
WARNING: CPU: 26 PID: 110894 at kernel/kexec_file.c:919 kexec_load_purgatory+0x37f/0x390
Fix this by disabling LTO for purgatory.
[ AFAICT, x86 is the only arch that supports LTO and purgatory. ]
We could also fix this with an explicit linker script to rejoin .text.* sections back into .text. However, given the benefit of LTOing purgatory is small, simply disable the production of more .text.* sections for now.
Fixes: b33fff07e3e3 ("x86, build: allow LTO to be selected") Signed-off-by: Song Liu song@kernel.org Signed-off-by: Ingo Molnar mingo@kernel.org Reviewed-by: Nick Desaulniers ndesaulniers@google.com Reviewed-by: Sami Tolvanen samitolvanen@google.com Link: https://lore.kernel.org/r/20230914170138.995606-1-song@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/purgatory/Makefile | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/arch/x86/purgatory/Makefile b/arch/x86/purgatory/Makefile index dc0b91c1db04b..7a7701d1e18d0 100644 --- a/arch/x86/purgatory/Makefile +++ b/arch/x86/purgatory/Makefile @@ -19,6 +19,10 @@ CFLAGS_sha256.o := -D__DISABLE_EXPORTS # optimization flags. KBUILD_CFLAGS := $(filter-out -fprofile-sample-use=% -fprofile-use=%,$(KBUILD_CFLAGS))
+# When LTO is enabled, llvm emits many text sections, which is not supported +# by kexec. Remove -flto=* flags. +KBUILD_CFLAGS := $(filter-out $(CC_FLAGS_LTO),$(KBUILD_CFLAGS)) + # When linking purgatory.ro with -r unresolved symbols are not checked, # also link a purgatory.chk binary without -r to check for unresolved symbols. PURGATORY_LDFLAGS := -e purgatory_start -nostdlib -z nodefaultlib
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Westphal fw@strlen.de
[ Upstream commit 00c320f9b75560628e840bef027a27c746706759 ]
We only need to validate tables that saw changes in the current transaction.
The existing code revalidates all tables, but this isn't needed as cross-table jumps are not allowed (chains have table scope).
Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Stable-dep-of: 5f68718b34a5 ("netfilter: nf_tables: GC transaction API to avoid race with control plane") Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/netfilter/nf_tables.h | 3 ++- net/netfilter/nf_tables_api.c | 38 +++++++++++++++---------------- 2 files changed, 20 insertions(+), 21 deletions(-)
diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h index 1458b3eae8ada..b8d967e0eb1e2 100644 --- a/include/net/netfilter/nf_tables.h +++ b/include/net/netfilter/nf_tables.h @@ -1183,6 +1183,7 @@ static inline void nft_use_inc_restore(u32 *use) * @genmask: generation mask * @afinfo: address family info * @name: name of the table + * @validate_state: internal, set when transaction adds jumps */ struct nft_table { struct list_head list; @@ -1201,6 +1202,7 @@ struct nft_table { char *name; u16 udlen; u8 *udata; + u8 validate_state; };
static inline bool nft_table_has_owner(const struct nft_table *table) @@ -1682,7 +1684,6 @@ struct nftables_pernet { struct mutex commit_mutex; u64 table_handle; unsigned int base_seq; - u8 validate_state; };
extern unsigned int nf_tables_net_id; diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index d84da11aaee5c..dde19be41610d 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -102,11 +102,9 @@ static const u8 nft2audit_op[NFT_MSG_MAX] = { // enum nf_tables_msg_types [NFT_MSG_DELFLOWTABLE] = AUDIT_NFT_OP_FLOWTABLE_UNREGISTER, };
-static void nft_validate_state_update(struct net *net, u8 new_validate_state) +static void nft_validate_state_update(struct nft_table *table, u8 new_validate_state) { - struct nftables_pernet *nft_net = nft_pernet(net); - - switch (nft_net->validate_state) { + switch (table->validate_state) { case NFT_VALIDATE_SKIP: WARN_ON_ONCE(new_validate_state == NFT_VALIDATE_DO); break; @@ -117,7 +115,7 @@ static void nft_validate_state_update(struct net *net, u8 new_validate_state) return; }
- nft_net->validate_state = new_validate_state; + table->validate_state = new_validate_state; } static void nf_tables_trans_destroy_work(struct work_struct *w); static DECLARE_WORK(trans_destroy_work, nf_tables_trans_destroy_work); @@ -1286,6 +1284,7 @@ static int nf_tables_newtable(struct sk_buff *skb, const struct nfnl_info *info, if (table == NULL) goto err_kzalloc;
+ table->validate_state = NFT_VALIDATE_SKIP; table->name = nla_strdup(attr, GFP_KERNEL); if (table->name == NULL) goto err_strdup; @@ -3650,7 +3649,7 @@ static int nf_tables_newrule(struct sk_buff *skb, const struct nfnl_info *info, }
if (expr_info[i].ops->validate) - nft_validate_state_update(net, NFT_VALIDATE_NEED); + nft_validate_state_update(table, NFT_VALIDATE_NEED);
expr_info[i].ops = NULL; expr = nft_expr_next(expr); @@ -3704,7 +3703,7 @@ static int nf_tables_newrule(struct sk_buff *skb, const struct nfnl_info *info, if (flow) nft_trans_flow_rule(trans) = flow;
- if (nft_net->validate_state == NFT_VALIDATE_DO) + if (table->validate_state == NFT_VALIDATE_DO) return nft_table_validate(net, table);
return 0; @@ -6347,7 +6346,7 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, if (desc.type == NFT_DATA_VERDICT && (elem.data.val.verdict.code == NFT_GOTO || elem.data.val.verdict.code == NFT_JUMP)) - nft_validate_state_update(ctx->net, + nft_validate_state_update(ctx->table, NFT_VALIDATE_NEED); }
@@ -6465,7 +6464,6 @@ static int nf_tables_newsetelem(struct sk_buff *skb, const struct nfnl_info *info, const struct nlattr * const nla[]) { - struct nftables_pernet *nft_net = nft_pernet(info->net); struct netlink_ext_ack *extack = info->extack; u8 genmask = nft_genmask_next(info->net); u8 family = info->nfmsg->nfgen_family; @@ -6503,7 +6501,7 @@ static int nf_tables_newsetelem(struct sk_buff *skb, return err; }
- if (nft_net->validate_state == NFT_VALIDATE_DO) + if (table->validate_state == NFT_VALIDATE_DO) return nft_table_validate(net, table);
return 0; @@ -8600,19 +8598,20 @@ static int nf_tables_validate(struct net *net) struct nftables_pernet *nft_net = nft_pernet(net); struct nft_table *table;
- switch (nft_net->validate_state) { - case NFT_VALIDATE_SKIP: - break; - case NFT_VALIDATE_NEED: - nft_validate_state_update(net, NFT_VALIDATE_DO); - fallthrough; - case NFT_VALIDATE_DO: - list_for_each_entry(table, &nft_net->tables, list) { + list_for_each_entry(table, &nft_net->tables, list) { + switch (table->validate_state) { + case NFT_VALIDATE_SKIP: + continue; + case NFT_VALIDATE_NEED: + nft_validate_state_update(table, NFT_VALIDATE_DO); + fallthrough; + case NFT_VALIDATE_DO: if (nft_table_validate(net, table) < 0) return -EAGAIN; + + nft_validate_state_update(table, NFT_VALIDATE_SKIP); }
- nft_validate_state_update(net, NFT_VALIDATE_SKIP); break; }
@@ -10344,7 +10343,6 @@ static int __net_init nf_tables_init_net(struct net *net) INIT_LIST_HEAD(&nft_net->notify_list); mutex_init(&nft_net->commit_mutex); nft_net->base_seq = 1; - nft_net->validate_state = NFT_VALIDATE_SKIP;
return 0; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pablo Neira Ayuso pablo@netfilter.org
[ Upstream commit 5f68718b34a531a556f2f50300ead2862278da26 ]
The set types rhashtable and rbtree use a GC worker to reclaim memory. From the system work queue, in periodic intervals, a scan of the table is done.
The major caveat here is that the nft transaction mutex is not held. This causes a race between control plane and GC when they attempt to delete the same element.
We cannot grab the netlink mutex from the work queue, because the control plane has to wait for the GC work queue in case the set is to be removed, so we get the following deadlock:

     cpu 1                                cpu2
  GC work                    transaction comes in, lock nft mutex
   `acquire nft mutex // BLOCKS
                             transaction asks to remove the set
                             set destruction calls cancel_work_sync()
cancel_work_sync will now block forever, because it is waiting for the mutex the caller already owns.
This patch adds a new API that deals with garbage collection in two steps:
1) Lockless GC of expired elements sets the NFT_SET_ELEM_DEAD_BIT on them so they are not visible via lookup. Annotate current GC sequence in the GC transaction. Enqueue GC transaction work as soon as it is full. If ruleset is updated, then GC transaction is aborted and retried later.
2) GC work grabs the mutex. If GC sequence has changed then this GC transaction lost race with control plane, abort it as it contains stale references to objects and let GC try again later. If the ruleset is intact, then this GC transaction deactivates and removes the elements and it uses call_rcu() to destroy elements.
Note that no elements are removed from the GC lockless path; the _DEAD bit is set and pointers are collected. GC catchall does not remove the elements anymore either. There is a new set->dead flag that is set to abort the GC transaction, in order to deal with the set->ops->destroy() path which removes the remaining elements in the set from commit_release, where no mutex is held.
To deal with GC when mutex is held, which allows safe deactivate and removal, add sync GC API which releases the set element object via call_rcu(). This is used by rbtree and pipapo backends which also perform garbage collection from control plane path.
Since element removal from sets can happen from control plane and element garbage collection/timeout, it is necessary to keep the set structure alive until all elements have been deactivated and destroyed.
We cannot do a cancel_work_sync or flush_work in nft_set_destroy because it's called with the transaction mutex held, but the aforementioned async work queue might be blocked on the very mutex that the nft_set_destroy() call chain is sitting on.
This gives us the choice of ABBA deadlock or UaF.
To avoid both, add a set->refs refcount_t member. The GC API can then increment the set refcount and release it once the elements have been freed.
Set backends are adapted to use the GC transaction API in a follow up patch entitled:
("netfilter: nf_tables: use gc transaction API in set backends")
This is joint work with Florian Westphal.
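For orientation, here is a rough sketch of how an asynchronous set-backend GC scan is expected to use the new API. It is only an illustration: for_each_expired_elem() and elem_priv() are placeholders standing in for the backend's own iteration, and the real backend conversions land in the follow-up patch.

static void example_set_gc(struct nft_set *set, unsigned int gc_seq)
{
	struct nft_trans_gc *gc;
	struct nft_set_ext *ext;

	gc = nft_trans_gc_alloc(set, gc_seq, GFP_KERNEL);
	if (!gc)
		return;

	for_each_expired_elem(set, ext) {	/* placeholder iterator */
		if (nft_set_elem_is_dead(ext))
			goto add;

		/* hide the element from the packet path, do not free it here */
		nft_set_elem_dead(ext);
add:
		gc = nft_trans_gc_queue_async(gc, gc_seq, GFP_ATOMIC);
		if (!gc)
			return;	/* could not allocate the next batch, try again later */

		nft_trans_gc_elem_add(gc, elem_priv(ext));	/* placeholder accessor */
	}

	/* queue the remaining batch for the GC worker, or free it if empty */
	nft_trans_gc_queue_async_done(gc);
}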
Fixes: cfed7e1b1f8e ("netfilter: nf_tables: add set garbage collection helpers") Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/netfilter/nf_tables.h | 64 +++++++- net/netfilter/nf_tables_api.c | 248 ++++++++++++++++++++++++++++-- 2 files changed, 300 insertions(+), 12 deletions(-)
diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h index b8d967e0eb1e2..a6bf58316a5d8 100644 --- a/include/net/netfilter/nf_tables.h +++ b/include/net/netfilter/nf_tables.h @@ -477,6 +477,7 @@ struct nft_set_elem_expr { * * @list: table set list node * @bindings: list of set bindings + * @refs: internal refcounting for async set destruction * @table: table this set belongs to * @net: netnamespace this set belongs to * @name: name of the set @@ -506,6 +507,7 @@ struct nft_set_elem_expr { struct nft_set { struct list_head list; struct list_head bindings; + refcount_t refs; struct nft_table *table; possible_net_t net; char *name; @@ -527,7 +529,8 @@ struct nft_set { struct list_head pending_update; /* runtime data below here */ const struct nft_set_ops *ops ____cacheline_aligned; - u16 flags:14, + u16 flags:13, + dead:1, genmask:2; u8 klen; u8 dlen; @@ -1527,6 +1530,32 @@ static inline void nft_set_elem_clear_busy(struct nft_set_ext *ext) clear_bit(NFT_SET_ELEM_BUSY_BIT, word); }
+#define NFT_SET_ELEM_DEAD_MASK (1 << 3) + +#if defined(__LITTLE_ENDIAN_BITFIELD) +#define NFT_SET_ELEM_DEAD_BIT 3 +#elif defined(__BIG_ENDIAN_BITFIELD) +#define NFT_SET_ELEM_DEAD_BIT (BITS_PER_LONG - BITS_PER_BYTE + 3) +#else +#error +#endif + +static inline void nft_set_elem_dead(struct nft_set_ext *ext) +{ + unsigned long *word = (unsigned long *)ext; + + BUILD_BUG_ON(offsetof(struct nft_set_ext, genmask) != 0); + set_bit(NFT_SET_ELEM_DEAD_BIT, word); +} + +static inline int nft_set_elem_is_dead(const struct nft_set_ext *ext) +{ + unsigned long *word = (unsigned long *)ext; + + BUILD_BUG_ON(offsetof(struct nft_set_ext, genmask) != 0); + return test_bit(NFT_SET_ELEM_DEAD_BIT, word); +} + /** * struct nft_trans - nf_tables object update in transaction * @@ -1658,6 +1687,38 @@ struct nft_trans_flowtable { #define nft_trans_flowtable_flags(trans) \ (((struct nft_trans_flowtable *)trans->data)->flags)
+#define NFT_TRANS_GC_BATCHCOUNT 256 + +struct nft_trans_gc { + struct list_head list; + struct net *net; + struct nft_set *set; + u32 seq; + u8 count; + void *priv[NFT_TRANS_GC_BATCHCOUNT]; + struct rcu_head rcu; +}; + +struct nft_trans_gc *nft_trans_gc_alloc(struct nft_set *set, + unsigned int gc_seq, gfp_t gfp); +void nft_trans_gc_destroy(struct nft_trans_gc *trans); + +struct nft_trans_gc *nft_trans_gc_queue_async(struct nft_trans_gc *gc, + unsigned int gc_seq, gfp_t gfp); +void nft_trans_gc_queue_async_done(struct nft_trans_gc *gc); + +struct nft_trans_gc *nft_trans_gc_queue_sync(struct nft_trans_gc *gc, gfp_t gfp); +void nft_trans_gc_queue_sync_done(struct nft_trans_gc *trans); + +void nft_trans_gc_elem_add(struct nft_trans_gc *gc, void *priv); + +struct nft_trans_gc *nft_trans_gc_catchall(struct nft_trans_gc *gc, + unsigned int gc_seq); + +void nft_setelem_data_deactivate(const struct net *net, + const struct nft_set *set, + struct nft_set_elem *elem); + int __init nft_chain_filter_init(void); void nft_chain_filter_fini(void);
@@ -1684,6 +1745,7 @@ struct nftables_pernet { struct mutex commit_mutex; u64 table_handle; unsigned int base_seq; + unsigned int gc_seq; };
extern unsigned int nf_tables_net_id; diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index dde19be41610d..2333f5da1eb97 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -31,7 +31,9 @@ static LIST_HEAD(nf_tables_expressions); static LIST_HEAD(nf_tables_objects); static LIST_HEAD(nf_tables_flowtables); static LIST_HEAD(nf_tables_destroy_list); +static LIST_HEAD(nf_tables_gc_list); static DEFINE_SPINLOCK(nf_tables_destroy_list_lock); +static DEFINE_SPINLOCK(nf_tables_gc_list_lock);
enum { NFT_VALIDATE_SKIP = 0, @@ -120,6 +122,9 @@ static void nft_validate_state_update(struct nft_table *table, u8 new_validate_s static void nf_tables_trans_destroy_work(struct work_struct *w); static DECLARE_WORK(trans_destroy_work, nf_tables_trans_destroy_work);
+static void nft_trans_gc_work(struct work_struct *work); +static DECLARE_WORK(trans_gc_work, nft_trans_gc_work); + static void nft_ctx_init(struct nft_ctx *ctx, struct net *net, const struct sk_buff *skb, @@ -581,10 +586,6 @@ static int nft_trans_set_add(const struct nft_ctx *ctx, int msg_type, return __nft_trans_set_add(ctx, msg_type, set, NULL); }
-static void nft_setelem_data_deactivate(const struct net *net, - const struct nft_set *set, - struct nft_set_elem *elem); - static int nft_mapelem_deactivate(const struct nft_ctx *ctx, struct nft_set *set, const struct nft_set_iter *iter, @@ -4756,6 +4757,7 @@ static int nf_tables_newset(struct sk_buff *skb, const struct nfnl_info *info,
INIT_LIST_HEAD(&set->bindings); INIT_LIST_HEAD(&set->catchall_list); + refcount_set(&set->refs, 1); set->table = table; write_pnet(&set->net, net); set->ops = ops; @@ -4823,6 +4825,14 @@ static void nft_set_catchall_destroy(const struct nft_ctx *ctx, } }
+static void nft_set_put(struct nft_set *set) +{ + if (refcount_dec_and_test(&set->refs)) { + kfree(set->name); + kvfree(set); + } +} + static void nft_set_destroy(const struct nft_ctx *ctx, struct nft_set *set) { int i; @@ -4835,8 +4845,7 @@ static void nft_set_destroy(const struct nft_ctx *ctx, struct nft_set *set)
set->ops->destroy(ctx, set); nft_set_catchall_destroy(ctx, set); - kfree(set->name); - kvfree(set); + nft_set_put(set); }
static int nf_tables_delset(struct sk_buff *skb, const struct nfnl_info *info, @@ -5901,7 +5910,8 @@ struct nft_set_ext *nft_set_catchall_lookup(const struct net *net, list_for_each_entry_rcu(catchall, &set->catchall_list, list) { ext = nft_set_elem_ext(set, catchall->elem); if (nft_set_elem_active(ext, genmask) && - !nft_set_elem_expired(ext)) + !nft_set_elem_expired(ext) && + !nft_set_elem_is_dead(ext)) return ext; }
@@ -6545,9 +6555,9 @@ static void nft_setelem_data_activate(const struct net *net, nft_use_inc_restore(&(*nft_set_ext_obj(ext))->use); }
-static void nft_setelem_data_deactivate(const struct net *net, - const struct nft_set *set, - struct nft_set_elem *elem) +void nft_setelem_data_deactivate(const struct net *net, + const struct nft_set *set, + struct nft_set_elem *elem) { const struct nft_set_ext *ext = nft_set_elem_ext(set, elem->priv);
@@ -8882,6 +8892,207 @@ void nft_chain_del(struct nft_chain *chain) list_del_rcu(&chain->list); }
+static void nft_trans_gc_setelem_remove(struct nft_ctx *ctx, + struct nft_trans_gc *trans) +{ + void **priv = trans->priv; + unsigned int i; + + for (i = 0; i < trans->count; i++) { + struct nft_set_elem elem = { + .priv = priv[i], + }; + + nft_setelem_data_deactivate(ctx->net, trans->set, &elem); + nft_setelem_remove(ctx->net, trans->set, &elem); + } +} + +void nft_trans_gc_destroy(struct nft_trans_gc *trans) +{ + nft_set_put(trans->set); + put_net(trans->net); + kfree(trans); +} + +static void nft_trans_gc_trans_free(struct rcu_head *rcu) +{ + struct nft_set_elem elem = {}; + struct nft_trans_gc *trans; + struct nft_ctx ctx = {}; + unsigned int i; + + trans = container_of(rcu, struct nft_trans_gc, rcu); + ctx.net = read_pnet(&trans->set->net); + + for (i = 0; i < trans->count; i++) { + elem.priv = trans->priv[i]; + if (!nft_setelem_is_catchall(trans->set, &elem)) + atomic_dec(&trans->set->nelems); + + nf_tables_set_elem_destroy(&ctx, trans->set, elem.priv); + } + + nft_trans_gc_destroy(trans); +} + +static bool nft_trans_gc_work_done(struct nft_trans_gc *trans) +{ + struct nftables_pernet *nft_net; + struct nft_ctx ctx = {}; + + nft_net = nft_pernet(trans->net); + + mutex_lock(&nft_net->commit_mutex); + + /* Check for race with transaction, otherwise this batch refers to + * stale objects that might not be there anymore. Skip transaction if + * set has been destroyed from control plane transaction in case gc + * worker loses race. + */ + if (READ_ONCE(nft_net->gc_seq) != trans->seq || trans->set->dead) { + mutex_unlock(&nft_net->commit_mutex); + return false; + } + + ctx.net = trans->net; + ctx.table = trans->set->table; + + nft_trans_gc_setelem_remove(&ctx, trans); + mutex_unlock(&nft_net->commit_mutex); + + return true; +} + +static void nft_trans_gc_work(struct work_struct *work) +{ + struct nft_trans_gc *trans, *next; + LIST_HEAD(trans_gc_list); + + spin_lock(&nf_tables_destroy_list_lock); + list_splice_init(&nf_tables_gc_list, &trans_gc_list); + spin_unlock(&nf_tables_destroy_list_lock); + + list_for_each_entry_safe(trans, next, &trans_gc_list, list) { + list_del(&trans->list); + if (!nft_trans_gc_work_done(trans)) { + nft_trans_gc_destroy(trans); + continue; + } + call_rcu(&trans->rcu, nft_trans_gc_trans_free); + } +} + +struct nft_trans_gc *nft_trans_gc_alloc(struct nft_set *set, + unsigned int gc_seq, gfp_t gfp) +{ + struct net *net = read_pnet(&set->net); + struct nft_trans_gc *trans; + + trans = kzalloc(sizeof(*trans), gfp); + if (!trans) + return NULL; + + refcount_inc(&set->refs); + trans->set = set; + trans->net = get_net(net); + trans->seq = gc_seq; + + return trans; +} + +void nft_trans_gc_elem_add(struct nft_trans_gc *trans, void *priv) +{ + trans->priv[trans->count++] = priv; +} + +static void nft_trans_gc_queue_work(struct nft_trans_gc *trans) +{ + spin_lock(&nf_tables_gc_list_lock); + list_add_tail(&trans->list, &nf_tables_gc_list); + spin_unlock(&nf_tables_gc_list_lock); + + schedule_work(&trans_gc_work); +} + +static int nft_trans_gc_space(struct nft_trans_gc *trans) +{ + return NFT_TRANS_GC_BATCHCOUNT - trans->count; +} + +struct nft_trans_gc *nft_trans_gc_queue_async(struct nft_trans_gc *gc, + unsigned int gc_seq, gfp_t gfp) +{ + if (nft_trans_gc_space(gc)) + return gc; + + nft_trans_gc_queue_work(gc); + + return nft_trans_gc_alloc(gc->set, gc_seq, gfp); +} + +void nft_trans_gc_queue_async_done(struct nft_trans_gc *trans) +{ + if (trans->count == 0) { + nft_trans_gc_destroy(trans); + return; + } + + nft_trans_gc_queue_work(trans); +} + +struct nft_trans_gc 
*nft_trans_gc_queue_sync(struct nft_trans_gc *gc, gfp_t gfp) +{ + if (WARN_ON_ONCE(!lockdep_commit_lock_is_held(gc->net))) + return NULL; + + if (nft_trans_gc_space(gc)) + return gc; + + call_rcu(&gc->rcu, nft_trans_gc_trans_free); + + return nft_trans_gc_alloc(gc->set, 0, gfp); +} + +void nft_trans_gc_queue_sync_done(struct nft_trans_gc *trans) +{ + WARN_ON_ONCE(!lockdep_commit_lock_is_held(trans->net)); + + if (trans->count == 0) { + nft_trans_gc_destroy(trans); + return; + } + + call_rcu(&trans->rcu, nft_trans_gc_trans_free); +} + +struct nft_trans_gc *nft_trans_gc_catchall(struct nft_trans_gc *gc, + unsigned int gc_seq) +{ + struct nft_set_elem_catchall *catchall; + const struct nft_set *set = gc->set; + struct nft_set_ext *ext; + + list_for_each_entry_rcu(catchall, &set->catchall_list, list) { + ext = nft_set_elem_ext(set, catchall->elem); + + if (!nft_set_elem_expired(ext)) + continue; + if (nft_set_elem_is_dead(ext)) + goto dead_elem; + + nft_set_elem_dead(ext); +dead_elem: + gc = nft_trans_gc_queue_async(gc, gc_seq, GFP_ATOMIC); + if (!gc) + return NULL; + + nft_trans_gc_elem_add(gc, catchall->elem); + } + + return gc; +} + static void nf_tables_module_autoload_cleanup(struct net *net) { struct nftables_pernet *nft_net = nft_pernet(net); @@ -9044,11 +9255,11 @@ static int nf_tables_commit(struct net *net, struct sk_buff *skb) { struct nftables_pernet *nft_net = nft_pernet(net); struct nft_trans *trans, *next; + unsigned int base_seq, gc_seq; LIST_HEAD(set_update_list); struct nft_trans_elem *te; struct nft_chain *chain; struct nft_table *table; - unsigned int base_seq; LIST_HEAD(adl); int err;
@@ -9125,6 +9336,10 @@ static int nf_tables_commit(struct net *net, struct sk_buff *skb)
WRITE_ONCE(nft_net->base_seq, base_seq);
+ /* Bump gc counter, it becomes odd, this is the busy mark. */ + gc_seq = READ_ONCE(nft_net->gc_seq); + WRITE_ONCE(nft_net->gc_seq, ++gc_seq); + /* step 3. Start new generation, rules_gen_X now in use. */ net->nft.gencursor = nft_gencursor_next(net);
@@ -9213,6 +9428,7 @@ static int nf_tables_commit(struct net *net, struct sk_buff *skb) nft_trans_destroy(trans); break; case NFT_MSG_DELSET: + nft_trans_set(trans)->dead = 1; list_del_rcu(&nft_trans_set(trans)->list); nf_tables_set_notify(&trans->ctx, nft_trans_set(trans), NFT_MSG_DELSET, GFP_KERNEL); @@ -9312,6 +9528,8 @@ static int nf_tables_commit(struct net *net, struct sk_buff *skb) nft_commit_notify(net, NETLINK_CB(skb).portid); nf_tables_gen_notify(net, skb, NFT_MSG_NEWGEN); nf_tables_commit_audit_log(&adl, nft_net->base_seq); + + WRITE_ONCE(nft_net->gc_seq, ++gc_seq); nf_tables_commit_release(net);
return 0; @@ -10343,6 +10561,7 @@ static int __net_init nf_tables_init_net(struct net *net) INIT_LIST_HEAD(&nft_net->notify_list); mutex_init(&nft_net->commit_mutex); nft_net->base_seq = 1; + nft_net->gc_seq = 0;
return 0; } @@ -10371,10 +10590,16 @@ static void __net_exit nf_tables_exit_net(struct net *net) WARN_ON_ONCE(!list_empty(&nft_net->notify_list)); }
+static void nf_tables_exit_batch(struct list_head *net_exit_list) +{ + flush_work(&trans_gc_work); +} + static struct pernet_operations nf_tables_net_ops = { .init = nf_tables_init_net, .pre_exit = nf_tables_pre_exit_net, .exit = nf_tables_exit_net, + .exit_batch = nf_tables_exit_batch, .id = &nf_tables_net_id, .size = sizeof(struct nftables_pernet), }; @@ -10446,6 +10671,7 @@ static void __exit nf_tables_module_exit(void) nft_chain_filter_fini(); nft_chain_route_fini(); unregister_pernet_subsys(&nf_tables_net_ops); + cancel_work_sync(&trans_gc_work); cancel_work_sync(&trans_destroy_work); rcu_barrier(); rhltable_destroy(&nft_objname_ht);
Hi Greg,
On Wed, Sep 20, 2023 at 01:32:21PM +0200, Greg Kroah-Hartman wrote:
5.15-stable review patch. If anyone has any objections, please let me know.
Please keep this back from 5.15; I am preparing a more complete patch series which includes follow-up fixes on top of this one.
Thanks.
From: Pablo Neira Ayuso pablo@netfilter.org
[ Upstream commit 5f68718b34a531a556f2f50300ead2862278da26 ]
The set types rhashtable and rbtree use a GC worker to reclaim memory.
From system work queue, in periodic intervals, a scan of the table is
done.
The major caveat here is that the nft transaction mutex is not held. This causes a race between control plane and GC when they attempt to delete the same element.
We cannot grab the netlink mutex from the work queue, because the control plane has to wait for the GC work queue in case the set is to be removed, so we get following deadlock:
cpu 1 cpu2 GC work transaction comes in , lock nft mutex `acquire nft mutex // BLOCKS transaction asks to remove the set set destruction calls cancel_work_sync()
cancel_work_sync will now block forever, because it is waiting for the mutex the caller already owns.
This patch adds a new API that deals with garbage collection in two steps:
Lockless GC of expired elements sets on the NFT_SET_ELEM_DEAD_BIT so they are not visible via lookup. Annotate current GC sequence in the GC transaction. Enqueue GC transaction work as soon as it is full. If ruleset is updated, then GC transaction is aborted and retried later.
GC work grabs the mutex. If GC sequence has changed then this GC transaction lost race with control plane, abort it as it contains stale references to objects and let GC try again later. If the ruleset is intact, then this GC transaction deactivates and removes the elements and it uses call_rcu() to destroy elements.
Note that no elements are removed from GC lockless path, the _DEAD bit is set and pointers are collected. GC catchall does not remove the elements anymore too. There is a new set->dead flag that is set on to abort the GC transaction to deal with set->ops->destroy() path which removes the remaining elements in the set from commit_release, where no mutex is held.
To deal with GC when mutex is held, which allows safe deactivate and removal, add sync GC API which releases the set element object via call_rcu(). This is used by rbtree and pipapo backends which also perform garbage collection from control plane path.
Since element removal from sets can happen from control plane and element garbage collection/timeout, it is necessary to keep the set structure alive until all elements have been deactivated and destroyed.
We cannot do a cancel_work_sync or flush_work in nft_set_destroy because its called with the transaction mutex held, but the aforementioned async work queue might be blocked on the very mutex that nft_set_destroy() callchain is sitting on.
This gives us the choice of ABBA deadlock or UaF.
To avoid both, add set->refs refcount_t member. The GC API can then increment the set refcount and release it once the elements have been free'd.
Set backends are adapted to use the GC transaction API in a follow up patch entitled:
("netfilter: nf_tables: use gc transaction API in set backends")
This is joint work with Florian Westphal.
Fixes: cfed7e1b1f8e ("netfilter: nf_tables: add set garbage collection helpers") Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org
include/net/netfilter/nf_tables.h | 64 +++++++- net/netfilter/nf_tables_api.c | 248 ++++++++++++++++++++++++++++-- 2 files changed, 300 insertions(+), 12 deletions(-)
diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h index b8d967e0eb1e2..a6bf58316a5d8 100644 --- a/include/net/netfilter/nf_tables.h +++ b/include/net/netfilter/nf_tables.h @@ -477,6 +477,7 @@ struct nft_set_elem_expr {
- @list: table set list node
- @bindings: list of set bindings
- @refs: internal refcounting for async set destruction
- @table: table this set belongs to
- @net: netnamespace this set belongs to
- @name: name of the set
@@ -506,6 +507,7 @@ struct nft_set_elem_expr { struct nft_set { struct list_head list; struct list_head bindings;
- refcount_t refs; struct nft_table *table; possible_net_t net; char *name;
@@ -527,7 +529,8 @@ struct nft_set { struct list_head pending_update; /* runtime data below here */ const struct nft_set_ops *ops ____cacheline_aligned;
- u16 flags:14,
- u16 flags:13,
u8 klen; u8 dlen;dead:1, genmask:2;
@@ -1527,6 +1530,32 @@ static inline void nft_set_elem_clear_busy(struct nft_set_ext *ext) clear_bit(NFT_SET_ELEM_BUSY_BIT, word); } +#define NFT_SET_ELEM_DEAD_MASK (1 << 3)
+#if defined(__LITTLE_ENDIAN_BITFIELD) +#define NFT_SET_ELEM_DEAD_BIT 3 +#elif defined(__BIG_ENDIAN_BITFIELD) +#define NFT_SET_ELEM_DEAD_BIT (BITS_PER_LONG - BITS_PER_BYTE + 3) +#else +#error +#endif
+static inline void nft_set_elem_dead(struct nft_set_ext *ext) +{
- unsigned long *word = (unsigned long *)ext;
- BUILD_BUG_ON(offsetof(struct nft_set_ext, genmask) != 0);
- set_bit(NFT_SET_ELEM_DEAD_BIT, word);
+}
+static inline int nft_set_elem_is_dead(const struct nft_set_ext *ext) +{
- unsigned long *word = (unsigned long *)ext;
- BUILD_BUG_ON(offsetof(struct nft_set_ext, genmask) != 0);
- return test_bit(NFT_SET_ELEM_DEAD_BIT, word);
+}
/**
- struct nft_trans - nf_tables object update in transaction
@@ -1658,6 +1687,38 @@ struct nft_trans_flowtable { #define nft_trans_flowtable_flags(trans) \ (((struct nft_trans_flowtable *)trans->data)->flags) +#define NFT_TRANS_GC_BATCHCOUNT 256
+struct nft_trans_gc {
- struct list_head list;
- struct net *net;
- struct nft_set *set;
- u32 seq;
- u8 count;
- void *priv[NFT_TRANS_GC_BATCHCOUNT];
- struct rcu_head rcu;
+};
+struct nft_trans_gc *nft_trans_gc_alloc(struct nft_set *set,
unsigned int gc_seq, gfp_t gfp);
+void nft_trans_gc_destroy(struct nft_trans_gc *trans);
+struct nft_trans_gc *nft_trans_gc_queue_async(struct nft_trans_gc *gc,
unsigned int gc_seq, gfp_t gfp);
+void nft_trans_gc_queue_async_done(struct nft_trans_gc *gc);
+struct nft_trans_gc *nft_trans_gc_queue_sync(struct nft_trans_gc *gc, gfp_t gfp); +void nft_trans_gc_queue_sync_done(struct nft_trans_gc *trans);
+void nft_trans_gc_elem_add(struct nft_trans_gc *gc, void *priv);
+struct nft_trans_gc *nft_trans_gc_catchall(struct nft_trans_gc *gc,
unsigned int gc_seq);
+void nft_setelem_data_deactivate(const struct net *net,
const struct nft_set *set,
struct nft_set_elem *elem);
int __init nft_chain_filter_init(void); void nft_chain_filter_fini(void); @@ -1684,6 +1745,7 @@ struct nftables_pernet { struct mutex commit_mutex; u64 table_handle; unsigned int base_seq;
- unsigned int gc_seq;
}; extern unsigned int nf_tables_net_id; diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index dde19be41610d..2333f5da1eb97 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -31,7 +31,9 @@ static LIST_HEAD(nf_tables_expressions); static LIST_HEAD(nf_tables_objects); static LIST_HEAD(nf_tables_flowtables); static LIST_HEAD(nf_tables_destroy_list); +static LIST_HEAD(nf_tables_gc_list); static DEFINE_SPINLOCK(nf_tables_destroy_list_lock); +static DEFINE_SPINLOCK(nf_tables_gc_list_lock); enum { NFT_VALIDATE_SKIP = 0, @@ -120,6 +122,9 @@ static void nft_validate_state_update(struct nft_table *table, u8 new_validate_s static void nf_tables_trans_destroy_work(struct work_struct *w); static DECLARE_WORK(trans_destroy_work, nf_tables_trans_destroy_work); +static void nft_trans_gc_work(struct work_struct *work); +static DECLARE_WORK(trans_gc_work, nft_trans_gc_work);
static void nft_ctx_init(struct nft_ctx *ctx, struct net *net, const struct sk_buff *skb, @@ -581,10 +586,6 @@ static int nft_trans_set_add(const struct nft_ctx *ctx, int msg_type, return __nft_trans_set_add(ctx, msg_type, set, NULL); } -static void nft_setelem_data_deactivate(const struct net *net,
const struct nft_set *set,
struct nft_set_elem *elem);
static int nft_mapelem_deactivate(const struct nft_ctx *ctx, struct nft_set *set, const struct nft_set_iter *iter, @@ -4756,6 +4757,7 @@ static int nf_tables_newset(struct sk_buff *skb, const struct nfnl_info *info, INIT_LIST_HEAD(&set->bindings); INIT_LIST_HEAD(&set->catchall_list);
+	refcount_set(&set->refs, 1);
 	set->table = table;
 	write_pnet(&set->net, net);
 	set->ops = ops;
@@ -4823,6 +4825,14 @@ static void nft_set_catchall_destroy(const struct nft_ctx *ctx,
 	}
 }
 
+static void nft_set_put(struct nft_set *set)
+{
+	if (refcount_dec_and_test(&set->refs)) {
+		kfree(set->name);
+		kvfree(set);
+	}
+}
+
 static void nft_set_destroy(const struct nft_ctx *ctx, struct nft_set *set)
 {
 	int i;
@@ -4835,8 +4845,7 @@ static void nft_set_destroy(const struct nft_ctx *ctx, struct nft_set *set)
 	set->ops->destroy(ctx, set);
 	nft_set_catchall_destroy(ctx, set);
 
-	kfree(set->name);
-	kvfree(set);
+	nft_set_put(set);
 }
 
 static int nf_tables_delset(struct sk_buff *skb, const struct nfnl_info *info,
@@ -5901,7 +5910,8 @@ struct nft_set_ext *nft_set_catchall_lookup(const struct net *net,
 	list_for_each_entry_rcu(catchall, &set->catchall_list, list) {
 		ext = nft_set_elem_ext(set, catchall->elem);
 		if (nft_set_elem_active(ext, genmask) &&
-		    !nft_set_elem_expired(ext))
+		    !nft_set_elem_expired(ext) &&
+		    !nft_set_elem_is_dead(ext))
 			return ext;
 	}
@@ -6545,9 +6555,9 @@ static void nft_setelem_data_activate(const struct net *net, nft_use_inc_restore(&(*nft_set_ext_obj(ext))->use); } -static void nft_setelem_data_deactivate(const struct net *net,
-					const struct nft_set *set,
-					struct nft_set_elem *elem)
+void nft_setelem_data_deactivate(const struct net *net,
+				 const struct nft_set *set,
+				 struct nft_set_elem *elem)
 {
 	const struct nft_set_ext *ext = nft_set_elem_ext(set, elem->priv);
 
@@ -8882,6 +8892,207 @@ void nft_chain_del(struct nft_chain *chain)
 	list_del_rcu(&chain->list);
 }
 
+static void nft_trans_gc_setelem_remove(struct nft_ctx *ctx,
+					struct nft_trans_gc *trans)
+{
+	void **priv = trans->priv;
+	unsigned int i;
+
+	for (i = 0; i < trans->count; i++) {
+		struct nft_set_elem elem = {
+			.priv = priv[i],
+		};
+
+		nft_setelem_data_deactivate(ctx->net, trans->set, &elem);
+		nft_setelem_remove(ctx->net, trans->set, &elem);
+	}
+}
+
+void nft_trans_gc_destroy(struct nft_trans_gc *trans)
+{
+	nft_set_put(trans->set);
+	put_net(trans->net);
+	kfree(trans);
+}
+
+static void nft_trans_gc_trans_free(struct rcu_head *rcu)
+{
+	struct nft_set_elem elem = {};
+	struct nft_trans_gc *trans;
+	struct nft_ctx ctx = {};
+	unsigned int i;
+
+	trans = container_of(rcu, struct nft_trans_gc, rcu);
+	ctx.net = read_pnet(&trans->set->net);
+
+	for (i = 0; i < trans->count; i++) {
+		elem.priv = trans->priv[i];
+		if (!nft_setelem_is_catchall(trans->set, &elem))
+			atomic_dec(&trans->set->nelems);
+
+		nf_tables_set_elem_destroy(&ctx, trans->set, elem.priv);
+	}
+
+	nft_trans_gc_destroy(trans);
+}
+
+static bool nft_trans_gc_work_done(struct nft_trans_gc *trans)
+{
+	struct nftables_pernet *nft_net;
+	struct nft_ctx ctx = {};
+
+	nft_net = nft_pernet(trans->net);
+
+	mutex_lock(&nft_net->commit_mutex);
+
+	/* Check for race with transaction, otherwise this batch refers to
+	 * stale objects that might not be there anymore. Skip transaction if
+	 * set has been destroyed from control plane transaction in case gc
+	 * worker loses race.
+	 */
+	if (READ_ONCE(nft_net->gc_seq) != trans->seq || trans->set->dead) {
+		mutex_unlock(&nft_net->commit_mutex);
+		return false;
+	}
+
+	ctx.net = trans->net;
+	ctx.table = trans->set->table;
+
+	nft_trans_gc_setelem_remove(&ctx, trans);
+	mutex_unlock(&nft_net->commit_mutex);
+
+	return true;
+}
+
+static void nft_trans_gc_work(struct work_struct *work)
+{
+	struct nft_trans_gc *trans, *next;
+	LIST_HEAD(trans_gc_list);
+
+	spin_lock(&nf_tables_destroy_list_lock);
+	list_splice_init(&nf_tables_gc_list, &trans_gc_list);
+	spin_unlock(&nf_tables_destroy_list_lock);
+
+	list_for_each_entry_safe(trans, next, &trans_gc_list, list) {
+		list_del(&trans->list);
+		if (!nft_trans_gc_work_done(trans)) {
+			nft_trans_gc_destroy(trans);
+			continue;
+		}
+
+		call_rcu(&trans->rcu, nft_trans_gc_trans_free);
+	}
+}
+
+struct nft_trans_gc *nft_trans_gc_alloc(struct nft_set *set,
+					unsigned int gc_seq, gfp_t gfp)
+{
+	struct net *net = read_pnet(&set->net);
+	struct nft_trans_gc *trans;
+
+	trans = kzalloc(sizeof(*trans), gfp);
+	if (!trans)
+		return NULL;
+
+	refcount_inc(&set->refs);
+	trans->set = set;
+	trans->net = get_net(net);
+	trans->seq = gc_seq;
+
+	return trans;
+}
+
+void nft_trans_gc_elem_add(struct nft_trans_gc *trans, void *priv)
+{
+	trans->priv[trans->count++] = priv;
+}
+
+static void nft_trans_gc_queue_work(struct nft_trans_gc *trans)
+{
+	spin_lock(&nf_tables_gc_list_lock);
+	list_add_tail(&trans->list, &nf_tables_gc_list);
+	spin_unlock(&nf_tables_gc_list_lock);
+
+	schedule_work(&trans_gc_work);
+}
+
+static int nft_trans_gc_space(struct nft_trans_gc *trans)
+{
+	return NFT_TRANS_GC_BATCHCOUNT - trans->count;
+}
+
+struct nft_trans_gc *nft_trans_gc_queue_async(struct nft_trans_gc *gc,
+					      unsigned int gc_seq, gfp_t gfp)
+{
+	if (nft_trans_gc_space(gc))
+		return gc;
+
+	nft_trans_gc_queue_work(gc);
+
+	return nft_trans_gc_alloc(gc->set, gc_seq, gfp);
+}
+
+void nft_trans_gc_queue_async_done(struct nft_trans_gc *trans)
+{
+	if (trans->count == 0) {
+		nft_trans_gc_destroy(trans);
+		return;
+	}
+
+	nft_trans_gc_queue_work(trans);
+}
+
+struct nft_trans_gc *nft_trans_gc_queue_sync(struct nft_trans_gc *gc, gfp_t gfp)
+{
+	if (WARN_ON_ONCE(!lockdep_commit_lock_is_held(gc->net)))
+		return NULL;
+
+	if (nft_trans_gc_space(gc))
+		return gc;
+
+	call_rcu(&gc->rcu, nft_trans_gc_trans_free);
+
+	return nft_trans_gc_alloc(gc->set, 0, gfp);
+}
+
+void nft_trans_gc_queue_sync_done(struct nft_trans_gc *trans)
+{
+	WARN_ON_ONCE(!lockdep_commit_lock_is_held(trans->net));
+
+	if (trans->count == 0) {
+		nft_trans_gc_destroy(trans);
+		return;
+	}
+
+	call_rcu(&trans->rcu, nft_trans_gc_trans_free);
+}
+
+struct nft_trans_gc *nft_trans_gc_catchall(struct nft_trans_gc *gc,
+					   unsigned int gc_seq)
+{
+	struct nft_set_elem_catchall *catchall;
+	const struct nft_set *set = gc->set;
+	struct nft_set_ext *ext;
+
+	list_for_each_entry_rcu(catchall, &set->catchall_list, list) {
+		ext = nft_set_elem_ext(set, catchall->elem);
+
+		if (!nft_set_elem_expired(ext))
+			continue;
+		if (nft_set_elem_is_dead(ext))
+			goto dead_elem;
+
+		nft_set_elem_dead(ext);
+dead_elem:
+		gc = nft_trans_gc_queue_async(gc, gc_seq, GFP_ATOMIC);
+		if (!gc)
+			return NULL;
+
+		nft_trans_gc_elem_add(gc, catchall->elem);
+	}
+
+	return gc;
+}
+
 static void nf_tables_module_autoload_cleanup(struct net *net)
 {
 	struct nftables_pernet *nft_net = nft_pernet(net);
@@ -9044,11 +9255,11 @@ static int nf_tables_commit(struct net *net, struct sk_buff *skb)
 {
 	struct nftables_pernet *nft_net = nft_pernet(net);
 	struct nft_trans *trans, *next;
+	unsigned int base_seq, gc_seq;
 	LIST_HEAD(set_update_list);
 	struct nft_trans_elem *te;
 	struct nft_chain *chain;
 	struct nft_table *table;
-	unsigned int base_seq;
 	LIST_HEAD(adl);
 	int err;
@@ -9125,6 +9336,10 @@ static int nf_tables_commit(struct net *net, struct sk_buff *skb)
 
 	WRITE_ONCE(nft_net->base_seq, base_seq);
 
+	/* Bump gc counter, it becomes odd, this is the busy mark. */
+	gc_seq = READ_ONCE(nft_net->gc_seq);
+	WRITE_ONCE(nft_net->gc_seq, ++gc_seq);
+
 	/* step 3. Start new generation, rules_gen_X now in use. */
 	net->nft.gencursor = nft_gencursor_next(net);
@@ -9213,6 +9428,7 @@ static int nf_tables_commit(struct net *net, struct sk_buff *skb)
 			nft_trans_destroy(trans);
 			break;
 		case NFT_MSG_DELSET:
+			nft_trans_set(trans)->dead = 1;
 			list_del_rcu(&nft_trans_set(trans)->list);
 			nf_tables_set_notify(&trans->ctx, nft_trans_set(trans),
 					     NFT_MSG_DELSET, GFP_KERNEL);
@@ -9312,6 +9528,8 @@ static int nf_tables_commit(struct net *net, struct sk_buff *skb)
 	nft_commit_notify(net, NETLINK_CB(skb).portid);
 	nf_tables_gen_notify(net, skb, NFT_MSG_NEWGEN);
 	nf_tables_commit_audit_log(&adl, nft_net->base_seq);
+
+	WRITE_ONCE(nft_net->gc_seq, ++gc_seq);
 	nf_tables_commit_release(net);
 	return 0;
 
@@ -10343,6 +10561,7 @@ static int __net_init nf_tables_init_net(struct net *net)
 	INIT_LIST_HEAD(&nft_net->notify_list);
 	mutex_init(&nft_net->commit_mutex);
 	nft_net->base_seq = 1;
+	nft_net->gc_seq = 0;
 	return 0;
 }
 
@@ -10371,10 +10590,16 @@ static void __net_exit nf_tables_exit_net(struct net *net)
 	WARN_ON_ONCE(!list_empty(&nft_net->notify_list));
 }
 
+static void nf_tables_exit_batch(struct list_head *net_exit_list)
+{
+	flush_work(&trans_gc_work);
+}
+
 static struct pernet_operations nf_tables_net_ops = {
 	.init		= nf_tables_init_net,
 	.pre_exit	= nf_tables_pre_exit_net,
 	.exit		= nf_tables_exit_net,
+	.exit_batch	= nf_tables_exit_batch,
 	.id		= &nf_tables_net_id,
 	.size		= sizeof(struct nftables_pernet),
 };
@@ -10446,6 +10671,7 @@ static void __exit nf_tables_module_exit(void)
 	nft_chain_filter_fini();
 	nft_chain_route_fini();
 	unregister_pernet_subsys(&nf_tables_net_ops);
+	cancel_work_sync(&trans_gc_work);
 	cancel_work_sync(&trans_destroy_work);
 	rcu_barrier();
 	rhltable_destroy(&nft_objname_ht);
-- 2.40.1
On Wed, Sep 20, 2023 at 04:02:29PM +0200, Pablo Neira Ayuso wrote:
Hi Greg,
On Wed, Sep 20, 2023 at 01:32:21PM +0200, Greg Kroah-Hartman wrote:
5.15-stable review patch. If anyone has any objections, please let me know.
Please keep this back from 5.15; I am preparing a more complete patch series which includes follow-up fixes on top of this one.
Thanks, I've dropped all of these netfilter patches from this tree/queue now.
greg k-h
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pablo Neira Ayuso pablo@netfilter.org
[ Upstream commit f6c383b8c31a93752a52697f8430a71dcbc46adf ]
Use the GC transaction API to replace the old and buggy gc API and the busy mark approach.
No set elements are removed from the async garbage collection path anymore; instead, the _DEAD bit is set so the set element is no longer visible from the lookup path. Async GC enqueues transaction work that might be aborted and retried later.
The rbtree and pipapo set backends do not set the _DEAD bit from the sync GC path, since this runs in the control plane path where the mutex is held. In this case, set elements are deactivated, removed and then released via RCU callback; sync GC never fails.
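As a rough illustration of how a set backend is expected to use the new API described above (loosely modelled on the nft_rhash_gc() hunk later in this patch), the async GC path boils down to the sketch below. The for_each_set_element() walk, nft_net, ext and elem_priv are purely illustrative placeholders for backend-specific code, not part of the patch:

	struct nft_trans_gc *gc;
	unsigned int gc_seq;

	gc_seq = READ_ONCE(nft_net->gc_seq);
	gc = nft_trans_gc_alloc(set, gc_seq, GFP_KERNEL);
	if (!gc)
		return;

	for_each_set_element(set, ext, elem_priv) {	/* backend-specific walk */
		/* Ruleset has been updated meanwhile, defer to the next GC run. */
		if (READ_ONCE(nft_net->gc_seq) != gc_seq) {
			nft_trans_gc_destroy(gc);
			gc = NULL;
			goto try_later;
		}

		if (!nft_set_elem_expired(ext))
			continue;

		/* Hide the element from the lookup path instead of removing it. */
		nft_set_elem_dead(ext);

		gc = nft_trans_gc_queue_async(gc, gc_seq, GFP_ATOMIC);
		if (!gc)
			goto try_later;

		nft_trans_gc_elem_add(gc, elem_priv);
	}

	gc = nft_trans_gc_catchall(gc, gc_seq);

try_later:
	if (gc)
		nft_trans_gc_queue_async_done(gc);	/* worker removes and frees via RCU */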
Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges") Fixes: 8d8540c4f5e0 ("netfilter: nft_set_rbtree: add timeout support") Fixes: 9d0982927e79 ("netfilter: nft_hash: add support for timeouts") Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_tables_api.c | 7 +- net/netfilter/nft_set_hash.c | 77 +++++++++++------- net/netfilter/nft_set_pipapo.c | 48 ++++++++--- net/netfilter/nft_set_rbtree.c | 144 ++++++++++++++++++++------------- 4 files changed, 173 insertions(+), 103 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index 2333f5da1eb97..616fd19d5cedc 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -6003,7 +6003,6 @@ static void nft_setelem_activate(struct net *net, struct nft_set *set,
if (nft_setelem_is_catchall(set, elem)) { nft_set_elem_change_active(net, set, ext); - nft_set_elem_clear_busy(ext); } else { set->ops->activate(net, set, elem); } @@ -6018,8 +6017,7 @@ static int nft_setelem_catchall_deactivate(const struct net *net,
list_for_each_entry(catchall, &set->catchall_list, list) { ext = nft_set_elem_ext(set, catchall->elem); - if (!nft_is_active(net, ext) || - nft_set_elem_mark_busy(ext)) + if (!nft_is_active(net, ext)) continue;
kfree(elem->priv); @@ -6719,8 +6717,7 @@ static int nft_set_catchall_flush(const struct nft_ctx *ctx,
list_for_each_entry_rcu(catchall, &set->catchall_list, list) { ext = nft_set_elem_ext(set, catchall->elem); - if (!nft_set_elem_active(ext, genmask) || - nft_set_elem_mark_busy(ext)) + if (!nft_set_elem_active(ext, genmask)) continue;
elem.priv = catchall->elem; diff --git a/net/netfilter/nft_set_hash.c b/net/netfilter/nft_set_hash.c index 0b73cb0e752f7..960cbd56c0406 100644 --- a/net/netfilter/nft_set_hash.c +++ b/net/netfilter/nft_set_hash.c @@ -59,6 +59,8 @@ static inline int nft_rhash_cmp(struct rhashtable_compare_arg *arg,
if (memcmp(nft_set_ext_key(&he->ext), x->key, x->set->klen)) return 1; + if (nft_set_elem_is_dead(&he->ext)) + return 1; if (nft_set_elem_expired(&he->ext)) return 1; if (!nft_set_elem_active(&he->ext, x->genmask)) @@ -188,7 +190,6 @@ static void nft_rhash_activate(const struct net *net, const struct nft_set *set, struct nft_rhash_elem *he = elem->priv;
nft_set_elem_change_active(net, set, &he->ext); - nft_set_elem_clear_busy(&he->ext); }
static bool nft_rhash_flush(const struct net *net, @@ -196,12 +197,9 @@ static bool nft_rhash_flush(const struct net *net, { struct nft_rhash_elem *he = priv;
- if (!nft_set_elem_mark_busy(&he->ext) || - !nft_is_active(net, &he->ext)) { - nft_set_elem_change_active(net, set, &he->ext); - return true; - } - return false; + nft_set_elem_change_active(net, set, &he->ext); + + return true; }
static void *nft_rhash_deactivate(const struct net *net, @@ -218,9 +216,8 @@ static void *nft_rhash_deactivate(const struct net *net,
rcu_read_lock(); he = rhashtable_lookup(&priv->ht, &arg, nft_rhash_params); - if (he != NULL && - !nft_rhash_flush(net, set, he)) - he = NULL; + if (he) + nft_set_elem_change_active(net, set, &he->ext);
rcu_read_unlock();
@@ -314,25 +311,48 @@ static bool nft_rhash_expr_needs_gc_run(const struct nft_set *set,
static void nft_rhash_gc(struct work_struct *work) { + struct nftables_pernet *nft_net; struct nft_set *set; struct nft_rhash_elem *he; struct nft_rhash *priv; - struct nft_set_gc_batch *gcb = NULL; struct rhashtable_iter hti; + struct nft_trans_gc *gc; + struct net *net; + u32 gc_seq;
priv = container_of(work, struct nft_rhash, gc_work.work); set = nft_set_container_of(priv); + net = read_pnet(&set->net); + nft_net = nft_pernet(net); + gc_seq = READ_ONCE(nft_net->gc_seq); + + gc = nft_trans_gc_alloc(set, gc_seq, GFP_KERNEL); + if (!gc) + goto done;
rhashtable_walk_enter(&priv->ht, &hti); rhashtable_walk_start(&hti);
while ((he = rhashtable_walk_next(&hti))) { if (IS_ERR(he)) { - if (PTR_ERR(he) != -EAGAIN) - break; + if (PTR_ERR(he) != -EAGAIN) { + nft_trans_gc_destroy(gc); + gc = NULL; + goto try_later; + } continue; }
+ /* Ruleset has been updated, try later. */ + if (READ_ONCE(nft_net->gc_seq) != gc_seq) { + nft_trans_gc_destroy(gc); + gc = NULL; + goto try_later; + } + + if (nft_set_elem_is_dead(&he->ext)) + goto dead_elem; + if (nft_set_ext_exists(&he->ext, NFT_SET_EXT_EXPRESSIONS) && nft_rhash_expr_needs_gc_run(set, &he->ext)) goto needs_gc_run; @@ -340,26 +360,26 @@ static void nft_rhash_gc(struct work_struct *work) if (!nft_set_elem_expired(&he->ext)) continue; needs_gc_run: - if (nft_set_elem_mark_busy(&he->ext)) - continue; + nft_set_elem_dead(&he->ext); +dead_elem: + gc = nft_trans_gc_queue_async(gc, gc_seq, GFP_ATOMIC); + if (!gc) + goto try_later;
- gcb = nft_set_gc_batch_check(set, gcb, GFP_ATOMIC); - if (gcb == NULL) - break; - rhashtable_remove_fast(&priv->ht, &he->node, nft_rhash_params); - atomic_dec(&set->nelems); - nft_set_gc_batch_add(gcb, he); + nft_trans_gc_elem_add(gc, he); } + + gc = nft_trans_gc_catchall(gc, gc_seq); + +try_later: + /* catchall list iteration requires rcu read side lock. */ rhashtable_walk_stop(&hti); rhashtable_walk_exit(&hti);
- he = nft_set_catchall_gc(set); - if (he) { - gcb = nft_set_gc_batch_check(set, gcb, GFP_ATOMIC); - if (gcb) - nft_set_gc_batch_add(gcb, he); - } - nft_set_gc_batch_complete(gcb); + if (gc) + nft_trans_gc_queue_async_done(gc); + +done: queue_delayed_work(system_power_efficient_wq, &priv->gc_work, nft_set_gc_interval(set)); } @@ -422,7 +442,6 @@ static void nft_rhash_destroy(const struct nft_ctx *ctx, };
cancel_delayed_work_sync(&priv->gc_work); - rcu_barrier(); rhashtable_free_and_destroy(&priv->ht, nft_rhash_elem_destroy, (void *)&rhash_ctx); } diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c index 8c16681884b7e..0a49c59a22dbd 100644 --- a/net/netfilter/nft_set_pipapo.c +++ b/net/netfilter/nft_set_pipapo.c @@ -1536,16 +1536,34 @@ static void pipapo_drop(struct nft_pipapo_match *m, } }
+static void nft_pipapo_gc_deactivate(struct net *net, struct nft_set *set, + struct nft_pipapo_elem *e) + +{ + struct nft_set_elem elem = { + .priv = e, + }; + + nft_setelem_data_deactivate(net, set, &elem); +} + /** * pipapo_gc() - Drop expired entries from set, destroy start and end elements * @set: nftables API set representation * @m: Matching data */ -static void pipapo_gc(const struct nft_set *set, struct nft_pipapo_match *m) +static void pipapo_gc(const struct nft_set *_set, struct nft_pipapo_match *m) { + struct nft_set *set = (struct nft_set *) _set; struct nft_pipapo *priv = nft_set_priv(set); + struct net *net = read_pnet(&set->net); int rules_f0, first_rule = 0; struct nft_pipapo_elem *e; + struct nft_trans_gc *gc; + + gc = nft_trans_gc_alloc(set, 0, GFP_KERNEL); + if (!gc) + return;
while ((rules_f0 = pipapo_rules_same_key(m->f, first_rule))) { union nft_pipapo_map_bucket rulemap[NFT_PIPAPO_MAX_FIELDS]; @@ -1569,13 +1587,20 @@ static void pipapo_gc(const struct nft_set *set, struct nft_pipapo_match *m) f--; i--; e = f->mt[rulemap[i].to].e; - if (nft_set_elem_expired(&e->ext) && - !nft_set_elem_mark_busy(&e->ext)) { + + /* synchronous gc never fails, there is no need to set on + * NFT_SET_ELEM_DEAD_BIT. + */ + if (nft_set_elem_expired(&e->ext)) { priv->dirty = true; - pipapo_drop(m, rulemap);
- rcu_barrier(); - nft_set_elem_destroy(set, e, true); + gc = nft_trans_gc_queue_sync(gc, GFP_ATOMIC); + if (!gc) + break; + + nft_pipapo_gc_deactivate(net, set, e); + pipapo_drop(m, rulemap); + nft_trans_gc_elem_add(gc, e);
/* And check again current first rule, which is now the * first we haven't checked. @@ -1585,11 +1610,11 @@ static void pipapo_gc(const struct nft_set *set, struct nft_pipapo_match *m) } }
- e = nft_set_catchall_gc(set); - if (e) - nft_set_elem_destroy(set, e, true); - - priv->last_gc = jiffies; + gc = nft_trans_gc_catchall(gc, 0); + if (gc) { + nft_trans_gc_queue_sync_done(gc); + priv->last_gc = jiffies; + } }
/** @@ -1725,7 +1750,6 @@ static void nft_pipapo_activate(const struct net *net, return;
nft_set_elem_change_active(net, set, &e->ext); - nft_set_elem_clear_busy(&e->ext); }
/** diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c index 8d73fffd2d09d..7015153b7be18 100644 --- a/net/netfilter/nft_set_rbtree.c +++ b/net/netfilter/nft_set_rbtree.c @@ -46,6 +46,12 @@ static int nft_rbtree_cmp(const struct nft_set *set, set->klen); }
+static bool nft_rbtree_elem_expired(const struct nft_rbtree_elem *rbe) +{ + return nft_set_elem_expired(&rbe->ext) || + nft_set_elem_is_dead(&rbe->ext); +} + static bool __nft_rbtree_lookup(const struct net *net, const struct nft_set *set, const u32 *key, const struct nft_set_ext **ext, unsigned int seq) @@ -80,7 +86,7 @@ static bool __nft_rbtree_lookup(const struct net *net, const struct nft_set *set continue; }
- if (nft_set_elem_expired(&rbe->ext)) + if (nft_rbtree_elem_expired(rbe)) return false;
if (nft_rbtree_interval_end(rbe)) { @@ -98,7 +104,7 @@ static bool __nft_rbtree_lookup(const struct net *net, const struct nft_set *set
if (set->flags & NFT_SET_INTERVAL && interval != NULL && nft_set_elem_active(&interval->ext, genmask) && - !nft_set_elem_expired(&interval->ext) && + !nft_rbtree_elem_expired(interval) && nft_rbtree_interval_start(interval)) { *ext = &interval->ext; return true; @@ -215,6 +221,18 @@ static void *nft_rbtree_get(const struct net *net, const struct nft_set *set, return rbe; }
+static void nft_rbtree_gc_remove(struct net *net, struct nft_set *set, + struct nft_rbtree *priv, + struct nft_rbtree_elem *rbe) +{ + struct nft_set_elem elem = { + .priv = rbe, + }; + + nft_setelem_data_deactivate(net, set, &elem); + rb_erase(&rbe->node, &priv->root); +} + static int nft_rbtree_gc_elem(const struct nft_set *__set, struct nft_rbtree *priv, struct nft_rbtree_elem *rbe, @@ -222,11 +240,12 @@ static int nft_rbtree_gc_elem(const struct nft_set *__set, { struct nft_set *set = (struct nft_set *)__set; struct rb_node *prev = rb_prev(&rbe->node); + struct net *net = read_pnet(&set->net); struct nft_rbtree_elem *rbe_prev; - struct nft_set_gc_batch *gcb; + struct nft_trans_gc *gc;
- gcb = nft_set_gc_batch_check(set, NULL, GFP_ATOMIC); - if (!gcb) + gc = nft_trans_gc_alloc(set, 0, GFP_ATOMIC); + if (!gc) return -ENOMEM;
/* search for end interval coming before this element. @@ -244,17 +263,28 @@ static int nft_rbtree_gc_elem(const struct nft_set *__set,
if (prev) { rbe_prev = rb_entry(prev, struct nft_rbtree_elem, node); + nft_rbtree_gc_remove(net, set, priv, rbe_prev);
- rb_erase(&rbe_prev->node, &priv->root); - atomic_dec(&set->nelems); - nft_set_gc_batch_add(gcb, rbe_prev); + /* There is always room in this trans gc for this element, + * memory allocation never actually happens, hence, the warning + * splat in such case. No need to set NFT_SET_ELEM_DEAD_BIT, + * this is synchronous gc which never fails. + */ + gc = nft_trans_gc_queue_sync(gc, GFP_ATOMIC); + if (WARN_ON_ONCE(!gc)) + return -ENOMEM; + + nft_trans_gc_elem_add(gc, rbe_prev); }
- rb_erase(&rbe->node, &priv->root); - atomic_dec(&set->nelems); + nft_rbtree_gc_remove(net, set, priv, rbe); + gc = nft_trans_gc_queue_sync(gc, GFP_ATOMIC); + if (WARN_ON_ONCE(!gc)) + return -ENOMEM; + + nft_trans_gc_elem_add(gc, rbe);
- nft_set_gc_batch_add(gcb, rbe); - nft_set_gc_batch_complete(gcb); + nft_trans_gc_queue_sync_done(gc);
return 0; } @@ -482,7 +512,6 @@ static void nft_rbtree_activate(const struct net *net, struct nft_rbtree_elem *rbe = elem->priv;
nft_set_elem_change_active(net, set, &rbe->ext); - nft_set_elem_clear_busy(&rbe->ext); }
static bool nft_rbtree_flush(const struct net *net, @@ -490,12 +519,9 @@ static bool nft_rbtree_flush(const struct net *net, { struct nft_rbtree_elem *rbe = priv;
- if (!nft_set_elem_mark_busy(&rbe->ext) || - !nft_is_active(net, &rbe->ext)) { - nft_set_elem_change_active(net, set, &rbe->ext); - return true; - } - return false; + nft_set_elem_change_active(net, set, &rbe->ext); + + return true; }
static void *nft_rbtree_deactivate(const struct net *net, @@ -572,26 +598,40 @@ static void nft_rbtree_walk(const struct nft_ctx *ctx,
static void nft_rbtree_gc(struct work_struct *work) { - struct nft_rbtree_elem *rbe, *rbe_end = NULL, *rbe_prev = NULL; - struct nft_set_gc_batch *gcb = NULL; + struct nft_rbtree_elem *rbe, *rbe_end = NULL; + struct nftables_pernet *nft_net; struct nft_rbtree *priv; + struct nft_trans_gc *gc; struct rb_node *node; struct nft_set *set; + unsigned int gc_seq; struct net *net; - u8 genmask;
priv = container_of(work, struct nft_rbtree, gc_work.work); set = nft_set_container_of(priv); net = read_pnet(&set->net); - genmask = nft_genmask_cur(net); + nft_net = nft_pernet(net); + gc_seq = READ_ONCE(nft_net->gc_seq); + + gc = nft_trans_gc_alloc(set, gc_seq, GFP_KERNEL); + if (!gc) + goto done;
write_lock_bh(&priv->lock); write_seqcount_begin(&priv->count); for (node = rb_first(&priv->root); node != NULL; node = rb_next(node)) { + + /* Ruleset has been updated, try later. */ + if (READ_ONCE(nft_net->gc_seq) != gc_seq) { + nft_trans_gc_destroy(gc); + gc = NULL; + goto try_later; + } + rbe = rb_entry(node, struct nft_rbtree_elem, node);
- if (!nft_set_elem_active(&rbe->ext, genmask)) - continue; + if (nft_set_elem_is_dead(&rbe->ext)) + goto dead_elem;
/* elements are reversed in the rbtree for historical reasons, * from highest to lowest value, that is why end element is @@ -604,46 +644,36 @@ static void nft_rbtree_gc(struct work_struct *work) if (!nft_set_elem_expired(&rbe->ext)) continue;
- if (nft_set_elem_mark_busy(&rbe->ext)) { - rbe_end = NULL; + nft_set_elem_dead(&rbe->ext); + + if (!rbe_end) continue; - }
- if (rbe_prev) { - rb_erase(&rbe_prev->node, &priv->root); - rbe_prev = NULL; - } - gcb = nft_set_gc_batch_check(set, gcb, GFP_ATOMIC); - if (!gcb) - break; + nft_set_elem_dead(&rbe_end->ext);
- atomic_dec(&set->nelems); - nft_set_gc_batch_add(gcb, rbe); - rbe_prev = rbe; + gc = nft_trans_gc_queue_async(gc, gc_seq, GFP_ATOMIC); + if (!gc) + goto try_later;
- if (rbe_end) { - atomic_dec(&set->nelems); - nft_set_gc_batch_add(gcb, rbe_end); - rb_erase(&rbe_end->node, &priv->root); - rbe_end = NULL; - } - node = rb_next(node); - if (!node) - break; + nft_trans_gc_elem_add(gc, rbe_end); + rbe_end = NULL; +dead_elem: + gc = nft_trans_gc_queue_async(gc, gc_seq, GFP_ATOMIC); + if (!gc) + goto try_later; + + nft_trans_gc_elem_add(gc, rbe); } - if (rbe_prev) - rb_erase(&rbe_prev->node, &priv->root); + + gc = nft_trans_gc_catchall(gc, gc_seq); + +try_later: write_seqcount_end(&priv->count); write_unlock_bh(&priv->lock);
- rbe = nft_set_catchall_gc(set); - if (rbe) { - gcb = nft_set_gc_batch_check(set, gcb, GFP_ATOMIC); - if (gcb) - nft_set_gc_batch_add(gcb, rbe); - } - nft_set_gc_batch_complete(gcb); - + if (gc) + nft_trans_gc_queue_async_done(gc); +done: queue_delayed_work(system_power_efficient_wq, &priv->gc_work, nft_set_gc_interval(set)); }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pablo Neira Ayuso pablo@netfilter.org
[ Upstream commit c92db3030492b8ad1d0faace7a93bbcf53850d0c ]
Set the NFT_SET_ELEM_DEAD_BIT flag on this element instead of performing element removal, which might race with an ongoing transaction. Enable GC when the dynamic flag is set, since dynset deletion requires garbage collection after this patch.
Fixes: d0a8d877da97 ("netfilter: nft_dynset: support for element deletion") Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nft_set_hash.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/net/netfilter/nft_set_hash.c b/net/netfilter/nft_set_hash.c index 960cbd56c0406..8dfa97105d8a3 100644 --- a/net/netfilter/nft_set_hash.c +++ b/net/netfilter/nft_set_hash.c @@ -249,7 +249,9 @@ static bool nft_rhash_delete(const struct nft_set *set, if (he == NULL) return false;
-	return rhashtable_remove_fast(&priv->ht, &he->node, nft_rhash_params) == 0;
+	nft_set_elem_dead(&he->ext);
+
+	return true;
 }
static void nft_rhash_walk(const struct nft_ctx *ctx, struct nft_set *set, @@ -414,7 +416,7 @@ static int nft_rhash_init(const struct nft_set *set, return err;
 	INIT_DEFERRABLE_WORK(&priv->gc_work, nft_rhash_gc);
-	if (set->flags & NFT_SET_TIMEOUT)
+	if (set->flags & (NFT_SET_TIMEOUT | NFT_SET_EVAL))
 		nft_rhash_gc_init(set);
return 0;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pablo Neira Ayuso pablo@netfilter.org
[ Upstream commit a2dd0233cbc4d8a0abb5f64487487ffc9265beb5 ]
Ditch it, it has been replaced by the GC transaction API and has no clients anymore.
Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/netfilter/nf_tables.h | 98 +------------------------------ net/netfilter/nf_tables_api.c | 48 +-------------- 2 files changed, 4 insertions(+), 142 deletions(-)
diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h index a6bf58316a5d8..cd3b5b9db8890 100644 --- a/include/net/netfilter/nf_tables.h +++ b/include/net/netfilter/nf_tables.h @@ -564,7 +564,6 @@ struct nft_set *nft_set_lookup_global(const struct net *net,
struct nft_set_ext *nft_set_catchall_lookup(const struct net *net, const struct nft_set *set); -void *nft_set_catchall_gc(const struct nft_set *set);
static inline unsigned long nft_set_gc_interval(const struct nft_set *set) { @@ -779,62 +778,6 @@ void nft_set_elem_destroy(const struct nft_set *set, void *elem, void nf_tables_set_elem_destroy(const struct nft_ctx *ctx, const struct nft_set *set, void *elem);
-/** - * struct nft_set_gc_batch_head - nf_tables set garbage collection batch - * - * @rcu: rcu head - * @set: set the elements belong to - * @cnt: count of elements - */ -struct nft_set_gc_batch_head { - struct rcu_head rcu; - const struct nft_set *set; - unsigned int cnt; -}; - -#define NFT_SET_GC_BATCH_SIZE ((PAGE_SIZE - \ - sizeof(struct nft_set_gc_batch_head)) / \ - sizeof(void *)) - -/** - * struct nft_set_gc_batch - nf_tables set garbage collection batch - * - * @head: GC batch head - * @elems: garbage collection elements - */ -struct nft_set_gc_batch { - struct nft_set_gc_batch_head head; - void *elems[NFT_SET_GC_BATCH_SIZE]; -}; - -struct nft_set_gc_batch *nft_set_gc_batch_alloc(const struct nft_set *set, - gfp_t gfp); -void nft_set_gc_batch_release(struct rcu_head *rcu); - -static inline void nft_set_gc_batch_complete(struct nft_set_gc_batch *gcb) -{ - if (gcb != NULL) - call_rcu(&gcb->head.rcu, nft_set_gc_batch_release); -} - -static inline struct nft_set_gc_batch * -nft_set_gc_batch_check(const struct nft_set *set, struct nft_set_gc_batch *gcb, - gfp_t gfp) -{ - if (gcb != NULL) { - if (gcb->head.cnt + 1 < ARRAY_SIZE(gcb->elems)) - return gcb; - nft_set_gc_batch_complete(gcb); - } - return nft_set_gc_batch_alloc(set, gfp); -} - -static inline void nft_set_gc_batch_add(struct nft_set_gc_batch *gcb, - void *elem) -{ - gcb->elems[gcb->head.cnt++] = elem; -} - struct nft_expr_ops; /** * struct nft_expr_type - nf_tables expression type @@ -1495,47 +1438,12 @@ static inline void nft_set_elem_change_active(const struct net *net,
#endif /* IS_ENABLED(CONFIG_NF_TABLES) */
-/* - * We use a free bit in the genmask field to indicate the element - * is busy, meaning it is currently being processed either by - * the netlink API or GC. - * - * Even though the genmask is only a single byte wide, this works - * because the extension structure if fully constant once initialized, - * so there are no non-atomic write accesses unless it is already - * marked busy. - */ -#define NFT_SET_ELEM_BUSY_MASK (1 << 2) - -#if defined(__LITTLE_ENDIAN_BITFIELD) -#define NFT_SET_ELEM_BUSY_BIT 2 -#elif defined(__BIG_ENDIAN_BITFIELD) -#define NFT_SET_ELEM_BUSY_BIT (BITS_PER_LONG - BITS_PER_BYTE + 2) -#else -#error -#endif - -static inline int nft_set_elem_mark_busy(struct nft_set_ext *ext) -{ - unsigned long *word = (unsigned long *)ext; - - BUILD_BUG_ON(offsetof(struct nft_set_ext, genmask) != 0); - return test_and_set_bit(NFT_SET_ELEM_BUSY_BIT, word); -} - -static inline void nft_set_elem_clear_busy(struct nft_set_ext *ext) -{ - unsigned long *word = (unsigned long *)ext; - - clear_bit(NFT_SET_ELEM_BUSY_BIT, word); -} - -#define NFT_SET_ELEM_DEAD_MASK (1 << 3) +#define NFT_SET_ELEM_DEAD_MASK (1 << 2)
#if defined(__LITTLE_ENDIAN_BITFIELD) -#define NFT_SET_ELEM_DEAD_BIT 3 +#define NFT_SET_ELEM_DEAD_BIT 2 #elif defined(__BIG_ENDIAN_BITFIELD) -#define NFT_SET_ELEM_DEAD_BIT (BITS_PER_LONG - BITS_PER_BYTE + 3) +#define NFT_SET_ELEM_DEAD_BIT (BITS_PER_LONG - BITS_PER_BYTE + 2) #else #error #endif diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index 616fd19d5cedc..75b04c01f7b81 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -5919,29 +5919,6 @@ struct nft_set_ext *nft_set_catchall_lookup(const struct net *net, } EXPORT_SYMBOL_GPL(nft_set_catchall_lookup);
-void *nft_set_catchall_gc(const struct nft_set *set) -{ - struct nft_set_elem_catchall *catchall, *next; - struct nft_set_ext *ext; - void *elem = NULL; - - list_for_each_entry_safe(catchall, next, &set->catchall_list, list) { - ext = nft_set_elem_ext(set, catchall->elem); - - if (!nft_set_elem_expired(ext) || - nft_set_elem_mark_busy(ext)) - continue; - - elem = catchall->elem; - list_del_rcu(&catchall->list); - kfree_rcu(catchall, rcu); - break; - } - - return elem; -} -EXPORT_SYMBOL_GPL(nft_set_catchall_gc); - static int nft_setelem_catchall_insert(const struct net *net, struct nft_set *set, const struct nft_set_elem *elem, @@ -6406,7 +6383,7 @@ static int nft_add_set_elem(struct nft_ctx *ctx, struct nft_set *set, goto err_elem_expr; }
- ext->genmask = nft_genmask_cur(ctx->net) | NFT_SET_ELEM_BUSY_MASK; + ext->genmask = nft_genmask_cur(ctx->net);
err = nft_setelem_insert(ctx->net, set, &elem, &ext2, flags); if (err) { @@ -6786,29 +6763,6 @@ static int nf_tables_delsetelem(struct sk_buff *skb, return err; }
-void nft_set_gc_batch_release(struct rcu_head *rcu) -{ - struct nft_set_gc_batch *gcb; - unsigned int i; - - gcb = container_of(rcu, struct nft_set_gc_batch, head.rcu); - for (i = 0; i < gcb->head.cnt; i++) - nft_set_elem_destroy(gcb->head.set, gcb->elems[i], true); - kfree(gcb); -} - -struct nft_set_gc_batch *nft_set_gc_batch_alloc(const struct nft_set *set, - gfp_t gfp) -{ - struct nft_set_gc_batch *gcb; - - gcb = kzalloc(sizeof(*gcb), gfp); - if (gcb == NULL) - return gcb; - gcb->head.set = set; - return gcb; -} - /* * Stateful objects */
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Florian Westphal fw@strlen.de
[ Upstream commit 08713cb006b6f07434f276c5ee214fb20c7fd965 ]
Jakub Kicinski says: We've got some new kdoc warnings here: net/netfilter/nft_set_pipapo.c:1557: warning: Function parameter or member '_set' not described in 'pipapo_gc' net/netfilter/nft_set_pipapo.c:1557: warning: Excess function parameter 'set' description in 'pipapo_gc' include/net/netfilter/nf_tables.h:577: warning: Function parameter or member 'dead' not described in 'nft_set'
Fixes: 5f68718b34a5 ("netfilter: nf_tables: GC transaction API to avoid race with control plane") Fixes: f6c383b8c31a ("netfilter: nf_tables: adapt set backend to use GC transaction API") Reported-by: Jakub Kicinski kuba@kernel.org Closes: https://lore.kernel.org/netdev/20230810104638.746e46f1@kernel.org/ Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/netfilter/nf_tables.h | 1 + net/netfilter/nft_set_pipapo.c | 2 +- 2 files changed, 2 insertions(+), 1 deletion(-)
diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h index cd3b5b9db8890..86ffb222f4d74 100644 --- a/include/net/netfilter/nf_tables.h +++ b/include/net/netfilter/nf_tables.h @@ -499,6 +499,7 @@ struct nft_set_elem_expr { * @expr: stateful expression * @ops: set ops * @flags: set flags + * @dead: set will be freed, never cleared * @genmask: generation mask * @klen: key length * @dlen: data length diff --git a/net/netfilter/nft_set_pipapo.c b/net/netfilter/nft_set_pipapo.c index 0a49c59a22dbd..03472c1d9d548 100644 --- a/net/netfilter/nft_set_pipapo.c +++ b/net/netfilter/nft_set_pipapo.c @@ -1549,7 +1549,7 @@ static void nft_pipapo_gc_deactivate(struct net *net, struct nft_set *set,
/** * pipapo_gc() - Drop expired entries from set, destroy start and end elements - * @set: nftables API set representation + * @_set: nftables API set representation * @m: Matching data */ static void pipapo_gc(const struct nft_set *_set, struct nft_pipapo_match *m)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pablo Neira Ayuso pablo@netfilter.org
[ Upstream commit 6a33d8b73dfac0a41f3877894b38082bd0c9a5bc ]
The netlink event path is missing a synchronization point with GC transactions. Add a GC sequence number update to the netns release path and the netlink event path; any GC transaction that loses the race will be discarded.
Fixes: 5f68718b34a5 ("netfilter: nf_tables: GC transaction API to avoid race with control plane") Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_tables_api.c | 36 +++++++++++++++++++++++++++++++---- 1 file changed, 32 insertions(+), 4 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index 75b04c01f7b81..3e17d2403d899 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -9202,6 +9202,22 @@ static void nft_set_commit_update(struct list_head *set_update_list) } }
+static unsigned int nft_gc_seq_begin(struct nftables_pernet *nft_net) +{ + unsigned int gc_seq; + + /* Bump gc counter, it becomes odd, this is the busy mark. */ + gc_seq = READ_ONCE(nft_net->gc_seq); + WRITE_ONCE(nft_net->gc_seq, ++gc_seq); + + return gc_seq; +} + +static void nft_gc_seq_end(struct nftables_pernet *nft_net, unsigned int gc_seq) +{ + WRITE_ONCE(nft_net->gc_seq, ++gc_seq); +} + static int nf_tables_commit(struct net *net, struct sk_buff *skb) { struct nftables_pernet *nft_net = nft_pernet(net); @@ -9287,9 +9303,7 @@ static int nf_tables_commit(struct net *net, struct sk_buff *skb)
WRITE_ONCE(nft_net->base_seq, base_seq);
- /* Bump gc counter, it becomes odd, this is the busy mark. */ - gc_seq = READ_ONCE(nft_net->gc_seq); - WRITE_ONCE(nft_net->gc_seq, ++gc_seq); + gc_seq = nft_gc_seq_begin(nft_net);
/* step 3. Start new generation, rules_gen_X now in use. */ net->nft.gencursor = nft_gencursor_next(net); @@ -9480,7 +9494,7 @@ static int nf_tables_commit(struct net *net, struct sk_buff *skb) nf_tables_gen_notify(net, skb, NFT_MSG_NEWGEN); nf_tables_commit_audit_log(&adl, nft_net->base_seq);
- WRITE_ONCE(nft_net->gc_seq, ++gc_seq); + nft_gc_seq_end(nft_net, gc_seq); nf_tables_commit_release(net);
return 0; @@ -10463,6 +10477,7 @@ static int nft_rcv_nl_event(struct notifier_block *this, unsigned long event, struct net *net = n->net; unsigned int deleted; bool restart = false; + unsigned int gc_seq;
if (event != NETLINK_URELEASE || n->protocol != NETLINK_NETFILTER) return NOTIFY_DONE; @@ -10470,6 +10485,9 @@ static int nft_rcv_nl_event(struct notifier_block *this, unsigned long event, nft_net = nft_pernet(net); deleted = 0; mutex_lock(&nft_net->commit_mutex); + + gc_seq = nft_gc_seq_begin(nft_net); + if (!list_empty(&nf_tables_destroy_list)) nf_tables_trans_destroy_flush_work(); again: @@ -10492,6 +10510,8 @@ static int nft_rcv_nl_event(struct notifier_block *this, unsigned long event, if (restart) goto again; } + nft_gc_seq_end(nft_net, gc_seq); + mutex_unlock(&nft_net->commit_mutex);
return NOTIFY_DONE; @@ -10529,12 +10549,20 @@ static void __net_exit nf_tables_pre_exit_net(struct net *net) static void __net_exit nf_tables_exit_net(struct net *net) { struct nftables_pernet *nft_net = nft_pernet(net); + unsigned int gc_seq;
mutex_lock(&nft_net->commit_mutex); + + gc_seq = nft_gc_seq_begin(nft_net); + if (!list_empty(&nft_net->commit_list) || !list_empty(&nft_net->module_list)) __nf_tables_abort(net, NFNL_ABORT_NONE); + __nft_release_tables(net); + + nft_gc_seq_end(nft_net, gc_seq); + mutex_unlock(&nft_net->commit_mutex); WARN_ON_ONCE(!list_empty(&nft_net->tables)); WARN_ON_ONCE(!list_empty(&nft_net->module_list));
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pablo Neira Ayuso pablo@netfilter.org
[ Upstream commit 02c6c24402bf1c1e986899c14ba22a10b510916b ]
Use maybe_get_net() since the GC workqueue might race with the netns exit path.
Fixes: 5f68718b34a5 ("netfilter: nf_tables: GC transaction API to avoid race with control plane") Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_tables_api.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index 3e17d2403d899..33023b42698bb 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -8944,9 +8944,14 @@ struct nft_trans_gc *nft_trans_gc_alloc(struct nft_set *set, if (!trans) return NULL;
+	trans->net = maybe_get_net(net);
+	if (!trans->net) {
+		kfree(trans);
+		return NULL;
+	}
+
 	refcount_inc(&set->refs);
 	trans->set = set;
-	trans->net = get_net(net);
 	trans->seq = gc_seq;
return trans;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arnd Bergmann arnd@arndb.de
[ Upstream commit b9080468caeddc58a91edd1c3a7d212ea82b0d1d ]
__symbol_put() is really meant as an internal helper and is not available when module unloading is disabled, unlike the previously used symbol_put():
samples/hw_breakpoint/data_breakpoint.c: In function 'hw_break_module_exit': samples/hw_breakpoint/data_breakpoint.c:73:9: error: implicit declaration of function '__symbol_put'; did you mean '__symbol_get'? [-Werror=implicit-function-declaration]
The hw_break_module_exit() function is not actually used when module unloading is disabled, but it still causes a build failure for an undefined identifier. Enclose this one call in an appropriate #ifdef to clarify what the requirement is. Leaving out the entire exit function would also work but feels less clear in this case.
Fixes: 910e230d5f1bb ("samples/hw_breakpoint: Fix kernel BUG 'invalid opcode: 0000'") Fixes: d8a84d33a4954 ("samples/hw_breakpoint: drop use of kallsyms_lookup_name()") Signed-off-by: Arnd Bergmann arnd@arndb.de Reviewed-by: Petr Mladek pmladek@suse.com Signed-off-by: Luis Chamberlain mcgrof@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- samples/hw_breakpoint/data_breakpoint.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/samples/hw_breakpoint/data_breakpoint.c b/samples/hw_breakpoint/data_breakpoint.c
index 9debd128b2ab8..b99322f188e59 100644
--- a/samples/hw_breakpoint/data_breakpoint.c
+++ b/samples/hw_breakpoint/data_breakpoint.c
@@ -70,7 +70,9 @@ static int __init hw_break_module_init(void)
 static void __exit hw_break_module_exit(void)
 {
 	unregister_wide_hw_breakpoint(sample_hbp);
+#ifdef CONFIG_MODULE_UNLOAD
 	__symbol_put(ksym_name);
+#endif
 	printk(KERN_INFO "HW Breakpoint for %s write uninstalled\n", ksym_name);
 }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nigel Croxon ncroxon@redhat.com
[ Upstream commit df203da47f4428bc286fc99318936416253a321c ]
There is a compile error when this commit is added: md: raid1: fix potential OOB in raid1_remove_disk()
drivers/md/raid1.c: In function 'raid1_remove_disk': drivers/md/raid1.c:1844:9: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement] 1844 | struct raid1_info *p = conf->mirrors + number; | ^~~~~~
That's because the new code was inserted before the struct declaration. The fix is to move the struct declaration above the newly added check.
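For reference, a minimal userspace illustration of the C90 rule that -Werror=declaration-after-statement enforces; this is just the language rule, not kernel code, and the snippet intentionally fails to build with that flag:

	/* build with: gcc -std=gnu90 -Werror=declaration-after-statement demo.c */
	#include <stdio.h>

	int main(void)
	{
		int a = 1;              /* declarations come first ...        */

		printf("a=%d\n", a);    /* ... then statements                */

		int b = a + 1;          /* error: declaration after statement */
		printf("b=%d\n", b);
		return 0;
	}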
Fixes: 8b0472b50bcf ("md: raid1: fix potential OOB in raid1_remove_disk()") Signed-off-by: Nigel Croxon ncroxon@redhat.com Signed-off-by: Song Liu song@kernel.org Link: https://lore.kernel.org/r/46d929d0-2aab-4cf2-b2bf-338963e8ba5a@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/raid1.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index 5360d6ed16e05..8427c9767a61b 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -1820,12 +1820,11 @@ static int raid1_remove_disk(struct mddev *mddev, struct md_rdev *rdev)
 	struct r1conf *conf = mddev->private;
 	int err = 0;
 	int number = rdev->raid_disk;
+	struct raid1_info *p = conf->mirrors + number;
 
 	if (unlikely(number >= conf->raid_disks))
 		goto abort;
 
-	struct raid1_info *p = conf->mirrors + number;
-
 	if (rdev != p->rdev)
 		p = conf->mirrors + conf->raid_disks + number;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christian Brauner brauner@kernel.org
commit 5d1f903f75a80daa4dfb3d84e114ec8ecbf29956 upstream.
Changing the mode of symlinks is meaningless as the vfs doesn't take the mode of a symlink into account during path lookup permission checking.
However, the vfs doesn't block mode changes on symlinks. This, however, has led to an untenable mess roughly classifiable into the following two categories:
(1) Filesystems that don't implement a i_op->setattr() for symlinks.
Such filesystems may or may not know that without i_op->setattr() defined, notify_change() falls back to simple_setattr() causing the inode's mode in the inode cache to be changed.
That's a generic issue as this will affect all non-size changing inode attributes including ownership changes.
Example: afs
(2) Filesystems that fail with EOPNOTSUPP but change the mode of the symlink nonetheless.
Some filesystems will happily update the mode of a symlink but still return EOPNOTSUPP. This is the biggest source of confusion for userspace.
The EOPNOTSUPP in this case comes from POSIX ACLs. Specifically it comes from filesystems that call posix_acl_chmod(), e.g., btrfs via
if (!err && attr->ia_valid & ATTR_MODE) err = posix_acl_chmod(idmap, dentry, inode->i_mode);
Filesystems including btrfs don't implement i_op->set_acl() so posix_acl_chmod() will report EOPNOTSUPP.
When posix_acl_chmod() is called, most filesystems will have finished updating the inode.
Perversely, this has the consequences that this behavior may depend on two kconfig options and mount options:
* CONFIG_POSIX_ACL={y,n} * CONFIG_${FSTYPE}_POSIX_ACL={y,n} * Opt_acl, Opt_noacl
Example: btrfs, ext4, xfs
The only way to change the mode on a symlink currently involves abusing an O_PATH file descriptor in the following manner:
fd = openat(-1, "/path/to/link", O_CLOEXEC | O_PATH | O_NOFOLLOW);
char path[PATH_MAX]; snprintf(path, sizeof(path), "/proc/self/fd/%d", fd); chmod(path, 0000);
But for most major filesystems with POSIX ACL support such as btrfs, ext4, ceph, tmpfs, xfs and others this will fail with EOPNOTSUPP with the mode still updated due to the aforementioned posix_acl_chmod() nonsense.
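For completeness, the O_PATH trick from the snippet above, expanded into a small self-contained test program; the argument handling and error reporting are additions for illustration only, not part of the original description:

	#define _GNU_SOURCE
	#include <fcntl.h>
	#include <limits.h>
	#include <stdio.h>
	#include <sys/stat.h>
	#include <unistd.h>

	int main(int argc, char *argv[])
	{
		char path[PATH_MAX];
		int fd;

		if (argc != 2) {
			fprintf(stderr, "usage: %s /path/to/link\n", argv[0]);
			return 1;
		}

		/* Grab the symlink itself, not its target. */
		fd = open(argv[1], O_CLOEXEC | O_PATH | O_NOFOLLOW);
		if (fd < 0) {
			perror("open");
			return 1;
		}

		/* Go through the /proc magic link to reach the symlink's inode. */
		snprintf(path, sizeof(path), "/proc/self/fd/%d", fd);
		if (chmod(path, 0000) < 0)
			perror("chmod");	/* EOPNOTSUPP once this change is applied */

		close(fd);
		return 0;
	}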
So, given that for all major filesystems this would fail with EOPNOTSUPP and that both glibc (cf. [1]) and musl (cf. [2]) outright block mode changes on symlinks we should just try and block mode changes on symlinks directly in the vfs and have a clean break with this nonsense.
If this causes any regressions, we do the next best thing and fix up all filesystems that do return EOPNOTSUPP with the mode updated to not call posix_acl_chmod() on symlinks.
But as usual, let's try the clean cut solution first. It's a simple patch that can be easily reverted. Not marking this for backport as I'll do that manually if we're reasonably sure that this works and there are no strong objections.
We could block this in chmod_common() but it's more appropriate to do it in notify_change() as it will also mean that we catch filesystems that change symlink permissions explicitly or accidentally.
Similar proposals were floated in the past as in [3] and [4] and again recently in [5]. There's also a couple of bugs about this inconsistency as in [6] and [7].
Link: https://sourceware.org/git/?p=glibc.git%3Ba=blob%3Bf=sysdeps/unix/sysv/linux... [1] Link: https://git.musl-libc.org/cgit/musl/tree/src/stat/fchmodat.c [2] Link: https://lore.kernel.org/all/20200911065733.GA31579@infradead.org [3] Link: https://sourceware.org/legacy-ml/libc-alpha/2020-02/msg00518.html [4] Link: https://lore.kernel.org/lkml/87lefmbppo.fsf@oldenburg.str.redhat.com [5] Link: https://sourceware.org/legacy-ml/libc-alpha/2020-02/msg00467.html [6] Link: https://sourceware.org/bugzilla/show_bug.cgi?id=14578#c17 [7] Reviewed-by: Aleksa Sarai cyphar@cyphar.com Reviewed-by: Christoph Hellwig hch@lst.de Cc: stable@vger.kernel.org # please backport to all LTSes but not before v6.6-rc2 is tagged Suggested-by: Christoph Hellwig hch@lst.de Suggested-by: Florian Weimer fweimer@redhat.com Message-Id: 20230712-vfs-chmod-symlinks-v2-1-08cfb92b61dd@kernel.org Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/attr.c | 20 ++++++++++++++++++-- 1 file changed, 18 insertions(+), 2 deletions(-)
--- a/fs/attr.c
+++ b/fs/attr.c
@@ -402,9 +402,25 @@ int notify_change(struct user_namespace
 		return error;
 
 	if ((ia_valid & ATTR_MODE)) {
-		umode_t amode = attr->ia_mode;
+		/*
+		 * Don't allow changing the mode of symlinks:
+		 *
+		 * (1) The vfs doesn't take the mode of symlinks into account
+		 *     during permission checking.
+		 * (2) This has never worked correctly. Most major filesystems
+		 *     did return EOPNOTSUPP due to interactions with POSIX ACLs
+		 *     but did still updated the mode of the symlink.
+		 *     This inconsistency led system call wrapper providers such
+		 *     as libc to block changing the mode of symlinks with
+		 *     EOPNOTSUPP already.
+		 * (3) To even do this in the first place one would have to use
+		 *     specific file descriptors and quite some effort.
+		 */
+		if (S_ISLNK(inode->i_mode))
+			return -EOPNOTSUPP;
+
 		/* Flag setting protected by i_mutex */
-		if (is_sxid(amode))
+		if (is_sxid(attr->ia_mode))
 			inode->i_flags &= ~S_NOSEC;
 	}
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Amir Goldstein amir73il@gmail.com
commit ab048302026d7701e7fbd718917e0dbcff0c4223 upstream.
Some local filesystems support setting persistent fileattr flags (e.g. FS_NOATIME_FL) on directories and regular files via ioctl. Some of those persistent fileattr flags are reflected to vfs as in-memory inode flags (e.g. S_NOATIME).
Overlayfs uses the in-memory inode flags (e.g. S_NOATIME) on a lower file as an indication that the lower file may have persistent inode fileattr flags (e.g. FS_NOATIME_FL) that need to be copied to the upper file.
However, in some cases, the S_NOATIME in-memory flag could be a false indication for persistent FS_NOATIME_FL fileattr. For example, with NFS and FUSE lower fs, as was the case in the two bug reports, the S_NOATIME flag is set unconditionally for all inodes.
Users cannot set persistent fileattr flags on symlinks and special files, but in some local fs, such as ext4/btrfs/tmpfs, the FS_NOATIME_FL fileattr flag is inherited by symlinks and special files from the parent directory.
In both cases described above, when the lower symlink has the S_NOATIME flag, overlayfs will try to copy the symlink's fileattrs and fail with error ENXIO, because it could not open the symlink for the ioctl security hook.
To solve this failure, do not attempt to copyup fileattrs for anything other than directories and regular files.
Reported-by: Ruiwen Zhao ruiwen@google.com Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217850 Fixes: 72db82115d2b ("ovl: copy up sync/noatime fileattr flags") Cc: stable@vger.kernel.org # v5.15 Reviewed-by: Miklos Szeredi miklos@szeredi.hu Signed-off-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/overlayfs/copy_up.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/fs/overlayfs/copy_up.c
+++ b/fs/overlayfs/copy_up.c
@@ -583,7 +583,8 @@ static int ovl_copy_up_inode(struct ovl_
 	if (err)
 		return err;
 
-	if (inode->i_flags & OVL_COPY_I_FLAGS_MASK) {
+	if (inode->i_flags & OVL_COPY_I_FLAGS_MASK &&
+	    (S_ISREG(c->stat.mode) || S_ISDIR(c->stat.mode))) {
 		/*
 		 * Copy the fileattr inode flags that are the source of already
 		 * copied i_flags
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Amir Goldstein amir73il@gmail.com
commit 724768a39374d35b70eaeae8dd87048a2ec7ae8e upstream.
ovl_{read,write}_iter() always call fdput(real) to put one or zero refcounts of the real file, but for aio, whether it was submitted or not, ovl_aio_put() also calls fdput(), which is not balanced. This is only a problem in the less common case when FDPUT_FPUT flag is set.
To fix the problem use get_file() to take file refcount and use fput() instead of fdput() in ovl_aio_put().
Fixes: 2406a307ac7d ("ovl: implement async IO routines") Cc: stable@vger.kernel.org # v5.6 Reviewed-by: Miklos Szeredi miklos@szeredi.hu Signed-off-by: Amir Goldstein amir73il@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/overlayfs/file.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-)
--- a/fs/overlayfs/file.c +++ b/fs/overlayfs/file.c @@ -19,7 +19,6 @@ struct ovl_aio_req { struct kiocb iocb; refcount_t ref; struct kiocb *orig_iocb; - struct fd fd; };
static struct kmem_cache *ovl_aio_request_cachep; @@ -256,7 +255,7 @@ static rwf_t ovl_iocb_to_rwf(int ifl) static inline void ovl_aio_put(struct ovl_aio_req *aio_req) { if (refcount_dec_and_test(&aio_req->ref)) { - fdput(aio_req->fd); + fput(aio_req->iocb.ki_filp); kmem_cache_free(ovl_aio_request_cachep, aio_req); } } @@ -322,10 +321,9 @@ static ssize_t ovl_read_iter(struct kioc if (!aio_req) goto out;
- aio_req->fd = real; real.flags = 0; aio_req->orig_iocb = iocb; - kiocb_clone(&aio_req->iocb, iocb, real.file); + kiocb_clone(&aio_req->iocb, iocb, get_file(real.file)); aio_req->iocb.ki_complete = ovl_aio_rw_complete; refcount_set(&aio_req->ref, 2); ret = vfs_iocb_iter_read(real.file, &aio_req->iocb, iter); @@ -394,10 +392,9 @@ static ssize_t ovl_write_iter(struct kio /* Pacify lockdep, same trick as done in aio_write() */ __sb_writers_release(file_inode(real.file)->i_sb, SB_FREEZE_WRITE); - aio_req->fd = real; real.flags = 0; aio_req->orig_iocb = iocb; - kiocb_clone(&aio_req->iocb, iocb, real.file); + kiocb_clone(&aio_req->iocb, iocb, get_file(real.file)); aio_req->iocb.ki_flags = ifl; aio_req->iocb.ki_complete = ovl_aio_rw_complete; refcount_set(&aio_req->ref, 2);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Filipe Manana fdmanana@suse.com
commit e110f8911ddb93e6f55da14ccbbe705397b30d0b upstream.
When running delayed items we are holding a delayed node's mutex and then we will attempt to modify a subvolume btree to insert/update/delete the delayed items. However, if we have an error during the insertions for example, btrfs_insert_delayed_items() may return with a path that has locked extent buffers (a leaf at the very least), and then we attempt to release the delayed node at __btrfs_run_delayed_items(), which requires taking the delayed node's mutex, causing an ABBA type of deadlock. This was reported by syzbot and the lockdep splat is the following:
WARNING: possible circular locking dependency detected 6.5.0-rc7-syzkaller-00024-g93f5de5f648d #0 Not tainted ------------------------------------------------------ syz-executor.2/13257 is trying to acquire lock: ffff88801835c0c0 (&delayed_node->mutex){+.+.}-{3:3}, at: __btrfs_release_delayed_node+0x9a/0xaa0 fs/btrfs/delayed-inode.c:256
but task is already holding lock: ffff88802a5ab8e8 (btrfs-tree-00){++++}-{3:3}, at: __btrfs_tree_lock+0x3c/0x2a0 fs/btrfs/locking.c:198
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (btrfs-tree-00){++++}-{3:3}: __lock_release kernel/locking/lockdep.c:5475 [inline] lock_release+0x36f/0x9d0 kernel/locking/lockdep.c:5781 up_write+0x79/0x580 kernel/locking/rwsem.c:1625 btrfs_tree_unlock_rw fs/btrfs/locking.h:189 [inline] btrfs_unlock_up_safe+0x179/0x3b0 fs/btrfs/locking.c:239 search_leaf fs/btrfs/ctree.c:1986 [inline] btrfs_search_slot+0x2511/0x2f80 fs/btrfs/ctree.c:2230 btrfs_insert_empty_items+0x9c/0x180 fs/btrfs/ctree.c:4376 btrfs_insert_delayed_item fs/btrfs/delayed-inode.c:746 [inline] btrfs_insert_delayed_items fs/btrfs/delayed-inode.c:824 [inline] __btrfs_commit_inode_delayed_items+0xd24/0x2410 fs/btrfs/delayed-inode.c:1111 __btrfs_run_delayed_items+0x1db/0x430 fs/btrfs/delayed-inode.c:1153 flush_space+0x269/0xe70 fs/btrfs/space-info.c:723 btrfs_async_reclaim_metadata_space+0x106/0x350 fs/btrfs/space-info.c:1078 process_one_work+0x92c/0x12c0 kernel/workqueue.c:2600 worker_thread+0xa63/0x1210 kernel/workqueue.c:2751 kthread+0x2b8/0x350 kernel/kthread.c:389 ret_from_fork+0x2e/0x60 arch/x86/kernel/process.c:145 ret_from_fork_asm+0x11/0x20 arch/x86/entry/entry_64.S:304
-> #0 (&delayed_node->mutex){+.+.}-{3:3}: check_prev_add kernel/locking/lockdep.c:3142 [inline] check_prevs_add kernel/locking/lockdep.c:3261 [inline] validate_chain kernel/locking/lockdep.c:3876 [inline] __lock_acquire+0x39ff/0x7f70 kernel/locking/lockdep.c:5144 lock_acquire+0x1e3/0x520 kernel/locking/lockdep.c:5761 __mutex_lock_common+0x1d8/0x2530 kernel/locking/mutex.c:603 __mutex_lock kernel/locking/mutex.c:747 [inline] mutex_lock_nested+0x1b/0x20 kernel/locking/mutex.c:799 __btrfs_release_delayed_node+0x9a/0xaa0 fs/btrfs/delayed-inode.c:256 btrfs_release_delayed_node fs/btrfs/delayed-inode.c:281 [inline] __btrfs_run_delayed_items+0x2b5/0x430 fs/btrfs/delayed-inode.c:1156 btrfs_commit_transaction+0x859/0x2ff0 fs/btrfs/transaction.c:2276 btrfs_sync_file+0xf56/0x1330 fs/btrfs/file.c:1988 vfs_fsync_range fs/sync.c:188 [inline] vfs_fsync fs/sync.c:202 [inline] do_fsync fs/sync.c:212 [inline] __do_sys_fsync fs/sync.c:220 [inline] __se_sys_fsync fs/sync.c:218 [inline] __x64_sys_fsync+0x196/0x1e0 fs/sync.c:218 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1 ---- ---- lock(btrfs-tree-00); lock(&delayed_node->mutex); lock(btrfs-tree-00); lock(&delayed_node->mutex);
*** DEADLOCK ***
3 locks held by syz-executor.2/13257: #0: ffff88802c1ee370 (btrfs_trans_num_writers){++++}-{0:0}, at: spin_unlock include/linux/spinlock.h:391 [inline] #0: ffff88802c1ee370 (btrfs_trans_num_writers){++++}-{0:0}, at: join_transaction+0xb87/0xe00 fs/btrfs/transaction.c:287 #1: ffff88802c1ee398 (btrfs_trans_num_extwriters){++++}-{0:0}, at: join_transaction+0xbb2/0xe00 fs/btrfs/transaction.c:288 #2: ffff88802a5ab8e8 (btrfs-tree-00){++++}-{3:3}, at: __btrfs_tree_lock+0x3c/0x2a0 fs/btrfs/locking.c:198
stack backtrace: CPU: 0 PID: 13257 Comm: syz-executor.2 Not tainted 6.5.0-rc7-syzkaller-00024-g93f5de5f648d #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x1e7/0x2d0 lib/dump_stack.c:106 check_noncircular+0x375/0x4a0 kernel/locking/lockdep.c:2195 check_prev_add kernel/locking/lockdep.c:3142 [inline] check_prevs_add kernel/locking/lockdep.c:3261 [inline] validate_chain kernel/locking/lockdep.c:3876 [inline] __lock_acquire+0x39ff/0x7f70 kernel/locking/lockdep.c:5144 lock_acquire+0x1e3/0x520 kernel/locking/lockdep.c:5761 __mutex_lock_common+0x1d8/0x2530 kernel/locking/mutex.c:603 __mutex_lock kernel/locking/mutex.c:747 [inline] mutex_lock_nested+0x1b/0x20 kernel/locking/mutex.c:799 __btrfs_release_delayed_node+0x9a/0xaa0 fs/btrfs/delayed-inode.c:256 btrfs_release_delayed_node fs/btrfs/delayed-inode.c:281 [inline] __btrfs_run_delayed_items+0x2b5/0x430 fs/btrfs/delayed-inode.c:1156 btrfs_commit_transaction+0x859/0x2ff0 fs/btrfs/transaction.c:2276 btrfs_sync_file+0xf56/0x1330 fs/btrfs/file.c:1988 vfs_fsync_range fs/sync.c:188 [inline] vfs_fsync fs/sync.c:202 [inline] do_fsync fs/sync.c:212 [inline] __do_sys_fsync fs/sync.c:220 [inline] __se_sys_fsync fs/sync.c:218 [inline] __x64_sys_fsync+0x196/0x1e0 fs/sync.c:218 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f3ad047cae9 Code: 28 00 00 00 75 (...) RSP: 002b:00007f3ad12510c8 EFLAGS: 00000246 ORIG_RAX: 000000000000004a RAX: ffffffffffffffda RBX: 00007f3ad059bf80 RCX: 00007f3ad047cae9 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000005 RBP: 00007f3ad04c847a R08: 0000000000000000 R09: 0000000000000000 R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000 R13: 000000000000000b R14: 00007f3ad059bf80 R15: 00007ffe56af92f8 </TASK> ------------[ cut here ]------------
Fix this by releasing the path before releasing the delayed node in the error path at __btrfs_run_delayed_items().
Reported-by: syzbot+a379155f07c134ea9879@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-btrfs/000000000000abba27060403b5bd@google.com/ CC: stable@vger.kernel.org # 4.14+ Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/delayed-inode.c | 19 ++++++++++++++++--- 1 file changed, 16 insertions(+), 3 deletions(-)
--- a/fs/btrfs/delayed-inode.c +++ b/fs/btrfs/delayed-inode.c @@ -1083,20 +1083,33 @@ static int __btrfs_run_delayed_items(str ret = __btrfs_commit_inode_delayed_items(trans, path, curr_node); if (ret) { - btrfs_release_delayed_node(curr_node); - curr_node = NULL; btrfs_abort_transaction(trans, ret); break; }
prev_node = curr_node; curr_node = btrfs_next_delayed_node(curr_node); + /* + * See the comment below about releasing path before releasing + * node. If the commit of delayed items was successful the path + * should always be released, but in case of an error, it may + * point to locked extent buffers (a leaf at the very least). + */ + ASSERT(path->nodes[0] == NULL); btrfs_release_delayed_node(prev_node); }
+ /* + * Release the path to avoid a potential deadlock and lockdep splat when + * releasing the delayed node, as that requires taking the delayed node's + * mutex. If another task starts running delayed items before we take + * the mutex, it will first lock the mutex and then it may try to lock + * the same btree path (leaf). + */ + btrfs_free_path(path); + if (curr_node) btrfs_release_delayed_node(curr_node); - btrfs_free_path(path); trans->block_rsv = block_rsv;
return ret;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Filipe Manana fdmanana@suse.com
commit ee34a82e890a7babb5585daf1a6dd7d4d1cf142a upstream.
During the ino lookup ioctl we can end up calling btrfs_iget() to get an inode reference while we are holding a lock on a root's btree. If btrfs_iget() needs to look up the inode from the root's btree, because it's not currently loaded in memory, then it will need to lock another path, or the same path, in the same root btree. This may result in a deadlock and trigger the following lockdep splat:
WARNING: possible circular locking dependency detected 6.5.0-rc7-syzkaller-00004-gf7757129e3de #0 Not tainted ------------------------------------------------------ syz-executor277/5012 is trying to acquire lock: ffff88802df41710 (btrfs-tree-01){++++}-{3:3}, at: __btrfs_tree_read_lock+0x2f/0x220 fs/btrfs/locking.c:136
but task is already holding lock: ffff88802df418e8 (btrfs-tree-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x2f/0x220 fs/btrfs/locking.c:136
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #1 (btrfs-tree-00){++++}-{3:3}: down_read_nested+0x49/0x2f0 kernel/locking/rwsem.c:1645 __btrfs_tree_read_lock+0x2f/0x220 fs/btrfs/locking.c:136 btrfs_search_slot+0x13a4/0x2f80 fs/btrfs/ctree.c:2302 btrfs_init_root_free_objectid+0x148/0x320 fs/btrfs/disk-io.c:4955 btrfs_init_fs_root fs/btrfs/disk-io.c:1128 [inline] btrfs_get_root_ref+0x5ae/0xae0 fs/btrfs/disk-io.c:1338 btrfs_get_fs_root fs/btrfs/disk-io.c:1390 [inline] open_ctree+0x29c8/0x3030 fs/btrfs/disk-io.c:3494 btrfs_fill_super+0x1c7/0x2f0 fs/btrfs/super.c:1154 btrfs_mount_root+0x7e0/0x910 fs/btrfs/super.c:1519 legacy_get_tree+0xef/0x190 fs/fs_context.c:611 vfs_get_tree+0x8c/0x270 fs/super.c:1519 fc_mount fs/namespace.c:1112 [inline] vfs_kern_mount+0xbc/0x150 fs/namespace.c:1142 btrfs_mount+0x39f/0xb50 fs/btrfs/super.c:1579 legacy_get_tree+0xef/0x190 fs/fs_context.c:611 vfs_get_tree+0x8c/0x270 fs/super.c:1519 do_new_mount+0x28f/0xae0 fs/namespace.c:3335 do_mount fs/namespace.c:3675 [inline] __do_sys_mount fs/namespace.c:3884 [inline] __se_sys_mount+0x2d9/0x3c0 fs/namespace.c:3861 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd
-> #0 (btrfs-tree-01){++++}-{3:3}: check_prev_add kernel/locking/lockdep.c:3142 [inline] check_prevs_add kernel/locking/lockdep.c:3261 [inline] validate_chain kernel/locking/lockdep.c:3876 [inline] __lock_acquire+0x39ff/0x7f70 kernel/locking/lockdep.c:5144 lock_acquire+0x1e3/0x520 kernel/locking/lockdep.c:5761 down_read_nested+0x49/0x2f0 kernel/locking/rwsem.c:1645 __btrfs_tree_read_lock+0x2f/0x220 fs/btrfs/locking.c:136 btrfs_tree_read_lock fs/btrfs/locking.c:142 [inline] btrfs_read_lock_root_node+0x292/0x3c0 fs/btrfs/locking.c:281 btrfs_search_slot_get_root fs/btrfs/ctree.c:1832 [inline] btrfs_search_slot+0x4ff/0x2f80 fs/btrfs/ctree.c:2154 btrfs_lookup_inode+0xdc/0x480 fs/btrfs/inode-item.c:412 btrfs_read_locked_inode fs/btrfs/inode.c:3892 [inline] btrfs_iget_path+0x2d9/0x1520 fs/btrfs/inode.c:5716 btrfs_search_path_in_tree_user fs/btrfs/ioctl.c:1961 [inline] btrfs_ioctl_ino_lookup_user+0x77a/0xf50 fs/btrfs/ioctl.c:2105 btrfs_ioctl+0xb0b/0xd40 fs/btrfs/ioctl.c:4683 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:870 [inline] __se_sys_ioctl+0xf8/0x170 fs/ioctl.c:856 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd
other info that might help us debug this:
Possible unsafe locking scenario:
       CPU0                    CPU1
       ----                    ----
  rlock(btrfs-tree-00);
                               lock(btrfs-tree-01);
                               lock(btrfs-tree-00);
  rlock(btrfs-tree-01);
*** DEADLOCK ***
1 lock held by syz-executor277/5012: #0: ffff88802df418e8 (btrfs-tree-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x2f/0x220 fs/btrfs/locking.c:136
stack backtrace: CPU: 1 PID: 5012 Comm: syz-executor277 Not tainted 6.5.0-rc7-syzkaller-00004-gf7757129e3de #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 07/26/2023 Call Trace: <TASK> __dump_stack lib/dump_stack.c:88 [inline] dump_stack_lvl+0x1e7/0x2d0 lib/dump_stack.c:106 check_noncircular+0x375/0x4a0 kernel/locking/lockdep.c:2195 check_prev_add kernel/locking/lockdep.c:3142 [inline] check_prevs_add kernel/locking/lockdep.c:3261 [inline] validate_chain kernel/locking/lockdep.c:3876 [inline] __lock_acquire+0x39ff/0x7f70 kernel/locking/lockdep.c:5144 lock_acquire+0x1e3/0x520 kernel/locking/lockdep.c:5761 down_read_nested+0x49/0x2f0 kernel/locking/rwsem.c:1645 __btrfs_tree_read_lock+0x2f/0x220 fs/btrfs/locking.c:136 btrfs_tree_read_lock fs/btrfs/locking.c:142 [inline] btrfs_read_lock_root_node+0x292/0x3c0 fs/btrfs/locking.c:281 btrfs_search_slot_get_root fs/btrfs/ctree.c:1832 [inline] btrfs_search_slot+0x4ff/0x2f80 fs/btrfs/ctree.c:2154 btrfs_lookup_inode+0xdc/0x480 fs/btrfs/inode-item.c:412 btrfs_read_locked_inode fs/btrfs/inode.c:3892 [inline] btrfs_iget_path+0x2d9/0x1520 fs/btrfs/inode.c:5716 btrfs_search_path_in_tree_user fs/btrfs/ioctl.c:1961 [inline] btrfs_ioctl_ino_lookup_user+0x77a/0xf50 fs/btrfs/ioctl.c:2105 btrfs_ioctl+0xb0b/0xd40 fs/btrfs/ioctl.c:4683 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:870 [inline] __se_sys_ioctl+0xf8/0x170 fs/ioctl.c:856 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd RIP: 0033:0x7f0bec94ea39
Fix this simply by releasing the path before calling btrfs_iget(), as at that point we don't need the path anymore.
Reported-by: syzbot+bf66ad948981797d2f1d@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-btrfs/00000000000045fa140603c4a969@google.com/ Fixes: 23d0b79dfaed ("btrfs: Add unprivileged version of ino_lookup ioctl") CC: stable@vger.kernel.org # 4.19+ Reviewed-by: Josef Bacik josef@toxicpanda.com Signed-off-by: Filipe Manana fdmanana@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/ioctl.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -2526,6 +2526,13 @@ static int btrfs_search_path_in_tree_use goto out_put; }
+ /* + * We don't need the path anymore, so release it and + * avoid deadlocks and lockdep warnings in case + * btrfs_iget() needs to lookup the inode from its root + * btree and lock the same leaf. + */ + btrfs_release_path(path); temp_inode = btrfs_iget(sb, key2.objectid, root); if (IS_ERR(temp_inode)) { ret = PTR_ERR(temp_inode); @@ -2546,7 +2553,6 @@ static int btrfs_search_path_in_tree_use goto out_put; }
- btrfs_release_path(path); key.objectid = key.offset; key.offset = (u64)-1; dirid = key.objectid;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt (Google) rostedt@goodmis.org
commit 7d660c9b2bc95107f90a9f4c4759be85309a6550 upstream.
The tracing_max_latency file points to the trace_array max_latency field. For an instance, if the file is opened and the instance is deleted, reading or writing to the file will cause a use after free.
Up the ref count of the trace_array when tracing_max_latency is opened.
Link: https://lkml.kernel.org/r/20230907024803.666889383@goodmis.org Link: https://lore.kernel.org/all/1cb3aee2-19af-c472-e265-05176fe9bd84@huawei.com/
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Andrew Morton akpm@linux-foundation.org Cc: Zheng Yejian zhengyejian1@huawei.com Fixes: 8530dec63e7b4 ("tracing: Add tracing_check_open_get_tr()") Tested-by: Linux Kernel Functional Testing lkft@linaro.org Tested-by: Naresh Kamboju naresh.kamboju@linaro.org Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace.c | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-)
--- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -1716,7 +1716,7 @@ static void trace_create_maxlat_file(str init_irq_work(&tr->fsnotify_irqwork, latency_fsnotify_workfn_irq); tr->d_max_latency = trace_create_file("tracing_max_latency", TRACE_MODE_WRITE, - d_tracer, &tr->max_latency, + d_tracer, tr, &tracing_max_lat_fops); }
@@ -1749,7 +1749,7 @@ void latency_fsnotify(struct trace_array
#define trace_create_maxlat_file(tr, d_tracer) \ trace_create_file("tracing_max_latency", TRACE_MODE_WRITE, \ - d_tracer, &tr->max_latency, &tracing_max_lat_fops) + d_tracer, tr, &tracing_max_lat_fops)
#endif
@@ -6575,14 +6575,18 @@ static ssize_t tracing_max_lat_read(struct file *filp, char __user *ubuf, size_t cnt, loff_t *ppos) { - return tracing_nsecs_read(filp->private_data, ubuf, cnt, ppos); + struct trace_array *tr = filp->private_data; + + return tracing_nsecs_read(&tr->max_latency, ubuf, cnt, ppos); }
static ssize_t tracing_max_lat_write(struct file *filp, const char __user *ubuf, size_t cnt, loff_t *ppos) { - return tracing_nsecs_write(filp->private_data, ubuf, cnt, ppos); + struct trace_array *tr = filp->private_data; + + return tracing_nsecs_write(&tr->max_latency, ubuf, cnt, ppos); }
#endif @@ -7648,10 +7652,11 @@ static const struct file_operations trac
#ifdef CONFIG_TRACER_MAX_TRACE static const struct file_operations tracing_max_lat_fops = { - .open = tracing_open_generic, + .open = tracing_open_generic_tr, .read = tracing_max_lat_read, .write = tracing_max_lat_write, .llseek = generic_file_llseek, + .release = tracing_release_generic_tr, }; #endif
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt (Google) rostedt@goodmis.org
commit 9b37febc578b2e1ad76a105aab11d00af5ec3d27 upstream.
The current_trace updates the trace array tracer. For an instance, if the file is opened and the instance is deleted, reading or writing to the file will cause a use after free.
Up the ref count of the trace array when current_trace is opened.
Link: https://lkml.kernel.org/r/20230907024803.877687227@goodmis.org Link: https://lore.kernel.org/all/1cb3aee2-19af-c472-e265-05176fe9bd84@huawei.com/
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Andrew Morton akpm@linux-foundation.org Cc: Zheng Yejian zhengyejian1@huawei.com Fixes: 8530dec63e7b4 ("tracing: Add tracing_check_open_get_tr()") Tested-by: Linux Kernel Functional Testing lkft@linaro.org Tested-by: Naresh Kamboju naresh.kamboju@linaro.org Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -7661,10 +7661,11 @@ static const struct file_operations trac #endif
static const struct file_operations set_tracer_fops = { - .open = tracing_open_generic, + .open = tracing_open_generic_tr, .read = tracing_set_trace_read, .write = tracing_set_trace_write, .llseek = generic_file_llseek, + .release = tracing_release_generic_tr, };
static const struct file_operations tracing_pipe_fops = {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt (Google) rostedt@goodmis.org
commit 7e2cfbd2d3c86afcd5c26b5c4b1dd251f63c5838 upstream.
The option files update the options for a given trace array. For an instance, if the file is opened and the instance is deleted, reading or writing to the file will cause a use after free.
Up the ref count of the trace_array when an option file is opened.
Link: https://lkml.kernel.org/r/20230907024804.086679464@goodmis.org Link: https://lore.kernel.org/all/1cb3aee2-19af-c472-e265-05176fe9bd84@huawei.com/
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Andrew Morton akpm@linux-foundation.org Cc: Zheng Yejian zhengyejian1@huawei.com Fixes: 8530dec63e7b4 ("tracing: Add tracing_check_open_get_tr()") Tested-by: Linux Kernel Functional Testing lkft@linaro.org Tested-by: Naresh Kamboju naresh.kamboju@linaro.org Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace.c | 23 ++++++++++++++++++++++- 1 file changed, 22 insertions(+), 1 deletion(-)
--- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -8830,12 +8830,33 @@ trace_options_write(struct file *filp, c return cnt; }
+static int tracing_open_options(struct inode *inode, struct file *filp) +{ + struct trace_option_dentry *topt = inode->i_private; + int ret; + + ret = tracing_check_open_get_tr(topt->tr); + if (ret) + return ret; + + filp->private_data = inode->i_private; + return 0; +} + +static int tracing_release_options(struct inode *inode, struct file *file) +{ + struct trace_option_dentry *topt = file->private_data; + + trace_array_put(topt->tr); + return 0; +}
static const struct file_operations trace_options_fops = { - .open = tracing_open_generic, + .open = tracing_open_options, .read = trace_options_read, .write = trace_options_write, .llseek = generic_file_llseek, + .release = tracing_release_options, };
/*
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jeff Layton jlayton@kernel.org
commit fdd2630a7398191e84822612e589062063bd4f3d upstream.
nfsd sends the transposed directory change info in the RENAME reply. The source directory is in save_fh and the target is in current_fh.
Reported-by: Zhi Li yieli@redhat.com Reported-by: Benjamin Coddington bcodding@redhat.com Closes: https://bugzilla.redhat.com/show_bug.cgi?id=2218844 Signed-off-by: Jeff Layton jlayton@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfsd/nfs4proc.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/nfsd/nfs4proc.c +++ b/fs/nfsd/nfs4proc.c @@ -895,8 +895,8 @@ nfsd4_rename(struct svc_rqst *rqstp, str rename->rn_tname, rename->rn_tnamelen); if (status) return status; - set_change_info(&rename->rn_sinfo, &cstate->current_fh); - set_change_info(&rename->rn_tinfo, &cstate->save_fh); + set_change_info(&rename->rn_sinfo, &cstate->save_fh); + set_change_info(&rename->rn_tinfo, &cstate->current_fh); return nfs_ok; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Steven Rostedt (Google) rostedt@goodmis.org
commit 51aab5ffceb43e05119eb059048fd75765d2bc21 upstream.
The function tracefs_create_dir() was missing a lockdown check and was called by the RV code. This gave inconsistent behavior, with this function returning success while other tracefs functions failed, and caused the inode to be freed by the wrong kmem_cache.
Link: https://lkml.kernel.org/r/20230905182711.692687042@goodmis.org Link: https://lore.kernel.org/all/202309050916.58201dc6-oliver.sang@intel.com/
Cc: stable@vger.kernel.org Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Andrew Morton akpm@linux-foundation.org Cc: Ajay Kaher akaher@vmware.com Cc: Ching-lin Yu chinglinyu@google.com Fixes: bf8e602186ec4 ("tracing: Do not create tracefs files if tracefs lockdown is in effect") Reported-by: kernel test robot oliver.sang@intel.com Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/tracefs/inode.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/fs/tracefs/inode.c +++ b/fs/tracefs/inode.c @@ -556,6 +556,9 @@ static struct dentry *__create_dir(const */ struct dentry *tracefs_create_dir(const char *name, struct dentry *parent) { + if (security_locked_down(LOCKDOWN_TRACEFS)) + return NULL; + return __create_dir(name, parent, &simple_dir_inode_operations); }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Tommy Huang tommy_huang@aspeedtech.com
commit fee465150b458351b6d9b9f66084f3cc3022b88b upstream.
Reset the i2c controller when an i2c transfer timeout occurs. The remaining interrupts and device should be reset to avoid unpredictable controller behavior.
Fixes: 2e57b7cebb98 ("i2c: aspeed: Add multi-master use case support") Cc: stable@vger.kernel.org # v5.1+ Signed-off-by: Tommy Huang tommy_huang@aspeedtech.com Reviewed-by: Andi Shyti andi.shyti@kernel.org Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i2c/busses/i2c-aspeed.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
--- a/drivers/i2c/busses/i2c-aspeed.c +++ b/drivers/i2c/busses/i2c-aspeed.c @@ -693,13 +693,16 @@ static int aspeed_i2c_master_xfer(struct
if (time_left == 0) { /* - * If timed out and bus is still busy in a multi master - * environment, attempt recovery at here. + * In a multi-master setup, if a timeout occurs, attempt + * recovery. But if the bus is idle, we still need to reset the + * i2c controller to clear the remaining interrupts. */ if (bus->multi_master && (readl(bus->base + ASPEED_I2C_CMD_REG) & ASPEED_I2CD_BUS_BUSY_STS)) aspeed_i2c_recover_bus(bus); + else + aspeed_i2c_reset(bus);
/* * If timed out and the state is still pending, drop the pending
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Niklas Cassel niklas.cassel@wdc.com
commit 24e0e61db3cb86a66824531989f1df80e0939f26 upstream.
In AHCI 1.3.1, the register description for CAP.SSC: "When cleared to ‘0’, software must not allow the HBA to initiate transitions to the Slumber state via aggressive link power management nor the PxCMD.ICC field in each port, and the PxSCTL.IPM field in each port must be programmed to disallow device initiated Slumber requests."
In AHCI 1.3.1, the register description for CAP.PSC: "When cleared to ‘0’, software must not allow the HBA to initiate transitions to the Partial state via aggressive link power management nor the PxCMD.ICC field in each port, and the PxSCTL.IPM field in each port must be programmed to disallow device initiated Partial requests."
Ensure that we always set the corresponding bits in PxSCTL.IPM, such that a device is not allowed to initiate transitions to power states which are unsupported by the HBA.
DevSleep is always initiated by the HBA; however, for completeness, set the corresponding bit in PxSCTL.IPM such that aggressive link power management cannot transition to DevSleep if DevSleep is not supported.
sata_link_scr_lpm() is used by libahci, ata_piix and libata-pmp. However, only libahci has the ability to read the CAP/CAP2 register to see if these features are supported. Therefore, in order to not introduce any regressions on ata_piix or libata-pmp, create flags that indicate that the respective feature is NOT supported. This way, the behavior for ata_piix and libata-pmp should remain unchanged.
This change is based on a patch originally submitted by Runa Guo-oc.
Signed-off-by: Niklas Cassel niklas.cassel@wdc.com Fixes: 1152b2617a6e ("libata: implement sata_link_scr_lpm() and make ata_dev_set_feature() global") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal dlemoal@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/ata/ahci.c | 9 +++++++++ drivers/ata/libata-sata.c | 19 ++++++++++++++++--- include/linux/libata.h | 4 ++++ 3 files changed, 29 insertions(+), 3 deletions(-)
--- a/drivers/ata/ahci.c +++ b/drivers/ata/ahci.c @@ -1886,6 +1886,15 @@ static int ahci_init_one(struct pci_dev else dev_info(&pdev->dev, "SSS flag set, parallel bus scan disabled\n");
+ if (!(hpriv->cap & HOST_CAP_PART)) + host->flags |= ATA_HOST_NO_PART; + + if (!(hpriv->cap & HOST_CAP_SSC)) + host->flags |= ATA_HOST_NO_SSC; + + if (!(hpriv->cap2 & HOST_CAP2_SDS)) + host->flags |= ATA_HOST_NO_DEVSLP; + if (pi.flags & ATA_FLAG_EM) ahci_reset_em(host);
--- a/drivers/ata/libata-sata.c +++ b/drivers/ata/libata-sata.c @@ -394,10 +394,23 @@ int sata_link_scr_lpm(struct ata_link *l case ATA_LPM_MED_POWER_WITH_DIPM: case ATA_LPM_MIN_POWER_WITH_PARTIAL: case ATA_LPM_MIN_POWER: - if (ata_link_nr_enabled(link) > 0) - /* no restrictions on LPM transitions */ + if (ata_link_nr_enabled(link) > 0) { + /* assume no restrictions on LPM transitions */ scontrol &= ~(0x7 << 8); - else { + + /* + * If the controller does not support partial, slumber, + * or devsleep, then disallow these transitions. + */ + if (link->ap->host->flags & ATA_HOST_NO_PART) + scontrol |= (0x1 << 8); + + if (link->ap->host->flags & ATA_HOST_NO_SSC) + scontrol |= (0x2 << 8); + + if (link->ap->host->flags & ATA_HOST_NO_DEVSLP) + scontrol |= (0x4 << 8); + } else { /* empty port, power off */ scontrol &= ~0xf; scontrol |= (0x1 << 2); --- a/include/linux/libata.h +++ b/include/linux/libata.h @@ -264,6 +264,10 @@ enum { ATA_HOST_PARALLEL_SCAN = (1 << 2), /* Ports on this host can be scanned in parallel */ ATA_HOST_IGNORE_ATA = (1 << 3), /* Ignore ATA devices on this host. */
+ ATA_HOST_NO_PART = (1 << 4), /* Host does not support partial */ + ATA_HOST_NO_SSC = (1 << 5), /* Host does not support slumber */ + ATA_HOST_NO_DEVSLP = (1 << 6), /* Host does not support devslp */ + /* bits 24:31 of host->flags are reserved for LLD specific flags */
/* various lengths of time */
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Junxiao Bi junxiao.bi@oracle.com
commit 0b0747d507bffb827e40fc0f9fb5883fffc23477 upstream.
The following processes run into a deadlock. CPU 41 was waiting for CPU 29 to handle a CSD request while holding spinlock "crashdump_lock", but CPU 29 was hung by that spinlock with IRQs disabled.
PID: 17360 TASK: ffff95c1090c5c40 CPU: 41 COMMAND: "mrdiagd" !# 0 [ffffb80edbf37b58] __read_once_size at ffffffff9b871a40 include/linux/compiler.h:185:0 !# 1 [ffffb80edbf37b58] atomic_read at ffffffff9b871a40 arch/x86/include/asm/atomic.h:27:0 !# 2 [ffffb80edbf37b58] dump_stack at ffffffff9b871a40 lib/dump_stack.c:54:0 # 3 [ffffb80edbf37b78] csd_lock_wait_toolong at ffffffff9b131ad5 kernel/smp.c:364:0 # 4 [ffffb80edbf37b78] __csd_lock_wait at ffffffff9b131ad5 kernel/smp.c:384:0 # 5 [ffffb80edbf37bf8] csd_lock_wait at ffffffff9b13267a kernel/smp.c:394:0 # 6 [ffffb80edbf37bf8] smp_call_function_many at ffffffff9b13267a kernel/smp.c:843:0 # 7 [ffffb80edbf37c50] smp_call_function at ffffffff9b13279d kernel/smp.c:867:0 # 8 [ffffb80edbf37c50] on_each_cpu at ffffffff9b13279d kernel/smp.c:976:0 # 9 [ffffb80edbf37c78] flush_tlb_kernel_range at ffffffff9b085c4b arch/x86/mm/tlb.c:742:0 #10 [ffffb80edbf37cb8] __purge_vmap_area_lazy at ffffffff9b23a1e0 mm/vmalloc.c:701:0 #11 [ffffb80edbf37ce0] try_purge_vmap_area_lazy at ffffffff9b23a2cc mm/vmalloc.c:722:0 #12 [ffffb80edbf37ce0] free_vmap_area_noflush at ffffffff9b23a2cc mm/vmalloc.c:754:0 #13 [ffffb80edbf37cf8] free_unmap_vmap_area at ffffffff9b23bb3b mm/vmalloc.c:764:0 #14 [ffffb80edbf37cf8] remove_vm_area at ffffffff9b23bb3b mm/vmalloc.c:1509:0 #15 [ffffb80edbf37d18] __vunmap at ffffffff9b23bb8a mm/vmalloc.c:1537:0 #16 [ffffb80edbf37d40] vfree at ffffffff9b23bc85 mm/vmalloc.c:1612:0 #17 [ffffb80edbf37d58] megasas_free_host_crash_buffer [megaraid_sas] at ffffffffc020b7f2 drivers/scsi/megaraid/megaraid_sas_fusion.c:3932:0 #18 [ffffb80edbf37d80] fw_crash_state_store [megaraid_sas] at ffffffffc01f804d drivers/scsi/megaraid/megaraid_sas_base.c:3291:0 #19 [ffffb80edbf37dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0 #20 [ffffb80edbf37dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0 #21 [ffffb80edbf37de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0 #22 [ffffb80edbf37e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0 #23 [ffffb80edbf37ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0 #24 [ffffb80edbf37ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0 #25 [ffffb80edbf37ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0 #26 [ffffb80edbf37f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0 #27 [ffffb80edbf37f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0
PID: 17355 TASK: ffff95c1090c3d80 CPU: 29 COMMAND: "mrdiagd" !# 0 [ffffb80f2d3c7d30] __read_once_size at ffffffff9b0f2ab0 include/linux/compiler.h:185:0 !# 1 [ffffb80f2d3c7d30] native_queued_spin_lock_slowpath at ffffffff9b0f2ab0 kernel/locking/qspinlock.c:368:0 # 2 [ffffb80f2d3c7d58] pv_queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/paravirt.h:674:0 # 3 [ffffb80f2d3c7d58] queued_spin_lock_slowpath at ffffffff9b0f244b arch/x86/include/asm/qspinlock.h:53:0 # 4 [ffffb80f2d3c7d68] queued_spin_lock at ffffffff9b8961a6 include/asm-generic/qspinlock.h:90:0 # 5 [ffffb80f2d3c7d68] do_raw_spin_lock_flags at ffffffff9b8961a6 include/linux/spinlock.h:173:0 # 6 [ffffb80f2d3c7d68] __raw_spin_lock_irqsave at ffffffff9b8961a6 include/linux/spinlock_api_smp.h:122:0 # 7 [ffffb80f2d3c7d68] _raw_spin_lock_irqsave at ffffffff9b8961a6 kernel/locking/spinlock.c:160:0 # 8 [ffffb80f2d3c7d88] fw_crash_buffer_store [megaraid_sas] at ffffffffc01f8129 drivers/scsi/megaraid/megaraid_sas_base.c:3205:0 # 9 [ffffb80f2d3c7dc0] dev_attr_store at ffffffff9b56dd7b drivers/base/core.c:758:0 #10 [ffffb80f2d3c7dd0] sysfs_kf_write at ffffffff9b326acf fs/sysfs/file.c:144:0 #11 [ffffb80f2d3c7de0] kernfs_fop_write at ffffffff9b325fd4 fs/kernfs/file.c:316:0 #12 [ffffb80f2d3c7e20] __vfs_write at ffffffff9b29418a fs/read_write.c:480:0 #13 [ffffb80f2d3c7ea8] vfs_write at ffffffff9b294462 fs/read_write.c:544:0 #14 [ffffb80f2d3c7ee8] SYSC_write at ffffffff9b2946ec fs/read_write.c:590:0 #15 [ffffb80f2d3c7ee8] SyS_write at ffffffff9b2946ec fs/read_write.c:582:0 #16 [ffffb80f2d3c7f30] do_syscall_64 at ffffffff9b003ca9 arch/x86/entry/common.c:298:0 #17 [ffffb80f2d3c7f58] entry_SYSCALL_64 at ffffffff9ba001b1 arch/x86/entry/entry_64.S:238:0
The lock is used to synchronize different sysfs operations; it doesn't protect any resource that will be touched by an interrupt. Consequently, it's not required to disable IRQs. Replace the spinlock with a mutex to fix the deadlock.
Signed-off-by: Junxiao Bi junxiao.bi@oracle.com Link: https://lore.kernel.org/r/20230828221018.19471-1-junxiao.bi@oracle.com Reviewed-by: Mike Christie michael.christie@oracle.com Cc: stable@vger.kernel.org Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/megaraid/megaraid_sas.h | 2 +- drivers/scsi/megaraid/megaraid_sas_base.c | 21 +++++++++------------ 2 files changed, 10 insertions(+), 13 deletions(-)
--- a/drivers/scsi/megaraid/megaraid_sas.h +++ b/drivers/scsi/megaraid/megaraid_sas.h @@ -2330,7 +2330,7 @@ struct megasas_instance { u32 support_morethan256jbod; /* FW support for more than 256 PD/JBOD */ bool use_seqnum_jbod_fp; /* Added for PD sequence */ bool smp_affinity_enable; - spinlock_t crashdump_lock; + struct mutex crashdump_lock;
struct megasas_register_set __iomem *reg_set; u32 __iomem *reply_post_host_index_addr[MR_MAX_MSIX_REG_ARRAY]; --- a/drivers/scsi/megaraid/megaraid_sas_base.c +++ b/drivers/scsi/megaraid/megaraid_sas_base.c @@ -3275,14 +3275,13 @@ fw_crash_buffer_store(struct device *cde struct megasas_instance *instance = (struct megasas_instance *) shost->hostdata; int val = 0; - unsigned long flags;
if (kstrtoint(buf, 0, &val) != 0) return -EINVAL;
- spin_lock_irqsave(&instance->crashdump_lock, flags); + mutex_lock(&instance->crashdump_lock); instance->fw_crash_buffer_offset = val; - spin_unlock_irqrestore(&instance->crashdump_lock, flags); + mutex_unlock(&instance->crashdump_lock); return strlen(buf); }
@@ -3297,24 +3296,23 @@ fw_crash_buffer_show(struct device *cdev unsigned long dmachunk = CRASH_DMA_BUF_SIZE; unsigned long chunk_left_bytes; unsigned long src_addr; - unsigned long flags; u32 buff_offset;
- spin_lock_irqsave(&instance->crashdump_lock, flags); + mutex_lock(&instance->crashdump_lock); buff_offset = instance->fw_crash_buffer_offset; if (!instance->crash_dump_buf || !((instance->fw_crash_state == AVAILABLE) || (instance->fw_crash_state == COPYING))) { dev_err(&instance->pdev->dev, "Firmware crash dump is not available\n"); - spin_unlock_irqrestore(&instance->crashdump_lock, flags); + mutex_unlock(&instance->crashdump_lock); return -EINVAL; }
if (buff_offset > (instance->fw_crash_buffer_size * dmachunk)) { dev_err(&instance->pdev->dev, "Firmware crash dump offset is out of range\n"); - spin_unlock_irqrestore(&instance->crashdump_lock, flags); + mutex_unlock(&instance->crashdump_lock); return 0; }
@@ -3326,7 +3324,7 @@ fw_crash_buffer_show(struct device *cdev src_addr = (unsigned long)instance->crash_buf[buff_offset / dmachunk] + (buff_offset % dmachunk); memcpy(buf, (void *)src_addr, size); - spin_unlock_irqrestore(&instance->crashdump_lock, flags); + mutex_unlock(&instance->crashdump_lock);
return size; } @@ -3351,7 +3349,6 @@ fw_crash_state_store(struct device *cdev struct megasas_instance *instance = (struct megasas_instance *) shost->hostdata; int val = 0; - unsigned long flags;
if (kstrtoint(buf, 0, &val) != 0) return -EINVAL; @@ -3365,9 +3362,9 @@ fw_crash_state_store(struct device *cdev instance->fw_crash_state = val;
if ((val == COPIED) || (val == COPY_ERROR)) { - spin_lock_irqsave(&instance->crashdump_lock, flags); + mutex_lock(&instance->crashdump_lock); megasas_free_host_crash_buffer(instance); - spin_unlock_irqrestore(&instance->crashdump_lock, flags); + mutex_unlock(&instance->crashdump_lock); if (val == COPY_ERROR) dev_info(&instance->pdev->dev, "application failed to " "copy Firmware crash dump\n"); @@ -7432,7 +7429,7 @@ static inline void megasas_init_ctrl_par init_waitqueue_head(&instance->int_cmd_wait_q); init_waitqueue_head(&instance->abort_cmd_wait_q);
- spin_lock_init(&instance->crashdump_lock); + mutex_init(&instance->crashdump_lock); spin_lock_init(&instance->mfi_pool_lock); spin_lock_init(&instance->hba_lock); spin_lock_init(&instance->stream_lock);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Damien Le Moal dlemoal@kernel.org
commit c91774818b041ed290df29fb1dc0725be9b12e83 upstream.
The function pm8001_pci_resume() only calls pm8001_request_irq() without calling pm8001_setup_irq(). This causes the IRQ allocation to fail, which leads to all drives being removed from the system.
Fix this issue by integrating the code for pm8001_setup_irq() directly inside pm8001_request_irq() so that MSI-X setup is performed both during normal initialization and resume operations.
Fixes: dbf9bfe61571 ("[SCSI] pm8001: add SAS/SATA HBA driver") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal dlemoal@kernel.org Link: https://lore.kernel.org/r/20230911232745.325149-2-dlemoal@kernel.org Acked-by: Jack Wang jinpu.wang@ionos.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/scsi/pm8001/pm8001_init.c | 51 ++++++++++++-------------------------- 1 file changed, 17 insertions(+), 34 deletions(-)
--- a/drivers/scsi/pm8001/pm8001_init.c +++ b/drivers/scsi/pm8001/pm8001_init.c @@ -255,7 +255,6 @@ static irqreturn_t pm8001_interrupt_hand return ret; }
-static u32 pm8001_setup_irq(struct pm8001_hba_info *pm8001_ha); static u32 pm8001_request_irq(struct pm8001_hba_info *pm8001_ha);
/** @@ -276,13 +275,6 @@ static int pm8001_alloc(struct pm8001_hb pm8001_dbg(pm8001_ha, INIT, "pm8001_alloc: PHY:%x\n", pm8001_ha->chip->n_phy);
- /* Setup Interrupt */ - rc = pm8001_setup_irq(pm8001_ha); - if (rc) { - pm8001_dbg(pm8001_ha, FAIL, - "pm8001_setup_irq failed [ret: %d]\n", rc); - goto err_out; - } /* Request Interrupt */ rc = pm8001_request_irq(pm8001_ha); if (rc) @@ -1002,47 +994,38 @@ static u32 pm8001_request_msix(struct pm } #endif
-static u32 pm8001_setup_irq(struct pm8001_hba_info *pm8001_ha) -{ - struct pci_dev *pdev; - - pdev = pm8001_ha->pdev; - -#ifdef PM8001_USE_MSIX - if (pci_find_capability(pdev, PCI_CAP_ID_MSIX)) - return pm8001_setup_msix(pm8001_ha); - pm8001_dbg(pm8001_ha, INIT, "MSIX not supported!!!\n"); -#endif - return 0; -} - /** * pm8001_request_irq - register interrupt * @pm8001_ha: our ha struct. */ static u32 pm8001_request_irq(struct pm8001_hba_info *pm8001_ha) { - struct pci_dev *pdev; + struct pci_dev *pdev = pm8001_ha->pdev; +#ifdef PM8001_USE_MSIX int rc;
- pdev = pm8001_ha->pdev; + if (pci_find_capability(pdev, PCI_CAP_ID_MSIX)) { + rc = pm8001_setup_msix(pm8001_ha); + if (rc) { + pm8001_dbg(pm8001_ha, FAIL, + "pm8001_setup_irq failed [ret: %d]\n", rc); + return rc; + }
-#ifdef PM8001_USE_MSIX - if (pdev->msix_cap && pci_msi_enabled()) - return pm8001_request_msix(pm8001_ha); - else { - pm8001_dbg(pm8001_ha, INIT, "MSIX not supported!!!\n"); - goto intx; + if (pdev->msix_cap && pci_msi_enabled()) + return pm8001_request_msix(pm8001_ha); } + + pm8001_dbg(pm8001_ha, INIT, "MSIX not supported!!!\n"); #endif
-intx: /* initialize the INT-X interrupt */ pm8001_ha->irq_vector[0].irq_id = 0; pm8001_ha->irq_vector[0].drv_inst = pm8001_ha; - rc = request_irq(pdev->irq, pm8001_interrupt_handler_intx, IRQF_SHARED, - pm8001_ha->name, SHOST_TO_SAS_HA(pm8001_ha->shost)); - return rc; + + return request_irq(pdev->irq, pm8001_interrupt_handler_intx, + IRQF_SHARED, pm8001_ha->name, + SHOST_TO_SAS_HA(pm8001_ha->shost)); }
/**
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shida Zhang zhangshida@kylinos.cn
commit 7fda67e8c3ab6069f75888f67958a6d30454a9f6 upstream.
With the configuration PAGE_SIZE 64k and filesystem blocksize 64k, a problem occurred when more than 13 million files were directly created under a directory:
EXT4-fs error (device xx): ext4_dx_csum_set:492: inode #xxxx: comm xxxxx: dir seems corrupt? Run e2fsck -D. EXT4-fs error (device xx): ext4_dx_csum_verify:463: inode #xxxx: comm xxxxx: dir seems corrupt? Run e2fsck -D. EXT4-fs error (device xx): dx_probe:856: inode #xxxx: block 8188: comm xxxxx: Directory index failed checksum
When enough files are created, the fake_dirent->reclen will be 0xffff. It doesn't equal the blocksize 65536, i.e. 0x10000.
But it is not the same when the blocksize equals 4k: when enough files are created, the fake_dirent->reclen will be 0x1000, which does equal the blocksize 4k, i.e. 0x1000.
The problem seems to be related to the limitation of the 16-bit field when the blocksize is set to 64k. To address this, helpers like ext4_rec_len_{from,to}_disk have already been introduced to convert between the encoded and the plain form of rec_len.
So fix this one by using the helpers, and fix all the other raw rec_len comparisons in this file too.
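To make the 16-bit limitation concrete, here is a small standalone C sketch (userspace, not the ext4 code; the helper names and the encoding below are simplified stand-ins for ext4_rec_len_{from,to}_disk). It stores a record spanning a full 64KiB block as 0xffff on disk, which a raw 16-bit comparison against the blocksize rejects, while decoding through the helper recovers the real length:

  /*
   * Simplified model: on a 64k-block filesystem a full-block record cannot
   * be stored as 65536 in a 16-bit field, so 0xffff stands in for it.
   */
  #include <assert.h>
  #include <stdint.h>
  #include <stdio.h>

  static unsigned int rec_len_from_disk(uint16_t dlen, unsigned int blocksize)
  {
  	if (dlen == 0xffff || dlen == 0)	/* full-block record */
  		return blocksize;
  	return dlen;
  }

  static uint16_t rec_len_to_disk(unsigned int len, unsigned int blocksize)
  {
  	if (len == blocksize && blocksize == 65536)
  		return 0xffff;			/* 65536 does not fit in 16 bits */
  	return (uint16_t)len;
  }

  int main(void)
  {
  	unsigned int blocksize = 65536;
  	uint16_t on_disk = rec_len_to_disk(blocksize, blocksize);

  	/* A raw comparison sees 0xffff != 65536 and flags the directory
  	 * as corrupt, which is what the report above shows. */
  	printf("raw rec_len 0x%x == blocksize? %s\n", on_disk,
  	       on_disk == blocksize ? "yes" : "no");

  	/* Decoding through the helper recovers the real length. */
  	assert(rec_len_from_disk(on_disk, blocksize) == blocksize);
  	return 0;
  }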
Cc: stable@kernel.org Fixes: dbe89444042a ("ext4: Calculate and verify checksums for htree nodes") Suggested-by: Andreas Dilger adilger@dilger.ca Suggested-by: Darrick J. Wong djwong@kernel.org Signed-off-by: Shida Zhang zhangshida@kylinos.cn Reviewed-by: Andreas Dilger adilger@dilger.ca Reviewed-by: Darrick J. Wong djwong@kernel.org Link: https://lore.kernel.org/r/20230803060938.1929759-1-zhangshida@kylinos.cn Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/ext4/namei.c | 26 +++++++++++++++----------- 1 file changed, 15 insertions(+), 11 deletions(-)
--- a/fs/ext4/namei.c +++ b/fs/ext4/namei.c @@ -343,17 +343,17 @@ static struct ext4_dir_entry_tail *get_d struct buffer_head *bh) { struct ext4_dir_entry_tail *t; + int blocksize = EXT4_BLOCK_SIZE(inode->i_sb);
#ifdef PARANOID struct ext4_dir_entry *d, *top;
d = (struct ext4_dir_entry *)bh->b_data; top = (struct ext4_dir_entry *)(bh->b_data + - (EXT4_BLOCK_SIZE(inode->i_sb) - - sizeof(struct ext4_dir_entry_tail))); - while (d < top && d->rec_len) + (blocksize - sizeof(struct ext4_dir_entry_tail))); + while (d < top && ext4_rec_len_from_disk(d->rec_len, blocksize)) d = (struct ext4_dir_entry *)(((void *)d) + - le16_to_cpu(d->rec_len)); + ext4_rec_len_from_disk(d->rec_len, blocksize));
if (d != top) return NULL; @@ -364,7 +364,8 @@ static struct ext4_dir_entry_tail *get_d #endif
if (t->det_reserved_zero1 || - le16_to_cpu(t->det_rec_len) != sizeof(struct ext4_dir_entry_tail) || + (ext4_rec_len_from_disk(t->det_rec_len, blocksize) != + sizeof(struct ext4_dir_entry_tail)) || t->det_reserved_zero2 || t->det_reserved_ft != EXT4_FT_DIR_CSUM) return NULL; @@ -445,13 +446,14 @@ static struct dx_countlimit *get_dx_coun struct ext4_dir_entry *dp; struct dx_root_info *root; int count_offset; + int blocksize = EXT4_BLOCK_SIZE(inode->i_sb); + unsigned int rlen = ext4_rec_len_from_disk(dirent->rec_len, blocksize);
- if (le16_to_cpu(dirent->rec_len) == EXT4_BLOCK_SIZE(inode->i_sb)) + if (rlen == blocksize) count_offset = 8; - else if (le16_to_cpu(dirent->rec_len) == 12) { + else if (rlen == 12) { dp = (struct ext4_dir_entry *)(((void *)dirent) + 12); - if (le16_to_cpu(dp->rec_len) != - EXT4_BLOCK_SIZE(inode->i_sb) - 12) + if (ext4_rec_len_from_disk(dp->rec_len, blocksize) != blocksize - 12) return NULL; root = (struct dx_root_info *)(((void *)dp + 12)); if (root->reserved_zero || @@ -1315,6 +1317,7 @@ static int dx_make_map(struct inode *dir unsigned int buflen = bh->b_size; char *base = bh->b_data; struct dx_hash_info h = *hinfo; + int blocksize = EXT4_BLOCK_SIZE(dir->i_sb);
if (ext4_has_metadata_csum(dir->i_sb)) buflen -= sizeof(struct ext4_dir_entry_tail); @@ -1335,11 +1338,12 @@ static int dx_make_map(struct inode *dir map_tail--; map_tail->hash = h.hash; map_tail->offs = ((char *) de - base)>>2; - map_tail->size = le16_to_cpu(de->rec_len); + map_tail->size = ext4_rec_len_from_disk(de->rec_len, + blocksize); count++; cond_resched(); } - de = ext4_next_entry(de, dir->i_sb->s_blocksize); + de = ext4_next_entry(de, blocksize); } return count; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yifan Zhang yifan1.zhang@amd.com
commit ef064187a9709393a981a56cce1e31880fd97107 upstream.
Dropping bits 31:4 of the page table base is wrong: it makes the page table base point to the wrong address if the physical address is beyond 64GB. Dropping bits 31:4 of page_table_start/end is unnecessary since dcn20_vmid_setup will do that. Also, while we are at it, clean up the assignments using upper_32_bits()/lower_32_bits() and AMDGPU_GPU_PAGE_SHIFT.
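As a standalone arithmetic illustration of the truncation (userspace sketch, not driver code; the pt_base value is made up), masking the upper half of the page table base with 0xF keeps only address bits 32-35, so any base at or above 64GB loses its high bits:

  #include <inttypes.h>
  #include <stdint.h>
  #include <stdio.h>

  /* local stand-ins for the kernel's upper/lower_32_bits() helpers */
  #define upper_32_bits(n) ((uint32_t)((n) >> 32))
  #define lower_32_bits(n) ((uint32_t)(n))

  int main(void)
  {
  	uint64_t pt_base = 0x1123456000ULL;	/* ~68.5 GiB, beyond 64 GiB */
  	uint32_t old_high = upper_32_bits(pt_base) & 0xF;	/* bits 32-35 only */
  	uint32_t new_high = upper_32_bits(pt_base);		/* all upper bits */

  	printf("old high_part 0x%" PRIx32 " -> base 0x%" PRIx64 " (wrong)\n",
  	       old_high, ((uint64_t)old_high << 32) | lower_32_bits(pt_base));
  	printf("new high_part 0x%" PRIx32 " -> base 0x%" PRIx64 "\n",
  	       new_high, ((uint64_t)new_high << 32) | lower_32_bits(pt_base));
  	return 0;
  }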
Cc: stable@vger.kernel.org Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2354 Fixes: 81d0bcf99009 ("drm/amdgpu: make display pinning more flexible (v2)") Acked-by: Harry Wentland harry.wentland@amd.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Yifan Zhang yifan1.zhang@amd.com Co-developed-by: Hamza Mahfooz hamza.mahfooz@amd.com Signed-off-by: Hamza Mahfooz hamza.mahfooz@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-)
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -1204,11 +1204,15 @@ static void mmhub_read_system_context(st agp_top = adev->gmc.agp_end >> 24;
- page_table_start.high_part = (u32)(adev->gmc.gart_start >> 44) & 0xF; - page_table_start.low_part = (u32)(adev->gmc.gart_start >> 12); - page_table_end.high_part = (u32)(adev->gmc.gart_end >> 44) & 0xF; - page_table_end.low_part = (u32)(adev->gmc.gart_end >> 12); - page_table_base.high_part = upper_32_bits(pt_base) & 0xF; + page_table_start.high_part = upper_32_bits(adev->gmc.gart_start >> + AMDGPU_GPU_PAGE_SHIFT); + page_table_start.low_part = lower_32_bits(adev->gmc.gart_start >> + AMDGPU_GPU_PAGE_SHIFT); + page_table_end.high_part = upper_32_bits(adev->gmc.gart_end >> + AMDGPU_GPU_PAGE_SHIFT); + page_table_end.low_part = lower_32_bits(adev->gmc.gart_end >> + AMDGPU_GPU_PAGE_SHIFT); + page_table_base.high_part = upper_32_bits(pt_base); page_table_base.low_part = lower_32_bits(pt_base);
pa_config->system_aperture.start_addr = (uint64_t)logical_addr_low << 18;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christian König christian.koenig@amd.com
commit 35588314e963938dfdcdb792c9170108399377d6 upstream.
The offset is just 32 bits here, so the check "offset + 8 > size" can potentially overflow if somebody specifies a large value. Instead, reduce the size to calculate the last possible offset.
The error handling path also incorrectly drops the reference to the user fence BO, resulting in a potential reference count underflow.
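A small standalone C sketch of the overflow (userspace, values made up; only the two bounds checks are modeled):

  #include <stdbool.h>
  #include <stdint.h>
  #include <stdio.h>

  int main(void)
  {
  	uint32_t offset = 0xfffffff8u;	/* hostile chunk offset */
  	unsigned long size = 4096;	/* BO size, i.e. PAGE_SIZE */

  	/* Old check: "offset + 8" wraps to 0 in 32-bit arithmetic,
  	 * so the out-of-range offset is accepted. */
  	bool old_rejects = (offset + 8) > size;
  	/* New check: subtracting from the size cannot overflow here
  	 * (size is known to be PAGE_SIZE), so the offset is rejected. */
  	bool new_rejects = offset > (size - 8);

  	printf("old check rejects: %s\n", old_rejects ? "yes" : "no");
  	printf("new check rejects: %s\n", new_rejects ? "yes" : "no");
  	return 0;
  }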
Signed-off-by: Christian König christian.koenig@amd.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 20 +++++--------------- 1 file changed, 5 insertions(+), 15 deletions(-)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c @@ -45,7 +45,6 @@ static int amdgpu_cs_user_fence_chunk(st struct drm_gem_object *gobj; struct amdgpu_bo *bo; unsigned long size; - int r;
gobj = drm_gem_object_lookup(p->filp, data->handle); if (gobj == NULL) @@ -60,23 +59,14 @@ static int amdgpu_cs_user_fence_chunk(st drm_gem_object_put(gobj);
size = amdgpu_bo_size(bo); - if (size != PAGE_SIZE || (data->offset + 8) > size) { - r = -EINVAL; - goto error_unref; - } - - if (amdgpu_ttm_tt_get_usermm(bo->tbo.ttm)) { - r = -EINVAL; - goto error_unref; - } + if (size != PAGE_SIZE || data->offset > (size - 8)) + return -EINVAL;
- *offset = data->offset; + if (amdgpu_ttm_tt_get_usermm(bo->tbo.ttm)) + return -EINVAL;
+ *offset = data->offset; return 0; - -error_unref: - amdgpu_bo_unref(&bo); - return r; }
static int amdgpu_cs_bo_handles_chunk(struct amdgpu_cs_parser *p,
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jamal Hadi Salim jhs@mojatatu.com
commit 265b4da82dbf5df04bee5a5d46b7474b1aaf326a upstream.
The rsvp classifier has served us well for about a quarter of a century but has not been getting much maintenance attention due to a lack of known users.
Signed-off-by: Jamal Hadi Salim jhs@mojatatu.com Acked-by: Jiri Pirko jiri@nvidia.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Kyle Zeng zengyhkyle@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
--- net/sched/Kconfig | 28 - net/sched/Makefile | 2 net/sched/cls_rsvp.c | 24 - net/sched/cls_rsvp.h | 776 -------------------------------------------------- net/sched/cls_rsvp6.c | 24 - 5 files changed, 854 deletions(-)
--- a/net/sched/Kconfig +++ b/net/sched/Kconfig @@ -548,34 +548,6 @@ config CLS_U32_MARK help Say Y here to be able to use netfilter marks as u32 key.
-config NET_CLS_RSVP - tristate "IPv4 Resource Reservation Protocol (RSVP)" - select NET_CLS - help - The Resource Reservation Protocol (RSVP) permits end systems to - request a minimum and maximum data flow rate for a connection; this - is important for real time data such as streaming sound or video. - - Say Y here if you want to be able to classify outgoing packets based - on their RSVP requests. - - To compile this code as a module, choose M here: the - module will be called cls_rsvp. - -config NET_CLS_RSVP6 - tristate "IPv6 Resource Reservation Protocol (RSVP6)" - select NET_CLS - help - The Resource Reservation Protocol (RSVP) permits end systems to - request a minimum and maximum data flow rate for a connection; this - is important for real time data such as streaming sound or video. - - Say Y here if you want to be able to classify outgoing packets based - on their RSVP requests and you are using the IPv6 protocol. - - To compile this code as a module, choose M here: the - module will be called cls_rsvp6. - config NET_CLS_FLOW tristate "Flow classifier" select NET_CLS --- a/net/sched/Makefile +++ b/net/sched/Makefile @@ -69,8 +69,6 @@ obj-$(CONFIG_NET_SCH_TAPRIO) += sch_tapr obj-$(CONFIG_NET_CLS_U32) += cls_u32.o obj-$(CONFIG_NET_CLS_ROUTE4) += cls_route.o obj-$(CONFIG_NET_CLS_FW) += cls_fw.o -obj-$(CONFIG_NET_CLS_RSVP) += cls_rsvp.o -obj-$(CONFIG_NET_CLS_RSVP6) += cls_rsvp6.o obj-$(CONFIG_NET_CLS_BASIC) += cls_basic.o obj-$(CONFIG_NET_CLS_FLOW) += cls_flow.o obj-$(CONFIG_NET_CLS_CGROUP) += cls_cgroup.o --- a/net/sched/cls_rsvp.c +++ /dev/null @@ -1,24 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0-or-later -/* - * net/sched/cls_rsvp.c Special RSVP packet classifier for IPv4. - * - * Authors: Alexey Kuznetsov, kuznet@ms2.inr.ac.ru - */ - -#include <linux/module.h> -#include <linux/types.h> -#include <linux/kernel.h> -#include <linux/string.h> -#include <linux/errno.h> -#include <linux/skbuff.h> -#include <net/ip.h> -#include <net/netlink.h> -#include <net/act_api.h> -#include <net/pkt_cls.h> - -#define RSVP_DST_LEN 1 -#define RSVP_ID "rsvp" -#define RSVP_OPS cls_rsvp_ops - -#include "cls_rsvp.h" -MODULE_LICENSE("GPL"); --- a/net/sched/cls_rsvp.h +++ /dev/null @@ -1,776 +0,0 @@ -/* SPDX-License-Identifier: GPL-2.0-or-later */ -/* - * net/sched/cls_rsvp.h Template file for RSVPv[46] classifiers. - * - * Authors: Alexey Kuznetsov, kuznet@ms2.inr.ac.ru - */ - -/* - Comparing to general packet classification problem, - RSVP needs only several relatively simple rules: - - * (dst, protocol) are always specified, - so that we are able to hash them. - * src may be exact, or may be wildcard, so that - we can keep a hash table plus one wildcard entry. - * source port (or flow label) is important only if src is given. - - IMPLEMENTATION. - - We use a two level hash table: The top level is keyed by - destination address and protocol ID, every bucket contains a list - of "rsvp sessions", identified by destination address, protocol and - DPI(="Destination Port ID"): triple (key, mask, offset). - - Every bucket has a smaller hash table keyed by source address - (cf. RSVP flowspec) and one wildcard entry for wildcard reservations. - Every bucket is again a list of "RSVP flows", selected by - source address and SPI(="Source Port ID" here rather than - "security parameter index"): triple (key, mask, offset). - - - NOTE 1. All the packets with IPv6 extension headers (but AH and ESP) - and all fragmented packets go to the best-effort traffic class. - - - NOTE 2. 
Two "port id"'s seems to be redundant, rfc2207 requires - only one "Generalized Port Identifier". So that for classic - ah, esp (and udp,tcp) both *pi should coincide or one of them - should be wildcard. - - At first sight, this redundancy is just a waste of CPU - resources. But DPI and SPI add the possibility to assign different - priorities to GPIs. Look also at note 4 about tunnels below. - - - NOTE 3. One complication is the case of tunneled packets. - We implement it as following: if the first lookup - matches a special session with "tunnelhdr" value not zero, - flowid doesn't contain the true flow ID, but the tunnel ID (1...255). - In this case, we pull tunnelhdr bytes and restart lookup - with tunnel ID added to the list of keys. Simple and stupid 8)8) - It's enough for PIMREG and IPIP. - - - NOTE 4. Two GPIs make it possible to parse even GRE packets. - F.e. DPI can select ETH_P_IP (and necessary flags to make - tunnelhdr correct) in GRE protocol field and SPI matches - GRE key. Is it not nice? 8)8) - - - Well, as result, despite its simplicity, we get a pretty - powerful classification engine. */ - - -struct rsvp_head { - u32 tmap[256/32]; - u32 hgenerator; - u8 tgenerator; - struct rsvp_session __rcu *ht[256]; - struct rcu_head rcu; -}; - -struct rsvp_session { - struct rsvp_session __rcu *next; - __be32 dst[RSVP_DST_LEN]; - struct tc_rsvp_gpi dpi; - u8 protocol; - u8 tunnelid; - /* 16 (src,sport) hash slots, and one wildcard source slot */ - struct rsvp_filter __rcu *ht[16 + 1]; - struct rcu_head rcu; -}; - - -struct rsvp_filter { - struct rsvp_filter __rcu *next; - __be32 src[RSVP_DST_LEN]; - struct tc_rsvp_gpi spi; - u8 tunnelhdr; - - struct tcf_result res; - struct tcf_exts exts; - - u32 handle; - struct rsvp_session *sess; - struct rcu_work rwork; -}; - -static inline unsigned int hash_dst(__be32 *dst, u8 protocol, u8 tunnelid) -{ - unsigned int h = (__force __u32)dst[RSVP_DST_LEN - 1]; - - h ^= h>>16; - h ^= h>>8; - return (h ^ protocol ^ tunnelid) & 0xFF; -} - -static inline unsigned int hash_src(__be32 *src) -{ - unsigned int h = (__force __u32)src[RSVP_DST_LEN-1]; - - h ^= h>>16; - h ^= h>>8; - h ^= h>>4; - return h & 0xF; -} - -#define RSVP_APPLY_RESULT() \ -{ \ - int r = tcf_exts_exec(skb, &f->exts, res); \ - if (r < 0) \ - continue; \ - else if (r > 0) \ - return r; \ -} - -static int rsvp_classify(struct sk_buff *skb, const struct tcf_proto *tp, - struct tcf_result *res) -{ - struct rsvp_head *head = rcu_dereference_bh(tp->root); - struct rsvp_session *s; - struct rsvp_filter *f; - unsigned int h1, h2; - __be32 *dst, *src; - u8 protocol; - u8 tunnelid = 0; - u8 *xprt; -#if RSVP_DST_LEN == 4 - struct ipv6hdr *nhptr; - - if (!pskb_network_may_pull(skb, sizeof(*nhptr))) - return -1; - nhptr = ipv6_hdr(skb); -#else - struct iphdr *nhptr; - - if (!pskb_network_may_pull(skb, sizeof(*nhptr))) - return -1; - nhptr = ip_hdr(skb); -#endif -restart: - -#if RSVP_DST_LEN == 4 - src = &nhptr->saddr.s6_addr32[0]; - dst = &nhptr->daddr.s6_addr32[0]; - protocol = nhptr->nexthdr; - xprt = ((u8 *)nhptr) + sizeof(struct ipv6hdr); -#else - src = &nhptr->saddr; - dst = &nhptr->daddr; - protocol = nhptr->protocol; - xprt = ((u8 *)nhptr) + (nhptr->ihl<<2); - if (ip_is_fragment(nhptr)) - return -1; -#endif - - h1 = hash_dst(dst, protocol, tunnelid); - h2 = hash_src(src); - - for (s = rcu_dereference_bh(head->ht[h1]); s; - s = rcu_dereference_bh(s->next)) { - if (dst[RSVP_DST_LEN-1] == s->dst[RSVP_DST_LEN - 1] && - protocol == s->protocol && - !(s->dpi.mask & - (*(u32 *)(xprt + s->dpi.offset) 
^ s->dpi.key)) && -#if RSVP_DST_LEN == 4 - dst[0] == s->dst[0] && - dst[1] == s->dst[1] && - dst[2] == s->dst[2] && -#endif - tunnelid == s->tunnelid) { - - for (f = rcu_dereference_bh(s->ht[h2]); f; - f = rcu_dereference_bh(f->next)) { - if (src[RSVP_DST_LEN-1] == f->src[RSVP_DST_LEN - 1] && - !(f->spi.mask & (*(u32 *)(xprt + f->spi.offset) ^ f->spi.key)) -#if RSVP_DST_LEN == 4 - && - src[0] == f->src[0] && - src[1] == f->src[1] && - src[2] == f->src[2] -#endif - ) { - *res = f->res; - RSVP_APPLY_RESULT(); - -matched: - if (f->tunnelhdr == 0) - return 0; - - tunnelid = f->res.classid; - nhptr = (void *)(xprt + f->tunnelhdr - sizeof(*nhptr)); - goto restart; - } - } - - /* And wildcard bucket... */ - for (f = rcu_dereference_bh(s->ht[16]); f; - f = rcu_dereference_bh(f->next)) { - *res = f->res; - RSVP_APPLY_RESULT(); - goto matched; - } - return -1; - } - } - return -1; -} - -static void rsvp_replace(struct tcf_proto *tp, struct rsvp_filter *n, u32 h) -{ - struct rsvp_head *head = rtnl_dereference(tp->root); - struct rsvp_session *s; - struct rsvp_filter __rcu **ins; - struct rsvp_filter *pins; - unsigned int h1 = h & 0xFF; - unsigned int h2 = (h >> 8) & 0xFF; - - for (s = rtnl_dereference(head->ht[h1]); s; - s = rtnl_dereference(s->next)) { - for (ins = &s->ht[h2], pins = rtnl_dereference(*ins); ; - ins = &pins->next, pins = rtnl_dereference(*ins)) { - if (pins->handle == h) { - RCU_INIT_POINTER(n->next, pins->next); - rcu_assign_pointer(*ins, n); - return; - } - } - } - - /* Something went wrong if we are trying to replace a non-existent - * node. Mind as well halt instead of silently failing. - */ - BUG_ON(1); -} - -static void *rsvp_get(struct tcf_proto *tp, u32 handle) -{ - struct rsvp_head *head = rtnl_dereference(tp->root); - struct rsvp_session *s; - struct rsvp_filter *f; - unsigned int h1 = handle & 0xFF; - unsigned int h2 = (handle >> 8) & 0xFF; - - if (h2 > 16) - return NULL; - - for (s = rtnl_dereference(head->ht[h1]); s; - s = rtnl_dereference(s->next)) { - for (f = rtnl_dereference(s->ht[h2]); f; - f = rtnl_dereference(f->next)) { - if (f->handle == handle) - return f; - } - } - return NULL; -} - -static int rsvp_init(struct tcf_proto *tp) -{ - struct rsvp_head *data; - - data = kzalloc(sizeof(struct rsvp_head), GFP_KERNEL); - if (data) { - rcu_assign_pointer(tp->root, data); - return 0; - } - return -ENOBUFS; -} - -static void __rsvp_delete_filter(struct rsvp_filter *f) -{ - tcf_exts_destroy(&f->exts); - tcf_exts_put_net(&f->exts); - kfree(f); -} - -static void rsvp_delete_filter_work(struct work_struct *work) -{ - struct rsvp_filter *f = container_of(to_rcu_work(work), - struct rsvp_filter, - rwork); - rtnl_lock(); - __rsvp_delete_filter(f); - rtnl_unlock(); -} - -static void rsvp_delete_filter(struct tcf_proto *tp, struct rsvp_filter *f) -{ - tcf_unbind_filter(tp, &f->res); - /* all classifiers are required to call tcf_exts_destroy() after rcu - * grace period, since converted-to-rcu actions are relying on that - * in cleanup() callback - */ - if (tcf_exts_get_net(&f->exts)) - tcf_queue_work(&f->rwork, rsvp_delete_filter_work); - else - __rsvp_delete_filter(f); -} - -static void rsvp_destroy(struct tcf_proto *tp, bool rtnl_held, - struct netlink_ext_ack *extack) -{ - struct rsvp_head *data = rtnl_dereference(tp->root); - int h1, h2; - - if (data == NULL) - return; - - for (h1 = 0; h1 < 256; h1++) { - struct rsvp_session *s; - - while ((s = rtnl_dereference(data->ht[h1])) != NULL) { - RCU_INIT_POINTER(data->ht[h1], s->next); - - for (h2 = 0; h2 <= 16; h2++) { - struct 
rsvp_filter *f; - - while ((f = rtnl_dereference(s->ht[h2])) != NULL) { - rcu_assign_pointer(s->ht[h2], f->next); - rsvp_delete_filter(tp, f); - } - } - kfree_rcu(s, rcu); - } - } - kfree_rcu(data, rcu); -} - -static int rsvp_delete(struct tcf_proto *tp, void *arg, bool *last, - bool rtnl_held, struct netlink_ext_ack *extack) -{ - struct rsvp_head *head = rtnl_dereference(tp->root); - struct rsvp_filter *nfp, *f = arg; - struct rsvp_filter __rcu **fp; - unsigned int h = f->handle; - struct rsvp_session __rcu **sp; - struct rsvp_session *nsp, *s = f->sess; - int i, h1; - - fp = &s->ht[(h >> 8) & 0xFF]; - for (nfp = rtnl_dereference(*fp); nfp; - fp = &nfp->next, nfp = rtnl_dereference(*fp)) { - if (nfp == f) { - RCU_INIT_POINTER(*fp, f->next); - rsvp_delete_filter(tp, f); - - /* Strip tree */ - - for (i = 0; i <= 16; i++) - if (s->ht[i]) - goto out; - - /* OK, session has no flows */ - sp = &head->ht[h & 0xFF]; - for (nsp = rtnl_dereference(*sp); nsp; - sp = &nsp->next, nsp = rtnl_dereference(*sp)) { - if (nsp == s) { - RCU_INIT_POINTER(*sp, s->next); - kfree_rcu(s, rcu); - goto out; - } - } - - break; - } - } - -out: - *last = true; - for (h1 = 0; h1 < 256; h1++) { - if (rcu_access_pointer(head->ht[h1])) { - *last = false; - break; - } - } - - return 0; -} - -static unsigned int gen_handle(struct tcf_proto *tp, unsigned salt) -{ - struct rsvp_head *data = rtnl_dereference(tp->root); - int i = 0xFFFF; - - while (i-- > 0) { - u32 h; - - if ((data->hgenerator += 0x10000) == 0) - data->hgenerator = 0x10000; - h = data->hgenerator|salt; - if (!rsvp_get(tp, h)) - return h; - } - return 0; -} - -static int tunnel_bts(struct rsvp_head *data) -{ - int n = data->tgenerator >> 5; - u32 b = 1 << (data->tgenerator & 0x1F); - - if (data->tmap[n] & b) - return 0; - data->tmap[n] |= b; - return 1; -} - -static void tunnel_recycle(struct rsvp_head *data) -{ - struct rsvp_session __rcu **sht = data->ht; - u32 tmap[256/32]; - int h1, h2; - - memset(tmap, 0, sizeof(tmap)); - - for (h1 = 0; h1 < 256; h1++) { - struct rsvp_session *s; - for (s = rtnl_dereference(sht[h1]); s; - s = rtnl_dereference(s->next)) { - for (h2 = 0; h2 <= 16; h2++) { - struct rsvp_filter *f; - - for (f = rtnl_dereference(s->ht[h2]); f; - f = rtnl_dereference(f->next)) { - if (f->tunnelhdr == 0) - continue; - data->tgenerator = f->res.classid; - tunnel_bts(data); - } - } - } - } - - memcpy(data->tmap, tmap, sizeof(tmap)); -} - -static u32 gen_tunnel(struct rsvp_head *data) -{ - int i, k; - - for (k = 0; k < 2; k++) { - for (i = 255; i > 0; i--) { - if (++data->tgenerator == 0) - data->tgenerator = 1; - if (tunnel_bts(data)) - return data->tgenerator; - } - tunnel_recycle(data); - } - return 0; -} - -static const struct nla_policy rsvp_policy[TCA_RSVP_MAX + 1] = { - [TCA_RSVP_CLASSID] = { .type = NLA_U32 }, - [TCA_RSVP_DST] = { .len = RSVP_DST_LEN * sizeof(u32) }, - [TCA_RSVP_SRC] = { .len = RSVP_DST_LEN * sizeof(u32) }, - [TCA_RSVP_PINFO] = { .len = sizeof(struct tc_rsvp_pinfo) }, -}; - -static int rsvp_change(struct net *net, struct sk_buff *in_skb, - struct tcf_proto *tp, unsigned long base, - u32 handle, struct nlattr **tca, - void **arg, u32 flags, - struct netlink_ext_ack *extack) -{ - struct rsvp_head *data = rtnl_dereference(tp->root); - struct rsvp_filter *f, *nfp; - struct rsvp_filter __rcu **fp; - struct rsvp_session *nsp, *s; - struct rsvp_session __rcu **sp; - struct tc_rsvp_pinfo *pinfo = NULL; - struct nlattr *opt = tca[TCA_OPTIONS]; - struct nlattr *tb[TCA_RSVP_MAX + 1]; - struct tcf_exts e; - unsigned int h1, h2; - __be32 
*dst; - int err; - - if (opt == NULL) - return handle ? -EINVAL : 0; - - err = nla_parse_nested_deprecated(tb, TCA_RSVP_MAX, opt, rsvp_policy, - NULL); - if (err < 0) - return err; - - err = tcf_exts_init(&e, net, TCA_RSVP_ACT, TCA_RSVP_POLICE); - if (err < 0) - return err; - err = tcf_exts_validate(net, tp, tb, tca[TCA_RATE], &e, flags, - extack); - if (err < 0) - goto errout2; - - f = *arg; - if (f) { - /* Node exists: adjust only classid */ - struct rsvp_filter *n; - - if (f->handle != handle && handle) - goto errout2; - - n = kmemdup(f, sizeof(*f), GFP_KERNEL); - if (!n) { - err = -ENOMEM; - goto errout2; - } - - err = tcf_exts_init(&n->exts, net, TCA_RSVP_ACT, - TCA_RSVP_POLICE); - if (err < 0) { - kfree(n); - goto errout2; - } - - if (tb[TCA_RSVP_CLASSID]) { - n->res.classid = nla_get_u32(tb[TCA_RSVP_CLASSID]); - tcf_bind_filter(tp, &n->res, base); - } - - tcf_exts_change(&n->exts, &e); - rsvp_replace(tp, n, handle); - return 0; - } - - /* Now more serious part... */ - err = -EINVAL; - if (handle) - goto errout2; - if (tb[TCA_RSVP_DST] == NULL) - goto errout2; - - err = -ENOBUFS; - f = kzalloc(sizeof(struct rsvp_filter), GFP_KERNEL); - if (f == NULL) - goto errout2; - - err = tcf_exts_init(&f->exts, net, TCA_RSVP_ACT, TCA_RSVP_POLICE); - if (err < 0) - goto errout; - h2 = 16; - if (tb[TCA_RSVP_SRC]) { - memcpy(f->src, nla_data(tb[TCA_RSVP_SRC]), sizeof(f->src)); - h2 = hash_src(f->src); - } - if (tb[TCA_RSVP_PINFO]) { - pinfo = nla_data(tb[TCA_RSVP_PINFO]); - f->spi = pinfo->spi; - f->tunnelhdr = pinfo->tunnelhdr; - } - if (tb[TCA_RSVP_CLASSID]) - f->res.classid = nla_get_u32(tb[TCA_RSVP_CLASSID]); - - dst = nla_data(tb[TCA_RSVP_DST]); - h1 = hash_dst(dst, pinfo ? pinfo->protocol : 0, pinfo ? pinfo->tunnelid : 0); - - err = -ENOMEM; - if ((f->handle = gen_handle(tp, h1 | (h2<<8))) == 0) - goto errout; - - if (f->tunnelhdr) { - err = -EINVAL; - if (f->res.classid > 255) - goto errout; - - err = -ENOMEM; - if (f->res.classid == 0 && - (f->res.classid = gen_tunnel(data)) == 0) - goto errout; - } - - for (sp = &data->ht[h1]; - (s = rtnl_dereference(*sp)) != NULL; - sp = &s->next) { - if (dst[RSVP_DST_LEN-1] == s->dst[RSVP_DST_LEN-1] && - pinfo && pinfo->protocol == s->protocol && - memcmp(&pinfo->dpi, &s->dpi, sizeof(s->dpi)) == 0 && -#if RSVP_DST_LEN == 4 - dst[0] == s->dst[0] && - dst[1] == s->dst[1] && - dst[2] == s->dst[2] && -#endif - pinfo->tunnelid == s->tunnelid) { - -insert: - /* OK, we found appropriate session */ - - fp = &s->ht[h2]; - - f->sess = s; - if (f->tunnelhdr == 0) - tcf_bind_filter(tp, &f->res, base); - - tcf_exts_change(&f->exts, &e); - - fp = &s->ht[h2]; - for (nfp = rtnl_dereference(*fp); nfp; - fp = &nfp->next, nfp = rtnl_dereference(*fp)) { - __u32 mask = nfp->spi.mask & f->spi.mask; - - if (mask != f->spi.mask) - break; - } - RCU_INIT_POINTER(f->next, nfp); - rcu_assign_pointer(*fp, f); - - *arg = f; - return 0; - } - } - - /* No session found. Create new one. 
*/ - - err = -ENOBUFS; - s = kzalloc(sizeof(struct rsvp_session), GFP_KERNEL); - if (s == NULL) - goto errout; - memcpy(s->dst, dst, sizeof(s->dst)); - - if (pinfo) { - s->dpi = pinfo->dpi; - s->protocol = pinfo->protocol; - s->tunnelid = pinfo->tunnelid; - } - sp = &data->ht[h1]; - for (nsp = rtnl_dereference(*sp); nsp; - sp = &nsp->next, nsp = rtnl_dereference(*sp)) { - if ((nsp->dpi.mask & s->dpi.mask) != s->dpi.mask) - break; - } - RCU_INIT_POINTER(s->next, nsp); - rcu_assign_pointer(*sp, s); - - goto insert; - -errout: - tcf_exts_destroy(&f->exts); - kfree(f); -errout2: - tcf_exts_destroy(&e); - return err; -} - -static void rsvp_walk(struct tcf_proto *tp, struct tcf_walker *arg, - bool rtnl_held) -{ - struct rsvp_head *head = rtnl_dereference(tp->root); - unsigned int h, h1; - - if (arg->stop) - return; - - for (h = 0; h < 256; h++) { - struct rsvp_session *s; - - for (s = rtnl_dereference(head->ht[h]); s; - s = rtnl_dereference(s->next)) { - for (h1 = 0; h1 <= 16; h1++) { - struct rsvp_filter *f; - - for (f = rtnl_dereference(s->ht[h1]); f; - f = rtnl_dereference(f->next)) { - if (arg->count < arg->skip) { - arg->count++; - continue; - } - if (arg->fn(tp, f, arg) < 0) { - arg->stop = 1; - return; - } - arg->count++; - } - } - } - } -} - -static int rsvp_dump(struct net *net, struct tcf_proto *tp, void *fh, - struct sk_buff *skb, struct tcmsg *t, bool rtnl_held) -{ - struct rsvp_filter *f = fh; - struct rsvp_session *s; - struct nlattr *nest; - struct tc_rsvp_pinfo pinfo; - - if (f == NULL) - return skb->len; - s = f->sess; - - t->tcm_handle = f->handle; - - nest = nla_nest_start_noflag(skb, TCA_OPTIONS); - if (nest == NULL) - goto nla_put_failure; - - if (nla_put(skb, TCA_RSVP_DST, sizeof(s->dst), &s->dst)) - goto nla_put_failure; - pinfo.dpi = s->dpi; - pinfo.spi = f->spi; - pinfo.protocol = s->protocol; - pinfo.tunnelid = s->tunnelid; - pinfo.tunnelhdr = f->tunnelhdr; - pinfo.pad = 0; - if (nla_put(skb, TCA_RSVP_PINFO, sizeof(pinfo), &pinfo)) - goto nla_put_failure; - if (f->res.classid && - nla_put_u32(skb, TCA_RSVP_CLASSID, f->res.classid)) - goto nla_put_failure; - if (((f->handle >> 8) & 0xFF) != 16 && - nla_put(skb, TCA_RSVP_SRC, sizeof(f->src), f->src)) - goto nla_put_failure; - - if (tcf_exts_dump(skb, &f->exts) < 0) - goto nla_put_failure; - - nla_nest_end(skb, nest); - - if (tcf_exts_dump_stats(skb, &f->exts) < 0) - goto nla_put_failure; - return skb->len; - -nla_put_failure: - nla_nest_cancel(skb, nest); - return -1; -} - -static void rsvp_bind_class(void *fh, u32 classid, unsigned long cl, void *q, - unsigned long base) -{ - struct rsvp_filter *f = fh; - - if (f && f->res.classid == classid) { - if (cl) - __tcf_bind_filter(q, &f->res, base); - else - __tcf_unbind_filter(q, &f->res); - } -} - -static struct tcf_proto_ops RSVP_OPS __read_mostly = { - .kind = RSVP_ID, - .classify = rsvp_classify, - .init = rsvp_init, - .destroy = rsvp_destroy, - .get = rsvp_get, - .change = rsvp_change, - .delete = rsvp_delete, - .walk = rsvp_walk, - .dump = rsvp_dump, - .bind_class = rsvp_bind_class, - .owner = THIS_MODULE, -}; - -static int __init init_rsvp(void) -{ - return register_tcf_proto_ops(&RSVP_OPS); -} - -static void __exit exit_rsvp(void) -{ - unregister_tcf_proto_ops(&RSVP_OPS); -} - -module_init(init_rsvp) -module_exit(exit_rsvp) --- a/net/sched/cls_rsvp6.c +++ /dev/null @@ -1,24 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0-or-later -/* - * net/sched/cls_rsvp6.c Special RSVP packet classifier for IPv6. 
- * - * Authors: Alexey Kuznetsov, kuznet@ms2.inr.ac.ru - */ - -#include <linux/module.h> -#include <linux/types.h> -#include <linux/kernel.h> -#include <linux/string.h> -#include <linux/errno.h> -#include <linux/ipv6.h> -#include <linux/skbuff.h> -#include <net/act_api.h> -#include <net/pkt_cls.h> -#include <net/netlink.h> - -#define RSVP_DST_LEN 4 -#define RSVP_ID "rsvp6" -#define RSVP_OPS cls_rsvp6_ops - -#include "cls_rsvp.h" -MODULE_LICENSE("GPL");
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Melissa Wen mwen@igalia.com
commit 57a943ebfcdb4a97fbb409640234bdb44bfa1953 upstream.
For DRM legacy gamma, AMD display manager applies implicit sRGB degamma using a pre-defined sRGB transfer function. It works fine for the DCN2 family, where degamma ROM and custom curves go to the same color block. But on DCN3+, degamma is split into two blocks: degamma ROM for pre-defined TFs and `gamma correction` for user/custom curves, and the degamma ROM settings don't apply to the cursor plane. To get DRM legacy gamma working as expected, enable cursor degamma ROM for implicit sRGB degamma on HW with this configuration.
Cc: stable@vger.kernel.org Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2803 Fixes: 96b020e2163f ("drm/amd/display: check attr flag before set cursor degamma on DCN3+") Signed-off-by: Melissa Wen mwen@igalia.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
+++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c
@@ -8794,6 +8794,13 @@ static void handle_cursor_update(struct
 	attributes.rotation_angle = 0;
 	attributes.attribute_flags.value = 0;
 
+	/* Enable cursor degamma ROM on DCN3+ for implicit sRGB degamma in DRM
+	 * legacy gamma setup.
+	 */
+	if (crtc_state->cm_is_degamma_srgb &&
+	    adev->dm.dc->caps.color.dpp.gamma_corr)
+		attributes.attribute_flags.bits.ENABLE_CURSOR_DEGAMMA = 1;
+
 	attributes.pitch = afb->base.pitches[0] / afb->base.format->cpp[0];
 
 	if (crtc_state->stream) {
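[Editor's note] As an aside for readers following the thread, the sketch below restates the check this hunk adds as a self-contained C program. The struct definitions are simplified stand-ins invented purely for illustration; only the field names mirror the ones in the patch, and this is not the amdgpu/DC code itself.

#include <stdbool.h>
#include <stdio.h>

/* Simplified stand-in types for this sketch; only the field names mirror
 * the ones used in the patch above.
 */
struct dpp_color_caps { bool gamma_corr; };		/* set on DCN3+ style hardware */
struct color_caps { struct dpp_color_caps dpp; };
struct dc_caps { struct color_caps color; };
struct crtc_state_stub { bool cm_is_degamma_srgb; };	/* DRM legacy gamma in use */
struct cursor_flags { unsigned int enable_cursor_degamma : 1; };

/* The cursor needs its degamma ROM only when legacy-gamma sRGB degamma is in
 * use and the hardware splits degamma between the ROM and the gamma-correction
 * block, which is what caps.color.dpp.gamma_corr indicates in the patch.
 */
static bool cursor_needs_degamma_rom(const struct crtc_state_stub *cs,
				     const struct dc_caps *caps)
{
	return cs->cm_is_degamma_srgb && caps->color.dpp.gamma_corr;
}

int main(void)
{
	struct dc_caps dcn3 = { .color.dpp.gamma_corr = true };
	struct crtc_state_stub legacy = { .cm_is_degamma_srgb = true };
	struct cursor_flags flags = { 0 };

	if (cursor_needs_degamma_rom(&legacy, &dcn3))
		flags.enable_cursor_degamma = 1;

	printf("cursor degamma ROM enabled: %u\n",
	       (unsigned int)flags.enable_cursor_degamma);
	return 0;
}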
Hello,
On Wed, 20 Sep 2023 13:30:58 +0200 Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.15.133 release. There are 110 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 22 Sep 2023 11:28:09 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.133-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
This rc kernel passes DAMON functionality test[1] on my test machine. Attaching the test results summary below. Please note that I retrieved the kernel from linux-stable-rc tree[2].
Tested-by: SeongJae Park sj@kernel.org
[1] https://github.com/awslabs/damon-tests/tree/next/corr [2] 634d2466eedd ("Linux 5.15.133-rc1")
Thanks, SJ
[...]
---
ok 1 selftests: damon: debugfs_attrs.sh ok 1 selftests: damon-tests: kunit.sh ok 2 selftests: damon-tests: huge_count_read_write.sh ok 3 selftests: damon-tests: buffer_overflow.sh ok 4 selftests: damon-tests: rm_contexts.sh ok 5 selftests: damon-tests: record_null_deref.sh ok 6 selftests: damon-tests: dbgfs_target_ids_read_before_terminate_race.sh ok 7 selftests: damon-tests: dbgfs_target_ids_pid_leak.sh ok 8 selftests: damon-tests: damo_tests.sh ok 9 selftests: damon-tests: masim-record.sh ok 10 selftests: damon-tests: build_i386.sh ok 11 selftests: damon-tests: build_m68k.sh ok 12 selftests: damon-tests: build_arm64.sh ok 13 selftests: damon-tests: build_i386_idle_flag.sh ok 14 selftests: damon-tests: build_i386_highpte.sh ok 15 selftests: damon-tests: build_nomemcg.sh PASS
On 9/20/23 04:30, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.133 release. There are 110 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 22 Sep 2023 11:28:09 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.133-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
Same thing as with 5.10 on ARM, ARM64 and MIPS:
CC /local/users/fainelli/buildroot/output/bmips/build/linux-custom/tools/perf/pmu-events/pmu-events.o fixdep: error opening depfile: /local/users/fainelli/buildroot/output/bmips/build/linux-custom/tools/perf/pmu-events/.pmu-events.o.d: No such file or directory make[5]: *** [pmu-events/Build:33: /local/users/fainelli/buildroot/output/bmips/build/linux-custom/tools/perf/pmu-events/pmu-events.o] Error 2 make[4]: *** [Makefile.perf:668: /local/users/fainelli/buildroot/output/bmips/build/linux-custom/tools/perf/pmu-events/pmu-events-in.o] Error 2 make[3]: *** [Makefile.perf:238: sub-make] Error 2 make[2]: *** [Makefile:70: all] Error 2 make[1]: *** [package/pkg-generic.mk:294: /local/users/fainelli/buildroot/output/bmips/build/linux-tools/.stamp_built] Error 2 make: *** [Makefile:27: _all] Error 2
reverting 95466975f143df7423bbf4905348c7b6260f4d29 ("perf build: Update build rule for generated files") and 12ff96780ebd93d1aacb396994e3a32b50dccecf ("perf jevents: Switch build to use jevents.py") gets us going again.
On Wed, Sep 20, 2023 at 11:47:17AM -0700, Florian Fainelli wrote:
On 9/20/23 04:30, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.133 release. There are 110 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 22 Sep 2023 11:28:09 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.133-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
Same thing as with 5.10 on ARM, ARM64 and MIPS:
CC /local/users/fainelli/buildroot/output/bmips/build/linux-custom/tools/perf/pmu-events/pmu-events.o fixdep: error opening depfile: /local/users/fainelli/buildroot/output/bmips/build/linux-custom/tools/perf/pmu-events/.pmu-events.o.d: No such file or directory make[5]: *** [pmu-events/Build:33: /local/users/fainelli/buildroot/output/bmips/build/linux-custom/tools/perf/pmu-events/pmu-events.o] Error 2 make[4]: *** [Makefile.perf:668: /local/users/fainelli/buildroot/output/bmips/build/linux-custom/tools/perf/pmu-events/pmu-events-in.o] Error 2 make[3]: *** [Makefile.perf:238: sub-make] Error 2 make[2]: *** [Makefile:70: all] Error 2 make[1]: *** [package/pkg-generic.mk:294: /local/users/fainelli/buildroot/output/bmips/build/linux-tools/.stamp_built] Error 2 make: *** [Makefile:27: _all] Error 2
reverting 95466975f143df7423bbf4905348c7b6260f4d29 ("perf build: Update build rule for generated files") and 12ff96780ebd93d1aacb396994e3a32b50dccecf ("perf jevents: Switch build to use jevents.py") gets us going again.
Now dropped from all queues, thanks!
greg k-h
On 9/20/23 05:30, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.133 release. There are 110 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 22 Sep 2023 11:28:09 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.133-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
On 9/20/23 04:30, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.133 release. There are 110 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 22 Sep 2023 11:28:09 +0000. Anything received after that time might be too late.
Building powerpc:ppc32_allmodconfig ... failed -------------- Error log: drivers/usb/gadget/udc/fsl_qe_udc.c: In function 'ch9getstatus': drivers/usb/gadget/udc/fsl_qe_udc.c:1961:17: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement] 1961 | struct qe_ep *target_ep = &udc->eps[pipe]; | ^~~~~~
This problem also affects v6.1.
Guenter
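[Editor's note] For readers unfamiliar with this diagnostic: the log above shows -Werror=declaration-after-statement, so on this branch any declaration placed after a statement in the same block fails the build. The snippet below is a made-up, minimal illustration of that warning class and the usual way to silence it (hoisting the declaration to the top of the block); it is not the fsl_qe_udc code itself, and all names in it are invented.

/* example.c - compile with:
 *   gcc -Wdeclaration-after-statement -Werror -c example.c
 * to reproduce the same class of diagnostic as the ppc32_allmodconfig build.
 */

struct ep { int index; };

static struct ep eps[4];

/* Rejected form: the declaration follows a statement in the same block,
 * which is the pattern the compiler flags in ch9getstatus().
 */
int lookup_rejected(int pipe)
{
	if (pipe < 0 || pipe >= 4)
		return -1;
	struct ep *target = &eps[pipe];	/* "ISO C90 forbids mixed declarations and code" */
	return target->index;
}

/* Accepted form: hoist the declaration to the top of the block and assign it
 * after the early-exit check.
 */
int lookup_accepted(int pipe)
{
	struct ep *target;

	if (pipe < 0 || pipe >= 4)
		return -1;
	target = &eps[pipe];
	return target->index;
}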
On Wed, 20 Sept 2023 at 14:44, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.15.133 release. There are 110 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 22 Sep 2023 11:28:09 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.133-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
## Build * kernel: 5.15.133-rc1 * git: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc * git branch: linux-5.15.y * git commit: 634d2466eedd8795e5251256807f08190998f237 * git describe: v5.15.132-111-g634d2466eedd * test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.15.y/build/v5.15....
## Test Regressions (compared to v5.15.132)
## Metric Regressions (compared to v5.15.132)
## Test Fixes (compared to v5.15.132)
## Metric Fixes (compared to v5.15.132)
## Test result summary total: 94047, pass: 74238, fail: 2610, skip: 17099, xfail: 100
## Build Summary * arc: 5 total, 5 passed, 0 failed * arm: 117 total, 117 passed, 0 failed * arm64: 44 total, 44 passed, 0 failed * i386: 35 total, 34 passed, 1 failed * mips: 27 total, 26 passed, 1 failed * parisc: 4 total, 4 passed, 0 failed * powerpc: 27 total, 26 passed, 1 failed * riscv: 11 total, 11 passed, 0 failed * s390: 12 total, 11 passed, 1 failed * sh: 14 total, 12 passed, 2 failed * sparc: 8 total, 8 passed, 0 failed * x86_64: 38 total, 38 passed, 0 failed
## Test suites summary * boot * kselftest-android * kselftest-arm64 * kselftest-breakpoints * kselftest-capabilities * kselftest-cgroup * kselftest-clone3 * kselftest-core * kselftest-cpu-hotplug * kselftest-cpufreq * kselftest-drivers-dma-buf * kselftest-efivarfs * kselftest-exec * kselftest-filesystems * kselftest-filesystems-binderfs * kselftest-filesystems-epoll * kselftest-firmware * kselftest-fpu * kselftest-ftrace * kselftest-futex * kselftest-gpio * kselftest-intel_pstate * kselftest-ipc * kselftest-ir * kselftest-kcmp * kselftest-kexec * kselftest-kvm * kselftest-lib * kselftest-membarrier * kselftest-memfd * kselftest-memory-hotplug * kselftest-mincore * kselftest-mount * kselftest-mqueue * kselftest-net * kselftest-net-forwarding * kselftest-net-mptcp * kselftest-netfilter * kselftest-nsfs * kselftest-openat2 * kselftest-pid_namespace * kselftest-pidfd * kselftest-proc * kselftest-pstore * kselftest-ptrace * kselftest-rseq * kselftest-rtc * kselftest-seccomp * kselftest-sigaltstack * kselftest-size * kselftest-splice * kselftest-static_keys * kselftest-sync * kselftest-sysctl * kselftest-tc-testing * kselftest-timens * kselftest-tmpfs * kselftest-tpm2 * kselftest-user * kselftest-user_events * kselftest-vDSO * kselftest-vm * kselftest-watchdog * kselftest-x86 * kselftest-zram * kunit * kvm-unit-tests * libgpiod * log-parser-boot * log-parser-test * ltp-cap_bounds * ltp-commands * ltp-containers * ltp-controllers * ltp-cpuhotplug * ltp-crypto * ltp-cve * ltp-dio * ltp-fcntl-locktests * ltp-filecaps * ltp-fs * ltp-fs_bind * ltp-fs_perms_simple * ltp-fsx * ltp-hugetlb * ltp-io * ltp-ipc * ltp-math * ltp-mm * ltp-nptl * ltp-pty * ltp-sched * ltp-securebits * ltp-smoke * ltp-syscalls * ltp-tracing * network-basic-tests * perf * rcutorture * v4l2-compliance
-- Linaro LKFT https://lkft.linaro.org
On Wed, Sep 20, 2023 at 01:30:58PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.133 release. There are 110 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 22 Sep 2023 11:28:09 +0000. Anything received after that time might be too late.
Build results: total: 160 pass: 157 fail: 3 Failed builds: i386:tools/perf powerpc:ppc32_allmodconfig x86_64:tools/perf Qemu test results: total: 509 pass: 509 fail: 0
perf:
gcc-10: fatal error: no input files
same as the problem in 5.10.y.
powerpc:ppc32_allmodconfig:
Building powerpc:ppc32_allmodconfig ... failed -------------- Error log: drivers/usb/gadget/udc/fsl_qe_udc.c: In function 'ch9getstatus': drivers/usb/gadget/udc/fsl_qe_udc.c:1961:17: error: ISO C90 forbids mixed declarations and code [-Werror=declaration-after-statement] 1961 | struct qe_ep *target_ep = &udc->eps[pipe];
Guenter
On Wed, Sep 20, 2023 at 01:30:58PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.133 release. There are 110 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 22 Sep 2023 11:28:09 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.133-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
For RCU, Tested-by: Joel Fernandes (Google) joel@joelfernandes.org
thanks,
- Joel
thanks,
greg k-h
Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 5.15.133-rc1
Melissa Wen mwen@igalia.com drm/amd/display: enable cursor degamma for DCN3+ DRM legacy gamma
Jamal Hadi Salim jhs@mojatatu.com net/sched: Retire rsvp classifier
Christian König christian.koenig@amd.com drm/amdgpu: fix amdgpu_cs_p1_user_fence
Yifan Zhang yifan1.zhang@amd.com drm/amd/display: fix the white screen issue when >= 64GB DRAM
Shida Zhang zhangshida@kylinos.cn ext4: fix rec_len verify error
Damien Le Moal dlemoal@kernel.org scsi: pm8001: Setup IRQs on resume
Junxiao Bi junxiao.bi@oracle.com scsi: megaraid_sas: Fix deadlock on firmware crashdump
Niklas Cassel niklas.cassel@wdc.com ata: libata: disallow dev-initiated LPM transitions to unsupported states
Tommy Huang tommy_huang@aspeedtech.com i2c: aspeed: Reset the i2c controller when timeout occurs
Steven Rostedt (Google) rostedt@goodmis.org tracefs: Add missing lockdown check to tracefs_create_dir()
Jeff Layton jlayton@kernel.org nfsd: fix change_info in NFSv4 RENAME replies
Steven Rostedt (Google) rostedt@goodmis.org tracing: Have option files inc the trace array ref count
Steven Rostedt (Google) rostedt@goodmis.org tracing: Have current_trace inc the trace array ref count
Steven Rostedt (Google) rostedt@goodmis.org tracing: Have tracing_max_latency inc the trace array ref count
Filipe Manana fdmanana@suse.com btrfs: release path before inode lookup during the ino lookup ioctl
Filipe Manana fdmanana@suse.com btrfs: fix lockdep splat and potential deadlock after failure running delayed items
Amir Goldstein amir73il@gmail.com ovl: fix incorrect fdput() on aio completion
Amir Goldstein amir73il@gmail.com ovl: fix failed copyup of fileattr on a symlink
Christian Brauner brauner@kernel.org attr: block mode changes of symlinks
Nigel Croxon ncroxon@redhat.com md/raid1: fix error: ISO C90 forbids mixed declarations
Arnd Bergmann arnd@arndb.de samples/hw_breakpoint: fix building without module unloading
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: GC transaction race with netns dismantle
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: fix GC transaction races with netns and netlink event exit path
Florian Westphal fw@strlen.de netfilter: nf_tables: fix kdoc warnings after gc rework
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: remove busy mark and gc batch API
Pablo Neira Ayuso pablo@netfilter.org netfilter: nft_set_hash: mark set element as dead when deleting from packet path
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: adapt set backend to use GC transaction API
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: GC transaction API to avoid race with control plane
Florian Westphal fw@strlen.de netfilter: nf_tables: make validation state per table
Song Liu song@kernel.org x86/purgatory: Remove LTO flags
Kirill A. Shutemov kirill.shutemov@linux.intel.com x86/boot/compressed: Reserve more memory for page tables
Jinjie Ruan ruanjinjie@huawei.com scsi: lpfc: Fix the NULL vs IS_ERR() bug for debugfs_create_file()
Masami Hiramatsu (Google) mhiramat@kernel.org selftests: tracing: Fix to unmount tracefs for recovering environment
Jinjie Ruan ruanjinjie@huawei.com scsi: qla2xxx: Fix NULL vs IS_ERR() bug for debugfs_create_dir()
Jinjie Ruan ruanjinjie@huawei.com drm: gm12u320: Fix the timeout usage for usb_bulk_msg()
Anand Jain anand.jain@oracle.com btrfs: compare the correct fsid/metadata_uuid in btrfs_validate_super
Anand Jain anand.jain@oracle.com btrfs: add a helper to read the superblock metadata_uuid
Josef Bacik josef@toxicpanda.com btrfs: move btrfs_pinned_by_swapfile prototype into volumes.h
Namhyung Kim namhyung@kernel.org perf test shell stat_bpf_counters: Fix test on Intel
James Clark james.clark@arm.com perf test: Remove bash construct from stat_bpf_counters.sh test
Namhyung Kim namhyung@kernel.org perf build: Update build rule for generated files
Ian Rogers rogers.email@gmail.com perf jevents: Switch build to use jevents.py
Tiezhu Yang yangtiezhu@loongson.cn MIPS: Use "grep -E" instead of "egrep"
William Zhang william.zhang@broadcom.com mtd: rawnand: brcmnand: Fix ECC level field setting for v7.2 controller
Florian Fainelli f.fainelli@gmail.com mtd: rawnand: brcmnand: Allow SoC to provide I/O operations
Zhang Yi yi.zhang@huawei.com jbd2: correct the end of the journal recovery scan range
Jan Kara jack@suse.cz jbd2: rename jbd_debug() to jbd2_debug()
Ritesh Harjani riteshh@linux.ibm.com jbd2: kill t_handle_lock transaction spinlock
Ritesh Harjani riteshh@linux.ibm.com jbd2: fix use-after-free of transaction_t race
Ritesh Harjani riteshh@linux.ibm.com jbd2: refactor wait logic for transaction updates into a common function
John Ogness john.ogness@linutronix.de printk: Consolidate console deferred printing
Rob Clark robdclark@chromium.org interconnect: Fix locking for runpm vs reclaim
Zhen Lei thunder.leizhen@huawei.com kobject: Add sanity check for kset->kobj.ktype in kset_register()
Sakari Ailus sakari.ailus@linux.intel.com media: pci: ipu3-cio2: Initialise timing struct to avoid a compiler warning
Xu Yang xu.yang_2@nxp.com usb: ehci: add workaround for chipidea PORTSC.PEC bug
Christophe Leroy christophe.leroy@csgroup.eu serial: cpm_uart: Avoid suspicious locking
Konstantin Shelekhin k.shelekhin@yadro.com scsi: target: iscsi: Fix buffer overflow in lio_target_nacl_info_show()
Chenyuan Mi michenyuan@huawei.com tools: iio: iio_generic_buffer: Fix some integer type and calculation
Ma Ke make_ruc2021@163.com usb: gadget: fsl_qe_udc: validate endpoint index for ch9 udc
Xiaolei Wang xiaolei.wang@windriver.com usb: cdns3: Put the cdns set active part outside the spin lock
Hans Verkuil hverkuil-cisco@xs4all.nl media: pci: cx23885: replace BUG with error return
Hans Verkuil hverkuil-cisco@xs4all.nl media: tuners: qt1010: replace BUG_ON with a regular error
Zhang Shurong zhang_shurong@foxmail.com media: dvb-usb-v2: gl861: Fix null-ptr-deref in gl861_i2c_master_xfer
Zhang Shurong zhang_shurong@foxmail.com media: az6007: Fix null-ptr-deref in az6007_i2c_xfer()
Zhang Shurong zhang_shurong@foxmail.com media: anysee: fix null-ptr-deref in anysee_master_xfer
Zhang Shurong zhang_shurong@foxmail.com media: af9005: Fix null-ptr-deref in af9005_i2c_xfer
Zhang Shurong zhang_shurong@foxmail.com media: dw2102: Fix null-ptr-deref in dw2102_i2c_transfer()
Zhang Shurong zhang_shurong@foxmail.com media: dvb-usb-v2: af9035: Fix null-ptr-deref in af9035_i2c_master_xfer
Yong-Xuan Wang yongxuan.wang@sifive.com PCI: fu740: Set the number of MSI vectors
ruanjinjie ruanjinjie@huawei.com powerpc/pseries: fix possible memory leak in ibmebus_bus_init()
Mårten Lindahl marten.lindahl@axis.com ARM: 9317/1: kexec: Make smp stop calls asynchronous
Liu Shixin via Jfs-discussion jfs-discussion@lists.sourceforge.net jfs: fix invalid free of JFS_IP(ipimap)->i_imap in diUnmount
Andrew Kanner andrew.kanner@gmail.com fs/jfs: prevent double-free in dbUnmount() after failed jfs_remount()
Georg Ottinger g.ottinger@gmx.at ext2: fix datatype of block number in ext2_xattr_set2()
Zhang Shurong zhang_shurong@foxmail.com md: raid1: fix potential OOB in raid1_remove_disk()
Tony Lindgren tony@atomide.com bus: ti-sysc: Configure uart quirks for k3 SoC
Tuo Li islituo@gmail.com drm/exynos: fix a possible null-pointer dereference due to data race in exynos_drm_crtc_atomic_disable()
Leo Chen sancchen@amd.com drm/amd/display: Blocking invalid 420 modes on HDMI TMDS for DCN31
Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com ALSA: hda: intel-dsp-cfg: add LunarLake support
Rong Tao rongtao@cestc.cn samples/hw_breakpoint: Fix kernel BUG 'invalid opcode: 0000'
Krzysztof Kozlowski krzysztof.kozlowski@linaro.org arm64: dts: qcom: sm8250-edo: correct ramoops pmsg-size
Krzysztof Kozlowski krzysztof.kozlowski@linaro.org arm64: dts: qcom: sm8150-kumano: correct ramoops pmsg-size
Krzysztof Kozlowski krzysztof.kozlowski@linaro.org arm64: dts: qcom: sm6125-pdx201: correct ramoops pmsg-size
Marek Vasut marex@denx.de drm/bridge: tc358762: Instruct DSI host to generate HSE packets
Hao Luo haoluo@google.com libbpf: Free btf_vmlinux when closing bpf_object
Johannes Berg johannes.berg@intel.com wifi: mac80211_hwsim: drop short frames
GONG, Ruiqi gongruiqi1@huawei.com netfilter: ebtables: fix fortify warnings in size_entry_mwt()
Johannes Berg johannes.berg@intel.com wifi: mac80211: check S1G action frame size
GONG, Ruiqi gongruiqi1@huawei.com alx: fix OOB-read compiler warning
Giulio Benetti giulio.benetti@benettiengineering.com mmc: sdhci-esdhc-imx: improve ESDHC_FLAG_ERR010450
Alexander Steffen Alexander.Steffen@infineon.com tpm_tis: Resend command to recover from data transfer errors
Mark O'Donovan shiftee@posteo.net crypto: lib/mpi - avoid null pointer deref in mpi_cmp_ui()
Dmitry Antipov dmantipov@yandex.ru wifi: wil6210: fix fortify warnings
Dmitry Antipov dmantipov@yandex.ru wifi: mwifiex: fix fortify warning
Dongliang Mu dzm91@hust.edu.cn wifi: ath9k: fix printk specifier
Dmitry Antipov dmantipov@yandex.ru wifi: ath9k: fix fortify warnings
Azeem Shaikh azeemshaikh38@gmail.com crypto: lrw,xts - Replace strlcpy with strscpy
Jiri Pirko jiri@nvidia.com devlink: remove reload failed checks in params get/set callbacks
Mario Limonciello mario.limonciello@amd.com ACPI: x86: s2idle: Catch multiple ACPI_TYPE_PACKAGE objects
Tomislav Novak tnovak@meta.com hw_breakpoint: fix single-stepping when using bpf_overflow_handler
Xu Yang xu.yang_2@nxp.com perf/imx_ddr: speed up overflow frequency of cycle
Yicong Yang yangyicong@hisilicon.com perf/smmuv3: Enable HiSilicon Erratum 162001900 quirk for HIP08/09
Jiri Slaby (SUSE) jirislaby@kernel.org ACPI: video: Add backlight=native DMI quirk for Lenovo Ideapad Z470
Paul E. McKenney paulmck@kernel.org scftorture: Forgive memory-allocation failure if KASAN
Zqiang qiang.zhang1211@gmail.com rcuscale: Move rcu_scale_writer() schedule_timeout_uninterruptible() to _idle()
Wander Lairson Costa wander@redhat.com kernel/fork: beware of __put_task_struct() calling context
Abhishek Mainkar abmainkar@nvidia.com ACPICA: Add AML_NO_OPERAND_RESOLVE flag to Timer
Will Shiu Will.Shiu@mediatek.com locks: fix KASAN: use-after-free in trace_event_raw_event_filelock_lock
Qu Wenruo wqu@suse.com btrfs: output extra debug info if we failed to find an inline backref
Fedor Pchelkin pchelkin@ispras.ru autofs: fix memory leak of waitqueues in autofs_catatonic_mode
Diffstat:
Documentation/arm64/silicon-errata.rst | 3 + Makefile | 4 +- arch/arm/kernel/hw_breakpoint.c | 8 +- arch/arm/kernel/machine_kexec.c | 14 +- .../dts/qcom/sm6125-sony-xperia-seine-pdx201.dts | 2 +- .../boot/dts/qcom/sm8150-sony-xperia-kumano.dtsi | 2 +- .../boot/dts/qcom/sm8250-sony-xperia-edo.dtsi | 2 +- arch/arm64/kernel/hw_breakpoint.c | 4 +- arch/mips/Makefile | 2 +- arch/mips/vdso/Makefile | 2 +- arch/powerpc/platforms/pseries/ibmebus.c | 1 + arch/x86/boot/compressed/ident_map_64.c | 8 + arch/x86/include/asm/boot.h | 45 +- arch/x86/purgatory/Makefile | 4 + crypto/lrw.c | 6 +- crypto/xts.c | 6 +- drivers/acpi/acpica/psopcode.c | 2 +- drivers/acpi/arm64/iort.c | 5 +- drivers/acpi/video_detect.c | 9 + drivers/acpi/x86/s2idle.c | 6 + drivers/ata/ahci.c | 9 + drivers/ata/libata-sata.c | 19 +- drivers/bus/ti-sysc.c | 2 + drivers/char/tpm/tpm_tis_core.c | 15 +- drivers/gpu/drm/amd/amdgpu/amdgpu_cs.c | 18 +- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 21 +- .../amd/display/dc/dml/dcn31/display_mode_vba_31.c | 4 +- drivers/gpu/drm/bridge/tc358762.c | 2 +- drivers/gpu/drm/exynos/exynos_drm_crtc.c | 5 +- drivers/gpu/drm/tiny/gm12u320.c | 10 +- drivers/i2c/busses/i2c-aspeed.c | 7 +- drivers/interconnect/core.c | 8 +- drivers/md/raid1.c | 3 + drivers/media/pci/cx23885/cx23885-video.c | 2 +- drivers/media/pci/intel/ipu3/ipu3-cio2-main.c | 2 +- drivers/media/tuners/qt1010.c | 11 +- drivers/media/usb/dvb-usb-v2/af9035.c | 14 +- drivers/media/usb/dvb-usb-v2/anysee.c | 2 +- drivers/media/usb/dvb-usb-v2/az6007.c | 8 + drivers/media/usb/dvb-usb-v2/gl861.c | 2 +- drivers/media/usb/dvb-usb/af9005.c | 5 + drivers/media/usb/dvb-usb/dw2102.c | 24 + drivers/mmc/host/sdhci-esdhc-imx.c | 7 +- drivers/mtd/nand/raw/brcmnand/brcmnand.c | 102 ++- drivers/mtd/nand/raw/brcmnand/brcmnand.h | 29 + drivers/net/ethernet/atheros/alx/ethtool.c | 5 +- drivers/net/wireless/ath/ath9k/ahb.c | 4 +- drivers/net/wireless/ath/ath9k/mac.h | 6 +- drivers/net/wireless/ath/ath9k/pci.c | 4 +- drivers/net/wireless/ath/ath9k/xmit.c | 4 +- drivers/net/wireless/ath/wil6210/txrx.c | 2 +- drivers/net/wireless/ath/wil6210/txrx.h | 6 +- drivers/net/wireless/ath/wil6210/txrx_edma.c | 2 +- drivers/net/wireless/ath/wil6210/txrx_edma.h | 6 +- drivers/net/wireless/mac80211_hwsim.c | 7 +- drivers/net/wireless/marvell/mwifiex/tdls.c | 9 +- drivers/pci/controller/dwc/pcie-fu740.c | 1 + drivers/perf/arm_smmuv3_pmu.c | 46 +- drivers/perf/fsl_imx8_ddr_perf.c | 21 + drivers/scsi/lpfc/lpfc_debugfs.c | 14 +- drivers/scsi/megaraid/megaraid_sas.h | 2 +- drivers/scsi/megaraid/megaraid_sas_base.c | 21 +- drivers/scsi/pm8001/pm8001_init.c | 51 +- drivers/scsi/qla2xxx/qla_dfs.c | 6 +- drivers/target/iscsi/iscsi_target_configfs.c | 54 +- drivers/tty/serial/cpm_uart/cpm_uart_core.c | 13 +- drivers/usb/cdns3/cdns3-plat.c | 3 +- drivers/usb/cdns3/cdnsp-pci.c | 3 +- drivers/usb/cdns3/core.c | 15 +- drivers/usb/cdns3/core.h | 7 +- drivers/usb/gadget/udc/fsl_qe_udc.c | 2 + drivers/usb/host/ehci-hcd.c | 8 +- drivers/usb/host/ehci-hub.c | 10 +- drivers/usb/host/ehci.h | 10 + fs/attr.c | 20 +- fs/autofs/waitq.c | 3 +- fs/btrfs/ctree.h | 2 - fs/btrfs/delayed-inode.c | 19 +- fs/btrfs/disk-io.c | 8 +- fs/btrfs/extent-tree.c | 5 + fs/btrfs/ioctl.c | 8 +- fs/btrfs/volumes.c | 8 + fs/btrfs/volumes.h | 3 + fs/ext2/xattr.c | 4 +- fs/ext4/namei.c | 26 +- fs/jbd2/checkpoint.c | 6 +- fs/jbd2/commit.c | 49 +- fs/jbd2/journal.c | 34 +- fs/jbd2/recovery.c | 42 +- fs/jbd2/revoke.c | 8 +- fs/jbd2/transaction.c | 114 +-- fs/jfs/jfs_dmap.c | 1 + fs/jfs/jfs_imap.c | 1 + fs/locks.c | 2 
+- fs/nfsd/nfs4proc.c | 4 +- fs/overlayfs/copy_up.c | 3 +- fs/overlayfs/file.c | 9 +- fs/tracefs/inode.c | 3 + include/linux/acpi_iort.h | 1 + include/linux/jbd2.h | 11 +- include/linux/libata.h | 4 + include/linux/perf_event.h | 22 +- include/linux/sched/task.h | 28 +- include/net/netfilter/nf_tables.h | 124 ++-- include/uapi/linux/netfilter_bridge/ebtables.h | 14 +- kernel/fork.c | 8 + kernel/printk/printk.c | 35 +- kernel/printk/printk_safe.c | 9 +- kernel/rcu/rcuscale.c | 2 +- kernel/scftorture.c | 6 +- kernel/trace/trace.c | 41 +- lib/kobject.c | 5 + lib/mpi/mpi-cmp.c | 8 +- net/bridge/netfilter/ebtables.c | 3 +- net/core/devlink.c | 4 +- net/mac80211/rx.c | 4 + net/netfilter/nf_tables_api.c | 374 +++++++--- net/netfilter/nft_set_hash.c | 83 ++- net/netfilter/nft_set_pipapo.c | 50 +- net/netfilter/nft_set_rbtree.c | 144 ++-- net/sched/Kconfig | 28 - net/sched/Makefile | 2 - net/sched/cls_rsvp.c | 24 - net/sched/cls_rsvp.h | 776 --------------------- net/sched/cls_rsvp6.c | 24 - samples/hw_breakpoint/data_breakpoint.c | 4 +- sound/hda/intel-dsp-config.c | 8 + tools/build/Makefile.build | 10 + tools/iio/iio_generic_buffer.c | 17 +- tools/lib/bpf/libbpf.c | 1 + tools/perf/Makefile.config | 19 + tools/perf/Makefile.perf | 1 + tools/perf/pmu-events/Build | 19 +- tools/perf/pmu-events/empty-pmu-events.c | 158 +++++ tools/perf/tests/shell/stat_bpf_counters.sh | 6 +- tools/testing/selftests/ftrace/ftracetest | 8 + 136 files changed, 1680 insertions(+), 1595 deletions(-)
On 9/20/23 4:30 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.133 release. There are 110 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 22 Sep 2023 11:28:09 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.133-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
Built and booted successfully on RISC-V RV64 (HiFive Unmatched).
Tested-by: Ron Economos re@w6rz.net
Hi Greg,
On 20/09/2023 12:30, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.133 release. There are 110 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 22 Sep 2023 11:28:09 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.133-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
...
Rob Clark robdclark@chromium.org interconnect: Fix locking for runpm vs reclaim
I am seeing several boot failures for ARM 32-bit Tegra devices and bisect is pointing to the above commit. Reverting this does fix the problem.
Test results for stable-v5.15: 11 builds: 11 pass, 0 fail 34 boots: 28 pass, 6 fail 78 tests: 78 pass, 0 fail
Linux version: 5.15.133-rc1-g634d2466eedd Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000, tegra194-p3509-0000+p3668-0000, tegra20-ventana, tegra210-p2371-2180, tegra210-p3450-0000, tegra30-cardhu-a04
Boot failures: tegra124-jetson-tk1, tegra20-ventana, tegra30-cardhu-a04
Jon
linux-stable-mirror@lists.linaro.org