This is the start of the stable review cycle for the 5.15.54 release. There are 230 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 13 Jul 2022 09:05:28 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.54-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 5.15.54-rc1
Hangbin Liu liuhangbin@gmail.com selftests/net: fix section name when using xdp_dummy.o
Dave Jiang dave.jiang@intel.com dmaengine: idxd: force wq context cleanup on device disable path
Miaoqian Lin linmq006@gmail.com dmaengine: ti: Add missing put_device in ti_dra7_xbar_route_allocate
Caleb Connolly caleb.connolly@linaro.org dmaengine: qcom: bam_dma: fix runtime PM underflow
Miaoqian Lin linmq006@gmail.com dmaengine: ti: Fix refcount leak in ti_dra7_xbar_route_allocate
Michael Walle michael@walle.cc dmaengine: at_xdma: handle errors of at_xdmac_alloc_desc() correctly
Christophe JAILLET christophe.jaillet@wanadoo.fr dmaengine: lgm: Fix an error handling path in intel_ldma_probe()
Dmitry Osipenko dmitry.osipenko@collabora.com dmaengine: pl330: Fix lockdep warning about non-static key
Linus Torvalds torvalds@linux-foundation.org ida: don't use BUG_ON() for debugging
Samuel Holland samuel@sholland.org dt-bindings: dma: allwinner,sun50i-a64-dma: Fix min/max typo
AngeloGioacchino Del Regno angelogioacchino.delregno@collabora.com Revert "serial: 8250_mtk: Make sure to select the right FEATURE_SEL"
Naoya Horiguchi naoya.horiguchi@nec.com Revert "mm/memory-failure.c: fix race with changing page compound again"
Shuah Khan skhan@linuxfoundation.org misc: rtsx_usb: set return value in rsp_buf alloc err path
Shuah Khan skhan@linuxfoundation.org misc: rtsx_usb: use separate command and response buffers
Shuah Khan skhan@linuxfoundation.org misc: rtsx_usb: fix use of dma mapped buffer for usb bulk transfer
Peter Robinson pbrobinson@gmail.com dmaengine: imx-sdma: Allow imx8m for imx7 FW revs
Satish Nagireddy satish.nagireddy@getcruise.com i2c: cadence: Unregister the clk notifier in error path
Heiner Kallweit hkallweit1@gmail.com r8169: fix accessing unset transport header
Vladimir Oltean vladimir.oltean@nxp.com selftests: forwarding: fix error message in learning_test
Vladimir Oltean vladimir.oltean@nxp.com selftests: forwarding: fix learning_test when h1 supports IFF_UNICAST_FLT
Vladimir Oltean vladimir.oltean@nxp.com selftests: forwarding: fix flood_unicast_test when h2 supports IFF_UNICAST_FLT
Rick Lindsley ricklind@us.ibm.com ibmvnic: Properly dispose of all skbs during a failover.
Fabrice Gasnier fabrice.gasnier@foss.st.com ARM: dts: stm32: add missing usbh clock and fix clk order on stm32mp15
Amelie Delaunay amelie.delaunay@foss.st.com ARM: dts: stm32: use usbphyc ck_usbo_48m as USBH OHCI clock on stm32mp151
Norbert Zulinski norbertx.zulinski@intel.com i40e: Fix VF's MAC Address change on VM
Lukasz Cieplicki lukaszx.cieplicki@intel.com i40e: Fix dropped jumbo frames statistics
Jean Delvare jdelvare@suse.de i2c: piix4: Fix a memory leak in the EFCH MMIO support
Ivan Malov ivan.malov@oktetlabs.ru xsk: Clear page contiguity bit when unmapping pool
Mihai Sain mihai.sain@microchip.com ARM: at91: fix soc detection for SAM9X60 SiPs
Eugen Hristev eugen.hristev@microchip.com ARM: dts: at91: sama5d2_icp: fix eeprom compatibles
Eugen Hristev eugen.hristev@microchip.com ARM: dts: at91: sam9x60ek: fix eeprom compatible and size
Claudiu Beznea claudiu.beznea@microchip.com ARM: at91: pm: use proper compatibles for sama7g5's rtc and rtt
Claudiu Beznea claudiu.beznea@microchip.com ARM: at91: pm: use proper compatibles for sam9x60's rtc and rtt
Claudiu Beznea claudiu.beznea@microchip.com ARM: at91: pm: use proper compatible for sama5d2's rtc
Stephan Gerhold stephan.gerhold@kernkonzept.com arm64: dts: qcom: msm8992-*: Fix vdd_lvs1_2-supply typo
Andrei Lalaev andrey.lalaev@gmail.com pinctrl: sunxi: sunxi_pconf_set: use correct offset
Peng Fan peng.fan@nxp.com arm64: dts: imx8mp-phyboard-pollux-rdk: correct i2c2 & mmc settings
Peng Fan peng.fan@nxp.com arm64: dts: imx8mp-phyboard-pollux-rdk: correct eqos pad settings
Peng Fan peng.fan@nxp.com arm64: dts: imx8mp-phyboard-pollux-rdk: correct uart pad settings
Peng Fan peng.fan@nxp.com arm64: dts: imx8mp-evk: correct I2C3 pad settings
Peng Fan peng.fan@nxp.com arm64: dts: imx8mp-evk: correct I2C1 pad settings
Peng Fan peng.fan@nxp.com arm64: dts: imx8mp-evk: correct eqos pad settings
Peng Fan peng.fan@nxp.com arm64: dts: imx8mp-evk: correct vbus pad settings
Peng Fan peng.fan@nxp.com arm64: dts: imx8mp-evk: correct gpio-led pad settings
Sherry Sun sherry.sun@nxp.com arm64: dts: imx8mp-evk: correct the uart2 pinctl value
Peng Fan peng.fan@nxp.com arm64: dts: imx8mp-evk: correct mmc pad settings
Fabio Estevam festevam@denx.de ARM: mxs_defconfig: Enable the framebuffer
Dmitry Baryshkov dmitry.baryshkov@linaro.org arm64: dts: qcom: sdm845: use dispcc AHB clock for mdss node
Konrad Dybcio konrad.dybcio@somainline.org arm64: dts: qcom: msm8994: Fix CPU6/7 reg values
Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com ASoC: codecs: rt700/rt711/rt711-sdca: resume bus/codec in .set_jack_detect
Charles Keepax ckeepax@opensource.cirrus.com ASoC: rt711-sdca: Add endianness flag in snd_soc_component_driver
Charles Keepax ckeepax@opensource.cirrus.com ASoC: rt711: Add endianness flag in snd_soc_component_driver
Samuel Holland samuel@sholland.org pinctrl: sunxi: a83t: Fix NAND function name for some pins
Miaoqian Lin linmq006@gmail.com ARM: meson: Fix refcount leak in meson_smp_prepare_cpus
daniel.starke@siemens.com daniel.starke@siemens.com tty: n_gsm: fix encoding of command/response bit
Tom Rix trix@redhat.com btrfs: fix use of uninitialized variable at rm device ioctl
Ye Guojin ye.guojin@zte.com.cn virtio-blk: modify the value type of num in virtio_queue_rq()
Dan Carpenter dan.carpenter@oracle.com btrfs: fix error pointer dereference in btrfs_ioctl_rm_dev_v2()
Hui Wang hui.wang@canonical.com Revert "serial: sc16is7xx: Clear RS485 bits in the shutdown"
Eric Sandeen sandeen@redhat.com xfs: remove incorrect ASSERT in xfs_rename
Jimmy Assarsson extja@kvaser.com can: kvaser_usb: kvaser_usb_leaf: fix bittiming limits
Jimmy Assarsson extja@kvaser.com can: kvaser_usb: kvaser_usb_leaf: fix CAN clock frequency regression
Jimmy Assarsson extja@kvaser.com can: kvaser_usb: replace run-time checks with struct kvaser_usb_driver_info
Christian Marangi ansuelsmth@gmail.com net: dsa: qca8k: reset cpu port on MTU change
Jason A. Donenfeld Jason@zx2c4.com powerpc/powernv: delay rng platform device creation until later in boot
Hsin-Yi Wang hsinyi@chromium.org video: of_display_timing.h: include errno.h
Dan Williams dan.j.williams@intel.com memregion: Fix memregion_free() fallback definition
Rafael J. Wysocki rafael.j.wysocki@intel.com PM: runtime: Redefine pm_runtime_release_supplier()
Helge Deller deller@gmx.de fbcon: Prevent that screen size is smaller than font size
Helge Deller deller@gmx.de fbcon: Disallow setting font bigger than screen size
Helge Deller deller@gmx.de fbmem: Check virtual screen sizes in fb_set_var()
Guiling Deng greens9@163.com fbdev: fbmem: Fix logo center image dx issue
Yian Chen yian.chen@intel.com iommu/vt-d: Fix PCI bus rescan device hot add
Alexey Dobriyan adobriyan@gmail.com module: fix [e_shstrndx].sh_size=0 OOB access
Shuah Khan skhan@linuxfoundation.org module: change to print useful messages from elf_validity_check()
Bryan O'Donoghue bryan.odonoghue@linaro.org dt-bindings: soc: qcom: smd-rpm: Fix missing MSM8936 compatible
Vladimir Lypak vladimir.lypak@gmail.com dt-bindings: soc: qcom: smd-rpm: Add compatible for MSM8953 SoC
David Howells dhowells@redhat.com rxrpc: Fix locking issue
Mark Rutland mark.rutland@arm.com irqchip/gic-v3: Ensure pseudo-NMIs have an ISB between ack and handling
Pavel Begunkov asml.silence@gmail.com io_uring: avoid io-wq -EAGAIN looping for !IOPOLL
Sean Wang sean.wang@mediatek.com Bluetooth: btmtksdio: fix use-after-free at btmtksdio_recv_event
Niels Dossche dossche.niels@gmail.com Bluetooth: protect le accept and resolv lists with hdev->lock
Rex-BC Chen rex-bc.chen@mediatek.com drm/mediatek: Add vblank register/unregister callback functions
Chun-Kuang Hu chunkuang.hu@kernel.org drm/mediatek: Add cmdq_handle in mtk_crtc
Chun-Kuang Hu chunkuang.hu@kernel.org drm/mediatek: Detect CMDQ execution timeout
Chun-Kuang Hu chunkuang.hu@kernel.org drm/mediatek: Remove the pointer of struct cmdq_client
Chun-Kuang Hu chunkuang.hu@kernel.org drm/mediatek: Use mailbox rx_callback instead of cmdq_task_cb
Thomas Hellström thomas.hellstrom@linux.intel.com drm/i915: Fix a race between vma / object destruction and unbinding
Richard Gong richard.gong@amd.com drm/amdgpu: vi: disable ASPM on Intel Alder Lake based systems
Mario Limonciello mario.limonciello@amd.com drm/amd: Refactor `amdgpu_aspm` to be evaluated per device
Alex Deucher alexander.deucher@amd.com drm/amdgpu: drop flags check for CHIP_IP_DISCOVERY
Lukas Fink lukas.fink1@gmail.com drm/amdgpu: Fix rejecting Tahiti GPUs
Alex Deucher alexander.deucher@amd.com drm/amdgpu: bind to any 0x1002 PCI diplay class device
Daniel Starke daniel.starke@siemens.com tty: n_gsm: fix invalid gsmtty_write_room() result
AngeloGioacchino Del Regno angelogioacchino.delregno@collabora.com serial: 8250_mtk: Make sure to select the right FEATURE_SEL
Michael Ellerman mpe@ellerman.id.au powerpc/vdso: Fix incorrect CFI in gettimeofday.S
Christophe Leroy christophe.leroy@csgroup.eu powerpc/vdso: Move cvdso_call macro into gettimeofday.S
Christophe Leroy christophe.leroy@csgroup.eu powerpc/vdso: Remove cvdso_call_time macro
Daniel Starke daniel.starke@siemens.com tty: n_gsm: fix sometimes uninitialized warning in gsm_dlci_modem_output()
Daniel Starke daniel.starke@siemens.com tty: n_gsm: fix invalid use of MSC in advanced option
Naoya Horiguchi naoya.horiguchi@nec.com mm/hwpoison: fix race between hugetlb free/demotion and memory_failure_hugetlb()
Miaohe Lin linmiaohe@huawei.com mm/memory-failure.c: fix race with changing page compound again
luofei luofei@unicloud.com mm/hwpoison: avoid the impact of hwpoison_filter() return value on mce handler
Naoya Horiguchi naoya.horiguchi@nec.com mm/hwpoison: mf_mutex for soft offline and unpoison
Sean Christopherson seanjc@google.com KVM: Initialize debugfs_dentry when a VM is created to avoid NULL deref
Naohiro Aota naohiro.aota@wdc.com btrfs: zoned: use dedicated lock for data relocation
Johannes Thumshirn johannes.thumshirn@wdc.com btrfs: zoned: encapsulate inode locking for zoned relocation
Daniel Starke daniel.starke@siemens.com tty: n_gsm: fix missing update of modem controls after DLCI open
Maurizio Avogadro mavoga@gmail.com ALSA: usb-audio: add mapping for MSI MAG X570S Torpedo MAX.
Johannes Schickel lordhoto@gmail.com ALSA: usb-audio: add mapping for MSI MPG X570S Carbon Max Wifi.
Daniel Starke daniel.starke@siemens.com tty: n_gsm: fix frame reception handling
Zhenguo Zhao Zhenguo.Zhao1@unisoc.com tty: n_gsm: Save dlci address open status when config requester
Zhenguo Zhao Zhenguo.Zhao1@unisoc.com tty: n_gsm: Modify CR,PF bit when config requester
Oliver Upton oupton@google.com KVM: Don't create VM debugfs files outside of the VM directory
tiancyin tianci.yin@amd.com drm/amd/vcn: fix an error msg on vcn 3.0
Xiaomeng Tong xiam0nd.tong@gmail.com ASoC: rt5682: fix an incorrect NULL check on list iterator
Jack Yu jack.yu@realtek.com ASoC: rt5682: move clk related code to rt5682_i2c_probe
Tadeusz Struk tadeusz.struk@linaro.org uapi/linux/stddef.h: Add include guards
Kees Cook keescook@chromium.org stddef: Introduce DECLARE_FLEX_ARRAY() helper
Paul Davey paul.davey@alliedtelesis.co.nz bus: mhi: Fix pm_state conversion to string
Kees Cook keescook@chromium.org bus: mhi: core: Use correctly sized arguments for bit field
Hui Wang hui.wang@canonical.com serial: sc16is7xx: Clear RS485 bits in the shutdown
Nicholas Piggin npiggin@gmail.com powerpc/tm: Fix more userspace r13 corruption
Nicholas Piggin npiggin@gmail.com powerpc: flexible GPR range save/restore macros
Christophe Leroy christophe.leroy@csgroup.eu powerpc/32: Don't use lmw/stmw for saving/restoring non volatile regs
Arun Easi aeasi@marvell.com scsi: qla2xxx: Fix loss of NVMe namespaces after driver reload test
Claudio Imbrenda imbrenda@linux.ibm.com KVM: s390x: fix SCK locking
Dongliang Mu mudongliangabcd@gmail.com btrfs: don't access possibly stale fs_info data in device_list_add
Paolo Bonzini pbonzini@redhat.com KVM: use __vcalloc for very large allocations
Paolo Bonzini pbonzini@redhat.com mm: vmalloc: introduce array allocation functions
Kees Cook keescook@chromium.org Compiler Attributes: add __alloc_size() for better bounds checking
Tudor Ambarus tudor.ambarus@microchip.com mtd: spi-nor: Skip erase logic when SPI_NOR_NO_ERASE is set
Sebastian Andrzej Siewior bigeasy@linutronix.de batman-adv: Use netif_rx().
Haibo Chen haibo.chen@nxp.com iio: accel: mma8452: use the correct logic to get mma8452_data
Palmer Dabbelt palmer@rivosinc.com riscv/mm: Add XIP_FIXUP for riscv_pfn_base
Chuck Lever chuck.lever@oracle.com NFSD: COMMIT operations must not return NFS?ERR_INVAL
Chuck Lever chuck.lever@oracle.com NFSD: De-duplicate net_generic(nf->nf_net, nfsd_net_id)
CHANDAN VURDIGERE NATARAJ chandan.vurdigerenataraj@amd.com drm/amd/display: Fix by adding FPU protection for dcn30_internal_validate_bw
Michael Strauss michael.strauss@amd.com drm/amd/display: Set min dcfclk if pipe count is 0
Xiaomeng Tong xiam0nd.tong@gmail.com drbd: fix an invalid memory access caused by incorrect use of list iterator
Wu Bo wubo40@huawei.com drbd: Fix double free problem in drbd_create_device
Luis Chamberlain mcgrof@kernel.org drbd: add error handling support for add_disk()
Qu Wenruo wqu@suse.com btrfs: remove device item and update super block in the same transaction
Josef Bacik josef@toxicpanda.com btrfs: use btrfs_get_dev_args_from_path in dev removal ioctls
Josef Bacik josef@toxicpanda.com btrfs: add a btrfs_get_dev_args_from_path helper
Josef Bacik josef@toxicpanda.com btrfs: handle device lookup with btrfs_dev_lookup_args
Eli Cohen elic@nvidia.com vdpa/mlx5: Avoid processing works if workqueue was destroyed
Andreas Gruenbacher agruenba@redhat.com gfs2: Fix gfs2_file_buffered_write endless loop workaround
Arun Easi aeasi@marvell.com scsi: qla2xxx: Fix crash during module load unload test
Quinn Tran qutran@marvell.com scsi: qla2xxx: edif: Replace list_for_each_safe with list_for_each_entry_safe
Quinn Tran qutran@marvell.com scsi: qla2xxx: Fix laggy FC remote port session recovery
Manish Rangankar mrangankar@marvell.com scsi: qla2xxx: Move heartbeat handling from DPC thread to workqueue
Sean Christopherson seanjc@google.com KVM: x86/mmu: Use common TDP MMU zap helper for MMU notifier unmap hook
Sean Christopherson seanjc@google.com KVM: x86/mmu: Use yield-safe TDP MMU root iter in MMU notifier unmapping
Lad Prabhakar prabhakar.mahadev-lad.rj@bp.renesas.com clk: renesas: r9a07g044: Update multiplier and divider values for PLL2/3
Dan Williams dan.j.williams@intel.com cxl/port: Hold port reference until decoder release
Lorenzo Bianconi lorenzo@kernel.org mt76: mt7921: do not always disable fw runtime-pm
Sean Wang sean.wang@mediatek.com mt76: mt76_connac: fix MCU_CE_CMD_SET_ROC definition error
Johan Hovold johan@kernel.org media: davinci: vpif: fix use-after-free on driver unbind
Kees Cook keescook@chromium.org media: omap3isp: Use struct_group() for memcpy() region
Kees Cook keescook@chromium.org stddef: Introduce struct_group() helper macro
Tejun Heo tj@kernel.org block: fix rq-qos breakage from skipping rq_qos_done_bio()
Jens Axboe axboe@kernel.dk block: only mark bio as tracked if it really is tracked
Pavel Begunkov asml.silence@gmail.com block: use bdev_get_queue() in bio.c
Jens Axboe axboe@kernel.dk io_uring: ensure that fsnotify is always called
Max Gurtovoy mgurtovoy@nvidia.com virtio-blk: avoid preallocating big SGL for data
Sukadev Bhattiprolu sukadev@linux.ibm.com ibmvnic: Allow queueing resets during probe
Sukadev Bhattiprolu sukadev@linux.ibm.com ibmvnic: clear fop when retrying probe
Sukadev Bhattiprolu sukadev@linux.ibm.com ibmvnic: init init_done_rc earlier
Alexander Egorenkov egorenar@linux.ibm.com s390/setup: preserve memory at OLDMEM_BASE and OLDMEM_SIZE
Alexander Gordeev agordeev@linux.ibm.com s390/setup: use physical pointers for memblock_reserve()
Alexander Gordeev agordeev@linux.ibm.com s390/boot: allocate amode31 section in decompressor
Florian Westphal fw@strlen.de netfilter: nft_payload: don't allow th access for fragments
Pablo Neira Ayuso pablo@netfilter.org netfilter: nft_payload: support for inner header matching / mangling
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: convert pktinfo->tprot_set to flags field
Peter Ujfalusi peter.ujfalusi@linux.intel.com ASoC: rt5682: Fix deadlock on resume
Derek Fang derek.fang@realtek.com ASoC: rt5682: Re-detect the combo jack after resuming
Derek Fang derek.fang@realtek.com ASoC: rt5682: Avoid the unexpected IRQ event during going to suspend
Roi Dayan roid@nvidia.com net/mlx5e: TC, Reject rules with forward and drop actions
Roi Dayan roid@nvidia.com net/mlx5e: TC, Reject rules with drop and modify hdr action
Roi Dayan roid@nvidia.com net/mlx5e: Split actions_match_supported() into a sub function
Roi Dayan roid@nvidia.com net/mlx5e: Check action fwd/drop flag exists also for nic flows
Palmer Dabbelt palmer@rivosinc.com RISC-V: defconfigs: Set CONFIG_FB=y, for FB console
Heinrich Schuchardt heinrich.schuchardt@canonical.com riscv: defconfig: enable DRM_NOUVEAU
Hou Tao houtao1@huawei.com bpf, arm64: Use emit_addr_mov_i64() for BPF_PSEUDO_FUNC
Lorenzo Bianconi lorenzo@kernel.org mt76: mt7921: fix a possible race enabling/disabling runtime-pm
Lorenzo Bianconi lorenzo@kernel.org mt76: mt7921: introduce mt7921_mcu_set_beacon_filter utility routine
Lorenzo Bianconi lorenzo@kernel.org mt76: mt7921: get rid of mt7921_mac_set_beacon_filter
Hans de Goede hdegoede@redhat.com platform/x86: wmi: Fix driver->notify() vs ->probe() race
Hans de Goede hdegoede@redhat.com platform/x86: wmi: Replace read_takes_no_args with a flags field
Barnabás Pőcze pobrn@protonmail.com platform/x86: wmi: introduce helper to convert driver to WMI driver
Shai Malin smalin@marvell.com qed: Improve the stack space of filter_config()
Seevalamuthu Mariappan quic_seevalam@quicinc.com ath11k: add hw_param for wakeup_mhi
Andrew Gabbasov andrew_gabbasov@mentor.com memory: renesas-rpc-if: Avoid unaligned bus access for HyperFlash
Sean Young sean@mess.org media: ir_toy: prevent device from hanging during transmit
Lukas Wunner lukas@wunner.de PCI: pciehp: Ignore Link Down/Up caused by error-induced Hot Reset
Lukas Wunner lukas@wunner.de PCI/portdrv: Rename pm_iter() to pcie_port_device_iter()
Ville Syrjälä ville.syrjala@linux.intel.com drm/i915: Replace the unconditional clflush with drm_clflush_virt_range()
Thomas Hellström thomas.hellstrom@linux.intel.com drm/i915/gt: Register the migrate contexts with their engines
Matthew Brost matthew.brost@intel.com drm/i915: Disable bonding on gen12+ platforms
Filipe Manana fdmanana@suse.com btrfs: fix deadlock between chunk allocation and chunk btree modifications
Michel Dänzer mdaenzer@redhat.com dma-buf/poll: Get a file reference for outstanding fence callbacks
Hans de Goede hdegoede@redhat.com Input: goodix - try not to touch the reset-pin on x86/ACPI devices
Hans de Goede hdegoede@redhat.com Input: goodix - refactor reset handling
Hans de Goede hdegoede@redhat.com Input: goodix - add a goodix.h header file
Hans de Goede hdegoede@redhat.com Input: goodix - change goodix_i2c_write() len parameter type to int
Tang Bin tangbin@cmss.chinamobile.com Input: cpcap-pwrbutton - handle errors from platform_get_irq()
Filipe Manana fdmanana@suse.com btrfs: fix warning when freeing leaf after subvolume creation failure
Filipe Manana fdmanana@suse.com btrfs: fix invalid delayed ref after subvolume creation failure
Nikolay Borisov nborisov@suse.com btrfs: add additional parameters to btrfs_init_tree_ref/btrfs_init_data_ref
Nikolay Borisov nborisov@suse.com btrfs: rename btrfs_alloc_chunk to btrfs_create_chunk
Pablo Neira Ayuso pablo@netfilter.org netfilter: nf_tables: stricter validation of element data
Pablo Neira Ayuso pablo@netfilter.org netfilter: nft_set_pipapo: release elements in clone from abort path
Duoming Zhou duoming@zju.edu.cn net: rose: fix UAF bug caused by rose_t0timer_expiry
Oliver Neukum oneukum@suse.com usbnet: fix memory leak in error case
Daniel Borkmann daniel@iogearbox.net bpf: Fix insufficient bounds propagation from adjust_scalar_min_max_vals
Daniel Borkmann daniel@iogearbox.net bpf: Fix incorrect verifier simulation around jmp32's jeq/jne
Thomas Kopp thomas.kopp@microchip.com can: mcp251xfd: mcp251xfd_regmap_crc_read(): update workaround broken CRC on TBC register
Thomas Kopp thomas.kopp@microchip.com can: mcp251xfd: mcp251xfd_regmap_crc_read(): improve workaround handling for mcp2517fd
Marc Kleine-Budde mkl@pengutronix.de can: m_can: m_can_{read_fifo,echo_tx_event}(): shift timestamp to full 32 bits
Marc Kleine-Budde mkl@pengutronix.de can: m_can: m_can_chip_config(): actually enable internal timestamping
Rhett Aultman rhett.aultman@samsara.com can: gs_usb: gs_usb_open/close(): fix memory leak
Liang He windhl@126.com can: grcan: grcan_probe(): remove extra of_node_get()
Oliver Hartkopp socketcan@hartkopp.net can: bcm: use call_rcu() instead of costly synchronize_rcu()
Takashi Iwai tiwai@suse.de ALSA: cs46xx: Fix missing snd_card_free() call at probe error
Tim Crawford tcrawford@system76.com ALSA: hda/realtek: Add quirk for Clevo L140PU
Takashi Iwai tiwai@suse.de ALSA: usb-audio: Workarounds for Behringer UMC 204/404 HD
Po-Hsu Lin po-hsu.lin@canonical.com Revert "selftests/bpf: Add test for bpf_timer overwriting crash"
Liu Shixin liushixin2@huawei.com mm/filemap: fix UAF in find_lock_entries
Jann Horn jannh@google.com mm/slub: add missing TID updates on slab deactivation
-------------
Diffstat:
.../bindings/dma/allwinner,sun50i-a64-dma.yaml | 2 +- .../devicetree/bindings/soc/qcom/qcom,smd-rpm.yaml | 3 + MAINTAINERS | 3 +- Makefile | 19 +- arch/arm/boot/dts/at91-sam9x60ek.dts | 3 +- arch/arm/boot/dts/at91-sama5d2_icp.dts | 6 +- arch/arm/boot/dts/stm32mp151.dtsi | 4 +- arch/arm/configs/mxs_defconfig | 1 + arch/arm/mach-at91/pm.c | 10 +- arch/arm/mach-meson/platsmp.c | 2 + arch/arm64/boot/dts/freescale/imx8mp-evk.dts | 54 ++-- .../dts/freescale/imx8mp-phyboard-pollux-rdk.dts | 48 ++-- .../boot/dts/qcom/msm8992-bullhead-rev-101.dts | 2 +- arch/arm64/boot/dts/qcom/msm8992-xiaomi-libra.dts | 2 +- arch/arm64/boot/dts/qcom/msm8994.dtsi | 4 +- arch/arm64/boot/dts/qcom/sdm845.dtsi | 2 +- arch/arm64/net/bpf_jit_comp.c | 5 +- arch/powerpc/boot/crt0.S | 31 +-- arch/powerpc/crypto/md5-asm.S | 10 +- arch/powerpc/crypto/sha1-powerpc-asm.S | 6 +- arch/powerpc/include/asm/ppc_asm.h | 43 +-- arch/powerpc/include/asm/vdso/gettimeofday.h | 69 +---- arch/powerpc/kernel/entry_32.S | 23 +- arch/powerpc/kernel/exceptions-64e.S | 14 +- arch/powerpc/kernel/exceptions-64s.S | 6 +- arch/powerpc/kernel/head_32.h | 3 +- arch/powerpc/kernel/head_booke.h | 3 +- arch/powerpc/kernel/interrupt_64.S | 34 +-- arch/powerpc/kernel/optprobes_head.S | 4 +- arch/powerpc/kernel/tm.S | 38 +-- arch/powerpc/kernel/trace/ftrace_64_mprofile.S | 15 +- arch/powerpc/kernel/vdso32/gettimeofday.S | 51 +++- arch/powerpc/kvm/book3s_hv_rmhandlers.S | 5 +- arch/powerpc/kvm/book3s_hv_uvmem.c | 2 +- arch/powerpc/lib/test_emulate_step_exec_instr.S | 8 +- arch/powerpc/platforms/powernv/rng.c | 16 +- arch/riscv/configs/defconfig | 8 +- arch/riscv/configs/rv32_defconfig | 1 + arch/riscv/mm/init.c | 1 + arch/s390/boot/compressed/decompressor.h | 1 + arch/s390/boot/startup.c | 8 + arch/s390/kernel/entry.h | 1 + arch/s390/kernel/setup.c | 31 +-- arch/s390/kernel/vmlinux.lds.S | 1 + arch/s390/kvm/kvm-s390.c | 19 +- arch/s390/kvm/kvm-s390.h | 4 +- arch/s390/kvm/priv.c | 15 +- arch/x86/kernel/cpu/mce/core.c | 8 +- arch/x86/kvm/mmu/page_track.c | 4 +- arch/x86/kvm/mmu/tdp_mmu.c | 9 +- arch/x86/kvm/x86.c | 4 +- block/bio.c | 11 +- block/blk-iolatency.c | 2 +- block/blk-rq-qos.h | 23 +- drivers/base/core.c | 3 +- drivers/base/memory.c | 2 + drivers/base/power/runtime.c | 20 +- drivers/block/Kconfig | 1 + drivers/block/drbd/drbd_main.c | 8 +- drivers/block/virtio_blk.c | 158 +++++++---- drivers/bluetooth/btmtksdio.c | 3 +- drivers/bus/mhi/core/init.c | 9 +- drivers/bus/mhi/core/internal.h | 2 +- drivers/clk/renesas/r9a07g044-cpg.c | 4 +- drivers/cxl/core/bus.c | 4 + drivers/dma-buf/dma-buf.c | 19 +- drivers/dma/at_xdmac.c | 5 + drivers/dma/idxd/device.c | 5 +- drivers/dma/imx-sdma.c | 2 +- drivers/dma/lgm/lgm-dma.c | 3 +- drivers/dma/pl330.c | 2 +- drivers/dma/qcom/bam_dma.c | 39 +-- drivers/dma/ti/dma-crossbar.c | 5 + drivers/gpu/drm/amd/amdgpu/amdgpu.h | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 25 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 10 + drivers/gpu/drm/amd/amdgpu/cik.c | 2 +- drivers/gpu/drm/amd/amdgpu/nv.c | 2 +- drivers/gpu/drm/amd/amdgpu/si.c | 2 +- drivers/gpu/drm/amd/amdgpu/soc15.c | 2 +- drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/vi.c | 17 +- .../gpu/drm/amd/display/dc/dcn30/dcn30_resource.c | 2 +- .../gpu/drm/amd/display/dc/dcn30/dcn30_resource.h | 7 + .../gpu/drm/amd/display/dc/dcn31/dcn31_resource.c | 65 ++++- .../drm/amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 2 +- drivers/gpu/drm/i915/gem/i915_gem_context.c | 7 + drivers/gpu/drm/i915/gem/i915_gem_object.c | 6 + drivers/gpu/drm/i915/gt/intel_context_types.h | 8 + drivers/gpu/drm/i915/gt/intel_engine_cs.c | 4 + drivers/gpu/drm/i915/gt/intel_engine_pm.c | 23 ++ drivers/gpu/drm/i915/gt/intel_engine_pm.h | 2 + drivers/gpu/drm/i915/gt/intel_engine_types.h | 7 + .../gpu/drm/i915/gt/intel_execlists_submission.c | 2 + drivers/gpu/drm/i915/gt/intel_ring_submission.c | 5 +- drivers/gpu/drm/i915/gt/mock_engine.c | 2 + drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c | 12 +- drivers/gpu/drm/mediatek/mtk_disp_drv.h | 16 +- drivers/gpu/drm/mediatek/mtk_disp_ovl.c | 22 +- drivers/gpu/drm/mediatek/mtk_disp_rdma.c | 20 +- drivers/gpu/drm/mediatek/mtk_drm_crtc.c | 133 +++++++-- drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.c | 4 + drivers/gpu/drm/mediatek/mtk_drm_ddp_comp.h | 29 +- drivers/i2c/busses/i2c-cadence.c | 1 + drivers/i2c/busses/i2c-piix4.c | 16 +- drivers/iio/accel/mma8452.c | 4 +- drivers/input/misc/cpcap-pwrbutton.c | 6 +- drivers/input/touchscreen/goodix.c | 150 +++++----- drivers/input/touchscreen/goodix.h | 75 +++++ drivers/iommu/intel/dmar.c | 2 +- drivers/irqchip/irq-gic-v3.c | 3 + drivers/media/platform/davinci/vpif.c | 97 +++++-- drivers/media/platform/omap3isp/ispstat.c | 5 +- drivers/media/rc/ir_toy.c | 2 +- drivers/memory/renesas-rpc-if.c | 48 +++- drivers/misc/cardreader/rtsx_usb.c | 27 +- drivers/mtd/spi-nor/core.c | 3 +- drivers/net/can/grcan.c | 1 - drivers/net/can/m_can/m_can.c | 8 +- drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c | 22 +- drivers/net/can/usb/gs_usb.c | 23 +- drivers/net/can/usb/kvaser_usb/kvaser_usb.h | 25 +- drivers/net/can/usb/kvaser_usb/kvaser_usb_core.c | 286 ++++++++++--------- drivers/net/can/usb/kvaser_usb/kvaser_usb_hydra.c | 4 +- drivers/net/can/usb/kvaser_usb/kvaser_usb_leaf.c | 119 ++++---- drivers/net/dsa/qca8k.c | 23 +- drivers/net/ethernet/ibm/ibmvnic.c | 147 +++++++++- drivers/net/ethernet/ibm/ibmvnic.h | 1 + drivers/net/ethernet/intel/i40e/i40e.h | 16 ++ drivers/net/ethernet/intel/i40e/i40e_main.c | 73 +++++ drivers/net/ethernet/intel/i40e/i40e_register.h | 13 + drivers/net/ethernet/intel/i40e/i40e_type.h | 1 + drivers/net/ethernet/intel/i40e/i40e_virtchnl_pf.c | 4 + drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 88 +++--- drivers/net/ethernet/qlogic/qed/qed_l2.c | 23 +- drivers/net/ethernet/qlogic/qede/qede_filter.c | 47 ++-- drivers/net/ethernet/realtek/r8169_main.c | 10 +- drivers/net/usb/usbnet.c | 17 +- drivers/net/wireless/ath/ath11k/core.c | 5 + drivers/net/wireless/ath/ath11k/hw.h | 1 + drivers/net/wireless/ath/ath11k/pci.c | 12 +- .../net/wireless/mediatek/mt76/mt76_connac_mac.c | 3 - .../net/wireless/mediatek/mt76/mt76_connac_mcu.h | 2 +- .../net/wireless/mediatek/mt76/mt7921/debugfs.c | 31 ++- drivers/net/wireless/mediatek/mt76/mt7921/mac.c | 28 -- drivers/net/wireless/mediatek/mt76/mt7921/main.c | 33 +-- drivers/net/wireless/mediatek/mt76/mt7921/mcu.c | 47 ++-- drivers/net/wireless/mediatek/mt76/mt7921/mt7921.h | 11 +- drivers/pci/hotplug/pciehp.h | 2 + drivers/pci/hotplug/pciehp_core.c | 2 + drivers/pci/hotplug/pciehp_hpc.c | 26 ++ drivers/pci/pcie/portdrv.h | 3 + drivers/pci/pcie/portdrv_core.c | 20 +- drivers/pci/pcie/portdrv_pci.c | 3 + drivers/pinctrl/sunxi/pinctrl-sun8i-a83t.c | 10 +- drivers/pinctrl/sunxi/pinctrl-sunxi.c | 2 + drivers/platform/x86/wmi.c | 39 +-- drivers/scsi/qla2xxx/qla_def.h | 5 +- drivers/scsi/qla2xxx/qla_edif.c | 39 +-- drivers/scsi/qla2xxx/qla_edif.h | 1 - drivers/scsi/qla2xxx/qla_init.c | 2 + drivers/scsi/qla2xxx/qla_nvme.c | 27 +- drivers/scsi/qla2xxx/qla_os.c | 102 ++++--- drivers/soc/atmel/soc.c | 12 +- drivers/tty/n_gsm.c | 263 ++++++++++++++--- drivers/vdpa/mlx5/net/mlx5_vnet.c | 7 +- drivers/video/fbdev/core/fbcon.c | 33 +++ drivers/video/fbdev/core/fbmem.c | 16 +- fs/btrfs/block-group.c | 152 ++++++---- fs/btrfs/block-group.h | 2 + fs/btrfs/ctree.c | 17 +- fs/btrfs/ctree.h | 8 +- fs/btrfs/delayed-ref.h | 5 +- fs/btrfs/dev-replace.c | 16 +- fs/btrfs/disk-io.c | 1 + fs/btrfs/extent-tree.c | 28 +- fs/btrfs/extent_io.c | 8 +- fs/btrfs/file.c | 13 +- fs/btrfs/free-space-tree.c | 4 +- fs/btrfs/inode.c | 3 +- fs/btrfs/ioctl.c | 96 ++++--- fs/btrfs/qgroup.c | 3 +- fs/btrfs/relocation.c | 25 +- fs/btrfs/scrub.c | 6 +- fs/btrfs/tree-log.c | 2 +- fs/btrfs/volumes.c | 310 +++++++++++++-------- fs/btrfs/volumes.h | 28 +- fs/btrfs/zoned.c | 2 +- fs/btrfs/zoned.h | 17 ++ fs/gfs2/file.c | 1 + fs/io_uring.c | 10 +- fs/nfsd/nfs3proc.c | 6 - fs/nfsd/vfs.c | 64 +++-- fs/nfsd/vfs.h | 4 +- fs/seq_file.c | 32 +++ fs/xfs/xfs_inode.c | 1 - include/linux/blk_types.h | 3 +- include/linux/compiler-gcc.h | 8 + include/linux/compiler_attributes.h | 10 + include/linux/compiler_types.h | 12 + include/linux/fbcon.h | 4 + include/linux/hugetlb.h | 6 + include/linux/list.h | 10 + include/linux/memregion.h | 2 +- include/linux/mm.h | 8 + include/linux/pm_runtime.h | 5 +- include/linux/qed/qed_eth_if.h | 21 +- include/linux/rtsx_usb.h | 2 - include/linux/seq_file.h | 4 + include/linux/stddef.h | 61 ++++ include/linux/vmalloc.h | 5 + include/net/netfilter/nf_tables.h | 10 +- include/net/netfilter/nf_tables_ipv4.h | 7 +- include/net/netfilter/nf_tables_ipv6.h | 6 +- include/uapi/linux/netfilter/nf_tables.h | 2 + include/uapi/linux/omap3isp.h | 21 +- include/uapi/linux/stddef.h | 41 +++ include/video/of_display_timing.h | 2 + kernel/bpf/verifier.c | 113 ++++---- kernel/module.c | 79 ++++-- lib/idr.c | 3 +- mm/filemap.c | 12 +- mm/hugetlb.c | 10 + mm/hwpoison-inject.c | 3 +- mm/madvise.c | 2 + mm/memory-failure.c | 205 ++++++++------ mm/slub.c | 2 + mm/util.c | 50 ++++ net/batman-adv/bridge_loop_avoidance.c | 2 +- net/bluetooth/hci_event.c | 12 + net/can/bcm.c | 18 +- net/netfilter/nf_tables_api.c | 9 +- net/netfilter/nf_tables_core.c | 2 +- net/netfilter/nf_tables_trace.c | 4 +- net/netfilter/nft_exthdr.c | 2 +- net/netfilter/nft_meta.c | 2 +- net/netfilter/nft_payload.c | 63 ++++- net/netfilter/nft_set_pipapo.c | 48 +++- net/rose/rose_route.c | 4 +- net/rxrpc/ar-internal.h | 2 +- net/rxrpc/call_accept.c | 6 +- net/rxrpc/call_object.c | 18 +- net/rxrpc/net_ns.c | 2 +- net/rxrpc/proc.c | 10 +- net/xdp/xsk_buff_pool.c | 1 + scripts/checkpatch.pl | 3 +- scripts/kernel-doc | 9 + sound/pci/cs46xx/cs46xx.c | 22 +- sound/pci/hda/patch_realtek.c | 1 + sound/soc/codecs/rt5682-i2c.c | 36 ++- sound/soc/codecs/rt5682.c | 125 ++++----- sound/soc/codecs/rt5682.h | 4 +- sound/soc/codecs/rt700.c | 16 +- sound/soc/codecs/rt711-sdca.c | 27 +- sound/soc/codecs/rt711.c | 25 +- sound/usb/mixer_maps.c | 16 ++ sound/usb/quirks.c | 4 + .../testing/selftests/bpf/prog_tests/timer_crash.c | 32 --- tools/testing/selftests/bpf/progs/timer_crash.c | 54 ---- tools/testing/selftests/net/forwarding/lib.sh | 6 +- tools/testing/selftests/net/udpgro.sh | 2 +- tools/testing/selftests/net/udpgro_bench.sh | 2 +- tools/testing/selftests/net/udpgro_fwd.sh | 2 +- tools/testing/selftests/net/veth.sh | 6 +- virt/kvm/kvm_main.c | 14 +- 265 files changed, 3774 insertions(+), 2044 deletions(-)
From: Jann Horn jannh@google.com
commit eeaa345e128515135ccb864c04482180c08e3259 upstream.
The fastpath in slab_alloc_node() assumes that c->slab is stable as long as the TID stays the same. However, two places in __slab_alloc() currently don't update the TID when deactivating the CPU slab.
If multiple operations race the right way, this could lead to an object getting lost; or, in an even more unlikely situation, it could even lead to an object being freed onto the wrong slab's freelist, messing up the `inuse` counter and eventually causing a page to be freed to the page allocator while it still contains slab objects.
(I haven't actually tested these cases though, this is just based on looking at the code. Writing testcases for this stuff seems like it'd be a pain...)
The race leading to state inconsistency is (all operations on the same CPU and kmem_cache):
- task A: begin do_slab_free(): - read TID - read pcpu freelist (==NULL) - check `slab == c->slab` (true) - [PREEMPT A->B] - task B: begin slab_alloc_node(): - fastpath fails (`c->freelist` is NULL) - enter __slab_alloc() - slub_get_cpu_ptr() (disables preemption) - enter ___slab_alloc() - take local_lock_irqsave() - read c->freelist as NULL - get_freelist() returns NULL - write `c->slab = NULL` - drop local_unlock_irqrestore() - goto new_slab - slub_percpu_partial() is NULL - get_partial() returns NULL - slub_put_cpu_ptr() (enables preemption) - [PREEMPT B->A] - task A: finish do_slab_free(): - this_cpu_cmpxchg_double() succeeds() - [CORRUPT STATE: c->slab==NULL, c->freelist!=NULL]
From there, the object on c->freelist will get lost if task B is allowed to
continue from here: It will proceed to the retry_load_slab label, set c->slab, then jump to load_freelist, which clobbers c->freelist.
But if we instead continue as follows, we get worse corruption:
- task A: run __slab_free() on object from other struct slab: - CPU_PARTIAL_FREE case (slab was on no list, is now on pcpu partial) - task A: run slab_alloc_node() with NUMA node constraint: - fastpath fails (c->slab is NULL) - call __slab_alloc() - slub_get_cpu_ptr() (disables preemption) - enter ___slab_alloc() - c->slab is NULL: goto new_slab - slub_percpu_partial() is non-NULL - set c->slab to slub_percpu_partial(c) - [CORRUPT STATE: c->slab points to slab-1, c->freelist has objects from slab-2] - goto redo - node_match() fails - goto deactivate_slab - existing c->freelist is passed into deactivate_slab() - inuse count of slab-1 is decremented to account for object from slab-2
At this point, the inuse count of slab-1 is 1 lower than it should be. This means that if we free all allocated objects in slab-1 except for one, SLUB will think that slab-1 is completely unused, and may free its page, leading to use-after-free.
Fixes: c17dda40a6a4e ("slub: Separate out kmem_cache_cpu processing from deactivate_slab") Fixes: 03e404af26dc2 ("slub: fast release on full slab") Cc: stable@vger.kernel.org Signed-off-by: Jann Horn jannh@google.com Acked-by: Christoph Lameter cl@linux.com Acked-by: David Rientjes rientjes@google.com Reviewed-by: Muchun Song songmuchun@bytedance.com Tested-by: Hyeonggon Yoo 42.hyeyoo@gmail.com Signed-off-by: Vlastimil Babka vbabka@suse.cz Link: https://lore.kernel.org/r/20220608182205.2945720-1-jannh@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/slub.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/mm/slub.c +++ b/mm/slub.c @@ -2935,6 +2935,7 @@ redo:
if (!freelist) { c->page = NULL; + c->tid = next_tid(c->tid); local_unlock_irqrestore(&s->cpu_slab->lock, flags); stat(s, DEACTIVATE_BYPASS); goto new_slab; @@ -2967,6 +2968,7 @@ deactivate_slab: freelist = c->freelist; c->page = NULL; c->freelist = NULL; + c->tid = next_tid(c->tid); local_unlock_irqrestore(&s->cpu_slab->lock, flags); deactivate_slab(s, page, freelist);
From: Liu Shixin liushixin2@huawei.com
Release refcount after xas_set to fix UAF which may cause panic like this:
page:ffffea000491fa40 refcount:1 mapcount:0 mapping:0000000000000000 index:0x1 pfn:0x1247e9 head:ffffea000491fa00 order:3 compound_mapcount:0 compound_pincount:0 memcg:ffff888104f91091 flags: 0x2fffff80010200(slab|head|node=0|zone=2|lastcpupid=0x1fffff) ... page dumped because: VM_BUG_ON_PAGE(PageTail(page)) ------------[ cut here ]------------ kernel BUG at include/linux/page-flags.h:632! invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC KASAN CPU: 1 PID: 7642 Comm: sh Not tainted 5.15.51-dirty #26 ... Call Trace: <TASK> __invalidate_mapping_pages+0xe7/0x540 drop_pagecache_sb+0x159/0x320 iterate_supers+0x120/0x240 drop_caches_sysctl_handler+0xaa/0xe0 proc_sys_call_handler+0x2b4/0x480 new_sync_write+0x3d6/0x5c0 vfs_write+0x446/0x7a0 ksys_write+0x105/0x210 do_syscall_64+0x35/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f52b5733130 ...
This problem has been fixed on mainline by patch 6b24ca4a1a8d ("mm: Use multi-index entries in the page cache") since it deletes the related code.
Fixes: 5c211ba29deb ("mm: add and use find_lock_entries") Signed-off-by: Liu Shixin liushixin2@huawei.com Acked-by: Matthew Wilcox (Oracle) willy@infradead.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/filemap.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-)
--- a/mm/filemap.c +++ b/mm/filemap.c @@ -2090,7 +2090,11 @@ unsigned find_lock_entries(struct addres
rcu_read_lock(); while ((page = find_get_entry(&xas, end, XA_PRESENT))) { + unsigned long next_idx = xas.xa_index + 1; + if (!xa_is_value(page)) { + if (PageTransHuge(page)) + next_idx = page->index + thp_nr_pages(page); if (page->index < start) goto put; if (page->index + thp_nr_pages(page) - 1 > end) @@ -2111,13 +2115,11 @@ unlock: put: put_page(page); next: - if (!xa_is_value(page) && PageTransHuge(page)) { - unsigned int nr_pages = thp_nr_pages(page); - + if (next_idx != xas.xa_index + 1) { /* Final THP may cross MAX_LFS_FILESIZE on 32-bit */ - xas_set(&xas, page->index + nr_pages); - if (xas.xa_index < nr_pages) + if (next_idx < xas.xa_index) break; + xas_set(&xas, next_idx); } } rcu_read_unlock();
From: Po-Hsu Lin po-hsu.lin@canonical.com
This reverts commit b0028e1cc1faf2e5d88ad4065590aca90d650182 which is commit a7e75016a0753c24d6c995bc02501ae35368e333 upstream.
It will break the bpf self-tests build with: progs/timer_crash.c:8:19: error: field has incomplete type 'struct bpf_timer' struct bpf_timer timer; ^ /home/ubuntu/linux/tools/testing/selftests/bpf/tools/include/bpf/bpf_helper_defs.h:39:8: note: forward declaration of 'struct bpf_timer' struct bpf_timer; ^ 1 error generated.
This test can only be built with 5.17 and newer kernels.
Signed-off-by: Po-Hsu Lin po-hsu.lin@canonical.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/bpf/prog_tests/timer_crash.c | 32 ----------- tools/testing/selftests/bpf/progs/timer_crash.c | 54 ------------------- 2 files changed, 86 deletions(-) delete mode 100644 tools/testing/selftests/bpf/prog_tests/timer_crash.c delete mode 100644 tools/testing/selftests/bpf/progs/timer_crash.c
--- a/tools/testing/selftests/bpf/prog_tests/timer_crash.c +++ /dev/null @@ -1,32 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 -#include <test_progs.h> -#include "timer_crash.skel.h" - -enum { - MODE_ARRAY, - MODE_HASH, -}; - -static void test_timer_crash_mode(int mode) -{ - struct timer_crash *skel; - - skel = timer_crash__open_and_load(); - if (!ASSERT_OK_PTR(skel, "timer_crash__open_and_load")) - return; - skel->bss->pid = getpid(); - skel->bss->crash_map = mode; - if (!ASSERT_OK(timer_crash__attach(skel), "timer_crash__attach")) - goto end; - usleep(1); -end: - timer_crash__destroy(skel); -} - -void test_timer_crash(void) -{ - if (test__start_subtest("array")) - test_timer_crash_mode(MODE_ARRAY); - if (test__start_subtest("hash")) - test_timer_crash_mode(MODE_HASH); -} --- a/tools/testing/selftests/bpf/progs/timer_crash.c +++ /dev/null @@ -1,54 +0,0 @@ -// SPDX-License-Identifier: GPL-2.0 - -#include <vmlinux.h> -#include <bpf/bpf_tracing.h> -#include <bpf/bpf_helpers.h> - -struct map_elem { - struct bpf_timer timer; - struct bpf_spin_lock lock; -}; - -struct { - __uint(type, BPF_MAP_TYPE_ARRAY); - __uint(max_entries, 1); - __type(key, int); - __type(value, struct map_elem); -} amap SEC(".maps"); - -struct { - __uint(type, BPF_MAP_TYPE_HASH); - __uint(max_entries, 1); - __type(key, int); - __type(value, struct map_elem); -} hmap SEC(".maps"); - -int pid = 0; -int crash_map = 0; /* 0 for amap, 1 for hmap */ - -SEC("fentry/do_nanosleep") -int sys_enter(void *ctx) -{ - struct map_elem *e, value = {}; - void *map = crash_map ? (void *)&hmap : (void *)&amap; - - if (bpf_get_current_task_btf()->tgid != pid) - return 0; - - *(void **)&value = (void *)0xdeadcaf3; - - bpf_map_update_elem(map, &(int){0}, &value, 0); - /* For array map, doing bpf_map_update_elem will do a - * check_and_free_timer_in_array, which will trigger the crash if timer - * pointer was overwritten, for hmap we need to use bpf_timer_cancel. - */ - if (crash_map == 1) { - e = bpf_map_lookup_elem(map, &(int){0}); - if (!e) - return 0; - bpf_timer_cancel(&e->timer); - } - return 0; -} - -char _license[] SEC("license") = "GPL";
From: Takashi Iwai tiwai@suse.de
commit ae8b1631561a3634cc09d0c62bbdd938eade05ec upstream.
Both Behringer UMC 202 HD and 404 HD need explicit quirks to enable the implicit feedback mode and start the playback stream primarily. The former seems fixing the stuttering and the latter is required for a playback-only case.
Note that the "clock source 41 is not valid" error message still appears even after this fix, but it should be only once at probe. The reason of the error is still unknown, but this seems to be mostly harmless as it's a one-off error and the driver retires the clock setup and it succeeds afterwards.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=215934 Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220624101132.14528-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/usb/quirks.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/sound/usb/quirks.c +++ b/sound/usb/quirks.c @@ -1842,6 +1842,10 @@ static const struct usb_audio_quirk_flag QUIRK_FLAG_SHARE_MEDIA_DEVICE | QUIRK_FLAG_ALIGN_TRANSFER), DEVICE_FLG(0x1395, 0x740a, /* Sennheiser DECT */ QUIRK_FLAG_GET_SAMPLE_RATE), + DEVICE_FLG(0x1397, 0x0508, /* Behringer UMC204HD */ + QUIRK_FLAG_PLAYBACK_FIRST | QUIRK_FLAG_GENERIC_IMPLICIT_FB), + DEVICE_FLG(0x1397, 0x0509, /* Behringer UMC404HD */ + QUIRK_FLAG_PLAYBACK_FIRST | QUIRK_FLAG_GENERIC_IMPLICIT_FB), DEVICE_FLG(0x13e5, 0x0001, /* Serato Phono */ QUIRK_FLAG_IGNORE_CTL_ERROR), DEVICE_FLG(0x154e, 0x1002, /* Denon DCD-1500RE */
From: Tim Crawford tcrawford@system76.com
commit 11bea26929a1a3a9dd1a287b60c2f471701bf706 upstream.
Fixes headset detection on Clevo L140PU.
Signed-off-by: Tim Crawford tcrawford@system76.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220624144109.3957-1-tcrawford@system76.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 1 + 1 file changed, 1 insertion(+)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -9001,6 +9001,7 @@ static const struct snd_pci_quirk alc269 SND_PCI_QUIRK(0x1558, 0x70f4, "Clevo NH77EPY", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1558, 0x70f6, "Clevo NH77DPQ-Y", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1558, 0x7716, "Clevo NS50PU", ALC256_FIXUP_SYSTEM76_MIC_NO_PRESENCE), + SND_PCI_QUIRK(0x1558, 0x7718, "Clevo L140PU", ALC256_FIXUP_SYSTEM76_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1558, 0x8228, "Clevo NR40BU", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1558, 0x8520, "Clevo NH50D[CD]", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1558, 0x8521, "Clevo NH77D[CD]", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
From: Takashi Iwai tiwai@suse.de
commit c5e58c4545a69677d078b4c813b5d10d3481be9c upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() manually on the error from the probe callback.
Fixes: 5bff69b3645d ("ALSA: cs46xx: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Reported-and-tested-by: Jan Engelhardt jengelh@inai.de Link: https://lore.kernel.org/r/p2p1s96o-746-74p4-s95-61qo1p7782pn@vanv.qr Link: https://lore.kernel.org/r/20220705152336.350-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/cs46xx/cs46xx.c | 22 +++++++++++++--------- 1 file changed, 13 insertions(+), 9 deletions(-)
--- a/sound/pci/cs46xx/cs46xx.c +++ b/sound/pci/cs46xx/cs46xx.c @@ -74,36 +74,36 @@ static int snd_card_cs46xx_probe(struct err = snd_cs46xx_create(card, pci, external_amp[dev], thinkpad[dev]); if (err < 0) - return err; + goto error; card->private_data = chip; chip->accept_valid = mmap_valid[dev]; err = snd_cs46xx_pcm(chip, 0); if (err < 0) - return err; + goto error; #ifdef CONFIG_SND_CS46XX_NEW_DSP err = snd_cs46xx_pcm_rear(chip, 1); if (err < 0) - return err; + goto error; err = snd_cs46xx_pcm_iec958(chip, 2); if (err < 0) - return err; + goto error; #endif err = snd_cs46xx_mixer(chip, 2); if (err < 0) - return err; + goto error; #ifdef CONFIG_SND_CS46XX_NEW_DSP if (chip->nr_ac97_codecs ==2) { err = snd_cs46xx_pcm_center_lfe(chip, 3); if (err < 0) - return err; + goto error; } #endif err = snd_cs46xx_midi(chip, 0); if (err < 0) - return err; + goto error; err = snd_cs46xx_start_dsp(chip); if (err < 0) - return err; + goto error;
snd_cs46xx_gameport(chip);
@@ -117,11 +117,15 @@ static int snd_card_cs46xx_probe(struct
err = snd_card_register(card); if (err < 0) - return err; + goto error;
pci_set_drvdata(pci, card); dev++; return 0; + + error: + snd_card_free(card); + return err; }
static struct pci_driver cs46xx_driver = {
From: Oliver Hartkopp socketcan@hartkopp.net
commit f1b4e32aca0811aa011c76e5d6cf2fa19224b386 upstream.
In commit d5f9023fa61e ("can: bcm: delay release of struct bcm_op after synchronize_rcu()") Thadeu Lima de Souza Cascardo introduced two synchronize_rcu() calls in bcm_release() (only once at socket close) and in bcm_delete_rx_op() (called on removal of each single bcm_op).
Unfortunately this slow removal of the bcm_op's affects user space applications like cansniffer where the modification of a filter removes 2048 bcm_op's which blocks the cansniffer application for 40(!) seconds.
In commit 181d4447905d ("can: gw: use call_rcu() instead of costly synchronize_rcu()") Eric Dumazet replaced the synchronize_rcu() calls with several call_rcu()'s to safely remove the data structures after the removal of CAN ID subscriptions with can_rx_unregister() calls.
This patch adopts Erics approach for the can-bcm which should be applicable since the removal of tasklet_kill() in bcm_remove_op() and the introduction of the HRTIMER_MODE_SOFT timer handling in Linux 5.4.
Fixes: d5f9023fa61e ("can: bcm: delay release of struct bcm_op after synchronize_rcu()") # >= 5.4 Link: https://lore.kernel.org/all/20220520183239.19111-1-socketcan@hartkopp.net Cc: stable@vger.kernel.org Cc: Eric Dumazet edumazet@google.com Cc: Norbert Slusarek nslusarek@gmx.net Cc: Thadeu Lima de Souza Cascardo cascardo@canonical.com Signed-off-by: Oliver Hartkopp socketcan@hartkopp.net Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/can/bcm.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-)
--- a/net/can/bcm.c +++ b/net/can/bcm.c @@ -100,6 +100,7 @@ static inline u64 get_u64(const struct c
struct bcm_op { struct list_head list; + struct rcu_head rcu; int ifindex; canid_t can_id; u32 flags; @@ -718,10 +719,9 @@ static struct bcm_op *bcm_find_op(struct return NULL; }
-static void bcm_remove_op(struct bcm_op *op) +static void bcm_free_op_rcu(struct rcu_head *rcu_head) { - hrtimer_cancel(&op->timer); - hrtimer_cancel(&op->thrtimer); + struct bcm_op *op = container_of(rcu_head, struct bcm_op, rcu);
if ((op->frames) && (op->frames != &op->sframe)) kfree(op->frames); @@ -732,6 +732,14 @@ static void bcm_remove_op(struct bcm_op kfree(op); }
+static void bcm_remove_op(struct bcm_op *op) +{ + hrtimer_cancel(&op->timer); + hrtimer_cancel(&op->thrtimer); + + call_rcu(&op->rcu, bcm_free_op_rcu); +} + static void bcm_rx_unreg(struct net_device *dev, struct bcm_op *op) { if (op->rx_reg_dev == dev) { @@ -757,6 +765,9 @@ static int bcm_delete_rx_op(struct list_ if ((op->can_id == mh->can_id) && (op->ifindex == ifindex) && (op->flags & CAN_FD_FRAME) == (mh->flags & CAN_FD_FRAME)) {
+ /* disable automatic timer on frame reception */ + op->flags |= RX_NO_AUTOTIMER; + /* * Don't care if we're bound or not (due to netdev * problems) can_rx_unregister() is always a save @@ -785,7 +796,6 @@ static int bcm_delete_rx_op(struct list_ bcm_rx_handler, op);
list_del(&op->list); - synchronize_rcu(); bcm_remove_op(op); return 1; /* done */ }
From: Liang He windhl@126.com
commit 562fed945ea482833667f85496eeda766d511386 upstream.
In grcan_probe(), of_find_node_by_path() has already increased the refcount. There is no need to call of_node_get() again, so remove it.
Link: https://lore.kernel.org/all/20220619070257.4067022-1-windhl@126.com Fixes: 1e93ed26acf0 ("can: grcan: grcan_probe(): fix broken system id check for errata workaround needs") Cc: stable@vger.kernel.org # v5.18 Cc: Andreas Larsson andreas@gaisler.com Signed-off-by: Liang He windhl@126.com Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/can/grcan.c | 1 - 1 file changed, 1 deletion(-)
--- a/drivers/net/can/grcan.c +++ b/drivers/net/can/grcan.c @@ -1659,7 +1659,6 @@ static int grcan_probe(struct platform_d */ sysid_parent = of_find_node_by_path("/ambapp0"); if (sysid_parent) { - of_node_get(sysid_parent); err = of_property_read_u32(sysid_parent, "systemid", &sysid); if (!err && ((sysid & GRLIB_VERSION_MASK) >= GRCAN_TXBUG_SAFE_GRLIB_VERSION))
From: Rhett Aultman rhett.aultman@samsara.com
commit 2bda24ef95c0311ab93bda00db40486acf30bd0a upstream.
The gs_usb driver appears to suffer from a malady common to many USB CAN adapter drivers in that it performs usb_alloc_coherent() to allocate a number of USB request blocks (URBs) for RX, and then later relies on usb_kill_anchored_urbs() to free them, but this doesn't actually free them. As a result, this may be leaking DMA memory that's been used by the driver.
This commit is an adaptation of the techniques found in the esd_usb2 driver where a similar design pattern led to a memory leak. It explicitly frees the RX URBs and their DMA memory via a call to usb_free_coherent(). Since the RX URBs were allocated in the gs_can_open(), we remove them in gs_can_close() rather than in the disconnect function as was done in esd_usb2.
For more information, see the 928150fad41b ("can: esd_usb2: fix memory leak").
Link: https://lore.kernel.org/all/alpine.DEB.2.22.394.2206031547001.1630869@thelap... Fixes: d08e973a77d1 ("can: gs_usb: Added support for the GS_USB CAN devices") Cc: stable@vger.kernel.org Signed-off-by: Rhett Aultman rhett.aultman@samsara.com Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/can/usb/gs_usb.c | 23 +++++++++++++++++++++-- 1 file changed, 21 insertions(+), 2 deletions(-)
--- a/drivers/net/can/usb/gs_usb.c +++ b/drivers/net/can/usb/gs_usb.c @@ -185,6 +185,8 @@ struct gs_can {
struct usb_anchor tx_submitted; atomic_t active_tx_urbs; + void *rxbuf[GS_MAX_RX_URBS]; + dma_addr_t rxbuf_dma[GS_MAX_RX_URBS]; };
/* usb interface struct */ @@ -594,6 +596,7 @@ static int gs_can_open(struct net_device for (i = 0; i < GS_MAX_RX_URBS; i++) { struct urb *urb; u8 *buf; + dma_addr_t buf_dma;
/* alloc rx urb */ urb = usb_alloc_urb(0, GFP_KERNEL); @@ -604,7 +607,7 @@ static int gs_can_open(struct net_device buf = usb_alloc_coherent(dev->udev, sizeof(struct gs_host_frame), GFP_KERNEL, - &urb->transfer_dma); + &buf_dma); if (!buf) { netdev_err(netdev, "No memory left for USB buffer\n"); @@ -612,6 +615,8 @@ static int gs_can_open(struct net_device return -ENOMEM; }
+ urb->transfer_dma = buf_dma; + /* fill, anchor, and submit rx urb */ usb_fill_bulk_urb(urb, dev->udev, @@ -635,10 +640,17 @@ static int gs_can_open(struct net_device rc);
usb_unanchor_urb(urb); + usb_free_coherent(dev->udev, + sizeof(struct gs_host_frame), + buf, + buf_dma); usb_free_urb(urb); break; }
+ dev->rxbuf[i] = buf; + dev->rxbuf_dma[i] = buf_dma; + /* Drop reference, * USB core will take care of freeing it */ @@ -703,13 +715,20 @@ static int gs_can_close(struct net_devic int rc; struct gs_can *dev = netdev_priv(netdev); struct gs_usb *parent = dev->parent; + unsigned int i;
netif_stop_queue(netdev);
/* Stop polling */ parent->active_channels--; - if (!parent->active_channels) + if (!parent->active_channels) { usb_kill_anchored_urbs(&parent->rx_submitted); + for (i = 0; i < GS_MAX_RX_URBS; i++) + usb_free_coherent(dev->udev, + sizeof(struct gs_host_frame), + dev->rxbuf[i], + dev->rxbuf_dma[i]); + }
/* Stop sending URBs */ usb_kill_anchored_urbs(&dev->tx_submitted);
From: Marc Kleine-Budde mkl@pengutronix.de
commit 5b12933de4e76ec164031c18ce8e0904abf530d7 upstream.
In commit df06fd678260 ("can: m_can: m_can_chip_config(): enable and configure internal timestamps") the timestamping in the m_can core should be enabled. In peripheral mode, the RX'ed CAN frames, TX compete frames and error events are sorted by the timestamp.
The above mentioned commit however forgot to enable the timestamping. Add the missing bits to enable the timestamp counter to the write of the Timestamp Counter Configuration register.
Link: https://lore.kernel.org/all/20220612212708.4081756-1-mkl@pengutronix.de Fixes: df06fd678260 ("can: m_can: m_can_chip_config(): enable and configure internal timestamps") Cc: stable@vger.kernel.org # 5.13 Cc: Torin Cooper-Bennun torin@maxiluxsystems.com Reviewed-by: Chandrasekar Ramakrishnan rcsekar@samsung.com Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/can/m_can/m_can.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/net/can/m_can/m_can.c +++ b/drivers/net/can/m_can/m_can.c @@ -1367,7 +1367,9 @@ static void m_can_chip_config(struct net /* enable internal timestamp generation, with a prescalar of 16. The * prescalar is applied to the nominal bit timing */ - m_can_write(cdev, M_CAN_TSCC, FIELD_PREP(TSCC_TCP_MASK, 0xf)); + m_can_write(cdev, M_CAN_TSCC, + FIELD_PREP(TSCC_TCP_MASK, 0xf) | + FIELD_PREP(TSCC_TSS_MASK, TSCC_TSS_INTERNAL));
m_can_config_endisable(cdev, false);
From: Marc Kleine-Budde mkl@pengutronix.de
commit 4c3333693f07313f5f0145a922f14a7d3c0f4f21 upstream.
In commit 1be37d3b0414 ("can: m_can: fix periph RX path: use rx-offload to ensure skbs are sent from softirq context") the RX path for peripheral devices was switched to RX-offload.
Received CAN frames are pushed to RX-offload together with a timestamp. RX-offload is designed to handle overflows of the timestamp correctly, if 32 bit timestamps are provided.
The timestamps of m_can core are only 16 bits wide. So this patch shifts them to full 32 bit before passing them to RX-offload.
Link: https://lore.kernel.org/all/20220612211410.4081390-1-mkl@pengutronix.de Fixes: 1be37d3b0414 ("can: m_can: fix periph RX path: use rx-offload to ensure skbs are sent from softirq context") Cc: stable@vger.kernel.org # 5.13 Cc: Torin Cooper-Bennun torin@maxiluxsystems.com Reviewed-by: Chandrasekar Ramakrishnan rcsekar@samsung.com Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/can/m_can/m_can.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/net/can/m_can/m_can.c +++ b/drivers/net/can/m_can/m_can.c @@ -532,7 +532,7 @@ static int m_can_read_fifo(struct net_de stats->rx_packets++; stats->rx_bytes += cf->len;
- timestamp = FIELD_GET(RX_BUF_RXTS_MASK, fifo_header.dlc); + timestamp = FIELD_GET(RX_BUF_RXTS_MASK, fifo_header.dlc) << 16;
m_can_receive_skb(cdev, skb, timestamp);
@@ -1043,7 +1043,7 @@ static int m_can_echo_tx_event(struct ne }
msg_mark = FIELD_GET(TX_EVENT_MM_MASK, txe); - timestamp = FIELD_GET(TX_EVENT_TXTS_MASK, txe); + timestamp = FIELD_GET(TX_EVENT_TXTS_MASK, txe) << 16;
/* ack txe element */ m_can_write(cdev, M_CAN_TXEFA, FIELD_PREP(TXEFA_EFAI_MASK,
From: Thomas Kopp thomas.kopp@microchip.com
commit 406cc9cdb3e8d644b15e8028948f091b82abdbca upstream.
The mcp251xfd compatible chips have an erratum ([1], [2]), where the received CRC doesn't match the calculated CRC. In commit c7eb923c3caf ("can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register") the following workaround was implementierend.
- If a CRC read error on the TBC register is detected and the first byte is 0x00 or 0x80, the most significant bit of the first byte is flipped and the CRC is calculated again. - If the CRC now matches, the _original_ data is passed to the reader. For now we assume transferred data was OK.
Measurements on the mcp2517fd show that the workaround is applicable not only of the lowest byte is 0x00 or 0x80, but also if 3 least significant bits are set.
Update check on 1st data byte and workaround description accordingly.
[1] mcp2517fd: DS80000792C: "Incorrect CRC for certain READ_CRC commands" [2] mcp2518fd: DS80000789C: "Incorrect CRC for certain READ_CRC commands"
Link: https://lore.kernel.org/all/DM4PR11MB53901D49578FE265B239E55AFB7C9@DM4PR11MB... Fixes: c7eb923c3caf ("can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register") Cc: stable@vger.kernel.org Reported-by: Pavel Modilaynen pavel.modilaynen@volvocars.com Signed-off-by: Thomas Kopp thomas.kopp@microchip.com [mkl: split into 2 patches, update patch description and documentation] Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
--- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c +++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c @@ -325,10 +325,12 @@ mcp251xfd_regmap_crc_read(void *context, * register. It increments once per SYS clock tick, * which is 20 or 40 MHz. * - * Observation shows that if the lowest byte (which is - * transferred first on the SPI bus) of that register - * is 0x00 or 0x80 the calculated CRC doesn't always - * match the transferred one. + * Observation on the mcp2518fd shows that if the + * lowest byte (which is transferred first on the SPI + * bus) of that register is 0x00 or 0x80 the + * calculated CRC doesn't always match the transferred + * one. On the mcp2517fd this problem is not limited + * to the first byte being 0x00 or 0x80. * * If the highest bit in the lowest byte is flipped * the transferred CRC matches the calculated one. We @@ -337,7 +339,8 @@ mcp251xfd_regmap_crc_read(void *context, * correct. */ if (reg == MCP251XFD_REG_TBC && - (buf_rx->data[0] == 0x0 || buf_rx->data[0] == 0x80)) { + ((buf_rx->data[0] & 0xf8) == 0x0 || + (buf_rx->data[0] & 0xf8) == 0x80)) { /* Flip highest bit in lowest byte of le32 */ buf_rx->data[0] ^= 0x80;
From: Thomas Kopp thomas.kopp@microchip.com
commit e3d4ee7d5f7f5256dfe89219afcc7a2d553b731f upstream.
The mcp251xfd compatible chips have an erratum ([1], [2]), where the received CRC doesn't match the calculated CRC. In commit c7eb923c3caf ("can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register") the following workaround was implementierend.
- If a CRC read error on the TBC register is detected and the first byte is 0x00 or 0x80, the most significant bit of the first byte is flipped and the CRC is calculated again. - If the CRC now matches, the _original_ data is passed to the reader. For now we assume transferred data was OK.
New investigations and simulations indicate that the CRC send by the device is calculated on correct data, and the data is incorrectly received by the SPI host controller.
Use flipped instead of original data and update workaround description in mcp251xfd_regmap_crc_read().
[1] mcp2517fd: DS80000792C: "Incorrect CRC for certain READ_CRC commands" [2] mcp2518fd: DS80000789C: "Incorrect CRC for certain READ_CRC commands"
Link: https://lore.kernel.org/all/DM4PR11MB53901D49578FE265B239E55AFB7C9@DM4PR11MB... Fixes: c7eb923c3caf ("can: mcp251xfd: mcp251xfd_regmap_crc_read(): work around broken CRC on TBC register") Cc: stable@vger.kernel.org Signed-off-by: Thomas Kopp thomas.kopp@microchip.com [mkl: split into 2 patches, update patch description and documentation] Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-)
--- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c +++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-regmap.c @@ -334,9 +334,8 @@ mcp251xfd_regmap_crc_read(void *context, * * If the highest bit in the lowest byte is flipped * the transferred CRC matches the calculated one. We - * assume for now the CRC calculation in the chip - * works on wrong data and the transferred data is - * correct. + * assume for now the CRC operates on the correct + * data. */ if (reg == MCP251XFD_REG_TBC && ((buf_rx->data[0] & 0xf8) == 0x0 || @@ -350,10 +349,8 @@ mcp251xfd_regmap_crc_read(void *context, val_len); if (!err) { /* If CRC is now correct, assume - * transferred data was OK, flip bit - * back to original value. + * flipped data is OK. */ - buf_rx->data[0] ^= 0x80; goto out; } }
From: Daniel Borkmann daniel@iogearbox.net
commit a12ca6277eca6aeeccf66e840c23a2b520e24c8f upstream.
Kuee reported a quirk in the jmp32's jeq/jne simulation, namely that the register value does not match expectations for the fall-through path. For example:
Before fix:
0: R1=ctx(off=0,imm=0) R10=fp0 0: (b7) r2 = 0 ; R2_w=P0 1: (b7) r6 = 563 ; R6_w=P563 2: (87) r2 = -r2 ; R2_w=Pscalar() 3: (87) r2 = -r2 ; R2_w=Pscalar() 4: (4c) w2 |= w6 ; R2_w=Pscalar(umin=563,umax=4294967295,var_off=(0x233; 0xfffffdcc),s32_min=-2147483085) R6_w=P563 5: (56) if w2 != 0x8 goto pc+1 ; R2_w=P571 <--- [*] 6: (95) exit R0 !read_ok
After fix:
0: R1=ctx(off=0,imm=0) R10=fp0 0: (b7) r2 = 0 ; R2_w=P0 1: (b7) r6 = 563 ; R6_w=P563 2: (87) r2 = -r2 ; R2_w=Pscalar() 3: (87) r2 = -r2 ; R2_w=Pscalar() 4: (4c) w2 |= w6 ; R2_w=Pscalar(umin=563,umax=4294967295,var_off=(0x233; 0xfffffdcc),s32_min=-2147483085) R6_w=P563 5: (56) if w2 != 0x8 goto pc+1 ; R2_w=P8 <--- [*] 6: (95) exit R0 !read_ok
As can be seen on line 5 for the branch fall-through path in R2 [*] is that given condition w2 != 0x8 is false, verifier should conclude that r2 = 8 as upper 32 bit are known to be zero. However, verifier incorrectly concludes that r2 = 571 which is far off.
The problem is it only marks false{true}_reg as known in the switch for JE/NE case, but at the end of the function, it uses {false,true}_{64,32}off to update {false,true}_reg->var_off and they still hold the prior value of {false,true}_reg->var_off before it got marked as known. The subsequent __reg_combine_32_into_64() then propagates this old var_off and derives new bounds. The information between min/max bounds on {false,true}_reg from setting the register to known const combined with the {false,true}_reg->var_off based on the old information then derives wrong register data.
Fix it by detangling the BPF_JEQ/BPF_JNE cases and updating relevant {false,true}_{64,32}off tnums along with the register marking to known constant.
Fixes: 3f50f132d840 ("bpf: Verifier, do explicit ALU32 bounds tracking") Reported-by: Kuee K1r0a liulin063@gmail.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: John Fastabend john.fastabend@gmail.com Link: https://lore.kernel.org/bpf/20220701124727.11153-1-daniel@iogearbox.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/bpf/verifier.c | 41 ++++++++++++++++++++++++----------------- 1 file changed, 24 insertions(+), 17 deletions(-)
--- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -8591,26 +8591,33 @@ static void reg_set_min_max(struct bpf_r return;
switch (opcode) { + /* JEQ/JNE comparison doesn't change the register equivalence. + * + * r1 = r2; + * if (r1 == 42) goto label; + * ... + * label: // here both r1 and r2 are known to be 42. + * + * Hence when marking register as known preserve it's ID. + */ case BPF_JEQ: + if (is_jmp32) { + __mark_reg32_known(true_reg, val32); + true_32off = tnum_subreg(true_reg->var_off); + } else { + ___mark_reg_known(true_reg, val); + true_64off = true_reg->var_off; + } + break; case BPF_JNE: - { - struct bpf_reg_state *reg = - opcode == BPF_JEQ ? true_reg : false_reg; - - /* JEQ/JNE comparison doesn't change the register equivalence. - * r1 = r2; - * if (r1 == 42) goto label; - * ... - * label: // here both r1 and r2 are known to be 42. - * - * Hence when marking register as known preserve it's ID. - */ - if (is_jmp32) - __mark_reg32_known(reg, val32); - else - ___mark_reg_known(reg, val); + if (is_jmp32) { + __mark_reg32_known(false_reg, val32); + false_32off = tnum_subreg(false_reg->var_off); + } else { + ___mark_reg_known(false_reg, val); + false_64off = false_reg->var_off; + } break; - } case BPF_JSET: if (is_jmp32) { false_32off = tnum_and(false_32off, tnum_const(~val32));
From: Daniel Borkmann daniel@iogearbox.net
commit 3844d153a41adea718202c10ae91dc96b37453b5 upstream.
Kuee reported a corner case where the tnum becomes constant after the call to __reg_bound_offset(), but the register's bounds are not, that is, its min bounds are still not equal to the register's max bounds.
This in turn allows to leak pointers through turning a pointer register as is into an unknown scalar via adjust_ptr_min_max_vals().
Before:
func#0 @0 0: R1=ctx(off=0,imm=0,umax=0,var_off=(0x0; 0x0)) R10=fp(off=0,imm=0,umax=0,var_off=(0x0; 0x0)) 0: (b7) r0 = 1 ; R0_w=scalar(imm=1,umin=1,umax=1,var_off=(0x1; 0x0)) 1: (b7) r3 = 0 ; R3_w=scalar(imm=0,umax=0,var_off=(0x0; 0x0)) 2: (87) r3 = -r3 ; R3_w=scalar() 3: (87) r3 = -r3 ; R3_w=scalar() 4: (47) r3 |= 32767 ; R3_w=scalar(smin=-9223372036854743041,umin=32767,var_off=(0x7fff; 0xffffffffffff8000),s32_min=-2147450881) 5: (75) if r3 s>= 0x0 goto pc+1 ; R3_w=scalar(umin=9223372036854808575,var_off=(0x8000000000007fff; 0x7fffffffffff8000),s32_min=-2147450881,u32_min=32767) 6: (95) exit
from 5 to 7: R0=scalar(imm=1,umin=1,umax=1,var_off=(0x1; 0x0)) R1=ctx(off=0,imm=0,umax=0,var_off=(0x0; 0x0)) R3=scalar(umin=32767,umax=9223372036854775807,var_off=(0x7fff; 0x7fffffffffff8000),s32_min=-2147450881) R10=fp(off=0,imm=0,umax=0,var_off=(0x0; 0x0)) 7: (d5) if r3 s<= 0x8000 goto pc+1 ; R3=scalar(umin=32769,umax=9223372036854775807,var_off=(0x7fff; 0x7fffffffffff8000),s32_min=-2147450881,u32_min=32767) 8: (95) exit
from 7 to 9: R0=scalar(imm=1,umin=1,umax=1,var_off=(0x1; 0x0)) R1=ctx(off=0,imm=0,umax=0,var_off=(0x0; 0x0)) R3=scalar(umin=32767,umax=32768,var_off=(0x7fff; 0x8000)) R10=fp(off=0,imm=0,umax=0,var_off=(0x0; 0x0)) 9: (07) r3 += -32767 ; R3_w=scalar(imm=0,umax=1,var_off=(0x0; 0x0)) <--- [*] 10: (95) exit
What can be seen here is that R3=scalar(umin=32767,umax=32768,var_off=(0x7fff; 0x8000)) after the operation R3 += -32767 results in a 'malformed' constant, that is, R3_w=scalar(imm=0,umax=1,var_off=(0x0; 0x0)). Intersecting with var_off has not been done at that point via __update_reg_bounds(), which would have improved the umax to be equal to umin.
Refactor the tnum <> min/max bounds information flow into a reg_bounds_sync() helper and use it consistently everywhere. After the fix, bounds have been corrected to R3_w=scalar(imm=0,umax=0,var_off=(0x0; 0x0)) and thus the register is regarded as a 'proper' constant scalar of 0.
After:
func#0 @0 0: R1=ctx(off=0,imm=0,umax=0,var_off=(0x0; 0x0)) R10=fp(off=0,imm=0,umax=0,var_off=(0x0; 0x0)) 0: (b7) r0 = 1 ; R0_w=scalar(imm=1,umin=1,umax=1,var_off=(0x1; 0x0)) 1: (b7) r3 = 0 ; R3_w=scalar(imm=0,umax=0,var_off=(0x0; 0x0)) 2: (87) r3 = -r3 ; R3_w=scalar() 3: (87) r3 = -r3 ; R3_w=scalar() 4: (47) r3 |= 32767 ; R3_w=scalar(smin=-9223372036854743041,umin=32767,var_off=(0x7fff; 0xffffffffffff8000),s32_min=-2147450881) 5: (75) if r3 s>= 0x0 goto pc+1 ; R3_w=scalar(umin=9223372036854808575,var_off=(0x8000000000007fff; 0x7fffffffffff8000),s32_min=-2147450881,u32_min=32767) 6: (95) exit
from 5 to 7: R0=scalar(imm=1,umin=1,umax=1,var_off=(0x1; 0x0)) R1=ctx(off=0,imm=0,umax=0,var_off=(0x0; 0x0)) R3=scalar(umin=32767,umax=9223372036854775807,var_off=(0x7fff; 0x7fffffffffff8000),s32_min=-2147450881) R10=fp(off=0,imm=0,umax=0,var_off=(0x0; 0x0)) 7: (d5) if r3 s<= 0x8000 goto pc+1 ; R3=scalar(umin=32769,umax=9223372036854775807,var_off=(0x7fff; 0x7fffffffffff8000),s32_min=-2147450881,u32_min=32767) 8: (95) exit
from 7 to 9: R0=scalar(imm=1,umin=1,umax=1,var_off=(0x1; 0x0)) R1=ctx(off=0,imm=0,umax=0,var_off=(0x0; 0x0)) R3=scalar(umin=32767,umax=32768,var_off=(0x7fff; 0x8000)) R10=fp(off=0,imm=0,umax=0,var_off=(0x0; 0x0)) 9: (07) r3 += -32767 ; R3_w=scalar(imm=0,umax=0,var_off=(0x0; 0x0)) <--- [*] 10: (95) exit
Fixes: b03c9f9fdc37 ("bpf/verifier: track signed and unsigned min/max values") Reported-by: Kuee K1r0a liulin063@gmail.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Signed-off-by: Andrii Nakryiko andrii@kernel.org Acked-by: John Fastabend john.fastabend@gmail.com Link: https://lore.kernel.org/bpf/20220701124727.11153-2-daniel@iogearbox.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/bpf/verifier.c | 72 +++++++++++++++----------------------------------- 1 file changed, 23 insertions(+), 49 deletions(-)
--- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -1333,6 +1333,21 @@ static void __reg_bound_offset(struct bp reg->var_off = tnum_or(tnum_clear_subreg(var64_off), var32_off); }
+static void reg_bounds_sync(struct bpf_reg_state *reg) +{ + /* We might have learned new bounds from the var_off. */ + __update_reg_bounds(reg); + /* We might have learned something about the sign bit. */ + __reg_deduce_bounds(reg); + /* We might have learned some bits from the bounds. */ + __reg_bound_offset(reg); + /* Intersecting with the old var_off might have improved our bounds + * slightly, e.g. if umax was 0x7f...f and var_off was (0; 0xf...fc), + * then new var_off is (0; 0x7f...fc) which improves our umax. + */ + __update_reg_bounds(reg); +} + static bool __reg32_bound_s64(s32 a) { return a >= 0 && a <= S32_MAX; @@ -1374,16 +1389,8 @@ static void __reg_combine_32_into_64(str * so they do not impact tnum bounds calculation. */ __mark_reg64_unbounded(reg); - __update_reg_bounds(reg); } - - /* Intersecting with the old var_off might have improved our bounds - * slightly. e.g. if umax was 0x7f...f and var_off was (0; 0xf...fc), - * then new var_off is (0; 0x7f...fc) which improves our umax. - */ - __reg_deduce_bounds(reg); - __reg_bound_offset(reg); - __update_reg_bounds(reg); + reg_bounds_sync(reg); }
static bool __reg64_bound_s32(s64 a) @@ -1399,7 +1406,6 @@ static bool __reg64_bound_u32(u64 a) static void __reg_combine_64_into_32(struct bpf_reg_state *reg) { __mark_reg32_unbounded(reg); - if (__reg64_bound_s32(reg->smin_value) && __reg64_bound_s32(reg->smax_value)) { reg->s32_min_value = (s32)reg->smin_value; reg->s32_max_value = (s32)reg->smax_value; @@ -1408,14 +1414,7 @@ static void __reg_combine_64_into_32(str reg->u32_min_value = (u32)reg->umin_value; reg->u32_max_value = (u32)reg->umax_value; } - - /* Intersecting with the old var_off might have improved our bounds - * slightly. e.g. if umax was 0x7f...f and var_off was (0; 0xf...fc), - * then new var_off is (0; 0x7f...fc) which improves our umax. - */ - __reg_deduce_bounds(reg); - __reg_bound_offset(reg); - __update_reg_bounds(reg); + reg_bounds_sync(reg); }
/* Mark a register as having a completely unknown (scalar) value. */ @@ -6054,9 +6053,7 @@ static void do_refine_retval_range(struc ret_reg->s32_max_value = meta->msize_max_value; ret_reg->smin_value = -MAX_ERRNO; ret_reg->s32_min_value = -MAX_ERRNO; - __reg_deduce_bounds(ret_reg); - __reg_bound_offset(ret_reg); - __update_reg_bounds(ret_reg); + reg_bounds_sync(ret_reg); }
static int @@ -7216,11 +7213,7 @@ reject:
if (!check_reg_sane_offset(env, dst_reg, ptr_reg->type)) return -EINVAL; - - __update_reg_bounds(dst_reg); - __reg_deduce_bounds(dst_reg); - __reg_bound_offset(dst_reg); - + reg_bounds_sync(dst_reg); if (sanitize_check_bounds(env, insn, dst_reg) < 0) return -EACCES; if (sanitize_needed(opcode)) { @@ -7958,10 +7951,7 @@ static int adjust_scalar_min_max_vals(st /* ALU32 ops are zero extended into 64bit register */ if (alu32) zext_32_to_64(dst_reg); - - __update_reg_bounds(dst_reg); - __reg_deduce_bounds(dst_reg); - __reg_bound_offset(dst_reg); + reg_bounds_sync(dst_reg); return 0; }
@@ -8150,10 +8140,7 @@ static int check_alu_op(struct bpf_verif insn->dst_reg); } zext_32_to_64(dst_reg); - - __update_reg_bounds(dst_reg); - __reg_deduce_bounds(dst_reg); - __reg_bound_offset(dst_reg); + reg_bounds_sync(dst_reg); } } else { /* case: R = imm @@ -8756,21 +8743,8 @@ static void __reg_combine_min_max(struct dst_reg->smax_value); src_reg->var_off = dst_reg->var_off = tnum_intersect(src_reg->var_off, dst_reg->var_off); - /* We might have learned new bounds from the var_off. */ - __update_reg_bounds(src_reg); - __update_reg_bounds(dst_reg); - /* We might have learned something about the sign bit. */ - __reg_deduce_bounds(src_reg); - __reg_deduce_bounds(dst_reg); - /* We might have learned some bits from the bounds. */ - __reg_bound_offset(src_reg); - __reg_bound_offset(dst_reg); - /* Intersecting with the old var_off might have improved our bounds - * slightly. e.g. if umax was 0x7f...f and var_off was (0; 0xf...fc), - * then new var_off is (0; 0x7f...fc) which improves our umax. - */ - __update_reg_bounds(src_reg); - __update_reg_bounds(dst_reg); + reg_bounds_sync(src_reg); + reg_bounds_sync(dst_reg); }
static void reg_combine_min_max(struct bpf_reg_state *true_src,
From: Oliver Neukum oneukum@suse.com
commit b55a21b764c1e182014630fa5486d717484ac58f upstream.
usbnet_write_cmd_async() mixed up which buffers need to be freed in which error case.
v2: add Fixes tag v3: fix uninitialized buf pointer
Fixes: 877bd862f32b8 ("usbnet: introduce usbnet 3 command helpers") Signed-off-by: Oliver Neukum oneukum@suse.com Link: https://lore.kernel.org/r/20220705125351.17309-1-oneukum@suse.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/usb/usbnet.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-)
--- a/drivers/net/usb/usbnet.c +++ b/drivers/net/usb/usbnet.c @@ -2135,7 +2135,7 @@ static void usbnet_async_cmd_cb(struct u int usbnet_write_cmd_async(struct usbnet *dev, u8 cmd, u8 reqtype, u16 value, u16 index, const void *data, u16 size) { - struct usb_ctrlrequest *req = NULL; + struct usb_ctrlrequest *req; struct urb *urb; int err = -ENOMEM; void *buf = NULL; @@ -2153,7 +2153,7 @@ int usbnet_write_cmd_async(struct usbnet if (!buf) { netdev_err(dev->net, "Error allocating buffer" " in %s!\n", __func__); - goto fail_free; + goto fail_free_urb; } }
@@ -2177,14 +2177,21 @@ int usbnet_write_cmd_async(struct usbnet if (err < 0) { netdev_err(dev->net, "Error submitting the control" " message: status=%d\n", err); - goto fail_free; + goto fail_free_all; } return 0;
+fail_free_all: + kfree(req); fail_free_buf: kfree(buf); -fail_free: - kfree(req); + /* + * avoid a double free + * needed because the flag can be set only + * after filling the URB + */ + urb->transfer_flags = 0; +fail_free_urb: usb_free_urb(urb); fail: return err;
From: Duoming Zhou duoming@zju.edu.cn
commit 148ca04518070910739dfc4eeda765057856403d upstream.
There are UAF bugs caused by rose_t0timer_expiry(). The root cause is that del_timer() could not stop the timer handler that is running and there is no synchronization. One of the race conditions is shown below:
(thread 1) | (thread 2) | rose_device_event | rose_rt_device_down | rose_remove_neigh rose_t0timer_expiry | rose_stop_t0timer(rose_neigh) ... | del_timer(&neigh->t0timer) | kfree(rose_neigh) //[1]FREE neigh->dce_mode //[2]USE |
The rose_neigh is deallocated in position [1] and use in position [2].
The crash trace triggered by POC is like below:
BUG: KASAN: use-after-free in expire_timers+0x144/0x320 Write of size 8 at addr ffff888009b19658 by task swapper/0/0 ... Call Trace: <IRQ> dump_stack_lvl+0xbf/0xee print_address_description+0x7b/0x440 print_report+0x101/0x230 ? expire_timers+0x144/0x320 kasan_report+0xed/0x120 ? expire_timers+0x144/0x320 expire_timers+0x144/0x320 __run_timers+0x3ff/0x4d0 run_timer_softirq+0x41/0x80 __do_softirq+0x233/0x544 ...
This patch changes rose_stop_ftimer() and rose_stop_t0timer() in rose_remove_neigh() to del_timer_sync() in order that the timer handler could be finished before the resources such as rose_neigh and so on are deallocated. As a result, the UAF bugs could be mitigated.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Duoming Zhou duoming@zju.edu.cn Link: https://lore.kernel.org/r/20220705125610.77971-1-duoming@zju.edu.cn Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/rose/rose_route.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/net/rose/rose_route.c +++ b/net/rose/rose_route.c @@ -227,8 +227,8 @@ static void rose_remove_neigh(struct ros { struct rose_neigh *s;
- rose_stop_ftimer(rose_neigh); - rose_stop_t0timer(rose_neigh); + del_timer_sync(&rose_neigh->ftimer); + del_timer_sync(&rose_neigh->t0timer);
skb_queue_purge(&rose_neigh->queue);
From: Pablo Neira Ayuso pablo@netfilter.org
commit 9827a0e6e23bf43003cd3d5b7fb11baf59a35e1e upstream.
New elements that reside in the clone are not released in case that the transaction is aborted.
[16302.231754] ------------[ cut here ]------------ [16302.231756] WARNING: CPU: 0 PID: 100509 at net/netfilter/nf_tables_api.c:1864 nf_tables_chain_destroy+0x26/0x127 [nf_tables] [...] [16302.231882] CPU: 0 PID: 100509 Comm: nft Tainted: G W 5.19.0-rc3+ #155 [...] [16302.231887] RIP: 0010:nf_tables_chain_destroy+0x26/0x127 [nf_tables] [16302.231899] Code: f3 fe ff ff 41 55 41 54 55 53 48 8b 6f 10 48 89 fb 48 c7 c7 82 96 d9 a0 8b 55 50 48 8b 75 58 e8 de f5 92 e0 83 7d 50 00 74 09 <0f> 0b 5b 5d 41 5c 41 5d c3 4c 8b 65 00 48 8b 7d 08 49 39 fc 74 05 [...] [16302.231917] Call Trace: [16302.231919] <TASK> [16302.231921] __nf_tables_abort.cold+0x23/0x28 [nf_tables] [16302.231934] nf_tables_abort+0x30/0x50 [nf_tables] [16302.231946] nfnetlink_rcv_batch+0x41a/0x840 [nfnetlink] [16302.231952] ? __nla_validate_parse+0x48/0x190 [16302.231959] nfnetlink_rcv+0x110/0x129 [nfnetlink] [16302.231963] netlink_unicast+0x211/0x340 [16302.231969] netlink_sendmsg+0x21e/0x460
Add nft_set_pipapo_match_destroy() helper function to release the elements in the lookup tables.
Stefano Brivio says: "We additionally look for elements pointers in the cloned matching data if priv->dirty is set, because that means that cloned data might point to additional elements we did not commit to the working copy yet (such as the abort path case, but perhaps not limited to it)."
Fixes: 3c4287f62044 ("nf_tables: Add set type for arbitrary concatenation of ranges") Reviewed-by: Stefano Brivio sbrivio@redhat.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/netfilter/nft_set_pipapo.c | 48 ++++++++++++++++++++++++++++------------- 1 file changed, 33 insertions(+), 15 deletions(-)
--- a/net/netfilter/nft_set_pipapo.c +++ b/net/netfilter/nft_set_pipapo.c @@ -2125,6 +2125,32 @@ out_scratch: }
/** + * nft_set_pipapo_match_destroy() - Destroy elements from key mapping array + * @set: nftables API set representation + * @m: matching data pointing to key mapping array + */ +static void nft_set_pipapo_match_destroy(const struct nft_set *set, + struct nft_pipapo_match *m) +{ + struct nft_pipapo_field *f; + int i, r; + + for (i = 0, f = m->f; i < m->field_count - 1; i++, f++) + ; + + for (r = 0; r < f->rules; r++) { + struct nft_pipapo_elem *e; + + if (r < f->rules - 1 && f->mt[r + 1].e == f->mt[r].e) + continue; + + e = f->mt[r].e; + + nft_set_elem_destroy(set, e, true); + } +} + +/** * nft_pipapo_destroy() - Free private data for set and all committed elements * @set: nftables API set representation */ @@ -2132,26 +2158,13 @@ static void nft_pipapo_destroy(const str { struct nft_pipapo *priv = nft_set_priv(set); struct nft_pipapo_match *m; - struct nft_pipapo_field *f; - int i, r, cpu; + int cpu;
m = rcu_dereference_protected(priv->match, true); if (m) { rcu_barrier();
- for (i = 0, f = m->f; i < m->field_count - 1; i++, f++) - ; - - for (r = 0; r < f->rules; r++) { - struct nft_pipapo_elem *e; - - if (r < f->rules - 1 && f->mt[r + 1].e == f->mt[r].e) - continue; - - e = f->mt[r].e; - - nft_set_elem_destroy(set, e, true); - } + nft_set_pipapo_match_destroy(set, m);
#ifdef NFT_PIPAPO_ALIGN free_percpu(m->scratch_aligned); @@ -2165,6 +2178,11 @@ static void nft_pipapo_destroy(const str }
if (priv->clone) { + m = priv->clone; + + if (priv->dirty) + nft_set_pipapo_match_destroy(set, m); + #ifdef NFT_PIPAPO_ALIGN free_percpu(priv->clone->scratch_aligned); #endif
From: Pablo Neira Ayuso pablo@netfilter.org
commit 7e6bc1f6cabcd30aba0b11219d8e01b952eacbb6 upstream.
Make sure element data type and length do not mismatch the one specified by the set declaration.
Fixes: 7d7402642eaf ("netfilter: nf_tables: variable sized set element keys / data") Reported-by: Hugues ANGUELKOV hanguelkov@randorisec.fr Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/netfilter/nf_tables_api.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
--- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -5118,13 +5118,20 @@ static int nft_setelem_parse_data(struct struct nft_data *data, struct nlattr *attr) { + u32 dtype; int err;
err = nft_data_init(ctx, data, NFT_DATA_VALUE_MAXLEN, desc, attr); if (err < 0) return err;
- if (desc->type != NFT_DATA_VERDICT && desc->len != set->dlen) { + if (set->dtype == NFT_DATA_VERDICT) + dtype = NFT_DATA_VERDICT; + else + dtype = NFT_DATA_VALUE; + + if (dtype != desc->type || + set->dlen != desc->len) { nft_data_release(data, desc->type); return -EINVAL; }
From: Nikolay Borisov nborisov@suse.com
[ Upstream commit f6f39f7a0add4e7fd120a709545b57586a1d0393 ]
The user facing function used to allocate new chunks is btrfs_chunk_alloc, unfortunately there is yet another similar sounding function - btrfs_alloc_chunk. This creates confusion, especially since the latter function can be considered "private" in the sense that it implements the first stage of chunk creation and as such is called by btrfs_chunk_alloc.
To avoid the awkwardness that comes with having similarly named but distinctly different in their purpose function rename btrfs_alloc_chunk to btrfs_create_chunk, given that the main purpose of this function is to orchestrate the whole process of allocating a chunk - reserving space into devices, deciding on characteristics of the stripe size and creating the in-memory structures.
Reviewed-by: Filipe Manana fdmanana@suse.com Reviewed-by: Anand Jain anand.jain@oracle.com Signed-off-by: Nikolay Borisov nborisov@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/block-group.c | 6 +++--- fs/btrfs/volumes.c | 10 +++++----- fs/btrfs/volumes.h | 2 +- fs/btrfs/zoned.c | 2 +- 4 files changed, 10 insertions(+), 10 deletions(-)
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c index 37418b183b07..aadc1203ad88 100644 --- a/fs/btrfs/block-group.c +++ b/fs/btrfs/block-group.c @@ -3400,7 +3400,7 @@ static int do_chunk_alloc(struct btrfs_trans_handle *trans, u64 flags) */ check_system_chunk(trans, flags);
- bg = btrfs_alloc_chunk(trans, flags); + bg = btrfs_create_chunk(trans, flags); if (IS_ERR(bg)) { ret = PTR_ERR(bg); goto out; @@ -3461,7 +3461,7 @@ static int do_chunk_alloc(struct btrfs_trans_handle *trans, u64 flags) const u64 sys_flags = btrfs_system_alloc_profile(trans->fs_info); struct btrfs_block_group *sys_bg;
- sys_bg = btrfs_alloc_chunk(trans, sys_flags); + sys_bg = btrfs_create_chunk(trans, sys_flags); if (IS_ERR(sys_bg)) { ret = PTR_ERR(sys_bg); btrfs_abort_transaction(trans, ret); @@ -3754,7 +3754,7 @@ void check_system_chunk(struct btrfs_trans_handle *trans, u64 type) * could deadlock on an extent buffer since our caller may be * COWing an extent buffer from the chunk btree. */ - bg = btrfs_alloc_chunk(trans, flags); + bg = btrfs_create_chunk(trans, flags); if (IS_ERR(bg)) { ret = PTR_ERR(bg); } else if (!(type & BTRFS_BLOCK_GROUP_SYSTEM)) { diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 378e03a93e10..b75ce79a2540 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -3131,7 +3131,7 @@ int btrfs_remove_chunk(struct btrfs_trans_handle *trans, u64 chunk_offset) const u64 sys_flags = btrfs_system_alloc_profile(fs_info); struct btrfs_block_group *sys_bg;
- sys_bg = btrfs_alloc_chunk(trans, sys_flags); + sys_bg = btrfs_create_chunk(trans, sys_flags); if (IS_ERR(sys_bg)) { ret = PTR_ERR(sys_bg); btrfs_abort_transaction(trans, ret); @@ -5010,7 +5010,7 @@ static void check_raid1c34_incompat_flag(struct btrfs_fs_info *info, u64 type) }
/* - * Structure used internally for __btrfs_alloc_chunk() function. + * Structure used internally for btrfs_create_chunk() function. * Wraps needed parameters. */ struct alloc_chunk_ctl { @@ -5414,7 +5414,7 @@ static struct btrfs_block_group *create_chunk(struct btrfs_trans_handle *trans, return block_group; }
-struct btrfs_block_group *btrfs_alloc_chunk(struct btrfs_trans_handle *trans, +struct btrfs_block_group *btrfs_create_chunk(struct btrfs_trans_handle *trans, u64 type) { struct btrfs_fs_info *info = trans->fs_info; @@ -5615,12 +5615,12 @@ static noinline int init_first_rw_device(struct btrfs_trans_handle *trans) */
alloc_profile = btrfs_metadata_alloc_profile(fs_info); - meta_bg = btrfs_alloc_chunk(trans, alloc_profile); + meta_bg = btrfs_create_chunk(trans, alloc_profile); if (IS_ERR(meta_bg)) return PTR_ERR(meta_bg);
alloc_profile = btrfs_system_alloc_profile(fs_info); - sys_bg = btrfs_alloc_chunk(trans, alloc_profile); + sys_bg = btrfs_create_chunk(trans, alloc_profile); if (IS_ERR(sys_bg)) return PTR_ERR(sys_bg);
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h index 4db10d071d67..d86a6f9f166c 100644 --- a/fs/btrfs/volumes.h +++ b/fs/btrfs/volumes.h @@ -454,7 +454,7 @@ int btrfs_get_io_geometry(struct btrfs_fs_info *fs_info, struct extent_map *map, struct btrfs_io_geometry *io_geom); int btrfs_read_sys_array(struct btrfs_fs_info *fs_info); int btrfs_read_chunk_tree(struct btrfs_fs_info *fs_info); -struct btrfs_block_group *btrfs_alloc_chunk(struct btrfs_trans_handle *trans, +struct btrfs_block_group *btrfs_create_chunk(struct btrfs_trans_handle *trans, u64 type); void btrfs_mapping_tree_free(struct extent_map_tree *tree); blk_status_t btrfs_map_bio(struct btrfs_fs_info *fs_info, struct bio *bio, diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c index 596b2148807d..3bc2f92cd197 100644 --- a/fs/btrfs/zoned.c +++ b/fs/btrfs/zoned.c @@ -636,7 +636,7 @@ int btrfs_check_zoned_mode(struct btrfs_fs_info *fs_info)
/* * stripe_size is always aligned to BTRFS_STRIPE_LEN in - * __btrfs_alloc_chunk(). Since we want stripe_len == zone_size, + * btrfs_create_chunk(). Since we want stripe_len == zone_size, * check the alignment here. */ if (!IS_ALIGNED(zone_size, BTRFS_STRIPE_LEN)) {
From: Nikolay Borisov nborisov@suse.com
[ Upstream commit f42c5da6c12e990d8ec415199600b4d593c63bf5 ]
In order to make 'real_root' used only in ref-verify it's required to have the necessary context to perform the same checks that this member is used for. So add 'mod_root' which will contain the root on behalf of which a delayed ref was created and a 'skip_group' parameter which will contain callsite-specific override of skip_qgroup.
Signed-off-by: Nikolay Borisov nborisov@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/delayed-ref.h | 5 +++-- fs/btrfs/extent-tree.c | 17 +++++++++++------ fs/btrfs/file.c | 13 ++++++++----- fs/btrfs/inode.c | 3 ++- fs/btrfs/relocation.c | 21 ++++++++++++++------- fs/btrfs/tree-log.c | 2 +- 6 files changed, 39 insertions(+), 22 deletions(-)
diff --git a/fs/btrfs/delayed-ref.h b/fs/btrfs/delayed-ref.h index e22fba272e4f..31266ba1d430 100644 --- a/fs/btrfs/delayed-ref.h +++ b/fs/btrfs/delayed-ref.h @@ -271,7 +271,7 @@ static inline void btrfs_init_generic_ref(struct btrfs_ref *generic_ref, }
static inline void btrfs_init_tree_ref(struct btrfs_ref *generic_ref, - int level, u64 root) + int level, u64 root, u64 mod_root, bool skip_qgroup) { /* If @real_root not set, use @root as fallback */ if (!generic_ref->real_root) @@ -282,7 +282,8 @@ static inline void btrfs_init_tree_ref(struct btrfs_ref *generic_ref, }
static inline void btrfs_init_data_ref(struct btrfs_ref *generic_ref, - u64 ref_root, u64 ino, u64 offset) + u64 ref_root, u64 ino, u64 offset, u64 mod_root, + bool skip_qgroup) { /* If @real_root not set, use @root as fallback */ if (!generic_ref->real_root) diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 514adc83577f..e01b9344fb9c 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -2440,7 +2440,8 @@ static int __btrfs_mod_ref(struct btrfs_trans_handle *trans, num_bytes, parent); generic_ref.real_root = root->root_key.objectid; btrfs_init_data_ref(&generic_ref, ref_root, key.objectid, - key.offset); + key.offset, root->root_key.objectid, + for_reloc); generic_ref.skip_qgroup = for_reloc; if (inc) ret = btrfs_inc_extent_ref(trans, &generic_ref); @@ -2454,7 +2455,8 @@ static int __btrfs_mod_ref(struct btrfs_trans_handle *trans, btrfs_init_generic_ref(&generic_ref, action, bytenr, num_bytes, parent); generic_ref.real_root = root->root_key.objectid; - btrfs_init_tree_ref(&generic_ref, level - 1, ref_root); + btrfs_init_tree_ref(&generic_ref, level - 1, ref_root, + root->root_key.objectid, for_reloc); generic_ref.skip_qgroup = for_reloc; if (inc) ret = btrfs_inc_extent_ref(trans, &generic_ref); @@ -3289,7 +3291,7 @@ void btrfs_free_tree_block(struct btrfs_trans_handle *trans, btrfs_init_generic_ref(&generic_ref, BTRFS_DROP_DELAYED_REF, buf->start, buf->len, parent); btrfs_init_tree_ref(&generic_ref, btrfs_header_level(buf), - root->root_key.objectid); + root->root_key.objectid, 0, false);
if (root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID) { btrfs_ref_tree_mod(fs_info, &generic_ref); @@ -4705,7 +4707,8 @@ int btrfs_alloc_reserved_file_extent(struct btrfs_trans_handle *trans,
btrfs_init_generic_ref(&generic_ref, BTRFS_ADD_DELAYED_EXTENT, ins->objectid, ins->offset, 0); - btrfs_init_data_ref(&generic_ref, root->root_key.objectid, owner, offset); + btrfs_init_data_ref(&generic_ref, root->root_key.objectid, owner, + offset, 0, false); btrfs_ref_tree_mod(root->fs_info, &generic_ref);
return btrfs_add_delayed_data_ref(trans, &generic_ref, ram_bytes); @@ -4898,7 +4901,8 @@ struct extent_buffer *btrfs_alloc_tree_block(struct btrfs_trans_handle *trans, btrfs_init_generic_ref(&generic_ref, BTRFS_ADD_DELAYED_EXTENT, ins.objectid, ins.offset, parent); generic_ref.real_root = root->root_key.objectid; - btrfs_init_tree_ref(&generic_ref, level, root_objectid); + btrfs_init_tree_ref(&generic_ref, level, root_objectid, + root->root_key.objectid, false); btrfs_ref_tree_mod(fs_info, &generic_ref); ret = btrfs_add_delayed_tree_ref(trans, &generic_ref, extent_op); if (ret) @@ -5315,7 +5319,8 @@ static noinline int do_walk_down(struct btrfs_trans_handle *trans,
btrfs_init_generic_ref(&ref, BTRFS_DROP_DELAYED_REF, bytenr, fs_info->nodesize, parent); - btrfs_init_tree_ref(&ref, level - 1, root->root_key.objectid); + btrfs_init_tree_ref(&ref, level - 1, root->root_key.objectid, + 0, false); ret = btrfs_free_extent(trans, &ref); if (ret) goto out_unlock; diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index a06c8366a8f4..1c597cd6c024 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -869,7 +869,8 @@ int btrfs_drop_extents(struct btrfs_trans_handle *trans, btrfs_init_data_ref(&ref, root->root_key.objectid, new_key.objectid, - args->start - extent_offset); + args->start - extent_offset, + 0, false); ret = btrfs_inc_extent_ref(trans, &ref); BUG_ON(ret); /* -ENOMEM */ } @@ -955,7 +956,8 @@ int btrfs_drop_extents(struct btrfs_trans_handle *trans, btrfs_init_data_ref(&ref, root->root_key.objectid, key.objectid, - key.offset - extent_offset); + key.offset - extent_offset, 0, + false); ret = btrfs_free_extent(trans, &ref); BUG_ON(ret); /* -ENOMEM */ args->bytes_found += extent_end - key.offset; @@ -1232,7 +1234,7 @@ int btrfs_mark_extent_written(struct btrfs_trans_handle *trans, btrfs_init_generic_ref(&ref, BTRFS_ADD_DELAYED_REF, bytenr, num_bytes, 0); btrfs_init_data_ref(&ref, root->root_key.objectid, ino, - orig_offset); + orig_offset, 0, false); ret = btrfs_inc_extent_ref(trans, &ref); if (ret) { btrfs_abort_transaction(trans, ret); @@ -1257,7 +1259,8 @@ int btrfs_mark_extent_written(struct btrfs_trans_handle *trans, other_end = 0; btrfs_init_generic_ref(&ref, BTRFS_DROP_DELAYED_REF, bytenr, num_bytes, 0); - btrfs_init_data_ref(&ref, root->root_key.objectid, ino, orig_offset); + btrfs_init_data_ref(&ref, root->root_key.objectid, ino, orig_offset, + 0, false); if (extent_mergeable(leaf, path->slots[0] + 1, ino, bytenr, orig_offset, &other_start, &other_end)) { @@ -2715,7 +2718,7 @@ static int btrfs_insert_replace_extent(struct btrfs_trans_handle *trans, extent_info->disk_len, 0); ref_offset = extent_info->file_offset - extent_info->data_offset; btrfs_init_data_ref(&ref, root->root_key.objectid, - btrfs_ino(inode), ref_offset); + btrfs_ino(inode), ref_offset, 0, false); ret = btrfs_inc_extent_ref(trans, &ref); }
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 044d584c3467..d644dcaf3004 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -4919,7 +4919,8 @@ int btrfs_truncate_inode_items(struct btrfs_trans_handle *trans, extent_start, extent_num_bytes, 0); ref.real_root = root->root_key.objectid; btrfs_init_data_ref(&ref, btrfs_header_owner(leaf), - ino, extent_offset); + ino, extent_offset, + root->root_key.objectid, false); ret = btrfs_free_extent(trans, &ref); if (ret) { btrfs_abort_transaction(trans, ret); diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index a6661f2ad2c0..0300770c0a89 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -1147,7 +1147,8 @@ int replace_file_extents(struct btrfs_trans_handle *trans, num_bytes, parent); ref.real_root = root->root_key.objectid; btrfs_init_data_ref(&ref, btrfs_header_owner(leaf), - key.objectid, key.offset); + key.objectid, key.offset, + root->root_key.objectid, false); ret = btrfs_inc_extent_ref(trans, &ref); if (ret) { btrfs_abort_transaction(trans, ret); @@ -1158,7 +1159,8 @@ int replace_file_extents(struct btrfs_trans_handle *trans, num_bytes, parent); ref.real_root = root->root_key.objectid; btrfs_init_data_ref(&ref, btrfs_header_owner(leaf), - key.objectid, key.offset); + key.objectid, key.offset, + root->root_key.objectid, false); ret = btrfs_free_extent(trans, &ref); if (ret) { btrfs_abort_transaction(trans, ret); @@ -1368,7 +1370,8 @@ int replace_path(struct btrfs_trans_handle *trans, struct reloc_control *rc, btrfs_init_generic_ref(&ref, BTRFS_ADD_DELAYED_REF, old_bytenr, blocksize, path->nodes[level]->start); ref.skip_qgroup = true; - btrfs_init_tree_ref(&ref, level - 1, src->root_key.objectid); + btrfs_init_tree_ref(&ref, level - 1, src->root_key.objectid, + 0, true); ret = btrfs_inc_extent_ref(trans, &ref); if (ret) { btrfs_abort_transaction(trans, ret); @@ -1377,7 +1380,8 @@ int replace_path(struct btrfs_trans_handle *trans, struct reloc_control *rc, btrfs_init_generic_ref(&ref, BTRFS_ADD_DELAYED_REF, new_bytenr, blocksize, 0); ref.skip_qgroup = true; - btrfs_init_tree_ref(&ref, level - 1, dest->root_key.objectid); + btrfs_init_tree_ref(&ref, level - 1, dest->root_key.objectid, 0, + true); ret = btrfs_inc_extent_ref(trans, &ref); if (ret) { btrfs_abort_transaction(trans, ret); @@ -1386,7 +1390,8 @@ int replace_path(struct btrfs_trans_handle *trans, struct reloc_control *rc,
btrfs_init_generic_ref(&ref, BTRFS_DROP_DELAYED_REF, new_bytenr, blocksize, path->nodes[level]->start); - btrfs_init_tree_ref(&ref, level - 1, src->root_key.objectid); + btrfs_init_tree_ref(&ref, level - 1, src->root_key.objectid, + 0, true); ref.skip_qgroup = true; ret = btrfs_free_extent(trans, &ref); if (ret) { @@ -1396,7 +1401,8 @@ int replace_path(struct btrfs_trans_handle *trans, struct reloc_control *rc,
btrfs_init_generic_ref(&ref, BTRFS_DROP_DELAYED_REF, old_bytenr, blocksize, 0); - btrfs_init_tree_ref(&ref, level - 1, dest->root_key.objectid); + btrfs_init_tree_ref(&ref, level - 1, dest->root_key.objectid, + 0, true); ref.skip_qgroup = true; ret = btrfs_free_extent(trans, &ref); if (ret) { @@ -2475,7 +2481,8 @@ static int do_relocation(struct btrfs_trans_handle *trans, upper->eb->start); ref.real_root = root->root_key.objectid; btrfs_init_tree_ref(&ref, node->level, - btrfs_header_owner(upper->eb)); + btrfs_header_owner(upper->eb), + root->root_key.objectid, false); ret = btrfs_inc_extent_ref(trans, &ref); if (!ret) ret = btrfs_drop_subtree(trans, root, eb, diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c index 1221d8483d63..bed6811476b0 100644 --- a/fs/btrfs/tree-log.c +++ b/fs/btrfs/tree-log.c @@ -761,7 +761,7 @@ static noinline int replay_one_extent(struct btrfs_trans_handle *trans, ins.objectid, ins.offset, 0); btrfs_init_data_ref(&ref, root->root_key.objectid, - key->objectid, offset); + key->objectid, offset, 0, false); ret = btrfs_inc_extent_ref(trans, &ref); if (ret) goto out;
From: Filipe Manana fdmanana@suse.com
[ Upstream commit 7a1636089acfee7562fe79aff7d1b4c57869896d ]
When creating a subvolume, at ioctl.c:create_subvol(), if we fail to insert the new root's root item into the root tree, we are freeing the metadata extent we reserved for the new root to prevent a metadata extent leak, as we don't abort the transaction at that point (since there is nothing at that point that is irreversible).
However we allocated the metadata extent for the new root which we are creating for the new subvolume, so its delayed reference refers to the ID of this new root. But when we free the metadata extent we pass the root of the subvolume where the new subvolume is located to btrfs_free_tree_block() - this is incorrect because this will generate a delayed reference that refers to the ID of the parent subvolume's root, and not to ID of the new root.
This results in a failure when running delayed references that leads to a transaction abort and a trace like the following:
[3868.738042] RIP: 0010:__btrfs_free_extent+0x709/0x950 [btrfs] [3868.739857] Code: 68 0f 85 e6 fb ff (...) [3868.742963] RSP: 0018:ffffb0e9045cf910 EFLAGS: 00010246 [3868.743908] RAX: 00000000fffffffe RBX: 00000000fffffffe RCX: 0000000000000002 [3868.745312] RDX: 00000000fffffffe RSI: 0000000000000002 RDI: ffff90b0cd793b88 [3868.746643] RBP: 000000000e5d8000 R08: 0000000000000000 R09: ffff90b0cd793b88 [3868.747979] R10: 0000000000000002 R11: 00014ded97944d68 R12: 0000000000000000 [3868.749373] R13: ffff90b09afe4a28 R14: 0000000000000000 R15: ffff90b0cd793b88 [3868.750725] FS: 00007f281c4a8b80(0000) GS:ffff90b3ada00000(0000) knlGS:0000000000000000 [3868.752275] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [3868.753515] CR2: 00007f281c6a5000 CR3: 0000000108a42006 CR4: 0000000000370ee0 [3868.754869] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [3868.756228] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [3868.757803] Call Trace: [3868.758281] <TASK> [3868.758655] ? btrfs_merge_delayed_refs+0x178/0x1c0 [btrfs] [3868.759827] __btrfs_run_delayed_refs+0x2b1/0x1250 [btrfs] [3868.761047] btrfs_run_delayed_refs+0x86/0x210 [btrfs] [3868.762069] ? lock_acquired+0x19f/0x420 [3868.762829] btrfs_commit_transaction+0x69/0xb20 [btrfs] [3868.763860] ? _raw_spin_unlock+0x29/0x40 [3868.764614] ? btrfs_block_rsv_release+0x1c2/0x1e0 [btrfs] [3868.765870] create_subvol+0x1d8/0x9a0 [btrfs] [3868.766766] btrfs_mksubvol+0x447/0x4c0 [btrfs] [3868.767669] ? preempt_count_add+0x49/0xa0 [3868.768444] __btrfs_ioctl_snap_create+0x123/0x190 [btrfs] [3868.769639] ? _copy_from_user+0x66/0xa0 [3868.770391] btrfs_ioctl_snap_create_v2+0xbb/0x140 [btrfs] [3868.771495] btrfs_ioctl+0xd1e/0x35c0 [btrfs] [3868.772364] ? __slab_free+0x10a/0x360 [3868.773198] ? rcu_read_lock_sched_held+0x12/0x60 [3868.774121] ? lock_release+0x223/0x4a0 [3868.774863] ? lock_acquired+0x19f/0x420 [3868.775634] ? rcu_read_lock_sched_held+0x12/0x60 [3868.776530] ? trace_hardirqs_on+0x1b/0xe0 [3868.777373] ? _raw_spin_unlock_irqrestore+0x3e/0x60 [3868.778280] ? kmem_cache_free+0x321/0x3c0 [3868.779011] ? __x64_sys_ioctl+0x83/0xb0 [3868.779718] __x64_sys_ioctl+0x83/0xb0 [3868.780387] do_syscall_64+0x3b/0xc0 [3868.781059] entry_SYSCALL_64_after_hwframe+0x44/0xae [3868.781953] RIP: 0033:0x7f281c59e957 [3868.782585] Code: 3c 1c 48 f7 d8 4c (...) [3868.785867] RSP: 002b:00007ffe1f83e2b8 EFLAGS: 00000202 ORIG_RAX: 0000000000000010 [3868.787198] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f281c59e957 [3868.788450] RDX: 00007ffe1f83e2c0 RSI: 0000000050009418 RDI: 0000000000000003 [3868.789748] RBP: 00007ffe1f83f300 R08: 0000000000000000 R09: 00007ffe1f83fe36 [3868.791214] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000003 [3868.792468] R13: 0000000000000003 R14: 00007ffe1f83e2c0 R15: 00000000000003cc [3868.793765] </TASK> [3868.794037] irq event stamp: 0 [3868.794548] hardirqs last enabled at (0): [<0000000000000000>] 0x0 [3868.795670] hardirqs last disabled at (0): [<ffffffff98294214>] copy_process+0x934/0x2040 [3868.797086] softirqs last enabled at (0): [<ffffffff98294214>] copy_process+0x934/0x2040 [3868.798309] softirqs last disabled at (0): [<0000000000000000>] 0x0 [3868.799284] ---[ end trace be24c7002fe27747 ]--- [3868.799928] BTRFS info (device dm-0): leaf 241188864 gen 1268 total ptrs 214 free space 469 owner 2 [3868.801133] BTRFS info (device dm-0): refs 2 lock_owner 225627 current 225627 [3868.802056] item 0 key (237436928 169 0) itemoff 16250 itemsize 33 [3868.802863] extent refs 1 gen 1265 flags 2 [3868.803447] ref#0: tree block backref root 1610 (...) [3869.064354] item 114 key (241008640 169 0) itemoff 12488 itemsize 33 [3869.065421] extent refs 1 gen 1268 flags 2 [3869.066115] ref#0: tree block backref root 1689 (...) [3869.403834] BTRFS error (device dm-0): unable to find ref byte nr 241008640 parent 0 root 1622 owner 0 offset 0 [3869.405641] BTRFS: error (device dm-0) in __btrfs_free_extent:3076: errno=-2 No such entry [3869.407138] BTRFS: error (device dm-0) in btrfs_run_delayed_refs:2159: errno=-2 No such entry
Fix this by passing the new subvolume's root ID to btrfs_free_tree_block(). This requires changing the root argument of btrfs_free_tree_block() from struct btrfs_root * to a u64, since at this point during the subvolume creation we have not yet created the struct btrfs_root for the new subvolume, and btrfs_free_tree_block() only needs a root ID and nothing else from a struct btrfs_root.
This was triggered by test case generic/475 from fstests.
Fixes: 67addf29004c5b ("btrfs: fix metadata extent leak after failure to create subvolume") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Nikolay Borisov nborisov@suse.com Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/ctree.c | 17 +++++++++-------- fs/btrfs/ctree.h | 7 ++++++- fs/btrfs/extent-tree.c | 13 +++++++------ fs/btrfs/free-space-tree.c | 4 ++-- fs/btrfs/ioctl.c | 9 +++++---- fs/btrfs/qgroup.c | 3 ++- 6 files changed, 31 insertions(+), 22 deletions(-)
diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index 899f85445925..341ce90d24b1 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -462,8 +462,8 @@ static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans, BUG_ON(ret < 0); rcu_assign_pointer(root->node, cow);
- btrfs_free_tree_block(trans, root, buf, parent_start, - last_ref); + btrfs_free_tree_block(trans, btrfs_root_id(root), buf, + parent_start, last_ref); free_extent_buffer(buf); add_root_to_dirty_list(root); } else { @@ -484,8 +484,8 @@ static noinline int __btrfs_cow_block(struct btrfs_trans_handle *trans, return ret; } } - btrfs_free_tree_block(trans, root, buf, parent_start, - last_ref); + btrfs_free_tree_block(trans, btrfs_root_id(root), buf, + parent_start, last_ref); } if (unlock_orig) btrfs_tree_unlock(buf); @@ -926,7 +926,7 @@ static noinline int balance_level(struct btrfs_trans_handle *trans, free_extent_buffer(mid);
root_sub_used(root, mid->len); - btrfs_free_tree_block(trans, root, mid, 0, 1); + btrfs_free_tree_block(trans, btrfs_root_id(root), mid, 0, 1); /* once for the root ptr */ free_extent_buffer_stale(mid); return 0; @@ -985,7 +985,8 @@ static noinline int balance_level(struct btrfs_trans_handle *trans, btrfs_tree_unlock(right); del_ptr(root, path, level + 1, pslot + 1); root_sub_used(root, right->len); - btrfs_free_tree_block(trans, root, right, 0, 1); + btrfs_free_tree_block(trans, btrfs_root_id(root), right, + 0, 1); free_extent_buffer_stale(right); right = NULL; } else { @@ -1030,7 +1031,7 @@ static noinline int balance_level(struct btrfs_trans_handle *trans, btrfs_tree_unlock(mid); del_ptr(root, path, level + 1, pslot); root_sub_used(root, mid->len); - btrfs_free_tree_block(trans, root, mid, 0, 1); + btrfs_free_tree_block(trans, btrfs_root_id(root), mid, 0, 1); free_extent_buffer_stale(mid); mid = NULL; } else { @@ -4059,7 +4060,7 @@ static noinline void btrfs_del_leaf(struct btrfs_trans_handle *trans, root_sub_used(root, leaf->len);
atomic_inc(&leaf->refs); - btrfs_free_tree_block(trans, root, leaf, 0, 1); + btrfs_free_tree_block(trans, btrfs_root_id(root), leaf, 0, 1); free_extent_buffer_stale(leaf); } /* diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h index 21c44846b002..cc72d8981c47 100644 --- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -2256,6 +2256,11 @@ static inline bool btrfs_root_dead(const struct btrfs_root *root) return (root->root_item.flags & cpu_to_le64(BTRFS_ROOT_SUBVOL_DEAD)) != 0; }
+static inline u64 btrfs_root_id(const struct btrfs_root *root) +{ + return root->root_key.objectid; +} + /* struct btrfs_root_backup */ BTRFS_SETGET_STACK_FUNCS(backup_tree_root, struct btrfs_root_backup, tree_root, 64); @@ -2718,7 +2723,7 @@ struct extent_buffer *btrfs_alloc_tree_block(struct btrfs_trans_handle *trans, u64 empty_size, enum btrfs_lock_nesting nest); void btrfs_free_tree_block(struct btrfs_trans_handle *trans, - struct btrfs_root *root, + u64 root_id, struct extent_buffer *buf, u64 parent, int last_ref); int btrfs_alloc_reserved_file_extent(struct btrfs_trans_handle *trans, diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index e01b9344fb9c..f11616f61dd6 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -3280,20 +3280,20 @@ static noinline int check_ref_cleanup(struct btrfs_trans_handle *trans, }
void btrfs_free_tree_block(struct btrfs_trans_handle *trans, - struct btrfs_root *root, + u64 root_id, struct extent_buffer *buf, u64 parent, int last_ref) { - struct btrfs_fs_info *fs_info = root->fs_info; + struct btrfs_fs_info *fs_info = trans->fs_info; struct btrfs_ref generic_ref = { 0 }; int ret;
btrfs_init_generic_ref(&generic_ref, BTRFS_DROP_DELAYED_REF, buf->start, buf->len, parent); btrfs_init_tree_ref(&generic_ref, btrfs_header_level(buf), - root->root_key.objectid, 0, false); + root_id, 0, false);
- if (root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID) { + if (root_id != BTRFS_TREE_LOG_OBJECTID) { btrfs_ref_tree_mod(fs_info, &generic_ref); ret = btrfs_add_delayed_tree_ref(trans, &generic_ref, NULL); BUG_ON(ret); /* -ENOMEM */ @@ -3303,7 +3303,7 @@ void btrfs_free_tree_block(struct btrfs_trans_handle *trans, struct btrfs_block_group *cache; bool must_pin = false;
- if (root->root_key.objectid != BTRFS_TREE_LOG_OBJECTID) { + if (root_id != BTRFS_TREE_LOG_OBJECTID) { ret = check_ref_cleanup(trans, buf->start); if (!ret) { btrfs_redirty_list_add(trans->transaction, buf); @@ -5441,7 +5441,8 @@ static noinline int walk_up_proc(struct btrfs_trans_handle *trans, goto owner_mismatch; }
- btrfs_free_tree_block(trans, root, eb, parent, wc->refs[level] == 1); + btrfs_free_tree_block(trans, btrfs_root_id(root), eb, parent, + wc->refs[level] == 1); out: wc->refs[level] = 0; wc->flags[level] = 0; diff --git a/fs/btrfs/free-space-tree.c b/fs/btrfs/free-space-tree.c index a33bca94d133..3abec44c6255 100644 --- a/fs/btrfs/free-space-tree.c +++ b/fs/btrfs/free-space-tree.c @@ -1256,8 +1256,8 @@ int btrfs_clear_free_space_tree(struct btrfs_fs_info *fs_info) btrfs_tree_lock(free_space_root->node); btrfs_clean_tree_block(free_space_root->node); btrfs_tree_unlock(free_space_root->node); - btrfs_free_tree_block(trans, free_space_root, free_space_root->node, - 0, 1); + btrfs_free_tree_block(trans, btrfs_root_id(free_space_root), + free_space_root->node, 0, 1);
btrfs_put_root(free_space_root);
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index bf53af8694f8..7272d9d3fa78 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -615,11 +615,12 @@ static noinline int create_subvol(struct user_namespace *mnt_userns, * Since we don't abort the transaction in this case, free the * tree block so that we don't leak space and leave the * filesystem in an inconsistent state (an extent item in the - * extent tree without backreferences). Also no need to have - * the tree block locked since it is not in any tree at this - * point, so no other task can find it and use it. + * extent tree with a backreference for a root that does not + * exists). Also no need to have the tree block locked since it + * is not in any tree at this point, so no other task can find + * it and use it. */ - btrfs_free_tree_block(trans, root, leaf, 0, 1); + btrfs_free_tree_block(trans, objectid, leaf, 0, 1); free_extent_buffer(leaf); goto fail; } diff --git a/fs/btrfs/qgroup.c b/fs/btrfs/qgroup.c index 2c803108ea94..4ca809fa80ea 100644 --- a/fs/btrfs/qgroup.c +++ b/fs/btrfs/qgroup.c @@ -1259,7 +1259,8 @@ int btrfs_quota_disable(struct btrfs_fs_info *fs_info) btrfs_tree_lock(quota_root->node); btrfs_clean_tree_block(quota_root->node); btrfs_tree_unlock(quota_root->node); - btrfs_free_tree_block(trans, quota_root, quota_root->node, 0, 1); + btrfs_free_tree_block(trans, btrfs_root_id(quota_root), + quota_root->node, 0, 1);
btrfs_put_root(quota_root);
From: Filipe Manana fdmanana@suse.com
[ Upstream commit 212a58fda9b9077e0efc20200a4feb76afacfd95 ]
When creating a subvolume, at ioctl.c:create_subvol(), if we fail to insert the root item for the new subvolume into the root tree, we can trigger the following warning:
[78961.741046] WARNING: CPU: 0 PID: 4079814 at fs/btrfs/extent-tree.c:3357 btrfs_free_tree_block+0x2af/0x310 [btrfs] [78961.743344] Modules linked in: [78961.749440] dm_snapshot dm_thin_pool (...) [78961.773648] CPU: 0 PID: 4079814 Comm: fsstress Not tainted 5.16.0-rc4-btrfs-next-108 #1 [78961.775198] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [78961.777266] RIP: 0010:btrfs_free_tree_block+0x2af/0x310 [btrfs] [78961.778398] Code: 17 00 48 85 (...) [78961.781067] RSP: 0018:ffffaa4001657b28 EFLAGS: 00010202 [78961.781877] RAX: 0000000000000213 RBX: ffff897f8a796910 RCX: 0000000000000000 [78961.782780] RDX: 0000000000000000 RSI: 0000000011004000 RDI: 00000000ffffffff [78961.783764] RBP: ffff8981f490e800 R08: 0000000000000001 R09: 0000000000000000 [78961.784740] R10: 0000000000000000 R11: 0000000000000001 R12: ffff897fc963fcc8 [78961.785665] R13: 0000000000000001 R14: ffff898063548000 R15: ffff898063548000 [78961.786620] FS: 00007f31283c6b80(0000) GS:ffff8982ace00000(0000) knlGS:0000000000000000 [78961.787717] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [78961.788598] CR2: 00007f31285c3000 CR3: 000000023fcc8003 CR4: 0000000000370ef0 [78961.789568] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [78961.790585] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [78961.791684] Call Trace: [78961.792082] <TASK> [78961.792359] create_subvol+0x5d1/0x9a0 [btrfs] [78961.793054] btrfs_mksubvol+0x447/0x4c0 [btrfs] [78961.794009] ? preempt_count_add+0x49/0xa0 [78961.794705] __btrfs_ioctl_snap_create+0x123/0x190 [btrfs] [78961.795712] ? _copy_from_user+0x66/0xa0 [78961.796382] btrfs_ioctl_snap_create_v2+0xbb/0x140 [btrfs] [78961.797392] btrfs_ioctl+0xd1e/0x35c0 [btrfs] [78961.798172] ? __slab_free+0x10a/0x360 [78961.798820] ? rcu_read_lock_sched_held+0x12/0x60 [78961.799664] ? lock_release+0x223/0x4a0 [78961.800321] ? lock_acquired+0x19f/0x420 [78961.800992] ? rcu_read_lock_sched_held+0x12/0x60 [78961.801796] ? trace_hardirqs_on+0x1b/0xe0 [78961.802495] ? _raw_spin_unlock_irqrestore+0x3e/0x60 [78961.803358] ? kmem_cache_free+0x321/0x3c0 [78961.804071] ? __x64_sys_ioctl+0x83/0xb0 [78961.804711] __x64_sys_ioctl+0x83/0xb0 [78961.805348] do_syscall_64+0x3b/0xc0 [78961.805969] entry_SYSCALL_64_after_hwframe+0x44/0xae [78961.806830] RIP: 0033:0x7f31284bc957 [78961.807517] Code: 3c 1c 48 f7 d8 (...)
This is because we are calling btrfs_free_tree_block() on an extent buffer that is dirty. Fix that by cleaning the extent buffer, with btrfs_clean_tree_block(), before freeing it.
This was triggered by test case generic/475 from fstests.
Fixes: 67addf29004c5b ("btrfs: fix metadata extent leak after failure to create subvolume") CC: stable@vger.kernel.org # 4.4+ Reviewed-by: Nikolay Borisov nborisov@suse.com Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/ioctl.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index 7272d9d3fa78..a37ab3e89a3b 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -616,10 +616,11 @@ static noinline int create_subvol(struct user_namespace *mnt_userns, * tree block so that we don't leak space and leave the * filesystem in an inconsistent state (an extent item in the * extent tree with a backreference for a root that does not - * exists). Also no need to have the tree block locked since it - * is not in any tree at this point, so no other task can find - * it and use it. + * exists). */ + btrfs_tree_lock(leaf); + btrfs_clean_tree_block(leaf); + btrfs_tree_unlock(leaf); btrfs_free_tree_block(trans, objectid, leaf, 0, 1); free_extent_buffer(leaf); goto fail;
From: Tang Bin tangbin@cmss.chinamobile.com
[ Upstream commit 58ae4004b9c4bb040958cf73986b687a5ea4d85d ]
The function cpcap_power_button_probe() does not perform sufficient error checking after executing platform_get_irq(), thus fix it.
Signed-off-by: Tang Bin tangbin@cmss.chinamobile.com Link: https://lore.kernel.org/r/20210802121740.8700-1-tangbin@cmss.chinamobile.com Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/input/misc/cpcap-pwrbutton.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/input/misc/cpcap-pwrbutton.c b/drivers/input/misc/cpcap-pwrbutton.c index 0abef63217e2..372cb44d0635 100644 --- a/drivers/input/misc/cpcap-pwrbutton.c +++ b/drivers/input/misc/cpcap-pwrbutton.c @@ -54,9 +54,13 @@ static irqreturn_t powerbutton_irq(int irq, void *_button) static int cpcap_power_button_probe(struct platform_device *pdev) { struct cpcap_power_button *button; - int irq = platform_get_irq(pdev, 0); + int irq; int err;
+ irq = platform_get_irq(pdev, 0); + if (irq < 0) + return irq; + button = devm_kmalloc(&pdev->dev, sizeof(*button), GFP_KERNEL); if (!button) return -ENOMEM;
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit 31ae0102a34ed863c7d32b10e768036324991679 ]
Change the type of the goodix_i2c_write() len parameter to from 'unsigned' to 'int' to avoid bare use of 'unsigned', changing it to 'int' makes goodix_i2c_write()' prototype consistent with goodix_i2c_read().
Reviewed-by: Bastien Nocera hadess@hadess.net Signed-off-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20210920150643.155872-2-hdegoede@redhat.com Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/input/touchscreen/goodix.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/input/touchscreen/goodix.c b/drivers/input/touchscreen/goodix.c index 5051a1766aac..1eb776abe562 100644 --- a/drivers/input/touchscreen/goodix.c +++ b/drivers/input/touchscreen/goodix.c @@ -246,7 +246,7 @@ static int goodix_i2c_read(struct i2c_client *client, * @len: length of the buffer to write */ static int goodix_i2c_write(struct i2c_client *client, u16 reg, const u8 *buf, - unsigned len) + int len) { u8 *addr_buf; struct i2c_msg msg;
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit a2233cb7b65a017067e2f2703375ecc930a0ab30 ]
Add a goodix.h header file, and move the register definitions, and struct declarations there and add prototypes for various helper functions.
This is a preparation patch for adding support for controllers without flash, which need to have their firmware uploaded and need some other special handling too.
Since MAINTAINERS needs updating because of this change anyways, also add myself as co-maintainer.
Reviewed-by: Bastien Nocera hadess@hadess.net Signed-off-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20210920150643.155872-3-hdegoede@redhat.com Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- MAINTAINERS | 3 +- drivers/input/touchscreen/goodix.c | 74 +++--------------------------- drivers/input/touchscreen/goodix.h | 73 +++++++++++++++++++++++++++++ 3 files changed, 81 insertions(+), 69 deletions(-) create mode 100644 drivers/input/touchscreen/goodix.h
diff --git a/MAINTAINERS b/MAINTAINERS index a60d7e0466af..edc32575828b 100644 --- a/MAINTAINERS +++ b/MAINTAINERS @@ -7952,9 +7952,10 @@ F: drivers/media/usb/go7007/
GOODIX TOUCHSCREEN M: Bastien Nocera hadess@hadess.net +M: Hans de Goede hdegoede@redhat.com L: linux-input@vger.kernel.org S: Maintained -F: drivers/input/touchscreen/goodix.c +F: drivers/input/touchscreen/goodix*
GOOGLE ETHERNET DRIVERS M: Jeroen de Borst jeroendb@google.com diff --git a/drivers/input/touchscreen/goodix.c b/drivers/input/touchscreen/goodix.c index 1eb776abe562..5ccdd6abd868 100644 --- a/drivers/input/touchscreen/goodix.c +++ b/drivers/input/touchscreen/goodix.c @@ -14,20 +14,15 @@ #include <linux/kernel.h> #include <linux/dmi.h> #include <linux/firmware.h> -#include <linux/gpio/consumer.h> -#include <linux/i2c.h> -#include <linux/input.h> -#include <linux/input/mt.h> -#include <linux/input/touchscreen.h> #include <linux/module.h> #include <linux/delay.h> #include <linux/irq.h> #include <linux/interrupt.h> -#include <linux/regulator/consumer.h> #include <linux/slab.h> #include <linux/acpi.h> #include <linux/of.h> #include <asm/unaligned.h> +#include "goodix.h"
#define GOODIX_GPIO_INT_NAME "irq" #define GOODIX_GPIO_RST_NAME "reset" @@ -38,22 +33,11 @@ #define GOODIX_CONTACT_SIZE 8 #define GOODIX_MAX_CONTACT_SIZE 9 #define GOODIX_MAX_CONTACTS 10 -#define GOODIX_MAX_KEYS 7
#define GOODIX_CONFIG_MIN_LENGTH 186 #define GOODIX_CONFIG_911_LENGTH 186 #define GOODIX_CONFIG_967_LENGTH 228 #define GOODIX_CONFIG_GT9X_LENGTH 240 -#define GOODIX_CONFIG_MAX_LENGTH 240 - -/* Register defines */ -#define GOODIX_REG_COMMAND 0x8040 -#define GOODIX_CMD_SCREEN_OFF 0x05 - -#define GOODIX_READ_COOR_ADDR 0x814E -#define GOODIX_GT1X_REG_CONFIG_DATA 0x8050 -#define GOODIX_GT9X_REG_CONFIG_DATA 0x8047 -#define GOODIX_REG_ID 0x8140
#define GOODIX_BUFFER_STATUS_READY BIT(7) #define GOODIX_HAVE_KEY BIT(4) @@ -68,55 +52,11 @@ #define ACPI_GPIO_SUPPORT #endif
-struct goodix_ts_data; - -enum goodix_irq_pin_access_method { - IRQ_PIN_ACCESS_NONE, - IRQ_PIN_ACCESS_GPIO, - IRQ_PIN_ACCESS_ACPI_GPIO, - IRQ_PIN_ACCESS_ACPI_METHOD, -}; - -struct goodix_chip_data { - u16 config_addr; - int config_len; - int (*check_config)(struct goodix_ts_data *ts, const u8 *cfg, int len); - void (*calc_config_checksum)(struct goodix_ts_data *ts); -}; - struct goodix_chip_id { const char *id; const struct goodix_chip_data *data; };
-#define GOODIX_ID_MAX_LEN 4 - -struct goodix_ts_data { - struct i2c_client *client; - struct input_dev *input_dev; - const struct goodix_chip_data *chip; - struct touchscreen_properties prop; - unsigned int max_touch_num; - unsigned int int_trigger_type; - struct regulator *avdd28; - struct regulator *vddio; - struct gpio_desc *gpiod_int; - struct gpio_desc *gpiod_rst; - int gpio_count; - int gpio_int_idx; - char id[GOODIX_ID_MAX_LEN + 1]; - u16 version; - const char *cfg_name; - bool reset_controller_at_probe; - bool load_cfg_from_disk; - struct completion firmware_loading_complete; - unsigned long irq_flags; - enum goodix_irq_pin_access_method irq_pin_access_method; - unsigned int contact_size; - u8 config[GOODIX_CONFIG_MAX_LENGTH]; - unsigned short keymap[GOODIX_MAX_KEYS]; -}; - static int goodix_check_cfg_8(struct goodix_ts_data *ts, const u8 *cfg, int len); static int goodix_check_cfg_16(struct goodix_ts_data *ts, @@ -216,8 +156,7 @@ static const struct dmi_system_id inverted_x_screen[] = { * @buf: raw write data buffer. * @len: length of the buffer to write */ -static int goodix_i2c_read(struct i2c_client *client, - u16 reg, u8 *buf, int len) +int goodix_i2c_read(struct i2c_client *client, u16 reg, u8 *buf, int len) { struct i2c_msg msgs[2]; __be16 wbuf = cpu_to_be16(reg); @@ -245,8 +184,7 @@ static int goodix_i2c_read(struct i2c_client *client, * @buf: raw data buffer to write. * @len: length of the buffer to write */ -static int goodix_i2c_write(struct i2c_client *client, u16 reg, const u8 *buf, - int len) +int goodix_i2c_write(struct i2c_client *client, u16 reg, const u8 *buf, int len) { u8 *addr_buf; struct i2c_msg msg; @@ -270,7 +208,7 @@ static int goodix_i2c_write(struct i2c_client *client, u16 reg, const u8 *buf, return ret < 0 ? ret : (ret != 1 ? -EIO : 0); }
-static int goodix_i2c_write_u8(struct i2c_client *client, u16 reg, u8 value) +int goodix_i2c_write_u8(struct i2c_client *client, u16 reg, u8 value) { return goodix_i2c_write(client, reg, &value, sizeof(value)); } @@ -554,7 +492,7 @@ static int goodix_check_cfg(struct goodix_ts_data *ts, const u8 *cfg, int len) * @cfg: config firmware to write to device * @len: config data length */ -static int goodix_send_cfg(struct goodix_ts_data *ts, const u8 *cfg, int len) +int goodix_send_cfg(struct goodix_ts_data *ts, const u8 *cfg, int len) { int error;
@@ -652,7 +590,7 @@ static int goodix_irq_direction_input(struct goodix_ts_data *ts) return -EINVAL; /* Never reached */ }
-static int goodix_int_sync(struct goodix_ts_data *ts) +int goodix_int_sync(struct goodix_ts_data *ts) { int error;
diff --git a/drivers/input/touchscreen/goodix.h b/drivers/input/touchscreen/goodix.h new file mode 100644 index 000000000000..cdaced4f2980 --- /dev/null +++ b/drivers/input/touchscreen/goodix.h @@ -0,0 +1,73 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +#ifndef __GOODIX_H__ +#define __GOODIX_H__ + +#include <linux/gpio/consumer.h> +#include <linux/i2c.h> +#include <linux/input.h> +#include <linux/input/mt.h> +#include <linux/input/touchscreen.h> +#include <linux/regulator/consumer.h> + +/* Register defines */ +#define GOODIX_REG_COMMAND 0x8040 +#define GOODIX_CMD_SCREEN_OFF 0x05 + +#define GOODIX_GT1X_REG_CONFIG_DATA 0x8050 +#define GOODIX_GT9X_REG_CONFIG_DATA 0x8047 +#define GOODIX_REG_ID 0x8140 +#define GOODIX_READ_COOR_ADDR 0x814E + +#define GOODIX_ID_MAX_LEN 4 +#define GOODIX_CONFIG_MAX_LENGTH 240 +#define GOODIX_MAX_KEYS 7 + +enum goodix_irq_pin_access_method { + IRQ_PIN_ACCESS_NONE, + IRQ_PIN_ACCESS_GPIO, + IRQ_PIN_ACCESS_ACPI_GPIO, + IRQ_PIN_ACCESS_ACPI_METHOD, +}; + +struct goodix_ts_data; + +struct goodix_chip_data { + u16 config_addr; + int config_len; + int (*check_config)(struct goodix_ts_data *ts, const u8 *cfg, int len); + void (*calc_config_checksum)(struct goodix_ts_data *ts); +}; + +struct goodix_ts_data { + struct i2c_client *client; + struct input_dev *input_dev; + const struct goodix_chip_data *chip; + struct touchscreen_properties prop; + unsigned int max_touch_num; + unsigned int int_trigger_type; + struct regulator *avdd28; + struct regulator *vddio; + struct gpio_desc *gpiod_int; + struct gpio_desc *gpiod_rst; + int gpio_count; + int gpio_int_idx; + char id[GOODIX_ID_MAX_LEN + 1]; + u16 version; + const char *cfg_name; + bool reset_controller_at_probe; + bool load_cfg_from_disk; + struct completion firmware_loading_complete; + unsigned long irq_flags; + enum goodix_irq_pin_access_method irq_pin_access_method; + unsigned int contact_size; + u8 config[GOODIX_CONFIG_MAX_LENGTH]; + unsigned short keymap[GOODIX_MAX_KEYS]; +}; + +int goodix_i2c_read(struct i2c_client *client, u16 reg, u8 *buf, int len); +int goodix_i2c_write(struct i2c_client *client, u16 reg, const u8 *buf, int len); +int goodix_i2c_write_u8(struct i2c_client *client, u16 reg, u8 value); +int goodix_send_cfg(struct goodix_ts_data *ts, const u8 *cfg, int len); +int goodix_int_sync(struct goodix_ts_data *ts); + +#endif
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit 209bda4741f68f102cf2f272227bfc938e387b51 ]
Refactor reset handling a bit, change the main reset handler into a new goodix_reset_no_int_sync() helper and add a goodix_reset() wrapper which calls goodix_int_sync() separately.
Also push the dev_err() call on reset failure into the goodix_reset_no_int_sync() and goodix_int_sync() functions, so that we don't need to have separate dev_err() calls in all their callers.
This is a preparation patch for adding support for controllers without flash, which need to have their firmware uploaded and need some other special handling too.
Reviewed-by: Bastien Nocera hadess@hadess.net Signed-off-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20210920150643.155872-4-hdegoede@redhat.com Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/input/touchscreen/goodix.c | 48 ++++++++++++++++++++---------- drivers/input/touchscreen/goodix.h | 1 + 2 files changed, 33 insertions(+), 16 deletions(-)
diff --git a/drivers/input/touchscreen/goodix.c b/drivers/input/touchscreen/goodix.c index 5ccdd6abd868..2ca903a8af21 100644 --- a/drivers/input/touchscreen/goodix.c +++ b/drivers/input/touchscreen/goodix.c @@ -596,56 +596,76 @@ int goodix_int_sync(struct goodix_ts_data *ts)
error = goodix_irq_direction_output(ts, 0); if (error) - return error; + goto error;
msleep(50); /* T5: 50ms */
error = goodix_irq_direction_input(ts); if (error) - return error; + goto error;
return 0; + +error: + dev_err(&ts->client->dev, "Controller irq sync failed.\n"); + return error; }
/** - * goodix_reset - Reset device during power on + * goodix_reset_no_int_sync - Reset device, leaving interrupt line in output mode * * @ts: goodix_ts_data pointer */ -static int goodix_reset(struct goodix_ts_data *ts) +int goodix_reset_no_int_sync(struct goodix_ts_data *ts) { int error;
/* begin select I2C slave addr */ error = gpiod_direction_output(ts->gpiod_rst, 0); if (error) - return error; + goto error;
msleep(20); /* T2: > 10ms */
/* HIGH: 0x28/0x29, LOW: 0xBA/0xBB */ error = goodix_irq_direction_output(ts, ts->client->addr == 0x14); if (error) - return error; + goto error;
usleep_range(100, 2000); /* T3: > 100us */
error = gpiod_direction_output(ts->gpiod_rst, 1); if (error) - return error; + goto error;
usleep_range(6000, 10000); /* T4: > 5ms */
/* end select I2C slave addr */ error = gpiod_direction_input(ts->gpiod_rst); if (error) - return error; + goto error;
- error = goodix_int_sync(ts); + return 0; + +error: + dev_err(&ts->client->dev, "Controller reset failed.\n"); + return error; +} + +/** + * goodix_reset - Reset device during power on + * + * @ts: goodix_ts_data pointer + */ +static int goodix_reset(struct goodix_ts_data *ts) +{ + int error; + + error = goodix_reset_no_int_sync(ts); if (error) return error;
- return 0; + return goodix_int_sync(ts); }
#ifdef ACPI_GPIO_SUPPORT @@ -1144,10 +1164,8 @@ static int goodix_ts_probe(struct i2c_client *client, if (ts->reset_controller_at_probe) { /* reset the controller */ error = goodix_reset(ts); - if (error) { - dev_err(&client->dev, "Controller reset failed.\n"); + if (error) return error; - } }
error = goodix_i2c_test(client); @@ -1289,10 +1307,8 @@ static int __maybe_unused goodix_resume(struct device *dev)
if (error != 0 || config_ver != ts->config[0]) { error = goodix_reset(ts); - if (error) { - dev_err(dev, "Controller reset failed.\n"); + if (error) return error; - }
error = goodix_send_cfg(ts, ts->config, ts->chip->config_len); if (error) diff --git a/drivers/input/touchscreen/goodix.h b/drivers/input/touchscreen/goodix.h index cdaced4f2980..0b88554ba2ae 100644 --- a/drivers/input/touchscreen/goodix.h +++ b/drivers/input/touchscreen/goodix.h @@ -69,5 +69,6 @@ int goodix_i2c_write(struct i2c_client *client, u16 reg, const u8 *buf, int len) int goodix_i2c_write_u8(struct i2c_client *client, u16 reg, u8 value); int goodix_send_cfg(struct goodix_ts_data *ts, const u8 *cfg, int len); int goodix_int_sync(struct goodix_ts_data *ts); +int goodix_reset_no_int_sync(struct goodix_ts_data *ts);
#endif
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit a2fd46cd3dbb83b373ba74f4043f8dae869c65f1 ]
Unless the controller is not responding at boot or after suspend/resume, the driver never resets the controller on x86/ACPI platforms. The driver still requesting the reset pin at probe() though in case it needs it.
Until now the driver has always requested the reset pin with GPIOD_IN as type. The idea being to put the pin in high-impedance mode to save power until the driver actually wants to issue a reset.
But this means that just requesting the pin can cause issues, since requesting it in another mode then GPIOD_ASIS may cause the pinctrl driver to touch the pin settings. We have already had issues before due to a bug in the pinctrl-cherryview.c driver which has been fixed in commit 921daeeca91b ("pinctrl: cherryview: Preserve CHV_PADCTRL1_INVRXTX_TXDATA flag on GPIOs").
And now it turns out that requesting the reset-pin as GPIOD_IN also stops the touchscreen from working on the GPD P2 max mini-laptop. The behavior of putting the pin in high-impedance mode relies on there being some external pull-up to keep it high and there seems to be no pull-up on the GPD P2 max, causing things to break.
This commit fixes this by requesting the reset pin as is when using the x86/ACPI code paths to lookup the GPIOs; and by not dropping it back into input-mode in case the driver does end up issuing a reset for error-recovery.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=209061 Fixes: a7d4b171660c ("Input: goodix - add support for getting IRQ + reset GPIOs on Cherry Trail devices") Cc: stable@vger.kernel.org Signed-off-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20211206091116.44466-2-hdegoede@redhat.com Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/input/touchscreen/goodix.c | 30 +++++++++++++++++++++++++----- drivers/input/touchscreen/goodix.h | 1 + 2 files changed, 26 insertions(+), 5 deletions(-)
diff --git a/drivers/input/touchscreen/goodix.c b/drivers/input/touchscreen/goodix.c index 2ca903a8af21..3667f7e51fde 100644 --- a/drivers/input/touchscreen/goodix.c +++ b/drivers/input/touchscreen/goodix.c @@ -640,10 +640,16 @@ int goodix_reset_no_int_sync(struct goodix_ts_data *ts)
usleep_range(6000, 10000); /* T4: > 5ms */
- /* end select I2C slave addr */ - error = gpiod_direction_input(ts->gpiod_rst); - if (error) - goto error; + /* + * Put the reset pin back in to input / high-impedance mode to save + * power. Only do this in the non ACPI case since some ACPI boards + * don't have a pull-up, so there the reset pin must stay active-high. + */ + if (ts->irq_pin_access_method == IRQ_PIN_ACCESS_GPIO) { + error = gpiod_direction_input(ts->gpiod_rst); + if (error) + goto error; + }
return 0;
@@ -777,6 +783,14 @@ static int goodix_add_acpi_gpio_mappings(struct goodix_ts_data *ts) return -EINVAL; }
+ /* + * Normally we put the reset pin in input / high-impedance mode to save + * power. But some x86/ACPI boards don't have a pull-up, so for the ACPI + * case, leave the pin as is. This results in the pin not being touched + * at all on x86/ACPI boards, except when needed for error-recover. + */ + ts->gpiod_rst_flags = GPIOD_ASIS; + return devm_acpi_dev_add_driver_gpios(dev, gpio_mapping); } #else @@ -802,6 +816,12 @@ static int goodix_get_gpio_config(struct goodix_ts_data *ts) return -EINVAL; dev = &ts->client->dev;
+ /* + * By default we request the reset pin as input, leaving it in + * high-impedance when not resetting the controller to save power. + */ + ts->gpiod_rst_flags = GPIOD_IN; + ts->avdd28 = devm_regulator_get(dev, "AVDD28"); if (IS_ERR(ts->avdd28)) { error = PTR_ERR(ts->avdd28); @@ -839,7 +859,7 @@ static int goodix_get_gpio_config(struct goodix_ts_data *ts) ts->gpiod_int = gpiod;
/* Get the reset line GPIO pin number */ - gpiod = devm_gpiod_get_optional(dev, GOODIX_GPIO_RST_NAME, GPIOD_IN); + gpiod = devm_gpiod_get_optional(dev, GOODIX_GPIO_RST_NAME, ts->gpiod_rst_flags); if (IS_ERR(gpiod)) { error = PTR_ERR(gpiod); if (error != -EPROBE_DEFER) diff --git a/drivers/input/touchscreen/goodix.h b/drivers/input/touchscreen/goodix.h index 0b88554ba2ae..1a1571ad2cd2 100644 --- a/drivers/input/touchscreen/goodix.h +++ b/drivers/input/touchscreen/goodix.h @@ -51,6 +51,7 @@ struct goodix_ts_data { struct gpio_desc *gpiod_rst; int gpio_count; int gpio_int_idx; + enum gpiod_flags gpiod_rst_flags; char id[GOODIX_ID_MAX_LEN + 1]; u16 version; const char *cfg_name;
From: Michel Dänzer mdaenzer@redhat.com
[ Upstream commit ff2d23843f7fb4f13055be5a4a9a20ddd04e6e9c ]
This makes sure we don't hit the
BUG_ON(dmabuf->cb_in.active || dmabuf->cb_out.active);
in dma_buf_release, which could be triggered by user space closing the dma-buf file description while there are outstanding fence callbacks from dma_buf_poll.
Cc: stable@vger.kernel.org Signed-off-by: Michel Dänzer mdaenzer@redhat.com Reviewed-by: Christian König christian.koenig@amd.com Link: https://patchwork.freedesktop.org/patch/msgid/20210723075857.4065-1-michel@d... Signed-off-by: Christian König christian.koenig@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/dma-buf/dma-buf.c | 19 +++++++++++++------ 1 file changed, 13 insertions(+), 6 deletions(-)
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index f9217e300eea..968c3df2810e 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -67,12 +67,9 @@ static void dma_buf_release(struct dentry *dentry) BUG_ON(dmabuf->vmapping_counter);
/* - * Any fences that a dma-buf poll can wait on should be signaled - * before releasing dma-buf. This is the responsibility of each - * driver that uses the reservation objects. - * - * If you hit this BUG() it means someone dropped their ref to the - * dma-buf while still having pending operation to the buffer. + * If you hit this BUG() it could mean: + * * There's a file reference imbalance in dma_buf_poll / dma_buf_poll_cb or somewhere else + * * dmabuf->cb_in/out.active are non-0 despite no pending fence callback */ BUG_ON(dmabuf->cb_in.active || dmabuf->cb_out.active);
@@ -200,6 +197,7 @@ static loff_t dma_buf_llseek(struct file *file, loff_t offset, int whence) static void dma_buf_poll_cb(struct dma_fence *fence, struct dma_fence_cb *cb) { struct dma_buf_poll_cb_t *dcb = (struct dma_buf_poll_cb_t *)cb; + struct dma_buf *dmabuf = container_of(dcb->poll, struct dma_buf, poll); unsigned long flags;
spin_lock_irqsave(&dcb->poll->lock, flags); @@ -207,6 +205,8 @@ static void dma_buf_poll_cb(struct dma_fence *fence, struct dma_fence_cb *cb) dcb->active = 0; spin_unlock_irqrestore(&dcb->poll->lock, flags); dma_fence_put(fence); + /* Paired with get_file in dma_buf_poll */ + fput(dmabuf->file); }
static bool dma_buf_poll_shared(struct dma_resv *resv, @@ -282,8 +282,12 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll) spin_unlock_irq(&dmabuf->poll.lock);
if (events & EPOLLOUT) { + /* Paired with fput in dma_buf_poll_cb */ + get_file(dmabuf->file); + if (!dma_buf_poll_shared(resv, dcb) && !dma_buf_poll_excl(resv, dcb)) + /* No callback queued, wake up any other waiters */ dma_buf_poll_cb(NULL, &dcb->cb); else @@ -303,6 +307,9 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll) spin_unlock_irq(&dmabuf->poll.lock);
if (events & EPOLLIN) { + /* Paired with fput in dma_buf_poll_cb */ + get_file(dmabuf->file); + if (!dma_buf_poll_excl(resv, dcb)) /* No callback queued, wake up any other waiters */ dma_buf_poll_cb(NULL, &dcb->cb);
From: Filipe Manana fdmanana@suse.com
[ Upstream commit 2bb2e00ed9787e52580bb651264b8d6a2b7a9dd2 ]
When a task is doing some modification to the chunk btree and it is not in the context of a chunk allocation or a chunk removal, it can deadlock with another task that is currently allocating a new data or metadata chunk.
These contexts are the following:
* When relocating a system chunk, when we need to COW the extent buffers that belong to the chunk btree;
* When adding a new device (ioctl), where we need to add a new device item to the chunk btree;
* When removing a device (ioctl), where we need to remove a device item from the chunk btree;
* When resizing a device (ioctl), where we need to update a device item in the chunk btree and may need to relocate a system chunk that lies beyond the new device size when shrinking a device.
The problem happens due to a sequence of steps like the following:
1) Task A starts a data or metadata chunk allocation and it locks the chunk mutex;
2) Task B is relocating a system chunk, and when it needs to COW an extent buffer of the chunk btree, it has locked both that extent buffer as well as its parent extent buffer;
3) Since there is not enough available system space, either because none of the existing system block groups have enough free space or because the only one with enough free space is in RO mode due to the relocation, task B triggers a new system chunk allocation. It blocks when trying to acquire the chunk mutex, currently held by task A;
4) Task A enters btrfs_chunk_alloc_add_chunk_item(), in order to insert the new chunk item into the chunk btree and update the existing device items there. But in order to do that, it has to lock the extent buffer that task B locked at step 2, or its parent extent buffer, but task B is waiting on the chunk mutex, which is currently locked by task A, therefore resulting in a deadlock.
One example report when the deadlock happens with system chunk relocation:
INFO: task kworker/u9:5:546 blocked for more than 143 seconds. Not tainted 5.15.0-rc3+ #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:kworker/u9:5 state:D stack:25936 pid: 546 ppid: 2 flags:0x00004000 Workqueue: events_unbound btrfs_async_reclaim_metadata_space Call Trace: context_switch kernel/sched/core.c:4940 [inline] __schedule+0xcd9/0x2530 kernel/sched/core.c:6287 schedule+0xd3/0x270 kernel/sched/core.c:6366 rwsem_down_read_slowpath+0x4ee/0x9d0 kernel/locking/rwsem.c:993 __down_read_common kernel/locking/rwsem.c:1214 [inline] __down_read kernel/locking/rwsem.c:1223 [inline] down_read_nested+0xe6/0x440 kernel/locking/rwsem.c:1590 __btrfs_tree_read_lock+0x31/0x350 fs/btrfs/locking.c:47 btrfs_tree_read_lock fs/btrfs/locking.c:54 [inline] btrfs_read_lock_root_node+0x8a/0x320 fs/btrfs/locking.c:191 btrfs_search_slot_get_root fs/btrfs/ctree.c:1623 [inline] btrfs_search_slot+0x13b4/0x2140 fs/btrfs/ctree.c:1728 btrfs_update_device+0x11f/0x500 fs/btrfs/volumes.c:2794 btrfs_chunk_alloc_add_chunk_item+0x34d/0xea0 fs/btrfs/volumes.c:5504 do_chunk_alloc fs/btrfs/block-group.c:3408 [inline] btrfs_chunk_alloc+0x84d/0xf50 fs/btrfs/block-group.c:3653 flush_space+0x54e/0xd80 fs/btrfs/space-info.c:670 btrfs_async_reclaim_metadata_space+0x396/0xa90 fs/btrfs/space-info.c:953 process_one_work+0x9df/0x16d0 kernel/workqueue.c:2297 worker_thread+0x90/0xed0 kernel/workqueue.c:2444 kthread+0x3e5/0x4d0 kernel/kthread.c:319 ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:295 INFO: task syz-executor:9107 blocked for more than 143 seconds. Not tainted 5.15.0-rc3+ #1 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. task:syz-executor state:D stack:23200 pid: 9107 ppid: 7792 flags:0x00004004 Call Trace: context_switch kernel/sched/core.c:4940 [inline] __schedule+0xcd9/0x2530 kernel/sched/core.c:6287 schedule+0xd3/0x270 kernel/sched/core.c:6366 schedule_preempt_disabled+0xf/0x20 kernel/sched/core.c:6425 __mutex_lock_common kernel/locking/mutex.c:669 [inline] __mutex_lock+0xc96/0x1680 kernel/locking/mutex.c:729 btrfs_chunk_alloc+0x31a/0xf50 fs/btrfs/block-group.c:3631 find_free_extent_update_loop fs/btrfs/extent-tree.c:3986 [inline] find_free_extent+0x25cb/0x3a30 fs/btrfs/extent-tree.c:4335 btrfs_reserve_extent+0x1f1/0x500 fs/btrfs/extent-tree.c:4415 btrfs_alloc_tree_block+0x203/0x1120 fs/btrfs/extent-tree.c:4813 __btrfs_cow_block+0x412/0x1620 fs/btrfs/ctree.c:415 btrfs_cow_block+0x2f6/0x8c0 fs/btrfs/ctree.c:570 btrfs_search_slot+0x1094/0x2140 fs/btrfs/ctree.c:1768 relocate_tree_block fs/btrfs/relocation.c:2694 [inline] relocate_tree_blocks+0xf73/0x1770 fs/btrfs/relocation.c:2757 relocate_block_group+0x47e/0xc70 fs/btrfs/relocation.c:3673 btrfs_relocate_block_group+0x48a/0xc60 fs/btrfs/relocation.c:4070 btrfs_relocate_chunk+0x96/0x280 fs/btrfs/volumes.c:3181 __btrfs_balance fs/btrfs/volumes.c:3911 [inline] btrfs_balance+0x1f03/0x3cd0 fs/btrfs/volumes.c:4301 btrfs_ioctl_balance+0x61e/0x800 fs/btrfs/ioctl.c:4137 btrfs_ioctl+0x39ea/0x7b70 fs/btrfs/ioctl.c:4949 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:874 [inline] __se_sys_ioctl fs/ioctl.c:860 [inline] __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:860 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae
So fix this by making sure that whenever we try to modify the chunk btree and we are neither in a chunk allocation context nor in a chunk remove context, we reserve system space before modifying the chunk btree.
Reported-by: Hao Sun sunhao.th@gmail.com Link: https://lore.kernel.org/linux-btrfs/CACkBjsax51i4mu6C0C3vJqQN3NR_iVuucoeG3U1... Fixes: 79bd37120b1495 ("btrfs: rework chunk allocation to avoid exhaustion of the system chunk array") CC: stable@vger.kernel.org # 5.14+ Reviewed-by: Josef Bacik josef@toxicpanda.com Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/block-group.c | 146 +++++++++++++++++++++++++---------------- fs/btrfs/block-group.h | 2 + fs/btrfs/relocation.c | 4 ++ fs/btrfs/volumes.c | 15 ++++- 4 files changed, 111 insertions(+), 56 deletions(-)
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c index aadc1203ad88..c6c5a22ff6e8 100644 --- a/fs/btrfs/block-group.c +++ b/fs/btrfs/block-group.c @@ -3406,25 +3406,6 @@ static int do_chunk_alloc(struct btrfs_trans_handle *trans, u64 flags) goto out; }
- /* - * If this is a system chunk allocation then stop right here and do not - * add the chunk item to the chunk btree. This is to prevent a deadlock - * because this system chunk allocation can be triggered while COWing - * some extent buffer of the chunk btree and while holding a lock on a - * parent extent buffer, in which case attempting to insert the chunk - * item (or update the device item) would result in a deadlock on that - * parent extent buffer. In this case defer the chunk btree updates to - * the second phase of chunk allocation and keep our reservation until - * the second phase completes. - * - * This is a rare case and can only be triggered by the very few cases - * we have where we need to touch the chunk btree outside chunk allocation - * and chunk removal. These cases are basically adding a device, removing - * a device or resizing a device. - */ - if (flags & BTRFS_BLOCK_GROUP_SYSTEM) - return 0; - ret = btrfs_chunk_alloc_add_chunk_item(trans, bg); /* * Normally we are not expected to fail with -ENOSPC here, since we have @@ -3557,14 +3538,14 @@ static int do_chunk_alloc(struct btrfs_trans_handle *trans, u64 flags) * This has happened before and commit eafa4fd0ad0607 ("btrfs: fix exhaustion of * the system chunk array due to concurrent allocations") provides more details. * - * For allocation of system chunks, we defer the updates and insertions into the - * chunk btree to phase 2. This is to prevent deadlocks on extent buffers because - * if the chunk allocation is triggered while COWing an extent buffer of the - * chunk btree, we are holding a lock on the parent of that extent buffer and - * doing the chunk btree updates and insertions can require locking that parent. - * This is for the very few and rare cases where we update the chunk btree that - * are not chunk allocation or chunk removal: adding a device, removing a device - * or resizing a device. + * Allocation of system chunks does not happen through this function. A task that + * needs to update the chunk btree (the only btree that uses system chunks), must + * preallocate chunk space by calling either check_system_chunk() or + * btrfs_reserve_chunk_metadata() - the former is used when allocating a data or + * metadata chunk or when removing a chunk, while the later is used before doing + * a modification to the chunk btree - use cases for the later are adding, + * removing and resizing a device as well as relocation of a system chunk. + * See the comment below for more details. * * The reservation of system space, done through check_system_chunk(), as well * as all the updates and insertions into the chunk btree must be done while @@ -3601,11 +3582,27 @@ int btrfs_chunk_alloc(struct btrfs_trans_handle *trans, u64 flags, if (trans->allocating_chunk) return -ENOSPC; /* - * If we are removing a chunk, don't re-enter or we would deadlock. - * System space reservation and system chunk allocation is done by the - * chunk remove operation (btrfs_remove_chunk()). + * Allocation of system chunks can not happen through this path, as we + * could end up in a deadlock if we are allocating a data or metadata + * chunk and there is another task modifying the chunk btree. + * + * This is because while we are holding the chunk mutex, we will attempt + * to add the new chunk item to the chunk btree or update an existing + * device item in the chunk btree, while the other task that is modifying + * the chunk btree is attempting to COW an extent buffer while holding a + * lock on it and on its parent - if the COW operation triggers a system + * chunk allocation, then we can deadlock because we are holding the + * chunk mutex and we may need to access that extent buffer or its parent + * in order to add the chunk item or update a device item. + * + * Tasks that want to modify the chunk tree should reserve system space + * before updating the chunk btree, by calling either + * btrfs_reserve_chunk_metadata() or check_system_chunk(). + * It's possible that after a task reserves the space, it still ends up + * here - this happens in the cases described above at do_chunk_alloc(). + * The task will have to either retry or fail. */ - if (trans->removing_chunk) + if (flags & BTRFS_BLOCK_GROUP_SYSTEM) return -ENOSPC;
space_info = btrfs_find_space_info(fs_info, flags); @@ -3704,17 +3701,14 @@ static u64 get_profile_num_devs(struct btrfs_fs_info *fs_info, u64 type) return num_dev; }
-/* - * Reserve space in the system space for allocating or removing a chunk - */ -void check_system_chunk(struct btrfs_trans_handle *trans, u64 type) +static void reserve_chunk_space(struct btrfs_trans_handle *trans, + u64 bytes, + u64 type) { struct btrfs_fs_info *fs_info = trans->fs_info; struct btrfs_space_info *info; u64 left; - u64 thresh; int ret = 0; - u64 num_devs;
/* * Needed because we can end up allocating a system chunk and for an @@ -3727,19 +3721,13 @@ void check_system_chunk(struct btrfs_trans_handle *trans, u64 type) left = info->total_bytes - btrfs_space_info_used(info, true); spin_unlock(&info->lock);
- num_devs = get_profile_num_devs(fs_info, type); - - /* num_devs device items to update and 1 chunk item to add or remove */ - thresh = btrfs_calc_metadata_size(fs_info, num_devs) + - btrfs_calc_insert_metadata_size(fs_info, 1); - - if (left < thresh && btrfs_test_opt(fs_info, ENOSPC_DEBUG)) { + if (left < bytes && btrfs_test_opt(fs_info, ENOSPC_DEBUG)) { btrfs_info(fs_info, "left=%llu, need=%llu, flags=%llu", - left, thresh, type); + left, bytes, type); btrfs_dump_space_info(fs_info, info, 0, 0); }
- if (left < thresh) { + if (left < bytes) { u64 flags = btrfs_system_alloc_profile(fs_info); struct btrfs_block_group *bg;
@@ -3748,21 +3736,20 @@ void check_system_chunk(struct btrfs_trans_handle *trans, u64 type) * needing it, as we might not need to COW all nodes/leafs from * the paths we visit in the chunk tree (they were already COWed * or created in the current transaction for example). - * - * Also, if our caller is allocating a system chunk, do not - * attempt to insert the chunk item in the chunk btree, as we - * could deadlock on an extent buffer since our caller may be - * COWing an extent buffer from the chunk btree. */ bg = btrfs_create_chunk(trans, flags); if (IS_ERR(bg)) { ret = PTR_ERR(bg); - } else if (!(type & BTRFS_BLOCK_GROUP_SYSTEM)) { + } else { /* * If we fail to add the chunk item here, we end up * trying again at phase 2 of chunk allocation, at * btrfs_create_pending_block_groups(). So ignore - * any error here. + * any error here. An ENOSPC here could happen, due to + * the cases described at do_chunk_alloc() - the system + * block group we just created was just turned into RO + * mode by a scrub for example, or a running discard + * temporarily removed its free space entries, etc. */ btrfs_chunk_alloc_add_chunk_item(trans, bg); } @@ -3771,12 +3758,61 @@ void check_system_chunk(struct btrfs_trans_handle *trans, u64 type) if (!ret) { ret = btrfs_block_rsv_add(fs_info->chunk_root, &fs_info->chunk_block_rsv, - thresh, BTRFS_RESERVE_NO_FLUSH); + bytes, BTRFS_RESERVE_NO_FLUSH); if (!ret) - trans->chunk_bytes_reserved += thresh; + trans->chunk_bytes_reserved += bytes; } }
+/* + * Reserve space in the system space for allocating or removing a chunk. + * The caller must be holding fs_info->chunk_mutex. + */ +void check_system_chunk(struct btrfs_trans_handle *trans, u64 type) +{ + struct btrfs_fs_info *fs_info = trans->fs_info; + const u64 num_devs = get_profile_num_devs(fs_info, type); + u64 bytes; + + /* num_devs device items to update and 1 chunk item to add or remove. */ + bytes = btrfs_calc_metadata_size(fs_info, num_devs) + + btrfs_calc_insert_metadata_size(fs_info, 1); + + reserve_chunk_space(trans, bytes, type); +} + +/* + * Reserve space in the system space, if needed, for doing a modification to the + * chunk btree. + * + * @trans: A transaction handle. + * @is_item_insertion: Indicate if the modification is for inserting a new item + * in the chunk btree or if it's for the deletion or update + * of an existing item. + * + * This is used in a context where we need to update the chunk btree outside + * block group allocation and removal, to avoid a deadlock with a concurrent + * task that is allocating a metadata or data block group and therefore needs to + * update the chunk btree while holding the chunk mutex. After the update to the + * chunk btree is done, btrfs_trans_release_chunk_metadata() should be called. + * + */ +void btrfs_reserve_chunk_metadata(struct btrfs_trans_handle *trans, + bool is_item_insertion) +{ + struct btrfs_fs_info *fs_info = trans->fs_info; + u64 bytes; + + if (is_item_insertion) + bytes = btrfs_calc_insert_metadata_size(fs_info, 1); + else + bytes = btrfs_calc_metadata_size(fs_info, 1); + + mutex_lock(&fs_info->chunk_mutex); + reserve_chunk_space(trans, bytes, BTRFS_BLOCK_GROUP_SYSTEM); + mutex_unlock(&fs_info->chunk_mutex); +} + void btrfs_put_block_group_cache(struct btrfs_fs_info *info) { struct btrfs_block_group *block_group; diff --git a/fs/btrfs/block-group.h b/fs/btrfs/block-group.h index c72a71efcb18..37e55ebde735 100644 --- a/fs/btrfs/block-group.h +++ b/fs/btrfs/block-group.h @@ -289,6 +289,8 @@ int btrfs_chunk_alloc(struct btrfs_trans_handle *trans, u64 flags, enum btrfs_chunk_alloc_enum force); int btrfs_force_chunk_alloc(struct btrfs_trans_handle *trans, u64 type); void check_system_chunk(struct btrfs_trans_handle *trans, const u64 type); +void btrfs_reserve_chunk_metadata(struct btrfs_trans_handle *trans, + bool is_item_insertion); u64 btrfs_get_alloc_profile(struct btrfs_fs_info *fs_info, u64 orig_flags); void btrfs_put_block_group_cache(struct btrfs_fs_info *info); int btrfs_free_block_groups(struct btrfs_fs_info *info); diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c index 0300770c0a89..429a198f8937 100644 --- a/fs/btrfs/relocation.c +++ b/fs/btrfs/relocation.c @@ -2698,8 +2698,12 @@ static int relocate_tree_block(struct btrfs_trans_handle *trans, list_add_tail(&node->list, &rc->backref_cache.changed); } else { path->lowest_level = node->level; + if (root == root->fs_info->chunk_root) + btrfs_reserve_chunk_metadata(trans, false); ret = btrfs_search_slot(trans, root, key, path, 0, 1); btrfs_release_path(path); + if (root == root->fs_info->chunk_root) + btrfs_trans_release_chunk_metadata(trans); if (ret > 0) ret = 0; } diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index b75ce79a2540..fa68efd7e610 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1879,8 +1879,10 @@ static int btrfs_add_dev_item(struct btrfs_trans_handle *trans, key.type = BTRFS_DEV_ITEM_KEY; key.offset = device->devid;
+ btrfs_reserve_chunk_metadata(trans, true); ret = btrfs_insert_empty_item(trans, trans->fs_info->chunk_root, path, &key, sizeof(*dev_item)); + btrfs_trans_release_chunk_metadata(trans); if (ret) goto out;
@@ -1957,7 +1959,9 @@ static int btrfs_rm_dev_item(struct btrfs_device *device) key.type = BTRFS_DEV_ITEM_KEY; key.offset = device->devid;
+ btrfs_reserve_chunk_metadata(trans, false); ret = btrfs_search_slot(trans, root, &key, path, -1, 1); + btrfs_trans_release_chunk_metadata(trans); if (ret) { if (ret > 0) ret = -ENOENT; @@ -2513,7 +2517,9 @@ static int btrfs_finish_sprout(struct btrfs_trans_handle *trans) key.type = BTRFS_DEV_ITEM_KEY;
while (1) { + btrfs_reserve_chunk_metadata(trans, false); ret = btrfs_search_slot(trans, root, &key, path, 0, 1); + btrfs_trans_release_chunk_metadata(trans); if (ret < 0) goto error;
@@ -2861,6 +2867,7 @@ int btrfs_grow_device(struct btrfs_trans_handle *trans, struct btrfs_super_block *super_copy = fs_info->super_copy; u64 old_total; u64 diff; + int ret;
if (!test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) return -EACCES; @@ -2889,7 +2896,11 @@ int btrfs_grow_device(struct btrfs_trans_handle *trans, &trans->transaction->dev_update_list); mutex_unlock(&fs_info->chunk_mutex);
- return btrfs_update_device(trans, device); + btrfs_reserve_chunk_metadata(trans, false); + ret = btrfs_update_device(trans, device); + btrfs_trans_release_chunk_metadata(trans); + + return ret; }
static int btrfs_free_chunk(struct btrfs_trans_handle *trans, u64 chunk_offset) @@ -4926,8 +4937,10 @@ int btrfs_shrink_device(struct btrfs_device *device, u64 new_size) round_down(old_total - diff, fs_info->sectorsize)); mutex_unlock(&fs_info->chunk_mutex);
+ btrfs_reserve_chunk_metadata(trans, false); /* Now btrfs_update_device() will change the on-disk size. */ ret = btrfs_update_device(trans, device); + btrfs_trans_release_chunk_metadata(trans); if (ret < 0) { btrfs_abort_transaction(trans, ret); btrfs_end_transaction(trans);
From: Matthew Brost matthew.brost@intel.com
[ Upstream commit ce7e75c7ef1bf8ea3d947da8c674d2f40fd7d734 ]
Disable bonding on gen12+ platforms aside from ones already supported by the i915 - TGL, RKL, and ADL-S.
Signed-off-by: Matthew Brost matthew.brost@intel.com Reviewed-by: John Harrison John.C.Harrison@Intel.com Acked-by: Daniel Vetter daniel.vetter@ffwll.ch Signed-off-by: Matt Roper matthew.d.roper@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20210728192100.132425-1-matthe... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/i915/gem/i915_gem_context.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_context.c b/drivers/gpu/drm/i915/gem/i915_gem_context.c index ee0c0b712522..ba2e037a82e4 100644 --- a/drivers/gpu/drm/i915/gem/i915_gem_context.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_context.c @@ -442,6 +442,13 @@ set_proto_ctx_engines_bond(struct i915_user_extension __user *base, void *data) u16 idx, num_bonds; int err, n;
+ if (GRAPHICS_VER(i915) >= 12 && !IS_TIGERLAKE(i915) && + !IS_ROCKETLAKE(i915) && !IS_ALDERLAKE_S(i915)) { + drm_dbg(&i915->drm, + "Bonding on gen12+ aside from TGL, RKL, and ADL_S not supported\n"); + return -ENODEV; + } + if (get_user(idx, &ext->virtual_index)) return -EFAULT;
From: Thomas Hellström thomas.hellstrom@linux.intel.com
[ Upstream commit 3e42cc61275f95fd7f022b6380b95428efe134d3 ]
Pinned contexts, like the migrate contexts need reset after resume since their context image may have been lost. Also the GuC needs to register pinned contexts.
Add a list to struct intel_engine_cs where we add all pinned contexts on creation, and traverse that list at resume time to reset the pinned contexts.
This fixes the kms_pipe_crc_basic@suspend-read-crc-pipe-a selftest for now, but proper LMEM backup / restore is needed for full suspend functionality. However, note that even with full LMEM backup / restore it may be desirable to keep the reset since backing up the migrate context images must happen using memcpy() after the migrate context has become inactive, and for performance- and other reasons we want to avoid memcpy() from LMEM.
Also traverse the list at guc_init_lrc_mapping() calling guc_kernel_context_pin() for the pinned contexts, like is already done for the kernel context.
v2: - Don't reset the contexts on each __engine_unpark() but rather at resume time (Chris Wilson). v3: - Reset contexts in the engine sanitize callback. (Chris Wilson)
Cc: Tvrtko Ursulin tvrtko.ursulin@linux.intel.com Cc: Matthew Auld matthew.auld@intel.com Cc: Maarten Lankhorst maarten.lankhorst@linux.intel.com Cc: Brost Matthew matthew.brost@intel.com Cc: Chris Wilson chris@chris-wilson.co.uk Signed-off-by: Thomas Hellström thomas.hellstrom@linux.intel.com Reviewed-by: Matthew Auld matthew.auld@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20210922062527.865433-6-thomas... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/i915/gt/intel_context_types.h | 8 +++++++ drivers/gpu/drm/i915/gt/intel_engine_cs.c | 4 ++++ drivers/gpu/drm/i915/gt/intel_engine_pm.c | 23 +++++++++++++++++++ drivers/gpu/drm/i915/gt/intel_engine_pm.h | 2 ++ drivers/gpu/drm/i915/gt/intel_engine_types.h | 7 ++++++ .../drm/i915/gt/intel_execlists_submission.c | 2 ++ .../gpu/drm/i915/gt/intel_ring_submission.c | 3 +++ drivers/gpu/drm/i915/gt/mock_engine.c | 2 ++ .../gpu/drm/i915/gt/uc/intel_guc_submission.c | 12 +++++++--- 9 files changed, 60 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_context_types.h b/drivers/gpu/drm/i915/gt/intel_context_types.h index e54351a170e2..a63631ea0ec4 100644 --- a/drivers/gpu/drm/i915/gt/intel_context_types.h +++ b/drivers/gpu/drm/i915/gt/intel_context_types.h @@ -152,6 +152,14 @@ struct intel_context { /** sseu: Control eu/slice partitioning */ struct intel_sseu sseu;
+ /** + * pinned_contexts_link: List link for the engine's pinned contexts. + * This is only used if this is a perma-pinned kernel context and + * the list is assumed to only be manipulated during driver load + * or unload time so no mutex protection currently. + */ + struct list_head pinned_contexts_link; + u8 wa_bb_page; /* if set, page num reserved for context workarounds */
struct { diff --git a/drivers/gpu/drm/i915/gt/intel_engine_cs.c b/drivers/gpu/drm/i915/gt/intel_engine_cs.c index 0d9105a31d84..eb99441e0ada 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_cs.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_cs.c @@ -320,6 +320,7 @@ static int intel_engine_setup(struct intel_gt *gt, enum intel_engine_id id)
BUILD_BUG_ON(BITS_PER_TYPE(engine->mask) < I915_NUM_ENGINES);
+ INIT_LIST_HEAD(&engine->pinned_contexts_list); engine->id = id; engine->legacy_idx = INVALID_ENGINE; engine->mask = BIT(id); @@ -875,6 +876,8 @@ intel_engine_create_pinned_context(struct intel_engine_cs *engine, return ERR_PTR(err); }
+ list_add_tail(&ce->pinned_contexts_link, &engine->pinned_contexts_list); + /* * Give our perma-pinned kernel timelines a separate lockdep class, * so that we can use them from within the normal user timelines @@ -897,6 +900,7 @@ void intel_engine_destroy_pinned_context(struct intel_context *ce) list_del(&ce->timeline->engine_link); mutex_unlock(&hwsp->vm->mutex);
+ list_del(&ce->pinned_contexts_link); intel_context_unpin(ce); intel_context_put(ce); } diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.c b/drivers/gpu/drm/i915/gt/intel_engine_pm.c index 1f07ac4e0672..dacd62773735 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.c +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.c @@ -298,6 +298,29 @@ void intel_engine_init__pm(struct intel_engine_cs *engine) intel_engine_init_heartbeat(engine); }
+/** + * intel_engine_reset_pinned_contexts - Reset the pinned contexts of + * an engine. + * @engine: The engine whose pinned contexts we want to reset. + * + * Typically the pinned context LMEM images lose or get their content + * corrupted on suspend. This function resets their images. + */ +void intel_engine_reset_pinned_contexts(struct intel_engine_cs *engine) +{ + struct intel_context *ce; + + list_for_each_entry(ce, &engine->pinned_contexts_list, + pinned_contexts_link) { + /* kernel context gets reset at __engine_unpark() */ + if (ce == engine->kernel_context) + continue; + + dbg_poison_ce(ce); + ce->ops->reset(ce); + } +} + #if IS_ENABLED(CONFIG_DRM_I915_SELFTEST) #include "selftest_engine_pm.c" #endif diff --git a/drivers/gpu/drm/i915/gt/intel_engine_pm.h b/drivers/gpu/drm/i915/gt/intel_engine_pm.h index 70ea46d6cfb0..8520c595f5e1 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_pm.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_pm.h @@ -69,4 +69,6 @@ intel_engine_create_kernel_request(struct intel_engine_cs *engine)
void intel_engine_init__pm(struct intel_engine_cs *engine);
+void intel_engine_reset_pinned_contexts(struct intel_engine_cs *engine); + #endif /* INTEL_ENGINE_PM_H */ diff --git a/drivers/gpu/drm/i915/gt/intel_engine_types.h b/drivers/gpu/drm/i915/gt/intel_engine_types.h index ed91bcff20eb..adc44c9fac6d 100644 --- a/drivers/gpu/drm/i915/gt/intel_engine_types.h +++ b/drivers/gpu/drm/i915/gt/intel_engine_types.h @@ -304,6 +304,13 @@ struct intel_engine_cs {
struct intel_context *kernel_context; /* pinned */
+ /** + * pinned_contexts_list: List of pinned contexts. This list is only + * assumed to be manipulated during driver load- or unload time and + * does therefore not have any additional protection. + */ + struct list_head pinned_contexts_list; + intel_engine_mask_t saturated; /* submitting semaphores too late? */
struct { diff --git a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c index cafb0608ffb4..416f5e0657f0 100644 --- a/drivers/gpu/drm/i915/gt/intel_execlists_submission.c +++ b/drivers/gpu/drm/i915/gt/intel_execlists_submission.c @@ -2787,6 +2787,8 @@ static void execlists_sanitize(struct intel_engine_cs *engine)
/* And scrub the dirty cachelines for the HWSP */ clflush_cache_range(engine->status_page.addr, PAGE_SIZE); + + intel_engine_reset_pinned_contexts(engine); }
static void enable_error_interrupt(struct intel_engine_cs *engine) diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c b/drivers/gpu/drm/i915/gt/intel_ring_submission.c index 2958e2fae380..6f2f6ba87397 100644 --- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c +++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c @@ -17,6 +17,7 @@ #include "intel_ring.h" #include "shmem_utils.h" #include "intel_engine_heartbeat.h" +#include "intel_engine_pm.h"
/* Rough estimate of the typical request size, performing a flush, * set-context and then emitting the batch. @@ -292,6 +293,8 @@ static void xcs_sanitize(struct intel_engine_cs *engine)
/* And scrub the dirty cachelines for the HWSP */ clflush_cache_range(engine->status_page.addr, PAGE_SIZE); + + intel_engine_reset_pinned_contexts(engine); }
static void reset_prepare(struct intel_engine_cs *engine) diff --git a/drivers/gpu/drm/i915/gt/mock_engine.c b/drivers/gpu/drm/i915/gt/mock_engine.c index 2c1af030310c..8b89215afe46 100644 --- a/drivers/gpu/drm/i915/gt/mock_engine.c +++ b/drivers/gpu/drm/i915/gt/mock_engine.c @@ -376,6 +376,8 @@ int mock_engine_init(struct intel_engine_cs *engine) { struct intel_context *ce;
+ INIT_LIST_HEAD(&engine->pinned_contexts_list); + engine->sched_engine = i915_sched_engine_create(ENGINE_MOCK); if (!engine->sched_engine) return -ENOMEM; diff --git a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c index 93c9de8f43e8..6e09a1cca37b 100644 --- a/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c +++ b/drivers/gpu/drm/i915/gt/uc/intel_guc_submission.c @@ -2347,6 +2347,8 @@ static void guc_sanitize(struct intel_engine_cs *engine)
/* And scrub the dirty cachelines for the HWSP */ clflush_cache_range(engine->status_page.addr, PAGE_SIZE); + + intel_engine_reset_pinned_contexts(engine); }
static void setup_hwsp(struct intel_engine_cs *engine) @@ -2422,9 +2424,13 @@ static inline void guc_init_lrc_mapping(struct intel_guc *guc) * and even it did this code would be run again. */
- for_each_engine(engine, gt, id) - if (engine->kernel_context) - guc_kernel_context_pin(guc, engine->kernel_context); + for_each_engine(engine, gt, id) { + struct intel_context *ce; + + list_for_each_entry(ce, &engine->pinned_contexts_list, + pinned_contexts_link) + guc_kernel_context_pin(guc, ce); + } }
static void guc_release(struct intel_engine_cs *engine)
From: Ville Syrjälä ville.syrjala@linux.intel.com
[ Upstream commit ef7ec41f17cbc0861891ccc0634d06a0c8dcbf09 ]
Not all machines have clflush, so don't go assuming they do. Not really sure why the clflush is even here since hwsp is supposed to get snooped I thought.
Although in my case we're talking about a i830 machine where render/blitter snooping is definitely busted. But it might work for the hswp perhaps. Haven't really reverse engineered that one fully.
Cc: stable@vger.kernel.org Cc: Chris Wilson chris@chris-wilson.co.uk Cc: Mika Kuoppala mika.kuoppala@linux.intel.com Fixes: b436a5f8b6c8 ("drm/i915/gt: Track all timelines created using the HWSP") Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20211014090941.12159-2-ville.s... Reviewed-by: Dave Airlie airlied@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/i915/gt/intel_ring_submission.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_ring_submission.c b/drivers/gpu/drm/i915/gt/intel_ring_submission.c index 6f2f6ba87397..02e18e70c78e 100644 --- a/drivers/gpu/drm/i915/gt/intel_ring_submission.c +++ b/drivers/gpu/drm/i915/gt/intel_ring_submission.c @@ -292,7 +292,7 @@ static void xcs_sanitize(struct intel_engine_cs *engine) sanitize_hwsp(engine);
/* And scrub the dirty cachelines for the HWSP */ - clflush_cache_range(engine->status_page.addr, PAGE_SIZE); + drm_clflush_virt_range(engine->status_page.addr, PAGE_SIZE);
intel_engine_reset_pinned_contexts(engine); }
From: Lukas Wunner lukas@wunner.de
[ Upstream commit 3134689f98f9e09004a4727370adc46e7635b4be ]
Rename pm_iter() to pcie_port_device_iter() and make it visible outside CONFIG_PM and portdrv_core.c so it can be used for pciehp slot reset recovery.
[bhelgaas: split into its own patch] Link: https://lore.kernel.org/linux-pci/08c046b0-c9f2-3489-eeef-7e7aca435bb9@gmail... Link: https://lore.kernel.org/r/251f4edcc04c14f873ff1c967bc686169cd07d2d.162763818... Signed-off-by: Lukas Wunner lukas@wunner.de Signed-off-by: Bjorn Helgaas bhelgaas@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/pcie/portdrv.h | 1 + drivers/pci/pcie/portdrv_core.c | 20 ++++++++++---------- 2 files changed, 11 insertions(+), 10 deletions(-)
diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h index 2ff5724b8f13..6126ee4676a7 100644 --- a/drivers/pci/pcie/portdrv.h +++ b/drivers/pci/pcie/portdrv.h @@ -110,6 +110,7 @@ void pcie_port_service_unregister(struct pcie_port_service_driver *new);
extern struct bus_type pcie_port_bus_type; int pcie_port_device_register(struct pci_dev *dev); +int pcie_port_device_iter(struct device *dev, void *data); #ifdef CONFIG_PM int pcie_port_device_suspend(struct device *dev); int pcie_port_device_resume_noirq(struct device *dev); diff --git a/drivers/pci/pcie/portdrv_core.c b/drivers/pci/pcie/portdrv_core.c index 3ee63968deaa..604feeb84ee4 100644 --- a/drivers/pci/pcie/portdrv_core.c +++ b/drivers/pci/pcie/portdrv_core.c @@ -367,24 +367,24 @@ int pcie_port_device_register(struct pci_dev *dev) return status; }
-#ifdef CONFIG_PM -typedef int (*pcie_pm_callback_t)(struct pcie_device *); +typedef int (*pcie_callback_t)(struct pcie_device *);
-static int pm_iter(struct device *dev, void *data) +int pcie_port_device_iter(struct device *dev, void *data) { struct pcie_port_service_driver *service_driver; size_t offset = *(size_t *)data; - pcie_pm_callback_t cb; + pcie_callback_t cb;
if ((dev->bus == &pcie_port_bus_type) && dev->driver) { service_driver = to_service_driver(dev->driver); - cb = *(pcie_pm_callback_t *)((void *)service_driver + offset); + cb = *(pcie_callback_t *)((void *)service_driver + offset); if (cb) return cb(to_pcie_device(dev)); } return 0; }
+#ifdef CONFIG_PM /** * pcie_port_device_suspend - suspend port services associated with a PCIe port * @dev: PCI Express port to handle @@ -392,13 +392,13 @@ static int pm_iter(struct device *dev, void *data) int pcie_port_device_suspend(struct device *dev) { size_t off = offsetof(struct pcie_port_service_driver, suspend); - return device_for_each_child(dev, &off, pm_iter); + return device_for_each_child(dev, &off, pcie_port_device_iter); }
int pcie_port_device_resume_noirq(struct device *dev) { size_t off = offsetof(struct pcie_port_service_driver, resume_noirq); - return device_for_each_child(dev, &off, pm_iter); + return device_for_each_child(dev, &off, pcie_port_device_iter); }
/** @@ -408,7 +408,7 @@ int pcie_port_device_resume_noirq(struct device *dev) int pcie_port_device_resume(struct device *dev) { size_t off = offsetof(struct pcie_port_service_driver, resume); - return device_for_each_child(dev, &off, pm_iter); + return device_for_each_child(dev, &off, pcie_port_device_iter); }
/** @@ -418,7 +418,7 @@ int pcie_port_device_resume(struct device *dev) int pcie_port_device_runtime_suspend(struct device *dev) { size_t off = offsetof(struct pcie_port_service_driver, runtime_suspend); - return device_for_each_child(dev, &off, pm_iter); + return device_for_each_child(dev, &off, pcie_port_device_iter); }
/** @@ -428,7 +428,7 @@ int pcie_port_device_runtime_suspend(struct device *dev) int pcie_port_device_runtime_resume(struct device *dev) { size_t off = offsetof(struct pcie_port_service_driver, runtime_resume); - return device_for_each_child(dev, &off, pm_iter); + return device_for_each_child(dev, &off, pcie_port_device_iter); } #endif /* PM */
From: Lukas Wunner lukas@wunner.de
[ Upstream commit ea401499e943c307e6d44af6c2b4e068643e7884 ]
Stuart Hayes reports that an error handled by DPC at a Root Port results in pciehp gratuitously bringing down a subordinate hotplug port:
RP -- UP -- DP -- UP -- DP (hotplug) -- EP
pciehp brings the slot down because the Link to the Endpoint goes down. That is caused by a Hot Reset being propagated as a result of DPC. Per PCIe Base Spec 5.0, section 6.6.1 "Conventional Reset":
For a Switch, the following must cause a hot reset to be sent on all Downstream Ports: [...]
* The Data Link Layer of the Upstream Port reporting DL_Down status. In Switches that support Link speeds greater than 5.0 GT/s, the Upstream Port must direct the LTSSM of each Downstream Port to the Hot Reset state, but not hold the LTSSMs in that state. This permits each Downstream Port to begin Link training immediately after its hot reset completes. This behavior is recommended for all Switches.
* Receiving a hot reset on the Upstream Port.
Once DPC recovers, pcie_do_recovery() walks down the hierarchy and invokes pcie_portdrv_slot_reset() to restore each port's config space. At that point, a hotplug interrupt is signaled per PCIe Base Spec r5.0, section 6.7.3.4 "Software Notification of Hot-Plug Events":
If the Port is enabled for edge-triggered interrupt signaling using MSI or MSI-X, an interrupt message must be sent every time the logical AND of the following conditions transitions from FALSE to TRUE: [...]
* The Hot-Plug Interrupt Enable bit in the Slot Control register is set to 1b.
* At least one hot-plug event status bit in the Slot Status register and its associated enable bit in the Slot Control register are both set to 1b.
Prevent pciehp from gratuitously bringing down the slot by clearing the error-induced Data Link Layer State Changed event before restoring config space. Afterwards, check whether the link has unexpectedly failed to retrain and synthesize a DLLSC event if so.
Allow each pcie_port_service_driver (one of them being pciehp) to define a slot_reset callback and re-use the existing pm_iter() function to iterate over the callbacks.
Thereby, the Endpoint driver remains bound throughout error recovery and may restore the device to working state.
Surprise removal during error recovery is detected through a Presence Detect Changed event. The hotplug port is expected to not signal that event as a result of a Hot Reset.
The issue isn't DPC-specific, it also occurs when an error is handled by AER through aer_root_reset(). So while the issue was noticed only now, it's been around since 2006 when AER support was first introduced.
[bhelgaas: drop PCI_ERROR_RECOVERY Kconfig, split pm_iter() rename to preparatory patch] Link: https://lore.kernel.org/linux-pci/08c046b0-c9f2-3489-eeef-7e7aca435bb9@gmail... Fixes: 6c2b374d7485 ("PCI-Express AER implemetation: AER core and aerdriver") Link: https://lore.kernel.org/r/251f4edcc04c14f873ff1c967bc686169cd07d2d.162763818... Reported-by: Stuart Hayes stuart.w.hayes@gmail.com Tested-by: Stuart Hayes stuart.w.hayes@gmail.com Signed-off-by: Lukas Wunner lukas@wunner.de Signed-off-by: Bjorn Helgaas bhelgaas@google.com Cc: stable@vger.kernel.org # v2.6.19+: ba952824e6c1: PCI/portdrv: Report reset for frozen channel Cc: Keith Busch kbusch@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/hotplug/pciehp.h | 2 ++ drivers/pci/hotplug/pciehp_core.c | 2 ++ drivers/pci/hotplug/pciehp_hpc.c | 26 ++++++++++++++++++++++++++ drivers/pci/pcie/portdrv.h | 2 ++ drivers/pci/pcie/portdrv_pci.c | 3 +++ 5 files changed, 35 insertions(+)
diff --git a/drivers/pci/hotplug/pciehp.h b/drivers/pci/hotplug/pciehp.h index 10d7e7e1b553..e0a614acee05 100644 --- a/drivers/pci/hotplug/pciehp.h +++ b/drivers/pci/hotplug/pciehp.h @@ -192,6 +192,8 @@ int pciehp_get_attention_status(struct hotplug_slot *hotplug_slot, u8 *status); int pciehp_set_raw_indicator_status(struct hotplug_slot *h_slot, u8 status); int pciehp_get_raw_indicator_status(struct hotplug_slot *h_slot, u8 *status);
+int pciehp_slot_reset(struct pcie_device *dev); + static inline const char *slot_name(struct controller *ctrl) { return hotplug_slot_name(&ctrl->hotplug_slot); diff --git a/drivers/pci/hotplug/pciehp_core.c b/drivers/pci/hotplug/pciehp_core.c index e7fe4b42f039..4042d87d539d 100644 --- a/drivers/pci/hotplug/pciehp_core.c +++ b/drivers/pci/hotplug/pciehp_core.c @@ -351,6 +351,8 @@ static struct pcie_port_service_driver hpdriver_portdrv = { .runtime_suspend = pciehp_runtime_suspend, .runtime_resume = pciehp_runtime_resume, #endif /* PM */ + + .slot_reset = pciehp_slot_reset, };
int __init pcie_hp_init(void) diff --git a/drivers/pci/hotplug/pciehp_hpc.c b/drivers/pci/hotplug/pciehp_hpc.c index 8bedbc77fe95..60098a701e83 100644 --- a/drivers/pci/hotplug/pciehp_hpc.c +++ b/drivers/pci/hotplug/pciehp_hpc.c @@ -865,6 +865,32 @@ void pcie_disable_interrupt(struct controller *ctrl) pcie_write_cmd(ctrl, 0, mask); }
+/** + * pciehp_slot_reset() - ignore link event caused by error-induced hot reset + * @dev: PCI Express port service device + * + * Called from pcie_portdrv_slot_reset() after AER or DPC initiated a reset + * further up in the hierarchy to recover from an error. The reset was + * propagated down to this hotplug port. Ignore the resulting link flap. + * If the link failed to retrain successfully, synthesize the ignored event. + * Surprise removal during reset is detected through Presence Detect Changed. + */ +int pciehp_slot_reset(struct pcie_device *dev) +{ + struct controller *ctrl = get_service_data(dev); + + if (ctrl->state != ON_STATE) + return 0; + + pcie_capability_write_word(dev->port, PCI_EXP_SLTSTA, + PCI_EXP_SLTSTA_DLLSC); + + if (!pciehp_check_link_active(ctrl)) + pciehp_request(ctrl, PCI_EXP_SLTSTA_DLLSC); + + return 0; +} + /* * pciehp has a 1:1 bus:slot relationship so we ultimately want a secondary * bus reset of the bridge, but at the same time we want to ensure that it is diff --git a/drivers/pci/pcie/portdrv.h b/drivers/pci/pcie/portdrv.h index 6126ee4676a7..41fe1ffd5907 100644 --- a/drivers/pci/pcie/portdrv.h +++ b/drivers/pci/pcie/portdrv.h @@ -85,6 +85,8 @@ struct pcie_port_service_driver { int (*runtime_suspend)(struct pcie_device *dev); int (*runtime_resume)(struct pcie_device *dev);
+ int (*slot_reset)(struct pcie_device *dev); + /* Device driver may resume normal operations */ void (*error_resume)(struct pci_dev *dev);
diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c index c7ff1eea225a..1af74c3d9d5d 100644 --- a/drivers/pci/pcie/portdrv_pci.c +++ b/drivers/pci/pcie/portdrv_pci.c @@ -160,6 +160,9 @@ static pci_ers_result_t pcie_portdrv_error_detected(struct pci_dev *dev,
static pci_ers_result_t pcie_portdrv_slot_reset(struct pci_dev *dev) { + size_t off = offsetof(struct pcie_port_service_driver, slot_reset); + device_for_each_child(&dev->dev, &off, pcie_port_device_iter); + pci_restore_state(dev); pci_save_state(dev); return PCI_ERS_RESULT_RECOVERED;
From: Sean Young sean@mess.org
[ Upstream commit 4114978dcd24e72415276bba60ff4ff355970bbc ]
If the IR Toy is receiving IR while a transmit is done, it may end up hanging. We can prevent this from happening by re-entering sample mode just before issuing the transmit command.
Link: https://github.com/bengtmartensson/HarcHardware/discussions/25
Cc: stable@vger.kernel.org [mchehab: renamed: s/STATE_RESET/STATE_COMMAND_NO_RESP/ ] Signed-off-by: Sean Young sean@mess.org Signed-off-by: Mauro Carvalho Chehab mchehab+huawei@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/rc/ir_toy.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/media/rc/ir_toy.c b/drivers/media/rc/ir_toy.c index 7f394277478b..53ae19fa103a 100644 --- a/drivers/media/rc/ir_toy.c +++ b/drivers/media/rc/ir_toy.c @@ -310,7 +310,7 @@ static int irtoy_tx(struct rc_dev *rc, uint *txbuf, uint count) buf[i] = cpu_to_be16(v); }
- buf[count] = cpu_to_be16(0xffff); + buf[count] = 0xffff;
irtoy->tx_buf = buf; irtoy->tx_len = size;
From: Andrew Gabbasov andrew_gabbasov@mentor.com
[ Upstream commit 1869023e24c0de73a160a424dac4621cefd628ae ]
HyperFlash devices in Renesas SoCs use 2-bytes addressing, according to HW manual paragraph 62.3.3 (which officially describes Serial Flash access, but seems to be applicable to HyperFlash too). And 1-byte bus read operations to 2-bytes unaligned addresses in external address space read mode work incorrectly (returns the other byte from the same word).
Function memcpy_fromio(), used by the driver to read data from the bus, in ARM64 architecture (to which Renesas cores belong) uses 8-bytes bus accesses for appropriate aligned addresses, and 1-bytes accesses for other addresses. This results in incorrect data read from HyperFlash in unaligned cases.
This issue can be reproduced using something like the following commands (where mtd1 is a parition on Hyperflash storage, defined properly in a device tree):
[Correct fragment, read from Hyperflash]
root@rcar-gen3:~# dd if=/dev/mtd1 of=/tmp/zz bs=32 count=1 root@rcar-gen3:~# hexdump -C /tmp/zz 00000000 f4 03 00 aa f5 03 01 aa f6 03 02 aa f7 03 03 aa |................| 00000010 00 00 80 d2 40 20 18 d5 00 06 81 d2 a0 18 a6 f2 |....@ ..........| 00000020
[Incorrect read of the same fragment: see the difference at offsets 8-11]
root@rcar-gen3:~# dd if=/dev/mtd1 of=/tmp/zz bs=12 count=1 root@rcar-gen3:~# hexdump -C /tmp/zz 00000000 f4 03 00 aa f5 03 01 aa 03 03 aa aa |............| 0000000c
Fix this issue by creating a local replacement of the copying function, that performs only properly aligned bus accesses, and is used for reading from HyperFlash.
Fixes: ca7d8b980b67f ("memory: add Renesas RPC-IF driver") Cc: stable@vger.kernel.org Signed-off-by: Andrew Gabbasov andrew_gabbasov@mentor.com Link: https://lore.kernel.org/r/20210922184830.29147-1-andrew_gabbasov@mentor.com Signed-off-by: Krzysztof Kozlowski krzysztof.kozlowski@canonical.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/memory/renesas-rpc-if.c | 48 +++++++++++++++++++++++++++++++-- 1 file changed, 46 insertions(+), 2 deletions(-)
diff --git a/drivers/memory/renesas-rpc-if.c b/drivers/memory/renesas-rpc-if.c index 3a416705f61c..c77b23b68a93 100644 --- a/drivers/memory/renesas-rpc-if.c +++ b/drivers/memory/renesas-rpc-if.c @@ -199,7 +199,6 @@ static int rpcif_reg_read(void *context, unsigned int reg, unsigned int *val)
*val = readl(rpc->base + reg); return 0; - }
static int rpcif_reg_write(void *context, unsigned int reg, unsigned int val) @@ -577,6 +576,48 @@ int rpcif_manual_xfer(struct rpcif *rpc) } EXPORT_SYMBOL(rpcif_manual_xfer);
+static void memcpy_fromio_readw(void *to, + const void __iomem *from, + size_t count) +{ + const int maxw = (IS_ENABLED(CONFIG_64BIT)) ? 8 : 4; + u8 buf[2]; + + if (count && ((unsigned long)from & 1)) { + *(u16 *)buf = __raw_readw((void __iomem *)((unsigned long)from & ~1)); + *(u8 *)to = buf[1]; + from++; + to++; + count--; + } + while (count >= 2 && !IS_ALIGNED((unsigned long)from, maxw)) { + *(u16 *)to = __raw_readw(from); + from += 2; + to += 2; + count -= 2; + } + while (count >= maxw) { +#ifdef CONFIG_64BIT + *(u64 *)to = __raw_readq(from); +#else + *(u32 *)to = __raw_readl(from); +#endif + from += maxw; + to += maxw; + count -= maxw; + } + while (count >= 2) { + *(u16 *)to = __raw_readw(from); + from += 2; + to += 2; + count -= 2; + } + if (count) { + *(u16 *)buf = __raw_readw(from); + *(u8 *)to = buf[0]; + } +} + ssize_t rpcif_dirmap_read(struct rpcif *rpc, u64 offs, size_t len, void *buf) { loff_t from = offs & (RPCIF_DIRMAP_SIZE - 1); @@ -598,7 +639,10 @@ ssize_t rpcif_dirmap_read(struct rpcif *rpc, u64 offs, size_t len, void *buf) regmap_write(rpc->regmap, RPCIF_DRDMCR, rpc->dummy); regmap_write(rpc->regmap, RPCIF_DRDRENR, rpc->ddr);
- memcpy_fromio(buf, rpc->dirmap + from, len); + if (rpc->bus_size == 2) + memcpy_fromio_readw(buf, rpc->dirmap + from, len); + else + memcpy_fromio(buf, rpc->dirmap + from, len);
pm_runtime_put(rpc->dev);
From: Seevalamuthu Mariappan quic_seevalam@quicinc.com
[ Upstream commit 081e2d6476e30399433b509684d5da4d1844e430 ]
Wakeup mhi is needed before pci_read/write only for QCA6390 and WCN6855. Since wakeup & release mhi is enabled for all hardwares, below mhi assert is seen in QCN9074 when doing 'rmmod ath11k_pci':
Kernel panic - not syncing: dev_wake != 0 CPU: 2 PID: 13535 Comm: procd Not tainted 4.4.60 #1 Hardware name: Generic DT based system [<80316dac>] (unwind_backtrace) from [<80313700>] (show_stack+0x10/0x14) [<80313700>] (show_stack) from [<805135dc>] (dump_stack+0x7c/0x9c) [<805135dc>] (dump_stack) from [<8032136c>] (panic+0x84/0x1f8) [<8032136c>] (panic) from [<80549b24>] (mhi_pm_disable_transition+0x3b8/0x5b8) [<80549b24>] (mhi_pm_disable_transition) from [<80549ddc>] (mhi_power_down+0xb8/0x100) [<80549ddc>] (mhi_power_down) from [<7f5242b0>] (ath11k_mhi_op_status_cb+0x284/0x3ac [ath11k_pci]) [E][__mhi_device_get_sync] Did not enter M0 state, cur_state:RESET pm_state:SHUTDOWN Process [E][__mhi_device_get_sync] Did not enter M0 state, cur_state:RESET pm_state:SHUTDOWN Process [E][__mhi_device_get_sync] Did not enter M0 state, cur_state:RESET pm_state:SHUTDOWN Process [<7f5242b0>] (ath11k_mhi_op_status_cb [ath11k_pci]) from [<7f524878>] (ath11k_mhi_stop+0x10/0x20 [ath11k_pci]) [<7f524878>] (ath11k_mhi_stop [ath11k_pci]) from [<7f525b94>] (ath11k_pci_power_down+0x54/0x90 [ath11k_pci]) [<7f525b94>] (ath11k_pci_power_down [ath11k_pci]) from [<8056b2a8>] (pci_device_shutdown+0x30/0x44) [<8056b2a8>] (pci_device_shutdown) from [<805cfa0c>] (device_shutdown+0x124/0x174) [<805cfa0c>] (device_shutdown) from [<8033aaa4>] (kernel_restart+0xc/0x50) [<8033aaa4>] (kernel_restart) from [<8033ada8>] (SyS_reboot+0x178/0x1ec) [<8033ada8>] (SyS_reboot) from [<80301b80>] (ret_fast_syscall+0x0/0x34)
Hence, disable wakeup/release mhi using hw_param for other hardwares.
Tested-on: QCN9074 hw1.0 PCI WLAN.HK.2.5.0.1-01060-QCAHKSWPL_SILICONZ-1
Fixes: a05bd8513335 ("ath11k: read and write registers below unwindowed address") Signed-off-by: Seevalamuthu Mariappan quic_seevalam@quicinc.com Signed-off-by: Kalle Valo kvalo@codeaurora.org Link: https://lore.kernel.org/r/1636702019-26142-1-git-send-email-quic_seevalam@qu... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath11k/core.c | 5 +++++ drivers/net/wireless/ath/ath11k/hw.h | 1 + drivers/net/wireless/ath/ath11k/pci.c | 12 ++++++++---- 3 files changed, 14 insertions(+), 4 deletions(-)
diff --git a/drivers/net/wireless/ath/ath11k/core.c b/drivers/net/wireless/ath/ath11k/core.c index 7dcf6b13f794..48b4151e13a3 100644 --- a/drivers/net/wireless/ath/ath11k/core.c +++ b/drivers/net/wireless/ath/ath11k/core.c @@ -71,6 +71,7 @@ static const struct ath11k_hw_params ath11k_hw_params[] = { .supports_suspend = false, .hal_desc_sz = sizeof(struct hal_rx_desc_ipq8074), .fix_l1ss = true, + .wakeup_mhi = false, }, { .hw_rev = ATH11K_HW_IPQ6018_HW10, @@ -112,6 +113,7 @@ static const struct ath11k_hw_params ath11k_hw_params[] = { .supports_suspend = false, .hal_desc_sz = sizeof(struct hal_rx_desc_ipq8074), .fix_l1ss = true, + .wakeup_mhi = false, }, { .name = "qca6390 hw2.0", @@ -152,6 +154,7 @@ static const struct ath11k_hw_params ath11k_hw_params[] = { .supports_suspend = true, .hal_desc_sz = sizeof(struct hal_rx_desc_ipq8074), .fix_l1ss = true, + .wakeup_mhi = true, }, { .name = "qcn9074 hw1.0", @@ -190,6 +193,7 @@ static const struct ath11k_hw_params ath11k_hw_params[] = { .supports_suspend = false, .hal_desc_sz = sizeof(struct hal_rx_desc_qcn9074), .fix_l1ss = true, + .wakeup_mhi = false, }, { .name = "wcn6855 hw2.0", @@ -230,6 +234,7 @@ static const struct ath11k_hw_params ath11k_hw_params[] = { .supports_suspend = true, .hal_desc_sz = sizeof(struct hal_rx_desc_wcn6855), .fix_l1ss = false, + .wakeup_mhi = true, }, };
diff --git a/drivers/net/wireless/ath/ath11k/hw.h b/drivers/net/wireless/ath/ath11k/hw.h index 62f5978b3005..4fe051625edf 100644 --- a/drivers/net/wireless/ath/ath11k/hw.h +++ b/drivers/net/wireless/ath/ath11k/hw.h @@ -163,6 +163,7 @@ struct ath11k_hw_params { bool supports_suspend; u32 hal_desc_sz; bool fix_l1ss; + bool wakeup_mhi; };
struct ath11k_hw_ops { diff --git a/drivers/net/wireless/ath/ath11k/pci.c b/drivers/net/wireless/ath/ath11k/pci.c index 353a2d669fcd..7d0be9388f89 100644 --- a/drivers/net/wireless/ath/ath11k/pci.c +++ b/drivers/net/wireless/ath/ath11k/pci.c @@ -182,7 +182,8 @@ void ath11k_pci_write32(struct ath11k_base *ab, u32 offset, u32 value) /* for offset beyond BAR + 4K - 32, may * need to wakeup MHI to access. */ - if (test_bit(ATH11K_PCI_FLAG_INIT_DONE, &ab_pci->flags) && + if (ab->hw_params.wakeup_mhi && + test_bit(ATH11K_PCI_FLAG_INIT_DONE, &ab_pci->flags) && offset >= ACCESS_ALWAYS_OFF) mhi_device_get_sync(ab_pci->mhi_ctrl->mhi_dev);
@@ -206,7 +207,8 @@ void ath11k_pci_write32(struct ath11k_base *ab, u32 offset, u32 value) } }
- if (test_bit(ATH11K_PCI_FLAG_INIT_DONE, &ab_pci->flags) && + if (ab->hw_params.wakeup_mhi && + test_bit(ATH11K_PCI_FLAG_INIT_DONE, &ab_pci->flags) && offset >= ACCESS_ALWAYS_OFF) mhi_device_put(ab_pci->mhi_ctrl->mhi_dev); } @@ -219,7 +221,8 @@ u32 ath11k_pci_read32(struct ath11k_base *ab, u32 offset) /* for offset beyond BAR + 4K - 32, may * need to wakeup MHI to access. */ - if (test_bit(ATH11K_PCI_FLAG_INIT_DONE, &ab_pci->flags) && + if (ab->hw_params.wakeup_mhi && + test_bit(ATH11K_PCI_FLAG_INIT_DONE, &ab_pci->flags) && offset >= ACCESS_ALWAYS_OFF) mhi_device_get_sync(ab_pci->mhi_ctrl->mhi_dev);
@@ -243,7 +246,8 @@ u32 ath11k_pci_read32(struct ath11k_base *ab, u32 offset) } }
- if (test_bit(ATH11K_PCI_FLAG_INIT_DONE, &ab_pci->flags) && + if (ab->hw_params.wakeup_mhi && + test_bit(ATH11K_PCI_FLAG_INIT_DONE, &ab_pci->flags) && offset >= ACCESS_ALWAYS_OFF) mhi_device_put(ab_pci->mhi_ctrl->mhi_dev);
From: Shai Malin smalin@marvell.com
[ Upstream commit f55e36d5ab76c3097ff36ecea60b91c6b0d80fc8 ]
As it was reported and discussed in: https://lore.kernel.org/lkml/CAHk-=whF9F89vsfH8E9TGc0tZA-yhzi2Di8wOtquNB5vRk... This patch improves the stack space of qede_config_rx_mode() by splitting filter_config() to 3 functions and removing the union qed_filter_type_params.
Reported-by: Naresh Kamboju naresh.kamboju@linaro.org Signed-off-by: Ariel Elior aelior@marvell.com Signed-off-by: Shai Malin smalin@marvell.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/qlogic/qed/qed_l2.c | 23 ++------- .../net/ethernet/qlogic/qede/qede_filter.c | 47 ++++++++----------- include/linux/qed/qed_eth_if.h | 21 ++++----- 3 files changed, 30 insertions(+), 61 deletions(-)
diff --git a/drivers/net/ethernet/qlogic/qed/qed_l2.c b/drivers/net/ethernet/qlogic/qed/qed_l2.c index dfaf10edfabf..ba8c7a31cce1 100644 --- a/drivers/net/ethernet/qlogic/qed/qed_l2.c +++ b/drivers/net/ethernet/qlogic/qed/qed_l2.c @@ -2763,25 +2763,6 @@ static int qed_configure_filter_mcast(struct qed_dev *cdev, return qed_filter_mcast_cmd(cdev, &mcast, QED_SPQ_MODE_CB, NULL); }
-static int qed_configure_filter(struct qed_dev *cdev, - struct qed_filter_params *params) -{ - enum qed_filter_rx_mode_type accept_flags; - - switch (params->type) { - case QED_FILTER_TYPE_UCAST: - return qed_configure_filter_ucast(cdev, ¶ms->filter.ucast); - case QED_FILTER_TYPE_MCAST: - return qed_configure_filter_mcast(cdev, ¶ms->filter.mcast); - case QED_FILTER_TYPE_RX_MODE: - accept_flags = params->filter.accept_flags; - return qed_configure_filter_rx_mode(cdev, accept_flags); - default: - DP_NOTICE(cdev, "Unknown filter type %d\n", (int)params->type); - return -EINVAL; - } -} - static int qed_configure_arfs_searcher(struct qed_dev *cdev, enum qed_filter_config_mode mode) { @@ -2904,7 +2885,9 @@ static const struct qed_eth_ops qed_eth_ops_pass = { .q_rx_stop = &qed_stop_rxq, .q_tx_start = &qed_start_txq, .q_tx_stop = &qed_stop_txq, - .filter_config = &qed_configure_filter, + .filter_config_rx_mode = &qed_configure_filter_rx_mode, + .filter_config_ucast = &qed_configure_filter_ucast, + .filter_config_mcast = &qed_configure_filter_mcast, .fastpath_stop = &qed_fastpath_stop, .eth_cqe_completion = &qed_fp_cqe_completion, .get_vport_stats = &qed_get_vport_stats, diff --git a/drivers/net/ethernet/qlogic/qede/qede_filter.c b/drivers/net/ethernet/qlogic/qede/qede_filter.c index a2e4dfb5cb44..f99b085b56a5 100644 --- a/drivers/net/ethernet/qlogic/qede/qede_filter.c +++ b/drivers/net/ethernet/qlogic/qede/qede_filter.c @@ -619,30 +619,28 @@ static int qede_set_ucast_rx_mac(struct qede_dev *edev, enum qed_filter_xcast_params_type opcode, unsigned char mac[ETH_ALEN]) { - struct qed_filter_params filter_cmd; + struct qed_filter_ucast_params ucast;
- memset(&filter_cmd, 0, sizeof(filter_cmd)); - filter_cmd.type = QED_FILTER_TYPE_UCAST; - filter_cmd.filter.ucast.type = opcode; - filter_cmd.filter.ucast.mac_valid = 1; - ether_addr_copy(filter_cmd.filter.ucast.mac, mac); + memset(&ucast, 0, sizeof(ucast)); + ucast.type = opcode; + ucast.mac_valid = 1; + ether_addr_copy(ucast.mac, mac);
- return edev->ops->filter_config(edev->cdev, &filter_cmd); + return edev->ops->filter_config_ucast(edev->cdev, &ucast); }
static int qede_set_ucast_rx_vlan(struct qede_dev *edev, enum qed_filter_xcast_params_type opcode, u16 vid) { - struct qed_filter_params filter_cmd; + struct qed_filter_ucast_params ucast;
- memset(&filter_cmd, 0, sizeof(filter_cmd)); - filter_cmd.type = QED_FILTER_TYPE_UCAST; - filter_cmd.filter.ucast.type = opcode; - filter_cmd.filter.ucast.vlan_valid = 1; - filter_cmd.filter.ucast.vlan = vid; + memset(&ucast, 0, sizeof(ucast)); + ucast.type = opcode; + ucast.vlan_valid = 1; + ucast.vlan = vid;
- return edev->ops->filter_config(edev->cdev, &filter_cmd); + return edev->ops->filter_config_ucast(edev->cdev, &ucast); }
static int qede_config_accept_any_vlan(struct qede_dev *edev, bool action) @@ -1057,18 +1055,17 @@ static int qede_set_mcast_rx_mac(struct qede_dev *edev, enum qed_filter_xcast_params_type opcode, unsigned char *mac, int num_macs) { - struct qed_filter_params filter_cmd; + struct qed_filter_mcast_params mcast; int i;
- memset(&filter_cmd, 0, sizeof(filter_cmd)); - filter_cmd.type = QED_FILTER_TYPE_MCAST; - filter_cmd.filter.mcast.type = opcode; - filter_cmd.filter.mcast.num = num_macs; + memset(&mcast, 0, sizeof(mcast)); + mcast.type = opcode; + mcast.num = num_macs;
for (i = 0; i < num_macs; i++, mac += ETH_ALEN) - ether_addr_copy(filter_cmd.filter.mcast.mac[i], mac); + ether_addr_copy(mcast.mac[i], mac);
- return edev->ops->filter_config(edev->cdev, &filter_cmd); + return edev->ops->filter_config_mcast(edev->cdev, &mcast); }
int qede_set_mac_addr(struct net_device *ndev, void *p) @@ -1194,7 +1191,6 @@ void qede_config_rx_mode(struct net_device *ndev) { enum qed_filter_rx_mode_type accept_flags; struct qede_dev *edev = netdev_priv(ndev); - struct qed_filter_params rx_mode; unsigned char *uc_macs, *temp; struct netdev_hw_addr *ha; int rc, uc_count; @@ -1220,10 +1216,6 @@ void qede_config_rx_mode(struct net_device *ndev)
netif_addr_unlock_bh(ndev);
- /* Configure the struct for the Rx mode */ - memset(&rx_mode, 0, sizeof(struct qed_filter_params)); - rx_mode.type = QED_FILTER_TYPE_RX_MODE; - /* Remove all previous unicast secondary macs and multicast macs * (configure / leave the primary mac) */ @@ -1271,8 +1263,7 @@ void qede_config_rx_mode(struct net_device *ndev) qede_config_accept_any_vlan(edev, false); }
- rx_mode.filter.accept_flags = accept_flags; - edev->ops->filter_config(edev->cdev, &rx_mode); + edev->ops->filter_config_rx_mode(edev->cdev, accept_flags); out: kfree(uc_macs); } diff --git a/include/linux/qed/qed_eth_if.h b/include/linux/qed/qed_eth_if.h index 812a4d751163..4df0bf0a0864 100644 --- a/include/linux/qed/qed_eth_if.h +++ b/include/linux/qed/qed_eth_if.h @@ -145,12 +145,6 @@ struct qed_filter_mcast_params { unsigned char mac[64][ETH_ALEN]; };
-union qed_filter_type_params { - enum qed_filter_rx_mode_type accept_flags; - struct qed_filter_ucast_params ucast; - struct qed_filter_mcast_params mcast; -}; - enum qed_filter_type { QED_FILTER_TYPE_UCAST, QED_FILTER_TYPE_MCAST, @@ -158,11 +152,6 @@ enum qed_filter_type { QED_MAX_FILTER_TYPES, };
-struct qed_filter_params { - enum qed_filter_type type; - union qed_filter_type_params filter; -}; - struct qed_tunn_params { u16 vxlan_port; u8 update_vxlan_port; @@ -314,8 +303,14 @@ struct qed_eth_ops {
int (*q_tx_stop)(struct qed_dev *cdev, u8 rss_id, void *handle);
- int (*filter_config)(struct qed_dev *cdev, - struct qed_filter_params *params); + int (*filter_config_rx_mode)(struct qed_dev *cdev, + enum qed_filter_rx_mode_type type); + + int (*filter_config_ucast)(struct qed_dev *cdev, + struct qed_filter_ucast_params *params); + + int (*filter_config_mcast)(struct qed_dev *cdev, + struct qed_filter_mcast_params *params);
int (*fastpath_stop)(struct qed_dev *cdev);
From: Barnabás Pőcze pobrn@protonmail.com
[ Upstream commit e7b2e33449e22fdbaa0247d96f31543affe6163d ]
Introduce a helper function which wraps the appropriate `container_of()` macro invocation to convert a `struct device_driver` to `struct wmi_driver`.
Signed-off-by: Barnabás Pőcze pobrn@protonmail.com Link: https://lore.kernel.org/r/20210904175450.156801-27-pobrn@protonmail.com Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/wmi.c | 22 ++++++++++------------ 1 file changed, 10 insertions(+), 12 deletions(-)
diff --git a/drivers/platform/x86/wmi.c b/drivers/platform/x86/wmi.c index 1b65bb61ce88..9aeb1a009097 100644 --- a/drivers/platform/x86/wmi.c +++ b/drivers/platform/x86/wmi.c @@ -676,6 +676,11 @@ static struct wmi_device *dev_to_wdev(struct device *dev) return container_of(dev, struct wmi_device, dev); }
+static inline struct wmi_driver *drv_to_wdrv(struct device_driver *drv) +{ + return container_of(drv, struct wmi_driver, driver); +} + /* * sysfs interface */ @@ -794,8 +799,7 @@ static void wmi_dev_release(struct device *dev)
static int wmi_dev_match(struct device *dev, struct device_driver *driver) { - struct wmi_driver *wmi_driver = - container_of(driver, struct wmi_driver, driver); + struct wmi_driver *wmi_driver = drv_to_wdrv(driver); struct wmi_block *wblock = dev_to_wblock(dev); const struct wmi_device_id *id = wmi_driver->id_table;
@@ -892,8 +896,7 @@ static long wmi_ioctl(struct file *filp, unsigned int cmd, unsigned long arg) }
/* let the driver do any filtering and do the call */ - wdriver = container_of(wblock->dev.dev.driver, - struct wmi_driver, driver); + wdriver = drv_to_wdrv(wblock->dev.dev.driver); if (!try_module_get(wdriver->driver.owner)) { ret = -EBUSY; goto out_ioctl; @@ -926,8 +929,7 @@ static const struct file_operations wmi_fops = { static int wmi_dev_probe(struct device *dev) { struct wmi_block *wblock = dev_to_wblock(dev); - struct wmi_driver *wdriver = - container_of(dev->driver, struct wmi_driver, driver); + struct wmi_driver *wdriver = drv_to_wdrv(dev->driver); int ret = 0; char *buf;
@@ -990,8 +992,7 @@ static int wmi_dev_probe(struct device *dev) static void wmi_dev_remove(struct device *dev) { struct wmi_block *wblock = dev_to_wblock(dev); - struct wmi_driver *wdriver = - container_of(dev->driver, struct wmi_driver, driver); + struct wmi_driver *wdriver = drv_to_wdrv(dev->driver);
if (wdriver->filter_callback) { misc_deregister(&wblock->char_dev); @@ -1296,15 +1297,12 @@ static void acpi_wmi_notify_handler(acpi_handle handle, u32 event,
/* If a driver is bound, then notify the driver. */ if (wblock->dev.dev.driver) { - struct wmi_driver *driver; + struct wmi_driver *driver = drv_to_wdrv(wblock->dev.dev.driver); struct acpi_object_list input; union acpi_object params[1]; struct acpi_buffer evdata = { ACPI_ALLOCATE_BUFFER, NULL }; acpi_status status;
- driver = container_of(wblock->dev.dev.driver, - struct wmi_driver, driver); - input.count = 1; input.pointer = params; params[0].type = ACPI_TYPE_INTEGER;
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit a90b38c58667142ecff2521481ed44286d46b140 ]
Replace the wmi_block.read_takes_no_args bool field with an unsigned long flags field, used together with test_bit() and friends.
This is a preparation patch for fixing a driver->notify() vs ->probe() race, which requires atomic flag handling.
Reviewed-by: Andy Shevchenko andy.shevchenko@gmail.com Signed-off-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20211128190031.405620-1-hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/wmi.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/platform/x86/wmi.c b/drivers/platform/x86/wmi.c index 9aeb1a009097..a00b72ace6d2 100644 --- a/drivers/platform/x86/wmi.c +++ b/drivers/platform/x86/wmi.c @@ -51,6 +51,10 @@ struct guid_block { u8 flags; };
+enum { /* wmi_block flags */ + WMI_READ_TAKES_NO_ARGS, +}; + struct wmi_block { struct wmi_device dev; struct list_head list; @@ -61,8 +65,7 @@ struct wmi_block { wmi_notify_handler handler; void *handler_data; u64 req_buf_size; - - bool read_takes_no_args; + unsigned long flags; };
@@ -325,7 +328,7 @@ static acpi_status __query_block(struct wmi_block *wblock, u8 instance, wq_params[0].type = ACPI_TYPE_INTEGER; wq_params[0].integer.value = instance;
- if (instance == 0 && wblock->read_takes_no_args) + if (instance == 0 && test_bit(WMI_READ_TAKES_NO_ARGS, &wblock->flags)) input.count = 0;
/* @@ -1087,7 +1090,7 @@ static int wmi_create_device(struct device *wmi_bus_dev, * laptops, WQxx may not be a method at all.) */ if (info->type != ACPI_TYPE_METHOD || info->param_count == 0) - wblock->read_takes_no_args = true; + set_bit(WMI_READ_TAKES_NO_ARGS, &wblock->flags);
kfree(info);
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit 9918878676a5f9e99b98679f04b9e6c0f5426b0a ]
The driver core sets struct device->driver before calling out to the bus' probe() method, this leaves a window where an ACPI notify may happen on the WMI object before the driver's probe() method has completed running, causing e.g. the driver's notify() callback to get called with drvdata not yet being set leading to a NULL pointer deref.
At a check for this to the WMI core, ensuring that the notify() callback is not called before the driver is ready.
Fixes: 1686f5444546 ("platform/x86: wmi: Incorporate acpi_install_notify_handler") Reviewed-by: Andy Shevchenko andy.shevchenko@gmail.com Signed-off-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20211128190031.405620-2-hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/wmi.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/platform/x86/wmi.c b/drivers/platform/x86/wmi.c index a00b72ace6d2..c4f917d45b51 100644 --- a/drivers/platform/x86/wmi.c +++ b/drivers/platform/x86/wmi.c @@ -53,6 +53,7 @@ struct guid_block {
enum { /* wmi_block flags */ WMI_READ_TAKES_NO_ARGS, + WMI_PROBED, };
struct wmi_block { @@ -980,6 +981,7 @@ static int wmi_dev_probe(struct device *dev) } }
+ set_bit(WMI_PROBED, &wblock->flags); return 0;
probe_misc_failure: @@ -997,6 +999,8 @@ static void wmi_dev_remove(struct device *dev) struct wmi_block *wblock = dev_to_wblock(dev); struct wmi_driver *wdriver = drv_to_wdrv(dev->driver);
+ clear_bit(WMI_PROBED, &wblock->flags); + if (wdriver->filter_callback) { misc_deregister(&wblock->char_dev); kfree(wblock->char_dev.name); @@ -1299,7 +1303,7 @@ static void acpi_wmi_notify_handler(acpi_handle handle, u32 event, return;
/* If a driver is bound, then notify the driver. */ - if (wblock->dev.dev.driver) { + if (test_bit(WMI_PROBED, &wblock->flags) && wblock->dev.dev.driver) { struct wmi_driver *driver = drv_to_wdrv(wblock->dev.dev.driver); struct acpi_object_list input; union acpi_object params[1];
From: Lorenzo Bianconi lorenzo@kernel.org
[ Upstream commit b30363102a4122f6eed37927b64a2c7ac70b8859 ]
Remove mt7921_mac_set_beacon_filter routine since it is no longer used.
Signed-off-by: Lorenzo Bianconi lorenzo@kernel.org Signed-off-by: Felix Fietkau nbd@nbd.name Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/wireless/mediatek/mt76/mt7921/mac.c | 28 ------------------- .../wireless/mediatek/mt76/mt7921/mt7921.h | 3 -- 2 files changed, 31 deletions(-)
diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/mac.c b/drivers/net/wireless/mediatek/mt76/mt7921/mac.c index bef8d4a76ed9..426e7a32bdc8 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/mac.c +++ b/drivers/net/wireless/mediatek/mt76/mt7921/mac.c @@ -1573,34 +1573,6 @@ void mt7921_pm_power_save_work(struct work_struct *work) queue_delayed_work(dev->mt76.wq, &dev->pm.ps_work, delta); }
-int mt7921_mac_set_beacon_filter(struct mt7921_phy *phy, - struct ieee80211_vif *vif, - bool enable) -{ - struct mt7921_dev *dev = phy->dev; - bool ext_phy = phy != &dev->phy; - int err; - - if (!dev->pm.enable) - return -EOPNOTSUPP; - - err = mt7921_mcu_set_bss_pm(dev, vif, enable); - if (err) - return err; - - if (enable) { - vif->driver_flags |= IEEE80211_VIF_BEACON_FILTER; - mt76_set(dev, MT_WF_RFCR(ext_phy), - MT_WF_RFCR_DROP_OTHER_BEACON); - } else { - vif->driver_flags &= ~IEEE80211_VIF_BEACON_FILTER; - mt76_clear(dev, MT_WF_RFCR(ext_phy), - MT_WF_RFCR_DROP_OTHER_BEACON); - } - - return 0; -} - void mt7921_coredump_work(struct work_struct *work) { struct mt7921_dev *dev; diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/mt7921.h b/drivers/net/wireless/mediatek/mt76/mt7921/mt7921.h index 2d8bd6bfc820..1aafd8723b3a 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/mt7921.h +++ b/drivers/net/wireless/mediatek/mt76/mt7921/mt7921.h @@ -381,9 +381,6 @@ int mt7921_mcu_fw_pmctrl(struct mt7921_dev *dev); void mt7921_pm_wake_work(struct work_struct *work); void mt7921_pm_power_save_work(struct work_struct *work); bool mt7921_wait_for_mcu_init(struct mt7921_dev *dev); -int mt7921_mac_set_beacon_filter(struct mt7921_phy *phy, - struct ieee80211_vif *vif, - bool enable); void mt7921_pm_interface_iter(void *priv, u8 *mac, struct ieee80211_vif *vif); void mt7921_coredump_work(struct work_struct *work); int mt7921_wfsys_reset(struct mt7921_dev *dev);
From: Lorenzo Bianconi lorenzo@kernel.org
[ Upstream commit 890809ca1986e63d29dd1591090af67b655ed89c ]
Introduce mt7921_mcu_set_beacon_filter utility routine in order to remove duplicated code for hw beacon filtering. Move mt7921_pm_interface_iter in debugfs since it is just used there. Make the following routine static: - mt7921_pm_interface_iter - mt7921_mcu_uni_bss_bcnft - mt7921_mcu_set_bss_pm
Signed-off-by: Lorenzo Bianconi lorenzo@kernel.org Signed-off-by: Felix Fietkau nbd@nbd.name Signed-off-by: Sasha Levin sashal@kernel.org --- .../wireless/mediatek/mt76/mt7921/debugfs.c | 20 +++++--- .../net/wireless/mediatek/mt76/mt7921/main.c | 33 +------------ .../net/wireless/mediatek/mt76/mt7921/mcu.c | 47 ++++++++++--------- .../wireless/mediatek/mt76/mt7921/mt7921.h | 8 ++-- 4 files changed, 45 insertions(+), 63 deletions(-)
diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/debugfs.c b/drivers/net/wireless/mediatek/mt76/mt7921/debugfs.c index 8d5e261cd10f..82fb2585f413 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/debugfs.c +++ b/drivers/net/wireless/mediatek/mt76/mt7921/debugfs.c @@ -262,30 +262,38 @@ mt7921_txpwr(struct seq_file *s, void *data) return 0; }
+static void +mt7921_pm_interface_iter(void *priv, u8 *mac, struct ieee80211_vif *vif) +{ + struct mt7921_dev *dev = priv; + + mt7921_mcu_set_beacon_filter(dev, vif, dev->pm.enable); +} + static int mt7921_pm_set(void *data, u64 val) { struct mt7921_dev *dev = data; struct mt76_connac_pm *pm = &dev->pm; - struct mt76_phy *mphy = dev->phy.mt76; - - if (val == pm->enable) - return 0;
mt7921_mutex_acquire(dev);
+ if (val == pm->enable) + goto out; + if (!pm->enable) { pm->stats.last_wake_event = jiffies; pm->stats.last_doze_event = jiffies; } pm->enable = val;
- ieee80211_iterate_active_interfaces(mphy->hw, + ieee80211_iterate_active_interfaces(mt76_hw(dev), IEEE80211_IFACE_ITER_RESUME_ALL, - mt7921_pm_interface_iter, mphy->priv); + mt7921_pm_interface_iter, dev);
mt76_connac_mcu_set_deep_sleep(&dev->mt76, pm->ds_enable);
+out: mt7921_mutex_release(dev);
return 0; diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/main.c b/drivers/net/wireless/mediatek/mt76/mt7921/main.c index 30252f408ddc..13a7ae3d8351 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/main.c +++ b/drivers/net/wireless/mediatek/mt76/mt7921/main.c @@ -528,36 +528,6 @@ static void mt7921_configure_filter(struct ieee80211_hw *hw, mt7921_mutex_release(dev); }
-static int -mt7921_bss_bcnft_apply(struct mt7921_dev *dev, struct ieee80211_vif *vif, - bool assoc) -{ - int ret; - - if (!dev->pm.enable) - return 0; - - if (assoc) { - ret = mt7921_mcu_uni_bss_bcnft(dev, vif, true); - if (ret) - return ret; - - vif->driver_flags |= IEEE80211_VIF_BEACON_FILTER; - mt76_set(dev, MT_WF_RFCR(0), MT_WF_RFCR_DROP_OTHER_BEACON); - - return 0; - } - - ret = mt7921_mcu_set_bss_pm(dev, vif, false); - if (ret) - return ret; - - vif->driver_flags &= ~IEEE80211_VIF_BEACON_FILTER; - mt76_clear(dev, MT_WF_RFCR(0), MT_WF_RFCR_DROP_OTHER_BEACON); - - return 0; -} - static void mt7921_bss_info_changed(struct ieee80211_hw *hw, struct ieee80211_vif *vif, struct ieee80211_bss_conf *info, @@ -587,7 +557,8 @@ static void mt7921_bss_info_changed(struct ieee80211_hw *hw, if (changed & BSS_CHANGED_ASSOC) { mt7921_mcu_sta_update(dev, NULL, vif, true, MT76_STA_INFO_STATE_ASSOC); - mt7921_bss_bcnft_apply(dev, vif, info->assoc); + if (dev->pm.enable) + mt7921_mcu_set_beacon_filter(dev, vif, info->assoc); }
if (changed & BSS_CHANGED_ARP_FILTER) { diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/mcu.c b/drivers/net/wireless/mediatek/mt76/mt7921/mcu.c index 4119f8efd896..dabc0de2ec65 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/mcu.c +++ b/drivers/net/wireless/mediatek/mt76/mt7921/mcu.c @@ -1205,8 +1205,9 @@ int mt7921_mcu_uni_bss_ps(struct mt7921_dev *dev, struct ieee80211_vif *vif) &ps_req, sizeof(ps_req), true); }
-int mt7921_mcu_uni_bss_bcnft(struct mt7921_dev *dev, struct ieee80211_vif *vif, - bool enable) +static int +mt7921_mcu_uni_bss_bcnft(struct mt7921_dev *dev, struct ieee80211_vif *vif, + bool enable) { struct mt7921_vif *mvif = (struct mt7921_vif *)vif->drv_priv; struct { @@ -1240,8 +1241,9 @@ int mt7921_mcu_uni_bss_bcnft(struct mt7921_dev *dev, struct ieee80211_vif *vif, &bcnft_req, sizeof(bcnft_req), true); }
-int mt7921_mcu_set_bss_pm(struct mt7921_dev *dev, struct ieee80211_vif *vif, - bool enable) +static int +mt7921_mcu_set_bss_pm(struct mt7921_dev *dev, struct ieee80211_vif *vif, + bool enable) { struct mt7921_vif *mvif = (struct mt7921_vif *)vif->drv_priv; struct { @@ -1390,31 +1392,34 @@ int mt7921_mcu_fw_pmctrl(struct mt7921_dev *dev) return err; }
-void -mt7921_pm_interface_iter(void *priv, u8 *mac, struct ieee80211_vif *vif) +int mt7921_mcu_set_beacon_filter(struct mt7921_dev *dev, + struct ieee80211_vif *vif, + bool enable) { - struct mt7921_phy *phy = priv; - struct mt7921_dev *dev = phy->dev; struct ieee80211_hw *hw = mt76_hw(dev); - int ret; - - if (dev->pm.enable) - ret = mt7921_mcu_uni_bss_bcnft(dev, vif, true); - else - ret = mt7921_mcu_set_bss_pm(dev, vif, false); + int err;
- if (ret) - return; + if (enable) { + err = mt7921_mcu_uni_bss_bcnft(dev, vif, true); + if (err) + return err;
- if (dev->pm.enable) { vif->driver_flags |= IEEE80211_VIF_BEACON_FILTER; ieee80211_hw_set(hw, CONNECTION_MONITOR); mt76_set(dev, MT_WF_RFCR(0), MT_WF_RFCR_DROP_OTHER_BEACON); - } else { - vif->driver_flags &= ~IEEE80211_VIF_BEACON_FILTER; - __clear_bit(IEEE80211_HW_CONNECTION_MONITOR, hw->flags); - mt76_clear(dev, MT_WF_RFCR(0), MT_WF_RFCR_DROP_OTHER_BEACON); + + return 0; } + + err = mt7921_mcu_set_bss_pm(dev, vif, false); + if (err) + return err; + + vif->driver_flags &= ~IEEE80211_VIF_BEACON_FILTER; + __clear_bit(IEEE80211_HW_CONNECTION_MONITOR, hw->flags); + mt76_clear(dev, MT_WF_RFCR(0), MT_WF_RFCR_DROP_OTHER_BEACON); + + return 0; }
int mt7921_get_txpwr_info(struct mt7921_dev *dev, struct mt7921_txpwr *txpwr) diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/mt7921.h b/drivers/net/wireless/mediatek/mt76/mt7921/mt7921.h index 1aafd8723b3a..32d4f2cab94e 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/mt7921.h +++ b/drivers/net/wireless/mediatek/mt76/mt7921/mt7921.h @@ -363,6 +363,9 @@ void mt7921_set_stream_he_caps(struct mt7921_phy *phy); void mt7921_update_channel(struct mt76_phy *mphy); int mt7921_init_debugfs(struct mt7921_dev *dev);
+int mt7921_mcu_set_beacon_filter(struct mt7921_dev *dev, + struct ieee80211_vif *vif, + bool enable); int mt7921_mcu_uni_tx_ba(struct mt7921_dev *dev, struct ieee80211_ampdu_params *params, bool enable); @@ -371,17 +374,12 @@ int mt7921_mcu_uni_rx_ba(struct mt7921_dev *dev, bool enable); void mt7921_scan_work(struct work_struct *work); int mt7921_mcu_uni_bss_ps(struct mt7921_dev *dev, struct ieee80211_vif *vif); -int mt7921_mcu_uni_bss_bcnft(struct mt7921_dev *dev, struct ieee80211_vif *vif, - bool enable); -int mt7921_mcu_set_bss_pm(struct mt7921_dev *dev, struct ieee80211_vif *vif, - bool enable); int __mt7921_mcu_drv_pmctrl(struct mt7921_dev *dev); int mt7921_mcu_drv_pmctrl(struct mt7921_dev *dev); int mt7921_mcu_fw_pmctrl(struct mt7921_dev *dev); void mt7921_pm_wake_work(struct work_struct *work); void mt7921_pm_power_save_work(struct work_struct *work); bool mt7921_wait_for_mcu_init(struct mt7921_dev *dev); -void mt7921_pm_interface_iter(void *priv, u8 *mac, struct ieee80211_vif *vif); void mt7921_coredump_work(struct work_struct *work); int mt7921_wfsys_reset(struct mt7921_dev *dev); int mt7921_get_txpwr_info(struct mt7921_dev *dev, struct mt7921_txpwr *txpwr);
From: Lorenzo Bianconi lorenzo@kernel.org
[ Upstream commit d430dffbe9dd30759f3c64b65bf85b0245c8d8ab ]
Fix a possible race enabling/disabling runtime-pm between mt7921_pm_set() and mt7921_poll_rx() since mt7921_pm_wake_work() always schedules rx-napi callback and it will trigger mt7921_pm_power_save_work routine putting chip to in low-power state during mt7921_pm_set processing.
Suggested-by: Deren Wu deren.wu@mediatek.com Tested-by: Deren Wu deren.wu@mediatek.com Fixes: 1d8efc741df8 ("mt76: mt7921: introduce Runtime PM support") Signed-off-by: Lorenzo Bianconi lorenzo@kernel.org Signed-off-by: Kalle Valo kvalo@kernel.org Link: https://lore.kernel.org/r/0f3e075a2033dc05f09dab4059e5be8cbdccc239.164009484... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/mediatek/mt76/mt76_connac_mac.c | 3 --- drivers/net/wireless/mediatek/mt76/mt7921/debugfs.c | 12 +++++++++--- 2 files changed, 9 insertions(+), 6 deletions(-)
diff --git a/drivers/net/wireless/mediatek/mt76/mt76_connac_mac.c b/drivers/net/wireless/mediatek/mt76/mt76_connac_mac.c index af43bcb54578..306e9eaea917 100644 --- a/drivers/net/wireless/mediatek/mt76/mt76_connac_mac.c +++ b/drivers/net/wireless/mediatek/mt76/mt76_connac_mac.c @@ -7,9 +7,6 @@ int mt76_connac_pm_wake(struct mt76_phy *phy, struct mt76_connac_pm *pm) { struct mt76_dev *dev = phy->dev;
- if (!pm->enable) - return 0; - if (mt76_is_usb(dev)) return 0;
diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/debugfs.c b/drivers/net/wireless/mediatek/mt76/mt7921/debugfs.c index 82fb2585f413..b9f25599227d 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/debugfs.c +++ b/drivers/net/wireless/mediatek/mt76/mt7921/debugfs.c @@ -276,7 +276,7 @@ mt7921_pm_set(void *data, u64 val) struct mt7921_dev *dev = data; struct mt76_connac_pm *pm = &dev->pm;
- mt7921_mutex_acquire(dev); + mutex_lock(&dev->mt76.mutex);
if (val == pm->enable) goto out; @@ -285,7 +285,11 @@ mt7921_pm_set(void *data, u64 val) pm->stats.last_wake_event = jiffies; pm->stats.last_doze_event = jiffies; } - pm->enable = val; + /* make sure the chip is awake here and ps_work is scheduled + * just at end of the this routine. + */ + pm->enable = false; + mt76_connac_pm_wake(&dev->mphy, pm);
ieee80211_iterate_active_interfaces(mt76_hw(dev), IEEE80211_IFACE_ITER_RESUME_ALL, @@ -293,8 +297,10 @@ mt7921_pm_set(void *data, u64 val)
mt76_connac_mcu_set_deep_sleep(&dev->mt76, pm->ds_enable);
+ pm->enable = val; + mt76_connac_power_save_sched(&dev->mphy, pm); out: - mt7921_mutex_release(dev); + mutex_unlock(&dev->mt76.mutex);
return 0; }
From: Hou Tao houtao1@huawei.com
[ Upstream commit e4a41c2c1fa916547e63440c73a51a5eb06247af ]
The following error is reported when running "./test_progs -t for_each" under arm64:
bpf_jit: multi-func JIT bug 58 != 56 [...] JIT doesn't support bpf-to-bpf calls
The root cause is the size of BPF_PSEUDO_FUNC instruction increases from 2 to 3 after the address of called bpf-function is settled and there are two bpf-to-bpf calls in test_pkt_access. The generated instructions are shown below:
0x48: 21 00 C0 D2 movz x1, #0x1, lsl #32 0x4c: 21 00 80 F2 movk x1, #0x1
0x48: E1 3F C0 92 movn x1, #0x1ff, lsl #32 0x4c: 41 FE A2 F2 movk x1, #0x17f2, lsl #16 0x50: 81 70 9F F2 movk x1, #0xfb84
Fixing it by using emit_addr_mov_i64() for BPF_PSEUDO_FUNC, so the size of jited image will not change.
Fixes: 69c087ba6225 ("bpf: Add bpf_for_each_map_elem() helper") Signed-off-by: Hou Tao houtao1@huawei.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20211231151018.3781550-1-houtao1@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/net/bpf_jit_comp.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/net/bpf_jit_comp.c b/arch/arm64/net/bpf_jit_comp.c index 95439bbe5df8..4895b4d7e150 100644 --- a/arch/arm64/net/bpf_jit_comp.c +++ b/arch/arm64/net/bpf_jit_comp.c @@ -788,7 +788,10 @@ static int build_insn(const struct bpf_insn *insn, struct jit_ctx *ctx, u64 imm64;
imm64 = (u64)insn1.imm << 32 | (u32)imm; - emit_a64_mov_i64(dst, imm64, ctx); + if (bpf_pseudo_func(insn)) + emit_addr_mov_i64(dst, imm64, ctx); + else + emit_a64_mov_i64(dst, imm64, ctx);
return 1; }
From: Heinrich Schuchardt heinrich.schuchardt@canonical.com
[ Upstream commit ffa7a9141bb70702744a312f904b190ca064bdd7 ]
Both RADEON and NOUVEAU graphics cards are supported on RISC-V. Enabling the one and not the other does not make sense.
As typically at most one of RADEON, NOUVEAU, or VIRTIO GPU support will be needed DRM drivers should be compiled as modules.
Signed-off-by: Heinrich Schuchardt heinrich.schuchardt@canonical.com Signed-off-by: Palmer Dabbelt palmerdabbelt@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/configs/defconfig | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/arch/riscv/configs/defconfig b/arch/riscv/configs/defconfig index 4ebc80315f01..c252fd5706d2 100644 --- a/arch/riscv/configs/defconfig +++ b/arch/riscv/configs/defconfig @@ -72,9 +72,10 @@ CONFIG_GPIOLIB=y CONFIG_GPIO_SIFIVE=y # CONFIG_PTP_1588_CLOCK is not set CONFIG_POWER_RESET=y -CONFIG_DRM=y -CONFIG_DRM_RADEON=y -CONFIG_DRM_VIRTIO_GPU=y +CONFIG_DRM=m +CONFIG_DRM_RADEON=m +CONFIG_DRM_NOUVEAU=m +CONFIG_DRM_VIRTIO_GPU=m CONFIG_FRAMEBUFFER_CONSOLE=y CONFIG_USB=y CONFIG_USB_XHCI_HCD=y
From: Palmer Dabbelt palmer@rivosinc.com
[ Upstream commit 3d12b634fe8206ea974c6061a3f3eea529ffbc48 ]
We have CONFIG_FRAMEBUFFER_CONSOLE=y in the defconfigs, but that depends on CONFIG_FB so it's not actually getting set. I'm assuming most users on real systems want a framebuffer console, so this enables CONFIG_FB to allow that to take effect.
Fixes: 33c57c0d3c67 ("RISC-V: Add a basic defconfig") Reviewed-by: Anup Patel anup@brainfault.org Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/configs/defconfig | 1 + arch/riscv/configs/rv32_defconfig | 1 + 2 files changed, 2 insertions(+)
diff --git a/arch/riscv/configs/defconfig b/arch/riscv/configs/defconfig index c252fd5706d2..f2a2f9c9ed49 100644 --- a/arch/riscv/configs/defconfig +++ b/arch/riscv/configs/defconfig @@ -76,6 +76,7 @@ CONFIG_DRM=m CONFIG_DRM_RADEON=m CONFIG_DRM_NOUVEAU=m CONFIG_DRM_VIRTIO_GPU=m +CONFIG_FB=y CONFIG_FRAMEBUFFER_CONSOLE=y CONFIG_USB=y CONFIG_USB_XHCI_HCD=y diff --git a/arch/riscv/configs/rv32_defconfig b/arch/riscv/configs/rv32_defconfig index 434ef5b64599..cdd113e7a291 100644 --- a/arch/riscv/configs/rv32_defconfig +++ b/arch/riscv/configs/rv32_defconfig @@ -71,6 +71,7 @@ CONFIG_POWER_RESET=y CONFIG_DRM=y CONFIG_DRM_RADEON=y CONFIG_DRM_VIRTIO_GPU=y +CONFIG_FB=y CONFIG_FRAMEBUFFER_CONSOLE=y CONFIG_USB=y CONFIG_USB_XHCI_HCD=y
From: Roi Dayan roid@nvidia.com
[ Upstream commit 6b50cf45b6a0e99f3cab848a72ecca8da56b7460 ]
The driver should add offloaded rules with either a fwd or drop action. The check existed in parsing fdb flows but not when parsing nic flows. Move the test into actions_match_supported() which is called for checking nic flows and fdb flows.
Signed-off-by: Roi Dayan roid@nvidia.com Reviewed-by: Maor Dickman maord@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c index 3aa8d0b83d10..fe52db591121 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c @@ -3305,6 +3305,12 @@ static bool actions_match_supported(struct mlx5e_priv *priv, ct_flow = flow_flag_test(flow, CT) && !ct_clear; actions = flow->attr->action;
+ if (!(actions & + (MLX5_FLOW_CONTEXT_ACTION_FWD_DEST | MLX5_FLOW_CONTEXT_ACTION_DROP))) { + NL_SET_ERR_MSG_MOD(extack, "Rule must have at least one forward/drop action"); + return false; + } + if (mlx5e_is_eswitch_flow(flow)) { if (flow->attr->esw_attr->split_count && ct_flow && !MLX5_CAP_GEN(flow->attr->esw_attr->in_mdev, reg_c_preserve)) { @@ -4207,13 +4213,6 @@ static int parse_tc_fdb_actions(struct mlx5e_priv *priv, attr->action |= MLX5_FLOW_CONTEXT_ACTION_FWD_DEST; }
- if (!(attr->action & - (MLX5_FLOW_CONTEXT_ACTION_FWD_DEST | MLX5_FLOW_CONTEXT_ACTION_DROP))) { - NL_SET_ERR_MSG_MOD(extack, - "Rule must have at least one forward/drop action"); - return -EOPNOTSUPP; - } - if (esw_attr->split_count > 0 && !mlx5_esw_has_fwd_fdb(priv->mdev)) { NL_SET_ERR_MSG_MOD(extack, "current firmware doesn't support split rule for port mirroring");
On Mon, Jul 11, 2022 at 11:05:05AM +0200, Greg Kroah-Hartman wrote:
From: Roi Dayan roid@nvidia.com
[ Upstream commit 6b50cf45b6a0e99f3cab848a72ecca8da56b7460 ]
The driver should add offloaded rules with either a fwd or drop action. The check existed in parsing fdb flows but not when parsing nic flows. Move the test into actions_match_supported() which is called for checking nic flows and fdb flows.
Signed-off-by: Roi Dayan roid@nvidia.com Reviewed-by: Maor Dickman maord@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org
hey Sasha,
A contact at Nvidia tells me that this has caused a regression w/ OVN HW offload. To fix that, commit 7f8770c7 ("net/mlx5e: Set action fwd flag when parsing tc action goto") is also required.
I'm not really sure what flagged this patch for stable, so I don't know whether to suggest it be reverted, or that additonal patch be applied. Roi - what's your thought?
-dann
drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c index 3aa8d0b83d10..fe52db591121 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c @@ -3305,6 +3305,12 @@ static bool actions_match_supported(struct mlx5e_priv *priv, ct_flow = flow_flag_test(flow, CT) && !ct_clear; actions = flow->attr->action;
- if (!(actions &
(MLX5_FLOW_CONTEXT_ACTION_FWD_DEST | MLX5_FLOW_CONTEXT_ACTION_DROP))) {
NL_SET_ERR_MSG_MOD(extack, "Rule must have at least one forward/drop action");
return false;
- }
- if (mlx5e_is_eswitch_flow(flow)) { if (flow->attr->esw_attr->split_count && ct_flow && !MLX5_CAP_GEN(flow->attr->esw_attr->in_mdev, reg_c_preserve)) {
@@ -4207,13 +4213,6 @@ static int parse_tc_fdb_actions(struct mlx5e_priv *priv, attr->action |= MLX5_FLOW_CONTEXT_ACTION_FWD_DEST; }
- if (!(attr->action &
(MLX5_FLOW_CONTEXT_ACTION_FWD_DEST | MLX5_FLOW_CONTEXT_ACTION_DROP))) {
NL_SET_ERR_MSG_MOD(extack,
"Rule must have at least one forward/drop action");
return -EOPNOTSUPP;
- }
- if (esw_attr->split_count > 0 && !mlx5_esw_has_fwd_fdb(priv->mdev)) { NL_SET_ERR_MSG_MOD(extack, "current firmware doesn't support split rule for port mirroring");
On Thu, Jan 12, 2023 at 03:07:59PM -0700, dann frazier wrote:
On Mon, Jul 11, 2022 at 11:05:05AM +0200, Greg Kroah-Hartman wrote:
From: Roi Dayan roid@nvidia.com
[ Upstream commit 6b50cf45b6a0e99f3cab848a72ecca8da56b7460 ]
The driver should add offloaded rules with either a fwd or drop action. The check existed in parsing fdb flows but not when parsing nic flows. Move the test into actions_match_supported() which is called for checking nic flows and fdb flows.
Signed-off-by: Roi Dayan roid@nvidia.com Reviewed-by: Maor Dickman maord@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org
hey Sasha,
A contact at Nvidia tells me that this has caused a regression w/ OVN HW offload. To fix that, commit 7f8770c7 ("net/mlx5e: Set action fwd flag when parsing tc action goto") is also required.
I'm not really sure what flagged this patch for stable, so I don't know whether to suggest it be reverted, or that additonal patch be applied. Roi - what's your thought?
I've queued up the additional change now, thanks.
greg k-h
On 14/01/2023 15:49, Greg Kroah-Hartman wrote:
On Thu, Jan 12, 2023 at 03:07:59PM -0700, dann frazier wrote:
On Mon, Jul 11, 2022 at 11:05:05AM +0200, Greg Kroah-Hartman wrote:
From: Roi Dayan roid@nvidia.com
[ Upstream commit 6b50cf45b6a0e99f3cab848a72ecca8da56b7460 ]
The driver should add offloaded rules with either a fwd or drop action. The check existed in parsing fdb flows but not when parsing nic flows. Move the test into actions_match_supported() which is called for checking nic flows and fdb flows.
Signed-off-by: Roi Dayan roid@nvidia.com Reviewed-by: Maor Dickman maord@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org
hey Sasha,
A contact at Nvidia tells me that this has caused a regression w/ OVN HW offload. To fix that, commit 7f8770c7 ("net/mlx5e: Set action fwd flag when parsing tc action goto") is also required.
I'm not really sure what flagged this patch for stable, so I don't know whether to suggest it be reverted, or that additonal patch be applied. Roi - what's your thought?
I've queued up the additional change now, thanks.
greg k-h
right. thanks. I'm also not sure why the first commit added to stable but it caused checking the fwd flag earlier in the code and the missing commit added the fwd flag earlier. Both commits were in the same series when submitted.
From: Roi Dayan roid@nvidia.com
[ Upstream commit 9c1d3511a2c2fd30c991a20c670991ece4ef27c1 ]
There will probably be more checks, some for nic flows, some for fdb flows and some are shared checks. Split it for fdb and nic to avoid the function getting too big.
Signed-off-by: Roi Dayan roid@nvidia.com Reviewed-by: Maor Dickman maord@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/mellanox/mlx5/core/en_tc.c | 65 +++++++++++-------- 1 file changed, 39 insertions(+), 26 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c index fe52db591121..3c6c7801f131 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c @@ -3291,19 +3291,41 @@ static bool modify_header_match_supported(struct mlx5e_priv *priv, return true; }
-static bool actions_match_supported(struct mlx5e_priv *priv, - struct flow_action *flow_action, - struct mlx5e_tc_flow_parse_attr *parse_attr, - struct mlx5e_tc_flow *flow, - struct netlink_ext_ack *extack) +static bool +actions_match_supported_fdb(struct mlx5e_priv *priv, + struct mlx5e_tc_flow_parse_attr *parse_attr, + struct mlx5e_tc_flow *flow, + struct netlink_ext_ack *extack) +{ + bool ct_flow, ct_clear; + + ct_clear = flow->attr->ct_attr.ct_action & TCA_CT_ACT_CLEAR; + ct_flow = flow_flag_test(flow, CT) && !ct_clear; + + if (flow->attr->esw_attr->split_count && ct_flow && + !MLX5_CAP_GEN(flow->attr->esw_attr->in_mdev, reg_c_preserve)) { + /* All registers used by ct are cleared when using + * split rules. + */ + NL_SET_ERR_MSG_MOD(extack, "Can't offload mirroring with action ct"); + return false; + } + + return true; +} + +static bool +actions_match_supported(struct mlx5e_priv *priv, + struct flow_action *flow_action, + struct mlx5e_tc_flow_parse_attr *parse_attr, + struct mlx5e_tc_flow *flow, + struct netlink_ext_ack *extack) { - bool ct_flow = false, ct_clear = false; - u32 actions; + u32 actions = flow->attr->action; + bool ct_flow, ct_clear;
- ct_clear = flow->attr->ct_attr.ct_action & - TCA_CT_ACT_CLEAR; + ct_clear = flow->attr->ct_attr.ct_action & TCA_CT_ACT_CLEAR; ct_flow = flow_flag_test(flow, CT) && !ct_clear; - actions = flow->attr->action;
if (!(actions & (MLX5_FLOW_CONTEXT_ACTION_FWD_DEST | MLX5_FLOW_CONTEXT_ACTION_DROP))) { @@ -3311,23 +3333,14 @@ static bool actions_match_supported(struct mlx5e_priv *priv, return false; }
- if (mlx5e_is_eswitch_flow(flow)) { - if (flow->attr->esw_attr->split_count && ct_flow && - !MLX5_CAP_GEN(flow->attr->esw_attr->in_mdev, reg_c_preserve)) { - /* All registers used by ct are cleared when using - * split rules. - */ - NL_SET_ERR_MSG_MOD(extack, - "Can't offload mirroring with action ct"); - return false; - } - } + if (actions & MLX5_FLOW_CONTEXT_ACTION_MOD_HDR && + !modify_header_match_supported(priv, &parse_attr->spec, flow_action, + actions, ct_flow, ct_clear, extack)) + return false;
- if (actions & MLX5_FLOW_CONTEXT_ACTION_MOD_HDR) - return modify_header_match_supported(priv, &parse_attr->spec, - flow_action, actions, - ct_flow, ct_clear, - extack); + if (mlx5e_is_eswitch_flow(flow) && + !actions_match_supported_fdb(priv, parse_attr, flow, extack)) + return false;
return true; }
From: Roi Dayan roid@nvidia.com
[ Upstream commit a2446bc77a16cefd27de712d28af2396d6287593 ]
This kind of action is not supported by firmware and generates a syndrome.
kernel: mlx5_core 0000:08:00.0: mlx5_cmd_check:777:(pid 102063): SET_FLOW_TABLE_ENTRY(0x936) op_mod(0x0) failed, status bad parameter(0x3), syndrome (0x8708c3)
Fixes: d7e75a325cb2 ("net/mlx5e: Add offloading of E-Switch TC pedit (header re-write) actions") Signed-off-by: Roi Dayan roid@nvidia.com Reviewed-by: Oz Shlomo ozsh@nvidia.com Reviewed-by: Maor Dickman maord@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c index 3c6c7801f131..eb0d9082ccc5 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c @@ -3333,6 +3333,12 @@ actions_match_supported(struct mlx5e_priv *priv, return false; }
+ if (actions & MLX5_FLOW_CONTEXT_ACTION_MOD_HDR && + actions & MLX5_FLOW_CONTEXT_ACTION_DROP) { + NL_SET_ERR_MSG_MOD(extack, "Drop with modify header action is not supported"); + return false; + } + if (actions & MLX5_FLOW_CONTEXT_ACTION_MOD_HDR && !modify_header_match_supported(priv, &parse_attr->spec, flow_action, actions, ct_flow, ct_clear, extack))
From: Roi Dayan roid@nvidia.com
[ Upstream commit 5623ef8a118838aae65363750dfafcba734dc8cb ]
Such rules are redundant but allowed and passed to the driver. The driver does not support offloading such rules so return an error.
Fixes: 03a9d11e6eeb ("net/mlx5e: Add TC drop and mirred/redirect action parsing for SRIOV offloads") Signed-off-by: Roi Dayan roid@nvidia.com Reviewed-by: Oz Shlomo ozsh@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/en_tc.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c index eb0d9082ccc5..843c8435387f 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_tc.c @@ -3333,6 +3333,12 @@ actions_match_supported(struct mlx5e_priv *priv, return false; }
+ if (!(~actions & + (MLX5_FLOW_CONTEXT_ACTION_FWD_DEST | MLX5_FLOW_CONTEXT_ACTION_DROP))) { + NL_SET_ERR_MSG_MOD(extack, "Rule cannot support forward+drop action"); + return false; + } + if (actions & MLX5_FLOW_CONTEXT_ACTION_MOD_HDR && actions & MLX5_FLOW_CONTEXT_ACTION_DROP) { NL_SET_ERR_MSG_MOD(extack, "Drop with modify header action is not supported");
From: Derek Fang derek.fang@realtek.com
[ Upstream commit a3774a2a6544a7a4a85186e768afc07044aa507f ]
When the system suspends, the codec driver will set SAR to power saving mode if a headset is plugged in. There is a chance to generate an unexpected IRQ, and leads to issues after resuming such as noise from OMTP type headsets.
Signed-off-by: Derek Fang derek.fang@realtek.com Link: https://lore.kernel.org/r/20211109095450.12950-1-derek.fang@realtek.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/rt5682.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/sound/soc/codecs/rt5682.c b/sound/soc/codecs/rt5682.c index 6ad3159eceb9..949b5638db29 100644 --- a/sound/soc/codecs/rt5682.c +++ b/sound/soc/codecs/rt5682.c @@ -48,6 +48,7 @@ static const struct reg_sequence patch_list[] = { {RT5682_SAR_IL_CMD_6, 0x0110}, {RT5682_CHARGE_PUMP_1, 0x0210}, {RT5682_HP_LOGIC_CTRL_2, 0x0007}, + {RT5682_SAR_IL_CMD_2, 0xac00}, };
void rt5682_apply_patch_list(struct rt5682_priv *rt5682, struct device *dev) @@ -2969,9 +2970,6 @@ static int rt5682_suspend(struct snd_soc_component *component) cancel_delayed_work_sync(&rt5682->jack_detect_work); cancel_delayed_work_sync(&rt5682->jd_check_work); if (rt5682->hs_jack && rt5682->jack_type == SND_JACK_HEADSET) { - snd_soc_component_update_bits(component, RT5682_CBJ_CTRL_1, - RT5682_MB1_PATH_MASK | RT5682_MB2_PATH_MASK, - RT5682_CTRL_MB1_REG | RT5682_CTRL_MB2_REG); val = snd_soc_component_read(component, RT5682_CBJ_CTRL_2) & RT5682_JACK_TYPE_MASK;
@@ -2993,10 +2991,15 @@ static int rt5682_suspend(struct snd_soc_component *component) /* enter SAR ADC power saving mode */ snd_soc_component_update_bits(component, RT5682_SAR_IL_CMD_1, RT5682_SAR_BUTT_DET_MASK | RT5682_SAR_BUTDET_MODE_MASK | - RT5682_SAR_BUTDET_RST_MASK | RT5682_SAR_SEL_MB1_MB2_MASK, 0); + RT5682_SAR_SEL_MB1_MB2_MASK, 0); + usleep_range(5000, 6000); + snd_soc_component_update_bits(component, RT5682_CBJ_CTRL_1, + RT5682_MB1_PATH_MASK | RT5682_MB2_PATH_MASK, + RT5682_CTRL_MB1_REG | RT5682_CTRL_MB2_REG); + usleep_range(10000, 12000); snd_soc_component_update_bits(component, RT5682_SAR_IL_CMD_1, - RT5682_SAR_BUTT_DET_MASK | RT5682_SAR_BUTDET_MODE_MASK | RT5682_SAR_BUTDET_RST_MASK, - RT5682_SAR_BUTT_DET_EN | RT5682_SAR_BUTDET_POW_SAV | RT5682_SAR_BUTDET_RST_NORMAL); + RT5682_SAR_BUTT_DET_MASK | RT5682_SAR_BUTDET_MODE_MASK, + RT5682_SAR_BUTT_DET_EN | RT5682_SAR_BUTDET_POW_SAV); }
regcache_cache_only(rt5682->regmap, true);
From: Derek Fang derek.fang@realtek.com
[ Upstream commit 2cd9b0ef82d936623d789bb3fbb6fcf52c500367 ]
Sometimes, end-users change the jack type under suspending, so it needs to re-detect the combo jack type after resuming to avoid any unexpected behaviors.
Signed-off-by: Derek Fang derek.fang@realtek.com Link: https://lore.kernel.org/r/20211109095450.12950-2-derek.fang@realtek.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/rt5682-i2c.c | 1 + sound/soc/codecs/rt5682.c | 23 ++++++++++++++++++++--- sound/soc/codecs/rt5682.h | 1 + 3 files changed, 22 insertions(+), 3 deletions(-)
diff --git a/sound/soc/codecs/rt5682-i2c.c b/sound/soc/codecs/rt5682-i2c.c index b9d5d7a0975b..b17c14b8b36e 100644 --- a/sound/soc/codecs/rt5682-i2c.c +++ b/sound/soc/codecs/rt5682-i2c.c @@ -196,6 +196,7 @@ static int rt5682_i2c_probe(struct i2c_client *i2c, }
mutex_init(&rt5682->calibrate_mutex); + mutex_init(&rt5682->jdet_mutex); rt5682_calibrate(rt5682);
rt5682_apply_patch_list(rt5682, &i2c->dev); diff --git a/sound/soc/codecs/rt5682.c b/sound/soc/codecs/rt5682.c index 949b5638db29..d470b9189768 100644 --- a/sound/soc/codecs/rt5682.c +++ b/sound/soc/codecs/rt5682.c @@ -49,6 +49,7 @@ static const struct reg_sequence patch_list[] = { {RT5682_CHARGE_PUMP_1, 0x0210}, {RT5682_HP_LOGIC_CTRL_2, 0x0007}, {RT5682_SAR_IL_CMD_2, 0xac00}, + {RT5682_CBJ_CTRL_7, 0x0104}, };
void rt5682_apply_patch_list(struct rt5682_priv *rt5682, struct device *dev) @@ -943,6 +944,10 @@ int rt5682_headset_detect(struct snd_soc_component *component, int jack_insert) snd_soc_component_update_bits(component, RT5682_HP_CHARGE_PUMP_1, RT5682_OSW_L_MASK | RT5682_OSW_R_MASK, 0); + rt5682_enable_push_button_irq(component, false); + snd_soc_component_update_bits(component, RT5682_CBJ_CTRL_1, + RT5682_TRIG_JD_MASK, RT5682_TRIG_JD_LOW); + usleep_range(55000, 60000); snd_soc_component_update_bits(component, RT5682_CBJ_CTRL_1, RT5682_TRIG_JD_MASK, RT5682_TRIG_JD_HIGH);
@@ -1099,6 +1104,7 @@ void rt5682_jack_detect_handler(struct work_struct *work) return; }
+ mutex_lock(&rt5682->jdet_mutex); mutex_lock(&rt5682->calibrate_mutex);
val = snd_soc_component_read(rt5682->component, RT5682_AJD1_CTRL) @@ -1172,6 +1178,7 @@ void rt5682_jack_detect_handler(struct work_struct *work) }
mutex_unlock(&rt5682->calibrate_mutex); + mutex_unlock(&rt5682->jdet_mutex); } EXPORT_SYMBOL_GPL(rt5682_jack_detect_handler);
@@ -1521,6 +1528,7 @@ static int rt5682_hp_event(struct snd_soc_dapm_widget *w, { struct snd_soc_component *component = snd_soc_dapm_to_component(w->dapm); + struct rt5682_priv *rt5682 = snd_soc_component_get_drvdata(component);
switch (event) { case SND_SOC_DAPM_PRE_PMU: @@ -1532,12 +1540,17 @@ static int rt5682_hp_event(struct snd_soc_dapm_widget *w, RT5682_DEPOP_1, 0x60, 0x60); snd_soc_component_update_bits(component, RT5682_DAC_ADC_DIG_VOL1, 0x00c0, 0x0080); + + mutex_lock(&rt5682->jdet_mutex); + snd_soc_component_update_bits(component, RT5682_HP_CTRL_2, RT5682_HP_C2_DAC_L_EN | RT5682_HP_C2_DAC_R_EN, RT5682_HP_C2_DAC_L_EN | RT5682_HP_C2_DAC_R_EN); usleep_range(5000, 10000); snd_soc_component_update_bits(component, RT5682_CHARGE_PUMP_1, RT5682_CP_SW_SIZE_MASK, RT5682_CP_SW_SIZE_L); + + mutex_unlock(&rt5682->jdet_mutex); break;
case SND_SOC_DAPM_POST_PMD: @@ -2969,7 +2982,7 @@ static int rt5682_suspend(struct snd_soc_component *component)
cancel_delayed_work_sync(&rt5682->jack_detect_work); cancel_delayed_work_sync(&rt5682->jd_check_work); - if (rt5682->hs_jack && rt5682->jack_type == SND_JACK_HEADSET) { + if (rt5682->hs_jack && (rt5682->jack_type & SND_JACK_HEADSET) == SND_JACK_HEADSET) { val = snd_soc_component_read(component, RT5682_CBJ_CTRL_2) & RT5682_JACK_TYPE_MASK;
@@ -3000,6 +3013,8 @@ static int rt5682_suspend(struct snd_soc_component *component) snd_soc_component_update_bits(component, RT5682_SAR_IL_CMD_1, RT5682_SAR_BUTT_DET_MASK | RT5682_SAR_BUTDET_MODE_MASK, RT5682_SAR_BUTT_DET_EN | RT5682_SAR_BUTDET_POW_SAV); + snd_soc_component_update_bits(component, RT5682_HP_CHARGE_PUMP_1, + RT5682_OSW_L_MASK | RT5682_OSW_R_MASK, 0); }
regcache_cache_only(rt5682->regmap, true); @@ -3017,10 +3032,11 @@ static int rt5682_resume(struct snd_soc_component *component) regcache_cache_only(rt5682->regmap, false); regcache_sync(rt5682->regmap);
- if (rt5682->hs_jack && rt5682->jack_type == SND_JACK_HEADSET) { + if (rt5682->hs_jack && (rt5682->jack_type & SND_JACK_HEADSET) == SND_JACK_HEADSET) { snd_soc_component_update_bits(component, RT5682_SAR_IL_CMD_1, RT5682_SAR_BUTDET_MODE_MASK | RT5682_SAR_SEL_MB1_MB2_MASK, RT5682_SAR_BUTDET_POW_NORM | RT5682_SAR_SEL_MB1_MB2_AUTO); + usleep_range(5000, 6000); snd_soc_component_update_bits(component, RT5682_CBJ_CTRL_1, RT5682_MB1_PATH_MASK | RT5682_MB2_PATH_MASK, RT5682_CTRL_MB1_FSM | RT5682_CTRL_MB2_FSM); @@ -3028,8 +3044,9 @@ static int rt5682_resume(struct snd_soc_component *component) RT5682_PWR_CBJ, RT5682_PWR_CBJ); }
+ rt5682->jack_type = 0; mod_delayed_work(system_power_efficient_wq, - &rt5682->jack_detect_work, msecs_to_jiffies(250)); + &rt5682->jack_detect_work, msecs_to_jiffies(0));
return 0; } diff --git a/sound/soc/codecs/rt5682.h b/sound/soc/codecs/rt5682.h index 8e3244a62c16..f866d207d1bd 100644 --- a/sound/soc/codecs/rt5682.h +++ b/sound/soc/codecs/rt5682.h @@ -1462,6 +1462,7 @@ struct rt5682_priv {
int jack_type; int irq_work_delay_time; + struct mutex jdet_mutex; };
extern const char *rt5682_supply_names[RT5682_NUM_SUPPLIES];
From: Peter Ujfalusi peter.ujfalusi@linux.intel.com
[ Upstream commit 4045daf0fa87846a27f56329fddad2deeb5ca354 ]
On resume from suspend the following chain of events can happen: A rt5682_resume() -> mod_delayed_work() for jack_detect_work B DAPM sequence starts ( DAPM is locked now)
A1. rt5682_jack_detect_handler() scheduled - Takes both jdet_mutex and calibrate_mutex - Calls in to rt5682_headset_detect() which tries to take DAPM lock, it starts to wait for it as B path took it already. B1. DAPM sequence reaches the "HP Amp", rt5682_hp_event() tries to take the jdet_mutex, but it is locked in A1, so it waits.
Deadlock.
To solve the deadlock, drop the jdet_mutex, use the jack_detect_work to do the jack removal handling, move the dapm lock up one level to protect the most of the rt5682_jack_detect_handler(), but not the jack reporting as it might trigger a DAPM sequence. The rt5682_headset_detect() can be changed to static as well.
Fixes: 8deb34a90f063 ("ASoC: rt5682: fix the wrong jack type detected") Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com Link: https://lore.kernel.org/r/20220126100325.16513-1-peter.ujfalusi@linux.intel.... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/rt5682-i2c.c | 15 ++++----------- sound/soc/codecs/rt5682.c | 24 ++++++++---------------- sound/soc/codecs/rt5682.h | 2 -- 3 files changed, 12 insertions(+), 29 deletions(-)
diff --git a/sound/soc/codecs/rt5682-i2c.c b/sound/soc/codecs/rt5682-i2c.c index b17c14b8b36e..74a1fee071dd 100644 --- a/sound/soc/codecs/rt5682-i2c.c +++ b/sound/soc/codecs/rt5682-i2c.c @@ -59,18 +59,12 @@ static void rt5682_jd_check_handler(struct work_struct *work) struct rt5682_priv *rt5682 = container_of(work, struct rt5682_priv, jd_check_work.work);
- if (snd_soc_component_read(rt5682->component, RT5682_AJD1_CTRL) - & RT5682_JDH_RS_MASK) { + if (snd_soc_component_read(rt5682->component, RT5682_AJD1_CTRL) & RT5682_JDH_RS_MASK) /* jack out */ - rt5682->jack_type = rt5682_headset_detect(rt5682->component, 0); - - snd_soc_jack_report(rt5682->hs_jack, rt5682->jack_type, - SND_JACK_HEADSET | - SND_JACK_BTN_0 | SND_JACK_BTN_1 | - SND_JACK_BTN_2 | SND_JACK_BTN_3); - } else { + mod_delayed_work(system_power_efficient_wq, + &rt5682->jack_detect_work, 0); + else schedule_delayed_work(&rt5682->jd_check_work, 500); - } }
static irqreturn_t rt5682_irq(int irq, void *data) @@ -196,7 +190,6 @@ static int rt5682_i2c_probe(struct i2c_client *i2c, }
mutex_init(&rt5682->calibrate_mutex); - mutex_init(&rt5682->jdet_mutex); rt5682_calibrate(rt5682);
rt5682_apply_patch_list(rt5682, &i2c->dev); diff --git a/sound/soc/codecs/rt5682.c b/sound/soc/codecs/rt5682.c index d470b9189768..1cd59e166cce 100644 --- a/sound/soc/codecs/rt5682.c +++ b/sound/soc/codecs/rt5682.c @@ -922,15 +922,13 @@ static void rt5682_enable_push_button_irq(struct snd_soc_component *component, * * Returns detect status. */ -int rt5682_headset_detect(struct snd_soc_component *component, int jack_insert) +static int rt5682_headset_detect(struct snd_soc_component *component, int jack_insert) { struct rt5682_priv *rt5682 = snd_soc_component_get_drvdata(component); struct snd_soc_dapm_context *dapm = &component->dapm; unsigned int val, count;
if (jack_insert) { - snd_soc_dapm_mutex_lock(dapm); - snd_soc_component_update_bits(component, RT5682_PWR_ANLG_1, RT5682_PWR_VREF2 | RT5682_PWR_MB, RT5682_PWR_VREF2 | RT5682_PWR_MB); @@ -981,8 +979,6 @@ int rt5682_headset_detect(struct snd_soc_component *component, int jack_insert) snd_soc_component_update_bits(component, RT5682_MICBIAS_2, RT5682_PWR_CLK25M_MASK | RT5682_PWR_CLK1M_MASK, RT5682_PWR_CLK25M_PU | RT5682_PWR_CLK1M_PU); - - snd_soc_dapm_mutex_unlock(dapm); } else { rt5682_enable_push_button_irq(component, false); snd_soc_component_update_bits(component, RT5682_CBJ_CTRL_1, @@ -1011,7 +1007,6 @@ int rt5682_headset_detect(struct snd_soc_component *component, int jack_insert) dev_dbg(component->dev, "jack_type = %d\n", rt5682->jack_type); return rt5682->jack_type; } -EXPORT_SYMBOL_GPL(rt5682_headset_detect);
static int rt5682_set_jack_detect(struct snd_soc_component *component, struct snd_soc_jack *hs_jack, void *data) @@ -1094,6 +1089,7 @@ void rt5682_jack_detect_handler(struct work_struct *work) { struct rt5682_priv *rt5682 = container_of(work, struct rt5682_priv, jack_detect_work.work); + struct snd_soc_dapm_context *dapm; int val, btn_type;
if (!rt5682->component || !rt5682->component->card || @@ -1104,7 +1100,9 @@ void rt5682_jack_detect_handler(struct work_struct *work) return; }
- mutex_lock(&rt5682->jdet_mutex); + dapm = snd_soc_component_get_dapm(rt5682->component); + + snd_soc_dapm_mutex_lock(dapm); mutex_lock(&rt5682->calibrate_mutex);
val = snd_soc_component_read(rt5682->component, RT5682_AJD1_CTRL) @@ -1164,6 +1162,9 @@ void rt5682_jack_detect_handler(struct work_struct *work) rt5682->irq_work_delay_time = 50; }
+ mutex_unlock(&rt5682->calibrate_mutex); + snd_soc_dapm_mutex_unlock(dapm); + snd_soc_jack_report(rt5682->hs_jack, rt5682->jack_type, SND_JACK_HEADSET | SND_JACK_BTN_0 | SND_JACK_BTN_1 | @@ -1176,9 +1177,6 @@ void rt5682_jack_detect_handler(struct work_struct *work) else cancel_delayed_work_sync(&rt5682->jd_check_work); } - - mutex_unlock(&rt5682->calibrate_mutex); - mutex_unlock(&rt5682->jdet_mutex); } EXPORT_SYMBOL_GPL(rt5682_jack_detect_handler);
@@ -1528,7 +1526,6 @@ static int rt5682_hp_event(struct snd_soc_dapm_widget *w, { struct snd_soc_component *component = snd_soc_dapm_to_component(w->dapm); - struct rt5682_priv *rt5682 = snd_soc_component_get_drvdata(component);
switch (event) { case SND_SOC_DAPM_PRE_PMU: @@ -1540,17 +1537,12 @@ static int rt5682_hp_event(struct snd_soc_dapm_widget *w, RT5682_DEPOP_1, 0x60, 0x60); snd_soc_component_update_bits(component, RT5682_DAC_ADC_DIG_VOL1, 0x00c0, 0x0080); - - mutex_lock(&rt5682->jdet_mutex); - snd_soc_component_update_bits(component, RT5682_HP_CTRL_2, RT5682_HP_C2_DAC_L_EN | RT5682_HP_C2_DAC_R_EN, RT5682_HP_C2_DAC_L_EN | RT5682_HP_C2_DAC_R_EN); usleep_range(5000, 10000); snd_soc_component_update_bits(component, RT5682_CHARGE_PUMP_1, RT5682_CP_SW_SIZE_MASK, RT5682_CP_SW_SIZE_L); - - mutex_unlock(&rt5682->jdet_mutex); break;
case SND_SOC_DAPM_POST_PMD: diff --git a/sound/soc/codecs/rt5682.h b/sound/soc/codecs/rt5682.h index f866d207d1bd..539a9fe26294 100644 --- a/sound/soc/codecs/rt5682.h +++ b/sound/soc/codecs/rt5682.h @@ -1462,7 +1462,6 @@ struct rt5682_priv {
int jack_type; int irq_work_delay_time; - struct mutex jdet_mutex; };
extern const char *rt5682_supply_names[RT5682_NUM_SUPPLIES]; @@ -1472,7 +1471,6 @@ int rt5682_sel_asrc_clk_src(struct snd_soc_component *component,
void rt5682_apply_patch_list(struct rt5682_priv *rt5682, struct device *dev);
-int rt5682_headset_detect(struct snd_soc_component *component, int jack_insert); void rt5682_jack_detect_handler(struct work_struct *work);
bool rt5682_volatile_register(struct device *dev, unsigned int reg);
From: Pablo Neira Ayuso pablo@netfilter.org
[ Upstream commit b5bdc6f9c24db9a0adf8bd00c0e935b184654f00 ]
Generalize boolean field to store more flags on the pktinfo structure.
Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/netfilter/nf_tables.h | 8 ++++++-- include/net/netfilter/nf_tables_ipv4.h | 7 ++++--- include/net/netfilter/nf_tables_ipv6.h | 6 +++--- net/netfilter/nf_tables_core.c | 2 +- net/netfilter/nf_tables_trace.c | 4 ++-- net/netfilter/nft_meta.c | 2 +- net/netfilter/nft_payload.c | 4 ++-- 7 files changed, 19 insertions(+), 14 deletions(-)
diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h index 2af1c2c64128..d005f87691da 100644 --- a/include/net/netfilter/nf_tables.h +++ b/include/net/netfilter/nf_tables.h @@ -21,10 +21,14 @@ struct module;
#define NFT_JUMP_STACK_SIZE 16
+enum { + NFT_PKTINFO_L4PROTO = (1 << 0), +}; + struct nft_pktinfo { struct sk_buff *skb; const struct nf_hook_state *state; - bool tprot_set; + u8 flags; u8 tprot; u16 fragoff; unsigned int thoff; @@ -75,7 +79,7 @@ static inline void nft_set_pktinfo(struct nft_pktinfo *pkt,
static inline void nft_set_pktinfo_unspec(struct nft_pktinfo *pkt) { - pkt->tprot_set = false; + pkt->flags = 0; pkt->tprot = 0; pkt->thoff = 0; pkt->fragoff = 0; diff --git a/include/net/netfilter/nf_tables_ipv4.h b/include/net/netfilter/nf_tables_ipv4.h index eb4c094cd54d..c4a6147b0ef8 100644 --- a/include/net/netfilter/nf_tables_ipv4.h +++ b/include/net/netfilter/nf_tables_ipv4.h @@ -10,7 +10,7 @@ static inline void nft_set_pktinfo_ipv4(struct nft_pktinfo *pkt) struct iphdr *ip;
ip = ip_hdr(pkt->skb); - pkt->tprot_set = true; + pkt->flags = NFT_PKTINFO_L4PROTO; pkt->tprot = ip->protocol; pkt->thoff = ip_hdrlen(pkt->skb); pkt->fragoff = ntohs(ip->frag_off) & IP_OFFSET; @@ -36,7 +36,7 @@ static inline int __nft_set_pktinfo_ipv4_validate(struct nft_pktinfo *pkt) else if (len < thoff) return -1;
- pkt->tprot_set = true; + pkt->flags = NFT_PKTINFO_L4PROTO; pkt->tprot = iph->protocol; pkt->thoff = thoff; pkt->fragoff = ntohs(iph->frag_off) & IP_OFFSET; @@ -71,7 +71,7 @@ static inline int nft_set_pktinfo_ipv4_ingress(struct nft_pktinfo *pkt) goto inhdr_error; }
- pkt->tprot_set = true; + pkt->flags = NFT_PKTINFO_L4PROTO; pkt->tprot = iph->protocol; pkt->thoff = thoff; pkt->fragoff = ntohs(iph->frag_off) & IP_OFFSET; @@ -82,4 +82,5 @@ static inline int nft_set_pktinfo_ipv4_ingress(struct nft_pktinfo *pkt) __IP_INC_STATS(nft_net(pkt), IPSTATS_MIB_INHDRERRORS); return -1; } + #endif diff --git a/include/net/netfilter/nf_tables_ipv6.h b/include/net/netfilter/nf_tables_ipv6.h index 7595e02b00ba..ec7eaeaf4f04 100644 --- a/include/net/netfilter/nf_tables_ipv6.h +++ b/include/net/netfilter/nf_tables_ipv6.h @@ -18,7 +18,7 @@ static inline void nft_set_pktinfo_ipv6(struct nft_pktinfo *pkt) return; }
- pkt->tprot_set = true; + pkt->flags = NFT_PKTINFO_L4PROTO; pkt->tprot = protohdr; pkt->thoff = thoff; pkt->fragoff = frag_off; @@ -50,7 +50,7 @@ static inline int __nft_set_pktinfo_ipv6_validate(struct nft_pktinfo *pkt) if (protohdr < 0) return -1;
- pkt->tprot_set = true; + pkt->flags = NFT_PKTINFO_L4PROTO; pkt->tprot = protohdr; pkt->thoff = thoff; pkt->fragoff = frag_off; @@ -96,7 +96,7 @@ static inline int nft_set_pktinfo_ipv6_ingress(struct nft_pktinfo *pkt) if (protohdr < 0) goto inhdr_error;
- pkt->tprot_set = true; + pkt->flags = NFT_PKTINFO_L4PROTO; pkt->tprot = protohdr; pkt->thoff = thoff; pkt->fragoff = frag_off; diff --git a/net/netfilter/nf_tables_core.c b/net/netfilter/nf_tables_core.c index 907e848dbc17..d4d8f613af51 100644 --- a/net/netfilter/nf_tables_core.c +++ b/net/netfilter/nf_tables_core.c @@ -79,7 +79,7 @@ static bool nft_payload_fast_eval(const struct nft_expr *expr, if (priv->base == NFT_PAYLOAD_NETWORK_HEADER) ptr = skb_network_header(skb); else { - if (!pkt->tprot_set) + if (!(pkt->flags & NFT_PKTINFO_L4PROTO)) return false; ptr = skb_network_header(skb) + nft_thoff(pkt); } diff --git a/net/netfilter/nf_tables_trace.c b/net/netfilter/nf_tables_trace.c index e4fe2f0780eb..84a7dea46efa 100644 --- a/net/netfilter/nf_tables_trace.c +++ b/net/netfilter/nf_tables_trace.c @@ -113,13 +113,13 @@ static int nf_trace_fill_pkt_info(struct sk_buff *nlskb, int off = skb_network_offset(skb); unsigned int len, nh_end;
- nh_end = pkt->tprot_set ? nft_thoff(pkt) : skb->len; + nh_end = pkt->flags & NFT_PKTINFO_L4PROTO ? nft_thoff(pkt) : skb->len; len = min_t(unsigned int, nh_end - skb_network_offset(skb), NFT_TRACETYPE_NETWORK_HSIZE); if (trace_fill_header(nlskb, NFTA_TRACE_NETWORK_HEADER, skb, off, len)) return -1;
- if (pkt->tprot_set) { + if (pkt->flags & NFT_PKTINFO_L4PROTO) { len = min_t(unsigned int, skb->len - nft_thoff(pkt), NFT_TRACETYPE_TRANSPORT_HSIZE); if (trace_fill_header(nlskb, NFTA_TRACE_TRANSPORT_HEADER, skb, diff --git a/net/netfilter/nft_meta.c b/net/netfilter/nft_meta.c index 44d9b38e5f90..14412f69a34e 100644 --- a/net/netfilter/nft_meta.c +++ b/net/netfilter/nft_meta.c @@ -321,7 +321,7 @@ void nft_meta_get_eval(const struct nft_expr *expr, nft_reg_store8(dest, nft_pf(pkt)); break; case NFT_META_L4PROTO: - if (!pkt->tprot_set) + if (!(pkt->flags & NFT_PKTINFO_L4PROTO)) goto err; nft_reg_store8(dest, pkt->tprot); break; diff --git a/net/netfilter/nft_payload.c b/net/netfilter/nft_payload.c index 132875cd7fff..c3ccfff54a35 100644 --- a/net/netfilter/nft_payload.c +++ b/net/netfilter/nft_payload.c @@ -108,7 +108,7 @@ void nft_payload_eval(const struct nft_expr *expr, offset = skb_network_offset(skb); break; case NFT_PAYLOAD_TRANSPORT_HEADER: - if (!pkt->tprot_set) + if (!(pkt->flags & NFT_PKTINFO_L4PROTO)) goto err; offset = nft_thoff(pkt); break; @@ -613,7 +613,7 @@ static void nft_payload_set_eval(const struct nft_expr *expr, offset = skb_network_offset(skb); break; case NFT_PAYLOAD_TRANSPORT_HEADER: - if (!pkt->tprot_set) + if (!(pkt->flags & NFT_PKTINFO_L4PROTO)) goto err; offset = nft_thoff(pkt); break;
From: Pablo Neira Ayuso pablo@netfilter.org
[ Upstream commit c46b38dc8743535e686b911d253a844f0bd50ead ]
Allow to match and mangle on inner headers / payload data after the transport header. There is a new field in the pktinfo structure that stores the inner header offset which is calculated only when requested. Only TCP and UDP supported at this stage.
Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/netfilter/nf_tables.h | 2 + include/uapi/linux/netfilter/nf_tables.h | 2 + net/netfilter/nft_payload.c | 56 +++++++++++++++++++++++- 3 files changed, 58 insertions(+), 2 deletions(-)
diff --git a/include/net/netfilter/nf_tables.h b/include/net/netfilter/nf_tables.h index d005f87691da..bcfee89012a1 100644 --- a/include/net/netfilter/nf_tables.h +++ b/include/net/netfilter/nf_tables.h @@ -23,6 +23,7 @@ struct module;
enum { NFT_PKTINFO_L4PROTO = (1 << 0), + NFT_PKTINFO_INNER = (1 << 1), };
struct nft_pktinfo { @@ -32,6 +33,7 @@ struct nft_pktinfo { u8 tprot; u16 fragoff; unsigned int thoff; + unsigned int inneroff; };
static inline struct sock *nft_sk(const struct nft_pktinfo *pkt) diff --git a/include/uapi/linux/netfilter/nf_tables.h b/include/uapi/linux/netfilter/nf_tables.h index e94d1fa554cb..07871c8a0601 100644 --- a/include/uapi/linux/netfilter/nf_tables.h +++ b/include/uapi/linux/netfilter/nf_tables.h @@ -753,11 +753,13 @@ enum nft_dynset_attributes { * @NFT_PAYLOAD_LL_HEADER: link layer header * @NFT_PAYLOAD_NETWORK_HEADER: network header * @NFT_PAYLOAD_TRANSPORT_HEADER: transport header + * @NFT_PAYLOAD_INNER_HEADER: inner header / payload */ enum nft_payload_bases { NFT_PAYLOAD_LL_HEADER, NFT_PAYLOAD_NETWORK_HEADER, NFT_PAYLOAD_TRANSPORT_HEADER, + NFT_PAYLOAD_INNER_HEADER, };
/** diff --git a/net/netfilter/nft_payload.c b/net/netfilter/nft_payload.c index c3ccfff54a35..ee359a4a60f5 100644 --- a/net/netfilter/nft_payload.c +++ b/net/netfilter/nft_payload.c @@ -22,6 +22,7 @@ #include <linux/icmpv6.h> #include <linux/ip.h> #include <linux/ipv6.h> +#include <linux/ip.h> #include <net/sctp/checksum.h>
static bool nft_payload_rebuild_vlan_hdr(const struct sk_buff *skb, int mac_off, @@ -79,6 +80,45 @@ nft_payload_copy_vlan(u32 *d, const struct sk_buff *skb, u8 offset, u8 len) return skb_copy_bits(skb, offset + mac_off, dst_u8, len) == 0; }
+static int __nft_payload_inner_offset(struct nft_pktinfo *pkt) +{ + unsigned int thoff = nft_thoff(pkt); + + if (!(pkt->flags & NFT_PKTINFO_L4PROTO)) + return -1; + + switch (pkt->tprot) { + case IPPROTO_UDP: + pkt->inneroff = thoff + sizeof(struct udphdr); + break; + case IPPROTO_TCP: { + struct tcphdr *th, _tcph; + + th = skb_header_pointer(pkt->skb, thoff, sizeof(_tcph), &_tcph); + if (!th) + return -1; + + pkt->inneroff = thoff + __tcp_hdrlen(th); + } + break; + default: + return -1; + } + + pkt->flags |= NFT_PKTINFO_INNER; + + return 0; +} + +static int nft_payload_inner_offset(const struct nft_pktinfo *pkt) +{ + if (!(pkt->flags & NFT_PKTINFO_INNER) && + __nft_payload_inner_offset((struct nft_pktinfo *)pkt) < 0) + return -1; + + return pkt->inneroff; +} + void nft_payload_eval(const struct nft_expr *expr, struct nft_regs *regs, const struct nft_pktinfo *pkt) @@ -112,6 +152,11 @@ void nft_payload_eval(const struct nft_expr *expr, goto err; offset = nft_thoff(pkt); break; + case NFT_PAYLOAD_INNER_HEADER: + offset = nft_payload_inner_offset(pkt); + if (offset < 0) + goto err; + break; default: BUG(); } @@ -617,6 +662,11 @@ static void nft_payload_set_eval(const struct nft_expr *expr, goto err; offset = nft_thoff(pkt); break; + case NFT_PAYLOAD_INNER_HEADER: + offset = nft_payload_inner_offset(pkt); + if (offset < 0) + goto err; + break; default: BUG(); } @@ -625,7 +675,8 @@ static void nft_payload_set_eval(const struct nft_expr *expr, offset += priv->offset;
if ((priv->csum_type == NFT_PAYLOAD_CSUM_INET || priv->csum_flags) && - (priv->base != NFT_PAYLOAD_TRANSPORT_HEADER || + ((priv->base != NFT_PAYLOAD_TRANSPORT_HEADER && + priv->base != NFT_PAYLOAD_INNER_HEADER) || skb->ip_summed != CHECKSUM_PARTIAL)) { fsum = skb_checksum(skb, offset, priv->len, 0); tsum = csum_partial(src, priv->len, 0); @@ -744,6 +795,7 @@ nft_payload_select_ops(const struct nft_ctx *ctx, case NFT_PAYLOAD_LL_HEADER: case NFT_PAYLOAD_NETWORK_HEADER: case NFT_PAYLOAD_TRANSPORT_HEADER: + case NFT_PAYLOAD_INNER_HEADER: break; default: return ERR_PTR(-EOPNOTSUPP); @@ -762,7 +814,7 @@ nft_payload_select_ops(const struct nft_ctx *ctx, len = ntohl(nla_get_be32(tb[NFTA_PAYLOAD_LEN]));
if (len <= 4 && is_power_of_2(len) && IS_ALIGNED(offset, len) && - base != NFT_PAYLOAD_LL_HEADER) + base != NFT_PAYLOAD_LL_HEADER && base != NFT_PAYLOAD_INNER_HEADER) return &nft_payload_fast_ops; else return &nft_payload_ops;
From: Florian Westphal fw@strlen.de
[ Upstream commit a9e8503def0fd4ed89ade1f61c315f904581d439 ]
Loads relative to ->thoff naturally expect that this points to the transport header, but this is only true if pkt->fragoff == 0.
This has little effect for rulesets with connection tracking/nat because these enable ip defra. For other rulesets this prevents false matches.
Fixes: 96518518cc41 ("netfilter: add nftables") Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nft_exthdr.c | 2 +- net/netfilter/nft_payload.c | 9 +++++---- 2 files changed, 6 insertions(+), 5 deletions(-)
diff --git a/net/netfilter/nft_exthdr.c b/net/netfilter/nft_exthdr.c index dbe1f2e7dd9e..9e927ab4df15 100644 --- a/net/netfilter/nft_exthdr.c +++ b/net/netfilter/nft_exthdr.c @@ -167,7 +167,7 @@ nft_tcp_header_pointer(const struct nft_pktinfo *pkt, { struct tcphdr *tcph;
- if (pkt->tprot != IPPROTO_TCP) + if (pkt->tprot != IPPROTO_TCP || pkt->fragoff) return NULL;
tcph = skb_header_pointer(pkt->skb, nft_thoff(pkt), sizeof(*tcph), buffer); diff --git a/net/netfilter/nft_payload.c b/net/netfilter/nft_payload.c index ee359a4a60f5..b46e01365bd9 100644 --- a/net/netfilter/nft_payload.c +++ b/net/netfilter/nft_payload.c @@ -84,7 +84,7 @@ static int __nft_payload_inner_offset(struct nft_pktinfo *pkt) { unsigned int thoff = nft_thoff(pkt);
- if (!(pkt->flags & NFT_PKTINFO_L4PROTO)) + if (!(pkt->flags & NFT_PKTINFO_L4PROTO) || pkt->fragoff) return -1;
switch (pkt->tprot) { @@ -148,7 +148,7 @@ void nft_payload_eval(const struct nft_expr *expr, offset = skb_network_offset(skb); break; case NFT_PAYLOAD_TRANSPORT_HEADER: - if (!(pkt->flags & NFT_PKTINFO_L4PROTO)) + if (!(pkt->flags & NFT_PKTINFO_L4PROTO) || pkt->fragoff) goto err; offset = nft_thoff(pkt); break; @@ -658,7 +658,7 @@ static void nft_payload_set_eval(const struct nft_expr *expr, offset = skb_network_offset(skb); break; case NFT_PAYLOAD_TRANSPORT_HEADER: - if (!(pkt->flags & NFT_PKTINFO_L4PROTO)) + if (!(pkt->flags & NFT_PKTINFO_L4PROTO) || pkt->fragoff) goto err; offset = nft_thoff(pkt); break; @@ -697,7 +697,8 @@ static void nft_payload_set_eval(const struct nft_expr *expr, if (priv->csum_type == NFT_PAYLOAD_CSUM_SCTP && pkt->tprot == IPPROTO_SCTP && skb->ip_summed != CHECKSUM_PARTIAL) { - if (nft_payload_csum_sctp(skb, nft_thoff(pkt))) + if (pkt->fragoff == 0 && + nft_payload_csum_sctp(skb, nft_thoff(pkt))) goto err; }
From: Alexander Gordeev agordeev@linux.ibm.com
[ Upstream commit e3ec8e0f5711d73f7e5d5c3cffdf4fad4f1487b9 ]
The memory for amode31 section is allocated from the decompressed kernel. Instead, allocate that memory from the decompressor. This is a prerequisite to allow initialization of the virtual memory before the decompressed kernel takes over.
Reviewed-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Alexander Gordeev agordeev@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/boot/compressed/decompressor.h | 1 + arch/s390/boot/startup.c | 8 ++++++++ arch/s390/kernel/entry.h | 1 + arch/s390/kernel/setup.c | 22 +++++++++------------- arch/s390/kernel/vmlinux.lds.S | 1 + 5 files changed, 20 insertions(+), 13 deletions(-)
diff --git a/arch/s390/boot/compressed/decompressor.h b/arch/s390/boot/compressed/decompressor.h index a59f75c5b049..f75cc31a77dd 100644 --- a/arch/s390/boot/compressed/decompressor.h +++ b/arch/s390/boot/compressed/decompressor.h @@ -24,6 +24,7 @@ struct vmlinux_info { unsigned long dynsym_start; unsigned long rela_dyn_start; unsigned long rela_dyn_end; + unsigned long amode31_size; };
/* Symbols defined by linker scripts */ diff --git a/arch/s390/boot/startup.c b/arch/s390/boot/startup.c index b13352dd1e1c..1aa11a8f57dd 100644 --- a/arch/s390/boot/startup.c +++ b/arch/s390/boot/startup.c @@ -15,6 +15,7 @@ #include "uv.h"
unsigned long __bootdata_preserved(__kaslr_offset); +unsigned long __bootdata(__amode31_base); unsigned long __bootdata_preserved(VMALLOC_START); unsigned long __bootdata_preserved(VMALLOC_END); struct page *__bootdata_preserved(vmemmap); @@ -233,6 +234,12 @@ static void offset_vmlinux_info(unsigned long offset) vmlinux.dynsym_start += offset; }
+static unsigned long reserve_amode31(unsigned long safe_addr) +{ + __amode31_base = PAGE_ALIGN(safe_addr); + return safe_addr + vmlinux.amode31_size; +} + void startup_kernel(void) { unsigned long random_lma; @@ -247,6 +254,7 @@ void startup_kernel(void) setup_lpp(); store_ipl_parmblock(); safe_addr = mem_safe_offset(); + safe_addr = reserve_amode31(safe_addr); safe_addr = read_ipl_report(safe_addr); uv_query_info(); rescue_initrd(safe_addr); diff --git a/arch/s390/kernel/entry.h b/arch/s390/kernel/entry.h index 7f2696e8d511..6083090be1f4 100644 --- a/arch/s390/kernel/entry.h +++ b/arch/s390/kernel/entry.h @@ -70,5 +70,6 @@ extern struct exception_table_entry _stop_amode31_ex_table[]; #define __amode31_data __section(".amode31.data") #define __amode31_ref __section(".amode31.refs") extern long _start_amode31_refs[], _end_amode31_refs[]; +extern unsigned long __amode31_base;
#endif /* _ENTRY_H */ diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c index 8ede12c4ba6b..e38de9e8ee13 100644 --- a/arch/s390/kernel/setup.c +++ b/arch/s390/kernel/setup.c @@ -95,10 +95,10 @@ EXPORT_SYMBOL(console_irq); * relocated above 2 GB, because it has to use 31 bit addresses. * Such code and data is part of the .amode31 section. */ -unsigned long __amode31_ref __samode31 = __pa(&_samode31); -unsigned long __amode31_ref __eamode31 = __pa(&_eamode31); -unsigned long __amode31_ref __stext_amode31 = __pa(&_stext_amode31); -unsigned long __amode31_ref __etext_amode31 = __pa(&_etext_amode31); +unsigned long __amode31_ref __samode31 = (unsigned long)&_samode31; +unsigned long __amode31_ref __eamode31 = (unsigned long)&_eamode31; +unsigned long __amode31_ref __stext_amode31 = (unsigned long)&_stext_amode31; +unsigned long __amode31_ref __etext_amode31 = (unsigned long)&_etext_amode31; struct exception_table_entry __amode31_ref *__start_amode31_ex_table = _start_amode31_ex_table; struct exception_table_entry __amode31_ref *__stop_amode31_ex_table = _stop_amode31_ex_table;
@@ -149,6 +149,7 @@ struct mem_detect_info __bootdata(mem_detect); struct initrd_data __bootdata(initrd_data);
unsigned long __bootdata_preserved(__kaslr_offset); +unsigned long __bootdata(__amode31_base); unsigned int __bootdata_preserved(zlib_dfltcc_support); EXPORT_SYMBOL(zlib_dfltcc_support); u64 __bootdata_preserved(stfle_fac_list[16]); @@ -800,6 +801,7 @@ static void __init reserve_kernel(void)
memblock_reserve(0, STARTUP_NORMAL_OFFSET); memblock_reserve((unsigned long)sclp_early_sccb, EXT_SCCB_READ_SCP); + memblock_reserve(__amode31_base, __eamode31 - __samode31); memblock_reserve((unsigned long)_stext, PFN_PHYS(start_pfn) - (unsigned long)_stext); } @@ -820,20 +822,14 @@ static void __init setup_memory(void)
static void __init relocate_amode31_section(void) { - unsigned long amode31_addr, amode31_size; - long amode31_offset; + unsigned long amode31_size = __eamode31 - __samode31; + long amode31_offset = __amode31_base - __samode31; long *ptr;
- /* Allocate a new AMODE31 capable memory region */ - amode31_size = __eamode31 - __samode31; pr_info("Relocating AMODE31 section of size 0x%08lx\n", amode31_size); - amode31_addr = (unsigned long)memblock_alloc_low(amode31_size, PAGE_SIZE); - if (!amode31_addr) - panic("Failed to allocate memory for AMODE31 section\n"); - amode31_offset = amode31_addr - __samode31;
/* Move original AMODE31 section to the new one */ - memmove((void *)amode31_addr, (void *)__samode31, amode31_size); + memmove((void *)__amode31_base, (void *)__samode31, amode31_size); /* Zero out the old AMODE31 section to catch invalid accesses within it */ memset((void *)__samode31, 0, amode31_size);
diff --git a/arch/s390/kernel/vmlinux.lds.S b/arch/s390/kernel/vmlinux.lds.S index 63bdb9e1bfc1..42c43521878f 100644 --- a/arch/s390/kernel/vmlinux.lds.S +++ b/arch/s390/kernel/vmlinux.lds.S @@ -212,6 +212,7 @@ SECTIONS QUAD(__dynsym_start) /* dynsym_start */ QUAD(__rela_dyn_start) /* rela_dyn_start */ QUAD(__rela_dyn_end) /* rela_dyn_end */ + QUAD(_eamode31 - _samode31) /* amode31_size */ } :NONE
/* Debugging sections. */
From: Alexander Gordeev agordeev@linux.ibm.com
[ Upstream commit 04f11ed7d8e018e1f01ebda5814ddfeb3a1e6ae1 ]
memblock_reserve() function accepts physcal address of a memory block to be reserved, but provided with virtual memory pointers.
Reviewed-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Alexander Gordeev agordeev@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/kernel/setup.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c index e38de9e8ee13..2ebde341d057 100644 --- a/arch/s390/kernel/setup.c +++ b/arch/s390/kernel/setup.c @@ -797,13 +797,10 @@ static void __init check_initrd(void) */ static void __init reserve_kernel(void) { - unsigned long start_pfn = PFN_UP(__pa(_end)); - memblock_reserve(0, STARTUP_NORMAL_OFFSET); - memblock_reserve((unsigned long)sclp_early_sccb, EXT_SCCB_READ_SCP); memblock_reserve(__amode31_base, __eamode31 - __samode31); - memblock_reserve((unsigned long)_stext, PFN_PHYS(start_pfn) - - (unsigned long)_stext); + memblock_reserve(__pa(sclp_early_sccb), EXT_SCCB_READ_SCP); + memblock_reserve(__pa(_stext), _end - _stext); }
static void __init setup_memory(void)
From: Alexander Egorenkov egorenar@linux.ibm.com
[ Upstream commit 6b4b54c7ca347bcb4aa7a3cc01aa16e84ac7fbe4 ]
We need to preserve the values at OLDMEM_BASE and OLDMEM_SIZE which are used by zgetdump in case when kdump crashes. In that case zgetdump will attempt to read OLDMEM_BASE and OLDMEM_SIZE in order to find out where the memory range [0 - OLDMEM_SIZE] belonging to the production kernel is.
Fixes: f1a546947431 ("s390/setup: don't reserve memory that occupied decompressor's head") Cc: stable@vger.kernel.org # 5.15+ Signed-off-by: Alexander Egorenkov egorenar@linux.ibm.com Acked-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/kernel/setup.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/arch/s390/kernel/setup.c b/arch/s390/kernel/setup.c index 2ebde341d057..36c1f31dfd66 100644 --- a/arch/s390/kernel/setup.c +++ b/arch/s390/kernel/setup.c @@ -798,6 +798,8 @@ static void __init check_initrd(void) static void __init reserve_kernel(void) { memblock_reserve(0, STARTUP_NORMAL_OFFSET); + memblock_reserve(OLDMEM_BASE, sizeof(unsigned long)); + memblock_reserve(OLDMEM_SIZE, sizeof(unsigned long)); memblock_reserve(__amode31_base, __eamode31 - __samode31); memblock_reserve(__pa(sclp_early_sccb), EXT_SCCB_READ_SCP); memblock_reserve(__pa(_stext), _end - _stext);
From: Sukadev Bhattiprolu sukadev@linux.ibm.com
[ Upstream commit ae16bf15374d8b055e040ac6f3f1147ab1c9bb7d ]
We currently initialize the ->init_done completion/return code fields before issuing a CRQ_INIT command. But if we get a transport event soon after registering the CRQ the taskslet may already have recorded the completion and error code. If we initialize here, we might overwrite/ lose that and end up issuing the CRQ_INIT only to timeout later.
If that timeout happens during probe, we will leave the adapter in the DOWN state rather than retrying to register/init the CRQ.
Initialize the completion before registering the CRQ so we don't lose the notification.
Fixes: 032c5e82847a ("Driver for IBM System i/p VNIC protocol") Signed-off-by: Sukadev Bhattiprolu sukadev@linux.ibm.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/ibm/ibmvnic.c | 26 +++++++++++++++++++++----- 1 file changed, 21 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c index b262aa84b6a2..70267bd73429 100644 --- a/drivers/net/ethernet/ibm/ibmvnic.c +++ b/drivers/net/ethernet/ibm/ibmvnic.c @@ -2063,6 +2063,19 @@ static const char *reset_reason_to_string(enum ibmvnic_reset_reason reason) return "UNKNOWN"; }
+/* + * Initialize the init_done completion and return code values. We + * can get a transport event just after registering the CRQ and the + * tasklet will use this to communicate the transport event. To ensure + * we don't miss the notification/error, initialize these _before_ + * regisering the CRQ. + */ +static inline void reinit_init_done(struct ibmvnic_adapter *adapter) +{ + reinit_completion(&adapter->init_done); + adapter->init_done_rc = 0; +} + /* * do_reset returns zero if we are able to keep processing reset events, or * non-zero if we hit a fatal error and must halt. @@ -2169,6 +2182,8 @@ static int do_reset(struct ibmvnic_adapter *adapter, */ adapter->state = VNIC_PROBED;
+ reinit_init_done(adapter); + if (adapter->reset_reason == VNIC_RESET_CHANGE_PARAM) { rc = init_crq_queue(adapter); } else if (adapter->reset_reason == VNIC_RESET_MOBILITY) { @@ -2314,7 +2329,8 @@ static int do_hard_reset(struct ibmvnic_adapter *adapter, */ adapter->state = VNIC_PROBED;
- reinit_completion(&adapter->init_done); + reinit_init_done(adapter); + rc = init_crq_queue(adapter); if (rc) { netdev_err(adapter->netdev, @@ -5485,10 +5501,6 @@ static int ibmvnic_reset_init(struct ibmvnic_adapter *adapter, bool reset)
adapter->from_passive_init = false;
- if (reset) - reinit_completion(&adapter->init_done); - - adapter->init_done_rc = 0; rc = ibmvnic_send_crq_init(adapter); if (rc) { dev_err(dev, "Send crq init failed with error %d\n", rc); @@ -5502,12 +5514,14 @@ static int ibmvnic_reset_init(struct ibmvnic_adapter *adapter, bool reset)
if (adapter->init_done_rc) { release_crq_queue(adapter); + dev_err(dev, "CRQ-init failed, %d\n", adapter->init_done_rc); return adapter->init_done_rc; }
if (adapter->from_passive_init) { adapter->state = VNIC_OPEN; adapter->from_passive_init = false; + dev_err(dev, "CRQ-init failed, passive-init\n"); return -1; }
@@ -5596,6 +5610,8 @@ static int ibmvnic_probe(struct vio_dev *dev, const struct vio_device_id *id)
init_success = false; do { + reinit_init_done(adapter); + rc = init_crq_queue(adapter); if (rc) { dev_err(&dev->dev, "Couldn't initialize crq. rc=%d\n",
From: Sukadev Bhattiprolu sukadev@linux.ibm.com
[ Upstream commit f628ad531b4f34fdba0984255b4a2850dd369513 ]
Clear ->failover_pending flag that may have been set in the previous pass of registering CRQ. If we don't clear, a subsequent ibmvnic_open() call would be misled into thinking a failover is pending and assuming that the reset worker thread would open the adapter. If this pass of registering the CRQ succeeds (i.e there is no transport event), there wouldn't be a reset worker thread.
This would leave the adapter unconfigured and require manual intervention to bring it up during boot.
Fixes: 5a18e1e0c193 ("ibmvnic: Fix failover case for non-redundant configuration") Signed-off-by: Sukadev Bhattiprolu sukadev@linux.ibm.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/ibm/ibmvnic.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c index 70267bd73429..9f4f40564d74 100644 --- a/drivers/net/ethernet/ibm/ibmvnic.c +++ b/drivers/net/ethernet/ibm/ibmvnic.c @@ -5612,6 +5612,11 @@ static int ibmvnic_probe(struct vio_dev *dev, const struct vio_device_id *id) do { reinit_init_done(adapter);
+ /* clear any failovers we got in the previous pass + * since we are reinitializing the CRQ + */ + adapter->failover_pending = false; + rc = init_crq_queue(adapter); if (rc) { dev_err(&dev->dev, "Couldn't initialize crq. rc=%d\n",
From: Sukadev Bhattiprolu sukadev@linux.ibm.com
[ Upstream commit fd98693cb0721317f27341951593712c580c36a1 ]
We currently don't allow queuing resets when adapter is in VNIC_PROBING state - instead we throw away the reset and return EBUSY. The reasoning is probably that during ibmvnic_probe() the ibmvnic_adapter itself is being initialized so performing a reset during this time can lead us to accessing fields in the ibmvnic_adapter that are not fully initialized. A review of the code shows that all the adapter state neede to process a reset is initialized before registering the CRQ so that should no longer be a concern.
Further the expectation is that if we do get a reset (transport event) during probe, the do..while() loop in ibmvnic_probe() will handle this by reinitializing the CRQ.
While that is true to some extent, it is possible that the reset might occur _after_ the CRQ is registered and CRQ_INIT message was exchanged but _before_ the adapter state is set to VNIC_PROBED. As mentioned above, such a reset will be thrown away. While the client assumes that the adapter is functional, the vnic server will wait for the client to reinit the adapter. This disconnect between the two leaves the adapter down needing manual intervention.
Because ibmvnic_probe() has other work to do after initializing the CRQ (such as registering the netdev at a minimum) and because the reset event can occur at any instant after the CRQ is initialized, there will always be a window between initializing the CRQ and considering the adapter ready for resets (ie state == PROBED).
So rather than discarding resets during this window, allow queueing them - but only process them after the adapter is fully initialized.
To do this, introduce a new completion state ->probe_done and have the reset worker thread wait on this before processing resets.
This change brings up two new situations in or just after ibmvnic_probe(). First after one or more resets were queued, we encounter an error and decide to retry the initialization. At that point the queued resets are no longer relevant since we could be talking to a new vnic server. So we must purge/flush the queued resets before restarting the initialization. As a side note, since we are still in the probing stage and we have not registered the netdev, it will not be CHANGE_PARAM reset.
Second this change opens up a potential race between the worker thread in __ibmvnic_reset(), the tasklet and the ibmvnic_open() due to the following sequence of events:
1. Register CRQ 2. Get transport event before CRQ_INIT completes. 3. Tasklet schedules reset: a) add rwi to list b) schedule_work() to start worker thread which runs and waits for ->probe_done. 4. ibmvnic_probe() decides to retry, purges rwi_list 5. Re-register crq and this time rest of probe succeeds - register netdev and complete(->probe_done). 6. Worker thread resumes in __ibmvnic_reset() from 3b. 7. Worker thread sets ->resetting bit 8. ibmvnic_open() comes in, notices ->resetting bit, sets state to IBMVNIC_OPEN and returns early expecting worker thread to finish the open. 9. Worker thread finds rwi_list empty and returns without opening the interface.
If this happens, the ->ndo_open() call is effectively lost and the interface remains down. To address this, ensure that ->rwi_list is not empty before setting the ->resetting bit. See also comments in __ibmvnic_reset().
Fixes: 6a2fb0e99f9c ("ibmvnic: driver initialization for kdump/kexec") Signed-off-by: Sukadev Bhattiprolu sukadev@linux.ibm.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/ibm/ibmvnic.c | 107 ++++++++++++++++++++++++++--- drivers/net/ethernet/ibm/ibmvnic.h | 1 + 2 files changed, 98 insertions(+), 10 deletions(-)
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c index 9f4f40564d74..28344c3dfea1 100644 --- a/drivers/net/ethernet/ibm/ibmvnic.c +++ b/drivers/net/ethernet/ibm/ibmvnic.c @@ -2471,23 +2471,82 @@ static int do_passive_init(struct ibmvnic_adapter *adapter) static void __ibmvnic_reset(struct work_struct *work) { struct ibmvnic_adapter *adapter; - bool saved_state = false; + unsigned int timeout = 5000; struct ibmvnic_rwi *tmprwi; + bool saved_state = false; struct ibmvnic_rwi *rwi; unsigned long flags; - u32 reset_state; + struct device *dev; + bool need_reset; int num_fails = 0; + u32 reset_state; int rc = 0;
adapter = container_of(work, struct ibmvnic_adapter, ibmvnic_reset); + dev = &adapter->vdev->dev;
- if (test_and_set_bit_lock(0, &adapter->resetting)) { + /* Wait for ibmvnic_probe() to complete. If probe is taking too long + * or if another reset is in progress, defer work for now. If probe + * eventually fails it will flush and terminate our work. + * + * Three possibilities here: + * 1. Adpater being removed - just return + * 2. Timed out on probe or another reset in progress - delay the work + * 3. Completed probe - perform any resets in queue + */ + if (adapter->state == VNIC_PROBING && + !wait_for_completion_timeout(&adapter->probe_done, timeout)) { + dev_err(dev, "Reset thread timed out on probe"); queue_delayed_work(system_long_wq, &adapter->ibmvnic_delayed_reset, IBMVNIC_RESET_DELAY); return; }
+ /* adapter is done with probe (i.e state is never VNIC_PROBING now) */ + if (adapter->state == VNIC_REMOVING) + return; + + /* ->rwi_list is stable now (no one else is removing entries) */ + + /* ibmvnic_probe() may have purged the reset queue after we were + * scheduled to process a reset so there maybe no resets to process. + * Before setting the ->resetting bit though, we have to make sure + * that there is infact a reset to process. Otherwise we may race + * with ibmvnic_open() and end up leaving the vnic down: + * + * __ibmvnic_reset() ibmvnic_open() + * ----------------- -------------- + * + * set ->resetting bit + * find ->resetting bit is set + * set ->state to IBMVNIC_OPEN (i.e + * assume reset will open device) + * return + * find reset queue empty + * return + * + * Neither performed vnic login/open and vnic stays down + * + * If we hold the lock and conditionally set the bit, either we + * or ibmvnic_open() will complete the open. + */ + need_reset = false; + spin_lock(&adapter->rwi_lock); + if (!list_empty(&adapter->rwi_list)) { + if (test_and_set_bit_lock(0, &adapter->resetting)) { + queue_delayed_work(system_long_wq, + &adapter->ibmvnic_delayed_reset, + IBMVNIC_RESET_DELAY); + } else { + need_reset = true; + } + } + spin_unlock(&adapter->rwi_lock); + + if (!need_reset) + return; + rwi = get_next_rwi(adapter); while (rwi) { spin_lock_irqsave(&adapter->state_lock, flags); @@ -2639,13 +2698,6 @@ static int ibmvnic_reset(struct ibmvnic_adapter *adapter, goto err; }
- if (adapter->state == VNIC_PROBING) { - netdev_warn(netdev, "Adapter reset during probe\n"); - adapter->init_done_rc = -EAGAIN; - ret = EAGAIN; - goto err; - } - list_for_each_entry(tmp, &adapter->rwi_list, list) { if (tmp->reset_reason == reason) { netdev_dbg(netdev, "Skipping matching reset, reason=%s\n", @@ -5561,6 +5613,7 @@ static int ibmvnic_probe(struct vio_dev *dev, const struct vio_device_id *id) struct ibmvnic_adapter *adapter; struct net_device *netdev; unsigned char *mac_addr_p; + unsigned long flags; bool init_success; int rc;
@@ -5602,6 +5655,7 @@ static int ibmvnic_probe(struct vio_dev *dev, const struct vio_device_id *id) spin_lock_init(&adapter->rwi_lock); spin_lock_init(&adapter->state_lock); mutex_init(&adapter->fw_lock); + init_completion(&adapter->probe_done); init_completion(&adapter->init_done); init_completion(&adapter->fw_done); init_completion(&adapter->reset_done); @@ -5617,6 +5671,26 @@ static int ibmvnic_probe(struct vio_dev *dev, const struct vio_device_id *id) */ adapter->failover_pending = false;
+ /* If we had already initialized CRQ, we may have one or + * more resets queued already. Discard those and release + * the CRQ before initializing the CRQ again. + */ + release_crq_queue(adapter); + + /* Since we are still in PROBING state, __ibmvnic_reset() + * will not access the ->rwi_list and since we released CRQ, + * we won't get _new_ transport events. But there maybe an + * ongoing ibmvnic_reset() call. So serialize access to + * rwi_list. If we win the race, ibvmnic_reset() could add + * a reset after we purged but thats ok - we just may end + * up with an extra reset (i.e similar to having two or more + * resets in the queue at once). + * CHECK. + */ + spin_lock_irqsave(&adapter->rwi_lock, flags); + flush_reset_queue(adapter); + spin_unlock_irqrestore(&adapter->rwi_lock, flags); + rc = init_crq_queue(adapter); if (rc) { dev_err(&dev->dev, "Couldn't initialize crq. rc=%d\n", @@ -5668,6 +5742,8 @@ static int ibmvnic_probe(struct vio_dev *dev, const struct vio_device_id *id) } dev_info(&dev->dev, "ibmvnic registered\n");
+ complete(&adapter->probe_done); + return 0;
ibmvnic_register_fail: @@ -5682,6 +5758,17 @@ static int ibmvnic_probe(struct vio_dev *dev, const struct vio_device_id *id) ibmvnic_init_fail: release_sub_crqs(adapter, 1); release_crq_queue(adapter); + + /* cleanup worker thread after releasing CRQ so we don't get + * transport events (i.e new work items for the worker thread). + */ + adapter->state = VNIC_REMOVING; + complete(&adapter->probe_done); + flush_work(&adapter->ibmvnic_reset); + flush_delayed_work(&adapter->ibmvnic_delayed_reset); + + flush_reset_queue(adapter); + mutex_destroy(&adapter->fw_lock); free_netdev(netdev);
diff --git a/drivers/net/ethernet/ibm/ibmvnic.h b/drivers/net/ethernet/ibm/ibmvnic.h index 1a9ed9202654..b01c439965ff 100644 --- a/drivers/net/ethernet/ibm/ibmvnic.h +++ b/drivers/net/ethernet/ibm/ibmvnic.h @@ -927,6 +927,7 @@ struct ibmvnic_adapter {
struct ibmvnic_tx_pool *tx_pool; struct ibmvnic_tx_pool *tso_pool; + struct completion probe_done; struct completion init_done; int init_done_rc;
From: Max Gurtovoy mgurtovoy@nvidia.com
[ Upstream commit 02746e26c39ee473b975e0f68d1295abc92672ed ]
No need to pre-allocate a big buffer for the IO SGL anymore. If a device has lots of deep queues, preallocation for the sg list can consume substantial amounts of memory. For HW virtio-blk device, nr_hw_queues can be 64 or 128 and each queue's depth might be 128. This means the resulting preallocation for the data SGLs is big.
Switch to runtime allocation for SGL for lists longer than 2 entries. This is the approach used by NVMe drivers so it should be reasonable for virtio block as well. Runtime SGL allocation has always been the case for the legacy I/O path so this is nothing new.
The preallocated small SGL depends on SG_CHAIN so if the ARCH doesn't support SG_CHAIN, use only runtime allocation for the SGL.
Re-organize the setup of the IO request to fit the new sg chain mechanism.
No performance degradation was seen (fio libaio engine with 16 jobs and 128 iodepth):
IO size IOPs Rand Read (before/after) IOPs Rand Write (before/after) -------- --------------------------------- ---------------------------------- 512B 318K/316K 329K/325K
4KB 323K/321K 353K/349K
16KB 199K/208K 250K/275K
128KB 36K/36.1K 39.2K/41.7K
Signed-off-by: Max Gurtovoy mgurtovoy@nvidia.com Reviewed-by: Israel Rukshin israelr@nvidia.com Link: https://lore.kernel.org/r/20210901131434.31158-1-mgurtovoy@nvidia.com Reviewed-by: Feng Li lifeng1519@gmail.com Reviewed-by: Stefan Hajnoczi stefanha@redhat.com Tested-by: Stefan Hajnoczi stefanha@redhat.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Arnd Bergmann arnd@arndb.de # kconfig fixups Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/Kconfig | 1 + drivers/block/virtio_blk.c | 156 ++++++++++++++++++++++++------------- 2 files changed, 101 insertions(+), 56 deletions(-)
diff --git a/drivers/block/Kconfig b/drivers/block/Kconfig index f93cb989241c..28ed157b1203 100644 --- a/drivers/block/Kconfig +++ b/drivers/block/Kconfig @@ -410,6 +410,7 @@ config XEN_BLKDEV_BACKEND config VIRTIO_BLK tristate "Virtio block driver" depends on VIRTIO + select SG_POOL help This is the virtual block driver for virtio. It can be used with QEMU based VMMs (like KVM or Xen). Say Y or M. diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c index c05138a28475..ccc5770d7654 100644 --- a/drivers/block/virtio_blk.c +++ b/drivers/block/virtio_blk.c @@ -24,6 +24,12 @@ /* The maximum number of sg elements that fit into a virtqueue */ #define VIRTIO_BLK_MAX_SG_ELEMS 32768
+#ifdef CONFIG_ARCH_NO_SG_CHAIN +#define VIRTIO_BLK_INLINE_SG_CNT 0 +#else +#define VIRTIO_BLK_INLINE_SG_CNT 2 +#endif + static int major; static DEFINE_IDA(vd_index_ida);
@@ -77,6 +83,7 @@ struct virtio_blk { struct virtblk_req { struct virtio_blk_outhdr out_hdr; u8 status; + struct sg_table sg_table; struct scatterlist sg[]; };
@@ -162,12 +169,92 @@ static int virtblk_setup_discard_write_zeroes(struct request *req, bool unmap) return 0; }
-static inline void virtblk_request_done(struct request *req) +static void virtblk_unmap_data(struct request *req, struct virtblk_req *vbr) { - struct virtblk_req *vbr = blk_mq_rq_to_pdu(req); + if (blk_rq_nr_phys_segments(req)) + sg_free_table_chained(&vbr->sg_table, + VIRTIO_BLK_INLINE_SG_CNT); +} + +static int virtblk_map_data(struct blk_mq_hw_ctx *hctx, struct request *req, + struct virtblk_req *vbr) +{ + int err; + + if (!blk_rq_nr_phys_segments(req)) + return 0; + + vbr->sg_table.sgl = vbr->sg; + err = sg_alloc_table_chained(&vbr->sg_table, + blk_rq_nr_phys_segments(req), + vbr->sg_table.sgl, + VIRTIO_BLK_INLINE_SG_CNT); + if (unlikely(err)) + return -ENOMEM;
+ return blk_rq_map_sg(hctx->queue, req, vbr->sg_table.sgl); +} + +static void virtblk_cleanup_cmd(struct request *req) +{ if (req->rq_flags & RQF_SPECIAL_PAYLOAD) kfree(bvec_virt(&req->special_vec)); +} + +static int virtblk_setup_cmd(struct virtio_device *vdev, struct request *req, + struct virtblk_req *vbr) +{ + bool unmap = false; + u32 type; + + vbr->out_hdr.sector = 0; + + switch (req_op(req)) { + case REQ_OP_READ: + type = VIRTIO_BLK_T_IN; + vbr->out_hdr.sector = cpu_to_virtio64(vdev, + blk_rq_pos(req)); + break; + case REQ_OP_WRITE: + type = VIRTIO_BLK_T_OUT; + vbr->out_hdr.sector = cpu_to_virtio64(vdev, + blk_rq_pos(req)); + break; + case REQ_OP_FLUSH: + type = VIRTIO_BLK_T_FLUSH; + break; + case REQ_OP_DISCARD: + type = VIRTIO_BLK_T_DISCARD; + break; + case REQ_OP_WRITE_ZEROES: + type = VIRTIO_BLK_T_WRITE_ZEROES; + unmap = !(req->cmd_flags & REQ_NOUNMAP); + break; + case REQ_OP_DRV_IN: + type = VIRTIO_BLK_T_GET_ID; + break; + default: + WARN_ON_ONCE(1); + return BLK_STS_IOERR; + } + + vbr->out_hdr.type = cpu_to_virtio32(vdev, type); + vbr->out_hdr.ioprio = cpu_to_virtio32(vdev, req_get_ioprio(req)); + + if (type == VIRTIO_BLK_T_DISCARD || type == VIRTIO_BLK_T_WRITE_ZEROES) { + if (virtblk_setup_discard_write_zeroes(req, unmap)) + return BLK_STS_RESOURCE; + } + + return 0; +} + +static inline void virtblk_request_done(struct request *req) +{ + struct virtblk_req *vbr = blk_mq_rq_to_pdu(req); + + virtblk_unmap_data(req, vbr); + virtblk_cleanup_cmd(req); blk_mq_end_request(req, virtblk_result(vbr)); }
@@ -225,57 +312,23 @@ static blk_status_t virtio_queue_rq(struct blk_mq_hw_ctx *hctx, int qid = hctx->queue_num; int err; bool notify = false; - bool unmap = false; - u32 type;
BUG_ON(req->nr_phys_segments + 2 > vblk->sg_elems);
- switch (req_op(req)) { - case REQ_OP_READ: - case REQ_OP_WRITE: - type = 0; - break; - case REQ_OP_FLUSH: - type = VIRTIO_BLK_T_FLUSH; - break; - case REQ_OP_DISCARD: - type = VIRTIO_BLK_T_DISCARD; - break; - case REQ_OP_WRITE_ZEROES: - type = VIRTIO_BLK_T_WRITE_ZEROES; - unmap = !(req->cmd_flags & REQ_NOUNMAP); - break; - case REQ_OP_DRV_IN: - type = VIRTIO_BLK_T_GET_ID; - break; - default: - WARN_ON_ONCE(1); - return BLK_STS_IOERR; - } - - vbr->out_hdr.type = cpu_to_virtio32(vblk->vdev, type); - vbr->out_hdr.sector = type ? - 0 : cpu_to_virtio64(vblk->vdev, blk_rq_pos(req)); - vbr->out_hdr.ioprio = cpu_to_virtio32(vblk->vdev, req_get_ioprio(req)); + err = virtblk_setup_cmd(vblk->vdev, req, vbr); + if (unlikely(err)) + return err;
blk_mq_start_request(req);
- if (type == VIRTIO_BLK_T_DISCARD || type == VIRTIO_BLK_T_WRITE_ZEROES) { - err = virtblk_setup_discard_write_zeroes(req, unmap); - if (err) - return BLK_STS_RESOURCE; - } - - num = blk_rq_map_sg(hctx->queue, req, vbr->sg); - if (num) { - if (rq_data_dir(req) == WRITE) - vbr->out_hdr.type |= cpu_to_virtio32(vblk->vdev, VIRTIO_BLK_T_OUT); - else - vbr->out_hdr.type |= cpu_to_virtio32(vblk->vdev, VIRTIO_BLK_T_IN); + num = virtblk_map_data(hctx, req, vbr); + if (unlikely(num < 0)) { + virtblk_cleanup_cmd(req); + return BLK_STS_RESOURCE; }
spin_lock_irqsave(&vblk->vqs[qid].lock, flags); - err = virtblk_add_req(vblk->vqs[qid].vq, vbr, vbr->sg, num); + err = virtblk_add_req(vblk->vqs[qid].vq, vbr, vbr->sg_table.sgl, num); if (err) { virtqueue_kick(vblk->vqs[qid].vq); /* Don't stop the queue if -ENOMEM: we may have failed to @@ -284,6 +337,8 @@ static blk_status_t virtio_queue_rq(struct blk_mq_hw_ctx *hctx, if (err == -ENOSPC) blk_mq_stop_hw_queue(hctx); spin_unlock_irqrestore(&vblk->vqs[qid].lock, flags); + virtblk_unmap_data(req, vbr); + virtblk_cleanup_cmd(req); switch (err) { case -ENOSPC: return BLK_STS_DEV_RESOURCE; @@ -660,16 +715,6 @@ static const struct attribute_group *virtblk_attr_groups[] = { NULL, };
-static int virtblk_init_request(struct blk_mq_tag_set *set, struct request *rq, - unsigned int hctx_idx, unsigned int numa_node) -{ - struct virtio_blk *vblk = set->driver_data; - struct virtblk_req *vbr = blk_mq_rq_to_pdu(rq); - - sg_init_table(vbr->sg, vblk->sg_elems); - return 0; -} - static int virtblk_map_queues(struct blk_mq_tag_set *set) { struct virtio_blk *vblk = set->driver_data; @@ -682,7 +727,6 @@ static const struct blk_mq_ops virtio_mq_ops = { .queue_rq = virtio_queue_rq, .commit_rqs = virtio_commit_rqs, .complete = virtblk_request_done, - .init_request = virtblk_init_request, .map_queues = virtblk_map_queues, };
@@ -762,7 +806,7 @@ static int virtblk_probe(struct virtio_device *vdev) vblk->tag_set.flags = BLK_MQ_F_SHOULD_MERGE; vblk->tag_set.cmd_size = sizeof(struct virtblk_req) + - sizeof(struct scatterlist) * sg_elems; + sizeof(struct scatterlist) * VIRTIO_BLK_INLINE_SG_CNT; vblk->tag_set.driver_data = vblk; vblk->tag_set.nr_hw_queues = vblk->num_vqs;
From: Jens Axboe axboe@kernel.dk
[ Upstream commit f63cf5192fe3418ad5ae1a4412eba5694b145f79 ]
Ensure that we call fsnotify_modify() if we write a file, and that we do fsnotify_access() if we read it. This enables anyone using inotify on the file to get notified.
Ditto for fallocate, ensure that fsnotify_modify() is called.
Cc: stable@vger.kernel.org Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/io_uring.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c index 189323740324..0c5dcda0b622 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -2686,8 +2686,12 @@ static bool io_rw_should_reissue(struct io_kiocb *req)
static bool __io_complete_rw_common(struct io_kiocb *req, long res) { - if (req->rw.kiocb.ki_flags & IOCB_WRITE) + if (req->rw.kiocb.ki_flags & IOCB_WRITE) { kiocb_end_write(req); + fsnotify_modify(req->file); + } else { + fsnotify_access(req->file); + } if (res != req->result) { if ((res == -EAGAIN || res == -EOPNOTSUPP) && io_rw_should_reissue(req)) { @@ -4183,6 +4187,8 @@ static int io_fallocate(struct io_kiocb *req, unsigned int issue_flags) req->sync.len); if (ret < 0) req_set_fail(req); + else + fsnotify_modify(req->file); io_req_complete(req, ret); return 0; }
From: Pavel Begunkov asml.silence@gmail.com
[ Upstream commit 3caee4634be68e755d2fb130962f1623661dbd5b ]
Convert bdev->bd_disk->queue to bdev_get_queue(), it's uses a cached queue pointer and so is faster.
Signed-off-by: Pavel Begunkov asml.silence@gmail.com Link: https://lore.kernel.org/r/85c36ea784d285a5075baa10049e6b59e15fb484.163421954... Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/bio.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/block/bio.c b/block/bio.c index 8381c6690dd6..365bb6362669 100644 --- a/block/bio.c +++ b/block/bio.c @@ -910,7 +910,7 @@ EXPORT_SYMBOL(bio_add_pc_page); int bio_add_zone_append_page(struct bio *bio, struct page *page, unsigned int len, unsigned int offset) { - struct request_queue *q = bio->bi_bdev->bd_disk->queue; + struct request_queue *q = bdev_get_queue(bio->bi_bdev); bool same_page = false;
if (WARN_ON_ONCE(bio_op(bio) != REQ_OP_ZONE_APPEND)) @@ -1054,7 +1054,7 @@ static int bio_iov_bvec_set(struct bio *bio, struct iov_iter *iter)
static int bio_iov_bvec_set_append(struct bio *bio, struct iov_iter *iter) { - struct request_queue *q = bio->bi_bdev->bd_disk->queue; + struct request_queue *q = bdev_get_queue(bio->bi_bdev); struct iov_iter i = *iter;
iov_iter_truncate(&i, queue_max_zone_append_sectors(q) << 9); @@ -1132,7 +1132,7 @@ static int __bio_iov_append_get_pages(struct bio *bio, struct iov_iter *iter) { unsigned short nr_pages = bio->bi_max_vecs - bio->bi_vcnt; unsigned short entries_left = bio->bi_max_vecs - bio->bi_vcnt; - struct request_queue *q = bio->bi_bdev->bd_disk->queue; + struct request_queue *q = bdev_get_queue(bio->bi_bdev); unsigned int max_append_sectors = queue_max_zone_append_sectors(q); struct bio_vec *bv = bio->bi_io_vec + bio->bi_vcnt; struct page **pages = (struct page **)bv; @@ -1471,10 +1471,10 @@ void bio_endio(struct bio *bio) return;
if (bio->bi_bdev && bio_flagged(bio, BIO_TRACKED)) - rq_qos_done_bio(bio->bi_bdev->bd_disk->queue, bio); + rq_qos_done_bio(bdev_get_queue(bio->bi_bdev), bio);
if (bio->bi_bdev && bio_flagged(bio, BIO_TRACE_COMPLETION)) { - trace_block_bio_complete(bio->bi_bdev->bd_disk->queue, bio); + trace_block_bio_complete(bdev_get_queue(bio->bi_bdev), bio); bio_clear_flag(bio, BIO_TRACE_COMPLETION); }
From: Jens Axboe axboe@kernel.dk
[ Upstream commit 90b8faa0e8de1b02b619fb33f6c6e1e13e7d1d70 ]
We set BIO_TRACKED unconditionally when rq_qos_throttle() is called, even though we may not even have an rq_qos handler. Only mark it as TRACKED if it really is potentially tracked.
This saves considerable time for the case where the bio isn't tracked:
2.64% -1.65% [kernel.vmlinux] [k] bio_endio
Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/blk-rq-qos.h | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/block/blk-rq-qos.h b/block/blk-rq-qos.h index f000f83e0621..3cfbc8668cba 100644 --- a/block/blk-rq-qos.h +++ b/block/blk-rq-qos.h @@ -189,9 +189,10 @@ static inline void rq_qos_throttle(struct request_queue *q, struct bio *bio) * BIO_TRACKED lets controllers know that a bio went through the * normal rq_qos path. */ - bio_set_flag(bio, BIO_TRACKED); - if (q->rq_qos) + if (q->rq_qos) { + bio_set_flag(bio, BIO_TRACKED); __rq_qos_throttle(q->rq_qos, bio); + } }
static inline void rq_qos_track(struct request_queue *q, struct request *rq,
From: Tejun Heo tj@kernel.org
[ Upstream commit aa1b46dcdc7baaf5fec0be25782ef24b26aa209e ]
a647a524a467 ("block: don't call rq_qos_ops->done_bio if the bio isn't tracked") made bio_endio() skip rq_qos_done_bio() if BIO_TRACKED is not set. While this fixed a potential oops, it also broke blk-iocost by skipping the done_bio callback for merged bios.
Before, whether a bio goes through rq_qos_throttle() or rq_qos_merge(), rq_qos_done_bio() would be called on the bio on completion with BIO_TRACKED distinguishing the former from the latter. rq_qos_done_bio() is not called for bios which wenth through rq_qos_merge(). This royally confuses blk-iocost as the merged bios never finish and are considered perpetually in-flight.
One reliably reproducible failure mode is an intermediate cgroup geting stuck active preventing its children from being activated due to the leaf-only rule, leading to loss of control. The following is from resctl-bench protection scenario which emulates isolating a web server like workload from a memory bomb run on an iocost configuration which should yield a reasonable level of protection.
# cat /sys/block/nvme2n1/device/model Samsung SSD 970 PRO 512GB # cat /sys/fs/cgroup/io.cost.model 259:0 ctrl=user model=linear rbps=834913556 rseqiops=93622 rrandiops=102913 wbps=618985353 wseqiops=72325 wrandiops=71025 # cat /sys/fs/cgroup/io.cost.qos 259:0 enable=1 ctrl=user rpct=95.00 rlat=18776 wpct=95.00 wlat=8897 min=60.00 max=100.00 # resctl-bench -m 29.6G -r out.json run protection::scenario=mem-hog,loops=1 ... Memory Hog Summary ==================
IO Latency: R p50=242u:336u/2.5m p90=794u:1.4m/7.5m p99=2.7m:8.0m/62.5m max=8.0m:36.4m/350m W p50=221u:323u/1.5m p90=709u:1.2m/5.5m p99=1.5m:2.5m/9.5m max=6.9m:35.9m/350m
Isolation and Request Latency Impact Distributions:
min p01 p05 p10 p25 p50 p75 p90 p95 p99 max mean stdev isol% 15.90 15.90 15.90 40.05 57.24 59.07 60.01 74.63 74.63 90.35 90.35 58.12 15.82 lat-imp% 0 0 0 0 0 4.55 14.68 15.54 233.5 548.1 548.1 53.88 143.6
Result: isol=58.12:15.82% lat_imp=53.88%:143.6 work_csv=100.0% missing=3.96%
The isolation result of 58.12% is close to what this device would show without any IO control.
Fix it by introducing a new flag BIO_QOS_MERGED to mark merged bios and calling rq_qos_done_bio() on them too. For consistency and clarity, rename BIO_TRACKED to BIO_QOS_THROTTLED. The flag checks are moved into rq_qos_done_bio() so that it's next to the code paths that set the flags.
With the patch applied, the above same benchmark shows:
# resctl-bench -m 29.6G -r out.json run protection::scenario=mem-hog,loops=1 ... Memory Hog Summary ==================
IO Latency: R p50=123u:84.4u/985u p90=322u:256u/2.5m p99=1.6m:1.4m/9.5m max=11.1m:36.0m/350m W p50=429u:274u/995u p90=1.7m:1.3m/4.5m p99=3.4m:2.7m/11.5m max=7.9m:5.9m/26.5m
Isolation and Request Latency Impact Distributions:
min p01 p05 p10 p25 p50 p75 p90 p95 p99 max mean stdev isol% 84.91 84.91 89.51 90.73 92.31 94.49 96.36 98.04 98.71 100.0 100.0 94.42 2.81 lat-imp% 0 0 0 0 0 2.81 5.73 11.11 13.92 17.53 22.61 4.10 4.68
Result: isol=94.42:2.81% lat_imp=4.10%:4.68 work_csv=58.34% missing=0%
Signed-off-by: Tejun Heo tj@kernel.org Fixes: a647a524a467 ("block: don't call rq_qos_ops->done_bio if the bio isn't tracked") Cc: stable@vger.kernel.org # v5.15+ Cc: Ming Lei ming.lei@redhat.com Cc: Yu Kuai yukuai3@huawei.com Reviewed-by: Ming Lei ming.lei@redhat.com Link: https://lore.kernel.org/r/Yi7rdrzQEHjJLGKB@slm.duckdns.org Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/bio.c | 3 +-- block/blk-iolatency.c | 2 +- block/blk-rq-qos.h | 20 +++++++++++--------- include/linux/blk_types.h | 3 ++- 4 files changed, 15 insertions(+), 13 deletions(-)
diff --git a/block/bio.c b/block/bio.c index 365bb6362669..b8a8bfba714f 100644 --- a/block/bio.c +++ b/block/bio.c @@ -1470,8 +1470,7 @@ void bio_endio(struct bio *bio) if (!bio_integrity_endio(bio)) return;
- if (bio->bi_bdev && bio_flagged(bio, BIO_TRACKED)) - rq_qos_done_bio(bdev_get_queue(bio->bi_bdev), bio); + rq_qos_done_bio(bio);
if (bio->bi_bdev && bio_flagged(bio, BIO_TRACE_COMPLETION)) { trace_block_bio_complete(bdev_get_queue(bio->bi_bdev), bio); diff --git a/block/blk-iolatency.c b/block/blk-iolatency.c index ce3847499d85..d85f30a85ee7 100644 --- a/block/blk-iolatency.c +++ b/block/blk-iolatency.c @@ -601,7 +601,7 @@ static void blkcg_iolatency_done_bio(struct rq_qos *rqos, struct bio *bio) int inflight = 0;
blkg = bio->bi_blkg; - if (!blkg || !bio_flagged(bio, BIO_TRACKED)) + if (!blkg || !bio_flagged(bio, BIO_QOS_THROTTLED)) return;
iolat = blkg_to_lat(bio->bi_blkg); diff --git a/block/blk-rq-qos.h b/block/blk-rq-qos.h index 3cfbc8668cba..68267007da1c 100644 --- a/block/blk-rq-qos.h +++ b/block/blk-rq-qos.h @@ -177,20 +177,20 @@ static inline void rq_qos_requeue(struct request_queue *q, struct request *rq) __rq_qos_requeue(q->rq_qos, rq); }
-static inline void rq_qos_done_bio(struct request_queue *q, struct bio *bio) +static inline void rq_qos_done_bio(struct bio *bio) { - if (q->rq_qos) - __rq_qos_done_bio(q->rq_qos, bio); + if (bio->bi_bdev && (bio_flagged(bio, BIO_QOS_THROTTLED) || + bio_flagged(bio, BIO_QOS_MERGED))) { + struct request_queue *q = bdev_get_queue(bio->bi_bdev); + if (q->rq_qos) + __rq_qos_done_bio(q->rq_qos, bio); + } }
static inline void rq_qos_throttle(struct request_queue *q, struct bio *bio) { - /* - * BIO_TRACKED lets controllers know that a bio went through the - * normal rq_qos path. - */ if (q->rq_qos) { - bio_set_flag(bio, BIO_TRACKED); + bio_set_flag(bio, BIO_QOS_THROTTLED); __rq_qos_throttle(q->rq_qos, bio); } } @@ -205,8 +205,10 @@ static inline void rq_qos_track(struct request_queue *q, struct request *rq, static inline void rq_qos_merge(struct request_queue *q, struct request *rq, struct bio *bio) { - if (q->rq_qos) + if (q->rq_qos) { + bio_set_flag(bio, BIO_QOS_MERGED); __rq_qos_merge(q->rq_qos, rq, bio); + } }
static inline void rq_qos_queue_depth_changed(struct request_queue *q) diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index 17c92c0f15b2..36ce3d0fb9f3 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -294,7 +294,8 @@ enum { BIO_TRACE_COMPLETION, /* bio_endio() should trace the final completion * of this bio. */ BIO_CGROUP_ACCT, /* has been accounted to a cgroup */ - BIO_TRACKED, /* set if bio goes through the rq_qos path */ + BIO_QOS_THROTTLED, /* bio went through rq_qos throttle path */ + BIO_QOS_MERGED, /* but went through rq_qos merge path */ BIO_REMAPPED, BIO_ZONE_WRITE_LOCKED, /* Owns a zoned device zone write lock */ BIO_PERCPU_CACHE, /* can participate in per-cpu alloc cache */
From: Kees Cook keescook@chromium.org
[ Upstream commit 50d7bd38c3aafc4749e05e8d7fcb616979143602 ]
Kernel code has a regular need to describe groups of members within a structure usually when they need to be copied or initialized separately from the rest of the surrounding structure. The generally accepted design pattern in C is to use a named sub-struct:
struct foo { int one; struct { int two; int three, four; } thing; int five; };
This would allow for traditional references and sizing:
memcpy(&dst.thing, &src.thing, sizeof(dst.thing));
However, doing this would mean that referencing struct members enclosed by such named structs would always require including the sub-struct name in identifiers:
do_something(dst.thing.three);
This has tended to be quite inflexible, especially when such groupings need to be added to established code which causes huge naming churn. Three workarounds exist in the kernel for this problem, and each have other negative properties.
To avoid the naming churn, there is a design pattern of adding macro aliases for the named struct:
#define f_three thing.three
This ends up polluting the global namespace, and makes it difficult to search for identifiers.
Another common work-around in kernel code avoids the pollution by avoiding the named struct entirely, instead identifying the group's boundaries using either a pair of empty anonymous structs of a pair of zero-element arrays:
struct foo { int one; struct { } start; int two; int three, four; struct { } finish; int five; };
struct foo { int one; int start[0]; int two; int three, four; int finish[0]; int five; };
This allows code to avoid needing to use a sub-struct named for member references within the surrounding structure, but loses the benefits of being able to actually use such a struct, making it rather fragile. Using these requires open-coded calculation of sizes and offsets. The efforts made to avoid common mistakes include lots of comments, or adding various BUILD_BUG_ON()s. Such code is left with no way for the compiler to reason about the boundaries (e.g. the "start" object looks like it's 0 bytes in length), making bounds checking depend on open-coded calculations:
if (length > offsetof(struct foo, finish) - offsetof(struct foo, start)) return -EINVAL; memcpy(&dst.start, &src.start, offsetof(struct foo, finish) - offsetof(struct foo, start));
However, the vast majority of places in the kernel that operate on groups of members do so without any identification of the grouping, relying either on comments or implicit knowledge of the struct contents, which is even harder for the compiler to reason about, and results in even more fragile manual sizing, usually depending on member locations outside of the region (e.g. to copy "two" and "three", use the start of "four" to find the size):
BUILD_BUG_ON((offsetof(struct foo, four) < offsetof(struct foo, two)) || (offsetof(struct foo, four) < offsetof(struct foo, three)); if (length > offsetof(struct foo, four) - offsetof(struct foo, two)) return -EINVAL; memcpy(&dst.two, &src.two, length);
In order to have a regular programmatic way to describe a struct region that can be used for references and sizing, can be examined for bounds checking, avoids forcing the use of intermediate identifiers, and avoids polluting the global namespace, introduce the struct_group() macro. This macro wraps the member declarations to create an anonymous union of an anonymous struct (no intermediate name) and a named struct (for references and sizing):
struct foo { int one; struct_group(thing, int two; int three, four; ); int five; };
if (length > sizeof(src.thing)) return -EINVAL; memcpy(&dst.thing, &src.thing, length); do_something(dst.three);
There are some rare cases where the resulting struct_group() needs attributes added, so struct_group_attr() is also introduced to allow for specifying struct attributes (e.g. __align(x) or __packed). Additionally, there are places where such declarations would like to have the struct be tagged, so struct_group_tagged() is added.
Given there is a need for a handful of UAPI uses too, the underlying __struct_group() macro has been defined in UAPI so it can be used there too.
To avoid confusing scripts/kernel-doc, hide the macro from its struct parsing.
Co-developed-by: Keith Packard keithp@keithp.com Signed-off-by: Keith Packard keithp@keithp.com Acked-by: Gustavo A. R. Silva gustavoars@kernel.org Link: https://lore.kernel.org/lkml/20210728023217.GC35706@embeddedor Enhanced-by: Rasmus Villemoes linux@rasmusvillemoes.dk Link: https://lore.kernel.org/lkml/41183a98-bdb9-4ad6-7eab-5a7292a6df84@rasmusvill... Enhanced-by: Dan Williams dan.j.williams@intel.com Link: https://lore.kernel.org/lkml/1d9a2e6df2a9a35b2cdd50a9a68cac5991e7e5f0.camel@... Enhanced-by: Daniel Vetter daniel.vetter@ffwll.ch Link: https://lore.kernel.org/lkml/YQKa76A6XuFqgM03@phenom.ffwll.local Acked-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/stddef.h | 48 +++++++++++++++++++++++++++++++++++++ include/uapi/linux/stddef.h | 21 ++++++++++++++++ scripts/kernel-doc | 7 ++++++ 3 files changed, 76 insertions(+)
diff --git a/include/linux/stddef.h b/include/linux/stddef.h index 998a4ba28eba..938216f8ab7e 100644 --- a/include/linux/stddef.h +++ b/include/linux/stddef.h @@ -36,4 +36,52 @@ enum { #define offsetofend(TYPE, MEMBER) \ (offsetof(TYPE, MEMBER) + sizeof_field(TYPE, MEMBER))
+/** + * struct_group() - Wrap a set of declarations in a mirrored struct + * + * @NAME: The identifier name of the mirrored sub-struct + * @MEMBERS: The member declarations for the mirrored structs + * + * Used to create an anonymous union of two structs with identical + * layout and size: one anonymous and one named. The former can be + * used normally without sub-struct naming, and the latter can be + * used to reason about the start, end, and size of the group of + * struct members. + */ +#define struct_group(NAME, MEMBERS...) \ + __struct_group(/* no tag */, NAME, /* no attrs */, MEMBERS) + +/** + * struct_group_attr() - Create a struct_group() with trailing attributes + * + * @NAME: The identifier name of the mirrored sub-struct + * @ATTRS: Any struct attributes to apply + * @MEMBERS: The member declarations for the mirrored structs + * + * Used to create an anonymous union of two structs with identical + * layout and size: one anonymous and one named. The former can be + * used normally without sub-struct naming, and the latter can be + * used to reason about the start, end, and size of the group of + * struct members. Includes structure attributes argument. + */ +#define struct_group_attr(NAME, ATTRS, MEMBERS...) \ + __struct_group(/* no tag */, NAME, ATTRS, MEMBERS) + +/** + * struct_group_tagged() - Create a struct_group with a reusable tag + * + * @TAG: The tag name for the named sub-struct + * @NAME: The identifier name of the mirrored sub-struct + * @MEMBERS: The member declarations for the mirrored structs + * + * Used to create an anonymous union of two structs with identical + * layout and size: one anonymous and one named. The former can be + * used normally without sub-struct naming, and the latter can be + * used to reason about the start, end, and size of the group of + * struct members. Includes struct tag argument for the named copy, + * so the specified layout can be reused later. + */ +#define struct_group_tagged(TAG, NAME, MEMBERS...) \ + __struct_group(TAG, NAME, /* no attrs */, MEMBERS) + #endif diff --git a/include/uapi/linux/stddef.h b/include/uapi/linux/stddef.h index ee8220f8dcf5..610204f7c275 100644 --- a/include/uapi/linux/stddef.h +++ b/include/uapi/linux/stddef.h @@ -4,3 +4,24 @@ #ifndef __always_inline #define __always_inline inline #endif + +/** + * __struct_group() - Create a mirrored named and anonyomous struct + * + * @TAG: The tag name for the named sub-struct (usually empty) + * @NAME: The identifier name of the mirrored sub-struct + * @ATTRS: Any struct attributes (usually empty) + * @MEMBERS: The member declarations for the mirrored structs + * + * Used to create an anonymous union of two structs with identical layout + * and size: one anonymous and one named. The former's members can be used + * normally without sub-struct naming, and the latter can be used to + * reason about the start, end, and size of the group of struct members. + * The named struct can also be explicitly tagged for layer reuse, as well + * as both having struct attributes appended. + */ +#define __struct_group(TAG, NAME, ATTRS, MEMBERS...) \ + union { \ + struct { MEMBERS } ATTRS; \ + struct TAG { MEMBERS } ATTRS NAME; \ + } diff --git a/scripts/kernel-doc b/scripts/kernel-doc index cfcb60737957..38aa799a776c 100755 --- a/scripts/kernel-doc +++ b/scripts/kernel-doc @@ -1245,6 +1245,13 @@ sub dump_struct($$) { $members =~ s/\s*CRYPTO_MINALIGN_ATTR/ /gos; $members =~ s/\s*____cacheline_aligned_in_smp/ /gos; $members =~ s/\s*____cacheline_aligned/ /gos; + # unwrap struct_group(): + # - first eat non-declaration parameters and rewrite for final match + # - then remove macro, outer parens, and trailing semicolon + $members =~ s/\bstruct_group\s*(([^,]*,)/STRUCT_GROUP(/gos; + $members =~ s/\bstruct_group_(attr|tagged)\s*(([^,]*,){2}/STRUCT_GROUP(/gos; + $members =~ s/\b__struct_group\s*(([^,]*,){3}/STRUCT_GROUP(/gos; + $members =~ s/\bSTRUCT_GROUP((((?:(?>[^)(]+)|(?1))*)))[^;]*;/$2/gos;
my $args = qr{([^,)]+)}; # replace DECLARE_BITMAP
From: Kees Cook keescook@chromium.org
[ Upstream commit d4568fc8525897e683983806f813be1ae9eedaed ]
In preparation for FORTIFY_SOURCE performing compile-time and run-time field bounds checking for memcpy(), memmove(), and memset(), avoid intentionally writing across neighboring fields. Wrap the target region in struct_group(). This additionally fixes a theoretical misalignment of the copy (since the size of "buf" changes between 64-bit and 32-bit, but this is likely never built for 64-bit).
FWIW, I think this code is totally broken on 64-bit (which appears to not be a "real" build configuration): it would either always fail (with an uninitialized data->buf_size) or would cause corruption in userspace due to the copy_to_user() in the call path against an uninitialized data->buf value:
omap3isp_stat_request_statistics_time32(...) struct omap3isp_stat_data data64; ... omap3isp_stat_request_statistics(stat, &data64);
int omap3isp_stat_request_statistics(struct ispstat *stat, struct omap3isp_stat_data *data) ... buf = isp_stat_buf_get(stat, data);
static struct ispstat_buffer *isp_stat_buf_get(struct ispstat *stat, struct omap3isp_stat_data *data) ... if (buf->buf_size > data->buf_size) { ... return ERR_PTR(-EINVAL); } ... rval = copy_to_user(data->buf, buf->virt_addr, buf->buf_size);
Regardless, additionally initialize data64 to be zero-filled to avoid undefined behavior.
Link: https://lore.kernel.org/lkml/20211215220505.GB21862@embeddedor
Cc: Arnd Bergmann arnd@arndb.de Fixes: 378e3f81cb56 ("media: omap3isp: support 64-bit version of omap3isp_stat_data") Cc: stable@vger.kernel.org Reviewed-by: Gustavo A. R. Silva gustavoars@kernel.org Signed-off-by: Kees Cook keescook@chromium.org Reviewed-by: Laurent Pinchart laurent.pinchart@ideasonboard.com Signed-off-by: Sakari Ailus sakari.ailus@linux.intel.com Signed-off-by: Mauro Carvalho Chehab mchehab@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/platform/omap3isp/ispstat.c | 5 +++-- include/uapi/linux/omap3isp.h | 21 +++++++++++++-------- 2 files changed, 16 insertions(+), 10 deletions(-)
diff --git a/drivers/media/platform/omap3isp/ispstat.c b/drivers/media/platform/omap3isp/ispstat.c index 5b9b57f4d9bf..68cf68dbcace 100644 --- a/drivers/media/platform/omap3isp/ispstat.c +++ b/drivers/media/platform/omap3isp/ispstat.c @@ -512,7 +512,7 @@ int omap3isp_stat_request_statistics(struct ispstat *stat, int omap3isp_stat_request_statistics_time32(struct ispstat *stat, struct omap3isp_stat_data_time32 *data) { - struct omap3isp_stat_data data64; + struct omap3isp_stat_data data64 = { }; int ret;
ret = omap3isp_stat_request_statistics(stat, &data64); @@ -521,7 +521,8 @@ int omap3isp_stat_request_statistics_time32(struct ispstat *stat,
data->ts.tv_sec = data64.ts.tv_sec; data->ts.tv_usec = data64.ts.tv_usec; - memcpy(&data->buf, &data64.buf, sizeof(*data) - sizeof(data->ts)); + data->buf = (uintptr_t)data64.buf; + memcpy(&data->frame, &data64.frame, sizeof(data->frame));
return 0; } diff --git a/include/uapi/linux/omap3isp.h b/include/uapi/linux/omap3isp.h index 87b55755f4ff..d9db7ad43890 100644 --- a/include/uapi/linux/omap3isp.h +++ b/include/uapi/linux/omap3isp.h @@ -162,6 +162,7 @@ struct omap3isp_h3a_aewb_config { * struct omap3isp_stat_data - Statistic data sent to or received from user * @ts: Timestamp of returned framestats. * @buf: Pointer to pass to user. + * @buf_size: Size of buffer. * @frame_number: Frame number of requested stats. * @cur_frame: Current frame number being processed. * @config_counter: Number of the configuration associated with the data. @@ -176,10 +177,12 @@ struct omap3isp_stat_data { struct timeval ts; #endif void __user *buf; - __u32 buf_size; - __u16 frame_number; - __u16 cur_frame; - __u16 config_counter; + __struct_group(/* no tag */, frame, /* no attrs */, + __u32 buf_size; + __u16 frame_number; + __u16 cur_frame; + __u16 config_counter; + ); };
#ifdef __KERNEL__ @@ -189,10 +192,12 @@ struct omap3isp_stat_data_time32 { __s32 tv_usec; } ts; __u32 buf; - __u32 buf_size; - __u16 frame_number; - __u16 cur_frame; - __u16 config_counter; + __struct_group(/* no tag */, frame, /* no attrs */, + __u32 buf_size; + __u16 frame_number; + __u16 cur_frame; + __u16 config_counter; + ); }; #endif
From: Johan Hovold johan@kernel.org
[ Upstream commit 43acb728bbc40169d2e2425e84a80068270974be ]
The driver allocates and registers two platform device structures during probe, but the devices were never deregistered on driver unbind.
This results in a use-after-free on driver unbind as the device structures were allocated using devres and would be freed by driver core when remove() returns.
Fix this by adding the missing deregistration calls to the remove() callback and failing probe on registration errors.
Note that the platform device structures must be freed using a proper release callback to avoid leaking associated resources like device names.
Fixes: 479f7a118105 ("[media] davinci: vpif: adaptions for DT support") Cc: stable@vger.kernel.org # 4.12 Cc: Kevin Hilman khilman@baylibre.com Signed-off-by: Johan Hovold johan@kernel.org Reviewed-by: Lad Prabhakar prabhakar.csengg@gmail.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/platform/davinci/vpif.c | 97 ++++++++++++++++++++------- 1 file changed, 71 insertions(+), 26 deletions(-)
diff --git a/drivers/media/platform/davinci/vpif.c b/drivers/media/platform/davinci/vpif.c index 5658c7f148d7..8ffc01c606d0 100644 --- a/drivers/media/platform/davinci/vpif.c +++ b/drivers/media/platform/davinci/vpif.c @@ -41,6 +41,11 @@ MODULE_ALIAS("platform:" VPIF_DRIVER_NAME); #define VPIF_CH2_MAX_MODES 15 #define VPIF_CH3_MAX_MODES 2
+struct vpif_data { + struct platform_device *capture; + struct platform_device *display; +}; + DEFINE_SPINLOCK(vpif_lock); EXPORT_SYMBOL_GPL(vpif_lock);
@@ -423,11 +428,19 @@ int vpif_channel_getfid(u8 channel_id) } EXPORT_SYMBOL(vpif_channel_getfid);
+static void vpif_pdev_release(struct device *dev) +{ + struct platform_device *pdev = to_platform_device(dev); + + kfree(pdev); +} + static int vpif_probe(struct platform_device *pdev) { static struct resource *res, *res_irq; struct platform_device *pdev_capture, *pdev_display; struct device_node *endpoint = NULL; + struct vpif_data *data; int ret;
res = platform_get_resource(pdev, IORESOURCE_MEM, 0); @@ -435,6 +448,12 @@ static int vpif_probe(struct platform_device *pdev) if (IS_ERR(vpif_base)) return PTR_ERR(vpif_base);
+ data = kzalloc(sizeof(*data), GFP_KERNEL); + if (!data) + return -ENOMEM; + + platform_set_drvdata(pdev, data); + pm_runtime_enable(&pdev->dev); pm_runtime_get(&pdev->dev);
@@ -462,49 +481,75 @@ static int vpif_probe(struct platform_device *pdev) goto err_put_rpm; }
- pdev_capture = devm_kzalloc(&pdev->dev, sizeof(*pdev_capture), - GFP_KERNEL); - if (pdev_capture) { - pdev_capture->name = "vpif_capture"; - pdev_capture->id = -1; - pdev_capture->resource = res_irq; - pdev_capture->num_resources = 1; - pdev_capture->dev.dma_mask = pdev->dev.dma_mask; - pdev_capture->dev.coherent_dma_mask = pdev->dev.coherent_dma_mask; - pdev_capture->dev.parent = &pdev->dev; - platform_device_register(pdev_capture); - } else { - dev_warn(&pdev->dev, "Unable to allocate memory for pdev_capture.\n"); + pdev_capture = kzalloc(sizeof(*pdev_capture), GFP_KERNEL); + if (!pdev_capture) { + ret = -ENOMEM; + goto err_put_rpm; }
- pdev_display = devm_kzalloc(&pdev->dev, sizeof(*pdev_display), - GFP_KERNEL); - if (pdev_display) { - pdev_display->name = "vpif_display"; - pdev_display->id = -1; - pdev_display->resource = res_irq; - pdev_display->num_resources = 1; - pdev_display->dev.dma_mask = pdev->dev.dma_mask; - pdev_display->dev.coherent_dma_mask = pdev->dev.coherent_dma_mask; - pdev_display->dev.parent = &pdev->dev; - platform_device_register(pdev_display); - } else { - dev_warn(&pdev->dev, "Unable to allocate memory for pdev_display.\n"); + pdev_capture->name = "vpif_capture"; + pdev_capture->id = -1; + pdev_capture->resource = res_irq; + pdev_capture->num_resources = 1; + pdev_capture->dev.dma_mask = pdev->dev.dma_mask; + pdev_capture->dev.coherent_dma_mask = pdev->dev.coherent_dma_mask; + pdev_capture->dev.parent = &pdev->dev; + pdev_capture->dev.release = vpif_pdev_release; + + ret = platform_device_register(pdev_capture); + if (ret) + goto err_put_pdev_capture; + + pdev_display = kzalloc(sizeof(*pdev_display), GFP_KERNEL); + if (!pdev_display) { + ret = -ENOMEM; + goto err_put_pdev_capture; }
+ pdev_display->name = "vpif_display"; + pdev_display->id = -1; + pdev_display->resource = res_irq; + pdev_display->num_resources = 1; + pdev_display->dev.dma_mask = pdev->dev.dma_mask; + pdev_display->dev.coherent_dma_mask = pdev->dev.coherent_dma_mask; + pdev_display->dev.parent = &pdev->dev; + pdev_display->dev.release = vpif_pdev_release; + + ret = platform_device_register(pdev_display); + if (ret) + goto err_put_pdev_display; + + data->capture = pdev_capture; + data->display = pdev_display; + return 0;
+err_put_pdev_display: + platform_device_put(pdev_display); +err_put_pdev_capture: + platform_device_put(pdev_capture); err_put_rpm: pm_runtime_put(&pdev->dev); pm_runtime_disable(&pdev->dev); + kfree(data);
return ret; }
static int vpif_remove(struct platform_device *pdev) { + struct vpif_data *data = platform_get_drvdata(pdev); + + if (data->capture) + platform_device_unregister(data->capture); + if (data->display) + platform_device_unregister(data->display); + pm_runtime_put(&pdev->dev); pm_runtime_disable(&pdev->dev); + + kfree(data); + return 0; }
From: Sean Wang sean.wang@mediatek.com
[ Upstream commit bf9727a27442a50c75b7d99a5088330c578b2a42 ]
Fixed an MCU_CE_CMD_SET_ROC definition error that occurred from a previous refactor work.
Fixes: d0e274af2f2e4 ("mt76: mt76_connac: create mcu library") Signed-off-by: Sean Wang sean.wang@mediatek.com Signed-off-by: Felix Fietkau nbd@nbd.name Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/mediatek/mt76/mt76_connac_mcu.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/wireless/mediatek/mt76/mt76_connac_mcu.h b/drivers/net/wireless/mediatek/mt76/mt76_connac_mcu.h index 77d4435e4581..72a70a7046fb 100644 --- a/drivers/net/wireless/mediatek/mt76/mt76_connac_mcu.h +++ b/drivers/net/wireless/mediatek/mt76/mt76_connac_mcu.h @@ -556,7 +556,7 @@ enum { MCU_CMD_SET_BSS_CONNECTED = MCU_CE_PREFIX | 0x16, MCU_CMD_SET_BSS_ABORT = MCU_CE_PREFIX | 0x17, MCU_CMD_CANCEL_HW_SCAN = MCU_CE_PREFIX | 0x1b, - MCU_CMD_SET_ROC = MCU_CE_PREFIX | 0x1d, + MCU_CMD_SET_ROC = MCU_CE_PREFIX | 0x1c, MCU_CMD_SET_P2P_OPPPS = MCU_CE_PREFIX | 0x33, MCU_CMD_SET_RATE_TX_POWER = MCU_CE_PREFIX | 0x5d, MCU_CMD_SCHED_SCAN_ENABLE = MCU_CE_PREFIX | 0x61,
From: Lorenzo Bianconi lorenzo@kernel.org
[ Upstream commit b44eeb8cbdf2b88f2844f11e4f263b0abed5b5b0 ]
After commit 'd430dffbe9dd ("mt76: mt7921: fix a possible race enabling/disabling runtime-pm")', runtime-pm is always disabled in the fw even if the user requests to enable it toggling debugfs node since mt7921_pm_interface_iter routine will use pm->enable to configure the fw. Fix the issue moving enable variable configuration before running mt7921_pm_interface_iter routine.
Fixes: d430dffbe9dd ("mt76: mt7921: fix a possible race enabling/disabling runtime-pm") Signed-off-by: Lorenzo Bianconi lorenzo@kernel.org Signed-off-by: Felix Fietkau nbd@nbd.name Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/mediatek/mt76/mt7921/debugfs.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/net/wireless/mediatek/mt76/mt7921/debugfs.c b/drivers/net/wireless/mediatek/mt76/mt7921/debugfs.c index b9f25599227d..cfcf7964c688 100644 --- a/drivers/net/wireless/mediatek/mt76/mt7921/debugfs.c +++ b/drivers/net/wireless/mediatek/mt76/mt7921/debugfs.c @@ -291,13 +291,12 @@ mt7921_pm_set(void *data, u64 val) pm->enable = false; mt76_connac_pm_wake(&dev->mphy, pm);
+ pm->enable = val; ieee80211_iterate_active_interfaces(mt76_hw(dev), IEEE80211_IFACE_ITER_RESUME_ALL, mt7921_pm_interface_iter, dev);
mt76_connac_mcu_set_deep_sleep(&dev->mt76, pm->ds_enable); - - pm->enable = val; mt76_connac_power_save_sched(&dev->mphy, pm); out: mutex_unlock(&dev->mt76.mutex);
From: Dan Williams dan.j.williams@intel.com
[ Upstream commit 74be98774dfbc5b8b795db726bd772e735d2edd4 ]
KASAN + DEBUG_KOBJECT_RELEASE reports a potential use-after-free in cxl_decoder_release() where it goes to reference its parent, a cxl_port, to free its id back to port->decoder_ida.
BUG: KASAN: use-after-free in to_cxl_port+0x18/0x90 [cxl_core] Read of size 8 at addr ffff888119270908 by task kworker/35:2/379
CPU: 35 PID: 379 Comm: kworker/35:2 Tainted: G OE 5.17.0-rc2+ #198 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 Workqueue: events kobject_delayed_cleanup Call Trace: <TASK> dump_stack_lvl+0x59/0x73 print_address_description.constprop.0+0x1f/0x150 ? to_cxl_port+0x18/0x90 [cxl_core] kasan_report.cold+0x83/0xdf ? to_cxl_port+0x18/0x90 [cxl_core] to_cxl_port+0x18/0x90 [cxl_core] cxl_decoder_release+0x2a/0x60 [cxl_core] device_release+0x5f/0x100 kobject_cleanup+0x80/0x1c0
The device core only guarantees parent lifetime until all children are unregistered. If a child needs a parent to complete its ->release() callback that child needs to hold a reference to extend the lifetime of the parent.
Fixes: 40ba17afdfab ("cxl/acpi: Introduce cxl_decoder objects") Reported-by: Ben Widawsky ben.widawsky@intel.com Tested-by: Ben Widawsky ben.widawsky@intel.com Reviewed-by: Ben Widawsky ben.widawsky@intel.com Link: https://lore.kernel.org/r/164505751190.4175768.13324905271463416712.stgit@dw... Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/cxl/core/bus.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/cxl/core/bus.c b/drivers/cxl/core/bus.c index 267d8042bec2..0987a6423ee0 100644 --- a/drivers/cxl/core/bus.c +++ b/drivers/cxl/core/bus.c @@ -182,6 +182,7 @@ static void cxl_decoder_release(struct device *dev)
ida_free(&port->decoder_ida, cxld->id); kfree(cxld); + put_device(&port->dev); }
static const struct device_type cxl_decoder_switch_type = { @@ -481,6 +482,9 @@ cxl_decoder_alloc(struct cxl_port *port, int nr_targets, resource_size_t base, if (rc < 0) goto err;
+ /* need parent to stick around to release the id */ + get_device(&port->dev); + *cxld = (struct cxl_decoder) { .id = rc, .range = {
From: Lad Prabhakar prabhakar.mahadev-lad.rj@bp.renesas.com
[ Upstream commit b289cdecc7c3e25e001cde260c882e4d9a8b0772 ]
As per the HW manual (Rev.1.00 Sep, 2021) PLL2 and PLL3 should be 1600 MHz, but with current multiplier and divider values this resulted to 1596 MHz.
This patch updates the multiplier and divider values for PLL2 and PLL3 so that we get the exact (1600 MHz) values.
Fixes: 17f0ff3d49ff1 ("clk: renesas: Add support for R9A07G044 SoC") Suggested-by: Biju Das biju.das.jz@bp.renesas.com Signed-off-by: Lad Prabhakar prabhakar.mahadev-lad.rj@bp.renesas.com Link: https://lore.kernel.org/r/20211223093223.4725-1-prabhakar.mahadev-lad.rj@bp.... Signed-off-by: Geert Uytterhoeven geert+renesas@glider.be Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/clk/renesas/r9a07g044-cpg.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/clk/renesas/r9a07g044-cpg.c b/drivers/clk/renesas/r9a07g044-cpg.c index 1490446985e2..61609eddf7d0 100644 --- a/drivers/clk/renesas/r9a07g044-cpg.c +++ b/drivers/clk/renesas/r9a07g044-cpg.c @@ -61,8 +61,8 @@ static const struct cpg_core_clk r9a07g044_core_clks[] __initconst = { DEF_FIXED(".osc", R9A07G044_OSCCLK, CLK_EXTAL, 1, 1), DEF_FIXED(".osc_div1000", CLK_OSC_DIV1000, CLK_EXTAL, 1, 1000), DEF_SAMPLL(".pll1", CLK_PLL1, CLK_EXTAL, PLL146_CONF(0)), - DEF_FIXED(".pll2", CLK_PLL2, CLK_EXTAL, 133, 2), - DEF_FIXED(".pll3", CLK_PLL3, CLK_EXTAL, 133, 2), + DEF_FIXED(".pll2", CLK_PLL2, CLK_EXTAL, 200, 3), + DEF_FIXED(".pll3", CLK_PLL3, CLK_EXTAL, 200, 3),
DEF_FIXED(".pll2_div2", CLK_PLL2_DIV2, CLK_PLL2, 1, 2), DEF_FIXED(".pll2_div16", CLK_PLL2_DIV16, CLK_PLL2, 1, 16),
From: Sean Christopherson seanjc@google.com
[ Upstream commit 7533377215b6ee432c06c5855f6be5d66e694e46 ]
Use the yield-safe variant of the TDP MMU iterator when handling an unmapping event from the MMU notifier, as most occurences of the event allow yielding.
Fixes: e1eed5847b09 ("KVM: x86/mmu: Allow yielding during MMU notifier unmap/zap, if possible") Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson seanjc@google.com Message-Id: 20211120015008.3780032-1-seanjc@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/mmu/tdp_mmu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 853780eb033b..6195f0d219ae 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -1100,7 +1100,7 @@ bool kvm_tdp_mmu_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range, { struct kvm_mmu_page *root;
- for_each_tdp_mmu_root(kvm, root, range->slot->as_id) + for_each_tdp_mmu_root_yield_safe(kvm, root, range->slot->as_id, false) flush = zap_gfn_range(kvm, root, range->start, range->end, range->may_block, flush, false);
From: Sean Christopherson seanjc@google.com
[ Upstream commit 83b83a02073ec8d18c77a9bbe0881d710f7a9d32 ]
Use the common TDP MMU zap helper when handling an MMU notifier unmap event, the two flows are semantically identical. Consolidate the code in preparation for a future bug fix, as both kvm_tdp_mmu_unmap_gfn_range() and __kvm_tdp_mmu_zap_gfn_range() are guilty of not zapping SPTEs in invalid roots.
No functional change intended.
Cc: stable@vger.kernel.org Signed-off-by: Sean Christopherson seanjc@google.com Message-Id: 20211215011557.399940-2-seanjc@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/mmu/tdp_mmu.c | 9 ++------- 1 file changed, 2 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 6195f0d219ae..6c2bb60ccd88 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -1098,13 +1098,8 @@ int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code, bool kvm_tdp_mmu_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range, bool flush) { - struct kvm_mmu_page *root; - - for_each_tdp_mmu_root_yield_safe(kvm, root, range->slot->as_id, false) - flush = zap_gfn_range(kvm, root, range->start, range->end, - range->may_block, flush, false); - - return flush; + return __kvm_tdp_mmu_zap_gfn_range(kvm, range->slot->as_id, range->start, + range->end, range->may_block, flush); }
typedef bool (*tdp_handler_t)(struct kvm *kvm, struct tdp_iter *iter,
From: Manish Rangankar mrangankar@marvell.com
[ Upstream commit 3a4e1f3b3a3c733de3b82b9b522e54803e1165ae ]
DPC thread gets restricted due to a no-op mailbox, which is a blocking call and has a high execution frequency. To free up the DPC thread we move no-op handling to the workqueue. Also, modified qla_do_heartbeat() to send no-op MBC if we don’t have any active interrupts, but there are still I/Os outstanding with firmware.
Link: https://lore.kernel.org/r/20210908164622.19240-9-njavali@marvell.com Fixes: d94d8158e184 ("scsi: qla2xxx: Add heartbeat check") Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com Signed-off-by: Manish Rangankar mrangankar@marvell.com Signed-off-by: Nilesh Javali njavali@marvell.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qla2xxx/qla_def.h | 4 +- drivers/scsi/qla2xxx/qla_init.c | 2 + drivers/scsi/qla2xxx/qla_os.c | 78 +++++++++++++++------------------ 3 files changed, 40 insertions(+), 44 deletions(-)
diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h index 2ea35e47ea44..0589ab8e6467 100644 --- a/drivers/scsi/qla2xxx/qla_def.h +++ b/drivers/scsi/qla2xxx/qla_def.h @@ -3759,6 +3759,7 @@ struct qla_qpair { struct qla_fw_resources fwres ____cacheline_aligned; u32 cmd_cnt; u32 cmd_completion_cnt; + u32 prev_completion_cnt; };
/* Place holder for FW buffer parameters */ @@ -4618,6 +4619,7 @@ struct qla_hw_data { struct qla_chip_state_84xx *cs84xx; struct isp_operations *isp_ops; struct workqueue_struct *wq; + struct work_struct heartbeat_work; struct qlfc_fw fw_buf;
/* FCP_CMND priority support */ @@ -4719,7 +4721,6 @@ struct qla_hw_data {
struct qla_hw_data_stat stat; pci_error_state_t pci_error_state; - u64 prev_cmd_cnt; struct dma_pool *purex_dma_pool; struct btree_head32 host_map;
@@ -4865,7 +4866,6 @@ typedef struct scsi_qla_host { #define SET_ZIO_THRESHOLD_NEEDED 32 #define ISP_ABORT_TO_ROM 33 #define VPORT_DELETE 34 -#define HEARTBEAT_CHK 38
#define PROCESS_PUREX_IOCB 63
diff --git a/drivers/scsi/qla2xxx/qla_init.c b/drivers/scsi/qla2xxx/qla_init.c index af8df5a800c6..c3ba2995209b 100644 --- a/drivers/scsi/qla2xxx/qla_init.c +++ b/drivers/scsi/qla2xxx/qla_init.c @@ -7096,12 +7096,14 @@ qla2x00_abort_isp_cleanup(scsi_qla_host_t *vha) ha->chip_reset++; ha->base_qpair->chip_reset = ha->chip_reset; ha->base_qpair->cmd_cnt = ha->base_qpair->cmd_completion_cnt = 0; + ha->base_qpair->prev_completion_cnt = 0; for (i = 0; i < ha->max_qpairs; i++) { if (ha->queue_pair_map[i]) { ha->queue_pair_map[i]->chip_reset = ha->base_qpair->chip_reset; ha->queue_pair_map[i]->cmd_cnt = ha->queue_pair_map[i]->cmd_completion_cnt = 0; + ha->base_qpair->prev_completion_cnt = 0; } }
diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c index 77c0bf06f162..b224326bacee 100644 --- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -2779,6 +2779,16 @@ qla2xxx_scan_finished(struct Scsi_Host *shost, unsigned long time) return atomic_read(&vha->loop_state) == LOOP_READY; }
+static void qla_heartbeat_work_fn(struct work_struct *work) +{ + struct qla_hw_data *ha = container_of(work, + struct qla_hw_data, heartbeat_work); + struct scsi_qla_host *base_vha = pci_get_drvdata(ha->pdev); + + if (!ha->flags.mbox_busy && base_vha->flags.init_done) + qla_no_op_mb(base_vha); +} + static void qla2x00_iocb_work_fn(struct work_struct *work) { struct scsi_qla_host *vha = container_of(work, @@ -3217,6 +3227,7 @@ qla2x00_probe_one(struct pci_dev *pdev, const struct pci_device_id *id) host->transportt, sht->vendor_id);
INIT_WORK(&base_vha->iocb_work, qla2x00_iocb_work_fn); + INIT_WORK(&ha->heartbeat_work, qla_heartbeat_work_fn);
/* Set up the irqs */ ret = qla2x00_request_irqs(ha, rsp); @@ -7103,17 +7114,6 @@ qla2x00_do_dpc(void *data) qla2x00_lip_reset(base_vha); }
- if (test_bit(HEARTBEAT_CHK, &base_vha->dpc_flags)) { - /* - * if there is a mb in progress then that's - * enough of a check to see if fw is still ticking. - */ - if (!ha->flags.mbox_busy && base_vha->flags.init_done) - qla_no_op_mb(base_vha); - - clear_bit(HEARTBEAT_CHK, &base_vha->dpc_flags); - } - ha->dpc_active = 0; end_loop: set_current_state(TASK_INTERRUPTIBLE); @@ -7172,57 +7172,51 @@ qla2x00_rst_aen(scsi_qla_host_t *vha)
static bool qla_do_heartbeat(struct scsi_qla_host *vha) { - u64 cmd_cnt, prev_cmd_cnt; - bool do_hb = false; struct qla_hw_data *ha = vha->hw; - int i; + u32 cmpl_cnt; + u16 i; + bool do_heartbeat = false;
- /* if cmds are still pending down in fw, then do hb */ - if (ha->base_qpair->cmd_cnt != ha->base_qpair->cmd_completion_cnt) { - do_hb = true; + /* + * Allow do_heartbeat only if we don’t have any active interrupts, + * but there are still IOs outstanding with firmware. + */ + cmpl_cnt = ha->base_qpair->cmd_completion_cnt; + if (cmpl_cnt == ha->base_qpair->prev_completion_cnt && + cmpl_cnt != ha->base_qpair->cmd_cnt) { + do_heartbeat = true; goto skip; } + ha->base_qpair->prev_completion_cnt = cmpl_cnt;
for (i = 0; i < ha->max_qpairs; i++) { - if (ha->queue_pair_map[i] && - ha->queue_pair_map[i]->cmd_cnt != - ha->queue_pair_map[i]->cmd_completion_cnt) { - do_hb = true; - break; + if (ha->queue_pair_map[i]) { + cmpl_cnt = ha->queue_pair_map[i]->cmd_completion_cnt; + if (cmpl_cnt == ha->queue_pair_map[i]->prev_completion_cnt && + cmpl_cnt != ha->queue_pair_map[i]->cmd_cnt) { + do_heartbeat = true; + break; + } + ha->queue_pair_map[i]->prev_completion_cnt = cmpl_cnt; } }
skip: - prev_cmd_cnt = ha->prev_cmd_cnt; - cmd_cnt = ha->base_qpair->cmd_cnt; - for (i = 0; i < ha->max_qpairs; i++) { - if (ha->queue_pair_map[i]) - cmd_cnt += ha->queue_pair_map[i]->cmd_cnt; - } - ha->prev_cmd_cnt = cmd_cnt; - - if (!do_hb && ((cmd_cnt - prev_cmd_cnt) > 50)) - /* - * IOs are completing before periodic hb check. - * IOs seems to be running, do hb for sanity check. - */ - do_hb = true; - - return do_hb; + return do_heartbeat; }
static void qla_heart_beat(struct scsi_qla_host *vha) { + struct qla_hw_data *ha = vha->hw; + if (vha->vp_idx) return;
if (vha->hw->flags.eeh_busy || qla2x00_chip_is_down(vha)) return;
- if (qla_do_heartbeat(vha)) { - set_bit(HEARTBEAT_CHK, &vha->dpc_flags); - qla2xxx_wake_dpc(vha); - } + if (qla_do_heartbeat(vha)) + queue_work(ha->wq, &ha->heartbeat_work); }
/**************************************************************************
From: Quinn Tran qutran@marvell.com
[ Upstream commit 713b415726f100f6644971e75ebfe1edbef1a390 ]
For session recovery, driver relies on the dpc thread to initiate certain operations. The dpc thread runs exclusively without the Mailbox interface being occupied. A recent code change for heartbeat check via mailbox cmd 0 is preventing the dpc thread from carrying out its operation. This patch allows the higher priority error recovery to run first before running the lower priority heartbeat check.
Link: https://lore.kernel.org/r/20220310092604.22950-9-njavali@marvell.com Fixes: d94d8158e184 ("scsi: qla2xxx: Add heartbeat check") Cc: stable@vger.kernel.org Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com Signed-off-by: Quinn Tran qutran@marvell.com Signed-off-by: Nilesh Javali njavali@marvell.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qla2xxx/qla_def.h | 1 + drivers/scsi/qla2xxx/qla_os.c | 20 +++++++++++++++++--- 2 files changed, 18 insertions(+), 3 deletions(-)
diff --git a/drivers/scsi/qla2xxx/qla_def.h b/drivers/scsi/qla2xxx/qla_def.h index 0589ab8e6467..303ad60d1d49 100644 --- a/drivers/scsi/qla2xxx/qla_def.h +++ b/drivers/scsi/qla2xxx/qla_def.h @@ -4621,6 +4621,7 @@ struct qla_hw_data { struct workqueue_struct *wq; struct work_struct heartbeat_work; struct qlfc_fw fw_buf; + unsigned long last_heartbeat_run_jiffies;
/* FCP_CMND priority support */ struct qla_fcp_prio_cfg *fcp_prio_cfg; diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c index b224326bacee..12958aea893f 100644 --- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -7205,7 +7205,7 @@ static bool qla_do_heartbeat(struct scsi_qla_host *vha) return do_heartbeat; }
-static void qla_heart_beat(struct scsi_qla_host *vha) +static void qla_heart_beat(struct scsi_qla_host *vha, u16 dpc_started) { struct qla_hw_data *ha = vha->hw;
@@ -7215,8 +7215,19 @@ static void qla_heart_beat(struct scsi_qla_host *vha) if (vha->hw->flags.eeh_busy || qla2x00_chip_is_down(vha)) return;
- if (qla_do_heartbeat(vha)) + /* + * dpc thread cannot run if heartbeat is running at the same time. + * We also do not want to starve heartbeat task. Therefore, do + * heartbeat task at least once every 5 seconds. + */ + if (dpc_started && + time_before(jiffies, ha->last_heartbeat_run_jiffies + 5 * HZ)) + return; + + if (qla_do_heartbeat(vha)) { + ha->last_heartbeat_run_jiffies = jiffies; queue_work(ha->wq, &ha->heartbeat_work); + } }
/************************************************************************** @@ -7407,6 +7418,8 @@ qla2x00_timer(struct timer_list *t) start_dpc++; }
+ /* borrowing w to signify dpc will run */ + w = 0; /* Schedule the DPC routine if needed */ if ((test_bit(ISP_ABORT_NEEDED, &vha->dpc_flags) || test_bit(LOOP_RESYNC_NEEDED, &vha->dpc_flags) || @@ -7439,9 +7452,10 @@ qla2x00_timer(struct timer_list *t) test_bit(RELOGIN_NEEDED, &vha->dpc_flags), test_bit(PROCESS_PUREX_IOCB, &vha->dpc_flags)); qla2xxx_wake_dpc(vha); + w = 1; }
- qla_heart_beat(vha); + qla_heart_beat(vha, w);
qla2x00_restart_timer(vha, WATCH_INTERVAL); }
From: Quinn Tran qutran@marvell.com
[ Upstream commit 8062b742d3bd336ca10ab5a1db1629d33700f9c6 ]
This patch is per review comment by Hannes Reinecke from previous submission to replace list_for_each_safe with list_for_each_entry_safe.
Link: https://lore.kernel.org/r/20211026115412.27691-8-njavali@marvell.com Reviewed-by: Hannes Reinecke hare@suse.de Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com Signed-off-by: Quinn Tran qutran@marvell.com Signed-off-by: Nilesh Javali njavali@marvell.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qla2xxx/qla_edif.c | 39 ++++++++------------------------- drivers/scsi/qla2xxx/qla_edif.h | 1 - drivers/scsi/qla2xxx/qla_os.c | 8 +++---- 3 files changed, 13 insertions(+), 35 deletions(-)
diff --git a/drivers/scsi/qla2xxx/qla_edif.c b/drivers/scsi/qla2xxx/qla_edif.c index a00fe88c6021..e40b9cc38214 100644 --- a/drivers/scsi/qla2xxx/qla_edif.c +++ b/drivers/scsi/qla2xxx/qla_edif.c @@ -1684,41 +1684,25 @@ static struct enode * qla_enode_find(scsi_qla_host_t *vha, uint32_t ntype, uint32_t p1, uint32_t p2) { struct enode *node_rtn = NULL; - struct enode *list_node = NULL; + struct enode *list_node, *q; unsigned long flags; - struct list_head *pos, *q; uint32_t sid; - uint32_t rw_flag; struct purexevent *purex;
/* secure the list from moving under us */ spin_lock_irqsave(&vha->pur_cinfo.pur_lock, flags);
- list_for_each_safe(pos, q, &vha->pur_cinfo.head) { - list_node = list_entry(pos, struct enode, list); + list_for_each_entry_safe(list_node, q, &vha->pur_cinfo.head, list) {
/* node type determines what p1 and p2 are */ purex = &list_node->u.purexinfo; sid = p1; - rw_flag = p2;
if (purex->pur_info.pur_sid.b24 == sid) { - if (purex->pur_info.pur_pend == 1 && - rw_flag == PUR_GET) { - /* - * if the receive is in progress - * and its a read/get then can't - * transfer yet - */ - ql_dbg(ql_dbg_edif, vha, 0x9106, - "%s purex xfer in progress for sid=%x\n", - __func__, sid); - } else { - /* found it and its complete */ - node_rtn = list_node; - list_del(pos); - break; - } + /* found it and its complete */ + node_rtn = list_node; + list_del(&list_node->list); + break; } }
@@ -2428,7 +2412,6 @@ void qla24xx_auth_els(scsi_qla_host_t *vha, void **pkt, struct rsp_que **rsp)
purex = &ptr->u.purexinfo; purex->pur_info.pur_sid = a.did; - purex->pur_info.pur_pend = 0; purex->pur_info.pur_bytes_rcvd = totlen; purex->pur_info.pur_rx_xchg_address = le32_to_cpu(p->rx_xchg_addr); purex->pur_info.pur_nphdl = le16_to_cpu(p->nport_handle); @@ -3180,18 +3163,14 @@ static uint16_t qla_edif_sadb_get_sa_index(fc_port_t *fcport, /* release any sadb entries -- only done at teardown */ void qla_edif_sadb_release(struct qla_hw_data *ha) { - struct list_head *pos; - struct list_head *tmp; - struct edif_sa_index_entry *entry; + struct edif_sa_index_entry *entry, *tmp;
- list_for_each_safe(pos, tmp, &ha->sadb_rx_index_list) { - entry = list_entry(pos, struct edif_sa_index_entry, next); + list_for_each_entry_safe(entry, tmp, &ha->sadb_rx_index_list, next) { list_del(&entry->next); kfree(entry); }
- list_for_each_safe(pos, tmp, &ha->sadb_tx_index_list) { - entry = list_entry(pos, struct edif_sa_index_entry, next); + list_for_each_entry_safe(entry, tmp, &ha->sadb_tx_index_list, next) { list_del(&entry->next); kfree(entry); } diff --git a/drivers/scsi/qla2xxx/qla_edif.h b/drivers/scsi/qla2xxx/qla_edif.h index 45cf87e33778..32800bfb32a3 100644 --- a/drivers/scsi/qla2xxx/qla_edif.h +++ b/drivers/scsi/qla2xxx/qla_edif.h @@ -101,7 +101,6 @@ struct dinfo { };
struct pur_ninfo { - unsigned int pur_pend:1; port_id_t pur_sid; port_id_t pur_did; uint8_t vp_idx; diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c index 12958aea893f..c7ab8a8be24c 100644 --- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -3886,13 +3886,13 @@ qla2x00_remove_one(struct pci_dev *pdev) static inline void qla24xx_free_purex_list(struct purex_list *list) { - struct list_head *item, *next; + struct purex_item *item, *next; ulong flags;
spin_lock_irqsave(&list->lock, flags); - list_for_each_safe(item, next, &list->head) { - list_del(item); - kfree(list_entry(item, struct purex_item, list)); + list_for_each_entry_safe(item, next, &list->head, list) { + list_del(&item->list); + kfree(item); } spin_unlock_irqrestore(&list->lock, flags); }
From: Arun Easi aeasi@marvell.com
[ Upstream commit 0972252450f90db56dd5415a20e2aec21a08d036 ]
During purex packet handling the driver was incorrectly freeing a pre-allocated structure. Fix this by skipping that entry.
System crashed with the following stack during a module unload test.
Call Trace: sbitmap_init_node+0x7f/0x1e0 sbitmap_queue_init_node+0x24/0x150 blk_mq_init_bitmaps+0x3d/0xa0 blk_mq_init_tags+0x68/0x90 blk_mq_alloc_map_and_rqs+0x44/0x120 blk_mq_alloc_set_map_and_rqs+0x63/0x150 blk_mq_alloc_tag_set+0x11b/0x230 scsi_add_host_with_dma.cold+0x3f/0x245 qla2x00_probe_one+0xd5a/0x1b80 [qla2xxx]
Call Trace with slub_debug and debug kernel: kasan_report_invalid_free+0x50/0x80 __kasan_slab_free+0x137/0x150 slab_free_freelist_hook+0xc6/0x190 kfree+0xe8/0x2e0 qla2x00_free_device+0x3bb/0x5d0 [qla2xxx] qla2x00_remove_one+0x668/0xcf0 [qla2xxx]
Link: https://lore.kernel.org/r/20220310092604.22950-6-njavali@marvell.com Fixes: 62e9dd177732 ("scsi: qla2xxx: Change in PUREX to handle FPIN ELS requests") Cc: stable@vger.kernel.org Reported-by: Marco Patalano mpatalan@redhat.com Tested-by: Marco Patalano mpatalan@redhat.com Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com Signed-off-by: Arun Easi aeasi@marvell.com Signed-off-by: Nilesh Javali njavali@marvell.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qla2xxx/qla_os.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c index c7ab8a8be24c..e683b1c01c9f 100644 --- a/drivers/scsi/qla2xxx/qla_os.c +++ b/drivers/scsi/qla2xxx/qla_os.c @@ -3892,6 +3892,8 @@ qla24xx_free_purex_list(struct purex_list *list) spin_lock_irqsave(&list->lock, flags); list_for_each_entry_safe(item, next, &list->head, list) { list_del(&item->list); + if (item == &item->vha->default_item) + continue; kfree(item); } spin_unlock_irqrestore(&list->lock, flags);
From: Andreas Gruenbacher agruenba@redhat.com
[ Upstream commit 46f3e0421ccb5474b5c006b0089b9dfd42534bb6 ]
Since commit 554c577cee95b, gfs2_file_buffered_write() can accidentally return a truncated iov_iter, which might confuse callers. Fix that.
Fixes: 554c577cee95b ("gfs2: Prevent endless loops in gfs2_file_buffered_write") Signed-off-by: Andreas Gruenbacher agruenba@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/gfs2/file.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c index 60390f9dc31f..e93185d804e0 100644 --- a/fs/gfs2/file.c +++ b/fs/gfs2/file.c @@ -1086,6 +1086,7 @@ static ssize_t gfs2_file_buffered_write(struct kiocb *iocb, gfs2_holder_uninit(gh); if (statfs_gh) kfree(statfs_gh); + from->count = orig_count - read; return read ? read : ret; }
From: Eli Cohen elic@nvidia.com
[ Upstream commit ad6dc1daaf29f97f23cc810d60ee01c0e83f4c6b ]
If mlx5_vdpa gets unloaded while a VM is running, the workqueue will be destroyed. However, vhost might still have reference to the kick function and might attempt to push new works. This could lead to null pointer dereference.
To fix this, set mvdev->wq to NULL just before destroying and verify that the workqueue is not NULL in mlx5_vdpa_kick_vq before attempting to push a new work.
Fixes: 5262912ef3cf ("vdpa/mlx5: Add support for control VQ and MAC setting") Signed-off-by: Eli Cohen elic@nvidia.com Link: https://lore.kernel.org/r/20220321141303.9586-1-elic@nvidia.com Signed-off-by: Michael S. Tsirkin mst@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/vdpa/mlx5/net/mlx5_vnet.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/vdpa/mlx5/net/mlx5_vnet.c b/drivers/vdpa/mlx5/net/mlx5_vnet.c index 174895372e7f..467a349dc26c 100644 --- a/drivers/vdpa/mlx5/net/mlx5_vnet.c +++ b/drivers/vdpa/mlx5/net/mlx5_vnet.c @@ -1641,7 +1641,7 @@ static void mlx5_vdpa_kick_vq(struct vdpa_device *vdev, u16 idx) return;
if (unlikely(is_ctrl_vq_idx(mvdev, idx))) { - if (!mvdev->cvq.ready) + if (!mvdev->wq || !mvdev->cvq.ready) return;
queue_work(mvdev->wq, &ndev->cvq_ent.work); @@ -2626,9 +2626,12 @@ static void mlx5_vdpa_dev_del(struct vdpa_mgmt_dev *v_mdev, struct vdpa_device * struct mlx5_vdpa_mgmtdev *mgtdev = container_of(v_mdev, struct mlx5_vdpa_mgmtdev, mgtdev); struct mlx5_vdpa_dev *mvdev = to_mvdev(dev); struct mlx5_vdpa_net *ndev = to_mlx5_vdpa_ndev(mvdev); + struct workqueue_struct *wq;
mlx5_notifier_unregister(mvdev->mdev, &ndev->nb); - destroy_workqueue(mvdev->wq); + wq = mvdev->wq; + mvdev->wq = NULL; + destroy_workqueue(wq); _vdpa_unregister_device(dev); mgtdev->ndev = NULL; }
From: Josef Bacik josef@toxicpanda.com
[ Upstream commit 562d7b1512f7369a19bca2883e2e8672d78f0481 ]
We have a lot of device lookup functions that all do something slightly different. Clean this up by adding a struct to hold the different lookup criteria, and then pass this around to btrfs_find_device() so it can do the proper matching based on the lookup criteria.
Reviewed-by: Anand Jain anand.jain@oracle.com Signed-off-by: Josef Bacik josef@toxicpanda.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/dev-replace.c | 16 +++--- fs/btrfs/ioctl.c | 13 +++-- fs/btrfs/scrub.c | 6 +- fs/btrfs/volumes.c | 122 +++++++++++++++++++++++++---------------- fs/btrfs/volumes.h | 20 ++++++- 5 files changed, 112 insertions(+), 65 deletions(-)
diff --git a/fs/btrfs/dev-replace.c b/fs/btrfs/dev-replace.c index bdbc310a8f8c..781556e2a37f 100644 --- a/fs/btrfs/dev-replace.c +++ b/fs/btrfs/dev-replace.c @@ -70,6 +70,7 @@ static int btrfs_dev_replace_kthread(void *data);
int btrfs_init_dev_replace(struct btrfs_fs_info *fs_info) { + struct btrfs_dev_lookup_args args = { .devid = BTRFS_DEV_REPLACE_DEVID }; struct btrfs_key key; struct btrfs_root *dev_root = fs_info->dev_root; struct btrfs_dev_replace *dev_replace = &fs_info->dev_replace; @@ -100,8 +101,7 @@ int btrfs_init_dev_replace(struct btrfs_fs_info *fs_info) * We don't have a replace item or it's corrupted. If there is * a replace target, fail the mount. */ - if (btrfs_find_device(fs_info->fs_devices, - BTRFS_DEV_REPLACE_DEVID, NULL, NULL)) { + if (btrfs_find_device(fs_info->fs_devices, &args)) { btrfs_err(fs_info, "found replace target device without a valid replace item"); ret = -EUCLEAN; @@ -163,8 +163,7 @@ int btrfs_init_dev_replace(struct btrfs_fs_info *fs_info) * We don't have an active replace item but if there is a * replace target, fail the mount. */ - if (btrfs_find_device(fs_info->fs_devices, - BTRFS_DEV_REPLACE_DEVID, NULL, NULL)) { + if (btrfs_find_device(fs_info->fs_devices, &args)) { btrfs_err(fs_info, "replace devid present without an active replace item"); ret = -EUCLEAN; @@ -175,11 +174,10 @@ int btrfs_init_dev_replace(struct btrfs_fs_info *fs_info) break; case BTRFS_IOCTL_DEV_REPLACE_STATE_STARTED: case BTRFS_IOCTL_DEV_REPLACE_STATE_SUSPENDED: - dev_replace->srcdev = btrfs_find_device(fs_info->fs_devices, - src_devid, NULL, NULL); - dev_replace->tgtdev = btrfs_find_device(fs_info->fs_devices, - BTRFS_DEV_REPLACE_DEVID, - NULL, NULL); + dev_replace->tgtdev = btrfs_find_device(fs_info->fs_devices, &args); + args.devid = src_devid; + dev_replace->srcdev = btrfs_find_device(fs_info->fs_devices, &args); + /* * allow 'btrfs dev replace_cancel' if src/tgt device is * missing diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index a37ab3e89a3b..4951a2ab88dd 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -1657,6 +1657,7 @@ static int exclop_start_or_cancel_reloc(struct btrfs_fs_info *fs_info, static noinline int btrfs_ioctl_resize(struct file *file, void __user *arg) { + BTRFS_DEV_LOOKUP_ARGS(args); struct inode *inode = file_inode(file); struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); u64 new_size; @@ -1712,7 +1713,8 @@ static noinline int btrfs_ioctl_resize(struct file *file, btrfs_info(fs_info, "resizing devid %llu", devid); }
- device = btrfs_find_device(fs_info->fs_devices, devid, NULL, NULL); + args.devid = devid; + device = btrfs_find_device(fs_info->fs_devices, &args); if (!device) { btrfs_info(fs_info, "resizer unable to find device %llu", devid); @@ -3375,22 +3377,21 @@ static long btrfs_ioctl_fs_info(struct btrfs_fs_info *fs_info, static long btrfs_ioctl_dev_info(struct btrfs_fs_info *fs_info, void __user *arg) { + BTRFS_DEV_LOOKUP_ARGS(args); struct btrfs_ioctl_dev_info_args *di_args; struct btrfs_device *dev; int ret = 0; - char *s_uuid = NULL;
di_args = memdup_user(arg, sizeof(*di_args)); if (IS_ERR(di_args)) return PTR_ERR(di_args);
+ args.devid = di_args->devid; if (!btrfs_is_empty_uuid(di_args->uuid)) - s_uuid = di_args->uuid; + args.uuid = di_args->uuid;
rcu_read_lock(); - dev = btrfs_find_device(fs_info->fs_devices, di_args->devid, s_uuid, - NULL); - + dev = btrfs_find_device(fs_info->fs_devices, &args); if (!dev) { ret = -ENODEV; goto out; diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c index 62f4bafbe54b..6f2787b21530 100644 --- a/fs/btrfs/scrub.c +++ b/fs/btrfs/scrub.c @@ -4068,6 +4068,7 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start, u64 end, struct btrfs_scrub_progress *progress, int readonly, int is_dev_replace) { + struct btrfs_dev_lookup_args args = { .devid = devid }; struct scrub_ctx *sctx; int ret; struct btrfs_device *dev; @@ -4115,7 +4116,7 @@ int btrfs_scrub_dev(struct btrfs_fs_info *fs_info, u64 devid, u64 start, goto out_free_ctx;
mutex_lock(&fs_info->fs_devices->device_list_mutex); - dev = btrfs_find_device(fs_info->fs_devices, devid, NULL, NULL); + dev = btrfs_find_device(fs_info->fs_devices, &args); if (!dev || (test_bit(BTRFS_DEV_STATE_MISSING, &dev->dev_state) && !is_dev_replace)) { mutex_unlock(&fs_info->fs_devices->device_list_mutex); @@ -4288,11 +4289,12 @@ int btrfs_scrub_cancel_dev(struct btrfs_device *dev) int btrfs_scrub_progress(struct btrfs_fs_info *fs_info, u64 devid, struct btrfs_scrub_progress *progress) { + struct btrfs_dev_lookup_args args = { .devid = devid }; struct btrfs_device *dev; struct scrub_ctx *sctx = NULL;
mutex_lock(&fs_info->fs_devices->device_list_mutex); - dev = btrfs_find_device(fs_info->fs_devices, devid, NULL, NULL); + dev = btrfs_find_device(fs_info->fs_devices, &args); if (dev) sctx = dev->scrub_ctx; if (sctx) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index fa68efd7e610..53417a1c5402 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -844,9 +844,13 @@ static noinline struct btrfs_device *device_list_add(const char *path,
device = NULL; } else { + struct btrfs_dev_lookup_args args = { + .devid = devid, + .uuid = disk_super->dev_item.uuid, + }; + mutex_lock(&fs_devices->device_list_mutex); - device = btrfs_find_device(fs_devices, devid, - disk_super->dev_item.uuid, NULL); + device = btrfs_find_device(fs_devices, &args);
/* * If this disk has been pulled into an fs devices created by @@ -2360,10 +2364,9 @@ void btrfs_destroy_dev_replace_tgtdev(struct btrfs_device *tgtdev) static struct btrfs_device *btrfs_find_device_by_path( struct btrfs_fs_info *fs_info, const char *device_path) { + BTRFS_DEV_LOOKUP_ARGS(args); int ret = 0; struct btrfs_super_block *disk_super; - u64 devid; - u8 *dev_uuid; struct block_device *bdev; struct btrfs_device *device;
@@ -2372,14 +2375,14 @@ static struct btrfs_device *btrfs_find_device_by_path( if (ret) return ERR_PTR(ret);
- devid = btrfs_stack_device_id(&disk_super->dev_item); - dev_uuid = disk_super->dev_item.uuid; + args.devid = btrfs_stack_device_id(&disk_super->dev_item); + args.uuid = disk_super->dev_item.uuid; if (btrfs_fs_incompat(fs_info, METADATA_UUID)) - device = btrfs_find_device(fs_info->fs_devices, devid, dev_uuid, - disk_super->metadata_uuid); + args.fsid = disk_super->metadata_uuid; else - device = btrfs_find_device(fs_info->fs_devices, devid, dev_uuid, - disk_super->fsid); + args.fsid = disk_super->fsid; + + device = btrfs_find_device(fs_info->fs_devices, &args);
btrfs_release_disk_super(disk_super); if (!device) @@ -2395,11 +2398,12 @@ struct btrfs_device *btrfs_find_device_by_devspec( struct btrfs_fs_info *fs_info, u64 devid, const char *device_path) { + BTRFS_DEV_LOOKUP_ARGS(args); struct btrfs_device *device;
if (devid) { - device = btrfs_find_device(fs_info->fs_devices, devid, NULL, - NULL); + args.devid = devid; + device = btrfs_find_device(fs_info->fs_devices, &args); if (!device) return ERR_PTR(-ENOENT); return device; @@ -2409,14 +2413,11 @@ struct btrfs_device *btrfs_find_device_by_devspec( return ERR_PTR(-EINVAL);
if (strcmp(device_path, "missing") == 0) { - /* Find first missing device */ - list_for_each_entry(device, &fs_info->fs_devices->devices, - dev_list) { - if (test_bit(BTRFS_DEV_STATE_IN_FS_METADATA, - &device->dev_state) && !device->bdev) - return device; - } - return ERR_PTR(-ENOENT); + args.missing = true; + device = btrfs_find_device(fs_info->fs_devices, &args); + if (!device) + return ERR_PTR(-ENOENT); + return device; }
return btrfs_find_device_by_path(fs_info, device_path); @@ -2496,6 +2497,7 @@ static int btrfs_prepare_sprout(struct btrfs_fs_info *fs_info) */ static int btrfs_finish_sprout(struct btrfs_trans_handle *trans) { + BTRFS_DEV_LOOKUP_ARGS(args); struct btrfs_fs_info *fs_info = trans->fs_info; struct btrfs_root *root = fs_info->chunk_root; struct btrfs_path *path; @@ -2505,7 +2507,6 @@ static int btrfs_finish_sprout(struct btrfs_trans_handle *trans) struct btrfs_key key; u8 fs_uuid[BTRFS_FSID_SIZE]; u8 dev_uuid[BTRFS_UUID_SIZE]; - u64 devid; int ret;
path = btrfs_alloc_path(); @@ -2544,13 +2545,14 @@ static int btrfs_finish_sprout(struct btrfs_trans_handle *trans)
dev_item = btrfs_item_ptr(leaf, path->slots[0], struct btrfs_dev_item); - devid = btrfs_device_id(leaf, dev_item); + args.devid = btrfs_device_id(leaf, dev_item); read_extent_buffer(leaf, dev_uuid, btrfs_device_uuid(dev_item), BTRFS_UUID_SIZE); read_extent_buffer(leaf, fs_uuid, btrfs_device_fsid(dev_item), BTRFS_FSID_SIZE); - device = btrfs_find_device(fs_info->fs_devices, devid, dev_uuid, - fs_uuid); + args.uuid = dev_uuid; + args.fsid = fs_uuid; + device = btrfs_find_device(fs_info->fs_devices, &args); BUG_ON(!device); /* Logic error */
if (device->fs_devices->seeding) { @@ -6805,6 +6807,33 @@ blk_status_t btrfs_map_bio(struct btrfs_fs_info *fs_info, struct bio *bio, return BLK_STS_OK; }
+static bool dev_args_match_fs_devices(const struct btrfs_dev_lookup_args *args, + const struct btrfs_fs_devices *fs_devices) +{ + if (args->fsid == NULL) + return true; + if (memcmp(fs_devices->metadata_uuid, args->fsid, BTRFS_FSID_SIZE) == 0) + return true; + return false; +} + +static bool dev_args_match_device(const struct btrfs_dev_lookup_args *args, + const struct btrfs_device *device) +{ + ASSERT((args->devid != (u64)-1) || args->missing); + + if ((args->devid != (u64)-1) && device->devid != args->devid) + return false; + if (args->uuid && memcmp(device->uuid, args->uuid, BTRFS_UUID_SIZE) != 0) + return false; + if (!args->missing) + return true; + if (test_bit(BTRFS_DEV_STATE_IN_FS_METADATA, &device->dev_state) && + !device->bdev) + return true; + return false; +} + /* * Find a device specified by @devid or @uuid in the list of @fs_devices, or * return NULL. @@ -6812,31 +6841,25 @@ blk_status_t btrfs_map_bio(struct btrfs_fs_info *fs_info, struct bio *bio, * If devid and uuid are both specified, the match must be exact, otherwise * only devid is used. */ -struct btrfs_device *btrfs_find_device(struct btrfs_fs_devices *fs_devices, - u64 devid, u8 *uuid, u8 *fsid) +struct btrfs_device *btrfs_find_device(const struct btrfs_fs_devices *fs_devices, + const struct btrfs_dev_lookup_args *args) { struct btrfs_device *device; struct btrfs_fs_devices *seed_devs;
- if (!fsid || !memcmp(fs_devices->metadata_uuid, fsid, BTRFS_FSID_SIZE)) { + if (dev_args_match_fs_devices(args, fs_devices)) { list_for_each_entry(device, &fs_devices->devices, dev_list) { - if (device->devid == devid && - (!uuid || memcmp(device->uuid, uuid, - BTRFS_UUID_SIZE) == 0)) + if (dev_args_match_device(args, device)) return device; } }
list_for_each_entry(seed_devs, &fs_devices->seed_list, seed_list) { - if (!fsid || - !memcmp(seed_devs->metadata_uuid, fsid, BTRFS_FSID_SIZE)) { - list_for_each_entry(device, &seed_devs->devices, - dev_list) { - if (device->devid == devid && - (!uuid || memcmp(device->uuid, uuid, - BTRFS_UUID_SIZE) == 0)) - return device; - } + if (!dev_args_match_fs_devices(args, seed_devs)) + continue; + list_for_each_entry(device, &seed_devs->devices, dev_list) { + if (dev_args_match_device(args, device)) + return device; } }
@@ -7002,6 +7025,7 @@ static void warn_32bit_meta_chunk(struct btrfs_fs_info *fs_info, static int read_one_chunk(struct btrfs_key *key, struct extent_buffer *leaf, struct btrfs_chunk *chunk) { + BTRFS_DEV_LOOKUP_ARGS(args); struct btrfs_fs_info *fs_info = leaf->fs_info; struct extent_map_tree *map_tree = &fs_info->mapping_tree; struct map_lookup *map; @@ -7079,11 +7103,12 @@ static int read_one_chunk(struct btrfs_key *key, struct extent_buffer *leaf, map->stripes[i].physical = btrfs_stripe_offset_nr(leaf, chunk, i); devid = btrfs_stripe_devid_nr(leaf, chunk, i); + args.devid = devid; read_extent_buffer(leaf, uuid, (unsigned long) btrfs_stripe_dev_uuid_nr(chunk, i), BTRFS_UUID_SIZE); - map->stripes[i].dev = btrfs_find_device(fs_info->fs_devices, - devid, uuid, NULL); + args.uuid = uuid; + map->stripes[i].dev = btrfs_find_device(fs_info->fs_devices, &args); if (!map->stripes[i].dev && !btrfs_test_opt(fs_info, DEGRADED)) { free_extent_map(em); @@ -7201,6 +7226,7 @@ static struct btrfs_fs_devices *open_seed_devices(struct btrfs_fs_info *fs_info, static int read_one_dev(struct extent_buffer *leaf, struct btrfs_dev_item *dev_item) { + BTRFS_DEV_LOOKUP_ARGS(args); struct btrfs_fs_info *fs_info = leaf->fs_info; struct btrfs_fs_devices *fs_devices = fs_info->fs_devices; struct btrfs_device *device; @@ -7209,11 +7235,13 @@ static int read_one_dev(struct extent_buffer *leaf, u8 fs_uuid[BTRFS_FSID_SIZE]; u8 dev_uuid[BTRFS_UUID_SIZE];
- devid = btrfs_device_id(leaf, dev_item); + devid = args.devid = btrfs_device_id(leaf, dev_item); read_extent_buffer(leaf, dev_uuid, btrfs_device_uuid(dev_item), BTRFS_UUID_SIZE); read_extent_buffer(leaf, fs_uuid, btrfs_device_fsid(dev_item), BTRFS_FSID_SIZE); + args.uuid = dev_uuid; + args.fsid = fs_uuid;
if (memcmp(fs_uuid, fs_devices->metadata_uuid, BTRFS_FSID_SIZE)) { fs_devices = open_seed_devices(fs_info, fs_uuid); @@ -7221,8 +7249,7 @@ static int read_one_dev(struct extent_buffer *leaf, return PTR_ERR(fs_devices); }
- device = btrfs_find_device(fs_info->fs_devices, devid, dev_uuid, - fs_uuid); + device = btrfs_find_device(fs_info->fs_devices, &args); if (!device) { if (!btrfs_test_opt(fs_info, DEGRADED)) { btrfs_report_missing_device(fs_info, devid, @@ -7899,12 +7926,14 @@ static void btrfs_dev_stat_print_on_load(struct btrfs_device *dev) int btrfs_get_dev_stats(struct btrfs_fs_info *fs_info, struct btrfs_ioctl_get_dev_stats *stats) { + BTRFS_DEV_LOOKUP_ARGS(args); struct btrfs_device *dev; struct btrfs_fs_devices *fs_devices = fs_info->fs_devices; int i;
mutex_lock(&fs_devices->device_list_mutex); - dev = btrfs_find_device(fs_info->fs_devices, stats->devid, NULL, NULL); + args.devid = stats->devid; + dev = btrfs_find_device(fs_info->fs_devices, &args); mutex_unlock(&fs_devices->device_list_mutex);
if (!dev) { @@ -7980,6 +8009,7 @@ static int verify_one_dev_extent(struct btrfs_fs_info *fs_info, u64 chunk_offset, u64 devid, u64 physical_offset, u64 physical_len) { + struct btrfs_dev_lookup_args args = { .devid = devid }; struct extent_map_tree *em_tree = &fs_info->mapping_tree; struct extent_map *em; struct map_lookup *map; @@ -8035,7 +8065,7 @@ static int verify_one_dev_extent(struct btrfs_fs_info *fs_info, }
/* Make sure no dev extent is beyond device boundary */ - dev = btrfs_find_device(fs_info->fs_devices, devid, NULL, NULL); + dev = btrfs_find_device(fs_info->fs_devices, &args); if (!dev) { btrfs_err(fs_info, "failed to find devid %llu", devid); ret = -EUCLEAN; diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h index d86a6f9f166c..f3b1380f45ad 100644 --- a/fs/btrfs/volumes.h +++ b/fs/btrfs/volumes.h @@ -418,6 +418,22 @@ struct btrfs_balance_control { struct btrfs_balance_progress stat; };
+/* + * Search for a given device by the set parameters + */ +struct btrfs_dev_lookup_args { + u64 devid; + u8 *uuid; + u8 *fsid; + bool missing; +}; + +/* We have to initialize to -1 because BTRFS_DEV_REPLACE_DEVID is 0 */ +#define BTRFS_DEV_LOOKUP_ARGS_INIT { .devid = (u64)-1 } + +#define BTRFS_DEV_LOOKUP_ARGS(name) \ + struct btrfs_dev_lookup_args name = BTRFS_DEV_LOOKUP_ARGS_INIT + enum btrfs_map_op { BTRFS_MAP_READ, BTRFS_MAP_WRITE, @@ -482,8 +498,8 @@ void __exit btrfs_cleanup_fs_uuids(void); int btrfs_num_copies(struct btrfs_fs_info *fs_info, u64 logical, u64 len); int btrfs_grow_device(struct btrfs_trans_handle *trans, struct btrfs_device *device, u64 new_size); -struct btrfs_device *btrfs_find_device(struct btrfs_fs_devices *fs_devices, - u64 devid, u8 *uuid, u8 *fsid); +struct btrfs_device *btrfs_find_device(const struct btrfs_fs_devices *fs_devices, + const struct btrfs_dev_lookup_args *args); int btrfs_shrink_device(struct btrfs_device *device, u64 new_size); int btrfs_init_new_device(struct btrfs_fs_info *fs_info, const char *path); int btrfs_balance(struct btrfs_fs_info *fs_info,
From: Josef Bacik josef@toxicpanda.com
[ Upstream commit faa775c41d655a4786e9d53cb075a77bb5a75f66 ]
We are going to want to populate our device lookup args outside of any locks and then do the actual device lookup later, so add a helper to do this work and make btrfs_find_device_by_devspec() use this helper for now.
Reviewed-by: Nikolay Borisov nborisov@suse.com Reviewed-by: Anand Jain anand.jain@oracle.com Signed-off-by: Josef Bacik josef@toxicpanda.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/volumes.c | 96 ++++++++++++++++++++++++++++++---------------- fs/btrfs/volumes.h | 4 ++ 2 files changed, 68 insertions(+), 32 deletions(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 53417a1c5402..8d09e6d442b2 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -2361,45 +2361,81 @@ void btrfs_destroy_dev_replace_tgtdev(struct btrfs_device *tgtdev) btrfs_free_device(tgtdev); }
-static struct btrfs_device *btrfs_find_device_by_path( - struct btrfs_fs_info *fs_info, const char *device_path) +/** + * Populate args from device at path + * + * @fs_info: the filesystem + * @args: the args to populate + * @path: the path to the device + * + * This will read the super block of the device at @path and populate @args with + * the devid, fsid, and uuid. This is meant to be used for ioctls that need to + * lookup a device to operate on, but need to do it before we take any locks. + * This properly handles the special case of "missing" that a user may pass in, + * and does some basic sanity checks. The caller must make sure that @path is + * properly NUL terminated before calling in, and must call + * btrfs_put_dev_args_from_path() in order to free up the temporary fsid and + * uuid buffers. + * + * Return: 0 for success, -errno for failure + */ +int btrfs_get_dev_args_from_path(struct btrfs_fs_info *fs_info, + struct btrfs_dev_lookup_args *args, + const char *path) { - BTRFS_DEV_LOOKUP_ARGS(args); - int ret = 0; struct btrfs_super_block *disk_super; struct block_device *bdev; - struct btrfs_device *device; + int ret;
- ret = btrfs_get_bdev_and_sb(device_path, FMODE_READ, - fs_info->bdev_holder, 0, &bdev, &disk_super); - if (ret) - return ERR_PTR(ret); + if (!path || !path[0]) + return -EINVAL; + if (!strcmp(path, "missing")) { + args->missing = true; + return 0; + } + + args->uuid = kzalloc(BTRFS_UUID_SIZE, GFP_KERNEL); + args->fsid = kzalloc(BTRFS_FSID_SIZE, GFP_KERNEL); + if (!args->uuid || !args->fsid) { + btrfs_put_dev_args_from_path(args); + return -ENOMEM; + }
- args.devid = btrfs_stack_device_id(&disk_super->dev_item); - args.uuid = disk_super->dev_item.uuid; + ret = btrfs_get_bdev_and_sb(path, FMODE_READ, fs_info->bdev_holder, 0, + &bdev, &disk_super); + if (ret) + return ret; + args->devid = btrfs_stack_device_id(&disk_super->dev_item); + memcpy(args->uuid, disk_super->dev_item.uuid, BTRFS_UUID_SIZE); if (btrfs_fs_incompat(fs_info, METADATA_UUID)) - args.fsid = disk_super->metadata_uuid; + memcpy(args->fsid, disk_super->metadata_uuid, BTRFS_FSID_SIZE); else - args.fsid = disk_super->fsid; - - device = btrfs_find_device(fs_info->fs_devices, &args); - + memcpy(args->fsid, disk_super->fsid, BTRFS_FSID_SIZE); btrfs_release_disk_super(disk_super); - if (!device) - device = ERR_PTR(-ENOENT); blkdev_put(bdev, FMODE_READ); - return device; + return 0; }
/* - * Lookup a device given by device id, or the path if the id is 0. + * Only use this jointly with btrfs_get_dev_args_from_path() because we will + * allocate our ->uuid and ->fsid pointers, everybody else uses local variables + * that don't need to be freed. */ +void btrfs_put_dev_args_from_path(struct btrfs_dev_lookup_args *args) +{ + kfree(args->uuid); + kfree(args->fsid); + args->uuid = NULL; + args->fsid = NULL; +} + struct btrfs_device *btrfs_find_device_by_devspec( struct btrfs_fs_info *fs_info, u64 devid, const char *device_path) { BTRFS_DEV_LOOKUP_ARGS(args); struct btrfs_device *device; + int ret;
if (devid) { args.devid = devid; @@ -2409,18 +2445,14 @@ struct btrfs_device *btrfs_find_device_by_devspec( return device; }
- if (!device_path || !device_path[0]) - return ERR_PTR(-EINVAL); - - if (strcmp(device_path, "missing") == 0) { - args.missing = true; - device = btrfs_find_device(fs_info->fs_devices, &args); - if (!device) - return ERR_PTR(-ENOENT); - return device; - } - - return btrfs_find_device_by_path(fs_info, device_path); + ret = btrfs_get_dev_args_from_path(fs_info, &args, device_path); + if (ret) + return ERR_PTR(ret); + device = btrfs_find_device(fs_info->fs_devices, &args); + btrfs_put_dev_args_from_path(&args); + if (!device) + return ERR_PTR(-ENOENT); + return device; }
/* diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h index f3b1380f45ad..d1df03f77e29 100644 --- a/fs/btrfs/volumes.h +++ b/fs/btrfs/volumes.h @@ -487,9 +487,13 @@ void btrfs_assign_next_active_device(struct btrfs_device *device, struct btrfs_device *btrfs_find_device_by_devspec(struct btrfs_fs_info *fs_info, u64 devid, const char *devpath); +int btrfs_get_dev_args_from_path(struct btrfs_fs_info *fs_info, + struct btrfs_dev_lookup_args *args, + const char *path); struct btrfs_device *btrfs_alloc_device(struct btrfs_fs_info *fs_info, const u64 *devid, const u8 *uuid); +void btrfs_put_dev_args_from_path(struct btrfs_dev_lookup_args *args); void btrfs_free_device(struct btrfs_device *device); int btrfs_rm_device(struct btrfs_fs_info *fs_info, const char *device_path, u64 devid,
From: Josef Bacik josef@toxicpanda.com
[ Upstream commit 1a15eb724aaef8656f8cc01d9355797cfe7c618e ]
For device removal and replace we call btrfs_find_device_by_devspec, which if we give it a device path and nothing else will call btrfs_get_dev_args_from_path, which opens the block device and reads the super block and then looks up our device based on that.
However at this point we're holding the sb write "lock", so reading the block device pulls in the dependency of ->open_mutex, which produces the following lockdep splat
====================================================== WARNING: possible circular locking dependency detected 5.14.0-rc2+ #405 Not tainted ------------------------------------------------------ losetup/11576 is trying to acquire lock: ffff9bbe8cded938 ((wq_completion)loop0){+.+.}-{0:0}, at: flush_workqueue+0x67/0x5e0
but task is already holding lock: ffff9bbe88e4fc68 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x660 [loop]
which lock already depends on the new lock.
the existing dependency chain (in reverse order) is:
-> #4 (&lo->lo_mutex){+.+.}-{3:3}: __mutex_lock+0x7d/0x750 lo_open+0x28/0x60 [loop] blkdev_get_whole+0x25/0xf0 blkdev_get_by_dev.part.0+0x168/0x3c0 blkdev_open+0xd2/0xe0 do_dentry_open+0x161/0x390 path_openat+0x3cc/0xa20 do_filp_open+0x96/0x120 do_sys_openat2+0x7b/0x130 __x64_sys_openat+0x46/0x70 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #3 (&disk->open_mutex){+.+.}-{3:3}: __mutex_lock+0x7d/0x750 blkdev_get_by_dev.part.0+0x56/0x3c0 blkdev_get_by_path+0x98/0xa0 btrfs_get_bdev_and_sb+0x1b/0xb0 btrfs_find_device_by_devspec+0x12b/0x1c0 btrfs_rm_device+0x127/0x610 btrfs_ioctl+0x2a31/0x2e70 __x64_sys_ioctl+0x80/0xb0 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae
-> #2 (sb_writers#12){.+.+}-{0:0}: lo_write_bvec+0xc2/0x240 [loop] loop_process_work+0x238/0xd00 [loop] process_one_work+0x26b/0x560 worker_thread+0x55/0x3c0 kthread+0x140/0x160 ret_from_fork+0x1f/0x30
-> #1 ((work_completion)(&lo->rootcg_work)){+.+.}-{0:0}: process_one_work+0x245/0x560 worker_thread+0x55/0x3c0 kthread+0x140/0x160 ret_from_fork+0x1f/0x30
-> #0 ((wq_completion)loop0){+.+.}-{0:0}: __lock_acquire+0x10ea/0x1d90 lock_acquire+0xb5/0x2b0 flush_workqueue+0x91/0x5e0 drain_workqueue+0xa0/0x110 destroy_workqueue+0x36/0x250 __loop_clr_fd+0x9a/0x660 [loop] block_ioctl+0x3f/0x50 __x64_sys_ioctl+0x80/0xb0 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae
other info that might help us debug this:
Chain exists of: (wq_completion)loop0 --> &disk->open_mutex --> &lo->lo_mutex
Possible unsafe locking scenario:
CPU0 CPU1 ---- ---- lock(&lo->lo_mutex); lock(&disk->open_mutex); lock(&lo->lo_mutex); lock((wq_completion)loop0);
*** DEADLOCK ***
1 lock held by losetup/11576: #0: ffff9bbe88e4fc68 (&lo->lo_mutex){+.+.}-{3:3}, at: __loop_clr_fd+0x41/0x660 [loop]
stack backtrace: CPU: 0 PID: 11576 Comm: losetup Not tainted 5.14.0-rc2+ #405 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.13.0-2.fc32 04/01/2014 Call Trace: dump_stack_lvl+0x57/0x72 check_noncircular+0xcf/0xf0 ? stack_trace_save+0x3b/0x50 __lock_acquire+0x10ea/0x1d90 lock_acquire+0xb5/0x2b0 ? flush_workqueue+0x67/0x5e0 ? lockdep_init_map_type+0x47/0x220 flush_workqueue+0x91/0x5e0 ? flush_workqueue+0x67/0x5e0 ? verify_cpu+0xf0/0x100 drain_workqueue+0xa0/0x110 destroy_workqueue+0x36/0x250 __loop_clr_fd+0x9a/0x660 [loop] ? blkdev_ioctl+0x8d/0x2a0 block_ioctl+0x3f/0x50 __x64_sys_ioctl+0x80/0xb0 do_syscall_64+0x38/0x90 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f31b02404cb
Instead what we want to do is populate our device lookup args before we grab any locks, and then pass these args into btrfs_rm_device(). From there we can find the device and do the appropriate removal.
Suggested-by: Anand Jain anand.jain@oracle.com Reviewed-by: Anand Jain anand.jain@oracle.com Signed-off-by: Josef Bacik josef@toxicpanda.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/ioctl.c | 67 +++++++++++++++++++++++++++------------------- fs/btrfs/volumes.c | 15 +++++------ fs/btrfs/volumes.h | 2 +- 3 files changed, 48 insertions(+), 36 deletions(-)
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c index 4951a2ab88dd..4317720a29e8 100644 --- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -3218,6 +3218,7 @@ static long btrfs_ioctl_add_dev(struct btrfs_fs_info *fs_info, void __user *arg)
static long btrfs_ioctl_rm_dev_v2(struct file *file, void __user *arg) { + BTRFS_DEV_LOOKUP_ARGS(args); struct inode *inode = file_inode(file); struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); struct btrfs_ioctl_vol_args_v2 *vol_args; @@ -3229,35 +3230,39 @@ static long btrfs_ioctl_rm_dev_v2(struct file *file, void __user *arg) if (!capable(CAP_SYS_ADMIN)) return -EPERM;
- ret = mnt_want_write_file(file); - if (ret) - return ret; - vol_args = memdup_user(arg, sizeof(*vol_args)); if (IS_ERR(vol_args)) { ret = PTR_ERR(vol_args); - goto err_drop; + goto out; }
if (vol_args->flags & ~BTRFS_DEVICE_REMOVE_ARGS_MASK) { ret = -EOPNOTSUPP; goto out; } + vol_args->name[BTRFS_SUBVOL_NAME_MAX] = '\0'; - if (!(vol_args->flags & BTRFS_DEVICE_SPEC_BY_ID) && - strcmp("cancel", vol_args->name) == 0) + if (vol_args->flags & BTRFS_DEVICE_SPEC_BY_ID) { + args.devid = vol_args->devid; + } else if (!strcmp("cancel", vol_args->name)) { cancel = true; + } else { + ret = btrfs_get_dev_args_from_path(fs_info, &args, vol_args->name); + if (ret) + goto out; + } + + ret = mnt_want_write_file(file); + if (ret) + goto out;
ret = exclop_start_or_cancel_reloc(fs_info, BTRFS_EXCLOP_DEV_REMOVE, cancel); if (ret) - goto out; - /* Exclusive operation is now claimed */ + goto err_drop;
- if (vol_args->flags & BTRFS_DEVICE_SPEC_BY_ID) - ret = btrfs_rm_device(fs_info, NULL, vol_args->devid, &bdev, &mode); - else - ret = btrfs_rm_device(fs_info, vol_args->name, 0, &bdev, &mode); + /* Exclusive operation is now claimed */ + ret = btrfs_rm_device(fs_info, &args, &bdev, &mode);
btrfs_exclop_finish(fs_info);
@@ -3269,17 +3274,19 @@ static long btrfs_ioctl_rm_dev_v2(struct file *file, void __user *arg) btrfs_info(fs_info, "device deleted: %s", vol_args->name); } -out: - kfree(vol_args); err_drop: mnt_drop_write_file(file); if (bdev) blkdev_put(bdev, mode); +out: + btrfs_put_dev_args_from_path(&args); + kfree(vol_args); return ret; }
static long btrfs_ioctl_rm_dev(struct file *file, void __user *arg) { + BTRFS_DEV_LOOKUP_ARGS(args); struct inode *inode = file_inode(file); struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); struct btrfs_ioctl_vol_args *vol_args; @@ -3291,32 +3298,38 @@ static long btrfs_ioctl_rm_dev(struct file *file, void __user *arg) if (!capable(CAP_SYS_ADMIN)) return -EPERM;
- ret = mnt_want_write_file(file); - if (ret) - return ret; - vol_args = memdup_user(arg, sizeof(*vol_args)); - if (IS_ERR(vol_args)) { - ret = PTR_ERR(vol_args); - goto out_drop_write; - } + if (IS_ERR(vol_args)) + return PTR_ERR(vol_args); + vol_args->name[BTRFS_PATH_NAME_MAX] = '\0'; - cancel = (strcmp("cancel", vol_args->name) == 0); + if (!strcmp("cancel", vol_args->name)) { + cancel = true; + } else { + ret = btrfs_get_dev_args_from_path(fs_info, &args, vol_args->name); + if (ret) + goto out; + } + + ret = mnt_want_write_file(file); + if (ret) + goto out;
ret = exclop_start_or_cancel_reloc(fs_info, BTRFS_EXCLOP_DEV_REMOVE, cancel); if (ret == 0) { - ret = btrfs_rm_device(fs_info, vol_args->name, 0, &bdev, &mode); + ret = btrfs_rm_device(fs_info, &args, &bdev, &mode); if (!ret) btrfs_info(fs_info, "disk deleted %s", vol_args->name); btrfs_exclop_finish(fs_info); }
- kfree(vol_args); -out_drop_write: mnt_drop_write_file(file); if (bdev) blkdev_put(bdev, mode); +out: + btrfs_put_dev_args_from_path(&args); + kfree(vol_args); return ret; }
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 8d09e6d442b2..3bd68f1b79e6 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -2120,8 +2120,9 @@ void btrfs_scratch_superblocks(struct btrfs_fs_info *fs_info, update_dev_time(device_path); }
-int btrfs_rm_device(struct btrfs_fs_info *fs_info, const char *device_path, - u64 devid, struct block_device **bdev, fmode_t *mode) +int btrfs_rm_device(struct btrfs_fs_info *fs_info, + struct btrfs_dev_lookup_args *args, + struct block_device **bdev, fmode_t *mode) { struct btrfs_device *device; struct btrfs_fs_devices *cur_devices; @@ -2140,14 +2141,12 @@ int btrfs_rm_device(struct btrfs_fs_info *fs_info, const char *device_path, if (ret) goto out;
- device = btrfs_find_device_by_devspec(fs_info, devid, device_path); - - if (IS_ERR(device)) { - if (PTR_ERR(device) == -ENOENT && - device_path && strcmp(device_path, "missing") == 0) + device = btrfs_find_device(fs_info->fs_devices, args); + if (!device) { + if (args->missing) ret = BTRFS_ERROR_DEV_MISSING_NOT_FOUND; else - ret = PTR_ERR(device); + ret = -ENOENT; goto out; }
diff --git a/fs/btrfs/volumes.h b/fs/btrfs/volumes.h index d1df03f77e29..30288b728bbb 100644 --- a/fs/btrfs/volumes.h +++ b/fs/btrfs/volumes.h @@ -496,7 +496,7 @@ struct btrfs_device *btrfs_alloc_device(struct btrfs_fs_info *fs_info, void btrfs_put_dev_args_from_path(struct btrfs_dev_lookup_args *args); void btrfs_free_device(struct btrfs_device *device); int btrfs_rm_device(struct btrfs_fs_info *fs_info, - const char *device_path, u64 devid, + struct btrfs_dev_lookup_args *args, struct block_device **bdev, fmode_t *mode); void __exit btrfs_cleanup_fs_uuids(void); int btrfs_num_copies(struct btrfs_fs_info *fs_info, u64 logical, u64 len);
From: Qu Wenruo wqu@suse.com
[ Upstream commit bbac58698a55cc0a6f0c0d69a6dcd3f9f3134c11 ]
[BUG] There is a report that a btrfs has a bad super block num devices.
This makes btrfs to reject the fs completely.
BTRFS error (device sdd3): super_num_devices 3 mismatch with num_devices 2 found here BTRFS error (device sdd3): failed to read chunk tree: -22 BTRFS error (device sdd3): open_ctree failed
[CAUSE] During btrfs device removal, chunk tree and super block num devs are updated in two different transactions:
btrfs_rm_device() |- btrfs_rm_dev_item(device) | |- trans = btrfs_start_transaction() | | Now we got transaction X | | | |- btrfs_del_item() | | Now device item is removed from chunk tree | | | |- btrfs_commit_transaction() | Transaction X got committed, super num devs untouched, | but device item removed from chunk tree. | (AKA, super num devs is already incorrect) | |- cur_devices->num_devices--; |- cur_devices->total_devices--; |- btrfs_set_super_num_devices() All those operations are not in transaction X, thus it will only be written back to disk in next transaction.
So after the transaction X in btrfs_rm_dev_item() committed, but before transaction X+1 (which can be minutes away), a power loss happen, then we got the super num mismatch.
[FIX] Instead of starting and committing a transaction inside btrfs_rm_dev_item(), start a transaction in side btrfs_rm_device() and pass it to btrfs_rm_dev_item().
And only commit the transaction after everything is done.
Reported-by: Luca Béla Palkovics luca.bela.palkovics@gmail.com Link: https://lore.kernel.org/linux-btrfs/CA+8xDSpvdm_U0QLBAnrH=zqDq_cWCOH5TiV46CK... CC: stable@vger.kernel.org # 4.14+ Reviewed-by: Anand Jain anand.jain@oracle.com Signed-off-by: Qu Wenruo wqu@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/volumes.c | 65 ++++++++++++++++++++-------------------------- 1 file changed, 28 insertions(+), 37 deletions(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 3bd68f1b79e6..cec54c6e1cdd 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1942,23 +1942,18 @@ static void update_dev_time(const char *device_path) path_put(&path); }
-static int btrfs_rm_dev_item(struct btrfs_device *device) +static int btrfs_rm_dev_item(struct btrfs_trans_handle *trans, + struct btrfs_device *device) { struct btrfs_root *root = device->fs_info->chunk_root; int ret; struct btrfs_path *path; struct btrfs_key key; - struct btrfs_trans_handle *trans;
path = btrfs_alloc_path(); if (!path) return -ENOMEM;
- trans = btrfs_start_transaction(root, 0); - if (IS_ERR(trans)) { - btrfs_free_path(path); - return PTR_ERR(trans); - } key.objectid = BTRFS_DEV_ITEMS_OBJECTID; key.type = BTRFS_DEV_ITEM_KEY; key.offset = device->devid; @@ -1969,21 +1964,12 @@ static int btrfs_rm_dev_item(struct btrfs_device *device) if (ret) { if (ret > 0) ret = -ENOENT; - btrfs_abort_transaction(trans, ret); - btrfs_end_transaction(trans); goto out; }
ret = btrfs_del_item(trans, root, path); - if (ret) { - btrfs_abort_transaction(trans, ret); - btrfs_end_transaction(trans); - } - out: btrfs_free_path(path); - if (!ret) - ret = btrfs_commit_transaction(trans); return ret; }
@@ -2124,6 +2110,7 @@ int btrfs_rm_device(struct btrfs_fs_info *fs_info, struct btrfs_dev_lookup_args *args, struct block_device **bdev, fmode_t *mode) { + struct btrfs_trans_handle *trans; struct btrfs_device *device; struct btrfs_fs_devices *cur_devices; struct btrfs_fs_devices *fs_devices = fs_info->fs_devices; @@ -2139,7 +2126,7 @@ int btrfs_rm_device(struct btrfs_fs_info *fs_info,
ret = btrfs_check_raid_min_devices(fs_info, num_devices - 1); if (ret) - goto out; + return ret;
device = btrfs_find_device(fs_info->fs_devices, args); if (!device) { @@ -2147,27 +2134,22 @@ int btrfs_rm_device(struct btrfs_fs_info *fs_info, ret = BTRFS_ERROR_DEV_MISSING_NOT_FOUND; else ret = -ENOENT; - goto out; + return ret; }
if (btrfs_pinned_by_swapfile(fs_info, device)) { btrfs_warn_in_rcu(fs_info, "cannot remove device %s (devid %llu) due to active swapfile", rcu_str_deref(device->name), device->devid); - ret = -ETXTBSY; - goto out; + return -ETXTBSY; }
- if (test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state)) { - ret = BTRFS_ERROR_DEV_TGT_REPLACE; - goto out; - } + if (test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state)) + return BTRFS_ERROR_DEV_TGT_REPLACE;
if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state) && - fs_info->fs_devices->rw_devices == 1) { - ret = BTRFS_ERROR_DEV_ONLY_WRITABLE; - goto out; - } + fs_info->fs_devices->rw_devices == 1) + return BTRFS_ERROR_DEV_ONLY_WRITABLE;
if (test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state)) { mutex_lock(&fs_info->chunk_mutex); @@ -2182,14 +2164,22 @@ int btrfs_rm_device(struct btrfs_fs_info *fs_info, if (ret) goto error_undo;
- /* - * TODO: the superblock still includes this device in its num_devices - * counter although write_all_supers() is not locked out. This - * could give a filesystem state which requires a degraded mount. - */ - ret = btrfs_rm_dev_item(device); - if (ret) + trans = btrfs_start_transaction(fs_info->chunk_root, 0); + if (IS_ERR(trans)) { + ret = PTR_ERR(trans); goto error_undo; + } + + ret = btrfs_rm_dev_item(trans, device); + if (ret) { + /* Any error in dev item removal is critical */ + btrfs_crit(fs_info, + "failed to remove device item for devid %llu: %d", + device->devid, ret); + btrfs_abort_transaction(trans, ret); + btrfs_end_transaction(trans); + return ret; + }
clear_bit(BTRFS_DEV_STATE_IN_FS_METADATA, &device->dev_state); btrfs_scrub_cancel_dev(device); @@ -2264,7 +2254,8 @@ int btrfs_rm_device(struct btrfs_fs_info *fs_info, free_fs_devices(cur_devices); }
-out: + ret = btrfs_commit_transaction(trans); + return ret;
error_undo: @@ -2276,7 +2267,7 @@ int btrfs_rm_device(struct btrfs_fs_info *fs_info, device->fs_devices->rw_devices++; mutex_unlock(&fs_info->chunk_mutex); } - goto out; + return ret; }
void btrfs_rm_dev_replace_remove_srcdev(struct btrfs_device *srcdev)
From: Luis Chamberlain mcgrof@kernel.org
[ Upstream commit e92ab4eda516a5bfd96c087282ebe9521deba4f4 ]
We never checked for errors on add_disk() as this function returned void. Now that this is fixed, use the shiny new error handling.
Signed-off-by: Luis Chamberlain mcgrof@kernel.org Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/drbd/drbd_main.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c index 8ba2fe356f01..ae6a136d278e 100644 --- a/drivers/block/drbd/drbd_main.c +++ b/drivers/block/drbd/drbd_main.c @@ -2798,7 +2798,9 @@ enum drbd_ret_code drbd_create_device(struct drbd_config_context *adm_ctx, unsig goto out_idr_remove_vol; }
- add_disk(disk); + err = add_disk(disk); + if (err) + goto out_cleanup_disk;
/* inherit the connection state */ device->state.conn = first_connection(resource)->cstate; @@ -2812,6 +2814,8 @@ enum drbd_ret_code drbd_create_device(struct drbd_config_context *adm_ctx, unsig drbd_debugfs_device_add(device); return NO_ERROR;
+out_cleanup_disk: + blk_cleanup_disk(disk); out_idr_remove_vol: idr_remove(&connection->peer_devices, vnr); out_idr_remove_from_resource:
From: Wu Bo wubo40@huawei.com
[ Upstream commit 27548088ac628109f70eb0b1eb521d035844dba8 ]
In drbd_create_device(), the 'out_no_io_page' lable has called blk_cleanup_disk() when return failed.
So remove the 'out_cleanup_disk' lable to avoid double free the disk pointer.
Fixes: e92ab4eda516 ("drbd: add error handling support for add_disk()") Signed-off-by: Wu Bo wubo40@huawei.com Reviewed-by: Christoph Hellwig hch@lst.de Link: https://lore.kernel.org/r/1636013229-26309-1-git-send-email-wubo40@huawei.co... Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/drbd/drbd_main.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c index ae6a136d278e..b91d2a9dc238 100644 --- a/drivers/block/drbd/drbd_main.c +++ b/drivers/block/drbd/drbd_main.c @@ -2800,7 +2800,7 @@ enum drbd_ret_code drbd_create_device(struct drbd_config_context *adm_ctx, unsig
err = add_disk(disk); if (err) - goto out_cleanup_disk; + goto out_idr_remove_vol;
/* inherit the connection state */ device->state.conn = first_connection(resource)->cstate; @@ -2814,8 +2814,6 @@ enum drbd_ret_code drbd_create_device(struct drbd_config_context *adm_ctx, unsig drbd_debugfs_device_add(device); return NO_ERROR;
-out_cleanup_disk: - blk_cleanup_disk(disk); out_idr_remove_vol: idr_remove(&connection->peer_devices, vnr); out_idr_remove_from_resource:
From: Xiaomeng Tong xiam0nd.tong@gmail.com
[ Upstream commit ae4d37b5df749926891583d42a6801b5da11e3c1 ]
The bug is here: idr_remove(&connection->peer_devices, vnr);
If the previous for_each_connection() don't exit early (no goto hit inside the loop), the iterator 'connection' after the loop will be a bogus pointer to an invalid structure object containing the HEAD (&resource->connections). As a result, the use of 'connection' above will lead to a invalid memory access (including a possible invalid free as idr_remove could call free_layer).
The original intention should have been to remove all peer_devices, but the following lines have already done the work. So just remove this line and the unneeded label, to fix this bug.
Cc: stable@vger.kernel.org Fixes: c06ece6ba6f1b ("drbd: Turn connection->volumes into connection->peer_devices") Signed-off-by: Xiaomeng Tong xiam0nd.tong@gmail.com Reviewed-by: Christoph Böhmwalder christoph.boehmwalder@linbit.com Reviewed-by: Lars Ellenberg lars.ellenberg@linbit.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/drbd/drbd_main.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c index b91d2a9dc238..d59af26d7703 100644 --- a/drivers/block/drbd/drbd_main.c +++ b/drivers/block/drbd/drbd_main.c @@ -2795,12 +2795,12 @@ enum drbd_ret_code drbd_create_device(struct drbd_config_context *adm_ctx, unsig
if (init_submitter(device)) { err = ERR_NOMEM; - goto out_idr_remove_vol; + goto out_idr_remove_from_resource; }
err = add_disk(disk); if (err) - goto out_idr_remove_vol; + goto out_idr_remove_from_resource;
/* inherit the connection state */ device->state.conn = first_connection(resource)->cstate; @@ -2814,8 +2814,6 @@ enum drbd_ret_code drbd_create_device(struct drbd_config_context *adm_ctx, unsig drbd_debugfs_device_add(device); return NO_ERROR;
-out_idr_remove_vol: - idr_remove(&connection->peer_devices, vnr); out_idr_remove_from_resource: for_each_connection(connection, resource) { peer_device = idr_remove(&connection->peer_devices, vnr);
From: Michael Strauss michael.strauss@amd.com
[ Upstream commit bc204778b4032b336cb3bde85bea852d79e7e389 ]
[WHY] Clocks don't get recalculated in 0 stream/0 pipe configs, blocking S0i3 if dcfclk gets high enough
[HOW] Create DCN31 copy of DCN30 bandwidth validation func which doesn't entirely skip validation in 0 pipe scenarios
Override dcfclk to vlevel 0/min value during validation if pipe count is 0
Reviewed-by: Eric Yang Eric.Yang2@amd.com Acked-by: Qingqing Zhuo qingqing.zhuo@amd.com Signed-off-by: Michael Strauss michael.strauss@amd.com Tested-by: Daniel Wheeler Daniel.Wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../drm/amd/display/dc/dcn30/dcn30_resource.c | 2 +- .../drm/amd/display/dc/dcn30/dcn30_resource.h | 7 +++ .../drm/amd/display/dc/dcn31/dcn31_resource.c | 63 ++++++++++++++++++- 3 files changed, 70 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_resource.c b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_resource.c index 0294d0cc4759..735c92a5aa36 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_resource.c @@ -1856,7 +1856,7 @@ static struct pipe_ctx *dcn30_find_split_pipe( return pipe; }
-static noinline bool dcn30_internal_validate_bw( +noinline bool dcn30_internal_validate_bw( struct dc *dc, struct dc_state *context, display_e2e_pipe_params_st *pipes, diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_resource.h b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_resource.h index b754b89beadf..b92e4cc0232f 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_resource.h +++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_resource.h @@ -55,6 +55,13 @@ unsigned int dcn30_calc_max_scaled_time(
bool dcn30_validate_bandwidth(struct dc *dc, struct dc_state *context, bool fast_validate); +bool dcn30_internal_validate_bw( + struct dc *dc, + struct dc_state *context, + display_e2e_pipe_params_st *pipes, + int *pipe_cnt_out, + int *vlevel_out, + bool fast_validate); void dcn30_calculate_wm_and_dlg( struct dc *dc, struct dc_state *context, display_e2e_pipe_params_st *pipes, diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c index b60ab3cc0f11..7aadb35a3079 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c @@ -1664,6 +1664,15 @@ static void dcn31_calculate_wm_and_dlg_fp( if (context->bw_ctx.dml.soc.min_dcfclk > dcfclk) dcfclk = context->bw_ctx.dml.soc.min_dcfclk;
+ /* We don't recalculate clocks for 0 pipe configs, which can block + * S0i3 as high clocks will block low power states + * Override any clocks that can block S0i3 to min here + */ + if (pipe_cnt == 0) { + context->bw_ctx.bw.dcn.clk.dcfclk_khz = dcfclk; // always should be vlevel 0 + return; + } + pipes[0].clks_cfg.voltage = vlevel; pipes[0].clks_cfg.dcfclk_mhz = dcfclk; pipes[0].clks_cfg.socclk_mhz = context->bw_ctx.dml.soc.clock_limits[vlevel].socclk_mhz; @@ -1789,6 +1798,58 @@ static void dcn31_calculate_wm_and_dlg( DC_FP_END(); }
+bool dcn31_validate_bandwidth(struct dc *dc, + struct dc_state *context, + bool fast_validate) +{ + bool out = false; + + BW_VAL_TRACE_SETUP(); + + int vlevel = 0; + int pipe_cnt = 0; + display_e2e_pipe_params_st *pipes = kzalloc(dc->res_pool->pipe_count * sizeof(display_e2e_pipe_params_st), GFP_KERNEL); + DC_LOGGER_INIT(dc->ctx->logger); + + BW_VAL_TRACE_COUNT(); + + out = dcn30_internal_validate_bw(dc, context, pipes, &pipe_cnt, &vlevel, fast_validate); + + // Disable fast_validate to set min dcfclk in alculate_wm_and_dlg + if (pipe_cnt == 0) + fast_validate = false; + + if (!out) + goto validate_fail; + + BW_VAL_TRACE_END_VOLTAGE_LEVEL(); + + if (fast_validate) { + BW_VAL_TRACE_SKIP(fast); + goto validate_out; + } + + dc->res_pool->funcs->calculate_wm_and_dlg(dc, context, pipes, pipe_cnt, vlevel); + + BW_VAL_TRACE_END_WATERMARKS(); + + goto validate_out; + +validate_fail: + DC_LOG_WARNING("Mode Validation Warning: %s failed alidation.\n", + dml_get_status_message(context->bw_ctx.dml.vba.ValidationStatus[context->bw_ctx.dml.vba.soc.num_states])); + + BW_VAL_TRACE_SKIP(fail); + out = false; + +validate_out: + kfree(pipes); + + BW_VAL_TRACE_FINISH(); + + return out; +} + static struct dc_cap_funcs cap_funcs = { .get_dcc_compression_cap = dcn20_get_dcc_compression_cap }; @@ -1871,7 +1932,7 @@ static struct resource_funcs dcn31_res_pool_funcs = { .link_encs_assign = link_enc_cfg_link_encs_assign, .link_enc_unassign = link_enc_cfg_link_enc_unassign, .panel_cntl_create = dcn31_panel_cntl_create, - .validate_bandwidth = dcn30_validate_bandwidth, + .validate_bandwidth = dcn31_validate_bandwidth, .calculate_wm_and_dlg = dcn31_calculate_wm_and_dlg, .update_soc_for_wm_a = dcn31_update_soc_for_wm_a, .populate_dml_pipes = dcn31_populate_dml_pipes_from_context,
From: CHANDAN VURDIGERE NATARAJ chandan.vurdigerenataraj@amd.com
[ Upstream commit 50e6cb3fd2cde554db646282ea10df7236e6493c ]
[Why] Below general protection fault observed when WebGL Aquarium is run for longer duration. If drm debug logs are enabled and set to 0x1f then the issue is observed within 10 minutes of run.
[ 100.717056] general protection fault, probably for non-canonical address 0x2d33302d32323032: 0000 [#1] PREEMPT SMP NOPTI [ 100.727921] CPU: 3 PID: 1906 Comm: DrmThread Tainted: G W 5.15.30 #12 d726c6a2d6ebe5cf9223931cbca6892f916fe18b [ 100.754419] RIP: 0010:CalculateSwathWidth+0x1f7/0x44f [ 100.767109] Code: 00 00 00 f2 42 0f 11 04 f0 48 8b 85 88 00 00 00 f2 42 0f 10 04 f0 48 8b 85 98 00 00 00 f2 42 0f 11 04 f0 48 8b 45 10 0f 57 c0 <f3> 42 0f 2a 04 b0 0f 57 c9 f3 43 0f 2a 0c b4 e8 8c e2 f3 ff 48 8b [ 100.781269] RSP: 0018:ffffa9230079eeb0 EFLAGS: 00010246 [ 100.812528] RAX: 2d33302d32323032 RBX: 0000000000000500 RCX: 0000000000000000 [ 100.819656] RDX: 0000000000000001 RSI: ffff99deb712c49c RDI: 0000000000000000 [ 100.826781] RBP: ffffa9230079ef50 R08: ffff99deb712460c R09: ffff99deb712462c [ 100.833907] R10: ffff99deb7124940 R11: ffff99deb7124d70 R12: ffff99deb712ae44 [ 100.841033] R13: 0000000000000001 R14: 0000000000000000 R15: ffffa9230079f0a0 [ 100.848159] FS: 00007af121212640(0000) GS:ffff99deba780000(0000) knlGS:0000000000000000 [ 100.856240] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 100.861980] CR2: 0000209000fe1000 CR3: 000000011b18c000 CR4: 0000000000350ee0 [ 100.869106] Call Trace: [ 100.871555] <TASK> [ 100.873655] ? asm_sysvec_reschedule_ipi+0x12/0x20 [ 100.878449] CalculateSwathAndDETConfiguration+0x1a3/0x6dd [ 100.883937] dml31_ModeSupportAndSystemConfigurationFull+0x2ce4/0x76da [ 100.890467] ? kallsyms_lookup_buildid+0xc8/0x163 [ 100.895173] ? kallsyms_lookup_buildid+0xc8/0x163 [ 100.899874] ? __sprint_symbol+0x80/0x135 [ 100.903883] ? dm_update_plane_state+0x3f9/0x4d2 [ 100.908500] ? symbol_string+0xb7/0xde [ 100.912250] ? number+0x145/0x29b [ 100.915566] ? vsnprintf+0x341/0x5ff [ 100.919141] ? desc_read_finalized_seq+0x39/0x87 [ 100.923755] ? update_load_avg+0x1b9/0x607 [ 100.927849] ? compute_mst_dsc_configs_for_state+0x7d/0xd5b [ 100.933416] ? fetch_pipe_params+0xa4d/0xd0c [ 100.937686] ? dc_fpu_end+0x3d/0xa8 [ 100.941175] dml_get_voltage_level+0x16b/0x180 [ 100.945619] dcn30_internal_validate_bw+0x10e/0x89b [ 100.950495] ? dcn31_validate_bandwidth+0x68/0x1fc [ 100.955285] ? resource_build_scaling_params+0x98b/0xb8c [ 100.960595] ? dcn31_validate_bandwidth+0x68/0x1fc [ 100.965384] dcn31_validate_bandwidth+0x9a/0x1fc [ 100.970001] dc_validate_global_state+0x238/0x295 [ 100.974703] amdgpu_dm_atomic_check+0x9c1/0xbce [ 100.979235] ? _printk+0x59/0x73 [ 100.982467] drm_atomic_check_only+0x403/0x78b [ 100.986912] drm_mode_atomic_ioctl+0x49b/0x546 [ 100.991358] ? drm_ioctl+0x1c1/0x3b3 [ 100.994936] ? drm_atomic_set_property+0x92a/0x92a [ 100.999725] drm_ioctl_kernel+0xdc/0x149 [ 101.003648] drm_ioctl+0x27f/0x3b3 [ 101.007051] ? drm_atomic_set_property+0x92a/0x92a [ 101.011842] amdgpu_drm_ioctl+0x49/0x7d [ 101.015679] __se_sys_ioctl+0x7c/0xb8 [ 101.015685] do_syscall_64+0x5f/0xb8 [ 101.015690] ? __irq_exit_rcu+0x34/0x96
[How] It calles populate_dml_pipes which uses doubles to initialize. Adding FPU protection avoids context switch and probable loss of vba context as there is potential contention while drm debug logs are enabled.
Signed-off-by: CHANDAN VURDIGERE NATARAJ chandan.vurdigerenataraj@amd.com Reviewed-by: Rodrigo Siqueira Rodrigo.Siqueira@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c index 7aadb35a3079..e224c5213258 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c @@ -1813,7 +1813,9 @@ bool dcn31_validate_bandwidth(struct dc *dc,
BW_VAL_TRACE_COUNT();
+ DC_FP_START(); out = dcn30_internal_validate_bw(dc, context, pipes, &pipe_cnt, &vlevel, fast_validate); + DC_FP_END();
// Disable fast_validate to set min dcfclk in alculate_wm_and_dlg if (pipe_cnt == 0)
From: Chuck Lever chuck.lever@oracle.com
[ Upstream commit 2c445a0e72cb1fbfbdb7f9473c53556ee27c1d90 ]
Since this pointer is used repeatedly, move it to a stack variable.
Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfsd/vfs.c | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c index 5f62fa0963ce..c8e3f81d110e 100644 --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -1121,6 +1121,7 @@ __be32 nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp, loff_t offset, unsigned long count, __be32 *verf) { + struct nfsd_net *nn; struct nfsd_file *nf; loff_t end = LLONG_MAX; __be32 err = nfserr_inval; @@ -1137,6 +1138,7 @@ nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp, NFSD_MAY_WRITE|NFSD_MAY_NOT_BREAK_LEASE, &nf); if (err) goto out; + nn = net_generic(nf->nf_net, nfsd_net_id); if (EX_ISSYNC(fhp->fh_export)) { errseq_t since = READ_ONCE(nf->nf_file->f_wb_err); int err2; @@ -1144,8 +1146,7 @@ nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp, err2 = vfs_fsync_range(nf->nf_file, offset, end, 0); switch (err2) { case 0: - nfsd_copy_boot_verifier(verf, net_generic(nf->nf_net, - nfsd_net_id)); + nfsd_copy_boot_verifier(verf, nn); err2 = filemap_check_wb_err(nf->nf_file->f_mapping, since); err = nfserrno(err2); @@ -1154,13 +1155,11 @@ nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp, err = nfserr_notsupp; break; default: - nfsd_reset_boot_verifier(net_generic(nf->nf_net, - nfsd_net_id)); + nfsd_reset_boot_verifier(nn); err = nfserrno(err2); } } else - nfsd_copy_boot_verifier(verf, net_generic(nf->nf_net, - nfsd_net_id)); + nfsd_copy_boot_verifier(verf, nn);
nfsd_file_put(nf); out:
From: Chuck Lever chuck.lever@oracle.com
[ Upstream commit 3f965021c8bc38965ecb1924f570c4842b33d408 ]
Since, well, forever, the Linux NFS server's nfsd_commit() function has returned nfserr_inval when the passed-in byte range arguments were non-sensical.
However, according to RFC 1813 section 3.3.21, NFSv3 COMMIT requests are permitted to return only the following non-zero status codes:
NFS3ERR_IO NFS3ERR_STALE NFS3ERR_BADHANDLE NFS3ERR_SERVERFAULT
NFS3ERR_INVAL is not included in that list. Likewise, NFS4ERR_INVAL is not listed in the COMMIT row of Table 6 in RFC 8881.
RFC 7530 does permit COMMIT to return NFS4ERR_INVAL, but does not specify when it can or should be used.
Instead of dropping or failing a COMMIT request in a byte range that is not supported, turn it into a valid request by treating one or both arguments as zero. Offset zero means start-of-file, count zero means until-end-of-file, so we only ever extend the commit range. NFS servers are always allowed to commit more and sooner than requested.
The range check is no longer bounded by NFS_OFFSET_MAX, but rather by the value that is returned in the maxfilesize field of the NFSv3 FSINFO procedure or the NFSv4 maxfilesize file attribute.
Note that this change results in a new pynfs failure:
CMT4 st_commit.testCommitOverflow : RUNNING CMT4 st_commit.testCommitOverflow : FAILURE COMMIT with offset + count overflow should return NFS4ERR_INVAL, instead got NFS4_OK
IMO the test is not correct as written: RFC 8881 does not allow the COMMIT operation to return NFS4ERR_INVAL.
Reported-by: Dan Aloni dan.aloni@vastdata.com Cc: stable@vger.kernel.org Signed-off-by: Chuck Lever chuck.lever@oracle.com Reviewed-by: Bruce Fields bfields@fieldses.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfsd/nfs3proc.c | 6 ------ fs/nfsd/vfs.c | 53 +++++++++++++++++++++++++++++++--------------- fs/nfsd/vfs.h | 4 ++-- 3 files changed, 38 insertions(+), 25 deletions(-)
diff --git a/fs/nfsd/nfs3proc.c b/fs/nfsd/nfs3proc.c index b540489ea240..936eebd4c56d 100644 --- a/fs/nfsd/nfs3proc.c +++ b/fs/nfsd/nfs3proc.c @@ -660,15 +660,9 @@ nfsd3_proc_commit(struct svc_rqst *rqstp) argp->count, (unsigned long long) argp->offset);
- if (argp->offset > NFS_OFFSET_MAX) { - resp->status = nfserr_inval; - goto out; - } - fh_copy(&resp->fh, &argp->fh); resp->status = nfsd_commit(rqstp, &resp->fh, argp->offset, argp->count, resp->verf); -out: return rpc_success; }
diff --git a/fs/nfsd/vfs.c b/fs/nfsd/vfs.c index c8e3f81d110e..abfbb6953e89 100644 --- a/fs/nfsd/vfs.c +++ b/fs/nfsd/vfs.c @@ -1108,42 +1108,61 @@ nfsd_write(struct svc_rqst *rqstp, struct svc_fh *fhp, loff_t offset, }
#ifdef CONFIG_NFSD_V3 -/* - * Commit all pending writes to stable storage. +/** + * nfsd_commit - Commit pending writes to stable storage + * @rqstp: RPC request being processed + * @fhp: NFS filehandle + * @offset: raw offset from beginning of file + * @count: raw count of bytes to sync + * @verf: filled in with the server's current write verifier * - * Note: we only guarantee that data that lies within the range specified - * by the 'offset' and 'count' parameters will be synced. + * Note: we guarantee that data that lies within the range specified + * by the 'offset' and 'count' parameters will be synced. The server + * is permitted to sync data that lies outside this range at the + * same time. * * Unfortunately we cannot lock the file to make sure we return full WCC * data to the client, as locking happens lower down in the filesystem. + * + * Return values: + * An nfsstat value in network byte order. */ __be32 -nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp, - loff_t offset, unsigned long count, __be32 *verf) +nfsd_commit(struct svc_rqst *rqstp, struct svc_fh *fhp, u64 offset, + u32 count, __be32 *verf) { + u64 maxbytes; + loff_t start, end; struct nfsd_net *nn; struct nfsd_file *nf; - loff_t end = LLONG_MAX; - __be32 err = nfserr_inval; - - if (offset < 0) - goto out; - if (count != 0) { - end = offset + (loff_t)count - 1; - if (end < offset) - goto out; - } + __be32 err;
err = nfsd_file_acquire(rqstp, fhp, NFSD_MAY_WRITE|NFSD_MAY_NOT_BREAK_LEASE, &nf); if (err) goto out; + + /* + * Convert the client-provided (offset, count) range to a + * (start, end) range. If the client-provided range falls + * outside the maximum file size of the underlying FS, + * clamp the sync range appropriately. + */ + start = 0; + end = LLONG_MAX; + maxbytes = (u64)fhp->fh_dentry->d_sb->s_maxbytes; + if (offset < maxbytes) { + start = offset; + if (count && (offset + count - 1 < maxbytes)) + end = offset + count - 1; + } + nn = net_generic(nf->nf_net, nfsd_net_id); if (EX_ISSYNC(fhp->fh_export)) { errseq_t since = READ_ONCE(nf->nf_file->f_wb_err); int err2;
- err2 = vfs_fsync_range(nf->nf_file, offset, end, 0); + err2 = vfs_fsync_range(nf->nf_file, start, end, 0); switch (err2) { case 0: nfsd_copy_boot_verifier(verf, nn); diff --git a/fs/nfsd/vfs.h b/fs/nfsd/vfs.h index b21b76e6b9a8..3cf5a8a13da5 100644 --- a/fs/nfsd/vfs.h +++ b/fs/nfsd/vfs.h @@ -73,8 +73,8 @@ __be32 do_nfsd_create(struct svc_rqst *, struct svc_fh *, char *name, int len, struct iattr *attrs, struct svc_fh *res, int createmode, u32 *verifier, bool *truncp, bool *created); -__be32 nfsd_commit(struct svc_rqst *, struct svc_fh *, - loff_t, unsigned long, __be32 *verf); +__be32 nfsd_commit(struct svc_rqst *rqst, struct svc_fh *fhp, + u64 offset, u32 count, __be32 *verf); #endif /* CONFIG_NFSD_V3 */ #ifdef CONFIG_NFSD_V4 __be32 nfsd_getxattr(struct svc_rqst *rqstp, struct svc_fh *fhp,
From: Palmer Dabbelt palmer@rivosinc.com
[ Upstream commit ca0cb9a60f6d86d4b2139c6f393a78f39edcd7cb ]
This manifests as a crash early in boot on VexRiscv.
Signed-off-by: Myrtle Shah gatecat@ds0.me [Palmer: split commit] Fixes: 44c922572952 ("RISC-V: enable XIP") Cc: stable@vger.kernel.org Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/mm/init.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index 7f130ac3b9f9..c58a7c77989b 100644 --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -265,6 +265,7 @@ pgd_t early_pg_dir[PTRS_PER_PGD] __initdata __aligned(PAGE_SIZE); static pmd_t __maybe_unused early_dtb_pmd[PTRS_PER_PMD] __initdata __aligned(PAGE_SIZE);
#ifdef CONFIG_XIP_KERNEL +#define riscv_pfn_base (*(unsigned long *)XIP_FIXUP(&riscv_pfn_base)) #define trampoline_pg_dir ((pgd_t *)XIP_FIXUP(trampoline_pg_dir)) #define fixmap_pte ((pte_t *)XIP_FIXUP(fixmap_pte)) #define early_pg_dir ((pgd_t *)XIP_FIXUP(early_pg_dir))
From: Haibo Chen haibo.chen@nxp.com
[ Upstream commit c87b7b12f48db86ac9909894f4dc0107d7df6375 ]
The original logic to get mma8452_data is wrong, the *dev point to the device belong to iio_dev. we can't use this dev to find the correct i2c_client. The original logic happen to work because it finally use dev->driver_data to get iio_dev. Here use the API to_i2c_client() is wrong and make reader confuse. To correct the logic, it should be like this
struct mma8452_data *data = iio_priv(dev_get_drvdata(dev));
But after commit 8b7651f25962 ("iio: iio_device_alloc(): Remove unnecessary self drvdata"), the upper logic also can't work. When try to show the avialable scale in userspace, will meet kernel dump, kernel handle NULL pointer dereference.
So use dev_to_iio_dev() to correct the logic.
Dual fixes tags as the second reflects when the bug was exposed, whilst the first reflects when the original bug was introduced.
Fixes: c3cdd6e48e35 ("iio: mma8452: refactor for seperating chip specific data") Fixes: 8b7651f25962 ("iio: iio_device_alloc(): Remove unnecessary self drvdata") Signed-off-by: Haibo Chen haibo.chen@nxp.com Reviewed-by: Martin Kepplinger martink@posteo.de Cc: Stable@vger.kernel.org Link: https://lore.kernel.org/r/1645497741-5402-1-git-send-email-haibo.chen@nxp.co... Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iio/accel/mma8452.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/iio/accel/mma8452.c b/drivers/iio/accel/mma8452.c index 373b59557afe..1f46a73aafea 100644 --- a/drivers/iio/accel/mma8452.c +++ b/drivers/iio/accel/mma8452.c @@ -380,8 +380,8 @@ static ssize_t mma8452_show_scale_avail(struct device *dev, struct device_attribute *attr, char *buf) { - struct mma8452_data *data = iio_priv(i2c_get_clientdata( - to_i2c_client(dev))); + struct iio_dev *indio_dev = dev_to_iio_dev(dev); + struct mma8452_data *data = iio_priv(indio_dev);
return mma8452_show_int_plus_micros(buf, data->chip_info->mma_scales, ARRAY_SIZE(data->chip_info->mma_scales));
From: Sebastian Andrzej Siewior bigeasy@linutronix.de
[ Upstream commit 94da81e2fc4285db373fe9a1eb012c2ee205b110 ]
Since commit baebdf48c3600 ("net: dev: Makes sure netif_rx() can be invoked in any context.")
the function netif_rx() can be used in preemptible/thread context as well as in interrupt context.
Use netif_rx().
Cc: Antonio Quartulli a@unstable.cc Cc: Marek Lindner mareklindner@neomailbox.ch Cc: Simon Wunderlich sw@simonwunderlich.de Cc: Sven Eckelmann sven@narfation.org Cc: b.a.t.m.a.n@lists.open-mesh.org Signed-off-by: Sebastian Andrzej Siewior bigeasy@linutronix.de Acked-by: Sven Eckelmann sven@narfation.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/batman-adv/bridge_loop_avoidance.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/batman-adv/bridge_loop_avoidance.c b/net/batman-adv/bridge_loop_avoidance.c index 17687848daec..11f6ef657d82 100644 --- a/net/batman-adv/bridge_loop_avoidance.c +++ b/net/batman-adv/bridge_loop_avoidance.c @@ -443,7 +443,7 @@ static void batadv_bla_send_claim(struct batadv_priv *bat_priv, u8 *mac, batadv_add_counter(bat_priv, BATADV_CNT_RX_BYTES, skb->len + ETH_HLEN);
- netif_rx_any_context(skb); + netif_rx(skb); out: batadv_hardif_put(primary_if); }
From: Tudor Ambarus tudor.ambarus@microchip.com
[ Upstream commit 151c6b49d679872d6fc0b50e0ad96303091694a2 ]
Even if SPI_NOR_NO_ERASE was set, one could still send erase opcodes to the flash. It is not recommended to send unsupported opcodes to flashes. Fix the logic and do not set mtd->_erase when SPI_NOR_NO_ERASE is specified. With this users will not be able to issue erase opcodes to flashes and instead they will recive an -ENOTSUPP error.
Fixes: b199489d37b2 ("mtd: spi-nor: add the framework for SPI NOR") Signed-off-by: Tudor Ambarus tudor.ambarus@microchip.com Reviewed-by: Michael Walle michael@walle.cc Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220228163334.277730-1-tudor.ambarus@microchip.co... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mtd/spi-nor/core.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/mtd/spi-nor/core.c b/drivers/mtd/spi-nor/core.c index 90f39aabc1ff..d97cdbc2b9de 100644 --- a/drivers/mtd/spi-nor/core.c +++ b/drivers/mtd/spi-nor/core.c @@ -3148,7 +3148,6 @@ int spi_nor_scan(struct spi_nor *nor, const char *name, mtd->writesize = nor->params->writesize; mtd->flags = MTD_CAP_NORFLASH; mtd->size = nor->params->size; - mtd->_erase = spi_nor_erase; mtd->_read = spi_nor_read; mtd->_suspend = spi_nor_suspend; mtd->_resume = spi_nor_resume; @@ -3178,6 +3177,8 @@ int spi_nor_scan(struct spi_nor *nor, const char *name,
if (info->flags & SPI_NOR_NO_ERASE) mtd->flags |= MTD_NO_ERASE; + else + mtd->_erase = spi_nor_erase;
mtd->dev.parent = dev; nor->page_size = nor->params->page_size;
From: Kees Cook keescook@chromium.org
[ Upstream commit 86cffecdeaa278444870c8745ab166a65865dbf0 ]
GCC and Clang can use the "alloc_size" attribute to better inform the results of __builtin_object_size() (for compile-time constant values). Clang can additionally use alloc_size to inform the results of __builtin_dynamic_object_size() (for run-time values).
Because GCC sees the frequent use of struct_size() as an allocator size argument, and notices it can return SIZE_MAX (the overflow indication), it complains about these call sites overflowing (since SIZE_MAX is greater than the default -Walloc-size-larger-than=PTRDIFF_MAX). This isn't helpful since we already know a SIZE_MAX will be caught at run-time (this was an intentional design). To deal with this, we must disable this check as it is both a false positive and redundant. (Clang does not have this warning option.)
Unfortunately, just checking the -Wno-alloc-size-larger-than is not sufficient to make the __alloc_size attribute behave correctly under older GCC versions. The attribute itself must be disabled in those situations too, as there appears to be no way to reliably silence the SIZE_MAX constant expression cases for GCC versions less than 9.1:
In file included from ./include/linux/resource_ext.h:11, from ./include/linux/pci.h:40, from drivers/net/ethernet/intel/ixgbe/ixgbe.h:9, from drivers/net/ethernet/intel/ixgbe/ixgbe_lib.c:4: In function 'kmalloc_node', inlined from 'ixgbe_alloc_q_vector' at ./include/linux/slab.h:743:9: ./include/linux/slab.h:618:9: error: argument 1 value '18446744073709551615' exceeds maximum object size 9223372036854775807 [-Werror=alloc-size-larger-than=] return __kmalloc_node(size, flags, node); ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ./include/linux/slab.h: In function 'ixgbe_alloc_q_vector': ./include/linux/slab.h:455:7: note: in a call to allocation function '__kmalloc_node' declared here void *__kmalloc_node(size_t size, gfp_t flags, int node) __assume_slab_alignment __malloc; ^~~~~~~~~~~~~~
Specifically: '-Wno-alloc-size-larger-than' is not correctly handled by GCC < 9.1 https://godbolt.org/z/hqsfG7q84 (doesn't disable) https://godbolt.org/z/P9jdrPTYh (doesn't admit to not knowing about option) https://godbolt.org/z/465TPMWKb (only warns when other warnings appear)
'-Walloc-size-larger-than=18446744073709551615' is not handled by GCC < 8.2 https://godbolt.org/z/73hh1EPxz (ignores numeric value)
Since anything marked with __alloc_size would also qualify for marking with __malloc, just include __malloc along with it to avoid redundant markings. (Suggested by Linus Torvalds.)
Finally, make sure checkpatch.pl doesn't get confused about finding the __alloc_size attribute on functions. (Thanks to Joe Perches.)
Link: https://lkml.kernel.org/r/20210930222704.2631604-3-keescook@chromium.org Signed-off-by: Kees Cook keescook@chromium.org Tested-by: Randy Dunlap rdunlap@infradead.org Cc: Andy Whitcroft apw@canonical.com Cc: Christoph Lameter cl@linux.com Cc: Daniel Micay danielmicay@gmail.com Cc: David Rientjes rientjes@google.com Cc: Dennis Zhou dennis@kernel.org Cc: Dwaipayan Ray dwaipayanray1@gmail.com Cc: Joe Perches joe@perches.com Cc: Joonsoo Kim iamjoonsoo.kim@lge.com Cc: Lukas Bulwahn lukas.bulwahn@gmail.com Cc: Pekka Enberg penberg@kernel.org Cc: Tejun Heo tj@kernel.org Cc: Vlastimil Babka vbabka@suse.cz Cc: Alexandre Bounine alex.bou9@gmail.com Cc: Gustavo A. R. Silva gustavoars@kernel.org Cc: Ira Weiny ira.weiny@intel.com Cc: Jing Xiangfeng jingxiangfeng@huawei.com Cc: John Hubbard jhubbard@nvidia.com Cc: kernel test robot lkp@intel.com Cc: Matt Porter mporter@kernel.crashing.org Cc: Miguel Ojeda ojeda@kernel.org Cc: Nathan Chancellor nathan@kernel.org Cc: Nick Desaulniers ndesaulniers@google.com Cc: Souptick Joarder jrdr.linux@gmail.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- Makefile | 15 +++++++++++++++ include/linux/compiler-gcc.h | 8 ++++++++ include/linux/compiler_attributes.h | 10 ++++++++++ include/linux/compiler_types.h | 12 ++++++++++++ scripts/checkpatch.pl | 3 ++- 5 files changed, 47 insertions(+), 1 deletion(-)
diff --git a/Makefile b/Makefile index c7750d260a55..397fb08d17f2 100644 --- a/Makefile +++ b/Makefile @@ -1011,6 +1011,21 @@ ifdef CONFIG_CC_IS_GCC KBUILD_CFLAGS += -Wno-maybe-uninitialized endif
+ifdef CONFIG_CC_IS_GCC +# The allocators already balk at large sizes, so silence the compiler +# warnings for bounds checks involving those possible values. While +# -Wno-alloc-size-larger-than would normally be used here, earlier versions +# of gcc (<9.1) weirdly don't handle the option correctly when _other_ +# warnings are produced (?!). Using -Walloc-size-larger-than=SIZE_MAX +# doesn't work (as it is documented to), silently resolving to "0" prior to +# version 9.1 (and producing an error more recently). Numeric values larger +# than PTRDIFF_MAX also don't work prior to version 9.1, which are silently +# ignored, continuing to default to PTRDIFF_MAX. So, left with no other +# choice, we must perform a versioned check to disable this warning. +# https://lore.kernel.org/lkml/20210824115859.187f272f@canb.auug.org.au +KBUILD_CFLAGS += $(call cc-ifversion, -ge, 0901, -Wno-alloc-size-larger-than) +endif + # disable invalid "can't wrap" optimizations for signed / pointers KBUILD_CFLAGS += -fno-strict-overflow
diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h index bd2b881c6b63..b9d5f9c373a0 100644 --- a/include/linux/compiler-gcc.h +++ b/include/linux/compiler-gcc.h @@ -144,3 +144,11 @@ #else #define __diag_GCC_8(s) #endif + +/* + * Prior to 9.1, -Wno-alloc-size-larger-than (and therefore the "alloc_size" + * attribute) do not work, and must be disabled. + */ +#if GCC_VERSION < 90100 +#undef __alloc_size__ +#endif diff --git a/include/linux/compiler_attributes.h b/include/linux/compiler_attributes.h index e6ec63403965..3de06a8fae73 100644 --- a/include/linux/compiler_attributes.h +++ b/include/linux/compiler_attributes.h @@ -33,6 +33,15 @@ #define __aligned(x) __attribute__((__aligned__(x))) #define __aligned_largest __attribute__((__aligned__))
+/* + * Note: do not use this directly. Instead, use __alloc_size() since it is conditionally + * available and includes other attributes. + * + * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-all... + * clang: https://clang.llvm.org/docs/AttributeReference.html#alloc-size + */ +#define __alloc_size__(x, ...) __attribute__((__alloc_size__(x, ## __VA_ARGS__))) + /* * Note: users of __always_inline currently do not write "inline" themselves, * which seems to be required by gcc to apply the attribute according @@ -153,6 +162,7 @@
/* * gcc: https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-mal... + * clang: https://clang.llvm.org/docs/AttributeReference.html#malloc */ #define __malloc __attribute__((__malloc__))
diff --git a/include/linux/compiler_types.h b/include/linux/compiler_types.h index b6ff83a714ca..4f2203c4a257 100644 --- a/include/linux/compiler_types.h +++ b/include/linux/compiler_types.h @@ -250,6 +250,18 @@ struct ftrace_likely_data { # define __cficanonical #endif
+/* + * Any place that could be marked with the "alloc_size" attribute is also + * a place to be marked with the "malloc" attribute. Do this as part of the + * __alloc_size macro to avoid redundant attributes and to avoid missing a + * __malloc marking. + */ +#ifdef __alloc_size__ +# define __alloc_size(x, ...) __alloc_size__(x, ## __VA_ARGS__) __malloc +#else +# define __alloc_size(x, ...) __malloc +#endif + #ifndef asm_volatile_goto #define asm_volatile_goto(x...) asm goto(x) #endif diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl index c27d2312cfc3..88cb294dc447 100755 --- a/scripts/checkpatch.pl +++ b/scripts/checkpatch.pl @@ -489,7 +489,8 @@ our $Attribute = qr{ ____cacheline_aligned| ____cacheline_aligned_in_smp| ____cacheline_internodealigned_in_smp| - __weak + __weak| + __alloc_size\s*(\s*\d+\s*(?:,\s*\d+\s*)?) }x; our $Modifier; our $Inline = qr{inline|__always_inline|noinline|__inline|__inline__};
From: Paolo Bonzini pbonzini@redhat.com
[ Upstream commit a8749a35c39903120ec421ef2525acc8e0daa55c ]
Linux has dozens of occurrences of vmalloc(array_size()) and vzalloc(array_size()). Allow to simplify the code by providing vmalloc_array and vcalloc, as well as the underscored variants that let the caller specify the GFP flags.
Acked-by: Michal Hocko mhocko@suse.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/vmalloc.h | 5 +++++ mm/util.c | 50 +++++++++++++++++++++++++++++++++++++++++ 2 files changed, 55 insertions(+)
diff --git a/include/linux/vmalloc.h b/include/linux/vmalloc.h index 4fe9e885bbfa..5535be1012a2 100644 --- a/include/linux/vmalloc.h +++ b/include/linux/vmalloc.h @@ -159,6 +159,11 @@ void *__vmalloc_node(unsigned long size, unsigned long align, gfp_t gfp_mask, int node, const void *caller); void *vmalloc_no_huge(unsigned long size);
+extern void *__vmalloc_array(size_t n, size_t size, gfp_t flags) __alloc_size(1, 2); +extern void *vmalloc_array(size_t n, size_t size) __alloc_size(1, 2); +extern void *__vcalloc(size_t n, size_t size, gfp_t flags) __alloc_size(1, 2); +extern void *vcalloc(size_t n, size_t size) __alloc_size(1, 2); + extern void vfree(const void *addr); extern void vfree_atomic(const void *addr);
diff --git a/mm/util.c b/mm/util.c index 3073de05c2bd..ea04979f131e 100644 --- a/mm/util.c +++ b/mm/util.c @@ -698,6 +698,56 @@ static inline void *__page_rmapping(struct page *page) return (void *)mapping; }
+/** + * __vmalloc_array - allocate memory for a virtually contiguous array. + * @n: number of elements. + * @size: element size. + * @flags: the type of memory to allocate (see kmalloc). + */ +void *__vmalloc_array(size_t n, size_t size, gfp_t flags) +{ + size_t bytes; + + if (unlikely(check_mul_overflow(n, size, &bytes))) + return NULL; + return __vmalloc(bytes, flags); +} +EXPORT_SYMBOL(__vmalloc_array); + +/** + * vmalloc_array - allocate memory for a virtually contiguous array. + * @n: number of elements. + * @size: element size. + */ +void *vmalloc_array(size_t n, size_t size) +{ + return __vmalloc_array(n, size, GFP_KERNEL); +} +EXPORT_SYMBOL(vmalloc_array); + +/** + * __vcalloc - allocate and zero memory for a virtually contiguous array. + * @n: number of elements. + * @size: element size. + * @flags: the type of memory to allocate (see kmalloc). + */ +void *__vcalloc(size_t n, size_t size, gfp_t flags) +{ + return __vmalloc_array(n, size, flags | __GFP_ZERO); +} +EXPORT_SYMBOL(__vcalloc); + +/** + * vcalloc - allocate and zero memory for a virtually contiguous array. + * @n: number of elements. + * @size: element size. + */ +void *vcalloc(size_t n, size_t size) +{ + return __vmalloc_array(n, size, GFP_KERNEL | __GFP_ZERO); +} +EXPORT_SYMBOL(vcalloc); + /* Neutral page->mapping pointer to address_space or anon_vma or other */ void *page_rmapping(struct page *page) {
From: Paolo Bonzini pbonzini@redhat.com
[ Upstream commit 37b2a6510a48ca361ced679f92682b7b7d7d0330 ]
Allocations whose size is related to the memslot size can be arbitrarily large. Do not use kvzalloc/kvcalloc, as those are limited to "not crazy" sizes that fit in 32 bits.
Cc: stable@vger.kernel.org Fixes: 7661809d493b ("mm: don't allow oversized kvmalloc() calls") Reviewed-by: David Hildenbrand david@redhat.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/kvm/book3s_hv_uvmem.c | 2 +- arch/x86/kvm/mmu/page_track.c | 4 ++-- arch/x86/kvm/x86.c | 4 ++-- virt/kvm/kvm_main.c | 4 ++-- 4 files changed, 7 insertions(+), 7 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_hv_uvmem.c b/arch/powerpc/kvm/book3s_hv_uvmem.c index 3fbe710ff839..3d4ee75b0fb7 100644 --- a/arch/powerpc/kvm/book3s_hv_uvmem.c +++ b/arch/powerpc/kvm/book3s_hv_uvmem.c @@ -251,7 +251,7 @@ int kvmppc_uvmem_slot_init(struct kvm *kvm, const struct kvm_memory_slot *slot) p = kzalloc(sizeof(*p), GFP_KERNEL); if (!p) return -ENOMEM; - p->pfns = vzalloc(array_size(slot->npages, sizeof(*p->pfns))); + p->pfns = vcalloc(slot->npages, sizeof(*p->pfns)); if (!p->pfns) { kfree(p); return -ENOMEM; diff --git a/arch/x86/kvm/mmu/page_track.c b/arch/x86/kvm/mmu/page_track.c index 21427e84a82e..630ae70bb6bd 100644 --- a/arch/x86/kvm/mmu/page_track.c +++ b/arch/x86/kvm/mmu/page_track.c @@ -36,8 +36,8 @@ int kvm_page_track_create_memslot(struct kvm_memory_slot *slot,
for (i = 0; i < KVM_PAGE_TRACK_MAX; i++) { slot->arch.gfn_track[i] = - kvcalloc(npages, sizeof(*slot->arch.gfn_track[i]), - GFP_KERNEL_ACCOUNT); + __vcalloc(npages, sizeof(*slot->arch.gfn_track[i]), + GFP_KERNEL_ACCOUNT); if (!slot->arch.gfn_track[i]) goto track_free; } diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c index 39aaa21e28f7..8974884ef2ad 100644 --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -11552,7 +11552,7 @@ static int memslot_rmap_alloc(struct kvm_memory_slot *slot, if (slot->arch.rmap[i]) continue;
- slot->arch.rmap[i] = kvcalloc(lpages, sz, GFP_KERNEL_ACCOUNT); + slot->arch.rmap[i] = __vcalloc(lpages, sz, GFP_KERNEL_ACCOUNT); if (!slot->arch.rmap[i]) { memslot_rmap_free(slot); return -ENOMEM; @@ -11633,7 +11633,7 @@ static int kvm_alloc_memslot_metadata(struct kvm *kvm,
lpages = __kvm_mmu_slot_lpages(slot, npages, level);
- linfo = kvcalloc(lpages, sizeof(*linfo), GFP_KERNEL_ACCOUNT); + linfo = __vcalloc(lpages, sizeof(*linfo), GFP_KERNEL_ACCOUNT); if (!linfo) goto out_free;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index fefdf3a6dae3..99c591569815 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1255,9 +1255,9 @@ static int kvm_vm_release(struct inode *inode, struct file *filp) */ static int kvm_alloc_dirty_bitmap(struct kvm_memory_slot *memslot) { - unsigned long dirty_bytes = 2 * kvm_dirty_bitmap_bytes(memslot); + unsigned long dirty_bytes = kvm_dirty_bitmap_bytes(memslot);
- memslot->dirty_bitmap = kvzalloc(dirty_bytes, GFP_KERNEL_ACCOUNT); + memslot->dirty_bitmap = __vcalloc(2, dirty_bytes, GFP_KERNEL_ACCOUNT); if (!memslot->dirty_bitmap) return -ENOMEM;
From: Dongliang Mu mudongliangabcd@gmail.com
[ Upstream commit 79c9234ba596e903907de20573fd4bcc85315b06 ]
Syzbot reported a possible use-after-free in printing information in device_list_add.
Very similar with the bug fixed by commit 0697d9a61099 ("btrfs: don't access possibly stale fs_info data for printing duplicate device"), but this time the use occurs in btrfs_info_in_rcu.
Call Trace: kasan_report.cold+0x83/0xdf mm/kasan/report.c:459 btrfs_printk+0x395/0x425 fs/btrfs/super.c:244 device_list_add.cold+0xd7/0x2ed fs/btrfs/volumes.c:957 btrfs_scan_one_device+0x4c7/0x5c0 fs/btrfs/volumes.c:1387 btrfs_control_ioctl+0x12a/0x2d0 fs/btrfs/super.c:2409 vfs_ioctl fs/ioctl.c:51 [inline] __do_sys_ioctl fs/ioctl.c:874 [inline] __se_sys_ioctl fs/ioctl.c:860 [inline] __x64_sys_ioctl+0x193/0x200 fs/ioctl.c:860 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x35/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x44/0xae
Fix this by modifying device->fs_info to NULL too.
Reported-and-tested-by: syzbot+82650a4e0ed38f218363@syzkaller.appspotmail.com CC: stable@vger.kernel.org # 4.19+ Signed-off-by: Dongliang Mu mudongliangabcd@gmail.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/volumes.c | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index cec54c6e1cdd..89ce0b449c22 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -955,6 +955,11 @@ static noinline struct btrfs_device *device_list_add(const char *path, /* * We are going to replace the device path for a given devid, * make sure it's the same device if the device is mounted + * + * NOTE: the device->fs_info may not be reliable here so pass + * in a NULL to message helpers instead. This avoids a possible + * use-after-free when the fs_info and fs_info->sb are already + * torn down. */ if (device->bdev) { int error; @@ -968,12 +973,6 @@ static noinline struct btrfs_device *device_list_add(const char *path,
if (device->bdev->bd_dev != path_dev) { mutex_unlock(&fs_devices->device_list_mutex); - /* - * device->fs_info may not be reliable here, so - * pass in a NULL instead. This avoids a - * possible use-after-free when the fs_info and - * fs_info->sb are already torn down. - */ btrfs_warn_in_rcu(NULL, "duplicate device %s devid %llu generation %llu scanned by %s (%d)", path, devid, found_transid, @@ -981,7 +980,7 @@ static noinline struct btrfs_device *device_list_add(const char *path, task_pid_nr(current)); return ERR_PTR(-EEXIST); } - btrfs_info_in_rcu(device->fs_info, + btrfs_info_in_rcu(NULL, "devid %llu device path %s changed to %s scanned by %s (%d)", devid, rcu_str_deref(device->name), path, current->comm,
From: Claudio Imbrenda imbrenda@linux.ibm.com
[ Upstream commit c0573ba5c5a2244dc02060b1f374d4593c1d20b7 ]
When handling the SCK instruction, the kvm lock is taken, even though the vcpu lock is already being held. The normal locking order is kvm lock first and then vcpu lock. This is can (and in some circumstances does) lead to deadlocks.
The function kvm_s390_set_tod_clock is called both by the SCK handler and by some IOCTLs to set the clock. The IOCTLs will not hold the vcpu lock, so they can safely take the kvm lock. The SCK handler holds the vcpu lock, but will also somehow need to acquire the kvm lock without relinquishing the vcpu lock.
The solution is to factor out the code to set the clock, and provide two wrappers. One is called like the original function and does the locking, the other is called kvm_s390_try_set_tod_clock and uses trylock to try to acquire the kvm lock. This new wrapper is then used in the SCK handler. If locking fails, -EAGAIN is returned, which is eventually propagated to userspace, thus also freeing the vcpu lock and allowing for forward progress.
This is not the most efficient or elegant way to solve this issue, but the SCK instruction is deprecated and its performance is not critical.
The goal of this patch is just to provide a simple but correct way to fix the bug.
Fixes: 6a3f95a6b04c ("KVM: s390: Intercept SCK instruction") Signed-off-by: Claudio Imbrenda imbrenda@linux.ibm.com Reviewed-by: Christian Borntraeger borntraeger@linux.ibm.com Reviewed-by: Janis Schoetterl-Glausch scgl@linux.ibm.com Link: https://lore.kernel.org/r/20220301143340.111129-1-imbrenda@linux.ibm.com Cc: stable@vger.kernel.org Signed-off-by: Christian Borntraeger borntraeger@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/kvm/kvm-s390.c | 19 ++++++++++++++++--- arch/s390/kvm/kvm-s390.h | 4 ++-- arch/s390/kvm/priv.c | 15 ++++++++++++++- 3 files changed, 32 insertions(+), 6 deletions(-)
diff --git a/arch/s390/kvm/kvm-s390.c b/arch/s390/kvm/kvm-s390.c index 402597f9d050..b456aa196c04 100644 --- a/arch/s390/kvm/kvm-s390.c +++ b/arch/s390/kvm/kvm-s390.c @@ -3913,14 +3913,12 @@ static int kvm_s390_handle_requests(struct kvm_vcpu *vcpu) return 0; }
-void kvm_s390_set_tod_clock(struct kvm *kvm, - const struct kvm_s390_vm_tod_clock *gtod) +static void __kvm_s390_set_tod_clock(struct kvm *kvm, const struct kvm_s390_vm_tod_clock *gtod) { struct kvm_vcpu *vcpu; union tod_clock clk; int i;
- mutex_lock(&kvm->lock); preempt_disable();
store_tod_clock_ext(&clk); @@ -3941,7 +3939,22 @@ void kvm_s390_set_tod_clock(struct kvm *kvm,
kvm_s390_vcpu_unblock_all(kvm); preempt_enable(); +} + +void kvm_s390_set_tod_clock(struct kvm *kvm, const struct kvm_s390_vm_tod_clock *gtod) +{ + mutex_lock(&kvm->lock); + __kvm_s390_set_tod_clock(kvm, gtod); + mutex_unlock(&kvm->lock); +} + +int kvm_s390_try_set_tod_clock(struct kvm *kvm, const struct kvm_s390_vm_tod_clock *gtod) +{ + if (!mutex_trylock(&kvm->lock)) + return 0; + __kvm_s390_set_tod_clock(kvm, gtod); mutex_unlock(&kvm->lock); + return 1; }
/** diff --git a/arch/s390/kvm/kvm-s390.h b/arch/s390/kvm/kvm-s390.h index 1539dd981104..f8803bf0ff17 100644 --- a/arch/s390/kvm/kvm-s390.h +++ b/arch/s390/kvm/kvm-s390.h @@ -326,8 +326,8 @@ int kvm_s390_handle_sigp(struct kvm_vcpu *vcpu); int kvm_s390_handle_sigp_pei(struct kvm_vcpu *vcpu);
/* implemented in kvm-s390.c */ -void kvm_s390_set_tod_clock(struct kvm *kvm, - const struct kvm_s390_vm_tod_clock *gtod); +void kvm_s390_set_tod_clock(struct kvm *kvm, const struct kvm_s390_vm_tod_clock *gtod); +int kvm_s390_try_set_tod_clock(struct kvm *kvm, const struct kvm_s390_vm_tod_clock *gtod); long kvm_arch_fault_in_page(struct kvm_vcpu *vcpu, gpa_t gpa, int writable); int kvm_s390_store_status_unloaded(struct kvm_vcpu *vcpu, unsigned long addr); int kvm_s390_vcpu_store_status(struct kvm_vcpu *vcpu, unsigned long addr); diff --git a/arch/s390/kvm/priv.c b/arch/s390/kvm/priv.c index 417154b314a6..6a765fe22eaf 100644 --- a/arch/s390/kvm/priv.c +++ b/arch/s390/kvm/priv.c @@ -102,7 +102,20 @@ static int handle_set_clock(struct kvm_vcpu *vcpu) return kvm_s390_inject_prog_cond(vcpu, rc);
VCPU_EVENT(vcpu, 3, "SCK: setting guest TOD to 0x%llx", gtod.tod); - kvm_s390_set_tod_clock(vcpu->kvm, >od); + /* + * To set the TOD clock the kvm lock must be taken, but the vcpu lock + * is already held in handle_set_clock. The usual lock order is the + * opposite. As SCK is deprecated and should not be used in several + * cases, for example when the multiple epoch facility or TOD clock + * steering facility is installed (see Principles of Operation), a + * slow path can be used. If the lock can not be taken via try_lock, + * the instruction will be retried via -EAGAIN at a later point in + * time. + */ + if (!kvm_s390_try_set_tod_clock(vcpu->kvm, >od)) { + kvm_s390_retry_instr(vcpu); + return -EAGAIN; + }
kvm_s390_set_psw_cc(vcpu, 0); return 0;
From: Arun Easi aeasi@marvell.com
[ Upstream commit db212f2eb3fb7f546366777e93c8f54614d39269 ]
Driver registration of localport can race when it happens at the remote port discovery time. Fix this by calling the registration under a mutex.
Link: https://lore.kernel.org/r/20220310092604.22950-4-njavali@marvell.com Fixes: e84067d74301 ("scsi: qla2xxx: Add FC-NVMe F/W initialization and transport registration") Cc: stable@vger.kernel.org Reported-by: Marco Patalano mpatalan@redhat.com Tested-by: Marco Patalano mpatalan@redhat.com Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com Signed-off-by: Arun Easi aeasi@marvell.com Signed-off-by: Nilesh Javali njavali@marvell.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/qla2xxx/qla_nvme.c | 27 +++++++++++++++++++-------- 1 file changed, 19 insertions(+), 8 deletions(-)
diff --git a/drivers/scsi/qla2xxx/qla_nvme.c b/drivers/scsi/qla2xxx/qla_nvme.c index 42b29f4fd937..1bf3ab10846a 100644 --- a/drivers/scsi/qla2xxx/qla_nvme.c +++ b/drivers/scsi/qla2xxx/qla_nvme.c @@ -775,7 +775,6 @@ int qla_nvme_register_hba(struct scsi_qla_host *vha) ha = vha->hw; tmpl = &qla_nvme_fc_transport;
- WARN_ON(vha->nvme_local_port);
qla_nvme_fc_transport.max_hw_queues = min((uint8_t)(qla_nvme_fc_transport.max_hw_queues), @@ -786,13 +785,25 @@ int qla_nvme_register_hba(struct scsi_qla_host *vha) pinfo.port_role = FC_PORT_ROLE_NVME_INITIATOR; pinfo.port_id = vha->d_id.b24;
- ql_log(ql_log_info, vha, 0xffff, - "register_localport: host-traddr=nn-0x%llx:pn-0x%llx on portID:%x\n", - pinfo.node_name, pinfo.port_name, pinfo.port_id); - qla_nvme_fc_transport.dma_boundary = vha->host->dma_boundary; - - ret = nvme_fc_register_localport(&pinfo, tmpl, - get_device(&ha->pdev->dev), &vha->nvme_local_port); + mutex_lock(&ha->vport_lock); + /* + * Check again for nvme_local_port to see if any other thread raced + * with this one and finished registration. + */ + if (!vha->nvme_local_port) { + ql_log(ql_log_info, vha, 0xffff, + "register_localport: host-traddr=nn-0x%llx:pn-0x%llx on portID:%x\n", + pinfo.node_name, pinfo.port_name, pinfo.port_id); + qla_nvme_fc_transport.dma_boundary = vha->host->dma_boundary; + + ret = nvme_fc_register_localport(&pinfo, tmpl, + get_device(&ha->pdev->dev), + &vha->nvme_local_port); + mutex_unlock(&ha->vport_lock); + } else { + mutex_unlock(&ha->vport_lock); + return 0; + } if (ret) { ql_log(ql_log_warn, vha, 0xffff, "register_localport failed: ret=%x\n", ret);
From: Christophe Leroy christophe.leroy@csgroup.eu
[ Upstream commit a85c728cb5e12216c19ae5878980c2cbbbf8616d ]
Instructions lmw/stmw are interesting for functions that are rarely used and not in the cache, because only one instruction is to be copied into the instruction cache instead of 19. However those instruction are less performant than 19x raw lwz/stw as they require synchronisation plus one additional cycle.
SAVE_NVGPRS / REST_NVGPRS are used in only a few places which are mostly in interrupts entries/exits and in task switch so they are likely already in the cache.
Using standard lwz improves null_syscall selftest by: - 10 cycles on mpc832x. - 2 cycles on mpc8xx.
Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/316c543b8906712c108985c8463eec09c8db577b.162973254... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/include/asm/ppc_asm.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/include/asm/ppc_asm.h b/arch/powerpc/include/asm/ppc_asm.h index 1c538a9a11e0..7be24048b8d1 100644 --- a/arch/powerpc/include/asm/ppc_asm.h +++ b/arch/powerpc/include/asm/ppc_asm.h @@ -28,8 +28,8 @@ #else #define SAVE_GPR(n, base) stw n,GPR0+4*(n)(base) #define REST_GPR(n, base) lwz n,GPR0+4*(n)(base) -#define SAVE_NVGPRS(base) stmw 13, GPR0+4*13(base) -#define REST_NVGPRS(base) lmw 13, GPR0+4*13(base) +#define SAVE_NVGPRS(base) SAVE_GPR(13, base); SAVE_8GPRS(14, base); SAVE_10GPRS(22, base) +#define REST_NVGPRS(base) REST_GPR(13, base); REST_8GPRS(14, base); REST_10GPRS(22, base) #endif
#define SAVE_2GPRS(n, base) SAVE_GPR(n, base); SAVE_GPR(n+1, base)
From: Nicholas Piggin npiggin@gmail.com
[ Upstream commit aebd1fb45c622e9a2b06fb70665d084d3a8d6c78 ]
Introduce macros that operate on a (start, end) range of GPRs, which reduces lines of code and need to do mental arithmetic while reading the code.
Signed-off-by: Nicholas Piggin npiggin@gmail.com Reviewed-by: Segher Boessenkool segher@kernel.crashing.org Reviewed-by: Christophe Leroy christophe.leroy@csgroup.eu Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20211022061322.2671178-1-npiggin@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/boot/crt0.S | 31 +++++++------ arch/powerpc/crypto/md5-asm.S | 10 ++--- arch/powerpc/crypto/sha1-powerpc-asm.S | 6 +-- arch/powerpc/include/asm/ppc_asm.h | 43 ++++++++++++------- arch/powerpc/kernel/entry_32.S | 23 ++++------ arch/powerpc/kernel/exceptions-64e.S | 14 ++---- arch/powerpc/kernel/exceptions-64s.S | 6 +-- arch/powerpc/kernel/head_32.h | 3 +- arch/powerpc/kernel/head_booke.h | 3 +- arch/powerpc/kernel/interrupt_64.S | 34 ++++++--------- arch/powerpc/kernel/optprobes_head.S | 4 +- arch/powerpc/kernel/tm.S | 15 ++----- .../powerpc/kernel/trace/ftrace_64_mprofile.S | 15 +++---- arch/powerpc/kvm/book3s_hv_rmhandlers.S | 5 +-- .../lib/test_emulate_step_exec_instr.S | 8 ++-- 15 files changed, 94 insertions(+), 126 deletions(-)
diff --git a/arch/powerpc/boot/crt0.S b/arch/powerpc/boot/crt0.S index 1d83966f5ef6..e8f10a599659 100644 --- a/arch/powerpc/boot/crt0.S +++ b/arch/powerpc/boot/crt0.S @@ -226,16 +226,19 @@ p_base: mflr r10 /* r10 now points to runtime addr of p_base */ #ifdef __powerpc64__
#define PROM_FRAME_SIZE 512 -#define SAVE_GPR(n, base) std n,8*(n)(base) -#define REST_GPR(n, base) ld n,8*(n)(base) -#define SAVE_2GPRS(n, base) SAVE_GPR(n, base); SAVE_GPR(n+1, base) -#define SAVE_4GPRS(n, base) SAVE_2GPRS(n, base); SAVE_2GPRS(n+2, base) -#define SAVE_8GPRS(n, base) SAVE_4GPRS(n, base); SAVE_4GPRS(n+4, base) -#define SAVE_10GPRS(n, base) SAVE_8GPRS(n, base); SAVE_2GPRS(n+8, base) -#define REST_2GPRS(n, base) REST_GPR(n, base); REST_GPR(n+1, base) -#define REST_4GPRS(n, base) REST_2GPRS(n, base); REST_2GPRS(n+2, base) -#define REST_8GPRS(n, base) REST_4GPRS(n, base); REST_4GPRS(n+4, base) -#define REST_10GPRS(n, base) REST_8GPRS(n, base); REST_2GPRS(n+8, base) + +.macro OP_REGS op, width, start, end, base, offset + .Lreg=\start + .rept (\end - \start + 1) + \op .Lreg,\offset+\width*.Lreg(\base) + .Lreg=.Lreg+1 + .endr +.endm + +#define SAVE_GPRS(start, end, base) OP_REGS std, 8, start, end, base, 0 +#define REST_GPRS(start, end, base) OP_REGS ld, 8, start, end, base, 0 +#define SAVE_GPR(n, base) SAVE_GPRS(n, n, base) +#define REST_GPR(n, base) REST_GPRS(n, n, base)
/* prom handles the jump into and return from firmware. The prom args pointer is loaded in r3. */ @@ -246,9 +249,7 @@ prom: stdu r1,-PROM_FRAME_SIZE(r1) /* Save SP and create stack space */
SAVE_GPR(2, r1) - SAVE_GPR(13, r1) - SAVE_8GPRS(14, r1) - SAVE_10GPRS(22, r1) + SAVE_GPRS(13, 31, r1) mfcr r10 std r10,8*32(r1) mfmsr r10 @@ -283,9 +284,7 @@ prom:
/* Restore other registers */ REST_GPR(2, r1) - REST_GPR(13, r1) - REST_8GPRS(14, r1) - REST_10GPRS(22, r1) + REST_GPRS(13, 31, r1) ld r10,8*32(r1) mtcr r10
diff --git a/arch/powerpc/crypto/md5-asm.S b/arch/powerpc/crypto/md5-asm.S index 948d100a2934..fa6bc440cf4a 100644 --- a/arch/powerpc/crypto/md5-asm.S +++ b/arch/powerpc/crypto/md5-asm.S @@ -38,15 +38,11 @@
#define INITIALIZE \ PPC_STLU r1,-INT_FRAME_SIZE(r1); \ - SAVE_8GPRS(14, r1); /* push registers onto stack */ \ - SAVE_4GPRS(22, r1); \ - SAVE_GPR(26, r1) + SAVE_GPRS(14, 26, r1) /* push registers onto stack */
#define FINALIZE \ - REST_8GPRS(14, r1); /* pop registers from stack */ \ - REST_4GPRS(22, r1); \ - REST_GPR(26, r1); \ - addi r1,r1,INT_FRAME_SIZE; + REST_GPRS(14, 26, r1); /* pop registers from stack */ \ + addi r1,r1,INT_FRAME_SIZE
#ifdef __BIG_ENDIAN__ #define LOAD_DATA(reg, off) \ diff --git a/arch/powerpc/crypto/sha1-powerpc-asm.S b/arch/powerpc/crypto/sha1-powerpc-asm.S index 23e248beff71..f0d5ed557ab1 100644 --- a/arch/powerpc/crypto/sha1-powerpc-asm.S +++ b/arch/powerpc/crypto/sha1-powerpc-asm.S @@ -125,8 +125,7 @@
_GLOBAL(powerpc_sha_transform) PPC_STLU r1,-INT_FRAME_SIZE(r1) - SAVE_8GPRS(14, r1) - SAVE_10GPRS(22, r1) + SAVE_GPRS(14, 31, r1)
/* Load up A - E */ lwz RA(0),0(r3) /* A */ @@ -184,7 +183,6 @@ _GLOBAL(powerpc_sha_transform) stw RD(0),12(r3) stw RE(0),16(r3)
- REST_8GPRS(14, r1) - REST_10GPRS(22, r1) + REST_GPRS(14, 31, r1) addi r1,r1,INT_FRAME_SIZE blr diff --git a/arch/powerpc/include/asm/ppc_asm.h b/arch/powerpc/include/asm/ppc_asm.h index 7be24048b8d1..f21e6bde17a1 100644 --- a/arch/powerpc/include/asm/ppc_asm.h +++ b/arch/powerpc/include/asm/ppc_asm.h @@ -16,30 +16,41 @@
#define SZL (BITS_PER_LONG/8)
+/* + * This expands to a sequence of operations with reg incrementing from + * start to end inclusive, of this form: + * + * op reg, (offset + (width * reg))(base) + * + * Note that offset is not the offset of the first operation unless start + * is zero (or width is zero). + */ +.macro OP_REGS op, width, start, end, base, offset + .Lreg=\start + .rept (\end - \start + 1) + \op .Lreg, \offset + \width * .Lreg(\base) + .Lreg=.Lreg+1 + .endr +.endm + /* * Macros for storing registers into and loading registers from * exception frames. */ #ifdef __powerpc64__ -#define SAVE_GPR(n, base) std n,GPR0+8*(n)(base) -#define REST_GPR(n, base) ld n,GPR0+8*(n)(base) -#define SAVE_NVGPRS(base) SAVE_8GPRS(14, base); SAVE_10GPRS(22, base) -#define REST_NVGPRS(base) REST_8GPRS(14, base); REST_10GPRS(22, base) +#define SAVE_GPRS(start, end, base) OP_REGS std, 8, start, end, base, GPR0 +#define REST_GPRS(start, end, base) OP_REGS ld, 8, start, end, base, GPR0 +#define SAVE_NVGPRS(base) SAVE_GPRS(14, 31, base) +#define REST_NVGPRS(base) REST_GPRS(14, 31, base) #else -#define SAVE_GPR(n, base) stw n,GPR0+4*(n)(base) -#define REST_GPR(n, base) lwz n,GPR0+4*(n)(base) -#define SAVE_NVGPRS(base) SAVE_GPR(13, base); SAVE_8GPRS(14, base); SAVE_10GPRS(22, base) -#define REST_NVGPRS(base) REST_GPR(13, base); REST_8GPRS(14, base); REST_10GPRS(22, base) +#define SAVE_GPRS(start, end, base) OP_REGS stw, 4, start, end, base, GPR0 +#define REST_GPRS(start, end, base) OP_REGS lwz, 4, start, end, base, GPR0 +#define SAVE_NVGPRS(base) SAVE_GPRS(13, 31, base) +#define REST_NVGPRS(base) REST_GPRS(13, 31, base) #endif
-#define SAVE_2GPRS(n, base) SAVE_GPR(n, base); SAVE_GPR(n+1, base) -#define SAVE_4GPRS(n, base) SAVE_2GPRS(n, base); SAVE_2GPRS(n+2, base) -#define SAVE_8GPRS(n, base) SAVE_4GPRS(n, base); SAVE_4GPRS(n+4, base) -#define SAVE_10GPRS(n, base) SAVE_8GPRS(n, base); SAVE_2GPRS(n+8, base) -#define REST_2GPRS(n, base) REST_GPR(n, base); REST_GPR(n+1, base) -#define REST_4GPRS(n, base) REST_2GPRS(n, base); REST_2GPRS(n+2, base) -#define REST_8GPRS(n, base) REST_4GPRS(n, base); REST_4GPRS(n+4, base) -#define REST_10GPRS(n, base) REST_8GPRS(n, base); REST_2GPRS(n+8, base) +#define SAVE_GPR(n, base) SAVE_GPRS(n, n, base) +#define REST_GPR(n, base) REST_GPRS(n, n, base)
#define SAVE_FPR(n, base) stfd n,8*TS_FPRWIDTH*(n)(base) #define SAVE_2FPRS(n, base) SAVE_FPR(n, base); SAVE_FPR(n+1, base) diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S index 61fdd53cdd9a..c62dd9815965 100644 --- a/arch/powerpc/kernel/entry_32.S +++ b/arch/powerpc/kernel/entry_32.S @@ -90,8 +90,7 @@ transfer_to_syscall: stw r12,8(r1) stw r2,_TRAP(r1) SAVE_GPR(0, r1) - SAVE_4GPRS(3, r1) - SAVE_2GPRS(7, r1) + SAVE_GPRS(3, 8, r1) addi r2,r10,-THREAD SAVE_NVGPRS(r1)
@@ -139,7 +138,7 @@ syscall_exit_finish: mtxer r5 lwz r0,GPR0(r1) lwz r3,GPR3(r1) - REST_8GPRS(4,r1) + REST_GPRS(4, 11, r1) lwz r12,GPR12(r1) b 1b
@@ -232,9 +231,9 @@ fast_exception_return: beq 3f /* if not, we've got problems */ #endif
-2: REST_4GPRS(3, r11) +2: REST_GPRS(3, 6, r11) lwz r10,_CCR(r11) - REST_2GPRS(1, r11) + REST_GPRS(1, 2, r11) mtcr r10 lwz r10,_LINK(r11) mtlr r10 @@ -298,16 +297,14 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) * the reliable stack unwinder later on. Clear it. */ stw r0,8(r1) - REST_4GPRS(7, r1) - REST_2GPRS(11, r1) + REST_GPRS(7, 12, r1)
mtcr r3 mtlr r4 mtctr r5 mtspr SPRN_XER,r6
- REST_4GPRS(2, r1) - REST_GPR(6, r1) + REST_GPRS(2, 6, r1) REST_GPR(0, r1) REST_GPR(1, r1) rfi @@ -341,8 +338,7 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) lwz r6,_CCR(r1) li r0,0
- REST_4GPRS(7, r1) - REST_2GPRS(11, r1) + REST_GPRS(7, 12, r1)
mtlr r3 mtctr r4 @@ -354,7 +350,7 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) */ stw r0,8(r1)
- REST_4GPRS(2, r1) + REST_GPRS(2, 5, r1)
bne- cr1,1f /* emulate stack store */ mtcr r6 @@ -430,8 +426,7 @@ _ASM_NOKPROBE_SYMBOL(interrupt_return) bne interrupt_return; \ lwz r0,GPR0(r1); \ lwz r2,GPR2(r1); \ - REST_4GPRS(3, r1); \ - REST_2GPRS(7, r1); \ + REST_GPRS(3, 8, r1); \ lwz r10,_XER(r1); \ lwz r11,_CTR(r1); \ mtspr SPRN_XER,r10; \ diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S index 711c66b76df1..67dc4e3179a0 100644 --- a/arch/powerpc/kernel/exceptions-64e.S +++ b/arch/powerpc/kernel/exceptions-64e.S @@ -198,8 +198,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_EMB_HV)
stdcx. r0,0,r1 /* to clear the reservation */
- REST_4GPRS(2, r1) - REST_4GPRS(6, r1) + REST_GPRS(2, 9, r1)
ld r10,_CTR(r1) ld r11,_XER(r1) @@ -375,9 +374,7 @@ ret_from_mc_except: exc_##n##_common: \ std r0,GPR0(r1); /* save r0 in stackframe */ \ std r2,GPR2(r1); /* save r2 in stackframe */ \ - SAVE_4GPRS(3, r1); /* save r3 - r6 in stackframe */ \ - SAVE_2GPRS(7, r1); /* save r7, r8 in stackframe */ \ - std r9,GPR9(r1); /* save r9 in stackframe */ \ + SAVE_GPRS(3, 9, r1); /* save r3 - r9 in stackframe */ \ std r10,_NIP(r1); /* save SRR0 to stackframe */ \ std r11,_MSR(r1); /* save SRR1 to stackframe */ \ beq 2f; /* if from kernel mode */ \ @@ -1061,9 +1058,7 @@ bad_stack_book3e: std r11,_ESR(r1) std r0,GPR0(r1); /* save r0 in stackframe */ \ std r2,GPR2(r1); /* save r2 in stackframe */ \ - SAVE_4GPRS(3, r1); /* save r3 - r6 in stackframe */ \ - SAVE_2GPRS(7, r1); /* save r7, r8 in stackframe */ \ - std r9,GPR9(r1); /* save r9 in stackframe */ \ + SAVE_GPRS(3, 9, r1); /* save r3 - r9 in stackframe */ \ ld r3,PACA_EXGEN+EX_R10(r13);/* get back r10 */ \ ld r4,PACA_EXGEN+EX_R11(r13);/* get back r11 */ \ mfspr r5,SPRN_SPRG_GEN_SCRATCH;/* get back r13 XXX can be wrong */ \ @@ -1077,8 +1072,7 @@ bad_stack_book3e: std r10,_LINK(r1) std r11,_CTR(r1) std r12,_XER(r1) - SAVE_10GPRS(14,r1) - SAVE_8GPRS(24,r1) + SAVE_GPRS(14, 31, r1) lhz r12,PACA_TRAP_SAVE(r13) std r12,_TRAP(r1) addi r11,r1,INT_FRAME_SIZE diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index eaf1f72131a1..277eccf0f086 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -574,8 +574,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR) ld r10,IAREA+EX_CTR(r13) std r10,_CTR(r1) std r2,GPR2(r1) /* save r2 in stackframe */ - SAVE_4GPRS(3, r1) /* save r3 - r6 in stackframe */ - SAVE_2GPRS(7, r1) /* save r7, r8 in stackframe */ + SAVE_GPRS(3, 8, r1) /* save r3 - r8 in stackframe */ mflr r9 /* Get LR, later save to stack */ ld r2,PACATOC(r13) /* get kernel TOC into r2 */ std r9,_LINK(r1) @@ -693,8 +692,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR) mtlr r9 ld r9,_CCR(r1) mtcr r9 - REST_8GPRS(2, r1) - REST_4GPRS(10, r1) + REST_GPRS(2, 13, r1) REST_GPR(0, r1) /* restore original r1. */ ld r1,GPR1(r1) diff --git a/arch/powerpc/kernel/head_32.h b/arch/powerpc/kernel/head_32.h index 349c4a820231..261c79bdbe53 100644 --- a/arch/powerpc/kernel/head_32.h +++ b/arch/powerpc/kernel/head_32.h @@ -115,8 +115,7 @@ _ASM_NOKPROBE_SYMBOL(\name()_virt) stw r10,8(r1) li r10, \trapno stw r10,_TRAP(r1) - SAVE_4GPRS(3, r1) - SAVE_2GPRS(7, r1) + SAVE_GPRS(3, 8, r1) SAVE_NVGPRS(r1) stw r2,GPR2(r1) stw r12,_NIP(r1) diff --git a/arch/powerpc/kernel/head_booke.h b/arch/powerpc/kernel/head_booke.h index ef8d1b1c234e..bb6d5d0fc4ac 100644 --- a/arch/powerpc/kernel/head_booke.h +++ b/arch/powerpc/kernel/head_booke.h @@ -87,8 +87,7 @@ END_BTB_FLUSH_SECTION stw r10, 8(r1) li r10, \trapno stw r10,_TRAP(r1) - SAVE_4GPRS(3, r1) - SAVE_2GPRS(7, r1) + SAVE_GPRS(3, 8, r1) SAVE_NVGPRS(r1) stw r2,GPR2(r1) stw r12,_NIP(r1) diff --git a/arch/powerpc/kernel/interrupt_64.S b/arch/powerpc/kernel/interrupt_64.S index 4c6d1a8dcefe..ff8c8c03f41a 100644 --- a/arch/powerpc/kernel/interrupt_64.S +++ b/arch/powerpc/kernel/interrupt_64.S @@ -166,10 +166,9 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) * The value of AMR only matters while we're in the kernel. */ mtcr r2 - ld r2,GPR2(r1) - ld r3,GPR3(r1) - ld r13,GPR13(r1) - ld r1,GPR1(r1) + REST_GPRS(2, 3, r1) + REST_GPR(13, r1) + REST_GPR(1, r1) RFSCV_TO_USER b . /* prevent speculative execution */
@@ -187,9 +186,8 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) mtctr r3 mtlr r4 mtspr SPRN_XER,r5 - REST_10GPRS(2, r1) - REST_2GPRS(12, r1) - ld r1,GPR1(r1) + REST_GPRS(2, 13, r1) + REST_GPR(1, r1) RFI_TO_USER .Lsyscall_vectored_\name()_rst_end:
@@ -378,10 +376,9 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) * The value of AMR only matters while we're in the kernel. */ mtcr r2 - ld r2,GPR2(r1) - ld r3,GPR3(r1) - ld r13,GPR13(r1) - ld r1,GPR1(r1) + REST_GPRS(2, 3, r1) + REST_GPR(13, r1) + REST_GPR(1, r1) RFI_TO_USER b . /* prevent speculative execution */
@@ -392,8 +389,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) mtctr r3 mtspr SPRN_XER,r4 ld r0,GPR0(r1) - REST_8GPRS(4, r1) - ld r12,GPR12(r1) + REST_GPRS(4, 12, r1) b .Lsyscall_restore_regs_cont .Lsyscall_rst_end:
@@ -522,17 +518,14 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) ld r6,_XER(r1) li r0,0
- REST_4GPRS(7, r1) - REST_2GPRS(11, r1) - REST_GPR(13, r1) + REST_GPRS(7, 13, r1)
mtcr r3 mtlr r4 mtctr r5 mtspr SPRN_XER,r6
- REST_4GPRS(2, r1) - REST_GPR(6, r1) + REST_GPRS(2, 6, r1) REST_GPR(0, r1) REST_GPR(1, r1) .ifc \srr,srr @@ -629,8 +622,7 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) ld r6,_CCR(r1) li r0,0
- REST_4GPRS(7, r1) - REST_2GPRS(11, r1) + REST_GPRS(7, 12, r1)
mtlr r3 mtctr r4 @@ -642,7 +634,7 @@ ALT_FTR_SECTION_END_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) */ std r0,STACK_FRAME_OVERHEAD-16(r1)
- REST_4GPRS(2, r1) + REST_GPRS(2, 5, r1)
bne- cr1,1f /* emulate stack store */ mtcr r6 diff --git a/arch/powerpc/kernel/optprobes_head.S b/arch/powerpc/kernel/optprobes_head.S index 19ea3312403c..5c7f0b4b784b 100644 --- a/arch/powerpc/kernel/optprobes_head.S +++ b/arch/powerpc/kernel/optprobes_head.S @@ -10,8 +10,8 @@ #include <asm/asm-offsets.h>
#ifdef CONFIG_PPC64 -#define SAVE_30GPRS(base) SAVE_10GPRS(2,base); SAVE_10GPRS(12,base); SAVE_10GPRS(22,base) -#define REST_30GPRS(base) REST_10GPRS(2,base); REST_10GPRS(12,base); REST_10GPRS(22,base) +#define SAVE_30GPRS(base) SAVE_GPRS(2, 31, base) +#define REST_30GPRS(base) REST_GPRS(2, 31, base) #define TEMPLATE_FOR_IMM_LOAD_INSNS nop; nop; nop; nop; nop #else #define SAVE_30GPRS(base) stmw r2, GPR2(base) diff --git a/arch/powerpc/kernel/tm.S b/arch/powerpc/kernel/tm.S index 2b91f233b05d..3beecc32940b 100644 --- a/arch/powerpc/kernel/tm.S +++ b/arch/powerpc/kernel/tm.S @@ -226,11 +226,8 @@ _GLOBAL(tm_reclaim)
/* Sync the userland GPRs 2-12, 14-31 to thread->regs: */ SAVE_GPR(0, r7) /* user r0 */ - SAVE_GPR(2, r7) /* user r2 */ - SAVE_4GPRS(3, r7) /* user r3-r6 */ - SAVE_GPR(8, r7) /* user r8 */ - SAVE_GPR(9, r7) /* user r9 */ - SAVE_GPR(10, r7) /* user r10 */ + SAVE_GPRS(2, 6, r7) /* user r2-r6 */ + SAVE_GPRS(8, 10, r7) /* user r8-r10 */ ld r3, GPR1(r1) /* user r1 */ ld r4, GPR7(r1) /* user r7 */ ld r5, GPR11(r1) /* user r11 */ @@ -445,12 +442,8 @@ restore_gprs: ld r6, THREAD_TM_PPR(r3)
REST_GPR(0, r7) /* GPR0 */ - REST_2GPRS(2, r7) /* GPR2-3 */ - REST_GPR(4, r7) /* GPR4 */ - REST_4GPRS(8, r7) /* GPR8-11 */ - REST_2GPRS(12, r7) /* GPR12-13 */ - - REST_NVGPRS(r7) /* GPR14-31 */ + REST_GPRS(2, 4, r7) /* GPR2-4 */ + REST_GPRS(8, 31, r7) /* GPR8-31 */
/* Load up PPR and DSCR here so we don't run with user values for long */ mtspr SPRN_DSCR, r5 diff --git a/arch/powerpc/kernel/trace/ftrace_64_mprofile.S b/arch/powerpc/kernel/trace/ftrace_64_mprofile.S index f9fd5f743eba..d636fc755f60 100644 --- a/arch/powerpc/kernel/trace/ftrace_64_mprofile.S +++ b/arch/powerpc/kernel/trace/ftrace_64_mprofile.S @@ -41,15 +41,14 @@ _GLOBAL(ftrace_regs_caller)
/* Save all gprs to pt_regs */ SAVE_GPR(0, r1) - SAVE_10GPRS(2, r1) + SAVE_GPRS(2, 11, r1)
/* Ok to continue? */ lbz r3, PACA_FTRACE_ENABLED(r13) cmpdi r3, 0 beq ftrace_no_trace
- SAVE_10GPRS(12, r1) - SAVE_10GPRS(22, r1) + SAVE_GPRS(12, 31, r1)
/* Save previous stack pointer (r1) */ addi r8, r1, SWITCH_FRAME_SIZE @@ -108,10 +107,8 @@ ftrace_regs_call: #endif
/* Restore gprs */ - REST_GPR(0,r1) - REST_10GPRS(2,r1) - REST_10GPRS(12,r1) - REST_10GPRS(22,r1) + REST_GPR(0, r1) + REST_GPRS(2, 31, r1)
/* Restore possibly modified LR */ ld r0, _LINK(r1) @@ -157,7 +154,7 @@ _GLOBAL(ftrace_caller) stdu r1, -SWITCH_FRAME_SIZE(r1)
/* Save all gprs to pt_regs */ - SAVE_8GPRS(3, r1) + SAVE_GPRS(3, 10, r1)
lbz r3, PACA_FTRACE_ENABLED(r13) cmpdi r3, 0 @@ -194,7 +191,7 @@ ftrace_call: mtctr r3
/* Restore gprs */ - REST_8GPRS(3,r1) + REST_GPRS(3, 10, r1)
/* Restore callee's TOC */ ld r2, 24(r1) diff --git a/arch/powerpc/kvm/book3s_hv_rmhandlers.S b/arch/powerpc/kvm/book3s_hv_rmhandlers.S index 32a4b4d412b9..81fc1e0ebe9a 100644 --- a/arch/powerpc/kvm/book3s_hv_rmhandlers.S +++ b/arch/powerpc/kvm/book3s_hv_rmhandlers.S @@ -2711,8 +2711,7 @@ kvmppc_bad_host_intr: std r0, GPR0(r1) std r9, GPR1(r1) std r2, GPR2(r1) - SAVE_4GPRS(3, r1) - SAVE_2GPRS(7, r1) + SAVE_GPRS(3, 8, r1) srdi r0, r12, 32 clrldi r12, r12, 32 std r0, _CCR(r1) @@ -2735,7 +2734,7 @@ kvmppc_bad_host_intr: ld r9, HSTATE_SCRATCH2(r13) ld r12, HSTATE_SCRATCH0(r13) GET_SCRATCH0(r0) - SAVE_4GPRS(9, r1) + SAVE_GPRS(9, 12, r1) std r0, GPR13(r1) SAVE_NVGPRS(r1) ld r5, HSTATE_CFAR(r13) diff --git a/arch/powerpc/lib/test_emulate_step_exec_instr.S b/arch/powerpc/lib/test_emulate_step_exec_instr.S index 9ef941d958d8..5473f9d03df3 100644 --- a/arch/powerpc/lib/test_emulate_step_exec_instr.S +++ b/arch/powerpc/lib/test_emulate_step_exec_instr.S @@ -37,7 +37,7 @@ _GLOBAL(exec_instr) * The stack pointer (GPR1) and the thread pointer (GPR13) are not * saved as these should not be modified anyway. */ - SAVE_2GPRS(2, r1) + SAVE_GPRS(2, 3, r1) SAVE_NVGPRS(r1)
/* @@ -75,8 +75,7 @@ _GLOBAL(exec_instr)
/* Load GPRs from pt_regs */ REST_GPR(0, r31) - REST_10GPRS(2, r31) - REST_GPR(12, r31) + REST_GPRS(2, 12, r31) REST_NVGPRS(r31)
/* Placeholder for the test instruction */ @@ -99,8 +98,7 @@ _GLOBAL(exec_instr) subi r3, r3, GPR0 SAVE_GPR(0, r3) SAVE_GPR(2, r3) - SAVE_8GPRS(4, r3) - SAVE_GPR(12, r3) + SAVE_GPRS(4, 12, r3) SAVE_NVGPRS(r3)
/* Save resulting LR to pt_regs */
From: Nicholas Piggin npiggin@gmail.com
[ Upstream commit 9d71165d3934e607070c4e48458c0cf161b1baea ]
Commit cf13435b730a ("powerpc/tm: Fix userspace r13 corruption") fixes a problem in treclaim where a SLB miss can occur on the thread_struct->ckpt_regs while SCRATCH0 is live with the saved user r13 value, clobbering it with the kernel r13 and ultimately resulting in kernel r13 being stored in ckpt_regs.
There is an equivalent problem in trechkpt where the user r13 value is loaded into r13 from chkpt_regs to be recheckpointed, but a SLB miss could occur on ckpt_regs accesses after that, which will result in r13 being clobbered with a kernel value and that will get recheckpointed and then restored to user registers.
The same memory page is accessed right before this critical window where a SLB miss could cause corruption, so hitting the bug requires the SLB entry be removed within a small window of instructions, which is possible if a SLB related MCE hits there. PAPR also permits the hypervisor to discard this SLB entry (because slb_shadow->persistent is only set to SLB_NUM_BOLTED) although it's not known whether any implementations would do this (KVM does not). So this is an extremely unlikely bug, only found by inspection.
Fix this by also storing user r13 in a temporary location on the kernel stack and don't change the r13 register from kernel r13 until the RI=0 critical section that does not fault.
The SCRATCH0 change is not strictly part of the fix, it's only used in the RI=0 section so it does not have the same problem as the previous SCRATCH0 bug.
Fixes: 98ae22e15b43 ("powerpc: Add helper functions for transactional memory context switching") Cc: stable@vger.kernel.org # v3.9+ Signed-off-by: Nicholas Piggin npiggin@gmail.com Acked-by: Michael Neuling mikey@neuling.org Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20220311024733.48926-1-npiggin@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/kernel/tm.S | 25 ++++++++++++++++--------- 1 file changed, 16 insertions(+), 9 deletions(-)
diff --git a/arch/powerpc/kernel/tm.S b/arch/powerpc/kernel/tm.S index 3beecc32940b..5a0f023a26e9 100644 --- a/arch/powerpc/kernel/tm.S +++ b/arch/powerpc/kernel/tm.S @@ -443,7 +443,8 @@ restore_gprs:
REST_GPR(0, r7) /* GPR0 */ REST_GPRS(2, 4, r7) /* GPR2-4 */ - REST_GPRS(8, 31, r7) /* GPR8-31 */ + REST_GPRS(8, 12, r7) /* GPR8-12 */ + REST_GPRS(14, 31, r7) /* GPR14-31 */
/* Load up PPR and DSCR here so we don't run with user values for long */ mtspr SPRN_DSCR, r5 @@ -479,18 +480,24 @@ restore_gprs: REST_GPR(6, r7)
/* - * Store r1 and r5 on the stack so that we can access them after we - * clear MSR RI. + * Store user r1 and r5 and r13 on the stack (in the unused save + * areas / compiler reserved areas), so that we can access them after + * we clear MSR RI. */
REST_GPR(5, r7) std r5, -8(r1) - ld r5, GPR1(r7) + ld r5, GPR13(r7) std r5, -16(r1) + ld r5, GPR1(r7) + std r5, -24(r1)
REST_GPR(7, r7)
- /* Clear MSR RI since we are about to use SCRATCH0. EE is already off */ + /* Stash the stack pointer away for use after recheckpoint */ + std r1, PACAR1(r13) + + /* Clear MSR RI since we are about to clobber r13. EE is already off */ li r5, 0 mtmsrd r5, 1
@@ -501,9 +508,9 @@ restore_gprs: * until we turn MSR RI back on. */
- SET_SCRATCH0(r1) ld r5, -8(r1) - ld r1, -16(r1) + ld r13, -16(r1) + ld r1, -24(r1)
/* Commit register state as checkpointed state: */ TRECHKPT @@ -519,9 +526,9 @@ restore_gprs: */
GET_PACA(r13) - GET_SCRATCH0(r1) + ld r1, PACAR1(r13)
- /* R1 is restored, so we are recoverable again. EE is still off */ + /* R13, R1 is restored, so we are recoverable again. EE is still off */ li r4, MSR_RI mtmsrd r4, 1
From: Hui Wang hui.wang@canonical.com
[ Upstream commit 927728a34f11b5a27f4610bdb7068317d6fdc72a ]
We tested RS485 function on an EVB which has SC16IS752, after finishing the test, we started the RS232 function test, but found the RTS is still working in the RS485 mode.
That is because both startup and shutdown call port_update() to set the EFCR_REG, this will not clear the RS485 bits once the bits are set in the reconf_rs485(). To fix it, clear the RS485 bits in shutdown.
Cc: stable@vger.kernel.org Signed-off-by: Hui Wang hui.wang@canonical.com Link: https://lore.kernel.org/r/20220308110042.108451-1-hui.wang@canonical.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/tty/serial/sc16is7xx.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/tty/serial/sc16is7xx.c b/drivers/tty/serial/sc16is7xx.c index 0ab788058fa2..e98aa7b97cc5 100644 --- a/drivers/tty/serial/sc16is7xx.c +++ b/drivers/tty/serial/sc16is7xx.c @@ -1055,10 +1055,12 @@ static void sc16is7xx_shutdown(struct uart_port *port)
/* Disable all interrupts */ sc16is7xx_port_write(port, SC16IS7XX_IER_REG, 0); - /* Disable TX/RX */ + /* Disable TX/RX, clear auto RS485 and RTS invert */ sc16is7xx_port_update(port, SC16IS7XX_EFCR_REG, SC16IS7XX_EFCR_RXDISABLE_BIT | - SC16IS7XX_EFCR_TXDISABLE_BIT, + SC16IS7XX_EFCR_TXDISABLE_BIT | + SC16IS7XX_EFCR_AUTO_RS485_BIT | + SC16IS7XX_EFCR_RTS_INVERT_BIT, SC16IS7XX_EFCR_RXDISABLE_BIT | SC16IS7XX_EFCR_TXDISABLE_BIT);
From: Kees Cook keescook@chromium.org
[ Upstream commit 5a717e93239fc373a314e03e45c43b62ebea1b26 ]
The find.h APIs are designed to be used only on unsigned long arguments. This can technically result in a over-read, but it is harmless in this case. Regardless, fix it to avoid the warning seen under -Warray-bounds, which we'd like to enable globally:
In file included from ./include/linux/bitmap.h:9, from ./include/linux/cpumask.h:12, from ./arch/x86/include/asm/cpumask.h:5, from ./arch/x86/include/asm/msr.h:11, from ./arch/x86/include/asm/processor.h:22, from ./arch/x86/include/asm/cpufeature.h:5, from ./arch/x86/include/asm/thread_info.h:53, from ./include/linux/thread_info.h:60, from ./arch/x86/include/asm/preempt.h:7, from ./include/linux/preempt.h:78, from ./include/linux/spinlock.h:55, from ./include/linux/wait.h:9, from ./include/linux/wait_bit.h:8, from ./include/linux/fs.h:6, from ./include/linux/debugfs.h:15, from drivers/bus/mhi/core/init.c:7: drivers/bus/mhi/core/init.c: In function 'to_mhi_pm_state_str': ./include/linux/find.h:187:37: warning: array subscript 'long unsigned int[0]' is partly outside array bounds of 'enum mhi_pm_state[1]' [-Warray-bounds] 187 | unsigned long val = *addr & GENMASK(size - 1, 0); | ^~~~~ drivers/bus/mhi/core/init.c:80:51: note: while referencing 'state' 80 | const char *to_mhi_pm_state_str(enum mhi_pm_state state) | ~~~~~~~~~~~~~~~~~~^~~~~
Link: https://lore.kernel.org/r/20211215232446.2069794-1-keescook@chromium.org [mani: changed the variable name "bits" to "pm_state"] Reviewed-by: Manivannan Sadhasivam mani@kernel.org Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: Manivannan Sadhasivam manivannan.sadhasivam@linaro.org Link: https://lore.kernel.org/r/20211216081227.237749-10-manivannan.sadhasivam@lin... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bus/mhi/core/init.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/bus/mhi/core/init.c b/drivers/bus/mhi/core/init.c index 4183945fc2c4..c0187367ae75 100644 --- a/drivers/bus/mhi/core/init.c +++ b/drivers/bus/mhi/core/init.c @@ -79,7 +79,8 @@ static const char * const mhi_pm_state_str[] = {
const char *to_mhi_pm_state_str(enum mhi_pm_state state) { - int index = find_last_bit((unsigned long *)&state, 32); + unsigned long pm_state = state; + int index = find_last_bit(&pm_state, 32);
if (index >= ARRAY_SIZE(mhi_pm_state_str)) return "Invalid State";
From: Paul Davey paul.davey@alliedtelesis.co.nz
[ Upstream commit 64f93a9a27c1970fa8ee5ffc5a6ae2bda477ec5b ]
On big endian architectures the mhi debugfs files which report pm state give "Invalid State" for all states. This is caused by using find_last_bit which takes an unsigned long* while the state is passed in as an enum mhi_pm_state which will be of int size.
Fix by using __fls to pass the value of state instead of find_last_bit.
Also the current API expects "mhi_pm_state" enumerator as the function argument but the function only works with bitmasks. So as Alex suggested, let's change the argument to u32 to avoid confusion.
Fixes: a6e2e3522f29 ("bus: mhi: core: Add support for PM state transitions") Cc: stable@vger.kernel.org [mani: changed the function argument to u32] Reviewed-by: Manivannan Sadhasivam mani@kernel.org Reviewed-by: Hemant Kumar hemantk@codeaurora.org Reviewed-by: Alex Elder elder@linaro.org Signed-off-by: Paul Davey paul.davey@alliedtelesis.co.nz Signed-off-by: Manivannan Sadhasivam manivannan.sadhasivam@linaro.org Link: https://lore.kernel.org/r/20220301160308.107452-3-manivannan.sadhasivam@lina... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/bus/mhi/core/init.c | 10 ++++++---- drivers/bus/mhi/core/internal.h | 2 +- 2 files changed, 7 insertions(+), 5 deletions(-)
diff --git a/drivers/bus/mhi/core/init.c b/drivers/bus/mhi/core/init.c index c0187367ae75..d8787aaa176b 100644 --- a/drivers/bus/mhi/core/init.c +++ b/drivers/bus/mhi/core/init.c @@ -77,12 +77,14 @@ static const char * const mhi_pm_state_str[] = { [MHI_PM_STATE_LD_ERR_FATAL_DETECT] = "Linkdown or Error Fatal Detect", };
-const char *to_mhi_pm_state_str(enum mhi_pm_state state) +const char *to_mhi_pm_state_str(u32 state) { - unsigned long pm_state = state; - int index = find_last_bit(&pm_state, 32); + int index;
- if (index >= ARRAY_SIZE(mhi_pm_state_str)) + if (state) + index = __fls(state); + + if (!state || index >= ARRAY_SIZE(mhi_pm_state_str)) return "Invalid State";
return mhi_pm_state_str[index]; diff --git a/drivers/bus/mhi/core/internal.h b/drivers/bus/mhi/core/internal.h index c02c4d48b744..71f181402be9 100644 --- a/drivers/bus/mhi/core/internal.h +++ b/drivers/bus/mhi/core/internal.h @@ -622,7 +622,7 @@ void mhi_free_bhie_table(struct mhi_controller *mhi_cntrl, enum mhi_pm_state __must_check mhi_tryset_pm_state( struct mhi_controller *mhi_cntrl, enum mhi_pm_state state); -const char *to_mhi_pm_state_str(enum mhi_pm_state state); +const char *to_mhi_pm_state_str(u32 state); int mhi_queue_state_transition(struct mhi_controller *mhi_cntrl, enum dev_st_transition state); void mhi_pm_st_worker(struct work_struct *work);
From: Kees Cook keescook@chromium.org
[ Upstream commit 3080ea5553cc909b000d1f1d964a9041962f2c5b ]
There are many places where kernel code wants to have several different typed trailing flexible arrays. This would normally be done with multiple flexible arrays in a union, but since GCC and Clang don't (on the surface) allow this, there have been many open-coded workarounds, usually involving neighboring 0-element arrays at the end of a structure. For example, instead of something like this:
struct thing { ... union { struct type1 foo[]; struct type2 bar[]; }; };
code works around the compiler with:
struct thing { ... struct type1 foo[0]; struct type2 bar[]; };
Another case is when a flexible array is wanted as the single member within a struct (which itself is usually in a union). For example, this would be worked around as:
union many { ... struct { struct type3 baz[0]; }; };
These kinds of work-arounds cause problems with size checks against such zero-element arrays (for example when building with -Warray-bounds and -Wzero-length-bounds, and with the coming FORTIFY_SOURCE improvements), so they must all be converted to "real" flexible arrays, avoiding warnings like this:
fs/hpfs/anode.c: In function 'hpfs_add_sector_to_btree': fs/hpfs/anode.c:209:27: warning: array subscript 0 is outside the bounds of an interior zero-length array 'struct bplus_internal_node[0]' [-Wzero-length-bounds] 209 | anode->btree.u.internal[0].down = cpu_to_le32(a); | ~~~~~~~~~~~~~~~~~~~~~~~^~~ In file included from fs/hpfs/hpfs_fn.h:26, from fs/hpfs/anode.c:10: fs/hpfs/hpfs.h:412:32: note: while referencing 'internal' 412 | struct bplus_internal_node internal[0]; /* (internal) 2-word entries giving | ^~~~~~~~
drivers/net/can/usb/etas_es58x/es58x_fd.c: In function 'es58x_fd_tx_can_msg': drivers/net/can/usb/etas_es58x/es58x_fd.c:360:35: warning: array subscript 65535 is outside the bounds of an interior zero-length array 'u8[0]' {aka 'unsigned char[]'} [-Wzero-length-bounds] 360 | tx_can_msg = (typeof(tx_can_msg))&es58x_fd_urb_cmd->raw_msg[msg_len]; | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ In file included from drivers/net/can/usb/etas_es58x/es58x_core.h:22, from drivers/net/can/usb/etas_es58x/es58x_fd.c:17: drivers/net/can/usb/etas_es58x/es58x_fd.h:231:6: note: while referencing 'raw_msg' 231 | u8 raw_msg[0]; | ^~~~~~~
However, it _is_ entirely possible to have one or more flexible arrays in a struct or union: it just has to be in another struct. And since it cannot be alone in a struct, such a struct must have at least 1 other named member -- but that member can be zero sized. Wrap all this nonsense into the new DECLARE_FLEX_ARRAY() in support of having flexible arrays in unions (or alone in a struct).
As with struct_group(), since this is needed in UAPI headers as well, implement the core there, with a non-UAPI wrapper.
Additionally update kernel-doc to understand its existence.
https://github.com/KSPP/linux/issues/137
Cc: Arnd Bergmann arnd@arndb.de Cc: "Gustavo A. R. Silva" gustavoars@kernel.org Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/stddef.h | 13 +++++++++++++ include/uapi/linux/stddef.h | 16 ++++++++++++++++ scripts/kernel-doc | 2 ++ 3 files changed, 31 insertions(+)
diff --git a/include/linux/stddef.h b/include/linux/stddef.h index 938216f8ab7e..31fdbb784c24 100644 --- a/include/linux/stddef.h +++ b/include/linux/stddef.h @@ -84,4 +84,17 @@ enum { #define struct_group_tagged(TAG, NAME, MEMBERS...) \ __struct_group(TAG, NAME, /* no attrs */, MEMBERS)
+/** + * DECLARE_FLEX_ARRAY() - Declare a flexible array usable in a union + * + * @TYPE: The type of each flexible array element + * @NAME: The name of the flexible array member + * + * In order to have a flexible array member in a union or alone in a + * struct, it needs to be wrapped in an anonymous struct with at least 1 + * named member, but that member can be empty. + */ +#define DECLARE_FLEX_ARRAY(TYPE, NAME) \ + __DECLARE_FLEX_ARRAY(TYPE, NAME) + #endif diff --git a/include/uapi/linux/stddef.h b/include/uapi/linux/stddef.h index 610204f7c275..3021ea25a284 100644 --- a/include/uapi/linux/stddef.h +++ b/include/uapi/linux/stddef.h @@ -25,3 +25,19 @@ struct { MEMBERS } ATTRS; \ struct TAG { MEMBERS } ATTRS NAME; \ } + +/** + * __DECLARE_FLEX_ARRAY() - Declare a flexible array usable in a union + * + * @TYPE: The type of each flexible array element + * @NAME: The name of the flexible array member + * + * In order to have a flexible array member in a union or alone in a + * struct, it needs to be wrapped in an anonymous struct with at least 1 + * named member, but that member can be empty. + */ +#define __DECLARE_FLEX_ARRAY(TYPE, NAME) \ + struct { \ + struct { } __empty_ ## NAME; \ + TYPE NAME[]; \ + } diff --git a/scripts/kernel-doc b/scripts/kernel-doc index 38aa799a776c..5d54b57ff90c 100755 --- a/scripts/kernel-doc +++ b/scripts/kernel-doc @@ -1263,6 +1263,8 @@ sub dump_struct($$) { $members =~ s/DECLARE_KFIFO\s*($args,\s*$args,\s*$args)/$2 *$1/gos; # replace DECLARE_KFIFO_PTR $members =~ s/DECLARE_KFIFO_PTR\s*($args,\s*$args)/$2 *$1/gos; + # replace DECLARE_FLEX_ARRAY + $members =~ s/(?:__)?DECLARE_FLEX_ARRAY\s*($args,\s*$args)/$1 $2[]/gos; my $declaration = $members;
# Split nested struct/union elements as newer ones
From: Tadeusz Struk tadeusz.struk@linaro.org
[ Upstream commit 55037ed7bdc62151a726f5685f88afa6a82959b1 ]
Add include guard wrapper define to uapi/linux/stddef.h to prevent macro redefinition errors when stddef.h is included more than once. This was not needed before since the only contents already used a redefinition test.
Signed-off-by: Tadeusz Struk tadeusz.struk@linaro.org Link: https://lore.kernel.org/r/20220329171252.57279-1-tadeusz.struk@linaro.org Fixes: 50d7bd38c3aa ("stddef: Introduce struct_group() helper macro") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/uapi/linux/stddef.h | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/include/uapi/linux/stddef.h b/include/uapi/linux/stddef.h index 3021ea25a284..7837ba4fe728 100644 --- a/include/uapi/linux/stddef.h +++ b/include/uapi/linux/stddef.h @@ -1,4 +1,7 @@ /* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ +#ifndef _UAPI_LINUX_STDDEF_H +#define _UAPI_LINUX_STDDEF_H + #include <linux/compiler_types.h>
#ifndef __always_inline @@ -41,3 +44,4 @@ struct { } __empty_ ## NAME; \ TYPE NAME[]; \ } +#endif
From: Jack Yu jack.yu@realtek.com
[ Upstream commit 57589f82762e40bdaa975d840fa2bc5157b5be95 ]
The DAI clock is only used in I2S mode, to make it clear and to fix clock resource release issue, we move CCF clock related code to rt5682_i2c_probe to fix clock register/unregister issue.
Signed-off-by: Jack Yu jack.yu@realtek.com Link: https://lore.kernel.org/r/20210929054344.12112-1-jack.yu@realtek.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/rt5682-i2c.c | 22 +++++++++++ sound/soc/codecs/rt5682.c | 70 +++++++++++++---------------------- sound/soc/codecs/rt5682.h | 3 ++ 3 files changed, 51 insertions(+), 44 deletions(-)
diff --git a/sound/soc/codecs/rt5682-i2c.c b/sound/soc/codecs/rt5682-i2c.c index 74a1fee071dd..3d2d7c9ce66d 100644 --- a/sound/soc/codecs/rt5682-i2c.c +++ b/sound/soc/codecs/rt5682-i2c.c @@ -133,6 +133,8 @@ static int rt5682_i2c_probe(struct i2c_client *i2c,
i2c_set_clientdata(i2c, rt5682);
+ rt5682->i2c_dev = &i2c->dev; + rt5682->pdata = i2s_default_platform_data;
if (pdata) @@ -270,6 +272,26 @@ static int rt5682_i2c_probe(struct i2c_client *i2c, dev_err(&i2c->dev, "Failed to reguest IRQ: %d\n", ret); }
+#ifdef CONFIG_COMMON_CLK + /* Check if MCLK provided */ + rt5682->mclk = devm_clk_get(&i2c->dev, "mclk"); + if (IS_ERR(rt5682->mclk)) { + if (PTR_ERR(rt5682->mclk) != -ENOENT) { + ret = PTR_ERR(rt5682->mclk); + return ret; + } + rt5682->mclk = NULL; + } + + /* Register CCF DAI clock control */ + ret = rt5682_register_dai_clks(rt5682); + if (ret) + return ret; + + /* Initial setup for CCF */ + rt5682->lrck[RT5682_AIF1] = 48000; +#endif + return devm_snd_soc_register_component(&i2c->dev, &rt5682_soc_component_dev, rt5682_dai, ARRAY_SIZE(rt5682_dai)); diff --git a/sound/soc/codecs/rt5682.c b/sound/soc/codecs/rt5682.c index 1cd59e166cce..80d199843b8c 100644 --- a/sound/soc/codecs/rt5682.c +++ b/sound/soc/codecs/rt5682.c @@ -2562,7 +2562,7 @@ static int rt5682_set_bias_level(struct snd_soc_component *component, static bool rt5682_clk_check(struct rt5682_priv *rt5682) { if (!rt5682->master[RT5682_AIF1]) { - dev_dbg(rt5682->component->dev, "sysclk/dai not set correctly\n"); + dev_dbg(rt5682->i2c_dev, "sysclk/dai not set correctly\n"); return false; } return true; @@ -2573,13 +2573,15 @@ static int rt5682_wclk_prepare(struct clk_hw *hw) struct rt5682_priv *rt5682 = container_of(hw, struct rt5682_priv, dai_clks_hw[RT5682_DAI_WCLK_IDX]); - struct snd_soc_component *component = rt5682->component; - struct snd_soc_dapm_context *dapm = - snd_soc_component_get_dapm(component); + struct snd_soc_component *component; + struct snd_soc_dapm_context *dapm;
if (!rt5682_clk_check(rt5682)) return -EINVAL;
+ component = rt5682->component; + dapm = snd_soc_component_get_dapm(component); + snd_soc_dapm_mutex_lock(dapm);
snd_soc_dapm_force_enable_pin_unlocked(dapm, "MICBIAS"); @@ -2609,13 +2611,15 @@ static void rt5682_wclk_unprepare(struct clk_hw *hw) struct rt5682_priv *rt5682 = container_of(hw, struct rt5682_priv, dai_clks_hw[RT5682_DAI_WCLK_IDX]); - struct snd_soc_component *component = rt5682->component; - struct snd_soc_dapm_context *dapm = - snd_soc_component_get_dapm(component); + struct snd_soc_component *component; + struct snd_soc_dapm_context *dapm;
if (!rt5682_clk_check(rt5682)) return;
+ component = rt5682->component; + dapm = snd_soc_component_get_dapm(component); + snd_soc_dapm_mutex_lock(dapm);
snd_soc_dapm_disable_pin_unlocked(dapm, "MICBIAS"); @@ -2639,7 +2643,6 @@ static unsigned long rt5682_wclk_recalc_rate(struct clk_hw *hw, struct rt5682_priv *rt5682 = container_of(hw, struct rt5682_priv, dai_clks_hw[RT5682_DAI_WCLK_IDX]); - struct snd_soc_component *component = rt5682->component; const char * const clk_name = clk_hw_get_name(hw);
if (!rt5682_clk_check(rt5682)) @@ -2649,7 +2652,7 @@ static unsigned long rt5682_wclk_recalc_rate(struct clk_hw *hw, */ if (rt5682->lrck[RT5682_AIF1] != CLK_48 && rt5682->lrck[RT5682_AIF1] != CLK_44) { - dev_warn(component->dev, "%s: clk %s only support %d or %d Hz output\n", + dev_warn(rt5682->i2c_dev, "%s: clk %s only support %d or %d Hz output\n", __func__, clk_name, CLK_44, CLK_48); return 0; } @@ -2663,7 +2666,6 @@ static long rt5682_wclk_round_rate(struct clk_hw *hw, unsigned long rate, struct rt5682_priv *rt5682 = container_of(hw, struct rt5682_priv, dai_clks_hw[RT5682_DAI_WCLK_IDX]); - struct snd_soc_component *component = rt5682->component; const char * const clk_name = clk_hw_get_name(hw);
if (!rt5682_clk_check(rt5682)) @@ -2673,7 +2675,7 @@ static long rt5682_wclk_round_rate(struct clk_hw *hw, unsigned long rate, * It will force to 48kHz if not both. */ if (rate != CLK_48 && rate != CLK_44) { - dev_warn(component->dev, "%s: clk %s only support %d or %d Hz output\n", + dev_warn(rt5682->i2c_dev, "%s: clk %s only support %d or %d Hz output\n", __func__, clk_name, CLK_44, CLK_48); rate = CLK_48; } @@ -2687,7 +2689,7 @@ static int rt5682_wclk_set_rate(struct clk_hw *hw, unsigned long rate, struct rt5682_priv *rt5682 = container_of(hw, struct rt5682_priv, dai_clks_hw[RT5682_DAI_WCLK_IDX]); - struct snd_soc_component *component = rt5682->component; + struct snd_soc_component *component; struct clk_hw *parent_hw; const char * const clk_name = clk_hw_get_name(hw); int pre_div; @@ -2696,6 +2698,8 @@ static int rt5682_wclk_set_rate(struct clk_hw *hw, unsigned long rate, if (!rt5682_clk_check(rt5682)) return -EINVAL;
+ component = rt5682->component; + /* * Whether the wclk's parent clk (mclk) exists or not, please ensure * it is fixed or set to 48MHz before setting wclk rate. It's a @@ -2705,12 +2709,12 @@ static int rt5682_wclk_set_rate(struct clk_hw *hw, unsigned long rate, */ parent_hw = clk_hw_get_parent(hw); if (!parent_hw) - dev_warn(component->dev, + dev_warn(rt5682->i2c_dev, "Parent mclk of wclk not acquired in driver. Please ensure mclk was provided as %d Hz.\n", CLK_PLL2_FIN);
if (parent_rate != CLK_PLL2_FIN) - dev_warn(component->dev, "clk %s only support %d Hz input\n", + dev_warn(rt5682->i2c_dev, "clk %s only support %d Hz input\n", clk_name, CLK_PLL2_FIN);
/* @@ -2742,10 +2746,9 @@ static unsigned long rt5682_bclk_recalc_rate(struct clk_hw *hw, struct rt5682_priv *rt5682 = container_of(hw, struct rt5682_priv, dai_clks_hw[RT5682_DAI_BCLK_IDX]); - struct snd_soc_component *component = rt5682->component; unsigned int bclks_per_wclk;
- bclks_per_wclk = snd_soc_component_read(component, RT5682_TDM_TCON_CTRL); + regmap_read(rt5682->regmap, RT5682_TDM_TCON_CTRL, &bclks_per_wclk);
switch (bclks_per_wclk & RT5682_TDM_BCLK_MS1_MASK) { case RT5682_TDM_BCLK_MS1_256: @@ -2806,20 +2809,22 @@ static int rt5682_bclk_set_rate(struct clk_hw *hw, unsigned long rate, struct rt5682_priv *rt5682 = container_of(hw, struct rt5682_priv, dai_clks_hw[RT5682_DAI_BCLK_IDX]); - struct snd_soc_component *component = rt5682->component; + struct snd_soc_component *component; struct snd_soc_dai *dai; unsigned long factor;
if (!rt5682_clk_check(rt5682)) return -EINVAL;
+ component = rt5682->component; + factor = rt5682_bclk_get_factor(rate, parent_rate);
for_each_component_dais(component, dai) if (dai->id == RT5682_AIF1) break; if (!dai) { - dev_err(component->dev, "dai %d not found in component\n", + dev_err(rt5682->i2c_dev, "dai %d not found in component\n", RT5682_AIF1); return -ENODEV; } @@ -2842,10 +2847,9 @@ static const struct clk_ops rt5682_dai_clk_ops[RT5682_DAI_NUM_CLKS] = { }, };
-static int rt5682_register_dai_clks(struct snd_soc_component *component) +int rt5682_register_dai_clks(struct rt5682_priv *rt5682) { - struct device *dev = component->dev; - struct rt5682_priv *rt5682 = snd_soc_component_get_drvdata(component); + struct device *dev = rt5682->i2c_dev; struct rt5682_platform_data *pdata = &rt5682->pdata; struct clk_hw *dai_clk_hw; int i, ret; @@ -2905,6 +2909,7 @@ static int rt5682_register_dai_clks(struct snd_soc_component *component)
return 0; } +EXPORT_SYMBOL_GPL(rt5682_register_dai_clks); #endif /* CONFIG_COMMON_CLK */
static int rt5682_probe(struct snd_soc_component *component) @@ -2914,9 +2919,6 @@ static int rt5682_probe(struct snd_soc_component *component) unsigned long time; struct snd_soc_dapm_context *dapm = &component->dapm;
-#ifdef CONFIG_COMMON_CLK - int ret; -#endif rt5682->component = component;
if (rt5682->is_sdw) { @@ -2928,26 +2930,6 @@ static int rt5682_probe(struct snd_soc_component *component) dev_err(&slave->dev, "Initialization not complete, timed out\n"); return -ETIMEDOUT; } - } else { -#ifdef CONFIG_COMMON_CLK - /* Check if MCLK provided */ - rt5682->mclk = devm_clk_get(component->dev, "mclk"); - if (IS_ERR(rt5682->mclk)) { - if (PTR_ERR(rt5682->mclk) != -ENOENT) { - ret = PTR_ERR(rt5682->mclk); - return ret; - } - rt5682->mclk = NULL; - } - - /* Register CCF DAI clock control */ - ret = rt5682_register_dai_clks(component); - if (ret) - return ret; - - /* Initial setup for CCF */ - rt5682->lrck[RT5682_AIF1] = CLK_48; -#endif }
snd_soc_dapm_disable_pin(dapm, "MICBIAS"); diff --git a/sound/soc/codecs/rt5682.h b/sound/soc/codecs/rt5682.h index 539a9fe26294..52ff0d9c36c5 100644 --- a/sound/soc/codecs/rt5682.h +++ b/sound/soc/codecs/rt5682.h @@ -1428,6 +1428,7 @@ enum {
struct rt5682_priv { struct snd_soc_component *component; + struct device *i2c_dev; struct rt5682_platform_data pdata; struct regmap *regmap; struct regmap *sdw_regmap; @@ -1481,6 +1482,8 @@ void rt5682_calibrate(struct rt5682_priv *rt5682); void rt5682_reset(struct rt5682_priv *rt5682); int rt5682_parse_dt(struct rt5682_priv *rt5682, struct device *dev);
+int rt5682_register_dai_clks(struct rt5682_priv *rt5682); + #define RT5682_REG_NUM 318 extern const struct reg_default rt5682_reg[RT5682_REG_NUM];
From: Xiaomeng Tong xiam0nd.tong@gmail.com
[ Upstream commit c8618d65007ba68d7891130642d73e89372101e8 ]
The bug is here: if (!dai) {
The list iterator value 'dai' will *always* be set and non-NULL by for_each_component_dais(), so it is incorrect to assume that the iterator value will be NULL if the list is empty or no element is found (In fact, it will be a bogus pointer to an invalid struct object containing the HEAD). Otherwise it will bypass the check 'if (!dai) {' (never call dev_err() and never return -ENODEV;) and lead to invalid memory access lately when calling 'rt5682_set_bclk1_ratio(dai, factor);'.
To fix the bug, just return rt5682_set_bclk1_ratio(dai, factor); when found the 'dai', otherwise dev_err() and return -ENODEV;
Cc: stable@vger.kernel.org Fixes: ebbfabc16d23d ("ASoC: rt5682: Add CCF usage for providing I2S clks") Signed-off-by: Xiaomeng Tong xiam0nd.tong@gmail.com Link: https://lore.kernel.org/r/20220327081002.12684-1-xiam0nd.tong@gmail.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/rt5682.c | 11 ++++------- 1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/sound/soc/codecs/rt5682.c b/sound/soc/codecs/rt5682.c index 80d199843b8c..8a9e1a4fa03e 100644 --- a/sound/soc/codecs/rt5682.c +++ b/sound/soc/codecs/rt5682.c @@ -2822,14 +2822,11 @@ static int rt5682_bclk_set_rate(struct clk_hw *hw, unsigned long rate,
for_each_component_dais(component, dai) if (dai->id == RT5682_AIF1) - break; - if (!dai) { - dev_err(rt5682->i2c_dev, "dai %d not found in component\n", - RT5682_AIF1); - return -ENODEV; - } + return rt5682_set_bclk1_ratio(dai, factor);
- return rt5682_set_bclk1_ratio(dai, factor); + dev_err(rt5682->i2c_dev, "dai %d not found in component\n", + RT5682_AIF1); + return -ENODEV; }
static const struct clk_ops rt5682_dai_clk_ops[RT5682_DAI_NUM_CLKS] = {
From: tiancyin tianci.yin@amd.com
[ Upstream commit 425d7a87e54ee358f580eaf10cf28dc95f7121c1 ]
Some video card has more than one vcn instance, passing 0 to vcn_v3_0_pause_dpg_mode is incorrect.
Error msg: Register(1) [mmUVD_POWER_STATUS] failed to reach value 0x00000001 != 0x00000002
Reviewed-by: James Zhu James.Zhu@amd.com Signed-off-by: tiancyin tianci.yin@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c index 6e56bef4fdf8..1310617f030f 100644 --- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c @@ -1511,7 +1511,7 @@ static int vcn_v3_0_stop_dpg_mode(struct amdgpu_device *adev, int inst_idx) struct dpg_pause_state state = {.fw_based = VCN_DPG_STATE__UNPAUSE}; uint32_t tmp;
- vcn_v3_0_pause_dpg_mode(adev, 0, &state); + vcn_v3_0_pause_dpg_mode(adev, inst_idx, &state);
/* Wait for power status to be 1 */ SOC15_WAIT_ON_RREG(VCN, inst_idx, mmUVD_POWER_STATUS, 1,
From: Oliver Upton oupton@google.com
[ Upstream commit a44a4cc1c969afec97dbb2aedaf6f38eaa6253bb ]
Unfortunately, there is no guarantee that KVM was able to instantiate a debugfs directory for a particular VM. To that end, KVM shouldn't even attempt to create new debugfs files in this case. If the specified parent dentry is NULL, debugfs_create_file() will instantiate files at the root of debugfs.
For arm64, it is possible to create the vgic-state file outside of a VM directory, the file is not cleaned up when a VM is destroyed. Nonetheless, the corresponding struct kvm is freed when the VM is destroyed.
Nip the problem in the bud for all possible errant debugfs file creations by initializing kvm->debugfs_dentry to -ENOENT. In so doing, debugfs_create_file() will fail instead of creating the file in the root directory.
Cc: stable@kernel.org Fixes: 929f45e32499 ("kvm: no need to check return value of debugfs_create functions") Signed-off-by: Oliver Upton oupton@google.com Signed-off-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20220406235615.1447180-2-oupton@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- virt/kvm/kvm_main.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 99c591569815..9134ae252d7c 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -911,7 +911,7 @@ static void kvm_destroy_vm_debugfs(struct kvm *kvm) int kvm_debugfs_num_entries = kvm_vm_stats_header.num_desc + kvm_vcpu_stats_header.num_desc;
- if (!kvm->debugfs_dentry) + if (IS_ERR(kvm->debugfs_dentry)) return;
debugfs_remove_recursive(kvm->debugfs_dentry); @@ -934,6 +934,12 @@ static int kvm_create_vm_debugfs(struct kvm *kvm, int fd) int kvm_debugfs_num_entries = kvm_vm_stats_header.num_desc + kvm_vcpu_stats_header.num_desc;
+ /* + * Force subsequent debugfs file creations to fail if the VM directory + * is not created. + */ + kvm->debugfs_dentry = ERR_PTR(-ENOENT); + if (!debugfs_initialized()) return 0;
@@ -5373,7 +5379,7 @@ static void kvm_uevent_notify_change(unsigned int type, struct kvm *kvm) } add_uevent_var(env, "PID=%d", kvm->userspace_pid);
- if (kvm->debugfs_dentry) { + if (!IS_ERR(kvm->debugfs_dentry)) { char *tmp, *p = kmalloc(PATH_MAX, GFP_KERNEL_ACCOUNT);
if (p) {
From: Zhenguo Zhao Zhenguo.Zhao1@unisoc.com
[ Upstream commit cc0f42122a7e7a5ede9c5f2a41199128b8449eda ]
When n_gsm config "initiator=0",as requester,gsmld receives dlci SABM/DISC control command frame,but send UA frame is error.
Example: Gsmld receive dlc0 SABM frame "f9 03 3f 01 1c f9",now it sends UA frame "f9 01 63 01 a3 f9",CR and PF bit are 0,but it should be set 1 from requster to initiator.
Kernel test log as follows:
Before modify
[ 271.732031] c1 gsmld_receive: 00000000: f9 03 3f 01 1c f9 [ 271.741719] c1 <-- 0) C: SABM(P) [ 271.749483] c1 gsmld_output: 00000000: f9 01 63 01 a3 f9 [ 271.758337] c1 --> 0) R: UA(F)
After modify
[ 261.233188] c0 gsmld_receive: 00000000: f9 03 3f 01 1c f9 [ 261.242767] c0 <-- 0) C: SABM(P) [ 261.250497] c0 gsmld_output: 00000000: f9 03 73 01 d7 f9 [ 261.259759] c0 --> 0) C: UA(P)
Signed-off-by: Zhenguo Zhao Zhenguo.Zhao1@unisoc.com Link: https://lore.kernel.org/r/1629461872-26965-3-git-send-email-zhenguo6858@gmai... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/tty/n_gsm.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c index 6734ef22c304..91ce8e6e889a 100644 --- a/drivers/tty/n_gsm.c +++ b/drivers/tty/n_gsm.c @@ -625,7 +625,7 @@ static void gsm_send(struct gsm_mux *gsm, int addr, int cr, int control)
static inline void gsm_response(struct gsm_mux *gsm, int addr, int control) { - gsm_send(gsm, addr, 0, control); + gsm_send(gsm, addr, 1, control); }
/** @@ -1818,9 +1818,9 @@ static void gsm_queue(struct gsm_mux *gsm) if (dlci == NULL) return; if (dlci->dead) - gsm_response(gsm, address, DM); + gsm_response(gsm, address, DM|PF); else { - gsm_response(gsm, address, UA); + gsm_response(gsm, address, UA|PF); gsm_dlci_open(dlci); } break; @@ -1828,11 +1828,11 @@ static void gsm_queue(struct gsm_mux *gsm) if (cr == 0) goto invalid; if (dlci == NULL || dlci->state == DLCI_CLOSED) { - gsm_response(gsm, address, DM); + gsm_response(gsm, address, DM|PF); return; } /* Real close complete */ - gsm_response(gsm, address, UA); + gsm_response(gsm, address, UA|PF); gsm_dlci_close(dlci); break; case UA|PF:
From: Zhenguo Zhao Zhenguo.Zhao1@unisoc.com
[ Upstream commit 0b91b5332368f2fb0c3e5cfebc6aff9e167acd8b ]
When n_gsm config "initiator=0",as requester ,receive SABM frame,n_gsm register gsmtty dev,and save dlci open address status,if receive DLC0 DISC or CLD frame,it can unregister the gsmtty dev by saving dlci address.
Signed-off-by: Zhenguo Zhao Zhenguo.Zhao1@unisoc.com Link: https://lore.kernel.org/r/1629461872-26965-8-git-send-email-zhenguo6858@gmai... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/tty/n_gsm.c | 57 +++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 53 insertions(+), 4 deletions(-)
diff --git a/drivers/tty/n_gsm.c b/drivers/tty/n_gsm.c index 91ce8e6e889a..3038e5631be5 100644 --- a/drivers/tty/n_gsm.c +++ b/drivers/tty/n_gsm.c @@ -274,6 +274,10 @@ static DEFINE_SPINLOCK(gsm_mux_lock);
static struct tty_driver *gsm_tty_driver;
+/* Save dlci open address */ +static int addr_open[256] = { 0 }; +/* Save dlci open count */ +static int addr_cnt; /* * This section of the driver logic implements the GSM encodings * both the basic and the 'advanced'. Reliable transport is not @@ -1191,6 +1195,7 @@ static void gsm_control_rls(struct gsm_mux *gsm, const u8 *data, int clen) }
static void gsm_dlci_begin_close(struct gsm_dlci *dlci); +static void gsm_dlci_close(struct gsm_dlci *dlci);
/** * gsm_control_message - DLCI 0 control processing @@ -1209,15 +1214,28 @@ static void gsm_control_message(struct gsm_mux *gsm, unsigned int command, { u8 buf[1]; unsigned long flags; + struct gsm_dlci *dlci; + int i; + int address;
switch (command) { case CMD_CLD: { - struct gsm_dlci *dlci = gsm->dlci[0]; + if (addr_cnt > 0) { + for (i = 0; i < addr_cnt; i++) { + address = addr_open[i]; + dlci = gsm->dlci[address]; + gsm_dlci_close(dlci); + addr_open[i] = 0; + } + } /* Modem wishes to close down */ + dlci = gsm->dlci[0]; if (dlci) { dlci->dead = true; gsm->dead = true; - gsm_dlci_begin_close(dlci); + gsm_dlci_close(dlci); + addr_cnt = 0; + gsm_response(gsm, 0, UA|PF); } } break; @@ -1780,6 +1798,7 @@ static void gsm_queue(struct gsm_mux *gsm) struct gsm_dlci *dlci; u8 cr; int address; + int i, j, k, address_tmp; /* We have to sneak a look at the packet body to do the FCS. A somewhat layering violation in the spec */
@@ -1822,6 +1841,11 @@ static void gsm_queue(struct gsm_mux *gsm) else { gsm_response(gsm, address, UA|PF); gsm_dlci_open(dlci); + /* Save dlci open address */ + if (address) { + addr_open[addr_cnt] = address; + addr_cnt++; + } } break; case DISC|PF: @@ -1832,8 +1856,33 @@ static void gsm_queue(struct gsm_mux *gsm) return; } /* Real close complete */ - gsm_response(gsm, address, UA|PF); - gsm_dlci_close(dlci); + if (!address) { + if (addr_cnt > 0) { + for (i = 0; i < addr_cnt; i++) { + address = addr_open[i]; + dlci = gsm->dlci[address]; + gsm_dlci_close(dlci); + addr_open[i] = 0; + } + } + dlci = gsm->dlci[0]; + gsm_dlci_close(dlci); + addr_cnt = 0; + gsm_response(gsm, 0, UA|PF); + } else { + gsm_response(gsm, address, UA|PF); + gsm_dlci_close(dlci); + /* clear dlci address */ + for (j = 0; j < addr_cnt; j++) { + address_tmp = addr_open[j]; + if (address_tmp == address) { + for (k = j; k < addr_cnt; k++) + addr_open[k] = addr_open[k+1]; + addr_cnt--; + break; + } + } + } break; case UA|PF: if (cr == 0 || dlci == NULL)
From: Daniel Starke daniel.starke@siemens.com
[ Upstream commit 7a0e4b1733b635026a87c023f6d703faf0095e39 ]
The frame checksum (FCS) is currently handled in gsm_queue() after reception of a frame. However, this breaks layering. A workaround with 'received_fcs' was implemented so far. Furthermore, frames are handled as such even if no end flag was received. Move FCS calculation from gsm_queue() to gsm0_receive() and gsm1_receive(). Also delay gsm_queue() call there until a full frame was received to fix both points.
Fixes: e1eaea46bb40 ("tty: n_gsm line discipline") Cc: stable@vger.kernel.org Signed-off-by: Daniel Starke