This is the start of the stable review cycle for the 6.2.10 release. There are 187 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 05 Apr 2023 14:03:18 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.2.10-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.2.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 6.2.10-rc1
Jan Beulich jbeulich@suse.com x86/PVH: avoid 32-bit build warning when obtaining VGA console info
Matthieu Baerts matthieu.baerts@tessares.net hsr: ratelimit only when errors are printed
Xiaogang Chen xiaogang.chen@amd.com drm/amdkfd: Get prange->offset after svm_range_vram_node_new
Andrii Nakryiko andrii@kernel.org libbpf: Fix btf_dump's packed struct determination
Andrii Nakryiko andrii@kernel.org selftests/bpf: Add few corner cases to test padding handling of btf_dump
Andrii Nakryiko andrii@kernel.org libbpf: Fix BTF-to-C converter's padding logic
Hans de Goede hdegoede@redhat.com usb: ucsi: Fix ucsi->connector race
Marc Zyngier maz@kernel.org KVM: arm64: Check for kvm_vma_mte_allowed in the critical section
Marc Zyngier maz@kernel.org KVM: arm64: Disable interrupts while walking userspace PTs
David Matlack dmatlack@google.com KVM: arm64: Retry fault if vma_lookup() results become invalid
Reiji Watanabe reijiw@google.com KVM: arm64: PMU: Don't save PMCR_EL0.{C,P} for the vCPU
Reiji Watanabe reijiw@google.com KVM: arm64: PMU: Fix GET_ONE_REG for vPMC regs to return the current value
Ville Syrjälä ville.syrjala@linux.intel.com drm/i915: Move CSC load back into .color_commit_arm() when PSR is enabled on skl/glk
Ville Syrjälä ville.syrjala@linux.intel.com drm/i915: Split icl_color_commit_noarm() from skl_color_commit_noarm()
Ville Syrjälä ville.syrjala@linux.intel.com drm/i915: Disable DC states for all commits
Ville Syrjälä ville.syrjala@linux.intel.com drm/i915/dpt: Treat the DPT BO as a framebuffer
Chris Wilson chris.p.wilson@linux.intel.com drm/i915/gem: Flush lmem contents after construction
Fangzhi Zuo Jerry.Zuo@amd.com drm/amd/display: Take FEC Overhead into Timeslot Calculation
Fangzhi Zuo Jerry.Zuo@amd.com drm/amd/display: Add DSC Support for Synaptics Cascaded MST Hub
Tim Huang tim.huang@amd.com drm/amdgpu: allow more APUs to do mode2 reset when go to S4
Lucas Stach l.stach@pengutronix.de drm/etnaviv: fix reference leak when mmaping imported buffer
Jiri Slaby (SUSE) jirislaby@kernel.org s390: reintroduce expoline dependence to scripts
Heiko Carstens hca@linux.ibm.com s390/uaccess: add missing earlyclobber annotations to __clear_user()
Geert Uytterhoeven geert+renesas@glider.be dt-bindings: mtd: jedec,spi-nor: Document CPOL/CPHA support
Douglas Raillard douglas.raillard@arm.com rcu: Fix rcu_torture_read ftrace event
Max Filippov jcmvbkbc@gmail.com xtensa: fix KASAN report for show_stack
huangwenhui huangwenhuia@uniontech.com ALSA: hda/realtek: Add quirk for Lenovo ZhaoYang CF4620Z
Tim Crawford tcrawford@system76.com ALSA: hda/realtek: Add quirks for some Clevo laptops
Takashi Iwai tiwai@suse.de ALSA: usb-audio: Fix regression on detection of Roland VS-100
Takashi Iwai tiwai@suse.de ALSA: hda/conexant: Partial revert of a quirk for Lenovo
Trond Myklebust trond.myklebust@hammerspace.com NFSv4: Fix hangs when recovering open state after a server reboot
Benjamin Gray bgray@linux.ibm.com powerpc/64s: Fix __pte_needs_flush() false positive warning
Haren Myneni haren@linux.ibm.com powerpc/pseries/vas: Ignore VAS update for DLPAR if copy/paste is not enabled
Jens Axboe axboe@kernel.dk powerpc: Don't try to copy PPR for task with NULL pt_regs
Srinivas Pandruvada srinivas.pandruvada@linux.intel.com thermal: intel: int340x: processor_thermal: Fix additional deadlock
Hans de Goede hdegoede@redhat.com platform/x86: ideapad-laptop: Stop sending KEY_TOUCHPAD_TOGGLE
Johan Hovold johan+linaro@kernel.org pinctrl: at91-pio4: fix domain name assignment
Kornel Dulęba korneld@chromium.org pinctrl: amd: Disable and mask interrupts on resume
Ben Hutchings ben@decadent.org.uk modpost: Fix processing of CRCs on 32-bit build machines
Josua Mayer josua@solid-run.com net: phy: dp83869: fix default value for tx-/rx-internal-delay
Juergen Gross jgross@suse.com xen/netback: don't do grant copy across page boundary
Oleksij Rempel linux@rempel-privat.de can: j1939: prevent deadlock by moving j1939_sk_errqueue()
Mike Snitzer snitzer@kernel.org dm: fix __send_duplicate_bios() to always allow for splitting IO
Damien Le Moal damien.lemoal@opensource.wdc.com zonefs: Always invalidate last cached page on append write
Ronak Doshi doshir@vmware.com vmxnet3: use gro callback when UPT is enabled
Pavel Begunkov asml.silence@gmail.com io_uring: fix poll/netmsg alloc caches
Pavel Begunkov asml.silence@gmail.com io_uring/rsrc: fix rogue rsrc node grabbing
Jens Axboe axboe@kernel.dk io_uring/poll: clear single/double poll flags on poll arming
Damien Le Moal damien.lemoal@opensource.wdc.com zonefs: Do not propagate iomap_dio_rw() ENOTBLK error to user space
Filipe Manana fdmanana@suse.com btrfs: ignore fiemap path cache when there are multiple paths for a node
Anand Jain anand.jain@oracle.com btrfs: scan device in non-exclusive mode
Filipe Manana fdmanana@suse.com btrfs: fix race between quota disable and quota assign ioctls
Filipe Manana fdmanana@suse.com btrfs: fix deadlock when aborting transaction during relocation with scrub
Hans de Goede hdegoede@redhat.com Input: goodix - add Lenovo Yoga Book X90F to nine_bytes_report DMI table
Jonathan Denose jdenose@chromium.org Input: i8042 - add quirk for Fujitsu Lifebook A574/H
David Disseldorp ddiss@suse.de cifs: fix DFS traversal oops without CONFIG_CIFS_DFS_UPCALL
Paulo Alcantara pc@manguebit.com cifs: prevent infinite recursion in CIFSGetDFSRefer()
Jason A. Donenfeld Jason@zx2c4.com Input: focaltech - use explicitly signed char type
msizanoen msizanoen@qtmlabs.xyz Input: alps - fix compatibility with -funsigned-char
Werner Sembach wse@tuxedocomputers.com Input: i8042 - add TUXEDO devices to i8042 quirk tables for partial fix
Javier Martinez Canillas javierm@redhat.com Revert "venus: firmware: Correct non-pix start and end addresses"
Lu Baolu baolu.lu@linux.intel.com iommu/vt-d: Allow zero SAGAW if second-stage not supported
Matthias Benkmann matthias.benkmann@gmail.com Input: xpad - fix incorrectly applied patch for MAP_PROFILE_BUTTON
Horatiu Vultur horatiu.vultur@microchip.com pinctrl: ocelot: Fix alt mode for ocelot
Felix Fietkau nbd@nbd.name net: ethernet: mtk_eth_soc: add missing ppe cache flush when deleting a flow
Felix Fietkau nbd@nbd.name net: ethernet: mtk_eth_soc: fix L2 offloading with DSA untag offload
Felix Fietkau nbd@nbd.name net: ethernet: mtk_eth_soc: fix flow block refcounting logic
Vladimir Oltean vladimir.oltean@nxp.com net: dsa: sync unicast and multicast addresses for VLAN filters too
Steffen Bätz steffen@innosonix.de net: dsa: mv88e6xxx: Enable IGMP snooping on user ports only
Michael Chan michael.chan@broadcom.com bnxt_en: Add missing 200G link speed reporting
Kalesh AP kalesh-anakkur.purayil@broadcom.com bnxt_en: Fix typo in PCI id to device description string mapping
Kalesh AP kalesh-anakkur.purayil@broadcom.com bnxt_en: Fix reporting of test result in ethtool selftest
Radoslaw Tyl radoslawx.tyl@intel.com i40e: fix registers dump after run ethtool adapter self test
Jakub Kicinski kuba@kernel.org bnx2x: use the right build_skb() helper
Alex Elder elder@linaro.org net: ipa: compute DMA pool size properly
Hans de Goede hdegoede@redhat.com drm/nouveau/kms: Fix backlight registration
M Chetan Kumar m.chetan.kumar@linux.intel.com net: wwan: iosm: fixes 7560 modem crash
Tasos Sahanidis tasos@tasossah.com ALSA: ymfpci: Fix BUG_ON in probe function
Tasos Sahanidis tasos@tasossah.com ALSA: ymfpci: Create card with device-managed snd_devm_card_new()
Felix Fietkau nbd@nbd.name net: ethernet: mtk_eth_soc: fix tx throughput regression with direct 1G links
Jakob Koschel jkl820.git@gmail.com ice: fix invalid check for empty list in ice_sched_assoc_vsi_to_agg()
Junfeng Guo junfeng.guo@intel.com ice: add profile conflict check for AVF FDIR
Brett Creeley brett.creeley@intel.com ice: Fix ice_cfg_rdma_fltr() to only update relevant fields
Wolfram Sang wsa+renesas@sang-engineering.com smsc911x: avoid PHY being resumed when interface is not up
Sven Auhagen sven.auhagen@voleatech.de net: mvpp2: parser fix PPPoE
Sven Auhagen sven.auhagen@voleatech.de net: mvpp2: parser fix QinQ
Sven Auhagen sven.auhagen@voleatech.de net: mvpp2: classifier flow fix fragmentation flags
Alyssa Ross hi@alyssa.is loop: LOOP_CONFIGURE: send uevents for partitions
Rafael J. Wysocki rafael.j.wysocki@intel.com ACPI: bus: Rework system-level device notification handling
Tony Krowiak akrowiak@linux.ibm.com s390/vfio-ap: fix memory leak in vfio_ap device driver
Ivan Orlov ivan.orlov0322@gmail.com can: bcm: bcm_tx_setup(): fix KMSAN uninit-value in vfs_write
Rajvi Jingar rajvi.jingar@linux.intel.com platform/x86/intel/pmc: Alder Lake PCH slp_s0_residency fix
Chris Wilson chris.p.wilson@linux.intel.com drm/i915/perf: Drop wakeref on GuC RC error
Imre Deak imre.deak@intel.com drm/i915/tc: Fix the ICL PHY ownership check in TC-cold state
Ashutosh Dixit ashutosh.dixit@intel.com drm/i915/pmu: Use functions common with sysfs to read actual freq
Vladimir Oltean vladimir.oltean@nxp.com net: stmmac: don't reject VLANs when IFF_PROMISC is set
Faicker Mo faicker.mo@ucloud.cn net/net_failover: fix txq exceeding warning
Christophe JAILLET christophe.jaillet@wanadoo.fr regulator: Handle deferred clk
ChunHao Lin hau@realtek.com r8169: fix RTL8168H and RTL8107E rx crc error
Oleksij Rempel linux@rempel-privat.de net: dsa: microchip: ksz8: fix MDB configuration with non-zero VID
Oleksij Rempel linux@rempel-privat.de net: dsa: microchip: ksz8863_smi: fix bulk access
Oleksij Rempel linux@rempel-privat.de net: dsa: microchip: ksz8: ksz8_fdb_dump: avoid extracting ghost entry from empty dynamic MAC table.
Oleksij Rempel linux@rempel-privat.de net: dsa: microchip: ksz8: fix offset for the timestamp filed
Oleksij Rempel linux@rempel-privat.de net: dsa: microchip: ksz8: fix ksz8_fdb_dump() to extract all 1024 entries
Oleksij Rempel linux@rempel-privat.de net: dsa: microchip: ksz8: fix ksz8_fdb_dump()
SongJingyi u201912584@hust.edu.cn ptp_qoriq: fix memory leak in probe()
Ahmad Fatoum a.fatoum@pengutronix.de net: dsa: realtek: fix out-of-bounds access
Jerry Snitselaar jsnitsel@redhat.com scsi: mpt3sas: Don't print sense pool info twice
Tomas Henzl thenzl@redhat.com scsi: megaraid_sas: Fix crash after a double completion
Íñigo Huguet ihuguet@redhat.com sfc: ef10: don't overwrite offload features at NIC reset
Siddharth Kawar Siddharth.Kawar@microsoft.com SUNRPC: fix shutdown of NFS TCP client socket
Arseniy Krasnov avkrasnov@sberdevices.ru mtd: rawnand: meson: invalidate cache on polling ECC bit
Liang He windhl@126.com platform/surface: aggregator: Add missing fwnode_handle_put()
Mark Pearson mpearson-lenovo@squebb.ca platform/x86: think-lmi: Add possible_values for ThinkStation
Mark Pearson mpearson-lenovo@squebb.ca platform/x86: think-lmi: only display possible_values if available
Mark Pearson mpearson-lenovo@squebb.ca platform/x86: think-lmi: use correct possible_values delimiters
Mark Pearson mpearson-lenovo@squebb.ca platform/x86: think-lmi: add missing type attribute
Petr Tesarik petr.tesarik.ext@huawei.com swiotlb: fix slot alignment checks
Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com PCI: dwc: Fix PORT_LINK_CONTROL update when CDM check enabled
Takashi Iwai tiwai@suse.de ALSA: usb-audio: Fix recursive locking at XRUN during syncing
Álvaro Fernández Rojas noltari@gmail.com mips: bmips: BCM6358: disable RAC flush for TP1
Rajnesh Kanwal rkanwal@rivosinc.com riscv/kvm: Fix VM hang in case of timer delta being zero.
Harshit Mogalapalli harshit.m.mogalapalli@oracle.com ca8210: Fix unsigned mac_len comparison with zero in ca8210_skb_tx()
GuoRui.Yu GuoRui.Yu@linux.alibaba.com swiotlb: fix the deadlock in swiotlb_do_find_slots
Christophe JAILLET christophe.jaillet@wanadoo.fr mtd: nand: mxic-ecc: Fix mxic_ecc_data_xfer_wait_for_completion() when irq is used
Arseniy Krasnov AVKrasnov@sberdevices.ru mtd: rawnand: meson: initialize struct with zeroes
Josef Bacik josef@toxicpanda.com btrfs: use temporary variable for space_info in btrfs_update_block_group
Josef Bacik josef@toxicpanda.com btrfs: fix uninitialized variable warning in btrfs_update_block_group
Anton Gusev aagusev@ispras.ru tracing: Fix wrong return in kprobe_event_gen_test.c
Antti Laakso antti.laakso@intel.com tools/power turbostat: fix decoding of HWP_STATUS
Prarit Bhargava prarit@redhat.com tools/power turbostat: Fix /dev/cpu_dma_latency warnings
Wei Chen harperchen1110@gmail.com fbdev: au1200fb: Fix potential divide by zero
Wei Chen harperchen1110@gmail.com fbdev: lxfb: Fix potential divide by zero
Wei Chen harperchen1110@gmail.com fbdev: intelfb: Fix potential divide by zero
Wei Chen harperchen1110@gmail.com fbdev: nvidia: Fix potential divide by zero
Adham Faris afaris@nvidia.com net/mlx5e: Lower maximum allowed MTU in XSK to match XDP prerequisites
David Belanger david.belanger@amd.com drm/amdkfd: Fixed kfd_process cleanup on module exit.
Philipp Geulen p.geulen@js-elektronik.de nvme-pci: add NVME_QUIRK_BOGUS_NID for Lexar NM620
Irvin Cote irvincoteg@gmail.com nvme-pci: fixing memory leak in probe teardown path
Linus Torvalds torvalds@linux-foundation.org sched_getaffinity: don't assume 'cpumask_size()' is fully initialized
Chen Yu yu.c.chen@intel.com ACPI: tools: pfrut: Check if the input of level and type is in the right numeric range
Wei Chen harperchen1110@gmail.com fbdev: tgafb: Fix potential divide by zero
Kuninori Morimoto kuninori.morimoto.gx@renesas.com ALSA: hda/ca0132: fixup buffer overrun at tuning_ctl_set()
Kuninori Morimoto kuninori.morimoto.gx@renesas.com ALSA: asihpi: check pao in control_message()
Kristian Overskeid koverskeid@gmail.com net: hsr: Don't log netdev_err message on unknown prp dst node
Bhawanpreet Lakha Bhawanpreet.Lakha@amd.com drm/amd/display: Fix HDCP failing to enable after suspend
Chia-I Wu olvaffe@gmail.com drm/amdkfd: fix potential kgd_mem UAFs
Jane Jian Jane.Jian@amd.com drm/amdgpu/vcn: custom video info caps for sriov
Chia-I Wu olvaffe@gmail.com drm/amdkfd: fix a potential double free in pqm_create_queue
Xiaogang Chen Xiaogang.Chen@amd.com drm/amdkfd: Fix BO offset for multi-VMA page migration
Jan Beulich jbeulich@suse.com x86/PVH: obtain VGA console info in Dom0
NeilBrown neilb@suse.de md: avoid signed overflow in slot_store()
Johannes Berg johannes.berg@intel.com wifi: mac80211: check basic rates validity
Emil Abildgaard Svendsen EMAS@bang-olufsen.dk ASoC: hdmi-codec: only startup/shutdown on supported streams
Rander Wang rander.wang@intel.com ASoC: SOF: IPC4: update gain ipc msg definition to align with fw
Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com ASoC: SOF: Intel: hda-ctrl: re-add sleep after entering and exiting reset
Rander Wang rander.wang@intel.com ASoC: SOF: Intel: hda-dsp: harden D0i3 programming sequence
Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com ASoC: SOF: Intel: pci-tng: revert invalid bar size setting
Seppo Ingalsuo seppo.ingalsuo@linux.intel.com ASoC: SOF: ipc4-topology: Fix incorrect sample rate print unit
Peter Ujfalusi peter.ujfalusi@linux.intel.com ASoC: SOF: ipc3: Check for upper size limit for the received message
Hans de Goede hdegoede@redhat.com ACPI: x86: Add skip i2c clients quirk for Lenovo Yoga Book X90
Hans de Goede hdegoede@redhat.com ACPI: x86: Add skip i2c clients quirk for Acer Iconia One 7 B1-750
Hans de Goede hdegoede@redhat.com ACPI: x86: Introduce an acpi_quirk_skip_gpio_event_handlers() helper
Chia-Lin Kao (AceLan) acelan.kao@canonical.com ACPI: video: Add backlight=native DMI quirk for Dell Vostro 15 3535
Jonathan Neuschäfer j.neuschaefer@gmx.net zstd: Fix definition of assert()
Nick Terrell terrelln@fb.com lib: zstd: Backport fix for in-place decompression
Cezary Rojewski cezary.rojewski@intel.com ASoC: Intel: avs: nau8825: Adjust clock control
Cezary Rojewski cezary.rojewski@intel.com ASoC: Intel: avs: ssm4567: Remove nau8825 bits
Amadeusz Sławiński amadeuszx.slawinski@linux.intel.com ASoC: Intel: avs: rt5682: Explicitly define codec format
Amadeusz Sławiński amadeuszx.slawinski@linux.intel.com ASoC: Intel: avs: da7219: Explicitly define codec format
Amadeusz Sławiński amadeuszx.slawinski@linux.intel.com ASoC: Intel: avs: max98357a: Explicitly define codec format
Ravulapati Vishnu Vardhan Rao quic_visr@quicinc.com ASoC: codecs: tx-macro: Fix for KASAN: slab-out-of-bounds
Herbert Xu herbert@gondor.apana.org.au xfrm: Zero padding when dumping algos and encap
Paulo Alcantara pc@manguebit.com cifs: fix missing unload_nls() in smb2_reconnect()
Eric Biggers ebiggers@google.com fsverity: don't drop pagecache at end of FS_IOC_ENABLE_VERITY
Naohiro Aota naohiro.aota@wdc.com btrfs: zoned: drop space_info->active_total_bytes
Naohiro Aota naohiro.aota@wdc.com btrfs: zoned: count fresh BG region as zone unusable
Josef Bacik josef@toxicpanda.com btrfs: rename BTRFS_FS_NO_OVERCOMMIT to BTRFS_FS_ACTIVE_ZONE_TRACKING
Damien Le Moal damien.lemoal@opensource.wdc.com zonefs: Fix error message in zonefs_file_dio_append()
Damien Le Moal damien.lemoal@opensource.wdc.com zonefs: Separate zone information from inode information
Damien Le Moal damien.lemoal@opensource.wdc.com zonefs: Reduce struct zonefs_inode_info size
Damien Le Moal damien.lemoal@opensource.wdc.com zonefs: Simplify IO error handling
Damien Le Moal damien.lemoal@opensource.wdc.com zonefs: Reorganize code
Shyam Prasad N sprasad@microsoft.com cifs: avoid race conditions with parallel reconnects
Paulo Alcantara pc@manguebit.com cifs: prevent data race in cifs_reconnect_tcon()
Shyam Prasad N sprasad@microsoft.com cifs: update ip_addr for ses only for primary chan setup
Gil Fine gil.fine@linux.intel.com thunderbolt: Limit USB3 bandwidth of certain Intel USB4 host routers
-------------
Diffstat:
.../devicetree/bindings/mtd/jedec,spi-nor.yaml | 7 + Makefile | 4 +- arch/arm64/kvm/mmu.c | 99 +- arch/arm64/kvm/pmu-emul.c | 3 +- arch/arm64/kvm/sys_regs.c | 21 +- arch/mips/bmips/dma.c | 5 + arch/mips/bmips/setup.c | 8 + arch/powerpc/include/asm/book3s/64/tlbflush.h | 9 +- arch/powerpc/kernel/ptrace/ptrace-view.c | 6 + arch/powerpc/platforms/pseries/vas.c | 8 + arch/riscv/kvm/vcpu_timer.c | 6 +- arch/s390/Makefile | 2 +- arch/s390/lib/uaccess.c | 2 +- arch/x86/xen/Makefile | 2 +- arch/x86/xen/enlighten_pv.c | 3 +- arch/x86/xen/enlighten_pvh.c | 13 + arch/x86/xen/vga.c | 5 +- arch/x86/xen/xen-ops.h | 7 +- arch/xtensa/kernel/traps.c | 16 +- drivers/acpi/bus.c | 83 +- drivers/acpi/video_detect.c | 7 + drivers/acpi/x86/utils.c | 45 +- drivers/block/loop.c | 18 +- drivers/gpio/gpiolib-acpi.c | 3 + drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 7 +- drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h | 4 + drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h | 3 +- drivers/gpu/drm/amd/amdgpu/soc21.c | 103 +- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 16 +- drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 33 +- drivers/gpu/drm/amd/amdkfd/kfd_module.c | 1 + drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 1 + drivers/gpu/drm/amd/amdkfd/kfd_process.c | 67 +- .../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 4 +- .../gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_hdcp.c | 2 +- .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 51 +- .../amd/display/amdgpu_dm/amdgpu_dm_mst_types.h | 15 + drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c | 10 +- drivers/gpu/drm/i915/display/intel_color.c | 44 +- drivers/gpu/drm/i915/display/intel_display.c | 28 +- drivers/gpu/drm/i915/display/intel_dpt.c | 2 + drivers/gpu/drm/i915/display/intel_tc.c | 4 +- drivers/gpu/drm/i915/gem/i915_gem_lmem.c | 3 +- drivers/gpu/drm/i915/gem/i915_gem_object.h | 2 +- drivers/gpu/drm/i915/gem/i915_gem_object_types.h | 3 + drivers/gpu/drm/i915/gt/intel_rps.c | 38 +- drivers/gpu/drm/i915/gt/intel_rps.h | 4 +- drivers/gpu/drm/i915/i915_perf.c | 14 +- drivers/gpu/drm/i915/i915_perf_types.h | 6 + drivers/gpu/drm/i915/i915_pmu.c | 10 +- drivers/gpu/drm/nouveau/nouveau_backlight.c | 7 +- drivers/input/joystick/xpad.c | 7 +- drivers/input/mouse/alps.c | 16 +- drivers/input/mouse/focaltech.c | 8 +- drivers/input/serio/i8042-acpipnpio.h | 36 + drivers/input/touchscreen/goodix.c | 14 +- drivers/iommu/intel/dmar.c | 3 +- drivers/md/dm.c | 2 + drivers/md/md.c | 3 + drivers/media/platform/qcom/venus/firmware.c | 4 +- drivers/mtd/nand/ecc-mxic.c | 1 + drivers/mtd/nand/raw/meson_nand.c | 10 +- drivers/net/dsa/microchip/ksz8795.c | 11 +- drivers/net/dsa/microchip/ksz8863_smi.c | 9 - drivers/net/dsa/microchip/ksz_common.c | 12 +- drivers/net/dsa/mv88e6xxx/chip.c | 9 +- drivers/net/dsa/realtek/realtek-mdio.c | 5 +- drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c | 16 +- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 8 +- drivers/net/ethernet/broadcom/bnxt/bnxt.h | 1 + drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 3 + drivers/net/ethernet/intel/i40e/i40e_diag.c | 11 +- drivers/net/ethernet/intel/i40e/i40e_diag.h | 2 +- drivers/net/ethernet/intel/ice/ice_sched.c | 8 +- drivers/net/ethernet/intel/ice/ice_switch.c | 26 +- drivers/net/ethernet/intel/ice/ice_virtchnl_fdir.c | 73 + drivers/net/ethernet/marvell/mvpp2/mvpp2_cls.c | 30 +- drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c | 86 +- drivers/net/ethernet/mediatek/mtk_eth_soc.c | 8 +- drivers/net/ethernet/mediatek/mtk_ppe.c | 6 +- drivers/net/ethernet/mediatek/mtk_ppe_offload.c | 3 +- drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 10 +- drivers/net/ethernet/realtek/r8169_phy_config.c | 3 + drivers/net/ethernet/sfc/ef10.c | 38 +- drivers/net/ethernet/sfc/efx.c | 17 +- drivers/net/ethernet/smsc/smsc911x.c | 7 +- drivers/net/ethernet/stmicro/stmmac/common.h | 1 - drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c | 61 +- drivers/net/ieee802154/ca8210.c | 3 +- drivers/net/ipa/gsi_trans.c | 2 +- drivers/net/net_failover.c | 8 +- drivers/net/phy/dp83869.c | 6 +- drivers/net/vmxnet3/vmxnet3_drv.c | 4 +- drivers/net/wwan/iosm/iosm_ipc_imem.c | 7 + drivers/net/xen-netback/common.h | 2 +- drivers/net/xen-netback/netback.c | 25 +- drivers/nvme/host/pci.c | 3 + drivers/pci/controller/dwc/pcie-designware.c | 10 +- drivers/pinctrl/pinctrl-amd.c | 36 +- drivers/pinctrl/pinctrl-at91-pio4.c | 1 - drivers/pinctrl/pinctrl-ocelot.c | 2 +- drivers/platform/surface/aggregator/bus.c | 4 +- drivers/platform/x86/ideapad-laptop.c | 23 +- drivers/platform/x86/intel/pmc/core.c | 13 +- drivers/platform/x86/think-lmi.c | 60 +- drivers/ptp/ptp_qoriq.c | 2 +- drivers/regulator/fixed.c | 2 +- drivers/s390/crypto/vfio_ap_drv.c | 3 +- drivers/scsi/megaraid/megaraid_sas_fusion.c | 4 +- drivers/scsi/mpt3sas/mpt3sas_base.c | 5 - .../int340x_thermal/processor_thermal_device_pci.c | 1 - drivers/thunderbolt/quirks.c | 31 + drivers/thunderbolt/tb.h | 3 + drivers/thunderbolt/usb4.c | 17 +- drivers/usb/typec/ucsi/ucsi.c | 22 +- drivers/video/fbdev/au1200fb.c | 3 + drivers/video/fbdev/geode/lxfb_core.c | 3 + drivers/video/fbdev/intelfb/intelfbdrv.c | 3 + drivers/video/fbdev/nvidia/nvidia.c | 2 + drivers/video/fbdev/tgafb.c | 3 + fs/btrfs/backref.c | 85 +- fs/btrfs/block-group.c | 30 +- fs/btrfs/free-space-cache.c | 8 +- fs/btrfs/fs.h | 7 +- fs/btrfs/ioctl.c | 2 + fs/btrfs/qgroup.c | 11 +- fs/btrfs/space-info.c | 42 +- fs/btrfs/space-info.h | 2 - fs/btrfs/transaction.c | 15 +- fs/btrfs/volumes.c | 20 +- fs/btrfs/zoned.c | 31 +- fs/cifs/cifsfs.h | 5 +- fs/cifs/cifsproto.h | 1 + fs/cifs/cifssmb.c | 52 +- fs/cifs/connect.c | 64 +- fs/cifs/misc.c | 44 + fs/cifs/smb2pdu.c | 132 +- fs/cifs/smb2transport.c | 17 +- fs/nfs/nfs4proc.c | 5 +- fs/verity/enable.c | 24 +- fs/zonefs/Makefile | 2 +- fs/zonefs/file.c | 902 +++++++++++ fs/zonefs/super.c | 1640 ++++++-------------- fs/zonefs/trace.h | 20 +- fs/zonefs/zonefs.h | 100 +- include/acpi/acpi_bus.h | 5 + include/trace/events/rcu.h | 2 +- include/xen/interface/platform.h | 3 + io_uring/alloc_cache.h | 1 + io_uring/poll.c | 1 + io_uring/rsrc.h | 12 +- kernel/compat.c | 2 +- kernel/dma/swiotlb.c | 24 +- kernel/sched/core.c | 4 +- kernel/trace/kprobe_event_gen_test.c | 4 +- lib/zstd/common/zstd_deps.h | 2 +- lib/zstd/decompress/zstd_decompress.c | 25 +- net/can/bcm.c | 16 +- net/can/j1939/transport.c | 8 +- net/dsa/slave.c | 121 +- net/hsr/hsr_framereg.c | 2 +- net/mac80211/cfg.c | 21 +- net/sunrpc/xprtsock.c | 1 + net/xfrm/xfrm_user.c | 45 +- scripts/mod/modpost.c | 2 +- sound/core/pcm_lib.c | 2 + sound/pci/asihpi/hpi6205.c | 2 +- sound/pci/hda/patch_ca0132.c | 4 +- sound/pci/hda/patch_conexant.c | 6 +- sound/pci/hda/patch_realtek.c | 5 + sound/pci/ymfpci/ymfpci.c | 2 +- sound/pci/ymfpci/ymfpci_main.c | 2 +- sound/soc/codecs/hdmi-codec.c | 11 + sound/soc/codecs/lpass-tx-macro.c | 11 +- sound/soc/intel/avs/boards/da7219.c | 21 + sound/soc/intel/avs/boards/max98357a.c | 22 + sound/soc/intel/avs/boards/nau8825.c | 14 +- sound/soc/intel/avs/boards/rt5682.c | 22 + sound/soc/intel/avs/boards/ssm4567.c | 31 - sound/soc/sof/intel/hda-ctrl.c | 3 + sound/soc/sof/intel/hda-dsp.c | 12 + sound/soc/sof/intel/pci-tng.c | 6 +- sound/soc/sof/ipc3.c | 5 +- sound/soc/sof/ipc4-control.c | 3 +- sound/soc/sof/ipc4-topology.c | 6 +- sound/soc/sof/ipc4-topology.h | 6 +- sound/usb/endpoint.c | 22 +- sound/usb/endpoint.h | 4 +- sound/usb/format.c | 8 +- sound/usb/pcm.c | 2 +- tools/lib/bpf/btf_dump.c | 154 +- tools/power/acpi/tools/pfrut/pfrut.c | 18 +- tools/power/x86/turbostat/turbostat.8 | 2 + tools/power/x86/turbostat/turbostat.c | 4 +- .../bpf/progs/btf_dump_test_case_bitfields.c | 2 +- .../bpf/progs/btf_dump_test_case_packing.c | 80 +- .../bpf/progs/btf_dump_test_case_padding.c | 162 +- 197 files changed, 3771 insertions(+), 2190 deletions(-)
From: Gil Fine gil.fine@linux.intel.com
[ Upstream commit f0a57dd33b3eadf540912cd130db727ea824d174 ]
Current Intel USB4 host routers have hardware limitation that the USB3 bandwidth cannot go higher than 16376 Mb/s. Work this around by adding a new quirk that limits the bandwidth for the affected host routers.
Cc: stable@vger.kernel.org Signed-off-by: Gil Fine gil.fine@linux.intel.com Signed-off-by: Mika Westerberg mika.westerberg@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/thunderbolt/quirks.c | 31 +++++++++++++++++++++++++++++++ drivers/thunderbolt/tb.h | 3 +++ drivers/thunderbolt/usb4.c | 17 +++++++++++++++-- 3 files changed, 49 insertions(+), 2 deletions(-)
diff --git a/drivers/thunderbolt/quirks.c b/drivers/thunderbolt/quirks.c index ae28a03fa890b..1157b8869bcca 100644 --- a/drivers/thunderbolt/quirks.c +++ b/drivers/thunderbolt/quirks.c @@ -26,6 +26,19 @@ static void quirk_clx_disable(struct tb_switch *sw) tb_sw_dbg(sw, "disabling CL states\n"); }
+static void quirk_usb3_maximum_bandwidth(struct tb_switch *sw) +{ + struct tb_port *port; + + tb_switch_for_each_port(sw, port) { + if (!tb_port_is_usb3_down(port)) + continue; + port->max_bw = 16376; + tb_port_dbg(port, "USB3 maximum bandwidth limited to %u Mb/s\n", + port->max_bw); + } +} + struct tb_quirk { u16 hw_vendor_id; u16 hw_device_id; @@ -43,6 +56,24 @@ static const struct tb_quirk tb_quirks[] = { * DP buffers. */ { 0x8087, 0x0b26, 0x0000, 0x0000, quirk_dp_credit_allocation }, + /* + * Limit the maximum USB3 bandwidth for the following Intel USB4 + * host routers due to a hardware issue. + */ + { 0x8087, PCI_DEVICE_ID_INTEL_ADL_NHI0, 0x0000, 0x0000, + quirk_usb3_maximum_bandwidth }, + { 0x8087, PCI_DEVICE_ID_INTEL_ADL_NHI1, 0x0000, 0x0000, + quirk_usb3_maximum_bandwidth }, + { 0x8087, PCI_DEVICE_ID_INTEL_RPL_NHI0, 0x0000, 0x0000, + quirk_usb3_maximum_bandwidth }, + { 0x8087, PCI_DEVICE_ID_INTEL_RPL_NHI1, 0x0000, 0x0000, + quirk_usb3_maximum_bandwidth }, + { 0x8087, PCI_DEVICE_ID_INTEL_MTL_M_NHI0, 0x0000, 0x0000, + quirk_usb3_maximum_bandwidth }, + { 0x8087, PCI_DEVICE_ID_INTEL_MTL_P_NHI0, 0x0000, 0x0000, + quirk_usb3_maximum_bandwidth }, + { 0x8087, PCI_DEVICE_ID_INTEL_MTL_P_NHI1, 0x0000, 0x0000, + quirk_usb3_maximum_bandwidth }, /* * CLx is not supported on AMD USB4 Yellow Carp and Pink Sardine platforms. */ diff --git a/drivers/thunderbolt/tb.h b/drivers/thunderbolt/tb.h index e11d973a8f9b6..f034723b1b40e 100644 --- a/drivers/thunderbolt/tb.h +++ b/drivers/thunderbolt/tb.h @@ -252,6 +252,8 @@ struct tb_switch { * @ctl_credits: Buffers reserved for control path * @dma_credits: Number of credits allocated for DMA tunneling for all * DMA paths through this port. + * @max_bw: Maximum possible bandwidth through this adapter if set to + * non-zero. * * In USB4 terminology this structure represents an adapter (protocol or * lane adapter). @@ -277,6 +279,7 @@ struct tb_port { unsigned int total_credits; unsigned int ctl_credits; unsigned int dma_credits; + unsigned int max_bw; };
/** diff --git a/drivers/thunderbolt/usb4.c b/drivers/thunderbolt/usb4.c index d5cd219ee9e6b..3a11b30b6c86a 100644 --- a/drivers/thunderbolt/usb4.c +++ b/drivers/thunderbolt/usb4.c @@ -1882,6 +1882,15 @@ int usb4_port_retimer_nvm_read(struct tb_port *port, u8 index, usb4_port_retimer_nvm_read_block, &info); }
+static inline unsigned int +usb4_usb3_port_max_bandwidth(const struct tb_port *port, unsigned int bw) +{ + /* Take the possible bandwidth limitation into account */ + if (port->max_bw) + return min(bw, port->max_bw); + return bw; +} + /** * usb4_usb3_port_max_link_rate() - Maximum support USB3 link rate * @port: USB3 adapter port @@ -1903,7 +1912,9 @@ int usb4_usb3_port_max_link_rate(struct tb_port *port) return ret;
lr = (val & ADP_USB3_CS_4_MSLR_MASK) >> ADP_USB3_CS_4_MSLR_SHIFT; - return lr == ADP_USB3_CS_4_MSLR_20G ? 20000 : 10000; + ret = lr == ADP_USB3_CS_4_MSLR_20G ? 20000 : 10000; + + return usb4_usb3_port_max_bandwidth(port, ret); }
/** @@ -1930,7 +1941,9 @@ int usb4_usb3_port_actual_link_rate(struct tb_port *port) return 0;
lr = val & ADP_USB3_CS_4_ALR_MASK; - return lr == ADP_USB3_CS_4_ALR_20G ? 20000 : 10000; + ret = lr == ADP_USB3_CS_4_ALR_20G ? 20000 : 10000; + + return usb4_usb3_port_max_bandwidth(port, ret); }
static int usb4_usb3_port_cm_request(struct tb_port *port, bool request)
From: Shyam Prasad N sprasad@microsoft.com
[ Upstream commit e77978de4765229e09c8fabcf4f8419ff367317f ]
We update ses->ip_addr whenever we do a session setup. But this should happen only for primary channel in mchan scenario.
Signed-off-by: Shyam Prasad N sprasad@microsoft.com Reviewed-by: Paulo Alcantara (SUSE) pc@cjr.nz Signed-off-by: Steve French stfrench@microsoft.com Stable-dep-of: bc962159e8e3 ("cifs: avoid race conditions with parallel reconnects") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cifs/connect.c | 18 +++++++++++------- 1 file changed, 11 insertions(+), 7 deletions(-)
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c index 6da2af97b8bac..f53837f436d08 100644 --- a/fs/cifs/connect.c +++ b/fs/cifs/connect.c @@ -3735,16 +3735,12 @@ cifs_setup_session(const unsigned int xid, struct cifs_ses *ses, struct nls_table *nls_info) { int rc = -ENOSYS; - struct sockaddr_in6 *addr6 = (struct sockaddr_in6 *)&server->dstaddr; - struct sockaddr_in *addr = (struct sockaddr_in *)&server->dstaddr; + struct TCP_Server_Info *pserver = CIFS_SERVER_IS_CHAN(server) ? server->primary_server : server; + struct sockaddr_in6 *addr6 = (struct sockaddr_in6 *)&pserver->dstaddr; + struct sockaddr_in *addr = (struct sockaddr_in *)&pserver->dstaddr; bool is_binding = false;
spin_lock(&ses->ses_lock); - if (server->dstaddr.ss_family == AF_INET6) - scnprintf(ses->ip_addr, sizeof(ses->ip_addr), "%pI6", &addr6->sin6_addr); - else - scnprintf(ses->ip_addr, sizeof(ses->ip_addr), "%pI4", &addr->sin_addr); - if (ses->ses_status != SES_GOOD && ses->ses_status != SES_NEW && ses->ses_status != SES_NEED_RECON) { @@ -3768,6 +3764,14 @@ cifs_setup_session(const unsigned int xid, struct cifs_ses *ses, ses->ses_status = SES_IN_SETUP; spin_unlock(&ses->ses_lock);
+ /* update ses ip_addr only for primary chan */ + if (server == pserver) { + if (server->dstaddr.ss_family == AF_INET6) + scnprintf(ses->ip_addr, sizeof(ses->ip_addr), "%pI6", &addr6->sin6_addr); + else + scnprintf(ses->ip_addr, sizeof(ses->ip_addr), "%pI4", &addr->sin_addr); + } + if (!is_binding) { ses->capabilities = server->capabilities; if (!linuxExtEnabled)
From: Paulo Alcantara pc@manguebit.com
[ Upstream commit 1bcd548d935a33c6fc58331405eb1b82fd6150de ]
Make sure to get an up-to-date TCP_Server_Info::nr_targets value prior to waiting the server to be reconnected in cifs_reconnect_tcon(). It is set in cifs_tcp_ses_needs_reconnect() and protected by TCP_Server_Info::srv_lock.
Create a new cifs_wait_for_server_reconnect() helper that can be used by both SMB2+ and CIFS reconnect code.
Signed-off-by: Paulo Alcantara (SUSE) pc@manguebit.com Signed-off-by: Steve French stfrench@microsoft.com Stable-dep-of: bc962159e8e3 ("cifs: avoid race conditions with parallel reconnects") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cifs/cifsproto.h | 1 + fs/cifs/cifssmb.c | 43 ++---------------------- fs/cifs/misc.c | 44 ++++++++++++++++++++++++ fs/cifs/smb2pdu.c | 82 ++++++++++++--------------------------------- 4 files changed, 69 insertions(+), 101 deletions(-)
diff --git a/fs/cifs/cifsproto.h b/fs/cifs/cifsproto.h index e75184544ecb4..639df85dafd6c 100644 --- a/fs/cifs/cifsproto.h +++ b/fs/cifs/cifsproto.h @@ -697,5 +697,6 @@ static inline int cifs_create_options(struct cifs_sb_info *cifs_sb, int options)
struct super_block *cifs_get_tcon_super(struct cifs_tcon *tcon); void cifs_put_tcon_super(struct super_block *sb); +int cifs_wait_for_server_reconnect(struct TCP_Server_Info *server, bool retry);
#endif /* _CIFSPROTO_H */ diff --git a/fs/cifs/cifssmb.c b/fs/cifs/cifssmb.c index 566e6a26b897c..5ca4a5383aaae 100644 --- a/fs/cifs/cifssmb.c +++ b/fs/cifs/cifssmb.c @@ -70,7 +70,6 @@ cifs_reconnect_tcon(struct cifs_tcon *tcon, int smb_command) struct cifs_ses *ses; struct TCP_Server_Info *server; struct nls_table *nls_codepage; - int retries;
/* * SMBs NegProt, SessSetup, uLogoff do not have tcon yet so check for @@ -98,45 +97,9 @@ cifs_reconnect_tcon(struct cifs_tcon *tcon, int smb_command) } spin_unlock(&tcon->tc_lock);
- retries = server->nr_targets; - - /* - * Give demultiplex thread up to 10 seconds to each target available for - * reconnect -- should be greater than cifs socket timeout which is 7 - * seconds. - */ - while (server->tcpStatus == CifsNeedReconnect) { - rc = wait_event_interruptible_timeout(server->response_q, - (server->tcpStatus != CifsNeedReconnect), - 10 * HZ); - if (rc < 0) { - cifs_dbg(FYI, "%s: aborting reconnect due to a received signal by the process\n", - __func__); - return -ERESTARTSYS; - } - - /* are we still trying to reconnect? */ - spin_lock(&server->srv_lock); - if (server->tcpStatus != CifsNeedReconnect) { - spin_unlock(&server->srv_lock); - break; - } - spin_unlock(&server->srv_lock); - - if (retries && --retries) - continue; - - /* - * on "soft" mounts we wait once. Hard mounts keep - * retrying until process is killed or server comes - * back on-line - */ - if (!tcon->retry) { - cifs_dbg(FYI, "gave up waiting on reconnect in smb_init\n"); - return -EHOSTDOWN; - } - retries = server->nr_targets; - } + rc = cifs_wait_for_server_reconnect(server, tcon->retry); + if (rc) + return rc;
spin_lock(&ses->chan_lock); if (!cifs_chan_needs_reconnect(ses, server) && !tcon->need_reconnect) { diff --git a/fs/cifs/misc.c b/fs/cifs/misc.c index 9f4486b705d5c..5542893ef03f7 100644 --- a/fs/cifs/misc.c +++ b/fs/cifs/misc.c @@ -1376,3 +1376,47 @@ int cifs_inval_name_dfs_link_error(const unsigned int xid, return 0; } #endif + +int cifs_wait_for_server_reconnect(struct TCP_Server_Info *server, bool retry) +{ + int timeout = 10; + int rc; + + spin_lock(&server->srv_lock); + if (server->tcpStatus != CifsNeedReconnect) { + spin_unlock(&server->srv_lock); + return 0; + } + timeout *= server->nr_targets; + spin_unlock(&server->srv_lock); + + /* + * Give demultiplex thread up to 10 seconds to each target available for + * reconnect -- should be greater than cifs socket timeout which is 7 + * seconds. + * + * On "soft" mounts we wait once. Hard mounts keep retrying until + * process is killed or server comes back on-line. + */ + do { + rc = wait_event_interruptible_timeout(server->response_q, + (server->tcpStatus != CifsNeedReconnect), + timeout * HZ); + if (rc < 0) { + cifs_dbg(FYI, "%s: aborting reconnect due to received signal\n", + __func__); + return -ERESTARTSYS; + } + + /* are we still trying to reconnect? */ + spin_lock(&server->srv_lock); + if (server->tcpStatus != CifsNeedReconnect) { + spin_unlock(&server->srv_lock); + return 0; + } + spin_unlock(&server->srv_lock); + } while (retry); + + cifs_dbg(FYI, "%s: gave up waiting on reconnect\n", __func__); + return -EHOSTDOWN; +} diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c index 6e6e44d8b4c79..83d04cd2f9df8 100644 --- a/fs/cifs/smb2pdu.c +++ b/fs/cifs/smb2pdu.c @@ -139,66 +139,6 @@ smb2_hdr_assemble(struct smb2_hdr *shdr, __le16 smb2_cmd, return; }
-static int wait_for_server_reconnect(struct TCP_Server_Info *server, - __le16 smb2_command, bool retry) -{ - int timeout = 10; - int rc; - - spin_lock(&server->srv_lock); - if (server->tcpStatus != CifsNeedReconnect) { - spin_unlock(&server->srv_lock); - return 0; - } - timeout *= server->nr_targets; - spin_unlock(&server->srv_lock); - - /* - * Return to caller for TREE_DISCONNECT and LOGOFF and CLOSE - * here since they are implicitly done when session drops. - */ - switch (smb2_command) { - /* - * BB Should we keep oplock break and add flush to exceptions? - */ - case SMB2_TREE_DISCONNECT: - case SMB2_CANCEL: - case SMB2_CLOSE: - case SMB2_OPLOCK_BREAK: - return -EAGAIN; - } - - /* - * Give demultiplex thread up to 10 seconds to each target available for - * reconnect -- should be greater than cifs socket timeout which is 7 - * seconds. - * - * On "soft" mounts we wait once. Hard mounts keep retrying until - * process is killed or server comes back on-line. - */ - do { - rc = wait_event_interruptible_timeout(server->response_q, - (server->tcpStatus != CifsNeedReconnect), - timeout * HZ); - if (rc < 0) { - cifs_dbg(FYI, "%s: aborting reconnect due to received signal\n", - __func__); - return -ERESTARTSYS; - } - - /* are we still trying to reconnect? */ - spin_lock(&server->srv_lock); - if (server->tcpStatus != CifsNeedReconnect) { - spin_unlock(&server->srv_lock); - return 0; - } - spin_unlock(&server->srv_lock); - } while (retry); - - cifs_dbg(FYI, "%s: gave up waiting on reconnect\n", __func__); - return -EHOSTDOWN; -} - static int smb2_reconnect(__le16 smb2_command, struct cifs_tcon *tcon, struct TCP_Server_Info *server) @@ -239,7 +179,27 @@ smb2_reconnect(__le16 smb2_command, struct cifs_tcon *tcon, (!tcon->ses->server) || !server) return -EIO;
- rc = wait_for_server_reconnect(server, smb2_command, tcon->retry); + spin_lock(&server->srv_lock); + if (server->tcpStatus == CifsNeedReconnect) { + /* + * Return to caller for TREE_DISCONNECT and LOGOFF and CLOSE + * here since they are implicitly done when session drops. + */ + switch (smb2_command) { + /* + * BB Should we keep oplock break and add flush to exceptions? + */ + case SMB2_TREE_DISCONNECT: + case SMB2_CANCEL: + case SMB2_CLOSE: + case SMB2_OPLOCK_BREAK: + spin_unlock(&server->srv_lock); + return -EAGAIN; + } + } + spin_unlock(&server->srv_lock); + + rc = cifs_wait_for_server_reconnect(server, tcon->retry); if (rc) return rc;
From: Shyam Prasad N sprasad@microsoft.com
[ Upstream commit bc962159e8e326af634a506508034a375bf2b858 ]
When multiple processes/channels do reconnects in parallel we used to return success immediately negotiate/session-setup/tree-connect, causing race conditions between processes that enter the function in parallel. This caused several errors related to session not found to show up during parallel reconnects.
Signed-off-by: Shyam Prasad N sprasad@microsoft.com Reviewed-by: Paulo Alcantara (SUSE) pc@manguebit.com Cc: stable@vger.kernel.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cifs/connect.c | 48 ++++++++++++++++++++++++++++++----------- fs/cifs/smb2pdu.c | 44 +++++++++++++++++++++---------------- fs/cifs/smb2transport.c | 17 ++++++++++++--- 3 files changed, 76 insertions(+), 33 deletions(-)
diff --git a/fs/cifs/connect.c b/fs/cifs/connect.c index f53837f436d08..985e962cf0858 100644 --- a/fs/cifs/connect.c +++ b/fs/cifs/connect.c @@ -244,31 +244,42 @@ cifs_mark_tcp_ses_conns_for_reconnect(struct TCP_Server_Info *server, cifs_chan_update_iface(ses, server);
spin_lock(&ses->chan_lock); - if (!mark_smb_session && cifs_chan_needs_reconnect(ses, server)) - goto next_session; + if (!mark_smb_session && cifs_chan_needs_reconnect(ses, server)) { + spin_unlock(&ses->chan_lock); + continue; + }
if (mark_smb_session) CIFS_SET_ALL_CHANS_NEED_RECONNECT(ses); else cifs_chan_set_need_reconnect(ses, server);
+ cifs_dbg(FYI, "%s: channel connect bitmap: 0x%lx\n", + __func__, ses->chans_need_reconnect); + /* If all channels need reconnect, then tcon needs reconnect */ - if (!mark_smb_session && !CIFS_ALL_CHANS_NEED_RECONNECT(ses)) - goto next_session; + if (!mark_smb_session && !CIFS_ALL_CHANS_NEED_RECONNECT(ses)) { + spin_unlock(&ses->chan_lock); + continue; + } + spin_unlock(&ses->chan_lock);
+ spin_lock(&ses->ses_lock); ses->ses_status = SES_NEED_RECON; + spin_unlock(&ses->ses_lock);
list_for_each_entry(tcon, &ses->tcon_list, tcon_list) { tcon->need_reconnect = true; + spin_lock(&tcon->tc_lock); tcon->status = TID_NEED_RECON; + spin_unlock(&tcon->tc_lock); } if (ses->tcon_ipc) { ses->tcon_ipc->need_reconnect = true; + spin_lock(&ses->tcon_ipc->tc_lock); ses->tcon_ipc->status = TID_NEED_RECON; + spin_unlock(&ses->tcon_ipc->tc_lock); } - -next_session: - spin_unlock(&ses->chan_lock); } spin_unlock(&cifs_tcp_ses_lock); } @@ -3703,11 +3714,19 @@ cifs_negotiate_protocol(const unsigned int xid, struct cifs_ses *ses,
/* only send once per connect */ spin_lock(&server->srv_lock); - if (!server->ops->need_neg(server) || + if (server->tcpStatus != CifsGood && + server->tcpStatus != CifsNew && server->tcpStatus != CifsNeedNegotiate) { + spin_unlock(&server->srv_lock); + return -EHOSTDOWN; + } + + if (!server->ops->need_neg(server) && + server->tcpStatus == CifsGood) { spin_unlock(&server->srv_lock); return 0; } + server->tcpStatus = CifsInNegotiate; spin_unlock(&server->srv_lock);
@@ -3741,23 +3760,28 @@ cifs_setup_session(const unsigned int xid, struct cifs_ses *ses, bool is_binding = false;
spin_lock(&ses->ses_lock); + cifs_dbg(FYI, "%s: channel connect bitmap: 0x%lx\n", + __func__, ses->chans_need_reconnect); + if (ses->ses_status != SES_GOOD && ses->ses_status != SES_NEW && ses->ses_status != SES_NEED_RECON) { spin_unlock(&ses->ses_lock); - return 0; + return -EHOSTDOWN; }
/* only send once per connect */ spin_lock(&ses->chan_lock); - if (CIFS_ALL_CHANS_GOOD(ses) || - cifs_chan_in_reconnect(ses, server)) { + if (CIFS_ALL_CHANS_GOOD(ses)) { + if (ses->ses_status == SES_NEED_RECON) + ses->ses_status = SES_GOOD; spin_unlock(&ses->chan_lock); spin_unlock(&ses->ses_lock); return 0; } - is_binding = !CIFS_ALL_CHANS_NEED_RECONNECT(ses); + cifs_chan_set_in_reconnect(ses, server); + is_binding = !CIFS_ALL_CHANS_NEED_RECONNECT(ses); spin_unlock(&ses->chan_lock);
if (!is_binding) diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c index 83d04cd2f9df8..f0b1ae0835d71 100644 --- a/fs/cifs/smb2pdu.c +++ b/fs/cifs/smb2pdu.c @@ -199,6 +199,7 @@ smb2_reconnect(__le16 smb2_command, struct cifs_tcon *tcon, } spin_unlock(&server->srv_lock);
+again: rc = cifs_wait_for_server_reconnect(server, tcon->retry); if (rc) return rc; @@ -217,6 +218,7 @@ smb2_reconnect(__le16 smb2_command, struct cifs_tcon *tcon,
nls_codepage = load_nls_default();
+ mutex_lock(&ses->session_mutex); /* * Recheck after acquire mutex. If another thread is negotiating * and the server never sends an answer the socket will be closed @@ -225,6 +227,11 @@ smb2_reconnect(__le16 smb2_command, struct cifs_tcon *tcon, spin_lock(&server->srv_lock); if (server->tcpStatus == CifsNeedReconnect) { spin_unlock(&server->srv_lock); + mutex_unlock(&ses->session_mutex); + + if (tcon->retry) + goto again; + rc = -EHOSTDOWN; goto out; } @@ -234,19 +241,22 @@ smb2_reconnect(__le16 smb2_command, struct cifs_tcon *tcon, * need to prevent multiple threads trying to simultaneously * reconnect the same SMB session */ + spin_lock(&ses->ses_lock); spin_lock(&ses->chan_lock); - if (!cifs_chan_needs_reconnect(ses, server)) { + if (!cifs_chan_needs_reconnect(ses, server) && + ses->ses_status == SES_GOOD) { spin_unlock(&ses->chan_lock); - + spin_unlock(&ses->ses_lock); /* this means that we only need to tree connect */ if (tcon->need_reconnect) goto skip_sess_setup;
+ mutex_unlock(&ses->session_mutex); goto out; } spin_unlock(&ses->chan_lock); + spin_unlock(&ses->ses_lock);
- mutex_lock(&ses->session_mutex); rc = cifs_negotiate_protocol(0, ses, server); if (!rc) { rc = cifs_setup_session(0, ses, server, nls_codepage); @@ -262,10 +272,8 @@ smb2_reconnect(__le16 smb2_command, struct cifs_tcon *tcon, mutex_unlock(&ses->session_mutex); goto out; } - mutex_unlock(&ses->session_mutex);
skip_sess_setup: - mutex_lock(&ses->session_mutex); if (!tcon->need_reconnect) { mutex_unlock(&ses->session_mutex); goto out; @@ -280,7 +288,7 @@ smb2_reconnect(__le16 smb2_command, struct cifs_tcon *tcon, cifs_dbg(FYI, "reconnect tcon rc = %d\n", rc); if (rc) { /* If sess reconnected but tcon didn't, something strange ... */ - pr_warn_once("reconnect tcon failed rc = %d\n", rc); + cifs_dbg(VFS, "reconnect tcon failed rc = %d\n", rc); goto out; }
@@ -1252,9 +1260,9 @@ SMB2_sess_alloc_buffer(struct SMB2_sess_data *sess_data) if (rc) return rc;
- spin_lock(&ses->chan_lock); - is_binding = !CIFS_ALL_CHANS_NEED_RECONNECT(ses); - spin_unlock(&ses->chan_lock); + spin_lock(&ses->ses_lock); + is_binding = (ses->ses_status == SES_GOOD); + spin_unlock(&ses->ses_lock);
if (is_binding) { req->hdr.SessionId = cpu_to_le64(ses->Suid); @@ -1412,9 +1420,9 @@ SMB2_auth_kerberos(struct SMB2_sess_data *sess_data) goto out_put_spnego_key; }
- spin_lock(&ses->chan_lock); - is_binding = !CIFS_ALL_CHANS_NEED_RECONNECT(ses); - spin_unlock(&ses->chan_lock); + spin_lock(&ses->ses_lock); + is_binding = (ses->ses_status == SES_GOOD); + spin_unlock(&ses->ses_lock);
/* keep session key if binding */ if (!is_binding) { @@ -1538,9 +1546,9 @@ SMB2_sess_auth_rawntlmssp_negotiate(struct SMB2_sess_data *sess_data)
cifs_dbg(FYI, "rawntlmssp session setup challenge phase\n");
- spin_lock(&ses->chan_lock); - is_binding = !CIFS_ALL_CHANS_NEED_RECONNECT(ses); - spin_unlock(&ses->chan_lock); + spin_lock(&ses->ses_lock); + is_binding = (ses->ses_status == SES_GOOD); + spin_unlock(&ses->ses_lock);
/* keep existing ses id and flags if binding */ if (!is_binding) { @@ -1606,9 +1614,9 @@ SMB2_sess_auth_rawntlmssp_authenticate(struct SMB2_sess_data *sess_data)
rsp = (struct smb2_sess_setup_rsp *)sess_data->iov[0].iov_base;
- spin_lock(&ses->chan_lock); - is_binding = !CIFS_ALL_CHANS_NEED_RECONNECT(ses); - spin_unlock(&ses->chan_lock); + spin_lock(&ses->ses_lock); + is_binding = (ses->ses_status == SES_GOOD); + spin_unlock(&ses->ses_lock);
/* keep existing ses id and flags if binding */ if (!is_binding) { diff --git a/fs/cifs/smb2transport.c b/fs/cifs/smb2transport.c index d827b7547ffad..790acf65a0926 100644 --- a/fs/cifs/smb2transport.c +++ b/fs/cifs/smb2transport.c @@ -81,6 +81,7 @@ int smb2_get_sign_key(__u64 ses_id, struct TCP_Server_Info *server, u8 *key) struct cifs_ses *ses = NULL; int i; int rc = 0; + bool is_binding = false;
spin_lock(&cifs_tcp_ses_lock);
@@ -97,9 +98,12 @@ int smb2_get_sign_key(__u64 ses_id, struct TCP_Server_Info *server, u8 *key) goto out;
found: + spin_lock(&ses->ses_lock); spin_lock(&ses->chan_lock); - if (cifs_chan_needs_reconnect(ses, server) && - !CIFS_ALL_CHANS_NEED_RECONNECT(ses)) { + + is_binding = (cifs_chan_needs_reconnect(ses, server) && + ses->ses_status == SES_GOOD); + if (is_binding) { /* * If we are in the process of binding a new channel * to an existing session, use the master connection @@ -107,6 +111,7 @@ int smb2_get_sign_key(__u64 ses_id, struct TCP_Server_Info *server, u8 *key) */ memcpy(key, ses->smb3signingkey, SMB3_SIGN_KEY_SIZE); spin_unlock(&ses->chan_lock); + spin_unlock(&ses->ses_lock); goto out; }
@@ -119,10 +124,12 @@ int smb2_get_sign_key(__u64 ses_id, struct TCP_Server_Info *server, u8 *key) if (chan->server == server) { memcpy(key, chan->signkey, SMB3_SIGN_KEY_SIZE); spin_unlock(&ses->chan_lock); + spin_unlock(&ses->ses_lock); goto out; } } spin_unlock(&ses->chan_lock); + spin_unlock(&ses->ses_lock);
cifs_dbg(VFS, "%s: Could not find channel signing key for session 0x%llx\n", @@ -392,11 +399,15 @@ generate_smb3signingkey(struct cifs_ses *ses, bool is_binding = false; int chan_index = 0;
+ spin_lock(&ses->ses_lock); spin_lock(&ses->chan_lock); - is_binding = !CIFS_ALL_CHANS_NEED_RECONNECT(ses); + is_binding = (cifs_chan_needs_reconnect(ses, server) && + ses->ses_status == SES_GOOD); + chan_index = cifs_ses_get_chan_index(ses, server); /* TODO: introduce ref counting for channels when the can be freed */ spin_unlock(&ses->chan_lock); + spin_unlock(&ses->ses_lock);
/* * All channels use the same encryption/decryption keys but
From: Damien Le Moal damien.lemoal@opensource.wdc.com
[ Upstream commit 4008e2a0b01aba982356fd15b128a47bf11bd9c7 ]
Move all code related to zone file operations from super.c to the new file.c file. Inode and zone management code remains in super.c.
Signed-off-by: Damien Le Moal damien.lemoal@opensource.wdc.com Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Stable-dep-of: 88b170088ad2 ("zonefs: Fix error message in zonefs_file_dio_append()") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/zonefs/Makefile | 2 +- fs/zonefs/file.c | 874 ++++++++++++++++++++++++++++++++++++++++ fs/zonefs/super.c | 973 +++------------------------------------------ fs/zonefs/zonefs.h | 22 + 4 files changed, 955 insertions(+), 916 deletions(-) create mode 100644 fs/zonefs/file.c
diff --git a/fs/zonefs/Makefile b/fs/zonefs/Makefile index 9fe54f5319f22..645f7229de4a0 100644 --- a/fs/zonefs/Makefile +++ b/fs/zonefs/Makefile @@ -3,4 +3,4 @@ ccflags-y += -I$(src)
obj-$(CONFIG_ZONEFS_FS) += zonefs.o
-zonefs-y := super.o sysfs.o +zonefs-y := super.o file.o sysfs.o diff --git a/fs/zonefs/file.c b/fs/zonefs/file.c new file mode 100644 index 0000000000000..ece0f3959b6d1 --- /dev/null +++ b/fs/zonefs/file.c @@ -0,0 +1,874 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Simple file system for zoned block devices exposing zones as files. + * + * Copyright (C) 2022 Western Digital Corporation or its affiliates. + */ +#include <linux/module.h> +#include <linux/pagemap.h> +#include <linux/iomap.h> +#include <linux/init.h> +#include <linux/slab.h> +#include <linux/blkdev.h> +#include <linux/statfs.h> +#include <linux/writeback.h> +#include <linux/quotaops.h> +#include <linux/seq_file.h> +#include <linux/parser.h> +#include <linux/uio.h> +#include <linux/mman.h> +#include <linux/sched/mm.h> +#include <linux/task_io_accounting_ops.h> + +#include "zonefs.h" + +#include "trace.h" + +static int zonefs_read_iomap_begin(struct inode *inode, loff_t offset, + loff_t length, unsigned int flags, + struct iomap *iomap, struct iomap *srcmap) +{ + struct zonefs_inode_info *zi = ZONEFS_I(inode); + struct super_block *sb = inode->i_sb; + loff_t isize; + + /* + * All blocks are always mapped below EOF. If reading past EOF, + * act as if there is a hole up to the file maximum size. + */ + mutex_lock(&zi->i_truncate_mutex); + iomap->bdev = inode->i_sb->s_bdev; + iomap->offset = ALIGN_DOWN(offset, sb->s_blocksize); + isize = i_size_read(inode); + if (iomap->offset >= isize) { + iomap->type = IOMAP_HOLE; + iomap->addr = IOMAP_NULL_ADDR; + iomap->length = length; + } else { + iomap->type = IOMAP_MAPPED; + iomap->addr = (zi->i_zsector << SECTOR_SHIFT) + iomap->offset; + iomap->length = isize - iomap->offset; + } + mutex_unlock(&zi->i_truncate_mutex); + + trace_zonefs_iomap_begin(inode, iomap); + + return 0; +} + +static const struct iomap_ops zonefs_read_iomap_ops = { + .iomap_begin = zonefs_read_iomap_begin, +}; + +static int zonefs_write_iomap_begin(struct inode *inode, loff_t offset, + loff_t length, unsigned int flags, + struct iomap *iomap, struct iomap *srcmap) +{ + struct zonefs_inode_info *zi = ZONEFS_I(inode); + struct super_block *sb = inode->i_sb; + loff_t isize; + + /* All write I/Os should always be within the file maximum size */ + if (WARN_ON_ONCE(offset + length > zi->i_max_size)) + return -EIO; + + /* + * Sequential zones can only accept direct writes. This is already + * checked when writes are issued, so warn if we see a page writeback + * operation. + */ + if (WARN_ON_ONCE(zi->i_ztype == ZONEFS_ZTYPE_SEQ && + !(flags & IOMAP_DIRECT))) + return -EIO; + + /* + * For conventional zones, all blocks are always mapped. For sequential + * zones, all blocks after always mapped below the inode size (zone + * write pointer) and unwriten beyond. + */ + mutex_lock(&zi->i_truncate_mutex); + iomap->bdev = inode->i_sb->s_bdev; + iomap->offset = ALIGN_DOWN(offset, sb->s_blocksize); + iomap->addr = (zi->i_zsector << SECTOR_SHIFT) + iomap->offset; + isize = i_size_read(inode); + if (iomap->offset >= isize) { + iomap->type = IOMAP_UNWRITTEN; + iomap->length = zi->i_max_size - iomap->offset; + } else { + iomap->type = IOMAP_MAPPED; + iomap->length = isize - iomap->offset; + } + mutex_unlock(&zi->i_truncate_mutex); + + trace_zonefs_iomap_begin(inode, iomap); + + return 0; +} + +static const struct iomap_ops zonefs_write_iomap_ops = { + .iomap_begin = zonefs_write_iomap_begin, +}; + +static int zonefs_read_folio(struct file *unused, struct folio *folio) +{ + return iomap_read_folio(folio, &zonefs_read_iomap_ops); +} + +static void zonefs_readahead(struct readahead_control *rac) +{ + iomap_readahead(rac, &zonefs_read_iomap_ops); +} + +/* + * Map blocks for page writeback. This is used only on conventional zone files, + * which implies that the page range can only be within the fixed inode size. + */ +static int zonefs_write_map_blocks(struct iomap_writepage_ctx *wpc, + struct inode *inode, loff_t offset) +{ + struct zonefs_inode_info *zi = ZONEFS_I(inode); + + if (WARN_ON_ONCE(zi->i_ztype != ZONEFS_ZTYPE_CNV)) + return -EIO; + if (WARN_ON_ONCE(offset >= i_size_read(inode))) + return -EIO; + + /* If the mapping is already OK, nothing needs to be done */ + if (offset >= wpc->iomap.offset && + offset < wpc->iomap.offset + wpc->iomap.length) + return 0; + + return zonefs_write_iomap_begin(inode, offset, zi->i_max_size - offset, + IOMAP_WRITE, &wpc->iomap, NULL); +} + +static const struct iomap_writeback_ops zonefs_writeback_ops = { + .map_blocks = zonefs_write_map_blocks, +}; + +static int zonefs_writepages(struct address_space *mapping, + struct writeback_control *wbc) +{ + struct iomap_writepage_ctx wpc = { }; + + return iomap_writepages(mapping, wbc, &wpc, &zonefs_writeback_ops); +} + +static int zonefs_swap_activate(struct swap_info_struct *sis, + struct file *swap_file, sector_t *span) +{ + struct inode *inode = file_inode(swap_file); + struct zonefs_inode_info *zi = ZONEFS_I(inode); + + if (zi->i_ztype != ZONEFS_ZTYPE_CNV) { + zonefs_err(inode->i_sb, + "swap file: not a conventional zone file\n"); + return -EINVAL; + } + + return iomap_swapfile_activate(sis, swap_file, span, + &zonefs_read_iomap_ops); +} + +const struct address_space_operations zonefs_file_aops = { + .read_folio = zonefs_read_folio, + .readahead = zonefs_readahead, + .writepages = zonefs_writepages, + .dirty_folio = filemap_dirty_folio, + .release_folio = iomap_release_folio, + .invalidate_folio = iomap_invalidate_folio, + .migrate_folio = filemap_migrate_folio, + .is_partially_uptodate = iomap_is_partially_uptodate, + .error_remove_page = generic_error_remove_page, + .direct_IO = noop_direct_IO, + .swap_activate = zonefs_swap_activate, +}; + +int zonefs_file_truncate(struct inode *inode, loff_t isize) +{ + struct zonefs_inode_info *zi = ZONEFS_I(inode); + loff_t old_isize; + enum req_op op; + int ret = 0; + + /* + * Only sequential zone files can be truncated and truncation is allowed + * only down to a 0 size, which is equivalent to a zone reset, and to + * the maximum file size, which is equivalent to a zone finish. + */ + if (zi->i_ztype != ZONEFS_ZTYPE_SEQ) + return -EPERM; + + if (!isize) + op = REQ_OP_ZONE_RESET; + else if (isize == zi->i_max_size) + op = REQ_OP_ZONE_FINISH; + else + return -EPERM; + + inode_dio_wait(inode); + + /* Serialize against page faults */ + filemap_invalidate_lock(inode->i_mapping); + + /* Serialize against zonefs_iomap_begin() */ + mutex_lock(&zi->i_truncate_mutex); + + old_isize = i_size_read(inode); + if (isize == old_isize) + goto unlock; + + ret = zonefs_zone_mgmt(inode, op); + if (ret) + goto unlock; + + /* + * If the mount option ZONEFS_MNTOPT_EXPLICIT_OPEN is set, + * take care of open zones. + */ + if (zi->i_flags & ZONEFS_ZONE_OPEN) { + /* + * Truncating a zone to EMPTY or FULL is the equivalent of + * closing the zone. For a truncation to 0, we need to + * re-open the zone to ensure new writes can be processed. + * For a truncation to the maximum file size, the zone is + * closed and writes cannot be accepted anymore, so clear + * the open flag. + */ + if (!isize) + ret = zonefs_zone_mgmt(inode, REQ_OP_ZONE_OPEN); + else + zi->i_flags &= ~ZONEFS_ZONE_OPEN; + } + + zonefs_update_stats(inode, isize); + truncate_setsize(inode, isize); + zi->i_wpoffset = isize; + zonefs_account_active(inode); + +unlock: + mutex_unlock(&zi->i_truncate_mutex); + filemap_invalidate_unlock(inode->i_mapping); + + return ret; +} + +static int zonefs_file_fsync(struct file *file, loff_t start, loff_t end, + int datasync) +{ + struct inode *inode = file_inode(file); + int ret = 0; + + if (unlikely(IS_IMMUTABLE(inode))) + return -EPERM; + + /* + * Since only direct writes are allowed in sequential files, page cache + * flush is needed only for conventional zone files. + */ + if (ZONEFS_I(inode)->i_ztype == ZONEFS_ZTYPE_CNV) + ret = file_write_and_wait_range(file, start, end); + if (!ret) + ret = blkdev_issue_flush(inode->i_sb->s_bdev); + + if (ret) + zonefs_io_error(inode, true); + + return ret; +} + +static vm_fault_t zonefs_filemap_page_mkwrite(struct vm_fault *vmf) +{ + struct inode *inode = file_inode(vmf->vma->vm_file); + struct zonefs_inode_info *zi = ZONEFS_I(inode); + vm_fault_t ret; + + if (unlikely(IS_IMMUTABLE(inode))) + return VM_FAULT_SIGBUS; + + /* + * Sanity check: only conventional zone files can have shared + * writeable mappings. + */ + if (WARN_ON_ONCE(zi->i_ztype != ZONEFS_ZTYPE_CNV)) + return VM_FAULT_NOPAGE; + + sb_start_pagefault(inode->i_sb); + file_update_time(vmf->vma->vm_file); + + /* Serialize against truncates */ + filemap_invalidate_lock_shared(inode->i_mapping); + ret = iomap_page_mkwrite(vmf, &zonefs_write_iomap_ops); + filemap_invalidate_unlock_shared(inode->i_mapping); + + sb_end_pagefault(inode->i_sb); + return ret; +} + +static const struct vm_operations_struct zonefs_file_vm_ops = { + .fault = filemap_fault, + .map_pages = filemap_map_pages, + .page_mkwrite = zonefs_filemap_page_mkwrite, +}; + +static int zonefs_file_mmap(struct file *file, struct vm_area_struct *vma) +{ + /* + * Conventional zones accept random writes, so their files can support + * shared writable mappings. For sequential zone files, only read + * mappings are possible since there are no guarantees for write + * ordering between msync() and page cache writeback. + */ + if (ZONEFS_I(file_inode(file))->i_ztype == ZONEFS_ZTYPE_SEQ && + (vma->vm_flags & VM_SHARED) && (vma->vm_flags & VM_MAYWRITE)) + return -EINVAL; + + file_accessed(file); + vma->vm_ops = &zonefs_file_vm_ops; + + return 0; +} + +static loff_t zonefs_file_llseek(struct file *file, loff_t offset, int whence) +{ + loff_t isize = i_size_read(file_inode(file)); + + /* + * Seeks are limited to below the zone size for conventional zones + * and below the zone write pointer for sequential zones. In both + * cases, this limit is the inode size. + */ + return generic_file_llseek_size(file, offset, whence, isize, isize); +} + +static int zonefs_file_write_dio_end_io(struct kiocb *iocb, ssize_t size, + int error, unsigned int flags) +{ + struct inode *inode = file_inode(iocb->ki_filp); + struct zonefs_inode_info *zi = ZONEFS_I(inode); + + if (error) { + zonefs_io_error(inode, true); + return error; + } + + if (size && zi->i_ztype != ZONEFS_ZTYPE_CNV) { + /* + * Note that we may be seeing completions out of order, + * but that is not a problem since a write completed + * successfully necessarily means that all preceding writes + * were also successful. So we can safely increase the inode + * size to the write end location. + */ + mutex_lock(&zi->i_truncate_mutex); + if (i_size_read(inode) < iocb->ki_pos + size) { + zonefs_update_stats(inode, iocb->ki_pos + size); + zonefs_i_size_write(inode, iocb->ki_pos + size); + } + mutex_unlock(&zi->i_truncate_mutex); + } + + return 0; +} + +static const struct iomap_dio_ops zonefs_write_dio_ops = { + .end_io = zonefs_file_write_dio_end_io, +}; + +static ssize_t zonefs_file_dio_append(struct kiocb *iocb, struct iov_iter *from) +{ + struct inode *inode = file_inode(iocb->ki_filp); + struct zonefs_inode_info *zi = ZONEFS_I(inode); + struct block_device *bdev = inode->i_sb->s_bdev; + unsigned int max = bdev_max_zone_append_sectors(bdev); + struct bio *bio; + ssize_t size; + int nr_pages; + ssize_t ret; + + max = ALIGN_DOWN(max << SECTOR_SHIFT, inode->i_sb->s_blocksize); + iov_iter_truncate(from, max); + + nr_pages = iov_iter_npages(from, BIO_MAX_VECS); + if (!nr_pages) + return 0; + + bio = bio_alloc(bdev, nr_pages, + REQ_OP_ZONE_APPEND | REQ_SYNC | REQ_IDLE, GFP_NOFS); + bio->bi_iter.bi_sector = zi->i_zsector; + bio->bi_ioprio = iocb->ki_ioprio; + if (iocb_is_dsync(iocb)) + bio->bi_opf |= REQ_FUA; + + ret = bio_iov_iter_get_pages(bio, from); + if (unlikely(ret)) + goto out_release; + + size = bio->bi_iter.bi_size; + task_io_account_write(size); + + if (iocb->ki_flags & IOCB_HIPRI) + bio_set_polled(bio, iocb); + + ret = submit_bio_wait(bio); + + /* + * If the file zone was written underneath the file system, the zone + * write pointer may not be where we expect it to be, but the zone + * append write can still succeed. So check manually that we wrote where + * we intended to, that is, at zi->i_wpoffset. + */ + if (!ret) { + sector_t wpsector = + zi->i_zsector + (zi->i_wpoffset >> SECTOR_SHIFT); + + if (bio->bi_iter.bi_sector != wpsector) { + zonefs_warn(inode->i_sb, + "Corrupted write pointer %llu for zone at %llu\n", + wpsector, zi->i_zsector); + ret = -EIO; + } + } + + zonefs_file_write_dio_end_io(iocb, size, ret, 0); + trace_zonefs_file_dio_append(inode, size, ret); + +out_release: + bio_release_pages(bio, false); + bio_put(bio); + + if (ret >= 0) { + iocb->ki_pos += size; + return size; + } + + return ret; +} + +/* + * Do not exceed the LFS limits nor the file zone size. If pos is under the + * limit it becomes a short access. If it exceeds the limit, return -EFBIG. + */ +static loff_t zonefs_write_check_limits(struct file *file, loff_t pos, + loff_t count) +{ + struct inode *inode = file_inode(file); + struct zonefs_inode_info *zi = ZONEFS_I(inode); + loff_t limit = rlimit(RLIMIT_FSIZE); + loff_t max_size = zi->i_max_size; + + if (limit != RLIM_INFINITY) { + if (pos >= limit) { + send_sig(SIGXFSZ, current, 0); + return -EFBIG; + } + count = min(count, limit - pos); + } + + if (!(file->f_flags & O_LARGEFILE)) + max_size = min_t(loff_t, MAX_NON_LFS, max_size); + + if (unlikely(pos >= max_size)) + return -EFBIG; + + return min(count, max_size - pos); +} + +static ssize_t zonefs_write_checks(struct kiocb *iocb, struct iov_iter *from) +{ + struct file *file = iocb->ki_filp; + struct inode *inode = file_inode(file); + struct zonefs_inode_info *zi = ZONEFS_I(inode); + loff_t count; + + if (IS_SWAPFILE(inode)) + return -ETXTBSY; + + if (!iov_iter_count(from)) + return 0; + + if ((iocb->ki_flags & IOCB_NOWAIT) && !(iocb->ki_flags & IOCB_DIRECT)) + return -EINVAL; + + if (iocb->ki_flags & IOCB_APPEND) { + if (zi->i_ztype != ZONEFS_ZTYPE_SEQ) + return -EINVAL; + mutex_lock(&zi->i_truncate_mutex); + iocb->ki_pos = zi->i_wpoffset; + mutex_unlock(&zi->i_truncate_mutex); + } + + count = zonefs_write_check_limits(file, iocb->ki_pos, + iov_iter_count(from)); + if (count < 0) + return count; + + iov_iter_truncate(from, count); + return iov_iter_count(from); +} + +/* + * Handle direct writes. For sequential zone files, this is the only possible + * write path. For these files, check that the user is issuing writes + * sequentially from the end of the file. This code assumes that the block layer + * delivers write requests to the device in sequential order. This is always the + * case if a block IO scheduler implementing the ELEVATOR_F_ZBD_SEQ_WRITE + * elevator feature is being used (e.g. mq-deadline). The block layer always + * automatically select such an elevator for zoned block devices during the + * device initialization. + */ +static ssize_t zonefs_file_dio_write(struct kiocb *iocb, struct iov_iter *from) +{ + struct inode *inode = file_inode(iocb->ki_filp); + struct zonefs_inode_info *zi = ZONEFS_I(inode); + struct super_block *sb = inode->i_sb; + bool sync = is_sync_kiocb(iocb); + bool append = false; + ssize_t ret, count; + + /* + * For async direct IOs to sequential zone files, refuse IOCB_NOWAIT + * as this can cause write reordering (e.g. the first aio gets EAGAIN + * on the inode lock but the second goes through but is now unaligned). + */ + if (zi->i_ztype == ZONEFS_ZTYPE_SEQ && !sync && + (iocb->ki_flags & IOCB_NOWAIT)) + return -EOPNOTSUPP; + + if (iocb->ki_flags & IOCB_NOWAIT) { + if (!inode_trylock(inode)) + return -EAGAIN; + } else { + inode_lock(inode); + } + + count = zonefs_write_checks(iocb, from); + if (count <= 0) { + ret = count; + goto inode_unlock; + } + + if ((iocb->ki_pos | count) & (sb->s_blocksize - 1)) { + ret = -EINVAL; + goto inode_unlock; + } + + /* Enforce sequential writes (append only) in sequential zones */ + if (zi->i_ztype == ZONEFS_ZTYPE_SEQ) { + mutex_lock(&zi->i_truncate_mutex); + if (iocb->ki_pos != zi->i_wpoffset) { + mutex_unlock(&zi->i_truncate_mutex); + ret = -EINVAL; + goto inode_unlock; + } + mutex_unlock(&zi->i_truncate_mutex); + append = sync; + } + + if (append) + ret = zonefs_file_dio_append(iocb, from); + else + ret = iomap_dio_rw(iocb, from, &zonefs_write_iomap_ops, + &zonefs_write_dio_ops, 0, NULL, 0); + if (zi->i_ztype == ZONEFS_ZTYPE_SEQ && + (ret > 0 || ret == -EIOCBQUEUED)) { + if (ret > 0) + count = ret; + + /* + * Update the zone write pointer offset assuming the write + * operation succeeded. If it did not, the error recovery path + * will correct it. Also do active seq file accounting. + */ + mutex_lock(&zi->i_truncate_mutex); + zi->i_wpoffset += count; + zonefs_account_active(inode); + mutex_unlock(&zi->i_truncate_mutex); + } + +inode_unlock: + inode_unlock(inode); + + return ret; +} + +static ssize_t zonefs_file_buffered_write(struct kiocb *iocb, + struct iov_iter *from) +{ + struct inode *inode = file_inode(iocb->ki_filp); + struct zonefs_inode_info *zi = ZONEFS_I(inode); + ssize_t ret; + + /* + * Direct IO writes are mandatory for sequential zone files so that the + * write IO issuing order is preserved. + */ + if (zi->i_ztype != ZONEFS_ZTYPE_CNV) + return -EIO; + + if (iocb->ki_flags & IOCB_NOWAIT) { + if (!inode_trylock(inode)) + return -EAGAIN; + } else { + inode_lock(inode); + } + + ret = zonefs_write_checks(iocb, from); + if (ret <= 0) + goto inode_unlock; + + ret = iomap_file_buffered_write(iocb, from, &zonefs_write_iomap_ops); + if (ret > 0) + iocb->ki_pos += ret; + else if (ret == -EIO) + zonefs_io_error(inode, true); + +inode_unlock: + inode_unlock(inode); + if (ret > 0) + ret = generic_write_sync(iocb, ret); + + return ret; +} + +static ssize_t zonefs_file_write_iter(struct kiocb *iocb, struct iov_iter *from) +{ + struct inode *inode = file_inode(iocb->ki_filp); + + if (unlikely(IS_IMMUTABLE(inode))) + return -EPERM; + + if (sb_rdonly(inode->i_sb)) + return -EROFS; + + /* Write operations beyond the zone size are not allowed */ + if (iocb->ki_pos >= ZONEFS_I(inode)->i_max_size) + return -EFBIG; + + if (iocb->ki_flags & IOCB_DIRECT) { + ssize_t ret = zonefs_file_dio_write(iocb, from); + + if (ret != -ENOTBLK) + return ret; + } + + return zonefs_file_buffered_write(iocb, from); +} + +static int zonefs_file_read_dio_end_io(struct kiocb *iocb, ssize_t size, + int error, unsigned int flags) +{ + if (error) { + zonefs_io_error(file_inode(iocb->ki_filp), false); + return error; + } + + return 0; +} + +static const struct iomap_dio_ops zonefs_read_dio_ops = { + .end_io = zonefs_file_read_dio_end_io, +}; + +static ssize_t zonefs_file_read_iter(struct kiocb *iocb, struct iov_iter *to) +{ + struct inode *inode = file_inode(iocb->ki_filp); + struct zonefs_inode_info *zi = ZONEFS_I(inode); + struct super_block *sb = inode->i_sb; + loff_t isize; + ssize_t ret; + + /* Offline zones cannot be read */ + if (unlikely(IS_IMMUTABLE(inode) && !(inode->i_mode & 0777))) + return -EPERM; + + if (iocb->ki_pos >= zi->i_max_size) + return 0; + + if (iocb->ki_flags & IOCB_NOWAIT) { + if (!inode_trylock_shared(inode)) + return -EAGAIN; + } else { + inode_lock_shared(inode); + } + + /* Limit read operations to written data */ + mutex_lock(&zi->i_truncate_mutex); + isize = i_size_read(inode); + if (iocb->ki_pos >= isize) { + mutex_unlock(&zi->i_truncate_mutex); + ret = 0; + goto inode_unlock; + } + iov_iter_truncate(to, isize - iocb->ki_pos); + mutex_unlock(&zi->i_truncate_mutex); + + if (iocb->ki_flags & IOCB_DIRECT) { + size_t count = iov_iter_count(to); + + if ((iocb->ki_pos | count) & (sb->s_blocksize - 1)) { + ret = -EINVAL; + goto inode_unlock; + } + file_accessed(iocb->ki_filp); + ret = iomap_dio_rw(iocb, to, &zonefs_read_iomap_ops, + &zonefs_read_dio_ops, 0, NULL, 0); + } else { + ret = generic_file_read_iter(iocb, to); + if (ret == -EIO) + zonefs_io_error(inode, false); + } + +inode_unlock: + inode_unlock_shared(inode); + + return ret; +} + +/* + * Write open accounting is done only for sequential files. + */ +static inline bool zonefs_seq_file_need_wro(struct inode *inode, + struct file *file) +{ + struct zonefs_inode_info *zi = ZONEFS_I(inode); + + if (zi->i_ztype != ZONEFS_ZTYPE_SEQ) + return false; + + if (!(file->f_mode & FMODE_WRITE)) + return false; + + return true; +} + +static int zonefs_seq_file_write_open(struct inode *inode) +{ + struct zonefs_inode_info *zi = ZONEFS_I(inode); + int ret = 0; + + mutex_lock(&zi->i_truncate_mutex); + + if (!zi->i_wr_refcnt) { + struct zonefs_sb_info *sbi = ZONEFS_SB(inode->i_sb); + unsigned int wro = atomic_inc_return(&sbi->s_wro_seq_files); + + if (sbi->s_mount_opts & ZONEFS_MNTOPT_EXPLICIT_OPEN) { + + if (sbi->s_max_wro_seq_files + && wro > sbi->s_max_wro_seq_files) { + atomic_dec(&sbi->s_wro_seq_files); + ret = -EBUSY; + goto unlock; + } + + if (i_size_read(inode) < zi->i_max_size) { + ret = zonefs_zone_mgmt(inode, REQ_OP_ZONE_OPEN); + if (ret) { + atomic_dec(&sbi->s_wro_seq_files); + goto unlock; + } + zi->i_flags |= ZONEFS_ZONE_OPEN; + zonefs_account_active(inode); + } + } + } + + zi->i_wr_refcnt++; + +unlock: + mutex_unlock(&zi->i_truncate_mutex); + + return ret; +} + +static int zonefs_file_open(struct inode *inode, struct file *file) +{ + int ret; + + ret = generic_file_open(inode, file); + if (ret) + return ret; + + if (zonefs_seq_file_need_wro(inode, file)) + return zonefs_seq_file_write_open(inode); + + return 0; +} + +static void zonefs_seq_file_write_close(struct inode *inode) +{ + struct zonefs_inode_info *zi = ZONEFS_I(inode); + struct super_block *sb = inode->i_sb; + struct zonefs_sb_info *sbi = ZONEFS_SB(sb); + int ret = 0; + + mutex_lock(&zi->i_truncate_mutex); + + zi->i_wr_refcnt--; + if (zi->i_wr_refcnt) + goto unlock; + + /* + * The file zone may not be open anymore (e.g. the file was truncated to + * its maximum size or it was fully written). For this case, we only + * need to decrement the write open count. + */ + if (zi->i_flags & ZONEFS_ZONE_OPEN) { + ret = zonefs_zone_mgmt(inode, REQ_OP_ZONE_CLOSE); + if (ret) { + __zonefs_io_error(inode, false); + /* + * Leaving zones explicitly open may lead to a state + * where most zones cannot be written (zone resources + * exhausted). So take preventive action by remounting + * read-only. + */ + if (zi->i_flags & ZONEFS_ZONE_OPEN && + !(sb->s_flags & SB_RDONLY)) { + zonefs_warn(sb, + "closing zone at %llu failed %d\n", + zi->i_zsector, ret); + zonefs_warn(sb, + "remounting filesystem read-only\n"); + sb->s_flags |= SB_RDONLY; + } + goto unlock; + } + + zi->i_flags &= ~ZONEFS_ZONE_OPEN; + zonefs_account_active(inode); + } + + atomic_dec(&sbi->s_wro_seq_files); + +unlock: + mutex_unlock(&zi->i_truncate_mutex); +} + +static int zonefs_file_release(struct inode *inode, struct file *file) +{ + /* + * If we explicitly open a zone we must close it again as well, but the + * zone management operation can fail (either due to an IO error or as + * the zone has gone offline or read-only). Make sure we don't fail the + * close(2) for user-space. + */ + if (zonefs_seq_file_need_wro(inode, file)) + zonefs_seq_file_write_close(inode); + + return 0; +} + +const struct file_operations zonefs_file_operations = { + .open = zonefs_file_open, + .release = zonefs_file_release, + .fsync = zonefs_file_fsync, + .mmap = zonefs_file_mmap, + .llseek = zonefs_file_llseek, + .read_iter = zonefs_file_read_iter, + .write_iter = zonefs_file_write_iter, + .splice_read = generic_file_splice_read, + .splice_write = iter_file_splice_write, + .iopoll = iocb_bio_iopoll, +}; diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c index a9c5c3f720adf..e808276b88018 100644 --- a/fs/zonefs/super.c +++ b/fs/zonefs/super.c @@ -30,7 +30,7 @@ /* * Manage the active zone count. Called with zi->i_truncate_mutex held. */ -static void zonefs_account_active(struct inode *inode) +void zonefs_account_active(struct inode *inode) { struct zonefs_sb_info *sbi = ZONEFS_SB(inode->i_sb); struct zonefs_inode_info *zi = ZONEFS_I(inode); @@ -68,7 +68,7 @@ static void zonefs_account_active(struct inode *inode) } }
-static inline int zonefs_zone_mgmt(struct inode *inode, enum req_op op) +int zonefs_zone_mgmt(struct inode *inode, enum req_op op) { struct zonefs_inode_info *zi = ZONEFS_I(inode); int ret; @@ -99,7 +99,7 @@ static inline int zonefs_zone_mgmt(struct inode *inode, enum req_op op) return 0; }
-static inline void zonefs_i_size_write(struct inode *inode, loff_t isize) +void zonefs_i_size_write(struct inode *inode, loff_t isize) { struct zonefs_inode_info *zi = ZONEFS_I(inode);
@@ -117,167 +117,7 @@ static inline void zonefs_i_size_write(struct inode *inode, loff_t isize) } }
-static int zonefs_read_iomap_begin(struct inode *inode, loff_t offset, - loff_t length, unsigned int flags, - struct iomap *iomap, struct iomap *srcmap) -{ - struct zonefs_inode_info *zi = ZONEFS_I(inode); - struct super_block *sb = inode->i_sb; - loff_t isize; - - /* - * All blocks are always mapped below EOF. If reading past EOF, - * act as if there is a hole up to the file maximum size. - */ - mutex_lock(&zi->i_truncate_mutex); - iomap->bdev = inode->i_sb->s_bdev; - iomap->offset = ALIGN_DOWN(offset, sb->s_blocksize); - isize = i_size_read(inode); - if (iomap->offset >= isize) { - iomap->type = IOMAP_HOLE; - iomap->addr = IOMAP_NULL_ADDR; - iomap->length = length; - } else { - iomap->type = IOMAP_MAPPED; - iomap->addr = (zi->i_zsector << SECTOR_SHIFT) + iomap->offset; - iomap->length = isize - iomap->offset; - } - mutex_unlock(&zi->i_truncate_mutex); - - trace_zonefs_iomap_begin(inode, iomap); - - return 0; -} - -static const struct iomap_ops zonefs_read_iomap_ops = { - .iomap_begin = zonefs_read_iomap_begin, -}; - -static int zonefs_write_iomap_begin(struct inode *inode, loff_t offset, - loff_t length, unsigned int flags, - struct iomap *iomap, struct iomap *srcmap) -{ - struct zonefs_inode_info *zi = ZONEFS_I(inode); - struct super_block *sb = inode->i_sb; - loff_t isize; - - /* All write I/Os should always be within the file maximum size */ - if (WARN_ON_ONCE(offset + length > zi->i_max_size)) - return -EIO; - - /* - * Sequential zones can only accept direct writes. This is already - * checked when writes are issued, so warn if we see a page writeback - * operation. - */ - if (WARN_ON_ONCE(zi->i_ztype == ZONEFS_ZTYPE_SEQ && - !(flags & IOMAP_DIRECT))) - return -EIO; - - /* - * For conventional zones, all blocks are always mapped. For sequential - * zones, all blocks after always mapped below the inode size (zone - * write pointer) and unwriten beyond. - */ - mutex_lock(&zi->i_truncate_mutex); - iomap->bdev = inode->i_sb->s_bdev; - iomap->offset = ALIGN_DOWN(offset, sb->s_blocksize); - iomap->addr = (zi->i_zsector << SECTOR_SHIFT) + iomap->offset; - isize = i_size_read(inode); - if (iomap->offset >= isize) { - iomap->type = IOMAP_UNWRITTEN; - iomap->length = zi->i_max_size - iomap->offset; - } else { - iomap->type = IOMAP_MAPPED; - iomap->length = isize - iomap->offset; - } - mutex_unlock(&zi->i_truncate_mutex); - - trace_zonefs_iomap_begin(inode, iomap); - - return 0; -} - -static const struct iomap_ops zonefs_write_iomap_ops = { - .iomap_begin = zonefs_write_iomap_begin, -}; - -static int zonefs_read_folio(struct file *unused, struct folio *folio) -{ - return iomap_read_folio(folio, &zonefs_read_iomap_ops); -} - -static void zonefs_readahead(struct readahead_control *rac) -{ - iomap_readahead(rac, &zonefs_read_iomap_ops); -} - -/* - * Map blocks for page writeback. This is used only on conventional zone files, - * which implies that the page range can only be within the fixed inode size. - */ -static int zonefs_write_map_blocks(struct iomap_writepage_ctx *wpc, - struct inode *inode, loff_t offset) -{ - struct zonefs_inode_info *zi = ZONEFS_I(inode); - - if (WARN_ON_ONCE(zi->i_ztype != ZONEFS_ZTYPE_CNV)) - return -EIO; - if (WARN_ON_ONCE(offset >= i_size_read(inode))) - return -EIO; - - /* If the mapping is already OK, nothing needs to be done */ - if (offset >= wpc->iomap.offset && - offset < wpc->iomap.offset + wpc->iomap.length) - return 0; - - return zonefs_write_iomap_begin(inode, offset, zi->i_max_size - offset, - IOMAP_WRITE, &wpc->iomap, NULL); -} - -static const struct iomap_writeback_ops zonefs_writeback_ops = { - .map_blocks = zonefs_write_map_blocks, -}; - -static int zonefs_writepages(struct address_space *mapping, - struct writeback_control *wbc) -{ - struct iomap_writepage_ctx wpc = { }; - - return iomap_writepages(mapping, wbc, &wpc, &zonefs_writeback_ops); -} - -static int zonefs_swap_activate(struct swap_info_struct *sis, - struct file *swap_file, sector_t *span) -{ - struct inode *inode = file_inode(swap_file); - struct zonefs_inode_info *zi = ZONEFS_I(inode); - - if (zi->i_ztype != ZONEFS_ZTYPE_CNV) { - zonefs_err(inode->i_sb, - "swap file: not a conventional zone file\n"); - return -EINVAL; - } - - return iomap_swapfile_activate(sis, swap_file, span, - &zonefs_read_iomap_ops); -} - -static const struct address_space_operations zonefs_file_aops = { - .read_folio = zonefs_read_folio, - .readahead = zonefs_readahead, - .writepages = zonefs_writepages, - .dirty_folio = filemap_dirty_folio, - .release_folio = iomap_release_folio, - .invalidate_folio = iomap_invalidate_folio, - .migrate_folio = filemap_migrate_folio, - .is_partially_uptodate = iomap_is_partially_uptodate, - .error_remove_page = generic_error_remove_page, - .direct_IO = noop_direct_IO, - .swap_activate = zonefs_swap_activate, -}; - -static void zonefs_update_stats(struct inode *inode, loff_t new_isize) +void zonefs_update_stats(struct inode *inode, loff_t new_isize) { struct super_block *sb = inode->i_sb; struct zonefs_sb_info *sbi = ZONEFS_SB(sb); @@ -487,7 +327,7 @@ static int zonefs_io_error_cb(struct blk_zone *zone, unsigned int idx, * eventually correct the file size and zonefs inode write pointer offset * (which can be out of sync with the drive due to partial write failures). */ -static void __zonefs_io_error(struct inode *inode, bool write) +void __zonefs_io_error(struct inode *inode, bool write) { struct zonefs_inode_info *zi = ZONEFS_I(inode); struct super_block *sb = inode->i_sb; @@ -526,749 +366,6 @@ static void __zonefs_io_error(struct inode *inode, bool write) memalloc_noio_restore(noio_flag); }
-static void zonefs_io_error(struct inode *inode, bool write) -{ - struct zonefs_inode_info *zi = ZONEFS_I(inode); - - mutex_lock(&zi->i_truncate_mutex); - __zonefs_io_error(inode, write); - mutex_unlock(&zi->i_truncate_mutex); -} - -static int zonefs_file_truncate(struct inode *inode, loff_t isize) -{ - struct zonefs_inode_info *zi = ZONEFS_I(inode); - loff_t old_isize; - enum req_op op; - int ret = 0; - - /* - * Only sequential zone files can be truncated and truncation is allowed - * only down to a 0 size, which is equivalent to a zone reset, and to - * the maximum file size, which is equivalent to a zone finish. - */ - if (zi->i_ztype != ZONEFS_ZTYPE_SEQ) - return -EPERM; - - if (!isize) - op = REQ_OP_ZONE_RESET; - else if (isize == zi->i_max_size) - op = REQ_OP_ZONE_FINISH; - else - return -EPERM; - - inode_dio_wait(inode); - - /* Serialize against page faults */ - filemap_invalidate_lock(inode->i_mapping); - - /* Serialize against zonefs_iomap_begin() */ - mutex_lock(&zi->i_truncate_mutex); - - old_isize = i_size_read(inode); - if (isize == old_isize) - goto unlock; - - ret = zonefs_zone_mgmt(inode, op); - if (ret) - goto unlock; - - /* - * If the mount option ZONEFS_MNTOPT_EXPLICIT_OPEN is set, - * take care of open zones. - */ - if (zi->i_flags & ZONEFS_ZONE_OPEN) { - /* - * Truncating a zone to EMPTY or FULL is the equivalent of - * closing the zone. For a truncation to 0, we need to - * re-open the zone to ensure new writes can be processed. - * For a truncation to the maximum file size, the zone is - * closed and writes cannot be accepted anymore, so clear - * the open flag. - */ - if (!isize) - ret = zonefs_zone_mgmt(inode, REQ_OP_ZONE_OPEN); - else - zi->i_flags &= ~ZONEFS_ZONE_OPEN; - } - - zonefs_update_stats(inode, isize); - truncate_setsize(inode, isize); - zi->i_wpoffset = isize; - zonefs_account_active(inode); - -unlock: - mutex_unlock(&zi->i_truncate_mutex); - filemap_invalidate_unlock(inode->i_mapping); - - return ret; -} - -static int zonefs_inode_setattr(struct user_namespace *mnt_userns, - struct dentry *dentry, struct iattr *iattr) -{ - struct inode *inode = d_inode(dentry); - int ret; - - if (unlikely(IS_IMMUTABLE(inode))) - return -EPERM; - - ret = setattr_prepare(&init_user_ns, dentry, iattr); - if (ret) - return ret; - - /* - * Since files and directories cannot be created nor deleted, do not - * allow setting any write attributes on the sub-directories grouping - * files by zone type. - */ - if ((iattr->ia_valid & ATTR_MODE) && S_ISDIR(inode->i_mode) && - (iattr->ia_mode & 0222)) - return -EPERM; - - if (((iattr->ia_valid & ATTR_UID) && - !uid_eq(iattr->ia_uid, inode->i_uid)) || - ((iattr->ia_valid & ATTR_GID) && - !gid_eq(iattr->ia_gid, inode->i_gid))) { - ret = dquot_transfer(mnt_userns, inode, iattr); - if (ret) - return ret; - } - - if (iattr->ia_valid & ATTR_SIZE) { - ret = zonefs_file_truncate(inode, iattr->ia_size); - if (ret) - return ret; - } - - setattr_copy(&init_user_ns, inode, iattr); - - return 0; -} - -static const struct inode_operations zonefs_file_inode_operations = { - .setattr = zonefs_inode_setattr, -}; - -static int zonefs_file_fsync(struct file *file, loff_t start, loff_t end, - int datasync) -{ - struct inode *inode = file_inode(file); - int ret = 0; - - if (unlikely(IS_IMMUTABLE(inode))) - return -EPERM; - - /* - * Since only direct writes are allowed in sequential files, page cache - * flush is needed only for conventional zone files. - */ - if (ZONEFS_I(inode)->i_ztype == ZONEFS_ZTYPE_CNV) - ret = file_write_and_wait_range(file, start, end); - if (!ret) - ret = blkdev_issue_flush(inode->i_sb->s_bdev); - - if (ret) - zonefs_io_error(inode, true); - - return ret; -} - -static vm_fault_t zonefs_filemap_page_mkwrite(struct vm_fault *vmf) -{ - struct inode *inode = file_inode(vmf->vma->vm_file); - struct zonefs_inode_info *zi = ZONEFS_I(inode); - vm_fault_t ret; - - if (unlikely(IS_IMMUTABLE(inode))) - return VM_FAULT_SIGBUS; - - /* - * Sanity check: only conventional zone files can have shared - * writeable mappings. - */ - if (WARN_ON_ONCE(zi->i_ztype != ZONEFS_ZTYPE_CNV)) - return VM_FAULT_NOPAGE; - - sb_start_pagefault(inode->i_sb); - file_update_time(vmf->vma->vm_file); - - /* Serialize against truncates */ - filemap_invalidate_lock_shared(inode->i_mapping); - ret = iomap_page_mkwrite(vmf, &zonefs_write_iomap_ops); - filemap_invalidate_unlock_shared(inode->i_mapping); - - sb_end_pagefault(inode->i_sb); - return ret; -} - -static const struct vm_operations_struct zonefs_file_vm_ops = { - .fault = filemap_fault, - .map_pages = filemap_map_pages, - .page_mkwrite = zonefs_filemap_page_mkwrite, -}; - -static int zonefs_file_mmap(struct file *file, struct vm_area_struct *vma) -{ - /* - * Conventional zones accept random writes, so their files can support - * shared writable mappings. For sequential zone files, only read - * mappings are possible since there are no guarantees for write - * ordering between msync() and page cache writeback. - */ - if (ZONEFS_I(file_inode(file))->i_ztype == ZONEFS_ZTYPE_SEQ && - (vma->vm_flags & VM_SHARED) && (vma->vm_flags & VM_MAYWRITE)) - return -EINVAL; - - file_accessed(file); - vma->vm_ops = &zonefs_file_vm_ops; - - return 0; -} - -static loff_t zonefs_file_llseek(struct file *file, loff_t offset, int whence) -{ - loff_t isize = i_size_read(file_inode(file)); - - /* - * Seeks are limited to below the zone size for conventional zones - * and below the zone write pointer for sequential zones. In both - * cases, this limit is the inode size. - */ - return generic_file_llseek_size(file, offset, whence, isize, isize); -} - -static int zonefs_file_write_dio_end_io(struct kiocb *iocb, ssize_t size, - int error, unsigned int flags) -{ - struct inode *inode = file_inode(iocb->ki_filp); - struct zonefs_inode_info *zi = ZONEFS_I(inode); - - if (error) { - zonefs_io_error(inode, true); - return error; - } - - if (size && zi->i_ztype != ZONEFS_ZTYPE_CNV) { - /* - * Note that we may be seeing completions out of order, - * but that is not a problem since a write completed - * successfully necessarily means that all preceding writes - * were also successful. So we can safely increase the inode - * size to the write end location. - */ - mutex_lock(&zi->i_truncate_mutex); - if (i_size_read(inode) < iocb->ki_pos + size) { - zonefs_update_stats(inode, iocb->ki_pos + size); - zonefs_i_size_write(inode, iocb->ki_pos + size); - } - mutex_unlock(&zi->i_truncate_mutex); - } - - return 0; -} - -static const struct iomap_dio_ops zonefs_write_dio_ops = { - .end_io = zonefs_file_write_dio_end_io, -}; - -static ssize_t zonefs_file_dio_append(struct kiocb *iocb, struct iov_iter *from) -{ - struct inode *inode = file_inode(iocb->ki_filp); - struct zonefs_inode_info *zi = ZONEFS_I(inode); - struct block_device *bdev = inode->i_sb->s_bdev; - unsigned int max = bdev_max_zone_append_sectors(bdev); - struct bio *bio; - ssize_t size; - int nr_pages; - ssize_t ret; - - max = ALIGN_DOWN(max << SECTOR_SHIFT, inode->i_sb->s_blocksize); - iov_iter_truncate(from, max); - - nr_pages = iov_iter_npages(from, BIO_MAX_VECS); - if (!nr_pages) - return 0; - - bio = bio_alloc(bdev, nr_pages, - REQ_OP_ZONE_APPEND | REQ_SYNC | REQ_IDLE, GFP_NOFS); - bio->bi_iter.bi_sector = zi->i_zsector; - bio->bi_ioprio = iocb->ki_ioprio; - if (iocb_is_dsync(iocb)) - bio->bi_opf |= REQ_FUA; - - ret = bio_iov_iter_get_pages(bio, from); - if (unlikely(ret)) - goto out_release; - - size = bio->bi_iter.bi_size; - task_io_account_write(size); - - if (iocb->ki_flags & IOCB_HIPRI) - bio_set_polled(bio, iocb); - - ret = submit_bio_wait(bio); - - /* - * If the file zone was written underneath the file system, the zone - * write pointer may not be where we expect it to be, but the zone - * append write can still succeed. So check manually that we wrote where - * we intended to, that is, at zi->i_wpoffset. - */ - if (!ret) { - sector_t wpsector = - zi->i_zsector + (zi->i_wpoffset >> SECTOR_SHIFT); - - if (bio->bi_iter.bi_sector != wpsector) { - zonefs_warn(inode->i_sb, - "Corrupted write pointer %llu for zone at %llu\n", - wpsector, zi->i_zsector); - ret = -EIO; - } - } - - zonefs_file_write_dio_end_io(iocb, size, ret, 0); - trace_zonefs_file_dio_append(inode, size, ret); - -out_release: - bio_release_pages(bio, false); - bio_put(bio); - - if (ret >= 0) { - iocb->ki_pos += size; - return size; - } - - return ret; -} - -/* - * Do not exceed the LFS limits nor the file zone size. If pos is under the - * limit it becomes a short access. If it exceeds the limit, return -EFBIG. - */ -static loff_t zonefs_write_check_limits(struct file *file, loff_t pos, - loff_t count) -{ - struct inode *inode = file_inode(file); - struct zonefs_inode_info *zi = ZONEFS_I(inode); - loff_t limit = rlimit(RLIMIT_FSIZE); - loff_t max_size = zi->i_max_size; - - if (limit != RLIM_INFINITY) { - if (pos >= limit) { - send_sig(SIGXFSZ, current, 0); - return -EFBIG; - } - count = min(count, limit - pos); - } - - if (!(file->f_flags & O_LARGEFILE)) - max_size = min_t(loff_t, MAX_NON_LFS, max_size); - - if (unlikely(pos >= max_size)) - return -EFBIG; - - return min(count, max_size - pos); -} - -static ssize_t zonefs_write_checks(struct kiocb *iocb, struct iov_iter *from) -{ - struct file *file = iocb->ki_filp; - struct inode *inode = file_inode(file); - struct zonefs_inode_info *zi = ZONEFS_I(inode); - loff_t count; - - if (IS_SWAPFILE(inode)) - return -ETXTBSY; - - if (!iov_iter_count(from)) - return 0; - - if ((iocb->ki_flags & IOCB_NOWAIT) && !(iocb->ki_flags & IOCB_DIRECT)) - return -EINVAL; - - if (iocb->ki_flags & IOCB_APPEND) { - if (zi->i_ztype != ZONEFS_ZTYPE_SEQ) - return -EINVAL; - mutex_lock(&zi->i_truncate_mutex); - iocb->ki_pos = zi->i_wpoffset; - mutex_unlock(&zi->i_truncate_mutex); - } - - count = zonefs_write_check_limits(file, iocb->ki_pos, - iov_iter_count(from)); - if (count < 0) - return count; - - iov_iter_truncate(from, count); - return iov_iter_count(from); -} - -/* - * Handle direct writes. For sequential zone files, this is the only possible - * write path. For these files, check that the user is issuing writes - * sequentially from the end of the file. This code assumes that the block layer - * delivers write requests to the device in sequential order. This is always the - * case if a block IO scheduler implementing the ELEVATOR_F_ZBD_SEQ_WRITE - * elevator feature is being used (e.g. mq-deadline). The block layer always - * automatically select such an elevator for zoned block devices during the - * device initialization. - */ -static ssize_t zonefs_file_dio_write(struct kiocb *iocb, struct iov_iter *from) -{ - struct inode *inode = file_inode(iocb->ki_filp); - struct zonefs_inode_info *zi = ZONEFS_I(inode); - struct super_block *sb = inode->i_sb; - bool sync = is_sync_kiocb(iocb); - bool append = false; - ssize_t ret, count; - - /* - * For async direct IOs to sequential zone files, refuse IOCB_NOWAIT - * as this can cause write reordering (e.g. the first aio gets EAGAIN - * on the inode lock but the second goes through but is now unaligned). - */ - if (zi->i_ztype == ZONEFS_ZTYPE_SEQ && !sync && - (iocb->ki_flags & IOCB_NOWAIT)) - return -EOPNOTSUPP; - - if (iocb->ki_flags & IOCB_NOWAIT) { - if (!inode_trylock(inode)) - return -EAGAIN; - } else { - inode_lock(inode); - } - - count = zonefs_write_checks(iocb, from); - if (count <= 0) { - ret = count; - goto inode_unlock; - } - - if ((iocb->ki_pos | count) & (sb->s_blocksize - 1)) { - ret = -EINVAL; - goto inode_unlock; - } - - /* Enforce sequential writes (append only) in sequential zones */ - if (zi->i_ztype == ZONEFS_ZTYPE_SEQ) { - mutex_lock(&zi->i_truncate_mutex); - if (iocb->ki_pos != zi->i_wpoffset) { - mutex_unlock(&zi->i_truncate_mutex); - ret = -EINVAL; - goto inode_unlock; - } - mutex_unlock(&zi->i_truncate_mutex); - append = sync; - } - - if (append) - ret = zonefs_file_dio_append(iocb, from); - else - ret = iomap_dio_rw(iocb, from, &zonefs_write_iomap_ops, - &zonefs_write_dio_ops, 0, NULL, 0); - if (zi->i_ztype == ZONEFS_ZTYPE_SEQ && - (ret > 0 || ret == -EIOCBQUEUED)) { - if (ret > 0) - count = ret; - - /* - * Update the zone write pointer offset assuming the write - * operation succeeded. If it did not, the error recovery path - * will correct it. Also do active seq file accounting. - */ - mutex_lock(&zi->i_truncate_mutex); - zi->i_wpoffset += count; - zonefs_account_active(inode); - mutex_unlock(&zi->i_truncate_mutex); - } - -inode_unlock: - inode_unlock(inode); - - return ret; -} - -static ssize_t zonefs_file_buffered_write(struct kiocb *iocb, - struct iov_iter *from) -{ - struct inode *inode = file_inode(iocb->ki_filp); - struct zonefs_inode_info *zi = ZONEFS_I(inode); - ssize_t ret; - - /* - * Direct IO writes are mandatory for sequential zone files so that the - * write IO issuing order is preserved. - */ - if (zi->i_ztype != ZONEFS_ZTYPE_CNV) - return -EIO; - - if (iocb->ki_flags & IOCB_NOWAIT) { - if (!inode_trylock(inode)) - return -EAGAIN; - } else { - inode_lock(inode); - } - - ret = zonefs_write_checks(iocb, from); - if (ret <= 0) - goto inode_unlock; - - ret = iomap_file_buffered_write(iocb, from, &zonefs_write_iomap_ops); - if (ret > 0) - iocb->ki_pos += ret; - else if (ret == -EIO) - zonefs_io_error(inode, true); - -inode_unlock: - inode_unlock(inode); - if (ret > 0) - ret = generic_write_sync(iocb, ret); - - return ret; -} - -static ssize_t zonefs_file_write_iter(struct kiocb *iocb, struct iov_iter *from) -{ - struct inode *inode = file_inode(iocb->ki_filp); - - if (unlikely(IS_IMMUTABLE(inode))) - return -EPERM; - - if (sb_rdonly(inode->i_sb)) - return -EROFS; - - /* Write operations beyond the zone size are not allowed */ - if (iocb->ki_pos >= ZONEFS_I(inode)->i_max_size) - return -EFBIG; - - if (iocb->ki_flags & IOCB_DIRECT) { - ssize_t ret = zonefs_file_dio_write(iocb, from); - if (ret != -ENOTBLK) - return ret; - } - - return zonefs_file_buffered_write(iocb, from); -} - -static int zonefs_file_read_dio_end_io(struct kiocb *iocb, ssize_t size, - int error, unsigned int flags) -{ - if (error) { - zonefs_io_error(file_inode(iocb->ki_filp), false); - return error; - } - - return 0; -} - -static const struct iomap_dio_ops zonefs_read_dio_ops = { - .end_io = zonefs_file_read_dio_end_io, -}; - -static ssize_t zonefs_file_read_iter(struct kiocb *iocb, struct iov_iter *to) -{ - struct inode *inode = file_inode(iocb->ki_filp); - struct zonefs_inode_info *zi = ZONEFS_I(inode); - struct super_block *sb = inode->i_sb; - loff_t isize; - ssize_t ret; - - /* Offline zones cannot be read */ - if (unlikely(IS_IMMUTABLE(inode) && !(inode->i_mode & 0777))) - return -EPERM; - - if (iocb->ki_pos >= zi->i_max_size) - return 0; - - if (iocb->ki_flags & IOCB_NOWAIT) { - if (!inode_trylock_shared(inode)) - return -EAGAIN; - } else { - inode_lock_shared(inode); - } - - /* Limit read operations to written data */ - mutex_lock(&zi->i_truncate_mutex); - isize = i_size_read(inode); - if (iocb->ki_pos >= isize) { - mutex_unlock(&zi->i_truncate_mutex); - ret = 0; - goto inode_unlock; - } - iov_iter_truncate(to, isize - iocb->ki_pos); - mutex_unlock(&zi->i_truncate_mutex); - - if (iocb->ki_flags & IOCB_DIRECT) { - size_t count = iov_iter_count(to); - - if ((iocb->ki_pos | count) & (sb->s_blocksize - 1)) { - ret = -EINVAL; - goto inode_unlock; - } - file_accessed(iocb->ki_filp); - ret = iomap_dio_rw(iocb, to, &zonefs_read_iomap_ops, - &zonefs_read_dio_ops, 0, NULL, 0); - } else { - ret = generic_file_read_iter(iocb, to); - if (ret == -EIO) - zonefs_io_error(inode, false); - } - -inode_unlock: - inode_unlock_shared(inode); - - return ret; -} - -/* - * Write open accounting is done only for sequential files. - */ -static inline bool zonefs_seq_file_need_wro(struct inode *inode, - struct file *file) -{ - struct zonefs_inode_info *zi = ZONEFS_I(inode); - - if (zi->i_ztype != ZONEFS_ZTYPE_SEQ) - return false; - - if (!(file->f_mode & FMODE_WRITE)) - return false; - - return true; -} - -static int zonefs_seq_file_write_open(struct inode *inode) -{ - struct zonefs_inode_info *zi = ZONEFS_I(inode); - int ret = 0; - - mutex_lock(&zi->i_truncate_mutex); - - if (!zi->i_wr_refcnt) { - struct zonefs_sb_info *sbi = ZONEFS_SB(inode->i_sb); - unsigned int wro = atomic_inc_return(&sbi->s_wro_seq_files); - - if (sbi->s_mount_opts & ZONEFS_MNTOPT_EXPLICIT_OPEN) { - - if (sbi->s_max_wro_seq_files - && wro > sbi->s_max_wro_seq_files) { - atomic_dec(&sbi->s_wro_seq_files); - ret = -EBUSY; - goto unlock; - } - - if (i_size_read(inode) < zi->i_max_size) { - ret = zonefs_zone_mgmt(inode, REQ_OP_ZONE_OPEN); - if (ret) { - atomic_dec(&sbi->s_wro_seq_files); - goto unlock; - } - zi->i_flags |= ZONEFS_ZONE_OPEN; - zonefs_account_active(inode); - } - } - } - - zi->i_wr_refcnt++; - -unlock: - mutex_unlock(&zi->i_truncate_mutex); - - return ret; -} - -static int zonefs_file_open(struct inode *inode, struct file *file) -{ - int ret; - - ret = generic_file_open(inode, file); - if (ret) - return ret; - - if (zonefs_seq_file_need_wro(inode, file)) - return zonefs_seq_file_write_open(inode); - - return 0; -} - -static void zonefs_seq_file_write_close(struct inode *inode) -{ - struct zonefs_inode_info *zi = ZONEFS_I(inode); - struct super_block *sb = inode->i_sb; - struct zonefs_sb_info *sbi = ZONEFS_SB(sb); - int ret = 0; - - mutex_lock(&zi->i_truncate_mutex); - - zi->i_wr_refcnt--; - if (zi->i_wr_refcnt) - goto unlock; - - /* - * The file zone may not be open anymore (e.g. the file was truncated to - * its maximum size or it was fully written). For this case, we only - * need to decrement the write open count. - */ - if (zi->i_flags & ZONEFS_ZONE_OPEN) { - ret = zonefs_zone_mgmt(inode, REQ_OP_ZONE_CLOSE); - if (ret) { - __zonefs_io_error(inode, false); - /* - * Leaving zones explicitly open may lead to a state - * where most zones cannot be written (zone resources - * exhausted). So take preventive action by remounting - * read-only. - */ - if (zi->i_flags & ZONEFS_ZONE_OPEN && - !(sb->s_flags & SB_RDONLY)) { - zonefs_warn(sb, - "closing zone at %llu failed %d\n", - zi->i_zsector, ret); - zonefs_warn(sb, - "remounting filesystem read-only\n"); - sb->s_flags |= SB_RDONLY; - } - goto unlock; - } - - zi->i_flags &= ~ZONEFS_ZONE_OPEN; - zonefs_account_active(inode); - } - - atomic_dec(&sbi->s_wro_seq_files); - -unlock: - mutex_unlock(&zi->i_truncate_mutex); -} - -static int zonefs_file_release(struct inode *inode, struct file *file) -{ - /* - * If we explicitly open a zone we must close it again as well, but the - * zone management operation can fail (either due to an IO error or as - * the zone has gone offline or read-only). Make sure we don't fail the - * close(2) for user-space. - */ - if (zonefs_seq_file_need_wro(inode, file)) - zonefs_seq_file_write_close(inode); - - return 0; -} - -static const struct file_operations zonefs_file_operations = { - .open = zonefs_file_open, - .release = zonefs_file_release, - .fsync = zonefs_file_fsync, - .mmap = zonefs_file_mmap, - .llseek = zonefs_file_llseek, - .read_iter = zonefs_file_read_iter, - .write_iter = zonefs_file_write_iter, - .splice_read = generic_file_splice_read, - .splice_write = iter_file_splice_write, - .iopoll = iocb_bio_iopoll, -}; - static struct kmem_cache *zonefs_inode_cachep;
static struct inode *zonefs_alloc_inode(struct super_block *sb) @@ -1408,13 +505,47 @@ static int zonefs_remount(struct super_block *sb, int *flags, char *data) return zonefs_parse_options(sb, data); }
-static const struct super_operations zonefs_sops = { - .alloc_inode = zonefs_alloc_inode, - .free_inode = zonefs_free_inode, - .statfs = zonefs_statfs, - .remount_fs = zonefs_remount, - .show_options = zonefs_show_options, -}; +static int zonefs_inode_setattr(struct user_namespace *mnt_userns, + struct dentry *dentry, struct iattr *iattr) +{ + struct inode *inode = d_inode(dentry); + int ret; + + if (unlikely(IS_IMMUTABLE(inode))) + return -EPERM; + + ret = setattr_prepare(&init_user_ns, dentry, iattr); + if (ret) + return ret; + + /* + * Since files and directories cannot be created nor deleted, do not + * allow setting any write attributes on the sub-directories grouping + * files by zone type. + */ + if ((iattr->ia_valid & ATTR_MODE) && S_ISDIR(inode->i_mode) && + (iattr->ia_mode & 0222)) + return -EPERM; + + if (((iattr->ia_valid & ATTR_UID) && + !uid_eq(iattr->ia_uid, inode->i_uid)) || + ((iattr->ia_valid & ATTR_GID) && + !gid_eq(iattr->ia_gid, inode->i_gid))) { + ret = dquot_transfer(mnt_userns, inode, iattr); + if (ret) + return ret; + } + + if (iattr->ia_valid & ATTR_SIZE) { + ret = zonefs_file_truncate(inode, iattr->ia_size); + if (ret) + return ret; + } + + setattr_copy(&init_user_ns, inode, iattr); + + return 0; +}
static const struct inode_operations zonefs_dir_inode_operations = { .lookup = simple_lookup, @@ -1434,6 +565,10 @@ static void zonefs_init_dir_inode(struct inode *parent, struct inode *inode, inc_nlink(parent); }
+static const struct inode_operations zonefs_file_inode_operations = { + .setattr = zonefs_inode_setattr, +}; + static int zonefs_init_file_inode(struct inode *inode, struct blk_zone *zone, enum zonefs_ztype type) { @@ -1785,6 +920,14 @@ static int zonefs_read_super(struct super_block *sb) return ret; }
+static const struct super_operations zonefs_sops = { + .alloc_inode = zonefs_alloc_inode, + .free_inode = zonefs_free_inode, + .statfs = zonefs_statfs, + .remount_fs = zonefs_remount, + .show_options = zonefs_show_options, +}; + /* * Check that the device is zoned. If it is, get the list of zones and create * sub-directories and files according to the device zone configuration and diff --git a/fs/zonefs/zonefs.h b/fs/zonefs/zonefs.h index 1dbe78119ff16..839ebe9afb6c1 100644 --- a/fs/zonefs/zonefs.h +++ b/fs/zonefs/zonefs.h @@ -209,6 +209,28 @@ static inline struct zonefs_sb_info *ZONEFS_SB(struct super_block *sb) #define zonefs_warn(sb, format, args...) \ pr_warn("zonefs (%s) WARNING: " format, sb->s_id, ## args)
+/* In super.c */ +void zonefs_account_active(struct inode *inode); +int zonefs_zone_mgmt(struct inode *inode, enum req_op op); +void zonefs_i_size_write(struct inode *inode, loff_t isize); +void zonefs_update_stats(struct inode *inode, loff_t new_isize); +void __zonefs_io_error(struct inode *inode, bool write); + +static inline void zonefs_io_error(struct inode *inode, bool write) +{ + struct zonefs_inode_info *zi = ZONEFS_I(inode); + + mutex_lock(&zi->i_truncate_mutex); + __zonefs_io_error(inode, write); + mutex_unlock(&zi->i_truncate_mutex); +} + +/* In file.c */ +extern const struct address_space_operations zonefs_file_aops; +extern const struct file_operations zonefs_file_operations; +int zonefs_file_truncate(struct inode *inode, loff_t isize); + +/* In sysfs.c */ int zonefs_sysfs_register(struct super_block *sb); void zonefs_sysfs_unregister(struct super_block *sb); int zonefs_sysfs_init(void);
From: Damien Le Moal damien.lemoal@opensource.wdc.com
[ Upstream commit 46a9c526eef7fb68a00321e2a9591ce5276ae92b ]
Simplify zonefs_check_zone_condition() by moving the code that changes an inode access rights to the new function zonefs_inode_update_mode(). Furthermore, since on mount an inode wpoffset is always zero when zonefs_check_zone_condition() is called during an inode initialization, the "mount" boolean argument is not necessary for the readonly zone case. This argument is thus removed.
zonefs_io_error_cb() is also modified to use the inode offline and zone state flags instead of checking the device zone condition. The multiple calls to zonefs_check_zone_condition() are reduced to the first call on entry, which allows removing the "warn" argument. zonefs_inode_update_mode() is also used to update an inode access rights as zonefs_io_error_cb() modifies the inode flags depending on the volume error handling mode (defined with a mount option). Since an inode mode change differs for read-only zones between mount time and IO error time, the flag ZONEFS_ZONE_INIT_MODE is used to differentiate both cases.
Signed-off-by: Damien Le Moal damien.lemoal@opensource.wdc.com Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Stable-dep-of: 88b170088ad2 ("zonefs: Fix error message in zonefs_file_dio_append()") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/zonefs/super.c | 110 ++++++++++++++++++++++++--------------------- fs/zonefs/zonefs.h | 9 ++-- 2 files changed, 64 insertions(+), 55 deletions(-)
diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c index e808276b88018..6307cc95be061 100644 --- a/fs/zonefs/super.c +++ b/fs/zonefs/super.c @@ -155,48 +155,31 @@ void zonefs_update_stats(struct inode *inode, loff_t new_isize) * amount of readable data in the zone. */ static loff_t zonefs_check_zone_condition(struct inode *inode, - struct blk_zone *zone, bool warn, - bool mount) + struct blk_zone *zone) { struct zonefs_inode_info *zi = ZONEFS_I(inode);
switch (zone->cond) { case BLK_ZONE_COND_OFFLINE: - /* - * Dead zone: make the inode immutable, disable all accesses - * and set the file size to 0 (zone wp set to zone start). - */ - if (warn) - zonefs_warn(inode->i_sb, "inode %lu: offline zone\n", - inode->i_ino); - inode->i_flags |= S_IMMUTABLE; - inode->i_mode &= ~0777; - zone->wp = zone->start; + zonefs_warn(inode->i_sb, "inode %lu: offline zone\n", + inode->i_ino); zi->i_flags |= ZONEFS_ZONE_OFFLINE; return 0; case BLK_ZONE_COND_READONLY: /* - * The write pointer of read-only zones is invalid. If such a - * zone is found during mount, the file size cannot be retrieved - * so we treat the zone as offline (mount == true case). - * Otherwise, keep the file size as it was when last updated - * so that the user can recover data. In both cases, writes are - * always disabled for the zone. + * The write pointer of read-only zones is invalid, so we cannot + * determine the zone wpoffset (inode size). We thus keep the + * zone wpoffset as is, which leads to an empty file + * (wpoffset == 0) on mount. For a runtime error, this keeps + * the inode size as it was when last updated so that the user + * can recover data. */ - if (warn) - zonefs_warn(inode->i_sb, "inode %lu: read-only zone\n", - inode->i_ino); - inode->i_flags |= S_IMMUTABLE; - if (mount) { - zone->cond = BLK_ZONE_COND_OFFLINE; - inode->i_mode &= ~0777; - zone->wp = zone->start; - zi->i_flags |= ZONEFS_ZONE_OFFLINE; - return 0; - } + zonefs_warn(inode->i_sb, "inode %lu: read-only zone\n", + inode->i_ino); zi->i_flags |= ZONEFS_ZONE_READONLY; - inode->i_mode &= ~0222; - return i_size_read(inode); + if (zi->i_ztype == ZONEFS_ZTYPE_CNV) + return zi->i_max_size; + return zi->i_wpoffset; case BLK_ZONE_COND_FULL: /* The write pointer of full zones is invalid. */ return zi->i_max_size; @@ -207,6 +190,30 @@ static loff_t zonefs_check_zone_condition(struct inode *inode, } }
+/* + * Check a zone condition and adjust its inode access permissions for + * offline and readonly zones. + */ +static void zonefs_inode_update_mode(struct inode *inode) +{ + struct zonefs_inode_info *zi = ZONEFS_I(inode); + + if (zi->i_flags & ZONEFS_ZONE_OFFLINE) { + /* Offline zones cannot be read nor written */ + inode->i_flags |= S_IMMUTABLE; + inode->i_mode &= ~0777; + } else if (zi->i_flags & ZONEFS_ZONE_READONLY) { + /* Readonly zones cannot be written */ + inode->i_flags |= S_IMMUTABLE; + if (zi->i_flags & ZONEFS_ZONE_INIT_MODE) + inode->i_mode &= ~0777; + else + inode->i_mode &= ~0222; + } + + zi->i_flags &= ~ZONEFS_ZONE_INIT_MODE; +} + struct zonefs_ioerr_data { struct inode *inode; bool write; @@ -228,10 +235,9 @@ static int zonefs_io_error_cb(struct blk_zone *zone, unsigned int idx, * as there is no inconsistency between the inode size and the amount of * data writen in the zone (data_size). */ - data_size = zonefs_check_zone_condition(inode, zone, true, false); + data_size = zonefs_check_zone_condition(inode, zone); isize = i_size_read(inode); - if (zone->cond != BLK_ZONE_COND_OFFLINE && - zone->cond != BLK_ZONE_COND_READONLY && + if (!(zi->i_flags & (ZONEFS_ZONE_READONLY | ZONEFS_ZONE_OFFLINE)) && !err->write && isize == data_size) return 0;
@@ -264,24 +270,22 @@ static int zonefs_io_error_cb(struct blk_zone *zone, unsigned int idx, * zone condition to read-only and offline respectively, as if the * condition was signaled by the hardware. */ - if (zone->cond == BLK_ZONE_COND_OFFLINE || - sbi->s_mount_opts & ZONEFS_MNTOPT_ERRORS_ZOL) { + if ((zi->i_flags & ZONEFS_ZONE_OFFLINE) || + (sbi->s_mount_opts & ZONEFS_MNTOPT_ERRORS_ZOL)) { zonefs_warn(sb, "inode %lu: read/write access disabled\n", inode->i_ino); - if (zone->cond != BLK_ZONE_COND_OFFLINE) { - zone->cond = BLK_ZONE_COND_OFFLINE; - data_size = zonefs_check_zone_condition(inode, zone, - false, false); - } - } else if (zone->cond == BLK_ZONE_COND_READONLY || - sbi->s_mount_opts & ZONEFS_MNTOPT_ERRORS_ZRO) { + if (!(zi->i_flags & ZONEFS_ZONE_OFFLINE)) + zi->i_flags |= ZONEFS_ZONE_OFFLINE; + zonefs_inode_update_mode(inode); + data_size = 0; + } else if ((zi->i_flags & ZONEFS_ZONE_READONLY) || + (sbi->s_mount_opts & ZONEFS_MNTOPT_ERRORS_ZRO)) { zonefs_warn(sb, "inode %lu: write access disabled\n", inode->i_ino); - if (zone->cond != BLK_ZONE_COND_READONLY) { - zone->cond = BLK_ZONE_COND_READONLY; - data_size = zonefs_check_zone_condition(inode, zone, - false, false); - } + if (!(zi->i_flags & ZONEFS_ZONE_READONLY)) + zi->i_flags |= ZONEFS_ZONE_READONLY; + zonefs_inode_update_mode(inode); + data_size = isize; } else if (sbi->s_mount_opts & ZONEFS_MNTOPT_ERRORS_RO && data_size > isize) { /* Do not expose garbage data */ @@ -295,8 +299,7 @@ static int zonefs_io_error_cb(struct blk_zone *zone, unsigned int idx, * close of the zone when the inode file is closed. */ if ((sbi->s_mount_opts & ZONEFS_MNTOPT_EXPLICIT_OPEN) && - (zone->cond == BLK_ZONE_COND_OFFLINE || - zone->cond == BLK_ZONE_COND_READONLY)) + (zi->i_flags & (ZONEFS_ZONE_READONLY | ZONEFS_ZONE_OFFLINE))) zi->i_flags &= ~ZONEFS_ZONE_OPEN;
/* @@ -378,6 +381,7 @@ static struct inode *zonefs_alloc_inode(struct super_block *sb)
inode_init_once(&zi->i_vnode); mutex_init(&zi->i_truncate_mutex); + zi->i_wpoffset = 0; zi->i_wr_refcnt = 0; zi->i_flags = 0;
@@ -594,7 +598,7 @@ static int zonefs_init_file_inode(struct inode *inode, struct blk_zone *zone,
zi->i_max_size = min_t(loff_t, MAX_LFS_FILESIZE, zone->capacity << SECTOR_SHIFT); - zi->i_wpoffset = zonefs_check_zone_condition(inode, zone, true, true); + zi->i_wpoffset = zonefs_check_zone_condition(inode, zone);
inode->i_uid = sbi->s_uid; inode->i_gid = sbi->s_gid; @@ -605,6 +609,10 @@ static int zonefs_init_file_inode(struct inode *inode, struct blk_zone *zone, inode->i_fop = &zonefs_file_operations; inode->i_mapping->a_ops = &zonefs_file_aops;
+ /* Update the inode access rights depending on the zone condition */ + zi->i_flags |= ZONEFS_ZONE_INIT_MODE; + zonefs_inode_update_mode(inode); + sb->s_maxbytes = max(zi->i_max_size, sb->s_maxbytes); sbi->s_blocks += zi->i_max_size >> sb->s_blocksize_bits; sbi->s_used_blocks += zi->i_wpoffset >> sb->s_blocksize_bits; diff --git a/fs/zonefs/zonefs.h b/fs/zonefs/zonefs.h index 839ebe9afb6c1..439096445ee53 100644 --- a/fs/zonefs/zonefs.h +++ b/fs/zonefs/zonefs.h @@ -39,10 +39,11 @@ static inline enum zonefs_ztype zonefs_zone_type(struct blk_zone *zone) return ZONEFS_ZTYPE_SEQ; }
-#define ZONEFS_ZONE_OPEN (1U << 0) -#define ZONEFS_ZONE_ACTIVE (1U << 1) -#define ZONEFS_ZONE_OFFLINE (1U << 2) -#define ZONEFS_ZONE_READONLY (1U << 3) +#define ZONEFS_ZONE_INIT_MODE (1U << 0) +#define ZONEFS_ZONE_OPEN (1U << 1) +#define ZONEFS_ZONE_ACTIVE (1U << 2) +#define ZONEFS_ZONE_OFFLINE (1U << 3) +#define ZONEFS_ZONE_READONLY (1U << 4)
/* * In-memory inode data.
From: Damien Le Moal damien.lemoal@opensource.wdc.com
[ Upstream commit 34422914dc00b291d1c47dbdabe93b154c2f2b25 ]
Instead of using the i_ztype field in struct zonefs_inode_info to indicate the zone type of an inode, introduce the new inode flag ZONEFS_ZONE_CNV to be set in the i_flags field of struct zonefs_inode_info to identify conventional zones. If this flag is not set, the zone of an inode is considered to be a sequential zone.
The helpers zonefs_zone_is_cnv(), zonefs_zone_is_seq(), zonefs_inode_is_cnv() and zonefs_inode_is_seq() are introduced to simplify testing the zone type of a struct zonefs_inode_info and of a struct inode.
Signed-off-by: Damien Le Moal damien.lemoal@opensource.wdc.com Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Stable-dep-of: 88b170088ad2 ("zonefs: Fix error message in zonefs_file_dio_append()") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/zonefs/file.c | 35 ++++++++++++++--------------------- fs/zonefs/super.c | 12 +++++++----- fs/zonefs/zonefs.h | 24 +++++++++++++++++++++--- 3 files changed, 42 insertions(+), 29 deletions(-)
diff --git a/fs/zonefs/file.c b/fs/zonefs/file.c index ece0f3959b6d1..64873d31d75dd 100644 --- a/fs/zonefs/file.c +++ b/fs/zonefs/file.c @@ -77,8 +77,7 @@ static int zonefs_write_iomap_begin(struct inode *inode, loff_t offset, * checked when writes are issued, so warn if we see a page writeback * operation. */ - if (WARN_ON_ONCE(zi->i_ztype == ZONEFS_ZTYPE_SEQ && - !(flags & IOMAP_DIRECT))) + if (WARN_ON_ONCE(zonefs_zone_is_seq(zi) && !(flags & IOMAP_DIRECT))) return -EIO;
/* @@ -128,7 +127,7 @@ static int zonefs_write_map_blocks(struct iomap_writepage_ctx *wpc, { struct zonefs_inode_info *zi = ZONEFS_I(inode);
- if (WARN_ON_ONCE(zi->i_ztype != ZONEFS_ZTYPE_CNV)) + if (WARN_ON_ONCE(zonefs_zone_is_seq(zi))) return -EIO; if (WARN_ON_ONCE(offset >= i_size_read(inode))) return -EIO; @@ -158,9 +157,8 @@ static int zonefs_swap_activate(struct swap_info_struct *sis, struct file *swap_file, sector_t *span) { struct inode *inode = file_inode(swap_file); - struct zonefs_inode_info *zi = ZONEFS_I(inode);
- if (zi->i_ztype != ZONEFS_ZTYPE_CNV) { + if (zonefs_inode_is_seq(inode)) { zonefs_err(inode->i_sb, "swap file: not a conventional zone file\n"); return -EINVAL; @@ -196,7 +194,7 @@ int zonefs_file_truncate(struct inode *inode, loff_t isize) * only down to a 0 size, which is equivalent to a zone reset, and to * the maximum file size, which is equivalent to a zone finish. */ - if (zi->i_ztype != ZONEFS_ZTYPE_SEQ) + if (!zonefs_zone_is_seq(zi)) return -EPERM;
if (!isize) @@ -266,7 +264,7 @@ static int zonefs_file_fsync(struct file *file, loff_t start, loff_t end, * Since only direct writes are allowed in sequential files, page cache * flush is needed only for conventional zone files. */ - if (ZONEFS_I(inode)->i_ztype == ZONEFS_ZTYPE_CNV) + if (zonefs_inode_is_cnv(inode)) ret = file_write_and_wait_range(file, start, end); if (!ret) ret = blkdev_issue_flush(inode->i_sb->s_bdev); @@ -280,7 +278,6 @@ static int zonefs_file_fsync(struct file *file, loff_t start, loff_t end, static vm_fault_t zonefs_filemap_page_mkwrite(struct vm_fault *vmf) { struct inode *inode = file_inode(vmf->vma->vm_file); - struct zonefs_inode_info *zi = ZONEFS_I(inode); vm_fault_t ret;
if (unlikely(IS_IMMUTABLE(inode))) @@ -290,7 +287,7 @@ static vm_fault_t zonefs_filemap_page_mkwrite(struct vm_fault *vmf) * Sanity check: only conventional zone files can have shared * writeable mappings. */ - if (WARN_ON_ONCE(zi->i_ztype != ZONEFS_ZTYPE_CNV)) + if (zonefs_inode_is_seq(inode)) return VM_FAULT_NOPAGE;
sb_start_pagefault(inode->i_sb); @@ -319,7 +316,7 @@ static int zonefs_file_mmap(struct file *file, struct vm_area_struct *vma) * mappings are possible since there are no guarantees for write * ordering between msync() and page cache writeback. */ - if (ZONEFS_I(file_inode(file))->i_ztype == ZONEFS_ZTYPE_SEQ && + if (zonefs_inode_is_seq(file_inode(file)) && (vma->vm_flags & VM_SHARED) && (vma->vm_flags & VM_MAYWRITE)) return -EINVAL;
@@ -352,7 +349,7 @@ static int zonefs_file_write_dio_end_io(struct kiocb *iocb, ssize_t size, return error; }
- if (size && zi->i_ztype != ZONEFS_ZTYPE_CNV) { + if (size && zonefs_zone_is_seq(zi)) { /* * Note that we may be seeing completions out of order, * but that is not a problem since a write completed @@ -491,7 +488,7 @@ static ssize_t zonefs_write_checks(struct kiocb *iocb, struct iov_iter *from) return -EINVAL;
if (iocb->ki_flags & IOCB_APPEND) { - if (zi->i_ztype != ZONEFS_ZTYPE_SEQ) + if (zonefs_zone_is_cnv(zi)) return -EINVAL; mutex_lock(&zi->i_truncate_mutex); iocb->ki_pos = zi->i_wpoffset; @@ -531,8 +528,7 @@ static ssize_t zonefs_file_dio_write(struct kiocb *iocb, struct iov_iter *from) * as this can cause write reordering (e.g. the first aio gets EAGAIN * on the inode lock but the second goes through but is now unaligned). */ - if (zi->i_ztype == ZONEFS_ZTYPE_SEQ && !sync && - (iocb->ki_flags & IOCB_NOWAIT)) + if (zonefs_zone_is_seq(zi) && !sync && (iocb->ki_flags & IOCB_NOWAIT)) return -EOPNOTSUPP;
if (iocb->ki_flags & IOCB_NOWAIT) { @@ -554,7 +550,7 @@ static ssize_t zonefs_file_dio_write(struct kiocb *iocb, struct iov_iter *from) }
/* Enforce sequential writes (append only) in sequential zones */ - if (zi->i_ztype == ZONEFS_ZTYPE_SEQ) { + if (zonefs_zone_is_seq(zi)) { mutex_lock(&zi->i_truncate_mutex); if (iocb->ki_pos != zi->i_wpoffset) { mutex_unlock(&zi->i_truncate_mutex); @@ -570,7 +566,7 @@ static ssize_t zonefs_file_dio_write(struct kiocb *iocb, struct iov_iter *from) else ret = iomap_dio_rw(iocb, from, &zonefs_write_iomap_ops, &zonefs_write_dio_ops, 0, NULL, 0); - if (zi->i_ztype == ZONEFS_ZTYPE_SEQ && + if (zonefs_zone_is_seq(zi) && (ret > 0 || ret == -EIOCBQUEUED)) { if (ret > 0) count = ret; @@ -596,14 +592,13 @@ static ssize_t zonefs_file_buffered_write(struct kiocb *iocb, struct iov_iter *from) { struct inode *inode = file_inode(iocb->ki_filp); - struct zonefs_inode_info *zi = ZONEFS_I(inode); ssize_t ret;
/* * Direct IO writes are mandatory for sequential zone files so that the * write IO issuing order is preserved. */ - if (zi->i_ztype != ZONEFS_ZTYPE_CNV) + if (zonefs_inode_is_seq(inode)) return -EIO;
if (iocb->ki_flags & IOCB_NOWAIT) { @@ -731,9 +726,7 @@ static ssize_t zonefs_file_read_iter(struct kiocb *iocb, struct iov_iter *to) static inline bool zonefs_seq_file_need_wro(struct inode *inode, struct file *file) { - struct zonefs_inode_info *zi = ZONEFS_I(inode); - - if (zi->i_ztype != ZONEFS_ZTYPE_SEQ) + if (zonefs_inode_is_cnv(inode)) return false;
if (!(file->f_mode & FMODE_WRITE)) diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c index 6307cc95be061..a4af29dc32e7d 100644 --- a/fs/zonefs/super.c +++ b/fs/zonefs/super.c @@ -37,7 +37,7 @@ void zonefs_account_active(struct inode *inode)
lockdep_assert_held(&zi->i_truncate_mutex);
- if (zi->i_ztype != ZONEFS_ZTYPE_SEQ) + if (zonefs_zone_is_cnv(zi)) return;
/* @@ -177,14 +177,14 @@ static loff_t zonefs_check_zone_condition(struct inode *inode, zonefs_warn(inode->i_sb, "inode %lu: read-only zone\n", inode->i_ino); zi->i_flags |= ZONEFS_ZONE_READONLY; - if (zi->i_ztype == ZONEFS_ZTYPE_CNV) + if (zonefs_zone_is_cnv(zi)) return zi->i_max_size; return zi->i_wpoffset; case BLK_ZONE_COND_FULL: /* The write pointer of full zones is invalid. */ return zi->i_max_size; default: - if (zi->i_ztype == ZONEFS_ZTYPE_CNV) + if (zonefs_zone_is_cnv(zi)) return zi->i_max_size; return (zone->wp - zone->start) << SECTOR_SHIFT; } @@ -260,7 +260,7 @@ static int zonefs_io_error_cb(struct blk_zone *zone, unsigned int idx, * In all cases, warn about inode size inconsistency and handle the * IO error according to the zone condition and to the mount options. */ - if (zi->i_ztype == ZONEFS_ZTYPE_SEQ && isize != data_size) + if (zonefs_zone_is_seq(zi) && isize != data_size) zonefs_warn(sb, "inode %lu: invalid size %lld (should be %lld)\n", inode->i_ino, isize, data_size);
@@ -584,7 +584,9 @@ static int zonefs_init_file_inode(struct inode *inode, struct blk_zone *zone, inode->i_ino = zone->start >> sbi->s_zone_sectors_shift; inode->i_mode = S_IFREG | sbi->s_perm;
- zi->i_ztype = type; + if (type == ZONEFS_ZTYPE_CNV) + zi->i_flags |= ZONEFS_ZONE_CNV; + zi->i_zsector = zone->start; zi->i_zone_size = zone->len << SECTOR_SHIFT; if (zi->i_zone_size > bdev_zone_sectors(sb->s_bdev) << SECTOR_SHIFT && diff --git a/fs/zonefs/zonefs.h b/fs/zonefs/zonefs.h index 439096445ee53..1a225f74015a0 100644 --- a/fs/zonefs/zonefs.h +++ b/fs/zonefs/zonefs.h @@ -44,6 +44,7 @@ static inline enum zonefs_ztype zonefs_zone_type(struct blk_zone *zone) #define ZONEFS_ZONE_ACTIVE (1U << 2) #define ZONEFS_ZONE_OFFLINE (1U << 3) #define ZONEFS_ZONE_READONLY (1U << 4) +#define ZONEFS_ZONE_CNV (1U << 31)
/* * In-memory inode data. @@ -51,9 +52,6 @@ static inline enum zonefs_ztype zonefs_zone_type(struct blk_zone *zone) struct zonefs_inode_info { struct inode i_vnode;
- /* File zone type */ - enum zonefs_ztype i_ztype; - /* File zone start sector (512B unit) */ sector_t i_zsector;
@@ -91,6 +89,26 @@ static inline struct zonefs_inode_info *ZONEFS_I(struct inode *inode) return container_of(inode, struct zonefs_inode_info, i_vnode); }
+static inline bool zonefs_zone_is_cnv(struct zonefs_inode_info *zi) +{ + return zi->i_flags & ZONEFS_ZONE_CNV; +} + +static inline bool zonefs_zone_is_seq(struct zonefs_inode_info *zi) +{ + return !zonefs_zone_is_cnv(zi); +} + +static inline bool zonefs_inode_is_cnv(struct inode *inode) +{ + return zonefs_zone_is_cnv(ZONEFS_I(inode)); +} + +static inline bool zonefs_inode_is_seq(struct inode *inode) +{ + return zonefs_zone_is_seq(ZONEFS_I(inode)); +} + /* * On-disk super block (block 0). */
From: Damien Le Moal damien.lemoal@opensource.wdc.com
[ Upstream commit aa7f243f32e1d18036ee00d71d3ccfad70ae2121 ]
In preparation for adding dynamic inode allocation, separate an inode zone information from the zonefs inode structure. The new data structure zonefs_zone is introduced to store in memory information about a zone that must be kept throughout the lifetime of the device mount.
Linking between a zone file inode and its zone information is done by setting the inode i_private field to point to a struct zonefs_zone. Using the i_private pointer avoids the need for adding a pointer in struct zonefs_inode_info. Beside the vfs inode, this structure is reduced to a mutex and a write open counter.
One struct zonefs_zone is created per file inode on mount. These structures are organized in an array using the new struct zonefs_zone_group data structure to represent zone groups. The zonefs_zone arrays are indexed per file number (the index of a struct zonefs_zone in its array directly gives the file number/name for that zone file inode).
Signed-off-by: Damien Le Moal damien.lemoal@opensource.wdc.com Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Stable-dep-of: 88b170088ad2 ("zonefs: Fix error message in zonefs_file_dio_append()") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/zonefs/file.c | 99 ++++---- fs/zonefs/super.c | 571 +++++++++++++++++++++++++++------------------ fs/zonefs/trace.h | 20 +- fs/zonefs/zonefs.h | 63 +++-- 4 files changed, 449 insertions(+), 304 deletions(-)
diff --git a/fs/zonefs/file.c b/fs/zonefs/file.c index 64873d31d75dd..738b0e28d74b5 100644 --- a/fs/zonefs/file.c +++ b/fs/zonefs/file.c @@ -29,6 +29,7 @@ static int zonefs_read_iomap_begin(struct inode *inode, loff_t offset, struct iomap *iomap, struct iomap *srcmap) { struct zonefs_inode_info *zi = ZONEFS_I(inode); + struct zonefs_zone *z = zonefs_inode_zone(inode); struct super_block *sb = inode->i_sb; loff_t isize;
@@ -46,7 +47,7 @@ static int zonefs_read_iomap_begin(struct inode *inode, loff_t offset, iomap->length = length; } else { iomap->type = IOMAP_MAPPED; - iomap->addr = (zi->i_zsector << SECTOR_SHIFT) + iomap->offset; + iomap->addr = (z->z_sector << SECTOR_SHIFT) + iomap->offset; iomap->length = isize - iomap->offset; } mutex_unlock(&zi->i_truncate_mutex); @@ -65,11 +66,12 @@ static int zonefs_write_iomap_begin(struct inode *inode, loff_t offset, struct iomap *iomap, struct iomap *srcmap) { struct zonefs_inode_info *zi = ZONEFS_I(inode); + struct zonefs_zone *z = zonefs_inode_zone(inode); struct super_block *sb = inode->i_sb; loff_t isize;
/* All write I/Os should always be within the file maximum size */ - if (WARN_ON_ONCE(offset + length > zi->i_max_size)) + if (WARN_ON_ONCE(offset + length > z->z_capacity)) return -EIO;
/* @@ -77,7 +79,7 @@ static int zonefs_write_iomap_begin(struct inode *inode, loff_t offset, * checked when writes are issued, so warn if we see a page writeback * operation. */ - if (WARN_ON_ONCE(zonefs_zone_is_seq(zi) && !(flags & IOMAP_DIRECT))) + if (WARN_ON_ONCE(zonefs_zone_is_seq(z) && !(flags & IOMAP_DIRECT))) return -EIO;
/* @@ -88,11 +90,11 @@ static int zonefs_write_iomap_begin(struct inode *inode, loff_t offset, mutex_lock(&zi->i_truncate_mutex); iomap->bdev = inode->i_sb->s_bdev; iomap->offset = ALIGN_DOWN(offset, sb->s_blocksize); - iomap->addr = (zi->i_zsector << SECTOR_SHIFT) + iomap->offset; + iomap->addr = (z->z_sector << SECTOR_SHIFT) + iomap->offset; isize = i_size_read(inode); if (iomap->offset >= isize) { iomap->type = IOMAP_UNWRITTEN; - iomap->length = zi->i_max_size - iomap->offset; + iomap->length = z->z_capacity - iomap->offset; } else { iomap->type = IOMAP_MAPPED; iomap->length = isize - iomap->offset; @@ -125,9 +127,9 @@ static void zonefs_readahead(struct readahead_control *rac) static int zonefs_write_map_blocks(struct iomap_writepage_ctx *wpc, struct inode *inode, loff_t offset) { - struct zonefs_inode_info *zi = ZONEFS_I(inode); + struct zonefs_zone *z = zonefs_inode_zone(inode);
- if (WARN_ON_ONCE(zonefs_zone_is_seq(zi))) + if (WARN_ON_ONCE(zonefs_zone_is_seq(z))) return -EIO; if (WARN_ON_ONCE(offset >= i_size_read(inode))) return -EIO; @@ -137,7 +139,8 @@ static int zonefs_write_map_blocks(struct iomap_writepage_ctx *wpc, offset < wpc->iomap.offset + wpc->iomap.length) return 0;
- return zonefs_write_iomap_begin(inode, offset, zi->i_max_size - offset, + return zonefs_write_iomap_begin(inode, offset, + z->z_capacity - offset, IOMAP_WRITE, &wpc->iomap, NULL); }
@@ -185,6 +188,7 @@ const struct address_space_operations zonefs_file_aops = { int zonefs_file_truncate(struct inode *inode, loff_t isize) { struct zonefs_inode_info *zi = ZONEFS_I(inode); + struct zonefs_zone *z = zonefs_inode_zone(inode); loff_t old_isize; enum req_op op; int ret = 0; @@ -194,12 +198,12 @@ int zonefs_file_truncate(struct inode *inode, loff_t isize) * only down to a 0 size, which is equivalent to a zone reset, and to * the maximum file size, which is equivalent to a zone finish. */ - if (!zonefs_zone_is_seq(zi)) + if (!zonefs_zone_is_seq(z)) return -EPERM;
if (!isize) op = REQ_OP_ZONE_RESET; - else if (isize == zi->i_max_size) + else if (isize == z->z_capacity) op = REQ_OP_ZONE_FINISH; else return -EPERM; @@ -216,7 +220,7 @@ int zonefs_file_truncate(struct inode *inode, loff_t isize) if (isize == old_isize) goto unlock;
- ret = zonefs_zone_mgmt(inode, op); + ret = zonefs_inode_zone_mgmt(inode, op); if (ret) goto unlock;
@@ -224,7 +228,7 @@ int zonefs_file_truncate(struct inode *inode, loff_t isize) * If the mount option ZONEFS_MNTOPT_EXPLICIT_OPEN is set, * take care of open zones. */ - if (zi->i_flags & ZONEFS_ZONE_OPEN) { + if (z->z_flags & ZONEFS_ZONE_OPEN) { /* * Truncating a zone to EMPTY or FULL is the equivalent of * closing the zone. For a truncation to 0, we need to @@ -234,15 +238,15 @@ int zonefs_file_truncate(struct inode *inode, loff_t isize) * the open flag. */ if (!isize) - ret = zonefs_zone_mgmt(inode, REQ_OP_ZONE_OPEN); + ret = zonefs_inode_zone_mgmt(inode, REQ_OP_ZONE_OPEN); else - zi->i_flags &= ~ZONEFS_ZONE_OPEN; + z->z_flags &= ~ZONEFS_ZONE_OPEN; }
zonefs_update_stats(inode, isize); truncate_setsize(inode, isize); - zi->i_wpoffset = isize; - zonefs_account_active(inode); + z->z_wpoffset = isize; + zonefs_inode_account_active(inode);
unlock: mutex_unlock(&zi->i_truncate_mutex); @@ -349,7 +353,7 @@ static int zonefs_file_write_dio_end_io(struct kiocb *iocb, ssize_t size, return error; }
- if (size && zonefs_zone_is_seq(zi)) { + if (size && zonefs_inode_is_seq(inode)) { /* * Note that we may be seeing completions out of order, * but that is not a problem since a write completed @@ -375,7 +379,7 @@ static const struct iomap_dio_ops zonefs_write_dio_ops = { static ssize_t zonefs_file_dio_append(struct kiocb *iocb, struct iov_iter *from) { struct inode *inode = file_inode(iocb->ki_filp); - struct zonefs_inode_info *zi = ZONEFS_I(inode); + struct zonefs_zone *z = zonefs_inode_zone(inode); struct block_device *bdev = inode->i_sb->s_bdev; unsigned int max = bdev_max_zone_append_sectors(bdev); struct bio *bio; @@ -392,7 +396,7 @@ static ssize_t zonefs_file_dio_append(struct kiocb *iocb, struct iov_iter *from)
bio = bio_alloc(bdev, nr_pages, REQ_OP_ZONE_APPEND | REQ_SYNC | REQ_IDLE, GFP_NOFS); - bio->bi_iter.bi_sector = zi->i_zsector; + bio->bi_iter.bi_sector = z->z_sector; bio->bi_ioprio = iocb->ki_ioprio; if (iocb_is_dsync(iocb)) bio->bi_opf |= REQ_FUA; @@ -417,12 +421,12 @@ static ssize_t zonefs_file_dio_append(struct kiocb *iocb, struct iov_iter *from) */ if (!ret) { sector_t wpsector = - zi->i_zsector + (zi->i_wpoffset >> SECTOR_SHIFT); + z->z_sector + (z->z_wpoffset >> SECTOR_SHIFT);
if (bio->bi_iter.bi_sector != wpsector) { zonefs_warn(inode->i_sb, "Corrupted write pointer %llu for zone at %llu\n", - wpsector, zi->i_zsector); + wpsector, z->z_sector); ret = -EIO; } } @@ -450,9 +454,9 @@ static loff_t zonefs_write_check_limits(struct file *file, loff_t pos, loff_t count) { struct inode *inode = file_inode(file); - struct zonefs_inode_info *zi = ZONEFS_I(inode); + struct zonefs_zone *z = zonefs_inode_zone(inode); loff_t limit = rlimit(RLIMIT_FSIZE); - loff_t max_size = zi->i_max_size; + loff_t max_size = z->z_capacity;
if (limit != RLIM_INFINITY) { if (pos >= limit) { @@ -476,6 +480,7 @@ static ssize_t zonefs_write_checks(struct kiocb *iocb, struct iov_iter *from) struct file *file = iocb->ki_filp; struct inode *inode = file_inode(file); struct zonefs_inode_info *zi = ZONEFS_I(inode); + struct zonefs_zone *z = zonefs_inode_zone(inode); loff_t count;
if (IS_SWAPFILE(inode)) @@ -488,10 +493,10 @@ static ssize_t zonefs_write_checks(struct kiocb *iocb, struct iov_iter *from) return -EINVAL;
if (iocb->ki_flags & IOCB_APPEND) { - if (zonefs_zone_is_cnv(zi)) + if (zonefs_zone_is_cnv(z)) return -EINVAL; mutex_lock(&zi->i_truncate_mutex); - iocb->ki_pos = zi->i_wpoffset; + iocb->ki_pos = z->z_wpoffset; mutex_unlock(&zi->i_truncate_mutex); }
@@ -518,6 +523,7 @@ static ssize_t zonefs_file_dio_write(struct kiocb *iocb, struct iov_iter *from) { struct inode *inode = file_inode(iocb->ki_filp); struct zonefs_inode_info *zi = ZONEFS_I(inode); + struct zonefs_zone *z = zonefs_inode_zone(inode); struct super_block *sb = inode->i_sb; bool sync = is_sync_kiocb(iocb); bool append = false; @@ -528,7 +534,7 @@ static ssize_t zonefs_file_dio_write(struct kiocb *iocb, struct iov_iter *from) * as this can cause write reordering (e.g. the first aio gets EAGAIN * on the inode lock but the second goes through but is now unaligned). */ - if (zonefs_zone_is_seq(zi) && !sync && (iocb->ki_flags & IOCB_NOWAIT)) + if (zonefs_zone_is_seq(z) && !sync && (iocb->ki_flags & IOCB_NOWAIT)) return -EOPNOTSUPP;
if (iocb->ki_flags & IOCB_NOWAIT) { @@ -550,9 +556,9 @@ static ssize_t zonefs_file_dio_write(struct kiocb *iocb, struct iov_iter *from) }
/* Enforce sequential writes (append only) in sequential zones */ - if (zonefs_zone_is_seq(zi)) { + if (zonefs_zone_is_seq(z)) { mutex_lock(&zi->i_truncate_mutex); - if (iocb->ki_pos != zi->i_wpoffset) { + if (iocb->ki_pos != z->z_wpoffset) { mutex_unlock(&zi->i_truncate_mutex); ret = -EINVAL; goto inode_unlock; @@ -566,7 +572,7 @@ static ssize_t zonefs_file_dio_write(struct kiocb *iocb, struct iov_iter *from) else ret = iomap_dio_rw(iocb, from, &zonefs_write_iomap_ops, &zonefs_write_dio_ops, 0, NULL, 0); - if (zonefs_zone_is_seq(zi) && + if (zonefs_zone_is_seq(z) && (ret > 0 || ret == -EIOCBQUEUED)) { if (ret > 0) count = ret; @@ -577,8 +583,8 @@ static ssize_t zonefs_file_dio_write(struct kiocb *iocb, struct iov_iter *from) * will correct it. Also do active seq file accounting. */ mutex_lock(&zi->i_truncate_mutex); - zi->i_wpoffset += count; - zonefs_account_active(inode); + z->z_wpoffset += count; + zonefs_inode_account_active(inode); mutex_unlock(&zi->i_truncate_mutex); }
@@ -629,6 +635,7 @@ static ssize_t zonefs_file_buffered_write(struct kiocb *iocb, static ssize_t zonefs_file_write_iter(struct kiocb *iocb, struct iov_iter *from) { struct inode *inode = file_inode(iocb->ki_filp); + struct zonefs_zone *z = zonefs_inode_zone(inode);
if (unlikely(IS_IMMUTABLE(inode))) return -EPERM; @@ -636,8 +643,8 @@ static ssize_t zonefs_file_write_iter(struct kiocb *iocb, struct iov_iter *from) if (sb_rdonly(inode->i_sb)) return -EROFS;
- /* Write operations beyond the zone size are not allowed */ - if (iocb->ki_pos >= ZONEFS_I(inode)->i_max_size) + /* Write operations beyond the zone capacity are not allowed */ + if (iocb->ki_pos >= z->z_capacity) return -EFBIG;
if (iocb->ki_flags & IOCB_DIRECT) { @@ -669,6 +676,7 @@ static ssize_t zonefs_file_read_iter(struct kiocb *iocb, struct iov_iter *to) { struct inode *inode = file_inode(iocb->ki_filp); struct zonefs_inode_info *zi = ZONEFS_I(inode); + struct zonefs_zone *z = zonefs_inode_zone(inode); struct super_block *sb = inode->i_sb; loff_t isize; ssize_t ret; @@ -677,7 +685,7 @@ static ssize_t zonefs_file_read_iter(struct kiocb *iocb, struct iov_iter *to) if (unlikely(IS_IMMUTABLE(inode) && !(inode->i_mode & 0777))) return -EPERM;
- if (iocb->ki_pos >= zi->i_max_size) + if (iocb->ki_pos >= z->z_capacity) return 0;
if (iocb->ki_flags & IOCB_NOWAIT) { @@ -738,6 +746,7 @@ static inline bool zonefs_seq_file_need_wro(struct inode *inode, static int zonefs_seq_file_write_open(struct inode *inode) { struct zonefs_inode_info *zi = ZONEFS_I(inode); + struct zonefs_zone *z = zonefs_inode_zone(inode); int ret = 0;
mutex_lock(&zi->i_truncate_mutex); @@ -755,14 +764,15 @@ static int zonefs_seq_file_write_open(struct inode *inode) goto unlock; }
- if (i_size_read(inode) < zi->i_max_size) { - ret = zonefs_zone_mgmt(inode, REQ_OP_ZONE_OPEN); + if (i_size_read(inode) < z->z_capacity) { + ret = zonefs_inode_zone_mgmt(inode, + REQ_OP_ZONE_OPEN); if (ret) { atomic_dec(&sbi->s_wro_seq_files); goto unlock; } - zi->i_flags |= ZONEFS_ZONE_OPEN; - zonefs_account_active(inode); + z->z_flags |= ZONEFS_ZONE_OPEN; + zonefs_inode_account_active(inode); } } } @@ -792,6 +802,7 @@ static int zonefs_file_open(struct inode *inode, struct file *file) static void zonefs_seq_file_write_close(struct inode *inode) { struct zonefs_inode_info *zi = ZONEFS_I(inode); + struct zonefs_zone *z = zonefs_inode_zone(inode); struct super_block *sb = inode->i_sb; struct zonefs_sb_info *sbi = ZONEFS_SB(sb); int ret = 0; @@ -807,8 +818,8 @@ static void zonefs_seq_file_write_close(struct inode *inode) * its maximum size or it was fully written). For this case, we only * need to decrement the write open count. */ - if (zi->i_flags & ZONEFS_ZONE_OPEN) { - ret = zonefs_zone_mgmt(inode, REQ_OP_ZONE_CLOSE); + if (z->z_flags & ZONEFS_ZONE_OPEN) { + ret = zonefs_inode_zone_mgmt(inode, REQ_OP_ZONE_CLOSE); if (ret) { __zonefs_io_error(inode, false); /* @@ -817,11 +828,11 @@ static void zonefs_seq_file_write_close(struct inode *inode) * exhausted). So take preventive action by remounting * read-only. */ - if (zi->i_flags & ZONEFS_ZONE_OPEN && + if (z->z_flags & ZONEFS_ZONE_OPEN && !(sb->s_flags & SB_RDONLY)) { zonefs_warn(sb, "closing zone at %llu failed %d\n", - zi->i_zsector, ret); + z->z_sector, ret); zonefs_warn(sb, "remounting filesystem read-only\n"); sb->s_flags |= SB_RDONLY; @@ -829,8 +840,8 @@ static void zonefs_seq_file_write_close(struct inode *inode) goto unlock; }
- zi->i_flags &= ~ZONEFS_ZONE_OPEN; - zonefs_account_active(inode); + z->z_flags &= ~ZONEFS_ZONE_OPEN; + zonefs_inode_account_active(inode); }
atomic_dec(&sbi->s_wro_seq_files); diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c index a4af29dc32e7d..270ded209dde5 100644 --- a/fs/zonefs/super.c +++ b/fs/zonefs/super.c @@ -28,33 +28,47 @@ #include "trace.h"
/* - * Manage the active zone count. Called with zi->i_truncate_mutex held. + * Get the name of a zone group directory. */ -void zonefs_account_active(struct inode *inode) +static const char *zonefs_zgroup_name(enum zonefs_ztype ztype) { - struct zonefs_sb_info *sbi = ZONEFS_SB(inode->i_sb); - struct zonefs_inode_info *zi = ZONEFS_I(inode); + switch (ztype) { + case ZONEFS_ZTYPE_CNV: + return "cnv"; + case ZONEFS_ZTYPE_SEQ: + return "seq"; + default: + WARN_ON_ONCE(1); + return "???"; + } +}
- lockdep_assert_held(&zi->i_truncate_mutex); +/* + * Manage the active zone count. + */ +static void zonefs_account_active(struct super_block *sb, + struct zonefs_zone *z) +{ + struct zonefs_sb_info *sbi = ZONEFS_SB(sb);
- if (zonefs_zone_is_cnv(zi)) + if (zonefs_zone_is_cnv(z)) return;
/* * For zones that transitioned to the offline or readonly condition, * we only need to clear the active state. */ - if (zi->i_flags & (ZONEFS_ZONE_OFFLINE | ZONEFS_ZONE_READONLY)) + if (z->z_flags & (ZONEFS_ZONE_OFFLINE | ZONEFS_ZONE_READONLY)) goto out;
/* * If the zone is active, that is, if it is explicitly open or * partially written, check if it was already accounted as active. */ - if ((zi->i_flags & ZONEFS_ZONE_OPEN) || - (zi->i_wpoffset > 0 && zi->i_wpoffset < zi->i_max_size)) { - if (!(zi->i_flags & ZONEFS_ZONE_ACTIVE)) { - zi->i_flags |= ZONEFS_ZONE_ACTIVE; + if ((z->z_flags & ZONEFS_ZONE_OPEN) || + (z->z_wpoffset > 0 && z->z_wpoffset < z->z_capacity)) { + if (!(z->z_flags & ZONEFS_ZONE_ACTIVE)) { + z->z_flags |= ZONEFS_ZONE_ACTIVE; atomic_inc(&sbi->s_active_seq_files); } return; @@ -62,18 +76,29 @@ void zonefs_account_active(struct inode *inode)
out: /* The zone is not active. If it was, update the active count */ - if (zi->i_flags & ZONEFS_ZONE_ACTIVE) { - zi->i_flags &= ~ZONEFS_ZONE_ACTIVE; + if (z->z_flags & ZONEFS_ZONE_ACTIVE) { + z->z_flags &= ~ZONEFS_ZONE_ACTIVE; atomic_dec(&sbi->s_active_seq_files); } }
-int zonefs_zone_mgmt(struct inode *inode, enum req_op op) +/* + * Manage the active zone count. Called with zi->i_truncate_mutex held. + */ +void zonefs_inode_account_active(struct inode *inode) { - struct zonefs_inode_info *zi = ZONEFS_I(inode); - int ret; + lockdep_assert_held(&ZONEFS_I(inode)->i_truncate_mutex);
- lockdep_assert_held(&zi->i_truncate_mutex); + return zonefs_account_active(inode->i_sb, zonefs_inode_zone(inode)); +} + +/* + * Execute a zone management operation. + */ +static int zonefs_zone_mgmt(struct super_block *sb, + struct zonefs_zone *z, enum req_op op) +{ + int ret;
/* * With ZNS drives, closing an explicitly open zone that has not been @@ -83,37 +108,45 @@ int zonefs_zone_mgmt(struct inode *inode, enum req_op op) * are exceeded, make sure that the zone does not remain active by * resetting it. */ - if (op == REQ_OP_ZONE_CLOSE && !zi->i_wpoffset) + if (op == REQ_OP_ZONE_CLOSE && !z->z_wpoffset) op = REQ_OP_ZONE_RESET;
- trace_zonefs_zone_mgmt(inode, op); - ret = blkdev_zone_mgmt(inode->i_sb->s_bdev, op, zi->i_zsector, - zi->i_zone_size >> SECTOR_SHIFT, GFP_NOFS); + trace_zonefs_zone_mgmt(sb, z, op); + ret = blkdev_zone_mgmt(sb->s_bdev, op, z->z_sector, + z->z_size >> SECTOR_SHIFT, GFP_NOFS); if (ret) { - zonefs_err(inode->i_sb, + zonefs_err(sb, "Zone management operation %s at %llu failed %d\n", - blk_op_str(op), zi->i_zsector, ret); + blk_op_str(op), z->z_sector, ret); return ret; }
return 0; }
+int zonefs_inode_zone_mgmt(struct inode *inode, enum req_op op) +{ + lockdep_assert_held(&ZONEFS_I(inode)->i_truncate_mutex); + + return zonefs_zone_mgmt(inode->i_sb, zonefs_inode_zone(inode), op); +} + void zonefs_i_size_write(struct inode *inode, loff_t isize) { - struct zonefs_inode_info *zi = ZONEFS_I(inode); + struct zonefs_zone *z = zonefs_inode_zone(inode);
i_size_write(inode, isize); + /* * A full zone is no longer open/active and does not need * explicit closing. */ - if (isize >= zi->i_max_size) { + if (isize >= z->z_capacity) { struct zonefs_sb_info *sbi = ZONEFS_SB(inode->i_sb);
- if (zi->i_flags & ZONEFS_ZONE_ACTIVE) + if (z->z_flags & ZONEFS_ZONE_ACTIVE) atomic_dec(&sbi->s_active_seq_files); - zi->i_flags &= ~(ZONEFS_ZONE_OPEN | ZONEFS_ZONE_ACTIVE); + z->z_flags &= ~(ZONEFS_ZONE_OPEN | ZONEFS_ZONE_ACTIVE); } }
@@ -150,20 +183,18 @@ void zonefs_update_stats(struct inode *inode, loff_t new_isize) }
/* - * Check a zone condition and adjust its file inode access permissions for - * offline and readonly zones. Return the inode size corresponding to the - * amount of readable data in the zone. + * Check a zone condition. Return the amount of written (and still readable) + * data in the zone. */ -static loff_t zonefs_check_zone_condition(struct inode *inode, +static loff_t zonefs_check_zone_condition(struct super_block *sb, + struct zonefs_zone *z, struct blk_zone *zone) { - struct zonefs_inode_info *zi = ZONEFS_I(inode); - switch (zone->cond) { case BLK_ZONE_COND_OFFLINE: - zonefs_warn(inode->i_sb, "inode %lu: offline zone\n", - inode->i_ino); - zi->i_flags |= ZONEFS_ZONE_OFFLINE; + zonefs_warn(sb, "Zone %llu: offline zone\n", + z->z_sector); + z->z_flags |= ZONEFS_ZONE_OFFLINE; return 0; case BLK_ZONE_COND_READONLY: /* @@ -174,18 +205,18 @@ static loff_t zonefs_check_zone_condition(struct inode *inode, * the inode size as it was when last updated so that the user * can recover data. */ - zonefs_warn(inode->i_sb, "inode %lu: read-only zone\n", - inode->i_ino); - zi->i_flags |= ZONEFS_ZONE_READONLY; - if (zonefs_zone_is_cnv(zi)) - return zi->i_max_size; - return zi->i_wpoffset; + zonefs_warn(sb, "Zone %llu: read-only zone\n", + z->z_sector); + z->z_flags |= ZONEFS_ZONE_READONLY; + if (zonefs_zone_is_cnv(z)) + return z->z_capacity; + return z->z_wpoffset; case BLK_ZONE_COND_FULL: /* The write pointer of full zones is invalid. */ - return zi->i_max_size; + return z->z_capacity; default: - if (zonefs_zone_is_cnv(zi)) - return zi->i_max_size; + if (zonefs_zone_is_cnv(z)) + return z->z_capacity; return (zone->wp - zone->start) << SECTOR_SHIFT; } } @@ -196,22 +227,22 @@ static loff_t zonefs_check_zone_condition(struct inode *inode, */ static void zonefs_inode_update_mode(struct inode *inode) { - struct zonefs_inode_info *zi = ZONEFS_I(inode); + struct zonefs_zone *z = zonefs_inode_zone(inode);
- if (zi->i_flags & ZONEFS_ZONE_OFFLINE) { + if (z->z_flags & ZONEFS_ZONE_OFFLINE) { /* Offline zones cannot be read nor written */ inode->i_flags |= S_IMMUTABLE; inode->i_mode &= ~0777; - } else if (zi->i_flags & ZONEFS_ZONE_READONLY) { + } else if (z->z_flags & ZONEFS_ZONE_READONLY) { /* Readonly zones cannot be written */ inode->i_flags |= S_IMMUTABLE; - if (zi->i_flags & ZONEFS_ZONE_INIT_MODE) + if (z->z_flags & ZONEFS_ZONE_INIT_MODE) inode->i_mode &= ~0777; else inode->i_mode &= ~0222; }
- zi->i_flags &= ~ZONEFS_ZONE_INIT_MODE; + z->z_flags &= ~ZONEFS_ZONE_INIT_MODE; }
struct zonefs_ioerr_data { @@ -224,7 +255,7 @@ static int zonefs_io_error_cb(struct blk_zone *zone, unsigned int idx, { struct zonefs_ioerr_data *err = data; struct inode *inode = err->inode; - struct zonefs_inode_info *zi = ZONEFS_I(inode); + struct zonefs_zone *z = zonefs_inode_zone(inode); struct super_block *sb = inode->i_sb; struct zonefs_sb_info *sbi = ZONEFS_SB(sb); loff_t isize, data_size; @@ -235,9 +266,9 @@ static int zonefs_io_error_cb(struct blk_zone *zone, unsigned int idx, * as there is no inconsistency between the inode size and the amount of * data writen in the zone (data_size). */ - data_size = zonefs_check_zone_condition(inode, zone); + data_size = zonefs_check_zone_condition(sb, z, zone); isize = i_size_read(inode); - if (!(zi->i_flags & (ZONEFS_ZONE_READONLY | ZONEFS_ZONE_OFFLINE)) && + if (!(z->z_flags & (ZONEFS_ZONE_READONLY | ZONEFS_ZONE_OFFLINE)) && !err->write && isize == data_size) return 0;
@@ -260,8 +291,9 @@ static int zonefs_io_error_cb(struct blk_zone *zone, unsigned int idx, * In all cases, warn about inode size inconsistency and handle the * IO error according to the zone condition and to the mount options. */ - if (zonefs_zone_is_seq(zi) && isize != data_size) - zonefs_warn(sb, "inode %lu: invalid size %lld (should be %lld)\n", + if (zonefs_zone_is_seq(z) && isize != data_size) + zonefs_warn(sb, + "inode %lu: invalid size %lld (should be %lld)\n", inode->i_ino, isize, data_size);
/* @@ -270,20 +302,20 @@ static int zonefs_io_error_cb(struct blk_zone *zone, unsigned int idx, * zone condition to read-only and offline respectively, as if the * condition was signaled by the hardware. */ - if ((zi->i_flags & ZONEFS_ZONE_OFFLINE) || + if ((z->z_flags & ZONEFS_ZONE_OFFLINE) || (sbi->s_mount_opts & ZONEFS_MNTOPT_ERRORS_ZOL)) { zonefs_warn(sb, "inode %lu: read/write access disabled\n", inode->i_ino); - if (!(zi->i_flags & ZONEFS_ZONE_OFFLINE)) - zi->i_flags |= ZONEFS_ZONE_OFFLINE; + if (!(z->z_flags & ZONEFS_ZONE_OFFLINE)) + z->z_flags |= ZONEFS_ZONE_OFFLINE; zonefs_inode_update_mode(inode); data_size = 0; - } else if ((zi->i_flags & ZONEFS_ZONE_READONLY) || + } else if ((z->z_flags & ZONEFS_ZONE_READONLY) || (sbi->s_mount_opts & ZONEFS_MNTOPT_ERRORS_ZRO)) { zonefs_warn(sb, "inode %lu: write access disabled\n", inode->i_ino); - if (!(zi->i_flags & ZONEFS_ZONE_READONLY)) - zi->i_flags |= ZONEFS_ZONE_READONLY; + if (!(z->z_flags & ZONEFS_ZONE_READONLY)) + z->z_flags |= ZONEFS_ZONE_READONLY; zonefs_inode_update_mode(inode); data_size = isize; } else if (sbi->s_mount_opts & ZONEFS_MNTOPT_ERRORS_RO && @@ -299,8 +331,8 @@ static int zonefs_io_error_cb(struct blk_zone *zone, unsigned int idx, * close of the zone when the inode file is closed. */ if ((sbi->s_mount_opts & ZONEFS_MNTOPT_EXPLICIT_OPEN) && - (zi->i_flags & (ZONEFS_ZONE_READONLY | ZONEFS_ZONE_OFFLINE))) - zi->i_flags &= ~ZONEFS_ZONE_OPEN; + (z->z_flags & (ZONEFS_ZONE_READONLY | ZONEFS_ZONE_OFFLINE))) + z->z_flags &= ~ZONEFS_ZONE_OPEN;
/* * If error=remount-ro was specified, any error result in remounting @@ -317,8 +349,8 @@ static int zonefs_io_error_cb(struct blk_zone *zone, unsigned int idx, */ zonefs_update_stats(inode, data_size); zonefs_i_size_write(inode, data_size); - zi->i_wpoffset = data_size; - zonefs_account_active(inode); + z->z_wpoffset = data_size; + zonefs_inode_account_active(inode);
return 0; } @@ -332,7 +364,7 @@ static int zonefs_io_error_cb(struct blk_zone *zone, unsigned int idx, */ void __zonefs_io_error(struct inode *inode, bool write) { - struct zonefs_inode_info *zi = ZONEFS_I(inode); + struct zonefs_zone *z = zonefs_inode_zone(inode); struct super_block *sb = inode->i_sb; struct zonefs_sb_info *sbi = ZONEFS_SB(sb); unsigned int noio_flag; @@ -348,8 +380,8 @@ void __zonefs_io_error(struct inode *inode, bool write) * files with aggregated conventional zones, for which the inode zone * size is always larger than the device zone size. */ - if (zi->i_zone_size > bdev_zone_sectors(sb->s_bdev)) - nr_zones = zi->i_zone_size >> + if (z->z_size > bdev_zone_sectors(sb->s_bdev)) + nr_zones = z->z_size >> (sbi->s_zone_sectors_shift + SECTOR_SHIFT);
/* @@ -361,7 +393,7 @@ void __zonefs_io_error(struct inode *inode, bool write) * the GFP_NOIO context avoids both problems. */ noio_flag = memalloc_noio_save(); - ret = blkdev_report_zones(sb->s_bdev, zi->i_zsector, nr_zones, + ret = blkdev_report_zones(sb->s_bdev, z->z_sector, nr_zones, zonefs_io_error_cb, &err); if (ret != nr_zones) zonefs_err(sb, "Get inode %lu zone information failed %d\n", @@ -381,9 +413,7 @@ static struct inode *zonefs_alloc_inode(struct super_block *sb)
inode_init_once(&zi->i_vnode); mutex_init(&zi->i_truncate_mutex); - zi->i_wpoffset = 0; zi->i_wr_refcnt = 0; - zi->i_flags = 0;
return &zi->i_vnode; } @@ -416,8 +446,8 @@ static int zonefs_statfs(struct dentry *dentry, struct kstatfs *buf) buf->f_bavail = buf->f_bfree;
for (t = 0; t < ZONEFS_ZTYPE_MAX; t++) { - if (sbi->s_nr_files[t]) - buf->f_files += sbi->s_nr_files[t] + 1; + if (sbi->s_zgroup[t].g_nr_zones) + buf->f_files += sbi->s_zgroup[t].g_nr_zones + 1; } buf->f_ffree = 0;
@@ -557,11 +587,11 @@ static const struct inode_operations zonefs_dir_inode_operations = { };
static void zonefs_init_dir_inode(struct inode *parent, struct inode *inode, - enum zonefs_ztype type) + enum zonefs_ztype ztype) { struct super_block *sb = parent->i_sb;
- inode->i_ino = bdev_nr_zones(sb->s_bdev) + type + 1; + inode->i_ino = bdev_nr_zones(sb->s_bdev) + ztype + 1; inode_init_owner(&init_user_ns, inode, parent, S_IFDIR | 0555); inode->i_op = &zonefs_dir_inode_operations; inode->i_fop = &simple_dir_operations; @@ -573,79 +603,34 @@ static const struct inode_operations zonefs_file_inode_operations = { .setattr = zonefs_inode_setattr, };
-static int zonefs_init_file_inode(struct inode *inode, struct blk_zone *zone, - enum zonefs_ztype type) +static void zonefs_init_file_inode(struct inode *inode, + struct zonefs_zone *z) { struct super_block *sb = inode->i_sb; struct zonefs_sb_info *sbi = ZONEFS_SB(sb); - struct zonefs_inode_info *zi = ZONEFS_I(inode); - int ret = 0; - - inode->i_ino = zone->start >> sbi->s_zone_sectors_shift; - inode->i_mode = S_IFREG | sbi->s_perm;
- if (type == ZONEFS_ZTYPE_CNV) - zi->i_flags |= ZONEFS_ZONE_CNV; - - zi->i_zsector = zone->start; - zi->i_zone_size = zone->len << SECTOR_SHIFT; - if (zi->i_zone_size > bdev_zone_sectors(sb->s_bdev) << SECTOR_SHIFT && - !(sbi->s_features & ZONEFS_F_AGGRCNV)) { - zonefs_err(sb, - "zone size %llu doesn't match device's zone sectors %llu\n", - zi->i_zone_size, - bdev_zone_sectors(sb->s_bdev) << SECTOR_SHIFT); - return -EINVAL; - } - - zi->i_max_size = min_t(loff_t, MAX_LFS_FILESIZE, - zone->capacity << SECTOR_SHIFT); - zi->i_wpoffset = zonefs_check_zone_condition(inode, zone); + inode->i_private = z;
+ inode->i_ino = z->z_sector >> sbi->s_zone_sectors_shift; + inode->i_mode = S_IFREG | sbi->s_perm; inode->i_uid = sbi->s_uid; inode->i_gid = sbi->s_gid; - inode->i_size = zi->i_wpoffset; - inode->i_blocks = zi->i_max_size >> SECTOR_SHIFT; + inode->i_size = z->z_wpoffset; + inode->i_blocks = z->z_capacity >> SECTOR_SHIFT;
inode->i_op = &zonefs_file_inode_operations; inode->i_fop = &zonefs_file_operations; inode->i_mapping->a_ops = &zonefs_file_aops;
/* Update the inode access rights depending on the zone condition */ - zi->i_flags |= ZONEFS_ZONE_INIT_MODE; + z->z_flags |= ZONEFS_ZONE_INIT_MODE; zonefs_inode_update_mode(inode); - - sb->s_maxbytes = max(zi->i_max_size, sb->s_maxbytes); - sbi->s_blocks += zi->i_max_size >> sb->s_blocksize_bits; - sbi->s_used_blocks += zi->i_wpoffset >> sb->s_blocksize_bits; - - mutex_lock(&zi->i_truncate_mutex); - - /* - * For sequential zones, make sure that any open zone is closed first - * to ensure that the initial number of open zones is 0, in sync with - * the open zone accounting done when the mount option - * ZONEFS_MNTOPT_EXPLICIT_OPEN is used. - */ - if (type == ZONEFS_ZTYPE_SEQ && - (zone->cond == BLK_ZONE_COND_IMP_OPEN || - zone->cond == BLK_ZONE_COND_EXP_OPEN)) { - ret = zonefs_zone_mgmt(inode, REQ_OP_ZONE_CLOSE); - if (ret) - goto unlock; - } - - zonefs_account_active(inode); - -unlock: - mutex_unlock(&zi->i_truncate_mutex); - - return ret; }
static struct dentry *zonefs_create_inode(struct dentry *parent, - const char *name, struct blk_zone *zone, - enum zonefs_ztype type) + const char *name, + struct zonefs_zone *z, + enum zonefs_ztype ztype) { struct inode *dir = d_inode(parent); struct dentry *dentry; @@ -661,15 +646,10 @@ static struct dentry *zonefs_create_inode(struct dentry *parent, goto dput;
inode->i_ctime = inode->i_mtime = inode->i_atime = dir->i_ctime; - if (zone) { - ret = zonefs_init_file_inode(inode, zone, type); - if (ret) { - iput(inode); - goto dput; - } - } else { - zonefs_init_dir_inode(dir, inode, type); - } + if (z) + zonefs_init_file_inode(inode, z); + else + zonefs_init_dir_inode(dir, inode, ztype);
d_add(dentry, inode); dir->i_size++; @@ -685,100 +665,51 @@ static struct dentry *zonefs_create_inode(struct dentry *parent, struct zonefs_zone_data { struct super_block *sb; unsigned int nr_zones[ZONEFS_ZTYPE_MAX]; + sector_t cnv_zone_start; struct blk_zone *zones; };
/* - * Create a zone group and populate it with zone files. + * Create the inodes for a zone group. */ -static int zonefs_create_zgroup(struct zonefs_zone_data *zd, - enum zonefs_ztype type) +static int zonefs_create_zgroup_inodes(struct super_block *sb, + enum zonefs_ztype ztype) { - struct super_block *sb = zd->sb; struct zonefs_sb_info *sbi = ZONEFS_SB(sb); - struct blk_zone *zone, *next, *end; - const char *zgroup_name; - char *file_name; + struct zonefs_zone_group *zgroup = &sbi->s_zgroup[ztype]; struct dentry *dir, *dent; - unsigned int n = 0; - int ret; + char *file_name; + int i, ret = 0; + + if (!zgroup) + return -ENOMEM;
/* If the group is empty, there is nothing to do */ - if (!zd->nr_zones[type]) + if (!zgroup->g_nr_zones) return 0;
file_name = kmalloc(ZONEFS_NAME_MAX, GFP_KERNEL); if (!file_name) return -ENOMEM;
- if (type == ZONEFS_ZTYPE_CNV) - zgroup_name = "cnv"; - else - zgroup_name = "seq"; - - dir = zonefs_create_inode(sb->s_root, zgroup_name, NULL, type); + dir = zonefs_create_inode(sb->s_root, zonefs_zgroup_name(ztype), + NULL, ztype); if (IS_ERR(dir)) { ret = PTR_ERR(dir); goto free; }
- /* - * The first zone contains the super block: skip it. - */ - end = zd->zones + bdev_nr_zones(sb->s_bdev); - for (zone = &zd->zones[1]; zone < end; zone = next) { - - next = zone + 1; - if (zonefs_zone_type(zone) != type) - continue; - - /* - * For conventional zones, contiguous zones can be aggregated - * together to form larger files. Note that this overwrites the - * length of the first zone of the set of contiguous zones - * aggregated together. If one offline or read-only zone is - * found, assume that all zones aggregated have the same - * condition. - */ - if (type == ZONEFS_ZTYPE_CNV && - (sbi->s_features & ZONEFS_F_AGGRCNV)) { - for (; next < end; next++) { - if (zonefs_zone_type(next) != type) - break; - zone->len += next->len; - zone->capacity += next->capacity; - if (next->cond == BLK_ZONE_COND_READONLY && - zone->cond != BLK_ZONE_COND_OFFLINE) - zone->cond = BLK_ZONE_COND_READONLY; - else if (next->cond == BLK_ZONE_COND_OFFLINE) - zone->cond = BLK_ZONE_COND_OFFLINE; - } - if (zone->capacity != zone->len) { - zonefs_err(sb, "Invalid conventional zone capacity\n"); - ret = -EINVAL; - goto free; - } - } - - /* - * Use the file number within its group as file name. - */ - snprintf(file_name, ZONEFS_NAME_MAX - 1, "%u", n); - dent = zonefs_create_inode(dir, file_name, zone, type); + for (i = 0; i < zgroup->g_nr_zones; i++) { + /* Use the zone number within its group as the file name */ + snprintf(file_name, ZONEFS_NAME_MAX - 1, "%u", i); + dent = zonefs_create_inode(dir, file_name, + &zgroup->g_zones[i], ztype); if (IS_ERR(dent)) { ret = PTR_ERR(dent); - goto free; + break; } - - n++; }
- zonefs_info(sb, "Zone group "%s" has %u file%s\n", - zgroup_name, n, n > 1 ? "s" : ""); - - sbi->s_nr_files[type] = n; - ret = 0; - free: kfree(file_name);
@@ -789,21 +720,38 @@ static int zonefs_get_zone_info_cb(struct blk_zone *zone, unsigned int idx, void *data) { struct zonefs_zone_data *zd = data; + struct super_block *sb = zd->sb; + struct zonefs_sb_info *sbi = ZONEFS_SB(sb); + + /* + * We do not care about the first zone: it contains the super block + * and not exposed as a file. + */ + if (!idx) + return 0;
/* - * Count the number of usable zones: the first zone at index 0 contains - * the super block and is ignored. + * Count the number of zones that will be exposed as files. + * For sequential zones, we always have as many files as zones. + * FOr conventional zones, the number of files depends on if we have + * conventional zones aggregation enabled. */ switch (zone->type) { case BLK_ZONE_TYPE_CONVENTIONAL: - zone->wp = zone->start + zone->len; - if (idx) - zd->nr_zones[ZONEFS_ZTYPE_CNV]++; + if (sbi->s_features & ZONEFS_F_AGGRCNV) { + /* One file per set of contiguous conventional zones */ + if (!(sbi->s_zgroup[ZONEFS_ZTYPE_CNV].g_nr_zones) || + zone->start != zd->cnv_zone_start) + sbi->s_zgroup[ZONEFS_ZTYPE_CNV].g_nr_zones++; + zd->cnv_zone_start = zone->start + zone->len; + } else { + /* One file per zone */ + sbi->s_zgroup[ZONEFS_ZTYPE_CNV].g_nr_zones++; + } break; case BLK_ZONE_TYPE_SEQWRITE_REQ: case BLK_ZONE_TYPE_SEQWRITE_PREF: - if (idx) - zd->nr_zones[ZONEFS_ZTYPE_SEQ]++; + sbi->s_zgroup[ZONEFS_ZTYPE_SEQ].g_nr_zones++; break; default: zonefs_err(zd->sb, "Unsupported zone type 0x%x\n", @@ -843,11 +791,173 @@ static int zonefs_get_zone_info(struct zonefs_zone_data *zd) return 0; }
-static inline void zonefs_cleanup_zone_info(struct zonefs_zone_data *zd) +static inline void zonefs_free_zone_info(struct zonefs_zone_data *zd) { kvfree(zd->zones); }
+/* + * Create a zone group and populate it with zone files. + */ +static int zonefs_init_zgroup(struct super_block *sb, + struct zonefs_zone_data *zd, + enum zonefs_ztype ztype) +{ + struct zonefs_sb_info *sbi = ZONEFS_SB(sb); + struct zonefs_zone_group *zgroup = &sbi->s_zgroup[ztype]; + struct blk_zone *zone, *next, *end; + struct zonefs_zone *z; + unsigned int n = 0; + int ret; + + /* Allocate the zone group. If it is empty, we have nothing to do. */ + if (!zgroup->g_nr_zones) + return 0; + + zgroup->g_zones = kvcalloc(zgroup->g_nr_zones, + sizeof(struct zonefs_zone), GFP_KERNEL); + if (!zgroup->g_zones) + return -ENOMEM; + + /* + * Initialize the zone groups using the device zone information. + * We always skip the first zone as it contains the super block + * and is not use to back a file. + */ + end = zd->zones + bdev_nr_zones(sb->s_bdev); + for (zone = &zd->zones[1]; zone < end; zone = next) { + + next = zone + 1; + if (zonefs_zone_type(zone) != ztype) + continue; + + if (WARN_ON_ONCE(n >= zgroup->g_nr_zones)) + return -EINVAL; + + /* + * For conventional zones, contiguous zones can be aggregated + * together to form larger files. Note that this overwrites the + * length of the first zone of the set of contiguous zones + * aggregated together. If one offline or read-only zone is + * found, assume that all zones aggregated have the same + * condition. + */ + if (ztype == ZONEFS_ZTYPE_CNV && + (sbi->s_features & ZONEFS_F_AGGRCNV)) { + for (; next < end; next++) { + if (zonefs_zone_type(next) != ztype) + break; + zone->len += next->len; + zone->capacity += next->capacity; + if (next->cond == BLK_ZONE_COND_READONLY && + zone->cond != BLK_ZONE_COND_OFFLINE) + zone->cond = BLK_ZONE_COND_READONLY; + else if (next->cond == BLK_ZONE_COND_OFFLINE) + zone->cond = BLK_ZONE_COND_OFFLINE; + } + } + + z = &zgroup->g_zones[n]; + if (ztype == ZONEFS_ZTYPE_CNV) + z->z_flags |= ZONEFS_ZONE_CNV; + z->z_sector = zone->start; + z->z_size = zone->len << SECTOR_SHIFT; + if (z->z_size > bdev_zone_sectors(sb->s_bdev) << SECTOR_SHIFT && + !(sbi->s_features & ZONEFS_F_AGGRCNV)) { + zonefs_err(sb, + "Invalid zone size %llu (device zone sectors %llu)\n", + z->z_size, + bdev_zone_sectors(sb->s_bdev) << SECTOR_SHIFT); + return -EINVAL; + } + + z->z_capacity = min_t(loff_t, MAX_LFS_FILESIZE, + zone->capacity << SECTOR_SHIFT); + z->z_wpoffset = zonefs_check_zone_condition(sb, z, zone); + + sb->s_maxbytes = max(z->z_capacity, sb->s_maxbytes); + sbi->s_blocks += z->z_capacity >> sb->s_blocksize_bits; + sbi->s_used_blocks += z->z_wpoffset >> sb->s_blocksize_bits; + + /* + * For sequential zones, make sure that any open zone is closed + * first to ensure that the initial number of open zones is 0, + * in sync with the open zone accounting done when the mount + * option ZONEFS_MNTOPT_EXPLICIT_OPEN is used. + */ + if (ztype == ZONEFS_ZTYPE_SEQ && + (zone->cond == BLK_ZONE_COND_IMP_OPEN || + zone->cond == BLK_ZONE_COND_EXP_OPEN)) { + ret = zonefs_zone_mgmt(sb, z, REQ_OP_ZONE_CLOSE); + if (ret) + return ret; + } + + zonefs_account_active(sb, z); + + n++; + } + + if (WARN_ON_ONCE(n != zgroup->g_nr_zones)) + return -EINVAL; + + zonefs_info(sb, "Zone group "%s" has %u file%s\n", + zonefs_zgroup_name(ztype), + zgroup->g_nr_zones, + zgroup->g_nr_zones > 1 ? "s" : ""); + + return 0; +} + +static void zonefs_free_zgroups(struct super_block *sb) +{ + struct zonefs_sb_info *sbi = ZONEFS_SB(sb); + enum zonefs_ztype ztype; + + if (!sbi) + return; + + for (ztype = 0; ztype < ZONEFS_ZTYPE_MAX; ztype++) { + kvfree(sbi->s_zgroup[ztype].g_zones); + sbi->s_zgroup[ztype].g_zones = NULL; + } +} + +/* + * Create a zone group and populate it with zone files. + */ +static int zonefs_init_zgroups(struct super_block *sb) +{ + struct zonefs_zone_data zd; + enum zonefs_ztype ztype; + int ret; + + /* First get the device zone information */ + memset(&zd, 0, sizeof(struct zonefs_zone_data)); + zd.sb = sb; + ret = zonefs_get_zone_info(&zd); + if (ret) + goto cleanup; + + /* Allocate and initialize the zone groups */ + for (ztype = 0; ztype < ZONEFS_ZTYPE_MAX; ztype++) { + ret = zonefs_init_zgroup(sb, &zd, ztype); + if (ret) { + zonefs_info(sb, + "Zone group "%s" initialization failed\n", + zonefs_zgroup_name(ztype)); + break; + } + } + +cleanup: + zonefs_free_zone_info(&zd); + if (ret) + zonefs_free_zgroups(sb); + + return ret; +} + /* * Read super block information from the device. */ @@ -945,7 +1055,6 @@ static const struct super_operations zonefs_sops = { */ static int zonefs_fill_super(struct super_block *sb, void *data, int silent) { - struct zonefs_zone_data zd; struct zonefs_sb_info *sbi; struct inode *inode; enum zonefs_ztype t; @@ -998,16 +1107,6 @@ static int zonefs_fill_super(struct super_block *sb, void *data, int silent) if (ret) return ret;
- memset(&zd, 0, sizeof(struct zonefs_zone_data)); - zd.sb = sb; - ret = zonefs_get_zone_info(&zd); - if (ret) - goto cleanup; - - ret = zonefs_sysfs_register(sb); - if (ret) - goto cleanup; - zonefs_info(sb, "Mounting %u zones", bdev_nr_zones(sb->s_bdev));
if (!sbi->s_max_wro_seq_files && @@ -1018,6 +1117,11 @@ static int zonefs_fill_super(struct super_block *sb, void *data, int silent) sbi->s_mount_opts &= ~ZONEFS_MNTOPT_EXPLICIT_OPEN; }
+ /* Initialize the zone groups */ + ret = zonefs_init_zgroups(sb); + if (ret) + goto cleanup; + /* Create root directory inode */ ret = -ENOMEM; inode = new_inode(sb); @@ -1037,13 +1141,19 @@ static int zonefs_fill_super(struct super_block *sb, void *data, int silent)
/* Create and populate files in zone groups directories */ for (t = 0; t < ZONEFS_ZTYPE_MAX; t++) { - ret = zonefs_create_zgroup(&zd, t); + ret = zonefs_create_zgroup_inodes(sb, t); if (ret) - break; + goto cleanup; }
+ ret = zonefs_sysfs_register(sb); + if (ret) + goto cleanup; + + return 0; + cleanup: - zonefs_cleanup_zone_info(&zd); + zonefs_free_zgroups(sb);
return ret; } @@ -1062,6 +1172,7 @@ static void zonefs_kill_super(struct super_block *sb) d_genocide(sb->s_root);
zonefs_sysfs_unregister(sb); + zonefs_free_zgroups(sb); kill_block_super(sb); kfree(sbi); } diff --git a/fs/zonefs/trace.h b/fs/zonefs/trace.h index 42edcfd393ed2..9969db3a9c7dc 100644 --- a/fs/zonefs/trace.h +++ b/fs/zonefs/trace.h @@ -20,8 +20,9 @@ #define show_dev(dev) MAJOR(dev), MINOR(dev)
TRACE_EVENT(zonefs_zone_mgmt, - TP_PROTO(struct inode *inode, enum req_op op), - TP_ARGS(inode, op), + TP_PROTO(struct super_block *sb, struct zonefs_zone *z, + enum req_op op), + TP_ARGS(sb, z, op), TP_STRUCT__entry( __field(dev_t, dev) __field(ino_t, ino) @@ -30,12 +31,12 @@ TRACE_EVENT(zonefs_zone_mgmt, __field(sector_t, nr_sectors) ), TP_fast_assign( - __entry->dev = inode->i_sb->s_dev; - __entry->ino = inode->i_ino; + __entry->dev = sb->s_dev; + __entry->ino = + z->z_sector >> ZONEFS_SB(sb)->s_zone_sectors_shift; __entry->op = op; - __entry->sector = ZONEFS_I(inode)->i_zsector; - __entry->nr_sectors = - ZONEFS_I(inode)->i_zone_size >> SECTOR_SHIFT; + __entry->sector = z->z_sector; + __entry->nr_sectors = z->z_size >> SECTOR_SHIFT; ), TP_printk("bdev=(%d,%d), ino=%lu op=%s, sector=%llu, nr_sectors=%llu", show_dev(__entry->dev), (unsigned long)__entry->ino, @@ -58,9 +59,10 @@ TRACE_EVENT(zonefs_file_dio_append, TP_fast_assign( __entry->dev = inode->i_sb->s_dev; __entry->ino = inode->i_ino; - __entry->sector = ZONEFS_I(inode)->i_zsector; + __entry->sector = zonefs_inode_zone(inode)->z_sector; __entry->size = size; - __entry->wpoffset = ZONEFS_I(inode)->i_wpoffset; + __entry->wpoffset = + zonefs_inode_zone(inode)->z_wpoffset; __entry->ret = ret; ), TP_printk("bdev=(%d, %d), ino=%lu, sector=%llu, size=%zu, wpoffset=%llu, ret=%zu", diff --git a/fs/zonefs/zonefs.h b/fs/zonefs/zonefs.h index 1a225f74015a0..2d626e18b1411 100644 --- a/fs/zonefs/zonefs.h +++ b/fs/zonefs/zonefs.h @@ -47,22 +47,39 @@ static inline enum zonefs_ztype zonefs_zone_type(struct blk_zone *zone) #define ZONEFS_ZONE_CNV (1U << 31)
/* - * In-memory inode data. + * In-memory per-file inode zone data. */ -struct zonefs_inode_info { - struct inode i_vnode; +struct zonefs_zone { + /* Zone state flags */ + unsigned int z_flags;
- /* File zone start sector (512B unit) */ - sector_t i_zsector; + /* Zone start sector (512B unit) */ + sector_t z_sector;
- /* File zone write pointer position (sequential zones only) */ - loff_t i_wpoffset; + /* Zone size (bytes) */ + loff_t z_size;
- /* File maximum size */ - loff_t i_max_size; + /* Zone capacity (file maximum size, bytes) */ + loff_t z_capacity;
- /* File zone size */ - loff_t i_zone_size; + /* Write pointer offset in the zone (sequential zones only, bytes) */ + loff_t z_wpoffset; +}; + +/* + * In memory zone group information: all zones of a group are exposed + * as files, one file per zone. + */ +struct zonefs_zone_group { + unsigned int g_nr_zones; + struct zonefs_zone *g_zones; +}; + +/* + * In-memory inode data. + */ +struct zonefs_inode_info { + struct inode i_vnode;
/* * To serialise fully against both syscall and mmap based IO and @@ -81,7 +98,6 @@ struct zonefs_inode_info {
/* guarded by i_truncate_mutex */ unsigned int i_wr_refcnt; - unsigned int i_flags; };
static inline struct zonefs_inode_info *ZONEFS_I(struct inode *inode) @@ -89,24 +105,29 @@ static inline struct zonefs_inode_info *ZONEFS_I(struct inode *inode) return container_of(inode, struct zonefs_inode_info, i_vnode); }
-static inline bool zonefs_zone_is_cnv(struct zonefs_inode_info *zi) +static inline bool zonefs_zone_is_cnv(struct zonefs_zone *z) +{ + return z->z_flags & ZONEFS_ZONE_CNV; +} + +static inline bool zonefs_zone_is_seq(struct zonefs_zone *z) { - return zi->i_flags & ZONEFS_ZONE_CNV; + return !zonefs_zone_is_cnv(z); }
-static inline bool zonefs_zone_is_seq(struct zonefs_inode_info *zi) +static inline struct zonefs_zone *zonefs_inode_zone(struct inode *inode) { - return !zonefs_zone_is_cnv(zi); + return inode->i_private; }
static inline bool zonefs_inode_is_cnv(struct inode *inode) { - return zonefs_zone_is_cnv(ZONEFS_I(inode)); + return zonefs_zone_is_cnv(zonefs_inode_zone(inode)); }
static inline bool zonefs_inode_is_seq(struct inode *inode) { - return zonefs_zone_is_seq(ZONEFS_I(inode)); + return zonefs_zone_is_seq(zonefs_inode_zone(inode)); }
/* @@ -200,7 +221,7 @@ struct zonefs_sb_info { uuid_t s_uuid; unsigned int s_zone_sectors_shift;
- unsigned int s_nr_files[ZONEFS_ZTYPE_MAX]; + struct zonefs_zone_group s_zgroup[ZONEFS_ZTYPE_MAX];
loff_t s_blocks; loff_t s_used_blocks; @@ -229,8 +250,8 @@ static inline struct zonefs_sb_info *ZONEFS_SB(struct super_block *sb) pr_warn("zonefs (%s) WARNING: " format, sb->s_id, ## args)
/* In super.c */ -void zonefs_account_active(struct inode *inode); -int zonefs_zone_mgmt(struct inode *inode, enum req_op op); +void zonefs_inode_account_active(struct inode *inode); +int zonefs_inode_zone_mgmt(struct inode *inode, enum req_op op); void zonefs_i_size_write(struct inode *inode, loff_t isize); void zonefs_update_stats(struct inode *inode, loff_t new_isize); void __zonefs_io_error(struct inode *inode, bool write);
From: Damien Le Moal damien.lemoal@opensource.wdc.com
[ Upstream commit 88b170088ad2c3e27086fe35769aa49f8a512564 ]
Since the expected write location in a sequential file is always at the end of the file (append write), when an invalid write append location is detected in zonefs_file_dio_append(), print the invalid written location instead of the expected write location.
Fixes: a608da3bd730 ("zonefs: Detect append writes at invalid locations") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal damien.lemoal@opensource.wdc.com Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Reviewed-by: Himanshu Madhani himanshu.madhani@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/zonefs/file.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/zonefs/file.c b/fs/zonefs/file.c index 738b0e28d74b5..c71cc0fcb3ec8 100644 --- a/fs/zonefs/file.c +++ b/fs/zonefs/file.c @@ -426,7 +426,7 @@ static ssize_t zonefs_file_dio_append(struct kiocb *iocb, struct iov_iter *from) if (bio->bi_iter.bi_sector != wpsector) { zonefs_warn(inode->i_sb, "Corrupted write pointer %llu for zone at %llu\n", - wpsector, z->z_sector); + bio->bi_iter.bi_sector, z->z_sector); ret = -EIO; } }
From: Josef Bacik josef@toxicpanda.com
[ Upstream commit bf1f1fec2724a33b67ec12032402ea75f2a83622 ]
This flag only gets set when we're doing active zone tracking, and we're going to need to use this flag for things related to this behavior. Rename the flag to represent what it actually means for the file system so it can be used in other ways and still make sense.
Reviewed-by: Naohiro Aota naohiro.aota@wdc.com Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Reviewed-by: Anand Jain anand.jain@oracle.com Signed-off-by: Josef Bacik josef@toxicpanda.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Stable-dep-of: e15acc25880c ("btrfs: zoned: drop space_info->active_total_bytes") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/fs.h | 7 ++----- fs/btrfs/space-info.c | 2 +- fs/btrfs/zoned.c | 3 +-- 3 files changed, 4 insertions(+), 8 deletions(-)
diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h index 3d8156fc8523f..f180ca061aef4 100644 --- a/fs/btrfs/fs.h +++ b/fs/btrfs/fs.h @@ -119,11 +119,8 @@ enum { /* Indicate that we want to commit the transaction. */ BTRFS_FS_NEED_TRANS_COMMIT,
- /* - * Indicate metadata over-commit is disabled. This is set when active - * zone tracking is needed. - */ - BTRFS_FS_NO_OVERCOMMIT, + /* This is set when active zone tracking is needed. */ + BTRFS_FS_ACTIVE_ZONE_TRACKING,
/* * Indicate if we have some features changed, this is mostly for diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c index 69c09508afb50..2237685d1ed0c 100644 --- a/fs/btrfs/space-info.c +++ b/fs/btrfs/space-info.c @@ -407,7 +407,7 @@ int btrfs_can_overcommit(struct btrfs_fs_info *fs_info, return 0;
used = btrfs_space_info_used(space_info, true); - if (test_bit(BTRFS_FS_NO_OVERCOMMIT, &fs_info->flags) && + if (test_bit(BTRFS_FS_ACTIVE_ZONE_TRACKING, &fs_info->flags) && (space_info->flags & BTRFS_BLOCK_GROUP_METADATA)) avail = 0; else diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c index f3b7d8ae93a9f..a6a8bc112fc42 100644 --- a/fs/btrfs/zoned.c +++ b/fs/btrfs/zoned.c @@ -539,8 +539,7 @@ int btrfs_get_dev_zone_info(struct btrfs_device *device, bool populate_cache) } atomic_set(&zone_info->active_zones_left, max_active_zones - nactive); - /* Overcommit does not work well with active zone tacking. */ - set_bit(BTRFS_FS_NO_OVERCOMMIT, &fs_info->flags); + set_bit(BTRFS_FS_ACTIVE_ZONE_TRACKING, &fs_info->flags); }
/* Validate superblock log */
From: Naohiro Aota naohiro.aota@wdc.com
[ Upstream commit fa2068d7e922b434eba5bfb0131e6d39febfdb48 ]
The naming of space_info->active_total_bytes is misleading. It counts not only active block groups but also full ones which are previously active but now inactive. That confusion results in a bug not counting the full BGs into active_total_bytes on mount time.
For a background, there are three kinds of block groups in terms of activation.
1. Block groups never activated 2. Block groups currently active 3. Block groups previously active and currently inactive (due to fully written or zone finish)
What we really wanted to exclude from "total_bytes" is the total size of BGs #1. They seem empty and allocatable but since they are not activated, we cannot rely on them to do the space reservation.
And, since BGs #1 never get activated, they should have no "used", "reserved" and "pinned" bytes.
OTOH, BGs #3 can be counted in the "total", since they are already full we cannot allocate from them anyway. For them, "total_bytes == used + reserved + pinned + zone_unusable" should hold.
Tracking #2 and #3 as "active_total_bytes" (current implementation) is confusing. And, tracking #1 and subtract that properly from "total_bytes" every time you need space reservation is cumbersome.
Instead, we can count the whole region of a newly allocated block group as zone_unusable. Then, once that block group is activated, release [0 .. zone_capacity] from the zone_unusable counters. With this, we can eliminate the confusing ->active_total_bytes and the code will be common among regular and the zoned mode. Also, no additional counter is needed with this approach.
Fixes: 6a921de58992 ("btrfs: zoned: introduce space_info->active_total_bytes") CC: stable@vger.kernel.org # 6.1+ Signed-off-by: Naohiro Aota naohiro.aota@wdc.com Signed-off-by: David Sterba dsterba@suse.com Stable-dep-of: e15acc25880c ("btrfs: zoned: drop space_info->active_total_bytes") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/free-space-cache.c | 8 +++++++- fs/btrfs/zoned.c | 24 +++++++++++++++++++----- 2 files changed, 26 insertions(+), 6 deletions(-)
diff --git a/fs/btrfs/free-space-cache.c b/fs/btrfs/free-space-cache.c index 0d250d052487c..d84cef89cdff5 100644 --- a/fs/btrfs/free-space-cache.c +++ b/fs/btrfs/free-space-cache.c @@ -2693,8 +2693,13 @@ static int __btrfs_add_free_space_zoned(struct btrfs_block_group *block_group, bg_reclaim_threshold = READ_ONCE(sinfo->bg_reclaim_threshold);
spin_lock(&ctl->tree_lock); + /* Count initial region as zone_unusable until it gets activated. */ if (!used) to_free = size; + else if (initial && + test_bit(BTRFS_FS_ACTIVE_ZONE_TRACKING, &block_group->fs_info->flags) && + (block_group->flags & (BTRFS_BLOCK_GROUP_METADATA | BTRFS_BLOCK_GROUP_SYSTEM))) + to_free = 0; else if (initial) to_free = block_group->zone_capacity; else if (offset >= block_group->alloc_offset) @@ -2722,7 +2727,8 @@ static int __btrfs_add_free_space_zoned(struct btrfs_block_group *block_group, reclaimable_unusable = block_group->zone_unusable - (block_group->length - block_group->zone_capacity); /* All the region is now unusable. Mark it as unused and reclaim */ - if (block_group->zone_unusable == block_group->length) { + if (block_group->zone_unusable == block_group->length && + block_group->alloc_offset) { btrfs_mark_bg_unused(block_group); } else if (bg_reclaim_threshold && reclaimable_unusable >= diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c index a6a8bc112fc42..c3c763cc06399 100644 --- a/fs/btrfs/zoned.c +++ b/fs/btrfs/zoned.c @@ -1576,9 +1576,19 @@ void btrfs_calc_zone_unusable(struct btrfs_block_group *cache) return;
WARN_ON(cache->bytes_super != 0); - unusable = (cache->alloc_offset - cache->used) + - (cache->length - cache->zone_capacity); - free = cache->zone_capacity - cache->alloc_offset; + + /* Check for block groups never get activated */ + if (test_bit(BTRFS_FS_ACTIVE_ZONE_TRACKING, &cache->fs_info->flags) && + cache->flags & (BTRFS_BLOCK_GROUP_METADATA | BTRFS_BLOCK_GROUP_SYSTEM) && + !test_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, &cache->runtime_flags) && + cache->alloc_offset == 0) { + unusable = cache->length; + free = 0; + } else { + unusable = (cache->alloc_offset - cache->used) + + (cache->length - cache->zone_capacity); + free = cache->zone_capacity - cache->alloc_offset; + }
/* We only need ->free_space in ALLOC_SEQ block groups */ cache->cached = BTRFS_CACHE_FINISHED; @@ -1915,7 +1925,11 @@ bool btrfs_zone_activate(struct btrfs_block_group *block_group)
/* Successfully activated all the zones */ set_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, &block_group->runtime_flags); - space_info->active_total_bytes += block_group->length; + WARN_ON(block_group->alloc_offset != 0); + if (block_group->zone_unusable == block_group->length) { + block_group->zone_unusable = block_group->length - block_group->zone_capacity; + space_info->bytes_zone_unusable -= block_group->zone_capacity; + } spin_unlock(&block_group->lock); btrfs_try_granting_tickets(fs_info, space_info); spin_unlock(&space_info->lock); @@ -2279,7 +2293,7 @@ int btrfs_zone_finish_one_bg(struct btrfs_fs_info *fs_info) u64 avail;
spin_lock(&block_group->lock); - if (block_group->reserved || + if (block_group->reserved || block_group->alloc_offset == 0 || (block_group->flags & BTRFS_BLOCK_GROUP_SYSTEM)) { spin_unlock(&block_group->lock); continue;
From: Naohiro Aota naohiro.aota@wdc.com
[ Upstream commit e15acc25880cf048dba9df94d76ed7e7e10040e6 ]
The space_info->active_total_bytes is no longer necessary as we now count the region of newly allocated block group as zone_unusable. Drop its usage.
Fixes: 6a921de58992 ("btrfs: zoned: introduce space_info->active_total_bytes") CC: stable@vger.kernel.org # 6.1+ Signed-off-by: Naohiro Aota naohiro.aota@wdc.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/block-group.c | 6 ------ fs/btrfs/space-info.c | 40 +++++++++------------------------------- fs/btrfs/space-info.h | 2 -- fs/btrfs/zoned.c | 4 ---- 4 files changed, 9 insertions(+), 43 deletions(-)
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c index d628d545ffea7..8eb625318e785 100644 --- a/fs/btrfs/block-group.c +++ b/fs/btrfs/block-group.c @@ -1036,14 +1036,8 @@ int btrfs_remove_block_group(struct btrfs_trans_handle *trans, < block_group->zone_unusable); WARN_ON(block_group->space_info->disk_total < block_group->length * factor); - WARN_ON(test_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, - &block_group->runtime_flags) && - block_group->space_info->active_total_bytes - < block_group->length); } block_group->space_info->total_bytes -= block_group->length; - if (test_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, &block_group->runtime_flags)) - block_group->space_info->active_total_bytes -= block_group->length; block_group->space_info->bytes_readonly -= (block_group->length - block_group->zone_unusable); block_group->space_info->bytes_zone_unusable -= diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c index 2237685d1ed0c..3eecce86f63fc 100644 --- a/fs/btrfs/space-info.c +++ b/fs/btrfs/space-info.c @@ -308,8 +308,6 @@ void btrfs_add_bg_to_space_info(struct btrfs_fs_info *info, ASSERT(found); spin_lock(&found->lock); found->total_bytes += block_group->length; - if (test_bit(BLOCK_GROUP_FLAG_ZONE_IS_ACTIVE, &block_group->runtime_flags)) - found->active_total_bytes += block_group->length; found->disk_total += block_group->length * factor; found->bytes_used += block_group->used; found->disk_used += block_group->used * factor; @@ -379,22 +377,6 @@ static u64 calc_available_free_space(struct btrfs_fs_info *fs_info, return avail; }
-static inline u64 writable_total_bytes(struct btrfs_fs_info *fs_info, - struct btrfs_space_info *space_info) -{ - /* - * On regular filesystem, all total_bytes are always writable. On zoned - * filesystem, there may be a limitation imposed by max_active_zones. - * For metadata allocation, we cannot finish an existing active block - * group to avoid a deadlock. Thus, we need to consider only the active - * groups to be writable for metadata space. - */ - if (!btrfs_is_zoned(fs_info) || (space_info->flags & BTRFS_BLOCK_GROUP_DATA)) - return space_info->total_bytes; - - return space_info->active_total_bytes; -} - int btrfs_can_overcommit(struct btrfs_fs_info *fs_info, struct btrfs_space_info *space_info, u64 bytes, enum btrfs_reserve_flush_enum flush) @@ -413,7 +395,7 @@ int btrfs_can_overcommit(struct btrfs_fs_info *fs_info, else avail = calc_available_free_space(fs_info, space_info, flush);
- if (used + bytes < writable_total_bytes(fs_info, space_info) + avail) + if (used + bytes < space_info->total_bytes + avail) return 1; return 0; } @@ -449,7 +431,7 @@ void btrfs_try_granting_tickets(struct btrfs_fs_info *fs_info, ticket = list_first_entry(head, struct reserve_ticket, list);
/* Check and see if our ticket can be satisfied now. */ - if ((used + ticket->bytes <= writable_total_bytes(fs_info, space_info)) || + if ((used + ticket->bytes <= space_info->total_bytes) || btrfs_can_overcommit(fs_info, space_info, ticket->bytes, flush)) { btrfs_space_info_update_bytes_may_use(fs_info, @@ -829,7 +811,6 @@ btrfs_calc_reclaim_metadata_size(struct btrfs_fs_info *fs_info, { u64 used; u64 avail; - u64 total; u64 to_reclaim = space_info->reclaim_size;
lockdep_assert_held(&space_info->lock); @@ -844,9 +825,8 @@ btrfs_calc_reclaim_metadata_size(struct btrfs_fs_info *fs_info, * space. If that's the case add in our overage so we make sure to put * appropriate pressure on the flushing state machine. */ - total = writable_total_bytes(fs_info, space_info); - if (total + avail < used) - to_reclaim += used - (total + avail); + if (space_info->total_bytes + avail < used) + to_reclaim += used - (space_info->total_bytes + avail);
return to_reclaim; } @@ -856,11 +836,10 @@ static bool need_preemptive_reclaim(struct btrfs_fs_info *fs_info, { u64 global_rsv_size = fs_info->global_block_rsv.reserved; u64 ordered, delalloc; - u64 total = writable_total_bytes(fs_info, space_info); u64 thresh; u64 used;
- thresh = mult_perc(total, 90); + thresh = mult_perc(space_info->total_bytes, 90);
lockdep_assert_held(&space_info->lock);
@@ -923,8 +902,8 @@ static bool need_preemptive_reclaim(struct btrfs_fs_info *fs_info, BTRFS_RESERVE_FLUSH_ALL); used = space_info->bytes_used + space_info->bytes_reserved + space_info->bytes_readonly + global_rsv_size; - if (used < total) - thresh += total - used; + if (used < space_info->total_bytes) + thresh += space_info->total_bytes - used; thresh >>= space_info->clamp;
used = space_info->bytes_pinned; @@ -1651,7 +1630,7 @@ static int __reserve_bytes(struct btrfs_fs_info *fs_info, * can_overcommit() to ensure we can overcommit to continue. */ if (!pending_tickets && - ((used + orig_bytes <= writable_total_bytes(fs_info, space_info)) || + ((used + orig_bytes <= space_info->total_bytes) || btrfs_can_overcommit(fs_info, space_info, orig_bytes, flush))) { btrfs_space_info_update_bytes_may_use(fs_info, space_info, orig_bytes); @@ -1665,8 +1644,7 @@ static int __reserve_bytes(struct btrfs_fs_info *fs_info, */ if (ret && unlikely(flush == BTRFS_RESERVE_FLUSH_EMERGENCY)) { used = btrfs_space_info_used(space_info, false); - if (used + orig_bytes <= - writable_total_bytes(fs_info, space_info)) { + if (used + orig_bytes <= space_info->total_bytes) { btrfs_space_info_update_bytes_may_use(fs_info, space_info, orig_bytes); ret = 0; diff --git a/fs/btrfs/space-info.h b/fs/btrfs/space-info.h index fc99ea2b0c34f..2033b71b18cec 100644 --- a/fs/btrfs/space-info.h +++ b/fs/btrfs/space-info.h @@ -96,8 +96,6 @@ struct btrfs_space_info { u64 bytes_may_use; /* number of bytes that may be used for delalloc/allocations */ u64 bytes_readonly; /* total bytes that are read only */ - /* Total bytes in the space, but only accounts active block groups. */ - u64 active_total_bytes; u64 bytes_zone_unusable; /* total bytes that are unusable until resetting the device zone */
diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c index c3c763cc06399..ce5ebba7fdd9a 100644 --- a/fs/btrfs/zoned.c +++ b/fs/btrfs/zoned.c @@ -2330,10 +2330,6 @@ int btrfs_zoned_activate_one_bg(struct btrfs_fs_info *fs_info, if (!btrfs_is_zoned(fs_info) || (space_info->flags & BTRFS_BLOCK_GROUP_DATA)) return 0;
- /* No more block groups to activate */ - if (space_info->active_total_bytes == space_info->total_bytes) - return 0; - for (;;) { int ret; bool need_finish = false;
From: Eric Biggers ebiggers@google.com
[ Upstream commit a075bacde257f755bea0e53400c9f1cdd1b8e8e6 ]
The full pagecache drop at the end of FS_IOC_ENABLE_VERITY is causing performance problems and is hindering adoption of fsverity. It was intended to solve a race condition where unverified pages might be left in the pagecache. But actually it doesn't solve it fully.
Since the incomplete solution for this race condition has too much performance impact for it to be worth it, let's remove it for now.
Fixes: 3fda4c617e84 ("fs-verity: implement FS_IOC_ENABLE_VERITY ioctl") Cc: stable@vger.kernel.org Reviewed-by: Victor Hsieh victorhsieh@google.com Link: https://lore.kernel.org/r/20230314235332.50270-1-ebiggers@kernel.org Signed-off-by: Eric Biggers ebiggers@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/verity/enable.c | 24 +++++++++++++----------- 1 file changed, 13 insertions(+), 11 deletions(-)
diff --git a/fs/verity/enable.c b/fs/verity/enable.c index df6b499bf6a14..400c264bf8930 100644 --- a/fs/verity/enable.c +++ b/fs/verity/enable.c @@ -390,25 +390,27 @@ int fsverity_ioctl_enable(struct file *filp, const void __user *uarg) goto out_drop_write;
err = enable_verity(filp, &arg); - if (err) - goto out_allow_write_access;
/* - * Some pages of the file may have been evicted from pagecache after - * being used in the Merkle tree construction, then read into pagecache - * again by another process reading from the file concurrently. Since - * these pages didn't undergo verification against the file digest which - * fs-verity now claims to be enforcing, we have to wipe the pagecache - * to ensure that all future reads are verified. + * We no longer drop the inode's pagecache after enabling verity. This + * used to be done to try to avoid a race condition where pages could be + * evicted after being used in the Merkle tree construction, then + * re-instantiated by a concurrent read. Such pages are unverified, and + * the backing storage could have filled them with different content, so + * they shouldn't be used to fulfill reads once verity is enabled. + * + * But, dropping the pagecache has a big performance impact, and it + * doesn't fully solve the race condition anyway. So for those reasons, + * and also because this race condition isn't very important relatively + * speaking (especially for small-ish files, where the chance of a page + * being used, evicted, *and* re-instantiated all while enabling verity + * is quite small), we no longer drop the inode's pagecache. */ - filemap_write_and_wait(inode->i_mapping); - invalidate_inode_pages2(inode->i_mapping);
/* * allow_write_access() is needed to pair with deny_write_access(). * Regardless, the filesystem won't allow writing to verity files. */ -out_allow_write_access: allow_write_access(filp); out_drop_write: mnt_drop_write_file(filp);
From: Paulo Alcantara pc@manguebit.com
[ Upstream commit c24bb1a87dc3f2d77d410eaac2c6a295961bf50e ]
Make sure to unload_nls() @nls_codepage if we no longer need it.
Fixes: bc962159e8e3 ("cifs: avoid race conditions with parallel reconnects") Signed-off-by: Paulo Alcantara (SUSE) pc@manguebit.com Cc: Shyam Prasad N sprasad@microsoft.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cifs/smb2pdu.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/fs/cifs/smb2pdu.c b/fs/cifs/smb2pdu.c index f0b1ae0835d71..b37379b62cc77 100644 --- a/fs/cifs/smb2pdu.c +++ b/fs/cifs/smb2pdu.c @@ -144,7 +144,7 @@ smb2_reconnect(__le16 smb2_command, struct cifs_tcon *tcon, struct TCP_Server_Info *server) { int rc = 0; - struct nls_table *nls_codepage; + struct nls_table *nls_codepage = NULL; struct cifs_ses *ses;
/* @@ -216,8 +216,6 @@ smb2_reconnect(__le16 smb2_command, struct cifs_tcon *tcon, tcon->ses->chans_need_reconnect, tcon->need_reconnect);
- nls_codepage = load_nls_default(); - mutex_lock(&ses->session_mutex); /* * Recheck after acquire mutex. If another thread is negotiating @@ -237,6 +235,8 @@ smb2_reconnect(__le16 smb2_command, struct cifs_tcon *tcon, } spin_unlock(&server->srv_lock);
+ nls_codepage = load_nls_default(); + /* * need to prevent multiple threads trying to simultaneously * reconnect the same SMB session
From: Herbert Xu herbert@gondor.apana.org.au
[ Upstream commit 8222d5910dae08213b6d9d4bc9a7f8502855e624 ]
When copying data to user-space we should ensure that only valid data is copied over. Padding in structures may be filled with random (possibly sensitve) data and should never be given directly to user-space.
This patch fixes the copying of xfrm algorithms and the encap template in xfrm_user so that padding is zeroed.
Reported-by: syzbot+fa5414772d5c445dac3c@syzkaller.appspotmail.com Reported-by: Hyunwoo Kim v4bel@theori.io Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Reviewed-by: Sabrina Dubroca sd@queasysnail.net Signed-off-by: Steffen Klassert steffen.klassert@secunet.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/xfrm/xfrm_user.c | 45 ++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 41 insertions(+), 4 deletions(-)
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c index cf5172d4ce68c..103af2b3e986f 100644 --- a/net/xfrm/xfrm_user.c +++ b/net/xfrm/xfrm_user.c @@ -1012,7 +1012,9 @@ static int copy_to_user_aead(struct xfrm_algo_aead *aead, struct sk_buff *skb) return -EMSGSIZE;
ap = nla_data(nla); - memcpy(ap, aead, sizeof(*aead)); + strscpy_pad(ap->alg_name, aead->alg_name, sizeof(ap->alg_name)); + ap->alg_key_len = aead->alg_key_len; + ap->alg_icv_len = aead->alg_icv_len;
if (redact_secret && aead->alg_key_len) memset(ap->alg_key, 0, (aead->alg_key_len + 7) / 8); @@ -1032,7 +1034,8 @@ static int copy_to_user_ealg(struct xfrm_algo *ealg, struct sk_buff *skb) return -EMSGSIZE;
ap = nla_data(nla); - memcpy(ap, ealg, sizeof(*ealg)); + strscpy_pad(ap->alg_name, ealg->alg_name, sizeof(ap->alg_name)); + ap->alg_key_len = ealg->alg_key_len;
if (redact_secret && ealg->alg_key_len) memset(ap->alg_key, 0, (ealg->alg_key_len + 7) / 8); @@ -1043,6 +1046,40 @@ static int copy_to_user_ealg(struct xfrm_algo *ealg, struct sk_buff *skb) return 0; }
+static int copy_to_user_calg(struct xfrm_algo *calg, struct sk_buff *skb) +{ + struct nlattr *nla = nla_reserve(skb, XFRMA_ALG_COMP, sizeof(*calg)); + struct xfrm_algo *ap; + + if (!nla) + return -EMSGSIZE; + + ap = nla_data(nla); + strscpy_pad(ap->alg_name, calg->alg_name, sizeof(ap->alg_name)); + ap->alg_key_len = 0; + + return 0; +} + +static int copy_to_user_encap(struct xfrm_encap_tmpl *ep, struct sk_buff *skb) +{ + struct nlattr *nla = nla_reserve(skb, XFRMA_ENCAP, sizeof(*ep)); + struct xfrm_encap_tmpl *uep; + + if (!nla) + return -EMSGSIZE; + + uep = nla_data(nla); + memset(uep, 0, sizeof(*uep)); + + uep->encap_type = ep->encap_type; + uep->encap_sport = ep->encap_sport; + uep->encap_dport = ep->encap_dport; + uep->encap_oa = ep->encap_oa; + + return 0; +} + static int xfrm_smark_put(struct sk_buff *skb, struct xfrm_mark *m) { int ret = 0; @@ -1098,12 +1135,12 @@ static int copy_to_user_state_extra(struct xfrm_state *x, goto out; } if (x->calg) { - ret = nla_put(skb, XFRMA_ALG_COMP, sizeof(*(x->calg)), x->calg); + ret = copy_to_user_calg(x->calg, skb); if (ret) goto out; } if (x->encap) { - ret = nla_put(skb, XFRMA_ENCAP, sizeof(*x->encap), x->encap); + ret = copy_to_user_encap(x->encap, skb); if (ret) goto out; }
From: Ravulapati Vishnu Vardhan Rao quic_visr@quicinc.com
[ Upstream commit e5e7e398f6bb7918dab0612eb6991f7bae95520d ]
When we run syzkaller we get below Out of Bound. "KASAN: slab-out-of-bounds Read in regcache_flat_read"
Below is the backtrace of the issue:
dump_backtrace+0x0/0x4c8 show_stack+0x34/0x44 dump_stack_lvl+0xd8/0x118 print_address_description+0x30/0x2d8 kasan_report+0x158/0x198 __asan_report_load4_noabort+0x44/0x50 regcache_flat_read+0x10c/0x110 regcache_read+0xf4/0x180 _regmap_read+0xc4/0x278 _regmap_update_bits+0x130/0x290 regmap_update_bits_base+0xc0/0x15c snd_soc_component_update_bits+0xa8/0x22c snd_soc_component_write_field+0x68/0xd4 tx_macro_digital_mute+0xec/0x140
Actually There is no need to have decimator with 32 bits. By limiting the variable with short type u8 issue is resolved.
Signed-off-by: Ravulapati Vishnu Vardhan Rao quic_visr@quicinc.com Link: https://lore.kernel.org/r/20230304080702.609-1-quic_visr@quicinc.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/lpass-tx-macro.c | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/sound/soc/codecs/lpass-tx-macro.c b/sound/soc/codecs/lpass-tx-macro.c index 2449a2df66df0..8facdb922f076 100644 --- a/sound/soc/codecs/lpass-tx-macro.c +++ b/sound/soc/codecs/lpass-tx-macro.c @@ -242,7 +242,7 @@ enum {
struct tx_mute_work { struct tx_macro *tx; - u32 decimator; + u8 decimator; struct delayed_work dwork; };
@@ -635,7 +635,7 @@ static int tx_macro_mclk_enable(struct tx_macro *tx, return 0; }
-static bool is_amic_enabled(struct snd_soc_component *component, int decimator) +static bool is_amic_enabled(struct snd_soc_component *component, u8 decimator) { u16 adc_mux_reg, adc_reg, adc_n;
@@ -849,7 +849,7 @@ static int tx_macro_enable_dec(struct snd_soc_dapm_widget *w, struct snd_kcontrol *kcontrol, int event) { struct snd_soc_component *component = snd_soc_dapm_to_component(w->dapm); - unsigned int decimator; + u8 decimator; u16 tx_vol_ctl_reg, dec_cfg_reg, hpf_gate_reg, tx_gain_ctl_reg; u8 hpf_cut_off_freq; int hpf_delay = TX_MACRO_DMIC_HPF_DELAY_MS; @@ -1064,7 +1064,8 @@ static int tx_macro_hw_params(struct snd_pcm_substream *substream, struct snd_soc_dai *dai) { struct snd_soc_component *component = dai->component; - u32 decimator, sample_rate; + u32 sample_rate; + u8 decimator; int tx_fs_rate; struct tx_macro *tx = snd_soc_component_get_drvdata(component);
@@ -1128,7 +1129,7 @@ static int tx_macro_digital_mute(struct snd_soc_dai *dai, int mute, int stream) { struct snd_soc_component *component = dai->component; struct tx_macro *tx = snd_soc_component_get_drvdata(component); - u16 decimator; + u8 decimator;
/* active decimator not set yet */ if (tx->active_decimator[dai->id] == -1)
From: Amadeusz Sławiński amadeuszx.slawinski@linux.intel.com
[ Upstream commit d16c893425d07ada1fdd817ec06d322efcf69480 ]
max98357a is speaker codec configured in 48000/2/S16_LE format regardless of front end format, so force it to be so.
Reviewed-by: Cezary Rojewski cezary.rojewski@intel.com Signed-off-by: Amadeusz Sławiński amadeuszx.slawinski@linux.intel.com Link: https://lore.kernel.org/r/20230303134854.2277146-2-amadeuszx.slawinski@linux... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/intel/avs/boards/max98357a.c | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+)
diff --git a/sound/soc/intel/avs/boards/max98357a.c b/sound/soc/intel/avs/boards/max98357a.c index 921f42caf7e09..183123d08c5a3 100644 --- a/sound/soc/intel/avs/boards/max98357a.c +++ b/sound/soc/intel/avs/boards/max98357a.c @@ -8,6 +8,7 @@
#include <linux/module.h> #include <linux/platform_device.h> +#include <sound/pcm_params.h> #include <sound/soc.h> #include <sound/soc-acpi.h> #include <sound/soc-dapm.h> @@ -24,6 +25,26 @@ static const struct snd_soc_dapm_route card_base_routes[] = { { "Spk", NULL, "Speaker" }, };
+static int +avs_max98357a_be_fixup(struct snd_soc_pcm_runtime *runrime, struct snd_pcm_hw_params *params) +{ + struct snd_interval *rate, *channels; + struct snd_mask *fmt; + + rate = hw_param_interval(params, SNDRV_PCM_HW_PARAM_RATE); + channels = hw_param_interval(params, SNDRV_PCM_HW_PARAM_CHANNELS); + fmt = hw_param_mask(params, SNDRV_PCM_HW_PARAM_FORMAT); + + /* The ADSP will convert the FE rate to 48k, stereo */ + rate->min = rate->max = 48000; + channels->min = channels->max = 2; + + /* set SSP0 to 16 bit */ + snd_mask_none(fmt); + snd_mask_set_format(fmt, SNDRV_PCM_FORMAT_S16_LE); + return 0; +} + static int avs_create_dai_link(struct device *dev, const char *platform_name, int ssp_port, struct snd_soc_dai_link **dai_link) { @@ -55,6 +76,7 @@ static int avs_create_dai_link(struct device *dev, const char *platform_name, in dl->num_platforms = 1; dl->id = 0; dl->dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF | SND_SOC_DAIFMT_CBS_CFS; + dl->be_hw_params_fixup = avs_max98357a_be_fixup; dl->nonatomic = 1; dl->no_pcm = 1; dl->dpcm_playback = 1;
From: Amadeusz Sławiński amadeuszx.slawinski@linux.intel.com
[ Upstream commit 61f368624fe4d0c25c6e9c917574b8ace51d776e ]
da7219 is headset codec configured in 48000/2/S24_LE format regardless of front end format, so force it to be so.
Reviewed-by: Cezary Rojewski cezary.rojewski@intel.com Signed-off-by: Amadeusz Sławiński amadeuszx.slawinski@linux.intel.com Link: https://lore.kernel.org/r/20230303134854.2277146-3-amadeuszx.slawinski@linux... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/intel/avs/boards/da7219.c | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+)
diff --git a/sound/soc/intel/avs/boards/da7219.c b/sound/soc/intel/avs/boards/da7219.c index acd43b6108e99..1a1d572cc1d02 100644 --- a/sound/soc/intel/avs/boards/da7219.c +++ b/sound/soc/intel/avs/boards/da7219.c @@ -117,6 +117,26 @@ static void avs_da7219_codec_exit(struct snd_soc_pcm_runtime *rtd) snd_soc_component_set_jack(asoc_rtd_to_codec(rtd, 0)->component, NULL, NULL); }
+static int +avs_da7219_be_fixup(struct snd_soc_pcm_runtime *runrime, struct snd_pcm_hw_params *params) +{ + struct snd_interval *rate, *channels; + struct snd_mask *fmt; + + rate = hw_param_interval(params, SNDRV_PCM_HW_PARAM_RATE); + channels = hw_param_interval(params, SNDRV_PCM_HW_PARAM_CHANNELS); + fmt = hw_param_mask(params, SNDRV_PCM_HW_PARAM_FORMAT); + + /* The ADSP will convert the FE rate to 48k, stereo */ + rate->min = rate->max = 48000; + channels->min = channels->max = 2; + + /* set SSP0 to 24 bit */ + snd_mask_none(fmt); + snd_mask_set_format(fmt, SNDRV_PCM_FORMAT_S24_LE); + return 0; +} + static int avs_create_dai_link(struct device *dev, const char *platform_name, int ssp_port, struct snd_soc_dai_link **dai_link) { @@ -148,6 +168,7 @@ static int avs_create_dai_link(struct device *dev, const char *platform_name, in dl->num_platforms = 1; dl->id = 0; dl->dai_fmt = SND_SOC_DAIFMT_I2S | SND_SOC_DAIFMT_NB_NF | SND_SOC_DAIFMT_CBS_CFS; + dl->be_hw_params_fixup = avs_da7219_be_fixup; dl->init = avs_da7219_codec_init; dl->exit = avs_da7219_codec_exit; dl->nonatomic = 1;
From: Amadeusz Sławiński amadeuszx.slawinski@linux.intel.com
[ Upstream commit d24dbc865c2bd5946bef62bb862a65df092dfc79 ]
rt5682 is headset codec configured in 48000/2/S24_LE format regardless of front end format, so force it to be so.
Reviewed-by: Cezary Rojewski cezary.rojewski@intel.com Signed-off-by: Amadeusz Sławiński amadeuszx.slawinski@linux.intel.com Link: https://lore.kernel.org/r/20230303134854.2277146-4-amadeuszx.slawinski@linux... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/intel/avs/boards/rt5682.c | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+)
diff --git a/sound/soc/intel/avs/boards/rt5682.c b/sound/soc/intel/avs/boards/rt5682.c index 473e9fe5d0bf7..b2c2ba93dcb56 100644 --- a/sound/soc/intel/avs/boards/rt5682.c +++ b/sound/soc/intel/avs/boards/rt5682.c @@ -169,6 +169,27 @@ static const struct snd_soc_ops avs_rt5682_ops = { .hw_params = avs_rt5682_hw_params, };
+static int +avs_rt5682_be_fixup(struct snd_soc_pcm_runtime *runtime, struct snd_pcm_hw_params *params) +{ + struct snd_interval *rate, *channels; + struct snd_mask *fmt; + + rate = hw_param_interval(params, SNDRV_PCM_HW_PARAM_RATE); + channels = hw_param_interval(params, SNDRV_PCM_HW_PARAM_CHANNELS); + fmt = hw_param_mask(params, SNDRV_PCM_HW_PARAM_FORMAT); + + /* The ADSP will convert the FE rate to 48k, stereo */ + rate->min = rate->max = 48000; + channels->min = channels->max = 2; + + /* set SSPN to 24 bit */ + snd_mask_none(fmt); + snd_mask_set_format(fmt, SNDRV_PCM_FORMAT_S24_LE); + + return 0; +} + static int avs_create_dai_link(struct device *dev, const char *platform_name, int ssp_port, struct snd_soc_dai_link **dai_link) { @@ -201,6 +222,7 @@ static int avs_create_dai_link(struct device *dev, const char *platform_name, in dl->id = 0; dl->init = avs_rt5682_codec_init; dl->exit = avs_rt5682_codec_exit; + dl->be_hw_params_fixup = avs_rt5682_be_fixup; dl->ops = &avs_rt5682_ops; dl->nonatomic = 1; dl->no_pcm = 1;
From: Cezary Rojewski cezary.rojewski@intel.com
[ Upstream commit 933de2d127281731166cf2880fa1e23c5a0f7faa ]
Some of the nau8825 clock control got into the ssm4567, remove it.
Signed-off-by: Cezary Rojewski cezary.rojewski@intel.com Signed-off-by: Amadeusz Sławiński amadeuszx.slawinski@linux.intel.com Link: https://lore.kernel.org/r/20230303134854.2277146-5-amadeuszx.slawinski@linux... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/intel/avs/boards/ssm4567.c | 31 ---------------------------- 1 file changed, 31 deletions(-)
diff --git a/sound/soc/intel/avs/boards/ssm4567.c b/sound/soc/intel/avs/boards/ssm4567.c index c5db696127624..2b7f5ad92aca7 100644 --- a/sound/soc/intel/avs/boards/ssm4567.c +++ b/sound/soc/intel/avs/boards/ssm4567.c @@ -15,7 +15,6 @@ #include <sound/soc-acpi.h> #include "../../../codecs/nau8825.h"
-#define SKL_NUVOTON_CODEC_DAI "nau8825-hifi" #define SKL_SSM_CODEC_DAI "ssm4567-hifi"
static struct snd_soc_codec_conf card_codec_conf[] = { @@ -34,41 +33,11 @@ static const struct snd_kcontrol_new card_controls[] = { SOC_DAPM_PIN_SWITCH("Right Speaker"), };
-static int -platform_clock_control(struct snd_soc_dapm_widget *w, struct snd_kcontrol *control, int event) -{ - struct snd_soc_dapm_context *dapm = w->dapm; - struct snd_soc_card *card = dapm->card; - struct snd_soc_dai *codec_dai; - int ret; - - codec_dai = snd_soc_card_get_codec_dai(card, SKL_NUVOTON_CODEC_DAI); - if (!codec_dai) { - dev_err(card->dev, "Codec dai not found\n"); - return -EINVAL; - } - - if (SND_SOC_DAPM_EVENT_ON(event)) { - ret = snd_soc_dai_set_sysclk(codec_dai, NAU8825_CLK_MCLK, 24000000, - SND_SOC_CLOCK_IN); - if (ret < 0) - dev_err(card->dev, "set sysclk err = %d\n", ret); - } else { - ret = snd_soc_dai_set_sysclk(codec_dai, NAU8825_CLK_INTERNAL, 0, SND_SOC_CLOCK_IN); - if (ret < 0) - dev_err(card->dev, "set sysclk err = %d\n", ret); - } - - return ret; -} - static const struct snd_soc_dapm_widget card_widgets[] = { SND_SOC_DAPM_SPK("Left Speaker", NULL), SND_SOC_DAPM_SPK("Right Speaker", NULL), SND_SOC_DAPM_SPK("DP1", NULL), SND_SOC_DAPM_SPK("DP2", NULL), - SND_SOC_DAPM_SUPPLY("Platform Clock", SND_SOC_NOPM, 0, 0, platform_clock_control, - SND_SOC_DAPM_PRE_PMU | SND_SOC_DAPM_POST_PMD), };
static const struct snd_soc_dapm_route card_base_routes[] = {
From: Cezary Rojewski cezary.rojewski@intel.com
[ Upstream commit 6206b2e787da2ed567922c37bb588a44f6fb6705 ]
Internal clock shall be adjusted also in cases when DAPM event other than 'ON' is triggered.
Signed-off-by: Cezary Rojewski cezary.rojewski@intel.com Signed-off-by: Amadeusz Sławiński amadeuszx.slawinski@linux.intel.com Link: https://lore.kernel.org/r/20230303134854.2277146-6-amadeuszx.slawinski@linux... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/intel/avs/boards/nau8825.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-)
diff --git a/sound/soc/intel/avs/boards/nau8825.c b/sound/soc/intel/avs/boards/nau8825.c index 6731d8a490767..49438a67a77c6 100644 --- a/sound/soc/intel/avs/boards/nau8825.c +++ b/sound/soc/intel/avs/boards/nau8825.c @@ -33,15 +33,15 @@ avs_nau8825_clock_control(struct snd_soc_dapm_widget *w, struct snd_kcontrol *co return -EINVAL; }
- if (!SND_SOC_DAPM_EVENT_ON(event)) { + if (SND_SOC_DAPM_EVENT_ON(event)) + ret = snd_soc_dai_set_sysclk(codec_dai, NAU8825_CLK_MCLK, 24000000, + SND_SOC_CLOCK_IN); + else ret = snd_soc_dai_set_sysclk(codec_dai, NAU8825_CLK_INTERNAL, 0, SND_SOC_CLOCK_IN); - if (ret < 0) { - dev_err(card->dev, "set sysclk err = %d\n", ret); - return ret; - } - } + if (ret < 0) + dev_err(card->dev, "Set sysclk failed: %d\n", ret);
- return 0; + return ret; }
static const struct snd_kcontrol_new card_controls[] = {
From: Nick Terrell terrelln@fb.com
[ Upstream commit 038505c41f0aad26ef101f4f7f6e111531c3914f ]
Backport the relevant part of upstream commit 5b266196 [0].
This fixes in-place decompression for x86-64 kernel decompression. It uses a bound of 131072 + (uncompressed_size >> 8), which can be violated after upstream commit 6a7ede3d [1], as zstd can use part of the output buffer as temporary storage, and without this patch needs a bound of ~262144.
The fix is for zstd to detect that the input and output buffers overlap, so that zstd knows it can't use the overlapping portion of the output buffer as tempoary storage. If the margin is not large enough, this will ensure that zstd will fail the decompression, rather than overwriting part of the input data, and causing corruption.
This fix has been landed upstream and is in release v1.5.4. That commit also adds unit and fuzz tests to verify that the margin we use is respected, and correct. That means that the fix is well tested upstream.
I have not been able to reproduce the potential bug in x86-64 kernel decompression locally, nor have I recieved reports of failures to decompress the kernel. It is possible that compression saves enough space to make it very hard for the issue to appear.
I've boot tested the zstd compressed kernel on x86-64 and i386 with this patch, which uses in-place decompression, and sanity tested zstd compression in btrfs / squashfs to make sure that we don't see any issues, but other uses of zstd shouldn't be affected, because they don't use in-place decompression.
Thanks to Vasily Gorbik gor@linux.ibm.com for debugging a related issue on s390, which was triggered by the same commit, but was a bug in how __decompress() was called [2]. And to Sasha Levin sashal@kernel.org for the CC alerting me of the issue.
[0] https://github.com/facebook/zstd/commit/5b266196a41e6a15e21bd4f0eeab43b938db... [1] https://github.com/facebook/zstd/commit/6a7ede3dfccbf3e0a5928b4224a039c260dc... [2] https://lore.kernel.org/r/patch-1.thread-41c676.git-41c676c2d153.your-ad-her...
CC: Vasily Gorbik gor@linux.ibm.com CC: Heiko Carstens hca@linux.ibm.com CC: Sasha Levin sashal@kernel.org CC: Yann Collet cyan@fb.com Signed-off-by: Nick Terrell terrelln@fb.com Signed-off-by: Sasha Levin sashal@kernel.org --- lib/zstd/decompress/zstd_decompress.c | 25 ++++++++++++++++++++++--- 1 file changed, 22 insertions(+), 3 deletions(-)
diff --git a/lib/zstd/decompress/zstd_decompress.c b/lib/zstd/decompress/zstd_decompress.c index b9b935a9f5c0d..6b3177c947114 100644 --- a/lib/zstd/decompress/zstd_decompress.c +++ b/lib/zstd/decompress/zstd_decompress.c @@ -798,7 +798,7 @@ static size_t ZSTD_copyRawBlock(void* dst, size_t dstCapacity, if (srcSize == 0) return 0; RETURN_ERROR(dstBuffer_null, ""); } - ZSTD_memcpy(dst, src, srcSize); + ZSTD_memmove(dst, src, srcSize); return srcSize; }
@@ -858,6 +858,7 @@ static size_t ZSTD_decompressFrame(ZSTD_DCtx* dctx,
/* Loop on each block */ while (1) { + BYTE* oBlockEnd = oend; size_t decodedSize; blockProperties_t blockProperties; size_t const cBlockSize = ZSTD_getcBlockSize(ip, remainingSrcSize, &blockProperties); @@ -867,16 +868,34 @@ static size_t ZSTD_decompressFrame(ZSTD_DCtx* dctx, remainingSrcSize -= ZSTD_blockHeaderSize; RETURN_ERROR_IF(cBlockSize > remainingSrcSize, srcSize_wrong, "");
+ if (ip >= op && ip < oBlockEnd) { + /* We are decompressing in-place. Limit the output pointer so that we + * don't overwrite the block that we are currently reading. This will + * fail decompression if the input & output pointers aren't spaced + * far enough apart. + * + * This is important to set, even when the pointers are far enough + * apart, because ZSTD_decompressBlock_internal() can decide to store + * literals in the output buffer, after the block it is decompressing. + * Since we don't want anything to overwrite our input, we have to tell + * ZSTD_decompressBlock_internal to never write past ip. + * + * See ZSTD_allocateLiteralsBuffer() for reference. + */ + oBlockEnd = op + (ip - op); + } + switch(blockProperties.blockType) { case bt_compressed: - decodedSize = ZSTD_decompressBlock_internal(dctx, op, (size_t)(oend-op), ip, cBlockSize, /* frame */ 1, not_streaming); + decodedSize = ZSTD_decompressBlock_internal(dctx, op, (size_t)(oBlockEnd-op), ip, cBlockSize, /* frame */ 1, not_streaming); break; case bt_raw : + /* Use oend instead of oBlockEnd because this function is safe to overlap. It uses memmove. */ decodedSize = ZSTD_copyRawBlock(op, (size_t)(oend-op), ip, cBlockSize); break; case bt_rle : - decodedSize = ZSTD_setRleBlock(op, (size_t)(oend-op), *ip, blockProperties.origSize); + decodedSize = ZSTD_setRleBlock(op, (size_t)(oBlockEnd-op), *ip, blockProperties.origSize); break; case bt_reserved : default:
From: Jonathan Neuschäfer j.neuschaefer@gmx.net
[ Upstream commit 6906598f1ce93761716d780b6e3f171e13f0f4ce ]
assert(x) should emit a warning if x is false. WARN_ON(x) emits a warning if x is true. Thus, assert(x) should be defined as WARN_ON(!x) rather than WARN_ON(x).
Signed-off-by: Jonathan Neuschäfer j.neuschaefer@gmx.net Signed-off-by: Nick Terrell terrelln@fb.com Signed-off-by: Sasha Levin sashal@kernel.org --- lib/zstd/common/zstd_deps.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/lib/zstd/common/zstd_deps.h b/lib/zstd/common/zstd_deps.h index 7a5bf44839c9c..f06df065dec01 100644 --- a/lib/zstd/common/zstd_deps.h +++ b/lib/zstd/common/zstd_deps.h @@ -84,7 +84,7 @@ static uint64_t ZSTD_div64(uint64_t dividend, uint32_t divisor) {
#include <linux/kernel.h>
-#define assert(x) WARN_ON((x)) +#define assert(x) WARN_ON(!(x))
#endif /* ZSTD_DEPS_ASSERT */ #endif /* ZSTD_DEPS_NEED_ASSERT */
From: Chia-Lin Kao (AceLan) acelan.kao@canonical.com
[ Upstream commit 89b0411481967a2e8c91190a211a359966cfcf4b ]
Sometimes the system boots up with a acpi_video0 backlight interface which doesn't work. So add Dell Vostro 15 3535 into the video_detect_dmi_table to set it to native explicitly.
Signed-off-by: Chia-Lin Kao (AceLan) acelan.kao@canonical.com Signed-off-by: Rafael J. Wysocki rjw@rjwysocki.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/video_detect.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/acpi/video_detect.c b/drivers/acpi/video_detect.c index 710ac640267dd..14d6d81e536fe 100644 --- a/drivers/acpi/video_detect.c +++ b/drivers/acpi/video_detect.c @@ -716,6 +716,13 @@ static const struct dmi_system_id video_detect_dmi_table[] = { DMI_MATCH(DMI_PRODUCT_NAME, "Dell G15 5515"), }, }, + { + .callback = video_detect_force_native, + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."), + DMI_MATCH(DMI_PRODUCT_NAME, "Vostro 15 3535"), + }, + },
/* * Desktops which falsely report a backlight and which our heuristics
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit 5adc409340b1fc82bc1175e602d14ac82ac685e3 ]
x86 ACPI boards which ship with only Android as their factory image usually have pretty broken ACPI tables, relying on everything being hardcoded in the factory kernel image and often disabling parts of the ACPI enumeration kernel code to avoid the broken tables causing issues.
Part of this broken ACPI code is that sometimes these boards have _AEI ACPI GPIO event handlers which are broken.
So far this has been dealt with in the platform/x86/x86-android-tablets.c module, which contains various workarounds for these devices, by it calling acpi_gpiochip_free_interrupts() on gpiochip-s with troublesome handlers to disable the handlers.
But in some cases this is too late, if the handlers are of the edge type then gpiolib-acpi.c's code will already have run them at boot. This can cause issues such as GPIOs ending up as owned by "ACPI:OpRegion", making them unavailable for drivers which actually need them.
Boards with these broken ACPI tables are already listed in drivers/acpi/x86/utils.c for e.g. acpi_quirk_skip_i2c_client_enumeration(). Extend the quirks mechanism for a new acpi_quirk_skip_gpio_event_handlers() helper, this re-uses the DMI-ids rather then having to duplicate the same DMI table in gpiolib-acpi.c .
Also add the new ACPI_QUIRK_SKIP_GPIO_EVENT_HANDLERS quirk to existing boards with troublesome ACPI gpio event handlers, so that the current acpi_gpiochip_free_interrupts() hack can be removed from x86-android-tablets.c .
Signed-off-by: Hans de Goede hdegoede@redhat.com Acked-by: Andy Shevchenko andy.shevchenko@gmail.com Signed-off-by: Rafael J. Wysocki rjw@rjwysocki.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/x86/utils.c | 24 +++++++++++++++++++++--- drivers/gpio/gpiolib-acpi.c | 3 +++ include/acpi/acpi_bus.h | 5 +++++ 3 files changed, 29 insertions(+), 3 deletions(-)
diff --git a/drivers/acpi/x86/utils.c b/drivers/acpi/x86/utils.c index e45285d4e62a4..4bf57cce30bbf 100644 --- a/drivers/acpi/x86/utils.c +++ b/drivers/acpi/x86/utils.c @@ -251,6 +251,7 @@ bool force_storage_d3(void) #define ACPI_QUIRK_UART1_TTY_UART2_SKIP BIT(1) #define ACPI_QUIRK_SKIP_ACPI_AC_AND_BATTERY BIT(2) #define ACPI_QUIRK_USE_ACPI_AC_AND_BATTERY BIT(3) +#define ACPI_QUIRK_SKIP_GPIO_EVENT_HANDLERS BIT(4)
static const struct dmi_system_id acpi_quirk_skip_dmi_ids[] = { /* @@ -286,7 +287,8 @@ static const struct dmi_system_id acpi_quirk_skip_dmi_ids[] = { }, .driver_data = (void *)(ACPI_QUIRK_SKIP_I2C_CLIENTS | ACPI_QUIRK_UART1_TTY_UART2_SKIP | - ACPI_QUIRK_SKIP_ACPI_AC_AND_BATTERY), + ACPI_QUIRK_SKIP_ACPI_AC_AND_BATTERY | + ACPI_QUIRK_SKIP_GPIO_EVENT_HANDLERS), }, { .matches = { @@ -294,7 +296,8 @@ static const struct dmi_system_id acpi_quirk_skip_dmi_ids[] = { DMI_MATCH(DMI_PRODUCT_NAME, "TF103C"), }, .driver_data = (void *)(ACPI_QUIRK_SKIP_I2C_CLIENTS | - ACPI_QUIRK_SKIP_ACPI_AC_AND_BATTERY), + ACPI_QUIRK_SKIP_ACPI_AC_AND_BATTERY | + ACPI_QUIRK_SKIP_GPIO_EVENT_HANDLERS), }, { /* Lenovo Yoga Tablet 2 1050F/L */ @@ -336,7 +339,8 @@ static const struct dmi_system_id acpi_quirk_skip_dmi_ids[] = { DMI_MATCH(DMI_PRODUCT_NAME, "M890BAP"), }, .driver_data = (void *)(ACPI_QUIRK_SKIP_I2C_CLIENTS | - ACPI_QUIRK_SKIP_ACPI_AC_AND_BATTERY), + ACPI_QUIRK_SKIP_ACPI_AC_AND_BATTERY | + ACPI_QUIRK_SKIP_GPIO_EVENT_HANDLERS), }, { /* Whitelabel (sold as various brands) TM800A550L */ @@ -413,6 +417,20 @@ int acpi_quirk_skip_serdev_enumeration(struct device *controller_parent, bool *s return 0; } EXPORT_SYMBOL_GPL(acpi_quirk_skip_serdev_enumeration); + +bool acpi_quirk_skip_gpio_event_handlers(void) +{ + const struct dmi_system_id *dmi_id; + long quirks; + + dmi_id = dmi_first_match(acpi_quirk_skip_dmi_ids); + if (!dmi_id) + return false; + + quirks = (unsigned long)dmi_id->driver_data; + return (quirks & ACPI_QUIRK_SKIP_GPIO_EVENT_HANDLERS); +} +EXPORT_SYMBOL_GPL(acpi_quirk_skip_gpio_event_handlers); #endif
/* Lists of PMIC ACPI HIDs with an (often better) native charger driver */ diff --git a/drivers/gpio/gpiolib-acpi.c b/drivers/gpio/gpiolib-acpi.c index 34ff048e70d0e..7c9175619a1dc 100644 --- a/drivers/gpio/gpiolib-acpi.c +++ b/drivers/gpio/gpiolib-acpi.c @@ -536,6 +536,9 @@ void acpi_gpiochip_request_interrupts(struct gpio_chip *chip) if (ACPI_FAILURE(status)) return;
+ if (acpi_quirk_skip_gpio_event_handlers()) + return; + acpi_walk_resources(handle, METHOD_NAME__AEI, acpi_gpiochip_alloc_event, acpi_gpio);
diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h index 0584e9f6e3397..57acb895c0381 100644 --- a/include/acpi/acpi_bus.h +++ b/include/acpi/acpi_bus.h @@ -657,6 +657,7 @@ static inline bool acpi_quirk_skip_acpi_ac_and_battery(void) #if IS_ENABLED(CONFIG_X86_ANDROID_TABLETS) bool acpi_quirk_skip_i2c_client_enumeration(struct acpi_device *adev); int acpi_quirk_skip_serdev_enumeration(struct device *controller_parent, bool *skip); +bool acpi_quirk_skip_gpio_event_handlers(void); #else static inline bool acpi_quirk_skip_i2c_client_enumeration(struct acpi_device *adev) { @@ -668,6 +669,10 @@ acpi_quirk_skip_serdev_enumeration(struct device *controller_parent, bool *skip) *skip = false; return 0; } +static inline bool acpi_quirk_skip_gpio_event_handlers(void) +{ + return false; +} #endif
#ifdef CONFIG_PM
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit a5cb0695c5f0ac2ab0cedf2c1c0d75826cb73448 ]
The Acer Iconia One 7 B1-750 is a x86 tablet which ships with Android x86 as factory OS. The Android x86 kernel fork ignores I2C devices described in the DSDT, except for the PMIC and Audio codecs.
As usual the Acer Iconia One 7 B1-750's DSDT contains a bunch of extra I2C devices which are not actually there, causing various resource conflicts. Add an ACPI_QUIRK_SKIP_I2C_CLIENTS quirk for the Acer Iconia One 7 B1-750 to the acpi_quirk_skip_dmi_ids table to woraround this.
The DSDT also contains broken ACPI GPIO event handlers, disable those too.
Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Rafael J. Wysocki rjw@rjwysocki.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/x86/utils.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/drivers/acpi/x86/utils.c b/drivers/acpi/x86/utils.c index 4bf57cce30bbf..b2b0e2701333a 100644 --- a/drivers/acpi/x86/utils.c +++ b/drivers/acpi/x86/utils.c @@ -280,6 +280,16 @@ static const struct dmi_system_id acpi_quirk_skip_dmi_ids[] = { * need the x86-android-tablets module to properly work. */ #if IS_ENABLED(CONFIG_X86_ANDROID_TABLETS) + { + /* Acer Iconia One 7 B1-750 */ + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "Insyde"), + DMI_MATCH(DMI_PRODUCT_NAME, "VESPA2"), + }, + .driver_data = (void *)(ACPI_QUIRK_SKIP_I2C_CLIENTS | + ACPI_QUIRK_SKIP_ACPI_AC_AND_BATTERY | + ACPI_QUIRK_SKIP_GPIO_EVENT_HANDLERS), + }, { .matches = { DMI_EXACT_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit 1a1e7540cf501dd5c8b57a577a155cdd13c7e202 ]
The Lenovo Yoga Book X90 is a x86 tablet which ships with Android x86 as factory OS. The Android x86 kernel fork ignores I2C devices described in the DSDT, except for the PMIC and Audio codecs.
As usual the Lenovo Yoga Book X90's DSDT contains a bunch of extra I2C devices which are not actually there, causing various resource conflicts. Add an ACPI_QUIRK_SKIP_I2C_CLIENTS quirk for the Lenovo Yoga Book X90 to the acpi_quirk_skip_dmi_ids table to woraround this.
The DSDT also contains broken ACPI GPIO event handlers, disable those too.
Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Rafael J. Wysocki rjw@rjwysocki.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/x86/utils.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/drivers/acpi/x86/utils.c b/drivers/acpi/x86/utils.c index b2b0e2701333a..da5727069d851 100644 --- a/drivers/acpi/x86/utils.c +++ b/drivers/acpi/x86/utils.c @@ -300,6 +300,17 @@ static const struct dmi_system_id acpi_quirk_skip_dmi_ids[] = { ACPI_QUIRK_SKIP_ACPI_AC_AND_BATTERY | ACPI_QUIRK_SKIP_GPIO_EVENT_HANDLERS), }, + { + /* Lenovo Yoga Book X90F/L */ + .matches = { + DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Intel Corporation"), + DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "CHERRYVIEW D1 PLATFORM"), + DMI_EXACT_MATCH(DMI_PRODUCT_VERSION, "YETI-11"), + }, + .driver_data = (void *)(ACPI_QUIRK_SKIP_I2C_CLIENTS | + ACPI_QUIRK_SKIP_ACPI_AC_AND_BATTERY | + ACPI_QUIRK_SKIP_GPIO_EVENT_HANDLERS), + }, { .matches = { DMI_MATCH(DMI_SYS_VENDOR, "ASUSTeK COMPUTER INC."),
From: Peter Ujfalusi peter.ujfalusi@linux.intel.com
[ Upstream commit 989a3e4479177d0f4afab8be1960731bc0ffbbd0 ]
The sof_ipc3_rx_msg() checks for minimum size of a new rx message but it is missing the check for upper limit. Corrupted or compromised firmware might be able to take advantage of this to cause out of bounds reads outside of the message area.
Reported-by: Curtis Malainey cujomalainey@chromium.org Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com Reviewed-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Reviewed-by: Curtis Malainey curtis@malainey.com Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com Link: https://lore.kernel.org/r/20230307114917.5124-1-peter.ujfalusi@linux.intel.c... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/sof/ipc3.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/sound/soc/sof/ipc3.c b/sound/soc/sof/ipc3.c index 1fef4dcc09368..fde8af5a1f485 100644 --- a/sound/soc/sof/ipc3.c +++ b/sound/soc/sof/ipc3.c @@ -970,8 +970,9 @@ static void sof_ipc3_rx_msg(struct snd_sof_dev *sdev) return; }
- if (hdr.size < sizeof(hdr)) { - dev_err(sdev->dev, "The received message size is invalid\n"); + if (hdr.size < sizeof(hdr) || hdr.size > SOF_IPC_MSG_MAX_SIZE) { + dev_err(sdev->dev, "The received message size is invalid: %u\n", + hdr.size); return; }
From: Seppo Ingalsuo seppo.ingalsuo@linux.intel.com
[ Upstream commit 9e269e3aa9006440de639597079ee7140ef5b5f3 ]
This patch fixes the sample rate print unit from KHz to Hz. E.g. 48000KHz becomes 48000Hz.
Signed-off-by: Seppo Ingalsuo seppo.ingalsuo@linux.intel.com Reviewed-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Reviewed-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com Link: https://lore.kernel.org/r/20230307110751.2053-1-peter.ujfalusi@linux.intel.c... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/sof/ipc4-topology.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/soc/sof/ipc4-topology.c b/sound/soc/sof/ipc4-topology.c index 59f4d42f9011e..65da1cf790d9c 100644 --- a/sound/soc/sof/ipc4-topology.c +++ b/sound/soc/sof/ipc4-topology.c @@ -155,7 +155,7 @@ static void sof_ipc4_dbg_audio_format(struct device *dev, for (i = 0; i < num_format; i++, ptr = (u8 *)ptr + object_size) { fmt = ptr; dev_dbg(dev, - " #%d: %uKHz, %ubit (ch_map %#x ch_cfg %u interleaving_style %u fmt_cfg %#x)\n", + " #%d: %uHz, %ubit (ch_map %#x ch_cfg %u interleaving_style %u fmt_cfg %#x)\n", i, fmt->sampling_frequency, fmt->bit_depth, fmt->ch_map, fmt->ch_cfg, fmt->interleaving_style, fmt->fmt_cfg); }
From: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com
[ Upstream commit ca09e2a351fbc7836ba9418304ff0c3e72addfe0 ]
The logic for the ioremap is to find the resource index 3 (IRAM) and infer the BAR address by subtracting the IRAM offset. The BAR size defined in hardware specifications is 2MB.
The commit 5947b2726beb6 ("ASoC: SOF: Intel: Check the bar size before remapping") tried to find the BAR size by querying the resource length instead of a pre-canned value, but by requesting the size for index 3 it only gets the size of the IRAM. That's obviously wrong and prevents the probe from proceeding.
This commit attempted to fix an issue in a fuzzing/simulated environment but created another on actual devices, so the best course of action is to revert that change.
Reported-by: Ferry Toth fntoth@gmail.com Tested-by: Ferry Toth fntoth@gmail.com (Intel Edison-Arduino) Link: https://github.com/thesofproject/linux/issues/3901 Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Reviewed-by: Péter Ujfalusi peter.ujfalusi@linux.intel.com Reviewed-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com Link: https://lore.kernel.org/r/20230307095341.3222-1-peter.ujfalusi@linux.intel.c... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/sof/intel/pci-tng.c | 6 +----- 1 file changed, 1 insertion(+), 5 deletions(-)
diff --git a/sound/soc/sof/intel/pci-tng.c b/sound/soc/sof/intel/pci-tng.c index 5b2b409752c58..8c22a00266c06 100644 --- a/sound/soc/sof/intel/pci-tng.c +++ b/sound/soc/sof/intel/pci-tng.c @@ -75,11 +75,7 @@ static int tangier_pci_probe(struct snd_sof_dev *sdev)
/* LPE base */ base = pci_resource_start(pci, desc->resindex_lpe_base) - IRAM_OFFSET; - size = pci_resource_len(pci, desc->resindex_lpe_base); - if (size < PCI_BAR_SIZE) { - dev_err(sdev->dev, "error: I/O region is too small.\n"); - return -ENODEV; - } + size = PCI_BAR_SIZE;
dev_dbg(sdev->dev, "LPE PHY base at 0x%x size 0x%x", base, size); sdev->bar[DSP_BAR] = devm_ioremap(sdev->dev, base, size);
From: Rander Wang rander.wang@intel.com
[ Upstream commit 52a55779ed14792a150421339664193d6eb8e036 ]
Add delay between set and wait command according to hardware programming sequence. Also add debug log to detect error.
Signed-off-by: Rander Wang rander.wang@intel.com Reviewed-by: Péter Ujfalusi peter.ujfalusi@linux.intel.com Reviewed-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Reviewed-by: Péter Ujfalusi peter.ujfalusi@linux.intel.com Reviewed-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com Link: https://lore.kernel.org/r/20230307095453.3719-1-peter.ujfalusi@linux.intel.c... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/sof/intel/hda-dsp.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)
diff --git a/sound/soc/sof/intel/hda-dsp.c b/sound/soc/sof/intel/hda-dsp.c index b4eacae8564c8..6b2094f74c9c0 100644 --- a/sound/soc/sof/intel/hda-dsp.c +++ b/sound/soc/sof/intel/hda-dsp.c @@ -399,6 +399,12 @@ static int hda_dsp_update_d0i3c_register(struct snd_sof_dev *sdev, u8 value) snd_sof_dsp_update8(sdev, HDA_DSP_HDA_BAR, chip->d0i3_offset, SOF_HDA_VS_D0I3C_I3, value);
+ /* + * The value written to the D0I3C::I3 bit may not be taken into account immediately. + * A delay is recommended before checking if D0I3C::CIP is cleared + */ + usleep_range(30, 40); + /* Wait for cmd in progress to be cleared before exiting the function */ ret = hda_dsp_wait_d0i3c_done(sdev); if (ret < 0) { @@ -407,6 +413,12 @@ static int hda_dsp_update_d0i3c_register(struct snd_sof_dev *sdev, u8 value) }
reg = snd_sof_dsp_read8(sdev, HDA_DSP_HDA_BAR, chip->d0i3_offset); + /* Confirm d0i3 state changed with paranoia check */ + if ((reg ^ value) & SOF_HDA_VS_D0I3C_I3) { + dev_err(sdev->dev, "failed to update D0I3C!\n"); + return -EIO; + } + trace_sof_intel_D0I3C_updated(sdev, reg);
return 0;
From: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com
[ Upstream commit 8bac40b8ed17ab1be9133e9620f65fae80262b7e ]
This reverts commit a09d82ce0a867 ("ASoC: SOF: Intel: hda-ctrl: remove useless sleep")
It was a mistake to remove those delays, in light of comments in the HDaudio spec captured in snd_hdac_bus_reset_link() that the codec needs time for its initialization and PLL lock.
Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Reviewed-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Reviewed-by: Péter Ujfalusi peter.ujfalusi@linux.intel.com Reviewed-by: Rander Wang rander.wang@intel.com Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com Link: https://lore.kernel.org/r/20230307095412.3416-1-peter.ujfalusi@linux.intel.c... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/sof/intel/hda-ctrl.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/sound/soc/sof/intel/hda-ctrl.c b/sound/soc/sof/intel/hda-ctrl.c index 3aea36c077c9d..f3bdeba284122 100644 --- a/sound/soc/sof/intel/hda-ctrl.c +++ b/sound/soc/sof/intel/hda-ctrl.c @@ -196,12 +196,15 @@ int hda_dsp_ctrl_init_chip(struct snd_sof_dev *sdev) goto err; }
+ usleep_range(500, 1000); + /* exit HDA controller reset */ ret = hda_dsp_ctrl_link_reset(sdev, false); if (ret < 0) { dev_err(sdev->dev, "error: failed to exit HDA controller reset\n"); goto err; } + usleep_range(1000, 1200);
hda_codec_detect_mask(sdev);
From: Rander Wang rander.wang@intel.com
[ Upstream commit e45cd86c3a78bfb9875a5eb8ab5dab459b59bbe2 ]
Recent firmware changes modified the curve duration from 32 to 64 bits, which breaks volume ramps. A simple solution would be to change the definition, but unfortunately the ASoC topology framework only supports up to 32 bit tokens.
This patch suggests breaking the 64 bit value in low and high parts, with only the low-part extracted from topology and high-part only zeroes. Since the curve duration is represented in hundred of nanoseconds, we can still represent a 400s ramp, which is just fine. The defacto ABI change has no effect on existing users since the IPC4 firmware has not been released just yet.
Link: https://github.com/thesofproject/linux/issues/4026
Signed-off-by: Rander Wang rander.wang@intel.com Reviewed-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Reviewed-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Reviewed-by: Bard Liao yung-chuan.liao@linux.intel.com Reviewed-by: Péter Ujfalusi peter.ujfalusi@linux.intel.com Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com Link: https://lore.kernel.org/r/20230307110656.1816-1-peter.ujfalusi@linux.intel.c... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/sof/ipc4-control.c | 3 ++- sound/soc/sof/ipc4-topology.c | 4 ++-- sound/soc/sof/ipc4-topology.h | 6 ++++-- 3 files changed, 8 insertions(+), 5 deletions(-)
diff --git a/sound/soc/sof/ipc4-control.c b/sound/soc/sof/ipc4-control.c index 0d5a578c34962..7442ec1c5a4d4 100644 --- a/sound/soc/sof/ipc4-control.c +++ b/sound/soc/sof/ipc4-control.c @@ -84,7 +84,8 @@ sof_ipc4_set_volume_data(struct snd_sof_dev *sdev, struct snd_sof_widget *swidge }
/* set curve type and duration from topology */ - data.curve_duration = gain->data.curve_duration; + data.curve_duration_l = gain->data.curve_duration_l; + data.curve_duration_h = gain->data.curve_duration_h; data.curve_type = gain->data.curve_type;
msg->data_ptr = &data; diff --git a/sound/soc/sof/ipc4-topology.c b/sound/soc/sof/ipc4-topology.c index 65da1cf790d9c..6da6137fa2cbc 100644 --- a/sound/soc/sof/ipc4-topology.c +++ b/sound/soc/sof/ipc4-topology.c @@ -107,7 +107,7 @@ static const struct sof_topology_token gain_tokens[] = { get_token_u32, offsetof(struct sof_ipc4_gain_data, curve_type)}, {SOF_TKN_GAIN_RAMP_DURATION, SND_SOC_TPLG_TUPLE_TYPE_WORD, get_token_u32, - offsetof(struct sof_ipc4_gain_data, curve_duration)}, + offsetof(struct sof_ipc4_gain_data, curve_duration_l)}, {SOF_TKN_GAIN_VAL, SND_SOC_TPLG_TUPLE_TYPE_WORD, get_token_u32, offsetof(struct sof_ipc4_gain_data, init_val)}, }; @@ -670,7 +670,7 @@ static int sof_ipc4_widget_setup_comp_pga(struct snd_sof_widget *swidget)
dev_dbg(scomp->dev, "pga widget %s: ramp type: %d, ramp duration %d, initial gain value: %#x, cpc %d\n", - swidget->widget->name, gain->data.curve_type, gain->data.curve_duration, + swidget->widget->name, gain->data.curve_type, gain->data.curve_duration_l, gain->data.init_val, gain->base_config.cpc);
ret = sof_ipc4_widget_setup_msg(swidget, &gain->msg); diff --git a/sound/soc/sof/ipc4-topology.h b/sound/soc/sof/ipc4-topology.h index 2363a7cc0b57d..cf9d278524572 100644 --- a/sound/soc/sof/ipc4-topology.h +++ b/sound/soc/sof/ipc4-topology.h @@ -217,14 +217,16 @@ struct sof_ipc4_control_data { * @init_val: Initial value * @curve_type: Curve type * @reserved: reserved for future use - * @curve_duration: Curve duration + * @curve_duration_l: Curve duration low part + * @curve_duration_h: Curve duration high part */ struct sof_ipc4_gain_data { uint32_t channels; uint32_t init_val; uint32_t curve_type; uint32_t reserved; - uint32_t curve_duration; + uint32_t curve_duration_l; + uint32_t curve_duration_h; } __aligned(8);
/**
From: Emil Abildgaard Svendsen EMAS@bang-olufsen.dk
[ Upstream commit e041a2a550582106cba6a7c862c90dfc2ad14492 ]
Currently only one stream is supported. This isn't usally a problem until you have a multi codec audio card. Because the audio card will run startup and shutdown on both capture and playback streams. So if your hdmi-codec only support either playback or capture. Then ALSA can't open for playback and capture.
This patch will ignore if startup and shutdown are called with a non supported stream. Thus, allowing an audio card like this:
+-+ cpu1 <--@-| |-> codec1 (HDMI-CODEC) | |<- codec2 (NOT HDMI-CODEC) +-+
Signed-off-by: Emil Svendsen emas@bang-olufsen.dk Link: https://lore.kernel.org/r/20230309065432.4150700-2-emas@bang-olufsen.dk Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/hdmi-codec.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/sound/soc/codecs/hdmi-codec.c b/sound/soc/codecs/hdmi-codec.c index 74cbbe16f9aec..a22f2ec95901f 100644 --- a/sound/soc/codecs/hdmi-codec.c +++ b/sound/soc/codecs/hdmi-codec.c @@ -428,8 +428,13 @@ static int hdmi_codec_startup(struct snd_pcm_substream *substream, { struct hdmi_codec_priv *hcp = snd_soc_dai_get_drvdata(dai); bool tx = substream->stream == SNDRV_PCM_STREAM_PLAYBACK; + bool has_capture = !hcp->hcd.no_i2s_capture; + bool has_playback = !hcp->hcd.no_i2s_playback; int ret = 0;
+ if (!((has_playback && tx) || (has_capture && !tx))) + return 0; + mutex_lock(&hcp->lock); if (hcp->busy) { dev_err(dai->dev, "Only one simultaneous stream supported!\n"); @@ -468,6 +473,12 @@ static void hdmi_codec_shutdown(struct snd_pcm_substream *substream, struct snd_soc_dai *dai) { struct hdmi_codec_priv *hcp = snd_soc_dai_get_drvdata(dai); + bool tx = substream->stream == SNDRV_PCM_STREAM_PLAYBACK; + bool has_capture = !hcp->hcd.no_i2s_capture; + bool has_playback = !hcp->hcd.no_i2s_playback; + + if (!((has_playback && tx) || (has_capture && !tx))) + return;
hcp->chmap_idx = HDMI_CODEC_CHMAP_IDX_UNKNOWN; hcp->hcd.ops->audio_shutdown(dai->dev->parent, hcp->hcd.data);
From: Johannes Berg johannes.berg@intel.com
[ Upstream commit ce04abc3fcc62cd5640af981ebfd7c4dc3bded28 ]
When userspace sets basic rates, it might send us some rates list that's empty or consists of invalid values only. We're currently ignoring invalid values and then may end up with a rates bitmap that's empty, which later results in a warning.
Reject the call if there were no valid rates.
Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/cfg.c | 21 +++++++++++---------- 1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/net/mac80211/cfg.c b/net/mac80211/cfg.c index d611e15301839..e24d2d5b04ad0 100644 --- a/net/mac80211/cfg.c +++ b/net/mac80211/cfg.c @@ -2576,6 +2576,17 @@ static int ieee80211_change_bss(struct wiphy *wiphy, if (!sband) return -EINVAL;
+ if (params->basic_rates) { + if (!ieee80211_parse_bitrates(link->conf->chandef.width, + wiphy->bands[sband->band], + params->basic_rates, + params->basic_rates_len, + &link->conf->basic_rates)) + return -EINVAL; + changed |= BSS_CHANGED_BASIC_RATES; + ieee80211_check_rate_mask(link); + } + if (params->use_cts_prot >= 0) { link->conf->use_cts_prot = params->use_cts_prot; changed |= BSS_CHANGED_ERP_CTS_PROT; @@ -2597,16 +2608,6 @@ static int ieee80211_change_bss(struct wiphy *wiphy, changed |= BSS_CHANGED_ERP_SLOT; }
- if (params->basic_rates) { - ieee80211_parse_bitrates(link->conf->chandef.width, - wiphy->bands[sband->band], - params->basic_rates, - params->basic_rates_len, - &link->conf->basic_rates); - changed |= BSS_CHANGED_BASIC_RATES; - ieee80211_check_rate_mask(link); - } - if (params->ap_isolate >= 0) { if (params->ap_isolate) sdata->flags |= IEEE80211_SDATA_DONT_BRIDGE_PACKETS;
From: NeilBrown neilb@suse.de
[ Upstream commit 3bc57292278a0b6ac4656cad94c14f2453344b57 ]
slot_store() uses kstrtouint() to get a slot number, but stores the result in an "int" variable (by casting a pointer). This can result in a negative slot number if the unsigned int value is very large.
A negative number means that the slot is empty, but setting a negative slot number this way will not remove the device from the array. I don't think this is a serious problem, but it could cause confusion and it is best to fix it.
Reported-by: Dan Carpenter error27@gmail.com Signed-off-by: NeilBrown neilb@suse.de Signed-off-by: Song Liu song@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/md.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/md/md.c b/drivers/md/md.c index 272cc5d14906f..beab84f0c585c 100644 --- a/drivers/md/md.c +++ b/drivers/md/md.c @@ -3131,6 +3131,9 @@ slot_store(struct md_rdev *rdev, const char *buf, size_t len) err = kstrtouint(buf, 10, (unsigned int *)&slot); if (err < 0) return err; + if (slot < 0) + /* overflow */ + return -ENOSPC; } if (rdev->mddev->pers && slot == -1) { /* Setting 'slot' on an active array requires also
From: Jan Beulich jbeulich@suse.com
[ Upstream commit 934ef33ee75c3846f605f18b65048acd147e3918 ]
A new platform-op was added to Xen to allow obtaining the same VGA console information PV Dom0 is handed. Invoke the new function and have the output data processed by xen_init_vga().
Signed-off-by: Jan Beulich jbeulich@suse.com Reviewed-by: Juergen Gross jgross@suse.com
Link: https://lore.kernel.org/r/8f315e92-7bda-c124-71cc-478ab9c5e610@suse.com Signed-off-by: Juergen Gross jgross@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/xen/Makefile | 2 +- arch/x86/xen/enlighten_pv.c | 3 ++- arch/x86/xen/enlighten_pvh.c | 13 +++++++++++++ arch/x86/xen/vga.c | 5 ++--- arch/x86/xen/xen-ops.h | 7 ++++--- include/xen/interface/platform.h | 3 +++ 6 files changed, 25 insertions(+), 8 deletions(-)
diff --git a/arch/x86/xen/Makefile b/arch/x86/xen/Makefile index 3c5b52fbe4a7f..a9ec8c9f5c5dd 100644 --- a/arch/x86/xen/Makefile +++ b/arch/x86/xen/Makefile @@ -45,6 +45,6 @@ obj-$(CONFIG_PARAVIRT_SPINLOCKS)+= spinlock.o
obj-$(CONFIG_XEN_DEBUG_FS) += debugfs.o
-obj-$(CONFIG_XEN_PV_DOM0) += vga.o +obj-$(CONFIG_XEN_DOM0) += vga.o
obj-$(CONFIG_XEN_EFI) += efi.o diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c index 5b13796628770..68f5f5d209dfa 100644 --- a/arch/x86/xen/enlighten_pv.c +++ b/arch/x86/xen/enlighten_pv.c @@ -1389,7 +1389,8 @@ asmlinkage __visible void __init xen_start_kernel(struct start_info *si)
x86_platform.set_legacy_features = xen_dom0_set_legacy_features; - xen_init_vga(info, xen_start_info->console.dom0.info_size); + xen_init_vga(info, xen_start_info->console.dom0.info_size, + &boot_params.screen_info); xen_start_info->console.domU.mfn = 0; xen_start_info->console.domU.evtchn = 0;
diff --git a/arch/x86/xen/enlighten_pvh.c b/arch/x86/xen/enlighten_pvh.c index bcae606bbc5cf..1da44aca896c6 100644 --- a/arch/x86/xen/enlighten_pvh.c +++ b/arch/x86/xen/enlighten_pvh.c @@ -43,6 +43,19 @@ void __init xen_pvh_init(struct boot_params *boot_params) x86_init.oem.banner = xen_banner;
xen_efi_init(boot_params); + + if (xen_initial_domain()) { + struct xen_platform_op op = { + .cmd = XENPF_get_dom0_console, + }; + long ret = HYPERVISOR_platform_op(&op); + + if (ret > 0) + xen_init_vga(&op.u.dom0_console, + min(ret * sizeof(char), + sizeof(op.u.dom0_console)), + &boot_params->screen_info); + } }
void __init mem_map_via_hcall(struct boot_params *boot_params_p) diff --git a/arch/x86/xen/vga.c b/arch/x86/xen/vga.c index 14ea32e734d59..d97adab8420f4 100644 --- a/arch/x86/xen/vga.c +++ b/arch/x86/xen/vga.c @@ -9,10 +9,9 @@
#include "xen-ops.h"
-void __init xen_init_vga(const struct dom0_vga_console_info *info, size_t size) +void __init xen_init_vga(const struct dom0_vga_console_info *info, size_t size, + struct screen_info *screen_info) { - struct screen_info *screen_info = &boot_params.screen_info; - /* This is drawn from a dump from vgacon:startup in * standard Linux. */ screen_info->orig_video_mode = 3; diff --git a/arch/x86/xen/xen-ops.h b/arch/x86/xen/xen-ops.h index 9a8bb972193d8..a10903785a338 100644 --- a/arch/x86/xen/xen-ops.h +++ b/arch/x86/xen/xen-ops.h @@ -108,11 +108,12 @@ static inline void xen_uninit_lock_cpu(int cpu)
struct dom0_vga_console_info;
-#ifdef CONFIG_XEN_PV_DOM0 -void __init xen_init_vga(const struct dom0_vga_console_info *, size_t size); +#ifdef CONFIG_XEN_DOM0 +void __init xen_init_vga(const struct dom0_vga_console_info *, size_t size, + struct screen_info *); #else static inline void __init xen_init_vga(const struct dom0_vga_console_info *info, - size_t size) + size_t size, struct screen_info *si) { } #endif diff --git a/include/xen/interface/platform.h b/include/xen/interface/platform.h index 655d92e803e14..79a443c65ea93 100644 --- a/include/xen/interface/platform.h +++ b/include/xen/interface/platform.h @@ -483,6 +483,8 @@ struct xenpf_symdata { }; DEFINE_GUEST_HANDLE_STRUCT(xenpf_symdata);
+#define XENPF_get_dom0_console 64 + struct xen_platform_op { uint32_t cmd; uint32_t interface_version; /* XENPF_INTERFACE_VERSION */ @@ -506,6 +508,7 @@ struct xen_platform_op { struct xenpf_mem_hotadd mem_add; struct xenpf_core_parking core_parking; struct xenpf_symdata symdata; + struct dom0_vga_console_info dom0_console; uint8_t pad[128]; } u; };
From: Xiaogang Chen Xiaogang.Chen@amd.com
[ Upstream commit b4ee9606378bb9520c94d8b96f0305c3696f5c29 ]
svm_migrate_ram_to_vram migrates a prange from sys ram to vram. The prange may cross multiple vma. Need remember current dst vram offset in the TTM resource for each migration.
v2: squash in warning fix (Alex)
Signed-off-by: Xiaogang Chen Xiaogang.Chen@amd.com Reviewed-by: Felix Kuehling Felix.Kuehling@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c index 10048ce16aea4..5c319007b4701 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c @@ -289,7 +289,7 @@ static unsigned long svm_migrate_unsuccessful_pages(struct migrate_vma *migrate) static int svm_migrate_copy_to_vram(struct amdgpu_device *adev, struct svm_range *prange, struct migrate_vma *migrate, struct dma_fence **mfence, - dma_addr_t *scratch) + dma_addr_t *scratch, uint64_t ttm_res_offset) { uint64_t npages = migrate->npages; struct device *dev = adev->dev; @@ -299,8 +299,8 @@ svm_migrate_copy_to_vram(struct amdgpu_device *adev, struct svm_range *prange, uint64_t i, j; int r;
- pr_debug("svms 0x%p [0x%lx 0x%lx]\n", prange->svms, prange->start, - prange->last); + pr_debug("svms 0x%p [0x%lx 0x%lx 0x%llx]\n", prange->svms, prange->start, + prange->last, ttm_res_offset);
src = scratch; dst = (uint64_t *)(scratch + npages); @@ -311,7 +311,7 @@ svm_migrate_copy_to_vram(struct amdgpu_device *adev, struct svm_range *prange, goto out; }
- amdgpu_res_first(prange->ttm_res, prange->offset << PAGE_SHIFT, + amdgpu_res_first(prange->ttm_res, ttm_res_offset, npages << PAGE_SHIFT, &cursor); for (i = j = 0; i < npages; i++) { struct page *spage; @@ -398,7 +398,7 @@ svm_migrate_copy_to_vram(struct amdgpu_device *adev, struct svm_range *prange, static long svm_migrate_vma_to_vram(struct amdgpu_device *adev, struct svm_range *prange, struct vm_area_struct *vma, uint64_t start, - uint64_t end, uint32_t trigger) + uint64_t end, uint32_t trigger, uint64_t ttm_res_offset) { struct kfd_process *p = container_of(prange->svms, struct kfd_process, svms); uint64_t npages = (end - start) >> PAGE_SHIFT; @@ -451,7 +451,7 @@ svm_migrate_vma_to_vram(struct amdgpu_device *adev, struct svm_range *prange, else pr_debug("0x%lx pages migrated\n", cpages);
- r = svm_migrate_copy_to_vram(adev, prange, &migrate, &mfence, scratch); + r = svm_migrate_copy_to_vram(adev, prange, &migrate, &mfence, scratch, ttm_res_offset); migrate_vma_pages(&migrate);
pr_debug("successful/cpages/npages 0x%lx/0x%lx/0x%lx\n", @@ -499,6 +499,7 @@ svm_migrate_ram_to_vram(struct svm_range *prange, uint32_t best_loc, unsigned long addr, start, end; struct vm_area_struct *vma; struct amdgpu_device *adev; + uint64_t ttm_res_offset; unsigned long cpages = 0; long r = 0;
@@ -519,6 +520,7 @@ svm_migrate_ram_to_vram(struct svm_range *prange, uint32_t best_loc,
start = prange->start << PAGE_SHIFT; end = (prange->last + 1) << PAGE_SHIFT; + ttm_res_offset = prange->offset << PAGE_SHIFT;
for (addr = start; addr < end;) { unsigned long next; @@ -528,13 +530,14 @@ svm_migrate_ram_to_vram(struct svm_range *prange, uint32_t best_loc, break;
next = min(vma->vm_end, end); - r = svm_migrate_vma_to_vram(adev, prange, vma, addr, next, trigger); + r = svm_migrate_vma_to_vram(adev, prange, vma, addr, next, trigger, ttm_res_offset); if (r < 0) { pr_debug("failed %ld to migrate\n", r); break; } else { cpages += r; } + ttm_res_offset += next - addr; addr = next; }
From: Chia-I Wu olvaffe@gmail.com
[ Upstream commit b2ca5c5d416b4e72d1e9d0293fc720e2d525fd42 ]
Set *q to NULL on errors, otherwise pqm_create_queue would free it again.
Signed-off-by: Chia-I Wu olvaffe@gmail.com Signed-off-by: Felix Kuehling Felix.Kuehling@amd.com Reviewed-by: Felix Kuehling Felix.Kuehling@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c index 5137476ec18e6..4236539d9f932 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process_queue_manager.c @@ -218,8 +218,8 @@ static int init_user_queue(struct process_queue_manager *pqm, return 0;
cleanup: - if (dev->shared_resources.enable_mes) - uninit_queue(*q); + uninit_queue(*q); + *q = NULL; return retval; }
From: Jane Jian Jane.Jian@amd.com
[ Upstream commit d71e38df3b730a17ab6b25cabb2ccfe8a7f04385 ]
for sriov, we added a new flag to indicate av1 support, this will override the original caps info.
Signed-off-by: Jane Jian Jane.Jian@amd.com Acked-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h | 4 + drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h | 3 +- drivers/gpu/drm/amd/amdgpu/soc21.c | 103 ++++++++++++++++++-- 3 files changed, 99 insertions(+), 11 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h index 2b9d806e23afb..10a0a510910b6 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.h @@ -123,6 +123,8 @@ enum AMDGIM_FEATURE_FLAG { AMDGIM_FEATURE_PP_ONE_VF = (1 << 4), /* Indirect Reg Access enabled */ AMDGIM_FEATURE_INDIRECT_REG_ACCESS = (1 << 5), + /* AV1 Support MODE*/ + AMDGIM_FEATURE_AV1_SUPPORT = (1 << 6), };
enum AMDGIM_REG_ACCESS_FLAG { @@ -321,6 +323,8 @@ static inline bool is_virtual_machine(void) ((!amdgpu_in_reset(adev)) && adev->virt.tdr_debug) #define amdgpu_sriov_is_normal(adev) \ ((!amdgpu_in_reset(adev)) && (!adev->virt.tdr_debug)) +#define amdgpu_sriov_is_av1_support(adev) \ + ((adev)->virt.gim_feature & AMDGIM_FEATURE_AV1_SUPPORT) bool amdgpu_virt_mmio_blocked(struct amdgpu_device *adev); void amdgpu_virt_init_setting(struct amdgpu_device *adev); void amdgpu_virt_kiq_reg_write_reg_wait(struct amdgpu_device *adev, diff --git a/drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h b/drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h index 6c97148ca0ed3..24d42d24e6a01 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h +++ b/drivers/gpu/drm/amd/amdgpu/amdgv_sriovmsg.h @@ -93,7 +93,8 @@ union amd_sriov_msg_feature_flags { uint32_t mm_bw_management : 1; uint32_t pp_one_vf_mode : 1; uint32_t reg_indirect_acc : 1; - uint32_t reserved : 26; + uint32_t av1_support : 1; + uint32_t reserved : 25; } flags; uint32_t all; }; diff --git a/drivers/gpu/drm/amd/amdgpu/soc21.c b/drivers/gpu/drm/amd/amdgpu/soc21.c index 3d938b52178e3..9eedc1a1494c0 100644 --- a/drivers/gpu/drm/amd/amdgpu/soc21.c +++ b/drivers/gpu/drm/amd/amdgpu/soc21.c @@ -101,6 +101,59 @@ static const struct amdgpu_video_codecs vcn_4_0_0_video_codecs_decode_vcn1 = .codec_array = vcn_4_0_0_video_codecs_decode_array_vcn1, };
+/* SRIOV SOC21, not const since data is controlled by host */ +static struct amdgpu_video_codec_info sriov_vcn_4_0_0_video_codecs_encode_array_vcn0[] = { + {codec_info_build(AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_MPEG4_AVC, 4096, 2304, 0)}, + {codec_info_build(AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_HEVC, 4096, 2304, 0)}, + {codec_info_build(AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_AV1, 8192, 4352, 0)}, +}; + +static struct amdgpu_video_codec_info sriov_vcn_4_0_0_video_codecs_encode_array_vcn1[] = { + {codec_info_build(AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_MPEG4_AVC, 4096, 2304, 0)}, + {codec_info_build(AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_HEVC, 4096, 2304, 0)}, +}; + +static struct amdgpu_video_codecs sriov_vcn_4_0_0_video_codecs_encode_vcn0 = { + .codec_count = ARRAY_SIZE(sriov_vcn_4_0_0_video_codecs_encode_array_vcn0), + .codec_array = sriov_vcn_4_0_0_video_codecs_encode_array_vcn0, +}; + +static struct amdgpu_video_codecs sriov_vcn_4_0_0_video_codecs_encode_vcn1 = { + .codec_count = ARRAY_SIZE(sriov_vcn_4_0_0_video_codecs_encode_array_vcn1), + .codec_array = sriov_vcn_4_0_0_video_codecs_encode_array_vcn1, +}; + +static struct amdgpu_video_codec_info sriov_vcn_4_0_0_video_codecs_decode_array_vcn0[] = { + {codec_info_build(AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_MPEG2, 4096, 4096, 3)}, + {codec_info_build(AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_MPEG4, 4096, 4096, 5)}, + {codec_info_build(AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_MPEG4_AVC, 4096, 4096, 52)}, + {codec_info_build(AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_VC1, 4096, 4096, 4)}, + {codec_info_build(AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_HEVC, 8192, 4352, 186)}, + {codec_info_build(AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_JPEG, 4096, 4096, 0)}, + {codec_info_build(AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_VP9, 8192, 4352, 0)}, + {codec_info_build(AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_AV1, 8192, 4352, 0)}, +}; + +static struct amdgpu_video_codec_info sriov_vcn_4_0_0_video_codecs_decode_array_vcn1[] = { + {codec_info_build(AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_MPEG2, 4096, 4096, 3)}, + {codec_info_build(AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_MPEG4, 4096, 4096, 5)}, + {codec_info_build(AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_MPEG4_AVC, 4096, 4096, 52)}, + {codec_info_build(AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_VC1, 4096, 4096, 4)}, + {codec_info_build(AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_HEVC, 8192, 4352, 186)}, + {codec_info_build(AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_JPEG, 4096, 4096, 0)}, + {codec_info_build(AMDGPU_INFO_VIDEO_CAPS_CODEC_IDX_VP9, 8192, 4352, 0)}, +}; + +static struct amdgpu_video_codecs sriov_vcn_4_0_0_video_codecs_decode_vcn0 = { + .codec_count = ARRAY_SIZE(sriov_vcn_4_0_0_video_codecs_decode_array_vcn0), + .codec_array = sriov_vcn_4_0_0_video_codecs_decode_array_vcn0, +}; + +static struct amdgpu_video_codecs sriov_vcn_4_0_0_video_codecs_decode_vcn1 = { + .codec_count = ARRAY_SIZE(sriov_vcn_4_0_0_video_codecs_decode_array_vcn1), + .codec_array = sriov_vcn_4_0_0_video_codecs_decode_array_vcn1, +}; + static int soc21_query_video_codecs(struct amdgpu_device *adev, bool encode, const struct amdgpu_video_codecs **codecs) { @@ -111,16 +164,31 @@ static int soc21_query_video_codecs(struct amdgpu_device *adev, bool encode, case IP_VERSION(4, 0, 0): case IP_VERSION(4, 0, 2): case IP_VERSION(4, 0, 4): - if (adev->vcn.harvest_config & AMDGPU_VCN_HARVEST_VCN0) { - if (encode) - *codecs = &vcn_4_0_0_video_codecs_encode_vcn1; - else - *codecs = &vcn_4_0_0_video_codecs_decode_vcn1; + if (amdgpu_sriov_vf(adev)) { + if ((adev->vcn.harvest_config & AMDGPU_VCN_HARVEST_VCN0) || + !amdgpu_sriov_is_av1_support(adev)) { + if (encode) + *codecs = &sriov_vcn_4_0_0_video_codecs_encode_vcn1; + else + *codecs = &sriov_vcn_4_0_0_video_codecs_decode_vcn1; + } else { + if (encode) + *codecs = &sriov_vcn_4_0_0_video_codecs_encode_vcn0; + else + *codecs = &sriov_vcn_4_0_0_video_codecs_decode_vcn0; + } } else { - if (encode) - *codecs = &vcn_4_0_0_video_codecs_encode_vcn0; - else - *codecs = &vcn_4_0_0_video_codecs_decode_vcn0; + if ((adev->vcn.harvest_config & AMDGPU_VCN_HARVEST_VCN0)) { + if (encode) + *codecs = &vcn_4_0_0_video_codecs_encode_vcn1; + else + *codecs = &vcn_4_0_0_video_codecs_decode_vcn1; + } else { + if (encode) + *codecs = &vcn_4_0_0_video_codecs_encode_vcn0; + else + *codecs = &vcn_4_0_0_video_codecs_decode_vcn0; + } } return 0; default: @@ -729,8 +797,23 @@ static int soc21_common_late_init(void *handle) { struct amdgpu_device *adev = (struct amdgpu_device *)handle;
- if (amdgpu_sriov_vf(adev)) + if (amdgpu_sriov_vf(adev)) { xgpu_nv_mailbox_get_irq(adev); + if ((adev->vcn.harvest_config & AMDGPU_VCN_HARVEST_VCN0) || + !amdgpu_sriov_is_av1_support(adev)) { + amdgpu_virt_update_sriov_video_codec(adev, + sriov_vcn_4_0_0_video_codecs_encode_array_vcn1, + ARRAY_SIZE(sriov_vcn_4_0_0_video_codecs_encode_array_vcn1), + sriov_vcn_4_0_0_video_codecs_decode_array_vcn1, + ARRAY_SIZE(sriov_vcn_4_0_0_video_codecs_decode_array_vcn1)); + } else { + amdgpu_virt_update_sriov_video_codec(adev, + sriov_vcn_4_0_0_video_codecs_encode_array_vcn0, + ARRAY_SIZE(sriov_vcn_4_0_0_video_codecs_encode_array_vcn0), + sriov_vcn_4_0_0_video_codecs_decode_array_vcn0, + ARRAY_SIZE(sriov_vcn_4_0_0_video_codecs_decode_array_vcn0)); + } + }
return 0; }
From: Chia-I Wu olvaffe@gmail.com
[ Upstream commit 9da050b0d9e04439d225a2ec3044af70cdfb3933 ]
kgd_mem pointers returned by kfd_process_device_translate_handle are only guaranteed to be valid while p->mutex is held. As soon as the mutex is unlocked, another thread can free the BO.
Signed-off-by: Chia-I Wu olvaffe@gmail.com Signed-off-by: Felix Kuehling Felix.Kuehling@amd.com Reviewed-by: Felix Kuehling Felix.Kuehling@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdkfd/kfd_chardev.c | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c index f79b8e964140e..e191d38f3da62 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_chardev.c @@ -1298,14 +1298,14 @@ static int kfd_ioctl_map_memory_to_gpu(struct file *filep, args->n_success = i+1; }
- mutex_unlock(&p->mutex); - err = amdgpu_amdkfd_gpuvm_sync_memory(dev->adev, (struct kgd_mem *) mem, true); if (err) { pr_debug("Sync memory failed, wait interrupted by user signal\n"); goto sync_memory_failed; }
+ mutex_unlock(&p->mutex); + /* Flush TLBs after waiting for the page table updates to complete */ for (i = 0; i < args->n_devices; i++) { peer_pdd = kfd_process_device_data_by_id(p, devices_arr[i]); @@ -1321,9 +1321,9 @@ static int kfd_ioctl_map_memory_to_gpu(struct file *filep, bind_process_to_device_failed: get_mem_obj_from_handle_failed: map_memory_to_gpu_failed: +sync_memory_failed: mutex_unlock(&p->mutex); copy_from_user_failed: -sync_memory_failed: kfree(devices_arr);
return err; @@ -1337,6 +1337,7 @@ static int kfd_ioctl_unmap_memory_from_gpu(struct file *filep, void *mem; long err = 0; uint32_t *devices_arr = NULL, i; + bool flush_tlb;
if (!args->n_devices) { pr_debug("Device IDs array empty\n"); @@ -1389,16 +1390,19 @@ static int kfd_ioctl_unmap_memory_from_gpu(struct file *filep, } args->n_success = i+1; } - mutex_unlock(&p->mutex);
- if (kfd_flush_tlb_after_unmap(pdd->dev)) { + flush_tlb = kfd_flush_tlb_after_unmap(pdd->dev); + if (flush_tlb) { err = amdgpu_amdkfd_gpuvm_sync_memory(pdd->dev->adev, (struct kgd_mem *) mem, true); if (err) { pr_debug("Sync memory failed, wait interrupted by user signal\n"); goto sync_memory_failed; } + } + mutex_unlock(&p->mutex);
+ if (flush_tlb) { /* Flush TLBs after waiting for the page table updates to complete */ for (i = 0; i < args->n_devices; i++) { peer_pdd = kfd_process_device_data_by_id(p, devices_arr[i]); @@ -1414,9 +1418,9 @@ static int kfd_ioctl_unmap_memory_from_gpu(struct file *filep, bind_process_to_device_failed: get_mem_obj_from_handle_failed: unmap_memory_from_gpu_failed: +sync_memory_failed: mutex_unlock(&p->mutex); copy_from_user_failed: -sync_memory_failed: kfree(devices_arr); return err; }
From: Bhawanpreet Lakha Bhawanpreet.Lakha@amd.com
[ Upstream commit 728cefa53a36ba378ed4a7f31a0c08289687d824 ]
[Why] On resume some displays are not ready for HDCP, so they will fail if we start the hdcp authentintication too soon.
Add a delay so that the displays can be ready before we start.
NOTE: Previoulsy this delay was set to 3 seconds but it was causing issues with compliance, 2 seconds should enough for compliance and the s3 resume case.
[How] Change the Delay to 2 seconds.
Reviewed-by: Aurabindo Pillai Aurabindo.Pillai@amd.com Acked-by: Qingqing Zhuo qingqing.zhuo@amd.com Signed-off-by: Bhawanpreet Lakha Bhawanpreet.Lakha@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_hdcp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_hdcp.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_hdcp.c index a7fd98f57f94c..dc62375a8e2c4 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_hdcp.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_hdcp.c @@ -495,7 +495,7 @@ static void update_config(void *handle, struct cp_psp_stream_config *config) link->dp.mst_enabled = config->mst_enabled; link->dp.usb4_enabled = config->usb4_enabled; display->adjust.disable = MOD_HDCP_DISPLAY_DISABLE_AUTHENTICATION; - link->adjust.auth_delay = 0; + link->adjust.auth_delay = 2; link->adjust.hdcp1.disable = 0; conn_state = aconnector->base.state;
From: Kristian Overskeid koverskeid@gmail.com
[ Upstream commit 28e8cabe80f3e6e3c98121576eda898eeb20f1b1 ]
If no frames has been exchanged with a node for HSR_NODE_FORGET_TIME, the node will be deleted from the node_db list. If a frame is sent to the node after it is deleted, a netdev_err message for each slave interface is produced. This should not happen with dan nodes because of supervision frames, but can happen often with san nodes, which clutters the kernel log. Since the hsr protocol does not support sans, this is only relevant for the prp protocol.
Signed-off-by: Kristian Overskeid koverskeid@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/hsr/hsr_framereg.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/hsr/hsr_framereg.c b/net/hsr/hsr_framereg.c index 00db74d96583d..865eda39d6014 100644 --- a/net/hsr/hsr_framereg.c +++ b/net/hsr/hsr_framereg.c @@ -415,7 +415,7 @@ void hsr_addr_subst_dest(struct hsr_node *node_src, struct sk_buff *skb, node_dst = find_node_by_addr_A(&port->hsr->node_db, eth_hdr(skb)->h_dest); if (!node_dst) { - if (net_ratelimit()) + if (net_ratelimit() && port->hsr->prot_version != PRP_V1) netdev_err(skb->dev, "%s: Unknown node\n", __func__); return; }
From: Kuninori Morimoto kuninori.morimoto.gx@renesas.com
[ Upstream commit 9026c0bf233db53b86f74f4c620715e94eb32a09 ]
control_message() might be called with pao = NULL. Here indicates control_message() as sample.
(B) static void control_message(struct hpi_adapter_obj *pao, ...) { ^^^ struct hpi_hw_obj *phw = pao->priv; ... ^^^ }
(A) void _HPI_6205(struct hpi_adapter_obj *pao, ...) { ^^^ ... case HPI_OBJ_CONTROL: (B) control_message(pao, phm, phr); break; ^^^ ... }
void HPI_6205(...) { ... (A) _HPI_6205(NULL, phm, phr); ... ^^^^ }
Therefore, We will get too many warning via cppcheck, like below
sound/pci/asihpi/hpi6205.c:238:27: warning: Possible null pointer dereference: pao [nullPointer] struct hpi_hw_obj *phw = pao->priv; ^ sound/pci/asihpi/hpi6205.c:433:13: note: Calling function '_HPI_6205', 1st argument 'NULL' value is 0 _HPI_6205(NULL, phm, phr); ^ sound/pci/asihpi/hpi6205.c:401:20: note: Calling function 'control_message', 1st argument 'pao' value is 0 control_message(pao, phm, phr); ^ Set phr->error like many functions doing, and don't call _HPI_6205() with NULL.
Signed-off-by: Kuninori Morimoto kuninori.morimoto.gx@renesas.com Link: https://lore.kernel.org/r/87ttypeaqz.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/pci/asihpi/hpi6205.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/pci/asihpi/hpi6205.c b/sound/pci/asihpi/hpi6205.c index 27e11b5f70b97..c7d7eff86727f 100644 --- a/sound/pci/asihpi/hpi6205.c +++ b/sound/pci/asihpi/hpi6205.c @@ -430,7 +430,7 @@ void HPI_6205(struct hpi_message *phm, struct hpi_response *phr) pao = hpi_find_adapter(phm->adapter_index); } else { /* subsys messages don't address an adapter */ - _HPI_6205(NULL, phm, phr); + phr->error = HPI_ERROR_INVALID_OBJ_INDEX; return; }
From: Kuninori Morimoto kuninori.morimoto.gx@renesas.com
[ Upstream commit 98e5eb110095ec77cb6d775051d181edbf9cd3cf ]
tuning_ctl_set() might have buffer overrun at (X) if it didn't break from loop by matching (A).
static int tuning_ctl_set(...) { for (i = 0; i < TUNING_CTLS_COUNT; i++) (A) if (nid == ca0132_tuning_ctls[i].nid) break;
snd_hda_power_up(...); (X) dspio_set_param(..., ca0132_tuning_ctls[i].mid, ...); snd_hda_power_down(...); ^
return 1; }
We will get below error by cppcheck
sound/pci/hda/patch_ca0132.c:4229:2: note: After for loop, i has value 12 for (i = 0; i < TUNING_CTLS_COUNT; i++) ^ sound/pci/hda/patch_ca0132.c:4234:43: note: Array index out of bounds dspio_set_param(codec, ca0132_tuning_ctls[i].mid, 0x20, ^ This patch cares non match case.
Signed-off-by: Kuninori Morimoto kuninori.morimoto.gx@renesas.com Link: https://lore.kernel.org/r/87sfe9eap7.wl-kuninori.morimoto.gx@renesas.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/pci/hda/patch_ca0132.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/sound/pci/hda/patch_ca0132.c b/sound/pci/hda/patch_ca0132.c index acde4cd58785e..099722ebaed83 100644 --- a/sound/pci/hda/patch_ca0132.c +++ b/sound/pci/hda/patch_ca0132.c @@ -4228,8 +4228,10 @@ static int tuning_ctl_set(struct hda_codec *codec, hda_nid_t nid,
for (i = 0; i < TUNING_CTLS_COUNT; i++) if (nid == ca0132_tuning_ctls[i].nid) - break; + goto found;
+ return -EINVAL; +found: snd_hda_power_up(codec); dspio_set_param(codec, ca0132_tuning_ctls[i].mid, 0x20, ca0132_tuning_ctls[i].req,
From: Wei Chen harperchen1110@gmail.com
[ Upstream commit f90bd245de82c095187d8c2cabb8b488a39eaecc ]
fb_set_var would by called when user invokes ioctl with cmd FBIOPUT_VSCREENINFO. User-provided data would finally reach tgafb_check_var. In case var->pixclock is assigned to zero, divide by zero would occur when checking whether reciprocal of var->pixclock is too high.
Similar crashes have happened in other fbdev drivers. There is no check and modification on var->pixclock along the call chain to tgafb_check_var. We believe it could also be triggered in driver tgafb from user site.
Signed-off-by: Wei Chen harperchen1110@gmail.com Signed-off-by: Helge Deller deller@gmx.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/video/fbdev/tgafb.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/video/fbdev/tgafb.c b/drivers/video/fbdev/tgafb.c index 14d37c49633c6..b44004880f0d1 100644 --- a/drivers/video/fbdev/tgafb.c +++ b/drivers/video/fbdev/tgafb.c @@ -173,6 +173,9 @@ tgafb_check_var(struct fb_var_screeninfo *var, struct fb_info *info) { struct tga_par *par = (struct tga_par *)info->par;
+ if (!var->pixclock) + return -EINVAL; + if (par->tga_type == TGA_TYPE_8PLANE) { if (var->bits_per_pixel != 8) return -EINVAL;
From: Chen Yu yu.c.chen@intel.com
[ Upstream commit 0bc23d8b2237a104d7f8379d687aa4cb82e2968b ]
The user provides arbitrary non-numeic value to level and type, which could bring unexpected behavior. In this case the expected behavior would be to throw an error.
pfrut -h usage: pfrut [OPTIONS] code injection: -l, --load -s, --stage -a, --activate -u, --update [stage and activate] -q, --query -d, --revid update telemetry: -G, --getloginfo -T, --type(0:execution, 1:history) -L, --level(0, 1, 2, 4) -R, --read -D, --revid log
pfrut -T A pfrut -G log_level:0 log_type:0 log_revid:2 max_data_size:65536 chunk1_size:0 chunk2_size:1530 rollover_cnt:0 reset_cnt:17
Fix this by restricting the input to be in the expected range.
Reported-by: Hariganesh Govindarajulu hariganesh.govindarajulu@intel.com Suggested-by: "Rafael J. Wysocki" rafael.j.wysocki@intel.com Signed-off-by: Chen Yu yu.c.chen@intel.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/power/acpi/tools/pfrut/pfrut.c | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-)
diff --git a/tools/power/acpi/tools/pfrut/pfrut.c b/tools/power/acpi/tools/pfrut/pfrut.c index 52aa0351533c3..388c9e3ad0407 100644 --- a/tools/power/acpi/tools/pfrut/pfrut.c +++ b/tools/power/acpi/tools/pfrut/pfrut.c @@ -97,7 +97,7 @@ static struct option long_options[] = { static void parse_options(int argc, char **argv) { int option_index = 0; - char *pathname; + char *pathname, *endptr; int opt;
pathname = strdup(argv[0]); @@ -125,11 +125,23 @@ static void parse_options(int argc, char **argv) log_getinfo = 1; break; case 'T': - log_type = atoi(optarg); + log_type = strtol(optarg, &endptr, 0); + if (*endptr || (log_type != 0 && log_type != 1)) { + printf("Number expected: type(0:execution, 1:history) - Quit.\n"); + exit(1); + } + set_log_type = 1; break; case 'L': - log_level = atoi(optarg); + log_level = strtol(optarg, &endptr, 0); + if (*endptr || + (log_level != 0 && log_level != 1 && + log_level != 2 && log_level != 4)) { + printf("Number expected: level(0, 1, 2, 4) - Quit.\n"); + exit(1); + } + set_log_level = 1; break; case 'R':
From: Linus Torvalds torvalds@linux-foundation.org
[ Upstream commit 6015b1aca1a233379625385feb01dd014aca60b5 ]
The getaffinity() system call uses 'cpumask_size()' to decide how big the CPU mask is - so far so good. It is indeed the allocation size of a cpumask.
But the code also assumes that the whole allocation is initialized without actually doing so itself. That's wrong, because we might have fixed-size allocations (making copying and clearing more efficient), but not all of it is then necessarily used if 'nr_cpu_ids' is smaller.
Having checked other users of 'cpumask_size()', they all seem to be ok, either using it purely for the allocation size, or explicitly zeroing the cpumask before using the size in bytes to copy it.
See for example the ublk_ctrl_get_queue_affinity() function that uses the proper 'zalloc_cpumask_var()' to make sure that the whole mask is cleared, whether the storage is on the stack or if it was an external allocation.
Fix this by just zeroing the allocation before using it. Do the same for the compat version of sched_getaffinity(), which had the same logic.
Also, for consistency, make sched_getaffinity() use 'cpumask_bits()' to access the bits. For a cpumask_var_t, it ends up being a pointer to the same data either way, but it's just a good idea to treat it like you would a 'cpumask_t'. The compat case already did that.
Reported-by: Ryan Roberts ryan.roberts@arm.com Link: https://lore.kernel.org/lkml/7d026744-6bd6-6827-0471-b5e8eae0be3f@arm.com/ Cc: Yury Norov yury.norov@gmail.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/compat.c | 2 +- kernel/sched/core.c | 4 ++-- 2 files changed, 3 insertions(+), 3 deletions(-)
diff --git a/kernel/compat.c b/kernel/compat.c index 55551989d9da5..fb50f29d9b361 100644 --- a/kernel/compat.c +++ b/kernel/compat.c @@ -152,7 +152,7 @@ COMPAT_SYSCALL_DEFINE3(sched_getaffinity, compat_pid_t, pid, unsigned int, len, if (len & (sizeof(compat_ulong_t)-1)) return -EINVAL;
- if (!alloc_cpumask_var(&mask, GFP_KERNEL)) + if (!zalloc_cpumask_var(&mask, GFP_KERNEL)) return -ENOMEM;
ret = sched_getaffinity(pid, mask); diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 9a0698353d60f..57d84b534cdea 100644 --- a/kernel/sched/core.c +++ b/kernel/sched/core.c @@ -8404,14 +8404,14 @@ SYSCALL_DEFINE3(sched_getaffinity, pid_t, pid, unsigned int, len, if (len & (sizeof(unsigned long)-1)) return -EINVAL;
- if (!alloc_cpumask_var(&mask, GFP_KERNEL)) + if (!zalloc_cpumask_var(&mask, GFP_KERNEL)) return -ENOMEM;
ret = sched_getaffinity(pid, mask); if (ret == 0) { unsigned int retlen = min(len, cpumask_size());
- if (copy_to_user(user_mask_ptr, mask, retlen)) + if (copy_to_user(user_mask_ptr, cpumask_bits(mask), retlen)) ret = -EFAULT; else ret = retlen;
From: Irvin Cote irvincoteg@gmail.com
[ Upstream commit a61d265533b7fe0026a02a49916aa564ffe38e4c ]
In case the nvme_probe teardown path is triggered the ctrl ref count does not reach 0 thus creating a memory leak upon failure of nvme_probe.
Signed-off-by: Irvin Cote irvincoteg@gmail.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/pci.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 29c902b9aecbd..c51ebbee8103e 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -3126,6 +3126,7 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id) nvme_dev_unmap(dev); out_uninit_ctrl: nvme_uninit_ctrl(&dev->ctrl); + nvme_put_ctrl(&dev->ctrl); return result; }
From: Philipp Geulen p.geulen@js-elektronik.de
[ Upstream commit b65d44fa0fe072c91bf41cd8756baa2b4c77eff2 ]
Added a quirk to fix Lexar NM620 1TB SSD reporting duplicate NGUIDs.
Signed-off-by: Philipp Geulen p.geulen@js-elektronik.de Reviewed-by: Chaitanya Kulkarni kkch@nvidia.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/pci.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index c51ebbee8103e..ea3f0806783a3 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -3491,6 +3491,8 @@ static const struct pci_device_id nvme_id_table[] = { .driver_data = NVME_QUIRK_BOGUS_NID, }, { PCI_DEVICE(0x1d97, 0x2263), /* Lexar NM610 */ .driver_data = NVME_QUIRK_BOGUS_NID, }, + { PCI_DEVICE(0x1d97, 0x1d97), /* Lexar NM620 */ + .driver_data = NVME_QUIRK_BOGUS_NID, }, { PCI_DEVICE(0x1d97, 0x2269), /* Lexar NM760 */ .driver_data = NVME_QUIRK_BOGUS_NID, }, { PCI_DEVICE(PCI_VENDOR_ID_AMAZON, 0x0061),
From: David Belanger david.belanger@amd.com
[ Upstream commit 20bc9f76b6a2455c6b54b91ae7634f147f64987f ]
Handle case when module is unloaded (kfd_exit) before a process space (mm_struct) is released.
v2: Fixed potential race conditions by removing all kfd_process from the process table first, then working on releasing the resources.
v3: Fixed loop element access / synchronization. Fixed extra empty lines.
Signed-off-by: David Belanger david.belanger@amd.com Reviewed-by: Felix Kuehling Felix.Kuehling@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdkfd/kfd_module.c | 1 + drivers/gpu/drm/amd/amdkfd/kfd_priv.h | 1 + drivers/gpu/drm/amd/amdkfd/kfd_process.c | 67 +++++++++++++++++++++--- 3 files changed, 62 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_module.c b/drivers/gpu/drm/amd/amdkfd/kfd_module.c index 09b966dc37681..aee2212e52f69 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_module.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_module.c @@ -77,6 +77,7 @@ static int kfd_init(void)
static void kfd_exit(void) { + kfd_cleanup_processes(); kfd_debugfs_fini(); kfd_process_destroy_wq(); kfd_procfs_shutdown(); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h index 552c3ac85a132..7dc55919993c0 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_priv.h +++ b/drivers/gpu/drm/amd/amdkfd/kfd_priv.h @@ -926,6 +926,7 @@ bool kfd_dev_is_large_bar(struct kfd_dev *dev);
int kfd_process_create_wq(void); void kfd_process_destroy_wq(void); +void kfd_cleanup_processes(void); struct kfd_process *kfd_create_process(struct file *filep); struct kfd_process *kfd_get_process(const struct task_struct *task); struct kfd_process *kfd_lookup_process_by_pasid(u32 pasid); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_process.c b/drivers/gpu/drm/amd/amdkfd/kfd_process.c index 51b1683ac5c1e..4d9f2d1c49b1d 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_process.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_process.c @@ -1167,6 +1167,17 @@ static void kfd_process_free_notifier(struct mmu_notifier *mn) kfd_unref_process(container_of(mn, struct kfd_process, mmu_notifier)); }
+static void kfd_process_notifier_release_internal(struct kfd_process *p) +{ + cancel_delayed_work_sync(&p->eviction_work); + cancel_delayed_work_sync(&p->restore_work); + + /* Indicate to other users that MM is no longer valid */ + p->mm = NULL; + + mmu_notifier_put(&p->mmu_notifier); +} + static void kfd_process_notifier_release(struct mmu_notifier *mn, struct mm_struct *mm) { @@ -1181,17 +1192,22 @@ static void kfd_process_notifier_release(struct mmu_notifier *mn, return;
mutex_lock(&kfd_processes_mutex); + /* + * Do early return if table is empty. + * + * This could potentially happen if this function is called concurrently + * by mmu_notifier and by kfd_cleanup_pocesses. + * + */ + if (hash_empty(kfd_processes_table)) { + mutex_unlock(&kfd_processes_mutex); + return; + } hash_del_rcu(&p->kfd_processes); mutex_unlock(&kfd_processes_mutex); synchronize_srcu(&kfd_processes_srcu);
- cancel_delayed_work_sync(&p->eviction_work); - cancel_delayed_work_sync(&p->restore_work); - - /* Indicate to other users that MM is no longer valid */ - p->mm = NULL; - - mmu_notifier_put(&p->mmu_notifier); + kfd_process_notifier_release_internal(p); }
static const struct mmu_notifier_ops kfd_process_mmu_notifier_ops = { @@ -1200,6 +1216,43 @@ static const struct mmu_notifier_ops kfd_process_mmu_notifier_ops = { .free_notifier = kfd_process_free_notifier, };
+/* + * This code handles the case when driver is being unloaded before all + * mm_struct are released. We need to safely free the kfd_process and + * avoid race conditions with mmu_notifier that might try to free them. + * + */ +void kfd_cleanup_processes(void) +{ + struct kfd_process *p; + struct hlist_node *p_temp; + unsigned int temp; + HLIST_HEAD(cleanup_list); + + /* + * Move all remaining kfd_process from the process table to a + * temp list for processing. Once done, callback from mmu_notifier + * release will not see the kfd_process in the table and do early return, + * avoiding double free issues. + */ + mutex_lock(&kfd_processes_mutex); + hash_for_each_safe(kfd_processes_table, temp, p_temp, p, kfd_processes) { + hash_del_rcu(&p->kfd_processes); + synchronize_srcu(&kfd_processes_srcu); + hlist_add_head(&p->kfd_processes, &cleanup_list); + } + mutex_unlock(&kfd_processes_mutex); + + hlist_for_each_entry_safe(p, p_temp, &cleanup_list, kfd_processes) + kfd_process_notifier_release_internal(p); + + /* + * Ensures that all outstanding free_notifier get called, triggering + * the release of the kfd_process struct. + */ + mmu_notifier_synchronize(); +} + static int kfd_process_init_cwsr_apu(struct kfd_process *p, struct file *filep) { unsigned long offset;
From: Adham Faris afaris@nvidia.com
[ Upstream commit 78dee7befd56987283c13877b834c0aa97ad51b9 ]
XSK redirecting XDP programs require linearity, hence applies restrictions on the MTU. For PAGE_SIZE=4K, MTU shouldn't exceed 3498.
Features that contradict with XDP such HW-LRO and HW-GRO are enforced by the driver in advance, during XSK params validation, except for MTU, which was not enforced before this patch.
This has been spotted during test scenario described below: Attaching xdpsock program (PAGE_SIZE=4K), with MTU < 3498, detaching XDP program, changing the MTU to arbitrary value in the range [3499, 3754], attaching XDP program again, which ended up with failure since MTU is > 3498.
This commit lowers the XSK MTU limitation to be aligned with XDP MTU limitation, since XSK socket is meaningless without XDP program.
Signed-off-by: Adham Faris afaris@nvidia.com Reviewed-by: Tariq Toukan tariqt@nvidia.com Signed-off-by: Saeed Mahameed saeedm@nvidia.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/en_main.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c index 47d4b54d15634..1f4233b2842f7 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/en_main.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/en_main.c @@ -4117,13 +4117,17 @@ static bool mlx5e_xsk_validate_mtu(struct net_device *netdev, struct xsk_buff_pool *xsk_pool = mlx5e_xsk_get_pool(&chs->params, chs->params.xsk, ix); struct mlx5e_xsk_param xsk; + int max_xdp_mtu;
if (!xsk_pool) continue;
mlx5e_build_xsk_param(xsk_pool, &xsk); + max_xdp_mtu = mlx5e_xdp_max_mtu(new_params, &xsk);
- if (!mlx5e_validate_xsk_param(new_params, &xsk, mdev)) { + /* Validate XSK params and XDP MTU in advance */ + if (!mlx5e_validate_xsk_param(new_params, &xsk, mdev) || + new_params->sw_mtu > max_xdp_mtu) { u32 hr = mlx5e_get_linear_rq_headroom(new_params, &xsk); int max_mtu_frame, max_mtu_page, max_mtu;
@@ -4133,9 +4137,9 @@ static bool mlx5e_xsk_validate_mtu(struct net_device *netdev, */ max_mtu_frame = MLX5E_HW2SW_MTU(new_params, xsk.chunk_size - hr); max_mtu_page = MLX5E_HW2SW_MTU(new_params, SKB_MAX_HEAD(0)); - max_mtu = min(max_mtu_frame, max_mtu_page); + max_mtu = min3(max_mtu_frame, max_mtu_page, max_xdp_mtu);
- netdev_err(netdev, "MTU %d is too big for an XSK running on channel %u. Try MTU <= %d\n", + netdev_err(netdev, "MTU %d is too big for an XSK running on channel %u or its redirection XDP program. Try MTU <= %d\n", new_params->sw_mtu, ix, max_mtu); return false; }
From: Wei Chen harperchen1110@gmail.com
[ Upstream commit 92e2a00f2987483e1f9253625828622edd442e61 ]
variable var->pixclock can be set by user. In case it equals to zero, divide by zero would occur in nvidiafb_set_par.
Similar crashes have happened in other fbdev drivers. There is no check and modification on var->pixclock along the call chain to nvidia_check_var and nvidiafb_set_par. We believe it could also be triggered in driver nvidia from user site.
Signed-off-by: Wei Chen harperchen1110@gmail.com Signed-off-by: Helge Deller deller@gmx.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/video/fbdev/nvidia/nvidia.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/video/fbdev/nvidia/nvidia.c b/drivers/video/fbdev/nvidia/nvidia.c index e60a276b4855d..ea4ba3dfb96bb 100644 --- a/drivers/video/fbdev/nvidia/nvidia.c +++ b/drivers/video/fbdev/nvidia/nvidia.c @@ -764,6 +764,8 @@ static int nvidiafb_check_var(struct fb_var_screeninfo *var, int pitch, err = 0;
NVTRACE_ENTER(); + if (!var->pixclock) + return -EINVAL;
var->transp.offset = 0; var->transp.length = 0;
From: Wei Chen harperchen1110@gmail.com
[ Upstream commit d823685486a3446d061fed7c7d2f80af984f119a ]
Variable var->pixclock is controlled by user and can be assigned to zero. Without proper check, divide by zero would occur in intelfbhw_validate_mode and intelfbhw_mode_to_hw.
Error out if var->pixclock is zero.
Signed-off-by: Wei Chen harperchen1110@gmail.com Signed-off-by: Helge Deller deller@gmx.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/video/fbdev/intelfb/intelfbdrv.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/video/fbdev/intelfb/intelfbdrv.c b/drivers/video/fbdev/intelfb/intelfbdrv.c index 0a9e5067b2010..a81095b2b1ea5 100644 --- a/drivers/video/fbdev/intelfb/intelfbdrv.c +++ b/drivers/video/fbdev/intelfb/intelfbdrv.c @@ -1222,6 +1222,9 @@ static int intelfb_check_var(struct fb_var_screeninfo *var,
dinfo = GET_DINFO(info);
+ if (!var->pixclock) + return -EINVAL; + /* update the pitch */ if (intelfbhw_validate_mode(dinfo, var) != 0) return -EINVAL;
From: Wei Chen harperchen1110@gmail.com
[ Upstream commit 61ac4b86a4c047c20d5cb423ddd87496f14d9868 ]
var->pixclock can be assigned to zero by user. Without proper check, divide by zero would occur in lx_set_clock.
Error out if var->pixclock is zero.
Signed-off-by: Wei Chen harperchen1110@gmail.com Signed-off-by: Helge Deller deller@gmx.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/video/fbdev/geode/lxfb_core.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/video/fbdev/geode/lxfb_core.c b/drivers/video/fbdev/geode/lxfb_core.c index 8130e9eee2b4b..556d8b1a9e06a 100644 --- a/drivers/video/fbdev/geode/lxfb_core.c +++ b/drivers/video/fbdev/geode/lxfb_core.c @@ -235,6 +235,9 @@ static void get_modedb(struct fb_videomode **modedb, unsigned int *size)
static int lxfb_check_var(struct fb_var_screeninfo *var, struct fb_info *info) { + if (!var->pixclock) + return -EINVAL; + if (var->xres > 1920 || var->yres > 1440) return -EINVAL;
From: Wei Chen harperchen1110@gmail.com
[ Upstream commit 44a3b36b42acfc433aaaf526191dd12fbb919fdb ]
var->pixclock can be assigned to zero by user. Without proper check, divide by zero would occur when invoking macro PICOS2KHZ in au1200fb_fb_check_var.
Error out if var->pixclock is zero.
Signed-off-by: Wei Chen harperchen1110@gmail.com Signed-off-by: Helge Deller deller@gmx.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/video/fbdev/au1200fb.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/video/fbdev/au1200fb.c b/drivers/video/fbdev/au1200fb.c index 81c3154544287..b6b22fa4a8a01 100644 --- a/drivers/video/fbdev/au1200fb.c +++ b/drivers/video/fbdev/au1200fb.c @@ -1040,6 +1040,9 @@ static int au1200fb_fb_check_var(struct fb_var_screeninfo *var, u32 pixclock; int screen_size, plane;
+ if (!var->pixclock) + return -EINVAL; + plane = fbdev->plane;
/* Make sure that the mode respect all LCD controller and
From: Prarit Bhargava prarit@redhat.com
[ Upstream commit 40aafc7d58d3544f152a863a0e9863014b6d5d8c ]
When running as non-root the following error is seen in turbostat:
turbostat: fopen /dev/cpu_dma_latency : Permission denied
turbostat and the man page have information on how to avoid other permission errors, so these can be fixed the same way.
Provide better /dev/cpu_dma_latency warnings that provide instructions on how to avoid the error, and update the man page.
Signed-off-by: Prarit Bhargava prarit@redhat.com Cc: linux-pm@vger.kernel.org Signed-off-by: Len Brown len.brown@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/power/x86/turbostat/turbostat.8 | 2 ++ tools/power/x86/turbostat/turbostat.c | 2 +- 2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/power/x86/turbostat/turbostat.8 b/tools/power/x86/turbostat/turbostat.8 index c7b26a3603afe..3e1a4c4be001a 100644 --- a/tools/power/x86/turbostat/turbostat.8 +++ b/tools/power/x86/turbostat/turbostat.8 @@ -344,6 +344,8 @@ Alternatively, non-root users can be enabled to run turbostat this way:
# chmod +r /dev/cpu/*/msr
+# chmod +r /dev/cpu_dma_latency + .B "turbostat " reads hardware counters, but doesn't write them. So it will not interfere with the OS or other programs, including diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c index aba460410dbd1..c24054e3ef7ad 100644 --- a/tools/power/x86/turbostat/turbostat.c +++ b/tools/power/x86/turbostat/turbostat.c @@ -5482,7 +5482,7 @@ void print_dev_latency(void)
retval = read(fd, (void *)&value, sizeof(int)); if (retval != sizeof(int)) { - warn("read %s\n", path); + warn("read failed %s\n", path); close(fd); return; }
From: Antti Laakso antti.laakso@intel.com
[ Upstream commit 92c25393586ac799b9b7d9e50434f3c44a7622c4 ]
The "excursion to minimum" information is in bit2 in HWP_STATUS MSR. Fix the bitmask used for decoding the register.
Signed-off-by: Antti Laakso antti.laakso@intel.com Reviewed-by: Artem Bityutskiy artem.bityutskiy@linux.intel.com Signed-off-by: Len Brown len.brown@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/power/x86/turbostat/turbostat.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/power/x86/turbostat/turbostat.c b/tools/power/x86/turbostat/turbostat.c index c24054e3ef7ad..c61c6c704fbe6 100644 --- a/tools/power/x86/turbostat/turbostat.c +++ b/tools/power/x86/turbostat/turbostat.c @@ -4426,7 +4426,7 @@ int print_hwp(struct thread_data *t, struct core_data *c, struct pkg_data *p)
fprintf(outf, "cpu%d: MSR_HWP_STATUS: 0x%08llx " "(%sGuaranteed_Perf_Change, %sExcursion_Min)\n", - cpu, msr, ((msr) & 0x1) ? "" : "No-", ((msr) & 0x2) ? "" : "No-"); + cpu, msr, ((msr) & 0x1) ? "" : "No-", ((msr) & 0x4) ? "" : "No-");
return 0; }
From: Anton Gusev aagusev@ispras.ru
[ Upstream commit bc4f359b3b607daac0290d0038561237a86b38cb ]
Overwriting the error code with the deletion result may cause the function to return 0 despite encountering an error. Commit b111545d26c0 ("tracing: Remove the useless value assignment in test_create_synth_event()") solves a similar issue by returning the original error code, so this patch does the same.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Link: https://lore.kernel.org/linux-trace-kernel/20230131075818.5322-1-aagusev@isp...
Signed-off-by: Anton Gusev aagusev@ispras.ru Reviewed-by: Steven Rostedt (Google) rostedt@goodmis.org Acked-by: Masami Hiramatsu (Google) mhiramat@kernel.org Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/trace/kprobe_event_gen_test.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/trace/kprobe_event_gen_test.c b/kernel/trace/kprobe_event_gen_test.c index c736487fc0e48..e0c420eb0b2b4 100644 --- a/kernel/trace/kprobe_event_gen_test.c +++ b/kernel/trace/kprobe_event_gen_test.c @@ -146,7 +146,7 @@ static int __init test_gen_kprobe_cmd(void) if (trace_event_file_is_valid(gen_kprobe_test)) gen_kprobe_test = NULL; /* We got an error after creating the event, delete it */ - ret = kprobe_event_delete("gen_kprobe_test"); + kprobe_event_delete("gen_kprobe_test"); goto out; }
@@ -211,7 +211,7 @@ static int __init test_gen_kretprobe_cmd(void) if (trace_event_file_is_valid(gen_kretprobe_test)) gen_kretprobe_test = NULL; /* We got an error after creating the event, delete it */ - ret = kprobe_event_delete("gen_kretprobe_test"); + kprobe_event_delete("gen_kretprobe_test"); goto out; }
From: Josef Bacik josef@toxicpanda.com
[ Upstream commit efbf35a102b20246cfe4409c6ae92e72ecb67ab8 ]
reclaim isn't set in the alloc case, however we only care about reclaim in the !alloc case. This isn't an actual problem, however -Wmaybe-uninitialized will complain, so initialize reclaim to quiet the compiler.
Reviewed-by: Qu Wenruo wqu@suse.com Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Signed-off-by: Josef Bacik josef@toxicpanda.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Stable-dep-of: df384da5a49c ("btrfs: use temporary variable for space_info in btrfs_update_block_group") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/block-group.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c index 8eb625318e785..f40e56d44276d 100644 --- a/fs/btrfs/block-group.c +++ b/fs/btrfs/block-group.c @@ -3336,7 +3336,7 @@ int btrfs_update_block_group(struct btrfs_trans_handle *trans, spin_unlock(&info->delalloc_root_lock);
while (total) { - bool reclaim; + bool reclaim = false;
cache = btrfs_lookup_block_group(info, bytenr); if (!cache) {
From: Josef Bacik josef@toxicpanda.com
[ Upstream commit df384da5a49cace5c5e3100803dfd563fd982f93 ]
We do
cache->space_info->counter += num_bytes;
everywhere in here. This is makes the lines longer than they need to be, and will be especially noticeable when we add the active tracking in, so add a temp variable for the space_info so this is cleaner.
Reviewed-by: Naohiro Aota naohiro.aota@wdc.com Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Reviewed-by: Anand Jain anand.jain@oracle.com Signed-off-by: Josef Bacik josef@toxicpanda.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/block-group.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/fs/btrfs/block-group.c b/fs/btrfs/block-group.c index f40e56d44276d..c70a888bf8bf6 100644 --- a/fs/btrfs/block-group.c +++ b/fs/btrfs/block-group.c @@ -3336,6 +3336,7 @@ int btrfs_update_block_group(struct btrfs_trans_handle *trans, spin_unlock(&info->delalloc_root_lock);
while (total) { + struct btrfs_space_info *space_info; bool reclaim = false;
cache = btrfs_lookup_block_group(info, bytenr); @@ -3343,6 +3344,7 @@ int btrfs_update_block_group(struct btrfs_trans_handle *trans, ret = -ENOENT; break; } + space_info = cache->space_info; factor = btrfs_bg_type_to_factor(cache->flags);
/* @@ -3357,7 +3359,7 @@ int btrfs_update_block_group(struct btrfs_trans_handle *trans, byte_in_group = bytenr - cache->start; WARN_ON(byte_in_group > cache->length);
- spin_lock(&cache->space_info->lock); + spin_lock(&space_info->lock); spin_lock(&cache->lock);
if (btrfs_test_opt(info, SPACE_CACHE) && @@ -3370,23 +3372,23 @@ int btrfs_update_block_group(struct btrfs_trans_handle *trans, old_val += num_bytes; cache->used = old_val; cache->reserved -= num_bytes; - cache->space_info->bytes_reserved -= num_bytes; - cache->space_info->bytes_used += num_bytes; - cache->space_info->disk_used += num_bytes * factor; + space_info->bytes_reserved -= num_bytes; + space_info->bytes_used += num_bytes; + space_info->disk_used += num_bytes * factor; spin_unlock(&cache->lock); - spin_unlock(&cache->space_info->lock); + spin_unlock(&space_info->lock); } else { old_val -= num_bytes; cache->used = old_val; cache->pinned += num_bytes; - btrfs_space_info_update_bytes_pinned(info, - cache->space_info, num_bytes); - cache->space_info->bytes_used -= num_bytes; - cache->space_info->disk_used -= num_bytes * factor; + btrfs_space_info_update_bytes_pinned(info, space_info, + num_bytes); + space_info->bytes_used -= num_bytes; + space_info->disk_used -= num_bytes * factor;
reclaim = should_reclaim_block_group(cache, num_bytes); spin_unlock(&cache->lock); - spin_unlock(&cache->space_info->lock); + spin_unlock(&space_info->lock);
set_extent_dirty(&trans->transaction->pinned_extents, bytenr, bytenr + num_bytes - 1,
From: Arseniy Krasnov AVKrasnov@sberdevices.ru
[ Upstream commit 4ce341de6c02d02aba7c78a6447ccfcaa9eeb328 ]
This structure must be zeroed, because it's field 'hw->core' is used as 'parent' in 'clk_core_fill_parent_index()', but it will be uninitialized. This happens, because when this struct is not zeroed, pointer 'hw' is "initialized" by garbage, which is valid pointer, but points to some garbage. So 'hw' will be dereferenced, but 'core' contains some random data which will be interpreted as a pointer. The following backtrace is result of dereference of such pointer:
[ 1.081319] __clk_register+0x414/0x820 [ 1.085113] devm_clk_register+0x64/0xd0 [ 1.088995] meson_nfc_probe+0x258/0x6ec [ 1.092875] platform_probe+0x70/0xf0 [ 1.096498] really_probe+0xc8/0x3e0 [ 1.100034] __driver_probe_device+0x84/0x190 [ 1.104346] driver_probe_device+0x44/0x120 [ 1.108487] __driver_attach+0xb4/0x220 [ 1.112282] bus_for_each_dev+0x78/0xd0 [ 1.116077] driver_attach+0x2c/0x40 [ 1.119613] bus_add_driver+0x184/0x240 [ 1.123408] driver_register+0x80/0x140 [ 1.127203] __platform_driver_register+0x30/0x40 [ 1.131860] meson_nfc_driver_init+0x24/0x30
Fixes: 1e4d3ba66888 ("mtd: rawnand: meson: fix the clock") Signed-off-by: Arseniy Krasnov AVKrasnov@sberdevices.ru Acked-by: Martin Blumenstingl martin.blumenstingl@googlemail.com Reviewed-by: Neil Armstrong neil.armstrong@linaro.org Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com Link: https://lore.kernel.org/linux-mtd/20230227102425.793841-1-AVKrasnov@sberdevi... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mtd/nand/raw/meson_nand.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c index 5ee01231ac4cd..30e326adabfc1 100644 --- a/drivers/mtd/nand/raw/meson_nand.c +++ b/drivers/mtd/nand/raw/meson_nand.c @@ -991,7 +991,7 @@ static const struct mtd_ooblayout_ops meson_ooblayout_ops = {
static int meson_nfc_clk_init(struct meson_nfc *nfc) { - struct clk_parent_data nfc_divider_parent_data[1]; + struct clk_parent_data nfc_divider_parent_data[1] = {0}; struct clk_init_data init = {0}; int ret;
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
[ Upstream commit 75dce6a941e3f16c3b4878c8b2f46d5d07c619ce ]
wait_for_completion_timeout() and readl_poll_timeout() don't handle their return value the same way.
wait_for_completion_timeout() returns 0 on time out (and >0 in all other cases) readl_poll_timeout() returns 0 on success and -ETIMEDOUT upon a timeout.
In order for the error handling path to work in both cases, the logic against wait_for_completion_timeout() needs to be inverted.
Fixes: 48e6633a9fa2 ("mtd: nand: mxic-ecc: Add Macronix external ECC engine support") Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com Link: https://lore.kernel.org/linux-mtd/beddbc374557e44ceec897e68c4a5d12764ddbb9.1... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mtd/nand/ecc-mxic.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/mtd/nand/ecc-mxic.c b/drivers/mtd/nand/ecc-mxic.c index 8afdca731b874..6b487ffe2f2dc 100644 --- a/drivers/mtd/nand/ecc-mxic.c +++ b/drivers/mtd/nand/ecc-mxic.c @@ -429,6 +429,7 @@ static int mxic_ecc_data_xfer_wait_for_completion(struct mxic_ecc_engine *mxic) mxic_ecc_enable_int(mxic); ret = wait_for_completion_timeout(&mxic->complete, msecs_to_jiffies(1000)); + ret = ret ? 0 : -ETIMEDOUT; mxic_ecc_disable_int(mxic); } else { ret = readl_poll_timeout(mxic->regs + INTRPT_STS, val,
From: GuoRui.Yu GuoRui.Yu@linux.alibaba.com
[ Upstream commit 7c3940bf81e5664cdb50c3fedfec8f0a756a34fb ]
In general, if swiotlb is sufficient, the logic of index = wrap_area_index(mem, index + 1) is fine, it will quickly take a slot and release the area->lock; But if swiotlb is insufficient and the device has min_align_mask requirements, such as NVME, we may not be able to satisfy index == wrap and exit the loop properly. In this case, other kernel threads will not be able to acquire the area->lock and release the slot, resulting in a deadlock.
The current implementation of wrap_area_index does not involve a modulo operation, so adjusting the wrap to ensure the loop ends is not trivial. Introduce a new variable to record the number of loops and exit the loop after completing the traversal.
Backtraces: Other CPUs are waiting this core to exit the swiotlb_do_find_slots loop. [10199.924391] RIP: 0010:swiotlb_do_find_slots+0x1fe/0x3e0 [10199.924403] Call Trace: [10199.924404] <TASK> [10199.924405] swiotlb_tbl_map_single+0xec/0x1f0 [10199.924407] swiotlb_map+0x5c/0x260 [10199.924409] ? nvme_pci_setup_prps+0x1ed/0x340 [10199.924411] dma_direct_map_page+0x12e/0x1c0 [10199.924413] nvme_map_data+0x304/0x370 [10199.924415] nvme_prep_rq.part.0+0x31/0x120 [10199.924417] nvme_queue_rq+0x77/0x1f0
... [ 9639.596311] NMI backtrace for cpu 48 [ 9639.596336] Call Trace: [ 9639.596337] [ 9639.596338] _raw_spin_lock_irqsave+0x37/0x40 [ 9639.596341] swiotlb_do_find_slots+0xef/0x3e0 [ 9639.596344] swiotlb_tbl_map_single+0xec/0x1f0 [ 9639.596347] swiotlb_map+0x5c/0x260 [ 9639.596349] dma_direct_map_sg+0x7a/0x280 [ 9639.596352] __dma_map_sg_attrs+0x30/0x70 [ 9639.596355] dma_map_sgtable+0x1d/0x30 [ 9639.596356] nvme_map_data+0xce/0x370
... [ 9639.595665] NMI backtrace for cpu 50 [ 9639.595682] Call Trace: [ 9639.595682] [ 9639.595683] _raw_spin_lock_irqsave+0x37/0x40 [ 9639.595686] swiotlb_release_slots.isra.0+0x86/0x180 [ 9639.595688] dma_direct_unmap_sg+0xcf/0x1a0 [ 9639.595690] nvme_unmap_data.part.0+0x43/0xc0
Fixes: 1f221a0d0dbf ("swiotlb: respect min_align_mask") Signed-off-by: GuoRui.Yu GuoRui.Yu@linux.alibaba.com Signed-off-by: Xiaokang Hu xiaokang.hxk@alibaba-inc.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/dma/swiotlb.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index ef3bc3a5bbed3..869dd6667c464 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -633,8 +633,8 @@ static int swiotlb_do_find_slots(struct device *dev, int area_index, unsigned int iotlb_align_mask = dma_get_min_align_mask(dev) & ~(IO_TLB_SIZE - 1); unsigned int nslots = nr_slots(alloc_size), stride; - unsigned int index, wrap, count = 0, i; unsigned int offset = swiotlb_align_offset(dev, orig_addr); + unsigned int index, slots_checked, count = 0, i; unsigned long flags; unsigned int slot_base; unsigned int slot_index; @@ -657,15 +657,16 @@ static int swiotlb_do_find_slots(struct device *dev, int area_index, goto not_found;
slot_base = area_index * mem->area_nslabs; - index = wrap = wrap_area_index(mem, ALIGN(area->index, stride)); + index = wrap_area_index(mem, ALIGN(area->index, stride));
- do { + for (slots_checked = 0; slots_checked < mem->area_nslabs; ) { slot_index = slot_base + index;
if (orig_addr && (slot_addr(tbl_dma_addr, slot_index) & iotlb_align_mask) != (orig_addr & iotlb_align_mask)) { index = wrap_area_index(mem, index + 1); + slots_checked++; continue; }
@@ -681,7 +682,8 @@ static int swiotlb_do_find_slots(struct device *dev, int area_index, goto found; } index = wrap_area_index(mem, index + stride); - } while (index != wrap); + slots_checked += stride; + }
not_found: spin_unlock_irqrestore(&area->lock, flags);
From: Harshit Mogalapalli harshit.m.mogalapalli@oracle.com
[ Upstream commit 748b2f5e82d17480404b3e2895388fc2925f7caf ]
mac_len is of type unsigned, which can never be less than zero.
mac_len = ieee802154_hdr_peek_addrs(skb, &header); if (mac_len < 0) return mac_len;
Change this to type int as ieee802154_hdr_peek_addrs() can return negative integers, this is found by static analysis with smatch.
Fixes: 6c993779ea1d ("ca8210: fix mac_len negative array access") Signed-off-by: Harshit Mogalapalli harshit.m.mogalapalli@oracle.com Acked-by: Alexander Aring aahringo@redhat.com Reviewed-by: Simon Horman simon.horman@corigine.com Link: https://lore.kernel.org/r/20230306191824.4115839-1-harshit.m.mogalapalli@ora... Signed-off-by: Stefan Schmidt stefan@datenfreihafen.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ieee802154/ca8210.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/net/ieee802154/ca8210.c b/drivers/net/ieee802154/ca8210.c index 0b0c6c0764fe9..d0b5129439ed6 100644 --- a/drivers/net/ieee802154/ca8210.c +++ b/drivers/net/ieee802154/ca8210.c @@ -1902,10 +1902,9 @@ static int ca8210_skb_tx( struct ca8210_priv *priv ) { - int status; struct ieee802154_hdr header = { }; struct secspec secspec; - unsigned int mac_len; + int mac_len, status;
dev_dbg(&priv->spi->dev, "%s called\n", __func__);
From: Rajnesh Kanwal rkanwal@rivosinc.com
[ Upstream commit 6eff38048944cadc3cddcf117acfa5199ec32490 ]
In case when VCPU is blocked due to WFI, we schedule the timer from `kvm_riscv_vcpu_timer_blocking()` to keep timer interrupt ticking.
But in case when delta_ns comes to be zero, we never schedule the timer and VCPU keeps sleeping indefinitely until any activity is done with VM console.
This is easily reproduce-able using kvmtool. ./lkvm-static run -c1 --console virtio -p "earlycon root=/dev/vda" \ -k ./Image -d rootfs.ext4
Also, just add a print in kvm_riscv_vcpu_vstimer_expired() to check the interrupt delivery and run `top` or similar auto-upating cmd from guest. Within sometime one can notice that print from timer expiry routine stops and the `top` cmd output will stop updating.
This change fixes this by making sure we schedule the timer even with delta_ns being zero to bring the VCPU out of sleep immediately.
Fixes: 8f5cb44b1bae ("RISC-V: KVM: Support sstc extension") Signed-off-by: Rajnesh Kanwal rkanwal@rivosinc.com Reviewed-by: Atish Patra atishp@rivosinc.com Signed-off-by: Anup Patel anup@brainfault.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/kvm/vcpu_timer.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/arch/riscv/kvm/vcpu_timer.c b/arch/riscv/kvm/vcpu_timer.c index ad34519c8a13d..3ac2ff6a65dac 100644 --- a/arch/riscv/kvm/vcpu_timer.c +++ b/arch/riscv/kvm/vcpu_timer.c @@ -147,10 +147,8 @@ static void kvm_riscv_vcpu_timer_blocking(struct kvm_vcpu *vcpu) return;
delta_ns = kvm_riscv_delta_cycles2ns(t->next_cycles, gt, t); - if (delta_ns) { - hrtimer_start(&t->hrt, ktime_set(0, delta_ns), HRTIMER_MODE_REL); - t->next_set = true; - } + hrtimer_start(&t->hrt, ktime_set(0, delta_ns), HRTIMER_MODE_REL); + t->next_set = true; }
static void kvm_riscv_vcpu_timer_unblocking(struct kvm_vcpu *vcpu)
From: Álvaro Fernández Rojas noltari@gmail.com
[ Upstream commit ab327f8acdf8d06601fbf058859a539a9422afff ]
RAC flush causes kernel panics on BCM6358 with EHCI/OHCI when booting from TP1: [ 3.881739] usb 1-1: new high-speed USB device number 2 using ehci-platform [ 3.895011] Reserved instruction in kernel code[#1]: [ 3.900113] CPU: 0 PID: 1 Comm: init Not tainted 5.10.16 #0 [ 3.905829] $ 0 : 00000000 10008700 00000000 77d94060 [ 3.911238] $ 4 : 7fd1f088 00000000 81431cac 81431ca0 [ 3.916641] $ 8 : 00000000 ffffefff 8075cd34 00000000 [ 3.922043] $12 : 806f8d40 f3e812b7 00000000 000d9aaa [ 3.927446] $16 : 7fd1f068 7fd1f080 7ff559b8 81428470 [ 3.932848] $20 : 00000000 00000000 55590000 77d70000 [ 3.938251] $24 : 00000018 00000010 [ 3.943655] $28 : 81430000 81431e60 81431f28 800157fc [ 3.949058] Hi : 00000000 [ 3.952013] Lo : 00000000 [ 3.955019] epc : 80015808 setup_sigcontext+0x54/0x24c [ 3.960464] ra : 800157fc setup_sigcontext+0x48/0x24c [ 3.965913] Status: 10008703 KERNEL EXL IE [ 3.970216] Cause : 00800028 (ExcCode 0a) [ 3.974340] PrId : 0002a010 (Broadcom BMIPS4350) [ 3.979170] Modules linked in: ohci_platform ohci_hcd fsl_mph_dr_of ehci_platform ehci_fsl ehci_hcd gpio_button_hotplug usbcore nls_base usb_common [ 3.992907] Process init (pid: 1, threadinfo=(ptrval), task=(ptrval), tls=77e22ec8) [ 4.000776] Stack : 81431ef4 7fd1f080 81431f28 81428470 7fd1f068 81431edc 7ff559b8 81428470 [ 4.009467] 81431f28 7fd1f080 55590000 77d70000 77d5498c 80015c70 806f0000 8063ae74 [ 4.018149] 08100002 81431f28 0000000a 08100002 81431f28 0000000a 77d6b418 00000003 [ 4.026831] ffffffff 80016414 80080734 81431ecc 81431ecc 00000001 00000000 04000000 [ 4.035512] 77d54874 00000000 00000000 00000000 00000000 00000012 00000002 00000000 [ 4.044196] ... [ 4.046706] Call Trace: [ 4.049238] [<80015808>] setup_sigcontext+0x54/0x24c [ 4.054356] [<80015c70>] setup_frame+0xdc/0x124 [ 4.059015] [<80016414>] do_notify_resume+0x1dc/0x288 [ 4.064207] [<80011b50>] work_notifysig+0x10/0x18 [ 4.069036] [ 4.070538] Code: 8fc300b4 00001025 26240008 <ac820000> ac830004 3c048063 0c0228aa 24846a00 26240010 [ 4.080686] [ 4.082517] ---[ end trace 22a8edb41f5f983b ]--- [ 4.087374] Kernel panic - not syncing: Fatal exception [ 4.092753] Rebooting in 1 seconds..
Because the bootloader (CFE) is not initializing the Read-ahead cache properly on the second thread (TP1). Since the RAC was not initialized properly, we should avoid flushing it at the risk of corrupting the instruction stream as seen in the trace above.
Fixes: d59098a0e9cb ("MIPS: bmips: use generic dma noncoherent ops") Signed-off-by: Álvaro Fernández Rojas noltari@gmail.com Signed-off-by: Thomas Bogendoerfer tsbogend@alpha.franken.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/mips/bmips/dma.c | 5 +++++ arch/mips/bmips/setup.c | 8 ++++++++ 2 files changed, 13 insertions(+)
diff --git a/arch/mips/bmips/dma.c b/arch/mips/bmips/dma.c index 33788668cbdbf..3779e7855bd75 100644 --- a/arch/mips/bmips/dma.c +++ b/arch/mips/bmips/dma.c @@ -5,6 +5,8 @@ #include <asm/bmips.h> #include <asm/io.h>
+bool bmips_rac_flush_disable; + void arch_sync_dma_for_cpu_all(void) { void __iomem *cbr = BMIPS_GET_CBR(); @@ -15,6 +17,9 @@ void arch_sync_dma_for_cpu_all(void) boot_cpu_type() != CPU_BMIPS4380) return;
+ if (unlikely(bmips_rac_flush_disable)) + return; + /* Flush stale data out of the readahead cache */ cfg = __raw_readl(cbr + BMIPS_RAC_CONFIG); __raw_writel(cfg | 0x100, cbr + BMIPS_RAC_CONFIG); diff --git a/arch/mips/bmips/setup.c b/arch/mips/bmips/setup.c index e95b3f78e7cd4..549a6392a3d2d 100644 --- a/arch/mips/bmips/setup.c +++ b/arch/mips/bmips/setup.c @@ -35,6 +35,8 @@ #define REG_BCM6328_OTP ((void __iomem *)CKSEG1ADDR(0x1000062c)) #define BCM6328_TP1_DISABLED BIT(9)
+extern bool bmips_rac_flush_disable; + static const unsigned long kbase = VMLINUX_LOAD_ADDRESS & 0xfff00000;
struct bmips_quirk { @@ -104,6 +106,12 @@ static void bcm6358_quirks(void) * disable SMP for now */ bmips_smp_enabled = 0; + + /* + * RAC flush causes kernel panics on BCM6358 when booting from TP1 + * because the bootloader is not initializing it properly. + */ + bmips_rac_flush_disable = !!(read_c0_brcm_cmt_local() & (1 << 31)); }
static void bcm6368_quirks(void)
From: Takashi Iwai tiwai@suse.de
[ Upstream commit 8c721c53dda512fdd48eb24d6d99e56deee57898 ]
The recent support of low latency playback in USB-audio driver made the snd_usb_queue_pending_output_urbs() function to be called via PCM ack ops. In the new code path, the function is performed already in the PCM stream lock. The problem is that, when an XRUN is detected, the function calls snd_pcm_xrun() to notify, but snd_pcm_xrun() is supposed to be called only outside the stream lock. As a result, it leads to a deadlock of PCM stream locking.
For avoiding such a recursive locking, this patch adds an additional check to the code paths in PCM core that call the ack callback; now it checks the error code from the callback, and if it's -EPIPE, the XRUN is handled in the PCM core side gracefully. Along with it, the USB-audio driver code is changed to follow that, i.e. -EPIPE is returned instead of the explicit snd_pcm_xrun() call when the function is performed already in the stream lock.
Fixes: d5f871f89e21 ("ALSA: usb-audio: Improved lowlatency playback support") Reported-and-tested-by: John Keeping john@metanate.com Link: https://lore.kernel.org/r/20230317195128.3911155-1-john@metanate.com Reviewed-by: Jaroslav Kysela perex@perex.cz Reviewed-by; Takashi Sakamoto o-takashi@sakamocchi.jp Link: https://lore.kernel.org/r/20230320142838.494-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/core/pcm_lib.c | 2 ++ sound/usb/endpoint.c | 22 ++++++++++++++-------- sound/usb/endpoint.h | 4 ++-- sound/usb/pcm.c | 2 +- 4 files changed, 19 insertions(+), 11 deletions(-)
diff --git a/sound/core/pcm_lib.c b/sound/core/pcm_lib.c index 8b6aeb8a78f7d..02fd65993e7e5 100644 --- a/sound/core/pcm_lib.c +++ b/sound/core/pcm_lib.c @@ -2155,6 +2155,8 @@ int pcm_lib_apply_appl_ptr(struct snd_pcm_substream *substream, ret = substream->ops->ack(substream); if (ret < 0) { runtime->control->appl_ptr = old_appl_ptr; + if (ret == -EPIPE) + __snd_pcm_xrun(substream); return ret; } } diff --git a/sound/usb/endpoint.c b/sound/usb/endpoint.c index 419302e2057e8..647fa054d8b1d 100644 --- a/sound/usb/endpoint.c +++ b/sound/usb/endpoint.c @@ -455,8 +455,8 @@ static void push_back_to_ready_list(struct snd_usb_endpoint *ep, * This function is used both for implicit feedback endpoints and in low- * latency playback mode. */ -void snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep, - bool in_stream_lock) +int snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep, + bool in_stream_lock) { bool implicit_fb = snd_usb_endpoint_implicit_feedback_sink(ep);
@@ -480,7 +480,7 @@ void snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep, spin_unlock_irqrestore(&ep->lock, flags);
if (ctx == NULL) - return; + break;
/* copy over the length information */ if (implicit_fb) { @@ -495,11 +495,14 @@ void snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep, break; if (err < 0) { /* push back to ready list again for -EAGAIN */ - if (err == -EAGAIN) + if (err == -EAGAIN) { push_back_to_ready_list(ep, ctx); - else + break; + } + + if (!in_stream_lock) notify_xrun(ep); - return; + return -EPIPE; }
err = usb_submit_urb(ctx->urb, GFP_ATOMIC); @@ -507,13 +510,16 @@ void snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep, usb_audio_err(ep->chip, "Unable to submit urb #%d: %d at %s\n", ctx->index, err, __func__); - notify_xrun(ep); - return; + if (!in_stream_lock) + notify_xrun(ep); + return -EPIPE; }
set_bit(ctx->index, &ep->active_mask); atomic_inc(&ep->submitted_urbs); } + + return 0; }
/* diff --git a/sound/usb/endpoint.h b/sound/usb/endpoint.h index 924f4351588ce..c09f68ce08b18 100644 --- a/sound/usb/endpoint.h +++ b/sound/usb/endpoint.h @@ -52,7 +52,7 @@ int snd_usb_endpoint_implicit_feedback_sink(struct snd_usb_endpoint *ep); int snd_usb_endpoint_next_packet_size(struct snd_usb_endpoint *ep, struct snd_urb_ctx *ctx, int idx, unsigned int avail); -void snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep, - bool in_stream_lock); +int snd_usb_queue_pending_output_urbs(struct snd_usb_endpoint *ep, + bool in_stream_lock);
#endif /* __USBAUDIO_ENDPOINT_H */ diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c index d959da7a1afba..eec5232f9fb29 100644 --- a/sound/usb/pcm.c +++ b/sound/usb/pcm.c @@ -1639,7 +1639,7 @@ static int snd_usb_pcm_playback_ack(struct snd_pcm_substream *substream) * outputs here */ if (!ep->active_mask) - snd_usb_queue_pending_output_urbs(ep, true); + return snd_usb_queue_pending_output_urbs(ep, true); return 0; }
From: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com
[ Upstream commit cdce67099117ece371582f706c6eff7d3a65326d ]
If CDM_CHECK is enabled (by the DT "snps,enable-cdm-check" property), 'val' is overwritten by PCIE_PL_CHK_REG_CONTROL_STATUS initialization. Commit ec7b952f453c ("PCI: dwc: Always enable CDM check if "snps,enable-cdm-check" exists") did not account for further usage of 'val', so we wrote improper values to PCIE_PORT_LINK_CONTROL when the CDM check is enabled.
Move the PCIE_PORT_LINK_CONTROL update to be completely after the PCIE_PL_CHK_REG_CONTROL_STATUS register initialization.
[bhelgaas: commit log adapted from Serge's version] Fixes: ec7b952f453c ("PCI: dwc: Always enable CDM check if "snps,enable-cdm-check" exists") Link: https://lore.kernel.org/r/20230310123510.675685-2-yoshihiro.shimoda.uh@renes... Signed-off-by: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Serge Semin fancer.lancer@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/controller/dwc/pcie-designware.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/pci/controller/dwc/pcie-designware.c b/drivers/pci/controller/dwc/pcie-designware.c index 6d5d619ab2e94..346f67d2fdae2 100644 --- a/drivers/pci/controller/dwc/pcie-designware.c +++ b/drivers/pci/controller/dwc/pcie-designware.c @@ -806,11 +806,6 @@ void dw_pcie_setup(struct dw_pcie *pci) dw_pcie_writel_dbi(pci, PCIE_LINK_WIDTH_SPEED_CONTROL, val); }
- val = dw_pcie_readl_dbi(pci, PCIE_PORT_LINK_CONTROL); - val &= ~PORT_LINK_FAST_LINK_MODE; - val |= PORT_LINK_DLL_LINK_EN; - dw_pcie_writel_dbi(pci, PCIE_PORT_LINK_CONTROL, val); - if (dw_pcie_cap_is(pci, CDM_CHECK)) { val = dw_pcie_readl_dbi(pci, PCIE_PL_CHK_REG_CONTROL_STATUS); val |= PCIE_PL_CHK_REG_CHK_REG_CONTINUOUS | @@ -818,6 +813,11 @@ void dw_pcie_setup(struct dw_pcie *pci) dw_pcie_writel_dbi(pci, PCIE_PL_CHK_REG_CONTROL_STATUS, val); }
+ val = dw_pcie_readl_dbi(pci, PCIE_PORT_LINK_CONTROL); + val &= ~PORT_LINK_FAST_LINK_MODE; + val |= PORT_LINK_DLL_LINK_EN; + dw_pcie_writel_dbi(pci, PCIE_PORT_LINK_CONTROL, val); + if (!pci->num_lanes) { dev_dbg(pci->dev, "Using h/w default number of lanes\n"); return;
From: Petr Tesarik petr.tesarik.ext@huawei.com
[ Upstream commit 0eee5ae1025699ea93d44fdb6ef2365505082103 ]
Explicit alignment and page alignment are used only to calculate the stride, not when checking actual slot physical address.
Originally, only page alignment was implemented, and that worked, because the whole SWIOTLB is allocated on a page boundary, so aligning the start index was sufficient to ensure a page-aligned slot.
When commit 1f221a0d0dbf ("swiotlb: respect min_align_mask") added support for min_align_mask, the index could be incremented in the search loop, potentially finding an unaligned slot if minimum device alignment is between IO_TLB_SIZE and PAGE_SIZE. The bug could go unnoticed, because the slot size is 2 KiB, and the most common page size is 4 KiB, so there is no alignment value in between.
IIUC the intention has been to find a slot that conforms to all alignment constraints: device minimum alignment, an explicit alignment (given as function parameter) and optionally page alignment (if allocation size is >= PAGE_SIZE). The most restrictive mask can be trivially computed with logical AND. The rest can stay.
Fixes: 1f221a0d0dbf ("swiotlb: respect min_align_mask") Fixes: e81e99bacc9f ("swiotlb: Support aligned swiotlb buffers") Signed-off-by: Petr Tesarik petr.tesarik.ext@huawei.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/dma/swiotlb.c | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/kernel/dma/swiotlb.c b/kernel/dma/swiotlb.c index 869dd6667c464..312458506e6d5 100644 --- a/kernel/dma/swiotlb.c +++ b/kernel/dma/swiotlb.c @@ -642,22 +642,26 @@ static int swiotlb_do_find_slots(struct device *dev, int area_index, BUG_ON(!nslots); BUG_ON(area_index >= mem->nareas);
+ /* + * For allocations of PAGE_SIZE or larger only look for page aligned + * allocations. + */ + if (alloc_size >= PAGE_SIZE) + iotlb_align_mask &= PAGE_MASK; + iotlb_align_mask &= alloc_align_mask; + /* * For mappings with an alignment requirement don't bother looping to - * unaligned slots once we found an aligned one. For allocations of - * PAGE_SIZE or larger only look for page aligned allocations. + * unaligned slots once we found an aligned one. */ stride = (iotlb_align_mask >> IO_TLB_SHIFT) + 1; - if (alloc_size >= PAGE_SIZE) - stride = max(stride, stride << (PAGE_SHIFT - IO_TLB_SHIFT)); - stride = max(stride, (alloc_align_mask >> IO_TLB_SHIFT) + 1);
spin_lock_irqsave(&area->lock, flags); if (unlikely(nslots > mem->area_nslabs - area->used)) goto not_found;
slot_base = area_index * mem->area_nslabs; - index = wrap_area_index(mem, ALIGN(area->index, stride)); + index = area->index;
for (slots_checked = 0; slots_checked < mem->area_nslabs; ) { slot_index = slot_base + index;
From: Mark Pearson mpearson-lenovo@squebb.ca
[ Upstream commit 583329dcf22e568a328a944f20427ccfc95dce01 ]
This driver was missing the mandatory type attribute...oops.
Add it in along with logic to determine whether the attribute is an enumeration type or a string by parsing the possible_values attribute.
Upstream bug https://bugzilla.kernel.org/show_bug.cgi?id=216460
Fixes: a40cd7ef22fb ("platform/x86: think-lmi: Add WMI interface support on Lenovo platforms") Signed-off-by: Mark Pearson mpearson-lenovo@squebb.ca Link: https://lore.kernel.org/r/20230320003221.561750-1-mpearson-lenovo@squebb.ca Reviewed-by: Thomas Weißschuh linux@weissschuh.net Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/think-lmi.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+)
diff --git a/drivers/platform/x86/think-lmi.c b/drivers/platform/x86/think-lmi.c index a01a92769c1a3..07c9dc21eff52 100644 --- a/drivers/platform/x86/think-lmi.c +++ b/drivers/platform/x86/think-lmi.c @@ -947,6 +947,20 @@ static ssize_t possible_values_show(struct kobject *kobj, struct kobj_attribute return sysfs_emit(buf, "%s\n", setting->possible_values); }
+static ssize_t type_show(struct kobject *kobj, struct kobj_attribute *attr, + char *buf) +{ + struct tlmi_attr_setting *setting = to_tlmi_attr_setting(kobj); + + if (setting->possible_values) { + /* Figure out what setting type is as BIOS does not return this */ + if (strchr(setting->possible_values, ',')) + return sysfs_emit(buf, "enumeration\n"); + } + /* Anything else is going to be a string */ + return sysfs_emit(buf, "string\n"); +} + static ssize_t current_value_store(struct kobject *kobj, struct kobj_attribute *attr, const char *buf, size_t count) @@ -1036,10 +1050,13 @@ static struct kobj_attribute attr_possible_values = __ATTR_RO(possible_values);
static struct kobj_attribute attr_current_val = __ATTR_RW_MODE(current_value, 0600);
+static struct kobj_attribute attr_type = __ATTR_RO(type); + static struct attribute *tlmi_attrs[] = { &attr_displ_name.attr, &attr_current_val.attr, &attr_possible_values.attr, + &attr_type.attr, NULL };
From: Mark Pearson mpearson-lenovo@squebb.ca
[ Upstream commit 45e21289bfc6e257885514790a8a8887da822d40 ]
firmware-attributes class requires that possible values are delimited using ';' but the Lenovo firmware uses ',' instead. Parse string and replace where appropriate.
Suggested-by: Thomas Weißschuh linux@weissschuh.net Fixes: a40cd7ef22fb ("platform/x86: think-lmi: Add WMI interface support on Lenovo platforms") Signed-off-by: Mark Pearson mpearson-lenovo@squebb.ca Link: https://lore.kernel.org/r/20230320003221.561750-2-mpearson-lenovo@squebb.ca Reviewed-by: Thomas Weißschuh linux@weissschuh.net Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/think-lmi.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/platform/x86/think-lmi.c b/drivers/platform/x86/think-lmi.c index 07c9dc21eff52..62241680c8a90 100644 --- a/drivers/platform/x86/think-lmi.c +++ b/drivers/platform/x86/think-lmi.c @@ -954,7 +954,7 @@ static ssize_t type_show(struct kobject *kobj, struct kobj_attribute *attr,
if (setting->possible_values) { /* Figure out what setting type is as BIOS does not return this */ - if (strchr(setting->possible_values, ',')) + if (strchr(setting->possible_values, ';')) return sysfs_emit(buf, "enumeration\n"); } /* Anything else is going to be a string */ @@ -1441,6 +1441,13 @@ static int tlmi_analyze(void) pr_info("Error retrieving possible values for %d : %s\n", i, setting->display_name); } + /* + * firmware-attributes requires that possible_values are separated by ';' but + * Lenovo FW uses ','. Replace appropriately. + */ + if (setting->possible_values) + strreplace(setting->possible_values, ',', ';'); + kobject_init(&setting->kobj, &tlmi_attr_setting_ktype); tlmi_priv.setting[i] = setting; kfree(item);
From: Mark Pearson mpearson-lenovo@squebb.ca
[ Upstream commit cf337f27f3bfc4aeab4954c468239fd6233c7638 ]
Some attributes don't have any values available. In those cases don't make the possible_values entry visible.
Fixes: a40cd7ef22fb ("platform/x86: think-lmi: Add WMI interface support on Lenovo platforms") Signed-off-by: Mark Pearson mpearson-lenovo@squebb.ca Link: https://lore.kernel.org/r/20230320003221.561750-3-mpearson-lenovo@squebb.ca Reviewed-by: Thomas Weißschuh linux@weissschuh.net Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/think-lmi.c | 16 +++++++++++++--- 1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/drivers/platform/x86/think-lmi.c b/drivers/platform/x86/think-lmi.c index 62241680c8a90..ccd085bacf298 100644 --- a/drivers/platform/x86/think-lmi.c +++ b/drivers/platform/x86/think-lmi.c @@ -941,9 +941,6 @@ static ssize_t possible_values_show(struct kobject *kobj, struct kobj_attribute { struct tlmi_attr_setting *setting = to_tlmi_attr_setting(kobj);
- if (!tlmi_priv.can_get_bios_selections) - return -EOPNOTSUPP; - return sysfs_emit(buf, "%s\n", setting->possible_values); }
@@ -1052,6 +1049,18 @@ static struct kobj_attribute attr_current_val = __ATTR_RW_MODE(current_value, 06
static struct kobj_attribute attr_type = __ATTR_RO(type);
+static umode_t attr_is_visible(struct kobject *kobj, + struct attribute *attr, int n) +{ + struct tlmi_attr_setting *setting = to_tlmi_attr_setting(kobj); + + /* We don't want to display possible_values attributes if not available */ + if ((attr == &attr_possible_values.attr) && (!setting->possible_values)) + return 0; + + return attr->mode; +} + static struct attribute *tlmi_attrs[] = { &attr_displ_name.attr, &attr_current_val.attr, @@ -1061,6 +1070,7 @@ static struct attribute *tlmi_attrs[] = { };
static const struct attribute_group tlmi_attr_group = { + .is_visible = attr_is_visible, .attrs = tlmi_attrs, };
From: Mark Pearson mpearson-lenovo@squebb.ca
[ Upstream commit 8a02d70679fc1c434401863333c8ea7dbf201494 ]
ThinkStation platforms don't support the API to return possible_values but instead embed it in the settings string.
Try and extract this information and set the possible_values attribute appropriately.
Fixes: a40cd7ef22fb ("platform/x86: think-lmi: Add WMI interface support on Lenovo platforms") Signed-off-by: Mark Pearson mpearson-lenovo@squebb.ca Link: https://lore.kernel.org/r/20230320003221.561750-4-mpearson-lenovo@squebb.ca Reviewed-by: Thomas Weißschuh linux@weissschuh.net Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/think-lmi.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+)
diff --git a/drivers/platform/x86/think-lmi.c b/drivers/platform/x86/think-lmi.c index ccd085bacf298..74af3e593b2ca 100644 --- a/drivers/platform/x86/think-lmi.c +++ b/drivers/platform/x86/think-lmi.c @@ -1450,6 +1450,26 @@ static int tlmi_analyze(void) if (ret || !setting->possible_values) pr_info("Error retrieving possible values for %d : %s\n", i, setting->display_name); + } else { + /* + * Older Thinkstations don't support the bios_selections API. + * Instead they store this as a [Optional:Option1,Option2] section of the + * name string. + * Try and pull that out if it's available. + */ + char *item, *optstart, *optend; + + if (!tlmi_setting(setting->index, &item, LENOVO_BIOS_SETTING_GUID)) { + optstart = strstr(item, "[Optional:"); + if (optstart) { + optstart += strlen("[Optional:"); + optend = strstr(optstart, "]"); + if (optend) + setting->possible_values = + kstrndup(optstart, optend - optstart, + GFP_KERNEL); + } + } } /* * firmware-attributes requires that possible_values are separated by ';' but
From: Liang He windhl@126.com
[ Upstream commit acd0acb802b90f88d19ad4337183e44fd0f77c50 ]
In fwnode_for_each_child_node(), we should add fwnode_handle_put() when break out of the iteration fwnode_for_each_child_node() as it will automatically increase and decrease the refcounter.
Fixes: fc622b3d36e6 ("platform/surface: Set up Surface Aggregator device registry") Signed-off-by: Liang He windhl@126.com Reviewed-by: Maximilian Luz luzmaximilian@gmail.com Link: https://lore.kernel.org/r/20230322033057.1855741-1-windhl@126.com Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/surface/aggregator/bus.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/platform/surface/aggregator/bus.c b/drivers/platform/surface/aggregator/bus.c index de539938896e2..b501a79f2a08a 100644 --- a/drivers/platform/surface/aggregator/bus.c +++ b/drivers/platform/surface/aggregator/bus.c @@ -485,8 +485,10 @@ int __ssam_register_clients(struct device *parent, struct ssam_controller *ctrl, * device, so ignore it and continue with the next one. */ status = ssam_add_client_device(parent, ctrl, child); - if (status && status != -ENODEV) + if (status && status != -ENODEV) { + fwnode_handle_put(child); goto err; + } }
return 0;
From: Arseniy Krasnov avkrasnov@sberdevices.ru
[ Upstream commit e732e39ed9929c05fd219035bc9653ba4100d4fa ]
'info_buf' memory is cached and driver polls ECC bit in it. This bit is set by the NAND controller. If 'usleep_range()' returns before device sets this bit, 'info_buf' will be cached and driver won't see update of this bit and will loop forever.
Fixes: 8fae856c5350 ("mtd: rawnand: meson: add support for Amlogic NAND flash controller") Signed-off-by: Arseniy Krasnov AVKrasnov@sberdevices.ru Reviewed-by: Neil Armstrong neil.armstrong@linaro.org Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com Link: https://lore.kernel.org/linux-mtd/d4ef0bd6-816e-f6fa-9385-f05f775f0ae2@sberd... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/mtd/nand/raw/meson_nand.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/drivers/mtd/nand/raw/meson_nand.c b/drivers/mtd/nand/raw/meson_nand.c index 30e326adabfc1..a28574c009003 100644 --- a/drivers/mtd/nand/raw/meson_nand.c +++ b/drivers/mtd/nand/raw/meson_nand.c @@ -176,6 +176,7 @@ struct meson_nfc {
dma_addr_t daddr; dma_addr_t iaddr; + u32 info_bytes;
unsigned long assigned_cs; }; @@ -503,6 +504,7 @@ static int meson_nfc_dma_buffer_setup(struct nand_chip *nand, void *databuf, nfc->daddr, datalen, dir); return ret; } + nfc->info_bytes = infolen; cmd = GENCMDIADDRL(NFC_CMD_AIL, nfc->iaddr); writel(cmd, nfc->reg_base + NFC_REG_CMD);
@@ -520,8 +522,10 @@ static void meson_nfc_dma_buffer_release(struct nand_chip *nand, struct meson_nfc *nfc = nand_get_controller_data(nand);
dma_unmap_single(nfc->dev, nfc->daddr, datalen, dir); - if (infolen) + if (infolen) { dma_unmap_single(nfc->dev, nfc->iaddr, infolen, dir); + nfc->info_bytes = 0; + } }
static int meson_nfc_read_buf(struct nand_chip *nand, u8 *buf, int len) @@ -710,6 +714,8 @@ static void meson_nfc_check_ecc_pages_valid(struct meson_nfc *nfc, usleep_range(10, 15); /* info is updated by nfc dma engine*/ smp_rmb(); + dma_sync_single_for_cpu(nfc->dev, nfc->iaddr, nfc->info_bytes, + DMA_FROM_DEVICE); ret = *info & ECC_COMPLETE; } while (!ret); }
From: Siddharth Kawar Siddharth.Kawar@microsoft.com
[ Upstream commit 943d045a6d796175e5d08f9973953b1d2c07d797 ]
NFS server Duplicate Request Cache (DRC) algorithms rely on NFS clients reconnecting using the same local TCP port. Unique NFS operations are identified by the per-TCP connection set of XIDs. This prevents file corruption when non-idempotent NFS operations are retried.
Currently, NFS client TCP connections are using different local TCP ports when reconnecting to NFS servers.
After an NFS server initiates shutdown of the TCP connection, the NFS client's TCP socket is set to NULL after the socket state has reached TCP_LAST_ACK(9). When reconnecting, the new socket attempts to reuse the same local port but fails with EADDRNOTAVAIL (99). This forces the socket to use a different local TCP port to reconnect to the remote NFS server.
State Transition and Events: TCP_CLOSE_WAIT(8) TCP_LAST_ACK(9) connect(fail EADDRNOTAVAIL(99)) TCP_CLOSE(7) bind on new port connect success
dmesg excerpts showing reconnect switching from TCP local port of 926 to 763 after commit 7c81e6a9d75b: [13354.947854] NFS call mkdir testW ... [13405.654781] RPC: xs_tcp_state_change client 00000000037d0f03... [13405.654813] RPC: state 8 conn 1 dead 0 zapped 1 sk_shutdown 1 [13405.654826] RPC: xs_data_ready... [13405.654892] RPC: xs_tcp_state_change client 00000000037d0f03... [13405.654895] RPC: state 9 conn 0 dead 0 zapped 1 sk_shutdown 3 [13405.654899] RPC: xs_tcp_state_change client 00000000037d0f03... [13405.654900] RPC: state 9 conn 0 dead 0 zapped 1 sk_shutdown 3 [13405.654950] RPC: xs_connect scheduled xprt 00000000037d0f03 [13405.654975] RPC: xs_bind 0.0.0.0:926: ok (0) [13405.654980] RPC: worker connecting xprt 00000000037d0f03 via tcp to 10.101.6.228 (port 2049) [13405.654991] RPC: 00000000037d0f03 connect status 99 connected 0 sock state 7 [13405.655001] RPC: xs_tcp_state_change client 00000000037d0f03... [13405.655002] RPC: state 7 conn 0 dead 0 zapped 1 sk_shutdown 3 [13405.655024] RPC: xs_connect scheduled xprt 00000000037d0f03 [13405.655038] RPC: xs_bind 0.0.0.0:763: ok (0) [13405.655041] RPC: worker connecting xprt 00000000037d0f03 via tcp to 10.101.6.228 (port 2049) [13405.655065] RPC: 00000000037d0f03 connect status 115 connected 0 sock state 2
State Transition and Events with patch applied: TCP_CLOSE_WAIT(8) TCP_LAST_ACK(9) TCP_CLOSE(7) connect(reuse of port succeeds)
dmesg excerpts showing reconnect on same TCP local port of 936 with patch applied: [ 257.139935] NFS: mkdir(0:59/560857152), testQ [ 257.139937] NFS call mkdir testQ ... [ 307.822702] RPC: state 8 conn 1 dead 0 zapped 1 sk_shutdown 1 [ 307.822714] RPC: xs_data_ready... [ 307.822817] RPC: xs_tcp_state_change client 00000000ce702f14... [ 307.822821] RPC: state 9 conn 0 dead 0 zapped 1 sk_shutdown 3 [ 307.822825] RPC: xs_tcp_state_change client 00000000ce702f14... [ 307.822826] RPC: state 9 conn 0 dead 0 zapped 1 sk_shutdown 3 [ 307.823606] RPC: xs_tcp_state_change client 00000000ce702f14... [ 307.823609] RPC: state 7 conn 0 dead 0 zapped 1 sk_shutdown 3 [ 307.823629] RPC: xs_tcp_state_change client 00000000ce702f14... [ 307.823632] RPC: state 7 conn 0 dead 0 zapped 1 sk_shutdown 3 [ 307.823676] RPC: xs_connect scheduled xprt 00000000ce702f14 [ 307.823704] RPC: xs_bind 0.0.0.0:936: ok (0) [ 307.823709] RPC: worker connecting xprt 00000000ce702f14 via tcp to 10.101.1.30 (port 2049) [ 307.823748] RPC: 00000000ce702f14 connect status 115 connected 0 sock state 2 ... [ 314.916193] RPC: state 7 conn 0 dead 0 zapped 1 sk_shutdown 3 [ 314.916251] RPC: xs_connect scheduled xprt 00000000ce702f14 [ 314.916282] RPC: xs_bind 0.0.0.0:936: ok (0) [ 314.916292] RPC: worker connecting xprt 00000000ce702f14 via tcp to 10.101.1.30 (port 2049) [ 314.916342] RPC: 00000000ce702f14 connect status 115 connected 0 sock state 2
Fixes: 7c81e6a9d75b ("SUNRPC: Tweak TCP socket shutdown in the RPC client") Signed-off-by: Siddharth Rajendra Kawar sikawar@microsoft.com Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/sunrpc/xprtsock.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c index aaa5b2741b79d..1b9465b43997c 100644 --- a/net/sunrpc/xprtsock.c +++ b/net/sunrpc/xprtsock.c @@ -2155,6 +2155,7 @@ static void xs_tcp_shutdown(struct rpc_xprt *xprt) switch (skst) { case TCP_FIN_WAIT1: case TCP_FIN_WAIT2: + case TCP_LAST_ACK: break; case TCP_ESTABLISHED: case TCP_CLOSE_WAIT:
From: Íñigo Huguet ihuguet@redhat.com
[ Upstream commit ca4a80e4bb7e87daf33b27d2ab9e4f5311018a89 ]
At NIC reset, some offload features related to encapsulated traffic might have changed (this mainly happens if the firmware-variant is changed with the sfboot userspace tool). Because of this, features are checked and set again at reset time.
However, this was not done right, and some features were improperly overwritten at NIC reset: - Tunneled IPv6 segmentation was always disabled - Features disabled with ethtool were reenabled - Features that becomes unsupported after the reset were not disabled
Also, checking if the device supports IPV6_CSUM to enable TSO6 is no longer necessary because all currently supported devices support it. Additionally, move the assignment of some other features to the EF10_OFFLOAD_FEATURES macro, like it is done in ef100, leaving the selection of features in efx_pci_probe_post_io a bit cleaner.
Fixes: ffffd2454a7a ("sfc: correctly advertise tunneled IPv6 segmentation") Fixes: 24b2c3751aa3 ("sfc: advertise encapsulated offloads on EF10") Reported-by: Tianhao Zhao tizhao@redhat.com Suggested-by: Jonathan Cooper jonathan.s.cooper@amd.com Tested-by: Jonathan Cooper jonathan.s.cooper@amd.com Signed-off-by: Íñigo Huguet ihuguet@redhat.com Acked-by: Edward Cree ecree.xilinx@gmail.com Link: https://lore.kernel.org/r/20230323083417.7345-1-ihuguet@redhat.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/sfc/ef10.c | 38 ++++++++++++++++++++++----------- drivers/net/ethernet/sfc/efx.c | 17 ++++++--------- 2 files changed, 33 insertions(+), 22 deletions(-)
diff --git a/drivers/net/ethernet/sfc/ef10.c b/drivers/net/ethernet/sfc/ef10.c index 7022fb2005a2f..d30459dbfe8f8 100644 --- a/drivers/net/ethernet/sfc/ef10.c +++ b/drivers/net/ethernet/sfc/ef10.c @@ -1304,7 +1304,8 @@ static void efx_ef10_fini_nic(struct efx_nic *efx) static int efx_ef10_init_nic(struct efx_nic *efx) { struct efx_ef10_nic_data *nic_data = efx->nic_data; - netdev_features_t hw_enc_features = 0; + struct net_device *net_dev = efx->net_dev; + netdev_features_t tun_feats, tso_feats; int rc;
if (nic_data->must_check_datapath_caps) { @@ -1349,20 +1350,30 @@ static int efx_ef10_init_nic(struct efx_nic *efx) nic_data->must_restore_piobufs = false; }
- /* add encapsulated checksum offload features */ + /* encap features might change during reset if fw variant changed */ if (efx_has_cap(efx, VXLAN_NVGRE) && !efx_ef10_is_vf(efx)) - hw_enc_features |= NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM; - /* add encapsulated TSO features */ - if (efx_has_cap(efx, TX_TSO_V2_ENCAP)) { - netdev_features_t encap_tso_features; + net_dev->hw_enc_features |= NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM; + else + net_dev->hw_enc_features &= ~(NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM);
- encap_tso_features = NETIF_F_GSO_UDP_TUNNEL | NETIF_F_GSO_GRE | - NETIF_F_GSO_UDP_TUNNEL_CSUM | NETIF_F_GSO_GRE_CSUM; + tun_feats = NETIF_F_GSO_UDP_TUNNEL | NETIF_F_GSO_GRE | + NETIF_F_GSO_UDP_TUNNEL_CSUM | NETIF_F_GSO_GRE_CSUM; + tso_feats = NETIF_F_TSO | NETIF_F_TSO6;
- hw_enc_features |= encap_tso_features | NETIF_F_TSO; - efx->net_dev->features |= encap_tso_features; + if (efx_has_cap(efx, TX_TSO_V2_ENCAP)) { + /* If this is first nic_init, or if it is a reset and a new fw + * variant has added new features, enable them by default. + * If the features are not new, maintain their current value. + */ + if (!(net_dev->hw_features & tun_feats)) + net_dev->features |= tun_feats; + net_dev->hw_enc_features |= tun_feats | tso_feats; + net_dev->hw_features |= tun_feats; + } else { + net_dev->hw_enc_features &= ~(tun_feats | tso_feats); + net_dev->hw_features &= ~tun_feats; + net_dev->features &= ~tun_feats; } - efx->net_dev->hw_enc_features = hw_enc_features;
/* don't fail init if RSS setup doesn't work */ rc = efx->type->rx_push_rss_config(efx, false, @@ -4021,7 +4032,10 @@ static unsigned int efx_ef10_recycle_ring_size(const struct efx_nic *efx) NETIF_F_HW_VLAN_CTAG_FILTER | \ NETIF_F_IPV6_CSUM | \ NETIF_F_RXHASH | \ - NETIF_F_NTUPLE) + NETIF_F_NTUPLE | \ + NETIF_F_SG | \ + NETIF_F_RXCSUM | \ + NETIF_F_RXALL)
const struct efx_nic_type efx_hunt_a0_vf_nic_type = { .is_vf = true, diff --git a/drivers/net/ethernet/sfc/efx.c b/drivers/net/ethernet/sfc/efx.c index 3a86f1213a051..6a1bff54bc6c3 100644 --- a/drivers/net/ethernet/sfc/efx.c +++ b/drivers/net/ethernet/sfc/efx.c @@ -1001,21 +1001,18 @@ static int efx_pci_probe_post_io(struct efx_nic *efx) }
/* Determine netdevice features */ - net_dev->features |= (efx->type->offload_features | NETIF_F_SG | - NETIF_F_TSO | NETIF_F_RXCSUM | NETIF_F_RXALL); - if (efx->type->offload_features & (NETIF_F_IPV6_CSUM | NETIF_F_HW_CSUM)) { - net_dev->features |= NETIF_F_TSO6; - if (efx_has_cap(efx, TX_TSO_V2_ENCAP)) - net_dev->hw_enc_features |= NETIF_F_TSO6; - } - /* Check whether device supports TSO */ - if (!efx->type->tso_versions || !efx->type->tso_versions(efx)) - net_dev->features &= ~NETIF_F_ALL_TSO; + net_dev->features |= efx->type->offload_features; + + /* Add TSO features */ + if (efx->type->tso_versions && efx->type->tso_versions(efx)) + net_dev->features |= NETIF_F_TSO | NETIF_F_TSO6; + /* Mask for features that also apply to VLAN devices */ net_dev->vlan_features |= (NETIF_F_HW_CSUM | NETIF_F_SG | NETIF_F_HIGHDMA | NETIF_F_ALL_TSO | NETIF_F_RXCSUM);
+ /* Determine user configurable features */ net_dev->hw_features |= net_dev->features & ~efx->fixed_features;
/* Disable receiving frames with bad FCS, by default. */
From: Tomas Henzl thenzl@redhat.com
[ Upstream commit 2309df27111a51734cb9240b4d3c25f2f3c6ab06 ]
When a physical disk is attached directly "without JBOD MAP support" (see megasas_get_tm_devhandle()) then there is no real error handling in the driver. Return FAILED instead of SUCCESS.
Fixes: 18365b138508 ("megaraid_sas: Task management support") Signed-off-by: Tomas Henzl thenzl@redhat.com Link: https://lore.kernel.org/r/20230324150134.14696-1-thenzl@redhat.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/megaraid/megaraid_sas_fusion.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/scsi/megaraid/megaraid_sas_fusion.c b/drivers/scsi/megaraid/megaraid_sas_fusion.c index fe70f8f114352..5f746b4a6b8da 100644 --- a/drivers/scsi/megaraid/megaraid_sas_fusion.c +++ b/drivers/scsi/megaraid/megaraid_sas_fusion.c @@ -4768,7 +4768,7 @@ int megasas_task_abort_fusion(struct scsi_cmnd *scmd) devhandle = megasas_get_tm_devhandle(scmd->device);
if (devhandle == (u16)ULONG_MAX) { - ret = SUCCESS; + ret = FAILED; sdev_printk(KERN_INFO, scmd->device, "task abort issued for invalid devhandle\n"); mutex_unlock(&instance->reset_mutex); @@ -4838,7 +4838,7 @@ int megasas_reset_target_fusion(struct scsi_cmnd *scmd) devhandle = megasas_get_tm_devhandle(scmd->device);
if (devhandle == (u16)ULONG_MAX) { - ret = SUCCESS; + ret = FAILED; sdev_printk(KERN_INFO, scmd->device, "target reset issued for invalid devhandle\n"); mutex_unlock(&instance->reset_mutex);
From: Jerry Snitselaar jsnitsel@redhat.com
[ Upstream commit d684a7a26f7d2c7122a4581ac966ed64e88fb29c ]
_base_allocate_sense_dma_pool() already prints out the sense pool information, so don't print it a second time after calling it in _base_allocate_memory_pools(). In addition the version in _base_allocate_memory_pools() was using the wrong size value, sz, which was last assigned when doing some nvme calculations instead of sense_sz to determine the pool size in kilobytes.
Cc: Sathya Prakash sathya.prakash@broadcom.com Cc: Sreekanth Reddy sreekanth.reddy@broadcom.com Cc: Suganath Prabu Subramani suganath-prabu.subramani@broadcom.com Cc: MPT-FusionLinux.pdl@broadcom.com Cc: "Martin K. Petersen" martin.petersen@oracle.com Cc: "James E.J. Bottomley" jejb@linux.ibm.com Fixes: 970ac2bb70e7 ("scsi: mpt3sas: Force sense buffer allocations to be within same 4 GB region") Signed-off-by: Jerry Snitselaar jsnitsel@redhat.com Link: https://lore.kernel.org/r/20230324193204.567932-1-jsnitsel@redhat.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/mpt3sas/mpt3sas_base.c | 5 ----- 1 file changed, 5 deletions(-)
diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c index 2ee9ea57554d7..14ae0a9c5d3d8 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_base.c +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c @@ -6616,11 +6616,6 @@ _base_allocate_memory_pools(struct MPT3SAS_ADAPTER *ioc) else if (rc == -EAGAIN) goto try_32bit_dma; total_sz += sense_sz; - ioc_info(ioc, - "sense pool(0x%p)- dma(0x%llx): depth(%d)," - "element_size(%d), pool_size(%d kB)\n", - ioc->sense, (unsigned long long)ioc->sense_dma, ioc->scsiio_depth, - SCSI_SENSE_BUFFERSIZE, sz / 1024); /* reply pool, 4 byte align */ sz = ioc->reply_free_queue_depth * ioc->reply_sz; rc = _base_allocate_reply_pool(ioc, sz);
From: Ahmad Fatoum a.fatoum@pengutronix.de
[ Upstream commit b93eb564869321d0dffaf23fcc5c88112ed62466 ]
The probe function sets priv->chip_data to (void *)priv + sizeof(*priv) with the expectation that priv has enough trailing space.
However, only realtek-smi actually allocated this chip_data space. Do likewise in realtek-mdio to fix out-of-bounds accesses.
These accesses likely went unnoticed so far, because of an (unused) buf[4096] member in struct realtek_priv, which caused kmalloc to round up the allocated buffer to a big enough size, so nothing of value was overwritten. With a different allocator (like in the barebox bootloader port of the driver) or with KASAN, the memory corruption becomes quickly apparent.
Fixes: aac94001067d ("net: dsa: realtek: add new mdio interface for drivers") Reviewed-by: Florian Fainelli f.fainelli@gmail.com Reviewed-by: Luiz Angelo Daros de Luca luizluca@gmail.com Reviewed-by: Alvin Šipraga alsi@bang-olufsen.dk Reviewed-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Ahmad Fatoum a.fatoum@pengutronix.de Link: https://lore.kernel.org/r/20230323103735.2331786-1-a.fatoum@pengutronix.de Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/realtek/realtek-mdio.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/dsa/realtek/realtek-mdio.c b/drivers/net/dsa/realtek/realtek-mdio.c index 3e54fac5f9027..5a8fe707ca25e 100644 --- a/drivers/net/dsa/realtek/realtek-mdio.c +++ b/drivers/net/dsa/realtek/realtek-mdio.c @@ -21,6 +21,7 @@
#include <linux/module.h> #include <linux/of_device.h> +#include <linux/overflow.h> #include <linux/regmap.h>
#include "realtek.h" @@ -152,7 +153,9 @@ static int realtek_mdio_probe(struct mdio_device *mdiodev) if (!var) return -EINVAL;
- priv = devm_kzalloc(&mdiodev->dev, sizeof(*priv), GFP_KERNEL); + priv = devm_kzalloc(&mdiodev->dev, + size_add(sizeof(*priv), var->chip_data_sz), + GFP_KERNEL); if (!priv) return -ENOMEM;
From: SongJingyi u201912584@hust.edu.cn
[ Upstream commit f33642224e38d7e0d59336e10e7b4e370b1c4506 ]
Smatch complains that: drivers/ptp/ptp_qoriq.c ptp_qoriq_probe() warn: 'base' from ioremap() not released.
Fix this by revising the parameter from 'ptp_qoriq->base' to 'base'. This is only a bug if ptp_qoriq_init() returns on the first -ENODEV error path. For other error paths ptp_qoriq->base and base are the same. And this change makes the code more readable.
Fixes: 7f4399ba405b ("ptp_qoriq: fix NULL access if ptp dt node missing") Signed-off-by: SongJingyi u201912584@hust.edu.cn Reviewed-by: Dan Carpenter error27@gmail.com Reviewed-by: Dongliang Mu dzm91@hust.edu.cn Link: https://lore.kernel.org/r/20230324031406.1895159-1-u201912584@hust.edu.cn Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ptp/ptp_qoriq.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/ptp/ptp_qoriq.c b/drivers/ptp/ptp_qoriq.c index 08f4cf0ad9e3c..8fa9772acf79b 100644 --- a/drivers/ptp/ptp_qoriq.c +++ b/drivers/ptp/ptp_qoriq.c @@ -601,7 +601,7 @@ static int ptp_qoriq_probe(struct platform_device *dev) return 0;
no_clock: - iounmap(ptp_qoriq->base); + iounmap(base); no_ioremap: release_resource(ptp_qoriq->rsrc); no_resource:
From: Oleksij Rempel o.rempel@pengutronix.de
[ Upstream commit 88e943e83827a349f70c3464b3eba7260be7461d ]
Before this patch, the ksz8_fdb_dump() function had several issues, such as uninitialized variables and incorrect usage of source port as a bit mask. These problems caused inaccurate reporting of vid information and port assignment in the bridge fdb.
Fixes: e587be759e6e ("net: dsa: microchip: update fdb add/del/dump in ksz_common") Signed-off-by: Oleksij Rempel o.rempel@pengutronix.de Acked-by: Arun Ramadoss arun.ramadoss@microchip.com Reviewed-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/microchip/ksz8795.c | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/drivers/net/dsa/microchip/ksz8795.c b/drivers/net/dsa/microchip/ksz8795.c index 003b0ac2854c9..3fffd5da8d3b0 100644 --- a/drivers/net/dsa/microchip/ksz8795.c +++ b/drivers/net/dsa/microchip/ksz8795.c @@ -958,15 +958,14 @@ int ksz8_fdb_dump(struct ksz_device *dev, int port, u16 entries = 0; u8 timestamp = 0; u8 fid; - u8 member; - struct alu_struct alu; + u8 src_port; + u8 mac[ETH_ALEN];
do { - alu.is_static = false; - ret = ksz8_r_dyn_mac_table(dev, i, alu.mac, &fid, &member, + ret = ksz8_r_dyn_mac_table(dev, i, mac, &fid, &src_port, ×tamp, &entries); - if (!ret && (member & BIT(port))) { - ret = cb(alu.mac, alu.fid, alu.is_static, data); + if (!ret && port == src_port) { + ret = cb(mac, fid, false, data); if (ret) break; }
From: Oleksij Rempel o.rempel@pengutronix.de
[ Upstream commit 5d90492dd4ff50ad65c582c76c345d0b90001728 ]
Current ksz8_fdb_dump() is able to extract only max 249 entries on the ksz8863/ksz8873 series of switches. This happened due to wrong bit mask and offset calculation.
This commit corrects the issue and allows for the complete extraction of all 1024 entries.
Fixes: 4b20a07e103f ("net: dsa: microchip: ksz8795: add support for ksz88xx chips") Signed-off-by: Oleksij Rempel o.rempel@pengutronix.de Acked-by: Arun Ramadoss arun.ramadoss@microchip.com Reviewed-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/microchip/ksz_common.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/dsa/microchip/ksz_common.c b/drivers/net/dsa/microchip/ksz_common.c index 19cd05762ab77..725a868b76f7e 100644 --- a/drivers/net/dsa/microchip/ksz_common.c +++ b/drivers/net/dsa/microchip/ksz_common.c @@ -398,10 +398,10 @@ static const u32 ksz8863_masks[] = { [STATIC_MAC_TABLE_FID] = GENMASK(29, 26), [STATIC_MAC_TABLE_OVERRIDE] = BIT(20), [STATIC_MAC_TABLE_FWD_PORTS] = GENMASK(18, 16), - [DYNAMIC_MAC_TABLE_ENTRIES_H] = GENMASK(5, 0), + [DYNAMIC_MAC_TABLE_ENTRIES_H] = GENMASK(1, 0), [DYNAMIC_MAC_TABLE_MAC_EMPTY] = BIT(7), [DYNAMIC_MAC_TABLE_NOT_READY] = BIT(7), - [DYNAMIC_MAC_TABLE_ENTRIES] = GENMASK(31, 28), + [DYNAMIC_MAC_TABLE_ENTRIES] = GENMASK(31, 24), [DYNAMIC_MAC_TABLE_FID] = GENMASK(19, 16), [DYNAMIC_MAC_TABLE_SRC_PORT] = GENMASK(21, 20), [DYNAMIC_MAC_TABLE_TIMESTAMP] = GENMASK(23, 22), @@ -411,7 +411,7 @@ static u8 ksz8863_shifts[] = { [VLAN_TABLE_MEMBERSHIP_S] = 16, [STATIC_MAC_FWD_PORTS] = 16, [STATIC_MAC_FID] = 22, - [DYNAMIC_MAC_ENTRIES_H] = 3, + [DYNAMIC_MAC_ENTRIES_H] = 8, [DYNAMIC_MAC_ENTRIES] = 24, [DYNAMIC_MAC_FID] = 16, [DYNAMIC_MAC_TIMESTAMP] = 24,
From: Oleksij Rempel o.rempel@pengutronix.de
[ Upstream commit b3177aab89be540dc50d2328427b073361093e38 ]
We are using wrong offset, so we will get not a timestamp.
Fixes: 4b20a07e103f ("net: dsa: microchip: ksz8795: add support for ksz88xx chips") Signed-off-by: Oleksij Rempel o.rempel@pengutronix.de Acked-by: Arun Ramadoss arun.ramadoss@microchip.com Reviewed-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/microchip/ksz_common.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/dsa/microchip/ksz_common.c b/drivers/net/dsa/microchip/ksz_common.c index 725a868b76f7e..9da2b3afce26e 100644 --- a/drivers/net/dsa/microchip/ksz_common.c +++ b/drivers/net/dsa/microchip/ksz_common.c @@ -414,7 +414,7 @@ static u8 ksz8863_shifts[] = { [DYNAMIC_MAC_ENTRIES_H] = 8, [DYNAMIC_MAC_ENTRIES] = 24, [DYNAMIC_MAC_FID] = 16, - [DYNAMIC_MAC_TIMESTAMP] = 24, + [DYNAMIC_MAC_TIMESTAMP] = 22, [DYNAMIC_MAC_SRC_PORT] = 20, };
From: Oleksij Rempel o.rempel@pengutronix.de
[ Upstream commit 492606cdc74804d372ab1bdb8f3ef4a6fb6f9f59 ]
If the dynamic MAC table is empty, we will still extract one outdated entry. Fix it by using correct bit offset.
Fixes: 4b20a07e103f ("net: dsa: microchip: ksz8795: add support for ksz88xx chips") Signed-off-by: Oleksij Rempel o.rempel@pengutronix.de Acked-by: Arun Ramadoss arun.ramadoss@microchip.com Reviewed-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/microchip/ksz_common.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/dsa/microchip/ksz_common.c b/drivers/net/dsa/microchip/ksz_common.c index 9da2b3afce26e..f8d04a4a303ef 100644 --- a/drivers/net/dsa/microchip/ksz_common.c +++ b/drivers/net/dsa/microchip/ksz_common.c @@ -399,7 +399,7 @@ static const u32 ksz8863_masks[] = { [STATIC_MAC_TABLE_OVERRIDE] = BIT(20), [STATIC_MAC_TABLE_FWD_PORTS] = GENMASK(18, 16), [DYNAMIC_MAC_TABLE_ENTRIES_H] = GENMASK(1, 0), - [DYNAMIC_MAC_TABLE_MAC_EMPTY] = BIT(7), + [DYNAMIC_MAC_TABLE_MAC_EMPTY] = BIT(2), [DYNAMIC_MAC_TABLE_NOT_READY] = BIT(7), [DYNAMIC_MAC_TABLE_ENTRIES] = GENMASK(31, 24), [DYNAMIC_MAC_TABLE_FID] = GENMASK(19, 16),
From: Oleksij Rempel o.rempel@pengutronix.de
[ Upstream commit 392ff7a84cbca34118ca286dfbfe8aee24605897 ]
Current regmap bulk access is broken, resulting to wrong reads/writes if ksz_read64/ksz_write64 functions are used. Mostly this issue was visible by using ksz8_fdb_dump(), which returned corrupt MAC address.
The reason is that regmap was configured to have max_raw_read/write, even if ksz8863_mdio_read/write functions are able to handle unlimited read/write accesses. On ksz_read64 function we are using multiple 32bit accesses by incrementing each access by 1 instead of 4. Resulting buffer had 01234567.12345678 instead of 01234567.89abcdef.
We have multiple ways to fix it: - enable 4 byte alignment for 32bit accesses. Since the HW do not have this requirement. It will break driver. - disable max_raw_* limit.
This patch is removing max_raw_* limit for regmap accesses in ksz8863_smi.
Fixes: 60a364760002 ("net: dsa: microchip: Add Microchip KSZ8863 SMI based driver support") Signed-off-by: Oleksij Rempel o.rempel@pengutronix.de Reviewed-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/microchip/ksz8863_smi.c | 9 --------- 1 file changed, 9 deletions(-)
diff --git a/drivers/net/dsa/microchip/ksz8863_smi.c b/drivers/net/dsa/microchip/ksz8863_smi.c index 2f4623f3bd852..3698112138b78 100644 --- a/drivers/net/dsa/microchip/ksz8863_smi.c +++ b/drivers/net/dsa/microchip/ksz8863_smi.c @@ -82,22 +82,16 @@ static const struct regmap_bus regmap_smi[] = { { .read = ksz8863_mdio_read, .write = ksz8863_mdio_write, - .max_raw_read = 1, - .max_raw_write = 1, }, { .read = ksz8863_mdio_read, .write = ksz8863_mdio_write, .val_format_endian_default = REGMAP_ENDIAN_BIG, - .max_raw_read = 2, - .max_raw_write = 2, }, { .read = ksz8863_mdio_read, .write = ksz8863_mdio_write, .val_format_endian_default = REGMAP_ENDIAN_BIG, - .max_raw_read = 4, - .max_raw_write = 4, } };
@@ -108,7 +102,6 @@ static const struct regmap_config ksz8863_regmap_config[] = { .pad_bits = 24, .val_bits = 8, .cache_type = REGCACHE_NONE, - .use_single_read = 1, .lock = ksz_regmap_lock, .unlock = ksz_regmap_unlock, }, @@ -118,7 +111,6 @@ static const struct regmap_config ksz8863_regmap_config[] = { .pad_bits = 24, .val_bits = 16, .cache_type = REGCACHE_NONE, - .use_single_read = 1, .lock = ksz_regmap_lock, .unlock = ksz_regmap_unlock, }, @@ -128,7 +120,6 @@ static const struct regmap_config ksz8863_regmap_config[] = { .pad_bits = 24, .val_bits = 32, .cache_type = REGCACHE_NONE, - .use_single_read = 1, .lock = ksz_regmap_lock, .unlock = ksz_regmap_unlock, }
From: Oleksij Rempel o.rempel@pengutronix.de
[ Upstream commit 9aa5757e1f71d85facdc3c98028762cbab8d15c7 ]
FID is directly mapped to VID. However, configuring a MAC address with a VID != 0 resulted in incorrect configuration due to an incorrect bit mask. This kernel commit fixed the issue by correcting the bit mask and ensuring proper configuration of MAC addresses with non-zero VID.
Fixes: 4b20a07e103f ("net: dsa: microchip: ksz8795: add support for ksz88xx chips") Signed-off-by: Oleksij Rempel o.rempel@pengutronix.de Acked-by: Arun Ramadoss arun.ramadoss@microchip.com Reviewed-by: Florian Fainelli f.fainelli@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/microchip/ksz_common.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/dsa/microchip/ksz_common.c b/drivers/net/dsa/microchip/ksz_common.c index f8d04a4a303ef..8601a9e4e4d2f 100644 --- a/drivers/net/dsa/microchip/ksz_common.c +++ b/drivers/net/dsa/microchip/ksz_common.c @@ -395,7 +395,7 @@ static const u32 ksz8863_masks[] = { [VLAN_TABLE_VALID] = BIT(19), [STATIC_MAC_TABLE_VALID] = BIT(19), [STATIC_MAC_TABLE_USE_FID] = BIT(21), - [STATIC_MAC_TABLE_FID] = GENMASK(29, 26), + [STATIC_MAC_TABLE_FID] = GENMASK(25, 22), [STATIC_MAC_TABLE_OVERRIDE] = BIT(20), [STATIC_MAC_TABLE_FWD_PORTS] = GENMASK(18, 16), [DYNAMIC_MAC_TABLE_ENTRIES_H] = GENMASK(1, 0),
From: ChunHao Lin hau@realtek.com
[ Upstream commit 33189f0a94b9639c058781fcf82e4ea3803b1682 ]
When link speed is 10 Mbps and temperature is under -20°C, RTL8168H and RTL8107E may have rx crc error. Disable phy 10 Mbps pll off to fix this issue.
Fixes: 6e1d0b898818 ("r8169:add support for RTL8168H and RTL8107E") Signed-off-by: ChunHao Lin hau@realtek.com Reviewed-by: Heiner Kallweit hkallweit1@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/realtek/r8169_phy_config.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/net/ethernet/realtek/r8169_phy_config.c b/drivers/net/ethernet/realtek/r8169_phy_config.c index 930496cd34ed0..b50f16786c246 100644 --- a/drivers/net/ethernet/realtek/r8169_phy_config.c +++ b/drivers/net/ethernet/realtek/r8169_phy_config.c @@ -826,6 +826,9 @@ static void rtl8168h_2_hw_phy_config(struct rtl8169_private *tp, /* disable phy pfm mode */ phy_modify_paged(phydev, 0x0a44, 0x11, BIT(7), 0);
+ /* disable 10m pll off */ + phy_modify_paged(phydev, 0x0a43, 0x10, BIT(0), 0); + rtl8168g_disable_aldps(phydev); rtl8168g_config_eee_phy(phydev); }
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
[ Upstream commit 02bcba0b9f9da706d5bd1e8cbeb83493863e17b5 ]
devm_clk_get() can return -EPROBE_DEFER. So it is better to return the error code from devm_clk_get(), instead of a hard coded -ENOENT.
This gives more opportunities to successfully probe the driver.
Fixes: 8959e5324485 ("regulator: fixed: add possibility to enable by clock") Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Link: https://lore.kernel.org/r/18459fae3d017a66313699c7c8456b28158b2dd0.167981935... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/regulator/fixed.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/regulator/fixed.c b/drivers/regulator/fixed.c index 2a9867abba20c..e6724a229d237 100644 --- a/drivers/regulator/fixed.c +++ b/drivers/regulator/fixed.c @@ -215,7 +215,7 @@ static int reg_fixed_voltage_probe(struct platform_device *pdev) drvdata->enable_clock = devm_clk_get(dev, NULL); if (IS_ERR(drvdata->enable_clock)) { dev_err(dev, "Can't get enable-clock from devicetree\n"); - return -ENOENT; + return PTR_ERR(drvdata->enable_clock); } } else if (drvtype && drvtype->has_performance_state) { drvdata->desc.ops = &fixed_voltage_domain_ops;
From: Faicker Mo faicker.mo@ucloud.cn
[ Upstream commit e3cbdcb0fbb61045ef3ce0e072927cc41737f787 ]
The failover txq is inited as 16 queues. when a packet is transmitted from the failover device firstly, the failover device will select the queue which is returned from the primary device if the primary device is UP and running. If the primary device txq is bigger than the default 16, it can lead to the following warning: eth0 selects TX queue 18, but real number of TX queues is 16
The warning backtrace is: [ 32.146376] CPU: 18 PID: 9134 Comm: chronyd Tainted: G E 6.2.8-1.el7.centos.x86_64 #1 [ 32.147175] Hardware name: Red Hat KVM, BIOS 1.10.2-3.el7_4.1 04/01/2014 [ 32.147730] Call Trace: [ 32.147971] <TASK> [ 32.148183] dump_stack_lvl+0x48/0x70 [ 32.148514] dump_stack+0x10/0x20 [ 32.148820] netdev_core_pick_tx+0xb1/0xe0 [ 32.149180] __dev_queue_xmit+0x529/0xcf0 [ 32.149533] ? __check_object_size.part.0+0x21c/0x2c0 [ 32.149967] ip_finish_output2+0x278/0x560 [ 32.150327] __ip_finish_output+0x1fe/0x2f0 [ 32.150690] ip_finish_output+0x2a/0xd0 [ 32.151032] ip_output+0x7a/0x110 [ 32.151337] ? __pfx_ip_finish_output+0x10/0x10 [ 32.151733] ip_local_out+0x5e/0x70 [ 32.152054] ip_send_skb+0x19/0x50 [ 32.152366] udp_send_skb.isra.0+0x163/0x3a0 [ 32.152736] udp_sendmsg+0xba8/0xec0 [ 32.153060] ? __folio_memcg_unlock+0x25/0x60 [ 32.153445] ? __pfx_ip_generic_getfrag+0x10/0x10 [ 32.153854] ? sock_has_perm+0x85/0xa0 [ 32.154190] inet_sendmsg+0x6d/0x80 [ 32.154508] ? inet_sendmsg+0x6d/0x80 [ 32.154838] sock_sendmsg+0x62/0x70 [ 32.155152] ____sys_sendmsg+0x134/0x290 [ 32.155499] ___sys_sendmsg+0x81/0xc0 [ 32.155828] ? _get_random_bytes.part.0+0x79/0x1a0 [ 32.156240] ? ip4_datagram_release_cb+0x5f/0x1e0 [ 32.156649] ? get_random_u16+0x69/0xf0 [ 32.156989] ? __fget_light+0xcf/0x110 [ 32.157326] __sys_sendmmsg+0xc4/0x210 [ 32.157657] ? __sys_connect+0xb7/0xe0 [ 32.157995] ? __audit_syscall_entry+0xce/0x140 [ 32.158388] ? syscall_trace_enter.isra.0+0x12c/0x1a0 [ 32.158820] __x64_sys_sendmmsg+0x24/0x30 [ 32.159171] do_syscall_64+0x38/0x90 [ 32.159493] entry_SYSCALL_64_after_hwframe+0x72/0xdc
Fix that by reducing txq number as the non-existent primary-dev does.
Fixes: cfc80d9a1163 ("net: Introduce net_failover driver") Signed-off-by: Faicker Mo faicker.mo@ucloud.cn Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/net_failover.c | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-)
diff --git a/drivers/net/net_failover.c b/drivers/net/net_failover.c index 7a28e082436e4..d0c916a53d7ce 100644 --- a/drivers/net/net_failover.c +++ b/drivers/net/net_failover.c @@ -130,14 +130,10 @@ static u16 net_failover_select_queue(struct net_device *dev, txq = ops->ndo_select_queue(primary_dev, skb, sb_dev); else txq = netdev_pick_tx(primary_dev, skb, NULL); - - qdisc_skb_cb(skb)->slave_dev_queue_mapping = skb->queue_mapping; - - return txq; + } else { + txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) : 0; }
- txq = skb_rx_queue_recorded(skb) ? skb_get_rx_queue(skb) : 0; - /* Save the original txq to restore before passing to the driver */ qdisc_skb_cb(skb)->slave_dev_queue_mapping = skb->queue_mapping;
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit a7602e7332b97cfbec7bacb0f1ade99a575fe104 ]
The blamed commit has introduced the following tests to dwmac4_add_hw_vlan_rx_fltr(), called from stmmac_vlan_rx_add_vid():
if (hw->promisc) { netdev_err(dev, "Adding VLAN in promisc mode not supported\n"); return -EPERM; }
"VLAN promiscuous" mode is keyed in this driver to IFF_PROMISC, and so, vlan_vid_add() and vlan_vid_del() calls cannot take place in IFF_PROMISC mode. I have the following 2 arguments that this restriction is.... hm, how shall I put it nicely... unproductive :)
First, take the case of a Linux bridge. If the kernel is compiled with CONFIG_BRIDGE_VLAN_FILTERING=y, then this bridge shall have a VLAN database. The bridge shall try to call vlan_add_vid() on its bridge ports for each VLAN in the VLAN table. It will do this irrespectively of whether that port is *currently* VLAN-aware or not. So it will do this even when the bridge was created with vlan_filtering 0. But the Linux bridge, in VLAN-unaware mode, configures its ports in promiscuous (IFF_PROMISC) mode, so that they accept packets with any MAC DA (a switch must do this in order to forward those packets which are not directly targeted to its MAC address).
As a result, the stmmac driver does not work as a bridge port, when the kernel is compiled with CONFIG_BRIDGE_VLAN_FILTERING=y.
$ ip link add br0 type bridge && ip link set br0 up $ ip link set eth0 master br0 && ip link set eth0 up [ 2333.943296] br0: port 1(eth0) entered blocking state [ 2333.943381] br0: port 1(eth0) entered disabled state [ 2333.943782] device eth0 entered promiscuous mode [ 2333.944080] 4033c000.ethernet eth0: Adding VLAN in promisc mode not supported [ 2333.976509] 4033c000.ethernet eth0: failed to initialize vlan filtering on this port RTNETLINK answers: Operation not permitted
Secondly, take the case of stmmac as DSA master. Some switch tagging protocols are based on 802.1Q VLANs (tag_sja1105.c), and as such, tag_8021q.c uses vlan_vid_add() to work with VLAN-filtering DSA masters. But also, when a DSA port becomes promiscuous (for example when it joins a bridge), the DSA framework also makes the DSA master promiscuous.
Moreover, for every VLAN that a DSA switch sends to the CPU, DSA also programs a VLAN filter on the DSA master, because if the the DSA switch uses a tail tag, then the hardware frame parser of the DSA master will see VLAN as VLAN, and might filter them out, for being unknown.
Due to the above 2 reasons, my belief is that the stmmac driver does not get to choose to not accept vlan_vid_add() calls while IFF_PROMISC is enabled, because the 2 are completely independent and there are code paths in the network stack which directly lead to this situation occurring, without the user's direct input.
In fact, my belief is that "VLAN promiscuous" mode should have never been keyed on IFF_PROMISC in the first place, but rather, on the NETIF_F_HW_VLAN_CTAG_FILTER feature flag which can be toggled by the user through ethtool -k, when present in netdev->hw_features.
In the stmmac driver, NETIF_F_HW_VLAN_CTAG_FILTER is only present in "features", making this feature "on [fixed]".
I have this belief because I am unaware of any definition of promiscuity which implies having an effect on anything other than MAC DA (therefore not VLAN). However, I seem to be rather alone in having this opinion, looking back at the disagreements from this discussion: https://lore.kernel.org/netdev/20201110153958.ci5ekor3o2ekg3ky@ipetronik.com...
In any case, to remove the vlan_vid_add() dependency on !IFF_PROMISC, one would need to remove the check and see what fails. I guess the test was there because of the way in which dwmac4_vlan_promisc_enable() is implemented.
For context, the dwmac4 supports Perfect Filtering for a limited number of VLANs - dwmac4_get_num_vlan(), priv->hw->num_vlan, with a fallback on Hash Filtering - priv->dma_cap.vlhash - see stmmac_vlan_update(), also visible in cat /sys/kernel/debug/stmmaceth/eth0/dma_cap | grep 'VLAN Hash Filtering'.
The perfect filtering is based on MAC_VLAN_Tag_Filter/MAC_VLAN_Tag_Data registers, accessed in the driver through dwmac4_write_vlan_filter().
The hash filtering is based on the MAC_VLAN_Hash_Table register, named GMAC_VLAN_HASH_TABLE in the driver and accessed by dwmac4_update_vlan_hash(). The control bit for enabling hash filtering is GMAC_VLAN_VTHM (MAC_VLAN_Tag_Ctrl bit VTHM: VLAN Tag Hash Table Match Enable).
Now, the description of dwmac4_vlan_promisc_enable() is that it iterates through the driver's cache of perfect filter entries (hw->vlan_filter[i], added by dwmac4_add_hw_vlan_rx_fltr()), and evicts them from hardware by unsetting their GMAC_VLAN_TAG_DATA_VEN (MAC_VLAN_Tag_Data bit VEN - VLAN Tag Enable) bit. Then it unsets the GMAC_VLAN_VTHM bit, which disables hash matching.
This leaves the MAC, according to table "VLAN Match Status" from the documentation, to always enter these data paths:
VID |VLAN Perfect Filter |VTHM Bit |VLAN Hash Filter |Final VLAN Match |Match Result | |Match Result |Status -------|--------------------|---------|-----------------|---------------- VID!=0 |Fail |0 |don't care |Pass
So, dwmac4_vlan_promisc_enable() does its job, but by unsetting GMAC_VLAN_VTHM, it conflicts with the other code path which controls this bit: dwmac4_update_vlan_hash(), called through stmmac_update_vlan_hash() from stmmac_vlan_rx_add_vid() and from stmmac_vlan_rx_kill_vid(). This is, I guess, why dwmac4_add_hw_vlan_rx_fltr() is not allowed to run after dwmac4_vlan_promisc_enable() has unset GMAC_VLAN_VTHM: because if it did, then dwmac4_update_vlan_hash() would set GMAC_VLAN_VTHM again, breaking the "VLAN promiscuity".
It turns out that dwmac4_vlan_promisc_enable() is way too complicated for what needs to be done. The MAC_Packet_Filter register also has the VTFE bit (VLAN Tag Filter Enable), which simply controls whether VLAN tagged packets which don't match the filtering tables (either perfect or hash) are dropped or not. At the moment, this driver unconditionally sets GMAC_PACKET_FILTER_VTFE if NETIF_F_HW_VLAN_CTAG_FILTER was detected through the priv->dma_cap.vlhash capability bits of the device, in stmmac_dvr_probe().
I would suggest deleting the unnecessarily complex logic from dwmac4_vlan_promisc_enable(), and simply unsetting GMAC_PACKET_FILTER_VTFE when becoming IFF_PROMISC, which has the same effect of allowing packets with any VLAN tags, but has the additional benefit of being able to run concurrently with stmmac_vlan_rx_add_vid() and stmmac_vlan_rx_kill_vid().
As much as I believe that the VTFE bit should have been exclusively controlled by NETIF_F_HW_VLAN_CTAG_FILTER through ethtool, and not by IFF_PROMISC, changing that is not a punctual fix to the problem, and it would probably break the VFFQ feature added by the later commit e0f9956a3862 ("net: stmmac: Add option for VLAN filter fail queue enable"). From the commit description, VFFQ needs IFF_PROMISC=on and VTFE=off in order to work (and this change respects that). But if VTFE was changed to be controlled through ethtool -k, then a user-visible change would have been introduced in Intel's scripts (a need to run "ethtool -k eth0 rx-vlan-filter off" which did not exist before).
The patch was tested with this set of commands:
ip link set eth0 up ip link add link eth0 name eth0.100 type vlan id 100 ip addr add 192.168.100.2/24 dev eth0.100 && ip link set eth0.100 up ip link set eth0 promisc on ip link add link eth0 name eth0.101 type vlan id 101 ip addr add 192.168.101.2/24 dev eth0.101 && ip link set eth0.101 up ip link set eth0 promisc off ping -c 5 192.168.100.1 ping -c 5 192.168.101.1 ip link set eth0 promisc on ping -c 5 192.168.100.1 ping -c 5 192.168.101.1 ip link del eth0.100 ip link del eth0.101 # Wait for VLAN-tagged pings from the other end... # Check with "tcpdump -i eth0 -e -n -p" and we should see them ip link set eth0 promisc off # Wait for VLAN-tagged pings from the other end... # Check with "tcpdump -i eth0 -e -n -p" and we shouldn't see them # anymore, but remove the "-p" argument from tcpdump and they're there.
Fixes: c89f44ff10fd ("net: stmmac: Add support for VLAN promiscuous mode") Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/stmicro/stmmac/common.h | 1 - .../net/ethernet/stmicro/stmmac/dwmac4_core.c | 61 +------------------ 2 files changed, 3 insertions(+), 59 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/common.h b/drivers/net/ethernet/stmicro/stmmac/common.h index ec9c130276d89..54bb072aeb2d3 100644 --- a/drivers/net/ethernet/stmicro/stmmac/common.h +++ b/drivers/net/ethernet/stmicro/stmmac/common.h @@ -532,7 +532,6 @@ struct mac_device_info { unsigned int xlgmac; unsigned int num_vlan; u32 vlan_filter[32]; - unsigned int promisc; bool vlan_fail_q_en; u8 vlan_fail_q; }; diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c index 8c7a0b7c99520..36251ec2589c9 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac4_core.c @@ -472,12 +472,6 @@ static int dwmac4_add_hw_vlan_rx_fltr(struct net_device *dev, if (vid > 4095) return -EINVAL;
- if (hw->promisc) { - netdev_err(dev, - "Adding VLAN in promisc mode not supported\n"); - return -EPERM; - } - /* Single Rx VLAN Filter */ if (hw->num_vlan == 1) { /* For single VLAN filter, VID 0 means VLAN promiscuous */ @@ -527,12 +521,6 @@ static int dwmac4_del_hw_vlan_rx_fltr(struct net_device *dev, { int i, ret = 0;
- if (hw->promisc) { - netdev_err(dev, - "Deleting VLAN in promisc mode not supported\n"); - return -EPERM; - } - /* Single Rx VLAN Filter */ if (hw->num_vlan == 1) { if ((hw->vlan_filter[0] & GMAC_VLAN_TAG_VID) == vid) { @@ -557,39 +545,6 @@ static int dwmac4_del_hw_vlan_rx_fltr(struct net_device *dev, return ret; }
-static void dwmac4_vlan_promisc_enable(struct net_device *dev, - struct mac_device_info *hw) -{ - void __iomem *ioaddr = hw->pcsr; - u32 value; - u32 hash; - u32 val; - int i; - - /* Single Rx VLAN Filter */ - if (hw->num_vlan == 1) { - dwmac4_write_single_vlan(dev, 0); - return; - } - - /* Extended Rx VLAN Filter Enable */ - for (i = 0; i < hw->num_vlan; i++) { - if (hw->vlan_filter[i] & GMAC_VLAN_TAG_DATA_VEN) { - val = hw->vlan_filter[i] & ~GMAC_VLAN_TAG_DATA_VEN; - dwmac4_write_vlan_filter(dev, hw, i, val); - } - } - - hash = readl(ioaddr + GMAC_VLAN_HASH_TABLE); - if (hash & GMAC_VLAN_VLHT) { - value = readl(ioaddr + GMAC_VLAN_TAG); - if (value & GMAC_VLAN_VTHM) { - value &= ~GMAC_VLAN_VTHM; - writel(value, ioaddr + GMAC_VLAN_TAG); - } - } -} - static void dwmac4_restore_hw_vlan_rx_fltr(struct net_device *dev, struct mac_device_info *hw) { @@ -709,22 +664,12 @@ static void dwmac4_set_filter(struct mac_device_info *hw, }
/* VLAN filtering */ - if (dev->features & NETIF_F_HW_VLAN_CTAG_FILTER) + if (dev->flags & IFF_PROMISC && !hw->vlan_fail_q_en) + value &= ~GMAC_PACKET_FILTER_VTFE; + else if (dev->features & NETIF_F_HW_VLAN_CTAG_FILTER) value |= GMAC_PACKET_FILTER_VTFE;
writel(value, ioaddr + GMAC_PACKET_FILTER); - - if (dev->flags & IFF_PROMISC && !hw->vlan_fail_q_en) { - if (!hw->promisc) { - hw->promisc = 1; - dwmac4_vlan_promisc_enable(dev, hw); - } - } else { - if (hw->promisc) { - hw->promisc = 0; - dwmac4_restore_hw_vlan_rx_fltr(dev, hw); - } - } }
static void dwmac4_flow_ctrl(struct mac_device_info *hw, unsigned int duplex,
From: Ashutosh Dixit ashutosh.dixit@intel.com
[ Upstream commit 12d4eb20d9d86fae5f84117ff047e966e470f7b9 ]
Expose intel_rps_read_actual_frequency_fw to read the actual freq without taking forcewake for use by PMU. The code is refactored to use a common set of functions across sysfs and PMU. Using common functions with sysfs in PMU solves the issues of missing support for MTL and missing support for older generations (prior to Gen6). It also future proofs the PMU where sometimes code has been updated for sysfs and PMU has been missed.
v2: Remove runtime_pm_if_in_use from read_actual_frequency_fw (Tvrtko)
v3: (Tvrtko) - Remove goto in __read_cagf - Unexport intel_rps_get_cagf and intel_rps_read_punit_req
Fixes: 22009b6dad66 ("drm/i915/mtl: Modify CAGF functions for MTL") Link: https://gitlab.freedesktop.org/drm/intel/-/issues/8280 Signed-off-by: Ashutosh Dixit ashutosh.dixit@intel.com Reviewed-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Signed-off-by: Tvrtko Ursulin tvrtko.ursulin@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20230316004800.2539753-1-ashut... (cherry picked from commit 44df42e66139b5fac8db49ee354be279210f9816) Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/i915/gt/intel_rps.c | 38 ++++++++++++++++------------- drivers/gpu/drm/i915/gt/intel_rps.h | 4 +-- drivers/gpu/drm/i915/i915_pmu.c | 10 +++----- 3 files changed, 26 insertions(+), 26 deletions(-)
diff --git a/drivers/gpu/drm/i915/gt/intel_rps.c b/drivers/gpu/drm/i915/gt/intel_rps.c index 9ad3bc7201cba..fc73cfe0e39bb 100644 --- a/drivers/gpu/drm/i915/gt/intel_rps.c +++ b/drivers/gpu/drm/i915/gt/intel_rps.c @@ -2074,16 +2074,6 @@ void intel_rps_sanitize(struct intel_rps *rps) rps_disable_interrupts(rps); }
-u32 intel_rps_read_rpstat_fw(struct intel_rps *rps) -{ - struct drm_i915_private *i915 = rps_to_i915(rps); - i915_reg_t rpstat; - - rpstat = (GRAPHICS_VER(i915) >= 12) ? GEN12_RPSTAT1 : GEN6_RPSTAT1; - - return intel_uncore_read_fw(rps_to_gt(rps)->uncore, rpstat); -} - u32 intel_rps_read_rpstat(struct intel_rps *rps) { struct drm_i915_private *i915 = rps_to_i915(rps); @@ -2094,7 +2084,7 @@ u32 intel_rps_read_rpstat(struct intel_rps *rps) return intel_uncore_read(rps_to_gt(rps)->uncore, rpstat); }
-u32 intel_rps_get_cagf(struct intel_rps *rps, u32 rpstat) +static u32 intel_rps_get_cagf(struct intel_rps *rps, u32 rpstat) { struct drm_i915_private *i915 = rps_to_i915(rps); u32 cagf; @@ -2117,10 +2107,11 @@ u32 intel_rps_get_cagf(struct intel_rps *rps, u32 rpstat) return cagf; }
-static u32 read_cagf(struct intel_rps *rps) +static u32 __read_cagf(struct intel_rps *rps, bool take_fw) { struct drm_i915_private *i915 = rps_to_i915(rps); struct intel_uncore *uncore = rps_to_uncore(rps); + i915_reg_t r = INVALID_MMIO_REG; u32 freq;
/* @@ -2128,22 +2119,30 @@ static u32 read_cagf(struct intel_rps *rps) * registers will return 0 freq when GT is in RC6 */ if (GRAPHICS_VER_FULL(i915) >= IP_VER(12, 70)) { - freq = intel_uncore_read(uncore, MTL_MIRROR_TARGET_WP1); + r = MTL_MIRROR_TARGET_WP1; } else if (GRAPHICS_VER(i915) >= 12) { - freq = intel_uncore_read(uncore, GEN12_RPSTAT1); + r = GEN12_RPSTAT1; } else if (IS_VALLEYVIEW(i915) || IS_CHERRYVIEW(i915)) { vlv_punit_get(i915); freq = vlv_punit_read(i915, PUNIT_REG_GPU_FREQ_STS); vlv_punit_put(i915); } else if (GRAPHICS_VER(i915) >= 6) { - freq = intel_uncore_read(uncore, GEN6_RPSTAT1); + r = GEN6_RPSTAT1; } else { - freq = intel_uncore_read(uncore, MEMSTAT_ILK); + r = MEMSTAT_ILK; }
+ if (i915_mmio_reg_valid(r)) + freq = take_fw ? intel_uncore_read(uncore, r) : intel_uncore_read_fw(uncore, r); + return intel_rps_get_cagf(rps, freq); }
+static u32 read_cagf(struct intel_rps *rps) +{ + return __read_cagf(rps, true); +} + u32 intel_rps_read_actual_frequency(struct intel_rps *rps) { struct intel_runtime_pm *rpm = rps_to_uncore(rps)->rpm; @@ -2156,7 +2155,12 @@ u32 intel_rps_read_actual_frequency(struct intel_rps *rps) return freq; }
-u32 intel_rps_read_punit_req(struct intel_rps *rps) +u32 intel_rps_read_actual_frequency_fw(struct intel_rps *rps) +{ + return intel_gpu_freq(rps, __read_cagf(rps, false)); +} + +static u32 intel_rps_read_punit_req(struct intel_rps *rps) { struct intel_uncore *uncore = rps_to_uncore(rps); struct intel_runtime_pm *rpm = rps_to_uncore(rps)->rpm; diff --git a/drivers/gpu/drm/i915/gt/intel_rps.h b/drivers/gpu/drm/i915/gt/intel_rps.h index 9e1cad9ba0e9c..d86ddfee095ed 100644 --- a/drivers/gpu/drm/i915/gt/intel_rps.h +++ b/drivers/gpu/drm/i915/gt/intel_rps.h @@ -34,8 +34,8 @@ void intel_rps_mark_interactive(struct intel_rps *rps, bool interactive);
int intel_gpu_freq(struct intel_rps *rps, int val); int intel_freq_opcode(struct intel_rps *rps, int val); -u32 intel_rps_get_cagf(struct intel_rps *rps, u32 rpstat1); u32 intel_rps_read_actual_frequency(struct intel_rps *rps); +u32 intel_rps_read_actual_frequency_fw(struct intel_rps *rps); u32 intel_rps_get_requested_frequency(struct intel_rps *rps); u32 intel_rps_get_min_frequency(struct intel_rps *rps); u32 intel_rps_get_min_raw_freq(struct intel_rps *rps); @@ -46,10 +46,8 @@ int intel_rps_set_max_frequency(struct intel_rps *rps, u32 val); u32 intel_rps_get_rp0_frequency(struct intel_rps *rps); u32 intel_rps_get_rp1_frequency(struct intel_rps *rps); u32 intel_rps_get_rpn_frequency(struct intel_rps *rps); -u32 intel_rps_read_punit_req(struct intel_rps *rps); u32 intel_rps_read_punit_req_frequency(struct intel_rps *rps); u32 intel_rps_read_rpstat(struct intel_rps *rps); -u32 intel_rps_read_rpstat_fw(struct intel_rps *rps); void gen6_rps_get_freq_caps(struct intel_rps *rps, struct intel_rps_freq_caps *caps); void intel_rps_raise_unslice(struct intel_rps *rps); void intel_rps_lower_unslice(struct intel_rps *rps); diff --git a/drivers/gpu/drm/i915/i915_pmu.c b/drivers/gpu/drm/i915/i915_pmu.c index 52531ab28c5f5..6d422b056f8a8 100644 --- a/drivers/gpu/drm/i915/i915_pmu.c +++ b/drivers/gpu/drm/i915/i915_pmu.c @@ -393,14 +393,12 @@ frequency_sample(struct intel_gt *gt, unsigned int period_ns) * case we assume the system is running at the intended * frequency. Fortunately, the read should rarely fail! */ - val = intel_rps_read_rpstat_fw(rps); - if (val) - val = intel_rps_get_cagf(rps, val); - else - val = rps->cur_freq; + val = intel_rps_read_actual_frequency_fw(rps); + if (!val) + val = intel_gpu_freq(rps, rps->cur_freq);
add_sample_mult(&pmu->sample[__I915_SAMPLE_FREQ_ACT], - intel_gpu_freq(rps, val), period_ns / 1000); + val, period_ns / 1000); }
if (pmu->enable & config_mask(I915_PMU_REQUESTED_FREQUENCY)) {
From: Imre Deak imre.deak@intel.com
[ Upstream commit 38c583019484f190d5b33f59b8ae810e6b1763c6 ]
The commit renaming icl_tc_phy_is_in_safe_mode() to icl_tc_phy_take_ownership() didn't flip the function's return value accordingly, fix this up.
This didn't cause an actual problem besides state check errors, since the function is only used during HW readout.
Cc: José Roberto de Souza jose.souza@intel.com Fixes: f53979d68a77 ("drm/i915/display/tc: Rename safe_mode functions ownership") Reviewed-by: José Roberto de Souza jose.souza@intel.com Reviewed-by: Ville Syrjälä ville.syrjala@linux.intel.com Signed-off-by: Imre Deak imre.deak@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20230316131724.359612-4-imre.d... (cherry picked from commit f2c7959dda614d9b7c6a41510492de39d31705ec) Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/i915/display/intel_tc.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_tc.c b/drivers/gpu/drm/i915/display/intel_tc.c index 70624b4b2d38c..c5d41fd51118f 100644 --- a/drivers/gpu/drm/i915/display/intel_tc.c +++ b/drivers/gpu/drm/i915/display/intel_tc.c @@ -436,9 +436,9 @@ static bool icl_tc_phy_is_owned(struct intel_digital_port *dig_port) PORT_TX_DFLEXDPCSSS(dig_port->tc_phy_fia)); if (val == 0xffffffff) { drm_dbg_kms(&i915->drm, - "Port %s: PHY in TCCOLD, assume safe mode\n", + "Port %s: PHY in TCCOLD, assume not owned\n", dig_port->tc_port_name); - return true; + return false; }
return val & DP_PHY_MODE_STATUS_NOT_SAFE(dig_port->tc_phy_fia_idx);
From: Chris Wilson chris.p.wilson@linux.intel.com
[ Upstream commit 5c95b2d5d44fa250ce8aeee27bdb39b381d03857 ]
If we fail to adjust the GuC run-control on opening the perf stream, make sure we unwind the wakeref just taken.
v2: Retain old goto label names (Ashutosh) v3: Drop bitfield boolean
Fixes: 01e742746785 ("drm/i915/guc: Support OA when Wa_16011777198 is enabled") Signed-off-by: Chris Wilson chris.p.wilson@linux.intel.com Reviewed-by: Ashutosh Dixit ashutosh.dixit@intel.com Signed-off-by: Umesh Nerlige Ramappa umesh.nerlige.ramappa@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20230323225901.3743681-2-umesh... (cherry picked from commit 2810ac6c753d17ee2572ffb57fe2382a786a080a) Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/i915/i915_perf.c | 14 +++++++++----- drivers/gpu/drm/i915/i915_perf_types.h | 6 ++++++ 2 files changed, 15 insertions(+), 5 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_perf.c b/drivers/gpu/drm/i915/i915_perf.c index 125b6ca25a756..7d5e2c53c23a7 100644 --- a/drivers/gpu/drm/i915/i915_perf.c +++ b/drivers/gpu/drm/i915/i915_perf.c @@ -1592,9 +1592,7 @@ static void i915_oa_stream_destroy(struct i915_perf_stream *stream) /* * Wa_16011777198:dg2: Unset the override of GUCRC mode to enable rc6. */ - if (intel_uc_uses_guc_rc(>->uc) && - (IS_DG2_GRAPHICS_STEP(gt->i915, G10, STEP_A0, STEP_C0) || - IS_DG2_GRAPHICS_STEP(gt->i915, G11, STEP_A0, STEP_B0))) + if (stream->override_gucrc) drm_WARN_ON(>->i915->drm, intel_guc_slpc_unset_gucrc_mode(>->uc.guc.slpc));
@@ -3293,8 +3291,10 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, if (ret) { drm_dbg(&stream->perf->i915->drm, "Unable to override gucrc mode\n"); - goto err_config; + goto err_gucrc; } + + stream->override_gucrc = true; }
ret = alloc_oa_buffer(stream); @@ -3333,11 +3333,15 @@ static int i915_oa_stream_init(struct i915_perf_stream *stream, free_oa_buffer(stream);
err_oa_buf_alloc: - free_oa_configs(stream); + if (stream->override_gucrc) + intel_guc_slpc_unset_gucrc_mode(>->uc.guc.slpc);
+err_gucrc: intel_uncore_forcewake_put(stream->uncore, FORCEWAKE_ALL); intel_engine_pm_put(stream->engine);
+ free_oa_configs(stream); + err_config: free_noa_wait(stream);
diff --git a/drivers/gpu/drm/i915/i915_perf_types.h b/drivers/gpu/drm/i915/i915_perf_types.h index ca150b7af3f29..4d5d8c365d9e2 100644 --- a/drivers/gpu/drm/i915/i915_perf_types.h +++ b/drivers/gpu/drm/i915/i915_perf_types.h @@ -316,6 +316,12 @@ struct i915_perf_stream { * buffer should be checked for available data. */ u64 poll_oa_period; + + /** + * @override_gucrc: GuC RC has been overridden for the perf stream, + * and we need to restore the default configuration on release. + */ + bool override_gucrc; };
/**
From: Rajvi Jingar rajvi.jingar@linux.intel.com
[ Upstream commit fb5755100a0a5aa5957bdb204fd1e249684557fc ]
For platforms with Alder Lake PCH (Alder Lake S and Raptor Lake S) the slp_s0_residency attribute has been reporting the wrong value. Unlike other platforms, ADL PCH does not have a counter for the time that the SLP_S0 signal was asserted. Instead, firmware uses the aggregate of the Low Power Mode (LPM) substate counters as the S0ix value. Since the LPM counters run at a different frequency, this lead to misreporting of the S0ix time.
Add a check for Alder Lake PCH and adjust the frequency accordingly when display slp_s0_residency.
Fixes: bbab31101f44 ("platform/x86/intel: pmc/core: Add Alderlake support to pmc core driver") Signed-off-by: Rajvi Jingar rajvi.jingar@linux.intel.com Signed-off-by: David E. Box david.e.box@linux.intel.com Reviewed-by: Rajneesh Bhardwaj irenic.rajneesh@gmail.com Reviewed-by: Andy Shevchenko andy.shevchenko@gmail.com Link: https://lore.kernel.org/r/20230320212029.3154407-1-david.e.box@linux.intel.c... Reviewed-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Hans de Goede hdegoede@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/intel/pmc/core.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/drivers/platform/x86/intel/pmc/core.c b/drivers/platform/x86/intel/pmc/core.c index 3a15d32d7644c..b9591969e0fa1 100644 --- a/drivers/platform/x86/intel/pmc/core.c +++ b/drivers/platform/x86/intel/pmc/core.c @@ -66,7 +66,18 @@ static inline void pmc_core_reg_write(struct pmc_dev *pmcdev, int reg_offset,
static inline u64 pmc_core_adjust_slp_s0_step(struct pmc_dev *pmcdev, u32 value) { - return (u64)value * pmcdev->map->slp_s0_res_counter_step; + /* + * ADL PCH does not have the SLP_S0 counter and LPM Residency counters are + * used as a workaround which uses 30.5 usec tick. All other client + * programs have the legacy SLP_S0 residency counter that is using the 122 + * usec tick. + */ + const int lpm_adj_x2 = pmcdev->map->lpm_res_counter_step_x2; + + if (pmcdev->map == &adl_reg_map) + return (u64)value * GET_X2_COUNTER((u64)lpm_adj_x2); + else + return (u64)value * pmcdev->map->slp_s0_res_counter_step; }
static int set_etr3(struct pmc_dev *pmcdev)
From: Ivan Orlov ivan.orlov0322@gmail.com
[ Upstream commit 2b4c99f7d9a57ecd644eda9b1fb0a1072414959f ]
Syzkaller reported the following issue:
===================================================== BUG: KMSAN: uninit-value in aio_rw_done fs/aio.c:1520 [inline] BUG: KMSAN: uninit-value in aio_write+0x899/0x950 fs/aio.c:1600 aio_rw_done fs/aio.c:1520 [inline] aio_write+0x899/0x950 fs/aio.c:1600 io_submit_one+0x1d1c/0x3bf0 fs/aio.c:2019 __do_sys_io_submit fs/aio.c:2078 [inline] __se_sys_io_submit+0x293/0x770 fs/aio.c:2048 __x64_sys_io_submit+0x92/0xd0 fs/aio.c:2048 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd
Uninit was created at: slab_post_alloc_hook mm/slab.h:766 [inline] slab_alloc_node mm/slub.c:3452 [inline] __kmem_cache_alloc_node+0x71f/0xce0 mm/slub.c:3491 __do_kmalloc_node mm/slab_common.c:967 [inline] __kmalloc+0x11d/0x3b0 mm/slab_common.c:981 kmalloc_array include/linux/slab.h:636 [inline] bcm_tx_setup+0x80e/0x29d0 net/can/bcm.c:930 bcm_sendmsg+0x3a2/0xce0 net/can/bcm.c:1351 sock_sendmsg_nosec net/socket.c:714 [inline] sock_sendmsg net/socket.c:734 [inline] sock_write_iter+0x495/0x5e0 net/socket.c:1108 call_write_iter include/linux/fs.h:2189 [inline] aio_write+0x63a/0x950 fs/aio.c:1600 io_submit_one+0x1d1c/0x3bf0 fs/aio.c:2019 __do_sys_io_submit fs/aio.c:2078 [inline] __se_sys_io_submit+0x293/0x770 fs/aio.c:2048 __x64_sys_io_submit+0x92/0xd0 fs/aio.c:2048 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd
CPU: 1 PID: 5034 Comm: syz-executor350 Not tainted 6.2.0-rc6-syzkaller-80422-geda666ff2276 #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/12/2023 =====================================================
We can follow the call chain and find that 'bcm_tx_setup' function calls 'memcpy_from_msg' to copy some content to the newly allocated frame of 'op->frames'. After that the 'len' field of copied structure being compared with some constant value (64 or 8). However, if 'memcpy_from_msg' returns an error, we will compare some uninitialized memory. This triggers 'uninit-value' issue.
This patch will add 'memcpy_from_msg' possible errors processing to avoid uninit-value issue.
Tested via syzkaller
Reported-by: syzbot+c9bfd85eca611ebf5db1@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?id=47f897f8ad958bbde5790ebf389b5e7e0a34508... Signed-off-by: Ivan Orlov ivan.orlov0322@gmail.com Fixes: 6f3b911d5f29b ("can: bcm: add support for CAN FD frames") Acked-by: Oliver Hartkopp socketcan@hartkopp.net Link: https://lore.kernel.org/all/20230314120445.12407-1-ivan.orlov0322@gmail.com Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/can/bcm.c | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-)
diff --git a/net/can/bcm.c b/net/can/bcm.c index 27706f6ace34a..a962ec2b8ba5b 100644 --- a/net/can/bcm.c +++ b/net/can/bcm.c @@ -941,6 +941,8 @@ static int bcm_tx_setup(struct bcm_msg_head *msg_head, struct msghdr *msg,
cf = op->frames + op->cfsiz * i; err = memcpy_from_msg((u8 *)cf, msg, op->cfsiz); + if (err < 0) + goto free_op;
if (op->flags & CAN_FD_FRAME) { if (cf->len > 64) @@ -950,12 +952,8 @@ static int bcm_tx_setup(struct bcm_msg_head *msg_head, struct msghdr *msg, err = -EINVAL; }
- if (err < 0) { - if (op->frames != &op->sframe) - kfree(op->frames); - kfree(op); - return err; - } + if (err < 0) + goto free_op;
if (msg_head->flags & TX_CP_CAN_ID) { /* copy can_id into frame */ @@ -1026,6 +1024,12 @@ static int bcm_tx_setup(struct bcm_msg_head *msg_head, struct msghdr *msg, bcm_tx_start_timer(op);
return msg_head->nframes * op->cfsiz + MHSIZ; + +free_op: + if (op->frames != &op->sframe) + kfree(op->frames); + kfree(op); + return err; }
/*
From: Tony Krowiak akrowiak@linux.ibm.com
[ Upstream commit 8f8cf767589f2131ae5d40f3758429095c701c84 ]
The device release callback function invoked to release the matrix device uses the dev_get_drvdata(device *dev) function to retrieve the pointer to the vfio_matrix_dev object in order to free its storage. The problem is, this object is not stored as drvdata with the device; since the kfree function will accept a NULL pointer, the memory for the vfio_matrix_dev object is never freed.
Since the device being released is contained within the vfio_matrix_dev object, the container_of macro will be used to retrieve its pointer.
Fixes: 1fde573413b5 ("s390: vfio-ap: base implementation of VFIO AP device driver") Signed-off-by: Tony Krowiak akrowiak@linux.ibm.com Reviewed-by: Harald Freudenberger freude@linux.ibm.com Link: https://lore.kernel.org/r/20230320150447.34557-1-akrowiak@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/s390/crypto/vfio_ap_drv.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/s390/crypto/vfio_ap_drv.c b/drivers/s390/crypto/vfio_ap_drv.c index 997b524bdd2b5..a48c6938ae68f 100644 --- a/drivers/s390/crypto/vfio_ap_drv.c +++ b/drivers/s390/crypto/vfio_ap_drv.c @@ -54,8 +54,9 @@ static struct ap_driver vfio_ap_drv = {
static void vfio_ap_matrix_dev_release(struct device *dev) { - struct ap_matrix_dev *matrix_dev = dev_get_drvdata(dev); + struct ap_matrix_dev *matrix_dev;
+ matrix_dev = container_of(dev, struct ap_matrix_dev, device); kfree(matrix_dev); }
From: Rafael J. Wysocki rafael.j.wysocki@intel.com
[ Upstream commit c56610a869bce03490faf4f157076370c71b8ae3 ]
For ACPI drivers that provide a ->notify() callback and set ACPI_DRIVER_ALL_NOTIFY_EVENTS in their flags, that callback can be invoked while either the ->add() or the ->remove() callback is running without any synchronization at the bus type level which is counter to the common-sense expectation that notification handling should only be enabled when the driver is actually bound to the device. As a result, if the driver is not careful enough, it's ->notify() callback may crash when it is invoked too early or too late [1].
This issue has been amplified by commit d6fb6ee1820c ("ACPI: bus: Drop driver member of struct acpi_device") that made acpi_bus_notify() check for the presence of the driver and its ->notify() callback directly instead of using an extra driver pointer that was only set and cleared by the bus type code, but it was present before that commit although it was harder to reproduce then.
It can be addressed by using the observation that acpi_device_install_notify_handler() can be modified to install the handler for all types of events when ACPI_DRIVER_ALL_NOTIFY_EVENTS is set in the driver flags, in which case acpi_bus_notify() will not need to invoke the driver's ->notify() callback any more and that callback will only be invoked after acpi_device_install_notify_handler() has run and before acpi_device_remove_notify_handler() runs, which implies the correct ordering with respect to the other ACPI driver callbacks.
Modify the code accordingly and while at it, drop two redundant local variables from acpi_bus_notify() and turn its description comment into a proper kerneldoc one.
Fixes: d6fb6ee1820c ("ACPI: bus: Drop driver member of struct acpi_device") Link: https://lore.kernel.org/linux-acpi/9f6cba7a8a57e5a687c934e8e406e28c.squirrel... # [1] Reported-by: Pierre Asselin pa@panix.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Tested-by: Pierre Asselin pa@panix.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/bus.c | 83 +++++++++++++++++++++------------------------- 1 file changed, 37 insertions(+), 46 deletions(-)
diff --git a/drivers/acpi/bus.c b/drivers/acpi/bus.c index 0c05ccde1f7a6..7c16bc15e7a14 100644 --- a/drivers/acpi/bus.c +++ b/drivers/acpi/bus.c @@ -459,85 +459,67 @@ static void acpi_bus_osc_negotiate_usb_control(void) Notification Handling -------------------------------------------------------------------------- */
-/* - * acpi_bus_notify - * --------------- - * Callback for all 'system-level' device notifications (values 0x00-0x7F). +/** + * acpi_bus_notify - Global system-level (0x00-0x7F) notifications handler + * @handle: Target ACPI object. + * @type: Notification type. + * @data: Ignored. + * + * This only handles notifications related to device hotplug. */ static void acpi_bus_notify(acpi_handle handle, u32 type, void *data) { struct acpi_device *adev; - u32 ost_code = ACPI_OST_SC_NON_SPECIFIC_FAILURE; - bool hotplug_event = false;
switch (type) { case ACPI_NOTIFY_BUS_CHECK: acpi_handle_debug(handle, "ACPI_NOTIFY_BUS_CHECK event\n"); - hotplug_event = true; break;
case ACPI_NOTIFY_DEVICE_CHECK: acpi_handle_debug(handle, "ACPI_NOTIFY_DEVICE_CHECK event\n"); - hotplug_event = true; break;
case ACPI_NOTIFY_DEVICE_WAKE: acpi_handle_debug(handle, "ACPI_NOTIFY_DEVICE_WAKE event\n"); - break; + return;
case ACPI_NOTIFY_EJECT_REQUEST: acpi_handle_debug(handle, "ACPI_NOTIFY_EJECT_REQUEST event\n"); - hotplug_event = true; break;
case ACPI_NOTIFY_DEVICE_CHECK_LIGHT: acpi_handle_debug(handle, "ACPI_NOTIFY_DEVICE_CHECK_LIGHT event\n"); /* TBD: Exactly what does 'light' mean? */ - break; + return;
case ACPI_NOTIFY_FREQUENCY_MISMATCH: acpi_handle_err(handle, "Device cannot be configured due " "to a frequency mismatch\n"); - break; + return;
case ACPI_NOTIFY_BUS_MODE_MISMATCH: acpi_handle_err(handle, "Device cannot be configured due " "to a bus mode mismatch\n"); - break; + return;
case ACPI_NOTIFY_POWER_FAULT: acpi_handle_err(handle, "Device has suffered a power fault\n"); - break; + return;
default: acpi_handle_debug(handle, "Unknown event type 0x%x\n", type); - break; + return; }
adev = acpi_get_acpi_dev(handle); - if (!adev) - goto err; - - if (adev->dev.driver) { - struct acpi_driver *driver = to_acpi_driver(adev->dev.driver); - - if (driver && driver->ops.notify && - (driver->flags & ACPI_DRIVER_ALL_NOTIFY_EVENTS)) - driver->ops.notify(adev, type); - } - - if (!hotplug_event) { - acpi_put_acpi_dev(adev); - return; - }
- if (ACPI_SUCCESS(acpi_hotplug_schedule(adev, type))) + if (adev && ACPI_SUCCESS(acpi_hotplug_schedule(adev, type))) return;
acpi_put_acpi_dev(adev);
- err: - acpi_evaluate_ost(handle, type, ost_code, NULL); + acpi_evaluate_ost(handle, type, ACPI_OST_SC_NON_SPECIFIC_FAILURE, NULL); }
static void acpi_notify_device(acpi_handle handle, u32 event, void *data) @@ -562,42 +544,51 @@ static u32 acpi_device_fixed_event(void *data) return ACPI_INTERRUPT_HANDLED; }
-static int acpi_device_install_notify_handler(struct acpi_device *device) +static int acpi_device_install_notify_handler(struct acpi_device *device, + struct acpi_driver *acpi_drv) { acpi_status status;
- if (device->device_type == ACPI_BUS_TYPE_POWER_BUTTON) + if (device->device_type == ACPI_BUS_TYPE_POWER_BUTTON) { status = acpi_install_fixed_event_handler(ACPI_EVENT_POWER_BUTTON, acpi_device_fixed_event, device); - else if (device->device_type == ACPI_BUS_TYPE_SLEEP_BUTTON) + } else if (device->device_type == ACPI_BUS_TYPE_SLEEP_BUTTON) { status = acpi_install_fixed_event_handler(ACPI_EVENT_SLEEP_BUTTON, acpi_device_fixed_event, device); - else - status = acpi_install_notify_handler(device->handle, - ACPI_DEVICE_NOTIFY, + } else { + u32 type = acpi_drv->flags & ACPI_DRIVER_ALL_NOTIFY_EVENTS ? + ACPI_ALL_NOTIFY : ACPI_DEVICE_NOTIFY; + + status = acpi_install_notify_handler(device->handle, type, acpi_notify_device, device); + }
if (ACPI_FAILURE(status)) return -EINVAL; return 0; }
-static void acpi_device_remove_notify_handler(struct acpi_device *device) +static void acpi_device_remove_notify_handler(struct acpi_device *device, + struct acpi_driver *acpi_drv) { - if (device->device_type == ACPI_BUS_TYPE_POWER_BUTTON) + if (device->device_type == ACPI_BUS_TYPE_POWER_BUTTON) { acpi_remove_fixed_event_handler(ACPI_EVENT_POWER_BUTTON, acpi_device_fixed_event); - else if (device->device_type == ACPI_BUS_TYPE_SLEEP_BUTTON) + } else if (device->device_type == ACPI_BUS_TYPE_SLEEP_BUTTON) { acpi_remove_fixed_event_handler(ACPI_EVENT_SLEEP_BUTTON, acpi_device_fixed_event); - else - acpi_remove_notify_handler(device->handle, ACPI_DEVICE_NOTIFY, + } else { + u32 type = acpi_drv->flags & ACPI_DRIVER_ALL_NOTIFY_EVENTS ? + ACPI_ALL_NOTIFY : ACPI_DEVICE_NOTIFY; + + acpi_remove_notify_handler(device->handle, type, acpi_notify_device); + } }
/* Handle events targeting _SB device (at present only graceful shutdown) */ @@ -1039,7 +1030,7 @@ static int acpi_device_probe(struct device *dev) acpi_drv->name, acpi_dev->pnp.bus_id);
if (acpi_drv->ops.notify) { - ret = acpi_device_install_notify_handler(acpi_dev); + ret = acpi_device_install_notify_handler(acpi_dev, acpi_drv); if (ret) { if (acpi_drv->ops.remove) acpi_drv->ops.remove(acpi_dev); @@ -1062,7 +1053,7 @@ static void acpi_device_remove(struct device *dev) struct acpi_driver *acpi_drv = to_acpi_driver(dev->driver);
if (acpi_drv->ops.notify) - acpi_device_remove_notify_handler(acpi_dev); + acpi_device_remove_notify_handler(acpi_dev, acpi_drv);
if (acpi_drv->ops.remove) acpi_drv->ops.remove(acpi_dev);
From: Alyssa Ross hi@alyssa.is
[ Upstream commit bb430b69422640891b0b8db762885730579a4145 ]
LOOP_CONFIGURE is, as far as I understand it, supposed to be a way to combine LOOP_SET_FD and LOOP_SET_STATUS64 into a single syscall. When using LOOP_SET_FD+LOOP_SET_STATUS64, a single uevent would be sent for each partition found on the loop device after the second ioctl(), but when using LOOP_CONFIGURE, no such uevent was being sent.
In the old setup, uevents are disabled for LOOP_SET_FD, but not for LOOP_SET_STATUS64. This makes sense, as it prevents uevents being sent for a partially configured device during LOOP_SET_FD - they're only sent at the end of LOOP_SET_STATUS64. But for LOOP_CONFIGURE, uevents were disabled for the entire operation, so that final notification was never issued. To fix this, reduce the critical section to exclude the loop_reread_partitions() call, which causes the uevents to be issued, to after uevents are re-enabled, matching the behaviour of the LOOP_SET_FD+LOOP_SET_STATUS64 combination.
I noticed this because Busybox's losetup program recently changed from using LOOP_SET_FD+LOOP_SET_STATUS64 to LOOP_CONFIGURE, and this broke my setup, for which I want a notification from the kernel any time a new partition becomes available.
Signed-off-by: Alyssa Ross hi@alyssa.is [hch: reduced the critical section] Signed-off-by: Christoph Hellwig hch@lst.de Fixes: 3448914e8cc5 ("loop: Add LOOP_CONFIGURE ioctl") Link: https://lore.kernel.org/r/20230320125430.55367-1-hch@lst.de Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/loop.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/drivers/block/loop.c b/drivers/block/loop.c index eabbc3bdec221..4916fe78ab8fa 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -1010,9 +1010,6 @@ static int loop_configure(struct loop_device *lo, fmode_t mode, /* This is safe, since we have a reference from open(). */ __module_get(THIS_MODULE);
- /* suppress uevents while reconfiguring the device */ - dev_set_uevent_suppress(disk_to_dev(lo->lo_disk), 1); - /* * If we don't hold exclusive handle for the device, upgrade to it * here to avoid changing device under exclusive owner. @@ -1067,6 +1064,9 @@ static int loop_configure(struct loop_device *lo, fmode_t mode, } }
+ /* suppress uevents while reconfiguring the device */ + dev_set_uevent_suppress(disk_to_dev(lo->lo_disk), 1); + disk_force_media_change(lo->lo_disk, DISK_EVENT_MEDIA_CHANGE); set_disk_ro(lo->lo_disk, (lo->lo_flags & LO_FLAGS_READ_ONLY) != 0);
@@ -1109,17 +1109,17 @@ static int loop_configure(struct loop_device *lo, fmode_t mode, if (partscan) clear_bit(GD_SUPPRESS_PART_SCAN, &lo->lo_disk->state);
+ /* enable and uncork uevent now that we are done */ + dev_set_uevent_suppress(disk_to_dev(lo->lo_disk), 0); + loop_global_unlock(lo, is_loop); if (partscan) loop_reread_partitions(lo); + if (!(mode & FMODE_EXCL)) bd_abort_claiming(bdev, loop_configure);
- error = 0; -done: - /* enable and uncork uevent now that we are done */ - dev_set_uevent_suppress(disk_to_dev(lo->lo_disk), 0); - return error; + return 0;
out_unlock: loop_global_unlock(lo, is_loop); @@ -1130,7 +1130,7 @@ static int loop_configure(struct loop_device *lo, fmode_t mode, fput(file); /* This is safe: open() is still holding a reference. */ module_put(THIS_MODULE); - goto done; + return error; }
static void __loop_clr_fd(struct loop_device *lo, bool release)
From: Sven Auhagen sven.auhagen@voleatech.de
[ Upstream commit 9a251cae51d57289908222e6c322ca61fccc25fd ]
Add missing IP Fragmentation Flag.
Fixes: f9358e12a0af ("net: mvpp2: split ingress traffic into multiple flows") Signed-off-by: Sven Auhagen sven.auhagen@voleatech.de Reviewed-by: Marcin Wojtas mw@semihalf.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/marvell/mvpp2/mvpp2_cls.c | 30 +++++++++++-------- 1 file changed, 18 insertions(+), 12 deletions(-)
diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_cls.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_cls.c index 41d935d1aaf6f..40aeaa7bd739f 100644 --- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_cls.c +++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_cls.c @@ -62,35 +62,38 @@ static const struct mvpp2_cls_flow cls_flows[MVPP2_N_PRS_FLOWS] = { MVPP2_DEF_FLOW(MVPP22_FLOW_TCP4, MVPP2_FL_IP4_TCP_FRAG_UNTAG, MVPP22_CLS_HEK_IP4_2T, MVPP2_PRS_RI_VLAN_NONE | MVPP2_PRS_RI_L3_IP4 | - MVPP2_PRS_RI_L4_TCP, + MVPP2_PRS_RI_IP_FRAG_TRUE | MVPP2_PRS_RI_L4_TCP, MVPP2_PRS_IP_MASK | MVPP2_PRS_RI_VLAN_MASK),
MVPP2_DEF_FLOW(MVPP22_FLOW_TCP4, MVPP2_FL_IP4_TCP_FRAG_UNTAG, MVPP22_CLS_HEK_IP4_2T, MVPP2_PRS_RI_VLAN_NONE | MVPP2_PRS_RI_L3_IP4_OPT | - MVPP2_PRS_RI_L4_TCP, + MVPP2_PRS_RI_IP_FRAG_TRUE | MVPP2_PRS_RI_L4_TCP, MVPP2_PRS_IP_MASK | MVPP2_PRS_RI_VLAN_MASK),
MVPP2_DEF_FLOW(MVPP22_FLOW_TCP4, MVPP2_FL_IP4_TCP_FRAG_UNTAG, MVPP22_CLS_HEK_IP4_2T, MVPP2_PRS_RI_VLAN_NONE | MVPP2_PRS_RI_L3_IP4_OTHER | - MVPP2_PRS_RI_L4_TCP, + MVPP2_PRS_RI_IP_FRAG_TRUE | MVPP2_PRS_RI_L4_TCP, MVPP2_PRS_IP_MASK | MVPP2_PRS_RI_VLAN_MASK),
/* TCP over IPv4 flows, fragmented, with vlan tag */ MVPP2_DEF_FLOW(MVPP22_FLOW_TCP4, MVPP2_FL_IP4_TCP_FRAG_TAG, MVPP22_CLS_HEK_IP4_2T | MVPP22_CLS_HEK_TAGGED, - MVPP2_PRS_RI_L3_IP4 | MVPP2_PRS_RI_L4_TCP, + MVPP2_PRS_RI_L3_IP4 | MVPP2_PRS_RI_IP_FRAG_TRUE | + MVPP2_PRS_RI_L4_TCP, MVPP2_PRS_IP_MASK),
MVPP2_DEF_FLOW(MVPP22_FLOW_TCP4, MVPP2_FL_IP4_TCP_FRAG_TAG, MVPP22_CLS_HEK_IP4_2T | MVPP22_CLS_HEK_TAGGED, - MVPP2_PRS_RI_L3_IP4_OPT | MVPP2_PRS_RI_L4_TCP, + MVPP2_PRS_RI_L3_IP4_OPT | MVPP2_PRS_RI_IP_FRAG_TRUE | + MVPP2_PRS_RI_L4_TCP, MVPP2_PRS_IP_MASK),
MVPP2_DEF_FLOW(MVPP22_FLOW_TCP4, MVPP2_FL_IP4_TCP_FRAG_TAG, MVPP22_CLS_HEK_IP4_2T | MVPP22_CLS_HEK_TAGGED, - MVPP2_PRS_RI_L3_IP4_OTHER | MVPP2_PRS_RI_L4_TCP, + MVPP2_PRS_RI_L3_IP4_OTHER | MVPP2_PRS_RI_IP_FRAG_TRUE | + MVPP2_PRS_RI_L4_TCP, MVPP2_PRS_IP_MASK),
/* UDP over IPv4 flows, Not fragmented, no vlan tag */ @@ -132,35 +135,38 @@ static const struct mvpp2_cls_flow cls_flows[MVPP2_N_PRS_FLOWS] = { MVPP2_DEF_FLOW(MVPP22_FLOW_UDP4, MVPP2_FL_IP4_UDP_FRAG_UNTAG, MVPP22_CLS_HEK_IP4_2T, MVPP2_PRS_RI_VLAN_NONE | MVPP2_PRS_RI_L3_IP4 | - MVPP2_PRS_RI_L4_UDP, + MVPP2_PRS_RI_IP_FRAG_TRUE | MVPP2_PRS_RI_L4_UDP, MVPP2_PRS_IP_MASK | MVPP2_PRS_RI_VLAN_MASK),
MVPP2_DEF_FLOW(MVPP22_FLOW_UDP4, MVPP2_FL_IP4_UDP_FRAG_UNTAG, MVPP22_CLS_HEK_IP4_2T, MVPP2_PRS_RI_VLAN_NONE | MVPP2_PRS_RI_L3_IP4_OPT | - MVPP2_PRS_RI_L4_UDP, + MVPP2_PRS_RI_IP_FRAG_TRUE | MVPP2_PRS_RI_L4_UDP, MVPP2_PRS_IP_MASK | MVPP2_PRS_RI_VLAN_MASK),
MVPP2_DEF_FLOW(MVPP22_FLOW_UDP4, MVPP2_FL_IP4_UDP_FRAG_UNTAG, MVPP22_CLS_HEK_IP4_2T, MVPP2_PRS_RI_VLAN_NONE | MVPP2_PRS_RI_L3_IP4_OTHER | - MVPP2_PRS_RI_L4_UDP, + MVPP2_PRS_RI_IP_FRAG_TRUE | MVPP2_PRS_RI_L4_UDP, MVPP2_PRS_IP_MASK | MVPP2_PRS_RI_VLAN_MASK),
/* UDP over IPv4 flows, fragmented, with vlan tag */ MVPP2_DEF_FLOW(MVPP22_FLOW_UDP4, MVPP2_FL_IP4_UDP_FRAG_TAG, MVPP22_CLS_HEK_IP4_2T | MVPP22_CLS_HEK_TAGGED, - MVPP2_PRS_RI_L3_IP4 | MVPP2_PRS_RI_L4_UDP, + MVPP2_PRS_RI_L3_IP4 | MVPP2_PRS_RI_IP_FRAG_TRUE | + MVPP2_PRS_RI_L4_UDP, MVPP2_PRS_IP_MASK),
MVPP2_DEF_FLOW(MVPP22_FLOW_UDP4, MVPP2_FL_IP4_UDP_FRAG_TAG, MVPP22_CLS_HEK_IP4_2T | MVPP22_CLS_HEK_TAGGED, - MVPP2_PRS_RI_L3_IP4_OPT | MVPP2_PRS_RI_L4_UDP, + MVPP2_PRS_RI_L3_IP4_OPT | MVPP2_PRS_RI_IP_FRAG_TRUE | + MVPP2_PRS_RI_L4_UDP, MVPP2_PRS_IP_MASK),
MVPP2_DEF_FLOW(MVPP22_FLOW_UDP4, MVPP2_FL_IP4_UDP_FRAG_TAG, MVPP22_CLS_HEK_IP4_2T | MVPP22_CLS_HEK_TAGGED, - MVPP2_PRS_RI_L3_IP4_OTHER | MVPP2_PRS_RI_L4_UDP, + MVPP2_PRS_RI_L3_IP4_OTHER | MVPP2_PRS_RI_IP_FRAG_TRUE | + MVPP2_PRS_RI_L4_UDP, MVPP2_PRS_IP_MASK),
/* TCP over IPv6 flows, not fragmented, no vlan tag */
From: Sven Auhagen sven.auhagen@voleatech.de
[ Upstream commit a587a84813b90372cb0a7565e201a4075da67919 ]
The mvpp2 parser entry for QinQ has the inner and outer VLAN in the wrong order. Fix the problem by swapping them.
Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit") Signed-off-by: Sven Auhagen sven.auhagen@voleatech.de Reviewed-by: Marcin Wojtas mw@semihalf.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c index 75ba57bd1d46d..ed8be396428b9 100644 --- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c +++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c @@ -1539,8 +1539,8 @@ static int mvpp2_prs_vlan_init(struct platform_device *pdev, struct mvpp2 *priv) if (!priv->prs_double_vlans) return -ENOMEM;
- /* Double VLAN: 0x8100, 0x88A8 */ - err = mvpp2_prs_double_vlan_add(priv, ETH_P_8021Q, ETH_P_8021AD, + /* Double VLAN: 0x88A8, 0x8100 */ + err = mvpp2_prs_double_vlan_add(priv, ETH_P_8021AD, ETH_P_8021Q, MVPP2_PRS_PORT_MASK); if (err) return err;
From: Sven Auhagen sven.auhagen@voleatech.de
[ Upstream commit 031a416c2170866be5132ae42e14453d669b0cb1 ]
In PPPoE add all IPv4 header option length to the parser and adjust the L3 and L4 offset accordingly. Currently the L4 match does not work with PPPoE and all packets are matched as L3 IP4 OPT.
Fixes: 3f518509dedc ("ethernet: Add new driver for Marvell Armada 375 network unit") Signed-off-by: Sven Auhagen sven.auhagen@voleatech.de Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/marvell/mvpp2/mvpp2_prs.c | 82 ++++++++----------- 1 file changed, 34 insertions(+), 48 deletions(-)
diff --git a/drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c b/drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c index ed8be396428b9..9af22f497a40f 100644 --- a/drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c +++ b/drivers/net/ethernet/marvell/mvpp2/mvpp2_prs.c @@ -1607,59 +1607,45 @@ static int mvpp2_prs_vlan_init(struct platform_device *pdev, struct mvpp2 *priv) static int mvpp2_prs_pppoe_init(struct mvpp2 *priv) { struct mvpp2_prs_entry pe; - int tid; - - /* IPv4 over PPPoE with options */ - tid = mvpp2_prs_tcam_first_free(priv, MVPP2_PE_FIRST_FREE_TID, - MVPP2_PE_LAST_FREE_TID); - if (tid < 0) - return tid; - - memset(&pe, 0, sizeof(pe)); - mvpp2_prs_tcam_lu_set(&pe, MVPP2_PRS_LU_PPPOE); - pe.index = tid; - - mvpp2_prs_match_etype(&pe, 0, PPP_IP); - - mvpp2_prs_sram_next_lu_set(&pe, MVPP2_PRS_LU_IP4); - mvpp2_prs_sram_ri_update(&pe, MVPP2_PRS_RI_L3_IP4_OPT, - MVPP2_PRS_RI_L3_PROTO_MASK); - /* goto ipv4 dest-address (skip eth_type + IP-header-size - 4) */ - mvpp2_prs_sram_shift_set(&pe, MVPP2_ETH_TYPE_LEN + - sizeof(struct iphdr) - 4, - MVPP2_PRS_SRAM_OP_SEL_SHIFT_ADD); - /* Set L3 offset */ - mvpp2_prs_sram_offset_set(&pe, MVPP2_PRS_SRAM_UDF_TYPE_L3, - MVPP2_ETH_TYPE_LEN, - MVPP2_PRS_SRAM_OP_SEL_UDF_ADD); - - /* Update shadow table and hw entry */ - mvpp2_prs_shadow_set(priv, pe.index, MVPP2_PRS_LU_PPPOE); - mvpp2_prs_hw_write(priv, &pe); + int tid, ihl;
- /* IPv4 over PPPoE without options */ - tid = mvpp2_prs_tcam_first_free(priv, MVPP2_PE_FIRST_FREE_TID, - MVPP2_PE_LAST_FREE_TID); - if (tid < 0) - return tid; + /* IPv4 over PPPoE with header length >= 5 */ + for (ihl = MVPP2_PRS_IPV4_IHL_MIN; ihl <= MVPP2_PRS_IPV4_IHL_MAX; ihl++) { + tid = mvpp2_prs_tcam_first_free(priv, MVPP2_PE_FIRST_FREE_TID, + MVPP2_PE_LAST_FREE_TID); + if (tid < 0) + return tid;
- pe.index = tid; + memset(&pe, 0, sizeof(pe)); + mvpp2_prs_tcam_lu_set(&pe, MVPP2_PRS_LU_PPPOE); + pe.index = tid;
- mvpp2_prs_tcam_data_byte_set(&pe, MVPP2_ETH_TYPE_LEN, - MVPP2_PRS_IPV4_HEAD | - MVPP2_PRS_IPV4_IHL_MIN, - MVPP2_PRS_IPV4_HEAD_MASK | - MVPP2_PRS_IPV4_IHL_MASK); + mvpp2_prs_match_etype(&pe, 0, PPP_IP); + mvpp2_prs_tcam_data_byte_set(&pe, MVPP2_ETH_TYPE_LEN, + MVPP2_PRS_IPV4_HEAD | ihl, + MVPP2_PRS_IPV4_HEAD_MASK | + MVPP2_PRS_IPV4_IHL_MASK);
- /* Clear ri before updating */ - pe.sram[MVPP2_PRS_SRAM_RI_WORD] = 0x0; - pe.sram[MVPP2_PRS_SRAM_RI_CTRL_WORD] = 0x0; - mvpp2_prs_sram_ri_update(&pe, MVPP2_PRS_RI_L3_IP4, - MVPP2_PRS_RI_L3_PROTO_MASK); + mvpp2_prs_sram_next_lu_set(&pe, MVPP2_PRS_LU_IP4); + mvpp2_prs_sram_ri_update(&pe, MVPP2_PRS_RI_L3_IP4, + MVPP2_PRS_RI_L3_PROTO_MASK); + /* goto ipv4 dst-address (skip eth_type + IP-header-size - 4) */ + mvpp2_prs_sram_shift_set(&pe, MVPP2_ETH_TYPE_LEN + + sizeof(struct iphdr) - 4, + MVPP2_PRS_SRAM_OP_SEL_SHIFT_ADD); + /* Set L3 offset */ + mvpp2_prs_sram_offset_set(&pe, MVPP2_PRS_SRAM_UDF_TYPE_L3, + MVPP2_ETH_TYPE_LEN, + MVPP2_PRS_SRAM_OP_SEL_UDF_ADD); + /* Set L4 offset */ + mvpp2_prs_sram_offset_set(&pe, MVPP2_PRS_SRAM_UDF_TYPE_L4, + MVPP2_ETH_TYPE_LEN + (ihl * 4), + MVPP2_PRS_SRAM_OP_SEL_UDF_ADD);
- /* Update shadow table and hw entry */ - mvpp2_prs_shadow_set(priv, pe.index, MVPP2_PRS_LU_PPPOE); - mvpp2_prs_hw_write(priv, &pe); + /* Update shadow table and hw entry */ + mvpp2_prs_shadow_set(priv, pe.index, MVPP2_PRS_LU_PPPOE); + mvpp2_prs_hw_write(priv, &pe); + }
/* IPv6 over PPPoE */ tid = mvpp2_prs_tcam_first_free(priv, MVPP2_PE_FIRST_FREE_TID,
From: Wolfram Sang wsa+renesas@sang-engineering.com
[ Upstream commit f22c993f31fa9615df46e49cd768b713d39a852f ]
SMSC911x doesn't need mdiobus suspend/resume, that's why it sets 'mac_managed_pm'. However, setting it needs to be moved from init to probe, so mdiobus PM functions will really never be called (e.g. when the interface is not up yet during suspend/resume).
Fixes: 3ce9f2bef755 ("net: smsc911x: Stop and start PHY during suspend and resume") Suggested-by: Heiner Kallweit hkallweit1@gmail.com Signed-off-by: Wolfram Sang wsa+renesas@sang-engineering.com Reviewed-by: Simon Horman simon.horman@corigine.com Link: https://lore.kernel.org/r/20230327083138.6044-1-wsa+renesas@sang-engineering... Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/smsc/smsc911x.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/smsc/smsc911x.c b/drivers/net/ethernet/smsc/smsc911x.c index a2e511912e6a9..a690d139e1770 100644 --- a/drivers/net/ethernet/smsc/smsc911x.c +++ b/drivers/net/ethernet/smsc/smsc911x.c @@ -1037,8 +1037,6 @@ static int smsc911x_mii_probe(struct net_device *dev) return ret; }
- /* Indicate that the MAC is responsible for managing PHY PM */ - phydev->mac_managed_pm = true; phy_attached_info(phydev);
phy_set_max_speed(phydev, SPEED_100); @@ -1066,6 +1064,7 @@ static int smsc911x_mii_init(struct platform_device *pdev, struct net_device *dev) { struct smsc911x_data *pdata = netdev_priv(dev); + struct phy_device *phydev; int err = -ENXIO;
pdata->mii_bus = mdiobus_alloc(); @@ -1108,6 +1107,10 @@ static int smsc911x_mii_init(struct platform_device *pdev, goto err_out_free_bus_2; }
+ phydev = phy_find_first(pdata->mii_bus); + if (phydev) + phydev->mac_managed_pm = true; + return 0;
err_out_free_bus_2:
From: Brett Creeley brett.creeley@intel.com
[ Upstream commit d94dbdc4e0209b5e7d736ab696f8d635b034e3ee ]
The current implementation causes ice_vsi_update() to update all VSI fields based on the cached VSI context. This also assumes that the ICE_AQ_VSI_PROP_Q_OPT_VALID bit is set. This can cause problems if the VSI context is not correctly synced by the driver. Fix this by only updating the fields that correspond to ICE_AQ_VSI_PROP_Q_OPT_VALID. Also, make sure to save the updated result in the cached VSI context on success.
Fixes: 348048e724a0 ("ice: Implement iidc operations") Co-developed-by: Robert Malz robertx.malz@intel.com Signed-off-by: Robert Malz robertx.malz@intel.com Signed-off-by: Brett Creeley brett.creeley@intel.com Signed-off-by: Jesse Brandeburg jesse.brandeburg@intel.com Reviewed-by: Piotr Raczynski piotr.raczynski@intel.com Tested-by: Jakub Andrysiak jakub.andrysiak@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ice/ice_switch.c | 26 +++++++++++++++++---- 1 file changed, 22 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_switch.c b/drivers/net/ethernet/intel/ice/ice_switch.c index 61f844d225123..46b36851af460 100644 --- a/drivers/net/ethernet/intel/ice/ice_switch.c +++ b/drivers/net/ethernet/intel/ice/ice_switch.c @@ -1780,18 +1780,36 @@ ice_update_vsi(struct ice_hw *hw, u16 vsi_handle, struct ice_vsi_ctx *vsi_ctx, int ice_cfg_rdma_fltr(struct ice_hw *hw, u16 vsi_handle, bool enable) { - struct ice_vsi_ctx *ctx; + struct ice_vsi_ctx *ctx, *cached_ctx; + int status; + + cached_ctx = ice_get_vsi_ctx(hw, vsi_handle); + if (!cached_ctx) + return -ENOENT;
- ctx = ice_get_vsi_ctx(hw, vsi_handle); + ctx = kzalloc(sizeof(*ctx), GFP_KERNEL); if (!ctx) - return -EIO; + return -ENOMEM; + + ctx->info.q_opt_rss = cached_ctx->info.q_opt_rss; + ctx->info.q_opt_tc = cached_ctx->info.q_opt_tc; + ctx->info.q_opt_flags = cached_ctx->info.q_opt_flags; + + ctx->info.valid_sections = cpu_to_le16(ICE_AQ_VSI_PROP_Q_OPT_VALID);
if (enable) ctx->info.q_opt_flags |= ICE_AQ_VSI_Q_OPT_PE_FLTR_EN; else ctx->info.q_opt_flags &= ~ICE_AQ_VSI_Q_OPT_PE_FLTR_EN;
- return ice_update_vsi(hw, vsi_handle, ctx, NULL); + status = ice_update_vsi(hw, vsi_handle, ctx, NULL); + if (!status) { + cached_ctx->info.q_opt_flags = ctx->info.q_opt_flags; + cached_ctx->info.valid_sections |= ctx->info.valid_sections; + } + + kfree(ctx); + return status; }
/**
From: Junfeng Guo junfeng.guo@intel.com
[ Upstream commit 29486b6df3e6a63b57d1ed1dce06051267282ff4 ]
Add profile conflict check while adding some FDIR rules to avoid unexpected flow behavior, rules may have conflict including: IPv4 <---> {IPv4_UDP, IPv4_TCP, IPv4_SCTP} IPv6 <---> {IPv6_UDP, IPv6_TCP, IPv6_SCTP}
For example, when we create an FDIR rule for IPv4, this rule will work on packets including IPv4, IPv4_UDP, IPv4_TCP and IPv4_SCTP. But if we then create an FDIR rule for IPv4_UDP and then destroy it, the first FDIR rule for IPv4 cannot work on pkt IPv4_UDP then.
To prevent this unexpected behavior, we add restriction in software when creating FDIR rules by adding necessary profile conflict check.
Fixes: 1f7ea1cd6a37 ("ice: Enable FDIR Configure for AVF") Signed-off-by: Junfeng Guo junfeng.guo@intel.com Tested-by: Rafal Romanowski rafal.romanowski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../ethernet/intel/ice/ice_virtchnl_fdir.c | 73 +++++++++++++++++++ 1 file changed, 73 insertions(+)
diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl_fdir.c b/drivers/net/ethernet/intel/ice/ice_virtchnl_fdir.c index c6a58343d81d8..a2645ff3100e4 100644 --- a/drivers/net/ethernet/intel/ice/ice_virtchnl_fdir.c +++ b/drivers/net/ethernet/intel/ice/ice_virtchnl_fdir.c @@ -541,6 +541,72 @@ static void ice_vc_fdir_rem_prof_all(struct ice_vf *vf) } }
+/** + * ice_vc_fdir_has_prof_conflict + * @vf: pointer to the VF structure + * @conf: FDIR configuration for each filter + * + * Check if @conf has conflicting profile with existing profiles + * + * Return: true on success, and false on error. + */ +static bool +ice_vc_fdir_has_prof_conflict(struct ice_vf *vf, + struct virtchnl_fdir_fltr_conf *conf) +{ + struct ice_fdir_fltr *desc; + + list_for_each_entry(desc, &vf->fdir.fdir_rule_list, fltr_node) { + struct virtchnl_fdir_fltr_conf *existing_conf; + enum ice_fltr_ptype flow_type_a, flow_type_b; + struct ice_fdir_fltr *a, *b; + + existing_conf = to_fltr_conf_from_desc(desc); + a = &existing_conf->input; + b = &conf->input; + flow_type_a = a->flow_type; + flow_type_b = b->flow_type; + + /* No need to compare two rules with different tunnel types or + * with the same protocol type. + */ + if (existing_conf->ttype != conf->ttype || + flow_type_a == flow_type_b) + continue; + + switch (flow_type_a) { + case ICE_FLTR_PTYPE_NONF_IPV4_UDP: + case ICE_FLTR_PTYPE_NONF_IPV4_TCP: + case ICE_FLTR_PTYPE_NONF_IPV4_SCTP: + if (flow_type_b == ICE_FLTR_PTYPE_NONF_IPV4_OTHER) + return true; + break; + case ICE_FLTR_PTYPE_NONF_IPV4_OTHER: + if (flow_type_b == ICE_FLTR_PTYPE_NONF_IPV4_UDP || + flow_type_b == ICE_FLTR_PTYPE_NONF_IPV4_TCP || + flow_type_b == ICE_FLTR_PTYPE_NONF_IPV4_SCTP) + return true; + break; + case ICE_FLTR_PTYPE_NONF_IPV6_UDP: + case ICE_FLTR_PTYPE_NONF_IPV6_TCP: + case ICE_FLTR_PTYPE_NONF_IPV6_SCTP: + if (flow_type_b == ICE_FLTR_PTYPE_NONF_IPV6_OTHER) + return true; + break; + case ICE_FLTR_PTYPE_NONF_IPV6_OTHER: + if (flow_type_b == ICE_FLTR_PTYPE_NONF_IPV6_UDP || + flow_type_b == ICE_FLTR_PTYPE_NONF_IPV6_TCP || + flow_type_b == ICE_FLTR_PTYPE_NONF_IPV6_SCTP) + return true; + break; + default: + break; + } + } + + return false; +} + /** * ice_vc_fdir_write_flow_prof * @vf: pointer to the VF structure @@ -677,6 +743,13 @@ ice_vc_fdir_config_input_set(struct ice_vf *vf, struct virtchnl_fdir_add *fltr, enum ice_fltr_ptype flow; int ret;
+ ret = ice_vc_fdir_has_prof_conflict(vf, conf); + if (ret) { + dev_dbg(dev, "Found flow profile conflict for VF %d\n", + vf->vf_id); + return ret; + } + flow = input->flow_type; ret = ice_vc_fdir_alloc_prof(vf, flow); if (ret) {
From: Jakob Koschel jkl820.git@gmail.com
[ Upstream commit e9a1cc2e4c4ee7c7e60fb26345618c2522a2a10f ]
The code implicitly assumes that the list iterator finds a correct handle. If 'vsi_handle' is not found the 'old_agg_vsi_info' was pointing to an bogus memory location. For safety a separate list iterator variable should be used to make the != NULL check on 'old_agg_vsi_info' correct under any circumstances.
Additionally Linus proposed to avoid any use of the list iterator variable after the loop, in the attempt to move the list iterator variable declaration into the macro to avoid any potential misuse after the loop. Using it in a pointer comparison after the loop is undefined behavior and should be omitted if possible [1].
Fixes: 37c592062b16 ("ice: remove the VSI info from previous agg") Link: https://lore.kernel.org/all/CAHk-=wgRr_D8CB-D9Kg-c=EHreAsk5SqXPwr9Y7k9sA6cWX... [1] Signed-off-by: Jakob Koschel jkl820.git@gmail.com Tested-by: Arpana Arland arpanax.arland@intel.com (A Contingent worker at Intel) Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ice/ice_sched.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_sched.c b/drivers/net/ethernet/intel/ice/ice_sched.c index 6d08b397df2ad..8f0ea411dfba0 100644 --- a/drivers/net/ethernet/intel/ice/ice_sched.c +++ b/drivers/net/ethernet/intel/ice/ice_sched.c @@ -2787,7 +2787,7 @@ static int ice_sched_assoc_vsi_to_agg(struct ice_port_info *pi, u32 agg_id, u16 vsi_handle, unsigned long *tc_bitmap) { - struct ice_sched_agg_vsi_info *agg_vsi_info, *old_agg_vsi_info = NULL; + struct ice_sched_agg_vsi_info *agg_vsi_info, *iter, *old_agg_vsi_info = NULL; struct ice_sched_agg_info *agg_info, *old_agg_info; struct ice_hw *hw = pi->hw; int status = 0; @@ -2805,11 +2805,13 @@ ice_sched_assoc_vsi_to_agg(struct ice_port_info *pi, u32 agg_id, if (old_agg_info && old_agg_info != agg_info) { struct ice_sched_agg_vsi_info *vtmp;
- list_for_each_entry_safe(old_agg_vsi_info, vtmp, + list_for_each_entry_safe(iter, vtmp, &old_agg_info->agg_vsi_list, list_entry) - if (old_agg_vsi_info->vsi_handle == vsi_handle) + if (iter->vsi_handle == vsi_handle) { + old_agg_vsi_info = iter; break; + } }
/* check if entry already exist */
From: Felix Fietkau nbd@nbd.name
[ Upstream commit 07b3af42d8d528374d4f42d688bae86eeb30831a ]
Using the QDMA tx scheduler to throttle tx to line speed works fine for switch ports, but apparently caused a regression on non-switch ports.
Based on a number of tests, it seems that this throttling can be safely dropped without re-introducing the issues on switch ports that the tx scheduling changes resolved.
Link: https://lore.kernel.org/netdev/trinity-92c3826f-c2c8-40af-8339-bc6d0d3ffea4-... Fixes: f63959c7eec3 ("net: ethernet: mtk_eth_soc: implement multi-queue support for per-port queues") Reported-by: Frank Wunderlich frank-w@public-files.de Reported-by: Daniel Golle daniel@makrotopia.org Tested-by: Daniel Golle daniel@makrotopia.org Signed-off-by: Felix Fietkau nbd@nbd.name Link: https://lore.kernel.org/r/20230324140404.95745-1-nbd@nbd.name Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mediatek/mtk_eth_soc.c | 2 -- 1 file changed, 2 deletions(-)
diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c index 332329cb1ee00..9f9df6255baa1 100644 --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c @@ -713,8 +713,6 @@ static void mtk_mac_link_up(struct phylink_config *config, break; }
- mtk_set_queue_speed(mac->hw, mac->id, speed); - /* Configure duplex */ if (duplex == DUPLEX_FULL) mcr |= MAC_MCR_FORCE_DPX;
From: Tasos Sahanidis tasos@tasossah.com
[ Upstream commit f33fc1576757741479452255132d6e3aaf558ffe ]
snd_card_ymfpci_remove() was removed in commit c6e6bb5eab74 ("ALSA: ymfpci: Allocate resources with device-managed APIs"), but the call to snd_card_new() was not replaced with snd_devm_card_new().
Since there was no longer a call to snd_card_free, unloading the module would eventually result in Oops:
[697561.532887] BUG: unable to handle page fault for address: ffffffffc0924480 [697561.532893] #PF: supervisor read access in kernel mode [697561.532896] #PF: error_code(0x0000) - not-present page [697561.532899] PGD ae1e15067 P4D ae1e15067 PUD ae1e17067 PMD 11a8f5067 PTE 0 [697561.532905] Oops: 0000 [#1] PREEMPT SMP NOPTI [697561.532909] CPU: 21 PID: 5080 Comm: wireplumber Tainted: G W OE 6.2.7 #1 [697561.532914] Hardware name: System manufacturer System Product Name/TUF GAMING X570-PLUS, BIOS 4408 10/28/2022 [697561.532916] RIP: 0010:try_module_get.part.0+0x1a/0xe0 [697561.532924] Code: 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f 44 00 00 55 48 89 e5 41 55 41 54 49 89 fc bf 01 00 00 00 e8 56 3c f8 ff <41> 83 3c 24 02 0f 84 96 00 00 00 41 8b 84 24 30 03 00 00 85 c0 0f [697561.532927] RSP: 0018:ffffbe9b858c3bd8 EFLAGS: 00010246 [697561.532930] RAX: ffff9815d14f1900 RBX: ffff9815c14e6000 RCX: 0000000000000000 [697561.532933] RDX: 0000000000000000 RSI: ffffffffc055092c RDI: ffffffffb3778c1a [697561.532935] RBP: ffffbe9b858c3be8 R08: 0000000000000040 R09: ffff981a1a741380 [697561.532937] R10: ffffbe9b858c3c80 R11: 00000009d56533a6 R12: ffffffffc0924480 [697561.532939] R13: ffff9823439d8500 R14: 0000000000000025 R15: ffff9815cd109f80 [697561.532942] FS: 00007f13084f1f80(0000) GS:ffff9824aef40000(0000) knlGS:0000000000000000 [697561.532945] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [697561.532947] CR2: ffffffffc0924480 CR3: 0000000145344000 CR4: 0000000000350ee0 [697561.532949] Call Trace: [697561.532951] <TASK> [697561.532955] try_module_get+0x13/0x30 [697561.532960] snd_ctl_open+0x61/0x1c0 [snd] [697561.532976] snd_open+0xb4/0x1e0 [snd] [697561.532989] chrdev_open+0xc7/0x240 [697561.532995] ? fsnotify_perm.part.0+0x6e/0x160 [697561.533000] ? __pfx_chrdev_open+0x10/0x10 [697561.533005] do_dentry_open+0x169/0x440 [697561.533009] vfs_open+0x2d/0x40 [697561.533012] path_openat+0xa9d/0x10d0 [697561.533017] ? debug_smp_processor_id+0x17/0x20 [697561.533022] ? trigger_load_balance+0x65/0x370 [697561.533026] do_filp_open+0xb2/0x160 [697561.533032] ? _raw_spin_unlock+0x19/0x40 [697561.533036] ? alloc_fd+0xa9/0x190 [697561.533040] do_sys_openat2+0x9f/0x160 [697561.533044] __x64_sys_openat+0x55/0x90 [697561.533048] do_syscall_64+0x3b/0x90 [697561.533052] entry_SYSCALL_64_after_hwframe+0x72/0xdc [697561.533056] RIP: 0033:0x7f1308a40db4 [697561.533059] Code: 24 20 eb 8f 66 90 44 89 54 24 0c e8 46 68 f8 ff 44 8b 54 24 0c 44 89 e2 48 89 ee 41 89 c0 bf 9c ff ff ff b8 01 01 00 00 0f 05 <48> 3d 00 f0 ff ff 77 32 44 89 c7 89 44 24 0c e8 78 68 f8 ff 8b 44 [697561.533062] RSP: 002b:00007ffcce664450 EFLAGS: 00000293 ORIG_RAX: 0000000000000101 [697561.533066] RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007f1308a40db4 [697561.533068] RDX: 0000000000080000 RSI: 00007ffcce664690 RDI: 00000000ffffff9c [697561.533070] RBP: 00007ffcce664690 R08: 0000000000000000 R09: 0000000000000012 [697561.533072] R10: 0000000000000000 R11: 0000000000000293 R12: 0000000000080000 [697561.533074] R13: 00007f13054b069b R14: 0000565209f83200 R15: 0000000000000000 [697561.533078] </TASK>
Fixes: c6e6bb5eab74 ("ALSA: ymfpci: Allocate resources with device-managed APIs") Signed-off-by: Tasos Sahanidis tasos@tasossah.com Link: https://lore.kernel.org/r/20230329032422.170024-1-tasos@tasossah.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/pci/ymfpci/ymfpci.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/pci/ymfpci/ymfpci.c b/sound/pci/ymfpci/ymfpci.c index 1e198e4d57b8d..82d4e0fda91be 100644 --- a/sound/pci/ymfpci/ymfpci.c +++ b/sound/pci/ymfpci/ymfpci.c @@ -170,7 +170,7 @@ static int snd_card_ymfpci_probe(struct pci_dev *pci, return -ENOENT; }
- err = snd_card_new(&pci->dev, index[dev], id[dev], THIS_MODULE, + err = snd_devm_card_new(&pci->dev, index[dev], id[dev], THIS_MODULE, sizeof(*chip), &card); if (err < 0) return err;
From: Tasos Sahanidis tasos@tasossah.com
[ Upstream commit 6be2e7522eb529b41c16d459f33bbdbcddbf5c15 ]
The snd_dma_buffer.bytes field now contains the aligned size, which this snd_BUG_ON() did not account for, resulting in the following:
[ 9.625915] ------------[ cut here ]------------ [ 9.633440] WARNING: CPU: 0 PID: 126 at sound/pci/ymfpci/ymfpci_main.c:2168 snd_ymfpci_create+0x681/0x698 [snd_ymfpci] [ 9.648926] Modules linked in: snd_ymfpci(+) snd_intel_dspcfg kvm(+) snd_intel_sdw_acpi snd_ac97_codec snd_mpu401_uart snd_opl3_lib irqbypass snd_hda_codec gameport snd_rawmidi crct10dif_pclmul crc32_pclmul cfg80211 snd_hda_core polyval_clmulni polyval_generic gf128mul snd_seq_device ghash_clmulni_intel snd_hwdep ac97_bus sha512_ssse3 rfkill snd_pcm aesni_intel tg3 snd_timer crypto_simd snd mxm_wmi libphy cryptd k10temp fam15h_power pcspkr soundcore sp5100_tco wmi acpi_cpufreq mac_hid dm_multipath sg loop fuse dm_mod bpf_preload ip_tables x_tables ext4 crc32c_generic crc16 mbcache jbd2 sr_mod cdrom ata_generic pata_acpi firewire_ohci crc32c_intel firewire_core xhci_pci crc_itu_t pata_via xhci_pci_renesas floppy [ 9.711849] CPU: 0 PID: 126 Comm: kworker/0:2 Not tainted 6.1.21-1-lts #1 08d2e5ece03136efa7c6aeea9a9c40916b1bd8da [ 9.722200] Hardware name: To Be Filled By O.E.M. To Be Filled By O.E.M./990FX Extreme4, BIOS P2.70 06/05/2014 [ 9.732204] Workqueue: events work_for_cpu_fn [ 9.736580] RIP: 0010:snd_ymfpci_create+0x681/0x698 [snd_ymfpci] [ 9.742594] Code: 8c c0 4c 89 e2 48 89 df 48 c7 c6 92 c6 8c c0 e8 15 d0 e9 ff 48 83 c4 08 44 89 e8 5b 5d 41 5c 41 5d 41 5e 41 5f e9 d3 7a 33 e3 <0f> 0b e9 cb fd ff ff 41 bd fb ff ff ff eb db 41 bd f4 ff ff ff eb [ 9.761358] RSP: 0018:ffffab64804e7da0 EFLAGS: 00010287 [ 9.766594] RAX: ffff8fa2df06c400 RBX: ffff8fa3073a8000 RCX: ffff8fa303fbc4a8 [ 9.773734] RDX: ffff8fa2df06d000 RSI: 0000000000000010 RDI: 0000000000000020 [ 9.780876] RBP: ffff8fa300b5d0d0 R08: ffff8fa3073a8e50 R09: 00000000df06bf00 [ 9.788018] R10: ffff8fa2df06bf00 R11: 00000000df068200 R12: ffff8fa3073a8918 [ 9.795159] R13: 0000000000000000 R14: 0000000000000080 R15: ffff8fa2df068200 [ 9.802317] FS: 0000000000000000(0000) GS:ffff8fa9fec00000(0000) knlGS:0000000000000000 [ 9.810414] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 9.816158] CR2: 000055febaf66500 CR3: 0000000101a2e000 CR4: 00000000000406f0 [ 9.823301] Call Trace: [ 9.825747] <TASK> [ 9.827889] snd_card_ymfpci_probe+0x194/0x950 [snd_ymfpci b78a5fe64b5663a6390a909c67808567e3e73615] [ 9.837030] ? finish_task_switch.isra.0+0x90/0x2d0 [ 9.841918] local_pci_probe+0x45/0x80 [ 9.845680] work_for_cpu_fn+0x1a/0x30 [ 9.849431] process_one_work+0x1c7/0x380 [ 9.853464] worker_thread+0x1af/0x390 [ 9.857225] ? rescuer_thread+0x3b0/0x3b0 [ 9.861254] kthread+0xde/0x110 [ 9.864414] ? kthread_complete_and_exit+0x20/0x20 [ 9.869210] ret_from_fork+0x22/0x30 [ 9.872792] </TASK> [ 9.874985] ---[ end trace 0000000000000000 ]---
Fixes: 5c1733e33c88 ("ALSA: memalloc: Align buffer allocations in page size") Signed-off-by: Tasos Sahanidis tasos@tasossah.com Link: https://lore.kernel.org/r/20230329032808.170403-1-tasos@tasossah.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/pci/ymfpci/ymfpci_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/pci/ymfpci/ymfpci_main.c b/sound/pci/ymfpci/ymfpci_main.c index c80114c0ad7bf..b492c32ce0704 100644 --- a/sound/pci/ymfpci/ymfpci_main.c +++ b/sound/pci/ymfpci/ymfpci_main.c @@ -2165,7 +2165,7 @@ static int snd_ymfpci_memalloc(struct snd_ymfpci *chip) chip->work_base = ptr; chip->work_base_addr = ptr_addr; - snd_BUG_ON(ptr + chip->work_size != + snd_BUG_ON(ptr + PAGE_ALIGN(chip->work_size) != chip->work_ptr->area + chip->work_ptr->bytes);
snd_ymfpci_writel(chip, YDSXGR_PLAYCTRLBASE, chip->bank_base_playback_addr);
From: M Chetan Kumar m.chetan.kumar@linux.intel.com
[ Upstream commit 5f70bcbca469a087b54ad2d934185ed69a098576 ]
ModemManger/Apps probing the wwan0xmmrpc0 port for 7560 Modem results in modem crash.
7560 Modem FW uses the MBIM interface for control command communication whereas 7360 uses Intel RPC interface so disable wwan0xmmrpc0 port for 7560.
Fixes: d08b0f8f46e4 ("net: wwan: iosm: add rpc interface for xmm modems") Reported-and-tested-by: Martin mwolf@adiumentum.com Link: https://bugzilla.kernel.org/show_bug.cgi?id=217200 Signed-off-by: M Chetan Kumar m.chetan.kumar@linux.intel.com Signed-off-by: Shane Parslow shaneparslow808@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wwan/iosm/iosm_ipc_imem.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/net/wwan/iosm/iosm_ipc_imem.c b/drivers/net/wwan/iosm/iosm_ipc_imem.c index 1e6a479766429..c066b0040a3fe 100644 --- a/drivers/net/wwan/iosm/iosm_ipc_imem.c +++ b/drivers/net/wwan/iosm/iosm_ipc_imem.c @@ -587,6 +587,13 @@ static void ipc_imem_run_state_worker(struct work_struct *instance) while (ctrl_chl_idx < IPC_MEM_MAX_CHANNELS) { if (!ipc_chnl_cfg_get(&chnl_cfg_port, ctrl_chl_idx)) { ipc_imem->ipc_port[ctrl_chl_idx] = NULL; + + if (ipc_imem->pcie->pci->device == INTEL_CP_DEVICE_7560_ID && + chnl_cfg_port.wwan_port_type == WWAN_PORT_XMMRPC) { + ctrl_chl_idx++; + continue; + } + if (ipc_imem->pcie->pci->device == INTEL_CP_DEVICE_7360_ID && chnl_cfg_port.wwan_port_type == WWAN_PORT_MBIM) { ctrl_chl_idx++;
From: Hans de Goede hdegoede@redhat.com
[ Upstream commit 30fb97ba4a8e082ba0a5432479d6995472edbd7b ]
The nouveau code used to call drm_fb_helper_initial_config() from nouveau_fbcon_init() before calling drm_dev_register(). This would probe all connectors so that drm_connector->status could be used during backlight registration which runs from nouveau_connector_late_register().
After commit 4a16dd9d18a0 ("drm/nouveau/kms: switch to drm fbdev helpers") the fbdev emulation code, which now is a drm-client, can only run after drm_dev_register(). So during backlight registration the connectors are not probed yet and the drm_connector->status == connected check in nv50_backlight_init() would now always fail.
Replace the drm_connector->status == connected check with a drm_helper_probe_detect() == connected check to fix nv_backlight no longer getting registered because of this.
Fixes: 4a16dd9d18a0 ("drm/nouveau/kms: switch to drm fbdev helpers") Link: https://gitlab.freedesktop.org/drm/nouveau/-/issues/202 Link: https://bugzilla.redhat.com/show_bug.cgi?id=2181941 Signed-off-by: Hans de Goede hdegoede@redhat.com Reviewed-by: Lyude Paul lyude@redhat.com Link: https://patchwork.freedesktop.org/patch/msgid/20230326205433.36485-1-hdegoed... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/nouveau/nouveau_backlight.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_backlight.c b/drivers/gpu/drm/nouveau/nouveau_backlight.c index 40409a29f5b69..91b5ecc575380 100644 --- a/drivers/gpu/drm/nouveau/nouveau_backlight.c +++ b/drivers/gpu/drm/nouveau/nouveau_backlight.c @@ -33,6 +33,7 @@ #include <linux/apple-gmux.h> #include <linux/backlight.h> #include <linux/idr.h> +#include <drm/drm_probe_helper.h>
#include "nouveau_drv.h" #include "nouveau_reg.h" @@ -299,8 +300,12 @@ nv50_backlight_init(struct nouveau_backlight *bl, struct nouveau_drm *drm = nouveau_drm(nv_encoder->base.base.dev); struct nvif_object *device = &drm->client.device.object;
+ /* + * Note when this runs the connectors have not been probed yet, + * so nv_conn->base.status is not set yet. + */ if (!nvif_rd32(device, NV50_PDISP_SOR_PWM_CTL(ffs(nv_encoder->dcb->or) - 1)) || - nv_conn->base.status != connector_status_connected) + drm_helper_probe_detect(&nv_conn->base, NULL, false) != connector_status_connected) return -ENODEV;
if (nv_conn->type == DCB_CONNECTOR_eDP) {
From: Alex Elder elder@linaro.org
[ Upstream commit 6c75dc94f2b27fff57b305af9236eea181a00b6c ]
In gsi_trans_pool_init_dma(), the total size of a pool of memory used for DMA transactions is calculated. However the calculation is done incorrectly.
For 4KB pages, this total size is currently always more than one page, and as a result, the calculation produces a positive (though incorrect) total size. The code still works in this case; we just end up with fewer DMA pool entries than we intended.
Bjorn Andersson tested booting a kernel with 16KB pages, and hit a null pointer derereference in sg_alloc_append_table_from_pages(), descending from gsi_trans_pool_init_dma(). The cause of this was that a 16KB total size was going to be allocated, and with 16KB pages the order of that allocation is 0. The total_size calculation yielded 0, which eventually led to the crash.
Correcting the total_size calculation fixes the problem.
Reported-by: Bjorn Andersson quic_bjorande@quicinc.com Tested-by: Bjorn Andersson quic_bjorande@quicinc.com Fixes: 9dd441e4ed57 ("soc: qcom: ipa: GSI transactions") Reviewed-by: Mark Bloch mbloch@nvidia.com Signed-off-by: Alex Elder elder@linaro.org Reviewed-by: Leon Romanovsky leonro@nvidia.com Link: https://lore.kernel.org/r/20230328162751.2861791-1-elder@linaro.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ipa/gsi_trans.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ipa/gsi_trans.c b/drivers/net/ipa/gsi_trans.c index 0f52c068c46d6..ee6fb00b71eb6 100644 --- a/drivers/net/ipa/gsi_trans.c +++ b/drivers/net/ipa/gsi_trans.c @@ -156,7 +156,7 @@ int gsi_trans_pool_init_dma(struct device *dev, struct gsi_trans_pool *pool, * gsi_trans_pool_exit_dma() can assume the total allocated * size is exactly (count * size). */ - total_size = get_order(total_size) << PAGE_SHIFT; + total_size = PAGE_SIZE << get_order(total_size);
virt = dma_alloc_coherent(dev, total_size, &addr, GFP_KERNEL); if (!virt)
From: Jakub Kicinski kuba@kernel.org
[ Upstream commit 8c495270845d6b4854607e946baef3637a8259ed ]
build_skb() no longer accepts slab buffers. Since slab use is fairly uncommon we prefer the drivers to call a separate slab_build_skb() function appropriately.
bnx2x uses the old semantics where size of 0 meant buffer from slab. It sets the fp->rx_frag_size to 0 for MTUs which don't fit in a page. It needs to call slab_build_skb().
This fixes the WARN_ONCE() of incorrect API use seen with bnx2x.
Reported-by: Thomas Voegtle tv@lio96.de Link: https://lore.kernel.org/all/b8f295e4-ba57-8bfb-7d9c-9d62a498a727@lio96.de/ Fixes: ce098da1497c ("skbuff: Introduce slab_build_skb()") Reviewed-by: Leon Romanovsky leonro@nvidia.com Link: https://lore.kernel.org/r/20230329000013.2734957-1-kuba@kernel.org Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c index 16c490692f422..12083b9679b54 100644 --- a/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c +++ b/drivers/net/ethernet/broadcom/bnx2x/bnx2x_cmn.c @@ -672,6 +672,18 @@ static int bnx2x_fill_frag_skb(struct bnx2x *bp, struct bnx2x_fastpath *fp, return 0; }
+static struct sk_buff * +bnx2x_build_skb(const struct bnx2x_fastpath *fp, void *data) +{ + struct sk_buff *skb; + + if (fp->rx_frag_size) + skb = build_skb(data, fp->rx_frag_size); + else + skb = slab_build_skb(data); + return skb; +} + static void bnx2x_frag_free(const struct bnx2x_fastpath *fp, void *data) { if (fp->rx_frag_size) @@ -779,7 +791,7 @@ static void bnx2x_tpa_stop(struct bnx2x *bp, struct bnx2x_fastpath *fp, dma_unmap_single(&bp->pdev->dev, dma_unmap_addr(rx_buf, mapping), fp->rx_buf_size, DMA_FROM_DEVICE); if (likely(new_data)) - skb = build_skb(data, fp->rx_frag_size); + skb = bnx2x_build_skb(fp, data);
if (likely(skb)) { #ifdef BNX2X_STOP_ON_ERROR @@ -1046,7 +1058,7 @@ static int bnx2x_rx_int(struct bnx2x_fastpath *fp, int budget) dma_unmap_addr(rx_buf, mapping), fp->rx_buf_size, DMA_FROM_DEVICE); - skb = build_skb(data, fp->rx_frag_size); + skb = bnx2x_build_skb(fp, data); if (unlikely(!skb)) { bnx2x_frag_free(fp, data); bnx2x_fp_qstats(bp, fp)->
From: Radoslaw Tyl radoslawx.tyl@intel.com
[ Upstream commit c5cff16f461a4a434a9915a7be7ac9ced861a8a4 ]
Fix invalid registers dump from ethtool -d ethX after adapter self test by ethtool -t ethY. It causes invalid data display.
The problem was caused by overwriting i40e_reg_list[].elements which is common for ethtool self test and dump.
Fixes: 22dd9ae8afcc ("i40e: Rework register diagnostic") Signed-off-by: Radoslaw Tyl radoslawx.tyl@intel.com Reviewed-by: Michal Swiatkowski michal.swiatkowski@linux.intel.com Tested-by: Arpana Arland arpanax.arland@intel.com (A Contingent worker at Intel) Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Reviewed-by: Leon Romanovsky leonro@nvidia.com Link: https://lore.kernel.org/r/20230328172659.3906413-1-anthony.l.nguyen@intel.co... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/i40e/i40e_diag.c | 11 ++++++----- drivers/net/ethernet/intel/i40e/i40e_diag.h | 2 +- 2 files changed, 7 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_diag.c b/drivers/net/ethernet/intel/i40e/i40e_diag.c index ef4d3762bf371..ca229b0efeb65 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_diag.c +++ b/drivers/net/ethernet/intel/i40e/i40e_diag.c @@ -44,7 +44,7 @@ static i40e_status i40e_diag_reg_pattern_test(struct i40e_hw *hw, return 0; }
-struct i40e_diag_reg_test_info i40e_reg_list[] = { +const struct i40e_diag_reg_test_info i40e_reg_list[] = { /* offset mask elements stride */ {I40E_QTX_CTL(0), 0x0000FFBF, 1, I40E_QTX_CTL(1) - I40E_QTX_CTL(0)}, @@ -78,27 +78,28 @@ i40e_status i40e_diag_reg_test(struct i40e_hw *hw) { i40e_status ret_code = 0; u32 reg, mask; + u32 elements; u32 i, j;
for (i = 0; i40e_reg_list[i].offset != 0 && !ret_code; i++) {
+ elements = i40e_reg_list[i].elements; /* set actual reg range for dynamically allocated resources */ if (i40e_reg_list[i].offset == I40E_QTX_CTL(0) && hw->func_caps.num_tx_qp != 0) - i40e_reg_list[i].elements = hw->func_caps.num_tx_qp; + elements = hw->func_caps.num_tx_qp; if ((i40e_reg_list[i].offset == I40E_PFINT_ITRN(0, 0) || i40e_reg_list[i].offset == I40E_PFINT_ITRN(1, 0) || i40e_reg_list[i].offset == I40E_PFINT_ITRN(2, 0) || i40e_reg_list[i].offset == I40E_QINT_TQCTL(0) || i40e_reg_list[i].offset == I40E_QINT_RQCTL(0)) && hw->func_caps.num_msix_vectors != 0) - i40e_reg_list[i].elements = - hw->func_caps.num_msix_vectors - 1; + elements = hw->func_caps.num_msix_vectors - 1;
/* test register access */ mask = i40e_reg_list[i].mask; - for (j = 0; j < i40e_reg_list[i].elements && !ret_code; j++) { + for (j = 0; j < elements && !ret_code; j++) { reg = i40e_reg_list[i].offset + (j * i40e_reg_list[i].stride); ret_code = i40e_diag_reg_pattern_test(hw, reg, mask); diff --git a/drivers/net/ethernet/intel/i40e/i40e_diag.h b/drivers/net/ethernet/intel/i40e/i40e_diag.h index c3340f320a18c..1db7c6d572311 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_diag.h +++ b/drivers/net/ethernet/intel/i40e/i40e_diag.h @@ -20,7 +20,7 @@ struct i40e_diag_reg_test_info { u32 stride; /* bytes between each element */ };
-extern struct i40e_diag_reg_test_info i40e_reg_list[]; +extern const struct i40e_diag_reg_test_info i40e_reg_list[];
i40e_status i40e_diag_reg_test(struct i40e_hw *hw); i40e_status i40e_diag_eeprom_test(struct i40e_hw *hw);
From: Kalesh AP kalesh-anakkur.purayil@broadcom.com
[ Upstream commit 83714dc3db0e4a088673601bc8099b079bc1a077 ]
When the selftest command fails, driver is not reporting the failure by updating the "test->flags" when bnxt_close_nic() fails.
Fixes: eb51365846bc ("bnxt_en: Add basic ethtool -t selftest support.") Reviewed-by: Pavan Chebbi pavan.chebbi@broadcom.com Reviewed-by: Somnath Kotur somnath.kotur@broadcom.com Signed-off-by: Kalesh AP kalesh-anakkur.purayil@broadcom.com Signed-off-by: Michael Chan michael.chan@broadcom.com Reviewed-by: Simon Horman simon.horman@corigine.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c index ec573127b7076..7658a06b8d059 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c @@ -3738,6 +3738,7 @@ static void bnxt_self_test(struct net_device *dev, struct ethtool_test *etest, bnxt_ulp_stop(bp); rc = bnxt_close_nic(bp, true, false); if (rc) { + etest->flags |= ETH_TEST_FL_FAILED; bnxt_ulp_start(bp, rc); return; }
From: Kalesh AP kalesh-anakkur.purayil@broadcom.com
[ Upstream commit 62aad36ed31abc80f35db11e187e690448a79f7d ]
Fix 57502 and 57508 NPAR description string entries. The typos caused these devices to not match up with lspci output.
Fixes: 49c98421e6ab ("bnxt_en: Add PCI IDs for 57500 series NPAR devices.") Reviewed-by: Pavan Chebbi pavan.chebbi@broadcom.com Signed-off-by: Kalesh AP kalesh-anakkur.purayil@broadcom.com Signed-off-by: Michael Chan michael.chan@broadcom.com Reviewed-by: Simon Horman simon.horman@corigine.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index b44b2ec5e61a2..015b5848b9583 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -175,12 +175,12 @@ static const struct pci_device_id bnxt_pci_tbl[] = { { PCI_VDEVICE(BROADCOM, 0x1750), .driver_data = BCM57508 }, { PCI_VDEVICE(BROADCOM, 0x1751), .driver_data = BCM57504 }, { PCI_VDEVICE(BROADCOM, 0x1752), .driver_data = BCM57502 }, - { PCI_VDEVICE(BROADCOM, 0x1800), .driver_data = BCM57508_NPAR }, + { PCI_VDEVICE(BROADCOM, 0x1800), .driver_data = BCM57502_NPAR }, { PCI_VDEVICE(BROADCOM, 0x1801), .driver_data = BCM57504_NPAR }, - { PCI_VDEVICE(BROADCOM, 0x1802), .driver_data = BCM57502_NPAR }, - { PCI_VDEVICE(BROADCOM, 0x1803), .driver_data = BCM57508_NPAR }, + { PCI_VDEVICE(BROADCOM, 0x1802), .driver_data = BCM57508_NPAR }, + { PCI_VDEVICE(BROADCOM, 0x1803), .driver_data = BCM57502_NPAR }, { PCI_VDEVICE(BROADCOM, 0x1804), .driver_data = BCM57504_NPAR }, - { PCI_VDEVICE(BROADCOM, 0x1805), .driver_data = BCM57502_NPAR }, + { PCI_VDEVICE(BROADCOM, 0x1805), .driver_data = BCM57508_NPAR }, { PCI_VDEVICE(BROADCOM, 0xd802), .driver_data = BCM58802 }, { PCI_VDEVICE(BROADCOM, 0xd804), .driver_data = BCM58804 }, #ifdef CONFIG_BNXT_SRIOV
From: Michael Chan michael.chan@broadcom.com
[ Upstream commit 581bce7bcb7e7f100908728e7b292e266c76895b ]
bnxt_fw_to_ethtool_speed() is missing the case statement for 200G link speed reported by firmware. As a result, ethtool will report unknown speed when the firmware reports 200G link speed.
Fixes: 532262ba3b84 ("bnxt_en: ethtool: support PAM4 link speeds up to 200G") Signed-off-by: Michael Chan michael.chan@broadcom.com Reviewed-by: Simon Horman simon.horman@corigine.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/broadcom/bnxt/bnxt.h | 1 + drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 2 ++ 2 files changed, 3 insertions(+)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.h b/drivers/net/ethernet/broadcom/bnxt/bnxt.h index 56355e64815e2..3056e5bb7d6fa 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.h +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.h @@ -1225,6 +1225,7 @@ struct bnxt_link_info { #define BNXT_LINK_SPEED_40GB PORT_PHY_QCFG_RESP_LINK_SPEED_40GB #define BNXT_LINK_SPEED_50GB PORT_PHY_QCFG_RESP_LINK_SPEED_50GB #define BNXT_LINK_SPEED_100GB PORT_PHY_QCFG_RESP_LINK_SPEED_100GB +#define BNXT_LINK_SPEED_200GB PORT_PHY_QCFG_RESP_LINK_SPEED_200GB u16 support_speeds; u16 support_pam4_speeds; u16 auto_link_speeds; /* fw adv setting */ diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c index 7658a06b8d059..6bd18eb5137f4 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c @@ -1714,6 +1714,8 @@ u32 bnxt_fw_to_ethtool_speed(u16 fw_link_speed) return SPEED_50000; case BNXT_LINK_SPEED_100GB: return SPEED_100000; + case BNXT_LINK_SPEED_200GB: + return SPEED_200000; default: return SPEED_UNKNOWN; }
From: Steffen Bätz steffen@innosonix.de
[ Upstream commit 7bcad0f0e6fbc1d613e49e0ee35c8e5f2e685bb0 ]
Do not set the MV88E6XXX_PORT_CTL0_IGMP_MLD_SNOOP bit on CPU or DSA ports.
This allows the host CPU port to be a regular IGMP listener by sending out IGMP Membership Reports, which would otherwise not be forwarded by the mv88exxx chip, but directly looped back to the CPU port itself.
Fixes: 54d792f257c6 ("net: dsa: Centralise global and port setup code into mv88e6xxx.") Signed-off-by: Steffen Bätz steffen@innosonix.de Signed-off-by: Fabio Estevam festevam@denx.de Reviewed-by: Andrew Lunn andrew@lunn.ch Reviewed-by: Vladimir Oltean olteanv@gmail.com Reviewed-by: Florian Fainelli f.fainelli@gmail.com Link: https://lore.kernel.org/r/20230329150140.701559-1-festevam@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/mv88e6xxx/chip.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index 89829e0ca8e8f..8211a4d373e81 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -3354,9 +3354,14 @@ static int mv88e6xxx_setup_port(struct mv88e6xxx_chip *chip, int port) * If this is the upstream port for this switch, enable * forwarding of unknown unicasts and multicasts. */ - reg = MV88E6XXX_PORT_CTL0_IGMP_MLD_SNOOP | - MV88E6185_PORT_CTL0_USE_TAG | MV88E6185_PORT_CTL0_USE_IP | + reg = MV88E6185_PORT_CTL0_USE_TAG | MV88E6185_PORT_CTL0_USE_IP | MV88E6XXX_PORT_CTL0_STATE_FORWARDING; + /* Forward any IPv4 IGMP or IPv6 MLD frames received + * by a USER port to the CPU port to allow snooping. + */ + if (dsa_is_user_port(ds, port)) + reg |= MV88E6XXX_PORT_CTL0_IGMP_MLD_SNOOP; + err = mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_CTL0, reg); if (err) return err;
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit 64fdc5f341db01200e33105265d4b8450122a82e ]
If certain conditions are met, DSA can install all necessary MAC addresses on the CPU ports as FDB entries and disable flooding towards the CPU (we call this RX filtering).
There is one corner case where this does not work.
ip link add br0 type bridge vlan_filtering 1 && ip link set br0 up ip link set swp0 master br0 && ip link set swp0 up ip link add link swp0 name swp0.100 type vlan id 100 ip link set swp0.100 up && ip addr add 192.168.100.1/24 dev swp0.100
Traffic through swp0.100 is broken, because the bridge turns on VLAN filtering in the swp0 port (causing RX packets to be classified to the FDB database corresponding to the VID from their 802.1Q header), and although the 8021q module does call dev_uc_add() towards the real device, that API is VLAN-unaware, so it only contains the MAC address, not the VID; and DSA's current implementation of ndo_set_rx_mode() is only for VID 0 (corresponding to FDB entries which are installed in an FDB database which is only hit when the port is VLAN-unaware).
It's interesting to understand why the bridge does not turn on IFF_PROMISC for its swp0 bridge port, and it may appear at first glance that this is a regression caused by the logic in commit 2796d0c648c9 ("bridge: Automatically manage port promiscuous mode."). After all, a bridge port needs to have IFF_PROMISC by its very nature - it needs to receive and forward frames with a MAC DA different from the bridge ports' MAC addresses.
While that may be true, when the bridge is VLAN-aware *and* it has a single port, there is no real reason to enable promiscuity even if that is an automatic port, with flooding and learning (there is nowhere for packets to go except to the BR_FDB_LOCAL entries), and this is how the corner case appears. Adding a second automatic interface to the bridge would make swp0 promisc as well, and would mask the corner case.
Given the dev_uc_add() / ndo_set_rx_mode() API is what it is (it doesn't pass a VLAN ID), the only way to address that problem is to install host FDB entries for the cartesian product of RX filtering MAC addresses and VLAN RX filters.
Fixes: 7569459a52c9 ("net: dsa: manage flooding on the CPU ports") Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Reviewed-by: Simon Horman simon.horman@corigine.com Reviewed-by: Florian Fainelli f.fainelli@gmail.com Link: https://lore.kernel.org/r/20230329151821.745752-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/dsa/slave.c | 121 ++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 116 insertions(+), 5 deletions(-)
diff --git a/net/dsa/slave.c b/net/dsa/slave.c index 6711ddc0a3c7d..df8b16c741a40 100644 --- a/net/dsa/slave.c +++ b/net/dsa/slave.c @@ -57,6 +57,12 @@ struct dsa_standalone_event_work { u16 vid; };
+struct dsa_host_vlan_rx_filtering_ctx { + struct net_device *dev; + const unsigned char *addr; + enum dsa_standalone_event event; +}; + static bool dsa_switch_supports_uc_filtering(struct dsa_switch *ds) { return ds->ops->port_fdb_add && ds->ops->port_fdb_del && @@ -155,18 +161,37 @@ static int dsa_slave_schedule_standalone_work(struct net_device *dev, return 0; }
+static int dsa_slave_host_vlan_rx_filtering(struct net_device *vdev, int vid, + void *arg) +{ + struct dsa_host_vlan_rx_filtering_ctx *ctx = arg; + + return dsa_slave_schedule_standalone_work(ctx->dev, ctx->event, + ctx->addr, vid); +} + static int dsa_slave_sync_uc(struct net_device *dev, const unsigned char *addr) { struct net_device *master = dsa_slave_to_master(dev); struct dsa_port *dp = dsa_slave_to_port(dev); + struct dsa_host_vlan_rx_filtering_ctx ctx = { + .dev = dev, + .addr = addr, + .event = DSA_UC_ADD, + }; + int err;
dev_uc_add(master, addr);
if (!dsa_switch_supports_uc_filtering(dp->ds)) return 0;
- return dsa_slave_schedule_standalone_work(dev, DSA_UC_ADD, addr, 0); + err = dsa_slave_schedule_standalone_work(dev, DSA_UC_ADD, addr, 0); + if (err) + return err; + + return vlan_for_each(dev, dsa_slave_host_vlan_rx_filtering, &ctx); }
static int dsa_slave_unsync_uc(struct net_device *dev, @@ -174,13 +199,23 @@ static int dsa_slave_unsync_uc(struct net_device *dev, { struct net_device *master = dsa_slave_to_master(dev); struct dsa_port *dp = dsa_slave_to_port(dev); + struct dsa_host_vlan_rx_filtering_ctx ctx = { + .dev = dev, + .addr = addr, + .event = DSA_UC_DEL, + }; + int err;
dev_uc_del(master, addr);
if (!dsa_switch_supports_uc_filtering(dp->ds)) return 0;
- return dsa_slave_schedule_standalone_work(dev, DSA_UC_DEL, addr, 0); + err = dsa_slave_schedule_standalone_work(dev, DSA_UC_DEL, addr, 0); + if (err) + return err; + + return vlan_for_each(dev, dsa_slave_host_vlan_rx_filtering, &ctx); }
static int dsa_slave_sync_mc(struct net_device *dev, @@ -188,13 +223,23 @@ static int dsa_slave_sync_mc(struct net_device *dev, { struct net_device *master = dsa_slave_to_master(dev); struct dsa_port *dp = dsa_slave_to_port(dev); + struct dsa_host_vlan_rx_filtering_ctx ctx = { + .dev = dev, + .addr = addr, + .event = DSA_MC_ADD, + }; + int err;
dev_mc_add(master, addr);
if (!dsa_switch_supports_mc_filtering(dp->ds)) return 0;
- return dsa_slave_schedule_standalone_work(dev, DSA_MC_ADD, addr, 0); + err = dsa_slave_schedule_standalone_work(dev, DSA_MC_ADD, addr, 0); + if (err) + return err; + + return vlan_for_each(dev, dsa_slave_host_vlan_rx_filtering, &ctx); }
static int dsa_slave_unsync_mc(struct net_device *dev, @@ -202,13 +247,23 @@ static int dsa_slave_unsync_mc(struct net_device *dev, { struct net_device *master = dsa_slave_to_master(dev); struct dsa_port *dp = dsa_slave_to_port(dev); + struct dsa_host_vlan_rx_filtering_ctx ctx = { + .dev = dev, + .addr = addr, + .event = DSA_MC_DEL, + }; + int err;
dev_mc_del(master, addr);
if (!dsa_switch_supports_mc_filtering(dp->ds)) return 0;
- return dsa_slave_schedule_standalone_work(dev, DSA_MC_DEL, addr, 0); + err = dsa_slave_schedule_standalone_work(dev, DSA_MC_DEL, addr, 0); + if (err) + return err; + + return vlan_for_each(dev, dsa_slave_host_vlan_rx_filtering, &ctx); }
void dsa_slave_sync_ha(struct net_device *dev) @@ -1668,6 +1723,8 @@ static int dsa_slave_vlan_rx_add_vid(struct net_device *dev, __be16 proto, .flags = 0, }; struct netlink_ext_ack extack = {0}; + struct dsa_switch *ds = dp->ds; + struct netdev_hw_addr *ha; int ret;
/* User port... */ @@ -1687,6 +1744,30 @@ static int dsa_slave_vlan_rx_add_vid(struct net_device *dev, __be16 proto, return ret; }
+ if (!dsa_switch_supports_uc_filtering(ds) && + !dsa_switch_supports_mc_filtering(ds)) + return 0; + + netif_addr_lock_bh(dev); + + if (dsa_switch_supports_mc_filtering(ds)) { + netdev_for_each_synced_mc_addr(ha, dev) { + dsa_slave_schedule_standalone_work(dev, DSA_MC_ADD, + ha->addr, vid); + } + } + + if (dsa_switch_supports_uc_filtering(ds)) { + netdev_for_each_synced_uc_addr(ha, dev) { + dsa_slave_schedule_standalone_work(dev, DSA_UC_ADD, + ha->addr, vid); + } + } + + netif_addr_unlock_bh(dev); + + dsa_flush_workqueue(); + return 0; }
@@ -1699,13 +1780,43 @@ static int dsa_slave_vlan_rx_kill_vid(struct net_device *dev, __be16 proto, /* This API only allows programming tagged, non-PVID VIDs */ .flags = 0, }; + struct dsa_switch *ds = dp->ds; + struct netdev_hw_addr *ha; int err;
err = dsa_port_vlan_del(dp, &vlan); if (err) return err;
- return dsa_port_host_vlan_del(dp, &vlan); + err = dsa_port_host_vlan_del(dp, &vlan); + if (err) + return err; + + if (!dsa_switch_supports_uc_filtering(ds) && + !dsa_switch_supports_mc_filtering(ds)) + return 0; + + netif_addr_lock_bh(dev); + + if (dsa_switch_supports_mc_filtering(ds)) { + netdev_for_each_synced_mc_addr(ha, dev) { + dsa_slave_schedule_standalone_work(dev, DSA_MC_DEL, + ha->addr, vid); + } + } + + if (dsa_switch_supports_uc_filtering(ds)) { + netdev_for_each_synced_uc_addr(ha, dev) { + dsa_slave_schedule_standalone_work(dev, DSA_UC_DEL, + ha->addr, vid); + } + } + + netif_addr_unlock_bh(dev); + + dsa_flush_workqueue(); + + return 0; }
static int dsa_slave_restore_vlan(struct net_device *vdev, int vid, void *arg)
From: Felix Fietkau nbd@nbd.name
[ Upstream commit 8c1cb87c2a5c29da416848451a687473f379611c ]
Since we call flow_block_cb_decref on FLOW_BLOCK_UNBIND, we also need to call flow_block_cb_incref for a newly allocated cb. Also fix the accidentally inverted refcount check on unbind.
Fixes: 502e84e2382d ("net: ethernet: mtk_eth_soc: add flow offloading support") Reviewed-by: Simon Horman simon.horman@corigine.com Signed-off-by: Felix Fietkau nbd@nbd.name Reviewed-by: Leon Romanovsky leonro@nvidia.com Link: https://lore.kernel.org/r/20230330120840.52079-1-nbd@nbd.name Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mediatek/mtk_ppe_offload.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mediatek/mtk_ppe_offload.c b/drivers/net/ethernet/mediatek/mtk_ppe_offload.c index 81afd5ee3fbf1..161751bb36c9c 100644 --- a/drivers/net/ethernet/mediatek/mtk_ppe_offload.c +++ b/drivers/net/ethernet/mediatek/mtk_ppe_offload.c @@ -576,6 +576,7 @@ mtk_eth_setup_tc_block(struct net_device *dev, struct flow_block_offload *f) if (IS_ERR(block_cb)) return PTR_ERR(block_cb);
+ flow_block_cb_incref(block_cb); flow_block_cb_add(block_cb, f); list_add_tail(&block_cb->driver_list, &block_cb_list); return 0; @@ -584,7 +585,7 @@ mtk_eth_setup_tc_block(struct net_device *dev, struct flow_block_offload *f) if (!block_cb) return -ENOENT;
- if (flow_block_cb_decref(block_cb)) { + if (!flow_block_cb_decref(block_cb)) { flow_block_cb_remove(block_cb, f); list_del(&block_cb->driver_list); }
From: Felix Fietkau nbd@nbd.name
[ Upstream commit 5f36ca1b841fb17a20249fd9fedafc7dc7fdd940 ]
Check for skb metadata in order to detect the case where the DSA header is not present.
Fixes: 2d7605a72906 ("net: ethernet: mtk_eth_soc: enable hardware DSA untagging") Reviewed-by: Simon Horman simon.horman@corigine.com Signed-off-by: Felix Fietkau nbd@nbd.name Reviewed-by: Leon Romanovsky leonro@nvidia.com Link: https://lore.kernel.org/r/20230330120840.52079-2-nbd@nbd.name Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mediatek/mtk_eth_soc.c | 6 +++--- drivers/net/ethernet/mediatek/mtk_ppe.c | 5 ++++- 2 files changed, 7 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c index 9f9df6255baa1..bd7c18c839d42 100644 --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c @@ -2006,9 +2006,6 @@ static int mtk_poll_rx(struct napi_struct *napi, int budget, skb_checksum_none_assert(skb); skb->protocol = eth_type_trans(skb, netdev);
- if (reason == MTK_PPE_CPU_REASON_HIT_UNBIND_RATE_REACHED) - mtk_ppe_check_skb(eth->ppe[0], skb, hash); - if (netdev->features & NETIF_F_HW_VLAN_CTAG_RX) { if (MTK_HAS_CAPS(eth->soc->caps, MTK_NETSYS_V2)) { if (trxd.rxd3 & RX_DMA_VTAG_V2) { @@ -2036,6 +2033,9 @@ static int mtk_poll_rx(struct napi_struct *napi, int budget, __vlan_hwaccel_put_tag(skb, htons(vlan_proto), vlan_tci); }
+ if (reason == MTK_PPE_CPU_REASON_HIT_UNBIND_RATE_REACHED) + mtk_ppe_check_skb(eth->ppe[0], skb, hash); + skb_record_rx_queue(skb, 0); napi_gro_receive(napi, skb);
diff --git a/drivers/net/ethernet/mediatek/mtk_ppe.c b/drivers/net/ethernet/mediatek/mtk_ppe.c index 1ff024f42444b..52db211d9596a 100644 --- a/drivers/net/ethernet/mediatek/mtk_ppe.c +++ b/drivers/net/ethernet/mediatek/mtk_ppe.c @@ -8,6 +8,7 @@ #include <linux/platform_device.h> #include <linux/if_ether.h> #include <linux/if_vlan.h> +#include <net/dst_metadata.h> #include <net/dsa.h> #include "mtk_eth_soc.h" #include "mtk_ppe.h" @@ -699,7 +700,9 @@ void __mtk_ppe_check_skb(struct mtk_ppe *ppe, struct sk_buff *skb, u16 hash) skb->dev->dsa_ptr->tag_ops->proto != DSA_TAG_PROTO_MTK) goto out;
- tag += 4; + if (!skb_metadata_dst(skb)) + tag += 4; + if (get_unaligned_be16(tag) != ETH_P_8021Q) break;
From: Felix Fietkau nbd@nbd.name
[ Upstream commit 924531326e2dd4ceabe7240f2b55a88e7d894ec2 ]
The cache needs to be flushed to ensure that the hardware stops offloading the flow immediately.
Fixes: 33fc42de3327 ("net: ethernet: mtk_eth_soc: support creating mac address based offload entries") Reviewed-by: Simon Horman simon.horman@corigine.com Signed-off-by: Felix Fietkau nbd@nbd.name Reviewed-by: Leon Romanovsky leonro@nvidia.com Link: https://lore.kernel.org/r/20230330120840.52079-3-nbd@nbd.name Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mediatek/mtk_ppe.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/mediatek/mtk_ppe.c b/drivers/net/ethernet/mediatek/mtk_ppe.c index 52db211d9596a..2ea539ccc0802 100644 --- a/drivers/net/ethernet/mediatek/mtk_ppe.c +++ b/drivers/net/ethernet/mediatek/mtk_ppe.c @@ -459,6 +459,7 @@ __mtk_foe_entry_clear(struct mtk_ppe *ppe, struct mtk_flow_entry *entry) hwe->ib1 &= ~MTK_FOE_IB1_STATE; hwe->ib1 |= FIELD_PREP(MTK_FOE_IB1_STATE, MTK_FOE_STATE_INVALID); dma_wmb(); + mtk_ppe_cache_clear(ppe); } entry->hash = 0xffff;
From: Horatiu Vultur horatiu.vultur@microchip.com
[ Upstream commit 657fd9da2d4b4aa0a384105b236baa22fa0233bf ]
In case the driver was trying to set an alternate mode for gpio 0 or 32 then the mode was not set correctly. The reason is that there is computation error inside the function ocelot_pinmux_set_mux because in this case it was trying to shift to left by -1. Fix this by actually shifting the function bits and not the position.
Fixes: 4b36082e2e09 ("pinctrl: ocelot: fix pinmuxing for pins after 31") Signed-off-by: Horatiu Vultur horatiu.vultur@microchip.com Link: https://lore.kernel.org/r/20230206203720.1177718-1-horatiu.vultur@microchip.... Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pinctrl/pinctrl-ocelot.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/pinctrl/pinctrl-ocelot.c b/drivers/pinctrl/pinctrl-ocelot.c index 29e4a6282a641..1dcbd0937ef5a 100644 --- a/drivers/pinctrl/pinctrl-ocelot.c +++ b/drivers/pinctrl/pinctrl-ocelot.c @@ -1204,7 +1204,7 @@ static int ocelot_pinmux_set_mux(struct pinctrl_dev *pctldev, regmap_update_bits(info->map, REG_ALT(0, info, pin->pin), BIT(p), f << p); regmap_update_bits(info->map, REG_ALT(1, info, pin->pin), - BIT(p), f << (p - 1)); + BIT(p), (f >> 1) << p);
return 0; }
From: Matthias Benkmann matthias.benkmann@gmail.com
[ Upstream commit ffa6206ebf8d39e83d87ac226df68dbbe155819a ]
When commit commit fff1011a26d6 ("Input: xpad - add X-Box Adaptive Profile button") was applied, one hunk ended up in the wrong function; move it to where it belongs.
Fixes: fff1011a26d6 ("Input: xpad - add X-Box Adaptive Profile button") Signed-off-by: Matthias Benkmann matthias.benkmann@gmail.com Link: https://lore.kernel.org/r/20230318162106.0aef4ba5@ninja Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/input/joystick/xpad.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/drivers/input/joystick/xpad.c b/drivers/input/joystick/xpad.c index 2959d80f7fdb6..cd36cf7165423 100644 --- a/drivers/input/joystick/xpad.c +++ b/drivers/input/joystick/xpad.c @@ -779,9 +779,6 @@ static void xpad_process_packet(struct usb_xpad *xpad, u16 cmd, unsigned char *d input_report_key(dev, BTN_C, data[8]); input_report_key(dev, BTN_Z, data[9]);
- /* Profile button has a value of 0-3, so it is reported as an axis */ - if (xpad->mapping & MAP_PROFILE_BUTTON) - input_report_abs(dev, ABS_PROFILE, data[34]);
input_sync(dev); } @@ -1059,6 +1056,10 @@ static void xpadone_process_packet(struct usb_xpad *xpad, u16 cmd, unsigned char (__u16) le16_to_cpup((__le16 *)(data + 8))); }
+ /* Profile button has a value of 0-3, so it is reported as an axis */ + if (xpad->mapping & MAP_PROFILE_BUTTON) + input_report_abs(dev, ABS_PROFILE, data[34]); + /* paddle handling */ /* based on SDL's SDL_hidapi_xboxone.c */ if (xpad->mapping & MAP_PADDLES) {
From: Lu Baolu baolu.lu@linux.intel.com
[ Upstream commit bfd3c6b9fa4a1dc78139dd1621d5bea321ffa69d ]
The VT-d spec states (in section 11.4.2) that hardware implementations reporting second-stage translation support (SSTS) field as Clear also report the SAGAW field as 0. Fix an inappropriate check in alloc_iommu().
Fixes: 792fb43ce2c9 ("iommu/vt-d: Enable Intel IOMMU scalable mode by default") Suggested-by: Raghunathan Srinivasan raghunathan.srinivasan@intel.com Reviewed-by: Kevin Tian kevin.tian@intel.com Signed-off-by: Jacob Pan jacob.jun.pan@linux.intel.com Signed-off-by: Lu Baolu baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20230318024824.124542-1-baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20230329134721.469447-3-baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/intel/dmar.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c index b00a0ceb2d137..c80c524ad32d7 100644 --- a/drivers/iommu/intel/dmar.c +++ b/drivers/iommu/intel/dmar.c @@ -1057,7 +1057,8 @@ static int alloc_iommu(struct dmar_drhd_unit *drhd) }
err = -EINVAL; - if (cap_sagaw(iommu->cap) == 0) { + if (!cap_sagaw(iommu->cap) && + (!ecap_smts(iommu->ecap) || ecap_slts(iommu->ecap))) { pr_info("%s: No supported address widths. Not attempting DMA translation.\n", iommu->name); drhd->ignored = 1;
From: Javier Martinez Canillas javierm@redhat.com
[ Upstream commit f95b8ea79c47c0ad3d18f45ad538f9970e414d1f ]
This reverts commit a837e5161cff, which broke probing of the venus driver, at least on the SC7180 SoC HP X2 Chromebook:
qcom-venus aa00000.video-codec: Adding to iommu group 11 qcom-venus aa00000.video-codec: non legacy binding qcom-venus aa00000.video-codec: failed to reset venus core qcom-venus: probe of aa00000.video-codec failed with error -110
Matthias Kaehlcke also reported that the same change caused a regression in SC7180 and sc7280, that prevents AOSS from entering sleep mode during system suspend. So let's revert this commit for now to fix both issues.
Fixes: a837e5161cff ("venus: firmware: Correct non-pix start and end addresses") Reported-by: Matthias Kaehlcke mka@chromium.org Signed-off-by: Javier Martinez Canillas javierm@redhat.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/platform/qcom/venus/firmware.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/media/platform/qcom/venus/firmware.c b/drivers/media/platform/qcom/venus/firmware.c index 142d4c74017c0..d59ecf776715c 100644 --- a/drivers/media/platform/qcom/venus/firmware.c +++ b/drivers/media/platform/qcom/venus/firmware.c @@ -38,8 +38,8 @@ static void venus_reset_cpu(struct venus_core *core) writel(fw_size, wrapper_base + WRAPPER_FW_END_ADDR); writel(0, wrapper_base + WRAPPER_CPA_START_ADDR); writel(fw_size, wrapper_base + WRAPPER_CPA_END_ADDR); - writel(0, wrapper_base + WRAPPER_NONPIX_START_ADDR); - writel(0, wrapper_base + WRAPPER_NONPIX_END_ADDR); + writel(fw_size, wrapper_base + WRAPPER_NONPIX_START_ADDR); + writel(fw_size, wrapper_base + WRAPPER_NONPIX_END_ADDR);
if (IS_V6(core)) { /* Bring XTSS out of reset */
From: Werner Sembach wse@tuxedocomputers.com
commit cbedf1a33970c9b825ae75b81fbd3e88e224a418 upstream.
A lot of modern Clevo barebones have touchpad and/or keyboard issues after suspend fixable with nomux + reset + noloop + nopnp. Luckily, none of them have an external PS/2 port so this can safely be set for all of them.
I'm not entirely sure if every device listed really needs all four quirks, but after testing and production use, no negative effects could be observed when setting all four.
Setting SERIO_QUIRK_NOMUX or SERIO_QUIRK_RESET_ALWAYS on the Clevo N150CU and the Clevo NHxxRZQ makes the keyboard very laggy for ~5 seconds after boot and sometimes also after resume. However both are required for the keyboard to not fail completely sometimes after boot or resume.
Signed-off-by: Werner Sembach wse@tuxedocomputers.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230321191619.647911-1-wse@tuxedocomputers.com Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/input/serio/i8042-acpipnpio.h | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+)
--- a/drivers/input/serio/i8042-acpipnpio.h +++ b/drivers/input/serio/i8042-acpipnpio.h @@ -1117,11 +1117,39 @@ static const struct dmi_system_id i8042_ SERIO_QUIRK_NOLOOP | SERIO_QUIRK_NOPNP) }, { + /* + * Setting SERIO_QUIRK_NOMUX or SERIO_QUIRK_RESET_ALWAYS makes + * the keyboard very laggy for ~5 seconds after boot and + * sometimes also after resume. + * However both are required for the keyboard to not fail + * completely sometimes after boot or resume. + */ + .matches = { + DMI_MATCH(DMI_BOARD_NAME, "N150CU"), + }, + .driver_data = (void *)(SERIO_QUIRK_NOMUX | SERIO_QUIRK_RESET_ALWAYS | + SERIO_QUIRK_NOLOOP | SERIO_QUIRK_NOPNP) + }, + { .matches = { DMI_MATCH(DMI_BOARD_NAME, "NH5xAx"), }, .driver_data = (void *)(SERIO_QUIRK_NOMUX | SERIO_QUIRK_RESET_ALWAYS | SERIO_QUIRK_NOLOOP | SERIO_QUIRK_NOPNP) + }, + { + /* + * Setting SERIO_QUIRK_NOMUX or SERIO_QUIRK_RESET_ALWAYS makes + * the keyboard very laggy for ~5 seconds after boot and + * sometimes also after resume. + * However both are required for the keyboard to not fail + * completely sometimes after boot or resume. + */ + .matches = { + DMI_MATCH(DMI_BOARD_NAME, "NHxxRZQ"), + }, + .driver_data = (void *)(SERIO_QUIRK_NOMUX | SERIO_QUIRK_RESET_ALWAYS | + SERIO_QUIRK_NOLOOP | SERIO_QUIRK_NOPNP) }, { .matches = {
From: msizanoen msizanoen@qtmlabs.xyz
commit 754ff5060daf5a1cf4474eff9b4edeb6c17ef7ab upstream.
The AlpsPS/2 code previously relied on the assumption that `char` is a signed type, which was true on x86 platforms (the only place where this driver is used) before kernel 6.2. However, on 6.2 and later, this assumption is broken due to the introduction of -funsigned-char as a new global compiler flag.
Fix this by explicitly specifying the signedness of `char` when sign extending the values received from the device.
Fixes: f3f33c677699 ("Input: alps - Rushmore and v7 resolution support") Signed-off-by: msizanoen msizanoen@qtmlabs.xyz Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230320045228.182259-1-msizanoen@qtmlabs.xyz Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/input/mouse/alps.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-)
--- a/drivers/input/mouse/alps.c +++ b/drivers/input/mouse/alps.c @@ -852,8 +852,8 @@ static void alps_process_packet_v6(struc x = y = z = 0;
/* Divide 4 since trackpoint's speed is too fast */ - input_report_rel(dev2, REL_X, (char)x / 4); - input_report_rel(dev2, REL_Y, -((char)y / 4)); + input_report_rel(dev2, REL_X, (s8)x / 4); + input_report_rel(dev2, REL_Y, -((s8)y / 4));
psmouse_report_standard_buttons(dev2, packet[3]);
@@ -1104,8 +1104,8 @@ static void alps_process_trackstick_pack ((packet[3] & 0x20) << 1); z = (packet[5] & 0x3f) | ((packet[3] & 0x80) >> 1);
- input_report_rel(dev2, REL_X, (char)x); - input_report_rel(dev2, REL_Y, -((char)y)); + input_report_rel(dev2, REL_X, (s8)x); + input_report_rel(dev2, REL_Y, -((s8)y)); input_report_abs(dev2, ABS_PRESSURE, z);
psmouse_report_standard_buttons(dev2, packet[1]); @@ -2294,20 +2294,20 @@ static int alps_get_v3_v7_resolution(str if (reg < 0) return reg;
- x_pitch = (char)(reg << 4) >> 4; /* sign extend lower 4 bits */ + x_pitch = (s8)(reg << 4) >> 4; /* sign extend lower 4 bits */ x_pitch = 50 + 2 * x_pitch; /* In 0.1 mm units */
- y_pitch = (char)reg >> 4; /* sign extend upper 4 bits */ + y_pitch = (s8)reg >> 4; /* sign extend upper 4 bits */ y_pitch = 36 + 2 * y_pitch; /* In 0.1 mm units */
reg = alps_command_mode_read_reg(psmouse, reg_pitch + 1); if (reg < 0) return reg;
- x_electrode = (char)(reg << 4) >> 4; /* sign extend lower 4 bits */ + x_electrode = (s8)(reg << 4) >> 4; /* sign extend lower 4 bits */ x_electrode = 17 + x_electrode;
- y_electrode = (char)reg >> 4; /* sign extend upper 4 bits */ + y_electrode = (s8)reg >> 4; /* sign extend upper 4 bits */ y_electrode = 13 + y_electrode;
x_phys = x_pitch * (x_electrode - 1); /* In 0.1 mm units */
From: Jason A. Donenfeld Jason@zx2c4.com
commit 8980f190947ba29f23110408e712444884b74251 upstream.
The recent change of -funsigned-char causes additions of negative numbers to become additions of large positive numbers, leading to wrong calculations of mouse movement. Change these casts to be explicitly signed, to take into account negative offsets.
Fixes: 3bc753c06dd0 ("kbuild: treat char as always unsigned") Signed-off-by: Jason A. Donenfeld Jason@zx2c4.com Reviewed-by: Hans de Goede hdegoede@redhat.com Cc: stable@vger.kernel.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=217211 Link: https://lore.kernel.org/r/20230318133010.1285202-1-Jason@zx2c4.com Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/input/mouse/focaltech.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/drivers/input/mouse/focaltech.c +++ b/drivers/input/mouse/focaltech.c @@ -202,8 +202,8 @@ static void focaltech_process_rel_packet state->pressed = packet[0] >> 7; finger1 = ((packet[0] >> 4) & 0x7) - 1; if (finger1 < FOC_MAX_FINGERS) { - state->fingers[finger1].x += (char)packet[1]; - state->fingers[finger1].y += (char)packet[2]; + state->fingers[finger1].x += (s8)packet[1]; + state->fingers[finger1].y += (s8)packet[2]; } else { psmouse_err(psmouse, "First finger in rel packet invalid: %d\n", finger1); @@ -218,8 +218,8 @@ static void focaltech_process_rel_packet */ finger2 = ((packet[3] >> 4) & 0x7) - 1; if (finger2 < FOC_MAX_FINGERS) { - state->fingers[finger2].x += (char)packet[4]; - state->fingers[finger2].y += (char)packet[5]; + state->fingers[finger2].x += (s8)packet[4]; + state->fingers[finger2].y += (s8)packet[5]; } }
From: Paulo Alcantara pc@manguebit.com
commit 09ba47b44d26b475bbdf9c80db9e0193d2b58956 upstream.
We can't call smb_init() in CIFSGetDFSRefer() as cifs_reconnect_tcon() may end up calling CIFSGetDFSRefer() again to get new DFS referrals and thus causing an infinite recursion.
Signed-off-by: Paulo Alcantara (SUSE) pc@manguebit.com Reviewed-by: Ronnie Sahlberg lsahlber@redhat.com Cc: stable@vger.kernel.org # 6.2 Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/cifs/cifssmb.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
--- a/fs/cifs/cifssmb.c +++ b/fs/cifs/cifssmb.c @@ -4377,8 +4377,13 @@ CIFSGetDFSRefer(const unsigned int xid, return -ENODEV;
getDFSRetry: - rc = smb_init(SMB_COM_TRANSACTION2, 15, ses->tcon_ipc, (void **) &pSMB, - (void **) &pSMBr); + /* + * Use smb_init_no_reconnect() instead of smb_init() as + * CIFSGetDFSRefer() may be called from cifs_reconnect_tcon() and thus + * causing an infinite recursion. + */ + rc = smb_init_no_reconnect(SMB_COM_TRANSACTION2, 15, ses->tcon_ipc, + (void **)&pSMB, (void **)&pSMBr); if (rc) return rc;
From: David Disseldorp ddiss@suse.de
commit 179a88a8558bbf42991d361595281f3e45d7edfc upstream.
When compiled with CONFIG_CIFS_DFS_UPCALL disabled, cifs_dfs_d_automount is NULL. cifs.ko logic for mapping CIFS_FATTR_DFS_REFERRAL attributes to S_AUTOMOUNT and corresponding dentry flags is retained regardless of CONFIG_CIFS_DFS_UPCALL, leading to a NULL pointer dereference in VFS follow_automount() when traversing a DFS referral link: BUG: kernel NULL pointer dereference, address: 0000000000000000 ... Call Trace: <TASK> __traverse_mounts+0xb5/0x220 ? cifs_revalidate_mapping+0x65/0xc0 [cifs] step_into+0x195/0x610 ? lookup_fast+0xe2/0xf0 path_lookupat+0x64/0x140 filename_lookup+0xc2/0x140 ? __create_object+0x299/0x380 ? kmem_cache_alloc+0x119/0x220 ? user_path_at_empty+0x31/0x50 user_path_at_empty+0x31/0x50 __x64_sys_chdir+0x2a/0xd0 ? exit_to_user_mode_prepare+0xca/0x100 do_syscall_64+0x42/0x90 entry_SYSCALL_64_after_hwframe+0x72/0xdc
This fix adds an inline cifs_dfs_d_automount() {return -EREMOTE} handler when CONFIG_CIFS_DFS_UPCALL is disabled. An alternative would be to avoid flagging S_AUTOMOUNT, etc. without CONFIG_CIFS_DFS_UPCALL. This approach was chosen as it provides more control over the error path.
Signed-off-by: David Disseldorp ddiss@suse.de Cc: stable@vger.kernel.org Reviewed-by: Paulo Alcantara (SUSE) pc@manguebit.com Reviewed-by: Ronnie Sahlberg lsahlber@redhat.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/cifs/cifsfs.h | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
--- a/fs/cifs/cifsfs.h +++ b/fs/cifs/cifsfs.h @@ -118,7 +118,10 @@ extern const struct dentry_operations ci #ifdef CONFIG_CIFS_DFS_UPCALL extern struct vfsmount *cifs_dfs_d_automount(struct path *path); #else -#define cifs_dfs_d_automount NULL +static inline struct vfsmount *cifs_dfs_d_automount(struct path *path) +{ + return ERR_PTR(-EREMOTE); +} #endif
/* Functions related to symlinks */
From: Jonathan Denose jdenose@chromium.org
commit f5bad62f9107b701a6def7cac1f5f65862219b83 upstream.
Fujitsu Lifebook A574/H requires the nomux option to properly probe the touchpad, especially when waking from sleep.
Signed-off-by: Jonathan Denose jdenose@google.com Reviewed-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20230303152623.45859-1-jdenose@google.com Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/input/serio/i8042-acpipnpio.h | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/drivers/input/serio/i8042-acpipnpio.h +++ b/drivers/input/serio/i8042-acpipnpio.h @@ -611,6 +611,14 @@ static const struct dmi_system_id i8042_ .driver_data = (void *)(SERIO_QUIRK_NOMUX) }, { + /* Fujitsu Lifebook A574/H */ + .matches = { + DMI_MATCH(DMI_SYS_VENDOR, "FUJITSU"), + DMI_MATCH(DMI_PRODUCT_NAME, "FMVA0501PZ"), + }, + .driver_data = (void *)(SERIO_QUIRK_NOMUX) + }, + { /* Gigabyte M912 */ .matches = { DMI_MATCH(DMI_SYS_VENDOR, "GIGABYTE"),
From: Hans de Goede hdegoede@redhat.com
commit 8a0432bab6ea3203d220785da7ab3c7677f70ecb upstream.
The Android Lenovo Yoga Book X90F / X90L uses the same goodix touchscreen with 9 bytes touch reports for its touch keyboard as the already supported Windows Lenovo Yoga Book X91F/L, add a DMI match for this to the nine_bytes_report DMI table.
When the quirk for the X91F/L was initially added it was written to also apply to the X90F/L but this does not work because the Android version of the Yoga Book uses completely different DMI strings. Also adjust the X91F/L quirk to reflect that it only applies to the X91F/L models.
Signed-off-by: Hans de Goede hdegoede@redhat.com Reviewed-by: Bastien Nocera hadess@hadess.net Link: https://lore.kernel.org/r/20230315134442.71787-1-hdegoede@redhat.com Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/input/touchscreen/goodix.c | 14 +++++++++++--- 1 file changed, 11 insertions(+), 3 deletions(-)
--- a/drivers/input/touchscreen/goodix.c +++ b/drivers/input/touchscreen/goodix.c @@ -124,10 +124,18 @@ static const unsigned long goodix_irq_fl static const struct dmi_system_id nine_bytes_report[] = { #if defined(CONFIG_DMI) && defined(CONFIG_X86) { - .ident = "Lenovo YogaBook", - /* YB1-X91L/F and YB1-X90L/F */ + /* Lenovo Yoga Book X90F / X90L */ .matches = { - DMI_MATCH(DMI_PRODUCT_NAME, "Lenovo YB1-X9") + DMI_EXACT_MATCH(DMI_SYS_VENDOR, "Intel Corporation"), + DMI_EXACT_MATCH(DMI_PRODUCT_NAME, "CHERRYVIEW D1 PLATFORM"), + DMI_EXACT_MATCH(DMI_PRODUCT_VERSION, "YETI-11"), + } + }, + { + /* Lenovo Yoga Book X91F / X91L */ + .matches = { + /* Non exact match to match F + L versions */ + DMI_MATCH(DMI_PRODUCT_NAME, "Lenovo YB1-X91"), } }, #endif
From: Filipe Manana fdmanana@suse.com
commit 2d82a40aa7d6fcae0250ec68b8566cdee7bfd44c upstream.
Before relocating a block group we pause scrub, then do the relocation and then unpause scrub. The relocation process requires starting and committing a transaction, and if we have a failure in the critical section of the transaction commit path (transaction state >= TRANS_STATE_COMMIT_START), we will deadlock if there is a paused scrub.
That results in stack traces like the following:
[42.479] BTRFS info (device sdc): relocating block group 53876686848 flags metadata|raid6 [42.936] BTRFS warning (device sdc): Skipping commit of aborted transaction. [42.936] ------------[ cut here ]------------ [42.936] BTRFS: Transaction aborted (error -28) [42.936] WARNING: CPU: 11 PID: 346822 at fs/btrfs/transaction.c:1977 btrfs_commit_transaction+0xcc8/0xeb0 [btrfs] [42.936] Modules linked in: dm_flakey dm_mod loop btrfs (...) [42.936] CPU: 11 PID: 346822 Comm: btrfs Tainted: G W 6.3.0-rc2-btrfs-next-127+ #1 [42.936] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [42.936] RIP: 0010:btrfs_commit_transaction+0xcc8/0xeb0 [btrfs] [42.936] Code: ff ff 45 8b (...) [42.936] RSP: 0018:ffffb58649633b48 EFLAGS: 00010282 [42.936] RAX: 0000000000000000 RBX: ffff8be6ef4d5bd8 RCX: 0000000000000000 [42.936] RDX: 0000000000000002 RSI: ffffffffb35e7782 RDI: 00000000ffffffff [42.936] RBP: ffff8be6ef4d5c98 R08: 0000000000000000 R09: ffffb586496339e8 [42.936] R10: 0000000000000001 R11: 0000000000000001 R12: ffff8be6d38c7c00 [42.936] R13: 00000000ffffffe4 R14: ffff8be6c268c000 R15: ffff8be6ef4d5cf0 [42.936] FS: 00007f381a82b340(0000) GS:ffff8beddfcc0000(0000) knlGS:0000000000000000 [42.936] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [42.936] CR2: 00007f1e35fb7638 CR3: 0000000117680006 CR4: 0000000000370ee0 [42.936] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [42.936] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [42.936] Call Trace: [42.936] <TASK> [42.936] ? start_transaction+0xcb/0x610 [btrfs] [42.936] prepare_to_relocate+0x111/0x1a0 [btrfs] [42.936] relocate_block_group+0x57/0x5d0 [btrfs] [42.936] ? btrfs_wait_nocow_writers+0x25/0xb0 [btrfs] [42.936] btrfs_relocate_block_group+0x248/0x3c0 [btrfs] [42.936] ? __pfx_autoremove_wake_function+0x10/0x10 [42.936] btrfs_relocate_chunk+0x3b/0x150 [btrfs] [42.936] btrfs_balance+0x8ff/0x11d0 [btrfs] [42.936] ? __kmem_cache_alloc_node+0x14a/0x410 [42.936] btrfs_ioctl+0x2334/0x32c0 [btrfs] [42.937] ? mod_objcg_state+0xd2/0x360 [42.937] ? refill_obj_stock+0xb0/0x160 [42.937] ? seq_release+0x25/0x30 [42.937] ? __rseq_handle_notify_resume+0x3b5/0x4b0 [42.937] ? percpu_counter_add_batch+0x2e/0xa0 [42.937] ? __x64_sys_ioctl+0x88/0xc0 [42.937] __x64_sys_ioctl+0x88/0xc0 [42.937] do_syscall_64+0x38/0x90 [42.937] entry_SYSCALL_64_after_hwframe+0x72/0xdc [42.937] RIP: 0033:0x7f381a6ffe9b [42.937] Code: 00 48 89 44 24 (...) [42.937] RSP: 002b:00007ffd45ecf060 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [42.937] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f381a6ffe9b [42.937] RDX: 00007ffd45ecf150 RSI: 00000000c4009420 RDI: 0000000000000003 [42.937] RBP: 0000000000000003 R08: 0000000000000013 R09: 0000000000000000 [42.937] R10: 00007f381a60c878 R11: 0000000000000246 R12: 00007ffd45ed0423 [42.937] R13: 00007ffd45ecf150 R14: 0000000000000000 R15: 00007ffd45ecf148 [42.937] </TASK> [42.937] ---[ end trace 0000000000000000 ]--- [42.937] BTRFS: error (device sdc: state A) in cleanup_transaction:1977: errno=-28 No space left [59.196] INFO: task btrfs:346772 blocked for more than 120 seconds. [59.196] Tainted: G W 6.3.0-rc2-btrfs-next-127+ #1 [59.196] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [59.196] task:btrfs state:D stack:0 pid:346772 ppid:1 flags:0x00004002 [59.196] Call Trace: [59.196] <TASK> [59.196] __schedule+0x392/0xa70 [59.196] ? __pv_queued_spin_lock_slowpath+0x165/0x370 [59.196] schedule+0x5d/0xd0 [59.196] __scrub_blocked_if_needed+0x74/0xc0 [btrfs] [59.197] ? __pfx_autoremove_wake_function+0x10/0x10 [59.197] scrub_pause_off+0x21/0x50 [btrfs] [59.197] scrub_simple_mirror+0x1c7/0x950 [btrfs] [59.197] ? scrub_parity_put+0x1a5/0x1d0 [btrfs] [59.198] ? __pfx_autoremove_wake_function+0x10/0x10 [59.198] scrub_stripe+0x20d/0x740 [btrfs] [59.198] scrub_chunk+0xc4/0x130 [btrfs] [59.198] scrub_enumerate_chunks+0x3e4/0x7a0 [btrfs] [59.198] ? __pfx_autoremove_wake_function+0x10/0x10 [59.198] btrfs_scrub_dev+0x236/0x6a0 [btrfs] [59.199] ? btrfs_ioctl+0xd97/0x32c0 [btrfs] [59.199] ? _copy_from_user+0x7b/0x80 [59.199] btrfs_ioctl+0xde1/0x32c0 [btrfs] [59.199] ? refill_stock+0x33/0x50 [59.199] ? should_failslab+0xa/0x20 [59.199] ? kmem_cache_alloc_node+0x151/0x460 [59.199] ? alloc_io_context+0x1b/0x80 [59.199] ? preempt_count_add+0x70/0xa0 [59.199] ? __x64_sys_ioctl+0x88/0xc0 [59.199] __x64_sys_ioctl+0x88/0xc0 [59.199] do_syscall_64+0x38/0x90 [59.199] entry_SYSCALL_64_after_hwframe+0x72/0xdc [59.199] RIP: 0033:0x7f82ffaffe9b [59.199] RSP: 002b:00007f82ff9fcc50 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [59.199] RAX: ffffffffffffffda RBX: 000055b191e36310 RCX: 00007f82ffaffe9b [59.199] RDX: 000055b191e36310 RSI: 00000000c400941b RDI: 0000000000000003 [59.199] RBP: 0000000000000000 R08: 00007fff1575016f R09: 0000000000000000 [59.199] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f82ff9fd640 [59.199] R13: 000000000000006b R14: 00007f82ffa87580 R15: 0000000000000000 [59.199] </TASK> [59.199] INFO: task btrfs:346773 blocked for more than 120 seconds. [59.200] Tainted: G W 6.3.0-rc2-btrfs-next-127+ #1 [59.200] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [59.201] task:btrfs state:D stack:0 pid:346773 ppid:1 flags:0x00004002 [59.201] Call Trace: [59.201] <TASK> [59.201] __schedule+0x392/0xa70 [59.201] ? __pv_queued_spin_lock_slowpath+0x165/0x370 [59.201] schedule+0x5d/0xd0 [59.201] __scrub_blocked_if_needed+0x74/0xc0 [btrfs] [59.201] ? __pfx_autoremove_wake_function+0x10/0x10 [59.201] scrub_pause_off+0x21/0x50 [btrfs] [59.202] scrub_simple_mirror+0x1c7/0x950 [btrfs] [59.202] ? scrub_parity_put+0x1a5/0x1d0 [btrfs] [59.202] ? __pfx_autoremove_wake_function+0x10/0x10 [59.202] scrub_stripe+0x20d/0x740 [btrfs] [59.202] scrub_chunk+0xc4/0x130 [btrfs] [59.203] scrub_enumerate_chunks+0x3e4/0x7a0 [btrfs] [59.203] ? __pfx_autoremove_wake_function+0x10/0x10 [59.203] btrfs_scrub_dev+0x236/0x6a0 [btrfs] [59.203] ? btrfs_ioctl+0xd97/0x32c0 [btrfs] [59.203] ? _copy_from_user+0x7b/0x80 [59.203] btrfs_ioctl+0xde1/0x32c0 [btrfs] [59.204] ? should_failslab+0xa/0x20 [59.204] ? kmem_cache_alloc_node+0x151/0x460 [59.204] ? alloc_io_context+0x1b/0x80 [59.204] ? preempt_count_add+0x70/0xa0 [59.204] ? __x64_sys_ioctl+0x88/0xc0 [59.204] __x64_sys_ioctl+0x88/0xc0 [59.204] do_syscall_64+0x38/0x90 [59.204] entry_SYSCALL_64_after_hwframe+0x72/0xdc [59.204] RIP: 0033:0x7f82ffaffe9b [59.204] RSP: 002b:00007f82ff1fbc50 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [59.204] RAX: ffffffffffffffda RBX: 000055b191e36790 RCX: 00007f82ffaffe9b [59.204] RDX: 000055b191e36790 RSI: 00000000c400941b RDI: 0000000000000003 [59.204] RBP: 0000000000000000 R08: 00007fff1575016f R09: 0000000000000000 [59.204] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f82ff1fc640 [59.204] R13: 000000000000006b R14: 00007f82ffa87580 R15: 0000000000000000 [59.204] </TASK> [59.204] INFO: task btrfs:346774 blocked for more than 120 seconds. [59.205] Tainted: G W 6.3.0-rc2-btrfs-next-127+ #1 [59.205] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [59.206] task:btrfs state:D stack:0 pid:346774 ppid:1 flags:0x00004002 [59.206] Call Trace: [59.206] <TASK> [59.206] __schedule+0x392/0xa70 [59.206] schedule+0x5d/0xd0 [59.206] __scrub_blocked_if_needed+0x74/0xc0 [btrfs] [59.206] ? __pfx_autoremove_wake_function+0x10/0x10 [59.206] scrub_pause_off+0x21/0x50 [btrfs] [59.207] scrub_simple_mirror+0x1c7/0x950 [btrfs] [59.207] ? scrub_parity_put+0x1a5/0x1d0 [btrfs] [59.207] ? __pfx_autoremove_wake_function+0x10/0x10 [59.207] scrub_stripe+0x20d/0x740 [btrfs] [59.208] scrub_chunk+0xc4/0x130 [btrfs] [59.208] scrub_enumerate_chunks+0x3e4/0x7a0 [btrfs] [59.208] ? __mutex_unlock_slowpath.isra.0+0x9a/0x120 [59.208] btrfs_scrub_dev+0x236/0x6a0 [btrfs] [59.208] ? btrfs_ioctl+0xd97/0x32c0 [btrfs] [59.209] ? _copy_from_user+0x7b/0x80 [59.209] btrfs_ioctl+0xde1/0x32c0 [btrfs] [59.209] ? should_failslab+0xa/0x20 [59.209] ? kmem_cache_alloc_node+0x151/0x460 [59.209] ? alloc_io_context+0x1b/0x80 [59.209] ? preempt_count_add+0x70/0xa0 [59.209] ? __x64_sys_ioctl+0x88/0xc0 [59.209] __x64_sys_ioctl+0x88/0xc0 [59.209] do_syscall_64+0x38/0x90 [59.209] entry_SYSCALL_64_after_hwframe+0x72/0xdc [59.209] RIP: 0033:0x7f82ffaffe9b [59.209] RSP: 002b:00007f82fe9fac50 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [59.209] RAX: ffffffffffffffda RBX: 000055b191e36c10 RCX: 00007f82ffaffe9b [59.209] RDX: 000055b191e36c10 RSI: 00000000c400941b RDI: 0000000000000003 [59.209] RBP: 0000000000000000 R08: 00007fff1575016f R09: 0000000000000000 [59.209] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f82fe9fb640 [59.209] R13: 000000000000006b R14: 00007f82ffa87580 R15: 0000000000000000 [59.209] </TASK> [59.209] INFO: task btrfs:346775 blocked for more than 120 seconds. [59.210] Tainted: G W 6.3.0-rc2-btrfs-next-127+ #1 [59.210] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [59.211] task:btrfs state:D stack:0 pid:346775 ppid:1 flags:0x00004002 [59.211] Call Trace: [59.211] <TASK> [59.211] __schedule+0x392/0xa70 [59.211] schedule+0x5d/0xd0 [59.211] __scrub_blocked_if_needed+0x74/0xc0 [btrfs] [59.211] ? __pfx_autoremove_wake_function+0x10/0x10 [59.211] scrub_pause_off+0x21/0x50 [btrfs] [59.212] scrub_simple_mirror+0x1c7/0x950 [btrfs] [59.212] ? scrub_parity_put+0x1a5/0x1d0 [btrfs] [59.212] ? __pfx_autoremove_wake_function+0x10/0x10 [59.212] scrub_stripe+0x20d/0x740 [btrfs] [59.213] scrub_chunk+0xc4/0x130 [btrfs] [59.213] scrub_enumerate_chunks+0x3e4/0x7a0 [btrfs] [59.213] ? __mutex_unlock_slowpath.isra.0+0x9a/0x120 [59.213] btrfs_scrub_dev+0x236/0x6a0 [btrfs] [59.213] ? btrfs_ioctl+0xd97/0x32c0 [btrfs] [59.214] ? _copy_from_user+0x7b/0x80 [59.214] btrfs_ioctl+0xde1/0x32c0 [btrfs] [59.214] ? should_failslab+0xa/0x20 [59.214] ? kmem_cache_alloc_node+0x151/0x460 [59.214] ? alloc_io_context+0x1b/0x80 [59.214] ? preempt_count_add+0x70/0xa0 [59.214] ? __x64_sys_ioctl+0x88/0xc0 [59.214] __x64_sys_ioctl+0x88/0xc0 [59.214] do_syscall_64+0x38/0x90 [59.214] entry_SYSCALL_64_after_hwframe+0x72/0xdc [59.214] RIP: 0033:0x7f82ffaffe9b [59.214] RSP: 002b:00007f82fe1f9c50 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [59.214] RAX: ffffffffffffffda RBX: 000055b191e37090 RCX: 00007f82ffaffe9b [59.214] RDX: 000055b191e37090 RSI: 00000000c400941b RDI: 0000000000000003 [59.214] RBP: 0000000000000000 R08: 00007fff1575016f R09: 0000000000000000 [59.214] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f82fe1fa640 [59.214] R13: 000000000000006b R14: 00007f82ffa87580 R15: 0000000000000000 [59.214] </TASK> [59.214] INFO: task btrfs:346776 blocked for more than 120 seconds. [59.215] Tainted: G W 6.3.0-rc2-btrfs-next-127+ #1 [59.216] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [59.217] task:btrfs state:D stack:0 pid:346776 ppid:1 flags:0x00004002 [59.217] Call Trace: [59.217] <TASK> [59.217] __schedule+0x392/0xa70 [59.217] ? __pv_queued_spin_lock_slowpath+0x165/0x370 [59.217] schedule+0x5d/0xd0 [59.217] __scrub_blocked_if_needed+0x74/0xc0 [btrfs] [59.217] ? __pfx_autoremove_wake_function+0x10/0x10 [59.217] scrub_pause_off+0x21/0x50 [btrfs] [59.217] scrub_simple_mirror+0x1c7/0x950 [btrfs] [59.217] ? scrub_parity_put+0x1a5/0x1d0 [btrfs] [59.218] ? __pfx_autoremove_wake_function+0x10/0x10 [59.218] scrub_stripe+0x20d/0x740 [btrfs] [59.218] scrub_chunk+0xc4/0x130 [btrfs] [59.218] scrub_enumerate_chunks+0x3e4/0x7a0 [btrfs] [59.219] ? __pfx_autoremove_wake_function+0x10/0x10 [59.219] btrfs_scrub_dev+0x236/0x6a0 [btrfs] [59.219] ? btrfs_ioctl+0xd97/0x32c0 [btrfs] [59.219] ? _copy_from_user+0x7b/0x80 [59.219] btrfs_ioctl+0xde1/0x32c0 [btrfs] [59.219] ? should_failslab+0xa/0x20 [59.219] ? kmem_cache_alloc_node+0x151/0x460 [59.219] ? alloc_io_context+0x1b/0x80 [59.219] ? preempt_count_add+0x70/0xa0 [59.219] ? __x64_sys_ioctl+0x88/0xc0 [59.219] __x64_sys_ioctl+0x88/0xc0 [59.219] do_syscall_64+0x38/0x90 [59.219] entry_SYSCALL_64_after_hwframe+0x72/0xdc [59.219] RIP: 0033:0x7f82ffaffe9b [59.219] RSP: 002b:00007f82fd9f8c50 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [59.219] RAX: ffffffffffffffda RBX: 000055b191e37510 RCX: 00007f82ffaffe9b [59.219] RDX: 000055b191e37510 RSI: 00000000c400941b RDI: 0000000000000003 [59.219] RBP: 0000000000000000 R08: 00007fff1575016f R09: 0000000000000000 [59.219] R10: 0000000000000000 R11: 0000000000000246 R12: 00007f82fd9f9640 [59.219] R13: 000000000000006b R14: 00007f82ffa87580 R15: 0000000000000000 [59.219] </TASK> [59.219] INFO: task btrfs:346822 blocked for more than 120 seconds. [59.220] Tainted: G W 6.3.0-rc2-btrfs-next-127+ #1 [59.221] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [59.222] task:btrfs state:D stack:0 pid:346822 ppid:1 flags:0x00004002 [59.222] Call Trace: [59.222] <TASK> [59.222] __schedule+0x392/0xa70 [59.222] schedule+0x5d/0xd0 [59.222] btrfs_scrub_cancel+0x91/0x100 [btrfs] [59.222] ? __pfx_autoremove_wake_function+0x10/0x10 [59.222] btrfs_commit_transaction+0x572/0xeb0 [btrfs] [59.223] ? start_transaction+0xcb/0x610 [btrfs] [59.223] prepare_to_relocate+0x111/0x1a0 [btrfs] [59.223] relocate_block_group+0x57/0x5d0 [btrfs] [59.223] ? btrfs_wait_nocow_writers+0x25/0xb0 [btrfs] [59.223] btrfs_relocate_block_group+0x248/0x3c0 [btrfs] [59.224] ? __pfx_autoremove_wake_function+0x10/0x10 [59.224] btrfs_relocate_chunk+0x3b/0x150 [btrfs] [59.224] btrfs_balance+0x8ff/0x11d0 [btrfs] [59.224] ? __kmem_cache_alloc_node+0x14a/0x410 [59.224] btrfs_ioctl+0x2334/0x32c0 [btrfs] [59.225] ? mod_objcg_state+0xd2/0x360 [59.225] ? refill_obj_stock+0xb0/0x160 [59.225] ? seq_release+0x25/0x30 [59.225] ? __rseq_handle_notify_resume+0x3b5/0x4b0 [59.225] ? percpu_counter_add_batch+0x2e/0xa0 [59.225] ? __x64_sys_ioctl+0x88/0xc0 [59.225] __x64_sys_ioctl+0x88/0xc0 [59.225] do_syscall_64+0x38/0x90 [59.225] entry_SYSCALL_64_after_hwframe+0x72/0xdc [59.225] RIP: 0033:0x7f381a6ffe9b [59.225] RSP: 002b:00007ffd45ecf060 EFLAGS: 00000246 ORIG_RAX: 0000000000000010 [59.225] RAX: ffffffffffffffda RBX: 0000000000000001 RCX: 00007f381a6ffe9b [59.225] RDX: 00007ffd45ecf150 RSI: 00000000c4009420 RDI: 0000000000000003 [59.225] RBP: 0000000000000003 R08: 0000000000000013 R09: 0000000000000000 [59.225] R10: 00007f381a60c878 R11: 0000000000000246 R12: 00007ffd45ed0423 [59.225] R13: 00007ffd45ecf150 R14: 0000000000000000 R15: 00007ffd45ecf148 [59.225] </TASK>
What happens is the following:
1) A scrub is running, so fs_info->scrubs_running is 1;
2) Task A starts block group relocation, and at btrfs_relocate_chunk() it pauses scrub by calling btrfs_scrub_pause(). That increments fs_info->scrub_pause_req from 0 to 1 and waits for the scrub task to pause (for fs_info->scrubs_paused to be == to fs_info->scrubs_running);
3) The scrub task pauses at scrub_pause_off(), waiting for fs_info->scrub_pause_req to decrease to 0;
4) Task A then enters btrfs_relocate_block_group(), and down that call chain we start a transaction and then attempt to commit it;
5) When task A calls btrfs_commit_transaction(), it either will do the commit itself or wait for some other task that already started the commit of the transaction - it doesn't matter which case;
6) The transaction commit enters state TRANS_STATE_COMMIT_START;
7) An error happens during the transaction commit, like -ENOSPC when running delayed refs or delayed items for example;
8) This results in calling transaction.c:cleanup_transaction(), where we call btrfs_scrub_cancel(), incrementing fs_info->scrub_cancel_req from 0 to 1, and blocking this task waiting for fs_info->scrubs_running to decrease to 0;
9) From this point on, both the transaction commit and the scrub task hang forever:
1) The transaction commit is waiting for fs_info->scrubs_running to be decreased to 0;
2) The scrub task is at scrub_pause_off() waiting for fs_info->scrub_pause_req to decrease to 0 - so it can not proceed to stop the scrub and decrement fs_info->scrubs_running from 0 to 1.
Therefore resulting in a deadlock.
Fix this by having cleanup_transaction(), called if a transaction commit fails, not call btrfs_scrub_cancel() if relocation is in progress, and having btrfs_relocate_block_group() call btrfs_scrub_cancel() instead if the relocation failed and a transaction abort happened.
This was triggered with btrfs/061 from fstests.
Fixes: 55e3a601c81c ("btrfs: Fix data checksum error cause by replace with io-load.") CC: stable@vger.kernel.org # 4.14+ Reviewed-by: Josef Bacik josef@toxicpanda.com Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/transaction.c | 15 ++++++++++++++- fs/btrfs/volumes.c | 9 ++++++++- 2 files changed, 22 insertions(+), 2 deletions(-)
--- a/fs/btrfs/transaction.c +++ b/fs/btrfs/transaction.c @@ -2035,7 +2035,20 @@ static void cleanup_transaction(struct b
if (current->journal_info == trans) current->journal_info = NULL; - btrfs_scrub_cancel(fs_info); + + /* + * If relocation is running, we can't cancel scrub because that will + * result in a deadlock. Before relocating a block group, relocation + * pauses scrub, then starts and commits a transaction before unpausing + * scrub. If the transaction commit is being done by the relocation + * task or triggered by another task and the relocation task is waiting + * for the commit, and we end up here due to an error in the commit + * path, then calling btrfs_scrub_cancel() will deadlock, as we are + * asking for scrub to stop while having it asked to be paused higher + * above in relocation code. + */ + if (!test_bit(BTRFS_FS_RELOC_RUNNING, &fs_info->flags)) + btrfs_scrub_cancel(fs_info);
kmem_cache_free(btrfs_trans_handle_cachep, trans); } --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -3266,8 +3266,15 @@ int btrfs_relocate_chunk(struct btrfs_fs btrfs_scrub_pause(fs_info); ret = btrfs_relocate_block_group(fs_info, chunk_offset); btrfs_scrub_continue(fs_info); - if (ret) + if (ret) { + /* + * If we had a transaction abort, stop all running scrubs. + * See transaction.c:cleanup_transaction() why we do it here. + */ + if (BTRFS_FS_ERROR(fs_info)) + btrfs_scrub_cancel(fs_info); return ret; + }
block_group = btrfs_lookup_block_group(fs_info, chunk_offset); if (!block_group)
From: Filipe Manana fdmanana@suse.com
commit 2f1a6be12ab6c8470d5776e68644726c94257c54 upstream.
The quota assign ioctl can currently run in parallel with a quota disable ioctl call. The assign ioctl uses the quota root, while the disable ioctl frees that root, and therefore we can have a use-after-free triggered in the assign ioctl, leading to a trace like the following when KASAN is enabled:
[672.723][T736] BUG: KASAN: slab-use-after-free in btrfs_search_slot+0x2962/0x2db0 [672.723][T736] Read of size 8 at addr ffff888022ec0208 by task btrfs_search_sl/27736 [672.724][T736] [672.725][T736] CPU: 1 PID: 27736 Comm: btrfs_search_sl Not tainted 6.3.0-rc3 #37 [672.723][T736] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 [672.727][T736] Call Trace: [672.728][T736] <TASK> [672.728][T736] dump_stack_lvl+0xd9/0x150 [672.725][T736] print_report+0xc1/0x5e0 [672.720][T736] ? __virt_addr_valid+0x61/0x2e0 [672.727][T736] ? __phys_addr+0xc9/0x150 [672.725][T736] ? btrfs_search_slot+0x2962/0x2db0 [672.722][T736] kasan_report+0xc0/0xf0 [672.729][T736] ? btrfs_search_slot+0x2962/0x2db0 [672.724][T736] btrfs_search_slot+0x2962/0x2db0 [672.723][T736] ? fs_reclaim_acquire+0xba/0x160 [672.722][T736] ? split_leaf+0x13d0/0x13d0 [672.726][T736] ? rcu_is_watching+0x12/0xb0 [672.723][T736] ? kmem_cache_alloc+0x338/0x3c0 [672.722][T736] update_qgroup_status_item+0xf7/0x320 [672.724][T736] ? add_qgroup_rb+0x3d0/0x3d0 [672.739][T736] ? do_raw_spin_lock+0x12d/0x2b0 [672.730][T736] ? spin_bug+0x1d0/0x1d0 [672.737][T736] btrfs_run_qgroups+0x5de/0x840 [672.730][T736] ? btrfs_qgroup_rescan_worker+0xa70/0xa70 [672.738][T736] ? __del_qgroup_relation+0x4ba/0xe00 [672.738][T736] btrfs_ioctl+0x3d58/0x5d80 [672.735][T736] ? tomoyo_path_number_perm+0x16a/0x550 [672.737][T736] ? tomoyo_execute_permission+0x4a0/0x4a0 [672.731][T736] ? btrfs_ioctl_get_supported_features+0x50/0x50 [672.737][T736] ? __sanitizer_cov_trace_switch+0x54/0x90 [672.734][T736] ? do_vfs_ioctl+0x132/0x1660 [672.730][T736] ? vfs_fileattr_set+0xc40/0xc40 [672.730][T736] ? _raw_spin_unlock_irq+0x2e/0x50 [672.732][T736] ? sigprocmask+0xf2/0x340 [672.737][T736] ? __fget_files+0x26a/0x480 [672.732][T736] ? bpf_lsm_file_ioctl+0x9/0x10 [672.738][T736] ? btrfs_ioctl_get_supported_features+0x50/0x50 [672.736][T736] __x64_sys_ioctl+0x198/0x210 [672.736][T736] do_syscall_64+0x39/0xb0 [672.731][T736] entry_SYSCALL_64_after_hwframe+0x63/0xcd [672.739][T736] RIP: 0033:0x4556ad [672.742][T736] </TASK> [672.743][T736] [672.748][T736] Allocated by task 27677: [672.743][T736] kasan_save_stack+0x22/0x40 [672.741][T736] kasan_set_track+0x25/0x30 [672.741][T736] __kasan_kmalloc+0xa4/0xb0 [672.749][T736] btrfs_alloc_root+0x48/0x90 [672.746][T736] btrfs_create_tree+0x146/0xa20 [672.744][T736] btrfs_quota_enable+0x461/0x1d20 [672.743][T736] btrfs_ioctl+0x4a1c/0x5d80 [672.747][T736] __x64_sys_ioctl+0x198/0x210 [672.749][T736] do_syscall_64+0x39/0xb0 [672.744][T736] entry_SYSCALL_64_after_hwframe+0x63/0xcd [672.756][T736] [672.757][T736] Freed by task 27677: [672.759][T736] kasan_save_stack+0x22/0x40 [672.759][T736] kasan_set_track+0x25/0x30 [672.756][T736] kasan_save_free_info+0x2e/0x50 [672.751][T736] ____kasan_slab_free+0x162/0x1c0 [672.758][T736] slab_free_freelist_hook+0x89/0x1c0 [672.752][T736] __kmem_cache_free+0xaf/0x2e0 [672.752][T736] btrfs_put_root+0x1ff/0x2b0 [672.759][T736] btrfs_quota_disable+0x80a/0xbc0 [672.752][T736] btrfs_ioctl+0x3e5f/0x5d80 [672.756][T736] __x64_sys_ioctl+0x198/0x210 [672.753][T736] do_syscall_64+0x39/0xb0 [672.765][T736] entry_SYSCALL_64_after_hwframe+0x63/0xcd [672.769][T736] [672.768][T736] The buggy address belongs to the object at ffff888022ec0000 [672.768][T736] which belongs to the cache kmalloc-4k of size 4096 [672.769][T736] The buggy address is located 520 bytes inside of [672.769][T736] freed 4096-byte region [ffff888022ec0000, ffff888022ec1000) [672.760][T736] [672.764][T736] The buggy address belongs to the physical page: [672.761][T736] page:ffffea00008bb000 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x22ec0 [672.766][T736] head:ffffea00008bb000 order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0 [672.779][T736] flags: 0xfff00000010200(slab|head|node=0|zone=1|lastcpupid=0x7ff) [672.770][T736] raw: 00fff00000010200 ffff888012842140 ffffea000054ba00 dead000000000002 [672.770][T736] raw: 0000000000000000 0000000000040004 00000001ffffffff 0000000000000000 [672.771][T736] page dumped because: kasan: bad access detected [672.778][T736] page_owner tracks the page as allocated [672.777][T736] page last allocated via order 3, migratetype Unmovable, gfp_mask 0xd2040(__GFP_IO|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 88 [672.779][T736] get_page_from_freelist+0x119c/0x2d50 [672.779][T736] __alloc_pages+0x1cb/0x4a0 [672.776][T736] alloc_pages+0x1aa/0x270 [672.773][T736] allocate_slab+0x260/0x390 [672.771][T736] ___slab_alloc+0xa9a/0x13e0 [672.778][T736] __slab_alloc.constprop.0+0x56/0xb0 [672.771][T736] __kmem_cache_alloc_node+0x136/0x320 [672.789][T736] __kmalloc+0x4e/0x1a0 [672.783][T736] tomoyo_realpath_from_path+0xc3/0x600 [672.781][T736] tomoyo_path_perm+0x22f/0x420 [672.782][T736] tomoyo_path_unlink+0x92/0xd0 [672.780][T736] security_path_unlink+0xdb/0x150 [672.788][T736] do_unlinkat+0x377/0x680 [672.788][T736] __x64_sys_unlink+0xca/0x110 [672.789][T736] do_syscall_64+0x39/0xb0 [672.783][T736] entry_SYSCALL_64_after_hwframe+0x63/0xcd [672.784][T736] page last free stack trace: [672.787][T736] free_pcp_prepare+0x4e5/0x920 [672.787][T736] free_unref_page+0x1d/0x4e0 [672.784][T736] __unfreeze_partials+0x17c/0x1a0 [672.797][T736] qlist_free_all+0x6a/0x180 [672.796][T736] kasan_quarantine_reduce+0x189/0x1d0 [672.797][T736] __kasan_slab_alloc+0x64/0x90 [672.793][T736] kmem_cache_alloc+0x17c/0x3c0 [672.799][T736] getname_flags.part.0+0x50/0x4e0 [672.799][T736] getname_flags+0x9e/0xe0 [672.792][T736] vfs_fstatat+0x77/0xb0 [672.791][T736] __do_sys_newlstat+0x84/0x100 [672.798][T736] do_syscall_64+0x39/0xb0 [672.796][T736] entry_SYSCALL_64_after_hwframe+0x63/0xcd [672.790][T736] [672.791][T736] Memory state around the buggy address: [672.799][T736] ffff888022ec0100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [672.805][T736] ffff888022ec0180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [672.802][T736] >ffff888022ec0200: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [672.809][T736] ^ [672.809][T736] ffff888022ec0280: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb [672.809][T736] ffff888022ec0300: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
Fix this by having the qgroup assign ioctl take the qgroup ioctl mutex before calling btrfs_run_qgroups(), which is what all qgroup ioctls should call.
Reported-by: butt3rflyh4ck butterflyhuangxx@gmail.com Link: https://lore.kernel.org/linux-btrfs/CAFcO6XN3VD8ogmHwqRk4kbiwtpUSNySu2VAxN8w... CC: stable@vger.kernel.org # 5.10+ Reviewed-by: Qu Wenruo wqu@suse.com Signed-off-by: Filipe Manana fdmanana@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/ioctl.c | 2 ++ fs/btrfs/qgroup.c | 11 ++++++++++- 2 files changed, 12 insertions(+), 1 deletion(-)
--- a/fs/btrfs/ioctl.c +++ b/fs/btrfs/ioctl.c @@ -3731,7 +3731,9 @@ static long btrfs_ioctl_qgroup_assign(st }
/* update qgroup status and info */ + mutex_lock(&fs_info->qgroup_ioctl_lock); err = btrfs_run_qgroups(trans); + mutex_unlock(&fs_info->qgroup_ioctl_lock); if (err < 0) btrfs_handle_fs_error(fs_info, err, "failed to update qgroup status and info"); --- a/fs/btrfs/qgroup.c +++ b/fs/btrfs/qgroup.c @@ -2828,13 +2828,22 @@ cleanup: }
/* - * called from commit_transaction. Writes all changed qgroups to disk. + * Writes all changed qgroups to disk. + * Called by the transaction commit path and the qgroup assign ioctl. */ int btrfs_run_qgroups(struct btrfs_trans_handle *trans) { struct btrfs_fs_info *fs_info = trans->fs_info; int ret = 0;
+ /* + * In case we are called from the qgroup assign ioctl, assert that we + * are holding the qgroup_ioctl_lock, otherwise we can race with a quota + * disable operation (ioctl) and access a freed quota root. + */ + if (trans->transaction->state != TRANS_STATE_COMMIT_DOING) + lockdep_assert_held(&fs_info->qgroup_ioctl_lock); + if (!fs_info->quota_root) return ret;
From: Anand Jain anand.jain@oracle.com
commit 50d281fc434cb8e2497f5e70a309ccca6b1a09f0 upstream.
This fixes mkfs/mount/check failures due to race with systemd-udevd scan.
During the device scan initiated by systemd-udevd, other user space EXCL operations such as mkfs, mount, or check may get blocked and result in a "Device or resource busy" error. This is because the device scan process opens the device with the EXCL flag in the kernel.
Two reports were received:
- btrfs/179 test case, where the fsck command failed with the -EBUSY error
- LTP pwritev03 test case, where mkfs.vfs failed with the -EBUSY error, when mkfs.vfs tried to overwrite old btrfs filesystem on the device.
In both cases, fsck and mkfs (respectively) were racing with a systemd-udevd device scan, and systemd-udevd won, resulting in the -EBUSY error for fsck and mkfs.
Reproducing the problem has been difficult because there is a very small window during which these userspace threads can race to acquire the exclusive device open. Even on the system where the problem was observed, the problem occurrences were anywhere between 10 to 400 iterations and chances of reproducing decreases with debug printk()s.
However, an exclusive device open is unnecessary for the scan process, as there are no write operations on the device during scan. Furthermore, during the mount process, the superblock is re-read in the below function call chain:
btrfs_mount_root btrfs_open_devices open_fs_devices btrfs_open_one_device btrfs_get_bdev_and_sb
So, to fix this issue, removes the FMODE_EXCL flag from the scan operation, and add a comment.
The case where mkfs may still write to the device and a scan is running, the btrfs signature is not written at that time so scan will not recognize such device.
Reported-by: Sherry Yang sherry.yang@oracle.com Reported-by: kernel test robot oliver.sang@intel.com Link: https://lore.kernel.org/oe-lkp/202303170839.fdf23068-oliver.sang@intel.com CC: stable@vger.kernel.org # 5.4+ Signed-off-by: Anand Jain anand.jain@oracle.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/volumes.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)
--- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1366,8 +1366,17 @@ struct btrfs_device *btrfs_scan_one_devi * So, we need to add a special mount option to scan for * later supers, using BTRFS_SUPER_MIRROR_MAX instead */ - flags |= FMODE_EXCL;
+ /* + * Avoid using flag |= FMODE_EXCL here, as the systemd-udev may + * initiate the device scan which may race with the user's mount + * or mkfs command, resulting in failure. + * Since the device scan is solely for reading purposes, there is + * no need for FMODE_EXCL. Additionally, the devices are read again + * during the mount process. It is ok to get some inconsistent + * values temporarily, as the device paths of the fsid are the only + * required information for assembling the volume. + */ bdev = blkdev_get_by_path(path, flags, holder); if (IS_ERR(bdev)) return ERR_CAST(bdev);
From: Filipe Manana fdmanana@suse.com
commit 2280d425ba3599bdd85c41bd0ec8ba568f00c032 upstream.
During fiemap, when walking backreferences to determine if a b+tree node/leaf is shared, we may find a tree block (leaf or node) for which two parents were added to the references ulist. This happens if we get for example one direct ref (shared tree block ref) and one indirect ref (non-shared tree block ref) for the tree block at the current level, which can happen during relocation.
In that case the fiemap path cache can not be used since it's meant for a single path, with one tree block at each possible level, so having multiple references for a tree block at any level may result in getting the level counter exceed BTRFS_MAX_LEVEL and eventually trigger the warning:
WARN_ON_ONCE(level >= BTRFS_MAX_LEVEL)
at lookup_backref_shared_cache() and at store_backref_shared_cache(). This is harmless since the code ignores any level >= BTRFS_MAX_LEVEL, the warning is there just to catch any unexpected case like the one described above. However if a user finds this it may be scary and get reported.
So just ignore the path cache once we find a tree block for which there are more than one reference, which is the less common case, and update the cache with the sharedness check result for all levels below the level for which we found multiple references.
Reported-by: Jarno Pelkonen jarno.pelkonen@gmail.com Link: https://lore.kernel.org/linux-btrfs/CAKv8qLmDNAGJGCtsevxx_VZ_YOvvs1L83iEJkTg... Fixes: 12a824dc67a6 ("btrfs: speedup checking for extent sharedness during fiemap") CC: stable@vger.kernel.org # 6.1+ Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/backref.c | 85 +++++++++++++++++++++++++++++++++++++++-------------- 1 file changed, 63 insertions(+), 22 deletions(-)
--- a/fs/btrfs/backref.c +++ b/fs/btrfs/backref.c @@ -1895,8 +1895,7 @@ int btrfs_is_data_extent_shared(struct b ULIST_ITER_INIT(&uiter); ctx->use_path_cache = true; while (1) { - bool is_shared; - bool cached; + const unsigned long prev_ref_count = ctx->refs.nnodes;
walk_ctx.bytenr = bytenr; ret = find_parent_nodes(&walk_ctx, &shared); @@ -1914,21 +1913,36 @@ int btrfs_is_data_extent_shared(struct b ret = 0;
/* - * If our data extent was not directly shared (without multiple - * reference items), than it might have a single reference item - * with a count > 1 for the same offset, which means there are 2 - * (or more) file extent items that point to the data extent - - * this happens when a file extent item needs to be split and - * then one item gets moved to another leaf due to a b+tree leaf - * split when inserting some item. In this case the file extent - * items may be located in different leaves and therefore some - * of the leaves may be referenced through shared subtrees while - * others are not. Since our extent buffer cache only works for - * a single path (by far the most common case and simpler to - * deal with), we can not use it if we have multiple leaves - * (which implies multiple paths). + * More than one extent buffer (bytenr) may have been added to + * the ctx->refs ulist, in which case we have to check multiple + * tree paths in case the first one is not shared, so we can not + * use the path cache which is made for a single path. Multiple + * extent buffers at the current level happen when: + * + * 1) level -1, the data extent: If our data extent was not + * directly shared (without multiple reference items), then + * it might have a single reference item with a count > 1 for + * the same offset, which means there are 2 (or more) file + * extent items that point to the data extent - this happens + * when a file extent item needs to be split and then one + * item gets moved to another leaf due to a b+tree leaf split + * when inserting some item. In this case the file extent + * items may be located in different leaves and therefore + * some of the leaves may be referenced through shared + * subtrees while others are not. Since our extent buffer + * cache only works for a single path (by far the most common + * case and simpler to deal with), we can not use it if we + * have multiple leaves (which implies multiple paths). + * + * 2) level >= 0, a tree node/leaf: We can have a mix of direct + * and indirect references on a b+tree node/leaf, so we have + * to check multiple paths, and the extent buffer (the + * current bytenr) may be shared or not. One example is + * during relocation as we may get a shared tree block ref + * (direct ref) and a non-shared tree block ref (indirect + * ref) for the same node/leaf. */ - if (level == -1 && ctx->refs.nnodes > 1) + if ((ctx->refs.nnodes - prev_ref_count) > 1) ctx->use_path_cache = false;
if (level >= 0) @@ -1938,12 +1952,17 @@ int btrfs_is_data_extent_shared(struct b if (!node) break; bytenr = node->val; - level++; - cached = lookup_backref_shared_cache(ctx, root, bytenr, level, - &is_shared); - if (cached) { - ret = (is_shared ? 1 : 0); - break; + if (ctx->use_path_cache) { + bool is_shared; + bool cached; + + level++; + cached = lookup_backref_shared_cache(ctx, root, bytenr, + level, &is_shared); + if (cached) { + ret = (is_shared ? 1 : 0); + break; + } } shared.share_count = 0; shared.have_delayed_delete_refs = false; @@ -1951,6 +1970,28 @@ int btrfs_is_data_extent_shared(struct b }
/* + * If the path cache is disabled, then it means at some tree level we + * got multiple parents due to a mix of direct and indirect backrefs or + * multiple leaves with file extent items pointing to the same data + * extent. We have to invalidate the cache and cache only the sharedness + * result for the levels where we got only one node/reference. + */ + if (!ctx->use_path_cache) { + int i = 0; + + level--; + if (ret >= 0 && level >= 0) { + bytenr = ctx->path_cache_entries[level].bytenr; + ctx->use_path_cache = true; + store_backref_shared_cache(ctx, root, bytenr, level, ret); + i = level + 1; + } + + for ( ; i < BTRFS_MAX_LEVEL; i++) + ctx->path_cache_entries[i].bytenr = 0; + } + + /* * Cache the sharedness result for the data extent if we know our inode * has more than 1 file extent item that refers to the data extent. */
From: Damien Le Moal damien.lemoal@opensource.wdc.com
commit 77af13ba3c7f91d91c377c7e2d122849bbc17128 upstream.
The call to invalidate_inode_pages2_range() in __iomap_dio_rw() may fail, in which case -ENOTBLK is returned and this error code is propagated back to user space trhough iomap_dio_rw() -> zonefs_file_dio_write() return chain. This error code is fairly obscure and may confuse the user. Avoid this and be consistent with the behavior of zonefs_file_dio_append() for similar invalidate_inode_pages2_range() errors by returning -EBUSY to user space when iomap_dio_rw() returns -ENOTBLK.
Suggested-by: Christoph Hellwig hch@infradead.org Fixes: 8dcc1a9d90c1 ("fs: New zonefs file system") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal damien.lemoal@opensource.wdc.com Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Tested-by: Hans Holmberg hans.holmberg@wdc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/zonefs/file.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-)
--- a/fs/zonefs/file.c +++ b/fs/zonefs/file.c @@ -567,11 +567,21 @@ static ssize_t zonefs_file_dio_write(str append = sync; }
- if (append) + if (append) { ret = zonefs_file_dio_append(iocb, from); - else + } else { + /* + * iomap_dio_rw() may return ENOTBLK if there was an issue with + * page invalidation. Overwrite that error code with EBUSY to + * be consistent with zonefs_file_dio_append() return value for + * similar issues. + */ ret = iomap_dio_rw(iocb, from, &zonefs_write_iomap_ops, &zonefs_write_dio_ops, 0, NULL, 0); + if (ret == -ENOTBLK) + ret = -EBUSY; + } + if (zonefs_zone_is_seq(z) && (ret > 0 || ret == -EIOCBQUEUED)) { if (ret > 0)
From: Jens Axboe axboe@kernel.dk
commit 005308f7bdacf5685ed1a431244a183dbbb9e0e8 upstream.
Unless we have at least one entry queued, then don't call into io_poll_remove_entries(). Normally this isn't possible, but if we retry poll then we can have ->nr_entries cleared again as we're setting it up. If this happens for a poll retry, then we'll still have at least REQ_F_SINGLE_POLL set. io_poll_remove_entries() then thinks it has entries to remove.
Clear REQ_F_SINGLE_POLL and REQ_F_DOUBLE_POLL unconditionally when arming a poll request.
Fixes: c16bda37594f ("io_uring/poll: allow some retries for poll triggering spuriously") Cc: stable@vger.kernel.org Reported-by: Pengfei Xu pengfei.xu@intel.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- io_uring/poll.c | 1 + 1 file changed, 1 insertion(+)
--- a/io_uring/poll.c +++ b/io_uring/poll.c @@ -724,6 +724,7 @@ int io_arm_poll_handler(struct io_kiocb apoll = io_req_alloc_apoll(req, issue_flags); if (!apoll) return IO_APOLL_ABORTED; + req->flags &= ~(REQ_F_SINGLE_POLL | REQ_F_DOUBLE_POLL); req->flags |= REQ_F_POLLED; ipt.pt._qproc = io_async_queue_proc;
From: Pavel Begunkov asml.silence@gmail.com
commit 4ff0b50de8cabba055efe50bbcb7506c41a69835 upstream.
We should not be looking at ctx->rsrc_node and anyhow modifying the node without holding uring_lock, grabbing references in such a way is not safe either.
Cc: stable@vger.kernel.org Fixes: 5106dd6e74ab6 ("io_uring: propagate issue_flags state down to file assignment") Signed-off-by: Pavel Begunkov asml.silence@gmail.com Link: https://lore.kernel.org/r/1202ede2d7bb90136e3482b2b84aad9ed483e5d6.168009843... Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- io_uring/rsrc.h | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-)
--- a/io_uring/rsrc.h +++ b/io_uring/rsrc.h @@ -144,15 +144,13 @@ static inline void io_req_set_rsrc_node( unsigned int issue_flags) { if (!req->rsrc_node) { - req->rsrc_node = ctx->rsrc_node; + io_ring_submit_lock(ctx, issue_flags);
- if (!(issue_flags & IO_URING_F_UNLOCKED)) { - lockdep_assert_held(&ctx->uring_lock); + lockdep_assert_held(&ctx->uring_lock);
- io_charge_rsrc_node(ctx); - } else { - percpu_ref_get(&req->rsrc_node->refs); - } + req->rsrc_node = ctx->rsrc_node; + io_charge_rsrc_node(ctx); + io_ring_submit_unlock(ctx, issue_flags); } }
From: Pavel Begunkov asml.silence@gmail.com
commit fd30d1cdcc4ff405fc54765edf2e11b03f2ed4f3 upstream.
We increase cache->nr_cached when we free into the cache but don't decrease when we take from it, so in some time we'll get an empty cache with cache->nr_cached larger than IO_ALLOC_CACHE_MAX, that fails io_alloc_cache_put() and effectively disables caching.
Fixes: 9b797a37c4bd8 ("io_uring: add abstraction around apoll cache") Cc: stable@vger.kernel.org Signed-off-by: Pavel Begunkov asml.silence@gmail.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- io_uring/alloc_cache.h | 1 + 1 file changed, 1 insertion(+)
--- a/io_uring/alloc_cache.h +++ b/io_uring/alloc_cache.h @@ -27,6 +27,7 @@ static inline struct io_cache_entry *io_ struct hlist_node *node = cache->list.first;
hlist_del(node); + cache->nr_cached--; return container_of(node, struct io_cache_entry, node); }
From: Ronak Doshi doshir@vmware.com
commit 3bced313b9a5a237c347e0f079c8c2fe4b3935aa upstream.
Currently, vmxnet3 uses GRO callback only if LRO is disabled. However, on smartNic based setups where UPT is supported, LRO can be enabled from guest VM but UPT devicve does not support LRO as of now. In such cases, there can be performance degradation as GRO is not being done.
This patch fixes this issue by calling GRO API when UPT is enabled. We use updateRxProd to determine if UPT mode is active or not.
To clarify few things discussed over the thread: The patch is not neglecting any feature bits nor disabling GRO. It uses GRO callback when UPT is active as LRO is not available in UPT. GRO callback cannot be used as default for all cases as it degrades performance for non-UPT cases or for cases when LRO is already done in ESXi.
Cc: stable@vger.kernel.org Fixes: 6f91f4ba046e ("vmxnet3: add support for capability registers") Signed-off-by: Ronak Doshi doshir@vmware.com Reviewed-by: Simon Horman simon.horman@corigine.com Link: https://lore.kernel.org/r/20230323200721.27622-1-doshir@vmware.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/vmxnet3/vmxnet3_drv.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/net/vmxnet3/vmxnet3_drv.c +++ b/drivers/net/vmxnet3/vmxnet3_drv.c @@ -1688,7 +1688,9 @@ not_lro: if (unlikely(rcd->ts)) __vlan_hwaccel_put_tag(skb, htons(ETH_P_8021Q), rcd->tci);
- if (adapter->netdev->features & NETIF_F_LRO) + /* Use GRO callback if UPT is enabled */ + if ((adapter->netdev->features & NETIF_F_LRO) && + !rq->shared->updateRxProd) netif_receive_skb(skb); else napi_gro_receive(&rq->napi, skb);
From: Damien Le Moal damien.lemoal@opensource.wdc.com
commit c1976bd8f23016d8706973908f2bb0ac0d852a8f upstream.
When a direct append write is executed, the append offset may correspond to the last page of a sequential file inode which might have been cached already by buffered reads, page faults with mmap-read or non-direct readahead. To ensure that the on-disk and cached data is consistant for such last cached page, make sure to always invalidate it in zonefs_file_dio_append(). If the invalidation fails, return -EBUSY to userspace to differentiate from IO errors.
This invalidation will always be a no-op when the FS block size (device zone write granularity) is equal to the page size (e.g. 4K).
Reported-by: Hans Holmberg Hans.Holmberg@wdc.com Fixes: 02ef12a663c7 ("zonefs: use REQ_OP_ZONE_APPEND for sync DIO") Cc: stable@vger.kernel.org Signed-off-by: Damien Le Moal damien.lemoal@opensource.wdc.com Reviewed-by: Christoph Hellwig hch@lst.de Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Tested-by: Hans Holmberg hans.holmberg@wdc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/zonefs/file.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
--- a/fs/zonefs/file.c +++ b/fs/zonefs/file.c @@ -382,6 +382,7 @@ static ssize_t zonefs_file_dio_append(st struct zonefs_zone *z = zonefs_inode_zone(inode); struct block_device *bdev = inode->i_sb->s_bdev; unsigned int max = bdev_max_zone_append_sectors(bdev); + pgoff_t start, end; struct bio *bio; ssize_t size; int nr_pages; @@ -390,6 +391,19 @@ static ssize_t zonefs_file_dio_append(st max = ALIGN_DOWN(max << SECTOR_SHIFT, inode->i_sb->s_blocksize); iov_iter_truncate(from, max);
+ /* + * If the inode block size (zone write granularity) is smaller than the + * page size, we may be appending data belonging to the last page of the + * inode straddling inode->i_size, with that page already cached due to + * a buffered read or readahead. So make sure to invalidate that page. + * This will always be a no-op for the case where the block size is + * equal to the page size. + */ + start = iocb->ki_pos >> PAGE_SHIFT; + end = (iocb->ki_pos + iov_iter_count(from) - 1) >> PAGE_SHIFT; + if (invalidate_inode_pages2_range(inode->i_mapping, start, end)) + return -EBUSY; + nr_pages = iov_iter_npages(from, BIO_MAX_VECS); if (!nr_pages) return 0;
From: Mike Snitzer snitzer@kernel.org
commit 666eed46769d929c3e13636134ecfc67d75ef548 upstream.
Commit 7dd76d1feec70 ("dm: improve bio splitting and associated IO accounting") only called setup_split_accounting() from __send_duplicate_bios() if a single bio were being issued. But the case where duplicate bios are issued must call it too.
Otherwise the bio won't be split and resubmitted (via recursion through block core back to DM) to submit the later portions of a bio (which may map to an entirely different target).
For example, when discarding an entire DM striped device with the following DM table: vg-lvol0: 0 159744 striped 2 128 7:0 2048 7:1 2048 vg-lvol0: 159744 45056 striped 2 128 7:2 2048 7:3 2048
Before (broken, discards the first striped target's devices twice): device-mapper: striped: target_stripe=0, bdev=7:0, start=2048 len=79872 device-mapper: striped: target_stripe=1, bdev=7:1, start=2048 len=79872 device-mapper: striped: target_stripe=0, bdev=7:0, start=2049 len=22528 device-mapper: striped: target_stripe=1, bdev=7:1, start=2048 len=22528
After (works as expected): device-mapper: striped: target_stripe=0, bdev=7:0, start=2048 len=79872 device-mapper: striped: target_stripe=1, bdev=7:1, start=2048 len=79872 device-mapper: striped: target_stripe=0, bdev=7:2, start=2048 len=22528 device-mapper: striped: target_stripe=1, bdev=7:3, start=2048 len=22528
Fixes: 7dd76d1feec70 ("dm: improve bio splitting and associated IO accounting") Cc: stable@vger.kernel.org Reported-by: Orange Kao orange@aiven.io Signed-off-by: Mike Snitzer snitzer@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/md/dm.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -1509,6 +1509,8 @@ static int __send_duplicate_bios(struct ret = 1; break; default: + if (len) + setup_split_accounting(ci, *len); /* dm_accept_partial_bio() is not supported with shared tio->len_ptr */ alloc_multiple_bios(&blist, ci, ti, num_bios); while ((clone = bio_list_pop(&blist))) {
From: Oleksij Rempel o.rempel@pengutronix.de
commit d1366b283d94ac4537a4b3a1e8668da4df7ce7e9 upstream.
This commit addresses a deadlock situation that can occur in certain scenarios, such as when running data TP/ETP transfer and subscribing to the error queue while receiving a net down event. The deadlock involves locks in the following order:
3 j1939_session_list_lock -> active_session_list_lock j1939_session_activate ... j1939_sk_queue_activate_next -> sk_session_queue_lock ... j1939_xtp_rx_eoma_one
2 j1939_sk_queue_drop_all -> sk_session_queue_lock ... j1939_sk_netdev_event_netdown -> j1939_socks_lock j1939_netdev_notify
1 j1939_sk_errqueue -> j1939_socks_lock __j1939_session_cancel -> active_session_list_lock j1939_tp_rxtimer
CPU0 CPU1 ---- ---- lock(&priv->active_session_list_lock); lock(&jsk->sk_session_queue_lock); lock(&priv->active_session_list_lock); lock(&priv->j1939_socks_lock);
The solution implemented in this commit is to move the j1939_sk_errqueue() call out of the active_session_list_lock context, thus preventing the deadlock situation.
Reported-by: syzbot+ee1cd780f69483a8616b@syzkaller.appspotmail.com Fixes: 5b9272e93f2e ("can: j1939: extend UAPI to notify about RX status") Co-developed-by: Hillf Danton hdanton@sina.com Signed-off-by: Hillf Danton hdanton@sina.com Signed-off-by: Oleksij Rempel o.rempel@pengutronix.de Link: https://lore.kernel.org/all/20230324130141.2132787-1-o.rempel@pengutronix.de Cc: stable@vger.kernel.org Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/can/j1939/transport.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/net/can/j1939/transport.c +++ b/net/can/j1939/transport.c @@ -1124,8 +1124,6 @@ static void __j1939_session_cancel(struc
if (session->sk) j1939_sk_send_loop_abort(session->sk, session->err); - else - j1939_sk_errqueue(session, J1939_ERRQUEUE_RX_ABORT); }
static void j1939_session_cancel(struct j1939_session *session, @@ -1140,6 +1138,9 @@ static void j1939_session_cancel(struct }
j1939_session_list_unlock(session->priv); + + if (!session->sk) + j1939_sk_errqueue(session, J1939_ERRQUEUE_RX_ABORT); }
static enum hrtimer_restart j1939_tp_txtimer(struct hrtimer *hrtimer) @@ -1253,6 +1254,9 @@ static enum hrtimer_restart j1939_tp_rxt __j1939_session_cancel(session, J1939_XTP_ABORT_TIMEOUT); } j1939_session_list_unlock(session->priv); + + if (!session->sk) + j1939_sk_errqueue(session, J1939_ERRQUEUE_RX_ABORT); }
j1939_session_put(session);
From: Juergen Gross jgross@suse.com
commit 05310f31ca74673a96567fb14637b7d5d6c82ea5 upstream.
Fix xenvif_get_requests() not to do grant copy operations across local page boundaries. This requires to double the maximum number of copy operations per queue, as each copy could now be split into 2.
Make sure that struct xenvif_tx_cb doesn't grow too large.
Cc: stable@vger.kernel.org Fixes: ad7f402ae4f4 ("xen/netback: Ensure protocol headers don't fall in the non-linear area") Signed-off-by: Juergen Gross jgross@suse.com Reviewed-by: Paul Durrant paul@xen.org Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/xen-netback/common.h | 2 +- drivers/net/xen-netback/netback.c | 25 +++++++++++++++++++++++-- 2 files changed, 24 insertions(+), 3 deletions(-)
--- a/drivers/net/xen-netback/common.h +++ b/drivers/net/xen-netback/common.h @@ -166,7 +166,7 @@ struct xenvif_queue { /* Per-queue data struct pending_tx_info pending_tx_info[MAX_PENDING_REQS]; grant_handle_t grant_tx_handle[MAX_PENDING_REQS];
- struct gnttab_copy tx_copy_ops[MAX_PENDING_REQS]; + struct gnttab_copy tx_copy_ops[2 * MAX_PENDING_REQS]; struct gnttab_map_grant_ref tx_map_ops[MAX_PENDING_REQS]; struct gnttab_unmap_grant_ref tx_unmap_ops[MAX_PENDING_REQS]; /* passed to gnttab_[un]map_refs with pages under (un)mapping */ --- a/drivers/net/xen-netback/netback.c +++ b/drivers/net/xen-netback/netback.c @@ -334,6 +334,7 @@ static int xenvif_count_requests(struct struct xenvif_tx_cb { u16 copy_pending_idx[XEN_NETBK_LEGACY_SLOTS_MAX + 1]; u8 copy_count; + u32 split_mask; };
#define XENVIF_TX_CB(skb) ((struct xenvif_tx_cb *)(skb)->cb) @@ -361,6 +362,8 @@ static inline struct sk_buff *xenvif_all struct sk_buff *skb = alloc_skb(size + NET_SKB_PAD + NET_IP_ALIGN, GFP_ATOMIC | __GFP_NOWARN); + + BUILD_BUG_ON(sizeof(*XENVIF_TX_CB(skb)) > sizeof(skb->cb)); if (unlikely(skb == NULL)) return NULL;
@@ -396,11 +399,13 @@ static void xenvif_get_requests(struct x nr_slots = shinfo->nr_frags + 1;
copy_count(skb) = 0; + XENVIF_TX_CB(skb)->split_mask = 0;
/* Create copy ops for exactly data_len bytes into the skb head. */ __skb_put(skb, data_len); while (data_len > 0) { int amount = data_len > txp->size ? txp->size : data_len; + bool split = false;
cop->source.u.ref = txp->gref; cop->source.domid = queue->vif->domid; @@ -413,6 +418,13 @@ static void xenvif_get_requests(struct x cop->dest.u.gmfn = virt_to_gfn(skb->data + skb_headlen(skb) - data_len);
+ /* Don't cross local page boundary! */ + if (cop->dest.offset + amount > XEN_PAGE_SIZE) { + amount = XEN_PAGE_SIZE - cop->dest.offset; + XENVIF_TX_CB(skb)->split_mask |= 1U << copy_count(skb); + split = true; + } + cop->len = amount; cop->flags = GNTCOPY_source_gref;
@@ -420,7 +432,8 @@ static void xenvif_get_requests(struct x pending_idx = queue->pending_ring[index]; callback_param(queue, pending_idx).ctx = NULL; copy_pending_idx(skb, copy_count(skb)) = pending_idx; - copy_count(skb)++; + if (!split) + copy_count(skb)++;
cop++; data_len -= amount; @@ -441,7 +454,8 @@ static void xenvif_get_requests(struct x nr_slots--; } else { /* The copy op partially covered the tx_request. - * The remainder will be mapped. + * The remainder will be mapped or copied in the next + * iteration. */ txp->offset += amount; txp->size -= amount; @@ -539,6 +553,13 @@ static int xenvif_tx_check_gop(struct xe pending_idx = copy_pending_idx(skb, i);
newerr = (*gopp_copy)->status; + + /* Split copies need to be handled together. */ + if (XENVIF_TX_CB(skb)->split_mask & (1U << i)) { + (*gopp_copy)++; + if (!newerr) + newerr = (*gopp_copy)->status; + } if (likely(!newerr)) { /* The first frag might still have this slot mapped */ if (i < copy_count(skb) - 1 || !sharedslot)
From: Josua Mayer josua@solid-run.com
commit 82e2c39f9ef78896e9b634dfd82dc042e6956bb7 upstream.
dp83869 internally uses a look-up table for mapping supported delays in nanoseconds to register values. When specific delays are defined in device-tree, phy_get_internal_delay does the lookup automatically returning an index.
The default case wrongly assigns the nanoseconds value from the lookup table, resulting in numeric value 2000 applied to delay configuration register, rather than the expected index values 0-7 (7 for 2000). Ultimately this issue broke RX for 1Gbps links.
Fix default delay configuration by assigning the intended index value directly.
Cc: stable@vger.kernel.org Fixes: 736b25afe284 ("net: dp83869: Add RGMII internal delay configuration") Co-developed-by: Yazan Shhady yazan.shhady@solid-run.com Signed-off-by: Yazan Shhady yazan.shhady@solid-run.com Signed-off-by: Josua Mayer josua@solid-run.com Reviewed-by: Simon Horman simon.horman@corigine.com Link: https://lore.kernel.org/r/20230323102536.31988-1-josua@solid-run.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/phy/dp83869.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
--- a/drivers/net/phy/dp83869.c +++ b/drivers/net/phy/dp83869.c @@ -588,15 +588,13 @@ static int dp83869_of_init(struct phy_de &dp83869_internal_delay[0], delay_size, true); if (dp83869->rx_int_delay < 0) - dp83869->rx_int_delay = - dp83869_internal_delay[DP83869_CLK_DELAY_DEF]; + dp83869->rx_int_delay = DP83869_CLK_DELAY_DEF;
dp83869->tx_int_delay = phy_get_internal_delay(phydev, dev, &dp83869_internal_delay[0], delay_size, false); if (dp83869->tx_int_delay < 0) - dp83869->tx_int_delay = - dp83869_internal_delay[DP83869_CLK_DELAY_DEF]; + dp83869->tx_int_delay = DP83869_CLK_DELAY_DEF;
return ret; }
From: Ben Hutchings ben@decadent.org.uk
commit fb27e70f6e408dee5d22b083e7a38a59e6118253 upstream.
modpost now reads CRCs from .*.cmd files, parsing them using strtol(). This is inconsistent with its parsing of Module.symvers and with their definition as *unsigned* 32-bit values.
strtol() clamps values to [LONG_MIN, LONG_MAX], and when building on a 32-bit system this changes all CRCs >= 0x80000000 to be 0x7fffffff.
Change extract_crcs_for_object() to use strtoul() instead.
Cc: stable@vger.kernel.org Fixes: f292d875d0dc ("modpost: extract symbol versions from *.cmd files") Signed-off-by: Ben Hutchings ben@decadent.org.uk Signed-off-by: Masahiro Yamada masahiroy@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- scripts/mod/modpost.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/scripts/mod/modpost.c +++ b/scripts/mod/modpost.c @@ -1733,7 +1733,7 @@ static void extract_crcs_for_object(cons if (!isdigit(*p)) continue; /* skip this line */
- crc = strtol(p, &p, 0); + crc = strtoul(p, &p, 0); if (*p != '\n') continue; /* skip this line */
From: Kornel Dulęba korneld@chromium.org
commit b26cd9325be4c1fcd331b77f10acb627c560d4d7 upstream.
This fixes a similar problem to the one observed in: commit 4e5a04be88fe ("pinctrl: amd: disable and mask interrupts on probe").
On some systems, during suspend/resume cycle firmware leaves an interrupt enabled on a pin that is not used by the kernel. This confuses the AMD pinctrl driver and causes spurious interrupts.
The driver already has logic to detect if a pin is used by the kernel. Leverage it to re-initialize interrupt fields of a pin only if it's not used by us.
Cc: stable@vger.kernel.org Fixes: dbad75dd1f25 ("pinctrl: add AMD GPIO driver support.") Signed-off-by: Kornel Dulęba korneld@chromium.org Link: https://lore.kernel.org/r/20230320093259.845178-1-korneld@chromium.org Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pinctrl/pinctrl-amd.c | 36 ++++++++++++++++++++---------------- 1 file changed, 20 insertions(+), 16 deletions(-)
--- a/drivers/pinctrl/pinctrl-amd.c +++ b/drivers/pinctrl/pinctrl-amd.c @@ -865,32 +865,34 @@ static const struct pinconf_ops amd_pinc .pin_config_group_set = amd_pinconf_group_set, };
-static void amd_gpio_irq_init(struct amd_gpio *gpio_dev) +static void amd_gpio_irq_init_pin(struct amd_gpio *gpio_dev, int pin) { - struct pinctrl_desc *desc = gpio_dev->pctrl->desc; + const struct pin_desc *pd; unsigned long flags; u32 pin_reg, mask; - int i;
mask = BIT(WAKE_CNTRL_OFF_S0I3) | BIT(WAKE_CNTRL_OFF_S3) | BIT(INTERRUPT_MASK_OFF) | BIT(INTERRUPT_ENABLE_OFF) | BIT(WAKE_CNTRL_OFF_S4);
- for (i = 0; i < desc->npins; i++) { - int pin = desc->pins[i].number; - const struct pin_desc *pd = pin_desc_get(gpio_dev->pctrl, pin); - - if (!pd) - continue; + pd = pin_desc_get(gpio_dev->pctrl, pin); + if (!pd) + return;
- raw_spin_lock_irqsave(&gpio_dev->lock, flags); + raw_spin_lock_irqsave(&gpio_dev->lock, flags); + pin_reg = readl(gpio_dev->base + pin * 4); + pin_reg &= ~mask; + writel(pin_reg, gpio_dev->base + pin * 4); + raw_spin_unlock_irqrestore(&gpio_dev->lock, flags); +}
- pin_reg = readl(gpio_dev->base + i * 4); - pin_reg &= ~mask; - writel(pin_reg, gpio_dev->base + i * 4); +static void amd_gpio_irq_init(struct amd_gpio *gpio_dev) +{ + struct pinctrl_desc *desc = gpio_dev->pctrl->desc; + int i;
- raw_spin_unlock_irqrestore(&gpio_dev->lock, flags); - } + for (i = 0; i < desc->npins; i++) + amd_gpio_irq_init_pin(gpio_dev, i); }
#ifdef CONFIG_PM_SLEEP @@ -943,8 +945,10 @@ static int amd_gpio_resume(struct device for (i = 0; i < desc->npins; i++) { int pin = desc->pins[i].number;
- if (!amd_gpio_should_save(gpio_dev, pin)) + if (!amd_gpio_should_save(gpio_dev, pin)) { + amd_gpio_irq_init_pin(gpio_dev, pin); continue; + }
raw_spin_lock_irqsave(&gpio_dev->lock, flags); gpio_dev->saved_regs[i] |= readl(gpio_dev->base + pin * 4) & PIN_IRQ_PENDING;
From: Johan Hovold johan+linaro@kernel.org
commit 7bb97e360acdd38b68ad0a1defb89c6e89c85596 upstream.
Since commit d59f6617eef0 ("genirq: Allow fwnode to carry name information only") an IRQ domain is always given a name during allocation (e.g. used for the debugfs entry).
Drop the no longer valid name assignment, which would lead to an attempt to free a string constant when removing the domain on late probe failures (e.g. probe deferral).
Fixes: d59f6617eef0 ("genirq: Allow fwnode to carry name information only") Cc: stable@vger.kernel.org # 4.13 Signed-off-by: Johan Hovold johan+linaro@kernel.org Reviewed-by: Claudiu Beznea claudiu.beznea@microchip.com Tested-by: Claudiu Beznea claudiu.beznea@microchip.com # on SAMA7G5 Link: https://lore.kernel.org/r/20230224130828.27985-1-johan+linaro@kernel.org Signed-off-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pinctrl/pinctrl-at91-pio4.c | 1 - 1 file changed, 1 deletion(-)
--- a/drivers/pinctrl/pinctrl-at91-pio4.c +++ b/drivers/pinctrl/pinctrl-at91-pio4.c @@ -1206,7 +1206,6 @@ static int atmel_pinctrl_probe(struct pl dev_err(dev, "can't add the irq domain\n"); return -ENODEV; } - atmel_pioctrl->irq_domain->name = "atmel gpio";
for (i = 0; i < atmel_pioctrl->npins; i++) { int irq = irq_create_mapping(atmel_pioctrl->irq_domain, i);
From: Hans de Goede hdegoede@redhat.com
commit e3271a5917d1501089b1a224d702aa053e2877f4 upstream.
Commit 5829f8a897e4 ("platform/x86: ideapad-laptop: Send KEY_TOUCHPAD_TOGGLE on some models") made ideapad-laptop send KEY_TOUCHPAD_TOGGLE when we receive an ACPI notify with VPC event bit 5 set and the touchpad-state has not been changed by the EC itself already.
This was done under the assumption that this would be good to do to make the touchpad-toggle hotkey work on newer models where the EC does not toggle the touchpad on/off itself (because it is not routed through the PS/2 controller, but uses I2C).
But it turns out that at least some models, e.g. the Yoga 7-15ITL5 the EC triggers an ACPI notify with VPC event bit 5 set on resume, which would now cause a spurious KEY_TOUCHPAD_TOGGLE on resume to which the desktop environment responds by disabling the touchpad in software, breaking the touchpad (until manually re-enabled) on resume.
It was never confirmed that sending KEY_TOUCHPAD_TOGGLE actually improves things on new models and at least some new models like the Yoga 7-15ITL5 don't have a touchpad on/off toggle hotkey at all, while still sending ACPI notify events with VPC event bit 5 set.
So it seems best to revert the change to send KEY_TOUCHPAD_TOGGLE when receiving an ACPI notify events with VPC event bit 5 and the touchpad state as reported by the EC has not changed.
Note this is not a full revert the code to cache the last EC touchpad state is kept to avoid sending spurious KEY_TOUCHPAD_ON / _OFF events on resume.
Fixes: 5829f8a897e4 ("platform/x86: ideapad-laptop: Send KEY_TOUCHPAD_TOGGLE on some models") Link: https://bugzilla.kernel.org/show_bug.cgi?id=217234 Cc: stable@vger.kernel.org Signed-off-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20230330194644.64628-1-hdegoede@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/platform/x86/ideapad-laptop.c | 23 ++++++++++------------- 1 file changed, 10 insertions(+), 13 deletions(-)
--- a/drivers/platform/x86/ideapad-laptop.c +++ b/drivers/platform/x86/ideapad-laptop.c @@ -1170,7 +1170,6 @@ static const struct key_entry ideapad_ke { KE_KEY, 65, { KEY_PROG4 } }, { KE_KEY, 66, { KEY_TOUCHPAD_OFF } }, { KE_KEY, 67, { KEY_TOUCHPAD_ON } }, - { KE_KEY, 68, { KEY_TOUCHPAD_TOGGLE } }, { KE_KEY, 128, { KEY_ESC } },
/* @@ -1526,18 +1525,16 @@ static void ideapad_sync_touchpad_state( if (priv->features.ctrl_ps2_aux_port) i8042_command(¶m, value ? I8042_CMD_AUX_ENABLE : I8042_CMD_AUX_DISABLE);
- if (send_events) { - /* - * On older models the EC controls the touchpad and toggles it - * on/off itself, in this case we report KEY_TOUCHPAD_ON/_OFF. - * If the EC did not toggle, report KEY_TOUCHPAD_TOGGLE. - */ - if (value != priv->r_touchpad_val) { - ideapad_input_report(priv, value ? 67 : 66); - sysfs_notify(&priv->platform_device->dev.kobj, NULL, "touchpad"); - } else { - ideapad_input_report(priv, 68); - } + /* + * On older models the EC controls the touchpad and toggles it on/off + * itself, in this case we report KEY_TOUCHPAD_ON/_OFF. Some models do + * an acpi-notify with VPC bit 5 set on resume, so this function get + * called with send_events=true on every resume. Therefor if the EC did + * not toggle, do nothing to avoid sending spurious KEY_TOUCHPAD_TOGGLE. + */ + if (send_events && value != priv->r_touchpad_val) { + ideapad_input_report(priv, value ? 67 : 66); + sysfs_notify(&priv->platform_device->dev.kobj, NULL, "touchpad"); }
priv->r_touchpad_val = value;
From: Srinivas Pandruvada srinivas.pandruvada@linux.intel.com
commit a57cc2dbb3738930d9cb361b9b473f90c8ede0b8 upstream.
Commit 52f04f10b900 ("thermal: intel: int340x: processor_thermal: Fix deadlock") addressed deadlock issue during user space trip update. But it missed a case when thermal zone device is disabled when user writes 0.
Call to thermal_zone_device_disable() also causes deadlock as it also tries to lock tz->lock, which is already claimed by trip_point_temp_store() in the thermal core code.
Remove call to thermal_zone_device_disable() in the function sys_set_trip_temp(), which is called from trip_point_temp_store().
Fixes: 52f04f10b900 ("thermal: intel: int340x: processor_thermal: Fix deadlock") Signed-off-by: Srinivas Pandruvada srinivas.pandruvada@linux.intel.com Cc: 6.2+ stable@vger.kernel.org # 6.2+ Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/thermal/intel/int340x_thermal/processor_thermal_device_pci.c | 1 - 1 file changed, 1 deletion(-)
--- a/drivers/thermal/intel/int340x_thermal/processor_thermal_device_pci.c +++ b/drivers/thermal/intel/int340x_thermal/processor_thermal_device_pci.c @@ -181,7 +181,6 @@ static int sys_set_trip_temp(struct ther cancel_delayed_work_sync(&pci_info->work); proc_thermal_mmio_write(pci_info, PROC_THERMAL_MMIO_INT_ENABLE_0, 0); proc_thermal_mmio_write(pci_info, PROC_THERMAL_MMIO_THRES_0, 0); - thermal_zone_device_disable(tzd); pci_info->stored_thres = 0; return 0; }
From: Jens Axboe axboe@kernel.dk
commit fd7276189450110ed835eb0a334e62d2f1c4e3be upstream.
powerpc sets up PF_KTHREAD and PF_IO_WORKER with a NULL pt_regs, which from my (arguably very short) checking is not commonly done for other archs. This is fine, except when PF_IO_WORKER's have been created and the task does something that causes a coredump to be generated. Then we get this crash:
Kernel attempted to read user page (160) - exploit attempt? (uid: 1000) BUG: Kernel NULL pointer dereference on read at 0x00000160 Faulting instruction address: 0xc0000000000c3a60 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=32 NUMA pSeries Modules linked in: bochs drm_vram_helper drm_kms_helper xts binfmt_misc ecb ctr syscopyarea sysfillrect cbc sysimgblt drm_ttm_helper aes_generic ttm sg libaes evdev joydev virtio_balloon vmx_crypto gf128mul drm dm_mod fuse loop configfs drm_panel_orientation_quirks ip_tables x_tables autofs4 hid_generic usbhid hid xhci_pci xhci_hcd usbcore usb_common sd_mod CPU: 1 PID: 1982 Comm: ppc-crash Not tainted 6.3.0-rc2+ #88 Hardware name: IBM pSeries (emulated by qemu) POWER9 (raw) 0x4e1202 0xf000005 of:SLOF,HEAD hv:linux,kvm pSeries NIP: c0000000000c3a60 LR: c000000000039944 CTR: c0000000000398e0 REGS: c0000000041833b0 TRAP: 0300 Not tainted (6.3.0-rc2+) MSR: 800000000280b033 <SF,VEC,VSX,EE,FP,ME,IR,DR,RI,LE> CR: 88082828 XER: 200400f8 ... NIP memcpy_power7+0x200/0x7d0 LR ppr_get+0x64/0xb0 Call Trace: ppr_get+0x40/0xb0 (unreliable) __regset_get+0x180/0x1f0 regset_get_alloc+0x64/0x90 elf_core_dump+0xb98/0x1b60 do_coredump+0x1c34/0x24a0 get_signal+0x71c/0x1410 do_notify_resume+0x140/0x6f0 interrupt_exit_user_prepare_main+0x29c/0x320 interrupt_exit_user_prepare+0x6c/0xa0 interrupt_return_srr_user+0x8/0x138
Because ppr_get() is trying to copy from a PF_IO_WORKER with a NULL pt_regs.
Check for a valid pt_regs in both ppc_get/ppr_set, and return an error if not set. The actual error value doesn't seem to be important here, so just pick -EINVAL.
Fixes: fa439810cc1b ("powerpc/ptrace: Enable support for NT_PPPC_TAR, NT_PPC_PPR, NT_PPC_DSCR") Cc: stable@vger.kernel.org # v4.8+ Signed-off-by: Jens Axboe axboe@kernel.dk [mpe: Trim oops in change log, add Fixes & Cc stable] Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://msgid.link/d9f63344-fe7c-56ae-b420-4a1a04a2ae4c@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/kernel/ptrace/ptrace-view.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/arch/powerpc/kernel/ptrace/ptrace-view.c +++ b/arch/powerpc/kernel/ptrace/ptrace-view.c @@ -290,6 +290,9 @@ static int gpr_set(struct task_struct *t static int ppr_get(struct task_struct *target, const struct user_regset *regset, struct membuf to) { + if (!target->thread.regs) + return -EINVAL; + return membuf_write(&to, &target->thread.regs->ppr, sizeof(u64)); }
@@ -297,6 +300,9 @@ static int ppr_set(struct task_struct *t unsigned int pos, unsigned int count, const void *kbuf, const void __user *ubuf) { + if (!target->thread.regs) + return -EINVAL; + return user_regset_copyin(&pos, &count, &kbuf, &ubuf, &target->thread.regs->ppr, 0, sizeof(u64)); }
From: Haren Myneni haren@linux.ibm.com
commit eca9f6e6f83b6725b84e1c76fdde19b003cff0eb upstream.
The hypervisor supports user-mode NX from Power10.
pseries_vas_dlpar_cpu() is called from lparcfg_write() to update VAS windows for DLPAR event in shared processor mode and the kernel gets -ENOTSUPP for HCALLs if the user-mode NX is not supported. The current VAS implementation also supports only with Radix page tables. Whereas in dedicated processor mode, pseries_vas_notifier() is registered only if the copy/paste feature is enabled. So instead of displaying HCALL error messages, update VAS capabilities if the copy/paste feature is available.
This patch ignores updating VAS capabilities in pseries_vas_dlpar_cpu() and returns success if the copy/paste feature is not enabled. Then lparcfg_write() completes the processor DLPAR operations without any failures.
Fixes: 2147783d6bf0 ("powerpc/pseries: Use lparcfg to reconfig VAS windows for DLPAR CPU") Cc: stable@vger.kernel.org # v6.1+ Signed-off-by: Haren Myneni haren@linux.ibm.com Reviewed-by: Nathan Lynch nathanl@linux.ibm.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://msgid.link/1d0e727e7dbd9a28627ef08ca9df9c86a50175e2.camel@linux.ibm.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/platforms/pseries/vas.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/arch/powerpc/platforms/pseries/vas.c +++ b/arch/powerpc/platforms/pseries/vas.c @@ -857,6 +857,13 @@ int pseries_vas_dlpar_cpu(void) { int new_nr_creds, rc;
+ /* + * NX-GZIP is not enabled. Nothing to do for DLPAR event + */ + if (!copypaste_feat) + return 0; + + rc = h_query_vas_capabilities(H_QUERY_VAS_CAPABILITIES, vascaps[VAS_GZIP_DEF_FEAT_TYPE].feat, (u64)virt_to_phys(&hv_cop_caps)); @@ -1013,6 +1020,7 @@ static int __init pseries_vas_init(void) * Linux supports user space COPY/PASTE only with Radix */ if (!radix_enabled()) { + copypaste_feat = false; pr_err("API is supported only with radix page tables\n"); return -ENOTSUPP; }
From: Benjamin Gray bgray@linux.ibm.com
commit 1abce0580b89464546ae06abd5891ebec43c9470 upstream.
Userspace PROT_NONE ptes set _PAGE_PRIVILEGED, triggering a false positive debug assertion that __pte_flags_need_flush() is not called on a kernel mapping.
Detect when it is a userspace PROT_NONE page by checking the required bits of PAGE_NONE are set, and none of the RWX bits are set. pte_protnone() is insufficient here because it always returns 0 when CONFIG_NUMA_BALANCING=n.
Fixes: b11931e9adc1 ("powerpc/64s: add pte_needs_flush and huge_pmd_needs_flush") Cc: stable@vger.kernel.org # v6.1+ Reported-by: Russell Currey ruscur@russell.cc Signed-off-by: Benjamin Gray bgray@linux.ibm.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://msgid.link/20230302225947.81083-1-bgray@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/include/asm/book3s/64/tlbflush.h | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
--- a/arch/powerpc/include/asm/book3s/64/tlbflush.h +++ b/arch/powerpc/include/asm/book3s/64/tlbflush.h @@ -148,6 +148,11 @@ static inline void flush_tlb_fix_spuriou */ }
+static inline bool __pte_protnone(unsigned long pte) +{ + return (pte & (pgprot_val(PAGE_NONE) | _PAGE_RWX)) == pgprot_val(PAGE_NONE); +} + static inline bool __pte_flags_need_flush(unsigned long oldval, unsigned long newval) { @@ -164,8 +169,8 @@ static inline bool __pte_flags_need_flus /* * We do not expect kernel mappings or non-PTEs or not-present PTEs. */ - VM_WARN_ON_ONCE(oldval & _PAGE_PRIVILEGED); - VM_WARN_ON_ONCE(newval & _PAGE_PRIVILEGED); + VM_WARN_ON_ONCE(!__pte_protnone(oldval) && oldval & _PAGE_PRIVILEGED); + VM_WARN_ON_ONCE(!__pte_protnone(newval) && newval & _PAGE_PRIVILEGED); VM_WARN_ON_ONCE(!(oldval & _PAGE_PTE)); VM_WARN_ON_ONCE(!(newval & _PAGE_PTE)); VM_WARN_ON_ONCE(!(oldval & _PAGE_PRESENT));
From: Trond Myklebust trond.myklebust@hammerspace.com
commit 6165a16a5ad9b237bb3131cff4d3c601ccb8f9a3 upstream.
When we're using a cached open stateid or a delegation in order to avoid sending a CLAIM_PREVIOUS open RPC call to the server, we don't have a new open stateid to present to update_open_stateid(). Instead rely on nfs4_try_open_cached(), just as if we were doing a normal open.
Fixes: d2bfda2e7aa0 ("NFSv4: don't reprocess cached open CLAIM_PREVIOUS") Cc: stable@vger.kernel.org Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nfs/nfs4proc.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
--- a/fs/nfs/nfs4proc.c +++ b/fs/nfs/nfs4proc.c @@ -1980,8 +1980,7 @@ _nfs4_opendata_reclaim_to_nfs4_state(str if (!data->rpc_done) { if (data->rpc_status) return ERR_PTR(data->rpc_status); - /* cached opens have already been processed */ - goto update; + return nfs4_try_open_cached(data); }
ret = nfs_refresh_inode(inode, &data->f_attr); @@ -1990,7 +1989,7 @@ _nfs4_opendata_reclaim_to_nfs4_state(str
if (data->o_res.delegation_type != 0) nfs4_opendata_check_deleg(data, state); -update: + if (!update_open_stateid(state, &data->o_res.stateid, NULL, data->o_arg.fmode)) return ERR_PTR(-EAGAIN);
From: Takashi Iwai tiwai@suse.de
commit b871cb971c683f7f212e7ca3c9a6709a75785116 upstream.
The recent commit f83bb2592482 ("ALSA: hda/conexant: Add quirk for LENOVO 20149 Notebook model") introduced a quirk for the device with 17aa:3977, but this caused a regression on another model (Lenovo Ideadpad U31) with the very same PCI SSID. And, through skimming over the net, it seems that this PCI SSID is used for multiple different models, so it's no good idea to apply the quirk with the SSID.
Although we may take a different ID check (e.g. the codec SSID instead of the PCI SSID), unfortunately, the original patch author couldn't identify the hardware details any longer as the machine was returned, and we can't develop the further proper fix.
In this patch, instead, we partially revert the change so that the quirk won't be applied as default for addressing the regression. Meanwhile, the quirk function itself is kept, and it's now made to be applicable via the explicit model=lenovo-20149 option.
Fixes: f83bb2592482 ("ALSA: hda/conexant: Add quirk for LENOVO 20149 Notebook model") Reported-by: Jetro Jormalainen jje-lxkl@jetro.fi Link: https://lore.kernel.org/r/20230308215009.4d3e58a6@mopti Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230320140954.31154-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_conexant.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
--- a/sound/pci/hda/patch_conexant.c +++ b/sound/pci/hda/patch_conexant.c @@ -980,7 +980,10 @@ static const struct snd_pci_quirk cxt506 SND_PCI_QUIRK(0x17aa, 0x3905, "Lenovo G50-30", CXT_FIXUP_STEREO_DMIC), SND_PCI_QUIRK(0x17aa, 0x390b, "Lenovo G50-80", CXT_FIXUP_STEREO_DMIC), SND_PCI_QUIRK(0x17aa, 0x3975, "Lenovo U300s", CXT_FIXUP_STEREO_DMIC), - SND_PCI_QUIRK(0x17aa, 0x3977, "Lenovo IdeaPad U310", CXT_PINCFG_LENOVO_NOTEBOOK), + /* NOTE: we'd need to extend the quirk for 17aa:3977 as the same + * PCI SSID is used on multiple Lenovo models + */ + SND_PCI_QUIRK(0x17aa, 0x3977, "Lenovo IdeaPad U310", CXT_FIXUP_STEREO_DMIC), SND_PCI_QUIRK(0x17aa, 0x3978, "Lenovo G50-70", CXT_FIXUP_STEREO_DMIC), SND_PCI_QUIRK(0x17aa, 0x397b, "Lenovo S205", CXT_FIXUP_STEREO_DMIC), SND_PCI_QUIRK_VENDOR(0x17aa, "Thinkpad", CXT_FIXUP_THINKPAD_ACPI), @@ -1003,6 +1006,7 @@ static const struct hda_model_fixup cxt5 { .id = CXT_FIXUP_MUTE_LED_GPIO, .name = "mute-led-gpio" }, { .id = CXT_FIXUP_HP_ZBOOK_MUTE_LED, .name = "hp-zbook-mute-led" }, { .id = CXT_FIXUP_HP_MIC_NO_PRESENCE, .name = "hp-mic-fix" }, + { .id = CXT_PINCFG_LENOVO_NOTEBOOK, .name = "lenovo-20149" }, {} };
From: Takashi Iwai tiwai@suse.de
commit fa4e7a6fa12b1132340785e14bd439cbe95b7a5a upstream.
It's been reported that the recent kernel can't probe the PCM devices on Roland VS-100 properly, and it turned out to be a regression by the recent addition of the bit shift range check for the format bits. In the old code, we just did bit-shift and it resulted in zero, which is then corrected to the standard PCM format, while the new code explicitly returns an error in such a case.
For addressing the regression, relax the check and fallback to the standard PCM type (with the info output).
Fixes: 43d5ca88dfcd ("ALSA: usb-audio: Fix potential out-of-bounds shift") Cc: stable@vger.kernel.org Link: https://bugzilla.kernel.org/show_bug.cgi?id=217084 Link: https://lore.kernel.org/r/20230324075005.19403-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/usb/format.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/sound/usb/format.c +++ b/sound/usb/format.c @@ -39,8 +39,12 @@ static u64 parse_audio_format_i_type(str case UAC_VERSION_1: default: { struct uac_format_type_i_discrete_descriptor *fmt = _fmt; - if (format >= 64) - return 0; /* invalid format */ + if (format >= 64) { + usb_audio_info(chip, + "%u:%d: invalid format type 0x%llx is detected, processed as PCM\n", + fp->iface, fp->altsetting, format); + format = UAC_FORMAT_TYPE_I_PCM; + } sample_width = fmt->bBitResolution; sample_bytes = fmt->bSubframeSize; format = 1ULL << format;
From: Tim Crawford tcrawford@system76.com
commit b7a5822810c4398515300d614d988cf638adecad upstream.
Add the audio quirk for some of Clevo's latest RPL laptops:
- NP50RNJS (ALC256) - NP70SNE (ALC256) - PD50SNE (ALC1220) - PE60RNE (ALC1220)
Co-authored-by: Jeremy Soller jeremy@system76.com Signed-off-by: Tim Crawford tcrawford@system76.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230317141825.11807-1-tcrawford@system76.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -2631,6 +2631,7 @@ static const struct snd_pci_quirk alc882 SND_PCI_QUIRK(0x1558, 0x65e5, "Clevo PC50D[PRS](?:-D|-G)?", ALC1220_FIXUP_CLEVO_PB51ED_PINS), SND_PCI_QUIRK(0x1558, 0x65f1, "Clevo PC50HS", ALC1220_FIXUP_CLEVO_PB51ED_PINS), SND_PCI_QUIRK(0x1558, 0x65f5, "Clevo PD50PN[NRT]", ALC1220_FIXUP_CLEVO_PB51ED_PINS), + SND_PCI_QUIRK(0x1558, 0x66a2, "Clevo PE60RNE", ALC1220_FIXUP_CLEVO_PB51ED_PINS), SND_PCI_QUIRK(0x1558, 0x67d1, "Clevo PB71[ER][CDF]", ALC1220_FIXUP_CLEVO_PB51ED_PINS), SND_PCI_QUIRK(0x1558, 0x67e1, "Clevo PB71[DE][CDF]", ALC1220_FIXUP_CLEVO_PB51ED_PINS), SND_PCI_QUIRK(0x1558, 0x67e5, "Clevo PC70D[PRS](?:-D|-G)?", ALC1220_FIXUP_CLEVO_PB51ED_PINS), @@ -2651,6 +2652,7 @@ static const struct snd_pci_quirk alc882 SND_PCI_QUIRK(0x1558, 0x96e1, "Clevo P960[ER][CDFN]-K", ALC1220_FIXUP_CLEVO_P950), SND_PCI_QUIRK(0x1558, 0x97e1, "Clevo P970[ER][CDFN]", ALC1220_FIXUP_CLEVO_P950), SND_PCI_QUIRK(0x1558, 0x97e2, "Clevo P970RC-M", ALC1220_FIXUP_CLEVO_P950), + SND_PCI_QUIRK(0x1558, 0xd502, "Clevo PD50SNE", ALC1220_FIXUP_CLEVO_PB51ED_PINS), SND_PCI_QUIRK_VENDOR(0x1558, "Clevo laptop", ALC882_FIXUP_EAPD), SND_PCI_QUIRK(0x161f, 0x2054, "Medion laptop", ALC883_FIXUP_EAPD), SND_PCI_QUIRK(0x17aa, 0x3a0d, "Lenovo Y530", ALC882_FIXUP_LENOVO_Y530), @@ -9574,6 +9576,7 @@ static const struct snd_pci_quirk alc269 SND_PCI_QUIRK(0x1558, 0x5101, "Clevo S510WU", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1558, 0x5157, "Clevo W517GU1", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1558, 0x51a1, "Clevo NS50MU", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE), + SND_PCI_QUIRK(0x1558, 0x5630, "Clevo NP50RNJS", ALC256_FIXUP_SYSTEM76_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1558, 0x70a1, "Clevo NB70T[HJK]", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1558, 0x70b3, "Clevo NK70SB", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1558, 0x70f2, "Clevo NH79EPY", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE), @@ -9608,6 +9611,7 @@ static const struct snd_pci_quirk alc269 SND_PCI_QUIRK(0x1558, 0x971d, "Clevo N970T[CDF]", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1558, 0xa500, "Clevo NL5[03]RU", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1558, 0xa600, "Clevo NL50NU", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE), + SND_PCI_QUIRK(0x1558, 0xa671, "Clevo NP70SN[CDE]", ALC256_FIXUP_SYSTEM76_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1558, 0xb018, "Clevo NP50D[BE]", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1558, 0xb019, "Clevo NH77D[BE]Q", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1558, 0xb022, "Clevo NH77D[DC][QW]", ALC293_FIXUP_SYSTEM76_MIC_NO_PRESENCE),
From: huangwenhui huangwenhuia@uniontech.com
commit 52aad39385e1bfdb34a1b405f699a8ef302c58b0 upstream.
Fix headset microphone detection on Lenovo ZhaoYang CF4620Z.
[ adjusted to be applicable to the latest tree -- tiwai ]
Signed-off-by: huangwenhui huangwenhuia@uniontech.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230328074644.30142-1-huangwenhuia@uniontech.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 1 + 1 file changed, 1 insertion(+)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -9712,6 +9712,7 @@ static const struct snd_pci_quirk alc269 SND_PCI_QUIRK(0x17aa, 0x511e, "Thinkpad", ALC298_FIXUP_TPT470_DOCK), SND_PCI_QUIRK(0x17aa, 0x511f, "Thinkpad", ALC298_FIXUP_TPT470_DOCK), SND_PCI_QUIRK(0x17aa, 0x9e54, "LENOVO NB", ALC269_FIXUP_LENOVO_EAPD), + SND_PCI_QUIRK(0x17aa, 0x9e56, "Lenovo ZhaoYang CF4620Z", ALC286_FIXUP_SONY_MIC_NO_PRESENCE), SND_PCI_QUIRK(0x1849, 0x1233, "ASRock NUC Box 1100", ALC233_FIXUP_NO_AUDIO_JACK), SND_PCI_QUIRK(0x1849, 0xa233, "Positivo Master C6300", ALC269_FIXUP_HEADSET_MIC), SND_PCI_QUIRK(0x19e5, 0x3204, "Huawei MACH-WX9", ALC256_FIXUP_HUAWEI_MACH_WX9_PINS),
From: Max Filippov jcmvbkbc@gmail.com
commit 1d3b7a788ca7435156809a6bd5b20c95b2370d45 upstream.
show_stack dumps raw stack contents which may trigger an unnecessary KASAN report. Fix it by copying stack contents to a temporary buffer with __memcpy and then printing that buffer instead of passing stack pointer directly to the print_hex_dump.
Cc: stable@vger.kernel.org Signed-off-by: Max Filippov jcmvbkbc@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/xtensa/kernel/traps.c | 16 ++++++++++++---- 1 file changed, 12 insertions(+), 4 deletions(-)
--- a/arch/xtensa/kernel/traps.c +++ b/arch/xtensa/kernel/traps.c @@ -539,7 +539,7 @@ static size_t kstack_depth_to_print = CO
void show_stack(struct task_struct *task, unsigned long *sp, const char *loglvl) { - size_t len; + size_t len, off = 0;
if (!sp) sp = stack_pointer(task); @@ -548,9 +548,17 @@ void show_stack(struct task_struct *task kstack_depth_to_print * STACK_DUMP_ENTRY_SIZE);
printk("%sStack:\n", loglvl); - print_hex_dump(loglvl, " ", DUMP_PREFIX_NONE, - STACK_DUMP_LINE_SIZE, STACK_DUMP_ENTRY_SIZE, - sp, len, false); + while (off < len) { + u8 line[STACK_DUMP_LINE_SIZE]; + size_t line_len = len - off > STACK_DUMP_LINE_SIZE ? + STACK_DUMP_LINE_SIZE : len - off; + + __memcpy(line, (u8 *)sp + off, line_len); + print_hex_dump(loglvl, " ", DUMP_PREFIX_NONE, + STACK_DUMP_LINE_SIZE, STACK_DUMP_ENTRY_SIZE, + line, line_len, false); + off += STACK_DUMP_LINE_SIZE; + } show_trace(task, sp, loglvl); }
From: Douglas Raillard douglas.raillard@arm.com
commit d18a04157fc171fd48075e3dc96471bd3b87f0dd upstream.
Fix the rcutorturename field so that its size is correctly reported in the text format embedded in trace.dat files. As it stands, it is reported as being of size 1:
field:char rcutorturename[8]; offset:8; size:1; signed:0;
Signed-off-by: Douglas Raillard douglas.raillard@arm.com Reviewed-by: Mukesh Ojha quic_mojha@quicinc.com Cc: stable@vger.kernel.org Fixes: 04ae87a52074e ("ftrace: Rework event_create_dir()") Reviewed-by: Steven Rostedt (Google) rostedt@goodmis.org [ boqun: Add "Cc" and "Fixes" tags per Steven ] Signed-off-by: Boqun Feng boqun.feng@gmail.com Signed-off-by: Paul E. McKenney paulmck@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/trace/events/rcu.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/include/trace/events/rcu.h +++ b/include/trace/events/rcu.h @@ -768,7 +768,7 @@ TRACE_EVENT_RCU(rcu_torture_read, TP_ARGS(rcutorturename, rhp, secs, c_old, c),
TP_STRUCT__entry( - __field(char, rcutorturename[RCUTORTURENAME_LEN]) + __array(char, rcutorturename, RCUTORTURENAME_LEN) __field(struct rcu_head *, rhp) __field(unsigned long, secs) __field(unsigned long, c_old)
From: Geert Uytterhoeven geert+renesas@glider.be
commit a56cde41340ac4049fa6edac9e6cfbcd2804074e upstream.
SPI EEPROMs typically support both SPI Mode 0 (CPOL=CPHA=0) and Mode 3 (CPOL=CPHA=1). However, using the latter is currently flagged as an error by "make dtbs_check", e.g.:
arch/arm/boot/dts/r8a7791-koelsch.dtb: flash@0: Unevaluated properties are not allowed ('spi-cpha', 'spi-cpol' were unexpected) From schema: Documentation/devicetree/bindings/mtd/jedec,spi-nor.yaml
Fix this by documenting support for CPOL=CPHA=1.
Fixes: 233363aba72ac638 ("spi/panel: dt-bindings: drop CPHA and CPOL from common properties") Cc: stable@vger.kernel.org Signed-off-by: Geert Uytterhoeven geert+renesas@glider.be Reviewed-by: Miquel Raynal miquel.raynal@bootlin.com Reviewed-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Reviewed-by: Tudor Ambarus tudor.ambarus@linaro.org Signed-off-by: Miquel Raynal miquel.raynal@bootlin.com Link: https://lore.kernel.org/linux-mtd/afe470603028db9374930b0c57464b1f6d52bdd3.1... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/devicetree/bindings/mtd/jedec,spi-nor.yaml | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/Documentation/devicetree/bindings/mtd/jedec,spi-nor.yaml +++ b/Documentation/devicetree/bindings/mtd/jedec,spi-nor.yaml @@ -76,6 +76,13 @@ properties: If "broken-flash-reset" is present then having this property does not make any difference.
+ spi-cpol: true + spi-cpha: true + +dependencies: + spi-cpol: [ spi-cpha ] + spi-cpha: [ spi-cpol ] + unevaluatedProperties: false
examples:
From: Heiko Carstens hca@linux.ibm.com
commit 89aba4c26fae4e459f755a18912845c348ee48f3 upstream.
Add missing earlyclobber annotation to size, to, and tmp2 operands of the __clear_user() inline assembly since they are modified or written to before the last usage of all input operands. This can lead to incorrect register allocation for the inline assembly.
Fixes: 6c2a9e6df604 ("[S390] Use alternative user-copy operations for new hardware.") Reported-by: Mark Rutland mark.rutland@arm.com Link: https://lore.kernel.org/all/20230321122514.1743889-3-mark.rutland@arm.com/ Cc: stable@vger.kernel.org Reviewed-by: Gerald Schaefer gerald.schaefer@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/s390/lib/uaccess.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/s390/lib/uaccess.c +++ b/arch/s390/lib/uaccess.c @@ -172,7 +172,7 @@ unsigned long __clear_user(void __user * "4: slgr %0,%0\n" "5:\n" EX_TABLE(0b,2b) EX_TABLE(6b,2b) EX_TABLE(3b,5b) EX_TABLE(7b,5b) - : "+a" (size), "+a" (to), "+a" (tmp1), "=a" (tmp2) + : "+&a" (size), "+&a" (to), "+a" (tmp1), "=&a" (tmp2) : "a" (empty_zero_page), [spec] "d" (spec.val) : "cc", "memory", "0"); return size;
From: Jiri Slaby (SUSE) jirislaby@kernel.org
commit 7bb2107e63d8a4a13bbb6fe0e1cbd68784a2e9ac upstream.
Expolines depend on scripts/basic/fixdep. And build of expolines can now race with the fixdep build:
make[1]: *** Deleting file 'arch/s390/lib/expoline/expoline.o' /bin/sh: line 1: scripts/basic/fixdep: Permission denied make[1]: *** [../scripts/Makefile.build:385: arch/s390/lib/expoline/expoline.o] Error 126 make: *** [../arch/s390/Makefile:166: expoline_prepare] Error 2
The dependence was removed in the below Fixes: commit. So reintroduce the dependence on scripts.
Fixes: a0b0987a7811 ("s390/nospec: remove unneeded header includes") Cc: Joe Lawrence joe.lawrence@redhat.com Cc: stable@vger.kernel.org Cc: Heiko Carstens hca@linux.ibm.com Cc: Vasily Gorbik gor@linux.ibm.com Cc: Alexander Gordeev agordeev@linux.ibm.com Cc: Christian Borntraeger borntraeger@linux.ibm.com Cc: Sven Schnelle svens@linux.ibm.com Cc: linux-s390@vger.kernel.org Signed-off-by: Jiri Slaby (SUSE) jirislaby@kernel.org Link: https://lore.kernel.org/r/20230316112809.7903-1-jirislaby@kernel.org Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/s390/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/s390/Makefile +++ b/arch/s390/Makefile @@ -162,7 +162,7 @@ vdso_prepare: prepare0
ifdef CONFIG_EXPOLINE_EXTERN modules_prepare: expoline_prepare -expoline_prepare: +expoline_prepare: scripts $(Q)$(MAKE) $(build)=arch/s390/lib/expoline arch/s390/lib/expoline/expoline.o endif endif
From: Lucas Stach l.stach@pengutronix.de
commit 963b2e8c428f79489ceeb058e8314554ec9cbe6f upstream.
drm_gem_prime_mmap() takes a reference on the GEM object, but before that drm_gem_mmap_obj() already takes a reference, which will be leaked as only one reference is dropped when the mapping is closed. Drop the extra reference when dma_buf_mmap() succeeds.
Cc: stable@vger.kernel.org Signed-off-by: Lucas Stach l.stach@pengutronix.de Reviewed-by: Christian Gmeiner christian.gmeiner@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c +++ b/drivers/gpu/drm/etnaviv/etnaviv_gem_prime.c @@ -91,7 +91,15 @@ static void *etnaviv_gem_prime_vmap_impl static int etnaviv_gem_prime_mmap_obj(struct etnaviv_gem_object *etnaviv_obj, struct vm_area_struct *vma) { - return dma_buf_mmap(etnaviv_obj->base.dma_buf, vma, 0); + int ret; + + ret = dma_buf_mmap(etnaviv_obj->base.dma_buf, vma, 0); + if (!ret) { + /* Drop the reference acquired by drm_gem_mmap_obj(). */ + drm_gem_object_put(&etnaviv_obj->base); + } + + return ret; }
static const struct etnaviv_gem_ops etnaviv_gem_prime_ops = {
From: Tim Huang tim.huang@amd.com
commit 2fec9dc8e0acc3dfb56d1389151bcf405f087b10 upstream.
Skip mode2 reset only for IMU enabled APUs when do S4.
This patch is to fix the regression issue https://gitlab.freedesktop.org/drm/amd/-/issues/2483 It is generated by commit b589626674de ("drm/amdgpu: skip ASIC reset for APUs when go to S4").
Fixes: b589626674de ("drm/amdgpu: skip ASIC reset for APUs when go to S4") Link: https://gitlab.freedesktop.org/drm/amd/-/issues/2483 Tested-by: Yuan Perry Perry.Yuan@amd.com Signed-off-by: Tim Huang tim.huang@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org # 6.1.x Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_acpi.c @@ -981,7 +981,12 @@ static bool amdgpu_atcs_pci_probe_handle */ bool amdgpu_acpi_should_gpu_reset(struct amdgpu_device *adev) { - if (adev->flags & AMD_IS_APU) + if ((adev->flags & AMD_IS_APU) && + adev->gfx.imu.funcs) /* Not need to do mode2 reset for IMU enabled APUs */ + return false; + + if ((adev->flags & AMD_IS_APU) && + amdgpu_acpi_is_s3_active(adev)) return false;
if (amdgpu_sriov_vf(adev))
From: Fangzhi Zuo Jerry.Zuo@amd.com
commit f4f3b7dedbe849e780c779ba67365bb1db0d8637 upstream.
Traditional synaptics hub has one MST branch device without virtual dpcd. Synaptics cascaded hub has two chained MST branch devices. DSC decoding is performed via root MST branch device, instead of the second MST branch device.
Reviewed-by: Hersen Wu hersenxs.wu@amd.com Acked-by: Qingqing Zhuo qingqing.zhuo@amd.com Signed-off-by: Fangzhi Zuo Jerry.Zuo@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 19 ++++++++++++ drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.h | 12 +++++++ 2 files changed, 31 insertions(+)
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c @@ -208,6 +208,21 @@ bool needs_dsc_aux_workaround(struct dc_ return false; }
+bool is_synaptics_cascaded_panamera(struct dc_link *link, struct drm_dp_mst_port *port) +{ + u8 branch_vendor_data[4] = { 0 }; // Vendor data 0x50C ~ 0x50F + + if (drm_dp_dpcd_read(port->mgr->aux, DP_BRANCH_VENDOR_SPECIFIC_START, &branch_vendor_data, 4) == 4) { + if (link->dpcd_caps.branch_dev_id == DP_BRANCH_DEVICE_ID_90CC24 && + IS_SYNAPTICS_CASCADED_PANAMERA(link->dpcd_caps.branch_dev_name, branch_vendor_data)) { + DRM_INFO("Synaptics Cascaded MST hub\n"); + return true; + } + } + + return false; +} + static bool validate_dsc_caps_on_connector(struct amdgpu_dm_connector *aconnector) { struct dc_sink *dc_sink = aconnector->dc_sink; @@ -231,6 +246,10 @@ static bool validate_dsc_caps_on_connect needs_dsc_aux_workaround(aconnector->dc_link)) aconnector->dsc_aux = &aconnector->mst_port->dm_dp_aux.aux;
+ /* synaptics cascaded MST hub case */ + if (!aconnector->dsc_aux && is_synaptics_cascaded_panamera(aconnector->dc_link, port)) + aconnector->dsc_aux = port->mgr->aux; + if (!aconnector->dsc_aux) return false;
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.h +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.h @@ -34,6 +34,18 @@ #define SYNAPTICS_RC_OFFSET 0x4BC #define SYNAPTICS_RC_DATA 0x4C0
+#define DP_BRANCH_VENDOR_SPECIFIC_START 0x50C + +/** + * Panamera MST Hub detection + * Offset DPCD 050Eh == 0x5A indicates cascaded MST hub case + * Check from beginning of branch device vendor specific field (050Ch) + */ +#define IS_SYNAPTICS_PANAMERA(branchDevName) (((int)branchDevName[4] & 0xF0) == 0x50 ? 1 : 0) +#define BRANCH_HW_REVISION_PANAMERA_A2 0x10 +#define SYNAPTICS_CASCADED_HUB_ID 0x5A +#define IS_SYNAPTICS_CASCADED_PANAMERA(devName, data) ((IS_SYNAPTICS_PANAMERA(devName) && ((int)data[2] == SYNAPTICS_CASCADED_HUB_ID)) ? 1 : 0) + struct amdgpu_display_manager; struct amdgpu_dm_connector;
From: Fangzhi Zuo Jerry.Zuo@amd.com
commit 68dc1846c3a44d5e633be145c169ce2fd5420695 upstream.
8b/10b encoding needs to add 3% fec overhead into the pbn. In the Synapcis Cascaded MST hub, the first stage MST branch device needs the information to determine the timeslot count for the second stage MST branch device. Missing this overhead will leads to insufficient timeslot allocation.
Cc: stable@vger.kernel.org Cc: Mario Limonciello mario.limonciello@amd.com Reviewed-by: Hersen Wu hersenxs.wu@amd.com Acked-by: Qingqing Zhuo qingqing.zhuo@amd.com Signed-off-by: Fangzhi Zuo Jerry.Zuo@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c | 32 +++++++++--- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.h | 3 + 2 files changed, 27 insertions(+), 8 deletions(-)
--- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.c @@ -646,12 +646,25 @@ struct dsc_mst_fairness_params { struct amdgpu_dm_connector *aconnector; };
-static int kbps_to_peak_pbn(int kbps) +static uint16_t get_fec_overhead_multiplier(struct dc_link *dc_link) +{ + u8 link_coding_cap; + uint16_t fec_overhead_multiplier_x1000 = PBN_FEC_OVERHEAD_MULTIPLIER_8B_10B; + + link_coding_cap = dc_link_dp_mst_decide_link_encoding_format(dc_link); + if (link_coding_cap == DP_128b_132b_ENCODING) + fec_overhead_multiplier_x1000 = PBN_FEC_OVERHEAD_MULTIPLIER_128B_132B; + + return fec_overhead_multiplier_x1000; +} + +static int kbps_to_peak_pbn(int kbps, uint16_t fec_overhead_multiplier_x1000) { u64 peak_kbps = kbps;
peak_kbps *= 1006; - peak_kbps = div_u64(peak_kbps, 1000); + peak_kbps *= fec_overhead_multiplier_x1000; + peak_kbps = div_u64(peak_kbps, 1000 * 1000); return (int) DIV64_U64_ROUND_UP(peak_kbps * 64, (54 * 8 * 1000)); }
@@ -745,11 +758,12 @@ static int increase_dsc_bpp(struct drm_a int link_timeslots_used; int fair_pbn_alloc; int ret = 0; + uint16_t fec_overhead_multiplier_x1000 = get_fec_overhead_multiplier(dc_link);
for (i = 0; i < count; i++) { if (vars[i + k].dsc_enabled) { initial_slack[i] = - kbps_to_peak_pbn(params[i].bw_range.max_kbps) - vars[i + k].pbn; + kbps_to_peak_pbn(params[i].bw_range.max_kbps, fec_overhead_multiplier_x1000) - vars[i + k].pbn; bpp_increased[i] = false; remaining_to_increase += 1; } else { @@ -845,6 +859,7 @@ static int try_disable_dsc(struct drm_at int next_index; int remaining_to_try = 0; int ret; + uint16_t fec_overhead_multiplier_x1000 = get_fec_overhead_multiplier(dc_link);
for (i = 0; i < count; i++) { if (vars[i + k].dsc_enabled @@ -874,7 +889,7 @@ static int try_disable_dsc(struct drm_at if (next_index == -1) break;
- vars[next_index].pbn = kbps_to_peak_pbn(params[next_index].bw_range.stream_kbps); + vars[next_index].pbn = kbps_to_peak_pbn(params[next_index].bw_range.stream_kbps, fec_overhead_multiplier_x1000); ret = drm_dp_atomic_find_time_slots(state, params[next_index].port->mgr, params[next_index].port, @@ -887,7 +902,7 @@ static int try_disable_dsc(struct drm_at vars[next_index].dsc_enabled = false; vars[next_index].bpp_x16 = 0; } else { - vars[next_index].pbn = kbps_to_peak_pbn(params[next_index].bw_range.max_kbps); + vars[next_index].pbn = kbps_to_peak_pbn(params[next_index].bw_range.max_kbps, fec_overhead_multiplier_x1000); ret = drm_dp_atomic_find_time_slots(state, params[next_index].port->mgr, params[next_index].port, @@ -916,6 +931,7 @@ static int compute_mst_dsc_configs_for_l int count = 0; int i, k, ret; bool debugfs_overwrite = false; + uint16_t fec_overhead_multiplier_x1000 = get_fec_overhead_multiplier(dc_link);
memset(params, 0, sizeof(params));
@@ -977,7 +993,7 @@ static int compute_mst_dsc_configs_for_l /* Try no compression */ for (i = 0; i < count; i++) { vars[i + k].aconnector = params[i].aconnector; - vars[i + k].pbn = kbps_to_peak_pbn(params[i].bw_range.stream_kbps); + vars[i + k].pbn = kbps_to_peak_pbn(params[i].bw_range.stream_kbps, fec_overhead_multiplier_x1000); vars[i + k].dsc_enabled = false; vars[i + k].bpp_x16 = 0; ret = drm_dp_atomic_find_time_slots(state, params[i].port->mgr, params[i].port, @@ -996,7 +1012,7 @@ static int compute_mst_dsc_configs_for_l /* Try max compression */ for (i = 0; i < count; i++) { if (params[i].compression_possible && params[i].clock_force_enable != DSC_CLK_FORCE_DISABLE) { - vars[i + k].pbn = kbps_to_peak_pbn(params[i].bw_range.min_kbps); + vars[i + k].pbn = kbps_to_peak_pbn(params[i].bw_range.min_kbps, fec_overhead_multiplier_x1000); vars[i + k].dsc_enabled = true; vars[i + k].bpp_x16 = params[i].bw_range.min_target_bpp_x16; ret = drm_dp_atomic_find_time_slots(state, params[i].port->mgr, @@ -1004,7 +1020,7 @@ static int compute_mst_dsc_configs_for_l if (ret < 0) return ret; } else { - vars[i + k].pbn = kbps_to_peak_pbn(params[i].bw_range.stream_kbps); + vars[i + k].pbn = kbps_to_peak_pbn(params[i].bw_range.stream_kbps, fec_overhead_multiplier_x1000); vars[i + k].dsc_enabled = false; vars[i + k].bpp_x16 = 0; ret = drm_dp_atomic_find_time_slots(state, params[i].port->mgr, --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.h +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm_mst_types.h @@ -46,6 +46,9 @@ #define SYNAPTICS_CASCADED_HUB_ID 0x5A #define IS_SYNAPTICS_CASCADED_PANAMERA(devName, data) ((IS_SYNAPTICS_PANAMERA(devName) && ((int)data[2] == SYNAPTICS_CASCADED_HUB_ID)) ? 1 : 0)
+#define PBN_FEC_OVERHEAD_MULTIPLIER_8B_10B 1031 +#define PBN_FEC_OVERHEAD_MULTIPLIER_128B_132B 1000 + struct amdgpu_display_manager; struct amdgpu_dm_connector;
From: Chris Wilson chris.p.wilson@linux.intel.com
commit d032ca43f2c80049ce5aabd3f208dc3849359497 upstream.
i915_gem_object_create_lmem_from_data() lacks the flush of the data written to lmem to ensure the object is marked as dirty and the writes flushed to the backing store. Once created, we can immediately release the obj->mm.mapping caching of the vmap.
Fixes: 7acbbc7cf485 ("drm/i915/guc: put all guc objects in lmem when available") Cc: Matthew Auld matthew.auld@intel.com Cc: Daniele Ceraolo Spurio daniele.ceraolospurio@intel.com Cc: Andi Shyti andi.shyti@linux.intel.com Cc: Matthew Brost matthew.brost@intel.com Cc: John Harrison John.C.Harrison@Intel.com Signed-off-by: Chris Wilson chris.p.wilson@linux.intel.com Cc: stable@vger.kernel.org # v5.16+ Signed-off-by: Nirmoy Das nirmoy.das@intel.com Reviewed-by: Andi Shyti andi.shyti@linux.intel.com Reviewed-by: Nirmoy Das nirmoy.das@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20230316165918.13074-1-nirmoy.... (cherry picked from commit e2ee10474ce766686e7a7496585cdfaf79e3a1bf) Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/gem/i915_gem_lmem.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/i915/gem/i915_gem_lmem.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_lmem.c @@ -127,7 +127,8 @@ i915_gem_object_create_lmem_from_data(st
memcpy(map, data, size);
- i915_gem_object_unpin_map(obj); + i915_gem_object_flush_map(obj); + __i915_gem_object_release_map(obj);
return obj; }
From: Ville Syrjälä ville.syrjala@linux.intel.com
commit 3413881e1ecc3cba722a2e87ec099692eed5be28 upstream.
Currently i915_gem_object_is_framebuffer() doesn't treat the BO containing the framebuffer's DPT as a framebuffer itself. This means eg. that the shrinker can evict the DPT BO while leaving the actual FB BO bound, when the DPT is allocated from regular shmem.
That causes an immediate oops during hibernate as we try to rewrite the PTEs inside the already evicted DPT obj.
TODO: presumably this might also be the reason for the DPT related display faults under heavy memory pressure, but I'm still not sure how that would happen as the object should be pinned by intel_dpt_pin() while in active use by the display engine...
Cc: stable@vger.kernel.org Cc: Juha-Pekka Heikkila juhapekka.heikkila@gmail.com Cc: Matthew Auld matthew.auld@intel.com Cc: Imre Deak imre.deak@intel.com Fixes: 0dc987b699ce ("drm/i915/display: Add smem fallback allocation for dpt") Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20230320090522.9909-2-ville.sy... Reviewed-by: Juha-Pekka Heikkila juhapekka.heikkila@gmail.com (cherry picked from commit 779cb5ba64ec7df80675a956c9022929514f517a) Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/display/intel_dpt.c | 2 ++ drivers/gpu/drm/i915/gem/i915_gem_object.h | 2 +- drivers/gpu/drm/i915/gem/i915_gem_object_types.h | 3 +++ 3 files changed, 6 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/i915/display/intel_dpt.c +++ b/drivers/gpu/drm/i915/display/intel_dpt.c @@ -301,6 +301,7 @@ intel_dpt_create(struct intel_framebuffe vm->pte_encode = gen8_ggtt_pte_encode;
dpt->obj = dpt_obj; + dpt->obj->is_dpt = true;
return &dpt->vm; } @@ -309,5 +310,6 @@ void intel_dpt_destroy(struct i915_addre { struct i915_dpt *dpt = i915_vm_to_dpt(vm);
+ dpt->obj->is_dpt = false; i915_vm_put(&dpt->vm); } --- a/drivers/gpu/drm/i915/gem/i915_gem_object.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object.h @@ -319,7 +319,7 @@ i915_gem_object_never_mmap(const struct static inline bool i915_gem_object_is_framebuffer(const struct drm_i915_gem_object *obj) { - return READ_ONCE(obj->frontbuffer); + return READ_ONCE(obj->frontbuffer) || obj->is_dpt; }
static inline unsigned int --- a/drivers/gpu/drm/i915/gem/i915_gem_object_types.h +++ b/drivers/gpu/drm/i915/gem/i915_gem_object_types.h @@ -491,6 +491,9 @@ struct drm_i915_gem_object { */ unsigned int cache_dirty:1;
+ /* @is_dpt: Object houses a display page table (DPT) */ + unsigned int is_dpt:1; + /** * @read_domains: Read memory domains. *
From: Ville Syrjälä ville.syrjala@linux.intel.com
commit a2b6e99d8a623544f3bdccd28ee35b9c1b00daa5 upstream.
Keeping DC states enabled is incompatible with the _noarm()/_arm() split we use for writing pipe/plane registers. When DC5 and PSR are enabled, all pipe/plane registers effectively become self-arming on account of DC5 exit arming the update, and PSR exit latching it.
What probably saves us most of the time is that (with PIPE_MISC[21]=0) all pipe register writes themselves trigger PSR exit, and then we don't re-enter PSR until the idle frame count has elapsed. So it may be that the PSR exit happens already before we've updated the state too much.
Also the PSR1 panel (at least on this KBL) seems to discard the first frame we trasmit, presumably still scanning out from its internal framebuffer at that point. So only the second frame we transmit is actually visible. But I suppose that could also be panel specific behaviour. I haven't checked out how other PSR panels behave, nor did I bother to check what the eDP spec has to say about this.
And since this really is all about DC states, let's switch from the MODESET domain to the DC_OFF domain. Functionally they are 100% identical. We should probably remove the MODESET domain...
And for good measure let's toss in an assert to the place where we do the _noarm() register writes to make sure DC states are in fact off.
v2: Just use intel_display_power_is_enabled() (Imre)
Cc: stable@vger.kernel.org #v5.17+ Cc: Manasi Navare navaremanasi@google.com Cc: Drew Davenport ddavenport@chromium.org Cc: Jouni Högander jouni.hogander@intel.com Reviewed-by: Imre Deak imre.deak@intel.com Fixes: d13dde449580 ("drm/i915: Split pipe+output CSC programming to noarm+arm pair") Fixes: f8a005eb8972 ("drm/i915: Optimize icl+ universal plane programming") Fixes: 890b6ec4a522 ("drm/i915: Split skl+ plane update into noarm+arm pair") Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20230320183532.17727-1-ville.s... (cherry picked from commit 41b4c7fe72b6105a4b49395eea9aa40cef94288d) Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/display/intel_display.c | 28 ++++++++++++++++++++++++--- 1 file changed, 25 insertions(+), 3 deletions(-)
--- a/drivers/gpu/drm/i915/display/intel_display.c +++ b/drivers/gpu/drm/i915/display/intel_display.c @@ -7107,6 +7107,8 @@ static void intel_update_crtc(struct int
intel_fbc_update(state, crtc);
+ drm_WARN_ON(&i915->drm, !intel_display_power_is_enabled(i915, POWER_DOMAIN_DC_OFF)); + if (!modeset && intel_crtc_needs_color_update(new_crtc_state)) intel_color_commit_noarm(new_crtc_state); @@ -7480,8 +7482,28 @@ static void intel_atomic_commit_tail(str drm_atomic_helper_wait_for_dependencies(&state->base); drm_dp_mst_atomic_wait_for_dependencies(&state->base);
- if (state->modeset) - wakeref = intel_display_power_get(dev_priv, POWER_DOMAIN_MODESET); + /* + * During full modesets we write a lot of registers, wait + * for PLLs, etc. Doing that while DC states are enabled + * is not a good idea. + * + * During fastsets and other updates we also need to + * disable DC states due to the following scenario: + * 1. DC5 exit and PSR exit happen + * 2. Some or all _noarm() registers are written + * 3. Due to some long delay PSR is re-entered + * 4. DC5 entry -> DMC saves the already written new + * _noarm() registers and the old not yet written + * _arm() registers + * 5. DC5 exit -> DMC restores a mixture of old and + * new register values and arms the update + * 6. PSR exit -> hardware latches a mixture of old and + * new register values -> corrupted frame, or worse + * 7. New _arm() registers are finally written + * 8. Hardware finally latches a complete set of new + * register values, and subsequent frames will be OK again + */ + wakeref = intel_display_power_get(dev_priv, POWER_DOMAIN_DC_OFF);
intel_atomic_prepare_plane_clear_colors(state);
@@ -7625,8 +7647,8 @@ static void intel_atomic_commit_tail(str * the culprit. */ intel_uncore_arm_unclaimed_mmio_detection(&dev_priv->uncore); - intel_display_power_put(dev_priv, POWER_DOMAIN_MODESET, wakeref); } + intel_display_power_put(dev_priv, POWER_DOMAIN_DC_OFF, wakeref); intel_runtime_pm_put(&dev_priv->runtime_pm, state->wakeref);
/*
From: Ville Syrjälä ville.syrjala@linux.intel.com
commit 76b767d4d1cd052e455cf18e06929e8b2b70101d upstream.
We're going to want different behavior for skl/glk vs. icl in .color_commit_noarm(), so split the hook into two. Arguably we already had slightly different behaviour since csc_enable/gamma_enable are never set on icl+, so the old code was perhaps a bit confusing as well.
Cc: stable@vger.kernel.org #v5.19+ Cc: Manasi Navare navaremanasi@google.com Cc: Drew Davenport ddavenport@chromium.org Cc: Imre Deak imre.deak@intel.com Cc: Jouni Högander jouni.hogander@intel.com Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20230320095438.17328-2-ville.s... Reviewed-by: Imre Deak imre.deak@intel.com (cherry picked from commit f161eb01f50ab31f2084975b43bce54b7b671e17) Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/display/intel_color.c | 21 ++++++++++++++++++++- 1 file changed, 20 insertions(+), 1 deletion(-)
--- a/drivers/gpu/drm/i915/display/intel_color.c +++ b/drivers/gpu/drm/i915/display/intel_color.c @@ -574,6 +574,25 @@ static void skl_color_commit_arm(const s crtc_state->csc_mode); }
+static void icl_color_commit_arm(const struct intel_crtc_state *crtc_state) +{ + struct intel_crtc *crtc = to_intel_crtc(crtc_state->uapi.crtc); + struct drm_i915_private *i915 = to_i915(crtc->base.dev); + enum pipe pipe = crtc->pipe; + + /* + * We don't (yet) allow userspace to control the pipe background color, + * so force it to black. + */ + intel_de_write(i915, SKL_BOTTOM_COLOR(pipe), 0); + + intel_de_write(i915, GAMMA_MODE(crtc->pipe), + crtc_state->gamma_mode); + + intel_de_write_fw(i915, PIPE_CSC_MODE(crtc->pipe), + crtc_state->csc_mode); +} + static struct drm_property_blob * create_linear_lut(struct drm_i915_private *i915, int lut_size) { @@ -2287,7 +2306,7 @@ static const struct intel_color_funcs i9 static const struct intel_color_funcs icl_color_funcs = { .color_check = icl_color_check, .color_commit_noarm = icl_color_commit_noarm, - .color_commit_arm = skl_color_commit_arm, + .color_commit_arm = icl_color_commit_arm, .load_luts = icl_load_luts, .read_luts = icl_read_luts, };
From: Ville Syrjälä ville.syrjala@linux.intel.com
commit a8e03e00b62073b494886dbff32f8b5338066c8b upstream.
SKL/GLK CSC unit suffers from a nasty issue where a CSC coeff/offset register read or write between DC5 exit and PSR exit will undo the CSC arming performed by DMC, and then during PSR exit the hardware will latch zeroes into the active CSC registers. This causes any plane going through the CSC to output all black.
We can sidestep the issue by making sure the PSR exit has already actually happened before we touch the CSC coeff/offset registers. Easiest way to guarantee that is to just move the CSC programming back into the .color_commir_arm() as we force a PSR exit (and crucially wait for it to actually happen) prior to touching the arming registers.
When PSR (and thus also DC states) are disabled we don't have anything to worry about, so we can keep using the more optional _noarm() hook for writing the CSC registers.
Cc: stable@vger.kernel.org #v5.19+ Cc: Manasi Navare navaremanasi@google.com Cc: Drew Davenport ddavenport@chromium.org Cc: Imre Deak imre.deak@intel.com Cc: Jouni Högander jouni.hogander@intel.com Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/8283 Fixes: d13dde449580 ("drm/i915: Split pipe+output CSC programming to noarm+arm pair") Signed-off-by: Ville Syrjälä ville.syrjala@linux.intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20230320095438.17328-3-ville.s... Reviewed-by: Imre Deak imre.deak@intel.com (cherry picked from commit 80a892a4c2428b65366721599fc5fe50eaed35fd) Signed-off-by: Jani Nikula jani.nikula@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/display/intel_color.c | 23 +++++++++++++++++++++-- 1 file changed, 21 insertions(+), 2 deletions(-)
--- a/drivers/gpu/drm/i915/display/intel_color.c +++ b/drivers/gpu/drm/i915/display/intel_color.c @@ -514,6 +514,22 @@ static void icl_color_commit_noarm(const icl_load_csc_matrix(crtc_state); }
+static void skl_color_commit_noarm(const struct intel_crtc_state *crtc_state) +{ + /* + * Possibly related to display WA #1184, SKL CSC loses the latched + * CSC coeff/offset register values if the CSC registers are disarmed + * between DC5 exit and PSR exit. This will cause the plane(s) to + * output all black (until CSC_MODE is rearmed and properly latched). + * Once PSR exit (and proper register latching) has occurred the + * danger is over. Thus when PSR is enabled the CSC coeff/offset + * register programming will be peformed from skl_color_commit_arm() + * which is called after PSR exit. + */ + if (!crtc_state->has_psr) + ilk_load_csc_matrix(crtc_state); +} + static void ilk_color_commit_noarm(const struct intel_crtc_state *crtc_state) { ilk_load_csc_matrix(crtc_state); @@ -556,6 +572,9 @@ static void skl_color_commit_arm(const s enum pipe pipe = crtc->pipe; u32 val = 0;
+ if (crtc_state->has_psr) + ilk_load_csc_matrix(crtc_state); + /* * We don't (yet) allow userspace to control the pipe background color, * so force it to black, but apply pipe gamma and CSC appropriately @@ -2313,7 +2332,7 @@ static const struct intel_color_funcs ic
static const struct intel_color_funcs glk_color_funcs = { .color_check = glk_color_check, - .color_commit_noarm = ilk_color_commit_noarm, + .color_commit_noarm = skl_color_commit_noarm, .color_commit_arm = skl_color_commit_arm, .load_luts = glk_load_luts, .read_luts = glk_read_luts, @@ -2321,7 +2340,7 @@ static const struct intel_color_funcs gl
static const struct intel_color_funcs skl_color_funcs = { .color_check = ivb_color_check, - .color_commit_noarm = ilk_color_commit_noarm, + .color_commit_noarm = skl_color_commit_noarm, .color_commit_arm = skl_color_commit_arm, .load_luts = bdw_load_luts, .read_luts = NULL,
From: Reiji Watanabe reijiw@google.com
commit 9228b26194d1cc00449f12f306f53ef2e234a55b upstream.
Have KVM_GET_ONE_REG for vPMU counter (vPMC) registers (PMCCNTR_EL0 and PMEVCNTR<n>_EL0) return the sum of the register value in the sysreg file and the current perf event counter value.
Values of vPMC registers are saved in sysreg files on certain occasions. These saved values don't represent the current values of the vPMC registers if the perf events for the vPMCs count events after the save. The current values of those registers are the sum of the sysreg file value and the current perf event counter value. But, when userspace reads those registers (using KVM_GET_ONE_REG), KVM returns the sysreg file value to userspace (not the sum value).
Fix this to return the sum value for KVM_GET_ONE_REG.
Fixes: 051ff581ce70 ("arm64: KVM: Add access handler for event counter register") Cc: stable@vger.kernel.org Reviewed-by: Marc Zyngier maz@kernel.org Signed-off-by: Reiji Watanabe reijiw@google.com Link: https://lore.kernel.org/r/20230313033208.1475499-1-reijiw@google.com Signed-off-by: Oliver Upton oliver.upton@linux.dev Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/kvm/sys_regs.c | 21 +++++++++++++++++++-- 1 file changed, 19 insertions(+), 2 deletions(-)
--- a/arch/arm64/kvm/sys_regs.c +++ b/arch/arm64/kvm/sys_regs.c @@ -765,6 +765,22 @@ static bool pmu_counter_idx_valid(struct return true; }
+static int get_pmu_evcntr(struct kvm_vcpu *vcpu, const struct sys_reg_desc *r, + u64 *val) +{ + u64 idx; + + if (r->CRn == 9 && r->CRm == 13 && r->Op2 == 0) + /* PMCCNTR_EL0 */ + idx = ARMV8_PMU_CYCLE_IDX; + else + /* PMEVCNTRn_EL0 */ + idx = ((r->CRm & 3) << 3) | (r->Op2 & 7); + + *val = kvm_pmu_get_counter_value(vcpu, idx); + return 0; +} + static bool access_pmu_evcntr(struct kvm_vcpu *vcpu, struct sys_reg_params *p, const struct sys_reg_desc *r) @@ -981,7 +997,7 @@ static bool access_pmuserenr(struct kvm_ /* Macro to expand the PMEVCNTRn_EL0 register */ #define PMU_PMEVCNTR_EL0(n) \ { PMU_SYS_REG(SYS_PMEVCNTRn_EL0(n)), \ - .reset = reset_pmevcntr, \ + .reset = reset_pmevcntr, .get_user = get_pmu_evcntr, \ .access = access_pmu_evcntr, .reg = (PMEVCNTR0_EL0 + n), }
/* Macro to expand the PMEVTYPERn_EL0 register */ @@ -1745,7 +1761,8 @@ static const struct sys_reg_desc sys_reg { PMU_SYS_REG(SYS_PMCEID1_EL0), .access = access_pmceid, .reset = NULL }, { PMU_SYS_REG(SYS_PMCCNTR_EL0), - .access = access_pmu_evcntr, .reset = reset_unknown, .reg = PMCCNTR_EL0 }, + .access = access_pmu_evcntr, .reset = reset_unknown, + .reg = PMCCNTR_EL0, .get_user = get_pmu_evcntr}, { PMU_SYS_REG(SYS_PMXEVTYPER_EL0), .access = access_pmu_evtyper, .reset = NULL }, { PMU_SYS_REG(SYS_PMXEVCNTR_EL0),
From: Reiji Watanabe reijiw@google.com
commit f6da81f650fa47b61b847488f3938d43f90d093d upstream.
Presently, when a guest writes 1 to PMCR_EL0.{C,P}, which is WO/RAZ, KVM saves the register value, including these bits. When userspace reads the register using KVM_GET_ONE_REG, KVM returns the saved register value as it is (the saved value might have these bits set). This could result in userspace setting these bits on the destination during migration. Consequently, KVM may end up resetting the vPMU counter registers (PMCCNTR_EL0 and/or PMEVCNTR<n>_EL0) to zero on the first KVM_RUN after migration.
Fix this by not saving those bits when a guest writes 1 to those bits.
Fixes: ab9468340d2b ("arm64: KVM: Add access handler for PMCR register") Cc: stable@vger.kernel.org Reviewed-by: Marc Zyngier maz@kernel.org Signed-off-by: Reiji Watanabe reijiw@google.com Link: https://lore.kernel.org/r/20230313033234.1475987-1-reijiw@google.com Signed-off-by: Oliver Upton oliver.upton@linux.dev Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/kvm/pmu-emul.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/arch/arm64/kvm/pmu-emul.c +++ b/arch/arm64/kvm/pmu-emul.c @@ -538,7 +538,8 @@ void kvm_pmu_handle_pmcr(struct kvm_vcpu if (!kvm_pmu_is_3p5(vcpu)) val &= ~ARMV8_PMU_PMCR_LP;
- __vcpu_sys_reg(vcpu, PMCR_EL0) = val; + /* The reset bits don't indicate any state, and shouldn't be saved. */ + __vcpu_sys_reg(vcpu, PMCR_EL0) = val & ~(ARMV8_PMU_PMCR_C | ARMV8_PMU_PMCR_P);
if (val & ARMV8_PMU_PMCR_E) { kvm_pmu_enable_counter_mask(vcpu,
From: David Matlack dmatlack@google.com
commit 13ec9308a85702af7c31f3638a2720863848a7f2 upstream.
Read mmu_invalidate_seq before dropping the mmap_lock so that KVM can detect if the results of vma_lookup() (e.g. vma_shift) become stale before it acquires kvm->mmu_lock. This fixes a theoretical bug where a VMA could be changed by userspace after vma_lookup() and before KVM reads the mmu_invalidate_seq, causing KVM to install page table entries based on a (possibly) no-longer-valid vma_shift.
Re-order the MMU cache top-up to earlier in user_mem_abort() so that it is not done after KVM has read mmu_invalidate_seq (i.e. so as to avoid inducing spurious fault retries).
This bug has existed since KVM/ARM's inception. It's unlikely that any sane userspace currently modifies VMAs in such a way as to trigger this race. And even with directed testing I was unable to reproduce it. But a sufficiently motivated host userspace might be able to exploit this race.
Fixes: 94f8e6418d39 ("KVM: ARM: Handle guest faults in KVM") Cc: stable@vger.kernel.org Reported-by: Sean Christopherson seanjc@google.com Signed-off-by: David Matlack dmatlack@google.com Reviewed-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20230313235454.2964067-1-dmatlack@google.com Signed-off-by: Oliver Upton oliver.upton@linux.dev Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/kvm/mmu.c | 48 +++++++++++++++++++++--------------------------- 1 file changed, 21 insertions(+), 27 deletions(-)
--- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -1218,6 +1218,20 @@ static int user_mem_abort(struct kvm_vcp }
/* + * Permission faults just need to update the existing leaf entry, + * and so normally don't require allocations from the memcache. The + * only exception to this is when dirty logging is enabled at runtime + * and a write fault needs to collapse a block entry into a table. + */ + if (fault_status != ESR_ELx_FSC_PERM || + (logging_active && write_fault)) { + ret = kvm_mmu_topup_memory_cache(memcache, + kvm_mmu_cache_min_pages(kvm)); + if (ret) + return ret; + } + + /* * Let's check if we will get back a huge page backed by hugetlbfs, or * get block mapping for device MMIO region. */ @@ -1269,37 +1283,17 @@ static int user_mem_abort(struct kvm_vcp fault_ipa &= ~(vma_pagesize - 1);
gfn = fault_ipa >> PAGE_SHIFT; - mmap_read_unlock(current->mm); - - /* - * Permission faults just need to update the existing leaf entry, - * and so normally don't require allocations from the memcache. The - * only exception to this is when dirty logging is enabled at runtime - * and a write fault needs to collapse a block entry into a table. - */ - if (fault_status != ESR_ELx_FSC_PERM || - (logging_active && write_fault)) { - ret = kvm_mmu_topup_memory_cache(memcache, - kvm_mmu_cache_min_pages(kvm)); - if (ret) - return ret; - }
- mmu_seq = vcpu->kvm->mmu_invalidate_seq; /* - * Ensure the read of mmu_invalidate_seq happens before we call - * gfn_to_pfn_prot (which calls get_user_pages), so that we don't risk - * the page we just got a reference to gets unmapped before we have a - * chance to grab the mmu_lock, which ensure that if the page gets - * unmapped afterwards, the call to kvm_unmap_gfn will take it away - * from us again properly. This smp_rmb() interacts with the smp_wmb() - * in kvm_mmu_notifier_invalidate_<page|range_end>. + * Read mmu_invalidate_seq so that KVM can detect if the results of + * vma_lookup() or __gfn_to_pfn_memslot() become stale prior to + * acquiring kvm->mmu_lock. * - * Besides, __gfn_to_pfn_memslot() instead of gfn_to_pfn_prot() is - * used to avoid unnecessary overhead introduced to locate the memory - * slot because it's always fixed even @gfn is adjusted for huge pages. + * Rely on mmap_read_unlock() for an implicit smp_rmb(), which pairs + * with the smp_wmb() in kvm_mmu_invalidate_end(). */ - smp_rmb(); + mmu_seq = vcpu->kvm->mmu_invalidate_seq; + mmap_read_unlock(current->mm);
pfn = __gfn_to_pfn_memslot(memslot, gfn, false, false, NULL, write_fault, &writable, NULL);
From: Marc Zyngier maz@kernel.org
commit e86fc1a3a3e9b4850fe74d738e3cfcf4297d8bba upstream.
We walk the userspace PTs to discover what mapping size was used there. However, this can race against the userspace tables being freed, and we end-up in the weeds.
Thankfully, the mm code is being generous and will IPI us when doing so. So let's implement our part of the bargain and disable interrupts around the walk. This ensures that nothing terrible happens during that time.
We still need to handle the removal of the page tables before the walk. For that, allow get_user_mapping_size() to return an error, and make sure this error can be propagated all the way to the the exit handler.
Signed-off-by: Marc Zyngier maz@kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230316174546.3777507-2-maz@kernel.org Signed-off-by: Oliver Upton oliver.upton@linux.dev Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/kvm/mmu.c | 45 ++++++++++++++++++++++++++++++++++++++------- 1 file changed, 38 insertions(+), 7 deletions(-)
--- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -665,14 +665,33 @@ static int get_user_mapping_size(struct CONFIG_PGTABLE_LEVELS), .mm_ops = &kvm_user_mm_ops, }; + unsigned long flags; kvm_pte_t pte = 0; /* Keep GCC quiet... */ u32 level = ~0; int ret;
+ /* + * Disable IRQs so that we hazard against a concurrent + * teardown of the userspace page tables (which relies on + * IPI-ing threads). + */ + local_irq_save(flags); ret = kvm_pgtable_get_leaf(&pgt, addr, &pte, &level); - VM_BUG_ON(ret); - VM_BUG_ON(level >= KVM_PGTABLE_MAX_LEVELS); - VM_BUG_ON(!(pte & PTE_VALID)); + local_irq_restore(flags); + + if (ret) + return ret; + + /* + * Not seeing an error, but not updating level? Something went + * deeply wrong... + */ + if (WARN_ON(level >= KVM_PGTABLE_MAX_LEVELS)) + return -EFAULT; + + /* Oops, the userspace PTs are gone... Replay the fault */ + if (!kvm_pte_valid(pte)) + return -EAGAIN;
return BIT(ARM64_HW_PGTABLE_LEVEL_SHIFT(level)); } @@ -1079,7 +1098,7 @@ static bool fault_supports_stage2_huge_m * * Returns the size of the mapping. */ -static unsigned long +static long transparent_hugepage_adjust(struct kvm *kvm, struct kvm_memory_slot *memslot, unsigned long hva, kvm_pfn_t *pfnp, phys_addr_t *ipap) @@ -1091,8 +1110,15 @@ transparent_hugepage_adjust(struct kvm * * sure that the HVA and IPA are sufficiently aligned and that the * block map is contained within the memslot. */ - if (fault_supports_stage2_huge_mapping(memslot, hva, PMD_SIZE) && - get_user_mapping_size(kvm, hva) >= PMD_SIZE) { + if (fault_supports_stage2_huge_mapping(memslot, hva, PMD_SIZE)) { + int sz = get_user_mapping_size(kvm, hva); + + if (sz < 0) + return sz; + + if (sz < PMD_SIZE) + return PAGE_SIZE; + /* * The address we faulted on is backed by a transparent huge * page. However, because we map the compound huge page and @@ -1203,7 +1229,7 @@ static int user_mem_abort(struct kvm_vcp kvm_pfn_t pfn; bool logging_active = memslot_is_logging(memslot); unsigned long fault_level = kvm_vcpu_trap_get_fault_level(vcpu); - unsigned long vma_pagesize, fault_granule; + long vma_pagesize, fault_granule; enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R; struct kvm_pgtable *pgt;
@@ -1344,6 +1370,11 @@ static int user_mem_abort(struct kvm_vcp vma_pagesize = transparent_hugepage_adjust(kvm, memslot, hva, &pfn, &fault_ipa); + + if (vma_pagesize < 0) { + ret = vma_pagesize; + goto out_unlock; + } }
if (fault_status != ESR_ELx_FSC_PERM && !device && kvm_has_mte(kvm)) {
From: Marc Zyngier maz@kernel.org
commit 8c2e8ac8ad4be68409e806ce1cc78fc7a04539f3 upstream.
On page fault, we find about the VMA that backs the page fault early on, and quickly release the mmap_read_lock. However, using the VMA pointer after the critical section is pretty dangerous, as a teardown may happen in the meantime and the VMA be long gone.
Move the sampling of the MTE permission early, and NULL-ify the VMA pointer after that, just to be on the safe side.
Signed-off-by: Marc Zyngier maz@kernel.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20230316174546.3777507-3-maz@kernel.org Signed-off-by: Oliver Upton oliver.upton@linux.dev Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/kvm/mmu.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -1218,7 +1218,7 @@ static int user_mem_abort(struct kvm_vcp { int ret = 0; bool write_fault, writable, force_pte = false; - bool exec_fault; + bool exec_fault, mte_allowed; bool device = false; unsigned long mmu_seq; struct kvm *kvm = vcpu->kvm; @@ -1309,6 +1309,10 @@ static int user_mem_abort(struct kvm_vcp fault_ipa &= ~(vma_pagesize - 1);
gfn = fault_ipa >> PAGE_SHIFT; + mte_allowed = kvm_vma_mte_allowed(vma); + + /* Don't use the VMA after the unlock -- it may have vanished */ + vma = NULL;
/* * Read mmu_invalidate_seq so that KVM can detect if the results of @@ -1379,7 +1383,7 @@ static int user_mem_abort(struct kvm_vcp
if (fault_status != ESR_ELx_FSC_PERM && !device && kvm_has_mte(kvm)) { /* Check the VMM hasn't introduced a new disallowed VMA */ - if (kvm_vma_mte_allowed(vma)) { + if (mte_allowed) { sanitise_mte_tags(kvm, pfn, vma_pagesize); } else { ret = -EFAULT;
From: Hans de Goede hdegoede@redhat.com
commit 0482c34ec6f8557e06cd0f8e2d0e20e8ede6a22c upstream.
ucsi_init() which runs from a workqueue sets ucsi->connector and on an error will clear it again.
ucsi->connector gets dereferenced by ucsi_resume(), this checks for ucsi->connector being NULL in case ucsi_init() has not finished yet; or in case ucsi_init() has failed.
ucsi_init() setting ucsi->connector and then clearing it again on an error creates a race where the check in ucsi_resume() may pass, only to have ucsi->connector free-ed underneath it when ucsi_init() hits an error.
Fix this race by making ucsi_init() store the connector array in a local variable and only assign it to ucsi->connector on success.
Fixes: bdc62f2bae8f ("usb: typec: ucsi: Simplified registration and I/O API") Cc: stable@vger.kernel.org Reviewed-by: Heikki Krogerus heikki.krogerus@linux.intel.com Signed-off-by: Hans de Goede hdegoede@redhat.com Link: https://lore.kernel.org/r/20230308154244.722337-3-hdegoede@redhat.com Signed-off-by: Joakim Tjernlund joakim.tjernlund@infinera.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/typec/ucsi/ucsi.c | 22 +++++++++------------- 1 file changed, 9 insertions(+), 13 deletions(-)
--- a/drivers/usb/typec/ucsi/ucsi.c +++ b/drivers/usb/typec/ucsi/ucsi.c @@ -1039,9 +1039,8 @@ static struct fwnode_handle *ucsi_find_f return NULL; }
-static int ucsi_register_port(struct ucsi *ucsi, int index) +static int ucsi_register_port(struct ucsi *ucsi, struct ucsi_connector *con) { - struct ucsi_connector *con = &ucsi->connector[index]; struct typec_capability *cap = &con->typec_cap; enum typec_accessory *accessory = cap->accessory; enum usb_role u_role = USB_ROLE_NONE; @@ -1062,7 +1061,6 @@ static int ucsi_register_port(struct ucs init_completion(&con->complete); mutex_init(&con->lock); INIT_LIST_HEAD(&con->partner_tasks); - con->num = index + 1; con->ucsi = ucsi;
cap->fwnode = ucsi_find_fwnode(con); @@ -1204,7 +1202,7 @@ out_unlock: */ static int ucsi_init(struct ucsi *ucsi) { - struct ucsi_connector *con; + struct ucsi_connector *con, *connector; u64 command, ntfy; int ret; int i; @@ -1235,16 +1233,16 @@ static int ucsi_init(struct ucsi *ucsi) }
/* Allocate the connectors. Released in ucsi_unregister() */ - ucsi->connector = kcalloc(ucsi->cap.num_connectors + 1, - sizeof(*ucsi->connector), GFP_KERNEL); - if (!ucsi->connector) { + connector = kcalloc(ucsi->cap.num_connectors + 1, sizeof(*connector), GFP_KERNEL); + if (!connector) { ret = -ENOMEM; goto err_reset; }
/* Register all connectors */ for (i = 0; i < ucsi->cap.num_connectors; i++) { - ret = ucsi_register_port(ucsi, i); + connector[i].num = i + 1; + ret = ucsi_register_port(ucsi, &connector[i]); if (ret) goto err_unregister; } @@ -1256,11 +1254,12 @@ static int ucsi_init(struct ucsi *ucsi) if (ret < 0) goto err_unregister;
+ ucsi->connector = connector; ucsi->ntfy = ntfy; return 0;
err_unregister: - for (con = ucsi->connector; con->port; con++) { + for (con = connector; con->port; con++) { ucsi_unregister_partner(con); ucsi_unregister_altmodes(con, UCSI_RECIPIENT_CON); ucsi_unregister_port_psy(con); @@ -1269,10 +1268,7 @@ err_unregister: typec_unregister_port(con->port); con->port = NULL; } - - kfree(ucsi->connector); - ucsi->connector = NULL; - + kfree(connector); err_reset: memset(&ucsi->cap, 0, sizeof(ucsi->cap)); ucsi_reset_ppm(ucsi);
From: Andrii Nakryiko andrii@kernel.org
[ Upstream commit ea2ce1ba99aa6a60c8d8a706e3abadf3de372163 ]
Turns out that btf_dump API doesn't handle a bunch of tricky corner cases, as reported by Per, and further discovered using his testing Python script ([0]).
This patch revamps btf_dump's padding logic significantly, making it more correct and also avoiding unnecessary explicit padding, where compiler would pad naturally. This overall topic turned out to be very tricky and subtle, there are lots of subtle corner cases. The comments in the code tries to give some clues, but comments themselves are supposed to be paired with good understanding of C alignment and padding rules. Plus some experimentation to figure out subtle things like whether `long :0;` means that struct is now forced to be long-aligned (no, it's not, turns out).
Anyways, Per's script, while not completely correct in some known situations, doesn't show any obvious cases where this logic breaks, so this is a nice improvement over the previous state of this logic.
Some selftests had to be adjusted to accommodate better use of natural alignment rules, eliminating some unnecessary padding, or changing it to `type: 0;` alignment markers.
Note also that for when we are in between bitfields, we emit explicit bit size, while otherwise we use `: 0`, this feels much more natural in practice.
Next patch will add few more test cases, found through randomized Per's script.
[0] https://lore.kernel.org/bpf/85f83c333f5355c8ac026f835b18d15060725fcb.camel@e...
Reported-by: Per Sundström XP per.xp.sundstrom@ericsson.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20221212211505.558851-6-andrii@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/lib/bpf/btf_dump.c | 169 +++++++++++++----- .../bpf/progs/btf_dump_test_case_bitfields.c | 2 +- .../bpf/progs/btf_dump_test_case_padding.c | 58 ++++-- 3 files changed, 164 insertions(+), 65 deletions(-)
diff --git a/tools/lib/bpf/btf_dump.c b/tools/lib/bpf/btf_dump.c index 69e80ee5f70e2..e0f0b17541b4a 100644 --- a/tools/lib/bpf/btf_dump.c +++ b/tools/lib/bpf/btf_dump.c @@ -829,6 +829,25 @@ static void btf_dump_emit_type(struct btf_dump *d, __u32 id, __u32 cont_id) } }
+static int btf_natural_align_of(const struct btf *btf, __u32 id) +{ + const struct btf_type *t = btf__type_by_id(btf, id); + int i, align, vlen; + const struct btf_member *m; + + if (!btf_is_composite(t)) + return btf__align_of(btf, id); + + align = 1; + m = btf_members(t); + vlen = btf_vlen(t); + for (i = 0; i < vlen; i++, m++) { + align = max(align, btf__align_of(btf, m->type)); + } + + return align; +} + static bool btf_is_struct_packed(const struct btf *btf, __u32 id, const struct btf_type *t) { @@ -836,16 +855,16 @@ static bool btf_is_struct_packed(const struct btf *btf, __u32 id, int align, i, bit_sz; __u16 vlen;
- align = btf__align_of(btf, id); - /* size of a non-packed struct has to be a multiple of its alignment*/ - if (align && t->size % align) + align = btf_natural_align_of(btf, id); + /* size of a non-packed struct has to be a multiple of its alignment */ + if (align && (t->size % align) != 0) return true;
m = btf_members(t); vlen = btf_vlen(t); /* all non-bitfield fields have to be naturally aligned */ for (i = 0; i < vlen; i++, m++) { - align = btf__align_of(btf, m->type); + align = btf_natural_align_of(btf, m->type); bit_sz = btf_member_bitfield_size(t, i); if (align && bit_sz == 0 && m->offset % (8 * align) != 0) return true; @@ -858,44 +877,97 @@ static bool btf_is_struct_packed(const struct btf *btf, __u32 id, return false; }
-static int chip_away_bits(int total, int at_most) -{ - return total % at_most ? : at_most; -} - static void btf_dump_emit_bit_padding(const struct btf_dump *d, - int cur_off, int m_off, int m_bit_sz, - int align, int lvl) + int cur_off, int next_off, int next_align, + bool in_bitfield, int lvl) { - int off_diff = m_off - cur_off; - int ptr_bits = d->ptr_sz * 8; + const struct { + const char *name; + int bits; + } pads[] = { + {"long", d->ptr_sz * 8}, {"int", 32}, {"short", 16}, {"char", 8} + }; + int new_off, pad_bits, bits, i; + const char *pad_type; + + if (cur_off >= next_off) + return; /* no gap */ + + /* For filling out padding we want to take advantage of + * natural alignment rules to minimize unnecessary explicit + * padding. First, we find the largest type (among long, int, + * short, or char) that can be used to force naturally aligned + * boundary. Once determined, we'll use such type to fill in + * the remaining padding gap. In some cases we can rely on + * compiler filling some gaps, but sometimes we need to force + * alignment to close natural alignment with markers like + * `long: 0` (this is always the case for bitfields). Note + * that even if struct itself has, let's say 4-byte alignment + * (i.e., it only uses up to int-aligned types), using `long: + * X;` explicit padding doesn't actually change struct's + * overall alignment requirements, but compiler does take into + * account that type's (long, in this example) natural + * alignment requirements when adding implicit padding. We use + * this fact heavily and don't worry about ruining correct + * struct alignment requirement. + */ + for (i = 0; i < ARRAY_SIZE(pads); i++) { + pad_bits = pads[i].bits; + pad_type = pads[i].name;
- if (off_diff <= 0) - /* no gap */ - return; - if (m_bit_sz == 0 && off_diff < align * 8) - /* natural padding will take care of a gap */ - return; + new_off = roundup(cur_off, pad_bits); + if (new_off <= next_off) + break; + }
- while (off_diff > 0) { - const char *pad_type; - int pad_bits; - - if (ptr_bits > 32 && off_diff > 32) { - pad_type = "long"; - pad_bits = chip_away_bits(off_diff, ptr_bits); - } else if (off_diff > 16) { - pad_type = "int"; - pad_bits = chip_away_bits(off_diff, 32); - } else if (off_diff > 8) { - pad_type = "short"; - pad_bits = chip_away_bits(off_diff, 16); - } else { - pad_type = "char"; - pad_bits = chip_away_bits(off_diff, 8); + if (new_off > cur_off && new_off <= next_off) { + /* We need explicit `<type>: 0` aligning mark if next + * field is right on alignment offset and its + * alignment requirement is less strict than <type>'s + * alignment (so compiler won't naturally align to the + * offset we expect), or if subsequent `<type>: X`, + * will actually completely fit in the remaining hole, + * making compiler basically ignore `<type>: X` + * completely. + */ + if (in_bitfield || + (new_off == next_off && roundup(cur_off, next_align * 8) != new_off) || + (new_off != next_off && next_off - new_off <= new_off - cur_off)) + /* but for bitfields we'll emit explicit bit count */ + btf_dump_printf(d, "\n%s%s: %d;", pfx(lvl), pad_type, + in_bitfield ? new_off - cur_off : 0); + cur_off = new_off; + } + + /* Now we know we start at naturally aligned offset for a chosen + * padding type (long, int, short, or char), and so the rest is just + * a straightforward filling of remaining padding gap with full + * `<type>: sizeof(<type>);` markers, except for the last one, which + * might need smaller than sizeof(<type>) padding. + */ + while (cur_off != next_off) { + bits = min(next_off - cur_off, pad_bits); + if (bits == pad_bits) { + btf_dump_printf(d, "\n%s%s: %d;", pfx(lvl), pad_type, pad_bits); + cur_off += bits; + continue; + } + /* For the remainder padding that doesn't cover entire + * pad_type bit length, we pick the smallest necessary type. + * This is pure aesthetics, we could have just used `long`, + * but having smallest necessary one communicates better the + * scale of the padding gap. + */ + for (i = ARRAY_SIZE(pads) - 1; i >= 0; i--) { + pad_type = pads[i].name; + pad_bits = pads[i].bits; + if (pad_bits < bits) + continue; + + btf_dump_printf(d, "\n%s%s: %d;", pfx(lvl), pad_type, bits); + cur_off += bits; + break; } - btf_dump_printf(d, "\n%s%s: %d;", pfx(lvl), pad_type, pad_bits); - off_diff -= pad_bits; } }
@@ -915,9 +987,11 @@ static void btf_dump_emit_struct_def(struct btf_dump *d, { const struct btf_member *m = btf_members(t); bool is_struct = btf_is_struct(t); - int align, i, packed, off = 0; + bool packed, prev_bitfield = false; + int align, i, off = 0; __u16 vlen = btf_vlen(t);
+ align = btf__align_of(d->btf, id); packed = is_struct ? btf_is_struct_packed(d->btf, id, t) : 0;
btf_dump_printf(d, "%s%s%s {", @@ -927,33 +1001,36 @@ static void btf_dump_emit_struct_def(struct btf_dump *d,
for (i = 0; i < vlen; i++, m++) { const char *fname; - int m_off, m_sz; + int m_off, m_sz, m_align; + bool in_bitfield;
fname = btf_name_of(d, m->name_off); m_sz = btf_member_bitfield_size(t, i); m_off = btf_member_bit_offset(t, i); - align = packed ? 1 : btf__align_of(d->btf, m->type); + m_align = packed ? 1 : btf__align_of(d->btf, m->type); + + in_bitfield = prev_bitfield && m_sz != 0;
- btf_dump_emit_bit_padding(d, off, m_off, m_sz, align, lvl + 1); + btf_dump_emit_bit_padding(d, off, m_off, m_align, in_bitfield, lvl + 1); btf_dump_printf(d, "\n%s", pfx(lvl + 1)); btf_dump_emit_type_decl(d, m->type, fname, lvl + 1);
if (m_sz) { btf_dump_printf(d, ": %d", m_sz); off = m_off + m_sz; + prev_bitfield = true; } else { m_sz = max((__s64)0, btf__resolve_size(d->btf, m->type)); off = m_off + m_sz * 8; + prev_bitfield = false; } + btf_dump_printf(d, ";"); }
/* pad at the end, if necessary */ - if (is_struct) { - align = packed ? 1 : btf__align_of(d->btf, id); - btf_dump_emit_bit_padding(d, off, t->size * 8, 0, align, - lvl + 1); - } + if (is_struct) + btf_dump_emit_bit_padding(d, off, t->size * 8, align, false, lvl + 1);
/* * Keep `struct empty {}` on a single line, diff --git a/tools/testing/selftests/bpf/progs/btf_dump_test_case_bitfields.c b/tools/testing/selftests/bpf/progs/btf_dump_test_case_bitfields.c index e5560a6560309..e01690618e1ee 100644 --- a/tools/testing/selftests/bpf/progs/btf_dump_test_case_bitfields.c +++ b/tools/testing/selftests/bpf/progs/btf_dump_test_case_bitfields.c @@ -53,7 +53,7 @@ struct bitfields_only_mixed_types { */ /* ------ END-EXPECTED-OUTPUT ------ */ struct bitfield_mixed_with_others { - long: 4; /* char is enough as a backing field */ + char: 4; /* char is enough as a backing field */ int a: 4; /* 8-bit implicit padding */ short b; /* combined with previous bitfield */ diff --git a/tools/testing/selftests/bpf/progs/btf_dump_test_case_padding.c b/tools/testing/selftests/bpf/progs/btf_dump_test_case_padding.c index 7cb522d22a664..6f963d34c45ba 100644 --- a/tools/testing/selftests/bpf/progs/btf_dump_test_case_padding.c +++ b/tools/testing/selftests/bpf/progs/btf_dump_test_case_padding.c @@ -19,7 +19,7 @@ struct padded_implicitly { /* *struct padded_explicitly { * int a; - * int: 32; + * long: 0; * int b; *}; * @@ -28,41 +28,28 @@ struct padded_implicitly {
struct padded_explicitly { int a; - int: 1; /* algo will explicitly pad with full 32 bits here */ + int: 1; /* algo will emit aligning `long: 0;` here */ int b; };
/* ----- START-EXPECTED-OUTPUT ----- */ -/* - *struct padded_a_lot { - * int a; - * long: 32; - * long: 64; - * long: 64; - * int b; - *}; - * - */ -/* ------ END-EXPECTED-OUTPUT ------ */ - struct padded_a_lot { int a; - /* 32 bit of implicit padding here, which algo will make explicit */ long: 64; long: 64; int b; };
+/* ------ END-EXPECTED-OUTPUT ------ */ + /* ----- START-EXPECTED-OUTPUT ----- */ /* *struct padded_cache_line { * int a; - * long: 32; * long: 64; * long: 64; * long: 64; * int b; - * long: 32; * long: 64; * long: 64; * long: 64; @@ -85,7 +72,7 @@ struct padded_cache_line { *struct zone { * int a; * short b; - * short: 16; + * long: 0; * struct zone_padding __pad__; *}; * @@ -108,6 +95,39 @@ struct padding_wo_named_members { long: 64; };
+struct padding_weird_1 { + int a; + long: 64; + short: 16; + short b; +}; + +/* ------ END-EXPECTED-OUTPUT ------ */ + +/* ----- START-EXPECTED-OUTPUT ----- */ +/* + *struct padding_weird_2 { + * long: 56; + * char a; + * long: 56; + * char b; + * char: 8; + *}; + * + */ +/* ------ END-EXPECTED-OUTPUT ------ */ +struct padding_weird_2 { + int: 32; /* these paddings will be collapsed into `long: 56;` */ + short: 16; + char: 8; + char a; + int: 32; /* these paddings will be collapsed into `long: 56;` */ + short: 16; + char: 8; + char b; + char: 8; +}; + /* ------ END-EXPECTED-OUTPUT ------ */
int f(struct { @@ -117,6 +137,8 @@ int f(struct { struct padded_cache_line _4; struct zone _5; struct padding_wo_named_members _6; + struct padding_weird_1 _7; + struct padding_weird_2 _8; } *_) { return 0;
From: Andrii Nakryiko andrii@kernel.org
[ Upstream commit b148c8b9b926e257a59c8eb2cd6fa3adfd443254 ]
Add few hand-crafted cases and few randomized cases found using script from [0] that tests btf_dump's padding logic.
[0] https://lore.kernel.org/bpf/85f83c333f5355c8ac026f835b18d15060725fcb.camel@e...
Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20221212211505.558851-7-andrii@kernel.org Stable-dep-of: 4fb877aaa179 ("libbpf: Fix btf_dump's packed struct determination") Signed-off-by: Sasha Levin sashal@kernel.org --- .../bpf/progs/btf_dump_test_case_packing.c | 61 +++++++++- .../bpf/progs/btf_dump_test_case_padding.c | 104 ++++++++++++++++++ 2 files changed, 164 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/bpf/progs/btf_dump_test_case_packing.c b/tools/testing/selftests/bpf/progs/btf_dump_test_case_packing.c index e304b6204bd9d..5c6c62f7ed328 100644 --- a/tools/testing/selftests/bpf/progs/btf_dump_test_case_packing.c +++ b/tools/testing/selftests/bpf/progs/btf_dump_test_case_packing.c @@ -58,7 +58,64 @@ union jump_code_union { } __attribute__((packed)); };
-/*------ END-EXPECTED-OUTPUT ------ */ +/* ----- START-EXPECTED-OUTPUT ----- */ +/* + *struct nested_packed_but_aligned_struct { + * int x1; + * int x2; + *}; + * + *struct outer_implicitly_packed_struct { + * char y1; + * struct nested_packed_but_aligned_struct y2; + *} __attribute__((packed)); + * + */ +/* ------ END-EXPECTED-OUTPUT ------ */ + +struct nested_packed_but_aligned_struct { + int x1; + int x2; +} __attribute__((packed)); + +struct outer_implicitly_packed_struct { + char y1; + struct nested_packed_but_aligned_struct y2; +}; +/* ----- START-EXPECTED-OUTPUT ----- */ +/* + *struct usb_ss_ep_comp_descriptor { + * char: 8; + * char bDescriptorType; + * char bMaxBurst; + * short wBytesPerInterval; + *}; + * + *struct usb_host_endpoint { + * long: 64; + * char: 8; + * struct usb_ss_ep_comp_descriptor ss_ep_comp; + * long: 0; + *} __attribute__((packed)); + * + */ +/* ------ END-EXPECTED-OUTPUT ------ */ + +struct usb_ss_ep_comp_descriptor { + char: 8; + char bDescriptorType; + char bMaxBurst; + int: 0; + short wBytesPerInterval; +} __attribute__((packed)); + +struct usb_host_endpoint { + long: 64; + char: 8; + struct usb_ss_ep_comp_descriptor ss_ep_comp; + long: 0; +}; +
int f(struct { struct packed_trailing_space _1; @@ -69,6 +126,8 @@ int f(struct { union union_is_never_packed _6; union union_does_not_need_packing _7; union jump_code_union _8; + struct outer_implicitly_packed_struct _9; + struct usb_host_endpoint _10; } *_) { return 0; diff --git a/tools/testing/selftests/bpf/progs/btf_dump_test_case_padding.c b/tools/testing/selftests/bpf/progs/btf_dump_test_case_padding.c index 6f963d34c45ba..79276fbe454a8 100644 --- a/tools/testing/selftests/bpf/progs/btf_dump_test_case_padding.c +++ b/tools/testing/selftests/bpf/progs/btf_dump_test_case_padding.c @@ -128,6 +128,98 @@ struct padding_weird_2 { char: 8; };
+/* ----- START-EXPECTED-OUTPUT ----- */ +struct exact_1byte { + char x; +}; + +struct padded_1byte { + char: 8; +}; + +struct exact_2bytes { + short x; +}; + +struct padded_2bytes { + short: 16; +}; + +struct exact_4bytes { + int x; +}; + +struct padded_4bytes { + int: 32; +}; + +struct exact_8bytes { + long x; +}; + +struct padded_8bytes { + long: 64; +}; + +struct ff_periodic_effect { + int: 32; + short magnitude; + long: 0; + short phase; + long: 0; + int: 32; + int custom_len; + short *custom_data; +}; + +struct ib_wc { + long: 64; + long: 64; + int: 32; + int byte_len; + void *qp; + union {} ex; + long: 64; + int slid; + int wc_flags; + long: 64; + char smac[6]; + long: 0; + char network_hdr_type; +}; + +struct acpi_object_method { + long: 64; + char: 8; + char type; + short reference_count; + char flags; + short: 0; + char: 8; + char sync_level; + long: 64; + void *node; + void *aml_start; + union {} dispatch; + long: 64; + int aml_length; +}; + +struct nested_unpacked { + int x; +}; + +struct nested_packed { + struct nested_unpacked a; + char c; +} __attribute__((packed)); + +struct outer_mixed_but_unpacked { + struct nested_packed b1; + short a1; + struct nested_packed b2; +}; + /* ------ END-EXPECTED-OUTPUT ------ */
int f(struct { @@ -139,6 +231,18 @@ int f(struct { struct padding_wo_named_members _6; struct padding_weird_1 _7; struct padding_weird_2 _8; + struct exact_1byte _100; + struct padded_1byte _101; + struct exact_2bytes _102; + struct padded_2bytes _103; + struct exact_4bytes _104; + struct padded_4bytes _105; + struct exact_8bytes _106; + struct padded_8bytes _107; + struct ff_periodic_effect _200; + struct ib_wc _201; + struct acpi_object_method _202; + struct outer_mixed_but_unpacked _203; } *_) { return 0;
From: Andrii Nakryiko andrii@kernel.org
[ Upstream commit 4fb877aaa179dcdb1676d55216482febaada457e ]
Fix bug in btf_dump's logic of determining if a given struct type is packed or not. The notion of "natural alignment" is not needed and is even harmful in this case, so drop it altogether. The biggest difference in btf_is_struct_packed() compared to its original implementation is that we don't really use btf__align_of() to determine overall alignment of a struct type (because it could be 1 for both packed and non-packed struct, depending on specifci field definitions), and just use field's actual alignment to calculate whether any field is requiring packing or struct's size overall necessitates packing.
Add two simple test cases that demonstrate the difference this change would make.
Fixes: ea2ce1ba99aa ("libbpf: Fix BTF-to-C converter's padding logic") Reported-by: Eduard Zingerman eddyz87@gmail.com Signed-off-by: Andrii Nakryiko andrii@kernel.org Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Eduard Zingerman eddyz87@gmail.com Link: https://lore.kernel.org/bpf/20221215183605.4149488-1-andrii@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/lib/bpf/btf_dump.c | 33 ++++--------------- .../bpf/progs/btf_dump_test_case_packing.c | 19 +++++++++++ 2 files changed, 25 insertions(+), 27 deletions(-)
diff --git a/tools/lib/bpf/btf_dump.c b/tools/lib/bpf/btf_dump.c index e0f0b17541b4a..cfbec31e115cc 100644 --- a/tools/lib/bpf/btf_dump.c +++ b/tools/lib/bpf/btf_dump.c @@ -829,47 +829,26 @@ static void btf_dump_emit_type(struct btf_dump *d, __u32 id, __u32 cont_id) } }
-static int btf_natural_align_of(const struct btf *btf, __u32 id) -{ - const struct btf_type *t = btf__type_by_id(btf, id); - int i, align, vlen; - const struct btf_member *m; - - if (!btf_is_composite(t)) - return btf__align_of(btf, id); - - align = 1; - m = btf_members(t); - vlen = btf_vlen(t); - for (i = 0; i < vlen; i++, m++) { - align = max(align, btf__align_of(btf, m->type)); - } - - return align; -} - static bool btf_is_struct_packed(const struct btf *btf, __u32 id, const struct btf_type *t) { const struct btf_member *m; - int align, i, bit_sz; + int max_align = 1, align, i, bit_sz; __u16 vlen;
- align = btf_natural_align_of(btf, id); - /* size of a non-packed struct has to be a multiple of its alignment */ - if (align && (t->size % align) != 0) - return true; - m = btf_members(t); vlen = btf_vlen(t); /* all non-bitfield fields have to be naturally aligned */ for (i = 0; i < vlen; i++, m++) { - align = btf_natural_align_of(btf, m->type); + align = btf__align_of(btf, m->type); bit_sz = btf_member_bitfield_size(t, i); if (align && bit_sz == 0 && m->offset % (8 * align) != 0) return true; + max_align = max(align, max_align); } - + /* size of a non-packed struct has to be a multiple of its alignment */ + if (t->size % max_align != 0) + return true; /* * if original struct was marked as packed, but its layout is * naturally aligned, we'll detect that it's not packed diff --git a/tools/testing/selftests/bpf/progs/btf_dump_test_case_packing.c b/tools/testing/selftests/bpf/progs/btf_dump_test_case_packing.c index 5c6c62f7ed328..7998f27df7ddd 100644 --- a/tools/testing/selftests/bpf/progs/btf_dump_test_case_packing.c +++ b/tools/testing/selftests/bpf/progs/btf_dump_test_case_packing.c @@ -116,6 +116,23 @@ struct usb_host_endpoint { long: 0; };
+/* ----- START-EXPECTED-OUTPUT ----- */ +struct nested_packed_struct { + int a; + char b; +} __attribute__((packed)); + +struct outer_nonpacked_struct { + short a; + struct nested_packed_struct b; +}; + +struct outer_packed_struct { + short a; + struct nested_packed_struct b; +} __attribute__((packed)); + +/* ------ END-EXPECTED-OUTPUT ------ */
int f(struct { struct packed_trailing_space _1; @@ -128,6 +145,8 @@ int f(struct { union jump_code_union _8; struct outer_implicitly_packed_struct _9; struct usb_host_endpoint _10; + struct outer_nonpacked_struct _11; + struct outer_packed_struct _12; } *_) { return 0;
From: Xiaogang Chen xiaogang.chen@amd.com
commit 8eeddc0d4200762063e1c66b9cc63afa7b24ebf0 upstream.
During miration to vram prange->offset is valid after vram buffer is located, either use old one or allocate a new one. Move svm_range_vram_node_new before migrate for each vma to get valid prange->offset.
v2: squash in warning fix
Fixes: b4ee9606378b ("drm/amdkfd: Fix BO offset for multi-VMA page migration") Signed-off-by: Xiaogang Chen Xiaogang.Chen@amd.com Reviewed-by: Felix Kuehling Felix.Kuehling@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdkfd/kfd_migrate.c | 16 +++++++++------- 1 file changed, 9 insertions(+), 7 deletions(-)
--- a/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_migrate.c @@ -305,12 +305,6 @@ svm_migrate_copy_to_vram(struct amdgpu_d src = scratch; dst = (uint64_t *)(scratch + npages);
- r = svm_range_vram_node_new(adev, prange, true); - if (r) { - dev_dbg(adev->dev, "fail %d to alloc vram\n", r); - goto out; - } - amdgpu_res_first(prange->ttm_res, ttm_res_offset, npages << PAGE_SHIFT, &cursor); for (i = j = 0; i < npages; i++) { @@ -391,7 +385,7 @@ out_free_vram_pages: migrate->dst[i + 3] = 0; } #endif -out: + return r; }
@@ -520,6 +514,12 @@ svm_migrate_ram_to_vram(struct svm_range
start = prange->start << PAGE_SHIFT; end = (prange->last + 1) << PAGE_SHIFT; + + r = svm_range_vram_node_new(adev, prange, true); + if (r) { + dev_dbg(adev->dev, "fail %ld to alloc vram\n", r); + return r; + } ttm_res_offset = prange->offset << PAGE_SHIFT;
for (addr = start; addr < end;) { @@ -543,6 +543,8 @@ svm_migrate_ram_to_vram(struct svm_range
if (cpages) prange->actual_loc = best_loc; + else + svm_range_vram_node_free(prange);
return r < 0 ? r : 0; }
From: Matthieu Baerts matthieu.baerts@tessares.net
commit 1b0120e4db0bf2838d1ce741195ce4b7cc100b91 upstream.
Recently, when automatically merging -net and net-next in MPTCP devel tree, our CI reported [1] a conflict in hsr, the same as the one reported by Stephen in netdev [2].
When looking at the conflict, I noticed it is in fact the v1 [3] that has been applied in -net and the v2 [4] in net-next. Maybe the v1 was applied by accident.
As mentioned by Jakub Kicinski [5], the new condition makes more sense before the net_ratelimit(), not to update net_ratelimit's state which is unnecessary if we're not going to print either way.
Here, this modification applies the v2 but in -net.
Link: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/4423171069 [1] Link: https://lore.kernel.org/netdev/20230315100914.53fc1760@canb.auug.org.au/ [2] Link: https://lore.kernel.org/netdev/20230307133229.127442-1-koverskeid@gmail.com/ [3] Link: https://lore.kernel.org/netdev/20230309092302.179586-1-koverskeid@gmail.com/ [4] Link: https://lore.kernel.org/netdev/20230308232001.2fb62013@kernel.org/ [5] Fixes: 28e8cabe80f3 ("net: hsr: Don't log netdev_err message on unknown prp dst node") Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Reviewed-by: Steen Hegelund Steen.Hegelund@microchip.com Link: https://lore.kernel.org/r/20230315-net-20230315-hsr_framereg-ratelimit-v1-1-... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/hsr/hsr_framereg.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/hsr/hsr_framereg.c +++ b/net/hsr/hsr_framereg.c @@ -415,7 +415,7 @@ void hsr_addr_subst_dest(struct hsr_node node_dst = find_node_by_addr_A(&port->hsr->node_db, eth_hdr(skb)->h_dest); if (!node_dst) { - if (net_ratelimit() && port->hsr->prot_version != PRP_V1) + if (port->hsr->prot_version != PRP_V1 && net_ratelimit()) netdev_err(skb->dev, "%s: Unknown node\n", __func__); return; }
From: Jan Beulich jbeulich@suse.com
commit aadbd07ff8a75ed342388846da78dfaddb8b106a upstream.
In the commit referenced below I failed to pay attention to this code also being buildable as 32-bit. Adjust the type of "ret" - there's no real need for it to be wider than 32 bits.
Fixes: 934ef33ee75c ("x86/PVH: obtain VGA console info in Dom0") Reported-by: kernel test robot lkp@intel.com Signed-off-by: Jan Beulich jbeulich@suse.com Reviewed-by: Juergen Gross jgross@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Link: https://lore.kernel.org/r/2d2193ff-670b-0a27-e12d-2c5c4c121c79@suse.com Signed-off-by: Juergen Gross jgross@suse.com --- arch/x86/xen/enlighten_pvh.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/xen/enlighten_pvh.c +++ b/arch/x86/xen/enlighten_pvh.c @@ -48,7 +48,7 @@ void __init xen_pvh_init(struct boot_par struct xen_platform_op op = { .cmd = XENPF_get_dom0_console, }; - long ret = HYPERVISOR_platform_op(&op); + int ret = HYPERVISOR_platform_op(&op);
if (ret > 0) xen_init_vga(&op.u.dom0_console,
On 4/3/23 08:07, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.2.10 release. There are 187 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 05 Apr 2023 14:03:18 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.2.10-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.2.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
On Mon, Apr 03, 2023 at 04:07:25PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.2.10 release. There are 187 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 05 Apr 2023 14:03:18 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.2.10-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.2.y and the diffstat can be found below.
thanks,
greg k-h
Tested rc1 against the Fedora build system (aarch64, armv7, ppc64le, s390x, x86_64), and boot tested x86_64. No regressions noted.
Tested-by: Justin M. Forbes jforbes@fedoraproject.org
On 4/3/2023 7:07 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.2.10 release. There are 187 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 05 Apr 2023 14:03:18 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.2.10-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.2.y and the diffstat can be found below.
thanks,
greg k-h
On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels, build tested on BMIPS_GENERIC:
Tested-by: Florian Fainelli f.fainelli@gmail.com
On Mon, Apr 03, 2023 at 04:07:25PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.2.10 release. There are 187 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Successfully cross-compiled for arm64 (bcm2711_defconfig, GCC 10.2.0) and powerpc (ps3_defconfig, GCC 12.2.0).
Tested-by: Bagas Sanjaya bagasdotme@gmail.com
On 4/3/23 7:07 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.2.10 release. There are 187 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 05 Apr 2023 14:03:18 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.2.10-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.2.y and the diffstat can be found below.
thanks,
greg k-h
Built and booted successfully on RISC-V RV64 (HiFive Unmatched).
Tested-by: Ron Economos re@w6rz.net
On Mon, Apr 03, 2023 at 04:07:25PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.2.10 release. There are 187 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Again, looks grand on our RISC-V stuff... Tested-by: Conor Dooley conor.dooley@microchip.com
Thanks, Conor.
Hello Greg,
From: Greg Kroah-Hartman gregkh@linuxfoundation.org Sent: Monday, April 3, 2023 3:07 PM
This is the start of the stable review cycle for the 6.2.10 release. There are 187 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 05 Apr 2023 14:03:18 +0000. Anything received after that time might be too late.
CIP configurations built and booted with Linux 6.2.10-rc1 (6e4466c69e20): https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/pipelines/82... https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/commits/linu...
Tested-by: Chris Paterson (CIP) chris.paterson2@renesas.com
Kind regards, Chris
On Mon, 3 Apr 2023 at 20:13, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.2.10 release. There are 187 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 05 Apr 2023 14:03:18 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.2.10-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.2.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
## Build * kernel: 6.2.10-rc1 * git: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc * git branch: linux-6.2.y * git commit: 6e4466c69e20058ca543fa4ced70f8d745dce1ec * git describe: v6.2.9-188-g6e4466c69e20 * test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.2.y/build/v6.2.9-...
## Test Regressions (compared to v6.2.9)
## Metric Regressions (compared to v6.2.9)
## Test Fixes (compared to v6.2.9)
## Metric Fixes (compared to v6.2.9)
## Test result summary total: 195294, pass: 167379, fail: 4270, skip: 23297, xfail: 348
## Build Summary * arc: 5 total, 5 passed, 0 failed * arm: 145 total, 142 passed, 3 failed * arm64: 54 total, 53 passed, 1 failed * i386: 41 total, 38 passed, 3 failed * mips: 30 total, 28 passed, 2 failed * parisc: 8 total, 8 passed, 0 failed * powerpc: 38 total, 36 passed, 2 failed * riscv: 26 total, 25 passed, 1 failed * s390: 16 total, 16 passed, 0 failed * sh: 14 total, 12 passed, 2 failed * sparc: 8 total, 7 passed, 1 failed * x86_64: 46 total, 46 passed, 0 failed
## Test suites summary * boot * fwts * igt-gpu-tools * kselftest-android * kselftest-arm64 * kselftest-breakpoints * kselftest-capabilities * kselftest-cgroup * kselftest-clone3 * kselftest-core * kselftest-cpu-hotplug * kselftest-cpufreq * kselftest-drivers-dma-buf * kselftest-efivarfs * kselftest-exec * kselftest-filesystems * kselftest-filesystems-binderfs * kselftest-firmware * kselftest-fpu * kselftest-ftrace * kselftest-futex * kselftest-gpio * kselftest-intel_pstate * kselftest-ipc * kselftest-ir * kselftest-kcmp * kselftest-kexec * kselftest-kvm * kselftest-lib * kselftest-livepatch * kselftest-membarrier * kselftest-memfd * kselftest-memory-hotplug * kselftest-mincore * kselftest-mount * kselftest-mqueue * kselftest-net * kselftest-net-forwarding * kselftest-net-mptcp * kselftest-netfilter * kselftest-nsfs * kselftest-openat2 * kselftest-pid_namespace * kselftest-pidfd * kselftest-proc * kselftest-pstore * kselftest-ptrace * kselftest-rseq * kselftest-rtc * kselftest-seccomp * kselftest-sigaltstack * kselftest-size * kselftest-splice * kselftest-static_keys * kselftest-sync * kselftest-sysctl * kselftest-tc-testing * kselftest-timens * kselftest-timers * kselftest-tmpfs * kselftest-tpm2 * kselftest-user * kselftest-user_events * kselftest-vDSO * kselftest-vm * kselftest-watchdog * kselftest-x86 * kselftest-zram * kunit * kvm-unit-tests * libgpiod * libhugetlbfs * log-parser-boot * log-parser-test * ltp-cap_bounds * ltp-commands * ltp-containers * ltp-controllers * ltp-cpuhotplug * ltp-crypto * ltp-cve * ltp-dio * ltp-fcntl-locktests * ltp-filecaps * ltp-fs * ltp-fs_bind * ltp-fs_perms_simple * ltp-fsx * ltp-hugetlb * ltp-io * ltp-ipc * ltp-math * ltp-mm * ltp-nptl * ltp-nptl++ * ltp-open-posix-tests * ltp-pty * ltp-sched * ltp-securebits * ltp-smoke * ltp-syscalls * ltp-tracing * ltpa-6328543/0/tests/1_ltp-dio * network-basic-tests * perf * rcutorture * v4l2-compliance * vdso
-- Linaro LKFT https://lkft.linaro.org
* Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.2.10 release. There are 187 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 05 Apr 2023 14:03:18 +0000. Anything received after that time might be too late.
Hi Greg
6.2.10-rc1
compiles, boots and runs here on x86_64 (AMD Ryzen 5 PRO 4650G, Slackware64-15.0)
Tested-by: Markus Reichelt lkt+2023@mareichelt.com
On Mon, Apr 03, 2023 at 04:07:25PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.2.10 release. There are 187 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 05 Apr 2023 14:03:18 +0000. Anything received after that time might be too late.
Build results: total: 155 pass: 155 fail: 0 Qemu test results: total: 520 pass: 520 fail: 0
Tested-by: Guenter Roeck linux@roeck-us.net
Guenter
linux-stable-mirror@lists.linaro.org