This is the start of the stable review cycle for the 5.17.4 release. There are 219 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 20 Apr 2022 12:11:14 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.17.4-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.17.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 5.17.4-rc1
Duoming Zhou duoming@zju.edu.cn ax25: Fix UAF bugs in ax25 timers
Steven Price steven.price@arm.com cpu/hotplug: Remove the 'cpu' member of cpuhp_cpu_state
Matt Roper matthew.d.roper@intel.com drm/i915: Sunset igpu legacy mmap support based on GRAPHICS_VER_FULL
Marco Elver elver@google.com mm, kfence: support kmem_dump_obj() for KFENCE objects
Chao Gao chao.gao@intel.com dma-direct: avoid redundant memory sync for swiotlb
Anna-Maria Behnsen anna-maria@linutronix.de timers: Fix warning condition in __run_timers()
Dongjin Yang dj76.yang@samsung.com dt-bindings: net: snps: remove duplicate name
Martin Povišer povik+lin@cutebit.org i2c: pasemi: Wait for write xfers to finish
Sherry Sun sherry.sun@nxp.com dt-bindings: memory: snps,ddrc-3.80a compatible also need interrupts
Nadav Amit namit@vmware.com smp: Fix offline cpu check in flush_smp_call_function_queue()
Vladimir Oltean vladimir.oltean@nxp.com Revert "net: dsa: setup master before ports"
Andy Shevchenko andriy.shevchenko@linux.intel.com i2c: dev: check return value when calling dev_set_name()
Mikulas Patocka mpatocka@redhat.com dm integrity: fix memory corruption when tag_size is less than digest size
Alexander Sverdlin alexander.sverdlin@gmail.com ep93xx: clock: Fix UAF in ep93xx_clk_register_gate()
Nathan Chancellor nathan@kernel.org ARM: davinci: da850-evm: Avoid NULL pointer dereference
Paul Gortmaker paul.gortmaker@windriver.com tick/nohz: Use WARN_ON_ONCE() to prevent console saturation
Rei Yamamoto yamamoto.rei@jp.fujitsu.com genirq/affinity: Consider that CPUs on nodes can be unbalanced
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/tsx: Disable TSX development mode at boot
Pawan Gupta pawan.kumar.gupta@linux.intel.com x86/tsx: Use MSR_TSX_CTRL to clear CPUID bits
Tomasz Moń desowin@gmail.com drm/amdgpu: Enable gfxoff quirk on MacBook Pro
Melissa Wen mwen@igalia.com drm/amd/display: don't ignore alpha property on pre-multiplied mode
Nicolas Dichtel nicolas.dichtel@6wind.com ipv6: fix panic when forwarding a pkt with no in6 dev
Johannes Berg johannes.berg@intel.com nl80211: correctly check NL80211_ATTR_REG_ALPHA2 size
Fabio M. De Francesco fmdefrancesco@gmail.com ALSA: pcm: Test for "silence" field in struct "pcm_format_data"
Tao Jin tao-j@outlook.com ALSA: hda/realtek: add quirk for Lenovo Thinkpad X12 speakers
Tim Crawford tcrawford@system76.com ALSA: hda/realtek: Add quirk for Clevo PD50PNT
Naohiro Aota naohiro.aota@wdc.com btrfs: mark resumed async balance as writing
Jia-Ju Bai baijiaju1990@gmail.com btrfs: fix root ref counts in error handling in btrfs_get_root_ref
Naohiro Aota naohiro.aota@wdc.com btrfs: zoned: activate block group only for extent allocation
Toke Høiland-Jørgensen toke@redhat.com ath9k: Fix usage of driver-private space in tx_info
Toke Høiland-Jørgensen toke@toke.dk ath9k: Properly clear TX status area before reporting to mac80211
Bartosz Golaszewski brgl@bgdev.pl gpio: sim: fix setting and getting multiple lines
Ronnie Sahlberg lsahlber@redhat.com cifs: verify that tcon is valid before dereference in cifs_kill_sb
Jason A. Donenfeld Jason@zx2c4.com gcc-plugins: latent_entropy: use /dev/urandom
Johan Hovold johan@kernel.org memory: renesas-rpc-if: fix platform-device leak in error path
Chuck Lever chuck.lever@oracle.com SUNRPC: Fix NFSD's request deferral on RDMA transports
Oliver Upton oupton@google.com KVM: Don't create VM debugfs files outside of the VM directory
Sean Christopherson seanjc@google.com KVM: x86/mmu: Resolve nx_huge_pages when kvm.ko is loaded
Andrew Morton akpm@linux-foundation.org revert "fs/binfmt_elf: use PT_LOAD p_align values for static PIE"
Andrew Morton akpm@linux-foundation.org revert "fs/binfmt_elf: fix PT_LOAD p_align values for loaders"
Mike Kravetz mike.kravetz@oracle.com hugetlb: do not demote poisoned hugetlb pages
Patrick Wang patrick.wang.shcn@gmail.com mm: kmemleak: take a full lowmem check in kmemleak_*_phys()
Minchan Kim minchan@kernel.org mm: fix unexpected zeroed page mapping with zram swap
Juergen Gross jgross@suse.com mm, page_alloc: fix build_zonerefs_node()
Axel Rasmussen axelrasmussen@google.com mm/secretmem: fix panic when growing a memfd_secret
Borislav Petkov bp@suse.de perf/imx_ddr: Fix undefined behavior due to shift overflowing the constant
Pavel Begunkov asml.silence@gmail.com io_uring: use nospec annotation for more indexes
Pavel Begunkov asml.silence@gmail.com io_uring: zero tag on rsrc removal
Peter Zijlstra peterz@infradead.org x86,bpf: Avoid IBT objtool warning
Duoming Zhou duoming@zju.edu.cn drivers: net: slip: fix NPD bug in sl_tx_timeout()
Chandrakanth patil chandrakanth.patil@broadcom.com scsi: megaraid_sas: Target with invalid LUN ID is deleted during scan
Alexey Galakhov agalakhov@gmail.com scsi: mvsas: Add PCI ID of RocketRaid 2640
Sreekanth Reddy sreekanth.reddy@broadcom.com scsi: mpt3sas: Fail reset operation if config request timed out
Christoph Böhmwalder christoph@boehmwalder.at drbd: set QUEUE_FLAG_STABLE_WRITES
Roman Li Roman.Li@amd.com drm/amd/display: Fix allocate_mst_payload assert on resume
Martin Leung Martin.Leung@amd.com drm/amd/display: Revert FEC check in validation
Roman Li Roman.Li@amd.com drm/amd/display: Enable power gating before init_pipes
Chris Park Chris.Park@amd.com drm/amd/display: Correct Slice reset calculation
Matthias Schiffer matthias.schiffer@ew.tq-group.com spi: cadence-quadspi: fix protocol setup for non-1-1-X operations
Xiaomeng Tong xiam0nd.tong@gmail.com myri10ge: fix an incorrect free for skb in myri10ge_sw_tso
Marcin Kozlowski marcinguy@gmail.com net: usb: aqc111: Fix out-of-bounds accesses in RX fixup
Boqun Feng boqun.feng@gmail.com Drivers: hv: balloon: Disable balloon and hot-add accordingly
Andy Chiu andy.chiu@sifive.com net: axienet: setup mdio unconditionally
Steve Capper steve.capper@arm.com tlb: hugetlb: Add more sizes to tlb_remove_huge_tlb_entry
Joey Gouly joey.gouly@arm.com arm64: alternatives: mark patch_alternative() as `noinstr`
Christophe Leroy christophe.leroy@csgroup.eu static_call: Properly initialise DEFINE_STATIC_CALL_RET0()
Jonathan Bakker xc-racer2@live.ca regulator: wm8994: Add an off-on delay for WM8994 variant
Leo Ruan tingquan.ruan@cn.bosch.com gpu: ipu-v3: Fix dev_dbg frequency output
Christian Lamparter chunkeey@gmail.com ata: libata-core: Disable READ LOG DMA EXT for Samsung 840 EVOs
Randy Dunlap rdunlap@infradead.org net: micrel: fix KS8851_MLL Kconfig
Tyrel Datwyler tyreld@linux.ibm.com scsi: ibmvscsis: Increase INITIAL_SRP_LIMIT to 1024
James Smart jsmart2021@gmail.com scsi: lpfc: Fix queue failures when recovering from PCI parity error
James Smart jsmart2021@gmail.com scsi: lpfc: Fix unload hang after back to back PCI EEH faults
James Smart jsmart2021@gmail.com scsi: lpfc: Improve PCI EEH Error and Recovery Handling
Xiaoguang Wang xiaoguang.wang@linux.alibaba.com scsi: target: tcmu: Fix possible page UAF
Michael Kelley mikelley@microsoft.com Drivers: hv: vmbus: Prevent load re-ordering when reading ring buffer
Michael Kelley mikelley@microsoft.com PCI: hv: Propagate coherence from VMbus device to PCI device
Michael Kelley mikelley@microsoft.com Drivers: hv: vmbus: Propagate VMbus coherence to each VMbus device
Andrea Parri (Microsoft) parri.andrea@gmail.com Drivers: hv: vmbus: Deactivate sysctl_record_panic_msg by default in isolated guests
QintaoShen unSimple1993@163.com drm/amdkfd: Check for potential null return of kmalloc_array()
Tianci Yin tianci.yin@amd.com drm/amdgpu/vcn: improve vcn dpg stop procedure
Tushar Patel tushar.patel@amd.com drm/amdkfd: Fix Incorrect VMIDs passed to HWS
Leo (Hanghong) Ma hanghong.ma@amd.com drm/amd/display: Update VTEM Infopacket definition
Chiawen Huang chiawen.huang@amd.com drm/amd/display: FEC check in timing validation
Charlene Liu Charlene.Liu@amd.com drm/amd/display: fix audio format not updated after edid updated
Alex Deucher alexander.deucher@amd.com drm/amdgpu/gmc: use PCI BARs for APUs in passthrough
Guchun Chen guchun.chen@amd.com drm/amdgpu: conduct a proper cleanup of PDB bo
Josef Bacik josef@toxicpanda.com btrfs: do not warn for free space inode in cow_file_range
Darrick J. Wong djwong@kernel.org btrfs: fix fallocate to use file_modified to update permissions consistently
Aurabindo Pillai aurabindo.pillai@amd.com drm/amd: Add USBC connector ID
Nicholas Piggin npiggin@gmail.com KVM: PPC: Book3S HV P9: Fix "lost kick" race
Jens Axboe axboe@kernel.dk io_uring: abort file assignment prior to assigning creds
Ming Lei ming.lei@redhat.com block: null_blk: end timed out poll request
Ming Lei ming.lei@redhat.com block: fix offset/size check in bio_trim()
Jeremy Linton jeremy.linton@arm.com net: bcmgenet: Revert "Use stronger register read/writes to assure ordering"
Vladimir Oltean vladimir.oltean@nxp.com net: dsa: felix: fix tagging protocol changes with multiple CPU ports
Antoine Tenart atenart@kernel.org tun: annotate access to queue->trans_start
Jason Gunthorpe jgg@ziepe.ca vfio/pci: Fix vf_token mechanism when device-specific VF drivers are used
Khazhismel Kumykov khazhy@google.com dm mpath: only use ktime_get_ns() in historical selector
Harshit Mogalapalli harshit.m.mogalapalli@oracle.com cifs: potential buffer overflow in handling symlinks
Lin Ma linma@zju.edu.cn nfc: nci: add flush_workqueue to prevent uaf
Dylan Hung dylan_hung@aspeedtech.com net: ftgmac100: access hardware register after clock ready
Martin Willi martin@strongswan.org macvlan: Fix leaking skb in source mode with nodst option
Adrian Hunter adrian.hunter@intel.com perf tools: Fix misleading add event PMU debug message
Takashi Iwai tiwai@suse.de ALSA: usb-audio: Limit max buffer and period sizes per time
Takashi Iwai tiwai@suse.de ALSA: usb-audio: Increase max buffer size
Athira Rajeev atrajeev@linux.vnet.ibm.com testing/selftests/mqueue: Fix mq_perf_tests to free the allocated cpu set
Dylan Yudaken dylany@fb.com io_uring: verify pad field is 0 in io_get_ext_arg
Dylan Yudaken dylany@fb.com io_uring: verify that resv2 is 0 in io_uring_rsrc_update2
Dylan Yudaken dylany@fb.com io_uring: move io_uring_rsrc_update2 validation
Takashi Iwai tiwai@suse.de ALSA: mtpav: Don't call card private_free at probe error path
Takashi Iwai tiwai@suse.de ALSA: ad1889: Fix the missing snd_card_free() call at probe error
Pavel Begunkov asml.silence@gmail.com io_uring: fix assign file locking issue
Antoine Tenart atenart@kernel.org netfilter: nf_tables: nft_parse_register can return a negative value
Horatiu Vultur horatiu.vultur@microchip.com net: lan966x: Stop processing the MAC entry is port is wrong.
Horatiu Vultur horatiu.vultur@microchip.com net: lan966x: Fix when a port's upper is changed.
Petr Malat oss@malat.biz sctp: Initialize daddr on peeled off socket
Mike Christie michael.christie@oracle.com scsi: iscsi: Fix unbound endpoint error handling
Mike Christie michael.christie@oracle.com scsi: iscsi: Fix conn cleanup and stop race during iscsid restart
Mike Christie michael.christie@oracle.com scsi: iscsi: Fix endpoint reuse regression
Mike Christie michael.christie@oracle.com scsi: iscsi: Fix offload conn cleanup when iscsid restarts
Mike Christie michael.christie@oracle.com scsi: iscsi: Move iscsi_ep_disconnect()
Ajish Koshy Ajish.Koshy@microchip.com scsi: pm80xx: Enable upper inbound, outbound queues
Ajish Koshy Ajish.Koshy@microchip.com scsi: pm80xx: Mask and unmask upper interrupt vectors 32-63
Karsten Graul kgraul@linux.ibm.com net/smc: Fix NULL pointer dereference in smc_pnet_find_ib()
Karsten Graul kgraul@linux.ibm.com net/smc: use memcpy instead of snprintf to avoid out of bounds read
Jens Axboe axboe@kernel.dk io_uring: stop using io_wq_work as an fd placeholder
Kuogee Hsieh quic_khsieh@quicinc.com drm/msm/dp: add fail safe mode outside of event_mutex context
Stephen Boyd swboyd@chromium.org drm/msm/dsi: Use connector directly in msm_dsi_manager_connector_init()
Rob Clark robdclark@chromium.org drm/msm: Fix range size vs end confusion
Florian Westphal fw@strlen.de netfilter: nft_socket: make cgroup match work in input too
Ben Greear greearb@candelatech.com mac80211: fix ht_capa printout in debugfs
Rameshkumar Sundaram quic_ramess@quicinc.com cfg80211: hold bss_lock while updating nontrans_list
Benedikt Spranger b.spranger@linutronix.de net/sched: taprio: Check if socket flags are valid
Dinh Nguyen dinguyen@kernel.org net: ethernet: stmmac: fix altr_tse_pcs function when using a fixed-link
Jens Axboe axboe@kernel.dk io_uring: flag the fact that linked file assignment is sane
Heiko Stuebner heiko@sntech.de RISC-V: KVM: include missing hwcap.h into vcpu_fp
Anup Patel apatel@ventanamicro.com KVM: selftests: riscv: Fix alignment of the guest_hang() function
Anup Patel apatel@ventanamicro.com KVM: selftests: riscv: Set PTE A and D bits in VS-stage page table
Michael Walle michael@walle.cc net: dsa: felix: suppress -EPROBE_DEFER errors
Dave Wysochanski dwysocha@redhat.com cachefiles: Fix KASAN slab-out-of-bounds in cachefiles_set_volume_xattr
Jeffle Xu jefflexu@linux.alibaba.com cachefiles: unmark inode in use in error path
Marcelo Ricardo Leitner marcelo.leitner@gmail.com net/sched: fix initialization order when updating chain 0 head
Xin Long lucien.xin@gmail.com sctp: use the correct skb for security_sctp_assoc_request
Vadim Pasternak vadimp@nvidia.com mlxsw: i2c: Fix initialization error flow
Vladimir Oltean vladimir.oltean@nxp.com net: mdio: don't defer probe forever if PHY IRQ provider is missing
Mateusz Palczewski mateusz.palczewski@intel.com Revert "iavf: Fix deadlock occurrence during resetting VF interface"
Alexander Lobakin alexandr.lobakin@intel.com ice: arfs: fix use-after-free when freeing @rx_cpu_rmap
Shyam Prasad N sprasad@microsoft.com cifs: release cached dentries only if mount is complete
Linus Torvalds torvalds@linux-foundation.org gpiolib: acpi: use correct format characters
Guillaume Nault gnault@redhat.com veth: Ensure eth header is in skb's linear part
Vlad Buslov vladbu@nvidia.com net/sched: flower: fix parsing of ethertype following VLAN header
Chuck Lever chuck.lever@oracle.com SUNRPC: Fix the svc_deferred_event trace class
Reiji Watanabe reijiw@google.com KVM: arm64: mixed-width check should be skipped for uninitialized vCPUs
Marc Zyngier maz@kernel.org KVM: arm64: Generalise VM features into a set of flags
Kyle Copperfield kmcopper@danwin1210.me media: rockchip/rga: do proper error checking in probe
Cristian Marussi cristian.marussi@arm.com firmware: arm_scmi: Fix sorting of retrieved clock rates
Anilkumar Kolli quic_akolli@quicinc.com Revert "ath11k: mesh: add support for 256 bitmap in blockack frames in 11ax"
Miaoqian Lin linmq006@gmail.com memory: atmel-ebi: Fix missing of_node_put in atmel_ebi_probe
Cristian Marussi cristian.marussi@arm.com firmware: arm_scmi: Remove clear channel call on the TX channel
Trond Myklebust trond.myklebust@hammerspace.com nfsd: Fix a write performance regression
Rob Clark robdclark@chromium.org drm/msm: Add missing put_task_struct() in debugfs path
Takashi Iwai tiwai@suse.de ALSA: nm256: Don't call card private_free at probe error path
Takashi Iwai tiwai@suse.de ALSA: memalloc: Add fallback SG-buffer allocations for x86
Takashi Iwai tiwai@suse.de ALSA: usb-audio: Cap upper limits of buffer/period bytes for implicit fb
Takashi Iwai tiwai@suse.de ALSA: via82xx: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: sonicvibes: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: sc6000: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: rme96: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: rme9652: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: rme32: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: riptide: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: oxygen: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: maestro3: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: lx6464es: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: lola: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: korg1212: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: intel_hdmi: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: intel8x0: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: ice1724: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: hdspm: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: hdsp: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: galaxy: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: fm801: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: es1968: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: es1938: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: ens137x: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: emu10k1x: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: echoaudio: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: cs5535audio: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: cs4281: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: cmipci: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: ca0106: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: bt87x: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: azt3328: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: aw2: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: au88x0: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: atiixp: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: als4000: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: als300: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: ali5451: Fix the missing snd_card_free() call at probe error
Takashi Iwai tiwai@suse.de ALSA: sis7019: Fix the missing error handling
Takashi Iwai tiwai@suse.de ALSA: core: Add snd_card_free_on_error() helper
Naohiro Aota naohiro.aota@wdc.com btrfs: return allocated block group from do_chunk_alloc()
Dennis Zhou dennis@kernel.org btrfs: fix btrfs_submit_compressed_write cgroup attribution
Naohiro Aota naohiro.aota@wdc.com btrfs: release correct delalloc amount in direct IO write path
Kai-Heng Feng kai.heng.feng@canonical.com drm/amdgpu: Ensure HDA function is suspended before ASIC reset
Tadeusz Struk tadeusz.struk@linaro.org uapi/linux/stddef.h: Add include guards
Piotr Chmura chmooreck@gmail.com media: si2157: unknown chip version Si2147-A30 ROM 0x50
Anup Patel apatel@ventanamicro.com RISC-V: KVM: Don't clear hgatp CSR in kvm_arch_vcpu_put()
Nathan Chancellor nathan@kernel.org btrfs: remove unused variable in btrfs_{start,write}_dirty_block_groups()
Filipe Manana fdmanana@suse.com btrfs: remove no longer used counter when reading data page
Alvin Šipraga alsi@bang-olufsen.dk net: dsa: realtek: make interface drivers depend on OF
Alvin Šipraga alsi@bang-olufsen.dk net: dsa: realtek: rtl8365mb: serialize indirect PHY register access
Alvin Šipraga alsi@bang-olufsen.dk net: dsa: realtek: allow subdrivers to externally lock regmap
Mario Limonciello mario.limonciello@amd.com ACPI: processor idle: Check for architectural support for LPI
Mario Limonciello mario.limonciello@amd.com cpuidle: PSCI: Move the `has_lpi` check to the beginning of the function
Nicholas Kazlauskas nicholas.kazlauskas@amd.com drm/amd/display: Fix p-state allow debug index on dcn31
Nicholas Kazlauskas nicholas.kazlauskas@amd.com drm/amd/display: Add pstate verification and recovery for DCN31
-------------
Diffstat:
.../memory-controllers/synopsys,ddrc-ecc.yaml | 6 +- .../devicetree/bindings/net/snps,dwmac.yaml | 6 +- Makefile | 4 +- arch/arm/mach-davinci/board-da850-evm.c | 4 +- arch/arm/mach-ep93xx/clock.c | 4 +- arch/arm64/include/asm/kvm_emulate.h | 27 +++- arch/arm64/include/asm/kvm_host.h | 25 ++- arch/arm64/kernel/alternative.c | 6 +- arch/arm64/kernel/cpuidle.c | 6 +- arch/arm64/kvm/arm.c | 7 +- arch/arm64/kvm/mmio.c | 3 +- arch/arm64/kvm/pmu-emul.c | 2 +- arch/arm64/kvm/reset.c | 65 +++++--- arch/powerpc/include/asm/static_call.h | 1 + arch/powerpc/kvm/book3s_hv.c | 41 ++++- arch/riscv/kvm/vcpu.c | 2 - arch/riscv/kvm/vcpu_fp.c | 1 + arch/x86/include/asm/kvm_host.h | 5 +- arch/x86/include/asm/msr-index.h | 4 +- arch/x86/include/asm/static_call.h | 2 + arch/x86/kernel/cpu/common.c | 2 + arch/x86/kernel/cpu/cpu.h | 5 +- arch/x86/kernel/cpu/intel.c | 7 - arch/x86/kernel/cpu/tsx.c | 104 +++++++++++-- arch/x86/kvm/mmu/mmu.c | 20 ++- arch/x86/kvm/x86.c | 20 ++- arch/x86/net/bpf_jit_comp.c | 1 + block/bio.c | 2 +- drivers/acpi/processor_idle.c | 15 +- drivers/ata/libata-core.c | 3 + drivers/base/dd.c | 1 + drivers/block/drbd/drbd_main.c | 1 + drivers/block/null_blk/main.c | 2 +- drivers/firmware/arm_scmi/clock.c | 3 +- drivers/firmware/arm_scmi/driver.c | 3 +- drivers/gpio/gpio-sim.c | 4 +- drivers/gpio/gpiolib-acpi.c | 4 +- drivers/gpu/drm/amd/amdgpu/ObjectID.h | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 +- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 20 ++- drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 2 + drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 5 +- drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 4 +- drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c | 3 + drivers/gpu/drm/amd/amdkfd/kfd_device.c | 11 +- drivers/gpu/drm/amd/amdkfd/kfd_events.c | 2 + drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 +- drivers/gpu/drm/amd/display/dc/core/dc_resource.c | 4 +- .../gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.c | 1 + .../drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c | 29 ++-- drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c | 14 +- .../gpu/drm/amd/display/dc/dcn30/dcn30_hubbub.c | 1 + drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c | 5 +- .../gpu/drm/amd/display/dc/dcn301/dcn301_hubbub.c | 1 + .../gpu/drm/amd/display/dc/dcn31/dcn31_hubbub.c | 62 ++++++++ drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hwseq.c | 5 +- .../gpu/drm/amd/display/dc/dcn31/dcn31_resource.c | 2 +- drivers/gpu/drm/amd/display/dc/dsc/dc_dsc.c | 4 +- drivers/gpu/drm/amd/display/dc/inc/hw/dchubbub.h | 2 + .../amd/display/modules/info_packet/info_packet.c | 5 +- drivers/gpu/drm/i915/gem/i915_gem_mman.c | 2 +- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 2 +- drivers/gpu/drm/msm/dp/dp_display.c | 6 + drivers/gpu/drm/msm/dp/dp_panel.c | 20 +-- drivers/gpu/drm/msm/dp/dp_panel.h | 1 + drivers/gpu/drm/msm/dsi/dsi_manager.c | 2 +- drivers/gpu/drm/msm/msm_gem.c | 1 + drivers/gpu/ipu-v3/ipu-di.c | 5 +- drivers/hv/hv_balloon.c | 36 ++++- drivers/hv/hv_common.c | 11 ++ drivers/hv/ring_buffer.c | 11 +- drivers/hv/vmbus_drv.c | 49 +++++- drivers/i2c/busses/i2c-pasemi-core.c | 6 + drivers/i2c/i2c-dev.c | 15 +- drivers/md/dm-integrity.c | 7 +- drivers/md/dm-ps-historical-service-time.c | 11 +- drivers/media/platform/rockchip/rga/rga.c | 2 +- drivers/media/tuners/si2157.c | 22 +-- drivers/memory/atmel-ebi.c | 23 ++- drivers/memory/renesas-rpc-if.c | 10 +- drivers/net/dsa/ocelot/felix.c | 23 +++ drivers/net/dsa/ocelot/felix_vsc9959.c | 2 +- drivers/net/dsa/realtek/Kconfig | 1 + drivers/net/dsa/realtek/realtek-smi-core.c | 48 +++++- drivers/net/dsa/realtek/realtek-smi-core.h | 2 + drivers/net/dsa/realtek/rtl8365mb.c | 54 ++++--- drivers/net/ethernet/broadcom/genet/bcmgenet.c | 4 +- drivers/net/ethernet/faraday/ftgmac100.c | 10 +- drivers/net/ethernet/intel/iavf/iavf_main.c | 7 +- drivers/net/ethernet/intel/ice/ice_arfs.c | 9 +- drivers/net/ethernet/intel/ice/ice_lib.c | 5 +- drivers/net/ethernet/intel/ice/ice_main.c | 18 +-- drivers/net/ethernet/mellanox/mlxsw/i2c.c | 1 + drivers/net/ethernet/micrel/Kconfig | 1 + .../net/ethernet/microchip/lan966x/lan966x_mac.c | 6 +- .../ethernet/microchip/lan966x/lan966x_switchdev.c | 3 +- drivers/net/ethernet/myricom/myri10ge/myri10ge.c | 6 +- drivers/net/ethernet/stmicro/stmmac/altr_tse_pcs.c | 8 - drivers/net/ethernet/stmicro/stmmac/altr_tse_pcs.h | 4 + .../net/ethernet/stmicro/stmmac/dwmac-socfpga.c | 13 +- drivers/net/ethernet/xilinx/xilinx_axienet_main.c | 13 +- drivers/net/macvlan.c | 8 +- drivers/net/mdio/fwnode_mdio.c | 5 + drivers/net/slip/slip.c | 2 +- drivers/net/tun.c | 2 +- drivers/net/usb/aqc111.c | 9 +- drivers/net/veth.c | 2 +- drivers/net/wireless/ath/ath11k/mac.c | 22 ++- drivers/net/wireless/ath/ath9k/main.c | 2 +- drivers/net/wireless/ath/ath9k/xmit.c | 33 ++-- drivers/pci/controller/pci-hyperv.c | 9 ++ drivers/perf/fsl_imx8_ddr_perf.c | 2 +- drivers/regulator/wm8994-regulator.c | 42 +++++- drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c | 2 +- drivers/scsi/lpfc/lpfc.h | 7 +- drivers/scsi/lpfc/lpfc_crtn.h | 3 + drivers/scsi/lpfc/lpfc_hbadisc.c | 120 ++++++++++++--- drivers/scsi/lpfc/lpfc_init.c | 88 ++++++++--- drivers/scsi/lpfc/lpfc_nvme.c | 27 ++-- drivers/scsi/lpfc/lpfc_sli.c | 65 +++++--- drivers/scsi/megaraid/megaraid_sas.h | 3 + drivers/scsi/megaraid/megaraid_sas_base.c | 7 + drivers/scsi/mpt3sas/mpt3sas_config.c | 9 +- drivers/scsi/mvsas/mv_init.c | 1 + drivers/scsi/pm8001/pm80xx_hwi.c | 33 ++-- drivers/scsi/scsi_transport_iscsi.c | 168 +++++++++++++-------- drivers/spi/spi-cadence-quadspi.c | 46 ++---- drivers/target/target_core_user.c | 3 +- drivers/vfio/pci/vfio_pci_core.c | 124 +++++++++------ fs/binfmt_elf.c | 4 +- fs/btrfs/block-group.c | 40 +++-- fs/btrfs/block-group.h | 4 + fs/btrfs/compression.c | 8 + fs/btrfs/disk-io.c | 5 +- fs/btrfs/extent-tree.c | 2 +- fs/btrfs/extent_io.c | 5 +- fs/btrfs/file.c | 13 +- fs/btrfs/inode.c | 7 +- fs/btrfs/volumes.c | 2 + fs/cachefiles/namei.c | 33 ++-- fs/cachefiles/xattr.c | 2 +- fs/cifs/cifsfs.c | 28 ++-- fs/cifs/link.c | 3 + fs/io-wq.h | 1 - fs/io_uring.c | 54 ++++--- fs/nfsd/filecache.c | 18 ++- include/asm-generic/mshyperv.h | 1 + include/asm-generic/tlb.h | 10 +- include/linux/kfence.h | 24 +++ include/linux/static_call.h | 20 ++- include/linux/sunrpc/svc.h | 1 + include/linux/vfio_pci_core.h | 2 + include/net/flow_dissector.h | 2 + include/scsi/scsi_transport_iscsi.h | 2 + include/sound/core.h | 1 + include/sound/memalloc.h | 5 + include/trace/events/sunrpc.h | 7 +- include/uapi/linux/io_uring.h | 1 + include/uapi/linux/stddef.h | 4 + kernel/cpu.c | 36 ++--- kernel/dma/direct.h | 3 +- kernel/irq/affinity.c | 5 +- kernel/smp.c | 2 +- kernel/time/tick-sched.c | 2 +- kernel/time/timer.c | 11 +- mm/hugetlb.c | 17 ++- mm/kfence/core.c | 21 --- mm/kfence/kfence.h | 21 +++ mm/kfence/report.c | 47 ++++++ mm/kmemleak.c | 8 +- mm/page_alloc.c | 2 +- mm/page_io.c | 54 ------- mm/secretmem.c | 17 +++ mm/slab.c | 2 +- mm/slab.h | 2 +- mm/slab_common.c | 9 ++ mm/slob.c | 2 +- mm/slub.c | 2 +- net/ax25/af_ax25.c | 5 + net/core/flow_dissector.c | 1 + net/dsa/dsa2.c | 23 ++- net/ipv6/ip6_output.c | 2 +- net/mac80211/debugfs_sta.c | 2 +- net/netfilter/nf_tables_api.c | 2 +- net/netfilter/nft_socket.c | 7 +- net/nfc/nci/core.c | 4 + net/sched/cls_api.c | 2 +- net/sched/cls_flower.c | 18 ++- net/sched/sch_taprio.c | 3 +- net/sctp/sm_statefuns.c | 6 +- net/sctp/socket.c | 2 +- net/smc/smc_clc.c | 6 +- net/smc/smc_pnet.c | 5 +- net/sunrpc/svc_xprt.c | 3 + net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 2 +- net/wireless/nl80211.c | 3 +- net/wireless/scan.c | 2 + scripts/gcc-plugins/latent_entropy_plugin.c | 44 +++--- sound/core/init.c | 28 ++++ sound/core/memalloc.c | 111 +++++++++++++- sound/core/pcm_misc.c | 2 +- sound/drivers/mtpav.c | 4 +- sound/isa/galaxy/galaxy.c | 7 +- sound/isa/sc6000.c | 7 +- sound/pci/ad1889.c | 10 +- sound/pci/ali5451/ali5451.c | 10 +- sound/pci/als300.c | 8 +- sound/pci/als4000.c | 10 +- sound/pci/atiixp.c | 10 +- sound/pci/atiixp_modem.c | 10 +- sound/pci/au88x0/au88x0.c | 8 +- sound/pci/aw2/aw2-alsa.c | 8 +- sound/pci/azt3328.c | 8 +- sound/pci/bt87x.c | 10 +- sound/pci/ca0106/ca0106_main.c | 10 +- sound/pci/cmipci.c | 8 +- sound/pci/cs4281.c | 10 +- sound/pci/cs5535audio/cs5535audio.c | 10 +- sound/pci/echoaudio/echoaudio.c | 9 +- sound/pci/emu10k1/emu10k1x.c | 10 +- sound/pci/ens1370.c | 10 +- sound/pci/es1938.c | 10 +- sound/pci/es1968.c | 10 +- sound/pci/fm801.c | 10 +- sound/pci/hda/patch_realtek.c | 2 + sound/pci/ice1712/ice1724.c | 10 +- sound/pci/intel8x0.c | 10 +- sound/pci/intel8x0m.c | 10 +- sound/pci/korg1212/korg1212.c | 8 +- sound/pci/lola/lola.c | 10 +- sound/pci/lx6464es/lx6464es.c | 8 +- sound/pci/maestro3.c | 8 +- sound/pci/nm256/nm256.c | 2 +- sound/pci/oxygen/oxygen_lib.c | 12 +- sound/pci/riptide/riptide.c | 8 +- sound/pci/rme32.c | 8 +- sound/pci/rme96.c | 10 +- sound/pci/rme9652/hdsp.c | 8 +- sound/pci/rme9652/hdspm.c | 8 +- sound/pci/rme9652/rme9652.c | 8 +- sound/pci/sis7019.c | 14 +- sound/pci/sonicvibes.c | 10 +- sound/pci/via82xx.c | 10 +- sound/pci/via82xx_modem.c | 10 +- sound/usb/pcm.c | 16 +- sound/x86/intel_hdmi_audio.c | 7 +- tools/arch/x86/include/asm/msr-index.h | 4 +- tools/perf/util/parse-events.c | 5 +- .../selftests/kvm/include/riscv/processor.h | 4 +- tools/testing/selftests/kvm/lib/riscv/processor.c | 2 +- tools/testing/selftests/mqueue/mq_perf_tests.c | 25 ++- virt/kvm/kvm_main.c | 10 +- 254 files changed, 2318 insertions(+), 927 deletions(-)
From: Nicholas Kazlauskas nicholas.kazlauskas@amd.com
commit e7031d8258f1b4d6d50e5e5b5d92ba16f66eb8b4 upstream.
[Why] To debug when p-state is being blocked and avoid PMFW hangs when it does occur.
[How] Re-use the DCN10 hardware sequencer by adding a new interface for verifying p-state high on the hubbub. The interface is mostly the same as the DCN10 interface, but the bit definitions have changed for the debug bus.
Signed-off-by: Nicholas Kazlauskas nicholas.kazlauskas@amd.com Reviewed-by: Eric Yang Eric.Yang2@amd.com Reviewed-by: Harry Wentland harry.wentland@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.c | 1 drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c | 10 +- drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hubbub.c | 1 drivers/gpu/drm/amd/display/dc/dcn301/dcn301_hubbub.c | 1 drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hubbub.c | 60 ++++++++++++++ drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c | 2 drivers/gpu/drm/amd/display/dc/inc/hw/dchubbub.h | 2 7 files changed, 73 insertions(+), 4 deletions(-)
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.c +++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hubbub.c @@ -940,6 +940,7 @@ static const struct hubbub_funcs hubbub1 .program_watermarks = hubbub1_program_watermarks, .is_allow_self_refresh_enabled = hubbub1_is_allow_self_refresh_enabled, .allow_self_refresh_control = hubbub1_allow_self_refresh_control, + .verify_allow_pstate_change_high = hubbub1_verify_allow_pstate_change_high, };
void hubbub1_construct(struct hubbub *hubbub, --- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c +++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c @@ -1112,9 +1112,13 @@ static bool dcn10_hw_wa_force_recovery(s
void dcn10_verify_allow_pstate_change_high(struct dc *dc) { + struct hubbub *hubbub = dc->res_pool->hubbub; static bool should_log_hw_state; /* prevent hw state log by default */
- if (!hubbub1_verify_allow_pstate_change_high(dc->res_pool->hubbub)) { + if (!hubbub->funcs->verify_allow_pstate_change_high) + return; + + if (!hubbub->funcs->verify_allow_pstate_change_high(hubbub)) { int i = 0;
if (should_log_hw_state) @@ -1123,8 +1127,8 @@ void dcn10_verify_allow_pstate_change_hi TRACE_DC_PIPE_STATE(pipe_ctx, i, MAX_PIPES); BREAK_TO_DEBUGGER(); if (dcn10_hw_wa_force_recovery(dc)) { - /*check again*/ - if (!hubbub1_verify_allow_pstate_change_high(dc->res_pool->hubbub)) + /*check again*/ + if (!hubbub->funcs->verify_allow_pstate_change_high(hubbub)) BREAK_TO_DEBUGGER(); } } --- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hubbub.c +++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hubbub.c @@ -448,6 +448,7 @@ static const struct hubbub_funcs hubbub3 .program_watermarks = hubbub3_program_watermarks, .allow_self_refresh_control = hubbub1_allow_self_refresh_control, .is_allow_self_refresh_enabled = hubbub1_is_allow_self_refresh_enabled, + .verify_allow_pstate_change_high = hubbub1_verify_allow_pstate_change_high, .force_wm_propagate_to_pipes = hubbub3_force_wm_propagate_to_pipes, .force_pstate_change_control = hubbub3_force_pstate_change_control, .init_watermarks = hubbub3_init_watermarks, --- a/drivers/gpu/drm/amd/display/dc/dcn301/dcn301_hubbub.c +++ b/drivers/gpu/drm/amd/display/dc/dcn301/dcn301_hubbub.c @@ -60,6 +60,7 @@ static const struct hubbub_funcs hubbub3 .program_watermarks = hubbub3_program_watermarks, .allow_self_refresh_control = hubbub1_allow_self_refresh_control, .is_allow_self_refresh_enabled = hubbub1_is_allow_self_refresh_enabled, + .verify_allow_pstate_change_high = hubbub1_verify_allow_pstate_change_high, .force_wm_propagate_to_pipes = hubbub3_force_wm_propagate_to_pipes, .force_pstate_change_control = hubbub3_force_pstate_change_control, .hubbub_read_state = hubbub2_read_state, --- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hubbub.c +++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hubbub.c @@ -949,6 +949,65 @@ static void hubbub31_get_dchub_ref_freq( } }
+static bool hubbub31_verify_allow_pstate_change_high(struct hubbub *hubbub) +{ + struct dcn20_hubbub *hubbub2 = TO_DCN20_HUBBUB(hubbub); + + /* + * Pstate latency is ~20us so if we wait over 40us and pstate allow + * still not asserted, we are probably stuck and going to hang + */ + const unsigned int pstate_wait_timeout_us = 100; + const unsigned int pstate_wait_expected_timeout_us = 40; + + static unsigned int max_sampled_pstate_wait_us; /* data collection */ + static bool forced_pstate_allow; /* help with revert wa */ + + unsigned int debug_data = 0; + unsigned int i; + + if (forced_pstate_allow) { + /* we hacked to force pstate allow to prevent hang last time + * we verify_allow_pstate_change_high. so disable force + * here so we can check status + */ + REG_UPDATE_2(DCHUBBUB_ARB_DRAM_STATE_CNTL, + DCHUBBUB_ARB_ALLOW_PSTATE_CHANGE_FORCE_VALUE, 0, + DCHUBBUB_ARB_ALLOW_PSTATE_CHANGE_FORCE_ENABLE, 0); + forced_pstate_allow = false; + } + + REG_WRITE(DCHUBBUB_TEST_DEBUG_INDEX, hubbub2->debug_test_index_pstate); + + for (i = 0; i < pstate_wait_timeout_us; i++) { + debug_data = REG_READ(DCHUBBUB_TEST_DEBUG_DATA); + + /* Debug bit is specific to ASIC. */ + if (debug_data & (1 << 26)) { + if (i > pstate_wait_expected_timeout_us) + DC_LOG_WARNING("pstate took longer than expected ~%dus\n", i); + return true; + } + if (max_sampled_pstate_wait_us < i) + max_sampled_pstate_wait_us = i; + + udelay(1); + } + + /* force pstate allow to prevent system hang + * and break to debugger to investigate + */ + REG_UPDATE_2(DCHUBBUB_ARB_DRAM_STATE_CNTL, + DCHUBBUB_ARB_ALLOW_PSTATE_CHANGE_FORCE_VALUE, 1, + DCHUBBUB_ARB_ALLOW_PSTATE_CHANGE_FORCE_ENABLE, 1); + forced_pstate_allow = true; + + DC_LOG_WARNING("pstate TEST_DEBUG_DATA: 0x%X\n", + debug_data); + + return false; +} + static const struct hubbub_funcs hubbub31_funcs = { .update_dchub = hubbub2_update_dchub, .init_dchub_sys_ctx = hubbub31_init_dchub_sys_ctx, @@ -961,6 +1020,7 @@ static const struct hubbub_funcs hubbub3 .program_watermarks = hubbub31_program_watermarks, .allow_self_refresh_control = hubbub1_allow_self_refresh_control, .is_allow_self_refresh_enabled = hubbub1_is_allow_self_refresh_enabled, + .verify_allow_pstate_change_high = hubbub31_verify_allow_pstate_change_high, .program_det_size = dcn31_program_det_size, .program_compbuf_size = dcn31_program_compbuf_size, .init_crb = dcn31_init_crb, --- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c +++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_resource.c @@ -1011,7 +1011,7 @@ static const struct dc_debug_options deb .max_downscale_src_width = 4096,/*upto true 4K*/ .disable_pplib_wm_range = false, .scl_reset_length10 = true, - .sanity_checks = false, + .sanity_checks = true, .underflow_assert_delay_us = 0xFFFFFFFF, .dwb_fi_phase = -1, // -1 = disable, .dmub_command_table = true, --- a/drivers/gpu/drm/amd/display/dc/inc/hw/dchubbub.h +++ b/drivers/gpu/drm/amd/display/dc/inc/hw/dchubbub.h @@ -154,6 +154,8 @@ struct hubbub_funcs { bool (*is_allow_self_refresh_enabled)(struct hubbub *hubbub); void (*allow_self_refresh_control)(struct hubbub *hubbub, bool allow);
+ bool (*verify_allow_pstate_change_high)(struct hubbub *hubbub); + void (*apply_DEDCN21_147_wa)(struct hubbub *hubbub);
void (*force_wm_propagate_to_pipes)(struct hubbub *hubbub);
From: Nicholas Kazlauskas nicholas.kazlauskas@amd.com
commit 3107e1a7ae088ee94323fe9ab05dbefd65b3077f upstream.
[Why] It changed since dcn30 but the hubbub31 constructor hasn't been modified to reflect this.
[How] Update the value in the constructor to 0x6 so we're checking the right bits for p-state allow.
It worked before by accident, but can falsely assert 0 depending on HW state transitions. The most frequent of which appears to be when all pipes turn off during IGT tests.
Cc: Harry Wentland harry.wentland@amd.com
Fixes: e7031d8258f1b4 ("drm/amd/display: Add pstate verification and recovery for DCN31") Signed-off-by: Nicholas Kazlauskas nicholas.kazlauskas@amd.com Reviewed-by: Eric Yang Eric.Yang2@amd.com Acked-by: Harry Wentland harry.wentland@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hubbub.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hubbub.c +++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hubbub.c @@ -1042,5 +1042,7 @@ void hubbub31_construct(struct dcn20_hub hubbub31->detile_buf_size = det_size_kb * 1024; hubbub31->pixel_chunk_size = pixel_chunk_size_kb * 1024; hubbub31->crb_size_segs = config_return_buffer_size_kb / DCN31_CRB_SEGMENT_SIZE_KB; + + hubbub31->debug_test_index_pstate = 0x6; }
From: Mario Limonciello mario.limonciello@amd.com
commit 01f6c7338ce267959975da65d86ba34f44d54220 upstream.
Currently the first thing checked is whether the PCSI cpu_suspend function has been initialized.
Another change will be overloading `acpi_processor_ffh_lpi_probe` and calling it sooner. So make the `has_lpi` check the first thing checked to prepare for that change.
Reviewed-by: Sudeep Holla sudeep.holla@arm.com Signed-off-by: Mario Limonciello mario.limonciello@amd.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/kernel/cpuidle.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/arch/arm64/kernel/cpuidle.c +++ b/arch/arm64/kernel/cpuidle.c @@ -54,6 +54,9 @@ static int psci_acpi_cpu_init_idle(unsig struct acpi_lpi_state *lpi; struct acpi_processor *pr = per_cpu(processors, cpu);
+ if (unlikely(!pr || !pr->flags.has_lpi)) + return -EINVAL; + /* * If the PSCI cpu_suspend function hook has not been initialized * idle states must not be enabled, so bail out @@ -61,9 +64,6 @@ static int psci_acpi_cpu_init_idle(unsig if (!psci_ops.cpu_suspend) return -EOPNOTSUPP;
- if (unlikely(!pr || !pr->flags.has_lpi)) - return -EINVAL; - count = pr->power.count - 1; if (count <= 0) return -ENODEV;
From: Mario Limonciello mario.limonciello@amd.com
commit eb087f305919ee8169ad65665610313e74260463 upstream.
When `osc_pc_lpi_support_confirmed` is set through `_OSC` and `_LPI` is populated then the cpuidle driver assumes that LPI is fully functional.
However currently the kernel only provides architectural support for LPI on ARM. This leads to high power consumption on X86 platforms that otherwise try to enable LPI.
So probe whether or not LPI support is implemented before enabling LPI in the kernel. This is done by overloading `acpi_processor_ffh_lpi_probe` to check whether it returns `-EOPNOTSUPP`. It also means that all future implementations of `acpi_processor_ffh_lpi_probe` will need to follow these semantics as well.
Reviewed-by: Sudeep Holla sudeep.holla@arm.com Signed-off-by: Mario Limonciello mario.limonciello@amd.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/acpi/processor_idle.c | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-)
--- a/drivers/acpi/processor_idle.c +++ b/drivers/acpi/processor_idle.c @@ -1079,6 +1079,11 @@ static int flatten_lpi_states(struct acp return 0; }
+int __weak acpi_processor_ffh_lpi_probe(unsigned int cpu) +{ + return -EOPNOTSUPP; +} + static int acpi_processor_get_lpi_info(struct acpi_processor *pr) { int ret, i; @@ -1087,6 +1092,11 @@ static int acpi_processor_get_lpi_info(s struct acpi_device *d = NULL; struct acpi_lpi_states_array info[2], *tmp, *prev, *curr;
+ /* make sure our architecture has support */ + ret = acpi_processor_ffh_lpi_probe(pr->id); + if (ret == -EOPNOTSUPP) + return ret; + if (!osc_pc_lpi_support_confirmed) return -EOPNOTSUPP;
@@ -1138,11 +1148,6 @@ static int acpi_processor_get_lpi_info(s return 0; }
-int __weak acpi_processor_ffh_lpi_probe(unsigned int cpu) -{ - return -ENODEV; -} - int __weak acpi_processor_ffh_lpi_enter(struct acpi_lpi_state *lpi) { return -ENODEV;
From: Alvin Šipraga alsi@bang-olufsen.dk
commit 907e772f6f6debb610ea28298ab57b31019a4edb upstream.
Currently there is no way for Realtek DSA subdrivers to serialize consecutive regmap accesses. In preparation for a bugfix relating to indirect PHY register access - which involves a series of regmap reads and writes - add a facility for subdrivers to serialize their regmap access.
Specifically, a mutex is added to the driver private data structure and the standard regmap is initialized with custom lock/unlock ops which use this mutex. Then, a "nolock" variant of the regmap is added, which is functionally equivalent to the existing regmap except that regmap locking is disabled. Functions that wish to serialize a sequence of regmap accesses may then lock the newly introduced driver-owned mutex before using the nolock regmap.
Doing things this way means that subdriver code that doesn't care about serialized register access - i.e. the vast majority of code - needn't worry about synchronizing register access with an external lock: it can just continue to use the original regmap.
Another advantage of this design is that, while regmaps with locking disabled do not expose a debugfs interface for obvious reasons, there still exists the original regmap which does expose this interface. This interface remains safe to use even combined with driver codepaths that use the nolock regmap, because said codepaths will use the same mutex to synchronize access.
With respect to disadvantages, it can be argued that having near-duplicate regmaps is confusing. However, the naming is rather explicit, and examples will abound.
Finally, while we are at it, rename realtek_smi_mdio_regmap_config to realtek_smi_regmap_config. This makes it consistent with the naming realtek_mdio_regmap_config in realtek-mdio.c.
Signed-off-by: Alvin Šipraga alsi@bang-olufsen.dk Reviewed-by: Vladimir Oltean olteanv@gmail.com Signed-off-by: David S. Miller davem@davemloft.net [alsi: backport to 5.16: s/priv/smi/g and remove realtek-mdio changes] Signed-off-by: Alvin Šipraga alsi@bang-olufsen.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/dsa/realtek/realtek-smi-core.c | 48 +++++++++++++++++++++++++++-- drivers/net/dsa/realtek/realtek-smi-core.h | 2 + 2 files changed, 47 insertions(+), 3 deletions(-)
--- a/drivers/net/dsa/realtek/realtek-smi-core.c +++ b/drivers/net/dsa/realtek/realtek-smi-core.c @@ -315,7 +315,21 @@ static int realtek_smi_read(void *ctx, u return realtek_smi_read_reg(smi, reg, val); }
-static const struct regmap_config realtek_smi_mdio_regmap_config = { +static void realtek_smi_lock(void *ctx) +{ + struct realtek_smi *smi = ctx; + + mutex_lock(&smi->map_lock); +} + +static void realtek_smi_unlock(void *ctx) +{ + struct realtek_smi *smi = ctx; + + mutex_unlock(&smi->map_lock); +} + +static const struct regmap_config realtek_smi_regmap_config = { .reg_bits = 10, /* A4..A0 R4..R0 */ .val_bits = 16, .reg_stride = 1, @@ -325,6 +339,21 @@ static const struct regmap_config realte .reg_read = realtek_smi_read, .reg_write = realtek_smi_write, .cache_type = REGCACHE_NONE, + .lock = realtek_smi_lock, + .unlock = realtek_smi_unlock, +}; + +static const struct regmap_config realtek_smi_nolock_regmap_config = { + .reg_bits = 10, /* A4..A0 R4..R0 */ + .val_bits = 16, + .reg_stride = 1, + /* PHY regs are at 0x8000 */ + .max_register = 0xffff, + .reg_format_endian = REGMAP_ENDIAN_BIG, + .reg_read = realtek_smi_read, + .reg_write = realtek_smi_write, + .cache_type = REGCACHE_NONE, + .disable_locking = true, };
static int realtek_smi_mdio_read(struct mii_bus *bus, int addr, int regnum) @@ -388,6 +417,7 @@ static int realtek_smi_probe(struct plat const struct realtek_smi_variant *var; struct device *dev = &pdev->dev; struct realtek_smi *smi; + struct regmap_config rc; struct device_node *np; int ret;
@@ -398,13 +428,25 @@ static int realtek_smi_probe(struct plat if (!smi) return -ENOMEM; smi->chip_data = (void *)smi + sizeof(*smi); - smi->map = devm_regmap_init(dev, NULL, smi, - &realtek_smi_mdio_regmap_config); + + mutex_init(&smi->map_lock); + + rc = realtek_smi_regmap_config; + rc.lock_arg = smi; + smi->map = devm_regmap_init(dev, NULL, smi, &rc); if (IS_ERR(smi->map)) { ret = PTR_ERR(smi->map); dev_err(dev, "regmap init failed: %d\n", ret); return ret; } + + rc = realtek_smi_nolock_regmap_config; + smi->map_nolock = devm_regmap_init(dev, NULL, smi, &rc); + if (IS_ERR(smi->map_nolock)) { + ret = PTR_ERR(smi->map_nolock); + dev_err(dev, "regmap init failed: %d\n", ret); + return ret; + }
/* Link forward and backward */ smi->dev = dev; --- a/drivers/net/dsa/realtek/realtek-smi-core.h +++ b/drivers/net/dsa/realtek/realtek-smi-core.h @@ -49,6 +49,8 @@ struct realtek_smi { struct gpio_desc *mdc; struct gpio_desc *mdio; struct regmap *map; + struct regmap *map_nolock; + struct mutex map_lock; struct mii_bus *slave_mii_bus;
unsigned int clk_delay;
From: Alvin Šipraga alsi@bang-olufsen.dk
commit 2796728460b822d549841e0341752b263dc265c4 upstream.
Realtek switches in the rtl8365mb family can access the PHY registers of the internal PHYs via the switch registers. This method is called indirect access. At a high level, the indirect PHY register access method involves reading and writing some special switch registers in a particular sequence. This works for both SMI and MDIO connected switches.
Currently the rtl8365mb driver does not take any care to serialize the aforementioned access to the switch registers. In particular, it is permitted for other driver code to access other switch registers while the indirect PHY register access is ongoing. Locking is only done at the regmap level. This, however, is a bug: concurrent register access, even to unrelated switch registers, risks corrupting the PHY register value read back via the indirect access method described above.
Arınç reported that the switch sometimes returns nonsense data when reading the PHY registers. In particular, a value of 0 causes the kernel's PHY subsystem to think that the link is down, but since most reads return correct data, the link then flip-flops between up and down over a period of time.
The aforementioned bug can be readily observed by:
1. Enabling ftrace events for regmap and mdio 2. Polling BSMR PHY register for a connected port; it should always read the same (e.g. 0x79ed) 3. Wait for step 2 to give a different value
Example command for step 2:
while true; do phytool read swp2/2/0x01; done
On my i.MX8MM, the above steps will yield a bogus value for the BSMR PHY register within a matter of seconds. The interleaved register access it then evident in the trace log:
kworker/3:4-70 [003] ....... 1927.139849: regmap_reg_write: ethernet-switch reg=1004 val=bd phytool-16816 [002] ....... 1927.139979: regmap_reg_read: ethernet-switch reg=1f01 val=0 kworker/3:4-70 [003] ....... 1927.140381: regmap_reg_read: ethernet-switch reg=1005 val=0 phytool-16816 [002] ....... 1927.140468: regmap_reg_read: ethernet-switch reg=1d15 val=a69 kworker/3:4-70 [003] ....... 1927.140864: regmap_reg_read: ethernet-switch reg=1003 val=0 phytool-16816 [002] ....... 1927.140955: regmap_reg_write: ethernet-switch reg=1f02 val=2041 kworker/3:4-70 [003] ....... 1927.141390: regmap_reg_read: ethernet-switch reg=1002 val=0 phytool-16816 [002] ....... 1927.141479: regmap_reg_write: ethernet-switch reg=1f00 val=1 kworker/3:4-70 [003] ....... 1927.142311: regmap_reg_write: ethernet-switch reg=1004 val=be phytool-16816 [002] ....... 1927.142410: regmap_reg_read: ethernet-switch reg=1f01 val=0 kworker/3:4-70 [003] ....... 1927.142534: regmap_reg_read: ethernet-switch reg=1005 val=0 phytool-16816 [002] ....... 1927.142618: regmap_reg_read: ethernet-switch reg=1f04 val=0 phytool-16816 [002] ....... 1927.142641: mdio_access: SMI-0 read phy:0x02 reg:0x01 val:0x0000 <- ?! kworker/3:4-70 [003] ....... 1927.143037: regmap_reg_read: ethernet-switch reg=1001 val=0 kworker/3:4-70 [003] ....... 1927.143133: regmap_reg_read: ethernet-switch reg=1000 val=2d89 kworker/3:4-70 [003] ....... 1927.143213: regmap_reg_write: ethernet-switch reg=1004 val=be kworker/3:4-70 [003] ....... 1927.143291: regmap_reg_read: ethernet-switch reg=1005 val=0 kworker/3:4-70 [003] ....... 1927.143368: regmap_reg_read: ethernet-switch reg=1003 val=0 kworker/3:4-70 [003] ....... 1927.143443: regmap_reg_read: ethernet-switch reg=1002 val=6
The kworker here is polling MIB counters for stats, as evidenced by the register 0x1004 that we are writing to (RTL8365MB_MIB_ADDRESS_REG). This polling is performed every 3 seconds, but is just one example of such unsynchronized access. In Arınç's case, the driver was not using the switch IRQ, so the PHY subsystem was itself doing polling analogous to phytool in the above example.
A test module was created [see second Link] to simulate such spurious switch register accesses while performing indirect PHY register reads and writes. Realtek was also consulted to confirm whether this is a known issue or not. The conclusion of these lines of inquiry is as follows:
1. Reading of PHY registers via indirect access will be aborted if, after executing the read operation (via a write to the INDIRECT_ACCESS_CTRL_REG), any register is accessed, other than INDIRECT_ACCESS_STATUS_REG.
2. The PHY register indirect read is only complete when INDIRECT_ACCESS_STATUS_REG reads zero.
3. The INDIRECT_ACCESS_DATA_REG, which is read to get the result of the PHY read, will contain the result of the last successful read operation. If there was spurious register access and the indirect read was aborted, then this register is not guaranteed to hold anything meaningful and the PHY read will silently fail.
4. PHY writes do not appear to be affected by this mechanism.
5. Other similar access routines, such as for MIB counters, although similar to the PHY indirect access method, are actually table access. Table access is not affected by spurious reads or writes of other registers. However, concurrent table access is not allowed. Currently this is protected via mib_lock, so there is nothing to fix.
The above statements are corroborated both via the test module and through consultation with Realtek. In particular, Realtek states that this is simply a property of the hardware design and is not a hardware bug.
To fix this problem, one must guard against regmap access while the PHY indirect register read is executing. Fix this by using the newly introduced "nolock" regmap in all PHY-related functions, and by aquiring the regmap mutex at the top level of the PHY register access callbacks. Although no issue has been observed with PHY register _writes_, this change also serializes the indirect access method there. This is done purely as a matter of convenience and for reasons of symmetry.
Fixes: 4af2950c50c8 ("net: dsa: realtek-smi: add rtl8365mb subdriver for RTL8365MB-VC") Link: https://lore.kernel.org/netdev/CAJq09z5FCgG-+jVT7uxh1a-0CiiFsoKoHYsAWJtiKwv7... Link: https://lore.kernel.org/netdev/871qzwjmtv.fsf@bang-olufsen.dk/ Reported-by: Arınç ÜNAL arinc.unal@arinc9.com Reported-by: Luiz Angelo Daros de Luca luizluca@gmail.com Signed-off-by: Alvin Šipraga alsi@bang-olufsen.dk Reviewed-by: Vladimir Oltean olteanv@gmail.com Reviewed-by: Linus Walleij linus.walleij@linaro.org Signed-off-by: David S. Miller davem@davemloft.net [alsi: backport to 5.16: s/priv/smi/g] Cc: stable@vger.kernel.org # v5.16+ Signed-off-by: Alvin Šipraga alsi@bang-olufsen.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/dsa/realtek/rtl8365mb.c | 54 ++++++++++++++++++++++-------------- 1 file changed, 33 insertions(+), 21 deletions(-)
--- a/drivers/net/dsa/realtek/rtl8365mb.c +++ b/drivers/net/dsa/realtek/rtl8365mb.c @@ -565,7 +565,7 @@ static int rtl8365mb_phy_poll_busy(struc { u32 val;
- return regmap_read_poll_timeout(smi->map, + return regmap_read_poll_timeout(smi->map_nolock, RTL8365MB_INDIRECT_ACCESS_STATUS_REG, val, !val, 10, 100); } @@ -579,7 +579,7 @@ static int rtl8365mb_phy_ocp_prepare(str /* Set OCP prefix */ val = FIELD_GET(RTL8365MB_PHY_OCP_ADDR_PREFIX_MASK, ocp_addr); ret = regmap_update_bits( - smi->map, RTL8365MB_GPHY_OCP_MSB_0_REG, + smi->map_nolock, RTL8365MB_GPHY_OCP_MSB_0_REG, RTL8365MB_GPHY_OCP_MSB_0_CFG_CPU_OCPADR_MASK, FIELD_PREP(RTL8365MB_GPHY_OCP_MSB_0_CFG_CPU_OCPADR_MASK, val)); if (ret) @@ -592,8 +592,8 @@ static int rtl8365mb_phy_ocp_prepare(str ocp_addr >> 1); val |= FIELD_PREP(RTL8365MB_INDIRECT_ACCESS_ADDRESS_OCPADR_9_6_MASK, ocp_addr >> 6); - ret = regmap_write(smi->map, RTL8365MB_INDIRECT_ACCESS_ADDRESS_REG, - val); + ret = regmap_write(smi->map_nolock, + RTL8365MB_INDIRECT_ACCESS_ADDRESS_REG, val); if (ret) return ret;
@@ -606,36 +606,42 @@ static int rtl8365mb_phy_ocp_read(struct u32 val; int ret;
+ mutex_lock(&smi->map_lock); + ret = rtl8365mb_phy_poll_busy(smi); if (ret) - return ret; + goto out;
ret = rtl8365mb_phy_ocp_prepare(smi, phy, ocp_addr); if (ret) - return ret; + goto out;
/* Execute read operation */ val = FIELD_PREP(RTL8365MB_INDIRECT_ACCESS_CTRL_CMD_MASK, RTL8365MB_INDIRECT_ACCESS_CTRL_CMD_VALUE) | FIELD_PREP(RTL8365MB_INDIRECT_ACCESS_CTRL_RW_MASK, RTL8365MB_INDIRECT_ACCESS_CTRL_RW_READ); - ret = regmap_write(smi->map, RTL8365MB_INDIRECT_ACCESS_CTRL_REG, val); + ret = regmap_write(smi->map_nolock, RTL8365MB_INDIRECT_ACCESS_CTRL_REG, + val); if (ret) - return ret; + goto out;
ret = rtl8365mb_phy_poll_busy(smi); if (ret) - return ret; + goto out;
/* Get PHY register data */ - ret = regmap_read(smi->map, RTL8365MB_INDIRECT_ACCESS_READ_DATA_REG, - &val); + ret = regmap_read(smi->map_nolock, + RTL8365MB_INDIRECT_ACCESS_READ_DATA_REG, &val); if (ret) - return ret; + goto out;
*data = val & 0xFFFF;
- return 0; +out: + mutex_unlock(&smi->map_lock); + + return ret; }
static int rtl8365mb_phy_ocp_write(struct realtek_smi *smi, int phy, @@ -644,32 +650,38 @@ static int rtl8365mb_phy_ocp_write(struc u32 val; int ret;
+ mutex_lock(&smi->map_lock); + ret = rtl8365mb_phy_poll_busy(smi); if (ret) - return ret; + goto out;
ret = rtl8365mb_phy_ocp_prepare(smi, phy, ocp_addr); if (ret) - return ret; + goto out;
/* Set PHY register data */ - ret = regmap_write(smi->map, RTL8365MB_INDIRECT_ACCESS_WRITE_DATA_REG, - data); + ret = regmap_write(smi->map_nolock, + RTL8365MB_INDIRECT_ACCESS_WRITE_DATA_REG, data); if (ret) - return ret; + goto out;
/* Execute write operation */ val = FIELD_PREP(RTL8365MB_INDIRECT_ACCESS_CTRL_CMD_MASK, RTL8365MB_INDIRECT_ACCESS_CTRL_CMD_VALUE) | FIELD_PREP(RTL8365MB_INDIRECT_ACCESS_CTRL_RW_MASK, RTL8365MB_INDIRECT_ACCESS_CTRL_RW_WRITE); - ret = regmap_write(smi->map, RTL8365MB_INDIRECT_ACCESS_CTRL_REG, val); + ret = regmap_write(smi->map_nolock, RTL8365MB_INDIRECT_ACCESS_CTRL_REG, + val); if (ret) - return ret; + goto out;
ret = rtl8365mb_phy_poll_busy(smi); if (ret) - return ret; + goto out; + +out: + mutex_unlock(&smi->map_lock);
return 0; }
From: Alvin Šipraga alsi@bang-olufsen.dk
commit 109d899452ba17996eccec7ae8249fb1f8900a16 upstream.
The kernel test robot reported build warnings with a randconfig that built realtek-{smi,mdio} without CONFIG_OF set. Since both interface drivers are using OF and will not probe without, add the corresponding dependency to Kconfig.
Link: https://lore.kernel.org/all/202203231233.Xx73Y40o-lkp@intel.com/ Link: https://lore.kernel.org/all/202203231439.ycl0jg50-lkp@intel.com/ Fixes: aac94001067d ("net: dsa: realtek: add new mdio interface for drivers") Fixes: 765c39a4fafe ("net: dsa: realtek: convert subdrivers into modules") Signed-off-by: Alvin Šipraga alsi@bang-olufsen.dk Reviewed-by: Andrew Lunn andrew@lunn.ch Acked-by: Luiz Angelo Daros de Luca luizluca@gmail.com Link: https://lore.kernel.org/r/20220323124225.91763-1-alvin@pqrs.dk Signed-off-by: Jakub Kicinski kuba@kernel.org [alsi: backport to 5.16: remove mdio part] Cc: stable@vger.kernel.org # v5.16+ Signed-off-by: Alvin Šipraga alsi@bang-olufsen.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/dsa/realtek/Kconfig | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/net/dsa/realtek/Kconfig +++ b/drivers/net/dsa/realtek/Kconfig @@ -14,6 +14,7 @@ menuconfig NET_DSA_REALTEK config NET_DSA_REALTEK_SMI tristate "Realtek SMI connected switch driver" depends on NET_DSA_REALTEK + depends on OF default y help Select to enable support for registering switches connected
From: Filipe Manana fdmanana@suse.com
commit ad3fc7946b1829213bbdbb2b9ad0d124b31ae4a7 upstream.
After commit 92082d40976ed0 ("btrfs: integrate page status update for data read path into begin/end_page_read"), the 'nr' counter at btrfs_do_readpage() is no longer used, we increment it but we never read from it. So just remove it.
Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Cc: Nathan Chancellor nathan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/extent_io.c | 5 +---- 1 file changed, 1 insertion(+), 4 deletions(-)
--- a/fs/btrfs/extent_io.c +++ b/fs/btrfs/extent_io.c @@ -3562,7 +3562,6 @@ int btrfs_do_readpage(struct page *page, u64 cur_end; struct extent_map *em; int ret = 0; - int nr = 0; size_t pg_offset = 0; size_t iosize; size_t blocksize = inode->i_sb->s_blocksize; @@ -3720,9 +3719,7 @@ int btrfs_do_readpage(struct page *page, end_bio_extent_readpage, 0, this_bio_flag, force_bio_submit); - if (!ret) { - nr++; - } else { + if (ret) { unlock_extent(tree, cur, cur + iosize - 1); end_page_read(page, false, cur, iosize); goto out;
From: Nathan Chancellor nathan@kernel.org
commit 6d4a6b515c39f1f8763093e0f828959b2fbc2f45 upstream.
Clang's version of -Wunused-but-set-variable recently gained support for unary operations, which reveals two unused variables:
fs/btrfs/block-group.c:2949:6: error: variable 'num_started' set but not used [-Werror,-Wunused-but-set-variable] int num_started = 0; ^ fs/btrfs/block-group.c:3116:6: error: variable 'num_started' set but not used [-Werror,-Wunused-but-set-variable] int num_started = 0; ^ 2 errors generated.
These variables appear to be unused from their introduction, so just remove them to silence the warnings.
Fixes: c9dc4c657850 ("Btrfs: two stage dirty block group writeout") Fixes: 1bbc621ef284 ("Btrfs: allow block group cache writeout outside critical section in commit") CC: stable@vger.kernel.org # 5.4+ Link: https://github.com/ClangBuiltLinux/linux/issues/1614 Signed-off-by: Nathan Chancellor nathan@kernel.org Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/block-group.c | 4 ---- 1 file changed, 4 deletions(-)
--- a/fs/btrfs/block-group.c +++ b/fs/btrfs/block-group.c @@ -2922,7 +2922,6 @@ int btrfs_start_dirty_block_groups(struc struct btrfs_path *path = NULL; LIST_HEAD(dirty); struct list_head *io = &cur_trans->io_bgs; - int num_started = 0; int loops = 0;
spin_lock(&cur_trans->dirty_bgs_lock); @@ -2988,7 +2987,6 @@ again: cache->io_ctl.inode = NULL; ret = btrfs_write_out_cache(trans, cache, path); if (ret == 0 && cache->io_ctl.inode) { - num_started++; should_put = 0;
/* @@ -3089,7 +3087,6 @@ int btrfs_write_dirty_block_groups(struc int should_put; struct btrfs_path *path; struct list_head *io = &cur_trans->io_bgs; - int num_started = 0;
path = btrfs_alloc_path(); if (!path) @@ -3147,7 +3144,6 @@ int btrfs_write_dirty_block_groups(struc cache->io_ctl.inode = NULL; ret = btrfs_write_out_cache(trans, cache, path); if (ret == 0 && cache->io_ctl.inode) { - num_started++; should_put = 0; list_add_tail(&cache->io_list, io); } else {
From: Anup Patel apatel@ventanamicro.com
commit 8c3ce496bd612bd21679e445f75fcabb6be997b2 upstream.
We might have RISC-V systems (such as QEMU) where VMID is not part of the TLB entry tag so these systems will have to flush all TLB entries upon any change in hgatp.VMID.
Currently, we zero-out hgatp CSR in kvm_arch_vcpu_put() and we re-program hgatp CSR in kvm_arch_vcpu_load(). For above described systems, this will flush all TLB entries whenever VCPU exits to user-space hence reducing performance.
This patch fixes above described performance issue by not clearing hgatp CSR in kvm_arch_vcpu_put().
Fixes: 34bde9d8b9e6 ("RISC-V: KVM: Implement VCPU world-switch") Cc: stable@vger.kernel.org Signed-off-by: Anup Patel apatel@ventanamicro.com Signed-off-by: Anup Patel anup@brainfault.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/riscv/kvm/vcpu.c | 2 -- 1 file changed, 2 deletions(-)
--- a/arch/riscv/kvm/vcpu.c +++ b/arch/riscv/kvm/vcpu.c @@ -653,8 +653,6 @@ void kvm_arch_vcpu_put(struct kvm_vcpu * vcpu->arch.isa); kvm_riscv_vcpu_host_fp_restore(&vcpu->arch.host_context);
- csr_write(CSR_HGATP, 0); - csr->vsstatus = csr_read(CSR_VSSTATUS); csr->vsie = csr_read(CSR_VSIE); csr->vstvec = csr_read(CSR_VSTVEC);
From: Piotr Chmura chmooreck@gmail.com
commit 3ae87d2f25c0e998da2721ce332e2b80d3d53c39 upstream.
Fix firmware file names assignment in si2157 tuner, allow for running devices without firmware files needed.
modprobe gives error: unknown chip version Si2147-A30 ROM 0x50 Device initialization is interrupted.
Caused by: 1. table si2157_tuners has swapped fields rom_id and required vs struct si2157_tuner_info. 2. both firmware file names can be null for devices with required == false - device uses build-in firmware in this case
Tested on this device: m07ca:1871 AVerMedia Technologies, Inc. TD310 DVB-T/T2/C dongle
[mchehab: fix mangled patch] Link: https://bugzilla.kernel.org/show_bug.cgi?id=215726 Link: https://lore.kernel.org/lkml/5f660108-8812-383c-83e4-29ee0558d623@leemhuis.i... Link: https://lore.kernel.org/linux-media/c4bcaff8-fbad-969e-ad47-e2c487ac02a1@gma... Fixes: 1c35ba3bf972 ("media: si2157: use a different namespace for firmware") Cc: stable@vger.kernel.org # 5.17.x Signed-off-by: Piotr Chmura chmooreck@gmail.com Tested-by: Robert Schlabbach robert_s@gmx.net Signed-off-by: Mauro Carvalho Chehab mchehab@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/tuners/si2157.c | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-)
--- a/drivers/media/tuners/si2157.c +++ b/drivers/media/tuners/si2157.c @@ -77,16 +77,16 @@ err_mutex_unlock: }
static const struct si2157_tuner_info si2157_tuners[] = { - { SI2141, false, 0x60, SI2141_60_FIRMWARE, SI2141_A10_FIRMWARE }, - { SI2141, false, 0x61, SI2141_61_FIRMWARE, SI2141_A10_FIRMWARE }, - { SI2146, false, 0x11, SI2146_11_FIRMWARE, NULL }, - { SI2147, false, 0x50, SI2147_50_FIRMWARE, NULL }, - { SI2148, true, 0x32, SI2148_32_FIRMWARE, SI2158_A20_FIRMWARE }, - { SI2148, true, 0x33, SI2148_33_FIRMWARE, SI2158_A20_FIRMWARE }, - { SI2157, false, 0x50, SI2157_50_FIRMWARE, SI2157_A30_FIRMWARE }, - { SI2158, false, 0x50, SI2158_50_FIRMWARE, SI2158_A20_FIRMWARE }, - { SI2158, false, 0x51, SI2158_51_FIRMWARE, SI2158_A20_FIRMWARE }, - { SI2177, false, 0x50, SI2177_50_FIRMWARE, SI2157_A30_FIRMWARE }, + { SI2141, 0x60, false, SI2141_60_FIRMWARE, SI2141_A10_FIRMWARE }, + { SI2141, 0x61, false, SI2141_61_FIRMWARE, SI2141_A10_FIRMWARE }, + { SI2146, 0x11, false, SI2146_11_FIRMWARE, NULL }, + { SI2147, 0x50, false, SI2147_50_FIRMWARE, NULL }, + { SI2148, 0x32, true, SI2148_32_FIRMWARE, SI2158_A20_FIRMWARE }, + { SI2148, 0x33, true, SI2148_33_FIRMWARE, SI2158_A20_FIRMWARE }, + { SI2157, 0x50, false, SI2157_50_FIRMWARE, SI2157_A30_FIRMWARE }, + { SI2158, 0x50, false, SI2158_50_FIRMWARE, SI2158_A20_FIRMWARE }, + { SI2158, 0x51, false, SI2158_51_FIRMWARE, SI2158_A20_FIRMWARE }, + { SI2177, 0x50, false, SI2177_50_FIRMWARE, SI2157_A30_FIRMWARE }, };
static int si2157_load_firmware(struct dvb_frontend *fe, @@ -178,7 +178,7 @@ static int si2157_find_and_load_firmware } }
- if (!fw_name && !fw_alt_name) { + if (required && !fw_name && !fw_alt_name) { dev_err(&client->dev, "unknown chip version Si21%d-%c%c%c ROM 0x%02x\n", part_id, cmd.args[1], cmd.args[3], cmd.args[4], rom_id);
From: Tadeusz Struk tadeusz.struk@linaro.org
commit 55037ed7bdc62151a726f5685f88afa6a82959b1 upstream.
Add include guard wrapper define to uapi/linux/stddef.h to prevent macro redefinition errors when stddef.h is included more than once. This was not needed before since the only contents already used a redefinition test.
Signed-off-by: Tadeusz Struk tadeusz.struk@linaro.org Link: https://lore.kernel.org/r/20220329171252.57279-1-tadeusz.struk@linaro.org Fixes: 50d7bd38c3aa ("stddef: Introduce struct_group() helper macro") Cc: stable@vger.kernel.org Signed-off-by: Kees Cook keescook@chromium.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/uapi/linux/stddef.h | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/include/uapi/linux/stddef.h b/include/uapi/linux/stddef.h index 3021ea25a284..7837ba4fe728 100644 --- a/include/uapi/linux/stddef.h +++ b/include/uapi/linux/stddef.h @@ -1,4 +1,7 @@ /* SPDX-License-Identifier: GPL-2.0 WITH Linux-syscall-note */ +#ifndef _UAPI_LINUX_STDDEF_H +#define _UAPI_LINUX_STDDEF_H + #include <linux/compiler_types.h>
#ifndef __always_inline @@ -41,3 +44,4 @@ struct { } __empty_ ## NAME; \ TYPE NAME[]; \ } +#endif
From: Kai-Heng Feng kai.heng.feng@canonical.com
commit 887f75cfd0da44c19dda93b2ff9e70ca8792cdc1 upstream.
DP/HDMI audio on AMD PRO VII stops working after S3: [ 149.450391] amdgpu 0000:63:00.0: amdgpu: MODE1 reset [ 149.450395] amdgpu 0000:63:00.0: amdgpu: GPU mode1 reset [ 149.450494] amdgpu 0000:63:00.0: amdgpu: GPU psp mode1 reset [ 149.983693] snd_hda_intel 0000:63:00.1: refused to change power state from D0 to D3hot [ 150.003439] amdgpu 0000:63:00.0: refused to change power state from D0 to D3hot ... [ 155.432975] snd_hda_intel 0000:63:00.1: CORB reset timeout#2, CORBRP = 65535
The offending commit is daf8de0874ab5b ("drm/amdgpu: always reset the asic in suspend (v2)"). Commit 34452ac3038a7 ("drm/amdgpu: don't use BACO for reset in S3 ") doesn't help, so the issue is something different.
Assuming that to make HDA resume to D0 fully realized, it needs to be successfully put to D3 first. And this guesswork proves working, by moving amdgpu_asic_reset() to noirq callback, so it's called after HDA function is in D3.
Fixes: daf8de0874ab5b ("drm/amdgpu: always reset the asic in suspend (v2)") Signed-off-by: Kai-Heng Feng kai.heng.feng@canonical.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-)
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -2276,18 +2276,23 @@ static int amdgpu_pmops_suspend(struct d { struct drm_device *drm_dev = dev_get_drvdata(dev); struct amdgpu_device *adev = drm_to_adev(drm_dev); - int r;
if (amdgpu_acpi_is_s0ix_active(adev)) adev->in_s0ix = true; else adev->in_s3 = true; - r = amdgpu_device_suspend(drm_dev, true); - if (r) - return r; + return amdgpu_device_suspend(drm_dev, true); +} + +static int amdgpu_pmops_suspend_noirq(struct device *dev) +{ + struct drm_device *drm_dev = dev_get_drvdata(dev); + struct amdgpu_device *adev = drm_to_adev(drm_dev); + if (!adev->in_s0ix) - r = amdgpu_asic_reset(adev); - return r; + return amdgpu_asic_reset(adev); + + return 0; }
static int amdgpu_pmops_resume(struct device *dev) @@ -2528,6 +2533,7 @@ static const struct dev_pm_ops amdgpu_pm .prepare = amdgpu_pmops_prepare, .complete = amdgpu_pmops_complete, .suspend = amdgpu_pmops_suspend, + .suspend_noirq = amdgpu_pmops_suspend_noirq, .resume = amdgpu_pmops_resume, .freeze = amdgpu_pmops_freeze, .thaw = amdgpu_pmops_thaw,
From: Naohiro Aota naohiro.aota@wdc.com
commit 6d82ad13c4110e73c7b0392f00534a1502a1b520 upstream.
Running generic/406 causes the following WARNING in btrfs_destroy_inode() which tells there are outstanding extents left.
In btrfs_get_blocks_direct_write(), we reserve a temporary outstanding extents with btrfs_delalloc_reserve_metadata() (or indirectly from btrfs_delalloc_reserve_space(()). We then release the outstanding extents with btrfs_delalloc_release_extents(). However, the "len" can be modified in the COW case, which releases fewer outstanding extents than expected.
Fix it by calling btrfs_delalloc_release_extents() for the original length.
To reproduce the warning, the filesystem should be 1 GiB. It's triggering a short-write, due to not being able to allocate a large extent and instead allocating a smaller one.
WARNING: CPU: 0 PID: 757 at fs/btrfs/inode.c:8848 btrfs_destroy_inode+0x1e6/0x210 [btrfs] Modules linked in: btrfs blake2b_generic xor lzo_compress lzo_decompress raid6_pq zstd zstd_decompress zstd_compress xxhash zram zsmalloc CPU: 0 PID: 757 Comm: umount Not tainted 5.17.0-rc8+ #101 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS d55cb5a 04/01/2014 RIP: 0010:btrfs_destroy_inode+0x1e6/0x210 [btrfs] RSP: 0018:ffffc9000327bda8 EFLAGS: 00010206 RAX: 0000000000000000 RBX: ffff888100548b78 RCX: 0000000000000000 RDX: 0000000000026900 RSI: 0000000000000000 RDI: ffff888100548b78 RBP: ffff888100548940 R08: 0000000000000000 R09: ffff88810b48aba8 R10: 0000000000000001 R11: ffff8881004eb240 R12: ffff88810b48a800 R13: ffff88810b48ec08 R14: ffff88810b48ed00 R15: ffff888100490c68 FS: 00007f8549ea0b80(0000) GS:ffff888237c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f854a09e733 CR3: 000000010a2e9003 CR4: 0000000000370eb0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: <TASK> destroy_inode+0x33/0x70 dispose_list+0x43/0x60 evict_inodes+0x161/0x1b0 generic_shutdown_super+0x2d/0x110 kill_anon_super+0xf/0x20 btrfs_kill_super+0xd/0x20 [btrfs] deactivate_locked_super+0x27/0x90 cleanup_mnt+0x12c/0x180 task_work_run+0x54/0x80 exit_to_user_mode_prepare+0x152/0x160 syscall_exit_to_user_mode+0x12/0x30 do_syscall_64+0x42/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae RIP: 0033:0x7f854a000fb7
Fixes: f0bfa76a11e9 ("btrfs: fix ENOSPC failure when attempting direct IO write into NOCOW range") CC: stable@vger.kernel.org # 5.17 Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Tested-by: Johannes Thumshirn johannes.thumshirn@wdc.com Reviewed-by: Filipe Manana fdmanana@suse.com Signed-off-by: Naohiro Aota naohiro.aota@wdc.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/inode.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -7423,6 +7423,7 @@ static int btrfs_get_blocks_direct_write u64 block_start, orig_start, orig_block_len, ram_bytes; bool can_nocow = false; bool space_reserved = false; + u64 prev_len; int ret = 0;
/* @@ -7450,6 +7451,7 @@ static int btrfs_get_blocks_direct_write can_nocow = true; }
+ prev_len = len; if (can_nocow) { struct extent_map *em2;
@@ -7479,8 +7481,6 @@ static int btrfs_get_blocks_direct_write goto out; } } else { - const u64 prev_len = len; - /* Our caller expects us to free the input extent map. */ free_extent_map(em); *map = NULL; @@ -7511,7 +7511,7 @@ static int btrfs_get_blocks_direct_write * We have created our ordered extent, so we can now release our reservation * for an outstanding extent. */ - btrfs_delalloc_release_extents(BTRFS_I(inode), len); + btrfs_delalloc_release_extents(BTRFS_I(inode), prev_len);
/* * Need to update the i_size under the extent lock so buffered
From: Dennis Zhou dennis@kernel.org
commit acee08aaf6d158d03668dc82b0a0eef41100531b upstream.
This restores the logic from commit 46bcff2bfc5e ("btrfs: fix compressed write bio blkcg attribution") which added cgroup attribution to btrfs writeback. It also adds back the REQ_CGROUP_PUNT flag for these ios.
Fixes: 91507240482e ("btrfs: determine stripe boundary at bio allocation time in btrfs_submit_compressed_write") CC: stable@vger.kernel.org # 5.16+ Signed-off-by: Dennis Zhou dennis@kernel.org Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/compression.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/fs/btrfs/compression.c +++ b/fs/btrfs/compression.c @@ -534,6 +534,9 @@ blk_status_t btrfs_submit_compressed_wri cb->orig_bio = NULL; cb->nr_pages = nr_pages;
+ if (blkcg_css) + kthread_associate_blkcg(blkcg_css); + while (cur_disk_bytenr < disk_start + compressed_len) { u64 offset = cur_disk_bytenr - disk_start; unsigned int index = offset >> PAGE_SHIFT; @@ -552,6 +555,8 @@ blk_status_t btrfs_submit_compressed_wri bio = NULL; goto finish_cb; } + if (blkcg_css) + bio->bi_opf |= REQ_CGROUP_PUNT; } /* * We should never reach next_stripe_start start as we will @@ -609,6 +614,9 @@ blk_status_t btrfs_submit_compressed_wri return 0;
finish_cb: + if (blkcg_css) + kthread_associate_blkcg(NULL); + if (bio) { bio->bi_status = ret; bio_endio(bio);
From: Naohiro Aota naohiro.aota@wdc.com
commit 820c363bd526ec8e133e4b84e6ad1fda12023b4b upstream.
Return the allocated block group from do_chunk_alloc(). This is a preparation patch for the next patch.
CC: stable@vger.kernel.org # 5.16+ Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Tested-by: Johannes Thumshirn johannes.thumshirn@wdc.com Signed-off-by: Naohiro Aota naohiro.aota@wdc.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/block-group.c | 16 +++++++++++++--- 1 file changed, 13 insertions(+), 3 deletions(-)
--- a/fs/btrfs/block-group.c +++ b/fs/btrfs/block-group.c @@ -3427,7 +3427,7 @@ int btrfs_force_chunk_alloc(struct btrfs return btrfs_chunk_alloc(trans, alloc_flags, CHUNK_ALLOC_FORCE); }
-static int do_chunk_alloc(struct btrfs_trans_handle *trans, u64 flags) +static struct btrfs_block_group *do_chunk_alloc(struct btrfs_trans_handle *trans, u64 flags) { struct btrfs_block_group *bg; int ret; @@ -3514,7 +3514,11 @@ static int do_chunk_alloc(struct btrfs_t out: btrfs_trans_release_chunk_metadata(trans);
- return ret; + if (ret) + return ERR_PTR(ret); + + btrfs_get_block_group(bg); + return bg; }
/* @@ -3629,6 +3633,7 @@ int btrfs_chunk_alloc(struct btrfs_trans { struct btrfs_fs_info *fs_info = trans->fs_info; struct btrfs_space_info *space_info; + struct btrfs_block_group *ret_bg; bool wait_for_alloc = false; bool should_alloc = false; int ret = 0; @@ -3722,9 +3727,14 @@ int btrfs_chunk_alloc(struct btrfs_trans force_metadata_allocation(fs_info); }
- ret = do_chunk_alloc(trans, flags); + ret_bg = do_chunk_alloc(trans, flags); trans->allocating_chunk = false;
+ if (IS_ERR(ret_bg)) + ret = PTR_ERR(ret_bg); + else + btrfs_put_block_group(ret_bg); + spin_lock(&space_info->lock); if (ret < 0) { if (ret == -ENOSPC)
From: Takashi Iwai tiwai@suse.de
commit fee2b871d8d6389c9b4bdf9346a99ccc1c98c9b8 upstream.
This is a small helper function to handle the error path more easily when an error happens during the probe for the device with the device-managed card. Since devres releases in the reverser order of the creations, usually snd_card_free() gets called at the last in the probe error path unless it already reached snd_card_register() calls. Due to this nature, when a driver expects the resource releases in card->private_free, this might be called too lately.
As a workaround, one should call the probe like:
static int __some_probe(...) { // do real probe.... }
static int some_probe(...) { return snd_card_free_on_error(dev, __some_probe(dev, ...)); }
so that the snd_card_free() is called explicitly at the beginning of the error path from the probe.
This function will be used in the upcoming fixes to address the regressions by devres usages.
Fixes: e8ad415b7a55 ("ALSA: core: Add managed card creation") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412093141.8008-2-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/sound/core.h | 1 + sound/core/init.c | 28 ++++++++++++++++++++++++++++ 2 files changed, 29 insertions(+)
--- a/include/sound/core.h +++ b/include/sound/core.h @@ -284,6 +284,7 @@ int snd_card_disconnect(struct snd_card void snd_card_disconnect_sync(struct snd_card *card); int snd_card_free(struct snd_card *card); int snd_card_free_when_closed(struct snd_card *card); +int snd_card_free_on_error(struct device *dev, int ret); void snd_card_set_id(struct snd_card *card, const char *id); int snd_card_register(struct snd_card *card); int snd_card_info_init(void); --- a/sound/core/init.c +++ b/sound/core/init.c @@ -209,6 +209,12 @@ static void __snd_card_release(struct de * snd_card_register(), the very first devres action to call snd_card_free() * is added automatically. In that way, the resource disconnection is assured * at first, then released in the expected order. + * + * If an error happens at the probe before snd_card_register() is called and + * there have been other devres resources, you'd need to free the card manually + * via snd_card_free() call in the error; otherwise it may lead to UAF due to + * devres call orders. You can use snd_card_free_on_error() helper for + * handling it more easily. */ int snd_devm_card_new(struct device *parent, int idx, const char *xid, struct module *module, size_t extra_size, @@ -235,6 +241,28 @@ int snd_devm_card_new(struct device *par } EXPORT_SYMBOL_GPL(snd_devm_card_new);
+/** + * snd_card_free_on_error - a small helper for handling devm probe errors + * @dev: the managed device object + * @ret: the return code from the probe callback + * + * This function handles the explicit snd_card_free() call at the error from + * the probe callback. It's just a small helper for simplifying the error + * handling with the managed devices. + */ +int snd_card_free_on_error(struct device *dev, int ret) +{ + struct snd_card *card; + + if (!ret) + return 0; + card = devres_find(dev, __snd_card_release, NULL, NULL); + if (card) + snd_card_free(card); + return ret; +} +EXPORT_SYMBOL_GPL(snd_card_free_on_error); + static int snd_card_init(struct snd_card *card, struct device *parent, int idx, const char *xid, struct module *module, size_t extra_size)
From: Takashi Iwai tiwai@suse.de
commit 2236a3243ff8291e97c70097dd11a0fdb8904380 upstream.
The previous cleanup with devres forgot to replace the snd_card_free() call with the devm version. Moreover, it still needs the manual call of snd_card_free() at the probe error path, otherwise the reverse order of the releases may happen. This patch addresses those issues.
Fixes: 499ddc16394c ("ALSA: sis7019: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-28-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/sis7019.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-)
--- a/sound/pci/sis7019.c +++ b/sound/pci/sis7019.c @@ -1331,8 +1331,8 @@ static int sis_chip_create(struct snd_ca return 0; }
-static int snd_sis7019_probe(struct pci_dev *pci, - const struct pci_device_id *pci_id) +static int __snd_sis7019_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) { struct snd_card *card; struct sis7019 *sis; @@ -1352,8 +1352,8 @@ static int snd_sis7019_probe(struct pci_ if (!codecs) codecs = SIS_PRIMARY_CODEC_PRESENT;
- rc = snd_card_new(&pci->dev, index, id, THIS_MODULE, - sizeof(*sis), &card); + rc = snd_devm_card_new(&pci->dev, index, id, THIS_MODULE, + sizeof(*sis), &card); if (rc < 0) return rc;
@@ -1386,6 +1386,12 @@ static int snd_sis7019_probe(struct pci_ return 0; }
+static int snd_sis7019_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) +{ + return snd_card_free_on_error(&pci->dev, __snd_sis7019_probe(pci, pci_id)); +} + static struct pci_driver sis7019_driver = { .name = KBUILD_MODNAME, .id_table = snd_sis7019_ids,
From: Takashi Iwai tiwai@suse.de
commit 19401a9441236cfbbbeb1bef4ef4c8668db45dfc upstream.
The recent cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() on the error from the probe callback using a new helper function.
Fixes: 1f0819979248 ("ALSA: ali5451: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-5-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/ali5451/ali5451.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/sound/pci/ali5451/ali5451.c +++ b/sound/pci/ali5451/ali5451.c @@ -2124,8 +2124,8 @@ static int snd_ali_create(struct snd_car return 0; }
-static int snd_ali_probe(struct pci_dev *pci, - const struct pci_device_id *pci_id) +static int __snd_ali_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) { struct snd_card *card; struct snd_ali *codec; @@ -2170,6 +2170,12 @@ static int snd_ali_probe(struct pci_dev return 0; }
+static int snd_ali_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) +{ + return snd_card_free_on_error(&pci->dev, __snd_ali_probe(pci, pci_id)); +} + static struct pci_driver ali5451_driver = { .name = KBUILD_MODNAME, .id_table = snd_ali_ids,
From: Takashi Iwai tiwai@suse.de
commit ab8bce9da6102c575c473c053672547589bc4c59 upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() manually on the error from the probe callback.
Fixes: 21a9314cf93b ("ALSA: als300: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-31-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/als300.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/sound/pci/als300.c +++ b/sound/pci/als300.c @@ -708,7 +708,7 @@ static int snd_als300_probe(struct pci_d
err = snd_als300_create(card, pci, chip_type); if (err < 0) - return err; + goto error;
strcpy(card->driver, "ALS300"); if (chip->chip_type == DEVICE_ALS300_PLUS) @@ -723,11 +723,15 @@ static int snd_als300_probe(struct pci_d
err = snd_card_register(card); if (err < 0) - return err; + goto error;
pci_set_drvdata(pci, card); dev++; return 0; + + error: + snd_card_free(card); + return err; }
static struct pci_driver als300_driver = {
From: Takashi Iwai tiwai@suse.de
commit d616a0246da88d811f9f4c3aa83003c05efd3af0 upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() on the error from the probe callback using a new helper function.
Fixes: 0e175f665960 ("ALSA: als4000: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-6-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/als4000.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/sound/pci/als4000.c +++ b/sound/pci/als4000.c @@ -806,8 +806,8 @@ static void snd_card_als4000_free( struc snd_als4000_free_gameport(acard); }
-static int snd_card_als4000_probe(struct pci_dev *pci, - const struct pci_device_id *pci_id) +static int __snd_card_als4000_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) { static int dev; struct snd_card *card; @@ -930,6 +930,12 @@ static int snd_card_als4000_probe(struct return 0; }
+static int snd_card_als4000_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) +{ + return snd_card_free_on_error(&pci->dev, __snd_card_als4000_probe(pci, pci_id)); +} + #ifdef CONFIG_PM_SLEEP static int snd_als4000_suspend(struct device *dev) {
From: Takashi Iwai tiwai@suse.de
commit 48e8adde8d1c586c799dab123fc1ebc8b8db620f upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() on the error from the probe callback using a new helper function.
Fixes: 86bde74dbf09 ("ALSA: atiixp: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-7-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/atiixp.c | 10 ++++++++-- sound/pci/atiixp_modem.c | 10 ++++++++-- 2 files changed, 16 insertions(+), 4 deletions(-)
--- a/sound/pci/atiixp.c +++ b/sound/pci/atiixp.c @@ -1572,8 +1572,8 @@ static int snd_atiixp_init(struct snd_ca }
-static int snd_atiixp_probe(struct pci_dev *pci, - const struct pci_device_id *pci_id) +static int __snd_atiixp_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) { struct snd_card *card; struct atiixp *chip; @@ -1623,6 +1623,12 @@ static int snd_atiixp_probe(struct pci_d return 0; }
+static int snd_atiixp_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) +{ + return snd_card_free_on_error(&pci->dev, __snd_atiixp_probe(pci, pci_id)); +} + static struct pci_driver atiixp_driver = { .name = KBUILD_MODNAME, .id_table = snd_atiixp_ids, --- a/sound/pci/atiixp_modem.c +++ b/sound/pci/atiixp_modem.c @@ -1201,8 +1201,8 @@ static int snd_atiixp_init(struct snd_ca }
-static int snd_atiixp_probe(struct pci_dev *pci, - const struct pci_device_id *pci_id) +static int __snd_atiixp_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) { struct snd_card *card; struct atiixp_modem *chip; @@ -1247,6 +1247,12 @@ static int snd_atiixp_probe(struct pci_d return 0; }
+static int snd_atiixp_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) +{ + return snd_card_free_on_error(&pci->dev, __snd_atiixp_probe(pci, pci_id)); +} + static struct pci_driver atiixp_modem_driver = { .name = KBUILD_MODNAME, .id_table = snd_atiixp_ids,
From: Takashi Iwai tiwai@suse.de
commit b093de145bc8769c6e9207947afad9efe102f4f6 upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() on the error from the probe callback using a new helper function.
Fixes: e44b5b440609 ("ALSA: au88x0: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-8-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/au88x0/au88x0.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/sound/pci/au88x0/au88x0.c +++ b/sound/pci/au88x0/au88x0.c @@ -193,7 +193,7 @@ snd_vortex_create(struct snd_card *card,
// constructor -- see "Constructor" sub-section static int -snd_vortex_probe(struct pci_dev *pci, const struct pci_device_id *pci_id) +__snd_vortex_probe(struct pci_dev *pci, const struct pci_device_id *pci_id) { static int dev; struct snd_card *card; @@ -310,6 +310,12 @@ snd_vortex_probe(struct pci_dev *pci, co return 0; }
+static int +snd_vortex_probe(struct pci_dev *pci, const struct pci_device_id *pci_id) +{ + return snd_card_free_on_error(&pci->dev, __snd_vortex_probe(pci, pci_id)); +} + // pci_driver definition static struct pci_driver vortex_driver = { .name = KBUILD_MODNAME,
From: Takashi Iwai tiwai@suse.de
commit bf4067e8a19eae67c45659a956c361d59251ba57 upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() manually on the error from the probe callback.
Fixes: 33631012cd06 ("ALSA: aw2: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-32-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/aw2/aw2-alsa.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/sound/pci/aw2/aw2-alsa.c +++ b/sound/pci/aw2/aw2-alsa.c @@ -275,7 +275,7 @@ static int snd_aw2_probe(struct pci_dev /* (3) Create main component */ err = snd_aw2_create(card, pci); if (err < 0) - return err; + goto error;
/* initialize mutex */ mutex_init(&chip->mtx); @@ -294,13 +294,17 @@ static int snd_aw2_probe(struct pci_dev /* (6) Register card instance */ err = snd_card_register(card); if (err < 0) - return err; + goto error;
/* (7) Set PCI driver data */ pci_set_drvdata(pci, card);
dev++; return 0; + + error: + snd_card_free(card); + return err; }
/* open callback */
From: Takashi Iwai tiwai@suse.de
commit 49fe36e1c02cb06f66689c888e4e767c31cd259d upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() on the error from the probe callback using a new helper function.
Fixes: 8c5823ef31e1 ("ALSA: azt3328: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-9-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/azt3328.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/sound/pci/azt3328.c +++ b/sound/pci/azt3328.c @@ -2427,7 +2427,7 @@ snd_azf3328_create(struct snd_card *card }
static int -snd_azf3328_probe(struct pci_dev *pci, const struct pci_device_id *pci_id) +__snd_azf3328_probe(struct pci_dev *pci, const struct pci_device_id *pci_id) { static int dev; struct snd_card *card; @@ -2520,6 +2520,12 @@ snd_azf3328_probe(struct pci_dev *pci, c return 0; }
+static int +snd_azf3328_probe(struct pci_dev *pci, const struct pci_device_id *pci_id) +{ + return snd_card_free_on_error(&pci->dev, __snd_azf3328_probe(pci, pci_id)); +} + #ifdef CONFIG_PM_SLEEP static inline void snd_azf3328_suspend_regs(const struct snd_azf3328 *chip,
From: Takashi Iwai tiwai@suse.de
commit f0438155273f057fec9818bc9d1b782ba35cf6a1 upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() on the error from the probe callback using a new helper function.
Fixes: 9e80ed64a006 ("ALSA: bt87x: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-29-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/bt87x.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/sound/pci/bt87x.c +++ b/sound/pci/bt87x.c @@ -805,8 +805,8 @@ static int snd_bt87x_detect_card(struct return SND_BT87X_BOARD_UNKNOWN; }
-static int snd_bt87x_probe(struct pci_dev *pci, - const struct pci_device_id *pci_id) +static int __snd_bt87x_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) { static int dev; struct snd_card *card; @@ -889,6 +889,12 @@ static int snd_bt87x_probe(struct pci_de return 0; }
+static int snd_bt87x_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) +{ + return snd_card_free_on_error(&pci->dev, __snd_bt87x_probe(pci, pci_id)); +} + /* default entries for all Bt87x cards - it's not exported */ /* driver_data is set to 0 to call detection */ static const struct pci_device_id snd_bt87x_default_ids[] = {
From: Takashi Iwai tiwai@suse.de
commit c79442cc5a38e46597bc647128c8f1de62d80020 upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() on the error from the probe callback using a new helper function.
Fixes: 1656fa6ea258 ("ALSA: ca0106: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-10-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/ca0106/ca0106_main.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/sound/pci/ca0106/ca0106_main.c +++ b/sound/pci/ca0106/ca0106_main.c @@ -1725,8 +1725,8 @@ static int snd_ca0106_midi(struct snd_ca }
-static int snd_ca0106_probe(struct pci_dev *pci, - const struct pci_device_id *pci_id) +static int __snd_ca0106_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) { static int dev; struct snd_card *card; @@ -1786,6 +1786,12 @@ static int snd_ca0106_probe(struct pci_d return 0; }
+static int snd_ca0106_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) +{ + return snd_card_free_on_error(&pci->dev, __snd_ca0106_probe(pci, pci_id)); +} + #ifdef CONFIG_PM_SLEEP static int snd_ca0106_suspend(struct device *dev) {
From: Takashi Iwai tiwai@suse.de
commit a59396b1c11823c69c31621198c04def17f3a869 upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() manually on the error from the probe callback.
Fixes: 87e082ad84a7 ("ALSA: cmipci: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-33-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/cmipci.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/sound/pci/cmipci.c +++ b/sound/pci/cmipci.c @@ -3247,15 +3247,19 @@ static int snd_cmipci_probe(struct pci_d
err = snd_cmipci_create(card, pci, dev); if (err < 0) - return err; + goto error;
err = snd_card_register(card); if (err < 0) - return err; + goto error;
pci_set_drvdata(pci, card); dev++; return 0; + + error: + snd_card_free(card); + return err; }
#ifdef CONFIG_PM_SLEEP
From: Takashi Iwai tiwai@suse.de
commit 9bf5ed9a4e623583f15202d99f4521bc39050f61 upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() on the error from the probe callback using a new helper function.
Fixes: 99041fea70d0 ("ALSA: cs4281: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-11-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/cs4281.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/sound/pci/cs4281.c +++ b/sound/pci/cs4281.c @@ -1827,8 +1827,8 @@ static void snd_cs4281_opl3_command(stru spin_unlock_irqrestore(&opl3->reg_lock, flags); }
-static int snd_cs4281_probe(struct pci_dev *pci, - const struct pci_device_id *pci_id) +static int __snd_cs4281_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) { static int dev; struct snd_card *card; @@ -1888,6 +1888,12 @@ static int snd_cs4281_probe(struct pci_d return 0; }
+static int snd_cs4281_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) +{ + return snd_card_free_on_error(&pci->dev, __snd_cs4281_probe(pci, pci_id)); +} + /* * Power Management */
From: Takashi Iwai tiwai@suse.de
commit 2a56314798e0227cf51e3d1d184a419dc07bc173 upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
Fixes: 5eba4c646dfe ("ALSA: cs5535audio: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-12-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/cs5535audio/cs5535audio.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/sound/pci/cs5535audio/cs5535audio.c +++ b/sound/pci/cs5535audio/cs5535audio.c @@ -281,8 +281,8 @@ static int snd_cs5535audio_create(struct return 0; }
-static int snd_cs5535audio_probe(struct pci_dev *pci, - const struct pci_device_id *pci_id) +static int __snd_cs5535audio_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) { static int dev; struct snd_card *card; @@ -331,6 +331,12 @@ static int snd_cs5535audio_probe(struct return 0; }
+static int snd_cs5535audio_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) +{ + return snd_card_free_on_error(&pci->dev, __snd_cs5535audio_probe(pci, pci_id)); +} + static struct pci_driver cs5535audio_driver = { .name = KBUILD_MODNAME, .id_table = snd_cs5535audio_ids,
From: Takashi Iwai tiwai@suse.de
commit 313c7e57035125cb7533b53ddd0bc7aa562b433c upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() on the error from the probe callback using a new helper function.
Fixes: 9c211bf392bb ("ALSA: echoaudio: Allocate resources with device-managed APIs") Reported-and-tested-by: Zheyu Ma zheyuma97@gmail.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/CAMhUBjm2AdyEZ_-EgexdNDN7SvY4f89=4=FwAL+c0Mg0O+X50... Link: https://lore.kernel.org/r/20220412093141.8008-3-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/echoaudio/echoaudio.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
--- a/sound/pci/echoaudio/echoaudio.c +++ b/sound/pci/echoaudio/echoaudio.c @@ -1970,8 +1970,8 @@ static int snd_echo_create(struct snd_ca }
/* constructor */ -static int snd_echo_probe(struct pci_dev *pci, - const struct pci_device_id *pci_id) +static int __snd_echo_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) { static int dev; struct snd_card *card; @@ -2139,6 +2139,11 @@ static int snd_echo_probe(struct pci_dev return 0; }
+static int snd_echo_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) +{ + return snd_card_free_on_error(&pci->dev, __snd_echo_probe(pci, pci_id)); +}
#if defined(CONFIG_PM_SLEEP)
From: Takashi Iwai tiwai@suse.de
commit f37019b6bfe2e13cc536af0e6a42ed62005392ae upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() on the error from the probe callback using a new helper function.
Fixes: 2b377c6b6012 ("ALSA: emu10k1x: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-13-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/emu10k1/emu10k1x.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/sound/pci/emu10k1/emu10k1x.c +++ b/sound/pci/emu10k1/emu10k1x.c @@ -1491,8 +1491,8 @@ static int snd_emu10k1x_midi(struct emu1 return 0; }
-static int snd_emu10k1x_probe(struct pci_dev *pci, - const struct pci_device_id *pci_id) +static int __snd_emu10k1x_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) { static int dev; struct snd_card *card; @@ -1554,6 +1554,12 @@ static int snd_emu10k1x_probe(struct pci return 0; }
+static int snd_emu10k1x_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) +{ + return snd_card_free_on_error(&pci->dev, __snd_emu10k1x_probe(pci, pci_id)); +} + // PCI IDs static const struct pci_device_id snd_emu10k1x_ids[] = { { PCI_VDEVICE(CREATIVE, 0x0006), 0 }, /* Dell OEM version (EMU10K1) */
From: Takashi Iwai tiwai@suse.de
commit c2dc46932d117a1505f589ad1db3095aa6789058 upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() on the error from the probe callback using a new helper function.
Fixes: 10ed6eaf9d72 ("ALSA: ens137x: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-14-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/ens1370.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/sound/pci/ens1370.c +++ b/sound/pci/ens1370.c @@ -2304,8 +2304,8 @@ static irqreturn_t snd_audiopci_interrup return IRQ_HANDLED; }
-static int snd_audiopci_probe(struct pci_dev *pci, - const struct pci_device_id *pci_id) +static int __snd_audiopci_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) { static int dev; struct snd_card *card; @@ -2369,6 +2369,12 @@ static int snd_audiopci_probe(struct pci return 0; }
+static int snd_audiopci_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) +{ + return snd_card_free_on_error(&pci->dev, __snd_audiopci_probe(pci, pci_id)); +} + static struct pci_driver ens137x_driver = { .name = KBUILD_MODNAME, .id_table = snd_audiopci_ids,
From: Takashi Iwai tiwai@suse.de
commit bc22628591e5913e67edb3c2a89b97849e30a8f8 upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() on the error from the probe callback using a new helper function.
Fixes: 08e9d3ab4cc1 ("ALSA: es1938: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-15-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/es1938.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/sound/pci/es1938.c +++ b/sound/pci/es1938.c @@ -1716,8 +1716,8 @@ static int snd_es1938_mixer(struct es193 }
-static int snd_es1938_probe(struct pci_dev *pci, - const struct pci_device_id *pci_id) +static int __snd_es1938_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) { static int dev; struct snd_card *card; @@ -1796,6 +1796,12 @@ static int snd_es1938_probe(struct pci_d return 0; }
+static int snd_es1938_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) +{ + return snd_card_free_on_error(&pci->dev, __snd_es1938_probe(pci, pci_id)); +} + static struct pci_driver es1938_driver = { .name = KBUILD_MODNAME, .id_table = snd_es1938_ids,
From: Takashi Iwai tiwai@suse.de
commit de9a01bc95a9e5e36d0659521bb04579053d8566 upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() on the error from the probe callback using a new helper function.
Fixes: a7b4cbfdc701 ("ALSA: es1968: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-16-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/es1968.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/sound/pci/es1968.c +++ b/sound/pci/es1968.c @@ -2741,8 +2741,8 @@ static int snd_es1968_create(struct snd_
/* */ -static int snd_es1968_probe(struct pci_dev *pci, - const struct pci_device_id *pci_id) +static int __snd_es1968_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) { static int dev; struct snd_card *card; @@ -2848,6 +2848,12 @@ static int snd_es1968_probe(struct pci_d return 0; }
+static int snd_es1968_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) +{ + return snd_card_free_on_error(&pci->dev, __snd_es1968_probe(pci, pci_id)); +} + static struct pci_driver es1968_driver = { .name = KBUILD_MODNAME, .id_table = snd_es1968_ids,
From: Takashi Iwai tiwai@suse.de
commit 7f611274a3d1657a67b3fa8cd0cec1dee00e02b4 upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() on the error from the probe callback using a new helper function.
Fixes: 47c413395376 ("ALSA: fm801: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-17-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/fm801.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/sound/pci/fm801.c +++ b/sound/pci/fm801.c @@ -1268,8 +1268,8 @@ static int snd_fm801_create(struct snd_c return 0; }
-static int snd_card_fm801_probe(struct pci_dev *pci, - const struct pci_device_id *pci_id) +static int __snd_card_fm801_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) { static int dev; struct snd_card *card; @@ -1333,6 +1333,12 @@ static int snd_card_fm801_probe(struct p return 0; }
+static int snd_card_fm801_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) +{ + return snd_card_free_on_error(&pci->dev, __snd_card_fm801_probe(pci, pci_id)); +} + #ifdef CONFIG_PM_SLEEP static const unsigned char saved_regs[] = { FM801_PCM_VOL, FM801_I2S_VOL, FM801_FM_VOL, FM801_REC_SRC,
From: Takashi Iwai tiwai@suse.de
commit 10b1881a97be240126891cb384bd3bc1869f52d8 upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() on the error from the probe callback using a new helper function.
Fixes: 35a245ec0619 ("ALSA: galaxy: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-2-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/isa/galaxy/galaxy.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
--- a/sound/isa/galaxy/galaxy.c +++ b/sound/isa/galaxy/galaxy.c @@ -478,7 +478,7 @@ static void snd_galaxy_free(struct snd_c galaxy_set_config(galaxy, galaxy->config); }
-static int snd_galaxy_probe(struct device *dev, unsigned int n) +static int __snd_galaxy_probe(struct device *dev, unsigned int n) { struct snd_galaxy *galaxy; struct snd_wss *chip; @@ -598,6 +598,11 @@ static int snd_galaxy_probe(struct devic return 0; }
+static int snd_galaxy_probe(struct device *dev, unsigned int n) +{ + return snd_card_free_on_error(dev, __snd_galaxy_probe(dev, n)); +} + static struct isa_driver snd_galaxy_driver = { .match = snd_galaxy_match, .probe = snd_galaxy_probe,
From: Takashi Iwai tiwai@suse.de
commit e2263f0bf7443a200a5c1c418baefd92f1674600 upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() manually on the error from the probe callback.
Fixes: d136b8e54f92 ("ALSA: hdsp: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-36-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/rme9652/hdsp.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/sound/pci/rme9652/hdsp.c +++ b/sound/pci/rme9652/hdsp.c @@ -5444,17 +5444,21 @@ static int snd_hdsp_probe(struct pci_dev hdsp->pci = pci; err = snd_hdsp_create(card, hdsp); if (err) - return err; + goto error;
strcpy(card->shortname, "Hammerfall DSP"); sprintf(card->longname, "%s at 0x%lx, irq %d", hdsp->card_name, hdsp->port, hdsp->irq); err = snd_card_register(card); if (err) - return err; + goto error; pci_set_drvdata(pci, card); dev++; return 0; + + error: + snd_card_free(card); + return err; }
static struct pci_driver hdsp_driver = {
From: Takashi Iwai tiwai@suse.de
commit eab521aebcdeb1c801009503e3a7f8989e3c6b36 upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() manually on the error from the probe callback.
Fixes: 0195ca5fd1f4 ("ALSA: hdspm: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-37-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/rme9652/hdspm.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/sound/pci/rme9652/hdspm.c +++ b/sound/pci/rme9652/hdspm.c @@ -6895,7 +6895,7 @@ static int snd_hdspm_probe(struct pci_de
err = snd_hdspm_create(card, hdspm); if (err < 0) - return err; + goto error;
if (hdspm->io_type != MADIface) { snprintf(card->shortname, sizeof(card->shortname), "%s_%x", @@ -6914,12 +6914,16 @@ static int snd_hdspm_probe(struct pci_de
err = snd_card_register(card); if (err < 0) - return err; + goto error;
pci_set_drvdata(pci, card);
dev++; return 0; + + error: + snd_card_free(card); + return err; }
static struct pci_driver hdspm_driver = {
From: Takashi Iwai tiwai@suse.de
commit 4a850a0079ce601c0c4016f4edb7d618e811ed7d upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() on the error from the probe callback using a new helper function.
Fixes: 314f6dbb1f33 ("ALSA: ice1724: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-18-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/ice1712/ice1724.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/sound/pci/ice1712/ice1724.c +++ b/sound/pci/ice1712/ice1724.c @@ -2519,8 +2519,8 @@ static int snd_vt1724_create(struct snd_ * */
-static int snd_vt1724_probe(struct pci_dev *pci, - const struct pci_device_id *pci_id) +static int __snd_vt1724_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) { static int dev; struct snd_card *card; @@ -2662,6 +2662,12 @@ static int snd_vt1724_probe(struct pci_d return 0; }
+static int snd_vt1724_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) +{ + return snd_card_free_on_error(&pci->dev, __snd_vt1724_probe(pci, pci_id)); +} + #ifdef CONFIG_PM_SLEEP static int snd_vt1724_suspend(struct device *dev) {
From: Takashi Iwai tiwai@suse.de
commit 71b21f5f8970a87f034138454ebeff0608d24875 upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() on the error from the probe callback using a new helper function.
Fixes: 7835e0901e24 ("ALSA: intel8x0: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-19-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/intel8x0.c | 10 ++++++++-- sound/pci/intel8x0m.c | 10 ++++++++-- 2 files changed, 16 insertions(+), 4 deletions(-)
--- a/sound/pci/intel8x0.c +++ b/sound/pci/intel8x0.c @@ -3109,8 +3109,8 @@ static int check_default_spdif_aclink(st return 0; }
-static int snd_intel8x0_probe(struct pci_dev *pci, - const struct pci_device_id *pci_id) +static int __snd_intel8x0_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) { struct snd_card *card; struct intel8x0 *chip; @@ -3189,6 +3189,12 @@ static int snd_intel8x0_probe(struct pci return 0; }
+static int snd_intel8x0_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) +{ + return snd_card_free_on_error(&pci->dev, __snd_intel8x0_probe(pci, pci_id)); +} + static struct pci_driver intel8x0_driver = { .name = KBUILD_MODNAME, .id_table = snd_intel8x0_ids, --- a/sound/pci/intel8x0m.c +++ b/sound/pci/intel8x0m.c @@ -1178,8 +1178,8 @@ static struct shortname_table { { 0 }, };
-static int snd_intel8x0m_probe(struct pci_dev *pci, - const struct pci_device_id *pci_id) +static int __snd_intel8x0m_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) { struct snd_card *card; struct intel8x0m *chip; @@ -1225,6 +1225,12 @@ static int snd_intel8x0m_probe(struct pc return 0; }
+static int snd_intel8x0m_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) +{ + return snd_card_free_on_error(&pci->dev, __snd_intel8x0m_probe(pci, pci_id)); +} + static struct pci_driver intel8x0m_driver = { .name = KBUILD_MODNAME, .id_table = snd_intel8x0m_ids,
From: Takashi Iwai tiwai@suse.de
commit 5e154dfb4f9995096aa6d342df75040ae802c17e upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() on the error from the probe callback using a new helper function.
Fixes: 854577ac2aea ("ALSA: x86: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-27-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/x86/intel_hdmi_audio.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
--- a/sound/x86/intel_hdmi_audio.c +++ b/sound/x86/intel_hdmi_audio.c @@ -1665,7 +1665,7 @@ static void hdmi_lpe_audio_free(struct s * This function is called when the i915 driver creates the * hdmi-lpe-audio platform device. */ -static int hdmi_lpe_audio_probe(struct platform_device *pdev) +static int __hdmi_lpe_audio_probe(struct platform_device *pdev) { struct snd_card *card; struct snd_intelhad_card *card_ctx; @@ -1828,6 +1828,11 @@ static int hdmi_lpe_audio_probe(struct p return 0; }
+static int hdmi_lpe_audio_probe(struct platform_device *pdev) +{ + return snd_card_free_on_error(&pdev->dev, __hdmi_lpe_audio_probe(pdev)); +} + static const struct dev_pm_ops hdmi_lpe_audio_pm = { SET_SYSTEM_SLEEP_PM_OPS(hdmi_lpe_audio_suspend, hdmi_lpe_audio_resume) };
From: Takashi Iwai tiwai@suse.de
commit c01b723a56ce18ae66ff18c5803942badc15fbcd upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() on the error from the probe callback using a new helper function.
Fixes: b5cde369b618 ("ALSA: korg1212: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-20-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/korg1212/korg1212.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/sound/pci/korg1212/korg1212.c +++ b/sound/pci/korg1212/korg1212.c @@ -2355,7 +2355,7 @@ snd_korg1212_probe(struct pci_dev *pci,
err = snd_korg1212_create(card, pci); if (err < 0) - return err; + goto error;
strcpy(card->driver, "korg1212"); strcpy(card->shortname, "korg1212"); @@ -2366,10 +2366,14 @@ snd_korg1212_probe(struct pci_dev *pci,
err = snd_card_register(card); if (err < 0) - return err; + goto error; pci_set_drvdata(pci, card); dev++; return 0; + + error: + snd_card_free(card); + return err; }
static struct pci_driver korg1212_driver = {
From: Takashi Iwai tiwai@suse.de
commit d04e84b9817c652002f0ee9b42059d41493e9118 upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() on the error from the probe callback using a new helper function.
Fixes: 098fe3d6e775 ("ALSA: lola: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-30-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/lola/lola.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/sound/pci/lola/lola.c +++ b/sound/pci/lola/lola.c @@ -637,8 +637,8 @@ static int lola_create(struct snd_card * return 0; }
-static int lola_probe(struct pci_dev *pci, - const struct pci_device_id *pci_id) +static int __lola_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) { static int dev; struct snd_card *card; @@ -687,6 +687,12 @@ static int lola_probe(struct pci_dev *pc return 0; }
+static int lola_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) +{ + return snd_card_free_on_error(&pci->dev, __lola_probe(pci, pci_id)); +} + /* PCI IDs */ static const struct pci_device_id lola_ids[] = { { PCI_VDEVICE(DIGIGRAM, 0x0001) },
From: Takashi Iwai tiwai@suse.de
commit 60797a21dd8360a99ba797f8ca587087c07bb54c upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() manually on the error from the probe callback.
Fixes: 6f16c19b115e ("ALSA: lx6464es: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-34-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/lx6464es/lx6464es.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/sound/pci/lx6464es/lx6464es.c +++ b/sound/pci/lx6464es/lx6464es.c @@ -1019,7 +1019,7 @@ static int snd_lx6464es_probe(struct pci err = snd_lx6464es_create(card, pci); if (err < 0) { dev_err(card->dev, "error during snd_lx6464es_create\n"); - return err; + goto error; }
strcpy(card->driver, "LX6464ES"); @@ -1036,12 +1036,16 @@ static int snd_lx6464es_probe(struct pci
err = snd_card_register(card); if (err < 0) - return err; + goto error;
dev_dbg(chip->card->dev, "initialization successful\n"); pci_set_drvdata(pci, card); dev++; return 0; + + error: + snd_card_free(card); + return err; }
static struct pci_driver lx6464es_driver = {
From: Takashi Iwai tiwai@suse.de
commit ae86bf5c2a8d81418eadf1c31dd9253b609e3093 upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() on the error from the probe callback using a new helper function.
Fixes: 5c0939253c3c ("ALSA: maestro3: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-21-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/maestro3.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/sound/pci/maestro3.c +++ b/sound/pci/maestro3.c @@ -2637,7 +2637,7 @@ snd_m3_create(struct snd_card *card, str /* */ static int -snd_m3_probe(struct pci_dev *pci, const struct pci_device_id *pci_id) +__snd_m3_probe(struct pci_dev *pci, const struct pci_device_id *pci_id) { static int dev; struct snd_card *card; @@ -2702,6 +2702,12 @@ snd_m3_probe(struct pci_dev *pci, const return 0; }
+static int +snd_m3_probe(struct pci_dev *pci, const struct pci_device_id *pci_id) +{ + return snd_card_free_on_error(&pci->dev, __snd_m3_probe(pci, pci_id)); +} + static struct pci_driver m3_driver = { .name = KBUILD_MODNAME, .id_table = snd_m3_ids,
From: Takashi Iwai tiwai@suse.de
commit 6ebc16e206aa82ddb0450c907865c55bcb7c0f43 upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() on the error from the probe callback using a new helper function.
Fixes: 596ae97ab0ce ("ALSA: oxygen: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-35-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/oxygen/oxygen_lib.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-)
--- a/sound/pci/oxygen/oxygen_lib.c +++ b/sound/pci/oxygen/oxygen_lib.c @@ -576,7 +576,7 @@ static void oxygen_card_free(struct snd_ mutex_destroy(&chip->mutex); }
-int oxygen_pci_probe(struct pci_dev *pci, int index, char *id, +static int __oxygen_pci_probe(struct pci_dev *pci, int index, char *id, struct module *owner, const struct pci_device_id *ids, int (*get_model)(struct oxygen *chip, @@ -701,6 +701,16 @@ int oxygen_pci_probe(struct pci_dev *pci pci_set_drvdata(pci, card); return 0; } + +int oxygen_pci_probe(struct pci_dev *pci, int index, char *id, + struct module *owner, + const struct pci_device_id *ids, + int (*get_model)(struct oxygen *chip, + const struct pci_device_id *id)) +{ + return snd_card_free_on_error(&pci->dev, + __oxygen_pci_probe(pci, index, id, owner, ids, get_model)); +} EXPORT_SYMBOL(oxygen_pci_probe);
#ifdef CONFIG_PM_SLEEP
From: Takashi Iwai tiwai@suse.de
commit 348f08de55b149e41a05111d1a713c4484e5a426 upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() on the error from the probe callback using a new helper function.
Fixes: 546c201a891e ("ALSA: riptide: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-22-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/riptide/riptide.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/sound/pci/riptide/riptide.c +++ b/sound/pci/riptide/riptide.c @@ -2023,7 +2023,7 @@ static void snd_riptide_joystick_remove( #endif
static int -snd_card_riptide_probe(struct pci_dev *pci, const struct pci_device_id *pci_id) +__snd_card_riptide_probe(struct pci_dev *pci, const struct pci_device_id *pci_id) { static int dev; struct snd_card *card; @@ -2124,6 +2124,12 @@ snd_card_riptide_probe(struct pci_dev *p return 0; }
+static int +snd_card_riptide_probe(struct pci_dev *pci, const struct pci_device_id *pci_id) +{ + return snd_card_free_on_error(&pci->dev, __snd_card_riptide_probe(pci, pci_id)); +} + static struct pci_driver driver = { .name = KBUILD_MODNAME, .id_table = snd_riptide_ids,
From: Takashi Iwai tiwai@suse.de
commit 55d2d046b23b9bcb907f6b3e38e52113d55085eb upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() on the error from the probe callback using a new helper function.
Fixes: 102e6156ded2 ("ALSA: rme32: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-23-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/rme32.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/sound/pci/rme32.c +++ b/sound/pci/rme32.c @@ -1875,7 +1875,7 @@ static void snd_rme32_card_free(struct s }
static int -snd_rme32_probe(struct pci_dev *pci, const struct pci_device_id *pci_id) +__snd_rme32_probe(struct pci_dev *pci, const struct pci_device_id *pci_id) { static int dev; struct rme32 *rme32; @@ -1927,6 +1927,12 @@ snd_rme32_probe(struct pci_dev *pci, con return 0; }
+static int +snd_rme32_probe(struct pci_dev *pci, const struct pci_device_id *pci_id) +{ + return snd_card_free_on_error(&pci->dev, __snd_rme32_probe(pci, pci_id)); +} + static struct pci_driver rme32_driver = { .name = KBUILD_MODNAME, .id_table = snd_rme32_ids,
From: Takashi Iwai tiwai@suse.de
commit b2aa4f80693b7841e5ac4eadbd2d8cec56b10a51 upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() manually on the error from the probe callback.
Fixes: b1002b2d41c5 ("ALSA: rme9652: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-38-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/rme9652/rme9652.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/sound/pci/rme9652/rme9652.c +++ b/sound/pci/rme9652/rme9652.c @@ -2572,7 +2572,7 @@ static int snd_rme9652_probe(struct pci_ rme9652->pci = pci; err = snd_rme9652_create(card, rme9652, precise_ptr[dev]); if (err) - return err; + goto error;
strcpy(card->shortname, rme9652->card_name);
@@ -2580,10 +2580,14 @@ static int snd_rme9652_probe(struct pci_ card->shortname, rme9652->port, rme9652->irq); err = snd_card_register(card); if (err) - return err; + goto error; pci_set_drvdata(pci, card); dev++; return 0; + + error: + snd_card_free(card); + return err; }
static struct pci_driver rme9652_driver = {
From: Takashi Iwai tiwai@suse.de
commit 93b884f8d82f08c7af542703a724cc23cd2d5bfc upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() on the error from the probe callback using a new helper function.
Fixes: df06df7cc997 ("ALSA: rme96: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-24-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/rme96.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/sound/pci/rme96.c +++ b/sound/pci/rme96.c @@ -2430,8 +2430,8 @@ static void snd_rme96_card_free(struct s }
static int -snd_rme96_probe(struct pci_dev *pci, - const struct pci_device_id *pci_id) +__snd_rme96_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) { static int dev; struct rme96 *rme96; @@ -2498,6 +2498,12 @@ snd_rme96_probe(struct pci_dev *pci, return 0; }
+static int snd_rme96_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) +{ + return snd_card_free_on_error(&pci->dev, __snd_rme96_probe(pci, pci_id)); +} + static struct pci_driver rme96_driver = { .name = KBUILD_MODNAME, .id_table = snd_rme96_ids,
From: Takashi Iwai tiwai@suse.de
commit d72458071150b802940204950d0d462ea3c913b1 upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() on the error from the probe callback using a new helper function.
Fixes: 111601ff76e9 ("ALSA: sc6000: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-3-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/isa/sc6000.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
--- a/sound/isa/sc6000.c +++ b/sound/isa/sc6000.c @@ -537,7 +537,7 @@ static void snd_sc6000_free(struct snd_c sc6000_setup_board(vport, 0); }
-static int snd_sc6000_probe(struct device *devptr, unsigned int dev) +static int __snd_sc6000_probe(struct device *devptr, unsigned int dev) { static const int possible_irqs[] = { 5, 7, 9, 10, 11, -1 }; static const int possible_dmas[] = { 1, 3, 0, -1 }; @@ -662,6 +662,11 @@ static int snd_sc6000_probe(struct devic return 0; }
+static int snd_sc6000_probe(struct device *devptr, unsigned int dev) +{ + return snd_card_free_on_error(devptr, __snd_sc6000_probe(devptr, dev)); +} + static struct isa_driver snd_sc6000_driver = { .match = snd_sc6000_match, .probe = snd_sc6000_probe,
From: Takashi Iwai tiwai@suse.de
commit b087a381d7386ec95803222d0d9b1ac499550713 upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() on the error from the probe callback using a new helper function.
Fixes: 2ca6cbde6ad7 ("ALSA: sonicvibes: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-25-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/sonicvibes.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/sound/pci/sonicvibes.c +++ b/sound/pci/sonicvibes.c @@ -1387,8 +1387,8 @@ static int snd_sonicvibes_midi(struct so return 0; }
-static int snd_sonic_probe(struct pci_dev *pci, - const struct pci_device_id *pci_id) +static int __snd_sonic_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) { static int dev; struct snd_card *card; @@ -1459,6 +1459,12 @@ static int snd_sonic_probe(struct pci_de return 0; }
+static int snd_sonic_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) +{ + return snd_card_free_on_error(&pci->dev, __snd_sonic_probe(pci, pci_id)); +} + static struct pci_driver sonicvibes_driver = { .name = KBUILD_MODNAME, .id_table = snd_sonic_ids,
From: Takashi Iwai tiwai@suse.de
commit 27a0963f9cea5be3c68281f07fe82cdf712ef333 upstream.
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() on the error from the probe callback using a new helper function.
Fixes: afaf99751d0c ("ALSA: via82xx: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-26-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/via82xx.c | 10 ++++++++-- sound/pci/via82xx_modem.c | 10 ++++++++-- 2 files changed, 16 insertions(+), 4 deletions(-)
--- a/sound/pci/via82xx.c +++ b/sound/pci/via82xx.c @@ -2458,8 +2458,8 @@ static int check_dxs_list(struct pci_dev return VIA_DXS_48K; };
-static int snd_via82xx_probe(struct pci_dev *pci, - const struct pci_device_id *pci_id) +static int __snd_via82xx_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) { struct snd_card *card; struct via82xx *chip; @@ -2569,6 +2569,12 @@ static int snd_via82xx_probe(struct pci_ return 0; }
+static int snd_via82xx_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) +{ + return snd_card_free_on_error(&pci->dev, __snd_via82xx_probe(pci, pci_id)); +} + static struct pci_driver via82xx_driver = { .name = KBUILD_MODNAME, .id_table = snd_via82xx_ids, --- a/sound/pci/via82xx_modem.c +++ b/sound/pci/via82xx_modem.c @@ -1103,8 +1103,8 @@ static int snd_via82xx_create(struct snd }
-static int snd_via82xx_probe(struct pci_dev *pci, - const struct pci_device_id *pci_id) +static int __snd_via82xx_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) { struct snd_card *card; struct via82xx_modem *chip; @@ -1157,6 +1157,12 @@ static int snd_via82xx_probe(struct pci_ return 0; }
+static int snd_via82xx_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) +{ + return snd_card_free_on_error(&pci->dev, __snd_via82xx_probe(pci, pci_id)); +} + static struct pci_driver via82xx_modem_driver = { .name = KBUILD_MODNAME, .id_table = snd_via82xx_modem_ids,
From: Takashi Iwai tiwai@suse.de
commit 98c27add5d96485db731a92dac31567b0486cae8 upstream.
In the implicit feedback mode, some parameters are tied between both playback and capture streams. One of the tied parameters is the period size, and this can be a problem if the device has different number of channels to both streams. Assume that an application opens a playback stream that has an implicit feedback from a capture stream, and it allocates up to the max period and buffer size as much as possible. When the capture device supports only more channels than the playback, the minimum period and buffer sizes become larger than the sizes the playback stream took. That is, the minimum size will be over the max size the driver limits, and PCM core sees as if no available configuration is found, returning -EINVAL mercilessly.
For avoiding this problem, we have to look through the counter part of audioformat list for each sync ep, and checks the channels. If more channels are found there, we reduce the max period and buffer sizes accordingly.
You may wonder that the patch adds only the evaluation of channels between streams, and what about other parameters? Both the format and the rate are tied in the implicit fb mode, hence they are always identical.
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=215792 Fixes: 5a6c3e11c9c9 ("ALSA: usb-audio: Add hw constraint for implicit fb sync") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220407211657.15087-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/usb/pcm.c | 89 ++++++++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 87 insertions(+), 2 deletions(-)
--- a/sound/usb/pcm.c +++ b/sound/usb/pcm.c @@ -659,6 +659,9 @@ static int snd_usb_pcm_prepare(struct sn #define hwc_debug(fmt, args...) do { } while(0) #endif
+#define MAX_BUFFER_BYTES (1024 * 1024) +#define MAX_PERIOD_BYTES (512 * 1024) + static const struct snd_pcm_hardware snd_usb_hardware = { .info = SNDRV_PCM_INFO_MMAP | @@ -669,9 +672,9 @@ static const struct snd_pcm_hardware snd SNDRV_PCM_INFO_PAUSE, .channels_min = 1, .channels_max = 256, - .buffer_bytes_max = 1024 * 1024, + .buffer_bytes_max = MAX_BUFFER_BYTES, .period_bytes_min = 64, - .period_bytes_max = 512 * 1024, + .period_bytes_max = MAX_PERIOD_BYTES, .periods_min = 2, .periods_max = 1024, }; @@ -971,6 +974,78 @@ static int hw_rule_periods_implicit_fb(s ep->cur_buffer_periods); }
+/* get the adjusted max buffer (or period) bytes that can fit with the + * paired format for implicit fb + */ +static unsigned int +get_adjusted_max_bytes(struct snd_usb_substream *subs, + struct snd_usb_substream *pair, + struct snd_pcm_hw_params *params, + unsigned int max_bytes, + bool reverse_map) +{ + const struct audioformat *fp, *pp; + unsigned int rmax = 0, r; + + list_for_each_entry(fp, &subs->fmt_list, list) { + if (!fp->implicit_fb) + continue; + if (!reverse_map && + !hw_check_valid_format(subs, params, fp)) + continue; + list_for_each_entry(pp, &pair->fmt_list, list) { + if (pp->iface != fp->sync_iface || + pp->altsetting != fp->sync_altsetting || + pp->ep_idx != fp->sync_ep_idx) + continue; + if (reverse_map && + !hw_check_valid_format(pair, params, pp)) + break; + if (!reverse_map && pp->channels > fp->channels) + r = max_bytes * fp->channels / pp->channels; + else if (reverse_map && pp->channels < fp->channels) + r = max_bytes * pp->channels / fp->channels; + else + r = max_bytes; + rmax = max(rmax, r); + break; + } + } + return rmax; +} + +/* Reduce the period or buffer bytes depending on the paired substream; + * when a paired configuration for implicit fb has a higher number of channels, + * we need to reduce the max size accordingly, otherwise it may become unusable + */ +static int hw_rule_bytes_implicit_fb(struct snd_pcm_hw_params *params, + struct snd_pcm_hw_rule *rule) +{ + struct snd_usb_substream *subs = rule->private; + struct snd_usb_substream *pair; + struct snd_interval *it; + unsigned int max_bytes; + unsigned int rmax; + + pair = &subs->stream->substream[!subs->direction]; + if (!pair->ep_num) + return 0; + + if (rule->var == SNDRV_PCM_HW_PARAM_PERIOD_BYTES) + max_bytes = MAX_PERIOD_BYTES; + else + max_bytes = MAX_BUFFER_BYTES; + + rmax = get_adjusted_max_bytes(subs, pair, params, max_bytes, false); + if (!rmax) + rmax = get_adjusted_max_bytes(pair, subs, params, max_bytes, true); + if (!rmax) + return 0; + + it = hw_param_interval(params, rule->var); + return apply_hw_params_minmax(it, 0, rmax); +} + /* * set up the runtime hardware information. */ @@ -1085,6 +1160,16 @@ static int setup_hw_info(struct snd_pcm_ SNDRV_PCM_HW_PARAM_PERIODS, -1); if (err < 0) return err; + err = snd_pcm_hw_rule_add(runtime, 0, SNDRV_PCM_HW_PARAM_BUFFER_BYTES, + hw_rule_bytes_implicit_fb, subs, + SNDRV_PCM_HW_PARAM_BUFFER_BYTES, -1); + if (err < 0) + return err; + err = snd_pcm_hw_rule_add(runtime, 0, SNDRV_PCM_HW_PARAM_PERIOD_BYTES, + hw_rule_bytes_implicit_fb, subs, + SNDRV_PCM_HW_PARAM_PERIOD_BYTES, -1); + if (err < 0) + return err;
list_for_each_entry(fp, &subs->fmt_list, list) { if (fp->implicit_fb) {
From: Takashi Iwai tiwai@suse.de
commit 925ca893b4a65177394581737b95d03fea2660f2 upstream.
The recent change for memory allocator replaced the SG-buffer handling helper for x86 with the standard non-contiguous page handler. This works for most cases, but there is a corner case I obviously overlooked, namely, the fallback of non-contiguous handler without IOMMU. When the system runs without IOMMU, the core handler tries to use the continuous pages with a single SGL entry. It works nicely for most cases, but when the system memory gets fragmented, the large allocation may fail frequently.
Ideally the non-contig handler could deal with the proper SG pages, it's cumbersome to extend for now. As a workaround, here we add new types for (minimalistic) SG allocations, instead, so that the allocator falls back to those types automatically when the allocation with the standard API failed.
BTW, one better (but pretty minor) improvement from the previous SG-buffer code is that this provides the proper mmap support without the PCM's page fault handling.
Fixes: 2c95b92ecd92 ("ALSA: memalloc: Unify x86 SG-buffer handling (take#3)") BugLink: https://gitlab.freedesktop.org/pipewire/pipewire/-/issues/2272 BugLink: https://bugzilla.suse.com/show_bug.cgi?id=1198248 Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220413054808.7547-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/sound/memalloc.h | 5 ++ sound/core/memalloc.c | 111 ++++++++++++++++++++++++++++++++++++++- 2 files changed, 115 insertions(+), 1 deletion(-)
diff --git a/include/sound/memalloc.h b/include/sound/memalloc.h index 653dfffb3ac8..8d79cebf95f3 100644 --- a/include/sound/memalloc.h +++ b/include/sound/memalloc.h @@ -51,6 +51,11 @@ struct snd_dma_device { #define SNDRV_DMA_TYPE_DEV_SG SNDRV_DMA_TYPE_DEV /* no SG-buf support */ #define SNDRV_DMA_TYPE_DEV_WC_SG SNDRV_DMA_TYPE_DEV_WC #endif +/* fallback types, don't use those directly */ +#ifdef CONFIG_SND_DMA_SGBUF +#define SNDRV_DMA_TYPE_DEV_SG_FALLBACK 10 +#define SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK 11 +#endif
/* * info for buffer allocation diff --git a/sound/core/memalloc.c b/sound/core/memalloc.c index 6fd763d4d15b..15dc7160ba34 100644 --- a/sound/core/memalloc.c +++ b/sound/core/memalloc.c @@ -499,6 +499,10 @@ static const struct snd_malloc_ops snd_dma_wc_ops = { }; #endif /* CONFIG_X86 */
+#ifdef CONFIG_SND_DMA_SGBUF +static void *snd_dma_sg_fallback_alloc(struct snd_dma_buffer *dmab, size_t size); +#endif + /* * Non-contiguous pages allocator */ @@ -509,8 +513,18 @@ static void *snd_dma_noncontig_alloc(struct snd_dma_buffer *dmab, size_t size)
sgt = dma_alloc_noncontiguous(dmab->dev.dev, size, dmab->dev.dir, DEFAULT_GFP, 0); - if (!sgt) + if (!sgt) { +#ifdef CONFIG_SND_DMA_SGBUF + if (dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG) + dmab->dev.type = SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK; + else + dmab->dev.type = SNDRV_DMA_TYPE_DEV_SG_FALLBACK; + return snd_dma_sg_fallback_alloc(dmab, size); +#else return NULL; +#endif + } + dmab->dev.need_sync = dma_need_sync(dmab->dev.dev, sg_dma_address(sgt->sgl)); p = dma_vmap_noncontiguous(dmab->dev.dev, size, sgt); @@ -633,6 +647,8 @@ static void *snd_dma_sg_wc_alloc(struct snd_dma_buffer *dmab, size_t size)
if (!p) return NULL; + if (dmab->dev.type != SNDRV_DMA_TYPE_DEV_WC_SG) + return p; for_each_sgtable_page(sgt, &iter, 0) set_memory_wc(sg_wc_address(&iter), 1); return p; @@ -665,6 +681,95 @@ static const struct snd_malloc_ops snd_dma_sg_wc_ops = { .get_page = snd_dma_noncontig_get_page, .get_chunk_size = snd_dma_noncontig_get_chunk_size, }; + +/* Fallback SG-buffer allocations for x86 */ +struct snd_dma_sg_fallback { + size_t count; + struct page **pages; + dma_addr_t *addrs; +}; + +static void __snd_dma_sg_fallback_free(struct snd_dma_buffer *dmab, + struct snd_dma_sg_fallback *sgbuf) +{ + size_t i; + + if (sgbuf->count && dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK) + set_pages_array_wb(sgbuf->pages, sgbuf->count); + for (i = 0; i < sgbuf->count && sgbuf->pages[i]; i++) + dma_free_coherent(dmab->dev.dev, PAGE_SIZE, + page_address(sgbuf->pages[i]), + sgbuf->addrs[i]); + kvfree(sgbuf->pages); + kvfree(sgbuf->addrs); + kfree(sgbuf); +} + +static void *snd_dma_sg_fallback_alloc(struct snd_dma_buffer *dmab, size_t size) +{ + struct snd_dma_sg_fallback *sgbuf; + struct page **pages; + size_t i, count; + void *p; + + sgbuf = kzalloc(sizeof(*sgbuf), GFP_KERNEL); + if (!sgbuf) + return NULL; + count = PAGE_ALIGN(size) >> PAGE_SHIFT; + pages = kvcalloc(count, sizeof(*pages), GFP_KERNEL); + if (!pages) + goto error; + sgbuf->pages = pages; + sgbuf->addrs = kvcalloc(count, sizeof(*sgbuf->addrs), GFP_KERNEL); + if (!sgbuf->addrs) + goto error; + + for (i = 0; i < count; sgbuf->count++, i++) { + p = dma_alloc_coherent(dmab->dev.dev, PAGE_SIZE, + &sgbuf->addrs[i], DEFAULT_GFP); + if (!p) + goto error; + sgbuf->pages[i] = virt_to_page(p); + } + + if (dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK) + set_pages_array_wc(pages, count); + p = vmap(pages, count, VM_MAP, PAGE_KERNEL); + if (!p) + goto error; + dmab->private_data = sgbuf; + return p; + + error: + __snd_dma_sg_fallback_free(dmab, sgbuf); + return NULL; +} + +static void snd_dma_sg_fallback_free(struct snd_dma_buffer *dmab) +{ + vunmap(dmab->area); + __snd_dma_sg_fallback_free(dmab, dmab->private_data); +} + +static int snd_dma_sg_fallback_mmap(struct snd_dma_buffer *dmab, + struct vm_area_struct *area) +{ + struct snd_dma_sg_fallback *sgbuf = dmab->private_data; + + if (dmab->dev.type == SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK) + area->vm_page_prot = pgprot_writecombine(area->vm_page_prot); + return vm_map_pages(area, sgbuf->pages, sgbuf->count); +} + +static const struct snd_malloc_ops snd_dma_sg_fallback_ops = { + .alloc = snd_dma_sg_fallback_alloc, + .free = snd_dma_sg_fallback_free, + .mmap = snd_dma_sg_fallback_mmap, + /* reuse vmalloc helpers */ + .get_addr = snd_dma_vmalloc_get_addr, + .get_page = snd_dma_vmalloc_get_page, + .get_chunk_size = snd_dma_vmalloc_get_chunk_size, +}; #endif /* CONFIG_SND_DMA_SGBUF */
/* @@ -736,6 +841,10 @@ static const struct snd_malloc_ops *dma_ops[] = { #ifdef CONFIG_GENERIC_ALLOCATOR [SNDRV_DMA_TYPE_DEV_IRAM] = &snd_dma_iram_ops, #endif /* CONFIG_GENERIC_ALLOCATOR */ +#ifdef CONFIG_SND_DMA_SGBUF + [SNDRV_DMA_TYPE_DEV_SG_FALLBACK] = &snd_dma_sg_fallback_ops, + [SNDRV_DMA_TYPE_DEV_WC_SG_FALLBACK] = &snd_dma_sg_fallback_ops, +#endif #endif /* CONFIG_HAS_DMA */ };
From: Takashi Iwai tiwai@suse.de
commit f20ae5074dfb38f23b0c07c62bdf8e7254a0acf8 upstream.
The card destructor of nm256 driver does merely stopping the running streams, and it's superfluous for the probe error handling. Moreover, calling this via the previous devres change would lead to another problem due to the reverse call order.
This patch moves the setup of the private_free callback after the card registration, so that it can be used only after fully set up.
Fixes: c19935f04784 ("ALSA: nm256: Allocate resources with device-managed APIs") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220412102636.16000-40-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/nm256/nm256.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/sound/pci/nm256/nm256.c +++ b/sound/pci/nm256/nm256.c @@ -1573,7 +1573,6 @@ snd_nm256_create(struct snd_card *card, chip->coeffs_current = 0;
snd_nm256_init_chip(chip); - card->private_free = snd_nm256_free;
// pci_set_master(pci); /* needed? */ return 0; @@ -1680,6 +1679,7 @@ static int snd_nm256_probe(struct pci_de err = snd_card_register(card); if (err < 0) return err; + card->private_free = snd_nm256_free;
pci_set_drvdata(pci, card); return 0;
From: Rob Clark robdclark@chromium.org
[ Upstream commit ac3e4f42d5ec459f701743debd9c1ad2f2247402 ]
Fixes: 25faf2f2e065 ("drm/msm: Show process names in gem_describe") Signed-off-by: Rob Clark robdclark@chromium.org Link: https://lore.kernel.org/r/20220317184550.227991-1-robdclark@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/msm/msm_gem.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/msm/msm_gem.c b/drivers/gpu/drm/msm/msm_gem.c index 02b9ae65a96a..a4f61972667b 100644 --- a/drivers/gpu/drm/msm/msm_gem.c +++ b/drivers/gpu/drm/msm/msm_gem.c @@ -926,6 +926,7 @@ void msm_gem_describe(struct drm_gem_object *obj, struct seq_file *m, get_pid_task(aspace->pid, PIDTYPE_PID); if (task) { comm = kstrdup(task->comm, GFP_KERNEL); + put_task_struct(task); } else { comm = NULL; }
From: Trond Myklebust trond.myklebust@hammerspace.com
[ Upstream commit 6b8a94332ee4f7d9a8ae0cbac7609f79c212f06c ]
The call to filemap_flush() in nfsd_file_put() is there to ensure that we clear out any writes belonging to a NFSv3 client relatively quickly and avoid situations where the file can't be evicted by the garbage collector. It also ensures that we detect write errors quickly.
The problem is this causes a regression in performance for some workloads.
So try to improve matters by deferring writeback until we're ready to close the file, and need to detect errors so that we can force the client to resend.
Tested-by: Jan Kara jack@suse.cz Fixes: b6669305d35a ("nfsd: Reduce the number of calls to nfsd_file_gc()") Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Link: https://lore.kernel.org/all/20220330103457.r4xrhy2d6nhtouzk@quack3.lan Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfsd/filecache.c | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-)
diff --git a/fs/nfsd/filecache.c b/fs/nfsd/filecache.c index cc2831cec669..496f7b3f7523 100644 --- a/fs/nfsd/filecache.c +++ b/fs/nfsd/filecache.c @@ -235,6 +235,13 @@ nfsd_file_check_write_error(struct nfsd_file *nf) return filemap_check_wb_err(file->f_mapping, READ_ONCE(file->f_wb_err)); }
+static void +nfsd_file_flush(struct nfsd_file *nf) +{ + if (nf->nf_file && vfs_fsync(nf->nf_file, 1) != 0) + nfsd_reset_write_verifier(net_generic(nf->nf_net, nfsd_net_id)); +} + static void nfsd_file_do_unhash(struct nfsd_file *nf) { @@ -302,11 +309,14 @@ nfsd_file_put(struct nfsd_file *nf) return; }
- filemap_flush(nf->nf_file->f_mapping); is_hashed = test_bit(NFSD_FILE_HASHED, &nf->nf_flags) != 0; - nfsd_file_put_noref(nf); - if (is_hashed) + if (!is_hashed) { + nfsd_file_flush(nf); + nfsd_file_put_noref(nf); + } else { + nfsd_file_put_noref(nf); nfsd_file_schedule_laundrette(); + } if (atomic_long_read(&nfsd_filecache_count) >= NFSD_FILE_LRU_LIMIT) nfsd_file_gc(); } @@ -327,6 +337,7 @@ nfsd_file_dispose_list(struct list_head *dispose) while(!list_empty(dispose)) { nf = list_first_entry(dispose, struct nfsd_file, nf_lru); list_del(&nf->nf_lru); + nfsd_file_flush(nf); nfsd_file_put_noref(nf); } } @@ -340,6 +351,7 @@ nfsd_file_dispose_list_sync(struct list_head *dispose) while(!list_empty(dispose)) { nf = list_first_entry(dispose, struct nfsd_file, nf_lru); list_del(&nf->nf_lru); + nfsd_file_flush(nf); if (!refcount_dec_and_test(&nf->nf_ref)) continue; if (nfsd_file_free(nf))
From: Cristian Marussi cristian.marussi@arm.com
[ Upstream commit 98f0d68f94ea21541e0050cc64fa108ade779839 ]
On SCMI transports whose channels are based on a shared resource the TX channel area has to be acquired by the agent before placing the desired command into the channel and it will be then relinquished by the platform once the related reply has been made available into the channel. On an RX channel the logic is reversed with the platform acquiring the channel area and the agent reliquishing it once done by calling the scmi_clear_channel() helper.
As a consequence, even in case of error, the agent must never try to clear a TX channel from its side: restrict the existing clear channel call on the the reply path only to delayed responses since they are indeed coming from the RX channel.
Link: https://lore.kernel.org/r/20220224152404.12877-1-cristian.marussi@arm.com Fixes: e9b21c96181c ("firmware: arm_scmi: Make .clear_channel optional") Signed-off-by: Cristian Marussi cristian.marussi@arm.com Signed-off-by: Sudeep Holla sudeep.holla@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/firmware/arm_scmi/driver.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/firmware/arm_scmi/driver.c b/drivers/firmware/arm_scmi/driver.c index d76bab3aaac4..e815b8f98739 100644 --- a/drivers/firmware/arm_scmi/driver.c +++ b/drivers/firmware/arm_scmi/driver.c @@ -652,7 +652,8 @@ static void scmi_handle_response(struct scmi_chan_info *cinfo,
xfer = scmi_xfer_command_acquire(cinfo, msg_hdr); if (IS_ERR(xfer)) { - scmi_clear_channel(info, cinfo); + if (MSG_XTRACT_TYPE(msg_hdr) == MSG_TYPE_DELAYED_RESP) + scmi_clear_channel(info, cinfo); return; }
From: Miaoqian Lin linmq006@gmail.com
[ Upstream commit 6f296a9665ba5ac68937bf11f96214eb9de81baa ]
The device_node pointer is returned by of_parse_phandle() with refcount incremented. We should use of_node_put() on it when done.
Fixes: 87108dc78eb8 ("memory: atmel-ebi: Enable the SMC clock if specified") Signed-off-by: Miaoqian Lin linmq006@gmail.com Reviewed-by: Claudiu Beznea claudiu.beznea@microchip.com Link: https://lore.kernel.org/r/20220309110144.22412-1-linmq006@gmail.com Signed-off-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/memory/atmel-ebi.c | 23 +++++++++++++++++------ 1 file changed, 17 insertions(+), 6 deletions(-)
diff --git a/drivers/memory/atmel-ebi.c b/drivers/memory/atmel-ebi.c index c267283b01fd..e749dcb3ddea 100644 --- a/drivers/memory/atmel-ebi.c +++ b/drivers/memory/atmel-ebi.c @@ -544,20 +544,27 @@ static int atmel_ebi_probe(struct platform_device *pdev) smc_np = of_parse_phandle(dev->of_node, "atmel,smc", 0);
ebi->smc.regmap = syscon_node_to_regmap(smc_np); - if (IS_ERR(ebi->smc.regmap)) - return PTR_ERR(ebi->smc.regmap); + if (IS_ERR(ebi->smc.regmap)) { + ret = PTR_ERR(ebi->smc.regmap); + goto put_node; + }
ebi->smc.layout = atmel_hsmc_get_reg_layout(smc_np); - if (IS_ERR(ebi->smc.layout)) - return PTR_ERR(ebi->smc.layout); + if (IS_ERR(ebi->smc.layout)) { + ret = PTR_ERR(ebi->smc.layout); + goto put_node; + }
ebi->smc.clk = of_clk_get(smc_np, 0); if (IS_ERR(ebi->smc.clk)) { - if (PTR_ERR(ebi->smc.clk) != -ENOENT) - return PTR_ERR(ebi->smc.clk); + if (PTR_ERR(ebi->smc.clk) != -ENOENT) { + ret = PTR_ERR(ebi->smc.clk); + goto put_node; + }
ebi->smc.clk = NULL; } + of_node_put(smc_np); ret = clk_prepare_enable(ebi->smc.clk); if (ret) return ret; @@ -608,6 +615,10 @@ static int atmel_ebi_probe(struct platform_device *pdev) }
return of_platform_populate(np, NULL, NULL, dev); + +put_node: + of_node_put(smc_np); + return ret; }
static __maybe_unused int atmel_ebi_resume(struct device *dev)
From: Anilkumar Kolli quic_akolli@quicinc.com
[ Upstream commit 10cb21f4ff3f9cb36d1e1c39bf80426f02f4986a ]
This reverts commit 743b9065fe6348a5f8f5ce04869ce2d701e5e1bc.
The original commit breaks the 256 bitmap in blockack frames in AP mode. After reverting the commit the feature works again in both AP and mesh modes
Tested-on: IPQ8074 hw2.0 PCI WLAN.HK.2.6.0.1-00786-QCAHKSWPL_SILICONZ-1
Fixes: 743b9065fe63 ("ath11k: mesh: add support for 256 bitmap in blockack frames in 11ax") Signed-off-by: Anilkumar Kolli quic_akolli@quicinc.com Signed-off-by: Kalle Valo kvalo@kernel.org Link: https://lore.kernel.org/r/1648701477-16367-1-git-send-email-quic_akolli@quic... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/ath/ath11k/mac.c | 22 ++++++++++++++-------- 1 file changed, 14 insertions(+), 8 deletions(-)
diff --git a/drivers/net/wireless/ath/ath11k/mac.c b/drivers/net/wireless/ath/ath11k/mac.c index 28de877ad6c4..f54d5819477a 100644 --- a/drivers/net/wireless/ath/ath11k/mac.c +++ b/drivers/net/wireless/ath/ath11k/mac.c @@ -3131,6 +3131,20 @@ static void ath11k_mac_op_bss_info_changed(struct ieee80211_hw *hw, arvif->do_not_send_tmpl = true; else arvif->do_not_send_tmpl = false; + + if (vif->bss_conf.he_support) { + ret = ath11k_wmi_vdev_set_param_cmd(ar, arvif->vdev_id, + WMI_VDEV_PARAM_BA_MODE, + WMI_BA_MODE_BUFFER_SIZE_256); + if (ret) + ath11k_warn(ar->ab, + "failed to set BA BUFFER SIZE 256 for vdev: %d\n", + arvif->vdev_id); + else + ath11k_dbg(ar->ab, ATH11K_DBG_MAC, + "Set BA BUFFER SIZE 256 for VDEV: %d\n", + arvif->vdev_id); + } }
if (changed & (BSS_CHANGED_BEACON_INFO | BSS_CHANGED_BEACON)) { @@ -3166,14 +3180,6 @@ static void ath11k_mac_op_bss_info_changed(struct ieee80211_hw *hw,
if (arvif->is_up && vif->bss_conf.he_support && vif->bss_conf.he_oper.params) { - ret = ath11k_wmi_vdev_set_param_cmd(ar, arvif->vdev_id, - WMI_VDEV_PARAM_BA_MODE, - WMI_BA_MODE_BUFFER_SIZE_256); - if (ret) - ath11k_warn(ar->ab, - "failed to set BA BUFFER SIZE 256 for vdev: %d\n", - arvif->vdev_id); - param_id = WMI_VDEV_PARAM_HEOPS_0_31; param_value = vif->bss_conf.he_oper.params; ret = ath11k_wmi_vdev_set_param_cmd(ar, arvif->vdev_id,
From: Cristian Marussi cristian.marussi@arm.com
[ Upstream commit 23274739a5b6166f74d8d9cb5243d7bf6b46aab9 ]
During SCMI Clock protocol initialization, after having retrieved from the SCMI platform all the available discrete rates for a specific clock, the clock rates array is sorted, unfortunately using a pointer to its end as a base instead of its start, so that sorting does not work.
Fix invocation of sort() passing as base a pointer to the start of the retrieved clock rates array.
Link: https://lore.kernel.org/r/20220318092813.49283-1-cristian.marussi@arm.com Fixes: dccec73de91d ("firmware: arm_scmi: Keep the discrete clock rates sorted") Signed-off-by: Cristian Marussi cristian.marussi@arm.com Signed-off-by: Sudeep Holla sudeep.holla@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/firmware/arm_scmi/clock.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/firmware/arm_scmi/clock.c b/drivers/firmware/arm_scmi/clock.c index 35b56c8ba0c0..492f3a9197ec 100644 --- a/drivers/firmware/arm_scmi/clock.c +++ b/drivers/firmware/arm_scmi/clock.c @@ -204,7 +204,8 @@ scmi_clock_describe_rates_get(const struct scmi_protocol_handle *ph, u32 clk_id,
if (rate_discrete && rate) { clk->list.num_rates = tot_rate_cnt; - sort(rate, tot_rate_cnt, sizeof(*rate), rate_cmp_func, NULL); + sort(clk->list.rates, tot_rate_cnt, sizeof(*rate), + rate_cmp_func, NULL); }
clk->rate_discrete = rate_discrete;
From: Kyle Copperfield kmcopper@danwin1210.me
[ Upstream commit 6150f276073a1480030242a7e006a89e161d6cd6 ]
The latest fix for probe error handling contained a typo that causes probing to fail with the following message:
rockchip-rga: probe of ff680000.rga failed with error -12
This patch fixes the typo.
Fixes: e58430e1d4fd (media: rockchip/rga: fix error handling in probe) Reviewed-by: Dragan Simic dragan.simic@gmail.com Signed-off-by: Kyle Copperfield kmcopper@danwin1210.me Reviewed-by: Kieran Bingham kieran.bingham+renesas@ideasonboard.com Reviewed-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Mauro Carvalho Chehab mchehab+huawei@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/platform/rockchip/rga/rga.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/media/platform/rockchip/rga/rga.c b/drivers/media/platform/rockchip/rga/rga.c index 4de5e8d2b261..3d3d1062e212 100644 --- a/drivers/media/platform/rockchip/rga/rga.c +++ b/drivers/media/platform/rockchip/rga/rga.c @@ -892,7 +892,7 @@ static int rga_probe(struct platform_device *pdev) } rga->dst_mmu_pages = (unsigned int *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 3); - if (rga->dst_mmu_pages) { + if (!rga->dst_mmu_pages) { ret = -ENOMEM; goto free_src_pages; }
From: Marc Zyngier maz@kernel.org
[ Upstream commit 06394531b425794dc56f3d525b7994d25b8072f7 ]
We currently deal with a set of booleans for VM features, while they could be better represented as set of flags contained in an unsigned long, similarily to what we are doing on the CPU side.
Signed-off-by: Marc Zyngier maz@kernel.org [Oliver: Flag-ify the 'ran_once' boolean] Signed-off-by: Oliver Upton oupton@google.com Signed-off-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20220311174001.605719-2-oupton@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/include/asm/kvm_host.h | 15 +++++++++------ arch/arm64/kvm/arm.c | 7 ++++--- arch/arm64/kvm/mmio.c | 3 ++- arch/arm64/kvm/pmu-emul.c | 2 +- 4 files changed, 16 insertions(+), 11 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 8234626a945a..2619fda42ca2 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -122,7 +122,12 @@ struct kvm_arch { * should) opt in to this feature if KVM_CAP_ARM_NISV_TO_USER is * supported. */ - bool return_nisv_io_abort_to_user; +#define KVM_ARCH_FLAG_RETURN_NISV_IO_ABORT_TO_USER 0 + /* Memory Tagging Extension enabled for the guest */ +#define KVM_ARCH_FLAG_MTE_ENABLED 1 + /* At least one vCPU has ran in the VM */ +#define KVM_ARCH_FLAG_HAS_RAN_ONCE 2 + unsigned long flags;
/* * VM-wide PMU filter, implemented as a bitmap and big enough for @@ -133,10 +138,6 @@ struct kvm_arch {
u8 pfr0_csv2; u8 pfr0_csv3; - - /* Memory Tagging Extension enabled for the guest */ - bool mte_enabled; - bool ran_once; };
struct kvm_vcpu_fault_info { @@ -792,7 +793,9 @@ bool kvm_arm_vcpu_is_finalized(struct kvm_vcpu *vcpu); #define kvm_arm_vcpu_sve_finalized(vcpu) \ ((vcpu)->arch.flags & KVM_ARM64_VCPU_SVE_FINALIZED)
-#define kvm_has_mte(kvm) (system_supports_mte() && (kvm)->arch.mte_enabled) +#define kvm_has_mte(kvm) \ + (system_supports_mte() && \ + test_bit(KVM_ARCH_FLAG_MTE_ENABLED, &(kvm)->arch.flags)) #define kvm_vcpu_has_pmu(vcpu) \ (test_bit(KVM_ARM_VCPU_PMU_V3, (vcpu)->arch.features))
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c index 85a2a75f4498..25d8aff273a1 100644 --- a/arch/arm64/kvm/arm.c +++ b/arch/arm64/kvm/arm.c @@ -89,7 +89,8 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, switch (cap->cap) { case KVM_CAP_ARM_NISV_TO_USER: r = 0; - kvm->arch.return_nisv_io_abort_to_user = true; + set_bit(KVM_ARCH_FLAG_RETURN_NISV_IO_ABORT_TO_USER, + &kvm->arch.flags); break; case KVM_CAP_ARM_MTE: mutex_lock(&kvm->lock); @@ -97,7 +98,7 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm, r = -EINVAL; } else { r = 0; - kvm->arch.mte_enabled = true; + set_bit(KVM_ARCH_FLAG_MTE_ENABLED, &kvm->arch.flags); } mutex_unlock(&kvm->lock); break; @@ -635,7 +636,7 @@ int kvm_arch_vcpu_run_pid_change(struct kvm_vcpu *vcpu) kvm_call_hyp_nvhe(__pkvm_vcpu_init_traps, vcpu);
mutex_lock(&kvm->lock); - kvm->arch.ran_once = true; + set_bit(KVM_ARCH_FLAG_HAS_RAN_ONCE, &kvm->arch.flags); mutex_unlock(&kvm->lock);
return ret; diff --git a/arch/arm64/kvm/mmio.c b/arch/arm64/kvm/mmio.c index 3e2d8ba11a02..3dd38a151d2a 100644 --- a/arch/arm64/kvm/mmio.c +++ b/arch/arm64/kvm/mmio.c @@ -135,7 +135,8 @@ int io_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa) * volunteered to do so, and bail out otherwise. */ if (!kvm_vcpu_dabt_isvalid(vcpu)) { - if (vcpu->kvm->arch.return_nisv_io_abort_to_user) { + if (test_bit(KVM_ARCH_FLAG_RETURN_NISV_IO_ABORT_TO_USER, + &vcpu->kvm->arch.flags)) { run->exit_reason = KVM_EXIT_ARM_NISV; run->arm_nisv.esr_iss = kvm_vcpu_dabt_iss_nisv_sanitized(vcpu); run->arm_nisv.fault_ipa = fault_ipa; diff --git a/arch/arm64/kvm/pmu-emul.c b/arch/arm64/kvm/pmu-emul.c index bc771bc1a041..fc6ee6f02fec 100644 --- a/arch/arm64/kvm/pmu-emul.c +++ b/arch/arm64/kvm/pmu-emul.c @@ -982,7 +982,7 @@ int kvm_arm_pmu_v3_set_attr(struct kvm_vcpu *vcpu, struct kvm_device_attr *attr)
mutex_lock(&kvm->lock);
- if (kvm->arch.ran_once) { + if (test_bit(KVM_ARCH_FLAG_HAS_RAN_ONCE, &kvm->arch.flags)) { mutex_unlock(&kvm->lock); return -EBUSY; }
From: Reiji Watanabe reijiw@google.com
[ Upstream commit 26bf74bd9f6ff0f1545b4f0c92a37c232d076014 ]
KVM allows userspace to configure either all EL1 32bit or 64bit vCPUs for a guest. At vCPU reset, vcpu_allowed_register_width() checks if the vcpu's register width is consistent with all other vCPUs'. Since the checking is done even against vCPUs that are not initialized (KVM_ARM_VCPU_INIT has not been done) yet, the uninitialized vCPUs are erroneously treated as 64bit vCPU, which causes the function to incorrectly detect a mixed-width VM.
Introduce KVM_ARCH_FLAG_EL1_32BIT and KVM_ARCH_FLAG_REG_WIDTH_CONFIGURED bits for kvm->arch.flags. A value of the EL1_32BIT bit indicates that the guest needs to be configured with all 32bit or 64bit vCPUs, and a value of the REG_WIDTH_CONFIGURED bit indicates if a value of the EL1_32BIT bit is valid (already set up). Values in those bits are set at the first KVM_ARM_VCPU_INIT for the guest based on KVM_ARM_VCPU_EL1_32BIT configuration for the vCPU.
Check vcpu's register width against those new bits at the vcpu's KVM_ARM_VCPU_INIT (instead of against other vCPUs' register width).
Fixes: 66e94d5cafd4 ("KVM: arm64: Prevent mixed-width VM creation") Signed-off-by: Reiji Watanabe reijiw@google.com Reviewed-by: Oliver Upton oupton@google.com Signed-off-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20220329031924.619453-2-reijiw@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/include/asm/kvm_emulate.h | 27 ++++++++---- arch/arm64/include/asm/kvm_host.h | 10 +++++ arch/arm64/kvm/reset.c | 65 +++++++++++++++++++--------- 3 files changed, 74 insertions(+), 28 deletions(-)
diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h index d62405ce3e6d..7496deab025a 100644 --- a/arch/arm64/include/asm/kvm_emulate.h +++ b/arch/arm64/include/asm/kvm_emulate.h @@ -43,10 +43,22 @@ void kvm_inject_pabt(struct kvm_vcpu *vcpu, unsigned long addr);
void kvm_vcpu_wfi(struct kvm_vcpu *vcpu);
+#if defined(__KVM_VHE_HYPERVISOR__) || defined(__KVM_NVHE_HYPERVISOR__) static __always_inline bool vcpu_el1_is_32bit(struct kvm_vcpu *vcpu) { return !(vcpu->arch.hcr_el2 & HCR_RW); } +#else +static __always_inline bool vcpu_el1_is_32bit(struct kvm_vcpu *vcpu) +{ + struct kvm *kvm = vcpu->kvm; + + WARN_ON_ONCE(!test_bit(KVM_ARCH_FLAG_REG_WIDTH_CONFIGURED, + &kvm->arch.flags)); + + return test_bit(KVM_ARCH_FLAG_EL1_32BIT, &kvm->arch.flags); +} +#endif
static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu) { @@ -72,15 +84,14 @@ static inline void vcpu_reset_hcr(struct kvm_vcpu *vcpu) vcpu->arch.hcr_el2 |= HCR_TVM; }
- if (test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features)) + if (vcpu_el1_is_32bit(vcpu)) vcpu->arch.hcr_el2 &= ~HCR_RW; - - /* - * TID3: trap feature register accesses that we virtualise. - * For now this is conditional, since no AArch32 feature regs - * are currently virtualised. - */ - if (!vcpu_el1_is_32bit(vcpu)) + else + /* + * TID3: trap feature register accesses that we virtualise. + * For now this is conditional, since no AArch32 feature regs + * are currently virtualised. + */ vcpu->arch.hcr_el2 |= HCR_TID3;
if (cpus_have_const_cap(ARM64_MISMATCHED_CACHE_TYPE) || diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 2619fda42ca2..b5ae92f77c61 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -127,6 +127,16 @@ struct kvm_arch { #define KVM_ARCH_FLAG_MTE_ENABLED 1 /* At least one vCPU has ran in the VM */ #define KVM_ARCH_FLAG_HAS_RAN_ONCE 2 + /* + * The following two bits are used to indicate the guest's EL1 + * register width configuration. A value of KVM_ARCH_FLAG_EL1_32BIT + * bit is valid only when KVM_ARCH_FLAG_REG_WIDTH_CONFIGURED is set. + * Otherwise, the guest's EL1 register width has not yet been + * determined yet. + */ +#define KVM_ARCH_FLAG_REG_WIDTH_CONFIGURED 3 +#define KVM_ARCH_FLAG_EL1_32BIT 4 + unsigned long flags;
/* diff --git a/arch/arm64/kvm/reset.c b/arch/arm64/kvm/reset.c index ecc40c8cd6f6..6c70c6f61c70 100644 --- a/arch/arm64/kvm/reset.c +++ b/arch/arm64/kvm/reset.c @@ -181,27 +181,51 @@ static int kvm_vcpu_enable_ptrauth(struct kvm_vcpu *vcpu) return 0; }
-static bool vcpu_allowed_register_width(struct kvm_vcpu *vcpu) +/** + * kvm_set_vm_width() - set the register width for the guest + * @vcpu: Pointer to the vcpu being configured + * + * Set both KVM_ARCH_FLAG_EL1_32BIT and KVM_ARCH_FLAG_REG_WIDTH_CONFIGURED + * in the VM flags based on the vcpu's requested register width, the HW + * capabilities and other options (such as MTE). + * When REG_WIDTH_CONFIGURED is already set, the vcpu settings must be + * consistent with the value of the FLAG_EL1_32BIT bit in the flags. + * + * Return: 0 on success, negative error code on failure. + */ +static int kvm_set_vm_width(struct kvm_vcpu *vcpu) { - struct kvm_vcpu *tmp; + struct kvm *kvm = vcpu->kvm; bool is32bit; - unsigned long i;
is32bit = vcpu_has_feature(vcpu, KVM_ARM_VCPU_EL1_32BIT); + + lockdep_assert_held(&kvm->lock); + + if (test_bit(KVM_ARCH_FLAG_REG_WIDTH_CONFIGURED, &kvm->arch.flags)) { + /* + * The guest's register width is already configured. + * Make sure that the vcpu is consistent with it. + */ + if (is32bit == test_bit(KVM_ARCH_FLAG_EL1_32BIT, &kvm->arch.flags)) + return 0; + + return -EINVAL; + } + if (!cpus_have_const_cap(ARM64_HAS_32BIT_EL1) && is32bit) - return false; + return -EINVAL;
/* MTE is incompatible with AArch32 */ - if (kvm_has_mte(vcpu->kvm) && is32bit) - return false; + if (kvm_has_mte(kvm) && is32bit) + return -EINVAL;
- /* Check that the vcpus are either all 32bit or all 64bit */ - kvm_for_each_vcpu(i, tmp, vcpu->kvm) { - if (vcpu_has_feature(tmp, KVM_ARM_VCPU_EL1_32BIT) != is32bit) - return false; - } + if (is32bit) + set_bit(KVM_ARCH_FLAG_EL1_32BIT, &kvm->arch.flags);
- return true; + set_bit(KVM_ARCH_FLAG_REG_WIDTH_CONFIGURED, &kvm->arch.flags); + + return 0; }
/** @@ -230,10 +254,16 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu) u32 pstate;
mutex_lock(&vcpu->kvm->lock); - reset_state = vcpu->arch.reset_state; - WRITE_ONCE(vcpu->arch.reset_state.reset, false); + ret = kvm_set_vm_width(vcpu); + if (!ret) { + reset_state = vcpu->arch.reset_state; + WRITE_ONCE(vcpu->arch.reset_state.reset, false); + } mutex_unlock(&vcpu->kvm->lock);
+ if (ret) + return ret; + /* Reset PMU outside of the non-preemptible section */ kvm_pmu_vcpu_reset(vcpu);
@@ -260,14 +290,9 @@ int kvm_reset_vcpu(struct kvm_vcpu *vcpu) } }
- if (!vcpu_allowed_register_width(vcpu)) { - ret = -EINVAL; - goto out; - } - switch (vcpu->arch.target) { default: - if (test_bit(KVM_ARM_VCPU_EL1_32BIT, vcpu->arch.features)) { + if (vcpu_el1_is_32bit(vcpu)) { pstate = VCPU_RESET_PSTATE_SVC; } else { pstate = VCPU_RESET_PSTATE_EL1;
From: Chuck Lever chuck.lever@oracle.com
[ Upstream commit 4d5004451ab2218eab94a30e1841462c9316ba19 ]
Fix a NULL deref crash that occurs when an svc_rqst is deferred while the sunrpc tracing subsystem is enabled. svc_revisit() sets dr->xprt to NULL, so it can't be relied upon in the tracepoint to provide the remote's address.
Unfortunately we can't revert the "svc_deferred_class" hunk in commit ece200ddd54b ("sunrpc: Save remote presentation address in svc_xprt for trace events") because there is now a specific check of event format specifiers for unsafe dereferences. The warning that check emits is:
event svc_defer_recv has unsafe dereference of argument 1
A "%pISpc" format specifier with a "struct sockaddr *" is indeed flagged by this check.
Instead, take the brute-force approach used by the svcrdma_qp_error tracepoint. Convert the dr::addr field into a presentation address in the TP_fast_assign() arm of the trace event, and store that as a string. This fix can be backported to -stable kernels.
In the meantime, commit c6ced22997ad ("tracing: Update print fmt check to handle new __get_sockaddr() macro") is now in v5.18, so this wonky fix can be replaced with __sockaddr() and friends properly during the v5.19 merge window.
Fixes: ece200ddd54b ("sunrpc: Save remote presentation address in svc_xprt for trace events") Signed-off-by: Chuck Lever chuck.lever@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/trace/events/sunrpc.h | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/include/trace/events/sunrpc.h b/include/trace/events/sunrpc.h index 5be3faf88c1a..06fe47fb3686 100644 --- a/include/trace/events/sunrpc.h +++ b/include/trace/events/sunrpc.h @@ -1956,17 +1956,18 @@ DECLARE_EVENT_CLASS(svc_deferred_event, TP_STRUCT__entry( __field(const void *, dr) __field(u32, xid) - __string(addr, dr->xprt->xpt_remotebuf) + __array(__u8, addr, INET6_ADDRSTRLEN + 10) ),
TP_fast_assign( __entry->dr = dr; __entry->xid = be32_to_cpu(*(__be32 *)(dr->args + (dr->xprt_hlen>>2))); - __assign_str(addr, dr->xprt->xpt_remotebuf); + snprintf(__entry->addr, sizeof(__entry->addr) - 1, + "%pISpc", (struct sockaddr *)&dr->addr); ),
- TP_printk("addr=%s dr=%p xid=0x%08x", __get_str(addr), __entry->dr, + TP_printk("addr=%s dr=%p xid=0x%08x", __entry->addr, __entry->dr, __entry->xid) );
From: Vlad Buslov vladbu@nvidia.com
[ Upstream commit 2105f700b53c24aa48b65c15652acc386044d26a ]
A tc flower filter matching TCA_FLOWER_KEY_VLAN_ETH_TYPE is expected to match the L2 ethertype following the first VLAN header, as confirmed by linked discussion with the maintainer. However, such rule also matches packets that have additional second VLAN header, even though filter has both eth_type and vlan_ethtype set to "ipv4". Looking at the code this seems to be mostly an artifact of the way flower uses flow dissector. First, even though looking at the uAPI eth_type and vlan_ethtype appear like a distinct fields, in flower they are all mapped to the same key->basic.n_proto. Second, flow dissector skips following VLAN header as no keys for FLOW_DISSECTOR_KEY_CVLAN are set and eventually assigns the value of n_proto to last parsed header. With these, such filters ignore any headers present between first VLAN header and first "non magic" header (ipv4 in this case) that doesn't result FLOW_DISSECT_RET_PROTO_AGAIN.
Fix the issue by extending flow dissector VLAN key structure with new 'vlan_eth_type' field that matches first ethertype following previously parsed VLAN header. Modify flower classifier to set the new flow_dissector_key_vlan->vlan_eth_type with value obtained from TCA_FLOWER_KEY_VLAN_ETH_TYPE/TCA_FLOWER_KEY_CVLAN_ETH_TYPE uAPIs.
Link: https://lore.kernel.org/all/Yjhgi48BpTGh6dig@nanopsycho/ Fixes: 9399ae9a6cb2 ("net_sched: flower: Add vlan support") Fixes: d64efd0926ba ("net/sched: flower: Add supprt for matching on QinQ vlan headers") Signed-off-by: Vlad Buslov vladbu@nvidia.com Reviewed-by: Jiri Pirko jiri@nvidia.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/flow_dissector.h | 2 ++ net/core/flow_dissector.c | 1 + net/sched/cls_flower.c | 18 +++++++++++++----- 3 files changed, 16 insertions(+), 5 deletions(-)
diff --git a/include/net/flow_dissector.h b/include/net/flow_dissector.h index aa33e1092e2c..9f65f1bfbd24 100644 --- a/include/net/flow_dissector.h +++ b/include/net/flow_dissector.h @@ -59,6 +59,8 @@ struct flow_dissector_key_vlan { __be16 vlan_tci; }; __be16 vlan_tpid; + __be16 vlan_eth_type; + u16 padding; };
struct flow_dissector_mpls_lse { diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c index 15833e1d6ea1..544d2028ccf5 100644 --- a/net/core/flow_dissector.c +++ b/net/core/flow_dissector.c @@ -1182,6 +1182,7 @@ bool __skb_flow_dissect(const struct net *net, VLAN_PRIO_MASK) >> VLAN_PRIO_SHIFT; } key_vlan->vlan_tpid = saved_vlan_tpid; + key_vlan->vlan_eth_type = proto; }
fdret = FLOW_DISSECT_RET_PROTO_AGAIN; diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c index 1a9b1f140f9e..ef5b3452254a 100644 --- a/net/sched/cls_flower.c +++ b/net/sched/cls_flower.c @@ -1005,6 +1005,7 @@ static int fl_set_key_mpls(struct nlattr **tb, static void fl_set_key_vlan(struct nlattr **tb, __be16 ethertype, int vlan_id_key, int vlan_prio_key, + int vlan_next_eth_type_key, struct flow_dissector_key_vlan *key_val, struct flow_dissector_key_vlan *key_mask) { @@ -1023,6 +1024,11 @@ static void fl_set_key_vlan(struct nlattr **tb, } key_val->vlan_tpid = ethertype; key_mask->vlan_tpid = cpu_to_be16(~0); + if (tb[vlan_next_eth_type_key]) { + key_val->vlan_eth_type = + nla_get_be16(tb[vlan_next_eth_type_key]); + key_mask->vlan_eth_type = cpu_to_be16(~0); + } }
static void fl_set_key_flag(u32 flower_key, u32 flower_mask, @@ -1519,8 +1525,9 @@ static int fl_set_key(struct net *net, struct nlattr **tb,
if (eth_type_vlan(ethertype)) { fl_set_key_vlan(tb, ethertype, TCA_FLOWER_KEY_VLAN_ID, - TCA_FLOWER_KEY_VLAN_PRIO, &key->vlan, - &mask->vlan); + TCA_FLOWER_KEY_VLAN_PRIO, + TCA_FLOWER_KEY_VLAN_ETH_TYPE, + &key->vlan, &mask->vlan);
if (tb[TCA_FLOWER_KEY_VLAN_ETH_TYPE]) { ethertype = nla_get_be16(tb[TCA_FLOWER_KEY_VLAN_ETH_TYPE]); @@ -1528,6 +1535,7 @@ static int fl_set_key(struct net *net, struct nlattr **tb, fl_set_key_vlan(tb, ethertype, TCA_FLOWER_KEY_CVLAN_ID, TCA_FLOWER_KEY_CVLAN_PRIO, + TCA_FLOWER_KEY_CVLAN_ETH_TYPE, &key->cvlan, &mask->cvlan); fl_set_key_val(tb, &key->basic.n_proto, TCA_FLOWER_KEY_CVLAN_ETH_TYPE, @@ -2886,13 +2894,13 @@ static int fl_dump_key(struct sk_buff *skb, struct net *net, goto nla_put_failure;
if (mask->basic.n_proto) { - if (mask->cvlan.vlan_tpid) { + if (mask->cvlan.vlan_eth_type) { if (nla_put_be16(skb, TCA_FLOWER_KEY_CVLAN_ETH_TYPE, key->basic.n_proto)) goto nla_put_failure; - } else if (mask->vlan.vlan_tpid) { + } else if (mask->vlan.vlan_eth_type) { if (nla_put_be16(skb, TCA_FLOWER_KEY_VLAN_ETH_TYPE, - key->basic.n_proto)) + key->vlan.vlan_eth_type)) goto nla_put_failure; } }
From: Guillaume Nault gnault@redhat.com
[ Upstream commit 726e2c5929de841fdcef4e2bf995680688ae1b87 ]
After feeding a decapsulated packet to a veth device with act_mirred, skb_headlen() may be 0. But veth_xmit() calls __dev_forward_skb(), which expects at least ETH_HLEN byte of linear data (as __dev_forward_skb2() calls eth_type_trans(), which pulls ETH_HLEN bytes unconditionally).
Use pskb_may_pull() to ensure veth_xmit() respects this constraint.
kernel BUG at include/linux/skbuff.h:2328! RIP: 0010:eth_type_trans+0xcf/0x140 Call Trace: <IRQ> __dev_forward_skb2+0xe3/0x160 veth_xmit+0x6e/0x250 [veth] dev_hard_start_xmit+0xc7/0x200 __dev_queue_xmit+0x47f/0x520 ? skb_ensure_writable+0x85/0xa0 ? skb_mpls_pop+0x98/0x1c0 tcf_mirred_act+0x442/0x47e [act_mirred] tcf_action_exec+0x86/0x140 fl_classify+0x1d8/0x1e0 [cls_flower] ? dma_pte_clear_level+0x129/0x1a0 ? dma_pte_clear_level+0x129/0x1a0 ? prb_fill_curr_block+0x2f/0xc0 ? skb_copy_bits+0x11a/0x220 __tcf_classify+0x58/0x110 tcf_classify_ingress+0x6b/0x140 __netif_receive_skb_core.constprop.0+0x47d/0xfd0 ? __iommu_dma_unmap_swiotlb+0x44/0x90 __netif_receive_skb_one_core+0x3d/0xa0 netif_receive_skb+0x116/0x170 be_process_rx+0x22f/0x330 [be2net] be_poll+0x13c/0x370 [be2net] __napi_poll+0x2a/0x170 net_rx_action+0x22f/0x2f0 __do_softirq+0xca/0x2a8 __irq_exit_rcu+0xc1/0xe0 common_interrupt+0x83/0xa0
Fixes: e314dbdc1c0d ("[NET]: Virtual ethernet device driver.") Signed-off-by: Guillaume Nault gnault@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/veth.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/veth.c b/drivers/net/veth.c index d29fb9759cc9..6c8f4f4dfc8a 100644 --- a/drivers/net/veth.c +++ b/drivers/net/veth.c @@ -320,7 +320,7 @@ static netdev_tx_t veth_xmit(struct sk_buff *skb, struct net_device *dev)
rcu_read_lock(); rcv = rcu_dereference(priv->peer); - if (unlikely(!rcv)) { + if (unlikely(!rcv) || !pskb_may_pull(skb, ETH_HLEN)) { kfree_skb(skb); goto drop; }
From: Linus Torvalds torvalds@linux-foundation.org
[ Upstream commit 213d266ebfb1621aab79cfe63388facc520a1381 ]
When compiling with -Wformat, clang emits the following warning:
gpiolib-acpi.c:393:4: warning: format specifies type 'unsigned char' but the argument has type 'int' [-Wformat] pin); ^~~
So warning that '%hhX' is paired with an 'int' is all just completely mindless and wrong. Sadly, I can see a different bogus warning reason why people would want to use '%02hhX'.
Again, the *sane* thing from a human perspective is to use '%02X. But if the compiler doesn't do any range analysis at all, it could decide that "Oh, that print format could need up to 8 bytes of space in the result". Using '%02hhX' would cut that down to two.
And since we use
char ev_name[5];
and currently use "_%c%02hhX" as the format string, even a compiler that doesn't notice that "pin <= 255" test that guards this all will go "OK, that's at most 4 bytes and the final NUL termination, so it's fine".
While a compiler - like gcc - that only sees that the original source of the 'pin' value is a 'unsigned short' array, and then doesn't take the "pin <= 255" into account, will warn like this:
gpiolib-acpi.c: In function 'acpi_gpiochip_request_interrupt': gpiolib-acpi.c:206:24: warning: '%02X' directive writing between 2 and 4 bytes into a region of size 3 [-Wformat-overflow=] sprintf(ev_name, "_%c%02X", ^~~~ gpiolib-acpi.c:206:20: note: directive argument in the range [0, 65535]
because gcc isn't being very good at that argument range analysis either.
In other words, the original use of 'hhx' was bogus to begin with, and due to *another* compiler warning being bad, and we had that bad code being written back in 2016 to work around _that_ compiler warning (commit e40a3ae1f794: "gpio: acpi: work around false-positive -Wstring-overflow warning").
Sadly, two different bad compiler warnings together does not make for one good one.
It just makes for even more pain.
End result: I think the simplest and cleanest option is simply the proposed change which undoes that '%hhX' change for gcc, and replaces it with just using a slightly bigger stack allocation. It's not like a 5-byte allocation is in any way likely to have saved any actual stack, since all the other variables in that function are 'int' or bigger.
False-positive compiler warnings really do make people write worse code, and that's a problem. But on a scale of bad code, I feel that extending the buffer trivially is better than adding a pointless cast that literally makes no sense.
At least in this case the end result isn't unreadable or buggy. We've had several cases of bad compiler warnings that caused changes that were actually horrendously wrong.
Fixes: e40a3ae1f794 ("gpio: acpi: work around false-positive -Wstring-overflow warning") Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpio/gpiolib-acpi.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpio/gpiolib-acpi.c b/drivers/gpio/gpiolib-acpi.c index a5495ad31c9c..b7c2f2af1dee 100644 --- a/drivers/gpio/gpiolib-acpi.c +++ b/drivers/gpio/gpiolib-acpi.c @@ -387,8 +387,8 @@ static acpi_status acpi_gpiochip_alloc_event(struct acpi_resource *ares, pin = agpio->pin_table[0];
if (pin <= 255) { - char ev_name[5]; - sprintf(ev_name, "_%c%02hhX", + char ev_name[8]; + sprintf(ev_name, "_%c%02X", agpio->triggering == ACPI_EDGE_SENSITIVE ? 'E' : 'L', pin); if (ACPI_SUCCESS(acpi_get_handle(handle, ev_name, &evt_handle)))
From: Shyam Prasad N sprasad@microsoft.com
[ Upstream commit d788e51636462e61c6883f7d96b07b06bc291650 ]
During cifs_kill_sb, we first dput all the dentries that we have cached. However this function can also get called for mount failures. So dput the cached dentries only if the filesystem mount is complete. i.e. cifs_sb->root is populated.
Fixes: 5e9c89d43fa6 ("cifs: Grab a reference for the dentry of the cached directory during the lifetime of the cache") Signed-off-by: Shyam Prasad N sprasad@microsoft.com Reviewed-by: Ronnie Sahlberg lsahlber@redhat.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cifs/cifsfs.c | 28 +++++++++++++++------------- 1 file changed, 15 insertions(+), 13 deletions(-)
diff --git a/fs/cifs/cifsfs.c b/fs/cifs/cifsfs.c index 6e5246122ee2..9f0f6bdb3a4d 100644 --- a/fs/cifs/cifsfs.c +++ b/fs/cifs/cifsfs.c @@ -266,22 +266,24 @@ static void cifs_kill_sb(struct super_block *sb) * before we kill the sb. */ if (cifs_sb->root) { + node = rb_first(root); + while (node != NULL) { + tlink = rb_entry(node, struct tcon_link, tl_rbnode); + tcon = tlink_tcon(tlink); + cfid = &tcon->crfid; + mutex_lock(&cfid->fid_mutex); + if (cfid->dentry) { + dput(cfid->dentry); + cfid->dentry = NULL; + } + mutex_unlock(&cfid->fid_mutex); + node = rb_next(node); + } + + /* finally release root dentry */ dput(cifs_sb->root); cifs_sb->root = NULL; } - node = rb_first(root); - while (node != NULL) { - tlink = rb_entry(node, struct tcon_link, tl_rbnode); - tcon = tlink_tcon(tlink); - cfid = &tcon->crfid; - mutex_lock(&cfid->fid_mutex); - if (cfid->dentry) { - dput(cfid->dentry); - cfid->dentry = NULL; - } - mutex_unlock(&cfid->fid_mutex); - node = rb_next(node); - }
kill_anon_super(sb); cifs_umount(cifs_sb);
From: Alexander Lobakin alexandr.lobakin@intel.com
[ Upstream commit d7442f512b71fc63a99c8a801422dde4fbbf9f93 ]
The CI testing bots triggered the following splat:
[ 718.203054] BUG: KASAN: use-after-free in free_irq_cpu_rmap+0x53/0x80 [ 718.206349] Read of size 4 at addr ffff8881bd127e00 by task sh/20834 [ 718.212852] CPU: 28 PID: 20834 Comm: sh Kdump: loaded Tainted: G S W IOE 5.17.0-rc8_nextqueue-devqueue-02643-g23f3121aca93 #1 [ 718.219695] Hardware name: Intel Corporation S2600WFT/S2600WFT, BIOS SE5C620.86B.02.01.0012.070720200218 07/07/2020 [ 718.223418] Call Trace: [ 718.227139] [ 718.230783] dump_stack_lvl+0x33/0x42 [ 718.234431] print_address_description.constprop.9+0x21/0x170 [ 718.238177] ? free_irq_cpu_rmap+0x53/0x80 [ 718.241885] ? free_irq_cpu_rmap+0x53/0x80 [ 718.245539] kasan_report.cold.18+0x7f/0x11b [ 718.249197] ? free_irq_cpu_rmap+0x53/0x80 [ 718.252852] free_irq_cpu_rmap+0x53/0x80 [ 718.256471] ice_free_cpu_rx_rmap.part.11+0x37/0x50 [ice] [ 718.260174] ice_remove_arfs+0x5f/0x70 [ice] [ 718.263810] ice_rebuild_arfs+0x3b/0x70 [ice] [ 718.267419] ice_rebuild+0x39c/0xb60 [ice] [ 718.270974] ? asm_sysvec_apic_timer_interrupt+0x12/0x20 [ 718.274472] ? ice_init_phy_user_cfg+0x360/0x360 [ice] [ 718.278033] ? delay_tsc+0x4a/0xb0 [ 718.281513] ? preempt_count_sub+0x14/0xc0 [ 718.284984] ? delay_tsc+0x8f/0xb0 [ 718.288463] ice_do_reset+0x92/0xf0 [ice] [ 718.292014] ice_pci_err_resume+0x91/0xf0 [ice] [ 718.295561] pci_reset_function+0x53/0x80 <...> [ 718.393035] Allocated by task 690: [ 718.433497] Freed by task 20834: [ 718.495688] Last potentially related work creation: [ 718.568966] The buggy address belongs to the object at ffff8881bd127e00 which belongs to the cache kmalloc-96 of size 96 [ 718.574085] The buggy address is located 0 bytes inside of 96-byte region [ffff8881bd127e00, ffff8881bd127e60) [ 718.579265] The buggy address belongs to the page: [ 718.598905] Memory state around the buggy address: [ 718.601809] ffff8881bd127d00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc [ 718.604796] ffff8881bd127d80: 00 00 00 00 00 00 00 00 00 00 fc fc fc fc fc fc [ 718.607794] >ffff8881bd127e00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc [ 718.610811] ^ [ 718.613819] ffff8881bd127e80: 00 00 00 00 00 00 00 00 00 00 00 00 fc fc fc fc [ 718.617107] ffff8881bd127f00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
This is due to that free_irq_cpu_rmap() is always being called *after* (devm_)free_irq() and thus it tries to work with IRQ descs already freed. For example, on device reset the driver frees the rmap right before allocating a new one (the splat above). Make rmap creation and freeing function symmetrical with {request,free}_irq() calls i.e. do that on ifup/ifdown instead of device probe/remove/resume. These operations can be performed independently from the actual device aRFS configuration. Also, make sure ice_vsi_free_irq() clears IRQ affinity notifiers only when aRFS is disabled -- otherwise, CPU rmap sets and clears its own and they must not be touched manually.
Fixes: 28bf26724fdb0 ("ice: Implement aRFS") Co-developed-by: Ivan Vecera ivecera@redhat.com Signed-off-by: Ivan Vecera ivecera@redhat.com Signed-off-by: Alexander Lobakin alexandr.lobakin@intel.com Tested-by: Ivan Vecera ivecera@redhat.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ice/ice_arfs.c | 9 ++------- drivers/net/ethernet/intel/ice/ice_lib.c | 5 ++++- drivers/net/ethernet/intel/ice/ice_main.c | 18 ++++++++---------- 3 files changed, 14 insertions(+), 18 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_arfs.c b/drivers/net/ethernet/intel/ice/ice_arfs.c index 5daade32ea62..fba178e07600 100644 --- a/drivers/net/ethernet/intel/ice/ice_arfs.c +++ b/drivers/net/ethernet/intel/ice/ice_arfs.c @@ -577,7 +577,7 @@ void ice_free_cpu_rx_rmap(struct ice_vsi *vsi) { struct net_device *netdev;
- if (!vsi || vsi->type != ICE_VSI_PF || !vsi->arfs_fltr_list) + if (!vsi || vsi->type != ICE_VSI_PF) return;
netdev = vsi->netdev; @@ -599,7 +599,7 @@ int ice_set_cpu_rx_rmap(struct ice_vsi *vsi) int base_idx, i;
if (!vsi || vsi->type != ICE_VSI_PF) - return -EINVAL; + return 0;
pf = vsi->back; netdev = vsi->netdev; @@ -636,7 +636,6 @@ void ice_remove_arfs(struct ice_pf *pf) if (!pf_vsi) return;
- ice_free_cpu_rx_rmap(pf_vsi); ice_clear_arfs(pf_vsi); }
@@ -653,9 +652,5 @@ void ice_rebuild_arfs(struct ice_pf *pf) return;
ice_remove_arfs(pf); - if (ice_set_cpu_rx_rmap(pf_vsi)) { - dev_err(ice_pf_to_dev(pf), "Failed to rebuild aRFS\n"); - return; - } ice_init_arfs(pf_vsi); } diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c index 5fd2bbeab2d1..15bb6f001a04 100644 --- a/drivers/net/ethernet/intel/ice/ice_lib.c +++ b/drivers/net/ethernet/intel/ice/ice_lib.c @@ -2869,6 +2869,8 @@ void ice_vsi_free_irq(struct ice_vsi *vsi) return;
vsi->irqs_ready = false; + ice_free_cpu_rx_rmap(vsi); + ice_for_each_q_vector(vsi, i) { u16 vector = i + base; int irq_num; @@ -2882,7 +2884,8 @@ void ice_vsi_free_irq(struct ice_vsi *vsi) continue;
/* clear the affinity notifier in the IRQ descriptor */ - irq_set_affinity_notifier(irq_num, NULL); + if (!IS_ENABLED(CONFIG_RFS_ACCEL)) + irq_set_affinity_notifier(irq_num, NULL);
/* clear the affinity_mask in the IRQ descriptor */ irq_set_affinity_hint(irq_num, NULL); diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c index db2e02e673a7..2de2bbbca1e9 100644 --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -2494,6 +2494,13 @@ static int ice_vsi_req_irq_msix(struct ice_vsi *vsi, char *basename) irq_set_affinity_hint(irq_num, &q_vector->affinity_mask); }
+ err = ice_set_cpu_rx_rmap(vsi); + if (err) { + netdev_err(vsi->netdev, "Failed to setup CPU RMAP on VSI %u: %pe\n", + vsi->vsi_num, ERR_PTR(err)); + goto free_q_irqs; + } + vsi->irqs_ready = true; return 0;
@@ -3605,20 +3612,12 @@ static int ice_setup_pf_sw(struct ice_pf *pf) */ ice_napi_add(vsi);
- status = ice_set_cpu_rx_rmap(vsi); - if (status) { - dev_err(dev, "Failed to set CPU Rx map VSI %d error %d\n", - vsi->vsi_num, status); - goto unroll_napi_add; - } status = ice_init_mac_fltr(pf); if (status) - goto free_cpu_rx_map; + goto unroll_napi_add;
return 0;
-free_cpu_rx_map: - ice_free_cpu_rx_rmap(vsi); unroll_napi_add: ice_tc_indir_block_unregister(vsi); unroll_cfg_netdev: @@ -5076,7 +5075,6 @@ static int __maybe_unused ice_suspend(struct device *dev) continue; ice_vsi_free_q_vectors(pf->vsi[v]); } - ice_free_cpu_rx_rmap(ice_get_main_vsi(pf)); ice_clear_interrupt_scheme(pf);
pci_save_state(pdev);
From: Mateusz Palczewski mateusz.palczewski@intel.com
[ Upstream commit 7d59706dbef8de83b3662026766507bc494223d7 ]
This change caused a regression with resetting while changing network namespaces. By clearing the IFF_UP flag, the kernel now thinks it has fully closed the device.
This reverts commit 0cc318d2e8408bc0ffb4662a0c3e5e57005ac6ff.
Fixes: 0cc318d2e840 ("iavf: Fix deadlock occurrence during resetting VF interface") Signed-off-by: Mateusz Palczewski mateusz.palczewski@intel.com Tested-by: Konrad Jankowski konrad0.jankowski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/iavf/iavf_main.c | 7 ++----- 1 file changed, 2 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c index d10e9a8e8011..f55ecb672768 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_main.c +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c @@ -2817,7 +2817,6 @@ static void iavf_reset_task(struct work_struct *work) running = adapter->state == __IAVF_RUNNING;
if (running) { - netdev->flags &= ~IFF_UP; netif_carrier_off(netdev); netif_tx_stop_all_queues(netdev); adapter->link_up = false; @@ -2934,7 +2933,7 @@ static void iavf_reset_task(struct work_struct *work) * to __IAVF_RUNNING */ iavf_up_complete(adapter); - netdev->flags |= IFF_UP; + iavf_irq_enable(adapter, true); } else { iavf_change_state(adapter, __IAVF_DOWN); @@ -2950,10 +2949,8 @@ static void iavf_reset_task(struct work_struct *work) reset_err: mutex_unlock(&adapter->client_lock); mutex_unlock(&adapter->crit_lock); - if (running) { + if (running) iavf_change_state(adapter, __IAVF_RUNNING); - netdev->flags |= IFF_UP; - } dev_err(&adapter->pdev->dev, "failed to allocate resources during reinit\n"); iavf_close(netdev); }
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit 74befa447e6839cdd90ed541159ec783726946f9 ]
When a driver for an interrupt controller is missing, of_irq_get() returns -EPROBE_DEFER ad infinitum, causing fwnode_mdiobus_phy_device_register(), and ultimately, the entire of_mdiobus_register() call, to fail. In turn, any phy_connect() call towards a PHY on this MDIO bus will also fail.
This is not what is expected to happen, because the PHY library falls back to poll mode when of_irq_get() returns a hard error code, and the MDIO bus, PHY and attached Ethernet controller work fine, albeit suboptimally, when the PHY library polls for link status. However, -EPROBE_DEFER has special handling given the assumption that at some point probe deferral will stop, and the driver for the supplier will kick in and create the IRQ domain.
Reasons for which the interrupt controller may be missing:
- It is not yet written. This may happen if a more recent DT blob (with an interrupt-parent for the PHY) is used to boot an old kernel where the driver didn't exist, and that kernel worked with the vintage-correct DT blob using poll mode.
- It is compiled out. Behavior is the same as above.
- It is compiled as a module. The kernel will wait for a number of seconds specified in the "deferred_probe_timeout" boot parameter for user space to load the required module. The current default is 0, which times out at the end of initcalls. It is possible that this might cause regressions unless users adjust this boot parameter.
The proposed solution is to use the driver_deferred_probe_check_state() helper function provided by the driver core, which gives up after some -EPROBE_DEFER attempts, taking "deferred_probe_timeout" into consideration. The return code is changed from -EPROBE_DEFER into -ENODEV or -ETIMEDOUT, depending on whether the kernel is compiled with support for modules or not.
Fixes: 66bdede495c7 ("of_mdio: Fix broken PHY IRQ in case of probe deferral") Suggested-by: Robin Murphy robin.murphy@arm.com Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Acked-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Reviewed-by: Florian Fainelli f.fainelli@gmail.com Link: https://lore.kernel.org/r/20220407165538.4084809-1-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/base/dd.c | 1 + drivers/net/mdio/fwnode_mdio.c | 5 +++++ 2 files changed, 6 insertions(+)
diff --git a/drivers/base/dd.c b/drivers/base/dd.c index 752a11d16e26..7e079fa3795b 100644 --- a/drivers/base/dd.c +++ b/drivers/base/dd.c @@ -296,6 +296,7 @@ int driver_deferred_probe_check_state(struct device *dev)
return -EPROBE_DEFER; } +EXPORT_SYMBOL_GPL(driver_deferred_probe_check_state);
static void deferred_probe_timeout_work_func(struct work_struct *work) { diff --git a/drivers/net/mdio/fwnode_mdio.c b/drivers/net/mdio/fwnode_mdio.c index 1becb1a731f6..1c1584fca632 100644 --- a/drivers/net/mdio/fwnode_mdio.c +++ b/drivers/net/mdio/fwnode_mdio.c @@ -43,6 +43,11 @@ int fwnode_mdiobus_phy_device_register(struct mii_bus *mdio, int rc;
rc = fwnode_irq_get(child, 0); + /* Don't wait forever if the IRQ provider doesn't become available, + * just fall back to poll mode + */ + if (rc == -EPROBE_DEFER) + rc = driver_deferred_probe_check_state(&phy->mdio.dev); if (rc == -EPROBE_DEFER) return rc;
From: Vadim Pasternak vadimp@nvidia.com
[ Upstream commit d452088cdfd5a4ad9d96d847d2273fe958d6339b ]
Add mutex_destroy() call in driver initialization error flow.
Fixes: 6882b0aee180f ("mlxsw: Introduce support for I2C bus") Signed-off-by: Vadim Pasternak vadimp@nvidia.com Signed-off-by: Ido Schimmel idosch@nvidia.com Link: https://lore.kernel.org/r/20220407070703.2421076-1-idosch@nvidia.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlxsw/i2c.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/i2c.c b/drivers/net/ethernet/mellanox/mlxsw/i2c.c index 939b692ffc33..ce843ea91464 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/i2c.c +++ b/drivers/net/ethernet/mellanox/mlxsw/i2c.c @@ -650,6 +650,7 @@ static int mlxsw_i2c_probe(struct i2c_client *client, return 0;
errout: + mutex_destroy(&mlxsw_i2c->cmd.lock); i2c_set_clientdata(client, NULL);
return err;
From: Xin Long lucien.xin@gmail.com
[ Upstream commit e2d88f9ce678cd33763826ae2f0412f181251314 ]
Yi Chen reported an unexpected sctp connection abort, and it occurred when COOKIE_ECHO is bundled with DATA Fragment by SCTP HW GSO. As the IP header is included in chunk->head_skb instead of chunk->skb, it failed to check IP header version in security_sctp_assoc_request().
According to Ondrej, SELinux only looks at IP header (address and IPsec options) and XFRM state data, and these are all included in head_skb for SCTP HW GSO packets. So fix it by using head_skb when calling security_sctp_assoc_request() in processing COOKIE_ECHO.
v1->v2: - As Ondrej noticed, chunk->head_skb should also be used for security_sctp_assoc_established() in sctp_sf_do_5_1E_ca().
Fixes: e215dab1c490 ("security: call security_sctp_assoc_request in sctp_sf_do_5_1D_ce") Reported-by: Yi Chen yiche@redhat.com Signed-off-by: Xin Long lucien.xin@gmail.com Reviewed-by: Ondrej Mosnacek omosnace@redhat.com Acked-by: Marcelo Ricardo Leitner marcelo.leitner@gmail.com Link: https://lore.kernel.org/r/71becb489e51284edf0c11fc15246f4ed4cef5b6.164933786... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/sctp/sm_statefuns.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/sctp/sm_statefuns.c b/net/sctp/sm_statefuns.c index 7f342bc12735..52edee1322fc 100644 --- a/net/sctp/sm_statefuns.c +++ b/net/sctp/sm_statefuns.c @@ -781,7 +781,7 @@ enum sctp_disposition sctp_sf_do_5_1D_ce(struct net *net, } }
- if (security_sctp_assoc_request(new_asoc, chunk->skb)) { + if (security_sctp_assoc_request(new_asoc, chunk->head_skb ?: chunk->skb)) { sctp_association_free(new_asoc); return sctp_sf_pdiscard(net, ep, asoc, type, arg, commands); } @@ -932,7 +932,7 @@ enum sctp_disposition sctp_sf_do_5_1E_ca(struct net *net,
/* Set peer label for connection. */ if (security_sctp_assoc_established((struct sctp_association *)asoc, - chunk->skb)) + chunk->head_skb ?: chunk->skb)) return sctp_sf_pdiscard(net, ep, asoc, type, arg, commands);
/* Verify that the chunk length for the COOKIE-ACK is OK. @@ -2262,7 +2262,7 @@ enum sctp_disposition sctp_sf_do_5_2_4_dupcook( }
/* Update socket peer label if first association. */ - if (security_sctp_assoc_request(new_asoc, chunk->skb)) { + if (security_sctp_assoc_request(new_asoc, chunk->head_skb ?: chunk->skb)) { sctp_association_free(new_asoc); return sctp_sf_pdiscard(net, ep, asoc, type, arg, commands); }
From: Marcelo Ricardo Leitner marcelo.leitner@gmail.com
[ Upstream commit e65812fd22eba32f11abe28cb377cbd64cfb1ba0 ]
Currently, when inserting a new filter that needs to sit at the head of chain 0, it will first update the heads pointer on all devices using the (shared) block, and only then complete the initialization of the new element so that it has a "next" element.
This can lead to a situation that the chain 0 head is propagated to another CPU before the "next" initialization is done. When this race condition is triggered, packets being matched on that CPU will simply miss all other filters, and will flow through the stack as if there were no other filters installed. If the system is using OVS + TC, such packets will get handled by vswitchd via upcall, which results in much higher latency and reordering. For other applications it may result in packet drops.
This is reproducible with a tc only setup, but it varies from system to system. It could be reproduced with a shared block amongst 10 veth tunnels, and an ingress filter mirroring packets to another veth. That's because using the last added veth tunnel to the shared block to do the actual traffic, it makes the race window bigger and easier to trigger.
The fix is rather simple, to just initialize the next pointer of the new filter instance (tp) before propagating the head change.
The fixes tag is pointing to the original code though this issue should only be observed when using it unlocked.
Fixes: 2190d1d0944f ("net: sched: introduce helpers to work with filter chains") Signed-off-by: Marcelo Ricardo Leitner marcelo.leitner@gmail.com Signed-off-by: Vlad Buslov vladbu@nvidia.com Reviewed-by: Davide Caratti dcaratti@redhat.com Link: https://lore.kernel.org/r/b97d5f4eaffeeb9d058155bcab63347527261abf.164934136... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/cls_api.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c index 5ce1208a6ea3..130b5fda9c51 100644 --- a/net/sched/cls_api.c +++ b/net/sched/cls_api.c @@ -1653,10 +1653,10 @@ static int tcf_chain_tp_insert(struct tcf_chain *chain, if (chain->flushing) return -EAGAIN;
+ RCU_INIT_POINTER(tp->next, tcf_chain_tp_prev(chain, chain_info)); if (*chain_info->pprev == chain->filter_chain) tcf_chain0_head_change(chain, tp); tcf_proto_get(tp); - RCU_INIT_POINTER(tp->next, tcf_chain_tp_prev(chain, chain_info)); rcu_assign_pointer(*chain_info->pprev, tp);
return 0;
From: Jeffle Xu jefflexu@linux.alibaba.com
[ Upstream commit ea5dc046127e857a7873ae55fd57c866e9e86fb2 ]
Unmark inode in use if error encountered. If the in-use flag leakage occurs in cachefiles_open_file(), Cachefiles will complain "Inode already in use" when later another cookie with the same index key is looked up.
If the in-use flag leakage occurs in cachefiles_create_tmpfile(), though the "Inode already in use" warning won't be triggered, fix the leakage anyway.
Reported-by: Gao Xiang hsiangkao@linux.alibaba.com Fixes: 1f08c925e7a3 ("cachefiles: Implement backing file wrangling") Signed-off-by: Jeffle Xu jefflexu@linux.alibaba.com Signed-off-by: David Howells dhowells@redhat.com cc: linux-cachefs@redhat.com Link: https://listman.redhat.com/archives/linux-cachefs/2022-March/006615.html # v1 Link: https://listman.redhat.com/archives/linux-cachefs/2022-March/006618.html # v2 Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cachefiles/namei.c | 33 ++++++++++++++++++++++++--------- 1 file changed, 24 insertions(+), 9 deletions(-)
diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c index f256c8aff7bb..ca9f3e4ec4b3 100644 --- a/fs/cachefiles/namei.c +++ b/fs/cachefiles/namei.c @@ -57,6 +57,16 @@ static void __cachefiles_unmark_inode_in_use(struct cachefiles_object *object, trace_cachefiles_mark_inactive(object, inode); }
+static void cachefiles_do_unmark_inode_in_use(struct cachefiles_object *object, + struct dentry *dentry) +{ + struct inode *inode = d_backing_inode(dentry); + + inode_lock(inode); + __cachefiles_unmark_inode_in_use(object, dentry); + inode_unlock(inode); +} + /* * Unmark a backing inode and tell cachefilesd that there's something that can * be culled. @@ -68,9 +78,7 @@ void cachefiles_unmark_inode_in_use(struct cachefiles_object *object, struct inode *inode = file_inode(file);
if (inode) { - inode_lock(inode); - __cachefiles_unmark_inode_in_use(object, file->f_path.dentry); - inode_unlock(inode); + cachefiles_do_unmark_inode_in_use(object, file->f_path.dentry);
if (!test_bit(CACHEFILES_OBJECT_USING_TMPFILE, &object->flags)) { atomic_long_add(inode->i_blocks, &cache->b_released); @@ -484,7 +492,7 @@ struct file *cachefiles_create_tmpfile(struct cachefiles_object *object) object, d_backing_inode(path.dentry), ret, cachefiles_trace_trunc_error); file = ERR_PTR(ret); - goto out_dput; + goto out_unuse; } }
@@ -494,15 +502,20 @@ struct file *cachefiles_create_tmpfile(struct cachefiles_object *object) trace_cachefiles_vfs_error(object, d_backing_inode(path.dentry), PTR_ERR(file), cachefiles_trace_open_error); - goto out_dput; + goto out_unuse; } if (unlikely(!file->f_op->read_iter) || unlikely(!file->f_op->write_iter)) { fput(file); pr_notice("Cache does not support read_iter and write_iter\n"); file = ERR_PTR(-EINVAL); + goto out_unuse; }
+ goto out_dput; + +out_unuse: + cachefiles_do_unmark_inode_in_use(object, path.dentry); out_dput: dput(path.dentry); out: @@ -590,14 +603,16 @@ static bool cachefiles_open_file(struct cachefiles_object *object, check_failed: fscache_cookie_lookup_negative(object->cookie); cachefiles_unmark_inode_in_use(object, file); - if (ret == -ESTALE) { - fput(file); - dput(dentry); + fput(file); + dput(dentry); + if (ret == -ESTALE) return cachefiles_create_file(object); - } + return false; + error_fput: fput(file); error: + cachefiles_do_unmark_inode_in_use(object, dentry); dput(dentry); return false; }
From: Dave Wysochanski dwysocha@redhat.com
[ Upstream commit 7b2f6c306601240635c72caa61f682e74d4591b2 ]
Use the actual length of volume coherency data when setting the xattr to avoid the following KASAN report.
BUG: KASAN: slab-out-of-bounds in cachefiles_set_volume_xattr+0xa0/0x350 [cachefiles] Write of size 4 at addr ffff888101e02af4 by task kworker/6:0/1347
CPU: 6 PID: 1347 Comm: kworker/6:0 Kdump: loaded Not tainted 5.18.0-rc1-nfs-fscache-netfs+ #13 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-4.fc34 04/01/2014 Workqueue: events fscache_create_volume_work [fscache] Call Trace: <TASK> dump_stack_lvl+0x45/0x5a print_report.cold+0x5e/0x5db ? __lock_text_start+0x8/0x8 ? cachefiles_set_volume_xattr+0xa0/0x350 [cachefiles] kasan_report+0xab/0x120 ? cachefiles_set_volume_xattr+0xa0/0x350 [cachefiles] kasan_check_range+0xf5/0x1d0 memcpy+0x39/0x60 cachefiles_set_volume_xattr+0xa0/0x350 [cachefiles] cachefiles_acquire_volume+0x2be/0x500 [cachefiles] ? __cachefiles_free_volume+0x90/0x90 [cachefiles] fscache_create_volume_work+0x68/0x160 [fscache] process_one_work+0x3b7/0x6a0 worker_thread+0x2c4/0x650 ? process_one_work+0x6a0/0x6a0 kthread+0x16c/0x1a0 ? kthread_complete_and_exit+0x20/0x20 ret_from_fork+0x22/0x30 </TASK>
Allocated by task 1347: kasan_save_stack+0x1e/0x40 __kasan_kmalloc+0x81/0xa0 cachefiles_set_volume_xattr+0x76/0x350 [cachefiles] cachefiles_acquire_volume+0x2be/0x500 [cachefiles] fscache_create_volume_work+0x68/0x160 [fscache] process_one_work+0x3b7/0x6a0 worker_thread+0x2c4/0x650 kthread+0x16c/0x1a0 ret_from_fork+0x22/0x30
The buggy address belongs to the object at ffff888101e02af0 which belongs to the cache kmalloc-8 of size 8 The buggy address is located 4 bytes inside of 8-byte region [ffff888101e02af0, ffff888101e02af8)
The buggy address belongs to the physical page: page:00000000a2292d70 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x101e02 flags: 0x17ffffc0000200(slab|node=0|zone=2|lastcpupid=0x1fffff) raw: 0017ffffc0000200 0000000000000000 dead000000000001 ffff888100042280 raw: 0000000000000000 0000000080660066 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected
Memory state around the buggy address: ffff888101e02980: fc 00 fc fc fc fc 00 fc fc fc fc 00 fc fc fc fc ffff888101e02a00: 00 fc fc fc fc 00 fc fc fc fc 00 fc fc fc fc 00
ffff888101e02a80: fc fc fc fc 00 fc fc fc fc 00 fc fc fc fc 04 fc
^ ffff888101e02b00: fc fc fc 00 fc fc fc fc 00 fc fc fc fc 00 fc fc ffff888101e02b80: fc fc 00 fc fc fc fc 00 fc fc fc fc 00 fc fc fc ==================================================================
Fixes: 413a4a6b0b55 "cachefiles: Fix volume coherency attribute" Signed-off-by: Dave Wysochanski dwysocha@redhat.com Signed-off-by: David Howells dhowells@redhat.com cc: linux-cachefs@redhat.com Link: https://lore.kernel.org/r/20220405134649.6579-1-dwysocha@redhat.com/ # v1 Link: https://lore.kernel.org/r/20220405142810.8208-1-dwysocha@redhat.com/ # Incorrect v2 Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cachefiles/xattr.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/cachefiles/xattr.c b/fs/cachefiles/xattr.c index 35465109d9c4..00b087c14995 100644 --- a/fs/cachefiles/xattr.c +++ b/fs/cachefiles/xattr.c @@ -203,7 +203,7 @@ bool cachefiles_set_volume_xattr(struct cachefiles_volume *volume) if (!buf) return false; buf->reserved = cpu_to_be32(0); - memcpy(buf->data, p, len); + memcpy(buf->data, p, volume->vcookie->coherency_len);
ret = cachefiles_inject_write_error(); if (ret == 0)
From: Michael Walle michael@walle.cc
[ Upstream commit e6934e4048c91502efcb21da92b7ae37cd8fa741 ]
The DSA master might not have been probed yet in which case the probe of the felix switch fails with -EPROBE_DEFER: [ 4.435305] mscc_felix 0000:00:00.5: Failed to register DSA switch: -517
It is not an error. Use dev_err_probe() to demote this particular error to a debug message.
Fixes: 56051948773e ("net: dsa: ocelot: add driver for Felix switch family") Signed-off-by: Michael Walle michael@walle.cc Reviewed-by: Vladimir Oltean vladimir.oltean@nxp.com Link: https://lore.kernel.org/r/20220408101521.281886-1-michael@walle.cc Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/ocelot/felix_vsc9959.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/dsa/ocelot/felix_vsc9959.c b/drivers/net/dsa/ocelot/felix_vsc9959.c index 2875b5250856..443d34ce2853 100644 --- a/drivers/net/dsa/ocelot/felix_vsc9959.c +++ b/drivers/net/dsa/ocelot/felix_vsc9959.c @@ -2328,7 +2328,7 @@ static int felix_pci_probe(struct pci_dev *pdev,
err = dsa_register_switch(ds); if (err) { - dev_err(&pdev->dev, "Failed to register DSA switch: %d\n", err); + dev_err_probe(&pdev->dev, err, "Failed to register DSA switch\n"); goto err_register_ds; }
From: Anup Patel apatel@ventanamicro.com
[ Upstream commit fac3725364397f9a40a101f089b86ea655a58d06 ]
Supporting hardware updates of PTE A and D bits is optional for any RISC-V implementation so current software strategy is to always set these bits in both G-stage (hypervisor) and VS-stage (guest kernel).
If PTE A and D bits are not set by software (hypervisor or guest) then RISC-V implementations not supporting hardware updates of these bits will cause traps even for perfectly valid PTEs.
Based on above explanation, the VS-stage page table created by various KVM selftest applications is not correct because PTE A and D bits are not set. This patch fixes VS-stage page table programming of PTE A and D bits for KVM selftests.
Fixes: 3e06cdf10520 ("KVM: selftests: Add initial support for RISC-V 64-bit") Signed-off-by: Anup Patel apatel@ventanamicro.com Tested-by: Mayuresh Chitale mchitale@ventanamicro.com Signed-off-by: Anup Patel anup@brainfault.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/kvm/include/riscv/processor.h | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kvm/include/riscv/processor.h b/tools/testing/selftests/kvm/include/riscv/processor.h index dc284c6bdbc3..eca5c622efd2 100644 --- a/tools/testing/selftests/kvm/include/riscv/processor.h +++ b/tools/testing/selftests/kvm/include/riscv/processor.h @@ -101,7 +101,9 @@ static inline void set_reg(struct kvm_vm *vm, uint32_t vcpuid, uint64_t id, #define PGTBL_PTE_WRITE_SHIFT 2 #define PGTBL_PTE_READ_MASK 0x0000000000000002ULL #define PGTBL_PTE_READ_SHIFT 1 -#define PGTBL_PTE_PERM_MASK (PGTBL_PTE_EXECUTE_MASK | \ +#define PGTBL_PTE_PERM_MASK (PGTBL_PTE_ACCESSED_MASK | \ + PGTBL_PTE_DIRTY_MASK | \ + PGTBL_PTE_EXECUTE_MASK | \ PGTBL_PTE_WRITE_MASK | \ PGTBL_PTE_READ_MASK) #define PGTBL_PTE_VALID_MASK 0x0000000000000001ULL
From: Anup Patel apatel@ventanamicro.com
[ Upstream commit ebdef0de2dbc40e697adaa6b3408130f7a7b8351 ]
The guest_hang() function is used as the default exception handler for various KVM selftests applications by setting it's address in the vstvec CSR. The vstvec CSR requires exception handler base address to be at least 4-byte aligned so this patch fixes alignment of the guest_hang() function.
Fixes: 3e06cdf10520 ("KVM: selftests: Add initial support for RISC-V 64-bit") Signed-off-by: Anup Patel apatel@ventanamicro.com Tested-by: Mayuresh Chitale mchitale@ventanamicro.com Signed-off-by: Anup Patel anup@brainfault.org Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/kvm/lib/riscv/processor.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/kvm/lib/riscv/processor.c b/tools/testing/selftests/kvm/lib/riscv/processor.c index d377f2603d98..3961487a4870 100644 --- a/tools/testing/selftests/kvm/lib/riscv/processor.c +++ b/tools/testing/selftests/kvm/lib/riscv/processor.c @@ -268,7 +268,7 @@ void vcpu_dump(FILE *stream, struct kvm_vm *vm, uint32_t vcpuid, uint8_t indent) core.regs.t3, core.regs.t4, core.regs.t5, core.regs.t6); }
-static void guest_hang(void) +static void __aligned(16) guest_hang(void) { while (1) ;
From: Heiko Stuebner heiko@sntech.de
[ Upstream commit 4054eee9290248bf66c5eacb58879c9aaad37f71 ]
vcpu_fp uses the riscv_isa_extension mechanism which gets defined in hwcap.h but doesn't include that head file.
While it seems to work in most cases, in certain conditions this can lead to build failures like
../arch/riscv/kvm/vcpu_fp.c: In function ‘kvm_riscv_vcpu_fp_reset’: ../arch/riscv/kvm/vcpu_fp.c:22:13: error: implicit declaration of function ‘riscv_isa_extension_available’ [-Werror=implicit-function-declaration] 22 | if (riscv_isa_extension_available(&isa, f) || | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../arch/riscv/kvm/vcpu_fp.c:22:49: error: ‘f’ undeclared (first use in this function) 22 | if (riscv_isa_extension_available(&isa, f) ||
Fix this by simply including the necessary header.
Fixes: 0a86512dc113 ("RISC-V: KVM: Factor-out FP virtualization into separate sources") Signed-off-by: Heiko Stuebner heiko@sntech.de Signed-off-by: Anup Patel anup@brainfault.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/kvm/vcpu_fp.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/riscv/kvm/vcpu_fp.c b/arch/riscv/kvm/vcpu_fp.c index 4449a976e5a6..d4308c512007 100644 --- a/arch/riscv/kvm/vcpu_fp.c +++ b/arch/riscv/kvm/vcpu_fp.c @@ -11,6 +11,7 @@ #include <linux/err.h> #include <linux/kvm_host.h> #include <linux/uaccess.h> +#include <asm/hwcap.h>
#ifdef CONFIG_FPU void kvm_riscv_vcpu_fp_reset(struct kvm_vcpu *vcpu)
From: Jens Axboe axboe@kernel.dk
[ Upstream commit c4212f3eb89fd5654f0a6ed2ee1d13fcb86cb664 ]
Give applications a way to tell if the kernel supports sane linked files, as in files being assigned at the right time to be able to reliably do <open file direct into slot X><read file from slot X> while using IOSQE_IO_LINK to order them.
Not really a bug fix, but flag it as such so that it gets pulled in with backports of the deferred file assignment.
Fixes: 6bf9c47a3989 ("io_uring: defer file assignment") Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/io_uring.c | 3 ++- include/uapi/linux/io_uring.h | 1 + 2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c index 1a9630dc5361..d13f142793f2 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -10566,7 +10566,8 @@ static __cold int io_uring_create(unsigned entries, struct io_uring_params *p, IORING_FEAT_CUR_PERSONALITY | IORING_FEAT_FAST_POLL | IORING_FEAT_POLL_32BITS | IORING_FEAT_SQPOLL_NONFIXED | IORING_FEAT_EXT_ARG | IORING_FEAT_NATIVE_WORKERS | - IORING_FEAT_RSRC_TAGS | IORING_FEAT_CQE_SKIP; + IORING_FEAT_RSRC_TAGS | IORING_FEAT_CQE_SKIP | + IORING_FEAT_LINKED_FILE;
if (copy_to_user(params, p, sizeof(*p))) { ret = -EFAULT; diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h index 787f491f0d2a..1e45368ad33f 100644 --- a/include/uapi/linux/io_uring.h +++ b/include/uapi/linux/io_uring.h @@ -293,6 +293,7 @@ struct io_uring_params { #define IORING_FEAT_NATIVE_WORKERS (1U << 9) #define IORING_FEAT_RSRC_TAGS (1U << 10) #define IORING_FEAT_CQE_SKIP (1U << 11) +#define IORING_FEAT_LINKED_FILE (1U << 12)
/* * io_uring_register(2) opcodes and arguments
From: Dinh Nguyen dinguyen@kernel.org
[ Upstream commit a6aaa00324240967272b451bfa772547bd576ee6 ]
When using a fixed-link, the altr_tse_pcs driver crashes due to null-pointer dereference as no phy_device is provided to tse_pcs_fix_mac_speed function. Fix this by adding a check for phy_dev before calling the tse_pcs_fix_mac_speed() function.
Also clean up the tse_pcs_fix_mac_speed function a bit. There is no need to check for splitter_base and sgmii_adapter_base because the driver will fail if these 2 variables are not derived from the device tree.
Fixes: fb3bbdb85989 ("net: ethernet: Add TSE PCS support to dwmac-socfpga") Signed-off-by: Dinh Nguyen dinguyen@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/stmicro/stmmac/altr_tse_pcs.c | 8 -------- drivers/net/ethernet/stmicro/stmmac/altr_tse_pcs.h | 4 ++++ drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c | 13 +++++-------- 3 files changed, 9 insertions(+), 16 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/altr_tse_pcs.c b/drivers/net/ethernet/stmicro/stmmac/altr_tse_pcs.c index cd478d2cd871..00f6d347eaf7 100644 --- a/drivers/net/ethernet/stmicro/stmmac/altr_tse_pcs.c +++ b/drivers/net/ethernet/stmicro/stmmac/altr_tse_pcs.c @@ -57,10 +57,6 @@ #define TSE_PCS_USE_SGMII_ENA BIT(0) #define TSE_PCS_IF_USE_SGMII 0x03
-#define SGMII_ADAPTER_CTRL_REG 0x00 -#define SGMII_ADAPTER_DISABLE 0x0001 -#define SGMII_ADAPTER_ENABLE 0x0000 - #define AUTONEGO_LINK_TIMER 20
static int tse_pcs_reset(void __iomem *base, struct tse_pcs *pcs) @@ -202,12 +198,8 @@ void tse_pcs_fix_mac_speed(struct tse_pcs *pcs, struct phy_device *phy_dev, unsigned int speed) { void __iomem *tse_pcs_base = pcs->tse_pcs_base; - void __iomem *sgmii_adapter_base = pcs->sgmii_adapter_base; u32 val;
- writew(SGMII_ADAPTER_ENABLE, - sgmii_adapter_base + SGMII_ADAPTER_CTRL_REG); - pcs->autoneg = phy_dev->autoneg;
if (phy_dev->autoneg == AUTONEG_ENABLE) { diff --git a/drivers/net/ethernet/stmicro/stmmac/altr_tse_pcs.h b/drivers/net/ethernet/stmicro/stmmac/altr_tse_pcs.h index 442812c0a4bd..694ac25ef426 100644 --- a/drivers/net/ethernet/stmicro/stmmac/altr_tse_pcs.h +++ b/drivers/net/ethernet/stmicro/stmmac/altr_tse_pcs.h @@ -10,6 +10,10 @@ #include <linux/phy.h> #include <linux/timer.h>
+#define SGMII_ADAPTER_CTRL_REG 0x00 +#define SGMII_ADAPTER_ENABLE 0x0000 +#define SGMII_ADAPTER_DISABLE 0x0001 + struct tse_pcs { struct device *dev; void __iomem *tse_pcs_base; diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c index b7c2579c963b..ac9e6c7a33b5 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-socfpga.c @@ -18,9 +18,6 @@
#include "altr_tse_pcs.h"
-#define SGMII_ADAPTER_CTRL_REG 0x00 -#define SGMII_ADAPTER_DISABLE 0x0001 - #define SYSMGR_EMACGRP_CTRL_PHYSEL_ENUM_GMII_MII 0x0 #define SYSMGR_EMACGRP_CTRL_PHYSEL_ENUM_RGMII 0x1 #define SYSMGR_EMACGRP_CTRL_PHYSEL_ENUM_RMII 0x2 @@ -62,16 +59,14 @@ static void socfpga_dwmac_fix_mac_speed(void *priv, unsigned int speed) { struct socfpga_dwmac *dwmac = (struct socfpga_dwmac *)priv; void __iomem *splitter_base = dwmac->splitter_base; - void __iomem *tse_pcs_base = dwmac->pcs.tse_pcs_base; void __iomem *sgmii_adapter_base = dwmac->pcs.sgmii_adapter_base; struct device *dev = dwmac->dev; struct net_device *ndev = dev_get_drvdata(dev); struct phy_device *phy_dev = ndev->phydev; u32 val;
- if ((tse_pcs_base) && (sgmii_adapter_base)) - writew(SGMII_ADAPTER_DISABLE, - sgmii_adapter_base + SGMII_ADAPTER_CTRL_REG); + writew(SGMII_ADAPTER_DISABLE, + sgmii_adapter_base + SGMII_ADAPTER_CTRL_REG);
if (splitter_base) { val = readl(splitter_base + EMAC_SPLITTER_CTRL_REG); @@ -93,7 +88,9 @@ static void socfpga_dwmac_fix_mac_speed(void *priv, unsigned int speed) writel(val, splitter_base + EMAC_SPLITTER_CTRL_REG); }
- if (tse_pcs_base && sgmii_adapter_base) + writew(SGMII_ADAPTER_ENABLE, + sgmii_adapter_base + SGMII_ADAPTER_CTRL_REG); + if (phy_dev) tse_pcs_fix_mac_speed(&dwmac->pcs, phy_dev, speed); }
From: Benedikt Spranger b.spranger@linutronix.de
[ Upstream commit e8a64bbaaad1f6548cec5508297bc6d45e8ab69e ]
A user may set the SO_TXTIME socket option to ensure a packet is send at a given time. The taprio scheduler has to confirm, that it is allowed to send a packet at that given time, by a check against the packet time schedule. The scheduler drop the packet, if the gates are closed at the given send time.
The check, if SO_TXTIME is set, may fail since sk_flags are part of an union and the union is used otherwise. This happen, if a socket is not a full socket, like a request socket for example.
Add a check to verify, if the union is used for sk_flags.
Fixes: 4cfd5779bd6e ("taprio: Add support for txtime-assist mode") Signed-off-by: Benedikt Spranger b.spranger@linutronix.de Reviewed-by: Kurt Kanzenbach kurt@linutronix.de Acked-by: Vinicius Costa Gomes vinicius.gomes@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/sch_taprio.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/sched/sch_taprio.c b/net/sched/sch_taprio.c index 377f896bdedc..b9c71a304d39 100644 --- a/net/sched/sch_taprio.c +++ b/net/sched/sch_taprio.c @@ -417,7 +417,8 @@ static int taprio_enqueue_one(struct sk_buff *skb, struct Qdisc *sch, { struct taprio_sched *q = qdisc_priv(sch);
- if (skb->sk && sock_flag(skb->sk, SOCK_TXTIME)) { + /* sk_flags are only safe to use on full sockets. */ + if (skb->sk && sk_fullsock(skb->sk) && sock_flag(skb->sk, SOCK_TXTIME)) { if (!is_valid_interval(skb, sch)) return qdisc_drop(skb, sch, to_free); } else if (TXTIME_ASSIST_IS_ENABLED(q->flags)) {
From: Rameshkumar Sundaram quic_ramess@quicinc.com
[ Upstream commit a5199b5626cd6913cf8776a835bc63d40e0686ad ]
Synchronize additions to nontrans_list of transmitting BSS with bss_lock to avoid races. Also when cfg80211_add_nontrans_list() fails __cfg80211_unlink_bss() needs bss_lock to be held (has lockdep assert on bss_lock). So protect the whole block with bss_lock to avoid races and warnings. Found during code review.
Fixes: 0b8fb8235be8 ("cfg80211: Parsing of Multiple BSSID information in scanning") Signed-off-by: Rameshkumar Sundaram quic_ramess@quicinc.com Link: https://lore.kernel.org/r/1649668071-9370-1-git-send-email-quic_ramess@quici... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/wireless/scan.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/net/wireless/scan.c b/net/wireless/scan.c index b2fdac96bab0..4a6d86432910 100644 --- a/net/wireless/scan.c +++ b/net/wireless/scan.c @@ -2018,11 +2018,13 @@ cfg80211_inform_single_bss_data(struct wiphy *wiphy, /* this is a nontransmitting bss, we need to add it to * transmitting bss' list if it is not there */ + spin_lock_bh(&rdev->bss_lock); if (cfg80211_add_nontrans_list(non_tx_data->tx_bss, &res->pub)) { if (__cfg80211_unlink_bss(rdev, res)) rdev->bss_generation++; } + spin_unlock_bh(&rdev->bss_lock); }
trace_cfg80211_return_bss(&res->pub);
From: Ben Greear greearb@candelatech.com
[ Upstream commit fb4bccd863ccccd36ad000601856609e259a1859 ]
Don't use sizeof(pointer) when calculating scnprintf offset.
Fixes: 01f84f0ed3b4 ("mac80211: reduce stack usage in debugfs") Signed-off-by: Ben Greear greearb@candelatech.com Link: https://lore.kernel.org/r/20220406175659.20611-1-greearb@candelatech.com [correct the Fixes tag] Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/mac80211/debugfs_sta.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/mac80211/debugfs_sta.c b/net/mac80211/debugfs_sta.c index 9479f2787ea7..88d9cc945a21 100644 --- a/net/mac80211/debugfs_sta.c +++ b/net/mac80211/debugfs_sta.c @@ -441,7 +441,7 @@ static ssize_t sta_ht_capa_read(struct file *file, char __user *userbuf, #define PRINT_HT_CAP(_cond, _str) \ do { \ if (_cond) \ - p += scnprintf(p, sizeof(buf)+buf-p, "\t" _str "\n"); \ + p += scnprintf(p, bufsz + buf - p, "\t" _str "\n"); \ } while (0) char *buf, *p; int i;
From: Florian Westphal fw@strlen.de
[ Upstream commit 05ae2fba821c4d122ab4ba3e52144e21586c4010 ]
cgroupv2 helper function ignores the already-looked up sk and uses skb->sk instead.
Just pass sk from the calling function instead; this will make cgroup matching work for udp and tcp in input even when edemux did not set skb->sk already.
Fixes: e0bb96db96f8 ("netfilter: nft_socket: add support for cgroupsv2") Signed-off-by: Florian Westphal fw@strlen.de Tested-by: Topi Miettinen toiwoton@gmail.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nft_socket.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/net/netfilter/nft_socket.c b/net/netfilter/nft_socket.c index d601974c9d2e..b8f011145765 100644 --- a/net/netfilter/nft_socket.c +++ b/net/netfilter/nft_socket.c @@ -36,12 +36,11 @@ static void nft_socket_wildcard(const struct nft_pktinfo *pkt,
#ifdef CONFIG_SOCK_CGROUP_DATA static noinline bool -nft_sock_get_eval_cgroupv2(u32 *dest, const struct nft_pktinfo *pkt, u32 level) +nft_sock_get_eval_cgroupv2(u32 *dest, struct sock *sk, const struct nft_pktinfo *pkt, u32 level) { - struct sock *sk = skb_to_full_sk(pkt->skb); struct cgroup *cgrp;
- if (!sk || !sk_fullsock(sk) || !net_eq(nft_net(pkt), sock_net(sk))) + if (!sk_fullsock(sk)) return false;
cgrp = sock_cgroup_ptr(&sk->sk_cgrp_data); @@ -108,7 +107,7 @@ static void nft_socket_eval(const struct nft_expr *expr, break; #ifdef CONFIG_SOCK_CGROUP_DATA case NFT_SOCKET_CGROUPV2: - if (!nft_sock_get_eval_cgroupv2(dest, pkt, priv->level)) { + if (!nft_sock_get_eval_cgroupv2(dest, sk, pkt, priv->level)) { regs->verdict.code = NFT_BREAK; return; }
From: Rob Clark robdclark@chromium.org
[ Upstream commit 537fef808be5ea56f6fc06932162550819a3b3c3 ]
The fourth param is size, rather than range_end.
Note that we could increase the address space size if we had a way to prevent buffers from spanning a 4G split, mostly just to avoid fw bugs with 64b math.
Fixes: 84c31ee16f90 ("drm/msm/a6xx: Add support for per-instance pagetables") Signed-off-by: Rob Clark robdclark@chromium.org Link: https://lore.kernel.org/r/20220407202836.1211268-1-robdclark@gmail.com Signed-off-by: Rob Clark robdclark@chromium.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/msm/adreno/a6xx_gpu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c index 616be7265da4..19622fb1fa35 100644 --- a/drivers/gpu/drm/msm/adreno/a6xx_gpu.c +++ b/drivers/gpu/drm/msm/adreno/a6xx_gpu.c @@ -1714,7 +1714,7 @@ a6xx_create_private_address_space(struct msm_gpu *gpu) return ERR_CAST(mmu);
return msm_gem_address_space_create(mmu, - "gpu", 0x100000000ULL, 0x1ffffffffULL); + "gpu", 0x100000000ULL, SZ_4G); }
static uint32_t a6xx_get_rptr(struct msm_gpu *gpu, struct msm_ringbuffer *ring)
From: Stephen Boyd swboyd@chromium.org
[ Upstream commit 47b7de6b88b962ef339a2427a023d2a23d161654 ]
The member 'msm_dsi->connector' isn't assigned until msm_dsi_manager_connector_init() returns (see msm_dsi_modeset_init() and how it assigns the return value). Therefore this pointer is going to be NULL here. Let's use 'connector' which is what was intended.
Cc: Dmitry Baryshkov dmitry.baryshkov@linaro.org Cc: Sean Paul seanpaul@chromium.org Fixes: 6d5e78406991 ("drm/msm/dsi: Move dsi panel init into modeset init path") Signed-off-by: Stephen Boyd swboyd@chromium.org Reviewed-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Patchwork: https://patchwork.freedesktop.org/patch/478693/ Link: https://lore.kernel.org/r/20220318000731.2823718-1-swboyd@chromium.org Signed-off-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Signed-off-by: Rob Clark robdclark@chromium.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/msm/dsi/dsi_manager.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/msm/dsi/dsi_manager.c b/drivers/gpu/drm/msm/dsi/dsi_manager.c index f19bae475c96..cd7b41b7d518 100644 --- a/drivers/gpu/drm/msm/dsi/dsi_manager.c +++ b/drivers/gpu/drm/msm/dsi/dsi_manager.c @@ -641,7 +641,7 @@ struct drm_connector *msm_dsi_manager_connector_init(u8 id) return connector;
fail: - connector->funcs->destroy(msm_dsi->connector); + connector->funcs->destroy(connector); return ERR_PTR(ret); }
From: Kuogee Hsieh quic_khsieh@quicinc.com
[ Upstream commit 8b2c181e3dcf7562445af6702ee94aaedcbe13c8 ]
There is possible circular locking dependency detected on event_mutex (see below logs). This is due to set fail safe mode is done at dp_panel_read_sink_caps() within event_mutex scope. To break this possible circular locking, this patch move setting fail safe mode out of event_mutex scope.
[ 23.958078] ====================================================== [ 23.964430] WARNING: possible circular locking dependency detected [ 23.970777] 5.17.0-rc2-lockdep-00088-g05241de1f69e #148 Not tainted [ 23.977219] ------------------------------------------------------ [ 23.983570] DrmThread/1574 is trying to acquire lock: [ 23.988763] ffffff808423aab0 (&dp->event_mutex){+.+.}-{3:3}, at: msm_dp_displ ay_enable+0x58/0x164 [ 23.997895] [ 23.997895] but task is already holding lock: [ 24.003895] ffffff808420b280 (&kms->commit_lock[i]/1){+.+.}-{3:3}, at: lock_c rtcs+0x80/0x8c [ 24.012495] [ 24.012495] which lock already depends on the new lock. [ 24.012495] [ 24.020886] [ 24.020886] the existing dependency chain (in reverse order) is: [ 24.028570] [ 24.028570] -> #5 (&kms->commit_lock[i]/1){+.+.}-{3:3}: [ 24.035472] __mutex_lock+0xc8/0x384 [ 24.039695] mutex_lock_nested+0x54/0x74 [ 24.044272] lock_crtcs+0x80/0x8c [ 24.048222] msm_atomic_commit_tail+0x1e8/0x3d0 [ 24.053413] commit_tail+0x7c/0xfc [ 24.057452] drm_atomic_helper_commit+0x158/0x15c [ 24.062826] drm_atomic_commit+0x60/0x74 [ 24.067403] drm_mode_atomic_ioctl+0x6b0/0x908 [ 24.072508] drm_ioctl_kernel+0xe8/0x168 [ 24.077086] drm_ioctl+0x320/0x370 [ 24.081123] drm_compat_ioctl+0x40/0xdc [ 24.085602] __arm64_compat_sys_ioctl+0xe0/0x150 [ 24.090895] invoke_syscall+0x80/0x114 [ 24.095294] el0_svc_common.constprop.3+0xc4/0xf8 [ 24.100668] do_el0_svc_compat+0x2c/0x54 [ 24.105242] el0_svc_compat+0x4c/0xe4 [ 24.109548] el0t_32_sync_handler+0xc4/0xf4 [ 24.114381] el0t_32_sync+0x178 [ 24.118688] [ 24.118688] -> #4 (&kms->commit_lock[i]){+.+.}-{3:3}: [ 24.125408] __mutex_lock+0xc8/0x384 [ 24.129628] mutex_lock_nested+0x54/0x74 [ 24.134204] lock_crtcs+0x80/0x8c [ 24.138155] msm_atomic_commit_tail+0x1e8/0x3d0 [ 24.143345] commit_tail+0x7c/0xfc [ 24.147382] drm_atomic_helper_commit+0x158/0x15c [ 24.152755] drm_atomic_commit+0x60/0x74 [ 24.157323] drm_atomic_helper_set_config+0x68/0x90 [ 24.162869] drm_mode_setcrtc+0x394/0x648 [ 24.167535] drm_ioctl_kernel+0xe8/0x168 [ 24.172102] drm_ioctl+0x320/0x370 [ 24.176135] drm_compat_ioctl+0x40/0xdc [ 24.180621] __arm64_compat_sys_ioctl+0xe0/0x150 [ 24.185904] invoke_syscall+0x80/0x114 [ 24.190302] el0_svc_common.constprop.3+0xc4/0xf8 [ 24.195673] do_el0_svc_compat+0x2c/0x54 [ 24.200241] el0_svc_compat+0x4c/0xe4 [ 24.204544] el0t_32_sync_handler+0xc4/0xf4 [ 24.209378] el0t_32_sync+0x174/0x178 [ 24.213680] -> #3 (crtc_ww_class_mutex){+.+.}-{3:3}: [ 24.220308] __ww_mutex_lock.constprop.20+0xe8/0x878 [ 24.225951] ww_mutex_lock+0x60/0xd0 [ 24.230166] modeset_lock+0x190/0x19c [ 24.234467] drm_modeset_lock+0x34/0x54 [ 24.238953] drmm_mode_config_init+0x550/0x764 [ 24.244065] msm_drm_bind+0x170/0x59c [ 24.248374] try_to_bring_up_master+0x244/0x294 [ 24.253572] __component_add+0xf4/0x14c [ 24.258057] component_add+0x2c/0x38 [ 24.262273] dsi_dev_attach+0x2c/0x38 [ 24.266575] dsi_host_attach+0xc4/0x120 [ 24.271060] mipi_dsi_attach+0x34/0x48 [ 24.275456] devm_mipi_dsi_attach+0x28/0x68 [ 24.280298] ti_sn_bridge_probe+0x2b4/0x2dc [ 24.285137] auxiliary_bus_probe+0x78/0x90 [ 24.289893] really_probe+0x1e4/0x3d8 [ 24.294194] __driver_probe_device+0x14c/0x164 [ 24.299298] driver_probe_device+0x54/0xf8 [ 24.304043] __device_attach_driver+0xb4/0x118 [ 24.309145] bus_for_each_drv+0xb0/0xd4 [ 24.313628] __device_attach+0xcc/0x158 [ 24.318112] device_initial_probe+0x24/0x30 [ 24.322954] bus_probe_device+0x38/0x9c [ 24.327439] deferred_probe_work_func+0xd4/0xf0 [ 24.332628] process_one_work+0x2f0/0x498 [ 24.337289] process_scheduled_works+0x44/0x48 [ 24.342391] worker_thread+0x1e4/0x26c [ 24.346788] kthread+0xe4/0xf4 [ 24.350470] ret_from_fork+0x10/0x20 [ 24.354683] [ 24.354683] [ 24.354683] -> #2 (crtc_ww_class_acquire){+.+.}-{0:0}: [ 24.361489] drm_modeset_acquire_init+0xe4/0x138 [ 24.366777] drm_helper_probe_detect_ctx+0x44/0x114 [ 24.372327] check_connector_changed+0xbc/0x198 [ 24.377517] drm_helper_hpd_irq_event+0xcc/0x11c [ 24.382804] dsi_hpd_worker+0x24/0x30 [ 24.387104] process_one_work+0x2f0/0x498 [ 24.391762] worker_thread+0x1d0/0x26c [ 24.396158] kthread+0xe4/0xf4 [ 24.399840] ret_from_fork+0x10/0x20 [ 24.404053] [ 24.404053] -> #1 (&dev->mode_config.mutex){+.+.}-{3:3}: [ 24.411032] __mutex_lock+0xc8/0x384 [ 24.415247] mutex_lock_nested+0x54/0x74 [ 24.419819] dp_panel_read_sink_caps+0x23c/0x26c [ 24.425108] dp_display_process_hpd_high+0x34/0xd4 [ 24.430570] dp_display_usbpd_configure_cb+0x30/0x3c [ 24.436205] hpd_event_thread+0x2ac/0x550 [ 24.440864] kthread+0xe4/0xf4 [ 24.444544] ret_from_fork+0x10/0x20 [ 24.448757] [ 24.448757] -> #0 (&dp->event_mutex){+.+.}-{3:3}: [ 24.455116] __lock_acquire+0xe2c/0x10d8 [ 24.459690] lock_acquire+0x1ac/0x2d0 [ 24.463988] __mutex_lock+0xc8/0x384 [ 24.468201] mutex_lock_nested+0x54/0x74 [ 24.472773] msm_dp_display_enable+0x58/0x164 [ 24.477789] dp_bridge_enable+0x24/0x30 [ 24.482273] drm_atomic_bridge_chain_enable+0x78/0x9c [ 24.488006] drm_atomic_helper_commit_modeset_enables+0x1bc/0x244 [ 24.494801] msm_atomic_commit_tail+0x248/0x3d0 [ 24.499992] commit_tail+0x7c/0xfc [ 24.504031] drm_atomic_helper_commit+0x158/0x15c [ 24.509404] drm_atomic_commit+0x60/0x74 [ 24.513976] drm_mode_atomic_ioctl+0x6b0/0x908 [ 24.519079] drm_ioctl_kernel+0xe8/0x168 [ 24.523650] drm_ioctl+0x320/0x370 [ 24.527689] drm_compat_ioctl+0x40/0xdc [ 24.532175] __arm64_compat_sys_ioctl+0xe0/0x150 [ 24.537463] invoke_syscall+0x80/0x114 [ 24.541861] el0_svc_common.constprop.3+0xc4/0xf8 [ 24.547235] do_el0_svc_compat+0x2c/0x54 [ 24.551806] el0_svc_compat+0x4c/0xe4 [ 24.556106] el0t_32_sync_handler+0xc4/0xf4 [ 24.560948] el0t_32_sync+0x174/0x178
Changes in v2: -- add circular lockiing trace
Fixes: d4aca422539c ("drm/msm/dp: always add fail-safe mode into connector mode list") Signed-off-by: Kuogee Hsieh quic_khsieh@quicinc.com Reviewed-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Patchwork: https://patchwork.freedesktop.org/patch/481396/ Link: https://lore.kernel.org/r/1649451894-554-1-git-send-email-quic_khsieh@quicin... Signed-off-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Signed-off-by: Rob Clark robdclark@chromium.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/msm/dp/dp_display.c | 6 ++++++ drivers/gpu/drm/msm/dp/dp_panel.c | 20 ++++++++++---------- drivers/gpu/drm/msm/dp/dp_panel.h | 1 + 3 files changed, 17 insertions(+), 10 deletions(-)
diff --git a/drivers/gpu/drm/msm/dp/dp_display.c b/drivers/gpu/drm/msm/dp/dp_display.c index 1d7f82e6eafe..af9c09c30860 100644 --- a/drivers/gpu/drm/msm/dp/dp_display.c +++ b/drivers/gpu/drm/msm/dp/dp_display.c @@ -551,6 +551,12 @@ static int dp_hpd_plug_handle(struct dp_display_private *dp, u32 data)
mutex_unlock(&dp->event_mutex);
+ /* + * add fail safe mode outside event_mutex scope + * to avoid potiential circular lock with drm thread + */ + dp_panel_add_fail_safe_mode(dp->dp_display.connector); + /* uevent will complete connection part */ return 0; }; diff --git a/drivers/gpu/drm/msm/dp/dp_panel.c b/drivers/gpu/drm/msm/dp/dp_panel.c index f1418722c549..26c3653c99ec 100644 --- a/drivers/gpu/drm/msm/dp/dp_panel.c +++ b/drivers/gpu/drm/msm/dp/dp_panel.c @@ -151,6 +151,15 @@ static int dp_panel_update_modes(struct drm_connector *connector, return rc; }
+void dp_panel_add_fail_safe_mode(struct drm_connector *connector) +{ + /* fail safe edid */ + mutex_lock(&connector->dev->mode_config.mutex); + if (drm_add_modes_noedid(connector, 640, 480)) + drm_set_preferred_mode(connector, 640, 480); + mutex_unlock(&connector->dev->mode_config.mutex); +} + int dp_panel_read_sink_caps(struct dp_panel *dp_panel, struct drm_connector *connector) { @@ -207,16 +216,7 @@ int dp_panel_read_sink_caps(struct dp_panel *dp_panel, goto end; }
- /* fail safe edid */ - mutex_lock(&connector->dev->mode_config.mutex); - if (drm_add_modes_noedid(connector, 640, 480)) - drm_set_preferred_mode(connector, 640, 480); - mutex_unlock(&connector->dev->mode_config.mutex); - } else { - /* always add fail-safe mode as backup mode */ - mutex_lock(&connector->dev->mode_config.mutex); - drm_add_modes_noedid(connector, 640, 480); - mutex_unlock(&connector->dev->mode_config.mutex); + dp_panel_add_fail_safe_mode(connector); }
if (panel->aux_cfg_update_done) { diff --git a/drivers/gpu/drm/msm/dp/dp_panel.h b/drivers/gpu/drm/msm/dp/dp_panel.h index 9023e5bb4b8b..99739ea679a7 100644 --- a/drivers/gpu/drm/msm/dp/dp_panel.h +++ b/drivers/gpu/drm/msm/dp/dp_panel.h @@ -59,6 +59,7 @@ int dp_panel_init_panel_info(struct dp_panel *dp_panel); int dp_panel_deinit(struct dp_panel *dp_panel); int dp_panel_timing_cfg(struct dp_panel *dp_panel); void dp_panel_dump_regs(struct dp_panel *dp_panel); +void dp_panel_add_fail_safe_mode(struct drm_connector *connector); int dp_panel_read_sink_caps(struct dp_panel *dp_panel, struct drm_connector *connector); u32 dp_panel_get_mode_bpp(struct dp_panel *dp_panel, u32 mode_max_bpp,
From: Jens Axboe axboe@kernel.dk
[ Upstream commit 82733d168cbd3fe9dab603f05894316b99008924 ]
There are two reasons why this isn't the best idea:
- It's an odd area to grab a bit of storage space, hence it's an odd area to grab storage from. - It puts the 3rd io_kiocb cacheline into the hot path, where normal hot path just needs the first two.
Use 'cflags' for joint fd/cflags storage. We only need fd until we successfully issue, and we only need cflags once a request is done and is completed.
Fixes: 6bf9c47a3989 ("io_uring: defer file assignment") Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/io-wq.h | 1 - fs/io_uring.c | 12 ++++++++---- 2 files changed, 8 insertions(+), 5 deletions(-)
diff --git a/fs/io-wq.h b/fs/io-wq.h index 04d374e65e54..dbecd27656c7 100644 --- a/fs/io-wq.h +++ b/fs/io-wq.h @@ -155,7 +155,6 @@ struct io_wq_work_node *wq_stack_extract(struct io_wq_work_node *stack) struct io_wq_work { struct io_wq_work_node list; unsigned flags; - int fd; };
static inline struct io_wq_work *wq_next_work(struct io_wq_work *work) diff --git a/fs/io_uring.c b/fs/io_uring.c index d13f142793f2..d05394b0c1e6 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -864,7 +864,11 @@ struct io_kiocb {
u64 user_data; u32 result; - u32 cflags; + /* fd initially, then cflags for completion */ + union { + u32 cflags; + int fd; + };
struct io_ring_ctx *ctx; struct task_struct *task; @@ -6708,9 +6712,9 @@ static bool io_assign_file(struct io_kiocb *req, unsigned int issue_flags) return true;
if (req->flags & REQ_F_FIXED_FILE) - req->file = io_file_get_fixed(req, req->work.fd, issue_flags); + req->file = io_file_get_fixed(req, req->fd, issue_flags); else - req->file = io_file_get_normal(req, req->work.fd); + req->file = io_file_get_normal(req, req->fd); if (req->file) return true;
@@ -7243,7 +7247,7 @@ static int io_init_req(struct io_ring_ctx *ctx, struct io_kiocb *req, if (io_op_defs[opcode].needs_file) { struct io_submit_state *state = &ctx->submit_state;
- req->work.fd = READ_ONCE(sqe->fd); + req->fd = READ_ONCE(sqe->fd);
/* * Plug now if we have more than 2 IO left after this, and the
From: Karsten Graul kgraul@linux.ibm.com
[ Upstream commit b1871fd48efc567650dbdc974e5a2342a03fe0d2 ]
Using snprintf() to convert not null-terminated strings to null terminated strings may cause out of bounds read in the source string. Therefore use memcpy() and terminate the target string with a null afterwards.
Fixes: fa0866625543 ("net/smc: add support for user defined EIDs") Fixes: 3c572145c24e ("net/smc: add generic netlink support for system EID") Signed-off-by: Karsten Graul kgraul@linux.ibm.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/smc/smc_clc.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/net/smc/smc_clc.c b/net/smc/smc_clc.c index ce27399b38b1..f9f3f59c79de 100644 --- a/net/smc/smc_clc.c +++ b/net/smc/smc_clc.c @@ -191,7 +191,8 @@ static int smc_nl_ueid_dumpinfo(struct sk_buff *skb, u32 portid, u32 seq, flags, SMC_NETLINK_DUMP_UEID); if (!hdr) return -ENOMEM; - snprintf(ueid_str, sizeof(ueid_str), "%s", ueid); + memcpy(ueid_str, ueid, SMC_MAX_EID_LEN); + ueid_str[SMC_MAX_EID_LEN] = 0; if (nla_put_string(skb, SMC_NLA_EID_TABLE_ENTRY, ueid_str)) { genlmsg_cancel(skb, hdr); return -EMSGSIZE; @@ -252,7 +253,8 @@ int smc_nl_dump_seid(struct sk_buff *skb, struct netlink_callback *cb) goto end;
smc_ism_get_system_eid(&seid); - snprintf(seid_str, sizeof(seid_str), "%s", seid); + memcpy(seid_str, seid, SMC_MAX_EID_LEN); + seid_str[SMC_MAX_EID_LEN] = 0; if (nla_put_string(skb, SMC_NLA_SEID_ENTRY, seid_str)) goto err; read_lock(&smc_clc_eid_table.lock);
From: Karsten Graul kgraul@linux.ibm.com
[ Upstream commit d22f4f977236f97e01255a80bca2ea93a8094fc8 ]
dev_name() was called with dev.parent as argument but without to NULL-check it before. Solve this by checking the pointer before the call to dev_name().
Fixes: af5f60c7e3d5 ("net/smc: allow PCI IDs as ib device names in the pnet table") Reported-by: syzbot+03e3e228510223dabd34@syzkaller.appspotmail.com Signed-off-by: Karsten Graul kgraul@linux.ibm.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/smc/smc_pnet.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/net/smc/smc_pnet.c b/net/smc/smc_pnet.c index 29f0a559d884..4769f76505af 100644 --- a/net/smc/smc_pnet.c +++ b/net/smc/smc_pnet.c @@ -311,8 +311,9 @@ static struct smc_ib_device *smc_pnet_find_ib(char *ib_name) list_for_each_entry(ibdev, &smc_ib_devices.list, list) { if (!strncmp(ibdev->ibdev->name, ib_name, sizeof(ibdev->ibdev->name)) || - !strncmp(dev_name(ibdev->ibdev->dev.parent), ib_name, - IB_DEVICE_NAME_MAX - 1)) { + (ibdev->ibdev->dev.parent && + !strncmp(dev_name(ibdev->ibdev->dev.parent), ib_name, + IB_DEVICE_NAME_MAX - 1))) { goto out; } }
From: Ajish Koshy Ajish.Koshy@microchip.com
[ Upstream commit 294080eacf92a0781e6d43663448a55001ec8c64 ]
When upper inbound and outbound queues 32-63 are enabled, we see upper vectors 32-63 in interrupt service routine. We need corresponding registers to handle masking and unmasking of these upper interrupts.
To achieve this, we use registers MSGU_ODMR_U(0x34) to mask and MSGU_ODMR_CLR_U(0x3C) to unmask the interrupts. In these registers bit 0-31 represents interrupt vectors 32-63.
Link: https://lore.kernel.org/r/20220411064603.668448-2-Ajish.Koshy@microchip.com Fixes: 05c6c029a44d ("scsi: pm80xx: Increase number of supported queues") Reviewed-by: John Garry john.garry@huawei.com Acked-by: Jack Wang jinpu.wang@ionos.com Signed-off-by: Ajish Koshy Ajish.Koshy@microchip.com Signed-off-by: Viswas G Viswas.G@microchip.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/pm8001/pm80xx_hwi.c | 22 +++++++++++++--------- 1 file changed, 13 insertions(+), 9 deletions(-)
diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c index 55163469030d..2dea48933ef9 100644 --- a/drivers/scsi/pm8001/pm80xx_hwi.c +++ b/drivers/scsi/pm8001/pm80xx_hwi.c @@ -1734,10 +1734,11 @@ static void pm80xx_chip_interrupt_enable(struct pm8001_hba_info *pm8001_ha, u8 vec) { #ifdef PM8001_USE_MSIX - u32 mask; - mask = (u32)(1 << vec); - - pm8001_cw32(pm8001_ha, 0, MSGU_ODMR_CLR, (u32)(mask & 0xFFFFFFFF)); + if (vec < 32) + pm8001_cw32(pm8001_ha, 0, MSGU_ODMR_CLR, 1U << vec); + else + pm8001_cw32(pm8001_ha, 0, MSGU_ODMR_CLR_U, + 1U << (vec - 32)); return; #endif pm80xx_chip_intx_interrupt_enable(pm8001_ha); @@ -1753,12 +1754,15 @@ static void pm80xx_chip_interrupt_disable(struct pm8001_hba_info *pm8001_ha, u8 vec) { #ifdef PM8001_USE_MSIX - u32 mask; - if (vec == 0xFF) - mask = 0xFFFFFFFF; + if (vec == 0xFF) { + /* disable all vectors 0-31, 32-63 */ + pm8001_cw32(pm8001_ha, 0, MSGU_ODMR, 0xFFFFFFFF); + pm8001_cw32(pm8001_ha, 0, MSGU_ODMR_U, 0xFFFFFFFF); + } else if (vec < 32) + pm8001_cw32(pm8001_ha, 0, MSGU_ODMR, 1U << vec); else - mask = (u32)(1 << vec); - pm8001_cw32(pm8001_ha, 0, MSGU_ODMR, (u32)(mask & 0xFFFFFFFF)); + pm8001_cw32(pm8001_ha, 0, MSGU_ODMR_U, + 1U << (vec - 32)); return; #endif pm80xx_chip_intx_interrupt_disable(pm8001_ha);
From: Ajish Koshy Ajish.Koshy@microchip.com
[ Upstream commit bcd8a45223470e00b5f254018174d64a75db4bbe ]
Executing driver on servers with more than 32 CPUs were faced with command timeouts. This is because we were not geting completions for commands submitted on IQ32 - IQ63.
Set E64Q bit to enable upper inbound and outbound queues 32 to 63 in the MPI main configuration table.
Added 500ms delay after successful MPI initialization as mentioned in controller datasheet.
Link: https://lore.kernel.org/r/20220411064603.668448-3-Ajish.Koshy@microchip.com Fixes: 05c6c029a44d ("scsi: pm80xx: Increase number of supported queues") Reviewed-by: Damien Le Moal damien.lemoal@opensource.wdc.com Acked-by: Jack Wang jinpu.wang@ionos.com Signed-off-by: Ajish Koshy Ajish.Koshy@microchip.com Signed-off-by: Viswas G Viswas.G@microchip.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/pm8001/pm80xx_hwi.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
diff --git a/drivers/scsi/pm8001/pm80xx_hwi.c b/drivers/scsi/pm8001/pm80xx_hwi.c index 2dea48933ef9..5853b3c0d76d 100644 --- a/drivers/scsi/pm8001/pm80xx_hwi.c +++ b/drivers/scsi/pm8001/pm80xx_hwi.c @@ -766,6 +766,10 @@ static void init_default_table_values(struct pm8001_hba_info *pm8001_ha) pm8001_ha->main_cfg_tbl.pm80xx_tbl.pcs_event_log_severity = 0x01; pm8001_ha->main_cfg_tbl.pm80xx_tbl.fatal_err_interrupt = 0x01;
+ /* Enable higher IQs and OQs, 32 to 63, bit 16 */ + if (pm8001_ha->max_q_num > 32) + pm8001_ha->main_cfg_tbl.pm80xx_tbl.fatal_err_interrupt |= + 1 << 16; /* Disable end to end CRC checking */ pm8001_ha->main_cfg_tbl.pm80xx_tbl.crc_core_dump = (0x1 << 16);
@@ -1027,6 +1031,13 @@ static int mpi_init_check(struct pm8001_hba_info *pm8001_ha) if (0x0000 != gst_len_mpistate) return -EBUSY;
+ /* + * As per controller datasheet, after successful MPI + * initialization minimum 500ms delay is required before + * issuing commands. + */ + msleep(500); + return 0; }
From: Mike Christie michael.christie@oracle.com
[ Upstream commit c34f95e98d8fb750eefd4f3fe58b4f8b5e89253b ]
This patch moves iscsi_ep_disconnect() so it can be called earlier in the next patch.
Link: https://lore.kernel.org/r/20220408001314.5014-2-michael.christie@oracle.com Tested-by: Manish Rangankar mrangankar@marvell.com Reviewed-by: Lee Duncan lduncan@suse.com Reviewed-by: Chris Leech cleech@redhat.com Signed-off-by: Mike Christie michael.christie@oracle.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/scsi_transport_iscsi.c | 38 ++++++++++++++--------------- 1 file changed, 19 insertions(+), 19 deletions(-)
diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c index 554b6f784223..126f6f23bffa 100644 --- a/drivers/scsi/scsi_transport_iscsi.c +++ b/drivers/scsi/scsi_transport_iscsi.c @@ -2236,6 +2236,25 @@ static void iscsi_stop_conn(struct iscsi_cls_conn *conn, int flag) ISCSI_DBG_TRANS_CONN(conn, "Stopping conn done.\n"); }
+static void iscsi_ep_disconnect(struct iscsi_cls_conn *conn, bool is_active) +{ + struct iscsi_cls_session *session = iscsi_conn_to_session(conn); + struct iscsi_endpoint *ep; + + ISCSI_DBG_TRANS_CONN(conn, "disconnect ep.\n"); + conn->state = ISCSI_CONN_FAILED; + + if (!conn->ep || !session->transport->ep_disconnect) + return; + + ep = conn->ep; + conn->ep = NULL; + + session->transport->unbind_conn(conn, is_active); + session->transport->ep_disconnect(ep); + ISCSI_DBG_TRANS_CONN(conn, "disconnect ep done.\n"); +} + static int iscsi_if_stop_conn(struct iscsi_transport *transport, struct iscsi_uevent *ev) { @@ -2276,25 +2295,6 @@ static int iscsi_if_stop_conn(struct iscsi_transport *transport, return 0; }
-static void iscsi_ep_disconnect(struct iscsi_cls_conn *conn, bool is_active) -{ - struct iscsi_cls_session *session = iscsi_conn_to_session(conn); - struct iscsi_endpoint *ep; - - ISCSI_DBG_TRANS_CONN(conn, "disconnect ep.\n"); - conn->state = ISCSI_CONN_FAILED; - - if (!conn->ep || !session->transport->ep_disconnect) - return; - - ep = conn->ep; - conn->ep = NULL; - - session->transport->unbind_conn(conn, is_active); - session->transport->ep_disconnect(ep); - ISCSI_DBG_TRANS_CONN(conn, "disconnect ep done.\n"); -} - static void iscsi_cleanup_conn_work_fn(struct work_struct *work) { struct iscsi_cls_conn *conn = container_of(work, struct iscsi_cls_conn,
From: Mike Christie michael.christie@oracle.com
[ Upstream commit cbd2283aaf47fef4ded4b29124b1ef3beb515f3a ]
When userspace restarts during boot or upgrades it won't know about the offload driver's endpoint and connection mappings. iscsid will start by cleaning up the old session by doing a stop_conn call. Later, if we are able to create a new connection, we clean up the old endpoint during the binding stage. The problem is that if we do stop_conn before doing the ep_disconnect call offload, drivers can still be executing I/O. We then might free tasks from the under the card/driver.
This moves the ep_disconnect call to before we do the stop_conn call for this case. It will then work and look like a normal recovery/cleanup procedure from the driver's point of view.
Link: https://lore.kernel.org/r/20220408001314.5014-3-michael.christie@oracle.com Tested-by: Manish Rangankar mrangankar@marvell.com Reviewed-by: Lee Duncan lduncan@suse.com Reviewed-by: Chris Leech cleech@redhat.com Signed-off-by: Mike Christie michael.christie@oracle.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/scsi_transport_iscsi.c | 48 +++++++++++++++++------------ 1 file changed, 28 insertions(+), 20 deletions(-)
diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c index 126f6f23bffa..03cda2da80ef 100644 --- a/drivers/scsi/scsi_transport_iscsi.c +++ b/drivers/scsi/scsi_transport_iscsi.c @@ -2255,6 +2255,23 @@ static void iscsi_ep_disconnect(struct iscsi_cls_conn *conn, bool is_active) ISCSI_DBG_TRANS_CONN(conn, "disconnect ep done.\n"); }
+static void iscsi_if_disconnect_bound_ep(struct iscsi_cls_conn *conn, + struct iscsi_endpoint *ep, + bool is_active) +{ + /* Check if this was a conn error and the kernel took ownership */ + if (!test_bit(ISCSI_CLS_CONN_BIT_CLEANUP, &conn->flags)) { + iscsi_ep_disconnect(conn, is_active); + } else { + ISCSI_DBG_TRANS_CONN(conn, "flush kernel conn cleanup.\n"); + mutex_unlock(&conn->ep_mutex); + + flush_work(&conn->cleanup_work); + + mutex_lock(&conn->ep_mutex); + } +} + static int iscsi_if_stop_conn(struct iscsi_transport *transport, struct iscsi_uevent *ev) { @@ -2275,6 +2292,16 @@ static int iscsi_if_stop_conn(struct iscsi_transport *transport, cancel_work_sync(&conn->cleanup_work); iscsi_stop_conn(conn, flag); } else { + /* + * For offload, when iscsid is restarted it won't know about + * existing endpoints so it can't do a ep_disconnect. We clean + * it up here for userspace. + */ + mutex_lock(&conn->ep_mutex); + if (conn->ep) + iscsi_if_disconnect_bound_ep(conn, conn->ep, true); + mutex_unlock(&conn->ep_mutex); + /* * Figure out if it was the kernel or userspace initiating this. */ @@ -3003,16 +3030,7 @@ static int iscsi_if_ep_disconnect(struct iscsi_transport *transport, }
mutex_lock(&conn->ep_mutex); - /* Check if this was a conn error and the kernel took ownership */ - if (test_bit(ISCSI_CLS_CONN_BIT_CLEANUP, &conn->flags)) { - ISCSI_DBG_TRANS_CONN(conn, "flush kernel conn cleanup.\n"); - mutex_unlock(&conn->ep_mutex); - - flush_work(&conn->cleanup_work); - goto put_ep; - } - - iscsi_ep_disconnect(conn, false); + iscsi_if_disconnect_bound_ep(conn, ep, false); mutex_unlock(&conn->ep_mutex); put_ep: iscsi_put_endpoint(ep); @@ -3723,16 +3741,6 @@ static int iscsi_if_transport_conn(struct iscsi_transport *transport,
switch (nlh->nlmsg_type) { case ISCSI_UEVENT_BIND_CONN: - if (conn->ep) { - /* - * For offload boot support where iscsid is restarted - * during the pivot root stage, the ep will be intact - * here when the new iscsid instance starts up and - * reconnects. - */ - iscsi_ep_disconnect(conn, true); - } - session = iscsi_session_lookup(ev->u.b_conn.sid); if (!session) { err = -EINVAL;
From: Mike Christie michael.christie@oracle.com
[ Upstream commit 0aadafb5c34403a7cced1a8d61877048dc059f70 ]
This patch fixes a bug where when using iSCSI offload we can free an endpoint while userspace still thinks it's active. That then causes the endpoint ID to be reused for a new connection's endpoint while userspace still thinks the ID is for the original connection. Userspace will then end up disconnecting a running connection's endpoint or trying to bind to another connection's endpoint.
This bug is a regression added in:
Commit 23d6fefbb3f6 ("scsi: iscsi: Fix in-kernel conn failure handling")
where we added a in kernel ep_disconnect call to fix a bug in:
Commit 0ab710458da1 ("scsi: iscsi: Perform connection failure entirely in kernel space")
where we would call stop_conn without having done ep_disconnect. This early ep_disconnect call will then free the endpoint and it's ID while userspace still thinks the ID is valid.
Fix the early release of the ID by having the in kernel recovery code keep a reference to the endpoint until userspace has called into the kernel to finish cleaning up the endpoint/connection. It requires the previous commit "scsi: iscsi: Release endpoint ID when its freed" which moved the freeing of the ID until when the endpoint is released.
Link: https://lore.kernel.org/r/20220408001314.5014-5-michael.christie@oracle.com Fixes: 23d6fefbb3f6 ("scsi: iscsi: Fix in-kernel conn failure handling") Tested-by: Manish Rangankar mrangankar@marvell.com Reviewed-by: Lee Duncan lduncan@suse.com Reviewed-by: Chris Leech cleech@redhat.com Signed-off-by: Mike Christie michael.christie@oracle.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/scsi_transport_iscsi.c | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c index 03cda2da80ef..4fa2fd7f4c72 100644 --- a/drivers/scsi/scsi_transport_iscsi.c +++ b/drivers/scsi/scsi_transport_iscsi.c @@ -2267,7 +2267,11 @@ static void iscsi_if_disconnect_bound_ep(struct iscsi_cls_conn *conn, mutex_unlock(&conn->ep_mutex);
flush_work(&conn->cleanup_work); - + /* + * Userspace is now done with the EP so we can release the ref + * iscsi_cleanup_conn_work_fn took. + */ + iscsi_put_endpoint(ep); mutex_lock(&conn->ep_mutex); } } @@ -2342,6 +2346,12 @@ static void iscsi_cleanup_conn_work_fn(struct work_struct *work) return; }
+ /* + * Get a ref to the ep, so we don't release its ID until after + * userspace is done referencing it in iscsi_if_disconnect_bound_ep. + */ + if (conn->ep) + get_device(&conn->ep->dev); iscsi_ep_disconnect(conn, false);
if (system_state != SYSTEM_RUNNING) {
From: Mike Christie michael.christie@oracle.com
[ Upstream commit 7c6e99c18167ed89729bf167ccb4a7e3ab3115ba ]
If iscsid is doing a stop_conn at the same time the kernel is starting error recovery we can hit a race that allows the cleanup work to run on a valid connection. In the race, iscsi_if_stop_conn sees the cleanup bit set, but it calls flush_work on the clean_work before iscsi_conn_error_event has queued it. The flush then returns before the queueing and so the cleanup_work can run later and disconnect/stop a conn while it's in a connected state.
The patch:
Commit 0ab710458da1 ("scsi: iscsi: Perform connection failure entirely in kernel space")
added the late stop_conn call bug originally, and the patch:
Commit 23d6fefbb3f6 ("scsi: iscsi: Fix in-kernel conn failure handling")
attempted to fix it but only fixed the normal EH case and left the above race for the iscsid restart case. For the normal EH case we don't hit the race because we only signal userspace to start recovery after we have done the queueing, so the flush will always catch the queued work or see it completed.
For iscsid restart cases like boot, we can hit the race because iscsid will call down to the kernel before the kernel has signaled any error, so both code paths can be running at the same time. This adds a lock around the setting of the cleanup bit and queueing so they happen together.
Link: https://lore.kernel.org/r/20220408001314.5014-6-michael.christie@oracle.com Fixes: 0ab710458da1 ("scsi: iscsi: Perform connection failure entirely in kernel space") Tested-by: Manish Rangankar mrangankar@marvell.com Reviewed-by: Lee Duncan lduncan@suse.com Reviewed-by: Chris Leech cleech@redhat.com Signed-off-by: Mike Christie michael.christie@oracle.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/scsi_transport_iscsi.c | 17 +++++++++++++++++ include/scsi/scsi_transport_iscsi.h | 2 ++ 2 files changed, 19 insertions(+)
diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c index 4fa2fd7f4c72..ed289e1242c9 100644 --- a/drivers/scsi/scsi_transport_iscsi.c +++ b/drivers/scsi/scsi_transport_iscsi.c @@ -2260,9 +2260,12 @@ static void iscsi_if_disconnect_bound_ep(struct iscsi_cls_conn *conn, bool is_active) { /* Check if this was a conn error and the kernel took ownership */ + spin_lock_irq(&conn->lock); if (!test_bit(ISCSI_CLS_CONN_BIT_CLEANUP, &conn->flags)) { + spin_unlock_irq(&conn->lock); iscsi_ep_disconnect(conn, is_active); } else { + spin_unlock_irq(&conn->lock); ISCSI_DBG_TRANS_CONN(conn, "flush kernel conn cleanup.\n"); mutex_unlock(&conn->ep_mutex);
@@ -2309,9 +2312,12 @@ static int iscsi_if_stop_conn(struct iscsi_transport *transport, /* * Figure out if it was the kernel or userspace initiating this. */ + spin_lock_irq(&conn->lock); if (!test_and_set_bit(ISCSI_CLS_CONN_BIT_CLEANUP, &conn->flags)) { + spin_unlock_irq(&conn->lock); iscsi_stop_conn(conn, flag); } else { + spin_unlock_irq(&conn->lock); ISCSI_DBG_TRANS_CONN(conn, "flush kernel conn cleanup.\n"); flush_work(&conn->cleanup_work); @@ -2320,7 +2326,9 @@ static int iscsi_if_stop_conn(struct iscsi_transport *transport, * Only clear for recovery to avoid extra cleanup runs during * termination. */ + spin_lock_irq(&conn->lock); clear_bit(ISCSI_CLS_CONN_BIT_CLEANUP, &conn->flags); + spin_unlock_irq(&conn->lock); } ISCSI_DBG_TRANS_CONN(conn, "iscsi if conn stop done.\n"); return 0; @@ -2341,7 +2349,9 @@ static void iscsi_cleanup_conn_work_fn(struct work_struct *work) */ if (conn->state != ISCSI_CONN_BOUND && conn->state != ISCSI_CONN_UP) { ISCSI_DBG_TRANS_CONN(conn, "Got error while conn is already failed. Ignoring.\n"); + spin_lock_irq(&conn->lock); clear_bit(ISCSI_CLS_CONN_BIT_CLEANUP, &conn->flags); + spin_unlock_irq(&conn->lock); mutex_unlock(&conn->ep_mutex); return; } @@ -2407,6 +2417,7 @@ iscsi_create_conn(struct iscsi_cls_session *session, int dd_size, uint32_t cid) conn->dd_data = &conn[1];
mutex_init(&conn->ep_mutex); + spin_lock_init(&conn->lock); INIT_LIST_HEAD(&conn->conn_list); INIT_WORK(&conn->cleanup_work, iscsi_cleanup_conn_work_fn); conn->transport = transport; @@ -2598,9 +2609,12 @@ void iscsi_conn_error_event(struct iscsi_cls_conn *conn, enum iscsi_err error) struct iscsi_uevent *ev; struct iscsi_internal *priv; int len = nlmsg_total_size(sizeof(*ev)); + unsigned long flags;
+ spin_lock_irqsave(&conn->lock, flags); if (!test_and_set_bit(ISCSI_CLS_CONN_BIT_CLEANUP, &conn->flags)) queue_work(iscsi_conn_cleanup_workq, &conn->cleanup_work); + spin_unlock_irqrestore(&conn->lock, flags);
priv = iscsi_if_transport_lookup(conn->transport); if (!priv) @@ -3743,11 +3757,14 @@ static int iscsi_if_transport_conn(struct iscsi_transport *transport, return -EINVAL;
mutex_lock(&conn->ep_mutex); + spin_lock_irq(&conn->lock); if (test_bit(ISCSI_CLS_CONN_BIT_CLEANUP, &conn->flags)) { + spin_unlock_irq(&conn->lock); mutex_unlock(&conn->ep_mutex); ev->r.retcode = -ENOTCONN; return 0; } + spin_unlock_irq(&conn->lock);
switch (nlh->nlmsg_type) { case ISCSI_UEVENT_BIND_CONN: diff --git a/include/scsi/scsi_transport_iscsi.h b/include/scsi/scsi_transport_iscsi.h index c5d7810fd792..037c77fb5dc5 100644 --- a/include/scsi/scsi_transport_iscsi.h +++ b/include/scsi/scsi_transport_iscsi.h @@ -211,6 +211,8 @@ struct iscsi_cls_conn { struct mutex ep_mutex; struct iscsi_endpoint *ep;
+ /* Used when accessing flags and queueing work. */ + spinlock_t lock; unsigned long flags; struct work_struct cleanup_work;
From: Mike Christie michael.christie@oracle.com
[ Upstream commit 03690d81974535f228e892a14f0d2d44404fe555 ]
If a driver raises a connection error before the connection is bound, we can leave a cleanup_work queued that can later run and disconnect/stop a connection that is logged in. The problem is that drivers can call iscsi_conn_error_event for endpoints that are connected but not yet bound when something like the network port they are using is brought down. iscsi_cleanup_conn_work_fn will check for this and exit early, but if the cleanup_work is stuck behind other works, it might not get run until after userspace has done ep_disconnect. Because the endpoint is not yet bound there was no way for ep_disconnect to flush the work.
The bug of leaving stop_conns queued was added in:
Commit 23d6fefbb3f6 ("scsi: iscsi: Fix in-kernel conn failure handling")
and:
Commit 0ab710458da1 ("scsi: iscsi: Perform connection failure entirely in kernel space")
was supposed to fix it, but left this case.
This patch moves the conn state check to before we even queue the work so we can avoid queueing.
Link: https://lore.kernel.org/r/20220408001314.5014-7-michael.christie@oracle.com Fixes: 0ab710458da1 ("scsi: iscsi: Perform connection failure entirely in kernel space") Tested-by: Manish Rangankar mrangankar@marvell.com Reviewed-by: Lee Duncan <lduncan@@suse.com> Reviewed-by: Chris Leech cleech@redhat.com Signed-off-by: Mike Christie michael.christie@oracle.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/scsi_transport_iscsi.c | 65 ++++++++++++++++------------- 1 file changed, 36 insertions(+), 29 deletions(-)
diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c index ed289e1242c9..c7b1b2e8bb02 100644 --- a/drivers/scsi/scsi_transport_iscsi.c +++ b/drivers/scsi/scsi_transport_iscsi.c @@ -2221,10 +2221,10 @@ static void iscsi_stop_conn(struct iscsi_cls_conn *conn, int flag)
switch (flag) { case STOP_CONN_RECOVER: - conn->state = ISCSI_CONN_FAILED; + WRITE_ONCE(conn->state, ISCSI_CONN_FAILED); break; case STOP_CONN_TERM: - conn->state = ISCSI_CONN_DOWN; + WRITE_ONCE(conn->state, ISCSI_CONN_DOWN); break; default: iscsi_cls_conn_printk(KERN_ERR, conn, "invalid stop flag %d\n", @@ -2242,7 +2242,7 @@ static void iscsi_ep_disconnect(struct iscsi_cls_conn *conn, bool is_active) struct iscsi_endpoint *ep;
ISCSI_DBG_TRANS_CONN(conn, "disconnect ep.\n"); - conn->state = ISCSI_CONN_FAILED; + WRITE_ONCE(conn->state, ISCSI_CONN_FAILED);
if (!conn->ep || !session->transport->ep_disconnect) return; @@ -2341,21 +2341,6 @@ static void iscsi_cleanup_conn_work_fn(struct work_struct *work) struct iscsi_cls_session *session = iscsi_conn_to_session(conn);
mutex_lock(&conn->ep_mutex); - /* - * If we are not at least bound there is nothing for us to do. Userspace - * will do a ep_disconnect call if offload is used, but will not be - * doing a stop since there is nothing to clean up, so we have to clear - * the cleanup bit here. - */ - if (conn->state != ISCSI_CONN_BOUND && conn->state != ISCSI_CONN_UP) { - ISCSI_DBG_TRANS_CONN(conn, "Got error while conn is already failed. Ignoring.\n"); - spin_lock_irq(&conn->lock); - clear_bit(ISCSI_CLS_CONN_BIT_CLEANUP, &conn->flags); - spin_unlock_irq(&conn->lock); - mutex_unlock(&conn->ep_mutex); - return; - } - /* * Get a ref to the ep, so we don't release its ID until after * userspace is done referencing it in iscsi_if_disconnect_bound_ep. @@ -2422,7 +2407,7 @@ iscsi_create_conn(struct iscsi_cls_session *session, int dd_size, uint32_t cid) INIT_WORK(&conn->cleanup_work, iscsi_cleanup_conn_work_fn); conn->transport = transport; conn->cid = cid; - conn->state = ISCSI_CONN_DOWN; + WRITE_ONCE(conn->state, ISCSI_CONN_DOWN);
/* this is released in the dev's release function */ if (!get_device(&session->dev)) @@ -2610,10 +2595,30 @@ void iscsi_conn_error_event(struct iscsi_cls_conn *conn, enum iscsi_err error) struct iscsi_internal *priv; int len = nlmsg_total_size(sizeof(*ev)); unsigned long flags; + int state;
spin_lock_irqsave(&conn->lock, flags); - if (!test_and_set_bit(ISCSI_CLS_CONN_BIT_CLEANUP, &conn->flags)) - queue_work(iscsi_conn_cleanup_workq, &conn->cleanup_work); + /* + * Userspace will only do a stop call if we are at least bound. And, we + * only need to do the in kernel cleanup if in the UP state so cmds can + * be released to upper layers. If in other states just wait for + * userspace to avoid races that can leave the cleanup_work queued. + */ + state = READ_ONCE(conn->state); + switch (state) { + case ISCSI_CONN_BOUND: + case ISCSI_CONN_UP: + if (!test_and_set_bit(ISCSI_CLS_CONN_BIT_CLEANUP, + &conn->flags)) { + queue_work(iscsi_conn_cleanup_workq, + &conn->cleanup_work); + } + break; + default: + ISCSI_DBG_TRANS_CONN(conn, "Got conn error in state %d\n", + state); + break; + } spin_unlock_irqrestore(&conn->lock, flags);
priv = iscsi_if_transport_lookup(conn->transport); @@ -2964,7 +2969,7 @@ iscsi_set_param(struct iscsi_transport *transport, struct iscsi_uevent *ev) char *data = (char*)ev + sizeof(*ev); struct iscsi_cls_conn *conn; struct iscsi_cls_session *session; - int err = 0, value = 0; + int err = 0, value = 0, state;
if (ev->u.set_param.len > PAGE_SIZE) return -EINVAL; @@ -2981,8 +2986,8 @@ iscsi_set_param(struct iscsi_transport *transport, struct iscsi_uevent *ev) session->recovery_tmo = value; break; default: - if ((conn->state == ISCSI_CONN_BOUND) || - (conn->state == ISCSI_CONN_UP)) { + state = READ_ONCE(conn->state); + if (state == ISCSI_CONN_BOUND || state == ISCSI_CONN_UP) { err = transport->set_param(conn, ev->u.set_param.param, data, ev->u.set_param.len); } else { @@ -3778,7 +3783,7 @@ static int iscsi_if_transport_conn(struct iscsi_transport *transport, ev->u.b_conn.transport_eph, ev->u.b_conn.is_leading); if (!ev->r.retcode) - conn->state = ISCSI_CONN_BOUND; + WRITE_ONCE(conn->state, ISCSI_CONN_BOUND);
if (ev->r.retcode || !transport->ep_connect) break; @@ -3797,7 +3802,8 @@ static int iscsi_if_transport_conn(struct iscsi_transport *transport, case ISCSI_UEVENT_START_CONN: ev->r.retcode = transport->start_conn(conn); if (!ev->r.retcode) - conn->state = ISCSI_CONN_UP; + WRITE_ONCE(conn->state, ISCSI_CONN_UP); + break; case ISCSI_UEVENT_SEND_PDU: pdu_len = nlh->nlmsg_len - sizeof(*nlh) - sizeof(*ev); @@ -4105,10 +4111,11 @@ static ssize_t show_conn_state(struct device *dev, { struct iscsi_cls_conn *conn = iscsi_dev_to_conn(dev->parent); const char *state = "unknown"; + int conn_state = READ_ONCE(conn->state);
- if (conn->state >= 0 && - conn->state < ARRAY_SIZE(connection_state_names)) - state = connection_state_names[conn->state]; + if (conn_state >= 0 && + conn_state < ARRAY_SIZE(connection_state_names)) + state = connection_state_names[conn_state];
return sysfs_emit(buf, "%s\n", state); }
From: Petr Malat oss@malat.biz
[ Upstream commit 8467dda0c26583547731e7f3ea73fc3856bae3bf ]
Function sctp_do_peeloff() wrongly initializes daddr of the original socket instead of the peeled off socket, which makes getpeername() return zeroes instead of the primary address. Initialize the new socket instead.
Fixes: d570ee490fb1 ("[SCTP]: Correctly set daddr for IPv6 sockets during peeloff") Signed-off-by: Petr Malat oss@malat.biz Acked-by: Marcelo Ricardo Leitner marcelo.leitner@gmail.com Link: https://lore.kernel.org/r/20220409063611.673193-1-oss@malat.biz Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/sctp/socket.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/sctp/socket.c b/net/sctp/socket.c index 3e1a9600be5e..7b0427658056 100644 --- a/net/sctp/socket.c +++ b/net/sctp/socket.c @@ -5636,7 +5636,7 @@ int sctp_do_peeloff(struct sock *sk, sctp_assoc_t id, struct socket **sockp) * Set the daddr and initialize id to something more random and also * copy over any ip options. */ - sp->pf->to_sk_daddr(&asoc->peer.primary_addr, sk); + sp->pf->to_sk_daddr(&asoc->peer.primary_addr, sock->sk); sp->pf->copy_ip_options(sk, sock->sk);
/* Populate the fields of the newsk from the oldsk and migrate the
From: Horatiu Vultur horatiu.vultur@microchip.com
[ Upstream commit d7a947d289dc205fc717c004dcebe33b15305afd ]
On lan966x it is not allowed to have foreign interfaces under a bridge which already contains lan966x ports. So when a port leaves the bridge it would call switchdev_bridge_port_unoffload which eventually will notify the other ports that bridge left the vlan group but that is not true because the bridge is still part of the vlan group.
Therefore when a port leaves the bridge, stop generating replays because already the HW cleared after itself and the other ports don't need to do anything else.
Fixes: cf2f60897e921e ("net: lan966x: Add support to offload the forwarding.") Signed-off-by: Horatiu Vultur horatiu.vultur@microchip.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/microchip/lan966x/lan966x_switchdev.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_switchdev.c b/drivers/net/ethernet/microchip/lan966x/lan966x_switchdev.c index 7de55f6a4da8..3c987fd6b9e2 100644 --- a/drivers/net/ethernet/microchip/lan966x/lan966x_switchdev.c +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_switchdev.c @@ -261,8 +261,7 @@ static int lan966x_port_prechangeupper(struct net_device *dev,
if (netif_is_bridge_master(info->upper_dev) && !info->linking) switchdev_bridge_port_unoffload(port->dev, port, - &lan966x_switchdev_nb, - &lan966x_switchdev_blocking_nb); + NULL, NULL);
return NOTIFY_DONE; }
From: Horatiu Vultur horatiu.vultur@microchip.com
[ Upstream commit 269219321eb7d7645a3122cf40a420c5dc655eb9 ]
Currently when getting a new MAC is learn, the HW generates an interrupt. So then the SW will check the new entry and checks if it arrived on a correct port. If it didn't just generate a warning. But this could still crash the system. Therefore stop processing that entry when an issue is seen.
Fixes: 5ccd66e01cbef8 ("net: lan966x: add support for interrupts from analyzer") Signed-off-by: Horatiu Vultur horatiu.vultur@microchip.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/microchip/lan966x/lan966x_mac.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/microchip/lan966x/lan966x_mac.c b/drivers/net/ethernet/microchip/lan966x/lan966x_mac.c index ce5970bdcc6a..2679111ef669 100644 --- a/drivers/net/ethernet/microchip/lan966x/lan966x_mac.c +++ b/drivers/net/ethernet/microchip/lan966x/lan966x_mac.c @@ -346,7 +346,8 @@ static void lan966x_mac_irq_process(struct lan966x *lan966x, u32 row,
lan966x_mac_process_raw_entry(&raw_entries[column], mac, &vid, &dest_idx); - WARN_ON(dest_idx > lan966x->num_phys_ports); + if (WARN_ON(dest_idx > lan966x->num_phys_ports)) + continue;
/* If the entry in SW is found, then there is nothing * to do @@ -392,7 +393,8 @@ static void lan966x_mac_irq_process(struct lan966x *lan966x, u32 row,
lan966x_mac_process_raw_entry(&raw_entries[column], mac, &vid, &dest_idx); - WARN_ON(dest_idx > lan966x->num_phys_ports); + if (WARN_ON(dest_idx > lan966x->num_phys_ports)) + continue;
mac_entry = lan966x_mac_alloc_entry(mac, vid, dest_idx); if (!mac_entry)
From: Antoine Tenart atenart@kernel.org
[ Upstream commit 6c6f9f31ecd47dce1d0dafca4bec8805f9bc97cd ]
Since commit 6e1acfa387b9 ("netfilter: nf_tables: validate registers coming from userspace.") nft_parse_register can return a negative value, but the function prototype is still returning an unsigned int.
Fixes: 6e1acfa387b9 ("netfilter: nf_tables: validate registers coming from userspace.") Signed-off-by: Antoine Tenart atenart@kernel.org Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_tables_api.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index 1f5a0eece0d1..30d29d038d09 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -9275,7 +9275,7 @@ int nft_parse_u32_check(const struct nlattr *attr, int max, u32 *dest) } EXPORT_SYMBOL_GPL(nft_parse_u32_check);
-static unsigned int nft_parse_register(const struct nlattr *attr, u32 *preg) +static int nft_parse_register(const struct nlattr *attr, u32 *preg) { unsigned int reg;
From: Pavel Begunkov asml.silence@gmail.com
[ Upstream commit 0f8da75b51ac863b9435368bd50691718cc454b0 ]
io-wq work cancellation path can't take uring_lock as how it's done on file assignment, we have to handle IO_WQ_WORK_CANCEL first, this fixes encountered hangs.
Fixes: 6bf9c47a3989 ("io_uring: defer file assignment") Signed-off-by: Pavel Begunkov asml.silence@gmail.com Link: https://lore.kernel.org/r/0d9b9f37841645518503f6a207e509d14a286aba.164977346... Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/io_uring.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c index d05394b0c1e6..e3d1fc954933 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -6892,16 +6892,18 @@ static void io_wq_submit_work(struct io_wq_work *work) if (timeout) io_queue_linked_timeout(timeout);
- if (!io_assign_file(req, issue_flags)) { - err = -EBADF; - work->flags |= IO_WQ_WORK_CANCEL; - }
/* either cancelled or io-wq is dying, so don't touch tctx->iowq */ if (work->flags & IO_WQ_WORK_CANCEL) { +fail: io_req_task_queue_fail(req, err); return; } + if (!io_assign_file(req, issue_flags)) { + err = -EBADF; + work->flags |= IO_WQ_WORK_CANCEL; + goto fail; + }
if (req->flags & REQ_F_FORCE_ASYNC) { bool opcode_poll = def->pollin || def->pollout;
From: Takashi Iwai tiwai@suse.de
[ Upstream commit a8e84a5da18e6d786540aa4ceb6f969d5f1a441d ]
The previous cleanup with devres may lead to the incorrect release orders at the probe error handling due to the devres's nature. Until we register the card, snd_card_free() has to be called at first for releasing the stuff properly when the driver tries to manage and release the stuff via card->private_free().
This patch fixes it by calling snd_card_free() on the error from the probe callback using a new helper function.
Fixes: 567f58754109 ("ALSA: ad1889: Allocate resources with device-managed APIs") Link: https://lore.kernel.org/r/20220412102636.16000-4-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/pci/ad1889.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/sound/pci/ad1889.c b/sound/pci/ad1889.c index bba4dae8dcc7..50e30704bf6f 100644 --- a/sound/pci/ad1889.c +++ b/sound/pci/ad1889.c @@ -844,8 +844,8 @@ snd_ad1889_create(struct snd_card *card, struct pci_dev *pci) }
static int -snd_ad1889_probe(struct pci_dev *pci, - const struct pci_device_id *pci_id) +__snd_ad1889_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) { int err; static int devno; @@ -904,6 +904,12 @@ snd_ad1889_probe(struct pci_dev *pci, return 0; }
+static int snd_ad1889_probe(struct pci_dev *pci, + const struct pci_device_id *pci_id) +{ + return snd_card_free_on_error(&pci->dev, __snd_ad1889_probe(pci, pci_id)); +} + static const struct pci_device_id snd_ad1889_ids[] = { { PCI_DEVICE(PCI_VENDOR_ID_ANALOG_DEVICES, PCI_DEVICE_ID_AD1889JS) }, { 0, },
From: Takashi Iwai tiwai@suse.de
[ Upstream commit 4fb27190879b82e48ce89a56e9d6c04437dbc065 ]
The card destructor of nm256 driver does merely stopping the running timer, and it's superfluous for the probe error handling. Moreover, calling this via the previous devres change would lead to another problem due to the reverse call order.
This patch moves the setup of the private_free callback after the card registration, so that it can be used only after fully set up.
Fixes: aa92050f10f0 ("ALSA: mtpav: Allocate resources with device-managed APIs") Link: https://lore.kernel.org/r/20220412102636.16000-39-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/drivers/mtpav.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/sound/drivers/mtpav.c b/sound/drivers/mtpav.c index 11235baaf6fa..f212f233ea61 100644 --- a/sound/drivers/mtpav.c +++ b/sound/drivers/mtpav.c @@ -693,8 +693,6 @@ static int snd_mtpav_probe(struct platform_device *dev) mtp_card->outmidihwport = 0xffffffff; timer_setup(&mtp_card->timer, snd_mtpav_output_timer, 0);
- card->private_free = snd_mtpav_free; - err = snd_mtpav_get_RAWMIDI(mtp_card); if (err < 0) return err; @@ -716,6 +714,8 @@ static int snd_mtpav_probe(struct platform_device *dev) if (err < 0) return err;
+ card->private_free = snd_mtpav_free; + platform_set_drvdata(dev, card); printk(KERN_INFO "Motu MidiTimePiece on parallel port irq: %d ioport: 0x%lx\n", irq, port); return 0;
From: Dylan Yudaken dylany@fb.com
[ Upstream commit 565c5e616e8061b40a2e1d786c418a7ac3503a8d ]
Move validation to be more consistently straight after copy_from_user. This is already done in io_register_rsrc_update and so this removes that redundant check.
Signed-off-by: Dylan Yudaken dylany@fb.com Link: https://lore.kernel.org/r/20220412163042.2788062-2-dylany@fb.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/io_uring.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c index e3d1fc954933..7da6fddaef4d 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -10784,8 +10784,6 @@ static int __io_register_rsrc_update(struct io_ring_ctx *ctx, unsigned type, __u32 tmp; int err;
- if (up->resv) - return -EINVAL; if (check_add_overflow(up->offset, nr_args, &tmp)) return -EOVERFLOW; err = io_rsrc_node_switch_start(ctx); @@ -10811,6 +10809,8 @@ static int io_register_files_update(struct io_ring_ctx *ctx, void __user *arg, memset(&up, 0, sizeof(up)); if (copy_from_user(&up, arg, sizeof(struct io_uring_rsrc_update))) return -EFAULT; + if (up.resv) + return -EINVAL; return __io_register_rsrc_update(ctx, IORING_RSRC_FILE, &up, nr_args); }
From: Dylan Yudaken dylany@fb.com
[ Upstream commit d8a3ba9c143bf89c032deced8a686ffa53b46098 ]
Verify that the user does not pass in anything but 0 for this field.
Fixes: 992da01aa932 ("io_uring: change registration/upd/rsrc tagging ABI") Signed-off-by: Dylan Yudaken dylany@fb.com Link: https://lore.kernel.org/r/20220412163042.2788062-3-dylany@fb.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/io_uring.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c index 7da6fddaef4d..2838bc6cdbc8 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -6466,6 +6466,7 @@ static int io_files_update(struct io_kiocb *req, unsigned int issue_flags) up.nr = 0; up.tags = 0; up.resv = 0; + up.resv2 = 0;
io_ring_submit_lock(ctx, needs_lock); ret = __io_register_rsrc_update(ctx, IORING_RSRC_FILE, @@ -10809,7 +10810,7 @@ static int io_register_files_update(struct io_ring_ctx *ctx, void __user *arg, memset(&up, 0, sizeof(up)); if (copy_from_user(&up, arg, sizeof(struct io_uring_rsrc_update))) return -EFAULT; - if (up.resv) + if (up.resv || up.resv2) return -EINVAL; return __io_register_rsrc_update(ctx, IORING_RSRC_FILE, &up, nr_args); } @@ -10823,7 +10824,7 @@ static int io_register_rsrc_update(struct io_ring_ctx *ctx, void __user *arg, return -EINVAL; if (copy_from_user(&up, arg, sizeof(up))) return -EFAULT; - if (!up.nr || up.resv) + if (!up.nr || up.resv || up.resv2) return -EINVAL; return __io_register_rsrc_update(ctx, type, &up, up.nr); }
From: Dylan Yudaken dylany@fb.com
[ Upstream commit d2347b9695dafe5c388a5f9aeb70e27a7a4d29cf ]
Ensure that only 0 is passed for pad here.
Fixes: c73ebb685fb6 ("io_uring: add timeout support for io_uring_enter()") Signed-off-by: Dylan Yudaken dylany@fb.com Link: https://lore.kernel.org/r/20220412163042.2788062-5-dylany@fb.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/io_uring.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/fs/io_uring.c b/fs/io_uring.c index 2838bc6cdbc8..7a652c8eeed2 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -10109,6 +10109,8 @@ static int io_get_ext_arg(unsigned flags, const void __user *argp, size_t *argsz return -EINVAL; if (copy_from_user(&arg, argp, sizeof(arg))) return -EFAULT; + if (arg.pad) + return -EINVAL; *sig = u64_to_user_ptr(arg.sigmask); *argsz = arg.sigmask_sz; *ts = u64_to_user_ptr(arg.ts);
From: Athira Rajeev atrajeev@linux.vnet.ibm.com
[ Upstream commit ce64763c63854b4079f2e036638aa881a1fb3fbc ]
The selftest "mqueue/mq_perf_tests.c" use CPU_ALLOC to allocate CPU set. This cpu set is used further in pthread_attr_setaffinity_np and by pthread_create in the code. But in current code, allocated cpu set is not freed.
Fix this issue by adding CPU_FREE in the "shutdown" function which is called in most of the error/exit path for the cleanup. There are few error paths which exit without using shutdown. Add a common goto error path with CPU_FREE for these cases.
Fixes: 7820b0715b6f ("tools/selftests: add mq_perf_tests") Signed-off-by: Athira Rajeev atrajeev@linux.vnet.ibm.com Signed-off-by: Shuah Khan skhan@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../testing/selftests/mqueue/mq_perf_tests.c | 25 +++++++++++++------ 1 file changed, 17 insertions(+), 8 deletions(-)
diff --git a/tools/testing/selftests/mqueue/mq_perf_tests.c b/tools/testing/selftests/mqueue/mq_perf_tests.c index b019e0b8221c..84fda3b49073 100644 --- a/tools/testing/selftests/mqueue/mq_perf_tests.c +++ b/tools/testing/selftests/mqueue/mq_perf_tests.c @@ -180,6 +180,9 @@ void shutdown(int exit_val, char *err_cause, int line_no) if (in_shutdown++) return;
+ /* Free the cpu_set allocated using CPU_ALLOC in main function */ + CPU_FREE(cpu_set); + for (i = 0; i < num_cpus_to_pin; i++) if (cpu_threads[i]) { pthread_kill(cpu_threads[i], SIGUSR1); @@ -551,6 +554,12 @@ int main(int argc, char *argv[]) perror("sysconf(_SC_NPROCESSORS_ONLN)"); exit(1); } + + if (getuid() != 0) + ksft_exit_skip("Not running as root, but almost all tests " + "require root in order to modify\nsystem settings. " + "Exiting.\n"); + cpus_online = min(MAX_CPUS, sysconf(_SC_NPROCESSORS_ONLN)); cpu_set = CPU_ALLOC(cpus_online); if (cpu_set == NULL) { @@ -589,7 +598,7 @@ int main(int argc, char *argv[]) cpu_set)) { fprintf(stderr, "Any given CPU may " "only be given once.\n"); - exit(1); + goto err_code; } else CPU_SET_S(cpus_to_pin[cpu], cpu_set_size, cpu_set); @@ -607,7 +616,7 @@ int main(int argc, char *argv[]) queue_path = malloc(strlen(option) + 2); if (!queue_path) { perror("malloc()"); - exit(1); + goto err_code; } queue_path[0] = '/'; queue_path[1] = 0; @@ -622,17 +631,12 @@ int main(int argc, char *argv[]) fprintf(stderr, "Must pass at least one CPU to continuous " "mode.\n"); poptPrintUsage(popt_context, stderr, 0); - exit(1); + goto err_code; } else if (!continuous_mode) { num_cpus_to_pin = 1; cpus_to_pin[0] = cpus_online - 1; }
- if (getuid() != 0) - ksft_exit_skip("Not running as root, but almost all tests " - "require root in order to modify\nsystem settings. " - "Exiting.\n"); - max_msgs = fopen(MAX_MSGS, "r+"); max_msgsize = fopen(MAX_MSGSIZE, "r+"); if (!max_msgs) @@ -740,4 +744,9 @@ int main(int argc, char *argv[]) sleep(1); } shutdown(0, "", 0); + +err_code: + CPU_FREE(cpu_set); + exit(1); + }
From: Takashi Iwai tiwai@suse.de
[ Upstream commit fee2ec8cceb33b8886bc5894fb07e0b2e34148af ]
The current limit of max buffer size 1MB seems too small for modern devices with lots of channels and high sample rates. Let's make bigger, 4MB.
Reviewed-by: Jaroslav Kysela perex@perex.cz Link: https://lore.kernel.org/r/20220407212740.17920-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/usb/pcm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c index 6a460225f2e3..37ee6df8b15a 100644 --- a/sound/usb/pcm.c +++ b/sound/usb/pcm.c @@ -659,7 +659,7 @@ static int snd_usb_pcm_prepare(struct snd_pcm_substream *substream) #define hwc_debug(fmt, args...) do { } while(0) #endif
-#define MAX_BUFFER_BYTES (1024 * 1024) +#define MAX_BUFFER_BYTES (4 * 1024 * 1024) #define MAX_PERIOD_BYTES (512 * 1024)
static const struct snd_pcm_hardware snd_usb_hardware =
From: Takashi Iwai tiwai@suse.de
[ Upstream commit 24d0c9f0e7de95fe3e3e0067cbea1cd5d413244b ]
In the previous fix, we increased the max buffer bytes from 1MB to 4MB so that we can use bigger buffers for the modern HiFi devices with higher rates, more channels and wider formats. OTOH, extending this has a concern that too big buffer is allowed for the lower rates, less channels and narrower formats; when an application tries to allocate as big buffer as possible, it'll lead to unexpectedly too huge size.
Also, we had a problem about the inconsistent max buffer and period bytes for the implicit feedback mode when both streams have different channels. This was fixed by the (relatively complex) patch to reduce the max buffer and period bytes accordingly.
This is an alternative fix for those, a patch to kill two birds with one stone (*): instead of increasing the max buffer bytes blindly and applying the reduction per channels, we simply use the hw constraints for the buffer and period "time". Meanwhile the max buffer and period bytes are set unlimited instead.
Since the inconsistency of buffer (and period) bytes comes from the difference of the channels in the tied streams, as long as we care only about the buffer (and period) time, it doesn't matter; the buffer time is same for different channels, although we still allow higher buffer size. Similarly, this will allow more buffer bytes for HiFi devices while it also keeps the reasonable size for the legacy devices, too.
As of this patch, the max period and buffer time are set to 1 and 2 seconds, which should be large enough for all possible use cases.
(*) No animals were harmed in the making of this patch.
Fixes: 98c27add5d96 ("ALSA: usb-audio: Cap upper limits of buffer/period bytes for implicit fb") Fixes: fee2ec8cceb3 ("ALSA: usb-audio: Increase max buffer size") Link: https://lore.kernel.org/r/20220412130740.18933-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/usb/pcm.c | 101 +++++++----------------------------------------- 1 file changed, 14 insertions(+), 87 deletions(-)
diff --git a/sound/usb/pcm.c b/sound/usb/pcm.c index 37ee6df8b15a..6d699065e81a 100644 --- a/sound/usb/pcm.c +++ b/sound/usb/pcm.c @@ -659,9 +659,6 @@ static int snd_usb_pcm_prepare(struct snd_pcm_substream *substream) #define hwc_debug(fmt, args...) do { } while(0) #endif
-#define MAX_BUFFER_BYTES (4 * 1024 * 1024) -#define MAX_PERIOD_BYTES (512 * 1024) - static const struct snd_pcm_hardware snd_usb_hardware = { .info = SNDRV_PCM_INFO_MMAP | @@ -672,9 +669,9 @@ static const struct snd_pcm_hardware snd_usb_hardware = SNDRV_PCM_INFO_PAUSE, .channels_min = 1, .channels_max = 256, - .buffer_bytes_max = MAX_BUFFER_BYTES, + .buffer_bytes_max = INT_MAX, /* limited by BUFFER_TIME later */ .period_bytes_min = 64, - .period_bytes_max = MAX_PERIOD_BYTES, + .period_bytes_max = INT_MAX, /* limited by PERIOD_TIME later */ .periods_min = 2, .periods_max = 1024, }; @@ -974,78 +971,6 @@ static int hw_rule_periods_implicit_fb(struct snd_pcm_hw_params *params, ep->cur_buffer_periods); }
-/* get the adjusted max buffer (or period) bytes that can fit with the - * paired format for implicit fb - */ -static unsigned int -get_adjusted_max_bytes(struct snd_usb_substream *subs, - struct snd_usb_substream *pair, - struct snd_pcm_hw_params *params, - unsigned int max_bytes, - bool reverse_map) -{ - const struct audioformat *fp, *pp; - unsigned int rmax = 0, r; - - list_for_each_entry(fp, &subs->fmt_list, list) { - if (!fp->implicit_fb) - continue; - if (!reverse_map && - !hw_check_valid_format(subs, params, fp)) - continue; - list_for_each_entry(pp, &pair->fmt_list, list) { - if (pp->iface != fp->sync_iface || - pp->altsetting != fp->sync_altsetting || - pp->ep_idx != fp->sync_ep_idx) - continue; - if (reverse_map && - !hw_check_valid_format(pair, params, pp)) - break; - if (!reverse_map && pp->channels > fp->channels) - r = max_bytes * fp->channels / pp->channels; - else if (reverse_map && pp->channels < fp->channels) - r = max_bytes * pp->channels / fp->channels; - else - r = max_bytes; - rmax = max(rmax, r); - break; - } - } - return rmax; -} - -/* Reduce the period or buffer bytes depending on the paired substream; - * when a paired configuration for implicit fb has a higher number of channels, - * we need to reduce the max size accordingly, otherwise it may become unusable - */ -static int hw_rule_bytes_implicit_fb(struct snd_pcm_hw_params *params, - struct snd_pcm_hw_rule *rule) -{ - struct snd_usb_substream *subs = rule->private; - struct snd_usb_substream *pair; - struct snd_interval *it; - unsigned int max_bytes; - unsigned int rmax; - - pair = &subs->stream->substream[!subs->direction]; - if (!pair->ep_num) - return 0; - - if (rule->var == SNDRV_PCM_HW_PARAM_PERIOD_BYTES) - max_bytes = MAX_PERIOD_BYTES; - else - max_bytes = MAX_BUFFER_BYTES; - - rmax = get_adjusted_max_bytes(subs, pair, params, max_bytes, false); - if (!rmax) - rmax = get_adjusted_max_bytes(pair, subs, params, max_bytes, true); - if (!rmax) - return 0; - - it = hw_param_interval(params, rule->var); - return apply_hw_params_minmax(it, 0, rmax); -} - /* * set up the runtime hardware information. */ @@ -1139,6 +1064,18 @@ static int setup_hw_info(struct snd_pcm_runtime *runtime, struct snd_usb_substre return err; }
+ /* set max period and buffer sizes for 1 and 2 seconds, respectively */ + err = snd_pcm_hw_constraint_minmax(runtime, + SNDRV_PCM_HW_PARAM_PERIOD_TIME, + 0, 1000000); + if (err < 0) + return err; + err = snd_pcm_hw_constraint_minmax(runtime, + SNDRV_PCM_HW_PARAM_BUFFER_TIME, + 0, 2000000); + if (err < 0) + return err; + /* additional hw constraints for implicit fb */ err = snd_pcm_hw_rule_add(runtime, 0, SNDRV_PCM_HW_PARAM_FORMAT, hw_rule_format_implicit_fb, subs, @@ -1160,16 +1097,6 @@ static int setup_hw_info(struct snd_pcm_runtime *runtime, struct snd_usb_substre SNDRV_PCM_HW_PARAM_PERIODS, -1); if (err < 0) return err; - err = snd_pcm_hw_rule_add(runtime, 0, SNDRV_PCM_HW_PARAM_BUFFER_BYTES, - hw_rule_bytes_implicit_fb, subs, - SNDRV_PCM_HW_PARAM_BUFFER_BYTES, -1); - if (err < 0) - return err; - err = snd_pcm_hw_rule_add(runtime, 0, SNDRV_PCM_HW_PARAM_PERIOD_BYTES, - hw_rule_bytes_implicit_fb, subs, - SNDRV_PCM_HW_PARAM_PERIOD_BYTES, -1); - if (err < 0) - return err;
list_for_each_entry(fp, &subs->fmt_list, list) { if (fp->implicit_fb) {
From: Adrian Hunter adrian.hunter@intel.com
[ Upstream commit f034fc50d3c7d9385c20d505ab4cf56b8fd18ac7 ]
Fix incorrect debug message:
Attempting to add event pmu 'intel_pt' with '' that may result in non-fatal errors
which always appears with perf record -vv and intel_pt e.g.
perf record -vv -e intel_pt//u uname
The message is incorrect because there will never be non-fatal errors.
Suppress the message if the PMU is 'selectable' i.e. meant to be selected directly as an event.
Fixes: 4ac22b484d4c79e8 ("perf parse-events: Make add PMU verbose output clearer") Signed-off-by: Adrian Hunter adrian.hunter@intel.com Cc: Ian Rogers irogers@google.com Cc: Jiri Olsa jolsa@kernel.org Link: http://lore.kernel.org/lkml/20220411061758.2458417-1-adrian.hunter@intel.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/parse-events.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/parse-events.c b/tools/perf/util/parse-events.c index 24997925ae00..dd84fed698a3 100644 --- a/tools/perf/util/parse-events.c +++ b/tools/perf/util/parse-events.c @@ -1523,7 +1523,9 @@ int parse_events_add_pmu(struct parse_events_state *parse_state, bool use_uncore_alias; LIST_HEAD(config_terms);
- if (verbose > 1) { + pmu = parse_state->fake_pmu ?: perf_pmu__find(name); + + if (verbose > 1 && !(pmu && pmu->selectable)) { fprintf(stderr, "Attempting to add event pmu '%s' with '", name); if (head_config) { @@ -1536,7 +1538,6 @@ int parse_events_add_pmu(struct parse_events_state *parse_state, fprintf(stderr, "' that may result in non-fatal errors\n"); }
- pmu = parse_state->fake_pmu ?: perf_pmu__find(name); if (!pmu) { char *err_str;
From: Martin Willi martin@strongswan.org
[ Upstream commit e16b859872b87650bb55b12cca5a5fcdc49c1442 ]
The MACVLAN receive handler clones skbs to all matching source MACVLAN interfaces, before it passes the packet along to match on destination based MACVLANs.
When using the MACVLAN nodst mode, passing the packet to destination based MACVLANs is omitted and the handler returns with RX_HANDLER_CONSUMED. However, the passed skb is not freed, leaking for any packet processed with the nodst option.
Properly free the skb when consuming packets to fix that leak.
Fixes: 427f0c8c194b ("macvlan: Add nodst option to macvlan type source") Signed-off-by: Martin Willi martin@strongswan.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/macvlan.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/net/macvlan.c b/drivers/net/macvlan.c index 6ef5f77be4d0..c83664b28d89 100644 --- a/drivers/net/macvlan.c +++ b/drivers/net/macvlan.c @@ -460,8 +460,10 @@ static rx_handler_result_t macvlan_handle_frame(struct sk_buff **pskb) return RX_HANDLER_CONSUMED; *pskb = skb; eth = eth_hdr(skb); - if (macvlan_forward_source(skb, port, eth->h_source)) + if (macvlan_forward_source(skb, port, eth->h_source)) { + kfree_skb(skb); return RX_HANDLER_CONSUMED; + } src = macvlan_hash_lookup(port, eth->h_source); if (src && src->mode != MACVLAN_MODE_VEPA && src->mode != MACVLAN_MODE_BRIDGE) { @@ -480,8 +482,10 @@ static rx_handler_result_t macvlan_handle_frame(struct sk_buff **pskb) return RX_HANDLER_PASS; }
- if (macvlan_forward_source(skb, port, eth->h_source)) + if (macvlan_forward_source(skb, port, eth->h_source)) { + kfree_skb(skb); return RX_HANDLER_CONSUMED; + } if (macvlan_passthru(port)) vlan = list_first_or_null_rcu(&port->vlans, struct macvlan_dev, list);
From: Dylan Hung dylan_hung@aspeedtech.com
[ Upstream commit 3d2504524531990b32a0629cc984db44f399d161 ]
AST2600 MAC register 0x58 is writable only when the MAC clock is enabled. Usually, the MAC clock is enabled by the bootloader so register 0x58 is set normally when the bootloader is involved. To make ast2600 ftgmac100 work without the bootloader, postpone the register write until the clock is ready.
Fixes: 137d23cea1c0 ("net: ftgmac100: Fix Aspeed ast2600 TX hang issue") Signed-off-by: Dylan Hung dylan_hung@aspeedtech.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/faraday/ftgmac100.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/faraday/ftgmac100.c b/drivers/net/ethernet/faraday/ftgmac100.c index d5356db7539a..caf48023f8ea 100644 --- a/drivers/net/ethernet/faraday/ftgmac100.c +++ b/drivers/net/ethernet/faraday/ftgmac100.c @@ -1835,11 +1835,6 @@ static int ftgmac100_probe(struct platform_device *pdev) priv->rxdes0_edorr_mask = BIT(30); priv->txdes0_edotr_mask = BIT(30); priv->is_aspeed = true; - /* Disable ast2600 problematic HW arbitration */ - if (of_device_is_compatible(np, "aspeed,ast2600-mac")) { - iowrite32(FTGMAC100_TM_DEFAULT, - priv->base + FTGMAC100_OFFSET_TM); - } } else { priv->rxdes0_edorr_mask = BIT(15); priv->txdes0_edotr_mask = BIT(15); @@ -1911,6 +1906,11 @@ static int ftgmac100_probe(struct platform_device *pdev) err = ftgmac100_setup_clk(priv); if (err) goto err_phy_connect; + + /* Disable ast2600 problematic HW arbitration */ + if (of_device_is_compatible(np, "aspeed,ast2600-mac")) + iowrite32(FTGMAC100_TM_DEFAULT, + priv->base + FTGMAC100_OFFSET_TM); }
/* Default ring sizes */
From: Lin Ma linma@zju.edu.cn
[ Upstream commit ef27324e2cb7bb24542d6cb2571740eefe6b00dc ]
Our detector found a concurrent use-after-free bug when detaching an NCI device. The main reason for this bug is the unexpected scheduling between the used delayed mechanism (timer and workqueue).
The race can be demonstrated below:
Thread-1 Thread-2 | nci_dev_up() | nci_open_device() | __nci_request(nci_reset_req) | nci_send_cmd | queue_work(cmd_work) nci_unregister_device() | nci_close_device() | ... del_timer_sync(cmd_timer)[1] | ... | Worker nci_free_device() | nci_cmd_work() kfree(ndev)[3] | mod_timer(cmd_timer)[2]
In short, the cleanup routine thought that the cmd_timer has already been detached by [1] but the mod_timer can re-attach the timer [2], even it is already released [3], resulting in UAF.
This UAF is easy to trigger, crash trace by POC is like below
[ 66.703713] ================================================================== [ 66.703974] BUG: KASAN: use-after-free in enqueue_timer+0x448/0x490 [ 66.703974] Write of size 8 at addr ffff888009fb7058 by task kworker/u4:1/33 [ 66.703974] [ 66.703974] CPU: 1 PID: 33 Comm: kworker/u4:1 Not tainted 5.18.0-rc2 #5 [ 66.703974] Workqueue: nfc2_nci_cmd_wq nci_cmd_work [ 66.703974] Call Trace: [ 66.703974] <TASK> [ 66.703974] dump_stack_lvl+0x57/0x7d [ 66.703974] print_report.cold+0x5e/0x5db [ 66.703974] ? enqueue_timer+0x448/0x490 [ 66.703974] kasan_report+0xbe/0x1c0 [ 66.703974] ? enqueue_timer+0x448/0x490 [ 66.703974] enqueue_timer+0x448/0x490 [ 66.703974] __mod_timer+0x5e6/0xb80 [ 66.703974] ? mark_held_locks+0x9e/0xe0 [ 66.703974] ? try_to_del_timer_sync+0xf0/0xf0 [ 66.703974] ? lockdep_hardirqs_on_prepare+0x17b/0x410 [ 66.703974] ? queue_work_on+0x61/0x80 [ 66.703974] ? lockdep_hardirqs_on+0xbf/0x130 [ 66.703974] process_one_work+0x8bb/0x1510 [ 66.703974] ? lockdep_hardirqs_on_prepare+0x410/0x410 [ 66.703974] ? pwq_dec_nr_in_flight+0x230/0x230 [ 66.703974] ? rwlock_bug.part.0+0x90/0x90 [ 66.703974] ? _raw_spin_lock_irq+0x41/0x50 [ 66.703974] worker_thread+0x575/0x1190 [ 66.703974] ? process_one_work+0x1510/0x1510 [ 66.703974] kthread+0x2a0/0x340 [ 66.703974] ? kthread_complete_and_exit+0x20/0x20 [ 66.703974] ret_from_fork+0x22/0x30 [ 66.703974] </TASK> [ 66.703974] [ 66.703974] Allocated by task 267: [ 66.703974] kasan_save_stack+0x1e/0x40 [ 66.703974] __kasan_kmalloc+0x81/0xa0 [ 66.703974] nci_allocate_device+0xd3/0x390 [ 66.703974] nfcmrvl_nci_register_dev+0x183/0x2c0 [ 66.703974] nfcmrvl_nci_uart_open+0xf2/0x1dd [ 66.703974] nci_uart_tty_ioctl+0x2c3/0x4a0 [ 66.703974] tty_ioctl+0x764/0x1310 [ 66.703974] __x64_sys_ioctl+0x122/0x190 [ 66.703974] do_syscall_64+0x3b/0x90 [ 66.703974] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 66.703974] [ 66.703974] Freed by task 406: [ 66.703974] kasan_save_stack+0x1e/0x40 [ 66.703974] kasan_set_track+0x21/0x30 [ 66.703974] kasan_set_free_info+0x20/0x30 [ 66.703974] __kasan_slab_free+0x108/0x170 [ 66.703974] kfree+0xb0/0x330 [ 66.703974] nfcmrvl_nci_unregister_dev+0x90/0xd0 [ 66.703974] nci_uart_tty_close+0xdf/0x180 [ 66.703974] tty_ldisc_kill+0x73/0x110 [ 66.703974] tty_ldisc_hangup+0x281/0x5b0 [ 66.703974] __tty_hangup.part.0+0x431/0x890 [ 66.703974] tty_release+0x3a8/0xc80 [ 66.703974] __fput+0x1f0/0x8c0 [ 66.703974] task_work_run+0xc9/0x170 [ 66.703974] exit_to_user_mode_prepare+0x194/0x1a0 [ 66.703974] syscall_exit_to_user_mode+0x19/0x50 [ 66.703974] do_syscall_64+0x48/0x90 [ 66.703974] entry_SYSCALL_64_after_hwframe+0x44/0xae
To fix the UAF, this patch adds flush_workqueue() to ensure the nci_cmd_work is finished before the following del_timer_sync. This combination will promise the timer is actually detached.
Fixes: 6a2968aaf50c ("NFC: basic NCI protocol implementation") Signed-off-by: Lin Ma linma@zju.edu.cn Reviewed-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/nfc/nci/core.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/net/nfc/nci/core.c b/net/nfc/nci/core.c index d2537383a3e8..6a193cce2a75 100644 --- a/net/nfc/nci/core.c +++ b/net/nfc/nci/core.c @@ -560,6 +560,10 @@ static int nci_close_device(struct nci_dev *ndev) mutex_lock(&ndev->req_lock);
if (!test_and_clear_bit(NCI_UP, &ndev->flags)) { + /* Need to flush the cmd wq in case + * there is a queued/running cmd_work + */ + flush_workqueue(ndev->cmd_wq); del_timer_sync(&ndev->cmd_timer); del_timer_sync(&ndev->data_timer); mutex_unlock(&ndev->req_lock);
From: Harshit Mogalapalli harshit.m.mogalapalli@oracle.com
[ Upstream commit 64c4a37ac04eeb43c42d272f6e6c8c12bfcf4304 ]
Smatch printed a warning: arch/x86/crypto/poly1305_glue.c:198 poly1305_update_arch() error: __memcpy() 'dctx->buf' too small (16 vs u32max)
It's caused because Smatch marks 'link_len' as untrusted since it comes from sscanf(). Add a check to ensure that 'link_len' is not larger than the size of the 'link_str' buffer.
Fixes: c69c1b6eaea1 ("cifs: implement CIFSParseMFSymlink()") Signed-off-by: Harshit Mogalapalli harshit.m.mogalapalli@oracle.com Reviewed-by: Ronnie Sahlberg lsahlber@redhat.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cifs/link.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/fs/cifs/link.c b/fs/cifs/link.c index 852e54ee82c2..bbdf3281559c 100644 --- a/fs/cifs/link.c +++ b/fs/cifs/link.c @@ -85,6 +85,9 @@ parse_mf_symlink(const u8 *buf, unsigned int buf_len, unsigned int *_link_len, if (rc != 1) return -EINVAL;
+ if (link_len > CIFS_MF_SYMLINK_LINK_MAXLEN) + return -EINVAL; + rc = symlink_hash(link_len, link_str, md5_hash); if (rc) { cifs_dbg(FYI, "%s: MD5 hash failure: %d\n", __func__, rc);
From: Khazhismel Kumykov khazhy@google.com
[ Upstream commit ce40426fdc3c92acdba6b5ca74bc7277ffaa6a3d ]
Mixing sched_clock() and ktime_get_ns() usage will give bad results.
Switch hst_select_path() from using sched_clock() to ktime_get_ns(). Also rename path_service_time()'s 'sched_now' variable to 'now'.
Fixes: 2613eab11996 ("dm mpath: add Historical Service Time Path Selector") Signed-off-by: Khazhismel Kumykov khazhy@google.com Signed-off-by: Mike Snitzer snitzer@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/dm-ps-historical-service-time.c | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/drivers/md/dm-ps-historical-service-time.c b/drivers/md/dm-ps-historical-service-time.c index 875bca30a0dd..82f2a06153dc 100644 --- a/drivers/md/dm-ps-historical-service-time.c +++ b/drivers/md/dm-ps-historical-service-time.c @@ -27,7 +27,6 @@ #include <linux/blkdev.h> #include <linux/slab.h> #include <linux/module.h> -#include <linux/sched/clock.h>
#define DM_MSG_PREFIX "multipath historical-service-time" @@ -433,7 +432,7 @@ static struct dm_path *hst_select_path(struct path_selector *ps, { struct selector *s = ps->context; struct path_info *pi = NULL, *best = NULL; - u64 time_now = sched_clock(); + u64 time_now = ktime_get_ns(); struct dm_path *ret = NULL; unsigned long flags;
@@ -474,7 +473,7 @@ static int hst_start_io(struct path_selector *ps, struct dm_path *path,
static u64 path_service_time(struct path_info *pi, u64 start_time) { - u64 sched_now = ktime_get_ns(); + u64 now = ktime_get_ns();
/* if a previous disk request has finished after this IO was * sent to the hardware, pretend the submission happened @@ -483,11 +482,11 @@ static u64 path_service_time(struct path_info *pi, u64 start_time) if (time_after64(pi->last_finish, start_time)) start_time = pi->last_finish;
- pi->last_finish = sched_now; - if (time_before64(sched_now, start_time)) + pi->last_finish = now; + if (time_before64(now, start_time)) return 0;
- return sched_now - start_time; + return now - start_time; }
static int hst_end_io(struct path_selector *ps, struct dm_path *path,
From: Jason Gunthorpe jgg@nvidia.com
[ Upstream commit 1ef3342a934e235aca72b4bcc0d6854d80a65077 ]
get_pf_vdev() tries to check if a PF is a VFIO PF by looking at the driver:
if (pci_dev_driver(physfn) != pci_dev_driver(vdev->pdev)) {
However now that we have multiple VF and PF drivers this is no longer reliable.
This means that security tests realted to vf_token can be skipped by mixing and matching different VFIO PCI drivers.
Instead of trying to use the driver core to find the PF devices maintain a linked list of all PF vfio_pci_core_device's that we have called pci_enable_sriov() on.
When registering a VF just search the list to see if the PF is present and record the match permanently in the struct. PCI core locking prevents a PF from passing pci_disable_sriov() while VF drivers are attached so the VFIO owned PF becomes a static property of the VF.
In common cases where vfio does not own the PF the global list remains empty and the VF's pointer is statically NULL.
This also fixes a lockdep splat from recursive locking of the vfio_group::device_lock between vfio_device_get_from_name() and vfio_device_get_from_dev(). If the VF and PF share the same group this would deadlock.
Fixes: ff53edf6d6ab ("vfio/pci: Split the pci_driver code out of vfio_pci_core.c") Signed-off-by: Jason Gunthorpe jgg@nvidia.com Link: https://lore.kernel.org/r/0-v3-876570980634+f2e8-vfio_vf_token_jgg@nvidia.co... Signed-off-by: Alex Williamson alex.williamson@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/vfio/pci/vfio_pci_core.c | 124 ++++++++++++++++++------------- include/linux/vfio_pci_core.h | 2 + 2 files changed, 76 insertions(+), 50 deletions(-)
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c index 2e6409cc11ad..ef54ef11af55 100644 --- a/drivers/vfio/pci/vfio_pci_core.c +++ b/drivers/vfio/pci/vfio_pci_core.c @@ -36,6 +36,10 @@ static bool nointxmask; static bool disable_vga; static bool disable_idle_d3;
+/* List of PF's that vfio_pci_core_sriov_configure() has been called on */ +static DEFINE_MUTEX(vfio_pci_sriov_pfs_mutex); +static LIST_HEAD(vfio_pci_sriov_pfs); + static inline bool vfio_vga_disabled(void) { #ifdef CONFIG_VFIO_PCI_VGA @@ -434,47 +438,17 @@ void vfio_pci_core_disable(struct vfio_pci_core_device *vdev) } EXPORT_SYMBOL_GPL(vfio_pci_core_disable);
-static struct vfio_pci_core_device *get_pf_vdev(struct vfio_pci_core_device *vdev) -{ - struct pci_dev *physfn = pci_physfn(vdev->pdev); - struct vfio_device *pf_dev; - - if (!vdev->pdev->is_virtfn) - return NULL; - - pf_dev = vfio_device_get_from_dev(&physfn->dev); - if (!pf_dev) - return NULL; - - if (pci_dev_driver(physfn) != pci_dev_driver(vdev->pdev)) { - vfio_device_put(pf_dev); - return NULL; - } - - return container_of(pf_dev, struct vfio_pci_core_device, vdev); -} - -static void vfio_pci_vf_token_user_add(struct vfio_pci_core_device *vdev, int val) -{ - struct vfio_pci_core_device *pf_vdev = get_pf_vdev(vdev); - - if (!pf_vdev) - return; - - mutex_lock(&pf_vdev->vf_token->lock); - pf_vdev->vf_token->users += val; - WARN_ON(pf_vdev->vf_token->users < 0); - mutex_unlock(&pf_vdev->vf_token->lock); - - vfio_device_put(&pf_vdev->vdev); -} - void vfio_pci_core_close_device(struct vfio_device *core_vdev) { struct vfio_pci_core_device *vdev = container_of(core_vdev, struct vfio_pci_core_device, vdev);
- vfio_pci_vf_token_user_add(vdev, -1); + if (vdev->sriov_pf_core_dev) { + mutex_lock(&vdev->sriov_pf_core_dev->vf_token->lock); + WARN_ON(!vdev->sriov_pf_core_dev->vf_token->users); + vdev->sriov_pf_core_dev->vf_token->users--; + mutex_unlock(&vdev->sriov_pf_core_dev->vf_token->lock); + } vfio_spapr_pci_eeh_release(vdev->pdev); vfio_pci_core_disable(vdev);
@@ -495,7 +469,12 @@ void vfio_pci_core_finish_enable(struct vfio_pci_core_device *vdev) { vfio_pci_probe_mmaps(vdev); vfio_spapr_pci_eeh_open(vdev->pdev); - vfio_pci_vf_token_user_add(vdev, 1); + + if (vdev->sriov_pf_core_dev) { + mutex_lock(&vdev->sriov_pf_core_dev->vf_token->lock); + vdev->sriov_pf_core_dev->vf_token->users++; + mutex_unlock(&vdev->sriov_pf_core_dev->vf_token->lock); + } } EXPORT_SYMBOL_GPL(vfio_pci_core_finish_enable);
@@ -1603,11 +1582,8 @@ static int vfio_pci_validate_vf_token(struct vfio_pci_core_device *vdev, * * If the VF token is provided but unused, an error is generated. */ - if (!vdev->pdev->is_virtfn && !vdev->vf_token && !vf_token) - return 0; /* No VF token provided or required */ - if (vdev->pdev->is_virtfn) { - struct vfio_pci_core_device *pf_vdev = get_pf_vdev(vdev); + struct vfio_pci_core_device *pf_vdev = vdev->sriov_pf_core_dev; bool match;
if (!pf_vdev) { @@ -1620,7 +1596,6 @@ static int vfio_pci_validate_vf_token(struct vfio_pci_core_device *vdev, }
if (!vf_token) { - vfio_device_put(&pf_vdev->vdev); pci_info_ratelimited(vdev->pdev, "VF token required to access device\n"); return -EACCES; @@ -1630,8 +1605,6 @@ static int vfio_pci_validate_vf_token(struct vfio_pci_core_device *vdev, match = uuid_equal(uuid, &pf_vdev->vf_token->uuid); mutex_unlock(&pf_vdev->vf_token->lock);
- vfio_device_put(&pf_vdev->vdev); - if (!match) { pci_info_ratelimited(vdev->pdev, "Incorrect VF token provided for device\n"); @@ -1752,8 +1725,30 @@ static int vfio_pci_bus_notifier(struct notifier_block *nb, static int vfio_pci_vf_init(struct vfio_pci_core_device *vdev) { struct pci_dev *pdev = vdev->pdev; + struct vfio_pci_core_device *cur; + struct pci_dev *physfn; int ret;
+ if (pdev->is_virtfn) { + /* + * If this VF was created by our vfio_pci_core_sriov_configure() + * then we can find the PF vfio_pci_core_device now, and due to + * the locking in pci_disable_sriov() it cannot change until + * this VF device driver is removed. + */ + physfn = pci_physfn(vdev->pdev); + mutex_lock(&vfio_pci_sriov_pfs_mutex); + list_for_each_entry(cur, &vfio_pci_sriov_pfs, sriov_pfs_item) { + if (cur->pdev == physfn) { + vdev->sriov_pf_core_dev = cur; + break; + } + } + mutex_unlock(&vfio_pci_sriov_pfs_mutex); + return 0; + } + + /* Not a SRIOV PF */ if (!pdev->is_physfn) return 0;
@@ -1825,6 +1820,7 @@ void vfio_pci_core_init_device(struct vfio_pci_core_device *vdev, INIT_LIST_HEAD(&vdev->ioeventfds_list); mutex_init(&vdev->vma_lock); INIT_LIST_HEAD(&vdev->vma_list); + INIT_LIST_HEAD(&vdev->sriov_pfs_item); init_rwsem(&vdev->memory_lock); } EXPORT_SYMBOL_GPL(vfio_pci_core_init_device); @@ -1916,7 +1912,7 @@ void vfio_pci_core_unregister_device(struct vfio_pci_core_device *vdev) { struct pci_dev *pdev = vdev->pdev;
- pci_disable_sriov(pdev); + vfio_pci_core_sriov_configure(pdev, 0);
vfio_unregister_group_dev(&vdev->vdev);
@@ -1954,21 +1950,49 @@ static pci_ers_result_t vfio_pci_aer_err_detected(struct pci_dev *pdev,
int vfio_pci_core_sriov_configure(struct pci_dev *pdev, int nr_virtfn) { + struct vfio_pci_core_device *vdev; struct vfio_device *device; int ret = 0;
+ device_lock_assert(&pdev->dev); + device = vfio_device_get_from_dev(&pdev->dev); if (!device) return -ENODEV;
- if (nr_virtfn == 0) - pci_disable_sriov(pdev); - else + vdev = container_of(device, struct vfio_pci_core_device, vdev); + + if (nr_virtfn) { + mutex_lock(&vfio_pci_sriov_pfs_mutex); + /* + * The thread that adds the vdev to the list is the only thread + * that gets to call pci_enable_sriov() and we will only allow + * it to be called once without going through + * pci_disable_sriov() + */ + if (!list_empty(&vdev->sriov_pfs_item)) { + ret = -EINVAL; + goto out_unlock; + } + list_add_tail(&vdev->sriov_pfs_item, &vfio_pci_sriov_pfs); + mutex_unlock(&vfio_pci_sriov_pfs_mutex); ret = pci_enable_sriov(pdev, nr_virtfn); + if (ret) + goto out_del; + ret = nr_virtfn; + goto out_put; + }
- vfio_device_put(device); + pci_disable_sriov(pdev);
- return ret < 0 ? ret : nr_virtfn; +out_del: + mutex_lock(&vfio_pci_sriov_pfs_mutex); + list_del_init(&vdev->sriov_pfs_item); +out_unlock: + mutex_unlock(&vfio_pci_sriov_pfs_mutex); +out_put: + vfio_device_put(device); + return ret; } EXPORT_SYMBOL_GPL(vfio_pci_core_sriov_configure);
diff --git a/include/linux/vfio_pci_core.h b/include/linux/vfio_pci_core.h index ae6f4838ab75..6e5db4edc335 100644 --- a/include/linux/vfio_pci_core.h +++ b/include/linux/vfio_pci_core.h @@ -133,6 +133,8 @@ struct vfio_pci_core_device { struct mutex ioeventfds_lock; struct list_head ioeventfds_list; struct vfio_pci_vf_token *vf_token; + struct list_head sriov_pfs_item; + struct vfio_pci_core_device *sriov_pf_core_dev; struct notifier_block nb; struct mutex vma_lock; struct list_head vma_list;
From: Antoine Tenart atenart@kernel.org
[ Upstream commit 968a1a5d6541cd24e37dadc1926eab9c10aeb09b ]
Commit 5337824f4dc4 ("net: annotate accesses to queue->trans_start") introduced a new helper, txq_trans_cond_update, to update queue->trans_start using WRITE_ONCE. One snippet in drivers/net/tun.c was missed, as it was introduced roughly at the same time.
Fixes: 5337824f4dc4 ("net: annotate accesses to queue->trans_start") Cc: Eric Dumazet edumazet@google.com Signed-off-by: Antoine Tenart atenart@kernel.org Reviewed-by: Eric Dumazet edumazet@google.com Link: https://lore.kernel.org/r/20220412135852.466386-1-atenart@kernel.org Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/tun.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/tun.c b/drivers/net/tun.c index de999e0fedbc..aa78d7e00289 100644 --- a/drivers/net/tun.c +++ b/drivers/net/tun.c @@ -1106,7 +1106,7 @@ static netdev_tx_t tun_net_xmit(struct sk_buff *skb, struct net_device *dev)
/* NETIF_F_LLTX requires to do our own update of trans_start */ queue = netdev_get_tx_queue(dev, txq); - queue->trans_start = jiffies; + txq_trans_cond_update(queue);
/* Notify and wake up reader process */ if (tfile->flags & TUN_FASYNC)
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit 00fa91bc9cc2a9d340f963af5e457610ad4b2f9c ]
When the device tree has 2 CPU ports defined, a single one is active (has any dp->cpu_dp pointers point to it). Yet the second one is still a CPU port, and DSA still calls ->change_tag_protocol on it.
On the NXP LS1028A, the CPU ports are ports 4 and 5. Port 4 is the active CPU port and port 5 is inactive.
After the following commands:
# Initial setting cat /sys/class/net/eno2/dsa/tagging ocelot echo ocelot-8021q > /sys/class/net/eno2/dsa/tagging echo ocelot > /sys/class/net/eno2/dsa/tagging
traffic is now broken, because the driver has moved the NPI port from port 4 to port 5, unbeknown to DSA.
The problem can be avoided by detecting that the second CPU port is unused, and not doing anything for it. Further rework will be needed when proper support for multiple CPU ports is added.
Treat this as a bug and prepare current kernels to work in single-CPU mode with multiple-CPU DT blobs.
Fixes: adb3dccf090b ("net: dsa: felix: convert to the new .change_tag_protocol DSA API") Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Link: https://lore.kernel.org/r/20220412172209.2531865-1-vladimir.oltean@nxp.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/ocelot/felix.c | 23 +++++++++++++++++++++++ 1 file changed, 23 insertions(+)
diff --git a/drivers/net/dsa/ocelot/felix.c b/drivers/net/dsa/ocelot/felix.c index 9957772201d5..c414d9e9d7c0 100644 --- a/drivers/net/dsa/ocelot/felix.c +++ b/drivers/net/dsa/ocelot/felix.c @@ -599,6 +599,8 @@ static int felix_change_tag_protocol(struct dsa_switch *ds, int cpu, struct ocelot *ocelot = ds->priv; struct felix *felix = ocelot_to_felix(ocelot); enum dsa_tag_protocol old_proto = felix->tag_proto; + bool cpu_port_active = false; + struct dsa_port *dp; int err;
if (proto != DSA_TAG_PROTO_SEVILLE && @@ -606,6 +608,27 @@ static int felix_change_tag_protocol(struct dsa_switch *ds, int cpu, proto != DSA_TAG_PROTO_OCELOT_8021Q) return -EPROTONOSUPPORT;
+ /* We don't support multiple CPU ports, yet the DT blob may have + * multiple CPU ports defined. The first CPU port is the active one, + * the others are inactive. In this case, DSA will call + * ->change_tag_protocol() multiple times, once per CPU port. + * Since we implement the tagging protocol change towards "ocelot" or + * "seville" as effectively initializing the NPI port, what we are + * doing is effectively changing who the NPI port is to the last @cpu + * argument passed, which is an unused DSA CPU port and not the one + * that should actively pass traffic. + * Suppress DSA's calls on CPU ports that are inactive. + */ + dsa_switch_for_each_user_port(dp, ds) { + if (dp->cpu_dp->index == cpu) { + cpu_port_active = true; + break; + } + } + + if (!cpu_port_active) + return 0; + felix_del_tag_protocol(ds, cpu, old_proto);
err = felix_set_tag_protocol(ds, cpu, proto);
From: Jeremy Linton jeremy.linton@arm.com
[ Upstream commit 2df3fc4a84e917a422935cc5bae18f43f9955d31 ]
It turns out after digging deeper into this bug, that it was being triggered by GCC12 failing to call the bcmgenet_enable_dma() routine. Given that a gcc12 fix has been merged [1] and the genet driver now works properly when built with gcc12, this commit should be reverted.
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105160 https://gcc.gnu.org/git/?p=gcc.git%3Ba=commit%3Bh=aabb9a261ef060cf24fd626713...
Fixes: 8d3ea3d402db ("net: bcmgenet: Use stronger register read/writes to assure ordering") Signed-off-by: Jeremy Linton jeremy.linton@arm.com Acked-by: Florian Fainelli f.fainelli@gmail.com Link: https://lore.kernel.org/r/20220412210420.1129430-1-jeremy.linton@arm.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/broadcom/genet/bcmgenet.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/genet/bcmgenet.c b/drivers/net/ethernet/broadcom/genet/bcmgenet.c index bd5998012a87..2da804f84b48 100644 --- a/drivers/net/ethernet/broadcom/genet/bcmgenet.c +++ b/drivers/net/ethernet/broadcom/genet/bcmgenet.c @@ -76,7 +76,7 @@ static inline void bcmgenet_writel(u32 value, void __iomem *offset) if (IS_ENABLED(CONFIG_MIPS) && IS_ENABLED(CONFIG_CPU_BIG_ENDIAN)) __raw_writel(value, offset); else - writel(value, offset); + writel_relaxed(value, offset); }
static inline u32 bcmgenet_readl(void __iomem *offset) @@ -84,7 +84,7 @@ static inline u32 bcmgenet_readl(void __iomem *offset) if (IS_ENABLED(CONFIG_MIPS) && IS_ENABLED(CONFIG_CPU_BIG_ENDIAN)) return __raw_readl(offset); else - return readl(offset); + return readl_relaxed(offset); }
static inline void dmadesc_set_length_status(struct bcmgenet_priv *priv,
From: Ming Lei ming.lei@redhat.com
[ Upstream commit 8535c0185d14ea41f0efd6a357961b05daf6687e ]
Unit of bio->bi_iter.bi_size is bytes, but unit of offset/size is sector.
Fix the above issue in checking offset/size in bio_trim().
Fixes: e83502ca5f1e ("block: fix argument type of bio_trim()") Cc: Chaitanya Kulkarni chaitanya.kulkarni@wdc.com Signed-off-by: Ming Lei ming.lei@redhat.com Link: https://lore.kernel.org/r/20220414084443.1736850-1-ming.lei@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- block/bio.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/bio.c b/block/bio.c index 1be1e360967d..342b1cf5d713 100644 --- a/block/bio.c +++ b/block/bio.c @@ -1570,7 +1570,7 @@ EXPORT_SYMBOL(bio_split); void bio_trim(struct bio *bio, sector_t offset, sector_t size) { if (WARN_ON_ONCE(offset > BIO_MAX_SECTORS || size > BIO_MAX_SECTORS || - offset + size > bio->bi_iter.bi_size)) + offset + size > bio_sectors(bio))) return;
size <<= 9;
From: Ming Lei ming.lei@redhat.com
[ Upstream commit 3e3876d322aef82416ecc496a4d4a587e0fdf7a3 ]
When poll request is timed out, it is removed from the poll list, but not completed, so the request is leaked, and never get chance to complete.
Fix the issue by ending it in timeout handler.
Fixes: 0a593fbbc245 ("null_blk: poll queue support") Signed-off-by: Ming Lei ming.lei@redhat.com Link: https://lore.kernel.org/r/20220413084836.1571995-1-ming.lei@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/null_blk/main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c index 13004beb48ca..233577b14141 100644 --- a/drivers/block/null_blk/main.c +++ b/drivers/block/null_blk/main.c @@ -1606,7 +1606,7 @@ static enum blk_eh_timer_return null_timeout_rq(struct request *rq, bool res) * Only fake timeouts need to execute blk_mq_complete_request() here. */ cmd->error = BLK_STS_TIMEOUT; - if (cmd->fake_timeout) + if (cmd->fake_timeout || hctx->type == HCTX_TYPE_POLL) blk_mq_complete_request(rq); return BLK_EH_DONE; }
From: Jens Axboe axboe@kernel.dk
[ Upstream commit 701521403cfb228536b3947035c8a6eca40d8e58 ]
We need to either restore creds properly if we fail on the file assignment, or just do the file assignment first instead. Let's do the latter as it's simpler, should make no difference here for file assignment.
Link: https://lore.kernel.org/lkml/000000000000a7edb305dca75a50@google.com/ Reported-by: syzbot+60c52ca98513a8760a91@syzkaller.appspotmail.com Fixes: 6bf9c47a3989 ("io_uring: defer file assignment") Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/io_uring.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c index 7a652c8eeed2..6f93bff7633c 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -6729,13 +6729,14 @@ static int io_issue_sqe(struct io_kiocb *req, unsigned int issue_flags) const struct cred *creds = NULL; int ret;
+ if (unlikely(!io_assign_file(req, issue_flags))) + return -EBADF; + if (unlikely((req->flags & REQ_F_CREDS) && req->creds != current_cred())) creds = override_creds(req->creds);
if (!io_op_defs[req->opcode].audit_skip) audit_uring_entry(req->opcode); - if (unlikely(!io_assign_file(req, issue_flags))) - return -EBADF;
switch (req->opcode) { case IORING_OP_NOP:
From: Nicholas Piggin npiggin@gmail.com
[ Upstream commit c7fa848ff01dad9ed3146a6b1a7d3622131bcedd ]
When new work is created that requires attention from the hypervisor (e.g., to inject an interrupt into the guest), fast_vcpu_kick is used to pull the target vcpu out of the guest if it may have been running.
Therefore the work creation side looks like this:
vcpu->arch.doorbell_request = 1; kvmppc_fast_vcpu_kick_hv(vcpu) { smp_mb(); cpu = vcpu->cpu; if (cpu != -1) send_ipi(cpu); }
And the guest entry side *should* look like this:
vcpu->cpu = smp_processor_id(); smp_mb(); if (vcpu->arch.doorbell_request) { // do something (abort entry or inject doorbell etc) }
But currently the store and load are flipped, so it is possible for the entry to see no doorbell pending, and the doorbell creation misses the store to set cpu, resulting lost work (or at least delayed until the next guest exit).
Fix this by reordering the entry operations and adding a smp_mb between them. The P8 path appears to have a similar race which is commented but not addressed yet.
Signed-off-by: Nicholas Piggin npiggin@gmail.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20220303053315.1056880-2-npiggin@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/kvm/book3s_hv.c | 41 +++++++++++++++++++++++++++++------- 1 file changed, 33 insertions(+), 8 deletions(-)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index 791db769080d..316f61a4cb59 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -225,6 +225,13 @@ static void kvmppc_fast_vcpu_kick_hv(struct kvm_vcpu *vcpu) int cpu; struct rcuwait *waitp;
+ /* + * rcuwait_wake_up contains smp_mb() which orders prior stores that + * create pending work vs below loads of cpu fields. The other side + * is the barrier in vcpu run that orders setting the cpu fields vs + * testing for pending work. + */ + waitp = kvm_arch_vcpu_get_wait(vcpu); if (rcuwait_wake_up(waitp)) ++vcpu->stat.generic.halt_wakeup; @@ -1089,7 +1096,7 @@ int kvmppc_pseries_do_hcall(struct kvm_vcpu *vcpu) break; } tvcpu->arch.prodded = 1; - smp_mb(); + smp_mb(); /* This orders prodded store vs ceded load */ if (tvcpu->arch.ceded) kvmppc_fast_vcpu_kick_hv(tvcpu); break; @@ -3771,6 +3778,14 @@ static noinline void kvmppc_run_core(struct kvmppc_vcore *vc) pvc = core_info.vc[sub]; pvc->pcpu = pcpu + thr; for_each_runnable_thread(i, vcpu, pvc) { + /* + * XXX: is kvmppc_start_thread called too late here? + * It updates vcpu->cpu and vcpu->arch.thread_cpu + * which are used by kvmppc_fast_vcpu_kick_hv(), but + * kick is called after new exceptions become available + * and exceptions are checked earlier than here, by + * kvmppc_core_prepare_to_enter. + */ kvmppc_start_thread(vcpu, pvc); kvmppc_create_dtl_entry(vcpu, pvc); trace_kvm_guest_enter(vcpu); @@ -4492,6 +4507,21 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit, if (need_resched() || !kvm->arch.mmu_ready) goto out;
+ vcpu->cpu = pcpu; + vcpu->arch.thread_cpu = pcpu; + vc->pcpu = pcpu; + local_paca->kvm_hstate.kvm_vcpu = vcpu; + local_paca->kvm_hstate.ptid = 0; + local_paca->kvm_hstate.fake_suspend = 0; + + /* + * Orders set cpu/thread_cpu vs testing for pending interrupts and + * doorbells below. The other side is when these fields are set vs + * kvmppc_fast_vcpu_kick_hv reading the cpu/thread_cpu fields to + * kick a vCPU to notice the pending interrupt. + */ + smp_mb(); + if (!nested) { kvmppc_core_prepare_to_enter(vcpu); if (test_bit(BOOK3S_IRQPRIO_EXTERNAL, @@ -4511,13 +4541,6 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit,
tb = mftb();
- vcpu->cpu = pcpu; - vcpu->arch.thread_cpu = pcpu; - vc->pcpu = pcpu; - local_paca->kvm_hstate.kvm_vcpu = vcpu; - local_paca->kvm_hstate.ptid = 0; - local_paca->kvm_hstate.fake_suspend = 0; - __kvmppc_create_dtl_entry(vcpu, pcpu, tb + vc->tb_offset, 0);
trace_kvm_guest_enter(vcpu); @@ -4619,6 +4642,8 @@ int kvmhv_run_single_vcpu(struct kvm_vcpu *vcpu, u64 time_limit, run->exit_reason = KVM_EXIT_INTR; vcpu->arch.ret = -EINTR; out: + vcpu->cpu = -1; + vcpu->arch.thread_cpu = -1; powerpc_local_irq_pmu_restore(flags); preempt_enable(); goto done;
From: Aurabindo Pillai aurabindo.pillai@amd.com
[ Upstream commit c5c948aa894a831f96fccd025e47186b1ee41615 ]
[Why&How] Add a dedicated AMDGPU specific ID for use with newer ASICs that support USB-C output
Signed-off-by: Aurabindo Pillai aurabindo.pillai@amd.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/ObjectID.h | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/ObjectID.h b/drivers/gpu/drm/amd/amdgpu/ObjectID.h index 5b393622f592..a0f0a17e224f 100644 --- a/drivers/gpu/drm/amd/amdgpu/ObjectID.h +++ b/drivers/gpu/drm/amd/amdgpu/ObjectID.h @@ -119,6 +119,7 @@ #define CONNECTOR_OBJECT_ID_eDP 0x14 #define CONNECTOR_OBJECT_ID_MXM 0x15 #define CONNECTOR_OBJECT_ID_LVDS_eDP 0x16 +#define CONNECTOR_OBJECT_ID_USBC 0x17
/* deleted */
From: Darrick J. Wong djwong@kernel.org
[ Upstream commit 05fd9564e9faf0f23b4676385e27d9405cef6637 ]
Since the initial introduction of (posix) fallocate back at the turn of the century, it has been possible to use this syscall to change the user-visible contents of files. This can happen by extending the file size during a preallocation, or through any of the newer modes (punch, zero range). Because the call can be used to change file contents, we should treat it like we do any other modification to a file -- update the mtime, and drop set[ug]id privileges/capabilities.
The VFS function file_modified() does all this for us if pass it a locked inode, so let's make fallocate drop permissions correctly.
Reviewed-by: Filipe Manana fdmanana@suse.com Signed-off-by: Darrick J. Wong djwong@kernel.org Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/file.c | 13 +++++++++++-- 1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c index a0179cc62913..28ddd9cf2069 100644 --- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -2918,8 +2918,9 @@ int btrfs_replace_file_extents(struct btrfs_inode *inode, return ret; }
-static int btrfs_punch_hole(struct inode *inode, loff_t offset, loff_t len) +static int btrfs_punch_hole(struct file *file, loff_t offset, loff_t len) { + struct inode *inode = file_inode(file); struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); struct btrfs_root *root = BTRFS_I(inode)->root; struct extent_state *cached_state = NULL; @@ -2951,6 +2952,10 @@ static int btrfs_punch_hole(struct inode *inode, loff_t offset, loff_t len) goto out_only_mutex; }
+ ret = file_modified(file); + if (ret) + goto out_only_mutex; + lockstart = round_up(offset, btrfs_inode_sectorsize(BTRFS_I(inode))); lockend = round_down(offset + len, btrfs_inode_sectorsize(BTRFS_I(inode))) - 1; @@ -3391,7 +3396,7 @@ static long btrfs_fallocate(struct file *file, int mode, return -EOPNOTSUPP;
if (mode & FALLOC_FL_PUNCH_HOLE) - return btrfs_punch_hole(inode, offset, len); + return btrfs_punch_hole(file, offset, len);
/* * Only trigger disk allocation, don't trigger qgroup reserve @@ -3413,6 +3418,10 @@ static long btrfs_fallocate(struct file *file, int mode, goto out; }
+ ret = file_modified(file); + if (ret) + goto out; + /* * TODO: Move these two operations after we have checked * accurate reserved space, or fallocate can still fail but
From: Josef Bacik josef@toxicpanda.com
[ Upstream commit a7d16d9a07bbcb7dcd5214a1bea75c808830bc0d ]
This is a long time leftover from when I originally added the free space inode, the point was to catch cases where we weren't honoring the NOCOW flag. However there exists a race with relocation, if we allocate our free space inode in a block group that is about to be relocated, we could trigger the COW path before the relocation has the opportunity to find the extents and delete the free space cache. In production where we have auto-relocation enabled we're seeing this WARN_ON_ONCE() around 5k times in a 2 week period, so not super common but enough that it's at the top of our metrics.
We're properly handling the error here, and with us phasing out v1 space cache anyway just drop the WARN_ON_ONCE.
Signed-off-by: Josef Bacik josef@toxicpanda.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/inode.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 85daae70afda..9547088a9306 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -1130,7 +1130,6 @@ static noinline int cow_file_range(struct btrfs_inode *inode, int ret = 0;
if (btrfs_is_free_space_inode(inode)) { - WARN_ON_ONCE(1); ret = -EINVAL; goto out_unlock; }
From: Guchun Chen guchun.chen@amd.com
[ Upstream commit 2d505453f38e18d42ba7d5428aaa17aaa7752c65 ]
Use amdgpu_bo_free_kernel instead of amdgpu_bo_unref to perform a proper cleanup of PDB bo.
v2: update subject to be more accurate
Signed-off-by: Guchun Chen guchun.chen@amd.com Reviewed-by: Christian König christian.koenig@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c index 88c1eb9ad068..34ee75cf7954 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c @@ -1684,7 +1684,7 @@ static int gmc_v9_0_sw_fini(void *handle) amdgpu_gem_force_release(adev); amdgpu_vm_manager_fini(adev); amdgpu_gart_table_vram_free(adev); - amdgpu_bo_unref(&adev->gmc.pdb0_bo); + amdgpu_bo_free_kernel(&adev->gmc.pdb0_bo, NULL, &adev->gmc.ptr_pdb0); amdgpu_bo_fini(adev);
return 0;
From: Alex Deucher alexander.deucher@amd.com
[ Upstream commit b818a5d374542ccec73dcfe578a081574029820e ]
If the GPU is passed through to a guest VM, use the PCI BAR for CPU FB access rather than the physical address of carve out. The physical address is not valid in a guest.
v2: Fix HDP handing as suggested by Michel
Reviewed-by: Christian König christian.koenig@amd.com Reviewed-by: Michel Dänzer mdaenzer@redhat.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 4 ++-- drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c | 5 +++-- drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c | 2 +- drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c | 2 +- 5 files changed, 8 insertions(+), 7 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c index b87dca6d09fa..052816f0efed 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c @@ -5678,7 +5678,7 @@ void amdgpu_device_flush_hdp(struct amdgpu_device *adev, struct amdgpu_ring *ring) { #ifdef CONFIG_X86_64 - if (adev->flags & AMD_IS_APU) + if ((adev->flags & AMD_IS_APU) && !amdgpu_passthrough(adev)) return; #endif if (adev->gmc.xgmi.connected_to_cpu) @@ -5694,7 +5694,7 @@ void amdgpu_device_invalidate_hdp(struct amdgpu_device *adev, struct amdgpu_ring *ring) { #ifdef CONFIG_X86_64 - if (adev->flags & AMD_IS_APU) + if ((adev->flags & AMD_IS_APU) && !amdgpu_passthrough(adev)) return; #endif if (adev->gmc.xgmi.connected_to_cpu) diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c index a2f8ed0e6a64..f1b794d5d87d 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v10_0.c @@ -788,7 +788,7 @@ static int gmc_v10_0_mc_init(struct amdgpu_device *adev) adev->gmc.aper_size = pci_resource_len(adev->pdev, 0);
#ifdef CONFIG_X86_64 - if (adev->flags & AMD_IS_APU) { + if ((adev->flags & AMD_IS_APU) && !amdgpu_passthrough(adev)) { adev->gmc.aper_base = adev->gfxhub.funcs->get_mc_fb_offset(adev); adev->gmc.aper_size = adev->gmc.real_vram_size; } diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c index ab8adbff9e2d..5206e2da334a 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v7_0.c @@ -381,8 +381,9 @@ static int gmc_v7_0_mc_init(struct amdgpu_device *adev) adev->gmc.aper_size = pci_resource_len(adev->pdev, 0);
#ifdef CONFIG_X86_64 - if (adev->flags & AMD_IS_APU && - adev->gmc.real_vram_size > adev->gmc.aper_size) { + if ((adev->flags & AMD_IS_APU) && + adev->gmc.real_vram_size > adev->gmc.aper_size && + !amdgpu_passthrough(adev)) { adev->gmc.aper_base = ((u64)RREG32(mmMC_VM_FB_OFFSET)) << 22; adev->gmc.aper_size = adev->gmc.real_vram_size; } diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c index 054733838292..d07d36786836 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v8_0.c @@ -581,7 +581,7 @@ static int gmc_v8_0_mc_init(struct amdgpu_device *adev) adev->gmc.aper_size = pci_resource_len(adev->pdev, 0);
#ifdef CONFIG_X86_64 - if (adev->flags & AMD_IS_APU) { + if ((adev->flags & AMD_IS_APU) && !amdgpu_passthrough(adev)) { adev->gmc.aper_base = ((u64)RREG32(mmMC_VM_FB_OFFSET)) << 22; adev->gmc.aper_size = adev->gmc.real_vram_size; } diff --git a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c index 34ee75cf7954..2fb24178eaef 100644 --- a/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gmc_v9_0.c @@ -1420,7 +1420,7 @@ static int gmc_v9_0_mc_init(struct amdgpu_device *adev) */
/* check whether both host-gpu and gpu-gpu xgmi links exist */ - if ((adev->flags & AMD_IS_APU) || + if (((adev->flags & AMD_IS_APU) && !amdgpu_passthrough(adev)) || (adev->gmc.xgmi.supported && adev->gmc.xgmi.connected_to_cpu)) { adev->gmc.aper_base =
From: Charlene Liu Charlene.Liu@amd.com
[ Upstream commit 5e8a71cf13bc9184fee915b2220be71b4c6cac74 ]
[why] for the case edid change only changed audio format. driver still need to update stream.
Reviewed-by: Alvin Lee Alvin.Lee2@amd.com Reviewed-by: Aric Cyr Aric.Cyr@amd.com Acked-by: Alex Hung alex.hung@amd.com Signed-off-by: Charlene Liu Charlene.Liu@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/dc/core/dc_resource.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c index ac3071e38e4a..f0a97b82c33a 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc_resource.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc_resource.c @@ -1667,8 +1667,8 @@ bool dc_is_stream_unchanged( if (old_stream->ignore_msa_timing_param != stream->ignore_msa_timing_param) return false;
- // Only Have Audio left to check whether it is same or not. This is a corner case for Tiled sinks - if (old_stream->audio_info.mode_count != stream->audio_info.mode_count) + /*compare audio info*/ + if (memcmp(&old_stream->audio_info, &stream->audio_info, sizeof(stream->audio_info)) != 0) return false;
return true;
From: Chiawen Huang chiawen.huang@amd.com
[ Upstream commit 7d56a154e22ffb3613fdebf83ec34d5225a22993 ]
[Why] disable/enable leads FEC mismatch between hw/sw FEC state.
[How] check FEC status to fastboot on/off.
Reviewed-by: Anthony Koo Anthony.Koo@amd.com Acked-by: Alex Hung alex.hung@amd.com Signed-off-by: Chiawen Huang chiawen.huang@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/dc/core/dc.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c index 62bc6ce88753..78e17a4af4ab 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c @@ -1495,6 +1495,10 @@ bool dc_validate_seamless_boot_timing(const struct dc *dc, if (!link->link_enc->funcs->is_dig_enabled(link->link_enc)) return false;
+ /* Check for FEC status*/ + if (link->link_enc->funcs->fec_is_active(link->link_enc)) + return false; + enc_inst = link->link_enc->funcs->get_dig_frontend(link->link_enc);
if (enc_inst == ENGINE_ID_UNKNOWN)
From: Leo (Hanghong) Ma hanghong.ma@amd.com
[ Upstream commit c9fbf6435162ed5fb7201d1d4adf6585c6a8c327 ]
[Why & How] The latest HDMI SPEC has updated the VTEM packet structure, so change the VTEM Infopacket defined in the driver side to align with the SPEC.
Reviewed-by: Chris Park Chris.Park@amd.com Acked-by: Alex Hung alex.hung@amd.com Signed-off-by: Leo (Hanghong) Ma hanghong.ma@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../gpu/drm/amd/display/modules/info_packet/info_packet.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/modules/info_packet/info_packet.c b/drivers/gpu/drm/amd/display/modules/info_packet/info_packet.c index 57f198de5e2c..4e075b01d48b 100644 --- a/drivers/gpu/drm/amd/display/modules/info_packet/info_packet.c +++ b/drivers/gpu/drm/amd/display/modules/info_packet/info_packet.c @@ -100,7 +100,8 @@ enum vsc_packet_revision { //PB7 = MD0 #define MASK_VTEM_MD0__VRR_EN 0x01 #define MASK_VTEM_MD0__M_CONST 0x02 -#define MASK_VTEM_MD0__RESERVED2 0x0C +#define MASK_VTEM_MD0__QMS_EN 0x04 +#define MASK_VTEM_MD0__RESERVED2 0x08 #define MASK_VTEM_MD0__FVA_FACTOR_M1 0xF0
//MD1 @@ -109,7 +110,7 @@ enum vsc_packet_revision { //MD2 #define MASK_VTEM_MD2__BASE_REFRESH_RATE_98 0x03 #define MASK_VTEM_MD2__RB 0x04 -#define MASK_VTEM_MD2__RESERVED3 0xF8 +#define MASK_VTEM_MD2__NEXT_TFR 0xF8
//MD3 #define MASK_VTEM_MD3__BASE_REFRESH_RATE_07 0xFF
From: Tushar Patel tushar.patel@amd.com
[ Upstream commit b7dfbd2e601f3fee545bc158feceba4f340fe7cf ]
Compute-only GPUs have more than 8 VMIDs allocated to KFD. Fix this by passing correct number of VMIDs to HWS
v2: squash in warning fix (Alex)
Signed-off-by: Tushar Patel tushar.patel@amd.com Reviewed-by: Felix Kuehling felix.kuehling@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c | 2 +- drivers/gpu/drm/amd/amdkfd/kfd_device.c | 11 +++-------- 2 files changed, 4 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c index 284892b2d3b4..c853266957ce 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_drv.c @@ -686,7 +686,7 @@ MODULE_PARM_DESC(sched_policy, * Maximum number of processes that HWS can schedule concurrently. The maximum is the * number of VMIDs assigned to the HWS, which is also the default. */ -int hws_max_conc_proc = 8; +int hws_max_conc_proc = -1; module_param(hws_max_conc_proc, int, 0444); MODULE_PARM_DESC(hws_max_conc_proc, "Max # processes HWS can execute concurrently when sched_policy=0 (0 = no concurrency, #VMIDs for KFD = Maximum(default))"); diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_device.c b/drivers/gpu/drm/amd/amdkfd/kfd_device.c index 2b65d0acae2c..2fdbe2f475e4 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_device.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_device.c @@ -480,15 +480,10 @@ bool kgd2kfd_device_init(struct kfd_dev *kfd, }
/* Verify module parameters regarding mapped process number*/ - if ((hws_max_conc_proc < 0) - || (hws_max_conc_proc > kfd->vm_info.vmid_num_kfd)) { - dev_err(kfd_device, - "hws_max_conc_proc %d must be between 0 and %d, use %d instead\n", - hws_max_conc_proc, kfd->vm_info.vmid_num_kfd, - kfd->vm_info.vmid_num_kfd); + if (hws_max_conc_proc >= 0) + kfd->max_proc_per_quantum = min((u32)hws_max_conc_proc, kfd->vm_info.vmid_num_kfd); + else kfd->max_proc_per_quantum = kfd->vm_info.vmid_num_kfd; - } else - kfd->max_proc_per_quantum = hws_max_conc_proc;
/* calculate max size of mqds needed for queues */ size = max_num_of_queues_per_device *
From: Tianci Yin tianci.yin@amd.com
[ Upstream commit 6ea239adc2a712eb318f04f5c29b018ba65ea38a ]
Prior to disabling dpg, VCN need unpausing dpg mode, or VCN will hang in S3 resuming.
Reviewed-by: James Zhu James.Zhu@amd.com Signed-off-by: Tianci Yin tianci.yin@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c index 0ce2a7aa400b..ad9bfc772bdf 100644 --- a/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c +++ b/drivers/gpu/drm/amd/amdgpu/vcn_v3_0.c @@ -1474,8 +1474,11 @@ static int vcn_v3_0_start_sriov(struct amdgpu_device *adev)
static int vcn_v3_0_stop_dpg_mode(struct amdgpu_device *adev, int inst_idx) { + struct dpg_pause_state state = {.fw_based = VCN_DPG_STATE__UNPAUSE}; uint32_t tmp;
+ vcn_v3_0_pause_dpg_mode(adev, 0, &state); + /* Wait for power status to be 1 */ SOC15_WAIT_ON_RREG(VCN, inst_idx, mmUVD_POWER_STATUS, 1, UVD_POWER_STATUS__UVD_POWER_STATUS_MASK);
From: QintaoShen unSimple1993@163.com
[ Upstream commit ebbb7bb9e80305820dc2328a371c1b35679f2667 ]
As the kmalloc_array() may return null, the 'event_waiters[i].wait' would lead to null-pointer dereference. Therefore, it is better to check the return value of kmalloc_array() to avoid this confusion.
Signed-off-by: QintaoShen unSimple1993@163.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdkfd/kfd_events.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdkfd/kfd_events.c b/drivers/gpu/drm/amd/amdkfd/kfd_events.c index afe72dd11325..6ca7e12bdab8 100644 --- a/drivers/gpu/drm/amd/amdkfd/kfd_events.c +++ b/drivers/gpu/drm/amd/amdkfd/kfd_events.c @@ -531,6 +531,8 @@ static struct kfd_event_waiter *alloc_event_waiters(uint32_t num_events) event_waiters = kmalloc_array(num_events, sizeof(struct kfd_event_waiter), GFP_KERNEL); + if (!event_waiters) + return NULL;
for (i = 0; (event_waiters) && (i < num_events) ; i++) { init_wait(&event_waiters[i].wait);
From: Andrea Parri (Microsoft) parri.andrea@gmail.com
[ Upstream commit 9f8b577f7b43b2170628d6c537252785dcc2dcea ]
hv_panic_page might contain guest-sensitive information, do not dump it over to Hyper-V by default in isolated guests.
While at it, update some comments in hyperv_{panic,die}_event().
Reported-by: Dexuan Cui decui@microsoft.com Signed-off-by: Andrea Parri (Microsoft) parri.andrea@gmail.com Reviewed-by: Dexuan Cui decui@microsoft.com Link: https://lore.kernel.org/r/20220301141135.2232-1-parri.andrea@gmail.com Signed-off-by: Wei Liu wei.liu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hv/vmbus_drv.c | 18 ++++++++++++------ 1 file changed, 12 insertions(+), 6 deletions(-)
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index 4bea1dfa41cd..6c057c76c2ca 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -77,8 +77,8 @@ static int hyperv_panic_event(struct notifier_block *nb, unsigned long val,
/* * Hyper-V should be notified only once about a panic. If we will be - * doing hyperv_report_panic_msg() later with kmsg data, don't do - * the notification here. + * doing hv_kmsg_dump() with kmsg data later, don't do the notification + * here. */ if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE && hyperv_report_reg()) { @@ -100,8 +100,8 @@ static int hyperv_die_event(struct notifier_block *nb, unsigned long val,
/* * Hyper-V should be notified only once about a panic. If we will be - * doing hyperv_report_panic_msg() later with kmsg data, don't do - * the notification here. + * doing hv_kmsg_dump() with kmsg data later, don't do the notification + * here. */ if (hyperv_report_reg()) hyperv_report_panic(regs, val, true); @@ -1546,14 +1546,20 @@ static int vmbus_bus_init(void) if (ret) goto err_connect;
+ if (hv_is_isolation_supported()) + sysctl_record_panic_msg = 0; + /* * Only register if the crash MSRs are available */ if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE) { u64 hyperv_crash_ctl; /* - * Sysctl registration is not fatal, since by default - * reporting is enabled. + * Panic message recording (sysctl_record_panic_msg) + * is enabled by default in non-isolated guests and + * disabled by default in isolated guests; the panic + * message recording won't be available in isolated + * guests should the following registration fail. */ hv_ctl_table_hdr = register_sysctl_table(hv_root_table); if (!hv_ctl_table_hdr)
From: Michael Kelley mikelley@microsoft.com
[ Upstream commit 37200078ed6aa2ac3c88a01a64996133dccfdd34 ]
VMbus synthetic devices are not represented in the ACPI DSDT -- only the top level VMbus device is represented. As a result, on ARM64 coherence information in the _CCA method is not specified for synthetic devices, so they default to not hardware coherent. Drivers for some of these synthetic devices have been recently updated to use the standard DMA APIs, and they are incurring extra overhead of unneeded software coherence management.
Fix this by propagating coherence information from the VMbus node in ACPI to the individual synthetic devices. There's no effect on x86/x64 where devices are always hardware coherent.
Signed-off-by: Michael Kelley mikelley@microsoft.com Acked-by: Robin Murphy robin.murphy@arm.com Link: https://lore.kernel.org/r/1648138492-2191-2-git-send-email-mikelley@microsof... Signed-off-by: Wei Liu wei.liu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hv/hv_common.c | 11 +++++++++++ drivers/hv/vmbus_drv.c | 31 +++++++++++++++++++++++++++++++ include/asm-generic/mshyperv.h | 1 + 3 files changed, 43 insertions(+)
diff --git a/drivers/hv/hv_common.c b/drivers/hv/hv_common.c index 181d16bbf49d..820e81406251 100644 --- a/drivers/hv/hv_common.c +++ b/drivers/hv/hv_common.c @@ -20,6 +20,7 @@ #include <linux/panic_notifier.h> #include <linux/ptrace.h> #include <linux/slab.h> +#include <linux/dma-map-ops.h> #include <asm/hyperv-tlfs.h> #include <asm/mshyperv.h>
@@ -216,6 +217,16 @@ bool hv_query_ext_cap(u64 cap_query) } EXPORT_SYMBOL_GPL(hv_query_ext_cap);
+void hv_setup_dma_ops(struct device *dev, bool coherent) +{ + /* + * Hyper-V does not offer a vIOMMU in the guest + * VM, so pass 0/NULL for the IOMMU settings + */ + arch_setup_dma_ops(dev, 0, 0, NULL, coherent); +} +EXPORT_SYMBOL_GPL(hv_setup_dma_ops); + bool hv_is_hibernation_supported(void) { return !hv_root_partition && acpi_sleep_state_supported(ACPI_STATE_S4); diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c index 6c057c76c2ca..3cd0d3a44fa2 100644 --- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -920,6 +920,21 @@ static int vmbus_probe(struct device *child_device) return ret; }
+/* + * vmbus_dma_configure -- Configure DMA coherence for VMbus device + */ +static int vmbus_dma_configure(struct device *child_device) +{ + /* + * On ARM64, propagate the DMA coherence setting from the top level + * VMbus ACPI device to the child VMbus device being added here. + * On x86/x64 coherence is assumed and these calls have no effect. + */ + hv_setup_dma_ops(child_device, + device_get_dma_attr(&hv_acpi_dev->dev) == DEV_DMA_COHERENT); + return 0; +} + /* * vmbus_remove - Remove a vmbus device */ @@ -1040,6 +1055,7 @@ static struct bus_type hv_bus = { .remove = vmbus_remove, .probe = vmbus_probe, .uevent = vmbus_uevent, + .dma_configure = vmbus_dma_configure, .dev_groups = vmbus_dev_groups, .drv_groups = vmbus_drv_groups, .bus_groups = vmbus_bus_groups, @@ -2435,6 +2451,21 @@ static int vmbus_acpi_add(struct acpi_device *device)
hv_acpi_dev = device;
+ /* + * Older versions of Hyper-V for ARM64 fail to include the _CCA + * method on the top level VMbus device in the DSDT. But devices + * are hardware coherent in all current Hyper-V use cases, so fix + * up the ACPI device to behave as if _CCA is present and indicates + * hardware coherence. + */ + ACPI_COMPANION_SET(&device->dev, device); + if (IS_ENABLED(CONFIG_ACPI_CCA_REQUIRED) && + device_get_dma_attr(&device->dev) == DEV_DMA_NOT_SUPPORTED) { + pr_info("No ACPI _CCA found; assuming coherent device I/O\n"); + device->flags.cca_seen = true; + device->flags.coherent_dma = true; + } + result = acpi_walk_resources(device->handle, METHOD_NAME__CRS, vmbus_walk_resources, NULL);
diff --git a/include/asm-generic/mshyperv.h b/include/asm-generic/mshyperv.h index c08758b6b364..c05d2ce9b6cd 100644 --- a/include/asm-generic/mshyperv.h +++ b/include/asm-generic/mshyperv.h @@ -269,6 +269,7 @@ bool hv_isolation_type_snp(void); u64 hv_ghcb_hypercall(u64 control, void *input, void *output, u32 input_size); void hyperv_cleanup(void); bool hv_query_ext_cap(u64 cap_query); +void hv_setup_dma_ops(struct device *dev, bool coherent); void *hv_map_memory(void *addr, unsigned long size); void hv_unmap_memory(void *addr); #else /* CONFIG_HYPERV */
From: Michael Kelley mikelley@microsoft.com
[ Upstream commit 8d21732475c637c7efcdb91dc927a4c594e97898 ]
PCI pass-thru devices in a Hyper-V VM are represented as a VMBus device and as a PCI device. The coherence of the VMbus device is set based on the VMbus node in ACPI, but the PCI device has no ACPI node and defaults to not hardware coherent. This results in extra software coherence management overhead on ARM64 when devices are hardware coherent.
Fix this by setting up the PCI host bus so that normal PCI mechanisms will propagate the coherence of the VMbus device to the PCI device. There's no effect on x86/x64 where devices are always hardware coherent.
Signed-off-by: Michael Kelley mikelley@microsoft.com Acked-by: Boqun Feng boqun.feng@gmail.com Acked-by: Robin Murphy robin.murphy@arm.com Link: https://lore.kernel.org/r/1648138492-2191-3-git-send-email-mikelley@microsof... Signed-off-by: Wei Liu wei.liu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/controller/pci-hyperv.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c index ae0bc2fee4ca..88b3b56d0522 100644 --- a/drivers/pci/controller/pci-hyperv.c +++ b/drivers/pci/controller/pci-hyperv.c @@ -3404,6 +3404,15 @@ static int hv_pci_probe(struct hv_device *hdev, hbus->bridge->domain_nr = dom; #ifdef CONFIG_X86 hbus->sysdata.domain = dom; +#elif defined(CONFIG_ARM64) + /* + * Set the PCI bus parent to be the corresponding VMbus + * device. Then the VMbus device will be assigned as the + * ACPI companion in pcibios_root_bridge_prepare() and + * pci_dma_configure() will propagate device coherence + * information to devices created on the bus. + */ + hbus->sysdata.parent = hdev->device.parent; #endif
hbus->hdev = hdev;
From: Michael Kelley mikelley@microsoft.com
[ Upstream commit b6cae15b5710c8097aad26a2e5e752c323ee5348 ]
When reading a packet from a host-to-guest ring buffer, there is no memory barrier between reading the write index (to see if there is a packet to read) and reading the contents of the packet. The Hyper-V host uses store-release when updating the write index to ensure that writes of the packet data are completed first. On the guest side, the processor can reorder and read the packet data before the write index, and sometimes get stale packet data. Getting such stale packet data has been observed in a reproducible case in a VM on ARM64.
Fix this by using virt_load_acquire() to read the write index, ensuring that reads of the packet data cannot be reordered before it. Preventing such reordering is logically correct, and with this change, getting stale data can no longer be reproduced.
Signed-off-by: Michael Kelley mikelley@microsoft.com Reviewed-by: Andrea Parri (Microsoft) parri.andrea@gmail.com Link: https://lore.kernel.org/r/1648394710-33480-1-git-send-email-mikelley@microso... Signed-off-by: Wei Liu wei.liu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hv/ring_buffer.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/drivers/hv/ring_buffer.c b/drivers/hv/ring_buffer.c index 71efacb90965..3d215d9dec43 100644 --- a/drivers/hv/ring_buffer.c +++ b/drivers/hv/ring_buffer.c @@ -439,7 +439,16 @@ int hv_ringbuffer_read(struct vmbus_channel *channel, static u32 hv_pkt_iter_avail(const struct hv_ring_buffer_info *rbi) { u32 priv_read_loc = rbi->priv_read_index; - u32 write_loc = READ_ONCE(rbi->ring_buffer->write_index); + u32 write_loc; + + /* + * The Hyper-V host writes the packet data, then uses + * store_release() to update the write_index. Use load_acquire() + * here to prevent loads of the packet data from being re-ordered + * before the read of the write_index and potentially getting + * stale data. + */ + write_loc = virt_load_acquire(&rbi->ring_buffer->write_index);
if (write_loc >= priv_read_loc) return write_loc - priv_read_loc;
From: Xiaoguang Wang xiaoguang.wang@linux.alibaba.com
[ Upstream commit a6968f7a367f128d120447360734344d5a3d5336 ]
tcmu_try_get_data_page() looks up pages under cmdr_lock, but it does not take refcount properly and just returns page pointer. When tcmu_try_get_data_page() returns, the returned page may have been freed by tcmu_blocks_release().
We need to get_page() under cmdr_lock to avoid concurrent tcmu_blocks_release().
Link: https://lore.kernel.org/r/20220311132206.24515-1-xiaoguang.wang@linux.alibab... Reviewed-by: Bodo Stroesser bostroesser@gmail.com Signed-off-by: Xiaoguang Wang xiaoguang.wang@linux.alibaba.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/target/target_core_user.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/target/target_core_user.c b/drivers/target/target_core_user.c index 7b2a89a67cdb..06a5c4086551 100644 --- a/drivers/target/target_core_user.c +++ b/drivers/target/target_core_user.c @@ -1820,6 +1820,7 @@ static struct page *tcmu_try_get_data_page(struct tcmu_dev *udev, uint32_t dpi) mutex_lock(&udev->cmdr_lock); page = xa_load(&udev->data_pages, dpi); if (likely(page)) { + get_page(page); mutex_unlock(&udev->cmdr_lock); return page; } @@ -1876,6 +1877,7 @@ static vm_fault_t tcmu_vma_fault(struct vm_fault *vmf) /* For the vmalloc()ed cmd area pages */ addr = (void *)(unsigned long)info->mem[mi].addr + offset; page = vmalloc_to_page(addr); + get_page(page); } else { uint32_t dpi;
@@ -1886,7 +1888,6 @@ static vm_fault_t tcmu_vma_fault(struct vm_fault *vmf) return VM_FAULT_SIGBUS; }
- get_page(page); vmf->page = page; return 0; }
From: James Smart jsmart2021@gmail.com
[ Upstream commit 35ed9613d83f3c1f011877d591fd7d36f2666106 ]
Following EEH errors, the driver can crash or hang when deleting the localport or when attempting to unload.
The EEH handlers in the driver did not notify the NVMe-FC transport before tearing the driver down. This was delayed until the resume steps. This worked for SCSI because lpfc_block_scsi() would notify the scsi_fc_transport that the target was not available but it would not clean up all the references to the ndlp.
The SLI3 prep for dev reset handler did the lpfc_offline_prep() and lpfc_offline() calls to get the port stopped before restarting. The SLI4 version of the prep for dev reset just destroyed the queues and did not stop NVMe from continuing. Also because the port was not really stopped the localport destroy would hang because the transport was still waiting for I/O. Additionally, a devloss tmo can fire and post events to a stopped worker thread creating another hang condition.
lpfc_sli4_prep_dev_for_reset() is modified to call lpfc_offline_prep() and lpfc_offline() rather than just lpfc_scsi_dev_block() to ensure both SCSI and NVMe transports are notified to block I/O to the driver.
Logic is added to devloss handler and worker thread to clean up ndlp references and quiesce appropriately.
Link: https://lore.kernel.org/r/20220317032737.45308-2-jsmart2021@gmail.com Co-developed-by: Justin Tee justin.tee@broadcom.com Signed-off-by: Justin Tee justin.tee@broadcom.com Signed-off-by: James Smart jsmart2021@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/lpfc/lpfc.h | 7 +- drivers/scsi/lpfc/lpfc_crtn.h | 3 + drivers/scsi/lpfc/lpfc_hbadisc.c | 119 +++++++++++++++++++++++++------ drivers/scsi/lpfc/lpfc_init.c | 60 ++++++++++------ drivers/scsi/lpfc/lpfc_nvme.c | 11 ++- drivers/scsi/lpfc/lpfc_sli.c | 15 ++-- 6 files changed, 157 insertions(+), 58 deletions(-)
diff --git a/drivers/scsi/lpfc/lpfc.h b/drivers/scsi/lpfc/lpfc.h index 98cabe09c040..8748c5996478 100644 --- a/drivers/scsi/lpfc/lpfc.h +++ b/drivers/scsi/lpfc/lpfc.h @@ -897,6 +897,11 @@ enum lpfc_irq_chann_mode { NHT_MODE, };
+enum lpfc_hba_bit_flags { + FABRIC_COMANDS_BLOCKED, + HBA_PCI_ERR, +}; + struct lpfc_hba { /* SCSI interface function jump table entries */ struct lpfc_io_buf * (*lpfc_get_scsi_buf) @@ -1025,7 +1030,6 @@ struct lpfc_hba { * Firmware supports Forced Link Speed * capability */ -#define HBA_PCI_ERR 0x80000 /* The PCI slot is offline */ #define HBA_FLOGI_ISSUED 0x100000 /* FLOGI was issued */ #define HBA_SHORT_CMF 0x200000 /* shorter CMF timer routine */ #define HBA_CGN_DAY_WRAP 0x400000 /* HBA Congestion info day wraps */ @@ -1335,7 +1339,6 @@ struct lpfc_hba { atomic_t fabric_iocb_count; struct timer_list fabric_block_timer; unsigned long bit_flags; -#define FABRIC_COMANDS_BLOCKED 0 atomic_t num_rsrc_err; atomic_t num_cmd_success; unsigned long last_rsrc_error_time; diff --git a/drivers/scsi/lpfc/lpfc_crtn.h b/drivers/scsi/lpfc/lpfc_crtn.h index 89e36bf14d8f..d4340e5a3aac 100644 --- a/drivers/scsi/lpfc/lpfc_crtn.h +++ b/drivers/scsi/lpfc/lpfc_crtn.h @@ -652,3 +652,6 @@ struct lpfc_vmid *lpfc_get_vmid_from_hashtable(struct lpfc_vport *vport, uint32_t hash, uint8_t *buf); void lpfc_vmid_vport_cleanup(struct lpfc_vport *vport); int lpfc_issue_els_qfpa(struct lpfc_vport *vport); + +void lpfc_sli_rpi_release(struct lpfc_vport *vport, + struct lpfc_nodelist *ndlp); diff --git a/drivers/scsi/lpfc/lpfc_hbadisc.c b/drivers/scsi/lpfc/lpfc_hbadisc.c index 816fc406135b..e10371611ef8 100644 --- a/drivers/scsi/lpfc/lpfc_hbadisc.c +++ b/drivers/scsi/lpfc/lpfc_hbadisc.c @@ -109,8 +109,8 @@ lpfc_rport_invalid(struct fc_rport *rport)
ndlp = rdata->pnode; if (!rdata->pnode) { - pr_err("**** %s: NULL ndlp on rport x%px SID x%x\n", - __func__, rport, rport->scsi_target_id); + pr_info("**** %s: NULL ndlp on rport x%px SID x%x\n", + __func__, rport, rport->scsi_target_id); return -EINVAL; }
@@ -169,9 +169,10 @@ lpfc_dev_loss_tmo_callbk(struct fc_rport *rport)
lpfc_printf_vlog(ndlp->vport, KERN_INFO, LOG_NODE, "3181 dev_loss_callbk x%06x, rport x%px flg x%x " - "load_flag x%x refcnt %d\n", + "load_flag x%x refcnt %d state %d xpt x%x\n", ndlp->nlp_DID, ndlp->rport, ndlp->nlp_flag, - vport->load_flag, kref_read(&ndlp->kref)); + vport->load_flag, kref_read(&ndlp->kref), + ndlp->nlp_state, ndlp->fc4_xpt_flags);
/* Don't schedule a worker thread event if the vport is going down. * The teardown process cleans up the node via lpfc_drop_node. @@ -181,6 +182,11 @@ lpfc_dev_loss_tmo_callbk(struct fc_rport *rport) ndlp->rport = NULL;
ndlp->fc4_xpt_flags &= ~SCSI_XPT_REGD; + /* clear the NLP_XPT_REGD if the node is not registered + * with nvme-fc + */ + if (ndlp->fc4_xpt_flags == NLP_XPT_REGD) + ndlp->fc4_xpt_flags &= ~NLP_XPT_REGD;
/* Remove the node reference from remote_port_add now. * The driver will not call remote_port_delete. @@ -225,18 +231,36 @@ lpfc_dev_loss_tmo_callbk(struct fc_rport *rport) ndlp->rport = NULL; spin_unlock_irqrestore(&ndlp->lock, iflags);
- /* We need to hold the node by incrementing the reference - * count until this queued work is done - */ - evtp->evt_arg1 = lpfc_nlp_get(ndlp); + if (phba->worker_thread) { + /* We need to hold the node by incrementing the reference + * count until this queued work is done + */ + evtp->evt_arg1 = lpfc_nlp_get(ndlp); + + spin_lock_irqsave(&phba->hbalock, iflags); + if (evtp->evt_arg1) { + evtp->evt = LPFC_EVT_DEV_LOSS; + list_add_tail(&evtp->evt_listp, &phba->work_list); + lpfc_worker_wake_up(phba); + } + spin_unlock_irqrestore(&phba->hbalock, iflags); + } else { + lpfc_printf_vlog(ndlp->vport, KERN_INFO, LOG_NODE, + "3188 worker thread is stopped %s x%06x, " + " rport x%px flg x%x load_flag x%x refcnt " + "%d\n", __func__, ndlp->nlp_DID, + ndlp->rport, ndlp->nlp_flag, + vport->load_flag, kref_read(&ndlp->kref)); + if (!(ndlp->fc4_xpt_flags & NVME_XPT_REGD)) { + spin_lock_irqsave(&ndlp->lock, iflags); + /* Node is in dev loss. No further transaction. */ + ndlp->nlp_flag &= ~NLP_IN_DEV_LOSS; + spin_unlock_irqrestore(&ndlp->lock, iflags); + lpfc_disc_state_machine(vport, ndlp, NULL, + NLP_EVT_DEVICE_RM); + }
- spin_lock_irqsave(&phba->hbalock, iflags); - if (evtp->evt_arg1) { - evtp->evt = LPFC_EVT_DEV_LOSS; - list_add_tail(&evtp->evt_listp, &phba->work_list); - lpfc_worker_wake_up(phba); } - spin_unlock_irqrestore(&phba->hbalock, iflags);
return; } @@ -503,11 +527,12 @@ lpfc_dev_loss_tmo_handler(struct lpfc_nodelist *ndlp) lpfc_printf_vlog(vport, KERN_ERR, LOG_TRACE_EVENT, "0203 Devloss timeout on " "WWPN %02x:%02x:%02x:%02x:%02x:%02x:%02x:%02x " - "NPort x%06x Data: x%x x%x x%x\n", + "NPort x%06x Data: x%x x%x x%x refcnt %d\n", *name, *(name+1), *(name+2), *(name+3), *(name+4), *(name+5), *(name+6), *(name+7), ndlp->nlp_DID, ndlp->nlp_flag, - ndlp->nlp_state, ndlp->nlp_rpi); + ndlp->nlp_state, ndlp->nlp_rpi, + kref_read(&ndlp->kref)); } else { lpfc_printf_vlog(vport, KERN_INFO, LOG_TRACE_EVENT, "0204 Devloss timeout on " @@ -755,18 +780,22 @@ lpfc_work_list_done(struct lpfc_hba *phba) int free_evt; int fcf_inuse; uint32_t nlp_did; + bool hba_pci_err;
spin_lock_irq(&phba->hbalock); while (!list_empty(&phba->work_list)) { list_remove_head((&phba->work_list), evtp, typeof(*evtp), evt_listp); spin_unlock_irq(&phba->hbalock); + hba_pci_err = test_bit(HBA_PCI_ERR, &phba->bit_flags); free_evt = 1; switch (evtp->evt) { case LPFC_EVT_ELS_RETRY: ndlp = (struct lpfc_nodelist *) (evtp->evt_arg1); - lpfc_els_retry_delay_handler(ndlp); - free_evt = 0; /* evt is part of ndlp */ + if (!hba_pci_err) { + lpfc_els_retry_delay_handler(ndlp); + free_evt = 0; /* evt is part of ndlp */ + } /* decrement the node reference count held * for this queued work */ @@ -788,8 +817,10 @@ lpfc_work_list_done(struct lpfc_hba *phba) break; case LPFC_EVT_RECOVER_PORT: ndlp = (struct lpfc_nodelist *)(evtp->evt_arg1); - lpfc_sli_abts_recover_port(ndlp->vport, ndlp); - free_evt = 0; + if (!hba_pci_err) { + lpfc_sli_abts_recover_port(ndlp->vport, ndlp); + free_evt = 0; + } /* decrement the node reference count held for * this queued work */ @@ -859,14 +890,18 @@ lpfc_work_done(struct lpfc_hba *phba) struct lpfc_vport **vports; struct lpfc_vport *vport; int i; + bool hba_pci_err;
+ hba_pci_err = test_bit(HBA_PCI_ERR, &phba->bit_flags); spin_lock_irq(&phba->hbalock); ha_copy = phba->work_ha; phba->work_ha = 0; spin_unlock_irq(&phba->hbalock); + if (hba_pci_err) + ha_copy = 0;
/* First, try to post the next mailbox command to SLI4 device */ - if (phba->pci_dev_grp == LPFC_PCI_DEV_OC) + if (phba->pci_dev_grp == LPFC_PCI_DEV_OC && !hba_pci_err) lpfc_sli4_post_async_mbox(phba);
if (ha_copy & HA_ERATT) { @@ -886,7 +921,7 @@ lpfc_work_done(struct lpfc_hba *phba) lpfc_handle_latt(phba);
/* Handle VMID Events */ - if (lpfc_is_vmid_enabled(phba)) { + if (lpfc_is_vmid_enabled(phba) && !hba_pci_err) { if (phba->pport->work_port_events & WORKER_CHECK_VMID_ISSUE_QFPA) { lpfc_check_vmid_qfpa_issue(phba); @@ -936,6 +971,8 @@ lpfc_work_done(struct lpfc_hba *phba) work_port_events = vport->work_port_events; vport->work_port_events &= ~work_port_events; spin_unlock_irq(&vport->work_port_lock); + if (hba_pci_err) + continue; if (work_port_events & WORKER_DISC_TMO) lpfc_disc_timeout_handler(vport); if (work_port_events & WORKER_ELS_TMO) @@ -1173,12 +1210,14 @@ lpfc_linkdown(struct lpfc_hba *phba) struct lpfc_vport **vports; LPFC_MBOXQ_t *mb; int i; + int offline;
if (phba->link_state == LPFC_LINK_DOWN) return 0;
/* Block all SCSI stack I/Os */ lpfc_scsi_dev_block(phba); + offline = pci_channel_offline(phba->pcidev);
phba->defer_flogi_acc_flag = false;
@@ -1219,7 +1258,7 @@ lpfc_linkdown(struct lpfc_hba *phba) lpfc_destroy_vport_work_array(phba, vports);
/* Clean up any SLI3 firmware default rpi's */ - if (phba->sli_rev > LPFC_SLI_REV3) + if (phba->sli_rev > LPFC_SLI_REV3 || offline) goto skip_unreg_did;
mb = mempool_alloc(phba->mbox_mem_pool, GFP_KERNEL); @@ -4712,6 +4751,11 @@ lpfc_nlp_unreg_node(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp) spin_lock_irqsave(&ndlp->lock, iflags); if (!(ndlp->fc4_xpt_flags & NLP_XPT_REGD)) { spin_unlock_irqrestore(&ndlp->lock, iflags); + lpfc_printf_vlog(vport, KERN_INFO, LOG_SLI, + "0999 %s Not regd: ndlp x%px rport x%px DID " + "x%x FLG x%x XPT x%x\n", + __func__, ndlp, ndlp->rport, ndlp->nlp_DID, + ndlp->nlp_flag, ndlp->fc4_xpt_flags); return; }
@@ -4722,6 +4766,13 @@ lpfc_nlp_unreg_node(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp) ndlp->fc4_xpt_flags & SCSI_XPT_REGD) { vport->phba->nport_event_cnt++; lpfc_unregister_remote_port(ndlp); + } else if (!ndlp->rport) { + lpfc_printf_vlog(vport, KERN_INFO, LOG_SLI, + "1999 %s NDLP in devloss x%px DID x%x FLG x%x" + " XPT x%x refcnt %d\n", + __func__, ndlp, ndlp->nlp_DID, ndlp->nlp_flag, + ndlp->fc4_xpt_flags, + kref_read(&ndlp->kref)); }
if (ndlp->fc4_xpt_flags & NVME_XPT_REGD) { @@ -6089,12 +6140,34 @@ lpfc_disc_flush_list(struct lpfc_vport *vport) } }
+/* + * lpfc_notify_xport_npr - notifies xport of node disappearance + * @vport: Pointer to Virtual Port object. + * + * Transitions all ndlps to NPR state. When lpfc_nlp_set_state + * calls lpfc_nlp_state_cleanup, the ndlp->rport is unregistered + * and transport notified that the node is gone. + * Return Code: + * none + */ +static void +lpfc_notify_xport_npr(struct lpfc_vport *vport) +{ + struct lpfc_nodelist *ndlp, *next_ndlp; + + list_for_each_entry_safe(ndlp, next_ndlp, &vport->fc_nodes, + nlp_listp) { + lpfc_nlp_set_state(vport, ndlp, NLP_STE_NPR_NODE); + } +} void lpfc_cleanup_discovery_resources(struct lpfc_vport *vport) { lpfc_els_flush_rscn(vport); lpfc_els_flush_cmd(vport); lpfc_disc_flush_list(vport); + if (pci_channel_offline(vport->phba->pcidev)) + lpfc_notify_xport_npr(vport); }
/*****************************************************************************/ diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c index 558f7d2559c4..fe9a04b2df3e 100644 --- a/drivers/scsi/lpfc/lpfc_init.c +++ b/drivers/scsi/lpfc/lpfc_init.c @@ -1652,7 +1652,7 @@ lpfc_sli4_offline_eratt(struct lpfc_hba *phba) { spin_lock_irq(&phba->hbalock); if (phba->link_state == LPFC_HBA_ERROR && - phba->hba_flag & HBA_PCI_ERR) { + test_bit(HBA_PCI_ERR, &phba->bit_flags)) { spin_unlock_irq(&phba->hbalock); return; } @@ -3692,7 +3692,8 @@ lpfc_offline_prep(struct lpfc_hba *phba, int mbx_action) struct lpfc_vport **vports; struct Scsi_Host *shost; int i; - int offline = 0; + int offline; + bool hba_pci_err;
if (vport->fc_flag & FC_OFFLINE_MODE) return; @@ -3702,6 +3703,7 @@ lpfc_offline_prep(struct lpfc_hba *phba, int mbx_action) lpfc_linkdown(phba);
offline = pci_channel_offline(phba->pcidev); + hba_pci_err = test_bit(HBA_PCI_ERR, &phba->bit_flags);
/* Issue an unreg_login to all nodes on all vports */ vports = lpfc_create_vport_work_array(phba); @@ -3725,11 +3727,14 @@ lpfc_offline_prep(struct lpfc_hba *phba, int mbx_action) ndlp->nlp_flag &= ~NLP_NPR_ADISC; spin_unlock_irq(&ndlp->lock);
- if (offline) { + if (offline || hba_pci_err) { spin_lock_irq(&ndlp->lock); ndlp->nlp_flag &= ~(NLP_UNREG_INP | NLP_RPI_REGISTERED); spin_unlock_irq(&ndlp->lock); + if (phba->sli_rev == LPFC_SLI_REV4) + lpfc_sli_rpi_release(vports[i], + ndlp); } else { lpfc_unreg_rpi(vports[i], ndlp); } @@ -13386,15 +13391,12 @@ lpfc_sli4_hba_unset(struct lpfc_hba *phba) /* Disable FW logging to host memory */ lpfc_ras_stop_fwlog(phba);
- /* Unset the queues shared with the hardware then release all - * allocated resources. - */ - lpfc_sli4_queue_unset(phba); - lpfc_sli4_queue_destroy(phba); - /* Reset SLI4 HBA FCoE function */ lpfc_pci_function_reset(phba);
+ /* release all queue allocated resources. */ + lpfc_sli4_queue_destroy(phba); + /* Free RAS DMA memory */ if (phba->ras_fwlog.ras_enabled) lpfc_sli4_ras_dma_free(phba); @@ -15069,24 +15071,28 @@ lpfc_sli4_prep_dev_for_recover(struct lpfc_hba *phba) static void lpfc_sli4_prep_dev_for_reset(struct lpfc_hba *phba) { - lpfc_printf_log(phba, KERN_ERR, LOG_TRACE_EVENT, - "2826 PCI channel disable preparing for reset\n"); + int offline = pci_channel_offline(phba->pcidev); + + lpfc_printf_log(phba, KERN_ERR, LOG_INIT, + "2826 PCI channel disable preparing for reset offline" + " %d\n", offline);
/* Block any management I/Os to the device */ lpfc_block_mgmt_io(phba, LPFC_MBX_NO_WAIT);
- /* Block all SCSI devices' I/Os on the host */ - lpfc_scsi_dev_block(phba);
+ /* HBA_PCI_ERR was set in io_error_detect */ + lpfc_offline_prep(phba, LPFC_MBX_NO_WAIT); /* Flush all driver's outstanding I/Os as we are to reset */ lpfc_sli_flush_io_rings(phba); + lpfc_offline(phba);
/* stop all timers */ lpfc_stop_hba_timers(phba);
+ lpfc_sli4_queue_destroy(phba); /* Disable interrupt and pci device */ lpfc_sli4_disable_intr(phba); - lpfc_sli4_queue_destroy(phba); pci_disable_device(phba->pcidev); }
@@ -15135,6 +15141,7 @@ lpfc_io_error_detected_s4(struct pci_dev *pdev, pci_channel_state_t state) { struct Scsi_Host *shost = pci_get_drvdata(pdev); struct lpfc_hba *phba = ((struct lpfc_vport *)shost->hostdata)->phba; + bool hba_pci_err;
switch (state) { case pci_channel_io_normal: @@ -15142,17 +15149,24 @@ lpfc_io_error_detected_s4(struct pci_dev *pdev, pci_channel_state_t state) lpfc_sli4_prep_dev_for_recover(phba); return PCI_ERS_RESULT_CAN_RECOVER; case pci_channel_io_frozen: - phba->hba_flag |= HBA_PCI_ERR; + hba_pci_err = test_and_set_bit(HBA_PCI_ERR, &phba->bit_flags); /* Fatal error, prepare for slot reset */ - lpfc_sli4_prep_dev_for_reset(phba); + if (!hba_pci_err) + lpfc_sli4_prep_dev_for_reset(phba); + else + lpfc_printf_log(phba, KERN_ERR, LOG_INIT, + "2832 Already handling PCI error " + "state: x%x\n", state); return PCI_ERS_RESULT_NEED_RESET; case pci_channel_io_perm_failure: - phba->hba_flag |= HBA_PCI_ERR; + set_bit(HBA_PCI_ERR, &phba->bit_flags); /* Permanent failure, prepare for device down */ lpfc_sli4_prep_dev_for_perm_failure(phba); return PCI_ERS_RESULT_DISCONNECT; default: - phba->hba_flag |= HBA_PCI_ERR; + hba_pci_err = test_and_set_bit(HBA_PCI_ERR, &phba->bit_flags); + if (!hba_pci_err) + lpfc_sli4_prep_dev_for_reset(phba); /* Unknown state, prepare and request slot reset */ lpfc_printf_log(phba, KERN_ERR, LOG_TRACE_EVENT, "2825 Unknown PCI error state: x%x\n", state); @@ -15186,17 +15200,21 @@ lpfc_io_slot_reset_s4(struct pci_dev *pdev) struct lpfc_hba *phba = ((struct lpfc_vport *)shost->hostdata)->phba; struct lpfc_sli *psli = &phba->sli; uint32_t intr_mode; + bool hba_pci_err;
dev_printk(KERN_INFO, &pdev->dev, "recovering from a slot reset.\n"); if (pci_enable_device_mem(pdev)) { printk(KERN_ERR "lpfc: Cannot re-enable " - "PCI device after reset.\n"); + "PCI device after reset.\n"); return PCI_ERS_RESULT_DISCONNECT; }
pci_restore_state(pdev);
- phba->hba_flag &= ~HBA_PCI_ERR; + hba_pci_err = test_and_clear_bit(HBA_PCI_ERR, &phba->bit_flags); + if (!hba_pci_err) + dev_info(&pdev->dev, + "hba_pci_err was not set, recovering slot reset.\n"); /* * As the new kernel behavior of pci_restore_state() API call clears * device saved_state flag, need to save the restored state again. @@ -15251,8 +15269,6 @@ lpfc_io_resume_s4(struct pci_dev *pdev) */ if (!(phba->sli.sli_flag & LPFC_SLI_ACTIVE)) { /* Perform device reset */ - lpfc_offline_prep(phba, LPFC_MBX_WAIT); - lpfc_offline(phba); lpfc_sli_brdrestart(phba); /* Bring the device back online */ lpfc_online(phba); diff --git a/drivers/scsi/lpfc/lpfc_nvme.c b/drivers/scsi/lpfc/lpfc_nvme.c index 9601edd838e1..8983f6440858 100644 --- a/drivers/scsi/lpfc/lpfc_nvme.c +++ b/drivers/scsi/lpfc/lpfc_nvme.c @@ -2169,8 +2169,7 @@ lpfc_nvme_lport_unreg_wait(struct lpfc_vport *vport, abts_nvme = 0; for (i = 0; i < phba->cfg_hdw_queue; i++) { qp = &phba->sli4_hba.hdwq[i]; - if (!vport || !vport->localport || - !qp || !qp->io_wq) + if (!vport->localport || !qp || !qp->io_wq) return;
pring = qp->io_wq->pring; @@ -2180,8 +2179,9 @@ lpfc_nvme_lport_unreg_wait(struct lpfc_vport *vport, abts_scsi += qp->abts_scsi_io_bufs; abts_nvme += qp->abts_nvme_io_bufs; } - if (!vport || !vport->localport || - vport->phba->hba_flag & HBA_PCI_ERR) + if (!vport->localport || + test_bit(HBA_PCI_ERR, &vport->phba->bit_flags) || + vport->load_flag & FC_UNLOADING) return;
lpfc_printf_vlog(vport, KERN_ERR, LOG_TRACE_EVENT, @@ -2541,8 +2541,7 @@ lpfc_nvme_unregister_port(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp) * return values is ignored. The upcall is a courtesy to the * transport. */ - if (vport->load_flag & FC_UNLOADING || - unlikely(vport->phba->hba_flag & HBA_PCI_ERR)) + if (vport->load_flag & FC_UNLOADING) (void)nvme_fc_set_remoteport_devloss(remoteport, 0);
ret = nvme_fc_unregister_remoteport(remoteport); diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c index 430abebf99f1..661ed0999f1c 100644 --- a/drivers/scsi/lpfc/lpfc_sli.c +++ b/drivers/scsi/lpfc/lpfc_sli.c @@ -2833,6 +2833,12 @@ __lpfc_sli_rpi_release(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp) ndlp->nlp_flag &= ~NLP_UNREG_INP; }
+void +lpfc_sli_rpi_release(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp) +{ + __lpfc_sli_rpi_release(vport, ndlp); +} + /** * lpfc_sli_def_mbox_cmpl - Default mailbox completion handler * @phba: Pointer to HBA context object. @@ -4554,11 +4560,6 @@ lpfc_sli_flush_io_rings(struct lpfc_hba *phba) struct lpfc_iocbq *piocb, *next_iocb;
spin_lock_irq(&phba->hbalock); - if (phba->hba_flag & HBA_IOQ_FLUSH || - !phba->sli4_hba.hdwq) { - spin_unlock_irq(&phba->hbalock); - return; - } /* Indicate the I/O queues are flushed */ phba->hba_flag |= HBA_IOQ_FLUSH; spin_unlock_irq(&phba->hbalock); @@ -11235,6 +11236,10 @@ lpfc_sli_issue_iocb(struct lpfc_hba *phba, uint32_t ring_number, unsigned long iflags; int rc;
+ /* If the PCI channel is in offline state, do not post iocbs. */ + if (unlikely(pci_channel_offline(phba->pcidev))) + return IOCB_ERROR; + if (phba->sli_rev == LPFC_SLI_REV4) { eq = phba->sli4_hba.hdwq[piocb->hba_wqidx].hba_eq;
From: James Smart jsmart2021@gmail.com
[ Upstream commit a4691038b4071ff0d9ae486d8822a2c0d41d5796 ]
When injecting EEH errors the port is getting hung up waiting on the node list to empty, message number 0233. The driver is stuck at this point and also can't unload. The driver makes transport remoteport delete calls which try to abort I/O's, but the EEH daemon has already called the driver to detach and the detachment has set the global FC_UNLOADING flag. There are several code paths that will avoid I/O cleanup if the FC_UNLOADING flag is set, resulting in transports waiting for I/O while the driver is waiting on transports to clean up.
Additionally, during study of the list, a locking issue was found in lpfc_sli_abort_iocb_ring that could corrupt the list.
A special case was added to the lpfc_cleanup() routine to call lpfc_sli_flush_rings() if the driver is FC_UNLOADING and if the pci-slot is offline (e.g. EEH).
The SLI4 part of lpfc_sli_abort_iocb_ring() is changed to use the ring_lock. Also added code to cancel the I/Os if the pci-slot is offline and added checks and returns for the FC_UNLOADING and HBA_IOQ_FLUSH flags to prevent trying to send an I/O that we cannot handle.
Link: https://lore.kernel.org/r/20220317032737.45308-3-jsmart2021@gmail.com Co-developed-by: Justin Tee justin.tee@broadcom.com Signed-off-by: Justin Tee justin.tee@broadcom.com Signed-off-by: James Smart jsmart2021@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/lpfc/lpfc_hbadisc.c | 1 + drivers/scsi/lpfc/lpfc_init.c | 26 +++++++++++++++-- drivers/scsi/lpfc/lpfc_nvme.c | 16 ++++++++-- drivers/scsi/lpfc/lpfc_sli.c | 50 ++++++++++++++++++++++---------- 4 files changed, 72 insertions(+), 21 deletions(-)
diff --git a/drivers/scsi/lpfc/lpfc_hbadisc.c b/drivers/scsi/lpfc/lpfc_hbadisc.c index e10371611ef8..0cba306de0db 100644 --- a/drivers/scsi/lpfc/lpfc_hbadisc.c +++ b/drivers/scsi/lpfc/lpfc_hbadisc.c @@ -5416,6 +5416,7 @@ lpfc_unreg_rpi(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp) ndlp->nlp_flag &= ~NLP_UNREG_INP; mempool_free(mbox, phba->mbox_mem_pool); acc_plogi = 1; + lpfc_nlp_put(ndlp); } } else { lpfc_printf_vlog(vport, KERN_INFO, diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c index fe9a04b2df3e..c8c049cf8d96 100644 --- a/drivers/scsi/lpfc/lpfc_init.c +++ b/drivers/scsi/lpfc/lpfc_init.c @@ -95,6 +95,7 @@ static void lpfc_sli4_oas_verify(struct lpfc_hba *phba); static uint16_t lpfc_find_cpu_handle(struct lpfc_hba *, uint16_t, int); static void lpfc_setup_bg(struct lpfc_hba *, struct Scsi_Host *); static int lpfc_sli4_cgn_parm_chg_evt(struct lpfc_hba *); +static void lpfc_sli4_prep_dev_for_reset(struct lpfc_hba *phba);
static struct scsi_transport_template *lpfc_transport_template = NULL; static struct scsi_transport_template *lpfc_vport_transport_template = NULL; @@ -1995,6 +1996,7 @@ lpfc_handle_eratt_s4(struct lpfc_hba *phba) if (pci_channel_offline(phba->pcidev)) { lpfc_printf_log(phba, KERN_ERR, LOG_TRACE_EVENT, "3166 pci channel is offline\n"); + lpfc_sli_flush_io_rings(phba); return; }
@@ -2983,6 +2985,22 @@ lpfc_cleanup(struct lpfc_vport *vport) NLP_EVT_DEVICE_RM); }
+ /* This is a special case flush to return all + * IOs before entering this loop. There are + * two points in the code where a flush is + * avoided if the FC_UNLOADING flag is set. + * one is in the multipool destroy, + * (this prevents a crash) and the other is + * in the nvme abort handler, ( also prevents + * a crash). Both of these exceptions are + * cases where the slot is still accessible. + * The flush here is only when the pci slot + * is offline. + */ + if (vport->load_flag & FC_UNLOADING && + pci_channel_offline(phba->pcidev)) + lpfc_sli_flush_io_rings(vport->phba); + /* At this point, ALL ndlp's should be gone * because of the previous NLP_EVT_DEVICE_RM. * Lets wait for this to happen, if needed. @@ -2995,7 +3013,7 @@ lpfc_cleanup(struct lpfc_vport *vport) list_for_each_entry_safe(ndlp, next_ndlp, &vport->fc_nodes, nlp_listp) { lpfc_printf_vlog(ndlp->vport, KERN_ERR, - LOG_TRACE_EVENT, + LOG_DISCOVERY, "0282 did:x%x ndlp:x%px " "refcnt:%d xflags x%x nflag x%x\n", ndlp->nlp_DID, (void *)ndlp, @@ -13371,8 +13389,9 @@ lpfc_sli4_hba_unset(struct lpfc_hba *phba) /* Abort all iocbs associated with the hba */ lpfc_sli_hba_iocb_abort(phba);
- /* Wait for completion of device XRI exchange busy */ - lpfc_sli4_xri_exchange_busy_wait(phba); + if (!pci_channel_offline(phba->pcidev)) + /* Wait for completion of device XRI exchange busy */ + lpfc_sli4_xri_exchange_busy_wait(phba);
/* per-phba callback de-registration for hotplug event */ if (phba->pport) @@ -14276,6 +14295,7 @@ lpfc_sli_prep_dev_for_perm_failure(struct lpfc_hba *phba) "2711 PCI channel permanent disable for failure\n"); /* Block all SCSI devices' I/Os on the host */ lpfc_scsi_dev_block(phba); + lpfc_sli4_prep_dev_for_reset(phba);
/* stop all timers */ lpfc_stop_hba_timers(phba); diff --git a/drivers/scsi/lpfc/lpfc_nvme.c b/drivers/scsi/lpfc/lpfc_nvme.c index 8983f6440858..df73abb59407 100644 --- a/drivers/scsi/lpfc/lpfc_nvme.c +++ b/drivers/scsi/lpfc/lpfc_nvme.c @@ -93,6 +93,11 @@ lpfc_nvme_create_queue(struct nvme_fc_local_port *pnvme_lport,
lport = (struct lpfc_nvme_lport *)pnvme_lport->private; vport = lport->vport; + + if (!vport || vport->load_flag & FC_UNLOADING || + vport->phba->hba_flag & HBA_IOQ_FLUSH) + return -ENODEV; + qhandle = kzalloc(sizeof(struct lpfc_nvme_qhandle), GFP_KERNEL); if (qhandle == NULL) return -ENOMEM; @@ -267,7 +272,8 @@ lpfc_nvme_handle_lsreq(struct lpfc_hba *phba, return -EINVAL;
remoteport = lpfc_rport->remoteport; - if (!vport->localport) + if (!vport->localport || + vport->phba->hba_flag & HBA_IOQ_FLUSH) return -EINVAL;
lport = vport->localport->private; @@ -559,6 +565,8 @@ __lpfc_nvme_ls_req(struct lpfc_vport *vport, struct lpfc_nodelist *ndlp, ndlp->nlp_DID, ntype, nstate); return -ENODEV; } + if (vport->phba->hba_flag & HBA_IOQ_FLUSH) + return -ENODEV;
if (!vport->phba->sli4_hba.nvmels_wq) return -ENOMEM; @@ -662,7 +670,8 @@ lpfc_nvme_ls_req(struct nvme_fc_local_port *pnvme_lport, return -EINVAL;
vport = lport->vport; - if (vport->load_flag & FC_UNLOADING) + if (vport->load_flag & FC_UNLOADING || + vport->phba->hba_flag & HBA_IOQ_FLUSH) return -ENODEV;
atomic_inc(&lport->fc4NvmeLsRequests); @@ -1515,7 +1524,8 @@ lpfc_nvme_fcp_io_submit(struct nvme_fc_local_port *pnvme_lport,
phba = vport->phba;
- if (unlikely(vport->load_flag & FC_UNLOADING)) { + if ((unlikely(vport->load_flag & FC_UNLOADING)) || + phba->hba_flag & HBA_IOQ_FLUSH) { lpfc_printf_vlog(vport, KERN_INFO, LOG_NVME_IOERR, "6124 Fail IO, Driver unload\n"); atomic_inc(&lport->xmt_fcp_err); diff --git a/drivers/scsi/lpfc/lpfc_sli.c b/drivers/scsi/lpfc/lpfc_sli.c index 661ed0999f1c..b64c5f157ce9 100644 --- a/drivers/scsi/lpfc/lpfc_sli.c +++ b/drivers/scsi/lpfc/lpfc_sli.c @@ -4472,42 +4472,62 @@ lpfc_sli_handle_slow_ring_event_s4(struct lpfc_hba *phba, void lpfc_sli_abort_iocb_ring(struct lpfc_hba *phba, struct lpfc_sli_ring *pring) { - LIST_HEAD(completions); + LIST_HEAD(tx_completions); + LIST_HEAD(txcmplq_completions); struct lpfc_iocbq *iocb, *next_iocb; + int offline;
if (pring->ringno == LPFC_ELS_RING) { lpfc_fabric_abort_hba(phba); } + offline = pci_channel_offline(phba->pcidev);
/* Error everything on txq and txcmplq * First do the txq. */ if (phba->sli_rev >= LPFC_SLI_REV4) { spin_lock_irq(&pring->ring_lock); - list_splice_init(&pring->txq, &completions); + list_splice_init(&pring->txq, &tx_completions); pring->txq_cnt = 0; - spin_unlock_irq(&pring->ring_lock);
- spin_lock_irq(&phba->hbalock); - /* Next issue ABTS for everything on the txcmplq */ - list_for_each_entry_safe(iocb, next_iocb, &pring->txcmplq, list) - lpfc_sli_issue_abort_iotag(phba, pring, iocb, NULL); - spin_unlock_irq(&phba->hbalock); + if (offline) { + list_splice_init(&pring->txcmplq, + &txcmplq_completions); + } else { + /* Next issue ABTS for everything on the txcmplq */ + list_for_each_entry_safe(iocb, next_iocb, + &pring->txcmplq, list) + lpfc_sli_issue_abort_iotag(phba, pring, + iocb, NULL); + } + spin_unlock_irq(&pring->ring_lock); } else { spin_lock_irq(&phba->hbalock); - list_splice_init(&pring->txq, &completions); + list_splice_init(&pring->txq, &tx_completions); pring->txq_cnt = 0;
- /* Next issue ABTS for everything on the txcmplq */ - list_for_each_entry_safe(iocb, next_iocb, &pring->txcmplq, list) - lpfc_sli_issue_abort_iotag(phba, pring, iocb, NULL); + if (offline) { + list_splice_init(&pring->txcmplq, &txcmplq_completions); + } else { + /* Next issue ABTS for everything on the txcmplq */ + list_for_each_entry_safe(iocb, next_iocb, + &pring->txcmplq, list) + lpfc_sli_issue_abort_iotag(phba, pring, + iocb, NULL); + } spin_unlock_irq(&phba->hbalock); } - /* Make sure HBA is alive */ - lpfc_issue_hb_tmo(phba);
+ if (offline) { + /* Cancel all the IOCBs from the completions list */ + lpfc_sli_cancel_iocbs(phba, &txcmplq_completions, + IOSTAT_LOCAL_REJECT, IOERR_SLI_ABORTED); + } else { + /* Make sure HBA is alive */ + lpfc_issue_hb_tmo(phba); + } /* Cancel all the IOCBs from the completions list */ - lpfc_sli_cancel_iocbs(phba, &completions, IOSTAT_LOCAL_REJECT, + lpfc_sli_cancel_iocbs(phba, &tx_completions, IOSTAT_LOCAL_REJECT, IOERR_SLI_ABORTED); }
From: James Smart jsmart2021@gmail.com
[ Upstream commit df0101197c4d9596682901631f3ee193ed354873 ]
When recovering from a pci-parity error the driver is failing to re-create queues, causing recovery to fail. Looking deeper, it was found that the interrupt vector count allocated on the recovery was fewer than the vectors originally allocated. This disparity resulted in CPU map entries with stale information. When the driver tries to re-create the queues, it attempts to use the stale information which indicates an eq/interrupt vector that was no longer created.
Fix by clearng the cpup map array before enabling and requesting the IRQs in the lpfc_sli_reset_slot_s4 routine().
Link: https://lore.kernel.org/r/20220317032737.45308-4-jsmart2021@gmail.com Co-developed-by: Justin Tee justin.tee@broadcom.com Signed-off-by: Justin Tee justin.tee@broadcom.com Signed-off-by: James Smart jsmart2021@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/lpfc/lpfc_init.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/scsi/lpfc/lpfc_init.c b/drivers/scsi/lpfc/lpfc_init.c index c8c049cf8d96..9569a7390f9d 100644 --- a/drivers/scsi/lpfc/lpfc_init.c +++ b/drivers/scsi/lpfc/lpfc_init.c @@ -15248,6 +15248,8 @@ lpfc_io_slot_reset_s4(struct pci_dev *pdev) psli->sli_flag &= ~LPFC_SLI_ACTIVE; spin_unlock_irq(&phba->hbalock);
+ /* Init cpu_map array */ + lpfc_cpu_map_array_init(phba); /* Configure and enable interrupt */ intr_mode = lpfc_sli4_enable_intr(phba, phba->intr_mode); if (intr_mode == LPFC_INTR_ERROR) {
From: Tyrel Datwyler tyreld@linux.ibm.com
[ Upstream commit 0bade8e53279157c7cc9dd95d573b7e82223d78a ]
The adapter request_limit is hardcoded to be INITIAL_SRP_LIMIT which is currently an arbitrary value of 800. Increase this value to 1024 which better matches the characteristics of the typical IBMi Initiator that supports 32 LUNs and a queue depth of 32.
This change also has the secondary benefit of being a power of two as required by the kfifo API. Since, Commit ab9bb6318b09 ("Partially revert "kfifo: fix kfifo_alloc() and kfifo_init()"") the size of IU pool for each target has been rounded down to 512 when attempting to kfifo_init() those pools with the current request_limit size of 800.
Link: https://lore.kernel.org/r/20220322194443.678433-1-tyreld@linux.ibm.com Signed-off-by: Tyrel Datwyler tyreld@linux.ibm.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c index 61f06f6885a5..89b9fbce7488 100644 --- a/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c +++ b/drivers/scsi/ibmvscsi_tgt/ibmvscsi_tgt.c @@ -36,7 +36,7 @@
#define IBMVSCSIS_VERSION "v0.2"
-#define INITIAL_SRP_LIMIT 800 +#define INITIAL_SRP_LIMIT 1024 #define DEFAULT_MAX_SECTORS 256 #define MAX_TXU 1024 * 1024
From: Randy Dunlap rdunlap@infradead.org
[ Upstream commit c3efcedd272aa6dd5929e20cf902a52ddaa1197a ]
KS8851_MLL selects MICREL_PHY, which depends on PTP_1588_CLOCK_OPTIONAL, so make KS8851_MLL also depend on PTP_1588_CLOCK_OPTIONAL since 'select' does not follow any dependency chains.
Fixes kconfig warning and build errors:
WARNING: unmet direct dependencies detected for MICREL_PHY Depends on [m]: NETDEVICES [=y] && PHYLIB [=y] && PTP_1588_CLOCK_OPTIONAL [=m] Selected by [y]: - KS8851_MLL [=y] && NETDEVICES [=y] && ETHERNET [=y] && NET_VENDOR_MICREL [=y] && HAS_IOMEM [=y]
ld: drivers/net/phy/micrel.o: in function `lan8814_ts_info': micrel.c:(.text+0xb35): undefined reference to `ptp_clock_index' ld: drivers/net/phy/micrel.o: in function `lan8814_probe': micrel.c:(.text+0x2586): undefined reference to `ptp_clock_register'
Signed-off-by: Randy Dunlap rdunlap@infradead.org Cc: "David S. Miller" davem@davemloft.net Cc: Jakub Kicinski kuba@kernel.org Cc: Paolo Abeni pabeni@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/micrel/Kconfig | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/micrel/Kconfig b/drivers/net/ethernet/micrel/Kconfig index 93df3049cdc0..1b632cdd7630 100644 --- a/drivers/net/ethernet/micrel/Kconfig +++ b/drivers/net/ethernet/micrel/Kconfig @@ -39,6 +39,7 @@ config KS8851 config KS8851_MLL tristate "Micrel KS8851 MLL" depends on HAS_IOMEM + depends on PTP_1588_CLOCK_OPTIONAL select MII select CRC32 select EEPROM_93CX6
From: Christian Lamparter chunkeey@gmail.com
[ Upstream commit 5399752299396a3c9df6617f4b3c907d7aa4ded8 ]
Samsung' 840 EVO with the latest firmware (EXT0DB6Q) locks up with the a message: "READ LOG DMA EXT failed, trying PIO" during boot.
Initially this was discovered because it caused a crash with the sata_dwc_460ex controller on a WD MyBook Live DUO.
The reporter "Tice Rex" which has the unique opportunity that he has two Samsung 840 EVO SSD! One with the older firmware "EXT0BB0Q" which booted fine and didn't expose "READ LOG DMA EXT". But the newer/latest firmware "EXT0DB6Q" caused the headaches.
BugLink: https://github.com/openwrt/openwrt/issues/9505 Signed-off-by: Christian Lamparter chunkeey@gmail.com Signed-off-by: Damien Le Moal damien.lemoal@opensource.wdc.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ata/libata-core.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/ata/libata-core.c b/drivers/ata/libata-core.c index 0c854aebfe0b..760c0d81d148 100644 --- a/drivers/ata/libata-core.c +++ b/drivers/ata/libata-core.c @@ -4014,6 +4014,9 @@ static const struct ata_blacklist_entry ata_device_blacklist [] = { ATA_HORKAGE_ZERO_AFTER_TRIM, }, { "Crucial_CT*MX100*", "MU01", ATA_HORKAGE_NO_NCQ_TRIM | ATA_HORKAGE_ZERO_AFTER_TRIM, }, + { "Samsung SSD 840 EVO*", NULL, ATA_HORKAGE_NO_NCQ_TRIM | + ATA_HORKAGE_NO_DMA_LOG | + ATA_HORKAGE_ZERO_AFTER_TRIM, }, { "Samsung SSD 840*", NULL, ATA_HORKAGE_NO_NCQ_TRIM | ATA_HORKAGE_ZERO_AFTER_TRIM, }, { "Samsung SSD 850*", NULL, ATA_HORKAGE_NO_NCQ_TRIM |
From: Leo Ruan tingquan.ruan@cn.bosch.com
[ Upstream commit 070a88fd4a03f921b73a2059e97d55faaa447dab ]
This commit corrects the printing of the IPU clock error percentage if it is between -0.1% to -0.9%. For example, if the pixel clock requested is 27.2 MHz but only 27.0 MHz can be achieved the deviation is -0.8%. But the fixed point math had a flaw and calculated error of 0.2%.
Before: Clocks: IPU 270000000Hz DI 24716667Hz Needed 27200000Hz IPU clock can give 27000000 with divider 10, error 0.2% Want 27200000Hz IPU 270000000Hz DI 24716667Hz using IPU, 27000000Hz
After: Clocks: IPU 270000000Hz DI 24716667Hz Needed 27200000Hz IPU clock can give 27000000 with divider 10, error -0.8% Want 27200000Hz IPU 270000000Hz DI 24716667Hz using IPU, 27000000Hz
Signed-off-by: Leo Ruan tingquan.ruan@cn.bosch.com Signed-off-by: Mark Jonas mark.jonas@de.bosch.com Reviewed-by: Philipp Zabel p.zabel@pengutronix.de Signed-off-by: Philipp Zabel p.zabel@pengutronix.de Link: https://lore.kernel.org/r/20220207151411.5009-1-mark.jonas@de.bosch.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/ipu-v3/ipu-di.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/ipu-v3/ipu-di.c b/drivers/gpu/ipu-v3/ipu-di.c index 666223c6bec4..0a34e0ab4fe6 100644 --- a/drivers/gpu/ipu-v3/ipu-di.c +++ b/drivers/gpu/ipu-v3/ipu-di.c @@ -447,8 +447,9 @@ static void ipu_di_config_clock(struct ipu_di *di,
error = rate / (sig->mode.pixelclock / 1000);
- dev_dbg(di->ipu->dev, " IPU clock can give %lu with divider %u, error %d.%u%%\n", - rate, div, (signed)(error - 1000) / 10, error % 10); + dev_dbg(di->ipu->dev, " IPU clock can give %lu with divider %u, error %c%d.%d%%\n", + rate, div, error < 1000 ? '-' : '+', + abs(error - 1000) / 10, abs(error - 1000) % 10);
/* Allow a 1% error */ if (error < 1010 && error >= 990) {
From: Jonathan Bakker xc-racer2@live.ca
[ Upstream commit 92d96b603738ec4f35cde7198c303ae264dd47cb ]
As per Table 130 of the wm8994 datasheet at [1], there is an off-on delay for LDO1 and LDO2. In the wm8958 datasheet [2], I could not find any reference to it. I could not find a wm1811 datasheet to double-check there, but as no one has complained presumably it works without it.
This solves the issue on Samsung Aries boards with a wm8994 where register writes fail when the device is powered off and back-on quickly.
[1] https://statics.cirrus.com/pubs/proDatasheet/WM8994_Rev4.6.pdf [2] https://statics.cirrus.com/pubs/proDatasheet/WM8958_v3.5.pdf
Signed-off-by: Jonathan Bakker xc-racer2@live.ca Acked-by: Charles Keepax ckeepax@opensource.cirrus.com Link: https://lore.kernel.org/r/CY4PR04MB056771CFB80DC447C30D5A31CB1D9@CY4PR04MB05... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/regulator/wm8994-regulator.c | 42 ++++++++++++++++++++++++++-- 1 file changed, 39 insertions(+), 3 deletions(-)
diff --git a/drivers/regulator/wm8994-regulator.c b/drivers/regulator/wm8994-regulator.c index cadea0344486..40befdd9dfa9 100644 --- a/drivers/regulator/wm8994-regulator.c +++ b/drivers/regulator/wm8994-regulator.c @@ -71,6 +71,35 @@ static const struct regulator_ops wm8994_ldo2_ops = { };
static const struct regulator_desc wm8994_ldo_desc[] = { + { + .name = "LDO1", + .id = 1, + .type = REGULATOR_VOLTAGE, + .n_voltages = WM8994_LDO1_MAX_SELECTOR + 1, + .vsel_reg = WM8994_LDO_1, + .vsel_mask = WM8994_LDO1_VSEL_MASK, + .ops = &wm8994_ldo1_ops, + .min_uV = 2400000, + .uV_step = 100000, + .enable_time = 3000, + .off_on_delay = 36000, + .owner = THIS_MODULE, + }, + { + .name = "LDO2", + .id = 2, + .type = REGULATOR_VOLTAGE, + .n_voltages = WM8994_LDO2_MAX_SELECTOR + 1, + .vsel_reg = WM8994_LDO_2, + .vsel_mask = WM8994_LDO2_VSEL_MASK, + .ops = &wm8994_ldo2_ops, + .enable_time = 3000, + .off_on_delay = 36000, + .owner = THIS_MODULE, + }, +}; + +static const struct regulator_desc wm8958_ldo_desc[] = { { .name = "LDO1", .id = 1, @@ -172,9 +201,16 @@ static int wm8994_ldo_probe(struct platform_device *pdev) * regulator core and we need not worry about it on the * error path. */ - ldo->regulator = devm_regulator_register(&pdev->dev, - &wm8994_ldo_desc[id], - &config); + if (ldo->wm8994->type == WM8994) { + ldo->regulator = devm_regulator_register(&pdev->dev, + &wm8994_ldo_desc[id], + &config); + } else { + ldo->regulator = devm_regulator_register(&pdev->dev, + &wm8958_ldo_desc[id], + &config); + } + if (IS_ERR(ldo->regulator)) { ret = PTR_ERR(ldo->regulator); dev_err(wm8994->dev, "Failed to register LDO%d: %d\n",
From: Christophe Leroy christophe.leroy@csgroup.eu
[ Upstream commit 5517d500829c683a358a8de04ecb2e28af629ae5 ]
When a static call is updated with __static_call_return0() as target, arch_static_call_transform() set it to use an optimised set of instructions which are meant to lay in the same cacheline.
But when initialising a static call with DEFINE_STATIC_CALL_RET0(), we get a branch to the real __static_call_return0() function instead of getting the optimised setup:
c00d8120 <__SCT__perf_snapshot_branch_stack>: c00d8120: 4b ff ff f4 b c00d8114 <__static_call_return0> c00d8124: 3d 80 c0 0e lis r12,-16370 c00d8128: 81 8c 81 3c lwz r12,-32452(r12) c00d812c: 7d 89 03 a6 mtctr r12 c00d8130: 4e 80 04 20 bctr c00d8134: 38 60 00 00 li r3,0 c00d8138: 4e 80 00 20 blr c00d813c: 00 00 00 00 .long 0x0
Add ARCH_DEFINE_STATIC_CALL_RET0_TRAMP() defined by each architecture to setup the optimised configuration, and rework DEFINE_STATIC_CALL_RET0() to call it:
c00d8120 <__SCT__perf_snapshot_branch_stack>: c00d8120: 48 00 00 14 b c00d8134 <__SCT__perf_snapshot_branch_stack+0x14> c00d8124: 3d 80 c0 0e lis r12,-16370 c00d8128: 81 8c 81 3c lwz r12,-32452(r12) c00d812c: 7d 89 03 a6 mtctr r12 c00d8130: 4e 80 04 20 bctr c00d8134: 38 60 00 00 li r3,0 c00d8138: 4e 80 00 20 blr c00d813c: 00 00 00 00 .long 0x0
Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Josh Poimboeuf jpoimboe@redhat.com Link: https://lore.kernel.org/r/1e0a61a88f52a460f62a58ffc2a5f847d1f7d9d8.164725345... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/include/asm/static_call.h | 1 + arch/x86/include/asm/static_call.h | 2 ++ include/linux/static_call.h | 20 +++++++++++++++++--- 3 files changed, 20 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/include/asm/static_call.h b/arch/powerpc/include/asm/static_call.h index 0a0bc79bd1fa..de1018cc522b 100644 --- a/arch/powerpc/include/asm/static_call.h +++ b/arch/powerpc/include/asm/static_call.h @@ -24,5 +24,6 @@
#define ARCH_DEFINE_STATIC_CALL_TRAMP(name, func) __PPC_SCT(name, "b " #func) #define ARCH_DEFINE_STATIC_CALL_NULL_TRAMP(name) __PPC_SCT(name, "blr") +#define ARCH_DEFINE_STATIC_CALL_RET0_TRAMP(name) __PPC_SCT(name, "b .+20")
#endif /* _ASM_POWERPC_STATIC_CALL_H */ diff --git a/arch/x86/include/asm/static_call.h b/arch/x86/include/asm/static_call.h index ed4f8bb6c2d9..2455d721503e 100644 --- a/arch/x86/include/asm/static_call.h +++ b/arch/x86/include/asm/static_call.h @@ -38,6 +38,8 @@ #define ARCH_DEFINE_STATIC_CALL_NULL_TRAMP(name) \ __ARCH_DEFINE_STATIC_CALL_TRAMP(name, "ret; int3; nop; nop; nop")
+#define ARCH_DEFINE_STATIC_CALL_RET0_TRAMP(name) \ + ARCH_DEFINE_STATIC_CALL_TRAMP(name, __static_call_return0)
#define ARCH_ADD_TRAMP_KEY(name) \ asm(".pushsection .static_call_tramp_key, "a" \n" \ diff --git a/include/linux/static_call.h b/include/linux/static_call.h index fcc5b48989b3..3c50b0fdda16 100644 --- a/include/linux/static_call.h +++ b/include/linux/static_call.h @@ -196,6 +196,14 @@ extern long __static_call_return0(void); }; \ ARCH_DEFINE_STATIC_CALL_NULL_TRAMP(name)
+#define DEFINE_STATIC_CALL_RET0(name, _func) \ + DECLARE_STATIC_CALL(name, _func); \ + struct static_call_key STATIC_CALL_KEY(name) = { \ + .func = __static_call_return0, \ + .type = 1, \ + }; \ + ARCH_DEFINE_STATIC_CALL_RET0_TRAMP(name) + #define static_call_cond(name) (void)__static_call(name)
#define EXPORT_STATIC_CALL(name) \ @@ -231,6 +239,12 @@ static inline int static_call_init(void) { return 0; } }; \ ARCH_DEFINE_STATIC_CALL_NULL_TRAMP(name)
+#define DEFINE_STATIC_CALL_RET0(name, _func) \ + DECLARE_STATIC_CALL(name, _func); \ + struct static_call_key STATIC_CALL_KEY(name) = { \ + .func = __static_call_return0, \ + }; \ + ARCH_DEFINE_STATIC_CALL_RET0_TRAMP(name)
#define static_call_cond(name) (void)__static_call(name)
@@ -284,6 +298,9 @@ static inline long __static_call_return0(void) .func = NULL, \ }
+#define DEFINE_STATIC_CALL_RET0(name, _func) \ + __DEFINE_STATIC_CALL(name, _func, __static_call_return0) + static inline void __static_call_nop(void) { }
/* @@ -327,7 +344,4 @@ static inline int static_call_text_reserved(void *start, void *end) #define DEFINE_STATIC_CALL(name, _func) \ __DEFINE_STATIC_CALL(name, _func, _func)
-#define DEFINE_STATIC_CALL_RET0(name, _func) \ - __DEFINE_STATIC_CALL(name, _func, __static_call_return0) - #endif /* _LINUX_STATIC_CALL_H */
From: Joey Gouly joey.gouly@arm.com
[ Upstream commit a2c0b0fbe01419f8f5d1c0b9c581631f34ffce8b ]
The alternatives code must be `noinstr` such that it does not patch itself, as the cache invalidation is only performed after all the alternatives have been applied.
Mark patch_alternative() as `noinstr`. Mark branch_insn_requires_update() and get_alt_insn() with `__always_inline` since they are both only called through patch_alternative().
Booting a kernel in QEMU TCG with KCSAN=y and ARM64_USE_LSE_ATOMICS=y caused a boot hang: [ 0.241121] CPU: All CPU(s) started at EL2
The alternatives code was patching the atomics in __tsan_read4() from LL/SC atomics to LSE atomics.
The following fragment is using LL/SC atomics in the .text section: | <__tsan_unaligned_read4+304>: ldxr x6, [x2] | <__tsan_unaligned_read4+308>: add x6, x6, x5 | <__tsan_unaligned_read4+312>: stxr w7, x6, [x2] | <__tsan_unaligned_read4+316>: cbnz w7, <__tsan_unaligned_read4+304>
This LL/SC atomic sequence was to be replaced with LSE atomics. However since the alternatives code was instrumentable, __tsan_read4() was being called after only the first instruction was replaced, which led to the following code in memory: | <__tsan_unaligned_read4+304>: ldadd x5, x6, [x2] | <__tsan_unaligned_read4+308>: add x6, x6, x5 | <__tsan_unaligned_read4+312>: stxr w7, x6, [x2] | <__tsan_unaligned_read4+316>: cbnz w7, <__tsan_unaligned_read4+304>
This caused an infinite loop as the `stxr` instruction never completed successfully, so `w7` was always 0.
Signed-off-by: Joey Gouly joey.gouly@arm.com Cc: Mark Rutland mark.rutland@arm.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will@kernel.org Link: https://lore.kernel.org/r/20220405104733.11476-1-joey.gouly@arm.com Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/kernel/alternative.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/arch/arm64/kernel/alternative.c b/arch/arm64/kernel/alternative.c index 3fb79b76e9d9..7bbf5104b7b7 100644 --- a/arch/arm64/kernel/alternative.c +++ b/arch/arm64/kernel/alternative.c @@ -42,7 +42,7 @@ bool alternative_is_applied(u16 cpufeature) /* * Check if the target PC is within an alternative block. */ -static bool branch_insn_requires_update(struct alt_instr *alt, unsigned long pc) +static __always_inline bool branch_insn_requires_update(struct alt_instr *alt, unsigned long pc) { unsigned long replptr = (unsigned long)ALT_REPL_PTR(alt); return !(pc >= replptr && pc <= (replptr + alt->alt_len)); @@ -50,7 +50,7 @@ static bool branch_insn_requires_update(struct alt_instr *alt, unsigned long pc)
#define align_down(x, a) ((unsigned long)(x) & ~(((unsigned long)(a)) - 1))
-static u32 get_alt_insn(struct alt_instr *alt, __le32 *insnptr, __le32 *altinsnptr) +static __always_inline u32 get_alt_insn(struct alt_instr *alt, __le32 *insnptr, __le32 *altinsnptr) { u32 insn;
@@ -95,7 +95,7 @@ static u32 get_alt_insn(struct alt_instr *alt, __le32 *insnptr, __le32 *altinsnp return insn; }
-static void patch_alternative(struct alt_instr *alt, +static noinstr void patch_alternative(struct alt_instr *alt, __le32 *origptr, __le32 *updptr, int nr_inst) { __le32 *replptr;
From: Steve Capper steve.capper@arm.com
[ Upstream commit 697a1d44af8ba0477ee729e632f4ade37999249a ]
tlb_remove_huge_tlb_entry only considers PMD_SIZE and PUD_SIZE when updating the mmu_gather structure.
Unfortunately on arm64 there are two additional huge page sizes that need to be covered: CONT_PTE_SIZE and CONT_PMD_SIZE. Where an end-user attempts to employ contiguous huge pages, a VM_BUG_ON can be experienced due to the fact that the tlb structure hasn't been correctly updated by the relevant tlb_flush_p.._range() call from tlb_remove_huge_tlb_entry.
This patch adds inequality logic to the generic implementation of tlb_remove_huge_tlb_entry s.t. CONT_PTE_SIZE and CONT_PMD_SIZE are effectively covered on arm64. Also, as well as ptes, pmds and puds; p4ds are now considered too.
Reported-by: David Hildenbrand david@redhat.com Suggested-by: Peter Zijlstra (Intel) peterz@infradead.org Cc: Anshuman Khandual anshuman.khandual@arm.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will@kernel.org Link: https://lore.kernel.org/linux-mm/811c5c8e-b3a2-85d2-049c-717f17c3a03a@redhat... Signed-off-by: Steve Capper steve.capper@arm.com Acked-by: David Hildenbrand david@redhat.com Reviewed-by: Anshuman Khandual anshuman.khandual@arm.com Reviewed-by: Catalin Marinas catalin.marinas@arm.com Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lore.kernel.org/r/20220330112543.863-1-steve.capper@arm.com Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/asm-generic/tlb.h | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h index 2c68a545ffa7..71942a1c642d 100644 --- a/include/asm-generic/tlb.h +++ b/include/asm-generic/tlb.h @@ -565,10 +565,14 @@ static inline void tlb_flush_p4d_range(struct mmu_gather *tlb, #define tlb_remove_huge_tlb_entry(h, tlb, ptep, address) \ do { \ unsigned long _sz = huge_page_size(h); \ - if (_sz == PMD_SIZE) \ - tlb_flush_pmd_range(tlb, address, _sz); \ - else if (_sz == PUD_SIZE) \ + if (_sz >= P4D_SIZE) \ + tlb_flush_p4d_range(tlb, address, _sz); \ + else if (_sz >= PUD_SIZE) \ tlb_flush_pud_range(tlb, address, _sz); \ + else if (_sz >= PMD_SIZE) \ + tlb_flush_pmd_range(tlb, address, _sz); \ + else \ + tlb_flush_pte_range(tlb, address, _sz); \ __tlb_remove_tlb_entry(tlb, ptep, address); \ } while (0)
From: Andy Chiu andy.chiu@sifive.com
[ Upstream commit d1c4f93e3f0a023024a6f022a61528c06cf1daa9 ]
The call to axienet_mdio_setup should not depend on whether "phy-node" pressents on the DT. Besides, since `lp->phy_node` is used if PHY is in SGMII or 100Base-X modes, move it into the if statement. And the next patch will remove `lp->phy_node` from driver's private structure and do an of_node_put on it right away after use since it is not used elsewhere.
Signed-off-by: Andy Chiu andy.chiu@sifive.com Reviewed-by: Greentime Hu greentime.hu@sifive.com Reviewed-by: Robert Hancock robert.hancock@calian.com Reviewed-by: Radhey Shyam Pandey radhey.shyam.pandey@xilinx.com Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/xilinx/xilinx_axienet_main.c | 13 ++++++------- 1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c index 90d96eb79984..a960227f61da 100644 --- a/drivers/net/ethernet/xilinx/xilinx_axienet_main.c +++ b/drivers/net/ethernet/xilinx/xilinx_axienet_main.c @@ -2072,15 +2072,14 @@ static int axienet_probe(struct platform_device *pdev) if (ret) goto cleanup_clk;
- lp->phy_node = of_parse_phandle(pdev->dev.of_node, "phy-handle", 0); - if (lp->phy_node) { - ret = axienet_mdio_setup(lp); - if (ret) - dev_warn(&pdev->dev, - "error registering MDIO bus: %d\n", ret); - } + ret = axienet_mdio_setup(lp); + if (ret) + dev_warn(&pdev->dev, + "error registering MDIO bus: %d\n", ret); + if (lp->phy_mode == PHY_INTERFACE_MODE_SGMII || lp->phy_mode == PHY_INTERFACE_MODE_1000BASEX) { + lp->phy_node = of_parse_phandle(pdev->dev.of_node, "phy-handle", 0); if (!lp->phy_node) { dev_err(&pdev->dev, "phy-handle required for 1000BaseX/SGMII\n"); ret = -EINVAL;
From: Boqun Feng boqun.feng@gmail.com
[ Upstream commit be5802795cf8d0b881745fa9ba7790293b382280 ]
Currently there are known potential issues for balloon and hot-add on ARM64:
* Unballoon requests from Hyper-V should only unballoon ranges that are guest page size aligned, otherwise guests cannot handle because it's impossible to partially free a page. This is a problem when guest page size > 4096 bytes.
* Memory hot-add requests from Hyper-V should provide the NUMA node id of the added ranges or ARM64 should have a functional memory_add_physaddr_to_nid(), otherwise the node id is missing for add_memory().
These issues require discussions on design and implementation. In the meanwhile, post_status() is working and essential to guest monitoring. Therefore instead of disabling the entire hv_balloon driver, the ballooning (when page size > 4096 bytes) and hot-add are disabled accordingly for now. Once the issues are fixed, they can be re-enable in these cases.
Signed-off-by: Boqun Feng boqun.feng@gmail.com Reviewed-by: Michael Kelley mikelley@microsoft.com Link: https://lore.kernel.org/r/20220325023212.1570049-3-boqun.feng@gmail.com Signed-off-by: Wei Liu wei.liu@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hv/hv_balloon.c | 36 ++++++++++++++++++++++++++++++++++-- 1 file changed, 34 insertions(+), 2 deletions(-)
diff --git a/drivers/hv/hv_balloon.c b/drivers/hv/hv_balloon.c index 439f99b8b5de..3cf334c46c31 100644 --- a/drivers/hv/hv_balloon.c +++ b/drivers/hv/hv_balloon.c @@ -1653,6 +1653,38 @@ static void disable_page_reporting(void) } }
+static int ballooning_enabled(void) +{ + /* + * Disable ballooning if the page size is not 4k (HV_HYP_PAGE_SIZE), + * since currently it's unclear to us whether an unballoon request can + * make sure all page ranges are guest page size aligned. + */ + if (PAGE_SIZE != HV_HYP_PAGE_SIZE) { + pr_info("Ballooning disabled because page size is not 4096 bytes\n"); + return 0; + } + + return 1; +} + +static int hot_add_enabled(void) +{ + /* + * Disable hot add on ARM64, because we currently rely on + * memory_add_physaddr_to_nid() to get a node id of a hot add range, + * however ARM64's memory_add_physaddr_to_nid() always return 0 and + * DM_MEM_HOT_ADD_REQUEST doesn't have the NUMA node information for + * add_memory(). + */ + if (IS_ENABLED(CONFIG_ARM64)) { + pr_info("Memory hot add disabled on ARM64\n"); + return 0; + } + + return 1; +} + static int balloon_connect_vsp(struct hv_device *dev) { struct dm_version_request version_req; @@ -1724,8 +1756,8 @@ static int balloon_connect_vsp(struct hv_device *dev) * currently still requires the bits to be set, so we have to add code * to fail the host's hot-add and balloon up/down requests, if any. */ - cap_msg.caps.cap_bits.balloon = 1; - cap_msg.caps.cap_bits.hot_add = 1; + cap_msg.caps.cap_bits.balloon = ballooning_enabled(); + cap_msg.caps.cap_bits.hot_add = hot_add_enabled();
/* * Specify our alignment requirements as it relates
From: Marcin Kozlowski marcinguy@gmail.com
[ Upstream commit afb8e246527536848b9b4025b40e613edf776a9d ]
aqc111_rx_fixup() contains several out-of-bounds accesses that can be triggered by a malicious (or defective) USB device, in particular:
- The metadata array (desc_offset..desc_offset+2*pkt_count) can be out of bounds, causing OOB reads and (on big-endian systems) OOB endianness flips. - A packet can overlap the metadata array, causing a later OOB endianness flip to corrupt data used by a cloned SKB that has already been handed off into the network stack. - A packet SKB can be constructed whose tail is far beyond its end, causing out-of-bounds heap data to be considered part of the SKB's data.
Found doing variant analysis. Tested it with another driver (ax88179_178a), since I don't have a aqc111 device to test it, but the code looks very similar.
Signed-off-by: Marcin Kozlowski marcinguy@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/usb/aqc111.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/net/usb/aqc111.c b/drivers/net/usb/aqc111.c index ea06d10e1c21..ca409d450a29 100644 --- a/drivers/net/usb/aqc111.c +++ b/drivers/net/usb/aqc111.c @@ -1102,10 +1102,15 @@ static int aqc111_rx_fixup(struct usbnet *dev, struct sk_buff *skb) if (start_of_descs != desc_offset) goto err;
- /* self check desc_offset from header*/ - if (desc_offset >= skb_len) + /* self check desc_offset from header and make sure that the + * bounds of the metadata array are inside the SKB + */ + if (pkt_count * 2 + desc_offset >= skb_len) goto err;
+ /* Packets must not overlap the metadata array */ + skb_trim(skb, desc_offset); + if (pkt_count == 0) goto err;
From: Xiaomeng Tong xiam0nd.tong@gmail.com
[ Upstream commit b423e54ba965b4469b48e46fd16941f1e1701697 ]
All remaining skbs should be released when myri10ge_xmit fails to transmit a packet. Fix it within another skb_list_walk_safe.
Signed-off-by: Xiaomeng Tong xiam0nd.tong@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/myricom/myri10ge/myri10ge.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/myricom/myri10ge/myri10ge.c b/drivers/net/ethernet/myricom/myri10ge/myri10ge.c index 50ac3ee2577a..21d2645885ce 100644 --- a/drivers/net/ethernet/myricom/myri10ge/myri10ge.c +++ b/drivers/net/ethernet/myricom/myri10ge/myri10ge.c @@ -2903,11 +2903,9 @@ static netdev_tx_t myri10ge_sw_tso(struct sk_buff *skb, status = myri10ge_xmit(curr, dev); if (status != 0) { dev_kfree_skb_any(curr); - if (segs != NULL) { - curr = segs; - segs = next; + skb_list_walk_safe(next, curr, next) { curr->next = NULL; - dev_kfree_skb_any(segs); + dev_kfree_skb_any(curr); } goto drop; }
From: Matthias Schiffer matthias.schiffer@ew.tq-group.com
[ Upstream commit 97e4827d775faa9a32b5e1a97959c69dd77d17a3 ]
cqspi_set_protocol() only set the data width, but ignored the command and address width (except for 8-8-8 DTR ops), leading to corruption of all transfers using 1-X-X or X-X-X ops. Fix by setting the other two widths as well.
While we're at it, simplify the code a bit by replacing the CQSPI_INST_TYPE_* constants with ilog2().
Tested on a TI AM64x with a Macronix MX25U51245G QSPI flash with 1-4-4 read and write operations.
Signed-off-by: Matthias Schiffer matthias.schiffer@ew.tq-group.com Link: https://lore.kernel.org/r/20220331110819.133392-1-matthias.schiffer@ew.tq-gr... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/spi/spi-cadence-quadspi.c | 46 ++++++++----------------------- 1 file changed, 12 insertions(+), 34 deletions(-)
diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c index b808c94641fa..75f356041138 100644 --- a/drivers/spi/spi-cadence-quadspi.c +++ b/drivers/spi/spi-cadence-quadspi.c @@ -19,6 +19,7 @@ #include <linux/iopoll.h> #include <linux/jiffies.h> #include <linux/kernel.h> +#include <linux/log2.h> #include <linux/module.h> #include <linux/of_device.h> #include <linux/of.h> @@ -102,12 +103,6 @@ struct cqspi_driver_platdata { #define CQSPI_TIMEOUT_MS 500 #define CQSPI_READ_TIMEOUT_MS 10
-/* Instruction type */ -#define CQSPI_INST_TYPE_SINGLE 0 -#define CQSPI_INST_TYPE_DUAL 1 -#define CQSPI_INST_TYPE_QUAD 2 -#define CQSPI_INST_TYPE_OCTAL 3 - #define CQSPI_DUMMY_CLKS_PER_BYTE 8 #define CQSPI_DUMMY_BYTES_MAX 4 #define CQSPI_DUMMY_CLKS_MAX 31 @@ -376,10 +371,6 @@ static unsigned int cqspi_calc_dummy(const struct spi_mem_op *op, bool dtr) static int cqspi_set_protocol(struct cqspi_flash_pdata *f_pdata, const struct spi_mem_op *op) { - f_pdata->inst_width = CQSPI_INST_TYPE_SINGLE; - f_pdata->addr_width = CQSPI_INST_TYPE_SINGLE; - f_pdata->data_width = CQSPI_INST_TYPE_SINGLE; - /* * For an op to be DTR, cmd phase along with every other non-empty * phase should have dtr field set to 1. If an op phase has zero @@ -389,32 +380,23 @@ static int cqspi_set_protocol(struct cqspi_flash_pdata *f_pdata, (!op->addr.nbytes || op->addr.dtr) && (!op->data.nbytes || op->data.dtr);
- switch (op->data.buswidth) { - case 0: - break; - case 1: - f_pdata->data_width = CQSPI_INST_TYPE_SINGLE; - break; - case 2: - f_pdata->data_width = CQSPI_INST_TYPE_DUAL; - break; - case 4: - f_pdata->data_width = CQSPI_INST_TYPE_QUAD; - break; - case 8: - f_pdata->data_width = CQSPI_INST_TYPE_OCTAL; - break; - default: - return -EINVAL; - } + f_pdata->inst_width = 0; + if (op->cmd.buswidth) + f_pdata->inst_width = ilog2(op->cmd.buswidth); + + f_pdata->addr_width = 0; + if (op->addr.buswidth) + f_pdata->addr_width = ilog2(op->addr.buswidth); + + f_pdata->data_width = 0; + if (op->data.buswidth) + f_pdata->data_width = ilog2(op->data.buswidth);
/* Right now we only support 8-8-8 DTR mode. */ if (f_pdata->dtr) { switch (op->cmd.buswidth) { case 0: - break; case 8: - f_pdata->inst_width = CQSPI_INST_TYPE_OCTAL; break; default: return -EINVAL; @@ -422,9 +404,7 @@ static int cqspi_set_protocol(struct cqspi_flash_pdata *f_pdata,
switch (op->addr.buswidth) { case 0: - break; case 8: - f_pdata->addr_width = CQSPI_INST_TYPE_OCTAL; break; default: return -EINVAL; @@ -432,9 +412,7 @@ static int cqspi_set_protocol(struct cqspi_flash_pdata *f_pdata,
switch (op->data.buswidth) { case 0: - break; case 8: - f_pdata->data_width = CQSPI_INST_TYPE_OCTAL; break; default: return -EINVAL;
From: Chris Park Chris.Park@amd.com
[ Upstream commit 862a876c3a6372f2fa9d0c6510f1976ac94fc857 ]
[Why] Once DSC slice cannot fit pixel clock, we incorrectly reset min slices to 0 and allow max slice to operate, even when max slice itself cannot fit the pixel clock properly.
[How] Change the sequence such that we correctly determine DSC is not possible when both min slices and max slices cannot fit pixel clock per slice.
Reviewed-by: Wenjing Liu Wenjing.Liu@amd.com Acked-by: Alex Hung alex.hung@amd.com Signed-off-by: Chris Park Chris.Park@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/dc/dsc/dc_dsc.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dsc/dc_dsc.c b/drivers/gpu/drm/amd/display/dc/dsc/dc_dsc.c index 9c74564cbd8d..8973d3a38f9c 100644 --- a/drivers/gpu/drm/amd/display/dc/dsc/dc_dsc.c +++ b/drivers/gpu/drm/amd/display/dc/dsc/dc_dsc.c @@ -864,11 +864,11 @@ static bool setup_dsc_config( min_slices_h = inc_num_slices(dsc_common_caps.slice_caps, min_slices_h); }
+ is_dsc_possible = (min_slices_h <= max_slices_h); + if (pic_width % min_slices_h != 0) min_slices_h = 0; // DSC TODO: Maybe try increasing the number of slices first?
- is_dsc_possible = (min_slices_h <= max_slices_h); - if (min_slices_h == 0 && max_slices_h == 0) is_dsc_possible = false;
From: Roman Li Roman.Li@amd.com
[ Upstream commit 58e16c752e9540b28a873c44c3bee83e022007c1 ]
[Why] In init_hw() we call init_pipes() before enabling power gating. init_pipes() tries to power gate dsc but it may fail because required force-ons are not released yet. As a result with dsc config the following errors observed on resume: "REG_WAIT timeout 1us * 1000 tries - dcn20_dsc_pg_control" "REG_WAIT timeout 1us * 1000 tries - dcn20_dpp_pg_control" "REG_WAIT timeout 1us * 1000 tries - dcn20_hubp_pg_control"
[How] Move enable_power_gating_plane() before init_pipes() in init_hw()
Reviewed-by: Anthony Koo Anthony.Koo@amd.com Reviewed-by: Eric Yang Eric.Yang2@amd.com Acked-by: Alex Hung alex.hung@amd.com Signed-off-by: Roman Li Roman.Li@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c | 5 +++-- drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c | 5 +++-- drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hwseq.c | 5 +++-- 3 files changed, 9 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c index a6703d17fcb6..c6a0daa56fa0 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c +++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c @@ -1501,6 +1501,9 @@ void dcn10_init_hw(struct dc *dc) if (dc->config.power_down_display_on_boot) dc_link_blank_all_dp_displays(dc);
+ if (hws->funcs.enable_power_gating_plane) + hws->funcs.enable_power_gating_plane(dc->hwseq, true); + /* If taking control over from VBIOS, we may want to optimize our first * mode set, so we need to skip powering down pipes until we know which * pipes we want to use. @@ -1553,8 +1556,6 @@ void dcn10_init_hw(struct dc *dc)
REG_UPDATE(DCFCLK_CNTL, DCFCLK_GATE_DIS, 0); } - if (hws->funcs.enable_power_gating_plane) - hws->funcs.enable_power_gating_plane(dc->hwseq, true);
if (dc->clk_mgr->funcs->notify_wm_ranges) dc->clk_mgr->funcs->notify_wm_ranges(dc->clk_mgr); diff --git a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c index 1db1ca19411d..05dc0a3ae2a3 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/dcn30/dcn30_hwseq.c @@ -548,6 +548,9 @@ void dcn30_init_hw(struct dc *dc) if (dc->config.power_down_display_on_boot) dc_link_blank_all_dp_displays(dc);
+ if (hws->funcs.enable_power_gating_plane) + hws->funcs.enable_power_gating_plane(dc->hwseq, true); + /* If taking control over from VBIOS, we may want to optimize our first * mode set, so we need to skip powering down pipes until we know which * pipes we want to use. @@ -625,8 +628,6 @@ void dcn30_init_hw(struct dc *dc)
REG_UPDATE(DCFCLK_CNTL, DCFCLK_GATE_DIS, 0); } - if (hws->funcs.enable_power_gating_plane) - hws->funcs.enable_power_gating_plane(dc->hwseq, true);
if (!dcb->funcs->is_accelerated_mode(dcb) && dc->res_pool->hubbub->funcs->init_watermarks) dc->res_pool->hubbub->funcs->init_watermarks(dc->res_pool->hubbub); diff --git a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hwseq.c b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hwseq.c index 1e156f398065..bdc4467b40d7 100644 --- a/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/dcn31/dcn31_hwseq.c @@ -200,6 +200,9 @@ void dcn31_init_hw(struct dc *dc) if (dc->config.power_down_display_on_boot) dc_link_blank_all_dp_displays(dc);
+ if (hws->funcs.enable_power_gating_plane) + hws->funcs.enable_power_gating_plane(dc->hwseq, true); + /* If taking control over from VBIOS, we may want to optimize our first * mode set, so we need to skip powering down pipes until we know which * pipes we want to use. @@ -249,8 +252,6 @@ void dcn31_init_hw(struct dc *dc)
REG_UPDATE(DCFCLK_CNTL, DCFCLK_GATE_DIS, 0); } - if (hws->funcs.enable_power_gating_plane) - hws->funcs.enable_power_gating_plane(dc->hwseq, true);
if (!dcb->funcs->is_accelerated_mode(dcb) && dc->res_pool->hubbub->funcs->init_watermarks) dc->res_pool->hubbub->funcs->init_watermarks(dc->res_pool->hubbub);
From: Martin Leung Martin.Leung@amd.com
[ Upstream commit b2075fce104b88b789c15ef1ed2b91dc94198e26 ]
why and how: causes failure on install on certain machines
Reviewed-by: George Shen George.Shen@amd.com Acked-by: Alex Hung alex.hung@amd.com Signed-off-by: Martin Leung Martin.Leung@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/dc/core/dc.c | 4 ---- 1 file changed, 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/dc/core/dc.c b/drivers/gpu/drm/amd/display/dc/core/dc.c index 78e17a4af4ab..62bc6ce88753 100644 --- a/drivers/gpu/drm/amd/display/dc/core/dc.c +++ b/drivers/gpu/drm/amd/display/dc/core/dc.c @@ -1495,10 +1495,6 @@ bool dc_validate_seamless_boot_timing(const struct dc *dc, if (!link->link_enc->funcs->is_dig_enabled(link->link_enc)) return false;
- /* Check for FEC status*/ - if (link->link_enc->funcs->fec_is_active(link->link_enc)) - return false; - enc_inst = link->link_enc->funcs->get_dig_frontend(link->link_enc);
if (enc_inst == ENGINE_ID_UNKNOWN)
From: Roman Li Roman.Li@amd.com
[ Upstream commit f4346fb3edf7720db3f7f5e1cab1f667cd024280 ]
[Why] On resume we do link detection for all non-MST connectors. MST is handled separately. However the condition for telling if connector is on mst branch is not enough for mst hub case. Link detection for mst branch link leads to mst topology reset. That causes assert in dc_link_allocate_mst_payload()
[How] Use link type as indicator for mst link.
Reviewed-by: Wayne Lin Wayne.Lin@amd.com Acked-by: Alex Hung alex.hung@amd.com Signed-off-by: Roman Li Roman.Li@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index 90c017859ad4..24db2297857b 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -2693,7 +2693,8 @@ static int dm_resume(void *handle) * this is the case when traversing through already created * MST connectors, should be skipped */ - if (aconnector->mst_port) + if (aconnector->dc_link && + aconnector->dc_link->type == dc_connection_mst_branch) continue;
mutex_lock(&aconnector->hpd_lock);
From: Christoph Böhmwalder christoph@boehmwalder.at
[ Upstream commit 286901941fd18a52b2138fddbbf589ad3639eb00 ]
We want our pages not to change while they are being written.
Signed-off-by: Christoph Böhmwalder christoph.boehmwalder@linbit.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/drbd/drbd_main.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/block/drbd/drbd_main.c b/drivers/block/drbd/drbd_main.c index 5d5beeba3ed4..478ba959362c 100644 --- a/drivers/block/drbd/drbd_main.c +++ b/drivers/block/drbd/drbd_main.c @@ -2739,6 +2739,7 @@ enum drbd_ret_code drbd_create_device(struct drbd_config_context *adm_ctx, unsig sprintf(disk->disk_name, "drbd%d", minor); disk->private_data = device;
+ blk_queue_flag_set(QUEUE_FLAG_STABLE_WRITES, disk->queue); blk_queue_write_cache(disk->queue, true, true); /* Setting the max_hw_sectors to an odd value of 8kibyte here This triggers a max_bio_size message upon first attach or connect */
From: Sreekanth Reddy sreekanth.reddy@broadcom.com
[ Upstream commit f61eb1216c959f93ffabd3b8781fa5b2b22f8907 ]
As part of controller reset operation the driver issues a config request command. If this command gets times out, then fail the controller reset operation instead of retrying it.
Link: https://lore.kernel.org/r/20220405120637.20528-1-sreekanth.reddy@broadcom.co... Signed-off-by: Sreekanth Reddy sreekanth.reddy@broadcom.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/mpt3sas/mpt3sas_config.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/scsi/mpt3sas/mpt3sas_config.c b/drivers/scsi/mpt3sas/mpt3sas_config.c index 0563078227de..a8dd14c91efd 100644 --- a/drivers/scsi/mpt3sas/mpt3sas_config.c +++ b/drivers/scsi/mpt3sas/mpt3sas_config.c @@ -394,10 +394,13 @@ _config_request(struct MPT3SAS_ADAPTER *ioc, Mpi2ConfigRequest_t retry_count++; if (ioc->config_cmds.smid == smid) mpt3sas_base_free_smid(ioc, smid); - if ((ioc->shost_recovery) || (ioc->config_cmds.status & - MPT3_CMD_RESET) || ioc->pci_error_recovery) + if (ioc->config_cmds.status & MPT3_CMD_RESET) goto retry_config; - issue_host_reset = 1; + if (ioc->shost_recovery || ioc->pci_error_recovery) { + issue_host_reset = 0; + r = -EFAULT; + } else + issue_host_reset = 1; goto free_mem; }
From: Alexey Galakhov agalakhov@gmail.com
[ Upstream commit 5f2bce1e222028dc1c15f130109a17aa654ae6e8 ]
The HighPoint RocketRaid 2640 is a low-cost SAS controller based on Marvell chip. The chip in question was already supported by the kernel, just the PCI ID of this particular board was missing.
Link: https://lore.kernel.org/r/20220309212535.402987-1-agalakhov@gmail.com Signed-off-by: Alexey Galakhov agalakhov@gmail.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/mvsas/mv_init.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/scsi/mvsas/mv_init.c b/drivers/scsi/mvsas/mv_init.c index 44df7c03aab8..605a8eb7344a 100644 --- a/drivers/scsi/mvsas/mv_init.c +++ b/drivers/scsi/mvsas/mv_init.c @@ -646,6 +646,7 @@ static struct pci_device_id mvs_pci_table[] = { { PCI_VDEVICE(ARECA, PCI_DEVICE_ID_ARECA_1300), chip_1300 }, { PCI_VDEVICE(ARECA, PCI_DEVICE_ID_ARECA_1320), chip_1320 }, { PCI_VDEVICE(ADAPTEC2, 0x0450), chip_6440 }, + { PCI_VDEVICE(TTI, 0x2640), chip_6440 }, { PCI_VDEVICE(TTI, 0x2710), chip_9480 }, { PCI_VDEVICE(TTI, 0x2720), chip_9480 }, { PCI_VDEVICE(TTI, 0x2721), chip_9480 },
From: Chandrakanth patil chandrakanth.patil@broadcom.com
[ Upstream commit 56495f295d8e021f77d065b890fc0100e3f9f6d8 ]
The megaraid_sas driver supports single LUN for RAID devices. That is LUN 0. All other LUNs are unsupported. When a device scan on a logical target with invalid LUN number is invoked through sysfs, that target ends up getting removed.
Add LUN ID validation in the slave destroy function to avoid the target deletion.
Link: https://lore.kernel.org/r/20220324094711.48833-1-chandrakanth.patil@broadcom... Signed-off-by: Chandrakanth patil chandrakanth.patil@broadcom.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/megaraid/megaraid_sas.h | 3 +++ drivers/scsi/megaraid/megaraid_sas_base.c | 7 +++++++ 2 files changed, 10 insertions(+)
diff --git a/drivers/scsi/megaraid/megaraid_sas.h b/drivers/scsi/megaraid/megaraid_sas.h index 2c9d1b796475..ae2aef9ba8cf 100644 --- a/drivers/scsi/megaraid/megaraid_sas.h +++ b/drivers/scsi/megaraid/megaraid_sas.h @@ -2558,6 +2558,9 @@ struct megasas_instance_template { #define MEGASAS_IS_LOGICAL(sdev) \ ((sdev->channel < MEGASAS_MAX_PD_CHANNELS) ? 0 : 1)
+#define MEGASAS_IS_LUN_VALID(sdev) \ + (((sdev)->lun == 0) ? 1 : 0) + #define MEGASAS_DEV_INDEX(scp) \ (((scp->device->channel % 2) * MEGASAS_MAX_DEV_PER_CHANNEL) + \ scp->device->id) diff --git a/drivers/scsi/megaraid/megaraid_sas_base.c b/drivers/scsi/megaraid/megaraid_sas_base.c index 82e1e24257bc..ca563498dcdb 100644 --- a/drivers/scsi/megaraid/megaraid_sas_base.c +++ b/drivers/scsi/megaraid/megaraid_sas_base.c @@ -2126,6 +2126,9 @@ static int megasas_slave_alloc(struct scsi_device *sdev) goto scan_target; } return -ENXIO; + } else if (!MEGASAS_IS_LUN_VALID(sdev)) { + sdev_printk(KERN_INFO, sdev, "%s: invalid LUN\n", __func__); + return -ENXIO; }
scan_target: @@ -2156,6 +2159,10 @@ static void megasas_slave_destroy(struct scsi_device *sdev) instance = megasas_lookup_instance(sdev->host->host_no);
if (MEGASAS_IS_LOGICAL(sdev)) { + if (!MEGASAS_IS_LUN_VALID(sdev)) { + sdev_printk(KERN_INFO, sdev, "%s: invalid LUN\n", __func__); + return; + } ld_tgt_id = MEGASAS_TARGET_ID(sdev); instance->ld_tgtid_status[ld_tgt_id] = LD_TARGET_ID_DELETED; if (megasas_dbg_lvl & LD_PD_DEBUG)
From: Duoming Zhou duoming@zju.edu.cn
[ Upstream commit ec4eb8a86ade4d22633e1da2a7d85a846b7d1798 ]
When a slip driver is detaching, the slip_close() will act to cleanup necessary resources and sl->tty is set to NULL in slip_close(). Meanwhile, the packet we transmit is blocked, sl_tx_timeout() will be called. Although slip_close() and sl_tx_timeout() use sl->lock to synchronize, we don`t judge whether sl->tty equals to NULL in sl_tx_timeout() and the null pointer dereference bug will happen.
(Thread 1) | (Thread 2) | slip_close() | spin_lock_bh(&sl->lock) | ... ... | sl->tty = NULL //(1) sl_tx_timeout() | spin_unlock_bh(&sl->lock) spin_lock(&sl->lock); | ... | ... tty_chars_in_buffer(sl->tty)| if (tty->ops->..) //(2) | ... | synchronize_rcu()
We set NULL to sl->tty in position (1) and dereference sl->tty in position (2).
This patch adds check in sl_tx_timeout(). If sl->tty equals to NULL, sl_tx_timeout() will goto out.
Signed-off-by: Duoming Zhou duoming@zju.edu.cn Reviewed-by: Jiri Slaby jirislaby@kernel.org Link: https://lore.kernel.org/r/20220405132206.55291-1-duoming@zju.edu.cn Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/slip/slip.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/slip/slip.c b/drivers/net/slip/slip.c index 98f586f910fb..8ed4fcf70b9b 100644 --- a/drivers/net/slip/slip.c +++ b/drivers/net/slip/slip.c @@ -469,7 +469,7 @@ static void sl_tx_timeout(struct net_device *dev, unsigned int txqueue) spin_lock(&sl->lock);
if (netif_queue_stopped(dev)) { - if (!netif_running(dev)) + if (!netif_running(dev) || !sl->tty) goto out;
/* May be we must check transmitter timeout here ?
From: Peter Zijlstra peterz@infradead.org
[ Upstream commit be8a096521ca1a252bf078b347f96ce94582612e ]
Clang can inline emit_indirect_jump() and then folds constants, which results in:
| vmlinux.o: warning: objtool: emit_bpf_dispatcher()+0x6a4: relocation to !ENDBR: .text.__x86.indirect_thunk+0x40 | vmlinux.o: warning: objtool: emit_bpf_dispatcher()+0x67d: relocation to !ENDBR: .text.__x86.indirect_thunk+0x40 | vmlinux.o: warning: objtool: emit_bpf_tail_call_indirect()+0x386: relocation to !ENDBR: .text.__x86.indirect_thunk+0x20 | vmlinux.o: warning: objtool: emit_bpf_tail_call_indirect()+0x35d: relocation to !ENDBR: .text.__x86.indirect_thunk+0x20
Suppress the optimization such that it must emit a code reference to the __x86_indirect_thunk_array[] base.
Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Alexei Starovoitov ast@kernel.org Link: https://lkml.kernel.org/r/20220405075531.GB30877@worktop.programming.kicks-a... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/net/bpf_jit_comp.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/arch/x86/net/bpf_jit_comp.c b/arch/x86/net/bpf_jit_comp.c index 0ecb140864b2..b272e963388c 100644 --- a/arch/x86/net/bpf_jit_comp.c +++ b/arch/x86/net/bpf_jit_comp.c @@ -398,6 +398,7 @@ static void emit_indirect_jump(u8 **pprog, int reg, u8 *ip) EMIT_LFENCE(); EMIT2(0xFF, 0xE0 + reg); } else if (cpu_feature_enabled(X86_FEATURE_RETPOLINE)) { + OPTIMIZER_HIDE_VAR(reg); emit_jump(&prog, &__x86_indirect_thunk_array[reg], ip); } else #endif
From: Pavel Begunkov asml.silence@gmail.com
[ Upstream commit 8f0a24801bb44aa58496945aabb904c729176772 ]
Automatically default rsrc tag in io_queue_rsrc_removal(), it's safer than leaving it there and relying on the rest of the code to behave and not use it.
Signed-off-by: Pavel Begunkov asml.silence@gmail.com Link: https://lore.kernel.org/r/1cf262a50df17478ea25b22494dcc19f3a80301f.164933634... Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/io_uring.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c index 6f93bff7633c..53b20d8b1129 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -8536,13 +8536,15 @@ static int io_sqe_file_register(struct io_ring_ctx *ctx, struct file *file, static int io_queue_rsrc_removal(struct io_rsrc_data *data, unsigned idx, struct io_rsrc_node *node, void *rsrc) { + u64 *tag_slot = io_get_tag_slot(data, idx); struct io_rsrc_put *prsrc;
prsrc = kzalloc(sizeof(*prsrc), GFP_KERNEL); if (!prsrc) return -ENOMEM;
- prsrc->tag = *io_get_tag_slot(data, idx); + prsrc->tag = *tag_slot; + *tag_slot = 0; prsrc->rsrc = rsrc; list_add(&prsrc->list, &node->rsrc_list); return 0;
From: Pavel Begunkov asml.silence@gmail.com
[ Upstream commit 4cdd158be9d09223737df83136a1fb65269d809a ]
There are still several places that using pre array_index_nospec() indexes, fix them up.
Signed-off-by: Pavel Begunkov asml.silence@gmail.com Link: https://lore.kernel.org/r/b01ef5ee83f72ed35ad525912370b729f5d145f4.164933634... Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/io_uring.c | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c index 53b20d8b1129..f0febb2cb016 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -8613,7 +8613,7 @@ static int io_close_fixed(struct io_kiocb *req, unsigned int issue_flags) bool needs_lock = issue_flags & IO_URING_F_UNLOCKED; struct io_fixed_file *file_slot; struct file *file; - int ret, i; + int ret;
io_ring_submit_lock(ctx, needs_lock); ret = -ENXIO; @@ -8626,8 +8626,8 @@ static int io_close_fixed(struct io_kiocb *req, unsigned int issue_flags) if (ret) goto out;
- i = array_index_nospec(offset, ctx->nr_user_files); - file_slot = io_fixed_file_slot(&ctx->file_table, i); + offset = array_index_nospec(offset, ctx->nr_user_files); + file_slot = io_fixed_file_slot(&ctx->file_table, offset); ret = -EBADF; if (!file_slot->file_ptr) goto out; @@ -8683,8 +8683,7 @@ static int __io_sqe_files_update(struct io_ring_ctx *ctx,
if (file_slot->file_ptr) { file = (struct file *)(file_slot->file_ptr & FFS_MASK); - err = io_queue_rsrc_removal(data, up->offset + done, - ctx->rsrc_node, file); + err = io_queue_rsrc_removal(data, i, ctx->rsrc_node, file); if (err) break; file_slot->file_ptr = 0; @@ -9357,7 +9356,7 @@ static int __io_sqe_buffers_update(struct io_ring_ctx *ctx,
i = array_index_nospec(offset, ctx->nr_user_bufs); if (ctx->user_bufs[i] != ctx->dummy_ubuf) { - err = io_queue_rsrc_removal(ctx->buf_data, offset, + err = io_queue_rsrc_removal(ctx->buf_data, i, ctx->rsrc_node, ctx->user_bufs[i]); if (unlikely(err)) { io_buffer_unmap(ctx, &imu);
From: Borislav Petkov bp@suse.de
[ Upstream commit d02b4dd84e1a90f7f1444d027c0289bf355b0d5a ]
Fix:
In file included from <command-line>:0:0: In function ‘ddr_perf_counter_enable’, inlined from ‘ddr_perf_irq_handler’ at drivers/perf/fsl_imx8_ddr_perf.c:651:2: ././include/linux/compiler_types.h:352:38: error: call to ‘__compiletime_assert_729’ \ declared with attribute error: FIELD_PREP: mask is not constant _compiletime_assert(condition, msg, __compiletime_assert_, __COUNTER__) ...
See https://lore.kernel.org/r/YkwQ6%2BtIH8GQpuct@zn.tnic for the gory details as to why it triggers with older gccs only.
Signed-off-by: Borislav Petkov bp@suse.de Cc: Frank Li Frank.li@nxp.com Cc: Will Deacon will@kernel.org Cc: Mark Rutland mark.rutland@arm.com Cc: Shawn Guo shawnguo@kernel.org Cc: Sascha Hauer s.hauer@pengutronix.de Cc: Pengutronix Kernel Team kernel@pengutronix.de Cc: Fabio Estevam festevam@gmail.com Cc: NXP Linux Team linux-imx@nxp.com Cc: linux-arm-kernel@lists.infradead.org Acked-by: Will Deacon will@kernel.org Link: https://lore.kernel.org/r/20220405151517.29753-10-bp@alien8.de Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/perf/fsl_imx8_ddr_perf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/perf/fsl_imx8_ddr_perf.c b/drivers/perf/fsl_imx8_ddr_perf.c index 94ebc1ecace7..b1b2a55de77f 100644 --- a/drivers/perf/fsl_imx8_ddr_perf.c +++ b/drivers/perf/fsl_imx8_ddr_perf.c @@ -29,7 +29,7 @@ #define CNTL_OVER_MASK 0xFFFFFFFE
#define CNTL_CSV_SHIFT 24 -#define CNTL_CSV_MASK (0xFF << CNTL_CSV_SHIFT) +#define CNTL_CSV_MASK (0xFFU << CNTL_CSV_SHIFT)
#define EVENT_CYCLES_ID 0 #define EVENT_CYCLES_COUNTER 0
From: Axel Rasmussen axelrasmussen@google.com
commit f9b141f93659e09a52e28791ccbaf69c273b8e92 upstream.
When one tries to grow an existing memfd_secret with ftruncate, one gets a panic [1]. For example, doing the following reliably induces the panic:
fd = memfd_secret();
ftruncate(fd, 10); ptr = mmap(NULL, 10, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0); strcpy(ptr, "123456789");
munmap(ptr, 10); ftruncate(fd, 20);
The basic reason for this is, when we grow with ftruncate, we call down into simple_setattr, and then truncate_inode_pages_range, and eventually we try to zero part of the memory. The normal truncation code does this via the direct map (i.e., it calls page_address() and hands that to memset()).
For memfd_secret though, we specifically don't map our pages via the direct map (i.e. we call set_direct_map_invalid_noflush() on every fault). So the address returned by page_address() isn't useful, and when we try to memset() with it we panic.
This patch avoids the panic by implementing a custom setattr for memfd_secret, which detects resizes specifically (setting the size for the first time works just fine, since there are no existing pages to try to zero), and rejects them with EINVAL.
One could argue growing should be supported, but I think that will require a significantly more lengthy change. So, I propose a minimal fix for the benefit of stable kernels, and then perhaps to extend memfd_secret to support growing in a separate patch.
[1]:
BUG: unable to handle page fault for address: ffffa0a889277028 #PF: supervisor write access in kernel mode #PF: error_code(0x0002) - not-present page PGD afa01067 P4D afa01067 PUD 83f909067 PMD 83f8bf067 PTE 800ffffef6d88060 Oops: 0002 [#1] PREEMPT SMP DEBUG_PAGEALLOC PTI CPU: 0 PID: 281 Comm: repro Not tainted 5.17.0-dbg-DEV #1 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014 RIP: 0010:memset_erms+0x9/0x10 Code: c1 e9 03 40 0f b6 f6 48 b8 01 01 01 01 01 01 01 01 48 0f af c6 f3 48 ab 89 d1 f3 aa 4c 89 c8 c3 90 49 89 f9 40 88 f0 48 89 d1 <f3> aa 4c 89 c8 c3 90 49 89 fa 40 0f b6 ce 48 b8 01 01 01 01 01 01 RSP: 0018:ffffb932c09afbf0 EFLAGS: 00010246 RAX: 0000000000000000 RBX: ffffda63c4249dc0 RCX: 0000000000000fd8 RDX: 0000000000000fd8 RSI: 0000000000000000 RDI: ffffa0a889277028 RBP: ffffb932c09afc00 R08: 0000000000001000 R09: ffffa0a889277028 R10: 0000000000020023 R11: 0000000000000000 R12: ffffda63c4249dc0 R13: ffffa0a890d70d98 R14: 0000000000000028 R15: 0000000000000fd8 FS: 00007f7294899580(0000) GS:ffffa0af9bc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: ffffa0a889277028 CR3: 0000000107ef6006 CR4: 0000000000370ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ? zero_user_segments+0x82/0x190 truncate_inode_partial_folio+0xd4/0x2a0 truncate_inode_pages_range+0x380/0x830 truncate_setsize+0x63/0x80 simple_setattr+0x37/0x60 notify_change+0x3d8/0x4d0 do_sys_ftruncate+0x162/0x1d0 __x64_sys_ftruncate+0x1c/0x20 do_syscall_64+0x44/0xa0 entry_SYSCALL_64_after_hwframe+0x44/0xae Modules linked in: xhci_pci xhci_hcd virtio_net net_failover failover virtio_blk virtio_balloon uhci_hcd ohci_pci ohci_hcd evdev ehci_pci ehci_hcd 9pnet_virtio 9p netfs 9pnet CR2: ffffa0a889277028
[lkp@intel.com: secretmem_iops can be static] Signed-off-by: kernel test robot lkp@intel.com [axelrasmussen@google.com: return EINVAL]
Link: https://lkml.kernel.org/r/20220324210909.1843814-1-axelrasmussen@google.com Link: https://lkml.kernel.org/r/20220412193023.279320-1-axelrasmussen@google.com Signed-off-by: Axel Rasmussen axelrasmussen@google.com Cc: Mike Rapoport rppt@kernel.org Cc: Matthew Wilcox willy@infradead.org Cc: stable@vger.kernel.org Cc: kernel test robot lkp@intel.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/secretmem.c | 17 +++++++++++++++++ 1 file changed, 17 insertions(+)
--- a/mm/secretmem.c +++ b/mm/secretmem.c @@ -158,6 +158,22 @@ const struct address_space_operations se .isolate_page = secretmem_isolate_page, };
+static int secretmem_setattr(struct user_namespace *mnt_userns, + struct dentry *dentry, struct iattr *iattr) +{ + struct inode *inode = d_inode(dentry); + unsigned int ia_valid = iattr->ia_valid; + + if ((ia_valid & ATTR_SIZE) && inode->i_size) + return -EINVAL; + + return simple_setattr(mnt_userns, dentry, iattr); +} + +static const struct inode_operations secretmem_iops = { + .setattr = secretmem_setattr, +}; + static struct vfsmount *secretmem_mnt;
static struct file *secretmem_file_create(unsigned long flags) @@ -177,6 +193,7 @@ static struct file *secretmem_file_creat mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER); mapping_set_unevictable(inode->i_mapping);
+ inode->i_op = &secretmem_iops; inode->i_mapping->a_ops = &secretmem_aops;
/* pretend we are a normal file with zero size */
From: Juergen Gross jgross@suse.com
commit e553f62f10d93551eb883eca227ac54d1a4fad84 upstream.
Since commit 6aa303defb74 ("mm, vmscan: only allocate and reclaim from zones with pages managed by the buddy allocator") only zones with free memory are included in a built zonelist. This is problematic when e.g. all memory of a zone has been ballooned out when zonelists are being rebuilt.
The decision whether to rebuild the zonelists when onlining new memory is done based on populated_zone() returning 0 for the zone the memory will be added to. The new zone is added to the zonelists only, if it has free memory pages (managed_zone() returns a non-zero value) after the memory has been onlined. This implies, that onlining memory will always free the added pages to the allocator immediately, but this is not true in all cases: when e.g. running as a Xen guest the onlined new memory will be added only to the ballooned memory list, it will be freed only when the guest is being ballooned up afterwards.
Another problem with using managed_zone() for the decision whether a zone is being added to the zonelists is, that a zone with all memory used will in fact be removed from all zonelists in case the zonelists happen to be rebuilt.
Use populated_zone() when building a zonelist as it has been done before that commit.
There was a report that QubesOS (based on Xen) is hitting this problem. Xen has switched to use the zone device functionality in kernel 5.9 and QubesOS wants to use memory hotplugging for guests in order to be able to start a guest with minimal memory and expand it as needed. This was the report leading to the patch.
Link: https://lkml.kernel.org/r/20220407120637.9035-1-jgross@suse.com Fixes: 6aa303defb74 ("mm, vmscan: only allocate and reclaim from zones with pages managed by the buddy allocator") Signed-off-by: Juergen Gross jgross@suse.com Reported-by: Marek Marczykowski-Górecki marmarek@invisiblethingslab.com Acked-by: Michal Hocko mhocko@suse.com Acked-by: David Hildenbrand david@redhat.com Cc: Marek Marczykowski-Górecki marmarek@invisiblethingslab.com Reviewed-by: Wei Yang richard.weiyang@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/page_alloc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -6112,7 +6112,7 @@ static int build_zonerefs_node(pg_data_t do { zone_type--; zone = pgdat->node_zones + zone_type; - if (managed_zone(zone)) { + if (populated_zone(zone)) { zoneref_set_zone(zone, &zonerefs[nr_zones++]); check_highest_zone(zone_type); }
From: Minchan Kim minchan@kernel.org
commit e914d8f00391520ecc4495dd0ca0124538ab7119 upstream.
Two processes under CLONE_VM cloning, user process can be corrupted by seeing zeroed page unexpectedly.
CPU A CPU B
do_swap_page do_swap_page SWP_SYNCHRONOUS_IO path SWP_SYNCHRONOUS_IO path swap_readpage valid data swap_slot_free_notify delete zram entry swap_readpage zeroed(invalid) data pte_lock map the *zero data* to userspace pte_unlock pte_lock if (!pte_same) goto out_nomap; pte_unlock return and next refault will read zeroed data
The swap_slot_free_notify is bogus for CLONE_VM case since it doesn't increase the refcount of swap slot at copy_mm so it couldn't catch up whether it's safe or not to discard data from backing device. In the case, only the lock it could rely on to synchronize swap slot freeing is page table lock. Thus, this patch gets rid of the swap_slot_free_notify function. With this patch, CPU A will see correct data.
CPU A CPU B
do_swap_page do_swap_page SWP_SYNCHRONOUS_IO path SWP_SYNCHRONOUS_IO path swap_readpage original data pte_lock map the original data swap_free swap_range_free bd_disk->fops->swap_slot_free_notify swap_readpage read zeroed data pte_unlock pte_lock if (!pte_same) goto out_nomap; pte_unlock return on next refault will see mapped data by CPU B
The concern of the patch would increase memory consumption since it could keep wasted memory with compressed form in zram as well as uncompressed form in address space. However, most of cases of zram uses no readahead and do_swap_page is followed by swap_free so it will free the compressed form from in zram quickly.
Link: https://lkml.kernel.org/r/YjTVVxIAsnKAXjTd@google.com Fixes: 0bcac06f27d7 ("mm, swap: skip swapcache for swapin of synchronous device") Reported-by: Ivan Babrou ivan@cloudflare.com Tested-by: Ivan Babrou ivan@cloudflare.com Signed-off-by: Minchan Kim minchan@kernel.org Cc: Nitin Gupta ngupta@vflare.org Cc: Sergey Senozhatsky senozhatsky@chromium.org Cc: Jens Axboe axboe@kernel.dk Cc: David Hildenbrand david@redhat.com Cc: stable@vger.kernel.org [4.14+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/page_io.c | 54 ------------------------------------------------------ 1 file changed, 54 deletions(-)
--- a/mm/page_io.c +++ b/mm/page_io.c @@ -51,54 +51,6 @@ void end_swap_bio_write(struct bio *bio) bio_put(bio); }
-static void swap_slot_free_notify(struct page *page) -{ - struct swap_info_struct *sis; - struct gendisk *disk; - swp_entry_t entry; - - /* - * There is no guarantee that the page is in swap cache - the software - * suspend code (at least) uses end_swap_bio_read() against a non- - * swapcache page. So we must check PG_swapcache before proceeding with - * this optimization. - */ - if (unlikely(!PageSwapCache(page))) - return; - - sis = page_swap_info(page); - if (data_race(!(sis->flags & SWP_BLKDEV))) - return; - - /* - * The swap subsystem performs lazy swap slot freeing, - * expecting that the page will be swapped out again. - * So we can avoid an unnecessary write if the page - * isn't redirtied. - * This is good for real swap storage because we can - * reduce unnecessary I/O and enhance wear-leveling - * if an SSD is used as the as swap device. - * But if in-memory swap device (eg zram) is used, - * this causes a duplicated copy between uncompressed - * data in VM-owned memory and compressed data in - * zram-owned memory. So let's free zram-owned memory - * and make the VM-owned decompressed page *dirty*, - * so the page should be swapped out somewhere again if - * we again wish to reclaim it. - */ - disk = sis->bdev->bd_disk; - entry.val = page_private(page); - if (disk->fops->swap_slot_free_notify && __swap_count(entry) == 1) { - unsigned long offset; - - offset = swp_offset(entry); - - SetPageDirty(page); - disk->fops->swap_slot_free_notify(sis->bdev, - offset); - } -} - static void end_swap_bio_read(struct bio *bio) { struct page *page = bio_first_page_all(bio); @@ -114,7 +66,6 @@ static void end_swap_bio_read(struct bio }
SetPageUptodate(page); - swap_slot_free_notify(page); out: unlock_page(page); WRITE_ONCE(bio->bi_private, NULL); @@ -392,11 +343,6 @@ int swap_readpage(struct page *page, boo if (sis->flags & SWP_SYNCHRONOUS_IO) { ret = bdev_read_page(sis->bdev, swap_page_sector(page), page); if (!ret) { - if (trylock_page(page)) { - swap_slot_free_notify(page); - unlock_page(page); - } - count_vm_event(PSWPIN); goto out; }
From: Patrick Wang patrick.wang.shcn@gmail.com
commit 23c2d497de21f25898fbea70aeb292ab8acc8c94 upstream.
The kmemleak_*_phys() apis do not check the address for lowmem's min boundary, while the caller may pass an address below lowmem, which will trigger an oops:
# echo scan > /sys/kernel/debug/kmemleak Unable to handle kernel paging request at virtual address ff5fffffffe00000 Oops [#1] Modules linked in: CPU: 2 PID: 134 Comm: bash Not tainted 5.18.0-rc1-next-20220407 #33 Hardware name: riscv-virtio,qemu (DT) epc : scan_block+0x74/0x15c ra : scan_block+0x72/0x15c epc : ffffffff801e5806 ra : ffffffff801e5804 sp : ff200000104abc30 gp : ffffffff815cd4e8 tp : ff60000004cfa340 t0 : 0000000000000200 t1 : 00aaaaaac23954cc t2 : 00000000000003ff s0 : ff200000104abc90 s1 : ffffffff81b0ff28 a0 : 0000000000000000 a1 : ff5fffffffe01000 a2 : ffffffff81b0ff28 a3 : 0000000000000002 a4 : 0000000000000001 a5 : 0000000000000000 a6 : ff200000104abd7c a7 : 0000000000000005 s2 : ff5fffffffe00ff9 s3 : ffffffff815cd998 s4 : ffffffff815d0e90 s5 : ffffffff81b0ff28 s6 : 0000000000000020 s7 : ffffffff815d0eb0 s8 : ffffffffffffffff s9 : ff5fffffffe00000 s10: ff5fffffffe01000 s11: 0000000000000022 t3 : 00ffffffaa17db4c t4 : 000000000000000f t5 : 0000000000000001 t6 : 0000000000000000 status: 0000000000000100 badaddr: ff5fffffffe00000 cause: 000000000000000d scan_gray_list+0x12e/0x1a6 kmemleak_scan+0x2aa/0x57e kmemleak_write+0x32a/0x40c full_proxy_write+0x56/0x82 vfs_write+0xa6/0x2a6 ksys_write+0x6c/0xe2 sys_write+0x22/0x2a ret_from_syscall+0x0/0x2
The callers may not quite know the actual address they pass(e.g. from devicetree). So the kmemleak_*_phys() apis should guarantee the address they finally use is in lowmem range, so check the address for lowmem's min boundary.
Link: https://lkml.kernel.org/r/20220413122925.33856-1-patrick.wang.shcn@gmail.com Signed-off-by: Patrick Wang patrick.wang.shcn@gmail.com Acked-by: Catalin Marinas catalin.marinas@arm.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/kmemleak.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/mm/kmemleak.c +++ b/mm/kmemleak.c @@ -1132,7 +1132,7 @@ EXPORT_SYMBOL(kmemleak_no_scan); void __ref kmemleak_alloc_phys(phys_addr_t phys, size_t size, int min_count, gfp_t gfp) { - if (!IS_ENABLED(CONFIG_HIGHMEM) || PHYS_PFN(phys) < max_low_pfn) + if (PHYS_PFN(phys) >= min_low_pfn && PHYS_PFN(phys) < max_low_pfn) kmemleak_alloc(__va(phys), size, min_count, gfp); } EXPORT_SYMBOL(kmemleak_alloc_phys); @@ -1146,7 +1146,7 @@ EXPORT_SYMBOL(kmemleak_alloc_phys); */ void __ref kmemleak_free_part_phys(phys_addr_t phys, size_t size) { - if (!IS_ENABLED(CONFIG_HIGHMEM) || PHYS_PFN(phys) < max_low_pfn) + if (PHYS_PFN(phys) >= min_low_pfn && PHYS_PFN(phys) < max_low_pfn) kmemleak_free_part(__va(phys), size); } EXPORT_SYMBOL(kmemleak_free_part_phys); @@ -1158,7 +1158,7 @@ EXPORT_SYMBOL(kmemleak_free_part_phys); */ void __ref kmemleak_not_leak_phys(phys_addr_t phys) { - if (!IS_ENABLED(CONFIG_HIGHMEM) || PHYS_PFN(phys) < max_low_pfn) + if (PHYS_PFN(phys) >= min_low_pfn && PHYS_PFN(phys) < max_low_pfn) kmemleak_not_leak(__va(phys)); } EXPORT_SYMBOL(kmemleak_not_leak_phys); @@ -1170,7 +1170,7 @@ EXPORT_SYMBOL(kmemleak_not_leak_phys); */ void __ref kmemleak_ignore_phys(phys_addr_t phys) { - if (!IS_ENABLED(CONFIG_HIGHMEM) || PHYS_PFN(phys) < max_low_pfn) + if (PHYS_PFN(phys) >= min_low_pfn && PHYS_PFN(phys) < max_low_pfn) kmemleak_ignore(__va(phys)); } EXPORT_SYMBOL(kmemleak_ignore_phys);
From: Mike Kravetz mike.kravetz@oracle.com
commit 5a317412ef884763fdf7aa17f9f3636959d11d8f upstream.
It is possible for poisoned hugetlb pages to reside on the free lists. The huge page allocation routines which dequeue entries from the free lists make a point of avoiding poisoned pages. There is no such check and avoidance in the demote code path.
If a hugetlb page on the is on a free list, poison will only be set in the head page rather then the page with the actual error. If such a page is demoted, then the poison flag may follow the wrong page. A page without error could have poison set, and a page with poison could not have the flag set.
Check for poison before attempting to demote a hugetlb page. Also, return -EBUSY to the caller if only poisoned pages are on the free list.
Link: https://lkml.kernel.org/r/20220307215707.50916-1-mike.kravetz@oracle.com Fixes: 8531fc6f52f5 ("hugetlb: add hugetlb demote page support") Signed-off-by: Mike Kravetz mike.kravetz@oracle.com Reviewed-by: Naoya Horiguchi naoya.horiguchi@nec.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/hugetlb.c | 17 ++++++++++------- 1 file changed, 10 insertions(+), 7 deletions(-)
--- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -3469,7 +3469,6 @@ static int demote_pool_huge_page(struct { int nr_nodes, node; struct page *page; - int rc = 0;
lockdep_assert_held(&hugetlb_lock);
@@ -3480,15 +3479,19 @@ static int demote_pool_huge_page(struct }
for_each_node_mask_to_free(h, nr_nodes, node, nodes_allowed) { - if (!list_empty(&h->hugepage_freelists[node])) { - page = list_entry(h->hugepage_freelists[node].next, - struct page, lru); - rc = demote_free_huge_page(h, page); - break; + list_for_each_entry(page, &h->hugepage_freelists[node], lru) { + if (PageHWPoison(page)) + continue; + + return demote_free_huge_page(h, page); } }
- return rc; + /* + * Only way to get here is if all pages on free lists are poisoned. + * Return -EBUSY so that caller will not retry. + */ + return -EBUSY; }
#define HSTATE_ATTR_RO(_name) \
From: Andrew Morton akpm@linux-foundation.org
commit 354e923df042a11d1ab8ca06b3ebfab3a018a4ec upstream.
Commit 925346c129da11 ("fs/binfmt_elf: fix PT_LOAD p_align values for loaders") was an attempt to fix regressions due to 9630f0d60fec5f ("fs/binfmt_elf: use PT_LOAD p_align values for static PIE").
But regressionss continue to be reported:
https://lore.kernel.org/lkml/cb5b81bd-9882-e5dc-cd22-54bdbaaefbbc@leemhuis.i... https://bugzilla.kernel.org/show_bug.cgi?id=215720 https://lkml.kernel.org/r/b685f3d0-da34-531d-1aa9-479accd3e21b@leemhuis.info
This patch reverts the fix, so the original can also be reverted.
Fixes: 925346c129da11 ("fs/binfmt_elf: fix PT_LOAD p_align values for loaders") Cc: H.J. Lu hjl.tools@gmail.com Cc: Chris Kennelly ckennelly@google.com Cc: Al Viro viro@zeniv.linux.org.uk Cc: Alexey Dobriyan adobriyan@gmail.com Cc: Song Liu songliubraving@fb.com Cc: David Rientjes rientjes@google.com Cc: Ian Rogers irogers@google.com Cc: Hugh Dickins hughd@google.com Cc: Suren Baghdasaryan surenb@google.com Cc: Sandeep Patil sspatil@google.com Cc: Fangrui Song maskray@google.com Cc: Nick Desaulniers ndesaulniers@google.com Cc: Kirill A. Shutemov kirill.shutemov@linux.intel.com Cc: Mike Kravetz mike.kravetz@oracle.com Cc: Shuah Khan shuah@kernel.org Cc: Thorsten Leemhuis regressions@leemhuis.info Cc: Mike Rapoport rppt@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/binfmt_elf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -1117,7 +1117,7 @@ out_free_interp: * without MAP_FIXED nor MAP_FIXED_NOREPLACE). */ alignment = maximum_alignment(elf_phdata, elf_ex->e_phnum); - if (interpreter || alignment > ELF_MIN_ALIGN) { + if (alignment > ELF_MIN_ALIGN) { load_bias = ELF_ET_DYN_BASE; if (current->flags & PF_RANDOMIZE) load_bias += arch_mmap_rnd();
From: Andrew Morton akpm@linux-foundation.org
commit aeb7923733d100b86c6bc68e7ae32913b0cec9d8 upstream.
Despite Mike's attempted fix (925346c129da117122), regressions reports continue:
https://lore.kernel.org/lkml/cb5b81bd-9882-e5dc-cd22-54bdbaaefbbc@leemhuis.i... https://bugzilla.kernel.org/show_bug.cgi?id=215720 https://lkml.kernel.org/r/b685f3d0-da34-531d-1aa9-479accd3e21b@leemhuis.info
So revert this patch.
Fixes: 9630f0d60fec ("fs/binfmt_elf: use PT_LOAD p_align values for static PIE") Cc: Alexey Dobriyan adobriyan@gmail.com Cc: Al Viro viro@zeniv.linux.org.uk Cc: Chris Kennelly ckennelly@google.com Cc: David Rientjes rientjes@google.com Cc: Fangrui Song maskray@google.com Cc: H.J. Lu hjl.tools@gmail.com Cc: Hugh Dickins hughd@google.com Cc: Ian Rogers irogers@google.com Cc: Kirill A. Shutemov kirill.shutemov@linux.intel.com Cc: Mike Kravetz mike.kravetz@oracle.com Cc: Mike Rapoport rppt@kernel.org Cc: Nick Desaulniers ndesaulniers@google.com Cc: Sandeep Patil sspatil@google.com Cc: Shuah Khan shuah@kernel.org Cc: Song Liu songliubraving@fb.com Cc: Suren Baghdasaryan surenb@google.com Cc: Thorsten Leemhuis regressions@leemhuis.info Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/binfmt_elf.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -1116,11 +1116,11 @@ out_free_interp: * independently randomized mmap region (0 load_bias * without MAP_FIXED nor MAP_FIXED_NOREPLACE). */ - alignment = maximum_alignment(elf_phdata, elf_ex->e_phnum); - if (alignment > ELF_MIN_ALIGN) { + if (interpreter) { load_bias = ELF_ET_DYN_BASE; if (current->flags & PF_RANDOMIZE) load_bias += arch_mmap_rnd(); + alignment = maximum_alignment(elf_phdata, elf_ex->e_phnum); if (alignment) load_bias &= ~(alignment - 1); elf_flags |= MAP_FIXED_NOREPLACE;
From: Sean Christopherson seanjc@google.com
commit 1d0e84806047f38027d7572adb4702ef7c09b317 upstream.
Resolve nx_huge_pages to true/false when kvm.ko is loaded, leaving it as -1 is technically undefined behavior when its value is read out by param_get_bool(), as boolean values are supposed to be '0' or '1'.
Alternatively, KVM could define a custom getter for the param, but the auto value doesn't depend on the vendor module in any way, and printing "auto" would be unnecessarily unfriendly to the user.
In addition to fixing the undefined behavior, resolving the auto value also fixes the scenario where the auto value resolves to N and no vendor module is loaded. Previously, -1 would result in Y being printed even though KVM would ultimately disable the mitigation.
Rename the existing MMU module init/exit helpers to clarify that they're invoked with respect to the vendor module, and add comments to document why KVM has two separate "module init" flows.
========================================================================= UBSAN: invalid-load in kernel/params.c:320:33 load of value 255 is not a valid value for type '_Bool' CPU: 6 PID: 892 Comm: tail Not tainted 5.17.0-rc3+ #799 Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015 Call Trace: <TASK> dump_stack_lvl+0x34/0x44 ubsan_epilogue+0x5/0x40 __ubsan_handle_load_invalid_value.cold+0x43/0x48 param_get_bool.cold+0xf/0x14 param_attr_show+0x55/0x80 module_attr_show+0x1c/0x30 sysfs_kf_seq_show+0x93/0xc0 seq_read_iter+0x11c/0x450 new_sync_read+0x11b/0x1a0 vfs_read+0xf0/0x190 ksys_read+0x5f/0xe0 do_syscall_64+0x3b/0xc0 entry_SYSCALL_64_after_hwframe+0x44/0xae </TASK> =========================================================================
Fixes: b8e8c8303ff2 ("kvm: mmu: ITLB_MULTIHIT mitigation") Cc: stable@vger.kernel.org Reported-by: Bruno Goncalves bgoncalv@redhat.com Reported-by: Jan Stancek jstancek@redhat.com Signed-off-by: Sean Christopherson seanjc@google.com Message-Id: 20220331221359.3912754-1-seanjc@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/kvm_host.h | 5 +++-- arch/x86/kvm/mmu/mmu.c | 20 ++++++++++++++++---- arch/x86/kvm/x86.c | 20 ++++++++++++++++++-- 3 files changed, 37 insertions(+), 8 deletions(-)
--- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1574,8 +1574,9 @@ static inline int kvm_arch_flush_remote_ #define kvm_arch_pmi_in_guest(vcpu) \ ((vcpu) && (vcpu)->arch.handling_intr_from_guest)
-int kvm_mmu_module_init(void); -void kvm_mmu_module_exit(void); +void kvm_mmu_x86_module_init(void); +int kvm_mmu_vendor_module_init(void); +void kvm_mmu_vendor_module_exit(void);
void kvm_mmu_destroy(struct kvm_vcpu *vcpu); int kvm_mmu_create(struct kvm_vcpu *vcpu); --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -6144,12 +6144,24 @@ static int set_nx_huge_pages(const char return 0; }
-int kvm_mmu_module_init(void) +/* + * nx_huge_pages needs to be resolved to true/false when kvm.ko is loaded, as + * its default value of -1 is technically undefined behavior for a boolean. + */ +void kvm_mmu_x86_module_init(void) { - int ret = -ENOMEM; - if (nx_huge_pages == -1) __set_nx_huge_pages(get_nx_auto_mode()); +} + +/* + * The bulk of the MMU initialization is deferred until the vendor module is + * loaded as many of the masks/values may be modified by VMX or SVM, i.e. need + * to be reset when a potentially different vendor module is loaded. + */ +int kvm_mmu_vendor_module_init(void) +{ + int ret = -ENOMEM;
/* * MMU roles use union aliasing which is, generally speaking, an @@ -6197,7 +6209,7 @@ void kvm_mmu_destroy(struct kvm_vcpu *vc mmu_free_memory_caches(vcpu); }
-void kvm_mmu_module_exit(void) +void kvm_mmu_vendor_module_exit(void) { mmu_destroy_caches(); percpu_counter_destroy(&kvm_total_used_mmu_pages); --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -8846,7 +8846,7 @@ int kvm_arch_init(void *opaque) } kvm_nr_uret_msrs = 0;
- r = kvm_mmu_module_init(); + r = kvm_mmu_vendor_module_init(); if (r) goto out_free_percpu;
@@ -8894,7 +8894,7 @@ void kvm_arch_exit(void) cancel_work_sync(&pvclock_gtod_work); #endif kvm_x86_ops.hardware_enable = NULL; - kvm_mmu_module_exit(); + kvm_mmu_vendor_module_exit(); free_percpu(user_return_msrs); kmem_cache_destroy(x86_emulator_cache); #ifdef CONFIG_KVM_XEN @@ -12887,3 +12887,19 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_vmgexit EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_vmgexit_exit); EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_vmgexit_msr_protocol_enter); EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_vmgexit_msr_protocol_exit); + +static int __init kvm_x86_init(void) +{ + kvm_mmu_x86_module_init(); + return 0; +} +module_init(kvm_x86_init); + +static void __exit kvm_x86_exit(void) +{ + /* + * If module_init() is implemented, module_exit() must also be + * implemented to allow module unload. + */ +} +module_exit(kvm_x86_exit);
From: Oliver Upton oupton@google.com
commit a44a4cc1c969afec97dbb2aedaf6f38eaa6253bb upstream.
Unfortunately, there is no guarantee that KVM was able to instantiate a debugfs directory for a particular VM. To that end, KVM shouldn't even attempt to create new debugfs files in this case. If the specified parent dentry is NULL, debugfs_create_file() will instantiate files at the root of debugfs.
For arm64, it is possible to create the vgic-state file outside of a VM directory, the file is not cleaned up when a VM is destroyed. Nonetheless, the corresponding struct kvm is freed when the VM is destroyed.
Nip the problem in the bud for all possible errant debugfs file creations by initializing kvm->debugfs_dentry to -ENOENT. In so doing, debugfs_create_file() will fail instead of creating the file in the root directory.
Cc: stable@kernel.org Fixes: 929f45e32499 ("kvm: no need to check return value of debugfs_create functions") Signed-off-by: Oliver Upton oupton@google.com Signed-off-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20220406235615.1447180-2-oupton@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- virt/kvm/kvm_main.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
--- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -937,7 +937,7 @@ static void kvm_destroy_vm_debugfs(struc int kvm_debugfs_num_entries = kvm_vm_stats_header.num_desc + kvm_vcpu_stats_header.num_desc;
- if (!kvm->debugfs_dentry) + if (IS_ERR(kvm->debugfs_dentry)) return;
debugfs_remove_recursive(kvm->debugfs_dentry); @@ -960,6 +960,12 @@ static int kvm_create_vm_debugfs(struct int kvm_debugfs_num_entries = kvm_vm_stats_header.num_desc + kvm_vcpu_stats_header.num_desc;
+ /* + * Force subsequent debugfs file creations to fail if the VM directory + * is not created. + */ + kvm->debugfs_dentry = ERR_PTR(-ENOENT); + if (!debugfs_initialized()) return 0;
@@ -5484,7 +5490,7 @@ static void kvm_uevent_notify_change(uns } add_uevent_var(env, "PID=%d", kvm->userspace_pid);
- if (kvm->debugfs_dentry) { + if (!IS_ERR(kvm->debugfs_dentry)) { char *tmp, *p = kmalloc(PATH_MAX, GFP_KERNEL_ACCOUNT);
if (p) {
Hi Greg,
On Mon, Apr 18, 2022 at 5:24 AM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
From: Oliver Upton oupton@google.com
commit a44a4cc1c969afec97dbb2aedaf6f38eaa6253bb upstream.
Unfortunately, there is no guarantee that KVM was able to instantiate a debugfs directory for a particular VM. To that end, KVM shouldn't even attempt to create new debugfs files in this case. If the specified parent dentry is NULL, debugfs_create_file() will instantiate files at the root of debugfs.
For arm64, it is possible to create the vgic-state file outside of a VM directory, the file is not cleaned up when a VM is destroyed. Nonetheless, the corresponding struct kvm is freed when the VM is destroyed.
Nip the problem in the bud for all possible errant debugfs file creations by initializing kvm->debugfs_dentry to -ENOENT. In so doing, debugfs_create_file() will fail instead of creating the file in the root directory.
Cc: stable@kernel.org Fixes: 929f45e32499 ("kvm: no need to check return value of debugfs_create functions") Signed-off-by: Oliver Upton oupton@google.com Signed-off-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20220406235615.1447180-2-oupton@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Can you drop this patch from stable for the time being? There's a bug in it because KVM does init/destroy awkwardly. Sean working on a fix [1].
[1]: https://lore.kernel.org/kvm/20220415004622.2207751-1-seanjc@google.com/
-- Thanks, Oliver
On Mon, Apr 18, 2022 at 10:14:42AM -0700, Oliver Upton wrote:
Hi Greg,
On Mon, Apr 18, 2022 at 5:24 AM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
From: Oliver Upton oupton@google.com
commit a44a4cc1c969afec97dbb2aedaf6f38eaa6253bb upstream.
Unfortunately, there is no guarantee that KVM was able to instantiate a debugfs directory for a particular VM. To that end, KVM shouldn't even attempt to create new debugfs files in this case. If the specified parent dentry is NULL, debugfs_create_file() will instantiate files at the root of debugfs.
For arm64, it is possible to create the vgic-state file outside of a VM directory, the file is not cleaned up when a VM is destroyed. Nonetheless, the corresponding struct kvm is freed when the VM is destroyed.
Nip the problem in the bud for all possible errant debugfs file creations by initializing kvm->debugfs_dentry to -ENOENT. In so doing, debugfs_create_file() will fail instead of creating the file in the root directory.
Cc: stable@kernel.org Fixes: 929f45e32499 ("kvm: no need to check return value of debugfs_create functions") Signed-off-by: Oliver Upton oupton@google.com Signed-off-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20220406235615.1447180-2-oupton@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
Can you drop this patch from stable for the time being? There's a bug in it because KVM does init/destroy awkwardly. Sean working on a fix [1].
Will do, I'll go drop it from everywhere.
greg k-h
From: Chuck Lever chuck.lever@oracle.com
commit 773f91b2cf3f52df0d7508fdbf60f37567cdaee4 upstream.
Trond Myklebust reports an NFSD crash in svc_rdma_sendto(). Further investigation shows that the crash occurred while NFSD was handling a deferred request.
This patch addresses two inter-related issues that prevent request deferral from working correctly for RPC/RDMA requests:
1. Prevent the crash by ensuring that the original svc_rqst::rq_xprt_ctxt value is available when the request is revisited. Otherwise svc_rdma_sendto() does not have a Receive context available with which to construct its reply.
2. Possibly since before commit 71641d99ce03 ("svcrdma: Properly compute .len and .buflen for received RPC Calls"), svc_rdma_recvfrom() did not include the transport header in the returned xdr_buf. There should have been no need for svc_defer() and friends to save and restore that header, as of that commit. This issue is addressed in a backport-friendly way by simply having svc_rdma_recvfrom() set rq_xprt_hlen to zero unconditionally, just as svc_tcp_recvfrom() does. This enables svc_deferred_recv() to correctly reconstruct an RPC message received via RPC/RDMA.
Reported-by: Trond Myklebust trondmy@hammerspace.com Link: https://lore.kernel.org/linux-nfs/82662b7190f26fb304eb0ab1bb04279072439d4e.c... Signed-off-by: Chuck Lever chuck.lever@oracle.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/sunrpc/svc.h | 1 + net/sunrpc/svc_xprt.c | 3 +++ net/sunrpc/xprtrdma/svc_rdma_recvfrom.c | 2 +- 3 files changed, 5 insertions(+), 1 deletion(-)
--- a/include/linux/sunrpc/svc.h +++ b/include/linux/sunrpc/svc.h @@ -412,6 +412,7 @@ struct svc_deferred_req { size_t addrlen; struct sockaddr_storage daddr; /* where reply must come from */ size_t daddrlen; + void *xprt_ctxt; struct cache_deferred_req handle; size_t xprt_hlen; int argslen; --- a/net/sunrpc/svc_xprt.c +++ b/net/sunrpc/svc_xprt.c @@ -1213,6 +1213,8 @@ static struct cache_deferred_req *svc_de dr->daddr = rqstp->rq_daddr; dr->argslen = rqstp->rq_arg.len >> 2; dr->xprt_hlen = rqstp->rq_xprt_hlen; + dr->xprt_ctxt = rqstp->rq_xprt_ctxt; + rqstp->rq_xprt_ctxt = NULL;
/* back up head to the start of the buffer and copy */ skip = rqstp->rq_arg.len - rqstp->rq_arg.head[0].iov_len; @@ -1251,6 +1253,7 @@ static noinline int svc_deferred_recv(st rqstp->rq_xprt_hlen = dr->xprt_hlen; rqstp->rq_daddr = dr->daddr; rqstp->rq_respages = rqstp->rq_pages; + rqstp->rq_xprt_ctxt = dr->xprt_ctxt; svc_xprt_received(rqstp->rq_xprt); return (dr->argslen<<2) - dr->xprt_hlen; } --- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c +++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c @@ -831,7 +831,7 @@ int svc_rdma_recvfrom(struct svc_rqst *r goto out_err; if (ret == 0) goto out_drop; - rqstp->rq_xprt_hlen = ret; + rqstp->rq_xprt_hlen = 0;
if (svc_rdma_is_reverse_direction_reply(xprt, ctxt)) goto out_backchannel;
From: Johan Hovold johan@kernel.org
commit b452dbf24d7d9a990d70118462925f6ee287d135 upstream.
Make sure to free the flash platform device in the event that registration fails during probe.
Fixes: ca7d8b980b67 ("memory: add Renesas RPC-IF driver") Cc: stable@vger.kernel.org # 5.8 Cc: Sergei Shtylyov sergei.shtylyov@cogentembedded.com Signed-off-by: Johan Hovold johan@kernel.org Link: https://lore.kernel.org/r/20220303180632.3194-1-johan@kernel.org Signed-off-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/memory/renesas-rpc-if.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)
--- a/drivers/memory/renesas-rpc-if.c +++ b/drivers/memory/renesas-rpc-if.c @@ -651,6 +651,7 @@ static int rpcif_probe(struct platform_d struct platform_device *vdev; struct device_node *flash; const char *name; + int ret;
flash = of_get_next_child(pdev->dev.of_node, NULL); if (!flash) { @@ -674,7 +675,14 @@ static int rpcif_probe(struct platform_d return -ENOMEM; vdev->dev.parent = &pdev->dev; platform_set_drvdata(pdev, vdev); - return platform_device_add(vdev); + + ret = platform_device_add(vdev); + if (ret) { + platform_device_put(vdev); + return ret; + } + + return 0; }
static int rpcif_remove(struct platform_device *pdev)
From: Jason A. Donenfeld Jason@zx2c4.com
commit c40160f2998c897231f8454bf797558d30a20375 upstream.
While the latent entropy plugin mostly doesn't derive entropy from get_random_const() for measuring the call graph, when __latent_entropy is applied to a constant, then it's initialized statically to output from get_random_const(). In that case, this data is derived from a 64-bit seed, which means a buffer of 512 bits doesn't really have that amount of compile-time entropy.
This patch fixes that shortcoming by just buffering chunks of /dev/urandom output and doling it out as requested.
At the same time, it's important that we don't break the use of -frandom-seed, for people who want the runtime benefits of the latent entropy plugin, while still having compile-time determinism. In that case, we detect whether gcc's set_random_seed() has been called by making a call to get_random_seed(noinit=true) in the plugin init function, which is called after set_random_seed() is called but before anything that calls get_random_seed(noinit=false), and seeing if it's zero or not. If it's not zero, we're in deterministic mode, and so we just generate numbers with a basic xorshift prng.
Note that we don't detect if -frandom-seed is being used using the documented local_tick variable, because it's assigned via: local_tick = (unsigned) tv.tv_sec * 1000 + tv.tv_usec / 1000; which may well overflow and become -1 on its own, and so isn't reliable: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105171
[kees: The 256 byte rnd_buf size was chosen based on average (250), median (64), and std deviation (575) bytes of used entropy for a defconfig x86_64 build]
Fixes: 38addce8b600 ("gcc-plugins: Add latent_entropy plugin") Cc: stable@vger.kernel.org Cc: PaX Team pageexec@freemail.hu Signed-off-by: Jason A. Donenfeld Jason@zx2c4.com Signed-off-by: Kees Cook keescook@chromium.org Link: https://lore.kernel.org/r/20220405222815.21155-1-Jason@zx2c4.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- scripts/gcc-plugins/latent_entropy_plugin.c | 44 +++++++++++++++++----------- 1 file changed, 27 insertions(+), 17 deletions(-)
--- a/scripts/gcc-plugins/latent_entropy_plugin.c +++ b/scripts/gcc-plugins/latent_entropy_plugin.c @@ -86,25 +86,31 @@ static struct plugin_info latent_entropy .help = "disable\tturn off latent entropy instrumentation\n", };
-static unsigned HOST_WIDE_INT seed; -/* - * get_random_seed() (this is a GCC function) generates the seed. - * This is a simple random generator without any cryptographic security because - * the entropy doesn't come from here. - */ +static unsigned HOST_WIDE_INT deterministic_seed; +static unsigned HOST_WIDE_INT rnd_buf[32]; +static size_t rnd_idx = ARRAY_SIZE(rnd_buf); +static int urandom_fd = -1; + static unsigned HOST_WIDE_INT get_random_const(void) { - unsigned int i; - unsigned HOST_WIDE_INT ret = 0; - - for (i = 0; i < 8 * sizeof(ret); i++) { - ret = (ret << 1) | (seed & 1); - seed >>= 1; - if (ret & 1) - seed ^= 0xD800000000000000ULL; + if (deterministic_seed) { + unsigned HOST_WIDE_INT w = deterministic_seed; + w ^= w << 13; + w ^= w >> 7; + w ^= w << 17; + deterministic_seed = w; + return deterministic_seed; }
- return ret; + if (urandom_fd < 0) { + urandom_fd = open("/dev/urandom", O_RDONLY); + gcc_assert(urandom_fd >= 0); + } + if (rnd_idx >= ARRAY_SIZE(rnd_buf)) { + gcc_assert(read(urandom_fd, rnd_buf, sizeof(rnd_buf)) == sizeof(rnd_buf)); + rnd_idx = 0; + } + return rnd_buf[rnd_idx++]; }
static tree tree_get_random_const(tree type) @@ -537,8 +543,6 @@ static void latent_entropy_start_unit(vo tree type, id; int quals;
- seed = get_random_seed(false); - if (in_lto_p) return;
@@ -573,6 +577,12 @@ __visible int plugin_init(struct plugin_ const struct plugin_argument * const argv = plugin_info->argv; int i;
+ /* + * Call get_random_seed() with noinit=true, so that this returns + * 0 in the case where no seed has been passed via -frandom-seed. + */ + deterministic_seed = get_random_seed(true); + static const struct ggc_root_tab gt_ggc_r_gt_latent_entropy[] = { { .base = &latent_entropy_decl,
From: Ronnie Sahlberg lsahlber@redhat.com
commit 8b6c58458ee3206dde345fce327a4cb83e69caf9 upstream.
On umount, cifs_sb->tlink_tree might contain entries that do not represent a valid tcon. Check the tcon for error before we dereference it.
Signed-off-by: Ronnie Sahlberg lsahlber@redhat.com Cc: stable@vger.kernel.org Reviewed-by: Shyam Prasad N sprasad@microsoft.com Reported-by: Xiaoli Feng xifeng@redhat.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/cifs/cifsfs.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/fs/cifs/cifsfs.c +++ b/fs/cifs/cifsfs.c @@ -266,10 +266,11 @@ static void cifs_kill_sb(struct super_bl * before we kill the sb. */ if (cifs_sb->root) { - node = rb_first(root); - while (node != NULL) { + for (node = rb_first(root); node; node = rb_next(node)) { tlink = rb_entry(node, struct tcon_link, tl_rbnode); tcon = tlink_tcon(tlink); + if (IS_ERR(tcon)) + continue; cfid = &tcon->crfid; mutex_lock(&cfid->fid_mutex); if (cfid->dentry) { @@ -277,7 +278,6 @@ static void cifs_kill_sb(struct super_bl cfid->dentry = NULL; } mutex_unlock(&cfid->fid_mutex); - node = rb_next(node); }
/* finally release root dentry */
From: Bartosz Golaszewski brgl@bgdev.pl
commit 3836c73e6a2585561af928c6641d74528a8bdfa4 upstream.
We need to take mask into account in the set/get_multiple() callbacks. Use bitmap_replace() instead of bitmap_copy().
Fixes: cb8c474e79be ("gpio: sim: new testing module") Cc: stable@vger.kernel.org Signed-off-by: Bartosz Golaszewski brgl@bgdev.pl Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpio/gpio-sim.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/gpio/gpio-sim.c +++ b/drivers/gpio/gpio-sim.c @@ -134,7 +134,7 @@ static int gpio_sim_get_multiple(struct struct gpio_sim_chip *chip = gpiochip_get_data(gc);
mutex_lock(&chip->lock); - bitmap_copy(bits, chip->value_map, gc->ngpio); + bitmap_replace(bits, bits, chip->value_map, mask, gc->ngpio); mutex_unlock(&chip->lock);
return 0; @@ -146,7 +146,7 @@ static void gpio_sim_set_multiple(struct struct gpio_sim_chip *chip = gpiochip_get_data(gc);
mutex_lock(&chip->lock); - bitmap_copy(chip->value_map, bits, gc->ngpio); + bitmap_replace(chip->value_map, chip->value_map, bits, mask, gc->ngpio); mutex_unlock(&chip->lock); }
From: Toke Høiland-Jørgensen toke@toke.dk
commit 037250f0a45cf9ecf5b52d4b9ff8eadeb609c800 upstream.
The ath9k driver was not properly clearing the status area in the ieee80211_tx_info struct before reporting TX status to mac80211. Instead, it was manually filling in fields, which meant that fields introduced later were left as-is.
Conveniently, mac80211 actually provides a helper to zero out the status area, so use that to make sure we zero everything.
The last commit touching the driver function writing the status information seems to have actually been fixing an issue that was also caused by the area being uninitialised; but it only added clearing of a single field instead of the whole struct. That is now redundant, though, so revert that commit and use it as a convenient Fixes tag.
Fixes: cc591d77aba1 ("ath9k: Make sure to zero status.tx_time before reporting TX status") Reported-by: Bagas Sanjaya bagasdotme@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Toke Høiland-Jørgensen toke@toke.dk Tested-by: Bagas Sanjaya bagasdotme@gmail.com Signed-off-by: Kalle Valo kvalo@kernel.org Link: https://lore.kernel.org/r/20220330164409.16645-1-toke@toke.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/ath/ath9k/xmit.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
--- a/drivers/net/wireless/ath/ath9k/xmit.c +++ b/drivers/net/wireless/ath/ath9k/xmit.c @@ -2553,6 +2553,8 @@ static void ath_tx_rc_status(struct ath_ struct ath_hw *ah = sc->sc_ah; u8 i, tx_rateindex;
+ ieee80211_tx_info_clear_status(tx_info); + if (txok) tx_info->status.ack_signal = ts->ts_rssi;
@@ -2595,9 +2597,6 @@ static void ath_tx_rc_status(struct ath_ }
tx_info->status.rates[tx_rateindex].count = ts->ts_longretry + 1; - - /* we report airtime in ath_tx_count_airtime(), don't report twice */ - tx_info->status.tx_time = 0; }
static void ath_tx_processq(struct ath_softc *sc, struct ath_txq *txq)
From: Toke Høiland-Jørgensen toke@redhat.com
commit 5a6b06f5927c940fa44026695779c30b7536474c upstream.
The ieee80211_tx_info_clear_status() helper also clears the rate counts and the driver-private part of struct ieee80211_tx_info, so using it breaks quite a few other things. So back out of using it, and instead define a ath-internal helper that only clears the area between the status_driver_data and the rates info. Combined with moving the ath_frame_info struct to status_driver_data, this avoids clearing anything we shouldn't be, and so we can keep the existing code for handling the rate information.
While fixing this I also noticed that the setting of tx_info->status.rates[tx_rateindex].count on hardware underrun errors was always immediately overridden by the normal setting of the same fields, so rearrange the code so that the underrun detection actually takes effect.
The new helper could be generalised to a 'memset_between()' helper, but leave it as a driver-internal helper for now since this needs to go to stable.
Cc: stable@vger.kernel.org Reported-by: Peter Seiderer ps.report@gmx.net Fixes: 037250f0a45c ("ath9k: Properly clear TX status area before reporting to mac80211") Signed-off-by: Toke Høiland-Jørgensen toke@redhat.com Reviewed-by: Peter Seiderer ps.report@gmx.net Tested-by: Peter Seiderer ps.report@gmx.net Signed-off-by: Kalle Valo kvalo@kernel.org Link: https://lore.kernel.org/r/20220404204800.2681133-1-toke@toke.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/ath/ath9k/main.c | 2 +- drivers/net/wireless/ath/ath9k/xmit.c | 30 ++++++++++++++++++++---------- 2 files changed, 21 insertions(+), 11 deletions(-)
--- a/drivers/net/wireless/ath/ath9k/main.c +++ b/drivers/net/wireless/ath/ath9k/main.c @@ -839,7 +839,7 @@ static bool ath9k_txq_list_has_key(struc continue;
txinfo = IEEE80211_SKB_CB(bf->bf_mpdu); - fi = (struct ath_frame_info *)&txinfo->rate_driver_data[0]; + fi = (struct ath_frame_info *)&txinfo->status.status_driver_data[0]; if (fi->keyix == keyix) return true; } --- a/drivers/net/wireless/ath/ath9k/xmit.c +++ b/drivers/net/wireless/ath/ath9k/xmit.c @@ -141,8 +141,8 @@ static struct ath_frame_info *get_frame_ { struct ieee80211_tx_info *tx_info = IEEE80211_SKB_CB(skb); BUILD_BUG_ON(sizeof(struct ath_frame_info) > - sizeof(tx_info->rate_driver_data)); - return (struct ath_frame_info *) &tx_info->rate_driver_data[0]; + sizeof(tx_info->status.status_driver_data)); + return (struct ath_frame_info *) &tx_info->status.status_driver_data[0]; }
static void ath_send_bar(struct ath_atx_tid *tid, u16 seqno) @@ -2542,6 +2542,16 @@ skip_tx_complete: spin_unlock_irqrestore(&sc->tx.txbuflock, flags); }
+static void ath_clear_tx_status(struct ieee80211_tx_info *tx_info) +{ + void *ptr = &tx_info->status; + + memset(ptr + sizeof(tx_info->status.rates), 0, + sizeof(tx_info->status) - + sizeof(tx_info->status.rates) - + sizeof(tx_info->status.status_driver_data)); +} + static void ath_tx_rc_status(struct ath_softc *sc, struct ath_buf *bf, struct ath_tx_status *ts, int nframes, int nbad, int txok) @@ -2553,7 +2563,7 @@ static void ath_tx_rc_status(struct ath_ struct ath_hw *ah = sc->sc_ah; u8 i, tx_rateindex;
- ieee80211_tx_info_clear_status(tx_info); + ath_clear_tx_status(tx_info);
if (txok) tx_info->status.ack_signal = ts->ts_rssi; @@ -2569,6 +2579,13 @@ static void ath_tx_rc_status(struct ath_ tx_info->status.ampdu_len = nframes; tx_info->status.ampdu_ack_len = nframes - nbad;
+ tx_info->status.rates[tx_rateindex].count = ts->ts_longretry + 1; + + for (i = tx_rateindex + 1; i < hw->max_rates; i++) { + tx_info->status.rates[i].count = 0; + tx_info->status.rates[i].idx = -1; + } + if ((ts->ts_status & ATH9K_TXERR_FILT) == 0 && (tx_info->flags & IEEE80211_TX_CTL_NO_ACK) == 0) { /* @@ -2590,13 +2607,6 @@ static void ath_tx_rc_status(struct ath_ tx_info->status.rates[tx_rateindex].count = hw->max_rate_tries; } - - for (i = tx_rateindex + 1; i < hw->max_rates; i++) { - tx_info->status.rates[i].count = 0; - tx_info->status.rates[i].idx = -1; - } - - tx_info->status.rates[tx_rateindex].count = ts->ts_longretry + 1; }
static void ath_tx_processq(struct ath_softc *sc, struct ath_txq *txq)
From: Naohiro Aota naohiro.aota@wdc.com
commit 760e69c4c2e2f475a812bdd414b62758215ce9cb upstream.
In btrfs_make_block_group(), we activate the allocated block group, expecting that the block group is soon used for allocation. However, the chunk allocation from flush_space() context broke the assumption. There can be a large time gap between the chunk allocation time and the extent allocation time from the chunk.
Activating the empty block groups pre-allocated from flush_space() context can exhaust the active zone counter of a device. Once we use all the active zone counts for empty pre-allocated block groups, we cannot activate new block group for the other things: metadata, tree-log, or data relocation block group. That failure results in a fake -ENOSPC.
This patch introduces CHUNK_ALLOC_FORCE_FOR_EXTENT to distinguish the chunk allocation from find_free_extent(). Now, the new block group is activated only in that context.
Fixes: eb66a010d518 ("btrfs: zoned: activate new block group") CC: stable@vger.kernel.org # 5.16+ Reviewed-by: Johannes Thumshirn johannes.thumshirn@wdc.com Tested-by: Johannes Thumshirn johannes.thumshirn@wdc.com Signed-off-by: Naohiro Aota naohiro.aota@wdc.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/block-group.c | 24 ++++++++++++++++-------- fs/btrfs/block-group.h | 4 ++++ fs/btrfs/extent-tree.c | 2 +- 3 files changed, 21 insertions(+), 9 deletions(-)
--- a/fs/btrfs/block-group.c +++ b/fs/btrfs/block-group.c @@ -2479,12 +2479,6 @@ struct btrfs_block_group *btrfs_make_blo return ERR_PTR(ret); }
- /* - * New block group is likely to be used soon. Try to activate it now. - * Failure is OK for now. - */ - btrfs_zone_activate(cache); - ret = exclude_super_stripes(cache); if (ret) { /* We may have excluded something, so call this just in case */ @@ -3636,8 +3630,14 @@ int btrfs_chunk_alloc(struct btrfs_trans struct btrfs_block_group *ret_bg; bool wait_for_alloc = false; bool should_alloc = false; + bool from_extent_allocation = false; int ret = 0;
+ if (force == CHUNK_ALLOC_FORCE_FOR_EXTENT) { + from_extent_allocation = true; + force = CHUNK_ALLOC_FORCE; + } + /* Don't re-enter if we're already allocating a chunk */ if (trans->allocating_chunk) return -ENOSPC; @@ -3730,9 +3730,17 @@ int btrfs_chunk_alloc(struct btrfs_trans ret_bg = do_chunk_alloc(trans, flags); trans->allocating_chunk = false;
- if (IS_ERR(ret_bg)) + if (IS_ERR(ret_bg)) { ret = PTR_ERR(ret_bg); - else + } else if (from_extent_allocation) { + /* + * New block group is likely to be used soon. Try to activate + * it now. Failure is OK for now. + */ + btrfs_zone_activate(ret_bg); + } + + if (!ret) btrfs_put_block_group(ret_bg);
spin_lock(&space_info->lock); --- a/fs/btrfs/block-group.h +++ b/fs/btrfs/block-group.h @@ -35,11 +35,15 @@ enum btrfs_discard_state { * the FS with empty chunks * * CHUNK_ALLOC_FORCE means it must try to allocate one + * + * CHUNK_ALLOC_FORCE_FOR_EXTENT like CHUNK_ALLOC_FORCE but called from + * find_free_extent() that also activaes the zone */ enum btrfs_chunk_alloc_enum { CHUNK_ALLOC_NO_FORCE, CHUNK_ALLOC_LIMITED, CHUNK_ALLOC_FORCE, + CHUNK_ALLOC_FORCE_FOR_EXTENT, };
struct btrfs_caching_control { --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -4087,7 +4087,7 @@ static int find_free_extent_update_loop( }
ret = btrfs_chunk_alloc(trans, ffe_ctl->flags, - CHUNK_ALLOC_FORCE); + CHUNK_ALLOC_FORCE_FOR_EXTENT);
/* Do not bail out on ENOSPC since we can do more. */ if (ret == -ENOSPC)
From: Jia-Ju Bai baijiaju1990@gmail.com
commit 168a2f776b9762f4021421008512dd7ab7474df1 upstream.
In btrfs_get_root_ref(), when btrfs_insert_fs_root() fails, btrfs_put_root() can happen for two reasons:
- the root already exists in the tree, in that case it returns the reference obtained in btrfs_lookup_fs_root()
- another error so the cleanup is done in the fail label
Calling btrfs_put_root() unconditionally would lead to double decrement of the root reference possibly freeing it in the second case.
Reported-by: TOTE Robot oslab@tsinghua.edu.cn Fixes: bc44d7c4b2b1 ("btrfs: push btrfs_grab_fs_root into btrfs_get_fs_root") CC: stable@vger.kernel.org # 5.10+ Signed-off-by: Jia-Ju Bai baijiaju1990@gmail.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/disk-io.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/fs/btrfs/disk-io.c +++ b/fs/btrfs/disk-io.c @@ -1826,9 +1826,10 @@ again:
ret = btrfs_insert_fs_root(fs_info, root); if (ret) { - btrfs_put_root(root); - if (ret == -EEXIST) + if (ret == -EEXIST) { + btrfs_put_root(root); goto again; + } goto fail; } return root;
From: Naohiro Aota naohiro.aota@wdc.com
commit a690e5f2db4d1dca742ce734aaff9f3112d63764 upstream.
When btrfs balance is interrupted with umount, the background balance resumes on the next mount. There is a potential deadlock with FS freezing here like as described in commit 26559780b953 ("btrfs: zoned: mark relocation as writing"). Mark the process as sb_writing to avoid it.
Reviewed-by: Filipe Manana fdmanana@suse.com CC: stable@vger.kernel.org # 4.9+ Signed-off-by: Naohiro Aota naohiro.aota@wdc.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/volumes.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -4467,10 +4467,12 @@ static int balance_kthread(void *data) struct btrfs_fs_info *fs_info = data; int ret = 0;
+ sb_start_write(fs_info->sb); mutex_lock(&fs_info->balance_mutex); if (fs_info->balance_ctl) ret = btrfs_balance(fs_info, fs_info->balance_ctl, NULL); mutex_unlock(&fs_info->balance_mutex); + sb_end_write(fs_info->sb);
return ret; }
From: Tim Crawford tcrawford@system76.com
commit 9eb6f5c388060d8cef3c8b616cc31b765e022359 upstream.
Fixes speaker output and headset detection on Clevo PD50PNT.
Signed-off-by: Tim Crawford tcrawford@system76.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220405182029.27431-1-tcrawford@system76.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 1 + 1 file changed, 1 insertion(+)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -2619,6 +2619,7 @@ static const struct snd_pci_quirk alc882 SND_PCI_QUIRK(0x1558, 0x65e1, "Clevo PB51[ED][DF]", ALC1220_FIXUP_CLEVO_PB51ED_PINS), SND_PCI_QUIRK(0x1558, 0x65e5, "Clevo PC50D[PRS](?:-D|-G)?", ALC1220_FIXUP_CLEVO_PB51ED_PINS), SND_PCI_QUIRK(0x1558, 0x65f1, "Clevo PC50HS", ALC1220_FIXUP_CLEVO_PB51ED_PINS), + SND_PCI_QUIRK(0x1558, 0x65f5, "Clevo PD50PN[NRT]", ALC1220_FIXUP_CLEVO_PB51ED_PINS), SND_PCI_QUIRK(0x1558, 0x67d1, "Clevo PB71[ER][CDF]", ALC1220_FIXUP_CLEVO_PB51ED_PINS), SND_PCI_QUIRK(0x1558, 0x67e1, "Clevo PB71[DE][CDF]", ALC1220_FIXUP_CLEVO_PB51ED_PINS), SND_PCI_QUIRK(0x1558, 0x67e5, "Clevo PC70D[PRS](?:-D|-G)?", ALC1220_FIXUP_CLEVO_PB51ED_PINS),
From: Tao Jin tao-j@outlook.com
commit 264fb03497ec1c7841bba872571bcd11beed57a7 upstream.
For this specific device on Lenovo Thinkpad X12 tablet, the verbs were dumped by qemu running a guest OS that init this codec properly. After studying the dump, it turns out that the same quirk used by the other Lenovo devices can be reused.
The patch was tested working against the mainline kernel.
Cc: stable@vger.kernel.org Signed-off-by: Tao Jin tao-j@outlook.com Link: https://lore.kernel.org/r/CO6PR03MB6241CD73310B37858FE64C85E1E89@CO6PR03MB62... Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 1 + 1 file changed, 1 insertion(+)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -9218,6 +9218,7 @@ static const struct snd_pci_quirk alc269 SND_PCI_QUIRK(0x17aa, 0x505d, "Thinkpad", ALC298_FIXUP_TPT470_DOCK), SND_PCI_QUIRK(0x17aa, 0x505f, "Thinkpad", ALC298_FIXUP_TPT470_DOCK), SND_PCI_QUIRK(0x17aa, 0x5062, "Thinkpad", ALC298_FIXUP_TPT470_DOCK), + SND_PCI_QUIRK(0x17aa, 0x508b, "Thinkpad X12 Gen 1", ALC287_FIXUP_LEGION_15IMHG05_SPEAKERS), SND_PCI_QUIRK(0x17aa, 0x5109, "Thinkpad", ALC269_FIXUP_LIMIT_INT_MIC_BOOST), SND_PCI_QUIRK(0x17aa, 0x511e, "Thinkpad", ALC298_FIXUP_TPT470_DOCK), SND_PCI_QUIRK(0x17aa, 0x511f, "Thinkpad", ALC298_FIXUP_TPT470_DOCK),
From: Fabio M. De Francesco fmdefrancesco@gmail.com
commit 2f7a26abb8241a0208c68d22815aa247c5ddacab upstream.
Syzbot reports "KASAN: null-ptr-deref Write in snd_pcm_format_set_silence".[1]
It is due to missing validation of the "silence" field of struct "pcm_format_data" in "pcm_formats" array.
Add a test for valid "pat" and, if it is not so, return -EINVAL.
[1] https://lore.kernel.org/lkml/000000000000d188ef05dc2c7279@google.com/
Reported-and-tested-by: syzbot+205eb15961852c2c5974@syzkaller.appspotmail.com Signed-off-by: Fabio M. De Francesco fmdefrancesco@gmail.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220409012655.9399-1-fmdefrancesco@gmail.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/core/pcm_misc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/sound/core/pcm_misc.c +++ b/sound/core/pcm_misc.c @@ -433,7 +433,7 @@ int snd_pcm_format_set_silence(snd_pcm_f return 0; width = pcm_formats[(INT)format].phys; /* physical width */ pat = pcm_formats[(INT)format].silence; - if (! width) + if (!width || !pat) return -EINVAL; /* signed or 1 byte data */ if (pcm_formats[(INT)format].signd == 1 || width <= 8) {
From: Johannes Berg johannes.berg@intel.com
commit 6624bb34b4eb19f715db9908cca00122748765d7 upstream.
We need this to be at least two bytes, so we can access alpha2[0] and alpha2[1]. It may be three in case some userspace used NUL-termination since it was NLA_STRING (and we also push it out with NUL-termination).
Cc: stable@vger.kernel.org Reported-by: Lee Jones lee.jones@linaro.org Link: https://lore.kernel.org/r/20220411114201.fd4a31f06541.Ie7ff4be2cf348d8cc28ed... Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/wireless/nl80211.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/net/wireless/nl80211.c +++ b/net/wireless/nl80211.c @@ -519,7 +519,8 @@ static const struct nla_policy nl80211_p .len = IEEE80211_MAX_MESH_ID_LEN }, [NL80211_ATTR_MPATH_NEXT_HOP] = NLA_POLICY_ETH_ADDR_COMPAT,
- [NL80211_ATTR_REG_ALPHA2] = { .type = NLA_STRING, .len = 2 }, + /* allow 3 for NUL-termination, we used to declare this NLA_STRING */ + [NL80211_ATTR_REG_ALPHA2] = NLA_POLICY_RANGE(NLA_BINARY, 2, 3), [NL80211_ATTR_REG_RULES] = { .type = NLA_NESTED },
[NL80211_ATTR_BSS_CTS_PROT] = { .type = NLA_U8 },
From: Nicolas Dichtel nicolas.dichtel@6wind.com
commit e3fa461d8b0e185b7da8a101fe94dfe6dd500ac0 upstream.
kongweibin reported a kernel panic in ip6_forward() when input interface has no in6 dev associated.
The following tc commands were used to reproduce this panic: tc qdisc del dev vxlan100 root tc qdisc add dev vxlan100 root netem corrupt 5%
CC: stable@vger.kernel.org Fixes: ccd27f05ae7b ("ipv6: fix 'disable_policy' for fwd packets") Reported-by: kongweibin kongweibin2@huawei.com Signed-off-by: Nicolas Dichtel nicolas.dichtel@6wind.com Reviewed-by: David Ahern dsahern@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/ip6_output.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -485,7 +485,7 @@ int ip6_forward(struct sk_buff *skb) goto drop;
if (!net->ipv6.devconf_all->disable_policy && - !idev->cnf.disable_policy && + (!idev || !idev->cnf.disable_policy) && !xfrm6_policy_check(NULL, XFRM_POLICY_FWD, skb)) { __IP6_INC_STATS(net, idev, IPSTATS_MIB_INDISCARDS); goto drop;
From: Melissa Wen mwen@igalia.com
commit e4f1541caf60fcbe5a59e9d25805c0b5865e546a upstream.
"Pre-multiplied" is the default pixel blend mode for KMS/DRM, as documented in supported_modes of drm_plane_create_blend_mode_property(): https://cgit.freedesktop.org/drm/drm-misc/tree/drivers/gpu/drm/drm_blend.c
In this mode, both 'pixel alpha' and 'plane alpha' participate in the calculation, as described by the pixel blend mode formula in KMS/DRM documentation:
out.rgb = plane_alpha * fg.rgb + (1 - (plane_alpha * fg.alpha)) * bg.rgb
Considering the blend config mechanisms we have in the driver so far, the alpha mode that better fits this blend mode is the _PER_PIXEL_ALPHA_COMBINED_GLOBAL_GAIN, where the value for global_gain is the plane alpha (global_alpha).
With this change, alpha property stops to be ignored. It also addresses Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/1734
v2: * keep the 8-bit value for global_alpha_value (Nicholas) * correct the logical ordering for combined global gain (Nicholas) * apply to dcn10 too (Nicholas)
Signed-off-by: Melissa Wen mwen@igalia.com Tested-by: Rodrigo Siqueira Rodrigo.Siqueira@amd.com Reviewed-by: Harry Wentland harry.wentland@amd.com Tested-by: Simon Ser contact@emersion.fr Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c | 14 +++++++++----- drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c | 14 +++++++++----- 2 files changed, 18 insertions(+), 10 deletions(-)
--- a/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c +++ b/drivers/gpu/drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c @@ -2520,14 +2520,18 @@ void dcn10_update_mpcc(struct dc *dc, st struct mpc *mpc = dc->res_pool->mpc; struct mpc_tree *mpc_tree_params = &(pipe_ctx->stream_res.opp->mpc_tree_params);
- if (per_pixel_alpha) - blnd_cfg.alpha_mode = MPCC_ALPHA_BLEND_MODE_PER_PIXEL_ALPHA; - else - blnd_cfg.alpha_mode = MPCC_ALPHA_BLEND_MODE_GLOBAL_ALPHA; - blnd_cfg.overlap_only = false; blnd_cfg.global_gain = 0xff;
+ if (per_pixel_alpha && pipe_ctx->plane_state->global_alpha) { + blnd_cfg.alpha_mode = MPCC_ALPHA_BLEND_MODE_PER_PIXEL_ALPHA_COMBINED_GLOBAL_GAIN; + blnd_cfg.global_gain = pipe_ctx->plane_state->global_alpha_value; + } else if (per_pixel_alpha) { + blnd_cfg.alpha_mode = MPCC_ALPHA_BLEND_MODE_PER_PIXEL_ALPHA; + } else { + blnd_cfg.alpha_mode = MPCC_ALPHA_BLEND_MODE_GLOBAL_ALPHA; + } + if (pipe_ctx->plane_state->global_alpha) blnd_cfg.global_alpha = pipe_ctx->plane_state->global_alpha_value; else --- a/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c +++ b/drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c @@ -2313,14 +2313,18 @@ void dcn20_update_mpcc(struct dc *dc, st struct mpc *mpc = dc->res_pool->mpc; struct mpc_tree *mpc_tree_params = &(pipe_ctx->stream_res.opp->mpc_tree_params);
- if (per_pixel_alpha) - blnd_cfg.alpha_mode = MPCC_ALPHA_BLEND_MODE_PER_PIXEL_ALPHA; - else - blnd_cfg.alpha_mode = MPCC_ALPHA_BLEND_MODE_GLOBAL_ALPHA; - blnd_cfg.overlap_only = false; blnd_cfg.global_gain = 0xff;
+ if (per_pixel_alpha && pipe_ctx->plane_state->global_alpha) { + blnd_cfg.alpha_mode = MPCC_ALPHA_BLEND_MODE_PER_PIXEL_ALPHA_COMBINED_GLOBAL_GAIN; + blnd_cfg.global_gain = pipe_ctx->plane_state->global_alpha_value; + } else if (per_pixel_alpha) { + blnd_cfg.alpha_mode = MPCC_ALPHA_BLEND_MODE_PER_PIXEL_ALPHA; + } else { + blnd_cfg.alpha_mode = MPCC_ALPHA_BLEND_MODE_GLOBAL_ALPHA; + } + if (pipe_ctx->plane_state->global_alpha) blnd_cfg.global_alpha = pipe_ctx->plane_state->global_alpha_value; else
From: Tomasz Moń desowin@gmail.com
commit 4593c1b6d159f1e5c35c07a7f125e79e5a864302 upstream.
Enabling gfxoff quirk results in perfectly usable graphical user interface on MacBook Pro (15-inch, 2019) with Radeon Pro Vega 20 4 GB.
Without the quirk, X server is completely unusable as every few seconds there is gpu reset due to ring gfx timeout.
Signed-off-by: Tomasz Moń desowin@gmail.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v9_0.c @@ -1334,6 +1334,8 @@ static const struct amdgpu_gfxoff_quirk { 0x1002, 0x15dd, 0x103c, 0x83e7, 0xd3 }, /* GFXOFF is unstable on C6 parts with a VBIOS 113-RAVEN-114 */ { 0x1002, 0x15dd, 0x1002, 0x15dd, 0xc6 }, + /* Apple MacBook Pro (15-inch, 2019) Radeon Pro Vega 20 4 GB */ + { 0x1002, 0x69af, 0x106b, 0x019a, 0xc0 }, { 0, 0, 0, 0, 0 }, };
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit 258f3b8c3210b03386e4ad92b4bd8652b5c1beb3 upstream.
tsx_clear_cpuid() uses MSR_TSX_FORCE_ABORT to clear CPUID.RTM and CPUID.HLE. Not all CPUs support MSR_TSX_FORCE_ABORT, alternatively use MSR_IA32_TSX_CTRL when supported.
[ bp: Document how and why TSX gets disabled. ]
Fixes: 293649307ef9 ("x86/tsx: Clear CPUID bits when TSX always force aborts") Reported-by: kernel test robot lkp@intel.com Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Borislav Petkov bp@suse.de Tested-by: Neelima Krishnan neelima.krishnan@intel.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/5b323e77e251a9c8bcdda498c5cc0095be1e1d3c.164694378... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/kernel/cpu/intel.c | 1 arch/x86/kernel/cpu/tsx.c | 54 ++++++++++++++++++++++++++++++++++++++------ 2 files changed, 48 insertions(+), 7 deletions(-)
--- a/arch/x86/kernel/cpu/intel.c +++ b/arch/x86/kernel/cpu/intel.c @@ -722,6 +722,7 @@ static void init_intel(struct cpuinfo_x8 else if (tsx_ctrl_state == TSX_CTRL_DISABLE) tsx_disable(); else if (tsx_ctrl_state == TSX_CTRL_RTM_ALWAYS_ABORT) + /* See comment over that function for more details. */ tsx_clear_cpuid();
split_lock_init(); --- a/arch/x86/kernel/cpu/tsx.c +++ b/arch/x86/kernel/cpu/tsx.c @@ -58,7 +58,7 @@ void tsx_enable(void) wrmsrl(MSR_IA32_TSX_CTRL, tsx); }
-static bool __init tsx_ctrl_is_supported(void) +static bool tsx_ctrl_is_supported(void) { u64 ia32_cap = x86_read_arch_cap_msr();
@@ -84,6 +84,44 @@ static enum tsx_ctrl_states x86_get_tsx_ return TSX_CTRL_ENABLE; }
+/* + * Disabling TSX is not a trivial business. + * + * First of all, there's a CPUID bit: X86_FEATURE_RTM_ALWAYS_ABORT + * which says that TSX is practically disabled (all transactions are + * aborted by default). When that bit is set, the kernel unconditionally + * disables TSX. + * + * In order to do that, however, it needs to dance a bit: + * + * 1. The first method to disable it is through MSR_TSX_FORCE_ABORT and + * the MSR is present only when *two* CPUID bits are set: + * + * - X86_FEATURE_RTM_ALWAYS_ABORT + * - X86_FEATURE_TSX_FORCE_ABORT + * + * 2. The second method is for CPUs which do not have the above-mentioned + * MSR: those use a different MSR - MSR_IA32_TSX_CTRL and disable TSX + * through that one. Those CPUs can also have the initially mentioned + * CPUID bit X86_FEATURE_RTM_ALWAYS_ABORT set and for those the same strategy + * applies: TSX gets disabled unconditionally. + * + * When either of the two methods are present, the kernel disables TSX and + * clears the respective RTM and HLE feature flags. + * + * An additional twist in the whole thing presents late microcode loading + * which, when done, may cause for the X86_FEATURE_RTM_ALWAYS_ABORT CPUID + * bit to be set after the update. + * + * A subsequent hotplug operation on any logical CPU except the BSP will + * cause for the supported CPUID feature bits to get re-detected and, if + * RTM and HLE get cleared all of a sudden, but, userspace did consult + * them before the update, then funny explosions will happen. Long story + * short: the kernel doesn't modify CPUID feature bits after booting. + * + * That's why, this function's call in init_intel() doesn't clear the + * feature flags. + */ void tsx_clear_cpuid(void) { u64 msr; @@ -97,6 +135,10 @@ void tsx_clear_cpuid(void) rdmsrl(MSR_TSX_FORCE_ABORT, msr); msr |= MSR_TFA_TSX_CPUID_CLEAR; wrmsrl(MSR_TSX_FORCE_ABORT, msr); + } else if (tsx_ctrl_is_supported()) { + rdmsrl(MSR_IA32_TSX_CTRL, msr); + msr |= TSX_CTRL_CPUID_CLEAR; + wrmsrl(MSR_IA32_TSX_CTRL, msr); } }
@@ -106,13 +148,11 @@ void __init tsx_init(void) int ret;
/* - * Hardware will always abort a TSX transaction if both CPUID bits - * RTM_ALWAYS_ABORT and TSX_FORCE_ABORT are set. In this case, it is - * better not to enumerate CPUID.RTM and CPUID.HLE bits. Clear them - * here. + * Hardware will always abort a TSX transaction when the CPUID bit + * RTM_ALWAYS_ABORT is set. In this case, it is better not to enumerate + * CPUID.RTM and CPUID.HLE bits. Clear them here. */ - if (boot_cpu_has(X86_FEATURE_RTM_ALWAYS_ABORT) && - boot_cpu_has(X86_FEATURE_TSX_FORCE_ABORT)) { + if (boot_cpu_has(X86_FEATURE_RTM_ALWAYS_ABORT)) { tsx_ctrl_state = TSX_CTRL_RTM_ALWAYS_ABORT; tsx_clear_cpuid(); setup_clear_cpu_cap(X86_FEATURE_RTM);
From: Pawan Gupta pawan.kumar.gupta@linux.intel.com
commit 400331f8ffa3bec5c561417e5eec6848464e9160 upstream.
A microcode update on some Intel processors causes all TSX transactions to always abort by default[*]. Microcode also added functionality to re-enable TSX for development purposes. With this microcode loaded, if tsx=on was passed on the cmdline, and TSX development mode was already enabled before the kernel boot, it may make the system vulnerable to TSX Asynchronous Abort (TAA).
To be on safer side, unconditionally disable TSX development mode during boot. If a viable use case appears, this can be revisited later.
[*]: Intel TSX Disable Update for Selected Processors, doc ID: 643557
[ bp: Drop unstable web link, massage heavily. ]
Suggested-by: Andrew Cooper andrew.cooper3@citrix.com Suggested-by: Borislav Petkov bp@alien8.de Signed-off-by: Pawan Gupta pawan.kumar.gupta@linux.intel.com Signed-off-by: Borislav Petkov bp@suse.de Tested-by: Neelima Krishnan neelima.krishnan@intel.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/347bd844da3a333a9793c6687d4e4eb3b2419a3e.164694378... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/msr-index.h | 4 +- arch/x86/kernel/cpu/common.c | 2 + arch/x86/kernel/cpu/cpu.h | 5 +-- arch/x86/kernel/cpu/intel.c | 8 ----- arch/x86/kernel/cpu/tsx.c | 50 +++++++++++++++++++++++++++++++-- tools/arch/x86/include/asm/msr-index.h | 4 +- 6 files changed, 55 insertions(+), 18 deletions(-)
--- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -128,9 +128,9 @@ #define TSX_CTRL_RTM_DISABLE BIT(0) /* Disable RTM feature */ #define TSX_CTRL_CPUID_CLEAR BIT(1) /* Disable TSX enumeration */
-/* SRBDS support */ #define MSR_IA32_MCU_OPT_CTRL 0x00000123 -#define RNGDS_MITG_DIS BIT(0) +#define RNGDS_MITG_DIS BIT(0) /* SRBDS support */ +#define RTM_ALLOW BIT(1) /* TSX development mode */
#define MSR_IA32_SYSENTER_CS 0x00000174 #define MSR_IA32_SYSENTER_ESP 0x00000175 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -1719,6 +1719,8 @@ void identify_secondary_cpu(struct cpuin validate_apic_and_package_id(c); x86_spec_ctrl_setup_ap(); update_srbds_msr(); + + tsx_ap_init(); }
static __init int setup_noclflush(char *arg) --- a/arch/x86/kernel/cpu/cpu.h +++ b/arch/x86/kernel/cpu/cpu.h @@ -55,11 +55,10 @@ enum tsx_ctrl_states { extern __ro_after_init enum tsx_ctrl_states tsx_ctrl_state;
extern void __init tsx_init(void); -extern void tsx_enable(void); -extern void tsx_disable(void); -extern void tsx_clear_cpuid(void); +void tsx_ap_init(void); #else static inline void tsx_init(void) { } +static inline void tsx_ap_init(void) { } #endif /* CONFIG_CPU_SUP_INTEL */
extern void get_cpu_cap(struct cpuinfo_x86 *c); --- a/arch/x86/kernel/cpu/intel.c +++ b/arch/x86/kernel/cpu/intel.c @@ -717,14 +717,6 @@ static void init_intel(struct cpuinfo_x8
init_intel_misc_features(c);
- if (tsx_ctrl_state == TSX_CTRL_ENABLE) - tsx_enable(); - else if (tsx_ctrl_state == TSX_CTRL_DISABLE) - tsx_disable(); - else if (tsx_ctrl_state == TSX_CTRL_RTM_ALWAYS_ABORT) - /* See comment over that function for more details. */ - tsx_clear_cpuid(); - split_lock_init(); bus_lock_init();
--- a/arch/x86/kernel/cpu/tsx.c +++ b/arch/x86/kernel/cpu/tsx.c @@ -19,7 +19,7 @@
enum tsx_ctrl_states tsx_ctrl_state __ro_after_init = TSX_CTRL_NOT_SUPPORTED;
-void tsx_disable(void) +static void tsx_disable(void) { u64 tsx;
@@ -39,7 +39,7 @@ void tsx_disable(void) wrmsrl(MSR_IA32_TSX_CTRL, tsx); }
-void tsx_enable(void) +static void tsx_enable(void) { u64 tsx;
@@ -122,7 +122,7 @@ static enum tsx_ctrl_states x86_get_tsx_ * That's why, this function's call in init_intel() doesn't clear the * feature flags. */ -void tsx_clear_cpuid(void) +static void tsx_clear_cpuid(void) { u64 msr;
@@ -142,11 +142,42 @@ void tsx_clear_cpuid(void) } }
+/* + * Disable TSX development mode + * + * When the microcode released in Feb 2022 is applied, TSX will be disabled by + * default on some processors. MSR 0x122 (TSX_CTRL) and MSR 0x123 + * (IA32_MCU_OPT_CTRL) can be used to re-enable TSX for development, doing so is + * not recommended for production deployments. In particular, applying MD_CLEAR + * flows for mitigation of the Intel TSX Asynchronous Abort (TAA) transient + * execution attack may not be effective on these processors when Intel TSX is + * enabled with updated microcode. + */ +static void tsx_dev_mode_disable(void) +{ + u64 mcu_opt_ctrl; + + /* Check if RTM_ALLOW exists */ + if (!boot_cpu_has_bug(X86_BUG_TAA) || !tsx_ctrl_is_supported() || + !cpu_feature_enabled(X86_FEATURE_SRBDS_CTRL)) + return; + + rdmsrl(MSR_IA32_MCU_OPT_CTRL, mcu_opt_ctrl); + + if (mcu_opt_ctrl & RTM_ALLOW) { + mcu_opt_ctrl &= ~RTM_ALLOW; + wrmsrl(MSR_IA32_MCU_OPT_CTRL, mcu_opt_ctrl); + setup_force_cpu_cap(X86_FEATURE_RTM_ALWAYS_ABORT); + } +} + void __init tsx_init(void) { char arg[5] = {}; int ret;
+ tsx_dev_mode_disable(); + /* * Hardware will always abort a TSX transaction when the CPUID bit * RTM_ALWAYS_ABORT is set. In this case, it is better not to enumerate @@ -215,3 +246,16 @@ void __init tsx_init(void) setup_force_cpu_cap(X86_FEATURE_HLE); } } + +void tsx_ap_init(void) +{ + tsx_dev_mode_disable(); + + if (tsx_ctrl_state == TSX_CTRL_ENABLE) + tsx_enable(); + else if (tsx_ctrl_state == TSX_CTRL_DISABLE) + tsx_disable(); + else if (tsx_ctrl_state == TSX_CTRL_RTM_ALWAYS_ABORT) + /* See comment over that function for more details. */ + tsx_clear_cpuid(); +} --- a/tools/arch/x86/include/asm/msr-index.h +++ b/tools/arch/x86/include/asm/msr-index.h @@ -128,9 +128,9 @@ #define TSX_CTRL_RTM_DISABLE BIT(0) /* Disable RTM feature */ #define TSX_CTRL_CPUID_CLEAR BIT(1) /* Disable TSX enumeration */
-/* SRBDS support */ #define MSR_IA32_MCU_OPT_CTRL 0x00000123 -#define RNGDS_MITG_DIS BIT(0) +#define RNGDS_MITG_DIS BIT(0) /* SRBDS support */ +#define RTM_ALLOW BIT(1) /* TSX development mode */
#define MSR_IA32_SYSENTER_CS 0x00000174 #define MSR_IA32_SYSENTER_ESP 0x00000175
From: Rei Yamamoto yamamoto.rei@jp.fujitsu.com
commit 08d835dff916bfe8f45acc7b92c7af6c4081c8a7 upstream.
If CPUs on a node are offline at boot time, the number of nodes is different when building affinity masks for present cpus and when building affinity masks for possible cpus. This causes the following problem:
In the case that the number of vectors is less than the number of nodes there are cases where bits of masks for present cpus are overwritten when building masks for possible cpus.
Fix this by excluding CPUs, which are not part of the current build mask (present/possible).
[ tglx: Massaged changelog and added comment ]
Fixes: b82592199032 ("genirq/affinity: Spread IRQs to all available NUMA nodes") Signed-off-by: Rei Yamamoto yamamoto.rei@jp.fujitsu.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Ming Lei ming.lei@redhat.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20220331003309.10891-1-yamamoto.rei@jp.fujitsu.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/irq/affinity.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/kernel/irq/affinity.c +++ b/kernel/irq/affinity.c @@ -269,8 +269,9 @@ static int __irq_build_affinity_masks(un */ if (numvecs <= nodes) { for_each_node_mask(n, nodemsk) { - cpumask_or(&masks[curvec].mask, &masks[curvec].mask, - node_to_cpumask[n]); + /* Ensure that only CPUs which are in both masks are set */ + cpumask_and(nmsk, cpu_mask, node_to_cpumask[n]); + cpumask_or(&masks[curvec].mask, &masks[curvec].mask, nmsk); if (++curvec == last_affv) curvec = firstvec; }
From: Paul Gortmaker paul.gortmaker@windriver.com
commit 40e97e42961f8c6cc7bd5fe67cc18417e02d78f1 upstream.
While running some testing on code that happened to allow the variable tick_nohz_full_running to get set but with no "possible" NOHZ cores to back up that setting, this warning triggered:
if (unlikely(tick_do_timer_cpu == TICK_DO_TIMER_NONE)) WARN_ON(tick_nohz_full_running);
The console was overwhemled with an endless stream of one WARN per tick per core and there was no way to even see what was going on w/o using a serial console to capture it and then trace it back to this.
Change it to WARN_ON_ONCE().
Fixes: 08ae95f4fd3b ("nohz_full: Allow the boot CPU to be nohz_full") Signed-off-by: Paul Gortmaker paul.gortmaker@windriver.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20211206145950.10927-3-paul.gortmaker@windriver.co... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/time/tick-sched.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/time/tick-sched.c +++ b/kernel/time/tick-sched.c @@ -186,7 +186,7 @@ static void tick_sched_do_timer(struct t */ if (unlikely(tick_do_timer_cpu == TICK_DO_TIMER_NONE)) { #ifdef CONFIG_NO_HZ_FULL - WARN_ON(tick_nohz_full_running); + WARN_ON_ONCE(tick_nohz_full_running); #endif tick_do_timer_cpu = cpu; }
From: Nathan Chancellor nathan@kernel.org
commit 83a1cde5c74bfb44b49cb2a940d044bb2380f4ea upstream.
With newer versions of GCC, there is a panic in da850_evm_config_emac() when booting multi_v5_defconfig in QEMU under the palmetto-bmc machine:
Unable to handle kernel NULL pointer dereference at virtual address 00000020 pgd = (ptrval) [00000020] *pgd=00000000 Internal error: Oops: 5 [#1] PREEMPT ARM Modules linked in: CPU: 0 PID: 1 Comm: swapper Not tainted 5.15.0 #1 Hardware name: Generic DT based system PC is at da850_evm_config_emac+0x1c/0x120 LR is at do_one_initcall+0x50/0x1e0
The emac_pdata pointer in soc_info is NULL because davinci_soc_info only gets populated on davinci machines but da850_evm_config_emac() is called on all machines via device_initcall().
Move the rmii_en assignment below the machine check so that it is only dereferenced when running on a supported SoC.
Fixes: bae105879f2f ("davinci: DA850/OMAP-L138 EVM: implement autodetect of RMII PHY") Signed-off-by: Nathan Chancellor nathan@kernel.org Reviewed-by: Arnd Bergmann arnd@arndb.de Reviewed-by: Bartosz Golaszewski brgl@bgdev.pl Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/YcS4xVWs6bQlQSPC@archlinux-ax161/ Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/mach-davinci/board-da850-evm.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/arch/arm/mach-davinci/board-da850-evm.c +++ b/arch/arm/mach-davinci/board-da850-evm.c @@ -1101,11 +1101,13 @@ static int __init da850_evm_config_emac( int ret; u32 val; struct davinci_soc_info *soc_info = &davinci_soc_info; - u8 rmii_en = soc_info->emac_pdata->rmii_en; + u8 rmii_en;
if (!machine_is_davinci_da850_evm()) return 0;
+ rmii_en = soc_info->emac_pdata->rmii_en; + cfg_chip3_base = DA8XX_SYSCFG0_VIRT(DA8XX_CFGCHIP3_REG);
val = __raw_readl(cfg_chip3_base);
From: Alexander Sverdlin alexander.sverdlin@gmail.com
commit 3b68b08885217abd9c57ff9b3bb3eb173eee02a9 upstream.
arch/arm/mach-ep93xx/clock.c:154:2: warning: Use of memory after it is freed [clang-analyzer-unix.Malloc] arch/arm/mach-ep93xx/clock.c:151:2: note: Taking true branch if (IS_ERR(clk)) ^ arch/arm/mach-ep93xx/clock.c:152:3: note: Memory is released kfree(psc); ^~~~~~~~~~ arch/arm/mach-ep93xx/clock.c:154:2: note: Use of memory after it is freed return &psc->hw; ^ ~~~~~~~~
Fixes: 9645ccc7bd7a ("ep93xx: clock: convert in-place to COMMON_CLK") Reported-by: kernel test robot lkp@intel.com Signed-off-by: Alexander Sverdlin alexander.sverdlin@gmail.com Cc: stable@vger.kernel.org Link: https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org/thread/B5YCO2NJ... Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/mach-ep93xx/clock.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/arch/arm/mach-ep93xx/clock.c +++ b/arch/arm/mach-ep93xx/clock.c @@ -148,8 +148,10 @@ static struct clk_hw *ep93xx_clk_registe psc->lock = &clk_lock;
clk = clk_register(NULL, &psc->hw); - if (IS_ERR(clk)) + if (IS_ERR(clk)) { kfree(psc); + return ERR_CAST(clk); + }
return &psc->hw; }
From: Mikulas Patocka mpatocka@redhat.com
commit 08c1af8f1c13bbf210f1760132f4df24d0ed46d6 upstream.
It is possible to set up dm-integrity in such a way that the "tag_size" parameter is less than the actual digest size. In this situation, a part of the digest beyond tag_size is ignored.
In this case, dm-integrity would write beyond the end of the ic->recalc_tags array and corrupt memory. The corruption happened in integrity_recalc->integrity_sector_checksum->crypto_shash_final.
Fix this corruption by increasing the tags array so that it has enough padding at the end to accomodate the loop in integrity_recalc() being able to write a full digest size for the last member of the tags array.
Cc: stable@vger.kernel.org # v4.19+ Signed-off-by: Mikulas Patocka mpatocka@redhat.com Signed-off-by: Mike Snitzer snitzer@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/md/dm-integrity.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
--- a/drivers/md/dm-integrity.c +++ b/drivers/md/dm-integrity.c @@ -4400,6 +4400,7 @@ try_smaller_buffer: }
if (ic->internal_hash) { + size_t recalc_tags_size; ic->recalc_wq = alloc_workqueue("dm-integrity-recalc", WQ_MEM_RECLAIM, 1); if (!ic->recalc_wq ) { ti->error = "Cannot allocate workqueue"; @@ -4413,8 +4414,10 @@ try_smaller_buffer: r = -ENOMEM; goto bad; } - ic->recalc_tags = kvmalloc_array(RECALC_SECTORS >> ic->sb->log2_sectors_per_block, - ic->tag_size, GFP_KERNEL); + recalc_tags_size = (RECALC_SECTORS >> ic->sb->log2_sectors_per_block) * ic->tag_size; + if (crypto_shash_digestsize(ic->internal_hash) > ic->tag_size) + recalc_tags_size += crypto_shash_digestsize(ic->internal_hash) - ic->tag_size; + ic->recalc_tags = kvmalloc(recalc_tags_size, GFP_KERNEL); if (!ic->recalc_tags) { ti->error = "Cannot allocate tags for recalculating"; r = -ENOMEM;
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
commit 993eb48fa199b5f476df8204e652eff63dd19361 upstream.
If dev_set_name() fails, the dev_name() is null, check the return value of dev_set_name() to avoid the null-ptr-deref.
Fixes: 1413ef638aba ("i2c: dev: Fix the race between the release of i2c_dev and cdev") Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i2c/i2c-dev.c | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-)
--- a/drivers/i2c/i2c-dev.c +++ b/drivers/i2c/i2c-dev.c @@ -668,16 +668,21 @@ static int i2cdev_attach_adapter(struct i2c_dev->dev.class = i2c_dev_class; i2c_dev->dev.parent = &adap->dev; i2c_dev->dev.release = i2cdev_dev_release; - dev_set_name(&i2c_dev->dev, "i2c-%d", adap->nr); + + res = dev_set_name(&i2c_dev->dev, "i2c-%d", adap->nr); + if (res) + goto err_put_i2c_dev;
res = cdev_device_add(&i2c_dev->cdev, &i2c_dev->dev); - if (res) { - put_i2c_dev(i2c_dev, false); - return res; - } + if (res) + goto err_put_i2c_dev;
pr_debug("adapter [%s] registered as minor %d\n", adap->name, adap->nr); return 0; + +err_put_i2c_dev: + put_i2c_dev(i2c_dev, false); + return res; }
static int i2cdev_detach_adapter(struct device *dev, void *dummy)
From: Vladimir Oltean vladimir.oltean@nxp.com
commit 762c2998c9625f642f0d23da7d3f7e4f90665fdf upstream.
This reverts commit 11fd667dac315ea3f2469961f6d2869271a46cae.
dsa_slave_change_mtu() updates the MTU of the DSA master and of the associated CPU port, but only if it detects a change to the master MTU.
The blamed commit in the Fixes: tag below addressed a regression where dsa_slave_change_mtu() would return early and not do anything due to ds->ops->port_change_mtu() not being implemented.
However, that commit also had the effect that the master MTU got set up to the correct value by dsa_master_setup(), but the associated CPU port's MTU did not get updated. This causes breakage for drivers that rely on the ->port_change_mtu() DSA call to account for the tagging overhead on the CPU port, and don't set up the initial MTU during the setup phase.
Things actually worked before because they were in a fragile equilibrium where dsa_slave_change_mtu() was called before dsa_master_setup() was. So dsa_slave_change_mtu() could actually detect a change and update the CPU port MTU too.
Restore the code to the way things used to work by reverting the reorder of dsa_tree_setup_master() and dsa_tree_setup_ports(). That change did not have a concrete motivation going for it anyway, it just looked better.
Fixes: 066dfc429040 ("Revert "net: dsa: stop updating master MTU from master.c"") Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/dsa/dsa2.c | 23 ++++++++++------------- 1 file changed, 10 insertions(+), 13 deletions(-)
--- a/net/dsa/dsa2.c +++ b/net/dsa/dsa2.c @@ -561,7 +561,6 @@ static void dsa_port_teardown(struct dsa struct devlink_port *dlp = &dp->devlink_port; struct dsa_switch *ds = dp->ds; struct dsa_mac_addr *a, *tmp; - struct net_device *slave;
if (!dp->setup) return; @@ -583,11 +582,9 @@ static void dsa_port_teardown(struct dsa dsa_port_link_unregister_of(dp); break; case DSA_PORT_TYPE_USER: - slave = dp->slave; - - if (slave) { + if (dp->slave) { + dsa_slave_destroy(dp->slave); dp->slave = NULL; - dsa_slave_destroy(slave); } break; } @@ -1137,17 +1134,17 @@ static int dsa_tree_setup(struct dsa_swi if (err) goto teardown_cpu_ports;
- err = dsa_tree_setup_master(dst); + err = dsa_tree_setup_ports(dst); if (err) goto teardown_switches;
- err = dsa_tree_setup_ports(dst); + err = dsa_tree_setup_master(dst); if (err) - goto teardown_master; + goto teardown_ports;
err = dsa_tree_setup_lags(dst); if (err) - goto teardown_ports; + goto teardown_master;
dst->setup = true;
@@ -1155,10 +1152,10 @@ static int dsa_tree_setup(struct dsa_swi
return 0;
-teardown_ports: - dsa_tree_teardown_ports(dst); teardown_master: dsa_tree_teardown_master(dst); +teardown_ports: + dsa_tree_teardown_ports(dst); teardown_switches: dsa_tree_teardown_switches(dst); teardown_cpu_ports: @@ -1176,10 +1173,10 @@ static void dsa_tree_teardown(struct dsa
dsa_tree_teardown_lags(dst);
- dsa_tree_teardown_ports(dst); - dsa_tree_teardown_master(dst);
+ dsa_tree_teardown_ports(dst); + dsa_tree_teardown_switches(dst);
dsa_tree_teardown_cpu_ports(dst);
From: Nadav Amit namit@vmware.com
commit 9e949a3886356fe9112c6f6f34a6e23d1d35407f upstream.
The check in flush_smp_call_function_queue() for callbacks that are sent to offline CPUs currently checks whether the queue is empty.
However, flush_smp_call_function_queue() has just deleted all the callbacks from the queue and moved all the entries into a local list. This checks would only be positive if some callbacks were added in the short time after llist_del_all() was called. This does not seem to be the intention of this check.
Change the check to look at the local list to which the entries were moved instead of the queue from which all the callbacks were just removed.
Fixes: 8d056c48e4862 ("CPU hotplug, smp: flush any pending IPI callbacks before CPU offline") Signed-off-by: Nadav Amit namit@vmware.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Link: https://lore.kernel.org/r/20220319072015.1495036-1-namit@vmware.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/smp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/smp.c +++ b/kernel/smp.c @@ -579,7 +579,7 @@ static void flush_smp_call_function_queu
/* There shouldn't be any pending callbacks on an offline CPU. */ if (unlikely(warn_cpu_offline && !cpu_online(smp_processor_id()) && - !warned && !llist_empty(head))) { + !warned && entry != NULL)) { warned = true; WARN(1, "IPI on offline CPU %d\n", smp_processor_id());
From: Sherry Sun sherry.sun@nxp.com
commit 4f9f45d0eb0e7d449bc9294459df79b9c66edfac upstream.
For the snps,ddrc-3.80a compatible, the interrupts property is also required, also order the compatibles by name (s goes before x).
Signed-off-by: Sherry Sun sherry.sun@nxp.com Fixes: a9e6b3819b36 ("dt-bindings: memory: Add entry for version 3.80a") Link: https://lore.kernel.org/r/20220321075131.17811-2-sherry.sun@nxp.com Signed-off-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/devicetree/bindings/memory-controllers/synopsys,ddrc-ecc.yaml | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/Documentation/devicetree/bindings/memory-controllers/synopsys,ddrc-ecc.yaml +++ b/Documentation/devicetree/bindings/memory-controllers/synopsys,ddrc-ecc.yaml @@ -24,9 +24,9 @@ description: | properties: compatible: enum: + - snps,ddrc-3.80a - xlnx,zynq-ddrc-a05 - xlnx,zynqmp-ddrc-2.40a - - snps,ddrc-3.80a
interrupts: maxItems: 1 @@ -43,7 +43,9 @@ allOf: properties: compatible: contains: - const: xlnx,zynqmp-ddrc-2.40a + enum: + - snps,ddrc-3.80a + - xlnx,zynqmp-ddrc-2.40a then: required: - interrupts
From: Martin Povišer povik+lin@cutebit.org
commit bd8963e602c77adc76dbbbfc3417c3cf14fed76b upstream.
Wait for completion of write transfers before returning from the driver. At first sight it may seem advantageous to leave write transfers queued for the controller to carry out on its own time, but there's a couple of issues with it:
* Driver doesn't check for FIFO space.
* The queued writes can complete while the driver is in its I2C read transfer path which means it will get confused by the raising of XEN (the 'transaction ended' signal). This can cause a spurious ENODATA error due to premature reading of the MRXFIFO register.
Adding the wait fixes some unreliability issues with the driver. There's some efficiency cost to it (especially with pasemi_smb_waitready doing its polling), but that will be alleviated once the driver receives interrupt support.
Fixes: beb58aa39e6e ("i2c: PA Semi SMBus driver") Signed-off-by: Martin Povišer povik+lin@cutebit.org Reviewed-by: Sven Peter sven@svenpeter.dev Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i2c/busses/i2c-pasemi-core.c | 6 ++++++ 1 file changed, 6 insertions(+)
--- a/drivers/i2c/busses/i2c-pasemi-core.c +++ b/drivers/i2c/busses/i2c-pasemi-core.c @@ -137,6 +137,12 @@ static int pasemi_i2c_xfer_msg(struct i2
TXFIFO_WR(smbus, msg->buf[msg->len-1] | (stop ? MTXFIFO_STOP : 0)); + + if (stop) { + err = pasemi_smb_waitready(smbus); + if (err) + goto reset_out; + } }
return 0;
From: Dongjin Yang dj76.yang@samsung.com
commit ce8b3ad1071b764e963d9b08ac34ffddddf12da6 upstream.
snps,dwmac has duplicated name for loongson,ls2k-dwmac and loongson,ls7a-dwmac.
Signed-off-by: Dongjin Yang dj76.yang@samsung.com Fixes: 68277749a013 ("dt-bindings: dwmac: Add bindings for new Loongson SoC and bridge chip") Reviewed-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Signed-off-by: Rob Herring robh@kernel.org Link: https://lore.kernel.org/r/20220404022857epcms1p6e6af1a6a86569f339e50c318abde... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/devicetree/bindings/net/snps,dwmac.yaml | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
--- a/Documentation/devicetree/bindings/net/snps,dwmac.yaml +++ b/Documentation/devicetree/bindings/net/snps,dwmac.yaml @@ -53,20 +53,18 @@ properties: - allwinner,sun8i-r40-gmac - allwinner,sun8i-v3s-emac - allwinner,sun50i-a64-emac - - loongson,ls2k-dwmac - - loongson,ls7a-dwmac - amlogic,meson6-dwmac - amlogic,meson8b-dwmac - amlogic,meson8m2-dwmac - amlogic,meson-gxbb-dwmac - amlogic,meson-axg-dwmac - - loongson,ls2k-dwmac - - loongson,ls7a-dwmac - ingenic,jz4775-mac - ingenic,x1000-mac - ingenic,x1600-mac - ingenic,x1830-mac - ingenic,x2000-mac + - loongson,ls2k-dwmac + - loongson,ls7a-dwmac - rockchip,px30-gmac - rockchip,rk3128-gmac - rockchip,rk3228-gmac
From: Anna-Maria Behnsen anna-maria@linutronix.de
commit c54bc0fc84214b203f7a0ebfd1bd308ce2abe920 upstream.
When the timer base is empty, base::next_expiry is set to base::clk + NEXT_TIMER_MAX_DELTA and base::next_expiry_recalc is false. When no timer is queued until jiffies reaches base::next_expiry value, the warning for not finding any expired timer and base::next_expiry_recalc is false in __run_timers() triggers.
To prevent triggering the warning in this valid scenario base::timers_pending needs to be added to the warning condition.
Fixes: 31cd0e119d50 ("timers: Recalculate next timer interrupt only when necessary") Reported-by: Johannes Berg johannes@sipsolutions.net Signed-off-by: Anna-Maria Behnsen anna-maria@linutronix.de Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Frederic Weisbecker frederic@kernel.org Link: https://lore.kernel.org/r/20220405191732.7438-3-anna-maria@linutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/time/timer.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)
--- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -1722,11 +1722,14 @@ static inline void __run_timers(struct t time_after_eq(jiffies, base->next_expiry)) { levels = collect_expired_timers(base, heads); /* - * The only possible reason for not finding any expired - * timer at this clk is that all matching timers have been - * dequeued. + * The two possible reasons for not finding any expired + * timer at this clk are that all matching timers have been + * dequeued or no timer has been queued since + * base::next_expiry was set to base::clk + + * NEXT_TIMER_MAX_DELTA. */ - WARN_ON_ONCE(!levels && !base->next_expiry_recalc); + WARN_ON_ONCE(!levels && !base->next_expiry_recalc + && base->timers_pending); base->clk++; base->next_expiry = __next_timer_interrupt(base);
From: Chao Gao chao.gao@intel.com
commit 9e02977bfad006af328add9434c8bffa40e053bb upstream.
When we looked into FIO performance with swiotlb enabled in VM, we found swiotlb_bounce() is always called one more time than expected for each DMA read request.
It turns out that the bounce buffer is copied to original DMA buffer twice after the completion of a DMA request (one is done by in dma_direct_sync_single_for_cpu(), the other by swiotlb_tbl_unmap_single()). But the content in bounce buffer actually doesn't change between the two rounds of copy. So, one round of copy is redundant.
Pass DMA_ATTR_SKIP_CPU_SYNC flag to swiotlb_tbl_unmap_single() to skip the memory copy in it.
This fix increases FIO 64KB sequential read throughput in a guest with swiotlb=force by 5.6%.
Fixes: 55897af63091 ("dma-direct: merge swiotlb_dma_ops into the dma_direct code") Reported-by: Wang Zhaoyang1 zhaoyang1.wang@intel.com Reported-by: Gao Liang liang.gao@intel.com Signed-off-by: Chao Gao chao.gao@intel.com Reviewed-by: Kevin Tian kevin.tian@intel.com Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/dma/direct.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/kernel/dma/direct.h +++ b/kernel/dma/direct.h @@ -114,6 +114,7 @@ static inline void dma_direct_unmap_page dma_direct_sync_single_for_cpu(dev, addr, size, dir);
if (unlikely(is_swiotlb_buffer(dev, phys))) - swiotlb_tbl_unmap_single(dev, phys, size, dir, attrs); + swiotlb_tbl_unmap_single(dev, phys, size, dir, + attrs | DMA_ATTR_SKIP_CPU_SYNC); } #endif /* _KERNEL_DMA_DIRECT_H */
From: Marco Elver elver@google.com
commit 2dfe63e61cc31ee59ce951672b0850b5229cd5b0 upstream.
Calling kmem_obj_info() via kmem_dump_obj() on KFENCE objects has been producing garbage data due to the object not actually being maintained by SLAB or SLUB.
Fix this by implementing __kfence_obj_info() that copies relevant information to struct kmem_obj_info when the object was allocated by KFENCE; this is called by a common kmem_obj_info(), which also calls the slab/slub/slob specific variant now called __kmem_obj_info().
For completeness, kmem_dump_obj() now displays if the object was allocated by KFENCE.
Link: https://lore.kernel.org/all/20220323090520.GG16885@xsang-OptiPlex-9020/ Link: https://lkml.kernel.org/r/20220406131558.3558585-1-elver@google.com Fixes: b89fb5ef0ce6 ("mm, kfence: insert KFENCE hooks for SLUB") Fixes: d3fb45f370d9 ("mm, kfence: insert KFENCE hooks for SLAB") Signed-off-by: Marco Elver elver@google.com Reviewed-by: Hyeonggon Yoo 42.hyeyoo@gmail.com Reported-by: kernel test robot oliver.sang@intel.com Acked-by: Vlastimil Babka vbabka@suse.cz [slab] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/kfence.h | 24 ++++++++++++++++++++++++ mm/kfence/core.c | 21 --------------------- mm/kfence/kfence.h | 21 +++++++++++++++++++++ mm/kfence/report.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++ mm/slab.c | 2 +- mm/slab.h | 2 +- mm/slab_common.c | 9 +++++++++ mm/slob.c | 2 +- mm/slub.c | 2 +- 9 files changed, 105 insertions(+), 25 deletions(-)
--- a/include/linux/kfence.h +++ b/include/linux/kfence.h @@ -204,6 +204,22 @@ static __always_inline __must_check bool */ bool __must_check kfence_handle_page_fault(unsigned long addr, bool is_write, struct pt_regs *regs);
+#ifdef CONFIG_PRINTK +struct kmem_obj_info; +/** + * __kfence_obj_info() - fill kmem_obj_info struct + * @kpp: kmem_obj_info to be filled + * @object: the object + * + * Return: + * * false - not a KFENCE object + * * true - a KFENCE object, filled @kpp + * + * Copies information to @kpp for KFENCE objects. + */ +bool __kfence_obj_info(struct kmem_obj_info *kpp, void *object, struct slab *slab); +#endif + #else /* CONFIG_KFENCE */
static inline bool is_kfence_address(const void *addr) { return false; } @@ -221,6 +237,14 @@ static inline bool __must_check kfence_h return false; }
+#ifdef CONFIG_PRINTK +struct kmem_obj_info; +static inline bool __kfence_obj_info(struct kmem_obj_info *kpp, void *object, struct slab *slab) +{ + return false; +} +#endif + #endif
#endif /* _LINUX_KFENCE_H */ --- a/mm/kfence/core.c +++ b/mm/kfence/core.c @@ -222,27 +222,6 @@ static bool kfence_unprotect(unsigned lo return !KFENCE_WARN_ON(!kfence_protect_page(ALIGN_DOWN(addr, PAGE_SIZE), false)); }
-static inline struct kfence_metadata *addr_to_metadata(unsigned long addr) -{ - long index; - - /* The checks do not affect performance; only called from slow-paths. */ - - if (!is_kfence_address((void *)addr)) - return NULL; - - /* - * May be an invalid index if called with an address at the edge of - * __kfence_pool, in which case we would report an "invalid access" - * error. - */ - index = (addr - (unsigned long)__kfence_pool) / (PAGE_SIZE * 2) - 1; - if (index < 0 || index >= CONFIG_KFENCE_NUM_OBJECTS) - return NULL; - - return &kfence_metadata[index]; -} - static inline unsigned long metadata_to_pageaddr(const struct kfence_metadata *meta) { unsigned long offset = (meta - kfence_metadata + 1) * PAGE_SIZE * 2; --- a/mm/kfence/kfence.h +++ b/mm/kfence/kfence.h @@ -96,6 +96,27 @@ struct kfence_metadata {
extern struct kfence_metadata kfence_metadata[CONFIG_KFENCE_NUM_OBJECTS];
+static inline struct kfence_metadata *addr_to_metadata(unsigned long addr) +{ + long index; + + /* The checks do not affect performance; only called from slow-paths. */ + + if (!is_kfence_address((void *)addr)) + return NULL; + + /* + * May be an invalid index if called with an address at the edge of + * __kfence_pool, in which case we would report an "invalid access" + * error. + */ + index = (addr - (unsigned long)__kfence_pool) / (PAGE_SIZE * 2) - 1; + if (index < 0 || index >= CONFIG_KFENCE_NUM_OBJECTS) + return NULL; + + return &kfence_metadata[index]; +} + /* KFENCE error types for report generation. */ enum kfence_error_type { KFENCE_ERROR_OOB, /* Detected a out-of-bounds access. */ --- a/mm/kfence/report.c +++ b/mm/kfence/report.c @@ -273,3 +273,50 @@ void kfence_report_error(unsigned long a /* We encountered a memory safety error, taint the kernel! */ add_taint(TAINT_BAD_PAGE, LOCKDEP_STILL_OK); } + +#ifdef CONFIG_PRINTK +static void kfence_to_kp_stack(const struct kfence_track *track, void **kp_stack) +{ + int i, j; + + i = get_stack_skipnr(track->stack_entries, track->num_stack_entries, NULL); + for (j = 0; i < track->num_stack_entries && j < KS_ADDRS_COUNT; ++i, ++j) + kp_stack[j] = (void *)track->stack_entries[i]; + if (j < KS_ADDRS_COUNT) + kp_stack[j] = NULL; +} + +bool __kfence_obj_info(struct kmem_obj_info *kpp, void *object, struct slab *slab) +{ + struct kfence_metadata *meta = addr_to_metadata((unsigned long)object); + unsigned long flags; + + if (!meta) + return false; + + /* + * If state is UNUSED at least show the pointer requested; the rest + * would be garbage data. + */ + kpp->kp_ptr = object; + + /* Requesting info an a never-used object is almost certainly a bug. */ + if (WARN_ON(meta->state == KFENCE_OBJECT_UNUSED)) + return true; + + raw_spin_lock_irqsave(&meta->lock, flags); + + kpp->kp_slab = slab; + kpp->kp_slab_cache = meta->cache; + kpp->kp_objp = (void *)meta->addr; + kfence_to_kp_stack(&meta->alloc_track, kpp->kp_stack); + if (meta->state == KFENCE_OBJECT_FREED) + kfence_to_kp_stack(&meta->free_track, kpp->kp_free_stack); + /* get_stack_skipnr() ensures the first entry is outside allocator. */ + kpp->kp_ret = kpp->kp_stack[0]; + + raw_spin_unlock_irqrestore(&meta->lock, flags); + + return true; +} +#endif --- a/mm/slab.c +++ b/mm/slab.c @@ -3650,7 +3650,7 @@ EXPORT_SYMBOL(__kmalloc_node_track_calle #endif /* CONFIG_NUMA */
#ifdef CONFIG_PRINTK -void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct slab *slab) +void __kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct slab *slab) { struct kmem_cache *cachep; unsigned int objnr; --- a/mm/slab.h +++ b/mm/slab.h @@ -851,7 +851,7 @@ struct kmem_obj_info { void *kp_stack[KS_ADDRS_COUNT]; void *kp_free_stack[KS_ADDRS_COUNT]; }; -void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct slab *slab); +void __kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct slab *slab); #endif
#ifdef CONFIG_HAVE_HARDENED_USERCOPY_ALLOCATOR --- a/mm/slab_common.c +++ b/mm/slab_common.c @@ -555,6 +555,13 @@ bool kmem_valid_obj(void *object) } EXPORT_SYMBOL_GPL(kmem_valid_obj);
+static void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct slab *slab) +{ + if (__kfence_obj_info(kpp, object, slab)) + return; + __kmem_obj_info(kpp, object, slab); +} + /** * kmem_dump_obj - Print available slab provenance information * @object: slab object for which to find provenance information. @@ -590,6 +597,8 @@ void kmem_dump_obj(void *object) pr_cont(" slab%s %s", cp, kp.kp_slab_cache->name); else pr_cont(" slab%s", cp); + if (is_kfence_address(object)) + pr_cont(" (kfence)"); if (kp.kp_objp) pr_cont(" start %px", kp.kp_objp); if (kp.kp_data_offset) --- a/mm/slob.c +++ b/mm/slob.c @@ -463,7 +463,7 @@ out: }
#ifdef CONFIG_PRINTK -void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct slab *slab) +void __kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct slab *slab) { kpp->kp_ptr = object; kpp->kp_slab = slab; --- a/mm/slub.c +++ b/mm/slub.c @@ -4322,7 +4322,7 @@ int __kmem_cache_shutdown(struct kmem_ca }
#ifdef CONFIG_PRINTK -void kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct slab *slab) +void __kmem_obj_info(struct kmem_obj_info *kpp, void *object, struct slab *slab) { void *base; int __maybe_unused i;
From: Matt Roper matthew.d.roper@intel.com
commit 1acb34e7dd7720a1fff00cbd4d000ec3219dc9d6 upstream.
The intent of the version check in the mmap ioctl was to maintain support for existing platforms (i.e., ADL/RPL and earlier), but drop support on all future igpu platforms. As we've seen on the dgpu side, the hardware teams are using a more fine-grained numbering system for IP version numbers these days, so it's possible the version number associated with our next igpu could be some form of "12.xx" rather than 13 or higher. Comparing against the full ver.release number will ensure the intent of the check is maintained no matter what numbering the hardware teams settle on.
Fixes: d3f3baa3562a ("drm/i915: Reinstate the mmap ioctl for some platforms") Cc: Thomas Hellström thomas.hellstrom@linux.intel.com Cc: Lucas De Marchi lucas.demarchi@intel.com Signed-off-by: Matt Roper matthew.d.roper@intel.com Reviewed-by: Lucas De Marchi lucas.demarchi@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20220407161839.1073443-1-matth... (cherry picked from commit 8e7e5c077cd57ee9a36d58c65f07257dc49a88d5) Signed-off-by: Joonas Lahtinen joonas.lahtinen@linux.intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/gem/i915_gem_mman.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/i915/gem/i915_gem_mman.c +++ b/drivers/gpu/drm/i915/gem/i915_gem_mman.c @@ -67,7 +67,7 @@ i915_gem_mmap_ioctl(struct drm_device *d * mmap ioctl is disallowed for all discrete platforms, * and for all platforms with GRAPHICS_VER > 12. */ - if (IS_DGFX(i915) || GRAPHICS_VER(i915) > 12) + if (IS_DGFX(i915) || GRAPHICS_VER_FULL(i915) > IP_VER(12, 0)) return -EOPNOTSUPP;
if (args->flags & ~(I915_MMAP_WC))
From: Steven Price steven.price@arm.com
commit b7ba6d8dc3569e49800ef0136799f26f43e237e8 upstream.
Currently the setting of the 'cpu' member of struct cpuhp_cpu_state in cpuhp_create() is too late as it is used earlier in _cpu_up().
If kzalloc_node() in __smpboot_create_thread() fails then the rollback will be done with st->cpu==0 causing CPU0 to be erroneously set to be dying, causing the scheduler to get mightily confused and throw its toys out of the pram.
However the cpu number is actually available directly, so simply remove the 'cpu' member and avoid the problem in the first place.
Fixes: 2ea46c6fc945 ("cpumask/hotplug: Fix cpu_dying() state tracking") Signed-off-by: Steven Price steven.price@arm.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Link: https://lore.kernel.org/r/20220411152233.474129-2-steven.price@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/cpu.c | 36 ++++++++++++++++++------------------ 1 file changed, 18 insertions(+), 18 deletions(-)
--- a/kernel/cpu.c +++ b/kernel/cpu.c @@ -70,7 +70,6 @@ struct cpuhp_cpu_state { bool rollback; bool single; bool bringup; - int cpu; struct hlist_node *node; struct hlist_node *last; enum cpuhp_state cb_state; @@ -474,7 +473,7 @@ static inline bool cpu_smt_allowed(unsig #endif
static inline enum cpuhp_state -cpuhp_set_state(struct cpuhp_cpu_state *st, enum cpuhp_state target) +cpuhp_set_state(int cpu, struct cpuhp_cpu_state *st, enum cpuhp_state target) { enum cpuhp_state prev_state = st->state; bool bringup = st->state < target; @@ -485,14 +484,15 @@ cpuhp_set_state(struct cpuhp_cpu_state * st->target = target; st->single = false; st->bringup = bringup; - if (cpu_dying(st->cpu) != !bringup) - set_cpu_dying(st->cpu, !bringup); + if (cpu_dying(cpu) != !bringup) + set_cpu_dying(cpu, !bringup);
return prev_state; }
static inline void -cpuhp_reset_state(struct cpuhp_cpu_state *st, enum cpuhp_state prev_state) +cpuhp_reset_state(int cpu, struct cpuhp_cpu_state *st, + enum cpuhp_state prev_state) { bool bringup = !st->bringup;
@@ -519,8 +519,8 @@ cpuhp_reset_state(struct cpuhp_cpu_state }
st->bringup = bringup; - if (cpu_dying(st->cpu) != !bringup) - set_cpu_dying(st->cpu, !bringup); + if (cpu_dying(cpu) != !bringup) + set_cpu_dying(cpu, !bringup); }
/* Regular hotplug invocation of the AP hotplug thread */ @@ -540,15 +540,16 @@ static void __cpuhp_kick_ap(struct cpuhp wait_for_ap_thread(st, st->bringup); }
-static int cpuhp_kick_ap(struct cpuhp_cpu_state *st, enum cpuhp_state target) +static int cpuhp_kick_ap(int cpu, struct cpuhp_cpu_state *st, + enum cpuhp_state target) { enum cpuhp_state prev_state; int ret;
- prev_state = cpuhp_set_state(st, target); + prev_state = cpuhp_set_state(cpu, st, target); __cpuhp_kick_ap(st); if ((ret = st->result)) { - cpuhp_reset_state(st, prev_state); + cpuhp_reset_state(cpu, st, prev_state); __cpuhp_kick_ap(st); }
@@ -580,7 +581,7 @@ static int bringup_wait_for_ap(unsigned if (st->target <= CPUHP_AP_ONLINE_IDLE) return 0;
- return cpuhp_kick_ap(st, st->target); + return cpuhp_kick_ap(cpu, st, st->target); }
static int bringup_cpu(unsigned int cpu) @@ -703,7 +704,7 @@ static int cpuhp_up_callbacks(unsigned i ret, cpu, cpuhp_get_step(st->state)->name, st->state);
- cpuhp_reset_state(st, prev_state); + cpuhp_reset_state(cpu, st, prev_state); if (can_rollback_cpu(st)) WARN_ON(cpuhp_invoke_callback_range(false, cpu, st, prev_state)); @@ -720,7 +721,6 @@ static void cpuhp_create(unsigned int cp
init_completion(&st->done_up); init_completion(&st->done_down); - st->cpu = cpu; }
static int cpuhp_should_run(unsigned int cpu) @@ -874,7 +874,7 @@ static int cpuhp_kick_ap_work(unsigned i cpuhp_lock_release(true);
trace_cpuhp_enter(cpu, st->target, prev_state, cpuhp_kick_ap_work); - ret = cpuhp_kick_ap(st, st->target); + ret = cpuhp_kick_ap(cpu, st, st->target); trace_cpuhp_exit(cpu, st->state, prev_state, ret);
return ret; @@ -1106,7 +1106,7 @@ static int cpuhp_down_callbacks(unsigned ret, cpu, cpuhp_get_step(st->state)->name, st->state);
- cpuhp_reset_state(st, prev_state); + cpuhp_reset_state(cpu, st, prev_state);
if (st->state < prev_state) WARN_ON(cpuhp_invoke_callback_range(true, cpu, st, @@ -1133,7 +1133,7 @@ static int __ref _cpu_down(unsigned int
cpuhp_tasks_frozen = tasks_frozen;
- prev_state = cpuhp_set_state(st, target); + prev_state = cpuhp_set_state(cpu, st, target); /* * If the current CPU state is in the range of the AP hotplug thread, * then we need to kick the thread. @@ -1164,7 +1164,7 @@ static int __ref _cpu_down(unsigned int ret = cpuhp_down_callbacks(cpu, st, target); if (ret && st->state < prev_state) { if (st->state == CPUHP_TEARDOWN_CPU) { - cpuhp_reset_state(st, prev_state); + cpuhp_reset_state(cpu, st, prev_state); __cpuhp_kick_ap(st); } else { WARN(1, "DEAD callback error for CPU%d", cpu); @@ -1351,7 +1351,7 @@ static int _cpu_up(unsigned int cpu, int
cpuhp_tasks_frozen = tasks_frozen;
- cpuhp_set_state(st, target); + cpuhp_set_state(cpu, st, target); /* * If the current CPU state is in the range of the AP hotplug thread, * then we need to kick the thread once more.
From: Duoming Zhou duoming@zju.edu.cn
commit 82e31755e55fbcea6a9dfaae5fe4860ade17cbc0 upstream.
There are race conditions that may lead to UAF bugs in ax25_heartbeat_expiry(), ax25_t1timer_expiry(), ax25_t2timer_expiry(), ax25_t3timer_expiry() and ax25_idletimer_expiry(), when we call ax25_release() to deallocate ax25_dev.
One of the UAF bugs caused by ax25_release() is shown below:
(Thread 1) | (Thread 2) ax25_dev_device_up() //(1) | ... | ax25_kill_by_device() ax25_bind() //(2) | ax25_connect() | ... ax25_std_establish_data_link() | ax25_start_t1timer() | ax25_dev_device_down() //(3) mod_timer(&ax25->t1timer,..) | | ax25_release() (wait a time) | ... | ax25_dev_put(ax25_dev) //(4)FREE ax25_t1timer_expiry() | ax25->ax25_dev->values[..] //USE| ... ... |
We increase the refcount of ax25_dev in position (1) and (2), and decrease the refcount of ax25_dev in position (3) and (4). The ax25_dev will be freed in position (4) and be used in ax25_t1timer_expiry().
The fail log is shown below: ==============================================================
[ 106.116942] BUG: KASAN: use-after-free in ax25_t1timer_expiry+0x1c/0x60 [ 106.116942] Read of size 8 at addr ffff88800bda9028 by task swapper/0/0 [ 106.116942] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.17.0-06123-g0905eec574 [ 106.116942] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-14 [ 106.116942] Call Trace: ... [ 106.116942] ax25_t1timer_expiry+0x1c/0x60 [ 106.116942] call_timer_fn+0x122/0x3d0 [ 106.116942] __run_timers.part.0+0x3f6/0x520 [ 106.116942] run_timer_softirq+0x4f/0xb0 [ 106.116942] __do_softirq+0x1c2/0x651 ...
This patch adds del_timer_sync() in ax25_release(), which could ensure that all timers stop before we deallocate ax25_dev.
Signed-off-by: Duoming Zhou duoming@zju.edu.cn Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Ovidiu Panait ovidiu.panait@windriver.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ax25/af_ax25.c | 5 +++++ 1 file changed, 5 insertions(+)
--- a/net/ax25/af_ax25.c +++ b/net/ax25/af_ax25.c @@ -1053,6 +1053,11 @@ static int ax25_release(struct socket *s ax25_destroy_socket(ax25); } if (ax25_dev) { + del_timer_sync(&ax25->timer); + del_timer_sync(&ax25->t1timer); + del_timer_sync(&ax25->t2timer); + del_timer_sync(&ax25->t3timer); + del_timer_sync(&ax25->idletimer); dev_put_track(ax25_dev->dev, &ax25_dev->dev_tracker); ax25_dev_put(ax25_dev); }
On Mon, Apr 18, 2022 at 02:09:29PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.17.4 release. There are 219 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 20 Apr 2022 12:11:14 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.17.4-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.17.y and the diffstat can be found below.
thanks,
greg k-h
Tested rc1 against the Fedora build system (aarch64, armv7, ppc64le, s390x, x86_64), and boot tested x86_64. No regressions noted.
Tested-by: Justin M. Forbes jforbes@fedoraproject.org
On 4/18/2022 5:09 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.17.4 release. There are 219 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 20 Apr 2022 12:11:14 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.17.4-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.17.y and the diffstat can be found below.
thanks,
greg k-h
On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels:
Tested-by: Florian Fainelli f.fainelli@gmail.com
On Mon, Apr 18, 2022 at 02:09:29PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.17.4 release. There are 219 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 20 Apr 2022 12:11:14 +0000. Anything received after that time might be too late.
Build results: total: 155 pass: 155 fail: 0 Qemu test results: total: 489 pass: 489 fail: 0
Tested-by: Guenter Roeck linux@roeck-us.net
Guenter
On 4/18/22 6:09 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.17.4 release. There are 219 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 20 Apr 2022 12:11:14 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.17.4-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.17.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
On Mon, Apr 18, 2022 at 02:09:29PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.17.4 release. There are 219 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 20 Apr 2022 12:11:14 +0000. Anything received after that time might be too late.
Hi Greg,
5.17.4-rc1 tested.
Run tested on: - Allwinner H6 (Tanix TX6) - Intel Tiger Lake x86_64 (nuc11 i7-1165G7)
In addition - build tested for: - Allwinner A64 - Allwinner H3 - Allwinner H5 - NXP iMX6 - NXP iMX8 - Qualcomm Dragonboard - Rockchip RK3288 - Rockchip RK3328 - Rockchip RK3399pro - Samsung Exynos
Tested-by: Rudi Heitbaum rudi@heitbaum.com -- Rudi
On Mon, Apr 18, 2022 at 8:25 AM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.17.4 release. There are 219 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 20 Apr 2022 12:11:14 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.17.4-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.17.y and the diffstat can be found below.
thanks,
greg k-h
Hi Greg,
Compiled and booted on my test system Lenovo P50s: Intel Core i7 No emergency and critical messages in the dmesg
./perf bench sched all # Running sched/messaging benchmark... # 20 sender and receiver processes per group # 10 groups == 400 processes run
Total time: 0.435 [sec]
# Running sched/pipe benchmark... # Executed 1000000 pipe operations between two processes
Total time: 8.614 [sec]
8.614146 usecs/op 116088 ops/sec
Tested-by: Zan Aziz zanaziz313@gmail.com
Thanks -Zan
On Mon, 18 Apr 2022 at 17:45, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 5.17.4 release. There are 219 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 20 Apr 2022 12:11:14 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.17.4-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.17.y and the diffstat can be found below.
thanks,
greg k-h
Results from Linaro’s test farm. No regressions on arm64, arm, x86_64, and i386.
Tested-by: Linux Kernel Functional Testing lkft@linaro.org
## Build * kernel: 5.17.4-rc1 * git: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc * git branch: linux-5.17.y * git commit: 12e5bd3d067695aa29a9780ccff9fa9d1e761337 * git describe: v5.17.3-223-g12e5bd3d0676 * test details: https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-5.17.y/build/v5.17....
## Test Regressions (compared to v5.17.3-8-g8a239d5c59a8) No test regressions found.
## Metric Regressions (compared to v5.17.3-8-g8a239d5c59a8) No metric regressions found.
## Test Fixes (compared to v5.17.3-8-g8a239d5c59a8) No metric fixes found.
## Metric Fixes (compared to v5.17.3-8-g8a239d5c59a8) No metric fixes found.
## Test result summary total: 90780, pass: 78715, fail: 220, skip: 11125, xfail: 720
## Build Summary * arc: 10 total, 10 passed, 0 failed * arm: 296 total, 293 passed, 3 failed * arm64: 47 total, 47 passed, 0 failed * i386: 44 total, 40 passed, 4 failed * mips: 41 total, 38 passed, 3 failed * parisc: 14 total, 14 passed, 0 failed * powerpc: 65 total, 56 passed, 9 failed * riscv: 32 total, 27 passed, 5 failed * s390: 26 total, 23 passed, 3 failed * sh: 26 total, 24 passed, 2 failed * sparc: 14 total, 14 passed, 0 failed * x86_64: 47 total, 47 passed, 0 failed
## Test suites summary * fwts * igt-gpu-tools * kunit * kvm-unit-tests * libgpiod * libhugetlbfs * linux-log-parser * ltp-cap_bounds-tests * ltp-commands-tests * ltp-containers-tests * ltp-controllers-tests * ltp-cpuhotplug-tests * ltp-crypto-tests * ltp-cve-tests * ltp-dio-tests * ltp-fcntl-locktests-te[ * ltp-fcntl-locktests-tests * ltp-filecaps-tests * ltp-fs-tests * ltp-fs_bind-tests * ltp-fs_perms_simple-tests * ltp-fsx-tests * ltp-hugetlb-tests * ltp-io-tests * ltp-ipc-tests * ltp-math-tests * ltp-mm-tests * ltp-nptl-tests * ltp-open-posix-tests * ltp-pty-tests * ltp-sched-tests * ltp-securebits-tests * ltp-syscalls-tests * ltp-tracing-tests * network-basic-tests * packetdrill * rcutorture * ssuite * v4l2-compliance * vdso
-- Linaro LKFT https://lkft.linaro.org
On 4/18/22 5:09 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.17.4 release. There are 219 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 20 Apr 2022 12:11:14 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.17.4-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.17.y and the diffstat can be found below.
thanks,
greg k-h
Built and booted successfully on RISC-V RV64 (HiFive Unmatched).
Tested-by: Ron Economos re@w6rz.net
On Mon, 18 Apr 2022 14:09:29 +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.17.4 release. There are 219 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 20 Apr 2022 12:11:14 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.17.4-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.17.y and the diffstat can be found below.
thanks,
greg k-h
All tests passing for Tegra ...
Test results for stable-v5.17: 10 builds: 10 pass, 0 fail 28 boots: 28 pass, 0 fail 130 tests: 130 pass, 0 fail
Linux version: 5.17.4-rc1-g12e5bd3d0676 Boards tested: tegra124-jetson-tk1, tegra186-p2771-0000, tegra194-p2972-0000, tegra194-p3509-0000+p3668-0000, tegra20-ventana, tegra210-p2371-2180, tegra210-p3450-0000, tegra30-cardhu-a04
Tested-by: Jon Hunter jonathanh@nvidia.com
Jon
On 18. 04. 22, 14:09, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.17.4 release. There are 219 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 20 Apr 2022 12:11:14 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.17.4-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.17.y and the diffstat can be found below.
openSUSE configs¹⁾ all green.
Tested-by: Jiri Slaby jirislaby@kernel.org
¹⁾ armv6hl armv7hl arm64 i386 ppc64 ppc64le riscv64 s390x x86_64
linux-stable-mirror@lists.linaro.org