This is the start of the stable review cycle for the 5.13.6 release. There are 223 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 28 Jul 2021 15:38:12 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.13.6-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.13.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 5.13.6-rc1
Íñigo Huguet ihuguet@redhat.com sfc: ensure correct number of XDP queues
Yoshitaka Ikeda ikeda@nskint.co.jp spi: spi-cadence-quadspi: Fix division by zero warning - try2
Colin Xu colin.xu@intel.com drm/i915/gvt: Clear d3_entered on elsp cmd submission.
Riccardo Mancini rickyman7@gmail.com perf inject: Close inject.output on exit
Mark Rutland mark.rutland@arm.com arm64: entry: fix KCOV suppression
Robert Richter rrichter@amd.com Documentation: Fix intiramfs script name
Stefan Wahren stefan.wahren@i2se.com ARM: multi_v7_defconfig: Make NOP_USB_XCEIV driver built-in
Paul Blakey paulb@nvidia.com skbuff: Release nfct refcount on napi stolen or re-used skbs
Matthieu Baerts matthieu.baerts@tessares.net mptcp: fix 'masking a bool' warning
Mahesh Bandewar maheshb@google.com bonding: fix build issue
Yoshitaka Ikeda ikeda@nskint.co.jp spi: spi-cadence-quadspi: Revert "Fix division by zero warning"
Likun Gao Likun.Gao@amd.com drm/amdgpu: update golden setting for sienna_cichlid
Xiaojian Du Xiaojian.Du@amd.com drm/amdgpu: update the golden setting for vangogh
Tao Zhou tao.zhou1@amd.com drm/amdgpu: update gc golden setting for dimgrey_cavefish
Charles Baylis cb-kernel@fishzet.co.uk drm: Return -ENOTTY for non-drm ioctls
Jason Ekstrand jason@jlekstrand.net Revert "drm/i915: Propagate errors on awaiting already signaled fences"
Adrian Hunter adrian.hunter@intel.com driver core: Prevent warning when removing a device link from unregistered consumer
Greg Kroah-Hartman gregkh@linuxfoundation.org nds32: fix up stack guard gap
Jérôme Glisse jglisse@redhat.com misc: eeprom: at24: Always append device id even if label property is set.
Ilya Dryomov idryomov@gmail.com rbd: always kick acquire on "acquired" and "released" notifications
Ilya Dryomov idryomov@gmail.com rbd: don't hold lock_rwsem while running_list is being drained
Mike Kravetz mike.kravetz@oracle.com hugetlbfs: fix mount mode command line processing
Qi Zheng zhengqi.arch@bytedance.com mm: fix the deadlock in finish_fault()
Mike Rapoport rppt@kernel.org memblock: make for_each_mem_range() traverse MEMBLOCK_HOTPLUG regions
Sergei Trofimovich slyfox@gentoo.org mm: page_alloc: fix page_poison=1 / INIT_ON_ALLOC_DEFAULT_ON interaction
Christoph Hellwig hch@lst.de mm: call flush_dcache_page() in memcpy_to_page() and memzero_page()
Alexander Potapenko glider@google.com kfence: skip all GFP_ZONEMASK allocations
Alexander Potapenko glider@google.com kfence: move the size check to the beginning of __kfence_alloc()
Peter Collingbourne pcc@google.com userfaultfd: do not untag user pointers
Jens Axboe axboe@kernel.dk io_uring: fix early fdput() of file
Pavel Begunkov asml.silence@gmail.com io_uring: remove double poll entry on arm failure
Pavel Begunkov asml.silence@gmail.com io_uring: explicitly count entries for poll reqs
Peter Collingbourne pcc@google.com selftest: use mmap instead of posix_memalign to allocate memory
Frederic Weisbecker frederic@kernel.org posix-cpu-timers: Fix rearm racing against process tick
Loic Poulain loic.poulain@linaro.org bus: mhi: pci_generic: Fix inbound IPCR channel
Bhaumik Bhatt bbhatt@codeaurora.org bus: mhi: core: Validate channel ID when processing command completions
Bhaumik Bhatt bbhatt@codeaurora.org bus: mhi: pci_generic: Apply no-op for wake using sideband wake boolean
Peter Ujfalusi peter.ujfalusi@linux.intel.com driver core: auxiliary bus: Fix memory leak when driver_register() fail
Markus Boehme markubo@amazon.com ixgbe: Fix packet corruption due to missing DMA sync
Gustavo A. R. Silva gustavoars@kernel.org media: ngene: Fix out-of-bounds bug in ngene_command_config_free_buf()
Filipe Manana fdmanana@suse.com btrfs: fix lock inversion problem when doing qgroup extent tracing
Filipe Manana fdmanana@suse.com btrfs: fix unpersisted i_size on fsync after expanding truncate
Anand Jain anand.jain@oracle.com btrfs: check for missing device in btrfs_trim_fs
Steven Rostedt (VMware) rostedt@goodmis.org tracing: Synthetic event field_pos is an index not a boolean
Haoran Luo www@aegistudio.net tracing: Fix bug in rb_per_cpu_empty() that might cause deadloop.
Steven Rostedt (VMware) rostedt@goodmis.org tracing/histogram: Rename "cpu" to "common_cpu"
Steven Rostedt (VMware) rostedt@goodmis.org tracepoints: Update static_call before tp_funcs when adding a tracepoint
Marc Zyngier maz@kernel.org firmware/efi: Tell memblock about EFI iomem reservations
Amelie Delaunay amelie.delaunay@foss.st.com usb: typec: stusb160x: Don't block probing of consumer of "connector" nodes
Amelie Delaunay amelie.delaunay@foss.st.com usb: typec: stusb160x: register role switch before interrupt registration
Martin Kepplinger martink@posteo.de usb: typec: tipd: Don't block probing of consumer of "connector" nodes
Minas Harutyunyan Minas.Harutyunyan@synopsys.com usb: dwc2: gadget: Fix sending zero length packet in DDMA mode.
Minas Harutyunyan Minas.Harutyunyan@synopsys.com usb: dwc2: gadget: Fix GOUTNAK flow for Slave mode.
Marek Szyprowski m.szyprowski@samsung.com usb: dwc2: Skip clock gating on Samsung SoCs
Zhang Qilong zhangqilong3@huawei.com usb: gadget: Fix Unbalanced pm_runtime_enable in tegra_xudc_probe
John Keeping john@metanate.com USB: serial: cp210x: add ID for CEL EM3588 USB ZigBee stick
Ian Ray ian.ray@ge.com USB: serial: cp210x: fix comments for GE CS1000
Marco De Marco marco.demarco@posteo.net USB: serial: option: add support for u-blox LARA-R6 family
Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com usb: renesas_usbhs: Fix superfluous irqs happen after usb_pkt_pop()
Mark Tomlinson mark.tomlinson@alliedtelesis.co.nz usb: max-3421: Prevent corruption of freed memory
Julian Sikorski belegdol@gmail.com USB: usb-storage: Add LaCie Rugged USB3-FW to IGNORE_UAS
Mathias Nyman mathias.nyman@linux.intel.com usb: hub: Fix link power management max exit latency (MEL) calculations
Mathias Nyman mathias.nyman@linux.intel.com usb: hub: Disable USB 3 device initiated lpm if exit latency is too high
Nicholas Piggin npiggin@gmail.com KVM: PPC: Book3S HV Nested: Sanitise H_ENTER_NESTED TM state
Nicholas Piggin npiggin@gmail.com KVM: PPC: Book3S: Fix H_RTAS rets buffer overflow
David Jeffery djeffery@redhat.com usb: ehci: Prevent missed ehci interrupts with edge-triggered MSI
Mathias Nyman mathias.nyman@linux.intel.com xhci: Fix lost USB 2 remote wake
Greg Thelen gthelen@google.com usb: xhci: avoid renesas_usb_fw.mem when it's unusable
Moritz Fischer mdf@kernel.org Revert "usb: renesas-xhci: Fix handling of unknown ROM state"
Takashi Iwai tiwai@suse.de ALSA: pcm: Fix mmap capability check
Alan Young consult.awy@gmail.com ALSA: pcm: Call substream ack() method upon compat mmap commit
Takashi Iwai tiwai@suse.de ALSA: hdmi: Expose all pins on MSI MS-7C94 board
Hui Wang hui.wang@canonical.com ALSA: hda/realtek: Fix pop noise and 2 Front Mic issues on a machine
Takashi Iwai tiwai@suse.de ALSA: sb: Fix potential ABBA deadlock in CSP driver
Alexander Tsoy alexander@tsoy.me ALSA: usb-audio: Add registration quirk for JBL Quantum headsets
Takashi Iwai tiwai@suse.de ALSA: usb-audio: Add missing proc text entry for BESPOKEN type
Alexander Egorenkov egorenar@linux.ibm.com s390/boot: fix use of expolines in the DMA code
Vasily Gorbik gor@linux.ibm.com s390/ftrace: fix ftrace_update_ftrace_func implementation
Stephen Boyd swboyd@chromium.org mmc: core: Don't allocate IDA for OF aliases
Olivier Langlois olivier@trillion01.com io_uring: Fix race condition when sqp thread goes to sleep
Linus Torvalds torvalds@linux-foundation.org ACPI: fix NULL pointer dereference
Marcelo Henrique Cerri marcelo.cerri@canonical.com proc: Avoid mixing integer types in mem_rw()
Ronnie Sahlberg lsahlber@redhat.com cifs: fix fallocate when trying to allocate a hole.
Ronnie Sahlberg lsahlber@redhat.com cifs: only write 64kb at a time when fallocating a small region of a file
Ioana Ciornei ioana.ciornei@nxp.com dpaa2-switch: seed the buffer pool after allocating the swp
Maxime Ripard maxime@cerno.tech drm/panel: raspberrypi-touchscreen: Prevent double-free
Yajun Deng yajun.deng@linux.dev net: sched: cls_api: Fix the the wrong parameter
Heinrich Schuchardt xypron.glpk@gmx.de RISC-V: load initrd wherever it fits into memory
Vladimir Oltean vladimir.oltean@nxp.com net: dsa: sja1105: make VID 4095 a bridge VLAN too
Wei Wang weiwan@google.com tcp: disable TFO blackhole logic by default
Bin Meng bmeng.cn@gmail.com riscv: Fix 32-bit RISC-V boot failure
Sukadev Bhattiprolu sukadev@linux.ibm.com ibmvnic: Remove the proper scrq flush
Vadim Fedorenko vfedorenko@novek.ru udp: check encap socket in __udp_lib_err
Xin Long lucien.xin@gmail.com sctp: update active_key for asoc when old key is being replaced
Christoph Hellwig hch@lst.de nvme: set the PRACT bit when using Write Zeroes with T10 PI
Sayanta Pattanayak sayanta.pattanayak@arm.com r8169: Avoid duplicate sysfs entry creation error
David Howells dhowells@redhat.com afs: Fix setting of writeback_index
Tom Rix trix@redhat.com afs: check function return
David Howells dhowells@redhat.com afs: Fix tracepoint string placement with built-in AFS
Vincent Palatin vpalatin@chromium.org Revert "USB: quirks: ignore remote wake-up on Fibocom L850-GL LTE modem"
Zhihao Cheng chengzhihao1@huawei.com nvme-pci: don't WARN_ON in nvme_reset_work if ctrl.state is not RESETTING
Jason Ekstrand jason@jlekstrand.net drm/ttm: Force re-init if ttm_global_init() fails
David Disseldorp ddiss@suse.de scsi: target: Fix NULL dereference on XCOPY completion
Chris Packham chris.packham@alliedtelesis.co.nz i2c: mpc: Poll for MCF
Luis Henriques lhenriques@suse.de ceph: don't WARN if we're still opening a session to an MDS
Paolo Abeni pabeni@redhat.com ipv6: fix another slab-out-of-bounds in fib6_nh_flush_exceptions
Peilin Ye peilin.ye@bytedance.com net/sched: act_skbmod: Skip non-Ethernet packets
Yang Yingliang yangyingliang@huawei.com io_uring: fix memleak in io_init_wq_offload()
Alexandru Tachici alexandru.tachici@analog.com spi: spi-bcm2835: Fix deadlock
Jian Shen shenjian15@huawei.com net: hns3: fix rx VLAN offload state inconsistent issue
Chengwen Feng fengchengwen@huawei.com net: hns3: fix possible mismatches resp of mailbox
Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com ALSA: hda: intel-dsp-cfg: add missing ElkhartLake PCI ID
Eric Dumazet edumazet@google.com net/tcp_fastopen: fix data races around tfo_active_disable_stamp
Randy Dunlap rdunlap@infradead.org net: hisilicon: rename CACHE_LINE_MASK to avoid redefinition
Somnath Kotur somnath.kotur@broadcom.com bnxt_en: Check abort error state in bnxt_half_open_nic()
Michael Chan michael.chan@broadcom.com bnxt_en: Validate vlan protocol ID on RX packets
Somnath Kotur somnath.kotur@broadcom.com bnxt_en: fix error path of FW reset
Michael Chan michael.chan@broadcom.com bnxt_en: Add missing check for BNXT_STATE_ABORT_ERR in bnxt_fw_rset_task()
Michael Chan michael.chan@broadcom.com bnxt_en: Refresh RoCE capabilities in bnxt_ulp_probe()
Kalesh AP kalesh-anakkur.purayil@broadcom.com bnxt_en: don't disable an already disabled PCI device
Andy Shevchenko andy.shevchenko@gmail.com ACPI: utils: Fix reference counting in for_each_acpi_dev_match()
Andy Shevchenko andy.shevchenko@gmail.com efi/dev-path-parser: Switch to use for_each_acpi_dev_match()
Robert Richter rrichter@amd.com ACPI: Kconfig: Fix table override from built-in initrd
Marek Vasut marex@denx.de spi: cadence: Correct initialisation of runtime PM again
Dmitry Bogdanov d.bogdanov@yadro.com scsi: target: Fix protect handling in WRITE SAME(32)
Mike Christie michael.christie@oracle.com scsi: iscsi: Fix iface sysfs attr detection
Nguyen Dinh Phi phind.uet@gmail.com netrom: Decrease sock refcount when sock timers expire
Xin Long lucien.xin@gmail.com sctp: trim optlen when it's a huge value in sctp_setsockopt
Pavel Skripkin paskripkin@gmail.com net: sched: fix memory leak in tcindex_partial_destroy_work
Nicholas Piggin npiggin@gmail.com KVM: PPC: Fix kvm_arch_vcpu_ioctl vcpu_load leak
Nicholas Piggin npiggin@gmail.com KVM: PPC: Book3S: Fix CONFIG_TRANSACTIONAL_MEM=n crash
Yajun Deng yajun.deng@linux.dev net: decnet: Fix sleeping inside in af_decnet
Michal Suchanek msuchanek@suse.de efi/tpm: Differentiate missing and invalid final event log table.
Vijendar Mukunda vijendar.mukunda@amd.com ASoC: soc-pcm: add a flag to reverse the stop sequence
Roman Skakun Roman_Skakun@epam.com dma-mapping: handle vmalloc addresses in dma_common_{mmap,get_sgtable}
Dongliang Mu mudongliangabcd@gmail.com usb: hso: fix error handling code of hso_create_net_device
Yoshitaka Ikeda ikeda@nskint.co.jp spi: spi-cadence-quadspi: Fix division by zero warning
Ziyang Xuan william.xuanziyang@huawei.com net: fix uninit-value in caif_seqpkt_sendmsg
Tobias Klauser tklauser@distanz.ch bpftool: Check malloc return value in mount_bpffs_for_pin
Jakub Sitnicki jakub@cloudflare.com bpf, sockmap, udp: sk_prot needs inuse_idx set for proc stats
John Fastabend john.fastabend@gmail.com bpf, sockmap, tcp: sk_prot needs inuse_idx set for proc stats
John Fastabend john.fastabend@gmail.com bpf, sockmap: Fix potential memory leak on unlikely error case
Colin Ian King colin.king@canonical.com s390/bpf: Perform r1 range checking before accessing jit->seen_reg[r1]
Colin Ian King colin.king@canonical.com liquidio: Fix unintentional sign extension issue on left shift of u16
Geert Uytterhoeven geert+renesas@glider.be net: dsa: mv88e6xxx: NET_DSA_MV88E6XXX_PTP should depend on NET_DSA_MV88E6XXX
Maxime Ripard maxime@cerno.tech drm/vc4: hdmi: Drop devm interrupt handler for CEC interrupts
Nicolas Saenz Julienne nsaenzju@redhat.com timers: Fix get_next_timer_interrupt() with no timers pending
Sathya Prakash M R sathya.prakash.m.r@intel.com ASoC: SOF: Intel: Update ADL descriptor to use ACPI power states
Xuan Zhuo xuanzhuo@linux.alibaba.com xdp, net: Fix use-after-free in bpf_xdp_link_release
Daniel Borkmann daniel@iogearbox.net bpf: Fix tail_call_reachable rejection for interpreter when jit failed
Xuan Zhuo xuanzhuo@linux.alibaba.com bpf, test: fix NULL pointer dereference on invalid expected_attach_type
Maxim Schwalm maxim.schwalm@gmail.com ASoC: rt5631: Fix regcache sync errors on resume
Peter Hess peter.hess@ph-home.de spi: mediatek: fix fifo rx mode
Axel Lin axel.lin@ingics.com regulator: hi6421: Fix getting wrong drvdata
Axel Lin axel.lin@ingics.com regulator: hi6421: Use correct variable type for regmap api val argument
Alain Volmat alain.volmat@foss.st.com spi: stm32: fixes pm_runtime calls in probe/remove
Charles Keepax ckeepax@opensource.cirrus.com ASoC: wm_adsp: Correct wm_coeff_tlv_get handling
Lecopzer Chen lecopzer.chen@mediatek.com Kbuild: lto: fix module versionings mismatch in GNU make 3.X
Yang Jihong yangjihong1@huawei.com perf sched: Fix record failure when CONFIG_SCHEDSTATS is not set
Riccardo Mancini rickyman7@gmail.com perf data: Close all files in close_dir()
Riccardo Mancini rickyman7@gmail.com perf probe-file: Delete namelist in del_events() on the error path
Riccardo Mancini rickyman7@gmail.com perf lzma: Close lzma stream on exit
Riccardo Mancini rickyman7@gmail.com perf script: Fix memory 'threads' and 'cpus' leaks on exit
Riccardo Mancini rickyman7@gmail.com perf script: Release zstd data
Riccardo Mancini rickyman7@gmail.com perf report: Free generated help strings for sort option
Riccardo Mancini rickyman7@gmail.com perf env: Fix memory leak of cpu_pmu_caps
Riccardo Mancini rickyman7@gmail.com perf test maps__merge_in: Fix memory leak of maps
Riccardo Mancini rickyman7@gmail.com perf dso: Fix memory leak in dso__new_map()
Riccardo Mancini rickyman7@gmail.com perf test event_update: Fix memory leak of unit
Riccardo Mancini rickyman7@gmail.com perf test event_update: Fix memory leak of evlist
Riccardo Mancini rickyman7@gmail.com perf test session_topology: Delete session->evlist
Riccardo Mancini rickyman7@gmail.com perf env: Fix sibling_dies memory leak
Riccardo Mancini rickyman7@gmail.com perf probe: Fix dso->nsinfo refcounting
Riccardo Mancini rickyman7@gmail.com perf map: Fix dso->nsinfo refcounting
Riccardo Mancini rickyman7@gmail.com perf inject: Fix dso->nsinfo refcounting
Sudeep Holla sudeep.holla@arm.com firmware: arm_scmi: Ensure drivers provide a probe function
Zev Weiss zev@bewilderbeest.net ARM: dts: aspeed: Update e3c246d4i vuart properties
Mark Rutland mark.rutland@arm.com arm64: mte: fix restoration of GCR_EL1 from suspend
Sean Christopherson seanjc@google.com KVM: SVM: Fix sev_pin_memory() error checks in SEV migration utilities
Sean Christopherson seanjc@google.com KVM: SVM: Return -EFAULT if copy_to_user() for SEV mig packet header fails
Like Xu like.xu.linux@gmail.com KVM: x86/pmu: Clear anythread deprecated bit when 0xa leaf is unsupported on the SVM
Íñigo Huguet ihuguet@redhat.com sfc: fix lack of XDP TX queues - error XDP TX failed (-22)
Vladimir Oltean vladimir.oltean@nxp.com net: ocelot: fix switchdev objects synced for wrong netdev with LAG offload
Casey Chen cachen@purestorage.com nvme-pci: do not call nvme_dev_remove_admin from nvme_remove
Marek Behún kabel@kernel.org net: phy: marvell10g: fix differentiation of 88X3310 from 88X3340
Paolo Abeni pabeni@redhat.com mptcp: properly account bulk freed memory
Paolo Abeni pabeni@redhat.com mptcp: refine mptcp_cleanup_rbuf
Paolo Abeni pabeni@redhat.com mptcp: use fast lock for subflows when possible
Jianguo Wu wujianguo@chinatelecom.cn selftests: mptcp: fix case multiple subflows limited by server
Jianguo Wu wujianguo@chinatelecom.cn mptcp: avoid processing packet if a subflow reset
Geliang Tang geliangtang@gmail.com mptcp: add sk parameter for mptcp_get_options
Jianguo Wu wujianguo@chinatelecom.cn mptcp: fix syncookie process if mptcp can not_accept new subflow
Jianguo Wu wujianguo@chinatelecom.cn mptcp: remove redundant req destruct in subflow_check_req()
Jianguo Wu wujianguo@chinatelecom.cn mptcp: fix warning in __skb_flow_dissect() when do syn cookie for subflow join
Zack Rusin zackr@vmware.com drm/vmwgfx: Fix a bad merge in otable batch takedown
Shahjada Abul Husain shahjada@chelsio.com cxgb4: fix IRQ free race during driver unload
Uwe Kleine-König u.kleine-koenig@pengutronix.de pwm: sprd: Ensure configuring period and duty_cycle isn't wrongly skipped
Hangbin Liu liuhangbin@gmail.com selftests: icmp_redirect: IPv6 PMTU info should be cleared after redirect
Hangbin Liu liuhangbin@gmail.com selftests: icmp_redirect: remove from checking for IPv6 route get
YueHaibing yuehaibing@huawei.com stmmac: platform: Fix signedness bug in stmmac_probe_config_dt()
Nicolas Dichtel nicolas.dichtel@6wind.com ipv6: fix 'disable_policy' for fwd packets
Taehee Yoo ap420073@gmail.com bonding: fix incorrect return value of bond_ipsec_offload_ok()
Taehee Yoo ap420073@gmail.com bonding: fix suspicious RCU usage in bond_ipsec_offload_ok()
Taehee Yoo ap420073@gmail.com bonding: Add struct bond_ipesc to manage SA
Taehee Yoo ap420073@gmail.com bonding: disallow setting nested bonding + ipsec offload
Taehee Yoo ap420073@gmail.com bonding: fix suspicious RCU usage in bond_ipsec_del_sa()
Taehee Yoo ap420073@gmail.com ixgbevf: use xso.real_dev instead of xso.dev in callback functions of struct xfrmdev_ops
Taehee Yoo ap420073@gmail.com bonding: fix null dereference in bond_ipsec_add_sa()
Taehee Yoo ap420073@gmail.com bonding: fix suspicious RCU usage in bond_ipsec_add_sa()
Wang Hai wanghai38@huawei.com bpf, samples: Fix xdpsock with '-M' parameter missing unload process
Christophe JAILLET christophe.jaillet@wanadoo.fr gve: Fix an error handling path in 'gve_probe()'
Mohammad Athari Bin Ismail mohammad.athari.ismail@intel.com net: stmmac: Terminate FPE workqueue in suspend
Jedrzej Jagielski jedrzej.jagielski@intel.com igb: Fix position of assignment to *ring
Aleksandr Loktionov aleksandr.loktionov@intel.com igb: Check if num of q_vectors is smaller than max before array access
Christophe JAILLET christophe.jaillet@wanadoo.fr iavf: Fix an error handling path in 'iavf_probe()'
Christophe JAILLET christophe.jaillet@wanadoo.fr e1000e: Fix an error handling path in 'e1000_probe()'
Christophe JAILLET christophe.jaillet@wanadoo.fr fm10k: Fix an error handling path in 'fm10k_probe()'
Christophe JAILLET christophe.jaillet@wanadoo.fr igb: Fix an error handling path in 'igb_probe()'
Christophe JAILLET christophe.jaillet@wanadoo.fr igc: Fix an error handling path in 'igc_probe()'
Christophe JAILLET christophe.jaillet@wanadoo.fr ixgbe: Fix an error handling path in 'ixgbe_probe()'
Tom Rix trix@redhat.com igc: change default return of igc_read_phy_reg()
Vinicius Costa Gomes vinicius.gomes@intel.com igb: Fix use-after-free error during reset
Vinicius Costa Gomes vinicius.gomes@intel.com igc: Fix use-after-free error during reset
-------------
Diffstat:
Documentation/arm64/tagged-address-abi.rst | 26 ++- .../early-userspace/early_userspace_support.rst | 8 +- .../filesystems/ramfs-rootfs-initramfs.rst | 2 +- Documentation/networking/ip-sysctl.rst | 2 +- Documentation/trace/histogram.rst | 2 +- Makefile | 4 +- arch/arm/boot/dts/aspeed-bmc-asrock-e3c246d4i.dts | 4 +- arch/arm/configs/multi_v7_defconfig | 2 +- arch/arm64/kernel/Makefile | 2 +- arch/arm64/kernel/mte.c | 15 +- arch/nds32/mm/mmap.c | 2 +- arch/powerpc/kvm/book3s_hv.c | 2 + arch/powerpc/kvm/book3s_hv_nested.c | 20 +++ arch/powerpc/kvm/book3s_rtas.c | 25 ++- arch/powerpc/kvm/powerpc.c | 4 +- arch/riscv/include/asm/efi.h | 4 +- arch/riscv/mm/init.c | 4 +- arch/s390/boot/text_dma.S | 19 +-- arch/s390/include/asm/ftrace.h | 1 + arch/s390/kernel/ftrace.c | 2 + arch/s390/kernel/mcount.S | 4 +- arch/s390/net/bpf_jit_comp.c | 2 +- arch/x86/kvm/cpuid.c | 3 +- arch/x86/kvm/svm/sev.c | 14 +- drivers/acpi/Kconfig | 2 +- drivers/acpi/utils.c | 7 +- drivers/base/auxiliary.c | 8 +- drivers/base/core.c | 6 +- drivers/block/rbd.c | 32 ++-- drivers/bus/mhi/core/main.c | 17 +- drivers/bus/mhi/pci_generic.c | 45 ++++- drivers/firmware/arm_scmi/bus.c | 3 + drivers/firmware/efi/dev-path-parser.c | 46 ++---- drivers/firmware/efi/efi.c | 13 +- drivers/firmware/efi/tpm.c | 8 +- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 3 + drivers/gpu/drm/drm_ioctl.c | 3 + drivers/gpu/drm/i915/gvt/handlers.c | 15 ++ drivers/gpu/drm/i915/i915_request.c | 8 +- .../gpu/drm/panel/panel-raspberrypi-touchscreen.c | 1 - drivers/gpu/drm/ttm/ttm_device.c | 2 + drivers/gpu/drm/vc4/vc4_hdmi.c | 49 ++++-- drivers/gpu/drm/vmwgfx/vmwgfx_mob.c | 1 - drivers/i2c/busses/i2c-mpc.c | 4 +- drivers/media/pci/intel/ipu3/cio2-bridge.c | 6 +- drivers/media/pci/ngene/ngene-core.c | 2 +- drivers/media/pci/ngene/ngene.h | 14 +- drivers/misc/eeprom/at24.c | 17 +- drivers/mmc/core/host.c | 20 +-- drivers/net/bonding/bond_main.c | 183 +++++++++++++++++---- drivers/net/dsa/mv88e6xxx/Kconfig | 2 +- drivers/net/dsa/sja1105/sja1105_main.c | 6 + drivers/net/ethernet/broadcom/bnxt/bnxt.c | 63 +++++-- drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 9 +- .../ethernet/cavium/liquidio/cn23xx_pf_device.c | 2 +- drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c | 18 +- drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c | 3 + .../net/ethernet/freescale/dpaa2/dpaa2-switch.c | 16 +- drivers/net/ethernet/google/gve/gve_main.c | 5 +- drivers/net/ethernet/hisilicon/hip04_eth.c | 6 +- drivers/net/ethernet/hisilicon/hns3/hclge_mbx.h | 6 +- .../net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c | 1 + .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 10 ++ drivers/net/ethernet/ibm/ibmvnic.c | 2 +- drivers/net/ethernet/intel/e1000e/netdev.c | 1 + drivers/net/ethernet/intel/fm10k/fm10k_pci.c | 1 + drivers/net/ethernet/intel/iavf/iavf_main.c | 1 + drivers/net/ethernet/intel/igb/igb_main.c | 15 +- drivers/net/ethernet/intel/igc/igc.h | 2 +- drivers/net/ethernet/intel/igc/igc_main.c | 3 + drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 4 +- drivers/net/ethernet/intel/ixgbevf/ipsec.c | 20 ++- drivers/net/ethernet/mscc/ocelot_net.c | 9 +- drivers/net/ethernet/realtek/r8169_main.c | 3 +- drivers/net/ethernet/sfc/efx_channels.c | 18 +- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 1 + .../net/ethernet/stmicro/stmmac/stmmac_platform.c | 8 +- drivers/net/phy/marvell10g.c | 40 ++++- drivers/net/usb/hso.c | 33 ++-- drivers/nvme/host/core.c | 5 +- drivers/nvme/host/pci.c | 5 +- drivers/pwm/pwm-sprd.c | 11 +- drivers/regulator/hi6421-regulator.c | 30 ++-- drivers/scsi/scsi_transport_iscsi.c | 90 ++++------ drivers/spi/spi-bcm2835.c | 12 +- drivers/spi/spi-cadence-quadspi.c | 3 + drivers/spi/spi-cadence.c | 14 +- drivers/spi/spi-mt65xx.c | 16 +- drivers/spi/spi-stm32.c | 9 +- drivers/target/target_core_sbc.c | 35 ++-- drivers/target/target_core_transport.c | 2 +- drivers/usb/core/hub.c | 120 ++++++++++---- drivers/usb/core/quirks.c | 4 - drivers/usb/dwc2/core.h | 4 + drivers/usb/dwc2/core_intr.c | 3 +- drivers/usb/dwc2/gadget.c | 31 +++- drivers/usb/dwc2/hcd.c | 6 +- drivers/usb/dwc2/params.c | 1 + drivers/usb/gadget/udc/tegra-xudc.c | 1 + drivers/usb/host/ehci-hcd.c | 18 +- drivers/usb/host/max3421-hcd.c | 44 ++--- drivers/usb/host/xhci-hub.c | 3 +- drivers/usb/host/xhci-pci-renesas.c | 16 +- drivers/usb/host/xhci-pci.c | 7 + drivers/usb/renesas_usbhs/fifo.c | 7 + drivers/usb/serial/cp210x.c | 5 +- drivers/usb/serial/option.c | 3 + drivers/usb/storage/unusual_uas.h | 7 + drivers/usb/typec/stusb160x.c | 20 ++- drivers/usb/typec/tipd/core.c | 9 + fs/afs/cmservice.c | 25 +-- fs/afs/write.c | 18 +- fs/btrfs/backref.c | 6 +- fs/btrfs/backref.h | 3 +- fs/btrfs/delayed-ref.c | 4 +- fs/btrfs/extent-tree.c | 3 + fs/btrfs/qgroup.c | 38 ++++- fs/btrfs/qgroup.h | 2 +- fs/btrfs/tests/qgroup-tests.c | 20 +-- fs/btrfs/tree-log.c | 31 +++- fs/ceph/mds_client.c | 2 +- fs/cifs/smb2ops.c | 49 ++++-- fs/hugetlbfs/inode.c | 2 +- fs/io_uring.c | 33 ++-- fs/proc/base.c | 2 +- fs/userfaultfd.c | 26 ++- include/acpi/acpi_bus.h | 8 +- include/drm/drm_ioctl.h | 1 + include/linux/highmem.h | 2 + include/linux/marvell_phy.h | 6 +- include/linux/memblock.h | 4 +- include/net/bonding.h | 9 +- include/net/mptcp.h | 5 +- include/sound/soc.h | 6 + include/trace/events/afs.h | 67 +++++++- kernel/bpf/verifier.c | 2 + kernel/dma/ops_helpers.c | 12 +- kernel/time/posix-cpu-timers.c | 10 +- kernel/time/timer.c | 8 +- kernel/trace/ring_buffer.c | 28 +++- kernel/trace/trace.c | 4 + kernel/trace/trace_events_hist.c | 22 ++- kernel/trace/trace_synth.h | 2 +- kernel/tracepoint.c | 2 +- mm/kfence/core.c | 19 ++- mm/memblock.c | 3 +- mm/memory.c | 11 +- mm/page_alloc.c | 29 ++-- net/bpf/test_run.c | 3 + net/caif/caif_socket.c | 3 +- net/core/dev.c | 27 ++- net/core/skbuff.c | 1 + net/core/skmsg.c | 16 +- net/decnet/af_decnet.c | 27 ++- net/ipv4/tcp_bpf.c | 2 +- net/ipv4/tcp_fastopen.c | 28 +++- net/ipv4/tcp_input.c | 19 ++- net/ipv4/tcp_ipv4.c | 2 +- net/ipv4/udp.c | 25 ++- net/ipv4/udp_bpf.c | 2 +- net/ipv6/ip6_output.c | 4 +- net/ipv6/route.c | 2 +- net/ipv6/udp.c | 25 ++- net/mptcp/mib.c | 1 + net/mptcp/mib.h | 1 + net/mptcp/options.c | 24 ++- net/mptcp/pm_netlink.c | 10 +- net/mptcp/protocol.c | 81 +++++---- net/mptcp/protocol.h | 14 +- net/mptcp/subflow.c | 21 +-- net/mptcp/syncookies.c | 16 +- net/netrom/nr_timer.c | 20 ++- net/sched/act_skbmod.c | 12 +- net/sched/cls_api.c | 2 +- net/sched/cls_tcindex.c | 5 +- net/sctp/auth.c | 2 + net/sctp/socket.c | 4 + samples/bpf/xdpsock_user.c | 28 ++++ scripts/Makefile.build | 2 +- sound/core/pcm_native.c | 25 ++- sound/hda/intel-dsp-config.c | 4 + sound/isa/sb/sb16_csp.c | 4 + sound/pci/hda/patch_hdmi.c | 1 + sound/pci/hda/patch_realtek.c | 1 + sound/soc/codecs/rt5631.c | 2 + sound/soc/codecs/wm_adsp.c | 2 +- sound/soc/soc-pcm.c | 22 ++- sound/soc/sof/intel/pci-tgl.c | 1 + sound/usb/mixer.c | 10 +- sound/usb/quirks.c | 3 + tools/bpf/bpftool/common.c | 5 + tools/perf/builtin-inject.c | 13 +- tools/perf/builtin-report.c | 33 ++-- tools/perf/builtin-sched.c | 33 +++- tools/perf/builtin-script.c | 8 + tools/perf/tests/event_update.c | 6 +- tools/perf/tests/maps.c | 2 + tools/perf/tests/topology.c | 1 + tools/perf/util/data.c | 2 +- tools/perf/util/dso.c | 4 +- tools/perf/util/env.c | 2 + tools/perf/util/lzma.c | 8 +- tools/perf/util/map.c | 2 + tools/perf/util/probe-event.c | 4 +- tools/perf/util/probe-file.c | 4 +- tools/perf/util/sort.c | 2 +- tools/perf/util/sort.h | 2 +- tools/testing/selftests/net/icmp_redirect.sh | 5 +- tools/testing/selftests/net/mptcp/mptcp_join.sh | 2 +- tools/testing/selftests/vm/userfaultfd.c | 6 +- 210 files changed, 1840 insertions(+), 869 deletions(-)
From: Vinicius Costa Gomes vinicius.gomes@intel.com
[ Upstream commit 56ea7ed103b46970e171eb1c95916f393d64eeff ]
Cleans the next descriptor to watch (next_to_watch) when cleaning the TX ring.
Failure to do so can cause invalid memory accesses. If igc_poll() runs while the controller is being reset this can lead to the driver try to free a skb that was already freed.
Log message:
[ 101.525242] refcount_t: underflow; use-after-free. [ 101.525251] WARNING: CPU: 1 PID: 646 at lib/refcount.c:28 refcount_warn_saturate+0xab/0xf0 [ 101.525259] Modules linked in: sch_etf(E) sch_mqprio(E) rfkill(E) intel_rapl_msr(E) intel_rapl_common(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) binfmt_misc(E) kvm_intel(E) kvm(E) irqbypass(E) crc32_pclmul(E) ghash_clmulni_intel(E) aesni_intel(E) mei_wdt(E) libaes(E) crypto_simd(E) cryptd(E) glue_helper(E) snd_hda_codec_hdmi(E) rapl(E) intel_cstate(E) snd_hda_intel(E) snd_intel_dspcfg(E) sg(E) soundwire_intel(E) intel_uncore(E) at24(E) soundwire_generic_allocation(E) iTCO_wdt(E) soundwire_cadence(E) intel_pmc_bxt(E) serio_raw(E) snd_hda_codec(E) iTCO_vendor_support(E) watchdog(E) snd_hda_core(E) snd_hwdep(E) snd_soc_core(E) snd_compress(E) snd_pcsp(E) soundwire_bus(E) snd_pcm(E) evdev(E) snd_timer(E) mei_me(E) snd(E) soundcore(E) mei(E) configfs(E) ip_tables(E) x_tables(E) autofs4(E) ext4(E) crc32c_generic(E) crc16(E) mbcache(E) jbd2(E) sd_mod(E) t10_pi(E) crc_t10dif(E) crct10dif_generic(E) i915(E) ahci(E) libahci(E) ehci_pci(E) igb(E) xhci_pci(E) ehci_hcd(E) [ 101.525303] drm_kms_helper(E) dca(E) xhci_hcd(E) libata(E) crct10dif_pclmul(E) cec(E) crct10dif_common(E) tsn(E) igc(E) e1000e(E) ptp(E) i2c_i801(E) crc32c_intel(E) psmouse(E) i2c_algo_bit(E) i2c_smbus(E) scsi_mod(E) lpc_ich(E) pps_core(E) usbcore(E) drm(E) button(E) video(E) [ 101.525318] CPU: 1 PID: 646 Comm: irq/37-enp7s0-T Tainted: G E 5.10.30-rt37-tsn1-rt-ipipe #ipipe [ 101.525320] Hardware name: SIEMENS AG SIMATIC IPC427D/A5E31233588, BIOS V17.02.09 03/31/2017 [ 101.525322] RIP: 0010:refcount_warn_saturate+0xab/0xf0 [ 101.525325] Code: 05 31 48 44 01 01 e8 f0 c6 42 00 0f 0b c3 80 3d 1f 48 44 01 00 75 90 48 c7 c7 78 a8 f3 a6 c6 05 0f 48 44 01 01 e8 d1 c6 42 00 <0f> 0b c3 80 3d fe 47 44 01 00 0f 85 6d ff ff ff 48 c7 c7 d0 a8 f3 [ 101.525327] RSP: 0018:ffffbdedc0917cb8 EFLAGS: 00010286 [ 101.525329] RAX: 0000000000000000 RBX: ffff98fd6becbf40 RCX: 0000000000000001 [ 101.525330] RDX: 0000000000000001 RSI: ffffffffa6f2700c RDI: 00000000ffffffff [ 101.525332] RBP: ffff98fd6becc14c R08: ffffffffa7463d00 R09: ffffbdedc0917c50 [ 101.525333] R10: ffffffffa74c3578 R11: 0000000000000034 R12: 00000000ffffff00 [ 101.525335] R13: ffff98fd6b0b1000 R14: 0000000000000039 R15: ffff98fd6be35c40 [ 101.525337] FS: 0000000000000000(0000) GS:ffff98fd6e240000(0000) knlGS:0000000000000000 [ 101.525339] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 101.525341] CR2: 00007f34135a3a70 CR3: 0000000150210003 CR4: 00000000001706e0 [ 101.525343] Call Trace: [ 101.525346] sock_wfree+0x9c/0xa0 [ 101.525353] unix_destruct_scm+0x7b/0xa0 [ 101.525358] skb_release_head_state+0x40/0x90 [ 101.525362] skb_release_all+0xe/0x30 [ 101.525364] napi_consume_skb+0x57/0x160 [ 101.525367] igc_poll+0xb7/0xc80 [igc] [ 101.525376] ? sched_clock+0x5/0x10 [ 101.525381] ? sched_clock_cpu+0xe/0x100 [ 101.525385] net_rx_action+0x14c/0x410 [ 101.525388] __do_softirq+0xe9/0x2f4 [ 101.525391] __local_bh_enable_ip+0xe3/0x110 [ 101.525395] ? irq_finalize_oneshot.part.47+0xe0/0xe0 [ 101.525398] irq_forced_thread_fn+0x6a/0x80 [ 101.525401] irq_thread+0xe8/0x180 [ 101.525403] ? wake_threads_waitq+0x30/0x30 [ 101.525406] ? irq_thread_check_affinity+0xd0/0xd0 [ 101.525408] kthread+0x183/0x1a0 [ 101.525412] ? kthread_park+0x80/0x80 [ 101.525415] ret_from_fork+0x22/0x30
Fixes: 13b5b7fd6a4a ("igc: Add support for Tx/Rx rings") Reported-by: Erez Geva erez.geva.ext@siemens.com Signed-off-by: Vinicius Costa Gomes vinicius.gomes@intel.com Tested-by: Dvora Fuxbrumer dvorax.fuxbrumer@linux.intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igc/igc_main.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c index f1adf154ec4a..9cac1e74a2ba 100644 --- a/drivers/net/ethernet/intel/igc/igc_main.c +++ b/drivers/net/ethernet/intel/igc/igc_main.c @@ -217,6 +217,8 @@ static void igc_clean_tx_ring(struct igc_ring *tx_ring) DMA_TO_DEVICE); }
+ tx_buffer->next_to_watch = NULL; + /* move us one more past the eop_desc for start of next pkt */ tx_buffer++; i++;
From: Vinicius Costa Gomes vinicius.gomes@intel.com
[ Upstream commit 7b292608db23ccbbfbfa50cdb155d01725d7a52e ]
Cleans the next descriptor to watch (next_to_watch) when cleaning the TX ring.
Failure to do so can cause invalid memory accesses. If igb_poll() runs while the controller is reset this can lead to the driver try to free a skb that was already freed.
(The crash is harder to reproduce with the igb driver, but the same potential problem exists as the code is identical to igc)
Fixes: 7cc6fd4c60f2 ("igb: Don't bother clearing Tx buffer_info in igb_clean_tx_ring") Signed-off-by: Vinicius Costa Gomes vinicius.gomes@intel.com Reported-by: Erez Geva erez.geva.ext@siemens.com Tested-by: Tony Brelinski tonyx.brelinski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igb/igb_main.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c index 7b1885f9ce03..ed7ec27df8c2 100644 --- a/drivers/net/ethernet/intel/igb/igb_main.c +++ b/drivers/net/ethernet/intel/igb/igb_main.c @@ -4835,6 +4835,8 @@ static void igb_clean_tx_ring(struct igb_ring *tx_ring) DMA_TO_DEVICE); }
+ tx_buffer->next_to_watch = NULL; + /* move us one more past the eop_desc for start of next pkt */ tx_buffer++; i++;
From: Tom Rix trix@redhat.com
[ Upstream commit 05682a0a61b6cbecd97a0f37f743b2cbfd516977 ]
Static analysis reports this problem
igc_main.c:4944:20: warning: The left operand of '&' is a garbage value if (!(phy_data & SR_1000T_REMOTE_RX_STATUS) && ~~~~~~~~ ^
phy_data is set by the call to igc_read_phy_reg() only if there is a read_reg() op, else it is unset and a 0 is returned. Change the return to -EOPNOTSUPP.
Fixes: 208983f099d9 ("igc: Add watchdog") Signed-off-by: Tom Rix trix@redhat.com Tested-by: Dvora Fuxbrumer dvorax.fuxbrumer@linux.intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igc/igc.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/igc/igc.h b/drivers/net/ethernet/intel/igc/igc.h index 25871351730b..58e842cbf6ef 100644 --- a/drivers/net/ethernet/intel/igc/igc.h +++ b/drivers/net/ethernet/intel/igc/igc.h @@ -560,7 +560,7 @@ static inline s32 igc_read_phy_reg(struct igc_hw *hw, u32 offset, u16 *data) if (hw->phy.ops.read_reg) return hw->phy.ops.read_reg(hw, offset, data);
- return 0; + return -EOPNOTSUPP; }
void igc_reinit_locked(struct igc_adapter *);
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
[ Upstream commit dd2aefcd5e37989ae5f90afdae44bbbf3a2990da ]
If an error occurs after a 'pci_enable_pcie_error_reporting()' call, it must be undone by a corresponding 'pci_disable_pcie_error_reporting()' call, as already done in the remove function.
Fixes: 6fabd715e6d8 ("ixgbe: Implement PCIe AER support") Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Tested-by: Tony Brelinski tonyx.brelinski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c index 2ac5b82676f3..39fdc46f34f9 100644 --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c @@ -11069,6 +11069,7 @@ err_ioremap: disable_dev = !test_and_set_bit(__IXGBE_DISABLED, &adapter->state); free_netdev(netdev); err_alloc_etherdev: + pci_disable_pcie_error_reporting(pdev); pci_release_mem_regions(pdev); err_pci_reg: err_dma:
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
[ Upstream commit c6bc9e5ce5d37cb3e6b552f41b92a193db1806ab ]
If an error occurs after a 'pci_enable_pcie_error_reporting()' call, it must be undone by a corresponding 'pci_disable_pcie_error_reporting()' call, as already done in the remove function.
Fixes: c9a11c23ceb6 ("igc: Add netdev") Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Tested-by: Dvora Fuxbrumer dvorax.fuxbrumer@linux.intel.com Acked-by: Sasha Neftin sasha.neftin@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igc/igc_main.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c index 9cac1e74a2ba..a8d5f196fdbd 100644 --- a/drivers/net/ethernet/intel/igc/igc_main.c +++ b/drivers/net/ethernet/intel/igc/igc_main.c @@ -5596,6 +5596,7 @@ err_sw_init: err_ioremap: free_netdev(netdev); err_alloc_etherdev: + pci_disable_pcie_error_reporting(pdev); pci_release_mem_regions(pdev); err_pci_reg: err_dma:
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
[ Upstream commit fea03b1cebd653cd095f2e9a58cfe1c85661c363 ]
If an error occurs after a 'pci_enable_pcie_error_reporting()' call, it must be undone by a corresponding 'pci_disable_pcie_error_reporting()' call, as already done in the remove function.
Fixes: 40a914fa72ab ("igb: Add support for pci-e Advanced Error Reporting") Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Tested-by: Tony Brelinski tonyx.brelinski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igb/igb_main.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c index ed7ec27df8c2..a371c51a3fe8 100644 --- a/drivers/net/ethernet/intel/igb/igb_main.c +++ b/drivers/net/ethernet/intel/igb/igb_main.c @@ -3615,6 +3615,7 @@ err_sw_init: err_ioremap: free_netdev(netdev); err_alloc_etherdev: + pci_disable_pcie_error_reporting(pdev); pci_release_mem_regions(pdev); err_pci_reg: err_dma:
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
[ Upstream commit e85e14d68f517ef12a5fb8123fff65526b35b6cd ]
If an error occurs after a 'pci_enable_pcie_error_reporting()' call, it must be undone by a corresponding 'pci_disable_pcie_error_reporting()' call, as already done in the remove function.
Fixes: 19ae1b3fb99c ("fm10k: Add support for PCI power management and error handling") Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/fm10k/fm10k_pci.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c index 9e3103fae723..caedf24c24c1 100644 --- a/drivers/net/ethernet/intel/fm10k/fm10k_pci.c +++ b/drivers/net/ethernet/intel/fm10k/fm10k_pci.c @@ -2227,6 +2227,7 @@ err_sw_init: err_ioremap: free_netdev(netdev); err_alloc_netdev: + pci_disable_pcie_error_reporting(pdev); pci_release_mem_regions(pdev); err_pci_reg: err_dma:
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
[ Upstream commit 4589075608420bc49fcef6e98279324bf2bb91ae ]
If an error occurs after a 'pci_enable_pcie_error_reporting()' call, it must be undone by a corresponding 'pci_disable_pcie_error_reporting()' call, as already done in the remove function.
Fixes: 111b9dc5c981 ("e1000e: add aer support") Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Acked-by: Sasha Neftin sasha.neftin@intel.com Tested-by: Dvora Fuxbrumer dvorax.fuxbrumer@linux.intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/e1000e/netdev.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c index dc0ded7e5e61..86b7778dc9b4 100644 --- a/drivers/net/ethernet/intel/e1000e/netdev.c +++ b/drivers/net/ethernet/intel/e1000e/netdev.c @@ -7664,6 +7664,7 @@ err_flashmap: err_ioremap: free_netdev(netdev); err_alloc_etherdev: + pci_disable_pcie_error_reporting(pdev); pci_release_mem_regions(pdev); err_pci_reg: err_dma:
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
[ Upstream commit af30cbd2f4d6d66a9b6094e0aa32420bc8b20e08 ]
If an error occurs after a 'pci_enable_pcie_error_reporting()' call, it must be undone by a corresponding 'pci_disable_pcie_error_reporting()' call, as already done in the remove function.
Fixes: 5eae00c57f5e ("i40evf: main driver core") Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/iavf/iavf_main.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/intel/iavf/iavf_main.c b/drivers/net/ethernet/intel/iavf/iavf_main.c index e612c24fa384..44bafedd09f2 100644 --- a/drivers/net/ethernet/intel/iavf/iavf_main.c +++ b/drivers/net/ethernet/intel/iavf/iavf_main.c @@ -3798,6 +3798,7 @@ static int iavf_probe(struct pci_dev *pdev, const struct pci_device_id *ent) err_ioremap: free_netdev(netdev); err_alloc_etherdev: + pci_disable_pcie_error_reporting(pdev); pci_release_regions(pdev); err_pci_reg: err_dma:
From: Aleksandr Loktionov aleksandr.loktionov@intel.com
[ Upstream commit 6c19d772618fea40d9681f259368f284a330fd90 ]
Ensure that the adapter->q_vector[MAX_Q_VECTORS] array isn't accessed beyond its size. It was fixed by using a local variable num_q_vectors as a limit for loop index, and ensure that num_q_vectors is not bigger than MAX_Q_VECTORS.
Fixes: 047e0030f1e6 ("igb: add new data structure for handling interrupts and NAPI") Signed-off-by: Aleksandr Loktionov aleksandr.loktionov@intel.com Reviewed-by: Grzegorz Siwik grzegorz.siwik@intel.com Reviewed-by: Arkadiusz Kubalewski arkadiusz.kubalewski@intel.com Reviewed-by: Slawomir Laba slawomirx.laba@intel.com Reviewed-by: Sylwester Dziedziuch sylwesterx.dziedziuch@intel.com Reviewed-by: Mateusz Palczewski mateusz.placzewski@intel.com Tested-by: Tony Brelinski tonyx.brelinski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igb/igb_main.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c index a371c51a3fe8..9f83ff55394c 100644 --- a/drivers/net/ethernet/intel/igb/igb_main.c +++ b/drivers/net/ethernet/intel/igb/igb_main.c @@ -931,6 +931,7 @@ static void igb_configure_msix(struct igb_adapter *adapter) **/ static int igb_request_msix(struct igb_adapter *adapter) { + unsigned int num_q_vectors = adapter->num_q_vectors; struct net_device *netdev = adapter->netdev; int i, err = 0, vector = 0, free_vector = 0;
@@ -939,7 +940,13 @@ static int igb_request_msix(struct igb_adapter *adapter) if (err) goto err_out;
- for (i = 0; i < adapter->num_q_vectors; i++) { + if (num_q_vectors > MAX_Q_VECTORS) { + num_q_vectors = MAX_Q_VECTORS; + dev_warn(&adapter->pdev->dev, + "The number of queue vectors (%d) is higher than max allowed (%d)\n", + adapter->num_q_vectors, MAX_Q_VECTORS); + } + for (i = 0; i < num_q_vectors; i++) { struct igb_q_vector *q_vector = adapter->q_vector[i];
vector++;
From: Jedrzej Jagielski jedrzej.jagielski@intel.com
[ Upstream commit 382a7c20d9253bcd5715789b8179528d0f3de72c ]
Assignment to *ring should be done after correctness check of the argument queue.
Fixes: 91db364236c8 ("igb: Refactor igb_configure_cbs()") Signed-off-by: Jedrzej Jagielski jedrzej.jagielski@intel.com Acked-by: Vinicius Costa Gomes vinicius.gomes@intel.com Tested-by: Tony Brelinski tonyx.brelinski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igb/igb_main.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c index 9f83ff55394c..b0e900d1eae2 100644 --- a/drivers/net/ethernet/intel/igb/igb_main.c +++ b/drivers/net/ethernet/intel/igb/igb_main.c @@ -1685,14 +1685,15 @@ static bool is_any_txtime_enabled(struct igb_adapter *adapter) **/ static void igb_config_tx_modes(struct igb_adapter *adapter, int queue) { - struct igb_ring *ring = adapter->tx_ring[queue]; struct net_device *netdev = adapter->netdev; struct e1000_hw *hw = &adapter->hw; + struct igb_ring *ring; u32 tqavcc, tqavctrl; u16 value;
WARN_ON(hw->mac.type != e1000_i210); WARN_ON(queue < 0 || queue > 1); + ring = adapter->tx_ring[queue];
/* If any of the Qav features is enabled, configure queues as SR and * with HIGH PRIO. If none is, then configure them with LOW PRIO and
From: Mohammad Athari Bin Ismail mohammad.athari.ismail@intel.com
[ Upstream commit 6b28a86d6c0bb02119f386ec2f56efde909e9bcb ]
Add stmmac_fpe_stop_wq() in stmmac_suspend() to terminate FPE workqueue during suspend. So, in suspend mode, there will be no FPE workqueue available. Without this fix, new additional FPE workqueue will be created in every suspend->resume cycle.
Fixes: 5a5586112b92 ("net: stmmac: support FPE link partner hand-shaking procedure") Signed-off-by: Mohammad Athari Bin Ismail mohammad.athari.ismail@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c index 91cd5073ddb2..980a60477b02 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c @@ -7170,6 +7170,7 @@ int stmmac_suspend(struct device *dev) priv->plat->rx_queues_to_use, false);
stmmac_fpe_handshake(priv, false); + stmmac_fpe_stop_wq(priv); }
priv->speed = SPEED_UNKNOWN;
From: Christophe JAILLET christophe.jaillet@wanadoo.fr
[ Upstream commit 2342ae10d1272d411a468a85a67647dd115b344f ]
If the 'register_netdev() call fails, we must release the resources allocated by the previous 'gve_init_priv()' call, as already done in the remove function.
Add a new label and the missing 'gve_teardown_priv_resources()' in the error handling path.
Fixes: 893ce44df565 ("gve: Add basic driver framework for Compute Engine Virtual NIC") Signed-off-by: Christophe JAILLET christophe.jaillet@wanadoo.fr Reviewed-by: Catherine Sullivan csully@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/google/gve/gve_main.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/google/gve/gve_main.c b/drivers/net/ethernet/google/gve/gve_main.c index 79cefe85a799..b43c6ff07614 100644 --- a/drivers/net/ethernet/google/gve/gve_main.c +++ b/drivers/net/ethernet/google/gve/gve_main.c @@ -1349,13 +1349,16 @@ static int gve_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
err = register_netdev(dev); if (err) - goto abort_with_wq; + goto abort_with_gve_init;
dev_info(&pdev->dev, "GVE version %s\n", gve_version_str); gve_clear_probe_in_progress(priv); queue_work(priv->gve_wq, &priv->service_task); return 0;
+abort_with_gve_init: + gve_teardown_priv_resources(priv); + abort_with_wq: destroy_workqueue(priv->gve_wq);
From: Wang Hai wanghai38@huawei.com
[ Upstream commit 2620e92ae6ed83260eb46d214554cd308ee35d92 ]
Execute the following command and exit, then execute it again, the following error will be reported:
$ sudo ./samples/bpf/xdpsock -i ens4f2 -M ^C $ sudo ./samples/bpf/xdpsock -i ens4f2 -M libbpf: elf: skipping unrecognized data section(16) .eh_frame libbpf: elf: skipping relo section(17) .rel.eh_frame for section(16) .eh_frame libbpf: Kernel error message: XDP program already attached ERROR: link set xdp fd failed
Commit c9d27c9e8dc7 ("samples: bpf: Do not unload prog within xdpsock") removed the unloading prog code because of the presence of bpf_link. This is fine if XDP_SHARED_UMEM is disabled, but if it is enabled, unloading the prog is still needed.
Fixes: c9d27c9e8dc7 ("samples: bpf: Do not unload prog within xdpsock") Signed-off-by: Wang Hai wanghai38@huawei.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Acked-by: Magnus Karlsson magnus.karlsson@intel.com Cc: Maciej Fijalkowski maciej.fijalkowski@intel.com Link: https://lore.kernel.org/bpf/20210628091815.2373487-1-wanghai38@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- samples/bpf/xdpsock_user.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+)
diff --git a/samples/bpf/xdpsock_user.c b/samples/bpf/xdpsock_user.c index 53e300f860bb..33d0bdebbed8 100644 --- a/samples/bpf/xdpsock_user.c +++ b/samples/bpf/xdpsock_user.c @@ -96,6 +96,7 @@ static int opt_xsk_frame_size = XSK_UMEM__DEFAULT_FRAME_SIZE; static int opt_timeout = 1000; static bool opt_need_wakeup = true; static u32 opt_num_xsks = 1; +static u32 prog_id; static bool opt_busy_poll; static bool opt_reduced_cap;
@@ -461,6 +462,23 @@ static void *poller(void *arg) return NULL; }
+static void remove_xdp_program(void) +{ + u32 curr_prog_id = 0; + + if (bpf_get_link_xdp_id(opt_ifindex, &curr_prog_id, opt_xdp_flags)) { + printf("bpf_get_link_xdp_id failed\n"); + exit(EXIT_FAILURE); + } + + if (prog_id == curr_prog_id) + bpf_set_link_xdp_fd(opt_ifindex, -1, opt_xdp_flags); + else if (!curr_prog_id) + printf("couldn't find a prog id on a given interface\n"); + else + printf("program on interface changed, not removing\n"); +} + static void int_exit(int sig) { benchmark_done = true; @@ -471,6 +489,9 @@ static void __exit_with_error(int error, const char *file, const char *func, { fprintf(stderr, "%s:%s:%i: errno: %d/"%s"\n", file, func, line, error, strerror(error)); + + if (opt_num_xsks > 1) + remove_xdp_program(); exit(EXIT_FAILURE); }
@@ -490,6 +511,9 @@ static void xdpsock_cleanup(void) if (write(sock, &cmd, sizeof(int)) < 0) exit_with_error(errno); } + + if (opt_num_xsks > 1) + remove_xdp_program(); }
static void swap_mac_addresses(void *data) @@ -857,6 +881,10 @@ static struct xsk_socket_info *xsk_configure_socket(struct xsk_umem_info *umem, if (ret) exit_with_error(-ret);
+ ret = bpf_get_link_xdp_id(opt_ifindex, &prog_id, opt_xdp_flags); + if (ret) + exit_with_error(-ret); + xsk->app_stats.rx_empty_polls = 0; xsk->app_stats.fill_fail_polls = 0; xsk->app_stats.copy_tx_sendtos = 0;
From: Taehee Yoo ap420073@gmail.com
[ Upstream commit b648eba4c69e5819880b4907e7fcb2bb576069ab ]
To dereference bond->curr_active_slave, it uses rcu_dereference(). But it and the caller doesn't acquire RCU so a warning occurs. So add rcu_read_lock().
Test commands: ip link add dummy0 type dummy ip link add bond0 type bond ip link set dummy0 master bond0 ip link set dummy0 up ip link set bond0 up ip x s add proto esp dst 14.1.1.1 src 15.1.1.1 spi 0x07 \ mode transport \ reqid 0x07 replay-window 32 aead 'rfc4106(gcm(aes))' \ 0x44434241343332312423222114131211f4f3f2f1 128 sel \ src 14.0.0.52/24 dst 14.0.0.70/24 proto tcp offload \ dev bond0 dir in
Splat looks like: ============================= WARNING: suspicious RCU usage 5.13.0-rc3+ #1168 Not tainted ----------------------------- drivers/net/bonding/bond_main.c:411 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1 1 lock held by ip/684: #0: ffffffff9a2757c0 (&net->xfrm.xfrm_cfg_mutex){+.+.}-{3:3}, at: xfrm_netlink_rcv+0x59/0x80 [xfrm_user] 55.191733][ T684] stack backtrace: CPU: 0 PID: 684 Comm: ip Not tainted 5.13.0-rc3+ #1168 Call Trace: dump_stack+0xa4/0xe5 bond_ipsec_add_sa+0x18c/0x1f0 [bonding] xfrm_dev_state_add+0x2a9/0x770 ? memcpy+0x38/0x60 xfrm_add_sa+0x2278/0x3b10 [xfrm_user] ? xfrm_get_policy+0xaa0/0xaa0 [xfrm_user] ? register_lock_class+0x1750/0x1750 xfrm_user_rcv_msg+0x331/0x660 [xfrm_user] ? rcu_read_lock_sched_held+0x91/0xc0 ? xfrm_user_state_lookup.constprop.39+0x320/0x320 [xfrm_user] ? find_held_lock+0x3a/0x1c0 ? mutex_lock_io_nested+0x1210/0x1210 ? sched_clock_cpu+0x18/0x170 netlink_rcv_skb+0x121/0x350 ? xfrm_user_state_lookup.constprop.39+0x320/0x320 [xfrm_user] ? netlink_ack+0x9d0/0x9d0 ? netlink_deliver_tap+0x17c/0xa50 xfrm_netlink_rcv+0x68/0x80 [xfrm_user] netlink_unicast+0x41c/0x610 ? netlink_attachskb+0x710/0x710 netlink_sendmsg+0x6b9/0xb70 [ ... ]
Fixes: 18cb261afd7b ("bonding: support hardware encryption offload to slaves") Signed-off-by: Taehee Yoo ap420073@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/bonding/bond_main.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index c5a646d06102..026f4511bf7b 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -403,10 +403,12 @@ static int bond_ipsec_add_sa(struct xfrm_state *xs) struct net_device *bond_dev = xs->xso.dev; struct bonding *bond; struct slave *slave; + int err;
if (!bond_dev) return -EINVAL;
+ rcu_read_lock(); bond = netdev_priv(bond_dev); slave = rcu_dereference(bond->curr_active_slave); xs->xso.real_dev = slave->dev; @@ -415,10 +417,13 @@ static int bond_ipsec_add_sa(struct xfrm_state *xs) if (!(slave->dev->xfrmdev_ops && slave->dev->xfrmdev_ops->xdo_dev_state_add)) { slave_warn(bond_dev, slave->dev, "Slave does not support ipsec offload\n"); + rcu_read_unlock(); return -EINVAL; }
- return slave->dev->xfrmdev_ops->xdo_dev_state_add(xs); + err = slave->dev->xfrmdev_ops->xdo_dev_state_add(xs); + rcu_read_unlock(); + return err; }
/**
From: Taehee Yoo ap420073@gmail.com
[ Upstream commit 105cd17a866017b45f3c45901b394c711c97bf40 ]
If bond doesn't have real device, bond->curr_active_slave is null. But bond_ipsec_add_sa() dereferences bond->curr_active_slave without null checking. So, null-ptr-deref would occur.
Test commands: ip link add bond0 type bond ip link set bond0 up ip x s add proto esp dst 14.1.1.1 src 15.1.1.1 spi \ 0x07 mode transport reqid 0x07 replay-window 32 aead 'rfc4106(gcm(aes))' \ 0x44434241343332312423222114131211f4f3f2f1 128 sel src 14.0.0.52/24 \ dst 14.0.0.70/24 proto tcp offload dev bond0 dir in
Splat looks like: KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] CPU: 4 PID: 680 Comm: ip Not tainted 5.13.0-rc3+ #1168 RIP: 0010:bond_ipsec_add_sa+0xc4/0x2e0 [bonding] Code: 85 21 02 00 00 4d 8b a6 48 0c 00 00 e8 75 58 44 ce 85 c0 0f 85 14 01 00 00 48 b8 00 00 00 00 00 fc ff df 4c 89 e2 48 c1 ea 03 <80> 3c 02 00 0f 85 fc 01 00 00 48 8d bb e0 02 00 00 4d 8b 2c 24 48 RSP: 0018:ffff88810946f508 EFLAGS: 00010246 RAX: dffffc0000000000 RBX: ffff88810b4e8040 RCX: 0000000000000001 RDX: 0000000000000000 RSI: ffffffff8fe34280 RDI: ffff888115abe100 RBP: ffff88810946f528 R08: 0000000000000003 R09: fffffbfff2287e11 R10: 0000000000000001 R11: ffff888115abe0c8 R12: 0000000000000000 R13: ffffffffc0aea9a0 R14: ffff88800d7d2000 R15: ffff88810b4e8330 FS: 00007efc5552e680(0000) GS:ffff888119c00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055c2530dbf40 CR3: 0000000103056004 CR4: 00000000003706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: xfrm_dev_state_add+0x2a9/0x770 ? memcpy+0x38/0x60 xfrm_add_sa+0x2278/0x3b10 [xfrm_user] ? xfrm_get_policy+0xaa0/0xaa0 [xfrm_user] ? register_lock_class+0x1750/0x1750 xfrm_user_rcv_msg+0x331/0x660 [xfrm_user] ? rcu_read_lock_sched_held+0x91/0xc0 ? xfrm_user_state_lookup.constprop.39+0x320/0x320 [xfrm_user] ? find_held_lock+0x3a/0x1c0 ? mutex_lock_io_nested+0x1210/0x1210 ? sched_clock_cpu+0x18/0x170 netlink_rcv_skb+0x121/0x350 ? xfrm_user_state_lookup.constprop.39+0x320/0x320 [xfrm_user] ? netlink_ack+0x9d0/0x9d0 ? netlink_deliver_tap+0x17c/0xa50 xfrm_netlink_rcv+0x68/0x80 [xfrm_user] netlink_unicast+0x41c/0x610 ? netlink_attachskb+0x710/0x710 netlink_sendmsg+0x6b9/0xb70 [ ...]
Fixes: 18cb261afd7b ("bonding: support hardware encryption offload to slaves") Signed-off-by: Taehee Yoo ap420073@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/bonding/bond_main.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 026f4511bf7b..24b33118105a 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -411,6 +411,11 @@ static int bond_ipsec_add_sa(struct xfrm_state *xs) rcu_read_lock(); bond = netdev_priv(bond_dev); slave = rcu_dereference(bond->curr_active_slave); + if (!slave) { + rcu_read_unlock(); + return -ENODEV; + } + xs->xso.real_dev = slave->dev; bond->xs = xs;
From: Taehee Yoo ap420073@gmail.com
[ Upstream commit 2de7e4f67599affc97132bd07e30e3bd59d0b777 ]
There are two pointers in struct xfrm_state_offload, *dev, *real_dev. These are used in callback functions of struct xfrmdev_ops. The *dev points whether bonding interface or real interface. If bonding ipsec offload is used, it points bonding interface If not, it points real interface. And real_dev always points real interface. So, ixgbevf should always use real_dev instead of dev. Of course, real_dev always not be null.
Test commands: ip link add bond0 type bond #eth0 is ixgbevf interface ip link set eth0 master bond0 ip link set bond0 up ip x s add proto esp dst 14.1.1.1 src 15.1.1.1 spi 0x07 mode \ transport reqid 0x07 replay-window 32 aead 'rfc4106(gcm(aes))' \ 0x44434241343332312423222114131211f4f3f2f1 128 sel src 14.0.0.52/24 \ dst 14.0.0.70/24 proto tcp offload dev bond0 dir in
Splat looks like: KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007] CPU: 6 PID: 688 Comm: ip Not tainted 5.13.0-rc3+ #1168 RIP: 0010:ixgbevf_ipsec_find_empty_idx+0x28/0x1b0 [ixgbevf] Code: 00 00 0f 1f 44 00 00 55 53 48 89 fb 48 83 ec 08 40 84 f6 0f 84 9c 00 00 00 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 01 0f 8e 4c 01 00 00 66 81 3b 00 04 0f RSP: 0018:ffff8880089af390 EFLAGS: 00010246 RAX: dffffc0000000000 RBX: 0000000000000000 RCX: 0000000000000001 RDX: 0000000000000000 RSI: 0000000000000001 RDI: 0000000000000000 RBP: ffff8880089af4f8 R08: 0000000000000003 R09: fffffbfff4287e11 R10: 0000000000000001 R11: ffff888005de8908 R12: 0000000000000000 R13: ffff88810936a000 R14: ffff88810936a000 R15: ffff888004d78040 FS: 00007fdf9883a680(0000) GS:ffff88811a400000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000055bc14adbf40 CR3: 000000000b87c005 CR4: 00000000003706e0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: ixgbevf_ipsec_add_sa+0x1bf/0x9c0 [ixgbevf] ? rcu_read_lock_sched_held+0x91/0xc0 ? ixgbevf_ipsec_parse_proto_keys.isra.9+0x280/0x280 [ixgbevf] ? lock_acquire+0x191/0x720 ? bond_ipsec_add_sa+0x48/0x350 [bonding] ? lockdep_hardirqs_on_prepare+0x3e0/0x3e0 ? rcu_read_lock_held+0x91/0xa0 ? rcu_read_lock_sched_held+0xc0/0xc0 bond_ipsec_add_sa+0x193/0x350 [bonding] xfrm_dev_state_add+0x2a9/0x770 ? memcpy+0x38/0x60 xfrm_add_sa+0x2278/0x3b10 [xfrm_user] ? xfrm_get_policy+0xaa0/0xaa0 [xfrm_user] ? register_lock_class+0x1750/0x1750 xfrm_user_rcv_msg+0x331/0x660 [xfrm_user] ? rcu_read_lock_sched_held+0x91/0xc0 ? xfrm_user_state_lookup.constprop.39+0x320/0x320 [xfrm_user] ? find_held_lock+0x3a/0x1c0 ? mutex_lock_io_nested+0x1210/0x1210 ? sched_clock_cpu+0x18/0x170 netlink_rcv_skb+0x121/0x350 [ ... ]
Fixes: 272c2330adc9 ("xfrm: bail early on slave pass over skb") Signed-off-by: Taehee Yoo ap420073@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ixgbevf/ipsec.c | 20 +++++++++++++------- 1 file changed, 13 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/intel/ixgbevf/ipsec.c b/drivers/net/ethernet/intel/ixgbevf/ipsec.c index caaea2c920a6..e3e4676af9e4 100644 --- a/drivers/net/ethernet/intel/ixgbevf/ipsec.c +++ b/drivers/net/ethernet/intel/ixgbevf/ipsec.c @@ -211,7 +211,7 @@ struct xfrm_state *ixgbevf_ipsec_find_rx_state(struct ixgbevf_ipsec *ipsec, static int ixgbevf_ipsec_parse_proto_keys(struct xfrm_state *xs, u32 *mykey, u32 *mysalt) { - struct net_device *dev = xs->xso.dev; + struct net_device *dev = xs->xso.real_dev; unsigned char *key_data; char *alg_name = NULL; int key_len; @@ -260,12 +260,15 @@ static int ixgbevf_ipsec_parse_proto_keys(struct xfrm_state *xs, **/ static int ixgbevf_ipsec_add_sa(struct xfrm_state *xs) { - struct net_device *dev = xs->xso.dev; - struct ixgbevf_adapter *adapter = netdev_priv(dev); - struct ixgbevf_ipsec *ipsec = adapter->ipsec; + struct net_device *dev = xs->xso.real_dev; + struct ixgbevf_adapter *adapter; + struct ixgbevf_ipsec *ipsec; u16 sa_idx; int ret;
+ adapter = netdev_priv(dev); + ipsec = adapter->ipsec; + if (xs->id.proto != IPPROTO_ESP && xs->id.proto != IPPROTO_AH) { netdev_err(dev, "Unsupported protocol 0x%04x for IPsec offload\n", xs->id.proto); @@ -383,11 +386,14 @@ static int ixgbevf_ipsec_add_sa(struct xfrm_state *xs) **/ static void ixgbevf_ipsec_del_sa(struct xfrm_state *xs) { - struct net_device *dev = xs->xso.dev; - struct ixgbevf_adapter *adapter = netdev_priv(dev); - struct ixgbevf_ipsec *ipsec = adapter->ipsec; + struct net_device *dev = xs->xso.real_dev; + struct ixgbevf_adapter *adapter; + struct ixgbevf_ipsec *ipsec; u16 sa_idx;
+ adapter = netdev_priv(dev); + ipsec = adapter->ipsec; + if (xs->xso.flags & XFRM_OFFLOAD_INBOUND) { sa_idx = xs->xso.offload_handle - IXGBE_IPSEC_BASE_RX_INDEX;
From: Taehee Yoo ap420073@gmail.com
[ Upstream commit a22c39b831a081da9b2c488bd970a4412d926f30 ]
To dereference bond->curr_active_slave, it uses rcu_dereference(). But it and the caller doesn't acquire RCU so a warning occurs. So add rcu_read_lock().
Test commands: ip netns add A ip netns exec A bash modprobe netdevsim echo "1 1" > /sys/bus/netdevsim/new_device ip link add bond0 type bond ip link set eth0 master bond0 ip link set eth0 up ip link set bond0 up ip x s add proto esp dst 14.1.1.1 src 15.1.1.1 spi 0x07 mode \ transport reqid 0x07 replay-window 32 aead 'rfc4106(gcm(aes))' \ 0x44434241343332312423222114131211f4f3f2f1 128 sel src 14.0.0.52/24 \ dst 14.0.0.70/24 proto tcp offload dev bond0 dir in ip x s f
Splat looks like: ============================= WARNING: suspicious RCU usage 5.13.0-rc3+ #1168 Not tainted ----------------------------- drivers/net/bonding/bond_main.c:448 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1 2 locks held by ip/705: #0: ffff888106701780 (&net->xfrm.xfrm_cfg_mutex){+.+.}-{3:3}, at: xfrm_netlink_rcv+0x59/0x80 [xfrm_user] #1: ffff8880075b0098 (&x->lock){+.-.}-{2:2}, at: xfrm_state_delete+0x16/0x30
stack backtrace: CPU: 6 PID: 705 Comm: ip Not tainted 5.13.0-rc3+ #1168 Call Trace: dump_stack+0xa4/0xe5 bond_ipsec_del_sa+0x16a/0x1c0 [bonding] __xfrm_state_delete+0x51f/0x730 xfrm_state_delete+0x1e/0x30 xfrm_state_flush+0x22f/0x390 xfrm_flush_sa+0xd8/0x260 [xfrm_user] ? xfrm_flush_policy+0x290/0x290 [xfrm_user] xfrm_user_rcv_msg+0x331/0x660 [xfrm_user] ? rcu_read_lock_sched_held+0x91/0xc0 ? xfrm_user_state_lookup.constprop.39+0x320/0x320 [xfrm_user] ? find_held_lock+0x3a/0x1c0 ? mutex_lock_io_nested+0x1210/0x1210 ? sched_clock_cpu+0x18/0x170 netlink_rcv_skb+0x121/0x350 [ ... ]
Fixes: 18cb261afd7b ("bonding: support hardware encryption offload to slaves") Signed-off-by: Taehee Yoo ap420073@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/bonding/bond_main.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 24b33118105a..a7b6550063b2 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -444,21 +444,24 @@ static void bond_ipsec_del_sa(struct xfrm_state *xs) if (!bond_dev) return;
+ rcu_read_lock(); bond = netdev_priv(bond_dev); slave = rcu_dereference(bond->curr_active_slave);
if (!slave) - return; + goto out;
xs->xso.real_dev = slave->dev;
if (!(slave->dev->xfrmdev_ops && slave->dev->xfrmdev_ops->xdo_dev_state_delete)) { slave_warn(bond_dev, slave->dev, "%s: no slave xdo_dev_state_delete\n", __func__); - return; + goto out; }
slave->dev->xfrmdev_ops->xdo_dev_state_delete(xs); +out: + rcu_read_unlock(); }
/**
From: Taehee Yoo ap420073@gmail.com
[ Upstream commit b121693381b112b78c076dea171ee113e237c0e4 ]
bonding interface can be nested and it supports ipsec offload. So, it allows setting the nested bonding + ipsec scenario. But code does not support this scenario. So, it should be disallowed.
interface graph: bond2 | bond1 | eth0
The nested bonding + ipsec offload may not a real usecase. So, disallowing this scenario is fine.
Fixes: 18cb261afd7b ("bonding: support hardware encryption offload to slaves") Signed-off-by: Taehee Yoo ap420073@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/bonding/bond_main.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index a7b6550063b2..d85a19c06c69 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -419,8 +419,9 @@ static int bond_ipsec_add_sa(struct xfrm_state *xs) xs->xso.real_dev = slave->dev; bond->xs = xs;
- if (!(slave->dev->xfrmdev_ops - && slave->dev->xfrmdev_ops->xdo_dev_state_add)) { + if (!slave->dev->xfrmdev_ops || + !slave->dev->xfrmdev_ops->xdo_dev_state_add || + netif_is_bond_master(slave->dev)) { slave_warn(bond_dev, slave->dev, "Slave does not support ipsec offload\n"); rcu_read_unlock(); return -EINVAL; @@ -453,8 +454,9 @@ static void bond_ipsec_del_sa(struct xfrm_state *xs)
xs->xso.real_dev = slave->dev;
- if (!(slave->dev->xfrmdev_ops - && slave->dev->xfrmdev_ops->xdo_dev_state_delete)) { + if (!slave->dev->xfrmdev_ops || + !slave->dev->xfrmdev_ops->xdo_dev_state_delete || + netif_is_bond_master(slave->dev)) { slave_warn(bond_dev, slave->dev, "%s: no slave xdo_dev_state_delete\n", __func__); goto out; } @@ -479,8 +481,9 @@ static bool bond_ipsec_offload_ok(struct sk_buff *skb, struct xfrm_state *xs) if (BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP) return true;
- if (!(slave_dev->xfrmdev_ops - && slave_dev->xfrmdev_ops->xdo_dev_offload_ok)) { + if (!slave_dev->xfrmdev_ops || + !slave_dev->xfrmdev_ops->xdo_dev_offload_ok || + netif_is_bond_master(slave_dev)) { slave_warn(bond_dev, slave_dev, "%s: no slave xdo_dev_offload_ok\n", __func__); return false; }
From: Taehee Yoo ap420073@gmail.com
[ Upstream commit 9a5605505d9c7dbfdb89cc29a8f5fc5cf9fd2334 ]
bonding has been supporting ipsec offload. When SA is added, bonding just passes SA to its own active real interface. But it doesn't manage SA. So, when events(add/del real interface, active real interface change, etc) occur, bonding can't handle that well because It doesn't manage SA. So some problems(panic, UAF, refcnt leak)occur.
In order to make it stable, it should manage SA. That's the reason why struct bond_ipsec is added. When a new SA is added to bonding interface, it is stored in the bond_ipsec list. And the SA is passed to a current active real interface. If events occur, it uses bond_ipsec data to handle these events. bond->ipsec_list is protected by bond->ipsec_lock.
If a current active real interface is changed, the following logic works. 1. delete all SAs from old active real interface 2. Add all SAs to the new active real interface. 3. If a new active real interface doesn't support ipsec offload or SA's option, it sets real_dev to NULL.
Fixes: 18cb261afd7b ("bonding: support hardware encryption offload to slaves") Signed-off-by: Taehee Yoo ap420073@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/bonding/bond_main.c | 139 +++++++++++++++++++++++++++----- include/net/bonding.h | 9 ++- 2 files changed, 127 insertions(+), 21 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index d85a19c06c69..3f67b4b794ac 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -401,6 +401,7 @@ static int bond_vlan_rx_kill_vid(struct net_device *bond_dev, static int bond_ipsec_add_sa(struct xfrm_state *xs) { struct net_device *bond_dev = xs->xso.dev; + struct bond_ipsec *ipsec; struct bonding *bond; struct slave *slave; int err; @@ -416,9 +417,6 @@ static int bond_ipsec_add_sa(struct xfrm_state *xs) return -ENODEV; }
- xs->xso.real_dev = slave->dev; - bond->xs = xs; - if (!slave->dev->xfrmdev_ops || !slave->dev->xfrmdev_ops->xdo_dev_state_add || netif_is_bond_master(slave->dev)) { @@ -427,11 +425,63 @@ static int bond_ipsec_add_sa(struct xfrm_state *xs) return -EINVAL; }
+ ipsec = kmalloc(sizeof(*ipsec), GFP_ATOMIC); + if (!ipsec) { + rcu_read_unlock(); + return -ENOMEM; + } + xs->xso.real_dev = slave->dev; + err = slave->dev->xfrmdev_ops->xdo_dev_state_add(xs); + if (!err) { + ipsec->xs = xs; + INIT_LIST_HEAD(&ipsec->list); + spin_lock_bh(&bond->ipsec_lock); + list_add(&ipsec->list, &bond->ipsec_list); + spin_unlock_bh(&bond->ipsec_lock); + } else { + kfree(ipsec); + } rcu_read_unlock(); return err; }
+static void bond_ipsec_add_sa_all(struct bonding *bond) +{ + struct net_device *bond_dev = bond->dev; + struct bond_ipsec *ipsec; + struct slave *slave; + + rcu_read_lock(); + slave = rcu_dereference(bond->curr_active_slave); + if (!slave) + goto out; + + if (!slave->dev->xfrmdev_ops || + !slave->dev->xfrmdev_ops->xdo_dev_state_add || + netif_is_bond_master(slave->dev)) { + spin_lock_bh(&bond->ipsec_lock); + if (!list_empty(&bond->ipsec_list)) + slave_warn(bond_dev, slave->dev, + "%s: no slave xdo_dev_state_add\n", + __func__); + spin_unlock_bh(&bond->ipsec_lock); + goto out; + } + + spin_lock_bh(&bond->ipsec_lock); + list_for_each_entry(ipsec, &bond->ipsec_list, list) { + ipsec->xs->xso.real_dev = slave->dev; + if (slave->dev->xfrmdev_ops->xdo_dev_state_add(ipsec->xs)) { + slave_warn(bond_dev, slave->dev, "%s: failed to add SA\n", __func__); + ipsec->xs->xso.real_dev = NULL; + } + } + spin_unlock_bh(&bond->ipsec_lock); +out: + rcu_read_unlock(); +} + /** * bond_ipsec_del_sa - clear out this specific SA * @xs: pointer to transformer state struct @@ -439,6 +489,7 @@ static int bond_ipsec_add_sa(struct xfrm_state *xs) static void bond_ipsec_del_sa(struct xfrm_state *xs) { struct net_device *bond_dev = xs->xso.dev; + struct bond_ipsec *ipsec; struct bonding *bond; struct slave *slave;
@@ -452,7 +503,10 @@ static void bond_ipsec_del_sa(struct xfrm_state *xs) if (!slave) goto out;
- xs->xso.real_dev = slave->dev; + if (!xs->xso.real_dev) + goto out; + + WARN_ON(xs->xso.real_dev != slave->dev);
if (!slave->dev->xfrmdev_ops || !slave->dev->xfrmdev_ops->xdo_dev_state_delete || @@ -463,6 +517,48 @@ static void bond_ipsec_del_sa(struct xfrm_state *xs)
slave->dev->xfrmdev_ops->xdo_dev_state_delete(xs); out: + spin_lock_bh(&bond->ipsec_lock); + list_for_each_entry(ipsec, &bond->ipsec_list, list) { + if (ipsec->xs == xs) { + list_del(&ipsec->list); + kfree(ipsec); + break; + } + } + spin_unlock_bh(&bond->ipsec_lock); + rcu_read_unlock(); +} + +static void bond_ipsec_del_sa_all(struct bonding *bond) +{ + struct net_device *bond_dev = bond->dev; + struct bond_ipsec *ipsec; + struct slave *slave; + + rcu_read_lock(); + slave = rcu_dereference(bond->curr_active_slave); + if (!slave) { + rcu_read_unlock(); + return; + } + + spin_lock_bh(&bond->ipsec_lock); + list_for_each_entry(ipsec, &bond->ipsec_list, list) { + if (!ipsec->xs->xso.real_dev) + continue; + + if (!slave->dev->xfrmdev_ops || + !slave->dev->xfrmdev_ops->xdo_dev_state_delete || + netif_is_bond_master(slave->dev)) { + slave_warn(bond_dev, slave->dev, + "%s: no slave xdo_dev_state_delete\n", + __func__); + } else { + slave->dev->xfrmdev_ops->xdo_dev_state_delete(ipsec->xs); + } + ipsec->xs->xso.real_dev = NULL; + } + spin_unlock_bh(&bond->ipsec_lock); rcu_read_unlock(); }
@@ -474,22 +570,27 @@ out: static bool bond_ipsec_offload_ok(struct sk_buff *skb, struct xfrm_state *xs) { struct net_device *bond_dev = xs->xso.dev; - struct bonding *bond = netdev_priv(bond_dev); - struct slave *curr_active = rcu_dereference(bond->curr_active_slave); - struct net_device *slave_dev = curr_active->dev; + struct net_device *real_dev; + struct slave *curr_active; + struct bonding *bond; + + bond = netdev_priv(bond_dev); + curr_active = rcu_dereference(bond->curr_active_slave); + real_dev = curr_active->dev;
if (BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP) return true;
- if (!slave_dev->xfrmdev_ops || - !slave_dev->xfrmdev_ops->xdo_dev_offload_ok || - netif_is_bond_master(slave_dev)) { - slave_warn(bond_dev, slave_dev, "%s: no slave xdo_dev_offload_ok\n", __func__); + if (!xs->xso.real_dev) + return false; + + if (!real_dev->xfrmdev_ops || + !real_dev->xfrmdev_ops->xdo_dev_offload_ok || + netif_is_bond_master(real_dev)) { return false; }
- xs->xso.real_dev = slave_dev; - return slave_dev->xfrmdev_ops->xdo_dev_offload_ok(skb, xs); + return real_dev->xfrmdev_ops->xdo_dev_offload_ok(skb, xs); }
static const struct xfrmdev_ops bond_xfrmdev_ops = { @@ -1006,8 +1107,7 @@ void bond_change_active_slave(struct bonding *bond, struct slave *new_active) return;
#ifdef CONFIG_XFRM_OFFLOAD - if (old_active && bond->xs) - bond_ipsec_del_sa(bond->xs); + bond_ipsec_del_sa_all(bond); #endif /* CONFIG_XFRM_OFFLOAD */
if (new_active) { @@ -1083,10 +1183,7 @@ void bond_change_active_slave(struct bonding *bond, struct slave *new_active) }
#ifdef CONFIG_XFRM_OFFLOAD - if (new_active && bond->xs) { - xfrm_dev_state_flush(dev_net(bond->dev), bond->dev, true); - bond_ipsec_add_sa(bond->xs); - } + bond_ipsec_add_sa_all(bond); #endif /* CONFIG_XFRM_OFFLOAD */
/* resend IGMP joins since active slave has changed or @@ -3335,6 +3432,7 @@ static int bond_master_netdev_event(unsigned long event, return bond_event_changename(event_bond); case NETDEV_UNREGISTER: bond_remove_proc_entry(event_bond); + xfrm_dev_state_flush(dev_net(bond_dev), bond_dev, true); break; case NETDEV_REGISTER: bond_create_proc_entry(event_bond); @@ -4898,7 +4996,8 @@ void bond_setup(struct net_device *bond_dev) #ifdef CONFIG_XFRM_OFFLOAD /* set up xfrm device ops (only supported in active-backup right now) */ bond_dev->xfrmdev_ops = &bond_xfrmdev_ops; - bond->xs = NULL; + INIT_LIST_HEAD(&bond->ipsec_list); + spin_lock_init(&bond->ipsec_lock); #endif /* CONFIG_XFRM_OFFLOAD */
/* don't acquire bond device's netif_tx_lock when transmitting */ diff --git a/include/net/bonding.h b/include/net/bonding.h index 019e998d944a..a02b19843819 100644 --- a/include/net/bonding.h +++ b/include/net/bonding.h @@ -201,6 +201,11 @@ struct bond_up_slave { */ #define BOND_LINK_NOCHANGE -1
+struct bond_ipsec { + struct list_head list; + struct xfrm_state *xs; +}; + /* * Here are the locking policies for the two bonding locks: * Get rcu_read_lock when reading or RTNL when writing slave list. @@ -249,7 +254,9 @@ struct bonding { #endif /* CONFIG_DEBUG_FS */ struct rtnl_link_stats64 bond_stats; #ifdef CONFIG_XFRM_OFFLOAD - struct xfrm_state *xs; + struct list_head ipsec_list; + /* protecting ipsec_list */ + spinlock_t ipsec_lock; #endif /* CONFIG_XFRM_OFFLOAD */ };
From: Taehee Yoo ap420073@gmail.com
[ Upstream commit 955b785ec6b3b2f9b91914d6eeac8ee66ee29239 ]
To dereference bond->curr_active_slave, it uses rcu_dereference(). But it and the caller doesn't acquire RCU so a warning occurs. So add rcu_read_lock().
Splat looks like: WARNING: suspicious RCU usage 5.13.0-rc6+ #1179 Not tainted drivers/net/bonding/bond_main.c:571 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1 1 lock held by ping/974: #0: ffff888109e7db70 (sk_lock-AF_INET){+.+.}-{0:0}, at: raw_sendmsg+0x1303/0x2cb0
stack backtrace: CPU: 2 PID: 974 Comm: ping Not tainted 5.13.0-rc6+ #1179 Call Trace: dump_stack+0xa4/0xe5 bond_ipsec_offload_ok+0x1f4/0x260 [bonding] xfrm_output+0x179/0x890 xfrm4_output+0xfa/0x410 ? __xfrm4_output+0x4b0/0x4b0 ? __ip_make_skb+0xecc/0x2030 ? xfrm4_udp_encap_rcv+0x800/0x800 ? ip_local_out+0x21/0x3a0 ip_send_skb+0x37/0xa0 raw_sendmsg+0x1bfd/0x2cb0
Fixes: 18cb261afd7b ("bonding: support hardware encryption offload to slaves") Signed-off-by: Taehee Yoo ap420073@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/bonding/bond_main.c | 22 ++++++++++++++++------ 1 file changed, 16 insertions(+), 6 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 3f67b4b794ac..d267791a06c0 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -573,24 +573,34 @@ static bool bond_ipsec_offload_ok(struct sk_buff *skb, struct xfrm_state *xs) struct net_device *real_dev; struct slave *curr_active; struct bonding *bond; + int err;
bond = netdev_priv(bond_dev); + rcu_read_lock(); curr_active = rcu_dereference(bond->curr_active_slave); real_dev = curr_active->dev;
- if (BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP) - return true; + if (BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP) { + err = true; + goto out; + }
- if (!xs->xso.real_dev) - return false; + if (!xs->xso.real_dev) { + err = false; + goto out; + }
if (!real_dev->xfrmdev_ops || !real_dev->xfrmdev_ops->xdo_dev_offload_ok || netif_is_bond_master(real_dev)) { - return false; + err = false; + goto out; }
- return real_dev->xfrmdev_ops->xdo_dev_offload_ok(skb, xs); + err = real_dev->xfrmdev_ops->xdo_dev_offload_ok(skb, xs); +out: + rcu_read_unlock(); + return err; }
static const struct xfrmdev_ops bond_xfrmdev_ops = {
From: Taehee Yoo ap420073@gmail.com
[ Upstream commit 168e696a36792a4a3b2525a06249e7472ef90186 ]
bond_ipsec_offload_ok() is called to check whether the interface supports ipsec offload or not. bonding interface support ipsec offload only in active-backup mode. So, if a bond interface is not in active-backup mode, it should return false but it returns true.
Fixes: a3b658cfb664 ("bonding: allow xfrm offload setup post-module-load") Signed-off-by: Taehee Yoo ap420073@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/bonding/bond_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index d267791a06c0..bf8ade982940 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -581,7 +581,7 @@ static bool bond_ipsec_offload_ok(struct sk_buff *skb, struct xfrm_state *xs) real_dev = curr_active->dev;
if (BOND_MODE(bond) != BOND_MODE_ACTIVEBACKUP) { - err = true; + err = false; goto out; }
From: Nicolas Dichtel nicolas.dichtel@6wind.com
[ Upstream commit ccd27f05ae7b8ebc40af5b004e94517a919aa862 ]
The goal of commit df789fe75206 ("ipv6: Provide ipv6 version of "disable_policy" sysctl") was to have the disable_policy from ipv4 available on ipv6. However, it's not exactly the same mechanism. On IPv4, all packets coming from an interface, which has disable_policy set, bypass the policy check. For ipv6, this is done only for local packets, ie for packets destinated to an address configured on the incoming interface.
Let's align ipv6 with ipv4 so that the 'disable_policy' sysctl has the same effect for both protocols.
My first approach was to create a new kind of route cache entries, to be able to set DST_NOPOLICY without modifying routes. This would have added a lot of code. Because the local delivery path is already handled, I choose to focus on the forwarding path to minimize code churn.
Fixes: df789fe75206 ("ipv6: Provide ipv6 version of "disable_policy" sysctl") Signed-off-by: Nicolas Dichtel nicolas.dichtel@6wind.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv6/ip6_output.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c index 497974b4372a..b7ffb4f227a4 100644 --- a/net/ipv6/ip6_output.c +++ b/net/ipv6/ip6_output.c @@ -479,7 +479,9 @@ int ip6_forward(struct sk_buff *skb) if (skb_warn_if_lro(skb)) goto drop;
- if (!xfrm6_policy_check(NULL, XFRM_POLICY_FWD, skb)) { + if (!net->ipv6.devconf_all->disable_policy && + !idev->cnf.disable_policy && + !xfrm6_policy_check(NULL, XFRM_POLICY_FWD, skb)) { __IP6_INC_STATS(net, idev, IPSTATS_MIB_INDISCARDS); goto drop; }
From: YueHaibing yuehaibing@huawei.com
[ Upstream commit eca81f09145d765c21dd8fb1ba5d874ca255c32c ]
The "plat->phy_interface" variable is an enum and in this context GCC will treat it as an unsigned int so the error handling is never triggered.
Fixes: b9f0b2f634c0 ("net: stmmac: platform: fix probe for ACPI devices") Signed-off-by: YueHaibing yuehaibing@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c index a696ada013eb..cad9e466353f 100644 --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_platform.c @@ -399,6 +399,7 @@ stmmac_probe_config_dt(struct platform_device *pdev, u8 *mac) struct device_node *np = pdev->dev.of_node; struct plat_stmmacenet_data *plat; struct stmmac_dma_cfg *dma_cfg; + int phy_mode; void *ret; int rc;
@@ -414,10 +415,11 @@ stmmac_probe_config_dt(struct platform_device *pdev, u8 *mac) eth_zero_addr(mac); }
- plat->phy_interface = device_get_phy_mode(&pdev->dev); - if (plat->phy_interface < 0) - return ERR_PTR(plat->phy_interface); + phy_mode = device_get_phy_mode(&pdev->dev); + if (phy_mode < 0) + return ERR_PTR(phy_mode);
+ plat->phy_interface = phy_mode; plat->interface = stmmac_of_get_mac_mode(np); if (plat->interface < 0) plat->interface = plat->phy_interface;
From: Hangbin Liu liuhangbin@gmail.com
[ Upstream commit 24b671aad4eae423e1abf5b7f08d9a5235458b8d ]
If the kernel doesn't enable option CONFIG_IPV6_SUBTREES, the RTA_SRC info will not be exported to userspace in rt6_fill_node(). And ip cmd will not print "from ::" to the route output. So remove this check.
Fixes: ec8105352869 ("selftests: Add redirect tests") Signed-off-by: Hangbin Liu liuhangbin@gmail.com Reviewed-by: David Ahern dsahern@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/net/icmp_redirect.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/icmp_redirect.sh b/tools/testing/selftests/net/icmp_redirect.sh index bf361f30d6ef..bfcabee50155 100755 --- a/tools/testing/selftests/net/icmp_redirect.sh +++ b/tools/testing/selftests/net/icmp_redirect.sh @@ -311,7 +311,7 @@ check_exception()
if [ "$with_redirect" = "yes" ]; then ip -netns h1 -6 ro get ${H1_VRF_ARG} ${H2_N2_IP6} | \ - grep -q "${H2_N2_IP6} from :: via ${R2_LLADDR} dev br0.*${mtu}" + grep -q "${H2_N2_IP6} .*via ${R2_LLADDR} dev br0.*${mtu}" elif [ -n "${mtu}" ]; then ip -netns h1 -6 ro get ${H1_VRF_ARG} ${H2_N2_IP6} | \ grep -q "${mtu}"
From: Hangbin Liu liuhangbin@gmail.com
[ Upstream commit 0e02bf5de46ae30074a2e1a8194a422a84482a1a ]
After redirecting, it's already a new path. So the old PMTU info should be cleared. The IPv6 test "mtu exception plus redirect" should only has redirect info without old PMTU.
The IPv4 test can not be changed because of legacy.
Fixes: ec8105352869 ("selftests: Add redirect tests") Signed-off-by: Hangbin Liu liuhangbin@gmail.com Reviewed-by: David Ahern dsahern@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/net/icmp_redirect.sh | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/net/icmp_redirect.sh b/tools/testing/selftests/net/icmp_redirect.sh index bfcabee50155..104a7a5f13b1 100755 --- a/tools/testing/selftests/net/icmp_redirect.sh +++ b/tools/testing/selftests/net/icmp_redirect.sh @@ -309,9 +309,10 @@ check_exception() fi log_test $? 0 "IPv4: ${desc}"
- if [ "$with_redirect" = "yes" ]; then + # No PMTU info for test "redirect" and "mtu exception plus redirect" + if [ "$with_redirect" = "yes" ] && [ "$desc" != "redirect exception plus mtu" ]; then ip -netns h1 -6 ro get ${H1_VRF_ARG} ${H2_N2_IP6} | \ - grep -q "${H2_N2_IP6} .*via ${R2_LLADDR} dev br0.*${mtu}" + grep -v "mtu" | grep -q "${H2_N2_IP6} .*via ${R2_LLADDR} dev br0" elif [ -n "${mtu}" ]; then ip -netns h1 -6 ro get ${H1_VRF_ARG} ${H2_N2_IP6} | \ grep -q "${mtu}"
From: Uwe Kleine-König u.kleine-koenig@pengutronix.de
[ Upstream commit 65e2e6c1c20104ed19060a38f4edbf14e9f9a9a5 ]
As the last call to sprd_pwm_apply() might have exited early if state->enabled was false, the values for period and duty_cycle stored in pwm->state might not have been written to hardware and it must be ensured that they are configured before enabling the PWM.
Fixes: 8aae4b02e8a6 ("pwm: sprd: Add Spreadtrum PWM support") Signed-off-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Signed-off-by: Thierry Reding thierry.reding@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pwm/pwm-sprd.c | 11 ++++------- 1 file changed, 4 insertions(+), 7 deletions(-)
diff --git a/drivers/pwm/pwm-sprd.c b/drivers/pwm/pwm-sprd.c index 98c479dfae31..3041f0b3bbb6 100644 --- a/drivers/pwm/pwm-sprd.c +++ b/drivers/pwm/pwm-sprd.c @@ -183,13 +183,10 @@ static int sprd_pwm_apply(struct pwm_chip *chip, struct pwm_device *pwm, } }
- if (state->period != cstate->period || - state->duty_cycle != cstate->duty_cycle) { - ret = sprd_pwm_config(spc, pwm, state->duty_cycle, - state->period); - if (ret) - return ret; - } + ret = sprd_pwm_config(spc, pwm, state->duty_cycle, + state->period); + if (ret) + return ret;
sprd_pwm_write(spc, pwm->hwpwm, SPRD_PWM_ENABLE, 1); } else if (cstate->enabled) {
From: Shahjada Abul Husain shahjada@chelsio.com
[ Upstream commit 015fe6fd29c4b9ac0f61b8c4455ef88e6018b9cc ]
IRQs are requested during driver's ndo_open() and then later freed up in disable_interrupts() during driver unload. A race exists where driver can set the CXGB4_FULL_INIT_DONE flag in ndo_open() after the disable_interrupts() in driver unload path checks it, and hence misses calling free_irq().
Fix by unregistering netdevice first and sync with driver's ndo_open(). This ensures disable_interrupts() checks the flag correctly and frees up the IRQs properly.
Fixes: b37987e8db5f ("cxgb4: Disable interrupts and napi before unregistering netdev") Signed-off-by: Shahjada Abul Husain shahjada@chelsio.com Signed-off-by: Raju Rangoju rajur@chelsio.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/chelsio/cxgb4/cxgb4_main.c | 18 ++++++++++-------- drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c | 3 +++ 2 files changed, 13 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c index 762113a04dde..9f62ffe64781 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c @@ -2643,6 +2643,9 @@ static void detach_ulds(struct adapter *adap) { unsigned int i;
+ if (!is_uld(adap)) + return; + mutex_lock(&uld_mutex); list_del(&adap->list_node);
@@ -7141,10 +7144,13 @@ static void remove_one(struct pci_dev *pdev) */ destroy_workqueue(adapter->workq);
- if (is_uld(adapter)) { - detach_ulds(adapter); - t4_uld_clean_up(adapter); - } + detach_ulds(adapter); + + for_each_port(adapter, i) + if (adapter->port[i]->reg_state == NETREG_REGISTERED) + unregister_netdev(adapter->port[i]); + + t4_uld_clean_up(adapter);
adap_free_hma_mem(adapter);
@@ -7152,10 +7158,6 @@ static void remove_one(struct pci_dev *pdev)
cxgb4_free_mps_ref_entries(adapter);
- for_each_port(adapter, i) - if (adapter->port[i]->reg_state == NETREG_REGISTERED) - unregister_netdev(adapter->port[i]); - debugfs_remove_recursive(adapter->debugfs_root);
if (!is_t4(adapter->params.chip)) diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c index 743af9e654aa..17faac715882 100644 --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_uld.c @@ -581,6 +581,9 @@ void t4_uld_clean_up(struct adapter *adap) { unsigned int i;
+ if (!is_uld(adap)) + return; + mutex_lock(&uld_mutex); for (i = 0; i < CXGB4_ULD_MAX; i++) { if (!adap->uld[i].handle)
From: Zack Rusin zackr@vmware.com
[ Upstream commit 34bd46bcf3de72cbffcdc42d3fa67e543d1c869b ]
Change 2ef4fb92363c ("drm/vmwgfx: Make sure bo's are unpinned before putting them back") caused a conflict in one of the drm trees and the merge commit 68a32ba14177 ("Merge tag 'drm-next-2021-04-28' of git://anongit.freedesktop.org/drm/drm") accidently re-added code that the original change was removing. Fixed by removing the incorrect buffer unpin - it has already been unpinned two lines above.
Fixes: 68a32ba14177 ("Merge tag 'drm-next-2021-04-28' of git://anongit.freedesktop.org/drm/drm") Signed-off-by: Zack Rusin zackr@vmware.com Reviewed-by: Martin Krastev krastevm@vmware.com Link: https://patchwork.freedesktop.org/patch/msgid/20210615182336.995192-4-zackr@... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/vmwgfx/vmwgfx_mob.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/drivers/gpu/drm/vmwgfx/vmwgfx_mob.c b/drivers/gpu/drm/vmwgfx/vmwgfx_mob.c index 5648664f71bc..f2d625415458 100644 --- a/drivers/gpu/drm/vmwgfx/vmwgfx_mob.c +++ b/drivers/gpu/drm/vmwgfx/vmwgfx_mob.c @@ -354,7 +354,6 @@ static void vmw_otable_batch_takedown(struct vmw_private *dev_priv, ttm_bo_unpin(bo); ttm_bo_unreserve(bo);
- ttm_bo_unpin(batch->otable_bo); ttm_bo_put(batch->otable_bo); batch->otable_bo = NULL; }
From: Jianguo Wu wujianguo@chinatelecom.cn
[ Upstream commit 0c71929b5893e410e0efbe1bbeca6f19a5f19956 ]
I did stress test with wrk[1] and webfsd[2] with the assistance of mptcp-tools[3]:
Server side: ./use_mptcp.sh webfsd -4 -R /tmp/ -p 8099 Client side: ./use_mptcp.sh wrk -c 200 -d 30 -t 4 http://192.168.174.129:8099/
and got the following warning message:
[ 55.552626] TCP: request_sock_subflow: Possible SYN flooding on port 8099. Sending cookies. Check SNMP counters. [ 55.553024] ------------[ cut here ]------------ [ 55.553027] WARNING: CPU: 0 PID: 10 at net/core/flow_dissector.c:984 __skb_flow_dissect+0x280/0x1650 ... [ 55.553117] CPU: 0 PID: 10 Comm: ksoftirqd/0 Not tainted 5.12.0+ #18 [ 55.553121] Hardware name: VMware, Inc. VMware Virtual Platform/440BX Desktop Reference Platform, BIOS 6.00 02/27/2020 [ 55.553124] RIP: 0010:__skb_flow_dissect+0x280/0x1650 ... [ 55.553133] RSP: 0018:ffffb79580087770 EFLAGS: 00010246 [ 55.553137] RAX: 0000000000000000 RBX: ffffffff8ddb58e0 RCX: ffffb79580087888 [ 55.553139] RDX: ffffffff8ddb58e0 RSI: ffff8f7e4652b600 RDI: 0000000000000000 [ 55.553141] RBP: ffffb79580087858 R08: 0000000000000000 R09: 0000000000000008 [ 55.553143] R10: 000000008c622965 R11: 00000000d3313a5b R12: ffff8f7e4652b600 [ 55.553146] R13: ffff8f7e465c9062 R14: 0000000000000000 R15: ffffb79580087888 [ 55.553149] FS: 0000000000000000(0000) GS:ffff8f7f75e00000(0000) knlGS:0000000000000000 [ 55.553152] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 55.553154] CR2: 00007f73d1d19000 CR3: 0000000135e10004 CR4: 00000000003706f0 [ 55.553160] Call Trace: [ 55.553166] ? __sha256_final+0x67/0xd0 [ 55.553173] ? sha256+0x7e/0xa0 [ 55.553177] __skb_get_hash+0x57/0x210 [ 55.553182] subflow_init_req_cookie_join_save+0xac/0xc0 [ 55.553189] subflow_check_req+0x474/0x550 [ 55.553195] ? ip_route_output_key_hash+0x67/0x90 [ 55.553200] ? xfrm_lookup_route+0x1d/0xa0 [ 55.553207] subflow_v4_route_req+0x8e/0xd0 [ 55.553212] tcp_conn_request+0x31e/0xab0 [ 55.553218] ? selinux_socket_sock_rcv_skb+0x116/0x210 [ 55.553224] ? tcp_rcv_state_process+0x179/0x6d0 [ 55.553229] tcp_rcv_state_process+0x179/0x6d0 [ 55.553235] tcp_v4_do_rcv+0xaf/0x220 [ 55.553239] tcp_v4_rcv+0xce4/0xd80 [ 55.553243] ? ip_route_input_rcu+0x246/0x260 [ 55.553248] ip_protocol_deliver_rcu+0x35/0x1b0 [ 55.553253] ip_local_deliver_finish+0x44/0x50 [ 55.553258] ip_local_deliver+0x6c/0x110 [ 55.553262] ? ip_rcv_finish_core.isra.19+0x5a/0x400 [ 55.553267] ip_rcv+0xd1/0xe0 ...
After debugging, I found in __skb_flow_dissect(), skb->dev and skb->sk are both NULL, then net is NULL, and trigger WARN_ON_ONCE(!net), actually net is always NULL in this code path, as skb->dev is set to NULL in tcp_v4_rcv(), and skb->sk is never set.
Code snippet in __skb_flow_dissect() that trigger warning: 975 if (skb) { 976 if (!net) { 977 if (skb->dev) 978 net = dev_net(skb->dev); 979 else if (skb->sk) 980 net = sock_net(skb->sk); 981 } 982 } 983 984 WARN_ON_ONCE(!net);
So, using seq and transport header derived hash.
[1] https://github.com/wg/wrk [2] https://github.com/ourway/webfsd [3] https://github.com/pabeni/mptcp-tools
Fixes: 9466a1ccebbe ("mptcp: enable JOIN requests even if cookies are in use") Suggested-by: Paolo Abeni pabeni@redhat.com Suggested-by: Florian Westphal fw@strlen.de Signed-off-by: Jianguo Wu wujianguo@chinatelecom.cn Signed-off-by: Mat Martineau mathew.j.martineau@linux.intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/mptcp/syncookies.c | 16 +++++++++++++++- 1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/net/mptcp/syncookies.c b/net/mptcp/syncookies.c index abe0fd099746..37127781aee9 100644 --- a/net/mptcp/syncookies.c +++ b/net/mptcp/syncookies.c @@ -37,7 +37,21 @@ static spinlock_t join_entry_locks[COOKIE_JOIN_SLOTS] __cacheline_aligned_in_smp
static u32 mptcp_join_entry_hash(struct sk_buff *skb, struct net *net) { - u32 i = skb_get_hash(skb) ^ net_hash_mix(net); + static u32 mptcp_join_hash_secret __read_mostly; + struct tcphdr *th = tcp_hdr(skb); + u32 seq, i; + + net_get_random_once(&mptcp_join_hash_secret, + sizeof(mptcp_join_hash_secret)); + + if (th->syn) + seq = TCP_SKB_CB(skb)->seq; + else + seq = TCP_SKB_CB(skb)->seq - 1; + + i = jhash_3words(seq, net_hash_mix(net), + (__force __u32)th->source << 16 | (__force __u32)th->dest, + mptcp_join_hash_secret);
return i % ARRAY_SIZE(join_entries); }
From: Jianguo Wu wujianguo@chinatelecom.cn
[ Upstream commit 030d37bd1cd2443a1f21db47eb301899bfa45a2a ]
In subflow_check_req(), if subflow sport is mismatch, will put msk, destroy token, and destruct req, then return -EPERM, which can be done by subflow_req_destructor() via:
tcp_conn_request() |--__reqsk_free() |--subflow_req_destructor()
So we should remove these redundant code, otherwise will call tcp_v4_reqsk_destructor() twice, and may double free inet_rsk(req)->ireq_opt.
Fixes: 5bc56388c74f ("mptcp: add port number check for MP_JOIN") Signed-off-by: Jianguo Wu wujianguo@chinatelecom.cn Signed-off-by: Mat Martineau mathew.j.martineau@linux.intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/mptcp/subflow.c | 5 ----- 1 file changed, 5 deletions(-)
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index cbc452d0901e..5493c851ca6c 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -212,11 +212,6 @@ again: ntohs(inet_sk(sk_listener)->inet_sport), ntohs(inet_sk((struct sock *)subflow_req->msk)->inet_sport)); if (!mptcp_pm_sport_in_anno_list(subflow_req->msk, sk_listener)) { - sock_put((struct sock *)subflow_req->msk); - mptcp_token_destroy_request(req); - tcp_request_sock_ops.destructor(req); - subflow_req->msk = NULL; - subflow_req->mp_join = 0; SUBFLOW_REQ_INC_STATS(req, MPTCP_MIB_MISMATCHPORTSYNRX); return -EPERM; }
From: Jianguo Wu wujianguo@chinatelecom.cn
[ Upstream commit 8547ea5f52dd8ef19b69c25c41b1415481b3503b ]
Lots of "TCP: tcp_fin: Impossible, sk->sk_state=7" in client side when doing stress testing using wrk and webfsd.
There are at least two cases may trigger this warning: 1.mptcp is in syncookie, and server recv MP_JOIN SYN request, in subflow_check_req(), the mptcp_can_accept_new_subflow() return false, so subflow_init_req_cookie_join_save() isn't called, i.e. not store the data present in the MP_JOIN syn request and the random nonce in hash table - join_entries[], but still send synack. When recv 3rd-ack, mptcp_token_join_cookie_init_state() will return false, and 3rd-ack is dropped, then if mptcp conn is closed by client, client will send a DATA_FIN and a MPTCP FIN, the DATA_FIN doesn't have MP_CAPABLE or MP_JOIN, so mptcp_subflow_init_cookie_req() will return 0, and pass the cookie check, MP_JOIN request is fallback to normal TCP. Server will send a TCP FIN if closed, in client side, when process TCP FIN, it will do reset, the code path is: tcp_data_queue()->mptcp_incoming_options() ->check_fully_established()->mptcp_subflow_reset(). mptcp_subflow_reset() will set sock state to TCP_CLOSE, so tcp_fin will hit TCP_CLOSE, and print the warning.
2.mptcp is in syncookie, and server recv 3rd-ack, in mptcp_subflow_init_cookie_req(), mptcp_can_accept_new_subflow() return false, and subflow_req->mp_join is not set to 1, so in subflow_syn_recv_sock() will not reset the MP_JOIN subflow, but fallback to normal TCP, and then the same thing happens when server will send a TCP FIN if closed.
For case1, subflow_check_req() return -EPERM, then tcp_conn_request() will drop MP_JOIN SYN.
For case2, let subflow_syn_recv_sock() call mptcp_can_accept_new_subflow(), and do fatal fallback, send reset.
Fixes: 9466a1ccebbe ("mptcp: enable JOIN requests even if cookies are in use") Signed-off-by: Jianguo Wu wujianguo@chinatelecom.cn Signed-off-by: Mat Martineau mathew.j.martineau@linux.intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/mptcp/subflow.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index 5493c851ca6c..5221cfce5390 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -223,6 +223,8 @@ again: if (unlikely(req->syncookie)) { if (mptcp_can_accept_new_subflow(subflow_req->msk)) subflow_init_req_cookie_join_save(subflow_req, skb); + else + return -EPERM; }
pr_debug("token=%u, remote_nonce=%u msk=%p", subflow_req->token, @@ -262,9 +264,7 @@ int mptcp_subflow_init_cookie_req(struct request_sock *req, if (!mptcp_token_join_cookie_init_state(subflow_req, skb)) return -EINVAL;
- if (mptcp_can_accept_new_subflow(subflow_req->msk)) - subflow_req->mp_join = 1; - + subflow_req->mp_join = 1; subflow_req->ssn_offset = TCP_SKB_CB(skb)->seq - 1; }
From: Geliang Tang geliangtang@gmail.com
[ Upstream commit c863225b79426459feca2ef5b0cc2f07e8e68771 ]
This patch added a new parameter name sk in mptcp_get_options().
Acked-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Geliang Tang geliangtang@gmail.com Signed-off-by: Mat Martineau mathew.j.martineau@linux.intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/mptcp/options.c | 5 +++-- net/mptcp/protocol.h | 3 ++- net/mptcp/subflow.c | 10 +++++----- 3 files changed, 10 insertions(+), 8 deletions(-)
diff --git a/net/mptcp/options.c b/net/mptcp/options.c index b87e46f515fb..72b1067d5aa2 100644 --- a/net/mptcp/options.c +++ b/net/mptcp/options.c @@ -323,7 +323,8 @@ static void mptcp_parse_option(const struct sk_buff *skb, } }
-void mptcp_get_options(const struct sk_buff *skb, +void mptcp_get_options(const struct sock *sk, + const struct sk_buff *skb, struct mptcp_options_received *mp_opt) { const struct tcphdr *th = tcp_hdr(skb); @@ -1010,7 +1011,7 @@ void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb) return; }
- mptcp_get_options(skb, &mp_opt); + mptcp_get_options(sk, skb, &mp_opt); if (!check_fully_established(msk, sk, subflow, skb, &mp_opt)) return;
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index 7b634568f49c..f74258377c05 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -576,7 +576,8 @@ int __init mptcp_proto_v6_init(void); struct sock *mptcp_sk_clone(const struct sock *sk, const struct mptcp_options_received *mp_opt, struct request_sock *req); -void mptcp_get_options(const struct sk_buff *skb, +void mptcp_get_options(const struct sock *sk, + const struct sk_buff *skb, struct mptcp_options_received *mp_opt);
void mptcp_finish_connect(struct sock *sk); diff --git a/net/mptcp/subflow.c b/net/mptcp/subflow.c index 5221cfce5390..78e787ef8fff 100644 --- a/net/mptcp/subflow.c +++ b/net/mptcp/subflow.c @@ -150,7 +150,7 @@ static int subflow_check_req(struct request_sock *req, return -EINVAL; #endif
- mptcp_get_options(skb, &mp_opt); + mptcp_get_options(sk_listener, skb, &mp_opt);
if (mp_opt.mp_capable) { SUBFLOW_REQ_INC_STATS(req, MPTCP_MIB_MPCAPABLEPASSIVE); @@ -244,7 +244,7 @@ int mptcp_subflow_init_cookie_req(struct request_sock *req, int err;
subflow_init_req(req, sk_listener); - mptcp_get_options(skb, &mp_opt); + mptcp_get_options(sk_listener, skb, &mp_opt);
if (mp_opt.mp_capable && mp_opt.mp_join) return -EINVAL; @@ -403,7 +403,7 @@ static void subflow_finish_connect(struct sock *sk, const struct sk_buff *skb) subflow->ssn_offset = TCP_SKB_CB(skb)->seq; pr_debug("subflow=%p synack seq=%x", subflow, subflow->ssn_offset);
- mptcp_get_options(skb, &mp_opt); + mptcp_get_options(sk, skb, &mp_opt); if (subflow->request_mptcp) { if (!mp_opt.mp_capable) { MPTCP_INC_STATS(sock_net(sk), @@ -650,7 +650,7 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk, * reordered MPC will cause fallback, but we don't have other * options. */ - mptcp_get_options(skb, &mp_opt); + mptcp_get_options(sk, skb, &mp_opt); if (!mp_opt.mp_capable) { fallback = true; goto create_child; @@ -660,7 +660,7 @@ static struct sock *subflow_syn_recv_sock(const struct sock *sk, if (!new_msk) fallback = true; } else if (subflow_req->mp_join) { - mptcp_get_options(skb, &mp_opt); + mptcp_get_options(sk, skb, &mp_opt); if (!mp_opt.mp_join || !subflow_hmac_valid(req, &mp_opt) || !mptcp_can_accept_new_subflow(subflow_req->msk)) { SUBFLOW_REQ_INC_STATS(req, MPTCP_MIB_JOINACKMAC);
From: Jianguo Wu wujianguo@chinatelecom.cn
[ Upstream commit 6787b7e350d3552651a3422d3d8980fbc8d65368 ]
If check_fully_established() causes a subflow reset, it should not continue to process the packet in tcp_data_queue(). Add a return value to mptcp_incoming_options(), and return false if a subflow has been reset, else return true. Then drop the packet in tcp_data_queue()/tcp_rcv_state_process() if mptcp_incoming_options() return false.
Fixes: d582484726c4 ("mptcp: fix fallback for MP_JOIN subflows") Signed-off-by: Jianguo Wu wujianguo@chinatelecom.cn Signed-off-by: Mat Martineau mathew.j.martineau@linux.intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- include/net/mptcp.h | 5 +++-- net/ipv4/tcp_input.c | 19 +++++++++++++++---- net/mptcp/options.c | 19 +++++++++++++------ 3 files changed, 31 insertions(+), 12 deletions(-)
diff --git a/include/net/mptcp.h b/include/net/mptcp.h index 83f23774b908..f1d798ff29e9 100644 --- a/include/net/mptcp.h +++ b/include/net/mptcp.h @@ -101,7 +101,7 @@ bool mptcp_synack_options(const struct request_sock *req, unsigned int *size, bool mptcp_established_options(struct sock *sk, struct sk_buff *skb, unsigned int *size, unsigned int remaining, struct mptcp_out_options *opts); -void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb); +bool mptcp_incoming_options(struct sock *sk, struct sk_buff *skb);
void mptcp_write_options(__be32 *ptr, const struct tcp_sock *tp, struct mptcp_out_options *opts); @@ -223,9 +223,10 @@ static inline bool mptcp_established_options(struct sock *sk, return false; }
-static inline void mptcp_incoming_options(struct sock *sk, +static inline bool mptcp_incoming_options(struct sock *sk, struct sk_buff *skb) { + return true; }
static inline void mptcp_skb_ext_move(struct sk_buff *to, diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index 6bd628f08ded..0f1b4bfddfd4 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -4247,6 +4247,9 @@ void tcp_reset(struct sock *sk, struct sk_buff *skb) { trace_tcp_receive_reset(sk);
+ /* mptcp can't tell us to ignore reset pkts, + * so just ignore the return value of mptcp_incoming_options(). + */ if (sk_is_mptcp(sk)) mptcp_incoming_options(sk, skb);
@@ -4941,8 +4944,13 @@ static void tcp_data_queue(struct sock *sk, struct sk_buff *skb) bool fragstolen; int eaten;
- if (sk_is_mptcp(sk)) - mptcp_incoming_options(sk, skb); + /* If a subflow has been reset, the packet should not continue + * to be processed, drop the packet. + */ + if (sk_is_mptcp(sk) && !mptcp_incoming_options(sk, skb)) { + __kfree_skb(skb); + return; + }
if (TCP_SKB_CB(skb)->seq == TCP_SKB_CB(skb)->end_seq) { __kfree_skb(skb); @@ -6522,8 +6530,11 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb) case TCP_CLOSING: case TCP_LAST_ACK: if (!before(TCP_SKB_CB(skb)->seq, tp->rcv_nxt)) { - if (sk_is_mptcp(sk)) - mptcp_incoming_options(sk, skb); + /* If a subflow has been reset, the packet should not + * continue to be processed, drop the packet. + */ + if (sk_is_mptcp(sk) && !mptcp_incoming_options(sk, skb)) + goto discard; break; } fallthrough; diff --git a/net/mptcp/options.c b/net/mptcp/options.c index 72b1067d5aa2..4f08e04e1ab7 100644 --- a/net/mptcp/options.c +++ b/net/mptcp/options.c @@ -990,7 +990,8 @@ static bool add_addr_hmac_valid(struct mptcp_sock *msk, return hmac == mp_opt->ahmac; }
-void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb) +/* Return false if a subflow has been reset, else return true */ +bool mptcp_incoming_options(struct sock *sk, struct sk_buff *skb) { struct mptcp_subflow_context *subflow = mptcp_subflow_ctx(sk); struct mptcp_sock *msk = mptcp_sk(subflow->conn); @@ -1008,12 +1009,16 @@ void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb) __mptcp_check_push(subflow->conn, sk); __mptcp_data_acked(subflow->conn); mptcp_data_unlock(subflow->conn); - return; + return true; }
mptcp_get_options(sk, skb, &mp_opt); + + /* The subflow can be in close state only if check_fully_established() + * just sent a reset. If so, tell the caller to ignore the current packet. + */ if (!check_fully_established(msk, sk, subflow, skb, &mp_opt)) - return; + return sk->sk_state != TCP_CLOSE;
if (mp_opt.fastclose && msk->local_key == mp_opt.rcvr_key) { @@ -1055,7 +1060,7 @@ void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb) }
if (!mp_opt.dss) - return; + return true;
/* we can't wait for recvmsg() to update the ack_seq, otherwise * monodirectional flows will stuck @@ -1074,12 +1079,12 @@ void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb) schedule_work(&msk->work)) sock_hold(subflow->conn);
- return; + return true; }
mpext = skb_ext_add(skb, SKB_EXT_MPTCP); if (!mpext) - return; + return true;
memset(mpext, 0, sizeof(*mpext));
@@ -1104,6 +1109,8 @@ void mptcp_incoming_options(struct sock *sk, struct sk_buff *skb) mpext->data_len = mp_opt.data_len; mpext->use_map = 1; } + + return true; }
static void mptcp_set_rwin(const struct tcp_sock *tp)
From: Jianguo Wu wujianguo@chinatelecom.cn
[ Upstream commit a7da441621c7945fbfd43ed239c93b8073cda502 ]
After patch "mptcp: fix syncookie process if mptcp can not_accept new subflow", if subflow is limited, MP_JOIN SYN is dropped, and no SYN/ACK will be replied.
So in case "multiple subflows limited by server", the expected SYN/ACK number should be 1.
Fixes: 00587187ad30 ("selftests: mptcp: add test cases for mptcp join tests with syn cookies") Reported-by: kernel test robot oliver.sang@intel.com Signed-off-by: Jianguo Wu wujianguo@chinatelecom.cn Signed-off-by: Mat Martineau mathew.j.martineau@linux.intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/net/mptcp/mptcp_join.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/mptcp/mptcp_join.sh b/tools/testing/selftests/net/mptcp/mptcp_join.sh index fd99485cf2a4..e8ac852c6ff6 100755 --- a/tools/testing/selftests/net/mptcp/mptcp_join.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh @@ -1341,7 +1341,7 @@ syncookies_tests() ip netns exec $ns2 ./pm_nl_ctl add 10.0.3.2 flags subflow ip netns exec $ns2 ./pm_nl_ctl add 10.0.2.2 flags subflow run_tests $ns1 $ns2 10.0.1.1 - chk_join_nr "subflows limited by server w cookies" 2 2 1 + chk_join_nr "subflows limited by server w cookies" 2 1 1
# test signal address with cookies reset_with_cookies
From: Paolo Abeni pabeni@redhat.com
[ Upstream commit 75e908c33615999abe1f3a8429d25dea30d28e4e ]
There are a bunch of callsite where the ssk socket lock is acquired using the full-blown version eligible for the fast variant. Let's move to the latter.
Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Mat Martineau mathew.j.martineau@linux.intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/mptcp/pm_netlink.c | 10 ++++++---- net/mptcp/protocol.c | 15 +++++++++------ 2 files changed, 15 insertions(+), 10 deletions(-)
diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c index 3f5d90a20235..fce1d057d19e 100644 --- a/net/mptcp/pm_netlink.c +++ b/net/mptcp/pm_netlink.c @@ -540,6 +540,7 @@ void mptcp_pm_nl_addr_send_ack(struct mptcp_sock *msk) subflow = list_first_entry_or_null(&msk->conn_list, typeof(*subflow), node); if (subflow) { struct sock *ssk = mptcp_subflow_tcp_sock(subflow); + bool slow;
spin_unlock_bh(&msk->pm.lock); pr_debug("send ack for %s%s%s", @@ -547,9 +548,9 @@ void mptcp_pm_nl_addr_send_ack(struct mptcp_sock *msk) mptcp_pm_should_add_signal_ipv6(msk) ? " [ipv6]" : "", mptcp_pm_should_add_signal_port(msk) ? " [port]" : "");
- lock_sock(ssk); + slow = lock_sock_fast(ssk); tcp_send_ack(ssk); - release_sock(ssk); + unlock_sock_fast(ssk, slow); spin_lock_bh(&msk->pm.lock); } } @@ -566,6 +567,7 @@ int mptcp_pm_nl_mp_prio_send_ack(struct mptcp_sock *msk, struct sock *ssk = mptcp_subflow_tcp_sock(subflow); struct sock *sk = (struct sock *)msk; struct mptcp_addr_info local; + bool slow;
local_address((struct sock_common *)ssk, &local); if (!addresses_equal(&local, addr, addr->port)) @@ -578,9 +580,9 @@ int mptcp_pm_nl_mp_prio_send_ack(struct mptcp_sock *msk,
spin_unlock_bh(&msk->pm.lock); pr_debug("send ack for mp_prio"); - lock_sock(ssk); + slow = lock_sock_fast(ssk); tcp_send_ack(ssk); - release_sock(ssk); + unlock_sock_fast(ssk, slow); spin_lock_bh(&msk->pm.lock);
return 0; diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 8ead550df8b1..0f36fefcc77e 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -424,23 +424,25 @@ static void mptcp_send_ack(struct mptcp_sock *msk)
mptcp_for_each_subflow(msk, subflow) { struct sock *ssk = mptcp_subflow_tcp_sock(subflow); + bool slow;
- lock_sock(ssk); + slow = lock_sock_fast(ssk); if (tcp_can_send_ack(ssk)) tcp_send_ack(ssk); - release_sock(ssk); + unlock_sock_fast(ssk, slow); } }
static bool mptcp_subflow_cleanup_rbuf(struct sock *ssk) { + bool slow; int ret;
- lock_sock(ssk); + slow = lock_sock_fast(ssk); ret = tcp_can_send_ack(ssk); if (ret) tcp_cleanup_rbuf(ssk, 1); - release_sock(ssk); + unlock_sock_fast(ssk, slow); return ret; }
@@ -2288,13 +2290,14 @@ static void mptcp_check_fastclose(struct mptcp_sock *msk)
list_for_each_entry_safe(subflow, tmp, &msk->conn_list, node) { struct sock *tcp_sk = mptcp_subflow_tcp_sock(subflow); + bool slow;
- lock_sock(tcp_sk); + slow = lock_sock_fast(tcp_sk); if (tcp_sk->sk_state != TCP_CLOSE) { tcp_send_active_reset(tcp_sk, GFP_ATOMIC); tcp_set_state(tcp_sk, TCP_CLOSE); } - release_sock(tcp_sk); + unlock_sock_fast(tcp_sk, slow); }
inet_sk_state_store(sk, TCP_CLOSE);
From: Paolo Abeni pabeni@redhat.com
[ Upstream commit fde56eea01f96b664eb63033990be0fd2a945da5 ]
The current cleanup rbuf tries a bit too hard to avoid acquiring the subflow socket lock. We may end-up delaying the needed ack, or skip acking a blocked subflow.
Address the above extending the conditions used to trigger the cleanup to reflect more closely what TCP does and invoking tcp_cleanup_rbuf() on all the active subflows.
Note that we can't replicate the exact tests implemented in tcp_cleanup_rbuf(), as MPTCP lacks some of the required info - e.g. ping-pong mode.
Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Mat Martineau mathew.j.martineau@linux.intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/mptcp/protocol.c | 56 ++++++++++++++++++-------------------------- net/mptcp/protocol.h | 1 - 2 files changed, 23 insertions(+), 34 deletions(-)
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 0f36fefcc77e..18f152bdb66f 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -433,49 +433,46 @@ static void mptcp_send_ack(struct mptcp_sock *msk) } }
-static bool mptcp_subflow_cleanup_rbuf(struct sock *ssk) +static void mptcp_subflow_cleanup_rbuf(struct sock *ssk) { bool slow; - int ret;
slow = lock_sock_fast(ssk); - ret = tcp_can_send_ack(ssk); - if (ret) + if (tcp_can_send_ack(ssk)) tcp_cleanup_rbuf(ssk, 1); unlock_sock_fast(ssk, slow); - return ret; +} + +static bool mptcp_subflow_could_cleanup(const struct sock *ssk, bool rx_empty) +{ + const struct inet_connection_sock *icsk = inet_csk(ssk); + bool ack_pending = READ_ONCE(icsk->icsk_ack.pending); + const struct tcp_sock *tp = tcp_sk(ssk); + + return (ack_pending & ICSK_ACK_SCHED) && + ((READ_ONCE(tp->rcv_nxt) - READ_ONCE(tp->rcv_wup) > + READ_ONCE(icsk->icsk_ack.rcv_mss)) || + (rx_empty && ack_pending & + (ICSK_ACK_PUSHED2 | ICSK_ACK_PUSHED))); }
static void mptcp_cleanup_rbuf(struct mptcp_sock *msk) { - struct sock *ack_hint = READ_ONCE(msk->ack_hint); int old_space = READ_ONCE(msk->old_wspace); struct mptcp_subflow_context *subflow; struct sock *sk = (struct sock *)msk; - bool cleanup; + int space = __mptcp_space(sk); + bool cleanup, rx_empty;
- /* this is a simple superset of what tcp_cleanup_rbuf() implements - * so that we don't have to acquire the ssk socket lock most of the time - * to do actually nothing - */ - cleanup = __mptcp_space(sk) - old_space >= max(0, old_space); - if (!cleanup) - return; + cleanup = (space > 0) && (space >= (old_space << 1)); + rx_empty = !atomic_read(&sk->sk_rmem_alloc);
- /* if the hinted ssk is still active, try to use it */ - if (likely(ack_hint)) { - mptcp_for_each_subflow(msk, subflow) { - struct sock *ssk = mptcp_subflow_tcp_sock(subflow); + mptcp_for_each_subflow(msk, subflow) { + struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
- if (ack_hint == ssk && mptcp_subflow_cleanup_rbuf(ssk)) - return; - } + if (cleanup || mptcp_subflow_could_cleanup(ssk, rx_empty)) + mptcp_subflow_cleanup_rbuf(ssk); } - - /* otherwise pick the first active subflow */ - mptcp_for_each_subflow(msk, subflow) - if (mptcp_subflow_cleanup_rbuf(mptcp_subflow_tcp_sock(subflow))) - return; }
static bool mptcp_check_data_fin(struct sock *sk) @@ -620,7 +617,6 @@ static bool __mptcp_move_skbs_from_subflow(struct mptcp_sock *msk, break; } } while (more_data_avail); - WRITE_ONCE(msk->ack_hint, ssk);
*bytes += moved; return done; @@ -1955,7 +1951,6 @@ static bool __mptcp_move_skbs(struct mptcp_sock *msk) __mptcp_update_rmem(sk); done = __mptcp_move_skbs_from_subflow(msk, ssk, &moved); mptcp_data_unlock(sk); - tcp_cleanup_rbuf(ssk, moved);
if (unlikely(ssk->sk_err)) __mptcp_error_report(sk); @@ -1971,7 +1966,6 @@ static bool __mptcp_move_skbs(struct mptcp_sock *msk) ret |= __mptcp_ofo_queue(msk); __mptcp_splice_receive_queue(sk); mptcp_data_unlock(sk); - mptcp_cleanup_rbuf(msk); } if (ret) mptcp_check_data_fin((struct sock *)msk); @@ -2216,9 +2210,6 @@ static void __mptcp_close_ssk(struct sock *sk, struct sock *ssk, if (ssk == msk->last_snd) msk->last_snd = NULL;
- if (ssk == msk->ack_hint) - msk->ack_hint = NULL; - if (ssk == msk->first) msk->first = NULL;
@@ -2433,7 +2424,6 @@ static int __mptcp_init_sock(struct sock *sk) msk->tx_pending_data = 0; msk->size_goal_cache = TCP_BASE_MSS;
- msk->ack_hint = NULL; msk->first = NULL; inet_csk(sk)->icsk_sync_mss = mptcp_sync_mss;
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index f74258377c05..f842c832f6b0 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -236,7 +236,6 @@ struct mptcp_sock { bool rcv_fastclose; bool use_64bit_ack; /* Set when we received a 64-bit DSN */ spinlock_t join_list_lock; - struct sock *ack_hint; struct work_struct work; struct sk_buff *ooo_last_skb; struct rb_root out_of_order_queue;
From: Paolo Abeni pabeni@redhat.com
[ Upstream commit ce599c516386f09ca30848a1a4eb93d3fffbe187 ]
After commit 879526030c8b ("mptcp: protect the rx path with the msk socket spinlock") the rmem currently used by a given msk is really sk_rmem_alloc - rmem_released.
The safety check in mptcp_data_ready() does not take the above in due account, as a result legit incoming data is kept in subflow receive queue with no reason, delaying or blocking MPTCP-level ack generation.
This change addresses the issue introducing a new helper to fetch the rmem memory and using it as needed. Additionally add a MIB counter for the exceptional event described above - the peer is misbehaving.
Finally, introduce the required annotation when rmem_released is updated.
Fixes: 879526030c8b ("mptcp: protect the rx path with the msk socket spinlock") Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/211 Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Mat Martineau mathew.j.martineau@linux.intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/mptcp/mib.c | 1 + net/mptcp/mib.h | 1 + net/mptcp/protocol.c | 12 +++++++----- net/mptcp/protocol.h | 10 +++++++++- 4 files changed, 18 insertions(+), 6 deletions(-)
diff --git a/net/mptcp/mib.c b/net/mptcp/mib.c index eb2dc6dbe212..c8f4823cd79f 100644 --- a/net/mptcp/mib.c +++ b/net/mptcp/mib.c @@ -42,6 +42,7 @@ static const struct snmp_mib mptcp_snmp_list[] = { SNMP_MIB_ITEM("RmSubflow", MPTCP_MIB_RMSUBFLOW), SNMP_MIB_ITEM("MPPrioTx", MPTCP_MIB_MPPRIOTX), SNMP_MIB_ITEM("MPPrioRx", MPTCP_MIB_MPPRIORX), + SNMP_MIB_ITEM("RcvPruned", MPTCP_MIB_RCVPRUNED), SNMP_MIB_SENTINEL };
diff --git a/net/mptcp/mib.h b/net/mptcp/mib.h index f0da4f060fe1..93fa7c95e206 100644 --- a/net/mptcp/mib.h +++ b/net/mptcp/mib.h @@ -35,6 +35,7 @@ enum linux_mptcp_mib_field { MPTCP_MIB_RMSUBFLOW, /* Remove a subflow */ MPTCP_MIB_MPPRIOTX, /* Transmit a MP_PRIO */ MPTCP_MIB_MPPRIORX, /* Received a MP_PRIO */ + MPTCP_MIB_RCVPRUNED, /* Incoming packet dropped due to memory limit */ __MPTCP_MIB_MAX };
diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c index 18f152bdb66f..94b707a39bc3 100644 --- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -465,7 +465,7 @@ static void mptcp_cleanup_rbuf(struct mptcp_sock *msk) bool cleanup, rx_empty;
cleanup = (space > 0) && (space >= (old_space << 1)); - rx_empty = !atomic_read(&sk->sk_rmem_alloc); + rx_empty = !__mptcp_rmem(sk);
mptcp_for_each_subflow(msk, subflow) { struct sock *ssk = mptcp_subflow_tcp_sock(subflow); @@ -714,8 +714,10 @@ void mptcp_data_ready(struct sock *sk, struct sock *ssk) sk_rbuf = ssk_rbuf;
/* over limit? can't append more skbs to msk, Also, no need to wake-up*/ - if (atomic_read(&sk->sk_rmem_alloc) > sk_rbuf) + if (__mptcp_rmem(sk) > sk_rbuf) { + MPTCP_INC_STATS(sock_net(sk), MPTCP_MIB_RCVPRUNED); return; + }
/* Wake-up the reader only for in-sequence data */ mptcp_data_lock(sk); @@ -1799,7 +1801,7 @@ static int __mptcp_recvmsg_mskq(struct mptcp_sock *msk, if (!(flags & MSG_PEEK)) { /* we will bulk release the skb memory later */ skb->destructor = NULL; - msk->rmem_released += skb->truesize; + WRITE_ONCE(msk->rmem_released, msk->rmem_released + skb->truesize); __skb_unlink(skb, &msk->receive_queue); __kfree_skb(skb); } @@ -1918,7 +1920,7 @@ static void __mptcp_update_rmem(struct sock *sk)
atomic_sub(msk->rmem_released, &sk->sk_rmem_alloc); sk_mem_uncharge(sk, msk->rmem_released); - msk->rmem_released = 0; + WRITE_ONCE(msk->rmem_released, 0); }
static void __mptcp_splice_receive_queue(struct sock *sk) @@ -2420,7 +2422,7 @@ static int __mptcp_init_sock(struct sock *sk) msk->out_of_order_queue = RB_ROOT; msk->first_pending = NULL; msk->wmem_reserved = 0; - msk->rmem_released = 0; + WRITE_ONCE(msk->rmem_released, 0); msk->tx_pending_data = 0; msk->size_goal_cache = TCP_BASE_MSS;
diff --git a/net/mptcp/protocol.h b/net/mptcp/protocol.h index f842c832f6b0..dc5b71de0a9a 100644 --- a/net/mptcp/protocol.h +++ b/net/mptcp/protocol.h @@ -290,9 +290,17 @@ static inline struct mptcp_sock *mptcp_sk(const struct sock *sk) return (struct mptcp_sock *)sk; }
+/* the msk socket don't use the backlog, also account for the bulk + * free memory + */ +static inline int __mptcp_rmem(const struct sock *sk) +{ + return atomic_read(&sk->sk_rmem_alloc) - READ_ONCE(mptcp_sk(sk)->rmem_released); +} + static inline int __mptcp_space(const struct sock *sk) { - return tcp_space(sk) + READ_ONCE(mptcp_sk(sk)->rmem_released); + return tcp_win_from_space(sk, READ_ONCE(sk->sk_rcvbuf) - __mptcp_rmem(sk)); }
static inline struct mptcp_data_frag *mptcp_send_head(const struct sock *sk)
From: Marek Behún kabel@kernel.org
[ Upstream commit a5de4be0aaaa66a2fa98e8a33bdbed3bd0682804 ]
It seems that we cannot differentiate 88X3310 from 88X3340 by simply looking at bit 3 of revision ID. This only works on revisions A0 and A1. On revision B0, this bit is always 1.
Instead use the 3.d00d register for differentiation, since this register contains information about number of ports on the device.
Fixes: 9885d016ffa9 ("net: phy: marvell10g: add separate structure for 88X3340") Signed-off-by: Marek Behún kabel@kernel.org Reported-by: Matteo Croce mcroce@linux.microsoft.com Tested-by: Matteo Croce mcroce@microsoft.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/marvell10g.c | 40 +++++++++++++++++++++++++++++++----- include/linux/marvell_phy.h | 6 +----- 2 files changed, 36 insertions(+), 10 deletions(-)
diff --git a/drivers/net/phy/marvell10g.c b/drivers/net/phy/marvell10g.c index bbbc6ac8fa82..53a433442803 100644 --- a/drivers/net/phy/marvell10g.c +++ b/drivers/net/phy/marvell10g.c @@ -78,6 +78,11 @@ enum { /* Temperature read register (88E2110 only) */ MV_PCS_TEMP = 0x8042,
+ /* Number of ports on the device */ + MV_PCS_PORT_INFO = 0xd00d, + MV_PCS_PORT_INFO_NPORTS_MASK = 0x0380, + MV_PCS_PORT_INFO_NPORTS_SHIFT = 7, + /* These registers appear at 0x800X and 0xa00X - the 0xa00X control * registers appear to set themselves to the 0x800X when AN is * restarted, but status registers appear readable from either. @@ -966,6 +971,30 @@ static const struct mv3310_chip mv2111_type = { #endif };
+static int mv3310_get_number_of_ports(struct phy_device *phydev) +{ + int ret; + + ret = phy_read_mmd(phydev, MDIO_MMD_PCS, MV_PCS_PORT_INFO); + if (ret < 0) + return ret; + + ret &= MV_PCS_PORT_INFO_NPORTS_MASK; + ret >>= MV_PCS_PORT_INFO_NPORTS_SHIFT; + + return ret + 1; +} + +static int mv3310_match_phy_device(struct phy_device *phydev) +{ + return mv3310_get_number_of_ports(phydev) == 1; +} + +static int mv3340_match_phy_device(struct phy_device *phydev) +{ + return mv3310_get_number_of_ports(phydev) == 4; +} + static int mv211x_match_phy_device(struct phy_device *phydev, bool has_5g) { int val; @@ -994,7 +1023,8 @@ static int mv2111_match_phy_device(struct phy_device *phydev) static struct phy_driver mv3310_drivers[] = { { .phy_id = MARVELL_PHY_ID_88X3310, - .phy_id_mask = MARVELL_PHY_ID_88X33X0_MASK, + .phy_id_mask = MARVELL_PHY_ID_MASK, + .match_phy_device = mv3310_match_phy_device, .name = "mv88x3310", .driver_data = &mv3310_type, .get_features = mv3310_get_features, @@ -1011,8 +1041,9 @@ static struct phy_driver mv3310_drivers[] = { .set_loopback = genphy_c45_loopback, }, { - .phy_id = MARVELL_PHY_ID_88X3340, - .phy_id_mask = MARVELL_PHY_ID_88X33X0_MASK, + .phy_id = MARVELL_PHY_ID_88X3310, + .phy_id_mask = MARVELL_PHY_ID_MASK, + .match_phy_device = mv3340_match_phy_device, .name = "mv88x3340", .driver_data = &mv3340_type, .get_features = mv3310_get_features, @@ -1069,8 +1100,7 @@ static struct phy_driver mv3310_drivers[] = { module_phy_driver(mv3310_drivers);
static struct mdio_device_id __maybe_unused mv3310_tbl[] = { - { MARVELL_PHY_ID_88X3310, MARVELL_PHY_ID_88X33X0_MASK }, - { MARVELL_PHY_ID_88X3340, MARVELL_PHY_ID_88X33X0_MASK }, + { MARVELL_PHY_ID_88X3310, MARVELL_PHY_ID_MASK }, { MARVELL_PHY_ID_88E2110, MARVELL_PHY_ID_MASK }, { }, }; diff --git a/include/linux/marvell_phy.h b/include/linux/marvell_phy.h index acee44b9db26..0f06c2287b52 100644 --- a/include/linux/marvell_phy.h +++ b/include/linux/marvell_phy.h @@ -22,14 +22,10 @@ #define MARVELL_PHY_ID_88E1545 0x01410ea0 #define MARVELL_PHY_ID_88E1548P 0x01410ec0 #define MARVELL_PHY_ID_88E3016 0x01410e60 +#define MARVELL_PHY_ID_88X3310 0x002b09a0 #define MARVELL_PHY_ID_88E2110 0x002b09b0 #define MARVELL_PHY_ID_88X2222 0x01410f10
-/* PHY IDs and mask for Alaska 10G PHYs */ -#define MARVELL_PHY_ID_88X33X0_MASK 0xfffffff8 -#define MARVELL_PHY_ID_88X3310 0x002b09a0 -#define MARVELL_PHY_ID_88X3340 0x002b09a8 - /* Marvel 88E1111 in Finisar SFP module with modified PHY ID */ #define MARVELL_PHY_ID_88E1111_FINISAR 0x01ff0cc0
From: Casey Chen cachen@purestorage.com
[ Upstream commit 251ef6f71be2adfd09546a26643426fe62585173 ]
nvme_dev_remove_admin could free dev->admin_q and the admin_tagset while they are being accessed by nvme_dev_disable(), which can be called by nvme_reset_work via nvme_remove_dead_ctrl.
Commit cb4bfda62afa ("nvme-pci: fix hot removal during error handling") intended to avoid requests being stuck on a removed controller by killing the admin queue. But the later fix c8e9e9b7646e ("nvme-pci: unquiesce admin queue on shutdown"), together with nvme_dev_disable(dev, true) right before nvme_dev_remove_admin() could help dispatch requests and fail them early, so we don't need nvme_dev_remove_admin() any more.
Fixes: cb4bfda62afa ("nvme-pci: fix hot removal during error handling") Signed-off-by: Casey Chen cachen@purestorage.com Reviewed-by: Keith Busch kbusch@kernel.org Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/pci.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 42ad75ff1348..c625da463330 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -2998,7 +2998,6 @@ static void nvme_remove(struct pci_dev *pdev) if (!pci_device_is_present(pdev)) { nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_DEAD); nvme_dev_disable(dev, true); - nvme_dev_remove_admin(dev); }
flush_work(&dev->ctrl.reset_work);
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit e56c6bbd98dc1cefb6f9c5d795fd29016e4f2fe7 ]
The point with a *dev and a *brport_dev is that when we have a LAG net device that is a bridge port, *dev is an ocelot net device and *brport_dev is the bonding/team net device. The ocelot net device beneath the LAG does not exist from the bridge's perspective, so we need to sync the switchdev objects belonging to the brport_dev and not to the dev.
Fixes: e4bd44e89dcf ("net: ocelot: replay switchdev events when joining bridge") Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mscc/ocelot_net.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/mscc/ocelot_net.c b/drivers/net/ethernet/mscc/ocelot_net.c index aad33d22c33f..3dc577183a40 100644 --- a/drivers/net/ethernet/mscc/ocelot_net.c +++ b/drivers/net/ethernet/mscc/ocelot_net.c @@ -1287,6 +1287,7 @@ static int ocelot_netdevice_lag_leave(struct net_device *dev, }
static int ocelot_netdevice_changeupper(struct net_device *dev, + struct net_device *brport_dev, struct netdev_notifier_changeupper_info *info) { struct netlink_ext_ack *extack; @@ -1296,11 +1297,11 @@ static int ocelot_netdevice_changeupper(struct net_device *dev,
if (netif_is_bridge_master(info->upper_dev)) { if (info->linking) - err = ocelot_netdevice_bridge_join(dev, dev, + err = ocelot_netdevice_bridge_join(dev, brport_dev, info->upper_dev, extack); else - err = ocelot_netdevice_bridge_leave(dev, dev, + err = ocelot_netdevice_bridge_leave(dev, brport_dev, info->upper_dev); } if (netif_is_lag_master(info->upper_dev)) { @@ -1335,7 +1336,7 @@ ocelot_netdevice_lag_changeupper(struct net_device *dev, if (ocelot_port->bond != dev) return NOTIFY_OK;
- err = ocelot_netdevice_changeupper(lower, info); + err = ocelot_netdevice_changeupper(lower, dev, info); if (err) return notifier_from_errno(err); } @@ -1374,7 +1375,7 @@ static int ocelot_netdevice_event(struct notifier_block *unused, struct netdev_notifier_changeupper_info *info = ptr;
if (ocelot_netdevice_dev_check(dev)) - return ocelot_netdevice_changeupper(dev, info); + return ocelot_netdevice_changeupper(dev, dev, info);
if (netif_is_lag_master(dev)) return ocelot_netdevice_lag_changeupper(dev, info);
From: Íñigo Huguet ihuguet@redhat.com
[ Upstream commit f28100cb9c9645c07cbd22431278ac9492f6a01c ]
Fixes: e26ca4b53582 sfc: reduce the number of requested xdp ev queues
The buggy commit intended to allocate less channels for XDP in order to be more unlikely to reach the limit of 32 channels of the driver.
The idea was to use each IRQ/eventqeue for more XDP TX queues than before, calculating which is the maximum number of TX queues that one event queue can handle. For example, in EF10 each event queue could handle up to 8 queues, better than the 4 they were handling before the change. This way, it would have to allocate half of channels than before for XDP TX.
The problem is that the TX queues are also contained inside the channel structs, and there are only 4 queues per channel. Reducing the number of channels means also reducing the number of queues, resulting in not having the desired number of 1 queue per CPU.
This leads to getting errors on XDP_TX and XDP_REDIRECT if they're executed from a high numbered CPU, because there only exist queues for the low half of CPUs, actually. If XDP_TX/REDIRECT is executed in a low numbered CPU, the error doesn't happen. This is the error in the logs (repeated many times, even rate limited): sfc 0000:5e:00.0 ens3f0np0: XDP TX failed (-22)
This errors happens in function efx_xdp_tx_buffers, where it expects to have a dedicated XDP TX queue per CPU.
Reverting the change makes again more likely to reach the limit of 32 channels in machines with many CPUs. If this happen, no XDP_TX/REDIRECT will be possible at all, and we will have this log error messages:
At interface probe: sfc 0000:5e:00.0: Insufficient resources for 12 XDP event queues (24 other channels, max 32)
At every subsequent XDP_TX/REDIRECT failure, rate limited: sfc 0000:5e:00.0 ens3f0np0: XDP TX failed (-22)
However, without reverting the change, it makes the user to think that everything is OK at probe time, but later it fails in an unpredictable way, depending on the CPU that handles the packet.
It is better to restore the predictable behaviour. If the user sees the error message at probe time, he/she can try to configure the best way it fits his/her needs. At least, he/she will have 2 options: - Accept that XDP_TX/REDIRECT is not available (he/she may not need it) - Load sfc module with modparam 'rss_cpus' with a lower number, thus creating less normal RX queues/channels, letting more free resources for XDP, with some performance penalty.
Anyway, let the calculation of maximum TX queues that can be handled by a single event queue, and use it only if it's less than the number of TX queues per channel. This doesn't happen in practice, but could happen if some constant values are tweaked in the future, such us EFX_MAX_TXQ_PER_CHANNEL, EFX_MAX_EVQ_SIZE or EFX_MAX_DMAQ_SIZE.
Related mailing list thread: https://lore.kernel.org/bpf/20201215104327.2be76156@carbon/
Signed-off-by: Íñigo Huguet ihuguet@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/sfc/efx_channels.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/sfc/efx_channels.c b/drivers/net/ethernet/sfc/efx_channels.c index a3ca406a3561..5b71f8a03a6d 100644 --- a/drivers/net/ethernet/sfc/efx_channels.c +++ b/drivers/net/ethernet/sfc/efx_channels.c @@ -152,6 +152,7 @@ static int efx_allocate_msix_channels(struct efx_nic *efx, * maximum size. */ tx_per_ev = EFX_MAX_EVQ_SIZE / EFX_TXQ_MAX_ENT(efx); + tx_per_ev = min(tx_per_ev, EFX_MAX_TXQ_PER_CHANNEL); n_xdp_tx = num_possible_cpus(); n_xdp_ev = DIV_ROUND_UP(n_xdp_tx, tx_per_ev);
@@ -181,7 +182,7 @@ static int efx_allocate_msix_channels(struct efx_nic *efx, efx->xdp_tx_queue_count = 0; } else { efx->n_xdp_channels = n_xdp_ev; - efx->xdp_tx_per_channel = EFX_MAX_TXQ_PER_CHANNEL; + efx->xdp_tx_per_channel = tx_per_ev; efx->xdp_tx_queue_count = n_xdp_tx; n_channels += n_xdp_ev; netif_dbg(efx, drv, efx->net_dev,
From: Like Xu like.xu.linux@gmail.com
[ Upstream commit 7234c362ccb3c2228f06f19f93b132de9cfa7ae4 ]
The AMD platform does not support the functions Ah CPUID leaf. The returned results for this entry should all remain zero just like the native does:
AMD host: 0x0000000a 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00000000 (uncanny) AMD guest: 0x0000000a 0x00: eax=0x00000000 ebx=0x00000000 ecx=0x00000000 edx=0x00008000
Fixes: cadbaa039b99 ("perf/x86/intel: Make anythread filter support conditional") Signed-off-by: Like Xu likexu@tencent.com Message-Id: 20210628074354.33848-1-likexu@tencent.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/cpuid.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c index ca7866d63e98..739be5da3bca 100644 --- a/arch/x86/kvm/cpuid.c +++ b/arch/x86/kvm/cpuid.c @@ -765,7 +765,8 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
edx.split.num_counters_fixed = min(cap.num_counters_fixed, MAX_FIXED_COUNTERS); edx.split.bit_width_fixed = cap.bit_width_fixed; - edx.split.anythread_deprecated = 1; + if (cap.version) + edx.split.anythread_deprecated = 1; edx.split.reserved1 = 0; edx.split.reserved2 = 0;
From: Sean Christopherson seanjc@google.com
[ Upstream commit b4a693924aab93f3747465b2261add46c82c3220 ]
Return -EFAULT if copy_to_user() fails; if accessing user memory faults, copy_to_user() returns the number of bytes remaining, not an error code.
Reported-by: Dan Carpenter dan.carpenter@oracle.com Cc: Steve Rutherford srutherford@google.com Cc: Brijesh Singh brijesh.singh@amd.com Cc: Ashish Kalra ashish.kalra@amd.com Fixes: d3d1af85e2c7 ("KVM: SVM: Add KVM_SEND_UPDATE_DATA command") Signed-off-by: Sean Christopherson seanjc@google.com Message-Id: 20210506175826.2166383-2-seanjc@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/svm/sev.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c index 8d36f0c73071..3dc3e2897804 100644 --- a/arch/x86/kvm/svm/sev.c +++ b/arch/x86/kvm/svm/sev.c @@ -1309,8 +1309,9 @@ static int sev_send_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp) }
/* Copy packet header to userspace. */ - ret = copy_to_user((void __user *)(uintptr_t)params.hdr_uaddr, hdr, - params.hdr_len); + if (copy_to_user((void __user *)(uintptr_t)params.hdr_uaddr, hdr, + params.hdr_len)) + ret = -EFAULT;
e_free_trans_data: kfree(trans_data);
From: Sean Christopherson seanjc@google.com
[ Upstream commit c7a1b2b678c54ac19320daf525038d0e2e43ca7c ]
Use IS_ERR() instead of checking for a NULL pointer when querying for sev_pin_memory() failures. sev_pin_memory() always returns an error code cast to a pointer, or a valid pointer; it never returns NULL.
Reported-by: Dan Carpenter dan.carpenter@oracle.com Cc: Steve Rutherford srutherford@google.com Cc: Brijesh Singh brijesh.singh@amd.com Cc: Ashish Kalra ashish.kalra@amd.com Fixes: d3d1af85e2c7 ("KVM: SVM: Add KVM_SEND_UPDATE_DATA command") Fixes: 15fb7de1a7f5 ("KVM: SVM: Add KVM_SEV_RECEIVE_UPDATE_DATA command") Signed-off-by: Sean Christopherson seanjc@google.com Message-Id: 20210506175826.2166383-3-seanjc@google.com Signed-off-by: Paolo Bonzini pbonzini@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/kvm/svm/sev.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c index 3dc3e2897804..02d60d7f903d 100644 --- a/arch/x86/kvm/svm/sev.c +++ b/arch/x86/kvm/svm/sev.c @@ -1271,8 +1271,8 @@ static int sev_send_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp) /* Pin guest memory */ guest_page = sev_pin_memory(kvm, params.guest_uaddr & PAGE_MASK, PAGE_SIZE, &n, 0); - if (!guest_page) - return -EFAULT; + if (IS_ERR(guest_page)) + return PTR_ERR(guest_page);
/* allocate memory for header and transport buffer */ ret = -ENOMEM; @@ -1463,11 +1463,12 @@ static int sev_receive_update_data(struct kvm *kvm, struct kvm_sev_cmd *argp) data.trans_len = params.trans_len;
/* Pin guest memory */ - ret = -EFAULT; guest_page = sev_pin_memory(kvm, params.guest_uaddr & PAGE_MASK, PAGE_SIZE, &n, 0); - if (!guest_page) + if (IS_ERR(guest_page)) { + ret = PTR_ERR(guest_page); goto e_free_trans; + }
/* The RECEIVE_UPDATE_DATA command requires C-bit to be always set. */ data.guest_address = (page_to_pfn(guest_page[0]) << PAGE_SHIFT) + offset;
From: Mark Rutland mark.rutland@arm.com
[ Upstream commit 59f44069e0527523f27948da7b77599a73dab157 ]
Since commit:
bad1e1c663e0a72f ("arm64: mte: switch GCR_EL1 in kernel entry and exit")
we saved/restored the user GCR_EL1 value at exception boundaries, and update_gcr_el1_excl() is no longer used for this. However it is used to restore the kernel's GCR_EL1 value when returning from a suspend state. Thus, the comment is misleading (and an ISB is necessary).
When restoring the kernel's GCR value, we need an ISB to ensure this is used by subsequent instructions. We don't necessarily get an ISB by other means (e.g. if the kernel is built without support for pointer authentication). As __cpu_setup() initialised GCR_EL1.Exclude to 0xffff, until a context synchronization event, allocation tag 0 may be used rather than the desired set of tags.
This patch drops the misleading comment, adds the missing ISB, and for clarity folds update_gcr_el1_excl() into its only user.
Fixes: bad1e1c663e0 ("arm64: mte: switch GCR_EL1 in kernel entry and exit") Signed-off-by: Mark Rutland mark.rutland@arm.com Cc: Andrey Konovalov andreyknvl@gmail.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: Vincenzo Frascino vincenzo.frascino@arm.com Cc: Will Deacon will@kernel.org Link: https://lore.kernel.org/r/20210714143843.56537-2-mark.rutland@arm.com Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/kernel/mte.c | 15 ++------------- 1 file changed, 2 insertions(+), 13 deletions(-)
diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c index 125a10e413e9..23e9879a6e78 100644 --- a/arch/arm64/kernel/mte.c +++ b/arch/arm64/kernel/mte.c @@ -185,18 +185,6 @@ void mte_check_tfsr_el1(void) } #endif
-static void update_gcr_el1_excl(u64 excl) -{ - - /* - * Note that the mask controlled by the user via prctl() is an - * include while GCR_EL1 accepts an exclude mask. - * No need for ISB since this only affects EL0 currently, implicit - * with ERET. - */ - sysreg_clear_set_s(SYS_GCR_EL1, SYS_GCR_EL1_EXCL_MASK, excl); -} - static void set_gcr_el1_excl(u64 excl) { current->thread.gcr_user_excl = excl; @@ -257,7 +245,8 @@ void mte_suspend_exit(void) if (!system_supports_mte()) return;
- update_gcr_el1_excl(gcr_kernel_excl); + sysreg_clear_set_s(SYS_GCR_EL1, SYS_GCR_EL1_EXCL_MASK, gcr_kernel_excl); + isb(); }
long set_mte_ctrl(struct task_struct *task, unsigned long arg)
From: Zev Weiss zev@bewilderbeest.net
[ Upstream commit 812bae32e5d50914f75a6e036d3bde39ca86b0c3 ]
This device-tree was merged with a provisional vuart IRQ-polarity property that was still under review and ended up taking a somewhat different form. This patch updates it to match the final form of the new vuart properties, which additionally allow specifying the SIRQ number and LPC address.
Signed-off-by: Zev Weiss zev@bewilderbeest.net Reviewed-by: Andrew Jeffery andrew@aj.id.au Fixes: ca03042f0f12 ("serial: 8250_aspeed_vuart: add aspeed, lpc-io-reg and aspeed, lpc-interrupts DT properties") Reviewed-by: Joel Stanley joel@jms.id.au Link: https://lore.kernel.org/r/20210416075113.18047-1-zev@bewilderbeest.net Signed-off-by: Joel Stanley joel@jms.id.au Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/aspeed-bmc-asrock-e3c246d4i.dts | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/arm/boot/dts/aspeed-bmc-asrock-e3c246d4i.dts b/arch/arm/boot/dts/aspeed-bmc-asrock-e3c246d4i.dts index dcab6e78dfa4..8be40c8283af 100644 --- a/arch/arm/boot/dts/aspeed-bmc-asrock-e3c246d4i.dts +++ b/arch/arm/boot/dts/aspeed-bmc-asrock-e3c246d4i.dts @@ -4,6 +4,7 @@ #include "aspeed-g5.dtsi" #include <dt-bindings/gpio/aspeed-gpio.h> #include <dt-bindings/i2c/i2c.h> +#include <dt-bindings/interrupt-controller/irq.h>
/{ model = "ASRock E3C246D4I BMC"; @@ -73,7 +74,8 @@
&vuart { status = "okay"; - aspeed,sirq-active-high; + aspeed,lpc-io-reg = <0x2f8>; + aspeed,lpc-interrupts = <3 IRQ_TYPE_LEVEL_HIGH>; };
&mac0 {
From: Sudeep Holla sudeep.holla@arm.com
[ Upstream commit 5e469dac326555d2038d199a6329458cc82a34e5 ]
The bus probe callback calls the driver callback without further checking. Better be safe than sorry and refuse registration of a driver without a probe function to prevent a NULL pointer exception.
Link: https://lore.kernel.org/r/20210624095059.4010157-2-sudeep.holla@arm.com Fixes: 933c504424a2 ("firmware: arm_scmi: add scmi protocol bus to enumerate protocol devices") Reported-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Tested-by: Cristian Marussi cristian.marussi@arm.com Reviewed-by: Cristian Marussi cristian.marussi@arm.com Acked-by: Uwe Kleine-König u.kleine-koenig@pengutronix.de Signed-off-by: Sudeep Holla sudeep.holla@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/firmware/arm_scmi/bus.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/firmware/arm_scmi/bus.c b/drivers/firmware/arm_scmi/bus.c index 784cf0027da3..9184a0d5acbe 100644 --- a/drivers/firmware/arm_scmi/bus.c +++ b/drivers/firmware/arm_scmi/bus.c @@ -139,6 +139,9 @@ int scmi_driver_register(struct scmi_driver *driver, struct module *owner, { int retval;
+ if (!driver->probe) + return -EINVAL; + retval = scmi_protocol_device_request(driver->id_table); if (retval) return retval;
From: Riccardo Mancini rickyman7@gmail.com
[ Upstream commit 0967ebffe098157180a0bbd180ac90348c6e07d7 ]
ASan reports a memory leak of nsinfo during the execution of:
# perf test "31: Lookup mmap thread"
The leak is caused by a refcounted variable being replaced without dropping the refcount.
This patch makes sure that the refcnt of nsinfo is decreased when a refcounted variable is replaced with a new value.
Signed-off-by: Riccardo Mancini rickyman7@gmail.com Fixes: 27c9c3424fc217da ("perf inject: Add --buildid-all option") Cc: Ian Rogers irogers@google.com Cc: Jiri Olsa jolsa@redhat.com Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Link: http://lore.kernel.org/lkml/55223bc8821b34ccb01f92ef1401c02b6a32e61f.1626343... [ Split from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/builtin-inject.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/tools/perf/builtin-inject.c b/tools/perf/builtin-inject.c index ddccc0eb7390..614e428e4ac5 100644 --- a/tools/perf/builtin-inject.c +++ b/tools/perf/builtin-inject.c @@ -358,9 +358,10 @@ static struct dso *findnew_dso(int pid, int tid, const char *filename, dso = machine__findnew_dso_id(machine, filename, id); }
- if (dso) + if (dso) { + nsinfo__put(dso->nsinfo); dso->nsinfo = nsi; - else + } else nsinfo__put(nsi);
thread__put(thread);
From: Riccardo Mancini rickyman7@gmail.com
[ Upstream commit 2d6b74baa7147251c30a46c4996e8cc224aa2dc5 ]
ASan reports a memory leak of nsinfo during the execution of
# perf test "31: Lookup mmap thread"
The leak is caused by a refcounted variable being replaced without dropping the refcount.
This patch makes sure that the refcnt of nsinfo is decreased whenever a refcounted variable is replaced with a new value.
Signed-off-by: Riccardo Mancini rickyman7@gmail.com Fixes: bf2e710b3cb8445c ("perf maps: Lookup maps in both intitial mountns and inner mountns.") Cc: Ian Rogers irogers@google.com Cc: Jiri Olsa jolsa@redhat.com Cc: Krister Johansen kjlx@templeofstupid.com Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Link: http://lore.kernel.org/lkml/55223bc8821b34ccb01f92ef1401c02b6a32e61f.1626343... [ Split from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/map.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/tools/perf/util/map.c b/tools/perf/util/map.c index 8af693d9678c..72e7f3616157 100644 --- a/tools/perf/util/map.c +++ b/tools/perf/util/map.c @@ -192,6 +192,8 @@ struct map *map__new(struct machine *machine, u64 start, u64 len, if (!(prot & PROT_EXEC)) dso__set_loaded(dso); } + + nsinfo__put(dso->nsinfo); dso->nsinfo = nsi;
if (build_id__is_defined(bid))
From: Riccardo Mancini rickyman7@gmail.com
[ Upstream commit dedeb4be203b382ba7245d13079bc3b0f6d40c65 ]
ASan reports a memory leak of nsinfo during the execution of:
# perf test "31: Lookup mmap thread".
The leak is caused by a refcounted variable being replaced without dropping the refcount.
This patch makes sure that the refcnt of nsinfo is decreased whenever a refcounted variable is replaced with a new value.
Signed-off-by: Riccardo Mancini rickyman7@gmail.com Fixes: 544abd44c7064c8a ("perf probe: Allow placing uprobes in alternate namespaces.") Cc: Ian Rogers irogers@google.com Cc: Jiri Olsa jolsa@redhat.com Cc: Krister Johansen kjlx@templeofstupid.com Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Link: http://lore.kernel.org/lkml/55223bc8821b34ccb01f92ef1401c02b6a32e61f.1626343... [ Split from a larger patch ] Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/probe-event.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/perf/util/probe-event.c b/tools/perf/util/probe-event.c index a78c8d59a555..9cc89a047b15 100644 --- a/tools/perf/util/probe-event.c +++ b/tools/perf/util/probe-event.c @@ -180,8 +180,10 @@ struct map *get_target_map(const char *target, struct nsinfo *nsi, bool user) struct map *map;
map = dso__new_map(target); - if (map && map->dso) + if (map && map->dso) { + nsinfo__put(map->dso->nsinfo); map->dso->nsinfo = nsinfo__get(nsi); + } return map; } else { return kernel_get_module_map(target);
From: Riccardo Mancini rickyman7@gmail.com
[ Upstream commit 42db3d9ded555f7148b5695109a7dc8d66f0dde4 ]
ASan reports a memory leak in perf_env while running:
# perf test "41: Session topology"
Caused by sibling_dies not being freed.
This patch adds the required free.
Fixes: acae8b36cded0ee6 ("perf header: Add die information in CPU topology") Signed-off-by: Riccardo Mancini rickyman7@gmail.com Cc: Ian Rogers irogers@google.com Cc: Jiri Olsa jolsa@redhat.com Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Link: http://lore.kernel.org/lkml/2140d0b57656e4eb9021ca9772250c24c032924b.1626343... Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/env.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c index bc5e4f294e9e..f3b90412cc70 100644 --- a/tools/perf/util/env.c +++ b/tools/perf/util/env.c @@ -186,6 +186,7 @@ void perf_env__exit(struct perf_env *env) zfree(&env->cpuid); zfree(&env->cmdline); zfree(&env->cmdline_argv); + zfree(&env->sibling_dies); zfree(&env->sibling_cores); zfree(&env->sibling_threads); zfree(&env->pmu_mappings);
From: Riccardo Mancini rickyman7@gmail.com
[ Upstream commit 233f2dc1c284337286f9a64c0152236779a42f6c ]
ASan reports a memory leak related to session->evlist while running:
# perf test "41: Session topology".
When perf_data is in write mode, session->evlist is owned by the caller, which should also take care of deleting it.
This patch adds the missing evlist__delete().
Signed-off-by: Riccardo Mancini rickyman7@gmail.com Fixes: c84974ed9fb67293 ("perf test: Add entry to test cpu topology") Cc: Ian Rogers irogers@google.com Cc: Jiri Olsa jolsa@redhat.com Cc: Kan Liang kan.liang@intel.com Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Link: http://lore.kernel.org/lkml/822f741f06eb25250fb60686cf30a35f447e9e91.1626343... Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/tests/topology.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/tools/perf/tests/topology.c b/tools/perf/tests/topology.c index ec4e3b21b831..b5efe675b321 100644 --- a/tools/perf/tests/topology.c +++ b/tools/perf/tests/topology.c @@ -61,6 +61,7 @@ static int session_write_header(char *path) TEST_ASSERT_VAL("failed to write header", !perf_session__write_header(session, session->evlist, data.file.fd, true));
+ evlist__delete(session->evlist); perf_session__delete(session);
return 0;
From: Riccardo Mancini rickyman7@gmail.com
[ Upstream commit fc56f54f6fcd5337634f4545af6459613129b432 ]
ASan reports a memory leak when running:
# perf test "49: Synthesize attr update"
Caused by evlist not being deleted.
This patch adds the missing evlist__delete and removes the perf_cpu_map__put since it's already being deleted by evlist__delete.
Signed-off-by: Riccardo Mancini rickyman7@gmail.com Fixes: a6e5281780d1da65 ("perf tools: Add event_update event unit type") Cc: Ian Rogers irogers@google.com Cc: Jiri Olsa jolsa@redhat.com Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Link: http://lore.kernel.org/lkml/f7994ad63d248f7645f901132d208fadf9f2b7e4.1626343... Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/tests/event_update.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/tests/event_update.c b/tools/perf/tests/event_update.c index 656218179222..932ab0740d11 100644 --- a/tools/perf/tests/event_update.c +++ b/tools/perf/tests/event_update.c @@ -118,6 +118,6 @@ int test__event_update(struct test *test __maybe_unused, int subtest __maybe_unu TEST_ASSERT_VAL("failed to synthesize attr update cpus", !perf_event__synthesize_event_update_cpus(&tmp.tool, evsel, process_event_cpus));
- perf_cpu_map__put(evsel->core.own_cpus); + evlist__delete(evlist); return 0; }
From: Riccardo Mancini rickyman7@gmail.com
[ Upstream commit dccfca926c351ba0893af4c8b481477bdb2881a4 ]
ASan reports a memory leak while running:
# perf test "49: Synthesize attr update"
Caused by a string being duplicated but never freed.
This patch adds the missing free().
Note that evsel->unit is not deallocated together with evsel since it is supposed to be a constant string.
Signed-off-by: Riccardo Mancini rickyman7@gmail.com Fixes: a6e5281780d1da65 ("perf tools: Add event_update event unit type") Cc: Ian Rogers irogers@google.com Cc: Jiri Olsa jolsa@redhat.com Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Link: http://lore.kernel.org/lkml/1fbc8158663fb0d4d5392e36bae564f6ad60be3c.1626343... Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/tests/event_update.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/perf/tests/event_update.c b/tools/perf/tests/event_update.c index 932ab0740d11..44a50527f9d9 100644 --- a/tools/perf/tests/event_update.c +++ b/tools/perf/tests/event_update.c @@ -88,6 +88,7 @@ int test__event_update(struct test *test __maybe_unused, int subtest __maybe_unu struct evsel *evsel; struct event_name tmp; struct evlist *evlist = evlist__new_default(); + char *unit = strdup("KRAVA");
TEST_ASSERT_VAL("failed to get evlist", evlist);
@@ -98,7 +99,7 @@ int test__event_update(struct test *test __maybe_unused, int subtest __maybe_unu
perf_evlist__id_add(&evlist->core, &evsel->core, 0, 0, 123);
- evsel->unit = strdup("KRAVA"); + evsel->unit = unit;
TEST_ASSERT_VAL("failed to synthesize attr update unit", !perf_event__synthesize_event_update_unit(NULL, evsel, process_event_unit)); @@ -118,6 +119,7 @@ int test__event_update(struct test *test __maybe_unused, int subtest __maybe_unu TEST_ASSERT_VAL("failed to synthesize attr update cpus", !perf_event__synthesize_event_update_cpus(&tmp.tool, evsel, process_event_cpus));
+ free(unit); evlist__delete(evlist); return 0; }
From: Riccardo Mancini rickyman7@gmail.com
[ Upstream commit 581e295a0f6b5c2931d280259fbbfff56959faa9 ]
ASan reports a memory leak when running:
# perf test "65: maps__merge_in".
The causes of the leaks are two, this patch addresses only the first one, which is related to dso__new_map().
The bug is that dso__new_map() creates a new dso but never decreases the refcount it gets from creating it.
This patch adds the missing dso__put().
Signed-off-by: Riccardo Mancini rickyman7@gmail.com Fixes: d3a7c489c7fd2463 ("perf tools: Reference count struct dso") Cc: Ian Rogers irogers@google.com Cc: Jiri Olsa jolsa@redhat.com Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Link: http://lore.kernel.org/lkml/60bfe0cd06e89e2ca33646eb8468d7f5de2ee597.1626343... Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/dso.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/perf/util/dso.c b/tools/perf/util/dso.c index d786cf6b0cfa..ee15db2be2f4 100644 --- a/tools/perf/util/dso.c +++ b/tools/perf/util/dso.c @@ -1154,8 +1154,10 @@ struct map *dso__new_map(const char *name) struct map *map = NULL; struct dso *dso = dso__new(name);
- if (dso) + if (dso) { map = map__new2(0, dso); + dso__put(dso); + }
return map; }
From: Riccardo Mancini rickyman7@gmail.com
[ Upstream commit 244d1797c8c8e850b8de7992af713aa5c70d5650 ]
ASan reports a memory leak when running:
# perf test "65: maps__merge_in"
This is the second and final patch addressing these memory leaks.
This time, the problem is simply that the maps object is never destructed.
This patch adds the missing maps__exit call.
Signed-off-by: Riccardo Mancini rickyman7@gmail.com Fixes: 79b6bb73f888933c ("perf maps: Merge 'struct maps' with 'struct map_groups'") Cc: Ian Rogers irogers@google.com Cc: Jiri Olsa jolsa@redhat.com Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Link: http://lore.kernel.org/lkml/a1a29b97a58738987d150e94d4ebfad0282fb038.1626343... Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/tests/maps.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/tools/perf/tests/maps.c b/tools/perf/tests/maps.c index edcbc70ff9d6..1ac72919fa35 100644 --- a/tools/perf/tests/maps.c +++ b/tools/perf/tests/maps.c @@ -116,5 +116,7 @@ int test__maps__merge_in(struct test *t __maybe_unused, int subtest __maybe_unus
ret = check_maps(merged3, ARRAY_SIZE(merged3), &maps); TEST_ASSERT_VAL("merge check failed", !ret); + + maps__exit(&maps); return TEST_OK; }
From: Riccardo Mancini rickyman7@gmail.com
[ Upstream commit da6b7c6c0626901428245f65712385805e42eba6 ]
ASan reports memory leaks while running:
# perf test "83: Zstd perf.data compression/decompression"
The first of the leaks is caused by env->cpu_pmu_caps not being freed.
This patch adds the missing (z)free inside perf_env__exit.
Signed-off-by: Riccardo Mancini rickyman7@gmail.com Fixes: 6f91ea283a1ed23e ("perf header: Support CPU PMU capabilities") Cc: Ian Rogers irogers@google.com Cc: Jiri Olsa jolsa@redhat.com Cc: Kan Liang kan.liang@linux.intel.com Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Link: http://lore.kernel.org/lkml/6ba036a8220156ec1f3d6be3e5d25920f6145028.1626343... Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/env.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/tools/perf/util/env.c b/tools/perf/util/env.c index f3b90412cc70..16a111b62cc3 100644 --- a/tools/perf/util/env.c +++ b/tools/perf/util/env.c @@ -191,6 +191,7 @@ void perf_env__exit(struct perf_env *env) zfree(&env->sibling_threads); zfree(&env->pmu_mappings); zfree(&env->cpu); + zfree(&env->cpu_pmu_caps); zfree(&env->numa_map);
for (i = 0; i < env->nr_numa_nodes; i++)
From: Riccardo Mancini rickyman7@gmail.com
[ Upstream commit a37338aad8c4d8676173ead14e881d2ec308155c ]
ASan reports the memory leak of the strings allocated by sort_help() when running perf report.
This patch changes the returned pointer to char* (instead of const char*), saves it in a temporary variable, and finally deallocates it at function exit.
Signed-off-by: Riccardo Mancini rickyman7@gmail.com Fixes: 702fb9b415e7c99b ("perf report: Show all sort keys in help output") Cc: Andi Kleen ak@linux.intel.com Cc: Ian Rogers irogers@google.com Cc: Jiri Olsa jolsa@redhat.com Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Link: http://lore.kernel.org/lkml/a38b13f02812a8a6759200b9063c6191337f44d4.1626343... Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/builtin-report.c | 33 ++++++++++++++++++++++----------- tools/perf/util/sort.c | 2 +- tools/perf/util/sort.h | 2 +- 3 files changed, 24 insertions(+), 13 deletions(-)
diff --git a/tools/perf/builtin-report.c b/tools/perf/builtin-report.c index 36f9ccfeb38a..ce420f910ff8 100644 --- a/tools/perf/builtin-report.c +++ b/tools/perf/builtin-report.c @@ -1167,6 +1167,8 @@ int cmd_report(int argc, const char **argv) .annotation_opts = annotation__default_options, .skip_empty = true, }; + char *sort_order_help = sort_help("sort by key(s):"); + char *field_order_help = sort_help("output field(s): overhead period sample "); const struct option options[] = { OPT_STRING('i', "input", &input_name, "file", "input file name"), @@ -1201,9 +1203,9 @@ int cmd_report(int argc, const char **argv) OPT_BOOLEAN(0, "header-only", &report.header_only, "Show only data header."), OPT_STRING('s', "sort", &sort_order, "key[,key2...]", - sort_help("sort by key(s):")), + sort_order_help), OPT_STRING('F', "fields", &field_order, "key[,keys...]", - sort_help("output field(s): overhead period sample ")), + field_order_help), OPT_BOOLEAN(0, "show-cpu-utilization", &symbol_conf.show_cpu_utilization, "Show sample percentage for different cpu modes"), OPT_BOOLEAN_FLAG(0, "showcpuutilization", &symbol_conf.show_cpu_utilization, @@ -1336,11 +1338,11 @@ int cmd_report(int argc, const char **argv) char sort_tmp[128];
if (ret < 0) - return ret; + goto exit;
ret = perf_config(report__config, &report); if (ret) - return ret; + goto exit;
argc = parse_options(argc, argv, options, report_usage, 0); if (argc) { @@ -1354,8 +1356,10 @@ int cmd_report(int argc, const char **argv) report.symbol_filter_str = argv[0]; }
- if (annotate_check_args(&report.annotation_opts) < 0) - return -EINVAL; + if (annotate_check_args(&report.annotation_opts) < 0) { + ret = -EINVAL; + goto exit; + }
if (report.mmaps_mode) report.tasks_mode = true; @@ -1369,12 +1373,14 @@ int cmd_report(int argc, const char **argv) if (symbol_conf.vmlinux_name && access(symbol_conf.vmlinux_name, R_OK)) { pr_err("Invalid file: %s\n", symbol_conf.vmlinux_name); - return -EINVAL; + ret = -EINVAL; + goto exit; } if (symbol_conf.kallsyms_name && access(symbol_conf.kallsyms_name, R_OK)) { pr_err("Invalid file: %s\n", symbol_conf.kallsyms_name); - return -EINVAL; + ret = -EINVAL; + goto exit; }
if (report.inverted_callchain) @@ -1398,12 +1404,14 @@ int cmd_report(int argc, const char **argv)
repeat: session = perf_session__new(&data, false, &report.tool); - if (IS_ERR(session)) - return PTR_ERR(session); + if (IS_ERR(session)) { + ret = PTR_ERR(session); + goto exit; + }
ret = evswitch__init(&report.evswitch, session->evlist, stderr); if (ret) - return ret; + goto exit;
if (zstd_init(&(session->zstd_data), 0) < 0) pr_warning("Decompression initialization failed. Reported data may be incomplete.\n"); @@ -1638,5 +1646,8 @@ error:
zstd_fini(&(session->zstd_data)); perf_session__delete(session); +exit: + free(sort_order_help); + free(field_order_help); return ret; } diff --git a/tools/perf/util/sort.c b/tools/perf/util/sort.c index 88ce47f2547e..568a88c001c6 100644 --- a/tools/perf/util/sort.c +++ b/tools/perf/util/sort.c @@ -3370,7 +3370,7 @@ static void add_hpp_sort_string(struct strbuf *sb, struct hpp_dimension *s, int add_key(sb, s[i].name, llen); }
-const char *sort_help(const char *prefix) +char *sort_help(const char *prefix) { struct strbuf sb; char *s; diff --git a/tools/perf/util/sort.h b/tools/perf/util/sort.h index 87a092645aa7..b67c469aba79 100644 --- a/tools/perf/util/sort.h +++ b/tools/perf/util/sort.h @@ -302,7 +302,7 @@ void reset_output_field(void); void sort__setup_elide(FILE *fp); void perf_hpp__set_elide(int idx, bool elide);
-const char *sort_help(const char *prefix); +char *sort_help(const char *prefix);
int report_parse_ignore_callees_opt(const struct option *opt, const char *arg, int unset);
From: Riccardo Mancini rickyman7@gmail.com
[ Upstream commit 1b1f57cf9e4c8eb16c8f6b2ce12cc5dd3517fc61 ]
ASan reports several memory leak while running:
# perf test "82: Use vfs_getname probe to get syscall args filenames"
One of the leaks is caused by zstd data not being released on exit in perf-script.
This patch adds the missing zstd_fini().
Signed-off-by: Riccardo Mancini rickyman7@gmail.com Fixes: b13b04d9382113f7 ("perf script: Initialize zstd_data") Cc: Ian Rogers irogers@google.com Cc: Jiri Olsa jolsa@redhat.com Cc: Mark Rutland mark.rutland@arm.com Cc: Milian Wolff milian.wolff@kdab.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Link: http://lore.kernel.org/lkml/39388e8cc2f85ca219ea18697a17b7bd8f74b693.1626343... Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/builtin-script.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c index 1280cbfad4db..8a6656ab835b 100644 --- a/tools/perf/builtin-script.c +++ b/tools/perf/builtin-script.c @@ -3991,6 +3991,7 @@ out_delete: zfree(&script.ptime_range); }
+ zstd_fini(&(session->zstd_data)); evlist__free_stats(session->evlist); perf_session__delete(session);
From: Riccardo Mancini rickyman7@gmail.com
[ Upstream commit faf3ac305d61341c74e5cdd9e41daecce7f67bfe ]
ASan reports several memory leaks while running:
# perf test "82: Use vfs_getname probe to get syscall args filenames"
Two of these are caused by some refcounts not being decreased on perf-script exit, namely script.threads and script.cpus.
This patch adds the missing __put calls in a new perf_script__exit function, which is called at the end of cmd_script.
This patch concludes the fixes of all remaining memory leaks in perf test "82: Use vfs_getname probe to get syscall args filenames".
Signed-off-by: Riccardo Mancini rickyman7@gmail.com Fixes: cfc8874a48599249 ("perf script: Process cpu/threads maps") Cc: Ian Rogers irogers@google.com Cc: Jiri Olsa jolsa@redhat.com Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Link: http://lore.kernel.org/lkml/5ee73b19791c6fa9d24c4d57f4ac1a23609400d7.1626343... Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/builtin-script.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c index 8a6656ab835b..c43c2963117d 100644 --- a/tools/perf/builtin-script.c +++ b/tools/perf/builtin-script.c @@ -2534,6 +2534,12 @@ static void perf_script__exit_per_event_dump_stats(struct perf_script *script) } }
+static void perf_script__exit(struct perf_script *script) +{ + perf_thread_map__put(script->threads); + perf_cpu_map__put(script->cpus); +} + static int __cmd_script(struct perf_script *script) { int ret; @@ -3994,6 +4000,7 @@ out_delete: zstd_fini(&(session->zstd_data)); evlist__free_stats(session->evlist); perf_session__delete(session); + perf_script__exit(&script);
if (script_started) cleanup_scripting();
From: Riccardo Mancini rickyman7@gmail.com
[ Upstream commit f8cbb0f926ae1e1fb5f9e51614e5437560ed4039 ]
ASan reports memory leaks when running:
# perf test "88: Check open filename arg using perf trace + vfs_getname"
One of these is caused by the lzma stream never being closed inside lzma_decompress_to_file().
This patch adds the missing lzma_end().
Signed-off-by: Riccardo Mancini rickyman7@gmail.com Fixes: 80a32e5b498a7547 ("perf tools: Add lzma decompression support for kernel module") Cc: Ian Rogers irogers@google.com Cc: Jiri Olsa jolsa@redhat.com Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Link: http://lore.kernel.org/lkml/aaf50bdce7afe996cfc06e1bbb36e4a2a9b9db93.1626343... Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/lzma.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/tools/perf/util/lzma.c b/tools/perf/util/lzma.c index 39062df02629..51424cdc3b68 100644 --- a/tools/perf/util/lzma.c +++ b/tools/perf/util/lzma.c @@ -69,7 +69,7 @@ int lzma_decompress_to_file(const char *input, int output_fd)
if (ferror(infile)) { pr_err("lzma: read error: %s\n", strerror(errno)); - goto err_fclose; + goto err_lzma_end; }
if (feof(infile)) @@ -83,7 +83,7 @@ int lzma_decompress_to_file(const char *input, int output_fd)
if (writen(output_fd, buf_out, write_size) != write_size) { pr_err("lzma: write error: %s\n", strerror(errno)); - goto err_fclose; + goto err_lzma_end; }
strm.next_out = buf_out; @@ -95,11 +95,13 @@ int lzma_decompress_to_file(const char *input, int output_fd) break;
pr_err("lzma: failed %s\n", lzma_strerror(ret)); - goto err_fclose; + goto err_lzma_end; } }
err = 0; +err_lzma_end: + lzma_end(&strm); err_fclose: fclose(infile); return err;
From: Riccardo Mancini rickyman7@gmail.com
[ Upstream commit e0fa7ab42232e742dcb3de9f3c1f6127b5adc019 ]
ASan reports some memory leaks when running:
# perf test "42: BPF filter"
This second leak is caused by a strlist not being dellocated on error inside probe_file__del_events.
This patch adds a goto label before the deallocation and makes the error path jump to it.
Signed-off-by: Riccardo Mancini rickyman7@gmail.com Fixes: e7895e422e4da63d ("perf probe: Split del_perf_probe_events()") Cc: Ian Rogers irogers@google.com Cc: Jiri Olsa jolsa@redhat.com Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Link: http://lore.kernel.org/lkml/174963c587ae77fa108af794669998e4ae558338.1626343... Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/probe-file.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/perf/util/probe-file.c b/tools/perf/util/probe-file.c index 52273542e6ef..3f6de459ac2b 100644 --- a/tools/perf/util/probe-file.c +++ b/tools/perf/util/probe-file.c @@ -342,11 +342,11 @@ int probe_file__del_events(int fd, struct strfilter *filter)
ret = probe_file__get_events(fd, filter, namelist); if (ret < 0) - return ret; + goto out;
ret = probe_file__del_strlist(fd, namelist); +out: strlist__delete(namelist); - return ret; }
From: Riccardo Mancini rickyman7@gmail.com
[ Upstream commit d4b3eedce151e63932ce4a00f1d0baa340a8b907 ]
When using 'perf report' in directory mode, the first file is not closed on exit, causing a memory leak.
The problem is caused by the iterating variable never reaching 0.
Fixes: 145520631130bd64 ("perf data: Add perf_data__(create_dir|close_dir) functions") Signed-off-by: Riccardo Mancini rickyman7@gmail.com Acked-by: Namhyung Kim namhyung@kernel.org Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Ian Rogers irogers@google.com Cc: Jiri Olsa jolsa@redhat.com Cc: Mark Rutland mark.rutland@arm.com Cc: Peter Zijlstra peterz@infradead.org Cc: Zhen Lei thunder.leizhen@huawei.com Link: http://lore.kernel.org/lkml/20210716141122.858082-1-rickyman7@gmail.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/util/data.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/perf/util/data.c b/tools/perf/util/data.c index 8fca4779ae6a..70b91ce35178 100644 --- a/tools/perf/util/data.c +++ b/tools/perf/util/data.c @@ -20,7 +20,7 @@
static void close_dir(struct perf_data_file *files, int nr) { - while (--nr >= 1) { + while (--nr >= 0) { close(files[nr].fd); zfree(&files[nr].path); }
From: Yang Jihong yangjihong1@huawei.com
[ Upstream commit b0f008551f0bf4d5f6db9b5f0e071b02790d6a2e ]
The tracepoints trace_sched_stat_{wait, sleep, iowait} are not exposed to user if CONFIG_SCHEDSTATS is not set, "perf sched record" records the three events. As a result, the command fails.
Before:
#perf sched record sleep 1 event syntax error: 'sched:sched_stat_wait' ___ unknown tracepoint
Error: File /sys/kernel/tracing/events/sched/sched_stat_wait not found. Hint: Perhaps this kernel misses some CONFIG_ setting to enable this feature?.
Run 'perf list' for a list of valid events
Usage: perf record [<options>] [<command>] or: perf record [<options>] -- <command> [<options>]
-e, --event <event> event selector. use 'perf list' to list available events
Solution: Check whether schedstat tracepoints are exposed. If no, these events are not recorded.
After: # perf sched record sleep 1 [ perf record: Woken up 1 times to write data ] [ perf record: Captured and wrote 0.163 MB perf.data (1091 samples) ] # perf sched report run measurement overhead: 4736 nsecs sleep measurement overhead: 9059979 nsecs the run test took 999854 nsecs the sleep test took 8945271 nsecs nr_run_events: 716 nr_sleep_events: 785 nr_wakeup_events: 0 ... ------------------------------------------------------------
Fixes: 2a09b5de235a6 ("sched/fair: do not expose some tracepoints to user if CONFIG_SCHEDSTATS is not set") Signed-off-by: Yang Jihong yangjihong1@huawei.com Cc: Alexander Shishkin alexander.shishkin@linux.intel.com Cc: Jiri Olsa jolsa@redhat.com Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Cc: Steven Rostedt (VMware) rostedt@goodmis.org Cc: Yafang Shao laoar.shao@gmail.com Link: http://lore.kernel.org/lkml/20210713112358.194693-1-yangjihong1@huawei.com Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- tools/perf/builtin-sched.c | 33 +++++++++++++++++++++++++++++---- 1 file changed, 29 insertions(+), 4 deletions(-)
diff --git a/tools/perf/builtin-sched.c b/tools/perf/builtin-sched.c index 954ce2f594e9..3e5b7faf0c16 100644 --- a/tools/perf/builtin-sched.c +++ b/tools/perf/builtin-sched.c @@ -3335,6 +3335,16 @@ static void setup_sorting(struct perf_sched *sched, const struct option *options sort_dimension__add("pid", &sched->cmp_pid); }
+static bool schedstat_events_exposed(void) +{ + /* + * Select "sched:sched_stat_wait" event to check + * whether schedstat tracepoints are exposed. + */ + return IS_ERR(trace_event__tp_format("sched", "sched_stat_wait")) ? + false : true; +} + static int __cmd_record(int argc, const char **argv) { unsigned int rec_argc, i, j; @@ -3346,21 +3356,33 @@ static int __cmd_record(int argc, const char **argv) "-m", "1024", "-c", "1", "-e", "sched:sched_switch", - "-e", "sched:sched_stat_wait", - "-e", "sched:sched_stat_sleep", - "-e", "sched:sched_stat_iowait", "-e", "sched:sched_stat_runtime", "-e", "sched:sched_process_fork", "-e", "sched:sched_wakeup_new", "-e", "sched:sched_migrate_task", }; + + /* + * The tracepoints trace_sched_stat_{wait, sleep, iowait} + * are not exposed to user if CONFIG_SCHEDSTATS is not set, + * to prevent "perf sched record" execution failure, determine + * whether to record schedstat events according to actual situation. + */ + const char * const schedstat_args[] = { + "-e", "sched:sched_stat_wait", + "-e", "sched:sched_stat_sleep", + "-e", "sched:sched_stat_iowait", + }; + unsigned int schedstat_argc = schedstat_events_exposed() ? + ARRAY_SIZE(schedstat_args) : 0; + struct tep_event *waking_event;
/* * +2 for either "-e", "sched:sched_wakeup" or * "-e", "sched:sched_waking" */ - rec_argc = ARRAY_SIZE(record_args) + 2 + argc - 1; + rec_argc = ARRAY_SIZE(record_args) + 2 + schedstat_argc + argc - 1; rec_argv = calloc(rec_argc + 1, sizeof(char *));
if (rec_argv == NULL) @@ -3376,6 +3398,9 @@ static int __cmd_record(int argc, const char **argv) else rec_argv[i++] = strdup("sched:sched_wakeup");
+ for (j = 0; j < schedstat_argc; j++) + rec_argv[i++] = strdup(schedstat_args[j]); + for (j = 1; j < (unsigned int)argc; j++, i++) rec_argv[i] = argv[j];
From: Lecopzer Chen lecopzer.chen@mediatek.com
[ Upstream commit 1d11053dc63094075bf9e4809fffd3bb5e72f9a6 ]
When building modules(CONFIG_...=m), I found some of module versions are incorrect and set to 0. This can be found in build log for first clean build which shows
WARNING: EXPORT symbol "XXXX" [drivers/XXX/XXX.ko] version generation failed, symbol will not be versioned.
But in second build(incremental build), the WARNING disappeared and the module version becomes valid CRC and make someone who want to change modules without updating kernel image can't insert their modules.
The problematic code is + $(foreach n, $(filter-out FORCE,$^), \ + $(if $(wildcard $(n).symversions), \ + ; cat $(n).symversions >> $@.symversions))
For example: rm -f fs/notify/built-in.a.symversions ; rm -f fs/notify/built-in.a; \ llvm-ar cDPrST fs/notify/built-in.a fs/notify/fsnotify.o \ fs/notify/notification.o fs/notify/group.o ...
`foreach n` shows nothing to `cat` into $(n).symversions because `if $(wildcard $(n).symversions)` return nothing, but actually they do exist during this line was executed.
-rw-r--r-- 1 root root 168580 Jun 13 19:10 fs/notify/fsnotify.o -rw-r--r-- 1 root root 111 Jun 13 19:10 fs/notify/fsnotify.o.symversions
The reason is the $(n).symversions are generated at runtime, but Makefile wildcard function expends and checks the file exist or not during parsing the Makefile.
Thus fix this by use `test` shell command to check the file existence in runtime.
Rebase from both: 1. [https://lore.kernel.org/lkml/20210616080252.32046-1-lecopzer.chen@mediatek.c...] 2. [https://lore.kernel.org/lkml/20210702032943.7865-1-lecopzer.chen@mediatek.co...]
Fixes: 38e891849003 ("kbuild: lto: fix module versioning") Co-developed-by: Sami Tolvanen samitolvanen@google.com Signed-off-by: Lecopzer Chen lecopzer.chen@mediatek.com Signed-off-by: Masahiro Yamada masahiroy@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- scripts/Makefile.build | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/scripts/Makefile.build b/scripts/Makefile.build index 34d257653fb4..c6bd62f518ff 100644 --- a/scripts/Makefile.build +++ b/scripts/Makefile.build @@ -388,7 +388,7 @@ ifeq ($(CONFIG_LTO_CLANG) $(CONFIG_MODVERSIONS),y y) cmd_update_lto_symversions = \ rm -f $@.symversions \ $(foreach n, $(filter-out FORCE,$^), \ - $(if $(wildcard $(n).symversions), \ + $(if $(shell test -s $(n).symversions && echo y), \ ; cat $(n).symversions >> $@.symversions)) else cmd_update_lto_symversions = echo >/dev/null
From: Charles Keepax ckeepax@opensource.cirrus.com
[ Upstream commit dd6fb8ff2210f74b056bf9234d0605e8c26a8ac0 ]
When wm_coeff_tlv_get was updated it was accidentally switch to the _raw version of the helper causing it to ignore the current DSP state it should be checking. Switch the code back to the correct helper so that users can't read the controls when they arn't available.
Fixes: 73ecf1a673d3 ("ASoC: wm_adsp: Correct cache handling of new kernel control API") Signed-off-by: Charles Keepax ckeepax@opensource.cirrus.com Link: https://lore.kernel.org/r/20210626155941.12251-1-ckeepax@opensource.cirrus.c... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/wm_adsp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/soc/codecs/wm_adsp.c b/sound/soc/codecs/wm_adsp.c index 3dc119daf2f6..cef05d81c39b 100644 --- a/sound/soc/codecs/wm_adsp.c +++ b/sound/soc/codecs/wm_adsp.c @@ -1213,7 +1213,7 @@ static int wm_coeff_tlv_get(struct snd_kcontrol *kctl,
mutex_lock(&ctl->dsp->pwr_lock);
- ret = wm_coeff_read_ctrl_raw(ctl, ctl->cache, size); + ret = wm_coeff_read_ctrl(ctl, ctl->cache, size);
if (!ret && copy_to_user(bytes, ctl->cache, size)) ret = -EFAULT;
From: Alain Volmat alain.volmat@foss.st.com
[ Upstream commit 7999d2555c9f879d006ea8469d74db9cdb038af0 ]
Add pm_runtime calls in probe/probe error path and remove in order to be consistent in all places in ordering and ensure that pm_runtime is disabled prior to resources used by the SPI controller.
This patch also fixes the 2 following warnings on driver remove: WARNING: CPU: 0 PID: 743 at drivers/clk/clk.c:594 clk_core_disable_lock+0x18/0x24 WARNING: CPU: 0 PID: 743 at drivers/clk/clk.c:476 clk_unprepare+0x24/0x2c
Fixes: 038ac869c9d2 ("spi: stm32: add runtime PM support")
Signed-off-by: Amelie Delaunay amelie.delaunay@foss.st.com Signed-off-by: Alain Volmat alain.volmat@foss.st.com Link: https://lore.kernel.org/r/1625646426-5826-2-git-send-email-alain.volmat@foss... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/spi/spi-stm32.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/spi/spi-stm32.c b/drivers/spi/spi-stm32.c index 8ffcffbb8157..a92a28933edb 100644 --- a/drivers/spi/spi-stm32.c +++ b/drivers/spi/spi-stm32.c @@ -1925,6 +1925,7 @@ static int stm32_spi_probe(struct platform_device *pdev) master->can_dma = stm32_spi_can_dma;
pm_runtime_set_active(&pdev->dev); + pm_runtime_get_noresume(&pdev->dev); pm_runtime_enable(&pdev->dev);
ret = spi_register_master(master); @@ -1940,6 +1941,8 @@ static int stm32_spi_probe(struct platform_device *pdev)
err_pm_disable: pm_runtime_disable(&pdev->dev); + pm_runtime_put_noidle(&pdev->dev); + pm_runtime_set_suspended(&pdev->dev); err_dma_release: if (spi->dma_tx) dma_release_channel(spi->dma_tx); @@ -1956,9 +1959,14 @@ static int stm32_spi_remove(struct platform_device *pdev) struct spi_master *master = platform_get_drvdata(pdev); struct stm32_spi *spi = spi_master_get_devdata(master);
+ pm_runtime_get_sync(&pdev->dev); + spi_unregister_master(master); spi->cfg->disable(spi);
+ pm_runtime_disable(&pdev->dev); + pm_runtime_put_noidle(&pdev->dev); + pm_runtime_set_suspended(&pdev->dev); if (master->dma_tx) dma_release_channel(master->dma_tx); if (master->dma_rx) @@ -1966,7 +1974,6 @@ static int stm32_spi_remove(struct platform_device *pdev)
clk_disable_unprepare(spi->clk);
- pm_runtime_disable(&pdev->dev);
pinctrl_pm_select_sleep_state(&pdev->dev);
From: Axel Lin axel.lin@ingics.com
[ Upstream commit ae60e6a9d24e89a74e2512204ad04de94921bdd2 ]
Use unsigned int instead of u32 for regmap_read/regmap_update_bits val argument.
Signed-off-by: Axel Lin axel.lin@ingics.com Link: https://lore.kernel.org/r/20210619124133.4096683-1-axel.lin@ingics.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/regulator/hi6421-regulator.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/regulator/hi6421-regulator.c b/drivers/regulator/hi6421-regulator.c index dc631c1a46b4..bff8c515dcde 100644 --- a/drivers/regulator/hi6421-regulator.c +++ b/drivers/regulator/hi6421-regulator.c @@ -386,7 +386,7 @@ static int hi6421_regulator_enable(struct regulator_dev *rdev) static unsigned int hi6421_regulator_ldo_get_mode(struct regulator_dev *rdev) { struct hi6421_regulator_info *info = rdev_get_drvdata(rdev); - u32 reg_val; + unsigned int reg_val;
regmap_read(rdev->regmap, rdev->desc->enable_reg, ®_val); if (reg_val & info->mode_mask) @@ -398,7 +398,7 @@ static unsigned int hi6421_regulator_ldo_get_mode(struct regulator_dev *rdev) static unsigned int hi6421_regulator_buck_get_mode(struct regulator_dev *rdev) { struct hi6421_regulator_info *info = rdev_get_drvdata(rdev); - u32 reg_val; + unsigned int reg_val;
regmap_read(rdev->regmap, rdev->desc->enable_reg, ®_val); if (reg_val & info->mode_mask) @@ -411,7 +411,7 @@ static int hi6421_regulator_ldo_set_mode(struct regulator_dev *rdev, unsigned int mode) { struct hi6421_regulator_info *info = rdev_get_drvdata(rdev); - u32 new_mode; + unsigned int new_mode;
switch (mode) { case REGULATOR_MODE_NORMAL: @@ -435,7 +435,7 @@ static int hi6421_regulator_buck_set_mode(struct regulator_dev *rdev, unsigned int mode) { struct hi6421_regulator_info *info = rdev_get_drvdata(rdev); - u32 new_mode; + unsigned int new_mode;
switch (mode) { case REGULATOR_MODE_NORMAL:
From: Axel Lin axel.lin@ingics.com
[ Upstream commit 1c73daee4bf30ccdff5e86dc400daa6f74735da5 ]
Since config.dev = pdev->dev.parent in current code, so dev_get_drvdata(rdev->dev.parent) call in hi6421_regulator_enable returns the drvdata of the mfd device rather than the regulator. Fix it.
This was broken while converting to use simplified DT parsing because the config.dev changed from pdev->dev to pdev->dev.parent for parsing the parent's of_node.
Fixes: 29dc269a85ef ("regulator: hi6421: Convert to use simplified DT parsing") Signed-off-by: Axel Lin axel.lin@ingics.com Link: https://lore.kernel.org/r/20210630095959.2411543-1-axel.lin@ingics.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/regulator/hi6421-regulator.c | 22 +++++++++++++--------- 1 file changed, 13 insertions(+), 9 deletions(-)
diff --git a/drivers/regulator/hi6421-regulator.c b/drivers/regulator/hi6421-regulator.c index bff8c515dcde..d144a4bdb76d 100644 --- a/drivers/regulator/hi6421-regulator.c +++ b/drivers/regulator/hi6421-regulator.c @@ -366,9 +366,8 @@ static struct hi6421_regulator_info
static int hi6421_regulator_enable(struct regulator_dev *rdev) { - struct hi6421_regulator_pdata *pdata; + struct hi6421_regulator_pdata *pdata = rdev_get_drvdata(rdev);
- pdata = dev_get_drvdata(rdev->dev.parent); /* hi6421 spec requires regulator enablement must be serialized: * - Because when BUCK, LDO switching from off to on, it will have * a huge instantaneous current; so you can not turn on two or @@ -385,9 +384,10 @@ static int hi6421_regulator_enable(struct regulator_dev *rdev)
static unsigned int hi6421_regulator_ldo_get_mode(struct regulator_dev *rdev) { - struct hi6421_regulator_info *info = rdev_get_drvdata(rdev); + struct hi6421_regulator_info *info; unsigned int reg_val;
+ info = container_of(rdev->desc, struct hi6421_regulator_info, desc); regmap_read(rdev->regmap, rdev->desc->enable_reg, ®_val); if (reg_val & info->mode_mask) return REGULATOR_MODE_IDLE; @@ -397,9 +397,10 @@ static unsigned int hi6421_regulator_ldo_get_mode(struct regulator_dev *rdev)
static unsigned int hi6421_regulator_buck_get_mode(struct regulator_dev *rdev) { - struct hi6421_regulator_info *info = rdev_get_drvdata(rdev); + struct hi6421_regulator_info *info; unsigned int reg_val;
+ info = container_of(rdev->desc, struct hi6421_regulator_info, desc); regmap_read(rdev->regmap, rdev->desc->enable_reg, ®_val); if (reg_val & info->mode_mask) return REGULATOR_MODE_STANDBY; @@ -410,9 +411,10 @@ static unsigned int hi6421_regulator_buck_get_mode(struct regulator_dev *rdev) static int hi6421_regulator_ldo_set_mode(struct regulator_dev *rdev, unsigned int mode) { - struct hi6421_regulator_info *info = rdev_get_drvdata(rdev); + struct hi6421_regulator_info *info; unsigned int new_mode;
+ info = container_of(rdev->desc, struct hi6421_regulator_info, desc); switch (mode) { case REGULATOR_MODE_NORMAL: new_mode = 0; @@ -434,9 +436,10 @@ static int hi6421_regulator_ldo_set_mode(struct regulator_dev *rdev, static int hi6421_regulator_buck_set_mode(struct regulator_dev *rdev, unsigned int mode) { - struct hi6421_regulator_info *info = rdev_get_drvdata(rdev); + struct hi6421_regulator_info *info; unsigned int new_mode;
+ info = container_of(rdev->desc, struct hi6421_regulator_info, desc); switch (mode) { case REGULATOR_MODE_NORMAL: new_mode = 0; @@ -459,7 +462,9 @@ static unsigned int hi6421_regulator_ldo_get_optimum_mode(struct regulator_dev *rdev, int input_uV, int output_uV, int load_uA) { - struct hi6421_regulator_info *info = rdev_get_drvdata(rdev); + struct hi6421_regulator_info *info; + + info = container_of(rdev->desc, struct hi6421_regulator_info, desc);
if (load_uA > info->eco_microamp) return REGULATOR_MODE_NORMAL; @@ -543,14 +548,13 @@ static int hi6421_regulator_probe(struct platform_device *pdev) if (!pdata) return -ENOMEM; mutex_init(&pdata->lock); - platform_set_drvdata(pdev, pdata);
for (i = 0; i < ARRAY_SIZE(hi6421_regulator_info); i++) { /* assign per-regulator data */ info = &hi6421_regulator_info[i];
config.dev = pdev->dev.parent; - config.driver_data = info; + config.driver_data = pdata; config.regmap = pmic->regmap;
rdev = devm_regulator_register(&pdev->dev, &info->desc,
From: Peter Hess peter.hess@ph-home.de
[ Upstream commit 3a70dd2d050331ee4cf5ad9d5c0a32d83ead9a43 ]
In FIFO mode were two problems: - RX mode was never handled and - in this case the tx_buf pointer was NULL and caused an exception
fix this by handling RX mode in mtk_spi_fifo_transfer
Fixes: a568231f4632 ("spi: mediatek: Add spi bus for Mediatek MT8173") Signed-off-by: Peter Hess peter.hess@ph-home.de Signed-off-by: Frank Wunderlich frank-w@public-files.de Link: https://lore.kernel.org/r/20210706121609.680534-1-linux@fw-web.de Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/spi/spi-mt65xx.c | 16 +++++++++++++--- 1 file changed, 13 insertions(+), 3 deletions(-)
diff --git a/drivers/spi/spi-mt65xx.c b/drivers/spi/spi-mt65xx.c index 976f73b9e299..8d5fa7f1e506 100644 --- a/drivers/spi/spi-mt65xx.c +++ b/drivers/spi/spi-mt65xx.c @@ -427,13 +427,23 @@ static int mtk_spi_fifo_transfer(struct spi_master *master, mtk_spi_setup_packet(master);
cnt = xfer->len / 4; - iowrite32_rep(mdata->base + SPI_TX_DATA_REG, xfer->tx_buf, cnt); + if (xfer->tx_buf) + iowrite32_rep(mdata->base + SPI_TX_DATA_REG, xfer->tx_buf, cnt); + + if (xfer->rx_buf) + ioread32_rep(mdata->base + SPI_RX_DATA_REG, xfer->rx_buf, cnt);
remainder = xfer->len % 4; if (remainder > 0) { reg_val = 0; - memcpy(®_val, xfer->tx_buf + (cnt * 4), remainder); - writel(reg_val, mdata->base + SPI_TX_DATA_REG); + if (xfer->tx_buf) { + memcpy(®_val, xfer->tx_buf + (cnt * 4), remainder); + writel(reg_val, mdata->base + SPI_TX_DATA_REG); + } + if (xfer->rx_buf) { + reg_val = readl(mdata->base + SPI_RX_DATA_REG); + memcpy(xfer->rx_buf + (cnt * 4), ®_val, remainder); + } }
mtk_spi_enable_transfer(master);
From: Maxim Schwalm maxim.schwalm@gmail.com
[ Upstream commit c71f78a662611fe2c67f3155da19b0eff0f29762 ]
The ALC5631 does not like multi-write accesses, avoid them. This fixes:
rt5631 4-001a: Unable to sync registers 0x3a-0x3c. -121
errors on resume from suspend (and all registers after the registers in the error not being synced).
Inspired by commit 2d30e9494f1e ("ASoC: rt5651: Fix regcache sync errors on resume") from Hans de Geode, which fixed the same errors on ALC5651.
Signed-off-by: Maxim Schwalm maxim.schwalm@gmail.com Link: https://lore.kernel.org/r/20210712005011.28536-1-digetx@gmail.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/codecs/rt5631.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/sound/soc/codecs/rt5631.c b/sound/soc/codecs/rt5631.c index 3000bc128b5b..38356ea2bd6e 100644 --- a/sound/soc/codecs/rt5631.c +++ b/sound/soc/codecs/rt5631.c @@ -1695,6 +1695,8 @@ static const struct regmap_config rt5631_regmap_config = { .reg_defaults = rt5631_reg, .num_reg_defaults = ARRAY_SIZE(rt5631_reg), .cache_type = REGCACHE_RBTREE, + .use_single_read = true, + .use_single_write = true, };
static int rt5631_i2c_probe(struct i2c_client *i2c,
From: Xuan Zhuo xuanzhuo@linux.alibaba.com
[ Upstream commit 5e21bb4e812566aef86fbb77c96a4ec0782286e4 ]
These two types of XDP progs (BPF_XDP_DEVMAP, BPF_XDP_CPUMAP) will not be executed directly in the driver, therefore we should also not directly run them from here. To run in these two situations, there must be further preparations done, otherwise these may cause a kernel panic.
For more details, see also dev_xdp_attach().
[ 46.982479] BUG: kernel NULL pointer dereference, address: 0000000000000000 [ 46.984295] #PF: supervisor read access in kernel mode [ 46.985777] #PF: error_code(0x0000) - not-present page [ 46.987227] PGD 800000010dca4067 P4D 800000010dca4067 PUD 10dca6067 PMD 0 [ 46.989201] Oops: 0000 [#1] SMP PTI [ 46.990304] CPU: 7 PID: 562 Comm: a.out Not tainted 5.13.0+ #44 [ 46.992001] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/24 [ 46.995113] RIP: 0010:___bpf_prog_run+0x17b/0x1710 [ 46.996586] Code: 49 03 14 cc e8 76 f6 fe ff e9 ad fe ff ff 0f b6 43 01 48 0f bf 4b 02 48 83 c3 08 89 c2 83 e0 0f c0 ea 04 02 [ 47.001562] RSP: 0018:ffffc900005afc58 EFLAGS: 00010246 [ 47.003115] RAX: 0000000000000000 RBX: ffffc9000023f068 RCX: 0000000000000000 [ 47.005163] RDX: 0000000000000000 RSI: 0000000000000079 RDI: ffffc900005afc98 [ 47.007135] RBP: 0000000000000000 R08: ffffc9000023f048 R09: c0000000ffffdfff [ 47.009171] R10: 0000000000000001 R11: ffffc900005afb40 R12: ffffc900005afc98 [ 47.011172] R13: 0000000000000001 R14: 0000000000000001 R15: ffffffff825258a8 [ 47.013244] FS: 00007f04a5207580(0000) GS:ffff88842fdc0000(0000) knlGS:0000000000000000 [ 47.015705] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 47.017475] CR2: 0000000000000000 CR3: 0000000100182005 CR4: 0000000000770ee0 [ 47.019558] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 47.021595] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 47.023574] PKRU: 55555554 [ 47.024571] Call Trace: [ 47.025424] __bpf_prog_run32+0x32/0x50 [ 47.026296] ? printk+0x53/0x6a [ 47.027066] ? ktime_get+0x39/0x90 [ 47.027895] bpf_test_run.cold.28+0x23/0x123 [ 47.028866] ? printk+0x53/0x6a [ 47.029630] bpf_prog_test_run_xdp+0x149/0x1d0 [ 47.030649] __sys_bpf+0x1305/0x23d0 [ 47.031482] __x64_sys_bpf+0x17/0x20 [ 47.032316] do_syscall_64+0x3a/0x80 [ 47.033165] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 47.034254] RIP: 0033:0x7f04a51364dd [ 47.035133] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 48 [ 47.038768] RSP: 002b:00007fff8f9fc518 EFLAGS: 00000213 ORIG_RAX: 0000000000000141 [ 47.040344] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f04a51364dd [ 47.041749] RDX: 0000000000000048 RSI: 0000000020002a80 RDI: 000000000000000a [ 47.043171] RBP: 00007fff8f9fc530 R08: 0000000002049300 R09: 0000000020000100 [ 47.044626] R10: 0000000000000004 R11: 0000000000000213 R12: 0000000000401070 [ 47.046088] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000 [ 47.047579] Modules linked in: [ 47.048318] CR2: 0000000000000000 [ 47.049120] ---[ end trace 7ad34443d5be719a ]--- [ 47.050273] RIP: 0010:___bpf_prog_run+0x17b/0x1710 [ 47.051343] Code: 49 03 14 cc e8 76 f6 fe ff e9 ad fe ff ff 0f b6 43 01 48 0f bf 4b 02 48 83 c3 08 89 c2 83 e0 0f c0 ea 04 02 [ 47.054943] RSP: 0018:ffffc900005afc58 EFLAGS: 00010246 [ 47.056068] RAX: 0000000000000000 RBX: ffffc9000023f068 RCX: 0000000000000000 [ 47.057522] RDX: 0000000000000000 RSI: 0000000000000079 RDI: ffffc900005afc98 [ 47.058961] RBP: 0000000000000000 R08: ffffc9000023f048 R09: c0000000ffffdfff [ 47.060390] R10: 0000000000000001 R11: ffffc900005afb40 R12: ffffc900005afc98 [ 47.061803] R13: 0000000000000001 R14: 0000000000000001 R15: ffffffff825258a8 [ 47.063249] FS: 00007f04a5207580(0000) GS:ffff88842fdc0000(0000) knlGS:0000000000000000 [ 47.065070] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 47.066307] CR2: 0000000000000000 CR3: 0000000100182005 CR4: 0000000000770ee0 [ 47.067747] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [ 47.069217] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 [ 47.070652] PKRU: 55555554 [ 47.071318] Kernel panic - not syncing: Fatal exception [ 47.072854] Kernel Offset: disabled [ 47.073683] ---[ end Kernel panic - not syncing: Fatal exception ]---
Fixes: 9216477449f3 ("bpf: cpumap: Add the possibility to attach an eBPF program to cpumap") Fixes: fbee97feed9b ("bpf: Add support to attach bpf program to a devmap entry") Reported-by: Abaci abaci@linux.alibaba.com Signed-off-by: Xuan Zhuo xuanzhuo@linux.alibaba.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Reviewed-by: Dust Li dust.li@linux.alibaba.com Acked-by: Jesper Dangaard Brouer brouer@redhat.com Acked-by: David Ahern dsahern@kernel.org Acked-by: Song Liu songliubraving@fb.com Link: https://lore.kernel.org/bpf/20210708080409.73525-1-xuanzhuo@linux.alibaba.co... Signed-off-by: Sasha Levin sashal@kernel.org --- net/bpf/test_run.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c index a5d72c48fb66..28ac3c96fa88 100644 --- a/net/bpf/test_run.c +++ b/net/bpf/test_run.c @@ -701,6 +701,9 @@ int bpf_prog_test_run_xdp(struct bpf_prog *prog, const union bpf_attr *kattr, void *data; int ret;
+ if (prog->expected_attach_type == BPF_XDP_DEVMAP || + prog->expected_attach_type == BPF_XDP_CPUMAP) + return -EINVAL; if (kattr->test.ctx_in || kattr->test.ctx_out) return -EINVAL;
From: Daniel Borkmann daniel@iogearbox.net
[ Upstream commit 5dd0a6b8582ffbfa88351949d50eccd5b6694ade ]
During testing of f263a81451c1 ("bpf: Track subprog poke descriptors correctly and fix use-after-free") under various failure conditions, for example, when jit_subprogs() fails and tries to clean up the program to be run under the interpreter, we ran into the following freeze:
[...] #127/8 tailcall_bpf2bpf_3:FAIL [...] [ 92.041251] BUG: KASAN: slab-out-of-bounds in ___bpf_prog_run+0x1b9d/0x2e20 [ 92.042408] Read of size 8 at addr ffff88800da67f68 by task test_progs/682 [ 92.043707] [ 92.044030] CPU: 1 PID: 682 Comm: test_progs Tainted: G O 5.13.0-53301-ge6c08cb33a30-dirty #87 [ 92.045542] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1 04/01/2014 [ 92.046785] Call Trace: [ 92.047171] ? __bpf_prog_run_args64+0xc0/0xc0 [ 92.047773] ? __bpf_prog_run_args32+0x8b/0xb0 [ 92.048389] ? __bpf_prog_run_args64+0xc0/0xc0 [ 92.049019] ? ktime_get+0x117/0x130 [...] // few hundred [similar] lines more [ 92.659025] ? ktime_get+0x117/0x130 [ 92.659845] ? __bpf_prog_run_args64+0xc0/0xc0 [ 92.660738] ? __bpf_prog_run_args32+0x8b/0xb0 [ 92.661528] ? __bpf_prog_run_args64+0xc0/0xc0 [ 92.662378] ? print_usage_bug+0x50/0x50 [ 92.663221] ? print_usage_bug+0x50/0x50 [ 92.664077] ? bpf_ksym_find+0x9c/0xe0 [ 92.664887] ? ktime_get+0x117/0x130 [ 92.665624] ? kernel_text_address+0xf5/0x100 [ 92.666529] ? __kernel_text_address+0xe/0x30 [ 92.667725] ? unwind_get_return_address+0x2f/0x50 [ 92.668854] ? ___bpf_prog_run+0x15d4/0x2e20 [ 92.670185] ? ktime_get+0x117/0x130 [ 92.671130] ? __bpf_prog_run_args64+0xc0/0xc0 [ 92.672020] ? __bpf_prog_run_args32+0x8b/0xb0 [ 92.672860] ? __bpf_prog_run_args64+0xc0/0xc0 [ 92.675159] ? ktime_get+0x117/0x130 [ 92.677074] ? lock_is_held_type+0xd5/0x130 [ 92.678662] ? ___bpf_prog_run+0x15d4/0x2e20 [ 92.680046] ? ktime_get+0x117/0x130 [ 92.681285] ? __bpf_prog_run32+0x6b/0x90 [ 92.682601] ? __bpf_prog_run64+0x90/0x90 [ 92.683636] ? lock_downgrade+0x370/0x370 [ 92.684647] ? mark_held_locks+0x44/0x90 [ 92.685652] ? ktime_get+0x117/0x130 [ 92.686752] ? lockdep_hardirqs_on+0x79/0x100 [ 92.688004] ? ktime_get+0x117/0x130 [ 92.688573] ? __cant_migrate+0x2b/0x80 [ 92.689192] ? bpf_test_run+0x2f4/0x510 [ 92.689869] ? bpf_test_timer_continue+0x1c0/0x1c0 [ 92.690856] ? rcu_read_lock_bh_held+0x90/0x90 [ 92.691506] ? __kasan_slab_alloc+0x61/0x80 [ 92.692128] ? eth_type_trans+0x128/0x240 [ 92.692737] ? __build_skb+0x46/0x50 [ 92.693252] ? bpf_prog_test_run_skb+0x65e/0xc50 [ 92.693954] ? bpf_prog_test_run_raw_tp+0x2d0/0x2d0 [ 92.694639] ? __fget_light+0xa1/0x100 [ 92.695162] ? bpf_prog_inc+0x23/0x30 [ 92.695685] ? __sys_bpf+0xb40/0x2c80 [ 92.696324] ? bpf_link_get_from_fd+0x90/0x90 [ 92.697150] ? mark_held_locks+0x24/0x90 [ 92.698007] ? lockdep_hardirqs_on_prepare+0x124/0x220 [ 92.699045] ? finish_task_switch+0xe6/0x370 [ 92.700072] ? lockdep_hardirqs_on+0x79/0x100 [ 92.701233] ? finish_task_switch+0x11d/0x370 [ 92.702264] ? __switch_to+0x2c0/0x740 [ 92.703148] ? mark_held_locks+0x24/0x90 [ 92.704155] ? __x64_sys_bpf+0x45/0x50 [ 92.705146] ? do_syscall_64+0x35/0x80 [ 92.706953] ? entry_SYSCALL_64_after_hwframe+0x44/0xae [...]
Turns out that the program rejection from e411901c0b77 ("bpf: allow for tailcalls in BPF subprograms for x64 JIT") is buggy since env->prog->aux->tail_call_reachable is never true. Commit ebf7d1f508a7 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT") added a tracker into check_max_stack_depth() which propagates the tail_call_reachable condition throughout the subprograms. This info is then assigned to the subprogram's func[i]->aux->tail_call_reachable. However, in the case of the rejection check upon JIT failure, env->prog->aux->tail_call_reachable is used. func[0]->aux->tail_call_reachable which represents the main program's information did not propagate this to the outer env->prog->aux, though. Add this propagation into check_max_stack_depth() where it needs to belong so that the check can be done reliably.
Fixes: ebf7d1f508a7 ("bpf, x64: rework pro/epilogue and tailcall handling in JIT") Fixes: e411901c0b77 ("bpf: allow for tailcalls in BPF subprograms for x64 JIT") Co-developed-by: John Fastabend john.fastabend@gmail.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Signed-off-by: John Fastabend john.fastabend@gmail.com Signed-off-by: Alexei Starovoitov ast@kernel.org Acked-by: Maciej Fijalkowski maciej.fijalkowski@intel.com Link: https://lore.kernel.org/bpf/618c34e3163ad1a36b1e82377576a6081e182f25.1626123... Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/bpf/verifier.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c index d8a6fcd28e39..e6db39a00de2 100644 --- a/kernel/bpf/verifier.c +++ b/kernel/bpf/verifier.c @@ -3675,6 +3675,8 @@ continue_func: if (tail_call_reachable) for (j = 0; j < frame; j++) subprog[ret_prog[j]].tail_call_reachable = true; + if (subprog[0].tail_call_reachable) + env->prog->aux->tail_call_reachable = true;
/* end of for() loop means the last insn of the 'subprog' * was reached. Doesn't matter whether it was JA or EXIT
From: Xuan Zhuo xuanzhuo@linux.alibaba.com
[ Upstream commit 5acc7d3e8d342858405fbbc671221f676b547ce7 ]
The problem occurs between dev_get_by_index() and dev_xdp_attach_link(). At this point, dev_xdp_uninstall() is called. Then xdp link will not be detached automatically when dev is released. But link->dev already points to dev, when xdp link is released, dev will still be accessed, but dev has been released.
dev_get_by_index() | link->dev = dev | | rtnl_lock() | unregister_netdevice_many() | dev_xdp_uninstall() | rtnl_unlock() rtnl_lock(); | dev_xdp_attach_link() | rtnl_unlock(); | | netdev_run_todo() // dev released bpf_xdp_link_release() | /* access dev. | use-after-free */ |
[ 45.966867] BUG: KASAN: use-after-free in bpf_xdp_link_release+0x3b8/0x3d0 [ 45.967619] Read of size 8 at addr ffff00000f9980c8 by task a.out/732 [ 45.968297] [ 45.968502] CPU: 1 PID: 732 Comm: a.out Not tainted 5.13.0+ #22 [ 45.969222] Hardware name: linux,dummy-virt (DT) [ 45.969795] Call trace: [ 45.970106] dump_backtrace+0x0/0x4c8 [ 45.970564] show_stack+0x30/0x40 [ 45.970981] dump_stack_lvl+0x120/0x18c [ 45.971470] print_address_description.constprop.0+0x74/0x30c [ 45.972182] kasan_report+0x1e8/0x200 [ 45.972659] __asan_report_load8_noabort+0x2c/0x50 [ 45.973273] bpf_xdp_link_release+0x3b8/0x3d0 [ 45.973834] bpf_link_free+0xd0/0x188 [ 45.974315] bpf_link_put+0x1d0/0x218 [ 45.974790] bpf_link_release+0x3c/0x58 [ 45.975291] __fput+0x20c/0x7e8 [ 45.975706] ____fput+0x24/0x30 [ 45.976117] task_work_run+0x104/0x258 [ 45.976609] do_notify_resume+0x894/0xaf8 [ 45.977121] work_pending+0xc/0x328 [ 45.977575] [ 45.977775] The buggy address belongs to the page: [ 45.978369] page:fffffc00003e6600 refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x4f998 [ 45.979522] flags: 0x7fffe0000000000(node=0|zone=0|lastcpupid=0x3ffff) [ 45.980349] raw: 07fffe0000000000 fffffc00003e6708 ffff0000dac3c010 0000000000000000 [ 45.981309] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000 [ 45.982259] page dumped because: kasan: bad access detected [ 45.982948] [ 45.983153] Memory state around the buggy address: [ 45.983753] ffff00000f997f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc [ 45.984645] ffff00000f998000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 45.985533] >ffff00000f998080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 45.986419] ^ [ 45.987112] ffff00000f998100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 45.988006] ffff00000f998180: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff [ 45.988895] ================================================================== [ 45.989773] Disabling lock debugging due to kernel taint [ 45.990552] Kernel panic - not syncing: panic_on_warn set ... [ 45.991166] CPU: 1 PID: 732 Comm: a.out Tainted: G B 5.13.0+ #22 [ 45.991929] Hardware name: linux,dummy-virt (DT) [ 45.992448] Call trace: [ 45.992753] dump_backtrace+0x0/0x4c8 [ 45.993208] show_stack+0x30/0x40 [ 45.993627] dump_stack_lvl+0x120/0x18c [ 45.994113] dump_stack+0x1c/0x34 [ 45.994530] panic+0x3a4/0x7d8 [ 45.994930] end_report+0x194/0x198 [ 45.995380] kasan_report+0x134/0x200 [ 45.995850] __asan_report_load8_noabort+0x2c/0x50 [ 45.996453] bpf_xdp_link_release+0x3b8/0x3d0 [ 45.997007] bpf_link_free+0xd0/0x188 [ 45.997474] bpf_link_put+0x1d0/0x218 [ 45.997942] bpf_link_release+0x3c/0x58 [ 45.998429] __fput+0x20c/0x7e8 [ 45.998833] ____fput+0x24/0x30 [ 45.999247] task_work_run+0x104/0x258 [ 45.999731] do_notify_resume+0x894/0xaf8 [ 46.000236] work_pending+0xc/0x328 [ 46.000697] SMP: stopping secondary CPUs [ 46.001226] Dumping ftrace buffer: [ 46.001663] (ftrace buffer empty) [ 46.002110] Kernel Offset: disabled [ 46.002545] CPU features: 0x00000001,23202c00 [ 46.003080] Memory Limit: none
Fixes: aa8d3a716b59db6c ("bpf, xdp: Add bpf_link-based XDP attachment API") Reported-by: Abaci abaci@linux.alibaba.com Signed-off-by: Xuan Zhuo xuanzhuo@linux.alibaba.com Signed-off-by: Alexei Starovoitov ast@kernel.org Reviewed-by: Dust Li dust.li@linux.alibaba.com Acked-by: Andrii Nakryiko andrii@kernel.org Link: https://lore.kernel.org/bpf/20210710031635.41649-1-xuanzhuo@linux.alibaba.co... Signed-off-by: Sasha Levin sashal@kernel.org --- net/core/dev.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-)
diff --git a/net/core/dev.c b/net/core/dev.c index 4f29dde4ed0a..0dcedcdf6d7e 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -9659,14 +9659,17 @@ int bpf_xdp_link_attach(const union bpf_attr *attr, struct bpf_prog *prog) struct net_device *dev; int err, fd;
+ rtnl_lock(); dev = dev_get_by_index(net, attr->link_create.target_ifindex); - if (!dev) + if (!dev) { + rtnl_unlock(); return -EINVAL; + }
link = kzalloc(sizeof(*link), GFP_USER); if (!link) { err = -ENOMEM; - goto out_put_dev; + goto unlock; }
bpf_link_init(&link->link, BPF_LINK_TYPE_XDP, &bpf_xdp_link_lops, prog); @@ -9676,14 +9679,14 @@ int bpf_xdp_link_attach(const union bpf_attr *attr, struct bpf_prog *prog) err = bpf_link_prime(&link->link, &link_primer); if (err) { kfree(link); - goto out_put_dev; + goto unlock; }
- rtnl_lock(); err = dev_xdp_attach_link(dev, NULL, link); rtnl_unlock();
if (err) { + link->dev = NULL; bpf_link_cleanup(&link_primer); goto out_put_dev; } @@ -9693,6 +9696,9 @@ int bpf_xdp_link_attach(const union bpf_attr *attr, struct bpf_prog *prog) dev_put(dev); return fd;
+unlock: + rtnl_unlock(); + out_put_dev: dev_put(dev); return err;
From: Sathya Prakash M R sathya.prakash.m.r@intel.com
[ Upstream commit aa21548e34c19c12e924c736f3fd9e6a4d0f5419 ]
The ADL descriptor was missing an ACPI power setting, causing the DSP to enter D3 even with a D0i1-compatible wake-on-voice/hotwording capture stream.
Fixes: 4ad03f894b3c ('ASoC: SOF: Intel: Update ADL P to use its own descriptor') Reviewed-by: Ranjani Sridharan ranjani.sridharan@linux.intel.com Signed-off-by: Sathya Prakash M R sathya.prakash.m.r@intel.com Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Link: https://lore.kernel.org/r/20210712201620.44311-1-pierre-louis.bossart@linux.... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/sof/intel/pci-tgl.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/sound/soc/sof/intel/pci-tgl.c b/sound/soc/sof/intel/pci-tgl.c index 88c3bf404dd7..d1fd0a330554 100644 --- a/sound/soc/sof/intel/pci-tgl.c +++ b/sound/soc/sof/intel/pci-tgl.c @@ -89,6 +89,7 @@ static const struct sof_dev_desc adls_desc = { static const struct sof_dev_desc adl_desc = { .machines = snd_soc_acpi_intel_adl_machines, .alt_machines = snd_soc_acpi_intel_adl_sdw_machines, + .use_acpi_target_states = true, .resindex_lpe_base = 0, .resindex_pcicfg_base = -1, .resindex_imr_base = -1,
From: Nicolas Saenz Julienne nsaenzju@redhat.com
[ Upstream commit aebacb7f6ca1926918734faae14d1f0b6fae5cb7 ]
31cd0e119d50 ("timers: Recalculate next timer interrupt only when necessary") subtly altered get_next_timer_interrupt()'s behaviour. The function no longer consistently returns KTIME_MAX with no timers pending.
In order to decide if there are any timers pending we check whether the next expiry will happen NEXT_TIMER_MAX_DELTA jiffies from now. Unfortunately, the next expiry time and the timer base clock are no longer updated in unison. The former changes upon certain timer operations (enqueue, expire, detach), whereas the latter keeps track of jiffies as they move forward. Ultimately breaking the logic above.
A simplified example:
- Upon entering get_next_timer_interrupt() with:
jiffies = 1 base->clk = 0; base->next_expiry = NEXT_TIMER_MAX_DELTA;
'base->next_expiry == base->clk + NEXT_TIMER_MAX_DELTA', the function returns KTIME_MAX.
- 'base->clk' is updated to the jiffies value.
- The next time we enter get_next_timer_interrupt(), taking into account no timer operations happened:
base->clk = 1; base->next_expiry = NEXT_TIMER_MAX_DELTA;
'base->next_expiry != base->clk + NEXT_TIMER_MAX_DELTA', the function returns a valid expire time, which is incorrect.
This ultimately might unnecessarily rearm sched's timer on nohz_full setups, and add latency to the system[1].
So, introduce 'base->timers_pending'[2], update it every time 'base->next_expiry' changes, and use it in get_next_timer_interrupt().
[1] See tick_nohz_stop_tick(). [2] A quick pahole check on x86_64 and arm64 shows it doesn't make 'struct timer_base' any bigger.
Fixes: 31cd0e119d50 ("timers: Recalculate next timer interrupt only when necessary") Signed-off-by: Nicolas Saenz Julienne nsaenzju@redhat.com Signed-off-by: Frederic Weisbecker frederic@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/time/timer.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/kernel/time/timer.c b/kernel/time/timer.c index d111adf4a0cb..99b97ccefdbd 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -207,6 +207,7 @@ struct timer_base { unsigned int cpu; bool next_expiry_recalc; bool is_idle; + bool timers_pending; DECLARE_BITMAP(pending_map, WHEEL_SIZE); struct hlist_head vectors[WHEEL_SIZE]; } ____cacheline_aligned; @@ -595,6 +596,7 @@ static void enqueue_timer(struct timer_base *base, struct timer_list *timer, * can reevaluate the wheel: */ base->next_expiry = bucket_expiry; + base->timers_pending = true; base->next_expiry_recalc = false; trigger_dyntick_cpu(base, timer); } @@ -1596,6 +1598,7 @@ static unsigned long __next_timer_interrupt(struct timer_base *base) }
base->next_expiry_recalc = false; + base->timers_pending = !(next == base->clk + NEXT_TIMER_MAX_DELTA);
return next; } @@ -1647,7 +1650,6 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem) struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_STD]); u64 expires = KTIME_MAX; unsigned long nextevt; - bool is_max_delta;
/* * Pretend that there is no timer pending if the cpu is offline. @@ -1660,7 +1662,6 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem) if (base->next_expiry_recalc) base->next_expiry = __next_timer_interrupt(base); nextevt = base->next_expiry; - is_max_delta = (nextevt == base->clk + NEXT_TIMER_MAX_DELTA);
/* * We have a fresh next event. Check whether we can forward the @@ -1678,7 +1679,7 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem) expires = basem; base->is_idle = false; } else { - if (!is_max_delta) + if (base->timers_pending) expires = basem + (u64)(nextevt - basej) * TICK_NSEC; /* * If we expect to sleep more than a tick, mark the base idle. @@ -1961,6 +1962,7 @@ int timers_prepare_cpu(unsigned int cpu) base = per_cpu_ptr(&timer_bases[b], cpu); base->clk = jiffies; base->next_expiry = base->clk + NEXT_TIMER_MAX_DELTA; + base->timers_pending = false; base->is_idle = false; } return 0;
From: Maxime Ripard maxime@cerno.tech
[ Upstream commit 32a19de21ae40f0601f48575b610dde4f518ccc6 ]
The CEC interrupt handlers are registered through the devm_request_threaded_irq function. However, while free_irq is indeed called properly when the device is unbound or bind fails, it's called after unbind or bind is done.
In our particular case, it means that on failure it creates a window where our interrupt handler can be called, but we're freeing every resource (CEC adapter, DRM objects, etc.) it might need.
In order to address this, let's switch to the non-devm variant to control better when the handler will be unregistered and allow us to make it safe.
Fixes: 15b4511a4af6 ("drm/vc4: add HDMI CEC support") Signed-off-by: Maxime Ripard maxime@cerno.tech Reviewed-by: Dave Stevenson dave.stevenson@raspberrypi.com Link: https://patchwork.freedesktop.org/patch/msgid/20210707095112.1469670-2-maxim... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/vc4/vc4_hdmi.c | 49 +++++++++++++++++++++++----------- 1 file changed, 33 insertions(+), 16 deletions(-)
diff --git a/drivers/gpu/drm/vc4/vc4_hdmi.c b/drivers/gpu/drm/vc4/vc4_hdmi.c index 188b74c9e9ff..edee565334d8 100644 --- a/drivers/gpu/drm/vc4/vc4_hdmi.c +++ b/drivers/gpu/drm/vc4/vc4_hdmi.c @@ -1690,38 +1690,46 @@ static int vc4_hdmi_cec_init(struct vc4_hdmi *vc4_hdmi) vc4_hdmi_cec_update_clk_div(vc4_hdmi);
if (vc4_hdmi->variant->external_irq_controller) { - ret = devm_request_threaded_irq(&pdev->dev, - platform_get_irq_byname(pdev, "cec-rx"), - vc4_cec_irq_handler_rx_bare, - vc4_cec_irq_handler_rx_thread, 0, - "vc4 hdmi cec rx", vc4_hdmi); + ret = request_threaded_irq(platform_get_irq_byname(pdev, "cec-rx"), + vc4_cec_irq_handler_rx_bare, + vc4_cec_irq_handler_rx_thread, 0, + "vc4 hdmi cec rx", vc4_hdmi); if (ret) goto err_delete_cec_adap;
- ret = devm_request_threaded_irq(&pdev->dev, - platform_get_irq_byname(pdev, "cec-tx"), - vc4_cec_irq_handler_tx_bare, - vc4_cec_irq_handler_tx_thread, 0, - "vc4 hdmi cec tx", vc4_hdmi); + ret = request_threaded_irq(platform_get_irq_byname(pdev, "cec-tx"), + vc4_cec_irq_handler_tx_bare, + vc4_cec_irq_handler_tx_thread, 0, + "vc4 hdmi cec tx", vc4_hdmi); if (ret) - goto err_delete_cec_adap; + goto err_remove_cec_rx_handler; } else { HDMI_WRITE(HDMI_CEC_CPU_MASK_SET, 0xffffffff);
- ret = devm_request_threaded_irq(&pdev->dev, platform_get_irq(pdev, 0), - vc4_cec_irq_handler, - vc4_cec_irq_handler_thread, 0, - "vc4 hdmi cec", vc4_hdmi); + ret = request_threaded_irq(platform_get_irq(pdev, 0), + vc4_cec_irq_handler, + vc4_cec_irq_handler_thread, 0, + "vc4 hdmi cec", vc4_hdmi); if (ret) goto err_delete_cec_adap; }
ret = cec_register_adapter(vc4_hdmi->cec_adap, &pdev->dev); if (ret < 0) - goto err_delete_cec_adap; + goto err_remove_handlers;
return 0;
+err_remove_handlers: + if (vc4_hdmi->variant->external_irq_controller) + free_irq(platform_get_irq_byname(pdev, "cec-tx"), vc4_hdmi); + else + free_irq(platform_get_irq(pdev, 0), vc4_hdmi); + +err_remove_cec_rx_handler: + if (vc4_hdmi->variant->external_irq_controller) + free_irq(platform_get_irq_byname(pdev, "cec-rx"), vc4_hdmi); + err_delete_cec_adap: cec_delete_adapter(vc4_hdmi->cec_adap);
@@ -1730,6 +1738,15 @@ err_delete_cec_adap:
static void vc4_hdmi_cec_exit(struct vc4_hdmi *vc4_hdmi) { + struct platform_device *pdev = vc4_hdmi->pdev; + + if (vc4_hdmi->variant->external_irq_controller) { + free_irq(platform_get_irq_byname(pdev, "cec-rx"), vc4_hdmi); + free_irq(platform_get_irq_byname(pdev, "cec-tx"), vc4_hdmi); + } else { + free_irq(platform_get_irq(pdev, 0), vc4_hdmi); + } + cec_unregister_adapter(vc4_hdmi->cec_adap); } #else
From: Geert Uytterhoeven geert+renesas@glider.be
[ Upstream commit 99bb2ebab953435852340cdb198c5abbf0bb5dd3 ]
Making global2 support mandatory removed the Kconfig symbol NET_DSA_MV88E6XXX_GLOBAL2. This symbol also served as an intermediate symbol to make NET_DSA_MV88E6XXX_PTP depend on NET_DSA_MV88E6XXX. With the symbol removed, the user is always asked about PTP support for Marvell 88E6xxx switches, even if the latter support is not enabled.
Fix this by reinstating the dependency.
Fixes: 63368a7416df144b ("net: dsa: mv88e6xxx: Make global2 support mandatory") Signed-off-by: Geert Uytterhoeven geert+renesas@glider.be Reviewed-by: Andrew Lunn andrew@lunn.ch Reviewed-by: Vladimir Oltean olteanv@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/mv88e6xxx/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/dsa/mv88e6xxx/Kconfig b/drivers/net/dsa/mv88e6xxx/Kconfig index 05af632b0f59..634a48e6616b 100644 --- a/drivers/net/dsa/mv88e6xxx/Kconfig +++ b/drivers/net/dsa/mv88e6xxx/Kconfig @@ -12,7 +12,7 @@ config NET_DSA_MV88E6XXX config NET_DSA_MV88E6XXX_PTP bool "PTP support for Marvell 88E6xxx" default n - depends on PTP_1588_CLOCK + depends on NET_DSA_MV88E6XXX && PTP_1588_CLOCK help Say Y to enable PTP hardware timestamping on Marvell 88E6xxx switch chips that support it.
From: Colin Ian King colin.king@canonical.com
[ Upstream commit e7efc2ce3d0789cd7c21b70ff00cd7838d382639 ]
Shifting the u16 integer oct->pcie_port by CN23XX_PKT_INPUT_CTL_MAC_NUM_POS (29) bits will be promoted to a 32 bit signed int and then sign-extended to a u64. In the cases where oct->pcie_port where bit 2 is set (e.g. 3..7) the shifted value will be sign extended and the top 32 bits of the result will be set.
Fix this by casting the u16 values to a u64 before the 29 bit left shift.
Addresses-Coverity: ("Unintended sign extension")
Fixes: 3451b97cce2d ("liquidio: CN23XX register setup") Signed-off-by: Colin Ian King colin.king@canonical.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.c b/drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.c index 4cddd628d41b..9ed3d1ab2ca5 100644 --- a/drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.c +++ b/drivers/net/ethernet/cavium/liquidio/cn23xx_pf_device.c @@ -420,7 +420,7 @@ static int cn23xx_pf_setup_global_input_regs(struct octeon_device *oct) * bits 32:47 indicate the PVF num. */ for (q_no = 0; q_no < ern; q_no++) { - reg_val = oct->pcie_port << CN23XX_PKT_INPUT_CTL_MAC_NUM_POS; + reg_val = (u64)oct->pcie_port << CN23XX_PKT_INPUT_CTL_MAC_NUM_POS;
/* for VF assigned queues. */ if (q_no < oct->sriov_info.pf_srn) {
From: Colin Ian King colin.king@canonical.com
[ Upstream commit 91091656252f5d6d8c476e0c92776ce9fae7b445 ]
Currently array jit->seen_reg[r1] is being accessed before the range checking of index r1. The range changing on r1 should be performed first since it will avoid any potential out-of-range accesses on the array seen_reg[] and also it is more optimal to perform checks on r1 before fetching data from the array. Fix this by swapping the order of the checks before the array access.
Fixes: 054623105728 ("s390/bpf: Add s390x eBPF JIT compiler backend") Signed-off-by: Colin Ian King colin.king@canonical.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Tested-by: Ilya Leoshkevich iii@linux.ibm.com Acked-by: Ilya Leoshkevich iii@linux.ibm.com Link: https://lore.kernel.org/bpf/20210715125712.24690-1-colin.king@canonical.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/net/bpf_jit_comp.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/s390/net/bpf_jit_comp.c b/arch/s390/net/bpf_jit_comp.c index 63cae0476bb4..2ae419f5115a 100644 --- a/arch/s390/net/bpf_jit_comp.c +++ b/arch/s390/net/bpf_jit_comp.c @@ -112,7 +112,7 @@ static inline void reg_set_seen(struct bpf_jit *jit, u32 b1) { u32 r1 = reg2hex[b1];
- if (!jit->seen_reg[r1] && r1 >= 6 && r1 <= 15) + if (r1 >= 6 && r1 <= 15 && !jit->seen_reg[r1]) jit->seen_reg[r1] = 1; }
From: John Fastabend john.fastabend@gmail.com
[ Upstream commit 7e6b27a69167f97c56b5437871d29e9722c3e470 ]
If skb_linearize is needed and fails we could leak a msg on the error handling. To fix ensure we kfree the msg block before returning error. Found during code review.
Fixes: 4363023d2668e ("bpf, sockmap: Avoid failures from skb_to_sgvec when skb has frag_list") Signed-off-by: John Fastabend john.fastabend@gmail.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Reviewed-by: Cong Wang cong.wang@bytedance.com Link: https://lore.kernel.org/bpf/20210712195546.423990-2-john.fastabend@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/core/skmsg.c | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/net/core/skmsg.c b/net/core/skmsg.c index 539c83a45665..b2410a1bfa23 100644 --- a/net/core/skmsg.c +++ b/net/core/skmsg.c @@ -531,10 +531,8 @@ static int sk_psock_skb_ingress_enqueue(struct sk_buff *skb, if (skb_linearize(skb)) return -EAGAIN; num_sge = skb_to_sgvec(skb, msg->sg.data, 0, skb->len); - if (unlikely(num_sge < 0)) { - kfree(msg); + if (unlikely(num_sge < 0)) return num_sge; - }
copied = skb->len; msg->sg.start = 0; @@ -553,6 +551,7 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb) { struct sock *sk = psock->sk; struct sk_msg *msg; + int err;
/* If we are receiving on the same sock skb->sk is already assigned, * skip memory accounting and owner transition seeing it already set @@ -571,7 +570,10 @@ static int sk_psock_skb_ingress(struct sk_psock *psock, struct sk_buff *skb) * into user buffers. */ skb_set_owner_r(skb, sk); - return sk_psock_skb_ingress_enqueue(skb, psock, sk, msg); + err = sk_psock_skb_ingress_enqueue(skb, psock, sk, msg); + if (err < 0) + kfree(msg); + return err; }
/* Puts an skb on the ingress queue of the socket already assigned to the @@ -582,12 +584,16 @@ static int sk_psock_skb_ingress_self(struct sk_psock *psock, struct sk_buff *skb { struct sk_msg *msg = kzalloc(sizeof(*msg), __GFP_NOWARN | GFP_ATOMIC); struct sock *sk = psock->sk; + int err;
if (unlikely(!msg)) return -EAGAIN; sk_msg_init(msg); skb_set_owner_r(skb, sk); - return sk_psock_skb_ingress_enqueue(skb, psock, sk, msg); + err = sk_psock_skb_ingress_enqueue(skb, psock, sk, msg); + if (err < 0) + kfree(msg); + return err; }
static int sk_psock_handle_skb(struct sk_psock *psock, struct sk_buff *skb,
From: John Fastabend john.fastabend@gmail.com
[ Upstream commit 228a4a7ba8e99bb9ef980b62f71e3be33f4aae69 ]
The proc socket stats use sk_prot->inuse_idx value to record inuse sock stats. We currently do not set this correctly from sockmap side. The result is reading sock stats '/proc/net/sockstat' gives incorrect values. The socket counter is incremented correctly, but because we don't set the counter correctly when we replace sk_prot we may omit the decrement.
To get the correct inuse_idx value move the core_initcall that initializes the TCP proto handlers to late_initcall. This way it is initialized after TCP has the chance to assign the inuse_idx value from the register protocol handler.
Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface") Suggested-by: Jakub Sitnicki jakub@cloudflare.com Signed-off-by: John Fastabend john.fastabend@gmail.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Reviewed-by: Cong Wang cong.wang@bytedance.com Link: https://lore.kernel.org/bpf/20210712195546.423990-3-john.fastabend@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/tcp_bpf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/tcp_bpf.c b/net/ipv4/tcp_bpf.c index ad9d17923fc5..b65201ba4d93 100644 --- a/net/ipv4/tcp_bpf.c +++ b/net/ipv4/tcp_bpf.c @@ -486,7 +486,7 @@ static int __init tcp_bpf_v4_build_proto(void) tcp_bpf_rebuild_protos(tcp_bpf_prots[TCP_BPF_IPV4], &tcp_prot); return 0; } -core_initcall(tcp_bpf_v4_build_proto); +late_initcall(tcp_bpf_v4_build_proto);
static int tcp_bpf_assert_proto_ops(struct proto *ops) {
From: Jakub Sitnicki jakub@cloudflare.com
[ Upstream commit 54ea2f49fd9400dd698c25450be3352b5613b3b4 ]
The proc socket stats use sk_prot->inuse_idx value to record inuse sock stats. We currently do not set this correctly from sockmap side. The result is reading sock stats '/proc/net/sockstat' gives incorrect values. The socket counter is incremented correctly, but because we don't set the counter correctly when we replace sk_prot we may omit the decrement.
To get the correct inuse_idx value move the core_initcall that initializes the UDP proto handlers to late_initcall. This way it is initialized after UDP has the chance to assign the inuse_idx value from the register protocol handler.
Fixes: edc6741cc660 ("bpf: Add sockmap hooks for UDP sockets") Signed-off-by: Jakub Sitnicki jakub@cloudflare.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Reviewed-by: Cong Wang cong.wang@bytedance.com Acked-by: John Fastabend john.fastabend@gmail.com Link: https://lore.kernel.org/bpf/20210714154750.528206-1-jakub@cloudflare.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/udp_bpf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/udp_bpf.c b/net/ipv4/udp_bpf.c index 954c4591a6fd..725b6df4b2a2 100644 --- a/net/ipv4/udp_bpf.c +++ b/net/ipv4/udp_bpf.c @@ -101,7 +101,7 @@ static int __init udp_bpf_v4_build_proto(void) udp_bpf_rebuild_protos(&udp_bpf_prots[UDP_BPF_IPV4], &udp_prot); return 0; } -core_initcall(udp_bpf_v4_build_proto); +late_initcall(udp_bpf_v4_build_proto);
int udp_bpf_update_proto(struct sock *sk, struct sk_psock *psock, bool restore) {
From: Tobias Klauser tklauser@distanz.ch
[ Upstream commit d444b06e40855219ef38b5e9286db16d435f06dc ]
Fix and add a missing NULL check for the prior malloc() call.
Fixes: 49a086c201a9 ("bpftool: implement prog load command") Signed-off-by: Tobias Klauser tklauser@distanz.ch Signed-off-by: Daniel Borkmann daniel@iogearbox.net Reviewed-by: Quentin Monnet quentin@isovalent.com Acked-by: Roman Gushchin guro@fb.com Link: https://lore.kernel.org/bpf/20210715110609.29364-1-tklauser@distanz.ch Signed-off-by: Sasha Levin sashal@kernel.org --- tools/bpf/bpftool/common.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/tools/bpf/bpftool/common.c b/tools/bpf/bpftool/common.c index 1828bba19020..dc6daa193557 100644 --- a/tools/bpf/bpftool/common.c +++ b/tools/bpf/bpftool/common.c @@ -222,6 +222,11 @@ int mount_bpffs_for_pin(const char *name) int err = 0;
file = malloc(strlen(name) + 1); + if (!file) { + p_err("mem alloc failed"); + return -1; + } + strcpy(file, name); dir = dirname(file);
From: Ziyang Xuan william.xuanziyang@huawei.com
[ Upstream commit 991e634360f2622a683b48dfe44fe6d9cb765a09 ]
When nr_segs equal to zero in iovec_from_user, the object msg->msg_iter.iov is uninit stack memory in caif_seqpkt_sendmsg which is defined in ___sys_sendmsg. So we cann't just judge msg->msg_iter.iov->base directlly. We can use nr_segs to judge msg in caif_seqpkt_sendmsg whether has data buffers.
===================================================== BUG: KMSAN: uninit-value in caif_seqpkt_sendmsg+0x693/0xf60 net/caif/caif_socket.c:542 Call Trace: __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x1c9/0x220 lib/dump_stack.c:118 kmsan_report+0xf7/0x1e0 mm/kmsan/kmsan_report.c:118 __msan_warning+0x58/0xa0 mm/kmsan/kmsan_instr.c:215 caif_seqpkt_sendmsg+0x693/0xf60 net/caif/caif_socket.c:542 sock_sendmsg_nosec net/socket.c:652 [inline] sock_sendmsg net/socket.c:672 [inline] ____sys_sendmsg+0x12b6/0x1350 net/socket.c:2343 ___sys_sendmsg net/socket.c:2397 [inline] __sys_sendmmsg+0x808/0xc90 net/socket.c:2480 __compat_sys_sendmmsg net/compat.c:656 [inline]
Reported-by: syzbot+09a5d591c1f98cf5efcb@syzkaller.appspotmail.com Link: https://syzkaller.appspot.com/bug?id=1ace85e8fc9b0d5a45c08c2656c3e91762daa9b... Fixes: bece7b2398d0 ("caif: Rewritten socket implementation") Signed-off-by: Ziyang Xuan william.xuanziyang@huawei.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/caif/caif_socket.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/net/caif/caif_socket.c b/net/caif/caif_socket.c index 3ad0a1df6712..9d26c5e9da05 100644 --- a/net/caif/caif_socket.c +++ b/net/caif/caif_socket.c @@ -539,7 +539,8 @@ static int caif_seqpkt_sendmsg(struct socket *sock, struct msghdr *msg, goto err;
ret = -EINVAL; - if (unlikely(msg->msg_iter.iov->iov_base == NULL)) + if (unlikely(msg->msg_iter.nr_segs == 0) || + unlikely(msg->msg_iter.iov->iov_base == NULL)) goto err; noblock = msg->msg_flags & MSG_DONTWAIT;
From: Yoshitaka Ikeda ikeda@nskint.co.jp
[ Upstream commit 55cef88bbf12f3bfbe5c2379a8868a034707e755 ]
Fix below division by zero warning: - Added an if statement because buswidth can be zero, resulting in division by zero. - The modified code was based on another driver (atmel-quadspi).
[ 0.795337] Division by zero in kernel. : [ 0.834051] [<807fd40c>] (__div0) from [<804e1acc>] (Ldiv0+0x8/0x10) [ 0.839097] [<805f0710>] (cqspi_exec_mem_op) from [<805edb4c>] (spi_mem_exec_op+0x3b0/0x3f8)
Fixes: 7512eaf54190 ("spi: cadence-quadspi: Fix dummy cycle calculation when buswidth > 1") Signed-off-by: Yoshitaka Ikeda ikeda@nskint.co.jp Link: https://lore.kernel.org/r/ed989af6-da88-4e0b-9ed8-126db6cad2e4@nskint.co.jp Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/spi/spi-cadence-quadspi.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/spi/spi-cadence-quadspi.c b/drivers/spi/spi-cadence-quadspi.c index 7a00346ff9b9..13d1f0ce618e 100644 --- a/drivers/spi/spi-cadence-quadspi.c +++ b/drivers/spi/spi-cadence-quadspi.c @@ -307,11 +307,13 @@ static unsigned int cqspi_calc_rdreg(struct cqspi_flash_pdata *f_pdata)
static unsigned int cqspi_calc_dummy(const struct spi_mem_op *op, bool dtr) { - unsigned int dummy_clk; + unsigned int dummy_clk = 0;
- dummy_clk = op->dummy.nbytes * (8 / op->dummy.buswidth); - if (dtr) - dummy_clk /= 2; + if (op->dummy.buswidth && op->dummy.nbytes) { + dummy_clk = op->dummy.nbytes * (8 / op->dummy.buswidth); + if (dtr) + dummy_clk /= 2; + }
return dummy_clk; }
From: Dongliang Mu mudongliangabcd@gmail.com
[ Upstream commit a6ecfb39ba9d7316057cea823b196b734f6b18ca ]
The current error handling code of hso_create_net_device is hso_free_net_device, no matter which errors lead to. For example, WARNING in hso_free_net_device [1].
Fix this by refactoring the error handling code of hso_create_net_device by handling different errors by different code.
[1] https://syzkaller.appspot.com/bug?id=66eff8d49af1b28370ad342787413e35bbe76ef...
Reported-by: syzbot+44d53c7255bb1aea22d2@syzkaller.appspotmail.com Fixes: 5fcfb6d0bfcd ("hso: fix bailout in error case of probe") Signed-off-by: Dongliang Mu mudongliangabcd@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/usb/hso.c | 33 +++++++++++++++++++++++---------- 1 file changed, 23 insertions(+), 10 deletions(-)
diff --git a/drivers/net/usb/hso.c b/drivers/net/usb/hso.c index 5c779cc0ea11..28ebf4955b83 100644 --- a/drivers/net/usb/hso.c +++ b/drivers/net/usb/hso.c @@ -2496,7 +2496,7 @@ static struct hso_device *hso_create_net_device(struct usb_interface *interface, hso_net_init); if (!net) { dev_err(&interface->dev, "Unable to create ethernet device\n"); - goto exit; + goto err_hso_dev; }
hso_net = netdev_priv(net); @@ -2509,13 +2509,13 @@ static struct hso_device *hso_create_net_device(struct usb_interface *interface, USB_DIR_IN); if (!hso_net->in_endp) { dev_err(&interface->dev, "Can't find BULK IN endpoint\n"); - goto exit; + goto err_net; } hso_net->out_endp = hso_get_ep(interface, USB_ENDPOINT_XFER_BULK, USB_DIR_OUT); if (!hso_net->out_endp) { dev_err(&interface->dev, "Can't find BULK OUT endpoint\n"); - goto exit; + goto err_net; } SET_NETDEV_DEV(net, &interface->dev); SET_NETDEV_DEVTYPE(net, &hso_type); @@ -2524,18 +2524,18 @@ static struct hso_device *hso_create_net_device(struct usb_interface *interface, for (i = 0; i < MUX_BULK_RX_BUF_COUNT; i++) { hso_net->mux_bulk_rx_urb_pool[i] = usb_alloc_urb(0, GFP_KERNEL); if (!hso_net->mux_bulk_rx_urb_pool[i]) - goto exit; + goto err_mux_bulk_rx; hso_net->mux_bulk_rx_buf_pool[i] = kzalloc(MUX_BULK_RX_BUF_SIZE, GFP_KERNEL); if (!hso_net->mux_bulk_rx_buf_pool[i]) - goto exit; + goto err_mux_bulk_rx; } hso_net->mux_bulk_tx_urb = usb_alloc_urb(0, GFP_KERNEL); if (!hso_net->mux_bulk_tx_urb) - goto exit; + goto err_mux_bulk_rx; hso_net->mux_bulk_tx_buf = kzalloc(MUX_BULK_TX_BUF_SIZE, GFP_KERNEL); if (!hso_net->mux_bulk_tx_buf) - goto exit; + goto err_free_tx_urb;
add_net_device(hso_dev);
@@ -2543,7 +2543,7 @@ static struct hso_device *hso_create_net_device(struct usb_interface *interface, result = register_netdev(net); if (result) { dev_err(&interface->dev, "Failed to register device\n"); - goto exit; + goto err_free_tx_buf; }
hso_log_port(hso_dev); @@ -2551,8 +2551,21 @@ static struct hso_device *hso_create_net_device(struct usb_interface *interface, hso_create_rfkill(hso_dev, interface);
return hso_dev; -exit: - hso_free_net_device(hso_dev, true); + +err_free_tx_buf: + remove_net_device(hso_dev); + kfree(hso_net->mux_bulk_tx_buf); +err_free_tx_urb: + usb_free_urb(hso_net->mux_bulk_tx_urb); +err_mux_bulk_rx: + for (i = 0; i < MUX_BULK_RX_BUF_COUNT; i++) { + usb_free_urb(hso_net->mux_bulk_rx_urb_pool[i]); + kfree(hso_net->mux_bulk_rx_buf_pool[i]); + } +err_net: + free_netdev(net); +err_hso_dev: + kfree(hso_dev); return NULL; }
From: Roman Skakun Roman_Skakun@epam.com
[ Upstream commit 40ac971eab89330d6153e7721e88acd2d98833f9 ]
xen-swiotlb can use vmalloc backed addresses for dma coherent allocations and uses the common helpers. Properly handle them to unbreak Xen on ARM platforms.
Fixes: 1b65c4e5a9af ("swiotlb-xen: use xen_alloc/free_coherent_pages") Signed-off-by: Roman Skakun roman_skakun@epam.com Reviewed-by: Andrii Anisov andrii_anisov@epam.com [hch: split the patch, renamed the helpers] Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/dma/ops_helpers.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/kernel/dma/ops_helpers.c b/kernel/dma/ops_helpers.c index 910ae69cae77..af4a6ef48ce0 100644 --- a/kernel/dma/ops_helpers.c +++ b/kernel/dma/ops_helpers.c @@ -5,6 +5,13 @@ */ #include <linux/dma-map-ops.h>
+static struct page *dma_common_vaddr_to_page(void *cpu_addr) +{ + if (is_vmalloc_addr(cpu_addr)) + return vmalloc_to_page(cpu_addr); + return virt_to_page(cpu_addr); +} + /* * Create scatter-list for the already allocated DMA buffer. */ @@ -12,7 +19,7 @@ int dma_common_get_sgtable(struct device *dev, struct sg_table *sgt, void *cpu_addr, dma_addr_t dma_addr, size_t size, unsigned long attrs) { - struct page *page = virt_to_page(cpu_addr); + struct page *page = dma_common_vaddr_to_page(cpu_addr); int ret;
ret = sg_alloc_table(sgt, 1, GFP_KERNEL); @@ -32,6 +39,7 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct *vma, unsigned long user_count = vma_pages(vma); unsigned long count = PAGE_ALIGN(size) >> PAGE_SHIFT; unsigned long off = vma->vm_pgoff; + struct page *page = dma_common_vaddr_to_page(cpu_addr); int ret = -ENXIO;
vma->vm_page_prot = dma_pgprot(dev, vma->vm_page_prot, attrs); @@ -43,7 +51,7 @@ int dma_common_mmap(struct device *dev, struct vm_area_struct *vma, return -ENXIO;
return remap_pfn_range(vma, vma->vm_start, - page_to_pfn(virt_to_page(cpu_addr)) + vma->vm_pgoff, + page_to_pfn(page) + vma->vm_pgoff, user_count << PAGE_SHIFT, vma->vm_page_prot); #else return -ENXIO;
From: Vijendar Mukunda vijendar.mukunda@amd.com
[ Upstream commit 59dd33f82dc0975c55d3d46801e7ca45532d7673 ]
On stream stop, currently CPU DAI stop sequence invoked first followed by DMA. For Few platforms, it is required to stop the DMA first before stopping CPU DAI.
Introduced new flag in dai_link structure for reordering stop sequence. Based on flag check, ASoC core will re-order the stop sequence.
Fixes: 4378f1fbe92405 ("ASoC: soc-pcm: Use different sequence for start/stop trigger") Signed-off-by: Vijendar Mukunda Vijendar.Mukunda@amd.com Link: https://lore.kernel.org/r/20210716123015.15697-1-vijendar.mukunda@amd.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/sound/soc.h | 6 ++++++ sound/soc/soc-pcm.c | 22 ++++++++++++++++------ 2 files changed, 22 insertions(+), 6 deletions(-)
diff --git a/include/sound/soc.h b/include/sound/soc.h index e746da996351..723eeb1c3f78 100644 --- a/include/sound/soc.h +++ b/include/sound/soc.h @@ -712,6 +712,12 @@ struct snd_soc_dai_link { /* Do not create a PCM for this DAI link (Backend link) */ unsigned int ignore:1;
+ /* This flag will reorder stop sequence. By enabling this flag + * DMA controller stop sequence will be invoked first followed by + * CPU DAI driver stop sequence + */ + unsigned int stop_dma_first:1; + #ifdef CONFIG_SND_SOC_TOPOLOGY struct snd_soc_dobj dobj; /* For topology */ #endif diff --git a/sound/soc/soc-pcm.c b/sound/soc/soc-pcm.c index 46513bb97904..d1c570ca21ea 100644 --- a/sound/soc/soc-pcm.c +++ b/sound/soc/soc-pcm.c @@ -1015,6 +1015,7 @@ out:
static int soc_pcm_trigger(struct snd_pcm_substream *substream, int cmd) { + struct snd_soc_pcm_runtime *rtd = asoc_substream_to_rtd(substream); int ret = -EINVAL, _ret = 0; int rollback = 0;
@@ -1055,14 +1056,23 @@ start_err: case SNDRV_PCM_TRIGGER_STOP: case SNDRV_PCM_TRIGGER_SUSPEND: case SNDRV_PCM_TRIGGER_PAUSE_PUSH: - ret = snd_soc_pcm_dai_trigger(substream, cmd, rollback); - if (ret < 0) - break; + if (rtd->dai_link->stop_dma_first) { + ret = snd_soc_pcm_component_trigger(substream, cmd, rollback); + if (ret < 0) + break;
- ret = snd_soc_pcm_component_trigger(substream, cmd, rollback); - if (ret < 0) - break; + ret = snd_soc_pcm_dai_trigger(substream, cmd, rollback); + if (ret < 0) + break; + } else { + ret = snd_soc_pcm_dai_trigger(substream, cmd, rollback); + if (ret < 0) + break;
+ ret = snd_soc_pcm_component_trigger(substream, cmd, rollback); + if (ret < 0) + break; + } ret = snd_soc_link_trigger(substream, cmd, rollback); break; }
From: Michal Suchanek msuchanek@suse.de
[ Upstream commit 674a9f1f6815849bfb5bf385e7da8fc198aaaba9 ]
Missing TPM final event log table is not a firmware bug.
Clearly if providing event log in the old format makes the final event log invalid it should not be provided at least in that case.
Fixes: b4f1874c6216 ("tpm: check event log version before reading final events") Signed-off-by: Michal Suchanek msuchanek@suse.de Reviewed-by: Jarkko Sakkinen jarkko@kernel.org Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/firmware/efi/tpm.c | 8 +++++--- 1 file changed, 5 insertions(+), 3 deletions(-)
diff --git a/drivers/firmware/efi/tpm.c b/drivers/firmware/efi/tpm.c index c1955d320fec..8f665678e9e3 100644 --- a/drivers/firmware/efi/tpm.c +++ b/drivers/firmware/efi/tpm.c @@ -62,9 +62,11 @@ int __init efi_tpm_eventlog_init(void) tbl_size = sizeof(*log_tbl) + log_tbl->size; memblock_reserve(efi.tpm_log, tbl_size);
- if (efi.tpm_final_log == EFI_INVALID_TABLE_ADDR || - log_tbl->version != EFI_TCG2_EVENT_LOG_FORMAT_TCG_2) { - pr_warn(FW_BUG "TPM Final Events table missing or invalid\n"); + if (efi.tpm_final_log == EFI_INVALID_TABLE_ADDR) { + pr_info("TPM Final Events table not present\n"); + goto out; + } else if (log_tbl->version != EFI_TCG2_EVENT_LOG_FORMAT_TCG_2) { + pr_warn(FW_BUG "TPM Final Events table invalid\n"); goto out; }
From: Yajun Deng yajun.deng@linux.dev
[ Upstream commit 5f119ba1d5771bbf46d57cff7417dcd84d3084ba ]
The release_sock() is blocking function, it would change the state after sleeping. use wait_woken() instead.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Signed-off-by: Yajun Deng yajun.deng@linux.dev Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/decnet/af_decnet.c | 27 ++++++++++++--------------- 1 file changed, 12 insertions(+), 15 deletions(-)
diff --git a/net/decnet/af_decnet.c b/net/decnet/af_decnet.c index 5dbd45dc35ad..dc92a67baea3 100644 --- a/net/decnet/af_decnet.c +++ b/net/decnet/af_decnet.c @@ -816,7 +816,7 @@ static int dn_auto_bind(struct socket *sock) static int dn_confirm_accept(struct sock *sk, long *timeo, gfp_t allocation) { struct dn_scp *scp = DN_SK(sk); - DEFINE_WAIT(wait); + DEFINE_WAIT_FUNC(wait, woken_wake_function); int err;
if (scp->state != DN_CR) @@ -826,11 +826,11 @@ static int dn_confirm_accept(struct sock *sk, long *timeo, gfp_t allocation) scp->segsize_loc = dst_metric_advmss(__sk_dst_get(sk)); dn_send_conn_conf(sk, allocation);
- prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE); + add_wait_queue(sk_sleep(sk), &wait); for(;;) { release_sock(sk); if (scp->state == DN_CC) - *timeo = schedule_timeout(*timeo); + *timeo = wait_woken(&wait, TASK_INTERRUPTIBLE, *timeo); lock_sock(sk); err = 0; if (scp->state == DN_RUN) @@ -844,9 +844,8 @@ static int dn_confirm_accept(struct sock *sk, long *timeo, gfp_t allocation) err = -EAGAIN; if (!*timeo) break; - prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE); } - finish_wait(sk_sleep(sk), &wait); + remove_wait_queue(sk_sleep(sk), &wait); if (err == 0) { sk->sk_socket->state = SS_CONNECTED; } else if (scp->state != DN_CC) { @@ -858,7 +857,7 @@ static int dn_confirm_accept(struct sock *sk, long *timeo, gfp_t allocation) static int dn_wait_run(struct sock *sk, long *timeo) { struct dn_scp *scp = DN_SK(sk); - DEFINE_WAIT(wait); + DEFINE_WAIT_FUNC(wait, woken_wake_function); int err = 0;
if (scp->state == DN_RUN) @@ -867,11 +866,11 @@ static int dn_wait_run(struct sock *sk, long *timeo) if (!*timeo) return -EALREADY;
- prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE); + add_wait_queue(sk_sleep(sk), &wait); for(;;) { release_sock(sk); if (scp->state == DN_CI || scp->state == DN_CC) - *timeo = schedule_timeout(*timeo); + *timeo = wait_woken(&wait, TASK_INTERRUPTIBLE, *timeo); lock_sock(sk); err = 0; if (scp->state == DN_RUN) @@ -885,9 +884,8 @@ static int dn_wait_run(struct sock *sk, long *timeo) err = -ETIMEDOUT; if (!*timeo) break; - prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE); } - finish_wait(sk_sleep(sk), &wait); + remove_wait_queue(sk_sleep(sk), &wait); out: if (err == 0) { sk->sk_socket->state = SS_CONNECTED; @@ -1032,16 +1030,16 @@ static void dn_user_copy(struct sk_buff *skb, struct optdata_dn *opt)
static struct sk_buff *dn_wait_for_connect(struct sock *sk, long *timeo) { - DEFINE_WAIT(wait); + DEFINE_WAIT_FUNC(wait, woken_wake_function); struct sk_buff *skb = NULL; int err = 0;
- prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE); + add_wait_queue(sk_sleep(sk), &wait); for(;;) { release_sock(sk); skb = skb_dequeue(&sk->sk_receive_queue); if (skb == NULL) { - *timeo = schedule_timeout(*timeo); + *timeo = wait_woken(&wait, TASK_INTERRUPTIBLE, *timeo); skb = skb_dequeue(&sk->sk_receive_queue); } lock_sock(sk); @@ -1056,9 +1054,8 @@ static struct sk_buff *dn_wait_for_connect(struct sock *sk, long *timeo) err = -EAGAIN; if (!*timeo) break; - prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE); } - finish_wait(sk_sleep(sk), &wait); + remove_wait_queue(sk_sleep(sk), &wait);
return skb == NULL ? ERR_PTR(err) : skb; }
From: Nicholas Piggin npiggin@gmail.com
[ Upstream commit bd31ecf44b8e18ccb1e5f6b50f85de6922a60de3 ]
When running CPU_FTR_P9_TM_HV_ASSIST, HFSCR[TM] is set for the guest even if the host has CONFIG_TRANSACTIONAL_MEM=n, which causes it to be unprepared to handle guest exits while transactional.
Normal guests don't have a problem because the HTM capability will not be advertised, but a rogue or buggy one could crash the host.
Fixes: 4bb3c7a0208f ("KVM: PPC: Book3S HV: Work around transactional memory bugs in POWER9") Reported-by: Alexey Kardashevskiy aik@ozlabs.ru Signed-off-by: Nicholas Piggin npiggin@gmail.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20210716024310.164448-1-npiggin@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/kvm/book3s_hv.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/arch/powerpc/kvm/book3s_hv.c b/arch/powerpc/kvm/book3s_hv.c index 67cc164c4ac1..395f98158e81 100644 --- a/arch/powerpc/kvm/book3s_hv.c +++ b/arch/powerpc/kvm/book3s_hv.c @@ -2445,8 +2445,10 @@ static int kvmppc_core_vcpu_create_hv(struct kvm_vcpu *vcpu) HFSCR_DSCR | HFSCR_VECVSX | HFSCR_FP | HFSCR_PREFIX; if (cpu_has_feature(CPU_FTR_HVMODE)) { vcpu->arch.hfscr &= mfspr(SPRN_HFSCR); +#ifdef CONFIG_PPC_TRANSACTIONAL_MEM if (cpu_has_feature(CPU_FTR_P9_TM_HV_ASSIST)) vcpu->arch.hfscr |= HFSCR_TM; +#endif } if (cpu_has_feature(CPU_FTR_TM_COMP)) vcpu->arch.hfscr |= HFSCR_TM;
From: Nicholas Piggin npiggin@gmail.com
[ Upstream commit bc4188a2f56e821ea057aca6bf444e138d06c252 ]
vcpu_put is not called if the user copy fails. This can result in preempt notifier corruption and crashes, among other issues.
Fixes: b3cebfe8c1ca ("KVM: PPC: Move vcpu_load/vcpu_put down to each ioctl case in kvm_arch_vcpu_ioctl") Reported-by: Alexey Kardashevskiy aik@ozlabs.ru Signed-off-by: Nicholas Piggin npiggin@gmail.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://lore.kernel.org/r/20210716024310.164448-2-npiggin@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/kvm/powerpc.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/powerpc/kvm/powerpc.c b/arch/powerpc/kvm/powerpc.c index a2a68a958fa0..6e4f03c02a0a 100644 --- a/arch/powerpc/kvm/powerpc.c +++ b/arch/powerpc/kvm/powerpc.c @@ -2045,9 +2045,9 @@ long kvm_arch_vcpu_ioctl(struct file *filp, { struct kvm_enable_cap cap; r = -EFAULT; - vcpu_load(vcpu); if (copy_from_user(&cap, argp, sizeof(cap))) goto out; + vcpu_load(vcpu); r = kvm_vcpu_ioctl_enable_cap(vcpu, &cap); vcpu_put(vcpu); break; @@ -2071,9 +2071,9 @@ long kvm_arch_vcpu_ioctl(struct file *filp, case KVM_DIRTY_TLB: { struct kvm_dirty_tlb dirty; r = -EFAULT; - vcpu_load(vcpu); if (copy_from_user(&dirty, argp, sizeof(dirty))) goto out; + vcpu_load(vcpu); r = kvm_vcpu_ioctl_dirty_tlb(vcpu, &dirty); vcpu_put(vcpu); break;
From: Pavel Skripkin paskripkin@gmail.com
[ Upstream commit f5051bcece50140abd1a11a2d36dc3ec5484fc32 ]
Syzbot reported memory leak in tcindex_set_parms(). The problem was in non-freed perfect hash in tcindex_partial_destroy_work().
In tcindex_set_parms() new tcindex_data is allocated and some fields from old one are copied to new one, but not the perfect hash. Since tcindex_partial_destroy_work() is the destroy function for old tcindex_data, we need to free perfect hash to avoid memory leak.
Reported-and-tested-by: syzbot+f0bbb2287b8993d4fa74@syzkaller.appspotmail.com Fixes: 331b72922c5f ("net: sched: RCU cls_tcindex") Signed-off-by: Pavel Skripkin paskripkin@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/cls_tcindex.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/net/sched/cls_tcindex.c b/net/sched/cls_tcindex.c index 5b274534264c..e9a8a2c86bbd 100644 --- a/net/sched/cls_tcindex.c +++ b/net/sched/cls_tcindex.c @@ -278,6 +278,8 @@ static int tcindex_filter_result_init(struct tcindex_filter_result *r, TCA_TCINDEX_POLICE); }
+static void tcindex_free_perfect_hash(struct tcindex_data *cp); + static void tcindex_partial_destroy_work(struct work_struct *work) { struct tcindex_data *p = container_of(to_rcu_work(work), @@ -285,7 +287,8 @@ static void tcindex_partial_destroy_work(struct work_struct *work) rwork);
rtnl_lock(); - kfree(p->perfect); + if (p->perfect) + tcindex_free_perfect_hash(p); kfree(p); rtnl_unlock(); }
From: Xin Long lucien.xin@gmail.com
[ Upstream commit 2f3fdd8d4805015fa964807e1c7f3d88f31bd389 ]
After commit ca84bd058dae ("sctp: copy the optval from user space in sctp_setsockopt"), it does memory allocation in sctp_setsockopt with the optlen, and it would fail the allocation and return error if the optlen from user space is a huge value.
This breaks some sockopts, like SCTP_HMAC_IDENT, SCTP_RESET_STREAMS and SCTP_AUTH_KEY, as when processing these sockopts before, optlen would be trimmed to a biggest value it needs when optlen is a huge value, instead of failing the allocation and returning error.
This patch is to fix the allocation failure when it's a huge optlen from user space by trimming it to the biggest size sctp sockopt may need when necessary, and this biggest size is from sctp_setsockopt_reset_streams() for SCTP_RESET_STREAMS, which is bigger than those for SCTP_HMAC_IDENT and SCTP_AUTH_KEY.
Fixes: ca84bd058dae ("sctp: copy the optval from user space in sctp_setsockopt") Signed-off-by: Xin Long lucien.xin@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/sctp/socket.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/net/sctp/socket.c b/net/sctp/socket.c index a79d193ff872..dbd074f4d450 100644 --- a/net/sctp/socket.c +++ b/net/sctp/socket.c @@ -4521,6 +4521,10 @@ static int sctp_setsockopt(struct sock *sk, int level, int optname, }
if (optlen > 0) { + /* Trim it to the biggest size sctp sockopt may need if necessary */ + optlen = min_t(unsigned int, optlen, + PAGE_ALIGN(USHRT_MAX + + sizeof(__u16) * sizeof(struct sctp_reset_streams))); kopt = memdup_sockptr(optval, optlen); if (IS_ERR(kopt)) return PTR_ERR(kopt);
From: Nguyen Dinh Phi phind.uet@gmail.com
[ Upstream commit 517a16b1a88bdb6b530f48d5d153478b2552d9a8 ]
Commit 63346650c1a9 ("netrom: switch to sock timer API") switched to use sock timer API. It replaces mod_timer() by sk_reset_timer(), and del_timer() by sk_stop_timer().
Function sk_reset_timer() will increase the refcount of sock if it is called on an inactive timer, hence, in case the timer expires, we need to decrease the refcount ourselves in the handler, otherwise, the sock refcount will be unbalanced and the sock will never be freed.
Signed-off-by: Nguyen Dinh Phi phind.uet@gmail.com Reported-by: syzbot+10f1194569953b72f1ae@syzkaller.appspotmail.com Fixes: 63346650c1a9 ("netrom: switch to sock timer API") Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/netrom/nr_timer.c | 20 +++++++++++--------- 1 file changed, 11 insertions(+), 9 deletions(-)
diff --git a/net/netrom/nr_timer.c b/net/netrom/nr_timer.c index 9115f8a7dd45..a8da88db7893 100644 --- a/net/netrom/nr_timer.c +++ b/net/netrom/nr_timer.c @@ -121,11 +121,9 @@ static void nr_heartbeat_expiry(struct timer_list *t) is accepted() it isn't 'dead' so doesn't get removed. */ if (sock_flag(sk, SOCK_DESTROY) || (sk->sk_state == TCP_LISTEN && sock_flag(sk, SOCK_DEAD))) { - sock_hold(sk); bh_unlock_sock(sk); nr_destroy_socket(sk); - sock_put(sk); - return; + goto out; } break;
@@ -146,6 +144,8 @@ static void nr_heartbeat_expiry(struct timer_list *t)
nr_start_heartbeat(sk); bh_unlock_sock(sk); +out: + sock_put(sk); }
static void nr_t2timer_expiry(struct timer_list *t) @@ -159,6 +159,7 @@ static void nr_t2timer_expiry(struct timer_list *t) nr_enquiry_response(sk); } bh_unlock_sock(sk); + sock_put(sk); }
static void nr_t4timer_expiry(struct timer_list *t) @@ -169,6 +170,7 @@ static void nr_t4timer_expiry(struct timer_list *t) bh_lock_sock(sk); nr_sk(sk)->condition &= ~NR_COND_PEER_RX_BUSY; bh_unlock_sock(sk); + sock_put(sk); }
static void nr_idletimer_expiry(struct timer_list *t) @@ -197,6 +199,7 @@ static void nr_idletimer_expiry(struct timer_list *t) sock_set_flag(sk, SOCK_DEAD); } bh_unlock_sock(sk); + sock_put(sk); }
static void nr_t1timer_expiry(struct timer_list *t) @@ -209,8 +212,7 @@ static void nr_t1timer_expiry(struct timer_list *t) case NR_STATE_1: if (nr->n2count == nr->n2) { nr_disconnect(sk, ETIMEDOUT); - bh_unlock_sock(sk); - return; + goto out; } else { nr->n2count++; nr_write_internal(sk, NR_CONNREQ); @@ -220,8 +222,7 @@ static void nr_t1timer_expiry(struct timer_list *t) case NR_STATE_2: if (nr->n2count == nr->n2) { nr_disconnect(sk, ETIMEDOUT); - bh_unlock_sock(sk); - return; + goto out; } else { nr->n2count++; nr_write_internal(sk, NR_DISCREQ); @@ -231,8 +232,7 @@ static void nr_t1timer_expiry(struct timer_list *t) case NR_STATE_3: if (nr->n2count == nr->n2) { nr_disconnect(sk, ETIMEDOUT); - bh_unlock_sock(sk); - return; + goto out; } else { nr->n2count++; nr_requeue_frames(sk); @@ -241,5 +241,7 @@ static void nr_t1timer_expiry(struct timer_list *t) }
nr_start_t1timer(sk); +out: bh_unlock_sock(sk); + sock_put(sk); }
From: Mike Christie michael.christie@oracle.com
[ Upstream commit e746f3451ec7f91dcc9fd67a631239c715850a34 ]
A ISCSI_IFACE_PARAM can have the same value as a ISCSI_NET_PARAM so when iscsi_iface_attr_is_visible tries to figure out the type by just checking the value, we can collide and return the wrong type. When we call into the driver we might not match and return that we don't want attr visible in sysfs. The patch fixes this by setting the type when we figure out what the param is.
Link: https://lore.kernel.org/r/20210701002559.89533-1-michael.christie@oracle.com Fixes: 3e0f65b34cc9 ("[SCSI] iscsi_transport: Additional parameters for network settings") Signed-off-by: Mike Christie michael.christie@oracle.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/scsi/scsi_transport_iscsi.c | 90 +++++++++++------------------ 1 file changed, 34 insertions(+), 56 deletions(-)
diff --git a/drivers/scsi/scsi_transport_iscsi.c b/drivers/scsi/scsi_transport_iscsi.c index b07105ae7c91..d8b05d8b5470 100644 --- a/drivers/scsi/scsi_transport_iscsi.c +++ b/drivers/scsi/scsi_transport_iscsi.c @@ -439,39 +439,10 @@ static umode_t iscsi_iface_attr_is_visible(struct kobject *kobj, struct device *dev = container_of(kobj, struct device, kobj); struct iscsi_iface *iface = iscsi_dev_to_iface(dev); struct iscsi_transport *t = iface->transport; - int param; - int param_type; + int param = -1;
if (attr == &dev_attr_iface_enabled.attr) param = ISCSI_NET_PARAM_IFACE_ENABLE; - else if (attr == &dev_attr_iface_vlan_id.attr) - param = ISCSI_NET_PARAM_VLAN_ID; - else if (attr == &dev_attr_iface_vlan_priority.attr) - param = ISCSI_NET_PARAM_VLAN_PRIORITY; - else if (attr == &dev_attr_iface_vlan_enabled.attr) - param = ISCSI_NET_PARAM_VLAN_ENABLED; - else if (attr == &dev_attr_iface_mtu.attr) - param = ISCSI_NET_PARAM_MTU; - else if (attr == &dev_attr_iface_port.attr) - param = ISCSI_NET_PARAM_PORT; - else if (attr == &dev_attr_iface_ipaddress_state.attr) - param = ISCSI_NET_PARAM_IPADDR_STATE; - else if (attr == &dev_attr_iface_delayed_ack_en.attr) - param = ISCSI_NET_PARAM_DELAYED_ACK_EN; - else if (attr == &dev_attr_iface_tcp_nagle_disable.attr) - param = ISCSI_NET_PARAM_TCP_NAGLE_DISABLE; - else if (attr == &dev_attr_iface_tcp_wsf_disable.attr) - param = ISCSI_NET_PARAM_TCP_WSF_DISABLE; - else if (attr == &dev_attr_iface_tcp_wsf.attr) - param = ISCSI_NET_PARAM_TCP_WSF; - else if (attr == &dev_attr_iface_tcp_timer_scale.attr) - param = ISCSI_NET_PARAM_TCP_TIMER_SCALE; - else if (attr == &dev_attr_iface_tcp_timestamp_en.attr) - param = ISCSI_NET_PARAM_TCP_TIMESTAMP_EN; - else if (attr == &dev_attr_iface_cache_id.attr) - param = ISCSI_NET_PARAM_CACHE_ID; - else if (attr == &dev_attr_iface_redirect_en.attr) - param = ISCSI_NET_PARAM_REDIRECT_EN; else if (attr == &dev_attr_iface_def_taskmgmt_tmo.attr) param = ISCSI_IFACE_PARAM_DEF_TASKMGMT_TMO; else if (attr == &dev_attr_iface_header_digest.attr) @@ -508,6 +479,38 @@ static umode_t iscsi_iface_attr_is_visible(struct kobject *kobj, param = ISCSI_IFACE_PARAM_STRICT_LOGIN_COMP_EN; else if (attr == &dev_attr_iface_initiator_name.attr) param = ISCSI_IFACE_PARAM_INITIATOR_NAME; + + if (param != -1) + return t->attr_is_visible(ISCSI_IFACE_PARAM, param); + + if (attr == &dev_attr_iface_vlan_id.attr) + param = ISCSI_NET_PARAM_VLAN_ID; + else if (attr == &dev_attr_iface_vlan_priority.attr) + param = ISCSI_NET_PARAM_VLAN_PRIORITY; + else if (attr == &dev_attr_iface_vlan_enabled.attr) + param = ISCSI_NET_PARAM_VLAN_ENABLED; + else if (attr == &dev_attr_iface_mtu.attr) + param = ISCSI_NET_PARAM_MTU; + else if (attr == &dev_attr_iface_port.attr) + param = ISCSI_NET_PARAM_PORT; + else if (attr == &dev_attr_iface_ipaddress_state.attr) + param = ISCSI_NET_PARAM_IPADDR_STATE; + else if (attr == &dev_attr_iface_delayed_ack_en.attr) + param = ISCSI_NET_PARAM_DELAYED_ACK_EN; + else if (attr == &dev_attr_iface_tcp_nagle_disable.attr) + param = ISCSI_NET_PARAM_TCP_NAGLE_DISABLE; + else if (attr == &dev_attr_iface_tcp_wsf_disable.attr) + param = ISCSI_NET_PARAM_TCP_WSF_DISABLE; + else if (attr == &dev_attr_iface_tcp_wsf.attr) + param = ISCSI_NET_PARAM_TCP_WSF; + else if (attr == &dev_attr_iface_tcp_timer_scale.attr) + param = ISCSI_NET_PARAM_TCP_TIMER_SCALE; + else if (attr == &dev_attr_iface_tcp_timestamp_en.attr) + param = ISCSI_NET_PARAM_TCP_TIMESTAMP_EN; + else if (attr == &dev_attr_iface_cache_id.attr) + param = ISCSI_NET_PARAM_CACHE_ID; + else if (attr == &dev_attr_iface_redirect_en.attr) + param = ISCSI_NET_PARAM_REDIRECT_EN; else if (iface->iface_type == ISCSI_IFACE_TYPE_IPV4) { if (attr == &dev_attr_ipv4_iface_ipaddress.attr) param = ISCSI_NET_PARAM_IPV4_ADDR; @@ -598,32 +601,7 @@ static umode_t iscsi_iface_attr_is_visible(struct kobject *kobj, return 0; }
- switch (param) { - case ISCSI_IFACE_PARAM_DEF_TASKMGMT_TMO: - case ISCSI_IFACE_PARAM_HDRDGST_EN: - case ISCSI_IFACE_PARAM_DATADGST_EN: - case ISCSI_IFACE_PARAM_IMM_DATA_EN: - case ISCSI_IFACE_PARAM_INITIAL_R2T_EN: - case ISCSI_IFACE_PARAM_DATASEQ_INORDER_EN: - case ISCSI_IFACE_PARAM_PDU_INORDER_EN: - case ISCSI_IFACE_PARAM_ERL: - case ISCSI_IFACE_PARAM_MAX_RECV_DLENGTH: - case ISCSI_IFACE_PARAM_FIRST_BURST: - case ISCSI_IFACE_PARAM_MAX_R2T: - case ISCSI_IFACE_PARAM_MAX_BURST: - case ISCSI_IFACE_PARAM_CHAP_AUTH_EN: - case ISCSI_IFACE_PARAM_BIDI_CHAP_EN: - case ISCSI_IFACE_PARAM_DISCOVERY_AUTH_OPTIONAL: - case ISCSI_IFACE_PARAM_DISCOVERY_LOGOUT_EN: - case ISCSI_IFACE_PARAM_STRICT_LOGIN_COMP_EN: - case ISCSI_IFACE_PARAM_INITIATOR_NAME: - param_type = ISCSI_IFACE_PARAM; - break; - default: - param_type = ISCSI_NET_PARAM; - } - - return t->attr_is_visible(param_type, param); + return t->attr_is_visible(ISCSI_NET_PARAM, param); }
static struct attribute *iscsi_iface_attrs[] = {
From: Dmitry Bogdanov d.bogdanov@yadro.com
[ Upstream commit 6d8e7e7c932162bccd06872362751b0e1d76f5af ]
WRITE SAME(32) command handling reads WRPROTECT at the wrong offset in 1st byte instead of 10th byte.
Link: https://lore.kernel.org/r/20210702091655.22818-1-d.bogdanov@yadro.com Fixes: afd73f1b60fc ("target: Perform PROTECT sanity checks for WRITE_SAME") Signed-off-by: Dmitry Bogdanov d.bogdanov@yadro.com Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/target/target_core_sbc.c | 35 ++++++++++++++++---------------- 1 file changed, 17 insertions(+), 18 deletions(-)
diff --git a/drivers/target/target_core_sbc.c b/drivers/target/target_core_sbc.c index 7b07e557dc8d..6594bb0b9df0 100644 --- a/drivers/target/target_core_sbc.c +++ b/drivers/target/target_core_sbc.c @@ -25,7 +25,7 @@ #include "target_core_alua.h"
static sense_reason_t -sbc_check_prot(struct se_device *, struct se_cmd *, unsigned char *, u32, bool); +sbc_check_prot(struct se_device *, struct se_cmd *, unsigned char, u32, bool); static sense_reason_t sbc_execute_unmap(struct se_cmd *cmd);
static sense_reason_t @@ -279,14 +279,14 @@ static inline unsigned long long transport_lba_64_ext(unsigned char *cdb) }
static sense_reason_t -sbc_setup_write_same(struct se_cmd *cmd, unsigned char *flags, struct sbc_ops *ops) +sbc_setup_write_same(struct se_cmd *cmd, unsigned char flags, struct sbc_ops *ops) { struct se_device *dev = cmd->se_dev; sector_t end_lba = dev->transport->get_blocks(dev) + 1; unsigned int sectors = sbc_get_write_same_sectors(cmd); sense_reason_t ret;
- if ((flags[0] & 0x04) || (flags[0] & 0x02)) { + if ((flags & 0x04) || (flags & 0x02)) { pr_err("WRITE_SAME PBDATA and LBDATA" " bits not supported for Block Discard" " Emulation\n"); @@ -308,7 +308,7 @@ sbc_setup_write_same(struct se_cmd *cmd, unsigned char *flags, struct sbc_ops *o }
/* We always have ANC_SUP == 0 so setting ANCHOR is always an error */ - if (flags[0] & 0x10) { + if (flags & 0x10) { pr_warn("WRITE SAME with ANCHOR not supported\n"); return TCM_INVALID_CDB_FIELD; } @@ -316,7 +316,7 @@ sbc_setup_write_same(struct se_cmd *cmd, unsigned char *flags, struct sbc_ops *o * Special case for WRITE_SAME w/ UNMAP=1 that ends up getting * translated into block discard requests within backend code. */ - if (flags[0] & 0x08) { + if (flags & 0x08) { if (!ops->execute_unmap) return TCM_UNSUPPORTED_SCSI_OPCODE;
@@ -331,7 +331,7 @@ sbc_setup_write_same(struct se_cmd *cmd, unsigned char *flags, struct sbc_ops *o if (!ops->execute_write_same) return TCM_UNSUPPORTED_SCSI_OPCODE;
- ret = sbc_check_prot(dev, cmd, &cmd->t_task_cdb[0], sectors, true); + ret = sbc_check_prot(dev, cmd, flags >> 5, sectors, true); if (ret) return ret;
@@ -717,10 +717,9 @@ sbc_set_prot_op_checks(u8 protect, bool fabric_prot, enum target_prot_type prot_ }
static sense_reason_t -sbc_check_prot(struct se_device *dev, struct se_cmd *cmd, unsigned char *cdb, +sbc_check_prot(struct se_device *dev, struct se_cmd *cmd, unsigned char protect, u32 sectors, bool is_write) { - u8 protect = cdb[1] >> 5; int sp_ops = cmd->se_sess->sup_prot_ops; int pi_prot_type = dev->dev_attrib.pi_prot_type; bool fabric_prot = false; @@ -768,7 +767,7 @@ sbc_check_prot(struct se_device *dev, struct se_cmd *cmd, unsigned char *cdb, fallthrough; default: pr_err("Unable to determine pi_prot_type for CDB: 0x%02x " - "PROTECT: 0x%02x\n", cdb[0], protect); + "PROTECT: 0x%02x\n", cmd->t_task_cdb[0], protect); return TCM_INVALID_CDB_FIELD; }
@@ -843,7 +842,7 @@ sbc_parse_cdb(struct se_cmd *cmd, struct sbc_ops *ops) if (sbc_check_dpofua(dev, cmd, cdb)) return TCM_INVALID_CDB_FIELD;
- ret = sbc_check_prot(dev, cmd, cdb, sectors, false); + ret = sbc_check_prot(dev, cmd, cdb[1] >> 5, sectors, false); if (ret) return ret;
@@ -857,7 +856,7 @@ sbc_parse_cdb(struct se_cmd *cmd, struct sbc_ops *ops) if (sbc_check_dpofua(dev, cmd, cdb)) return TCM_INVALID_CDB_FIELD;
- ret = sbc_check_prot(dev, cmd, cdb, sectors, false); + ret = sbc_check_prot(dev, cmd, cdb[1] >> 5, sectors, false); if (ret) return ret;
@@ -871,7 +870,7 @@ sbc_parse_cdb(struct se_cmd *cmd, struct sbc_ops *ops) if (sbc_check_dpofua(dev, cmd, cdb)) return TCM_INVALID_CDB_FIELD;
- ret = sbc_check_prot(dev, cmd, cdb, sectors, false); + ret = sbc_check_prot(dev, cmd, cdb[1] >> 5, sectors, false); if (ret) return ret;
@@ -892,7 +891,7 @@ sbc_parse_cdb(struct se_cmd *cmd, struct sbc_ops *ops) if (sbc_check_dpofua(dev, cmd, cdb)) return TCM_INVALID_CDB_FIELD;
- ret = sbc_check_prot(dev, cmd, cdb, sectors, true); + ret = sbc_check_prot(dev, cmd, cdb[1] >> 5, sectors, true); if (ret) return ret;
@@ -906,7 +905,7 @@ sbc_parse_cdb(struct se_cmd *cmd, struct sbc_ops *ops) if (sbc_check_dpofua(dev, cmd, cdb)) return TCM_INVALID_CDB_FIELD;
- ret = sbc_check_prot(dev, cmd, cdb, sectors, true); + ret = sbc_check_prot(dev, cmd, cdb[1] >> 5, sectors, true); if (ret) return ret;
@@ -921,7 +920,7 @@ sbc_parse_cdb(struct se_cmd *cmd, struct sbc_ops *ops) if (sbc_check_dpofua(dev, cmd, cdb)) return TCM_INVALID_CDB_FIELD;
- ret = sbc_check_prot(dev, cmd, cdb, sectors, true); + ret = sbc_check_prot(dev, cmd, cdb[1] >> 5, sectors, true); if (ret) return ret;
@@ -980,7 +979,7 @@ sbc_parse_cdb(struct se_cmd *cmd, struct sbc_ops *ops) size = sbc_get_size(cmd, 1); cmd->t_task_lba = get_unaligned_be64(&cdb[12]);
- ret = sbc_setup_write_same(cmd, &cdb[10], ops); + ret = sbc_setup_write_same(cmd, cdb[10], ops); if (ret) return ret; break; @@ -1079,7 +1078,7 @@ sbc_parse_cdb(struct se_cmd *cmd, struct sbc_ops *ops) size = sbc_get_size(cmd, 1); cmd->t_task_lba = get_unaligned_be64(&cdb[2]);
- ret = sbc_setup_write_same(cmd, &cdb[1], ops); + ret = sbc_setup_write_same(cmd, cdb[1], ops); if (ret) return ret; break; @@ -1097,7 +1096,7 @@ sbc_parse_cdb(struct se_cmd *cmd, struct sbc_ops *ops) * Follow sbcr26 with WRITE_SAME (10) and check for the existence * of byte 1 bit 3 UNMAP instead of original reserved field */ - ret = sbc_setup_write_same(cmd, &cdb[1], ops); + ret = sbc_setup_write_same(cmd, cdb[1], ops); if (ret) return ret; break;
From: Marek Vasut marex@denx.de
[ Upstream commit 56912da7a68c8356df6a6740476237441b0b792a ]
The original implementation of RPM handling in probe() was mostly correct, except it failed to call pm_runtime_get_*() to activate the hardware. The subsequent fix, 734882a8bf98 ("spi: cadence: Correct initialisation of runtime PM"), breaks the implementation further, to the point where the system using this hard IP on ZynqMP hangs on boot, because it accesses hardware which is gated off.
Undo 734882a8bf98 ("spi: cadence: Correct initialisation of runtime PM") and instead add missing pm_runtime_get_noresume() and move the RPM disabling all the way to the end of probe(). That makes ZynqMP not hang on boot yet again.
Fixes: 734882a8bf98 ("spi: cadence: Correct initialisation of runtime PM") Signed-off-by: Marek Vasut marex@denx.de Cc: Charles Keepax ckeepax@opensource.cirrus.com Cc: Mark Brown broonie@kernel.org Link: https://lore.kernel.org/r/20210716182133.218640-1-marex@denx.de Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/spi/spi-cadence.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/drivers/spi/spi-cadence.c b/drivers/spi/spi-cadence.c index a3afd1b9ac56..ceb16e70d235 100644 --- a/drivers/spi/spi-cadence.c +++ b/drivers/spi/spi-cadence.c @@ -517,6 +517,12 @@ static int cdns_spi_probe(struct platform_device *pdev) goto clk_dis_apb; }
+ pm_runtime_use_autosuspend(&pdev->dev); + pm_runtime_set_autosuspend_delay(&pdev->dev, SPI_AUTOSUSPEND_TIMEOUT); + pm_runtime_get_noresume(&pdev->dev); + pm_runtime_set_active(&pdev->dev); + pm_runtime_enable(&pdev->dev); + ret = of_property_read_u32(pdev->dev.of_node, "num-cs", &num_cs); if (ret < 0) master->num_chipselect = CDNS_SPI_DEFAULT_NUM_CS; @@ -531,11 +537,6 @@ static int cdns_spi_probe(struct platform_device *pdev) /* SPI controller initializations */ cdns_spi_init_hw(xspi);
- pm_runtime_set_active(&pdev->dev); - pm_runtime_enable(&pdev->dev); - pm_runtime_use_autosuspend(&pdev->dev); - pm_runtime_set_autosuspend_delay(&pdev->dev, SPI_AUTOSUSPEND_TIMEOUT); - irq = platform_get_irq(pdev, 0); if (irq <= 0) { ret = -ENXIO; @@ -566,6 +567,9 @@ static int cdns_spi_probe(struct platform_device *pdev)
master->bits_per_word_mask = SPI_BPW_MASK(8);
+ pm_runtime_mark_last_busy(&pdev->dev); + pm_runtime_put_autosuspend(&pdev->dev); + ret = spi_register_master(master); if (ret) { dev_err(&pdev->dev, "spi_register_master failed\n");
From: Robert Richter rrichter@amd.com
[ Upstream commit d2cbbf1fe503c07e466c62f83aa1926d74d15821 ]
During a rework of initramfs code the INITRAMFS_COMPRESSION config option was removed in commit 65e00e04e5ae. A leftover as a dependency broke the config option ACPI_TABLE_OVERRIDE_VIA_ BUILTIN_INITRD that is used to enable the overriding of ACPI tables from built-in initrd. Fixing the dependency.
Fixes: 65e00e04e5ae ("initramfs: refactor the initramfs build rules") Signed-off-by: Robert Richter rrichter@amd.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/Kconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/acpi/Kconfig b/drivers/acpi/Kconfig index eedec61e3476..226f849fe7dc 100644 --- a/drivers/acpi/Kconfig +++ b/drivers/acpi/Kconfig @@ -370,7 +370,7 @@ config ACPI_TABLE_UPGRADE config ACPI_TABLE_OVERRIDE_VIA_BUILTIN_INITRD bool "Override ACPI tables from built-in initrd" depends on ACPI_TABLE_UPGRADE - depends on INITRAMFS_SOURCE!="" && INITRAMFS_COMPRESSION="" + depends on INITRAMFS_SOURCE!="" && INITRAMFS_COMPRESSION_NONE help This option provides functionality to override arbitrary ACPI tables from built-in uncompressed initrd.
From: Andy Shevchenko andy.shevchenko@gmail.com
[ Upstream commit edbd1bc4951eff8da65732dbe0d381e555054428 ]
Switch to use for_each_acpi_dev_match() instead of home grown analogue. No functional change intended.
Signed-off-by: Andy Shevchenko andy.shevchenko@gmail.com Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/firmware/efi/dev-path-parser.c | 49 ++++++++++---------------- 1 file changed, 18 insertions(+), 31 deletions(-)
diff --git a/drivers/firmware/efi/dev-path-parser.c b/drivers/firmware/efi/dev-path-parser.c index 5c9625e552f4..10d4457417a4 100644 --- a/drivers/firmware/efi/dev-path-parser.c +++ b/drivers/firmware/efi/dev-path-parser.c @@ -12,52 +12,39 @@ #include <linux/efi.h> #include <linux/pci.h>
-struct acpi_hid_uid { - struct acpi_device_id hid[2]; - char uid[11]; /* UINT_MAX + null byte */ -}; - -static int __init match_acpi_dev(struct device *dev, const void *data) -{ - struct acpi_hid_uid hid_uid = *(const struct acpi_hid_uid *)data; - struct acpi_device *adev = to_acpi_device(dev); - - if (acpi_match_device_ids(adev, hid_uid.hid)) - return 0; - - if (adev->pnp.unique_id) - return !strcmp(adev->pnp.unique_id, hid_uid.uid); - else - return !strcmp("0", hid_uid.uid); -} - static long __init parse_acpi_path(const struct efi_dev_path *node, struct device *parent, struct device **child) { - struct acpi_hid_uid hid_uid = {}; + char hid[ACPI_ID_LEN], uid[11]; /* UINT_MAX + null byte */ + struct acpi_device *adev; struct device *phys_dev;
if (node->header.length != 12) return -EINVAL;
- sprintf(hid_uid.hid[0].id, "%c%c%c%04X", + sprintf(hid, "%c%c%c%04X", 'A' + ((node->acpi.hid >> 10) & 0x1f) - 1, 'A' + ((node->acpi.hid >> 5) & 0x1f) - 1, 'A' + ((node->acpi.hid >> 0) & 0x1f) - 1, node->acpi.hid >> 16); - sprintf(hid_uid.uid, "%u", node->acpi.uid); - - *child = bus_find_device(&acpi_bus_type, NULL, &hid_uid, - match_acpi_dev); - if (!*child) + sprintf(uid, "%u", node->acpi.uid); + + for_each_acpi_dev_match(adev, hid, NULL, -1) { + if (adev->pnp.unique_id && !strcmp(adev->pnp.unique_id, uid)) + break; + if (!adev->pnp.unique_id && node->acpi.uid == 0) + break; + acpi_dev_put(adev); + } + if (!adev) return -ENODEV;
- phys_dev = acpi_get_first_physical_node(to_acpi_device(*child)); + phys_dev = acpi_get_first_physical_node(adev); if (phys_dev) { - get_device(phys_dev); - put_device(*child); - *child = phys_dev; - } + *child = get_device(phys_dev); + acpi_dev_put(adev); + } else + *child = &adev->dev;
return 0; }
From: Andy Shevchenko andy.shevchenko@gmail.com
[ Upstream commit 71f6428332844f38c7cb10461d9f29e9c9b983a0 ]
Currently it's possible to iterate over the dangling pointer in case the device suddenly disappears. This may happen becase callers put it at the end of a loop.
Instead, let's move that call inside acpi_dev_get_next_match_dev().
Fixes: 803abec64ef9 ("media: ipu3-cio2: Add cio2-bridge to ipu3-cio2 driver") Fixes: bf263f64e804 ("media: ACPI / bus: Add acpi_dev_get_next_match_dev() and helper macro") Fixes: edbd1bc4951e ("efi/dev-path-parser: Switch to use for_each_acpi_dev_match()") Signed-off-by: Andy Shevchenko andy.shevchenko@gmail.com Reviewed-by: Daniel Scally djrscally@gmail.com Signed-off-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/utils.c | 7 +++---- drivers/firmware/efi/dev-path-parser.c | 1 - drivers/media/pci/intel/ipu3/cio2-bridge.c | 6 ++---- include/acpi/acpi_bus.h | 5 ----- 4 files changed, 5 insertions(+), 14 deletions(-)
diff --git a/drivers/acpi/utils.c b/drivers/acpi/utils.c index 3b54b8fd7396..27ec9d57f3b8 100644 --- a/drivers/acpi/utils.c +++ b/drivers/acpi/utils.c @@ -846,11 +846,9 @@ EXPORT_SYMBOL(acpi_dev_present); * Return the next match of ACPI device if another matching device was present * at the moment of invocation, or NULL otherwise. * - * FIXME: The function does not tolerate the sudden disappearance of @adev, e.g. - * in the case of a hotplug event. That said, the caller should ensure that - * this will never happen. - * * The caller is responsible for invoking acpi_dev_put() on the returned device. + * On the other hand the function invokes acpi_dev_put() on the given @adev + * assuming that its reference counter had been increased beforehand. * * See additional information in acpi_dev_present() as well. */ @@ -866,6 +864,7 @@ acpi_dev_get_next_match_dev(struct acpi_device *adev, const char *hid, const cha match.hrv = hrv;
dev = bus_find_device(&acpi_bus_type, start, &match, acpi_dev_match_cb); + acpi_dev_put(adev); return dev ? to_acpi_device(dev) : NULL; } EXPORT_SYMBOL(acpi_dev_get_next_match_dev); diff --git a/drivers/firmware/efi/dev-path-parser.c b/drivers/firmware/efi/dev-path-parser.c index 10d4457417a4..eb9c65f97841 100644 --- a/drivers/firmware/efi/dev-path-parser.c +++ b/drivers/firmware/efi/dev-path-parser.c @@ -34,7 +34,6 @@ static long __init parse_acpi_path(const struct efi_dev_path *node, break; if (!adev->pnp.unique_id && node->acpi.uid == 0) break; - acpi_dev_put(adev); } if (!adev) return -ENODEV; diff --git a/drivers/media/pci/intel/ipu3/cio2-bridge.c b/drivers/media/pci/intel/ipu3/cio2-bridge.c index 4657e99df033..59a36f922675 100644 --- a/drivers/media/pci/intel/ipu3/cio2-bridge.c +++ b/drivers/media/pci/intel/ipu3/cio2-bridge.c @@ -173,10 +173,8 @@ static int cio2_bridge_connect_sensor(const struct cio2_sensor_config *cfg, int ret;
for_each_acpi_dev_match(adev, cfg->hid, NULL, -1) { - if (!adev->status.enabled) { - acpi_dev_put(adev); + if (!adev->status.enabled) continue; - }
if (bridge->n_sensors >= CIO2_NUM_PORTS) { acpi_dev_put(adev); @@ -185,7 +183,6 @@ static int cio2_bridge_connect_sensor(const struct cio2_sensor_config *cfg, }
sensor = &bridge->sensors[bridge->n_sensors]; - sensor->adev = adev; strscpy(sensor->name, cfg->hid, sizeof(sensor->name));
ret = cio2_bridge_read_acpi_buffer(adev, "SSDB", @@ -215,6 +212,7 @@ static int cio2_bridge_connect_sensor(const struct cio2_sensor_config *cfg, goto err_free_swnodes; }
+ sensor->adev = acpi_dev_get(adev); adev->fwnode.secondary = fwnode;
dev_info(&cio2->dev, "Found supported sensor %s\n", diff --git a/include/acpi/acpi_bus.h b/include/acpi/acpi_bus.h index 3a82faac5767..bff6a11bb21f 100644 --- a/include/acpi/acpi_bus.h +++ b/include/acpi/acpi_bus.h @@ -698,11 +698,6 @@ acpi_dev_get_first_match_dev(const char *hid, const char *uid, s64 hrv); * @hrv: Hardware Revision of the device, pass -1 to not check _HRV * * The caller is responsible for invoking acpi_dev_put() on the returned device. - * - * FIXME: Due to above requirement there is a window that may invalidate @adev - * and next iteration will use a dangling pointer, e.g. in the case of a - * hotplug event. That said, the caller should ensure that this will never - * happen. */ #define for_each_acpi_dev_match(adev, hid, uid, hrv) \ for (adev = acpi_dev_get_first_match_dev(hid, uid, hrv); \
From: Kalesh AP kalesh-anakkur.purayil@broadcom.com
[ Upstream commit c81cfb6256d90ea5ba4a6fb280ea3b171be4e05c ]
If device is already disabled in reset path and PCI io error is detected before the device could be enabled, driver could call pci_disable_device() for already disabled device. Fix this problem by calling pci_disable_device() only if the device is already enabled.
Fixes: 6316ea6db93d ("bnxt_en: Enable AER support.") Signed-off-by: Kalesh AP kalesh-anakkur.purayil@broadcom.com Signed-off-by: Michael Chan michael.chan@broadcom.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index aef3fccc27a9..d57fb1613cfc 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -13315,7 +13315,8 @@ static pci_ers_result_t bnxt_io_error_detected(struct pci_dev *pdev, if (netif_running(netdev)) bnxt_close(netdev);
- pci_disable_device(pdev); + if (pci_is_enabled(pdev)) + pci_disable_device(pdev); bnxt_free_ctx_mem(bp); kfree(bp->ctx); bp->ctx = NULL;
From: Michael Chan michael.chan@broadcom.com
[ Upstream commit 2c9f046bc377efd1f5e26e74817d5f96e9506c86 ]
The capabilities can change after firmware upgrade/downgrade, so we should get the up-to-date RoCE capabilities everytime bnxt_ulp_probe() is called.
Fixes: 2151fe0830fd ("bnxt_en: Handle RESET_NOTIFY async event from firmware.") Reviewed-by: Somnath Kotur somnath.kotur@broadcom.com Reviewed-by: Edwin Peer edwin.peer@broadcom.com Signed-off-by: Michael Chan michael.chan@broadcom.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c index a918e374f3c5..187ff643ad2a 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ulp.c @@ -479,16 +479,17 @@ struct bnxt_en_dev *bnxt_ulp_probe(struct net_device *dev) if (!edev) return ERR_PTR(-ENOMEM); edev->en_ops = &bnxt_en_ops_tbl; - if (bp->flags & BNXT_FLAG_ROCEV1_CAP) - edev->flags |= BNXT_EN_FLAG_ROCEV1_CAP; - if (bp->flags & BNXT_FLAG_ROCEV2_CAP) - edev->flags |= BNXT_EN_FLAG_ROCEV2_CAP; edev->net = dev; edev->pdev = bp->pdev; edev->l2_db_size = bp->db_size; edev->l2_db_size_nc = bp->db_size; bp->edev = edev; } + edev->flags &= ~BNXT_EN_FLAG_ROCE_CAP; + if (bp->flags & BNXT_FLAG_ROCEV1_CAP) + edev->flags |= BNXT_EN_FLAG_ROCEV1_CAP; + if (bp->flags & BNXT_FLAG_ROCEV2_CAP) + edev->flags |= BNXT_EN_FLAG_ROCEV2_CAP; return bp->edev; } EXPORT_SYMBOL(bnxt_ulp_probe);
From: Michael Chan michael.chan@broadcom.com
[ Upstream commit 6cd657cb3ee6f4de57e635b126ffbe0e51d00f1a ]
In the BNXT_FW_RESET_STATE_POLL_VF state in bnxt_fw_reset_task() after all VFs have unregistered, we need to check for BNXT_STATE_ABORT_ERR after we acquire the rtnl_lock. If the flag is set, we need to abort.
Fixes: 230d1f0de754 ("bnxt_en: Handle firmware reset.") Signed-off-by: Michael Chan michael.chan@broadcom.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index d57fb1613cfc..07efab5bad95 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -11882,6 +11882,10 @@ static void bnxt_fw_reset_task(struct work_struct *work) } bp->fw_reset_timestamp = jiffies; rtnl_lock(); + if (test_bit(BNXT_STATE_ABORT_ERR, &bp->state)) { + rtnl_unlock(); + goto fw_reset_abort; + } bnxt_fw_reset_close(bp); if (bp->fw_cap & BNXT_FW_CAP_ERR_RECOVER_RELOAD) { bp->fw_reset_state = BNXT_FW_RESET_STATE_POLL_FW_DOWN;
From: Somnath Kotur somnath.kotur@broadcom.com
[ Upstream commit 3958b1da725a477b4a222183d16a14d85445d4b6 ]
When bnxt_open() fails in the firmware reset path, the driver needs to gracefully abort, but it is executing code that should be invoked only in the success path. Define a function to abort FW reset and consolidate all error paths to call this new function.
Fixes: dab62e7c2de7 ("bnxt_en: Implement faster recovery for firmware fatal error.") Signed-off-by: Somnath Kotur somnath.kotur@broadcom.com Signed-off-by: Michael Chan michael.chan@broadcom.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 31 +++++++++++++++-------- 1 file changed, 21 insertions(+), 10 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index 07efab5bad95..49aca3289c00 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -11849,10 +11849,21 @@ static bool bnxt_fw_reset_timeout(struct bnxt *bp) (bp->fw_reset_max_dsecs * HZ / 10)); }
+static void bnxt_fw_reset_abort(struct bnxt *bp, int rc) +{ + clear_bit(BNXT_STATE_IN_FW_RESET, &bp->state); + if (bp->fw_reset_state != BNXT_FW_RESET_STATE_POLL_VF) { + bnxt_ulp_start(bp, rc); + bnxt_dl_health_status_update(bp, false); + } + bp->fw_reset_state = 0; + dev_close(bp->dev); +} + static void bnxt_fw_reset_task(struct work_struct *work) { struct bnxt *bp = container_of(work, struct bnxt, fw_reset_task.work); - int rc; + int rc = 0;
if (!test_bit(BNXT_STATE_IN_FW_RESET, &bp->state)) { netdev_err(bp->dev, "bnxt_fw_reset_task() called when not in fw reset mode!\n"); @@ -11883,8 +11894,9 @@ static void bnxt_fw_reset_task(struct work_struct *work) bp->fw_reset_timestamp = jiffies; rtnl_lock(); if (test_bit(BNXT_STATE_ABORT_ERR, &bp->state)) { + bnxt_fw_reset_abort(bp, rc); rtnl_unlock(); - goto fw_reset_abort; + return; } bnxt_fw_reset_close(bp); if (bp->fw_cap & BNXT_FW_CAP_ERR_RECOVER_RELOAD) { @@ -11933,6 +11945,7 @@ static void bnxt_fw_reset_task(struct work_struct *work) if (val == 0xffff) { if (bnxt_fw_reset_timeout(bp)) { netdev_err(bp->dev, "Firmware reset aborted, PCI config space invalid\n"); + rc = -ETIMEDOUT; goto fw_reset_abort; } bnxt_queue_fw_reset_work(bp, HZ / 1000); @@ -11942,6 +11955,7 @@ static void bnxt_fw_reset_task(struct work_struct *work) clear_bit(BNXT_STATE_FW_FATAL_COND, &bp->state); if (pci_enable_device(bp->pdev)) { netdev_err(bp->dev, "Cannot re-enable PCI device\n"); + rc = -ENODEV; goto fw_reset_abort; } pci_set_master(bp->pdev); @@ -11968,9 +11982,10 @@ static void bnxt_fw_reset_task(struct work_struct *work) } rc = bnxt_open(bp->dev); if (rc) { - netdev_err(bp->dev, "bnxt_open_nic() failed\n"); - clear_bit(BNXT_STATE_IN_FW_RESET, &bp->state); - dev_close(bp->dev); + netdev_err(bp->dev, "bnxt_open() failed during FW reset\n"); + bnxt_fw_reset_abort(bp, rc); + rtnl_unlock(); + return; }
bp->fw_reset_state = 0; @@ -11997,12 +12012,8 @@ fw_reset_abort_status: netdev_err(bp->dev, "fw_health_status 0x%x\n", sts); } fw_reset_abort: - clear_bit(BNXT_STATE_IN_FW_RESET, &bp->state); - if (bp->fw_reset_state != BNXT_FW_RESET_STATE_POLL_VF) - bnxt_dl_health_status_update(bp, false); - bp->fw_reset_state = 0; rtnl_lock(); - dev_close(bp->dev); + bnxt_fw_reset_abort(bp, rc); rtnl_unlock(); }
From: Michael Chan michael.chan@broadcom.com
[ Upstream commit 96bdd4b9ea7ef9a12db8fdd0ce90e37dffbd3703 ]
Only pass supported VLAN protocol IDs for stripped VLAN tags to the stack. The stack will hit WARN() if the protocol ID is unsupported.
Existing firmware sets up the chip to strip 0x8100, 0x88a8, 0x9100. Only the 1st two protocols are supported by the kernel.
Fixes: a196e96bb68f ("bnxt_en: clean up VLAN feature bit handling") Reviewed-by: Somnath Kotur somnath.kotur@broadcom.com Signed-off-by: Michael Chan michael.chan@broadcom.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 21 ++++++++++++++++----- 1 file changed, 16 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index 49aca3289c00..be36dee65f90 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -1640,11 +1640,16 @@ static inline struct sk_buff *bnxt_tpa_end(struct bnxt *bp,
if ((tpa_info->flags2 & RX_CMP_FLAGS2_META_FORMAT_VLAN) && (skb->dev->features & BNXT_HW_FEATURE_VLAN_ALL_RX)) { - u16 vlan_proto = tpa_info->metadata >> - RX_CMP_FLAGS2_METADATA_TPID_SFT; + __be16 vlan_proto = htons(tpa_info->metadata >> + RX_CMP_FLAGS2_METADATA_TPID_SFT); u16 vtag = tpa_info->metadata & RX_CMP_FLAGS2_METADATA_TCI_MASK;
- __vlan_hwaccel_put_tag(skb, htons(vlan_proto), vtag); + if (eth_type_vlan(vlan_proto)) { + __vlan_hwaccel_put_tag(skb, vlan_proto, vtag); + } else { + dev_kfree_skb(skb); + return NULL; + } }
skb_checksum_none_assert(skb); @@ -1865,9 +1870,15 @@ static int bnxt_rx_pkt(struct bnxt *bp, struct bnxt_cp_ring_info *cpr, (skb->dev->features & BNXT_HW_FEATURE_VLAN_ALL_RX)) { u32 meta_data = le32_to_cpu(rxcmp1->rx_cmp_meta_data); u16 vtag = meta_data & RX_CMP_FLAGS2_METADATA_TCI_MASK; - u16 vlan_proto = meta_data >> RX_CMP_FLAGS2_METADATA_TPID_SFT; + __be16 vlan_proto = htons(meta_data >> + RX_CMP_FLAGS2_METADATA_TPID_SFT);
- __vlan_hwaccel_put_tag(skb, htons(vlan_proto), vtag); + if (eth_type_vlan(vlan_proto)) { + __vlan_hwaccel_put_tag(skb, vlan_proto, vtag); + } else { + dev_kfree_skb(skb); + goto next_rx; + } }
skb_checksum_none_assert(skb);
From: Somnath Kotur somnath.kotur@broadcom.com
[ Upstream commit 11a39259ff79b74bc99f8b7c44075a2d6d5e7ab1 ]
bnxt_half_open_nic() is called during during ethtool self test and is protected by rtnl_lock. Firmware reset can be happening at the same time. Only critical portions of the entire firmware reset sequence are protected by the rtnl_lock. It is possible that bnxt_half_open_nic() can be called when the firmware reset sequence is aborting. In that case, bnxt_half_open_nic() needs to check if the ABORT_ERR flag is set and abort if it is. The ethtool self test will fail but the NIC will be brought to a consistent IF_DOWN state.
Without this patch, if bnxt_half_open_nic() were to continue in this error state, it may crash like this:
bnxt_en 0000:82:00.1 enp130s0f1np1: FW reset in progress during close, FW reset will be aborted Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000 ... Process ethtool (pid: 333327, stack limit = 0x0000000046476577) Call trace: bnxt_alloc_mem+0x444/0xef0 [bnxt_en] bnxt_half_open_nic+0x24/0xb8 [bnxt_en] bnxt_self_test+0x2dc/0x390 [bnxt_en] ethtool_self_test+0xe0/0x1f8 dev_ethtool+0x1744/0x22d0 dev_ioctl+0x190/0x3e0 sock_ioctl+0x238/0x480 do_vfs_ioctl+0xc4/0x758 ksys_ioctl+0x84/0xb8 __arm64_sys_ioctl+0x28/0x38 el0_svc_handler+0xb0/0x180 el0_svc+0x8/0xc
Fixes: a1301f08c5ac ("bnxt_en: Check abort error state in bnxt_open_nic().") Signed-off-by: Somnath Kotur somnath.kotur@broadcom.com Signed-off-by: Michael Chan michael.chan@broadcom.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/broadcom/bnxt/bnxt.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt.c b/drivers/net/ethernet/broadcom/bnxt/bnxt.c index be36dee65f90..3c3aa9467310 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt.c @@ -10104,6 +10104,12 @@ int bnxt_half_open_nic(struct bnxt *bp) { int rc = 0;
+ if (test_bit(BNXT_STATE_ABORT_ERR, &bp->state)) { + netdev_err(bp->dev, "A previous firmware reset has not completed, aborting half open\n"); + rc = -ENODEV; + goto half_open_err; + } + rc = bnxt_alloc_mem(bp, false); if (rc) { netdev_err(bp->dev, "bnxt_alloc_mem err: %x\n", rc);
From: Randy Dunlap rdunlap@infradead.org
[ Upstream commit b16f3299ae1aa3c327e1fb742d0379ae4d6e86f2 ]
Building on ARCH=arc causes a "redefined" warning, so rename this driver's CACHE_LINE_MASK to avoid the warning.
../drivers/net/ethernet/hisilicon/hip04_eth.c:134: warning: "CACHE_LINE_MASK" redefined 134 | #define CACHE_LINE_MASK 0x3F In file included from ../include/linux/cache.h:6, from ../include/linux/printk.h:9, from ../include/linux/kernel.h:19, from ../include/linux/list.h:9, from ../include/linux/module.h:12, from ../drivers/net/ethernet/hisilicon/hip04_eth.c:7: ../arch/arc/include/asm/cache.h:17: note: this is the location of the previous definition 17 | #define CACHE_LINE_MASK (~(L1_CACHE_BYTES - 1))
Fixes: d413779cdd93 ("net: hisilicon: Add an tx_desc to adapt HI13X1_GMAC") Signed-off-by: Randy Dunlap rdunlap@infradead.org Cc: Vineet Gupta vgupta@synopsys.com Cc: Jiangfeng Xiao xiaojiangfeng@huawei.com Cc: "David S. Miller" davem@davemloft.net Cc: Jakub Kicinski kuba@kernel.org Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hip04_eth.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hip04_eth.c b/drivers/net/ethernet/hisilicon/hip04_eth.c index 12f6c2442a7a..e53512f6878a 100644 --- a/drivers/net/ethernet/hisilicon/hip04_eth.c +++ b/drivers/net/ethernet/hisilicon/hip04_eth.c @@ -131,7 +131,7 @@ /* buf unit size is cache_line_size, which is 64, so the shift is 6 */ #define PPE_BUF_SIZE_SHIFT 6 #define PPE_TX_BUF_HOLD BIT(31) -#define CACHE_LINE_MASK 0x3F +#define SOC_CACHE_LINE_MASK 0x3F #else #define PPE_CFG_QOS_VMID_GRP_SHIFT 8 #define PPE_CFG_RX_CTRL_ALIGN_SHIFT 11 @@ -531,8 +531,8 @@ hip04_mac_start_xmit(struct sk_buff *skb, struct net_device *ndev) #if defined(CONFIG_HI13X1_GMAC) desc->cfg = (__force u32)cpu_to_be32(TX_CLEAR_WB | TX_FINISH_CACHE_INV | TX_RELEASE_TO_PPE | priv->port << TX_POOL_SHIFT); - desc->data_offset = (__force u32)cpu_to_be32(phys & CACHE_LINE_MASK); - desc->send_addr = (__force u32)cpu_to_be32(phys & ~CACHE_LINE_MASK); + desc->data_offset = (__force u32)cpu_to_be32(phys & SOC_CACHE_LINE_MASK); + desc->send_addr = (__force u32)cpu_to_be32(phys & ~SOC_CACHE_LINE_MASK); #else desc->cfg = (__force u32)cpu_to_be32(TX_CLEAR_WB | TX_FINISH_CACHE_INV); desc->send_addr = (__force u32)cpu_to_be32(phys);
From: Eric Dumazet edumazet@google.com
[ Upstream commit 6f20c8adb1813467ea52c1296d52c4e95978cb2f ]
tfo_active_disable_stamp is read and written locklessly. We need to annotate these accesses appropriately.
Then, we need to perform the atomic_inc(tfo_active_disable_times) after the timestamp has been updated, and thus add barriers to make sure tcp_fastopen_active_should_disable() wont read a stale timestamp.
Fixes: cf1ef3f0719b ("net/tcp_fastopen: Disable active side TFO in certain scenarios") Signed-off-by: Eric Dumazet edumazet@google.com Cc: Wei Wang weiwan@google.com Cc: Yuchung Cheng ycheng@google.com Cc: Neal Cardwell ncardwell@google.com Acked-by: Wei Wang weiwan@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/tcp_fastopen.c | 19 ++++++++++++++++--- 1 file changed, 16 insertions(+), 3 deletions(-)
diff --git a/net/ipv4/tcp_fastopen.c b/net/ipv4/tcp_fastopen.c index af2814c9342a..08548ff23d83 100644 --- a/net/ipv4/tcp_fastopen.c +++ b/net/ipv4/tcp_fastopen.c @@ -507,8 +507,15 @@ void tcp_fastopen_active_disable(struct sock *sk) { struct net *net = sock_net(sk);
+ /* Paired with READ_ONCE() in tcp_fastopen_active_should_disable() */ + WRITE_ONCE(net->ipv4.tfo_active_disable_stamp, jiffies); + + /* Paired with smp_rmb() in tcp_fastopen_active_should_disable(). + * We want net->ipv4.tfo_active_disable_stamp to be updated first. + */ + smp_mb__before_atomic(); atomic_inc(&net->ipv4.tfo_active_disable_times); - net->ipv4.tfo_active_disable_stamp = jiffies; + NET_INC_STATS(net, LINUX_MIB_TCPFASTOPENBLACKHOLE); }
@@ -526,10 +533,16 @@ bool tcp_fastopen_active_should_disable(struct sock *sk) if (!tfo_da_times) return false;
+ /* Paired with smp_mb__before_atomic() in tcp_fastopen_active_disable() */ + smp_rmb(); + /* Limit timout to max: 2^6 * initial timeout */ multiplier = 1 << min(tfo_da_times - 1, 6); - timeout = multiplier * tfo_bh_timeout * HZ; - if (time_before(jiffies, sock_net(sk)->ipv4.tfo_active_disable_stamp + timeout)) + + /* Paired with the WRITE_ONCE() in tcp_fastopen_active_disable(). */ + timeout = READ_ONCE(sock_net(sk)->ipv4.tfo_active_disable_stamp) + + multiplier * tfo_bh_timeout * HZ; + if (time_before(jiffies, timeout)) return true;
/* Mark check bit so we can check for successful active TFO
From: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com
[ Upstream commit 114613f62f42e7cbc1242c4e82076a0153043761 ]
We missed the fact that ElkhartLake platforms have two different PCI IDs. We only added one so the SOF driver is never selected by the autodetection logic for the missing configuration.
BugLink: https://github.com/thesofproject/linux/issues/2990 Fixes: cc8f81c7e625 ('ALSA: hda: fix intel DSP config') Signed-off-by: Pierre-Louis Bossart pierre-louis.bossart@linux.intel.com Link: https://lore.kernel.org/r/20210719231746.557325-1-pierre-louis.bossart@linux... Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/hda/intel-dsp-config.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/sound/hda/intel-dsp-config.c b/sound/hda/intel-dsp-config.c index d8be146793ee..c9d0ba353463 100644 --- a/sound/hda/intel-dsp-config.c +++ b/sound/hda/intel-dsp-config.c @@ -319,6 +319,10 @@ static const struct config_entry config_table[] = { .flags = FLAG_SOF | FLAG_SOF_ONLY_IF_DMIC, .device = 0x4b55, }, + { + .flags = FLAG_SOF | FLAG_SOF_ONLY_IF_DMIC, + .device = 0x4b58, + }, #endif
/* Alder Lake */
From: Chengwen Feng fengchengwen@huawei.com
[ Upstream commit 1b713d14dc3c077ec45e65dab4ea01a8bc41b8c1 ]
Currently, the mailbox synchronous communication between VF and PF use the following fields to maintain communication: 1. Origin_mbx_msg which was combined by message code and subcode, used to match request and response. 2. Received_resp which means whether received response.
There may possible mismatches of the following situation: 1. VF sends message A with code=1 subcode=1. 2. PF was blocked about 500ms when processing the message A. 3. VF will detect message A timeout because it can't get the response within 500ms. 4. VF sends message B with code=1 subcode=1 which equal message A. 5. PF processes the first message A and send the response message to VF. 6. VF will identify the response matched the message B because the code/subcode is the same. This will lead to mismatch of request and response.
To fix the above bug, we use the following scheme: 1. The message sent from VF was labelled with match_id which was a unique 16-bit non-zero value. 2. The response sent from PF will label with match_id which got from the request. 3. The VF uses the match_id to match request and response message.
As for PF driver, it only needs to copy the match_id from request to response.
Fixes: dde1a86e93ca ("net: hns3: Add mailbox support to PF driver") Signed-off-by: Chengwen Feng fengchengwen@huawei.com Signed-off-by: Guangbin Huang huangguangbin2@huawei.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hclge_mbx.h | 6 ++++-- drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c | 1 + 2 files changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hclge_mbx.h b/drivers/net/ethernet/hisilicon/hns3/hclge_mbx.h index a2c17af57fde..d283beec9f66 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hclge_mbx.h +++ b/drivers/net/ethernet/hisilicon/hns3/hclge_mbx.h @@ -135,7 +135,8 @@ struct hclge_mbx_vf_to_pf_cmd { u8 mbx_need_resp; u8 rsv1[1]; u8 msg_len; - u8 rsv2[3]; + u8 rsv2; + u16 match_id; struct hclge_vf_to_pf_msg msg; };
@@ -145,7 +146,8 @@ struct hclge_mbx_pf_to_vf_cmd { u8 dest_vfid; u8 rsv[3]; u8 msg_len; - u8 rsv1[3]; + u8 rsv1; + u16 match_id; struct hclge_pf_to_vf_msg msg; };
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c index f1c9f4ada348..38b601031db4 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_mbx.c @@ -47,6 +47,7 @@ static int hclge_gen_resp_to_vf(struct hclge_vport *vport,
resp_pf_to_vf->dest_vfid = vf_to_pf_req->mbx_src_vfid; resp_pf_to_vf->msg_len = vf_to_pf_req->msg_len; + resp_pf_to_vf->match_id = vf_to_pf_req->match_id;
resp_pf_to_vf->msg.code = HCLGE_MBX_PF_VF_RESP; resp_pf_to_vf->msg.vf_mbx_msg_code = vf_to_pf_req->msg.code;
From: Jian Shen shenjian15@huawei.com
[ Upstream commit bbfd4506f962e7e6fff8f37f017154a3c3791264 ]
Currently, VF doesn't enable rx VLAN offload when initializating, and PF does it for VFs. If user disable the rx VLAN offload for VF with ethtool -K, and reload the VF driver, it may cause the rx VLAN offload state being inconsistent between hardware and software.
Fixes it by enabling rx VLAN offload when VF initializing.
Fixes: e2cb1dec9779 ("net: hns3: Add HNS3 VF HCL(Hardware Compatibility Layer) Support") Signed-off-by: Jian Shen shenjian15@huawei.com Signed-off-by: Guangbin Huang huangguangbin2@huawei.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c index 0db51ef15ef6..fe03c8419890 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c @@ -2621,6 +2621,16 @@ static int hclgevf_rss_init_hw(struct hclgevf_dev *hdev)
static int hclgevf_init_vlan_config(struct hclgevf_dev *hdev) { + struct hnae3_handle *nic = &hdev->nic; + int ret; + + ret = hclgevf_en_hw_strip_rxvtag(nic, true); + if (ret) { + dev_err(&hdev->pdev->dev, + "failed to enable rx vlan offload, ret = %d\n", ret); + return ret; + } + return hclgevf_set_vlan_filter(&hdev->nic, htons(ETH_P_8021Q), 0, false); }
From: Alexandru Tachici alexandru.tachici@analog.com
[ Upstream commit c45c1e82bba130db4f19d9dbc1deefcf4ea994ed ]
The bcm2835_spi_transfer_one function can create a deadlock if it is called while another thread already has the CCF lock.
Signed-off-by: Alexandru Tachici alexandru.tachici@analog.com Fixes: f8043872e796 ("spi: add driver for BCM2835") Reviewed-by: Florian Fainelli f.fainelli@gmail.com Link: https://lore.kernel.org/r/20210716210245.13240-2-alexandru.tachici@analog.co... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/spi/spi-bcm2835.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/drivers/spi/spi-bcm2835.c b/drivers/spi/spi-bcm2835.c index fe40626e45aa..61cbcc7e2121 100644 --- a/drivers/spi/spi-bcm2835.c +++ b/drivers/spi/spi-bcm2835.c @@ -84,6 +84,7 @@ MODULE_PARM_DESC(polling_limit_us, * struct bcm2835_spi - BCM2835 SPI controller * @regs: base address of register map * @clk: core clock, divided to calculate serial clock + * @clk_hz: core clock cached speed * @irq: interrupt, signals TX FIFO empty or RX FIFO ¾ full * @tfr: SPI transfer currently processed * @ctlr: SPI controller reverse lookup @@ -124,6 +125,7 @@ MODULE_PARM_DESC(polling_limit_us, struct bcm2835_spi { void __iomem *regs; struct clk *clk; + unsigned long clk_hz; int irq; struct spi_transfer *tfr; struct spi_controller *ctlr; @@ -1082,19 +1084,18 @@ static int bcm2835_spi_transfer_one(struct spi_controller *ctlr, struct spi_transfer *tfr) { struct bcm2835_spi *bs = spi_controller_get_devdata(ctlr); - unsigned long spi_hz, clk_hz, cdiv; + unsigned long spi_hz, cdiv; unsigned long hz_per_byte, byte_limit; u32 cs = bs->prepare_cs[spi->chip_select];
/* set clock */ spi_hz = tfr->speed_hz; - clk_hz = clk_get_rate(bs->clk);
- if (spi_hz >= clk_hz / 2) { + if (spi_hz >= bs->clk_hz / 2) { cdiv = 2; /* clk_hz/2 is the fastest we can go */ } else if (spi_hz) { /* CDIV must be a multiple of two */ - cdiv = DIV_ROUND_UP(clk_hz, spi_hz); + cdiv = DIV_ROUND_UP(bs->clk_hz, spi_hz); cdiv += (cdiv % 2);
if (cdiv >= 65536) @@ -1102,7 +1103,7 @@ static int bcm2835_spi_transfer_one(struct spi_controller *ctlr, } else { cdiv = 0; /* 0 is the slowest we can go */ } - tfr->effective_speed_hz = cdiv ? (clk_hz / cdiv) : (clk_hz / 65536); + tfr->effective_speed_hz = cdiv ? (bs->clk_hz / cdiv) : (bs->clk_hz / 65536); bcm2835_wr(bs, BCM2835_SPI_CLK, cdiv);
/* handle all the 3-wire mode */ @@ -1320,6 +1321,7 @@ static int bcm2835_spi_probe(struct platform_device *pdev) return bs->irq ? bs->irq : -ENODEV;
clk_prepare_enable(bs->clk); + bs->clk_hz = clk_get_rate(bs->clk);
err = bcm2835_dma_init(ctlr, &pdev->dev, bs); if (err)
From: Yang Yingliang yangyingliang@huawei.com
[ Upstream commit 362a9e65289284f36403058eea2462d0330c1f24 ]
I got memory leak report when doing fuzz test:
BUG: memory leak unreferenced object 0xffff888107310a80 (size 96): comm "syz-executor.6", pid 4610, jiffies 4295140240 (age 20.135s) hex dump (first 32 bytes): 01 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00 .....N.......... backtrace: [<000000001974933b>] kmalloc include/linux/slab.h:591 [inline] [<000000001974933b>] kzalloc include/linux/slab.h:721 [inline] [<000000001974933b>] io_init_wq_offload fs/io_uring.c:7920 [inline] [<000000001974933b>] io_uring_alloc_task_context+0x466/0x640 fs/io_uring.c:7955 [<0000000039d0800d>] __io_uring_add_tctx_node+0x256/0x360 fs/io_uring.c:9016 [<000000008482e78c>] io_uring_add_tctx_node fs/io_uring.c:9052 [inline] [<000000008482e78c>] __do_sys_io_uring_enter fs/io_uring.c:9354 [inline] [<000000008482e78c>] __se_sys_io_uring_enter fs/io_uring.c:9301 [inline] [<000000008482e78c>] __x64_sys_io_uring_enter+0xabc/0xc20 fs/io_uring.c:9301 [<00000000b875f18f>] do_syscall_x64 arch/x86/entry/common.c:50 [inline] [<00000000b875f18f>] do_syscall_64+0x3b/0x90 arch/x86/entry/common.c:80 [<000000006b0a8484>] entry_SYSCALL_64_after_hwframe+0x44/0xae
CPU0 CPU1 io_uring_enter io_uring_enter io_uring_add_tctx_node io_uring_add_tctx_node __io_uring_add_tctx_node __io_uring_add_tctx_node io_uring_alloc_task_context io_uring_alloc_task_context io_init_wq_offload io_init_wq_offload hash = kzalloc hash = kzalloc ctx->hash_map = hash ctx->hash_map = hash <- one of the hash is leaked
When calling io_uring_enter() in parallel, the 'hash_map' will be leaked, add uring_lock to protect 'hash_map'.
Fixes: e941894eae31 ("io-wq: make buffered file write hashed work map per-ctx") Reported-by: Hulk Robot hulkci@huawei.com Signed-off-by: Yang Yingliang yangyingliang@huawei.com Reviewed-by: Pavel Begunkov asml.silence@gmail.com Link: https://lore.kernel.org/r/20210720083805.3030730-1-yangyingliang@huawei.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- fs/io_uring.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/fs/io_uring.c b/fs/io_uring.c index eeea6b8c8bee..8843f48ace27 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -7859,15 +7859,19 @@ static struct io_wq *io_init_wq_offload(struct io_ring_ctx *ctx, struct io_wq_data data; unsigned int concurrency;
+ mutex_lock(&ctx->uring_lock); hash = ctx->hash_map; if (!hash) { hash = kzalloc(sizeof(*hash), GFP_KERNEL); - if (!hash) + if (!hash) { + mutex_unlock(&ctx->uring_lock); return ERR_PTR(-ENOMEM); + } refcount_set(&hash->refs, 1); init_waitqueue_head(&hash->wait); ctx->hash_map = hash; } + mutex_unlock(&ctx->uring_lock);
data.hash = hash; data.task = task;
From: Peilin Ye peilin.ye@bytedance.com
[ Upstream commit 727d6a8b7ef3d25080fad228b2c4a1d4da5999c6 ]
Currently tcf_skbmod_act() assumes that packets use Ethernet as their L2 protocol, which is not always the case. As an example, for CAN devices:
$ ip link add dev vcan0 type vcan $ ip link set up vcan0 $ tc qdisc add dev vcan0 root handle 1: htb $ tc filter add dev vcan0 parent 1: protocol ip prio 10 \ matchall action skbmod swap mac
Doing the above silently corrupts all the packets. Do not perform skbmod actions for non-Ethernet packets.
Fixes: 86da71b57383 ("net_sched: Introduce skbmod action") Reviewed-by: Cong Wang cong.wang@bytedance.com Signed-off-by: Peilin Ye peilin.ye@bytedance.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/act_skbmod.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/net/sched/act_skbmod.c b/net/sched/act_skbmod.c index 81a1c67335be..8d17a543cc9f 100644 --- a/net/sched/act_skbmod.c +++ b/net/sched/act_skbmod.c @@ -6,6 +6,7 @@ */
#include <linux/module.h> +#include <linux/if_arp.h> #include <linux/init.h> #include <linux/kernel.h> #include <linux/skbuff.h> @@ -33,6 +34,13 @@ static int tcf_skbmod_act(struct sk_buff *skb, const struct tc_action *a, tcf_lastuse_update(&d->tcf_tm); bstats_cpu_update(this_cpu_ptr(d->common.cpu_bstats), skb);
+ action = READ_ONCE(d->tcf_action); + if (unlikely(action == TC_ACT_SHOT)) + goto drop; + + if (!skb->dev || skb->dev->type != ARPHRD_ETHER) + return action; + /* XXX: if you are going to edit more fields beyond ethernet header * (example when you add IP header replacement or vlan swap) * then MAX_EDIT_LEN needs to change appropriately @@ -41,10 +49,6 @@ static int tcf_skbmod_act(struct sk_buff *skb, const struct tc_action *a, if (unlikely(err)) /* best policy is to drop on the floor */ goto drop;
- action = READ_ONCE(d->tcf_action); - if (unlikely(action == TC_ACT_SHOT)) - goto drop; - p = rcu_dereference_bh(d->skbmod_p); flags = p->flags; if (flags & SKBMOD_F_DMAC)
From: Paolo Abeni pabeni@redhat.com
[ Upstream commit 8fb4792f091e608a0a1d353dfdf07ef55a719db5 ]
While running the self-tests on a KASAN enabled kernel, I observed a slab-out-of-bounds splat very similar to the one reported in commit 821bbf79fe46 ("ipv6: Fix KASAN: slab-out-of-bounds Read in fib6_nh_flush_exceptions").
We additionally need to take care of fib6_metrics initialization failure when the caller provides an nh.
The fix is similar, explicitly free the route instead of calling fib6_info_release on a half-initialized object.
Fixes: f88d8ea67fbdb ("ipv6: Plumb support for nexthop object in a fib6_info") Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv6/route.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv6/route.c b/net/ipv6/route.c index d417e514bd52..09e84161b731 100644 --- a/net/ipv6/route.c +++ b/net/ipv6/route.c @@ -3642,7 +3642,7 @@ static struct fib6_info *ip6_route_info_create(struct fib6_config *cfg, err = PTR_ERR(rt->fib6_metrics); /* Do not leave garbage there. */ rt->fib6_metrics = (struct dst_metrics *)&dst_default_metrics; - goto out; + goto out_free; }
if (cfg->fc_flags & RTF_ADDRCONF)
From: Luis Henriques lhenriques@suse.de
[ Upstream commit cdb330f4b41ab55feb35487729e883c9e08b8a54 ]
If MDSs aren't available while mounting a filesystem, the session state will transition from SESSION_OPENING to SESSION_CLOSING. And in that scenario check_session_state() will be called from delayed_work() and trigger this WARN.
Avoid this by only WARNing after a session has already been established (i.e., the s_ttl will be different from 0).
Fixes: 62575e270f66 ("ceph: check session state after bumping session->s_seq") Signed-off-by: Luis Henriques lhenriques@suse.de Reviewed-by: Jeff Layton jlayton@kernel.org Signed-off-by: Ilya Dryomov idryomov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ceph/mds_client.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index e5af591d3bd4..86f09b1110a2 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -4468,7 +4468,7 @@ bool check_session_state(struct ceph_mds_session *s) break; case CEPH_MDS_SESSION_CLOSING: /* Should never reach this when we're unmounting */ - WARN_ON_ONCE(true); + WARN_ON_ONCE(s->s_ttl); fallthrough; case CEPH_MDS_SESSION_NEW: case CEPH_MDS_SESSION_RESTARTING:
From: Chris Packham chris.packham@alliedtelesis.co.nz
[ Upstream commit 4a8ac5e45cdaa88884b4ce05303e304cbabeb367 ]
During some transfers the bus can still be busy when an interrupt is received. Commit 763778cd7926 ("i2c: mpc: Restore reread of I2C status register") attempted to address this by re-reading MPC_I2C_SR once but that just made it less likely to happen without actually preventing it. Instead of a single re-read, poll with a timeout so that the bus is given enough time to settle but a genuine stuck SCL is still noticed.
Fixes: 1538d82f4647 ("i2c: mpc: Interrupt driven transfer") Signed-off-by: Chris Packham chris.packham@alliedtelesis.co.nz Signed-off-by: Wolfram Sang wsa@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i2c/busses/i2c-mpc.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/i2c/busses/i2c-mpc.c b/drivers/i2c/busses/i2c-mpc.c index 6d5014ebaab5..a6ea1eb1394e 100644 --- a/drivers/i2c/busses/i2c-mpc.c +++ b/drivers/i2c/busses/i2c-mpc.c @@ -635,8 +635,8 @@ static irqreturn_t mpc_i2c_isr(int irq, void *dev_id)
status = readb(i2c->base + MPC_I2C_SR); if (status & CSR_MIF) { - /* Read again to allow register to stabilise */ - status = readb(i2c->base + MPC_I2C_SR); + /* Wait up to 100us for transfer to properly complete */ + readb_poll_timeout(i2c->base + MPC_I2C_SR, status, !(status & CSR_MCF), 0, 100); writeb(0, i2c->base + MPC_I2C_SR); mpc_i2c_do_intr(i2c, status); return IRQ_HANDLED;
From: David Disseldorp ddiss@suse.de
[ Upstream commit a47fa41381a09e5997afd762664db4f5f6657e03 ]
CPU affinity control added with commit 39ae3edda325 ("scsi: target: core: Make completion affinity configurable") makes target_complete_cmd() queue work on a CPU based on se_tpg->se_tpg_wwn->cmd_compl_affinity state.
LIO's EXTENDED COPY worker is a special case in that read/write cmds are dispatched using the global xcopy_pt_tpg, which carries a NULL se_tpg_wwn pointer following initialization in target_xcopy_setup_pt().
The NULL xcopy_pt_tpg->se_tpg_wwn pointer is dereferenced on completion of any EXTENDED COPY initiated read/write cmds. E.g using the libiscsi SCSI.ExtendedCopy.Simple test:
BUG: kernel NULL pointer dereference, address: 00000000000001a8 RIP: 0010:target_complete_cmd+0x9d/0x130 [target_core_mod] Call Trace: fd_execute_rw+0x148/0x42a [target_core_file] ? __dynamic_pr_debug+0xa7/0xe0 ? target_check_reservation+0x5b/0x940 [target_core_mod] __target_execute_cmd+0x1e/0x90 [target_core_mod] transport_generic_new_cmd+0x17c/0x330 [target_core_mod] target_xcopy_issue_pt_cmd+0x9/0x60 [target_core_mod] target_xcopy_read_source.isra.7+0x10b/0x1b0 [target_core_mod] ? target_check_fua+0x40/0x40 [target_core_mod] ? transport_complete_task_attr+0x130/0x130 [target_core_mod] target_xcopy_do_work+0x61f/0xc00 [target_core_mod]
This fix makes target_complete_cmd() queue work on se_cmd->cpuid if se_tpg_wwn is NULL.
Link: https://lore.kernel.org/r/20210720225522.26291-1-ddiss@suse.de Fixes: 39ae3edda325 ("scsi: target: core: Make completion affinity configurable") Cc: Lee Duncan lduncan@suse.com Cc: Mike Christie michael.christie@oracle.com Reviewed-by: Mike Christie michael.christie@oracle.com Signed-off-by: David Disseldorp ddiss@suse.de Signed-off-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/target/target_core_transport.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/target/target_core_transport.c b/drivers/target/target_core_transport.c index 7e35eddd9eb7..26ceabe34de5 100644 --- a/drivers/target/target_core_transport.c +++ b/drivers/target/target_core_transport.c @@ -886,7 +886,7 @@ void target_complete_cmd(struct se_cmd *cmd, u8 scsi_status) INIT_WORK(&cmd->work, success ? target_complete_ok_work : target_complete_failure_work);
- if (wwn->cmd_compl_affinity == SE_COMPL_AFFINITY_CPUID) + if (!wwn || wwn->cmd_compl_affinity == SE_COMPL_AFFINITY_CPUID) cpu = cmd->cpuid; else cpu = wwn->cmd_compl_affinity;
From: Jason Ekstrand jason@jlekstrand.net
[ Upstream commit 235c3610d5f02ee91244239b43cd9ae8b4859dff ]
If we have a failure, decrement the reference count so that the next call to ttm_global_init() will actually do something instead of assume everything is all set up.
Signed-off-by: Jason Ekstrand jason@jlekstrand.net Fixes: 62b53b37e4b1 ("drm/ttm: use a static ttm_bo_global instance") Reviewed-by: Christian König christian.koenig@amd.com Link: https://patchwork.freedesktop.org/patch/msgid/20210720181357.2760720-5-jason... Signed-off-by: Christian König christian.koenig@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/ttm/ttm_device.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/ttm/ttm_device.c b/drivers/gpu/drm/ttm/ttm_device.c index 3d9c62b93e29..ef6e0c042bb1 100644 --- a/drivers/gpu/drm/ttm/ttm_device.c +++ b/drivers/gpu/drm/ttm/ttm_device.c @@ -100,6 +100,8 @@ static int ttm_global_init(void) debugfs_create_atomic_t("buffer_objects", 0444, ttm_debugfs_root, &glob->bo_count); out: + if (ret) + --ttm_glob_use_count; mutex_unlock(&ttm_global_mutex); return ret; }
From: Zhihao Cheng chengzhihao1@huawei.com
[ Upstream commit 7764656b108cd308c39e9a8554353b8f9ca232a3 ]
Followling process: nvme_probe nvme_reset_ctrl nvme_change_ctrl_state(ctrl, NVME_CTRL_RESETTING) queue_work(nvme_reset_wq, &ctrl->reset_work)
--------------> nvme_remove nvme_change_ctrl_state(&dev->ctrl, NVME_CTRL_DELETING) worker_thread process_one_work nvme_reset_work WARN_ON(dev->ctrl.state != NVME_CTRL_RESETTING)
, which will trigger WARN_ON in nvme_reset_work(): [ 127.534298] WARNING: CPU: 0 PID: 139 at drivers/nvme/host/pci.c:2594 [ 127.536161] CPU: 0 PID: 139 Comm: kworker/u8:7 Not tainted 5.13.0 [ 127.552518] Call Trace: [ 127.552840] ? kvm_sched_clock_read+0x25/0x40 [ 127.553936] ? native_send_call_func_single_ipi+0x1c/0x30 [ 127.555117] ? send_call_function_single_ipi+0x9b/0x130 [ 127.556263] ? __smp_call_single_queue+0x48/0x60 [ 127.557278] ? ttwu_queue_wakelist+0xfa/0x1c0 [ 127.558231] ? try_to_wake_up+0x265/0x9d0 [ 127.559120] ? ext4_end_io_rsv_work+0x160/0x290 [ 127.560118] process_one_work+0x28c/0x640 [ 127.561002] worker_thread+0x39a/0x700 [ 127.561833] ? rescuer_thread+0x580/0x580 [ 127.562714] kthread+0x18c/0x1e0 [ 127.563444] ? set_kthread_struct+0x70/0x70 [ 127.564347] ret_from_fork+0x1f/0x30
The preceding problem can be easily reproduced by executing following script (based on blktests suite): test() { pdev="$(_get_pci_dev_from_blkdev)" sysfs="/sys/bus/pci/devices/${pdev}" for ((i = 0; i < 10; i++)); do echo 1 > "$sysfs/remove" echo 1 > /sys/bus/pci/rescan done }
Since the device ctrl could be updated as an non-RESETTING state by repeating probe/remove in userspace (which is a normal situation), we can replace stack dumping WARN_ON with a warnning message.
Fixes: 82b057caefaff ("nvme-pci: fix multiple ctrl removal schedulin") Signed-off-by: Zhihao Cheng chengzhihao1@huawei.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/pci.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index c625da463330..fb1c5ae0da39 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -2591,7 +2591,9 @@ static void nvme_reset_work(struct work_struct *work) bool was_suspend = !!(dev->ctrl.ctrl_config & NVME_CC_SHN_NORMAL); int result;
- if (WARN_ON(dev->ctrl.state != NVME_CTRL_RESETTING)) { + if (dev->ctrl.state != NVME_CTRL_RESETTING) { + dev_warn(dev->ctrl.device, "ctrl state %d is not RESETTING\n", + dev->ctrl.state); result = -ENODEV; goto out; }
From: Vincent Palatin vpalatin@chromium.org
[ Upstream commit f3a1a937f7b240be623d989c8553a6d01465d04f ]
This reverts commit 0bd860493f81eb2a46173f6f5e44cc38331c8dbd.
While the patch was working as stated,ie preventing the L850-GL LTE modem from crashing on some U3 wake-ups due to a race condition between the host wake-up and the modem-side wake-up, when using the MBIM interface, this would force disabling the USB runtime PM on the device.
The increased power consumption is significant for LTE laptops, and given that with decently recent modem firmwares, when the modem hits the bug, it automatically recovers (ie it drops from the bus, but automatically re-enumerates after less than half a second, rather than being stuck until a power cycle as it was doing with ancient firmware), for most people, the trade-off now seems in favor of re-enabling it by default.
For people with access to the platform code, the bug can also be worked-around successfully by changing the USB3 LFPM polling off-time for the XHCI controller in the BIOS code.
Signed-off-by: Vincent Palatin vpalatin@chromium.org Link: https://lore.kernel.org/r/20210721092516.2775971-1-vpalatin@chromium.org Fixes: 0bd860493f81 ("USB: quirks: ignore remote wake-up on Fibocom L850-GL LTE modem") Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/core/quirks.c | 4 ---- 1 file changed, 4 deletions(-)
diff --git a/drivers/usb/core/quirks.c b/drivers/usb/core/quirks.c index 21e7522655ac..a54a735b6384 100644 --- a/drivers/usb/core/quirks.c +++ b/drivers/usb/core/quirks.c @@ -502,10 +502,6 @@ static const struct usb_device_id usb_quirk_list[] = { /* DJI CineSSD */ { USB_DEVICE(0x2ca3, 0x0031), .driver_info = USB_QUIRK_NO_LPM },
- /* Fibocom L850-GL LTE Modem */ - { USB_DEVICE(0x2cb7, 0x0007), .driver_info = - USB_QUIRK_IGNORE_REMOTE_WAKEUP }, - /* INTEL VALUE SSD */ { USB_DEVICE(0x8086, 0xf1a5), .driver_info = USB_QUIRK_RESET_RESUME },
From: David Howells dhowells@redhat.com
[ Upstream commit 6c881ca0b3040f3e724eae513117ba4ddef86057 ]
To quote Alexey[1]:
I was adding custom tracepoint to the kernel, grabbed full F34 kernel .config, disabled modules and booted whole shebang as VM kernel.
Then did
perf record -a -e ...
It crashed:
general protection fault, probably for non-canonical address 0x435f5346592e4243: 0000 [#1] SMP PTI CPU: 1 PID: 842 Comm: cat Not tainted 5.12.6+ #26 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014 RIP: 0010:t_show+0x22/0xd0
Then reproducer was narrowed to
# cat /sys/kernel/tracing/printk_formats
Original F34 kernel with modules didn't crash.
So I started to disable options and after disabling AFS everything started working again.
The root cause is that AFS was placing char arrays content into a section full of _pointers_ to strings with predictable consequences.
Non canonical address 435f5346592e4243 is "CB.YFS_" which came from CM_NAME macro.
Steps to reproduce:
CONFIG_AFS=y CONFIG_TRACING=y
# cat /sys/kernel/tracing/printk_formats
Fix this by the following means:
(1) Add enum->string translation tables in the event header with the AFS and YFS cache/callback manager operations listed by RPC operation ID.
(2) Modify the afs_cb_call tracepoint to print the string from the translation table rather than using the string at the afs_call name pointer.
(3) Switch translation table depending on the service we're being accessed as (AFS or YFS) in the tracepoint print clause. Will this cause problems to userspace utilities?
Note that the symbolic representation of the YFS service ID isn't available to this header, so I've put it in as a number. I'm not sure if this is the best way to do this.
(4) Remove the name wrangling (CM_NAME) macro and put the names directly into the afs_call_type structs in cmservice.c.
Fixes: 8e8d7f13b6d5a9 ("afs: Add some tracepoints") Reported-by: Alexey Dobriyan (SK hynix) adobriyan@gmail.com Signed-off-by: David Howells dhowells@redhat.com Reviewed-by: Steven Rostedt (VMware) rostedt@goodmis.org Reviewed-by: Marc Dionne marc.dionne@auristor.com cc: Andrew Morton akpm@linux-foundation.org cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/YLAXfvZ+rObEOdc%2F@localhost.localdomain/ [1] Link: https://lore.kernel.org/r/643721.1623754699@warthog.procyon.org.uk/ Link: https://lore.kernel.org/r/162430903582.2896199.6098150063997983353.stgit@war... # v1 Link: https://lore.kernel.org/r/162609463957.3133237.15916579353149746363.stgit@wa... # v1 (repost) Link: https://lore.kernel.org/r/162610726860.3408253.445207609466288531.stgit@wart... # v2 Signed-off-by: Sasha Levin sashal@kernel.org --- fs/afs/cmservice.c | 25 ++++---------- include/trace/events/afs.h | 67 +++++++++++++++++++++++++++++++++++--- 2 files changed, 69 insertions(+), 23 deletions(-)
diff --git a/fs/afs/cmservice.c b/fs/afs/cmservice.c index d3c6bb22c5f4..a3f5de28be79 100644 --- a/fs/afs/cmservice.c +++ b/fs/afs/cmservice.c @@ -29,16 +29,11 @@ static void SRXAFSCB_TellMeAboutYourself(struct work_struct *);
static int afs_deliver_yfs_cb_callback(struct afs_call *);
-#define CM_NAME(name) \ - char afs_SRXCB##name##_name[] __tracepoint_string = \ - "CB." #name - /* * CB.CallBack operation type */ -static CM_NAME(CallBack); static const struct afs_call_type afs_SRXCBCallBack = { - .name = afs_SRXCBCallBack_name, + .name = "CB.CallBack", .deliver = afs_deliver_cb_callback, .destructor = afs_cm_destructor, .work = SRXAFSCB_CallBack, @@ -47,9 +42,8 @@ static const struct afs_call_type afs_SRXCBCallBack = { /* * CB.InitCallBackState operation type */ -static CM_NAME(InitCallBackState); static const struct afs_call_type afs_SRXCBInitCallBackState = { - .name = afs_SRXCBInitCallBackState_name, + .name = "CB.InitCallBackState", .deliver = afs_deliver_cb_init_call_back_state, .destructor = afs_cm_destructor, .work = SRXAFSCB_InitCallBackState, @@ -58,9 +52,8 @@ static const struct afs_call_type afs_SRXCBInitCallBackState = { /* * CB.InitCallBackState3 operation type */ -static CM_NAME(InitCallBackState3); static const struct afs_call_type afs_SRXCBInitCallBackState3 = { - .name = afs_SRXCBInitCallBackState3_name, + .name = "CB.InitCallBackState3", .deliver = afs_deliver_cb_init_call_back_state3, .destructor = afs_cm_destructor, .work = SRXAFSCB_InitCallBackState, @@ -69,9 +62,8 @@ static const struct afs_call_type afs_SRXCBInitCallBackState3 = { /* * CB.Probe operation type */ -static CM_NAME(Probe); static const struct afs_call_type afs_SRXCBProbe = { - .name = afs_SRXCBProbe_name, + .name = "CB.Probe", .deliver = afs_deliver_cb_probe, .destructor = afs_cm_destructor, .work = SRXAFSCB_Probe, @@ -80,9 +72,8 @@ static const struct afs_call_type afs_SRXCBProbe = { /* * CB.ProbeUuid operation type */ -static CM_NAME(ProbeUuid); static const struct afs_call_type afs_SRXCBProbeUuid = { - .name = afs_SRXCBProbeUuid_name, + .name = "CB.ProbeUuid", .deliver = afs_deliver_cb_probe_uuid, .destructor = afs_cm_destructor, .work = SRXAFSCB_ProbeUuid, @@ -91,9 +82,8 @@ static const struct afs_call_type afs_SRXCBProbeUuid = { /* * CB.TellMeAboutYourself operation type */ -static CM_NAME(TellMeAboutYourself); static const struct afs_call_type afs_SRXCBTellMeAboutYourself = { - .name = afs_SRXCBTellMeAboutYourself_name, + .name = "CB.TellMeAboutYourself", .deliver = afs_deliver_cb_tell_me_about_yourself, .destructor = afs_cm_destructor, .work = SRXAFSCB_TellMeAboutYourself, @@ -102,9 +92,8 @@ static const struct afs_call_type afs_SRXCBTellMeAboutYourself = { /* * YFS CB.CallBack operation type */ -static CM_NAME(YFS_CallBack); static const struct afs_call_type afs_SRXYFSCB_CallBack = { - .name = afs_SRXCBYFS_CallBack_name, + .name = "YFSCB.CallBack", .deliver = afs_deliver_yfs_cb_callback, .destructor = afs_cm_destructor, .work = SRXAFSCB_CallBack, diff --git a/include/trace/events/afs.h b/include/trace/events/afs.h index 3ccf591b2374..9f73ed2cf061 100644 --- a/include/trace/events/afs.h +++ b/include/trace/events/afs.h @@ -174,6 +174,34 @@ enum afs_vl_operation { afs_VL_GetCapabilities = 65537, /* AFS Get VL server capabilities */ };
+enum afs_cm_operation { + afs_CB_CallBack = 204, /* AFS break callback promises */ + afs_CB_InitCallBackState = 205, /* AFS initialise callback state */ + afs_CB_Probe = 206, /* AFS probe client */ + afs_CB_GetLock = 207, /* AFS get contents of CM lock table */ + afs_CB_GetCE = 208, /* AFS get cache file description */ + afs_CB_GetXStatsVersion = 209, /* AFS get version of extended statistics */ + afs_CB_GetXStats = 210, /* AFS get contents of extended statistics data */ + afs_CB_InitCallBackState3 = 213, /* AFS initialise callback state, version 3 */ + afs_CB_ProbeUuid = 214, /* AFS check the client hasn't rebooted */ +}; + +enum yfs_cm_operation { + yfs_CB_Probe = 206, /* YFS probe client */ + yfs_CB_GetLock = 207, /* YFS get contents of CM lock table */ + yfs_CB_XStatsVersion = 209, /* YFS get version of extended statistics */ + yfs_CB_GetXStats = 210, /* YFS get contents of extended statistics data */ + yfs_CB_InitCallBackState3 = 213, /* YFS initialise callback state, version 3 */ + yfs_CB_ProbeUuid = 214, /* YFS check the client hasn't rebooted */ + yfs_CB_GetServerPrefs = 215, + yfs_CB_GetCellServDV = 216, + yfs_CB_GetLocalCell = 217, + yfs_CB_GetCacheConfig = 218, + yfs_CB_GetCellByNum = 65537, + yfs_CB_TellMeAboutYourself = 65538, /* get client capabilities */ + yfs_CB_CallBack = 64204, +}; + enum afs_edit_dir_op { afs_edit_dir_create, afs_edit_dir_create_error, @@ -436,6 +464,32 @@ enum afs_cb_break_reason { EM(afs_YFSVL_GetCellName, "YFSVL.GetCellName") \ E_(afs_VL_GetCapabilities, "VL.GetCapabilities")
+#define afs_cm_operations \ + EM(afs_CB_CallBack, "CB.CallBack") \ + EM(afs_CB_InitCallBackState, "CB.InitCallBackState") \ + EM(afs_CB_Probe, "CB.Probe") \ + EM(afs_CB_GetLock, "CB.GetLock") \ + EM(afs_CB_GetCE, "CB.GetCE") \ + EM(afs_CB_GetXStatsVersion, "CB.GetXStatsVersion") \ + EM(afs_CB_GetXStats, "CB.GetXStats") \ + EM(afs_CB_InitCallBackState3, "CB.InitCallBackState3") \ + E_(afs_CB_ProbeUuid, "CB.ProbeUuid") + +#define yfs_cm_operations \ + EM(yfs_CB_Probe, "YFSCB.Probe") \ + EM(yfs_CB_GetLock, "YFSCB.GetLock") \ + EM(yfs_CB_XStatsVersion, "YFSCB.XStatsVersion") \ + EM(yfs_CB_GetXStats, "YFSCB.GetXStats") \ + EM(yfs_CB_InitCallBackState3, "YFSCB.InitCallBackState3") \ + EM(yfs_CB_ProbeUuid, "YFSCB.ProbeUuid") \ + EM(yfs_CB_GetServerPrefs, "YFSCB.GetServerPrefs") \ + EM(yfs_CB_GetCellServDV, "YFSCB.GetCellServDV") \ + EM(yfs_CB_GetLocalCell, "YFSCB.GetLocalCell") \ + EM(yfs_CB_GetCacheConfig, "YFSCB.GetCacheConfig") \ + EM(yfs_CB_GetCellByNum, "YFSCB.GetCellByNum") \ + EM(yfs_CB_TellMeAboutYourself, "YFSCB.TellMeAboutYourself") \ + E_(yfs_CB_CallBack, "YFSCB.CallBack") + #define afs_edit_dir_ops \ EM(afs_edit_dir_create, "create") \ EM(afs_edit_dir_create_error, "c_fail") \ @@ -569,6 +623,8 @@ afs_server_traces; afs_cell_traces; afs_fs_operations; afs_vl_operations; +afs_cm_operations; +yfs_cm_operations; afs_edit_dir_ops; afs_edit_dir_reasons; afs_eproto_causes; @@ -649,20 +705,21 @@ TRACE_EVENT(afs_cb_call,
TP_STRUCT__entry( __field(unsigned int, call ) - __field(const char *, name ) __field(u32, op ) + __field(u16, service_id ) ),
TP_fast_assign( __entry->call = call->debug_id; - __entry->name = call->type->name; __entry->op = call->operation_ID; + __entry->service_id = call->service_id; ),
- TP_printk("c=%08x %s o=%u", + TP_printk("c=%08x %s", __entry->call, - __entry->name, - __entry->op) + __entry->service_id == 2501 ? + __print_symbolic(__entry->op, yfs_cm_operations) : + __print_symbolic(__entry->op, afs_cm_operations)) );
TRACE_EVENT(afs_call,
From: Tom Rix trix@redhat.com
[ Upstream commit afe6949862f77bcc14fa16ad7938a04e84586d6a ]
Static analysis reports this problem
write.c:773:29: warning: Assigned value is garbage or undefined mapping->writeback_index = next; ^ ~~~~ The call to afs_writepages_region() can return without setting next. So check the function return before using next.
Changes: ver #2: - Need to fix the range_cyclic case also[1].
Fixes: e87b03f5830e ("afs: Prepare for use of THPs") Signed-off-by: Tom Rix trix@redhat.com Signed-off-by: David Howells dhowells@redhat.com Reviewed-by: Marc Dionne marc.dionne@auristor.com cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/20210430155031.3287870-1-trix@redhat.com Link: https://lore.kernel.org/r/CAB9dFdvHsLsw7CMnB+4cgciWDSqVjuij4mH3TaXnHQB8sz5rH... [1] Link: https://lore.kernel.org/r/162609464716.3133237.10354897554363093252.stgit@wa... # v1 Link: https://lore.kernel.org/r/162610727640.3408253.8687445613469681311.stgit@war... # v2 Signed-off-by: Sasha Levin sashal@kernel.org --- fs/afs/write.c | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-)
diff --git a/fs/afs/write.c b/fs/afs/write.c index 3104b62c2082..1ed62e0ccfe5 100644 --- a/fs/afs/write.c +++ b/fs/afs/write.c @@ -771,13 +771,19 @@ int afs_writepages(struct address_space *mapping, if (wbc->range_cyclic) { start = mapping->writeback_index * PAGE_SIZE; ret = afs_writepages_region(mapping, wbc, start, LLONG_MAX, &next); - if (start > 0 && wbc->nr_to_write > 0 && ret == 0) - ret = afs_writepages_region(mapping, wbc, 0, start, - &next); - mapping->writeback_index = next / PAGE_SIZE; + if (ret == 0) { + mapping->writeback_index = next / PAGE_SIZE; + if (start > 0 && wbc->nr_to_write > 0) { + ret = afs_writepages_region(mapping, wbc, 0, + start, &next); + if (ret == 0) + mapping->writeback_index = + next / PAGE_SIZE; + } + } } else if (wbc->range_start == 0 && wbc->range_end == LLONG_MAX) { ret = afs_writepages_region(mapping, wbc, 0, LLONG_MAX, &next); - if (wbc->nr_to_write > 0) + if (wbc->nr_to_write > 0 && ret == 0) mapping->writeback_index = next; } else { ret = afs_writepages_region(mapping, wbc,
From: David Howells dhowells@redhat.com
[ Upstream commit 5a972474cf685bf99ca430979657095bda3a15c8 ]
Fix afs_writepages() to always set mapping->writeback_index to a page index and not a byte position[1].
Fixes: 31143d5d515e ("AFS: implement basic file write support") Reported-by: Marc Dionne marc.dionne@auristor.com Signed-off-by: David Howells dhowells@redhat.com Reviewed-by: Marc Dionne marc.dionne@auristor.com cc: linux-afs@lists.infradead.org Link: https://lore.kernel.org/r/CAB9dFdvHsLsw7CMnB+4cgciWDSqVjuij4mH3TaXnHQB8sz5rH... [1] Link: https://lore.kernel.org/r/162610728339.3408253.4604750166391496546.stgit@war... # v2 (no v1) Signed-off-by: Sasha Levin sashal@kernel.org --- fs/afs/write.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/afs/write.c b/fs/afs/write.c index 1ed62e0ccfe5..c0534697268e 100644 --- a/fs/afs/write.c +++ b/fs/afs/write.c @@ -784,7 +784,7 @@ int afs_writepages(struct address_space *mapping, } else if (wbc->range_start == 0 && wbc->range_end == LLONG_MAX) { ret = afs_writepages_region(mapping, wbc, 0, LLONG_MAX, &next); if (wbc->nr_to_write > 0 && ret == 0) - mapping->writeback_index = next; + mapping->writeback_index = next / PAGE_SIZE; } else { ret = afs_writepages_region(mapping, wbc, wbc->range_start, wbc->range_end, &next);
From: Sayanta Pattanayak sayanta.pattanayak@arm.com
[ Upstream commit e9a72f874d5b95cef0765bafc56005a50f72c5fe ]
When registering the MDIO bus for a r8169 device, we use the PCI bus/device specifier as a (seemingly) unique device identifier. However the very same BDF number can be used on another PCI segment, which makes the driver fail probing:
[ 27.544136] r8169 0002:07:00.0: enabling device (0000 -> 0003) [ 27.559734] sysfs: cannot create duplicate filename '/class/mdio_bus/r8169-700' .... [ 27.684858] libphy: mii_bus r8169-700 failed to register [ 27.695602] r8169: probe of 0002:07:00.0 failed with error -22
Add the segment number to the device name to make it more unique.
This fixes operation on ARM N1SDP boards, with two boards connected together to form an SMP system, and all on-board devices showing up twice, just on different PCI segments. A similar issue would occur on large systems with many PCI slots and multiple RTL8169 NICs.
Fixes: f1e911d5d0dfd ("r8169: add basic phylib support") Signed-off-by: Sayanta Pattanayak sayanta.pattanayak@arm.com [Andre: expand commit message, use pci_domain_nr()] Signed-off-by: Andre Przywara andre.przywara@arm.com Acked-by: Heiner Kallweit hkallweit1@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/realtek/r8169_main.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c index a0d4e052a79e..b8eb1b2a8de3 100644 --- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -5085,7 +5085,8 @@ static int r8169_mdio_register(struct rtl8169_private *tp) new_bus->priv = tp; new_bus->parent = &pdev->dev; new_bus->irq[0] = PHY_MAC_INTERRUPT; - snprintf(new_bus->id, MII_BUS_ID_SIZE, "r8169-%x", pci_dev_id(pdev)); + snprintf(new_bus->id, MII_BUS_ID_SIZE, "r8169-%x-%x", + pci_domain_nr(pdev->bus), pci_dev_id(pdev));
new_bus->read = r8169_mdio_read_reg; new_bus->write = r8169_mdio_write_reg;
From: Christoph Hellwig hch@lst.de
[ Upstream commit aaeb7bb061be545251606f4d9c82d710ca2a7c8e ]
When using Write Zeroes on a namespace that has protection information enabled they behavior without the PRACT bit counter-intuitive and will generally lead to validation failures when reading the written blocks. Fix this by always setting the PRACT bit that generates matching PI data on the fly.
Fixes: 6e02318eaea5 ("nvme: add support for the Write Zeroes command") Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: Keith Busch kbusch@kernel.org Reviewed-by: Martin K. Petersen martin.petersen@oracle.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/core.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 66973bb56305..148e756857a8 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -880,7 +880,10 @@ static inline blk_status_t nvme_setup_write_zeroes(struct nvme_ns *ns, cpu_to_le64(nvme_sect_to_lba(ns, blk_rq_pos(req))); cmnd->write_zeroes.length = cpu_to_le16((blk_rq_bytes(req) >> ns->lba_shift) - 1); - cmnd->write_zeroes.control = 0; + if (nvme_ns_has_pi(ns)) + cmnd->write_zeroes.control = cpu_to_le16(NVME_RW_PRINFO_PRACT); + else + cmnd->write_zeroes.control = 0; return BLK_STS_OK; }
From: Xin Long lucien.xin@gmail.com
[ Upstream commit 58acd10092268831e49de279446c314727101292 ]
syzbot reported a call trace:
BUG: KASAN: use-after-free in sctp_auth_shkey_hold+0x22/0xa0 net/sctp/auth.c:112 Call Trace: sctp_auth_shkey_hold+0x22/0xa0 net/sctp/auth.c:112 sctp_set_owner_w net/sctp/socket.c:131 [inline] sctp_sendmsg_to_asoc+0x152e/0x2180 net/sctp/socket.c:1865 sctp_sendmsg+0x103b/0x1d30 net/sctp/socket.c:2027 inet_sendmsg+0x99/0xe0 net/ipv4/af_inet.c:821 sock_sendmsg_nosec net/socket.c:703 [inline] sock_sendmsg+0xcf/0x120 net/socket.c:723
This is an use-after-free issue caused by not updating asoc->shkey after it was replaced in the key list asoc->endpoint_shared_keys, and the old key was freed.
This patch is to fix by also updating active_key for asoc when old key is being replaced with a new one. Note that this issue doesn't exist in sctp_auth_del_key_id(), as it's not allowed to delete the active_key from the asoc.
Fixes: 1b1e0bc99474 ("sctp: add refcnt support for sh_key") Reported-by: syzbot+b774577370208727d12b@syzkaller.appspotmail.com Signed-off-by: Xin Long lucien.xin@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/sctp/auth.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/net/sctp/auth.c b/net/sctp/auth.c index 6f8319b828b0..fe74c5f95630 100644 --- a/net/sctp/auth.c +++ b/net/sctp/auth.c @@ -860,6 +860,8 @@ int sctp_auth_set_key(struct sctp_endpoint *ep, if (replace) { list_del_init(&shkey->key_list); sctp_auth_shkey_release(shkey); + if (asoc && asoc->active_key_id == auth_key->sca_keynumber) + sctp_auth_asoc_init_active_key(asoc, GFP_KERNEL); } list_add(&cur_key->key_list, sh_keys);
From: Vadim Fedorenko vfedorenko@novek.ru
[ Upstream commit 9bfce73c8921c92a9565562e6e7d458d37b7ce80 ]
Commit d26796ae5894 ("udp: check udp sock encap_type in __udp_lib_err") added checks for encapsulated sockets but it broke cases when there is no implementation of encap_err_lookup for encapsulation, i.e. ESP in UDP encapsulation. Fix it by calling encap_err_lookup only if socket implements this method otherwise treat it as legal socket.
Fixes: d26796ae5894 ("udp: check udp sock encap_type in __udp_lib_err") Signed-off-by: Vadim Fedorenko vfedorenko@novek.ru Reviewed-by: Xin Long lucien.xin@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/udp.c | 25 +++++++++++++++++++------ net/ipv6/udp.c | 25 +++++++++++++++++++------ 2 files changed, 38 insertions(+), 12 deletions(-)
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c index ca9cf1051b1e..568dc31a0467 100644 --- a/net/ipv4/udp.c +++ b/net/ipv4/udp.c @@ -645,10 +645,12 @@ static struct sock *__udp4_lib_err_encap(struct net *net, const struct iphdr *iph, struct udphdr *uh, struct udp_table *udptable, + struct sock *sk, struct sk_buff *skb, u32 info) { + int (*lookup)(struct sock *sk, struct sk_buff *skb); int network_offset, transport_offset; - struct sock *sk; + struct udp_sock *up;
network_offset = skb_network_offset(skb); transport_offset = skb_transport_offset(skb); @@ -659,18 +661,28 @@ static struct sock *__udp4_lib_err_encap(struct net *net, /* Transport header needs to point to the UDP header */ skb_set_transport_header(skb, iph->ihl << 2);
+ if (sk) { + up = udp_sk(sk); + + lookup = READ_ONCE(up->encap_err_lookup); + if (lookup && lookup(sk, skb)) + sk = NULL; + + goto out; + } + sk = __udp4_lib_lookup(net, iph->daddr, uh->source, iph->saddr, uh->dest, skb->dev->ifindex, 0, udptable, NULL); if (sk) { - int (*lookup)(struct sock *sk, struct sk_buff *skb); - struct udp_sock *up = udp_sk(sk); + up = udp_sk(sk);
lookup = READ_ONCE(up->encap_err_lookup); if (!lookup || lookup(sk, skb)) sk = NULL; }
+out: if (!sk) sk = ERR_PTR(__udp4_lib_err_encap_no_sk(skb, info));
@@ -707,15 +719,16 @@ int __udp4_lib_err(struct sk_buff *skb, u32 info, struct udp_table *udptable) sk = __udp4_lib_lookup(net, iph->daddr, uh->dest, iph->saddr, uh->source, skb->dev->ifindex, inet_sdif(skb), udptable, NULL); + if (!sk || udp_sk(sk)->encap_type) { /* No socket for error: try tunnels before discarding */ - sk = ERR_PTR(-ENOENT); if (static_branch_unlikely(&udp_encap_needed_key)) { - sk = __udp4_lib_err_encap(net, iph, uh, udptable, skb, + sk = __udp4_lib_err_encap(net, iph, uh, udptable, sk, skb, info); if (!sk) return 0; - } + } else + sk = ERR_PTR(-ENOENT);
if (IS_ERR(sk)) { __ICMP_INC_STATS(net, ICMP_MIB_INERRORS); diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c index 6774e776228c..2d3bd4a9b0d0 100644 --- a/net/ipv6/udp.c +++ b/net/ipv6/udp.c @@ -502,12 +502,14 @@ static struct sock *__udp6_lib_err_encap(struct net *net, const struct ipv6hdr *hdr, int offset, struct udphdr *uh, struct udp_table *udptable, + struct sock *sk, struct sk_buff *skb, struct inet6_skb_parm *opt, u8 type, u8 code, __be32 info) { + int (*lookup)(struct sock *sk, struct sk_buff *skb); int network_offset, transport_offset; - struct sock *sk; + struct udp_sock *up;
network_offset = skb_network_offset(skb); transport_offset = skb_transport_offset(skb); @@ -518,18 +520,28 @@ static struct sock *__udp6_lib_err_encap(struct net *net, /* Transport header needs to point to the UDP header */ skb_set_transport_header(skb, offset);
+ if (sk) { + up = udp_sk(sk); + + lookup = READ_ONCE(up->encap_err_lookup); + if (lookup && lookup(sk, skb)) + sk = NULL; + + goto out; + } + sk = __udp6_lib_lookup(net, &hdr->daddr, uh->source, &hdr->saddr, uh->dest, inet6_iif(skb), 0, udptable, skb); if (sk) { - int (*lookup)(struct sock *sk, struct sk_buff *skb); - struct udp_sock *up = udp_sk(sk); + up = udp_sk(sk);
lookup = READ_ONCE(up->encap_err_lookup); if (!lookup || lookup(sk, skb)) sk = NULL; }
+out: if (!sk) { sk = ERR_PTR(__udp6_lib_err_encap_no_sk(skb, opt, type, code, offset, info)); @@ -558,16 +570,17 @@ int __udp6_lib_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
sk = __udp6_lib_lookup(net, daddr, uh->dest, saddr, uh->source, inet6_iif(skb), inet6_sdif(skb), udptable, NULL); + if (!sk || udp_sk(sk)->encap_type) { /* No socket for error: try tunnels before discarding */ - sk = ERR_PTR(-ENOENT); if (static_branch_unlikely(&udpv6_encap_needed_key)) { sk = __udp6_lib_err_encap(net, hdr, offset, uh, - udptable, skb, + udptable, sk, skb, opt, type, code, info); if (!sk) return 0; - } + } else + sk = ERR_PTR(-ENOENT);
if (IS_ERR(sk)) { __ICMP6_INC_STATS(net, __in6_dev_get(skb->dev),
From: Sukadev Bhattiprolu sukadev@linux.ibm.com
[ Upstream commit bb55362bd6976631b662ca712779b6532d8de0a6 ]
Commit 65d6470d139a ("ibmvnic: clean pending indirect buffs during reset") intended to remove the call to ibmvnic_tx_scrq_flush() when the ->resetting flag is true and was tested that way. But during the final rebase to net-next, the hunk got applied to a block few lines below (which happened to have the same diff context) and the wrong call to ibmvnic_tx_scrq_flush() got removed.
Fix that by removing the correct ibmvnic_tx_scrq_flush() and restoring the one that was incorrectly removed.
Fixes: 65d6470d139a ("ibmvnic: clean pending indirect buffs during reset") Reported-by: Dany Madden drt@linux.ibm.com Signed-off-by: Sukadev Bhattiprolu sukadev@linux.ibm.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/ibm/ibmvnic.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/ibm/ibmvnic.c b/drivers/net/ethernet/ibm/ibmvnic.c index efc98903c0b7..5b4a7ef7dffa 100644 --- a/drivers/net/ethernet/ibm/ibmvnic.c +++ b/drivers/net/ethernet/ibm/ibmvnic.c @@ -1707,7 +1707,6 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev) tx_send_failed++; tx_dropped++; ret = NETDEV_TX_OK; - ibmvnic_tx_scrq_flush(adapter, tx_scrq); goto out; }
@@ -1729,6 +1728,7 @@ static netdev_tx_t ibmvnic_xmit(struct sk_buff *skb, struct net_device *netdev) dev_kfree_skb_any(skb); tx_send_failed++; tx_dropped++; + ibmvnic_tx_scrq_flush(adapter, tx_scrq); ret = NETDEV_TX_OK; goto out; }
From: Bin Meng bmeng.cn@gmail.com
[ Upstream commit d0e4dae74470fb709fc0ab61862c317938f4cc4d ]
Commit dd2d082b5760 ("riscv: Cleanup setup_bootmem()") adjusted the calling sequence in setup_bootmem(), which invalidates the fix commit de043da0b9e7 ("RISC-V: Fix usage of memblock_enforce_memory_limit") did for 32-bit RISC-V unfortunately.
So now 32-bit RISC-V does not boot again when testing booting kernel on QEMU 'virt' with '-m 2G', which was exactly what the original commit de043da0b9e7 ("RISC-V: Fix usage of memblock_enforce_memory_limit") tried to fix.
Fixes: dd2d082b5760 ("riscv: Cleanup setup_bootmem()") Signed-off-by: Bin Meng bmeng.cn@gmail.com Signed-off-by: Palmer Dabbelt palmerdabbelt@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/mm/init.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index 4c4c92ce0bb8..9b23b95c50cf 100644 --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -123,7 +123,7 @@ void __init setup_bootmem(void) { phys_addr_t vmlinux_end = __pa_symbol(&_end); phys_addr_t vmlinux_start = __pa_symbol(&_start); - phys_addr_t dram_end = memblock_end_of_DRAM(); + phys_addr_t dram_end; phys_addr_t max_mapped_addr = __pa(~(ulong)0);
#ifdef CONFIG_XIP_KERNEL @@ -146,6 +146,8 @@ void __init setup_bootmem(void) #endif memblock_reserve(vmlinux_start, vmlinux_end - vmlinux_start);
+ dram_end = memblock_end_of_DRAM(); + /* * memblock allocator is not aware of the fact that last 4K bytes of * the addressable memory can not be mapped because of IS_ERR_VALUE
From: Wei Wang weiwan@google.com
[ Upstream commit 213ad73d06073b197a02476db3a4998e219ddb06 ]
Multiple complaints have been raised from the TFO users on the internet stating that the TFO blackhole logic is too aggressive and gets falsely triggered too often. (e.g. https://blog.apnic.net/2021/07/05/tcp-fast-open-not-so-fast/) Considering that most middleboxes no longer drop TFO packets, we decide to disable the blackhole logic by setting /proc/sys/net/ipv4/tcp_fastopen_blackhole_timeout_set to 0 by default.
Fixes: cf1ef3f0719b4 ("net/tcp_fastopen: Disable active side TFO in certain scenarios") Signed-off-by: Wei Wang weiwan@google.com Signed-off-by: Eric Dumazet edumazet@google.com Acked-by: Neal Cardwell ncardwell@google.com Acked-by: Soheil Hassas Yeganeh soheil@google.com Acked-by: Yuchung Cheng ycheng@google.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- Documentation/networking/ip-sysctl.rst | 2 +- net/ipv4/tcp_fastopen.c | 9 ++++++++- net/ipv4/tcp_ipv4.c | 2 +- 3 files changed, 10 insertions(+), 3 deletions(-)
diff --git a/Documentation/networking/ip-sysctl.rst b/Documentation/networking/ip-sysctl.rst index c2ecc9894fd0..9a57e972dae4 100644 --- a/Documentation/networking/ip-sysctl.rst +++ b/Documentation/networking/ip-sysctl.rst @@ -772,7 +772,7 @@ tcp_fastopen_blackhole_timeout_sec - INTEGER initial value when the blackhole issue goes away. 0 to disable the blackhole detection.
- By default, it is set to 1hr. + By default, it is set to 0 (feature is disabled).
tcp_fastopen_key - list of comma separated 32-digit hexadecimal INTEGERs The list consists of a primary key and an optional backup key. The diff --git a/net/ipv4/tcp_fastopen.c b/net/ipv4/tcp_fastopen.c index 08548ff23d83..d49709ba8e16 100644 --- a/net/ipv4/tcp_fastopen.c +++ b/net/ipv4/tcp_fastopen.c @@ -507,6 +507,9 @@ void tcp_fastopen_active_disable(struct sock *sk) { struct net *net = sock_net(sk);
+ if (!sock_net(sk)->ipv4.sysctl_tcp_fastopen_blackhole_timeout) + return; + /* Paired with READ_ONCE() in tcp_fastopen_active_should_disable() */ WRITE_ONCE(net->ipv4.tfo_active_disable_stamp, jiffies);
@@ -526,10 +529,14 @@ void tcp_fastopen_active_disable(struct sock *sk) bool tcp_fastopen_active_should_disable(struct sock *sk) { unsigned int tfo_bh_timeout = sock_net(sk)->ipv4.sysctl_tcp_fastopen_blackhole_timeout; - int tfo_da_times = atomic_read(&sock_net(sk)->ipv4.tfo_active_disable_times); unsigned long timeout; + int tfo_da_times; int multiplier;
+ if (!tfo_bh_timeout) + return false; + + tfo_da_times = atomic_read(&sock_net(sk)->ipv4.tfo_active_disable_times); if (!tfo_da_times) return false;
diff --git a/net/ipv4/tcp_ipv4.c b/net/ipv4/tcp_ipv4.c index e409f2de5dc4..8bb5f7f51dae 100644 --- a/net/ipv4/tcp_ipv4.c +++ b/net/ipv4/tcp_ipv4.c @@ -2954,7 +2954,7 @@ static int __net_init tcp_sk_init(struct net *net) net->ipv4.sysctl_tcp_comp_sack_nr = 44; net->ipv4.sysctl_tcp_fastopen = TFO_CLIENT_ENABLE; spin_lock_init(&net->ipv4.tcp_fastopen_ctx_lock); - net->ipv4.sysctl_tcp_fastopen_blackhole_timeout = 60 * 60; + net->ipv4.sysctl_tcp_fastopen_blackhole_timeout = 0; atomic_set(&net->ipv4.tfo_active_disable_times, 0);
/* Reno is always built in */
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit e40cba9490bab1414d45c2d62defc0ad4f6e4136 ]
This simple series of commands:
ip link add br0 type bridge vlan_filtering 1 ip link set swp0 master br0
fails on sja1105 with the following error: [ 33.439103] sja1105 spi0.1: vlan-lookup-table needs to have at least the default untagged VLAN [ 33.447710] sja1105 spi0.1: Invalid config, cannot upload Warning: sja1105: Failed to change VLAN Ethertype.
For context, sja1105 has 3 operating modes: - SJA1105_VLAN_UNAWARE: the dsa_8021q_vlans are committed to hardware - SJA1105_VLAN_FILTERING_FULL: the bridge_vlans are committed to hardware - SJA1105_VLAN_FILTERING_BEST_EFFORT: both the dsa_8021q_vlans and the bridge_vlans are committed to hardware
Swapping out a VLAN list and another in happens in sja1105_build_vlan_table(), which performs a delta update procedure. That function is called from a few places, notably from sja1105_vlan_filtering() which is called from the SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING handler.
The above set of 2 commands fails when run on a kernel pre-commit 8841f6e63f2c ("net: dsa: sja1105: make devlink property best_effort_vlan_filtering true by default"). So the priv->vlan_state transition that takes place is between VLAN-unaware and full VLAN filtering. So the dsa_8021q_vlans are swapped out and the bridge_vlans are swapped in.
So why does it fail?
Well, the bridge driver, through nbp_vlan_init(), first sets up the SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING attribute, and only then proceeds to call nbp_vlan_add for the default_pvid.
So when we swap out the dsa_8021q_vlans and swap in the bridge_vlans in the SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING handler, there are no bridge VLANs (yet). So we have wiped the VLAN table clean, and the low-level static config checker complains of an invalid configuration. We _will_ add the bridge VLANs using the dynamic config interface, albeit later, when nbp_vlan_add() calls us. So it is natural that it fails.
So why did it ever work?
Surprisingly, it looks like I only tested this configuration with 2 things set up in a particular way: - a network manager that brings all ports up - a kernel with CONFIG_VLAN_8021Q=y
It is widely known that commit ad1afb003939 ("vlan_dev: VLAN 0 should be treated as "no vlan tag" (802.1p packet)") installs VID 0 to every net device that comes up. DSA treats these VLANs as bridge VLANs, and therefore, in my testing, the list of bridge_vlans was never empty.
However, if CONFIG_VLAN_8021Q is not enabled, or the port is not up when it joins a VLAN-aware bridge, the bridge_vlans list will be temporarily empty, and the sja1105_static_config_reload() call from sja1105_vlan_filtering() will fail.
To fix this, the simplest thing is to keep VID 4095, the one used for CPU-injected control packets since commit ed040abca4c1 ("net: dsa: sja1105: use 4095 as the private VLAN for untagged traffic"), in the list of bridge VLANs too, not just the list of tag_8021q VLANs. This ensures that the list of bridge VLANs will never be empty.
Fixes: ec5ae61076d0 ("net: dsa: sja1105: save/restore VLANs using a delta commit method") Reported-by: Radu Pirea (NXP OSS) radu-nicolae.pirea@oss.nxp.com Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/sja1105/sja1105_main.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c index ebe4d33cda27..6e5dbe9f3892 100644 --- a/drivers/net/dsa/sja1105/sja1105_main.c +++ b/drivers/net/dsa/sja1105/sja1105_main.c @@ -378,6 +378,12 @@ static int sja1105_init_static_vlan(struct sja1105_private *priv) if (dsa_is_cpu_port(ds, port)) v->pvid = true; list_add(&v->list, &priv->dsa_8021q_vlans); + + v = kmemdup(v, sizeof(*v), GFP_KERNEL); + if (!v) + return -ENOMEM; + + list_add(&v->list, &priv->bridge_vlans); }
((struct sja1105_vlan_lookup_entry *)table->entries)[0] = pvid;
From: Heinrich Schuchardt xypron.glpk@gmx.de
[ Upstream commit c79e89ecaa246c880292ba68cbe08c9c30db77e3 ]
Requiring that initrd is loaded below RAM start + 256 MiB led to failure to boot SUSE Linux with GRUB on QEMU, cf. https://lists.gnu.org/archive/html/grub-devel/2021-06/msg00037.html
Remove the constraint.
Reported-by: Andreas Schwab schwab@linux-m68k.org Signed-off-by: Heinrich Schuchardt xypron.glpk@gmx.de Reviewed-by: Atish Patra atish.patra@wdc.com Acked-by: Ard Biesheuvel ardb@kernel.org Fixes: d7071743db31 ("RISC-V: Add EFI stub support.") Signed-off-by: Palmer Dabbelt palmerdabbelt@google.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/include/asm/efi.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/arch/riscv/include/asm/efi.h b/arch/riscv/include/asm/efi.h index 6d98cd999680..7b3483ba2e84 100644 --- a/arch/riscv/include/asm/efi.h +++ b/arch/riscv/include/asm/efi.h @@ -27,10 +27,10 @@ int efi_set_mapping_permissions(struct mm_struct *mm, efi_memory_desc_t *md);
#define ARCH_EFI_IRQ_FLAGS_MASK (SR_IE | SR_SPIE)
-/* Load initrd at enough distance from DRAM start */ +/* Load initrd anywhere in system RAM */ static inline unsigned long efi_get_max_initrd_addr(unsigned long image_addr) { - return image_addr + SZ_256M; + return ULONG_MAX; }
#define alloc_screen_info(x...) (&screen_info)
From: Yajun Deng yajun.deng@linux.dev
[ Upstream commit 9d85a6f44bd5585761947f40f7821c9cd78a1bbe ]
The 4th parameter in tc_chain_notify() should be flags rather than seq. Let's change it back correctly.
Fixes: 32a4f5ecd738 ("net: sched: introduce chain object to uapi") Signed-off-by: Yajun Deng yajun.deng@linux.dev Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/cls_api.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c index d73b5c5514a9..e3e79e9bd706 100644 --- a/net/sched/cls_api.c +++ b/net/sched/cls_api.c @@ -2904,7 +2904,7 @@ replay: break; case RTM_GETCHAIN: err = tc_chain_notify(chain, skb, n->nlmsg_seq, - n->nlmsg_seq, n->nlmsg_type, true); + n->nlmsg_flags, n->nlmsg_type, true); if (err < 0) NL_SET_ERR_MSG(extack, "Failed to send chain notify message"); break;
From: Maxime Ripard maxime@cerno.tech
[ Upstream commit 7bbcb919e32d776ca8ddce08abb391ab92eef6a9 ]
The mipi_dsi_device allocated by mipi_dsi_device_register_full() is already free'd on release.
Fixes: 2f733d6194bd ("drm/panel: Add support for the Raspberry Pi 7" Touchscreen.") Signed-off-by: Maxime Ripard maxime@cerno.tech Reviewed-by: Sam Ravnborg sam@ravnborg.org Link: https://patchwork.freedesktop.org/patch/msgid/20210720134525.563936-9-maxime... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c | 1 - 1 file changed, 1 deletion(-)
diff --git a/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c b/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c index 5e9ccefb88f6..bbdd086be7f5 100644 --- a/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c +++ b/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c @@ -447,7 +447,6 @@ static int rpi_touchscreen_remove(struct i2c_client *i2c) drm_panel_remove(&ts->base);
mipi_dsi_device_unregister(ts->dsi); - kfree(ts->dsi);
return 0; }
From: Ioana Ciornei ioana.ciornei@nxp.com
[ Upstream commit 7aaa0f311e2df2704fa8ddb8ed681a3b5841d0bf ]
Any interraction with the buffer pool (seeding a buffer, acquire one) is made through a software portal (SWP, a DPIO object). There are circumstances where the dpaa2-switch driver probes on a DPSW before any DPIO devices have been probed. In this case, seeding of the buffer pool will lead to a panic since no SWPs are initialized.
To fix this, seed the buffer pool after making sure that the software portals have been probed and are ready to be used.
Fixes: 0b1b71370458 ("staging: dpaa2-switch: handle Rx path on control interface") Signed-off-by: Ioana Ciornei ioana.ciornei@nxp.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/freescale/dpaa2/dpaa2-switch.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-)
diff --git a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c index 05de37c3b64c..87321b7239cf 100644 --- a/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c +++ b/drivers/net/ethernet/freescale/dpaa2/dpaa2-switch.c @@ -2770,32 +2770,32 @@ static int dpaa2_switch_ctrl_if_setup(struct ethsw_core *ethsw) if (err) return err;
- err = dpaa2_switch_seed_bp(ethsw); - if (err) - goto err_free_dpbp; - err = dpaa2_switch_alloc_rings(ethsw); if (err) - goto err_drain_dpbp; + goto err_free_dpbp;
err = dpaa2_switch_setup_dpio(ethsw); if (err) goto err_destroy_rings;
+ err = dpaa2_switch_seed_bp(ethsw); + if (err) + goto err_deregister_dpio; + err = dpsw_ctrl_if_enable(ethsw->mc_io, 0, ethsw->dpsw_handle); if (err) { dev_err(ethsw->dev, "dpsw_ctrl_if_enable err %d\n", err); - goto err_deregister_dpio; + goto err_drain_dpbp; }
return 0;
+err_drain_dpbp: + dpaa2_switch_drain_bp(ethsw); err_deregister_dpio: dpaa2_switch_free_dpio(ethsw); err_destroy_rings: dpaa2_switch_destroy_rings(ethsw); -err_drain_dpbp: - dpaa2_switch_drain_bp(ethsw); err_free_dpbp: dpaa2_switch_free_dpbp(ethsw);
From: Ronnie Sahlberg lsahlber@redhat.com
[ Upstream commit 2485bd7557a7edb4520b4072af464f0a08c8efe0 ]
We only allow sending single credit writes through the SMB2_write() synchronous api so split this into smaller chunks.
Fixes: 966a3cb7c7db ("cifs: improve fallocate emulation")
Signed-off-by: Ronnie Sahlberg lsahlber@redhat.com Reported-by: Namjae Jeon namjae.jeon@samsung.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cifs/smb2ops.c | 26 +++++++++++++++++++------- 1 file changed, 19 insertions(+), 7 deletions(-)
diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c index 903de7449aa3..cc253bbff696 100644 --- a/fs/cifs/smb2ops.c +++ b/fs/cifs/smb2ops.c @@ -3613,7 +3613,7 @@ static int smb3_simple_fallocate_write_range(unsigned int xid, char *buf) { struct cifs_io_parms io_parms = {0}; - int nbytes; + int rc, nbytes; struct kvec iov[2];
io_parms.netfid = cfile->fid.netfid; @@ -3621,13 +3621,25 @@ static int smb3_simple_fallocate_write_range(unsigned int xid, io_parms.tcon = tcon; io_parms.persistent_fid = cfile->fid.persistent_fid; io_parms.volatile_fid = cfile->fid.volatile_fid; - io_parms.offset = off; - io_parms.length = len;
- /* iov[0] is reserved for smb header */ - iov[1].iov_base = buf; - iov[1].iov_len = io_parms.length; - return SMB2_write(xid, &io_parms, &nbytes, iov, 1); + while (len) { + io_parms.offset = off; + io_parms.length = len; + if (io_parms.length > SMB2_MAX_BUFFER_SIZE) + io_parms.length = SMB2_MAX_BUFFER_SIZE; + /* iov[0] is reserved for smb header */ + iov[1].iov_base = buf; + iov[1].iov_len = io_parms.length; + rc = SMB2_write(xid, &io_parms, &nbytes, iov, 1); + if (rc) + break; + if (nbytes > len) + return -EINVAL; + buf += nbytes; + off += nbytes; + len -= nbytes; + } + return rc; }
static int smb3_simple_fallocate_range(unsigned int xid,
From: Ronnie Sahlberg lsahlber@redhat.com
[ Upstream commit 488968a8945c119859d91bb6a8dc13bf50002f15 ]
Remove the conditional checking for out_data_len and skipping the fallocate if it is 0. This is wrong will actually change any legitimate the fallocate where the entire region is unallocated into a no-op.
Additionally, before allocating the range, if FALLOC_FL_KEEP_SIZE is set then we need to clamp the length of the fallocate region as to not extend the size of the file.
Fixes: 966a3cb7c7db ("cifs: improve fallocate emulation") Signed-off-by: Ronnie Sahlberg lsahlber@redhat.com Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/cifs/smb2ops.c | 23 ++++++++++++++++++----- 1 file changed, 18 insertions(+), 5 deletions(-)
diff --git a/fs/cifs/smb2ops.c b/fs/cifs/smb2ops.c index cc253bbff696..64cad843ce72 100644 --- a/fs/cifs/smb2ops.c +++ b/fs/cifs/smb2ops.c @@ -3663,11 +3663,6 @@ static int smb3_simple_fallocate_range(unsigned int xid, (char **)&out_data, &out_data_len); if (rc) goto out; - /* - * It is already all allocated - */ - if (out_data_len == 0) - goto out;
buf = kzalloc(1024 * 1024, GFP_KERNEL); if (buf == NULL) { @@ -3790,6 +3785,24 @@ static long smb3_simple_falloc(struct file *file, struct cifs_tcon *tcon, goto out; }
+ if (keep_size == true) { + /* + * We can not preallocate pages beyond the end of the file + * in SMB2 + */ + if (off >= i_size_read(inode)) { + rc = 0; + goto out; + } + /* + * For fallocates that are partially beyond the end of file, + * clamp len so we only fallocate up to the end of file. + */ + if (off + len > i_size_read(inode)) { + len = i_size_read(inode) - off; + } + } + if ((keep_size == true) || (i_size_read(inode) >= off + len)) { /* * At this point, we are trying to fallocate an internal
From: Marcelo Henrique Cerri marcelo.cerri@canonical.com
[ Upstream commit d238692b4b9f2c36e35af4c6e6f6da36184aeb3e ]
Use size_t when capping the count argument received by mem_rw(). Since count is size_t, using min_t(int, ...) can lead to a negative value that will later be passed to access_remote_vm(), which can cause unexpected behavior.
Since we are capping the value to at maximum PAGE_SIZE, the conversion from size_t to int when passing it to access_remote_vm() as "len" shouldn't be a problem.
Link: https://lkml.kernel.org/r/20210512125215.3348316-1-marcelo.cerri@canonical.c... Reviewed-by: David Disseldorp ddiss@suse.de Signed-off-by: Thadeu Lima de Souza Cascardo cascardo@canonical.com Signed-off-by: Marcelo Henrique Cerri marcelo.cerri@canonical.com Cc: Alexey Dobriyan adobriyan@gmail.com Cc: Souza Cascardo cascardo@canonical.com Cc: Christian Brauner christian.brauner@ubuntu.com Cc: Michel Lespinasse walken@google.com Cc: Helge Deller deller@gmx.de Cc: Oleg Nesterov oleg@redhat.com Cc: Lorenzo Stoakes lstoakes@gmail.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/proc/base.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/proc/base.c b/fs/proc/base.c index 9cbd915025ad..a0a2fc1c9da2 100644 --- a/fs/proc/base.c +++ b/fs/proc/base.c @@ -854,7 +854,7 @@ static ssize_t mem_rw(struct file *file, char __user *buf, flags = FOLL_FORCE | (write ? FOLL_WRITE : 0);
while (count > 0) { - int this_len = min_t(int, count, PAGE_SIZE); + size_t this_len = min_t(size_t, count, PAGE_SIZE);
if (write && copy_from_user(page, buf, this_len)) { copied = -EFAULT;
From: Linus Torvalds torvalds@linux-foundation.org
commit fc68f42aa737dc15e7665a4101d4168aadb8e4c4 upstream.
Commit 71f642833284 ("ACPI: utils: Fix reference counting in for_each_acpi_dev_match()") started doing "acpi_dev_put()" on a pointer that was possibly NULL. That fails miserably, because that helper inline function is not set up to handle that case.
Just make acpi_dev_put() silently accept a NULL pointer, rather than calling down to put_device() with an invalid offset off that NULL pointer.
Link: https://lore.kernel.org/lkml/a607c149-6bf6-0fd0-0e31-100378504da2@kernel.dk/ Reported-and-tested-by: Jens Axboe axboe@kernel.dk Tested-by: Daniel Scally djrscally@gmail.com Cc: Andy Shevchenko andy.shevchenko@gmail.com Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/acpi/acpi_bus.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/include/acpi/acpi_bus.h +++ b/include/acpi/acpi_bus.h @@ -711,7 +711,8 @@ static inline struct acpi_device *acpi_d
static inline void acpi_dev_put(struct acpi_device *adev) { - put_device(&adev->dev); + if (adev) + put_device(&adev->dev); } #else /* CONFIG_ACPI */
From: Olivier Langlois olivier@trillion01.com
commit 997135017716c33f3405e86cca5da9567b40a08e upstream.
If an asynchronous completion happens before the task is preparing itself to wait and set its state to TASK_INTERRUPTIBLE, the completion will not wake up the sqp thread.
Cc: stable@vger.kernel.org Signed-off-by: Olivier Langlois olivier@trillion01.com Reviewed-by: Pavel Begunkov asml.silence@gmail.com Link: https://lore.kernel.org/r/d1419dc32ec6a97b453bee34dc03fa6a02797142.162447320... Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/io_uring.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -6876,7 +6876,8 @@ static int io_sq_thread(void *data) }
prepare_to_wait(&sqd->wait, &wait, TASK_INTERRUPTIBLE); - if (!test_bit(IO_SQ_THREAD_SHOULD_PARK, &sqd->state)) { + if (!test_bit(IO_SQ_THREAD_SHOULD_PARK, &sqd->state) && + !io_run_task_work()) { list_for_each_entry(ctx, &sqd->ctx_list, sqd_list) io_ring_set_wakeup_flag(ctx);
From: Stephen Boyd swboyd@chromium.org
commit 10252bae863d09b9648bed2e035572d207200ca1 upstream.
There's a chance that the IDA allocated in mmc_alloc_host() is not freed for some time because it's freed as part of a class' release function (see mmc_host_classdev_release() where the IDA is freed). If another thread is holding a reference to the class, then only once all balancing device_put() calls (in turn calling kobject_put()) have been made will the IDA be released and usable again.
Normally this isn't a problem because the kobject is released before anything else that may want to use the same number tries to again, but with CONFIG_DEBUG_KOBJECT_RELEASE=y and OF aliases it becomes pretty easy to try to allocate an alias from the IDA twice while the first time it was allocated is still pending a call to ida_simple_remove(). It's also possible to trigger it by using CONFIG_DEBUG_KOBJECT_RELEASE and probe defering a driver at boot that calls mmc_alloc_host() before trying to get resources that may defer likes clks or regulators.
Instead of allocating from the IDA in this scenario, let's just skip it if we know this is an OF alias. The number is already "claimed" and devices that aren't using OF aliases won't try to use the claimed numbers anyway (see mmc_first_nonreserved_index()). This should avoid any issues with mmc_alloc_host() returning failures from the ida_simple_get() in the case that we're using an OF alias.
Cc: Matthias Schiffer matthias.schiffer@ew.tq-group.com Cc: Sujit Kautkar sujitka@chromium.org Reported-by: Zubin Mithra zsm@chromium.org Fixes: fa2d0aa96941 ("mmc: core: Allow setting slot index via device tree alias") Signed-off-by: Stephen Boyd swboyd@chromium.org Link: https://lore.kernel.org/r/20210623075002.1746924-3-swboyd@chromium.org Cc: stable@vger.kernel.org Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/core/host.c | 20 ++++++++++---------- 1 file changed, 10 insertions(+), 10 deletions(-)
--- a/drivers/mmc/core/host.c +++ b/drivers/mmc/core/host.c @@ -75,7 +75,8 @@ static void mmc_host_classdev_release(st { struct mmc_host *host = cls_dev_to_mmc_host(dev); wakeup_source_unregister(host->ws); - ida_simple_remove(&mmc_host_ida, host->index); + if (of_alias_get_id(host->parent->of_node, "mmc") < 0) + ida_simple_remove(&mmc_host_ida, host->index); kfree(host); }
@@ -499,7 +500,7 @@ static int mmc_first_nonreserved_index(v */ struct mmc_host *mmc_alloc_host(int extra, struct device *dev) { - int err; + int index; struct mmc_host *host; int alias_id, min_idx, max_idx;
@@ -512,20 +513,19 @@ struct mmc_host *mmc_alloc_host(int extr
alias_id = of_alias_get_id(dev->of_node, "mmc"); if (alias_id >= 0) { - min_idx = alias_id; - max_idx = alias_id + 1; + index = alias_id; } else { min_idx = mmc_first_nonreserved_index(); max_idx = 0; - }
- err = ida_simple_get(&mmc_host_ida, min_idx, max_idx, GFP_KERNEL); - if (err < 0) { - kfree(host); - return NULL; + index = ida_simple_get(&mmc_host_ida, min_idx, max_idx, GFP_KERNEL); + if (index < 0) { + kfree(host); + return NULL; + } }
- host->index = err; + host->index = index;
dev_set_name(&host->class_dev, "mmc%d", host->index); host->ws = wakeup_source_register(NULL, dev_name(&host->class_dev));
From: Vasily Gorbik gor@linux.ibm.com
commit f8c2602733c953ed7a16e060640b8e96f9d94b9b upstream.
s390 enforces DYNAMIC_FTRACE if FUNCTION_TRACER is selected. At the same time implementation of ftrace_caller is not compliant with HAVE_DYNAMIC_FTRACE since it doesn't provide implementation of ftrace_update_ftrace_func() and calls ftrace_trace_function() directly.
The subtle difference is that during ftrace code patching ftrace replaces function tracer via ftrace_update_ftrace_func() and activates it back afterwards. Unexpected direct calls to ftrace_trace_function() during ftrace code patching leads to nullptr-dereferences when tracing is activated for one of functions which are used during code patching. Those function currently are: copy_from_kernel_nofault() copy_from_kernel_nofault_allowed() preempt_count_sub() [with debug_defconfig] preempt_count_add() [with debug_defconfig]
Corresponding KASAN report: BUG: KASAN: nullptr-dereference in function_trace_call+0x316/0x3b0 Read of size 4 at addr 0000000000001e08 by task migration/0/15
CPU: 0 PID: 15 Comm: migration/0 Tainted: G B 5.13.0-41423-g08316af3644d Hardware name: IBM 3906 M04 704 (LPAR) Stopper: multi_cpu_stop+0x0/0x3e0 <- stop_machine_cpuslocked+0x1e4/0x218 Call Trace: [<0000000001f77caa>] show_stack+0x16a/0x1d0 [<0000000001f8de42>] dump_stack+0x15a/0x1b0 [<0000000001f81d56>] print_address_description.constprop.0+0x66/0x2e0 [<000000000082b0ca>] kasan_report+0x152/0x1c0 [<00000000004cfd8e>] function_trace_call+0x316/0x3b0 [<0000000001fb7082>] ftrace_caller+0x7a/0x7e [<00000000006bb3e6>] copy_from_kernel_nofault_allowed+0x6/0x10 [<00000000006bb42e>] copy_from_kernel_nofault+0x3e/0xd0 [<000000000014605c>] ftrace_make_call+0xb4/0x1f8 [<000000000047a1b4>] ftrace_replace_code+0x134/0x1d8 [<000000000047a6e0>] ftrace_modify_all_code+0x120/0x1d0 [<000000000047a7ec>] __ftrace_modify_code+0x5c/0x78 [<000000000042395c>] multi_cpu_stop+0x224/0x3e0 [<0000000000423212>] cpu_stopper_thread+0x33a/0x5a0 [<0000000000243ff2>] smpboot_thread_fn+0x302/0x708 [<00000000002329ea>] kthread+0x342/0x408 [<00000000001066b2>] __ret_from_fork+0x92/0xf0 [<0000000001fb57fa>] ret_from_fork+0xa/0x30
The buggy address belongs to the page: page:(____ptrval____) refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1 flags: 0x1ffff00000001000(reserved|node=0|zone=0|lastcpupid=0x1ffff) raw: 1ffff00000001000 0000040000000048 0000040000000048 0000000000000000 raw: 0000000000000000 0000000000000000 ffffffff00000001 0000000000000000 page dumped because: kasan: bad access detected
Memory state around the buggy address: 0000000000001d00: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 0000000000001d80: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7
0000000000001e00: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7
^ 0000000000001e80: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 0000000000001f00: f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 f7 ==================================================================
To fix that introduce ftrace_func callback to be called from ftrace_caller and update it in ftrace_update_ftrace_func().
Fixes: 4cc9bed034d1 ("[S390] cleanup ftrace backend functions") Cc: stable@vger.kernel.org Reviewed-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/s390/include/asm/ftrace.h | 1 + arch/s390/kernel/ftrace.c | 2 ++ arch/s390/kernel/mcount.S | 4 ++-- 3 files changed, 5 insertions(+), 2 deletions(-)
--- a/arch/s390/include/asm/ftrace.h +++ b/arch/s390/include/asm/ftrace.h @@ -19,6 +19,7 @@ void ftrace_caller(void);
extern char ftrace_graph_caller_end; extern unsigned long ftrace_plt; +extern void *ftrace_func;
struct dyn_arch_ftrace { };
--- a/arch/s390/kernel/ftrace.c +++ b/arch/s390/kernel/ftrace.c @@ -40,6 +40,7 @@ * trampoline (ftrace_plt), which clobbers also r1. */
+void *ftrace_func __read_mostly = ftrace_stub; unsigned long ftrace_plt;
int ftrace_modify_call(struct dyn_ftrace *rec, unsigned long old_addr, @@ -85,6 +86,7 @@ int ftrace_make_call(struct dyn_ftrace *
int ftrace_update_ftrace_func(ftrace_func_t func) { + ftrace_func = func; return 0; }
--- a/arch/s390/kernel/mcount.S +++ b/arch/s390/kernel/mcount.S @@ -59,13 +59,13 @@ ENTRY(ftrace_caller) #ifdef CONFIG_HAVE_MARCH_Z196_FEATURES aghik %r2,%r0,-MCOUNT_INSN_SIZE lgrl %r4,function_trace_op - lgrl %r1,ftrace_trace_function + lgrl %r1,ftrace_func #else lgr %r2,%r0 aghi %r2,-MCOUNT_INSN_SIZE larl %r4,function_trace_op lg %r4,0(%r4) - larl %r1,ftrace_trace_function + larl %r1,ftrace_func lg %r1,0(%r1) #endif lgr %r3,%r14
From: Alexander Egorenkov egorenar@linux.ibm.com
commit 463f36c76fa4ec015c640ff63ccf52e7527abee0 upstream.
The DMA code section of the decompressor must be compiled with expolines if Spectre V2 mitigation has been enabled for the decompressed kernel. This is required because although the decompressor's image contains the DMA code section, it is handed over to the decompressed kernel for use.
Because the DMA code is already slow w/o expolines, use expolines always regardless whether the decompressed kernel is using them or not. This simplifies the DMA code by dropping the conditional compilation of expolines.
Fixes: bf72630130c2 ("s390: use proper expoline sections for .dma code") Cc: stable@vger.kernel.org # 5.2 Signed-off-by: Alexander Egorenkov egorenar@linux.ibm.com Reviewed-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/s390/boot/text_dma.S | 19 ++++--------------- 1 file changed, 4 insertions(+), 15 deletions(-)
--- a/arch/s390/boot/text_dma.S +++ b/arch/s390/boot/text_dma.S @@ -9,16 +9,6 @@ #include <asm/errno.h> #include <asm/sigp.h>
-#ifdef CC_USING_EXPOLINE - .pushsection .dma.text.__s390_indirect_jump_r14,"axG" -__dma__s390_indirect_jump_r14: - larl %r1,0f - ex 0,0(%r1) - j . -0: br %r14 - .popsection -#endif - .section .dma.text,"ax" /* * Simplified version of expoline thunk. The normal thunks can not be used here, @@ -27,11 +17,10 @@ __dma__s390_indirect_jump_r14: * affects a few functions that are not performance-relevant. */ .macro BR_EX_DMA_r14 -#ifdef CC_USING_EXPOLINE - jg __dma__s390_indirect_jump_r14 -#else - br %r14 -#endif + larl %r1,0f + ex 0,0(%r1) + j . +0: br %r14 .endm
/*
From: Takashi Iwai tiwai@suse.de
commit 64752a95b702817602d72f109ceaf5ec0780e283 upstream.
Recently we've added a new usb_mixer element type, USB_MIXER_BESPOKEN, but it wasn't added in the table in snd_usb_mixer_dump_cval(). This is no big problem since each bespoken type should have its own dump method, but it still isn't disallowed to use the standard one, so we should cover it as well. Along with it, define the table with the explicit array initializer for avoiding other pitfalls.
Fixes: 785b6f29a795 ("ALSA: usb-audio: scarlett2: Fix wrong resume call") Reported-by: Pavel Machek pavel@denx.de Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210714084836.1977-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/usb/mixer.c | 10 +++++++++- 1 file changed, 9 insertions(+), 1 deletion(-)
--- a/sound/usb/mixer.c +++ b/sound/usb/mixer.c @@ -3295,7 +3295,15 @@ static void snd_usb_mixer_dump_cval(stru { struct usb_mixer_elem_info *cval = mixer_elem_list_to_info(list); static const char * const val_types[] = { - "BOOLEAN", "INV_BOOLEAN", "S8", "U8", "S16", "U16", "S32", "U32", + [USB_MIXER_BOOLEAN] = "BOOLEAN", + [USB_MIXER_INV_BOOLEAN] = "INV_BOOLEAN", + [USB_MIXER_S8] = "S8", + [USB_MIXER_U8] = "U8", + [USB_MIXER_S16] = "S16", + [USB_MIXER_U16] = "U16", + [USB_MIXER_S32] = "S32", + [USB_MIXER_U32] = "U32", + [USB_MIXER_BESPOKEN] = "BESPOKEN", }; snd_iprintf(buffer, " Info: id=%i, control=%i, cmask=0x%x, " "channels=%i, type="%s"\n", cval->head.id,
From: Alexander Tsoy alexander@tsoy.me
commit b0084afde27fe8a504377dee65f55bc6aa776937 upstream.
These devices has two interfaces, but only the second interface contains the capture endpoint, thus quirk is required to delay the registration until the second interface appears.
Tested-by: Jakub Fišer jakub@ufiseru.cz Signed-off-by: Alexander Tsoy alexander@tsoy.me Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210721235605.53741-1-alexander@tsoy.me Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/usb/quirks.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/sound/usb/quirks.c +++ b/sound/usb/quirks.c @@ -1897,6 +1897,9 @@ static const struct registration_quirk r REG_QUIRK_ENTRY(0x0951, 0x16d8, 2), /* Kingston HyperX AMP */ REG_QUIRK_ENTRY(0x0951, 0x16ed, 2), /* Kingston HyperX Cloud Alpha S */ REG_QUIRK_ENTRY(0x0951, 0x16ea, 2), /* Kingston HyperX Cloud Flight S */ + REG_QUIRK_ENTRY(0x0ecb, 0x1f46, 2), /* JBL Quantum 600 */ + REG_QUIRK_ENTRY(0x0ecb, 0x2039, 2), /* JBL Quantum 400 */ + REG_QUIRK_ENTRY(0x0ecb, 0x203e, 2), /* JBL Quantum 800 */ { 0 } /* terminator */ };
From: Takashi Iwai tiwai@suse.de
commit 1c2b9519159b470ef24b2638f4794e86e2952ab7 upstream.
SB16 CSP driver may hit potentially a typical ABBA deadlock in two code paths:
In snd_sb_csp_stop(): spin_lock_irqsave(&p->chip->mixer_lock, flags); spin_lock(&p->chip->reg_lock);
In snd_sb_csp_load(): spin_lock_irqsave(&p->chip->reg_lock, flags); spin_lock(&p->chip->mixer_lock);
Also the similar pattern is seen in snd_sb_csp_start().
Although the practical impact is very small (those states aren't triggered in the same running state and this happens only on a real hardware, decades old ISA sound boards -- which must be very difficult to find nowadays), it's a real scenario and has to be fixed.
This patch addresses those deadlocks by splitting the locks in snd_sb_csp_start() and snd_sb_csp_stop() for avoiding the nested locks.
Reported-by: Jia-Ju Bai baijiaju1990@gmail.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/7b0fcdaf-cd4f-4728-2eae-48c151a92e10@gmail.com Link: https://lore.kernel.org/r/20210716132723.13216-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/isa/sb/sb16_csp.c | 4 ++++ 1 file changed, 4 insertions(+)
--- a/sound/isa/sb/sb16_csp.c +++ b/sound/isa/sb/sb16_csp.c @@ -814,6 +814,7 @@ static int snd_sb_csp_start(struct snd_s mixR = snd_sbmixer_read(p->chip, SB_DSP4_PCM_DEV + 1); snd_sbmixer_write(p->chip, SB_DSP4_PCM_DEV, mixL & 0x7); snd_sbmixer_write(p->chip, SB_DSP4_PCM_DEV + 1, mixR & 0x7); + spin_unlock_irqrestore(&p->chip->mixer_lock, flags);
spin_lock(&p->chip->reg_lock); set_mode_register(p->chip, 0xc0); /* c0 = STOP */ @@ -853,6 +854,7 @@ static int snd_sb_csp_start(struct snd_s spin_unlock(&p->chip->reg_lock);
/* restore PCM volume */ + spin_lock_irqsave(&p->chip->mixer_lock, flags); snd_sbmixer_write(p->chip, SB_DSP4_PCM_DEV, mixL); snd_sbmixer_write(p->chip, SB_DSP4_PCM_DEV + 1, mixR); spin_unlock_irqrestore(&p->chip->mixer_lock, flags); @@ -878,6 +880,7 @@ static int snd_sb_csp_stop(struct snd_sb mixR = snd_sbmixer_read(p->chip, SB_DSP4_PCM_DEV + 1); snd_sbmixer_write(p->chip, SB_DSP4_PCM_DEV, mixL & 0x7); snd_sbmixer_write(p->chip, SB_DSP4_PCM_DEV + 1, mixR & 0x7); + spin_unlock_irqrestore(&p->chip->mixer_lock, flags);
spin_lock(&p->chip->reg_lock); if (p->running & SNDRV_SB_CSP_ST_QSOUND) { @@ -892,6 +895,7 @@ static int snd_sb_csp_stop(struct snd_sb spin_unlock(&p->chip->reg_lock);
/* restore PCM volume */ + spin_lock_irqsave(&p->chip->mixer_lock, flags); snd_sbmixer_write(p->chip, SB_DSP4_PCM_DEV, mixL); snd_sbmixer_write(p->chip, SB_DSP4_PCM_DEV + 1, mixR); spin_unlock_irqrestore(&p->chip->mixer_lock, flags);
From: Hui Wang hui.wang@canonical.com
commit e4efa82660e6d80338c554e45e903714e1b2c27b upstream.
This is a Lenovo ThinkStation machine which uses the codec alc623. There are 2 issues on this machine, the 1st one is the pop noise in the lineout, the 2nd one is there are 2 Front Mics and pulseaudio can't handle them, After applying the fixup of ALC623_FIXUP_LENOVO_THINKSTATION_P340 to this machine, the 2 issues are fixed.
Cc: stable@vger.kernel.org Signed-off-by: Hui Wang hui.wang@canonical.com Link: https://lore.kernel.org/r/20210719030231.6870-1-hui.wang@canonical.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_realtek.c | 1 + 1 file changed, 1 insertion(+)
--- a/sound/pci/hda/patch_realtek.c +++ b/sound/pci/hda/patch_realtek.c @@ -8566,6 +8566,7 @@ static const struct snd_pci_quirk alc269 SND_PCI_QUIRK(0x17aa, 0x3151, "ThinkCentre Station", ALC283_FIXUP_HEADSET_MIC), SND_PCI_QUIRK(0x17aa, 0x3176, "ThinkCentre Station", ALC283_FIXUP_HEADSET_MIC), SND_PCI_QUIRK(0x17aa, 0x3178, "ThinkCentre Station", ALC283_FIXUP_HEADSET_MIC), + SND_PCI_QUIRK(0x17aa, 0x31af, "ThinkCentre Station", ALC623_FIXUP_LENOVO_THINKSTATION_P340), SND_PCI_QUIRK(0x17aa, 0x3818, "Lenovo C940", ALC298_FIXUP_LENOVO_SPK_VOLUME), SND_PCI_QUIRK(0x17aa, 0x3827, "Ideapad S740", ALC285_FIXUP_IDEAPAD_S740_COEF), SND_PCI_QUIRK(0x17aa, 0x3843, "Yoga 9i", ALC287_FIXUP_IDEAPAD_BASS_SPK_AMP),
From: Takashi Iwai tiwai@suse.de
commit 33f735f137c6539e3ceceb515cd1e2a644005b49 upstream.
The BIOS on MSI Mortar B550m WiFi (MS-7C94) board with AMDGPU seems disabling the other pins than HDMI although it has more outputs including DP.
This patch adds the board to the allow list for enabling all pins.
Reported-by: Damjan Georgievski gdamjan@gmail.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/CAEk1YH4Jd0a8vfZxORVu7qg+Zsc-K+pR187ezNq8QhJBPW4gp... Link: https://lore.kernel.org/r/20210716135600.24176-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/pci/hda/patch_hdmi.c | 1 + 1 file changed, 1 insertion(+)
--- a/sound/pci/hda/patch_hdmi.c +++ b/sound/pci/hda/patch_hdmi.c @@ -1940,6 +1940,7 @@ static int hdmi_add_cvt(struct hda_codec static const struct snd_pci_quirk force_connect_list[] = { SND_PCI_QUIRK(0x103c, 0x870f, "HP", 1), SND_PCI_QUIRK(0x103c, 0x871a, "HP", 1), + SND_PCI_QUIRK(0x1462, 0xec94, "MS-7C94", 1), {} };
From: Alan Young consult.awy@gmail.com
commit 2e2832562c877e6530b8480982d99a4ff90c6777 upstream.
If a 32-bit application is being used with a 64-bit kernel and is using the mmap mechanism to write data, then the SNDRV_PCM_IOCTL_SYNC_PTR ioctl results in calling snd_pcm_ioctl_sync_ptr_compat(). Make this use pcm_lib_apply_appl_ptr() so that the substream's ack() method, if defined, is called.
The snd_pcm_sync_ptr() function, used in the 64-bit ioctl case, already uses snd_pcm_ioctl_sync_ptr_compat().
Fixes: 9027c4639ef1 ("ALSA: pcm: Call ack() whenever appl_ptr is updated") Signed-off-by: Alan Young consult.awy@gmail.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/c441f18c-eb2a-3bdd-299a-696ccca2de9c@gmail.com Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/core/pcm_native.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
--- a/sound/core/pcm_native.c +++ b/sound/core/pcm_native.c @@ -3057,9 +3057,14 @@ static int snd_pcm_ioctl_sync_ptr_compat boundary = 0x7fffffff; snd_pcm_stream_lock_irq(substream); /* FIXME: we should consider the boundary for the sync from app */ - if (!(sflags & SNDRV_PCM_SYNC_PTR_APPL)) - control->appl_ptr = scontrol.appl_ptr; - else + if (!(sflags & SNDRV_PCM_SYNC_PTR_APPL)) { + err = pcm_lib_apply_appl_ptr(substream, + scontrol.appl_ptr); + if (err < 0) { + snd_pcm_stream_unlock_irq(substream); + return err; + } + } else scontrol.appl_ptr = control->appl_ptr % boundary; if (!(sflags & SNDRV_PCM_SYNC_PTR_AVAIL_MIN)) control->avail_min = scontrol.avail_min;
From: Takashi Iwai tiwai@suse.de
commit c4824ae7db418aee6f50f308a20b832e58e997fd upstream.
The hw_support_mmap() doesn't cover all memory allocation types and might use a wrong device pointer for checking the capability. Check the all memory allocation types more completely.
Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20210720092640.12338-1-tiwai@suse.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/core/pcm_native.c | 14 ++++++++++---- 1 file changed, 10 insertions(+), 4 deletions(-)
--- a/sound/core/pcm_native.c +++ b/sound/core/pcm_native.c @@ -246,12 +246,18 @@ static bool hw_support_mmap(struct snd_p if (!(substream->runtime->hw.info & SNDRV_PCM_INFO_MMAP)) return false;
- if (substream->ops->mmap || - (substream->dma_buffer.dev.type != SNDRV_DMA_TYPE_DEV && - substream->dma_buffer.dev.type != SNDRV_DMA_TYPE_DEV_UC)) + if (substream->ops->mmap) return true;
- return dma_can_mmap(substream->dma_buffer.dev.dev); + switch (substream->dma_buffer.dev.type) { + case SNDRV_DMA_TYPE_UNKNOWN: + return false; + case SNDRV_DMA_TYPE_CONTINUOUS: + case SNDRV_DMA_TYPE_VMALLOC: + return true; + default: + return dma_can_mmap(substream->dma_buffer.dev.dev); + } }
static int constrain_mask_params(struct snd_pcm_substream *substream,
From: Moritz Fischer mdf@kernel.org
commit 44cf53602f5a0db80d53c8fff6cdbcae59650a42 upstream.
This reverts commit d143825baf15f204dac60acdf95e428182aa3374.
Justin reports some of his systems now fail as result of this commit:
xhci_hcd 0000:04:00.0: Direct firmware load for renesas_usb_fw.mem failed with error -2 xhci_hcd 0000:04:00.0: request_firmware failed: -2 xhci_hcd: probe of 0000:04:00.0 failed with error -2
The revert brings back the original issue the commit tried to solve but at least unbreaks existing systems relying on previous behavior.
Cc: stable@vger.kernel.org Cc: Mathias Nyman mathias.nyman@intel.com Cc: Vinod Koul vkoul@kernel.org Cc: Justin Forbes jmforbes@linuxtx.org Reported-by: Justin Forbes jmforbes@linuxtx.org Signed-off-by: Moritz Fischer mdf@kernel.org Fixes: d143825baf15 ("usb: renesas-xhci: Fix handling of unknown ROM state") Link: https://lore.kernel.org/r/20210719070519.41114-1-mdf@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci-pci-renesas.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-)
--- a/drivers/usb/host/xhci-pci-renesas.c +++ b/drivers/usb/host/xhci-pci-renesas.c @@ -207,8 +207,7 @@ static int renesas_check_rom_state(struc return 0;
case RENESAS_ROM_STATUS_NO_RESULT: /* No result yet */ - dev_dbg(&pdev->dev, "Unknown ROM status ...\n"); - break; + return 0;
case RENESAS_ROM_STATUS_ERROR: /* Error State */ default: /* All other states are marked as "Reserved states" */ @@ -225,12 +224,13 @@ static int renesas_fw_check_running(stru u8 fw_state; int err;
- /* - * Only if device has ROM and loaded FW we can skip loading and - * return success. Otherwise (even unknown state), attempt to load FW. - */ - if (renesas_check_rom(pdev) && !renesas_check_rom_state(pdev)) - return 0; + /* Check if device has ROM and loaded, if so skip everything */ + err = renesas_check_rom(pdev); + if (err) { /* we have rom */ + err = renesas_check_rom_state(pdev); + if (!err) + return err; + }
/* * Test if the device is actually needing the firmware. As most
From: Greg Thelen gthelen@google.com
commit 0665e387318607d8269bfdea60723c627c8bae43 upstream.
Commit a66d21d7dba8 ("usb: xhci: Add support for Renesas controller with memory") added renesas_usb_fw.mem firmware reference to xhci-pci. Thus modinfo indicates xhci-pci.ko has "firmware: renesas_usb_fw.mem". But the firmware is only actually used with CONFIG_USB_XHCI_PCI_RENESAS. An unusable firmware reference can trigger safety checkers which look for drivers with unmet firmware dependencies.
Avoid referring to renesas_usb_fw.mem in circumstances when it cannot be loaded (when CONFIG_USB_XHCI_PCI_RENESAS isn't set).
Fixes: a66d21d7dba8 ("usb: xhci: Add support for Renesas controller with memory") Cc: stable stable@vger.kernel.org Signed-off-by: Greg Thelen gthelen@google.com Link: https://lore.kernel.org/r/20210702071224.3673568-1-gthelen@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci-pci.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/drivers/usb/host/xhci-pci.c +++ b/drivers/usb/host/xhci-pci.c @@ -636,7 +636,14 @@ static const struct pci_device_id pci_id { /* end: all zeroes */ } }; MODULE_DEVICE_TABLE(pci, pci_ids); + +/* + * Without CONFIG_USB_XHCI_PCI_RENESAS renesas_xhci_check_request_fw() won't + * load firmware, so don't encumber the xhci-pci driver with it. + */ +#if IS_ENABLED(CONFIG_USB_XHCI_PCI_RENESAS) MODULE_FIRMWARE("renesas_usb_fw.mem"); +#endif
/* pci driver glue; this is a "new style" PCI driver module */ static struct pci_driver xhci_pci_driver = {
From: Mathias Nyman mathias.nyman@linux.intel.com
commit 72f68bf5c756f5ce1139b31daae2684501383ad5 upstream.
There's a small window where a USB 2 remote wake may be left unhandled due to a race between hub thread and xhci port event interrupt handler.
When the resume event is detected in the xhci interrupt handler it kicks the hub timer, which should move the port from resume to U0 once resume has been signalled for long enough.
To keep the hub "thread" running we set a bus_state->resuming_ports flag. This flag makes sure hub timer function kicks itself.
checking this flag was not properly protected by the spinlock. Flag was copied to a local variable before lock was taken. The local variable was then checked later with spinlock held.
If interrupt is handled right after copying the flag to the local variable we end up stopping the hub thread before it can handle the USB 2 resume.
CPU0 CPU1 (hub thread) (xhci event handler)
xhci_hub_status_data() status = bus_state->resuming_ports; <Interrupt> handle_port_status() spin_lock() bus_state->resuming_ports = 1 set_flag(HCD_FLAG_POLL_RH) spin_unlock() spin_lock() if (!status) clear_flag(HCD_FLAG_POLL_RH) spin_unlock()
Fix this by taking the lock a bit earlier so that it covers the resuming_ports flag copy in the hub thread
Cc: stable@vger.kernel.org Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/20210715150651.1996099-2-mathias.nyman@linux.intel... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/xhci-hub.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/usb/host/xhci-hub.c +++ b/drivers/usb/host/xhci-hub.c @@ -1638,11 +1638,12 @@ int xhci_hub_status_data(struct usb_hcd * Inform the usbcore about resume-in-progress by returning * a non-zero value even if there are no status changes. */ + spin_lock_irqsave(&xhci->lock, flags); + status = bus_state->resuming_ports;
mask = PORT_CSC | PORT_PEC | PORT_OCC | PORT_PLC | PORT_WRC | PORT_CEC;
- spin_lock_irqsave(&xhci->lock, flags); /* For each port, did anything change? If so, set that bit in buf. */ for (i = 0; i < max_ports; i++) { temp = readl(ports[i]->addr);
From: David Jeffery djeffery@redhat.com
commit 0b60557230adfdeb8164e0b342ac9cd469a75759 upstream.
When MSI is used by the ehci-hcd driver, it can cause lost interrupts which results in EHCI only continuing to work due to a polling fallback. But the reliance of polling drastically reduces performance of any I/O through EHCI.
Interrupts are lost as the EHCI interrupt handler does not safely handle edge-triggered interrupts. It fails to ensure all interrupt status bits are cleared, which works with level-triggered interrupts but not the edge-triggered interrupts typical from using MSI.
To fix this problem, check if the driver may have raced with the hardware setting additional interrupt status bits and clear status until it is in a stable state.
Fixes: 306c54d0edb6 ("usb: hcd: Try MSI interrupts on PCI devices") Tested-by: Laurence Oberman loberman@redhat.com Reviewed-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Acked-by: Alan Stern stern@rowland.harvard.edu Signed-off-by: David Jeffery djeffery@redhat.com Link: https://lore.kernel.org/r/20210715213744.GA44506@redhat Cc: stable stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/ehci-hcd.c | 18 ++++++++++++++---- 1 file changed, 14 insertions(+), 4 deletions(-)
--- a/drivers/usb/host/ehci-hcd.c +++ b/drivers/usb/host/ehci-hcd.c @@ -703,24 +703,28 @@ EXPORT_SYMBOL_GPL(ehci_setup); static irqreturn_t ehci_irq (struct usb_hcd *hcd) { struct ehci_hcd *ehci = hcd_to_ehci (hcd); - u32 status, masked_status, pcd_status = 0, cmd; + u32 status, current_status, masked_status, pcd_status = 0; + u32 cmd; int bh;
spin_lock(&ehci->lock);
- status = ehci_readl(ehci, &ehci->regs->status); + status = 0; + current_status = ehci_readl(ehci, &ehci->regs->status); +restart:
/* e.g. cardbus physical eject */ - if (status == ~(u32) 0) { + if (current_status == ~(u32) 0) { ehci_dbg (ehci, "device removed\n"); goto dead; } + status |= current_status;
/* * We don't use STS_FLR, but some controllers don't like it to * remain on, so mask it out along with the other status bits. */ - masked_status = status & (INTR_MASK | STS_FLR); + masked_status = current_status & (INTR_MASK | STS_FLR);
/* Shared IRQ? */ if (!masked_status || unlikely(ehci->rh_state == EHCI_RH_HALTED)) { @@ -730,6 +734,12 @@ static irqreturn_t ehci_irq (struct usb_
/* clear (just) interrupts */ ehci_writel(ehci, masked_status, &ehci->regs->status); + + /* For edge interrupts, don't race with an interrupt bit being raised */ + current_status = ehci_readl(ehci, &ehci->regs->status); + if (current_status & INTR_MASK) + goto restart; + cmd = ehci_readl(ehci, &ehci->regs->command); bh = 0;
From: Nicholas Piggin npiggin@gmail.com
commit f62f3c20647ebd5fb6ecb8f0b477b9281c44c10a upstream.
The kvmppc_rtas_hcall() sets the host rtas_args.rets pointer based on the rtas_args.nargs that was provided by the guest. That guest nargs value is not range checked, so the guest can cause the host rets pointer to be pointed outside the args array. The individual rtas function handlers check the nargs and nrets values to ensure they are correct, but if they are not, the handlers store a -3 (0xfffffffd) failure indication in rets[0] which corrupts host memory.
Fix this by testing up front whether the guest supplied nargs and nret would exceed the array size, and fail the hcall directly without storing a failure indication to rets[0].
Also expand on a comment about why we kill the guest and try not to return errors directly if we have a valid rets[0] pointer.
Fixes: 8e591cb72047 ("KVM: PPC: Book3S: Add infrastructure to implement kernel-side RTAS calls") Cc: stable@vger.kernel.org # v3.10+ Reported-by: Alexey Kardashevskiy aik@ozlabs.ru Signed-off-by: Nicholas Piggin npiggin@gmail.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/kvm/book3s_rtas.c | 25 ++++++++++++++++++++++--- 1 file changed, 22 insertions(+), 3 deletions(-)
--- a/arch/powerpc/kvm/book3s_rtas.c +++ b/arch/powerpc/kvm/book3s_rtas.c @@ -242,6 +242,17 @@ int kvmppc_rtas_hcall(struct kvm_vcpu *v * value so we can restore it on the way out. */ orig_rets = args.rets; + if (be32_to_cpu(args.nargs) >= ARRAY_SIZE(args.args)) { + /* + * Don't overflow our args array: ensure there is room for + * at least rets[0] (even if the call specifies 0 nret). + * + * Each handler must then check for the correct nargs and nret + * values, but they may always return failure in rets[0]. + */ + rc = -EINVAL; + goto fail; + } args.rets = &args.args[be32_to_cpu(args.nargs)];
mutex_lock(&vcpu->kvm->arch.rtas_token_lock); @@ -269,9 +280,17 @@ int kvmppc_rtas_hcall(struct kvm_vcpu *v fail: /* * We only get here if the guest has called RTAS with a bogus - * args pointer. That means we can't get to the args, and so we - * can't fail the RTAS call. So fail right out to userspace, - * which should kill the guest. + * args pointer or nargs/nret values that would overflow the + * array. That means we can't get to the args, and so we can't + * fail the RTAS call. So fail right out to userspace, which + * should kill the guest. + * + * SLOF should actually pass the hcall return value from the + * rtas handler call in r3, so enter_rtas could be modified to + * return a failure indication in r3 and we could return such + * errors to the guest rather than failing to host userspace. + * However old guests that don't test for failure could then + * continue silently after errors, so for now we won't do this. */ return rc; }
From: Nicholas Piggin npiggin@gmail.com
commit d9c57d3ed52a92536f5fa59dc5ccdd58b4875076 upstream.
The H_ENTER_NESTED hypercall is handled by the L0, and it is a request by the L1 to switch the context of the vCPU over to that of its L2 guest, and return with an interrupt indication. The L1 is responsible for switching some registers to guest context, and the L0 switches others (including all the hypervisor privileged state).
If the L2 MSR has TM active, then the L1 is responsible for recheckpointing the L2 TM state. Then the L1 exits to L0 via the H_ENTER_NESTED hcall, and the L0 saves the TM state as part of the exit, and then it recheckpoints the TM state as part of the nested entry and finally HRFIDs into the L2 with TM active MSR. Not efficient, but about the simplest approach for something that's horrendously complicated.
Problems arise if the L1 exits to the L0 with a TM state which does not match the L2 TM state being requested. For example if the L1 is transactional but the L2 MSR is non-transactional, or vice versa. The L0's HRFID can take a TM Bad Thing interrupt and crash.
Fix this by disallowing H_ENTER_NESTED in TM[T] state entirely, and then ensuring that if the L1 is suspended then the L2 must have TM active, and if the L1 is not suspended then the L2 must not have TM active.
Fixes: 360cae313702 ("KVM: PPC: Book3S HV: Nested guest entry via hypercall") Cc: stable@vger.kernel.org # v4.20+ Reported-by: Alexey Kardashevskiy aik@ozlabs.ru Acked-by: Michael Neuling mikey@neuling.org Signed-off-by: Nicholas Piggin npiggin@gmail.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/powerpc/kvm/book3s_hv_nested.c | 20 ++++++++++++++++++++ 1 file changed, 20 insertions(+)
--- a/arch/powerpc/kvm/book3s_hv_nested.c +++ b/arch/powerpc/kvm/book3s_hv_nested.c @@ -301,6 +301,9 @@ long kvmhv_enter_nested_guest(struct kvm if (vcpu->kvm->arch.l1_ptcr == 0) return H_NOT_AVAILABLE;
+ if (MSR_TM_TRANSACTIONAL(vcpu->arch.shregs.msr)) + return H_BAD_MODE; + /* copy parameters in */ hv_ptr = kvmppc_get_gpr(vcpu, 4); regs_ptr = kvmppc_get_gpr(vcpu, 5); @@ -321,6 +324,23 @@ long kvmhv_enter_nested_guest(struct kvm if (l2_hv.vcpu_token >= NR_CPUS) return H_PARAMETER;
+ /* + * L1 must have set up a suspended state to enter the L2 in a + * transactional state, and only in that case. These have to be + * filtered out here to prevent causing a TM Bad Thing in the + * host HRFID. We could synthesize a TM Bad Thing back to the L1 + * here but there doesn't seem like much point. + */ + if (MSR_TM_SUSPENDED(vcpu->arch.shregs.msr)) { + if (!MSR_TM_ACTIVE(l2_regs.msr)) + return H_BAD_MODE; + } else { + if (l2_regs.msr & MSR_TS_MASK) + return H_BAD_MODE; + if (WARN_ON_ONCE(vcpu->arch.shregs.msr & MSR_TS_MASK)) + return H_BAD_MODE; + } + /* translate lpid */ l2 = kvmhv_get_nested(vcpu->kvm, l2_hv.lpid, true); if (!l2)
From: Mathias Nyman mathias.nyman@linux.intel.com
commit 1b7f56fbc7a1b66967b6114d1b5f5a257c3abae6 upstream.
The device initiated link power management U1/U2 states should not be enabled in case the system exit latency plus one bus interval (125us) is greater than the shortest service interval of any periodic endpoint.
This is the case for both U1 and U2 sytstem exit latencies and link states.
See USB 3.2 section 9.4.9 "Set Feature" for more details
Note, before this patch the host and device initiated U1/U2 lpm states were both enabled with lpm. After this patch it's possible to end up with only host inititated U1/U2 lpm in case the exit latencies won't allow device initiated lpm.
If this case we still want to set the udev->usb3_lpm_ux_enabled flag so that sysfs users can see the link may go to U1/U2.
Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Cc: stable stable@vger.kernel.org Link: https://lore.kernel.org/r/20210715150122.1995966-2-mathias.nyman@linux.intel... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/core/hub.c | 68 ++++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 56 insertions(+), 12 deletions(-)
--- a/drivers/usb/core/hub.c +++ b/drivers/usb/core/hub.c @@ -4091,6 +4091,47 @@ static int usb_set_lpm_timeout(struct us }
/* + * Don't allow device intiated U1/U2 if the system exit latency + one bus + * interval is greater than the minimum service interval of any active + * periodic endpoint. See USB 3.2 section 9.4.9 + */ +static bool usb_device_may_initiate_lpm(struct usb_device *udev, + enum usb3_link_state state) +{ + unsigned int sel; /* us */ + int i, j; + + if (state == USB3_LPM_U1) + sel = DIV_ROUND_UP(udev->u1_params.sel, 1000); + else if (state == USB3_LPM_U2) + sel = DIV_ROUND_UP(udev->u2_params.sel, 1000); + else + return false; + + for (i = 0; i < udev->actconfig->desc.bNumInterfaces; i++) { + struct usb_interface *intf; + struct usb_endpoint_descriptor *desc; + unsigned int interval; + + intf = udev->actconfig->interface[i]; + if (!intf) + continue; + + for (j = 0; j < intf->cur_altsetting->desc.bNumEndpoints; j++) { + desc = &intf->cur_altsetting->endpoint[j].desc; + + if (usb_endpoint_xfer_int(desc) || + usb_endpoint_xfer_isoc(desc)) { + interval = (1 << (desc->bInterval - 1)) * 125; + if (sel + 125 > interval) + return false; + } + } + } + return true; +} + +/* * Enable the hub-initiated U1/U2 idle timeouts, and enable device-initiated * U1/U2 entry. * @@ -4162,20 +4203,23 @@ static void usb_enable_link_state(struct * U1/U2_ENABLE */ if (udev->actconfig && - usb_set_device_initiated_lpm(udev, state, true) == 0) { - if (state == USB3_LPM_U1) - udev->usb3_lpm_u1_enabled = 1; - else if (state == USB3_LPM_U2) - udev->usb3_lpm_u2_enabled = 1; - } else { - /* Don't request U1/U2 entry if the device - * cannot transition to U1/U2. - */ - usb_set_lpm_timeout(udev, state, 0); - hcd->driver->disable_usb3_lpm_timeout(hcd, udev, state); + usb_device_may_initiate_lpm(udev, state)) { + if (usb_set_device_initiated_lpm(udev, state, true)) { + /* + * Request to enable device initiated U1/U2 failed, + * better to turn off lpm in this case. + */ + usb_set_lpm_timeout(udev, state, 0); + hcd->driver->disable_usb3_lpm_timeout(hcd, udev, state); + return; + } } -}
+ if (state == USB3_LPM_U1) + udev->usb3_lpm_u1_enabled = 1; + else if (state == USB3_LPM_U2) + udev->usb3_lpm_u2_enabled = 1; +} /* * Disable the hub-initiated U1/U2 idle timeouts, and disable device-initiated * U1/U2 entry.
From: Mathias Nyman mathias.nyman@linux.intel.com
commit 1bf2761c837571a66ec290fb66c90413821ffda2 upstream.
Maximum Exit Latency (MEL) value is used by host to know how much in advance it needs to start waking up a U1/U2 suspended link in order to service a periodic transfer in time.
Current MEL calculation only includes the time to wake up the path from U1/U2 to U0. This is called tMEL1 in USB 3.1 section C 1.5.2
Total MEL = tMEL1 + tMEL2 +tMEL3 + tMEL4 which should additinally include: - tMEL2 which is the time it takes for PING message to reach device - tMEL3 time for device to process the PING and submit a PING_RESPONSE - tMEL4 time for PING_RESPONSE to traverse back upstream to host.
Add the missing tMEL2, tMEL3 and tMEL4 to MEL calculation.
Cc: stable@kernel.org # v3.5 Signed-off-by: Mathias Nyman mathias.nyman@linux.intel.com Link: https://lore.kernel.org/r/20210715150122.1995966-1-mathias.nyman@linux.intel... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/core/hub.c | 52 ++++++++++++++++++++++++++----------------------- 1 file changed, 28 insertions(+), 24 deletions(-)
--- a/drivers/usb/core/hub.c +++ b/drivers/usb/core/hub.c @@ -48,6 +48,7 @@
#define USB_TP_TRANSMISSION_DELAY 40 /* ns */ #define USB_TP_TRANSMISSION_DELAY_MAX 65535 /* ns */ +#define USB_PING_RESPONSE_TIME 400 /* ns */
/* Protect struct usb_device->state and ->children members * Note: Both are also protected by ->dev.sem, except that ->state can @@ -182,8 +183,9 @@ int usb_device_supports_lpm(struct usb_d }
/* - * Set the Maximum Exit Latency (MEL) for the host to initiate a transition from - * either U1 or U2. + * Set the Maximum Exit Latency (MEL) for the host to wakup up the path from + * U1/U2, send a PING to the device and receive a PING_RESPONSE. + * See USB 3.1 section C.1.5.2 */ static void usb_set_lpm_mel(struct usb_device *udev, struct usb3_lpm_parameters *udev_lpm_params, @@ -193,35 +195,37 @@ static void usb_set_lpm_mel(struct usb_d unsigned int hub_exit_latency) { unsigned int total_mel; - unsigned int device_mel; - unsigned int hub_mel;
/* - * Calculate the time it takes to transition all links from the roothub - * to the parent hub into U0. The parent hub must then decode the - * packet (hub header decode latency) to figure out which port it was - * bound for. - * - * The Hub Header decode latency is expressed in 0.1us intervals (0x1 - * means 0.1us). Multiply that by 100 to get nanoseconds. + * tMEL1. time to transition path from host to device into U0. + * MEL for parent already contains the delay up to parent, so only add + * the exit latency for the last link (pick the slower exit latency), + * and the hub header decode latency. See USB 3.1 section C 2.2.1 + * Store MEL in nanoseconds */ total_mel = hub_lpm_params->mel + - (hub->descriptor->u.ss.bHubHdrDecLat * 100); + max(udev_exit_latency, hub_exit_latency) * 1000 + + hub->descriptor->u.ss.bHubHdrDecLat * 100;
/* - * How long will it take to transition the downstream hub's port into - * U0? The greater of either the hub exit latency or the device exit - * latency. - * - * The BOS U1/U2 exit latencies are expressed in 1us intervals. - * Multiply that by 1000 to get nanoseconds. + * tMEL2. Time to submit PING packet. Sum of tTPTransmissionDelay for + * each link + wHubDelay for each hub. Add only for last link. + * tMEL4, the time for PING_RESPONSE to traverse upstream is similar. + * Multiply by 2 to include it as well. */ - device_mel = udev_exit_latency * 1000; - hub_mel = hub_exit_latency * 1000; - if (device_mel > hub_mel) - total_mel += device_mel; - else - total_mel += hub_mel; + total_mel += (__le16_to_cpu(hub->descriptor->u.ss.wHubDelay) + + USB_TP_TRANSMISSION_DELAY) * 2; + + /* + * tMEL3, tPingResponse. Time taken by device to generate PING_RESPONSE + * after receiving PING. Also add 2100ns as stated in USB 3.1 C 1.5.2.4 + * to cover the delay if the PING_RESPONSE is queued behind a Max Packet + * Size DP. + * Note these delays should be added only once for the entire path, so + * add them to the MEL of the device connected to the roothub. + */ + if (!hub->hdev->parent) + total_mel += USB_PING_RESPONSE_TIME + 2100;
udev_lpm_params->mel = total_mel; }
From: Julian Sikorski belegdol@gmail.com
commit 6abf2fe6b4bf6e5256b80c5817908151d2d33e9f upstream.
LaCie Rugged USB3-FW appears to be incompatible with UAS. It generates errors like: [ 1151.582598] sd 14:0:0:0: tag#16 uas_eh_abort_handler 0 uas-tag 1 inflight: IN [ 1151.582602] sd 14:0:0:0: tag#16 CDB: Report supported operation codes a3 0c 01 12 00 00 00 00 02 00 00 00 [ 1151.588594] scsi host14: uas_eh_device_reset_handler start [ 1151.710482] usb 2-4: reset SuperSpeed Gen 1 USB device number 2 using xhci_hcd [ 1151.741398] scsi host14: uas_eh_device_reset_handler success [ 1181.785534] scsi host14: uas_eh_device_reset_handler start
Signed-off-by: Julian Sikorski belegdol+github@gmail.com Cc: stable stable@vger.kernel.org Link: https://lore.kernel.org/r/20210720171910.36497-1-belegdol+github@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/storage/unusual_uas.h | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/drivers/usb/storage/unusual_uas.h +++ b/drivers/usb/storage/unusual_uas.h @@ -45,6 +45,13 @@ UNUSUAL_DEV(0x059f, 0x105f, 0x0000, 0x99 USB_SC_DEVICE, USB_PR_DEVICE, NULL, US_FL_NO_REPORT_OPCODES | US_FL_NO_SAME),
+/* Reported-by: Julian Sikorski belegdol@gmail.com */ +UNUSUAL_DEV(0x059f, 0x1061, 0x0000, 0x9999, + "LaCie", + "Rugged USB3-FW", + USB_SC_DEVICE, USB_PR_DEVICE, NULL, + US_FL_IGNORE_UAS), + /* * Apricorn USB3 dongle sometimes returns "USBSUSBSUSBS" in response to SCSI * commands in UAS mode. Observed with the 1.28 firmware; are there others?
From: Mark Tomlinson mark.tomlinson@alliedtelesis.co.nz
commit b5fdf5c6e6bee35837e160c00ac89327bdad031b upstream.
The MAX-3421 USB driver remembers the state of the USB toggles for a device/endpoint. To save SPI writes, this was only done when a new device/endpoint was being used. Unfortunately, if the old device was removed, this would cause writes to freed memory.
To fix this, a simpler scheme is used. The toggles are read from hardware when a URB is completed, and the toggles are always written to hardware when any URB transaction is started. This will cause a few more SPI transactions, but no causes kernel panics.
Fixes: 2d53139f3162 ("Add support for using a MAX3421E chip as a host driver.") Cc: stable stable@vger.kernel.org Signed-off-by: Mark Tomlinson mark.tomlinson@alliedtelesis.co.nz Link: https://lore.kernel.org/r/20210625031456.8632-1-mark.tomlinson@alliedtelesis... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/host/max3421-hcd.c | 44 +++++++++++++---------------------------- 1 file changed, 14 insertions(+), 30 deletions(-)
--- a/drivers/usb/host/max3421-hcd.c +++ b/drivers/usb/host/max3421-hcd.c @@ -153,8 +153,6 @@ struct max3421_hcd { */ struct urb *curr_urb; enum scheduling_pass sched_pass; - struct usb_device *loaded_dev; /* dev that's loaded into the chip */ - int loaded_epnum; /* epnum whose toggles are loaded */ int urb_done; /* > 0 -> no errors, < 0: errno */ size_t curr_len; u8 hien; @@ -492,39 +490,17 @@ max3421_set_speed(struct usb_hcd *hcd, s * Caller must NOT hold HCD spinlock. */ static void -max3421_set_address(struct usb_hcd *hcd, struct usb_device *dev, int epnum, - int force_toggles) +max3421_set_address(struct usb_hcd *hcd, struct usb_device *dev, int epnum) { - struct max3421_hcd *max3421_hcd = hcd_to_max3421(hcd); - int old_epnum, same_ep, rcvtog, sndtog; - struct usb_device *old_dev; + int rcvtog, sndtog; u8 hctl;
- old_dev = max3421_hcd->loaded_dev; - old_epnum = max3421_hcd->loaded_epnum; - - same_ep = (dev == old_dev && epnum == old_epnum); - if (same_ep && !force_toggles) - return; - - if (old_dev && !same_ep) { - /* save the old end-points toggles: */ - u8 hrsl = spi_rd8(hcd, MAX3421_REG_HRSL); - - rcvtog = (hrsl >> MAX3421_HRSL_RCVTOGRD_BIT) & 1; - sndtog = (hrsl >> MAX3421_HRSL_SNDTOGRD_BIT) & 1; - - /* no locking: HCD (i.e., we) own toggles, don't we? */ - usb_settoggle(old_dev, old_epnum, 0, rcvtog); - usb_settoggle(old_dev, old_epnum, 1, sndtog); - } /* setup new endpoint's toggle bits: */ rcvtog = usb_gettoggle(dev, epnum, 0); sndtog = usb_gettoggle(dev, epnum, 1); hctl = (BIT(rcvtog + MAX3421_HCTL_RCVTOG0_BIT) | BIT(sndtog + MAX3421_HCTL_SNDTOG0_BIT));
- max3421_hcd->loaded_epnum = epnum; spi_wr8(hcd, MAX3421_REG_HCTL, hctl);
/* @@ -532,7 +508,6 @@ max3421_set_address(struct usb_hcd *hcd, * address-assignment so it's best to just always load the * address whenever the end-point changed/was forced. */ - max3421_hcd->loaded_dev = dev; spi_wr8(hcd, MAX3421_REG_PERADDR, dev->devnum); }
@@ -667,7 +642,7 @@ max3421_select_and_start_urb(struct usb_ struct max3421_hcd *max3421_hcd = hcd_to_max3421(hcd); struct urb *urb, *curr_urb = NULL; struct max3421_ep *max3421_ep; - int epnum, force_toggles = 0; + int epnum; struct usb_host_endpoint *ep; struct list_head *pos; unsigned long flags; @@ -777,7 +752,6 @@ done: usb_settoggle(urb->dev, epnum, 0, 1); usb_settoggle(urb->dev, epnum, 1, 1); max3421_ep->pkt_state = PKT_STATE_SETUP; - force_toggles = 1; } else max3421_ep->pkt_state = PKT_STATE_TRANSFER; } @@ -785,7 +759,7 @@ done: spin_unlock_irqrestore(&max3421_hcd->lock, flags);
max3421_ep->last_active = max3421_hcd->frame_number; - max3421_set_address(hcd, urb->dev, epnum, force_toggles); + max3421_set_address(hcd, urb->dev, epnum); max3421_set_speed(hcd, urb->dev); max3421_next_transfer(hcd, 0); return 1; @@ -1380,6 +1354,16 @@ max3421_urb_done(struct usb_hcd *hcd) status = 0; urb = max3421_hcd->curr_urb; if (urb) { + /* save the old end-points toggles: */ + u8 hrsl = spi_rd8(hcd, MAX3421_REG_HRSL); + int rcvtog = (hrsl >> MAX3421_HRSL_RCVTOGRD_BIT) & 1; + int sndtog = (hrsl >> MAX3421_HRSL_SNDTOGRD_BIT) & 1; + int epnum = usb_endpoint_num(&urb->ep->desc); + + /* no locking: HCD (i.e., we) own toggles, don't we? */ + usb_settoggle(urb->dev, epnum, 0, rcvtog); + usb_settoggle(urb->dev, epnum, 1, sndtog); + max3421_hcd->curr_urb = NULL; spin_lock_irqsave(&max3421_hcd->lock, flags); usb_hcd_unlink_urb_from_ep(hcd, urb);
From: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com
commit 5719df243e118fb343725e8b2afb1637e1af1373 upstream.
This driver has a potential issue which this driver is possible to cause superfluous irqs after usb_pkt_pop() is called. So, after the commit 3af32605289e ("usb: renesas_usbhs: fix error return code of usbhsf_pkt_handler()") had been applied, we could observe the following error happened when we used g_audio.
renesas_usbhs e6590000.usb: irq_ready run_error 1 : -22
To fix the issue, disable the tx or rx interrupt in usb_pkt_pop().
Fixes: 2743e7f90dc0 ("usb: renesas_usbhs: fix the usb_pkt_pop()") Cc: stable@vger.kernel.org # v4.4+ Signed-off-by: Yoshihiro Shimoda yoshihiro.shimoda.uh@renesas.com Link: https://lore.kernel.org/r/20210624122039.596528-1-yoshihiro.shimoda.uh@renes... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/renesas_usbhs/fifo.c | 7 +++++++ 1 file changed, 7 insertions(+)
--- a/drivers/usb/renesas_usbhs/fifo.c +++ b/drivers/usb/renesas_usbhs/fifo.c @@ -101,6 +101,8 @@ static struct dma_chan *usbhsf_dma_chan_ #define usbhsf_dma_map(p) __usbhsf_dma_map_ctrl(p, 1) #define usbhsf_dma_unmap(p) __usbhsf_dma_map_ctrl(p, 0) static int __usbhsf_dma_map_ctrl(struct usbhs_pkt *pkt, int map); +static void usbhsf_tx_irq_ctrl(struct usbhs_pipe *pipe, int enable); +static void usbhsf_rx_irq_ctrl(struct usbhs_pipe *pipe, int enable); struct usbhs_pkt *usbhs_pkt_pop(struct usbhs_pipe *pipe, struct usbhs_pkt *pkt) { struct usbhs_priv *priv = usbhs_pipe_to_priv(pipe); @@ -123,6 +125,11 @@ struct usbhs_pkt *usbhs_pkt_pop(struct u if (chan) { dmaengine_terminate_all(chan); usbhsf_dma_unmap(pkt); + } else { + if (usbhs_pipe_is_dir_in(pipe)) + usbhsf_rx_irq_ctrl(pipe, 0); + else + usbhsf_tx_irq_ctrl(pipe, 0); }
usbhs_pipe_clear_without_sequence(pipe, 0, 0);
From: Marco De Marco marco.demarco@posteo.net
commit 94b619a07655805a1622484967754f5848640456 upstream.
The patch is meant to support LARA-R6 Cat 1 module family.
Module USB ID: Vendor ID: 0x05c6 Product ID: 0x90fA
Interface layout: If 0: Diagnostic If 1: AT parser If 2: AT parser If 3: QMI wwan (not available in all versions)
Signed-off-by: Marco De Marco marco.demarco@posteo.net Link: https://lore.kernel.org/r/49260184.kfMIbaSn9k@mars Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/option.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/usb/serial/option.c +++ b/drivers/usb/serial/option.c @@ -238,6 +238,7 @@ static void option_instat_callback(struc #define QUECTEL_PRODUCT_UC15 0x9090 /* These u-blox products use Qualcomm's vendor ID */ #define UBLOX_PRODUCT_R410M 0x90b2 +#define UBLOX_PRODUCT_R6XX 0x90fa /* These Yuga products use Qualcomm's vendor ID */ #define YUGA_PRODUCT_CLM920_NC5 0x9625
@@ -1101,6 +1102,8 @@ static const struct usb_device_id option /* u-blox products using Qualcomm vendor ID */ { USB_DEVICE(QUALCOMM_VENDOR_ID, UBLOX_PRODUCT_R410M), .driver_info = RSVD(1) | RSVD(3) }, + { USB_DEVICE(QUALCOMM_VENDOR_ID, UBLOX_PRODUCT_R6XX), + .driver_info = RSVD(3) }, /* Quectel products using Quectel vendor ID */ { USB_DEVICE_AND_INTERFACE_INFO(QUECTEL_VENDOR_ID, QUECTEL_PRODUCT_EC21, 0xff, 0xff, 0xff), .driver_info = NUMEP2 },
From: Ian Ray ian.ray@ge.com
commit e9db418d4b828dd049caaf5ed65dc86f93bb1a0c upstream.
Fix comments for GE CS1000 CP210x USB ID assignments.
Fixes: 42213a0190b5 ("USB: serial: cp210x: add some more GE USB IDs") Signed-off-by: Ian Ray ian.ray@ge.com Signed-off-by: Sebastian Reichel sebastian.reichel@collabora.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/cp210x.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/usb/serial/cp210x.c +++ b/drivers/usb/serial/cp210x.c @@ -202,8 +202,8 @@ static const struct usb_device_id id_tab { USB_DEVICE(0x1901, 0x0194) }, /* GE Healthcare Remote Alarm Box */ { USB_DEVICE(0x1901, 0x0195) }, /* GE B850/B650/B450 CP2104 DP UART interface */ { USB_DEVICE(0x1901, 0x0196) }, /* GE B850 CP2105 DP UART interface */ - { USB_DEVICE(0x1901, 0x0197) }, /* GE CS1000 Display serial interface */ - { USB_DEVICE(0x1901, 0x0198) }, /* GE CS1000 M.2 Key E serial interface */ + { USB_DEVICE(0x1901, 0x0197) }, /* GE CS1000 M.2 Key E serial interface */ + { USB_DEVICE(0x1901, 0x0198) }, /* GE CS1000 Display serial interface */ { USB_DEVICE(0x199B, 0xBA30) }, /* LORD WSDA-200-USB */ { USB_DEVICE(0x19CF, 0x3000) }, /* Parrot NMEA GPS Flight Recorder */ { USB_DEVICE(0x1ADB, 0x0001) }, /* Schweitzer Engineering C662 Cable */
From: John Keeping john@metanate.com
commit d6a206e60124a9759dd7f6dfb86b0e1d3b1df82e upstream.
Add the USB serial device ID for the CEL ZigBee EM3588 radio stick.
Signed-off-by: John Keeping john@metanate.com Cc: stable@vger.kernel.org Signed-off-by: Johan Hovold johan@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/serial/cp210x.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/usb/serial/cp210x.c +++ b/drivers/usb/serial/cp210x.c @@ -155,6 +155,7 @@ static const struct usb_device_id id_tab { USB_DEVICE(0x10C4, 0x89A4) }, /* CESINEL FTBC Flexible Thyristor Bridge Controller */ { USB_DEVICE(0x10C4, 0x89FB) }, /* Qivicon ZigBee USB Radio Stick */ { USB_DEVICE(0x10C4, 0x8A2A) }, /* HubZ dual ZigBee and Z-Wave dongle */ + { USB_DEVICE(0x10C4, 0x8A5B) }, /* CEL EM3588 ZigBee USB Stick */ { USB_DEVICE(0x10C4, 0x8A5E) }, /* CEL EM3588 ZigBee USB Stick Long Range */ { USB_DEVICE(0x10C4, 0x8B34) }, /* Qivicon ZigBee USB Radio Stick */ { USB_DEVICE(0x10C4, 0xEA60) }, /* Silicon Labs factory default */
From: Zhang Qilong zhangqilong3@huawei.com
commit 5b01248156bd75303e66985c351dee648c149979 upstream.
Add missing pm_runtime_disable() when probe error out. It could avoid pm_runtime implementation complains when removing and probing again the driver.
Fixes: 49db427232fe ("usb: gadget: Add UDC driver for tegra XUSB device mode controller") Cc: stable stable@vger.kernel.org Signed-off-by: Zhang Qilong zhangqilong3@huawei.com Link: https://lore.kernel.org/r/20210618141441.107817-1-zhangqilong3@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/gadget/udc/tegra-xudc.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/usb/gadget/udc/tegra-xudc.c +++ b/drivers/usb/gadget/udc/tegra-xudc.c @@ -3861,6 +3861,7 @@ static int tegra_xudc_probe(struct platf return 0;
free_eps: + pm_runtime_disable(&pdev->dev); tegra_xudc_free_eps(xudc); free_event_ring: tegra_xudc_free_event_ring(xudc);
From: Marek Szyprowski m.szyprowski@samsung.com
commit c4a0f7a6ab5417eb6105b0e1d7e6e67f6ef7d4e5 upstream.
Commit 0112b7ce68ea ("usb: dwc2: Update dwc2_handle_usb_suspend_intr function.") changed the way the driver handles power down modes in a such way that it uses clock gating when no other power down mode is available.
This however doesn't work well on the DWC2 implementation used on the Samsung SoCs. When a clock gating is enabled, system hangs. It looks that the proper clock gating requires some additional glue code in the shared USB2 PHY and/or Samsung glue code for the DWC2. To restore driver operation on the Samsung SoCs simply skip enabling clock gating mode until one finds what is really needed to make it working reliably.
Fixes: 0112b7ce68ea ("usb: dwc2: Update dwc2_handle_usb_suspend_intr function.") Cc: stable stable@vger.kernel.org Acked-by: Krzysztof Kozlowski krzysztof.kozlowski@canonical.com Signed-off-by: Marek Szyprowski m.szyprowski@samsung.com Link: https://lore.kernel.org/r/20210716050127.4406-1-m.szyprowski@samsung.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/dwc2/core.h | 4 ++++ drivers/usb/dwc2/core_intr.c | 3 ++- drivers/usb/dwc2/hcd.c | 6 ++++-- drivers/usb/dwc2/params.c | 1 + 4 files changed, 11 insertions(+), 3 deletions(-)
--- a/drivers/usb/dwc2/core.h +++ b/drivers/usb/dwc2/core.h @@ -383,6 +383,9 @@ enum dwc2_ep0_state { * 0 - No (default) * 1 - Partial power down * 2 - Hibernation + * @no_clock_gating: Specifies whether to avoid clock gating feature. + * 0 - No (use clock gating) + * 1 - Yes (avoid it) * @lpm: Enable LPM support. * 0 - No * 1 - Yes @@ -480,6 +483,7 @@ struct dwc2_core_params { #define DWC2_POWER_DOWN_PARAM_NONE 0 #define DWC2_POWER_DOWN_PARAM_PARTIAL 1 #define DWC2_POWER_DOWN_PARAM_HIBERNATION 2 + bool no_clock_gating;
bool lpm; bool lpm_clock_gating; --- a/drivers/usb/dwc2/core_intr.c +++ b/drivers/usb/dwc2/core_intr.c @@ -556,7 +556,8 @@ static void dwc2_handle_usb_suspend_intr * If neither hibernation nor partial power down are supported, * clock gating is used to save power. */ - dwc2_gadget_enter_clock_gating(hsotg); + if (!hsotg->params.no_clock_gating) + dwc2_gadget_enter_clock_gating(hsotg); }
/* --- a/drivers/usb/dwc2/hcd.c +++ b/drivers/usb/dwc2/hcd.c @@ -3338,7 +3338,8 @@ int dwc2_port_suspend(struct dwc2_hsotg * If not hibernation nor partial power down are supported, * clock gating is used to save power. */ - dwc2_host_enter_clock_gating(hsotg); + if (!hsotg->params.no_clock_gating) + dwc2_host_enter_clock_gating(hsotg); break; }
@@ -4402,7 +4403,8 @@ static int _dwc2_hcd_suspend(struct usb_ * If not hibernation nor partial power down are supported, * clock gating is used to save power. */ - dwc2_host_enter_clock_gating(hsotg); + if (!hsotg->params.no_clock_gating) + dwc2_host_enter_clock_gating(hsotg);
/* After entering suspend, hardware is not accessible */ clear_bit(HCD_FLAG_HW_ACCESSIBLE, &hcd->flags); --- a/drivers/usb/dwc2/params.c +++ b/drivers/usb/dwc2/params.c @@ -76,6 +76,7 @@ static void dwc2_set_s3c6400_params(stru struct dwc2_core_params *p = &hsotg->params;
p->power_down = DWC2_POWER_DOWN_PARAM_NONE; + p->no_clock_gating = true; p->phy_utmi_width = 8; }
From: Minas Harutyunyan Minas.Harutyunyan@synopsys.com
commit fecb3a171db425e5068b27231f8efe154bf72637 upstream.
Because of dwc2_hsotg_ep_stop_xfr() function uses poll mode, first need to mask GINTSTS_GOUTNAKEFF interrupt. In Slave mode GINTSTS_GOUTNAKEFF interrupt will be aserted only after pop OUT NAK status packet from RxFIFO.
In dwc2_hsotg_ep_sethalt() function before setting DCTL_SGOUTNAK need to unmask GOUTNAKEFF interrupt.
Tested by USBCV CH9 and MSC tests set in Slave, BDMA and DDMA. All tests are passed.
Fixes: a4f827714539a ("usb: dwc2: gadget: Disable enabled HW endpoint in dwc2_hsotg_ep_disable") Fixes: 6070636c4918c ("usb: dwc2: Fix Stalling a Non-Isochronous OUT EP") Cc: stable stable@vger.kernel.org Signed-off-by: Minas Harutyunyan Minas.Harutyunyan@synopsys.com Link: https://lore.kernel.org/r/e17fad802bbcaf879e1ed6745030993abb93baf8.162615292... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/dwc2/gadget.c | 21 +++++++++++++++++++++ 1 file changed, 21 insertions(+)
--- a/drivers/usb/dwc2/gadget.c +++ b/drivers/usb/dwc2/gadget.c @@ -3900,9 +3900,27 @@ static void dwc2_hsotg_ep_stop_xfr(struc __func__); } } else { + /* Mask GINTSTS_GOUTNAKEFF interrupt */ + dwc2_hsotg_disable_gsint(hsotg, GINTSTS_GOUTNAKEFF); + if (!(dwc2_readl(hsotg, GINTSTS) & GINTSTS_GOUTNAKEFF)) dwc2_set_bit(hsotg, DCTL, DCTL_SGOUTNAK);
+ if (!using_dma(hsotg)) { + /* Wait for GINTSTS_RXFLVL interrupt */ + if (dwc2_hsotg_wait_bit_set(hsotg, GINTSTS, + GINTSTS_RXFLVL, 100)) { + dev_warn(hsotg->dev, "%s: timeout GINTSTS.RXFLVL\n", + __func__); + } else { + /* + * Pop GLOBAL OUT NAK status packet from RxFIFO + * to assert GOUTNAKEFF interrupt + */ + dwc2_readl(hsotg, GRXSTSP); + } + } + /* Wait for global nak to take effect */ if (dwc2_hsotg_wait_bit_set(hsotg, GINTSTS, GINTSTS_GOUTNAKEFF, 100)) @@ -4348,6 +4366,9 @@ static int dwc2_hsotg_ep_sethalt(struct epctl = dwc2_readl(hs, epreg);
if (value) { + /* Unmask GOUTNAKEFF interrupt */ + dwc2_hsotg_en_gsint(hs, GINTSTS_GOUTNAKEFF); + if (!(dwc2_readl(hs, GINTSTS) & GINTSTS_GOUTNAKEFF)) dwc2_set_bit(hs, DCTL, DCTL_SGOUTNAK); // STALL bit will be set in GOUTNAKEFF interrupt handler
From: Minas Harutyunyan Minas.Harutyunyan@synopsys.com
commit d53dc38857f6dbefabd9eecfcbf67b6eac9a1ef4 upstream.
Sending zero length packet in DDMA mode perform by DMA descriptor by setting SP (short packet) flag.
For DDMA in function dwc2_hsotg_complete_in() does not need to send zlp.
Tested by USBCV MSC tests.
Fixes: f71b5e2533de ("usb: dwc2: gadget: fix zero length packet transfers") Cc: stable stable@vger.kernel.org Signed-off-by: Minas Harutyunyan Minas.Harutyunyan@synopsys.com Link: https://lore.kernel.org/r/967bad78c55dd2db1c19714eee3d0a17cf99d74a.162677773... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/dwc2/gadget.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
--- a/drivers/usb/dwc2/gadget.c +++ b/drivers/usb/dwc2/gadget.c @@ -2749,12 +2749,14 @@ static void dwc2_hsotg_complete_in(struc return; }
- /* Zlp for all endpoints, for ep0 only in DATA IN stage */ + /* Zlp for all endpoints in non DDMA, for ep0 only in DATA IN stage */ if (hs_ep->send_zlp) { - dwc2_hsotg_program_zlp(hsotg, hs_ep); hs_ep->send_zlp = 0; - /* transfer will be completed on next complete interrupt */ - return; + if (!using_desc_dma(hsotg)) { + dwc2_hsotg_program_zlp(hsotg, hs_ep); + /* transfer will be completed on next complete interrupt */ + return; + } }
if (hs_ep->index == 0 && hsotg->ep0_state == DWC2_EP0_DATA_IN) {
From: Martin Kepplinger martin.kepplinger@puri.sm
commit 57560ee95cb7f91cf0bc31d4ae8276e0dcfe17aa upstream.
Similar as with tcpm this patch lets fw_devlink know not to wait on the fwnode to be populated as a struct device.
Without this patch, USB functionality can be broken on some previously supported boards.
Fixes: 28ec344bb891 ("usb: typec: tcpm: Don't block probing of consumers of "connector" nodes") Cc: stable stable@vger.kernel.org Acked-by: Heikki Krogerus heikki.krogerus@linux.intel.com Signed-off-by: Martin Kepplinger martin.kepplinger@puri.sm Link: https://lore.kernel.org/r/20210714061807.5737-1-martin.kepplinger@puri.sm Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/typec/tipd/core.c | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/drivers/usb/typec/tipd/core.c +++ b/drivers/usb/typec/tipd/core.c @@ -629,6 +629,15 @@ static int tps6598x_probe(struct i2c_cli if (!fwnode) return -ENODEV;
+ /* + * This fwnode has a "compatible" property, but is never populated as a + * struct device. Instead we simply parse it to read the properties. + * This breaks fw_devlink=on. To maintain backward compatibility + * with existing DT files, we work around this by deleting any + * fwnode_links to/from this fwnode. + */ + fw_devlink_purge_absent_suppliers(fwnode); + tps->role_sw = fwnode_usb_role_switch_get(fwnode); if (IS_ERR(tps->role_sw)) { ret = PTR_ERR(tps->role_sw);
From: Amelie Delaunay amelie.delaunay@foss.st.com
commit 86762ad4abcc549deb7a155c8e5e961b9755bcf0 upstream.
During interrupt registration, attach state is checked. If attached, then the Type-C state is updated with typec_set_xxx functions and role switch is set with usb_role_switch_set_role().
If the usb_role_switch parameter is error or null, the function simply returns 0.
So, to update usb_role_switch role if a device is attached before the irq is registered, usb_role_switch must be registered before irq registration.
Fixes: da0cb6310094 ("usb: typec: add support for STUSB160x Type-C controller family") Cc: stable stable@vger.kernel.org Signed-off-by: Amelie Delaunay amelie.delaunay@foss.st.com Link: https://lore.kernel.org/r/20210716120718.20398-2-amelie.delaunay@foss.st.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/typec/stusb160x.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)
--- a/drivers/usb/typec/stusb160x.c +++ b/drivers/usb/typec/stusb160x.c @@ -739,10 +739,6 @@ static int stusb160x_probe(struct i2c_cl typec_set_pwr_opmode(chip->port, chip->pwr_opmode);
if (client->irq) { - ret = stusb160x_irq_init(chip, client->irq); - if (ret) - goto port_unregister; - chip->role_sw = fwnode_usb_role_switch_get(fwnode); if (IS_ERR(chip->role_sw)) { ret = PTR_ERR(chip->role_sw); @@ -752,6 +748,10 @@ static int stusb160x_probe(struct i2c_cl ret); goto port_unregister; } + + ret = stusb160x_irq_init(chip, client->irq); + if (ret) + goto role_sw_put; } else { /* * If Source or Dual power role, need to enable VDD supply @@ -775,6 +775,9 @@ static int stusb160x_probe(struct i2c_cl
return 0;
+role_sw_put: + if (chip->role_sw) + usb_role_switch_put(chip->role_sw); port_unregister: typec_unregister_port(chip->port); all_reg_disable:
From: Amelie Delaunay amelie.delaunay@foss.st.com
commit 6b63376722d9e1b915a2948e9b30f4ba2712e3f5 upstream.
Similar as with tcpm this patch lets fw_devlink know not to wait on the fwnode to be populated as a struct device.
Without this patch, USB functionality can be broken on some previously supported boards.
Fixes: 28ec344bb891 ("usb: typec: tcpm: Don't block probing of consumers of "connector" nodes") Cc: stable stable@vger.kernel.org Signed-off-by: Amelie Delaunay amelie.delaunay@foss.st.com Link: https://lore.kernel.org/r/20210716120718.20398-3-amelie.delaunay@foss.st.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/typec/stusb160x.c | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/drivers/usb/typec/stusb160x.c +++ b/drivers/usb/typec/stusb160x.c @@ -686,6 +686,15 @@ static int stusb160x_probe(struct i2c_cl return -ENODEV;
/* + * This fwnode has a "compatible" property, but is never populated as a + * struct device. Instead we simply parse it to read the properties. + * This it breaks fw_devlink=on. To maintain backward compatibility + * with existing DT files, we work around this by deleting any + * fwnode_links to/from this fwnode. + */ + fw_devlink_purge_absent_suppliers(fwnode); + + /* * When both VDD and VSYS power supplies are present, the low power * supply VSYS is selected when VSYS voltage is above 3.1 V. * Otherwise VDD is selected.
From: Marc Zyngier maz@kernel.org
commit 2bab693a608bdf614b9fcd44083c5100f34b9f77 upstream.
kexec_load_file() relies on the memblock infrastructure to avoid stamping over regions of memory that are essential to the survival of the system.
However, nobody seems to agree how to flag these regions as reserved, and (for example) EFI only publishes its reservations in /proc/iomem for the benefit of the traditional, userspace based kexec tool.
On arm64 platforms with GICv3, this can result in the payload being placed at the location of the LPI tables. Shock, horror!
Let's augment the EFI reservation code with a memblock_reserve() call, protecting our dear tables from the secondary kernel invasion.
Reported-by: Moritz Fischer mdf@kernel.org Tested-by: Moritz Fischer mdf@kernel.org Signed-off-by: Marc Zyngier maz@kernel.org Cc: stable@vger.kernel.org Cc: Ard Biesheuvel ardb@kernel.org Cc: James Morse james.morse@arm.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: Will Deacon will@kernel.org Signed-off-by: Ard Biesheuvel ardb@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/firmware/efi/efi.c | 13 ++++++++++++- 1 file changed, 12 insertions(+), 1 deletion(-)
--- a/drivers/firmware/efi/efi.c +++ b/drivers/firmware/efi/efi.c @@ -896,6 +896,7 @@ static int __init efi_memreserve_map_roo static int efi_mem_reserve_iomem(phys_addr_t addr, u64 size) { struct resource *res, *parent; + int ret;
res = kzalloc(sizeof(struct resource), GFP_ATOMIC); if (!res) @@ -908,7 +909,17 @@ static int efi_mem_reserve_iomem(phys_ad
/* we expect a conflict with a 'System RAM' region */ parent = request_resource_conflict(&iomem_resource, res); - return parent ? request_resource(parent, res) : 0; + ret = parent ? request_resource(parent, res) : 0; + + /* + * Given that efi_mem_reserve_iomem() can be called at any + * time, only call memblock_reserve() if the architecture + * keeps the infrastructure around. + */ + if (IS_ENABLED(CONFIG_ARCH_KEEP_MEMBLOCK) && !ret) + memblock_reserve(addr, size); + + return ret; }
int __ref efi_mem_reserve_persistent(phys_addr_t addr, u64 size)
From: Steven Rostedt (VMware) rostedt@goodmis.org
commit 352384d5c84ebe40fa77098cc234fe173247d8ef upstream.
Because of the significant overhead that retpolines pose on indirect calls, the tracepoint code was updated to use the new "static_calls" that can modify the running code to directly call a function instead of using an indirect caller, and this function can be changed at runtime.
In the tracepoint code that calls all the registered callbacks that are attached to a tracepoint, the following is done:
it_func_ptr = rcu_dereference_raw((&__tracepoint_##name)->funcs); if (it_func_ptr) { __data = (it_func_ptr)->data; static_call(tp_func_##name)(__data, args); }
If there's just a single callback, the static_call is updated to just call that callback directly. Once another handler is added, then the static caller is updated to call the iterator, that simply loops over all the funcs in the array and calls each of the callbacks like the old method using indirect calling.
The issue was discovered with a race between updating the funcs array and updating the static_call. The funcs array was updated first and then the static_call was updated. This is not an issue as long as the first element in the old array is the same as the first element in the new array. But that assumption is incorrect, because callbacks also have a priority field, and if there's a callback added that has a higher priority than the callback on the old array, then it will become the first callback in the new array. This means that it is possible to call the old callback with the new callback data element, which can cause a kernel panic.
static_call = callback1() funcs[] = {callback1,data1}; callback2 has higher priority than callback1
CPU 1 CPU 2 ----- -----
new_funcs = {callback2,data2}, {callback1,data1}
rcu_assign_pointer(tp->funcs, new_funcs);
/* * Now tp->funcs has the new array * but the static_call still calls callback1 */
it_func_ptr = tp->funcs [ new_funcs ] data = it_func_ptr->data [ data2 ] static_call(callback1, data);
/* Now callback1 is called with * callback2's data */
[ KERNEL PANIC ]
update_static_call(iterator);
To prevent this from happening, always switch the static_call to the iterator before assigning the tp->funcs to the new array. The iterator will always properly match the callback with its data.
To trigger this bug:
In one terminal:
while :; do hackbench 50; done
In another terminal
echo 1 > /sys/kernel/tracing/events/sched/sched_waking/enable while :; do echo 1 > /sys/kernel/tracing/set_event_pid; sleep 0.5 echo 0 > /sys/kernel/tracing/set_event_pid; sleep 0.5 done
And it doesn't take long to crash. This is because the set_event_pid adds a callback to the sched_waking tracepoint with a high priority, which will be called before the sched_waking trace event callback is called.
Note, the removal to a single callback updates the array first, before changing the static_call to single callback, which is the proper order as the first element in the array is the same as what the static_call is being changed to.
Link: https://lore.kernel.org/io-uring/4ebea8f0-58c9-e571-fd30-0ce4f6f09c70@samba....
Cc: stable@vger.kernel.org Fixes: d25e37d89dd2f ("tracepoint: Optimize using static_call()") Reported-by: Stefan Metzmacher metze@samba.org tested-by: Stefan Metzmacher metze@samba.org Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/tracepoint.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/tracepoint.c +++ b/kernel/tracepoint.c @@ -299,8 +299,8 @@ static int tracepoint_add_func(struct tr * a pointer to it. This array is referenced by __DO_TRACE from * include/linux/tracepoint.h using rcu_dereference_sched(). */ - rcu_assign_pointer(tp->funcs, tp_funcs); tracepoint_update_call(tp, tp_funcs, false); + rcu_assign_pointer(tp->funcs, tp_funcs); static_key_enable(&tp->key);
release_probes(old);
From: Steven Rostedt (VMware) rostedt@goodmis.org
commit 1e3bac71c5053c99d438771fc9fa5082ae5d90aa upstream.
Currently the histogram logic allows the user to write "cpu" in as an event field, and it will record the CPU that the event happened on.
The problem with this is that there's a lot of events that have "cpu" as a real field, and using "cpu" as the CPU it ran on, makes it impossible to run histograms on the "cpu" field of events.
For example, if I want to have a histogram on the count of the workqueue_queue_work event on its cpu field, running:
# echo 'hist:keys=cpu' > events/workqueue/workqueue_queue_work/trigger
Gives a misleading and wrong result.
Change the command to "common_cpu" as no event should have "common_*" fields as that's a reserved name for fields used by all events. And this makes sense here as common_cpu would be a field used by all events.
Now we can even do:
# echo 'hist:keys=common_cpu,cpu if cpu < 100' > events/workqueue/workqueue_queue_work/trigger # cat events/workqueue/workqueue_queue_work/hist
# event histogram # # trigger info: hist:keys=common_cpu,cpu:vals=hitcount:sort=hitcount:size=2048 if cpu < 100 [active] #
{ common_cpu: 0, cpu: 2 } hitcount: 1 { common_cpu: 0, cpu: 4 } hitcount: 1 { common_cpu: 7, cpu: 7 } hitcount: 1 { common_cpu: 0, cpu: 7 } hitcount: 1 { common_cpu: 0, cpu: 1 } hitcount: 1 { common_cpu: 0, cpu: 6 } hitcount: 2 { common_cpu: 0, cpu: 5 } hitcount: 2 { common_cpu: 1, cpu: 1 } hitcount: 4 { common_cpu: 6, cpu: 6 } hitcount: 4 { common_cpu: 5, cpu: 5 } hitcount: 14 { common_cpu: 4, cpu: 4 } hitcount: 26 { common_cpu: 0, cpu: 0 } hitcount: 39 { common_cpu: 2, cpu: 2 } hitcount: 184
Now for backward compatibility, I added a trick. If "cpu" is used, and the field is not found, it will fall back to "common_cpu" and work as it did before. This way, it will still work for old programs that use "cpu" to get the actual CPU, but if the event has a "cpu" as a field, it will get that event's "cpu" field, which is probably what it wants anyway.
I updated the tracefs/README to include documentation about both the common_timestamp and the common_cpu. This way, if that text is present in the README, then an application can know that common_cpu is supported over just plain "cpu".
Link: https://lkml.kernel.org/r/20210721110053.26b4f641@oasis.local.home
Cc: Namhyung Kim namhyung@kernel.org Cc: Ingo Molnar mingo@kernel.org Cc: Andrew Morton akpm@linux-foundation.org Cc: stable@vger.kernel.org Fixes: 8b7622bf94a44 ("tracing: Add cpu field for hist triggers") Reviewed-by: Tom Zanussi zanussi@kernel.org Reviewed-by: Masami Hiramatsu mhiramat@kernel.org Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/trace/histogram.rst | 2 +- kernel/trace/trace.c | 4 ++++ kernel/trace/trace_events_hist.c | 22 ++++++++++++++++------ 3 files changed, 21 insertions(+), 7 deletions(-)
--- a/Documentation/trace/histogram.rst +++ b/Documentation/trace/histogram.rst @@ -191,7 +191,7 @@ Documentation written by Tom Zanussi with the event, in nanoseconds. May be modified by .usecs to have timestamps interpreted as microseconds. - cpu int the cpu on which the event occurred. + common_cpu int the cpu on which the event occurred. ====================== ==== =======================================
Extended error information --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -5565,6 +5565,10 @@ static const char readme_msg[] = "\t [:name=histname1]\n" "\t [:<handler>.<action>]\n" "\t [if <filter>]\n\n" + "\t Note, special fields can be used as well:\n" + "\t common_timestamp - to record current timestamp\n" + "\t common_cpu - to record the CPU the event happened on\n" + "\n" "\t When a matching event is hit, an entry is added to a hash\n" "\t table using the key(s) and value(s) named, and the value of a\n" "\t sum called 'hitcount' is incremented. Keys and values\n" --- a/kernel/trace/trace_events_hist.c +++ b/kernel/trace/trace_events_hist.c @@ -1111,7 +1111,7 @@ static const char *hist_field_name(struc field->flags & HIST_FIELD_FL_ALIAS) field_name = hist_field_name(field->operands[0], ++level); else if (field->flags & HIST_FIELD_FL_CPU) - field_name = "cpu"; + field_name = "common_cpu"; else if (field->flags & HIST_FIELD_FL_EXPR || field->flags & HIST_FIELD_FL_VAR_REF) { if (field->system) { @@ -1991,14 +1991,24 @@ parse_field(struct hist_trigger_data *hi hist_data->enable_timestamps = true; if (*flags & HIST_FIELD_FL_TIMESTAMP_USECS) hist_data->attrs->ts_in_usecs = true; - } else if (strcmp(field_name, "cpu") == 0) + } else if (strcmp(field_name, "common_cpu") == 0) *flags |= HIST_FIELD_FL_CPU; else { field = trace_find_event_field(file->event_call, field_name); if (!field || !field->size) { - hist_err(tr, HIST_ERR_FIELD_NOT_FOUND, errpos(field_name)); - field = ERR_PTR(-EINVAL); - goto out; + /* + * For backward compatibility, if field_name + * was "cpu", then we treat this the same as + * common_cpu. + */ + if (strcmp(field_name, "cpu") == 0) { + *flags |= HIST_FIELD_FL_CPU; + } else { + hist_err(tr, HIST_ERR_FIELD_NOT_FOUND, + errpos(field_name)); + field = ERR_PTR(-EINVAL); + goto out; + } } } out: @@ -5085,7 +5095,7 @@ static void hist_field_print(struct seq_ seq_printf(m, "%s=", hist_field->var.name);
if (hist_field->flags & HIST_FIELD_FL_CPU) - seq_puts(m, "cpu"); + seq_puts(m, "common_cpu"); else if (field_name) { if (hist_field->flags & HIST_FIELD_FL_VAR_REF || hist_field->flags & HIST_FIELD_FL_ALIAS)
From: Haoran Luo www@aegistudio.net
commit 67f0d6d9883c13174669f88adac4f0ee656cc16a upstream.
The "rb_per_cpu_empty()" misinterpret the condition (as not-empty) when "head_page" and "commit_page" of "struct ring_buffer_per_cpu" points to the same buffer page, whose "buffer_data_page" is empty and "read" field is non-zero.
An error scenario could be constructed as followed (kernel perspective):
1. All pages in the buffer has been accessed by reader(s) so that all of them will have non-zero "read" field.
2. Read and clear all buffer pages so that "rb_num_of_entries()" will return 0 rendering there's no more data to read. It is also required that the "read_page", "commit_page" and "tail_page" points to the same page, while "head_page" is the next page of them.
3. Invoke "ring_buffer_lock_reserve()" with large enough "length" so that it shot pass the end of current tail buffer page. Now the "head_page", "commit_page" and "tail_page" points to the same page.
4. Discard current event with "ring_buffer_discard_commit()", so that "head_page", "commit_page" and "tail_page" points to a page whose buffer data page is now empty.
When the error scenario has been constructed, "tracing_read_pipe" will be trapped inside a deadloop: "trace_empty()" returns 0 since "rb_per_cpu_empty()" returns 0 when it hits the CPU containing such constructed ring buffer. Then "trace_find_next_entry_inc()" always return NULL since "rb_num_of_entries()" reports there's no more entry to read. Finally "trace_seq_to_user()" returns "-EBUSY" spanking "tracing_read_pipe" back to the start of the "waitagain" loop.
I've also written a proof-of-concept script to construct the scenario and trigger the bug automatically, you can use it to trace and validate my reasoning above:
https://github.com/aegistudio/RingBufferDetonator.git
Tests has been carried out on linux kernel 5.14-rc2 (2734d6c1b1a089fb593ef6a23d4b70903526fe0c), my fixed version of kernel (for testing whether my update fixes the bug) and some older kernels (for range of affected kernels). Test result is also attached to the proof-of-concept repository.
Link: https://lore.kernel.org/linux-trace-devel/YPaNxsIlb2yjSi5Y@aegistudio/ Link: https://lore.kernel.org/linux-trace-devel/YPgrN85WL9VyrZ55@aegistudio
Cc: stable@vger.kernel.org Fixes: bf41a158cacba ("ring-buffer: make reentrant") Suggested-by: Linus Torvalds torvalds@linuxfoundation.org Signed-off-by: Haoran Luo www@aegistudio.net Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/ring_buffer.c | 28 ++++++++++++++++++++++++---- 1 file changed, 24 insertions(+), 4 deletions(-)
--- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -3880,10 +3880,30 @@ static bool rb_per_cpu_empty(struct ring if (unlikely(!head)) return true;
- return reader->read == rb_page_commit(reader) && - (commit == reader || - (commit == head && - head->read == rb_page_commit(commit))); + /* Reader should exhaust content in reader page */ + if (reader->read != rb_page_commit(reader)) + return false; + + /* + * If writers are committing on the reader page, knowing all + * committed content has been read, the ring buffer is empty. + */ + if (commit == reader) + return true; + + /* + * If writers are committing on a page other than reader page + * and head page, there should always be content to read. + */ + if (commit != head) + return false; + + /* + * Writers are committing on the head page, we just need + * to care about there're committed data, and the reader will + * swap reader page with head page when it is to read data. + */ + return rb_page_commit(commit) == 0; }
/**
From: Steven Rostedt (VMware) rostedt@goodmis.org
commit 3b13911a2fd0dd0146c9777a254840c5466cf120 upstream.
Performing the following:
# echo 'wakeup_lat s32 pid; u64 delta; char wake_comm[]' > synthetic_events # echo 'hist:keys=pid:__arg__1=common_timestamp.usecs' > events/sched/sched_waking/trigger # echo 'hist:keys=next_pid:pid=next_pid,delta=common_timestamp.usecs-$__arg__1:onmatch(sched.sched_waking).trace(wakeup_lat,$pid,$delta,prev_comm)'\
> events/sched/sched_switch/trigger
# echo 1 > events/synthetic/enable
Crashed the kernel:
BUG: kernel NULL pointer dereference, address: 000000000000001b #PF: supervisor read access in kernel mode #PF: error_code(0x0000) - not-present page PGD 0 P4D 0 Oops: 0000 [#1] PREEMPT SMP CPU: 7 PID: 0 Comm: swapper/7 Not tainted 5.13.0-rc5-test+ #104 Hardware name: Hewlett-Packard HP Compaq Pro 6300 SFF/339A, BIOS K01 v03.03 07/14/2016 RIP: 0010:strlen+0x0/0x20 Code: f6 82 80 2b 0b bc 20 74 11 0f b6 50 01 48 83 c0 01 f6 82 80 2b 0b bc 20 75 ef c3 66 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 40 00 <80> 3f 00 74 10 48 89 f8 48 83 c0 01 80 38 9 f8 c3 31 RSP: 0018:ffffaa75000d79d0 EFLAGS: 00010046 RAX: 0000000000000002 RBX: ffff9cdb55575270 RCX: 0000000000000000 RDX: ffff9cdb58c7a320 RSI: ffffaa75000d7b40 RDI: 000000000000001b RBP: ffffaa75000d7b40 R08: ffff9cdb40a4f010 R09: ffffaa75000d7ab8 R10: ffff9cdb4398c700 R11: 0000000000000008 R12: ffff9cdb58c7a320 R13: ffff9cdb55575270 R14: ffff9cdb58c7a000 R15: 0000000000000018 FS: 0000000000000000(0000) GS:ffff9cdb5aa00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 000000000000001b CR3: 00000000c0612006 CR4: 00000000001706e0 Call Trace: trace_event_raw_event_synth+0x90/0x1d0 action_trace+0x5b/0x70 event_hist_trigger+0x4bd/0x4e0 ? cpumask_next_and+0x20/0x30 ? update_sd_lb_stats.constprop.0+0xf6/0x840 ? __lock_acquire.constprop.0+0x125/0x550 ? find_held_lock+0x32/0x90 ? sched_clock_cpu+0xe/0xd0 ? lock_release+0x155/0x440 ? update_load_avg+0x8c/0x6f0 ? enqueue_entity+0x18a/0x920 ? __rb_reserve_next+0xe5/0x460 ? ring_buffer_lock_reserve+0x12a/0x3f0 event_triggers_call+0x52/0xe0 trace_event_buffer_commit+0x1ae/0x240 trace_event_raw_event_sched_switch+0x114/0x170 __traceiter_sched_switch+0x39/0x50 __schedule+0x431/0xb00 schedule_idle+0x28/0x40 do_idle+0x198/0x2e0 cpu_startup_entry+0x19/0x20 secondary_startup_64_no_verify+0xc2/0xcb
The reason is that the dynamic events array keeps track of the field position of the fields array, via the field_pos variable in the synth_field structure. Unfortunately, that field is a boolean for some reason, which means any field_pos greater than 1 will be a bug (in this case it was 2).
Link: https://lkml.kernel.org/r/20210721191008.638bce34@oasis.local.home
Cc: Masami Hiramatsu mhiramat@kernel.org Cc: Namhyung Kim namhyung@kernel.org Cc: Ingo Molnar mingo@kernel.org Cc: Andrew Morton akpm@linux-foundation.org Cc: stable@vger.kernel.org Fixes: bd82631d7ccdc ("tracing: Add support for dynamic strings to synthetic events") Reviewed-by: Tom Zanussi zanussi@kernel.org Signed-off-by: Steven Rostedt (VMware) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace_synth.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/kernel/trace/trace_synth.h +++ b/kernel/trace/trace_synth.h @@ -14,10 +14,10 @@ struct synth_field { char *name; size_t size; unsigned int offset; + unsigned int field_pos; bool is_signed; bool is_string; bool is_dynamic; - bool field_pos; };
struct synth_event {
From: Anand Jain anand.jain@oracle.com
commit 16a200f66ede3f9afa2e51d90ade017aaa18d213 upstream.
A fstrim on a degraded raid1 can trigger the following null pointer dereference:
BTRFS info (device loop0): allowing degraded mounts BTRFS info (device loop0): disk space caching is enabled BTRFS info (device loop0): has skinny extents BTRFS warning (device loop0): devid 2 uuid 97ac16f7-e14d-4db1-95bc-3d489b424adb is missing BTRFS warning (device loop0): devid 2 uuid 97ac16f7-e14d-4db1-95bc-3d489b424adb is missing BTRFS info (device loop0): enabling ssd optimizations BUG: kernel NULL pointer dereference, address: 0000000000000620 PGD 0 P4D 0 Oops: 0000 [#1] SMP NOPTI CPU: 0 PID: 4574 Comm: fstrim Not tainted 5.13.0-rc7+ #31 Hardware name: innotek GmbH VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006 RIP: 0010:btrfs_trim_fs+0x199/0x4a0 [btrfs] RSP: 0018:ffff959541797d28 EFLAGS: 00010293 RAX: 0000000000000000 RBX: ffff946f84eca508 RCX: a7a67937adff8608 RDX: ffff946e8122d000 RSI: 0000000000000000 RDI: ffffffffc02fdbf0 RBP: ffff946ea4615000 R08: 0000000000000001 R09: 0000000000000000 R10: 0000000000000000 R11: ffff946e8122d960 R12: 0000000000000000 R13: ffff959541797db8 R14: ffff946e8122d000 R15: ffff959541797db8 FS: 00007f55917a5080(0000) GS:ffff946f9bc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000620 CR3: 000000002d2c8001 CR4: 00000000000706f0 Call Trace: btrfs_ioctl_fitrim+0x167/0x260 [btrfs] btrfs_ioctl+0x1c00/0x2fe0 [btrfs] ? selinux_file_ioctl+0x140/0x240 ? syscall_trace_enter.constprop.0+0x188/0x240 ? __x64_sys_ioctl+0x83/0xb0 __x64_sys_ioctl+0x83/0xb0
Reproducer:
$ mkfs.btrfs -fq -d raid1 -m raid1 /dev/loop0 /dev/loop1 $ mount /dev/loop0 /btrfs $ umount /btrfs $ btrfs dev scan --forget $ mount -o degraded /dev/loop0 /btrfs
$ fstrim /btrfs
The reason is we call btrfs_trim_free_extents() for the missing device, which uses device->bdev (NULL for missing device) to find if the device supports discard.
Fix is to check if the device is missing before calling btrfs_trim_free_extents().
CC: stable@vger.kernel.org # 5.4+ Reviewed-by: Filipe Manana fdmanana@suse.com Signed-off-by: Anand Jain anand.jain@oracle.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/extent-tree.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -6034,6 +6034,9 @@ int btrfs_trim_fs(struct btrfs_fs_info * mutex_lock(&fs_info->fs_devices->device_list_mutex); devices = &fs_info->fs_devices->devices; list_for_each_entry(device, devices, dev_list) { + if (test_bit(BTRFS_DEV_STATE_MISSING, &device->dev_state)) + continue; + ret = btrfs_trim_free_extents(device, &group_trimmed); if (ret) { dev_failed++;
From: Filipe Manana fdmanana@suse.com
commit 9acc8103ab594f72250788cb45a43427f36d685d upstream.
If we have an inode that does not have the full sync flag set, was changed in the current transaction, then it is logged while logging some other inode (like its parent directory for example), its i_size is increased by a truncate operation, the log is synced through an fsync of some other inode and then finally we explicitly call fsync on our inode, the new i_size is not persisted.
The following example shows how to trigger it, with comments explaining how and why the issue happens:
$ mkfs.btrfs -f /dev/sdc $ mount /dev/sdc /mnt
$ touch /mnt/foo $ xfs_io -f -c "pwrite -S 0xab 0 1M" /mnt/bar
$ sync
# Fsync bar, this will be a noop since the file has not yet been # modified in the current transaction. The goal here is to clear # BTRFS_INODE_NEEDS_FULL_SYNC from the inode's runtime flags. $ xfs_io -c "fsync" /mnt/bar
# Now rename both files, without changing their parent directory. $ mv /mnt/bar /mnt/bar2 $ mv /mnt/foo /mnt/foo2
# Increase the size of bar2 with a truncate operation. $ xfs_io -c "truncate 2M" /mnt/bar2
# Now fsync foo2, this results in logging its parent inode (the root # directory), and logging the parent results in logging the inode of # file bar2 (its inode item and the new name). The inode of file bar2 # is logged with an i_size of 0 bytes since it's logged in # LOG_INODE_EXISTS mode, meaning we are only logging its names (and # xattrs if it had any) and the i_size of the inode will not be changed # when the log is replayed. $ xfs_io -c "fsync" /mnt/foo2
# Now explicitly fsync bar2. This resulted in doing nothing, not # logging the inode with the new i_size of 2M and the hole from file # offset 1M to 2M. Because the inode did not have the flag # BTRFS_INODE_NEEDS_FULL_SYNC set, when it was logged through the # fsync of file foo2, its last_log_commit field was updated, # resulting in this explicit of file bar2 not doing anything. $ xfs_io -c "fsync" /mnt/bar2
# File bar2 content and size before a power failure. $ od -A d -t x1 /mnt/bar2 0000000 ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab * 1048576 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 * 2097152
<power failure>
# Mount the filesystem to replay the log. $ mount /dev/sdc /mnt
# Read the file again, should have the same content and size as before # the power failure happened, but it doesn't, i_size is still at 1M. $ od -A d -t x1 /mnt/bar2 0000000 ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab ab * 1048576
This started to happen after commit 209ecbb8585bf6 ("btrfs: remove stale comment and logic from btrfs_inode_in_log()"), since btrfs_inode_in_log() no longer checks if the inode's list of modified extents is not empty. However, checking that list is not the right way to address this case and the check was added long time ago in commit 125c4cf9f37c98 ("Btrfs: set inode's logged_trans/last_log_commit after ranged fsync") for a different purpose, to address consecutive ranged fsyncs.
The reason that checking for the list emptiness makes this test pass is because during an expanding truncate we create an extent map to represent a hole from the old i_size to the new i_size, and add that extent map to the list of modified extents in the inode. However if we are low on available memory and we can not allocate a new extent map, then we don't treat it as an error and just set the full sync flag on the inode, so that the next fsync does not rely on the list of modified extents - so checking for the emptiness of the list to decide if the inode needs to be logged is not reliable, and results in not logging the inode if it was not possible to allocate the extent map for the hole.
Fix this by ensuring that if we are only logging that an inode exists (inode item, names/references and xattrs), we don't update the inode's last_log_commit even if it does not have the full sync runtime flag set.
A test case for fstests follows soon.
CC: stable@vger.kernel.org # 5.13+ Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/tree-log.c | 31 ++++++++++++++++++++++--------- 1 file changed, 22 insertions(+), 9 deletions(-)
--- a/fs/btrfs/tree-log.c +++ b/fs/btrfs/tree-log.c @@ -5515,16 +5515,29 @@ log_extents: spin_lock(&inode->lock); inode->logged_trans = trans->transid; /* - * Don't update last_log_commit if we logged that an inode exists - * after it was loaded to memory (full_sync bit set). - * This is to prevent data loss when we do a write to the inode, - * then the inode gets evicted after all delalloc was flushed, - * then we log it exists (due to a rename for example) and then - * fsync it. This last fsync would do nothing (not logging the - * extents previously written). + * Don't update last_log_commit if we logged that an inode exists. + * We do this for two reasons: + * + * 1) We might have had buffered writes to this inode that were + * flushed and had their ordered extents completed in this + * transaction, but we did not previously log the inode with + * LOG_INODE_ALL. Later the inode was evicted and after that + * it was loaded again and this LOG_INODE_EXISTS log operation + * happened. We must make sure that if an explicit fsync against + * the inode is performed later, it logs the new extents, an + * updated inode item, etc, and syncs the log. The same logic + * applies to direct IO writes instead of buffered writes. + * + * 2) When we log the inode with LOG_INODE_EXISTS, its inode item + * is logged with an i_size of 0 or whatever value was logged + * before. If later the i_size of the inode is increased by a + * truncate operation, the log is synced through an fsync of + * some other inode and then finally an explicit fsync against + * this inode is made, we must make sure this fsync logs the + * inode with the new i_size, the hole between old i_size and + * the new i_size, and syncs the log. */ - if (inode_only != LOG_INODE_EXISTS || - !test_bit(BTRFS_INODE_NEEDS_FULL_SYNC, &inode->runtime_flags)) + if (inode_only != LOG_INODE_EXISTS) inode->last_log_commit = inode->last_sub_trans; spin_unlock(&inode->lock); }
From: Filipe Manana fdmanana@suse.com
commit 8949b9a114019b03fbd0d03d65b8647cba4feef3 upstream.
At btrfs_qgroup_trace_extent_post() we call btrfs_find_all_roots() with a NULL value as the transaction handle argument, which makes that function take the commit_root_sem semaphore, which is necessary when we don't hold a transaction handle or any other mechanism to prevent a transaction commit from wiping out commit roots.
However btrfs_qgroup_trace_extent_post() can be called in a context where we are holding a write lock on an extent buffer from a subvolume tree, namely from btrfs_truncate_inode_items(), called either during truncate or unlink operations. In this case we end up with a lock inversion problem because the commit_root_sem is a higher level lock, always supposed to be acquired before locking any extent buffer.
Lockdep detects this lock inversion problem since we switched the extent buffer locks from custom locks to semaphores, and when running btrfs/158 from fstests, it reported the following trace:
[ 9057.626435] ====================================================== [ 9057.627541] WARNING: possible circular locking dependency detected [ 9057.628334] 5.14.0-rc2-btrfs-next-93 #1 Not tainted [ 9057.628961] ------------------------------------------------------ [ 9057.629867] kworker/u16:4/30781 is trying to acquire lock: [ 9057.630824] ffff8e2590f58760 (btrfs-tree-00){++++}-{3:3}, at: __btrfs_tree_read_lock+0x24/0x110 [btrfs] [ 9057.632542] but task is already holding lock: [ 9057.633551] ffff8e25582d4b70 (&fs_info->commit_root_sem){++++}-{3:3}, at: iterate_extent_inodes+0x10b/0x280 [btrfs] [ 9057.635255] which lock already depends on the new lock.
[ 9057.636292] the existing dependency chain (in reverse order) is: [ 9057.637240] -> #1 (&fs_info->commit_root_sem){++++}-{3:3}: [ 9057.638138] down_read+0x46/0x140 [ 9057.638648] btrfs_find_all_roots+0x41/0x80 [btrfs] [ 9057.639398] btrfs_qgroup_trace_extent_post+0x37/0x70 [btrfs] [ 9057.640283] btrfs_add_delayed_data_ref+0x418/0x490 [btrfs] [ 9057.641114] btrfs_free_extent+0x35/0xb0 [btrfs] [ 9057.641819] btrfs_truncate_inode_items+0x424/0xf70 [btrfs] [ 9057.642643] btrfs_evict_inode+0x454/0x4f0 [btrfs] [ 9057.643418] evict+0xcf/0x1d0 [ 9057.643895] do_unlinkat+0x1e9/0x300 [ 9057.644525] do_syscall_64+0x3b/0xc0 [ 9057.645110] entry_SYSCALL_64_after_hwframe+0x44/0xae [ 9057.645835] -> #0 (btrfs-tree-00){++++}-{3:3}: [ 9057.646600] __lock_acquire+0x130e/0x2210 [ 9057.647248] lock_acquire+0xd7/0x310 [ 9057.647773] down_read_nested+0x4b/0x140 [ 9057.648350] __btrfs_tree_read_lock+0x24/0x110 [btrfs] [ 9057.649175] btrfs_read_lock_root_node+0x31/0x40 [btrfs] [ 9057.650010] btrfs_search_slot+0x537/0xc00 [btrfs] [ 9057.650849] scrub_print_warning_inode+0x89/0x370 [btrfs] [ 9057.651733] iterate_extent_inodes+0x1e3/0x280 [btrfs] [ 9057.652501] scrub_print_warning+0x15d/0x2f0 [btrfs] [ 9057.653264] scrub_handle_errored_block.isra.0+0x135f/0x1640 [btrfs] [ 9057.654295] scrub_bio_end_io_worker+0x101/0x2e0 [btrfs] [ 9057.655111] btrfs_work_helper+0xf8/0x400 [btrfs] [ 9057.655831] process_one_work+0x247/0x5a0 [ 9057.656425] worker_thread+0x55/0x3c0 [ 9057.656993] kthread+0x155/0x180 [ 9057.657494] ret_from_fork+0x22/0x30 [ 9057.658030] other info that might help us debug this:
[ 9057.659064] Possible unsafe locking scenario:
[ 9057.659824] CPU0 CPU1 [ 9057.660402] ---- ---- [ 9057.660988] lock(&fs_info->commit_root_sem); [ 9057.661581] lock(btrfs-tree-00); [ 9057.662348] lock(&fs_info->commit_root_sem); [ 9057.663254] lock(btrfs-tree-00); [ 9057.663690] *** DEADLOCK ***
[ 9057.664437] 4 locks held by kworker/u16:4/30781: [ 9057.665023] #0: ffff8e25922a1148 ((wq_completion)btrfs-scrub){+.+.}-{0:0}, at: process_one_work+0x1c7/0x5a0 [ 9057.666260] #1: ffffabb3451ffe70 ((work_completion)(&work->normal_work)){+.+.}-{0:0}, at: process_one_work+0x1c7/0x5a0 [ 9057.667639] #2: ffff8e25922da198 (&ret->mutex){+.+.}-{3:3}, at: scrub_handle_errored_block.isra.0+0x5d2/0x1640 [btrfs] [ 9057.669017] #3: ffff8e25582d4b70 (&fs_info->commit_root_sem){++++}-{3:3}, at: iterate_extent_inodes+0x10b/0x280 [btrfs] [ 9057.670408] stack backtrace: [ 9057.670976] CPU: 7 PID: 30781 Comm: kworker/u16:4 Not tainted 5.14.0-rc2-btrfs-next-93 #1 [ 9057.672030] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014 [ 9057.673492] Workqueue: btrfs-scrub btrfs_work_helper [btrfs] [ 9057.674258] Call Trace: [ 9057.674588] dump_stack_lvl+0x57/0x72 [ 9057.675083] check_noncircular+0xf3/0x110 [ 9057.675611] __lock_acquire+0x130e/0x2210 [ 9057.676132] lock_acquire+0xd7/0x310 [ 9057.676605] ? __btrfs_tree_read_lock+0x24/0x110 [btrfs] [ 9057.677313] ? lock_is_held_type+0xe8/0x140 [ 9057.677849] down_read_nested+0x4b/0x140 [ 9057.678349] ? __btrfs_tree_read_lock+0x24/0x110 [btrfs] [ 9057.679068] __btrfs_tree_read_lock+0x24/0x110 [btrfs] [ 9057.679760] btrfs_read_lock_root_node+0x31/0x40 [btrfs] [ 9057.680458] btrfs_search_slot+0x537/0xc00 [btrfs] [ 9057.681083] ? _raw_spin_unlock+0x29/0x40 [ 9057.681594] ? btrfs_find_all_roots_safe+0x11f/0x140 [btrfs] [ 9057.682336] scrub_print_warning_inode+0x89/0x370 [btrfs] [ 9057.683058] ? btrfs_find_all_roots_safe+0x11f/0x140 [btrfs] [ 9057.683834] ? scrub_write_block_to_dev_replace+0xb0/0xb0 [btrfs] [ 9057.684632] iterate_extent_inodes+0x1e3/0x280 [btrfs] [ 9057.685316] scrub_print_warning+0x15d/0x2f0 [btrfs] [ 9057.685977] ? ___ratelimit+0xa4/0x110 [ 9057.686460] scrub_handle_errored_block.isra.0+0x135f/0x1640 [btrfs] [ 9057.687316] scrub_bio_end_io_worker+0x101/0x2e0 [btrfs] [ 9057.688021] btrfs_work_helper+0xf8/0x400 [btrfs] [ 9057.688649] ? lock_is_held_type+0xe8/0x140 [ 9057.689180] process_one_work+0x247/0x5a0 [ 9057.689696] worker_thread+0x55/0x3c0 [ 9057.690175] ? process_one_work+0x5a0/0x5a0 [ 9057.690731] kthread+0x155/0x180 [ 9057.691158] ? set_kthread_struct+0x40/0x40 [ 9057.691697] ret_from_fork+0x22/0x30
Fix this by making btrfs_find_all_roots() never attempt to lock the commit_root_sem when it is called from btrfs_qgroup_trace_extent_post().
We can't just pass a non-NULL transaction handle to btrfs_find_all_roots() from btrfs_qgroup_trace_extent_post(), because that would make backref lookup not use commit roots and acquire read locks on extent buffers, and therefore could deadlock when btrfs_qgroup_trace_extent_post() is called from the btrfs_truncate_inode_items() code path which has acquired a write lock on an extent buffer of the subvolume btree.
CC: stable@vger.kernel.org # 4.19+ Reviewed-by: Qu Wenruo wqu@suse.com Signed-off-by: Filipe Manana fdmanana@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/btrfs/backref.c | 6 +++--- fs/btrfs/backref.h | 3 ++- fs/btrfs/delayed-ref.c | 4 ++-- fs/btrfs/qgroup.c | 38 ++++++++++++++++++++++++++++++-------- fs/btrfs/qgroup.h | 2 +- fs/btrfs/tests/qgroup-tests.c | 20 ++++++++++---------- 6 files changed, 48 insertions(+), 25 deletions(-)
--- a/fs/btrfs/backref.c +++ b/fs/btrfs/backref.c @@ -1488,15 +1488,15 @@ static int btrfs_find_all_roots_safe(str int btrfs_find_all_roots(struct btrfs_trans_handle *trans, struct btrfs_fs_info *fs_info, u64 bytenr, u64 time_seq, struct ulist **roots, - bool ignore_offset) + bool ignore_offset, bool skip_commit_root_sem) { int ret;
- if (!trans) + if (!trans && !skip_commit_root_sem) down_read(&fs_info->commit_root_sem); ret = btrfs_find_all_roots_safe(trans, fs_info, bytenr, time_seq, roots, ignore_offset); - if (!trans) + if (!trans && !skip_commit_root_sem) up_read(&fs_info->commit_root_sem); return ret; } --- a/fs/btrfs/backref.h +++ b/fs/btrfs/backref.h @@ -47,7 +47,8 @@ int btrfs_find_all_leafs(struct btrfs_tr const u64 *extent_item_pos, bool ignore_offset); int btrfs_find_all_roots(struct btrfs_trans_handle *trans, struct btrfs_fs_info *fs_info, u64 bytenr, - u64 time_seq, struct ulist **roots, bool ignore_offset); + u64 time_seq, struct ulist **roots, bool ignore_offset, + bool skip_commit_root_sem); char *btrfs_ref_to_path(struct btrfs_root *fs_root, struct btrfs_path *path, u32 name_len, unsigned long name_off, struct extent_buffer *eb_in, u64 parent, --- a/fs/btrfs/delayed-ref.c +++ b/fs/btrfs/delayed-ref.c @@ -1000,7 +1000,7 @@ int btrfs_add_delayed_tree_ref(struct bt kmem_cache_free(btrfs_delayed_tree_ref_cachep, ref);
if (qrecord_inserted) - btrfs_qgroup_trace_extent_post(fs_info, record); + btrfs_qgroup_trace_extent_post(trans, record);
return 0; } @@ -1095,7 +1095,7 @@ int btrfs_add_delayed_data_ref(struct bt
if (qrecord_inserted) - return btrfs_qgroup_trace_extent_post(fs_info, record); + return btrfs_qgroup_trace_extent_post(trans, record); return 0; }
--- a/fs/btrfs/qgroup.c +++ b/fs/btrfs/qgroup.c @@ -1704,17 +1704,39 @@ int btrfs_qgroup_trace_extent_nolock(str return 0; }
-int btrfs_qgroup_trace_extent_post(struct btrfs_fs_info *fs_info, +int btrfs_qgroup_trace_extent_post(struct btrfs_trans_handle *trans, struct btrfs_qgroup_extent_record *qrecord) { struct ulist *old_root; u64 bytenr = qrecord->bytenr; int ret;
- ret = btrfs_find_all_roots(NULL, fs_info, bytenr, 0, &old_root, false); + /* + * We are always called in a context where we are already holding a + * transaction handle. Often we are called when adding a data delayed + * reference from btrfs_truncate_inode_items() (truncating or unlinking), + * in which case we will be holding a write lock on extent buffer from a + * subvolume tree. In this case we can't allow btrfs_find_all_roots() to + * acquire fs_info->commit_root_sem, because that is a higher level lock + * that must be acquired before locking any extent buffers. + * + * So we want btrfs_find_all_roots() to not acquire the commit_root_sem + * but we can't pass it a non-NULL transaction handle, because otherwise + * it would not use commit roots and would lock extent buffers, causing + * a deadlock if it ends up trying to read lock the same extent buffer + * that was previously write locked at btrfs_truncate_inode_items(). + * + * So pass a NULL transaction handle to btrfs_find_all_roots() and + * explicitly tell it to not acquire the commit_root_sem - if we are + * holding a transaction handle we don't need its protection. + */ + ASSERT(trans != NULL); + + ret = btrfs_find_all_roots(NULL, trans->fs_info, bytenr, 0, &old_root, + false, true); if (ret < 0) { - fs_info->qgroup_flags |= BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT; - btrfs_warn(fs_info, + trans->fs_info->qgroup_flags |= BTRFS_QGROUP_STATUS_FLAG_INCONSISTENT; + btrfs_warn(trans->fs_info, "error accounting new delayed refs extent (err code: %d), quota inconsistent", ret); return 0; @@ -1758,7 +1780,7 @@ int btrfs_qgroup_trace_extent(struct btr kfree(record); return 0; } - return btrfs_qgroup_trace_extent_post(fs_info, record); + return btrfs_qgroup_trace_extent_post(trans, record); }
int btrfs_qgroup_trace_leaf_items(struct btrfs_trans_handle *trans, @@ -2629,7 +2651,7 @@ int btrfs_qgroup_account_extents(struct /* Search commit root to find old_roots */ ret = btrfs_find_all_roots(NULL, fs_info, record->bytenr, 0, - &record->old_roots, false); + &record->old_roots, false, false); if (ret < 0) goto cleanup; } @@ -2645,7 +2667,7 @@ int btrfs_qgroup_account_extents(struct * current root. It's safe inside commit_transaction(). */ ret = btrfs_find_all_roots(trans, fs_info, - record->bytenr, BTRFS_SEQ_LAST, &new_roots, false); + record->bytenr, BTRFS_SEQ_LAST, &new_roots, false, false); if (ret < 0) goto cleanup; if (qgroup_to_skip) { @@ -3179,7 +3201,7 @@ static int qgroup_rescan_leaf(struct btr num_bytes = found.offset;
ret = btrfs_find_all_roots(NULL, fs_info, found.objectid, 0, - &roots, false); + &roots, false, false); if (ret < 0) goto out; /* For rescan, just pass old_roots as NULL */ --- a/fs/btrfs/qgroup.h +++ b/fs/btrfs/qgroup.h @@ -298,7 +298,7 @@ int btrfs_qgroup_trace_extent_nolock( * using current root, then we can move all expensive backref walk out of * transaction committing, but not now as qgroup accounting will be wrong again. */ -int btrfs_qgroup_trace_extent_post(struct btrfs_fs_info *fs_info, +int btrfs_qgroup_trace_extent_post(struct btrfs_trans_handle *trans, struct btrfs_qgroup_extent_record *qrecord);
/* --- a/fs/btrfs/tests/qgroup-tests.c +++ b/fs/btrfs/tests/qgroup-tests.c @@ -224,7 +224,7 @@ static int test_no_shared_qgroup(struct * quota. */ ret = btrfs_find_all_roots(&trans, fs_info, nodesize, 0, &old_roots, - false); + false, false); if (ret) { ulist_free(old_roots); test_err("couldn't find old roots: %d", ret); @@ -237,7 +237,7 @@ static int test_no_shared_qgroup(struct return ret;
ret = btrfs_find_all_roots(&trans, fs_info, nodesize, 0, &new_roots, - false); + false, false); if (ret) { ulist_free(old_roots); ulist_free(new_roots); @@ -261,7 +261,7 @@ static int test_no_shared_qgroup(struct new_roots = NULL;
ret = btrfs_find_all_roots(&trans, fs_info, nodesize, 0, &old_roots, - false); + false, false); if (ret) { ulist_free(old_roots); test_err("couldn't find old roots: %d", ret); @@ -273,7 +273,7 @@ static int test_no_shared_qgroup(struct return -EINVAL;
ret = btrfs_find_all_roots(&trans, fs_info, nodesize, 0, &new_roots, - false); + false, false); if (ret) { ulist_free(old_roots); ulist_free(new_roots); @@ -325,7 +325,7 @@ static int test_multiple_refs(struct btr }
ret = btrfs_find_all_roots(&trans, fs_info, nodesize, 0, &old_roots, - false); + false, false); if (ret) { ulist_free(old_roots); test_err("couldn't find old roots: %d", ret); @@ -338,7 +338,7 @@ static int test_multiple_refs(struct btr return ret;
ret = btrfs_find_all_roots(&trans, fs_info, nodesize, 0, &new_roots, - false); + false, false); if (ret) { ulist_free(old_roots); ulist_free(new_roots); @@ -360,7 +360,7 @@ static int test_multiple_refs(struct btr }
ret = btrfs_find_all_roots(&trans, fs_info, nodesize, 0, &old_roots, - false); + false, false); if (ret) { ulist_free(old_roots); test_err("couldn't find old roots: %d", ret); @@ -373,7 +373,7 @@ static int test_multiple_refs(struct btr return ret;
ret = btrfs_find_all_roots(&trans, fs_info, nodesize, 0, &new_roots, - false); + false, false); if (ret) { ulist_free(old_roots); ulist_free(new_roots); @@ -401,7 +401,7 @@ static int test_multiple_refs(struct btr }
ret = btrfs_find_all_roots(&trans, fs_info, nodesize, 0, &old_roots, - false); + false, false); if (ret) { ulist_free(old_roots); test_err("couldn't find old roots: %d", ret); @@ -414,7 +414,7 @@ static int test_multiple_refs(struct btr return ret;
ret = btrfs_find_all_roots(&trans, fs_info, nodesize, 0, &new_roots, - false); + false, false); if (ret) { ulist_free(old_roots); ulist_free(new_roots);
From: Gustavo A. R. Silva gustavoars@kernel.org
commit 8d4abca95ecc82fc8c41912fa0085281f19cc29f upstream.
Fix an 11-year old bug in ngene_command_config_free_buf() while addressing the following warnings caught with -Warray-bounds:
arch/alpha/include/asm/string.h:22:16: warning: '__builtin_memcpy' offset [12, 16] from the object at 'com' is out of the bounds of referenced subobject 'config' with type 'unsigned char' at offset 10 [-Warray-bounds] arch/x86/include/asm/string_32.h:182:25: warning: '__builtin_memcpy' offset [12, 16] from the object at 'com' is out of the bounds of referenced subobject 'config' with type 'unsigned char' at offset 10 [-Warray-bounds]
The problem is that the original code is trying to copy 6 bytes of data into a one-byte size member _config_ of the wrong structue FW_CONFIGURE_BUFFERS, in a single call to memcpy(). This causes a legitimate compiler warning because memcpy() overruns the length of &com.cmd.ConfigureBuffers.config. It seems that the right structure is FW_CONFIGURE_FREE_BUFFERS, instead, because it contains 6 more members apart from the header _hdr_. Also, the name of the function ngene_command_config_free_buf() suggests that the actual intention is to ConfigureFreeBuffers, instead of ConfigureBuffers (which takes place in the function ngene_command_config_buf(), above).
Fix this by enclosing those 6 members of struct FW_CONFIGURE_FREE_BUFFERS into new struct config, and use &com.cmd.ConfigureFreeBuffers.config as the destination address, instead of &com.cmd.ConfigureBuffers.config, when calling memcpy().
This also helps with the ongoing efforts to globally enable -Warray-bounds and get us closer to being able to tighten the FORTIFY_SOURCE routines on memcpy().
Link: https://github.com/KSPP/linux/issues/109 Fixes: dae52d009fc9 ("V4L/DVB: ngene: Initial check-in") Cc: stable@vger.kernel.org Reported-by: kernel test robot lkp@intel.com Reviewed-by: Kees Cook keescook@chromium.org Signed-off-by: Gustavo A. R. Silva gustavoars@kernel.org Link: https://lore.kernel.org/linux-hardening/20210420001631.GA45456@embeddedor/ Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/media/pci/ngene/ngene-core.c | 2 +- drivers/media/pci/ngene/ngene.h | 14 ++++++++------ 2 files changed, 9 insertions(+), 7 deletions(-)
--- a/drivers/media/pci/ngene/ngene-core.c +++ b/drivers/media/pci/ngene/ngene-core.c @@ -385,7 +385,7 @@ static int ngene_command_config_free_buf
com.cmd.hdr.Opcode = CMD_CONFIGURE_FREE_BUFFER; com.cmd.hdr.Length = 6; - memcpy(&com.cmd.ConfigureBuffers.config, config, 6); + memcpy(&com.cmd.ConfigureFreeBuffers.config, config, 6); com.in_len = 6; com.out_len = 0;
--- a/drivers/media/pci/ngene/ngene.h +++ b/drivers/media/pci/ngene/ngene.h @@ -407,12 +407,14 @@ enum _BUFFER_CONFIGS {
struct FW_CONFIGURE_FREE_BUFFERS { struct FW_HEADER hdr; - u8 UVI1_BufferLength; - u8 UVI2_BufferLength; - u8 TVO_BufferLength; - u8 AUD1_BufferLength; - u8 AUD2_BufferLength; - u8 TVA_BufferLength; + struct { + u8 UVI1_BufferLength; + u8 UVI2_BufferLength; + u8 TVO_BufferLength; + u8 AUD1_BufferLength; + u8 AUD2_BufferLength; + u8 TVA_BufferLength; + } __packed config; } __attribute__ ((__packed__));
struct FW_CONFIGURE_UART {
From: Markus Boehme markubo@amazon.com
commit 09cfae9f13d51700b0fecf591dcd658fc5375428 upstream.
When receiving a packet with multiple fragments, hardware may still touch the first fragment until the entire packet has been received. The driver therefore keeps the first fragment mapped for DMA until end of packet has been asserted, and delays its dma_sync call until then.
The driver tries to fit multiple receive buffers on one page. When using 3K receive buffers (e.g. using Jumbo frames and legacy-rx is turned off/build_skb is being used) on an architecture with 4K pages, the driver allocates an order 1 compound page and uses one page per receive buffer. To determine the correct offset for a delayed DMA sync of the first fragment of a multi-fragment packet, the driver then cannot just use PAGE_MASK on the DMA address but has to construct a mask based on the actual size of the backing page.
Using PAGE_MASK in the 3K RX buffer/4K page architecture configuration will always sync the first page of a compound page. With the SWIOTLB enabled this can lead to corrupted packets (zeroed out first fragment, re-used garbage from another packet) and various consequences, such as slow/stalling data transfers and connection resets. For example, testing on a link with MTU exceeding 3058 bytes on a host with SWIOTLB enabled (e.g. "iommu=soft swiotlb=262144,force") TCP transfers quickly fizzle out without this patch.
Cc: stable@vger.kernel.org Fixes: 0c5661ecc5dd7 ("ixgbe: fix crash in build_skb Rx code path") Signed-off-by: Markus Boehme markubo@amazon.com Tested-by: Tony Brelinski tonyx.brelinski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/intel/ixgbe/ixgbe_main.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c @@ -1825,7 +1825,8 @@ static void ixgbe_dma_sync_frag(struct i struct sk_buff *skb) { if (ring_uses_build_skb(rx_ring)) { - unsigned long offset = (unsigned long)(skb->data) & ~PAGE_MASK; + unsigned long mask = (unsigned long)ixgbe_rx_pg_size(rx_ring) - 1; + unsigned long offset = (unsigned long)(skb->data) & mask;
dma_sync_single_range_for_cpu(rx_ring->dev, IXGBE_CB(skb)->dma,
From: Peter Ujfalusi peter.ujfalusi@linux.intel.com
commit 4afa0c22eed33cfe0c590742387f0d16f32412f3 upstream.
If driver_register() returns with error we need to free the memory allocated for auxdrv->driver.name before returning from __auxiliary_driver_register()
Fixes: 7de3697e9cbd4 ("Add auxiliary bus support") Reviewed-by: Dan Williams dan.j.williams@intel.com Cc: stable stable@vger.kernel.org Signed-off-by: Peter Ujfalusi peter.ujfalusi@linux.intel.com Link: https://lore.kernel.org/r/20210713093438.3173-1-peter.ujfalusi@linux.intel.c... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/base/auxiliary.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/drivers/base/auxiliary.c +++ b/drivers/base/auxiliary.c @@ -231,6 +231,8 @@ EXPORT_SYMBOL_GPL(auxiliary_find_device) int __auxiliary_driver_register(struct auxiliary_driver *auxdrv, struct module *owner, const char *modname) { + int ret; + if (WARN_ON(!auxdrv->probe) || WARN_ON(!auxdrv->id_table)) return -EINVAL;
@@ -246,7 +248,11 @@ int __auxiliary_driver_register(struct a auxdrv->driver.bus = &auxiliary_bus_type; auxdrv->driver.mod_name = modname;
- return driver_register(&auxdrv->driver); + ret = driver_register(&auxdrv->driver); + if (ret) + kfree(auxdrv->driver.name); + + return ret; } EXPORT_SYMBOL_GPL(__auxiliary_driver_register);
From: Bhaumik Bhatt bbhatt@codeaurora.org
commit 56f6f4c4eb2a710ec8878dd9373d3d2b2eb75f5c upstream.
Devices such as SDX24 do not have the provision for inband wake doorbell in the form of channel 127 and instead have a sideband GPIO for it. Newer devices such as SDX55 or SDX65 support inband wake method by default. Ensure the functionality is used based on this such that device wake stays held when a client driver uses mhi_device_get() API or the equivalent debugfs entry.
Link: https://lore.kernel.org/r/1624560809-30610-1-git-send-email-bbhatt@codeauror... Fixes: e3e5e6508fc1 ("bus: mhi: pci_generic: No-Op for device_wake operations") Cc: stable@vger.kernel.org #5.12 Reviewed-by: Manivannan Sadhasivam manivannan.sadhasivam@linaro.org Signed-off-by: Bhaumik Bhatt bbhatt@codeaurora.org Signed-off-by: Manivannan Sadhasivam manivannan.sadhasivam@linaro.org Link: https://lore.kernel.org/r/20210716075106.49938-2-manivannan.sadhasivam@linar... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/bus/mhi/pci_generic.c | 27 +++++++++++++++++++-------- 1 file changed, 19 insertions(+), 8 deletions(-)
--- a/drivers/bus/mhi/pci_generic.c +++ b/drivers/bus/mhi/pci_generic.c @@ -32,6 +32,8 @@ * @edl: emergency download mode firmware path (if any) * @bar_num: PCI base address register to use for MHI MMIO register space * @dma_data_width: DMA transfer word size (32 or 64 bits) + * @sideband_wake: Devices using dedicated sideband GPIO for wakeup instead + * of inband wake support (such as sdx24) */ struct mhi_pci_dev_info { const struct mhi_controller_config *config; @@ -40,6 +42,7 @@ struct mhi_pci_dev_info { const char *edl; unsigned int bar_num; unsigned int dma_data_width; + bool sideband_wake; };
#define MHI_CHANNEL_CONFIG_UL(ch_num, ch_name, el_count, ev_ring) \ @@ -242,7 +245,8 @@ static const struct mhi_pci_dev_info mhi .edl = "qcom/sdx65m/edl.mbn", .config = &modem_qcom_v1_mhiv_config, .bar_num = MHI_PCI_DEFAULT_BAR_NUM, - .dma_data_width = 32 + .dma_data_width = 32, + .sideband_wake = false, };
static const struct mhi_pci_dev_info mhi_qcom_sdx55_info = { @@ -251,7 +255,8 @@ static const struct mhi_pci_dev_info mhi .edl = "qcom/sdx55m/edl.mbn", .config = &modem_qcom_v1_mhiv_config, .bar_num = MHI_PCI_DEFAULT_BAR_NUM, - .dma_data_width = 32 + .dma_data_width = 32, + .sideband_wake = false, };
static const struct mhi_pci_dev_info mhi_qcom_sdx24_info = { @@ -259,7 +264,8 @@ static const struct mhi_pci_dev_info mhi .edl = "qcom/prog_firehose_sdx24.mbn", .config = &modem_qcom_v1_mhiv_config, .bar_num = MHI_PCI_DEFAULT_BAR_NUM, - .dma_data_width = 32 + .dma_data_width = 32, + .sideband_wake = true, };
static const struct mhi_channel_config mhi_quectel_em1xx_channels[] = { @@ -301,7 +307,8 @@ static const struct mhi_pci_dev_info mhi .edl = "qcom/prog_firehose_sdx24.mbn", .config = &modem_quectel_em1xx_config, .bar_num = MHI_PCI_DEFAULT_BAR_NUM, - .dma_data_width = 32 + .dma_data_width = 32, + .sideband_wake = true, };
static const struct mhi_channel_config mhi_foxconn_sdx55_channels[] = { @@ -339,7 +346,8 @@ static const struct mhi_pci_dev_info mhi .edl = "qcom/sdx55m/edl.mbn", .config = &modem_foxconn_sdx55_config, .bar_num = MHI_PCI_DEFAULT_BAR_NUM, - .dma_data_width = 32 + .dma_data_width = 32, + .sideband_wake = false, };
static const struct pci_device_id mhi_pci_id_table[] = { @@ -640,9 +648,12 @@ static int mhi_pci_probe(struct pci_dev mhi_cntrl->status_cb = mhi_pci_status_cb; mhi_cntrl->runtime_get = mhi_pci_runtime_get; mhi_cntrl->runtime_put = mhi_pci_runtime_put; - mhi_cntrl->wake_get = mhi_pci_wake_get_nop; - mhi_cntrl->wake_put = mhi_pci_wake_put_nop; - mhi_cntrl->wake_toggle = mhi_pci_wake_toggle_nop; + + if (info->sideband_wake) { + mhi_cntrl->wake_get = mhi_pci_wake_get_nop; + mhi_cntrl->wake_put = mhi_pci_wake_put_nop; + mhi_cntrl->wake_toggle = mhi_pci_wake_toggle_nop; + }
err = mhi_pci_claim(mhi_cntrl, info->bar_num, DMA_BIT_MASK(info->dma_data_width)); if (err)
From: Bhaumik Bhatt bbhatt@codeaurora.org
commit 546362a9ef2ef40b57c6605f14e88ced507f8dd0 upstream.
MHI reads the channel ID from the event ring element sent by the device which can be any value between 0 and 255. In order to prevent any out of bound accesses, add a check against the maximum number of channels supported by the controller and those channels not configured yet so as to skip processing of that event ring element.
Link: https://lore.kernel.org/r/1624558141-11045-1-git-send-email-bbhatt@codeauror... Fixes: 1d3173a3bae7 ("bus: mhi: core: Add support for processing events from client device") Cc: stable@vger.kernel.org #5.10 Reviewed-by: Hemant Kumar hemantk@codeaurora.org Reviewed-by: Manivannan Sadhasivam manivannan.sadhasivam@linaro.org Reviewed-by: Jeffrey Hugo quic_jhugo@quicinc.com Signed-off-by: Bhaumik Bhatt bbhatt@codeaurora.org Signed-off-by: Manivannan Sadhasivam manivannan.sadhasivam@linaro.org Link: https://lore.kernel.org/r/20210716075106.49938-3-manivannan.sadhasivam@linar... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/bus/mhi/core/main.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-)
--- a/drivers/bus/mhi/core/main.c +++ b/drivers/bus/mhi/core/main.c @@ -773,11 +773,18 @@ static void mhi_process_cmd_completion(s cmd_pkt = mhi_to_virtual(mhi_ring, ptr);
chan = MHI_TRE_GET_CMD_CHID(cmd_pkt); - mhi_chan = &mhi_cntrl->mhi_chan[chan]; - write_lock_bh(&mhi_chan->lock); - mhi_chan->ccs = MHI_TRE_GET_EV_CODE(tre); - complete(&mhi_chan->completion); - write_unlock_bh(&mhi_chan->lock); + + if (chan < mhi_cntrl->max_chan && + mhi_cntrl->mhi_chan[chan].configured) { + mhi_chan = &mhi_cntrl->mhi_chan[chan]; + write_lock_bh(&mhi_chan->lock); + mhi_chan->ccs = MHI_TRE_GET_EV_CODE(tre); + complete(&mhi_chan->completion); + write_unlock_bh(&mhi_chan->lock); + } else { + dev_err(&mhi_cntrl->mhi_dev->dev, + "Completion packet for invalid channel ID: %d\n", chan); + }
mhi_del_ring_element(mhi_cntrl, mhi_ring); }
From: Loic Poulain loic.poulain@linaro.org
commit b8a97f2a65388394f433bf0730293a94f7d49046 upstream.
The qrtr-mhi client driver assumes that inbound buffers are automatically allocated and queued by the MHI core, but this doesn't happen for mhi pci devices since IPCR inbound channel is not flagged with auto_queue, causing unusable IPCR (qrtr) feature. Fix that.
Link: https://lore.kernel.org/r/1625736749-24947-1-git-send-email-loic.poulain@lin... [mani: fixed a spelling mistake in commit description] Fixes: 855a70c12021 ("bus: mhi: Add MHI PCI support for WWAN modems") Cc: stable@vger.kernel.org #5.10 Reviewed-by: Hemant kumar hemantk@codeaurora.org Reviewed-by: Manivannan Sadhasivam mani@kernel.org Signed-off-by: Loic Poulain loic.poulain@linaro.org Signed-off-by: Manivannan Sadhasivam manivannan.sadhasivam@linaro.org Link: https://lore.kernel.org/r/20210716075106.49938-4-manivannan.sadhasivam@linar... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/bus/mhi/pci_generic.c | 18 +++++++++++++++++- 1 file changed, 17 insertions(+), 1 deletion(-)
--- a/drivers/bus/mhi/pci_generic.c +++ b/drivers/bus/mhi/pci_generic.c @@ -75,6 +75,22 @@ struct mhi_pci_dev_info { .doorbell_mode_switch = false, \ }
+#define MHI_CHANNEL_CONFIG_DL_AUTOQUEUE(ch_num, ch_name, el_count, ev_ring) \ + { \ + .num = ch_num, \ + .name = ch_name, \ + .num_elements = el_count, \ + .event_ring = ev_ring, \ + .dir = DMA_FROM_DEVICE, \ + .ee_mask = BIT(MHI_EE_AMSS), \ + .pollcfg = 0, \ + .doorbell = MHI_DB_BRST_DISABLE, \ + .lpm_notify = false, \ + .offload_channel = false, \ + .doorbell_mode_switch = false, \ + .auto_queue = true, \ + } + #define MHI_EVENT_CONFIG_CTRL(ev_ring, el_count) \ { \ .num_elements = el_count, \ @@ -213,7 +229,7 @@ static const struct mhi_channel_config m MHI_CHANNEL_CONFIG_UL(14, "QMI", 4, 0), MHI_CHANNEL_CONFIG_DL(15, "QMI", 4, 0), MHI_CHANNEL_CONFIG_UL(20, "IPCR", 8, 0), - MHI_CHANNEL_CONFIG_DL(21, "IPCR", 8, 0), + MHI_CHANNEL_CONFIG_DL_AUTOQUEUE(21, "IPCR", 8, 0), MHI_CHANNEL_CONFIG_UL_FP(34, "FIREHOSE", 32, 0), MHI_CHANNEL_CONFIG_DL_FP(35, "FIREHOSE", 32, 0), MHI_CHANNEL_CONFIG_HW_UL(100, "IP_HW0", 128, 2),
From: Frederic Weisbecker frederic@kernel.org
commit 1a3402d93c73bf6bb4df6d7c2aac35abfc3c50e2 upstream.
Since the process wide cputime counter is started locklessly from posix_cpu_timer_rearm(), it can be concurrently stopped by operations on other timers from the same thread group, such as in the following unlucky scenario:
CPU 0 CPU 1 ----- ----- timer_settime(TIMER B) posix_cpu_timer_rearm(TIMER A) cpu_clock_sample_group() (pct->timers_active already true)
handle_posix_cpu_timers() check_process_timers() stop_process_timers() pct->timers_active = false arm_timer(TIMER A)
tick -> run_posix_cpu_timers() // sees !pct->timers_active, ignore // our TIMER A
Fix this with simply locking process wide cputime counting start and timer arm in the same block.
Acked-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Frederic Weisbecker frederic@kernel.org Fixes: 60f2ceaa8111 ("posix-cpu-timers: Remove unnecessary locking around cpu_clock_sample_group") Cc: stable@vger.kernel.org Cc: Oleg Nesterov oleg@redhat.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Ingo Molnar mingo@kernel.org Cc: Eric W. Biederman ebiederm@xmission.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/time/posix-cpu-timers.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
--- a/kernel/time/posix-cpu-timers.c +++ b/kernel/time/posix-cpu-timers.c @@ -991,6 +991,11 @@ static void posix_cpu_timer_rearm(struct if (!p) goto out;
+ /* Protect timer list r/w in arm_timer() */ + sighand = lock_task_sighand(p, &flags); + if (unlikely(sighand == NULL)) + goto out; + /* * Fetch the current sample and update the timer's expiry time. */ @@ -1001,11 +1006,6 @@ static void posix_cpu_timer_rearm(struct
bump_cpu_timer(timer, now);
- /* Protect timer list r/w in arm_timer() */ - sighand = lock_task_sighand(p, &flags); - if (unlikely(sighand == NULL)) - goto out; - /* * Now re-arm for the new expiry time. */
From: Peter Collingbourne pcc@google.com
commit 0db282ba2c12c1515d490d14a1ff696643ab0f1b upstream.
This test passes pointers obtained from anon_allocate_area to the userfaultfd and mremap APIs. This causes a problem if the system allocator returns tagged pointers because with the tagged address ABI the kernel rejects tagged addresses passed to these APIs, which would end up causing the test to fail. To make this test compatible with such system allocators, stop using the system allocator to allocate memory in anon_allocate_area, and instead just use mmap.
Link: https://lkml.kernel.org/r/20210714195437.118982-3-pcc@google.com Link: https://linux-review.googlesource.com/id/Icac91064fcd923f77a83e8e133f8631c5b... Fixes: c47174fc362a ("userfaultfd: selftest") Co-developed-by: Lokesh Gidra lokeshgidra@google.com Signed-off-by: Lokesh Gidra lokeshgidra@google.com Signed-off-by: Peter Collingbourne pcc@google.com Reviewed-by: Catalin Marinas catalin.marinas@arm.com Cc: Vincenzo Frascino vincenzo.frascino@arm.com Cc: Dave Martin Dave.Martin@arm.com Cc: Will Deacon will@kernel.org Cc: Andrea Arcangeli aarcange@redhat.com Cc: Alistair Delva adelva@google.com Cc: William McVicker willmcvicker@google.com Cc: Evgenii Stepanov eugenis@google.com Cc: Mitch Phillips mitchp@google.com Cc: Andrey Konovalov andreyknvl@gmail.com Cc: stable@vger.kernel.org [5.4] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/vm/userfaultfd.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/tools/testing/selftests/vm/userfaultfd.c +++ b/tools/testing/selftests/vm/userfaultfd.c @@ -197,8 +197,10 @@ static int anon_release_pages(char *rel_
static void anon_allocate_area(void **alloc_area) { - if (posix_memalign(alloc_area, page_size, nr_pages * page_size)) { - fprintf(stderr, "out of memory\n"); + *alloc_area = mmap(NULL, nr_pages * page_size, PROT_READ | PROT_WRITE, + MAP_ANONYMOUS | MAP_PRIVATE, -1, 0); + if (*alloc_area == MAP_FAILED) + fprintf(stderr, "mmap of anonymous memory failed"); *alloc_area = NULL; } }
On Mon, Jul 26, 2021 at 9:16 AM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
From: Peter Collingbourne pcc@google.com
commit 0db282ba2c12c1515d490d14a1ff696643ab0f1b upstream.
This test passes pointers obtained from anon_allocate_area to the userfaultfd and mremap APIs. This causes a problem if the system allocator returns tagged pointers because with the tagged address ABI the kernel rejects tagged addresses passed to these APIs, which would end up causing the test to fail. To make this test compatible with such system allocators, stop using the system allocator to allocate memory in anon_allocate_area, and instead just use mmap.
Link: https://lkml.kernel.org/r/20210714195437.118982-3-pcc@google.com Link: https://linux-review.googlesource.com/id/Icac91064fcd923f77a83e8e133f8631c5b... Fixes: c47174fc362a ("userfaultfd: selftest") Co-developed-by: Lokesh Gidra lokeshgidra@google.com Signed-off-by: Lokesh Gidra lokeshgidra@google.com Signed-off-by: Peter Collingbourne pcc@google.com Reviewed-by: Catalin Marinas catalin.marinas@arm.com Cc: Vincenzo Frascino vincenzo.frascino@arm.com Cc: Dave Martin Dave.Martin@arm.com Cc: Will Deacon will@kernel.org Cc: Andrea Arcangeli aarcange@redhat.com Cc: Alistair Delva adelva@google.com Cc: William McVicker willmcvicker@google.com Cc: Evgenii Stepanov eugenis@google.com Cc: Mitch Phillips mitchp@google.com Cc: Andrey Konovalov andreyknvl@gmail.com Cc: stable@vger.kernel.org [5.4] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
tools/testing/selftests/vm/userfaultfd.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/tools/testing/selftests/vm/userfaultfd.c +++ b/tools/testing/selftests/vm/userfaultfd.c @@ -197,8 +197,10 @@ static int anon_release_pages(char *rel_
static void anon_allocate_area(void **alloc_area) {
if (posix_memalign(alloc_area, page_size, nr_pages * page_size)) {
fprintf(stderr, "out of memory\n");
*alloc_area = mmap(NULL, nr_pages * page_size, PROT_READ | PROT_WRITE,
MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
if (*alloc_area == MAP_FAILED)
Hi Greg,
It looks like your backport of this patch (and the backports to stable kernels) are missing a left brace here.
Peter
fprintf(stderr, "mmap of anonymous memory failed"); *alloc_area = NULL; }
}
On Thu, Jul 29, 2021 at 10:58:11AM -0700, Peter Collingbourne wrote:
On Mon, Jul 26, 2021 at 9:16 AM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
From: Peter Collingbourne pcc@google.com
commit 0db282ba2c12c1515d490d14a1ff696643ab0f1b upstream.
This test passes pointers obtained from anon_allocate_area to the userfaultfd and mremap APIs. This causes a problem if the system allocator returns tagged pointers because with the tagged address ABI the kernel rejects tagged addresses passed to these APIs, which would end up causing the test to fail. To make this test compatible with such system allocators, stop using the system allocator to allocate memory in anon_allocate_area, and instead just use mmap.
Link: https://lkml.kernel.org/r/20210714195437.118982-3-pcc@google.com Link: https://linux-review.googlesource.com/id/Icac91064fcd923f77a83e8e133f8631c5b... Fixes: c47174fc362a ("userfaultfd: selftest") Co-developed-by: Lokesh Gidra lokeshgidra@google.com Signed-off-by: Lokesh Gidra lokeshgidra@google.com Signed-off-by: Peter Collingbourne pcc@google.com Reviewed-by: Catalin Marinas catalin.marinas@arm.com Cc: Vincenzo Frascino vincenzo.frascino@arm.com Cc: Dave Martin Dave.Martin@arm.com Cc: Will Deacon will@kernel.org Cc: Andrea Arcangeli aarcange@redhat.com Cc: Alistair Delva adelva@google.com Cc: William McVicker willmcvicker@google.com Cc: Evgenii Stepanov eugenis@google.com Cc: Mitch Phillips mitchp@google.com Cc: Andrey Konovalov andreyknvl@gmail.com Cc: stable@vger.kernel.org [5.4] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
tools/testing/selftests/vm/userfaultfd.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/tools/testing/selftests/vm/userfaultfd.c +++ b/tools/testing/selftests/vm/userfaultfd.c @@ -197,8 +197,10 @@ static int anon_release_pages(char *rel_
static void anon_allocate_area(void **alloc_area) {
if (posix_memalign(alloc_area, page_size, nr_pages * page_size)) {
fprintf(stderr, "out of memory\n");
*alloc_area = mmap(NULL, nr_pages * page_size, PROT_READ | PROT_WRITE,
MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
if (*alloc_area == MAP_FAILED)
Hi Greg,
It looks like your backport of this patch (and the backports to stable kernels) are missing a left brace here.
Already fixed up in the latest -rc releases, right?
thanks,
greg k-h
On Thu, Jul 29, 2021 at 9:38 PM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Thu, Jul 29, 2021 at 10:58:11AM -0700, Peter Collingbourne wrote:
On Mon, Jul 26, 2021 at 9:16 AM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
From: Peter Collingbourne pcc@google.com
commit 0db282ba2c12c1515d490d14a1ff696643ab0f1b upstream.
This test passes pointers obtained from anon_allocate_area to the userfaultfd and mremap APIs. This causes a problem if the system allocator returns tagged pointers because with the tagged address ABI the kernel rejects tagged addresses passed to these APIs, which would end up causing the test to fail. To make this test compatible with such system allocators, stop using the system allocator to allocate memory in anon_allocate_area, and instead just use mmap.
Link: https://lkml.kernel.org/r/20210714195437.118982-3-pcc@google.com Link: https://linux-review.googlesource.com/id/Icac91064fcd923f77a83e8e133f8631c5b... Fixes: c47174fc362a ("userfaultfd: selftest") Co-developed-by: Lokesh Gidra lokeshgidra@google.com Signed-off-by: Lokesh Gidra lokeshgidra@google.com Signed-off-by: Peter Collingbourne pcc@google.com Reviewed-by: Catalin Marinas catalin.marinas@arm.com Cc: Vincenzo Frascino vincenzo.frascino@arm.com Cc: Dave Martin Dave.Martin@arm.com Cc: Will Deacon will@kernel.org Cc: Andrea Arcangeli aarcange@redhat.com Cc: Alistair Delva adelva@google.com Cc: William McVicker willmcvicker@google.com Cc: Evgenii Stepanov eugenis@google.com Cc: Mitch Phillips mitchp@google.com Cc: Andrey Konovalov andreyknvl@gmail.com Cc: stable@vger.kernel.org [5.4] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
tools/testing/selftests/vm/userfaultfd.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/tools/testing/selftests/vm/userfaultfd.c +++ b/tools/testing/selftests/vm/userfaultfd.c @@ -197,8 +197,10 @@ static int anon_release_pages(char *rel_
static void anon_allocate_area(void **alloc_area) {
if (posix_memalign(alloc_area, page_size, nr_pages * page_size)) {
fprintf(stderr, "out of memory\n");
*alloc_area = mmap(NULL, nr_pages * page_size, PROT_READ | PROT_WRITE,
MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
if (*alloc_area == MAP_FAILED)
Hi Greg,
It looks like your backport of this patch (and the backports to stable kernels) are missing a left brace here.
Already fixed up in the latest -rc releases, right?
It looks like you fixed it on linux-4.19.y and linux-5.4.y, but not linux-4.14.y, linux-5.10.y or linux-5.13.y.
Peter
On Mon, Aug 02, 2021 at 01:10:47PM -0700, Peter Collingbourne wrote:
On Thu, Jul 29, 2021 at 9:38 PM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
On Thu, Jul 29, 2021 at 10:58:11AM -0700, Peter Collingbourne wrote:
On Mon, Jul 26, 2021 at 9:16 AM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
From: Peter Collingbourne pcc@google.com
commit 0db282ba2c12c1515d490d14a1ff696643ab0f1b upstream.
This test passes pointers obtained from anon_allocate_area to the userfaultfd and mremap APIs. This causes a problem if the system allocator returns tagged pointers because with the tagged address ABI the kernel rejects tagged addresses passed to these APIs, which would end up causing the test to fail. To make this test compatible with such system allocators, stop using the system allocator to allocate memory in anon_allocate_area, and instead just use mmap.
Link: https://lkml.kernel.org/r/20210714195437.118982-3-pcc@google.com Link: https://linux-review.googlesource.com/id/Icac91064fcd923f77a83e8e133f8631c5b... Fixes: c47174fc362a ("userfaultfd: selftest") Co-developed-by: Lokesh Gidra lokeshgidra@google.com Signed-off-by: Lokesh Gidra lokeshgidra@google.com Signed-off-by: Peter Collingbourne pcc@google.com Reviewed-by: Catalin Marinas catalin.marinas@arm.com Cc: Vincenzo Frascino vincenzo.frascino@arm.com Cc: Dave Martin Dave.Martin@arm.com Cc: Will Deacon will@kernel.org Cc: Andrea Arcangeli aarcange@redhat.com Cc: Alistair Delva adelva@google.com Cc: William McVicker willmcvicker@google.com Cc: Evgenii Stepanov eugenis@google.com Cc: Mitch Phillips mitchp@google.com Cc: Andrey Konovalov andreyknvl@gmail.com Cc: stable@vger.kernel.org [5.4] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
tools/testing/selftests/vm/userfaultfd.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/tools/testing/selftests/vm/userfaultfd.c +++ b/tools/testing/selftests/vm/userfaultfd.c @@ -197,8 +197,10 @@ static int anon_release_pages(char *rel_
static void anon_allocate_area(void **alloc_area) {
if (posix_memalign(alloc_area, page_size, nr_pages * page_size)) {
fprintf(stderr, "out of memory\n");
*alloc_area = mmap(NULL, nr_pages * page_size, PROT_READ | PROT_WRITE,
MAP_ANONYMOUS | MAP_PRIVATE, -1, 0);
if (*alloc_area == MAP_FAILED)
Hi Greg,
It looks like your backport of this patch (and the backports to stable kernels) are missing a left brace here.
Already fixed up in the latest -rc releases, right?
It looks like you fixed it on linux-4.19.y and linux-5.4.y, but not linux-4.14.y, linux-5.10.y or linux-5.13.y.
4.14 had not had a release with this in it yet (it is in the queue), I missed that I also broke this on 5.10 and 5.13 so will go add it there as well.
thanks!
greg k-h
From: Pavel Begunkov asml.silence@gmail.com
commit 68b11e8b1562986c134764433af64e97d30c9fc0 upstream.
If __io_queue_proc() fails to add a second poll entry, e.g. kmalloc() failed, but it goes on with a third waitqueue, it may succeed and overwrite the error status. Count the number of poll entries we added, so we can set pt->error to zero at the beginning and find out when the mentioned scenario happens.
Cc: stable@vger.kernel.org Fixes: 18bceab101add ("io_uring: allow POLL_ADD with double poll_wait() users") Signed-off-by: Pavel Begunkov asml.silence@gmail.com Link: https://lore.kernel.org/r/9d6b9e561f88bcc0163623b74a76c39f712151c3.162677445... Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/io_uring.c | 16 ++++++++++------ 1 file changed, 10 insertions(+), 6 deletions(-)
--- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -4805,6 +4805,7 @@ IO_NETOP_FN(recv); struct io_poll_table { struct poll_table_struct pt; struct io_kiocb *req; + int nr_entries; int error; };
@@ -5002,11 +5003,11 @@ static void __io_queue_proc(struct io_po struct io_kiocb *req = pt->req;
/* - * If poll->head is already set, it's because the file being polled - * uses multiple waitqueues for poll handling (eg one for read, one - * for write). Setup a separate io_poll_iocb if this happens. + * The file being polled uses multiple waitqueues for poll handling + * (e.g. one for read, one for write). Setup a separate io_poll_iocb + * if this happens. */ - if (unlikely(poll->head)) { + if (unlikely(pt->nr_entries)) { struct io_poll_iocb *poll_one = poll;
/* already have a 2nd entry, fail a third attempt */ @@ -5034,7 +5035,7 @@ static void __io_queue_proc(struct io_po *poll_ptr = poll; }
- pt->error = 0; + pt->nr_entries++; poll->head = head;
if (poll->events & EPOLLEXCLUSIVE) @@ -5112,9 +5113,12 @@ static __poll_t __io_arm_poll_handler(st
ipt->pt._key = mask; ipt->req = req; - ipt->error = -EINVAL; + ipt->error = 0; + ipt->nr_entries = 0;
mask = vfs_poll(req->file, &ipt->pt) & poll->events; + if (unlikely(!ipt->nr_entries) && !ipt->error) + ipt->error = -EINVAL;
spin_lock_irq(&ctx->completion_lock); if (likely(poll->head)) {
From: Pavel Begunkov asml.silence@gmail.com
commit 46fee9ab02cb24979bbe07631fc3ae95ae08aa3e upstream.
__io_queue_proc() can enqueue both poll entries and still fail afterwards, so the callers trying to cancel it should also try to remove the second poll entry (if any).
For example, it may leave the request alive referencing a io_uring context but not accessible for cancellation:
[ 282.599913][ T1620] task:iou-sqp-23145 state:D stack:28720 pid:23155 ppid: 8844 flags:0x00004004 [ 282.609927][ T1620] Call Trace: [ 282.613711][ T1620] __schedule+0x93a/0x26f0 [ 282.634647][ T1620] schedule+0xd3/0x270 [ 282.638874][ T1620] io_uring_cancel_generic+0x54d/0x890 [ 282.660346][ T1620] io_sq_thread+0xaac/0x1250 [ 282.696394][ T1620] ret_from_fork+0x1f/0x30
Cc: stable@vger.kernel.org Fixes: 18bceab101add ("io_uring: allow POLL_ADD with double poll_wait() users") Reported-and-tested-by: syzbot+ac957324022b7132accf@syzkaller.appspotmail.com Signed-off-by: Pavel Begunkov asml.silence@gmail.com Link: https://lore.kernel.org/r/0ec1228fc5eda4cb524eeda857da8efdc43c331c.162677445... Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/io_uring.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -5121,6 +5121,8 @@ static __poll_t __io_arm_poll_handler(st ipt->error = -EINVAL;
spin_lock_irq(&ctx->completion_lock); + if (ipt->error) + io_poll_remove_double(req); if (likely(poll->head)) { spin_lock(&poll->head->lock); if (unlikely(list_empty(&poll->wait.entry))) {
From: Jens Axboe axboe@kernel.dk
commit 0cc936f74bcacb039b7533aeac0a887dfc896bf6 upstream.
A previous commit shuffled some code around, and inadvertently used struct file after fdput() had been called on it. As we can't touch the file post fdput() dropping our reference, move the fdput() to after that has been done.
Cc: Pavel Begunkov asml.silence@gmail.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/io-uring/YPnqM0fY3nM5RdRI@zeniv-ca.linux.org.uk/ Fixes: f2a48dd09b8e ("io_uring: refactor io_sq_offload_create()") Reported-by: Al Viro viro@zeniv.linux.org.uk Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/io_uring.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -7953,9 +7953,11 @@ static int io_sq_offload_create(struct i f = fdget(p->wq_fd); if (!f.file) return -ENXIO; - fdput(f); - if (f.file->f_op != &io_uring_fops) + if (f.file->f_op != &io_uring_fops) { + fdput(f); return -EINVAL; + } + fdput(f); } if (ctx->flags & IORING_SETUP_SQPOLL) { struct task_struct *tsk;
From: Peter Collingbourne pcc@google.com
commit e71e2ace5721a8b921dca18b045069e7bb411277 upstream.
Patch series "userfaultfd: do not untag user pointers", v5.
If a user program uses userfaultfd on ranges of heap memory, it may end up passing a tagged pointer to the kernel in the range.start field of the UFFDIO_REGISTER ioctl. This can happen when using an MTE-capable allocator, or on Android if using the Tagged Pointers feature for MTE readiness [1].
When a fault subsequently occurs, the tag is stripped from the fault address returned to the application in the fault.address field of struct uffd_msg. However, from the application's perspective, the tagged address *is* the memory address, so if the application is unaware of memory tags, it may get confused by receiving an address that is, from its point of view, outside of the bounds of the allocation. We observed this behavior in the kselftest for userfaultfd [2] but other applications could have the same problem.
Address this by not untagging pointers passed to the userfaultfd ioctls. Instead, let the system call fail. Also change the kselftest to use mmap so that it doesn't encounter this problem.
[1] https://source.android.com/devices/tech/debug/tagged-pointers [2] tools/testing/selftests/vm/userfaultfd.c
This patch (of 2):
Do not untag pointers passed to the userfaultfd ioctls. Instead, let the system call fail. This will provide an early indication of problems with tag-unaware userspace code instead of letting the code get confused later, and is consistent with how we decided to handle brk/mmap/mremap in commit dcde237319e6 ("mm: Avoid creating virtual address aliases in brk()/mmap()/mremap()"), as well as being consistent with the existing tagged address ABI documentation relating to how ioctl arguments are handled.
The code change is a revert of commit 7d0325749a6c ("userfaultfd: untag user pointers") plus some fixups to some additional calls to validate_range that have appeared since then.
[1] https://source.android.com/devices/tech/debug/tagged-pointers [2] tools/testing/selftests/vm/userfaultfd.c
Link: https://lkml.kernel.org/r/20210714195437.118982-1-pcc@google.com Link: https://lkml.kernel.org/r/20210714195437.118982-2-pcc@google.com Link: https://linux-review.googlesource.com/id/I761aa9f0344454c482b83fcfcce547db0a... Fixes: 63f0c6037965 ("arm64: Introduce prctl() options to control the tagged user addresses ABI") Signed-off-by: Peter Collingbourne pcc@google.com Reviewed-by: Andrey Konovalov andreyknvl@gmail.com Reviewed-by: Catalin Marinas catalin.marinas@arm.com Cc: Alistair Delva adelva@google.com Cc: Andrea Arcangeli aarcange@redhat.com Cc: Dave Martin Dave.Martin@arm.com Cc: Evgenii Stepanov eugenis@google.com Cc: Lokesh Gidra lokeshgidra@google.com Cc: Mitch Phillips mitchp@google.com Cc: Vincenzo Frascino vincenzo.frascino@arm.com Cc: Will Deacon will@kernel.org Cc: William McVicker willmcvicker@google.com Cc: stable@vger.kernel.org [5.4] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/arm64/tagged-address-abi.rst | 26 ++++++++++++++++++-------- fs/userfaultfd.c | 26 ++++++++++++-------------- 2 files changed, 30 insertions(+), 22 deletions(-)
--- a/Documentation/arm64/tagged-address-abi.rst +++ b/Documentation/arm64/tagged-address-abi.rst @@ -45,14 +45,24 @@ how the user addresses are used by the k
1. User addresses not accessed by the kernel but used for address space management (e.g. ``mprotect()``, ``madvise()``). The use of valid - tagged pointers in this context is allowed with the exception of - ``brk()``, ``mmap()`` and the ``new_address`` argument to - ``mremap()`` as these have the potential to alias with existing - user addresses. - - NOTE: This behaviour changed in v5.6 and so some earlier kernels may - incorrectly accept valid tagged pointers for the ``brk()``, - ``mmap()`` and ``mremap()`` system calls. + tagged pointers in this context is allowed with these exceptions: + + - ``brk()``, ``mmap()`` and the ``new_address`` argument to + ``mremap()`` as these have the potential to alias with existing + user addresses. + + NOTE: This behaviour changed in v5.6 and so some earlier kernels may + incorrectly accept valid tagged pointers for the ``brk()``, + ``mmap()`` and ``mremap()`` system calls. + + - The ``range.start``, ``start`` and ``dst`` arguments to the + ``UFFDIO_*`` ``ioctl()``s used on a file descriptor obtained from + ``userfaultfd()``, as fault addresses subsequently obtained by reading + the file descriptor will be untagged, which may otherwise confuse + tag-unaware programs. + + NOTE: This behaviour changed in v5.14 and so some earlier kernels may + incorrectly accept valid tagged pointers for this system call.
2. User addresses accessed by the kernel (e.g. ``write()``). This ABI relaxation is disabled by default and the application thread needs to --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -1236,23 +1236,21 @@ static __always_inline void wake_userfau }
static __always_inline int validate_range(struct mm_struct *mm, - __u64 *start, __u64 len) + __u64 start, __u64 len) { __u64 task_size = mm->task_size;
- *start = untagged_addr(*start); - - if (*start & ~PAGE_MASK) + if (start & ~PAGE_MASK) return -EINVAL; if (len & ~PAGE_MASK) return -EINVAL; if (!len) return -EINVAL; - if (*start < mmap_min_addr) + if (start < mmap_min_addr) return -EINVAL; - if (*start >= task_size) + if (start >= task_size) return -EINVAL; - if (len > task_size - *start) + if (len > task_size - start) return -EINVAL; return 0; } @@ -1313,7 +1311,7 @@ static int userfaultfd_register(struct u vm_flags |= VM_UFFD_MINOR; }
- ret = validate_range(mm, &uffdio_register.range.start, + ret = validate_range(mm, uffdio_register.range.start, uffdio_register.range.len); if (ret) goto out; @@ -1519,7 +1517,7 @@ static int userfaultfd_unregister(struct if (copy_from_user(&uffdio_unregister, buf, sizeof(uffdio_unregister))) goto out;
- ret = validate_range(mm, &uffdio_unregister.start, + ret = validate_range(mm, uffdio_unregister.start, uffdio_unregister.len); if (ret) goto out; @@ -1668,7 +1666,7 @@ static int userfaultfd_wake(struct userf if (copy_from_user(&uffdio_wake, buf, sizeof(uffdio_wake))) goto out;
- ret = validate_range(ctx->mm, &uffdio_wake.start, uffdio_wake.len); + ret = validate_range(ctx->mm, uffdio_wake.start, uffdio_wake.len); if (ret) goto out;
@@ -1708,7 +1706,7 @@ static int userfaultfd_copy(struct userf sizeof(uffdio_copy)-sizeof(__s64))) goto out;
- ret = validate_range(ctx->mm, &uffdio_copy.dst, uffdio_copy.len); + ret = validate_range(ctx->mm, uffdio_copy.dst, uffdio_copy.len); if (ret) goto out; /* @@ -1765,7 +1763,7 @@ static int userfaultfd_zeropage(struct u sizeof(uffdio_zeropage)-sizeof(__s64))) goto out;
- ret = validate_range(ctx->mm, &uffdio_zeropage.range.start, + ret = validate_range(ctx->mm, uffdio_zeropage.range.start, uffdio_zeropage.range.len); if (ret) goto out; @@ -1815,7 +1813,7 @@ static int userfaultfd_writeprotect(stru sizeof(struct uffdio_writeprotect))) return -EFAULT;
- ret = validate_range(ctx->mm, &uffdio_wp.range.start, + ret = validate_range(ctx->mm, uffdio_wp.range.start, uffdio_wp.range.len); if (ret) return ret; @@ -1863,7 +1861,7 @@ static int userfaultfd_continue(struct u sizeof(uffdio_continue) - (sizeof(__s64)))) goto out;
- ret = validate_range(ctx->mm, &uffdio_continue.range.start, + ret = validate_range(ctx->mm, uffdio_continue.range.start, uffdio_continue.range.len); if (ret) goto out;
From: Alexander Potapenko glider@google.com
commit 235a85cb32bb123854ad31de46fdbf04c1d57cda upstream.
Check the allocation size before toggling kfence_allocation_gate.
This way allocations that can't be served by KFENCE will not result in waiting for another CONFIG_KFENCE_SAMPLE_INTERVAL without allocating anything.
Link: https://lkml.kernel.org/r/20210714092222.1890268-1-glider@google.com Signed-off-by: Alexander Potapenko glider@google.com Suggested-by: Marco Elver elver@google.com Reviewed-by: Marco Elver elver@google.com Cc: Dmitry Vyukov dvyukov@google.com Cc: Marco Elver elver@google.com Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: stable@vger.kernel.org [5.12+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/kfence/core.c | 10 +++++++--- 1 file changed, 7 insertions(+), 3 deletions(-)
--- a/mm/kfence/core.c +++ b/mm/kfence/core.c @@ -734,6 +734,13 @@ void kfence_shutdown_cache(struct kmem_c void *__kfence_alloc(struct kmem_cache *s, size_t size, gfp_t flags) { /* + * Perform size check before switching kfence_allocation_gate, so that + * we don't disable KFENCE without making an allocation. + */ + if (size > PAGE_SIZE) + return NULL; + + /* * allocation_gate only needs to become non-zero, so it doesn't make * sense to continue writing to it and pay the associated contention * cost, in case we have a large number of concurrent allocations. @@ -757,9 +764,6 @@ void *__kfence_alloc(struct kmem_cache * if (!READ_ONCE(kfence_enabled)) return NULL;
- if (size > PAGE_SIZE) - return NULL; - return kfence_guarded_alloc(s, size, flags); }
From: Alexander Potapenko glider@google.com
commit 236e9f1538523d3d380dda1cc99571d587058f37 upstream.
Allocation requests outside ZONE_NORMAL (MOVABLE, HIGHMEM or DMA) cannot be fulfilled by KFENCE, because KFENCE memory pool is located in a zone different from the requested one.
Because callers of kmem_cache_alloc() may actually rely on the allocation to reside in the requested zone (e.g. memory allocations done with __GFP_DMA must be DMAable), skip all allocations done with GFP_ZONEMASK and/or respective SLAB flags (SLAB_CACHE_DMA and SLAB_CACHE_DMA32).
Link: https://lkml.kernel.org/r/20210714092222.1890268-2-glider@google.com Fixes: 0ce20dd84089 ("mm: add Kernel Electric-Fence infrastructure") Signed-off-by: Alexander Potapenko glider@google.com Reviewed-by: Marco Elver elver@google.com Acked-by: Souptick Joarder jrdr.linux@gmail.com Cc: Dmitry Vyukov dvyukov@google.com Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: Souptick Joarder jrdr.linux@gmail.com Cc: stable@vger.kernel.org [5.12+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/kfence/core.c | 9 +++++++++ 1 file changed, 9 insertions(+)
--- a/mm/kfence/core.c +++ b/mm/kfence/core.c @@ -741,6 +741,15 @@ void *__kfence_alloc(struct kmem_cache * return NULL;
/* + * Skip allocations from non-default zones, including DMA. We cannot + * guarantee that pages in the KFENCE pool will have the requested + * properties (e.g. reside in DMAable memory). + */ + if ((flags & GFP_ZONEMASK) || + (s->flags & (SLAB_CACHE_DMA | SLAB_CACHE_DMA32))) + return NULL; + + /* * allocation_gate only needs to become non-zero, so it doesn't make * sense to continue writing to it and pay the associated contention * cost, in case we have a large number of concurrent allocations.
From: Christoph Hellwig hch@lst.de
commit 8dad53a11f8d94dceb540a5f8f153484f42be84b upstream.
memcpy_to_page and memzero_page can write to arbitrary pages, which could be in the page cache or in high memory, so call flush_kernel_dcache_pages to flush the dcache.
This is a problem when using these helpers on dcache challeneged architectures. Right now there are just a few users, chances are no one used the PC floppy driver, the aha1542 driver for an ISA SCSI HBA, and a few advanced and optional btrfs and ext4 features on those platforms yet since the conversion.
Link: https://lkml.kernel.org/r/20210713055231.137602-2-hch@lst.de Fixes: bb90d4bc7b6a ("mm/highmem: Lift memcpy_[to|from]_page to core") Fixes: 28961998f858 ("iov_iter: lift memzero_page() to highmem.h") Signed-off-by: Christoph Hellwig hch@lst.de Reviewed-by: Ira Weiny ira.weiny@intel.com Cc: Chaitanya Kulkarni chaitanya.kulkarni@wdc.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/highmem.h | 2 ++ 1 file changed, 2 insertions(+)
--- a/include/linux/highmem.h +++ b/include/linux/highmem.h @@ -329,6 +329,7 @@ static inline void memcpy_to_page(struct
VM_BUG_ON(offset + len > PAGE_SIZE); memcpy(to + offset, from, len); + flush_dcache_page(page); kunmap_local(to); }
@@ -336,6 +337,7 @@ static inline void memzero_page(struct p { char *addr = kmap_atomic(page); memset(addr + offset, 0, len); + flush_dcache_page(page); kunmap_atomic(addr); }
From: Sergei Trofimovich slyfox@gentoo.org
commit 69e5d322a2fb86173fde8bad26e8eb38cad1b1e9 upstream.
To reproduce the failure we need the following system:
- kernel command: page_poison=1 init_on_free=0 init_on_alloc=0
- kernel config: * CONFIG_INIT_ON_ALLOC_DEFAULT_ON=y * CONFIG_INIT_ON_FREE_DEFAULT_ON=y * CONFIG_PAGE_POISONING=y
Resulting in:
0000000085629bdd: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 0000000022861832: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ 00000000c597f5b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ CPU: 11 PID: 15195 Comm: bash Kdump: loaded Tainted: G U O 5.13.1-gentoo-x86_64 #1 Hardware name: System manufacturer System Product Name/PRIME Z370-A, BIOS 2801 01/13/2021 Call Trace: dump_stack+0x64/0x7c __kernel_unpoison_pages.cold+0x48/0x84 post_alloc_hook+0x60/0xa0 get_page_from_freelist+0xdb8/0x1000 __alloc_pages+0x163/0x2b0 __get_free_pages+0xc/0x30 pgd_alloc+0x2e/0x1a0 mm_init+0x185/0x270 dup_mm+0x6b/0x4f0 copy_process+0x190d/0x1b10 kernel_clone+0xba/0x3b0 __do_sys_clone+0x8f/0xb0 do_syscall_64+0x68/0x80 entry_SYSCALL_64_after_hwframe+0x44/0xae
Before commit 51cba1ebc60d ("init_on_alloc: Optimize static branches") init_on_alloc never enabled static branch by default. It could only be enabed explicitly by init_mem_debugging_and_hardening().
But after commit 51cba1ebc60d, a static branch could already be enabled by default. There was no code to ever disable it. That caused page_poison=1 / init_on_free=1 conflict.
This change extends init_mem_debugging_and_hardening() to also disable static branch disabling.
Link: https://lkml.kernel.org/r/20210714031935.4094114-1-keescook@chromium.org Link: https://lore.kernel.org/r/20210712215816.1512739-1-slyfox@gentoo.org Fixes: 51cba1ebc60d ("init_on_alloc: Optimize static branches") Signed-off-by: Sergei Trofimovich slyfox@gentoo.org Signed-off-by: Kees Cook keescook@chromium.org Co-developed-by: Kees Cook keescook@chromium.org Reported-by: Mikhail Morfikov mmorfikov@gmail.com Reported-by: bowsingbetee@pm.me Tested-by: bowsingbetee@protonmail.com Reviewed-by: David Hildenbrand david@redhat.com Cc: Alexander Potapenko glider@google.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Vlastimil Babka vbabka@suse.cz Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/page_alloc.c | 29 ++++++++++++++++------------- 1 file changed, 16 insertions(+), 13 deletions(-)
--- a/mm/page_alloc.c +++ b/mm/page_alloc.c @@ -801,21 +801,24 @@ void init_mem_debugging_and_hardening(vo } #endif
- if (_init_on_alloc_enabled_early) { - if (page_poisoning_requested) - pr_info("mem auto-init: CONFIG_PAGE_POISONING is on, " - "will take precedence over init_on_alloc\n"); - else - static_branch_enable(&init_on_alloc); - } - if (_init_on_free_enabled_early) { - if (page_poisoning_requested) - pr_info("mem auto-init: CONFIG_PAGE_POISONING is on, " - "will take precedence over init_on_free\n"); - else - static_branch_enable(&init_on_free); + if ((_init_on_alloc_enabled_early || _init_on_free_enabled_early) && + page_poisoning_requested) { + pr_info("mem auto-init: CONFIG_PAGE_POISONING is on, " + "will take precedence over init_on_alloc and init_on_free\n"); + _init_on_alloc_enabled_early = false; + _init_on_free_enabled_early = false; }
+ if (_init_on_alloc_enabled_early) + static_branch_enable(&init_on_alloc); + else + static_branch_disable(&init_on_alloc); + + if (_init_on_free_enabled_early) + static_branch_enable(&init_on_free); + else + static_branch_disable(&init_on_free); + #ifdef CONFIG_DEBUG_PAGEALLOC if (!debug_pagealloc_enabled()) return;
From: Mike Rapoport rppt@linux.ibm.com
commit 79e482e9c3ae86e849c701c846592e72baddda5a upstream.
Commit b10d6bca8720 ("arch, drivers: replace for_each_membock() with for_each_mem_range()") didn't take into account that when there is movable_node parameter in the kernel command line, for_each_mem_range() would skip ranges marked with MEMBLOCK_HOTPLUG.
The page table setup code in POWER uses for_each_mem_range() to create the linear mapping of the physical memory and since the regions marked as MEMORY_HOTPLUG are skipped, they never make it to the linear map.
A later access to the memory in those ranges will fail:
BUG: Unable to handle kernel data access on write at 0xc000000400000000 Faulting instruction address: 0xc00000000008a3c0 Oops: Kernel access of bad area, sig: 11 [#1] LE PAGE_SIZE=64K MMU=Radix SMP NR_CPUS=2048 NUMA pSeries Modules linked in: CPU: 0 PID: 53 Comm: kworker/u2:0 Not tainted 5.13.0 #7 NIP: c00000000008a3c0 LR: c0000000003c1ed8 CTR: 0000000000000040 REGS: c000000008a57770 TRAP: 0300 Not tainted (5.13.0) MSR: 8000000002009033 <SF,VEC,EE,ME,IR,DR,RI,LE> CR: 84222202 XER: 20040000 CFAR: c0000000003c1ed4 DAR: c000000400000000 DSISR: 42000000 IRQMASK: 0 GPR00: c0000000003c1ed8 c000000008a57a10 c0000000019da700 c000000400000000 GPR04: 0000000000000280 0000000000000180 0000000000000400 0000000000000200 GPR08: 0000000000000100 0000000000000080 0000000000000040 0000000000000300 GPR12: 0000000000000380 c000000001bc0000 c0000000001660c8 c000000006337e00 GPR16: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 GPR20: 0000000040000000 0000000020000000 c000000001a81990 c000000008c30000 GPR24: c000000008c20000 c000000001a81998 000fffffffff0000 c000000001a819a0 GPR28: c000000001a81908 c00c000001000000 c000000008c40000 c000000008a64680 NIP clear_user_page+0x50/0x80 LR __handle_mm_fault+0xc88/0x1910 Call Trace: __handle_mm_fault+0xc44/0x1910 (unreliable) handle_mm_fault+0x130/0x2a0 __get_user_pages+0x248/0x610 __get_user_pages_remote+0x12c/0x3e0 get_arg_page+0x54/0xf0 copy_string_kernel+0x11c/0x210 kernel_execve+0x16c/0x220 call_usermodehelper_exec_async+0x1b0/0x2f0 ret_from_kernel_thread+0x5c/0x70 Instruction dump: 79280fa4 79271764 79261f24 794ae8e2 7ca94214 7d683a14 7c893a14 7d893050 7d4903a6 60000000 60000000 60000000 <7c001fec> 7c091fec 7c081fec 7c051fec ---[ end trace 490b8c67e6075e09 ]---
Making for_each_mem_range() include MEMBLOCK_HOTPLUG regions in the traversal fixes this issue.
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1976100 Link: https://lkml.kernel.org/r/20210712071132.20902-1-rppt@kernel.org Fixes: b10d6bca8720 ("arch, drivers: replace for_each_membock() with for_each_mem_range()") Signed-off-by: Mike Rapoport rppt@linux.ibm.com Tested-by: Greg Kurz groug@kaod.org Reviewed-by: David Hildenbrand david@redhat.com Cc: stable@vger.kernel.org [5.10+] Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/linux/memblock.h | 4 ++-- mm/memblock.c | 3 ++- 2 files changed, 4 insertions(+), 3 deletions(-)
--- a/include/linux/memblock.h +++ b/include/linux/memblock.h @@ -207,7 +207,7 @@ static inline void __next_physmem_range( */ #define for_each_mem_range(i, p_start, p_end) \ __for_each_mem_range(i, &memblock.memory, NULL, NUMA_NO_NODE, \ - MEMBLOCK_NONE, p_start, p_end, NULL) + MEMBLOCK_HOTPLUG, p_start, p_end, NULL)
/** * for_each_mem_range_rev - reverse iterate through memblock areas from @@ -218,7 +218,7 @@ static inline void __next_physmem_range( */ #define for_each_mem_range_rev(i, p_start, p_end) \ __for_each_mem_range_rev(i, &memblock.memory, NULL, NUMA_NO_NODE, \ - MEMBLOCK_NONE, p_start, p_end, NULL) + MEMBLOCK_HOTPLUG, p_start, p_end, NULL)
/** * for_each_reserved_mem_range - iterate over all reserved memblock areas --- a/mm/memblock.c +++ b/mm/memblock.c @@ -940,7 +940,8 @@ static bool should_skip_region(struct me return true;
/* skip hotpluggable memory regions if needed */ - if (movable_node_is_enabled() && memblock_is_hotpluggable(m)) + if (movable_node_is_enabled() && memblock_is_hotpluggable(m) && + !(flags & MEMBLOCK_HOTPLUG)) return true;
/* if we want mirror memory skip non-mirror memory regions */
From: Qi Zheng zhengqi.arch@bytedance.com
commit e4dc3489143f84f7ed30be58b886bb6772f229b9 upstream.
Commit 63f3655f9501 ("mm, memcg: fix reclaim deadlock with writeback") fix the following ABBA deadlock by pre-allocating the pte page table without holding the page lock.
lock_page(A) SetPageWriteback(A) unlock_page(A) lock_page(B) lock_page(B) pte_alloc_one shrink_page_list wait_on_page_writeback(A) SetPageWriteback(B) unlock_page(B)
# flush A, B to clear the writeback
Commit f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths") reworked the relevant code but ignored this race. This will cause the deadlock above to appear again, so fix it.
Link: https://lkml.kernel.org/r/20210721074849.57004-1-zhengqi.arch@bytedance.com Fixes: f9ce0be71d1f ("mm: Cleanup faultaround and finish_fault() codepaths") Signed-off-by: Qi Zheng zhengqi.arch@bytedance.com Acked-by: Kirill A. Shutemov kirill.shutemov@linux.intel.com Cc: Thomas Gleixner tglx@linutronix.de Cc: Johannes Weiner hannes@cmpxchg.org Cc: Michal Hocko mhocko@kernel.org Cc: Vladimir Davydov vdavydov.dev@gmail.com Cc: Muchun Song songmuchun@bytedance.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/memory.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)
--- a/mm/memory.c +++ b/mm/memory.c @@ -3891,8 +3891,17 @@ vm_fault_t finish_fault(struct vm_fault return ret; }
- if (unlikely(pte_alloc(vma->vm_mm, vmf->pmd))) + if (vmf->prealloc_pte) { + vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd); + if (likely(pmd_none(*vmf->pmd))) { + mm_inc_nr_ptes(vma->vm_mm); + pmd_populate(vma->vm_mm, vmf->pmd, vmf->prealloc_pte); + vmf->prealloc_pte = NULL; + } + spin_unlock(vmf->ptl); + } else if (unlikely(pte_alloc(vma->vm_mm, vmf->pmd))) { return VM_FAULT_OOM; + } }
/* See comment in handle_pte_fault() */
From: Mike Kravetz mike.kravetz@oracle.com
commit e0f7e2b2f7e7864238a4eea05cc77ae1be2bf784 upstream.
In commit 32021982a324 ("hugetlbfs: Convert to fs_context") processing of the mount mode string was changed from match_octal() to fsparam_u32.
This changed existing behavior as match_octal does not require octal values to have a '0' prefix, but fsparam_u32 does.
Use fsparam_u32oct which provides the same behavior as match_octal.
Link: https://lkml.kernel.org/r/20210721183326.102716-1-mike.kravetz@oracle.com Fixes: 32021982a324 ("hugetlbfs: Convert to fs_context") Signed-off-by: Mike Kravetz mike.kravetz@oracle.com Reported-by: Dennis Camera bugs+kernel.org@dtnr.ch Reviewed-by: Matthew Wilcox (Oracle) willy@infradead.org Cc: David Howells dhowells@redhat.com Cc: Al Viro viro@zeniv.linux.org.uk Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Linus Torvalds torvalds@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/hugetlbfs/inode.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/hugetlbfs/inode.c +++ b/fs/hugetlbfs/inode.c @@ -77,7 +77,7 @@ enum hugetlb_param { static const struct fs_parameter_spec hugetlb_fs_parameters[] = { fsparam_u32 ("gid", Opt_gid), fsparam_string("min_size", Opt_min_size), - fsparam_u32 ("mode", Opt_mode), + fsparam_u32oct("mode", Opt_mode), fsparam_string("nr_inodes", Opt_nr_inodes), fsparam_string("pagesize", Opt_pagesize), fsparam_string("size", Opt_size),
From: Ilya Dryomov idryomov@gmail.com
commit ed9eb71085ecb7ded9a5118cec2ab70667cc7350 upstream.
Currently rbd_quiesce_lock() holds lock_rwsem for read while blocking on releasing_wait completion. On the I/O completion side, each image request also needs to take lock_rwsem for read. Because rw_semaphore implementation doesn't allow new readers after a writer has indicated interest in the lock, this can result in a deadlock if something that needs to take lock_rwsem for write gets involved. For example:
1. watch error occurs 2. rbd_watch_errcb() takes lock_rwsem for write, clears owner_cid and releases lock_rwsem 3. after reestablishing the watch, rbd_reregister_watch() takes lock_rwsem for write and calls rbd_reacquire_lock() 4. rbd_quiesce_lock() downgrades lock_rwsem to for read and blocks on releasing_wait until running_list becomes empty 5. another watch error occurs 6. rbd_watch_errcb() blocks trying to take lock_rwsem for write 7. no in-flight image request can complete and delete itself from running_list because lock_rwsem won't be granted anymore
A similar scenario can occur with "lock has been acquired" and "lock has been released" notification handers which also take lock_rwsem for write to update owner_cid.
We don't actually get anything useful from sitting on lock_rwsem in rbd_quiesce_lock() -- owner_cid updates certainly don't need to be synchronized with. In fact the whole owner_cid tracking logic could probably be removed from the kernel client because we don't support proxied maintenance operations.
Cc: stable@vger.kernel.org # 5.3+ URL: https://tracker.ceph.com/issues/42757 Signed-off-by: Ilya Dryomov idryomov@gmail.com Tested-by: Robin Geuze robin.geuze@nl.team.blue Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/block/rbd.c | 12 +++++------- 1 file changed, 5 insertions(+), 7 deletions(-)
--- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -4100,8 +4100,6 @@ again:
static bool rbd_quiesce_lock(struct rbd_device *rbd_dev) { - bool need_wait; - dout("%s rbd_dev %p\n", __func__, rbd_dev); lockdep_assert_held_write(&rbd_dev->lock_rwsem);
@@ -4113,11 +4111,11 @@ static bool rbd_quiesce_lock(struct rbd_ */ rbd_dev->lock_state = RBD_LOCK_STATE_RELEASING; rbd_assert(!completion_done(&rbd_dev->releasing_wait)); - need_wait = !list_empty(&rbd_dev->running_list); - downgrade_write(&rbd_dev->lock_rwsem); - if (need_wait) - wait_for_completion(&rbd_dev->releasing_wait); - up_read(&rbd_dev->lock_rwsem); + if (list_empty(&rbd_dev->running_list)) + return true; + + up_write(&rbd_dev->lock_rwsem); + wait_for_completion(&rbd_dev->releasing_wait);
down_write(&rbd_dev->lock_rwsem); if (rbd_dev->lock_state != RBD_LOCK_STATE_RELEASING)
From: Ilya Dryomov idryomov@gmail.com
commit 8798d070d416d18a75770fc19787e96705073f43 upstream.
Skipping the "lock has been released" notification if the lock owner is not what we expect based on owner_cid can lead to I/O hangs. One example is our own notifications: because owner_cid is cleared in rbd_unlock(), when we get our own notification it is processed as unexpected/duplicate and maybe_kick_acquire() isn't called. If a peer that requested the lock then doesn't go through with acquiring it, I/O requests that came in while the lock was being quiesced would be stalled until another I/O request is submitted and kicks acquire from rbd_img_exclusive_lock().
This makes the comment in rbd_release_lock() actually true: prior to this change the canceled work was being requeued in response to the "lock has been acquired" notification from rbd_handle_acquired_lock().
Cc: stable@vger.kernel.org # 5.3+ Signed-off-by: Ilya Dryomov idryomov@gmail.com Tested-by: Robin Geuze robin.geuze@nl.team.blue Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/block/rbd.c | 20 +++++++------------- 1 file changed, 7 insertions(+), 13 deletions(-)
--- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -4201,15 +4201,11 @@ static void rbd_handle_acquired_lock(str if (!rbd_cid_equal(&cid, &rbd_empty_cid)) { down_write(&rbd_dev->lock_rwsem); if (rbd_cid_equal(&cid, &rbd_dev->owner_cid)) { - /* - * we already know that the remote client is - * the owner - */ - up_write(&rbd_dev->lock_rwsem); - return; + dout("%s rbd_dev %p cid %llu-%llu == owner_cid\n", + __func__, rbd_dev, cid.gid, cid.handle); + } else { + rbd_set_owner_cid(rbd_dev, &cid); } - - rbd_set_owner_cid(rbd_dev, &cid); downgrade_write(&rbd_dev->lock_rwsem); } else { down_read(&rbd_dev->lock_rwsem); @@ -4234,14 +4230,12 @@ static void rbd_handle_released_lock(str if (!rbd_cid_equal(&cid, &rbd_empty_cid)) { down_write(&rbd_dev->lock_rwsem); if (!rbd_cid_equal(&cid, &rbd_dev->owner_cid)) { - dout("%s rbd_dev %p unexpected owner, cid %llu-%llu != owner_cid %llu-%llu\n", + dout("%s rbd_dev %p cid %llu-%llu != owner_cid %llu-%llu\n", __func__, rbd_dev, cid.gid, cid.handle, rbd_dev->owner_cid.gid, rbd_dev->owner_cid.handle); - up_write(&rbd_dev->lock_rwsem); - return; + } else { + rbd_set_owner_cid(rbd_dev, &rbd_empty_cid); } - - rbd_set_owner_cid(rbd_dev, &rbd_empty_cid); downgrade_write(&rbd_dev->lock_rwsem); } else { down_read(&rbd_dev->lock_rwsem);
From: Jérôme Glisse jglisse@redhat.com
commit c36748ac545421d94a5091c754414c0f3664bf10 upstream.
We need to append device id even if eeprom have a label property set as some platform can have multiple eeproms with same label and we can not register each of those with same label. Failing to register those eeproms trigger cascade failures on such platform (system is no longer working).
This fix regression on such platform introduced with 4e302c3b568e
Reported-by: Alexander Fomichev fomichev.ru@gmail.com Fixes: 4e302c3b568e ("misc: eeprom: at24: fix NVMEM name with custom AT24 device name") Cc: stable@vger.kernel.org Signed-off-by: Jérôme Glisse jglisse@redhat.com Signed-off-by: Bartosz Golaszewski bgolaszewski@baylibre.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/misc/eeprom/at24.c | 17 +++++++---------- 1 file changed, 7 insertions(+), 10 deletions(-)
--- a/drivers/misc/eeprom/at24.c +++ b/drivers/misc/eeprom/at24.c @@ -714,23 +714,20 @@ static int at24_probe(struct i2c_client }
/* - * If the 'label' property is not present for the AT24 EEPROM, - * then nvmem_config.id is initialised to NVMEM_DEVID_AUTO, - * and this will append the 'devid' to the name of the NVMEM - * device. This is purely legacy and the AT24 driver has always - * defaulted to this. However, if the 'label' property is - * present then this means that the name is specified by the - * firmware and this name should be used verbatim and so it is - * not necessary to append the 'devid'. + * We initialize nvmem_config.id to NVMEM_DEVID_AUTO even if the + * label property is set as some platform can have multiple eeproms + * with same label and we can not register each of those with same + * label. Failing to register those eeproms trigger cascade failure + * on such platform. */ + nvmem_config.id = NVMEM_DEVID_AUTO; + if (device_property_present(dev, "label")) { - nvmem_config.id = NVMEM_DEVID_NONE; err = device_property_read_string(dev, "label", &nvmem_config.name); if (err) return err; } else { - nvmem_config.id = NVMEM_DEVID_AUTO; nvmem_config.name = dev_name(dev); }
From: Greg Kroah-Hartman gregkh@linuxfoundation.org
commit c453db6cd96418c79702eaf38259002755ab23ff upstream.
Commit 1be7107fbe18 ("mm: larger stack guard gap, between vmas") fixed up all architectures to deal with the stack guard gap. But when nds32 was added to the tree, it forgot to do the same thing.
Resolve this by properly fixing up the nsd32's version of arch_get_unmapped_area()
Cc: Nick Hu nickhu@andestech.com Cc: Greentime Hu green.hu@gmail.com Cc: Vincent Chen deanbo422@gmail.com Cc: Michal Hocko mhocko@suse.com Cc: Hugh Dickins hughd@google.com Cc: Qiang Liu cyruscyliu@gmail.com Cc: stable stable@vger.kernel.org Reported-by: iLifetruth yixiaonn@gmail.com Acked-by: Hugh Dickins hughd@google.com Link: https://lore.kernel.org/r/20210629104024.2293615-1-gregkh@linuxfoundation.or... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/nds32/mm/mmap.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/nds32/mm/mmap.c +++ b/arch/nds32/mm/mmap.c @@ -59,7 +59,7 @@ arch_get_unmapped_area(struct file *filp
vma = find_vma(mm, addr); if (TASK_SIZE - len >= addr && - (!vma || addr + len <= vma->vm_start)) + (!vma || addr + len <= vm_start_gap(vma))) return addr; }
From: Adrian Hunter adrian.hunter@intel.com
commit e64daad660a0c9ace3acdc57099fffe5ed83f977 upstream.
sysfs_remove_link() causes a warning if the parent directory does not exist. That can happen if the device link consumer has not been registered. So do not attempt sysfs_remove_link() in that case.
Fixes: 287905e68dd29 ("driver core: Expose device link details in sysfs") Signed-off-by: Adrian Hunter adrian.hunter@intel.com Cc: stable@vger.kernel.org # 5.9+ Reviewed-by: Rafael J. Wysocki rafael@kernel.org Link: https://lore.kernel.org/r/20210716114408.17320-2-adrian.hunter@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/base/core.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/base/core.c +++ b/drivers/base/core.c @@ -574,8 +574,10 @@ static void devlink_remove_symlinks(stru return; }
- snprintf(buf, len, "supplier:%s:%s", dev_bus_name(sup), dev_name(sup)); - sysfs_remove_link(&con->kobj, buf); + if (device_is_registered(con)) { + snprintf(buf, len, "supplier:%s:%s", dev_bus_name(sup), dev_name(sup)); + sysfs_remove_link(&con->kobj, buf); + } snprintf(buf, len, "consumer:%s:%s", dev_bus_name(con), dev_name(con)); sysfs_remove_link(&sup->kobj, buf); kfree(buf);
From: Jason Ekstrand jason@jlekstrand.net
commit 3761baae908a7b5012be08d70fa553cc2eb82305 upstream.
This reverts commit 9e31c1fe45d555a948ff66f1f0e3fe1f83ca63f7. Ever since that commit, we've been having issues where a hang in one client can propagate to another. In particular, a hang in an app can propagate to the X server which causes the whole desktop to lock up.
Error propagation along fences sound like a good idea, but as your bug shows, surprising consequences, since propagating errors across security boundaries is not a good thing.
What we do have is track the hangs on the ctx, and report information to userspace using RESET_STATS. That's how arb_robustness works. Also, if my understanding is still correct, the EIO from execbuf is when your context is banned (because not recoverable or too many hangs). And in all these cases it's up to userspace to figure out what is all impacted and should be reported to the application, that's not on the kernel to guess and automatically propagate.
What's more, we're also building more features on top of ctx error reporting with RESET_STATS ioctl: Encrypted buffers use the same, and the userspace fence wait also relies on that mechanism. So it is the path going forward for reporting gpu hangs and resets to userspace.
So all together that's why I think we should just bury this idea again as not quite the direction we want to go to, hence why I think the revert is the right option here.
For backporters: Please note that you _must_ have a backport of https://lore.kernel.org/dri-devel/20210602164149.391653-2-jason@jlekstrand.n... for otherwise backporting just this patch opens up a security bug.
v2: Augment commit message. Also restore Jason's sob that I accidentally lost.
v3: Add a note for backporters
Signed-off-by: Jason Ekstrand jason@jlekstrand.net Reported-by: Marcin Slusarz marcin.slusarz@intel.com Cc: stable@vger.kernel.org # v5.6+ Cc: Jason Ekstrand jason.ekstrand@intel.com Cc: Marcin Slusarz marcin.slusarz@intel.com Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/3080 Fixes: 9e31c1fe45d5 ("drm/i915: Propagate errors on awaiting already signaled fences") Acked-by: Daniel Vetter daniel.vetter@ffwll.ch Reviewed-by: Jon Bloomfield jon.bloomfield@intel.com Signed-off-by: Daniel Vetter daniel.vetter@ffwll.ch Link: https://patchwork.freedesktop.org/patch/msgid/20210714193419.1459723-3-jason... (cherry picked from commit 93a2711cddd5760e2f0f901817d71c93183c3b87) Signed-off-by: Rodrigo Vivi rodrigo.vivi@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/i915_request.c | 8 ++------ 1 file changed, 2 insertions(+), 6 deletions(-)
--- a/drivers/gpu/drm/i915/i915_request.c +++ b/drivers/gpu/drm/i915/i915_request.c @@ -1426,10 +1426,8 @@ i915_request_await_execution(struct i915
do { fence = *child++; - if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) { - i915_sw_fence_set_error_once(&rq->submit, fence->error); + if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) continue; - }
if (fence->context == rq->fence.context) continue; @@ -1527,10 +1525,8 @@ i915_request_await_dma_fence(struct i915
do { fence = *child++; - if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) { - i915_sw_fence_set_error_once(&rq->submit, fence->error); + if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->flags)) continue; - }
/* * Requests on the same timeline are explicitly ordered, along
From: Charles Baylis cb-kernel@fishzet.co.uk
commit 3abab27c322e0f2acf981595aa8040c9164dc9fb upstream.
drm: Return -ENOTTY for non-drm ioctls
Return -ENOTTY from drm_ioctl() when userspace passes in a cmd number which doesn't relate to the drm subsystem.
Glibc uses the TCGETS ioctl to implement isatty(), and without this change isatty() returns it incorrectly returns true for drm devices.
To test run this command: $ if [ -t 0 ]; then echo is a tty; fi < /dev/dri/card0 which shows "is a tty" without this patch.
This may also modify memory which the userspace application is not expecting.
Signed-off-by: Charles Baylis cb-kernel@fishzet.co.uk Cc: stable@vger.kernel.org Signed-off-by: Daniel Vetter daniel.vetter@ffwll.ch Link: https://patchwork.freedesktop.org/patch/msgid/YPG3IBlzaMhfPqCr@stando.fishze... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/drm_ioctl.c | 3 +++ include/drm/drm_ioctl.h | 1 + 2 files changed, 4 insertions(+)
--- a/drivers/gpu/drm/drm_ioctl.c +++ b/drivers/gpu/drm/drm_ioctl.c @@ -827,6 +827,9 @@ long drm_ioctl(struct file *filp, if (drm_dev_is_unplugged(dev)) return -ENODEV;
+ if (DRM_IOCTL_TYPE(cmd) != DRM_IOCTL_BASE) + return -ENOTTY; + is_driver_ioctl = nr >= DRM_COMMAND_BASE && nr < DRM_COMMAND_END;
if (is_driver_ioctl) { --- a/include/drm/drm_ioctl.h +++ b/include/drm/drm_ioctl.h @@ -68,6 +68,7 @@ typedef int drm_ioctl_compat_t(struct fi unsigned long arg);
#define DRM_IOCTL_NR(n) _IOC_NR(n) +#define DRM_IOCTL_TYPE(n) _IOC_TYPE(n) #define DRM_MAJOR 226
/**
From: Tao Zhou tao.zhou1@amd.com
commit cfe4e8f00f8f19ba305800f64962d1949ab5d4ca upstream.
Update gc_10_3_4 golden setting.
Signed-off-by: Tao Zhou tao.zhou1@amd.com Reviewed-by: Guchun Chen guchun.chen@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c @@ -3411,6 +3411,7 @@ static const struct soc15_reg_golden gol SOC15_REG_GOLDEN_VALUE(GC, 0, mmSQ_PERFCOUNTER7_SELECT, 0xf0f001ff, 0x00000000), SOC15_REG_GOLDEN_VALUE(GC, 0, mmSQ_PERFCOUNTER8_SELECT, 0xf0f001ff, 0x00000000), SOC15_REG_GOLDEN_VALUE(GC, 0, mmSQ_PERFCOUNTER9_SELECT, 0xf0f001ff, 0x00000000), + SOC15_REG_GOLDEN_VALUE(GC, 0, mmSX_DEBUG_1, 0x00010000, 0x00010020), SOC15_REG_GOLDEN_VALUE(GC, 0, mmTA_CNTL_AUX, 0x01030000, 0x01030000), SOC15_REG_GOLDEN_VALUE(GC, 0, mmUTCL1_CTRL, 0x03a00000, 0x00a00000), SOC15_REG_GOLDEN_VALUE(GC, 0, mmLDS_CONFIG, 0x00000020, 0x00000020)
From: Xiaojian Du Xiaojian.Du@amd.com
commit 4fff6fbca12524358a32e56f125ae738141f62b4 upstream.
This patch is to update the golden setting for vangogh.
Signed-off-by: Xiaojian Du Xiaojian.Du@amd.com Reviewed-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c @@ -3369,6 +3369,7 @@ static const struct soc15_reg_golden gol SOC15_REG_GOLDEN_VALUE(GC, 0, mmPA_SC_ENHANCE_2, 0xffffffbf, 0x00000020), SOC15_REG_GOLDEN_VALUE(GC, 0, mmSPI_CONFIG_CNTL_1_Vangogh, 0xffffffff, 0x00070103), SOC15_REG_GOLDEN_VALUE(GC, 0, mmSQG_CONFIG, 0x000017ff, 0x00001000), + SOC15_REG_GOLDEN_VALUE(GC, 0, mmSX_DEBUG_1, 0x00010000, 0x00010020), SOC15_REG_GOLDEN_VALUE(GC, 0, mmTA_CNTL_AUX, 0xfff7ffff, 0x01030000), SOC15_REG_GOLDEN_VALUE(GC, 0, mmUTCL1_CTRL, 0xffffffff, 0x00400000), SOC15_REG_GOLDEN_VALUE(GC, 0, mmVGT_GS_MAX_WAVE_ID, 0x00000fff, 0x000000ff),
From: Likun Gao Likun.Gao@amd.com
commit 3e94b5965e624f7e6d8dd18eb8f3bf2bb99ba30d upstream.
Update GFX golden setting for sienna_cichlid.
Signed-off-by: Likun Gao Likun.Gao@amd.com Reviewed-by: Hawking Zhang Hawking.Zhang@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v10_0.c @@ -3290,6 +3290,7 @@ static const struct soc15_reg_golden gol SOC15_REG_GOLDEN_VALUE(GC, 0, mmSQ_PERFCOUNTER7_SELECT, 0xf0f001ff, 0x00000000), SOC15_REG_GOLDEN_VALUE(GC, 0, mmSQ_PERFCOUNTER8_SELECT, 0xf0f001ff, 0x00000000), SOC15_REG_GOLDEN_VALUE(GC, 0, mmSQ_PERFCOUNTER9_SELECT, 0xf0f001ff, 0x00000000), + SOC15_REG_GOLDEN_VALUE(GC, 0, mmSX_DEBUG_1, 0x00010000, 0x00010020), SOC15_REG_GOLDEN_VALUE(GC, 0, mmTA_CNTL_AUX, 0xfff7ffff, 0x01030000), SOC15_REG_GOLDEN_VALUE(GC, 0, mmUTCL1_CTRL, 0xffbfffff, 0x00a00000) };
From: Yoshitaka Ikeda ikeda@nskint.co.jp
commit 0ccfd1ba84a4503b509250941af149e9ebd605ca upstream.
Revert to change to a better code.
This reverts commit 55cef88bbf12f3bfbe5c2379a8868a034707e755.
Signed-off-by: Yoshitaka Ikeda ikeda@nskint.co.jp Link: https://lore.kernel.org/r/bd30bdb4-07c4-f713-5648-01c898d51f1b@nskint.co.jp Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/spi/spi-cadence-quadspi.c | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-)
--- a/drivers/spi/spi-cadence-quadspi.c +++ b/drivers/spi/spi-cadence-quadspi.c @@ -307,13 +307,11 @@ static unsigned int cqspi_calc_rdreg(str
static unsigned int cqspi_calc_dummy(const struct spi_mem_op *op, bool dtr) { - unsigned int dummy_clk = 0; + unsigned int dummy_clk;
- if (op->dummy.buswidth && op->dummy.nbytes) { - dummy_clk = op->dummy.nbytes * (8 / op->dummy.buswidth); - if (dtr) - dummy_clk /= 2; - } + dummy_clk = op->dummy.nbytes * (8 / op->dummy.buswidth); + if (dtr) + dummy_clk /= 2;
return dummy_clk; }
From: Mahesh Bandewar maheshb@google.com
commit 5b69874f74cc5707edd95fcdaa757c507ac8af0f upstream.
The commit 9a5605505d9c (" bonding: Add struct bond_ipesc to manage SA") is causing following build error when XFRM is not selected in kernel config.
lld: error: undefined symbol: xfrm_dev_state_flush
referenced by bond_main.c:3453 (drivers/net/bonding/bond_main.c:3453) net/bonding/bond_main.o:(bond_netdev_event) in archive drivers/built-in.a
Fixes: 9a5605505d9c (" bonding: Add struct bond_ipesc to manage SA") Signed-off-by: Mahesh Bandewar maheshb@google.com CC: Taehee Yoo ap420073@gmail.com CC: Jay Vosburgh jay.vosburgh@canonical.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/bonding/bond_main.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -3442,7 +3442,9 @@ static int bond_master_netdev_event(unsi return bond_event_changename(event_bond); case NETDEV_UNREGISTER: bond_remove_proc_entry(event_bond); +#ifdef CONFIG_XFRM_OFFLOAD xfrm_dev_state_flush(dev_net(bond_dev), bond_dev, true); +#endif /* CONFIG_XFRM_OFFLOAD */ break; case NETDEV_REGISTER: bond_create_proc_entry(event_bond);
From: Matthieu Baerts matthieu.baerts@tessares.net
commit c4512c63b1193c73b3f09c598a6d0a7f88da1dd8 upstream.
Dan Carpenter reported an issue introduced in commit fde56eea01f9 ("mptcp: refine mptcp_cleanup_rbuf") where a new boolean (ack_pending) is masked with 0x9.
This is not the intention to ignore values by using a boolean. This variable should not have a 'bool' type: we should keep the 'u8' to allow this comparison.
Fixes: fde56eea01f9 ("mptcp: refine mptcp_cleanup_rbuf") Reported-by: Dan Carpenter dan.carpenter@oracle.com Signed-off-by: Matthieu Baerts matthieu.baerts@tessares.net Signed-off-by: Mat Martineau mathew.j.martineau@linux.intel.com Acked-by: Paolo Abeni pabeni@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/mptcp/protocol.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/mptcp/protocol.c +++ b/net/mptcp/protocol.c @@ -446,7 +446,7 @@ static void mptcp_subflow_cleanup_rbuf(s static bool mptcp_subflow_could_cleanup(const struct sock *ssk, bool rx_empty) { const struct inet_connection_sock *icsk = inet_csk(ssk); - bool ack_pending = READ_ONCE(icsk->icsk_ack.pending); + u8 ack_pending = READ_ONCE(icsk->icsk_ack.pending); const struct tcp_sock *tp = tcp_sk(ssk);
return (ack_pending & ICSK_ACK_SCHED) &&
From: Paul Blakey paulb@nvidia.com
commit 8550ff8d8c75416e984d9c4b082845e57e560984 upstream.
When multiple SKBs are merged to a new skb under napi GRO, or SKB is re-used by napi, if nfct was set for them in the driver, it will not be released while freeing their stolen head state or on re-use.
Release nfct on napi's stolen or re-used SKBs, and in gro_list_prepare, check conntrack metadata diff.
Fixes: 5c6b94604744 ("net/mlx5e: CT: Handle misses after executing CT action") Reviewed-by: Roi Dayan roid@nvidia.com Signed-off-by: Paul Blakey paulb@nvidia.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/core/dev.c | 13 +++++++++++++ net/core/skbuff.c | 1 + 2 files changed, 14 insertions(+)
--- a/net/core/dev.c +++ b/net/core/dev.c @@ -5981,6 +5981,18 @@ static void gro_list_prepare(const struc diffs = memcmp(skb_mac_header(p), skb_mac_header(skb), maclen); + + diffs |= skb_get_nfct(p) ^ skb_get_nfct(skb); + + if (!diffs) { + struct tc_skb_ext *skb_ext = skb_ext_find(skb, TC_SKB_EXT); + struct tc_skb_ext *p_ext = skb_ext_find(p, TC_SKB_EXT); + + diffs |= (!!p_ext) ^ (!!skb_ext); + if (!diffs && unlikely(skb_ext)) + diffs |= p_ext->chain ^ skb_ext->chain; + } + NAPI_GRO_CB(p)->same_flow = !diffs; } } @@ -6245,6 +6257,7 @@ static void napi_reuse_skb(struct napi_s skb_shinfo(skb)->gso_type = 0; skb->truesize = SKB_TRUESIZE(skb_end_offset(skb)); skb_ext_reset(skb); + nf_reset_ct(skb);
napi->skb = skb; } --- a/net/core/skbuff.c +++ b/net/core/skbuff.c @@ -939,6 +939,7 @@ void __kfree_skb_defer(struct sk_buff *s
void napi_skb_free_stolen_head(struct sk_buff *skb) { + nf_reset_ct(skb); skb_dst_drop(skb); skb_ext_put(skb); napi_skb_cache_put(skb);
From: Stefan Wahren stefan.wahren@i2se.com
commit ab37a7a890c1176144a4c66ff3d51ef2c20ed486 upstream.
The usage of usb-nop-xceiv PHY on Raspberry Pi boards with BCM283x has been a "regression source" a lot of times. The last case is breakage of USB mass storage boot has been commit e590474768f1 ("driver core: Set fw_devlink=on by default") for multi_v7_defconfig. As long as NOP_USB_XCEIV is configured as module, the dwc2 USB driver defer probing endlessly and prevent booting from USB mass storage device. So make the driver built-in as in bcm2835_defconfig and arm64/defconfig.
Fixes: e590474768f1 ("driver core: Set fw_devlink=on by default") Reported-by: Ojaswin Mujoo ojaswin98@gmail.com Signed-off-by: Stefan Wahren stefan.wahren@i2se.com Link: https://lore.kernel.org/r/1625915095-23077-1-git-send-email-stefan.wahren@i2...' Signed-off-by: Arnd Bergmann arnd@arndb.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm/configs/multi_v7_defconfig | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm/configs/multi_v7_defconfig +++ b/arch/arm/configs/multi_v7_defconfig @@ -821,7 +821,7 @@ CONFIG_USB_ISP1760=y CONFIG_USB_HSIC_USB3503=y CONFIG_AB8500_USB=y CONFIG_KEYSTONE_USB_PHY=m -CONFIG_NOP_USB_XCEIV=m +CONFIG_NOP_USB_XCEIV=y CONFIG_AM335X_PHY_USB=m CONFIG_TWL6030_USB=m CONFIG_USB_GPIO_VBUS=y
From: Robert Richter rrichter@amd.com
commit 5e60f363b38fd40e4d8838b5d6f4d4ecee92c777 upstream.
Documentation was not changed when renaming the script in commit 80e715a06c2d ("initramfs: rename gen_initramfs_list.sh to gen_initramfs.sh"). Fixing this.
Basically does:
$ sed -i -e s/gen_initramfs_list.sh/gen_initramfs.sh/g $(git grep -l gen_initramfs_list.sh)
Fixes: 80e715a06c2d ("initramfs: rename gen_initramfs_list.sh to gen_initramfs.sh") Signed-off-by: Robert Richter rrichter@amd.com Signed-off-by: Masahiro Yamada masahiroy@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- Documentation/driver-api/early-userspace/early_userspace_support.rst | 8 ++++---- Documentation/filesystems/ramfs-rootfs-initramfs.rst | 2 +- 2 files changed, 5 insertions(+), 5 deletions(-)
--- a/Documentation/driver-api/early-userspace/early_userspace_support.rst +++ b/Documentation/driver-api/early-userspace/early_userspace_support.rst @@ -69,17 +69,17 @@ early userspace image can be built by an
As a technical note, when directories and files are specified, the entire CONFIG_INITRAMFS_SOURCE is passed to -usr/gen_initramfs_list.sh. This means that CONFIG_INITRAMFS_SOURCE +usr/gen_initramfs.sh. This means that CONFIG_INITRAMFS_SOURCE can really be interpreted as any legal argument to -gen_initramfs_list.sh. If a directory is specified as an argument then +gen_initramfs.sh. If a directory is specified as an argument then the contents are scanned, uid/gid translation is performed, and usr/gen_init_cpio file directives are output. If a directory is -specified as an argument to usr/gen_initramfs_list.sh then the +specified as an argument to usr/gen_initramfs.sh then the contents of the file are simply copied to the output. All of the output directives from directory scanning and file contents copying are processed by usr/gen_init_cpio.
-See also 'usr/gen_initramfs_list.sh -h'. +See also 'usr/gen_initramfs.sh -h'.
Where's this all leading? ========================= --- a/Documentation/filesystems/ramfs-rootfs-initramfs.rst +++ b/Documentation/filesystems/ramfs-rootfs-initramfs.rst @@ -170,7 +170,7 @@ Documentation/driver-api/early-userspace The kernel does not depend on external cpio tools. If you specify a directory instead of a configuration file, the kernel's build infrastructure creates a configuration file from that directory (usr/Makefile calls -usr/gen_initramfs_list.sh), and proceeds to package up that directory +usr/gen_initramfs.sh), and proceeds to package up that directory using the config file (by feeding it to usr/gen_init_cpio, which is created from usr/gen_init_cpio.c). The kernel's build-time cpio creation code is entirely self-contained, and the kernel's boot-time extractor is also
From: Mark Rutland mark.rutland@arm.com
commit e6f85cbeb23bd74b8966cf1f15bf7d01399ff625 upstream.
We suppress KCOV for entry.o rather than entry-common.o. As entry.o is built from entry.S, this is pointless, and permits instrumentation of entry-common.o, which is built from entry-common.c.
Fix the Makefile to suppress KCOV for entry-common.o, as we had intended to begin with. I've verified with objdump that this is working as expected.
Fixes: bf6fa2c0dda7 ("arm64: entry: don't instrument entry code with KCOV") Signed-off-by: Mark Rutland mark.rutland@arm.com Cc: Catalin Marinas catalin.marinas@arm.com Cc: James Morse james.morse@arm.com Cc: Marc Zyngier maz@kernel.org Cc: Will Deacon will@kernel.org Link: https://lore.kernel.org/r/20210715123049.9990-1-mark.rutland@arm.com Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/kernel/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm64/kernel/Makefile +++ b/arch/arm64/kernel/Makefile @@ -17,7 +17,7 @@ CFLAGS_syscall.o += -fno-stack-protector # It's not safe to invoke KCOV when portions of the kernel environment aren't # available or are out-of-sync with HW state. Since `noinstr` doesn't always # inhibit KCOV instrumentation, disable it for the entire compilation unit. -KCOV_INSTRUMENT_entry.o := n +KCOV_INSTRUMENT_entry-common.o := n
# Object file lists. obj-y := debug-monitors.o entry.o irq.o fpsimd.o \
From: Riccardo Mancini rickyman7@gmail.com
commit 02e6246f5364d5260a6ea6f92ab6f409058b162f upstream.
ASan reports a memory leak when running:
# perf test "83: Zstd perf.data compression/decompression"
which happens inside 'perf inject'.
The bug is caused by inject.output never being closed.
This patch adds the missing perf_data__close().
Signed-off-by: Riccardo Mancini rickyman7@gmail.com Fixes: 6ef81c55a2b6584c ("perf session: Return error code for perf_session__new() function on failure") Cc: Ian Rogers irogers@google.com Cc: Jiri Olsa jolsa@redhat.com Cc: Mamatha Inamdar mamatha4@linux.vnet.ibm.com Cc: Mark Rutland mark.rutland@arm.com Cc: Namhyung Kim namhyung@kernel.org Cc: Peter Zijlstra peterz@infradead.org Link: http://lore.kernel.org/lkml/c06f682afa964687367cf6e92a64ceb49aec76a5.1626343... Signed-off-by: Arnaldo Carvalho de Melo acme@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/perf/builtin-inject.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-)
--- a/tools/perf/builtin-inject.c +++ b/tools/perf/builtin-inject.c @@ -908,8 +908,10 @@ int cmd_inject(int argc, const char **ar
data.path = inject.input_name; inject.session = perf_session__new(&data, inject.output.is_pipe, &inject.tool); - if (IS_ERR(inject.session)) - return PTR_ERR(inject.session); + if (IS_ERR(inject.session)) { + ret = PTR_ERR(inject.session); + goto out_close_output; + }
if (zstd_init(&(inject.session->zstd_data), 0) < 0) pr_warning("Decompression initialization failed.\n"); @@ -951,5 +953,7 @@ int cmd_inject(int argc, const char **ar out_delete: zstd_fini(&(inject.session->zstd_data)); perf_session__delete(inject.session); +out_close_output: + perf_data__close(&inject.output); return ret; }
From: Colin Xu colin.xu@intel.com
commit c90b4503ccf42d9d367e843c223df44aa550e82a upstream.
d3_entered flag is used to mark for vgpu_reset a previous power transition from D3->D0, typically for VM resume from S3, so that gvt could skip PPGTT invalidation in current vgpu_reset during resuming.
In case S0ix exit, although there is D3->D0, guest driver continue to use vgpu as normal, with d3_entered set, until next shutdown/reboot or power transition.
If a reboot follows a S0ix exit, device power state transite as: D0->D3->D0->D0(reboot), while system power state transites as: S0->S0 (reboot). There is no vgpu_reset until D0(reboot), thus d3_entered won't be cleared, the vgpu_reset will skip PPGTT invalidation however those PPGTT entries are no longer valid. Err appears like:
gvt: vgpu 2: vfio_pin_pages failed for gfn 0xxxxx, ret -22 gvt: vgpu 2: fail: spt xxxx guest entry 0xxxxx type 2 gvt: vgpu 2: fail: shadow page xxxx guest entry 0xxxxx type 2.
Give gvt a chance to clear d3_entered on elsp cmd submission so that the states before & after S0ix enter/exit are consistent.
Fixes: ba25d977571e ("drm/i915/gvt: Do not destroy ppgtt_mm during vGPU D3->D0.") Reviewed-by: Zhenyu Wang zhenyuw@linux.intel.com Signed-off-by: Colin Xu colin.xu@intel.com Signed-off-by: Zhenyu Wang zhenyuw@linux.intel.com Link: http://patchwork.freedesktop.org/patch/msgid/20210707004531.4873-1-colin.xu@... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/i915/gvt/handlers.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+)
--- a/drivers/gpu/drm/i915/gvt/handlers.c +++ b/drivers/gpu/drm/i915/gvt/handlers.c @@ -1977,6 +1977,21 @@ static int elsp_mmio_write(struct intel_ if (drm_WARN_ON(&i915->drm, !engine)) return -EINVAL;
+ /* + * Due to d3_entered is used to indicate skipping PPGTT invalidation on + * vGPU reset, it's set on D0->D3 on PCI config write, and cleared after + * vGPU reset if in resuming. + * In S0ix exit, the device power state also transite from D3 to D0 as + * S3 resume, but no vGPU reset (triggered by QEMU devic model). After + * S0ix exit, all engines continue to work. However the d3_entered + * remains set which will break next vGPU reset logic (miss the expected + * PPGTT invalidation). + * Engines can only work in D0. Thus the 1st elsp write gives GVT a + * chance to clear d3_entered. + */ + if (vgpu->d3_entered) + vgpu->d3_entered = false; + execlist = &vgpu->submission.execlist[engine->id];
execlist->elsp_dwords.data[3 - execlist->elsp_dwords.index] = data;
From: Yoshitaka Ikeda ikeda@nskint.co.jp
commit 0e85ee897858b1c7a5de53f496d016899d9639c5 upstream.
Fix below division by zero warning: - The reason for dividing by zero is because the dummy bus width is zero, but if the dummy n bytes is zero, it indicates that there is no data transfer, so we can just return zero without doing any calculations.
[ 0.795337] Division by zero in kernel. : [ 0.834051] [<807fd40c>] (__div0) from [<804e1acc>] (Ldiv0+0x8/0x10) [ 0.839097] [<805f0710>] (cqspi_exec_mem_op) from [<805edb4c>] (spi_mem_exec_op+0x3b0/0x3f8)
Fixes: 7512eaf54190 ("spi: cadence-quadspi: Fix dummy cycle calculation when buswidth > 1") Signed-off-by: Yoshitaka Ikeda ikeda@nskint.co.jp Reviewed-by: Pratyush Yadav p.yadav@ti.com Link: https://lore.kernel.org/r/92eea403-9b21-2488-9cc1-664bee760c5e@nskint.co.jp Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/spi/spi-cadence-quadspi.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/spi/spi-cadence-quadspi.c +++ b/drivers/spi/spi-cadence-quadspi.c @@ -309,6 +309,9 @@ static unsigned int cqspi_calc_dummy(con { unsigned int dummy_clk;
+ if (!op->dummy.nbytes) + return 0; + dummy_clk = op->dummy.nbytes * (8 / op->dummy.buswidth); if (dtr) dummy_clk /= 2;
From: Íñigo Huguet ihuguet@redhat.com
[ Upstream commit 788bc000d4c2f25232db19ab3a0add0ba4e27671 ]
Commit 99ba0ea616aa ("sfc: adjust efx->xdp_tx_queue_count with the real number of initialized queues") intended to fix a problem caused by a round up when calculating the number of XDP channels and queues. However, this was not the real problem. The real problem was that the number of XDP TX queues had been reduced to half in commit e26ca4b53582 ("sfc: reduce the number of requested xdp ev queues"), but the variable xdp_tx_queue_count had remained the same.
Once the correct number of XDP TX queues is created again in the previous patch of this series, this also can be reverted since the error doesn't actually exist.
Only in the case that there is a bug in the code we can have different values in xdp_queue_number and efx->xdp_tx_queue_count. Because of this, and per Edward Cree's suggestion, I add instead a WARN_ON to catch if it happens again in the future.
Note that the number of allocated queues can be higher than the number of used ones due to the round up, as explained in the existing comment in the code. That's why we also have to stop increasing xdp_queue_number beyond efx->xdp_tx_queue_count.
Signed-off-by: Íñigo Huguet ihuguet@redhat.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/sfc/efx_channels.c | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/sfc/efx_channels.c b/drivers/net/ethernet/sfc/efx_channels.c index 5b71f8a03a6d..bb48a139dd15 100644 --- a/drivers/net/ethernet/sfc/efx_channels.c +++ b/drivers/net/ethernet/sfc/efx_channels.c @@ -892,18 +892,20 @@ int efx_set_channels(struct efx_nic *efx) if (efx_channel_is_xdp_tx(channel)) { efx_for_each_channel_tx_queue(tx_queue, channel) { tx_queue->queue = next_queue++; - netif_dbg(efx, drv, efx->net_dev, "Channel %u TXQ %u is XDP %u, HW %u\n", - channel->channel, tx_queue->label, - xdp_queue_number, tx_queue->queue); + /* We may have a few left-over XDP TX * queues owing to xdp_tx_queue_count * not dividing evenly by EFX_MAX_TXQ_PER_CHANNEL. * We still allocate and probe those * TXQs, but never use them. */ - if (xdp_queue_number < efx->xdp_tx_queue_count) + if (xdp_queue_number < efx->xdp_tx_queue_count) { + netif_dbg(efx, drv, efx->net_dev, "Channel %u TXQ %u is XDP %u, HW %u\n", + channel->channel, tx_queue->label, + xdp_queue_number, tx_queue->queue); efx->xdp_tx_queues[xdp_queue_number] = tx_queue; - xdp_queue_number++; + xdp_queue_number++; + } } } else { efx_for_each_channel_tx_queue(tx_queue, channel) { @@ -915,8 +917,7 @@ int efx_set_channels(struct efx_nic *efx) } } } - if (xdp_queue_number) - efx->xdp_tx_queue_count = xdp_queue_number; + WARN_ON(xdp_queue_number != efx->xdp_tx_queue_count);
rc = netif_set_real_num_tx_queues(efx->net_dev, efx->n_tx_channels); if (rc)
Hello!
On 7/26/21 10:36 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.13.6 release. There are 223 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 28 Jul 2021 15:38:12 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.13.6-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.13.y and the diffstat can be found below.
thanks,
greg k-h
Build regressions detected across plenty of architectures and configurations:
/builds/linux/net/core/dev.c: In function 'gro_list_prepare': /builds/linux/net/core/dev.c:5988:72: error: 'TC_SKB_EXT' undeclared (first use in this function) 5988 | struct tc_skb_ext *skb_ext = skb_ext_find(skb, TC_SKB_EXT); | ^~~~~~~~~~ /builds/linux/net/core/dev.c:5988:72: note: each undeclared identifier is reported only once for each function it appears in /builds/linux/net/core/dev.c:5993:47: error: invalid use of undefined type 'struct tc_skb_ext' 5993 | diffs |= p_ext->chain ^ skb_ext->chain; | ^~ /builds/linux/net/core/dev.c:5993:64: error: invalid use of undefined type 'struct tc_skb_ext' 5993 | diffs |= p_ext->chain ^ skb_ext->chain; | ^~ make[3]: *** [/builds/linux/scripts/Makefile.build:273: net/core/dev.o] Error 1 make[3]: Target '__build' not remade because of errors. make[2]: *** [/builds/linux/scripts/Makefile.build:516: net/core] Error 2 make[2]: Target '__build' not remade because of errors. make[1]: *** [/builds/linux/Makefile:1853: net] Error 2 make[1]: Target '__all' not remade because of errors. make: *** [Makefile:220: __sub-make] Error 2
Here's a summary: * arc: 10 total, 4 passed, 6 failed * arm: 193 total, 18 passed, 175 failed * arm64: 27 total, 12 passed, 15 failed * dragonboard-410c: 1 total, 0 passed, 1 failed * hi6220-hikey: 1 total, 0 passed, 1 failed * i386: 26 total, 12 passed, 14 failed * mips: 45 total, 15 passed, 30 failed * parisc: 9 total, 6 passed, 3 failed * powerpc: 27 total, 6 passed, 21 failed * riscv: 21 total, 14 passed, 7 failed * s390: 18 total, 12 passed, 6 failed * sh: 18 total, 6 passed, 12 failed * sparc: 9 total, 6 passed, 3 failed * x15: 1 total, 0 passed, 1 failed * x86: 1 total, 0 passed, 1 failed * x86_64: 27 total, 12 passed, 15 failed
All failed, except allnoconfig, tinyconfig, and a few others here and there.
Same applies for 5.10.
Greetings!
Daniel Díaz daniel.diaz@linaro.org
On Mon, Jul 26, 2021 at 11:43:10AM -0500, Daniel Díaz wrote:
Hello!
On 7/26/21 10:36 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.13.6 release. There are 223 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 28 Jul 2021 15:38:12 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.13.6-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.13.y and the diffstat can be found below.
thanks,
greg k-h
Build regressions detected across plenty of architectures and configurations:
/builds/linux/net/core/dev.c: In function 'gro_list_prepare': /builds/linux/net/core/dev.c:5988:72: error: 'TC_SKB_EXT' undeclared (first use in this function) 5988 | struct tc_skb_ext *skb_ext = skb_ext_find(skb, TC_SKB_EXT); | ^~~~~~~~~~ /builds/linux/net/core/dev.c:5988:72: note: each undeclared identifier is reported only once for each function it appears in /builds/linux/net/core/dev.c:5993:47: error: invalid use of undefined type 'struct tc_skb_ext' 5993 | diffs |= p_ext->chain ^ skb_ext->chain; | ^~ /builds/linux/net/core/dev.c:5993:64: error: invalid use of undefined type 'struct tc_skb_ext' 5993 | diffs |= p_ext->chain ^ skb_ext->chain; | ^~ make[3]: *** [/builds/linux/scripts/Makefile.build:273: net/core/dev.o] Error 1 make[3]: Target '__build' not remade because of errors. make[2]: *** [/builds/linux/scripts/Makefile.build:516: net/core] Error 2 make[2]: Target '__build' not remade because of errors. make[1]: *** [/builds/linux/Makefile:1853: net] Error 2 make[1]: Target '__all' not remade because of errors. make: *** [Makefile:220: __sub-make] Error 2
Here's a summary:
- arc: 10 total, 4 passed, 6 failed
- arm: 193 total, 18 passed, 175 failed
- arm64: 27 total, 12 passed, 15 failed
- dragonboard-410c: 1 total, 0 passed, 1 failed
- hi6220-hikey: 1 total, 0 passed, 1 failed
- i386: 26 total, 12 passed, 14 failed
- mips: 45 total, 15 passed, 30 failed
- parisc: 9 total, 6 passed, 3 failed
- powerpc: 27 total, 6 passed, 21 failed
- riscv: 21 total, 14 passed, 7 failed
- s390: 18 total, 12 passed, 6 failed
- sh: 18 total, 6 passed, 12 failed
- sparc: 9 total, 6 passed, 3 failed
- x15: 1 total, 0 passed, 1 failed
- x86: 1 total, 0 passed, 1 failed
- x86_64: 27 total, 12 passed, 15 failed
All failed, except allnoconfig, tinyconfig, and a few others here and there.
Same applies for 5.10.
Ah, I missed: 9615fe36b31d ("skbuff: Fix build with SKB extensions disabled") will fix that and push out a 5.13.y and 5.10.y update now...
thanks,
greg k-h
On Mon, Jul 26, 2021 at 5:47 PM Daniel Díaz daniel.diaz@linaro.org wrote:
Hello!
On 7/26/21 10:36 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.13.6 release. There are 223 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 28 Jul 2021 15:38:12 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.13.6-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.13.y and the diffstat can be found below.
thanks,
greg k-h
Build regressions detected across plenty of architectures and configurations:
I was going to report the same but just noticed that Greg has pushed out -rc2.
On Mon, Jul 26, 2021 at 06:05:27PM +0100, Sudip Mukherjee wrote:
On Mon, Jul 26, 2021 at 5:47 PM Daniel Díaz daniel.diaz@linaro.org wrote:
Hello!
On 7/26/21 10:36 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.13.6 release. There are 223 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 28 Jul 2021 15:38:12 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.13.6-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.13.y and the diffstat can be found below.
thanks,
greg k-h
Build regressions detected across plenty of architectures and configurations:
I was going to report the same but just noticed that Greg has pushed out -rc2.
I forgot to push it out last night, it's there now, sorry for the delay.
greg k-h
On Mon, Jul 26, 2021 at 05:36:32PM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.13.6 release. There are 223 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 28 Jul 2021 15:38:12 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.13.6-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.13.y and the diffstat can be found below.
thanks,
greg k-h
Tested rc2 against the Fedora build system (aarch64, armv7, ppc64le, s390x, x86_64), and boot tested x86_64. No regressions noted.
Tested-by: Justin M. Forbes jforbes@fedoraproject.org
On 7/26/21 8:36 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.13.6 release. There are 223 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 28 Jul 2021 15:38:12 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.13.6-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.13.y and the diffstat can be found below.
thanks,
greg k-h
On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels, using -rc2:
Tested-by: Florian Fainelli f.fainelli@gmail.com
On 7/26/21 9:36 AM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.13.6 release. There are 223 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Wed, 28 Jul 2021 15:38:12 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.13.6-rc1.... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.13.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah
linux-stable-mirror@lists.linaro.org