This is the start of the stable review cycle for the 6.1.110 release. There are 192 patches in this series, all of which will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 12 Sep 2024 09:25:22 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at:
	https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.110-rc1...
or in the git tree and branch at:
	git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y
and the diffstat can be found below.
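For anyone build-testing the candidate, a minimal sketch of fetching and compiling the review tree; the git URL and the linux-6.1.y branch are the ones listed above, while the config and make steps below are only one common choice and will need adjusting to your own environment:

	git clone --branch linux-6.1.y \
	    git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
	cd linux-stable-rc
	cp /boot/config-"$(uname -r)" .config   # reuse the running kernel's config, or start from "make defconfig"
	make olddefconfig
	make -j"$(nproc)"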
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 6.1.110-rc1
Miklos Szeredi mszeredi@redhat.com fuse: add feature flag for expire-only
Peng Wu wupeng58@huawei.com regulator: of: fix a NULL vs IS_ERR() check in of_regulator_bulk_get_all()
Shakeel Butt shakeel.butt@linux.dev memcg: protect concurrent access to mem_cgroup_idr
Yonghong Song yhs@fb.com bpf: Silence a warning in btf_type_id_size()
Filipe Manana fdmanana@suse.com btrfs: fix race between direct IO write and fsync when using same fd
Thomas Gleixner tglx@linutronix.de x86/mm: Fix PTI for i386 some more
Li Nan linan122@huawei.com ublk_drv: fix NULL pointer dereference in ublk_ctrl_start_recovery()
Liao Chen liaochen4@huawei.com gpio: modepin: Enable module autoloading
Krzysztof Kozlowski krzysztof.kozlowski@linaro.org gpio: rockchip: fix OF node leak in probe()
Andy Shevchenko andriy.shevchenko@linux.intel.com drm/i915/fence: Mark debug_fence_free() with __maybe_unused
Andy Shevchenko andriy.shevchenko@linux.intel.com drm/i915/fence: Mark debug_fence_init_onstack() with __maybe_unused
Matteo Martelli matteomartelli3@gmail.com ASoC: sunxi: sun4i-i2s: fix LRCLK polarity in i2s mode
Chen-Yu Tsai wenst@chromium.org ASoc: SOF: topology: Clear SOF link platform name upon unload
Maurizio Lombardi mlombard@redhat.com nvmet-tcp: fix kernel crash if commands allocation fails
Mohan Kumar mkumard@nvidia.com ASoC: tegra: Fix CBB error during probe()
Christophe Leroy christophe.leroy@csgroup.eu powerpc/64e: Define mmu_pte_psize static
Michael Ellerman mpe@ellerman.id.au powerpc/64e: split out nohash Book3E 64-bit code
Michael Ellerman mpe@ellerman.id.au powerpc/64e: remove unused IBM HTW code
Marek Olšák marek.olsak@amd.com drm/amdgpu: handle gfx12 in amdgpu_display_verify_sizes
Aurabindo Pillai aurabindo.pillai@amd.com drm/amd: Add gfx12 swizzle mode defs
Marc Kleine-Budde mkl@pengutronix.de can: mcp251xfd: rx: add workaround for erratum DS80000789E 6 of mcp2518fd
Marc Kleine-Budde mkl@pengutronix.de can: mcp251xfd: clarify the meaning of timestamp
Marc Kleine-Budde mkl@pengutronix.de can: mcp251xfd: rx: prepare to workaround broken RX FIFO head index erratum
Marc Kleine-Budde mkl@pengutronix.de can: mcp251xfd: mcp251xfd_handle_rxif_ring_uinc(): factor out in separate function
Jonathan Cameron Jonathan.Cameron@huawei.com arm64: acpi: Harden get_cpu_for_acpi_id() against missing CPU entry
James Morse james.morse@arm.com arm64: acpi: Move get_cpu_for_acpi_id() to a header
Jonathan Cameron Jonathan.Cameron@huawei.com ACPI: processor: Fix memory leaks in error paths of processor_add()
Jonathan Cameron Jonathan.Cameron@huawei.com ACPI: processor: Return an error if acpi_processor_get_info() fails in processor_add()
Nicholas Piggin npiggin@gmail.com workqueue: Improve scalability of workqueue watchdog touch
Nicholas Piggin npiggin@gmail.com workqueue: wq_watchdog_touch is always called with valid CPU
Souradeep Chakrabarti schakrabarti@linux.microsoft.com net: mana: Fix error handling in mana_create_txq/rxq's NAPI cleanup
Jann Horn jannh@google.com userfaultfd: fix checks for huge PMDs
Peter Zijlstra peterz@infradead.org mm: Rename pmd_read_atomic()
Peter Zijlstra peterz@infradead.org mm: Fix pmd_read_atomic()
Peter Zijlstra peterz@infradead.org x86/mm/pae: Make pmd_t similar to pte_t
yangyun yangyun50@huawei.com fuse: fix memory leak in fuse_create_open
Miklos Szeredi mszeredi@redhat.com fuse: add request extension
Dharmendra Singh dsingh@ddn.com fuse: allow non-extending parallel direct writes on the same file
Miklos Szeredi mszeredi@redhat.com fuse: add "expire only" mode to FUSE_NOTIFY_INVAL_ENTRY
Boqun Feng boqun.feng@gmail.com rust: macros: provide correct provenance when constructing THIS_MODULE
Peter Zijlstra peterz@infradead.org perf/aux: Fix AUX buffer serialization
Sven Schnelle svens@linux.ibm.com uprobes: Use kzalloc to allocate xol area
Daniel Lezcano daniel.lezcano@linaro.org clocksource/drivers/timer-of: Remove percpu irq related code
Jacky Bai ping.bai@nxp.com clocksource/drivers/imx-tpm: Fix next event not taking effect sometime
Jacky Bai ping.bai@nxp.com clocksource/drivers/imx-tpm: Fix return -ETIME when delta exceeds INT_MAX
David Fernandez Gonzalez david.fernandez.gonzalez@oracle.com VMCI: Fix use-after-free when removing resource in vmci_resource_remove()
Naman Jain namjain@linux.microsoft.com Drivers: hv: vmbus: Fix rescind handling in uio_hv_generic
Saurabh Sengar ssengar@linux.microsoft.com uio_hv_generic: Fix kernel NULL pointer dereference in hv_uio_rescind
Geert Uytterhoeven geert+renesas@glider.be nvmem: Fix return type of devm_nvmem_device_get() in kerneldoc
Carlos Llamas cmllamas@google.com binder: fix UAF caused by offsets overwrite
Faisal Hassan quic_faisalh@quicinc.com usb: dwc3: core: update LC timer as per USB Spec V3.2
Dumitru Ceclan mitrutzceclan@gmail.com iio: adc: ad7124: fix chip ID mismatch
Guillaume Stols gstols@baylibre.com iio: adc: ad7606: remove frstdata check for serial mode
Dumitru Ceclan mitrutzceclan@gmail.com iio: adc: ad7124: fix config comparison
Matteo Martelli matteomartelli3@gmail.com iio: fix scale application in iio_convert_raw_to_processed_unlocked
David Lechner dlechner@baylibre.com iio: buffer-dmaengine: fix releasing dma channel on error
Aleksandr Mishin amishin@t-argos.ru staging: iio: frequency: ad9834: Validate frequency parameter value
Matthieu Baerts (NGI0) matttbe@kernel.org tcp: process the 3rd ACK with sk_socket for TFO/MPTCP
Michal Koutný mkoutny@suse.com io_uring/sqpoll: Do not set PF_NO_SETAFFINITY on sqpoll threads
Jens Axboe axboe@kernel.dk io_uring/io-wq: stop setting PF_NO_SETAFFINITY on io-wq workers
Matthieu Baerts (NGI0) matttbe@kernel.org selftests: mptcp: join: check re-re-adding ID 0 signal
Matthieu Baerts (NGI0) matttbe@kernel.org selftests: mptcp: join: validate event numbers
Matthieu Baerts (NGI0) matttbe@kernel.org selftests: mptcp: fix backport issues
Trond Myklebust trond.myklebust@hammerspace.com NFSv4: Add missing rescheduling points in nfs_client_return_marked_delegations
Michael Ellerman mpe@ellerman.id.au ata: pata_macio: Use WARN instead of BUG
Jiaxun Yang jiaxun.yang@flygoat.com MIPS: cevt-r4k: Don't call get_c0_compare_int if timer irq is installed
Kent Overstreet kent.overstreet@linux.dev lib/generic-radix-tree.c: Fix rare race in __genradix_ptr_alloc()
Stefan Wiehler stefan.wiehler@nokia.com of/irq: Prevent device address out-of-bounds read in interrupt map walk
Phillip Lougher phillip@squashfs.org.uk Squashfs: sanity check symbolic link size
Oliver Neukum oneukum@suse.com usbnet: ipheth: race between ipheth_close and error handling
Dmitry Torokhov dmitry.torokhov@gmail.com Input: uinput - reject requests with unreasonable number of slots
Olivier Sobrie olivier@sobrie.be HID: amd_sfh: free driver_data after destroying hid device
Camila Alvarez cam.alvarez.i@gmail.com HID: cougar: fix slab-out-of-bounds Read in cougar_report_fixup
Heiko Carstens hca@linux.ibm.com s390/vmlinux.lds.S: Move ro_after_init section behind rodata section
David Sterba dsterba@suse.com btrfs: initialize location to fix -Wmaybe-uninitialized in btrfs_lookup_dentry()
Zenghui Yu yuzenghui@huawei.com kselftests: dmabuf-heaps: Ensure the driver name is null-terminated
Jarkko Nikula jarkko.nikula@linux.intel.com i3c: mipi-i3c-hci: Error out instead on BUG_ON() in IBI DMA setup
Vladimir Oltean vladimir.oltean@nxp.com net: dpaa: avoid on-stack arrays of NR_CPUS elements
Kuniyuki Iwashima kuniyu@amazon.com tcp: Don't drop SYN+ACK for simultaneous connect().
Dan Williams dan.j.williams@intel.com PCI: Add missing bridge lock to pci_bus_lock()
yang.zhang yang.zhang@hexintek.com riscv: set trap vector earlier
Filipe Manana fdmanana@suse.com btrfs: replace BUG_ON() with error handling at update_ref_for_cow()
Josef Bacik josef@toxicpanda.com btrfs: clean up our handling of refs == 0 in snapshot delete
Josef Bacik josef@toxicpanda.com btrfs: replace BUG_ON with ASSERT in walk_down_proc()
Konstantin Komarov almaz.alexandrovich@paragon-software.com fs/ntfs3: Check more cases when directory is corrupted
Zqiang qiang.zhang1211@gmail.com smp: Add missing destroy_work_on_stack() call in smp_call_on_cpu()
Sascha Hauer s.hauer@pengutronix.de wifi: mwifiex: Do not return unused priv in mwifiex_get_priv_by_id()
Yicong Yang yangyicong@hisilicon.com dma-mapping: benchmark: Don't starve others when doing the test
Luis Henriques (SUSE) luis.henriques@linux.dev ext4: fix possible tid_t sequence overflows
Yifan Zha Yifan.Zha@amd.com drm/amdgpu: Set no_hw_access when VF request full GPU fails
Andreas Ziegler ziegler.andreas@siemens.com libbpf: Add NULL checks to bpf_object__{prev_map,next_map}
Guenter Roeck linux@roeck-us.net hwmon: (w83627ehf) Fix underflows seen when writing limit attributes
Guenter Roeck linux@roeck-us.net hwmon: (nct6775-core) Fix underflows seen when writing limit attributes
Guenter Roeck linux@roeck-us.net hwmon: (lm95234) Fix underflows seen when writing limit attributes
Guenter Roeck linux@roeck-us.net hwmon: (adc128d818) Fix underflows seen when writing limit attributes
Hareshx Sankar Raj hareshx.sankar.raj@intel.com crypto: qat - fix unintentional re-enabling of error interrupts
Krishna Kumar krishnak@linux.ibm.com pci/hotplug/pnv_php: Fix hotplug driver crash on Powernv
Zijun Hu quic_zijuhu@quicinc.com devres: Initialize an uninitialized struct member
Johannes Berg johannes.berg@intel.com um: line: always fill *error_out in setup_one_line()
Waiman Long longman@redhat.com cgroup: Protect css->cgroup write under css_set_lock
Jacob Pan jacob.jun.pan@linux.intel.com iommu/vt-d: Handle volatile descriptor status read
Benjamin Marzinski bmarzins@redhat.com dm init: Handle minors larger than 255
Amadeusz Sławiński amadeuszx.slawinski@linux.intel.com ASoC: topology: Properly initialize soc_enum values
Sean Anderson sean.anderson@linux.dev phy: zynqmp: Take the phy mutex in xlate
Richard Fitzgerald rf@opensource.cirrus.com firmware: cs_dsp: Don't allow writes to read-only controls
Pawel Dembicki paweldembicki@gmail.com net: dsa: vsc73xx: fix possible subblocks range of CAPT block
Jonas Gorski jonas.gorski@bisdn.de net: bridge: br_fdb_external_learn_add(): always set EXT_LEARN
Kuniyuki Iwashima kuniyu@amazon.com fou: Fix null-ptr-deref in GRO.
Guillaume Nault gnault@redhat.com bareudp: Fix device stats updates.
Oliver Neukum oneukum@suse.com usbnet: modern method to get random MAC
Larysa Zaremba larysa.zaremba@intel.com ice: do not bring the VSI up, if it was down before the XDP setup
Maciej Fijalkowski maciej.fijalkowski@intel.com ice: allow hot-swapping XDP programs
Maciej Fijalkowski maciej.fijalkowski@intel.com ice: Use ice_max_xdp_frame_size() in ice_xdp_setup_prog()
Dan Carpenter dan.carpenter@linaro.org igc: Unlock on error in igc_io_resume()
Douglas Anderson dianders@chromium.org regulator: core: Stub devm_regulator_bulk_get_const() if !CONFIG_REGULATOR
Corentin Labbe clabbe@baylibre.com regulator: Add of_regulator_bulk_get_all
Aleksandr Mishin amishin@t-argos.ru platform/x86: dell-smbios: Fix error path in dell_smbios_init()
Dawid Osuchowski dawid.osuchowski@linux.intel.com ice: Add netif_device_attach/detach into PF reset flow
Daiwei Li daiweili@google.com igb: Fix not clearing TimeSync interrupts for 82580
David Howells dhowells@redhat.com cifs: Fix FALLOC_FL_ZERO_RANGE to preflush buffered part of target region
Andreas Hindborg a.hindborg@samsung.com rust: kbuild: fix export of bss symbols
Matthew Maurer mmaurer@google.com rust: Use awk instead of recent xargs
Marc Kleine-Budde mkl@pengutronix.de can: mcp251xfd: fix ring configuration when switching from CAN-CC to CAN-FD mode
Simon Horman horms@kernel.org can: m_can: Release irq on error in m_can_open
Kuniyuki Iwashima kuniyu@amazon.com can: bcm: Remove proc entry when dev is unregistered.
Marek Olšák marek.olsak@amd.com drm/amdgpu: check for LINEAR_ALIGNED correctly in check_tiling_flags_gfx6
Alex Hung alex.hung@amd.com drm/amd/display: Check denominator pbn_div before used
Jules Irenge jbi.octave@gmail.com pcmcia: Use resource_size function on resource object
Chen Ni nichen@iscas.ac.cn media: qcom: camss: Add check for v4l2_fwnode_endpoint_parse
Dmitry Torokhov dmitry.torokhov@gmail.com Input: ili210x - use kvmalloc() to allocate buffer for firmware update
Kishon Vijay Abraham I kishon@ti.com PCI: keystone: Add workaround for Errata #i2037 (AM65x SR 1.0)
Hans Verkuil hverkuil-cisco@xs4all.nl media: vivid: don't set HDMI TX controls if there are no HDMI outputs
Danijel Slivka danijel.slivka@amd.com drm/amdgpu: clear RB_OVERFLOW bit when enabling interrupts
Hawking Zhang Hawking.Zhang@amd.com drm/amdgpu: Fix smatch static checker warning
Alex Hung alex.hung@amd.com drm/amd/display: Check HDCP returned status
Ma Ke make24@iscas.ac.cn usb: gadget: aspeed_udc: validate endpoint index for ast udc
Shantanu Goel sgoel01@yahoo.com usb: uas: set host status byte on data completion error
Arend van Spriel arend.vanspriel@broadcom.com wifi: brcmsmac: advertise MFP_CAPABLE to enable WPA3
Andy Shevchenko andriy.shevchenko@linux.intel.com leds: spi-byte: Call of_node_put() on error path
Hans Verkuil hverkuil-cisco@xs4all.nl media: vivid: fix wrong sizeimage value for mplane
Konstantin Komarov almaz.alexandrovich@paragon-software.com fs/ntfs3: One more reason to mark inode bad
Jan Kara jack@suse.cz udf: Avoid excessive partition lengths
Yunjian Wang wangyunjian@huawei.com netfilter: nf_conncount: fix wrong variable type
Jernej Skrabec jernej.skrabec@gmail.com iommu: sun50i: clear bypass register
Brian Johannesmeyer bjohannesmeyer@gmail.com x86/kmsan: Fix hook for unaligned accesses
Kuniyuki Iwashima kuniyu@amazon.com af_unix: Remove put_pid()/put_cred() in copy_peercred().
Pali Rohár pali@kernel.org irqchip/armada-370-xp: Do not allow mapping IRQ 0 and 1
Alexey Dobriyan adobriyan@gmail.com ELF: fix kernel.randomize_va_space double read
Konstantin Andreev andreev@swemel.ru smack: unix sockets: fix accept()ed socket label
Takashi Iwai tiwai@suse.de ALSA: hda: Add input value sanity checks to HDMI channel map controls
Takashi Iwai tiwai@suse.de ALSA: control: Apply sanity check of input values for user elements
Ryusuke Konishi konishi.ryusuke@gmail.com nilfs2: fix state management in error path of log writing function
Ryusuke Konishi konishi.ryusuke@gmail.com nilfs2: protect references to superblock parameters exposed in sysfs
Ryusuke Konishi konishi.ryusuke@gmail.com nilfs2: fix missing cleanup on rollforward recovery error
Toke Høiland-Jørgensen toke@redhat.com sched: sch_cake: fix bulk flow accounting logic for host fairness
Eric Dumazet edumazet@google.com ila: call nf_unregister_net_hooks() sooner
Cong Wang cong.wang@bytedance.com tcp_bpf: fix return value of tcp_bpf_sendmsg()
Alex Deucher alexander.deucher@amd.com Revert "drm/amdgpu: align pp_power_profile_mode with kernel docs"
Mitchell Levy levymitchell0@gmail.com x86/fpu: Avoid writing LBR bit to IA32_XSS unless supported
Matt Johnston matt@codeconstruct.com.au net: mctp-serial: Fix missing escapes on transmit
Zheng Yejian zhengyejian@huaweicloud.com tracing: Avoid possible softlockup in tracing_iter_reset()
Brian Norris briannorris@chromium.org spi: rockchip: Resolve unbalanced runtime PM / system PM handling
Simon Arlott simon@octiron.net can: mcp251x: fix deadlock if an interrupt occurs during mcp251x_open
Satya Priya Kakitapalli quic_skakitap@quicinc.com clk: qcom: clk-alpha-pll: Update set_rate for Zonda PLL
Satya Priya Kakitapalli quic_skakitap@quicinc.com clk: qcom: clk-alpha-pll: Fix zonda set_rate failure when PLL is disabled
Satya Priya Kakitapalli quic_skakitap@quicinc.com clk: qcom: clk-alpha-pll: Fix the trion pll postdiv set rate API
Satya Priya Kakitapalli quic_skakitap@quicinc.com clk: qcom: clk-alpha-pll: Fix the pll post div mask
Jann Horn jannh@google.com fuse: use unsigned type for getxattr/listxattr size truncation
Joanne Koong joannelkoong@gmail.com fuse: update stats for pages in dropped aux writeback list
Seunghwan Baek sh8267.baek@samsung.com mmc: cqhci: Fix checking of CQHCI_HALT state
Liao Chen liaochen4@huawei.com mmc: sdhci-of-aspeed: fix module autoloading
Sam Protsenko semen.protsenko@linaro.org mmc: dw_mmc: Fix IDMAC operation with pages bigger than 4K
Jonathan Bell jonathan@raspberrypi.com mmc: core: apply SD quirks earlier during probe
Luiz Augusto von Dentz luiz.von.dentz@intel.com Bluetooth: MGMT: Ignore keys being loaded with invalid type
Luiz Augusto von Dentz luiz.von.dentz@intel.com Revert "Bluetooth: MGMT/SMP: Fix address type when using SMP over BREDR/LE"
Georg Gottleuber ggo@tuxedocomputers.com nvme-pci: Add sleep quirk for Samsung 990 Evo
Roland Xu mu001999@outlook.com rtmutex: Drop rt_mutex::wait_lock before scheduling
Thomas Gleixner tglx@linutronix.de x86/kaslr: Expose and use the end of the physical memory address space
Ma Ke make24@iscas.ac.cn irqchip/gic-v2m: Fix refcount leak in gicv2m_of_init()
Kan Liang kan.liang@linux.intel.com perf/x86/intel: Limit the period on Haswell
Kirill A. Shutemov kirill.shutemov@linux.intel.com x86/tdx: Fix data leak in mmio_read()
Zheng Qixing zhengqixing@huawei.com ata: libata: Fix memory leak for error path in ata_host_alloc()
Dan Carpenter dan.carpenter@linaro.org ksmbd: Unlock on in ksmbd_tcp_set_interfaces()
Namjae Jeon linkinjeon@kernel.org ksmbd: unset the binding mark of a reused connection
Maximilien Perreault maximilienperreault@gmail.com ALSA: hda/realtek: Support mute LED on HP Laptop 14-dq2xxx
Terry Cheong htcheong@chromium.org ALSA: hda/realtek: add patch for internal mic in Lenovo V145
Christoffer Sandberg cs@tuxedo.de ALSA: hda/conexant: Add pincfg quirk to enable top speakers on Sirius devices
Ravi Bangoria ravi.bangoria@amd.com KVM: SVM: Don't advertise Bus Lock Detect to guest if SVM support is missing
Maxim Levitsky mlevitsk@redhat.com KVM: SVM: fix emulation of msr reads/writes of MSR_FS_BASE and MSR_GS_BASE
Sean Christopherson seanjc@google.com KVM: x86: Acquire kvm->srcu when handling KVM_SET_VCPU_EVENTS
robelin robelin@nvidia.com ASoC: dapm: Fix UAF for snd_soc_pcm_runtime object
Stephen Hemminger stephen@networkplumber.org sch/netem: fix use after free in netem_dequeue
-------------
Diffstat:
Makefile | 4 +- arch/arm64/include/asm/acpi.h | 12 + arch/arm64/kernel/acpi_numa.c | 11 - arch/mips/kernel/cevt-r4k.c | 15 +- arch/powerpc/include/asm/nohash/mmu-e500.h | 3 +- arch/powerpc/mm/nohash/Makefile | 2 +- arch/powerpc/mm/nohash/tlb.c | 398 +-------------------- arch/powerpc/mm/nohash/tlb_64e.c | 361 +++++++++++++++++++ arch/powerpc/mm/nohash/tlb_low_64e.S | 195 ---------- arch/riscv/kernel/head.S | 3 + arch/s390/kernel/vmlinux.lds.S | 9 + arch/um/drivers/line.c | 2 + arch/x86/coco/tdx/tdx.c | 1 - arch/x86/events/intel/core.c | 23 +- arch/x86/include/asm/fpu/types.h | 7 + arch/x86/include/asm/page_64.h | 1 + arch/x86/include/asm/pgtable-3level.h | 96 +---- arch/x86/include/asm/pgtable-3level_types.h | 7 + arch/x86/include/asm/pgtable_64_types.h | 5 + arch/x86/include/asm/pgtable_types.h | 4 +- arch/x86/kernel/fpu/xstate.c | 3 + arch/x86/kernel/fpu/xstate.h | 4 +- arch/x86/kvm/svm/svm.c | 15 + arch/x86/kvm/x86.c | 2 + arch/x86/lib/iomem.c | 5 +- arch/x86/mm/init_64.c | 4 + arch/x86/mm/kaslr.c | 34 +- arch/x86/mm/pti.c | 45 ++- drivers/acpi/acpi_processor.c | 15 +- drivers/android/binder.c | 1 + drivers/ata/libata-core.c | 4 +- drivers/ata/pata_macio.c | 7 +- drivers/base/devres.c | 1 + drivers/block/ublk_drv.c | 2 + drivers/clk/qcom/clk-alpha-pll.c | 25 +- drivers/clocksource/timer-imx-tpm.c | 16 +- drivers/clocksource/timer-of.c | 17 +- drivers/clocksource/timer-of.h | 1 - drivers/crypto/qat/qat_common/adf_gen2_pfvf.c | 4 +- .../crypto/qat/qat_dh895xcc/adf_dh895xcc_hw_data.c | 8 +- drivers/firmware/cirrus/cs_dsp.c | 3 + drivers/gpio/gpio-rockchip.c | 1 + drivers/gpio/gpio-zynqmp-modepin.c | 1 + drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 30 +- drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 4 +- drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 8 +- drivers/gpu/drm/amd/amdgpu/ih_v6_0.c | 28 ++ drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 +- .../drm/amd/display/modules/hdcp/hdcp1_execution.c | 15 +- drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 6 +- drivers/gpu/drm/i915/i915_sw_fence.c | 8 +- drivers/hid/amd-sfh-hid/amd_sfh_hid.c | 4 +- drivers/hid/hid-cougar.c | 2 +- drivers/hv/vmbus_drv.c | 1 + drivers/hwmon/adc128d818.c | 4 +- drivers/hwmon/lm95234.c | 9 +- drivers/hwmon/nct6775-core.c | 2 +- drivers/hwmon/w83627ehf.c | 4 +- drivers/i3c/master/mipi-i3c-hci/dma.c | 5 +- drivers/iio/adc/ad7124.c | 27 +- drivers/iio/adc/ad7606.c | 28 +- drivers/iio/adc/ad7606.h | 2 + drivers/iio/adc/ad7606_par.c | 46 ++- drivers/iio/buffer/industrialio-buffer-dmaengine.c | 4 +- drivers/iio/inkern.c | 8 +- drivers/input/misc/uinput.c | 14 + drivers/input/touchscreen/ili210x.c | 6 +- drivers/iommu/intel/dmar.c | 2 +- drivers/iommu/sun50i-iommu.c | 1 + drivers/irqchip/irq-armada-370-xp.c | 4 + drivers/irqchip/irq-gic-v2m.c | 6 +- drivers/leds/leds-spi-byte.c | 6 +- drivers/md/dm-init.c | 4 +- drivers/media/platform/qcom/camss/camss.c | 5 +- drivers/media/test-drivers/vivid/vivid-vid-cap.c | 17 +- drivers/media/test-drivers/vivid/vivid-vid-out.c | 16 +- drivers/misc/vmw_vmci/vmci_resource.c | 3 +- drivers/mmc/core/quirks.h | 22 +- drivers/mmc/core/sd.c | 4 + drivers/mmc/host/cqhci-core.c | 2 +- drivers/mmc/host/dw_mmc.c | 4 +- drivers/mmc/host/sdhci-of-aspeed.c | 1 + drivers/net/bareudp.c | 22 +- drivers/net/can/m_can/m_can.c | 5 +- drivers/net/can/spi/mcp251x.c | 2 +- drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c | 28 +- drivers/net/can/spi/mcp251xfd/mcp251xfd-ram.c | 11 +- drivers/net/can/spi/mcp251xfd/mcp251xfd-ring.c | 23 +- drivers/net/can/spi/mcp251xfd/mcp251xfd-rx.c | 165 ++++++--- 
drivers/net/can/spi/mcp251xfd/mcp251xfd-tef.c | 2 +- .../net/can/spi/mcp251xfd/mcp251xfd-timestamp.c | 22 +- drivers/net/can/spi/mcp251xfd/mcp251xfd.h | 42 ++- drivers/net/dsa/vitesse-vsc73xx-core.c | 10 +- drivers/net/ethernet/freescale/dpaa/dpaa_eth.c | 20 +- drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c | 10 +- drivers/net/ethernet/intel/ice/ice_main.c | 61 ++-- drivers/net/ethernet/intel/igb/igb_main.c | 10 + drivers/net/ethernet/intel/igc/igc_main.c | 1 + drivers/net/ethernet/microsoft/mana/mana.h | 2 + drivers/net/ethernet/microsoft/mana/mana_en.c | 22 +- drivers/net/mctp/mctp-serial.c | 4 +- drivers/net/usb/ipheth.c | 2 +- drivers/net/usb/usbnet.c | 11 +- .../broadcom/brcm80211/brcmsmac/mac80211_if.c | 1 + drivers/net/wireless/marvell/mwifiex/main.h | 3 + drivers/nvme/host/pci.c | 11 + drivers/nvme/target/tcp.c | 4 +- drivers/nvmem/core.c | 6 +- drivers/of/irq.c | 15 +- drivers/pci/controller/dwc/pci-keystone.c | 44 ++- drivers/pci/hotplug/pnv_php.c | 3 +- drivers/pci/pci.c | 35 +- drivers/pcmcia/yenta_socket.c | 6 +- drivers/phy/xilinx/phy-zynqmp.c | 1 + drivers/platform/x86/dell/dell-smbios-base.c | 5 +- drivers/regulator/of_regulator.c | 92 +++++ drivers/spi/spi-rockchip.c | 23 +- drivers/staging/iio/frequency/ad9834.c | 2 +- drivers/uio/uio_hv_generic.c | 11 +- drivers/usb/dwc3/core.c | 15 + drivers/usb/dwc3/core.h | 2 + drivers/usb/gadget/udc/aspeed_udc.c | 2 + drivers/usb/storage/uas.c | 1 + fs/binfmt_elf.c | 5 +- fs/btrfs/ctree.c | 12 +- fs/btrfs/ctree.h | 1 - fs/btrfs/extent-tree.c | 32 +- fs/btrfs/file.c | 25 +- fs/btrfs/inode.c | 2 +- fs/btrfs/transaction.h | 6 + fs/ext4/fast_commit.c | 8 +- fs/fuse/dev.c | 6 +- fs/fuse/dir.c | 72 ++-- fs/fuse/file.c | 51 ++- fs/fuse/fuse_i.h | 8 +- fs/fuse/inode.c | 3 +- fs/fuse/xattr.c | 4 +- fs/nfs/super.c | 2 + fs/nilfs2/recovery.c | 35 +- fs/nilfs2/segment.c | 10 +- fs/nilfs2/sysfs.c | 43 ++- fs/ntfs3/dir.c | 52 +-- fs/ntfs3/frecord.c | 4 +- fs/smb/client/smb2ops.c | 16 +- fs/smb/server/smb2pdu.c | 4 + fs/smb/server/transport_tcp.c | 4 +- fs/squashfs/inode.c | 7 +- fs/udf/super.c | 15 + include/linux/mm.h | 4 + include/linux/pgtable.h | 54 ++- include/linux/regulator/consumer.h | 16 + include/net/bluetooth/hci_core.h | 5 - include/uapi/drm/drm_fourcc.h | 18 + include/uapi/linux/fuse.h | 46 ++- io_uring/io-wq.c | 16 +- io_uring/sqpoll.c | 1 - kernel/bpf/btf.c | 13 +- kernel/cgroup/cgroup.c | 2 +- kernel/dma/map_benchmark.c | 16 + kernel/events/core.c | 18 +- kernel/events/internal.h | 1 + kernel/events/ring_buffer.c | 2 + kernel/events/uprobes.c | 3 +- kernel/locking/rtmutex.c | 9 +- kernel/resource.c | 6 +- kernel/smp.c | 1 + kernel/trace/trace.c | 2 + kernel/workqueue.c | 14 +- lib/generic-radix-tree.c | 2 + mm/hmm.c | 2 +- mm/khugepaged.c | 2 +- mm/mapping_dirty_helpers.c | 2 +- mm/memcontrol.c | 22 +- mm/memory_hotplug.c | 2 +- mm/mprotect.c | 2 +- mm/sparse.c | 2 +- mm/userfaultfd.c | 24 +- mm/vmscan.c | 4 +- net/bluetooth/mgmt.c | 60 ++-- net/bluetooth/smp.c | 7 - net/bridge/br_fdb.c | 6 +- net/can/bcm.c | 4 + net/ipv4/fou.c | 29 +- net/ipv4/tcp_bpf.c | 2 +- net/ipv4/tcp_input.c | 6 + net/ipv6/ila/ila.h | 1 + net/ipv6/ila/ila_main.c | 6 + net/ipv6/ila/ila_xlat.c | 13 +- net/netfilter/nf_conncount.c | 8 +- net/sched/sch_cake.c | 11 +- net/sched/sch_netem.c | 9 +- net/unix/af_unix.c | 9 +- rust/Makefile | 4 +- rust/macros/module.rs | 6 +- security/smack/smack_lsm.c | 12 +- sound/core/control.c | 6 +- sound/hda/hdmi_chmap.c | 18 + sound/pci/hda/patch_conexant.c | 11 + sound/pci/hda/patch_realtek.c | 10 + sound/soc/soc-dapm.c | 
1 + sound/soc/soc-topology.c | 2 + sound/soc/sof/topology.c | 2 + sound/soc/sunxi/sun4i-i2s.c | 143 ++++---- sound/soc/tegra/tegra210_ahub.c | 12 +- tools/lib/bpf/libbpf.c | 4 +- tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c | 4 +- tools/testing/selftests/net/mptcp/mptcp_join.sh | 234 ++++++++---- tools/testing/selftests/net/mptcp/mptcp_lib.sh | 15 + 208 files changed, 2369 insertions(+), 1522 deletions(-)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stephen Hemminger stephen@networkplumber.org
commit 3b3a2a9c6349e25a025d2330f479bc33a6ccb54a upstream.
If netem_dequeue() enqueues a packet to the inner qdisc and that qdisc returns __NET_XMIT_STOLEN, the packet is dropped but qdisc_tree_reduce_backlog() is not called to update the parent's q.qlen, leading to a use-after-free similar to the one fixed by commit e04991a48dbaf382 ("netem: fix return value if duplicate enqueue fails").
Commands to trigger KASAN UaF:
ip link add type dummy
ip link set lo up
ip link set dummy0 up
tc qdisc add dev lo parent root handle 1: drr
tc filter add dev lo parent 1: basic classid 1:1
tc class add dev lo classid 1:1 drr
tc qdisc add dev lo parent 1:1 handle 2: netem
tc qdisc add dev lo parent 2: handle 3: drr
tc filter add dev lo parent 3: basic classid 3:1 action mirred egress redirect dev dummy0
tc class add dev lo classid 3:1 drr
ping -c1 -W0.01 localhost # Trigger bug
tc class del dev lo classid 1:1
tc class add dev lo classid 1:1 drr
ping -c1 -W0.01 localhost # UaF
Fixes: 50612537e9ab ("netem: fix classful handling")
Reported-by: Budimir Markovic markovicbudimir@gmail.com
Signed-off-by: Stephen Hemminger stephen@networkplumber.org
Link: https://patch.msgid.link/20240901182438.4992-1-stephen@networkplumber.org
Signed-off-by: Jakub Kicinski kuba@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 net/sched/sch_netem.c | 9 ++++-----
 1 file changed, 4 insertions(+), 5 deletions(-)
--- a/net/sched/sch_netem.c
+++ b/net/sched/sch_netem.c
@@ -733,11 +733,10 @@ deliver:

 			err = qdisc_enqueue(skb, q->qdisc, &to_free);
 			kfree_skb_list(to_free);
-			if (err != NET_XMIT_SUCCESS &&
-			    net_xmit_drop_count(err)) {
-				qdisc_qstats_drop(sch);
-				qdisc_tree_reduce_backlog(sch, 1,
-							  pkt_len);
+			if (err != NET_XMIT_SUCCESS) {
+				if (net_xmit_drop_count(err))
+					qdisc_qstats_drop(sch);
+				qdisc_tree_reduce_backlog(sch, 1, pkt_len);
 			}
 			goto tfifo_dequeue;
 		}
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: robelin robelin@nvidia.com
commit b4a90b543d9f62d3ac34ec1ab97fc5334b048565 upstream.
When using a kernel with the following extra config options,

- CONFIG_KASAN=y
- CONFIG_KASAN_GENERIC=y
- CONFIG_KASAN_INLINE=y
- CONFIG_KASAN_VMALLOC=y
- CONFIG_FRAME_WARN=4096
the kernel detects that snd_pcm_suspend_all() accesses a freed 'snd_soc_pcm_runtime' object when the system is suspended, which leads to a use-after-free bug:
[ 52.047746] BUG: KASAN: use-after-free in snd_pcm_suspend_all+0x1a8/0x270
[ 52.047765] Read of size 1 at addr ffff0000b9434d50 by task systemd-sleep/2330
[ 52.047785] Call trace:
[ 52.047787]  dump_backtrace+0x0/0x3c0
[ 52.047794]  show_stack+0x34/0x50
[ 52.047797]  dump_stack_lvl+0x68/0x8c
[ 52.047802]  print_address_description.constprop.0+0x74/0x2c0
[ 52.047809]  kasan_report+0x210/0x230
[ 52.047815]  __asan_report_load1_noabort+0x3c/0x50
[ 52.047820]  snd_pcm_suspend_all+0x1a8/0x270
[ 52.047824]  snd_soc_suspend+0x19c/0x4e0
snd_pcm_sync_stop() has a NULL check on 'substream->runtime' before making any access, so we need to always set 'substream->runtime' to NULL every time we kfree() it.
Fixes: a72706ed8208 ("ASoC: codec2codec: remove ephemeral variables")
Signed-off-by: robelin robelin@nvidia.com
Signed-off-by: Sameer Pujar spujar@nvidia.com
Link: https://patch.msgid.link/20240823144342.4123814-2-spujar@nvidia.com
Signed-off-by: Mark Brown broonie@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 sound/soc/soc-dapm.c | 1 +
 1 file changed, 1 insertion(+)
--- a/sound/soc/soc-dapm.c
+++ b/sound/soc/soc-dapm.c
@@ -4018,6 +4018,7 @@ static int snd_soc_dai_link_event(struct

 	case SND_SOC_DAPM_POST_PMD:
 		kfree(substream->runtime);
+		substream->runtime = NULL;
 		break;

 	default:
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson seanjc@google.com
commit 4bcdd831d9d01e0fb64faea50732b59b2ee88da1 upstream.
Grab kvm->srcu when processing KVM_SET_VCPU_EVENTS, as KVM will forcibly leave nested VMX/SVM if SMM mode is being toggled, and leaving nested VMX reads guest memory.
Note, kvm_vcpu_ioctl_x86_set_vcpu_events() can also be called from KVM_RUN via sync_regs(), which already holds SRCU. I.e. trying to precisely use kvm_vcpu_srcu_read_lock() around the problematic SMM code would cause problems. Acquiring SRCU isn't all that expensive, so for simplicity, grab it unconditionally for KVM_SET_VCPU_EVENTS.
=============================
WARNING: suspicious RCU usage
6.10.0-rc7-332d2c1d713e-next-vm #552 Not tainted
-----------------------------
include/linux/kvm_host.h:1027 suspicious rcu_dereference_check() usage!
other info that might help us debug this:
rcu_scheduler_active = 2, debug_locks = 1
1 lock held by repro/1071:
 #0: ffff88811e424430 (&vcpu->mutex){+.+.}-{3:3}, at: kvm_vcpu_ioctl+0x7d/0x970 [kvm]
stack backtrace:
CPU: 15 PID: 1071 Comm: repro Not tainted 6.10.0-rc7-332d2c1d713e-next-vm #552
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 0.0.0 02/06/2015
Call Trace:
 <TASK>
 dump_stack_lvl+0x7f/0x90
 lockdep_rcu_suspicious+0x13f/0x1a0
 kvm_vcpu_gfn_to_memslot+0x168/0x190 [kvm]
 kvm_vcpu_read_guest+0x3e/0x90 [kvm]
 nested_vmx_load_msr+0x6b/0x1d0 [kvm_intel]
 load_vmcs12_host_state+0x432/0xb40 [kvm_intel]
 vmx_leave_nested+0x30/0x40 [kvm_intel]
 kvm_vcpu_ioctl_x86_set_vcpu_events+0x15d/0x2b0 [kvm]
 kvm_arch_vcpu_ioctl+0x1107/0x1750 [kvm]
 ? mark_held_locks+0x49/0x70
 ? kvm_vcpu_ioctl+0x7d/0x970 [kvm]
 ? kvm_vcpu_ioctl+0x497/0x970 [kvm]
 kvm_vcpu_ioctl+0x497/0x970 [kvm]
 ? lock_acquire+0xba/0x2d0
 ? find_held_lock+0x2b/0x80
 ? do_user_addr_fault+0x40c/0x6f0
 ? lock_release+0xb7/0x270
 __x64_sys_ioctl+0x82/0xb0
 do_syscall_64+0x6c/0x170
 entry_SYSCALL_64_after_hwframe+0x4b/0x53
 RIP: 0033:0x7ff11eb1b539
 </TASK>
Fixes: f7e570780efc ("KVM: x86: Forcibly leave nested virt when SMM state is toggled")
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240723232055.3643811-1-seanjc@google.com
Signed-off-by: Sean Christopherson seanjc@google.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/x86/kvm/x86.c | 2 ++
 1 file changed, 2 insertions(+)
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5747,7 +5747,9 @@ long kvm_arch_vcpu_ioctl(struct file *fi
 		if (copy_from_user(&events, argp, sizeof(struct kvm_vcpu_events)))
 			break;

+		kvm_vcpu_srcu_read_lock(vcpu);
 		r = kvm_vcpu_ioctl_x86_set_vcpu_events(vcpu, &events);
+		kvm_vcpu_srcu_read_unlock(vcpu);
 		break;
 	}
 	case KVM_GET_DEBUGREGS: {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Maxim Levitsky mlevitsk@redhat.com
commit dad1613e0533b380318281c1519e1a3477c2d0d2 upstream.
If these MSRs are read by the emulator (e.g. due to the 'force emulation' prefix), SVM code currently fails to extract the corresponding segment bases and return them to the emulator.
Fix that.
Cc: stable@vger.kernel.org
Signed-off-by: Maxim Levitsky mlevitsk@redhat.com
Link: https://lore.kernel.org/r/20240802151608.72896-3-mlevitsk@redhat.com
Signed-off-by: Sean Christopherson seanjc@google.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/x86/kvm/svm/svm.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2756,6 +2756,12 @@ static int svm_get_msr(struct kvm_vcpu *
 	case MSR_CSTAR:
 		msr_info->data = svm->vmcb01.ptr->save.cstar;
 		break;
+	case MSR_GS_BASE:
+		msr_info->data = svm->vmcb01.ptr->save.gs.base;
+		break;
+	case MSR_FS_BASE:
+		msr_info->data = svm->vmcb01.ptr->save.fs.base;
+		break;
 	case MSR_KERNEL_GS_BASE:
 		msr_info->data = svm->vmcb01.ptr->save.kernel_gs_base;
 		break;
@@ -2982,6 +2988,12 @@ static int svm_set_msr(struct kvm_vcpu *
 	case MSR_CSTAR:
 		svm->vmcb01.ptr->save.cstar = data;
 		break;
+	case MSR_GS_BASE:
+		svm->vmcb01.ptr->save.gs.base = data;
+		break;
+	case MSR_FS_BASE:
+		svm->vmcb01.ptr->save.fs.base = data;
+		break;
 	case MSR_KERNEL_GS_BASE:
 		svm->vmcb01.ptr->save.kernel_gs_base = data;
 		break;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ravi Bangoria ravi.bangoria@amd.com
commit 54950bfe2b69cdc06ef753872b5225e54eb73506 upstream.
If the host supports Bus Lock Detect, KVM advertises it to guests even if SVM support is absent. Additionally, the guest wouldn't be able to use it despite the guest CPUID bit being set. Fix it by unconditionally clearing the feature bit in the KVM CPU capabilities.
Reported-by: Jim Mattson jmattson@google.com
Closes: https://lore.kernel.org/r/CALMp9eRet6+v8Y1Q-i6mqPm4hUow_kJNhmVHfOV8tMfuSS=tV...
Fixes: 76ea438b4afc ("KVM: X86: Expose bus lock debug exception to guest")
Cc: stable@vger.kernel.org
Signed-off-by: Ravi Bangoria ravi.bangoria@amd.com
Reviewed-by: Jim Mattson jmattson@google.com
Reviewed-by: Tom Lendacky thomas.lendacky@amd.com
Link: https://lore.kernel.org/r/20240808062937.1149-4-ravi.bangoria@amd.com
Signed-off-by: Sean Christopherson seanjc@google.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/x86/kvm/svm/svm.c | 3 +++
 1 file changed, 3 insertions(+)
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4972,6 +4972,9 @@ static __init void svm_set_cpu_caps(void

 	/* CPUID 0x8000001F (SME/SEV features) */
 	sev_set_cpu_caps();
+
+	/* Don't advertise Bus Lock Detect to guest if SVM support is absent */
+	kvm_cpu_cap_clear(X86_FEATURE_BUS_LOCK_DETECT);
 }

 static __init int svm_hardware_setup(void)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christoffer Sandberg cs@tuxedo.de
commit 4178d78cd7a86510ba68d203f26fc01113c7f126 upstream.
The Sirius notebooks have two sets of speakers: 0x17 (sides) and 0x1d (top center). The side speakers are active by default, but the top speakers aren't.
This patch provides a pincfg quirk to activate the top speakers.
Signed-off-by: Christoffer Sandberg cs@tuxedo.de
Signed-off-by: Werner Sembach wse@tuxedocomputers.com
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20240827102540.9480-1-wse@tuxedocomputers.com
Signed-off-by: Takashi Iwai tiwai@suse.de
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 sound/pci/hda/patch_conexant.c | 11 +++++++++++
 1 file changed, 11 insertions(+)
--- a/sound/pci/hda/patch_conexant.c
+++ b/sound/pci/hda/patch_conexant.c
@@ -311,6 +311,7 @@ enum {
 	CXT_FIXUP_HEADSET_MIC,
 	CXT_FIXUP_HP_MIC_NO_PRESENCE,
 	CXT_PINCFG_SWS_JS201D,
+	CXT_PINCFG_TOP_SPEAKER,
 };

 /* for hda_fixup_thinkpad_acpi() */
@@ -978,6 +979,13 @@ static const struct hda_fixup cxt_fixups
 		.type = HDA_FIXUP_PINS,
 		.v.pins = cxt_pincfg_sws_js201d,
 	},
+	[CXT_PINCFG_TOP_SPEAKER] = {
+		.type = HDA_FIXUP_PINS,
+		.v.pins = (const struct hda_pintbl[]) {
+			{ 0x1d, 0x82170111 },
+			{ }
+		},
+	},
 };

 static const struct snd_pci_quirk cxt5045_fixups[] = {
@@ -1074,6 +1082,8 @@ static const struct snd_pci_quirk cxt506
 	SND_PCI_QUIRK_VENDOR(0x17aa, "Thinkpad", CXT_FIXUP_THINKPAD_ACPI),
 	SND_PCI_QUIRK(0x1c06, 0x2011, "Lemote A1004", CXT_PINCFG_LEMOTE_A1004),
 	SND_PCI_QUIRK(0x1c06, 0x2012, "Lemote A1205", CXT_PINCFG_LEMOTE_A1205),
+	SND_PCI_QUIRK(0x2782, 0x12c3, "Sirius Gen1", CXT_PINCFG_TOP_SPEAKER),
+	SND_PCI_QUIRK(0x2782, 0x12c5, "Sirius Gen2", CXT_PINCFG_TOP_SPEAKER),
 	{}
 };

@@ -1093,6 +1103,7 @@ static const struct hda_model_fixup cxt5
 	{ .id = CXT_FIXUP_HP_MIC_NO_PRESENCE, .name = "hp-mic-fix" },
 	{ .id = CXT_PINCFG_LENOVO_NOTEBOOK, .name = "lenovo-20149" },
 	{ .id = CXT_PINCFG_SWS_JS201D, .name = "sws-js201d" },
+	{ .id = CXT_PINCFG_TOP_SPEAKER, .name = "sirius-top-speaker" },
 	{}
 };
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Terry Cheong htcheong@chromium.org
commit ef27e89e7f3015be2b3c124833fbd6d2e4686561 upstream.
The Lenovo V145 has a phase-inverted dmic, but simply applying the inverted-dmic fixup does not work. Chaining it to the verb fixes for ALC283 allows the inverted-dmic fixup to work properly.
Signed-off-by: Terry Cheong htcheong@chromium.org
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20240830-lenovo-v145-fixes-v3-1-f7b7265068fa@chromi...
Signed-off-by: Takashi Iwai tiwai@suse.de
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 sound/pci/hda/patch_realtek.c | 9 +++++++++
 1 file changed, 9 insertions(+)
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -7289,6 +7289,7 @@ enum {
 	ALC236_FIXUP_HP_GPIO_LED,
 	ALC236_FIXUP_HP_MUTE_LED,
 	ALC236_FIXUP_HP_MUTE_LED_MICMUTE_VREF,
+	ALC236_FIXUP_LENOVO_INV_DMIC,
 	ALC298_FIXUP_SAMSUNG_AMP,
 	ALC298_FIXUP_SAMSUNG_HEADPHONE_VERY_QUIET,
 	ALC256_FIXUP_SAMSUNG_HEADPHONE_VERY_QUIET,
@@ -8805,6 +8806,12 @@ static const struct hda_fixup alc269_fix
 		.type = HDA_FIXUP_FUNC,
 		.v.func = alc236_fixup_hp_mute_led_micmute_vref,
 	},
+	[ALC236_FIXUP_LENOVO_INV_DMIC] = {
+		.type = HDA_FIXUP_FUNC,
+		.v.func = alc_fixup_inv_dmic,
+		.chained = true,
+		.chain_id = ALC283_FIXUP_INT_MIC,
+	},
 	[ALC298_FIXUP_SAMSUNG_AMP] = {
 		.type = HDA_FIXUP_FUNC,
 		.v.func = alc298_fixup_samsung_amp,
@@ -10111,6 +10118,7 @@ static const struct snd_pci_quirk alc269
 	SND_PCI_QUIRK(0x17aa, 0x3866, "Lenovo 13X", ALC287_FIXUP_CS35L41_I2C_2),
 	SND_PCI_QUIRK(0x17aa, 0x3869, "Lenovo Yoga7 14IAL7", ALC287_FIXUP_YOGA9_14IAP7_BASS_SPK_PIN),
 	SND_PCI_QUIRK(0x17aa, 0x3902, "Lenovo E50-80", ALC269_FIXUP_DMIC_THINKPAD_ACPI),
+	SND_PCI_QUIRK(0x17aa, 0x3913, "Lenovo 145", ALC236_FIXUP_LENOVO_INV_DMIC),
 	SND_PCI_QUIRK(0x17aa, 0x3977, "IdeaPad S210", ALC283_FIXUP_INT_MIC),
 	SND_PCI_QUIRK(0x17aa, 0x3978, "Lenovo B50-70", ALC269_FIXUP_DMIC_THINKPAD_ACPI),
 	SND_PCI_QUIRK(0x17aa, 0x3bf8, "Quanta FL1", ALC269_FIXUP_PCM_44K),
@@ -10359,6 +10367,7 @@ static const struct hda_model_fixup alc2
 	{.id = ALC623_FIXUP_LENOVO_THINKSTATION_P340, .name = "alc623-lenovo-thinkstation-p340"},
 	{.id = ALC255_FIXUP_ACER_HEADPHONE_AND_MIC, .name = "alc255-acer-headphone-and-mic"},
 	{.id = ALC285_FIXUP_HP_GPIO_AMP_INIT, .name = "alc285-hp-amp-init"},
+	{.id = ALC236_FIXUP_LENOVO_INV_DMIC, .name = "alc236-fixup-lenovo-inv-mic"},
 	{}
 };
 #define ALC225_STANDARD_PINS \
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Maximilien Perreault maximilienperreault@gmail.com
commit 47a9e8dbb8d4713a9aac7cc6ce3c82dcc94217d8 upstream.
The mute LED on this HP laptop uses ALC236 and requires a quirk to function. This patch enables the existing quirk for the device.
Signed-off-by: Maximilien Perreault maximilienperreault@gmail.com
Cc: stable@vger.kernel.org
Link: https://patch.msgid.link/20240904031013.21220-1-maximilienperreault@gmail.co...
Signed-off-by: Takashi Iwai tiwai@suse.de
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 sound/pci/hda/patch_realtek.c | 1 +
 1 file changed, 1 insertion(+)
--- a/sound/pci/hda/patch_realtek.c
+++ b/sound/pci/hda/patch_realtek.c
@@ -9717,6 +9717,7 @@ static const struct snd_pci_quirk alc269
 	SND_PCI_QUIRK(0x103c, 0x87f5, "HP", ALC287_FIXUP_HP_GPIO_LED),
 	SND_PCI_QUIRK(0x103c, 0x87f6, "HP Spectre x360 14", ALC245_FIXUP_HP_X360_AMP),
 	SND_PCI_QUIRK(0x103c, 0x87f7, "HP Spectre x360 14", ALC245_FIXUP_HP_X360_AMP),
+	SND_PCI_QUIRK(0x103c, 0x87fd, "HP Laptop 14-dq2xxx", ALC236_FIXUP_HP_MUTE_LED_COEFBIT2),
 	SND_PCI_QUIRK(0x103c, 0x87fe, "HP Laptop 15s-fq2xxx", ALC236_FIXUP_HP_MUTE_LED_COEFBIT2),
 	SND_PCI_QUIRK(0x103c, 0x8805, "HP ProBook 650 G8 Notebook PC", ALC236_FIXUP_HP_GPIO_LED),
 	SND_PCI_QUIRK(0x103c, 0x880d, "HP EliteBook 830 G8 Notebook PC", ALC285_FIXUP_HP_GPIO_LED),
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Namjae Jeon linkinjeon@kernel.org
commit 78c5a6f1f630172b19af4912e755e1da93ef0ab5 upstream.
Steve French reported a null pointer dereference error from the sha256 lib. cifs.ko can send session setup requests on a reused connection. If the reused connection is used for a binding session, conn->binding can still remain true, so generate_preauth_hash() will not set sess->Preauth_HashValue and it stays NULL. It is used as material to create an encryption key in ksmbd_gen_smb311_encryptionkey, and the NULL ->Preauth_HashValue causes a null pointer dereference error in crypto_shash_update().
BUG: kernel NULL pointer dereference, address: 0000000000000000
#PF: supervisor read access in kernel mode
#PF: error_code(0x0000) - not-present page
PGD 0 P4D 0
Oops: 0000 [#1] PREEMPT SMP PTI
CPU: 8 PID: 429254 Comm: kworker/8:39
Hardware name: LENOVO 20MAS08500/20MAS08500, BIOS N2CET69W (1.52 )
Workqueue: ksmbd-io handle_ksmbd_work [ksmbd]
RIP: 0010:lib_sha256_base_do_update.isra.0+0x11e/0x1d0 [sha256_ssse3]
<TASK>
 ? show_regs+0x6d/0x80
 ? __die+0x24/0x80
 ? page_fault_oops+0x99/0x1b0
 ? do_user_addr_fault+0x2ee/0x6b0
 ? exc_page_fault+0x83/0x1b0
 ? asm_exc_page_fault+0x27/0x30
 ? __pfx_sha256_transform_rorx+0x10/0x10 [sha256_ssse3]
 ? lib_sha256_base_do_update.isra.0+0x11e/0x1d0 [sha256_ssse3]
 ? __pfx_sha256_transform_rorx+0x10/0x10 [sha256_ssse3]
 ? __pfx_sha256_transform_rorx+0x10/0x10 [sha256_ssse3]
 _sha256_update+0x77/0xa0 [sha256_ssse3]
 sha256_avx2_update+0x15/0x30 [sha256_ssse3]
 crypto_shash_update+0x1e/0x40
 hmac_update+0x12/0x20
 crypto_shash_update+0x1e/0x40
 generate_key+0x234/0x380 [ksmbd]
 generate_smb3encryptionkey+0x40/0x1c0 [ksmbd]
 ksmbd_gen_smb311_encryptionkey+0x72/0xa0 [ksmbd]
 ntlm_authenticate.isra.0+0x423/0x5d0 [ksmbd]
 smb2_sess_setup+0x952/0xaa0 [ksmbd]
 __process_request+0xa3/0x1d0 [ksmbd]
 __handle_ksmbd_work+0x1c4/0x2f0 [ksmbd]
 handle_ksmbd_work+0x2d/0xa0 [ksmbd]
 process_one_work+0x16c/0x350
 worker_thread+0x306/0x440
 ? __pfx_worker_thread+0x10/0x10
 kthread+0xef/0x120
 ? __pfx_kthread+0x10/0x10
 ret_from_fork+0x44/0x70
 ? __pfx_kthread+0x10/0x10
 ret_from_fork_asm+0x1b/0x30
 </TASK>
Fixes: f5a544e3bab7 ("ksmbd: add support for SMB3 multichannel")
Cc: stable@vger.kernel.org # v5.15+
Signed-off-by: Namjae Jeon linkinjeon@kernel.org
Signed-off-by: Steve French stfrench@microsoft.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/smb/server/smb2pdu.c | 4 ++++
 1 file changed, 4 insertions(+)
--- a/fs/smb/server/smb2pdu.c
+++ b/fs/smb/server/smb2pdu.c
@@ -1703,6 +1703,8 @@ int smb2_sess_setup(struct ksmbd_work *w
 		rc = ksmbd_session_register(conn, sess);
 		if (rc)
 			goto out_err;
+
+		conn->binding = false;
 	} else if (conn->dialect >= SMB30_PROT_ID &&
 		   (server_conf.flags & KSMBD_GLOBAL_FLAG_SMB3_MULTICHANNEL) &&
 		   req->Flags & SMB2_SESSION_REQ_FLAG_BINDING) {
@@ -1781,6 +1783,8 @@ int smb2_sess_setup(struct ksmbd_work *w
 			sess = NULL;
 			goto out_err;
 		}
+
+		conn->binding = false;
 	}
 	work->sess = sess;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@linaro.org
commit 844436e045ac2ab7895d8b281cb784a24de1d14d upstream.
Unlock before returning an error code if this allocation fails.
Fixes: 0626e6641f6b ("cifsd: add server handler for central processing and tranport layers")
Cc: stable@vger.kernel.org # v5.15+
Signed-off-by: Dan Carpenter dan.carpenter@linaro.org
Acked-by: Namjae Jeon linkinjeon@kernel.org
Signed-off-by: Steve French stfrench@microsoft.com
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 fs/smb/server/transport_tcp.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/fs/smb/server/transport_tcp.c
+++ b/fs/smb/server/transport_tcp.c
@@ -622,8 +622,10 @@ int ksmbd_tcp_set_interfaces(char *ifc_l
 		for_each_netdev(&init_net, netdev) {
 			if (netif_is_bridge_port(netdev))
 				continue;
-			if (!alloc_iface(kstrdup(netdev->name, GFP_KERNEL)))
+			if (!alloc_iface(kstrdup(netdev->name, GFP_KERNEL))) {
+				rtnl_unlock();
 				return -ENOMEM;
+			}
 		}
 		rtnl_unlock();
 		bind_additional_ifaces = 1;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zheng Qixing zhengqixing@huawei.com
commit 284b75a3d83c7631586d98f6dede1d90f128f0db upstream.
In ata_host_alloc(), if devres_alloc() fails to allocate the device host resource data pointer, the already allocated ata_host structure is not freed before returning from the function. This results in a potential memory leak.
Call kfree(host) before jumping to the error handling path to ensure that the ata_host structure is properly freed if devres_alloc() fails.
Fixes: 2623c7a5f279 ("libata: add refcounting to ata_host")
Cc: stable@vger.kernel.org
Signed-off-by: Zheng Qixing zhengqixing@huawei.com
Reviewed-by: Yu Kuai yukuai3@huawei.com
Signed-off-by: Damien Le Moal dlemoal@kernel.org
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/ata/libata-core.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/ata/libata-core.c
+++ b/drivers/ata/libata-core.c
@@ -5532,8 +5532,10 @@ struct ata_host *ata_host_alloc(struct d
 	}

 	dr = devres_alloc(ata_devres_release, 0, GFP_KERNEL);
-	if (!dr)
+	if (!dr) {
+		kfree(host);
 		goto err_out;
+	}

 	devres_add(dev, dr);
 	dev_set_drvdata(dev, host);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kirill A. Shutemov kirill.shutemov@linux.intel.com
commit b6fb565a2d15277896583d471b21bc14a0c99661 upstream.
The mmio_read() function makes a TDVMCALL to retrieve MMIO data for an address from the VMM.
Sean noticed that mmio_read() unintentionally exposes the value of an initialized variable (val) on the stack to the VMM.
This variable is only needed as an output value. It did not need to be passed to the VMM in the first place.
Do not send the original value of *val to the VMM.
[ dhansen: clarify what 'val' is used for. ]
Fixes: 31d58c4e557d ("x86/tdx: Handle in-kernel MMIO")
Reported-by: Sean Christopherson seanjc@google.com
Signed-off-by: Kirill A. Shutemov kirill.shutemov@linux.intel.com
Signed-off-by: Dave Hansen dave.hansen@linux.intel.com
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20240826125304.1566719-1-kirill.shutemov%40linux...
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/x86/coco/tdx/tdx.c | 1 -
 1 file changed, 1 deletion(-)
--- a/arch/x86/coco/tdx/tdx.c
+++ b/arch/x86/coco/tdx/tdx.c
@@ -328,7 +328,6 @@ static bool mmio_read(int size, unsigned
 		.r12 = size,
 		.r13 = EPT_READ,
 		.r14 = addr,
-		.r15 = *val,
 	};

 	if (__tdx_hypercall(&args, TDX_HCALL_HAS_OUTPUT))
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kan Liang kan.liang@linux.intel.com
commit 25dfc9e357af8aed1ca79b318a73f2c59c1f0b2b upstream.
Running the ltp test cve-2015-3290 concurrently reports the following warnings.
perfevents: irq loop stuck!
WARNING: CPU: 31 PID: 32438 at arch/x86/events/intel/core.c:3174 intel_pmu_handle_irq+0x285/0x370
Call Trace:
 <NMI>
 ? __warn+0xa4/0x220
 ? intel_pmu_handle_irq+0x285/0x370
 ? __report_bug+0x123/0x130
 ? intel_pmu_handle_irq+0x285/0x370
 ? __report_bug+0x123/0x130
 ? intel_pmu_handle_irq+0x285/0x370
 ? report_bug+0x3e/0xa0
 ? handle_bug+0x3c/0x70
 ? exc_invalid_op+0x18/0x50
 ? asm_exc_invalid_op+0x1a/0x20
 ? irq_work_claim+0x1e/0x40
 ? intel_pmu_handle_irq+0x285/0x370
 perf_event_nmi_handler+0x3d/0x60
 nmi_handle+0x104/0x330
Thanks to Thomas Gleixner's analysis, the issue is caused by the low initial period (1) of the frequency estimation algorithm, which triggers the defects of the HW, specifically erratum HSW11 and HSW143. (For the details, please refer https://lore.kernel.org/lkml/87plq9l5d2.ffs@tglx/)
The HSW11 requires a period larger than 100 for the INST_RETIRED.ALL event, but the initial period in the freq mode is 1. The erratum is the same as the BDM11, which has been supported in the kernel. A minimum period of 128 is enforced as well on HSW.
HSW143 concerns fixed counter 1, which may overcount by 32 when Hyper-Threading is enabled. However, based on testing, the hardware has more issues than the erratum describes. Besides fixed counter 1, the message 'interrupt took too long' can be observed on any counter which was armed with a period < 32 when two events expired in the same NMI. A minimum period of 32 is therefore enforced for the rest of the events. The recommended workaround code for HSW143 is not implemented, because it only addresses the issue for the fixed counter and brings extra overhead through extra MSR writes. No related overcounting issue has been reported so far.
Fixes: 3a632cb229bf ("perf/x86/intel: Add simple Haswell PMU support")
Reported-by: Li Huafei lihuafei1@huawei.com
Suggested-by: Thomas Gleixner tglx@linutronix.de
Signed-off-by: Kan Liang kan.liang@linux.intel.com
Signed-off-by: Thomas Gleixner tglx@linutronix.de
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20240819183004.3132920-1-kan.liang@linux.intel.c...
Closes: https://lore.kernel.org/lkml/20240729223328.327835-1-lihuafei1@huawei.com/
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/x86/events/intel/core.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4352,6 +4352,25 @@ static u8 adl_get_hybrid_cpu_type(void)
 	return hybrid_big;
 }

+static inline bool erratum_hsw11(struct perf_event *event)
+{
+	return (event->hw.config & INTEL_ARCH_EVENT_MASK) ==
+		X86_CONFIG(.event=0xc0, .umask=0x01);
+}
+
+/*
+ * The HSW11 requires a period larger than 100 which is the same as the BDM11.
+ * A minimum period of 128 is enforced as well for the INST_RETIRED.ALL.
+ *
+ * The message 'interrupt took too long' can be observed on any counter which
+ * was armed with a period < 32 and two events expired in the same NMI.
+ * A minimum period of 32 is enforced for the rest of the events.
+ */
+static void hsw_limit_period(struct perf_event *event, s64 *left)
+{
+	*left = max(*left, erratum_hsw11(event) ? 128 : 32);
+}
+
 /*
  * Broadwell:
  *
@@ -4369,8 +4388,7 @@ static u8 adl_get_hybrid_cpu_type(void)
  */
 static void bdw_limit_period(struct perf_event *event, s64 *left)
 {
-	if ((event->hw.config & INTEL_ARCH_EVENT_MASK) ==
-	    X86_CONFIG(.event=0xc0, .umask=0x01)) {
+	if (erratum_hsw11(event)) {
 		if (*left < 128)
 			*left = 128;
 		*left &= ~0x3fULL;
@@ -6180,6 +6198,7 @@ __init int intel_pmu_init(void)

 		x86_pmu.hw_config = hsw_hw_config;
 		x86_pmu.get_event_constraints = hsw_get_event_constraints;
+		x86_pmu.limit_period = hsw_limit_period;
 		x86_pmu.lbr_double_abort = true;
 		extra_attr = boot_cpu_has(X86_FEATURE_RTM) ?
 			hsw_format_attr : nhm_format_attr;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ma Ke make24@iscas.ac.cn
commit c5af2c90ba5629f0424a8d315f75fb8d91713c3c upstream.
gicv2m_of_init() fails to perform an of_node_put() when of_address_to_resource() fails, leading to a refcount leak.
Address this by moving the error handling path outside of the loop and making it common to all failure modes.
Fixes: 4266ab1a8ff5 ("irqchip/gic-v2m: Refactor to prepare for ACPI support")
Signed-off-by: Ma Ke make24@iscas.ac.cn
Signed-off-by: Thomas Gleixner tglx@linutronix.de
Reviewed-by: Marc Zyngier maz@kernel.org
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/20240820092843.1219933-1-make24@iscas.ac.cn
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 drivers/irqchip/irq-gic-v2m.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/irqchip/irq-gic-v2m.c
+++ b/drivers/irqchip/irq-gic-v2m.c
@@ -438,12 +438,12 @@ static int __init gicv2m_of_init(struct

 		ret = gicv2m_init_one(&child->fwnode, spi_start, nr_spis,
 				      &res, 0);
-		if (ret) {
-			of_node_put(child);
+		if (ret)
 			break;
-		}
 	}

+	if (ret && child)
+		of_node_put(child);
 	if (!ret)
 		ret = gicv2m_allocate_domains(parent);
 	if (ret)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner tglx@linutronix.de
commit ea72ce5da22806d5713f3ffb39a6d5ae73841f93 upstream.
iounmap() on x86 occasionally fails to unmap because the provided valid ioremap address is not below high_memory. It turned out that this happens due to KASLR.
KASLR uses the full address space between PAGE_OFFSET and vaddr_end to randomize the starting points of the direct map, vmalloc and vmemmap regions. It thereby limits the size of the direct map by using the installed memory size plus an extra configurable margin for hot-plug memory. This limitation is done to gain more randomization space because otherwise only the holes between the direct map, vmalloc, vmemmap and vaddr_end would be usable for randomizing.
The limited direct map size is not exposed to the rest of the kernel, so the memory hot-plug and resource management related code paths still operate under the assumption that the available address space can be determined with MAX_PHYSMEM_BITS.
request_free_mem_region() allocates from (1 << MAX_PHYSMEM_BITS) - 1 downwards. That means the first allocation happens past the end of the direct map and if unlucky this address is in the vmalloc space, which causes high_memory to become greater than VMALLOC_START and consequently causes iounmap() to fail for valid ioremap addresses.
MAX_PHYSMEM_BITS cannot be changed for that because the randomization does not align with address bit boundaries and there are other places which actually require to know the maximum number of address bits. All remaining usage sites of MAX_PHYSMEM_BITS have been analyzed and found to be correct.
Cure this by exposing the end of the direct map via PHYSMEM_END and use that for the memory hot-plug and resource management related places instead of relying on MAX_PHYSMEM_BITS. In the KASLR case PHYSMEM_END maps to a variable which is initialized by the KASLR initialization and otherwise it is based on MAX_PHYSMEM_BITS as before.
To prevent future hiccups, add a check to add_pages() to catch callers trying to add memory above PHYSMEM_END.
Fixes: 0483e1fa6e09 ("x86/mm: Implement ASLR for kernel memory regions")
Reported-by: Max Ramanouski max8rr8@gmail.com
Reported-by: Alistair Popple apopple@nvidia.com
Signed-off-by: Thomas Gleixner tglx@linutronix.de
Tested-By: Max Ramanouski max8rr8@gmail.com
Tested-by: Alistair Popple apopple@nvidia.com
Reviewed-by: Dan Williams dan.j.williams@intel.com
Reviewed-by: Alistair Popple apopple@nvidia.com
Reviewed-by: Kees Cook kees@kernel.org
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/all/87ed6soy3z.ffs@tglx
Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org
---
 arch/x86/include/asm/page_64.h          |  1 +
 arch/x86/include/asm/pgtable_64_types.h |  4 ++++
 arch/x86/mm/init_64.c                   |  4 ++++
 arch/x86/mm/kaslr.c                     | 32 ++++++++++++++++++++++++++------
 include/linux/mm.h                      |  4 ++++
 kernel/resource.c                       |  6 ++----
 mm/memory_hotplug.c                     |  2 +-
 mm/sparse.c                             |  2 +-
 8 files changed, 43 insertions(+), 12 deletions(-)
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -17,6 +17,7 @@ extern unsigned long phys_base;
 extern unsigned long page_offset_base;
 extern unsigned long vmalloc_base;
 extern unsigned long vmemmap_base;
+extern unsigned long physmem_end;

 static __always_inline unsigned long __phys_addr_nodebug(unsigned long x)
 {
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -139,6 +139,10 @@ extern unsigned int ptrs_per_p4d;
 # define VMEMMAP_START		__VMEMMAP_BASE_L4
 #endif /* CONFIG_DYNAMIC_MEMORY_LAYOUT */

+#ifdef CONFIG_RANDOMIZE_MEMORY
+# define PHYSMEM_END		physmem_end
+#endif
+
 /*
  * End of the region for which vmalloc page tables are pre-allocated.
  * For non-KMSAN builds, this is the same as VMALLOC_END.
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -950,8 +950,12 @@ static void update_end_of_memory_vars(u6
 int add_pages(int nid, unsigned long start_pfn, unsigned long nr_pages,
 	      struct mhp_params *params)
 {
+	unsigned long end = ((start_pfn + nr_pages) << PAGE_SHIFT) - 1;
 	int ret;

+	if (WARN_ON_ONCE(end > PHYSMEM_END))
+		return -ERANGE;
+
 	ret = __add_pages(nid, start_pfn, nr_pages, params);
 	WARN_ON_ONCE(ret);

--- a/arch/x86/mm/kaslr.c
+++ b/arch/x86/mm/kaslr.c
@@ -47,13 +47,24 @@ static const unsigned long vaddr_end = C
  */
 static __initdata struct kaslr_memory_region {
 	unsigned long *base;
+	unsigned long *end;
 	unsigned long size_tb;
 } kaslr_regions[] = {
-	{ &page_offset_base, 0 },
-	{ &vmalloc_base, 0 },
-	{ &vmemmap_base, 0 },
+	{
+		.base = &page_offset_base,
+		.end = &physmem_end,
+	},
+	{
+		.base = &vmalloc_base,
+	},
+	{
+		.base = &vmemmap_base,
+	},
 };

+/* The end of the possible address space for physical memory */
+unsigned long physmem_end __ro_after_init;
+
 /* Get size in bytes used by the memory region */
 static inline unsigned long get_padding(struct kaslr_memory_region *region)
 {
@@ -82,6 +93,8 @@ void __init kernel_randomize_memory(void
 	BUILD_BUG_ON(vaddr_end != CPU_ENTRY_AREA_BASE);
 	BUILD_BUG_ON(vaddr_end > __START_KERNEL_map);

+	/* Preset the end of the possible address space for physical memory */
+	physmem_end = ((1ULL << MAX_PHYSMEM_BITS) - 1);
 	if (!kaslr_memory_enabled())
 		return;

@@ -128,11 +141,18 @@ void __init kernel_randomize_memory(void
 		vaddr += entropy;
 		*kaslr_regions[i].base = vaddr;

+		/* Calculate the end of the region */
+		vaddr += get_padding(&kaslr_regions[i]);
 		/*
-		 * Jump the region and add a minimum padding based on
-		 * randomization alignment.
+		 * KASLR trims the maximum possible size of the
+		 * direct-map. Update the physmem_end boundary.
+		 * No rounding required as the region starts
+		 * PUD aligned and size is in units of TB.
 		 */
-		vaddr += get_padding(&kaslr_regions[i]);
+		if (kaslr_regions[i].end)
+			*kaslr_regions[i].end = __pa_nodebug(vaddr - 1);
+
+		/* Add a minimum padding based on randomization alignment. */
 		vaddr = round_up(vaddr + 1, PUD_SIZE);
 		remain_entropy -= entropy;
 	}
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -92,6 +92,10 @@ extern const int mmap_rnd_compat_bits_ma
 extern int mmap_rnd_compat_bits __read_mostly;
 #endif

+#ifndef PHYSMEM_END
+# define PHYSMEM_END	((1ULL << MAX_PHYSMEM_BITS) - 1)
+#endif
+
 #include <asm/page.h>
 #include <asm/processor.h>

--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -1781,8 +1781,7 @@ static resource_size_t gfr_start(struct
 	if (flags & GFR_DESCENDING) {
 		resource_size_t end;

-		end = min_t(resource_size_t, base->end,
-			    (1ULL << MAX_PHYSMEM_BITS) - 1);
+		end = min_t(resource_size_t, base->end, PHYSMEM_END);
 		return end - size + 1;
 	}

@@ -1799,8 +1798,7 @@ static bool gfr_continue(struct resource
 	 * @size did not wrap 0.
 	 */
 	return addr > addr - size &&
-		addr <= min_t(resource_size_t, base->end,
-			      (1ULL << MAX_PHYSMEM_BITS) - 1);
+		addr <= min_t(resource_size_t, base->end, PHYSMEM_END);
 }

 static resource_size_t gfr_next(resource_size_t addr, resource_size_t size,
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1530,7 +1530,7 @@ struct range __weak arch_get_mappable_ra
struct range mhp_get_pluggable_range(bool need_mapping) { - const u64 max_phys = (1ULL << MAX_PHYSMEM_BITS) - 1; + const u64 max_phys = PHYSMEM_END; struct range mhp_range;
if (need_mapping) { --- a/mm/sparse.c +++ b/mm/sparse.c @@ -129,7 +129,7 @@ static inline int sparse_early_nid(struc static void __meminit mminit_validate_memmodel_limits(unsigned long *start_pfn, unsigned long *end_pfn) { - unsigned long max_sparsemem_pfn = 1UL << (MAX_PHYSMEM_BITS-PAGE_SHIFT); + unsigned long max_sparsemem_pfn = (PHYSMEM_END + 1) >> PAGE_SHIFT;
/* * Sanity checks - do not allow an architecture to pass
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Roland Xu mu001999@outlook.com
commit d33d26036a0274b472299d7dcdaa5fb34329f91b upstream.
rt_mutex_handle_deadlock() is called with rt_mutex::wait_lock held. In the good case it returns with the lock held and in the deadlock case it emits a warning and goes into an endless scheduling loop with the lock held, which triggers the 'scheduling in atomic' warning.
Unlock rt_mutex::wait_lock in the deadlock case before issuing the warning and dropping into the schedule-forever loop.
[ tglx: Moved unlock before the WARN(), removed the pointless comment, massaged changelog, added Fixes tag ]
Fixes: 3d5c9340d194 ("rtmutex: Handle deadlock detection smarter") Signed-off-by: Roland Xu mu001999@outlook.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/ME0P300MB063599BEF0743B8FA339C2CECC802@ME0P300MB... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/locking/rtmutex.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
--- a/kernel/locking/rtmutex.c +++ b/kernel/locking/rtmutex.c @@ -1624,6 +1624,7 @@ static int __sched rt_mutex_slowlock_blo }
static void __sched rt_mutex_handle_deadlock(int res, int detect_deadlock, + struct rt_mutex_base *lock, struct rt_mutex_waiter *w) { /* @@ -1636,10 +1637,10 @@ static void __sched rt_mutex_handle_dead if (build_ww_mutex() && w->ww_ctx) return;
- /* - * Yell loudly and stop the task right here. - */ + raw_spin_unlock_irq(&lock->wait_lock); + WARN(1, "rtmutex deadlock detected\n"); + while (1) { set_current_state(TASK_INTERRUPTIBLE); schedule(); @@ -1693,7 +1694,7 @@ static int __sched __rt_mutex_slowlock(s } else { __set_current_state(TASK_RUNNING); remove_waiter(lock, waiter); - rt_mutex_handle_deadlock(ret, chwalk, waiter); + rt_mutex_handle_deadlock(ret, chwalk, lock, waiter); }
/*
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Georg Gottleuber ggo@tuxedocomputers.com
commit 61aa894e7a2fda4ee026523b01d07e83ce2abb72 upstream.
On some TUXEDO platforms, a Samsung 990 Evo NVMe leads to a high power consumption in s2idle sleep (2-3 watts).
This patch applies the 'Force No Simple Suspend' quirk to achieve sleep with lower power consumption, typically around 0.5 watts.
Signed-off-by: Georg Gottleuber ggo@tuxedocomputers.com Signed-off-by: Werner Sembach wse@tuxedocomputers.com Cc: stable@vger.kernel.org Signed-off-by: Keith Busch kbusch@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/nvme/host/pci.c | 11 +++++++++++ 1 file changed, 11 insertions(+)
--- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -3107,6 +3107,17 @@ static unsigned long check_vendor_combin dmi_match(DMI_BOARD_NAME, "NS5x_7xPU") || dmi_match(DMI_BOARD_NAME, "PH4PRX1_PH6PRX1")) return NVME_QUIRK_FORCE_NO_SIMPLE_SUSPEND; + } else if (pdev->vendor == 0x144d && pdev->device == 0xa80d) { + /* + * Exclude Samsung 990 Evo from NVME_QUIRK_SIMPLE_SUSPEND + * because of high power consumption (> 2 Watt) in s2idle + * sleep. Only some boards with Intel CPU are affected. + */ + if (dmi_match(DMI_BOARD_NAME, "GMxPXxx") || + dmi_match(DMI_BOARD_NAME, "PH4PG31") || + dmi_match(DMI_BOARD_NAME, "PH4PRX1_PH6PRX1") || + dmi_match(DMI_BOARD_NAME, "PH6PG01_PH6PG71")) + return NVME_QUIRK_FORCE_NO_SIMPLE_SUSPEND; }
/*
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Luiz Augusto von Dentz luiz.von.dentz@intel.com
commit 532f8bcd1c2c4e8112f62e1922fd1703bc0ffce0 upstream.
This reverts commit 59b047bc98084f8af2c41483e4d68a5adf2fa7f7 which breaks compatibility with commands like:
bluetoothd[46328]: @ MGMT Command: Load.. (0x0013) plen 74      {0x0001} [hci0]
        Keys: 2
        BR/EDR Address: C0:DC:DA:A5:E5:47 (Samsung Electronics Co.,Ltd)
          Key type: Authenticated key from P-256 (0x03)
          Central: 0x00
          Encryption size: 16
          Diversifier[2]: 0000
          Randomizer[8]: 0000000000000000
          Key[16]: 6ed96089bd9765be2f2c971b0b95f624
        LE Address: D7:2A:DE:1E:73:A2 (Static)
          Key type: Unauthenticated key from P-256 (0x02)
          Central: 0x00
          Encryption size: 16
          Diversifier[2]: 0000
          Randomizer[8]: 0000000000000000
          Key[16]: 87dd2546ededda380ffcdc0a8faa4597
@ MGMT Event: Command Status (0x0002) plen 3                    {0x0001} [hci0]
      Load Long Term Keys (0x0013)
        Status: Invalid Parameters (0x0d)
Cc: stable@vger.kernel.org Link: https://github.com/bluez/bluez/issues/875 Fixes: 59b047bc9808 ("Bluetooth: MGMT/SMP: Fix address type when using SMP over BREDR/LE") Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- include/net/bluetooth/hci_core.h | 5 ----- net/bluetooth/mgmt.c | 25 +++++++------------------ net/bluetooth/smp.c | 7 ------- 3 files changed, 7 insertions(+), 30 deletions(-)
--- a/include/net/bluetooth/hci_core.h +++ b/include/net/bluetooth/hci_core.h @@ -187,7 +187,6 @@ struct blocked_key { struct smp_csrk { bdaddr_t bdaddr; u8 bdaddr_type; - u8 link_type; u8 type; u8 val[16]; }; @@ -197,7 +196,6 @@ struct smp_ltk { struct rcu_head rcu; bdaddr_t bdaddr; u8 bdaddr_type; - u8 link_type; u8 authenticated; u8 type; u8 enc_size; @@ -212,7 +210,6 @@ struct smp_irk { bdaddr_t rpa; bdaddr_t bdaddr; u8 addr_type; - u8 link_type; u8 val[16]; };
@@ -220,8 +217,6 @@ struct link_key { struct list_head list; struct rcu_head rcu; bdaddr_t bdaddr; - u8 bdaddr_type; - u8 link_type; u8 type; u8 val[HCI_LINK_KEY_SIZE]; u8 pin_len; --- a/net/bluetooth/mgmt.c +++ b/net/bluetooth/mgmt.c @@ -2903,8 +2903,7 @@ static int load_link_keys(struct sock *s for (i = 0; i < key_count; i++) { struct mgmt_link_key_info *key = &cp->keys[i];
- /* Considering SMP over BREDR/LE, there is no need to check addr_type */ - if (key->type > 0x08) + if (key->addr.type != BDADDR_BREDR || key->type > 0x08) return mgmt_cmd_status(sk, hdev->id, MGMT_OP_LOAD_LINK_KEYS, MGMT_STATUS_INVALID_PARAMS); @@ -7141,7 +7140,6 @@ static int load_irks(struct sock *sk, st
for (i = 0; i < irk_count; i++) { struct mgmt_irk_info *irk = &cp->irks[i]; - u8 addr_type = le_addr_type(irk->addr.type);
if (hci_is_blocked_key(hdev, HCI_BLOCKED_KEY_TYPE_IRK, @@ -7151,12 +7149,8 @@ static int load_irks(struct sock *sk, st continue; }
- /* When using SMP over BR/EDR, the addr type should be set to BREDR */ - if (irk->addr.type == BDADDR_BREDR) - addr_type = BDADDR_BREDR; - hci_add_irk(hdev, &irk->addr.bdaddr, - addr_type, irk->val, + le_addr_type(irk->addr.type), irk->val, BDADDR_ANY); }
@@ -7237,7 +7231,6 @@ static int load_long_term_keys(struct so for (i = 0; i < key_count; i++) { struct mgmt_ltk_info *key = &cp->keys[i]; u8 type, authenticated; - u8 addr_type = le_addr_type(key->addr.type);
if (hci_is_blocked_key(hdev, HCI_BLOCKED_KEY_TYPE_LTK, @@ -7272,12 +7265,8 @@ static int load_long_term_keys(struct so continue; }
- /* When using SMP over BR/EDR, the addr type should be set to BREDR */ - if (key->addr.type == BDADDR_BREDR) - addr_type = BDADDR_BREDR; - hci_add_ltk(hdev, &key->addr.bdaddr, - addr_type, type, authenticated, + le_addr_type(key->addr.type), type, authenticated, key->val, key->enc_size, key->ediv, key->rand); }
@@ -9545,7 +9534,7 @@ void mgmt_new_link_key(struct hci_dev *h
ev.store_hint = persistent; bacpy(&ev.key.addr.bdaddr, &key->bdaddr); - ev.key.addr.type = link_to_bdaddr(key->link_type, key->bdaddr_type); + ev.key.addr.type = BDADDR_BREDR; ev.key.type = key->type; memcpy(ev.key.val, key->val, HCI_LINK_KEY_SIZE); ev.key.pin_len = key->pin_len; @@ -9596,7 +9585,7 @@ void mgmt_new_ltk(struct hci_dev *hdev, ev.store_hint = persistent;
bacpy(&ev.key.addr.bdaddr, &key->bdaddr); - ev.key.addr.type = link_to_bdaddr(key->link_type, key->bdaddr_type); + ev.key.addr.type = link_to_bdaddr(LE_LINK, key->bdaddr_type); ev.key.type = mgmt_ltk_type(key); ev.key.enc_size = key->enc_size; ev.key.ediv = key->ediv; @@ -9625,7 +9614,7 @@ void mgmt_new_irk(struct hci_dev *hdev,
bacpy(&ev.rpa, &irk->rpa); bacpy(&ev.irk.addr.bdaddr, &irk->bdaddr); - ev.irk.addr.type = link_to_bdaddr(irk->link_type, irk->addr_type); + ev.irk.addr.type = link_to_bdaddr(LE_LINK, irk->addr_type); memcpy(ev.irk.val, irk->val, sizeof(irk->val));
mgmt_event(MGMT_EV_NEW_IRK, hdev, &ev, sizeof(ev), NULL); @@ -9654,7 +9643,7 @@ void mgmt_new_csrk(struct hci_dev *hdev, ev.store_hint = persistent;
bacpy(&ev.key.addr.bdaddr, &csrk->bdaddr); - ev.key.addr.type = link_to_bdaddr(csrk->link_type, csrk->bdaddr_type); + ev.key.addr.type = link_to_bdaddr(LE_LINK, csrk->bdaddr_type); ev.key.type = csrk->type; memcpy(ev.key.val, csrk->val, sizeof(csrk->val));
--- a/net/bluetooth/smp.c +++ b/net/bluetooth/smp.c @@ -1059,7 +1059,6 @@ static void smp_notify_keys(struct l2cap }
if (smp->remote_irk) { - smp->remote_irk->link_type = hcon->type; mgmt_new_irk(hdev, smp->remote_irk, persistent);
/* Now that user space can be considered to know the @@ -1074,28 +1073,24 @@ static void smp_notify_keys(struct l2cap }
if (smp->csrk) { - smp->csrk->link_type = hcon->type; smp->csrk->bdaddr_type = hcon->dst_type; bacpy(&smp->csrk->bdaddr, &hcon->dst); mgmt_new_csrk(hdev, smp->csrk, persistent); }
if (smp->responder_csrk) { - smp->responder_csrk->link_type = hcon->type; smp->responder_csrk->bdaddr_type = hcon->dst_type; bacpy(&smp->responder_csrk->bdaddr, &hcon->dst); mgmt_new_csrk(hdev, smp->responder_csrk, persistent); }
if (smp->ltk) { - smp->ltk->link_type = hcon->type; smp->ltk->bdaddr_type = hcon->dst_type; bacpy(&smp->ltk->bdaddr, &hcon->dst); mgmt_new_ltk(hdev, smp->ltk, persistent); }
if (smp->responder_ltk) { - smp->responder_ltk->link_type = hcon->type; smp->responder_ltk->bdaddr_type = hcon->dst_type; bacpy(&smp->responder_ltk->bdaddr, &hcon->dst); mgmt_new_ltk(hdev, smp->responder_ltk, persistent); @@ -1115,8 +1110,6 @@ static void smp_notify_keys(struct l2cap key = hci_add_link_key(hdev, smp->conn->hcon, &hcon->dst, smp->link_key, type, 0, &persistent); if (key) { - key->link_type = hcon->type; - key->bdaddr_type = hcon->dst_type; mgmt_new_link_key(hdev, key, persistent);
/* Don't keep debug keys around if the relevant
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Luiz Augusto von Dentz luiz.von.dentz@intel.com
commit 1e9683c9b6ca88cc9340cdca85edd6134c8cffe3 upstream.
Due to 59b047bc98084f8af2c41483e4d68a5adf2fa7f7 there could be keys stored with the wrong address type, so this attempts to detect such keys and ignore them, instead of failing to load all keys.
Cc: stable@vger.kernel.org Link: https://github.com/bluez/bluez/issues/875 Fixes: 59b047bc9808 ("Bluetooth: MGMT/SMP: Fix address type when using SMP over BREDR/LE") Signed-off-by: Luiz Augusto von Dentz luiz.von.dentz@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/bluetooth/mgmt.c | 37 +++++++++++++++++++------------------ 1 file changed, 19 insertions(+), 18 deletions(-)
--- a/net/bluetooth/mgmt.c +++ b/net/bluetooth/mgmt.c @@ -2900,15 +2900,6 @@ static int load_link_keys(struct sock *s bt_dev_dbg(hdev, "debug_keys %u key_count %u", cp->debug_keys, key_count);
- for (i = 0; i < key_count; i++) { - struct mgmt_link_key_info *key = &cp->keys[i]; - - if (key->addr.type != BDADDR_BREDR || key->type > 0x08) - return mgmt_cmd_status(sk, hdev->id, - MGMT_OP_LOAD_LINK_KEYS, - MGMT_STATUS_INVALID_PARAMS); - } - hci_dev_lock(hdev);
hci_link_keys_clear(hdev); @@ -2933,6 +2924,19 @@ static int load_link_keys(struct sock *s continue; }
+ if (key->addr.type != BDADDR_BREDR) { + bt_dev_warn(hdev, + "Invalid link address type %u for %pMR", + key->addr.type, &key->addr.bdaddr); + continue; + } + + if (key->type > 0x08) { + bt_dev_warn(hdev, "Invalid link key type %u for %pMR", + key->type, &key->addr.bdaddr); + continue; + } + /* Always ignore debug keys and require a new pairing if * the user wants to use them. */ @@ -7215,15 +7219,6 @@ static int load_long_term_keys(struct so
bt_dev_dbg(hdev, "key_count %u", key_count);
- for (i = 0; i < key_count; i++) { - struct mgmt_ltk_info *key = &cp->keys[i]; - - if (!ltk_is_valid(key)) - return mgmt_cmd_status(sk, hdev->id, - MGMT_OP_LOAD_LONG_TERM_KEYS, - MGMT_STATUS_INVALID_PARAMS); - } - hci_dev_lock(hdev);
hci_smp_ltks_clear(hdev); @@ -7239,6 +7234,12 @@ static int load_long_term_keys(struct so &key->addr.bdaddr); continue; } + + if (!ltk_is_valid(key)) { + bt_dev_warn(hdev, "Invalid LTK for %pMR", + &key->addr.bdaddr); + continue; + }
switch (key->type) { case MGMT_LTK_UNAUTHENTICATED:
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jonathan Bell jonathan@raspberrypi.com
commit 469e5e4713989fdd5e3e502b922e7be0da2464b9 upstream.
Applying MMC_QUIRK_BROKEN_SD_CACHE is broken, as the card's SD quirks are referenced in sd_parse_ext_reg_perf() prior to the quirks being initialized in mmc_blk_probe().
To fix this problem, let's split out an SD-specific list of quirks and apply it in mmc_sd_init_card() instead. In this way, sd_read_ext_regs() has the information it needs to avoid marking SD_EXT_PERF_CACHE as a supported feature, which in turn allows mmc_sd_init_card() to properly skip execution of sd_enable_cache().
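The ordering problem can be pictured with a toy sketch (invented names, not the MMC core API): a quirk flag that is only applied after the code consulting it has already run sees the un-quirked value.

  /* Toy illustration of quirks applied too late; names are made up. */
  #include <stdio.h>

  static unsigned int card_quirks;
  #define QUIRK_BROKEN_SD_CACHE (1u << 0)

  static void parse_ext_reg_perf(void)
  {
  	if (card_quirks & QUIRK_BROKEN_SD_CACHE)
  		printf("cache support ignored\n");
  	else
  		printf("cache support enabled\n");	/* old flow ends up here */
  }

  int main(void)
  {
  	/* old flow: parse first, apply fixups later (too late) */
  	parse_ext_reg_perf();

  	/* fixed flow: apply SD fixups before the card is parsed */
  	card_quirks |= QUIRK_BROKEN_SD_CACHE;
  	parse_ext_reg_perf();
  	return 0;
  }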
Fixes: c467c8f08185 ("mmc: Add MMC_QUIRK_BROKEN_SD_CACHE for Kingston Canvas Go Plus from 11/2019") Signed-off-by: Jonathan Bell jonathan@raspberrypi.com Co-developed-by: Keita Aihara keita.aihara@sony.com Signed-off-by: Keita Aihara keita.aihara@sony.com Reviewed-by: Dragan Simic dsimic@manjaro.org Reviewed-by: Avri Altman avri.altman@wdc.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240820230631.GA436523@sony.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/core/quirks.h | 22 +++++++++++++--------- drivers/mmc/core/sd.c | 4 ++++ 2 files changed, 17 insertions(+), 9 deletions(-)
--- a/drivers/mmc/core/quirks.h +++ b/drivers/mmc/core/quirks.h @@ -15,6 +15,19 @@
#include "card.h"
+static const struct mmc_fixup __maybe_unused mmc_sd_fixups[] = { + /* + * Kingston Canvas Go! Plus microSD cards never finish SD cache flush. + * This has so far only been observed on cards from 11/2019, while new + * cards from 2023/05 do not exhibit this behavior. + */ + _FIXUP_EXT("SD64G", CID_MANFID_KINGSTON_SD, 0x5449, 2019, 11, + 0, -1ull, SDIO_ANY_ID, SDIO_ANY_ID, add_quirk_sd, + MMC_QUIRK_BROKEN_SD_CACHE, EXT_CSD_REV_ANY), + + END_FIXUP +}; + static const struct mmc_fixup __maybe_unused mmc_blk_fixups[] = { #define INAND_CMD38_ARG_EXT_CSD 113 #define INAND_CMD38_ARG_ERASE 0x00 @@ -54,15 +67,6 @@ static const struct mmc_fixup __maybe_un MMC_QUIRK_BLK_NO_CMD23),
/* - * Kingston Canvas Go! Plus microSD cards never finish SD cache flush. - * This has so far only been observed on cards from 11/2019, while new - * cards from 2023/05 do not exhibit this behavior. - */ - _FIXUP_EXT("SD64G", CID_MANFID_KINGSTON_SD, 0x5449, 2019, 11, - 0, -1ull, SDIO_ANY_ID, SDIO_ANY_ID, add_quirk_sd, - MMC_QUIRK_BROKEN_SD_CACHE, EXT_CSD_REV_ANY), - - /* * Some SD cards lockup while using CMD23 multiblock transfers. */ MMC_FIXUP("AF SD", CID_MANFID_ATP, CID_OEMID_ANY, add_quirk_sd, --- a/drivers/mmc/core/sd.c +++ b/drivers/mmc/core/sd.c @@ -26,6 +26,7 @@ #include "host.h" #include "bus.h" #include "mmc_ops.h" +#include "quirks.h" #include "sd.h" #include "sd_ops.h"
@@ -1475,6 +1476,9 @@ retry: goto free_card; }
+ /* Apply quirks prior to card setup */ + mmc_fixup_device(card, mmc_sd_fixups); + err = mmc_sd_setup_card(host, card, oldcard != NULL); if (err) goto free_card;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sam Protsenko semen.protsenko@linaro.org
commit 8396c793ffdf28bb8aee7cfe0891080f8cab7890 upstream.
Commit 616f87661792 ("mmc: pass queue_limits to blk_mq_alloc_disk") [1] revealed a long-standing issue in the dw_mmc.c driver, present since it was first introduced in commit f95f3850f7a9 ("mmc: dw_mmc: Add Synopsys DesignWare mmc host driver."), which also breaks the kernel boot on platforms using the dw_mmc driver with 16K or 64K pages enabled, with this message in dmesg:
mmcblk: probe of mmc0:0001 failed with error -22
That happens because mmc_blk_probe() fails when it subsequently calls blk_validate_limits(), which returns an error due to the failed max_segment_size check in this code:
	/*
	 * The maximum segment size has an odd historic 64k default that
	 * drivers probably should override. Just like the I/O size we
	 * require drivers to at least handle a full page per segment.
	 */
	...
	if (WARN_ON_ONCE(lim->max_segment_size < PAGE_SIZE))
		return -EINVAL;
In the case when IDMAC (Internal DMA Controller) is used, dw_mmc.c always sets .max_seg_size to 4 KiB:
mmc->max_seg_size = 0x1000;
The comment in the code above explains why it's incorrect. Arnd suggested setting .max_seg_size to .max_req_size to fix it, which is also what some other drivers are doing:
    $ grep -rl 'max_seg_size.*=.*max_req_size' drivers/mmc/host/ | \
          wc -l
    18
This change is not only fixing the boot with 16K/64K pages, but also leads to a better MMC performance. The linear write performance was tested on E850-96 board (eMMC only), before commit [1] (where it's possible to boot with 16K/64K pages without this fix, to be able to do a comparison). It was tested with this command:
# dd if=/dev/zero of=somefile bs=1M count=500 oflag=sync
Test results are as follows:
  - 4K pages,  .max_seg_size = 4 KiB:                    94.2 MB/s
  - 4K pages,  .max_seg_size = .max_req_size = 512 KiB:  96.9 MB/s
  - 16K pages, .max_seg_size = 4 KiB:                    126 MB/s
  - 16K pages, .max_seg_size = .max_req_size = 2 MiB:    128 MB/s
  - 64K pages, .max_seg_size = 4 KiB:                    138 MB/s
  - 64K pages, .max_seg_size = .max_req_size = 8 MiB:    138 MB/s
Unfortunately, the SD card controller is not enabled on E850-96 yet, so it wasn't possible for me to run the test on some cheap SD cards to check this patch's impact on those. But it's possible that this change might also reduce the write count, thus improving SD/eMMC longevity.
All credit for the analysis and the suggested solution goes to Arnd.
[1] https://lore.kernel.org/all/20240215070300.2200308-18-hch@lst.de/
Fixes: f95f3850f7a9 ("mmc: dw_mmc: Add Synopsys DesignWare mmc host driver.") Suggested-by: Arnd Bergmann arnd@arndb.de Reported-by: Linux Kernel Functional Testing lkft@linaro.org Closes: https://lore.kernel.org/all/CA+G9fYtddf2Fd3be+YShHP6CmSDNcn0ptW8qg+stUKW+Cn0... Signed-off-by: Sam Protsenko semen.protsenko@linaro.org Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240306232052.21317-1-semen.protsenko@linaro.org Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/dw_mmc.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/mmc/host/dw_mmc.c +++ b/drivers/mmc/host/dw_mmc.c @@ -2952,8 +2952,8 @@ static int dw_mci_init_slot(struct dw_mc if (host->use_dma == TRANS_MODE_IDMAC) { mmc->max_segs = host->ring_size; mmc->max_blk_size = 65535; - mmc->max_seg_size = 0x1000; - mmc->max_req_size = mmc->max_seg_size * host->ring_size; + mmc->max_req_size = DW_MCI_DESC_DATA_LENGTH * host->ring_size; + mmc->max_seg_size = mmc->max_req_size; mmc->max_blk_count = mmc->max_req_size / 512; } else if (host->use_dma == TRANS_MODE_EDMAC) { mmc->max_segs = 64;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Liao Chen liaochen4@huawei.com
commit 6e540da4c1db7b840e347c4dfe48359b18b7e376 upstream.
Add MODULE_DEVICE_TABLE(), so the module can be properly autoloaded based on the alias from the of_device_id table.
Signed-off-by: Liao Chen liaochen4@huawei.com Acked-by: Andrew Jeffery andrew@codeconstruct.com.au Fixes: bb7b8ec62dfb ("mmc: sdhci-of-aspeed: Add support for the ASPEED SD controller") Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20240826124851.379759-1-liaochen4@huawei.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/sdhci-of-aspeed.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/mmc/host/sdhci-of-aspeed.c +++ b/drivers/mmc/host/sdhci-of-aspeed.c @@ -513,6 +513,7 @@ static const struct of_device_id aspeed_ { .compatible = "aspeed,ast2600-sdhci", .data = &ast2600_sdhci_pdata, }, { } }; +MODULE_DEVICE_TABLE(of, aspeed_sdhci_of_match);
static struct platform_driver aspeed_sdhci_driver = { .driver = {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Seunghwan Baek sh8267.baek@samsung.com
commit aea62c744a9ae2a8247c54ec42138405216414da upstream.
To check whether the MMC CQE is in the halt state, the set/clear state of the CQHCI_HALT bit needs to be tested. For that, the check must use a bitwise &, not a logical &&.
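A minimal userspace sketch of the operator mix-up (the CTL value is invented): with a logical &&, any non-zero register value satisfies the test, while the bitwise & actually isolates the HALT bit.

  #include <stdio.h>

  #define CQHCI_HALT (1u << 0)

  int main(void)
  {
  	unsigned int ctl = 0x100;	/* some unrelated bit set, HALT clear */

  	/* logical &&: true for any non-zero ctl, so halt is falsely reported */
  	printf("ctl && CQHCI_HALT -> %d (wrong)\n", ctl && CQHCI_HALT);
  	/* bitwise &: tests the actual bit, halt correctly reported clear */
  	printf("ctl &  CQHCI_HALT -> %d (right)\n", !!(ctl & CQHCI_HALT));
  	return 0;
  }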
Fixes: a4080225f51d ("mmc: cqhci: support for command queue enabled host") Cc: stable@vger.kernel.org Signed-off-by: Seunghwan Baek sh8267.baek@samsung.com Reviewed-by: Ritesh Harjani ritesh.list@gmail.com Acked-by: Adrian Hunter adrian.hunter@intel.com Link: https://lore.kernel.org/r/20240829061823.3718-2-sh8267.baek@samsung.com Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/cqhci-core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/mmc/host/cqhci-core.c +++ b/drivers/mmc/host/cqhci-core.c @@ -612,7 +612,7 @@ static int cqhci_request(struct mmc_host cqhci_writel(cq_host, 0, CQHCI_CTL); mmc->cqe_on = true; pr_debug("%s: cqhci: CQE on\n", mmc_hostname(mmc)); - if (cqhci_readl(cq_host, CQHCI_CTL) && CQHCI_HALT) { + if (cqhci_readl(cq_host, CQHCI_CTL) & CQHCI_HALT) { pr_err("%s: cqhci: CQE failed to exit halt state\n", mmc_hostname(mmc)); }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joanne Koong joannelkoong@gmail.com
commit f7790d67785302b3116bbbfda62a5a44524601a3 upstream.
In the case where the aux writeback list is dropped (e.g. the pages have been truncated or the connection is broken), the stats for its pages and backing device info need to be updated as well.
Fixes: e2653bd53a98 ("fuse: fix leaked aux requests") Signed-off-by: Joanne Koong joannelkoong@gmail.com Reviewed-by: Josef Bacik josef@toxicpanda.com Cc: stable@vger.kernel.org # v5.1 Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/fuse/file.c | 8 +++++++- 1 file changed, 7 insertions(+), 1 deletion(-)
--- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -1703,10 +1703,16 @@ __acquires(fi->lock) fuse_writepage_finish(fm, wpa); spin_unlock(&fi->lock);
- /* After fuse_writepage_finish() aux request list is private */ + /* After rb_erase() aux request list is private */ for (aux = wpa->next; aux; aux = next) { + struct backing_dev_info *bdi = inode_to_bdi(aux->inode); + next = aux->next; aux->next = NULL; + + dec_wb_stat(&bdi->wb, WB_WRITEBACK); + dec_node_page_state(aux->ia.ap.pages[0], NR_WRITEBACK_TEMP); + wb_writeout_inc(&bdi->wb); fuse_writepage_free(aux); }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jann Horn jannh@google.com
commit b18915248a15eae7d901262f108d6ff0ffb4ffc1 upstream.
The existing code uses min_t(ssize_t, outarg.size, XATTR_LIST_MAX) when parsing the FUSE daemon's response to a zero-length getxattr/listxattr request. On 32-bit kernels, where ssize_t and outarg.size are the same size, this is wrong: The min_t() will pass through any size values that are negative when interpreted as signed. fuse_listxattr() will then return this userspace-supplied negative value, which callers will treat as an error value.
This kind of bug pattern can lead to fairly bad security bugs because of how error codes are used in the Linux kernel. If a caller were to convert the numeric error into an error pointer, like so:
	struct foo *func(...)
	{
		int len = fuse_getxattr(..., NULL, 0);
		if (len < 0)
			return ERR_PTR(len);
		...
	}
then it would end up returning this userspace-supplied negative value cast to a pointer - but the caller of this function wouldn't recognize it as an error pointer (IS_ERR_VALUE() only detects values within the narrow range of legitimate errno values), and so it would just be treated as a kernel pointer.
I think there is at least one theoretical codepath where this could happen, but that path would involve virtio-fs with submounts plus some weird SELinux configuration, so I think it's probably not a concern in practice.
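The truncation can be demonstrated with a small userspace sketch (XATTR_LIST_MAX and the daemon-supplied size below are stand-ins, not the real fuse structures; a 32-bit int mirrors a 32-bit kernel's ssize_t):

  #include <stdio.h>

  #define XATTR_LIST_MAX 65536

  int main(void)
  {
  	unsigned int outarg_size = 0xfffffff0u;	/* hostile or broken daemon reply */

  	/* min_t(ssize_t, ...): the cast is negative on 32-bit, and min()
  	 * keeps it, so the bogus value flows back to the caller
  	 */
  	int bad = (int)outarg_size < XATTR_LIST_MAX ?
  		  (int)outarg_size : XATTR_LIST_MAX;

  	/* min_t(size_t, ...): the unsigned comparison clamps to the limit */
  	unsigned int good = outarg_size < XATTR_LIST_MAX ?
  			    outarg_size : XATTR_LIST_MAX;

  	printf("min_t(ssize_t, ...) -> %d\n", bad);
  	printf("min_t(size_t,  ...) -> %u\n", good);
  	return 0;
  }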
Cc: stable@vger.kernel.org # v4.9 Fixes: 63401ccdb2ca ("fuse: limit xattr returned size") Signed-off-by: Jann Horn jannh@google.com Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/fuse/xattr.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/fs/fuse/xattr.c +++ b/fs/fuse/xattr.c @@ -81,7 +81,7 @@ ssize_t fuse_getxattr(struct inode *inod } ret = fuse_simple_request(fm, &args); if (!ret && !size) - ret = min_t(ssize_t, outarg.size, XATTR_SIZE_MAX); + ret = min_t(size_t, outarg.size, XATTR_SIZE_MAX); if (ret == -ENOSYS) { fm->fc->no_getxattr = 1; ret = -EOPNOTSUPP; @@ -143,7 +143,7 @@ ssize_t fuse_listxattr(struct dentry *en } ret = fuse_simple_request(fm, &args); if (!ret && !size) - ret = min_t(ssize_t, outarg.size, XATTR_LIST_MAX); + ret = min_t(size_t, outarg.size, XATTR_LIST_MAX); if (ret > 0 && size) ret = fuse_verify_xattr_list(list, ret); if (ret == -ENOSYS) {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Satya Priya Kakitapalli quic_skakitap@quicinc.com
commit 2c4553e6c485a96b5d86989eb9654bf20e51e6dd upstream.
PLL_POST_DIV_MASK should cover bits 0 to (width - 1). Fix it.
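The off-by-one is easy to check with a userspace rendition of GENMASK() (simplified; a 64-bit long is assumed): GENMASK(width, 0) spans width + 1 bits, while a width-bit field needs GENMASK(width - 1, 0).

  #include <stdio.h>

  /* simplified userspace stand-in for the kernel macro, 64-bit long assumed */
  #define GENMASK(h, l) \
  	(((~0UL) << (l)) & (~0UL >> (sizeof(long) * 8 - 1 - (h))))

  int main(void)
  {
  	int width = 2;	/* a 2-bit post divider */

  	printf("GENMASK(width, 0)     = %#lx\n", GENMASK(width, 0));     /* 0x7: one bit too wide */
  	printf("GENMASK(width - 1, 0) = %#lx\n", GENMASK(width - 1, 0)); /* 0x3: correct */
  	return 0;
  }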
Fixes: 1c3541145cbf ("clk: qcom: support for 2 bit PLL post divider") Cc: stable@vger.kernel.org Reviewed-by: Konrad Dybcio konrad.dybcio@linaro.org Signed-off-by: Satya Priya Kakitapalli quic_skakitap@quicinc.com Link: https://lore.kernel.org/r/20240731062916.2680823-2-quic_skakitap@quicinc.com Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/clk/qcom/clk-alpha-pll.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/clk/qcom/clk-alpha-pll.c +++ b/drivers/clk/qcom/clk-alpha-pll.c @@ -40,7 +40,7 @@
#define PLL_USER_CTL(p) ((p)->offset + (p)->regs[PLL_OFF_USER_CTL]) # define PLL_POST_DIV_SHIFT 8 -# define PLL_POST_DIV_MASK(p) GENMASK((p)->width, 0) +# define PLL_POST_DIV_MASK(p) GENMASK((p)->width - 1, 0) # define PLL_ALPHA_EN BIT(24) # define PLL_ALPHA_MODE BIT(25) # define PLL_VCO_SHIFT 20
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Satya Priya Kakitapalli quic_skakitap@quicinc.com
commit 4ad1ed6ef27cab94888bb3c740c14042d5c0dff2 upstream.
Correct the PLL postdiv shift used in the clk_trion_pll_postdiv_set_rate API. The shift value is not the same for different types of PLLs and should be taken from the PLL's .post_div_shift member.
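As a rough sketch of the failure mode (field position and register value are invented), updating a field with a fixed shift of 8 when the divider actually sits at a different shift both leaves the real field stale and clobbers unrelated bits:

  #include <stdio.h>

  /* read-modify-write of a register field, like regmap_update_bits() */
  static unsigned int update_bits(unsigned int reg, unsigned int mask,
  				unsigned int val)
  {
  	return (reg & ~mask) | (val & mask);
  }

  int main(void)
  {
  	unsigned int reg = 0x5000;	/* pretend USER_CTL value, divider at bits 12..14 */
  	unsigned int mask = 0x7;	/* GENMASK(width - 1, 0), width = 3 */
  	unsigned int div = 0x3;

  	/* fixed shift 8: bits 8..10 get overwritten, bits 12..14 stay stale */
  	printf("wrong shift 8:  %#x\n", update_bits(reg, mask << 8, div << 8));
  	/* per-PLL shift 12: the intended field is updated */
  	printf("right shift 12: %#x\n", update_bits(reg, mask << 12, div << 12));
  	return 0;
  }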
Fixes: 548a909597d5 ("clk: qcom: clk-alpha-pll: Add support for Trion PLLs") Cc: stable@vger.kernel.org Signed-off-by: Satya Priya Kakitapalli quic_skakitap@quicinc.com Reviewed-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Link: https://lore.kernel.org/r/20240731062916.2680823-3-quic_skakitap@quicinc.com Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/clk/qcom/clk-alpha-pll.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/clk/qcom/clk-alpha-pll.c +++ b/drivers/clk/qcom/clk-alpha-pll.c @@ -1421,8 +1421,8 @@ clk_trion_pll_postdiv_set_rate(struct cl }
return regmap_update_bits(regmap, PLL_USER_CTL(pll), - PLL_POST_DIV_MASK(pll) << PLL_POST_DIV_SHIFT, - val << PLL_POST_DIV_SHIFT); + PLL_POST_DIV_MASK(pll) << pll->post_div_shift, + val << pll->post_div_shift); }
const struct clk_ops clk_alpha_pll_postdiv_trion_ops = {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Satya Priya Kakitapalli quic_skakitap@quicinc.com
commit 85e8ee59dfde1a7b847fbed0778391392cd985cb upstream.
Currently, clk_zonda_pll_set_rate polls for the PLL to lock even if the PLL is disabled. However, if the PLL is disabled then LOCK_DET will never assert and we'll return an error. There is no reason to poll LOCK_DET if the PLL is already disabled, so skip polling in this case.
Fixes: f21b6bfecc27 ("clk: qcom: clk-alpha-pll: add support for zonda pll") Cc: stable@vger.kernel.org Signed-off-by: Satya Priya Kakitapalli quic_skakitap@quicinc.com Reviewed-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Link: https://lore.kernel.org/r/20240731062916.2680823-4-quic_skakitap@quicinc.com Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/clk/qcom/clk-alpha-pll.c | 3 +++ 1 file changed, 3 insertions(+)
--- a/drivers/clk/qcom/clk-alpha-pll.c +++ b/drivers/clk/qcom/clk-alpha-pll.c @@ -2005,6 +2005,9 @@ static int clk_zonda_pll_set_rate(struct regmap_write(pll->clkr.regmap, PLL_ALPHA_VAL(pll), a); regmap_write(pll->clkr.regmap, PLL_L_VAL(pll), l);
+ if (!clk_hw_is_enabled(hw)) + return 0; + /* Wait before polling for the frequency latch */ udelay(5);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Satya Priya Kakitapalli quic_skakitap@quicinc.com
commit f4973130d255dd4811006f5822d4fa4d0de9d712 upstream.
The Zonda PLL has a 16-bit signed alpha, and in cases where the alpha value is greater than 0.5, the L value needs to be adjusted accordingly. Thus, update the logic to handle the signed alpha value.
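A worked example of the rounding, as a userspace sketch (the rates are arbitrary and do_div() is replaced by plain division): when the fractional part of rate/prate reaches 0.5, the integer multiplier L is bumped by one.

  #include <stdio.h>
  #include <stdint.h>

  static void adjust_l(uint64_t rate, uint64_t prate, unsigned int *l)
  {
  	uint64_t quotient = rate / prate;
  	uint64_t remainder = rate % prate;

  	*l = quotient;
  	if ((remainder * 2) / prate)	/* fractional part >= 0.5 */
  		*l += 1;
  }

  int main(void)
  {
  	unsigned int l;

  	adjust_l(1500000000ULL, 19200000ULL, &l);	/* 78.125 -> L stays 78 */
  	printf("L = %u\n", l);
  	adjust_l(1512000000ULL, 19200000ULL, &l);	/* 78.75  -> rounds up to 79 */
  	printf("L = %u\n", l);
  	return 0;
  }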
Fixes: f21b6bfecc27 ("clk: qcom: clk-alpha-pll: add support for zonda pll") Cc: stable@vger.kernel.org Signed-off-by: Satya Priya Kakitapalli quic_skakitap@quicinc.com Reviewed-by: Dmitry Baryshkov dmitry.baryshkov@linaro.org Link: https://lore.kernel.org/r/20240731062916.2680823-5-quic_skakitap@quicinc.com Signed-off-by: Bjorn Andersson andersson@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/clk/qcom/clk-alpha-pll.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+)
--- a/drivers/clk/qcom/clk-alpha-pll.c +++ b/drivers/clk/qcom/clk-alpha-pll.c @@ -41,6 +41,7 @@ #define PLL_USER_CTL(p) ((p)->offset + (p)->regs[PLL_OFF_USER_CTL]) # define PLL_POST_DIV_SHIFT 8 # define PLL_POST_DIV_MASK(p) GENMASK((p)->width - 1, 0) +# define PLL_ALPHA_MSB BIT(15) # define PLL_ALPHA_EN BIT(24) # define PLL_ALPHA_MODE BIT(25) # define PLL_VCO_SHIFT 20 @@ -1986,6 +1987,18 @@ static void clk_zonda_pll_disable(struct regmap_write(regmap, PLL_OPMODE(pll), 0x0); }
+static void zonda_pll_adjust_l_val(unsigned long rate, unsigned long prate, u32 *l) +{ + u64 remainder, quotient; + + quotient = rate; + remainder = do_div(quotient, prate); + *l = quotient; + + if ((remainder * 2) / prate) + *l = *l + 1; +} + static int clk_zonda_pll_set_rate(struct clk_hw *hw, unsigned long rate, unsigned long prate) { @@ -2002,6 +2015,9 @@ static int clk_zonda_pll_set_rate(struct if (ret < 0) return ret;
+ if (a & PLL_ALPHA_MSB) + zonda_pll_adjust_l_val(rate, prate, &l); + regmap_write(pll->clkr.regmap, PLL_ALPHA_VAL(pll), a); regmap_write(pll->clkr.regmap, PLL_L_VAL(pll), l);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Simon Arlott simon@octiron.net
commit 7dd9c26bd6cf679bcfdef01a8659791aa6487a29 upstream.
The mcp251x_hw_wake() function is called with the mcp_lock mutex held and disables the interrupt handler so that no interrupts can be processed while waking the device. If an interrupt has already occurred then waiting for the interrupt handler to complete will deadlock because it will be trying to acquire the same mutex.
CPU0                              CPU1
----                              ----
mcp251x_open()
 mutex_lock(&priv->mcp_lock)
 request_threaded_irq()
                                  <interrupt>
                                  mcp251x_can_ist()
                                   mutex_lock(&priv->mcp_lock)
 mcp251x_hw_wake()
  disable_irq() <-- deadlock
Use disable_irq_nosync() instead, because the interrupt handler does everything while holding the mutex, so it doesn't matter if it's still running.
Fixes: 8ce8c0abcba3 ("can: mcp251x: only reset hardware as required") Signed-off-by: Simon Arlott simon@octiron.net Reviewed-by: Przemek Kitszel przemyslaw.kitszel@intel.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/4fc08687-1d80-43fe-9f0d-8ef8475e75f6@0882a8b5-c6... Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/can/spi/mcp251x.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/can/spi/mcp251x.c +++ b/drivers/net/can/spi/mcp251x.c @@ -753,7 +753,7 @@ static int mcp251x_hw_wake(struct spi_de int ret;
/* Force wakeup interrupt to wake device, but don't execute IST */ - disable_irq(spi->irq); + disable_irq_nosync(spi->irq); mcp251x_write_2regs(spi, CANINTE, CANINTE_WAKIE, CANINTF_WAKIF);
/* Wait for oscillator startup timer after wake up */
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Brian Norris briannorris@chromium.org
commit be721b451affbecc4ba4eaac3b71cdbdcade1b1b upstream.
Commit e882575efc77 ("spi: rockchip: Suspend and resume the bus during NOIRQ_SYSTEM_SLEEP_PM ops") stopped respecting runtime PM status and simply disabled clocks unconditionally when suspending the system. This causes problems when the device is already runtime suspended when we go to sleep -- in which case we double-disable clocks and produce a WARNing.
Switch back to pm_runtime_force_{suspend,resume}(), because that still seems like the right thing to do, and the aforementioned commit gives no explanation for why it stopped using it.
Also, refactor some of the resume() error handling, because it's not actually a good idea to re-disable clocks on failure.
Fixes: e882575efc77 ("spi: rockchip: Suspend and resume the bus during NOIRQ_SYSTEM_SLEEP_PM ops") Cc: stable@vger.kernel.org Reported-by: Ondřej Jirman megi@xff.cz Closes: https://lore.kernel.org/lkml/20220621154218.sau54jeij4bunf56@core/ Signed-off-by: Brian Norris briannorris@chromium.org Link: https://patch.msgid.link/20240827171126.1115748-1-briannorris@chromium.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/spi/spi-rockchip.c | 23 +++++++---------------- 1 file changed, 7 insertions(+), 16 deletions(-)
--- a/drivers/spi/spi-rockchip.c +++ b/drivers/spi/spi-rockchip.c @@ -976,14 +976,16 @@ static int rockchip_spi_suspend(struct d { int ret; struct spi_controller *ctlr = dev_get_drvdata(dev); - struct rockchip_spi *rs = spi_controller_get_devdata(ctlr);
ret = spi_controller_suspend(ctlr); if (ret < 0) return ret;
- clk_disable_unprepare(rs->spiclk); - clk_disable_unprepare(rs->apb_pclk); + ret = pm_runtime_force_suspend(dev); + if (ret < 0) { + spi_controller_resume(ctlr); + return ret; + }
pinctrl_pm_select_sleep_state(dev);
@@ -994,25 +996,14 @@ static int rockchip_spi_resume(struct de { int ret; struct spi_controller *ctlr = dev_get_drvdata(dev); - struct rockchip_spi *rs = spi_controller_get_devdata(ctlr);
pinctrl_pm_select_default_state(dev);
- ret = clk_prepare_enable(rs->apb_pclk); + ret = pm_runtime_force_resume(dev); if (ret < 0) return ret;
- ret = clk_prepare_enable(rs->spiclk); - if (ret < 0) - clk_disable_unprepare(rs->apb_pclk); - - ret = spi_controller_resume(ctlr); - if (ret < 0) { - clk_disable_unprepare(rs->spiclk); - clk_disable_unprepare(rs->apb_pclk); - } - - return 0; + return spi_controller_resume(ctlr); } #endif /* CONFIG_PM_SLEEP */
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zheng Yejian zhengyejian@huaweicloud.com
commit 49aa8a1f4d6800721c7971ed383078257f12e8f9 upstream.
In __tracing_open(), when a max latency tracer has taken place on a CPU, the start time of that CPU's buffer is updated, and event entries with timestamps earlier than the buffer's start are then skipped (see tracing_iter_reset()).
A softlockup will occur if the kernel is non-preemptible and too many entries are skipped in the loop that resets every CPU buffer, so add cond_resched() to avoid it.
Cc: stable@vger.kernel.org Fixes: 2f26ebd549b9a ("tracing: use timestamp to determine start of latency traces") Link: https://lore.kernel.org/20240827124654.3817443-1-zhengyejian@huaweicloud.com Suggested-by: Steven Rostedt rostedt@goodmis.org Signed-off-by: Zheng Yejian zhengyejian@huaweicloud.com Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -4088,6 +4088,8 @@ void tracing_iter_reset(struct trace_ite break; entries++; ring_buffer_iter_advance(buf_iter); + /* This could be a big loop */ + cond_resched(); }
per_cpu_ptr(iter->array_buffer->data, cpu)->skipped_entries = entries;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matt Johnston matt@codeconstruct.com.au
commit f962e8361adfa84e8252d3fc3e5e6bb879f029b1 upstream.
0x7d and 0x7e bytes are meant to be escaped in the data portion of frames, but this didn't occur since next_chunk_len() had an off-by-one error. That also resulted in the final byte of a payload being written as a separate tty write op.
The chunk prior to an escaped byte would be one byte short, and the next call would never test the txpos+1 case, which is where the escaped byte was located. That meant it never hit the escaping case in mctp_serial_tx_work().
Example Input: 01 00 08 c8 7e 80 02
Previous incorrect chunks from next_chunk_len():
01 00 08
c8 7e 80
02
With this fix:
01 00 08 c8
7e
80 02
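The chunking can be reproduced with a small standalone sketch of next_chunk_len() using the corrected loop bound (needs_escape() is reduced here to the 0x7d/0x7e bytes named above, and the buffer handling is simplified):

  #include <stdio.h>

  static int needs_escape(unsigned char c)
  {
  	return c == 0x7d || c == 0x7e;
  }

  static int next_chunk_len(const unsigned char *buf, int pos, int len)
  {
  	int i;

  	if (pos == len)
  		return 0;
  	if (needs_escape(buf[pos]))
  		return 1;
  	for (i = 1; i + pos < len; i++)		/* was: i + pos + 1 < len */
  		if (needs_escape(buf[pos + i]))
  			break;
  	return i;
  }

  int main(void)
  {
  	const unsigned char buf[] = { 0x01, 0x00, 0x08, 0xc8, 0x7e, 0x80, 0x02 };
  	int pos = 0, len = sizeof(buf);

  	while (pos < len) {
  		int n = next_chunk_len(buf, pos, len);

  		for (int j = 0; j < n; j++)
  			printf("%02x ", buf[pos + j]);
  		printf("\n");
  		pos += n;
  	}
  	return 0;	/* prints: "01 00 08 c8", "7e", "80 02" */
  }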
Cc: stable@vger.kernel.org Fixes: a0c2ccd9b5ad ("mctp: Add MCTP-over-serial transport binding") Signed-off-by: Matt Johnston matt@codeconstruct.com.au Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/mctp/mctp-serial.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/net/mctp/mctp-serial.c +++ b/drivers/net/mctp/mctp-serial.c @@ -91,8 +91,8 @@ static int next_chunk_len(struct mctp_se * will be those non-escaped bytes, and does not include the escaped * byte. */ - for (i = 1; i + dev->txpos + 1 < dev->txlen; i++) { - if (needs_escape(dev->txbuf[dev->txpos + i + 1])) + for (i = 1; i + dev->txpos < dev->txlen; i++) { + if (needs_escape(dev->txbuf[dev->txpos + i])) break; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mitchell Levy levymitchell0@gmail.com
commit 2848ff28d180bd63a95da8e5dcbcdd76c1beeb7b upstream.
There are two distinct CPU features related to the use of XSAVES and LBR: whether LBR is itself supported and whether XSAVES supports LBR. The LBR subsystem correctly checks both in intel_pmu_arch_lbr_init(), but the XSTATE subsystem does not.
The LBR bit is only removed from xfeatures_mask_independent when LBR is not supported by the CPU, but there is no validation of XSTATE support.
If XSAVES does not support LBR the write to IA32_XSS causes a #GP fault, leaving the state of IA32_XSS unchanged, i.e. zero. The fault is handled with a warning and the boot continues.
Consequently the next XRSTORS which tries to restore supervisor state fails with #GP because the RFBM has zero for all supervisor features, which does not match the XCOMP_BV field.
As XFEATURE_MASK_FPSTATE includes supervisor features setting up the FPU causes a #GP, which ends up in fpu_reset_from_exception_fixup(). That fails due to the same problem resulting in recursive #GPs until the kernel runs out of stack space and double faults.
Prevent this by storing the supported independent features in fpu_kernel_cfg during XSTATE initialization and use that cached value for retrieving the independent feature bits to be written into IA32_XSS.
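In bitmask form, the fix amounts to the following sketch (the feature masks are stand-ins, not authoritative CPUID data): the independent-feature set handed to IA32_XSS only ever contains bits that XSAVES reported as supported.

  #include <stdio.h>
  #include <stdint.h>

  #define XFEATURE_MASK_LBR		(1ULL << 15)
  #define XFEATURE_MASK_INDEPENDENT	XFEATURE_MASK_LBR

  int main(void)
  {
  	/* XSAVES-supported features reported by the CPU; no LBR bit */
  	uint64_t max_features = 0x2ff;

  	/* cached once at init, mirroring fpu_kernel_cfg.independent_features */
  	uint64_t independent_features = max_features & XFEATURE_MASK_INDEPENDENT;

  	/* the old code wrote XFEATURE_MASK_INDEPENDENT to IA32_XSS regardless
  	 * and took a #GP; the cached value stays zero here
  	 */
  	printf("independent bits for IA32_XSS: %#llx\n",
  	       (unsigned long long)independent_features);
  	return 0;
  }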
[ tglx: Massaged change log ]
Fixes: f0dccc9da4c0 ("x86/fpu/xstate: Support dynamic supervisor feature for LBR") Suggested-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Mitchell Levy levymitchell0@gmail.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Cc: stable@vger.kernel.org Link: https://lore.kernel.org/all/20240812-xsave-lbr-fix-v3-1-95bac1bf62f4@gmail.c... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/fpu/types.h | 7 +++++++ arch/x86/kernel/fpu/xstate.c | 3 +++ arch/x86/kernel/fpu/xstate.h | 4 ++-- 3 files changed, 12 insertions(+), 2 deletions(-)
--- a/arch/x86/include/asm/fpu/types.h +++ b/arch/x86/include/asm/fpu/types.h @@ -577,6 +577,13 @@ struct fpu_state_config { * even without XSAVE support, i.e. legacy features FP + SSE */ u64 legacy_features; + /* + * @independent_features: + * + * Features that are supported by XSAVES, but not managed as part of + * the FPU core, such as LBR + */ + u64 independent_features; };
/* FPU state configuration information */ --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -792,6 +792,9 @@ void __init fpu__init_system_xstate(unsi goto out_disable; }
+ fpu_kernel_cfg.independent_features = fpu_kernel_cfg.max_features & + XFEATURE_MASK_INDEPENDENT; + /* * Clear XSAVE features that are disabled in the normal CPUID. */ --- a/arch/x86/kernel/fpu/xstate.h +++ b/arch/x86/kernel/fpu/xstate.h @@ -64,9 +64,9 @@ static inline u64 xfeatures_mask_supervi static inline u64 xfeatures_mask_independent(void) { if (!cpu_feature_enabled(X86_FEATURE_ARCH_LBR)) - return XFEATURE_MASK_INDEPENDENT & ~XFEATURE_MASK_LBR; + return fpu_kernel_cfg.independent_features & ~XFEATURE_MASK_LBR;
- return XFEATURE_MASK_INDEPENDENT; + return fpu_kernel_cfg.independent_features; }
/* XSAVE/XRSTOR wrapper functions */
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Deucher alexander.deucher@amd.com
commit 1a8d845470941f1b6de1b392227530c097dc5e0c upstream.
This reverts commit 8f614469de248a4bc55fb07e55d5f4c340c75b11.
This breaks some manual setting of the profile mode in certain cases.
Closes: https://gitlab.freedesktop.org/drm/amd/-/issues/3600 Signed-off-by: Alex Deucher alexander.deucher@amd.com (cherry picked from commit 7a199557643e993d4e7357860624b8aa5d8f4340) Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c +++ b/drivers/gpu/drm/amd/pm/swsmu/amdgpu_smu.c @@ -1871,7 +1871,8 @@ static int smu_adjust_power_state_dynami smu_dpm_ctx->dpm_level = level; }
- if (smu_dpm_ctx->dpm_level != AMD_DPM_FORCED_LEVEL_PERF_DETERMINISM) { + if (smu_dpm_ctx->dpm_level != AMD_DPM_FORCED_LEVEL_MANUAL && + smu_dpm_ctx->dpm_level != AMD_DPM_FORCED_LEVEL_PERF_DETERMINISM) { index = fls(smu->workload_mask); index = index > 0 && index <= WORKLOAD_POLICY_MAX ? index - 1 : 0; workload[0] = smu->workload_setting[index]; @@ -1950,7 +1951,8 @@ static int smu_switch_power_profile(void workload[0] = smu->workload_setting[index]; }
- if (smu_dpm_ctx->dpm_level != AMD_DPM_FORCED_LEVEL_PERF_DETERMINISM) + if (smu_dpm_ctx->dpm_level != AMD_DPM_FORCED_LEVEL_MANUAL && + smu_dpm_ctx->dpm_level != AMD_DPM_FORCED_LEVEL_PERF_DETERMINISM) smu_bump_power_profile_mode(smu, workload, 0);
return 0;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Cong Wang cong.wang@bytedance.com
commit fe1910f9337bd46a9343967b547ccab26b4b2c6e upstream.
When we cork messages in psock->cork, the last message that triggers the flushing will result in sending a sk_msg larger than the current message size. In this case, in tcp_bpf_send_verdict(), 'copied' becomes negative at least in the following case:
468	case __SK_DROP:
469	default:
470		sk_msg_free_partial(sk, msg, tosend);
471		sk_msg_apply_bytes(psock, tosend);
472		*copied -= (tosend + delta); // <==== HERE
473		return -EACCES;
Therefore, it could lead to the following BUG with a proper value of 'copied' (thanks to syzbot). We should not use negative 'copied' as a return value here.
------------[ cut here ]------------ kernel BUG at net/socket.c:733! Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP Modules linked in: CPU: 0 UID: 0 PID: 3265 Comm: syz-executor510 Not tainted 6.11.0-rc3-syzkaller-00060-gd07b43284ab3 #0 Hardware name: linux,dummy-virt (DT) pstate: 61400009 (nZCv daif +PAN -UAO -TCO +DIT -SSBS BTYPE=--) pc : sock_sendmsg_nosec net/socket.c:733 [inline] pc : sock_sendmsg_nosec net/socket.c:728 [inline] pc : __sock_sendmsg+0x5c/0x60 net/socket.c:745 lr : sock_sendmsg_nosec net/socket.c:730 [inline] lr : __sock_sendmsg+0x54/0x60 net/socket.c:745 sp : ffff800088ea3b30 x29: ffff800088ea3b30 x28: fbf00000062bc900 x27: 0000000000000000 x26: ffff800088ea3bc0 x25: ffff800088ea3bc0 x24: 0000000000000000 x23: f9f00000048dc000 x22: 0000000000000000 x21: ffff800088ea3d90 x20: f9f00000048dc000 x19: ffff800088ea3d90 x18: 0000000000000001 x17: 0000000000000000 x16: 0000000000000000 x15: 000000002002ffaf x14: 0000000000000000 x13: 0000000000000000 x12: 0000000000000000 x11: 0000000000000000 x10: ffff8000815849c0 x9 : ffff8000815b49c0 x8 : 0000000000000000 x7 : 000000000000003f x6 : 0000000000000000 x5 : 00000000000007e0 x4 : fff07ffffd239000 x3 : fbf00000062bc900 x2 : 0000000000000000 x1 : 0000000000000000 x0 : 00000000fffffdef Call trace: sock_sendmsg_nosec net/socket.c:733 [inline] __sock_sendmsg+0x5c/0x60 net/socket.c:745 ____sys_sendmsg+0x274/0x2ac net/socket.c:2597 ___sys_sendmsg+0xac/0x100 net/socket.c:2651 __sys_sendmsg+0x84/0xe0 net/socket.c:2680 __do_sys_sendmsg net/socket.c:2689 [inline] __se_sys_sendmsg net/socket.c:2687 [inline] __arm64_sys_sendmsg+0x24/0x30 net/socket.c:2687 __invoke_syscall arch/arm64/kernel/syscall.c:35 [inline] invoke_syscall+0x48/0x110 arch/arm64/kernel/syscall.c:49 el0_svc_common.constprop.0+0x40/0xe0 arch/arm64/kernel/syscall.c:132 do_el0_svc+0x1c/0x28 arch/arm64/kernel/syscall.c:151 el0_svc+0x34/0xec arch/arm64/kernel/entry-common.c:712 el0t_64_sync_handler+0x100/0x12c arch/arm64/kernel/entry-common.c:730 el0t_64_sync+0x19c/0x1a0 arch/arm64/kernel/entry.S:598 Code: f9404463 d63f0060 3108441f 54fffe81 (d4210000) ---[ end trace 0000000000000000 ]---
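The return-value hazard can be reduced to a few lines of userspace C (values are illustrative; -529 matches the x0 value in the trace above and corresponds to -EIOCBQUEUED, an error code the socket core never expects back from a sendmsg handler):

  #include <stdio.h>
  #include <errno.h>

  #define EIOCBQUEUED 529	/* kernel-internal errno, defined here for the sketch */

  static int old_return(int copied, int err)
  {
  	return copied ? copied : err;		/* negative "copied" leaks through */
  }

  static int new_return(int copied, int err)
  {
  	return copied > 0 ? copied : err;	/* fall back to err instead */
  }

  int main(void)
  {
  	int copied = -EIOCBQUEUED;	/* a negative count like the one in the trace */
  	int err = -EACCES;

  	printf("old: %d  new: %d\n",
  	       old_return(copied, err), new_return(copied, err));
  	return 0;
  }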
Fixes: 4f738adba30a ("bpf: create tcp_bpf_ulp allowing BPF to monitor socket TX/RX data") Reported-by: syzbot+58c03971700330ce14d8@syzkaller.appspotmail.com Cc: Jakub Sitnicki jakub@cloudflare.com Signed-off-by: Cong Wang cong.wang@bytedance.com Reviewed-by: John Fastabend john.fastabend@gmail.com Acked-by: Martin KaFai Lau martin.lau@kernel.org Link: https://patch.msgid.link/20240821030744.320934-1-xiyou.wangcong@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/tcp_bpf.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/ipv4/tcp_bpf.c +++ b/net/ipv4/tcp_bpf.c @@ -572,7 +572,7 @@ out_err: err = sk_stream_error(sk, msg->msg_flags, err); release_sock(sk); sk_psock_put(sk, psock); - return copied ? copied : err; + return copied > 0 ? copied : err; }
static int tcp_bpf_sendpage(struct sock *sk, struct page *page, int offset,
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Eric Dumazet edumazet@google.com
commit 031ae72825cef43e4650140b800ad58bf7a6a466 upstream.
syzbot found a use-after-free Read in ila_nf_input [1]
The issue here is that ila_xlat_exit_net() frees the rhashtable, then calls nf_unregister_net_hooks().
It should be done in the reverse order, with a synchronize_rcu().
This is a good match for a pre_exit() method.
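The two-phase teardown can be sketched in plain C (stand-in structures, not the pernet_operations API): stop the consumers in a pre_exit step before the exit step frees what they use.

  #include <stdio.h>
  #include <stdlib.h>

  struct xlat { int *table; int hooks_registered; };

  static void pre_exit(struct xlat *x)
  {
  	if (x->hooks_registered) {
  		x->hooks_registered = 0;	/* ~ nf_unregister_net_hooks() */
  		/* in the kernel, a synchronize_rcu() between the phases lets
  		 * in-flight hook invocations finish before the table goes away
  		 */
  	}
  }

  static void exit_net(struct xlat *x)
  {
  	free(x->table);				/* ~ rhashtable_free_and_destroy() */
  	x->table = NULL;
  }

  int main(void)
  {
  	struct xlat x = { .table = malloc(16 * sizeof(int)), .hooks_registered = 1 };

  	pre_exit(&x);	/* previously the free happened first: use-after-free */
  	exit_net(&x);
  	printf("teardown done\n");
  	return 0;
  }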
[1] BUG: KASAN: use-after-free in rht_key_hashfn include/linux/rhashtable.h:159 [inline] BUG: KASAN: use-after-free in __rhashtable_lookup include/linux/rhashtable.h:604 [inline] BUG: KASAN: use-after-free in rhashtable_lookup include/linux/rhashtable.h:646 [inline] BUG: KASAN: use-after-free in rhashtable_lookup_fast+0x77a/0x9b0 include/linux/rhashtable.h:672 Read of size 4 at addr ffff888064620008 by task ksoftirqd/0/16
CPU: 0 UID: 0 PID: 16 Comm: ksoftirqd/0 Not tainted 6.11.0-rc4-syzkaller-00238-g2ad6d23f465a #0 Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 08/06/2024 Call Trace: <TASK> __dump_stack lib/dump_stack.c:93 [inline] dump_stack_lvl+0x241/0x360 lib/dump_stack.c:119 print_address_description mm/kasan/report.c:377 [inline] print_report+0x169/0x550 mm/kasan/report.c:488 kasan_report+0x143/0x180 mm/kasan/report.c:601 rht_key_hashfn include/linux/rhashtable.h:159 [inline] __rhashtable_lookup include/linux/rhashtable.h:604 [inline] rhashtable_lookup include/linux/rhashtable.h:646 [inline] rhashtable_lookup_fast+0x77a/0x9b0 include/linux/rhashtable.h:672 ila_lookup_wildcards net/ipv6/ila/ila_xlat.c:132 [inline] ila_xlat_addr net/ipv6/ila/ila_xlat.c:652 [inline] ila_nf_input+0x1fe/0x3c0 net/ipv6/ila/ila_xlat.c:190 nf_hook_entry_hookfn include/linux/netfilter.h:154 [inline] nf_hook_slow+0xc3/0x220 net/netfilter/core.c:626 nf_hook include/linux/netfilter.h:269 [inline] NF_HOOK+0x29e/0x450 include/linux/netfilter.h:312 __netif_receive_skb_one_core net/core/dev.c:5661 [inline] __netif_receive_skb+0x1ea/0x650 net/core/dev.c:5775 process_backlog+0x662/0x15b0 net/core/dev.c:6108 __napi_poll+0xcb/0x490 net/core/dev.c:6772 napi_poll net/core/dev.c:6841 [inline] net_rx_action+0x89b/0x1240 net/core/dev.c:6963 handle_softirqs+0x2c4/0x970 kernel/softirq.c:554 run_ksoftirqd+0xca/0x130 kernel/softirq.c:928 smpboot_thread_fn+0x544/0xa30 kernel/smpboot.c:164 kthread+0x2f0/0x390 kernel/kthread.c:389 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244 </TASK>
The buggy address belongs to the physical page: page: refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x64620 flags: 0xfff00000000000(node=0|zone=1|lastcpupid=0x7ff) page_type: 0xbfffffff(buddy) raw: 00fff00000000000 ffffea0000959608 ffffea00019d9408 0000000000000000 raw: 0000000000000000 0000000000000003 00000000bfffffff 0000000000000000 page dumped because: kasan: bad access detected page_owner tracks the page as freed page last allocated via order 3, migratetype Unmovable, gfp_mask 0x52dc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_ZERO), pid 5242, tgid 5242 (syz-executor), ts 73611328570, free_ts 618981657187 set_page_owner include/linux/page_owner.h:32 [inline] post_alloc_hook+0x1f3/0x230 mm/page_alloc.c:1493 prep_new_page mm/page_alloc.c:1501 [inline] get_page_from_freelist+0x2e4c/0x2f10 mm/page_alloc.c:3439 __alloc_pages_noprof+0x256/0x6c0 mm/page_alloc.c:4695 __alloc_pages_node_noprof include/linux/gfp.h:269 [inline] alloc_pages_node_noprof include/linux/gfp.h:296 [inline] ___kmalloc_large_node+0x8b/0x1d0 mm/slub.c:4103 __kmalloc_large_node_noprof+0x1a/0x80 mm/slub.c:4130 __do_kmalloc_node mm/slub.c:4146 [inline] __kmalloc_node_noprof+0x2d2/0x440 mm/slub.c:4164 __kvmalloc_node_noprof+0x72/0x190 mm/util.c:650 bucket_table_alloc lib/rhashtable.c:186 [inline] rhashtable_init_noprof+0x534/0xa60 lib/rhashtable.c:1071 ila_xlat_init_net+0xa0/0x110 net/ipv6/ila/ila_xlat.c:613 ops_init+0x359/0x610 net/core/net_namespace.c:139 setup_net+0x515/0xca0 net/core/net_namespace.c:343 copy_net_ns+0x4e2/0x7b0 net/core/net_namespace.c:508 create_new_namespaces+0x425/0x7b0 kernel/nsproxy.c:110 unshare_nsproxy_namespaces+0x124/0x180 kernel/nsproxy.c:228 ksys_unshare+0x619/0xc10 kernel/fork.c:3328 __do_sys_unshare kernel/fork.c:3399 [inline] __se_sys_unshare kernel/fork.c:3397 [inline] __x64_sys_unshare+0x38/0x40 kernel/fork.c:3397 page last free pid 11846 tgid 11846 stack trace: reset_page_owner include/linux/page_owner.h:25 [inline] free_pages_prepare mm/page_alloc.c:1094 [inline] free_unref_page+0xd22/0xea0 mm/page_alloc.c:2612 __folio_put+0x2c8/0x440 mm/swap.c:128 folio_put include/linux/mm.h:1486 [inline] free_large_kmalloc+0x105/0x1c0 mm/slub.c:4565 kfree+0x1c4/0x360 mm/slub.c:4588 rhashtable_free_and_destroy+0x7c6/0x920 lib/rhashtable.c:1169 ila_xlat_exit_net+0x55/0x110 net/ipv6/ila/ila_xlat.c:626 ops_exit_list net/core/net_namespace.c:173 [inline] cleanup_net+0x802/0xcc0 net/core/net_namespace.c:640 process_one_work kernel/workqueue.c:3231 [inline] process_scheduled_works+0xa2c/0x1830 kernel/workqueue.c:3312 worker_thread+0x86d/0xd40 kernel/workqueue.c:3390 kthread+0x2f0/0x390 kernel/kthread.c:389 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
Memory state around the buggy address:
 ffff88806461ff00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
 ffff88806461ff80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff888064620000: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
                   ^
 ffff888064620080: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
 ffff888064620100: ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff ff
Fixes: 7f00feaf1076 ("ila: Add generic ILA translation facility") Reported-by: syzbot syzkaller@googlegroups.com Signed-off-by: Eric Dumazet edumazet@google.com Cc: Tom Herbert tom@herbertland.com Reviewed-by: Florian Westphal fw@strlen.de Link: https://patch.msgid.link/20240904144418.1162839-1-edumazet@google.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv6/ila/ila.h | 1 + net/ipv6/ila/ila_main.c | 6 ++++++ net/ipv6/ila/ila_xlat.c | 13 +++++++++---- 3 files changed, 16 insertions(+), 4 deletions(-)
--- a/net/ipv6/ila/ila.h +++ b/net/ipv6/ila/ila.h @@ -108,6 +108,7 @@ int ila_lwt_init(void); void ila_lwt_fini(void);
int ila_xlat_init_net(struct net *net); +void ila_xlat_pre_exit_net(struct net *net); void ila_xlat_exit_net(struct net *net);
int ila_xlat_nl_cmd_add_mapping(struct sk_buff *skb, struct genl_info *info); --- a/net/ipv6/ila/ila_main.c +++ b/net/ipv6/ila/ila_main.c @@ -72,6 +72,11 @@ ila_xlat_init_fail: return err; }
+static __net_exit void ila_pre_exit_net(struct net *net) +{ + ila_xlat_pre_exit_net(net); +} + static __net_exit void ila_exit_net(struct net *net) { ila_xlat_exit_net(net); @@ -79,6 +84,7 @@ static __net_exit void ila_exit_net(stru
static struct pernet_operations ila_net_ops = { .init = ila_init_net, + .pre_exit = ila_pre_exit_net, .exit = ila_exit_net, .id = &ila_net_id, .size = sizeof(struct ila_net), --- a/net/ipv6/ila/ila_xlat.c +++ b/net/ipv6/ila/ila_xlat.c @@ -620,6 +620,15 @@ int ila_xlat_init_net(struct net *net) return 0; }
+void ila_xlat_pre_exit_net(struct net *net) +{ + struct ila_net *ilan = net_generic(net, ila_net_id); + + if (ilan->xlat.hooks_registered) + nf_unregister_net_hooks(net, ila_nf_hook_ops, + ARRAY_SIZE(ila_nf_hook_ops)); +} + void ila_xlat_exit_net(struct net *net) { struct ila_net *ilan = net_generic(net, ila_net_id); @@ -627,10 +636,6 @@ void ila_xlat_exit_net(struct net *net) rhashtable_free_and_destroy(&ilan->xlat.rhash_table, ila_free_cb, NULL);
free_bucket_spinlocks(ilan->xlat.locks); - - if (ilan->xlat.hooks_registered) - nf_unregister_net_hooks(net, ila_nf_hook_ops, - ARRAY_SIZE(ila_nf_hook_ops)); }
static int ila_xlat_addr(struct sk_buff *skb, bool sir2ila)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Toke Høiland-Jørgensen toke@redhat.com
commit 546ea84d07e3e324644025e2aae2d12ea4c5896e upstream.
In sch_cake, when running in dst/src host fairness mode, we keep a count of active bulk flows per host, which is used as the round-robin weight when iterating through flows. The count of active bulk flows is updated whenever a flow changes state.
This has a peculiar interaction with the hash collision handling: when a hash collision occurs (after the set-associative hashing), the state of the hash bucket is simply updated to match the new packet that collided, and if host fairness is enabled, that also means assigning new per-host state to the flow. For this reason, the bulk flow counters of the host(s) assigned to the flow are decremented, before new state is assigned (and the counters, which may not belong to the same host anymore, are incremented again).
Back when this code was introduced, the host fairness mode was always enabled, so the decrement was unconditional. When the configuration flags were introduced, the *increment* was made conditional, but the *decrement* was not. This can of course lead to a spurious decrement (and an associated wrap-around to U16_MAX).
AFAICT, when host fairness is disabled, the decrement and wrap-around happen as soon as a hash collision occurs (which is not that common in itself, due to the set-associative hashing). However, in most cases this is harmless, as the value is only used when host fairness mode is enabled. So in order to trigger an array overflow, sch_cake first has to be configured with host fairness disabled, and while running in this mode, a hash collision has to occur to cause the overflow. Then, the qdisc has to be reconfigured to enable host fairness, which leads to an out-of-bounds array access because the wrapped-around value is retained and used as an array index. It seems that syzbot managed to trigger this, which is quite impressive in its own right.
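To make the wrap-around concrete: the per-host bulk flow counters are 16-bit unsigned values, so decrementing a counter that is already zero wraps to U16_MAX. A minimal standalone illustration (not kernel code):

  #include <stdint.h>
  #include <stdio.h>

  int main(void)
  {
          uint16_t srchost_bulk_flow_count = 0;     /* never incremented: host fairness was off */

          srchost_bulk_flow_count--;                /* spurious decrement on hash collision */
          printf("%u\n", srchost_bulk_flow_count);  /* prints 65535 == U16_MAX */
          return 0;
  }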
This patch fixes the issue by introducing the same conditional check on decrement as is used on increment.
The original bug predates the upstreaming of cake, but the commit listed in the Fixes tag touched that code, meaning that this patch won't apply before that.
Fixes: 712639929912 ("sch_cake: Make the dual modes fairer") Reported-by: syzbot+7fe7b81d602cc1e6b94d@syzkaller.appspotmail.com Signed-off-by: Toke Høiland-Jørgensen toke@redhat.com Link: https://patch.msgid.link/20240903160846.20909-1-toke@redhat.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/sch_cake.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)
--- a/net/sched/sch_cake.c +++ b/net/sched/sch_cake.c @@ -785,12 +785,15 @@ skip_hash: * queue, accept the collision, update the host tags. */ q->way_collisions++; - if (q->flows[outer_hash + k].set == CAKE_SET_BULK) { - q->hosts[q->flows[reduced_hash].srchost].srchost_bulk_flow_count--; - q->hosts[q->flows[reduced_hash].dsthost].dsthost_bulk_flow_count--; - } allocate_src = cake_dsrc(flow_mode); allocate_dst = cake_ddst(flow_mode); + + if (q->flows[outer_hash + k].set == CAKE_SET_BULK) { + if (allocate_src) + q->hosts[q->flows[reduced_hash].srchost].srchost_bulk_flow_count--; + if (allocate_dst) + q->hosts[q->flows[reduced_hash].dsthost].dsthost_bulk_flow_count--; + } found: /* reserve queue for future packets in same flow */ reduced_hash = outer_hash + k;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ryusuke Konishi konishi.ryusuke@gmail.com
commit 5787fcaab9eb5930f5378d6a1dd03d916d146622 upstream.
In an error injection test of a routine for mount-time recovery, KASAN found a use-after-free bug.
It turned out that if data recovery was performed using partial logs created by dsync writes, but an error occurred before starting the log writer to create a recovered checkpoint, the inodes whose data had been recovered were left in the ns_dirty_files list of the nilfs object and were not freed.
Fix this issue by cleaning up inodes that have read the recovery data if the recovery routine fails midway before the log writer starts.
Link: https://lkml.kernel.org/r/20240810065242.3701-1-konishi.ryusuke@gmail.com Fixes: 0f3e1c7f23f8 ("nilfs2: recovery functions") Signed-off-by: Ryusuke Konishi konishi.ryusuke@gmail.com Tested-by: Ryusuke Konishi konishi.ryusuke@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nilfs2/recovery.c | 35 +++++++++++++++++++++++++++++++++-- 1 file changed, 33 insertions(+), 2 deletions(-)
--- a/fs/nilfs2/recovery.c +++ b/fs/nilfs2/recovery.c @@ -709,6 +709,33 @@ static void nilfs_finish_roll_forward(st }
/** + * nilfs_abort_roll_forward - cleaning up after a failed rollforward recovery + * @nilfs: nilfs object + */ +static void nilfs_abort_roll_forward(struct the_nilfs *nilfs) +{ + struct nilfs_inode_info *ii, *n; + LIST_HEAD(head); + + /* Abandon inodes that have read recovery data */ + spin_lock(&nilfs->ns_inode_lock); + list_splice_init(&nilfs->ns_dirty_files, &head); + spin_unlock(&nilfs->ns_inode_lock); + if (list_empty(&head)) + return; + + set_nilfs_purging(nilfs); + list_for_each_entry_safe(ii, n, &head, i_dirty) { + spin_lock(&nilfs->ns_inode_lock); + list_del_init(&ii->i_dirty); + spin_unlock(&nilfs->ns_inode_lock); + + iput(&ii->vfs_inode); + } + clear_nilfs_purging(nilfs); +} + +/** * nilfs_salvage_orphan_logs - salvage logs written after the latest checkpoint * @nilfs: nilfs object * @sb: super block instance @@ -766,15 +793,19 @@ int nilfs_salvage_orphan_logs(struct the if (unlikely(err)) { nilfs_err(sb, "error %d writing segment for recovery", err); - goto failed; + goto put_root; }
nilfs_finish_roll_forward(nilfs, ri); }
- failed: +put_root: nilfs_put_root(root); return err; + +failed: + nilfs_abort_roll_forward(nilfs); + goto put_root; }
/**
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ryusuke Konishi konishi.ryusuke@gmail.com
commit 683408258917541bdb294cd717c210a04381931e upstream.
The superblock buffers of nilfs2 can not only be overwritten at runtime for modifications/repairs, but they are also regularly swapped, replaced during resizing, and even abandoned when degrading to one side due to backing device issues. So, accessing them requires mutual exclusion using the reader/writer semaphore "nilfs->ns_sem".
Some sysfs attribute show methods read this superblock buffer without the necessary mutual exclusion, which can cause problems with pointer dereferencing and memory access, so fix it.
Link: https://lkml.kernel.org/r/20240811100320.9913-1-konishi.ryusuke@gmail.com Fixes: da7141fb78db ("nilfs2: add /sys/fs/nilfs2/<device> group") Signed-off-by: Ryusuke Konishi konishi.ryusuke@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nilfs2/sysfs.c | 43 +++++++++++++++++++++++++++++++++---------- 1 file changed, 33 insertions(+), 10 deletions(-)
--- a/fs/nilfs2/sysfs.c +++ b/fs/nilfs2/sysfs.c @@ -836,9 +836,15 @@ ssize_t nilfs_dev_revision_show(struct n struct the_nilfs *nilfs, char *buf) { - struct nilfs_super_block **sbp = nilfs->ns_sbp; - u32 major = le32_to_cpu(sbp[0]->s_rev_level); - u16 minor = le16_to_cpu(sbp[0]->s_minor_rev_level); + struct nilfs_super_block *raw_sb; + u32 major; + u16 minor; + + down_read(&nilfs->ns_sem); + raw_sb = nilfs->ns_sbp[0]; + major = le32_to_cpu(raw_sb->s_rev_level); + minor = le16_to_cpu(raw_sb->s_minor_rev_level); + up_read(&nilfs->ns_sem);
return sysfs_emit(buf, "%d.%d\n", major, minor); } @@ -856,8 +862,13 @@ ssize_t nilfs_dev_device_size_show(struc struct the_nilfs *nilfs, char *buf) { - struct nilfs_super_block **sbp = nilfs->ns_sbp; - u64 dev_size = le64_to_cpu(sbp[0]->s_dev_size); + struct nilfs_super_block *raw_sb; + u64 dev_size; + + down_read(&nilfs->ns_sem); + raw_sb = nilfs->ns_sbp[0]; + dev_size = le64_to_cpu(raw_sb->s_dev_size); + up_read(&nilfs->ns_sem);
return sysfs_emit(buf, "%llu\n", dev_size); } @@ -879,9 +890,15 @@ ssize_t nilfs_dev_uuid_show(struct nilfs struct the_nilfs *nilfs, char *buf) { - struct nilfs_super_block **sbp = nilfs->ns_sbp; + struct nilfs_super_block *raw_sb; + ssize_t len; + + down_read(&nilfs->ns_sem); + raw_sb = nilfs->ns_sbp[0]; + len = sysfs_emit(buf, "%pUb\n", raw_sb->s_uuid); + up_read(&nilfs->ns_sem);
- return sysfs_emit(buf, "%pUb\n", sbp[0]->s_uuid); + return len; }
static @@ -889,10 +906,16 @@ ssize_t nilfs_dev_volume_name_show(struc struct the_nilfs *nilfs, char *buf) { - struct nilfs_super_block **sbp = nilfs->ns_sbp; + struct nilfs_super_block *raw_sb; + ssize_t len; + + down_read(&nilfs->ns_sem); + raw_sb = nilfs->ns_sbp[0]; + len = scnprintf(buf, sizeof(raw_sb->s_volume_name), "%s\n", + raw_sb->s_volume_name); + up_read(&nilfs->ns_sem);
- return scnprintf(buf, sizeof(sbp[0]->s_volume_name), "%s\n", - sbp[0]->s_volume_name); + return len; }
static const char dev_readme_str[] =
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ryusuke Konishi konishi.ryusuke@gmail.com
commit 6576dd6695f2afca3f4954029ac4a64f82ba60ab upstream.
After commit a694291a6211 ("nilfs2: separate wait function from nilfs_segctor_write") was applied, the log writing function nilfs_segctor_do_construct() was able to issue I/O requests continuously even if user data blocks were split into multiple logs across segments, but two potential flaws were introduced in its error handling.
First, if nilfs_segctor_begin_construction() fails while creating the second or subsequent logs, the log writing function returns without calling nilfs_segctor_abort_construction(), so the writeback flag set on pages/folios will remain uncleared. This causes page cache operations to hang waiting for the writeback flag. For example, truncate_inode_pages_final(), which is called via nilfs_evict_inode() when an inode is evicted from memory, will hang.
Second, the NILFS_I_COLLECTED flag set on normal inodes remains uncleared. As a result, if the next log write involves checkpoint creation, that's fine, but if a partial log write that does not create a checkpoint is performed, inodes with NILFS_I_COLLECTED set are erroneously removed from the "sc_dirty_files" list, and their data and b-tree blocks may not be written to the device, corrupting the block mapping.
Fix these issues by uniformly calling nilfs_segctor_abort_construction() on failure of each step in the loop in nilfs_segctor_do_construct(), having it clean up logs and segment usages according to progress, and correcting the conditions for calling nilfs_redirty_inodes() to ensure that the NILFS_I_COLLECTED flag is cleared.
Link: https://lkml.kernel.org/r/20240814101119.4070-1-konishi.ryusuke@gmail.com Fixes: a694291a6211 ("nilfs2: separate wait function from nilfs_segctor_write") Signed-off-by: Ryusuke Konishi konishi.ryusuke@gmail.com Tested-by: Ryusuke Konishi konishi.ryusuke@gmail.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/nilfs2/segment.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
--- a/fs/nilfs2/segment.c +++ b/fs/nilfs2/segment.c @@ -1833,6 +1833,9 @@ static void nilfs_segctor_abort_construc nilfs_abort_logs(&logs, ret ? : err);
list_splice_tail_init(&sci->sc_segbufs, &logs); + if (list_empty(&logs)) + return; /* if the first segment buffer preparation failed */ + nilfs_cancel_segusage(&logs, nilfs->ns_sufile); nilfs_free_incomplete_logs(&logs, nilfs);
@@ -2077,7 +2080,7 @@ static int nilfs_segctor_do_construct(st
err = nilfs_segctor_begin_construction(sci, nilfs); if (unlikely(err)) - goto out; + goto failed;
/* Update time stamp */ sci->sc_seg_ctime = ktime_get_real_seconds(); @@ -2140,10 +2143,9 @@ static int nilfs_segctor_do_construct(st return err;
failed_to_write: - if (sci->sc_stage.flags & NILFS_CF_IFILE_STARTED) - nilfs_redirty_inodes(&sci->sc_dirty_files); - failed: + if (mode == SC_LSEG_SR && nilfs_sc_cstage_get(sci) >= NILFS_ST_IFILE) + nilfs_redirty_inodes(&sci->sc_dirty_files); if (nilfs_doing_gc()) nilfs_redirty_inodes(&sci->sc_gc_inodes); nilfs_segctor_abort_construction(sci, nilfs, err);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takashi Iwai tiwai@suse.de
[ Upstream commit 50ed081284fe2bfd1f25e8b92f4f6a4990e73c0a ]
Although we already have a mechanism for sanity checks of input values for control writes, it is not applied unless the kconfig CONFIG_SND_CTL_INPUT_VALIDATION is set, for performance reasons. Nevertheless, it still makes sense to apply the same check to user elements despite its cost, as that's the only way to filter out invalid values; the user controls are handled solely in ALSA core code, and there is no corresponding driver, after all.
This patch adds the same input value validation for user control elements at its put callback. The kselftest will be happier with this change, as the incorrect values will be bailed out now with errors.
For other normal controls, the check is applied still only when CONFIG_SND_CTL_INPUT_VALIDATION is set.
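For integer controls, such validation essentially means checking each written value against the range advertised in the element info. A simplified sketch of the idea, using a made-up helper name (the patch itself calls sanity_check_input_values() in sound/core/control.c):

  #include <linux/errno.h>
  #include <sound/control.h>

  /* illustrative only: range check for an integer control */
  static int example_check_int_values(const struct snd_ctl_elem_info *info,
                                      const struct snd_ctl_elem_value *val)
  {
          unsigned int i;

          for (i = 0; i < info->count; i++) {
                  long v = val->value.integer.value[i];

                  if (v < info->value.integer.min ||
                      v > info->value.integer.max)
                          return -EINVAL;
          }
          return 0;
  }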
Reported-by: Paul Menzel pmenzel@molgen.mpg.de Closes: https://lore.kernel.org/r/1d44be36-9bb9-4d82-8953-5ae2a4f09405@molgen.mpg.de Reviewed-by: Jaroslav Kysela perex@perex.cz Reviewed-by: Mark Brown broonie@kernel.org Reviewed-by: Takashi Sakamoto o-takashi@sakamocchi.jp Signed-off-by: Takashi Iwai tiwai@suse.de Link: https://lore.kernel.org/20240616073454.16512-4-tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/core/control.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/sound/core/control.c b/sound/core/control.c index 82aa1af1d1d8..92266c97238d 100644 --- a/sound/core/control.c +++ b/sound/core/control.c @@ -1477,12 +1477,16 @@ static int snd_ctl_elem_user_get(struct snd_kcontrol *kcontrol, static int snd_ctl_elem_user_put(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_value *ucontrol) { - int change; + int err, change; struct user_element *ue = kcontrol->private_data; unsigned int size = ue->elem_data_size; char *dst = ue->elem_data + snd_ctl_get_ioff(kcontrol, &ucontrol->id) * size;
+ err = sanity_check_input_values(ue->card, ucontrol, &ue->info, false); + if (err < 0) + return err; + change = memcmp(&ucontrol->value, dst, size) != 0; if (change) memcpy(dst, &ucontrol->value, size);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Takashi Iwai tiwai@suse.de
[ Upstream commit 6278056e42d953e207e2afd416be39d09ed2d496 ]
Add a simple sanity check to HD-audio HDMI Channel Map controls. Although the value might not be accepted for the actual connection, we can filter out some bogus values beforehand, and that should be enough for making kselftest happier.
Reviewed-by: Jaroslav Kysela perex@perex.cz Signed-off-by: Takashi Iwai tiwai@suse.de Link: https://lore.kernel.org/20240616073454.16512-7-tiwai@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- sound/hda/hdmi_chmap.c | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+)
diff --git a/sound/hda/hdmi_chmap.c b/sound/hda/hdmi_chmap.c index 5d8e1d944b0a..7b276047f85a 100644 --- a/sound/hda/hdmi_chmap.c +++ b/sound/hda/hdmi_chmap.c @@ -753,6 +753,20 @@ static int hdmi_chmap_ctl_get(struct snd_kcontrol *kcontrol, return 0; }
+/* a simple sanity check for input values to chmap kcontrol */ +static int chmap_value_check(struct hdac_chmap *hchmap, + const struct snd_ctl_elem_value *ucontrol) +{ + int i; + + for (i = 0; i < hchmap->channels_max; i++) { + if (ucontrol->value.integer.value[i] < 0 || + ucontrol->value.integer.value[i] > SNDRV_CHMAP_LAST) + return -EINVAL; + } + return 0; +} + static int hdmi_chmap_ctl_put(struct snd_kcontrol *kcontrol, struct snd_ctl_elem_value *ucontrol) { @@ -764,6 +778,10 @@ static int hdmi_chmap_ctl_put(struct snd_kcontrol *kcontrol, unsigned char chmap[8], per_pin_chmap[8]; int i, err, ca, prepared = 0;
+ err = chmap_value_check(hchmap, ucontrol); + if (err < 0) + return err; + /* No monitor is connected in dyn_pcm_assign. * It's invalid to setup the chmap */
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Konstantin Andreev andreev@swemel.ru
[ Upstream commit e86cac0acdb1a74f608bacefe702f2034133a047 ]
When a process accept()s connection from a unix socket (either stream or seqpacket) it gets the socket with the label of the connecting process.
For example, if a connecting process has a label 'foo', the accept()ed socket will also have 'in' and 'out' labels 'foo', regardless of the label of the listener process.
This is because kernel creates unix child sockets in the context of the connecting process.
I do not see any obvious way for the listener to abuse alien labels coming with the new socket, but, to be on the safe side, it's better to fix the new socket's labels.
Signed-off-by: Konstantin Andreev andreev@swemel.ru Signed-off-by: Casey Schaufler casey@schaufler-ca.com Signed-off-by: Sasha Levin sashal@kernel.org --- security/smack/smack_lsm.c | 12 +++++++++--- 1 file changed, 9 insertions(+), 3 deletions(-)
diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c index 75b3e91d5a5f..c18366dbbfed 100644 --- a/security/smack/smack_lsm.c +++ b/security/smack/smack_lsm.c @@ -3706,12 +3706,18 @@ static int smack_unix_stream_connect(struct sock *sock, } }
- /* - * Cross reference the peer labels for SO_PEERSEC. - */ if (rc == 0) { + /* + * Cross reference the peer labels for SO_PEERSEC. + */ nsp->smk_packet = ssp->smk_out; ssp->smk_packet = osp->smk_out; + + /* + * new/child/established socket must inherit listening socket labels + */ + nsp->smk_out = osp->smk_out; + nsp->smk_in = osp->smk_in; }
return rc;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alexey Dobriyan adobriyan@gmail.com
[ Upstream commit 2a97388a807b6ab5538aa8f8537b2463c6988bd2 ]
The ELF loader uses "randomize_va_space" twice. It is a sysctl and can change at any moment, so two loads could, in theory, see two different values, with unpredictable consequences.
Issue exactly one load for a consistent value across one exec.
Signed-off-by: Alexey Dobriyan adobriyan@gmail.com Link: https://lore.kernel.org/r/3329905c-7eb8-400a-8f0a-d87cff979b5b@p183 Signed-off-by: Kees Cook kees@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/binfmt_elf.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/fs/binfmt_elf.c b/fs/binfmt_elf.c index e6c9c0e08448..89e7e4826efc 100644 --- a/fs/binfmt_elf.c +++ b/fs/binfmt_elf.c @@ -1009,7 +1009,8 @@ static int load_elf_binary(struct linux_binprm *bprm) if (elf_read_implies_exec(*elf_ex, executable_stack)) current->personality |= READ_IMPLIES_EXEC;
- if (!(current->personality & ADDR_NO_RANDOMIZE) && randomize_va_space) + const int snapshot_randomize_va_space = READ_ONCE(randomize_va_space); + if (!(current->personality & ADDR_NO_RANDOMIZE) && snapshot_randomize_va_space) current->flags |= PF_RANDOMIZE;
setup_new_exec(bprm); @@ -1301,7 +1302,7 @@ static int load_elf_binary(struct linux_binprm *bprm) mm->end_data = end_data; mm->start_stack = bprm->p;
- if ((current->flags & PF_RANDOMIZE) && (randomize_va_space > 1)) { + if ((current->flags & PF_RANDOMIZE) && (snapshot_randomize_va_space > 1)) { /* * For architectures with ELF randomization, when executing * a loader directly (i.e. no interpreter listed in ELF
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pali Rohár pali@kernel.org
[ Upstream commit 3cef738208e5c3cb7084e208caf9bbf684f24feb ]
IRQs 0 (IPI) and 1 (MSI) are handled internally by this driver; generic_handle_domain_irq() is never called for these IRQs.
Disallow mapping these IRQs.
[ Marek: changed commit message ]
Signed-off-by: Pali Rohár pali@kernel.org Signed-off-by: Marek Behún kabel@kernel.org Signed-off-by: Thomas Gleixner tglx@linutronix.de Reviewed-by: Andrew Lunn andrew@lunn.ch Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/irqchip/irq-armada-370-xp.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/drivers/irqchip/irq-armada-370-xp.c b/drivers/irqchip/irq-armada-370-xp.c index ee18eb3e72b7..ab02b44a3b4e 100644 --- a/drivers/irqchip/irq-armada-370-xp.c +++ b/drivers/irqchip/irq-armada-370-xp.c @@ -567,6 +567,10 @@ static struct irq_chip armada_370_xp_irq_chip = { static int armada_370_xp_mpic_irq_map(struct irq_domain *h, unsigned int virq, irq_hw_number_t hw) { + /* IRQs 0 and 1 cannot be mapped, they are handled internally */ + if (hw <= 1) + return -EINVAL; + armada_370_xp_irq_mask(irq_get_irq_data(virq)); if (!is_percpu_irq(hw)) writel(hw, per_cpu_int_base +
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima kuniyu@amazon.com
[ Upstream commit e4bd881d987121dbf1a288641491955a53d9f8f7 ]
When (AF_UNIX, SOCK_STREAM) socket connect()s to a listening socket, the listener's sk_peer_pid/sk_peer_cred are copied to the client in copy_peercred().
Then, the client's sk_peer_pid and sk_peer_cred are always NULL, so we need not call put_pid() and put_cred() there.
Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- net/unix/af_unix.c | 9 +-------- 1 file changed, 1 insertion(+), 8 deletions(-)
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 7d59f9a6c904..5ce60087086c 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -680,9 +680,6 @@ static void init_peercred(struct sock *sk)
static void copy_peercred(struct sock *sk, struct sock *peersk) { - const struct cred *old_cred; - struct pid *old_pid; - if (sk < peersk) { spin_lock(&sk->sk_peer_lock); spin_lock_nested(&peersk->sk_peer_lock, SINGLE_DEPTH_NESTING); @@ -690,16 +687,12 @@ static void copy_peercred(struct sock *sk, struct sock *peersk) spin_lock(&peersk->sk_peer_lock); spin_lock_nested(&sk->sk_peer_lock, SINGLE_DEPTH_NESTING); } - old_pid = sk->sk_peer_pid; - old_cred = sk->sk_peer_cred; + sk->sk_peer_pid = get_pid(peersk->sk_peer_pid); sk->sk_peer_cred = get_cred(peersk->sk_peer_cred);
spin_unlock(&sk->sk_peer_lock); spin_unlock(&peersk->sk_peer_lock); - - put_pid(old_pid); - put_cred(old_cred); }
static int unix_listen(struct socket *sock, int backlog)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Brian Johannesmeyer bjohannesmeyer@gmail.com
[ Upstream commit bf6ab33d8487f5e2a0998ce75286eae65bb0a6d6 ]
When called with a 'from' that is not 4-byte-aligned, string_memcpy_fromio() calls the movs() macro to copy the first few bytes, so that 'from' becomes 4-byte-aligned before calling rep_movs(). This movs() macro modifies 'to', and the subsequent line modifies 'n'.
As a result, on unaligned accesses, kmsan_unpoison_memory() uses the updated (aligned) values of 'to' and 'n'. Hence, it does not unpoison the entire region.
Save the original values of 'to' and 'n', and pass those to kmsan_unpoison_memory(), so that the entire region is unpoisoned.
Signed-off-by: Brian Johannesmeyer bjohannesmeyer@gmail.com Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Reviewed-by: Alexander Potapenko glider@google.com Link: https://lore.kernel.org/r/20240523215029.4160518-1-bjohannesmeyer@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/lib/iomem.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/arch/x86/lib/iomem.c b/arch/x86/lib/iomem.c index e0411a3774d4..5eecb45d05d5 100644 --- a/arch/x86/lib/iomem.c +++ b/arch/x86/lib/iomem.c @@ -25,6 +25,9 @@ static __always_inline void rep_movs(void *to, const void *from, size_t n)
static void string_memcpy_fromio(void *to, const volatile void __iomem *from, size_t n) { + const void *orig_to = to; + const size_t orig_n = n; + if (unlikely(!n)) return;
@@ -39,7 +42,7 @@ static void string_memcpy_fromio(void *to, const volatile void __iomem *from, si } rep_movs(to, (const void *)from, n); /* KMSAN must treat values read from devices as initialized. */ - kmsan_unpoison_memory(to, n); + kmsan_unpoison_memory(orig_to, orig_n); }
static void string_memcpy_toio(volatile void __iomem *to, const void *from, size_t n)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jernej Skrabec jernej.skrabec@gmail.com
[ Upstream commit 927c70c93d929f4c2dcaf72f51b31bb7d118a51a ]
The Allwinner H6 IOMMU has a bypass register, which makes it possible to circumvent the page tables for each possible master. The reset value for this register is 0, which disables the bypass. The Allwinner H616 IOMMU resets this register to 0x7f, which activates the bypass for all masters; this is not what we want.
Always clear this register to 0, to enforce the usage of page tables, and make this driver compatible with the H616 in this respect.
Signed-off-by: Jernej Skrabec jernej.skrabec@gmail.com Signed-off-by: Andre Przywara andre.przywara@arm.com Reviewed-by: Chen-Yu Tsai wens@csie.org Link: https://lore.kernel.org/r/20240616224056.29159-2-andre.przywara@arm.com Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/sun50i-iommu.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/iommu/sun50i-iommu.c b/drivers/iommu/sun50i-iommu.c index 5b585eace3d4..e8dc1a7d9491 100644 --- a/drivers/iommu/sun50i-iommu.c +++ b/drivers/iommu/sun50i-iommu.c @@ -449,6 +449,7 @@ static int sun50i_iommu_enable(struct sun50i_iommu *iommu) IOMMU_TLB_PREFETCH_MASTER_ENABLE(3) | IOMMU_TLB_PREFETCH_MASTER_ENABLE(4) | IOMMU_TLB_PREFETCH_MASTER_ENABLE(5)); + iommu_write(iommu, IOMMU_BYPASS_REG, 0); iommu_write(iommu, IOMMU_INT_ENABLE_REG, IOMMU_INT_MASK); iommu_write(iommu, IOMMU_DM_AUT_CTRL_REG(SUN50I_IOMMU_ACI_NONE), IOMMU_DM_AUT_CTRL_RD_UNAVAIL(SUN50I_IOMMU_ACI_NONE, 0) |
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yunjian Wang wangyunjian@huawei.com
[ Upstream commit 0b88d1654d556264bcd24a9cb6383f0888e30131 ]
Static code checking reports a warning about an implicit narrowing conversion from type 'unsigned int' to the smaller type 'u8' when assigning to the 'keylen' variable. Fix it by removing the 'keylen' variable and using data->keylen directly.
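The warning is about the pattern rather than an actual overflow here (the key length is small in practice). A standalone illustration of the implicit narrowing it complains about:

  #include <stdio.h>

  int main(void)
  {
          unsigned int keylen = 300;    /* 'unsigned int', as in struct nf_conncount_data */
          unsigned char k8 = keylen;    /* implicit narrowing to 8 bits */

          printf("%u\n", k8);           /* prints 44, the low 8 bits of 300 */
          return 0;
  }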
Signed-off-by: Yunjian Wang wangyunjian@huawei.com Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/netfilter/nf_conncount.c | 8 +++----- 1 file changed, 3 insertions(+), 5 deletions(-)
diff --git a/net/netfilter/nf_conncount.c b/net/netfilter/nf_conncount.c index 5d8ed6c90b7e..5885810da412 100644 --- a/net/netfilter/nf_conncount.c +++ b/net/netfilter/nf_conncount.c @@ -321,7 +321,6 @@ insert_tree(struct net *net, struct nf_conncount_rb *rbconn; struct nf_conncount_tuple *conn; unsigned int count = 0, gc_count = 0; - u8 keylen = data->keylen; bool do_gc = true;
spin_lock_bh(&nf_conncount_locks[hash]); @@ -333,7 +332,7 @@ insert_tree(struct net *net, rbconn = rb_entry(*rbnode, struct nf_conncount_rb, node);
parent = *rbnode; - diff = key_diff(key, rbconn->key, keylen); + diff = key_diff(key, rbconn->key, data->keylen); if (diff < 0) { rbnode = &((*rbnode)->rb_left); } else if (diff > 0) { @@ -378,7 +377,7 @@ insert_tree(struct net *net,
conn->tuple = *tuple; conn->zone = *zone; - memcpy(rbconn->key, key, sizeof(u32) * keylen); + memcpy(rbconn->key, key, sizeof(u32) * data->keylen);
nf_conncount_list_init(&rbconn->list); list_add(&conn->node, &rbconn->list.head); @@ -403,7 +402,6 @@ count_tree(struct net *net, struct rb_node *parent; struct nf_conncount_rb *rbconn; unsigned int hash; - u8 keylen = data->keylen;
hash = jhash2(key, data->keylen, conncount_rnd) % CONNCOUNT_SLOTS; root = &data->root[hash]; @@ -414,7 +412,7 @@ count_tree(struct net *net,
rbconn = rb_entry(parent, struct nf_conncount_rb, node);
- diff = key_diff(key, rbconn->key, keylen); + diff = key_diff(key, rbconn->key, data->keylen); if (diff < 0) { parent = rcu_dereference_raw(parent->rb_left); } else if (diff > 0) {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jan Kara jack@suse.cz
[ Upstream commit ebbe26fd54a9621994bc16b14f2ba8f84c089693 ]
Avoid mounting filesystems where the partition would overflow the 32-bits used for block number. Also refuse to mount filesystems where the partition length is so large we cannot safely index bits in a block bitmap.
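The checks rely on check_add_overflow() from <linux/overflow.h>, which stores the sum in its third argument and returns true if the addition wrapped. A minimal usage sketch under that assumption (illustrative, not the patch itself):

  #include <linux/overflow.h>
  #include <linux/types.h>

  /* illustrative: reject a partition whose end would not fit in 32 bits */
  static bool partition_end_fits(u32 start, u32 len)
  {
          u32 end;

          if (check_add_overflow(start, len, &end))
                  return false;   /* start + len wrapped past U32_MAX */

          return true;
  }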
Link: https://patch.msgid.link/20240620130403.14731-1-jack@suse.cz Signed-off-by: Jan Kara jack@suse.cz Signed-off-by: Sasha Levin sashal@kernel.org --- fs/udf/super.c | 15 +++++++++++++++ 1 file changed, 15 insertions(+)
diff --git a/fs/udf/super.c b/fs/udf/super.c index 3b6419f29a4c..fa790be4f19f 100644 --- a/fs/udf/super.c +++ b/fs/udf/super.c @@ -1084,12 +1084,19 @@ static int udf_fill_partdesc_info(struct super_block *sb, struct udf_part_map *map; struct udf_sb_info *sbi = UDF_SB(sb); struct partitionHeaderDesc *phd; + u32 sum; int err;
map = &sbi->s_partmaps[p_index];
map->s_partition_len = le32_to_cpu(p->partitionLength); /* blocks */ map->s_partition_root = le32_to_cpu(p->partitionStartingLocation); + if (check_add_overflow(map->s_partition_root, map->s_partition_len, + &sum)) { + udf_err(sb, "Partition %d has invalid location %u + %u\n", + p_index, map->s_partition_root, map->s_partition_len); + return -EFSCORRUPTED; + }
if (p->accessType == cpu_to_le32(PD_ACCESS_TYPE_READ_ONLY)) map->s_partition_flags |= UDF_PART_FLAG_READ_ONLY; @@ -1145,6 +1152,14 @@ static int udf_fill_partdesc_info(struct super_block *sb, bitmap->s_extPosition = le32_to_cpu( phd->unallocSpaceBitmap.extPosition); map->s_partition_flags |= UDF_PART_FLAG_UNALLOC_BITMAP; + /* Check whether math over bitmap won't overflow. */ + if (check_add_overflow(map->s_partition_len, + sizeof(struct spaceBitmapDesc) << 3, + &sum)) { + udf_err(sb, "Partition %d is too long (%u)\n", p_index, + map->s_partition_len); + return -EFSCORRUPTED; + } udf_debug("unallocSpaceBitmap (part %d) @ %u\n", p_index, bitmap->s_extPosition); }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Konstantin Komarov almaz.alexandrovich@paragon-software.com
[ Upstream commit a0dde5d7a58b6bf9184ef3d8c6e62275c3645584 ]
In addition to returning an error, mark the node as bad.
Signed-off-by: Konstantin Komarov almaz.alexandrovich@paragon-software.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ntfs3/frecord.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/fs/ntfs3/frecord.c b/fs/ntfs3/frecord.c index 6cce71cc750e..7bfdc91fae1e 100644 --- a/fs/ntfs3/frecord.c +++ b/fs/ntfs3/frecord.c @@ -1601,8 +1601,10 @@ int ni_delete_all(struct ntfs_inode *ni) asize = le32_to_cpu(attr->size); roff = le16_to_cpu(attr->nres.run_off);
- if (roff > asize) + if (roff > asize) { + _ntfs_bad_inode(&ni->vfs_inode); return -EINVAL; + }
/* run==1 means unpack and deallocate. */ run_unpack_ex(RUN_DEALLOCATE, sbi, ni->mi.rno, svcn, evcn, svcn,
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hans Verkuil hverkuil-cisco@xs4all.nl
[ Upstream commit 0fd7c0c2c156270dceb8c15fad3120cdce03e539 ]
In several places a division by fmt->vdownsampling[p] was missing in the sizeimage[p] calculation, causing incorrect behavior for multiplanar formats where some planes are smaller than the first plane.
Found by new v4l2-compliance tests.
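As a worked example of the corrected per-plane size (illustrative numbers, not from the patch): for a two-plane format whose second plane is vertically downsampled by 2, plane 1 needs bytesperline * height / 2 bytes, not bytesperline * height.

  /* illustrative only: per-plane image size with vertical downsampling */
  static unsigned int plane_sizeimage(unsigned int bytesperline,
                                      unsigned int height,
                                      unsigned int vdownsampling,
                                      unsigned int data_offset)
  {
          return bytesperline * height / vdownsampling + data_offset;
  }

  /* e.g. 1280 bytes per line, 720 lines, vdownsampling[1] == 2:
   * 1280 * 720 / 2 = 460800 bytes for plane 1 (plus data_offset),
   * half of what the old formula required.
   */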
Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/test-drivers/vivid/vivid-vid-cap.c | 5 +++-- drivers/media/test-drivers/vivid/vivid-vid-out.c | 16 +++++++++------- 2 files changed, 12 insertions(+), 9 deletions(-)
diff --git a/drivers/media/test-drivers/vivid/vivid-vid-cap.c b/drivers/media/test-drivers/vivid/vivid-vid-cap.c index c0999581c599..bff95cb57718 100644 --- a/drivers/media/test-drivers/vivid/vivid-vid-cap.c +++ b/drivers/media/test-drivers/vivid/vivid-vid-cap.c @@ -113,8 +113,9 @@ static int vid_cap_queue_setup(struct vb2_queue *vq, if (*nplanes != buffers) return -EINVAL; for (p = 0; p < buffers; p++) { - if (sizes[p] < tpg_g_line_width(&dev->tpg, p) * h + - dev->fmt_cap->data_offset[p]) + if (sizes[p] < tpg_g_line_width(&dev->tpg, p) * h / + dev->fmt_cap->vdownsampling[p] + + dev->fmt_cap->data_offset[p]) return -EINVAL; } } else { diff --git a/drivers/media/test-drivers/vivid/vivid-vid-out.c b/drivers/media/test-drivers/vivid/vivid-vid-out.c index 9f731f085179..e96d3d014143 100644 --- a/drivers/media/test-drivers/vivid/vivid-vid-out.c +++ b/drivers/media/test-drivers/vivid/vivid-vid-out.c @@ -63,14 +63,16 @@ static int vid_out_queue_setup(struct vb2_queue *vq, if (sizes[0] < size) return -EINVAL; for (p = 1; p < planes; p++) { - if (sizes[p] < dev->bytesperline_out[p] * h + - vfmt->data_offset[p]) + if (sizes[p] < dev->bytesperline_out[p] * h / + vfmt->vdownsampling[p] + + vfmt->data_offset[p]) return -EINVAL; } } else { for (p = 0; p < planes; p++) - sizes[p] = p ? dev->bytesperline_out[p] * h + - vfmt->data_offset[p] : size; + sizes[p] = p ? dev->bytesperline_out[p] * h / + vfmt->vdownsampling[p] + + vfmt->data_offset[p] : size; }
if (vq->num_buffers + *nbuffers < 2) @@ -127,7 +129,7 @@ static int vid_out_buf_prepare(struct vb2_buffer *vb)
for (p = 0; p < planes; p++) { if (p) - size = dev->bytesperline_out[p] * h; + size = dev->bytesperline_out[p] * h / vfmt->vdownsampling[p]; size += vb->planes[p].data_offset;
if (vb2_get_plane_payload(vb, p) < size) { @@ -334,8 +336,8 @@ int vivid_g_fmt_vid_out(struct file *file, void *priv, for (p = 0; p < mp->num_planes; p++) { mp->plane_fmt[p].bytesperline = dev->bytesperline_out[p]; mp->plane_fmt[p].sizeimage = - mp->plane_fmt[p].bytesperline * mp->height + - fmt->data_offset[p]; + mp->plane_fmt[p].bytesperline * mp->height / + fmt->vdownsampling[p] + fmt->data_offset[p]; } for (p = fmt->buffers; p < fmt->planes; p++) { unsigned stride = dev->bytesperline_out[p];
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
[ Upstream commit 7f9ab862e05c5bc755f65bf6db7edcffb3b49dfc ]
Add a missing call to of_node_put(np) on error.
Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Link: https://lore.kernel.org/r/20240606173037.3091598-2-andriy.shevchenko@linux.i... Signed-off-by: Lee Jones lee@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/leds/leds-spi-byte.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/leds/leds-spi-byte.c b/drivers/leds/leds-spi-byte.c index 2bc5c99daf51..6883d3ba382f 100644 --- a/drivers/leds/leds-spi-byte.c +++ b/drivers/leds/leds-spi-byte.c @@ -91,7 +91,6 @@ static int spi_byte_probe(struct spi_device *spi) dev_err(dev, "Device must have exactly one LED sub-node."); return -EINVAL; } - child = of_get_next_available_child(dev_of_node(dev), NULL);
led = devm_kzalloc(dev, sizeof(*led), GFP_KERNEL); if (!led) @@ -107,11 +106,13 @@ static int spi_byte_probe(struct spi_device *spi) led->ldev.max_brightness = led->cdef->max_value - led->cdef->off_value; led->ldev.brightness_set_blocking = spi_byte_brightness_set_blocking;
+ child = of_get_next_available_child(dev_of_node(dev), NULL); state = of_get_property(child, "default-state", NULL); if (state) { if (!strcmp(state, "on")) { led->ldev.brightness = led->ldev.max_brightness; } else if (strcmp(state, "off")) { + of_node_put(child); /* all other cases except "off" */ dev_err(dev, "default-state can only be 'on' or 'off'"); return -EINVAL; @@ -122,9 +123,12 @@ static int spi_byte_probe(struct spi_device *spi)
ret = devm_led_classdev_register(&spi->dev, &led->ldev); if (ret) { + of_node_put(child); mutex_destroy(&led->mutex); return ret; } + + of_node_put(child); spi_set_drvdata(spi, led);
return 0;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Arend van Spriel arend.vanspriel@broadcom.com
[ Upstream commit dbb5265a5d7cca1cdba7736dba313ab7d07bc19d ]
After being asked about WPA3 support for the BCM43224 chipset, it was found that all it takes is setting the MFP_CAPABLE flag; mac80211 will take care of all that is needed [1].
Link: https://lore.kernel.org/linux-wireless/20200526155909.5807-2-Larry.Finger@lw... [1] Signed-off-by: Arend van Spriel arend.vanspriel@broadcom.com Tested-by: Reijer Boekhoff reijerboekhoff@protonmail.com Signed-off-by: Kalle Valo kvalo@kernel.org Link: https://patch.msgid.link/20240617122609.349582-1-arend.vanspriel@broadcom.co... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/broadcom/brcm80211/brcmsmac/mac80211_if.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/wireless/broadcom/brcm80211/brcmsmac/mac80211_if.c b/drivers/net/wireless/broadcom/brcm80211/brcmsmac/mac80211_if.c index a4034d44609b..94b1e4f15b41 100644 --- a/drivers/net/wireless/broadcom/brcm80211/brcmsmac/mac80211_if.c +++ b/drivers/net/wireless/broadcom/brcm80211/brcmsmac/mac80211_if.c @@ -1089,6 +1089,7 @@ static int ieee_hw_init(struct ieee80211_hw *hw) ieee80211_hw_set(hw, AMPDU_AGGREGATION); ieee80211_hw_set(hw, SIGNAL_DBM); ieee80211_hw_set(hw, REPORTS_TX_ACK_STATUS); + ieee80211_hw_set(hw, MFP_CAPABLE);
hw->extra_tx_headroom = brcms_c_get_header_len(); hw->queues = N_TX_QUEUES;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shantanu Goel sgoel01@yahoo.com
[ Upstream commit 9d32685a251a754f1823d287df233716aa23bcb9 ]
Set the host status byte when a data completion error is encountered otherwise the upper layer may end up using the invalid zero'ed data. The following output was observed from scsi/sd.c prior to this fix.
[ 11.872824] sd 0:0:0:1: [sdf] tag#9 data cmplt err -75 uas-tag 1 inflight: [ 11.872826] sd 0:0:0:1: [sdf] tag#9 CDB: Read capacity(16) 9e 10 00 00 00 00 00 00 00 00 00 00 00 20 00 00 [ 11.872830] sd 0:0:0:1: [sdf] Sector size 0 reported, assuming 512.
Signed-off-by: Shantanu Goel sgoel01@yahoo.com Acked-by: Oliver Neukum oneukum@suse.com Link: https://lore.kernel.org/r/87msnx4ec6.fsf@yahoo.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/storage/uas.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/usb/storage/uas.c b/drivers/usb/storage/uas.c index af619efe8eab..b565c1eb84b3 100644 --- a/drivers/usb/storage/uas.c +++ b/drivers/usb/storage/uas.c @@ -422,6 +422,7 @@ static void uas_data_cmplt(struct urb *urb) uas_log_cmd_state(cmnd, "data cmplt err", status); /* error: no data transfered */ scsi_set_resid(cmnd, sdb->length); + set_host_byte(cmnd, DID_ERROR); } else { scsi_set_resid(cmnd, sdb->length - urb->actual_length); }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ma Ke make24@iscas.ac.cn
[ Upstream commit ee0d382feb44ec0f445e2ad63786cd7f3f6a8199 ]
We should verify the array bound to ensure that the host cannot manipulate the index to point past the end of the endpoint array.
Found by static analysis.
Signed-off-by: Ma Ke make24@iscas.ac.cn Reviewed-by: Andrew Jeffery andrew@codeconstruct.com.au Acked-by: Andrew Jeffery andrew@codeconstruct.com.au Link: https://lore.kernel.org/r/20240625022306.2568122-1-make24@iscas.ac.cn Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/usb/gadget/udc/aspeed_udc.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/usb/gadget/udc/aspeed_udc.c b/drivers/usb/gadget/udc/aspeed_udc.c index cedf17e38245..1a5a1115c1d9 100644 --- a/drivers/usb/gadget/udc/aspeed_udc.c +++ b/drivers/usb/gadget/udc/aspeed_udc.c @@ -1009,6 +1009,8 @@ static void ast_udc_getstatus(struct ast_udc_dev *udc) break; case USB_RECIP_ENDPOINT: epnum = crq.wIndex & USB_ENDPOINT_NUMBER_MASK; + if (epnum >= AST_UDC_NUM_ENDPOINTS) + goto stall; status = udc->ep[epnum].stopped; break; default:
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Hung alex.hung@amd.com
[ Upstream commit 5d93060d430b359e16e7c555c8f151ead1ac614b ]
[WHAT & HOW] Check mod_hdcp_execute_and_set() return values in authenticated_dp.
This fixes 3 CHECKED_RETURN issues reported by Coverity.
Reviewed-by: Rodrigo Siqueira rodrigo.siqueira@amd.com Signed-off-by: Alex Hung alex.hung@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- .../amd/display/modules/hdcp/hdcp1_execution.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-)
diff --git a/drivers/gpu/drm/amd/display/modules/hdcp/hdcp1_execution.c b/drivers/gpu/drm/amd/display/modules/hdcp/hdcp1_execution.c index 1ddb4f5eac8e..93c0455766dd 100644 --- a/drivers/gpu/drm/amd/display/modules/hdcp/hdcp1_execution.c +++ b/drivers/gpu/drm/amd/display/modules/hdcp/hdcp1_execution.c @@ -433,17 +433,20 @@ static enum mod_hdcp_status authenticated_dp(struct mod_hdcp *hdcp, }
if (status == MOD_HDCP_STATUS_SUCCESS) - mod_hdcp_execute_and_set(mod_hdcp_read_bstatus, + if (!mod_hdcp_execute_and_set(mod_hdcp_read_bstatus, &input->bstatus_read, &status, - hdcp, "bstatus_read"); + hdcp, "bstatus_read")) + goto out; if (status == MOD_HDCP_STATUS_SUCCESS) - mod_hdcp_execute_and_set(check_link_integrity_dp, + if (!mod_hdcp_execute_and_set(check_link_integrity_dp, &input->link_integrity_check, &status, - hdcp, "link_integrity_check"); + hdcp, "link_integrity_check")) + goto out; if (status == MOD_HDCP_STATUS_SUCCESS) - mod_hdcp_execute_and_set(check_no_reauthentication_request_dp, + if (!mod_hdcp_execute_and_set(check_no_reauthentication_request_dp, &input->reauth_request_check, &status, - hdcp, "reauth_request_check"); + hdcp, "reauth_request_check")) + goto out; out: return status; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hawking Zhang Hawking.Zhang@amd.com
[ Upstream commit bdbdc7cecd00305dc844a361f9883d3a21022027 ]
adev->gfx.imu.funcs could be NULL
Signed-off-by: Hawking Zhang Hawking.Zhang@amd.com Reviewed-by: Likun Gao Likun.Gao@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c index 1f9f7fdd4b8e..c76895cca4d9 100644 --- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c +++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c @@ -4330,11 +4330,11 @@ static int gfx_v11_0_hw_init(void *handle) /* RLC autoload sequence 1: Program rlc ram */ if (adev->gfx.imu.funcs->program_rlc_ram) adev->gfx.imu.funcs->program_rlc_ram(adev); + /* rlc autoload firmware */ + r = gfx_v11_0_rlc_backdoor_autoload_enable(adev); + if (r) + return r; } - /* rlc autoload firmware */ - r = gfx_v11_0_rlc_backdoor_autoload_enable(adev); - if (r) - return r; } else { if (adev->firmware.load_type == AMDGPU_FW_LOAD_DIRECT) { if (adev->gfx.imu.funcs && (amdgpu_dpm > 0)) {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Danijel Slivka danijel.slivka@amd.com
[ Upstream commit afbf7955ff01e952dbdd465fa25a2ba92d00291c ]
Why: Setting the IH_RB_WPTR register to 0 will not clear the RB_OVERFLOW bit if RB_ENABLE is not set.
How to fix: Set the WPTR_OVERFLOW_CLEAR bit after the RB_ENABLE bit is set. RB_ENABLE must be set, together with the WPTR_OVERFLOW_ENABLE bit, so that setting WPTR_OVERFLOW_CLEAR actually clears RB_OVERFLOW.
Signed-off-by: Danijel Slivka danijel.slivka@amd.com Reviewed-by: Christian König christian.koenig@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/ih_v6_0.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/ih_v6_0.c b/drivers/gpu/drm/amd/amdgpu/ih_v6_0.c index 657e4ca6f9dd..fccbec438bbe 100644 --- a/drivers/gpu/drm/amd/amdgpu/ih_v6_0.c +++ b/drivers/gpu/drm/amd/amdgpu/ih_v6_0.c @@ -135,6 +135,34 @@ static int ih_v6_0_toggle_ring_interrupts(struct amdgpu_device *adev,
tmp = RREG32(ih_regs->ih_rb_cntl); tmp = REG_SET_FIELD(tmp, IH_RB_CNTL, RB_ENABLE, (enable ? 1 : 0)); + + if (enable) { + /* Unset the CLEAR_OVERFLOW bit to make sure the next step + * is switching the bit from 0 to 1 + */ + tmp = REG_SET_FIELD(tmp, IH_RB_CNTL, WPTR_OVERFLOW_CLEAR, 0); + if (amdgpu_sriov_vf(adev) && amdgpu_sriov_reg_indirect_ih(adev)) { + if (psp_reg_program(&adev->psp, ih_regs->psp_reg_id, tmp)) + return -ETIMEDOUT; + } else { + WREG32_NO_KIQ(ih_regs->ih_rb_cntl, tmp); + } + + /* Clear RB_OVERFLOW bit */ + tmp = REG_SET_FIELD(tmp, IH_RB_CNTL, WPTR_OVERFLOW_CLEAR, 1); + if (amdgpu_sriov_vf(adev) && amdgpu_sriov_reg_indirect_ih(adev)) { + if (psp_reg_program(&adev->psp, ih_regs->psp_reg_id, tmp)) + return -ETIMEDOUT; + } else { + WREG32_NO_KIQ(ih_regs->ih_rb_cntl, tmp); + } + + /* Unset the CLEAR_OVERFLOW bit immediately so new overflows + * can be detected. + */ + tmp = REG_SET_FIELD(tmp, IH_RB_CNTL, WPTR_OVERFLOW_CLEAR, 0); + } + /* enable_intr field is only valid in ring0 */ if (ih == &adev->irq.ih) tmp = REG_SET_FIELD(tmp, IH_RB_CNTL, ENABLE_INTR, (enable ? 1 : 0));
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hans Verkuil hverkuil-cisco@xs4all.nl
[ Upstream commit 17763960b1784578e8fe915304b330922f646209 ]
When setting the EDID it would attempt to update two controls that are only present if there is an HDMI output configured.
If there isn't any (e.g. when the vivid module is loaded with node_types=1), then calling VIDIOC_S_EDID would crash.
Fix this by first checking if outputs are present.
Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/test-drivers/vivid/vivid-vid-cap.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/media/test-drivers/vivid/vivid-vid-cap.c b/drivers/media/test-drivers/vivid/vivid-vid-cap.c index bff95cb57718..3864df45077d 100644 --- a/drivers/media/test-drivers/vivid/vivid-vid-cap.c +++ b/drivers/media/test-drivers/vivid/vivid-vid-cap.c @@ -1810,8 +1810,10 @@ int vidioc_s_edid(struct file *file, void *_fh, return -EINVAL; if (edid->blocks == 0) { dev->edid_blocks = 0; - v4l2_ctrl_s_ctrl(dev->ctrl_tx_edid_present, 0); - v4l2_ctrl_s_ctrl(dev->ctrl_tx_hotplug, 0); + if (dev->num_outputs) { + v4l2_ctrl_s_ctrl(dev->ctrl_tx_edid_present, 0); + v4l2_ctrl_s_ctrl(dev->ctrl_tx_hotplug, 0); + } phys_addr = CEC_PHYS_ADDR_INVALID; goto set_phys_addr; } @@ -1835,8 +1837,10 @@ int vidioc_s_edid(struct file *file, void *_fh, display_present |= dev->display_present[i] << j++;
- v4l2_ctrl_s_ctrl(dev->ctrl_tx_edid_present, display_present); - v4l2_ctrl_s_ctrl(dev->ctrl_tx_hotplug, display_present); + if (dev->num_outputs) { + v4l2_ctrl_s_ctrl(dev->ctrl_tx_edid_present, display_present); + v4l2_ctrl_s_ctrl(dev->ctrl_tx_hotplug, display_present); + }
set_phys_addr: /* TODO: a proper hotplug detect cycle should be emulated here */
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kishon Vijay Abraham I kishon@ti.com
[ Upstream commit 86f271f22bbb6391410a07e08d6ca3757fda01fa ]
Errata #i2037 in AM65x/DRA80xM Processors Silicon Revision 1.0 (SPRZ452D_July 2018_Revised December 2019 [1]) mentions that when an inbound PCIe TLP spans more than two internal AXI 128-byte bursts, the bus may corrupt the packet payload, and the corrupt data may cause associated applications or the processor to hang.
The workaround for Errata #i2037 is to limit the maximum read request size and maximum payload size to 128 bytes. Add workaround for Errata #i2037 here.
The erratum and workaround are applicable only to AM65x SR 1.0; later versions of the silicon have this fixed.
[1] -> https://www.ti.com/lit/er/sprz452i/sprz452i.pdf
Link: https://lore.kernel.org/linux-pci/16e1fcae-1ea7-46be-b157-096e05661b15@sieme... Signed-off-by: Kishon Vijay Abraham I kishon@ti.com Signed-off-by: Achal Verma a-verma1@ti.com Signed-off-by: Vignesh Raghavendra vigneshr@ti.com Signed-off-by: Jan Kiszka jan.kiszka@siemens.com Signed-off-by: Krzysztof Wilczyński kwilczynski@kernel.org Reviewed-by: Siddharth Vadapalli s-vadapalli@ti.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/controller/dwc/pci-keystone.c | 44 ++++++++++++++++++++++- 1 file changed, 43 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/controller/dwc/pci-keystone.c b/drivers/pci/controller/dwc/pci-keystone.c index 6007ffcb4752..e738013c6d4f 100644 --- a/drivers/pci/controller/dwc/pci-keystone.c +++ b/drivers/pci/controller/dwc/pci-keystone.c @@ -35,6 +35,11 @@ #define PCIE_DEVICEID_SHIFT 16
/* Application registers */ +#define PID 0x000 +#define RTL GENMASK(15, 11) +#define RTL_SHIFT 11 +#define AM6_PCI_PG1_RTL_VER 0x15 + #define CMD_STATUS 0x004 #define LTSSM_EN_VAL BIT(0) #define OB_XLAT_EN_VAL BIT(1) @@ -105,6 +110,8 @@
#define to_keystone_pcie(x) dev_get_drvdata((x)->dev)
+#define PCI_DEVICE_ID_TI_AM654X 0xb00c + struct ks_pcie_of_data { enum dw_pcie_device_mode mode; const struct dw_pcie_host_ops *host_ops; @@ -519,7 +526,11 @@ static int ks_pcie_start_link(struct dw_pcie *pci) static void ks_pcie_quirk(struct pci_dev *dev) { struct pci_bus *bus = dev->bus; + struct keystone_pcie *ks_pcie; + struct device *bridge_dev; struct pci_dev *bridge; + u32 val; + static const struct pci_device_id rc_pci_devids[] = { { PCI_DEVICE(PCI_VENDOR_ID_TI, PCIE_RC_K2HK), .class = PCI_CLASS_BRIDGE_PCI_NORMAL, .class_mask = ~0, }, @@ -531,6 +542,11 @@ static void ks_pcie_quirk(struct pci_dev *dev) .class = PCI_CLASS_BRIDGE_PCI_NORMAL, .class_mask = ~0, }, { 0, }, }; + static const struct pci_device_id am6_pci_devids[] = { + { PCI_DEVICE(PCI_VENDOR_ID_TI, PCI_DEVICE_ID_TI_AM654X), + .class = PCI_CLASS_BRIDGE_PCI << 8, .class_mask = ~0, }, + { 0, }, + };
if (pci_is_root_bus(bus)) bridge = dev; @@ -552,10 +568,36 @@ static void ks_pcie_quirk(struct pci_dev *dev) */ if (pci_match_id(rc_pci_devids, bridge)) { if (pcie_get_readrq(dev) > 256) { - dev_info(&dev->dev, "limiting MRRS to 256\n"); + dev_info(&dev->dev, "limiting MRRS to 256 bytes\n"); pcie_set_readrq(dev, 256); } } + + /* + * Memory transactions fail with PCI controller in AM654 PG1.0 + * when MRRS is set to more than 128 bytes. Force the MRRS to + * 128 bytes in all downstream devices. + */ + if (pci_match_id(am6_pci_devids, bridge)) { + bridge_dev = pci_get_host_bridge_device(dev); + if (!bridge_dev && !bridge_dev->parent) + return; + + ks_pcie = dev_get_drvdata(bridge_dev->parent); + if (!ks_pcie) + return; + + val = ks_pcie_app_readl(ks_pcie, PID); + val &= RTL; + val >>= RTL_SHIFT; + if (val != AM6_PCI_PG1_RTL_VER) + return; + + if (pcie_get_readrq(dev) > 128) { + dev_info(&dev->dev, "limiting MRRS to 128 bytes\n"); + pcie_set_readrq(dev, 128); + } + } } DECLARE_PCI_FIXUP_ENABLE(PCI_ANY_ID, PCI_ANY_ID, ks_pcie_quirk);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Torokhov dmitry.torokhov@gmail.com
[ Upstream commit 17f5eebf6780eee50f887542e1833fda95f53e4d ]
Allocating a contiguous buffer of 64K may fail if memory is sufficiently fragmented, and may cause an OOM kill of an unrelated process. However, we do not need contiguous memory. We also do not need to zero out the buffer since it will be overwritten with firmware data.
Switch to using kvmalloc() instead of kzalloc().
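kvmalloc() first tries a physically contiguous kmalloc() allocation and falls back to vmalloc() if that fails, and kvfree() releases either kind, so the buffer no longer needs contiguous pages. A minimal sketch of the pattern used here (illustrative; header locations may vary by kernel version):

  #include <linux/slab.h>          /* kvmalloc(), kvfree() */
  #include <linux/sizes.h>         /* SZ_64K */

  static void *example_alloc_fw_buf(void)
  {
          /* no __GFP_ZERO: the buffer is fully overwritten with firmware data */
          return kvmalloc(SZ_64K, GFP_KERNEL);
  }

  static void example_free_fw_buf(void *buf)
  {
          kvfree(buf);             /* handles both kmalloc and vmalloc backing */
  }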
Link: https://lore.kernel.org/r/20240609234757.610273-1-dmitry.torokhov@gmail.com Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/input/touchscreen/ili210x.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/input/touchscreen/ili210x.c b/drivers/input/touchscreen/ili210x.c index e3a36cd3656c..8c8eea5173f7 100644 --- a/drivers/input/touchscreen/ili210x.c +++ b/drivers/input/touchscreen/ili210x.c @@ -586,7 +586,7 @@ static int ili251x_firmware_to_buffer(const struct firmware *fw, * once, copy them all into this buffer at the right locations, and then * do all operations on this linear buffer. */ - fw_buf = kzalloc(SZ_64K, GFP_KERNEL); + fw_buf = kvmalloc(SZ_64K, GFP_KERNEL); if (!fw_buf) return -ENOMEM;
@@ -616,7 +616,7 @@ static int ili251x_firmware_to_buffer(const struct firmware *fw, return 0;
err_big: - kfree(fw_buf); + kvfree(fw_buf); return error; }
@@ -859,7 +859,7 @@ static ssize_t ili210x_firmware_update_store(struct device *dev, ili210x_hardware_reset(priv->reset_gpio); dev_dbg(dev, "Firmware update ended, error=%i\n", error); enable_irq(client->irq); - kfree(fwbuf); + kvfree(fwbuf); return error; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chen Ni nichen@iscas.ac.cn
[ Upstream commit 4caf6d93d9f2c11d6441c64e1c549c445fa322ed ]
Add a check for the return value of v4l2_fwnode_endpoint_parse() and return the error if it fails.
Signed-off-by: Chen Ni nichen@iscas.ac.cn Signed-off-by: Hans Verkuil hverkuil-cisco@xs4all.nl Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/media/platform/qcom/camss/camss.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/media/platform/qcom/camss/camss.c b/drivers/media/platform/qcom/camss/camss.c index a30461de3e84..d173ac80e01c 100644 --- a/drivers/media/platform/qcom/camss/camss.c +++ b/drivers/media/platform/qcom/camss/camss.c @@ -1038,8 +1038,11 @@ static int camss_of_parse_endpoint_node(struct device *dev, struct v4l2_mbus_config_mipi_csi2 *mipi_csi2; struct v4l2_fwnode_endpoint vep = { { 0 } }; unsigned int i; + int ret;
- v4l2_fwnode_endpoint_parse(of_fwnode_handle(node), &vep); + ret = v4l2_fwnode_endpoint_parse(of_fwnode_handle(node), &vep); + if (ret) + return ret;
csd->interface.csiphy_id = vep.base.port;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jules Irenge jbi.octave@gmail.com
[ Upstream commit 24a025497e7e883bd2adef5d0ece1e9b9268009f ]
Coccinelle reports a warning:
WARNING: Suspicious code. resource_size is maybe missing with root
The root cause is that resource_size() is not used where it is needed.
Use resource_size() on the variable "root" of type struct resource.
Signed-off-by: Jules Irenge jbi.octave@gmail.com Signed-off-by: Dominik Brodowski linux@dominikbrodowski.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pcmcia/yenta_socket.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/pcmcia/yenta_socket.c b/drivers/pcmcia/yenta_socket.c index 3966a6ceb1ac..a1c16352c01c 100644 --- a/drivers/pcmcia/yenta_socket.c +++ b/drivers/pcmcia/yenta_socket.c @@ -638,11 +638,11 @@ static int yenta_search_one_res(struct resource *root, struct resource *res, start = PCIBIOS_MIN_CARDBUS_IO; end = ~0U; } else { - unsigned long avail = root->end - root->start; + unsigned long avail = resource_size(root); int i; size = BRIDGE_MEM_MAX; - if (size > avail/8) { - size = (avail+1)/8; + if (size > (avail - 1) / 8) { + size = avail / 8; /* round size down to next power of 2 */ i = 0; while ((size /= 2) != 0)
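For reference, resource_size() returns the inclusive length end - start + 1, which is why both the comparison and the divisor shift by one in the new code. A hedged arithmetic sketch of the equivalence (plain C, names illustrative):

  /* Illustrative only: with S = resource_size(root) = end - start + 1,
   * the old and new checks compute the same clamp:
   *   old: avail = end - start = S - 1; size > avail / 8       -> size = (avail + 1) / 8 = S / 8
   *   new: avail = S;                   size > (avail - 1) / 8 -> size = avail / 8       = S / 8
   */
  static unsigned long example_clamp_to_eighth(unsigned long size, unsigned long avail)
  {
          if (size > (avail - 1) / 8)
                  size = avail / 8;
          return size;
  }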
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Alex Hung alex.hung@amd.com
[ Upstream commit 116a678f3a9abc24f5c9d2525b7393d18d9eb58e ]
[WHAT & HOW] A denominator cannot be 0, so it is checked before being used.
This fixes 1 DIVIDE_BY_ZERO issue reported by Coverity.
Reviewed-by: Harry Wentland harry.wentland@amd.com Signed-off-by: Jerry Zuo jerry.zuo@amd.com Signed-off-by: Alex Hung alex.hung@amd.com Tested-by: Daniel Wheeler daniel.wheeler@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index 0be1a1149a3f..393e32259a77 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -6767,7 +6767,7 @@ static int dm_update_mst_vcpi_slots_for_dsc(struct drm_atomic_state *state, } }
- if (j == dc_state->stream_count) + if (j == dc_state->stream_count || pbn_div == 0) continue;
slot_num = DIV_ROUND_UP(pbn, pbn_div);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marek Olšák marek.olsak@amd.com
[ Upstream commit 11317d2963fa79767cd7c6231a00a9d77f2e0f54 ]
Fix incorrect check.
Signed-off-by: Marek Olšák marek.olsak@amd.com Acked-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c index aabde6ebb190..ac773b191071 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c @@ -873,8 +873,7 @@ static int check_tiling_flags_gfx6(struct amdgpu_framebuffer *afb) { u64 micro_tile_mode;
- /* Zero swizzle mode means linear */ - if (AMDGPU_TILING_GET(afb->tiling_flags, SWIZZLE_MODE) == 0) + if (AMDGPU_TILING_GET(afb->tiling_flags, ARRAY_MODE) == 1) /* LINEAR_ALIGNED */ return 0;
micro_tile_mode = AMDGPU_TILING_GET(afb->tiling_flags, MICRO_TILE_MODE);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima kuniyu@amazon.com
[ Upstream commit 76fe372ccb81b0c89b6cd2fec26e2f38c958be85 ]
syzkaller reported a warning in bcm_connect() below. [0]
The repro calls connect() to vxcan1, removes vxcan1, and calls connect() with ifindex == 0.
Calling connect() for a BCM socket allocates a proc entry. Then, bcm_sk(sk)->bound is set to 1 to prevent further connect().
However, removing the bound device resets bcm_sk(sk)->bound to 0 in bcm_notify().
The 2nd connect() tries to allocate a proc entry with the same name and sets NULL to bcm_sk(sk)->bcm_proc_read, leaking the original proc entry.
Since the proc entry is available only for connect()ed sockets, let's clean up the entry when the bound netdev is unregistered.
[0]: proc_dir_entry 'can-bcm/2456' already registered WARNING: CPU: 1 PID: 394 at fs/proc/generic.c:376 proc_register+0x645/0x8f0 fs/proc/generic.c:375 Modules linked in: CPU: 1 PID: 394 Comm: syz-executor403 Not tainted 6.10.0-rc7-g852e42cc2dd4 Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.3-0-ga6ed6b701f0a-prebuilt.qemu.org 04/01/2014 RIP: 0010:proc_register+0x645/0x8f0 fs/proc/generic.c:375 Code: 00 00 00 00 00 48 85 ed 0f 85 97 02 00 00 4d 85 f6 0f 85 9f 02 00 00 48 c7 c7 9b cb cf 87 48 89 de 4c 89 fa e8 1c 6f eb fe 90 <0f> 0b 90 90 48 c7 c7 98 37 99 89 e8 cb 7e 22 05 bb 00 00 00 10 48 RSP: 0018:ffa0000000cd7c30 EFLAGS: 00010246 RAX: 9e129be1950f0200 RBX: ff1100011b51582c RCX: ff1100011857cd80 RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000002 RBP: 0000000000000000 R08: ffd400000000000f R09: ff1100013e78cac0 R10: ffac800000cd7980 R11: ff1100013e12b1f0 R12: 0000000000000000 R13: 0000000000000000 R14: 0000000000000000 R15: ff1100011a99a2ec FS: 00007fbd7086f740(0000) GS:ff1100013fd00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00000000200071c0 CR3: 0000000118556004 CR4: 0000000000771ef0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe07f0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <TASK> proc_create_net_single+0x144/0x210 fs/proc/proc_net.c:220 bcm_connect+0x472/0x840 net/can/bcm.c:1673 __sys_connect_file net/socket.c:2049 [inline] __sys_connect+0x5d2/0x690 net/socket.c:2066 __do_sys_connect net/socket.c:2076 [inline] __se_sys_connect net/socket.c:2073 [inline] __x64_sys_connect+0x8f/0x100 net/socket.c:2073 do_syscall_x64 arch/x86/entry/common.c:52 [inline] do_syscall_64+0xd9/0x1c0 arch/x86/entry/common.c:83 entry_SYSCALL_64_after_hwframe+0x4b/0x53 RIP: 0033:0x7fbd708b0e5d Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 73 9f 1b 00 f7 d8 64 89 01 48 RSP: 002b:00007fff8cd33f08 EFLAGS: 00000246 ORIG_RAX: 000000000000002a RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 00007fbd708b0e5d RDX: 0000000000000010 RSI: 0000000020000040 RDI: 0000000000000003 RBP: 0000000000000000 R08: 0000000000000040 R09: 0000000000000040 R10: 0000000000000040 R11: 0000000000000246 R12: 00007fff8cd34098 R13: 0000000000401280 R14: 0000000000406de8 R15: 00007fbd70ab9000 </TASK> remove_proc_entry: removing non-empty directory 'net/can-bcm', leaking at least '2456'
Fixes: ffd980f976e7 ("[CAN]: Add broadcast manager (bcm) protocol") Reported-by: syzkaller syzkaller@googlegroups.com Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Reviewed-by: Simon Horman horms@kernel.org Link: https://lore.kernel.org/all/20240722192842.37421-1-kuniyu@amazon.com Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- net/can/bcm.c | 4 ++++ 1 file changed, 4 insertions(+)
diff --git a/net/can/bcm.c b/net/can/bcm.c index 925d48cc50f8..4ecb5cd8a22d 100644 --- a/net/can/bcm.c +++ b/net/can/bcm.c @@ -1428,6 +1428,10 @@ static void bcm_notify(struct bcm_sock *bo, unsigned long msg,
/* remove device reference, if this is our bound device */ if (bo->bound && bo->ifindex == dev->ifindex) { +#if IS_ENABLED(CONFIG_PROC_FS) + if (sock_net(sk)->can.bcmproc_dir && bo->bcm_proc_read) + remove_proc_entry(bo->procname, sock_net(sk)->can.bcmproc_dir); +#endif bo->bound = 0; bo->ifindex = 0; notify_enodev = 1;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Simon Horman horms@kernel.org
[ Upstream commit 06d4ef3056a7ac31be331281bb7a6302ef5a7f8a ]
It appears that the IRQ requested in m_can_open() may be leaked if an error subsequently occurs, i.e. if m_can_start() fails.
Address this by calling free_irq in the unwind path for such cases.
Flagged by Smatch. Compile tested only.
Fixes: eaacfeaca7ad ("can: m_can: Call the RAM init directly from m_can_chip_config") Acked-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Simon Horman horms@kernel.org Link: https://lore.kernel.org/all/20240805-mcan-irq-v2-1-7154c0484819@kernel.org Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/can/m_can/m_can.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/net/can/m_can/m_can.c b/drivers/net/can/m_can/m_can.c index 2de998b98cb5..561f25cdad3f 100644 --- a/drivers/net/can/m_can/m_can.c +++ b/drivers/net/can/m_can/m_can.c @@ -1815,7 +1815,7 @@ static int m_can_open(struct net_device *dev) /* start the m_can controller */ err = m_can_start(dev); if (err) - goto exit_irq_fail; + goto exit_start_fail;
if (!cdev->is_peripheral) napi_enable(&cdev->napi); @@ -1824,6 +1824,9 @@ static int m_can_open(struct net_device *dev)
return 0;
+exit_start_fail: + if (cdev->is_peripheral || dev->irq) + free_irq(dev->irq, dev); exit_irq_fail: if (cdev->is_peripheral) destroy_workqueue(cdev->tx_wq);
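For illustration, the generic unwind pattern being corrected looks roughly like the hedged sketch below; the handler, the start helper and the labels are hypothetical, not the real m_can code:

  #include <linux/interrupt.h>
  #include <linux/netdevice.h>

  static irqreturn_t example_isr(int irq, void *dev_id)
  {
          return IRQ_HANDLED;             /* placeholder handler */
  }

  static int example_start(struct net_device *dev)
  {
          return 0;                       /* hypothetical start step */
  }

  static int example_open(struct net_device *dev)
  {
          int err;

          err = request_irq(dev->irq, example_isr, IRQF_SHARED, dev->name, dev);
          if (err)
                  return err;

          err = example_start(dev);
          if (err)
                  goto exit_start_fail;   /* must not skip free_irq() */

          return 0;

  exit_start_fail:
          free_irq(dev->irq, dev);
          return err;
  }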
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marc Kleine-Budde mkl@pengutronix.de
[ Upstream commit 50ea5449c56310d2d31c28ba91a59232116d3c1e ]
If the ring (rx, tx) and/or coalescing parameters (rx-frames-irq, tx-frames-irq) have been configured while the interface was in CAN-CC mode, but the interface is brought up in CAN-FD mode, the ring parameters might be too big.
Use the default CAN-FD values in this case.
Fixes: 9263c2e92be9 ("can: mcp251xfd: ring: add support for runtime configurable RX/TX ring parameters") Link: https://lore.kernel.org/all/20240805-mcp251xfd-fix-ringconfig-v1-1-72086f0ca... Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/can/spi/mcp251xfd/mcp251xfd-ram.c | 11 +++++++++- .../net/can/spi/mcp251xfd/mcp251xfd-ring.c | 20 ++++++++++++++++--- 2 files changed, 27 insertions(+), 4 deletions(-)
diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-ram.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-ram.c index 9e8e82cdba46..61b0d6fa52dd 100644 --- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-ram.c +++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-ram.c @@ -97,7 +97,16 @@ void can_ram_get_layout(struct can_ram_layout *layout, if (ring) { u8 num_rx_coalesce = 0, num_tx_coalesce = 0;
- num_rx = can_ram_rounddown_pow_of_two(config, &config->rx, 0, ring->rx_pending); + /* If the ring parameters have been configured in + * CAN-CC mode, but and we are in CAN-FD mode now, + * they might be to big. Use the default CAN-FD values + * in this case. + */ + num_rx = ring->rx_pending; + if (num_rx > layout->max_rx) + num_rx = layout->default_rx; + + num_rx = can_ram_rounddown_pow_of_two(config, &config->rx, 0, num_rx);
/* The ethtool doc says: * To disable coalescing, set usecs = 0 and max_frames = 1. diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-ring.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-ring.c index 4d0246a0779a..915d505a304f 100644 --- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-ring.c +++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-ring.c @@ -458,11 +458,25 @@ int mcp251xfd_ring_alloc(struct mcp251xfd_priv *priv)
/* switching from CAN-2.0 to CAN-FD mode or vice versa */ if (fd_mode != test_bit(MCP251XFD_FLAGS_FD_MODE, priv->flags)) { + const struct ethtool_ringparam ring = { + .rx_pending = priv->rx_obj_num, + .tx_pending = priv->tx->obj_num, + }; + const struct ethtool_coalesce ec = { + .rx_coalesce_usecs_irq = priv->rx_coalesce_usecs_irq, + .rx_max_coalesced_frames_irq = priv->rx_obj_num_coalesce_irq, + .tx_coalesce_usecs_irq = priv->tx_coalesce_usecs_irq, + .tx_max_coalesced_frames_irq = priv->tx_obj_num_coalesce_irq, + }; struct can_ram_layout layout;
- can_ram_get_layout(&layout, &mcp251xfd_ram_config, NULL, NULL, fd_mode); - priv->rx_obj_num = layout.default_rx; - tx_ring->obj_num = layout.default_tx; + can_ram_get_layout(&layout, &mcp251xfd_ram_config, &ring, &ec, fd_mode); + + priv->rx_obj_num = layout.cur_rx; + priv->rx_obj_num_coalesce_irq = layout.rx_coalesce; + + tx_ring->obj_num = layout.cur_tx; + priv->tx_obj_num_coalesce_irq = layout.tx_coalesce; }
if (fd_mode) {
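For illustration, the clamping rule the fix applies boils down to the hedged helper below (names illustrative): a value configured while in CAN-CC mode is replaced by the CAN-FD default whenever it exceeds the CAN-FD maximum.

  /* Illustrative only */
  static unsigned int example_clamp_ring_param(unsigned int configured,
                                               unsigned int max,
                                               unsigned int def)
  {
          return configured > max ? def : configured;
  }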
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthew Maurer mmaurer@google.com
[ Upstream commit 45f97e6385cad6d0e48a27ddcd08793bb4d35851 ]
`awk` is already required by the kernel build, and the `xargs` feature used in current Rust detection is not present in all `xargs` (notably, toybox based xargs, used in the Android kernel build).
Signed-off-by: Matthew Maurer mmaurer@google.com Reviewed-by: Alice Ryhl aliceryhl@google.com Tested-by: Alice Ryhl aliceryhl@google.com Reviewed-by: Martin Rodriguez Reboredo yakoyoku@gmail.com Link: https://lore.kernel.org/r/20230928205045.2375899-1-mmaurer@google.com Signed-off-by: Miguel Ojeda ojeda@kernel.org Stable-dep-of: b8673d56935c ("rust: kbuild: fix export of bss symbols") Signed-off-by: Sasha Levin sashal@kernel.org --- rust/Makefile | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/rust/Makefile b/rust/Makefile index 6d0c0e9757f2..9321a4c83b71 100644 --- a/rust/Makefile +++ b/rust/Makefile @@ -303,9 +303,7 @@ $(obj)/bindings/bindings_helpers_generated.rs: $(src)/helpers.c FORCE quiet_cmd_exports = EXPORTS $@ cmd_exports = \ $(NM) -p --defined-only $< \ - | grep -E ' (T|R|D) ' | cut -d ' ' -f 3 \ - | xargs -Isymbol \ - echo 'EXPORT_SYMBOL_RUST_GPL(symbol);' > $@ + | awk '/ (T|R|D) / {printf "EXPORT_SYMBOL_RUST_GPL(%s);\n",$$3}' > $@
$(obj)/exports_core_generated.h: $(obj)/core.o FORCE $(call if_changed,exports)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andreas Hindborg a.hindborg@samsung.com
[ Upstream commit b8673d56935c32a4e0a1a0b40951fdd313dbf340 ]
Symbols in the bss segment are not currently exported. This is a problem for Rust modules that link against statics resident in the kernel image. Thus export symbols in the bss segment.
Fixes: 2f7ab1267dc9 ("Kbuild: add Rust support") Signed-off-by: Andreas Hindborg a.hindborg@samsung.com Reviewed-by: Alice Ryhl aliceryhl@google.com Tested-by: Alice Ryhl aliceryhl@google.com Reviewed-by: Gary Guo gary@garyguo.net Link: https://lore.kernel.org/r/20240815074519.2684107-2-nmi@metaspace.dk [ Reworded slightly. - Miguel ] Signed-off-by: Miguel Ojeda ojeda@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- rust/Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/rust/Makefile b/rust/Makefile index 9321a4c83b71..28ba3b9ee18d 100644 --- a/rust/Makefile +++ b/rust/Makefile @@ -303,7 +303,7 @@ $(obj)/bindings/bindings_helpers_generated.rs: $(src)/helpers.c FORCE quiet_cmd_exports = EXPORTS $@ cmd_exports = \ $(NM) -p --defined-only $< \ - | awk '/ (T|R|D) / {printf "EXPORT_SYMBOL_RUST_GPL(%s);\n",$$3}' > $@ + | awk '/ (T|R|D|B) / {printf "EXPORT_SYMBOL_RUST_GPL(%s);\n",$$3}' > $@
$(obj)/exports_core_generated.h: $(obj)/core.o FORCE $(call if_changed,exports)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Howells dhowells@redhat.com
[ Upstream commit 91d1dfae464987aaf6c79ff51d8674880fb3be77 ]
Under certain conditions, the range to be cleared by FALLOC_FL_ZERO_RANGE may only be buffered locally and not yet have been flushed to the server. For example:
xfs_io -f -t -c "pwrite -S 0x41 0 4k" \
       -c "pwrite -S 0x42 4k 4k" \
       -c "fzero 0 4k" \
       -c "pread -v 0 8k" /xfstest.test/foo
will write two 4KiB blocks of data, which get buffered in the pagecache, and then fallocate() is used to clear the first 4KiB block on the server - but we don't flush the data first, which means the EOF position on the server is wrong, and so the FSCTL_SET_ZERO_DATA RPC fails (and xfs_io ignores the error), but then when we try to read it, we see the old data.
Fix this by preflushing any part of the target region that is above the server's idea of the EOF position, to force the server to update its EOF position.
Note, however, that we don't want to simply expand the file by moving the EOF before doing the FSCTL_SET_ZERO_DATA[*] because someone else might see the zeroed region or if the RPC fails we then have to try to clean it up or risk getting corruption.
[*] And we have to move the EOF first otherwise FSCTL_SET_ZERO_DATA won't do what we want.
This fixes the generic/008 xfstest.
[!] Note: A better way to do this might be to split the operation into two parts: we only do FSCTL_SET_ZERO_DATA for the part of the range below the server's EOF and then, if that worked, invalidate the buffered pages for the part above the range.
Fixes: 6b69040247e1 ("cifs/smb3: Fix data inconsistent when zero file range") Signed-off-by: David Howells dhowells@redhat.com cc: Steve French stfrench@microsoft.com cc: Zhang Xiaoxu zhangxiaoxu5@huawei.com cc: Pavel Shilovsky pshilov@microsoft.com cc: Paulo Alcantara pc@manguebit.com cc: Shyam Prasad N nspmangalore@gmail.com cc: Rohith Surabattula rohiths.msft@gmail.com cc: Jeff Layton jlayton@kernel.org cc: linux-cifs@vger.kernel.org cc: linux-mm@kvack.org Signed-off-by: Steve French stfrench@microsoft.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/smb/client/smb2ops.c | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/fs/smb/client/smb2ops.c b/fs/smb/client/smb2ops.c index 2291081653a8..5e9478f31d47 100644 --- a/fs/smb/client/smb2ops.c +++ b/fs/smb/client/smb2ops.c @@ -3443,13 +3443,15 @@ static long smb3_zero_data(struct file *file, struct cifs_tcon *tcon, }
static long smb3_zero_range(struct file *file, struct cifs_tcon *tcon, - loff_t offset, loff_t len, bool keep_size) + unsigned long long offset, unsigned long long len, + bool keep_size) { struct cifs_ses *ses = tcon->ses; struct inode *inode = file_inode(file); struct cifsInodeInfo *cifsi = CIFS_I(inode); struct cifsFileInfo *cfile = file->private_data; - unsigned long long new_size; + struct netfs_inode *ictx = netfs_inode(inode); + unsigned long long i_size, new_size, remote_size; long rc; unsigned int xid; __le64 eof; @@ -3462,6 +3464,16 @@ static long smb3_zero_range(struct file *file, struct cifs_tcon *tcon, inode_lock(inode); filemap_invalidate_lock(inode->i_mapping);
+ i_size = i_size_read(inode); + remote_size = ictx->remote_i_size; + if (offset + len >= remote_size && offset < i_size) { + unsigned long long top = umin(offset + len, i_size); + + rc = filemap_write_and_wait_range(inode->i_mapping, offset, top - 1); + if (rc < 0) + goto zero_range_exit; + } + /* * We zero the range through ioctl, so we need remove the page caches * first, otherwise the data may be inconsistent with the server.
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daiwei Li daiweili@google.com
[ Upstream commit ba8cf80724dbc09825b52498e4efacb563935408 ]
82580 NICs have a hardware bug that makes it necessary to write into the TSICR (TimeSync Interrupt Cause) register to clear it: https://lore.kernel.org/all/CDCB8BE0.1EC2C%25matthew.vick@intel.com/
Add a conditional so that we write into the TSICR register only for the 82580; this way we don't risk losing events for other models.
Without this change, when running ptp4l with an Intel 82580 card, I get the following output:
timed out while polling for tx timestamp
increasing tx_timestamp_timeout or increasing kworker priority may correct this issue, but a driver bug likely causes it
This goes away with this change.
This (partially) reverts commit ee14cc9ea19b ("igb: Fix missing time sync events").
Fixes: ee14cc9ea19b ("igb: Fix missing time sync events") Closes: https://lore.kernel.org/intel-wired-lan/CAN0jFd1kO0MMtOh8N2Ztxn6f7vvDKp2h507... Tested-by: Daiwei Li daiweili@google.com Suggested-by: Vinicius Costa Gomes vinicius.gomes@intel.com Signed-off-by: Daiwei Li daiweili@google.com Acked-by: Vinicius Costa Gomes vinicius.gomes@intel.com Reviewed-by: Kurt Kanzenbach kurt@linutronix.de Tested-by: Pucha Himasekhar Reddy himasekharx.reddy.pucha@intel.com (A Contingent worker at Intel) Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igb/igb_main.c | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c index 81d9a5338be5..76bd41058f3a 100644 --- a/drivers/net/ethernet/intel/igb/igb_main.c +++ b/drivers/net/ethernet/intel/igb/igb_main.c @@ -6925,10 +6925,20 @@ static void igb_extts(struct igb_adapter *adapter, int tsintr_tt)
static void igb_tsync_interrupt(struct igb_adapter *adapter) { + const u32 mask = (TSINTR_SYS_WRAP | E1000_TSICR_TXTS | + TSINTR_TT0 | TSINTR_TT1 | + TSINTR_AUTT0 | TSINTR_AUTT1); struct e1000_hw *hw = &adapter->hw; u32 tsicr = rd32(E1000_TSICR); struct ptp_clock_event event;
+ if (hw->mac.type == e1000_82580) { + /* 82580 has a hardware bug that requires an explicit + * write to clear the TimeSync interrupt cause. + */ + wr32(E1000_TSICR, tsicr & mask); + } + if (tsicr & TSINTR_SYS_WRAP) { event.type = PTP_CLOCK_PPS; if (adapter->ptp_caps.pps)
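For illustration, a hedged sketch of a model-gated write-to-clear workaround; the MMIO accessors and the quirk flag are illustrative, igb itself uses its own rd32()/wr32() wrappers:

  #include <linux/io.h>

  /* Illustrative only: some controller revisions clear the interrupt-cause
   * register on read, others need the handled bits written back explicitly. */
  static u32 example_read_and_clear(void __iomem *reg, bool needs_write_to_clear,
                                    u32 mask)
  {
          u32 cause = readl(reg);

          if (needs_write_to_clear)
                  writel(cause & mask, reg);      /* write-1-to-clear */

          return cause;
  }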
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dawid Osuchowski dawid.osuchowski@linux.intel.com
[ Upstream commit d11a67634227f9f9da51938af085fb41a733848f ]
Ethtool callbacks can be executed while reset is in progress and try to access deleted resources, e.g. getting coalesce settings can result in a NULL pointer dereference seen below.
Reproduction steps:
Once the driver is fully initialized, trigger reset:
# echo 1 > /sys/class/net/<interface>/device/reset
When reset is in progress, try to get coalesce settings using ethtool:
# ethtool -c <interface>
BUG: kernel NULL pointer dereference, address: 0000000000000020 PGD 0 P4D 0 Oops: Oops: 0000 [#1] PREEMPT SMP PTI CPU: 11 PID: 19713 Comm: ethtool Tainted: G S 6.10.0-rc7+ #7 RIP: 0010:ice_get_q_coalesce+0x2e/0xa0 [ice] RSP: 0018:ffffbab1e9bcf6a8 EFLAGS: 00010206 RAX: 000000000000000c RBX: ffff94512305b028 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff9451c3f2e588 RDI: ffff9451c3f2e588 RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000 R10: ffff9451c3f2e580 R11: 000000000000001f R12: ffff945121fa9000 R13: ffffbab1e9bcf760 R14: 0000000000000013 R15: ffffffff9e65dd40 FS: 00007faee5fbe740(0000) GS:ffff94546fd80000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000020 CR3: 0000000106c2e005 CR4: 00000000001706f0 Call Trace: <TASK> ice_get_coalesce+0x17/0x30 [ice] coalesce_prepare_data+0x61/0x80 ethnl_default_doit+0xde/0x340 genl_family_rcv_msg_doit+0xf2/0x150 genl_rcv_msg+0x1b3/0x2c0 netlink_rcv_skb+0x5b/0x110 genl_rcv+0x28/0x40 netlink_unicast+0x19c/0x290 netlink_sendmsg+0x222/0x490 __sys_sendto+0x1df/0x1f0 __x64_sys_sendto+0x24/0x30 do_syscall_64+0x82/0x160 entry_SYSCALL_64_after_hwframe+0x76/0x7e RIP: 0033:0x7faee60d8e27
Calling netif_device_detach() before reset makes the net core not call the driver when ethtool command is issued, the attempt to execute an ethtool command during reset will result in the following message:
netlink error: No such device
instead of NULL pointer dereference. Once reset is done and ice_rebuild() is executing, the netif_device_attach() is called to allow for ethtool operations to occur again in a safe manner.
Fixes: fcea6f3da546 ("ice: Add stats and ethtool support") Suggested-by: Jakub Kicinski kuba@kernel.org Reviewed-by: Igor Bagnucki igor.bagnucki@intel.com Signed-off-by: Dawid Osuchowski dawid.osuchowski@linux.intel.com Tested-by: Pucha Himasekhar Reddy himasekharx.reddy.pucha@intel.com (A Contingent worker at Intel) Reviewed-by: Michal Schmidt mschmidt@redhat.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ice/ice_main.c | 7 +++++++ 1 file changed, 7 insertions(+)
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c index 9dbfbc90485e..15876f388d68 100644 --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -611,6 +611,9 @@ ice_prepare_for_reset(struct ice_pf *pf, enum ice_reset_req reset_type) memset(&vsi->mqprio_qopt, 0, sizeof(vsi->mqprio_qopt)); } } + + if (vsi->netdev) + netif_device_detach(vsi->netdev); skip:
/* clear SW filtering DB */ @@ -7140,6 +7143,7 @@ static void ice_update_pf_netdev_link(struct ice_pf *pf) */ static void ice_rebuild(struct ice_pf *pf, enum ice_reset_req reset_type) { + struct ice_vsi *vsi = ice_get_main_vsi(pf); struct device *dev = ice_pf_to_dev(pf); struct ice_hw *hw = &pf->hw; bool dvm; @@ -7292,6 +7296,9 @@ static void ice_rebuild(struct ice_pf *pf, enum ice_reset_req reset_type) ice_rebuild_arfs(pf); }
+ if (vsi && vsi->netdev) + netif_device_attach(vsi->netdev); + ice_update_pf_netdev_link(pf);
/* tell the firmware we are up */
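For illustration, a hedged sketch of the detach/attach bracket around a reset (function names hypothetical): while the netdev is detached, the ethtool/netlink core fails the request with -ENODEV instead of calling into the driver.

  #include <linux/netdevice.h>

  /* Illustrative only */
  static void example_prepare_reset(struct net_device *netdev)
  {
          netif_device_detach(netdev);    /* ethtool now reports "No such device" */
          /* ... tear down rings, free vectors, etc. ... */
  }

  static void example_finish_rebuild(struct net_device *netdev)
  {
          /* ... re-allocate resources ... */
          netif_device_attach(netdev);    /* ethtool callbacks are safe again */
  }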
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aleksandr Mishin amishin@t-argos.ru
[ Upstream commit ffc17e1479e8e9459b7afa80e5d9d40d0dd78abb ]
In case of an error in build_tokens_sysfs(), all the memory that has been allocated is freed at the end of that function. But then free_group() is called, which performs the deallocation again.
Also, instead of the free_group() call there should be exit_dell_smbios_smm() and exit_dell_smbios_wmi() calls, since those were initialized earlier but are never released in the error case.
Fix these issues by replacing the free_group() call with exit_dell_smbios_wmi() and exit_dell_smbios_smm().
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: 33b9ca1e53b4 ("platform/x86: dell-smbios: Add a sysfs interface for SMBIOS tokens") Signed-off-by: Aleksandr Mishin amishin@t-argos.ru Link: https://lore.kernel.org/r/20240830065428.9544-1-amishin@t-argos.ru Reviewed-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Ilpo Järvinen ilpo.jarvinen@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/platform/x86/dell/dell-smbios-base.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/platform/x86/dell/dell-smbios-base.c b/drivers/platform/x86/dell/dell-smbios-base.c index 86b95206cb1b..6fb538a13868 100644 --- a/drivers/platform/x86/dell/dell-smbios-base.c +++ b/drivers/platform/x86/dell/dell-smbios-base.c @@ -590,7 +590,10 @@ static int __init dell_smbios_init(void) return 0;
fail_sysfs: - free_group(platform_device); + if (!wmi) + exit_dell_smbios_wmi(); + if (!smm) + exit_dell_smbios_smm();
fail_create_group: platform_device_del(platform_device);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Corentin Labbe clabbe@baylibre.com
[ Upstream commit 27b9ecc7a9ba1d0014779bfe5a6dbf630899c6e7 ]
It works exactly like regulator_bulk_get(), but instead of operating on a provided list of names, it looks up all consumer properties matching xxx-supply.
Signed-off-by: Corentin Labbe clabbe@baylibre.com Link: https://lore.kernel.org/r/20221115073603.3425396-2-clabbe@baylibre.com Signed-off-by: Mark Brown broonie@kernel.org Stable-dep-of: 1a5caec7f80c ("regulator: core: Stub devm_regulator_bulk_get_const() if !CONFIG_REGULATOR") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/regulator/of_regulator.c | 92 ++++++++++++++++++++++++++++++ include/linux/regulator/consumer.h | 8 +++ 2 files changed, 100 insertions(+)
diff --git a/drivers/regulator/of_regulator.c b/drivers/regulator/of_regulator.c index cd726d4e8fbf..c3571781a8cc 100644 --- a/drivers/regulator/of_regulator.c +++ b/drivers/regulator/of_regulator.c @@ -701,3 +701,95 @@ struct regulator_dev *of_parse_coupled_regulator(struct regulator_dev *rdev,
return c_rdev; } + +/* + * Check if name is a supply name according to the '*-supply' pattern + * return 0 if false + * return length of supply name without the -supply + */ +static int is_supply_name(const char *name) +{ + int strs, i; + + strs = strlen(name); + /* string need to be at minimum len(x-supply) */ + if (strs < 8) + return 0; + for (i = strs - 6; i > 0; i--) { + /* find first '-' and check if right part is supply */ + if (name[i] != '-') + continue; + if (strcmp(name + i + 1, "supply") != 0) + return 0; + return i; + } + return 0; +} + +/* + * of_regulator_bulk_get_all - get multiple regulator consumers + * + * @dev: Device to supply + * @np: device node to search for consumers + * @consumers: Configuration of consumers; clients are stored here. + * + * @return number of regulators on success, an errno on failure. + * + * This helper function allows drivers to get several regulator + * consumers in one operation. If any of the regulators cannot be + * acquired then any regulators that were allocated will be freed + * before returning to the caller. + */ +int of_regulator_bulk_get_all(struct device *dev, struct device_node *np, + struct regulator_bulk_data **consumers) +{ + int num_consumers = 0; + struct regulator *tmp; + struct property *prop; + int i, n = 0, ret; + char name[64]; + + *consumers = NULL; + + /* + * first pass: get numbers of xxx-supply + * second pass: fill consumers + */ +restart: + for_each_property_of_node(np, prop) { + i = is_supply_name(prop->name); + if (i == 0) + continue; + if (!*consumers) { + num_consumers++; + continue; + } else { + memcpy(name, prop->name, i); + name[i] = '\0'; + tmp = regulator_get(dev, name); + if (!tmp) { + ret = -EINVAL; + goto error; + } + (*consumers)[n].consumer = tmp; + n++; + continue; + } + } + if (*consumers) + return num_consumers; + if (num_consumers == 0) + return 0; + *consumers = kmalloc_array(num_consumers, + sizeof(struct regulator_bulk_data), + GFP_KERNEL); + if (!*consumers) + return -ENOMEM; + goto restart; + +error: + while (--n >= 0) + regulator_put(consumers[n]->consumer); + return ret; +} +EXPORT_SYMBOL_GPL(of_regulator_bulk_get_all); diff --git a/include/linux/regulator/consumer.h b/include/linux/regulator/consumer.h index a9ca87a8f4e6..40c80c844ce5 100644 --- a/include/linux/regulator/consumer.h +++ b/include/linux/regulator/consumer.h @@ -244,6 +244,8 @@ int regulator_disable_deferred(struct regulator *regulator, int ms);
int __must_check regulator_bulk_get(struct device *dev, int num_consumers, struct regulator_bulk_data *consumers); +int __must_check of_regulator_bulk_get_all(struct device *dev, struct device_node *np, + struct regulator_bulk_data **consumers); int __must_check devm_regulator_bulk_get(struct device *dev, int num_consumers, struct regulator_bulk_data *consumers); void devm_regulator_bulk_put(struct regulator_bulk_data *consumers); @@ -479,6 +481,12 @@ static inline int devm_regulator_bulk_get(struct device *dev, int num_consumers, return 0; }
+static inline int of_regulator_bulk_get_all(struct device *dev, struct device_node *np, + struct regulator_bulk_data **consumers) +{ + return 0; +} + static inline int regulator_bulk_enable(int num_consumers, struct regulator_bulk_data *consumers) {
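For illustration, a hedged usage sketch of the new helper; the probe function is hypothetical and error-path cleanup of the returned array is omitted for brevity:

  #include <linux/platform_device.h>
  #include <linux/regulator/consumer.h>

  /* Illustrative only: grab every "*-supply" consumer of a DT node. */
  static int example_probe_supplies(struct platform_device *pdev)
  {
          struct regulator_bulk_data *supplies;
          int num;

          num = of_regulator_bulk_get_all(&pdev->dev, pdev->dev.of_node,
                                          &supplies);
          if (num < 0)
                  return num;     /* lookup or allocation failed */
          if (num == 0)
                  return 0;       /* node has no *-supply properties */

          /* supplies[0..num-1].consumer are now valid regulator handles */
          return regulator_bulk_enable(num, supplies);
  }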
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Douglas Anderson dianders@chromium.org
[ Upstream commit 1a5caec7f80ca2e659c03f45378ee26915f4eda2 ]
When adding devm_regulator_bulk_get_const() I missed adding a stub for when CONFIG_REGULATOR is not enabled. Under certain conditions (like randconfig testing) this can cause the compiler to report errors like:
error: implicit declaration of function 'devm_regulator_bulk_get_const'; did you mean 'devm_regulator_bulk_get_enable'?
Add the stub.
Fixes: 1de452a0edda ("regulator: core: Allow drivers to define their init data as const") Reported-by: kernel test robot lkp@intel.com Closes: https://lore.kernel.org/oe-kbuild-all/202408301813.TesFuSbh-lkp@intel.com/ Cc: Neil Armstrong neil.armstrong@linaro.org Signed-off-by: Douglas Anderson dianders@chromium.org Link: https://patch.msgid.link/20240830073511.1.Ib733229a8a19fad8179213c05e1af01b5... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/regulator/consumer.h | 8 ++++++++ 1 file changed, 8 insertions(+)
diff --git a/include/linux/regulator/consumer.h b/include/linux/regulator/consumer.h index 40c80c844ce5..60bc7e143869 100644 --- a/include/linux/regulator/consumer.h +++ b/include/linux/regulator/consumer.h @@ -487,6 +487,14 @@ static inline int of_regulator_bulk_get_all(struct device *dev, struct device_no return 0; }
+static inline int devm_regulator_bulk_get_const( + struct device *dev, int num_consumers, + const struct regulator_bulk_data *in_consumers, + struct regulator_bulk_data **out_consumers) +{ + return 0; +} + static inline int regulator_bulk_enable(int num_consumers, struct regulator_bulk_data *consumers) {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Carpenter dan.carpenter@linaro.org
[ Upstream commit ef4a99a0164e3972abb421cbb1b09ea6c61414df ]
Call rtnl_unlock() on this error path, before returning.
Fixes: bc23aa949aeb ("igc: Add pcie error handler support") Signed-off-by: Dan Carpenter dan.carpenter@linaro.org Reviewed-by: Gerhard Engleder gerhard@engleder-embedded.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/igc/igc_main.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c index 39f8f28288aa..6ae2d0b723c8 100644 --- a/drivers/net/ethernet/intel/igc/igc_main.c +++ b/drivers/net/ethernet/intel/igc/igc_main.c @@ -7063,6 +7063,7 @@ static void igc_io_resume(struct pci_dev *pdev) rtnl_lock(); if (netif_running(netdev)) { if (igc_open(netdev)) { + rtnl_unlock(); netdev_err(netdev, "igc_open failed after reset\n"); return; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Maciej Fijalkowski maciej.fijalkowski@intel.com
[ Upstream commit 60bc72b3c4e9127f29686770005da40b10be0576 ]
This should have been used there from day one; let us address that before introducing XDP multi-buffer support for the Rx side.
Signed-off-by: Maciej Fijalkowski maciej.fijalkowski@intel.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Reviewed-by: Alexander Lobakin alexandr.lobakin@intel.com Link: https://lore.kernel.org/bpf/20230131204506.219292-8-maciej.fijalkowski@intel... Stable-dep-of: 04c7e14e5b0b ("ice: do not bring the VSI up, if it was down before the XDP setup") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ice/ice_main.c | 28 +++++++++++------------ 1 file changed, 14 insertions(+), 14 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c index 15876f388d68..cd9bcc3536fb 100644 --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -2886,6 +2886,18 @@ int ice_vsi_determine_xdp_res(struct ice_vsi *vsi) return 0; }
+/** + * ice_max_xdp_frame_size - returns the maximum allowed frame size for XDP + * @vsi: Pointer to VSI structure + */ +static int ice_max_xdp_frame_size(struct ice_vsi *vsi) +{ + if (test_bit(ICE_FLAG_LEGACY_RX, vsi->back->flags)) + return ICE_RXBUF_1664; + else + return ICE_RXBUF_3072; +} + /** * ice_xdp_setup_prog - Add or remove XDP eBPF program * @vsi: VSI to setup XDP for @@ -2896,11 +2908,11 @@ static int ice_xdp_setup_prog(struct ice_vsi *vsi, struct bpf_prog *prog, struct netlink_ext_ack *extack) { - int frame_size = vsi->netdev->mtu + ICE_ETH_PKT_HDR_PAD; + unsigned int frame_size = vsi->netdev->mtu + ICE_ETH_PKT_HDR_PAD; bool if_running = netif_running(vsi->netdev); int ret = 0, xdp_ring_err = 0;
- if (frame_size > vsi->rx_buf_len) { + if (frame_size > ice_max_xdp_frame_size(vsi)) { NL_SET_ERR_MSG_MOD(extack, "MTU too large for loading XDP"); return -EOPNOTSUPP; } @@ -7329,18 +7341,6 @@ static void ice_rebuild(struct ice_pf *pf, enum ice_reset_req reset_type) dev_err(dev, "Rebuild failed, unload and reload driver\n"); }
-/** - * ice_max_xdp_frame_size - returns the maximum allowed frame size for XDP - * @vsi: Pointer to VSI structure - */ -static int ice_max_xdp_frame_size(struct ice_vsi *vsi) -{ - if (test_bit(ICE_FLAG_LEGACY_RX, vsi->back->flags)) - return ICE_RXBUF_1664; - else - return ICE_RXBUF_3072; -} - /** * ice_change_mtu - NDO callback to change the MTU * @netdev: network interface device structure
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Maciej Fijalkowski maciej.fijalkowski@intel.com
[ Upstream commit 469748429ac81f0a6a344637fc9d3b1d16a9f3d8 ]
Currently ice driver's .ndo_bpf callback brings interface down and up independently of XDP resources' presence. This is only needed when either these resources have to be configured or removed. It means that if one is switching XDP programs on-the-fly with running traffic, packets will be dropped.
To avoid this, compare the incoming bpf_prog pointer against the bpf_prog pointer already assigned to the VSI early in ice_xdp_setup_prog(). Do the swap when the VSI's bpf_prog and the incoming one are both non-NULL.
Lastly, while at it, put the old bpf_prog *after* the update of the Rx ring's bpf_prog pointer. In theory, the previous code could expose us to a state where the Rx ring's bpf_prog still referred to an old_prog that had already been released by the earlier bpf_prog_put().
Signed-off-by: Maciej Fijalkowski maciej.fijalkowski@intel.com Acked-by: Toke Høiland-Jørgensen toke@redhat.com Reviewed-by: Alexander Lobakin aleksander.lobakin@intel.com Tested-by: Chandan Kumar Rout chandanx.rout@intel.com (A Contingent Worker at Intel) Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Stable-dep-of: 04c7e14e5b0b ("ice: do not bring the VSI up, if it was down before the XDP setup") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ice/ice_main.c | 19 +++++++++---------- 1 file changed, 9 insertions(+), 10 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c index cd9bcc3536fb..1973b032fe05 100644 --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -2633,11 +2633,11 @@ static void ice_vsi_assign_bpf_prog(struct ice_vsi *vsi, struct bpf_prog *prog) int i;
old_prog = xchg(&vsi->xdp_prog, prog); - if (old_prog) - bpf_prog_put(old_prog); - ice_for_each_rxq(vsi, i) WRITE_ONCE(vsi->rx_rings[i]->xdp_prog, vsi->xdp_prog); + + if (old_prog) + bpf_prog_put(old_prog); }
/** @@ -2917,6 +2917,12 @@ ice_xdp_setup_prog(struct ice_vsi *vsi, struct bpf_prog *prog, return -EOPNOTSUPP; }
+ /* hot swap progs and avoid toggling link */ + if (ice_is_xdp_ena_vsi(vsi) == !!prog) { + ice_vsi_assign_bpf_prog(vsi, prog); + return 0; + } + /* need to stop netdev while setting up the program for Rx rings */ if (if_running && !test_and_set_bit(ICE_VSI_DOWN, vsi->state)) { ret = ice_down(vsi); @@ -2947,13 +2953,6 @@ ice_xdp_setup_prog(struct ice_vsi *vsi, struct bpf_prog *prog, xdp_ring_err = ice_realloc_zc_buf(vsi, false); if (xdp_ring_err) NL_SET_ERR_MSG_MOD(extack, "Freeing XDP Rx resources failed"); - } else { - /* safe to call even when prog == vsi->xdp_prog as - * dev_xdp_install in net/core/dev.c incremented prog's - * refcount so corresponding bpf_prog_put won't cause - * underflow - */ - ice_vsi_assign_bpf_prog(vsi, prog); }
if (if_running)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Larysa Zaremba larysa.zaremba@intel.com
[ Upstream commit 04c7e14e5b0b6227e7b00d7a96ca2f2426ab9171 ]
After XDP configuration is completed, we bring the interface up unconditionally, regardless of its state before the call to .ndo_bpf().
Preserve the information whether the interface had to be brought down and later bring it up only in such case.
Fixes: efc2214b6047 ("ice: Add support for XDP") Reviewed-by: Wojciech Drewek wojciech.drewek@intel.com Reviewed-by: Jacob Keller jacob.e.keller@intel.com Tested-by: Chandan Kumar Rout chandanx.rout@intel.com Acked-by: Maciej Fijalkowski maciej.fijalkowski@intel.com Signed-off-by: Larysa Zaremba larysa.zaremba@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ice/ice_main.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c index 1973b032fe05..3f01942e4982 100644 --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -2909,8 +2909,8 @@ ice_xdp_setup_prog(struct ice_vsi *vsi, struct bpf_prog *prog, struct netlink_ext_ack *extack) { unsigned int frame_size = vsi->netdev->mtu + ICE_ETH_PKT_HDR_PAD; - bool if_running = netif_running(vsi->netdev); int ret = 0, xdp_ring_err = 0; + bool if_running;
if (frame_size > ice_max_xdp_frame_size(vsi)) { NL_SET_ERR_MSG_MOD(extack, "MTU too large for loading XDP"); @@ -2923,8 +2923,11 @@ ice_xdp_setup_prog(struct ice_vsi *vsi, struct bpf_prog *prog, return 0; }
+ if_running = netif_running(vsi->netdev) && + !test_and_set_bit(ICE_VSI_DOWN, vsi->state); + /* need to stop netdev while setting up the program for Rx rings */ - if (if_running && !test_and_set_bit(ICE_VSI_DOWN, vsi->state)) { + if (if_running) { ret = ice_down(vsi); if (ret) { NL_SET_ERR_MSG_MOD(extack, "Preparing device for XDP attach failed");
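For illustration, the combined effect of this and the two preceding ice patches on the down/up handling can be sketched as below; example_down()/example_up() and the state bit are hypothetical stand-ins for the driver's real paths:

  #include <linux/netdevice.h>

  /* Illustrative only */
  static void example_down(struct net_device *netdev) { }
  static int example_up(struct net_device *netdev) { return 0; }

  static int example_reconfigure_xdp(struct net_device *netdev, unsigned long *state)
  {
          bool if_running;
          int err = 0;

          /* Treat the interface as "running" only if it was up and we are
           * the ones who just marked it down; if it was already down, it
           * must stay down after the reconfiguration. */
          if_running = netif_running(netdev) && !test_and_set_bit(0, state);

          if (if_running)
                  example_down(netdev);

          /* ... swap XDP resources here ... */

          if (if_running)
                  err = example_up(netdev);

          return err;
  }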
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Oliver Neukum oneukum@suse.com
[ Upstream commit bab8eb0dd4cb995caa4a0529d5655531c2ec5e8e ]
The driver generates a random MAC once on load and uses it over and over, including on two devices needing a random MAC at the same time.
Jakub suggested revamping the driver to the modern API for setting a random MAC rather than fixing the old stuff.
The bug is as old as the driver.
Signed-off-by: Oliver Neukum oneukum@suse.com Reviewed-by: Simon Horman horms@kernel.org Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2") Link: https://patch.msgid.link/20240829175201.670718-1-oneukum@suse.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/usb/usbnet.c | 11 +++-------- 1 file changed, 3 insertions(+), 8 deletions(-)
diff --git a/drivers/net/usb/usbnet.c b/drivers/net/usb/usbnet.c index 405e588f8a3a..0f848d318544 100644 --- a/drivers/net/usb/usbnet.c +++ b/drivers/net/usb/usbnet.c @@ -61,9 +61,6 @@
/*-------------------------------------------------------------------------*/
-// randomly generated ethernet address -static u8 node_id [ETH_ALEN]; - /* use ethtool to change the level for any given device */ static int msg_level = -1; module_param (msg_level, int, 0); @@ -1726,7 +1723,6 @@ usbnet_probe (struct usb_interface *udev, const struct usb_device_id *prod)
dev->net = net; strscpy(net->name, "usb%d", sizeof(net->name)); - eth_hw_addr_set(net, node_id);
/* rx and tx sides can use different message sizes; * bind() should set rx_urb_size in that case. @@ -1800,9 +1796,9 @@ usbnet_probe (struct usb_interface *udev, const struct usb_device_id *prod) goto out4; }
- /* let userspace know we have a random address */ - if (ether_addr_equal(net->dev_addr, node_id)) - net->addr_assign_type = NET_ADDR_RANDOM; + /* this flags the device for user space */ + if (!is_valid_ether_addr(net->dev_addr)) + eth_hw_addr_random(net);
if ((dev->driver_info->flags & FLAG_WLAN) != 0) SET_NETDEV_DEVTYPE(net, &wlan_type); @@ -2212,7 +2208,6 @@ static int __init usbnet_init(void) BUILD_BUG_ON( sizeof_field(struct sk_buff, cb) < sizeof(struct skb_data));
- eth_random_addr(node_id); return 0; } module_init(usbnet_init);
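For illustration, a hedged sketch of the per-device pattern the driver moves to; the wrapper is hypothetical, only is_valid_ether_addr() and eth_hw_addr_random() from <linux/etherdevice.h> are assumed:

  #include <linux/etherdevice.h>

  /* Illustrative only: give each device its own random address if bind()
   * did not provide a valid one; eth_hw_addr_random() also marks the
   * address NET_ADDR_RANDOM for user space. */
  static void example_ensure_mac(struct net_device *net)
  {
          if (!is_valid_ether_addr(net->dev_addr))
                  eth_hw_addr_random(net);
  }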
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guillaume Nault gnault@redhat.com
[ Upstream commit 4963d2343af81f493519f9c3ea9f2169eaa7353a ]
Bareudp devices update their stats concurrently. Therefore they need proper atomic increments.
Fixes: 571912c69f0e ("net: UDP tunnel encapsulation module for tunnelling different protocols like MPLS, IP, NSH etc.") Signed-off-by: Guillaume Nault gnault@redhat.com Reviewed-by: Willem de Bruijn willemb@google.com Link: https://patch.msgid.link/04b7b9d0b480158eb3ab4366ec80aa2ab7e41fcb.1725031794... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/bareudp.c | 22 +++++++++++----------- 1 file changed, 11 insertions(+), 11 deletions(-)
diff --git a/drivers/net/bareudp.c b/drivers/net/bareudp.c index 683203f87ae2..277493e41b07 100644 --- a/drivers/net/bareudp.c +++ b/drivers/net/bareudp.c @@ -82,7 +82,7 @@ static int bareudp_udp_encap_recv(struct sock *sk, struct sk_buff *skb)
if (skb_copy_bits(skb, BAREUDP_BASE_HLEN, &ipversion, sizeof(ipversion))) { - bareudp->dev->stats.rx_dropped++; + DEV_STATS_INC(bareudp->dev, rx_dropped); goto drop; } ipversion >>= 4; @@ -92,7 +92,7 @@ static int bareudp_udp_encap_recv(struct sock *sk, struct sk_buff *skb) } else if (ipversion == 6 && bareudp->multi_proto_mode) { proto = htons(ETH_P_IPV6); } else { - bareudp->dev->stats.rx_dropped++; + DEV_STATS_INC(bareudp->dev, rx_dropped); goto drop; } } else if (bareudp->ethertype == htons(ETH_P_MPLS_UC)) { @@ -106,7 +106,7 @@ static int bareudp_udp_encap_recv(struct sock *sk, struct sk_buff *skb) ipv4_is_multicast(tunnel_hdr->daddr)) { proto = htons(ETH_P_MPLS_MC); } else { - bareudp->dev->stats.rx_dropped++; + DEV_STATS_INC(bareudp->dev, rx_dropped); goto drop; } } else { @@ -122,7 +122,7 @@ static int bareudp_udp_encap_recv(struct sock *sk, struct sk_buff *skb) (addr_type & IPV6_ADDR_MULTICAST)) { proto = htons(ETH_P_MPLS_MC); } else { - bareudp->dev->stats.rx_dropped++; + DEV_STATS_INC(bareudp->dev, rx_dropped); goto drop; } } @@ -134,12 +134,12 @@ static int bareudp_udp_encap_recv(struct sock *sk, struct sk_buff *skb) proto, !net_eq(bareudp->net, dev_net(bareudp->dev)))) { - bareudp->dev->stats.rx_dropped++; + DEV_STATS_INC(bareudp->dev, rx_dropped); goto drop; } tun_dst = udp_tun_rx_dst(skb, family, TUNNEL_KEY, 0, 0); if (!tun_dst) { - bareudp->dev->stats.rx_dropped++; + DEV_STATS_INC(bareudp->dev, rx_dropped); goto drop; } skb_dst_set(skb, &tun_dst->dst); @@ -165,8 +165,8 @@ static int bareudp_udp_encap_recv(struct sock *sk, struct sk_buff *skb) &((struct ipv6hdr *)oiph)->saddr); } if (err > 1) { - ++bareudp->dev->stats.rx_frame_errors; - ++bareudp->dev->stats.rx_errors; + DEV_STATS_INC(bareudp->dev, rx_frame_errors); + DEV_STATS_INC(bareudp->dev, rx_errors); goto drop; } } @@ -462,11 +462,11 @@ static netdev_tx_t bareudp_xmit(struct sk_buff *skb, struct net_device *dev) dev_kfree_skb(skb);
if (err == -ELOOP) - dev->stats.collisions++; + DEV_STATS_INC(dev, collisions); else if (err == -ENETUNREACH) - dev->stats.tx_carrier_errors++; + DEV_STATS_INC(dev, tx_carrier_errors);
- dev->stats.tx_errors++; + DEV_STATS_INC(dev, tx_errors); return NETDEV_TX_OK; }
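For illustration, a hedged sketch of the difference (field choice illustrative); DEV_STATS_INC() does an atomic_long increment on the embedded net_device stats field, so concurrent updaters cannot lose counts the way a plain ++ can:

  #include <linux/netdevice.h>

  /* Illustrative only */
  static void example_count_drop(struct net_device *dev)
  {
          /* Racy: dev->stats.rx_dropped++ is a plain read-modify-write. */

          /* Safe for concurrent updaters: */
          DEV_STATS_INC(dev, rx_dropped);
  }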
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima kuniyu@amazon.com
[ Upstream commit 7e4196935069947d8b70b09c1660b67b067e75cb ]
We observed a null-ptr-deref in fou_gro_receive() while shutting down a host. [0]
The NULL pointer is sk->sk_user_data, and the offset 8 is of protocol in struct fou.
When fou_release() is called due to netns dismantle or explicit tunnel teardown, udp_tunnel_sock_release() sets NULL to sk->sk_user_data. Then, the tunnel socket is destroyed after a single RCU grace period.
So, in-flight udp4_gro_receive() could find the socket and execute the FOU GRO handler, where sk->sk_user_data could be NULL.
Let's use rcu_dereference_sk_user_data() in fou_from_sock() and add NULL checks in FOU GRO handlers.
[0]: BUG: kernel NULL pointer dereference, address: 0000000000000008 PF: supervisor read access in kernel mode PF: error_code(0x0000) - not-present page PGD 80000001032f4067 P4D 80000001032f4067 PUD 103240067 PMD 0 SMP PTI CPU: 0 PID: 0 Comm: swapper/0 Not tainted 5.10.216-204.855.amzn2.x86_64 #1 Hardware name: Amazon EC2 c5.large/, BIOS 1.0 10/16/2017 RIP: 0010:fou_gro_receive (net/ipv4/fou.c:233) [fou] Code: 41 5f c3 cc cc cc cc e8 e7 2e 69 f4 0f 1f 80 00 00 00 00 0f 1f 44 00 00 49 89 f8 41 54 48 89 f7 48 89 d6 49 8b 80 88 02 00 00 <0f> b6 48 08 0f b7 42 4a 66 25 fd fd 80 cc 02 66 89 42 4a 0f b6 42 RSP: 0018:ffffa330c0003d08 EFLAGS: 00010297 RAX: 0000000000000000 RBX: ffff93d9e3a6b900 RCX: 0000000000000010 RDX: ffff93d9e3a6b900 RSI: ffff93d9e3a6b900 RDI: ffff93dac2e24d08 RBP: ffff93d9e3a6b900 R08: ffff93dacbce6400 R09: 0000000000000002 R10: 0000000000000000 R11: ffffffffb5f369b0 R12: ffff93dacbce6400 R13: ffff93dac2e24d08 R14: 0000000000000000 R15: ffffffffb4edd1c0 FS: 0000000000000000(0000) GS:ffff93daee800000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000000000000008 CR3: 0000000102140001 CR4: 00000000007706f0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 PKRU: 55555554 Call Trace: <IRQ> ? show_trace_log_lvl (arch/x86/kernel/dumpstack.c:259) ? __die_body.cold (arch/x86/kernel/dumpstack.c:478 arch/x86/kernel/dumpstack.c:420) ? no_context (arch/x86/mm/fault.c:752) ? exc_page_fault (arch/x86/include/asm/irqflags.h:49 arch/x86/include/asm/irqflags.h:89 arch/x86/mm/fault.c:1435 arch/x86/mm/fault.c:1483) ? asm_exc_page_fault (arch/x86/include/asm/idtentry.h:571) ? fou_gro_receive (net/ipv4/fou.c:233) [fou] udp_gro_receive (include/linux/netdevice.h:2552 net/ipv4/udp_offload.c:559) udp4_gro_receive (net/ipv4/udp_offload.c:604) inet_gro_receive (net/ipv4/af_inet.c:1549 (discriminator 7)) dev_gro_receive (net/core/dev.c:6035 (discriminator 4)) napi_gro_receive (net/core/dev.c:6170) ena_clean_rx_irq (drivers/amazon/net/ena/ena_netdev.c:1558) [ena] ena_io_poll (drivers/amazon/net/ena/ena_netdev.c:1742) [ena] napi_poll (net/core/dev.c:6847) net_rx_action (net/core/dev.c:6917) __do_softirq (arch/x86/include/asm/jump_label.h:25 include/linux/jump_label.h:200 include/trace/events/irq.h:142 kernel/softirq.c:299) asm_call_irq_on_stack (arch/x86/entry/entry_64.S:809) </IRQ> do_softirq_own_stack (arch/x86/include/asm/irq_stack.h:27 arch/x86/include/asm/irq_stack.h:77 arch/x86/kernel/irq_64.c:77) irq_exit_rcu (kernel/softirq.c:393 kernel/softirq.c:423 kernel/softirq.c:435) common_interrupt (arch/x86/kernel/irq.c:239) asm_common_interrupt (arch/x86/include/asm/idtentry.h:626) RIP: 0010:acpi_idle_do_entry (arch/x86/include/asm/irqflags.h:49 arch/x86/include/asm/irqflags.h:89 drivers/acpi/processor_idle.c:114 drivers/acpi/processor_idle.c:575) Code: 8b 15 d1 3c c4 02 ed c3 cc cc cc cc 65 48 8b 04 25 40 ef 01 00 48 8b 00 a8 08 75 eb 0f 1f 44 00 00 0f 00 2d d5 09 55 00 fb f4 <fa> c3 cc cc cc cc e9 be fc ff ff 66 66 2e 0f 1f 84 00 00 00 00 00 RSP: 0018:ffffffffb5603e58 EFLAGS: 00000246 RAX: 0000000000004000 RBX: ffff93dac0929c00 RCX: ffff93daee833900 RDX: ffff93daee800000 RSI: ffff93daee87dc00 RDI: ffff93daee87dc64 RBP: 0000000000000001 R08: ffffffffb5e7b6c0 R09: 0000000000000044 R10: ffff93daee831b04 R11: 00000000000001cd R12: 0000000000000001 R13: ffffffffb5e7b740 R14: 0000000000000001 R15: 0000000000000000 ? 
sched_clock_cpu (kernel/sched/clock.c:371) acpi_idle_enter (drivers/acpi/processor_idle.c:712 (discriminator 3)) cpuidle_enter_state (drivers/cpuidle/cpuidle.c:237) cpuidle_enter (drivers/cpuidle/cpuidle.c:353) cpuidle_idle_call (kernel/sched/idle.c:158 kernel/sched/idle.c:239) do_idle (kernel/sched/idle.c:302) cpu_startup_entry (kernel/sched/idle.c:395 (discriminator 1)) start_kernel (init/main.c:1048) secondary_startup_64_no_verify (arch/x86/kernel/head_64.S:310) Modules linked in: udp_diag tcp_diag inet_diag nft_nat ipip tunnel4 dummy fou ip_tunnel nft_masq nft_chain_nat nf_nat wireguard nft_ct curve25519_x86_64 libcurve25519_generic nf_conntrack libchacha20poly1305 nf_defrag_ipv6 nf_defrag_ipv4 nft_objref chacha_x86_64 nft_counter nf_tables nfnetlink poly1305_x86_64 ip6_udp_tunnel udp_tunnel libchacha crc32_pclmul ghash_clmulni_intel aesni_intel crypto_simd cryptd glue_helper mousedev psmouse button ena ptp pps_core crc32c_intel CR2: 0000000000000008
Fixes: d92283e338f6 ("fou: change to use UDP socket GRO") Reported-by: Alphonse Kurian alkurian@amazon.com Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Link: https://patch.msgid.link/20240902173927.62706-1-kuniyu@amazon.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/fou.c | 29 ++++++++++++++++++++++++----- 1 file changed, 24 insertions(+), 5 deletions(-)
diff --git a/net/ipv4/fou.c b/net/ipv4/fou.c index 358bff068eef..7bcc933103e2 100644 --- a/net/ipv4/fou.c +++ b/net/ipv4/fou.c @@ -48,7 +48,7 @@ struct fou_net {
static inline struct fou *fou_from_sock(struct sock *sk) { - return sk->sk_user_data; + return rcu_dereference_sk_user_data(sk); }
static int fou_recv_pull(struct sk_buff *skb, struct fou *fou, size_t len) @@ -231,9 +231,15 @@ static struct sk_buff *fou_gro_receive(struct sock *sk, struct sk_buff *skb) { const struct net_offload __rcu **offloads; - u8 proto = fou_from_sock(sk)->protocol; + struct fou *fou = fou_from_sock(sk); const struct net_offload *ops; struct sk_buff *pp = NULL; + u8 proto; + + if (!fou) + goto out; + + proto = fou->protocol;
/* We can clear the encap_mark for FOU as we are essentially doing * one of two possible things. We are either adding an L4 tunnel @@ -261,14 +267,24 @@ static int fou_gro_complete(struct sock *sk, struct sk_buff *skb, int nhoff) { const struct net_offload __rcu **offloads; - u8 proto = fou_from_sock(sk)->protocol; + struct fou *fou = fou_from_sock(sk); const struct net_offload *ops; - int err = -ENOSYS; + u8 proto; + int err; + + if (!fou) { + err = -ENOENT; + goto out; + } + + proto = fou->protocol;
offloads = NAPI_GRO_CB(skb)->is_ipv6 ? inet6_offloads : inet_offloads; ops = rcu_dereference(offloads[proto]); - if (WARN_ON(!ops || !ops->callbacks.gro_complete)) + if (WARN_ON(!ops || !ops->callbacks.gro_complete)) { + err = -ENOSYS; goto out; + }
err = ops->callbacks.gro_complete(skb, nhoff);
@@ -318,6 +334,9 @@ static struct sk_buff *gue_gro_receive(struct sock *sk, struct gro_remcsum grc; u8 proto;
+ if (!fou) + goto out; + skb_gro_remcsum_init(&grc);
off = skb_gro_offset(skb);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jonas Gorski jonas.gorski@bisdn.de
[ Upstream commit bee2ef946d3184e99077be526567d791c473036f ]
When userspace wants to take over a fdb entry by setting it as EXTERN_LEARNED, we set both flags BR_FDB_ADDED_BY_EXT_LEARN and BR_FDB_ADDED_BY_USER in br_fdb_external_learn_add().
If the bridge updates the entry later because its port changed, we clear the BR_FDB_ADDED_BY_EXT_LEARN flag, but leave the BR_FDB_ADDED_BY_USER flag set.
If userspace then wants to take over the entry again, br_fdb_external_learn_add() sees that BR_FDB_ADDED_BY_USER is set and skips setting the BR_FDB_ADDED_BY_EXT_LEARN flag, thus silently ignoring the update.
Fix this by always allowing to set BR_FDB_ADDED_BY_EXT_LEARN regardless if this was a user fdb entry or not.
Fixes: 710ae7287737 ("net: bridge: Mark FDB entries that were added by user as such") Signed-off-by: Jonas Gorski jonas.gorski@bisdn.de Acked-by: Nikolay Aleksandrov razor@blackwall.org Reviewed-by: Ido Schimmel idosch@nvidia.com Link: https://patch.msgid.link/20240903081958.29951-1-jonas.gorski@bisdn.de Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/bridge/br_fdb.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c index e7f4fccb6adb..882b6a67e11f 100644 --- a/net/bridge/br_fdb.c +++ b/net/bridge/br_fdb.c @@ -1388,12 +1388,10 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, modified = true; }
- if (test_bit(BR_FDB_ADDED_BY_EXT_LEARN, &fdb->flags)) { + if (test_and_set_bit(BR_FDB_ADDED_BY_EXT_LEARN, &fdb->flags)) { /* Refresh entry */ fdb->used = jiffies; - } else if (!test_bit(BR_FDB_ADDED_BY_USER, &fdb->flags)) { - /* Take over SW learned entry */ - set_bit(BR_FDB_ADDED_BY_EXT_LEARN, &fdb->flags); + } else { modified = true; }
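For illustration, the reworked branch relies on test_and_set_bit() returning the previous bit value, so a single call both takes over the entry and reports whether it was already externally learned; a hedged sketch with an illustrative bit number:

  #include <linux/bitops.h>

  /* Illustrative only: returns true if we newly claimed the entry
   * (caller marks it modified), false if it was already ours (refresh). */
  static bool example_take_over(unsigned long *flags)
  {
          return !test_and_set_bit(0, flags);
  }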
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pawel Dembicki paweldembicki@gmail.com
[ Upstream commit 8e69c96df771ab469cec278edb47009351de4da6 ]
The CAPT block (CPU Capture Buffer) has 7 subblocks: 0-3, 4, 6, 7. The function 'vsc73xx_is_addr_valid' currently allows only block 0 to be used.
This patch fixes that.
Fixes: 05bd97fc559d ("net: dsa: Add Vitesse VSC73xx DSA router driver") Signed-off-by: Pawel Dembicki paweldembicki@gmail.com Reviewed-by: Florian Fainelli florian.fainelli@broadcom.com Link: https://patch.msgid.link/20240903203340.1518789-1-paweldembicki@gmail.com Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/dsa/vitesse-vsc73xx-core.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/net/dsa/vitesse-vsc73xx-core.c b/drivers/net/dsa/vitesse-vsc73xx-core.c index c8e9ca5d5c28..de444d201e0f 100644 --- a/drivers/net/dsa/vitesse-vsc73xx-core.c +++ b/drivers/net/dsa/vitesse-vsc73xx-core.c @@ -35,7 +35,7 @@ #define VSC73XX_BLOCK_ANALYZER 0x2 /* Only subblock 0 */ #define VSC73XX_BLOCK_MII 0x3 /* Subblocks 0 and 1 */ #define VSC73XX_BLOCK_MEMINIT 0x3 /* Only subblock 2 */ -#define VSC73XX_BLOCK_CAPTURE 0x4 /* Only subblock 2 */ +#define VSC73XX_BLOCK_CAPTURE 0x4 /* Subblocks 0-4, 6, 7 */ #define VSC73XX_BLOCK_ARBITER 0x5 /* Only subblock 0 */ #define VSC73XX_BLOCK_SYSTEM 0x7 /* Only subblock 0 */
@@ -371,13 +371,19 @@ int vsc73xx_is_addr_valid(u8 block, u8 subblock) break;
case VSC73XX_BLOCK_MII: - case VSC73XX_BLOCK_CAPTURE: case VSC73XX_BLOCK_ARBITER: switch (subblock) { case 0 ... 1: return 1; } break; + case VSC73XX_BLOCK_CAPTURE: + switch (subblock) { + case 0 ... 4: + case 6 ... 7: + return 1; + } + break; }
return 0;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Richard Fitzgerald rf@opensource.cirrus.com
[ Upstream commit 62412a9357b16a4e39dc582deb2e2a682b92524c ]
Add a check to cs_dsp_coeff_write_ctrl() to abort if the control is not writeable.
The cs_dsp code originated as an ASoC driver (wm_adsp) where all controls were exported as ALSA controls. It relied on ALSA to enforce the read-only permission. Now that the code has been separated from ALSA/ASoC it must perform its own permission check.
This isn't currently causing any problems so there shouldn't be any need to backport this. If the client of cs_dsp exposes the control as an ALSA control, it should set permissions on that ALSA control to protect it. The few uses of cs_dsp_coeff_write_ctrl() inside drivers are for writable controls.
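For illustration, a small userspace sketch of the permission check; the flag values and helper name below are illustrative, only the "has flags but not WRITEABLE" test mirrors the patch:

#include <errno.h>
#include <stdio.h>

#define WMFW_CTL_FLAG_WRITEABLE 0x0001   /* illustrative values only */
#define WMFW_CTL_FLAG_SYS       0x8000

static int check_write_allowed(unsigned int flags)
{
	/* Controls with no flags at all stay writeable (legacy firmware);
	 * only controls that declare flags without WRITEABLE are refused. */
	if (flags && !(flags & WMFW_CTL_FLAG_WRITEABLE))
		return -EPERM;
	return 0;
}

int main(void)
{
	printf("no flags:         %d\n", check_write_allowed(0));                     /* 0 */
	printf("read-only flags:  %d\n", check_write_allowed(WMFW_CTL_FLAG_SYS));     /* -EPERM */
	printf("writeable flags:  %d\n",
	       check_write_allowed(WMFW_CTL_FLAG_SYS | WMFW_CTL_FLAG_WRITEABLE));     /* 0 */
	return 0;
}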
Signed-off-by: Richard Fitzgerald rf@opensource.cirrus.com Link: https://patch.msgid.link/20240702110809.16836-1-rf@opensource.cirrus.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/firmware/cirrus/cs_dsp.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/firmware/cirrus/cs_dsp.c b/drivers/firmware/cirrus/cs_dsp.c index 68005cce0136..cf4f4da0cb87 100644 --- a/drivers/firmware/cirrus/cs_dsp.c +++ b/drivers/firmware/cirrus/cs_dsp.c @@ -764,6 +764,9 @@ int cs_dsp_coeff_write_ctrl(struct cs_dsp_coeff_ctl *ctl,
lockdep_assert_held(&ctl->dsp->pwr_lock);
+ if (ctl->flags && !(ctl->flags & WMFW_CTL_FLAG_WRITEABLE)) + return -EPERM; + if (len + off * sizeof(u32) > ctl->len) return -EINVAL;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Anderson sean.anderson@linux.dev
[ Upstream commit d79c6840917097285e03a49f709321f5fb972750 ]
Take the phy mutex in xlate to protect against concurrent modification/access to gtr_phy. This does not typically cause any issues, since in most systems the phys are only xlated once and thereafter accessed with the phy API (which takes the locks). However, we are about to allow userspace to access phys for debugging, so it's important to avoid any data races.
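For illustration, a kernel-context sketch (not a standalone program) contrasting the scope-based guard from <linux/cleanup.h> with open-coded locking; xpsgtr_set_lane_type() and the structures are taken from the patch context:

/* Open-coded equivalent: every return path must drop the lock. */
static int xlate_locked_open_coded(struct xpsgtr_phy *gtr_phy,
				   unsigned int phy_type,
				   unsigned int phy_instance)
{
	int ret;

	mutex_lock(&gtr_phy->phy->mutex);
	ret = xpsgtr_set_lane_type(gtr_phy, phy_type, phy_instance);
	mutex_unlock(&gtr_phy->phy->mutex);
	return ret;
}

/* With guard(): the lock is released automatically when the enclosing
 * scope is left, so error paths cannot leak the mutex. */
static int xlate_locked_guarded(struct xpsgtr_phy *gtr_phy,
				unsigned int phy_type,
				unsigned int phy_instance)
{
	guard(mutex)(&gtr_phy->phy->mutex);
	return xpsgtr_set_lane_type(gtr_phy, phy_type, phy_instance);
}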
Signed-off-by: Sean Anderson sean.anderson@linux.dev Link: https://lore.kernel.org/r/20240628205540.3098010-5-sean.anderson@linux.dev Signed-off-by: Vinod Koul vkoul@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/phy/xilinx/phy-zynqmp.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/phy/xilinx/phy-zynqmp.c b/drivers/phy/xilinx/phy-zynqmp.c index ac9a9124a36d..cc36fb7616ae 100644 --- a/drivers/phy/xilinx/phy-zynqmp.c +++ b/drivers/phy/xilinx/phy-zynqmp.c @@ -847,6 +847,7 @@ static struct phy *xpsgtr_xlate(struct device *dev, phy_type = args->args[1]; phy_instance = args->args[2];
+ guard(mutex)(>r_phy->phy->mutex); ret = xpsgtr_set_lane_type(gtr_phy, phy_type, phy_instance); if (ret < 0) { dev_err(gtr_dev->dev, "Invalid PHY type and/or instance\n");
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Amadeusz Sławiński amadeuszx.slawinski@linux.intel.com
[ Upstream commit 8ec2a2643544ce352f012ad3d248163199d05dfc ]
soc_tplg_denum_create_values() should properly set its values field.
Signed-off-by: Amadeusz Sławiński amadeuszx.slawinski@linux.intel.com Link: https://patch.msgid.link/20240627101850.2191513-4-amadeuszx.slawinski@linux.... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/soc-topology.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/sound/soc/soc-topology.c b/sound/soc/soc-topology.c index fcb8a36d4a06..77296c950376 100644 --- a/sound/soc/soc-topology.c +++ b/sound/soc/soc-topology.c @@ -889,6 +889,8 @@ static int soc_tplg_denum_create_values(struct soc_tplg *tplg, struct soc_enum * se->dobj.control.dvalues[i] = le32_to_cpu(ec->values[i]); }
+ se->items = le32_to_cpu(ec->items); + se->values = (const unsigned int *)se->dobj.control.dvalues; return 0; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Benjamin Marzinski bmarzins@redhat.com
[ Upstream commit 140ce37fd78a629105377e17842465258a5459ef ]
dm_parse_device_entry() simply copies the minor number into dmi.dev, but the dev_t format splits the minor number between the lowest 8 bits and the highest 12 bits. If the minor number is larger than 255, part of it will end up being treated as the major number.
Fix this by checking that the minor number is valid and then encoding it as a dev_t.
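For illustration, a small userspace sketch of the encoding, assuming the kernel's new_encode_dev()/huge_encode_dev() bit layout (minor in bits 0-7 and 20-31, major in bits 8-19); the helper names below are local simplifications:

#include <stdio.h>

#define MINORBITS 20

static unsigned long long huge_encode(unsigned int major, unsigned int minor)
{
	return (minor & 0xff) | (major << 8) | ((minor & ~0xffu) << 12);
}

static void decode(unsigned long long dev, unsigned int *major, unsigned int *minor)
{
	*major = (dev & 0xfff00) >> 8;
	*minor = (dev & 0xff) | ((dev >> 12) & 0xfff00);
}

int main(void)
{
	unsigned int major, minor;

	/* Storing the raw minor 256 makes bit 8 land in the major field. */
	decode(256, &major, &minor);
	printf("raw 256     -> major %u, minor %u\n", major, minor);    /* 1, 0   */

	/* Encoding it first keeps major 0 and minor 256 intact. */
	decode(huge_encode(0, 256), &major, &minor);
	printf("encoded 256 -> major %u, minor %u\n", major, minor);    /* 0, 256 */

	/* The added validity check: the minor must fit in MINORBITS. */
	printf("minor limit: %u\n", 1u << MINORBITS);                   /* 1048576 */
	return 0;
}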
Signed-off-by: Benjamin Marzinski bmarzins@redhat.com Signed-off-by: Mikulas Patocka mpatocka@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/md/dm-init.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/md/dm-init.c b/drivers/md/dm-init.c index dc4381d68313..6e9e73a55874 100644 --- a/drivers/md/dm-init.c +++ b/drivers/md/dm-init.c @@ -213,8 +213,10 @@ static char __init *dm_parse_device_entry(struct dm_device *dev, char *str) strscpy(dev->dmi.uuid, field[1], sizeof(dev->dmi.uuid)); /* minor */ if (strlen(field[2])) { - if (kstrtoull(field[2], 0, &dev->dmi.dev)) + if (kstrtoull(field[2], 0, &dev->dmi.dev) || + dev->dmi.dev >= (1 << MINORBITS)) return ERR_PTR(-EINVAL); + dev->dmi.dev = huge_encode_dev((dev_t)dev->dmi.dev); dev->dmi.flags |= DM_PERSISTENT_DEV_FLAG; } /* flags */
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jacob Pan jacob.jun.pan@linux.intel.com
[ Upstream commit b5e86a95541cea737394a1da967df4cd4d8f7182 ]
Queued invalidation wait descriptor status is volatile in that IOMMU hardware writes the data upon completion.
Use READ_ONCE() to prevent compiler optimizations and ensure the memory is read on every iteration. As a side effect, READ_ONCE() also enforces strict typing and may add an extra instruction. This should not have a negative performance impact, since we use cpu_relax() anyway, and the extra time from the added instruction may make it easier for the IOMMU HW to gain cacheline ownership.
e.g. gcc 12.3 BEFORE: 81 38 ad de 00 00 cmpl $0x2,(%rax)
AFTER (with READ_ONCE()) 772f: 8b 00 mov (%rax),%eax 7731: 3d ad de 00 00 cmp $0x2,%eax //status data is 32 bit
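For illustration, a minimal userspace approximation of the polling pattern; the READ_ONCE() macro below is the usual volatile-cast idiom, and QI_DONE/desc_status are stand-ins for the driver's descriptor status (values are illustrative):

#include <stdio.h>

#define READ_ONCE(x)  (*(const volatile __typeof__(x) *)&(x))
#define QI_DONE 2

static int desc_status[4];   /* written by "hardware" (another agent) */

static void wait_for_done(int wait_index)
{
	/*
	 * Without READ_ONCE() the compiler may hoist the load out of the
	 * loop and spin on a stale register value; the volatile access
	 * forces a fresh memory read on every iteration.
	 */
	while (READ_ONCE(desc_status[wait_index]) != QI_DONE)
		;   /* the real code uses cpu_relax() here */
}

int main(void)
{
	desc_status[1] = QI_DONE;   /* pretend the hardware completed it */
	wait_for_done(1);
	printf("descriptor 1 done\n");
	return 0;
}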
Signed-off-by: Jacob Pan jacob.jun.pan@linux.intel.com Reviewed-by: Kevin Tian kevin.tian@intel.com Reviewed-by: Yi Liu yi.l.liu@intel.com Link: https://lore.kernel.org/r/20240607173817.3914600-1-jacob.jun.pan@linux.intel... Signed-off-by: Lu Baolu baolu.lu@linux.intel.com Link: https://lore.kernel.org/r/20240702130839.108139-2-baolu.lu@linux.intel.com Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/intel/dmar.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c index 4759f79ad7b9..345c161ffb27 100644 --- a/drivers/iommu/intel/dmar.c +++ b/drivers/iommu/intel/dmar.c @@ -1402,7 +1402,7 @@ int qi_submit_sync(struct intel_iommu *iommu, struct qi_desc *desc, */ writel(qi->free_head << shift, iommu->reg + DMAR_IQT_REG);
- while (qi->desc_status[wait_index] != QI_DONE) { + while (READ_ONCE(qi->desc_status[wait_index]) != QI_DONE) { /* * We will leave the interrupts disabled, to prevent interrupt * context to queue another cmd while a cmd is already submitted
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Waiman Long longman@redhat.com
[ Upstream commit 57b56d16800e8961278ecff0dc755d46c4575092 ]
The writing of css->cgroup associated with the cgroup root in rebind_subsystems() is currently protected only by cgroup_mutex. However, the reading of css->cgroup in both proc_cpuset_show() and proc_cgroup_show() is protected just by css_set_lock. That makes the readers susceptible to racing problems like data tearing or caching. It is also a problem that can be reported by KCSAN.
This can be fixed by using READ_ONCE() and WRITE_ONCE() to access css->cgroup. Alternatively, the writing of css->cgroup can be moved under css_set_lock as well which is done by this patch.
Signed-off-by: Waiman Long longman@redhat.com Signed-off-by: Tejun Heo tj@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/cgroup/cgroup.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/cgroup/cgroup.c b/kernel/cgroup/cgroup.c index 455f67ff31b5..f6656fd410d0 100644 --- a/kernel/cgroup/cgroup.c +++ b/kernel/cgroup/cgroup.c @@ -1852,9 +1852,9 @@ int rebind_subsystems(struct cgroup_root *dst_root, u16 ss_mask) RCU_INIT_POINTER(scgrp->subsys[ssid], NULL); rcu_assign_pointer(dcgrp->subsys[ssid], css); ss->root = dst_root; - css->cgroup = dcgrp;
spin_lock_irq(&css_set_lock); + css->cgroup = dcgrp; WARN_ON(!list_empty(&dcgrp->e_csets[ss->id])); list_for_each_entry_safe(cset, cset_pos, &scgrp->e_csets[ss->id], e_cset_node[ss->id]) {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Johannes Berg johannes.berg@intel.com
[ Upstream commit 824ac4a5edd3f7494ab1996826c4f47f8ef0f63d ]
The pointer isn't initialized by callers, but I have encountered cases where it's still printed; initialize it in all possible cases in setup_one_line().
Link: https://patch.msgid.link/20240703172235.ad863568b55f.Iaa1eba4db8265d7715ba71... Acked-By: Anton Ivanov anton.ivanov@cambridgegreys.com Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/um/drivers/line.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/arch/um/drivers/line.c b/arch/um/drivers/line.c index 95ad6b190d1d..6b4faca401ea 100644 --- a/arch/um/drivers/line.c +++ b/arch/um/drivers/line.c @@ -383,6 +383,7 @@ int setup_one_line(struct line *lines, int n, char *init, parse_chan_pair(NULL, line, n, opts, error_out); err = 0; } + *error_out = "configured as 'none'"; } else { char *new = kstrdup(init, GFP_KERNEL); if (!new) { @@ -406,6 +407,7 @@ int setup_one_line(struct line *lines, int n, char *init, } } if (err) { + *error_out = "failed to parse channel pair"; line->init_str = NULL; line->valid = 0; kfree(new);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zijun Hu quic_zijuhu@quicinc.com
[ Upstream commit 56a20ad349b5c51909cf8810f7c79b288864ad33 ]
Initialize an uninitialized struct member for driver API devres_open_group().
Signed-off-by: Zijun Hu quic_zijuhu@quicinc.com Link: https://lore.kernel.org/r/1719931914-19035-4-git-send-email-quic_zijuhu@quic... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/base/devres.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/base/devres.c b/drivers/base/devres.c index f9add2ecdc55..35d1e2864696 100644 --- a/drivers/base/devres.c +++ b/drivers/base/devres.c @@ -564,6 +564,7 @@ void * devres_open_group(struct device *dev, void *id, gfp_t gfp) grp->id = grp; if (id) grp->id = id; + grp->color = 0;
spin_lock_irqsave(&dev->devres_lock, flags); add_dr(dev, &grp->node[0]);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krishna Kumar krishnak@linux.ibm.com
[ Upstream commit 335e35b748527f0c06ded9eebb65387f60647fda ]
The hotplug driver for powerpc (pci/hotplug/pnv_php.c) causes a kernel crash when we try to hot-unplug/disable the PCIe switch/bridge from the PHB.
The crash occurs because, although the MSI data structure has already been released during the disable/hot-unplug path and set to NULL, the code still tried to explicitly disable MSI during unregistration, which causes the NULL pointer dereference and kernel crash.
The patch fixes the check in the unregistration path so that pci_disable_msi/msix() is not invoked once the MSI data structure has already been freed.
Reported-by: Timothy Pearson tpearson@raptorengineering.com Closes: https://lore.kernel.org/all/1981605666.2142272.1703742465927.JavaMail.zimbra... Acked-by: Bjorn Helgaas bhelgaas@google.com Tested-by: Shawn Anastasio sanastasio@raptorengineering.com Signed-off-by: Krishna Kumar krishnak@linux.ibm.com Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://msgid.link/20240701074513.94873-2-krishnak@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/hotplug/pnv_php.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/pci/hotplug/pnv_php.c b/drivers/pci/hotplug/pnv_php.c index 881d420637bf..092c9ac0d26d 100644 --- a/drivers/pci/hotplug/pnv_php.c +++ b/drivers/pci/hotplug/pnv_php.c @@ -39,7 +39,6 @@ static void pnv_php_disable_irq(struct pnv_php_slot *php_slot, bool disable_device) { struct pci_dev *pdev = php_slot->pdev; - int irq = php_slot->irq; u16 ctrl;
if (php_slot->irq > 0) { @@ -58,7 +57,7 @@ static void pnv_php_disable_irq(struct pnv_php_slot *php_slot, php_slot->wq = NULL; }
- if (disable_device || irq > 0) { + if (disable_device) { if (pdev->msix_enabled) pci_disable_msix(pdev); else if (pdev->msi_enabled)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hareshx Sankar Raj hareshx.sankar.raj@intel.com
[ Upstream commit f0622894c59458fceb33c4197462bc2006f3fc6b ]
The logic that detects pending VF2PF interrupts unintentionally clears the section of the error mask register(s) not related to VF2PF. This might cause interrupts unrelated to VF2PF, reported through errsou3 and errsou5, to be reported again after the execution of the function disable_pending_vf2pf_interrupts() in dh895xcc and GEN2 devices.
Fix this by updating only the section of errmsk3 and errmsk5 related to VF2PF.
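For illustration, a small userspace sketch of the read-modify-write fix; the field shift, masks and register values below are illustrative, only the bit arithmetic mirrors the patch:

#include <stdio.h>

#define VF2PF_FIELD(vf_mask)  ((unsigned int)(vf_mask) << 9)   /* illustrative shift */
#define VF_MSK                0xffffu

int main(void)
{
	unsigned int errmsk3 = 0x80000007u | VF2PF_FIELD(0x00ffu); /* non-VF2PF bits + some VF2PF bits */
	unsigned int sources = 0x0f00u, disabled = 0x0003u;

	/* Old code: the AND wipes every bit outside the VF2PF field. */
	unsigned int old_val = errmsk3 & VF2PF_FIELD(sources | disabled);

	/* Fixed code: clear only the VF2PF field, then OR in the new value,
	 * leaving the unrelated error-mask bits untouched. */
	unsigned int new_val = (errmsk3 & ~VF2PF_FIELD(VF_MSK)) |
			       VF2PF_FIELD(sources | disabled);

	printf("old: 0x%08x (non-VF2PF bits lost)\n", old_val);
	printf("new: 0x%08x (non-VF2PF bits kept)\n", new_val);
	return 0;
}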
Signed-off-by: Hareshx Sankar Raj hareshx.sankar.raj@intel.com Reviewed-by: Damian Muszynski damian.muszynski@intel.com Signed-off-by: Giovanni Cabiddu giovanni.cabiddu@intel.com Signed-off-by: Herbert Xu herbert@gondor.apana.org.au Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/crypto/qat/qat_common/adf_gen2_pfvf.c | 4 +++- drivers/crypto/qat/qat_dh895xcc/adf_dh895xcc_hw_data.c | 8 ++++++-- 2 files changed, 9 insertions(+), 3 deletions(-)
diff --git a/drivers/crypto/qat/qat_common/adf_gen2_pfvf.c b/drivers/crypto/qat/qat_common/adf_gen2_pfvf.c index 70ef11963938..43af81fcab86 100644 --- a/drivers/crypto/qat/qat_common/adf_gen2_pfvf.c +++ b/drivers/crypto/qat/qat_common/adf_gen2_pfvf.c @@ -100,7 +100,9 @@ static u32 adf_gen2_disable_pending_vf2pf_interrupts(void __iomem *pmisc_addr) errmsk3 |= ADF_GEN2_ERR_MSK_VF2PF(ADF_GEN2_VF_MSK); ADF_CSR_WR(pmisc_addr, ADF_GEN2_ERRMSK3, errmsk3);
- errmsk3 &= ADF_GEN2_ERR_MSK_VF2PF(sources | disabled); + /* Update only section of errmsk3 related to VF2PF */ + errmsk3 &= ~ADF_GEN2_ERR_MSK_VF2PF(ADF_GEN2_VF_MSK); + errmsk3 |= ADF_GEN2_ERR_MSK_VF2PF(sources | disabled); ADF_CSR_WR(pmisc_addr, ADF_GEN2_ERRMSK3, errmsk3);
/* Return the sources of the (new) interrupt(s) */ diff --git a/drivers/crypto/qat/qat_dh895xcc/adf_dh895xcc_hw_data.c b/drivers/crypto/qat/qat_dh895xcc/adf_dh895xcc_hw_data.c index cb3bdd3618fb..85295c7ee0e0 100644 --- a/drivers/crypto/qat/qat_dh895xcc/adf_dh895xcc_hw_data.c +++ b/drivers/crypto/qat/qat_dh895xcc/adf_dh895xcc_hw_data.c @@ -180,8 +180,12 @@ static u32 disable_pending_vf2pf_interrupts(void __iomem *pmisc_addr) ADF_CSR_WR(pmisc_addr, ADF_GEN2_ERRMSK3, errmsk3); ADF_CSR_WR(pmisc_addr, ADF_GEN2_ERRMSK5, errmsk5);
- errmsk3 &= ADF_DH895XCC_ERR_MSK_VF2PF_L(sources | disabled); - errmsk5 &= ADF_DH895XCC_ERR_MSK_VF2PF_U(sources | disabled); + /* Update only section of errmsk3 and errmsk5 related to VF2PF */ + errmsk3 &= ~ADF_DH895XCC_ERR_MSK_VF2PF_L(ADF_DH895XCC_VF_MSK); + errmsk5 &= ~ADF_DH895XCC_ERR_MSK_VF2PF_U(ADF_DH895XCC_VF_MSK); + + errmsk3 |= ADF_DH895XCC_ERR_MSK_VF2PF_L(sources | disabled); + errmsk5 |= ADF_DH895XCC_ERR_MSK_VF2PF_U(sources | disabled); ADF_CSR_WR(pmisc_addr, ADF_GEN2_ERRMSK3, errmsk3); ADF_CSR_WR(pmisc_addr, ADF_GEN2_ERRMSK5, errmsk5);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guenter Roeck linux@roeck-us.net
[ Upstream commit 8cad724c8537fe3e0da8004646abc00290adae40 ]
DIV_ROUND_CLOSEST() after kstrtol() results in an underflow if a large negative number such as -9223372036854775808 is provided by the user. Fix it by reordering clamp_val() and DIV_ROUND_CLOSEST() operations.
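For illustration, a small userspace sketch of why the ordering matters, using the adc128d818 limit-register scale (10 mV LSB, 0..255); DIV_ROUND_CLOSEST() is reduced to a simplified signed helper here:

#include <limits.h>
#include <stdio.h>

static long div_round_closest(long x, long d)
{
	/* For x near LONG_MIN the subtraction below would overflow, which
	 * is why the fix clamps *before* dividing. */
	return (x >= 0) ? (x + d / 2) / d : (x - d / 2) / d;
}

static long clamp_val(long v, long lo, long hi)
{
	return v < lo ? lo : (v > hi ? hi : v);
}

int main(void)
{
	long val = LONG_MIN;   /* e.g. user writes -9223372036854775808 */

	/* Fixed order: clamp to the representable millivolt range first,
	 * then divide; no overflow is possible. */
	long regval = div_round_closest(clamp_val(val, 0, 2550), 10);

	printf("regval = %ld\n", regval);   /* 0 */
	return 0;
}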
Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/adc128d818.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/hwmon/adc128d818.c b/drivers/hwmon/adc128d818.c index 97b330b6c165..bad2d39d9733 100644 --- a/drivers/hwmon/adc128d818.c +++ b/drivers/hwmon/adc128d818.c @@ -176,7 +176,7 @@ static ssize_t adc128_in_store(struct device *dev,
mutex_lock(&data->update_lock); /* 10 mV LSB on limit registers */ - regval = clamp_val(DIV_ROUND_CLOSEST(val, 10), 0, 255); + regval = DIV_ROUND_CLOSEST(clamp_val(val, 0, 2550), 10); data->in[index][nr] = regval << 4; reg = index == 1 ? ADC128_REG_IN_MIN(nr) : ADC128_REG_IN_MAX(nr); i2c_smbus_write_byte_data(data->client, reg, regval); @@ -214,7 +214,7 @@ static ssize_t adc128_temp_store(struct device *dev, return err;
mutex_lock(&data->update_lock); - regval = clamp_val(DIV_ROUND_CLOSEST(val, 1000), -128, 127); + regval = DIV_ROUND_CLOSEST(clamp_val(val, -128000, 127000), 1000); data->temp[index] = regval << 1; i2c_smbus_write_byte_data(data->client, index == 1 ? ADC128_REG_TEMP_MAX
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guenter Roeck linux@roeck-us.net
[ Upstream commit af64e3e1537896337405f880c1e9ac1f8c0c6198 ]
DIV_ROUND_CLOSEST() after kstrtol() results in an underflow if a large negative number such as -9223372036854775808 is provided by the user. Fix it by reordering clamp_val() and DIV_ROUND_CLOSEST() operations.
Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/lm95234.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/drivers/hwmon/lm95234.c b/drivers/hwmon/lm95234.c index b4a9d0c223c4..db570fe84132 100644 --- a/drivers/hwmon/lm95234.c +++ b/drivers/hwmon/lm95234.c @@ -301,7 +301,8 @@ static ssize_t tcrit2_store(struct device *dev, struct device_attribute *attr, if (ret < 0) return ret;
- val = clamp_val(DIV_ROUND_CLOSEST(val, 1000), 0, index ? 255 : 127); + val = DIV_ROUND_CLOSEST(clamp_val(val, 0, (index ? 255 : 127) * 1000), + 1000);
mutex_lock(&data->update_lock); data->tcrit2[index] = val; @@ -350,7 +351,7 @@ static ssize_t tcrit1_store(struct device *dev, struct device_attribute *attr, if (ret < 0) return ret;
- val = clamp_val(DIV_ROUND_CLOSEST(val, 1000), 0, 255); + val = DIV_ROUND_CLOSEST(clamp_val(val, 0, 255000), 1000);
mutex_lock(&data->update_lock); data->tcrit1[index] = val; @@ -391,7 +392,7 @@ static ssize_t tcrit1_hyst_store(struct device *dev, if (ret < 0) return ret;
- val = DIV_ROUND_CLOSEST(val, 1000); + val = DIV_ROUND_CLOSEST(clamp_val(val, -255000, 255000), 1000); val = clamp_val((int)data->tcrit1[index] - val, 0, 31);
mutex_lock(&data->update_lock); @@ -431,7 +432,7 @@ static ssize_t offset_store(struct device *dev, struct device_attribute *attr, return ret;
/* Accuracy is 1/2 degrees C */ - val = clamp_val(DIV_ROUND_CLOSEST(val, 500), -128, 127); + val = DIV_ROUND_CLOSEST(clamp_val(val, -64000, 63500), 500);
mutex_lock(&data->update_lock); data->toffset[index] = val;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guenter Roeck linux@roeck-us.net
[ Upstream commit 0403e10bf0824bf0ec2bb135d4cf1c0cc3bf4bf0 ]
DIV_ROUND_CLOSEST() after kstrtol() results in an underflow if a large negative number such as -9223372036854775808 is provided by the user. Fix it by reordering clamp_val() and DIV_ROUND_CLOSEST() operations.
Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/nct6775-core.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hwmon/nct6775-core.c b/drivers/hwmon/nct6775-core.c index 9720ad214c20..83e424945b59 100644 --- a/drivers/hwmon/nct6775-core.c +++ b/drivers/hwmon/nct6775-core.c @@ -2171,7 +2171,7 @@ store_temp_offset(struct device *dev, struct device_attribute *attr, if (err < 0) return err;
- val = clamp_val(DIV_ROUND_CLOSEST(val, 1000), -128, 127); + val = DIV_ROUND_CLOSEST(clamp_val(val, -128000, 127000), 1000);
mutex_lock(&data->update_lock); data->temp_offset[nr] = val;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guenter Roeck linux@roeck-us.net
[ Upstream commit 5c1de37969b7bc0abcb20b86e91e70caebbd4f89 ]
DIV_ROUND_CLOSEST() after kstrtol() results in an underflow if a large negative number such as -9223372036854775808 is provided by the user. Fix it by reordering clamp_val() and DIV_ROUND_CLOSEST() operations.
Signed-off-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hwmon/w83627ehf.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/hwmon/w83627ehf.c b/drivers/hwmon/w83627ehf.c index 939d4c35e713..66d71aba4171 100644 --- a/drivers/hwmon/w83627ehf.c +++ b/drivers/hwmon/w83627ehf.c @@ -895,7 +895,7 @@ store_target_temp(struct device *dev, struct device_attribute *attr, if (err < 0) return err;
- val = clamp_val(DIV_ROUND_CLOSEST(val, 1000), 0, 127); + val = DIV_ROUND_CLOSEST(clamp_val(val, 0, 127000), 1000);
mutex_lock(&data->update_lock); data->target_temp[nr] = val; @@ -920,7 +920,7 @@ store_tolerance(struct device *dev, struct device_attribute *attr, return err;
/* Limit the temp to 0C - 15C */ - val = clamp_val(DIV_ROUND_CLOSEST(val, 1000), 0, 15); + val = DIV_ROUND_CLOSEST(clamp_val(val, 0, 15000), 1000);
mutex_lock(&data->update_lock); reg = w83627ehf_read_value(data, W83627EHF_REG_TOLERANCE[nr]);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andreas Ziegler ziegler.andreas@siemens.com
[ Upstream commit cedc12c5b57f7efa6dbebfb2b140e8675f5a2616 ]
In the current state, an erroneous call to bpf_object__find_map_by_name(NULL, ...) leads to a segmentation fault through the following call chain:
bpf_object__find_map_by_name(obj = NULL, ...)
 -> bpf_object__for_each_map(pos, obj = NULL)
  -> bpf_object__next_map((obj = NULL), NULL)
   -> return (obj = NULL)->maps
While calling bpf_object__find_map_by_name with obj = NULL is obviously incorrect, this should not lead to a segmentation fault but rather be handled gracefully.
As __bpf_map__iter already handles this situation correctly, we can delegate the check for the regular case there and only add a check in case the prev or next parameter is NULL.
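For illustration, a minimal caller-side sketch of the behaviour change (assumes libbpf headers and library are available; error handling omitted):

#include <stdio.h>
#include <bpf/libbpf.h>

int main(void)
{
	/* An obviously wrong call: obj is NULL. Before the fix this walked
	 * obj->maps and crashed; with the fix the iterator helpers bail out
	 * and the lookup simply returns NULL. */
	struct bpf_map *map = bpf_object__find_map_by_name(NULL, "some_map");

	printf("map = %p\n", (void *)map);   /* expected: (nil) */
	return 0;
}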
Signed-off-by: Andreas Ziegler ziegler.andreas@siemens.com Signed-off-by: Daniel Borkmann daniel@iogearbox.net Link: https://lore.kernel.org/bpf/20240703083436.505124-1-ziegler.andreas@siemens.... Signed-off-by: Sasha Levin sashal@kernel.org --- tools/lib/bpf/libbpf.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c index bb27dfd6b97a..878f05a42421 100644 --- a/tools/lib/bpf/libbpf.c +++ b/tools/lib/bpf/libbpf.c @@ -9364,7 +9364,7 @@ __bpf_map__iter(const struct bpf_map *m, const struct bpf_object *obj, int i) struct bpf_map * bpf_object__next_map(const struct bpf_object *obj, const struct bpf_map *prev) { - if (prev == NULL) + if (prev == NULL && obj != NULL) return obj->maps;
return __bpf_map__iter(prev, obj, 1); @@ -9373,7 +9373,7 @@ bpf_object__next_map(const struct bpf_object *obj, const struct bpf_map *prev) struct bpf_map * bpf_object__prev_map(const struct bpf_object *obj, const struct bpf_map *next) { - if (next == NULL) { + if (next == NULL && obj != NULL) { if (!obj->nr_maps) return NULL; return obj->maps + obj->nr_maps - 1;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yifan Zha Yifan.Zha@amd.com
[ Upstream commit 33f23fc3155b13c4a96d94a0a22dc26db767440b ]
[Why] If the VF requests full GPU access and the request fails, the VF driver can get stuck accessing registers for an extended period during the unload of KMS.
[How] Set the no_hw_access flag when the VF request for full GPU access fails. This prevents further hardware access attempts, avoiding the prolonged stuck state.
Signed-off-by: Yifan Zha Yifan.Zha@amd.com Acked-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c index af50e6ce39e1..d7b76a3d2d55 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_virt.c @@ -132,8 +132,10 @@ int amdgpu_virt_request_full_gpu(struct amdgpu_device *adev, bool init)
if (virt->ops && virt->ops->req_full_gpu) { r = virt->ops->req_full_gpu(adev, init); - if (r) + if (r) { + adev->no_hw_access = true; return r; + }
adev->virt.caps &= ~AMDGPU_SRIOV_CAPS_RUNTIME; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Luis Henriques (SUSE) luis.henriques@linux.dev
[ Upstream commit 63469662cc45d41705f14b4648481d5d29cf5999 ]
In the fast commit code there are a few places where tid_t variables are being compared without taking into account the fact that these sequence numbers may wrap. Fix this issue by using the helper functions tid_gt() and tid_geq().
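For illustration, a small userspace sketch of the wrap-safe comparison; tid_gt()/tid_geq() below follow the jbd2 signed-difference idiom, and the sample values are illustrative:

#include <stdio.h>

typedef unsigned int tid_t;

static int tid_gt(tid_t x, tid_t y)
{
	int difference = (int)(x - y);   /* wraps modulo 2^32, then signed */
	return difference > 0;
}

static int tid_geq(tid_t x, tid_t y)
{
	int difference = (int)(x - y);
	return difference >= 0;
}

int main(void)
{
	tid_t before_wrap = 0xfffffff0u;   /* an old transaction ID */
	tid_t after_wrap  = 0x00000010u;   /* a newer ID, past the wrap */

	/* Plain comparison gets the ordering wrong across the wrap... */
	printf("after > before (plain): %d\n", after_wrap > before_wrap);        /* 0 */
	/* ...while the signed-difference helpers still see it as newer. */
	printf("tid_gt(after, before):  %d\n", tid_gt(after_wrap, before_wrap)); /* 1 */
	printf("tid_geq(after, after):  %d\n", tid_geq(after_wrap, after_wrap)); /* 1 */
	return 0;
}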
Signed-off-by: Luis Henriques (SUSE) luis.henriques@linux.dev Reviewed-by: Jan Kara jack@suse.cz Reviewed-by: Harshad Shirwadkar harshadshirwadkar@gmail.com Link: https://patch.msgid.link/20240529092030.9557-3-luis.henriques@linux.dev Signed-off-by: Theodore Ts'o tytso@mit.edu Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ext4/fast_commit.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/ext4/fast_commit.c b/fs/ext4/fast_commit.c index 19353a2f44bb..2ef773d40ffd 100644 --- a/fs/ext4/fast_commit.c +++ b/fs/ext4/fast_commit.c @@ -353,7 +353,7 @@ void ext4_fc_mark_ineligible(struct super_block *sb, int reason, handle_t *handl read_unlock(&sbi->s_journal->j_state_lock); } spin_lock(&sbi->s_fc_lock); - if (sbi->s_fc_ineligible_tid < tid) + if (tid_gt(tid, sbi->s_fc_ineligible_tid)) sbi->s_fc_ineligible_tid = tid; spin_unlock(&sbi->s_fc_lock); WARN_ON(reason >= EXT4_FC_REASON_MAX); @@ -1235,7 +1235,7 @@ int ext4_fc_commit(journal_t *journal, tid_t commit_tid) if (ret == -EALREADY) { /* There was an ongoing commit, check if we need to restart */ if (atomic_read(&sbi->s_fc_subtid) <= subtid && - commit_tid > journal->j_commit_sequence) + tid_gt(commit_tid, journal->j_commit_sequence)) goto restart_fc; ext4_fc_update_stats(sb, EXT4_FC_STATUS_SKIPPED, 0, 0, commit_tid); @@ -1310,7 +1310,7 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid) list_del_init(&iter->i_fc_list); ext4_clear_inode_state(&iter->vfs_inode, EXT4_STATE_FC_COMMITTING); - if (iter->i_sync_tid <= tid) + if (tid_geq(tid, iter->i_sync_tid)) ext4_fc_reset_inode(&iter->vfs_inode); /* Make sure EXT4_STATE_FC_COMMITTING bit is clear */ smp_mb(); @@ -1341,7 +1341,7 @@ static void ext4_fc_cleanup(journal_t *journal, int full, tid_t tid) list_splice_init(&sbi->s_fc_q[FC_Q_STAGING], &sbi->s_fc_q[FC_Q_MAIN]);
- if (tid >= sbi->s_fc_ineligible_tid) { + if (tid_geq(tid, sbi->s_fc_ineligible_tid)) { sbi->s_fc_ineligible_tid = 0; ext4_clear_mount_flag(sb, EXT4_MF_FC_INELIGIBLE); }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yicong Yang yangyicong@hisilicon.com
[ Upstream commit 54624acf8843375a6de3717ac18df3b5104c39c5 ]
The test thread will start N benchmark kthreads and then schedule out until the test time finishes, at which point it notifies the benchmark kthreads to stop. The benchmark kthreads keep running until notified to stop. There's a problem with the current implementation when the number of benchmark kthreads equals the number of CPUs on a non-preemptible kernel: the scheduler balances the kthreads across all CPUs, so when the test time is up the test thread never gets a chance to be scheduled on any CPU and therefore cannot notify the benchmark kthreads to stop.
This can be easily reproduced on a VM (simulated with 16 CPUs) with PREEMPT_VOLUNTARY: estuary:/mnt$ ./dma_map_benchmark -t 16 -s 1 rcu: INFO: rcu_sched self-detected stall on CPU rcu: 10-...!: (5221 ticks this GP) idle=ed24/1/0x4000000000000000 softirq=142/142 fqs=0 rcu: (t=5254 jiffies g=-559 q=45 ncpus=16) rcu: rcu_sched kthread starved for 5255 jiffies! g-559 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x0 ->cpu=12 rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior. rcu: RCU grace-period kthread stack dump: task:rcu_sched state:R running task stack:0 pid:16 tgid:16 ppid:2 flags:0x00000008 Call trace __switch_to+0xec/0x138 __schedule+0x2f8/0x1080 schedule+0x30/0x130 schedule_timeout+0xa0/0x188 rcu_gp_fqs_loop+0x128/0x528 rcu_gp_kthread+0x1c8/0x208 kthread+0xec/0xf8 ret_from_fork+0x10/0x20 Sending NMI from CPU 10 to CPUs 0: NMI backtrace for cpu 0 CPU: 0 PID: 332 Comm: dma-map-benchma Not tainted 6.10.0-rc1-vanilla-LSE #8 Hardware name: QEMU KVM Virtual Machine, BIOS 0.0.0 02/06/2015 pstate: 20400005 (nzCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) pc : arm_smmu_cmdq_issue_cmdlist+0x218/0x730 lr : arm_smmu_cmdq_issue_cmdlist+0x488/0x730 sp : ffff80008748b630 x29: ffff80008748b630 x28: 0000000000000000 x27: ffff80008748b780 x26: 0000000000000000 x25: 000000000000bc70 x24: 000000000001bc70 x23: ffff0000c12af080 x22: 0000000000010000 x21: 000000000000ffff x20: ffff80008748b700 x19: ffff0000c12af0c0 x18: 0000000000010000 x17: 0000000000000001 x16: 0000000000000040 x15: ffffffffffffffff x14: 0001ffffffffffff x13: 000000000000ffff x12: 00000000000002f1 x11: 000000000001ffff x10: 0000000000000031 x9 : ffff800080b6b0b8 x8 : ffff0000c2a48000 x7 : 000000000001bc71 x6 : 0001800000000000 x5 : 00000000000002f1 x4 : 01ffffffffffffff x3 : 000000000009aaf1 x2 : 0000000000000018 x1 : 000000000000000f x0 : ffff0000c12af18c Call trace: arm_smmu_cmdq_issue_cmdlist+0x218/0x730 __arm_smmu_tlb_inv_range+0xe0/0x1a8 arm_smmu_iotlb_sync+0xc0/0x128 __iommu_dma_unmap+0x248/0x320 iommu_dma_unmap_page+0x5c/0xe8 dma_unmap_page_attrs+0x38/0x1d0 map_benchmark_thread+0x118/0x2c0 kthread+0xec/0xf8 ret_from_fork+0x10/0x20
Solve this by adding a scheduling point in the kthread loop, so that if there are other threads in the system they get a chance to run, especially the thread responsible for notifying that the test has ended. However, this may degrade the test concurrency, so it's recommended to run this benchmark on an idle system.
Signed-off-by: Yicong Yang yangyicong@hisilicon.com Acked-by: Barry Song baohua@kernel.org Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/dma/map_benchmark.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+)
diff --git a/kernel/dma/map_benchmark.c b/kernel/dma/map_benchmark.c index dafdc47ae5fc..3efdc5cfe390 100644 --- a/kernel/dma/map_benchmark.c +++ b/kernel/dma/map_benchmark.c @@ -89,6 +89,22 @@ static int map_benchmark_thread(void *data) atomic64_add(map_sq, &map->sum_sq_map); atomic64_add(unmap_sq, &map->sum_sq_unmap); atomic64_inc(&map->loops); + + /* + * We may test for a long time so periodically check whether + * we need to schedule to avoid starving the others. Otherwise + * we may hangup the kernel in a non-preemptible kernel when + * the test kthreads number >= CPU number, the test kthreads + * will run endless on every CPU since the thread resposible + * for notifying the kthread stop (in do_map_benchmark()) + * could not be scheduled. + * + * Note this may degrade the test concurrency since the test + * threads may need to share the CPU time with other load + * in the system. So it's recommended to run this benchmark + * on an idle system. + */ + cond_resched(); }
out:
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sascha Hauer s.hauer@pengutronix.de
[ Upstream commit c145eea2f75ff7949392aebecf7ef0a81c1f6c14 ]
mwifiex_get_priv_by_id() returns the priv pointer corresponding to the bss_num and bss_type, but without checking if the priv is actually currently in use. Unused priv pointers do not have a wiphy attached to them, which can lead to NULL pointer dereferences further down the callstack. Fix this by returning only used priv pointers which have priv->bss_mode set to something other than NL80211_IFTYPE_UNSPECIFIED.
Said NULL pointer dereference happened when an Accesspoint was started with wpa_supplicant -i mlan0 with this config:
network={ ssid="somessid" mode=2 frequency=2412 key_mgmt=WPA-PSK WPA-PSK-SHA256 proto=RSN group=CCMP pairwise=CCMP psk="12345678" }
When waiting for the AP to be established, interrupting wpa_supplicant with <ctrl-c> and starting it again this happens:
| Unable to handle kernel NULL pointer dereference at virtual address 0000000000000140 | Mem abort info: | ESR = 0x0000000096000004 | EC = 0x25: DABT (current EL), IL = 32 bits | SET = 0, FnV = 0 | EA = 0, S1PTW = 0 | FSC = 0x04: level 0 translation fault | Data abort info: | ISV = 0, ISS = 0x00000004, ISS2 = 0x00000000 | CM = 0, WnR = 0, TnD = 0, TagAccess = 0 | GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 | user pgtable: 4k pages, 48-bit VAs, pgdp=0000000046d96000 | [0000000000000140] pgd=0000000000000000, p4d=0000000000000000 | Internal error: Oops: 0000000096000004 [#1] PREEMPT SMP | Modules linked in: caam_jr caamhash_desc spidev caamalg_desc crypto_engine authenc libdes mwifiex_sdio +mwifiex crct10dif_ce cdc_acm onboard_usb_hub fsl_imx8_ddr_perf imx8m_ddrc rtc_ds1307 lm75 rtc_snvs +imx_sdma caam imx8mm_thermal spi_imx error imx_cpufreq_dt fuse ip_tables x_tables ipv6 | CPU: 0 PID: 8 Comm: kworker/0:1 Not tainted 6.9.0-00007-g937242013fce-dirty #18 | Hardware name: somemachine (DT) | Workqueue: events sdio_irq_work | pstate: 00000005 (nzcv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) | pc : mwifiex_get_cfp+0xd8/0x15c [mwifiex] | lr : mwifiex_get_cfp+0x34/0x15c [mwifiex] | sp : ffff8000818b3a70 | x29: ffff8000818b3a70 x28: ffff000006bfd8a5 x27: 0000000000000004 | x26: 000000000000002c x25: 0000000000001511 x24: 0000000002e86bc9 | x23: ffff000006bfd996 x22: 0000000000000004 x21: ffff000007bec000 | x20: 000000000000002c x19: 0000000000000000 x18: 0000000000000000 | x17: 000000040044ffff x16: 00500072b5503510 x15: ccc283740681e517 | x14: 0201000101006d15 x13: 0000000002e8ff43 x12: 002c01000000ffb1 | x11: 0100000000000000 x10: 02e8ff43002c0100 x9 : 0000ffb100100157 | x8 : ffff000003d20000 x7 : 00000000000002f1 x6 : 00000000ffffe124 | x5 : 0000000000000001 x4 : 0000000000000003 x3 : 0000000000000000 | x2 : 0000000000000000 x1 : 0001000000011001 x0 : 0000000000000000 | Call trace: | mwifiex_get_cfp+0xd8/0x15c [mwifiex] | mwifiex_parse_single_response_buf+0x1d0/0x504 [mwifiex] | mwifiex_handle_event_ext_scan_report+0x19c/0x2f8 [mwifiex] | mwifiex_process_sta_event+0x298/0xf0c [mwifiex] | mwifiex_process_event+0x110/0x238 [mwifiex] | mwifiex_main_process+0x428/0xa44 [mwifiex] | mwifiex_sdio_interrupt+0x64/0x12c [mwifiex_sdio] | process_sdio_pending_irqs+0x64/0x1b8 | sdio_irq_work+0x4c/0x7c | process_one_work+0x148/0x2a0 | worker_thread+0x2fc/0x40c | kthread+0x110/0x114 | ret_from_fork+0x10/0x20 | Code: a94153f3 a8c37bfd d50323bf d65f03c0 (f940a000) | ---[ end trace 0000000000000000 ]---
Signed-off-by: Sascha Hauer s.hauer@pengutronix.de Acked-by: Brian Norris briannorris@chromium.org Reviewed-by: Francesco Dolcini francesco.dolcini@toradex.com Signed-off-by: Kalle Valo kvalo@kernel.org Link: https://patch.msgid.link/20240703072409.556618-1-s.hauer@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/wireless/marvell/mwifiex/main.h | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/net/wireless/marvell/mwifiex/main.h b/drivers/net/wireless/marvell/mwifiex/main.h index 63f861e6b28a..fb98eb342bd9 100644 --- a/drivers/net/wireless/marvell/mwifiex/main.h +++ b/drivers/net/wireless/marvell/mwifiex/main.h @@ -1301,6 +1301,9 @@ mwifiex_get_priv_by_id(struct mwifiex_adapter *adapter,
for (i = 0; i < adapter->priv_num; i++) { if (adapter->priv[i]) { + if (adapter->priv[i]->bss_mode == NL80211_IFTYPE_UNSPECIFIED) + continue; + if ((adapter->priv[i]->bss_num == bss_num) && (adapter->priv[i]->bss_type == bss_type)) break;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zqiang qiang.zhang1211@gmail.com
[ Upstream commit 77aeb1b685f9db73d276bad4bb30d48505a6fd23 ]
For CONFIG_DEBUG_OBJECTS_WORK=y kernels, sscs.work defined by INIT_WORK_ONSTACK() is initialized by debug_object_init_on_stack() so that the debug check in __init_work() works correctly.
But this lacks the counterpart to remove the tracked object from debug objects again, which will cause a debug object warning once the stack is freed.
Add the missing destroy_work_on_stack() invocation to cure that.
[ tglx: Massaged changelog ]
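For illustration, a kernel-context sketch (not a standalone program) of the on-stack work pattern; some_work_fn() and the wrapper are made up, only the INIT_WORK_ONSTACK()/destroy_work_on_stack() pairing mirrors the fix:

static void some_work_fn(struct work_struct *work)
{
	/* made-up handler, does nothing */
}

static void run_work_on_cpu_example(int cpu)
{
	struct work_struct work;

	INIT_WORK_ONSTACK(&work, some_work_fn); /* registers the debug object */
	queue_work_on(cpu, system_wq, &work);
	flush_work(&work);                      /* wait until it has run */
	destroy_work_on_stack(&work);           /* deregister before the stack frame goes away */
}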
Signed-off-by: Zqiang qiang.zhang1211@gmail.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Tested-by: Paul E. McKenney paulmck@kernel.org Link: https://lore.kernel.org/r/20240704065213.13559-1-qiang.zhang1211@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/smp.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/kernel/smp.c b/kernel/smp.c index 63e466bb6b03..0acd433afa7b 100644 --- a/kernel/smp.c +++ b/kernel/smp.c @@ -1262,6 +1262,7 @@ int smp_call_on_cpu(unsigned int cpu, int (*func)(void *), void *par, bool phys)
queue_work_on(cpu, system_wq, &sscs.work); wait_for_completion(&sscs.done); + destroy_work_on_stack(&sscs.work);
return sscs.ret; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Konstantin Komarov almaz.alexandrovich@paragon-software.com
[ Upstream commit 744375343662058cbfda96d871786e5a5cbe1947 ]
Mark ntfs dirty in this case. Rename ntfs_filldir to ntfs_dir_emit.
Signed-off-by: Konstantin Komarov almaz.alexandrovich@paragon-software.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/ntfs3/dir.c | 52 +++++++++++++++++++++++++++++++------------------- 1 file changed, 32 insertions(+), 20 deletions(-)
diff --git a/fs/ntfs3/dir.c b/fs/ntfs3/dir.c index dcd689ed4baa..a4ab0164d150 100644 --- a/fs/ntfs3/dir.c +++ b/fs/ntfs3/dir.c @@ -272,9 +272,12 @@ struct inode *dir_search_u(struct inode *dir, const struct cpu_str *uni, return err == -ENOENT ? NULL : err ? ERR_PTR(err) : inode; }
-static inline int ntfs_filldir(struct ntfs_sb_info *sbi, struct ntfs_inode *ni, - const struct NTFS_DE *e, u8 *name, - struct dir_context *ctx) +/* + * returns false if 'ctx' if full + */ +static inline bool ntfs_dir_emit(struct ntfs_sb_info *sbi, + struct ntfs_inode *ni, const struct NTFS_DE *e, + u8 *name, struct dir_context *ctx) { const struct ATTR_FILE_NAME *fname; unsigned long ino; @@ -284,29 +287,29 @@ static inline int ntfs_filldir(struct ntfs_sb_info *sbi, struct ntfs_inode *ni, fname = Add2Ptr(e, sizeof(struct NTFS_DE));
if (fname->type == FILE_NAME_DOS) - return 0; + return true;
if (!mi_is_ref(&ni->mi, &fname->home)) - return 0; + return true;
ino = ino_get(&e->ref);
if (ino == MFT_REC_ROOT) - return 0; + return true;
/* Skip meta files. Unless option to show metafiles is set. */ if (!sbi->options->showmeta && ntfs_is_meta_file(sbi, ino)) - return 0; + return true;
if (sbi->options->nohidden && (fname->dup.fa & FILE_ATTRIBUTE_HIDDEN)) - return 0; + return true;
name_len = ntfs_utf16_to_nls(sbi, fname->name, fname->name_len, name, PATH_MAX); if (name_len <= 0) { ntfs_warn(sbi->sb, "failed to convert name for inode %lx.", ino); - return 0; + return true; }
/* @@ -336,17 +339,20 @@ static inline int ntfs_filldir(struct ntfs_sb_info *sbi, struct ntfs_inode *ni, } }
- return !dir_emit(ctx, (s8 *)name, name_len, ino, dt_type); + return dir_emit(ctx, (s8 *)name, name_len, ino, dt_type); }
/* * ntfs_read_hdr - Helper function for ntfs_readdir(). + * + * returns 0 if ok. + * returns -EINVAL if directory is corrupted. + * returns +1 if 'ctx' is full. */ static int ntfs_read_hdr(struct ntfs_sb_info *sbi, struct ntfs_inode *ni, const struct INDEX_HDR *hdr, u64 vbo, u64 pos, u8 *name, struct dir_context *ctx) { - int err; const struct NTFS_DE *e; u32 e_size; u32 end = le32_to_cpu(hdr->used); @@ -354,12 +360,12 @@ static int ntfs_read_hdr(struct ntfs_sb_info *sbi, struct ntfs_inode *ni,
for (;; off += e_size) { if (off + sizeof(struct NTFS_DE) > end) - return -1; + return -EINVAL;
e = Add2Ptr(hdr, off); e_size = le16_to_cpu(e->size); if (e_size < sizeof(struct NTFS_DE) || off + e_size > end) - return -1; + return -EINVAL;
if (de_is_last(e)) return 0; @@ -369,14 +375,15 @@ static int ntfs_read_hdr(struct ntfs_sb_info *sbi, struct ntfs_inode *ni, continue;
if (le16_to_cpu(e->key_size) < SIZEOF_ATTRIBUTE_FILENAME) - return -1; + return -EINVAL;
ctx->pos = vbo + off;
/* Submit the name to the filldir callback. */ - err = ntfs_filldir(sbi, ni, e, name, ctx); - if (err) - return err; + if (!ntfs_dir_emit(sbi, ni, e, name, ctx)) { + /* ctx is full. */ + return +1; + } } }
@@ -475,8 +482,6 @@ static int ntfs_readdir(struct file *file, struct dir_context *ctx)
vbo = (u64)bit << index_bits; if (vbo >= i_size) { - ntfs_inode_err(dir, "Looks like your dir is corrupt"); - ctx->pos = eod; err = -EINVAL; goto out; } @@ -499,9 +504,16 @@ static int ntfs_readdir(struct file *file, struct dir_context *ctx) __putname(name); put_indx_node(node);
- if (err == -ENOENT) { + if (err == 1) { + /* 'ctx' is full. */ + err = 0; + } else if (err == -ENOENT) { err = 0; ctx->pos = pos; + } else if (err < 0) { + if (err == -EINVAL) + ntfs_inode_err(dir, "directory corrupted"); + ctx->pos = eod; }
return err;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Josef Bacik josef@toxicpanda.com
[ Upstream commit 1f9d44c0a12730a24f8bb75c5e1102207413cc9b ]
We have a couple of areas where we check to make sure the tree block is locked before looking up or messing with references. This is old code so it has this as BUG_ON(). Convert this to ASSERT() for developers.
Signed-off-by: Josef Bacik josef@toxicpanda.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/extent-tree.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index 528cd88a77fd..d6efdcf95991 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -5200,7 +5200,7 @@ static noinline int walk_down_proc(struct btrfs_trans_handle *trans, if (lookup_info && ((wc->stage == DROP_REFERENCE && wc->refs[level] != 1) || (wc->stage == UPDATE_BACKREF && !(wc->flags[level] & flag)))) { - BUG_ON(!path->locks[level]); + ASSERT(path->locks[level]); ret = btrfs_lookup_extent_info(trans, fs_info, eb->start, level, 1, &wc->refs[level], @@ -5224,7 +5224,7 @@ static noinline int walk_down_proc(struct btrfs_trans_handle *trans,
/* wc->stage == UPDATE_BACKREF */ if (!(wc->flags[level] & flag)) { - BUG_ON(!path->locks[level]); + ASSERT(path->locks[level]); ret = btrfs_inc_ref(trans, root, eb, 1); BUG_ON(ret); /* -ENOMEM */ ret = btrfs_dec_ref(trans, root, eb, 0);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Josef Bacik josef@toxicpanda.com
[ Upstream commit b8ccef048354074a548f108e51d0557d6adfd3a3 ]
In reada we BUG_ON(refs == 0), which could be unkind since we aren't holding a lock on the extent leaf and thus could get a transient incorrect answer. In walk_down_proc we also BUG_ON(refs == 0), which could happen if we have extent tree corruption. Change that to return -EUCLEAN. In do_walk_down() we catch this case and handle it correctly, however we return -EIO, where -EUCLEAN is the more appropriate error code. Finally in walk_up_proc we have the same BUG_ON(refs == 0), so convert that to proper error handling. Also adjust the error message so we can actually do something with the information.
Signed-off-by: Josef Bacik josef@toxicpanda.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/extent-tree.c | 28 +++++++++++++++++++++++----- 1 file changed, 23 insertions(+), 5 deletions(-)
diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index d6efdcf95991..0d97c8ee6b4f 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -5141,7 +5141,15 @@ static noinline void reada_walk_down(struct btrfs_trans_handle *trans, /* We don't care about errors in readahead. */ if (ret < 0) continue; - BUG_ON(refs == 0); + + /* + * This could be racey, it's conceivable that we raced and end + * up with a bogus refs count, if that's the case just skip, if + * we are actually corrupt we will notice when we look up + * everything again with our locks. + */ + if (refs == 0) + continue;
if (wc->stage == DROP_REFERENCE) { if (refs == 1) @@ -5208,7 +5216,11 @@ static noinline int walk_down_proc(struct btrfs_trans_handle *trans, BUG_ON(ret == -ENOMEM); if (ret) return ret; - BUG_ON(wc->refs[level] == 0); + if (unlikely(wc->refs[level] == 0)) { + btrfs_err(fs_info, "bytenr %llu has 0 references, expect > 0", + eb->start); + return -EUCLEAN; + } }
if (wc->stage == DROP_REFERENCE) { @@ -5338,8 +5350,9 @@ static noinline int do_walk_down(struct btrfs_trans_handle *trans, goto out_unlock;
if (unlikely(wc->refs[level - 1] == 0)) { - btrfs_err(fs_info, "Missing references."); - ret = -EIO; + btrfs_err(fs_info, "bytenr %llu has 0 references, expect > 0", + bytenr); + ret = -EUCLEAN; goto out_unlock; } *lookup_info = 0; @@ -5540,7 +5553,12 @@ static noinline int walk_up_proc(struct btrfs_trans_handle *trans, path->locks[level] = 0; return ret; } - BUG_ON(wc->refs[level] == 0); + if (unlikely(wc->refs[level] == 0)) { + btrfs_tree_unlock_rw(eb, path->locks[level]); + btrfs_err(fs_info, "bytenr %llu has 0 references, expect > 0", + eb->start); + return -EUCLEAN; + } if (wc->refs[level] == 1) { btrfs_tree_unlock_rw(eb, path->locks[level]); path->locks[level] = 0;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Filipe Manana fdmanana@suse.com
[ Upstream commit b56329a782314fde5b61058e2a25097af7ccb675 ]
Instead of a BUG_ON(), just return an error, log an error message and abort the transaction in case we find an extent buffer belonging to the relocation tree that doesn't have the full backref flag set. This is unexpected and should never happen (save for bugs or potentially bad memory).
Reviewed-by: Qu Wenruo wqu@suse.com Signed-off-by: Filipe Manana fdmanana@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/ctree.c | 12 ++++++++++-- 1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c index e08688844f1e..66d1f34c3fc6 100644 --- a/fs/btrfs/ctree.c +++ b/fs/btrfs/ctree.c @@ -324,8 +324,16 @@ static noinline int update_ref_for_cow(struct btrfs_trans_handle *trans, }
owner = btrfs_header_owner(buf); - BUG_ON(owner == BTRFS_TREE_RELOC_OBJECTID && - !(flags & BTRFS_BLOCK_FLAG_FULL_BACKREF)); + if (unlikely(owner == BTRFS_TREE_RELOC_OBJECTID && + !(flags & BTRFS_BLOCK_FLAG_FULL_BACKREF))) { + btrfs_crit(fs_info, +"found tree block at bytenr %llu level %d root %llu refs %llu flags %llx without full backref flag set", + buf->start, btrfs_header_level(buf), + btrfs_root_id(root), refs, flags); + ret = -EUCLEAN; + btrfs_abort_transaction(trans, ret); + return ret; + }
if (refs > 1) { if ((owner == root->root_key.objectid ||
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: yang.zhang yang.zhang@hexintek.com
[ Upstream commit 6ad8735994b854b23c824dd6b1dd2126e893a3b4 ]
The exception vector of the booting hart is not set before enabling the mmu and then still points to the value of the previous firmware, typically _start. That makes it hard to debug setup_vm() when bad things happen. So fix that by setting the exception vector earlier.
Reviewed-by: Alexandre Ghiti alexghiti@rivosinc.com Signed-off-by: yang.zhang yang.zhang@hexintek.com Link: https://lore.kernel.org/r/20240508022445.6131-1-gaoshanliukou@163.com Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/kernel/head.S | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S index 4bf6c449d78b..1a017ad53343 100644 --- a/arch/riscv/kernel/head.S +++ b/arch/riscv/kernel/head.S @@ -307,6 +307,9 @@ clear_bss_done: #else mv a0, s1 #endif /* CONFIG_BUILTIN_DTB */ + /* Set trap vector to spin forever to help debug */ + la a3, .Lsecondary_park + csrw CSR_TVEC, a3 call setup_vm #ifdef CONFIG_MMU la a0, early_pg_dir
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dan Williams dan.j.williams@intel.com
[ Upstream commit a4e772898f8bf2e7e1cf661a12c60a5612c4afab ]
One of the true positives that the cfg_access_lock lockdep effort identified is this sequence:
WARNING: CPU: 14 PID: 1 at drivers/pci/pci.c:4886 pci_bridge_secondary_bus_reset+0x5d/0x70 RIP: 0010:pci_bridge_secondary_bus_reset+0x5d/0x70 Call Trace: <TASK> ? __warn+0x8c/0x190 ? pci_bridge_secondary_bus_reset+0x5d/0x70 ? report_bug+0x1f8/0x200 ? handle_bug+0x3c/0x70 ? exc_invalid_op+0x18/0x70 ? asm_exc_invalid_op+0x1a/0x20 ? pci_bridge_secondary_bus_reset+0x5d/0x70 pci_reset_bus+0x1d8/0x270 vmd_probe+0x778/0xa10 pci_device_probe+0x95/0x120
Where pci_reset_bus() users are triggering unlocked secondary bus resets. Ironically pci_bus_reset(), several calls down from pci_reset_bus(), uses pci_bus_lock() before issuing the reset which locks everything *but* the bridge itself.
For the same motivation as adding:
bridge = pci_upstream_bridge(dev); if (bridge) pci_dev_lock(bridge);
to pci_reset_function() for the "bus" and "cxl_bus" reset cases, add pci_dev_lock() for @bus->self to pci_bus_lock().
Link: https://lore.kernel.org/r/171711747501.1628941.15217746952476635316.stgit@dw... Reported-by: Imre Deak imre.deak@intel.com Closes: http://lore.kernel.org/r/6657833b3b5ae_14984b29437@dwillia2-xfh.jf.intel.com... Signed-off-by: Dan Williams dan.j.williams@intel.com Signed-off-by: Keith Busch kbusch@kernel.org [bhelgaas: squash in recursive locking deadlock fix from Keith Busch: https://lore.kernel.org/r/20240711193650.701834-1-kbusch@meta.com] Signed-off-by: Bjorn Helgaas bhelgaas@google.com Tested-by: Hans de Goede hdegoede@redhat.com Tested-by: Kalle Valo kvalo@kernel.org Reviewed-by: Dave Jiang dave.jiang@intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/pci/pci.c | 35 +++++++++++++++++++++-------------- 1 file changed, 21 insertions(+), 14 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c index 2d373ab3ccb3..f7592348ebee 100644 --- a/drivers/pci/pci.c +++ b/drivers/pci/pci.c @@ -5584,10 +5584,12 @@ static void pci_bus_lock(struct pci_bus *bus) { struct pci_dev *dev;
+ pci_dev_lock(bus->self); list_for_each_entry(dev, &bus->devices, bus_list) { - pci_dev_lock(dev); if (dev->subordinate) pci_bus_lock(dev->subordinate); + else + pci_dev_lock(dev); } }
@@ -5599,8 +5601,10 @@ static void pci_bus_unlock(struct pci_bus *bus) list_for_each_entry(dev, &bus->devices, bus_list) { if (dev->subordinate) pci_bus_unlock(dev->subordinate); - pci_dev_unlock(dev); + else + pci_dev_unlock(dev); } + pci_dev_unlock(bus->self); }
/* Return 1 on successful lock, 0 on contention */ @@ -5608,15 +5612,15 @@ static int pci_bus_trylock(struct pci_bus *bus) { struct pci_dev *dev;
+ if (!pci_dev_trylock(bus->self)) + return 0; + list_for_each_entry(dev, &bus->devices, bus_list) { - if (!pci_dev_trylock(dev)) - goto unlock; if (dev->subordinate) { - if (!pci_bus_trylock(dev->subordinate)) { - pci_dev_unlock(dev); + if (!pci_bus_trylock(dev->subordinate)) goto unlock; - } - } + } else if (!pci_dev_trylock(dev)) + goto unlock; } return 1;
@@ -5624,8 +5628,10 @@ static int pci_bus_trylock(struct pci_bus *bus) list_for_each_entry_continue_reverse(dev, &bus->devices, bus_list) { if (dev->subordinate) pci_bus_unlock(dev->subordinate); - pci_dev_unlock(dev); + else + pci_dev_unlock(dev); } + pci_dev_unlock(bus->self); return 0; }
@@ -5657,9 +5663,10 @@ static void pci_slot_lock(struct pci_slot *slot) list_for_each_entry(dev, &slot->bus->devices, bus_list) { if (!dev->slot || dev->slot != slot) continue; - pci_dev_lock(dev); if (dev->subordinate) pci_bus_lock(dev->subordinate); + else + pci_dev_lock(dev); } }
@@ -5685,14 +5692,13 @@ static int pci_slot_trylock(struct pci_slot *slot) list_for_each_entry(dev, &slot->bus->devices, bus_list) { if (!dev->slot || dev->slot != slot) continue; - if (!pci_dev_trylock(dev)) - goto unlock; if (dev->subordinate) { if (!pci_bus_trylock(dev->subordinate)) { pci_dev_unlock(dev); goto unlock; } - } + } else if (!pci_dev_trylock(dev)) + goto unlock; } return 1;
@@ -5703,7 +5709,8 @@ static int pci_slot_trylock(struct pci_slot *slot) continue; if (dev->subordinate) pci_bus_unlock(dev->subordinate); - pci_dev_unlock(dev); + else + pci_dev_unlock(dev); } return 0; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kuniyuki Iwashima kuniyu@amazon.com
[ Upstream commit 23e89e8ee7be73e21200947885a6d3a109a2c58d ]
RFC 9293 states that in the case of simultaneous connect(), the connection gets established when SYN+ACK is received. [0]
    TCP Peer A                                            TCP Peer B
1.  CLOSED                                                CLOSED
2.  SYN-SENT     --> <SEQ=100><CTL=SYN>               ...
3.  SYN-RECEIVED <-- <SEQ=300><CTL=SYN>               <-- SYN-SENT
4.               ... <SEQ=100><CTL=SYN>               --> SYN-RECEIVED
5.  SYN-RECEIVED --> <SEQ=100><ACK=301><CTL=SYN,ACK>  ...
6.  ESTABLISHED  <-- <SEQ=300><ACK=101><CTL=SYN,ACK>  <-- SYN-RECEIVED
7.               ... <SEQ=100><ACK=301><CTL=SYN,ACK>  --> ESTABLISHED
However, since commit 0c24604b68fc ("tcp: implement RFC 5961 4.2"), such a SYN+ACK is dropped in tcp_validate_incoming() and responded with Challenge ACK.
For example, the write() syscall in the following packetdrill script fails with -EAGAIN, and wrong SNMP stats get incremented.
 0 socket(..., SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 3
+0 connect(3, ..., ...) = -1 EINPROGRESS (Operation now in progress)

+0 > S 0:0(0) <mss 1460,sackOK,TS val 1000 ecr 0,nop,wscale 8>
+0 < S 0:0(0) win 1000 <mss 1000>
+0 > S. 0:0(0) ack 1 <mss 1460,sackOK,TS val 3308134035 ecr 0,nop,wscale 8>
+0 < S. 0:0(0) ack 1 win 1000

+0 write(3, ..., 100) = 100
+0 > P. 1:101(100) ack 1
--
# packetdrill cross-synack.pkt
cross-synack.pkt:13: runtime error in write call: Expected result 100 but got -1 with errno 11 (Resource temporarily unavailable)
# nstat
...
TcpExtTCPChallengeACK    1    0.0
TcpExtTCPSYNChallenge    1    0.0
The problem is that bpf_skops_established() is triggered by the Challenge ACK instead of SYN+ACK. This causes the bpf prog to miss the chance to check if the peer supports a TCP option that is expected to be exchanged in SYN and SYN+ACK.
Let's accept a bare SYN+ACK for active-open TCP_SYN_RECV sockets to avoid such a situation.
Note that tcp_ack_snd_check() in tcp_rcv_state_process() is skipped so as not to send an unnecessary ACK, but this could be a bit risky for net.git, so this targets net-next.
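As a hedged illustration of the kind of program affected (not part of this patch; the callback constant and struct come from the standard BPF UAPI, while the option-checking logic is hypothetical), a sockops program that inspects handshake options only gets a useful established callback if the real SYN+ACK, rather than a challenge ACK, completes the handshake:

        #include <linux/bpf.h>
        #include <bpf/bpf_helpers.h>

        SEC("sockops")
        int check_peer_option(struct bpf_sock_ops *skops)
        {
                if (skops->op == BPF_SOCK_OPS_ACTIVE_ESTABLISHED_CB) {
                        /* The peer's SYN+ACK options can only be examined
                         * here if the SYN+ACK itself drove this actively
                         * opened socket to ESTABLISHED, which is what this
                         * change restores for the simultaneous-connect case. */
                }
                return 1;
        }

        char _license[] SEC("license") = "GPL";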
Link: https://www.rfc-editor.org/rfc/rfc9293.html#section-3.5-7 [0] Signed-off-by: Kuniyuki Iwashima kuniyu@amazon.com Reviewed-by: Eric Dumazet edumazet@google.com Link: https://patch.msgid.link/20240710171246.87533-2-kuniyu@amazon.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/tcp_input.c | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c index b1b4f44d2137..b7d038b24a6d 100644 --- a/net/ipv4/tcp_input.c +++ b/net/ipv4/tcp_input.c @@ -5843,6 +5843,11 @@ static bool tcp_validate_incoming(struct sock *sk, struct sk_buff *skb, * RFC 5961 4.2 : Send a challenge ack */ if (th->syn) { + if (sk->sk_state == TCP_SYN_RECV && sk->sk_socket && th->ack && + TCP_SKB_CB(skb)->seq + 1 == TCP_SKB_CB(skb)->end_seq && + TCP_SKB_CB(skb)->seq + 1 == tp->rcv_nxt && + TCP_SKB_CB(skb)->ack_seq == tp->snd_nxt) + goto pass; syn_challenge: if (syn_inerr) TCP_INC_STATS(sock_net(sk), TCP_MIB_INERRS); @@ -5852,6 +5857,7 @@ static bool tcp_validate_incoming(struct sock *sk, struct sk_buff *skb, goto discard; }
+pass: bpf_skops_parse_hdr(sk, skb);
return true; @@ -6635,6 +6641,9 @@ int tcp_rcv_state_process(struct sock *sk, struct sk_buff *skb) tcp_fast_path_on(tp); if (sk->sk_shutdown & SEND_SHUTDOWN) tcp_shutdown(sk, SEND_SHUTDOWN); + + if (sk->sk_socket) + goto consume; break;
case TCP_FIN_WAIT1: {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vladimir Oltean vladimir.oltean@nxp.com
[ Upstream commit 555a05d84ca2c587e2d4777006e2c2fb3dfbd91d ]
The dpaa-eth driver is written for PowerPC and Arm SoCs which have 1-24 CPUs. It depends on CONFIG_NR_CPUS having a reasonably small value in Kconfig. Otherwise, there are 2 functions which allocate on-stack arrays of NR_CPUS elements, and these can quickly explode in size, leading to warnings such as:
drivers/net/ethernet/freescale/dpaa/dpaa_eth.c:3280:12: warning: stack frame size (16664) exceeds limit (2048) in 'dpaa_eth_probe' [-Wframe-larger-than]
The problem is twofold:
- Reducing the array size to the boot-time num_possible_cpus() (rather than the compile-time NR_CPUS) creates a variable-length array, which should be avoided in the Linux kernel.
- Using NR_CPUS as an array size makes the driver blow up in stack consumption with generic, as opposed to hand-crafted, .config files.
A simple solution is to use dynamic allocation for num_possible_cpus() elements (aka a small number determined at runtime).
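A minimal sketch of the allocation pattern being introduced (kernel-style pseudocode mirroring the diff below, not the complete driver logic):

        u16 *channels;

        /* Size the scratch array by the runtime CPU count instead of the
         * compile-time NR_CPUS, and keep it off the stack. */
        channels = kcalloc(num_possible_cpus(), sizeof(*channels), GFP_KERNEL);
        if (!channels)
                return -ENOMEM;

        /* ... record one affine channel per online portal CPU ... */

        kfree(channels);
        return 0;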
Link: https://lore.kernel.org/all/202406261920.l5pzM1rj-lkp@intel.com/ Signed-off-by: Vladimir Oltean vladimir.oltean@nxp.com Reviewed-by: Breno Leitao leitao@debian.org Acked-by: Madalin Bucur madalin.bucur@oss.nxp.com Link: https://patch.msgid.link/20240713225336.1746343-2-vladimir.oltean@nxp.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/freescale/dpaa/dpaa_eth.c | 20 ++++++++++++++----- .../ethernet/freescale/dpaa/dpaa_ethtool.c | 10 +++++++++- 2 files changed, 24 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c index 981cc3248047..19506f2be4d4 100644 --- a/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c +++ b/drivers/net/ethernet/freescale/dpaa/dpaa_eth.c @@ -916,14 +916,18 @@ static inline void dpaa_setup_egress(const struct dpaa_priv *priv, } }
-static void dpaa_fq_setup(struct dpaa_priv *priv, - const struct dpaa_fq_cbs *fq_cbs, - struct fman_port *tx_port) +static int dpaa_fq_setup(struct dpaa_priv *priv, + const struct dpaa_fq_cbs *fq_cbs, + struct fman_port *tx_port) { int egress_cnt = 0, conf_cnt = 0, num_portals = 0, portal_cnt = 0, cpu; const cpumask_t *affine_cpus = qman_affine_cpus(); - u16 channels[NR_CPUS]; struct dpaa_fq *fq; + u16 *channels; + + channels = kcalloc(num_possible_cpus(), sizeof(u16), GFP_KERNEL); + if (!channels) + return -ENOMEM;
for_each_cpu_and(cpu, affine_cpus, cpu_online_mask) channels[num_portals++] = qman_affine_channel(cpu); @@ -982,6 +986,10 @@ static void dpaa_fq_setup(struct dpaa_priv *priv, break; } } + + kfree(channels); + + return 0; }
static inline int dpaa_tx_fq_to_id(const struct dpaa_priv *priv, @@ -3454,7 +3462,9 @@ static int dpaa_eth_probe(struct platform_device *pdev) */ dpaa_eth_add_channel(priv->channel, &pdev->dev);
- dpaa_fq_setup(priv, &dpaa_fq_cbs, priv->mac_dev->port[TX]); + err = dpaa_fq_setup(priv, &dpaa_fq_cbs, priv->mac_dev->port[TX]); + if (err) + goto free_dpaa_bps;
/* Create a congestion group for this netdev, with * dynamically-allocated CGR ID. diff --git a/drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c b/drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c index 769e936a263c..fcb0cba4611e 100644 --- a/drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c +++ b/drivers/net/ethernet/freescale/dpaa/dpaa_ethtool.c @@ -515,12 +515,16 @@ static int dpaa_set_coalesce(struct net_device *dev, struct netlink_ext_ack *extack) { const cpumask_t *cpus = qman_affine_cpus(); - bool needs_revert[NR_CPUS] = {false}; struct qman_portal *portal; u32 period, prev_period; u8 thresh, prev_thresh; + bool *needs_revert; int cpu, res;
+ needs_revert = kcalloc(num_possible_cpus(), sizeof(bool), GFP_KERNEL); + if (!needs_revert) + return -ENOMEM; + period = c->rx_coalesce_usecs; thresh = c->rx_max_coalesced_frames;
@@ -543,6 +547,8 @@ static int dpaa_set_coalesce(struct net_device *dev, needs_revert[cpu] = true; }
+ kfree(needs_revert); + return 0;
revert_values: @@ -556,6 +562,8 @@ static int dpaa_set_coalesce(struct net_device *dev, qman_dqrr_set_ithresh(portal, prev_thresh); }
+ kfree(needs_revert); + return res; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jarkko Nikula jarkko.nikula@linux.intel.com
[ Upstream commit 8a2be2f1db268ec735419e53ef04ca039fc027dc ]
The condition where dma_get_cache_alignment() times a defined value exceeds 256 during driver initialization is definitely not a reason to BUG_ON(). Turn that into a graceful error out with -EINVAL.
Signed-off-by: Jarkko Nikula jarkko.nikula@linux.intel.com Link: https://lore.kernel.org/r/20240628131559.502822-3-jarkko.nikula@linux.intel.... Signed-off-by: Alexandre Belloni alexandre.belloni@bootlin.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/i3c/master/mipi-i3c-hci/dma.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/i3c/master/mipi-i3c-hci/dma.c b/drivers/i3c/master/mipi-i3c-hci/dma.c index 337c95d43f3f..edc3a69bfe31 100644 --- a/drivers/i3c/master/mipi-i3c-hci/dma.c +++ b/drivers/i3c/master/mipi-i3c-hci/dma.c @@ -291,7 +291,10 @@ static int hci_dma_init(struct i3c_hci *hci)
rh->ibi_chunk_sz = dma_get_cache_alignment(); rh->ibi_chunk_sz *= IBI_CHUNK_CACHELINES; - BUG_ON(rh->ibi_chunk_sz > 256); + if (rh->ibi_chunk_sz > 256) { + ret = -EINVAL; + goto err_out; + }
ibi_status_ring_sz = rh->ibi_status_sz * rh->ibi_status_entries; ibi_data_ring_sz = rh->ibi_chunk_sz * rh->ibi_chunks_total;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Zenghui Yu yuzenghui@huawei.com
[ Upstream commit 291e4baf70019f17a81b7b47aeb186b27d222159 ]
Even if a vgem device is configured in, we will skip the import_vgem_fd() test almost every time.
TAP version 13
1..11
# Testing heap: system
# =======================================
# Testing allocation and importing:
ok 1 # SKIP Could not open vgem -1
The problem is that we use the DRM_IOCTL_VERSION ioctl to query the driver version information but leave the name field a non-null-terminated string. Terminate it properly to actually test against the vgem device.
While at it, let's check that the length of the driver name is exactly 4 bytes and return early otherwise (in case there is a name like "vgemfoo" that gets converted to "vgem\0" unexpectedly).
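For reference, the hardened check boils down to the following sketch (assuming the usual struct drm_version layout from the DRM UAPI headers and the selftest's 5-byte name buffer):

        static int check_vgem(int fd)
        {
                char name[5] = "";
                struct drm_version version = { .name_len = 4, .name = name };

                /* Reject on ioctl failure or if the driver name is not
                 * exactly 4 bytes long (e.g. "vgemfoo"). */
                if (ioctl(fd, DRM_IOCTL_VERSION, &version) || version.name_len != 4)
                        return 0;

                /* The ioctl does not NUL-terminate the buffer. */
                name[4] = '\0';

                return !strcmp(name, "vgem");
        }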
Signed-off-by: Zenghui Yu yuzenghui@huawei.com Signed-off-by: Daniel Vetter daniel.vetter@ffwll.ch Link: https://patchwork.freedesktop.org/patch/msgid/20240729024604.2046-1-yuzenghu... Signed-off-by: Sasha Levin sashal@kernel.org --- tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c b/tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c index 890a8236a8ba..2809f9a25c43 100644 --- a/tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c +++ b/tools/testing/selftests/dmabuf-heaps/dmabuf-heap.c @@ -28,9 +28,11 @@ static int check_vgem(int fd) version.name = name;
ret = ioctl(fd, DRM_IOCTL_VERSION, &version); - if (ret) + if (ret || version.name_len != 4) return 0;
+ name[4] = '\0'; + return !strcmp(name, "vgem"); }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Sterba dsterba@suse.com
[ Upstream commit b8e947e9f64cac9df85a07672b658df5b2bcff07 ]
Some arch + compiler combinations report a potentially unused variable location in btrfs_lookup_dentry(). This is a false alert, as the variable is passed by value and is always either valid or an error is returned. The compilers probably cannot reason about that, although btrfs_inode_by_name() is in the same file.
/kisskb/src/fs/btrfs/inode.c: error: 'location.objectid' may be used uninitialized in this function [-Werror=maybe-uninitialized]:  => 5603:9
/kisskb/src/fs/btrfs/inode.c: error: 'location.type' may be used uninitialized in this function [-Werror=maybe-uninitialized]:  => 5674:5

m68k-gcc8/m68k-allmodconfig
mips-gcc8/mips-allmodconfig
powerpc-gcc5/powerpc-all{mod,yes}config
powerpc-gcc5/ppc64_defconfig
Initialize it to zero; this should fix the warnings and won't change the behaviour, as btrfs_inode_by_name() accepts only root or inode item types and otherwise returns an error.
Reported-by: Geert Uytterhoeven geert@linux-m68k.org Tested-by: Geert Uytterhoeven geert@linux-m68k.org Link: https://lore.kernel.org/linux-btrfs/bd4e9928-17b3-9257-8ba7-6b7f9bbb639a@lin... Reviewed-by: Qu Wenruo wqu@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/btrfs/inode.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c index 934e360d1aef..e5017b2ade57 100644 --- a/fs/btrfs/inode.c +++ b/fs/btrfs/inode.c @@ -5890,7 +5890,7 @@ struct inode *btrfs_lookup_dentry(struct inode *dir, struct dentry *dentry) struct inode *inode; struct btrfs_root *root = BTRFS_I(dir)->root; struct btrfs_root *sub_root = root; - struct btrfs_key location; + struct btrfs_key location = { 0 }; u8 di_type = 0; int ret = 0;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Heiko Carstens hca@linux.ibm.com
[ Upstream commit 75c10d5377d8821efafed32e4d72068d9c1f8ec0 ]
The .data.rel.ro and .got sections were added between the rodata and ro_after_init data sections, which adds an RW mapping in between all RO mappings of the kernel image:
---[ Kernel Image Start ]---
0x000003ffe0000000-0x000003ffe0e00000        14M PMD RO X
0x000003ffe0e00000-0x000003ffe0ec7000       796K PTE RO X
0x000003ffe0ec7000-0x000003ffe0f00000       228K PTE RO NX
0x000003ffe0f00000-0x000003ffe1300000         4M PMD RO NX
0x000003ffe1300000-0x000003ffe1331000       196K PTE RO NX
0x000003ffe1331000-0x000003ffe13b3000       520K PTE RW NX <---
0x000003ffe13b3000-0x000003ffe13d5000       136K PTE RO NX
0x000003ffe13d5000-0x000003ffe1400000       172K PTE RW NX
0x000003ffe1400000-0x000003ffe1500000         1M PMD RW NX
0x000003ffe1500000-0x000003ffe1700000         2M PTE RW NX
0x000003ffe1700000-0x000003ffe1800000         1M PMD RW NX
0x000003ffe1800000-0x000003ffe187e000       504K PTE RW NX
---[ Kernel Image End ]---
Move the ro_after_init data section again right behind the rodata section to prevent interleaving RO and RW mappings:
---[ Kernel Image Start ]---
0x000003ffe0000000-0x000003ffe0e00000        14M PMD RO X
0x000003ffe0e00000-0x000003ffe0ec7000       796K PTE RO X
0x000003ffe0ec7000-0x000003ffe0f00000       228K PTE RO NX
0x000003ffe0f00000-0x000003ffe1300000         4M PMD RO NX
0x000003ffe1300000-0x000003ffe1353000       332K PTE RO NX
0x000003ffe1353000-0x000003ffe1400000       692K PTE RW NX
0x000003ffe1400000-0x000003ffe1500000         1M PMD RW NX
0x000003ffe1500000-0x000003ffe1700000         2M PTE RW NX
0x000003ffe1700000-0x000003ffe1800000         1M PMD RW NX
0x000003ffe1800000-0x000003ffe187e000       504K PTE RW NX
---[ Kernel Image End ]---
Reviewed-by: Alexander Gordeev agordeev@linux.ibm.com Signed-off-by: Heiko Carstens hca@linux.ibm.com Signed-off-by: Vasily Gorbik gor@linux.ibm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/s390/kernel/vmlinux.lds.S | 9 +++++++++ 1 file changed, 9 insertions(+)
diff --git a/arch/s390/kernel/vmlinux.lds.S b/arch/s390/kernel/vmlinux.lds.S index 729d4f949cfe..52d6e5d1b453 100644 --- a/arch/s390/kernel/vmlinux.lds.S +++ b/arch/s390/kernel/vmlinux.lds.S @@ -71,6 +71,15 @@ SECTIONS . = ALIGN(PAGE_SIZE); __end_ro_after_init = .;
+ .data.rel.ro : { + *(.data.rel.ro .data.rel.ro.*) + } + .got : { + __got_start = .; + *(.got) + __got_end = .; + } + RW_DATA(0x100, PAGE_SIZE, THREAD_SIZE) BOOT_DATA_PRESERVED
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Camila Alvarez cam.alvarez.i@gmail.com
[ Upstream commit a6e9c391d45b5865b61e569146304cff72821a5d ]
report_fixup for the Cougar 500k Gaming Keyboard was not verifying that the report descriptor size was correct before accessing it.
Reported-by: syzbot+24c0361074799d02c452@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=24c0361074799d02c452 Signed-off-by: Camila Alvarez cam.alvarez.i@gmail.com Reviewed-by: Silvan Jegen s.jegen@gmail.com Signed-off-by: Jiri Kosina jkosina@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hid/hid-cougar.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/hid/hid-cougar.c b/drivers/hid/hid-cougar.c index cb8bd8aae15b..0fa785f52707 100644 --- a/drivers/hid/hid-cougar.c +++ b/drivers/hid/hid-cougar.c @@ -106,7 +106,7 @@ static void cougar_fix_g6_mapping(void) static __u8 *cougar_report_fixup(struct hid_device *hdev, __u8 *rdesc, unsigned int *rsize) { - if (rdesc[2] == 0x09 && rdesc[3] == 0x02 && + if (*rsize >= 117 && rdesc[2] == 0x09 && rdesc[3] == 0x02 && (rdesc[115] | rdesc[116] << 8) >= HID_MAX_USAGES) { hid_info(hdev, "usage count exceeds max: fixing up report descriptor\n");
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Olivier Sobrie olivier@sobrie.be
[ Upstream commit 97155021ae17b86985121b33cf8098bcde00d497 ]
HID driver callbacks aren't called anymore once hid_destroy_device() has been called. Hence, the hid driver_data should be freed only after hid_destroy_device() has returned, as driver_data is used in several callbacks.
I observed a crash with kernel 6.10.0 on my T14s Gen 3. After enabling KASAN to debug memory allocation, I got this output:
[ 13.050438] ================================================================== [ 13.054060] BUG: KASAN: slab-use-after-free in amd_sfh_get_report+0x3ec/0x530 [amd_sfh] [ 13.054809] psmouse serio1: trackpoint: Synaptics TrackPoint firmware: 0x02, buttons: 3/3 [ 13.056432] Read of size 8 at addr ffff88813152f408 by task (udev-worker)/479
[ 13.060970] CPU: 5 PID: 479 Comm: (udev-worker) Not tainted 6.10.0-arch1-2 #1 893bb55d7f0073f25c46adbb49eb3785fefd74b0 [ 13.063978] Hardware name: LENOVO 21CQCTO1WW/21CQCTO1WW, BIOS R22ET70W (1.40 ) 03/21/2024 [ 13.067860] Call Trace: [ 13.069383] input: TPPS/2 Synaptics TrackPoint as /devices/platform/i8042/serio1/input/input8 [ 13.071486] <TASK> [ 13.071492] dump_stack_lvl+0x5d/0x80 [ 13.074870] snd_hda_intel 0000:33:00.6: enabling device (0000 -> 0002) [ 13.078296] ? amd_sfh_get_report+0x3ec/0x530 [amd_sfh 05f43221435b5205f734cd9da29399130f398a38] [ 13.082199] print_report+0x174/0x505 [ 13.085776] ? __pfx__raw_spin_lock_irqsave+0x10/0x10 [ 13.089367] ? srso_alias_return_thunk+0x5/0xfbef5 [ 13.093255] ? amd_sfh_get_report+0x3ec/0x530 [amd_sfh 05f43221435b5205f734cd9da29399130f398a38] [ 13.097464] kasan_report+0xc8/0x150 [ 13.101461] ? amd_sfh_get_report+0x3ec/0x530 [amd_sfh 05f43221435b5205f734cd9da29399130f398a38] [ 13.105802] amd_sfh_get_report+0x3ec/0x530 [amd_sfh 05f43221435b5205f734cd9da29399130f398a38] [ 13.110303] amdtp_hid_request+0xb8/0x110 [amd_sfh 05f43221435b5205f734cd9da29399130f398a38] [ 13.114879] ? srso_alias_return_thunk+0x5/0xfbef5 [ 13.119450] sensor_hub_get_feature+0x1d3/0x540 [hid_sensor_hub 3f13be3016ff415bea03008d45d99da837ee3082] [ 13.124097] hid_sensor_parse_common_attributes+0x4d0/0xad0 [hid_sensor_iio_common c3a5cbe93969c28b122609768bbe23efe52eb8f5] [ 13.127404] ? srso_alias_return_thunk+0x5/0xfbef5 [ 13.131925] ? __pfx_hid_sensor_parse_common_attributes+0x10/0x10 [hid_sensor_iio_common c3a5cbe93969c28b122609768bbe23efe52eb8f5] [ 13.136455] ? _raw_spin_lock_irqsave+0x96/0xf0 [ 13.140197] ? __pfx__raw_spin_lock_irqsave+0x10/0x10 [ 13.143602] ? devm_iio_device_alloc+0x34/0x50 [industrialio 3d261d5e5765625d2b052be40e526d62b1d2123b] [ 13.147234] ? srso_alias_return_thunk+0x5/0xfbef5 [ 13.150446] ? __devm_add_action+0x167/0x1d0 [ 13.155061] hid_gyro_3d_probe+0x120/0x7f0 [hid_sensor_gyro_3d 63da36a143b775846ab2dbb86c343b401b5e3172] [ 13.158581] ? srso_alias_return_thunk+0x5/0xfbef5 [ 13.161814] platform_probe+0xa2/0x150 [ 13.165029] really_probe+0x1e3/0x8a0 [ 13.168243] __driver_probe_device+0x18c/0x370 [ 13.171500] driver_probe_device+0x4a/0x120 [ 13.175000] __driver_attach+0x190/0x4a0 [ 13.178521] ? __pfx___driver_attach+0x10/0x10 [ 13.181771] bus_for_each_dev+0x106/0x180 [ 13.185033] ? __pfx__raw_spin_lock+0x10/0x10 [ 13.188229] ? __pfx_bus_for_each_dev+0x10/0x10 [ 13.191446] ? srso_alias_return_thunk+0x5/0xfbef5 [ 13.194382] bus_add_driver+0x29e/0x4d0 [ 13.197328] driver_register+0x1a5/0x360 [ 13.200283] ? __pfx_hid_gyro_3d_platform_driver_init+0x10/0x10 [hid_sensor_gyro_3d 63da36a143b775846ab2dbb86c343b401b5e3172] [ 13.203362] do_one_initcall+0xa7/0x380 [ 13.206432] ? __pfx_do_one_initcall+0x10/0x10 [ 13.210175] ? srso_alias_return_thunk+0x5/0xfbef5 [ 13.213211] ? kasan_unpoison+0x44/0x70 [ 13.216688] do_init_module+0x238/0x750 [ 13.219696] load_module+0x5011/0x6af0 [ 13.223096] ? kasan_save_stack+0x30/0x50 [ 13.226743] ? kasan_save_track+0x14/0x30 [ 13.230080] ? kasan_save_free_info+0x3b/0x60 [ 13.233323] ? poison_slab_object+0x109/0x180 [ 13.236778] ? __pfx_load_module+0x10/0x10 [ 13.239703] ? poison_slab_object+0x109/0x180 [ 13.243070] ? srso_alias_return_thunk+0x5/0xfbef5 [ 13.245924] ? init_module_from_file+0x13d/0x150 [ 13.248745] ? srso_alias_return_thunk+0x5/0xfbef5 [ 13.251503] ? init_module_from_file+0xdf/0x150 [ 13.254198] init_module_from_file+0xdf/0x150 [ 13.256826] ? __pfx_init_module_from_file+0x10/0x10 [ 13.259428] ? 
kasan_save_track+0x14/0x30 [ 13.261959] ? srso_alias_return_thunk+0x5/0xfbef5 [ 13.264471] ? kasan_save_free_info+0x3b/0x60 [ 13.267026] ? poison_slab_object+0x109/0x180 [ 13.269494] ? srso_alias_return_thunk+0x5/0xfbef5 [ 13.271949] ? srso_alias_return_thunk+0x5/0xfbef5 [ 13.274324] ? _raw_spin_lock+0x85/0xe0 [ 13.276671] ? __pfx__raw_spin_lock+0x10/0x10 [ 13.278963] ? __rseq_handle_notify_resume+0x1a6/0xad0 [ 13.281193] idempotent_init_module+0x23b/0x650 [ 13.283420] ? __pfx_idempotent_init_module+0x10/0x10 [ 13.285619] ? __pfx___seccomp_filter+0x10/0x10 [ 13.287714] ? srso_alias_return_thunk+0x5/0xfbef5 [ 13.289828] ? __fget_light+0x57/0x420 [ 13.291870] ? srso_alias_return_thunk+0x5/0xfbef5 [ 13.293880] ? security_capable+0x74/0xb0 [ 13.295820] __x64_sys_finit_module+0xbe/0x130 [ 13.297874] do_syscall_64+0x82/0x190 [ 13.299898] ? srso_alias_return_thunk+0x5/0xfbef5 [ 13.301905] ? irqtime_account_irq+0x3d/0x1f0 [ 13.303877] ? srso_alias_return_thunk+0x5/0xfbef5 [ 13.305753] ? __irq_exit_rcu+0x4e/0x130 [ 13.307577] ? srso_alias_return_thunk+0x5/0xfbef5 [ 13.309489] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 13.311371] RIP: 0033:0x7a21f96ade9d [ 13.313234] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 63 de 0c 00 f7 d8 64 89 01 48 [ 13.317051] RSP: 002b:00007ffeae934e78 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 13.319024] RAX: ffffffffffffffda RBX: 00005987276bfcf0 RCX: 00007a21f96ade9d [ 13.321100] RDX: 0000000000000004 RSI: 00007a21f8eda376 RDI: 000000000000001c [ 13.323314] RBP: 00007a21f8eda376 R08: 0000000000000001 R09: 00007ffeae934ec0 [ 13.325505] R10: 0000000000000050 R11: 0000000000000246 R12: 0000000000020000 [ 13.327637] R13: 00005987276c1250 R14: 0000000000000000 R15: 00005987276c4530 [ 13.329737] </TASK>
[ 13.333945] Allocated by task 139: [ 13.336111] kasan_save_stack+0x30/0x50 [ 13.336121] kasan_save_track+0x14/0x30 [ 13.336125] __kasan_kmalloc+0xaa/0xb0 [ 13.336129] amdtp_hid_probe+0xb1/0x440 [amd_sfh] [ 13.336138] amd_sfh_hid_client_init+0xb8a/0x10f0 [amd_sfh] [ 13.336144] sfh_init_work+0x47/0x120 [amd_sfh] [ 13.336150] process_one_work+0x673/0xeb0 [ 13.336155] worker_thread+0x795/0x1250 [ 13.336160] kthread+0x290/0x350 [ 13.336164] ret_from_fork+0x34/0x70 [ 13.336169] ret_from_fork_asm+0x1a/0x30
[ 13.338175] Freed by task 139: [ 13.340064] kasan_save_stack+0x30/0x50 [ 13.340072] kasan_save_track+0x14/0x30 [ 13.340076] kasan_save_free_info+0x3b/0x60 [ 13.340081] poison_slab_object+0x109/0x180 [ 13.340085] __kasan_slab_free+0x32/0x50 [ 13.340089] kfree+0xe5/0x310 [ 13.340094] amdtp_hid_remove+0xb2/0x160 [amd_sfh] [ 13.340102] amd_sfh_hid_client_deinit+0x324/0x640 [amd_sfh] [ 13.340107] amd_sfh_hid_client_init+0x94a/0x10f0 [amd_sfh] [ 13.340113] sfh_init_work+0x47/0x120 [amd_sfh] [ 13.340118] process_one_work+0x673/0xeb0 [ 13.340123] worker_thread+0x795/0x1250 [ 13.340127] kthread+0x290/0x350 [ 13.340132] ret_from_fork+0x34/0x70 [ 13.340136] ret_from_fork_asm+0x1a/0x30
[ 13.342482] The buggy address belongs to the object at ffff88813152f400 which belongs to the cache kmalloc-64 of size 64 [ 13.347357] The buggy address is located 8 bytes inside of freed 64-byte region [ffff88813152f400, ffff88813152f440)
[ 13.347367] The buggy address belongs to the physical page: [ 13.355409] page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x13152f [ 13.355416] anon flags: 0x2ffff8000000000(node=0|zone=2|lastcpupid=0x1ffff) [ 13.355423] page_type: 0xffffefff(slab) [ 13.355429] raw: 02ffff8000000000 ffff8881000428c0 ffffea0004c43a00 0000000000000005 [ 13.355435] raw: 0000000000000000 0000000000200020 00000001ffffefff 0000000000000000 [ 13.355439] page dumped because: kasan: bad access detected
[ 13.357295] Memory state around the buggy address: [ 13.357299] ffff88813152f300: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc [ 13.357303] ffff88813152f380: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc [ 13.357306] >ffff88813152f400: fa fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc [ 13.357309] ^ [ 13.357311] ffff88813152f480: 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc [ 13.357315] ffff88813152f500: 00 00 00 00 00 00 00 06 fc fc fc fc fc fc fc fc [ 13.357318] ================================================================== [ 13.357405] Disabling lock debugging due to kernel taint [ 13.383534] Oops: general protection fault, probably for non-canonical address 0xe0a1bc4140000013: 0000 [#1] PREEMPT SMP KASAN NOPTI [ 13.383544] KASAN: maybe wild-memory-access in range [0x050e020a00000098-0x050e020a0000009f] [ 13.383551] CPU: 3 PID: 479 Comm: (udev-worker) Tainted: G B 6.10.0-arch1-2 #1 893bb55d7f0073f25c46adbb49eb3785fefd74b0 [ 13.383561] Hardware name: LENOVO 21CQCTO1WW/21CQCTO1WW, BIOS R22ET70W (1.40 ) 03/21/2024 [ 13.383565] RIP: 0010:amd_sfh_get_report+0x81/0x530 [amd_sfh] [ 13.383580] Code: 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 78 03 00 00 48 b8 00 00 00 00 00 fc ff df 4c 8b 63 08 49 8d 7c 24 10 48 89 fa 48 c1 ea 03 <0f> b6 04 02 84 c0 74 08 3c 03 0f 8e 1a 03 00 00 45 8b 74 24 10 45 [ 13.383585] RSP: 0018:ffff8881261f7388 EFLAGS: 00010212 [ 13.383592] RAX: dffffc0000000000 RBX: ffff88813152f400 RCX: 0000000000000002 [ 13.383597] RDX: 00a1c04140000013 RSI: 0000000000000008 RDI: 050e020a0000009b [ 13.383600] RBP: ffff88814d010000 R08: 0000000000000002 R09: fffffbfff3ddb8c0 [ 13.383604] R10: ffffffff9eedc607 R11: ffff88810ce98000 R12: 050e020a0000008b [ 13.383607] R13: ffff88814d010000 R14: dffffc0000000000 R15: 0000000000000004 [ 13.383611] FS: 00007a21f94d0880(0000) GS:ffff8887e7d80000(0000) knlGS:0000000000000000 [ 13.383615] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [ 13.383618] CR2: 00007e0014c438f0 CR3: 000000012614c000 CR4: 0000000000f50ef0 [ 13.383622] PKRU: 55555554 [ 13.383625] Call Trace: [ 13.383629] <TASK> [ 13.383632] ? __die_body.cold+0x19/0x27 [ 13.383644] ? die_addr+0x46/0x70 [ 13.383652] ? exc_general_protection+0x150/0x240 [ 13.383664] ? asm_exc_general_protection+0x26/0x30 [ 13.383674] ? amd_sfh_get_report+0x81/0x530 [amd_sfh 05f43221435b5205f734cd9da29399130f398a38] [ 13.383686] ? amd_sfh_get_report+0x3ec/0x530 [amd_sfh 05f43221435b5205f734cd9da29399130f398a38] [ 13.383697] amdtp_hid_request+0xb8/0x110 [amd_sfh 05f43221435b5205f734cd9da29399130f398a38] [ 13.383706] ? srso_alias_return_thunk+0x5/0xfbef5 [ 13.383713] sensor_hub_get_feature+0x1d3/0x540 [hid_sensor_hub 3f13be3016ff415bea03008d45d99da837ee3082] [ 13.383727] hid_sensor_parse_common_attributes+0x4d0/0xad0 [hid_sensor_iio_common c3a5cbe93969c28b122609768bbe23efe52eb8f5] [ 13.383739] ? srso_alias_return_thunk+0x5/0xfbef5 [ 13.383745] ? __pfx_hid_sensor_parse_common_attributes+0x10/0x10 [hid_sensor_iio_common c3a5cbe93969c28b122609768bbe23efe52eb8f5] [ 13.383753] ? _raw_spin_lock_irqsave+0x96/0xf0 [ 13.383762] ? __pfx__raw_spin_lock_irqsave+0x10/0x10 [ 13.383768] ? devm_iio_device_alloc+0x34/0x50 [industrialio 3d261d5e5765625d2b052be40e526d62b1d2123b] [ 13.383790] ? srso_alias_return_thunk+0x5/0xfbef5 [ 13.383795] ? __devm_add_action+0x167/0x1d0 [ 13.383806] hid_gyro_3d_probe+0x120/0x7f0 [hid_sensor_gyro_3d 63da36a143b775846ab2dbb86c343b401b5e3172] [ 13.383818] ? 
srso_alias_return_thunk+0x5/0xfbef5 [ 13.383826] platform_probe+0xa2/0x150 [ 13.383832] really_probe+0x1e3/0x8a0 [ 13.383838] __driver_probe_device+0x18c/0x370 [ 13.383844] driver_probe_device+0x4a/0x120 [ 13.383851] __driver_attach+0x190/0x4a0 [ 13.383857] ? __pfx___driver_attach+0x10/0x10 [ 13.383863] bus_for_each_dev+0x106/0x180 [ 13.383868] ? __pfx__raw_spin_lock+0x10/0x10 [ 13.383874] ? __pfx_bus_for_each_dev+0x10/0x10 [ 13.383880] ? srso_alias_return_thunk+0x5/0xfbef5 [ 13.383887] bus_add_driver+0x29e/0x4d0 [ 13.383895] driver_register+0x1a5/0x360 [ 13.383902] ? __pfx_hid_gyro_3d_platform_driver_init+0x10/0x10 [hid_sensor_gyro_3d 63da36a143b775846ab2dbb86c343b401b5e3172] [ 13.383910] do_one_initcall+0xa7/0x380 [ 13.383919] ? __pfx_do_one_initcall+0x10/0x10 [ 13.383927] ? srso_alias_return_thunk+0x5/0xfbef5 [ 13.383933] ? kasan_unpoison+0x44/0x70 [ 13.383943] do_init_module+0x238/0x750 [ 13.383955] load_module+0x5011/0x6af0 [ 13.383962] ? kasan_save_stack+0x30/0x50 [ 13.383968] ? kasan_save_track+0x14/0x30 [ 13.383973] ? kasan_save_free_info+0x3b/0x60 [ 13.383980] ? poison_slab_object+0x109/0x180 [ 13.383993] ? __pfx_load_module+0x10/0x10 [ 13.384007] ? poison_slab_object+0x109/0x180 [ 13.384012] ? srso_alias_return_thunk+0x5/0xfbef5 [ 13.384018] ? init_module_from_file+0x13d/0x150 [ 13.384025] ? srso_alias_return_thunk+0x5/0xfbef5 [ 13.384032] ? init_module_from_file+0xdf/0x150 [ 13.384037] init_module_from_file+0xdf/0x150 [ 13.384044] ? __pfx_init_module_from_file+0x10/0x10 [ 13.384050] ? kasan_save_track+0x14/0x30 [ 13.384055] ? srso_alias_return_thunk+0x5/0xfbef5 [ 13.384060] ? kasan_save_free_info+0x3b/0x60 [ 13.384066] ? poison_slab_object+0x109/0x180 [ 13.384071] ? srso_alias_return_thunk+0x5/0xfbef5 [ 13.384080] ? srso_alias_return_thunk+0x5/0xfbef5 [ 13.384085] ? _raw_spin_lock+0x85/0xe0 [ 13.384091] ? __pfx__raw_spin_lock+0x10/0x10 [ 13.384096] ? __rseq_handle_notify_resume+0x1a6/0xad0 [ 13.384106] idempotent_init_module+0x23b/0x650 [ 13.384114] ? __pfx_idempotent_init_module+0x10/0x10 [ 13.384120] ? __pfx___seccomp_filter+0x10/0x10 [ 13.384129] ? srso_alias_return_thunk+0x5/0xfbef5 [ 13.384135] ? __fget_light+0x57/0x420 [ 13.384142] ? srso_alias_return_thunk+0x5/0xfbef5 [ 13.384147] ? security_capable+0x74/0xb0 [ 13.384157] __x64_sys_finit_module+0xbe/0x130 [ 13.384164] do_syscall_64+0x82/0x190 [ 13.384174] ? srso_alias_return_thunk+0x5/0xfbef5 [ 13.384179] ? irqtime_account_irq+0x3d/0x1f0 [ 13.384188] ? srso_alias_return_thunk+0x5/0xfbef5 [ 13.384193] ? __irq_exit_rcu+0x4e/0x130 [ 13.384201] ? 
srso_alias_return_thunk+0x5/0xfbef5 [ 13.384206] entry_SYSCALL_64_after_hwframe+0x76/0x7e [ 13.384212] RIP: 0033:0x7a21f96ade9d [ 13.384263] Code: ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 63 de 0c 00 f7 d8 64 89 01 48 [ 13.384267] RSP: 002b:00007ffeae934e78 EFLAGS: 00000246 ORIG_RAX: 0000000000000139 [ 13.384273] RAX: ffffffffffffffda RBX: 00005987276bfcf0 RCX: 00007a21f96ade9d [ 13.384277] RDX: 0000000000000004 RSI: 00007a21f8eda376 RDI: 000000000000001c [ 13.384280] RBP: 00007a21f8eda376 R08: 0000000000000001 R09: 00007ffeae934ec0 [ 13.384284] R10: 0000000000000050 R11: 0000000000000246 R12: 0000000000020000 [ 13.384288] R13: 00005987276c1250 R14: 0000000000000000 R15: 00005987276c4530 [ 13.384297] </TASK> [ 13.384299] Modules linked in: soundwire_amd(+) hid_sensor_gyro_3d(+) hid_sensor_magn_3d hid_sensor_accel_3d soundwire_generic_allocation amdxcp hid_sensor_trigger drm_exec industrialio_triggered_buffer soundwire_bus gpu_sched kvm_amd kfifo_buf qmi_helpers joydev drm_buddy hid_sensor_iio_common mousedev snd_soc_core industrialio i2c_algo_bit mac80211 snd_compress drm_suballoc_helper kvm snd_hda_intel drm_ttm_helper ac97_bus snd_pcm_dmaengine snd_intel_dspcfg ttm thinkpad_acpi(+) snd_intel_sdw_acpi hid_sensor_hub snd_rpl_pci_acp6x drm_display_helper snd_hda_codec hid_multitouch libarc4 snd_acp_pci platform_profile think_lmi(+) hid_generic firmware_attributes_class wmi_bmof cec snd_acp_legacy_common sparse_keymap rapl snd_hda_core psmouse cfg80211 pcspkr snd_pci_acp6x snd_hwdep video snd_pcm snd_pci_acp5x snd_timer snd_rn_pci_acp3x ucsi_acpi snd_acp_config snd sp5100_tco rfkill snd_soc_acpi typec_ucsi thunderbolt amd_sfh k10temp mhi soundcore i2c_piix4 snd_pci_acp3x typec i2c_hid_acpi roles i2c_hid wmi acpi_tad amd_pmc [ 13.384454] mac_hid i2c_dev crypto_user loop nfnetlink zram ip_tables x_tables dm_crypt cbc encrypted_keys trusted asn1_encoder tee dm_mod crct10dif_pclmul crc32_pclmul polyval_clmulni polyval_generic gf128mul ghash_clmulni_intel serio_raw sha512_ssse3 atkbd sha256_ssse3 libps2 sha1_ssse3 vivaldi_fmap nvme aesni_intel crypto_simd nvme_core cryptd ccp xhci_pci i8042 nvme_auth xhci_pci_renesas serio vfat fat btrfs blake2b_generic libcrc32c crc32c_generic crc32c_intel xor raid6_pq [ 13.384552] ---[ end trace 0000000000000000 ]---
KASAN reports a use-after-free of hid->driver_data in the function amd_sfh_get_report(). The backtrace indicates that the function is called by amdtp_hid_request(), which is one of the callbacks of the hid device. This change makes sure that driver_data is freed only once hid_destroy_device() has returned.
Note that I observed the crash both on v6.9.9 and v6.10.0. The code seems to have been like this since the early days of the driver.
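In isolation, the teardown ordering the fix establishes looks like this (sketch only; the diff below is the authoritative change):

        struct amdtp_hid_data *hid_data;

        /* Callbacks such as amdtp_hid_request() may still run and touch
         * driver_data until hid_destroy_device() has returned. */
        hid_data = hid->driver_data;
        hid_destroy_device(hid);
        kfree(hid_data);        /* only now is it safe to free */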
Signed-off-by: Olivier Sobrie olivier@sobrie.be Acked-by: Basavaraj Natikar Basavaraj.Natikar@amd.com Signed-off-by: Jiri Kosina jkosina@suse.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/hid/amd-sfh-hid/amd_sfh_hid.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/hid/amd-sfh-hid/amd_sfh_hid.c b/drivers/hid/amd-sfh-hid/amd_sfh_hid.c index 1b18291fc5af..d682b99c25b1 100644 --- a/drivers/hid/amd-sfh-hid/amd_sfh_hid.c +++ b/drivers/hid/amd-sfh-hid/amd_sfh_hid.c @@ -171,11 +171,13 @@ int amdtp_hid_probe(u32 cur_hid_dev, struct amdtp_cl_data *cli_data) void amdtp_hid_remove(struct amdtp_cl_data *cli_data) { int i; + struct amdtp_hid_data *hid_data;
for (i = 0; i < cli_data->num_hid_devices; ++i) { if (cli_data->hid_sensor_hubs[i]) { - kfree(cli_data->hid_sensor_hubs[i]->driver_data); + hid_data = cli_data->hid_sensor_hubs[i]->driver_data; hid_destroy_device(cli_data->hid_sensor_hubs[i]); + kfree(hid_data); cli_data->hid_sensor_hubs[i] = NULL; } }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dmitry Torokhov dmitry.torokhov@gmail.com
[ Upstream commit 206f533a0a7c683982af473079c4111f4a0f9f5e ]
When exercising the uinput interface, syzkaller may try setting up a device with a really large number of slots, which causes a memory allocation failure in input_mt_init_slots(). While this allocation failure is handled properly and the request is rejected, it results in syzkaller reports. Additionally, such a request may put undue burden on the system, which will try to free a lot of memory for a bogus request.
Fix it by limiting the allowed number of slots to 100. This can easily be extended if we see devices that can track more than 100 contacts.
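A hedged userspace sketch of a request that is now rejected (the ioctl and struct come from <linux/uinput.h>; the slot count and the uinput_fd name are illustrative only):

        struct uinput_abs_setup abs_setup = {
                .code = ABS_MT_SLOT,
                .absinfo = {
                        .minimum = 0,
                        .maximum = 1 << 20,     /* far above the new limit of 99 */
                },
        };

        /* With the limit in place this fails with -EINVAL instead of
         * attempting a huge input_mt_init_slots() allocation. */
        if (ioctl(uinput_fd, UI_ABS_SETUP, &abs_setup) < 0)
                perror("UI_ABS_SETUP");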
Reported-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp Reported-by: syzbot syzbot+0122fa359a69694395d5@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=0122fa359a69694395d5 Link: https://lore.kernel.org/r/Zqgi7NYEbpRsJfa2@google.com Signed-off-by: Dmitry Torokhov dmitry.torokhov@gmail.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/input/misc/uinput.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+)
diff --git a/drivers/input/misc/uinput.c b/drivers/input/misc/uinput.c index f2593133e524..790db3ceb208 100644 --- a/drivers/input/misc/uinput.c +++ b/drivers/input/misc/uinput.c @@ -416,6 +416,20 @@ static int uinput_validate_absinfo(struct input_dev *dev, unsigned int code, return -EINVAL; }
+ /* + * Limit number of contacts to a reasonable value (100). This + * ensures that we need less than 2 pages for struct input_mt + * (we are not using in-kernel slot assignment so not going to + * allocate memory for the "red" table), and we should have no + * trouble getting this much memory. + */ + if (code == ABS_MT_SLOT && max > 99) { + printk(KERN_DEBUG + "%s: unreasonably large number of slots requested: %d\n", + UINPUT_NAME, max); + return -EINVAL; + } + return 0; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Oliver Neukum oneukum@suse.com
[ Upstream commit e5876b088ba03a62124266fa20d00e65533c7269 ]
ipheth_sndbulk_callback() can submit carrier_work as a part of its error handling. That means the driver must cancel the work only after it has made sure that no more URBs can terminate with an error condition.
Hence the order of actions in ipheth_close() needs to be inverted.
Signed-off-by: Oliver Neukum oneukum@suse.com Signed-off-by: Foster Snowhill forst@pen.gy Tested-by: Georgi Valkov gvalkov@gmail.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/usb/ipheth.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/usb/ipheth.c b/drivers/net/usb/ipheth.c index 6a769df0b421..13381d87eeb0 100644 --- a/drivers/net/usb/ipheth.c +++ b/drivers/net/usb/ipheth.c @@ -353,8 +353,8 @@ static int ipheth_close(struct net_device *net) { struct ipheth_device *dev = netdev_priv(net);
- cancel_delayed_work_sync(&dev->carrier_work); netif_stop_queue(net); + cancel_delayed_work_sync(&dev->carrier_work); return 0; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Phillip Lougher phillip@squashfs.org.uk
[ Upstream commit 810ee43d9cd245d138a2733d87a24858a23f577d ]
Syzkaller reports a "KMSAN: uninit-value in pick_link" bug.
This is caused by an uninitialised page, which is ultimately caused by a corrupted symbolic link size read from disk.
The reason why the corrupted symlink size causes an uninitialised page is due to the following sequence of events:
1. squashfs_read_inode() is called to read the symbolic link from disk. This assigns the corrupted value 3875536935 to inode->i_size.
2. Later squashfs_symlink_read_folio() is called, which assigns this corrupted value to the length variable, which, being a signed int, overflows, producing a negative number.
3. The following loop that fills in the page contents checks that the number of copied bytes is less than length, which, being negative, means the loop is skipped, producing an uninitialised page.
This patch adds a sanity check that the symbolic link size is not larger than expected.
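The truncation described above can be demonstrated in isolation (a standalone userspace demo of the arithmetic, not kernel code):

        #include <stdio.h>

        int main(void)
        {
                long long i_size = 3875536935LL; /* corrupted symlink_size from disk */
                int length = i_size;             /* wraps to a negative value */
                int copied = 0;

                printf("length = %d\n", length); /* negative */
                while (copied < length)          /* never true: loop is skipped */
                        copied++;
                printf("copied = %d bytes, page left uninitialised\n", copied);
                return 0;
        }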
--
Signed-off-by: Phillip Lougher phillip@squashfs.org.uk Link: https://lore.kernel.org/r/20240811232821.13903-1-phillip@squashfs.org.uk Reported-by: Lizhi Xu lizhi.xu@windriver.com Reported-by: syzbot+24ac24ff58dc5b0d26b9@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/000000000000a90e8c061e86a76b@google.com/ V2: fix spelling mistake. Signed-off-by: Christian Brauner brauner@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- fs/squashfs/inode.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/fs/squashfs/inode.c b/fs/squashfs/inode.c index 24463145b351..f31649080a88 100644 --- a/fs/squashfs/inode.c +++ b/fs/squashfs/inode.c @@ -276,8 +276,13 @@ int squashfs_read_inode(struct inode *inode, long long ino) if (err < 0) goto failed_read;
- set_nlink(inode, le32_to_cpu(sqsh_ino->nlink)); inode->i_size = le32_to_cpu(sqsh_ino->symlink_size); + if (inode->i_size > PAGE_SIZE) { + ERROR("Corrupted symlink\n"); + return -EINVAL; + } + + set_nlink(inode, le32_to_cpu(sqsh_ino->nlink)); inode->i_op = &squashfs_symlink_inode_ops; inode_nohighmem(inode); inode->i_data.a_ops = &squashfs_symlink_aops;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stefan Wiehler stefan.wiehler@nokia.com
[ Upstream commit b739dffa5d570b411d4bdf4bb9b8dfd6b7d72305 ]
When of_irq_parse_raw() is invoked with a device address smaller than the interrupt parent node (from #address-cells property), KASAN detects the following out-of-bounds read when populating the initial match table (dyndbg="func of_irq_parse_* +p"):
OF: of_irq_parse_one: dev=/soc@0/picasso/watchdog, index=0 OF: parent=/soc@0/pci@878000000000/gpio0@17,0, intsize=2 OF: intspec=4 OF: of_irq_parse_raw: ipar=/soc@0/pci@878000000000/gpio0@17,0, size=2 OF: -> addrsize=3 ================================================================== BUG: KASAN: slab-out-of-bounds in of_irq_parse_raw+0x2b8/0x8d0 Read of size 4 at addr ffffff81beca5608 by task bash/764
CPU: 1 PID: 764 Comm: bash Tainted: G O 6.1.67-484c613561-nokia_sm_arm64 #1 Hardware name: Unknown Unknown Product/Unknown Product, BIOS 2023.01-12.24.03-dirty 01/01/2023 Call trace: dump_backtrace+0xdc/0x130 show_stack+0x1c/0x30 dump_stack_lvl+0x6c/0x84 print_report+0x150/0x448 kasan_report+0x98/0x140 __asan_load4+0x78/0xa0 of_irq_parse_raw+0x2b8/0x8d0 of_irq_parse_one+0x24c/0x270 parse_interrupts+0xc0/0x120 of_fwnode_add_links+0x100/0x2d0 fw_devlink_parse_fwtree+0x64/0xc0 device_add+0xb38/0xc30 of_device_add+0x64/0x90 of_platform_device_create_pdata+0xd0/0x170 of_platform_bus_create+0x244/0x600 of_platform_notify+0x1b0/0x254 blocking_notifier_call_chain+0x9c/0xd0 __of_changeset_entry_notify+0x1b8/0x230 __of_changeset_apply_notify+0x54/0xe4 of_overlay_fdt_apply+0xc04/0xd94 ...
The buggy address belongs to the object at ffffff81beca5600 which belongs to the cache kmalloc-128 of size 128 The buggy address is located 8 bytes inside of 128-byte region [ffffff81beca5600, ffffff81beca5680)
The buggy address belongs to the physical page: page:00000000230d3d03 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1beca4 head:00000000230d3d03 order:1 compound_mapcount:0 compound_pincount:0 flags: 0x8000000000010200(slab|head|zone=2) raw: 8000000000010200 0000000000000000 dead000000000122 ffffff810000c300 raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000 page dumped because: kasan: bad access detected
Memory state around the buggy address: ffffff81beca5500: 04 fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffffff81beca5580: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffffff81beca5600: 00 fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
^ ffffff81beca5680: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc ffffff81beca5700: 00 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc ================================================================== OF: -> got it !
Prevent the out-of-bounds read by copying the device address into a buffer of sufficient size.
Signed-off-by: Stefan Wiehler stefan.wiehler@nokia.com Link: https://lore.kernel.org/r/20240812100652.3800963-1-stefan.wiehler@nokia.com Signed-off-by: Rob Herring (Arm) robh@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/of/irq.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/drivers/of/irq.c b/drivers/of/irq.c index 88c24d88c4b9..a8e306606c4b 100644 --- a/drivers/of/irq.c +++ b/drivers/of/irq.c @@ -344,7 +344,8 @@ int of_irq_parse_one(struct device_node *device, int index, struct of_phandle_ar struct device_node *p; const __be32 *addr; u32 intsize; - int i, res; + int i, res, addr_len; + __be32 addr_buf[3] = { 0 };
pr_debug("of_irq_parse_one: dev=%pOF, index=%d\n", device, index);
@@ -353,13 +354,19 @@ int of_irq_parse_one(struct device_node *device, int index, struct of_phandle_ar return of_irq_parse_oldworld(device, index, out_irq);
/* Get the reg property (if any) */ - addr = of_get_property(device, "reg", NULL); + addr = of_get_property(device, "reg", &addr_len); + + /* Prevent out-of-bounds read in case of longer interrupt parent address size */ + if (addr_len > (3 * sizeof(__be32))) + addr_len = 3 * sizeof(__be32); + if (addr) + memcpy(addr_buf, addr, addr_len);
/* Try the new-style interrupts-extended first */ res = of_parse_phandle_with_args(device, "interrupts-extended", "#interrupt-cells", index, out_irq); if (!res) - return of_irq_parse_raw(addr, out_irq); + return of_irq_parse_raw(addr_buf, out_irq);
/* Look for the interrupt parent. */ p = of_irq_find_parent(device); @@ -389,7 +396,7 @@ int of_irq_parse_one(struct device_node *device, int index, struct of_phandle_ar
/* Check if there are any interrupt-map translations to process */ - res = of_irq_parse_raw(addr, out_irq); + res = of_irq_parse_raw(addr_buf, out_irq); out: of_node_put(p); return res;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Kent Overstreet kent.overstreet@linux.dev
[ Upstream commit b2f11c6f3e1fc60742673b8675c95b78447f3dae ]
If we need to increase the tree depth, allocate a new node, and then race with another thread that increased the tree depth before us, we'll still have a preallocated node that might be used later.
If we then use that node for a new non-root node, it'll still have a pointer to the old root instead of being zeroed - fix this by zeroing it in the cmpxchg failure path.
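Spelled out, the race and the failure-path fix look like this (a sketch of the cmpxchg site with the scenario in the comments):

        /* Thread A preallocates new_node whose children[0] points at the old
         * root; if thread B installs a deeper root first, A's cmpxchg fails
         * and new_node is kept as a spare, so the stale pointer must be
         * cleared before the node can be reused as a non-root node. */
        if ((v = cmpxchg_release(&radix->root, r, new_root)) == r) {
                v = new_root;
                new_node = NULL;                /* consumed as the new root */
        } else {
                new_node->children[0] = NULL;   /* lost the race: clear it */
        }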
Signed-off-by: Kent Overstreet kent.overstreet@linux.dev Signed-off-by: Sasha Levin sashal@kernel.org --- lib/generic-radix-tree.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/lib/generic-radix-tree.c b/lib/generic-radix-tree.c index 7dfa88282b00..78f081d695d0 100644 --- a/lib/generic-radix-tree.c +++ b/lib/generic-radix-tree.c @@ -131,6 +131,8 @@ void *__genradix_ptr_alloc(struct __genradix *radix, size_t offset, if ((v = cmpxchg_release(&radix->root, r, new_root)) == r) { v = new_root; new_node = NULL; + } else { + new_node->children[0] = NULL; } }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jiaxun Yang jiaxun.yang@flygoat.com
[ Upstream commit 50f2b98dc83de7809a5c5bf0ccf9af2e75c37c13 ]
This avoids warning:
[ 0.118053] BUG: sleeping function called from invalid context at kernel/locking/mutex.c:283
Caused by calling get_c0_compare_int() on a secondary CPU.

We also skip saving the IRQ number to struct clock_event_device *cd, as it's never used by the clockevent core; as per the comments, it's only meant for "non CPU local devices".
Reported-by: Serge Semin fancer.lancer@gmail.com Closes: https://lore.kernel.org/linux-mips/6szkkqxpsw26zajwysdrwplpjvhl5abpnmxgu2xuj... Signed-off-by: Jiaxun Yang jiaxun.yang@flygoat.com Reviewed-by: Philippe Mathieu-Daudé philmd@linaro.org Reviewed-by: Serge Semin fancer.lancer@gmail.com Tested-by: Serge Semin fancer.lancer@gmail.com Signed-off-by: Thomas Bogendoerfer tsbogend@alpha.franken.de Signed-off-by: Sasha Levin sashal@kernel.org --- arch/mips/kernel/cevt-r4k.c | 15 +++++++-------- 1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/arch/mips/kernel/cevt-r4k.c b/arch/mips/kernel/cevt-r4k.c index 32ec67c9ab67..77028aa8c107 100644 --- a/arch/mips/kernel/cevt-r4k.c +++ b/arch/mips/kernel/cevt-r4k.c @@ -303,13 +303,6 @@ int r4k_clockevent_init(void) if (!c0_compare_int_usable()) return -ENXIO;
- /* - * With vectored interrupts things are getting platform specific. - * get_c0_compare_int is a hook to allow a platform to return the - * interrupt number of its liking. - */ - irq = get_c0_compare_int(); - cd = &per_cpu(mips_clockevent_device, cpu);
cd->name = "MIPS"; @@ -320,7 +313,6 @@ int r4k_clockevent_init(void) min_delta = calculate_min_delta();
cd->rating = 300; - cd->irq = irq; cd->cpumask = cpumask_of(cpu); cd->set_next_event = mips_next_event; cd->event_handler = mips_event_handler; @@ -332,6 +324,13 @@ int r4k_clockevent_init(void)
cp0_timer_irq_installed = 1;
+ /* + * With vectored interrupts things are getting platform specific. + * get_c0_compare_int is a hook to allow a platform to return the + * interrupt number of its liking. + */ + irq = get_c0_compare_int(); + if (request_irq(irq, c0_compare_interrupt, flags, "timer", c0_compare_interrupt)) pr_err("Failed to request irq %d (timer)\n", irq);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Ellerman mpe@ellerman.id.au
[ Upstream commit d4bc0a264fb482b019c84fbc7202dd3cab059087 ]
The overflow/underflow conditions in pata_macio_qc_prep() should never happen. But if they do, there's no need to kill the system entirely; a WARN and failing the IO request should be sufficient and might allow the system to keep running.
Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Damien Le Moal dlemoal@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/ata/pata_macio.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/ata/pata_macio.c b/drivers/ata/pata_macio.c index 9ccaac9e2bc3..f19ede2965a1 100644 --- a/drivers/ata/pata_macio.c +++ b/drivers/ata/pata_macio.c @@ -540,7 +540,8 @@ static enum ata_completion_errors pata_macio_qc_prep(struct ata_queued_cmd *qc)
while (sg_len) { /* table overflow should never happen */ - BUG_ON (pi++ >= MAX_DCMDS); + if (WARN_ON_ONCE(pi >= MAX_DCMDS)) + return AC_ERR_SYSTEM;
len = (sg_len < MAX_DBDMA_SEG) ? sg_len : MAX_DBDMA_SEG; table->command = cpu_to_le16(write ? OUTPUT_MORE: INPUT_MORE); @@ -552,11 +553,13 @@ static enum ata_completion_errors pata_macio_qc_prep(struct ata_queued_cmd *qc) addr += len; sg_len -= len; ++table; + ++pi; } }
/* Should never happen according to Tejun */ - BUG_ON(!pi); + if (WARN_ON_ONCE(!pi)) + return AC_ERR_SYSTEM;
/* Convert the last command to an input/output */ table--;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Trond Myklebust trond.myklebust@hammerspace.com
[ Upstream commit a017ad1313fc91bdf235097fd0a02f673fc7bb11 ]
We're seeing reports of soft lockups when iterating through the loops, so let's add rescheduling points.
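The general pattern being applied (a hedged illustration, not the exact NFS iterator, which also drops and re-takes the RCU read lock around the callback; 'item', 'long_list' and process() are hypothetical names):

        list_for_each_entry(item, &long_list, node) {
                ret = process(item);
                if (ret)
                        break;
                cond_resched();         /* avoid soft-lockup reports on long walks */
        }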
Signed-off-by: Trond Myklebust trond.myklebust@hammerspace.com Reviewed-by: Jeff Layton jlayton@kernel.org Signed-off-by: Anna Schumaker Anna.Schumaker@Netapp.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/nfs/super.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/fs/nfs/super.c b/fs/nfs/super.c index 05ae23657527..f7b4df29ac5f 100644 --- a/fs/nfs/super.c +++ b/fs/nfs/super.c @@ -47,6 +47,7 @@ #include <linux/vfs.h> #include <linux/inet.h> #include <linux/in6.h> +#include <linux/sched.h> #include <linux/slab.h> #include <net/ipv6.h> #include <linux/netdevice.h> @@ -219,6 +220,7 @@ static int __nfs_list_for_each_server(struct list_head *head, ret = fn(server, data); if (ret) goto out; + cond_resched(); rcu_read_lock(); } rcu_read_unlock();
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: "Matthieu Baerts (NGI0)" matttbe@kernel.org
By accident, some patches modifying the MPTCP selftests have been applied twice, using different versions of the patch [1].
These patches have been dropped, but it looks like quilt incorrectly handled that by placing the new subtests in the wrong place: in userspace_tests() instead of endpoint_tests(). That caused a few other patches not to apply properly.

To avoid having to revert and re-apply patches, this issue can be fixed by moving some code around.
Link: https://lore.kernel.org/fc21db4a-508d-41db-aa45-e3bc06d18ce7@kernel.org [1] Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/net/mptcp/mptcp_join.sh | 138 ++++++++++++------------ 1 file changed, 69 insertions(+), 69 deletions(-)
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh @@ -3236,75 +3236,6 @@ userspace_tests() chk_join_nr 1 1 1 chk_rm_nr 0 1 fi - - # remove and re-add - if reset "delete re-add signal" && - mptcp_lib_kallsyms_has "subflow_rebuild_header$"; then - pm_nl_set_limits $ns1 0 3 - pm_nl_set_limits $ns2 3 3 - pm_nl_add_endpoint $ns1 10.0.2.1 id 1 flags signal - # broadcast IP: no packet for this address will be received on ns1 - pm_nl_add_endpoint $ns1 224.0.0.1 id 2 flags signal - pm_nl_add_endpoint $ns1 10.0.1.1 id 42 flags signal - run_tests $ns1 $ns2 10.0.1.1 4 0 0 speed_20 2>/dev/null & - local tests_pid=$! - - wait_mpj $ns2 - chk_subflow_nr needtitle "before delete" 2 - - pm_nl_del_endpoint $ns1 1 10.0.2.1 - pm_nl_del_endpoint $ns1 2 224.0.0.1 - sleep 0.5 - chk_subflow_nr "" "after delete" 1 - - pm_nl_add_endpoint $ns1 10.0.2.1 id 1 flags signal - pm_nl_add_endpoint $ns1 10.0.3.1 id 2 flags signal - wait_mpj $ns2 - chk_subflow_nr "" "after re-add" 3 - - pm_nl_del_endpoint $ns1 42 10.0.1.1 - sleep 0.5 - chk_subflow_nr "" "after delete ID 0" 2 - - pm_nl_add_endpoint $ns1 10.0.1.1 id 99 flags signal - wait_mpj $ns2 - chk_subflow_nr "" "after re-add" 3 - kill_tests_wait - - chk_join_nr 4 4 4 - chk_add_nr 5 5 - chk_rm_nr 3 2 invert - fi - - # flush and re-add - if reset_with_tcp_filter "flush re-add" ns2 10.0.3.2 REJECT OUTPUT && - mptcp_lib_kallsyms_has "subflow_rebuild_header$"; then - pm_nl_set_limits $ns1 0 2 - pm_nl_set_limits $ns2 1 2 - # broadcast IP: no packet for this address will be received on ns1 - pm_nl_add_endpoint $ns1 224.0.0.1 id 2 flags signal - pm_nl_add_endpoint $ns2 10.0.3.2 id 3 flags subflow - run_tests $ns1 $ns2 10.0.1.1 4 0 0 speed_20 2>/dev/null & - local tests_pid=$! - - wait_attempt_fail $ns2 - chk_subflow_nr needtitle "before flush" 1 - - pm_nl_flush_endpoint $ns2 - pm_nl_flush_endpoint $ns1 - wait_rm_addr $ns2 0 - ip netns exec "${ns2}" ${iptables} -D OUTPUT -s "10.0.3.2" -p tcp -j REJECT - pm_nl_add_endpoint $ns2 10.0.3.2 id 3 flags subflow - wait_mpj $ns2 - pm_nl_add_endpoint $ns1 10.0.3.1 id 2 flags signal - wait_mpj $ns2 - kill_wait "${tests_pid}" - kill_tests_wait - - chk_join_nr 2 2 2 - chk_add_nr 2 2 - chk_rm_nr 1 0 invert - fi }
endpoint_tests() @@ -3375,6 +3306,75 @@ endpoint_tests() chk_join_nr 6 6 6 chk_rm_nr 4 4 fi + + # remove and re-add + if reset "delete re-add signal" && + mptcp_lib_kallsyms_has "subflow_rebuild_header$"; then + pm_nl_set_limits $ns1 0 3 + pm_nl_set_limits $ns2 3 3 + pm_nl_add_endpoint $ns1 10.0.2.1 id 1 flags signal + # broadcast IP: no packet for this address will be received on ns1 + pm_nl_add_endpoint $ns1 224.0.0.1 id 2 flags signal + pm_nl_add_endpoint $ns1 10.0.1.1 id 42 flags signal + run_tests $ns1 $ns2 10.0.1.1 4 0 0 speed_20 2>/dev/null & + local tests_pid=$! + + wait_mpj $ns2 + chk_subflow_nr needtitle "before delete" 2 + + pm_nl_del_endpoint $ns1 1 10.0.2.1 + pm_nl_del_endpoint $ns1 2 224.0.0.1 + sleep 0.5 + chk_subflow_nr "" "after delete" 1 + + pm_nl_add_endpoint $ns1 10.0.2.1 id 1 flags signal + pm_nl_add_endpoint $ns1 10.0.3.1 id 2 flags signal + wait_mpj $ns2 + chk_subflow_nr "" "after re-add" 3 + + pm_nl_del_endpoint $ns1 42 10.0.1.1 + sleep 0.5 + chk_subflow_nr "" "after delete ID 0" 2 + + pm_nl_add_endpoint $ns1 10.0.1.1 id 99 flags signal + wait_mpj $ns2 + chk_subflow_nr "" "after re-add" 3 + kill_tests_wait + + chk_join_nr 4 4 4 + chk_add_nr 5 5 + chk_rm_nr 3 2 invert + fi + + # flush and re-add + if reset_with_tcp_filter "flush re-add" ns2 10.0.3.2 REJECT OUTPUT && + mptcp_lib_kallsyms_has "subflow_rebuild_header$"; then + pm_nl_set_limits $ns1 0 2 + pm_nl_set_limits $ns2 1 2 + # broadcast IP: no packet for this address will be received on ns1 + pm_nl_add_endpoint $ns1 224.0.0.1 id 2 flags signal + pm_nl_add_endpoint $ns2 10.0.3.2 id 3 flags subflow + run_tests $ns1 $ns2 10.0.1.1 4 0 0 speed_20 2>/dev/null & + local tests_pid=$! + + wait_attempt_fail $ns2 + chk_subflow_nr needtitle "before flush" 1 + + pm_nl_flush_endpoint $ns2 + pm_nl_flush_endpoint $ns1 + wait_rm_addr $ns2 0 + ip netns exec "${ns2}" ${iptables} -D OUTPUT -s "10.0.3.2" -p tcp -j REJECT + pm_nl_add_endpoint $ns2 10.0.3.2 id 3 flags subflow + wait_mpj $ns2 + pm_nl_add_endpoint $ns1 10.0.3.1 id 2 flags signal + wait_mpj $ns2 + kill_wait "${tests_pid}" + kill_tests_wait + + chk_join_nr 2 2 2 + chk_add_nr 2 2 + chk_rm_nr 1 0 invert + fi }
# [$1: error message]
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: "Matthieu Baerts (NGI0)" matttbe@kernel.org
commit 20ccc7c5f7a3aa48092441a4b182f9f40418392e upstream.
This test extends "delete and re-add" and "delete re-add signal" to validate the previous commit: the number of MPTCP events is checked to make sure there are no duplicated or unexpected ones.
A new helper has been introduced to easily check these events. The missing events have been added to the lib.
The 'Fixes' tag below is the same as the one from the previous commit: this patch is not fixing anything wrong in the selftests, but it validates the previous fix for an issue introduced by that commit ID.
Fixes: b911c97c7dc7 ("mptcp: add netlink event support") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau martineau@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Signed-off-by: Paolo Abeni pabeni@redhat.com [ Conflicts in mptcp_join.sh and mptcp_lib.sh, due to commit 38f027fca1b7 ("selftests: mptcp: dump userspace addrs list") -- linked to a new feature, not backportable to stable -- and commit 23a0485d1c04 ("selftests: mptcp: declare event macros in mptcp_lib") -- depending on the previous one -- not in this version. The conflicts in mptcp_join.sh were in the context, because a new helper had to be added after others that are not in this version. The conflicts in mptcp_lib.sh were due to the fact the other MPTCP_LIB_EVENT_* constants were not present. They have all been added in this version to ease future backports if any. In this version, it was also needed to import reset_with_events and kill_events_pids from the newer version, and adapt chk_evt_nr to how the results are printed in this version, plus remove the LISTENER events checks because the linked feature is not available in this kernel version. ] Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/net/mptcp/mptcp_join.sh | 90 +++++++++++++++++++++++- tools/testing/selftests/net/mptcp/mptcp_lib.sh | 15 ++++ 2 files changed, 104 insertions(+), 1 deletion(-)
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh @@ -394,6 +394,25 @@ reset_with_fail() fi }
+start_events() +{ + evts_ns1=$(mktemp) + evts_ns2=$(mktemp) + :> "$evts_ns1" + :> "$evts_ns2" + ip netns exec "${ns1}" ./pm_nl_ctl events >> "$evts_ns1" 2>&1 & + evts_ns1_pid=$! + ip netns exec "${ns2}" ./pm_nl_ctl events >> "$evts_ns2" 2>&1 & + evts_ns2_pid=$! +} + +reset_with_events() +{ + reset "${1}" || return 1 + + start_events +} + reset_with_tcp_filter() { reset "${1}" || return 1 @@ -596,6 +615,14 @@ kill_tests_wait() wait }
+kill_events_pids() +{ + kill_wait $evts_ns1_pid + evts_ns1_pid=0 + kill_wait $evts_ns2_pid + evts_ns2_pid=0 +} + pm_nl_set_limits() { local ns=$1 @@ -3143,6 +3170,32 @@ fail_tests() fi }
+# $1: ns ; $2: event type ; $3: count +chk_evt_nr() +{ + local ns=${1} + local evt_name="${2}" + local exp="${3}" + + local evts="${evts_ns1}" + local evt="${!evt_name}" + local count + + evt_name="${evt_name:16}" # without MPTCP_LIB_EVENT_ + [ "${ns}" == "ns2" ] && evts="${evts_ns2}" + + printf "%-${nr_blank}s %s" " " "event ${ns} ${evt_name} (${exp})" + + count=$(grep -cw "type:${evt}" "${evts}") + if [ "${count}" != "${exp}" ]; then + echo "[fail] got $count events, expected $exp" + fail_test + dump_stats + else + echo "[ ok ]" + fi +} + userspace_tests() { # userspace pm type prevents add_addr @@ -3265,11 +3318,13 @@ endpoint_tests()
if reset_with_tcp_filter "delete and re-add" ns2 10.0.3.2 REJECT OUTPUT && mptcp_lib_kallsyms_has "subflow_rebuild_header$"; then + start_events pm_nl_set_limits $ns1 0 3 pm_nl_set_limits $ns2 0 3 pm_nl_add_endpoint $ns2 10.0.1.2 id 1 dev ns2eth1 flags subflow pm_nl_add_endpoint $ns2 10.0.2.2 id 2 dev ns2eth2 flags subflow run_tests $ns1 $ns2 10.0.1.1 4 0 0 speed_5 2>/dev/null & + local tests_pid=$!
wait_mpj $ns2 pm_nl_del_endpoint $ns2 2 10.0.2.2 @@ -3301,14 +3356,30 @@ endpoint_tests() chk_subflow_nr "" "after re-add id 0 ($i)" 3 done
+ kill_wait "${tests_pid}" + kill_events_pids kill_tests_wait
+ chk_evt_nr ns1 MPTCP_LIB_EVENT_CREATED 1 + chk_evt_nr ns1 MPTCP_LIB_EVENT_ESTABLISHED 1 + chk_evt_nr ns1 MPTCP_LIB_EVENT_ANNOUNCED 0 + chk_evt_nr ns1 MPTCP_LIB_EVENT_REMOVED 4 + chk_evt_nr ns1 MPTCP_LIB_EVENT_SUB_ESTABLISHED 6 + chk_evt_nr ns1 MPTCP_LIB_EVENT_SUB_CLOSED 4 + + chk_evt_nr ns2 MPTCP_LIB_EVENT_CREATED 1 + chk_evt_nr ns2 MPTCP_LIB_EVENT_ESTABLISHED 1 + chk_evt_nr ns2 MPTCP_LIB_EVENT_ANNOUNCED 0 + chk_evt_nr ns2 MPTCP_LIB_EVENT_REMOVED 0 + chk_evt_nr ns2 MPTCP_LIB_EVENT_SUB_ESTABLISHED 6 + chk_evt_nr ns2 MPTCP_LIB_EVENT_SUB_CLOSED 5 # one has been closed before estab + chk_join_nr 6 6 6 chk_rm_nr 4 4 fi
# remove and re-add - if reset "delete re-add signal" && + if reset_with_events "delete re-add signal" && mptcp_lib_kallsyms_has "subflow_rebuild_header$"; then pm_nl_set_limits $ns1 0 3 pm_nl_set_limits $ns2 3 3 @@ -3339,8 +3410,25 @@ endpoint_tests() pm_nl_add_endpoint $ns1 10.0.1.1 id 99 flags signal wait_mpj $ns2 chk_subflow_nr "" "after re-add" 3 + + kill_wait "${tests_pid}" + kill_events_pids kill_tests_wait
+ chk_evt_nr ns1 MPTCP_LIB_EVENT_CREATED 1 + chk_evt_nr ns1 MPTCP_LIB_EVENT_ESTABLISHED 1 + chk_evt_nr ns1 MPTCP_LIB_EVENT_ANNOUNCED 0 + chk_evt_nr ns1 MPTCP_LIB_EVENT_REMOVED 0 + chk_evt_nr ns1 MPTCP_LIB_EVENT_SUB_ESTABLISHED 4 + chk_evt_nr ns1 MPTCP_LIB_EVENT_SUB_CLOSED 2 + + chk_evt_nr ns2 MPTCP_LIB_EVENT_CREATED 1 + chk_evt_nr ns2 MPTCP_LIB_EVENT_ESTABLISHED 1 + chk_evt_nr ns2 MPTCP_LIB_EVENT_ANNOUNCED 5 + chk_evt_nr ns2 MPTCP_LIB_EVENT_REMOVED 3 + chk_evt_nr ns2 MPTCP_LIB_EVENT_SUB_ESTABLISHED 4 + chk_evt_nr ns2 MPTCP_LIB_EVENT_SUB_CLOSED 2 + chk_join_nr 4 4 4 chk_add_nr 5 5 chk_rm_nr 3 2 invert --- a/tools/testing/selftests/net/mptcp/mptcp_lib.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_lib.sh @@ -4,6 +4,21 @@ readonly KSFT_FAIL=1 readonly KSFT_SKIP=4
+# These variables are used in some selftests, read-only +declare -rx MPTCP_LIB_EVENT_CREATED=1 # MPTCP_EVENT_CREATED +declare -rx MPTCP_LIB_EVENT_ESTABLISHED=2 # MPTCP_EVENT_ESTABLISHED +declare -rx MPTCP_LIB_EVENT_CLOSED=3 # MPTCP_EVENT_CLOSED +declare -rx MPTCP_LIB_EVENT_ANNOUNCED=6 # MPTCP_EVENT_ANNOUNCED +declare -rx MPTCP_LIB_EVENT_REMOVED=7 # MPTCP_EVENT_REMOVED +declare -rx MPTCP_LIB_EVENT_SUB_ESTABLISHED=10 # MPTCP_EVENT_SUB_ESTABLISHED +declare -rx MPTCP_LIB_EVENT_SUB_CLOSED=11 # MPTCP_EVENT_SUB_CLOSED +declare -rx MPTCP_LIB_EVENT_SUB_PRIORITY=13 # MPTCP_EVENT_SUB_PRIORITY +declare -rx MPTCP_LIB_EVENT_LISTENER_CREATED=15 # MPTCP_EVENT_LISTENER_CREATED +declare -rx MPTCP_LIB_EVENT_LISTENER_CLOSED=16 # MPTCP_EVENT_LISTENER_CLOSED + +declare -rx MPTCP_LIB_AF_INET=2 +declare -rx MPTCP_LIB_AF_INET6=10 + # SELFTESTS_MPTCP_LIB_EXPECT_ALL_FEATURES env var can be set when validating all # features using the last version of the kernel and the selftests to make sure # a test is not being skipped by mistake.
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: "Matthieu Baerts (NGI0)" matttbe@kernel.org
commit f18fa2abf81099d822d842a107f8c9889c86043c upstream.
This test extends "delete re-add signal" to validate the previous commit: when the 'signal' endpoint linked to the initial subflow (ID 0) is re-added multiple times, it will re-send the ADD_ADDR with id 0. The client should still be able to re-create this subflow, even if the add_addr_accepted limit has been reached as this special address is not considered as a new address.
The 'Fixes' tag here below is the same as the one from the previous commit: this patch here is not fixing anything wrong in the selftests, but it validates the previous fix for an issue introduced by this commit ID.
Fixes: d0876b2284cf ("mptcp: add the incoming RM_ADDR support") Cc: stable@vger.kernel.org Reviewed-by: Mat Martineau martineau@kernel.org Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Signed-off-by: Paolo Abeni pabeni@redhat.com [ Conflicts in mptcp_join.sh, because the helpers are different in this version: - run_tests has been modified a few times to reduce the number of positional parameters - no chk_mptcp_info helper - chk_subflow_nr taking an extra parameter - kill_tests_wait instead of mptcp_lib_kill_wait ] Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- tools/testing/selftests/net/mptcp/mptcp_join.sh | 32 +++++++++++++++--------- 1 file changed, 20 insertions(+), 12 deletions(-)
--- a/tools/testing/selftests/net/mptcp/mptcp_join.sh +++ b/tools/testing/selftests/net/mptcp/mptcp_join.sh @@ -3387,7 +3387,7 @@ endpoint_tests() # broadcast IP: no packet for this address will be received on ns1 pm_nl_add_endpoint $ns1 224.0.0.1 id 2 flags signal pm_nl_add_endpoint $ns1 10.0.1.1 id 42 flags signal - run_tests $ns1 $ns2 10.0.1.1 4 0 0 speed_20 2>/dev/null & + run_tests $ns1 $ns2 10.0.1.1 4 0 0 speed_5 2>/dev/null & local tests_pid=$!
wait_mpj $ns2 @@ -3409,7 +3409,15 @@ endpoint_tests()
pm_nl_add_endpoint $ns1 10.0.1.1 id 99 flags signal wait_mpj $ns2 - chk_subflow_nr "" "after re-add" 3 + chk_subflow_nr "" "after re-add ID 0" 3 + + pm_nl_del_endpoint $ns1 99 10.0.1.1 + sleep 0.5 + chk_subflow_nr "" "after re-delete ID 0" 2 + + pm_nl_add_endpoint $ns1 10.0.1.1 id 88 flags signal + wait_mpj $ns2 + chk_subflow_nr "" "after re-re-add ID 0" 3
kill_wait "${tests_pid}" kill_events_pids @@ -3419,19 +3427,19 @@ endpoint_tests() chk_evt_nr ns1 MPTCP_LIB_EVENT_ESTABLISHED 1 chk_evt_nr ns1 MPTCP_LIB_EVENT_ANNOUNCED 0 chk_evt_nr ns1 MPTCP_LIB_EVENT_REMOVED 0 - chk_evt_nr ns1 MPTCP_LIB_EVENT_SUB_ESTABLISHED 4 - chk_evt_nr ns1 MPTCP_LIB_EVENT_SUB_CLOSED 2 + chk_evt_nr ns1 MPTCP_LIB_EVENT_SUB_ESTABLISHED 5 + chk_evt_nr ns1 MPTCP_LIB_EVENT_SUB_CLOSED 3
chk_evt_nr ns2 MPTCP_LIB_EVENT_CREATED 1 chk_evt_nr ns2 MPTCP_LIB_EVENT_ESTABLISHED 1 - chk_evt_nr ns2 MPTCP_LIB_EVENT_ANNOUNCED 5 - chk_evt_nr ns2 MPTCP_LIB_EVENT_REMOVED 3 - chk_evt_nr ns2 MPTCP_LIB_EVENT_SUB_ESTABLISHED 4 - chk_evt_nr ns2 MPTCP_LIB_EVENT_SUB_CLOSED 2 - - chk_join_nr 4 4 4 - chk_add_nr 5 5 - chk_rm_nr 3 2 invert + chk_evt_nr ns2 MPTCP_LIB_EVENT_ANNOUNCED 6 + chk_evt_nr ns2 MPTCP_LIB_EVENT_REMOVED 4 + chk_evt_nr ns2 MPTCP_LIB_EVENT_SUB_ESTABLISHED 5 + chk_evt_nr ns2 MPTCP_LIB_EVENT_SUB_CLOSED 3 + + chk_join_nr 5 5 5 + chk_add_nr 6 6 + chk_rm_nr 4 3 invert fi
# flush and re-add
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jens Axboe axboe@kernel.dk
commit 01e68ce08a30db3d842ce7a55f7f6e0474a55f9a upstream.
Every now and then, reports come in from users who are puzzled about why changing affinity on the io-wq workers fails with EINVAL. This happens because they set PF_NO_SETAFFINITY as part of their creation, as io-wq organizes workers into groups based on what CPU they are running on.
However, this is purely an optimization and not a functional requirement. We can allow setting affinity, and just lazily update our worker-to-wqe mappings. If a given io-wq thread times out, it normally exits if there's no more work to do. The exception is if it's the last worker available. For the timeout case, check the affinity of the worker against the group mask and exit even if it's the last worker. New workers should be created with the right mask and in the right location.
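For illustration only, here is a hypothetical userspace sketch (not part of the patch) of what the reporters were doing: pinning an io-wq worker, whose thread ID has to be looked up manually (e.g. from /proc) and passed on the command line, to CPU 0. Before this change the call failed with EINVAL because the worker had PF_NO_SETAFFINITY set; with it, the request is accepted and io-wq updates its mappings lazily.

#define _GNU_SOURCE
#include <sched.h>
#include <sys/types.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv)
{
	cpu_set_t set;
	pid_t worker_tid;

	if (argc != 2) {
		fprintf(stderr, "usage: %s <io-wq worker tid>\n", argv[0]);
		return 1;
	}
	worker_tid = (pid_t)atoi(argv[1]);

	CPU_ZERO(&set);
	CPU_SET(0, &set);

	/* Used to fail with EINVAL on io-wq workers because of PF_NO_SETAFFINITY. */
	if (sched_setaffinity(worker_tid, sizeof(set), &set)) {
		fprintf(stderr, "sched_setaffinity: %s\n", strerror(errno));
		return 1;
	}
	printf("affinity changed\n");
	return 0;
}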
Reported-by:Daniel Dao dqminh@cloudflare.com Link: https://lore.kernel.org/io-uring/CA+wXwBQwgxB3_UphSny-yAP5b26meeOu1W4TwYVcD_... Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Felix Moessbauer felix.moessbauer@siemens.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- io_uring/io-wq.c | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-)
--- a/io_uring/io-wq.c +++ b/io_uring/io-wq.c @@ -628,7 +628,7 @@ static int io_wqe_worker(void *data) struct io_wqe_acct *acct = io_wqe_get_acct(worker); struct io_wqe *wqe = worker->wqe; struct io_wq *wq = wqe->wq; - bool last_timeout = false; + bool exit_mask = false, last_timeout = false; char buf[TASK_COMM_LEN];
worker->flags |= (IO_WORKER_F_UP | IO_WORKER_F_RUNNING); @@ -644,8 +644,11 @@ static int io_wqe_worker(void *data) io_worker_handle_work(worker);
raw_spin_lock(&wqe->lock); - /* timed out, exit unless we're the last worker */ - if (last_timeout && acct->nr_workers > 1) { + /* + * Last sleep timed out. Exit if we're not the last worker, + * or if someone modified our affinity. + */ + if (last_timeout && (exit_mask || acct->nr_workers > 1)) { acct->nr_workers--; raw_spin_unlock(&wqe->lock); __set_current_state(TASK_RUNNING); @@ -664,7 +667,11 @@ static int io_wqe_worker(void *data) continue; break; } - last_timeout = !ret; + if (!ret) { + last_timeout = true; + exit_mask = !cpumask_test_cpu(raw_smp_processor_id(), + wqe->cpu_mask); + } }
if (test_bit(IO_WQ_BIT_EXIT, &wq->state)) @@ -716,7 +723,6 @@ static void io_init_new_worker(struct io tsk->worker_private = worker; worker->task = tsk; set_cpus_allowed_ptr(tsk, wqe->cpu_mask); - tsk->flags |= PF_NO_SETAFFINITY;
raw_spin_lock(&wqe->lock); hlist_nulls_add_head_rcu(&worker->nulls_node, &wqe->free_list);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michal Koutný mkoutny@suse.com
commit a5fc1441af7719e93dc7a638a960befb694ade89 upstream.
Users may specify a CPU where the sqpoll thread would run. This may conflict with cpuset operations because of strict PF_NO_SETAFFINITY requirement. That flag is unnecessary for polling "kernel" threads, see the reasoning in commit 01e68ce08a30 ("io_uring/io-wq: stop setting PF_NO_SETAFFINITY on io-wq workers"). Drop the flag on poll threads too.
Fixes: 01e68ce08a30 ("io_uring/io-wq: stop setting PF_NO_SETAFFINITY on io-wq workers") Link: https://lore.kernel.org/all/20230314162559.pnyxdllzgw7jozgx@blackpad/ Signed-off-by: Michal Koutný mkoutny@suse.com Link: https://lore.kernel.org/r/20230314183332.25834-1-mkoutny@suse.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Felix Moessbauer felix.moessbauer@siemens.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- io_uring/sqpoll.c | 1 - 1 file changed, 1 deletion(-)
--- a/io_uring/sqpoll.c
+++ b/io_uring/sqpoll.c
@@ -233,7 +233,6 @@ static int io_sq_thread(void *data)
 		set_cpus_allowed_ptr(current, cpumask_of(sqd->sq_cpu));
 	else
 		set_cpus_allowed_ptr(current, cpu_online_mask);
-	current->flags |= PF_NO_SETAFFINITY;

 	/*
 	 * Force audit context to get setup, in case we do prep side async
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matthieu Baerts (NGI0) matttbe@kernel.org
commit c1668292689ad2ee16c9c1750a8044b0b0aad663 upstream.
The 'Fixes' commit recently changed the behaviour of TCP by skipping the processing of the 3rd ACK when a sk->sk_socket is set. The goal was to skip tcp_ack_snd_check() in tcp_rcv_state_process() not to send an unnecessary ACK in case of simultaneous connect(). Unfortunately, that had an impact on TFO and MPTCP.
I started to look at the impact on MPTCP, because the MPTCP CI found some issues with the MPTCP Packetdrill tests [1]. Then Paolo Abeni suggested that I also look at the impact on TFO with "plain" TCP.
For MPTCP, when receiving the 3rd ACK of a request adding a new path (MP_JOIN), sk->sk_socket will be set, and point to the MPTCP sock that has been created when the MPTCP connection got established before with the first path. The newly added 'goto' will then skip the processing of the segment text (step 7) and not go through tcp_data_queue() where the MPTCP options are validated, and some actions are triggered, e.g. sending the MPJ 4th ACK [2] as demonstrated by the new errors when running a packetdrill test [3] establishing a second subflow.
This doesn't fully break MPTCP; mainly, the 4th MPJ ACK will be delayed. Still, we don't want this behaviour, as it delays the switch to the fully established mode, and invalid MPTCP options in this 3rd ACK will no longer be caught. This modification also affects the MPTCP + TFO feature, and is the reason why the selftests started to be unstable over the last few days [4].
For TFO, the existing 'basic-cookie-not-reqd' test [5] was no longer passing: if the 3rd ACK contains data, and the connection is accept()ed before receiving them, these data would no longer be processed, and thus not ACKed.
One last thing about MPTCP, in case of simultaneous connect(), a fallback to TCP will be done, which seems fine:
`../common/defaults.sh`
   0 socket(..., SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_MPTCP) = 3
  +0 connect(3, ..., ...) = -1 EINPROGRESS (Operation now in progress)

  +0 > S  0:0(0)                 <mss 1460, sackOK, TS val 100 ecr 0, nop, wscale 8, mpcapable v1 flags[flag_h] nokey>
  +0 < S  0:0(0) win 1000        <mss 1460, sackOK, TS val 407 ecr 0, nop, wscale 8, mpcapable v1 flags[flag_h] nokey>
  +0 > S. 0:0(0) ack 1           <mss 1460, sackOK, TS val 330 ecr 0, nop, wscale 8, mpcapable v1 flags[flag_h] nokey>
  +0 < S. 0:0(0) ack 1 win 65535 <mss 1460, sackOK, TS val 700 ecr 100, nop, wscale 8, mpcapable v1 flags[flag_h] key[skey=2]>
  +0 > .  1:1(0) ack 1           <nop, nop, TS val 845707014 ecr 700, nop, nop, sack 0:1>
Simultaneous SYN-data crossing is also not supported by TFO, see [6].
Kuniyuki Iwashima suggested restricting the processing to SYN+ACK only: that's a more generic solution than the one initially proposed, and also enough to fix the issues described above.
Later on, Eric Dumazet mentioned that an ACK should still be sent in reaction to the second SYN+ACK that is received: not sending a DUPACK here seems wrong and could hurt:
   0 socket(..., SOCK_STREAM|SOCK_NONBLOCK, IPPROTO_TCP) = 3
  +0 connect(3, ..., ...) = -1 EINPROGRESS (Operation now in progress)

  +0 > S  0:0(0)               <mss 1460, sackOK, TS val 1000 ecr 0,nop,wscale 8>
  +0 < S  0:0(0) win 1000      <mss 1000, sackOK, nop, nop>
  +0 > S. 0:0(0) ack 1         <mss 1460, sackOK, TS val 3308134035 ecr 0,nop,wscale 8>
  +0 < S. 0:0(0) ack 1 win 1000 <mss 1000, sackOK, nop, nop>
  +0 > .  1:1(0) ack 1         <nop, nop, sack 0:1>   // <== Here
So in this version, the 'goto consume' is dropped, to always send an ACK when switching from TCP_SYN_RECV to TCP_ESTABLISHED. This ACK will be seen as a DUPACK -- with DSACK if SACK has been negotiated -- in case of simultaneous SYN crossing: that's what is expected here.
Link: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/9936227696 [1] Link: https://datatracker.ietf.org/doc/html/rfc8684#fig_tokens [2] Link: https://github.com/multipath-tcp/packetdrill/blob/mptcp-net-next/gtests/net/... [3] Link: https://netdev.bots.linux.dev/contest.html?executor=vmksft-mptcp-dbg&tes... [4] Link: https://github.com/google/packetdrill/blob/master/gtests/net/tcp/fastopen/se... [5] Link: https://github.com/google/packetdrill/blob/master/gtests/net/tcp/fastopen/cl... [6] Fixes: 23e89e8ee7be ("tcp: Don't drop SYN+ACK for simultaneous connect().") Suggested-by: Paolo Abeni pabeni@redhat.com Suggested-by: Kuniyuki Iwashima kuniyu@amazon.com Suggested-by: Eric Dumazet edumazet@google.com Signed-off-by: Matthieu Baerts (NGI0) matttbe@kernel.org Reviewed-by: Eric Dumazet edumazet@google.com Link: https://patch.msgid.link/20240724-upstream-net-next-20240716-tcp-3rd-ack-con... Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/ipv4/tcp_input.c | 3 --- 1 file changed, 3 deletions(-)
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -6641,9 +6641,6 @@ int tcp_rcv_state_process(struct sock *s
 		tcp_fast_path_on(tp);
 		if (sk->sk_shutdown & SEND_SHUTDOWN)
 			tcp_shutdown(sk, SEND_SHUTDOWN);
-
-		if (sk->sk_socket)
-			goto consume;
 		break;

 	case TCP_FIN_WAIT1: {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aleksandr Mishin amishin@t-argos.ru
commit b48aa991758999d4e8f9296c5bbe388f293ef465 upstream.
In ad9834_write_frequency() clk_get_rate() can return 0. In such a case the ad9834_calc_freqreg() call will lead to a division by zero. The check 'if (fout > (clk_freq / 2))' doesn't protect against the case where 'fout' is 0. ad9834_write_frequency() is called from ad9834_write(), where fout is taken from a text buffer, which can contain any value.
Modify the parameter checking.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
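A minimal userspace sketch, with made-up values and not the driver code itself, of why the old bounds check is insufficient: with clk_freq == 0 and fout == 0 the old condition does not reject the request, so the driver would go on to divide by clk_freq.

#include <stdio.h>

int main(void)
{
	unsigned long clk_freq = 0, fout = 0;	/* hypothetical worst case */

	/* old check: prints 0, i.e. the request is NOT rejected */
	printf("old check rejects: %d\n", fout > (clk_freq / 2));
	/* new check: prints 1, the zero clock rate is rejected up front */
	printf("new check rejects: %d\n", !clk_freq || fout > (clk_freq / 2));
	return 0;
}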
Fixes: 12b9d5bf76bf ("Staging: IIO: DDS: AD9833 / AD9834 driver") Suggested-by: Dan Carpenter dan.carpenter@linaro.org Signed-off-by: Aleksandr Mishin amishin@t-argos.ru Reviewed-by: Dan Carpenter dan.carpenter@linaro.org Link: https://patch.msgid.link/20240703154506.25584-1-amishin@t-argos.ru Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/staging/iio/frequency/ad9834.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/staging/iio/frequency/ad9834.c
+++ b/drivers/staging/iio/frequency/ad9834.c
@@ -114,7 +114,7 @@ static int ad9834_write_frequency(struct

 	clk_freq = clk_get_rate(st->mclk);

-	if (fout > (clk_freq / 2))
+	if (!clk_freq || fout > (clk_freq / 2))
 		return -EINVAL;

 	regval = ad9834_calc_freqreg(clk_freq, fout);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Lechner dlechner@baylibre.com
commit 84c65d8008764a8fb4e627ff02de01ec4245f2c4 upstream.
If dma_get_slave_caps() fails, we need to release the dma channel before returning an error to avoid leaking the channel.
Fixes: 2d6ca60f3284 ("iio: Add a DMAengine framework based buffer") Signed-off-by: David Lechner dlechner@baylibre.com Link: https://patch.msgid.link/20240723-iio-fix-dmaengine-free-on-error-v1-1-2c7cb... Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/buffer/industrialio-buffer-dmaengine.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/iio/buffer/industrialio-buffer-dmaengine.c +++ b/drivers/iio/buffer/industrialio-buffer-dmaengine.c @@ -180,7 +180,7 @@ static struct iio_buffer *iio_dmaengine_
ret = dma_get_slave_caps(chan, &caps); if (ret < 0) - goto err_free; + goto err_release;
/* Needs to be aligned to the maximum of the minimums */ if (caps.src_addr_widths) @@ -206,6 +206,8 @@ static struct iio_buffer *iio_dmaengine_
return &dmaengine_buffer->queue.buffer;
+err_release: + dma_release_channel(chan); err_free: kfree(dmaengine_buffer); return ERR_PTR(ret);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matteo Martelli matteomartelli3@gmail.com
commit 8a3dcc970dc57b358c8db2702447bf0af4e0d83a upstream.
When the scale_type is IIO_VAL_INT_PLUS_MICRO or IIO_VAL_INT_PLUS_NANO the scale passed as argument is only applied to the fractional part of the value. Fix it by also multiplying the integer part by the scale provided.
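A minimal userspace sketch of the arithmetic, using made-up numbers: a raw value of 2, a channel scale of 1.5 expressed as IIO_VAL_INT_PLUS_MICRO (scale_val = 1, scale_val2 = 500000) and a caller-provided scale of 1000. The processed value should be 2 * 1.5 * 1000 = 3000, but the old code only applied the scale to the fractional part.

#include <stdio.h>

int main(void)
{
	long long raw64 = 2, scale_val = 1, scale_val2 = 500000, scale = 1000;

	/* old code: 'scale' only applied to the fractional part -> 1002 */
	long long old_processed = raw64 * scale_val
				+ raw64 * scale_val2 * scale / 1000000;
	/* fixed code: 'scale' applied to the integer part as well -> 3000 */
	long long new_processed = raw64 * scale_val * scale
				+ raw64 * scale_val2 * scale / 1000000;

	printf("old=%lld new=%lld\n", old_processed, new_processed);
	return 0;
}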
Fixes: 48e44ce0f881 ("iio:inkern: Add function to read the processed value") Signed-off-by: Matteo Martelli matteomartelli3@gmail.com Link: https://patch.msgid.link/20240730-iio-fix-scale-v1-1-6246638c8daa@gmail.com Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/inkern.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/drivers/iio/inkern.c +++ b/drivers/iio/inkern.c @@ -679,17 +679,17 @@ static int iio_convert_raw_to_processed_ break; case IIO_VAL_INT_PLUS_MICRO: if (scale_val2 < 0) - *processed = -raw64 * scale_val; + *processed = -raw64 * scale_val * scale; else - *processed = raw64 * scale_val; + *processed = raw64 * scale_val * scale; *processed += div_s64(raw64 * (s64)scale_val2 * scale, 1000000LL); break; case IIO_VAL_INT_PLUS_NANO: if (scale_val2 < 0) - *processed = -raw64 * scale_val; + *processed = -raw64 * scale_val * scale; else - *processed = raw64 * scale_val; + *processed = raw64 * scale_val * scale; *processed += div_s64(raw64 * (s64)scale_val2 * scale, 1000000000LL); break;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dumitru Ceclan mitrutzceclan@gmail.com
commit 2f6b92d0f69f04d9e2ea0db1228ab7f82f3173af upstream.
The ad7124_find_similar_live_cfg() computes the compare size by subtracting the address of the cfg struct from the address of the live field. Because the live field is the first field in the struct, the result is 0.
Also, the memcmp() call is made from the start of the cfg struct, which includes the live and cfg_slot fields, which are not relevant for the comparison.
Fix by grouping the relevant fields with struct_group() and use the size of the group to compute the compare size; make the memcmp() call from the address of the group.
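For illustration, a minimal userspace sketch (hypothetical struct, not the driver's) of the zero compare size described above: because 'live' is the first member, the pointer difference used as the memcmp() length is always 0, so every "live" configuration compares as equal.

#include <stdio.h>

struct channel_config {
	int live;		/* first member, as in ad7124_channel_config */
	unsigned int cfg_slot;
	unsigned int vref_mv;	/* ... fields that should actually be compared ... */
};

int main(void)
{
	struct channel_config cfg;

	/* same expression shape as the old code: always prints cmp_size = 0 */
	printf("cmp_size = %td\n", (char *)&cfg.live - (char *)&cfg);
	return 0;
}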
Fixes: 7b8d045e497a ("iio: adc: ad7124: allow more than 8 channels") Signed-off-by: Dumitru Ceclan dumitru.ceclan@analog.com Reviewed-by: Nuno Sa nuno.sa@analog.com Link: https://patch.msgid.link/20240731-ad7124-fix-v1-2-46a76aa4b9be@analog.com Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/adc/ad7124.c | 26 +++++++++++++++----------- 1 file changed, 15 insertions(+), 11 deletions(-)
--- a/drivers/iio/adc/ad7124.c +++ b/drivers/iio/adc/ad7124.c @@ -146,15 +146,18 @@ struct ad7124_chip_info { struct ad7124_channel_config { bool live; unsigned int cfg_slot; - enum ad7124_ref_sel refsel; - bool bipolar; - bool buf_positive; - bool buf_negative; - unsigned int vref_mv; - unsigned int pga_bits; - unsigned int odr; - unsigned int odr_sel_bits; - unsigned int filter_type; + /* Following fields are used to compare equality. */ + struct_group(config_props, + enum ad7124_ref_sel refsel; + bool bipolar; + bool buf_positive; + bool buf_negative; + unsigned int vref_mv; + unsigned int pga_bits; + unsigned int odr; + unsigned int odr_sel_bits; + unsigned int filter_type; + ); };
struct ad7124_channel { @@ -333,11 +336,12 @@ static struct ad7124_channel_config *ad7 ptrdiff_t cmp_size; int i;
- cmp_size = (u8 *)&cfg->live - (u8 *)cfg; + cmp_size = sizeof_field(struct ad7124_channel_config, config_props); for (i = 0; i < st->num_channels; i++) { cfg_aux = &st->channels[i].cfg;
- if (cfg_aux->live && !memcmp(cfg, cfg_aux, cmp_size)) + if (cfg_aux->live && + !memcmp(&cfg->config_props, &cfg_aux->config_props, cmp_size)) return cfg_aux; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Guillaume Stols gstols@baylibre.com
commit 90826e08468ba7fb35d8b39645b22d9e80004afe upstream.
The current implementation attempts to recover from a possible glitch in the clock by checking the frstdata state after reading the first channel's sample: if frstdata is low, it will reset the chip and return -EIO.
This will only work in parallel mode, where frstdata pin is set low after the 2nd sample read starts.
For the serial mode, according to the datasheet, "The FRSTDATA output returns to a logic low following the 16th SCLK falling edge." Thus, after the Xth pulse, X being the number of bits in a sample, the check will always be true. As a result, the driver will not work at all in serial mode if frstdata (optional) is defined in the devicetree, as it will reset the chip and return -EIO every time read_sample is called.
Hence, this check must be removed for serial mode.
Fixes: b9618c0cacd7 ("staging: IIO: ADC: New driver for AD7606/AD7606-6/AD7606-4") Signed-off-by: Guillaume Stols gstols@baylibre.com Reviewed-by: Nuno Sa nuno.sa@analog.com Link: https://patch.msgid.link/20240702-cleanup-ad7606-v3-1-18d5ea18770e@baylibre.... Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/adc/ad7606.c | 28 +------------------------ drivers/iio/adc/ad7606.h | 2 + drivers/iio/adc/ad7606_par.c | 48 ++++++++++++++++++++++++++++++++++++++++--- 3 files changed, 49 insertions(+), 29 deletions(-)
--- a/drivers/iio/adc/ad7606.c +++ b/drivers/iio/adc/ad7606.c @@ -49,7 +49,7 @@ static const unsigned int ad7616_oversam 1, 2, 4, 8, 16, 32, 64, 128, };
-static int ad7606_reset(struct ad7606_state *st) +int ad7606_reset(struct ad7606_state *st) { if (st->gpio_reset) { gpiod_set_value(st->gpio_reset, 1); @@ -60,6 +60,7 @@ static int ad7606_reset(struct ad7606_st
return -ENODEV; } +EXPORT_SYMBOL_NS_GPL(ad7606_reset, IIO_AD7606);
static int ad7606_reg_access(struct iio_dev *indio_dev, unsigned int reg, @@ -88,31 +89,6 @@ static int ad7606_read_samples(struct ad { unsigned int num = st->chip_info->num_channels - 1; u16 *data = st->data; - int ret; - - /* - * The frstdata signal is set to high while and after reading the sample - * of the first channel and low for all other channels. This can be used - * to check that the incoming data is correctly aligned. During normal - * operation the data should never become unaligned, but some glitch or - * electrostatic discharge might cause an extra read or clock cycle. - * Monitoring the frstdata signal allows to recover from such failure - * situations. - */ - - if (st->gpio_frstdata) { - ret = st->bops->read_block(st->dev, 1, data); - if (ret) - return ret; - - if (!gpiod_get_value(st->gpio_frstdata)) { - ad7606_reset(st); - return -EIO; - } - - data++; - num--; - }
return st->bops->read_block(st->dev, num, data); } --- a/drivers/iio/adc/ad7606.h +++ b/drivers/iio/adc/ad7606.h @@ -153,6 +153,8 @@ int ad7606_probe(struct device *dev, int const char *name, unsigned int id, const struct ad7606_bus_ops *bops);
+int ad7606_reset(struct ad7606_state *st); + enum ad7606_supported_device_ids { ID_AD7605_4, ID_AD7606_8, --- a/drivers/iio/adc/ad7606_par.c +++ b/drivers/iio/adc/ad7606_par.c @@ -7,6 +7,7 @@
#include <linux/mod_devicetable.h> #include <linux/module.h> +#include <linux/gpio/consumer.h> #include <linux/platform_device.h> #include <linux/types.h> #include <linux/err.h> @@ -21,8 +22,29 @@ static int ad7606_par16_read_block(struc struct iio_dev *indio_dev = dev_get_drvdata(dev); struct ad7606_state *st = iio_priv(indio_dev);
- insw((unsigned long)st->base_address, buf, count);
+ /* + * On the parallel interface, the frstdata signal is set to high while + * and after reading the sample of the first channel and low for all + * other channels. This can be used to check that the incoming data is + * correctly aligned. During normal operation the data should never + * become unaligned, but some glitch or electrostatic discharge might + * cause an extra read or clock cycle. Monitoring the frstdata signal + * allows to recover from such failure situations. + */ + int num = count; + u16 *_buf = buf; + + if (st->gpio_frstdata) { + insw((unsigned long)st->base_address, _buf, 1); + if (!gpiod_get_value(st->gpio_frstdata)) { + ad7606_reset(st); + return -EIO; + } + _buf++; + num--; + } + insw((unsigned long)st->base_address, _buf, num); return 0; }
@@ -35,8 +57,28 @@ static int ad7606_par8_read_block(struct { struct iio_dev *indio_dev = dev_get_drvdata(dev); struct ad7606_state *st = iio_priv(indio_dev); - - insb((unsigned long)st->base_address, buf, count * 2); + /* + * On the parallel interface, the frstdata signal is set to high while + * and after reading the sample of the first channel and low for all + * other channels. This can be used to check that the incoming data is + * correctly aligned. During normal operation the data should never + * become unaligned, but some glitch or electrostatic discharge might + * cause an extra read or clock cycle. Monitoring the frstdata signal + * allows to recover from such failure situations. + */ + int num = count; + u16 *_buf = buf; + + if (st->gpio_frstdata) { + insb((unsigned long)st->base_address, _buf, 2); + if (!gpiod_get_value(st->gpio_frstdata)) { + ad7606_reset(st); + return -EIO; + } + _buf++; + num--; + } + insb((unsigned long)st->base_address, _buf, num * 2);
return 0; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dumitru Ceclan mitrutzceclan@gmail.com
commit 96f9ab0d5933c1c00142dd052f259fce0bc3ced2 upstream.
The ad7124_soft_reset() function has the assumption that the chip will assert the "power-on reset" bit in the STATUS register after a software reset without any delay. The POR bit being 0 is used to check whether the chip initialization is done.
A chip ID mismatch probe error appears intermittently when the probe continues too soon and the ID register does not contain the expected value.
Fix by adding a 200us delay after the software reset command is issued.
Fixes: b3af341bbd96 ("iio: adc: Add ad7124 support") Signed-off-by: Dumitru Ceclan dumitru.ceclan@analog.com Reviewed-by: Nuno Sa nuno.sa@analog.com Link: https://patch.msgid.link/20240731-ad7124-fix-v1-1-46a76aa4b9be@analog.com Cc: Stable@vger.kernel.org Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iio/adc/ad7124.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/iio/adc/ad7124.c
+++ b/drivers/iio/adc/ad7124.c
@@ -765,6 +765,7 @@ static int ad7124_soft_reset(struct ad71
 	if (ret < 0)
 		return ret;

+	fsleep(200);
 	timeout = 100;
 	do {
 		ret = ad_sd_read_reg(&st->sd, AD7124_STATUS, 1, &readval);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Faisal Hassan quic_faisalh@quicinc.com
commit 9149c9b0c7e046273141e41eebd8a517416144ac upstream.
This fix addresses STAR 9001285599, which only affects DWC_usb3 version 3.20a. The timer value for PM_LC_TIMER in DWC_usb3 3.20a for the Link ECN changes is incorrect. If the PM TIMER ECN is enabled via GUCTL2[19], the link compliance test (TD7.21) may fail. If the ECN is not enabled (GUCTL2[19] = 0), the controller will use the old timer value (5us), which is still acceptable for the link compliance test. Therefore, clear GUCTL2[19] to pass the USB link compliance test: TD 7.21.
Cc: stable@vger.kernel.org Signed-off-by: Faisal Hassan quic_faisalh@quicinc.com Acked-by: Thinh Nguyen Thinh.Nguyen@synopsys.com Link: https://lore.kernel.org/r/20240829094502.26502-1-quic_faisalh@quicinc.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/usb/dwc3/core.c | 15 +++++++++++++++ drivers/usb/dwc3/core.h | 2 ++ 2 files changed, 17 insertions(+)
--- a/drivers/usb/dwc3/core.c +++ b/drivers/usb/dwc3/core.c @@ -1306,6 +1306,21 @@ static int dwc3_core_init(struct dwc3 *d }
/* + * STAR 9001285599: This issue affects DWC_usb3 version 3.20a + * only. If the PM TIMER ECM is enabled through GUCTL2[19], the + * link compliance test (TD7.21) may fail. If the ECN is not + * enabled (GUCTL2[19] = 0), the controller will use the old timer + * value (5us), which is still acceptable for the link compliance + * test. Therefore, do not enable PM TIMER ECM in 3.20a by + * setting GUCTL2[19] by default; instead, use GUCTL2[19] = 0. + */ + if (DWC3_VER_IS(DWC3, 320A)) { + reg = dwc3_readl(dwc->regs, DWC3_GUCTL2); + reg &= ~DWC3_GUCTL2_LC_TIMER; + dwc3_writel(dwc->regs, DWC3_GUCTL2, reg); + } + + /* * When configured in HOST mode, after issuing U3/L2 exit controller * fails to send proper CRC checksum in CRC5 feild. Because of this * behaviour Transaction Error is generated, resulting in reset and --- a/drivers/usb/dwc3/core.h +++ b/drivers/usb/dwc3/core.h @@ -404,6 +404,7 @@
/* Global User Control Register 2 */ #define DWC3_GUCTL2_RST_ACTBITLATER BIT(14) +#define DWC3_GUCTL2_LC_TIMER BIT(19)
/* Global User Control Register 3 */ #define DWC3_GUCTL3_SPLITDISABLE BIT(14) @@ -1232,6 +1233,7 @@ struct dwc3 { #define DWC3_REVISION_290A 0x5533290a #define DWC3_REVISION_300A 0x5533300a #define DWC3_REVISION_310A 0x5533310a +#define DWC3_REVISION_320A 0x5533320a #define DWC3_REVISION_330A 0x5533330a
#define DWC31_REVISION_ANY 0x0
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Carlos Llamas cmllamas@google.com
commit 4df153652cc46545722879415937582028c18af5 upstream.
Binder objects are processed and copied individually into the target buffer during transactions. Any raw data in-between these objects is copied as well. However, this raw data copy lacks an out-of-bounds check. If the raw data exceeds the data section size then the copy overwrites the offsets section. This eventually triggers an error that attempts to unwind the processed objects. However, at this point the offsets used to index these objects are now corrupted.
Unwinding with corrupted offsets can result in decrements of arbitrary nodes and lead to their premature release. Other users of such nodes are left with a dangling pointer triggering a use-after-free. This issue is made evident by the following KASAN report (trimmed):
==================================================================
BUG: KASAN: slab-use-after-free in _raw_spin_lock+0xe4/0x19c
Write of size 4 at addr ffff47fc91598f04 by task binder-util/743

CPU: 9 UID: 0 PID: 743 Comm: binder-util Not tainted 6.11.0-rc4 #1
Hardware name: linux,dummy-virt (DT)
Call trace:
  _raw_spin_lock+0xe4/0x19c
  binder_free_buf+0x128/0x434
  binder_thread_write+0x8a4/0x3260
  binder_ioctl+0x18f0/0x258c
  [...]

Allocated by task 743:
  __kmalloc_cache_noprof+0x110/0x270
  binder_new_node+0x50/0x700
  binder_transaction+0x413c/0x6da8
  binder_thread_write+0x978/0x3260
  binder_ioctl+0x18f0/0x258c
  [...]

Freed by task 745:
  kfree+0xbc/0x208
  binder_thread_read+0x1c5c/0x37d4
  binder_ioctl+0x16d8/0x258c
  [...]
==================================================================
To avoid this issue, let's check that the raw data copy is within the boundaries of the data section.
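As a rough illustration only (made-up sizes and a hypothetical helper name, not binder's actual code), the added bound can be thought of as:

#include <stdio.h>
#include <stdbool.h>
#include <stddef.h>

/* Loosely mirrors the shape of the checks in binder_transaction(). */
static bool raw_copy_in_bounds(size_t user_offset, size_t object_offset,
			       size_t data_size)
{
	if (user_offset > object_offset)	/* offsets must move forward */
		return false;
	if (object_offset > data_size)		/* new: stay inside the data section */
		return false;
	return true;
}

int main(void)
{
	/* data section of 128 bytes, object offset pointing past it */
	printf("%d\n", raw_copy_in_bounds(0, 256, 128));	/* 0: rejected */
	printf("%d\n", raw_copy_in_bounds(0, 64, 128));		/* 1: fine */
	return 0;
}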
Fixes: 6d98eb95b450 ("binder: avoid potential data leakage when copying txn") Cc: Todd Kjos tkjos@google.com Cc: stable@vger.kernel.org Signed-off-by: Carlos Llamas cmllamas@google.com Link: https://lore.kernel.org/r/20240822182353.2129600-1-cmllamas@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/android/binder.c | 1 + 1 file changed, 1 insertion(+)
--- a/drivers/android/binder.c
+++ b/drivers/android/binder.c
@@ -3329,6 +3329,7 @@ static void binder_transaction(struct bi
 		 */
 		copy_size = object_offset - user_offset;
 		if (copy_size && (user_offset > object_offset ||
+				  object_offset > tr->data_size ||
 				  binder_alloc_copy_user_to_buffer(
 					&target_proc->alloc,
 					t->buffer, user_offset,
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Geert Uytterhoeven geert+renesas@glider.be
commit c69f37f6559a8948d70badd2b179db7714dedd62 upstream.
devm_nvmem_device_get() returns an nvmem device, not an nvmem cell.
Fixes: e2a5402ec7c6d044 ("nvmem: Add nvmem_device based consumer apis.") Cc: stable stable@kernel.org Signed-off-by: Geert Uytterhoeven geert+renesas@glider.be Signed-off-by: Srinivas Kandagatla srinivas.kandagatla@linaro.org Link: https://lore.kernel.org/r/20240902142510.71096-3-srinivas.kandagatla@linaro.... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/nvmem/core.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
--- a/drivers/nvmem/core.c +++ b/drivers/nvmem/core.c @@ -1092,13 +1092,13 @@ void nvmem_device_put(struct nvmem_devic EXPORT_SYMBOL_GPL(nvmem_device_put);
/** - * devm_nvmem_device_get() - Get nvmem cell of device form a given id + * devm_nvmem_device_get() - Get nvmem device of device form a given id * * @dev: Device that requests the nvmem device. * @id: name id for the requested nvmem device. * - * Return: ERR_PTR() on error or a valid pointer to a struct nvmem_cell - * on success. The nvmem_cell will be freed by the automatically once the + * Return: ERR_PTR() on error or a valid pointer to a struct nvmem_device + * on success. The nvmem_device will be freed by the automatically once the * device is freed. */ struct nvmem_device *devm_nvmem_device_get(struct device *dev, const char *id)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Saurabh Sengar ssengar@linux.microsoft.com
commit fb1adbd7e50f3d2de56d0a2bb0700e2e819a329e upstream.
For primary VM Bus channels, the primary_channel pointer is always NULL. This pointer is valid only for the secondary channels. Also, the rescind callback is meant for primary channels only.
Fix NULL pointer dereference by retrieving the device_obj from the parent for the primary channel.
Cc: stable@vger.kernel.org Fixes: ca3cda6fcf1e ("uio_hv_generic: add rescind support") Signed-off-by: Saurabh Sengar ssengar@linux.microsoft.com Signed-off-by: Naman Jain namjain@linux.microsoft.com Link: https://lore.kernel.org/r/20240829071312.1595-2-namjain@linux.microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/uio/uio_hv_generic.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/uio/uio_hv_generic.c +++ b/drivers/uio/uio_hv_generic.c @@ -104,10 +104,11 @@ static void hv_uio_channel_cb(void *cont
/* * Callback from vmbus_event when channel is rescinded. + * It is meant for rescind of primary channels only. */ static void hv_uio_rescind(struct vmbus_channel *channel) { - struct hv_device *hv_dev = channel->primary_channel->device_obj; + struct hv_device *hv_dev = channel->device_obj; struct hv_uio_private_data *pdata = hv_get_drvdata(hv_dev);
/*
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Naman Jain namjain@linux.microsoft.com
commit 6fd28941447bf2c8ca0f26fda612a1cabc41663f upstream.
Rescind offer handling relies on rescind callbacks for some of the resource cleanup, if they are registered. It does not unregister the vmbus device on primary channel closure when a callback is registered. Without it, the next onoffer does not come, the rescind flag remains set, and the device goes into an unusable state.
Add logic to unregister vmbus for the primary channel in the rescind callback to ensure channel removal and relid release, and to ensure that the next onoffer can be received and handled properly.
Cc: stable@vger.kernel.org Fixes: ca3cda6fcf1e ("uio_hv_generic: add rescind support") Signed-off-by: Naman Jain namjain@linux.microsoft.com Reviewed-by: Saurabh Sengar ssengar@linux.microsoft.com Link: https://lore.kernel.org/r/20240829071312.1595-3-namjain@linux.microsoft.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/hv/vmbus_drv.c | 1 + drivers/uio/uio_hv_generic.c | 8 ++++++++ 2 files changed, 9 insertions(+)
--- a/drivers/hv/vmbus_drv.c +++ b/drivers/hv/vmbus_drv.c @@ -1977,6 +1977,7 @@ static umode_t vmbus_chan_attr_is_visibl
return attr->mode; } +EXPORT_SYMBOL_GPL(vmbus_device_unregister);
static struct attribute_group vmbus_chan_group = { .attrs = vmbus_chan_attrs, --- a/drivers/uio/uio_hv_generic.c +++ b/drivers/uio/uio_hv_generic.c @@ -119,6 +119,14 @@ static void hv_uio_rescind(struct vmbus_
/* Wake up reader */ uio_event_notify(&pdata->info); + + /* + * With rescind callback registered, rescind path will not unregister the device + * from vmbus when the primary channel is rescinded. + * Without it, rescind handling is incomplete and next onoffer msg does not come. + * Unregister the device from vmbus here. + */ + vmbus_device_unregister(channel->device_obj); }
/* Sysfs API to allow mmap of the ring buffers
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: David Fernandez Gonzalez david.fernandez.gonzalez@oracle.com
commit 48b9a8dabcc3cf5f961b2ebcd8933bf9204babb7 upstream.
When removing a resource from vmci_resource_table in vmci_resource_remove(), the search is performed using the resource handle by comparing context and resource fields.
It is possible, though, to create two resources with different types but the same handle (same context and resource fields).
When trying to remove one of the resources, vmci_resource_remove() may not remove the intended one, but the object will still be freed, as in the case of the datagram type in vmci_datagram_destroy_handle(). vmci_resource_table will still hold a pointer to this freed resource, leading to a use-after-free vulnerability.
BUG: KASAN: use-after-free in vmci_handle_is_equal include/linux/vmw_vmci_defs.h:142 [inline]
BUG: KASAN: use-after-free in vmci_resource_remove+0x3a1/0x410 drivers/misc/vmw_vmci/vmci_resource.c:147
Read of size 4 at addr ffff88801c16d800 by task syz-executor197/1592

Call Trace:
 <TASK>
 __dump_stack lib/dump_stack.c:88 [inline]
 dump_stack_lvl+0x82/0xa9 lib/dump_stack.c:106
 print_address_description.constprop.0+0x21/0x366 mm/kasan/report.c:239
 __kasan_report.cold+0x7f/0x132 mm/kasan/report.c:425
 kasan_report+0x38/0x51 mm/kasan/report.c:442
 vmci_handle_is_equal include/linux/vmw_vmci_defs.h:142 [inline]
 vmci_resource_remove+0x3a1/0x410 drivers/misc/vmw_vmci/vmci_resource.c:147
 vmci_qp_broker_detach+0x89a/0x11b9 drivers/misc/vmw_vmci/vmci_queue_pair.c:2182
 ctx_free_ctx+0x473/0xbe1 drivers/misc/vmw_vmci/vmci_context.c:444
 kref_put include/linux/kref.h:65 [inline]
 vmci_ctx_put drivers/misc/vmw_vmci/vmci_context.c:497 [inline]
 vmci_ctx_destroy+0x170/0x1d6 drivers/misc/vmw_vmci/vmci_context.c:195
 vmci_host_close+0x125/0x1ac drivers/misc/vmw_vmci/vmci_host.c:143
 __fput+0x261/0xa34 fs/file_table.c:282
 task_work_run+0xf0/0x194 kernel/task_work.c:164
 tracehook_notify_resume include/linux/tracehook.h:189 [inline]
 exit_to_user_mode_loop+0x184/0x189 kernel/entry/common.c:187
 exit_to_user_mode_prepare+0x11b/0x123 kernel/entry/common.c:220
 __syscall_exit_to_user_mode_work kernel/entry/common.c:302 [inline]
 syscall_exit_to_user_mode+0x18/0x42 kernel/entry/common.c:313
 do_syscall_64+0x41/0x85 arch/x86/entry/common.c:86
 entry_SYSCALL_64_after_hwframe+0x6e/0x0
This change ensures the type is also checked when removing the resource from vmci_resource_table in vmci_resource_remove().
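A minimal userspace sketch (made-up types and a plain array instead of the real hash table) of the ambiguity and of the added type check: without the type comparison, removing the second resource would wrongly drop the first one, since both share the same handle.

#include <stdio.h>
#include <stdbool.h>

struct handle { unsigned int context, resource; };
struct resource { struct handle h; int type; bool present; };

static bool handle_is_equal(struct handle a, struct handle b)
{
	return a.context == b.context && a.resource == b.resource;
}

static void remove_resource(struct resource *table, int n, struct resource *victim)
{
	for (int i = 0; i < n; i++) {
		if (table[i].present &&
		    handle_is_equal(table[i].h, victim->h) &&
		    table[i].type == victim->type) {	/* the added type check */
			table[i].present = false;
			return;
		}
	}
}

int main(void)
{
	struct resource table[2] = {
		{ .h = { 1, 7 }, .type = 0, .present = true },	/* e.g. datagram */
		{ .h = { 1, 7 }, .type = 1, .present = true },	/* e.g. queue pair */
	};

	remove_resource(table, 2, &table[1]);
	printf("%d %d\n", table[0].present, table[1].present);	/* 1 0 */
	return 0;
}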
Fixes: bc63dedb7d46 ("VMCI: resource object implementation.") Cc: stable@vger.kernel.org Reported-by: George Kennedy george.kennedy@oracle.com Signed-off-by: David Fernandez Gonzalez david.fernandez.gonzalez@oracle.com Link: https://lore.kernel.org/r/20240828154338.754746-1-david.fernandez.gonzalez@o... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/misc/vmw_vmci/vmci_resource.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/drivers/misc/vmw_vmci/vmci_resource.c
+++ b/drivers/misc/vmw_vmci/vmci_resource.c
@@ -144,7 +144,8 @@ void vmci_resource_remove(struct vmci_re
 	spin_lock(&vmci_resource_table.lock);

 	hlist_for_each_entry(r, &vmci_resource_table.entries[idx], node) {
-		if (vmci_handle_is_equal(r->handle, resource->handle)) {
+		if (vmci_handle_is_equal(r->handle, resource->handle) &&
+		    resource->type == r->type) {
 			hlist_del_init_rcu(&r->node);
 			break;
 		}
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jacky Bai ping.bai@nxp.com
commit 5b8843fcd49827813da80c0f590a17ae4ce93c5d upstream.
In tpm_set_next_event(delta), -ETIME is wrongly returned because of a bad cast to int when delta is larger than INT_MAX.
For example:
tpm_set_next_event(delta = 0xffff_fffe)
{
	...
	next = tpm_read_counter();	// assume next is 0x10
	next += delta;			// next will be 0xffff_fffe + 0x10 = 0x1_0000_000e
	now = tpm_read_counter();	// now is 0x10
	...
	return (int)(next - now) <= 0 ? -ETIME : 0;
	       ^^^^^^^^^^
	       0x1_0000_000e - 0x10 = 0xffff_fffe, which is -2 when cast
	       to int. So return -ETIME.
}
To fix this, introduce a 'prev' variable and check if 'now - prev' is larger than delta.
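A small userspace demo of the cast issue, using the same made-up values as above and assuming a typical system where int is 32 bits and the out-of-range conversion wraps:

#include <stdio.h>

int main(void)
{
	unsigned long prev = 0x10;		/* counter before programming */
	unsigned long delta = 0xfffffffe;	/* requested delta > INT_MAX */
	unsigned long next = prev + delta;	/* value written to TPM_C0V */
	unsigned long now = 0x10;		/* counter right after the write */

	/* old check: prints 1, i.e. spurious -ETIME */
	printf("old: %d\n", (int)(next - now) <= 0);
	/* new check: prints 0, the counter has not passed the deadline */
	printf("new: %d\n", (now - prev) >= delta);
	return 0;
}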
Cc: stable@vger.kernel.org Fixes: 059ab7b82eec ("clocksource/drivers/imx-tpm: Add imx tpm timer support") Signed-off-by: Jacky Bai ping.bai@nxp.com Reviewed-by: Peng Fan peng.fan@nxp.com Reviewed-by: Ye Li ye.li@nxp.com Reviewed-by: Jason Liu jason.hui.liu@nxp.com Signed-off-by: Frank Li Frank.Li@nxp.com Link: https://lore.kernel.org/r/20240725193355.1436005-1-Frank.Li@nxp.com Signed-off-by: Daniel Lezcano daniel.lezcano@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/clocksource/timer-imx-tpm.c | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-)
--- a/drivers/clocksource/timer-imx-tpm.c
+++ b/drivers/clocksource/timer-imx-tpm.c
@@ -83,10 +83,10 @@ static u64 notrace tpm_read_sched_clock(
 static int tpm_set_next_event(unsigned long delta,
 				struct clock_event_device *evt)
 {
-	unsigned long next, now;
+	unsigned long next, prev, now;

-	next = tpm_read_counter();
-	next += delta;
+	prev = tpm_read_counter();
+	next = prev + delta;
 	writel(next, timer_base + TPM_C0V);
 	now = tpm_read_counter();

@@ -96,7 +96,7 @@ static int tpm_set_next_event(unsigned l
 	 * of writing CNT registers which may cause the min_delta event got
 	 * missed, so we need add a ETIME check here in case it happened.
 	 */
-	return (int)(next - now) <= 0 ? -ETIME : 0;
+	return (now - prev) >= delta ? -ETIME : 0;
 }

 static int tpm_set_state_oneshot(struct clock_event_device *evt)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jacky Bai ping.bai@nxp.com
commit 3d5c2f8e75a55cfb11a85086c71996af0354a1fb upstream.
The value written into the TPM CnV can only be updated into the hardware when the counter increases. Additional writes to the CnV write buffer are ignored until the register has been updated. Therefore, we need to check if the CnV has been updated before continuing. This may require waiting for 1 counter cycle in the worst case.
Cc: stable@vger.kernel.org Fixes: 059ab7b82eec ("clocksource/drivers/imx-tpm: Add imx tpm timer support") Signed-off-by: Jacky Bai ping.bai@nxp.com Reviewed-by: Peng Fan peng.fan@nxp.com Reviewed-by: Ye Li ye.li@nxp.com Reviewed-by: Jason Liu jason.hui.liu@nxp.com Signed-off-by: Frank Li Frank.Li@nxp.com Link: https://lore.kernel.org/r/20240725193355.1436005-2-Frank.Li@nxp.com Signed-off-by: Daniel Lezcano daniel.lezcano@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/clocksource/timer-imx-tpm.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/drivers/clocksource/timer-imx-tpm.c
+++ b/drivers/clocksource/timer-imx-tpm.c
@@ -91,6 +91,14 @@ static int tpm_set_next_event(unsigned l
 	now = tpm_read_counter();

 	/*
+	 * Need to wait CNT increase at least 1 cycle to make sure
+	 * the C0V has been updated into HW.
+	 */
+	if ((next & 0xffffffff) != readl(timer_base + TPM_C0V))
+		while (now == tpm_read_counter())
+			;
+
+	/*
 	 * NOTE: We observed in a very small probability, the bus fabric
 	 * contention between GPU and A7 may results a few cycles delay
 	 * of writing CNT registers which may cause the min_delta event got
 	 * missed, so we need add a ETIME check here in case it happened.
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Daniel Lezcano daniel.lezcano@linaro.org
commit 471ef0b5a8aaca4296108e756b970acfc499ede4 upstream.
GCC's named address space checks error out with:
drivers/clocksource/timer-of.c: In function ‘timer_of_irq_exit’:
drivers/clocksource/timer-of.c:29:46: error: passing argument 2 of ‘free_percpu_irq’ from pointer to non-enclosed address space
   29 |         free_percpu_irq(of_irq->irq, clkevt);
      |                                      ^~~~~~
In file included from drivers/clocksource/timer-of.c:8:
./include/linux/interrupt.h:201:43: note: expected ‘__seg_gs void *’ but argument is of type ‘struct clock_event_device *’
  201 | extern void free_percpu_irq(unsigned int, void __percpu *);
      |                                           ^~~~~~~~~~~~~~~
drivers/clocksource/timer-of.c: In function ‘timer_of_irq_init’:
drivers/clocksource/timer-of.c:74:51: error: passing argument 4 of ‘request_percpu_irq’ from pointer to non-enclosed address space
   74 |                 np->full_name, clkevt) :
      |                                ^~~~~~
./include/linux/interrupt.h:190:56: note: expected ‘__seg_gs void *’ but argument is of type ‘struct clock_event_device *’
  190 |                      const char *devname, void __percpu *percpu_dev_id)
Sparse warns about:
timer-of.c:29:46: warning: incorrect type in argument 2 (different address spaces)
timer-of.c:29:46:    expected void [noderef] __percpu *
timer-of.c:29:46:    got struct clock_event_device *clkevt
timer-of.c:74:51: warning: incorrect type in argument 4 (different address spaces)
timer-of.c:74:51:    expected void [noderef] __percpu *percpu_dev_id
timer-of.c:74:51:    got struct clock_event_device *clkevt
It appears the code is incorrect as reported by Uros Bizjak:
"The referred code is questionable as it tries to reuse the clkevent pointer once as percpu pointer and once as generic pointer, which should be avoided."
This change removes the percpu related code as no driver is using it.
[Daniel: Fixed the description]
Fixes: dc11bae785295 ("clocksource/drivers: Add timer-of common init routine") Reported-by: Uros Bizjak ubizjak@gmail.com Tested-by: Uros Bizjak ubizjak@gmail.com Link: https://lore.kernel.org/r/20240819100335.2394751-1-daniel.lezcano@linaro.org Signed-off-by: Daniel Lezcano daniel.lezcano@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/clocksource/timer-of.c | 17 ++++------------- drivers/clocksource/timer-of.h | 1 - 2 files changed, 4 insertions(+), 14 deletions(-)
--- a/drivers/clocksource/timer-of.c +++ b/drivers/clocksource/timer-of.c @@ -25,10 +25,7 @@ static __init void timer_of_irq_exit(str
struct clock_event_device *clkevt = &to->clkevt;
- if (of_irq->percpu) - free_percpu_irq(of_irq->irq, clkevt); - else - free_irq(of_irq->irq, clkevt); + free_irq(of_irq->irq, clkevt); }
/** @@ -42,9 +39,6 @@ static __init void timer_of_irq_exit(str * - Get interrupt number by name * - Get interrupt number by index * - * When the interrupt is per CPU, 'request_percpu_irq()' is called, - * otherwise 'request_irq()' is used. - * * Returns 0 on success, < 0 otherwise */ static __init int timer_of_irq_init(struct device_node *np, @@ -69,12 +63,9 @@ static __init int timer_of_irq_init(stru return -EINVAL; }
- ret = of_irq->percpu ? - request_percpu_irq(of_irq->irq, of_irq->handler, - np->full_name, clkevt) : - request_irq(of_irq->irq, of_irq->handler, - of_irq->flags ? of_irq->flags : IRQF_TIMER, - np->full_name, clkevt); + ret = request_irq(of_irq->irq, of_irq->handler, + of_irq->flags ? of_irq->flags : IRQF_TIMER, + np->full_name, clkevt); if (ret) { pr_err("Failed to request irq %d for %pOF\n", of_irq->irq, np); return ret; --- a/drivers/clocksource/timer-of.h +++ b/drivers/clocksource/timer-of.h @@ -11,7 +11,6 @@ struct of_timer_irq { int irq; int index; - int percpu; const char *name; unsigned long flags; irq_handler_t handler;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sven Schnelle svens@linux.ibm.com
commit e240b0fde52f33670d1336697c22d90a4fe33c84 upstream.
To prevent uninitialized members, use kzalloc to allocate the xol area.
Fixes: b059a453b1cf1 ("x86/vdso: Add mremap hook to vm_special_mapping") Signed-off-by: Sven Schnelle svens@linux.ibm.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Acked-by: Oleg Nesterov oleg@redhat.com Link: https://lore.kernel.org/r/20240903102313.3402529-1-svens@linux.ibm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/events/uprobes.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -1483,7 +1483,7 @@ static struct xol_area *__create_xol_are
 	uprobe_opcode_t insn = UPROBE_SWBP_INSN;
 	struct xol_area *area;

-	area = kmalloc(sizeof(*area), GFP_KERNEL);
+	area = kzalloc(sizeof(*area), GFP_KERNEL);
 	if (unlikely(!area))
 		goto out;

@@ -1493,7 +1493,6 @@ static struct xol_area *__create_xol_are
 		goto free_area;

 	area->xol_mapping.name = "[uprobes]";
-	area->xol_mapping.fault = NULL;
 	area->xol_mapping.pages = area->pages;
 	area->pages[0] = alloc_page(GFP_HIGHUSER);
 	if (!area->pages[0])
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Zijlstra peterz@infradead.org
commit 2ab9d830262c132ab5db2f571003d80850d56b2a upstream.
Ole reported that event->mmap_mutex is strictly insufficient to serialize the AUX buffer; add a per-RB mutex to fully serialize it.
Note that in the lock order comment the perf_event::mmap_mutex order was already wrong; that is, its nesting under mmap_lock is not new with this patch.
Fixes: 45bfb2e50471 ("perf: Add AUX area to ring buffer for raw data streams") Reported-by: Ole ole@binarygecko.com Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Signed-off-by: Ingo Molnar mingo@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/events/core.c | 18 ++++++++++++------ kernel/events/internal.h | 1 + kernel/events/ring_buffer.c | 2 ++ 3 files changed, 15 insertions(+), 6 deletions(-)
--- a/kernel/events/core.c +++ b/kernel/events/core.c @@ -1277,8 +1277,9 @@ static void put_ctx(struct perf_event_co * perf_event_context::mutex * perf_event::child_mutex; * perf_event_context::lock - * perf_event::mmap_mutex * mmap_lock + * perf_event::mmap_mutex + * perf_buffer::aux_mutex * perf_addr_filters_head::lock * * cpu_hotplug_lock @@ -6181,12 +6182,11 @@ static void perf_mmap_close(struct vm_ar event->pmu->event_unmapped(event, vma->vm_mm);
/* - * rb->aux_mmap_count will always drop before rb->mmap_count and - * event->mmap_count, so it is ok to use event->mmap_mutex to - * serialize with perf_mmap here. + * The AUX buffer is strictly a sub-buffer, serialize using aux_mutex + * to avoid complications. */ if (rb_has_aux(rb) && vma->vm_pgoff == rb->aux_pgoff && - atomic_dec_and_mutex_lock(&rb->aux_mmap_count, &event->mmap_mutex)) { + atomic_dec_and_mutex_lock(&rb->aux_mmap_count, &rb->aux_mutex)) { /* * Stop all AUX events that are writing to this buffer, * so that we can free its AUX pages and corresponding PMU @@ -6203,7 +6203,7 @@ static void perf_mmap_close(struct vm_ar rb_free_aux(rb); WARN_ON_ONCE(refcount_read(&rb->aux_refcount));
- mutex_unlock(&event->mmap_mutex); + mutex_unlock(&rb->aux_mutex); }
if (atomic_dec_and_test(&rb->mmap_count)) @@ -6291,6 +6291,7 @@ static int perf_mmap(struct file *file, struct perf_event *event = file->private_data; unsigned long user_locked, user_lock_limit; struct user_struct *user = current_user(); + struct mutex *aux_mutex = NULL; struct perf_buffer *rb = NULL; unsigned long locked, lock_limit; unsigned long vma_size; @@ -6339,6 +6340,9 @@ static int perf_mmap(struct file *file, if (!rb) goto aux_unlock;
+ aux_mutex = &rb->aux_mutex; + mutex_lock(aux_mutex); + aux_offset = READ_ONCE(rb->user_page->aux_offset); aux_size = READ_ONCE(rb->user_page->aux_size);
@@ -6489,6 +6493,8 @@ unlock: atomic_dec(&rb->mmap_count); } aux_unlock: + if (aux_mutex) + mutex_unlock(aux_mutex); mutex_unlock(&event->mmap_mutex);
/* --- a/kernel/events/internal.h +++ b/kernel/events/internal.h @@ -40,6 +40,7 @@ struct perf_buffer { struct user_struct *mmap_user;
/* AUX area */ + struct mutex aux_mutex; long aux_head; unsigned int aux_nest; long aux_wakeup; /* last aux_watermark boundary crossed by aux_head */ --- a/kernel/events/ring_buffer.c +++ b/kernel/events/ring_buffer.c @@ -332,6 +332,8 @@ ring_buffer_init(struct perf_buffer *rb, */ if (!rb->nr_pages) rb->paused = 1; + + mutex_init(&rb->aux_mutex); }
void perf_aux_output_flag(struct perf_output_handle *handle, u64 flags)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Boqun Feng boqun.feng@gmail.com
[ Upstream commit a5a3c952e82c1ada12bf8c55b73af26f1a454bd2 ]
Currently while defining `THIS_MODULE` symbol in `module!()`, the pointer used to construct `ThisModule` is derived from an immutable reference of `__this_module`, which means the pointer doesn't have the provenance for writing, and that means any write to that pointer is UB regardless of data races or not. However, the usage of `THIS_MODULE` includes passing this pointer to functions that may write to it (probably in unsafe code), and this will create soundness issues.
One way to fix this is using `addr_of_mut!()`, but that requires the unstable feature "const_mut_refs". So instead of `addr_of_mut!()`, an extern static `Opaque` is used here: since `Opaque<T>` is transparent to `T`, an extern static `Opaque` will just wrap the C symbol (defined in a C compile unit) in an `Opaque`, which provides a pointer with writable provenance via `Opaque::get()`. This fixes the potential UB caused by the mismatched pointer provenance.
Reported-by: Alice Ryhl aliceryhl@google.com Signed-off-by: Boqun Feng boqun.feng@gmail.com Reviewed-by: Alice Ryhl aliceryhl@google.com Reviewed-by: Trevor Gross tmgross@umich.edu Reviewed-by: Benno Lossin benno.lossin@proton.me Reviewed-by: Gary Guo gary@garyguo.net Closes: https://rust-for-linux.zulipchat.com/#narrow/stream/x/topic/x/near/465412664 Fixes: 1fbde52bde73 ("rust: add `macros` crate") Cc: stable@vger.kernel.org # 6.6.x: be2ca1e03965: ("rust: types: Make Opaque::get const") Link: https://lore.kernel.org/r/20240828180129.4046355-1-boqun.feng@gmail.com [ Fixed two typos, reworded title. - Miguel ] Signed-off-by: Miguel Ojeda ojeda@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- rust/macros/module.rs | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/rust/macros/module.rs b/rust/macros/module.rs index 031028b3dc41..071b96639a2e 100644 --- a/rust/macros/module.rs +++ b/rust/macros/module.rs @@ -183,7 +183,11 @@ pub(crate) fn module(ts: TokenStream) -> TokenStream { // freed until the module is unloaded. #[cfg(MODULE)] static THIS_MODULE: kernel::ThisModule = unsafe {{ - kernel::ThisModule::from_ptr(&kernel::bindings::__this_module as *const _ as *mut _) + extern "C" {{ + static __this_module: kernel::types::Opaque<kernel::bindings::module>; + }} + + kernel::ThisModule::from_ptr(__this_module.get()) }}; #[cfg(not(MODULE))] static THIS_MODULE: kernel::ThisModule = unsafe {{
On Tue, Sep 10, 2024 at 12:12 PM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
6.1-stable review patch. If anyone has any objections, please let me know.
From: Boqun Feng boqun.feng@gmail.com
[ Upstream commit a5a3c952e82c1ada12bf8c55b73af26f1a454bd2 ]
Currently, while defining the `THIS_MODULE` symbol in `module!()`, the pointer used to construct `ThisModule` is derived from an immutable reference to `__this_module`, which means the pointer doesn't have provenance for writing, so any write through that pointer is UB regardless of whether there are data races. However, the usage of `THIS_MODULE` includes passing this pointer to functions that may write to it (probably in unsafe code), and this will create soundness issues.
One way to fix this is using `addr_of_mut!()`, but that requires the unstable feature "const_mut_refs". So instead of `addr_of_mut!()`, an extern static `Opaque` is used here: since `Opaque<T>` is transparent to `T`, an extern static `Opaque` will just wrap the C symbol (defined in a C compile unit) in an `Opaque`, which provides a pointer with writable provenance via `Opaque::get()`. This fixes the potential UB caused by the mismatched pointer provenance.
Reported-by: Alice Ryhl aliceryhl@google.com Signed-off-by: Boqun Feng boqun.feng@gmail.com Reviewed-by: Alice Ryhl aliceryhl@google.com Reviewed-by: Trevor Gross tmgross@umich.edu Reviewed-by: Benno Lossin benno.lossin@proton.me Reviewed-by: Gary Guo gary@garyguo.net Closes: https://rust-for-linux.zulipchat.com/#narrow/stream/x/topic/x/near/465412664 Fixes: 1fbde52bde73 ("rust: add `macros` crate") Cc: stable@vger.kernel.org # 6.6.x: be2ca1e03965: ("rust: types: Make Opaque::get const") Link: https://lore.kernel.org/r/20240828180129.4046355-1-boqun.feng@gmail.com [ Fixed two typos, reworded title. - Miguel ] Signed-off-by: Miguel Ojeda ojeda@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org
The opaque type doesn't exist yet on 6.1, so this needs to be changed for the 6.1 backport. It won't compile as-is.
rust/macros/module.rs | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/rust/macros/module.rs b/rust/macros/module.rs index 031028b3dc41..071b96639a2e 100644 --- a/rust/macros/module.rs +++ b/rust/macros/module.rs @@ -183,7 +183,11 @@ pub(crate) fn module(ts: TokenStream) -> TokenStream { // freed until the module is unloaded. #[cfg(MODULE)] static THIS_MODULE: kernel::ThisModule = unsafe {{
kernel::ThisModule::from_ptr(&kernel::bindings::__this_module as *const _ as *mut _)
extern \"C\" {{
static __this_module: kernel::types::Opaque<kernel::bindings::module>;
}}
kernel::ThisModule::from_ptr(__this_module.get()) }}; #[cfg(not(MODULE))] static THIS_MODULE: kernel::ThisModule = unsafe {{
-- 2.43.0
On Tue, Sep 10, 2024 at 12:25 PM Alice Ryhl aliceryhl@google.com wrote:
On Tue, Sep 10, 2024 at 12:12 PM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
Fixes: 1fbde52bde73 ("rust: add `macros` crate") Cc: stable@vger.kernel.org # 6.6.x: be2ca1e03965: ("rust: types: Make Opaque::get const")
The opaque type doesn't exist yet on 6.1, so this needs to be changed for the 6.1 backport. It won't compile as-is.
+1 -- Greg/Sasha: for context, this is the one I asked about using Option 3 since otherwise the list of cherry-picks for 6.1 would be quite long.
Cheers, Miguel
On Tue, Sep 10, 2024 at 01:14:34PM +0200, Miguel Ojeda wrote:
On Tue, Sep 10, 2024 at 12:25 PM Alice Ryhl aliceryhl@google.com wrote:
On Tue, Sep 10, 2024 at 12:12 PM Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
Fixes: 1fbde52bde73 ("rust: add `macros` crate") Cc: stable@vger.kernel.org # 6.6.x: be2ca1e03965: ("rust: types: Make Opaque::get const")
The opaque type doesn't exist yet on 6.1, so this needs to be changed for the 6.1 backport. It won't compile as-is.
+1 -- Greg/Sasha: for context, this is the one I asked about using Option 3 since otherwise the list of cherry-picks for 6.1 would be quite long.
Sorry about the delay, yeah, this is why I didn't apply this to 6.1.y, but Sasha picked it up as he saw it looked like it applied and didn't break the build. Maybe we aren't building rust code in our systems? I know I'm not, I need to fix that...
Anyway, now dropped, thanks all!
greg k-h
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miklos Szeredi mszeredi@redhat.com
[ Upstream commit 4f8d37020e1fd0bf6ee9381ba918135ef3712efd ]
Add a flag to entry expiration that lets the filesystem expire a dentry without kicking it out from the cache immediately.
This makes a difference for overmounted dentries, where plain invalidation would detach all submounts before dropping the dentry from the cache. If only expiry is set on the dentry, then any overmounts are left alone until ->d_revalidate() is called.
Note: ->d_revalidate() is not called for the case of following a submount, so invalidation will only be triggered for the non-overmounted case. The dentry could also be mounted in a different mount instance, in which case any submounts will still be detached.
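For illustration only (not part of this patch), here is a rough user-space sketch of how a server could request expire-only invalidation. It assumes the daemon's open /dev/fuse descriptor, uapi headers that already carry the flags field and FUSE_EXPIRE_ONLY added below, and the usual notification framing (fuse_out_header with unique == 0, notify code in the error field, name sent with its terminating NUL), which is not spelled out in this patch:

/* Sketch: ask the kernel to expire (not invalidate) a dentry. */
#include <stdint.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>
#include <linux/fuse.h>

static int notify_expire_entry(int fuse_fd, uint64_t parent, const char *name)
{
	struct fuse_out_header oh;
	struct fuse_notify_inval_entry_out outarg;
	size_t namelen = strlen(name);
	struct iovec iov[3];

	memset(&outarg, 0, sizeof(outarg));
	outarg.parent = parent;
	outarg.namelen = namelen;
	outarg.flags = FUSE_EXPIRE_ONLY;	/* new in this patch */

	memset(&oh, 0, sizeof(oh));
	oh.unique = 0;				/* unique == 0 marks a notification */
	oh.error = FUSE_NOTIFY_INVAL_ENTRY;
	oh.len = sizeof(oh) + sizeof(outarg) + namelen + 1;

	iov[0] = (struct iovec){ &oh, sizeof(oh) };
	iov[1] = (struct iovec){ &outarg, sizeof(outarg) };
	iov[2] = (struct iovec){ (void *)name, namelen + 1 };

	return writev(fuse_fd, iov, 3) < 0 ? -1 : 0;
}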
Suggested-by: Jakob Blomer jblomer@cern.ch Signed-off-by: Miklos Szeredi mszeredi@redhat.com Stable-dep-of: 3002240d1649 ("fuse: fix memory leak in fuse_create_open") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/fuse/dev.c | 4 ++-- fs/fuse/dir.c | 6 ++++-- fs/fuse/fuse_i.h | 2 +- include/uapi/linux/fuse.h | 13 +++++++++++-- 4 files changed, 18 insertions(+), 7 deletions(-)
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c index 96a717f73ce3..61bef919c042 100644 --- a/fs/fuse/dev.c +++ b/fs/fuse/dev.c @@ -1498,7 +1498,7 @@ static int fuse_notify_inval_entry(struct fuse_conn *fc, unsigned int size, buf[outarg.namelen] = 0;
down_read(&fc->killsb); - err = fuse_reverse_inval_entry(fc, outarg.parent, 0, &name); + err = fuse_reverse_inval_entry(fc, outarg.parent, 0, &name, outarg.flags); up_read(&fc->killsb); kfree(buf); return err; @@ -1546,7 +1546,7 @@ static int fuse_notify_delete(struct fuse_conn *fc, unsigned int size, buf[outarg.namelen] = 0;
down_read(&fc->killsb); - err = fuse_reverse_inval_entry(fc, outarg.parent, outarg.child, &name); + err = fuse_reverse_inval_entry(fc, outarg.parent, outarg.child, &name, 0); up_read(&fc->killsb); kfree(buf); return err; diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c index 936a24b646ce..8474003aa54d 100644 --- a/fs/fuse/dir.c +++ b/fs/fuse/dir.c @@ -1174,7 +1174,7 @@ int fuse_update_attributes(struct inode *inode, struct file *file, u32 mask) }
int fuse_reverse_inval_entry(struct fuse_conn *fc, u64 parent_nodeid, - u64 child_nodeid, struct qstr *name) + u64 child_nodeid, struct qstr *name, u32 flags) { int err = -ENOTDIR; struct inode *parent; @@ -1201,7 +1201,9 @@ int fuse_reverse_inval_entry(struct fuse_conn *fc, u64 parent_nodeid, goto unlock;
fuse_dir_changed(parent); - fuse_invalidate_entry(entry); + if (!(flags & FUSE_EXPIRE_ONLY)) + d_invalidate(entry); + fuse_invalidate_entry_cache(entry);
if (child_nodeid != 0 && d_really_is_positive(entry)) { inode_lock(d_inode(entry)); diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h index 66c2a9999468..cb464e5b171a 100644 --- a/fs/fuse/fuse_i.h +++ b/fs/fuse/fuse_i.h @@ -1235,7 +1235,7 @@ int fuse_reverse_inval_inode(struct fuse_conn *fc, u64 nodeid, * then the dentry is unhashed (d_delete()). */ int fuse_reverse_inval_entry(struct fuse_conn *fc, u64 parent_nodeid, - u64 child_nodeid, struct qstr *name); + u64 child_nodeid, struct qstr *name, u32 flags);
int fuse_do_open(struct fuse_mount *fm, u64 nodeid, struct file *file, bool isdir); diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h index 76ee8f9e024a..39cfb343faa8 100644 --- a/include/uapi/linux/fuse.h +++ b/include/uapi/linux/fuse.h @@ -197,6 +197,9 @@ * * 7.37 * - add FUSE_TMPFILE + * + * 7.38 + * - add FUSE_EXPIRE_ONLY flag to fuse_notify_inval_entry */
#ifndef _LINUX_FUSE_H @@ -232,7 +235,7 @@ #define FUSE_KERNEL_VERSION 7
/** Minor version number of this interface */ -#define FUSE_KERNEL_MINOR_VERSION 37 +#define FUSE_KERNEL_MINOR_VERSION 38
/** The node ID of the root inode */ #define FUSE_ROOT_ID 1 @@ -491,6 +494,12 @@ struct fuse_file_lock { */ #define FUSE_SETXATTR_ACL_KILL_SGID (1 << 0)
+/** + * notify_inval_entry flags + * FUSE_EXPIRE_ONLY + */ +#define FUSE_EXPIRE_ONLY (1 << 0) + enum fuse_opcode { FUSE_LOOKUP = 1, FUSE_FORGET = 2, /* no reply */ @@ -919,7 +928,7 @@ struct fuse_notify_inval_inode_out { struct fuse_notify_inval_entry_out { uint64_t parent; uint32_t namelen; - uint32_t padding; + uint32_t flags; };
struct fuse_notify_delete_out {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Dharmendra Singh dsingh@ddn.com
[ Upstream commit 153524053bbb0d27bb2e0be36d1b46862e9ce74c ]
In general, as of now, direct writes on the same file in FUSE are serialized over the inode lock, i.e. we hold the inode lock for the full duration of the write request. I could not find a comment in the fuse code or git history that clearly explains why this exclusive lock is taken for direct writes. The following might be the reasons for acquiring an exclusive lock, though they are not necessarily the only ones:
1) Our guess is some USER space fuse implementations might be relying on this lock for serialization.
2) The lock protects against file read/write size races.
3) Ruling out any issues arising from partial write failures.
This patch relaxes the exclusive lock for direct non-extending writes only. File size extending writes might not need the lock either, but we are not entirely sure whether relaxing it could introduce a regression. Furthermore, benchmarking with fio does not show a difference between patch versions that take a) an exclusive lock and b) a shared lock on file size extension.
A possible example of an issue with i_size extending writes is write error cases. Some writes might succeed and others might fail for file system internal reasons - for example ENOSPC. With parallel file size extending writes it _might_ be difficult to revert the action of the failing write, especially to restore the right i_size.
With these changes, we allow non-extending parallel direct writes on the same file with the help of a flag called FOPEN_PARALLEL_DIRECT_WRITES. If this flag is set on the file (the flag is passed from libfuse to the fuse kernel as part of file open/create), we no longer take the exclusive lock, but instead use a shared lock that allows non-extending writes to run in parallel. FUSE implementations which rely on this inode lock for serialization can continue to do so and serialized direct writes are still the default. Implementations that do not need write serialization must be updated to set the FOPEN_PARALLEL_DIRECT_WRITES flag in their file open/create reply.
On patch review there were concerns that network file systems (or vfs multiple mounts of the same file system) might have issues with parallel writes. We believe this is not the case, as this is just a local lock, which network file systems could not rely on anyway. I.e. this lock is just for local consistency.
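As a rough, hypothetical server-side sketch (not part of this patch): a filesystem that does not depend on the kernel's per-inode lock for write serialization could opt in by setting the new bit in its OPEN/CREATE reply. The field and flag names come from the uapi change below; the reply framing itself is assumed to be handled elsewhere in the server:

#include <stdint.h>
#include <string.h>
#include <linux/fuse.h>

/* Sketch: fill a FUSE open reply, opting in to parallel direct writes when
 * the server does its own (or needs no) write serialization.
 */
static void fill_open_reply(struct fuse_open_out *outarg, uint64_t fh,
			    int server_serializes_writes)
{
	memset(outarg, 0, sizeof(*outarg));
	outarg->fh = fh;
	outarg->open_flags = FOPEN_DIRECT_IO;
	if (!server_serializes_writes)
		outarg->open_flags |= FOPEN_PARALLEL_DIRECT_WRITES;
}

Servers that keep relying on the inode lock simply leave the bit clear and get the old, serialized behaviour.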
Signed-off-by: Dharmendra Singh dsingh@ddn.com Signed-off-by: Bernd Schubert bschubert@ddn.com Signed-off-by: Miklos Szeredi mszeredi@redhat.com Stable-dep-of: 3002240d1649 ("fuse: fix memory leak in fuse_create_open") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/fuse/file.c | 43 ++++++++++++++++++++++++++++++++++++--- include/uapi/linux/fuse.h | 3 +++ 2 files changed, 43 insertions(+), 3 deletions(-)
diff --git a/fs/fuse/file.c b/fs/fuse/file.c index e6ec4338a9c5..0df1311afb87 100644 --- a/fs/fuse/file.c +++ b/fs/fuse/file.c @@ -1563,14 +1563,47 @@ static ssize_t fuse_direct_read_iter(struct kiocb *iocb, struct iov_iter *to) return res; }
+static bool fuse_direct_write_extending_i_size(struct kiocb *iocb, + struct iov_iter *iter) +{ + struct inode *inode = file_inode(iocb->ki_filp); + + return iocb->ki_pos + iov_iter_count(iter) > i_size_read(inode); +} + static ssize_t fuse_direct_write_iter(struct kiocb *iocb, struct iov_iter *from) { struct inode *inode = file_inode(iocb->ki_filp); + struct file *file = iocb->ki_filp; + struct fuse_file *ff = file->private_data; struct fuse_io_priv io = FUSE_IO_PRIV_SYNC(iocb); ssize_t res; + bool exclusive_lock = + !(ff->open_flags & FOPEN_PARALLEL_DIRECT_WRITES) || + iocb->ki_flags & IOCB_APPEND || + fuse_direct_write_extending_i_size(iocb, from); + + /* + * Take exclusive lock if + * - Parallel direct writes are disabled - a user space decision + * - Parallel direct writes are enabled and i_size is being extended. + * This might not be needed at all, but needs further investigation. + */ + if (exclusive_lock) + inode_lock(inode); + else { + inode_lock_shared(inode); + + /* A race with truncate might have come up as the decision for + * the lock type was done without holding the lock, check again. + */ + if (fuse_direct_write_extending_i_size(iocb, from)) { + inode_unlock_shared(inode); + inode_lock(inode); + exclusive_lock = true; + } + }
- /* Don't allow parallel writes to the same file */ - inode_lock(inode); res = generic_write_checks(iocb, from); if (res > 0) { if (!is_sync_kiocb(iocb) && iocb->ki_flags & IOCB_DIRECT) { @@ -1581,7 +1614,10 @@ static ssize_t fuse_direct_write_iter(struct kiocb *iocb, struct iov_iter *from) fuse_write_update_attr(inode, iocb->ki_pos, res); } } - inode_unlock(inode); + if (exclusive_lock) + inode_unlock(inode); + else + inode_unlock_shared(inode);
return res; } @@ -2937,6 +2973,7 @@ fuse_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
if (iov_iter_rw(iter) == WRITE) { fuse_write_update_attr(inode, pos, ret); + /* For extending writes we already hold exclusive lock */ if (ret < 0 && offset + count > i_size) fuse_do_truncate(file); } diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h index 39cfb343faa8..e3c54109bae9 100644 --- a/include/uapi/linux/fuse.h +++ b/include/uapi/linux/fuse.h @@ -200,6 +200,7 @@ * * 7.38 * - add FUSE_EXPIRE_ONLY flag to fuse_notify_inval_entry + * - add FOPEN_PARALLEL_DIRECT_WRITES */
#ifndef _LINUX_FUSE_H @@ -307,6 +308,7 @@ struct fuse_file_lock { * FOPEN_CACHE_DIR: allow caching this directory * FOPEN_STREAM: the file is stream-like (no file position at all) * FOPEN_NOFLUSH: don't flush data cache on close (unless FUSE_WRITEBACK_CACHE) + * FOPEN_PARALLEL_DIRECT_WRITES: Allow concurrent direct writes on the same inode */ #define FOPEN_DIRECT_IO (1 << 0) #define FOPEN_KEEP_CACHE (1 << 1) @@ -314,6 +316,7 @@ struct fuse_file_lock { #define FOPEN_CACHE_DIR (1 << 3) #define FOPEN_STREAM (1 << 4) #define FOPEN_NOFLUSH (1 << 5) +#define FOPEN_PARALLEL_DIRECT_WRITES (1 << 6)
/** * INIT request/reply flags
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miklos Szeredi mszeredi@redhat.com
[ Upstream commit 15d937d7ca8c55d2b0ce9116e20c780fdd0b67cc ]
Will need to add supplementary groups to create messages, so add the general concept of a request extension. A request extension is appended to the end of the main request. It has a header indicating the size and type of the extension.
The create security context (fuse_secctx_*) is similar to the generic request extension, so include that as well in a backward compatible manner.
Add the total extension length to the request header. The offset of the extension block within the request can be calculated by:
inh->len - inh->total_extlen * 8
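Spelled out as code, the lookup is only pointer arithmetic on the request header. The helper below is illustrative (it is not in the patch) and assumes headers that already carry the total_extlen field added below:

#include <stddef.h>
#include <stdint.h>
#include <linux/fuse.h>

/* Sketch: return a pointer to the extension block appended to a request.
 * total_extlen is in 8-byte units; len is the whole request length in bytes.
 */
static const void *fuse_req_ext_block(const struct fuse_in_header *inh,
				      size_t *ext_bytes)
{
	size_t extlen = (size_t)inh->total_extlen * 8;

	*ext_bytes = 0;
	if (extlen == 0 || inh->len < sizeof(*inh) + extlen)
		return NULL;		/* no (or malformed) extension block */

	*ext_bytes = extlen;
	return (const char *)inh + inh->len - extlen;
}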
Signed-off-by: Miklos Szeredi mszeredi@redhat.com Stable-dep-of: 3002240d1649 ("fuse: fix memory leak in fuse_create_open") Signed-off-by: Sasha Levin sashal@kernel.org --- fs/fuse/dev.c | 2 ++ fs/fuse/dir.c | 66 ++++++++++++++++++++++----------------- fs/fuse/fuse_i.h | 6 ++-- include/uapi/linux/fuse.h | 28 ++++++++++++++++- 4 files changed, 71 insertions(+), 31 deletions(-)
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c index 61bef919c042..7e0d4f08a0cf 100644 --- a/fs/fuse/dev.c +++ b/fs/fuse/dev.c @@ -476,6 +476,8 @@ static void fuse_args_to_req(struct fuse_req *req, struct fuse_args *args) req->in.h.opcode = args->opcode; req->in.h.nodeid = args->nodeid; req->args = args; + if (args->is_ext) + req->in.h.total_extlen = args->in_args[args->ext_idx].size / 8; if (args->end) __set_bit(FR_ASYNC, &req->flags); } diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c index 8474003aa54d..3b7887312ac0 100644 --- a/fs/fuse/dir.c +++ b/fs/fuse/dir.c @@ -470,7 +470,7 @@ static struct dentry *fuse_lookup(struct inode *dir, struct dentry *entry, }
static int get_security_context(struct dentry *entry, umode_t mode, - void **security_ctx, u32 *security_ctxlen) + struct fuse_in_arg *ext) { struct fuse_secctx *fctx; struct fuse_secctx_header *header; @@ -517,14 +517,42 @@ static int get_security_context(struct dentry *entry, umode_t mode,
memcpy(ptr, ctx, ctxlen); } - *security_ctxlen = total_len; - *security_ctx = header; + ext->size = total_len; + ext->value = header; err = 0; out_err: kfree(ctx); return err; }
+static int get_create_ext(struct fuse_args *args, struct dentry *dentry, + umode_t mode) +{ + struct fuse_conn *fc = get_fuse_conn_super(dentry->d_sb); + struct fuse_in_arg ext = { .size = 0, .value = NULL }; + int err = 0; + + if (fc->init_security) + err = get_security_context(dentry, mode, &ext); + + if (!err && ext.size) { + WARN_ON(args->in_numargs >= ARRAY_SIZE(args->in_args)); + args->is_ext = true; + args->ext_idx = args->in_numargs++; + args->in_args[args->ext_idx] = ext; + } else { + kfree(ext.value); + } + + return err; +} + +static void free_ext_value(struct fuse_args *args) +{ + if (args->is_ext) + kfree(args->in_args[args->ext_idx].value); +} + /* * Atomic create+open operation * @@ -545,8 +573,6 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry, struct fuse_entry_out outentry; struct fuse_inode *fi; struct fuse_file *ff; - void *security_ctx = NULL; - u32 security_ctxlen; bool trunc = flags & O_TRUNC;
/* Userspace expects S_IFREG in create mode */ @@ -590,19 +616,12 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry, args.out_args[1].size = sizeof(outopen); args.out_args[1].value = &outopen;
- if (fm->fc->init_security) { - err = get_security_context(entry, mode, &security_ctx, - &security_ctxlen); - if (err) - goto out_put_forget_req; - - args.in_numargs = 3; - args.in_args[2].size = security_ctxlen; - args.in_args[2].value = security_ctx; - } + err = get_create_ext(&args, entry, mode); + if (err) + goto out_put_forget_req;
err = fuse_simple_request(fm, &args); - kfree(security_ctx); + free_ext_value(&args); if (err) goto out_free_ff;
@@ -709,8 +728,6 @@ static int create_new_entry(struct fuse_mount *fm, struct fuse_args *args, struct dentry *d; int err; struct fuse_forget_link *forget; - void *security_ctx = NULL; - u32 security_ctxlen;
if (fuse_is_bad(dir)) return -EIO; @@ -725,21 +742,14 @@ static int create_new_entry(struct fuse_mount *fm, struct fuse_args *args, args->out_args[0].size = sizeof(outarg); args->out_args[0].value = &outarg;
- if (fm->fc->init_security && args->opcode != FUSE_LINK) { - err = get_security_context(entry, mode, &security_ctx, - &security_ctxlen); + if (args->opcode != FUSE_LINK) { + err = get_create_ext(args, entry, mode); if (err) goto out_put_forget_req; - - BUG_ON(args->in_numargs != 2); - - args->in_numargs = 3; - args->in_args[2].size = security_ctxlen; - args->in_args[2].value = security_ctx; }
err = fuse_simple_request(fm, args); - kfree(security_ctx); + free_ext_value(args); if (err) goto out_put_forget_req;
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h index cb464e5b171a..6c3ec70c1b70 100644 --- a/fs/fuse/fuse_i.h +++ b/fs/fuse/fuse_i.h @@ -264,8 +264,9 @@ struct fuse_page_desc { struct fuse_args { uint64_t nodeid; uint32_t opcode; - unsigned short in_numargs; - unsigned short out_numargs; + uint8_t in_numargs; + uint8_t out_numargs; + uint8_t ext_idx; bool force:1; bool noreply:1; bool nocreds:1; @@ -276,6 +277,7 @@ struct fuse_args { bool page_zeroing:1; bool page_replace:1; bool may_block:1; + bool is_ext:1; struct fuse_in_arg in_args[3]; struct fuse_arg out_args[2]; void (*end)(struct fuse_mount *fm, struct fuse_args *args, int error); diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h index e3c54109bae9..c71f12429e3d 100644 --- a/include/uapi/linux/fuse.h +++ b/include/uapi/linux/fuse.h @@ -201,6 +201,9 @@ * 7.38 * - add FUSE_EXPIRE_ONLY flag to fuse_notify_inval_entry * - add FOPEN_PARALLEL_DIRECT_WRITES + * - add total_extlen to fuse_in_header + * - add FUSE_MAX_NR_SECCTX + * - add extension header */
#ifndef _LINUX_FUSE_H @@ -503,6 +506,15 @@ struct fuse_file_lock { */ #define FUSE_EXPIRE_ONLY (1 << 0)
+/** + * extension type + * FUSE_MAX_NR_SECCTX: maximum value of &fuse_secctx_header.nr_secctx + */ +enum fuse_ext_type { + /* Types 0..31 are reserved for fuse_secctx_header */ + FUSE_MAX_NR_SECCTX = 31, +}; + enum fuse_opcode { FUSE_LOOKUP = 1, FUSE_FORGET = 2, /* no reply */ @@ -886,7 +898,8 @@ struct fuse_in_header { uint32_t uid; uint32_t gid; uint32_t pid; - uint32_t padding; + uint16_t total_extlen; /* length of extensions in 8byte units */ + uint16_t padding; };
struct fuse_out_header { @@ -1047,4 +1060,17 @@ struct fuse_secctx_header { uint32_t nr_secctx; };
+/** + * struct fuse_ext_header - extension header + * @size: total size of this extension including this header + * @type: type of extension + * + * This is made compatible with fuse_secctx_header by using type values > + * FUSE_MAX_NR_SECCTX + */ +struct fuse_ext_header { + uint32_t size; + uint32_t type; +}; + #endif /* _LINUX_FUSE_H */
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: yangyun yangyun50@huawei.com
[ Upstream commit 3002240d16494d798add0575e8ba1f284258ab34 ]
The memory of struct fuse_file is allocated but not freed when get_create_ext() returns an error.
Fixes: 3e2b6fdbdc9a ("fuse: send security context of inode on file") Cc: stable@vger.kernel.org # v5.17 Signed-off-by: yangyun yangyun50@huawei.com Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- fs/fuse/dir.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c index 3b7887312ac0..aa2be4c1ea8f 100644 --- a/fs/fuse/dir.c +++ b/fs/fuse/dir.c @@ -618,7 +618,7 @@ static int fuse_create_open(struct inode *dir, struct dentry *entry,
err = get_create_ext(&args, entry, mode); if (err) - goto out_put_forget_req; + goto out_free_ff;
err = fuse_simple_request(fm, &args); free_ext_value(&args);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Zijlstra peterz@infradead.org
[ Upstream commit fbfdec9989e69e0b17aa3bf32fcb22d04cc33301 ]
Instead of mucking about with at least 2 different ways of fudging it, do the same thing we do for pte_t.
Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/20221022114424.580310787%40infradead.org Stable-dep-of: 71c186efc1b2 ("userfaultfd: fix checks for huge PMDs") Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/include/asm/pgtable-3level.h | 42 +++++++-------------- arch/x86/include/asm/pgtable-3level_types.h | 7 ++++ arch/x86/include/asm/pgtable_64_types.h | 1 + arch/x86/include/asm/pgtable_types.h | 4 +- 4 files changed, 23 insertions(+), 31 deletions(-)
diff --git a/arch/x86/include/asm/pgtable-3level.h b/arch/x86/include/asm/pgtable-3level.h index 28421a887209..28556d22feb8 100644 --- a/arch/x86/include/asm/pgtable-3level.h +++ b/arch/x86/include/asm/pgtable-3level.h @@ -87,7 +87,7 @@ static inline pmd_t pmd_read_atomic(pmd_t *pmdp) ret |= ((pmdval_t)*(tmp + 1)) << 32; }
- return (pmd_t) { ret }; + return (pmd_t) { .pmd = ret }; }
static inline void native_set_pte_atomic(pte_t *ptep, pte_t pte) @@ -121,12 +121,11 @@ static inline void native_pte_clear(struct mm_struct *mm, unsigned long addr, ptep->pte_high = 0; }
-static inline void native_pmd_clear(pmd_t *pmd) +static inline void native_pmd_clear(pmd_t *pmdp) { - u32 *tmp = (u32 *)pmd; - *tmp = 0; + pmdp->pmd_low = 0; smp_wmb(); - *(tmp + 1) = 0; + pmdp->pmd_high = 0; }
static inline void native_pud_clear(pud_t *pudp) @@ -162,25 +161,17 @@ static inline pte_t native_ptep_get_and_clear(pte_t *ptep) #define native_ptep_get_and_clear(xp) native_local_ptep_get_and_clear(xp) #endif
-union split_pmd { - struct { - u32 pmd_low; - u32 pmd_high; - }; - pmd_t pmd; -}; - #ifdef CONFIG_SMP static inline pmd_t native_pmdp_get_and_clear(pmd_t *pmdp) { - union split_pmd res, *orig = (union split_pmd *)pmdp; + pmd_t res;
/* xchg acts as a barrier before setting of the high bits */ - res.pmd_low = xchg(&orig->pmd_low, 0); - res.pmd_high = orig->pmd_high; - orig->pmd_high = 0; + res.pmd_low = xchg(&pmdp->pmd_low, 0); + res.pmd_high = READ_ONCE(pmdp->pmd_high); + WRITE_ONCE(pmdp->pmd_high, 0);
- return res.pmd; + return res; } #else #define native_pmdp_get_and_clear(xp) native_local_pmdp_get_and_clear(xp) @@ -199,17 +190,12 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma, * anybody. */ if (!(pmd_val(pmd) & _PAGE_PRESENT)) { - union split_pmd old, new, *ptr; - - ptr = (union split_pmd *)pmdp; - - new.pmd = pmd; - /* xchg acts as a barrier before setting of the high bits */ - old.pmd_low = xchg(&ptr->pmd_low, new.pmd_low); - old.pmd_high = ptr->pmd_high; - ptr->pmd_high = new.pmd_high; - return old.pmd; + old.pmd_low = xchg(&pmdp->pmd_low, pmd.pmd_low); + old.pmd_high = READ_ONCE(pmdp->pmd_high); + WRITE_ONCE(pmdp->pmd_high, pmd.pmd_high); + + return old; }
do { diff --git a/arch/x86/include/asm/pgtable-3level_types.h b/arch/x86/include/asm/pgtable-3level_types.h index 56baf43befb4..80911349519e 100644 --- a/arch/x86/include/asm/pgtable-3level_types.h +++ b/arch/x86/include/asm/pgtable-3level_types.h @@ -18,6 +18,13 @@ typedef union { }; pteval_t pte; } pte_t; + +typedef union { + struct { + unsigned long pmd_low, pmd_high; + }; + pmdval_t pmd; +} pmd_t; #endif /* !__ASSEMBLY__ */
#define SHARED_KERNEL_PMD (!static_cpu_has(X86_FEATURE_PTI)) diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h index 6c7f7c526450..4ea3755f2444 100644 --- a/arch/x86/include/asm/pgtable_64_types.h +++ b/arch/x86/include/asm/pgtable_64_types.h @@ -19,6 +19,7 @@ typedef unsigned long pgdval_t; typedef unsigned long pgprotval_t;
typedef struct { pteval_t pte; } pte_t; +typedef struct { pmdval_t pmd; } pmd_t;
#ifdef CONFIG_X86_5LEVEL extern unsigned int __pgtable_l5_enabled; diff --git a/arch/x86/include/asm/pgtable_types.h b/arch/x86/include/asm/pgtable_types.h index e3028373f0b4..d0e9654d7272 100644 --- a/arch/x86/include/asm/pgtable_types.h +++ b/arch/x86/include/asm/pgtable_types.h @@ -363,11 +363,9 @@ static inline pudval_t native_pud_val(pud_t pud) #endif
#if CONFIG_PGTABLE_LEVELS > 2 -typedef struct { pmdval_t pmd; } pmd_t; - static inline pmd_t native_make_pmd(pmdval_t val) { - return (pmd_t) { val }; + return (pmd_t) { .pmd = val }; }
static inline pmdval_t native_pmd_val(pmd_t pmd)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Zijlstra peterz@infradead.org
[ Upstream commit 024d232ae4fcd7a7ce8ea239607d6c1246d7adc8 ]
AFAICT there's no reason to do anything different than what we do for PTEs. Make it so (also affects SH).
Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/20221022114424.711181252%40infradead.org Stable-dep-of: 71c186efc1b2 ("userfaultfd: fix checks for huge PMDs") Signed-off-by: Sasha Levin sashal@kernel.org --- arch/x86/include/asm/pgtable-3level.h | 56 --------------------------- include/linux/pgtable.h | 47 +++++++++++++++++----- 2 files changed, 37 insertions(+), 66 deletions(-)
diff --git a/arch/x86/include/asm/pgtable-3level.h b/arch/x86/include/asm/pgtable-3level.h index 28556d22feb8..94f50b0100a5 100644 --- a/arch/x86/include/asm/pgtable-3level.h +++ b/arch/x86/include/asm/pgtable-3level.h @@ -34,62 +34,6 @@ static inline void native_set_pte(pte_t *ptep, pte_t pte) ptep->pte_low = pte.pte_low; }
-#define pmd_read_atomic pmd_read_atomic -/* - * pte_offset_map_lock() on 32-bit PAE kernels was reading the pmd_t with - * a "*pmdp" dereference done by GCC. Problem is, in certain places - * where pte_offset_map_lock() is called, concurrent page faults are - * allowed, if the mmap_lock is hold for reading. An example is mincore - * vs page faults vs MADV_DONTNEED. On the page fault side - * pmd_populate() rightfully does a set_64bit(), but if we're reading the - * pmd_t with a "*pmdp" on the mincore side, a SMP race can happen - * because GCC will not read the 64-bit value of the pmd atomically. - * - * To fix this all places running pte_offset_map_lock() while holding the - * mmap_lock in read mode, shall read the pmdp pointer using this - * function to know if the pmd is null or not, and in turn to know if - * they can run pte_offset_map_lock() or pmd_trans_huge() or other pmd - * operations. - * - * Without THP if the mmap_lock is held for reading, the pmd can only - * transition from null to not null while pmd_read_atomic() runs. So - * we can always return atomic pmd values with this function. - * - * With THP if the mmap_lock is held for reading, the pmd can become - * trans_huge or none or point to a pte (and in turn become "stable") - * at any time under pmd_read_atomic(). We could read it truly - * atomically here with an atomic64_read() for the THP enabled case (and - * it would be a whole lot simpler), but to avoid using cmpxchg8b we - * only return an atomic pmdval if the low part of the pmdval is later - * found to be stable (i.e. pointing to a pte). We are also returning a - * 'none' (zero) pmdval if the low part of the pmd is zero. - * - * In some cases the high and low part of the pmdval returned may not be - * consistent if THP is enabled (the low part may point to previously - * mapped hugepage, while the high part may point to a more recently - * mapped hugepage), but pmd_none_or_trans_huge_or_clear_bad() only - * needs the low part of the pmd to be read atomically to decide if the - * pmd is unstable or not, with the only exception when the low part - * of the pmd is zero, in which case we return a 'none' pmd. - */ -static inline pmd_t pmd_read_atomic(pmd_t *pmdp) -{ - pmdval_t ret; - u32 *tmp = (u32 *)pmdp; - - ret = (pmdval_t) (*tmp); - if (ret) { - /* - * If the low part is null, we must not read the high part - * or we can end up with a partial pmd. - */ - smp_rmb(); - ret |= ((pmdval_t)*(tmp + 1)) << 32; - } - - return (pmd_t) { .pmd = ret }; -} - static inline void native_set_pte_atomic(pte_t *ptep, pte_t pte) { set_64bit((unsigned long long *)(ptep), native_pte_val(pte)); diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 5f0d7d0b9471..8f31e2ff6b58 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -316,6 +316,13 @@ static inline pte_t ptep_get(pte_t *ptep) } #endif
+#ifndef __HAVE_ARCH_PMDP_GET +static inline pmd_t pmdp_get(pmd_t *pmdp) +{ + return READ_ONCE(*pmdp); +} +#endif + #ifdef CONFIG_GUP_GET_PTE_LOW_HIGH /* * WARNING: only to be used in the get_user_pages_fast() implementation. @@ -361,15 +368,42 @@ static inline pte_t ptep_get_lockless(pte_t *ptep)
return pte; } -#else /* CONFIG_GUP_GET_PTE_LOW_HIGH */ +#define ptep_get_lockless ptep_get_lockless + +#if CONFIG_PGTABLE_LEVELS > 2 +static inline pmd_t pmdp_get_lockless(pmd_t *pmdp) +{ + pmd_t pmd; + + do { + pmd.pmd_low = pmdp->pmd_low; + smp_rmb(); + pmd.pmd_high = pmdp->pmd_high; + smp_rmb(); + } while (unlikely(pmd.pmd_low != pmdp->pmd_low)); + + return pmd; +} +#define pmdp_get_lockless pmdp_get_lockless +#endif /* CONFIG_PGTABLE_LEVELS > 2 */ +#endif /* CONFIG_GUP_GET_PTE_LOW_HIGH */ + /* * We require that the PTE can be read atomically. */ +#ifndef ptep_get_lockless static inline pte_t ptep_get_lockless(pte_t *ptep) { return ptep_get(ptep); } -#endif /* CONFIG_GUP_GET_PTE_LOW_HIGH */ +#endif + +#ifndef pmdp_get_lockless +static inline pmd_t pmdp_get_lockless(pmd_t *pmdp) +{ + return pmdp_get(pmdp); +} +#endif
#ifdef CONFIG_TRANSPARENT_HUGEPAGE #ifndef __HAVE_ARCH_PMDP_HUGE_GET_AND_CLEAR @@ -1339,17 +1373,10 @@ static inline int pud_trans_unstable(pud_t *pud) #endif }
-#ifndef pmd_read_atomic static inline pmd_t pmd_read_atomic(pmd_t *pmdp) { - /* - * Depend on compiler for an atomic pmd read. NOTE: this is - * only going to work, if the pmdval_t isn't larger than - * an unsigned long. - */ - return *pmdp; + return pmdp_get_lockless(pmdp); } -#endif
#ifndef arch_needs_pgtable_deposit #define arch_needs_pgtable_deposit() (false)
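For readers without the PAE context, the low/high read loop above can be mimicked in user space with C11 atomics. This is only an analogue of the technique (re-read the low half to validate the pair), not kernel code, and it assumes a writer-side protocol like the kernel's (low half cleared first, or published last behind a write barrier):

#include <stdatomic.h>
#include <stdint.h>

/* Sketch: lockless read of a 64-bit value stored as two 32-bit halves. */
struct split64 {
	_Atomic uint32_t low;
	_Atomic uint32_t high;
};

static uint64_t split64_read_lockless(const struct split64 *v)
{
	uint32_t lo, hi;

	do {
		/* acquire loads keep the low/high/recheck order, much like
		 * the smp_rmb() pairs in pmdp_get_lockless() */
		lo = atomic_load_explicit(&v->low, memory_order_acquire);
		hi = atomic_load_explicit(&v->high, memory_order_acquire);
	} while (lo != atomic_load_explicit(&v->low, memory_order_acquire));

	return ((uint64_t)hi << 32) | lo;
}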
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peter Zijlstra peterz@infradead.org
[ Upstream commit dab6e717429e5ec795d558a0e9a5337a1ed33a3d ]
There's no point in having the identical routines for PTE/PMD have different names.
Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Link: https://lkml.kernel.org/r/20221022114424.841277397%40infradead.org Stable-dep-of: 71c186efc1b2 ("userfaultfd: fix checks for huge PMDs") Signed-off-by: Sasha Levin sashal@kernel.org --- include/linux/pgtable.h | 9 ++------- mm/hmm.c | 2 +- mm/khugepaged.c | 2 +- mm/mapping_dirty_helpers.c | 2 +- mm/mprotect.c | 2 +- mm/userfaultfd.c | 2 +- mm/vmscan.c | 4 ++-- 7 files changed, 9 insertions(+), 14 deletions(-)
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 8f31e2ff6b58..3e3c00e80b65 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -1373,11 +1373,6 @@ static inline int pud_trans_unstable(pud_t *pud) #endif }
-static inline pmd_t pmd_read_atomic(pmd_t *pmdp) -{ - return pmdp_get_lockless(pmdp); -} - #ifndef arch_needs_pgtable_deposit #define arch_needs_pgtable_deposit() (false) #endif @@ -1404,13 +1399,13 @@ static inline pmd_t pmd_read_atomic(pmd_t *pmdp) */ static inline int pmd_none_or_trans_huge_or_clear_bad(pmd_t *pmd) { - pmd_t pmdval = pmd_read_atomic(pmd); + pmd_t pmdval = pmdp_get_lockless(pmd); /* * The barrier will stabilize the pmdval in a register or on * the stack so that it will stop changing under the code. * * When CONFIG_TRANSPARENT_HUGEPAGE=y on x86 32bit PAE, - * pmd_read_atomic is allowed to return a not atomic pmdval + * pmdp_get_lockless is allowed to return a not atomic pmdval * (for example pointing to an hugepage that has never been * mapped in the pmd). The below checks will only care about * the low part of the pmd with 32bit PAE x86 anyway, with the diff --git a/mm/hmm.c b/mm/hmm.c index 3850fb625dda..39cf50de76d7 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -361,7 +361,7 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp, * huge or device mapping one and compute corresponding pfn * values. */ - pmd = pmd_read_atomic(pmdp); + pmd = pmdp_get_lockless(pmdp); barrier(); if (!pmd_devmap(pmd) && !pmd_trans_huge(pmd)) goto again; diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 085fca1fa27a..47010c3b5c4d 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -866,7 +866,7 @@ static int find_pmd_or_thp_or_none(struct mm_struct *mm, if (!*pmd) return SCAN_PMD_NULL;
- pmde = pmd_read_atomic(*pmd); + pmde = pmdp_get_lockless(*pmd);
#ifdef CONFIG_TRANSPARENT_HUGEPAGE /* See comments in pmd_none_or_trans_huge_or_clear_bad() */ diff --git a/mm/mapping_dirty_helpers.c b/mm/mapping_dirty_helpers.c index 1b0ab8fcfd8b..175e424b9ab1 100644 --- a/mm/mapping_dirty_helpers.c +++ b/mm/mapping_dirty_helpers.c @@ -126,7 +126,7 @@ static int clean_record_pte(pte_t *pte, unsigned long addr, static int wp_clean_pmd_entry(pmd_t *pmd, unsigned long addr, unsigned long end, struct mm_walk *walk) { - pmd_t pmdval = pmd_read_atomic(pmd); + pmd_t pmdval = pmdp_get_lockless(pmd);
if (!pmd_trans_unstable(&pmdval)) return 0; diff --git a/mm/mprotect.c b/mm/mprotect.c index 668bfaa6ed2a..f006bafe338f 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -294,7 +294,7 @@ static unsigned long change_pte_range(struct mmu_gather *tlb, */ static inline int pmd_none_or_clear_bad_unless_trans_huge(pmd_t *pmd) { - pmd_t pmdval = pmd_read_atomic(pmd); + pmd_t pmdval = pmdp_get_lockless(pmd);
/* See pmd_none_or_trans_huge_or_clear_bad for info on barrier */ #ifdef CONFIG_TRANSPARENT_HUGEPAGE diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index 992a0a16846f..5d873aadec76 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -641,7 +641,7 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm, break; }
- dst_pmdval = pmd_read_atomic(dst_pmd); + dst_pmdval = pmdp_get_lockless(dst_pmd); /* * If the dst_pmd is mapped as THP don't * override it and just be strict. diff --git a/mm/vmscan.c b/mm/vmscan.c index 4cd0cbf9c121..f5fa1c76d9e6 100644 --- a/mm/vmscan.c +++ b/mm/vmscan.c @@ -4068,9 +4068,9 @@ static void walk_pmd_range(pud_t *pud, unsigned long start, unsigned long end, /* walk_pte_range() may call get_next_vma() */ vma = args->vma; for (i = pmd_index(start), addr = start; addr != end; i++, addr = next) { - pmd_t val = pmd_read_atomic(pmd + i); + pmd_t val = pmdp_get_lockless(pmd + i);
- /* for pmd_read_atomic() */ + /* for pmdp_get_lockless() */ barrier();
next = pmd_addr_end(addr, end);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jann Horn jannh@google.com
[ Upstream commit 71c186efc1b2cf1aeabfeff3b9bd5ac4c5ac14d8 ]
Patch series "userfaultfd: fix races around pmd_trans_huge() check", v2.
The pmd_trans_huge() code in mfill_atomic() is wrong in three different ways depending on kernel version:
1. The pmd_trans_huge() check is racy and can lead to a BUG_ON() (if you hit the right two race windows) - I've tested this in a kernel build with some extra mdelay() calls. See the commit message for a description of the race scenario. On older kernels (before 6.5), I think the same bug can even theoretically lead to accessing transhuge page contents as a page table if you hit the right 5 narrow race windows (I haven't tested this case).

2. As pointed out by Qi Zheng, pmd_trans_huge() is not sufficient for detecting PMDs that don't point to page tables. On older kernels (before 6.5), you'd just have to win a single fairly wide race to hit this. I've tested this on 6.1 stable by racing migration (with a mdelay() patched into try_to_migrate()) against UFFDIO_ZEROPAGE - on my x86 VM, that causes a kernel oops in ptlock_ptr().

3. On newer kernels (>=6.5), for shmem mappings, khugepaged is allowed to yank page tables out from under us (though I haven't tested that), so I think the BUG_ON() checks in mfill_atomic() are just wrong.
I decided to write two separate fixes for these (one fix for bugs 1+2, one fix for bug 3), so that the first fix can be backported to kernels affected by bugs 1+2.
This patch (of 2):
This fixes two issues.
I discovered that the following race can occur:
mfill_atomic                    other thread
============                    ============
                                <zap PMD>
pmdp_get_lockless() [reads none pmd]
<bail if trans_huge>
<if none:>
                                <pagefault creates transhuge zeropage>
__pte_alloc [no-op]
                                <zap PMD>
<bail if pmd_trans_huge(*dst_pmd)>
BUG_ON(pmd_none(*dst_pmd))
I have experimentally verified this in a kernel with extra mdelay() calls; the BUG_ON(pmd_none(*dst_pmd)) triggers.
On kernels newer than commit 0d940a9b270b ("mm/pgtable: allow pte_offset_map[_lock]() to fail"), this can't lead to anything worse than a BUG_ON(), since the page table access helpers are actually designed to deal with page tables concurrently disappearing; but on older kernels (<=6.4), I think we could probably theoretically race past the two BUG_ON() checks and end up treating a hugepage as a page table.
The second issue is that, as Qi Zheng pointed out, there are other types of huge PMDs that pmd_trans_huge() can't catch: devmap PMDs and swap PMDs (in particular, migration PMDs).
On <=6.4, this is worse than the first issue: If mfill_atomic() runs on a PMD that contains a migration entry (which just requires winning a single, fairly wide race), it will pass the PMD to pte_offset_map_lock(), which assumes that the PMD points to a page table.
Breakage follows: First, the kernel tries to take the PTE lock (which will crash or maybe worse if there is no "struct page" for the address bits in the migration entry PMD - I think at least on X86 there usually is no corresponding "struct page" thanks to the PTE inversion mitigation, amd64 looks different).
If that didn't crash, the kernel would next try to write a PTE into what it wrongly thinks is a page table.
As part of fixing these issues, get rid of the check for pmd_trans_huge() before __pte_alloc() - that's redundant, we're going to have to check for that after the __pte_alloc() anyway.
Backport note: pmdp_get_lockless() is pmd_read_atomic() in older kernels.
Link: https://lkml.kernel.org/r/20240813-uffd-thp-flip-fix-v2-0-5efa61078a41@googl... Link: https://lkml.kernel.org/r/20240813-uffd-thp-flip-fix-v2-1-5efa61078a41@googl... Fixes: c1a4de99fada ("userfaultfd: mcopy_atomic|mfill_zeropage: UFFDIO_COPY|UFFDIO_ZEROPAGE preparation") Signed-off-by: Jann Horn jannh@google.com Acked-by: David Hildenbrand david@redhat.com Cc: Andrea Arcangeli aarcange@redhat.com Cc: Hugh Dickins hughd@google.com Cc: Jann Horn jannh@google.com Cc: Pavel Emelyanov xemul@virtuozzo.com Cc: Qi Zheng zhengqi.arch@bytedance.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- mm/userfaultfd.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c index 5d873aadec76..50c01a7eb705 100644 --- a/mm/userfaultfd.c +++ b/mm/userfaultfd.c @@ -642,21 +642,23 @@ static __always_inline ssize_t __mcopy_atomic(struct mm_struct *dst_mm, }
dst_pmdval = pmdp_get_lockless(dst_pmd); - /* - * If the dst_pmd is mapped as THP don't - * override it and just be strict. - */ - if (unlikely(pmd_trans_huge(dst_pmdval))) { - err = -EEXIST; - break; - } if (unlikely(pmd_none(dst_pmdval)) && unlikely(__pte_alloc(dst_mm, dst_pmd))) { err = -ENOMEM; break; } - /* If an huge pmd materialized from under us fail */ - if (unlikely(pmd_trans_huge(*dst_pmd))) { + dst_pmdval = pmdp_get_lockless(dst_pmd); + /* + * If the dst_pmd is THP don't override it and just be strict. + * (This includes the case where the PMD used to be THP and + * changed back to none after __pte_alloc().) + */ + if (unlikely(!pmd_present(dst_pmdval) || pmd_trans_huge(dst_pmdval) || + pmd_devmap(dst_pmdval))) { + err = -EEXIST; + break; + } + if (unlikely(pmd_bad(dst_pmdval))) { err = -EFAULT; break; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Souradeep Chakrabarti schakrabarti@linux.microsoft.com
[ Upstream commit b6ecc662037694488bfff7c9fd21c405df8411f2 ]
Currently napi_disable() gets called during rxq and txq cleanup even before napi has been enabled and the hrtimer has been initialized, which causes a kernel panic.
? page_fault_oops+0x136/0x2b0
? page_counter_cancel+0x2e/0x80
? do_user_addr_fault+0x2f2/0x640
? refill_obj_stock+0xc4/0x110
? exc_page_fault+0x71/0x160
? asm_exc_page_fault+0x27/0x30
? __mmdrop+0x10/0x180
? __mmdrop+0xec/0x180
? hrtimer_active+0xd/0x50
hrtimer_try_to_cancel+0x2c/0xf0
hrtimer_cancel+0x15/0x30
napi_disable+0x65/0x90
mana_destroy_rxq+0x4c/0x2f0
mana_create_rxq.isra.0+0x56c/0x6d0
? mana_uncfg_vport+0x50/0x50
mana_alloc_queues+0x21b/0x320
? skb_dequeue+0x5f/0x80
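The shape of the fix is a plain "only tear down what was actually brought up" guard. A tiny stand-alone C stand-in (names invented for illustration; the real patch tracks this with the new napi_initialized field below):

#include <stdbool.h>
#include <stdlib.h>

/* Sketch: remember whether the per-queue NAPI context was initialized and
 * only disable/delete it on teardown in that case, so cleanup of a queue
 * whose setup never got that far cannot touch uninitialized state.
 */
struct queue {
	void *napi_ctx;		/* stand-in for the NAPI/hrtimer state */
	bool napi_initialized;
};

static int queue_enable_napi(struct queue *q)
{
	q->napi_ctx = malloc(64);	/* stand-in for netif_napi_add() + napi_enable() */
	q->napi_initialized = (q->napi_ctx != NULL);
	return q->napi_initialized ? 0 : -1;
}

static void queue_destroy(struct queue *q)
{
	if (q->napi_initialized) {	/* skip queues that never reached NAPI setup */
		free(q->napi_ctx);	/* stand-in for napi_disable() + netif_napi_del() */
		q->napi_initialized = false;
	}
}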
Cc: stable@vger.kernel.org Fixes: e1b5683ff62e ("net: mana: Move NAPI from EQ to CQ") Signed-off-by: Souradeep Chakrabarti schakrabarti@linux.microsoft.com Reviewed-by: Haiyang Zhang haiyangz@microsoft.com Reviewed-by: Shradha Gupta shradhagupta@linux.microsoft.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/microsoft/mana/mana.h | 2 ++ drivers/net/ethernet/microsoft/mana/mana_en.c | 22 +++++++++++-------- 2 files changed, 15 insertions(+), 9 deletions(-)
diff --git a/drivers/net/ethernet/microsoft/mana/mana.h b/drivers/net/ethernet/microsoft/mana/mana.h index 41c99eabf40a..2b00d6a29117 100644 --- a/drivers/net/ethernet/microsoft/mana/mana.h +++ b/drivers/net/ethernet/microsoft/mana/mana.h @@ -86,6 +86,8 @@ struct mana_txq {
atomic_t pending_sends;
+ bool napi_initialized; + struct mana_stats_tx stats; };
diff --git a/drivers/net/ethernet/microsoft/mana/mana_en.c b/drivers/net/ethernet/microsoft/mana/mana_en.c index e7d1ce68f05e..b52612eef0a6 100644 --- a/drivers/net/ethernet/microsoft/mana/mana_en.c +++ b/drivers/net/ethernet/microsoft/mana/mana_en.c @@ -1391,10 +1391,12 @@ static void mana_destroy_txq(struct mana_port_context *apc)
for (i = 0; i < apc->num_queues; i++) { napi = &apc->tx_qp[i].tx_cq.napi; - napi_synchronize(napi); - napi_disable(napi); - netif_napi_del(napi); - + if (apc->tx_qp[i].txq.napi_initialized) { + napi_synchronize(napi); + napi_disable(napi); + netif_napi_del(napi); + apc->tx_qp[i].txq.napi_initialized = false; + } mana_destroy_wq_obj(apc, GDMA_SQ, apc->tx_qp[i].tx_object);
mana_deinit_cq(apc, &apc->tx_qp[i].tx_cq); @@ -1450,6 +1452,7 @@ static int mana_create_txq(struct mana_port_context *apc, txq->ndev = net; txq->net_txq = netdev_get_tx_queue(net, i); txq->vp_offset = apc->tx_vp_offset; + txq->napi_initialized = false; skb_queue_head_init(&txq->pending_skbs);
memset(&spec, 0, sizeof(spec)); @@ -1514,6 +1517,7 @@ static int mana_create_txq(struct mana_port_context *apc,
netif_napi_add_tx(net, &cq->napi, mana_poll); napi_enable(&cq->napi); + txq->napi_initialized = true;
mana_gd_ring_cq(cq->gdma_cq, SET_ARM_BIT); } @@ -1525,7 +1529,7 @@ static int mana_create_txq(struct mana_port_context *apc, }
static void mana_destroy_rxq(struct mana_port_context *apc, - struct mana_rxq *rxq, bool validate_state) + struct mana_rxq *rxq, bool napi_initialized)
{ struct gdma_context *gc = apc->ac->gdma_dev->gdma_context; @@ -1539,15 +1543,15 @@ static void mana_destroy_rxq(struct mana_port_context *apc,
napi = &rxq->rx_cq.napi;
- if (validate_state) + if (napi_initialized) { napi_synchronize(napi);
- napi_disable(napi); + napi_disable(napi);
+ netif_napi_del(napi); + } xdp_rxq_info_unreg(&rxq->xdp_rxq);
- netif_napi_del(napi); - mana_destroy_wq_obj(apc, GDMA_RQ, rxq->rxobj);
mana_deinit_cq(apc, &rxq->rx_cq);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicholas Piggin npiggin@gmail.com
[ Upstream commit 18e24deb1cc92f2068ce7434a94233741fbd7771 ]
Warn in the case it is called with cpu == -1. This does not appear to happen anywhere.
Signed-off-by: Nicholas Piggin npiggin@gmail.com Reviewed-by: Paul E. McKenney paulmck@kernel.org Signed-off-by: Tejun Heo tj@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/workqueue.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c index f3b6ac232e21..4da8a5e702f8 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -5893,6 +5893,8 @@ notrace void wq_watchdog_touch(int cpu) { if (cpu >= 0) per_cpu(wq_watchdog_touched_cpu, cpu) = jiffies; + else + WARN_ONCE(1, "%s should be called with valid CPU", __func__);
wq_watchdog_touched = jiffies; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicholas Piggin npiggin@gmail.com
[ Upstream commit 98f887f820c993e05a12e8aa816c80b8661d4c87 ]
On a ~2000 CPU powerpc system, hard lockups have been observed in the workqueue code when stop_machine runs (in this case due to CPU hotplug). This is due to lots of CPUs spinning in multi_cpu_stop, calling touch_nmi_watchdog() which ends up calling wq_watchdog_touch(). wq_watchdog_touch() writes to the global variable wq_watchdog_touched, and that can find itself in the same cacheline as other important workqueue data, which slows down operations to the point of lockups.
In the case of the following abridged trace, worker_pool_idr was in the hot line, causing the lockups to always appear at idr_find.
watchdog: CPU 1125 self-detected hard LOCKUP @ idr_find
Call Trace:
  get_work_pool
  __queue_work
  call_timer_fn
  run_timer_softirq
  __do_softirq
  do_softirq_own_stack
  irq_exit
  timer_interrupt
  decrementer_common_virt
  * interrupt: 900 (timer) at multi_cpu_stop
  multi_cpu_stop
  cpu_stopper_thread
  smpboot_thread_fn
  kthread
Fix this by having wq_watchdog_touch() only write to the line if the last time a touch was recorded exceeds 1/4 of the watchdog threshold.
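The same idea can be shown in isolation: skip the store into the shared, hot location while the recorded timestamp is still within a quarter of the timeout. A small user-space analogue using wall-clock seconds in place of jiffies (not the kernel code itself):

#include <stdatomic.h>
#include <time.h>

static _Atomic time_t shared_touched;	/* shared by many threads/CPUs */

/* Sketch: rate-limit writes to a contended timestamp, as wq_watchdog_touch()
 * now does, so frequent touchers mostly only read the cacheline.
 */
static void touch_shared(time_t thresh_secs)
{
	time_t now = time(NULL);
	time_t last = atomic_load_explicit(&shared_touched, memory_order_relaxed);

	if (now > last + thresh_secs / 4)
		atomic_store_explicit(&shared_touched, now, memory_order_relaxed);
}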
Reported-by: Srikar Dronamraju srikar@linux.vnet.ibm.com Signed-off-by: Nicholas Piggin npiggin@gmail.com Reviewed-by: Paul E. McKenney paulmck@kernel.org Signed-off-by: Tejun Heo tj@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- kernel/workqueue.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/kernel/workqueue.c b/kernel/workqueue.c index 4da8a5e702f8..93303148a434 100644 --- a/kernel/workqueue.c +++ b/kernel/workqueue.c @@ -5891,12 +5891,18 @@ static void wq_watchdog_timer_fn(struct timer_list *unused)
notrace void wq_watchdog_touch(int cpu) { + unsigned long thresh = READ_ONCE(wq_watchdog_thresh) * HZ; + unsigned long touch_ts = READ_ONCE(wq_watchdog_touched); + unsigned long now = jiffies; + if (cpu >= 0) - per_cpu(wq_watchdog_touched_cpu, cpu) = jiffies; + per_cpu(wq_watchdog_touched_cpu, cpu) = now; else WARN_ONCE(1, "%s should be called with valid CPU", __func__);
- wq_watchdog_touched = jiffies; + /* Don't unnecessarily store to global cacheline */ + if (time_after(now, touch_ts + thresh / 4)) + WRITE_ONCE(wq_watchdog_touched, jiffies); }
static void wq_watchdog_set_thresh(unsigned long thresh)
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jonathan Cameron Jonathan.Cameron@huawei.com
[ Upstream commit fadf231f0a06a6748a7fc4a2c29ac9ef7bca6bfd ]
Rafael observed [1] that returning 0 from processor_add() will result in acpi_default_enumeration() being called which will attempt to create a platform device, but that makes little sense when the processor is known to be not available. So just return the error code from acpi_processor_get_info() instead.
Link: https://lore.kernel.org/all/CAJZ5v0iKU8ra9jR+EmgxbuNm=Uwx2m1-8vn_RAZ+aCiUVLe... [1] Suggested-by: Rafael J. Wysocki rafael@kernel.org Acked-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Reviewed-by: Gavin Shan gshan@redhat.com Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Link: https://lore.kernel.org/r/20240529133446.28446-5-Jonathan.Cameron@huawei.com Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/acpi_processor.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c index 6737b1cbf6d6..5662c157fda7 100644 --- a/drivers/acpi/acpi_processor.c +++ b/drivers/acpi/acpi_processor.c @@ -373,7 +373,7 @@ static int acpi_processor_add(struct acpi_device *device,
result = acpi_processor_get_info(device); if (result) /* Processor is not physically present or unavailable */ - return 0; + return result;
BUG_ON(pr->id >= nr_cpu_ids);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jonathan Cameron Jonathan.Cameron@huawei.com
[ Upstream commit 47ec9b417ed9b6b8ec2a941cd84d9de62adc358a ]
If acpi_processor_get_info() returned an error, pr and the associated pr->throttling.shared_cpu_map were leaked.
The unwind code was in the wrong order with respect to the setup, relying on some unwind actions having no effect (clearing variables that were never set, etc.). That makes it harder to reason about, so reorder it and add appropriate labels so that only what was actually set up is undone.
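For readers unfamiliar with the convention, the structure being restored is the usual kernel one: unwind labels in strict reverse order of setup, each undoing only work that is known to have completed. A generic stand-alone sketch (names invented; the actual reordering is in the diff below):

#include <stdlib.h>

struct thing {
	void *a, *b, *c;
};

/* Sketch: every error path jumps to the label matching the last step that
 * succeeded, so nothing that was never set up is ever torn down.
 */
static int thing_setup(struct thing *t)
{
	t->a = malloc(32);
	if (!t->a)
		return -1;

	t->b = malloc(32);
	if (!t->b)
		goto err_free_a;

	t->c = malloc(32);
	if (!t->c)
		goto err_free_b;

	return 0;

err_free_b:
	free(t->b);
err_free_a:
	free(t->a);
	return -1;
}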
Acked-by: Rafael J. Wysocki rafael.j.wysocki@intel.com Reviewed-by: Gavin Shan gshan@redhat.com Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Link: https://lore.kernel.org/r/20240529133446.28446-6-Jonathan.Cameron@huawei.com Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/acpi/acpi_processor.c | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-)
diff --git a/drivers/acpi/acpi_processor.c b/drivers/acpi/acpi_processor.c index 5662c157fda7..8bd5c4fa91f2 100644 --- a/drivers/acpi/acpi_processor.c +++ b/drivers/acpi/acpi_processor.c @@ -373,7 +373,7 @@ static int acpi_processor_add(struct acpi_device *device,
result = acpi_processor_get_info(device); if (result) /* Processor is not physically present or unavailable */ - return result; + goto err_clear_driver_data;
BUG_ON(pr->id >= nr_cpu_ids);
@@ -388,7 +388,7 @@ static int acpi_processor_add(struct acpi_device *device, "BIOS reported wrong ACPI id %d for the processor\n", pr->id); /* Give up, but do not abort the namespace scan. */ - goto err; + goto err_clear_driver_data; } /* * processor_device_array is not cleared on errors to allow buggy BIOS @@ -400,12 +400,12 @@ static int acpi_processor_add(struct acpi_device *device, dev = get_cpu_device(pr->id); if (!dev) { result = -ENODEV; - goto err; + goto err_clear_per_cpu; }
result = acpi_bind_one(dev, device); if (result) - goto err; + goto err_clear_per_cpu;
pr->dev = dev;
@@ -416,10 +416,11 @@ static int acpi_processor_add(struct acpi_device *device, dev_err(dev, "Processor driver could not be attached\n"); acpi_unbind_one(dev);
- err: - free_cpumask_var(pr->throttling.shared_cpu_map); - device->driver_data = NULL; + err_clear_per_cpu: per_cpu(processors, pr->id) = NULL; + err_clear_driver_data: + device->driver_data = NULL; + free_cpumask_var(pr->throttling.shared_cpu_map); err_free_pr: kfree(pr); return result;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: James Morse james.morse@arm.com
[ Upstream commit 8d34b6f17b9ac93faa2791eb037dcb08bdf755de ]
ACPI identifies CPUs by UID. get_cpu_for_acpi_id() maps the ACPI UID to the Linux CPU number.
The helper to retrieve this mapping is only available in arm64's NUMA code.
Move it to live next to get_acpi_id_for_cpu().
Signed-off-by: James Morse james.morse@arm.com Reviewed-by: Jonathan Cameron Jonathan.Cameron@huawei.com Reviewed-by: Gavin Shan gshan@redhat.com Tested-by: Miguel Luis miguel.luis@oracle.com Tested-by: Vishnu Pajjuri vishnu@os.amperecomputing.com Tested-by: Jianyong Wu jianyong.wu@arm.com Signed-off-by: Russell King (Oracle) rmk+kernel@armlinux.org.uk Acked-by: Hanjun Guo guohanjun@huawei.com Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Reviewed-by: Lorenzo Pieralisi lpieralisi@kernel.org Link: https://lore.kernel.org/r/20240529133446.28446-12-Jonathan.Cameron@huawei.co... Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/include/asm/acpi.h | 11 +++++++++++ arch/arm64/kernel/acpi_numa.c | 11 ----------- 2 files changed, 11 insertions(+), 11 deletions(-)
diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h index bd68e1b7f29f..0d1da93a5bad 100644 --- a/arch/arm64/include/asm/acpi.h +++ b/arch/arm64/include/asm/acpi.h @@ -97,6 +97,17 @@ static inline u32 get_acpi_id_for_cpu(unsigned int cpu) return acpi_cpu_get_madt_gicc(cpu)->uid; }
+static inline int get_cpu_for_acpi_id(u32 uid) +{ + int cpu; + + for (cpu = 0; cpu < nr_cpu_ids; cpu++) + if (uid == get_acpi_id_for_cpu(cpu)) + return cpu; + + return -EINVAL; +} + static inline void arch_fix_phys_package_id(int num, u32 slot) { } void __init acpi_init_cpus(void); int apei_claim_sea(struct pt_regs *regs); diff --git a/arch/arm64/kernel/acpi_numa.c b/arch/arm64/kernel/acpi_numa.c index ccbff21ce1fa..2465f291c7e1 100644 --- a/arch/arm64/kernel/acpi_numa.c +++ b/arch/arm64/kernel/acpi_numa.c @@ -34,17 +34,6 @@ int __init acpi_numa_get_nid(unsigned int cpu) return acpi_early_node_map[cpu]; }
-static inline int get_cpu_for_acpi_id(u32 uid) -{ - int cpu; - - for (cpu = 0; cpu < nr_cpu_ids; cpu++) - if (uid == get_acpi_id_for_cpu(cpu)) - return cpu; - - return -EINVAL; -} - static int __init acpi_parse_gicc_pxm(union acpi_subtable_headers *header, const unsigned long end) {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jonathan Cameron Jonathan.Cameron@huawei.com
[ Upstream commit 2488444274c70038eb6b686cba5f1ce48ebb9cdd ]
In a review discussion of the changes to support vCPU hotplug, where a check was added on the GICC being enabled if it was online, it was noted that there is a need to map back to the CPU and use that to index into a cpumask. As such, a valid ID is needed.
If an MPIDR check fails in acpi_map_gic_cpu_interface(), it is possible for the entry cpu_madt_gicc[cpu] to be NULL. This function would then cause a NULL pointer dereference. Whilst a path to trigger this has not been established, harden this caller against the possibility.
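A small usage sketch of the hardened helper, as used for the cpumask indexing mentioned above; "uid" and "mask" are hypothetical names, only get_cpu_for_acpi_id() and cpumask_set_cpu() are real:

#include <linux/acpi.h>
#include <linux/cpumask.h>
#include <linux/types.h>

static void example_mark_cpu(u32 uid, struct cpumask *mask)
{
	int cpu = get_cpu_for_acpi_id(uid);

	if (cpu < 0)
		return;	/* no (valid) GICC entry matches this UID */

	cpumask_set_cpu(cpu, mask);
}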
Reviewed-by: Gavin Shan gshan@redhat.com Signed-off-by: Jonathan Cameron Jonathan.Cameron@huawei.com Link: https://lore.kernel.org/r/20240529133446.28446-13-Jonathan.Cameron@huawei.co... Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm64/include/asm/acpi.h | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/include/asm/acpi.h b/arch/arm64/include/asm/acpi.h index 0d1da93a5bad..702587fda70c 100644 --- a/arch/arm64/include/asm/acpi.h +++ b/arch/arm64/include/asm/acpi.h @@ -102,7 +102,8 @@ static inline int get_cpu_for_acpi_id(u32 uid) int cpu;
for (cpu = 0; cpu < nr_cpu_ids; cpu++) - if (uid == get_acpi_id_for_cpu(cpu)) + if (acpi_cpu_get_madt_gicc(cpu) && + uid == get_acpi_id_for_cpu(cpu)) return cpu;
return -EINVAL;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marc Kleine-Budde mkl@pengutronix.de
[ Upstream commit d49184b7b585f9da7ee546b744525f62117019f6 ]
This is a preparation patch.
The sequence of sending the UINC messages and then incrementing the tail pointer will be needed in more than one place by upcoming patches, so factor it out into a separate function.
Also make mcp251xfd_handle_rxif_ring_uinc() safe to be called with a "len" of 0.
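A compressed sketch of that guard; "example_ring" and the function name are stand-ins, the real helper in the hunk below also builds the SPI transfer:

struct example_ring {
	unsigned int tail;
};

static int example_uinc(struct example_ring *ring, unsigned int len)
{
	if (!len)
		return 0;	/* nothing received: no SPI transfer, no tail move */

	/* ... send 'len' FIFO increment (UINC) messages in one transfer ... */

	ring->tail += len;	/* advance the tail by the processed batch */
	return 0;
}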
Tested-by: Stefan Althöfer Stefan.Althoefer@janztec.com Tested-by: Thomas Kopp thomas.kopp@microchip.com Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/can/spi/mcp251xfd/mcp251xfd-rx.c | 48 +++++++++++++------- 1 file changed, 32 insertions(+), 16 deletions(-)
diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-rx.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-rx.c index ced8d9c81f8c..5e2f39de88f3 100644 --- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-rx.c +++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-rx.c @@ -197,6 +197,37 @@ mcp251xfd_rx_obj_read(const struct mcp251xfd_priv *priv, return err; }
+static int +mcp251xfd_handle_rxif_ring_uinc(const struct mcp251xfd_priv *priv, + struct mcp251xfd_rx_ring *ring, + u8 len) +{ + int offset; + int err; + + if (!len) + return 0; + + /* Increment the RX FIFO tail pointer 'len' times in a + * single SPI message. + * + * Note: + * Calculate offset, so that the SPI transfer ends on + * the last message of the uinc_xfer array, which has + * "cs_change == 0", to properly deactivate the chip + * select. + */ + offset = ARRAY_SIZE(ring->uinc_xfer) - len; + err = spi_sync_transfer(priv->spi, + ring->uinc_xfer + offset, len); + if (err) + return err; + + ring->tail += len; + + return 0; +} + static int mcp251xfd_handle_rxif_ring(struct mcp251xfd_priv *priv, struct mcp251xfd_rx_ring *ring) @@ -210,8 +241,6 @@ mcp251xfd_handle_rxif_ring(struct mcp251xfd_priv *priv, return err;
while ((len = mcp251xfd_get_rx_linear_len(ring))) { - int offset; - rx_tail = mcp251xfd_get_rx_tail(ring);
err = mcp251xfd_rx_obj_read(priv, ring, hw_rx_obj, @@ -227,22 +256,9 @@ mcp251xfd_handle_rxif_ring(struct mcp251xfd_priv *priv, return err; }
- /* Increment the RX FIFO tail pointer 'len' times in a - * single SPI message. - * - * Note: - * Calculate offset, so that the SPI transfer ends on - * the last message of the uinc_xfer array, which has - * "cs_change == 0", to properly deactivate the chip - * select. - */ - offset = ARRAY_SIZE(ring->uinc_xfer) - len; - err = spi_sync_transfer(priv->spi, - ring->uinc_xfer + offset, len); + err = mcp251xfd_handle_rxif_ring_uinc(priv, ring, len); if (err) return err; - - ring->tail += len; }
return 0;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marc Kleine-Budde mkl@pengutronix.de
[ Upstream commit 85505e585637a737e4713c1386c30e37c325b82e ]
This is a preparatory patch to work around erratum DS80000789E 6 of the mcp2518fd; the other variants of the chip family (mcp2517fd and mcp251863) are probably also affected.
When handling the RX interrupt, the driver iterates over all pending FIFOs (which are implemented as ring buffers in hardware) and reads the FIFO head index from the RX FIFO STA register of the chip.
In the bad case, the driver reads a head index that is too large. In the original code, the driver always trusted the read value, which caused old CAN frames that were already processed, or new, incompletely written CAN frames, to be (re-)processed.
Instead of reading and trusting the head index, read the head index and calculate the number of CAN frames that were supposedly received - replace mcp251xfd_rx_ring_update() with mcp251xfd_get_rx_len().
The mcp251xfd_handle_rxif_ring() function reads the received CAN frames from the chip, iterates over them and pushes them into the network stack. Prepare for this iteration to be stopped if an old CAN frame is detected. The actual code to detect old or incomplete frames and abort will be added in the next patch.
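A standalone sketch (not driver code) of the index arithmetic that the new mcp251xfd_get_rx_len() relies on, assuming a ring of 16 objects so that obj_num_shift_to_u8 = 8 - ilog2(16) = 4:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
	const unsigned int shift = 4;	/* BITS_PER_TYPE(u8) - ilog2(16) */
	uint8_t chip_head = 2;		/* head index reported in FIFOSTA */
	uint8_t tail = 14;		/* driver-side tail, same 0..15 range */
	uint8_t len;

	/* Shift both indexes up to the full u8 range so the subtraction
	 * wraps consistently, then shift the difference back down.
	 */
	len = (uint8_t)((uint8_t)(chip_head << shift) -
			(uint8_t)(tail << shift)) >> shift;

	printf("len = %u\n", (unsigned int)len);	/* 4: objects 14, 15, 0, 1 */
	return 0;
}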
Link: https://lore.kernel.org/all/BL3PR11MB64844C1C95CA3BDADAE4D8CCFBC99@BL3PR11MB... Reported-by: Stefan Althöfer Stefan.Althoefer@janztec.com Closes: https://lore.kernel.org/all/FR0P281MB1966273C216630B120ABB6E197E89@FR0P281MB... Tested-by: Stefan Althöfer Stefan.Althoefer@janztec.com Tested-by: Thomas Kopp thomas.kopp@microchip.com Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/can/spi/mcp251xfd/mcp251xfd-ring.c | 2 + drivers/net/can/spi/mcp251xfd/mcp251xfd-rx.c | 89 +++++++++++-------- drivers/net/can/spi/mcp251xfd/mcp251xfd.h | 12 +-- 3 files changed, 56 insertions(+), 47 deletions(-)
diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-ring.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-ring.c index 915d505a304f..5ed0cd62f4f8 100644 --- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-ring.c +++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-ring.c @@ -513,6 +513,8 @@ int mcp251xfd_ring_alloc(struct mcp251xfd_priv *priv) }
rx_ring->obj_num = rx_obj_num; + rx_ring->obj_num_shift_to_u8 = BITS_PER_TYPE(rx_ring->obj_num_shift_to_u8) - + ilog2(rx_obj_num); rx_ring->obj_size = rx_obj_size; priv->rx[i] = rx_ring; } diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-rx.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-rx.c index 5e2f39de88f3..5d0fb1c454cd 100644 --- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-rx.c +++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-rx.c @@ -2,7 +2,7 @@ // // mcp251xfd - Microchip MCP251xFD Family CAN controller driver // -// Copyright (c) 2019, 2020, 2021 Pengutronix, +// Copyright (c) 2019, 2020, 2021, 2023 Pengutronix, // Marc Kleine-Budde kernel@pengutronix.de // // Based on: @@ -16,23 +16,14 @@
#include "mcp251xfd.h"
-static inline int -mcp251xfd_rx_head_get_from_chip(const struct mcp251xfd_priv *priv, - const struct mcp251xfd_rx_ring *ring, - u8 *rx_head, bool *fifo_empty) +static inline bool mcp251xfd_rx_fifo_sta_empty(const u32 fifo_sta) { - u32 fifo_sta; - int err; - - err = regmap_read(priv->map_reg, MCP251XFD_REG_FIFOSTA(ring->fifo_nr), - &fifo_sta); - if (err) - return err; - - *rx_head = FIELD_GET(MCP251XFD_REG_FIFOSTA_FIFOCI_MASK, fifo_sta); - *fifo_empty = !(fifo_sta & MCP251XFD_REG_FIFOSTA_TFNRFNIF); + return !(fifo_sta & MCP251XFD_REG_FIFOSTA_TFNRFNIF); +}
- return 0; +static inline bool mcp251xfd_rx_fifo_sta_full(const u32 fifo_sta) +{ + return fifo_sta & MCP251XFD_REG_FIFOSTA_TFERFFIF; }
static inline int @@ -80,29 +71,49 @@ mcp251xfd_check_rx_tail(const struct mcp251xfd_priv *priv, }
static int -mcp251xfd_rx_ring_update(const struct mcp251xfd_priv *priv, - struct mcp251xfd_rx_ring *ring) +mcp251xfd_get_rx_len(const struct mcp251xfd_priv *priv, + const struct mcp251xfd_rx_ring *ring, + u8 *len_p) { - u32 new_head; - u8 chip_rx_head; - bool fifo_empty; + const u8 shift = ring->obj_num_shift_to_u8; + u8 chip_head, tail, len; + u32 fifo_sta; int err;
- err = mcp251xfd_rx_head_get_from_chip(priv, ring, &chip_rx_head, - &fifo_empty); - if (err || fifo_empty) + err = regmap_read(priv->map_reg, MCP251XFD_REG_FIFOSTA(ring->fifo_nr), + &fifo_sta); + if (err) + return err; + + if (mcp251xfd_rx_fifo_sta_empty(fifo_sta)) { + *len_p = 0; + return 0; + } + + if (mcp251xfd_rx_fifo_sta_full(fifo_sta)) { + *len_p = ring->obj_num; + return 0; + } + + chip_head = FIELD_GET(MCP251XFD_REG_FIFOSTA_FIFOCI_MASK, fifo_sta); + + err = mcp251xfd_check_rx_tail(priv, ring); + if (err) return err; + tail = mcp251xfd_get_rx_tail(ring);
- /* chip_rx_head, is the next RX-Object filled by the HW. - * The new RX head must be >= the old head. + /* First shift to full u8. The subtraction works on signed + * values, that keeps the difference steady around the u8 + * overflow. The right shift acts on len, which is an u8. */ - new_head = round_down(ring->head, ring->obj_num) + chip_rx_head; - if (new_head <= ring->head) - new_head += ring->obj_num; + BUILD_BUG_ON(sizeof(ring->obj_num) != sizeof(chip_head)); + BUILD_BUG_ON(sizeof(ring->obj_num) != sizeof(tail)); + BUILD_BUG_ON(sizeof(ring->obj_num) != sizeof(len));
- ring->head = new_head; + len = (chip_head << shift) - (tail << shift); + *len_p = len >> shift;
- return mcp251xfd_check_rx_tail(priv, ring); + return 0; }
static void @@ -208,6 +219,8 @@ mcp251xfd_handle_rxif_ring_uinc(const struct mcp251xfd_priv *priv, if (!len) return 0;
+ ring->head += len; + /* Increment the RX FIFO tail pointer 'len' times in a * single SPI message. * @@ -233,22 +246,22 @@ mcp251xfd_handle_rxif_ring(struct mcp251xfd_priv *priv, struct mcp251xfd_rx_ring *ring) { struct mcp251xfd_hw_rx_obj_canfd *hw_rx_obj = ring->obj; - u8 rx_tail, len; + u8 rx_tail, len, l; int err, i;
- err = mcp251xfd_rx_ring_update(priv, ring); + err = mcp251xfd_get_rx_len(priv, ring, &len); if (err) return err;
- while ((len = mcp251xfd_get_rx_linear_len(ring))) { + while ((l = mcp251xfd_get_rx_linear_len(ring, len))) { rx_tail = mcp251xfd_get_rx_tail(ring);
err = mcp251xfd_rx_obj_read(priv, ring, hw_rx_obj, - rx_tail, len); + rx_tail, l); if (err) return err;
- for (i = 0; i < len; i++) { + for (i = 0; i < l; i++) { err = mcp251xfd_handle_rxif_one(priv, ring, (void *)hw_rx_obj + i * ring->obj_size); @@ -256,9 +269,11 @@ mcp251xfd_handle_rxif_ring(struct mcp251xfd_priv *priv, return err; }
- err = mcp251xfd_handle_rxif_ring_uinc(priv, ring, len); + err = mcp251xfd_handle_rxif_ring_uinc(priv, ring, l); if (err) return err; + + len -= l; }
return 0; diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd.h b/drivers/net/can/spi/mcp251xfd/mcp251xfd.h index 78d12dda08a0..ca5f4e670ec1 100644 --- a/drivers/net/can/spi/mcp251xfd/mcp251xfd.h +++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd.h @@ -553,6 +553,7 @@ struct mcp251xfd_rx_ring { u8 nr; u8 fifo_nr; u8 obj_num; + u8 obj_num_shift_to_u8; u8 obj_size;
union mcp251xfd_write_reg_buf irq_enable_buf; @@ -889,18 +890,9 @@ static inline u8 mcp251xfd_get_rx_tail(const struct mcp251xfd_rx_ring *ring) return ring->tail & (ring->obj_num - 1); }
-static inline u8 mcp251xfd_get_rx_len(const struct mcp251xfd_rx_ring *ring) -{ - return ring->head - ring->tail; -} - static inline u8 -mcp251xfd_get_rx_linear_len(const struct mcp251xfd_rx_ring *ring) +mcp251xfd_get_rx_linear_len(const struct mcp251xfd_rx_ring *ring, u8 len) { - u8 len; - - len = mcp251xfd_get_rx_len(ring); - return min_t(u8, len, ring->obj_num - mcp251xfd_get_rx_tail(ring)); }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marc Kleine-Budde mkl@pengutronix.de
[ Upstream commit e793c724b48ca8cae9693bc3be528e85284c126a ]
The mcp251xfd chip is configured to provide a timestamp with each received and transmitted CAN frame. The timestamp is derived from the internal free-running timer, which can also be read from the TBC register via SPI. The timer is 32 bits wide and is clocked by the external oscillator (typically 20 or 40 MHz).
To avoid confusion, we call this timestamp "timestamp_raw" or "ts_raw" for short.
Using the timecounter framework, the "ts_raw" is converted to 64 bit nanoseconds since the epoch. This is what we call "timestamp".
This is a preparation for the next patches, which use the "timestamp" to work around a bug in code that so far uses only the "ts_raw".
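A short sketch of the conversion itself, using the kernel's timecounter API as the driver does; the wrapper function and parameter names are illustrative only:

#include <linux/timecounter.h>
#include <linux/types.h>

/* Extend the 32-bit free-running counter value to 64-bit nanoseconds,
 * based on the timecounter the driver keeps in sync with the chip.
 */
static u64 example_ts_raw_to_ns(const struct timecounter *tc, u32 ts_raw)
{
	return timecounter_cyc2time(tc, ts_raw);
}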
Tested-by: Stefan Althöfer Stefan.Althoefer@janztec.com Tested-by: Thomas Kopp thomas.kopp@microchip.com Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/can/spi/mcp251xfd/mcp251xfd-core.c | 28 +++++++++---------- drivers/net/can/spi/mcp251xfd/mcp251xfd-rx.c | 2 +- drivers/net/can/spi/mcp251xfd/mcp251xfd-tef.c | 2 +- .../can/spi/mcp251xfd/mcp251xfd-timestamp.c | 22 ++++----------- drivers/net/can/spi/mcp251xfd/mcp251xfd.h | 27 ++++++++++++++---- 5 files changed, 43 insertions(+), 38 deletions(-)
diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c index 1665f78abb5c..a9bafa96e2f9 100644 --- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c +++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-core.c @@ -2,7 +2,7 @@ // // mcp251xfd - Microchip MCP251xFD Family CAN controller driver // -// Copyright (c) 2019, 2020, 2021 Pengutronix, +// Copyright (c) 2019, 2020, 2021, 2023 Pengutronix, // Marc Kleine-Budde kernel@pengutronix.de // // Based on: @@ -867,18 +867,18 @@ static int mcp251xfd_get_berr_counter(const struct net_device *ndev,
static struct sk_buff * mcp251xfd_alloc_can_err_skb(struct mcp251xfd_priv *priv, - struct can_frame **cf, u32 *timestamp) + struct can_frame **cf, u32 *ts_raw) { struct sk_buff *skb; int err;
- err = mcp251xfd_get_timestamp(priv, timestamp); + err = mcp251xfd_get_timestamp_raw(priv, ts_raw); if (err) return NULL;
skb = alloc_can_err_skb(priv->ndev, cf); if (skb) - mcp251xfd_skb_set_timestamp(priv, skb, *timestamp); + mcp251xfd_skb_set_timestamp_raw(priv, skb, *ts_raw);
return skb; } @@ -889,7 +889,7 @@ static int mcp251xfd_handle_rxovif(struct mcp251xfd_priv *priv) struct mcp251xfd_rx_ring *ring; struct sk_buff *skb; struct can_frame *cf; - u32 timestamp, rxovif; + u32 ts_raw, rxovif; int err, i;
stats->rx_over_errors++; @@ -924,14 +924,14 @@ static int mcp251xfd_handle_rxovif(struct mcp251xfd_priv *priv) return err; }
- skb = mcp251xfd_alloc_can_err_skb(priv, &cf, ×tamp); + skb = mcp251xfd_alloc_can_err_skb(priv, &cf, &ts_raw); if (!skb) return 0;
cf->can_id |= CAN_ERR_CRTL; cf->data[1] = CAN_ERR_CRTL_RX_OVERFLOW;
- err = can_rx_offload_queue_timestamp(&priv->offload, skb, timestamp); + err = can_rx_offload_queue_timestamp(&priv->offload, skb, ts_raw); if (err) stats->rx_fifo_errors++;
@@ -948,12 +948,12 @@ static int mcp251xfd_handle_txatif(struct mcp251xfd_priv *priv) static int mcp251xfd_handle_ivmif(struct mcp251xfd_priv *priv) { struct net_device_stats *stats = &priv->ndev->stats; - u32 bdiag1, timestamp; + u32 bdiag1, ts_raw; struct sk_buff *skb; struct can_frame *cf = NULL; int err;
- err = mcp251xfd_get_timestamp(priv, ×tamp); + err = mcp251xfd_get_timestamp_raw(priv, &ts_raw); if (err) return err;
@@ -1035,8 +1035,8 @@ static int mcp251xfd_handle_ivmif(struct mcp251xfd_priv *priv) if (!cf) return 0;
- mcp251xfd_skb_set_timestamp(priv, skb, timestamp); - err = can_rx_offload_queue_timestamp(&priv->offload, skb, timestamp); + mcp251xfd_skb_set_timestamp_raw(priv, skb, ts_raw); + err = can_rx_offload_queue_timestamp(&priv->offload, skb, ts_raw); if (err) stats->rx_fifo_errors++;
@@ -1049,7 +1049,7 @@ static int mcp251xfd_handle_cerrif(struct mcp251xfd_priv *priv) struct sk_buff *skb; struct can_frame *cf = NULL; enum can_state new_state, rx_state, tx_state; - u32 trec, timestamp; + u32 trec, ts_raw; int err;
err = regmap_read(priv->map_reg, MCP251XFD_REG_TREC, &trec); @@ -1079,7 +1079,7 @@ static int mcp251xfd_handle_cerrif(struct mcp251xfd_priv *priv) /* The skb allocation might fail, but can_change_state() * handles cf == NULL. */ - skb = mcp251xfd_alloc_can_err_skb(priv, &cf, ×tamp); + skb = mcp251xfd_alloc_can_err_skb(priv, &cf, &ts_raw); can_change_state(priv->ndev, cf, tx_state, rx_state);
if (new_state == CAN_STATE_BUS_OFF) { @@ -1110,7 +1110,7 @@ static int mcp251xfd_handle_cerrif(struct mcp251xfd_priv *priv) cf->data[7] = bec.rxerr; }
- err = can_rx_offload_queue_timestamp(&priv->offload, skb, timestamp); + err = can_rx_offload_queue_timestamp(&priv->offload, skb, ts_raw); if (err) stats->rx_fifo_errors++;
diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-rx.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-rx.c index 5d0fb1c454cd..a79e6c661ecc 100644 --- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-rx.c +++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-rx.c @@ -160,7 +160,7 @@ mcp251xfd_hw_rx_obj_to_skb(const struct mcp251xfd_priv *priv, if (!(hw_rx_obj->flags & MCP251XFD_OBJ_FLAGS_RTR)) memcpy(cfd->data, hw_rx_obj->data, cfd->len);
- mcp251xfd_skb_set_timestamp(priv, skb, hw_rx_obj->ts); + mcp251xfd_skb_set_timestamp_raw(priv, skb, hw_rx_obj->ts); }
static int diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-tef.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-tef.c index 902eb767426d..8f39730f3122 100644 --- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-tef.c +++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-tef.c @@ -97,7 +97,7 @@ mcp251xfd_handle_tefif_one(struct mcp251xfd_priv *priv, tef_tail = mcp251xfd_get_tef_tail(priv); skb = priv->can.echo_skb[tef_tail]; if (skb) - mcp251xfd_skb_set_timestamp(priv, skb, hw_tef_obj->ts); + mcp251xfd_skb_set_timestamp_raw(priv, skb, hw_tef_obj->ts); stats->tx_bytes += can_rx_offload_get_echo_skb(&priv->offload, tef_tail, hw_tef_obj->ts, diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-timestamp.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-timestamp.c index 712e09186987..1db99aabe85c 100644 --- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-timestamp.c +++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-timestamp.c @@ -2,7 +2,7 @@ // // mcp251xfd - Microchip MCP251xFD Family CAN controller driver // -// Copyright (c) 2021 Pengutronix, +// Copyright (c) 2021, 2023 Pengutronix, // Marc Kleine-Budde kernel@pengutronix.de //
@@ -11,20 +11,20 @@
#include "mcp251xfd.h"
-static u64 mcp251xfd_timestamp_read(const struct cyclecounter *cc) +static u64 mcp251xfd_timestamp_raw_read(const struct cyclecounter *cc) { const struct mcp251xfd_priv *priv; - u32 timestamp = 0; + u32 ts_raw = 0; int err;
priv = container_of(cc, struct mcp251xfd_priv, cc); - err = mcp251xfd_get_timestamp(priv, ×tamp); + err = mcp251xfd_get_timestamp_raw(priv, &ts_raw); if (err) netdev_err(priv->ndev, "Error %d while reading timestamp. HW timestamps may be inaccurate.", err);
- return timestamp; + return ts_raw; }
static void mcp251xfd_timestamp_work(struct work_struct *work) @@ -39,21 +39,11 @@ static void mcp251xfd_timestamp_work(struct work_struct *work) MCP251XFD_TIMESTAMP_WORK_DELAY_SEC * HZ); }
-void mcp251xfd_skb_set_timestamp(const struct mcp251xfd_priv *priv, - struct sk_buff *skb, u32 timestamp) -{ - struct skb_shared_hwtstamps *hwtstamps = skb_hwtstamps(skb); - u64 ns; - - ns = timecounter_cyc2time(&priv->tc, timestamp); - hwtstamps->hwtstamp = ns_to_ktime(ns); -} - void mcp251xfd_timestamp_init(struct mcp251xfd_priv *priv) { struct cyclecounter *cc = &priv->cc;
- cc->read = mcp251xfd_timestamp_read; + cc->read = mcp251xfd_timestamp_raw_read; cc->mask = CYCLECOUNTER_MASK(32); cc->shift = 1; cc->mult = clocksource_hz2mult(priv->can.clock.freq, cc->shift); diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd.h b/drivers/net/can/spi/mcp251xfd/mcp251xfd.h index ca5f4e670ec1..7713c9264fb5 100644 --- a/drivers/net/can/spi/mcp251xfd/mcp251xfd.h +++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd.h @@ -2,7 +2,7 @@ * * mcp251xfd - Microchip MCP251xFD Family CAN controller driver * - * Copyright (c) 2019, 2020, 2021 Pengutronix, + * Copyright (c) 2019, 2020, 2021, 2023 Pengutronix, * Marc Kleine-Budde kernel@pengutronix.de * Copyright (c) 2019 Martin Sperl kernel@martin.sperl.org */ @@ -794,10 +794,27 @@ mcp251xfd_spi_cmd_write(const struct mcp251xfd_priv *priv, return data; }
-static inline int mcp251xfd_get_timestamp(const struct mcp251xfd_priv *priv, - u32 *timestamp) +static inline int mcp251xfd_get_timestamp_raw(const struct mcp251xfd_priv *priv, + u32 *ts_raw) { - return regmap_read(priv->map_reg, MCP251XFD_REG_TBC, timestamp); + return regmap_read(priv->map_reg, MCP251XFD_REG_TBC, ts_raw); +} + +static inline void mcp251xfd_skb_set_timestamp(struct sk_buff *skb, u64 ns) +{ + struct skb_shared_hwtstamps *hwtstamps = skb_hwtstamps(skb); + + hwtstamps->hwtstamp = ns_to_ktime(ns); +} + +static inline +void mcp251xfd_skb_set_timestamp_raw(const struct mcp251xfd_priv *priv, + struct sk_buff *skb, u32 ts_raw) +{ + u64 ns; + + ns = timecounter_cyc2time(&priv->tc, ts_raw); + mcp251xfd_skb_set_timestamp(skb, ns); }
static inline u16 mcp251xfd_get_tef_obj_addr(u8 n) @@ -918,8 +935,6 @@ void mcp251xfd_ring_free(struct mcp251xfd_priv *priv); int mcp251xfd_ring_alloc(struct mcp251xfd_priv *priv); int mcp251xfd_handle_rxif(struct mcp251xfd_priv *priv); int mcp251xfd_handle_tefif(struct mcp251xfd_priv *priv); -void mcp251xfd_skb_set_timestamp(const struct mcp251xfd_priv *priv, - struct sk_buff *skb, u32 timestamp); void mcp251xfd_timestamp_init(struct mcp251xfd_priv *priv); void mcp251xfd_timestamp_stop(struct mcp251xfd_priv *priv);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marc Kleine-Budde mkl@pengutronix.de
[ Upstream commit 24436be590c6fbb05f6161b0dfba7d9da60214aa ]
This patch tries to work around erratum DS80000789E 6 of the mcp2518fd; the other variants of the chip family (mcp2517fd and mcp251863) are probably also affected.
In the bad case, the driver reads a head index that is too large. In the original code, the driver always trusted the read value, which caused old, already processed CAN frames or new, incompletely written CAN frames to be (re-)processed.
To work around this issue, keep a per-FIFO timestamp [1] of the last valid received CAN frame and compare it against the timestamp of every received CAN frame. If an old CAN frame is detected, abort the iteration and mark the number of valid CAN frames as processed in the chip by incrementing the FIFO's tail index.
Further tests showed that this workaround can recognize old CAN frames, but a small time window remains in which partially written CAN frames [2] are not recognized and are then processed anyway. These CAN frames have the correct data and timestamps, but the DLC has not yet been updated.
[1] As the raw timestamp overflows every 107 seconds (at the usual clock rate of 40 MHz) convert it to nanoseconds with the timecounter framework and use this to detect stale CAN frames.
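As a sanity check on the figure in [1]: at 40 MHz a 32-bit raw timestamp wraps after 2^32 / 40 000 000 Hz = ~107.4 s, so raw values alone cannot order frames across that window; once converted to 64-bit nanoseconds, the "is this frame older than the last valid one" comparison stays monotonic for the lifetime of the interface.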
Link: https://lore.kernel.org/all/BL3PR11MB64844C1C95CA3BDADAE4D8CCFBC99@BL3PR11MB... [2] Reported-by: Stefan Althöfer Stefan.Althoefer@janztec.com Closes: https://lore.kernel.org/all/FR0P281MB1966273C216630B120ABB6E197E89@FR0P281MB... Tested-by: Stefan Althöfer Stefan.Althoefer@janztec.com Tested-by: Thomas Kopp thomas.kopp@microchip.com Signed-off-by: Marc Kleine-Budde mkl@pengutronix.de Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/can/spi/mcp251xfd/mcp251xfd-ring.c | 1 + drivers/net/can/spi/mcp251xfd/mcp251xfd-rx.c | 32 +++++++++++++++++-- drivers/net/can/spi/mcp251xfd/mcp251xfd.h | 3 ++ 3 files changed, 33 insertions(+), 3 deletions(-)
diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-ring.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-ring.c index 5ed0cd62f4f8..0fde8154a649 100644 --- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-ring.c +++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-ring.c @@ -196,6 +196,7 @@ mcp251xfd_ring_init_rx(struct mcp251xfd_priv *priv, u16 *base, u8 *fifo_nr) int i, j;
mcp251xfd_for_each_rx_ring(priv, rx_ring, i) { + rx_ring->last_valid = timecounter_read(&priv->tc); rx_ring->head = 0; rx_ring->tail = 0; rx_ring->base = *base; diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd-rx.c b/drivers/net/can/spi/mcp251xfd/mcp251xfd-rx.c index a79e6c661ecc..fe897f3e4c12 100644 --- a/drivers/net/can/spi/mcp251xfd/mcp251xfd-rx.c +++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd-rx.c @@ -159,8 +159,6 @@ mcp251xfd_hw_rx_obj_to_skb(const struct mcp251xfd_priv *priv,
if (!(hw_rx_obj->flags & MCP251XFD_OBJ_FLAGS_RTR)) memcpy(cfd->data, hw_rx_obj->data, cfd->len); - - mcp251xfd_skb_set_timestamp_raw(priv, skb, hw_rx_obj->ts); }
static int @@ -171,8 +169,26 @@ mcp251xfd_handle_rxif_one(struct mcp251xfd_priv *priv, struct net_device_stats *stats = &priv->ndev->stats; struct sk_buff *skb; struct canfd_frame *cfd; + u64 timestamp; int err;
+ /* According to mcp2518fd erratum DS80000789E 6. the FIFOCI + * bits of a FIFOSTA register, here the RX FIFO head index + * might be corrupted and we might process past the RX FIFO's + * head into old CAN frames. + * + * Compare the timestamp of currently processed CAN frame with + * last valid frame received. Abort with -EBADMSG if an old + * CAN frame is detected. + */ + timestamp = timecounter_cyc2time(&priv->tc, hw_rx_obj->ts); + if (timestamp <= ring->last_valid) { + stats->rx_fifo_errors++; + + return -EBADMSG; + } + ring->last_valid = timestamp; + if (hw_rx_obj->flags & MCP251XFD_OBJ_FLAGS_FDF) skb = alloc_canfd_skb(priv->ndev, &cfd); else @@ -183,6 +199,7 @@ mcp251xfd_handle_rxif_one(struct mcp251xfd_priv *priv, return 0; }
+ mcp251xfd_skb_set_timestamp(skb, timestamp); mcp251xfd_hw_rx_obj_to_skb(priv, hw_rx_obj, skb); err = can_rx_offload_queue_timestamp(&priv->offload, skb, hw_rx_obj->ts); if (err) @@ -265,7 +282,16 @@ mcp251xfd_handle_rxif_ring(struct mcp251xfd_priv *priv, err = mcp251xfd_handle_rxif_one(priv, ring, (void *)hw_rx_obj + i * ring->obj_size); - if (err) + + /* -EBADMSG means we're affected by mcp2518fd + * erratum DS80000789E 6., i.e. the timestamp + * in the RX object is older that the last + * valid received CAN frame. Don't process any + * further and mark processed frames as good. + */ + if (err == -EBADMSG) + return mcp251xfd_handle_rxif_ring_uinc(priv, ring, i); + else if (err) return err; }
diff --git a/drivers/net/can/spi/mcp251xfd/mcp251xfd.h b/drivers/net/can/spi/mcp251xfd/mcp251xfd.h index 7713c9264fb5..c07300443c6a 100644 --- a/drivers/net/can/spi/mcp251xfd/mcp251xfd.h +++ b/drivers/net/can/spi/mcp251xfd/mcp251xfd.h @@ -549,6 +549,9 @@ struct mcp251xfd_rx_ring { unsigned int head; unsigned int tail;
+ /* timestamp of the last valid received CAN frame */ + u64 last_valid; + u16 base; u8 nr; u8 fifo_nr;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Aurabindo Pillai aurabindo.pillai@amd.com
[ Upstream commit 7ceb94e87bffff7c12b61eb29749e1d8ac976896 ]
Add GFX12 swizzle mode definitions for use with DCN401
Signed-off-by: Aurabindo Pillai aurabindo.pillai@amd.com Acked-by: Rodrigo Siqueira rodrigo.siqueira@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- include/uapi/drm/drm_fourcc.h | 16 ++++++++++++++++ 1 file changed, 16 insertions(+)
diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h index 868d6909b718..52ce13488eaf 100644 --- a/include/uapi/drm/drm_fourcc.h +++ b/include/uapi/drm/drm_fourcc.h @@ -1390,6 +1390,7 @@ drm_fourcc_canonicalize_nvidia_format_mod(__u64 modifier) #define AMD_FMT_MOD_TILE_VER_GFX10 2 #define AMD_FMT_MOD_TILE_VER_GFX10_RBPLUS 3 #define AMD_FMT_MOD_TILE_VER_GFX11 4 +#define AMD_FMT_MOD_TILE_VER_GFX12 5
/* * 64K_S is the same for GFX9/GFX10/GFX10_RBPLUS and hence has GFX9 as canonical @@ -1400,6 +1401,8 @@ drm_fourcc_canonicalize_nvidia_format_mod(__u64 modifier) /* * 64K_D for non-32 bpp is the same for GFX9/GFX10/GFX10_RBPLUS and hence has * GFX9 as canonical version. + * + * 64K_D_2D on GFX12 is identical to 64K_D on GFX11. */ #define AMD_FMT_MOD_TILE_GFX9_64K_D 10 #define AMD_FMT_MOD_TILE_GFX9_64K_S_X 25 @@ -1407,6 +1410,19 @@ drm_fourcc_canonicalize_nvidia_format_mod(__u64 modifier) #define AMD_FMT_MOD_TILE_GFX9_64K_R_X 27 #define AMD_FMT_MOD_TILE_GFX11_256K_R_X 31
+/* Gfx12 swizzle modes: + * 0 - LINEAR + * 1 - 256B_2D - 2D block dimensions + * 2 - 4KB_2D + * 3 - 64KB_2D + * 4 - 256KB_2D + * 5 - 4KB_3D - 3D block dimensions + * 6 - 64KB_3D + * 7 - 256KB_3D + */ +#define AMD_FMT_MOD_TILE_GFX12_64K_2D 3 +#define AMD_FMT_MOD_TILE_GFX12_256K_2D 4 + #define AMD_FMT_MOD_DCC_BLOCK_64B 0 #define AMD_FMT_MOD_DCC_BLOCK_128B 1 #define AMD_FMT_MOD_DCC_BLOCK_256B 2
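For reference, a hedged sketch of how one of the new defines could be combined with the existing AMD_FMT_MOD_SET() helpers from drm_fourcc.h to build a complete GFX12 modifier; the variable name is illustrative only:

#include <drm/drm_fourcc.h>

static const __u64 example_gfx12_64k_2d_mod =
	AMD_FMT_MOD |
	AMD_FMT_MOD_SET(TILE_VERSION, AMD_FMT_MOD_TILE_VER_GFX12) |
	AMD_FMT_MOD_SET(TILE, AMD_FMT_MOD_TILE_GFX12_64K_2D);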
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Marek Olšák marek.olsak@amd.com
[ Upstream commit 8dd1426e2c80e32ac1995007330c8f95ffa28ebb ]
The existing code verified GFX9-11 swizzle modes on GFX12, which results in undefined behavior.
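As the hunk below shows, each GFX12 2D swizzle mode maps directly to the log2 of its block size in bytes (256B -> 8, 4K -> 12, 64K -> 16, 256K -> 18); for example, a 64K_2D block at an assumed 4 bytes per pixel holds 2^16 / 4 = 16384 pixels.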
Signed-off-by: Marek Olšák marek.olsak@amd.com Acked-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Alex Deucher alexander.deucher@amd.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/amd/amdgpu/amdgpu_display.c | 27 ++++++++++++++++++++- include/uapi/drm/drm_fourcc.h | 2 ++ 2 files changed, 28 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c index ac773b191071..cd0bccc95205 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_display.c @@ -997,6 +997,30 @@ static int amdgpu_display_verify_sizes(struct amdgpu_framebuffer *rfb) block_width = 256 / format_info->cpp[i]; block_height = 1; block_size_log2 = 8; + } else if (AMD_FMT_MOD_GET(TILE_VERSION, modifier) >= AMD_FMT_MOD_TILE_VER_GFX12) { + int swizzle = AMD_FMT_MOD_GET(TILE, modifier); + + switch (swizzle) { + case AMD_FMT_MOD_TILE_GFX12_256B_2D: + block_size_log2 = 8; + break; + case AMD_FMT_MOD_TILE_GFX12_4K_2D: + block_size_log2 = 12; + break; + case AMD_FMT_MOD_TILE_GFX12_64K_2D: + block_size_log2 = 16; + break; + case AMD_FMT_MOD_TILE_GFX12_256K_2D: + block_size_log2 = 18; + break; + default: + drm_dbg_kms(rfb->base.dev, + "Gfx12 swizzle mode with unknown block size: %d\n", swizzle); + return -EINVAL; + } + + get_block_dimensions(block_size_log2, format_info->cpp[i], + &block_width, &block_height); } else { int swizzle = AMD_FMT_MOD_GET(TILE, modifier);
@@ -1032,7 +1056,8 @@ static int amdgpu_display_verify_sizes(struct amdgpu_framebuffer *rfb) return ret; }
- if (AMD_FMT_MOD_GET(DCC, modifier)) { + if (AMD_FMT_MOD_GET(TILE_VERSION, modifier) <= AMD_FMT_MOD_TILE_VER_GFX11 && + AMD_FMT_MOD_GET(DCC, modifier)) { if (AMD_FMT_MOD_GET(DCC_RETILE, modifier)) { block_size_log2 = get_dcc_block_size(modifier, false, false); get_block_dimensions(block_size_log2 + 8, format_info->cpp[0], diff --git a/include/uapi/drm/drm_fourcc.h b/include/uapi/drm/drm_fourcc.h index 52ce13488eaf..6245928d76ee 100644 --- a/include/uapi/drm/drm_fourcc.h +++ b/include/uapi/drm/drm_fourcc.h @@ -1420,6 +1420,8 @@ drm_fourcc_canonicalize_nvidia_format_mod(__u64 modifier) * 6 - 64KB_3D * 7 - 256KB_3D */ +#define AMD_FMT_MOD_TILE_GFX12_256B_2D 1 +#define AMD_FMT_MOD_TILE_GFX12_4K_2D 2 #define AMD_FMT_MOD_TILE_GFX12_64K_2D 3 #define AMD_FMT_MOD_TILE_GFX12_256K_2D 4
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Ellerman mpe@ellerman.id.au
[ Upstream commit 88715b6e5d529f4ef3830ad2a893e4624c6af0b8 ]
Patch series "Reimplement huge pages without hugepd on powerpc (8xx, e500, book3s/64)", v7.
Unlike most architectures, powerpc 8xx HW requires a two-level pagetable topology for all page sizes. So a leaf PMD-contig approach is not feasible as such.
Possible sizes on 8xx are 4k, 16k, 512k and 8M.
First level (PGD/PMD) covers 4M per entry. For 8M pages, two PMD entries must point to a single entry level-2 page table. Until now that was done using hugepd. This series changes it to use standard page tables, where the entry is replicated 1024 times in each of the two page tables referred to by the two associated PMD entries for that 8M page.
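Concretely: each first-level entry covers 4M, i.e. 4M / 4K = 1024 PTE slots in the level-2 table it points to, so an 8M page spans two PMD entries and the same 8M PTE value is written into all 1024 slots of each of the two level-2 tables.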
For e500 and book3s/64 there are fewer constraints because they are not tied to the HW-assisted tablewalk like the 8xx is, so it is easier to use leaf PMDs (and PUDs).
On e500 the supported page sizes are 4M, 16M, 64M, 256M and 1G. All are at PMD level on e500/32 (mpc85xx), with a mix of PMD and PUD on e500/64. We encode the page size with 4 available bits in PTE entries. On e500/32 the PGD entry size is increased to 64 bits in order to allow leaf PMD entries, because PTEs are 64 bits on e500.
On book3s/64 only the hash-4k mode is concerned. It supports 16M pages as cont-PMD and 16G pages as cont-PUD. In the other modes (radix-4k, radix-64k and hash-64k) the sizes match the PMD and PUD sizes, so those are just leaf entries. The hash processing makes things a bit more complex. To ease things, __hash_page_huge() is modified to bail out when the DIRTY or ACCESSED bits are missing, leaving it to the mm core to fix it.
This patch (of 23):
The nohash HTW_IBM (Hardware Table Walk) code is unused since support for A2 was removed in commit fb5a515704d7 ("powerpc: Remove platforms/wsp and associated pieces") (2014).
The remaining supported CPUs use either no HTW (data_tlb_miss_bolted), or the e6500 HTW (data_tlb_miss_e6500).
Link: https://lkml.kernel.org/r/cover.1719928057.git.christophe.leroy@csgroup.eu Link: https://lkml.kernel.org/r/820dd1385ecc931f07b0d7a0fa827b1613917ab6.171992805... Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu Cc: Jason Gunthorpe jgg@nvidia.com Cc: Nicholas Piggin npiggin@gmail.com Cc: Oscar Salvador osalvador@suse.de Cc: Peter Xu peterx@redhat.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Stable-dep-of: d92b5cc29c79 ("powerpc/64e: Define mmu_pte_psize static") Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/include/asm/nohash/mmu-e500.h | 3 +- arch/powerpc/mm/nohash/tlb.c | 57 +----- arch/powerpc/mm/nohash/tlb_low_64e.S | 195 --------------------- 3 files changed, 2 insertions(+), 253 deletions(-)
diff --git a/arch/powerpc/include/asm/nohash/mmu-e500.h b/arch/powerpc/include/asm/nohash/mmu-e500.h index e43a418d3ccd..9b5ba73d33d6 100644 --- a/arch/powerpc/include/asm/nohash/mmu-e500.h +++ b/arch/powerpc/include/asm/nohash/mmu-e500.h @@ -303,8 +303,7 @@ extern unsigned long linear_map_top; extern int book3e_htw_mode;
#define PPC_HTW_NONE 0 -#define PPC_HTW_IBM 1 -#define PPC_HTW_E6500 2 +#define PPC_HTW_E6500 1
/* * 64-bit booke platforms don't load the tlb in the tlb miss handler code. diff --git a/arch/powerpc/mm/nohash/tlb.c b/arch/powerpc/mm/nohash/tlb.c index 2c15c86c7015..19df1e13fe0e 100644 --- a/arch/powerpc/mm/nohash/tlb.c +++ b/arch/powerpc/mm/nohash/tlb.c @@ -403,9 +403,8 @@ void tlb_flush_pgtable(struct mmu_gather *tlb, unsigned long address) static void __init setup_page_sizes(void) { unsigned int tlb0cfg; - unsigned int tlb0ps; unsigned int eptcfg; - int i, psize; + int psize;
#ifdef CONFIG_PPC_E500 unsigned int mmucfg = mfspr(SPRN_MMUCFG); @@ -474,50 +473,6 @@ static void __init setup_page_sizes(void) goto out; } #endif - - tlb0cfg = mfspr(SPRN_TLB0CFG); - tlb0ps = mfspr(SPRN_TLB0PS); - eptcfg = mfspr(SPRN_EPTCFG); - - /* Look for supported direct sizes */ - for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) { - struct mmu_psize_def *def = &mmu_psize_defs[psize]; - - if (tlb0ps & (1U << (def->shift - 10))) - def->flags |= MMU_PAGE_SIZE_DIRECT; - } - - /* Indirect page sizes supported ? */ - if ((tlb0cfg & TLBnCFG_IND) == 0 || - (tlb0cfg & TLBnCFG_PT) == 0) - goto out; - - book3e_htw_mode = PPC_HTW_IBM; - - /* Now, we only deal with one IND page size for each - * direct size. Hopefully all implementations today are - * unambiguous, but we might want to be careful in the - * future. - */ - for (i = 0; i < 3; i++) { - unsigned int ps, sps; - - sps = eptcfg & 0x1f; - eptcfg >>= 5; - ps = eptcfg & 0x1f; - eptcfg >>= 5; - if (!ps || !sps) - continue; - for (psize = 0; psize < MMU_PAGE_COUNT; psize++) { - struct mmu_psize_def *def = &mmu_psize_defs[psize]; - - if (ps == (def->shift - 10)) - def->flags |= MMU_PAGE_SIZE_INDIRECT; - if (sps == (def->shift - 10)) - def->ind = ps + 10; - } - } - out: /* Cleanup array and print summary */ pr_info("MMU: Supported page sizes\n"); @@ -546,10 +501,6 @@ static void __init setup_mmu_htw(void) */
switch (book3e_htw_mode) { - case PPC_HTW_IBM: - patch_exception(0x1c0, exc_data_tlb_miss_htw_book3e); - patch_exception(0x1e0, exc_instruction_tlb_miss_htw_book3e); - break; #ifdef CONFIG_PPC_E500 case PPC_HTW_E6500: extlb_level_exc = EX_TLB_SIZE; @@ -580,12 +531,6 @@ static void early_init_this_mmu(void) mmu_pte_psize = MMU_PAGE_2M; break;
- case PPC_HTW_IBM: - mas4 |= MAS4_INDD; - mas4 |= BOOK3E_PAGESZ_1M << MAS4_TSIZED_SHIFT; - mmu_pte_psize = MMU_PAGE_1M; - break; - case PPC_HTW_NONE: mas4 |= BOOK3E_PAGESZ_4K << MAS4_TSIZED_SHIFT; mmu_pte_psize = mmu_virtual_psize; diff --git a/arch/powerpc/mm/nohash/tlb_low_64e.S b/arch/powerpc/mm/nohash/tlb_low_64e.S index 76cf456d7976..d831a111eaba 100644 --- a/arch/powerpc/mm/nohash/tlb_low_64e.S +++ b/arch/powerpc/mm/nohash/tlb_low_64e.S @@ -893,201 +893,6 @@ virt_page_table_tlb_miss_whacko_fault: TLB_MISS_EPILOG_ERROR b exc_data_storage_book3e
- -/************************************************************** - * * - * TLB miss handling for Book3E with hw page table support * - * * - **************************************************************/ - - -/* Data TLB miss */ - START_EXCEPTION(data_tlb_miss_htw) - TLB_MISS_PROLOG - - /* Now we handle the fault proper. We only save DEAR in normal - * fault case since that's the only interesting values here. - * We could probably also optimize by not saving SRR0/1 in the - * linear mapping case but I'll leave that for later - */ - mfspr r14,SPRN_ESR - mfspr r16,SPRN_DEAR /* get faulting address */ - srdi r11,r16,44 /* get region */ - xoris r11,r11,0xc - cmpldi cr0,r11,0 /* linear mapping ? */ - beq tlb_load_linear /* yes -> go to linear map load */ - cmpldi cr1,r11,1 /* vmalloc mapping ? */ - - /* We do the user/kernel test for the PID here along with the RW test - */ - srdi. r11,r16,60 /* Check for user region */ - ld r15,PACAPGD(r13) /* Load user pgdir */ - beq htw_tlb_miss - - /* XXX replace the RMW cycles with immediate loads + writes */ -1: mfspr r10,SPRN_MAS1 - rlwinm r10,r10,0,16,1 /* Clear TID */ - mtspr SPRN_MAS1,r10 - ld r15,PACA_KERNELPGD(r13) /* Load kernel pgdir */ - beq+ cr1,htw_tlb_miss - - /* We got a crappy address, just fault with whatever DEAR and ESR - * are here - */ - TLB_MISS_EPILOG_ERROR - b exc_data_storage_book3e - -/* Instruction TLB miss */ - START_EXCEPTION(instruction_tlb_miss_htw) - TLB_MISS_PROLOG - - /* If we take a recursive fault, the second level handler may need - * to know whether we are handling a data or instruction fault in - * order to get to the right store fault handler. We provide that - * info by keeping a crazy value for ESR in r14 - */ - li r14,-1 /* store to exception frame is done later */ - - /* Now we handle the fault proper. We only save DEAR in the non - * linear mapping case since we know the linear mapping case will - * not re-enter. We could indeed optimize and also not save SRR0/1 - * in the linear mapping case but I'll leave that for later - * - * Faulting address is SRR0 which is already in r16 - */ - srdi r11,r16,44 /* get region */ - xoris r11,r11,0xc - cmpldi cr0,r11,0 /* linear mapping ? */ - beq tlb_load_linear /* yes -> go to linear map load */ - cmpldi cr1,r11,1 /* vmalloc mapping ? */ - - /* We do the user/kernel test for the PID here along with the RW test - */ - srdi. r11,r16,60 /* Check for user region */ - ld r15,PACAPGD(r13) /* Load user pgdir */ - beq htw_tlb_miss - - /* XXX replace the RMW cycles with immediate loads + writes */ -1: mfspr r10,SPRN_MAS1 - rlwinm r10,r10,0,16,1 /* Clear TID */ - mtspr SPRN_MAS1,r10 - ld r15,PACA_KERNELPGD(r13) /* Load kernel pgdir */ - beq+ htw_tlb_miss - - /* We got a crappy address, just fault */ - TLB_MISS_EPILOG_ERROR - b exc_instruction_storage_book3e - - -/* - * This is the guts of the second-level TLB miss handler for direct - * misses. We are entered with: - * - * r16 = virtual page table faulting address - * r15 = PGD pointer - * r14 = ESR - * r13 = PACA - * r12 = TLB exception frame in PACA - * r11 = crap (free to use) - * r10 = crap (free to use) - * - * It can be re-entered by the linear mapping miss handler. However, to - * avoid too much complication, it will save/restore things for us - */ -htw_tlb_miss: -#ifdef CONFIG_PPC_KUAP - mfspr r10,SPRN_MAS1 - rlwinm. r10,r10,0,0x3fff0000 - beq- htw_tlb_miss_fault /* KUAP fault */ -#endif - /* Search if we already have a TLB entry for that virtual address, and - * if we do, bail out. 
- * - * MAS1:IND should be already set based on MAS4 - */ - PPC_TLBSRX_DOT(0,R16) - beq htw_tlb_miss_done - - /* Now, we need to walk the page tables. First check if we are in - * range. - */ - rldicl. r10,r16,64-PGTABLE_EADDR_SIZE,PGTABLE_EADDR_SIZE+4 - bne- htw_tlb_miss_fault - - /* Get the PGD pointer */ - cmpldi cr0,r15,0 - beq- htw_tlb_miss_fault - - /* Get to PGD entry */ - rldicl r11,r16,64-(PGDIR_SHIFT-3),64-PGD_INDEX_SIZE-3 - clrrdi r10,r11,3 - ldx r15,r10,r15 - cmpdi cr0,r15,0 - bge htw_tlb_miss_fault - - /* Get to PUD entry */ - rldicl r11,r16,64-(PUD_SHIFT-3),64-PUD_INDEX_SIZE-3 - clrrdi r10,r11,3 - ldx r15,r10,r15 - cmpdi cr0,r15,0 - bge htw_tlb_miss_fault - - /* Get to PMD entry */ - rldicl r11,r16,64-(PMD_SHIFT-3),64-PMD_INDEX_SIZE-3 - clrrdi r10,r11,3 - ldx r15,r10,r15 - cmpdi cr0,r15,0 - bge htw_tlb_miss_fault - - /* Ok, we're all right, we can now create an indirect entry for - * a 1M or 256M page. - * - * The last trick is now that because we use "half" pages for - * the HTW (1M IND is 2K and 256M IND is 32K) we need to account - * for an added LSB bit to the RPN. For 64K pages, there is no - * problem as we already use 32K arrays (half PTE pages), but for - * 4K page we need to extract a bit from the virtual address and - * insert it into the "PA52" bit of the RPN. - */ - rlwimi r15,r16,32-9,20,20 - /* Now we build the MAS: - * - * MAS 0 : Fully setup with defaults in MAS4 and TLBnCFG - * MAS 1 : Almost fully setup - * - PID already updated by caller if necessary - * - TSIZE for now is base ind page size always - * MAS 2 : Use defaults - * MAS 3+7 : Needs to be done - */ - ori r10,r15,(BOOK3E_PAGESZ_4K << MAS3_SPSIZE_SHIFT) - - srdi r16,r10,32 - mtspr SPRN_MAS3,r10 - mtspr SPRN_MAS7,r16 - - tlbwe - -htw_tlb_miss_done: - /* We don't bother with restoring DEAR or ESR since we know we are - * level 0 and just going back to userland. They are only needed - * if you are going to take an access fault - */ - TLB_MISS_EPILOG_SUCCESS - rfi - -htw_tlb_miss_fault: - /* We need to check if it was an instruction miss. We know this - * though because r14 would contain -1 - */ - cmpdi cr0,r14,-1 - beq 1f - mtspr SPRN_DEAR,r16 - mtspr SPRN_ESR,r14 - TLB_MISS_EPILOG_ERROR - b exc_data_storage_book3e -1: TLB_MISS_EPILOG_ERROR - b exc_instruction_storage_book3e - /* * This is the guts of "any" level TLB miss handler for kernel linear * mapping misses. We are entered with:
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Ellerman mpe@ellerman.id.au
[ Upstream commit a898530eea3d0ba08c17a60865995a3bb468d1bc ]
A reasonable chunk of nohash/tlb.c is 64-bit only code; split it out into a separate file.
Link: https://lkml.kernel.org/r/cb2b118f9d8a86f82d01bfb9ad309d1d304480a1.171992805... Signed-off-by: Michael Ellerman mpe@ellerman.id.au Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu Cc: Jason Gunthorpe jgg@nvidia.com Cc: Nicholas Piggin npiggin@gmail.com Cc: Oscar Salvador osalvador@suse.de Cc: Peter Xu peterx@redhat.com Signed-off-by: Andrew Morton akpm@linux-foundation.org Stable-dep-of: d92b5cc29c79 ("powerpc/64e: Define mmu_pte_psize static") Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/mm/nohash/Makefile | 2 +- arch/powerpc/mm/nohash/tlb.c | 343 +---------------------------- arch/powerpc/mm/nohash/tlb_64e.c | 361 +++++++++++++++++++++++++++++++ 3 files changed, 363 insertions(+), 343 deletions(-) create mode 100644 arch/powerpc/mm/nohash/tlb_64e.c
diff --git a/arch/powerpc/mm/nohash/Makefile b/arch/powerpc/mm/nohash/Makefile index f3894e79d5f7..24b445a5fcac 100644 --- a/arch/powerpc/mm/nohash/Makefile +++ b/arch/powerpc/mm/nohash/Makefile @@ -3,7 +3,7 @@ ccflags-$(CONFIG_PPC64) := $(NO_MINIMAL_TOC)
obj-y += mmu_context.o tlb.o tlb_low.o kup.o -obj-$(CONFIG_PPC_BOOK3E_64) += tlb_low_64e.o book3e_pgtable.o +obj-$(CONFIG_PPC_BOOK3E_64) += tlb_64e.o tlb_low_64e.o book3e_pgtable.o obj-$(CONFIG_40x) += 40x.o obj-$(CONFIG_44x) += 44x.o obj-$(CONFIG_PPC_8xx) += 8xx.o diff --git a/arch/powerpc/mm/nohash/tlb.c b/arch/powerpc/mm/nohash/tlb.c index 19df1e13fe0e..c0b643d30fcc 100644 --- a/arch/powerpc/mm/nohash/tlb.c +++ b/arch/powerpc/mm/nohash/tlb.c @@ -110,28 +110,6 @@ struct mmu_psize_def mmu_psize_defs[MMU_PAGE_COUNT] = { }; #endif
-/* The variables below are currently only used on 64-bit Book3E - * though this will probably be made common with other nohash - * implementations at some point - */ -#ifdef CONFIG_PPC64 - -int mmu_pte_psize; /* Page size used for PTE pages */ -int mmu_vmemmap_psize; /* Page size used for the virtual mem map */ -int book3e_htw_mode; /* HW tablewalk? Value is PPC_HTW_* */ -unsigned long linear_map_top; /* Top of linear mapping */ - - -/* - * Number of bytes to add to SPRN_SPRG_TLB_EXFRAME on crit/mcheck/debug - * exceptions. This is used for bolted and e6500 TLB miss handlers which - * do not modify this SPRG in the TLB miss code; for other TLB miss handlers, - * this is set to zero. - */ -int extlb_level_exc; - -#endif /* CONFIG_PPC64 */ - #ifdef CONFIG_PPC_E500 /* next_tlbcam_idx is used to round-robin tlbcam entry assignment */ DEFINE_PER_CPU(int, next_tlbcam_idx); @@ -361,326 +339,7 @@ void tlb_flush(struct mmu_gather *tlb) flush_tlb_mm(tlb->mm); }
-/* - * Below are functions specific to the 64-bit variant of Book3E though that - * may change in the future - */ - -#ifdef CONFIG_PPC64 - -/* - * Handling of virtual linear page tables or indirect TLB entries - * flushing when PTE pages are freed - */ -void tlb_flush_pgtable(struct mmu_gather *tlb, unsigned long address) -{ - int tsize = mmu_psize_defs[mmu_pte_psize].enc; - - if (book3e_htw_mode != PPC_HTW_NONE) { - unsigned long start = address & PMD_MASK; - unsigned long end = address + PMD_SIZE; - unsigned long size = 1UL << mmu_psize_defs[mmu_pte_psize].shift; - - /* This isn't the most optimal, ideally we would factor out the - * while preempt & CPU mask mucking around, or even the IPI but - * it will do for now - */ - while (start < end) { - __flush_tlb_page(tlb->mm, start, tsize, 1); - start += size; - } - } else { - unsigned long rmask = 0xf000000000000000ul; - unsigned long rid = (address & rmask) | 0x1000000000000000ul; - unsigned long vpte = address & ~rmask; - - vpte = (vpte >> (PAGE_SHIFT - 3)) & ~0xffful; - vpte |= rid; - __flush_tlb_page(tlb->mm, vpte, tsize, 0); - } -} - -static void __init setup_page_sizes(void) -{ - unsigned int tlb0cfg; - unsigned int eptcfg; - int psize; - -#ifdef CONFIG_PPC_E500 - unsigned int mmucfg = mfspr(SPRN_MMUCFG); - int fsl_mmu = mmu_has_feature(MMU_FTR_TYPE_FSL_E); - - if (fsl_mmu && (mmucfg & MMUCFG_MAVN) == MMUCFG_MAVN_V1) { - unsigned int tlb1cfg = mfspr(SPRN_TLB1CFG); - unsigned int min_pg, max_pg; - - min_pg = (tlb1cfg & TLBnCFG_MINSIZE) >> TLBnCFG_MINSIZE_SHIFT; - max_pg = (tlb1cfg & TLBnCFG_MAXSIZE) >> TLBnCFG_MAXSIZE_SHIFT; - - for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) { - struct mmu_psize_def *def; - unsigned int shift; - - def = &mmu_psize_defs[psize]; - shift = def->shift; - - if (shift == 0 || shift & 1) - continue; - - /* adjust to be in terms of 4^shift Kb */ - shift = (shift - 10) >> 1; - - if ((shift >= min_pg) && (shift <= max_pg)) - def->flags |= MMU_PAGE_SIZE_DIRECT; - } - - goto out; - } - - if (fsl_mmu && (mmucfg & MMUCFG_MAVN) == MMUCFG_MAVN_V2) { - u32 tlb1cfg, tlb1ps; - - tlb0cfg = mfspr(SPRN_TLB0CFG); - tlb1cfg = mfspr(SPRN_TLB1CFG); - tlb1ps = mfspr(SPRN_TLB1PS); - eptcfg = mfspr(SPRN_EPTCFG); - - if ((tlb1cfg & TLBnCFG_IND) && (tlb0cfg & TLBnCFG_PT)) - book3e_htw_mode = PPC_HTW_E6500; - - /* - * We expect 4K subpage size and unrestricted indirect size. - * The lack of a restriction on indirect size is a Freescale - * extension, indicated by PSn = 0 but SPSn != 0. - */ - if (eptcfg != 2) - book3e_htw_mode = PPC_HTW_NONE; - - for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) { - struct mmu_psize_def *def = &mmu_psize_defs[psize]; - - if (!def->shift) - continue; - - if (tlb1ps & (1U << (def->shift - 10))) { - def->flags |= MMU_PAGE_SIZE_DIRECT; - - if (book3e_htw_mode && psize == MMU_PAGE_2M) - def->flags |= MMU_PAGE_SIZE_INDIRECT; - } - } - - goto out; - } -#endif -out: - /* Cleanup array and print summary */ - pr_info("MMU: Supported page sizes\n"); - for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) { - struct mmu_psize_def *def = &mmu_psize_defs[psize]; - const char *__page_type_names[] = { - "unsupported", - "direct", - "indirect", - "direct & indirect" - }; - if (def->flags == 0) { - def->shift = 0; - continue; - } - pr_info(" %8ld KB as %s\n", 1ul << (def->shift - 10), - __page_type_names[def->flags & 0x3]); - } -} - -static void __init setup_mmu_htw(void) -{ - /* - * If we want to use HW tablewalk, enable it by patching the TLB miss - * handlers to branch to the one dedicated to it. 
- */ - - switch (book3e_htw_mode) { -#ifdef CONFIG_PPC_E500 - case PPC_HTW_E6500: - extlb_level_exc = EX_TLB_SIZE; - patch_exception(0x1c0, exc_data_tlb_miss_e6500_book3e); - patch_exception(0x1e0, exc_instruction_tlb_miss_e6500_book3e); - break; -#endif - } - pr_info("MMU: Book3E HW tablewalk %s\n", - book3e_htw_mode != PPC_HTW_NONE ? "enabled" : "not supported"); -} - -/* - * Early initialization of the MMU TLB code - */ -static void early_init_this_mmu(void) -{ - unsigned int mas4; - - /* Set MAS4 based on page table setting */ - - mas4 = 0x4 << MAS4_WIMGED_SHIFT; - switch (book3e_htw_mode) { - case PPC_HTW_E6500: - mas4 |= MAS4_INDD; - mas4 |= BOOK3E_PAGESZ_2M << MAS4_TSIZED_SHIFT; - mas4 |= MAS4_TLBSELD(1); - mmu_pte_psize = MMU_PAGE_2M; - break; - - case PPC_HTW_NONE: - mas4 |= BOOK3E_PAGESZ_4K << MAS4_TSIZED_SHIFT; - mmu_pte_psize = mmu_virtual_psize; - break; - } - mtspr(SPRN_MAS4, mas4); - -#ifdef CONFIG_PPC_E500 - if (mmu_has_feature(MMU_FTR_TYPE_FSL_E)) { - unsigned int num_cams; - bool map = true; - - /* use a quarter of the TLBCAM for bolted linear map */ - num_cams = (mfspr(SPRN_TLB1CFG) & TLBnCFG_N_ENTRY) / 4; - - /* - * Only do the mapping once per core, or else the - * transient mapping would cause problems. - */ -#ifdef CONFIG_SMP - if (hweight32(get_tensr()) > 1) - map = false; -#endif - - if (map) - linear_map_top = map_mem_in_cams(linear_map_top, - num_cams, false, true); - } -#endif - - /* A sync won't hurt us after mucking around with - * the MMU configuration - */ - mb(); -} - -static void __init early_init_mmu_global(void) -{ - /* XXX This should be decided at runtime based on supported - * page sizes in the TLB, but for now let's assume 16M is - * always there and a good fit (which it probably is) - * - * Freescale booke only supports 4K pages in TLB0, so use that. - */ - if (mmu_has_feature(MMU_FTR_TYPE_FSL_E)) - mmu_vmemmap_psize = MMU_PAGE_4K; - else - mmu_vmemmap_psize = MMU_PAGE_16M; - - /* XXX This code only checks for TLB 0 capabilities and doesn't - * check what page size combos are supported by the HW. It - * also doesn't handle the case where a separate array holds - * the IND entries from the array loaded by the PT. - */ - /* Look for supported page sizes */ - setup_page_sizes(); - - /* Look for HW tablewalk support */ - setup_mmu_htw(); - -#ifdef CONFIG_PPC_E500 - if (mmu_has_feature(MMU_FTR_TYPE_FSL_E)) { - if (book3e_htw_mode == PPC_HTW_NONE) { - extlb_level_exc = EX_TLB_SIZE; - patch_exception(0x1c0, exc_data_tlb_miss_bolted_book3e); - patch_exception(0x1e0, - exc_instruction_tlb_miss_bolted_book3e); - } - } -#endif - - /* Set the global containing the top of the linear mapping - * for use by the TLB miss code - */ - linear_map_top = memblock_end_of_DRAM(); - - ioremap_bot = IOREMAP_BASE; -} - -static void __init early_mmu_set_memory_limit(void) -{ -#ifdef CONFIG_PPC_E500 - if (mmu_has_feature(MMU_FTR_TYPE_FSL_E)) { - /* - * Limit memory so we dont have linear faults. - * Unlike memblock_set_current_limit, which limits - * memory available during early boot, this permanently - * reduces the memory available to Linux. We need to - * do this because highmem is not supported on 64-bit. 
- */ - memblock_enforce_memory_limit(linear_map_top); - } -#endif - - memblock_set_current_limit(linear_map_top); -} - -/* boot cpu only */ -void __init early_init_mmu(void) -{ - early_init_mmu_global(); - early_init_this_mmu(); - early_mmu_set_memory_limit(); -} - -void early_init_mmu_secondary(void) -{ - early_init_this_mmu(); -} - -void setup_initial_memory_limit(phys_addr_t first_memblock_base, - phys_addr_t first_memblock_size) -{ - /* On non-FSL Embedded 64-bit, we adjust the RMA size to match - * the bolted TLB entry. We know for now that only 1G - * entries are supported though that may eventually - * change. - * - * on FSL Embedded 64-bit, usually all RAM is bolted, but with - * unusual memory sizes it's possible for some RAM to not be mapped - * (such RAM is not used at all by Linux, since we don't support - * highmem on 64-bit). We limit ppc64_rma_size to what would be - * mappable if this memblock is the only one. Additional memblocks - * can only increase, not decrease, the amount that ends up getting - * mapped. We still limit max to 1G even if we'll eventually map - * more. This is due to what the early init code is set up to do. - * - * We crop it to the size of the first MEMBLOCK to - * avoid going over total available memory just in case... - */ -#ifdef CONFIG_PPC_E500 - if (early_mmu_has_feature(MMU_FTR_TYPE_FSL_E)) { - unsigned long linear_sz; - unsigned int num_cams; - - /* use a quarter of the TLBCAM for bolted linear map */ - num_cams = (mfspr(SPRN_TLB1CFG) & TLBnCFG_N_ENTRY) / 4; - - linear_sz = map_mem_in_cams(first_memblock_size, num_cams, - true, true); - - ppc64_rma_size = min_t(u64, linear_sz, 0x40000000); - } else -#endif - ppc64_rma_size = min_t(u64, first_memblock_size, 0x40000000); - - /* Finally limit subsequent allocations */ - memblock_set_current_limit(first_memblock_base + ppc64_rma_size); -} -#else /* ! CONFIG_PPC64 */ +#ifndef CONFIG_PPC64 void __init early_init_mmu(void) { #ifdef CONFIG_PPC_47x diff --git a/arch/powerpc/mm/nohash/tlb_64e.c b/arch/powerpc/mm/nohash/tlb_64e.c new file mode 100644 index 000000000000..1dcda261554c --- /dev/null +++ b/arch/powerpc/mm/nohash/tlb_64e.c @@ -0,0 +1,361 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright 2008,2009 Ben Herrenschmidt benh@kernel.crashing.org + * IBM Corp. + * + * Derived from arch/ppc/mm/init.c: + * Copyright (C) 1995-1996 Gary Thomas (gdt@linuxppc.org) + * + * Modifications by Paul Mackerras (PowerMac) (paulus@cs.anu.edu.au) + * and Cort Dougan (PReP) (cort@cs.nmt.edu) + * Copyright (C) 1996 Paul Mackerras + * + * Derived from "arch/i386/mm/init.c" + * Copyright (C) 1991, 1992, 1993, 1994 Linus Torvalds + */ + +#include <linux/kernel.h> +#include <linux/export.h> +#include <linux/mm.h> +#include <linux/init.h> +#include <linux/pagemap.h> +#include <linux/memblock.h> + +#include <asm/pgalloc.h> +#include <asm/tlbflush.h> +#include <asm/tlb.h> +#include <asm/code-patching.h> +#include <asm/cputhreads.h> + +#include <mm/mmu_decl.h> + +/* The variables below are currently only used on 64-bit Book3E + * though this will probably be made common with other nohash + * implementations at some point + */ +int mmu_pte_psize; /* Page size used for PTE pages */ +int mmu_vmemmap_psize; /* Page size used for the virtual mem map */ +int book3e_htw_mode; /* HW tablewalk? Value is PPC_HTW_* */ +unsigned long linear_map_top; /* Top of linear mapping */ + + +/* + * Number of bytes to add to SPRN_SPRG_TLB_EXFRAME on crit/mcheck/debug + * exceptions. 
This is used for bolted and e6500 TLB miss handlers which + * do not modify this SPRG in the TLB miss code; for other TLB miss handlers, + * this is set to zero. + */ +int extlb_level_exc; + +/* + * Handling of virtual linear page tables or indirect TLB entries + * flushing when PTE pages are freed + */ +void tlb_flush_pgtable(struct mmu_gather *tlb, unsigned long address) +{ + int tsize = mmu_psize_defs[mmu_pte_psize].enc; + + if (book3e_htw_mode != PPC_HTW_NONE) { + unsigned long start = address & PMD_MASK; + unsigned long end = address + PMD_SIZE; + unsigned long size = 1UL << mmu_psize_defs[mmu_pte_psize].shift; + + /* This isn't the most optimal, ideally we would factor out the + * while preempt & CPU mask mucking around, or even the IPI but + * it will do for now + */ + while (start < end) { + __flush_tlb_page(tlb->mm, start, tsize, 1); + start += size; + } + } else { + unsigned long rmask = 0xf000000000000000ul; + unsigned long rid = (address & rmask) | 0x1000000000000000ul; + unsigned long vpte = address & ~rmask; + + vpte = (vpte >> (PAGE_SHIFT - 3)) & ~0xffful; + vpte |= rid; + __flush_tlb_page(tlb->mm, vpte, tsize, 0); + } +} + +static void __init setup_page_sizes(void) +{ + unsigned int tlb0cfg; + unsigned int eptcfg; + int psize; + +#ifdef CONFIG_PPC_E500 + unsigned int mmucfg = mfspr(SPRN_MMUCFG); + int fsl_mmu = mmu_has_feature(MMU_FTR_TYPE_FSL_E); + + if (fsl_mmu && (mmucfg & MMUCFG_MAVN) == MMUCFG_MAVN_V1) { + unsigned int tlb1cfg = mfspr(SPRN_TLB1CFG); + unsigned int min_pg, max_pg; + + min_pg = (tlb1cfg & TLBnCFG_MINSIZE) >> TLBnCFG_MINSIZE_SHIFT; + max_pg = (tlb1cfg & TLBnCFG_MAXSIZE) >> TLBnCFG_MAXSIZE_SHIFT; + + for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) { + struct mmu_psize_def *def; + unsigned int shift; + + def = &mmu_psize_defs[psize]; + shift = def->shift; + + if (shift == 0 || shift & 1) + continue; + + /* adjust to be in terms of 4^shift Kb */ + shift = (shift - 10) >> 1; + + if ((shift >= min_pg) && (shift <= max_pg)) + def->flags |= MMU_PAGE_SIZE_DIRECT; + } + + goto out; + } + + if (fsl_mmu && (mmucfg & MMUCFG_MAVN) == MMUCFG_MAVN_V2) { + u32 tlb1cfg, tlb1ps; + + tlb0cfg = mfspr(SPRN_TLB0CFG); + tlb1cfg = mfspr(SPRN_TLB1CFG); + tlb1ps = mfspr(SPRN_TLB1PS); + eptcfg = mfspr(SPRN_EPTCFG); + + if ((tlb1cfg & TLBnCFG_IND) && (tlb0cfg & TLBnCFG_PT)) + book3e_htw_mode = PPC_HTW_E6500; + + /* + * We expect 4K subpage size and unrestricted indirect size. + * The lack of a restriction on indirect size is a Freescale + * extension, indicated by PSn = 0 but SPSn != 0. 
+ */ + if (eptcfg != 2) + book3e_htw_mode = PPC_HTW_NONE; + + for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) { + struct mmu_psize_def *def = &mmu_psize_defs[psize]; + + if (!def->shift) + continue; + + if (tlb1ps & (1U << (def->shift - 10))) { + def->flags |= MMU_PAGE_SIZE_DIRECT; + + if (book3e_htw_mode && psize == MMU_PAGE_2M) + def->flags |= MMU_PAGE_SIZE_INDIRECT; + } + } + + goto out; + } +#endif +out: + /* Cleanup array and print summary */ + pr_info("MMU: Supported page sizes\n"); + for (psize = 0; psize < MMU_PAGE_COUNT; ++psize) { + struct mmu_psize_def *def = &mmu_psize_defs[psize]; + const char *__page_type_names[] = { + "unsupported", + "direct", + "indirect", + "direct & indirect" + }; + if (def->flags == 0) { + def->shift = 0; + continue; + } + pr_info(" %8ld KB as %s\n", 1ul << (def->shift - 10), + __page_type_names[def->flags & 0x3]); + } +} + +static void __init setup_mmu_htw(void) +{ + /* + * If we want to use HW tablewalk, enable it by patching the TLB miss + * handlers to branch to the one dedicated to it. + */ + + switch (book3e_htw_mode) { +#ifdef CONFIG_PPC_E500 + case PPC_HTW_E6500: + extlb_level_exc = EX_TLB_SIZE; + patch_exception(0x1c0, exc_data_tlb_miss_e6500_book3e); + patch_exception(0x1e0, exc_instruction_tlb_miss_e6500_book3e); + break; +#endif + } + pr_info("MMU: Book3E HW tablewalk %s\n", + book3e_htw_mode != PPC_HTW_NONE ? "enabled" : "not supported"); +} + +/* + * Early initialization of the MMU TLB code + */ +static void early_init_this_mmu(void) +{ + unsigned int mas4; + + /* Set MAS4 based on page table setting */ + + mas4 = 0x4 << MAS4_WIMGED_SHIFT; + switch (book3e_htw_mode) { + case PPC_HTW_E6500: + mas4 |= MAS4_INDD; + mas4 |= BOOK3E_PAGESZ_2M << MAS4_TSIZED_SHIFT; + mas4 |= MAS4_TLBSELD(1); + mmu_pte_psize = MMU_PAGE_2M; + break; + + case PPC_HTW_NONE: + mas4 |= BOOK3E_PAGESZ_4K << MAS4_TSIZED_SHIFT; + mmu_pte_psize = mmu_virtual_psize; + break; + } + mtspr(SPRN_MAS4, mas4); + +#ifdef CONFIG_PPC_E500 + if (mmu_has_feature(MMU_FTR_TYPE_FSL_E)) { + unsigned int num_cams; + bool map = true; + + /* use a quarter of the TLBCAM for bolted linear map */ + num_cams = (mfspr(SPRN_TLB1CFG) & TLBnCFG_N_ENTRY) / 4; + + /* + * Only do the mapping once per core, or else the + * transient mapping would cause problems. + */ +#ifdef CONFIG_SMP + if (hweight32(get_tensr()) > 1) + map = false; +#endif + + if (map) + linear_map_top = map_mem_in_cams(linear_map_top, + num_cams, false, true); + } +#endif + + /* A sync won't hurt us after mucking around with + * the MMU configuration + */ + mb(); +} + +static void __init early_init_mmu_global(void) +{ + /* XXX This should be decided at runtime based on supported + * page sizes in the TLB, but for now let's assume 16M is + * always there and a good fit (which it probably is) + * + * Freescale booke only supports 4K pages in TLB0, so use that. + */ + if (mmu_has_feature(MMU_FTR_TYPE_FSL_E)) + mmu_vmemmap_psize = MMU_PAGE_4K; + else + mmu_vmemmap_psize = MMU_PAGE_16M; + + /* XXX This code only checks for TLB 0 capabilities and doesn't + * check what page size combos are supported by the HW. It + * also doesn't handle the case where a separate array holds + * the IND entries from the array loaded by the PT. 
+ */ + /* Look for supported page sizes */ + setup_page_sizes(); + + /* Look for HW tablewalk support */ + setup_mmu_htw(); + +#ifdef CONFIG_PPC_E500 + if (mmu_has_feature(MMU_FTR_TYPE_FSL_E)) { + if (book3e_htw_mode == PPC_HTW_NONE) { + extlb_level_exc = EX_TLB_SIZE; + patch_exception(0x1c0, exc_data_tlb_miss_bolted_book3e); + patch_exception(0x1e0, + exc_instruction_tlb_miss_bolted_book3e); + } + } +#endif + + /* Set the global containing the top of the linear mapping + * for use by the TLB miss code + */ + linear_map_top = memblock_end_of_DRAM(); + + ioremap_bot = IOREMAP_BASE; +} + +static void __init early_mmu_set_memory_limit(void) +{ +#ifdef CONFIG_PPC_E500 + if (mmu_has_feature(MMU_FTR_TYPE_FSL_E)) { + /* + * Limit memory so we dont have linear faults. + * Unlike memblock_set_current_limit, which limits + * memory available during early boot, this permanently + * reduces the memory available to Linux. We need to + * do this because highmem is not supported on 64-bit. + */ + memblock_enforce_memory_limit(linear_map_top); + } +#endif + + memblock_set_current_limit(linear_map_top); +} + +/* boot cpu only */ +void __init early_init_mmu(void) +{ + early_init_mmu_global(); + early_init_this_mmu(); + early_mmu_set_memory_limit(); +} + +void early_init_mmu_secondary(void) +{ + early_init_this_mmu(); +} + +void setup_initial_memory_limit(phys_addr_t first_memblock_base, + phys_addr_t first_memblock_size) +{ + /* On non-FSL Embedded 64-bit, we adjust the RMA size to match + * the bolted TLB entry. We know for now that only 1G + * entries are supported though that may eventually + * change. + * + * on FSL Embedded 64-bit, usually all RAM is bolted, but with + * unusual memory sizes it's possible for some RAM to not be mapped + * (such RAM is not used at all by Linux, since we don't support + * highmem on 64-bit). We limit ppc64_rma_size to what would be + * mappable if this memblock is the only one. Additional memblocks + * can only increase, not decrease, the amount that ends up getting + * mapped. We still limit max to 1G even if we'll eventually map + * more. This is due to what the early init code is set up to do. + * + * We crop it to the size of the first MEMBLOCK to + * avoid going over total available memory just in case... + */ +#ifdef CONFIG_PPC_E500 + if (early_mmu_has_feature(MMU_FTR_TYPE_FSL_E)) { + unsigned long linear_sz; + unsigned int num_cams; + + /* use a quarter of the TLBCAM for bolted linear map */ + num_cams = (mfspr(SPRN_TLB1CFG) & TLBnCFG_N_ENTRY) / 4; + + linear_sz = map_mem_in_cams(first_memblock_size, num_cams, + true, true); + + ppc64_rma_size = min_t(u64, linear_sz, 0x40000000); + } else +#endif + ppc64_rma_size = min_t(u64, first_memblock_size, 0x40000000); + + /* Finally limit subsequent allocations */ + memblock_set_current_limit(first_memblock_base + ppc64_rma_size); +}
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christophe Leroy christophe.leroy@csgroup.eu
[ Upstream commit d92b5cc29c792f1d3f0aaa3b29dddfe816c03e88 ]
mmu_pte_psize is only used in tlb_64e.c, so define it static.
Fixes: 25d21ad6e799 ("powerpc: Add TLB management code for 64-bit Book3E") Reported-by: kernel test robot lkp@intel.com Closes: https://lore.kernel.org/oe-kbuild-all/202408011256.1O99IB0s-lkp@intel.com/ Signed-off-by: Christophe Leroy christophe.leroy@csgroup.eu Signed-off-by: Michael Ellerman mpe@ellerman.id.au Link: https://msgid.link/beb30d280eaa5d857c38a0834b147dffd6b28aa9.1724157750.git.c... Signed-off-by: Sasha Levin sashal@kernel.org --- arch/powerpc/mm/nohash/tlb_64e.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/mm/nohash/tlb_64e.c b/arch/powerpc/mm/nohash/tlb_64e.c index 1dcda261554c..b6af3ec4d001 100644 --- a/arch/powerpc/mm/nohash/tlb_64e.c +++ b/arch/powerpc/mm/nohash/tlb_64e.c @@ -33,7 +33,7 @@ * though this will probably be made common with other nohash * implementations at some point */ -int mmu_pte_psize; /* Page size used for PTE pages */ +static int mmu_pte_psize; /* Page size used for PTE pages */ int mmu_vmemmap_psize; /* Page size used for the virtual mem map */ int book3e_htw_mode; /* HW tablewalk? Value is PPC_HTW_* */ unsigned long linear_map_top; /* Top of linear mapping */
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mohan Kumar mkumard@nvidia.com
[ Upstream commit 6781b962d97bc52715a8db8cc17278cc3c23ebe8 ]
When the Tegra audio drivers are built as part of the kernel image, a TIMEOUT_ERR is observed from cbb-fabric. The following is seen on Jetson AGX Orin during boot:
[ 8.012482] ************************************** [ 8.017423] CPU:0, Error:cbb-fabric, Errmon:2 [ 8.021922] Error Code : TIMEOUT_ERR [ 8.025966] Overflow : Multiple TIMEOUT_ERR [ 8.030644] [ 8.032175] Error Code : TIMEOUT_ERR [ 8.036217] MASTER_ID : CCPLEX [ 8.039722] Address : 0x290a0a8 [ 8.043318] Cache : 0x1 -- Bufferable [ 8.047630] Protection : 0x2 -- Unprivileged, Non-Secure, Data Access [ 8.054628] Access_Type : Write
[ 8.106130] WARNING: CPU: 0 PID: 124 at drivers/soc/tegra/cbb/tegra234-cbb.c:604 tegra234_cbb_isr+0x134/0x178
[ 8.240602] Call trace: [ 8.243126] tegra234_cbb_isr+0x134/0x178 [ 8.247261] __handle_irq_event_percpu+0x60/0x238 [ 8.252132] handle_irq_event+0x54/0xb8
These errors happen when the MVC device, which is a child of the AHUB device, tries to access its registers. This happens as part of the tegra210_mvc_reset_vol_settings() call in the MVC device probe().
The root cause of this problem is that the child MVC device gets probed before the AHUB clock is enabled. The AHUB clock is enabled in the runtime PM resume of the parent AHUB device, and due to the wrong ordering of pm_runtime_enable() in the AHUB driver, runtime PM resume does not happen for the AHUB device when MVC makes its register access.
Fix this by calling pm_runtime_enable() for the parent AHUB device before of_platform_populate() in the AHUB driver. This ensures that the clock becomes available when MVC makes its register access.
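Since the diff below is line-wrapped, here is a minimal sketch of the corrected ordering; the function name ahub_probe_sketch is hypothetical and the error handling is reduced to the relevant path:

#include <linux/of_platform.h>
#include <linux/platform_device.h>
#include <linux/pm_runtime.h>

/* Hypothetical, trimmed-down probe: enable runtime PM on the parent before
 * populating child devices, so a child's register access can runtime-resume
 * the parent and turn on its clock. */
static int ahub_probe_sketch(struct platform_device *pdev)
{
        int err;

        pm_runtime_enable(&pdev->dev);

        err = of_platform_populate(pdev->dev.of_node, NULL, NULL, &pdev->dev);
        if (err) {
                /* undo on failure so the enable count stays balanced */
                pm_runtime_disable(&pdev->dev);
                return err;
        }

        return 0;
}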
Fixes: 16e1bcc2caf4 ("ASoC: tegra: Add Tegra210 based AHUB driver") Signed-off-by: Mohan Kumar mkumard@nvidia.com Signed-off-by: Ritu Chaudhary rituc@nvidia.com Signed-off-by: Sameer Pujar spujar@nvidia.com Link: https://patch.msgid.link/20240823144342.4123814-3-spujar@nvidia.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/tegra/tegra210_ahub.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/sound/soc/tegra/tegra210_ahub.c b/sound/soc/tegra/tegra210_ahub.c index b38d205b69cc..dfdcb4580cd7 100644 --- a/sound/soc/tegra/tegra210_ahub.c +++ b/sound/soc/tegra/tegra210_ahub.c @@ -2,7 +2,7 @@ // // tegra210_ahub.c - Tegra210 AHUB driver // -// Copyright (c) 2020-2022, NVIDIA CORPORATION. All rights reserved. +// Copyright (c) 2020-2024, NVIDIA CORPORATION. All rights reserved.
#include <linux/clk.h> #include <linux/device.h> @@ -1401,11 +1401,13 @@ static int tegra_ahub_probe(struct platform_device *pdev) return err; }
+ pm_runtime_enable(&pdev->dev); + err = of_platform_populate(pdev->dev.of_node, NULL, NULL, &pdev->dev); - if (err) + if (err) { + pm_runtime_disable(&pdev->dev); return err; - - pm_runtime_enable(&pdev->dev); + }
return 0; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Maurizio Lombardi mlombard@redhat.com
[ Upstream commit 5572a55a6f830ee3f3a994b6b962a5c327d28cb3 ]
If the command allocation fails in nvmet_tcp_alloc_cmds(), the kernel crashes in nvmet_tcp_release_queue_work() because of a NULL pointer dereference.
nvmet: failed to install queue 0 cntlid 1 ret 6 Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
Fix the bug by setting queue->nr_cmds to zero in case nvmet_tcp_alloc_cmds() fails.
Fixes: 872d26a391da ("nvmet-tcp: add NVMe over TCP target driver") Signed-off-by: Maurizio Lombardi mlombard@redhat.com Reviewed-by: Christoph Hellwig hch@lst.de Signed-off-by: Keith Busch kbusch@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/target/tcp.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c index 76b9eb438268..81574500a57c 100644 --- a/drivers/nvme/target/tcp.c +++ b/drivers/nvme/target/tcp.c @@ -1816,8 +1816,10 @@ static u16 nvmet_tcp_install_queue(struct nvmet_sq *sq) }
queue->nr_cmds = sq->size * 2; - if (nvmet_tcp_alloc_cmds(queue)) + if (nvmet_tcp_alloc_cmds(queue)) { + queue->nr_cmds = 0; return NVME_SC_INTERNAL; + } return 0; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chen-Yu Tsai wenst@chromium.org
[ Upstream commit e0be875c5bf03a9676a6bfed9e0f1766922a7dbd ]
The SOF topology loading function sets the device name for the platform component link. This should be unset when unloading the topology, otherwise a machine driver unbind/bind or reprobe would complain about an invalid component having both its component name and of_node set:
mt8186_mt6366 sound: ASoC: Both Component name/of_node are set for AFE_SOF_DL1 mt8186_mt6366 sound: error -EINVAL: Cannot register card mt8186_mt6366 sound: probe with driver mt8186_mt6366 failed with error -22
This happens with machine drivers that set the of_node separately.
Clear the SOF link platform name in the topology unload callback.
Fixes: 311ce4fe7637 ("ASoC: SOF: Add support for loading topologies") Signed-off-by: Chen-Yu Tsai wenst@chromium.org Link: https://patch.msgid.link/20240821041006.2618855-1-wenst@chromium.org Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/sof/topology.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/sound/soc/sof/topology.c b/sound/soc/sof/topology.c index e7305ce57ea1..374c8b1d6958 100644 --- a/sound/soc/sof/topology.c +++ b/sound/soc/sof/topology.c @@ -1817,6 +1817,8 @@ static int sof_link_unload(struct snd_soc_component *scomp, struct snd_soc_dobj if (!slink) return 0;
+ slink->link->platforms->name = NULL; + kfree(slink->tuples); list_del(&slink->list); kfree(slink->hw_configs);
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Matteo Martelli matteomartelli3@gmail.com
[ Upstream commit 3e83957e8dd7433a69116780d9bad217b00913ea ]
This fixes the LRCLK polarity for sun8i-h3 and sun50i-h6 in i2s mode, which was wrongly inverted.
The LRCLK was being set with reversed logic compared to the DAI format: inverted LRCLK for SND_SOC_DAIFMT_IB_NF and SND_SOC_DAIFMT_NB_NF; normal LRCLK for SND_SOC_DAIFMT_IB_IF and SND_SOC_DAIFMT_NB_IF. Such reversed logic works correctly for the DSP_A, DSP_B, LEFT_J and RIGHT_J modes but not for I2S mode, for which the LRCLK signal ends up reversed with respect to what is expected on the bus. The issue is due to a misinterpretation of the LRCLK polarity bit of the H3 and H6 i2s controllers. In this case the bit does not mean "0 => normal" or "1 => inverted" with respect to the expected bus operation, but rather "0 => frame starts on low edge" and "1 => frame starts on high edge" (from the User Manuals).
This commit fixes the LRCLK polarity by setting the LRCLK polarity bit according to the selected bus mode and renames the LRCLK polarity bit definition to avoid further confusion.
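A condensed sketch of the resulting logic (bit and constant names as in the patch below; the format switch is abridged and unsupported formats, which the real code rejects, are omitted):

/* Base LRCLK polarity follows the bus format: I2S frames start on the
 * low edge, DSP_A/DSP_B/LEFT_J/RIGHT_J frames start on the high edge. */
switch (fmt & SND_SOC_DAIFMT_FORMAT_MASK) {
case SND_SOC_DAIFMT_I2S:
        lrclk_pol = SUN8I_I2S_FMT0_LRCLK_POLARITY_START_LOW;
        break;
default:        /* DSP_A, DSP_B, LEFT_J, RIGHT_J */
        lrclk_pol = SUN8I_I2S_FMT0_LRCLK_POLARITY_START_HIGH;
        break;
}

/* Then apply any frame clock inversion requested by the DAI format. */
if ((fmt & SND_SOC_DAIFMT_INV_MASK) == SND_SOC_DAIFMT_IB_IF ||
    (fmt & SND_SOC_DAIFMT_INV_MASK) == SND_SOC_DAIFMT_NB_IF)
        lrclk_pol ^= SUN8I_I2S_FMT0_LRCLK_POLARITY_MASK;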
Fixes: dd657eae8164 ("ASoC: sun4i-i2s: Fix the LRCK polarity") Fixes: 73adf87b7a58 ("ASoC: sun4i-i2s: Add support for H6 I2S") Signed-off-by: Matteo Martelli matteomartelli3@gmail.com Link: https://patch.msgid.link/20240801-asoc-fix-sun4i-i2s-v2-1-a8e4e9daa363@gmail... Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- sound/soc/sunxi/sun4i-i2s.c | 143 ++++++++++++++++++------------------ 1 file changed, 73 insertions(+), 70 deletions(-)
diff --git a/sound/soc/sunxi/sun4i-i2s.c b/sound/soc/sunxi/sun4i-i2s.c index 6028871825ba..47faaf849de0 100644 --- a/sound/soc/sunxi/sun4i-i2s.c +++ b/sound/soc/sunxi/sun4i-i2s.c @@ -100,8 +100,8 @@ #define SUN8I_I2S_CTRL_MODE_PCM (0 << 4)
#define SUN8I_I2S_FMT0_LRCLK_POLARITY_MASK BIT(19) -#define SUN8I_I2S_FMT0_LRCLK_POLARITY_INVERTED (1 << 19) -#define SUN8I_I2S_FMT0_LRCLK_POLARITY_NORMAL (0 << 19) +#define SUN8I_I2S_FMT0_LRCLK_POLARITY_START_HIGH (1 << 19) +#define SUN8I_I2S_FMT0_LRCLK_POLARITY_START_LOW (0 << 19) #define SUN8I_I2S_FMT0_LRCK_PERIOD_MASK GENMASK(17, 8) #define SUN8I_I2S_FMT0_LRCK_PERIOD(period) ((period - 1) << 8) #define SUN8I_I2S_FMT0_BCLK_POLARITY_MASK BIT(7) @@ -727,65 +727,37 @@ static int sun4i_i2s_set_soc_fmt(const struct sun4i_i2s *i2s, static int sun8i_i2s_set_soc_fmt(const struct sun4i_i2s *i2s, unsigned int fmt) { - u32 mode, val; + u32 mode, lrclk_pol, bclk_pol, val; u8 offset;
- /* - * DAI clock polarity - * - * The setup for LRCK contradicts the datasheet, but under a - * scope it's clear that the LRCK polarity is reversed - * compared to the expected polarity on the bus. - */ - switch (fmt & SND_SOC_DAIFMT_INV_MASK) { - case SND_SOC_DAIFMT_IB_IF: - /* Invert both clocks */ - val = SUN8I_I2S_FMT0_BCLK_POLARITY_INVERTED; - break; - case SND_SOC_DAIFMT_IB_NF: - /* Invert bit clock */ - val = SUN8I_I2S_FMT0_BCLK_POLARITY_INVERTED | - SUN8I_I2S_FMT0_LRCLK_POLARITY_INVERTED; - break; - case SND_SOC_DAIFMT_NB_IF: - /* Invert frame clock */ - val = 0; - break; - case SND_SOC_DAIFMT_NB_NF: - val = SUN8I_I2S_FMT0_LRCLK_POLARITY_INVERTED; - break; - default: - return -EINVAL; - } - - regmap_update_bits(i2s->regmap, SUN4I_I2S_FMT0_REG, - SUN8I_I2S_FMT0_LRCLK_POLARITY_MASK | - SUN8I_I2S_FMT0_BCLK_POLARITY_MASK, - val); - /* DAI Mode */ switch (fmt & SND_SOC_DAIFMT_FORMAT_MASK) { case SND_SOC_DAIFMT_DSP_A: + lrclk_pol = SUN8I_I2S_FMT0_LRCLK_POLARITY_START_HIGH; mode = SUN8I_I2S_CTRL_MODE_PCM; offset = 1; break;
case SND_SOC_DAIFMT_DSP_B: + lrclk_pol = SUN8I_I2S_FMT0_LRCLK_POLARITY_START_HIGH; mode = SUN8I_I2S_CTRL_MODE_PCM; offset = 0; break;
case SND_SOC_DAIFMT_I2S: + lrclk_pol = SUN8I_I2S_FMT0_LRCLK_POLARITY_START_LOW; mode = SUN8I_I2S_CTRL_MODE_LEFT; offset = 1; break;
case SND_SOC_DAIFMT_LEFT_J: + lrclk_pol = SUN8I_I2S_FMT0_LRCLK_POLARITY_START_HIGH; mode = SUN8I_I2S_CTRL_MODE_LEFT; offset = 0; break;
case SND_SOC_DAIFMT_RIGHT_J: + lrclk_pol = SUN8I_I2S_FMT0_LRCLK_POLARITY_START_HIGH; mode = SUN8I_I2S_CTRL_MODE_RIGHT; offset = 0; break; @@ -803,6 +775,35 @@ static int sun8i_i2s_set_soc_fmt(const struct sun4i_i2s *i2s, SUN8I_I2S_TX_CHAN_OFFSET_MASK, SUN8I_I2S_TX_CHAN_OFFSET(offset));
+ /* DAI clock polarity */ + bclk_pol = SUN8I_I2S_FMT0_BCLK_POLARITY_NORMAL; + + switch (fmt & SND_SOC_DAIFMT_INV_MASK) { + case SND_SOC_DAIFMT_IB_IF: + /* Invert both clocks */ + lrclk_pol ^= SUN8I_I2S_FMT0_LRCLK_POLARITY_MASK; + bclk_pol = SUN8I_I2S_FMT0_BCLK_POLARITY_INVERTED; + break; + case SND_SOC_DAIFMT_IB_NF: + /* Invert bit clock */ + bclk_pol = SUN8I_I2S_FMT0_BCLK_POLARITY_INVERTED; + break; + case SND_SOC_DAIFMT_NB_IF: + /* Invert frame clock */ + lrclk_pol ^= SUN8I_I2S_FMT0_LRCLK_POLARITY_MASK; + break; + case SND_SOC_DAIFMT_NB_NF: + /* No inversion */ + break; + default: + return -EINVAL; + } + + regmap_update_bits(i2s->regmap, SUN4I_I2S_FMT0_REG, + SUN8I_I2S_FMT0_LRCLK_POLARITY_MASK | + SUN8I_I2S_FMT0_BCLK_POLARITY_MASK, + lrclk_pol | bclk_pol); + /* DAI clock master masks */ switch (fmt & SND_SOC_DAIFMT_CLOCK_PROVIDER_MASK) { case SND_SOC_DAIFMT_BP_FP: @@ -834,65 +835,37 @@ static int sun8i_i2s_set_soc_fmt(const struct sun4i_i2s *i2s, static int sun50i_h6_i2s_set_soc_fmt(const struct sun4i_i2s *i2s, unsigned int fmt) { - u32 mode, val; + u32 mode, lrclk_pol, bclk_pol, val; u8 offset;
- /* - * DAI clock polarity - * - * The setup for LRCK contradicts the datasheet, but under a - * scope it's clear that the LRCK polarity is reversed - * compared to the expected polarity on the bus. - */ - switch (fmt & SND_SOC_DAIFMT_INV_MASK) { - case SND_SOC_DAIFMT_IB_IF: - /* Invert both clocks */ - val = SUN8I_I2S_FMT0_BCLK_POLARITY_INVERTED; - break; - case SND_SOC_DAIFMT_IB_NF: - /* Invert bit clock */ - val = SUN8I_I2S_FMT0_BCLK_POLARITY_INVERTED | - SUN8I_I2S_FMT0_LRCLK_POLARITY_INVERTED; - break; - case SND_SOC_DAIFMT_NB_IF: - /* Invert frame clock */ - val = 0; - break; - case SND_SOC_DAIFMT_NB_NF: - val = SUN8I_I2S_FMT0_LRCLK_POLARITY_INVERTED; - break; - default: - return -EINVAL; - } - - regmap_update_bits(i2s->regmap, SUN4I_I2S_FMT0_REG, - SUN8I_I2S_FMT0_LRCLK_POLARITY_MASK | - SUN8I_I2S_FMT0_BCLK_POLARITY_MASK, - val); - /* DAI Mode */ switch (fmt & SND_SOC_DAIFMT_FORMAT_MASK) { case SND_SOC_DAIFMT_DSP_A: + lrclk_pol = SUN8I_I2S_FMT0_LRCLK_POLARITY_START_HIGH; mode = SUN8I_I2S_CTRL_MODE_PCM; offset = 1; break;
case SND_SOC_DAIFMT_DSP_B: + lrclk_pol = SUN8I_I2S_FMT0_LRCLK_POLARITY_START_HIGH; mode = SUN8I_I2S_CTRL_MODE_PCM; offset = 0; break;
case SND_SOC_DAIFMT_I2S: + lrclk_pol = SUN8I_I2S_FMT0_LRCLK_POLARITY_START_LOW; mode = SUN8I_I2S_CTRL_MODE_LEFT; offset = 1; break;
case SND_SOC_DAIFMT_LEFT_J: + lrclk_pol = SUN8I_I2S_FMT0_LRCLK_POLARITY_START_HIGH; mode = SUN8I_I2S_CTRL_MODE_LEFT; offset = 0; break;
case SND_SOC_DAIFMT_RIGHT_J: + lrclk_pol = SUN8I_I2S_FMT0_LRCLK_POLARITY_START_HIGH; mode = SUN8I_I2S_CTRL_MODE_RIGHT; offset = 0; break; @@ -910,6 +883,36 @@ static int sun50i_h6_i2s_set_soc_fmt(const struct sun4i_i2s *i2s, SUN50I_H6_I2S_TX_CHAN_SEL_OFFSET_MASK, SUN50I_H6_I2S_TX_CHAN_SEL_OFFSET(offset));
+ /* DAI clock polarity */ + bclk_pol = SUN8I_I2S_FMT0_BCLK_POLARITY_NORMAL; + + switch (fmt & SND_SOC_DAIFMT_INV_MASK) { + case SND_SOC_DAIFMT_IB_IF: + /* Invert both clocks */ + lrclk_pol ^= SUN8I_I2S_FMT0_LRCLK_POLARITY_MASK; + bclk_pol = SUN8I_I2S_FMT0_BCLK_POLARITY_INVERTED; + break; + case SND_SOC_DAIFMT_IB_NF: + /* Invert bit clock */ + bclk_pol = SUN8I_I2S_FMT0_BCLK_POLARITY_INVERTED; + break; + case SND_SOC_DAIFMT_NB_IF: + /* Invert frame clock */ + lrclk_pol ^= SUN8I_I2S_FMT0_LRCLK_POLARITY_MASK; + break; + case SND_SOC_DAIFMT_NB_NF: + /* No inversion */ + break; + default: + return -EINVAL; + } + + regmap_update_bits(i2s->regmap, SUN4I_I2S_FMT0_REG, + SUN8I_I2S_FMT0_LRCLK_POLARITY_MASK | + SUN8I_I2S_FMT0_BCLK_POLARITY_MASK, + lrclk_pol | bclk_pol); + + /* DAI clock master masks */ switch (fmt & SND_SOC_DAIFMT_CLOCK_PROVIDER_MASK) { case SND_SOC_DAIFMT_BP_FP:
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
[ Upstream commit fcd9e8afd546f6ced378d078345a89bf346d065e ]
When debug_fence_init_onstack() is unused (CONFIG_DRM_I915_SELFTEST=n), it prevents kernel builds with clang, `make W=1` and CONFIG_WERROR=y:
.../i915_sw_fence.c:97:20: error: unused function 'debug_fence_init_onstack' [-Werror,-Wunused-function] 97 | static inline void debug_fence_init_onstack(struct i915_sw_fence *fence) | ^~~~~~~~~~~~~~~~~~~~~~~~
Fix this by marking debug_fence_init_onstack() with __maybe_unused.
See also commit 6863f5643dd7 ("kbuild: allow Clang to find unused static inline functions for W=1 build").
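The pattern in general form, for reference (the function here is hypothetical, not the i915 code):

#include <linux/compiler.h>

/* Without __maybe_unused, clang's -Wunused-function fires under W=1 when
 * no caller of the static inline is compiled in (e.g. selftests are off). */
static inline __maybe_unused void debug_helper_sketch(void)
{
        /* debug-only body */
}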
Fixes: 214707fc2ce0 ("drm/i915/selftests: Wrap a timer into a i915_sw_fence") Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Reviewed-by: Jani Nikula jani.nikula@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20240829155950.1141978-2-andri... Signed-off-by: Jani Nikula jani.nikula@intel.com (cherry picked from commit 5bf472058ffb43baf6a4cdfe1d7f58c4c194c688) Signed-off-by: Joonas Lahtinen joonas.lahtinen@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/i915/i915_sw_fence.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c index 6fc0d1b89690..c2ac1900d73e 100644 --- a/drivers/gpu/drm/i915/i915_sw_fence.c +++ b/drivers/gpu/drm/i915/i915_sw_fence.c @@ -51,7 +51,7 @@ static inline void debug_fence_init(struct i915_sw_fence *fence) debug_object_init(fence, &i915_sw_fence_debug_descr); }
-static inline void debug_fence_init_onstack(struct i915_sw_fence *fence) +static inline __maybe_unused void debug_fence_init_onstack(struct i915_sw_fence *fence) { debug_object_init_on_stack(fence, &i915_sw_fence_debug_descr); } @@ -94,7 +94,7 @@ static inline void debug_fence_init(struct i915_sw_fence *fence) { }
-static inline void debug_fence_init_onstack(struct i915_sw_fence *fence) +static inline __maybe_unused void debug_fence_init_onstack(struct i915_sw_fence *fence) { }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Andy Shevchenko andriy.shevchenko@linux.intel.com
[ Upstream commit f99999536128b14b5d765a9982763b5134efdd79 ]
When debug_fence_free() is unused (CONFIG_DRM_I915_SW_FENCE_DEBUG_OBJECTS=n), it prevents kernel builds with clang, `make W=1` and CONFIG_WERROR=y:
.../i915_sw_fence.c:118:20: error: unused function 'debug_fence_free' [-Werror,-Wunused-function] 118 | static inline void debug_fence_free(struct i915_sw_fence *fence) | ^~~~~~~~~~~~~~~~
Fix this by marking debug_fence_free() with __maybe_unused.
See also commit 6863f5643dd7 ("kbuild: allow Clang to find unused static inline functions for W=1 build").
Fixes: fc1584059d6c ("drm/i915: Integrate i915_sw_fence with debugobjects") Signed-off-by: Andy Shevchenko andriy.shevchenko@linux.intel.com Reviewed-by: Jani Nikula jani.nikula@intel.com Link: https://patchwork.freedesktop.org/patch/msgid/20240829155950.1141978-3-andri... Signed-off-by: Jani Nikula jani.nikula@intel.com (cherry picked from commit 8be4dce5ea6f2368cc25edc71989c4690fa66964) Signed-off-by: Joonas Lahtinen joonas.lahtinen@linux.intel.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/i915/i915_sw_fence.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c index c2ac1900d73e..e664f8e461e6 100644 --- a/drivers/gpu/drm/i915/i915_sw_fence.c +++ b/drivers/gpu/drm/i915/i915_sw_fence.c @@ -77,7 +77,7 @@ static inline void debug_fence_destroy(struct i915_sw_fence *fence) debug_object_destroy(fence, &i915_sw_fence_debug_descr); }
-static inline void debug_fence_free(struct i915_sw_fence *fence) +static inline __maybe_unused void debug_fence_free(struct i915_sw_fence *fence) { debug_object_free(fence, &i915_sw_fence_debug_descr); smp_wmb(); /* flush the change in state before reallocation */ @@ -115,7 +115,7 @@ static inline void debug_fence_destroy(struct i915_sw_fence *fence) { }
-static inline void debug_fence_free(struct i915_sw_fence *fence) +static inline __maybe_unused void debug_fence_free(struct i915_sw_fence *fence) { }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org
[ Upstream commit adad2e460e505a556f5ea6f0dc16fe95e62d5d76 ]
The driver code is leaking an OF node reference obtained from of_get_parent() in probe().
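The underlying rule, sketched as a probe() snippet (illustrative only; of_get_parent() takes a reference that must be dropped with of_node_put() once the node is no longer needed):

#include <linux/of.h>
#include <linux/pinctrl/pinctrl.h>

struct device_node *pctlnp;
struct pinctrl_dev *pctldev;

pctlnp = of_get_parent(pdev->dev.of_node);
if (!pctlnp)
        return -ENODEV;

pctldev = of_pinctrl_get(pctlnp);
of_node_put(pctlnp);    /* drop the reference taken by of_get_parent() */
if (!pctldev)
        return -EPROBE_DEFER;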
Fixes: 936ee2675eee ("gpio/rockchip: add driver for rockchip gpio") Signed-off-by: Krzysztof Kozlowski krzysztof.kozlowski@linaro.org Reviewed-by: Heiko Stuebner heiko@sntech.de Reviewed-by: Shawn Lin shawn.lin@rock-chips.com Link: https://lore.kernel.org/r/20240826150832.65657-1-krzysztof.kozlowski@linaro.... Signed-off-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpio/gpio-rockchip.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/gpio/gpio-rockchip.c b/drivers/gpio/gpio-rockchip.c index 200e43a6f4b4..3c1e303aaca8 100644 --- a/drivers/gpio/gpio-rockchip.c +++ b/drivers/gpio/gpio-rockchip.c @@ -713,6 +713,7 @@ static int rockchip_gpio_probe(struct platform_device *pdev) return -ENODEV;
pctldev = of_pinctrl_get(pctlnp); + of_node_put(pctlnp); if (!pctldev) return -EPROBE_DEFER;
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Liao Chen liaochen4@huawei.com
[ Upstream commit a5135526426df5319d5f4bcd15ae57c45a97714b ]
Add MODULE_DEVICE_TABLE() so the module can be properly autoloaded based on the alias from the of_device_id table.
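For context, the standard pattern looks like this (the compatible string matches the patch below; the rest of the driver is omitted):

#include <linux/mod_devicetable.h>
#include <linux/module.h>

static const struct of_device_id modepin_platform_id[] = {
        { .compatible = "xlnx,zynqmp-gpio-modepin", },
        { }
};
/* Emits the module alias so userspace (udev/modprobe) can autoload the
 * module when a matching device tree node is present. */
MODULE_DEVICE_TABLE(of, modepin_platform_id);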
Fixes: 7687a5b0ee93 ("gpio: modepin: Add driver support for modepin GPIO controller") Signed-off-by: Liao Chen liaochen4@huawei.com Reviewed-by: Michal Simek michal.simek@amd.com Link: https://lore.kernel.org/r/20240902115848.904227-1-liaochen4@huawei.com Signed-off-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpio/gpio-zynqmp-modepin.c | 1 + 1 file changed, 1 insertion(+)
diff --git a/drivers/gpio/gpio-zynqmp-modepin.c b/drivers/gpio/gpio-zynqmp-modepin.c index a0d69387c153..2f3c9ebfa78d 100644 --- a/drivers/gpio/gpio-zynqmp-modepin.c +++ b/drivers/gpio/gpio-zynqmp-modepin.c @@ -146,6 +146,7 @@ static const struct of_device_id modepin_platform_id[] = { { .compatible = "xlnx,zynqmp-gpio-modepin", }, { } }; +MODULE_DEVICE_TABLE(of, modepin_platform_id);
static struct platform_driver modepin_platform_driver = { .driver = {
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Li Nan linan122@huawei.com
[ Upstream commit e58f5142f88320a5b1449f96a146f2f24615c5c7 ]
When two UBLK_CMD_START_USER_RECOVERY commands are submitted, the first one sets 'ubq->ubq_daemon' to NULL, and the second one triggers a WARN in ublk_queue_reinit() and subsequently a NULL pointer dereference.
Fix it by adding a check in ublk_ctrl_start_recovery() and returning immediately when 'ub->nr_queues_ready' is zero.
BUG: kernel NULL pointer dereference, address: 0000000000000028 RIP: 0010:ublk_ctrl_start_recovery.constprop.0+0x82/0x180 Call Trace: <TASK> ? __die+0x20/0x70 ? page_fault_oops+0x75/0x170 ? exc_page_fault+0x64/0x140 ? asm_exc_page_fault+0x22/0x30 ? ublk_ctrl_start_recovery.constprop.0+0x82/0x180 ublk_ctrl_uring_cmd+0x4f7/0x6c0 ? pick_next_task_idle+0x26/0x40 io_uring_cmd+0x9a/0x1b0 io_issue_sqe+0x193/0x3f0 io_wq_submit_work+0x9b/0x390 io_worker_handle_work+0x165/0x360 io_wq_worker+0xcb/0x2f0 ? finish_task_switch.isra.0+0x203/0x290 ? finish_task_switch.isra.0+0x203/0x290 ? __pfx_io_wq_worker+0x10/0x10 ret_from_fork+0x2d/0x50 ? __pfx_io_wq_worker+0x10/0x10 ret_from_fork_asm+0x1a/0x30 </TASK>
Fixes: c732a852b419 ("ublk_drv: add START_USER_RECOVERY and END_USER_RECOVERY support") Reported-and-tested-by: Changhui Zhong czhong@redhat.com Closes: https://lore.kernel.org/all/CAGVVp+UvLiS+bhNXV-h2icwX1dyybbYHeQUuH7RYqUvMQf6... Reviewed-by: Ming Lei ming.lei@redhat.com Signed-off-by: Li Nan linan122@huawei.com Link: https://lore.kernel.org/r/20240904031348.4139545-1-ming.lei@redhat.com Signed-off-by: Jens Axboe axboe@kernel.dk Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/block/ublk_drv.c | 2 ++ 1 file changed, 2 insertions(+)
diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c index 3fa74051f31b..bfd643856f64 100644 --- a/drivers/block/ublk_drv.c +++ b/drivers/block/ublk_drv.c @@ -1915,6 +1915,8 @@ static int ublk_ctrl_start_recovery(struct ublk_device *ub, mutex_lock(&ub->mutex); if (!ublk_can_use_recovery(ub)) goto out_unlock; + if (!ub->nr_queues_ready) + goto out_unlock; /* * START_RECOVERY is only allowd after: *
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner tglx@linutronix.de
commit c48b5a4cf3125adb679e28ef093f66ff81368d05 upstream.
So it turns out that we have to do two passes of pti_clone_entry_text(): once before initcalls, such that device and late initcalls can use user-mode-helper / modprobe, and once after free_initmem() / mark_readonly().
Now obviously mark_readonly() can cause PMD splits, and pti_clone_pgtable() doesn't like that much.
Allow the late clone to split PMDs so that pagetables stay in sync.
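Schematically, the late pass is handled in the user page-table walk as follows (names taken from the patch below; surrounding code omitted):

/* pti_user_pagetable_walk_pte(), abridged: on the second (late) pass a
 * large mapping installed by the first pass may need to be split to stay
 * in sync with the kernel page tables, so the large PMD is cleared and the
 * range is re-populated with 4K PTEs.  On the early pass a large mapping
 * here is still an error. */
if (pmd_large(*pmd)) {
        if (late_text) {
                set_pmd(pmd, __pmd(0));
        } else {
                WARN_ON_ONCE(1);
                return NULL;
        }
}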
[peterz: Changelog and comments] Reported-by: Guenter Roeck linux@roeck-us.net Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Peter Zijlstra (Intel) peterz@infradead.org Tested-by: Guenter Roeck linux@roeck-us.net Link: https://lkml.kernel.org/r/20240806184843.GX37996@noisy.programming.kicks-ass... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/mm/pti.c | 45 +++++++++++++++++++++++++++++---------------- 1 file changed, 29 insertions(+), 16 deletions(-)
--- a/arch/x86/mm/pti.c +++ b/arch/x86/mm/pti.c @@ -241,7 +241,7 @@ static pmd_t *pti_user_pagetable_walk_pm * * Returns a pointer to a PTE on success, or NULL on failure. */ -static pte_t *pti_user_pagetable_walk_pte(unsigned long address) +static pte_t *pti_user_pagetable_walk_pte(unsigned long address, bool late_text) { gfp_t gfp = (GFP_KERNEL | __GFP_NOTRACK | __GFP_ZERO); pmd_t *pmd; @@ -251,10 +251,15 @@ static pte_t *pti_user_pagetable_walk_pt if (!pmd) return NULL;
- /* We can't do anything sensible if we hit a large mapping. */ + /* Large PMD mapping found */ if (pmd_large(*pmd)) { - WARN_ON(1); - return NULL; + /* Clear the PMD if we hit a large mapping from the first round */ + if (late_text) { + set_pmd(pmd, __pmd(0)); + } else { + WARN_ON_ONCE(1); + return NULL; + } }
if (pmd_none(*pmd)) { @@ -283,7 +288,7 @@ static void __init pti_setup_vsyscall(vo if (!pte || WARN_ON(level != PG_LEVEL_4K) || pte_none(*pte)) return;
- target_pte = pti_user_pagetable_walk_pte(VSYSCALL_ADDR); + target_pte = pti_user_pagetable_walk_pte(VSYSCALL_ADDR, false); if (WARN_ON(!target_pte)) return;
@@ -301,7 +306,7 @@ enum pti_clone_level {
static void pti_clone_pgtable(unsigned long start, unsigned long end, - enum pti_clone_level level) + enum pti_clone_level level, bool late_text) { unsigned long addr;
@@ -390,7 +395,7 @@ pti_clone_pgtable(unsigned long start, u return;
/* Allocate PTE in the user page-table */ - target_pte = pti_user_pagetable_walk_pte(addr); + target_pte = pti_user_pagetable_walk_pte(addr, late_text); if (WARN_ON(!target_pte)) return;
@@ -452,7 +457,7 @@ static void __init pti_clone_user_shared phys_addr_t pa = per_cpu_ptr_to_phys((void *)va); pte_t *target_pte;
- target_pte = pti_user_pagetable_walk_pte(va); + target_pte = pti_user_pagetable_walk_pte(va, false); if (WARN_ON(!target_pte)) return;
@@ -475,7 +480,7 @@ static void __init pti_clone_user_shared start = CPU_ENTRY_AREA_BASE; end = start + (PAGE_SIZE * CPU_ENTRY_AREA_PAGES);
- pti_clone_pgtable(start, end, PTI_CLONE_PMD); + pti_clone_pgtable(start, end, PTI_CLONE_PMD, false); } #endif /* CONFIG_X86_64 */
@@ -492,11 +497,11 @@ static void __init pti_setup_espfix64(vo /* * Clone the populated PMDs of the entry text and force it RO. */ -static void pti_clone_entry_text(void) +static void pti_clone_entry_text(bool late) { pti_clone_pgtable((unsigned long) __entry_text_start, (unsigned long) __entry_text_end, - PTI_LEVEL_KERNEL_IMAGE); + PTI_LEVEL_KERNEL_IMAGE, late); }
/* @@ -571,7 +576,7 @@ static void pti_clone_kernel_text(void) * pti_set_kernel_image_nonglobal() did to clear the * global bit. */ - pti_clone_pgtable(start, end_clone, PTI_LEVEL_KERNEL_IMAGE); + pti_clone_pgtable(start, end_clone, PTI_LEVEL_KERNEL_IMAGE, false);
/* * pti_clone_pgtable() will set the global bit in any PMDs @@ -638,8 +643,15 @@ void __init pti_init(void)
/* Undo all global bits from the init pagetables in head_64.S: */ pti_set_kernel_image_nonglobal(); + /* Replace some of the global bits just for shared entry text: */ - pti_clone_entry_text(); + /* + * This is very early in boot. Device and Late initcalls can do + * modprobe before free_initmem() and mark_readonly(). This + * pti_clone_entry_text() allows those user-mode-helpers to function, + * but notably the text is still RW. + */ + pti_clone_entry_text(false); pti_setup_espfix64(); pti_setup_vsyscall(); } @@ -656,10 +668,11 @@ void pti_finalize(void) if (!boot_cpu_has(X86_FEATURE_PTI)) return; /* - * We need to clone everything (again) that maps parts of the - * kernel image. + * This is after free_initmem() (all initcalls are done) and we've done + * mark_readonly(). Text is now NX which might've split some PMDs + * relative to the early clone. */ - pti_clone_entry_text(); + pti_clone_entry_text(true); pti_clone_kernel_text();
debug_checkwx_user();
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Filipe Manana fdmanana@suse.com
commit cd9253c23aedd61eb5ff11f37a36247cd46faf86 upstream.
If we have 2 threads that are using the same file descriptor and one of them is doing direct IO writes while the other is doing fsync, we have a race where we can end up either:
1) Attempt a fsync without holding the inode's lock, triggering an assertion failure when assertions are enabled;
2) Do an invalid memory access from the fsync task because the file private points to memory allocated on the stack by the direct IO task, and it may be used by the fsync task after the stack was destroyed.
The race happens like this:
1) A user space program opens a file descriptor with O_DIRECT;
2) The program spawns 2 threads using libpthread for example;
3) One of the threads uses the file descriptor to do direct IO writes, while the other calls fsync using the same file descriptor.
4) Call task A the thread doing direct IO writes and task B the thread doing fsyncs;
5) Task A does a direct IO write, and at btrfs_direct_write() sets the file's private to an on-stack allocated private with the member 'fsync_skip_inode_lock' set to true;
6) Task B enters btrfs_sync_file() and sees that there's a private structure associated to the file which has 'fsync_skip_inode_lock' set to true, so it skips locking the inode's VFS lock;
7) Task A completes the direct IO write, and resets the file's private to NULL since it had no prior private and our private was stack allocated. Then it unlocks the inode's VFS lock;
8) Task B enters btrfs_get_ordered_extents_for_logging(), then the assertion that checks the inode's VFS lock is held fails, since task B never locked it and task A has already unlocked it.
The stack trace produced is the following:
assertion failed: inode_is_locked(&inode->vfs_inode), in fs/btrfs/ordered-data.c:983 ------------[ cut here ]------------ kernel BUG at fs/btrfs/ordered-data.c:983! Oops: invalid opcode: 0000 [#1] PREEMPT SMP PTI CPU: 9 PID: 5072 Comm: worker Tainted: G U OE 6.10.5-1-default #1 openSUSE Tumbleweed 69f48d427608e1c09e60ea24c6c55e2ca1b049e8 Hardware name: Acer Predator PH315-52/Covini_CFS, BIOS V1.12 07/28/2020 RIP: 0010:btrfs_get_ordered_extents_for_logging.cold+0x1f/0x42 [btrfs] Code: 50 d6 86 c0 e8 (...) RSP: 0018:ffff9e4a03dcfc78 EFLAGS: 00010246 RAX: 0000000000000054 RBX: ffff9078a9868e98 RCX: 0000000000000000 RDX: 0000000000000000 RSI: ffff907dce4a7800 RDI: ffff907dce4a7800 RBP: ffff907805518800 R08: 0000000000000000 R09: ffff9e4a03dcfb38 R10: ffff9e4a03dcfb30 R11: 0000000000000003 R12: ffff907684ae7800 R13: 0000000000000001 R14: ffff90774646b600 R15: 0000000000000000 FS: 00007f04b96006c0(0000) GS:ffff907dce480000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 00007f32acbfc000 CR3: 00000001fd4fa005 CR4: 00000000003726f0 Call Trace: <TASK> ? __die_body.cold+0x14/0x24 ? die+0x2e/0x50 ? do_trap+0xca/0x110 ? do_error_trap+0x6a/0x90 ? btrfs_get_ordered_extents_for_logging.cold+0x1f/0x42 [btrfs bb26272d49b4cdc847cf3f7faadd459b62caee9a] ? exc_invalid_op+0x50/0x70 ? btrfs_get_ordered_extents_for_logging.cold+0x1f/0x42 [btrfs bb26272d49b4cdc847cf3f7faadd459b62caee9a] ? asm_exc_invalid_op+0x1a/0x20 ? btrfs_get_ordered_extents_for_logging.cold+0x1f/0x42 [btrfs bb26272d49b4cdc847cf3f7faadd459b62caee9a] ? btrfs_get_ordered_extents_for_logging.cold+0x1f/0x42 [btrfs bb26272d49b4cdc847cf3f7faadd459b62caee9a] btrfs_sync_file+0x21a/0x4d0 [btrfs bb26272d49b4cdc847cf3f7faadd459b62caee9a] ? __seccomp_filter+0x31d/0x4f0 __x64_sys_fdatasync+0x4f/0x90 do_syscall_64+0x82/0x160 ? do_futex+0xcb/0x190 ? __x64_sys_futex+0x10e/0x1d0 ? switch_fpu_return+0x4f/0xd0 ? syscall_exit_to_user_mode+0x72/0x220 ? do_syscall_64+0x8e/0x160 ? syscall_exit_to_user_mode+0x72/0x220 ? do_syscall_64+0x8e/0x160 ? syscall_exit_to_user_mode+0x72/0x220 ? do_syscall_64+0x8e/0x160 ? syscall_exit_to_user_mode+0x72/0x220 ? do_syscall_64+0x8e/0x160 entry_SYSCALL_64_after_hwframe+0x76/0x7e
Another problem here is that if task B grabs the private pointer and then uses it after task A has finished, since the private was allocated on the stack of task A, it results in an invalid memory access with a hard-to-predict result.
This issue, triggering the assertion, was observed with QEMU workloads by two users in the Link tags below.
Fix this by not relying on a file's private to pass information to fsync that it should skip locking the inode and instead pass this information through a special value stored in current->journal_info. This is safe because in the relevant section of the direct IO write path we are not holding a transaction handle, so current->journal_info is NULL.
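Schematically (the stub value and both code paths are taken from the patch below, abridged):

/* Direct IO write path: no transaction handle is held here, so
 * current->journal_info is free to carry the marker. */
ASSERT(current->journal_info == NULL);
current->journal_info = BTRFS_TRANS_DIO_WRITE_STUB;
err = iomap_dio_complete(dio);
current->journal_info = NULL;

/* btrfs_sync_file(): detect the marker and skip taking the inode lock,
 * which the direct IO writer already holds. */
if (current->journal_info == BTRFS_TRANS_DIO_WRITE_STUB) {
        skip_ilock = true;
        current->journal_info = NULL;
        lockdep_assert_held(&inode->i_rwsem);
}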
The following C program triggers the issue:
$ cat repro.c

/* Get the O_DIRECT definition. */
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <stdint.h>
#include <fcntl.h>
#include <errno.h>
#include <string.h>
#include <pthread.h>

static int fd;

static ssize_t do_write(int fd, const void *buf, size_t count, off_t offset)
{
        while (count > 0) {
                ssize_t ret;

                ret = pwrite(fd, buf, count, offset);
                if (ret < 0) {
                        if (errno == EINTR)
                                continue;
                        return ret;
                }
                count -= ret;
                buf += ret;
        }
        return 0;
}

static void *fsync_loop(void *arg)
{
        while (1) {
                int ret;

                ret = fsync(fd);
                if (ret != 0) {
                        perror("Fsync failed");
                        exit(6);
                }
        }
}

int main(int argc, char *argv[])
{
        long pagesize;
        void *write_buf;
        pthread_t fsyncer;
        int ret;

        if (argc != 2) {
                fprintf(stderr, "Use: %s <file path>\n", argv[0]);
                return 1;
        }

        fd = open(argv[1], O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0666);
        if (fd == -1) {
                perror("Failed to open/create file");
                return 1;
        }

        pagesize = sysconf(_SC_PAGE_SIZE);
        if (pagesize == -1) {
                perror("Failed to get page size");
                return 2;
        }

        ret = posix_memalign(&write_buf, pagesize, pagesize);
        if (ret) {
                perror("Failed to allocate buffer");
                return 3;
        }

        ret = pthread_create(&fsyncer, NULL, fsync_loop, NULL);
        if (ret != 0) {
                fprintf(stderr, "Failed to create writer thread: %d\n", ret);
                return 4;
        }

        while (1) {
                ret = do_write(fd, write_buf, pagesize, 0);
                if (ret != 0) {
                        perror("Write failed");
                        exit(5);
                }
        }

        return 0;
}

$ mkfs.btrfs -f /dev/sdi
$ mount /dev/sdi /mnt/sdi
$ timeout 10 ./repro /mnt/sdi/foo
Usually the race is triggered within less than 1 second. A test case for fstests will follow soon.
Reported-by: Paulo Dias paulo.miguel.dias@gmail.com Link: https://bugzilla.kernel.org/show_bug.cgi?id=219187 Reported-by: Andreas Jahn jahn-andi@web.de Link: https://bugzilla.kernel.org/show_bug.cgi?id=219199 Reported-by: syzbot+4704b3cc972bd76024f1@syzkaller.appspotmail.com Link: https://lore.kernel.org/linux-btrfs/00000000000044ff540620d7dee2@google.com/ Fixes: 939b656bc8ab ("btrfs: fix corruption after buffer fault in during direct IO append write") CC: stable@vger.kernel.org # 5.15+ Reviewed-by: Josef Bacik josef@toxicpanda.com Signed-off-by: Filipe Manana fdmanana@suse.com Reviewed-by: David Sterba dsterba@suse.com Signed-off-by: David Sterba dsterba@suse.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org ---
--- fs/btrfs/ctree.h | 1 - fs/btrfs/file.c | 25 ++++++++++--------------- fs/btrfs/transaction.h | 6 ++++++ 3 files changed, 16 insertions(+), 16 deletions(-)
--- a/fs/btrfs/ctree.h +++ b/fs/btrfs/ctree.h @@ -1553,7 +1553,6 @@ struct btrfs_drop_extents_args { struct btrfs_file_private { void *filldir_buf; u64 last_index; - bool fsync_skip_inode_lock; };
--- a/fs/btrfs/file.c +++ b/fs/btrfs/file.c @@ -1534,13 +1534,6 @@ again: if (IS_ERR_OR_NULL(dio)) { err = PTR_ERR_OR_ZERO(dio); } else { - struct btrfs_file_private stack_private = { 0 }; - struct btrfs_file_private *private; - const bool have_private = (file->private_data != NULL); - - if (!have_private) - file->private_data = &stack_private; - /* * If we have a synchoronous write, we must make sure the fsync * triggered by the iomap_dio_complete() call below doesn't @@ -1549,13 +1542,10 @@ again: * partial writes due to the input buffer (or parts of it) not * being already faulted in. */ - private = file->private_data; - private->fsync_skip_inode_lock = true; + ASSERT(current->journal_info == NULL); + current->journal_info = BTRFS_TRANS_DIO_WRITE_STUB; err = iomap_dio_complete(dio); - private->fsync_skip_inode_lock = false; - - if (!have_private) - file->private_data = NULL; + current->journal_info = NULL; }
/* No increment (+=) because iomap returns a cumulative value. */ @@ -1795,7 +1785,6 @@ static inline bool skip_inode_logging(co */ int btrfs_sync_file(struct file *file, loff_t start, loff_t end, int datasync) { - struct btrfs_file_private *private = file->private_data; struct dentry *dentry = file_dentry(file); struct inode *inode = d_inode(dentry); struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb); @@ -1805,7 +1794,13 @@ int btrfs_sync_file(struct file *file, l int ret = 0, err; u64 len; bool full_sync; - const bool skip_ilock = (private ? private->fsync_skip_inode_lock : false); + bool skip_ilock = false; + + if (current->journal_info == BTRFS_TRANS_DIO_WRITE_STUB) { + skip_ilock = true; + current->journal_info = NULL; + lockdep_assert_held(&inode->i_rwsem); + }
trace_btrfs_sync_file(file, datasync);
--- a/fs/btrfs/transaction.h +++ b/fs/btrfs/transaction.h @@ -11,6 +11,12 @@ #include "delayed-ref.h" #include "ctree.h"
+/* + * Signal that a direct IO write is in progress, to avoid deadlock for sync + * direct IO writes when fsync is called during the direct IO write path. + */ +#define BTRFS_TRANS_DIO_WRITE_STUB ((void *) 1) + enum btrfs_trans_state { TRANS_STATE_RUNNING, TRANS_STATE_COMMIT_START,
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yonghong Song yhs@fb.com
commit e6c2f594ed961273479505b42040782820190305 upstream.
syzbot reported a warning in [1] with the following stacktrace: WARNING: CPU: 0 PID: 5005 at kernel/bpf/btf.c:1988 btf_type_id_size+0x2d9/0x9d0 kernel/bpf/btf.c:1988 ... RIP: 0010:btf_type_id_size+0x2d9/0x9d0 kernel/bpf/btf.c:1988 ... Call Trace: <TASK> map_check_btf kernel/bpf/syscall.c:1024 [inline] map_create+0x1157/0x1860 kernel/bpf/syscall.c:1198 __sys_bpf+0x127f/0x5420 kernel/bpf/syscall.c:5040 __do_sys_bpf kernel/bpf/syscall.c:5162 [inline] __se_sys_bpf kernel/bpf/syscall.c:5160 [inline] __x64_sys_bpf+0x79/0xc0 kernel/bpf/syscall.c:5160 do_syscall_x64 arch/x86/entry/common.c:50 [inline] do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80 entry_SYSCALL_64_after_hwframe+0x63/0xcd
With the following btf [1] DECL_TAG 'a' type_id=4 component_idx=-1 [2] PTR '(anon)' type_id=0 [3] TYPE_TAG 'a' type_id=2 [4] VAR 'a' type_id=3, linkage=static and when the bpf_attr.btf_key_type_id = 1 (DECL_TAG), the following WARN_ON_ONCE in btf_type_id_size() is triggered: if (WARN_ON_ONCE(!btf_type_is_modifier(size_type) && !btf_type_is_var(size_type))) return NULL;
Note that 'return NULL' is the correct behavior, as we don't want a DECL_TAG type to be used as a btf_{key,value}_type_id even for a case like 'DECL_TAG -> STRUCT'. So there is no correctness issue here, we just want to silence the warning.
To silence the warning, I added DECL_TAG as one of the kinds in btf_type_nosize(), which causes btf_type_id_size() to return NULL earlier, without the warning.
[1] https://lore.kernel.org/bpf/000000000000e0df8d05fc75ba86@google.com/
Reported-by: syzbot+958967f249155967d42a@syzkaller.appspotmail.com Signed-off-by: Yonghong Song yhs@fb.com Link: https://lore.kernel.org/r/20230530205029.264910-1-yhs@fb.com Signed-off-by: Martin KaFai Lau martin.lau@kernel.org Signed-off-by: Diogo Jahchan Koike djahchankoike@gmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/bpf/btf.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-)
--- a/kernel/bpf/btf.c +++ b/kernel/bpf/btf.c @@ -466,10 +466,16 @@ static bool btf_type_is_fwd(const struct return BTF_INFO_KIND(t->info) == BTF_KIND_FWD; }
+static bool btf_type_is_decl_tag(const struct btf_type *t) +{ + return BTF_INFO_KIND(t->info) == BTF_KIND_DECL_TAG; +} + static bool btf_type_nosize(const struct btf_type *t) { return btf_type_is_void(t) || btf_type_is_fwd(t) || - btf_type_is_func(t) || btf_type_is_func_proto(t); + btf_type_is_func(t) || btf_type_is_func_proto(t) || + btf_type_is_decl_tag(t); }
static bool btf_type_nosize_or_null(const struct btf_type *t) @@ -492,11 +498,6 @@ static bool btf_type_is_datasec(const st return BTF_INFO_KIND(t->info) == BTF_KIND_DATASEC; }
-static bool btf_type_is_decl_tag(const struct btf_type *t) -{ - return BTF_INFO_KIND(t->info) == BTF_KIND_DECL_TAG; -} - static bool btf_type_is_decl_tag_target(const struct btf_type *t) { return btf_type_is_func(t) || btf_type_is_struct(t) ||
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shakeel Butt shakeel.butt@linux.dev
commit 9972605a238339b85bd16b084eed5f18414d22db upstream.
Commit 73f576c04b94 ("mm: memcontrol: fix cgroup creation failure after many small jobs") decoupled the memcg IDs from the CSS ID space to fix the cgroup creation failures. It introduced an IDR to maintain the memcg ID space. The IDR depends on external synchronization mechanisms for modifications. For the mem_cgroup_idr, the idr_alloc() and idr_replace() happen within a css callback and thus are protected from concurrent modifications through cgroup_mutex. However, idr_remove() for mem_cgroup_idr was not protected against concurrency and can be run concurrently for different memcgs when their refcount hits zero. Fix that.
We have been seeing list_lru based kernel crashes at a low frequency in our fleet for a long time. These crashes were in different parts of the list_lru code including list_lru_add(), list_lru_del() and the reparenting code. Upon further inspection, it looked like for a given object (dentry and inode), the super_block's list_lru didn't have a list_lru_one for the memcg of that object. The initial suspicions were that either the object was not allocated through kmem_cache_alloc_lru() or somehow memcg_list_lru_alloc() failed to allocate list_lru_one() for a memcg but returned success. No evidence was found for either case.
Looking more deeply, we started seeing situations where a valid memcg's id is not present in mem_cgroup_idr, and in some cases multiple valid memcgs have the same id with mem_cgroup_idr pointing to one of them. So the most reasonable explanation is that these situations happen due to a race between multiple idr_remove() calls, or a race between idr_alloc()/idr_replace() and idr_remove(). These races cause multiple memcgs to acquire the same ID, and then offlining one of them cleans up the list_lrus on the system for all of them. Later accesses from the other memcgs to the list_lru cause crashes due to the missing list_lru_one.
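The locking added by the fix, in sketch form (names from the patch below; idr_preload() preallocates outside the spinlock so idr_alloc() can use GFP_NOWAIT under it):

static DEFINE_IDR(mem_cgroup_idr);
static DEFINE_SPINLOCK(memcg_idr_lock);

static int mem_cgroup_alloc_id(void)
{
        int ret;

        idr_preload(GFP_KERNEL);
        spin_lock(&memcg_idr_lock);
        ret = idr_alloc(&mem_cgroup_idr, NULL, 1, MEM_CGROUP_ID_MAX + 1,
                        GFP_NOWAIT);
        spin_unlock(&memcg_idr_lock);
        idr_preload_end();
        return ret;
}

/* idr_remove() and idr_replace() take the same lock, so removals can no
 * longer race with allocation/replacement and hand out the same ID twice. */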
Link: https://lkml.kernel.org/r/20240802235822.1830976-1-shakeel.butt@linux.dev Fixes: 73f576c04b94 ("mm: memcontrol: fix cgroup creation failure after many small jobs") Signed-off-by: Shakeel Butt shakeel.butt@linux.dev Acked-by: Muchun Song muchun.song@linux.dev Reviewed-by: Roman Gushchin roman.gushchin@linux.dev Acked-by: Johannes Weiner hannes@cmpxchg.org Cc: Michal Hocko mhocko@suse.com Cc: stable@vger.kernel.org Signed-off-by: Andrew Morton akpm@linux-foundation.org [ Adapted over commit 6f0df8e16eb5 ("memcontrol: ensure memcg acquired by id is properly set up") not in the tree ] Signed-off-by: Tomas Krcka krckatom@amazon.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- mm/memcontrol.c | 22 ++++++++++++++++++++-- 1 file changed, 20 insertions(+), 2 deletions(-)
--- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -5143,11 +5143,28 @@ static struct cftype mem_cgroup_legacy_f */
static DEFINE_IDR(mem_cgroup_idr); +static DEFINE_SPINLOCK(memcg_idr_lock); + +static int mem_cgroup_alloc_id(void) +{ + int ret; + + idr_preload(GFP_KERNEL); + spin_lock(&memcg_idr_lock); + ret = idr_alloc(&mem_cgroup_idr, NULL, 1, MEM_CGROUP_ID_MAX + 1, + GFP_NOWAIT); + spin_unlock(&memcg_idr_lock); + idr_preload_end(); + return ret; +}
static void mem_cgroup_id_remove(struct mem_cgroup *memcg) { if (memcg->id.id > 0) { + spin_lock(&memcg_idr_lock); idr_remove(&mem_cgroup_idr, memcg->id.id); + spin_unlock(&memcg_idr_lock); + memcg->id.id = 0; } } @@ -5270,8 +5287,7 @@ static struct mem_cgroup *mem_cgroup_all if (!memcg) return ERR_PTR(error);
- memcg->id.id = idr_alloc(&mem_cgroup_idr, NULL, - 1, MEM_CGROUP_ID_MAX + 1, GFP_KERNEL); + memcg->id.id = mem_cgroup_alloc_id(); if (memcg->id.id < 0) { error = memcg->id.id; goto fail; @@ -5316,7 +5332,9 @@ static struct mem_cgroup *mem_cgroup_all INIT_LIST_HEAD(&memcg->deferred_split_queue.split_queue); memcg->deferred_split_queue.split_queue_len = 0; #endif + spin_lock(&memcg_idr_lock); idr_replace(&mem_cgroup_idr, memcg, memcg->id.id); + spin_unlock(&memcg_idr_lock); lru_gen_init_memcg(memcg); return memcg; fail:
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Peng Wu wupeng58@huawei.com
commit c957387c402a1a213102e38f92b800d7909a728d upstream.
The regulator_get() function never returns NULL. It returns error pointers.
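For illustration, a hypothetical caller (the supply name "vdd" and the surrounding dev pointer are made up):

#include <linux/err.h>
#include <linux/regulator/consumer.h>

struct regulator *reg = regulator_get(dev, "vdd");

/* Per the commit message above, failure is reported as an ERR_PTR()
 * value rather than NULL, so it must be checked with IS_ERR(). */
if (IS_ERR(reg))
        return PTR_ERR(reg);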
Fixes: 27b9ecc7a9ba ("regulator: Add of_regulator_bulk_get_all") Signed-off-by: Peng Wu wupeng58@huawei.com Link: https://lore.kernel.org/r/20221122082242.82937-1-wupeng58@huawei.com Signed-off-by: Mark Brown broonie@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/regulator/of_regulator.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/regulator/of_regulator.c +++ b/drivers/regulator/of_regulator.c @@ -767,7 +767,7 @@ restart: memcpy(name, prop->name, i); name[i] = '\0'; tmp = regulator_get(dev, name); - if (!tmp) { + if (IS_ERR(tmp)) { ret = -EINVAL; goto error; }
6.1-stable review patch. If anyone has any objections, please let me know.
------------------
From: Miklos Szeredi mszeredi@redhat.com
commit 5cadfbd5a11e5495cac217534c5f788168b1afd7 upstream.
Add an init flag indicating whether the FUSE_EXPIRE_ONLY flag of FUSE_NOTIFY_INVAL_ENTRY is effective.
This is needed for backports of this feature, otherwise the server could just check the protocol version.
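A hypothetical server-side check, assuming the struct and field names from the UAPI header (bits 32..63 of the kernel's advertised flags arrive in 'flags2' when FUSE_INIT_EXT is negotiated):

#include <stdbool.h>
#include <stdint.h>
#include <linux/fuse.h>

/* Illustrative only: a userspace server deciding whether it may send
 * FUSE_NOTIFY_INVAL_ENTRY with FUSE_EXPIRE_ONLY. */
static bool kernel_supports_expire_only(const struct fuse_init_in *in)
{
        uint64_t flags = in->flags | ((uint64_t)in->flags2 << 32);

        return flags & FUSE_HAS_EXPIRE_ONLY;
}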
Fixes: 4f8d37020e1f ("fuse: add "expire only" mode to FUSE_NOTIFY_INVAL_ENTRY") Cc: stable@vger.kernel.org # v6.2 Signed-off-by: Miklos Szeredi mszeredi@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- fs/fuse/inode.c | 3 ++- include/uapi/linux/fuse.h | 2 ++ 2 files changed, 4 insertions(+), 1 deletion(-)
--- a/fs/fuse/inode.c +++ b/fs/fuse/inode.c @@ -1327,7 +1327,8 @@ void fuse_send_init(struct fuse_mount *f FUSE_ABORT_ERROR | FUSE_MAX_PAGES | FUSE_CACHE_SYMLINKS | FUSE_NO_OPENDIR_SUPPORT | FUSE_EXPLICIT_INVAL_DATA | FUSE_HANDLE_KILLPRIV_V2 | FUSE_SETXATTR_EXT | FUSE_INIT_EXT | - FUSE_SECURITY_CTX; + FUSE_SECURITY_CTX | + FUSE_HAS_EXPIRE_ONLY; #ifdef CONFIG_FUSE_DAX if (fm->fc->dax) flags |= FUSE_MAP_ALIGNMENT; --- a/include/uapi/linux/fuse.h +++ b/include/uapi/linux/fuse.h @@ -365,6 +365,7 @@ struct fuse_file_lock { * FUSE_SECURITY_CTX: add security context to create, mkdir, symlink, and * mknod * FUSE_HAS_INODE_DAX: use per inode DAX + * FUSE_HAS_EXPIRE_ONLY: kernel supports expiry-only entry invalidation */ #define FUSE_ASYNC_READ (1 << 0) #define FUSE_POSIX_LOCKS (1 << 1) @@ -401,6 +402,7 @@ struct fuse_file_lock { /* bits 32..63 get shifted down 32 bits into the flags2 field */ #define FUSE_SECURITY_CTX (1ULL << 32) #define FUSE_HAS_INODE_DAX (1ULL << 33) +#define FUSE_HAS_EXPIRE_ONLY (1ULL << 35)
 /**
  * CUSE INIT request/reply flags
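For context, here is how a userspace file system might consume the new bit (an illustrative sketch, not part of this patch or of libfuse; the helper name is invented). Because bits 32..63 of the init flags travel in the separate flags2 word, the server recombines the two 32-bit values from the FUSE_INIT request before testing FUSE_HAS_EXPIRE_ONLY, and only relies on expire-only invalidation when the bit is set.

#include <stdbool.h>
#include <stdint.h>

#ifndef FUSE_HAS_EXPIRE_ONLY
#define FUSE_HAS_EXPIRE_ONLY (1ULL << 35)	/* value added by this patch */
#endif

/* flags/flags2 are the two 32-bit flag words from the kernel's FUSE_INIT. */
static bool kernel_supports_expire_only(uint32_t flags, uint32_t flags2)
{
	/* Bits 32..63 of the 64-bit flag word are carried in flags2. */
	uint64_t all_flags = (uint64_t)flags | ((uint64_t)flags2 << 32);

	return (all_flags & FUSE_HAS_EXPIRE_ONLY) != 0;
}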
Hi!
This is the start of the stable review cycle for the 6.1.110 release. There are 192 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Can you quote the git hash of 6.1.110-rc1?
We do have
Linux 6.1.110-rc1 (244a97bb85be) Greg Kroah-Hartman authored 1 day ago
passing tests
https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/tree/linux-6...
. But that's 1 day old.
Best regards, Pavel
On Tue, Sep 10, 2024 at 12:35:54PM +0200, Pavel Machek wrote:
Hi!
This is the start of the stable review cycle for the 6.1.110 release. There are 192 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Can you quote the git hash of 6.1.110-rc1?
Nope! We create the git trees for those that want throw-away trees; once I create them, I automatically delete them, so I don't even know the hash anymore.
We do have
Linux 6.1.110-rc1 (244a97bb85be) Greg Kroah-Hartman authored 1 day ago
passing tests
https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/tree/linux-6...
. But that's 1 day old.
Lots have changed since then, please use the latest.
thanks,
greg k-h
On Tue 2024-09-10 12:44:36, Greg Kroah-Hartman wrote:
On Tue, Sep 10, 2024 at 12:35:54PM +0200, Pavel Machek wrote:
Hi!
This is the start of the stable review cycle for the 6.1.110 release. There are 192 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Can you quote the git hash of 6.1.110-rc1?
Nope! We create the git trees for those that want throw-away trees; once I create them, I automatically delete them, so I don't even know the hash anymore.
Modify your scripts so that you can quote hash in the announcement?
Avoid using "-rc1" tag for things that are... well... not released as -rc1?
We do have
Linux 6.1.110-rc1 (244a97bb85be) Greg Kroah-Hartman authored 1 day ago
passing tests
https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/tree/linux-6...
. But that's 1 day old.
Lots have changed since then, please use the latest.
I will, but matching releases and tests based on timestamps is just not right.
Best regards, Pavel
On Tue, 10 Sept 2024 at 15:36, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.1.110 release. There are 192 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 12 Sep 2024 09:25:22 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.110-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y and the diffstat can be found below.
thanks,
greg k-h
The SuperH defconfig builds failed due to following build warnings / errors on the stable-rc linux-6.1.y.
* SuperH, build
  - gcc-8-defconfig
  - gcc-11-shx3_defconfig
  - gcc-11-defconfig
  - gcc-8-shx3_defconfig
Build log:
--------
In file included from include/linux/mm.h:29,
                 from arch/sh/kernel/asm-offsets.c:14:
include/linux/pgtable.h: In function 'pmdp_get_lockless':
include/linux/pgtable.h:379:20: error: 'pmd_t' has no member named 'pmd_low'
  379 |         pmd.pmd_low = pmdp->pmd_low;
      |             ^
include/linux/pgtable.h:379:35: error: 'pmd_t' has no member named 'pmd_low'
  379 |         pmd.pmd_low = pmdp->pmd_low;
      |                             ^~
Metadata:
--------
  build_name: gcc-11-defconfig
  config: https://storage.tuxsuite.com/public/linaro/lkft/builds/2lsNsLYIfqdkNQOzLLZO4...
  download_url: https://storage.tuxsuite.com/public/linaro/lkft/builds/2lsNsLYIfqdkNQOzLLZO4...
  git_describe: v6.1.109-193-gb220bb28da0f
  git_repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
  git_sha: b220bb28da0f6a629a0d88be3f8e57ea5025c728
  compiler: {'name': 'sh4-linux-gnu-gcc', 'version': '11', 'version_full': 'sh4-linux-gnu-gcc (Debian 11.4.0-5) 11.4.0'}
Steps to reproduce:
-------
- # tuxmake --runtime podman --target-arch sh --toolchain gcc-11 --kconfig defconfig
-- Linaro LKFT https://lkft.linaro.org
On Tue, 10 Sept 2024 at 18:24, Naresh Kamboju naresh.kamboju@linaro.org wrote:
On Tue, 10 Sept 2024 at 15:36, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.1.110 release. There are 192 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 12 Sep 2024 09:25:22 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.110-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y and the diffstat can be found below.
thanks,
greg k-h
The SuperH defconfig builds failed due to following build warnings / errors on the stable-rc linux-6.1.y.
- SuperH, build
- gcc-8-defconfig
- gcc-11-shx3_defconfig
- gcc-11-defconfig
- gcc-8-shx3_defconfig
Build log:
In file included from include/linux/mm.h:29,
                 from arch/sh/kernel/asm-offsets.c:14:
include/linux/pgtable.h: In function 'pmdp_get_lockless':
include/linux/pgtable.h:379:20: error: 'pmd_t' has no member named 'pmd_low'
  379 |         pmd.pmd_low = pmdp->pmd_low;
      |             ^
include/linux/pgtable.h:379:35: error: 'pmd_t' has no member named 'pmd_low'
  379 |         pmd.pmd_low = pmdp->pmd_low;
      |                             ^~
Anders bisected this down to:
  # first bad commit: [4f5373c50a1177e2a195f0ef6a6e5b7f64bf8b6c] mm: Fix pmd_read_atomic()
  [ Upstream commit 024d232ae4fcd7a7ce8ea239607d6c1246d7adc8 ]
AFAICT there's no reason to do anything different than what we do for PTEs. Make it so (also affects SH)
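For anyone following along, the helper named in the error reads a wide page-table entry in two halves with a retry loop; the sketch below is a rough paraphrase of that generic pattern (field and helper names follow the error message above, and the real pgtable.h code differs in details). The SuperH builds break because pmd_t there has no pmd_low/pmd_high members, so these accesses do not compile.

/* Rough paraphrase of the generic lockless read for split (low/high) entries. */
static inline pmd_t pmdp_get_lockless(pmd_t *pmdp)
{
	pmd_t pmd;

	do {
		pmd.pmd_low = pmdp->pmd_low;	/* read the low word first */
		smp_rmb();
		pmd.pmd_high = pmdp->pmd_high;	/* then the high word */
		smp_rmb();
		/* retry if the low word changed while the high word was read */
	} while (unlikely(pmd.pmd_low != pmdp->pmd_low));

	return pmd;
}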
-- Linaro LKFT https://lkft.linaro.org
On Tue, Sep 10, 2024 at 08:05:27PM +0530, Naresh Kamboju wrote:
On Tue, 10 Sept 2024 at 18:24, Naresh Kamboju naresh.kamboju@linaro.org wrote:
On Tue, 10 Sept 2024 at 15:36, Greg Kroah-Hartman gregkh@linuxfoundation.org wrote:
This is the start of the stable review cycle for the 6.1.110 release. There are 192 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 12 Sep 2024 09:25:22 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.110-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y and the diffstat can be found below.
thanks,
greg k-h
The SuperH defconfig builds failed due to following build warnings / errors on the stable-rc linux-6.1.y.
- SuperH, build
- gcc-8-defconfig
- gcc-11-shx3_defconfig
- gcc-11-defconfig
- gcc-8-shx3_defconfig
Build log:
In file included from include/linux/mm.h:29,
                 from arch/sh/kernel/asm-offsets.c:14:
include/linux/pgtable.h: In function 'pmdp_get_lockless':
include/linux/pgtable.h:379:20: error: 'pmd_t' has no member named 'pmd_low'
  379 |         pmd.pmd_low = pmdp->pmd_low;
      |             ^
include/linux/pgtable.h:379:35: error: 'pmd_t' has no member named 'pmd_low'
  379 |         pmd.pmd_low = pmdp->pmd_low;
      |                             ^~
Anders bisected this down to:
  # first bad commit: [4f5373c50a1177e2a195f0ef6a6e5b7f64bf8b6c] mm: Fix pmd_read_atomic()
  [ Upstream commit 024d232ae4fcd7a7ce8ea239607d6c1246d7adc8 ]
AFAICT there's no reason to do anything different than what we do for PTEs. Make it so (also affects SH)
Ok, I'm going to drop this series as it shouldn't be breaking builds :(
thanks,
greg k-h
On 9/10/24 02:30, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.1.110 release. There are 192 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 12 Sep 2024 09:25:22 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.110-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y and the diffstat can be found below.
thanks,
greg k-h
On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels, build tested on BMIPS_GENERIC:
Tested-by: Florian Fainelli florian.fainelli@broadcom.com
On Tue, Sep 10, 2024 at 11:30:24AM +0200, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.1.110 release. There are 192 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Tested-by: Mark Brown broonie@kernel.org
On 9/10/24 03:30, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 6.1.110 release. There are 192 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Thu, 12 Sep 2024 09:25:22 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v6.x/stable-review/patch-6.1.110-rc1... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-6.1.y and the diffstat can be found below.
thanks,
greg k-h
Compiled and booted on my test system. No dmesg regressions.
Tested-by: Shuah Khan skhan@linuxfoundation.org
thanks, -- Shuah