This is the start of the stable review cycle for the 5.15.182 release. There are 55 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 09 May 2025 18:37:41 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.182-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
------------- Pseudo-Shortlog of commits:
Greg Kroah-Hartman gregkh@linuxfoundation.org Linux 5.15.182-rc1
Nicolin Chen nicolinc@nvidia.com iommu/arm-smmu-v3: Fix iommu_device_probe bug due to duplicated stream ids
Jason Gunthorpe jgg@ziepe.ca iommu/arm-smmu-v3: Use the new rb tree helpers
Björn Töpel bjorn@rivosinc.com riscv: uprobes: Add missing fence.i after building the XOL buffer
Stephan Gerhold stephan.gerhold@linaro.org serial: msm: Configure correct working mode before starting earlycon
Suzuki K Poulose suzuki.poulose@arm.com irqchip/gic-v2m: Prevent use after free of gicv2m_get_fwnode()
Thomas Gleixner tglx@linutronix.de irqchip/gic-v2m: Mark a few functions __init
Xiang wangx wangxiang@cdjrlc.com irqchip/gic-v2m: Add const to of_device_id
Christian Hewitt christianshewitt@gmail.com Revert "drm/meson: vclk: fix calculation of 59.94 fractional rates"
Fiona Klute fiona.klute@gmx.de net: phy: microchip: force IRQ polling mode for lan88xx
Sébastien Szymanski sebastien.szymanski@armadeus.com ARM: dts: opos6ul: add ksz8081 phy properties
Cristian Marussi cristian.marussi@arm.com firmware: arm_scmi: Balance device refcount when destroying devices
Yonglong Liu liuyonglong@huawei.com net: hns3: fix deadlock issue when externel_lb and reset are executed together
Sergey Shtylyov s.shtylyov@omp.ru of: module: add buffer overflow check in of_modalias()
Richard Zhu hongxing.zhu@nxp.com PCI: imx6: Skip controller_id generation logic for i.MX7D
Jian Shen shenjian15@huawei.com net: hns3: defer calling ptp_clock_register()
Hao Lan lanhao@huawei.com net: hns3: fixed debugfs tm_qset size
Yonglong Liu liuyonglong@huawei.com net: hns3: fix an interrupt residual problem
Yonglong Liu liuyonglong@huawei.com net: hns3: add support for external loopback test
Jian Shen shenjian15@huawei.com net: hns3: store rx VLAN tag offload state for VF
Mattias Barthel mattias.barthel@atlascopco.com net: fec: ERR007885 Workaround for conventional TX
Thangaraj Samynathan thangaraj.s@microchip.com net: lan743x: Fix memleak issue when GSO enabled
Michael Liang mliang@purestorage.com nvme-tcp: fix premature queue removal and I/O failover
Michael Chan michael.chan@broadcom.com bnxt_en: Fix ethtool -d byte order for 32-bit values
Shruti Parab shruti.parab@broadcom.com bnxt_en: Fix out-of-bound memcpy() during ethtool -w
Shruti Parab shruti.parab@broadcom.com bnxt_en: Fix coredump logic to free allocated buffer
Felix Fietkau nbd@nbd.name net: ipv6: fix UDPv6 GSO segmentation with NAT
Simon Horman horms@kernel.org net: dlink: Correct endianness handling of led_mode
Xuanqiang Luo luoxuanqiang@kylinos.cn ice: Check VF VSI Pointer Value in ice_vc_add_fdir_fltr()
Brett Creeley brett.creeley@intel.com ice: Refactor promiscuous functions
Victor Nogueira victor@mojatatu.com net_sched: qfq: Fix double list add in class with netem as child qdisc
Victor Nogueira victor@mojatatu.com net_sched: ets: Fix double list add in class with netem as child qdisc
Victor Nogueira victor@mojatatu.com net_sched: hfsc: Fix a UAF vulnerability in class with netem as child qdisc
Victor Nogueira victor@mojatatu.com net_sched: drr: Fix double list add in class with netem as child qdisc
Louis-Alexis Eyraud louisalexis.eyraud@collabora.com net: ethernet: mtk-star-emac: rearm interrupts in rx_poll only when advised
Louis-Alexis Eyraud louisalexis.eyraud@collabora.com net: ethernet: mtk-star-emac: fix spinlock recursion issues on rx/tx poll
Biao Huang biao.huang@mediatek.com net: ethernet: mtk-star-emac: separate tx/rx handling with two NAPIs
Chris Mi cmi@nvidia.com net/mlx5: E-switch, Fix error handling for enabling roce
Maor Gottlieb maorg@nvidia.com net/mlx5: E-Switch, Initialize MAC Address for Default GID
Jakub Kicinski kuba@kernel.org net/sched: act_mirred: don't override retval if we already lost the skb
Sean Christopherson seanjc@google.com KVM: x86: Load DR6 with guest value only before entering .vcpu_run() loop
Jeongjun Park aha310510@gmail.com tracing: Fix oob write in trace_seq_to_buffer()
Mingcong Bai jeffbai@aosc.io iommu/vt-d: Apply quirk_iommu_igfx for 8086:0044 (QM57/QS57)
Pavel Paklov Pavel.Paklov@cyberprotect.ru iommu/amd: Fix potential buffer overflow in parse_ivrs_acpihid
Benjamin Marzinski bmarzins@redhat.com dm: always update the array size in realloc_argv on success
Mikulas Patocka mpatocka@redhat.com dm-integrity: fix a warning on invalid table line
Wentao Liang vulab@iscas.ac.cn wifi: brcm80211: fmac: Add error handling for brcmf_usb_dl_writeimage()
Ruslan Piasetskyi ruslan.piasetskyi@gmail.com mmc: renesas_sdhi: Fix error handling in renesas_sdhi_probe
Vishal Badole Vishal.Badole@amd.com amd-xgbe: Fix to ensure dependent features are toggled with RX checksum offload
Helge Deller deller@gmx.de parisc: Fix double SIGFPE crash
Will Deacon will@kernel.org arm64: errata: Add missing sentinels to Spectre-BHB MIDR arrays
Clark Wang xiaoning.wang@nxp.com i2c: imx-lpi2c: Fix clock count when probe defers
Niravkumar L Rabara niravkumar.l.rabara@altera.com EDAC/altera: Set DDR and SDMMC interrupt mask before registration
Niravkumar L Rabara niravkumar.l.rabara@altera.com EDAC/altera: Test the correct error reg offset
Philipp Stanner phasta@kernel.org drm/nouveau: Fix WARN_ON in nouveau_fence_context_kill()
Joachim Priesner joachim.priesner@web.de ALSA: usb-audio: Add second USB ID for Jabra Evolve 65 headset
-------------
Diffstat:
Makefile | 4 +- arch/arm/boot/dts/imx6ul-imx6ull-opos6ul.dtsi | 3 + arch/arm64/kernel/proton-pack.c | 2 + arch/parisc/math-emu/driver.c | 16 +- arch/riscv/kernel/probes/uprobes.c | 10 +- arch/x86/include/asm/kvm-x86-ops.h | 1 + arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/svm/svm.c | 13 +- arch/x86/kvm/vmx/vmx.c | 11 +- arch/x86/kvm/x86.c | 3 + drivers/edac/altera_edac.c | 9 +- drivers/edac/altera_edac.h | 2 + drivers/firmware/arm_scmi/bus.c | 3 + drivers/gpu/drm/meson/meson_vclk.c | 6 +- drivers/gpu/drm/nouveau/nouveau_fence.c | 2 +- drivers/i2c/busses/i2c-imx-lpi2c.c | 4 +- drivers/iommu/amd/init.c | 8 + drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 79 ++--- drivers/iommu/intel/iommu.c | 4 +- drivers/irqchip/irq-gic-v2m.c | 8 +- drivers/md/dm-integrity.c | 2 +- drivers/md/dm-table.c | 5 +- drivers/mmc/host/renesas_sdhi_core.c | 10 +- drivers/net/ethernet/amd/xgbe/xgbe-desc.c | 9 +- drivers/net/ethernet/amd/xgbe/xgbe-dev.c | 24 +- drivers/net/ethernet/amd/xgbe/xgbe-drv.c | 11 +- drivers/net/ethernet/amd/xgbe/xgbe.h | 4 + drivers/net/ethernet/broadcom/bnxt/bnxt_coredump.c | 30 +- drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 36 ++- drivers/net/ethernet/dlink/dl2k.c | 2 +- drivers/net/ethernet/dlink/dl2k.h | 2 +- drivers/net/ethernet/freescale/fec_main.c | 7 +- drivers/net/ethernet/hisilicon/hns3/hnae3.h | 2 + drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c | 2 +- drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 119 ++++++-- drivers/net/ethernet/hisilicon/hns3/hns3_enet.h | 3 + drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c | 61 ++-- .../ethernet/hisilicon/hns3/hns3pf/hclge_main.c | 26 +- .../net/ethernet/hisilicon/hns3/hns3pf/hclge_ptp.c | 13 +- .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c | 25 +- .../ethernet/hisilicon/hns3/hns3vf/hclgevf_main.h | 1 + drivers/net/ethernet/intel/ice/ice_fltr.c | 58 ++++ drivers/net/ethernet/intel/ice/ice_fltr.h | 12 + drivers/net/ethernet/intel/ice/ice_main.c | 49 +-- drivers/net/ethernet/intel/ice/ice_virtchnl_fdir.c | 5 + drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c | 139 ++++----- drivers/net/ethernet/mediatek/mtk_star_emac.c | 339 ++++++++++++--------- .../ethernet/mellanox/mlx5/core/eswitch_offloads.c | 5 +- drivers/net/ethernet/mellanox/mlx5/core/rdma.c | 11 +- drivers/net/ethernet/mellanox/mlx5/core/rdma.h | 4 +- drivers/net/ethernet/microchip/lan743x_main.c | 8 +- drivers/net/ethernet/microchip/lan743x_main.h | 1 + drivers/net/phy/microchip.c | 46 +-- .../net/wireless/broadcom/brcm80211/brcmfmac/usb.c | 6 +- drivers/nvme/host/tcp.c | 31 +- drivers/of/device.c | 7 +- drivers/pci/controller/dwc/pci-imx6.c | 5 +- drivers/tty/serial/msm_serial.c | 6 + kernel/trace/trace.c | 5 +- net/ipv4/udp_offload.c | 61 +++- net/sched/act_mirred.c | 22 +- net/sched/sch_drr.c | 9 +- net/sched/sch_ets.c | 9 +- net/sched/sch_hfsc.c | 2 +- net/sched/sch_qfq.c | 11 +- sound/usb/format.c | 3 +- 66 files changed, 932 insertions(+), 505 deletions(-)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Joachim Priesner joachim.priesner@web.de
commit 1149719442d28c96dc63cad432b5a6db7c300e1a upstream.
There seem to be multiple USB device IDs used for these; the one I have reports as 0b0e:030c when powered on. (When powered off, it reports as 0b0e:0311.)
Signed-off-by: Joachim Priesner joachim.priesner@web.de Cc: stable@vger.kernel.org Link: https://patch.msgid.link/20250428053606.9237-1-joachim.priesner@web.de Signed-off-by: Takashi Iwai tiwai@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- sound/usb/format.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)
--- a/sound/usb/format.c +++ b/sound/usb/format.c @@ -263,7 +263,8 @@ static int parse_audio_format_rates_v1(s }
/* Jabra Evolve 65 headset */ - if (chip->usb_id == USB_ID(0x0b0e, 0x030b)) { + if (chip->usb_id == USB_ID(0x0b0e, 0x030b) || + chip->usb_id == USB_ID(0x0b0e, 0x030c)) { /* only 48kHz for playback while keeping 16kHz for capture */ if (fp->nr_rates != 1) return set_fixed_rate(fp, 48000, SNDRV_PCM_RATE_48000);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Philipp Stanner phasta@kernel.org
commit bbe5679f30d7690a9b6838a583b9690ea73fe0e9 upstream.
Nouveau is mostly designed in a way that it's expected that fences only ever get signaled through nouveau_fence_signal(). However, in at least one other place, nouveau_fence_done(), can signal fences, too. If that happens (race) a signaled fence remains in the pending list for a while, until it gets removed by nouveau_fence_update().
Should nouveau_fence_context_kill() run in the meantime, this would be a bug because the function would attempt to set an error code on an already signaled fence.
Have nouveau_fence_context_kill() check for a fence being signaled.
Cc: stable@vger.kernel.org # v5.10+ Fixes: ea13e5abf807 ("drm/nouveau: signal pending fences when channel has been killed") Suggested-by: Christian König christian.koenig@amd.com Signed-off-by: Philipp Stanner phasta@kernel.org Link: https://lore.kernel.org/r/20250415121900.55719-3-phasta@kernel.org Signed-off-by: Danilo Krummrich dakr@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/gpu/drm/nouveau/nouveau_fence.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c @@ -95,7 +95,7 @@ nouveau_fence_context_kill(struct nouvea while (!list_empty(&fctx->pending)) { fence = list_entry(fctx->pending.next, typeof(*fence), head);
- if (error) + if (error && !dma_fence_is_signaled_locked(&fence->base)) dma_fence_set_error(&fence->base, error);
if (nouveau_fence_signal(fence))
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Niravkumar L Rabara niravkumar.l.rabara@altera.com
commit 4fb7b8fceb0beebbe00712c3daf49ade0386076a upstream.
Test correct structure member, ecc_cecnt_offset, before using it.
[ bp: Massage commit message. ]
Fixes: 73bcc942f427 ("EDAC, altera: Add Arria10 EDAC support") Signed-off-by: Niravkumar L Rabara niravkumar.l.rabara@altera.com Signed-off-by: Matthew Gerlach matthew.gerlach@altera.com Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Acked-by: Dinh Nguyen dinguyen@kernel.org Cc: stable@kernel.org Link: https://lore.kernel.org/20250425142640.33125-2-matthew.gerlach@altera.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/edac/altera_edac.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/edac/altera_edac.c +++ b/drivers/edac/altera_edac.c @@ -98,7 +98,7 @@ static irqreturn_t altr_sdram_mc_err_han if (status & priv->ecc_stat_ce_mask) { regmap_read(drvdata->mc_vbase, priv->ecc_saddr_offset, &err_addr); - if (priv->ecc_uecnt_offset) + if (priv->ecc_cecnt_offset) regmap_read(drvdata->mc_vbase, priv->ecc_cecnt_offset, &err_count); edac_mc_handle_error(HW_EVENT_ERR_CORRECTED, mci, err_count,
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Niravkumar L Rabara niravkumar.l.rabara@altera.com
commit 6dbe3c5418c4368e824bff6ae4889257dd544892 upstream.
Mask DDR and SDMMC in probe function to avoid spurious interrupts before registration. Removed invalid register write to system manager.
Fixes: 1166fde93d5b ("EDAC, altera: Add Arria10 ECC memory init functions") Signed-off-by: Niravkumar L Rabara niravkumar.l.rabara@altera.com Signed-off-by: Matthew Gerlach matthew.gerlach@altera.com Signed-off-by: Borislav Petkov (AMD) bp@alien8.de Acked-by: Dinh Nguyen dinguyen@kernel.org Cc: stable@kernel.org Link: https://lore.kernel.org/20250425142640.33125-3-matthew.gerlach@altera.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/edac/altera_edac.c | 7 ++++--- drivers/edac/altera_edac.h | 2 ++ 2 files changed, 6 insertions(+), 3 deletions(-)
--- a/drivers/edac/altera_edac.c +++ b/drivers/edac/altera_edac.c @@ -1015,9 +1015,6 @@ altr_init_a10_ecc_block(struct device_no } }
- /* Interrupt mode set to every SBERR */ - regmap_write(ecc_mgr_map, ALTR_A10_ECC_INTMODE_OFST, - ALTR_A10_ECC_INTMODE); /* Enable ECC */ ecc_set_bits(ecc_ctrl_en_mask, (ecc_block_base + ALTR_A10_ECC_CTRL_OFST)); @@ -2100,6 +2097,10 @@ static int altr_edac_a10_probe(struct pl return PTR_ERR(edac->ecc_mgr_map); }
+ /* Set irq mask for DDR SBE to avoid any pending irq before registration */ + regmap_write(edac->ecc_mgr_map, A10_SYSMGR_ECC_INTMASK_SET_OFST, + (A10_SYSMGR_ECC_INTMASK_SDMMCB | A10_SYSMGR_ECC_INTMASK_DDR0)); + edac->irq_chip.name = pdev->dev.of_node->name; edac->irq_chip.irq_mask = a10_eccmgr_irq_mask; edac->irq_chip.irq_unmask = a10_eccmgr_irq_unmask; --- a/drivers/edac/altera_edac.h +++ b/drivers/edac/altera_edac.h @@ -249,6 +249,8 @@ struct altr_sdram_mc_data { #define A10_SYSMGR_ECC_INTMASK_SET_OFST 0x94 #define A10_SYSMGR_ECC_INTMASK_CLR_OFST 0x98 #define A10_SYSMGR_ECC_INTMASK_OCRAM BIT(1) +#define A10_SYSMGR_ECC_INTMASK_SDMMCB BIT(16) +#define A10_SYSMGR_ECC_INTMASK_DDR0 BIT(17)
#define A10_SYSMGR_ECC_INTSTAT_SERR_OFST 0x9C #define A10_SYSMGR_ECC_INTSTAT_DERR_OFST 0xA0
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Clark Wang xiaoning.wang@nxp.com
commit b1852c5de2f2a37dd4462f7837c9e3e678f9e546 upstream.
Deferred probe with pm_runtime_put() may delay clock disable, causing incorrect clock usage count. Use pm_runtime_put_sync() to ensure the clock is disabled immediately.
Fixes: 13d6eb20fc79 ("i2c: imx-lpi2c: add runtime pm support") Signed-off-by: Clark Wang xiaoning.wang@nxp.com Signed-off-by: Carlos Song carlos.song@nxp.com Cc: stable@vger.kernel.org # v4.16+ Link: https://lore.kernel.org/r/20250421062341.2471922-1-carlos.song@nxp.com Signed-off-by: Andi Shyti andi.shyti@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/i2c/busses/i2c-imx-lpi2c.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)
--- a/drivers/i2c/busses/i2c-imx-lpi2c.c +++ b/drivers/i2c/busses/i2c-imx-lpi2c.c @@ -616,9 +616,9 @@ static int lpi2c_imx_probe(struct platfo return 0;
rpm_disable: - pm_runtime_put(&pdev->dev); - pm_runtime_disable(&pdev->dev); pm_runtime_dont_use_autosuspend(&pdev->dev); + pm_runtime_put_sync(&pdev->dev); + pm_runtime_disable(&pdev->dev);
return ret; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Will Deacon will@kernel.org
commit fee4d171451c1ad9e8aaf65fc0ab7d143a33bd72 upstream.
Commit a5951389e58d ("arm64: errata: Add newer ARM cores to the spectre_bhb_loop_affected() lists") added some additional CPUs to the Spectre-BHB workaround, including some new arrays for designs that require new 'k' values for the workaround to be effective.
Unfortunately, the new arrays omitted the sentinel entry and so is_midr_in_range_list() will walk off the end when it doesn't find a match. With UBSAN enabled, this leads to a crash during boot when is_midr_in_range_list() is inlined (which was more common prior to c8c2647e69be ("arm64: Make _midr_in_range_list() an exported function")):
| Internal error: aarch64 BRK: 00000000f2000001 [#1] PREEMPT SMP | pstate: 804000c5 (Nzcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--) | pc : spectre_bhb_loop_affected+0x28/0x30 | lr : is_spectre_bhb_affected+0x170/0x190 | [...] | Call trace: | spectre_bhb_loop_affected+0x28/0x30 | update_cpu_capabilities+0xc0/0x184 | init_cpu_features+0x188/0x1a4 | cpuinfo_store_boot_cpu+0x4c/0x60 | smp_prepare_boot_cpu+0x38/0x54 | start_kernel+0x8c/0x478 | __primary_switched+0xc8/0xd4 | Code: 6b09011f 54000061 52801080 d65f03c0 (d4200020) | ---[ end trace 0000000000000000 ]--- | Kernel panic - not syncing: aarch64 BRK: Fatal exception
Add the missing sentinel entries.
Cc: Lee Jones lee@kernel.org Cc: James Morse james.morse@arm.com Cc: Doug Anderson dianders@chromium.org Cc: Shameer Kolothum shameerali.kolothum.thodi@huawei.com Cc: stable@vger.kernel.org Reported-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Fixes: a5951389e58d ("arm64: errata: Add newer ARM cores to the spectre_bhb_loop_affected() lists") Signed-off-by: Will Deacon will@kernel.org Reviewed-by: Lee Jones lee@kernel.org Reviewed-by: Douglas Anderson dianders@chromium.org Reviewed-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Link: https://lore.kernel.org/r/20250501104747.28431-1-will@kernel.org Signed-off-by: Catalin Marinas catalin.marinas@arm.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/arm64/kernel/proton-pack.c | 2 ++ 1 file changed, 2 insertions(+)
--- a/arch/arm64/kernel/proton-pack.c +++ b/arch/arm64/kernel/proton-pack.c @@ -879,10 +879,12 @@ static u8 spectre_bhb_loop_affected(void static const struct midr_range spectre_bhb_k132_list[] = { MIDR_ALL_VERSIONS(MIDR_CORTEX_X3), MIDR_ALL_VERSIONS(MIDR_NEOVERSE_V2), + {}, }; static const struct midr_range spectre_bhb_k38_list[] = { MIDR_ALL_VERSIONS(MIDR_CORTEX_A715), MIDR_ALL_VERSIONS(MIDR_CORTEX_A720), + {}, }; static const struct midr_range spectre_bhb_k32_list[] = { MIDR_ALL_VERSIONS(MIDR_CORTEX_A78),
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Helge Deller deller@gmx.de
commit de3629baf5a33af1919dec7136d643b0662e85ef upstream.
Camm noticed that on parisc a SIGFPE exception will crash an application with a second SIGFPE in the signal handler. Dave analyzed it, and it happens because glibc uses a double-word floating-point store to atomically update function descriptors. As a result of lazy binding, we hit a floating-point store in fpe_func almost immediately.
When the T bit is set, an assist exception trap occurs when when the co-processor encounters *any* floating-point instruction except for a double store of register %fr0. The latter cancels all pending traps. Let's fix this by clearing the Trap (T) bit in the FP status register before returning to the signal handler in userspace.
The issue can be reproduced with this test program:
root@parisc:~# cat fpe.c
static void fpe_func(int sig, siginfo_t *i, void *v) { sigset_t set; sigemptyset(&set); sigaddset(&set, SIGFPE); sigprocmask(SIG_UNBLOCK, &set, NULL); printf("GOT signal %d with si_code %ld\n", sig, i->si_code); }
int main() { struct sigaction action = { .sa_sigaction = fpe_func, .sa_flags = SA_RESTART|SA_SIGINFO }; sigaction(SIGFPE, &action, 0); feenableexcept(FE_OVERFLOW); return printf("%lf\n",1.7976931348623158E308*1.7976931348623158E308); }
root@parisc:~# gcc fpe.c -lm root@parisc:~# ./a.out Floating point exception
root@parisc:~# strace -f ./a.out execve("./a.out", ["./a.out"], 0xf9ac7034 /* 20 vars */) = 0 getrlimit(RLIMIT_STACK, {rlim_cur=8192*1024, rlim_max=RLIM_INFINITY}) = 0 ... rt_sigaction(SIGFPE, {sa_handler=0x1110a, sa_mask=[], sa_flags=SA_RESTART|SA_SIGINFO}, NULL, 8) = 0 --- SIGFPE {si_signo=SIGFPE, si_code=FPE_FLTOVF, si_addr=0x1078f} --- --- SIGFPE {si_signo=SIGFPE, si_code=FPE_FLTOVF, si_addr=0xf8f21237} --- +++ killed by SIGFPE +++ Floating point exception
Signed-off-by: Helge Deller deller@gmx.de Suggested-by: John David Anglin dave.anglin@bell.net Reported-by: Camm Maguire camm@maguirefamily.org Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/parisc/math-emu/driver.c | 16 +++++++++++++--- 1 file changed, 13 insertions(+), 3 deletions(-)
--- a/arch/parisc/math-emu/driver.c +++ b/arch/parisc/math-emu/driver.c @@ -103,9 +103,19 @@ handle_fpe(struct pt_regs *regs)
memcpy(regs->fr, frcopy, sizeof regs->fr); if (signalcode != 0) { - force_sig_fault(signalcode >> 24, signalcode & 0xffffff, - (void __user *) regs->iaoq[0]); - return -1; + int sig = signalcode >> 24; + + if (sig == SIGFPE) { + /* + * Clear floating point trap bit to avoid trapping + * again on the first floating-point instruction in + * the userspace signal handler. + */ + regs->fr[0] &= ~(1ULL << 38); + } + force_sig_fault(sig, signalcode & 0xffffff, + (void __user *) regs->iaoq[0]); + return -1; }
return signalcode ? -1 : 0;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Vishal Badole Vishal.Badole@amd.com
commit f04dd30f1bef1ed2e74a4050af6e5e5e3869bac3 upstream.
According to the XGMAC specification, enabling features such as Layer 3 and Layer 4 Packet Filtering, Split Header and Virtualized Network support automatically selects the IPC Full Checksum Offload Engine on the receive side.
When RX checksum offload is disabled, these dependent features must also be disabled to prevent abnormal behavior caused by mismatched feature dependencies.
Ensure that toggling RX checksum offload (disabling or enabling) properly disables or enables all dependent features, maintaining consistent and expected behavior in the network device.
Cc: stable@vger.kernel.org Fixes: 1a510ccf5869 ("amd-xgbe: Add support for VXLAN offload capabilities") Signed-off-by: Vishal Badole Vishal.Badole@amd.com Reviewed-by: Simon Horman horms@kernel.org Link: https://patch.msgid.link/20250424130248.428865-1-Vishal.Badole@amd.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/amd/xgbe/xgbe-desc.c | 9 +++++++-- drivers/net/ethernet/amd/xgbe/xgbe-dev.c | 24 ++++++++++++++++++++++-- drivers/net/ethernet/amd/xgbe/xgbe-drv.c | 11 +++++++++-- drivers/net/ethernet/amd/xgbe/xgbe.h | 4 ++++ 4 files changed, 42 insertions(+), 6 deletions(-)
--- a/drivers/net/ethernet/amd/xgbe/xgbe-desc.c +++ b/drivers/net/ethernet/amd/xgbe/xgbe-desc.c @@ -373,8 +373,13 @@ static int xgbe_map_rx_buffer(struct xgb }
/* Set up the header page info */ - xgbe_set_buffer_data(&rdata->rx.hdr, &ring->rx_hdr_pa, - XGBE_SKB_ALLOC_SIZE); + if (pdata->netdev->features & NETIF_F_RXCSUM) { + xgbe_set_buffer_data(&rdata->rx.hdr, &ring->rx_hdr_pa, + XGBE_SKB_ALLOC_SIZE); + } else { + xgbe_set_buffer_data(&rdata->rx.hdr, &ring->rx_hdr_pa, + pdata->rx_buf_size); + }
/* Set up the buffer page info */ xgbe_set_buffer_data(&rdata->rx.buf, &ring->rx_buf_pa, --- a/drivers/net/ethernet/amd/xgbe/xgbe-dev.c +++ b/drivers/net/ethernet/amd/xgbe/xgbe-dev.c @@ -320,6 +320,18 @@ static void xgbe_config_sph_mode(struct XGMAC_IOWRITE_BITS(pdata, MAC_RCR, HDSMS, XGBE_SPH_HDSMS_SIZE); }
+static void xgbe_disable_sph_mode(struct xgbe_prv_data *pdata) +{ + unsigned int i; + + for (i = 0; i < pdata->channel_count; i++) { + if (!pdata->channel[i]->rx_ring) + break; + + XGMAC_DMA_IOWRITE_BITS(pdata->channel[i], DMA_CH_CR, SPH, 0); + } +} + static int xgbe_write_rss_reg(struct xgbe_prv_data *pdata, unsigned int type, unsigned int index, unsigned int val) { @@ -3495,8 +3507,12 @@ static int xgbe_init(struct xgbe_prv_dat xgbe_config_tx_coalesce(pdata); xgbe_config_rx_buffer_size(pdata); xgbe_config_tso_mode(pdata); - xgbe_config_sph_mode(pdata); - xgbe_config_rss(pdata); + + if (pdata->netdev->features & NETIF_F_RXCSUM) { + xgbe_config_sph_mode(pdata); + xgbe_config_rss(pdata); + } + desc_if->wrapper_tx_desc_init(pdata); desc_if->wrapper_rx_desc_init(pdata); xgbe_enable_dma_interrupts(pdata); @@ -3650,5 +3666,9 @@ void xgbe_init_function_ptrs_dev(struct hw_if->disable_vxlan = xgbe_disable_vxlan; hw_if->set_vxlan_id = xgbe_set_vxlan_id;
+ /* For Split Header*/ + hw_if->enable_sph = xgbe_config_sph_mode; + hw_if->disable_sph = xgbe_disable_sph_mode; + DBGPR("<--xgbe_init_function_ptrs\n"); } --- a/drivers/net/ethernet/amd/xgbe/xgbe-drv.c +++ b/drivers/net/ethernet/amd/xgbe/xgbe-drv.c @@ -2264,10 +2264,17 @@ static int xgbe_set_features(struct net_ if (ret) return ret;
- if ((features & NETIF_F_RXCSUM) && !rxcsum) + if ((features & NETIF_F_RXCSUM) && !rxcsum) { + hw_if->enable_sph(pdata); + hw_if->enable_vxlan(pdata); hw_if->enable_rx_csum(pdata); - else if (!(features & NETIF_F_RXCSUM) && rxcsum) + schedule_work(&pdata->restart_work); + } else if (!(features & NETIF_F_RXCSUM) && rxcsum) { + hw_if->disable_sph(pdata); + hw_if->disable_vxlan(pdata); hw_if->disable_rx_csum(pdata); + schedule_work(&pdata->restart_work); + }
if ((features & NETIF_F_HW_VLAN_CTAG_RX) && !rxvlan) hw_if->enable_rx_vlan_stripping(pdata); --- a/drivers/net/ethernet/amd/xgbe/xgbe.h +++ b/drivers/net/ethernet/amd/xgbe/xgbe.h @@ -833,6 +833,10 @@ struct xgbe_hw_if { void (*enable_vxlan)(struct xgbe_prv_data *); void (*disable_vxlan)(struct xgbe_prv_data *); void (*set_vxlan_id)(struct xgbe_prv_data *); + + /* For Split Header */ + void (*enable_sph)(struct xgbe_prv_data *pdata); + void (*disable_sph)(struct xgbe_prv_data *pdata); };
/* This structure represents implementation specific routines for an
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Ruslan Piasetskyi ruslan.piasetskyi@gmail.com
commit 649b50a82f09fa44c2f7a65618e4584072145ab7 upstream.
After moving tmio_mmc_host_probe down, error handling has to be adjusted.
Fixes: 74f45de394d9 ("mmc: renesas_sdhi: register irqs before registering controller") Reviewed-by: Ihar Salauyou salauyou.ihar@gmail.com Signed-off-by: Ruslan Piasetskyi ruslan.piasetskyi@gmail.com Reviewed-by: Geert Uytterhoeven geert+renesas@glider.be Reviewed-by: Wolfram Sang wsa+renesas@sang-engineering.com Tested-by: Wolfram Sang wsa+renesas@sang-engineering.com Cc: stable@vger.kernel.org Link: https://lore.kernel.org/r/20250326220638.460083-1-ruslan.piasetskyi@gmail.co... Signed-off-by: Ulf Hansson ulf.hansson@linaro.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/mmc/host/renesas_sdhi_core.c | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-)
--- a/drivers/mmc/host/renesas_sdhi_core.c +++ b/drivers/mmc/host/renesas_sdhi_core.c @@ -1078,26 +1078,26 @@ int renesas_sdhi_probe(struct platform_d num_irqs = platform_irq_count(pdev); if (num_irqs < 0) { ret = num_irqs; - goto eirq; + goto edisclk; }
/* There must be at least one IRQ source */ if (!num_irqs) { ret = -ENXIO; - goto eirq; + goto edisclk; }
for (i = 0; i < num_irqs; i++) { irq = platform_get_irq(pdev, i); if (irq < 0) { ret = irq; - goto eirq; + goto edisclk; }
ret = devm_request_irq(&pdev->dev, irq, tmio_mmc_irq, 0, dev_name(&pdev->dev), host); if (ret) - goto eirq; + goto edisclk; }
ret = tmio_mmc_host_probe(host); @@ -1109,8 +1109,6 @@ int renesas_sdhi_probe(struct platform_d
return ret;
-eirq: - tmio_mmc_host_remove(host); edisclk: renesas_sdhi_clk_disable(host); efree:
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Wentao Liang vulab@iscas.ac.cn
commit 8e089e7b585d95122c8122d732d1d5ef8f879396 upstream.
The function brcmf_usb_dl_writeimage() calls the function brcmf_usb_dl_cmd() but dose not check its return value. The 'state.state' and the 'state.bytes' are uninitialized if the function brcmf_usb_dl_cmd() fails. It is dangerous to use uninitialized variables in the conditions.
Add error handling for brcmf_usb_dl_cmd() to jump to error handling path if the brcmf_usb_dl_cmd() fails and the 'state.state' and the 'state.bytes' are uninitialized.
Improve the error message to report more detailed error information.
Fixes: 71bb244ba2fd ("brcm80211: fmac: add USB support for bcm43235/6/8 chipsets") Cc: stable@vger.kernel.org # v3.4+ Signed-off-by: Wentao Liang vulab@iscas.ac.cn Acked-by: Arend van Spriel arend.vanspriel@broadcom.com Link: https://patch.msgid.link/20250422042203.2259-1-vulab@iscas.ac.cn Signed-off-by: Johannes Berg johannes.berg@intel.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/wireless/broadcom/brcm80211/brcmfmac/usb.c | 6 ++++-- 1 file changed, 4 insertions(+), 2 deletions(-)
--- a/drivers/net/wireless/broadcom/brcm80211/brcmfmac/usb.c +++ b/drivers/net/wireless/broadcom/brcm80211/brcmfmac/usb.c @@ -903,14 +903,16 @@ brcmf_usb_dl_writeimage(struct brcmf_usb }
/* 1) Prepare USB boot loader for runtime image */ - brcmf_usb_dl_cmd(devinfo, DL_START, &state, sizeof(state)); + err = brcmf_usb_dl_cmd(devinfo, DL_START, &state, sizeof(state)); + if (err) + goto fail;
rdlstate = le32_to_cpu(state.state); rdlbytes = le32_to_cpu(state.bytes);
/* 2) Check we are in the Waiting state */ if (rdlstate != DL_WAITING) { - brcmf_err("Failed to DL_START\n"); + brcmf_err("Invalid DL state: %u\n", rdlstate); err = -EINVAL; goto fail; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mikulas Patocka mpatocka@redhat.com
commit 0a533c3e4246c29d502a7e0fba0e86d80a906b04 upstream.
If we use the 'B' mode and we have an invalit table line, cancel_delayed_work_sync would trigger a warning. This commit avoids the warning.
Signed-off-by: Mikulas Patocka mpatocka@redhat.com Cc: stable@vger.kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/md/dm-integrity.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/md/dm-integrity.c +++ b/drivers/md/dm-integrity.c @@ -4543,7 +4543,7 @@ static void dm_integrity_dtr(struct dm_t BUG_ON(!RB_EMPTY_ROOT(&ic->in_progress)); BUG_ON(!list_empty(&ic->wait_list));
- if (ic->mode == 'B') + if (ic->mode == 'B' && ic->bitmap_flush_work.work.func) cancel_delayed_work_sync(&ic->bitmap_flush_work); if (ic->metadata_wq) destroy_workqueue(ic->metadata_wq);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Benjamin Marzinski bmarzins@redhat.com
commit 5a2a6c428190f945c5cbf5791f72dbea83e97f66 upstream.
realloc_argv() was only updating the array size if it was called with old_argv already allocated. The first time it was called to create an argv array, it would allocate the array but return the array size as zero. dm_split_args() would think that it couldn't store any arguments in the array and would call realloc_argv() again, causing it to reallocate the initial slots (this time using GPF_KERNEL) and finally return a size. Aside from being wasteful, this could cause deadlocks on targets that need to process messages without starting new IO. Instead, realloc_argv should always update the allocated array size on success.
Fixes: a0651926553c ("dm table: don't copy from a NULL pointer in realloc_argv()") Cc: stable@vger.kernel.org Signed-off-by: Benjamin Marzinski bmarzins@redhat.com Signed-off-by: Mikulas Patocka mpatocka@redhat.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/md/dm-table.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/drivers/md/dm-table.c +++ b/drivers/md/dm-table.c @@ -492,9 +492,10 @@ static char **realloc_argv(unsigned *siz gfp = GFP_NOIO; } argv = kmalloc_array(new_size, sizeof(*argv), gfp); - if (argv && old_argv) { - memcpy(argv, old_argv, *size * sizeof(*argv)); + if (argv) { *size = new_size; + if (old_argv) + memcpy(argv, old_argv, *size * sizeof(*argv)); }
kfree(old_argv);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Pavel Paklov Pavel.Paklov@cyberprotect.ru
commit 8dee308e4c01dea48fc104d37f92d5b58c50b96c upstream.
There is a string parsing logic error which can lead to an overflow of hid or uid buffers. Comparing ACPIID_LEN against a total string length doesn't take into account the lengths of individual hid and uid buffers so the check is insufficient in some cases. For example if the length of hid string is 4 and the length of the uid string is 260, the length of str will be equal to ACPIID_LEN + 1 but uid string will overflow uid buffer which size is 256.
The same applies to the hid string with length 13 and uid string with length 250.
Check the length of hid and uid strings separately to prevent buffer overflow.
Found by Linux Verification Center (linuxtesting.org) with SVACE.
Fixes: ca3bf5d47cec ("iommu/amd: Introduces ivrs_acpihid kernel parameter") Cc: stable@vger.kernel.org Signed-off-by: Pavel Paklov Pavel.Paklov@cyberprotect.ru Link: https://lore.kernel.org/r/20250325092259.392844-1-Pavel.Paklov@cyberprotect.... Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iommu/amd/init.c | 8 ++++++++ 1 file changed, 8 insertions(+)
--- a/drivers/iommu/amd/init.c +++ b/drivers/iommu/amd/init.c @@ -3343,6 +3343,14 @@ found: while (*uid == '0' && *(uid + 1)) uid++;
+ if (strlen(hid) >= ACPIHID_HID_LEN) { + pr_err("Invalid command line: hid is too long\n"); + return 1; + } else if (strlen(uid) >= ACPIHID_UID_LEN) { + pr_err("Invalid command line: uid is too long\n"); + return 1; + } + i = early_acpihid_map_size++; memcpy(early_acpihid_map[i].hid, hid, strlen(hid)); memcpy(early_acpihid_map[i].uid, uid, strlen(uid));
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mingcong Bai jeffbai@aosc.io
commit 2c8a7c66c90832432496616a9a3c07293f1364f3 upstream.
On the Lenovo ThinkPad X201, when Intel VT-d is enabled in the BIOS, the kernel boots with errors related to DMAR, the graphical interface appeared quite choppy, and the system resets erratically within a minute after it booted:
DMAR: DRHD: handling fault status reg 3 DMAR: [DMA Write NO_PASID] Request device [00:02.0] fault addr 0xb97ff000 [fault reason 0x05] PTE Write access is not set
Upon comparing boot logs with VT-d on/off, I found that the Intel Calpella quirk (`quirk_calpella_no_shadow_gtt()') correctly applied the igfx IOMMU disable/quirk correctly:
pci 0000:00:00.0: DMAR: BIOS has allocated no shadow GTT; disabling IOMMU for graphics
Whereas with VT-d on, it went into the "else" branch, which then triggered the DMAR handling fault above:
... else if (!disable_igfx_iommu) { /* we have to ensure the gfx device is idle before we flush */ pci_info(dev, "Disabling batched IOTLB flush on Ironlake\n"); iommu_set_dma_strict(); }
Now, this is not exactly scientific, but moving 0x0044 to quirk_iommu_igfx seems to have fixed the aforementioned issue. Running a few `git blame' runs on the function, I have found that the quirk was originally introduced as a fix specific to ThinkPad X201:
commit 9eecabcb9a92 ("intel-iommu: Abort IOMMU setup for igfx if BIOS gave no shadow GTT space")
Which was later revised twice to the "else" branch we saw above:
- 2011: commit 6fbcfb3e467a ("intel-iommu: Workaround IOTLB hang on Ironlake GPU") - 2024: commit ba00196ca41c ("iommu/vt-d: Decouple igfx_off from graphic identity mapping")
I'm uncertain whether further testings on this particular laptops were done in 2011 and (honestly I'm not sure) 2024, but I would be happy to do some distro-specific testing if that's what would be required to verify this patch.
P.S., I also see IDs 0x0040, 0x0062, and 0x006a listed under the same `quirk_calpella_no_shadow_gtt()' quirk, but I'm not sure how similar these chipsets are (if they share the same issue with VT-d or even, indeed, if this issue is specific to a bug in the Lenovo BIOS). With regards to 0x0062, it seems to be a Centrino wireless card, but not a chipset?
I have also listed a couple (distro and kernel) bug reports below as references (some of them are from 7-8 years ago!), as they seem to be similar issue found on different Westmere/Ironlake, Haswell, and Broadwell hardware setups.
Cc: stable@vger.kernel.org Fixes: 6fbcfb3e467a ("intel-iommu: Workaround IOTLB hang on Ironlake GPU") Fixes: ba00196ca41c ("iommu/vt-d: Decouple igfx_off from graphic identity mapping") Link: https://groups.google.com/g/qubes-users/c/4NP4goUds2c?pli=1 Link: https://bugs.archlinux.org/task/65362 Link: https://bbs.archlinux.org/viewtopic.php?id=230323 Reported-by: Wenhao Sun weiguangtwk@outlook.com Closes: https://bugzilla.kernel.org/show_bug.cgi?id=197029 Signed-off-by: Mingcong Bai jeffbai@aosc.io Link: https://lore.kernel.org/r/20250415133330.12528-1-jeffbai@aosc.io Signed-off-by: Lu Baolu baolu.lu@linux.intel.com Signed-off-by: Joerg Roedel jroedel@suse.de Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/iommu/intel/iommu.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
--- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -5660,6 +5660,9 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_I DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2e40, quirk_iommu_igfx); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x2e90, quirk_iommu_igfx);
+/* QM57/QS57 integrated gfx malfunctions with dmar */ +DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x0044, quirk_iommu_igfx); + /* Broadwell igfx malfunctions with dmar */ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x1606, quirk_iommu_igfx); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x160B, quirk_iommu_igfx); @@ -5737,7 +5740,6 @@ static void quirk_calpella_no_shadow_gtt } } DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x0040, quirk_calpella_no_shadow_gtt); -DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x0044, quirk_calpella_no_shadow_gtt); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x0062, quirk_calpella_no_shadow_gtt); DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x006a, quirk_calpella_no_shadow_gtt);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jeongjun Park aha310510@gmail.com
commit f5178c41bb43444a6008150fe6094497135d07cb upstream.
syzbot reported this bug: ================================================================== BUG: KASAN: slab-out-of-bounds in trace_seq_to_buffer kernel/trace/trace.c:1830 [inline] BUG: KASAN: slab-out-of-bounds in tracing_splice_read_pipe+0x6be/0xdd0 kernel/trace/trace.c:6822 Write of size 4507 at addr ffff888032b6b000 by task syz.2.320/7260
CPU: 1 UID: 0 PID: 7260 Comm: syz.2.320 Not tainted 6.15.0-rc1-syzkaller-00301-g3bde70a2c827 #0 PREEMPT(full) Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 02/12/2025 Call Trace: <TASK> __dump_stack lib/dump_stack.c:94 [inline] dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:120 print_address_description mm/kasan/report.c:408 [inline] print_report+0xc3/0x670 mm/kasan/report.c:521 kasan_report+0xe0/0x110 mm/kasan/report.c:634 check_region_inline mm/kasan/generic.c:183 [inline] kasan_check_range+0xef/0x1a0 mm/kasan/generic.c:189 __asan_memcpy+0x3c/0x60 mm/kasan/shadow.c:106 trace_seq_to_buffer kernel/trace/trace.c:1830 [inline] tracing_splice_read_pipe+0x6be/0xdd0 kernel/trace/trace.c:6822 .... ==================================================================
It has been reported that trace_seq_to_buffer() tries to copy more data than PAGE_SIZE to buf. Therefore, to prevent this, we should use the smaller of trace_seq_used(&iter->seq) and PAGE_SIZE as an argument.
Link: https://lore.kernel.org/20250422113026.13308-1-aha310510@gmail.com Reported-by: syzbot+c8cd2d2c412b868263fb@syzkaller.appspotmail.com Fixes: 3c56819b14b0 ("tracing: splice support for tracing_pipe") Suggested-by: Steven Rostedt rostedt@goodmis.org Signed-off-by: Jeongjun Park aha310510@gmail.com Signed-off-by: Steven Rostedt (Google) rostedt@goodmis.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- kernel/trace/trace.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-)
--- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -7034,13 +7034,14 @@ static ssize_t tracing_splice_read_pipe( /* Copy the data into the page, so we can start over. */ ret = trace_seq_to_buffer(&iter->seq, page_address(spd.pages[i]), - trace_seq_used(&iter->seq)); + min((size_t)trace_seq_used(&iter->seq), + PAGE_SIZE)); if (ret < 0) { __free_page(spd.pages[i]); break; } spd.partial[i].offset = 0; - spd.partial[i].len = trace_seq_used(&iter->seq); + spd.partial[i].len = ret;
trace_seq_init(&iter->seq); }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sean Christopherson seanjc@google.com
commit c2fee09fc167c74a64adb08656cb993ea475197e upstream.
Move the conditional loading of hardware DR6 with the guest's DR6 value out of the core .vcpu_run() loop to fix a bug where KVM can load hardware with a stale vcpu->arch.dr6.
When the guest accesses a DR and host userspace isn't debugging the guest, KVM disables DR interception and loads the guest's values into hardware on VM-Enter and saves them on VM-Exit. This allows the guest to access DRs at will, e.g. so that a sequence of DR accesses to configure a breakpoint only generates one VM-Exit.
For DR0-DR3, the logic/behavior is identical between VMX and SVM, and also identical between KVM_DEBUGREG_BP_ENABLED (userspace debugging the guest) and KVM_DEBUGREG_WONT_EXIT (guest using DRs), and so KVM handles loading DR0-DR3 in common code, _outside_ of the core kvm_x86_ops.vcpu_run() loop.
But for DR6, the guest's value doesn't need to be loaded into hardware for KVM_DEBUGREG_BP_ENABLED, and SVM provides a dedicated VMCB field whereas VMX requires software to manually load the guest value, and so loading the guest's value into DR6 is handled by {svm,vmx}_vcpu_run(), i.e. is done _inside_ the core run loop.
Unfortunately, saving the guest values on VM-Exit is initiated by common x86, again outside of the core run loop. If the guest modifies DR6 (in hardware, when DR interception is disabled), and then the next VM-Exit is a fastpath VM-Exit, KVM will reload hardware DR6 with vcpu->arch.dr6 and clobber the guest's actual value.
The bug shows up primarily with nested VMX because KVM handles the VMX preemption timer in the fastpath, and the window between hardware DR6 being modified (in guest context) and DR6 being read by guest software is orders of magnitude larger in a nested setup. E.g. in non-nested, the VMX preemption timer would need to fire precisely between #DB injection and the #DB handler's read of DR6, whereas with a KVM-on-KVM setup, the window where hardware DR6 is "dirty" extends all the way from L1 writing DR6 to VMRESUME (in L1).
L1's view: ========== <L1 disables DR interception> CPU 0/KVM-7289 [023] d.... 2925.640961: kvm_entry: vcpu 0 A: L1 Writes DR6 CPU 0/KVM-7289 [023] d.... 2925.640963: <hack>: Set DRs, DR6 = 0xffff0ff1
B: CPU 0/KVM-7289 [023] d.... 2925.640967: kvm_exit: vcpu 0 reason EXTERNAL_INTERRUPT intr_info 0x800000ec
D: L1 reads DR6, arch.dr6 = 0 CPU 0/KVM-7289 [023] d.... 2925.640969: <hack>: Sync DRs, DR6 = 0xffff0ff0
CPU 0/KVM-7289 [023] d.... 2925.640976: kvm_entry: vcpu 0 L2 reads DR6, L1 disables DR interception CPU 0/KVM-7289 [023] d.... 2925.640980: kvm_exit: vcpu 0 reason DR_ACCESS info1 0x0000000000000216 CPU 0/KVM-7289 [023] d.... 2925.640983: kvm_entry: vcpu 0
CPU 0/KVM-7289 [023] d.... 2925.640983: <hack>: Set DRs, DR6 = 0xffff0ff0
L2 detects failure CPU 0/KVM-7289 [023] d.... 2925.640987: kvm_exit: vcpu 0 reason HLT L1 reads DR6 (confirms failure) CPU 0/KVM-7289 [023] d.... 2925.640990: <hack>: Sync DRs, DR6 = 0xffff0ff0
L0's view: ========== L2 reads DR6, arch.dr6 = 0 CPU 23/KVM-5046 [001] d.... 3410.005610: kvm_exit: vcpu 23 reason DR_ACCESS info1 0x0000000000000216 CPU 23/KVM-5046 [001] ..... 3410.005610: kvm_nested_vmexit: vcpu 23 reason DR_ACCESS info1 0x0000000000000216
L2 => L1 nested VM-Exit CPU 23/KVM-5046 [001] ..... 3410.005610: kvm_nested_vmexit_inject: reason: DR_ACCESS ext_inf1: 0x0000000000000216
CPU 23/KVM-5046 [001] d.... 3410.005610: kvm_entry: vcpu 23 CPU 23/KVM-5046 [001] d.... 3410.005611: kvm_exit: vcpu 23 reason VMREAD CPU 23/KVM-5046 [001] d.... 3410.005611: kvm_entry: vcpu 23 CPU 23/KVM-5046 [001] d.... 3410.005612: kvm_exit: vcpu 23 reason VMREAD CPU 23/KVM-5046 [001] d.... 3410.005612: kvm_entry: vcpu 23
L1 writes DR7, L0 disables DR interception CPU 23/KVM-5046 [001] d.... 3410.005612: kvm_exit: vcpu 23 reason DR_ACCESS info1 0x0000000000000007 CPU 23/KVM-5046 [001] d.... 3410.005613: kvm_entry: vcpu 23
L0 writes DR6 = 0 (arch.dr6) CPU 23/KVM-5046 [001] d.... 3410.005613: <hack>: Set DRs, DR6 = 0xffff0ff0
A: <L1 writes DR6 = 1, no interception, arch.dr6 is still '0'>
B: CPU 23/KVM-5046 [001] d.... 3410.005614: kvm_exit: vcpu 23 reason PREEMPTION_TIMER CPU 23/KVM-5046 [001] d.... 3410.005614: kvm_entry: vcpu 23
C: L0 writes DR6 = 0 (arch.dr6) CPU 23/KVM-5046 [001] d.... 3410.005614: <hack>: Set DRs, DR6 = 0xffff0ff0
L1 => L2 nested VM-Enter CPU 23/KVM-5046 [001] d.... 3410.005616: kvm_exit: vcpu 23 reason VMRESUME
L0 reads DR6, arch.dr6 = 0
Reported-by: John Stultz jstultz@google.com Closes: https://lkml.kernel.org/r/CANDhNCq5_F3HfFYABqFGCA1bPd_%2BxgNj-iDQhH4tDk%2Bwi... Fixes: 375e28ffc0cf ("KVM: X86: Set host DR6 only on VMX and for KVM_DEBUGREG_WONT_EXIT") Fixes: d67668e9dd76 ("KVM: x86, SVM: isolate vcpu->arch.dr6 from vmcb->save.dr6") Cc: stable@vger.kernel.org Cc: Jim Mattson jmattson@google.com Tested-by: John Stultz jstultz@google.com Link: https://lore.kernel.org/r/20250125011833.3644371-1-seanjc@google.com Signed-off-by: Sean Christopherson seanjc@google.com [jth: Handled conflicts with kvm_x86_ops reshuffle] Signed-off-by: James Houghton jthoughton@google.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- arch/x86/include/asm/kvm-x86-ops.h | 1 + arch/x86/include/asm/kvm_host.h | 1 + arch/x86/kvm/svm/svm.c | 13 ++++++------- arch/x86/kvm/vmx/vmx.c | 11 +++++++---- arch/x86/kvm/x86.c | 3 +++ 5 files changed, 18 insertions(+), 11 deletions(-)
--- a/arch/x86/include/asm/kvm-x86-ops.h +++ b/arch/x86/include/asm/kvm-x86-ops.h @@ -44,6 +44,7 @@ KVM_X86_OP(set_idt) KVM_X86_OP(get_gdt) KVM_X86_OP(set_gdt) KVM_X86_OP(sync_dirty_debug_regs) +KVM_X86_OP(set_dr6) KVM_X86_OP(set_dr7) KVM_X86_OP(cache_reg) KVM_X86_OP(get_rflags) --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1344,6 +1344,7 @@ struct kvm_x86_ops { void (*get_gdt)(struct kvm_vcpu *vcpu, struct desc_ptr *dt); void (*set_gdt)(struct kvm_vcpu *vcpu, struct desc_ptr *dt); void (*sync_dirty_debug_regs)(struct kvm_vcpu *vcpu); + void (*set_dr6)(struct kvm_vcpu *vcpu, unsigned long value); void (*set_dr7)(struct kvm_vcpu *vcpu, unsigned long value); void (*cache_reg)(struct kvm_vcpu *vcpu, enum kvm_reg reg); unsigned long (*get_rflags)(struct kvm_vcpu *vcpu); --- a/arch/x86/kvm/svm/svm.c +++ b/arch/x86/kvm/svm/svm.c @@ -1887,11 +1887,11 @@ static void new_asid(struct vcpu_svm *sv svm->asid = sd->next_asid++; }
-static void svm_set_dr6(struct vcpu_svm *svm, unsigned long value) +static void svm_set_dr6(struct kvm_vcpu *vcpu, unsigned long value) { - struct vmcb *vmcb = svm->vmcb; + struct vmcb *vmcb = to_svm(vcpu)->vmcb;
- if (svm->vcpu.arch.guest_state_protected) + if (vcpu->arch.guest_state_protected) return;
if (unlikely(value != vmcb->save.dr6)) { @@ -3851,10 +3851,8 @@ static __no_kcsan fastpath_t svm_vcpu_ru * Run with all-zero DR6 unless needed, so that we can get the exact cause * of a #DB. */ - if (unlikely(vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)) - svm_set_dr6(svm, vcpu->arch.dr6); - else - svm_set_dr6(svm, DR6_ACTIVE_LOW); + if (likely(!(vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT))) + svm_set_dr6(vcpu, DR6_ACTIVE_LOW);
clgi(); kvm_load_guest_xsave_state(vcpu); @@ -4631,6 +4629,7 @@ static struct kvm_x86_ops svm_x86_ops __ .set_idt = svm_set_idt, .get_gdt = svm_get_gdt, .set_gdt = svm_set_gdt, + .set_dr6 = svm_set_dr6, .set_dr7 = svm_set_dr7, .sync_dirty_debug_regs = svm_sync_dirty_debug_regs, .cache_reg = svm_cache_reg, --- a/arch/x86/kvm/vmx/vmx.c +++ b/arch/x86/kvm/vmx/vmx.c @@ -5249,6 +5249,12 @@ static void vmx_sync_dirty_debug_regs(st set_debugreg(DR6_RESERVED, 6); }
+static void vmx_set_dr6(struct kvm_vcpu *vcpu, unsigned long val) +{ + lockdep_assert_irqs_disabled(); + set_debugreg(vcpu->arch.dr6, 6); +} + static void vmx_set_dr7(struct kvm_vcpu *vcpu, unsigned long val) { vmcs_writel(GUEST_DR7, val); @@ -6839,10 +6845,6 @@ static fastpath_t vmx_vcpu_run(struct kv vmx->loaded_vmcs->host_state.cr4 = cr4; }
- /* When KVM_DEBUGREG_WONT_EXIT, dr6 is accessible in guest. */ - if (unlikely(vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)) - set_debugreg(vcpu->arch.dr6, 6); - /* When single-stepping over STI and MOV SS, we must clear the * corresponding interruptibility bits in the guest state. Otherwise * vmentry fails as it then expects bit 14 (BS) in pending debug @@ -7777,6 +7779,7 @@ static struct kvm_x86_ops vmx_x86_ops __ .set_idt = vmx_set_idt, .get_gdt = vmx_get_gdt, .set_gdt = vmx_set_gdt, + .set_dr6 = vmx_set_dr6, .set_dr7 = vmx_set_dr7, .sync_dirty_debug_regs = vmx_sync_dirty_debug_regs, .cache_reg = vmx_cache_reg, --- a/arch/x86/kvm/x86.c +++ b/arch/x86/kvm/x86.c @@ -9963,6 +9963,9 @@ static int vcpu_enter_guest(struct kvm_v set_debugreg(vcpu->arch.eff_db[1], 1); set_debugreg(vcpu->arch.eff_db[2], 2); set_debugreg(vcpu->arch.eff_db[3], 3); + /* When KVM_DEBUGREG_WONT_EXIT, dr6 is accessible in guest. */ + if (unlikely(vcpu->arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT)) + static_call(kvm_x86_set_dr6)(vcpu, vcpu->arch.dr6); } else if (unlikely(hw_breakpoint_active())) { set_debugreg(0, 7); }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jakub Kicinski kuba@kernel.org
commit 166c2c8a6a4dc2e4ceba9e10cfe81c3e469e3210 upstream.
If we're redirecting the skb, and haven't called tcf_mirred_forward(), yet, we need to tell the core to drop the skb by setting the retcode to SHOT. If we have called tcf_mirred_forward(), however, the skb is out of our hands and returning SHOT will lead to UaF.
Move the retval override to the error path which actually need it.
Reviewed-by: Michal Swiatkowski michal.swiatkowski@linux.intel.com Fixes: e5cf1baf92cb ("act_mirred: use TC_ACT_REINSERT when possible") Signed-off-by: Jakub Kicinski kuba@kernel.org Acked-by: Jamal Hadi Salim jhs@mojatatu.com Signed-off-by: David S. Miller davem@davemloft.net [Minor conflict resolved due to code context change.] Signed-off-by: Jianqi Ren jianqi.ren.cn@windriver.com Signed-off-by: He Zhe zhe.he@windriver.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- net/sched/act_mirred.c | 22 +++++++++++++--------- 1 file changed, 13 insertions(+), 9 deletions(-)
--- a/net/sched/act_mirred.c +++ b/net/sched/act_mirred.c @@ -254,31 +254,31 @@ static int tcf_mirred_act(struct sk_buff
m_mac_header_xmit = READ_ONCE(m->tcfm_mac_header_xmit); m_eaction = READ_ONCE(m->tcfm_eaction); + is_redirect = tcf_mirred_is_act_redirect(m_eaction); retval = READ_ONCE(m->tcf_action); dev = rcu_dereference_bh(m->tcfm_dev); if (unlikely(!dev)) { pr_notice_once("tc mirred: target device is gone\n"); - goto out; + goto err_cant_do; }
if (unlikely(!(dev->flags & IFF_UP)) || !netif_carrier_ok(dev)) { net_notice_ratelimited("tc mirred to Houston: device %s is down\n", dev->name); - goto out; + goto err_cant_do; }
/* we could easily avoid the clone only if called by ingress and clsact; * since we can't easily detect the clsact caller, skip clone only for * ingress - that covers the TC S/W datapath. */ - is_redirect = tcf_mirred_is_act_redirect(m_eaction); at_ingress = skb_at_tc_ingress(skb); use_reinsert = at_ingress && is_redirect && tcf_mirred_can_reinsert(retval); if (!use_reinsert) { skb2 = skb_clone(skb, GFP_ATOMIC); if (!skb2) - goto out; + goto err_cant_do; }
want_ingress = tcf_mirred_act_wants_ingress(m_eaction); @@ -321,12 +321,16 @@ static int tcf_mirred_act(struct sk_buff }
err = tcf_mirred_forward(want_ingress, skb2); - if (err) { -out: + if (err) tcf_action_inc_overlimit_qstats(&m->common); - if (tcf_mirred_is_act_redirect(m_eaction)) - retval = TC_ACT_SHOT; - } + __this_cpu_dec(mirred_nest_level); + + return retval; + +err_cant_do: + if (is_redirect) + retval = TC_ACT_SHOT; + tcf_action_inc_overlimit_qstats(&m->common); __this_cpu_dec(mirred_nest_level);
return retval;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Maor Gottlieb maorg@nvidia.com
[ Upstream commit 5d1a04f347e6cbf5ffe74da409a5d71fbe8c5f19 ]
Initialize the source MAC address when creating the default GID entry. Since this entry is used only for loopback traffic, it only needs to be a unicast address. A zeroed-out MAC address is sufficient for this purpose. Without this fix, random bits would be assigned as the source address. If these bits formed a multicast address, the firmware would return an error, preventing the user from switching to switchdev mode:
Error: mlx5_core: Failed setting eswitch to offloads. kernel answers: Invalid argument
Fixes: 80f09dfc237f ("net/mlx5: Eswitch, enable RoCE loopback traffic") Signed-off-by: Maor Gottlieb maorg@nvidia.com Signed-off-by: Mark Bloch mbloch@nvidia.com Reviewed-by: Michal Swiatkowski michal.swiatkowski@linux.intel.com Link: https://patch.msgid.link/20250423083611.324567-3-mbloch@nvidia.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mellanox/mlx5/core/rdma.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/rdma.c b/drivers/net/ethernet/mellanox/mlx5/core/rdma.c index 540cf05f63739..ab5afa6c5e0fd 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/rdma.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/rdma.c @@ -130,8 +130,8 @@ static void mlx5_rdma_make_default_gid(struct mlx5_core_dev *dev, union ib_gid *
static int mlx5_rdma_add_roce_addr(struct mlx5_core_dev *dev) { + u8 mac[ETH_ALEN] = {}; union ib_gid gid; - u8 mac[ETH_ALEN];
mlx5_rdma_make_default_gid(dev, &gid); return mlx5_core_roce_gid_set(dev, 0,
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chris Mi cmi@nvidia.com
[ Upstream commit 90538d23278a981e344d364e923162fce752afeb ]
The cited commit assumes enabling roce always succeeds. But it is not true. Add error handling for it.
Fixes: 80f09dfc237f ("net/mlx5: Eswitch, enable RoCE loopback traffic") Signed-off-by: Chris Mi cmi@nvidia.com Reviewed-by: Roi Dayan roid@nvidia.com Reviewed-by: Maor Gottlieb maorg@nvidia.com Signed-off-by: Mark Bloch mbloch@nvidia.com Reviewed-by: Michal Swiatkowski michal.swiatkowski@linux.intel.com Link: https://patch.msgid.link/20250423083611.324567-6-mbloch@nvidia.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/mellanox/mlx5/core/eswitch_offloads.c | 5 ++++- drivers/net/ethernet/mellanox/mlx5/core/rdma.c | 9 +++++---- drivers/net/ethernet/mellanox/mlx5/core/rdma.h | 4 ++-- 3 files changed, 11 insertions(+), 7 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c index 829f703233a9e..766a05f557fba 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch_offloads.c @@ -3138,7 +3138,9 @@ int esw_offloads_enable(struct mlx5_eswitch *esw) int err;
mutex_init(&esw->offloads.termtbl_mutex); - mlx5_rdma_enable_roce(esw->dev); + err = mlx5_rdma_enable_roce(esw->dev); + if (err) + goto err_roce;
err = mlx5_esw_host_number_init(esw); if (err) @@ -3198,6 +3200,7 @@ int esw_offloads_enable(struct mlx5_eswitch *esw) esw_offloads_metadata_uninit(esw); err_metadata: mlx5_rdma_disable_roce(esw->dev); +err_roce: mutex_destroy(&esw->offloads.termtbl_mutex); return err; } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/rdma.c b/drivers/net/ethernet/mellanox/mlx5/core/rdma.c index ab5afa6c5e0fd..e61a4fa46d772 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/rdma.c +++ b/drivers/net/ethernet/mellanox/mlx5/core/rdma.c @@ -152,17 +152,17 @@ void mlx5_rdma_disable_roce(struct mlx5_core_dev *dev) mlx5_nic_vport_disable_roce(dev); }
-void mlx5_rdma_enable_roce(struct mlx5_core_dev *dev) +int mlx5_rdma_enable_roce(struct mlx5_core_dev *dev) { int err;
if (!MLX5_CAP_GEN(dev, roce)) - return; + return 0;
err = mlx5_nic_vport_enable_roce(dev); if (err) { mlx5_core_err(dev, "Failed to enable RoCE: %d\n", err); - return; + return err; }
err = mlx5_rdma_add_roce_addr(dev); @@ -177,10 +177,11 @@ void mlx5_rdma_enable_roce(struct mlx5_core_dev *dev) goto del_roce_addr; }
- return; + return err;
del_roce_addr: mlx5_rdma_del_roce_addr(dev); disable_roce: mlx5_nic_vport_disable_roce(dev); + return err; } diff --git a/drivers/net/ethernet/mellanox/mlx5/core/rdma.h b/drivers/net/ethernet/mellanox/mlx5/core/rdma.h index 750cff2a71a4b..3d9e76c3d42fb 100644 --- a/drivers/net/ethernet/mellanox/mlx5/core/rdma.h +++ b/drivers/net/ethernet/mellanox/mlx5/core/rdma.h @@ -8,12 +8,12 @@
#ifdef CONFIG_MLX5_ESWITCH
-void mlx5_rdma_enable_roce(struct mlx5_core_dev *dev); +int mlx5_rdma_enable_roce(struct mlx5_core_dev *dev); void mlx5_rdma_disable_roce(struct mlx5_core_dev *dev);
#else /* CONFIG_MLX5_ESWITCH */
-static inline void mlx5_rdma_enable_roce(struct mlx5_core_dev *dev) {} +static inline int mlx5_rdma_enable_roce(struct mlx5_core_dev *dev) { return 0; } static inline void mlx5_rdma_disable_roce(struct mlx5_core_dev *dev) {}
#endif /* CONFIG_MLX5_ESWITCH */
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Biao Huang biao.huang@mediatek.com
[ Upstream commit 0a8bd81fd6aaace14979152e0540da8ff158a00a ]
Current driver may lost tx interrupts under bidirectional test with iperf3, which leads to some unexpected issues.
This patch let rx/tx interrupt enable/disable separately, and rx/tx are handled in different NAPIs.
Signed-off-by: Biao Huang biao.huang@mediatek.com Signed-off-by: Yinghua Pan ot_yinghua.pan@mediatek.com Signed-off-by: David S. Miller davem@davemloft.net Stable-dep-of: e54b4db35e20 ("net: ethernet: mtk-star-emac: rearm interrupts in rx_poll only when advised") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mediatek/mtk_star_emac.c | 340 ++++++++++-------- 1 file changed, 199 insertions(+), 141 deletions(-)
diff --git a/drivers/net/ethernet/mediatek/mtk_star_emac.c b/drivers/net/ethernet/mediatek/mtk_star_emac.c index 392648246d8f4..209e79f2c3e8c 100644 --- a/drivers/net/ethernet/mediatek/mtk_star_emac.c +++ b/drivers/net/ethernet/mediatek/mtk_star_emac.c @@ -32,6 +32,7 @@ #define MTK_STAR_SKB_ALIGNMENT 16 #define MTK_STAR_HASHTABLE_MC_LIMIT 256 #define MTK_STAR_HASHTABLE_SIZE_MAX 512 +#define MTK_STAR_DESC_NEEDED (MAX_SKB_FRAGS + 4)
/* Normally we'd use NET_IP_ALIGN but on arm64 its value is 0 and it doesn't * work for this controller. @@ -216,7 +217,8 @@ struct mtk_star_ring_desc_data { struct sk_buff *skb; };
-#define MTK_STAR_RING_NUM_DESCS 128 +#define MTK_STAR_RING_NUM_DESCS 512 +#define MTK_STAR_TX_THRESH (MTK_STAR_RING_NUM_DESCS / 4) #define MTK_STAR_NUM_TX_DESCS MTK_STAR_RING_NUM_DESCS #define MTK_STAR_NUM_RX_DESCS MTK_STAR_RING_NUM_DESCS #define MTK_STAR_NUM_DESCS_TOTAL (MTK_STAR_RING_NUM_DESCS * 2) @@ -246,7 +248,8 @@ struct mtk_star_priv { struct mtk_star_ring rx_ring;
struct mii_bus *mii; - struct napi_struct napi; + struct napi_struct tx_napi; + struct napi_struct rx_napi;
struct device_node *phy_node; phy_interface_t phy_intf; @@ -357,19 +360,16 @@ mtk_star_ring_push_head_tx(struct mtk_star_ring *ring, mtk_star_ring_push_head(ring, desc_data, flags); }
-static unsigned int mtk_star_ring_num_used_descs(struct mtk_star_ring *ring) +static unsigned int mtk_star_tx_ring_avail(struct mtk_star_ring *ring) { - return abs(ring->head - ring->tail); -} + u32 avail;
-static bool mtk_star_ring_full(struct mtk_star_ring *ring) -{ - return mtk_star_ring_num_used_descs(ring) == MTK_STAR_RING_NUM_DESCS; -} + if (ring->tail > ring->head) + avail = ring->tail - ring->head - 1; + else + avail = MTK_STAR_RING_NUM_DESCS - ring->head + ring->tail - 1;
-static bool mtk_star_ring_descs_available(struct mtk_star_ring *ring) -{ - return mtk_star_ring_num_used_descs(ring) > 0; + return avail; }
static dma_addr_t mtk_star_dma_map_rx(struct mtk_star_priv *priv, @@ -414,6 +414,36 @@ static void mtk_star_nic_disable_pd(struct mtk_star_priv *priv) MTK_STAR_BIT_MAC_CFG_NIC_PD); }
+static void mtk_star_enable_dma_irq(struct mtk_star_priv *priv, + bool rx, bool tx) +{ + u32 value; + + regmap_read(priv->regs, MTK_STAR_REG_INT_MASK, &value); + + if (tx) + value &= ~MTK_STAR_BIT_INT_STS_TNTC; + if (rx) + value &= ~MTK_STAR_BIT_INT_STS_FNRC; + + regmap_write(priv->regs, MTK_STAR_REG_INT_MASK, value); +} + +static void mtk_star_disable_dma_irq(struct mtk_star_priv *priv, + bool rx, bool tx) +{ + u32 value; + + regmap_read(priv->regs, MTK_STAR_REG_INT_MASK, &value); + + if (tx) + value |= MTK_STAR_BIT_INT_STS_TNTC; + if (rx) + value |= MTK_STAR_BIT_INT_STS_FNRC; + + regmap_write(priv->regs, MTK_STAR_REG_INT_MASK, value); +} + /* Unmask the three interrupts we care about, mask all others. */ static void mtk_star_intr_enable(struct mtk_star_priv *priv) { @@ -429,20 +459,11 @@ static void mtk_star_intr_disable(struct mtk_star_priv *priv) regmap_write(priv->regs, MTK_STAR_REG_INT_MASK, ~0); }
-static unsigned int mtk_star_intr_read(struct mtk_star_priv *priv) -{ - unsigned int val; - - regmap_read(priv->regs, MTK_STAR_REG_INT_STS, &val); - - return val; -} - static unsigned int mtk_star_intr_ack_all(struct mtk_star_priv *priv) { unsigned int val;
- val = mtk_star_intr_read(priv); + regmap_read(priv->regs, MTK_STAR_REG_INT_STS, &val); regmap_write(priv->regs, MTK_STAR_REG_INT_STS, val);
return val; @@ -714,25 +735,44 @@ static void mtk_star_free_tx_skbs(struct mtk_star_priv *priv) mtk_star_ring_free_skbs(priv, ring, mtk_star_dma_unmap_tx); }
-/* All processing for TX and RX happens in the napi poll callback. - * - * FIXME: The interrupt handling should be more fine-grained with each - * interrupt enabled/disabled independently when needed. Unfortunatly this - * turned out to impact the driver's stability and until we have something - * working properly, we're disabling all interrupts during TX & RX processing - * or when resetting the counter registers. - */ +/** + * mtk_star_handle_irq - Interrupt Handler. + * @irq: interrupt number. + * @data: pointer to a network interface device structure. + * Description : this is the driver interrupt service routine. + * it mainly handles: + * 1. tx complete interrupt for frame transmission. + * 2. rx complete interrupt for frame reception. + * 3. MAC Management Counter interrupt to avoid counter overflow. + **/ static irqreturn_t mtk_star_handle_irq(int irq, void *data) { - struct mtk_star_priv *priv; - struct net_device *ndev; - - ndev = data; - priv = netdev_priv(ndev); + struct net_device *ndev = data; + struct mtk_star_priv *priv = netdev_priv(ndev); + unsigned int intr_status = mtk_star_intr_ack_all(priv); + bool rx, tx; + + rx = (intr_status & MTK_STAR_BIT_INT_STS_FNRC) && + napi_schedule_prep(&priv->rx_napi); + tx = (intr_status & MTK_STAR_BIT_INT_STS_TNTC) && + napi_schedule_prep(&priv->tx_napi); + + if (rx || tx) { + spin_lock(&priv->lock); + /* mask Rx and TX Complete interrupt */ + mtk_star_disable_dma_irq(priv, rx, tx); + spin_unlock(&priv->lock); + + if (rx) + __napi_schedule(&priv->rx_napi); + if (tx) + __napi_schedule(&priv->tx_napi); + }
- if (netif_running(ndev)) { - mtk_star_intr_disable(priv); - napi_schedule(&priv->napi); + /* interrupt is triggered once any counters reach 0x8000000 */ + if (intr_status & MTK_STAR_REG_INT_STS_MIB_CNT_TH) { + mtk_star_update_stats(priv); + mtk_star_reset_counters(priv); }
return IRQ_HANDLED; @@ -955,7 +995,8 @@ static int mtk_star_enable(struct net_device *ndev) if (ret) goto err_free_skbs;
- napi_enable(&priv->napi); + napi_enable(&priv->tx_napi); + napi_enable(&priv->rx_napi);
mtk_star_intr_ack_all(priv); mtk_star_intr_enable(priv); @@ -988,7 +1029,8 @@ static void mtk_star_disable(struct net_device *ndev) struct mtk_star_priv *priv = netdev_priv(ndev);
netif_stop_queue(ndev); - napi_disable(&priv->napi); + napi_disable(&priv->tx_napi); + napi_disable(&priv->rx_napi); mtk_star_intr_disable(priv); mtk_star_dma_disable(priv); mtk_star_intr_ack_all(priv); @@ -1020,13 +1062,45 @@ static int mtk_star_netdev_ioctl(struct net_device *ndev, return phy_mii_ioctl(ndev->phydev, req, cmd); }
-static int mtk_star_netdev_start_xmit(struct sk_buff *skb, - struct net_device *ndev) +static int __mtk_star_maybe_stop_tx(struct mtk_star_priv *priv, u16 size) +{ + netif_stop_queue(priv->ndev); + + /* Might race with mtk_star_tx_poll, check again */ + smp_mb(); + if (likely(mtk_star_tx_ring_avail(&priv->tx_ring) < size)) + return -EBUSY; + + netif_start_queue(priv->ndev); + + return 0; +} + +static inline int mtk_star_maybe_stop_tx(struct mtk_star_priv *priv, u16 size) +{ + if (likely(mtk_star_tx_ring_avail(&priv->tx_ring) >= size)) + return 0; + + return __mtk_star_maybe_stop_tx(priv, size); +} + +static netdev_tx_t mtk_star_netdev_start_xmit(struct sk_buff *skb, + struct net_device *ndev) { struct mtk_star_priv *priv = netdev_priv(ndev); struct mtk_star_ring *ring = &priv->tx_ring; struct device *dev = mtk_star_get_dev(priv); struct mtk_star_ring_desc_data desc_data; + int nfrags = skb_shinfo(skb)->nr_frags; + + if (unlikely(mtk_star_tx_ring_avail(ring) < nfrags + 1)) { + if (!netif_queue_stopped(ndev)) { + netif_stop_queue(ndev); + /* This is a hard error, log it. */ + pr_err_ratelimited("Tx ring full when queue awake\n"); + } + return NETDEV_TX_BUSY; + }
desc_data.dma_addr = mtk_star_dma_map_tx(priv, skb); if (dma_mapping_error(dev, desc_data.dma_addr)) @@ -1034,17 +1108,11 @@ static int mtk_star_netdev_start_xmit(struct sk_buff *skb,
desc_data.skb = skb; desc_data.len = skb->len; - - spin_lock_bh(&priv->lock); - mtk_star_ring_push_head_tx(ring, &desc_data);
netdev_sent_queue(ndev, skb->len);
- if (mtk_star_ring_full(ring)) - netif_stop_queue(ndev); - - spin_unlock_bh(&priv->lock); + mtk_star_maybe_stop_tx(priv, MTK_STAR_DESC_NEEDED);
mtk_star_dma_resume_tx(priv);
@@ -1076,31 +1144,40 @@ static int mtk_star_tx_complete_one(struct mtk_star_priv *priv) return ret; }
-static void mtk_star_tx_complete_all(struct mtk_star_priv *priv) +static int mtk_star_tx_poll(struct napi_struct *napi, int budget) { + struct mtk_star_priv *priv = container_of(napi, struct mtk_star_priv, + tx_napi); + int ret = 0, pkts_compl = 0, bytes_compl = 0, count = 0; struct mtk_star_ring *ring = &priv->tx_ring; struct net_device *ndev = priv->ndev; - int ret, pkts_compl, bytes_compl; - bool wake = false; - - spin_lock(&priv->lock); - - for (pkts_compl = 0, bytes_compl = 0;; - pkts_compl++, bytes_compl += ret, wake = true) { - if (!mtk_star_ring_descs_available(ring)) - break; + unsigned int head = ring->head; + unsigned int entry = ring->tail;
+ while (entry != head && count < (MTK_STAR_RING_NUM_DESCS - 1)) { ret = mtk_star_tx_complete_one(priv); if (ret < 0) break; + + count++; + pkts_compl++; + bytes_compl += ret; + entry = ring->tail; }
netdev_completed_queue(ndev, pkts_compl, bytes_compl);
- if (wake && netif_queue_stopped(ndev)) + if (unlikely(netif_queue_stopped(ndev)) && + (mtk_star_tx_ring_avail(ring) > MTK_STAR_TX_THRESH)) netif_wake_queue(ndev);
- spin_unlock(&priv->lock); + if (napi_complete(napi)) { + spin_lock(&priv->lock); + mtk_star_enable_dma_irq(priv, false, true); + spin_unlock(&priv->lock); + } + + return 0; }
static void mtk_star_netdev_get_stats64(struct net_device *ndev, @@ -1180,7 +1257,7 @@ static const struct ethtool_ops mtk_star_ethtool_ops = { .set_link_ksettings = phy_ethtool_set_link_ksettings, };
-static int mtk_star_receive_packet(struct mtk_star_priv *priv) +static int mtk_star_rx(struct mtk_star_priv *priv, int budget) { struct mtk_star_ring *ring = &priv->rx_ring; struct device *dev = mtk_star_get_dev(priv); @@ -1188,107 +1265,85 @@ static int mtk_star_receive_packet(struct mtk_star_priv *priv) struct net_device *ndev = priv->ndev; struct sk_buff *curr_skb, *new_skb; dma_addr_t new_dma_addr; - int ret; + int ret, count = 0;
- spin_lock(&priv->lock); - ret = mtk_star_ring_pop_tail(ring, &desc_data); - spin_unlock(&priv->lock); - if (ret) - return -1; + while (count < budget) { + ret = mtk_star_ring_pop_tail(ring, &desc_data); + if (ret) + return -1;
- curr_skb = desc_data.skb; + curr_skb = desc_data.skb;
- if ((desc_data.flags & MTK_STAR_DESC_BIT_RX_CRCE) || - (desc_data.flags & MTK_STAR_DESC_BIT_RX_OSIZE)) { - /* Error packet -> drop and reuse skb. */ - new_skb = curr_skb; - goto push_new_skb; - } + if ((desc_data.flags & MTK_STAR_DESC_BIT_RX_CRCE) || + (desc_data.flags & MTK_STAR_DESC_BIT_RX_OSIZE)) { + /* Error packet -> drop and reuse skb. */ + new_skb = curr_skb; + goto push_new_skb; + }
- /* Prepare new skb before receiving the current one. Reuse the current - * skb if we fail at any point. - */ - new_skb = mtk_star_alloc_skb(ndev); - if (!new_skb) { - ndev->stats.rx_dropped++; - new_skb = curr_skb; - goto push_new_skb; - } + /* Prepare new skb before receiving the current one. + * Reuse the current skb if we fail at any point. + */ + new_skb = mtk_star_alloc_skb(ndev); + if (!new_skb) { + ndev->stats.rx_dropped++; + new_skb = curr_skb; + goto push_new_skb; + }
- new_dma_addr = mtk_star_dma_map_rx(priv, new_skb); - if (dma_mapping_error(dev, new_dma_addr)) { - ndev->stats.rx_dropped++; - dev_kfree_skb(new_skb); - new_skb = curr_skb; - netdev_err(ndev, "DMA mapping error of RX descriptor\n"); - goto push_new_skb; - } + new_dma_addr = mtk_star_dma_map_rx(priv, new_skb); + if (dma_mapping_error(dev, new_dma_addr)) { + ndev->stats.rx_dropped++; + dev_kfree_skb(new_skb); + new_skb = curr_skb; + netdev_err(ndev, "DMA mapping error of RX descriptor\n"); + goto push_new_skb; + }
- /* We can't fail anymore at this point: it's safe to unmap the skb. */ - mtk_star_dma_unmap_rx(priv, &desc_data); + /* We can't fail anymore at this point: + * it's safe to unmap the skb. + */ + mtk_star_dma_unmap_rx(priv, &desc_data);
- skb_put(desc_data.skb, desc_data.len); - desc_data.skb->ip_summed = CHECKSUM_NONE; - desc_data.skb->protocol = eth_type_trans(desc_data.skb, ndev); - desc_data.skb->dev = ndev; - netif_receive_skb(desc_data.skb); + skb_put(desc_data.skb, desc_data.len); + desc_data.skb->ip_summed = CHECKSUM_NONE; + desc_data.skb->protocol = eth_type_trans(desc_data.skb, ndev); + desc_data.skb->dev = ndev; + netif_receive_skb(desc_data.skb);
- /* update dma_addr for new skb */ - desc_data.dma_addr = new_dma_addr; + /* update dma_addr for new skb */ + desc_data.dma_addr = new_dma_addr;
push_new_skb: - desc_data.len = skb_tailroom(new_skb); - desc_data.skb = new_skb;
- spin_lock(&priv->lock); - mtk_star_ring_push_head_rx(ring, &desc_data); - spin_unlock(&priv->lock); - - return 0; -} - -static int mtk_star_process_rx(struct mtk_star_priv *priv, int budget) -{ - int received, ret; + count++;
- for (received = 0, ret = 0; received < budget && ret == 0; received++) - ret = mtk_star_receive_packet(priv); + desc_data.len = skb_tailroom(new_skb); + desc_data.skb = new_skb; + mtk_star_ring_push_head_rx(ring, &desc_data); + }
mtk_star_dma_resume_rx(priv);
- return received; + return count; }
-static int mtk_star_poll(struct napi_struct *napi, int budget) +static int mtk_star_rx_poll(struct napi_struct *napi, int budget) { struct mtk_star_priv *priv; - unsigned int status; - int received = 0; - - priv = container_of(napi, struct mtk_star_priv, napi); - - status = mtk_star_intr_read(priv); - mtk_star_intr_ack_all(priv); - - if (status & MTK_STAR_BIT_INT_STS_TNTC) - /* Clean-up all TX descriptors. */ - mtk_star_tx_complete_all(priv); + int work_done = 0;
- if (status & MTK_STAR_BIT_INT_STS_FNRC) - /* Receive up to $budget packets. */ - received = mtk_star_process_rx(priv, budget); + priv = container_of(napi, struct mtk_star_priv, rx_napi);
- if (unlikely(status & MTK_STAR_REG_INT_STS_MIB_CNT_TH)) { - mtk_star_update_stats(priv); - mtk_star_reset_counters(priv); + work_done = mtk_star_rx(priv, budget); + if (work_done < budget) { + napi_complete_done(napi, work_done); + spin_lock(&priv->lock); + mtk_star_enable_dma_irq(priv, true, false); + spin_unlock(&priv->lock); }
- if (received < budget) - napi_complete_done(napi, received); - - mtk_star_intr_enable(priv); - - return received; + return work_done; }
static void mtk_star_mdio_rwok_clear(struct mtk_star_priv *priv) @@ -1551,7 +1606,10 @@ static int mtk_star_probe(struct platform_device *pdev) ndev->netdev_ops = &mtk_star_netdev_ops; ndev->ethtool_ops = &mtk_star_ethtool_ops;
- netif_napi_add(ndev, &priv->napi, mtk_star_poll, NAPI_POLL_WEIGHT); + netif_napi_add(ndev, &priv->rx_napi, mtk_star_rx_poll, + NAPI_POLL_WEIGHT); + netif_tx_napi_add(ndev, &priv->tx_napi, mtk_star_tx_poll, + NAPI_POLL_WEIGHT);
phydev = of_phy_find_device(priv->phy_node); if (phydev) {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Louis-Alexis Eyraud louisalexis.eyraud@collabora.com
[ Upstream commit 6fe0866014486736cc3ba1c6fd4606d3dbe55c9c ]
Use spin_lock_irqsave and spin_unlock_irqrestore instead of spin_lock and spin_unlock in mtk_star_emac driver to avoid spinlock recursion occurrence that can happen when enabling the DMA interrupts again in rx/tx poll.
``` BUG: spinlock recursion on CPU#0, swapper/0/0 lock: 0xffff00000db9cf20, .magic: dead4ead, .owner: swapper/0/0, .owner_cpu: 0 CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.15.0-rc2-next-20250417-00001-gf6a27738686c-dirty #28 PREEMPT Hardware name: MediaTek MT8365 Open Platform EVK (DT) Call trace: show_stack+0x18/0x24 (C) dump_stack_lvl+0x60/0x80 dump_stack+0x18/0x24 spin_dump+0x78/0x88 do_raw_spin_lock+0x11c/0x120 _raw_spin_lock+0x20/0x2c mtk_star_handle_irq+0xc0/0x22c [mtk_star_emac] __handle_irq_event_percpu+0x48/0x140 handle_irq_event+0x4c/0xb0 handle_fasteoi_irq+0xa0/0x1bc handle_irq_desc+0x34/0x58 generic_handle_domain_irq+0x1c/0x28 gic_handle_irq+0x4c/0x120 do_interrupt_handler+0x50/0x84 el1_interrupt+0x34/0x68 el1h_64_irq_handler+0x18/0x24 el1h_64_irq+0x6c/0x70 regmap_mmio_read32le+0xc/0x20 (P) _regmap_bus_reg_read+0x6c/0xac _regmap_read+0x60/0xdc regmap_read+0x4c/0x80 mtk_star_rx_poll+0x2f4/0x39c [mtk_star_emac] __napi_poll+0x38/0x188 net_rx_action+0x164/0x2c0 handle_softirqs+0x100/0x244 __do_softirq+0x14/0x20 ____do_softirq+0x10/0x20 call_on_irq_stack+0x24/0x64 do_softirq_own_stack+0x1c/0x40 __irq_exit_rcu+0xd4/0x10c irq_exit_rcu+0x10/0x1c el1_interrupt+0x38/0x68 el1h_64_irq_handler+0x18/0x24 el1h_64_irq+0x6c/0x70 cpuidle_enter_state+0xac/0x320 (P) cpuidle_enter+0x38/0x50 do_idle+0x1e4/0x260 cpu_startup_entry+0x34/0x3c rest_init+0xdc/0xe0 console_on_rootfs+0x0/0x6c __primary_switched+0x88/0x90 ```
Fixes: 0a8bd81fd6aa ("net: ethernet: mtk-star-emac: separate tx/rx handling with two NAPIs") Signed-off-by: Louis-Alexis Eyraud louisalexis.eyraud@collabora.com Reviewed-by: Maxime Chevallier maxime.chevallier@bootlin.com Acked-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Link: https://patch.msgid.link/20250424-mtk_star_emac-fix-spinlock-recursion-issue... Signed-off-by: Jakub Kicinski kuba@kernel.org Stable-dep-of: e54b4db35e20 ("net: ethernet: mtk-star-emac: rearm interrupts in rx_poll only when advised") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mediatek/mtk_star_emac.c | 10 ++++++---- 1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/mediatek/mtk_star_emac.c b/drivers/net/ethernet/mediatek/mtk_star_emac.c index 209e79f2c3e8c..c7155e0102232 100644 --- a/drivers/net/ethernet/mediatek/mtk_star_emac.c +++ b/drivers/net/ethernet/mediatek/mtk_star_emac.c @@ -1153,6 +1153,7 @@ static int mtk_star_tx_poll(struct napi_struct *napi, int budget) struct net_device *ndev = priv->ndev; unsigned int head = ring->head; unsigned int entry = ring->tail; + unsigned long flags;
while (entry != head && count < (MTK_STAR_RING_NUM_DESCS - 1)) { ret = mtk_star_tx_complete_one(priv); @@ -1172,9 +1173,9 @@ static int mtk_star_tx_poll(struct napi_struct *napi, int budget) netif_wake_queue(ndev);
if (napi_complete(napi)) { - spin_lock(&priv->lock); + spin_lock_irqsave(&priv->lock, flags); mtk_star_enable_dma_irq(priv, false, true); - spin_unlock(&priv->lock); + spin_unlock_irqrestore(&priv->lock, flags); }
return 0; @@ -1331,6 +1332,7 @@ static int mtk_star_rx(struct mtk_star_priv *priv, int budget) static int mtk_star_rx_poll(struct napi_struct *napi, int budget) { struct mtk_star_priv *priv; + unsigned long flags; int work_done = 0;
priv = container_of(napi, struct mtk_star_priv, rx_napi); @@ -1338,9 +1340,9 @@ static int mtk_star_rx_poll(struct napi_struct *napi, int budget) work_done = mtk_star_rx(priv, budget); if (work_done < budget) { napi_complete_done(napi, work_done); - spin_lock(&priv->lock); + spin_lock_irqsave(&priv->lock, flags); mtk_star_enable_dma_irq(priv, true, false); - spin_unlock(&priv->lock); + spin_unlock_irqrestore(&priv->lock, flags); }
return work_done;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Louis-Alexis Eyraud louisalexis.eyraud@collabora.com
[ Upstream commit e54b4db35e201a9173da9cb7abc8377e12abaf87 ]
In mtk_star_rx_poll function, on event processing completion, the mtk_star_emac driver calls napi_complete_done but ignores its return code and enable RX DMA interrupts inconditionally. This return code gives the info if a device should avoid rearming its interrupts or not, so fix this behaviour by taking it into account.
Fixes: 8c7bd5a454ff ("net: ethernet: mtk-star-emac: new driver") Signed-off-by: Louis-Alexis Eyraud louisalexis.eyraud@collabora.com Acked-by: Bartosz Golaszewski bartosz.golaszewski@linaro.org Link: https://patch.msgid.link/20250424-mtk_star_emac-fix-spinlock-recursion-issue... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/mediatek/mtk_star_emac.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/mediatek/mtk_star_emac.c b/drivers/net/ethernet/mediatek/mtk_star_emac.c index c7155e0102232..639cf1c27dbd4 100644 --- a/drivers/net/ethernet/mediatek/mtk_star_emac.c +++ b/drivers/net/ethernet/mediatek/mtk_star_emac.c @@ -1338,8 +1338,7 @@ static int mtk_star_rx_poll(struct napi_struct *napi, int budget) priv = container_of(napi, struct mtk_star_priv, rx_napi);
work_done = mtk_star_rx(priv, budget); - if (work_done < budget) { - napi_complete_done(napi, work_done); + if (work_done < budget && napi_complete_done(napi, work_done)) { spin_lock_irqsave(&priv->lock, flags); mtk_star_enable_dma_irq(priv, true, false); spin_unlock_irqrestore(&priv->lock, flags);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Victor Nogueira victor@mojatatu.com
[ Upstream commit f99a3fbf023e20b626be4b0f042463d598050c9a ]
As described in Gerrard's report [1], there are use cases where a netem child qdisc will make the parent qdisc's enqueue callback reentrant. In the case of drr, there won't be a UAF, but the code will add the same classifier to the list twice, which will cause memory corruption.
In addition to checking for qlen being zero, this patch checks whether the class was already added to the active_list (cl_is_active) before adding to the list to cover for the reentrant case.
[1] https://lore.kernel.org/netdev/CAHcdcOm+03OD2j6R0=YHKqmy=VgJ8xEOKuP6c7mSgnp-...
Fixes: 37d9cf1a3ce3 ("sched: Fix detection of empty queues in child qdiscs") Acked-by: Jamal Hadi Salim jhs@mojatatu.com Signed-off-by: Victor Nogueira victor@mojatatu.com Link: https://patch.msgid.link/20250425220710.3964791-2-victor@mojatatu.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/sch_drr.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/net/sched/sch_drr.c b/net/sched/sch_drr.c index 80a88e208d2bc..e33a72c356c87 100644 --- a/net/sched/sch_drr.c +++ b/net/sched/sch_drr.c @@ -36,6 +36,11 @@ struct drr_sched { struct Qdisc_class_hash clhash; };
+static bool cl_is_active(struct drr_class *cl) +{ + return !list_empty(&cl->alist); +} + static struct drr_class *drr_find_class(struct Qdisc *sch, u32 classid) { struct drr_sched *q = qdisc_priv(sch); @@ -345,7 +350,6 @@ static int drr_enqueue(struct sk_buff *skb, struct Qdisc *sch, struct drr_sched *q = qdisc_priv(sch); struct drr_class *cl; int err = 0; - bool first;
cl = drr_classify(skb, sch, &err); if (cl == NULL) { @@ -355,7 +359,6 @@ static int drr_enqueue(struct sk_buff *skb, struct Qdisc *sch, return err; }
- first = !cl->qdisc->q.qlen; err = qdisc_enqueue(skb, cl->qdisc, to_free); if (unlikely(err != NET_XMIT_SUCCESS)) { if (net_xmit_drop_count(err)) { @@ -365,7 +368,7 @@ static int drr_enqueue(struct sk_buff *skb, struct Qdisc *sch, return err; }
- if (first) { + if (!cl_is_active(cl)) { list_add_tail(&cl->alist, &q->active); cl->deficit = cl->quantum; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Victor Nogueira victor@mojatatu.com
[ Upstream commit 141d34391abbb315d68556b7c67ad97885407547 ]
As described in Gerrard's report [1], we have a UAF case when an hfsc class has a netem child qdisc. The crux of the issue is that hfsc is assuming that checking for cl->qdisc->q.qlen == 0 guarantees that it hasn't inserted the class in the vttree or eltree (which is not true for the netem duplicate case).
This patch checks the n_active class variable to make sure that the code won't insert the class in the vttree or eltree twice, catering for the reentrant case.
[1] https://lore.kernel.org/netdev/CAHcdcOm+03OD2j6R0=YHKqmy=VgJ8xEOKuP6c7mSgnp-...
Fixes: 37d9cf1a3ce3 ("sched: Fix detection of empty queues in child qdiscs") Reported-by: Gerrard Tai gerrard.tai@starlabs.sg Acked-by: Jamal Hadi Salim jhs@mojatatu.com Signed-off-by: Victor Nogueira victor@mojatatu.com Link: https://patch.msgid.link/20250425220710.3964791-3-victor@mojatatu.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/sch_hfsc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/sched/sch_hfsc.c b/net/sched/sch_hfsc.c index 85c296664c9ab..d6c5fc543f652 100644 --- a/net/sched/sch_hfsc.c +++ b/net/sched/sch_hfsc.c @@ -1572,7 +1572,7 @@ hfsc_enqueue(struct sk_buff *skb, struct Qdisc *sch, struct sk_buff **to_free) return err; }
- if (first) { + if (first && !cl->cl_nactive) { if (cl->cl_flags & HFSC_RSC) init_ed(cl, len); if (cl->cl_flags & HFSC_FSC)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Victor Nogueira victor@mojatatu.com
[ Upstream commit 1a6d0c00fa07972384b0c308c72db091d49988b6 ]
As described in Gerrard's report [1], there are use cases where a netem child qdisc will make the parent qdisc's enqueue callback reentrant. In the case of ets, there won't be a UAF, but the code will add the same classifier to the list twice, which will cause memory corruption.
In addition to checking for qlen being zero, this patch checks whether the class was already added to the active_list (cl_is_active) before doing the addition to cater for the reentrant case.
[1] https://lore.kernel.org/netdev/CAHcdcOm+03OD2j6R0=YHKqmy=VgJ8xEOKuP6c7mSgnp-...
Fixes: 37d9cf1a3ce3 ("sched: Fix detection of empty queues in child qdiscs") Acked-by: Jamal Hadi Salim jhs@mojatatu.com Signed-off-by: Victor Nogueira victor@mojatatu.com Link: https://patch.msgid.link/20250425220710.3964791-4-victor@mojatatu.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/sch_ets.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/net/sched/sch_ets.c b/net/sched/sch_ets.c index d686ea7e8db49..07fae45f58732 100644 --- a/net/sched/sch_ets.c +++ b/net/sched/sch_ets.c @@ -74,6 +74,11 @@ static const struct nla_policy ets_class_policy[TCA_ETS_MAX + 1] = { [TCA_ETS_QUANTA_BAND] = { .type = NLA_U32 }, };
+static bool cl_is_active(struct ets_class *cl) +{ + return !list_empty(&cl->alist); +} + static int ets_quantum_parse(struct Qdisc *sch, const struct nlattr *attr, unsigned int *quantum, struct netlink_ext_ack *extack) @@ -424,7 +429,6 @@ static int ets_qdisc_enqueue(struct sk_buff *skb, struct Qdisc *sch, struct ets_sched *q = qdisc_priv(sch); struct ets_class *cl; int err = 0; - bool first;
cl = ets_classify(skb, sch, &err); if (!cl) { @@ -434,7 +438,6 @@ static int ets_qdisc_enqueue(struct sk_buff *skb, struct Qdisc *sch, return err; }
- first = !cl->qdisc->q.qlen; err = qdisc_enqueue(skb, cl->qdisc, to_free); if (unlikely(err != NET_XMIT_SUCCESS)) { if (net_xmit_drop_count(err)) { @@ -444,7 +447,7 @@ static int ets_qdisc_enqueue(struct sk_buff *skb, struct Qdisc *sch, return err; }
- if (first && !ets_class_is_strict(q, cl)) { + if (!cl_is_active(cl) && !ets_class_is_strict(q, cl)) { list_add_tail(&cl->alist, &q->active); cl->deficit = cl->quantum; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Victor Nogueira victor@mojatatu.com
[ Upstream commit f139f37dcdf34b67f5bf92bc8e0f7f6b3ac63aa4 ]
As described in Gerrard's report [1], there are use cases where a netem child qdisc will make the parent qdisc's enqueue callback reentrant. In the case of qfq, there won't be a UAF, but the code will add the same classifier to the list twice, which will cause memory corruption.
This patch checks whether the class was already added to the agg->active list (cl_is_active) before doing the addition to cater for the reentrant case.
[1] https://lore.kernel.org/netdev/CAHcdcOm+03OD2j6R0=YHKqmy=VgJ8xEOKuP6c7mSgnp-...
Fixes: 37d9cf1a3ce3 ("sched: Fix detection of empty queues in child qdiscs") Acked-by: Jamal Hadi Salim jhs@mojatatu.com Signed-off-by: Victor Nogueira victor@mojatatu.com Link: https://patch.msgid.link/20250425220710.3964791-5-victor@mojatatu.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/sched/sch_qfq.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/net/sched/sch_qfq.c b/net/sched/sch_qfq.c index b1dbe03dde1b5..a198145f1251f 100644 --- a/net/sched/sch_qfq.c +++ b/net/sched/sch_qfq.c @@ -204,6 +204,11 @@ struct qfq_sched { */ enum update_reason {enqueue, requeue};
+static bool cl_is_active(struct qfq_class *cl) +{ + return !list_empty(&cl->alist); +} + static struct qfq_class *qfq_find_class(struct Qdisc *sch, u32 classid) { struct qfq_sched *q = qdisc_priv(sch); @@ -1223,7 +1228,6 @@ static int qfq_enqueue(struct sk_buff *skb, struct Qdisc *sch, struct qfq_class *cl; struct qfq_aggregate *agg; int err = 0; - bool first;
cl = qfq_classify(skb, sch, &err); if (cl == NULL) { @@ -1245,7 +1249,6 @@ static int qfq_enqueue(struct sk_buff *skb, struct Qdisc *sch, }
gso_segs = skb_is_gso(skb) ? skb_shinfo(skb)->gso_segs : 1; - first = !cl->qdisc->q.qlen; err = qdisc_enqueue(skb, cl->qdisc, to_free); if (unlikely(err != NET_XMIT_SUCCESS)) { pr_debug("qfq_enqueue: enqueue failed %d\n", err); @@ -1262,8 +1265,8 @@ static int qfq_enqueue(struct sk_buff *skb, struct Qdisc *sch, ++sch->q.qlen;
agg = cl->agg; - /* if the queue was not empty, then done here */ - if (!first) { + /* if the class is active, then done here */ + if (cl_is_active(cl)) { if (unlikely(skb == cl->qdisc->ops->peek(cl->qdisc)) && list_first_entry(&agg->active, struct qfq_class, alist) == cl && cl->deficit < len)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Brett Creeley brett.creeley@intel.com
[ Upstream commit fabf480bf95d71c9cfe8a8d6307e0035df963a6a ]
Some of the promiscuous mode functions take a boolean to indicate set/clear, which affects readability. Refactor and provide an interface for the promiscuous mode code with explicit set and clear promiscuous mode operations.
Signed-off-by: Brett Creeley brett.creeley@intel.com Tested-by: Konrad Jankowski konrad0.jankowski@intel.com Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Stable-dep-of: 425c5f266b2e ("ice: Check VF VSI Pointer Value in ice_vc_add_fdir_fltr()") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ice/ice_fltr.c | 58 ++++++++ drivers/net/ethernet/intel/ice/ice_fltr.h | 12 ++ drivers/net/ethernet/intel/ice/ice_main.c | 49 +++--- .../net/ethernet/intel/ice/ice_virtchnl_pf.c | 139 +++++++----------- 4 files changed, 156 insertions(+), 102 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice_fltr.c b/drivers/net/ethernet/intel/ice/ice_fltr.c index e27b4de7e7aa3..7536451cb09ef 100644 --- a/drivers/net/ethernet/intel/ice/ice_fltr.c +++ b/drivers/net/ethernet/intel/ice/ice_fltr.c @@ -46,6 +46,64 @@ ice_fltr_add_entry_to_list(struct device *dev, struct ice_fltr_info *info, return 0; }
+/** + * ice_fltr_set_vlan_vsi_promisc + * @hw: pointer to the hardware structure + * @vsi: the VSI being configured + * @promisc_mask: mask of promiscuous config bits + * + * Set VSI with all associated VLANs to given promiscuous mode(s) + */ +enum ice_status +ice_fltr_set_vlan_vsi_promisc(struct ice_hw *hw, struct ice_vsi *vsi, + u8 promisc_mask) +{ + return ice_set_vlan_vsi_promisc(hw, vsi->idx, promisc_mask, false); +} + +/** + * ice_fltr_clear_vlan_vsi_promisc + * @hw: pointer to the hardware structure + * @vsi: the VSI being configured + * @promisc_mask: mask of promiscuous config bits + * + * Clear VSI with all associated VLANs to given promiscuous mode(s) + */ +enum ice_status +ice_fltr_clear_vlan_vsi_promisc(struct ice_hw *hw, struct ice_vsi *vsi, + u8 promisc_mask) +{ + return ice_set_vlan_vsi_promisc(hw, vsi->idx, promisc_mask, true); +} + +/** + * ice_fltr_clear_vsi_promisc - clear specified promiscuous mode(s) + * @hw: pointer to the hardware structure + * @vsi_handle: VSI handle to clear mode + * @promisc_mask: mask of promiscuous config bits to clear + * @vid: VLAN ID to clear VLAN promiscuous + */ +enum ice_status +ice_fltr_clear_vsi_promisc(struct ice_hw *hw, u16 vsi_handle, u8 promisc_mask, + u16 vid) +{ + return ice_clear_vsi_promisc(hw, vsi_handle, promisc_mask, vid); +} + +/** + * ice_fltr_set_vsi_promisc - set given VSI to given promiscuous mode(s) + * @hw: pointer to the hardware structure + * @vsi_handle: VSI handle to configure + * @promisc_mask: mask of promiscuous config bits + * @vid: VLAN ID to set VLAN promiscuous + */ +enum ice_status +ice_fltr_set_vsi_promisc(struct ice_hw *hw, u16 vsi_handle, u8 promisc_mask, + u16 vid) +{ + return ice_set_vsi_promisc(hw, vsi_handle, promisc_mask, vid); +} + /** * ice_fltr_add_mac_list - add list of MAC filters * @vsi: pointer to VSI struct diff --git a/drivers/net/ethernet/intel/ice/ice_fltr.h b/drivers/net/ethernet/intel/ice/ice_fltr.h index 361cb4da9b43b..a0e8226f64f61 100644 --- a/drivers/net/ethernet/intel/ice/ice_fltr.h +++ b/drivers/net/ethernet/intel/ice/ice_fltr.h @@ -6,6 +6,18 @@
void ice_fltr_free_list(struct device *dev, struct list_head *h); enum ice_status +ice_fltr_set_vlan_vsi_promisc(struct ice_hw *hw, struct ice_vsi *vsi, + u8 promisc_mask); +enum ice_status +ice_fltr_clear_vlan_vsi_promisc(struct ice_hw *hw, struct ice_vsi *vsi, + u8 promisc_mask); +enum ice_status +ice_fltr_clear_vsi_promisc(struct ice_hw *hw, u16 vsi_handle, u8 promisc_mask, + u16 vid); +enum ice_status +ice_fltr_set_vsi_promisc(struct ice_hw *hw, u16 vsi_handle, u8 promisc_mask, + u16 vid); +enum ice_status ice_fltr_add_mac_to_list(struct ice_vsi *vsi, struct list_head *list, const u8 *mac, enum ice_sw_fwd_act_type action); enum ice_status diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c index 329bf24a3f0e5..735f8cef6bfa4 100644 --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -222,32 +222,45 @@ static bool ice_vsi_fltr_changed(struct ice_vsi *vsi) }
/** - * ice_cfg_promisc - Enable or disable promiscuous mode for a given PF + * ice_set_promisc - Enable promiscuous mode for a given PF * @vsi: the VSI being configured * @promisc_m: mask of promiscuous config bits - * @set_promisc: enable or disable promisc flag request * */ -static int ice_cfg_promisc(struct ice_vsi *vsi, u8 promisc_m, bool set_promisc) +static int ice_set_promisc(struct ice_vsi *vsi, u8 promisc_m) { - struct ice_hw *hw = &vsi->back->hw; - enum ice_status status = 0; + enum ice_status status;
if (vsi->type != ICE_VSI_PF) return 0;
- if (vsi->num_vlan > 1) { - status = ice_set_vlan_vsi_promisc(hw, vsi->idx, promisc_m, - set_promisc); - } else { - if (set_promisc) - status = ice_set_vsi_promisc(hw, vsi->idx, promisc_m, - 0); - else - status = ice_clear_vsi_promisc(hw, vsi->idx, promisc_m, - 0); - } + if (vsi->num_vlan > 1) + status = ice_fltr_set_vlan_vsi_promisc(&vsi->back->hw, vsi, promisc_m); + else + status = ice_fltr_set_vsi_promisc(&vsi->back->hw, vsi->idx, promisc_m, 0); + if (status) + return -EIO; + + return 0; +}
+/** + * ice_clear_promisc - Disable promiscuous mode for a given PF + * @vsi: the VSI being configured + * @promisc_m: mask of promiscuous config bits + * + */ +static int ice_clear_promisc(struct ice_vsi *vsi, u8 promisc_m) +{ + enum ice_status status; + + if (vsi->type != ICE_VSI_PF) + return 0; + + if (vsi->num_vlan > 1) + status = ice_fltr_clear_vlan_vsi_promisc(&vsi->back->hw, vsi, promisc_m); + else + status = ice_fltr_clear_vsi_promisc(&vsi->back->hw, vsi->idx, promisc_m, 0); if (status) return -EIO;
@@ -343,7 +356,7 @@ static int ice_vsi_sync_fltr(struct ice_vsi *vsi) else promisc_m = ICE_MCAST_PROMISC_BITS;
- err = ice_cfg_promisc(vsi, promisc_m, true); + err = ice_set_promisc(vsi, promisc_m); if (err) { netdev_err(netdev, "Error setting Multicast promiscuous mode on VSI %i\n", vsi->vsi_num); @@ -357,7 +370,7 @@ static int ice_vsi_sync_fltr(struct ice_vsi *vsi) else promisc_m = ICE_MCAST_PROMISC_BITS;
- err = ice_cfg_promisc(vsi, promisc_m, false); + err = ice_clear_promisc(vsi, promisc_m); if (err) { netdev_err(netdev, "Error clearing Multicast promiscuous mode on VSI %i\n", vsi->vsi_num); diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c b/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c index 9d4d58757e040..e4e25f3ba8493 100644 --- a/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c +++ b/drivers/net/ethernet/intel/ice/ice_virtchnl_pf.c @@ -286,37 +286,6 @@ static int ice_check_vf_init(struct ice_pf *pf, struct ice_vf *vf) return 0; }
-/** - * ice_err_to_virt_err - translate errors for VF return code - * @ice_err: error return code - */ -static enum virtchnl_status_code ice_err_to_virt_err(enum ice_status ice_err) -{ - switch (ice_err) { - case ICE_SUCCESS: - return VIRTCHNL_STATUS_SUCCESS; - case ICE_ERR_BAD_PTR: - case ICE_ERR_INVAL_SIZE: - case ICE_ERR_DEVICE_NOT_SUPPORTED: - case ICE_ERR_PARAM: - case ICE_ERR_CFG: - return VIRTCHNL_STATUS_ERR_PARAM; - case ICE_ERR_NO_MEMORY: - return VIRTCHNL_STATUS_ERR_NO_MEMORY; - case ICE_ERR_NOT_READY: - case ICE_ERR_RESET_FAILED: - case ICE_ERR_FW_API_VER: - case ICE_ERR_AQ_ERROR: - case ICE_ERR_AQ_TIMEOUT: - case ICE_ERR_AQ_FULL: - case ICE_ERR_AQ_NO_WORK: - case ICE_ERR_AQ_EMPTY: - return VIRTCHNL_STATUS_ERR_ADMIN_QUEUE_ERROR; - default: - return VIRTCHNL_STATUS_ERR_NOT_SUPPORTED; - } -} - /** * ice_vc_vf_broadcast - Broadcast a message to all VFs on PF * @pf: pointer to the PF structure @@ -1301,45 +1270,50 @@ static void ice_clear_vf_reset_trigger(struct ice_vf *vf) ice_flush(hw); }
-/** - * ice_vf_set_vsi_promisc - set given VF VSI to given promiscuous mode(s) - * @vf: pointer to the VF info - * @vsi: the VSI being configured - * @promisc_m: mask of promiscuous config bits - * @rm_promisc: promisc flag request from the VF to remove or add filter - * - * This function configures VF VSI promiscuous mode, based on the VF requests, - * for Unicast, Multicast and VLAN - */ -static enum ice_status -ice_vf_set_vsi_promisc(struct ice_vf *vf, struct ice_vsi *vsi, u8 promisc_m, - bool rm_promisc) +static int +ice_vf_set_vsi_promisc(struct ice_vf *vf, struct ice_vsi *vsi, u8 promisc_m) { - struct ice_pf *pf = vf->pf; - enum ice_status status = 0; - struct ice_hw *hw; + struct ice_hw *hw = &vsi->back->hw; + enum ice_status status;
- hw = &pf->hw; - if (vsi->num_vlan) { - status = ice_set_vlan_vsi_promisc(hw, vsi->idx, promisc_m, - rm_promisc); - } else if (vf->port_vlan_info) { - if (rm_promisc) - status = ice_clear_vsi_promisc(hw, vsi->idx, promisc_m, - vf->port_vlan_info); - else - status = ice_set_vsi_promisc(hw, vsi->idx, promisc_m, - vf->port_vlan_info); - } else { - if (rm_promisc) - status = ice_clear_vsi_promisc(hw, vsi->idx, promisc_m, - 0); - else - status = ice_set_vsi_promisc(hw, vsi->idx, promisc_m, - 0); + if (vf->port_vlan_info) + status = ice_fltr_set_vsi_promisc(hw, vsi->idx, promisc_m, + vf->port_vlan_info & VLAN_VID_MASK); + else if (vsi->num_vlan > 1) + status = ice_fltr_set_vlan_vsi_promisc(hw, vsi, promisc_m); + else + status = ice_fltr_set_vsi_promisc(hw, vsi->idx, promisc_m, 0); + + if (status && status != ICE_ERR_ALREADY_EXISTS) { + dev_err(ice_pf_to_dev(vsi->back), "enable Tx/Rx filter promiscuous mode on VF-%u failed, error: %s\n", + vf->vf_id, ice_stat_str(status)); + return ice_status_to_errno(status); + } + + return 0; +} + +static int +ice_vf_clear_vsi_promisc(struct ice_vf *vf, struct ice_vsi *vsi, u8 promisc_m) +{ + struct ice_hw *hw = &vsi->back->hw; + enum ice_status status; + + if (vf->port_vlan_info) + status = ice_fltr_clear_vsi_promisc(hw, vsi->idx, promisc_m, + vf->port_vlan_info & VLAN_VID_MASK); + else if (vsi->num_vlan > 1) + status = ice_fltr_clear_vlan_vsi_promisc(hw, vsi, promisc_m); + else + status = ice_fltr_clear_vsi_promisc(hw, vsi->idx, promisc_m, 0); + + if (status && status != ICE_ERR_DOES_NOT_EXIST) { + dev_err(ice_pf_to_dev(vsi->back), "disable Tx/Rx filter promiscuous mode on VF-%u failed, error: %s\n", + vf->vf_id, ice_stat_str(status)); + return ice_status_to_errno(status); }
- return status; + return 0; }
static void ice_vf_clear_counters(struct ice_vf *vf) @@ -1700,7 +1674,7 @@ bool ice_reset_vf(struct ice_vf *vf, bool is_vflr) else promisc_m = ICE_UCAST_PROMISC_BITS;
- if (ice_vf_set_vsi_promisc(vf, vsi, promisc_m, true)) + if (ice_vf_clear_vsi_promisc(vf, vsi, promisc_m)) dev_err(dev, "disabling promiscuous mode failed\n"); }
@@ -2952,10 +2926,10 @@ bool ice_is_any_vf_in_promisc(struct ice_pf *pf) static int ice_vc_cfg_promiscuous_mode_msg(struct ice_vf *vf, u8 *msg) { enum virtchnl_status_code v_ret = VIRTCHNL_STATUS_SUCCESS; - enum ice_status mcast_status = 0, ucast_status = 0; bool rm_promisc, alluni = false, allmulti = false; struct virtchnl_promisc_info *info = (struct virtchnl_promisc_info *)msg; + int mcast_err = 0, ucast_err = 0; struct ice_pf *pf = vf->pf; struct ice_vsi *vsi; struct device *dev; @@ -3052,24 +3026,21 @@ static int ice_vc_cfg_promiscuous_mode_msg(struct ice_vf *vf, u8 *msg) ucast_m = ICE_UCAST_PROMISC_BITS; }
- ucast_status = ice_vf_set_vsi_promisc(vf, vsi, ucast_m, - !alluni); - if (ucast_status) { - dev_err(dev, "%sable Tx/Rx filter promiscuous mode on VF-%d failed\n", - alluni ? "en" : "dis", vf->vf_id); - v_ret = ice_err_to_virt_err(ucast_status); - } + if (alluni) + ucast_err = ice_vf_set_vsi_promisc(vf, vsi, ucast_m); + else + ucast_err = ice_vf_clear_vsi_promisc(vf, vsi, ucast_m);
- mcast_status = ice_vf_set_vsi_promisc(vf, vsi, mcast_m, - !allmulti); - if (mcast_status) { - dev_err(dev, "%sable Tx/Rx filter promiscuous mode on VF-%d failed\n", - allmulti ? "en" : "dis", vf->vf_id); - v_ret = ice_err_to_virt_err(mcast_status); - } + if (allmulti) + mcast_err = ice_vf_set_vsi_promisc(vf, vsi, mcast_m); + else + mcast_err = ice_vf_clear_vsi_promisc(vf, vsi, mcast_m); + + if (ucast_err || mcast_err) + v_ret = VIRTCHNL_STATUS_ERR_PARAM; }
- if (!mcast_status) { + if (!mcast_err) { if (allmulti && !test_and_set_bit(ICE_VF_STATE_MC_PROMISC, vf->vf_states)) dev_info(dev, "VF %u successfully set multicast promiscuous mode\n", @@ -3079,7 +3050,7 @@ static int ice_vc_cfg_promiscuous_mode_msg(struct ice_vf *vf, u8 *msg) vf->vf_id); }
- if (!ucast_status) { + if (!ucast_err) { if (alluni && !test_and_set_bit(ICE_VF_STATE_UC_PROMISC, vf->vf_states)) dev_info(dev, "VF %u successfully set unicast promiscuous mode\n", vf->vf_id);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xuanqiang Luo luoxuanqiang@kylinos.cn
[ Upstream commit 425c5f266b2edeee0ce16fedd8466410cdcfcfe3 ]
As mentioned in the commit baeb705fd6a7 ("ice: always check VF VSI pointer values"), we need to perform a null pointer check on the return value of ice_get_vf_vsi() before using it.
Fixes: 6ebbe97a4881 ("ice: Add a per-VF limit on number of FDIR filters") Signed-off-by: Xuanqiang Luo luoxuanqiang@kylinos.cn Reviewed-by: Przemek Kitszel przemyslaw.kitszel@intel.com Reviewed-by: Simon Horman horms@kernel.org Signed-off-by: Tony Nguyen anthony.l.nguyen@intel.com Link: https://patch.msgid.link/20250425222636.3188441-3-anthony.l.nguyen@intel.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/intel/ice/ice_virtchnl_fdir.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/net/ethernet/intel/ice/ice_virtchnl_fdir.c b/drivers/net/ethernet/intel/ice/ice_virtchnl_fdir.c index 2ca8102e8f36e..3b87cc9dfd46e 100644 --- a/drivers/net/ethernet/intel/ice/ice_virtchnl_fdir.c +++ b/drivers/net/ethernet/intel/ice/ice_virtchnl_fdir.c @@ -2079,6 +2079,11 @@ int ice_vc_add_fdir_fltr(struct ice_vf *vf, u8 *msg) pf = vf->pf; dev = ice_pf_to_dev(pf); vf_vsi = ice_get_vf_vsi(vf); + if (!vf_vsi) { + dev_err(dev, "Can not get FDIR vf_vsi for VF %u\n", vf->vf_id); + v_ret = VIRTCHNL_STATUS_ERR_PARAM; + goto err_exit; + }
#define ICE_VF_MAX_FDIR_FILTERS 128 if (!ice_fdir_num_avail_fltr(&pf->hw, vf_vsi) ||
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Simon Horman horms@kernel.org
[ Upstream commit e7e5ae71831c44d58627a991e603845a2fed2cab ]
As it's name suggests, parse_eeprom() parses EEPROM data.
This is done by reading data, 16 bits at a time as follows:
for (i = 0; i < 128; i++) ((__le16 *) sromdata)[i] = cpu_to_le16(read_eeprom(np, i));
sromdata is at the same memory location as psrom. And the type of psrom is a pointer to struct t_SROM.
As can be seen in the loop above, data is stored in sromdata, and thus psrom, as 16-bit little-endian values.
However, the integer fields of t_SROM are host byte order integers. And in the case of led_mode this leads to a little endian value being incorrectly treated as host byte order.
Looking at rio_set_led_mode, this does appear to be a bug as that code masks led_mode with 0x1, 0x2 and 0x8. Logic that would be effected by a reversed byte order.
This problem would only manifest on big endian hosts.
Found by inspection while investigating a sparse warning regarding the crc field of t_SROM.
I believe that warning is a false positive. And although I plan to send a follow-up to use little-endian types for other the integer fields of PSROM_t I do not believe that will involve any bug fixes.
Compile tested only.
Fixes: c3f45d322cbd ("dl2k: Add support for IP1000A-based cards") Signed-off-by: Simon Horman horms@kernel.org Link: https://patch.msgid.link/20250425-dlink-led-mode-v1-1-6bae3c36e736@kernel.or... Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/dlink/dl2k.c | 2 +- drivers/net/ethernet/dlink/dl2k.h | 2 +- 2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/dlink/dl2k.c b/drivers/net/ethernet/dlink/dl2k.c index 993bba0ffb161..af0b6fa296e56 100644 --- a/drivers/net/ethernet/dlink/dl2k.c +++ b/drivers/net/ethernet/dlink/dl2k.c @@ -353,7 +353,7 @@ parse_eeprom (struct net_device *dev) dev->dev_addr[i] = psrom->mac_addr[i];
if (np->chip_id == CHIP_IP1000A) { - np->led_mode = psrom->led_mode; + np->led_mode = le16_to_cpu(psrom->led_mode); return 0; }
diff --git a/drivers/net/ethernet/dlink/dl2k.h b/drivers/net/ethernet/dlink/dl2k.h index 195dc6cfd8955..0e33e2eaae960 100644 --- a/drivers/net/ethernet/dlink/dl2k.h +++ b/drivers/net/ethernet/dlink/dl2k.h @@ -335,7 +335,7 @@ typedef struct t_SROM { u16 sub_system_id; /* 0x06 */ u16 pci_base_1; /* 0x08 (IP1000A only) */ u16 pci_base_2; /* 0x0a (IP1000A only) */ - u16 led_mode; /* 0x0c (IP1000A only) */ + __le16 led_mode; /* 0x0c (IP1000A only) */ u16 reserved1[9]; /* 0x0e-0x1f */ u8 mac_addr[6]; /* 0x20-0x25 */ u8 reserved2[10]; /* 0x26-0x2f */
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Felix Fietkau nbd@nbd.name
[ Upstream commit b936a9b8d4a585ccb6d454921c36286bfe63e01d ]
If any address or port is changed, update it in all packets and recalculate checksum.
Fixes: 9fd1ff5d2ac7 ("udp: Support UDP fraglist GRO/GSO.") Signed-off-by: Felix Fietkau nbd@nbd.name Reviewed-by: Willem de Bruijn willemb@google.com Link: https://patch.msgid.link/20250426153210.14044-1-nbd@nbd.name Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- net/ipv4/udp_offload.c | 61 +++++++++++++++++++++++++++++++++++++++++- 1 file changed, 60 insertions(+), 1 deletion(-)
diff --git a/net/ipv4/udp_offload.c b/net/ipv4/udp_offload.c index 1a57dd8aa513b..612da8ec1081c 100644 --- a/net/ipv4/udp_offload.c +++ b/net/ipv4/udp_offload.c @@ -245,6 +245,62 @@ static struct sk_buff *__udpv4_gso_segment_list_csum(struct sk_buff *segs) return segs; }
+static void __udpv6_gso_segment_csum(struct sk_buff *seg, + struct in6_addr *oldip, + const struct in6_addr *newip, + __be16 *oldport, __be16 newport) +{ + struct udphdr *uh = udp_hdr(seg); + + if (ipv6_addr_equal(oldip, newip) && *oldport == newport) + return; + + if (uh->check) { + inet_proto_csum_replace16(&uh->check, seg, oldip->s6_addr32, + newip->s6_addr32, true); + + inet_proto_csum_replace2(&uh->check, seg, *oldport, newport, + false); + if (!uh->check) + uh->check = CSUM_MANGLED_0; + } + + *oldip = *newip; + *oldport = newport; +} + +static struct sk_buff *__udpv6_gso_segment_list_csum(struct sk_buff *segs) +{ + const struct ipv6hdr *iph; + const struct udphdr *uh; + struct ipv6hdr *iph2; + struct sk_buff *seg; + struct udphdr *uh2; + + seg = segs; + uh = udp_hdr(seg); + iph = ipv6_hdr(seg); + uh2 = udp_hdr(seg->next); + iph2 = ipv6_hdr(seg->next); + + if (!(*(const u32 *)&uh->source ^ *(const u32 *)&uh2->source) && + ipv6_addr_equal(&iph->saddr, &iph2->saddr) && + ipv6_addr_equal(&iph->daddr, &iph2->daddr)) + return segs; + + while ((seg = seg->next)) { + uh2 = udp_hdr(seg); + iph2 = ipv6_hdr(seg); + + __udpv6_gso_segment_csum(seg, &iph2->saddr, &iph->saddr, + &uh2->source, uh->source); + __udpv6_gso_segment_csum(seg, &iph2->daddr, &iph->daddr, + &uh2->dest, uh->dest); + } + + return segs; +} + static struct sk_buff *__udp_gso_segment_list(struct sk_buff *skb, netdev_features_t features, bool is_ipv6) @@ -257,7 +313,10 @@ static struct sk_buff *__udp_gso_segment_list(struct sk_buff *skb,
udp_hdr(skb)->len = htons(sizeof(struct udphdr) + mss);
- return is_ipv6 ? skb : __udpv4_gso_segment_list_csum(skb); + if (is_ipv6) + return __udpv6_gso_segment_list_csum(skb); + else + return __udpv4_gso_segment_list_csum(skb); }
struct sk_buff *__udp_gso_segment(struct sk_buff *gso_skb,
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shruti Parab shruti.parab@broadcom.com
[ Upstream commit ea9376cf68230e05492f22ca45d329f16e262c7b ]
When handling HWRM_DBG_COREDUMP_LIST FW command in bnxt_hwrm_dbg_dma_data(), the allocated buffer info->dest_buf is not freed in the error path. In the normal path, info->dest_buf is assigned to coredump->data and it will eventually be freed after the coredump is collected.
Free info->dest_buf immediately inside bnxt_hwrm_dbg_dma_data() in the error path.
Fixes: c74751f4c392 ("bnxt_en: Return error if FW returns more data than dump length") Reported-by: Michael Chan michael.chan@broadcom.com Reviewed-by: Kalesh AP kalesh-anakkur.purayil@broadcom.com Signed-off-by: Shruti Parab shruti.parab@broadcom.com Signed-off-by: Michael Chan michael.chan@broadcom.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/broadcom/bnxt/bnxt_coredump.c | 5 +++++ 1 file changed, 5 insertions(+)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_coredump.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_coredump.c index 156f76bcea7eb..e0e7bfaf860b7 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_coredump.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_coredump.c @@ -72,6 +72,11 @@ static int bnxt_hwrm_dbg_dma_data(struct bnxt *bp, void *msg, memcpy(info->dest_buf + off, dma_buf, len); } else { rc = -ENOBUFS; + if (cmn_req->req_type == + cpu_to_le16(HWRM_DBG_COREDUMP_LIST)) { + kfree(info->dest_buf); + info->dest_buf = NULL; + } break; } }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Shruti Parab shruti.parab@broadcom.com
[ Upstream commit 6b87bd94f34370bbf1dfa59352bed8efab5bf419 ]
When retrieving the FW coredump using ethtool, it can sometimes cause memory corruption:
BUG: KFENCE: memory corruption in __bnxt_get_coredump+0x3ef/0x670 [bnxt_en] Corrupted memory at 0x000000008f0f30e8 [ ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ! ] (in kfence-#45): __bnxt_get_coredump+0x3ef/0x670 [bnxt_en] ethtool_get_dump_data+0xdc/0x1a0 __dev_ethtool+0xa1e/0x1af0 dev_ethtool+0xa8/0x170 dev_ioctl+0x1b5/0x580 sock_do_ioctl+0xab/0xf0 sock_ioctl+0x1ce/0x2e0 __x64_sys_ioctl+0x87/0xc0 do_syscall_64+0x5c/0xf0 entry_SYSCALL_64_after_hwframe+0x78/0x80
...
This happens when copying the coredump segment list in bnxt_hwrm_dbg_dma_data() with the HWRM_DBG_COREDUMP_LIST FW command. The info->dest_buf buffer is allocated based on the number of coredump segments returned by the FW. The segment list is then DMA'ed by the FW and the length of the DMA is returned by FW. The driver then copies this DMA'ed segment list to info->dest_buf.
In some cases, this DMA length may exceed the info->dest_buf length and cause the above BUG condition. Fix it by capping the copy length to not exceed the length of info->dest_buf. The extra DMA data contains no useful information.
This code path is shared for the HWRM_DBG_COREDUMP_LIST and the HWRM_DBG_COREDUMP_RETRIEVE FW commands. The buffering is different for these 2 FW commands. To simplify the logic, we need to move the line to adjust the buffer length for HWRM_DBG_COREDUMP_RETRIEVE up, so that the new check to cap the copy length will work for both commands.
Fixes: c74751f4c392 ("bnxt_en: Return error if FW returns more data than dump length") Reviewed-by: Kalesh AP kalesh-anakkur.purayil@broadcom.com Signed-off-by: Shruti Parab shruti.parab@broadcom.com Signed-off-by: Michael Chan michael.chan@broadcom.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/broadcom/bnxt/bnxt_coredump.c | 15 ++++++++++----- 1 file changed, 10 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_coredump.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_coredump.c index e0e7bfaf860b7..8716c924f3f50 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_coredump.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_coredump.c @@ -66,10 +66,19 @@ static int bnxt_hwrm_dbg_dma_data(struct bnxt *bp, void *msg, } }
+ if (cmn_req->req_type == + cpu_to_le16(HWRM_DBG_COREDUMP_RETRIEVE)) + info->dest_buf_size += len; + if (info->dest_buf) { if ((info->seg_start + off + len) <= BNXT_COREDUMP_BUF_LEN(info->buf_len)) { - memcpy(info->dest_buf + off, dma_buf, len); + u16 copylen = min_t(u16, len, + info->dest_buf_size - off); + + memcpy(info->dest_buf + off, dma_buf, copylen); + if (copylen < len) + break; } else { rc = -ENOBUFS; if (cmn_req->req_type == @@ -81,10 +90,6 @@ static int bnxt_hwrm_dbg_dma_data(struct bnxt *bp, void *msg, } }
- if (cmn_req->req_type == - cpu_to_le16(HWRM_DBG_COREDUMP_RETRIEVE)) - info->dest_buf_size += len; - if (!(cmn_resp->flags & HWRM_DBG_CMN_FLAGS_MORE)) break;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Chan michael.chan@broadcom.com
[ Upstream commit 02e8be5a032cae0f4ca33c6053c44d83cf4acc93 ]
For version 1 register dump that includes the PCIe stats, the existing code incorrectly assumes that all PCIe stats are 64-bit values. Fix it by using an array containing the starting and ending index of the 32-bit values. The loop in bnxt_get_regs() will use the array to do proper endian swap for the 32-bit values.
Fixes: b5d600b027eb ("bnxt_en: Add support for 'ethtool -d'") Reviewed-by: Shruti Parab shruti.parab@broadcom.com Reviewed-by: Kalesh AP kalesh-anakkur.purayil@broadcom.com Reviewed-by: Andy Gospodarek andrew.gospodarek@broadcom.com Signed-off-by: Michael Chan michael.chan@broadcom.com Signed-off-by: David S. Miller davem@davemloft.net Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/broadcom/bnxt/bnxt_ethtool.c | 38 ++++++++++++++++--- 1 file changed, 32 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c index 8ebc1c522a05b..ad307df8d97ba 100644 --- a/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c +++ b/drivers/net/ethernet/broadcom/bnxt/bnxt_ethtool.c @@ -1360,6 +1360,17 @@ static int bnxt_get_regs_len(struct net_device *dev) return reg_len; }
+#define BNXT_PCIE_32B_ENTRY(start, end) \ + { offsetof(struct pcie_ctx_hw_stats, start), \ + offsetof(struct pcie_ctx_hw_stats, end) } + +static const struct { + u16 start; + u16 end; +} bnxt_pcie_32b_entries[] = { + BNXT_PCIE_32B_ENTRY(pcie_ltssm_histogram[0], pcie_ltssm_histogram[3]), +}; + static void bnxt_get_regs(struct net_device *dev, struct ethtool_regs *regs, void *_p) { @@ -1391,12 +1402,27 @@ static void bnxt_get_regs(struct net_device *dev, struct ethtool_regs *regs, req->pcie_stat_host_addr = cpu_to_le64(hw_pcie_stats_addr); rc = hwrm_req_send(bp, req); if (!rc) { - __le64 *src = (__le64 *)hw_pcie_stats; - u64 *dst = (u64 *)(_p + BNXT_PXP_REG_LEN); - int i; - - for (i = 0; i < sizeof(*hw_pcie_stats) / sizeof(__le64); i++) - dst[i] = le64_to_cpu(src[i]); + u8 *dst = (u8 *)(_p + BNXT_PXP_REG_LEN); + u8 *src = (u8 *)hw_pcie_stats; + int i, j; + + for (i = 0, j = 0; i < sizeof(*hw_pcie_stats); ) { + if (i >= bnxt_pcie_32b_entries[j].start && + i <= bnxt_pcie_32b_entries[j].end) { + u32 *dst32 = (u32 *)(dst + i); + + *dst32 = le32_to_cpu(*(__le32 *)(src + i)); + i += 4; + if (i > bnxt_pcie_32b_entries[j].end && + j < ARRAY_SIZE(bnxt_pcie_32b_entries) - 1) + j++; + } else { + u64 *dst64 = (u64 *)(dst + i); + + *dst64 = le64_to_cpu(*(__le64 *)(src + i)); + i += 8; + } + } } hwrm_req_drop(bp, req); }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Michael Liang mliang@purestorage.com
[ Upstream commit 77e40bbce93059658aee02786a32c5c98a240a8a ]
This patch addresses a data corruption issue observed in nvme-tcp during testing.
In an NVMe native multipath setup, when an I/O timeout occurs, all inflight I/Os are canceled almost immediately after the kernel socket is shut down. These canceled I/Os are reported as host path errors, triggering a failover that succeeds on a different path.
However, at this point, the original I/O may still be outstanding in the host's network transmission path (e.g., the NIC’s TX queue). From the user-space app's perspective, the buffer associated with the I/O is considered completed since they're acked on the different path and may be reused for new I/O requests.
Because nvme-tcp enables zero-copy by default in the transmission path, this can lead to corrupted data being sent to the original target, ultimately causing data corruption.
We can reproduce this data corruption by injecting delay on one path and triggering i/o timeout.
To prevent this issue, this change ensures that all inflight transmissions are fully completed from host's perspective before returning from queue stop. To handle concurrent I/O timeout from multiple namespaces under the same controller, always wait in queue stop regardless of queue's state.
This aligns with the behavior of queue stopping in other NVMe fabric transports.
Fixes: 3f2304f8c6d6 ("nvme-tcp: add NVMe over TCP host driver") Signed-off-by: Michael Liang mliang@purestorage.com Reviewed-by: Mohamed Khalfella mkhalfella@purestorage.com Reviewed-by: Randy Jennings randyj@purestorage.com Reviewed-by: Sagi Grimberg sagi@grimberg.me Signed-off-by: Christoph Hellwig hch@lst.de Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/nvme/host/tcp.c | 31 +++++++++++++++++++++++++++++-- 1 file changed, 29 insertions(+), 2 deletions(-)
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c index 0fc5aba88bc15..99bf17f2dcfca 100644 --- a/drivers/nvme/host/tcp.c +++ b/drivers/nvme/host/tcp.c @@ -1602,7 +1602,7 @@ static void __nvme_tcp_stop_queue(struct nvme_tcp_queue *queue) cancel_work_sync(&queue->io_work); }
-static void nvme_tcp_stop_queue(struct nvme_ctrl *nctrl, int qid) +static void nvme_tcp_stop_queue_nowait(struct nvme_ctrl *nctrl, int qid) { struct nvme_tcp_ctrl *ctrl = to_tcp_ctrl(nctrl); struct nvme_tcp_queue *queue = &ctrl->queues[qid]; @@ -1613,6 +1613,31 @@ static void nvme_tcp_stop_queue(struct nvme_ctrl *nctrl, int qid) mutex_unlock(&queue->queue_lock); }
+static void nvme_tcp_wait_queue(struct nvme_ctrl *nctrl, int qid) +{ + struct nvme_tcp_ctrl *ctrl = to_tcp_ctrl(nctrl); + struct nvme_tcp_queue *queue = &ctrl->queues[qid]; + int timeout = 100; + + while (timeout > 0) { + if (!test_bit(NVME_TCP_Q_ALLOCATED, &queue->flags) || + !sk_wmem_alloc_get(queue->sock->sk)) + return; + msleep(2); + timeout -= 2; + } + dev_warn(nctrl->device, + "qid %d: timeout draining sock wmem allocation expired\n", + qid); +} + +static void nvme_tcp_stop_queue(struct nvme_ctrl *nctrl, int qid) +{ + nvme_tcp_stop_queue_nowait(nctrl, qid); + nvme_tcp_wait_queue(nctrl, qid); +} + + static void nvme_tcp_setup_sock_ops(struct nvme_tcp_queue *queue) { write_lock_bh(&queue->sock->sk->sk_callback_lock); @@ -1720,7 +1745,9 @@ static void nvme_tcp_stop_io_queues(struct nvme_ctrl *ctrl) int i;
for (i = 1; i < ctrl->queue_count; i++) - nvme_tcp_stop_queue(ctrl, i); + nvme_tcp_stop_queue_nowait(ctrl, i); + for (i = 1; i < ctrl->queue_count; i++) + nvme_tcp_wait_queue(ctrl, i); }
static int nvme_tcp_start_io_queues(struct nvme_ctrl *ctrl)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thangaraj Samynathan thangaraj.s@microchip.com
[ Upstream commit 2d52e2e38b85c8b7bc00dca55c2499f46f8c8198 ]
Always map the `skb` to the LS descriptor. Previously skb was mapped to EXT descriptor when the number of fragments is zero with GSO enabled. Mapping the skb to EXT descriptor prevents it from being freed, leading to a memory leak
Fixes: 23f0703c125b ("lan743x: Add main source files for new lan743x driver") Signed-off-by: Thangaraj Samynathan thangaraj.s@microchip.com Reviewed-by: Jacob Keller jacob.e.keller@intel.com Link: https://patch.msgid.link/20250429052527.10031-1-thangaraj.s@microchip.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/microchip/lan743x_main.c | 8 ++++++-- drivers/net/ethernet/microchip/lan743x_main.h | 1 + 2 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/microchip/lan743x_main.c b/drivers/net/ethernet/microchip/lan743x_main.c index a3392c74372a8..fe919c1974505 100644 --- a/drivers/net/ethernet/microchip/lan743x_main.c +++ b/drivers/net/ethernet/microchip/lan743x_main.c @@ -1448,6 +1448,7 @@ static void lan743x_tx_frame_add_lso(struct lan743x_tx *tx, if (nr_frags <= 0) { tx->frame_data0 |= TX_DESC_DATA0_LS_; tx->frame_data0 |= TX_DESC_DATA0_IOC_; + tx->frame_last = tx->frame_first; } tx_descriptor = &tx->ring_cpu_ptr[tx->frame_tail]; tx_descriptor->data0 = cpu_to_le32(tx->frame_data0); @@ -1517,6 +1518,7 @@ static int lan743x_tx_frame_add_fragment(struct lan743x_tx *tx, tx->frame_first = 0; tx->frame_data0 = 0; tx->frame_tail = 0; + tx->frame_last = 0; return -ENOMEM; }
@@ -1557,16 +1559,18 @@ static void lan743x_tx_frame_end(struct lan743x_tx *tx, TX_DESC_DATA0_DTYPE_DATA_) { tx->frame_data0 |= TX_DESC_DATA0_LS_; tx->frame_data0 |= TX_DESC_DATA0_IOC_; + tx->frame_last = tx->frame_tail; }
- tx_descriptor = &tx->ring_cpu_ptr[tx->frame_tail]; - buffer_info = &tx->buffer_info[tx->frame_tail]; + tx_descriptor = &tx->ring_cpu_ptr[tx->frame_last]; + buffer_info = &tx->buffer_info[tx->frame_last]; buffer_info->skb = skb; if (time_stamp) buffer_info->flags |= TX_BUFFER_INFO_FLAG_TIMESTAMP_REQUESTED; if (ignore_sync) buffer_info->flags |= TX_BUFFER_INFO_FLAG_IGNORE_SYNC;
+ tx_descriptor = &tx->ring_cpu_ptr[tx->frame_tail]; tx_descriptor->data0 = cpu_to_le32(tx->frame_data0); tx->frame_tail = lan743x_tx_next_index(tx, tx->frame_tail); tx->last_tail = tx->frame_tail; diff --git a/drivers/net/ethernet/microchip/lan743x_main.h b/drivers/net/ethernet/microchip/lan743x_main.h index 6080028c1df2c..a1226ab0fb421 100644 --- a/drivers/net/ethernet/microchip/lan743x_main.h +++ b/drivers/net/ethernet/microchip/lan743x_main.h @@ -658,6 +658,7 @@ struct lan743x_tx { u32 frame_first; u32 frame_data0; u32 frame_tail; + u32 frame_last;
struct lan743x_tx_buffer_info *buffer_info;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Mattias Barthel mattias.barthel@atlascopco.com
[ Upstream commit a179aad12badc43201cbf45d1e8ed2c1383c76b9 ]
Activate TX hang workaround also in fec_enet_txq_submit_skb() when TSO is not enabled.
Errata: ERR007885
Symptoms: NETDEV WATCHDOG: eth0 (fec): transmit queue 0 timed out
commit 37d6017b84f7 ("net: fec: Workaround for imx6sx enet tx hang when enable three queues") There is a TDAR race condition for mutliQ when the software sets TDAR and the UDMA clears TDAR simultaneously or in a small window (2-4 cycles). This will cause the udma_tx and udma_tx_arbiter state machines to hang.
So, the Workaround is checking TDAR status four time, if TDAR cleared by hardware and then write TDAR, otherwise don't set TDAR.
Fixes: 53bb20d1faba ("net: fec: add variable reg_desc_active to speed things up") Signed-off-by: Mattias Barthel mattias.barthel@atlascopco.com Reviewed-by: Andrew Lunn andrew@lunn.ch Link: https://patch.msgid.link/20250429090826.3101258-1-mattiasbarthel@gmail.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/freescale/fec_main.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/freescale/fec_main.c b/drivers/net/ethernet/freescale/fec_main.c index 7b5585bc21d8f..5c860eef03007 100644 --- a/drivers/net/ethernet/freescale/fec_main.c +++ b/drivers/net/ethernet/freescale/fec_main.c @@ -635,7 +635,12 @@ static int fec_enet_txq_submit_skb(struct fec_enet_priv_tx_q *txq, txq->bd.cur = bdp;
/* Trigger transmission start */ - writel(0, txq->bd.reg_desc_active); + if (!(fep->quirks & FEC_QUIRK_ERR007885) || + !readl(txq->bd.reg_desc_active) || + !readl(txq->bd.reg_desc_active) || + !readl(txq->bd.reg_desc_active) || + !readl(txq->bd.reg_desc_active)) + writel(0, txq->bd.reg_desc_active);
return 0; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jian Shen shenjian15@huawei.com
[ Upstream commit ef2383d078edcbe3055032436b16cdf206f26de2 ]
The VF driver missed to store the rx VLAN tag strip state when user change the rx VLAN tag offload state. And it will default to enable the rx vlan tag strip when re-init VF device after reset. So if user disable rx VLAN tag offload, and trig reset, then the HW will still strip the VLAN tag from packet nad fill into RX BD, but the VF driver will ignore it for rx VLAN tag offload disabled. It may cause the rx VLAN tag dropped.
Fixes: b2641e2ad456 ("net: hns3: Add support of hardware rx-vlan-offload to HNS3 VF driver") Signed-off-by: Jian Shen shenjian15@huawei.com Signed-off-by: Jijie Shao shaojijie@huawei.com Reviewed-by: Simon Horman horms@kernel.org Link: https://patch.msgid.link/20250430093052.2400464-2-shaojijie@huawei.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../hisilicon/hns3/hns3vf/hclgevf_main.c | 25 ++++++++++++++----- .../hisilicon/hns3/hns3vf/hclgevf_main.h | 1 + 2 files changed, 20 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c index 7bb01eafba745..628d5c5ad75de 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.c @@ -1761,9 +1761,8 @@ static void hclgevf_sync_vlan_filter(struct hclgevf_dev *hdev) rtnl_unlock(); }
-static int hclgevf_en_hw_strip_rxvtag(struct hnae3_handle *handle, bool enable) +static int hclgevf_en_hw_strip_rxvtag_cmd(struct hclgevf_dev *hdev, bool enable) { - struct hclgevf_dev *hdev = hclgevf_ae_get_hdev(handle); struct hclge_vf_to_pf_msg send_msg;
hclgevf_build_send_msg(&send_msg, HCLGE_MBX_SET_VLAN, @@ -1772,6 +1771,19 @@ static int hclgevf_en_hw_strip_rxvtag(struct hnae3_handle *handle, bool enable) return hclgevf_send_mbx_msg(hdev, &send_msg, false, NULL, 0); }
+static int hclgevf_en_hw_strip_rxvtag(struct hnae3_handle *handle, bool enable) +{ + struct hclgevf_dev *hdev = hclgevf_ae_get_hdev(handle); + int ret; + + ret = hclgevf_en_hw_strip_rxvtag_cmd(hdev, enable); + if (ret) + return ret; + + hdev->rxvtag_strip_en = enable; + return 0; +} + static int hclgevf_reset_tqp(struct hnae3_handle *handle) { #define HCLGEVF_RESET_ALL_QUEUE_DONE 1U @@ -2684,12 +2696,13 @@ static int hclgevf_rss_init_hw(struct hclgevf_dev *hdev) return hclgevf_set_rss_tc_mode(hdev, rss_cfg->rss_size); }
-static int hclgevf_init_vlan_config(struct hclgevf_dev *hdev) +static int hclgevf_init_vlan_config(struct hclgevf_dev *hdev, + bool rxvtag_strip_en) { struct hnae3_handle *nic = &hdev->nic; int ret;
- ret = hclgevf_en_hw_strip_rxvtag(nic, true); + ret = hclgevf_en_hw_strip_rxvtag(nic, rxvtag_strip_en); if (ret) { dev_err(&hdev->pdev->dev, "failed to enable rx vlan offload, ret = %d\n", ret); @@ -3359,7 +3372,7 @@ static int hclgevf_reset_hdev(struct hclgevf_dev *hdev) if (ret) return ret;
- ret = hclgevf_init_vlan_config(hdev); + ret = hclgevf_init_vlan_config(hdev, hdev->rxvtag_strip_en); if (ret) { dev_err(&hdev->pdev->dev, "failed(%d) to initialize VLAN config\n", ret); @@ -3472,7 +3485,7 @@ static int hclgevf_init_hdev(struct hclgevf_dev *hdev) goto err_config; }
- ret = hclgevf_init_vlan_config(hdev); + ret = hclgevf_init_vlan_config(hdev, true); if (ret) { dev_err(&hdev->pdev->dev, "failed(%d) to initialize VLAN config\n", ret); diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.h b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.h index 2b216ac96914c..a6468fe2ec326 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.h +++ b/drivers/net/ethernet/hisilicon/hns3/hns3vf/hclgevf_main.h @@ -315,6 +315,7 @@ struct hclgevf_dev { int *vector_irq;
bool gro_en; + bool rxvtag_strip_en;
unsigned long vlan_del_fail_bmap[BITS_TO_LONGS(VLAN_N_VID)];
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yonglong Liu liuyonglong@huawei.com
[ Upstream commit 04b6ba143521f4485b7f2c36c655b262a79dae97 ]
This patch add support for external loopback test. The successful test need the link is up with duplex full. The driver do external loopback first, and then the whole offline test.
Signed-off-by: Yonglong Liu liuyonglong@huawei.com Signed-off-by: Guangbin Huang huangguangbin2@huawei.com Signed-off-by: Jakub Kicinski kuba@kernel.org Stable-dep-of: 8e6b9c6ea5a5 ("net: hns3: fix an interrupt residual problem") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hnae3.h | 2 + .../net/ethernet/hisilicon/hns3/hns3_enet.c | 51 ++++++++++++++++ .../net/ethernet/hisilicon/hns3/hns3_enet.h | 3 + .../ethernet/hisilicon/hns3/hns3_ethtool.c | 61 +++++++++++++------ .../hisilicon/hns3/hns3pf/hclge_main.c | 26 +++++--- 5 files changed, 119 insertions(+), 24 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hnae3.h b/drivers/net/ethernet/hisilicon/hns3/hnae3.h index fa16cdcee10db..8d1b66281c095 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hnae3.h +++ b/drivers/net/ethernet/hisilicon/hns3/hnae3.h @@ -178,6 +178,7 @@ struct hns3_mac_stats {
/* hnae3 loop mode */ enum hnae3_loop { + HNAE3_LOOP_EXTERNAL, HNAE3_LOOP_APP, HNAE3_LOOP_SERIAL_SERDES, HNAE3_LOOP_PARALLEL_SERDES, @@ -802,6 +803,7 @@ struct hnae3_roce_private_info { #define HNAE3_SUPPORT_SERDES_SERIAL_LOOPBACK BIT(2) #define HNAE3_SUPPORT_VF BIT(3) #define HNAE3_SUPPORT_SERDES_PARALLEL_LOOPBACK BIT(4) +#define HNAE3_SUPPORT_EXTERNAL_LOOPBACK BIT(5)
#define HNAE3_USER_UPE BIT(0) /* unicast promisc enabled by user */ #define HNAE3_USER_MPE BIT(1) /* mulitcast promisc enabled by user */ diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c index 60592e8ddf3b8..03fe5e0729f64 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c @@ -5642,6 +5642,57 @@ int hns3_set_channels(struct net_device *netdev, return 0; }
+void hns3_external_lb_prepare(struct net_device *ndev, bool if_running) +{ + struct hns3_nic_priv *priv = netdev_priv(ndev); + struct hnae3_handle *h = priv->ae_handle; + int i; + + if (!if_running) + return; + + netif_carrier_off(ndev); + netif_tx_disable(ndev); + + for (i = 0; i < priv->vector_num; i++) + hns3_vector_disable(&priv->tqp_vector[i]); + + for (i = 0; i < h->kinfo.num_tqps; i++) + hns3_tqp_disable(h->kinfo.tqp[i]); + + /* delay ring buffer clearing to hns3_reset_notify_uninit_enet + * during reset process, because driver may not be able + * to disable the ring through firmware when downing the netdev. + */ + if (!hns3_nic_resetting(ndev)) + hns3_nic_reset_all_ring(priv->ae_handle); + + hns3_reset_tx_queue(priv->ae_handle); +} + +void hns3_external_lb_restore(struct net_device *ndev, bool if_running) +{ + struct hns3_nic_priv *priv = netdev_priv(ndev); + struct hnae3_handle *h = priv->ae_handle; + int i; + + if (!if_running) + return; + + hns3_nic_reset_all_ring(priv->ae_handle); + + for (i = 0; i < priv->vector_num; i++) + hns3_vector_enable(&priv->tqp_vector[i]); + + for (i = 0; i < h->kinfo.num_tqps; i++) + hns3_tqp_enable(h->kinfo.tqp[i]); + + netif_tx_wake_all_queues(ndev); + + if (h->ae_algo->ops->get_status(h)) + netif_carrier_on(ndev); +} + static const struct hns3_hw_error_info hns3_hw_err[] = { { .type = HNAE3_PPU_POISON_ERROR, .msg = "PPU poison" }, diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h index f60ba2ee8b8b1..f3f7f370807f0 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.h @@ -729,4 +729,7 @@ u16 hns3_get_max_available_channels(struct hnae3_handle *h); void hns3_cq_period_mode_init(struct hns3_nic_priv *priv, enum dim_cq_period_mode tx_mode, enum dim_cq_period_mode rx_mode); + +void hns3_external_lb_prepare(struct net_device *ndev, bool if_running); +void hns3_external_lb_restore(struct net_device *ndev, bool if_running); #endif diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c index 17fa4e7684cd2..b01ce4fd6bc43 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c @@ -67,7 +67,6 @@ static const struct hns3_stats hns3_rxq_stats[] = {
#define HNS3_TQP_STATS_COUNT (HNS3_TXQ_STATS_COUNT + HNS3_RXQ_STATS_COUNT)
-#define HNS3_SELF_TEST_TYPE_NUM 4 #define HNS3_NIC_LB_TEST_PKT_NUM 1 #define HNS3_NIC_LB_TEST_RING_ID 0 #define HNS3_NIC_LB_TEST_PACKET_SIZE 128 @@ -93,6 +92,7 @@ static int hns3_lp_setup(struct net_device *ndev, enum hnae3_loop loop, bool en) case HNAE3_LOOP_PARALLEL_SERDES: case HNAE3_LOOP_APP: case HNAE3_LOOP_PHY: + case HNAE3_LOOP_EXTERNAL: ret = h->ae_algo->ops->set_loopback(h, loop, en); break; default: @@ -300,6 +300,10 @@ static int hns3_lp_run_test(struct net_device *ndev, enum hnae3_loop mode)
static void hns3_set_selftest_param(struct hnae3_handle *h, int (*st_param)[2]) { + st_param[HNAE3_LOOP_EXTERNAL][0] = HNAE3_LOOP_EXTERNAL; + st_param[HNAE3_LOOP_EXTERNAL][1] = + h->flags & HNAE3_SUPPORT_EXTERNAL_LOOPBACK; + st_param[HNAE3_LOOP_APP][0] = HNAE3_LOOP_APP; st_param[HNAE3_LOOP_APP][1] = h->flags & HNAE3_SUPPORT_APP_LOOPBACK; @@ -318,17 +322,11 @@ static void hns3_set_selftest_param(struct hnae3_handle *h, int (*st_param)[2]) h->flags & HNAE3_SUPPORT_PHY_LOOPBACK; }
-static void hns3_selftest_prepare(struct net_device *ndev, - bool if_running, int (*st_param)[2]) +static void hns3_selftest_prepare(struct net_device *ndev, bool if_running) { struct hns3_nic_priv *priv = netdev_priv(ndev); struct hnae3_handle *h = priv->ae_handle;
- if (netif_msg_ifdown(h)) - netdev_info(ndev, "self test start\n"); - - hns3_set_selftest_param(h, st_param); - if (if_running) ndev->netdev_ops->ndo_stop(ndev);
@@ -367,18 +365,15 @@ static void hns3_selftest_restore(struct net_device *ndev, bool if_running)
if (if_running) ndev->netdev_ops->ndo_open(ndev); - - if (netif_msg_ifdown(h)) - netdev_info(ndev, "self test end\n"); }
static void hns3_do_selftest(struct net_device *ndev, int (*st_param)[2], struct ethtool_test *eth_test, u64 *data) { - int test_index = 0; + int test_index = HNAE3_LOOP_APP; u32 i;
- for (i = 0; i < HNS3_SELF_TEST_TYPE_NUM; i++) { + for (i = HNAE3_LOOP_APP; i < HNAE3_LOOP_NONE; i++) { enum hnae3_loop loop_type = (enum hnae3_loop)st_param[i][0];
if (!st_param[i][1]) @@ -397,6 +392,20 @@ static void hns3_do_selftest(struct net_device *ndev, int (*st_param)[2], } }
+static void hns3_do_external_lb(struct net_device *ndev, + struct ethtool_test *eth_test, u64 *data) +{ + data[HNAE3_LOOP_EXTERNAL] = hns3_lp_up(ndev, HNAE3_LOOP_EXTERNAL); + if (!data[HNAE3_LOOP_EXTERNAL]) + data[HNAE3_LOOP_EXTERNAL] = hns3_lp_run_test(ndev, HNAE3_LOOP_EXTERNAL); + hns3_lp_down(ndev, HNAE3_LOOP_EXTERNAL); + + if (data[HNAE3_LOOP_EXTERNAL]) + eth_test->flags |= ETH_TEST_FL_FAILED; + + eth_test->flags |= ETH_TEST_FL_EXTERNAL_LB_DONE; +} + /** * hns3_nic_self_test - self test * @ndev: net device @@ -406,7 +415,9 @@ static void hns3_do_selftest(struct net_device *ndev, int (*st_param)[2], static void hns3_self_test(struct net_device *ndev, struct ethtool_test *eth_test, u64 *data) { - int st_param[HNS3_SELF_TEST_TYPE_NUM][2]; + struct hns3_nic_priv *priv = netdev_priv(ndev); + struct hnae3_handle *h = priv->ae_handle; + int st_param[HNAE3_LOOP_NONE][2]; bool if_running = netif_running(ndev);
if (hns3_nic_resetting(ndev)) { @@ -414,13 +425,29 @@ static void hns3_self_test(struct net_device *ndev, return; }
- /* Only do offline selftest, or pass by default */ - if (eth_test->flags != ETH_TEST_FL_OFFLINE) + if (!(eth_test->flags & ETH_TEST_FL_OFFLINE)) return;
- hns3_selftest_prepare(ndev, if_running, st_param); + if (netif_msg_ifdown(h)) + netdev_info(ndev, "self test start\n"); + + hns3_set_selftest_param(h, st_param); + + /* external loopback test requires that the link is up and the duplex is + * full, do external test first to reduce the whole test time + */ + if (eth_test->flags & ETH_TEST_FL_EXTERNAL_LB) { + hns3_external_lb_prepare(ndev, if_running); + hns3_do_external_lb(ndev, eth_test, data); + hns3_external_lb_restore(ndev, if_running); + } + + hns3_selftest_prepare(ndev, if_running); hns3_do_selftest(ndev, st_param, eth_test, data); hns3_selftest_restore(ndev, if_running); + + if (netif_msg_ifdown(h)) + netdev_info(ndev, "self test end\n"); }
static void hns3_update_limit_promisc_mode(struct net_device *netdev, diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c index 35411f9a14323..a0284a9d90e89 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_main.c @@ -151,10 +151,11 @@ static const u32 tqp_intr_reg_addr_list[] = {HCLGE_TQP_INTR_CTRL_REG, HCLGE_TQP_INTR_RL_REG};
static const char hns3_nic_test_strs[][ETH_GSTRING_LEN] = { - "App Loopback test", - "Serdes serial Loopback test", - "Serdes parallel Loopback test", - "Phy Loopback test" + "External Loopback test", + "App Loopback test", + "Serdes serial Loopback test", + "Serdes parallel Loopback test", + "Phy Loopback test" };
static const struct hclge_comm_stats_str g_mac_stats_string[] = { @@ -754,7 +755,8 @@ static int hclge_get_sset_count(struct hnae3_handle *handle, int stringset) #define HCLGE_LOOPBACK_TEST_FLAGS (HNAE3_SUPPORT_APP_LOOPBACK | \ HNAE3_SUPPORT_PHY_LOOPBACK | \ HNAE3_SUPPORT_SERDES_SERIAL_LOOPBACK | \ - HNAE3_SUPPORT_SERDES_PARALLEL_LOOPBACK) + HNAE3_SUPPORT_SERDES_PARALLEL_LOOPBACK | \ + HNAE3_SUPPORT_EXTERNAL_LOOPBACK)
struct hclge_vport *vport = hclge_get_vport(handle); struct hclge_dev *hdev = vport->back; @@ -776,9 +778,12 @@ static int hclge_get_sset_count(struct hnae3_handle *handle, int stringset) handle->flags |= HNAE3_SUPPORT_APP_LOOPBACK; }
- count += 2; + count += 1; handle->flags |= HNAE3_SUPPORT_SERDES_SERIAL_LOOPBACK; + count += 1; handle->flags |= HNAE3_SUPPORT_SERDES_PARALLEL_LOOPBACK; + count += 1; + handle->flags |= HNAE3_SUPPORT_EXTERNAL_LOOPBACK;
if ((hdev->hw.mac.phydev && hdev->hw.mac.phydev->drv && hdev->hw.mac.phydev->drv->set_loopback) || @@ -806,6 +811,11 @@ static void hclge_get_strings(struct hnae3_handle *handle, u32 stringset, size, p); p = hclge_tqps_get_strings(handle, p); } else if (stringset == ETH_SS_TEST) { + if (handle->flags & HNAE3_SUPPORT_EXTERNAL_LOOPBACK) { + memcpy(p, hns3_nic_test_strs[HNAE3_LOOP_EXTERNAL], + ETH_GSTRING_LEN); + p += ETH_GSTRING_LEN; + } if (handle->flags & HNAE3_SUPPORT_APP_LOOPBACK) { memcpy(p, hns3_nic_test_strs[HNAE3_LOOP_APP], ETH_GSTRING_LEN); @@ -8060,7 +8070,7 @@ static int hclge_set_loopback(struct hnae3_handle *handle, { struct hclge_vport *vport = hclge_get_vport(handle); struct hclge_dev *hdev = vport->back; - int ret; + int ret = 0;
/* Loopback can be enabled in three places: SSU, MAC, and serdes. By * default, SSU loopback is enabled, so if the SMAC and the DMAC are @@ -8087,6 +8097,8 @@ static int hclge_set_loopback(struct hnae3_handle *handle, case HNAE3_LOOP_PHY: ret = hclge_set_phy_loopback(hdev, en); break; + case HNAE3_LOOP_EXTERNAL: + break; default: ret = -ENOTSUPP; dev_err(&hdev->pdev->dev,
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yonglong Liu liuyonglong@huawei.com
[ Upstream commit 8e6b9c6ea5a55045eed6526d8ee49e93192d1a58 ]
When a VF is passthrough to a VM, and the VM is killed, the reported interrupt may not been handled, it will remain, and won't be clear by the nic engine even with a flr or tqp reset. When the VM restart, the interrupt of the first vector may be dropped by the second enable_irq in vfio, see the issue below: https://gitlab.com/qemu-project/qemu/-/issues/2884#note_2423361621
We notice that the vfio has always behaved this way, and the interrupt is a residue of the nic engine, so we fix the problem by moving the vector enable process out of the enable_irq loop.
Fixes: 08a100689d4b ("net: hns3: re-organize vector handle") Signed-off-by: Yonglong Liu liuyonglong@huawei.com Signed-off-by: Jijie Shao shaojijie@huawei.com Link: https://patch.msgid.link/20250430093052.2400464-3-shaojijie@huawei.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/hisilicon/hns3/hns3_enet.c | 82 +++++++++---------- 1 file changed, 39 insertions(+), 43 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c index 03fe5e0729f64..af33074267ec9 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c @@ -473,20 +473,14 @@ static void hns3_mask_vector_irq(struct hns3_enet_tqp_vector *tqp_vector, writel(mask_en, tqp_vector->mask_addr); }
-static void hns3_vector_enable(struct hns3_enet_tqp_vector *tqp_vector) +static void hns3_irq_enable(struct hns3_enet_tqp_vector *tqp_vector) { napi_enable(&tqp_vector->napi); enable_irq(tqp_vector->vector_irq); - - /* enable vector */ - hns3_mask_vector_irq(tqp_vector, 1); }
-static void hns3_vector_disable(struct hns3_enet_tqp_vector *tqp_vector) +static void hns3_irq_disable(struct hns3_enet_tqp_vector *tqp_vector) { - /* disable vector */ - hns3_mask_vector_irq(tqp_vector, 0); - disable_irq(tqp_vector->vector_irq); napi_disable(&tqp_vector->napi); cancel_work_sync(&tqp_vector->rx_group.dim.work); @@ -707,11 +701,42 @@ static int hns3_set_rx_cpu_rmap(struct net_device *netdev) return 0; }
+static void hns3_enable_irqs_and_tqps(struct net_device *netdev) +{ + struct hns3_nic_priv *priv = netdev_priv(netdev); + struct hnae3_handle *h = priv->ae_handle; + u16 i; + + for (i = 0; i < priv->vector_num; i++) + hns3_irq_enable(&priv->tqp_vector[i]); + + for (i = 0; i < priv->vector_num; i++) + hns3_mask_vector_irq(&priv->tqp_vector[i], 1); + + for (i = 0; i < h->kinfo.num_tqps; i++) + hns3_tqp_enable(h->kinfo.tqp[i]); +} + +static void hns3_disable_irqs_and_tqps(struct net_device *netdev) +{ + struct hns3_nic_priv *priv = netdev_priv(netdev); + struct hnae3_handle *h = priv->ae_handle; + u16 i; + + for (i = 0; i < h->kinfo.num_tqps; i++) + hns3_tqp_disable(h->kinfo.tqp[i]); + + for (i = 0; i < priv->vector_num; i++) + hns3_mask_vector_irq(&priv->tqp_vector[i], 0); + + for (i = 0; i < priv->vector_num; i++) + hns3_irq_disable(&priv->tqp_vector[i]); +} + static int hns3_nic_net_up(struct net_device *netdev) { struct hns3_nic_priv *priv = netdev_priv(netdev); struct hnae3_handle *h = priv->ae_handle; - int i, j; int ret;
ret = hns3_nic_reset_all_ring(h); @@ -720,23 +745,13 @@ static int hns3_nic_net_up(struct net_device *netdev)
clear_bit(HNS3_NIC_STATE_DOWN, &priv->state);
- /* enable the vectors */ - for (i = 0; i < priv->vector_num; i++) - hns3_vector_enable(&priv->tqp_vector[i]); - - /* enable rcb */ - for (j = 0; j < h->kinfo.num_tqps; j++) - hns3_tqp_enable(h->kinfo.tqp[j]); + hns3_enable_irqs_and_tqps(netdev);
/* start the ae_dev */ ret = h->ae_algo->ops->start ? h->ae_algo->ops->start(h) : 0; if (ret) { set_bit(HNS3_NIC_STATE_DOWN, &priv->state); - while (j--) - hns3_tqp_disable(h->kinfo.tqp[j]); - - for (j = i - 1; j >= 0; j--) - hns3_vector_disable(&priv->tqp_vector[j]); + hns3_disable_irqs_and_tqps(netdev); }
return ret; @@ -823,17 +838,9 @@ static void hns3_reset_tx_queue(struct hnae3_handle *h) static void hns3_nic_net_down(struct net_device *netdev) { struct hns3_nic_priv *priv = netdev_priv(netdev); - struct hnae3_handle *h = hns3_get_handle(netdev); const struct hnae3_ae_ops *ops; - int i;
- /* disable vectors */ - for (i = 0; i < priv->vector_num; i++) - hns3_vector_disable(&priv->tqp_vector[i]); - - /* disable rcb */ - for (i = 0; i < h->kinfo.num_tqps; i++) - hns3_tqp_disable(h->kinfo.tqp[i]); + hns3_disable_irqs_and_tqps(netdev);
/* stop ae_dev */ ops = priv->ae_handle->ae_algo->ops; @@ -5645,8 +5652,6 @@ int hns3_set_channels(struct net_device *netdev, void hns3_external_lb_prepare(struct net_device *ndev, bool if_running) { struct hns3_nic_priv *priv = netdev_priv(ndev); - struct hnae3_handle *h = priv->ae_handle; - int i;
if (!if_running) return; @@ -5654,11 +5659,7 @@ void hns3_external_lb_prepare(struct net_device *ndev, bool if_running) netif_carrier_off(ndev); netif_tx_disable(ndev);
- for (i = 0; i < priv->vector_num; i++) - hns3_vector_disable(&priv->tqp_vector[i]); - - for (i = 0; i < h->kinfo.num_tqps; i++) - hns3_tqp_disable(h->kinfo.tqp[i]); + hns3_disable_irqs_and_tqps(ndev);
/* delay ring buffer clearing to hns3_reset_notify_uninit_enet * during reset process, because driver may not be able @@ -5674,18 +5675,13 @@ void hns3_external_lb_restore(struct net_device *ndev, bool if_running) { struct hns3_nic_priv *priv = netdev_priv(ndev); struct hnae3_handle *h = priv->ae_handle; - int i;
if (!if_running) return;
hns3_nic_reset_all_ring(priv->ae_handle);
- for (i = 0; i < priv->vector_num; i++) - hns3_vector_enable(&priv->tqp_vector[i]); - - for (i = 0; i < h->kinfo.num_tqps; i++) - hns3_tqp_enable(h->kinfo.tqp[i]); + hns3_enable_irqs_and_tqps(ndev);
netif_tx_wake_all_queues(ndev);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Hao Lan lanhao@huawei.com
[ Upstream commit e317aebeefcb3b0c71f2305af3c22871ca6b3833 ]
The size of the tm_qset file of debugfs is limited to 64 KB, which is too small in the scenario with 1280 qsets. The size needs to be expanded to 1 MB.
Fixes: 5e69ea7ee2a6 ("net: hns3: refactor the debugfs process") Signed-off-by: Hao Lan lanhao@huawei.com Signed-off-by: Peiyang Wang wangpeiyang1@huawei.com Signed-off-by: Jijie Shao shaojijie@huawei.com Link: https://patch.msgid.link/20250430093052.2400464-4-shaojijie@huawei.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c b/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c index bd801e35d51ea..d6fe09ca03d27 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_debugfs.c @@ -60,7 +60,7 @@ static struct hns3_dbg_cmd_info hns3_dbg_cmd[] = { .name = "tm_qset", .cmd = HNAE3_DBG_CMD_TM_QSET, .dentry = HNS3_DBG_DENTRY_TM, - .buf_len = HNS3_DBG_READ_LEN, + .buf_len = HNS3_DBG_READ_LEN_1MB, .init = hns3_dbg_common_file_init, }, {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jian Shen shenjian15@huawei.com
[ Upstream commit 4971394d9d624f91689d766f31ce668d169d9959 ]
Currently the ptp_clock_register() is called before relative ptp resource ready. It may cause unexpected result when upper layer called the ptp API during the timewindow. Fix it by moving the ptp_clock_register() to the function end.
Fixes: 0bf5eb788512 ("net: hns3: add support for PTP") Signed-off-by: Jian Shen shenjian15@huawei.com Signed-off-by: Jijie Shao shaojijie@huawei.com Reviewed-by: Vadim Fedorenko vadim.fedorenko@linux.dev Link: https://patch.msgid.link/20250430093052.2400464-5-shaojijie@huawei.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- .../net/ethernet/hisilicon/hns3/hns3pf/hclge_ptp.c | 13 +++++++------ 1 file changed, 7 insertions(+), 6 deletions(-)
diff --git a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_ptp.c b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_ptp.c index 4d4cea1f50157..b7cf9fbf97183 100644 --- a/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_ptp.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3pf/hclge_ptp.c @@ -452,6 +452,13 @@ static int hclge_ptp_create_clock(struct hclge_dev *hdev) ptp->info.settime64 = hclge_ptp_settime;
ptp->info.n_alarm = 0; + + spin_lock_init(&ptp->lock); + ptp->io_base = hdev->hw.hw.io_base + HCLGE_PTP_REG_OFFSET; + ptp->ts_cfg.rx_filter = HWTSTAMP_FILTER_NONE; + ptp->ts_cfg.tx_type = HWTSTAMP_TX_OFF; + hdev->ptp = ptp; + ptp->clock = ptp_clock_register(&ptp->info, &hdev->pdev->dev); if (IS_ERR(ptp->clock)) { dev_err(&hdev->pdev->dev, @@ -463,12 +470,6 @@ static int hclge_ptp_create_clock(struct hclge_dev *hdev) return -ENODEV; }
- spin_lock_init(&ptp->lock); - ptp->io_base = hdev->hw.hw.io_base + HCLGE_PTP_REG_OFFSET; - ptp->ts_cfg.rx_filter = HWTSTAMP_FILTER_NONE; - ptp->ts_cfg.tx_type = HWTSTAMP_TX_OFF; - hdev->ptp = ptp; - return 0; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Richard Zhu hongxing.zhu@nxp.com
commit f068ffdd034c93f0c768acdc87d4d2d7023c1379 upstream.
The i.MX7D only has one PCIe controller, so controller_id should always be 0. The previous code is incorrect although yielding the correct result.
Fix by removing "IMX7D" from the switch case branch.
Fixes: 2d8ed461dbc9 ("PCI: imx6: Add support for i.MX8MQ") Link: https://lore.kernel.org/r/20241126075702.4099164-5-hongxing.zhu@nxp.com Signed-off-by: Richard Zhu hongxing.zhu@nxp.com Signed-off-by: Krzysztof Wilczyński kwilczynski@kernel.org Signed-off-by: Bjorn Helgaas bhelgaas@google.com Reviewed-by: Manivannan Sadhasivam manivannan.sadhasivam@linaro.org Reviewed-by: Frank Li Frank.Li@nxp.com [Because this switch case does more than just controller_id logic, move the "IMX7D" case label instead of removing it entirely.] Signed-off-by: Ryan Matthews ryanmatthews@fastmail.com Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/pci/controller/dwc/pci-imx6.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-)
--- a/drivers/pci/controller/dwc/pci-imx6.c +++ b/drivers/pci/controller/dwc/pci-imx6.c @@ -1070,11 +1070,10 @@ static int imx6_pcie_probe(struct platfo if (IS_ERR(imx6_pcie->pcie_aux)) return dev_err_probe(dev, PTR_ERR(imx6_pcie->pcie_aux), "pcie_aux clock source missing or invalid\n"); - fallthrough; - case IMX7D: if (dbi_base->start == IMX8MQ_PCIE2_BASE_ADDR) imx6_pcie->controller_id = 1; - + fallthrough; + case IMX7D: imx6_pcie->pciephy_reset = devm_reset_control_get_exclusive(dev, "pciephy"); if (IS_ERR(imx6_pcie->pciephy_reset)) {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sergey Shtylyov s.shtylyov@omp.ru
commit cf7385cb26ac4f0ee6c7385960525ad534323252 upstream.
In of_modalias(), if the buffer happens to be too small even for the 1st snprintf() call, the len parameter will become negative and str parameter (if not NULL initially) will point beyond the buffer's end. Add the buffer overflow check after the 1st snprintf() call and fix such check after the strlen() call (accounting for the terminating NUL char).
Fixes: bc575064d688 ("of/device: use of_property_for_each_string to parse compatible strings") Signed-off-by: Sergey Shtylyov s.shtylyov@omp.ru Link: https://lore.kernel.org/r/bbfc6be0-c687-62b6-d015-5141b93f313e@omp.ru Signed-off-by: Rob Herring robh@kernel.org Signed-off-by: "Uwe Kleine-König" ukleinek@debian.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/of/device.c | 7 ++++--- 1 file changed, 4 insertions(+), 3 deletions(-)
--- a/drivers/of/device.c +++ b/drivers/of/device.c @@ -257,14 +257,15 @@ static ssize_t of_device_get_modalias(st csize = snprintf(str, len, "of:N%pOFn%c%s", dev->of_node, 'T', of_node_get_device_type(dev->of_node)); tsize = csize; + if (csize >= len) + csize = len > 0 ? len - 1 : 0; len -= csize; - if (str) - str += csize; + str += csize;
of_property_for_each_string(dev->of_node, "compatible", p, compat) { csize = strlen(compat) + 1; tsize += csize; - if (csize > len) + if (csize >= len) continue;
csize = snprintf(str, len, "C%s", compat);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Yonglong Liu liuyonglong@huawei.com
commit ac6257a3ae5db5193b1f19c268e4f72d274ddb88 upstream.
When externel_lb and reset are executed together, a deadlock may occur: [ 3147.217009] INFO: task kworker/u321:0:7 blocked for more than 120 seconds. [ 3147.230483] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 3147.238999] task:kworker/u321:0 state:D stack: 0 pid: 7 ppid: 2 flags:0x00000008 [ 3147.248045] Workqueue: hclge hclge_service_task [hclge] [ 3147.253957] Call trace: [ 3147.257093] __switch_to+0x7c/0xbc [ 3147.261183] __schedule+0x338/0x6f0 [ 3147.265357] schedule+0x50/0xe0 [ 3147.269185] schedule_preempt_disabled+0x18/0x24 [ 3147.274488] __mutex_lock.constprop.0+0x1d4/0x5dc [ 3147.279880] __mutex_lock_slowpath+0x1c/0x30 [ 3147.284839] mutex_lock+0x50/0x60 [ 3147.288841] rtnl_lock+0x20/0x2c [ 3147.292759] hclge_reset_prepare+0x68/0x90 [hclge] [ 3147.298239] hclge_reset_subtask+0x88/0xe0 [hclge] [ 3147.303718] hclge_reset_service_task+0x84/0x120 [hclge] [ 3147.309718] hclge_service_task+0x2c/0x70 [hclge] [ 3147.315109] process_one_work+0x1d0/0x490 [ 3147.319805] worker_thread+0x158/0x3d0 [ 3147.324240] kthread+0x108/0x13c [ 3147.328154] ret_from_fork+0x10/0x18
In externel_lb process, the hns3 driver call napi_disable() first, then the reset happen, then the restore process of the externel_lb will fail, and will not call napi_enable(). When doing externel_lb again, napi_disable() will be double call, cause a deadlock of rtnl_lock().
This patch use the HNS3_NIC_STATE_DOWN state to protect the calling of napi_disable() and napi_enable() in externel_lb process, just as the usage in ndo_stop() and ndo_start().
Fixes: 04b6ba143521 ("net: hns3: add support for external loopback test") Signed-off-by: Yonglong Liu liuyonglong@huawei.com Signed-off-by: Jijie Shao shaojijie@huawei.com Reviewed-by: Leon Romanovsky leonro@nvidia.com Link: https://lore.kernel.org/r/20230807113452.474224-5-shaojijie@huawei.com Signed-off-by: Jakub Kicinski kuba@kernel.org Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org --- drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-)
--- a/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c +++ b/drivers/net/ethernet/hisilicon/hns3/hns3_enet.c @@ -5656,6 +5656,9 @@ void hns3_external_lb_prepare(struct net if (!if_running) return;
+ if (test_and_set_bit(HNS3_NIC_STATE_DOWN, &priv->state)) + return; + netif_carrier_off(ndev); netif_tx_disable(ndev);
@@ -5679,7 +5682,16 @@ void hns3_external_lb_restore(struct net if (!if_running) return;
- hns3_nic_reset_all_ring(priv->ae_handle); + if (hns3_nic_resetting(ndev)) + return; + + if (!test_bit(HNS3_NIC_STATE_DOWN, &priv->state)) + return; + + if (hns3_nic_reset_all_ring(priv->ae_handle)) + return; + + clear_bit(HNS3_NIC_STATE_DOWN, &priv->state);
hns3_enable_irqs_and_tqps(ndev);
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Cristian Marussi cristian.marussi@arm.com
[ Upstream commit 9ca67840c0ddf3f39407339624cef824a4f27599 ]
Using device_find_child() to lookup the proper SCMI device to destroy causes an unbalance in device refcount, since device_find_child() calls an implicit get_device(): this, in turns, inhibits the call of the provided release methods upon devices destruction.
As a consequence, one of the structures that is not freed properly upon destruction is the internal struct device_private dev->p populated by the drivers subsystem core.
KMemleak detects this situation since loading/unloding some SCMI driver causes related devices to be created/destroyed without calling any device_release method.
unreferenced object 0xffff00000f583800 (size 512): comm "insmod", pid 227, jiffies 4294912190 hex dump (first 32 bytes): 00 00 00 00 ad 4e ad de ff ff ff ff 00 00 00 00 .....N.......... ff ff ff ff ff ff ff ff 60 36 1d 8a 00 80 ff ff ........`6...... backtrace (crc 114e2eed): kmemleak_alloc+0xbc/0xd8 __kmalloc_cache_noprof+0x2dc/0x398 device_add+0x954/0x12d0 device_register+0x28/0x40 __scmi_device_create.part.0+0x1bc/0x380 scmi_device_create+0x2d0/0x390 scmi_create_protocol_devices+0x74/0xf8 scmi_device_request_notifier+0x1f8/0x2a8 notifier_call_chain+0x110/0x3b0 blocking_notifier_call_chain+0x70/0xb0 scmi_driver_register+0x350/0x7f0 0xffff80000a3b3038 do_one_initcall+0x12c/0x730 do_init_module+0x1dc/0x640 load_module+0x4b20/0x5b70 init_module_from_file+0xec/0x158
$ ./scripts/faddr2line ./vmlinux device_add+0x954/0x12d0 device_add+0x954/0x12d0: kmalloc_noprof at include/linux/slab.h:901 (inlined by) kzalloc_noprof at include/linux/slab.h:1037 (inlined by) device_private_init at drivers/base/core.c:3510 (inlined by) device_add at drivers/base/core.c:3561
Balance device refcount by issuing a put_device() on devices found via device_find_child().
Reported-by: Alice Ryhl aliceryhl@google.com Closes: https://lore.kernel.org/linux-arm-kernel/Z8nK3uFkspy61yjP@arm.com/T/#mc1f73a... CC: Sudeep Holla sudeep.holla@arm.com CC: Catalin Marinas catalin.marinas@arm.com Fixes: d4f9dddd21f3 ("firmware: arm_scmi: Add dynamic scmi devices creation") Signed-off-by: Cristian Marussi cristian.marussi@arm.com Tested-by: Alice Ryhl aliceryhl@google.com Message-Id: 20250306185447.2039336-1-cristian.marussi@arm.com Signed-off-by: Sudeep Holla sudeep.holla@arm.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/firmware/arm_scmi/bus.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/firmware/arm_scmi/bus.c b/drivers/firmware/arm_scmi/bus.c index 7c1c0951e562d..758ced6a8cc4e 100644 --- a/drivers/firmware/arm_scmi/bus.c +++ b/drivers/firmware/arm_scmi/bus.c @@ -73,6 +73,9 @@ struct scmi_device *scmi_child_dev_find(struct device *parent, if (!dev) return NULL;
+ /* Drop the refcnt bumped implicitly by device_find_child */ + put_device(dev); + return to_scmi_dev(dev); }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Sébastien Szymanski sebastien.szymanski@armadeus.com
[ Upstream commit 6e1a7bc8382b0d4208258f7d2a4474fae788dd90 ]
Commit c7e73b5051d6 ("ARM: imx: mach-imx6ul: remove 14x14 EVK specific PHY fixup") removed a PHY fixup that setted the clock mode and the LED mode. Make the Ethernet interface work again by doing as advised in the commit's log, set clock mode and the LED mode in the device tree.
Fixes: c7e73b5051d6 ("ARM: imx: mach-imx6ul: remove 14x14 EVK specific PHY fixup") Signed-off-by: Sébastien Szymanski sebastien.szymanski@armadeus.com Reviewed-by: Oleksij Rempel o.rempel@pengutronix.de Signed-off-by: Shawn Guo shawnguo@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- arch/arm/boot/dts/imx6ul-imx6ull-opos6ul.dtsi | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/arch/arm/boot/dts/imx6ul-imx6ull-opos6ul.dtsi b/arch/arm/boot/dts/imx6ul-imx6ull-opos6ul.dtsi index f2386dcb9ff2c..dda4fa91b2f2c 100644 --- a/arch/arm/boot/dts/imx6ul-imx6ull-opos6ul.dtsi +++ b/arch/arm/boot/dts/imx6ul-imx6ull-opos6ul.dtsi @@ -40,6 +40,9 @@ reg = <1>; interrupt-parent = <&gpio4>; interrupts = <16 IRQ_TYPE_LEVEL_LOW>; + micrel,led-mode = <1>; + clocks = <&clks IMX6UL_CLK_ENET_REF>; + clock-names = "rmii-ref"; status = "okay"; }; };
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Fiona Klute fiona.klute@gmx.de
[ Upstream commit 30a41ed32d3088cd0d682a13d7f30b23baed7e93 ]
With lan88xx based devices the lan78xx driver can get stuck in an interrupt loop while bringing the device up, flooding the kernel log with messages like the following:
lan78xx 2-3:1.0 enp1s0u3: kevent 4 may have been dropped
Removing interrupt support from the lan88xx PHY driver forces the driver to use polling instead, which avoids the problem.
The issue has been observed with Raspberry Pi devices at least since 4.14 (see [1], bug report for their downstream kernel), as well as with Nvidia devices [2] in 2020, where disabling interrupts was the vendor-suggested workaround (together with the claim that phylib changes in 4.9 made the interrupt handling in lan78xx incompatible).
Iperf reports well over 900Mbits/sec per direction with client in --dualtest mode, so there does not seem to be a significant impact on throughput (lan88xx device connected via switch to the peer).
[1] https://github.com/raspberrypi/linux/issues/2447 [2] https://forums.developer.nvidia.com/t/jetson-xavier-and-lan7800-problem/1421...
Link: https://lore.kernel.org/0901d90d-3f20-4a10-b680-9c978e04ddda@lunn.ch Fixes: 792aec47d59d ("add microchip LAN88xx phy driver") Signed-off-by: Fiona Klute fiona.klute@gmx.de Cc: kernel-list@raspberrypi.com Cc: stable@vger.kernel.org Reviewed-by: Andrew Lunn andrew@lunn.ch Link: https://patch.msgid.link/20250416102413.30654-1-fiona.klute@gmx.de Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/net/phy/microchip.c | 46 +++---------------------------------- 1 file changed, 3 insertions(+), 43 deletions(-)
diff --git a/drivers/net/phy/microchip.c b/drivers/net/phy/microchip.c index 230f2fcf9c46a..7c8bcec0a8fab 100644 --- a/drivers/net/phy/microchip.c +++ b/drivers/net/phy/microchip.c @@ -31,47 +31,6 @@ static int lan88xx_write_page(struct phy_device *phydev, int page) return __phy_write(phydev, LAN88XX_EXT_PAGE_ACCESS, page); }
-static int lan88xx_phy_config_intr(struct phy_device *phydev) -{ - int rc; - - if (phydev->interrupts == PHY_INTERRUPT_ENABLED) { - /* unmask all source and clear them before enable */ - rc = phy_write(phydev, LAN88XX_INT_MASK, 0x7FFF); - rc = phy_read(phydev, LAN88XX_INT_STS); - rc = phy_write(phydev, LAN88XX_INT_MASK, - LAN88XX_INT_MASK_MDINTPIN_EN_ | - LAN88XX_INT_MASK_LINK_CHANGE_); - } else { - rc = phy_write(phydev, LAN88XX_INT_MASK, 0); - if (rc) - return rc; - - /* Ack interrupts after they have been disabled */ - rc = phy_read(phydev, LAN88XX_INT_STS); - } - - return rc < 0 ? rc : 0; -} - -static irqreturn_t lan88xx_handle_interrupt(struct phy_device *phydev) -{ - int irq_status; - - irq_status = phy_read(phydev, LAN88XX_INT_STS); - if (irq_status < 0) { - phy_error(phydev); - return IRQ_NONE; - } - - if (!(irq_status & LAN88XX_INT_STS_LINK_CHANGE_)) - return IRQ_NONE; - - phy_trigger_machine(phydev); - - return IRQ_HANDLED; -} - static int lan88xx_suspend(struct phy_device *phydev) { struct lan88xx_priv *priv = phydev->priv; @@ -388,8 +347,9 @@ static struct phy_driver microchip_phy_driver[] = { .config_aneg = lan88xx_config_aneg, .link_change_notify = lan88xx_link_change_notify,
- .config_intr = lan88xx_phy_config_intr, - .handle_interrupt = lan88xx_handle_interrupt, + /* Interrupt handling is broken, do not define related + * functions to force polling. + */
.suspend = lan88xx_suspend, .resume = genphy_resume,
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Christian Hewitt christianshewitt@gmail.com
[ Upstream commit f37bb5486ea536c1d61df89feeaeff3f84f0b560 ]
This reverts commit bfbc68e.
The patch does permit the offending YUV420 @ 59.94 phy_freq and vclk_freq mode to match in calculations. It also results in all fractional rates being unavailable for use. This was unintended and requires the patch to be reverted.
Fixes: bfbc68e4d869 ("drm/meson: vclk: fix calculation of 59.94 fractional rates") Cc: stable@vger.kernel.org Signed-off-by: Christian Hewitt christianshewitt@gmail.com Signed-off-by: Martin Blumenstingl martin.blumenstingl@googlemail.com Reviewed-by: Neil Armstrong neil.armstrong@linaro.org Link: https://lore.kernel.org/r/20250421201300.778955-2-martin.blumenstingl@google... Signed-off-by: Neil Armstrong neil.armstrong@linaro.org Link: https://lore.kernel.org/r/20250421201300.778955-2-martin.blumenstingl@google... Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/gpu/drm/meson/meson_vclk.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/meson/meson_vclk.c b/drivers/gpu/drm/meson/meson_vclk.c index 2a942dc6a6dc2..2a82119eb58ed 100644 --- a/drivers/gpu/drm/meson/meson_vclk.c +++ b/drivers/gpu/drm/meson/meson_vclk.c @@ -790,13 +790,13 @@ meson_vclk_vic_supported_freq(struct meson_drm *priv, unsigned int phy_freq, FREQ_1000_1001(params[i].pixel_freq)); DRM_DEBUG_DRIVER("i = %d phy_freq = %d alt = %d\n", i, params[i].phy_freq, - FREQ_1000_1001(params[i].phy_freq/1000)*1000); + FREQ_1000_1001(params[i].phy_freq/10)*10); /* Match strict frequency */ if (phy_freq == params[i].phy_freq && vclk_freq == params[i].vclk_freq) return MODE_OK; /* Match 1000/1001 variant */ - if (phy_freq == (FREQ_1000_1001(params[i].phy_freq/1000)*1000) && + if (phy_freq == (FREQ_1000_1001(params[i].phy_freq/10)*10) && vclk_freq == FREQ_1000_1001(params[i].vclk_freq)) return MODE_OK; } @@ -1070,7 +1070,7 @@ void meson_vclk_setup(struct meson_drm *priv, unsigned int target,
for (freq = 0 ; params[freq].pixel_freq ; ++freq) { if ((phy_freq == params[freq].phy_freq || - phy_freq == FREQ_1000_1001(params[freq].phy_freq/1000)*1000) && + phy_freq == FREQ_1000_1001(params[freq].phy_freq/10)*10) && (vclk_freq == params[freq].vclk_freq || vclk_freq == FREQ_1000_1001(params[freq].vclk_freq))) { if (vclk_freq != params[freq].vclk_freq)
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Xiang wangx wangxiang@cdjrlc.com
[ Upstream commit c10f2f8b5d8027c1ea77f777f2d16cb9043a6c09 ]
struct of_device_id should normally be const.
Signed-off-by: Xiang wangx wangxiang@cdjrlc.com Signed-off-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20211209132453.25623-1-wangxiang@cdjrlc.com Stable-dep-of: 3318dc299b07 ("irqchip/gic-v2m: Prevent use after free of gicv2m_get_fwnode()") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/irqchip/irq-gic-v2m.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/irqchip/irq-gic-v2m.c b/drivers/irqchip/irq-gic-v2m.c index 0e57c60681aab..48e2eed33f8fa 100644 --- a/drivers/irqchip/irq-gic-v2m.c +++ b/drivers/irqchip/irq-gic-v2m.c @@ -405,7 +405,7 @@ static int __init gicv2m_init_one(struct fwnode_handle *fwnode, return ret; }
-static struct of_device_id gicv2m_device_id[] = { +static const struct of_device_id gicv2m_device_id[] = { { .compatible = "arm,gic-v2m-frame", }, {}, };
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Thomas Gleixner tglx@linutronix.de
[ Upstream commit d51a15af37ce8cf59e73de51dcdce3c9f4944974 ]
They are all part of the init sequence.
Signed-off-by: Thomas Gleixner tglx@linutronix.de Acked-by: Marc Zyngier maz@kernel.org Link: https://lore.kernel.org/r/20221121140048.534395323@linutronix.de Stable-dep-of: 3318dc299b07 ("irqchip/gic-v2m: Prevent use after free of gicv2m_get_fwnode()") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/irqchip/irq-gic-v2m.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/irqchip/irq-gic-v2m.c b/drivers/irqchip/irq-gic-v2m.c index 48e2eed33f8fa..6790a621a9324 100644 --- a/drivers/irqchip/irq-gic-v2m.c +++ b/drivers/irqchip/irq-gic-v2m.c @@ -263,7 +263,7 @@ static struct msi_domain_info gicv2m_pmsi_domain_info = { .chip = &gicv2m_pmsi_irq_chip, };
-static void gicv2m_teardown(void) +static void __init gicv2m_teardown(void) { struct v2m_data *v2m, *tmp;
@@ -278,7 +278,7 @@ static void gicv2m_teardown(void) } }
-static int gicv2m_allocate_domains(struct irq_domain *parent) +static __init int gicv2m_allocate_domains(struct irq_domain *parent) { struct irq_domain *inner_domain, *pci_domain, *plat_domain; struct v2m_data *v2m; @@ -405,7 +405,7 @@ static int __init gicv2m_init_one(struct fwnode_handle *fwnode, return ret; }
-static const struct of_device_id gicv2m_device_id[] = { +static __initconst struct of_device_id gicv2m_device_id[] = { { .compatible = "arm,gic-v2m-frame", }, {}, }; @@ -455,7 +455,7 @@ static int __init gicv2m_of_init(struct fwnode_handle *parent_handle, #ifdef CONFIG_ACPI static int acpi_num_msi;
-static struct fwnode_handle *gicv2m_get_fwnode(struct device *dev) +static __init struct fwnode_handle *gicv2m_get_fwnode(struct device *dev) { struct v2m_data *data;
@@ -470,7 +470,7 @@ static struct fwnode_handle *gicv2m_get_fwnode(struct device *dev) return data->fwnode; }
-static bool acpi_check_amazon_graviton_quirks(void) +static __init bool acpi_check_amazon_graviton_quirks(void) { static struct acpi_table_madt *madt; acpi_status status;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Suzuki K Poulose suzuki.poulose@arm.com
[ Upstream commit 3318dc299b072a0511d6dfd8367f3304fb6d9827 ]
With ACPI in place, gicv2m_get_fwnode() is registered with the pci subsystem as pci_msi_get_fwnode_cb(), which may get invoked at runtime during a PCI host bridge probe. But, the call back is wrongly marked as __init, causing it to be freed, while being registered with the PCI subsystem and could trigger:
Unable to handle kernel paging request at virtual address ffff8000816c0400 gicv2m_get_fwnode+0x0/0x58 (P) pci_set_bus_msi_domain+0x74/0x88 pci_register_host_bridge+0x194/0x548
This is easily reproducible on a Juno board with ACPI boot.
Retain the function for later use.
Fixes: 0644b3daca28 ("irqchip/gic-v2m: acpi: Introducing GICv2m ACPI support") Signed-off-by: Suzuki K Poulose suzuki.poulose@arm.com Signed-off-by: Thomas Gleixner tglx@linutronix.de Signed-off-by: Ingo Molnar mingo@kernel.org Reviewed-by: Marc Zyngier maz@kernel.org Cc: stable@vger.kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/irqchip/irq-gic-v2m.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/irqchip/irq-gic-v2m.c b/drivers/irqchip/irq-gic-v2m.c index 6790a621a9324..9d99b19cd21b6 100644 --- a/drivers/irqchip/irq-gic-v2m.c +++ b/drivers/irqchip/irq-gic-v2m.c @@ -455,7 +455,7 @@ static int __init gicv2m_of_init(struct fwnode_handle *parent_handle, #ifdef CONFIG_ACPI static int acpi_num_msi;
-static __init struct fwnode_handle *gicv2m_get_fwnode(struct device *dev) +static struct fwnode_handle *gicv2m_get_fwnode(struct device *dev) { struct v2m_data *data;
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Stephan Gerhold stephan.gerhold@linaro.org
[ Upstream commit 7094832b5ac861b0bd7ed8866c93cb15ef619996 ]
The MSM UART DM controller supports different working modes, e.g. DMA or the "single-character mode", where all reads/writes operate on a single character rather than 4 chars (32-bit) at once. When using earlycon, __msm_console_write() always writes 4 characters at a time, but we don't know which mode the bootloader was using and we don't set the mode either.
This causes garbled output if the bootloader was using the single-character mode, because only every 4th character appears in the serial console, e.g.
"[ 00oni pi 000xf0[ 00i s 5rm9(l)l s 1 1 SPMTA 7:C 5[ 00A ade k d[ 00ano:ameoi .Q1B[ 00ac _idaM00080oo'"
If the bootloader was using the DMA ("DM") mode, output would likely fail entirely. Later, when the full serial driver probes, the port is re-initialized and output works as expected.
Fix this also for earlycon by clearing the DMEN register and reset+re-enable the transmitter to apply the change. This ensures the transmitter is in the expected state before writing any output.
Cc: stable stable@kernel.org Fixes: 0efe72963409 ("tty: serial: msm: Add earlycon support") Signed-off-by: Stephan Gerhold stephan.gerhold@linaro.org Reviewed-by: Neil Armstrong neil.armstrong@linaro.org Link: https://lore.kernel.org/r/20250408-msm-serial-earlycon-v1-1-429080127530@lin... Signed-off-by: Greg Kroah-Hartman gregkh@linuxfoundation.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/tty/serial/msm_serial.c | 6 ++++++ 1 file changed, 6 insertions(+)
diff --git a/drivers/tty/serial/msm_serial.c b/drivers/tty/serial/msm_serial.c index 03ff63438e772..9740bc301cc27 100644 --- a/drivers/tty/serial/msm_serial.c +++ b/drivers/tty/serial/msm_serial.c @@ -1732,6 +1732,12 @@ msm_serial_early_console_setup_dm(struct earlycon_device *device, if (!device->port.membase) return -ENODEV;
+ /* Disable DM / single-character modes */ + msm_write(&device->port, 0, UARTDM_DMEN); + msm_write(&device->port, MSM_UART_CR_CMD_RESET_RX, MSM_UART_CR); + msm_write(&device->port, MSM_UART_CR_CMD_RESET_TX, MSM_UART_CR); + msm_write(&device->port, MSM_UART_CR_TX_ENABLE, MSM_UART_CR); + device->con->write = msm_serial_early_write_dm; return 0; }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Björn Töpel bjorn@rivosinc.com
[ Upstream commit 7d1d19a11cfbfd8bae1d89cc010b2cc397cd0c48 ]
The XOL (execute out-of-line) buffer is used to single-step the replaced instruction(s) for uprobes. The RISC-V port was missing a proper fence.i (i$ flushing) after constructing the XOL buffer, which can result in incorrect execution of stale/broken instructions.
This was found running the BPF selftests "test_progs: uprobe_autoattach, attach_probe" on the Spacemit K1/X60, where the uprobes tests randomly blew up.
Reviewed-by: Guo Ren guoren@kernel.org Fixes: 74784081aac8 ("riscv: Add uprobes supported") Signed-off-by: Björn Töpel bjorn@rivosinc.com Link: https://lore.kernel.org/r/20250419111402.1660267-2-bjorn@kernel.org Signed-off-by: Palmer Dabbelt palmer@rivosinc.com Signed-off-by: Sasha Levin sashal@kernel.org --- arch/riscv/kernel/probes/uprobes.c | 10 ++-------- 1 file changed, 2 insertions(+), 8 deletions(-)
diff --git a/arch/riscv/kernel/probes/uprobes.c b/arch/riscv/kernel/probes/uprobes.c index 194f166b2cc40..0d18ee53fd649 100644 --- a/arch/riscv/kernel/probes/uprobes.c +++ b/arch/riscv/kernel/probes/uprobes.c @@ -161,6 +161,7 @@ void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr, /* Initialize the slot */ void *kaddr = kmap_atomic(page); void *dst = kaddr + (vaddr & ~PAGE_MASK); + unsigned long start = (unsigned long)dst;
memcpy(dst, src, len);
@@ -170,13 +171,6 @@ void arch_uprobe_copy_ixol(struct page *page, unsigned long vaddr, *(uprobe_opcode_t *)dst = __BUG_INSN_32; }
+ flush_icache_range(start, start + len); kunmap_atomic(kaddr); - - /* - * We probably need flush_icache_user_page() but it needs vma. - * This should work on most of architectures by default. If - * architecture needs to do something different it can define - * its own version of the function. - */ - flush_dcache_page(page); }
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Jason Gunthorpe jgg@nvidia.com
[ Upstream commit a2bb820e862d61f9ca1499e500915f9f505a2655 ]
Since v5.12 the rbtree has gained some simplifying helpers aimed at making rb tree users write less convoluted boiler plate code. Instead the caller provides a single comparison function and the helpers generate the prior open-coded stuff.
Update smmu->streams to use rb_find_add() and rb_find().
Tested-by: Nicolin Chen nicolinc@nvidia.com Reviewed-by: Mostafa Saleh smostafa@google.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Link: https://lore.kernel.org/r/1-v3-9fef8cdc2ff6+150d1-smmuv3_tidy_jgg@nvidia.com Signed-off-by: Will Deacon will@kernel.org Stable-dep-of: b00d24997a11 ("iommu/arm-smmu-v3: Fix iommu_device_probe bug due to duplicated stream ids") Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 68 ++++++++++----------- 1 file changed, 31 insertions(+), 37 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index ec4c87095c6cd..fc07ecce426ef 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -1430,26 +1430,37 @@ static int arm_smmu_init_l2_strtab(struct arm_smmu_device *smmu, u32 sid) return 0; }
+static int arm_smmu_streams_cmp_key(const void *lhs, const struct rb_node *rhs) +{ + struct arm_smmu_stream *stream_rhs = + rb_entry(rhs, struct arm_smmu_stream, node); + const u32 *sid_lhs = lhs; + + if (*sid_lhs < stream_rhs->id) + return -1; + if (*sid_lhs > stream_rhs->id) + return 1; + return 0; +} + +static int arm_smmu_streams_cmp_node(struct rb_node *lhs, + const struct rb_node *rhs) +{ + return arm_smmu_streams_cmp_key( + &rb_entry(lhs, struct arm_smmu_stream, node)->id, rhs); +} + static struct arm_smmu_master * arm_smmu_find_master(struct arm_smmu_device *smmu, u32 sid) { struct rb_node *node; - struct arm_smmu_stream *stream;
lockdep_assert_held(&smmu->streams_mutex);
- node = smmu->streams.rb_node; - while (node) { - stream = rb_entry(node, struct arm_smmu_stream, node); - if (stream->id < sid) - node = node->rb_right; - else if (stream->id > sid) - node = node->rb_left; - else - return stream->master; - } - - return NULL; + node = rb_find(&sid, &smmu->streams, arm_smmu_streams_cmp_key); + if (!node) + return NULL; + return rb_entry(node, struct arm_smmu_stream, node)->master; }
/* IRQ and event handlers */ @@ -2560,8 +2571,6 @@ static int arm_smmu_insert_master(struct arm_smmu_device *smmu, { int i; int ret = 0; - struct arm_smmu_stream *new_stream, *cur_stream; - struct rb_node **new_node, *parent_node = NULL; struct iommu_fwspec *fwspec = dev_iommu_fwspec_get(master->dev);
master->streams = kcalloc(fwspec->num_ids, sizeof(*master->streams), @@ -2572,9 +2581,9 @@ static int arm_smmu_insert_master(struct arm_smmu_device *smmu,
mutex_lock(&smmu->streams_mutex); for (i = 0; i < fwspec->num_ids; i++) { + struct arm_smmu_stream *new_stream = &master->streams[i]; u32 sid = fwspec->ids[i];
- new_stream = &master->streams[i]; new_stream->id = sid; new_stream->master = master;
@@ -2594,28 +2603,13 @@ static int arm_smmu_insert_master(struct arm_smmu_device *smmu, }
/* Insert into SID tree */ - new_node = &(smmu->streams.rb_node); - while (*new_node) { - cur_stream = rb_entry(*new_node, struct arm_smmu_stream, - node); - parent_node = *new_node; - if (cur_stream->id > new_stream->id) { - new_node = &((*new_node)->rb_left); - } else if (cur_stream->id < new_stream->id) { - new_node = &((*new_node)->rb_right); - } else { - dev_warn(master->dev, - "stream %u already in tree\n", - cur_stream->id); - ret = -EINVAL; - break; - } - } - if (ret) + if (rb_find_add(&new_stream->node, &smmu->streams, + arm_smmu_streams_cmp_node)) { + dev_warn(master->dev, "stream %u already in tree\n", + sid); + ret = -EINVAL; break; - - rb_link_node(&new_stream->node, parent_node, new_node); - rb_insert_color(&new_stream->node, &smmu->streams); + } }
if (ret) {
5.15-stable review patch. If anyone has any objections, please let me know.
------------------
From: Nicolin Chen nicolinc@nvidia.com
[ Upstream commit b00d24997a11c10d3e420614f0873b83ce358a34 ]
ASPEED VGA card has two built-in devices: 0008:06:00.0 PCI bridge: ASPEED Technology, Inc. AST1150 PCI-to-PCI Bridge (rev 06) 0008:07:00.0 VGA compatible controller: ASPEED Technology, Inc. ASPEED Graphics Family (rev 52)
Its toplogy looks like this: +-[0008:00]---00.0-[01-09]--+-00.0-[02-09]--+-00.0-[03]----00.0 Sandisk Corp Device 5017 | +-01.0-[04]-- | +-02.0-[05]----00.0 NVIDIA Corporation Device | +-03.0-[06-07]----00.0-[07]----00.0 ASPEED Technology, Inc. ASPEED Graphics Family | +-04.0-[08]----00.0 Renesas Technology Corp. uPD720201 USB 3.0 Host Controller | -05.0-[09]----00.0 Realtek Semiconductor Co., Ltd. RTL8111/8168/8411 PCI Express Gigabit Ethernet Controller -00.1 PMC-Sierra Inc. Device 4028
The IORT logic populaties two identical IDs into the fwspec->ids array via DMA aliasing in iort_pci_iommu_init() called by pci_for_each_dma_alias().
Though the SMMU driver had been able to handle this situation since commit 563b5cbe334e ("iommu/arm-smmu-v3: Cope with duplicated Stream IDs"), that got broken by the later commit cdf315f907d4 ("iommu/arm-smmu-v3: Maintain a SID->device structure"), which ended up with allocating separate streams with the same stuffing.
On a kernel prior to v6.15-rc1, there has been an overlooked warning: pci 0008:07:00.0: vgaarb: setting as boot VGA device pci 0008:07:00.0: vgaarb: bridge control possible pci 0008:07:00.0: vgaarb: VGA device added: decodes=io+mem,owns=none,locks=none pcieport 0008:06:00.0: Adding to iommu group 14 ast 0008:07:00.0: stream 67328 already in tree <===== WARNING ast 0008:07:00.0: enabling device (0002 -> 0003) ast 0008:07:00.0: Using default configuration ast 0008:07:00.0: AST 2600 detected ast 0008:07:00.0: [drm] Using analog VGA ast 0008:07:00.0: [drm] dram MCLK=396 Mhz type=1 bus_width=16 [drm] Initialized ast 0.1.0 for 0008:07:00.0 on minor 0 ast 0008:07:00.0: [drm] fb0: astdrmfb frame buffer device
With v6.15-rc, since the commit bcb81ac6ae3c ("iommu: Get DT/ACPI parsing into the proper probe path"), the error returned with the warning is moved to the SMMU device probe flow: arm_smmu_probe_device+0x15c/0x4c0 __iommu_probe_device+0x150/0x4f8 probe_iommu_group+0x44/0x80 bus_for_each_dev+0x7c/0x100 bus_iommu_probe+0x48/0x1a8 iommu_device_register+0xb8/0x178 arm_smmu_device_probe+0x1350/0x1db0 which then fails the entire SMMU driver probe: pci 0008:06:00.0: Adding to iommu group 21 pci 0008:07:00.0: stream 67328 already in tree arm-smmu-v3 arm-smmu-v3.9.auto: Failed to register iommu arm-smmu-v3 arm-smmu-v3.9.auto: probe with driver arm-smmu-v3 failed with error -22
Since SMMU driver had been already expecting a potential duplicated Stream ID in arm_smmu_install_ste_for_dev(), change the arm_smmu_insert_master() routine to ignore a duplicated ID from the fwspec->sids array as well.
Note: this has been failing the iommu_device_probe() since 2021, although a recent iommu commit in v6.15-rc1 that moves iommu_device_probe() started to fail the SMMU driver probe. Since nobody has cared about DMA Alias support, leave that as it was but fix the fundamental iommu_device_probe() breakage.
Fixes: cdf315f907d4 ("iommu/arm-smmu-v3: Maintain a SID->device structure") Cc: stable@vger.kernel.org Suggested-by: Jason Gunthorpe jgg@nvidia.com Reviewed-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Nicolin Chen nicolinc@nvidia.com Link: https://lore.kernel.org/r/20250415185620.504299-1-nicolinc@nvidia.com Signed-off-by: Will Deacon will@kernel.org Signed-off-by: Sasha Levin sashal@kernel.org --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index fc07ecce426ef..bc65e7b4f0045 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2582,6 +2582,7 @@ static int arm_smmu_insert_master(struct arm_smmu_device *smmu, mutex_lock(&smmu->streams_mutex); for (i = 0; i < fwspec->num_ids; i++) { struct arm_smmu_stream *new_stream = &master->streams[i]; + struct rb_node *existing; u32 sid = fwspec->ids[i];
new_stream->id = sid; @@ -2603,10 +2604,20 @@ static int arm_smmu_insert_master(struct arm_smmu_device *smmu, }
/* Insert into SID tree */ - if (rb_find_add(&new_stream->node, &smmu->streams, - arm_smmu_streams_cmp_node)) { - dev_warn(master->dev, "stream %u already in tree\n", - sid); + existing = rb_find_add(&new_stream->node, &smmu->streams, + arm_smmu_streams_cmp_node); + if (existing) { + struct arm_smmu_master *existing_master = + rb_entry(existing, struct arm_smmu_stream, node) + ->master; + + /* Bridged PCI devices may end up with duplicated IDs */ + if (existing_master == master) + continue; + + dev_warn(master->dev, + "stream %u already in tree from dev %s\n", sid, + dev_name(existing_master->dev)); ret = -EINVAL; break; }
Hi!
This is the start of the stable review cycle for the 5.15.182 release. There are 55 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
We get build errors here:
CC drivers/scsi/libsas/sas_port.o 4415 drivers/tty/serial/msm_serial.c: In function 'msm_serial_early_console_setup_dm': 4416 drivers/tty/serial/msm_serial.c:1737:27: error: 'MSM_UART_CR_CMD_RESET_RX' undeclared (first use in this function); did you mean 'UART_CR_CMD_RESET_RX'? 4417 1737 | msm_write(&device->port, MSM_UART_CR_CMD_RESET_RX, MSM_UART_CR); 4418 | ^~~~~~~~~~~~~~~~~~~~~~~~ 4419 | UART_CR_CMD_RESET_RX
https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/jobs/9967131... https://gitlab.com/cip-project/cip-testing/linux-stable-rc-ci/-/pipelines/18...
Best regards, Pavel
Hi Greg,
On 07/05/2025 19:39, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.182 release. There are 55 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 09 May 2025 18:37:41 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.182-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
Pseudo-Shortlog of commits:
...
Stephan Gerhold stephan.gerhold@linaro.org serial: msm: Configure correct working mode before starting earlycon
The above commit is breaking the build for ARM64 and I am seeing the following build error ...
drivers/tty/serial/msm_serial.c: In function ‘msm_serial_early_console_setup_dm’: drivers/tty/serial/msm_serial.c:1737:34: error: ‘MSM_UART_CR_CMD_RESET_RX’ undeclared (first use in this function); did you mean ‘UART_CR_CMD_RESET_RX’? 1737 | msm_write(&device->port, MSM_UART_CR_CMD_RESET_RX, MSM_UART_CR); | ^~~~~~~~~~~~~~~~~~~~~~~~ | UART_CR_CMD_RESET_RX drivers/tty/serial/msm_serial.c:1737:34: note: each undeclared identifier is reported only once for each function it appears in drivers/tty/serial/msm_serial.c:1737:60: error: ‘MSM_UART_CR’ undeclared (first use in this function); did you mean ‘UART_CR’? 1737 | msm_write(&device->port, MSM_UART_CR_CMD_RESET_RX, MSM_UART_CR); | ^~~~~~~~~~~ | UART_CR drivers/tty/serial/msm_serial.c:1738:34: error: ‘MSM_UART_CR_CMD_RESET_TX’ undeclared (first use in this function); did you mean ‘UART_CR_CMD_RESET_TX’? 1738 | msm_write(&device->port, MSM_UART_CR_CMD_RESET_TX, MSM_UART_CR); | ^~~~~~~~~~~~~~~~~~~~~~~~ | UART_CR_CMD_RESET_TX CC drivers/ata/libata-transport.o drivers/tty/serial/msm_serial.c:1739:34: error: ‘MSM_UART_CR_TX_ENABLE’ undeclared (first use in this function); did you mean ‘UART_CR_TX_ENABLE’? 1739 | msm_write(&device->port, MSM_UART_CR_TX_ENABLE, MSM_UART_CR); | ^~~~~~~~~~~~~~~~~~~~~ | UART_CR_TX_ENABLE
After reverting this, the build is passing again.
Thanks! Jon
On Thu, May 08, 2025 at 10:44:45AM +0100, Jon Hunter wrote:
Hi Greg,
On 07/05/2025 19:39, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.182 release. There are 55 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 09 May 2025 18:37:41 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.182-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
Pseudo-Shortlog of commits:
...
Stephan Gerhold stephan.gerhold@linaro.org serial: msm: Configure correct working mode before starting earlycon
The above commit is breaking the build for ARM64 and I am seeing the following build error ...
drivers/tty/serial/msm_serial.c: In function ‘msm_serial_early_console_setup_dm’: drivers/tty/serial/msm_serial.c:1737:34: error: ‘MSM_UART_CR_CMD_RESET_RX’ undeclared (first use in this function); did you mean ‘UART_CR_CMD_RESET_RX’? 1737 | msm_write(&device->port, MSM_UART_CR_CMD_RESET_RX, MSM_UART_CR); | ^~~~~~~~~~~~~~~~~~~~~~~~ | UART_CR_CMD_RESET_RX drivers/tty/serial/msm_serial.c:1737:34: note: each undeclared identifier is reported only once for each function it appears in drivers/tty/serial/msm_serial.c:1737:60: error: ‘MSM_UART_CR’ undeclared (first use in this function); did you mean ‘UART_CR’? 1737 | msm_write(&device->port, MSM_UART_CR_CMD_RESET_RX, MSM_UART_CR); | ^~~~~~~~~~~ | UART_CR drivers/tty/serial/msm_serial.c:1738:34: error: ‘MSM_UART_CR_CMD_RESET_TX’ undeclared (first use in this function); did you mean ‘UART_CR_CMD_RESET_TX’? 1738 | msm_write(&device->port, MSM_UART_CR_CMD_RESET_TX, MSM_UART_CR); | ^~~~~~~~~~~~~~~~~~~~~~~~ | UART_CR_CMD_RESET_TX CC drivers/ata/libata-transport.o drivers/tty/serial/msm_serial.c:1739:34: error: ‘MSM_UART_CR_TX_ENABLE’ undeclared (first use in this function); did you mean ‘UART_CR_TX_ENABLE’? 1739 | msm_write(&device->port, MSM_UART_CR_TX_ENABLE, MSM_UART_CR); | ^~~~~~~~~~~~~~~~~~~~~ | UART_CR_TX_ENABLE
After reverting this, the build is passing again.
Thanks, commit is now dropped. Odd it didn't show up on my arm64 allmodconfig builds :(
greg k-h
On 08/05/2025 12:23, Greg Kroah-Hartman wrote:
On Thu, May 08, 2025 at 10:44:45AM +0100, Jon Hunter wrote:
Hi Greg,
On 07/05/2025 19:39, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.182 release. There are 55 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 09 May 2025 18:37:41 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.182-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
Pseudo-Shortlog of commits:
...
Stephan Gerhold stephan.gerhold@linaro.org serial: msm: Configure correct working mode before starting earlycon
The above commit is breaking the build for ARM64 and I am seeing the following build error ...
drivers/tty/serial/msm_serial.c: In function ‘msm_serial_early_console_setup_dm’: drivers/tty/serial/msm_serial.c:1737:34: error: ‘MSM_UART_CR_CMD_RESET_RX’ undeclared (first use in this function); did you mean ‘UART_CR_CMD_RESET_RX’? 1737 | msm_write(&device->port, MSM_UART_CR_CMD_RESET_RX, MSM_UART_CR); | ^~~~~~~~~~~~~~~~~~~~~~~~ | UART_CR_CMD_RESET_RX drivers/tty/serial/msm_serial.c:1737:34: note: each undeclared identifier is reported only once for each function it appears in drivers/tty/serial/msm_serial.c:1737:60: error: ‘MSM_UART_CR’ undeclared (first use in this function); did you mean ‘UART_CR’? 1737 | msm_write(&device->port, MSM_UART_CR_CMD_RESET_RX, MSM_UART_CR); | ^~~~~~~~~~~ | UART_CR drivers/tty/serial/msm_serial.c:1738:34: error: ‘MSM_UART_CR_CMD_RESET_TX’ undeclared (first use in this function); did you mean ‘UART_CR_CMD_RESET_TX’? 1738 | msm_write(&device->port, MSM_UART_CR_CMD_RESET_TX, MSM_UART_CR); | ^~~~~~~~~~~~~~~~~~~~~~~~ | UART_CR_CMD_RESET_TX CC drivers/ata/libata-transport.o drivers/tty/serial/msm_serial.c:1739:34: error: ‘MSM_UART_CR_TX_ENABLE’ undeclared (first use in this function); did you mean ‘UART_CR_TX_ENABLE’? 1739 | msm_write(&device->port, MSM_UART_CR_TX_ENABLE, MSM_UART_CR); | ^~~~~~~~~~~~~~~~~~~~~ | UART_CR_TX_ENABLE
After reverting this, the build is passing again.
Thanks, commit is now dropped. Odd it didn't show up on my arm64 allmodconfig builds :(
I saw this just building the stock ARM64 defconfig.
Jon
On 5/7/2025 8:39 PM, Greg Kroah-Hartman wrote:
This is the start of the stable review cycle for the 5.15.182 release. There are 55 patches in this series, all will be posted as a response to this one. If anyone has any issues with these being applied, please let me know.
Responses should be made by Fri, 09 May 2025 18:37:41 +0000. Anything received after that time might be too late.
The whole patch series can be found in one patch at: https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.15.182-rc... or in the git tree and branch at: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.15.y and the diffstat can be found below.
thanks,
greg k-h
On ARCH_BRCMSTB using 32-bit and 64-bit ARM kernels build tested on BMIPS_GENERIC:
Tested-by: Florian Fainelli florian.fainelli@broadcom.com
linux-stable-mirror@lists.linaro.org