May 2025 - Linux-stable-mirror

[REGRESSION] e1000e heavy packet loss on Meteor Lake - 6.14.2

by Marek Marczykowski-Górecki

Hi, After updating to 6.14.2, the ethernet adapter is almost unusable, I get over 30% packet loss. Bisect says it's this commit: commit 85f6414167da39e0da30bf370f1ecda5a58c6f7b Author: Vitaly Lifshits <vitaly.lifshits(a)intel.com> Date: Thu Mar 13 16:05:56 2025 +0200 e1000e: change k1 configuration on MTP and later platforms [ Upstream commit efaaf344bc2917cbfa5997633bc18a05d3aed27f ] Starting from Meteor Lake, the Kumeran interface between the integrated MAC and the I219 PHY works at a different frequency. This causes sporadic MDI errors when accessing the PHY, and in rare circumstances could lead to packet corruption. To overcome this, introduce minor changes to the Kumeran idle state (K1) parameters during device initialization. Hardware reset reverts this configuration, therefore it needs to be applied in a few places. Fixes: cc23f4f0b6b9 ("e1000e: Add support for Meteor Lake") Signed-off-by: Vitaly Lifshits <vitaly.lifshits(a)intel.com> Tested-by: Avigail Dahan <avigailx.dahan(a)intel.com> Signed-off-by: Tony Nguyen <anthony.l.nguyen(a)intel.com> Signed-off-by: Sasha Levin <sashal(a)kernel.org> drivers/net/ethernet/intel/e1000e/defines.h | 3 ++ drivers/net/ethernet/intel/e1000e/ich8lan.c | 80 +++++++++++++++++++++++++++-- drivers/net/ethernet/intel/e1000e/ich8lan.h | 4 ++ 3 files changed, 82 insertions(+), 5 deletions(-) My system is Novacustom V540TU laptop with Intel Core Ultra 5 125H. And the e1000e driver is running in a Xen HVM (with PCI passthrough). Interestingly, I have also another one with Intel Core Ultra 7 155H where the issue does not happen. I don't see what is different about network adapter there, they look identical on lspci (but there are differences about other devices)... I see the commit above was already backported to other stable branches too... #regzbot introduced: 85f6414167da39e0da30bf370f1ecda5a58c6f7b -- Best Regards, Marek Marczykowski-Górecki Invisible Things Lab

3 weeks, 1 day

4
19
0 0

[PATCH v1 1/2] usb: xhci: Skip xhci_reset in xhci_resume if xhci is being removed

by Roy Luo

xhci_reset() currently returns -ENODEV if XHCI_STATE_REMOVING is set, without completing the xhci handshake, unless the reset completes exceptionally quickly. This behavior causes a regression on Synopsys DWC3 USB controllers with dual-role capabilities. Specifically, when a DWC3 controller exits host mode and removes xhci while a reset is still in progress, and then attempts to configure its hardware for device mode, the ongoing, incomplete reset leads to critical register access issues. All register reads return zero, not just within the xHCI register space (which might be expected during a reset), but across the entire DWC3 IP block. This patch addresses the issue by preventing xhci_reset() from being called in xhci_resume() and bailing out early in the reinit flow when XHCI_STATE_REMOVING is set. Cc: stable(a)vger.kernel.org Fixes: 6ccb83d6c497 ("usb: xhci: Implement xhci_handshake_check_state() helper") Suggested-by: Mathias Nyman <mathias.nyman(a)intel.com> Signed-off-by: Roy Luo <royluo(a)google.com> --- drivers/usb/host/xhci.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/usb/host/xhci.c b/drivers/usb/host/xhci.c index 90eb491267b5..244b12eafd95 100644 --- a/drivers/usb/host/xhci.c +++ b/drivers/usb/host/xhci.c @@ -1084,7 +1084,10 @@ int xhci_resume(struct xhci_hcd *xhci, bool power_lost, bool is_auto_resume) xhci_dbg(xhci, "Stop HCD\n"); xhci_halt(xhci); xhci_zero_64b_regs(xhci); - retval = xhci_reset(xhci, XHCI_RESET_LONG_USEC); + if (xhci->xhc_state & XHCI_STATE_REMOVING) + retval = -ENODEV; + else + retval = xhci_reset(xhci, XHCI_RESET_LONG_USEC); spin_unlock_irq(&xhci->lock); if (retval) return retval; -- 2.49.0.1204.g71687c7c1d-goog

3 weeks, 1 day

4
8
0 0

[PATCH V3 2/2] PCI: Prevent LS7A Bus Master clearing on kexec

by Huacai Chen

This is similar to commit 62b6dee1b44a ("PCI/portdrv: Prevent LS7A Bus Master clearing on shutdown"), which prevents LS7A Bus Master clearing on kexec. The key point of this is to work around the LS7A defect that clearing PCI_COMMAND_MASTER prevents MMIO requests from going downstream, and we may need to do that even after .shutdown(), e.g., to print console messages. And in this case we rely on .shutdown() for the downstream devices to disable interrupts and DMA. Only skip Bus Master clearing on bridges because endpoint devices still need it. Cc: <stable(a)vger.kernel.org> Signed-off-by: Ming Wang <wangming01(a)loongson.cn> Signed-off-by: Huacai Chen <chenhuacai(a)loongson.cn> --- drivers/pci/pci-driver.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c index 602838416e6a..8a1e32367a06 100644 --- a/drivers/pci/pci-driver.c +++ b/drivers/pci/pci-driver.c @@ -517,7 +517,7 @@ static void pci_device_shutdown(struct device *dev) * If it is not a kexec reboot, firmware will hit the PCI * devices with big hammer and stop their DMA any way. */ - if (kexec_in_progress && (pci_dev->current_state <= PCI_D3hot)) + if (kexec_in_progress && !pci_is_bridge(pci_dev) && (pci_dev->current_state <= PCI_D3hot)) pci_clear_master(pci_dev); } -- 2.47.1

3 weeks, 1 day

3
2
0 0

[PATCH V3 1/2] PCI: Use local_pci_probe() when best selected cpu is offline

by Huacai Chen

From: Hongchen Zhang <zhanghongchen(a)loongson.cn> When the best selected CPU is offline, work_on_cpu() will stuck forever. This can be happen if a node is online while all its CPUs are offline (we can use "maxcpus=1" without "nr_cpus=1" to reproduce it), Therefore, in this case, we should call local_pci_probe() instead of work_on_cpu(). Cc: <stable(a)vger.kernel.org> Signed-off-by: Huacai Chen <chenhuacai(a)loongson.cn> Signed-off-by: Hongchen Zhang <zhanghongchen(a)loongson.cn> --- drivers/pci/pci-driver.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/pci/pci-driver.c b/drivers/pci/pci-driver.c index c8bd71a739f7..602838416e6a 100644 --- a/drivers/pci/pci-driver.c +++ b/drivers/pci/pci-driver.c @@ -386,7 +386,7 @@ static int pci_call_probe(struct pci_driver *drv, struct pci_dev *dev, free_cpumask_var(wq_domain_mask); } - if (cpu < nr_cpu_ids) + if ((cpu < nr_cpu_ids) && cpu_online(cpu)) error = work_on_cpu(cpu, local_pci_probe, &ddi); else error = local_pci_probe(&ddi); -- 2.47.1

3 weeks, 1 day

3
2
0 0

Backports of d0afcfeb9e3810ec89d1ffde1a0e36621bb75dca for 6.6 and older

by Nathan Chancellor

Hi Greg and Sasha, Please find attached backports of commit d0afcfeb9e38 ("kbuild: Disable -Wdefault-const-init-unsafe") for 6.6 and older, which is needed for tip of tree versions of LLVM. Please let me know if there are any questions. Cheers, Nathan

3 weeks, 3 days

2
3
0 0

[PATCH] platform/x86: ideapad-laptop: use usleep_range() for EC polling

by Rong Zhang

It was reported that ideapad-laptop sometimes causes some recent (since 2024) Lenovo ThinkBook models shut down when: - suspending/resuming - closing/opening the lid - (dis)connecting a charger - reading/writing some sysfs properties, e.g., fan_mode, touchpad - pressing down some Fn keys, e.g., Brightness Up/Down (Fn+F5/F6) - (seldom) loading the kmod The issue has existed since the launch day of such models, and there have been some out-of-tree workarounds (see Link:) for the issue. One disables some functionalities, while another one simply shortens IDEAPAD_EC_TIMEOUT. The disabled functionalities have read_ec_data() in their call chains, which calls schedule() between each poll. It turns out that these models suffer from the indeterminacy of schedule() because of their low tolerance for being polled too frequently. Sometimes schedule() returns too soon due to the lack of ready tasks, causing the margin between two polls to be too short. In this case, the command is somehow aborted, and too many subsequent polls (they poll for "nothing!") may eventually break the state machine in the EC, resulting in a hard shutdown. This explains why shortening IDEAPAD_EC_TIMEOUT works around the issue - it reduces the total number of polls sent to the EC. Even when it doesn't lead to a shutdown, frequent polls may also disturb the ongoing operation and notably delay (+ 10-20ms) the availability of EC response. This phenomenon is unlikely to be exclusive to the models mentioned above, so dropping the schedule() manner should also slightly improve the responsiveness of various models. Fix these issues by migrating to usleep_range(150, 300). The interval is chosen to add some margin to the minimal 50us and considering EC responses are usually available after 150-2500us based on my test. It should be enough to fix these issues on all models subject to the EC bug without introducing latency on other models. Tested on ThinkBook 14 G7+ ASP and solved both issues. No regression was introduced in the test on a model without the EC bug (ThinkBook X IMH, thanks Eric). Link: https://github.com/ty2/ideapad-laptop-tb2024g6plus/commit/6c5db18c9e8109873… Link: https://github.com/ferstar/ideapad-laptop-tb/commit/42d1e68e5009529d31bd23f… Closes: https://bugzilla.kernel.org/show_bug.cgi?id=218771 Fixes: 6a09f21dd1e2 ("ideapad: add ACPI helpers") Cc: stable(a)vger.kernel.org Tested-by: Eric Long <i(a)hack3r.moe> Signed-off-by: Rong Zhang <i(a)rong.moe> --- drivers/platform/x86/ideapad-laptop.c | 19 +++++++++++++++++-- 1 file changed, 17 insertions(+), 2 deletions(-) diff --git a/drivers/platform/x86/ideapad-laptop.c b/drivers/platform/x86/ideapad-laptop.c index ede483573fe0..b5e4da6a6779 100644 --- a/drivers/platform/x86/ideapad-laptop.c +++ b/drivers/platform/x86/ideapad-laptop.c @@ -15,6 +15,7 @@ #include <linux/bug.h> #include <linux/cleanup.h> #include <linux/debugfs.h> +#include <linux/delay.h> #include <linux/device.h> #include <linux/dmi.h> #include <linux/i8042.h> @@ -267,6 +268,20 @@ static void ideapad_shared_exit(struct ideapad_private *priv) */ #define IDEAPAD_EC_TIMEOUT 200 /* in ms */ +/* + * Some models (e.g., ThinkBook since 2024) have a low tolerance for being + * polled too frequently. Doing so may break the state machine in the EC, + * resulting in a hard shutdown. + * + * It is also observed that frequent polls may disturb the ongoing operation + * and notably delay the availability of EC response. + * + * These values are used as the delay before the first poll and the interval + * between subsequent polls to solve the above issues. + */ +#define IDEAPAD_EC_POLL_MIN_US 150 +#define IDEAPAD_EC_POLL_MAX_US 300 + static int eval_int(acpi_handle handle, const char *name, unsigned long *res) { unsigned long long result; @@ -383,7 +398,7 @@ static int read_ec_data(acpi_handle handle, unsigned long cmd, unsigned long *da end_jiffies = jiffies + msecs_to_jiffies(IDEAPAD_EC_TIMEOUT) + 1; while (time_before(jiffies, end_jiffies)) { - schedule(); + usleep_range(IDEAPAD_EC_POLL_MIN_US, IDEAPAD_EC_POLL_MAX_US); err = eval_vpcr(handle, 1, &val); if (err) @@ -414,7 +429,7 @@ static int write_ec_cmd(acpi_handle handle, unsigned long cmd, unsigned long dat end_jiffies = jiffies + msecs_to_jiffies(IDEAPAD_EC_TIMEOUT) + 1; while (time_before(jiffies, end_jiffies)) { - schedule(); + usleep_range(IDEAPAD_EC_POLL_MIN_US, IDEAPAD_EC_POLL_MAX_US); err = eval_vpcr(handle, 1, &val); if (err) base-commit: a5806cd506af5a7c19bcd596e4708b5c464bfd21 -- 2.49.0

3 weeks, 3 days

7
6
0 0

Unplayable framerates in game but specific kernel versions work, maybe amdgpu problem

by machion＠disroot.org

Hello kernel/driver developers, I hope, with my information it's possible to find a bug/problem in the kernel. Otherwise I am sorry, that I disturbed you. I only use LTS kernels, but I can narrow it down to a hand full of them, where it works. The PC: Manjaro Stable/Cinnamon/X11/AMD Ryzen 5 2600/Radeon HD 7790/8GB RAM I already asked the Manjaro community, but with no luck. The game: Hellpoint (GOG Linux latest version, Unity3D-Engine v2021), uses vulkan --- I came a long road of kernels. I had many versions of 5.4, 5.10, 5.15, 6.1 and 6.6 and and the game was always unplayable, because the frames where around 1fps (performance of PC is not the problem). I asked the mesa and cinnamon team for help in the past, but also with no luck. It never worked, till on 2025-03-29 when I installed 6.12.19 for the first time and it worked! But it only worked with 6.12.19, 6.12.20 and 6.12.21 When I updated to 6.12.25, it was back to unplayable. For testing I installed 6.14.4 with the same result. It doesn't work. I also compared file /proc/config.gz of both kernels (6.12.21 <> 6.14.4), but can't seem to see drastic changes to the graphical part. I presume it has something to do with amdgpu. If you need more information, I would be happy to help. Kind regards, Marion

3 weeks, 3 days

2
4
0 0

[PATCH] fpga: zynq-fpga: use sgtable-based scatterlist wrappers

by Marek Szyprowski

Use common wrappers operating directly on the struct sg_table objects to fix incorrect use of statterlists related calls. dma_unmap_sg() function has to be called with the number of elements originally passed to the dma_map_sg() function, not the one returned in sgtable's nents. CC: stable(a)vger.kernel.org Fixes: 425902f5c8e3 ("fpga zynq: Use the scatterlist interface") Signed-off-by: Marek Szyprowski <m.szyprowski(a)samsung.com> --- drivers/fpga/zynq-fpga.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/fpga/zynq-fpga.c b/drivers/fpga/zynq-fpga.c index f7e08f7ea9ef..9bd39d1d4048 100644 --- a/drivers/fpga/zynq-fpga.c +++ b/drivers/fpga/zynq-fpga.c @@ -406,7 +406,7 @@ static int zynq_fpga_ops_write(struct fpga_manager *mgr, struct sg_table *sgt) } priv->dma_nelms = - dma_map_sg(mgr->dev.parent, sgt->sgl, sgt->nents, DMA_TO_DEVICE); + dma_map_sgtable(mgr->dev.parent, sgt, DMA_TO_DEVICE); if (priv->dma_nelms == 0) { dev_err(&mgr->dev, "Unable to DMA map (TO_DEVICE)\n"); return -ENOMEM; @@ -478,7 +478,7 @@ static int zynq_fpga_ops_write(struct fpga_manager *mgr, struct sg_table *sgt) clk_disable(priv->clk); out_free: - dma_unmap_sg(mgr->dev.parent, sgt->sgl, sgt->nents, DMA_TO_DEVICE); + dma_unmap_sgtable(mgr->dev.parent, sgt, DMA_TO_DEVICE); return err; } -- 2.34.1

3 weeks, 4 days

3
3
0 0

[PATCH 1/1] phy: tegra: xusb: Fix unbalanced regulator disable in UTMI PHY mode

by Wayne Chang

When transitioning from USB_ROLE_DEVICE to USB_ROLE_NONE, the code assumed that the regulator should be disabled. However, if the regulator is marked as always-on, regulator_is_enabled() continues to return true, leading to an incorrect attempt to disable a regulator which is not enabled. This can result in warnings such as: [ 250.155624] WARNING: CPU: 1 PID: 7326 at drivers/regulator/core.c:3004 _regulator_disable+0xe4/0x1a0 [ 250.155652] unbalanced disables for VIN_SYS_5V0 To fix this, we move the regulator control logic into tegra186_xusb_padctl_id_override() function since it's directly related to the ID override state. The regulator is now only disabled when the role transitions from USB_ROLE_HOST to USB_ROLE_NONE, by checking the VBUS_ID register. This ensures that regulator enable/disable operations are properly balanced and only occur when actually transitioning to/from host mode. Fixes: 49d46e3c7e59 ("phy: tegra: xusb: Add set_mode support for UTMI phy on Tegra186") Cc: stable(a)vger.kernel.org Signed-off-by: Wayne Chang <waynec(a)nvidia.com> --- drivers/phy/tegra/xusb-tegra186.c | 59 +++++++++++++++++++------------ 1 file changed, 37 insertions(+), 22 deletions(-) diff --git a/drivers/phy/tegra/xusb-tegra186.c b/drivers/phy/tegra/xusb-tegra186.c index fae6242aa730..1b35d50821f7 100644 --- a/drivers/phy/tegra/xusb-tegra186.c +++ b/drivers/phy/tegra/xusb-tegra186.c @@ -774,13 +774,15 @@ static int tegra186_xusb_padctl_vbus_override(struct tegra_xusb_padctl *padctl, } static int tegra186_xusb_padctl_id_override(struct tegra_xusb_padctl *padctl, - bool status) + struct tegra_xusb_usb2_port *port, bool status) { - u32 value; + u32 value, id_override; + int err = 0; dev_dbg(padctl->dev, "%s id override\n", status ? "set" : "clear"); value = padctl_readl(padctl, USB2_VBUS_ID); + id_override = value & ID_OVERRIDE(~0); if (status) { if (value & VBUS_OVERRIDE) { @@ -791,15 +793,35 @@ static int tegra186_xusb_padctl_id_override(struct tegra_xusb_padctl *padctl, value = padctl_readl(padctl, USB2_VBUS_ID); } - value &= ~ID_OVERRIDE(~0); - value |= ID_OVERRIDE_GROUNDED; + if (id_override != ID_OVERRIDE_GROUNDED) { + value &= ~ID_OVERRIDE(~0); + value |= ID_OVERRIDE_GROUNDED; + padctl_writel(padctl, value, USB2_VBUS_ID); + + err = regulator_enable(port->supply); + if (err) { + dev_err(padctl->dev, "Failed to enable regulator: %d\n", err); + return err; + } + } } else { - value &= ~ID_OVERRIDE(~0); - value |= ID_OVERRIDE_FLOATING; + if (id_override == ID_OVERRIDE_GROUNDED) { + /* + * The regulator is disabled only when the role transitions + * from USB_ROLE_HOST to USB_ROLE_NONE. + */ + err = regulator_disable(port->supply); + if (err) { + dev_err(padctl->dev, "Failed to disable regulator: %d\n", err); + return err; + } + + value &= ~ID_OVERRIDE(~0); + value |= ID_OVERRIDE_FLOATING; + padctl_writel(padctl, value, USB2_VBUS_ID); + } } - padctl_writel(padctl, value, USB2_VBUS_ID); - return 0; } @@ -818,27 +840,20 @@ static int tegra186_utmi_phy_set_mode(struct phy *phy, enum phy_mode mode, if (mode == PHY_MODE_USB_OTG) { if (submode == USB_ROLE_HOST) { - tegra186_xusb_padctl_id_override(padctl, true); - - err = regulator_enable(port->supply); + err = tegra186_xusb_padctl_id_override(padctl, port, true); + if (err) + goto out; } else if (submode == USB_ROLE_DEVICE) { tegra186_xusb_padctl_vbus_override(padctl, true); } else if (submode == USB_ROLE_NONE) { - /* - * When port is peripheral only or role transitions to - * USB_ROLE_NONE from USB_ROLE_DEVICE, regulator is not - * enabled. - */ - if (regulator_is_enabled(port->supply)) - regulator_disable(port->supply); - - tegra186_xusb_padctl_id_override(padctl, false); + err = tegra186_xusb_padctl_id_override(padctl, port, false); + if (err) + goto out; tegra186_xusb_padctl_vbus_override(padctl, false); } } - +out: mutex_unlock(&padctl->lock); - return err; } -- 2.25.1

3 weeks, 5 days

3
3
0 0

[PATCH v2] leds: flash: leds-qcom-flash: Fix registry access after re-bind

by Krzysztof Kozlowski

Driver in probe() updates each of 'reg_field' with 'reg_base': for (i = 0; i < REG_MAX_COUNT; i++) regs[i].reg += reg_base; 'reg_field' array (under variable 'regs' above) is statically allocated, thus each re-bind would add another 'reg_base' leading to bogus register addresses. Constify the local 'reg_field' array and duplicate it in probe to solve this. Fixes: 96a2e242a5dc ("leds: flash: Add driver to support flash LED module in QCOM PMICs") Cc: <stable(a)vger.kernel.org> Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org> --- Changes in v2: 1. Fix sizeof() argument (Fenglin Wu) This is a nice example why constifying static memory is useful. --- drivers/leds/flash/leds-qcom-flash.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-) diff --git a/drivers/leds/flash/leds-qcom-flash.c b/drivers/leds/flash/leds-qcom-flash.c index b4c19be51c4d..89cf5120f5d5 100644 --- a/drivers/leds/flash/leds-qcom-flash.c +++ b/drivers/leds/flash/leds-qcom-flash.c @@ -117,7 +117,7 @@ enum { REG_MAX_COUNT, }; -static struct reg_field mvflash_3ch_regs[REG_MAX_COUNT] = { +static const struct reg_field mvflash_3ch_regs[REG_MAX_COUNT] = { REG_FIELD(0x08, 0, 7), /* status1 */ REG_FIELD(0x09, 0, 7), /* status2 */ REG_FIELD(0x0a, 0, 7), /* status3 */ @@ -132,7 +132,7 @@ static struct reg_field mvflash_3ch_regs[REG_MAX_COUNT] = { REG_FIELD(0x58, 0, 2), /* therm_thrsh3 */ }; -static struct reg_field mvflash_4ch_regs[REG_MAX_COUNT] = { +static const struct reg_field mvflash_4ch_regs[REG_MAX_COUNT] = { REG_FIELD(0x06, 0, 7), /* status1 */ REG_FIELD(0x07, 0, 6), /* status2 */ REG_FIELD(0x09, 0, 7), /* status3 */ @@ -854,11 +854,17 @@ static int qcom_flash_led_probe(struct platform_device *pdev) if (val == FLASH_SUBTYPE_3CH_PM8150_VAL || val == FLASH_SUBTYPE_3CH_PMI8998_VAL) { flash_data->hw_type = QCOM_MVFLASH_3CH; flash_data->max_channels = 3; - regs = mvflash_3ch_regs; + regs = devm_kmemdup(dev, mvflash_3ch_regs, sizeof(mvflash_3ch_regs), + GFP_KERNEL); + if (!regs) + return -ENOMEM; } else if (val == FLASH_SUBTYPE_4CH_VAL) { flash_data->hw_type = QCOM_MVFLASH_4CH; flash_data->max_channels = 4; - regs = mvflash_4ch_regs; + regs = devm_kmemdup(dev, mvflash_4ch_regs, sizeof(mvflash_4ch_regs), + GFP_KERNEL); + if (!regs) + return -ENOMEM; rc = regmap_read(regmap, reg_base + FLASH_REVISION_REG, &val); if (rc < 0) { @@ -880,6 +886,7 @@ static int qcom_flash_led_probe(struct platform_device *pdev) dev_err(dev, "Failed to allocate regmap field, rc=%d\n", rc); return rc; } + devm_kfree(dev, regs); /* devm_regmap_field_bulk_alloc() makes copies */ platform_set_drvdata(pdev, flash_data); mutex_init(&flash_data->lock); -- 2.45.2

4 weeks

3
2
0 0

2025

2024

2023

2022

2021

2020

2019

2018

2017

Linux-stable-mirror May 2025