damos_quota.py assumes the quota will always be exceeded. But whether the
quota is exceeded depends on the monitoring results. The monitored
workload actually has a changing access pattern, and hence sometimes the
quota may not be exceeded. As a result, false positive test failures
happen. Estimate how many times the quota will be exceeded by checking
the monitoring results, and use that instead of the naive assumption.
Fixes: 51f58c9da14b ("selftests/damon: add a test for DAMOS quota")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: SeongJae Park <sj(a)kernel.org>
---
tools/testing/selftests/damon/damos_quota.py | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/tools/testing/selftests/damon/damos_quota.py b/tools/testing/selftests/damon/damos_quota.py
index 7d4c6bb2e3cd..57c4937aaed2 100755
--- a/tools/testing/selftests/damon/damos_quota.py
+++ b/tools/testing/selftests/damon/damos_quota.py
@@ -51,16 +51,19 @@ def main():
nr_quota_exceeds = scheme.stats.qt_exceeds
wss_collected.sort()
+ nr_expected_quota_exceeds = 0
for wss in wss_collected:
if wss > sz_quota:
print('quota is not kept: %s > %s' % (wss, sz_quota))
print('collected samples are as below')
print('\n'.join(['%d' % wss for wss in wss_collected]))
exit(1)
+ if wss == sz_quota:
+ nr_expected_quota_exceeds += 1
- if nr_quota_exceeds < len(wss_collected):
- print('quota is not always exceeded: %d > %d' %
- (len(wss_collected), nr_quota_exceeds))
+ if nr_quota_exceeds < nr_expected_quota_exceeds:
+ print('quota is exceeded less than expected: %d < %d' %
+ (nr_quota_exceeds, nr_expected_quota_exceeds))
exit(1)
if __name__ == '__main__':
--
2.39.5
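The revised check can be read in isolation as the sketch below. It is a minimal illustration, assuming the same inputs the selftest collects (`wss_collected`, `sz_quota`, `nr_quota_exceeds`); the function name `check_quota` and the return-a-string convention are hypothetical, since the selftest itself prints and calls exit(1).

```python
def check_quota(wss_collected, sz_quota, nr_quota_exceeds):
    """Return an error string, or None when the quota was honored.

    Hypothetical helper that mirrors the patched logic of damos_quota.py.
    """
    nr_expected_quota_exceeds = 0
    for wss in sorted(wss_collected):
        if wss > sz_quota:
            # The scheme applied more than the quota allows.
            return 'quota is not kept: %s > %s' % (wss, sz_quota)
        if wss == sz_quota:
            # Only samples that reached the quota are expected to have
            # been counted as "quota exceeded" by DAMOS.
            nr_expected_quota_exceeds += 1

    if nr_quota_exceeds < nr_expected_quota_exceeds:
        return 'quota is exceeded less than expected: %d < %d' % (
            nr_quota_exceeds, nr_expected_quota_exceeds)
    return None
```

With this shape, a run whose samples never reach the quota passes even when `nr_quota_exceeds` is zero, which is the false positive the patch removes.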
From: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
We currently call intel_set_cdclk_post_plane_update() far
too early. When pipes are active during the reprogramming
the current spot only works for the cd2x divider update
case, as that is synchronized to the pipe's vblank. Squashing
and crawling are not synchronized in any way, so doing the
programming while the pipes/planes are potentially still using
the old hardware state could lead to underruns.
Move the post plane reprogramming to a spot where we know
that the pipes/planes have switched over to the new hardware
state.
Cc: stable(a)vger.kernel.org
Signed-off-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
---
drivers/gpu/drm/i915/display/intel_display.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_display.c b/drivers/gpu/drm/i915/display/intel_display.c
index 065fdf6dbb88..cb9c6ad3aa11 100644
--- a/drivers/gpu/drm/i915/display/intel_display.c
+++ b/drivers/gpu/drm/i915/display/intel_display.c
@@ -7527,9 +7527,6 @@ static void intel_atomic_commit_tail(struct intel_atomic_state *state)
intel_program_dpkgc_latency(state);
- if (state->modeset)
- intel_set_cdclk_post_plane_update(state);
-
intel_wait_for_vblank_workers(state);
/* FIXME: We should call drm_atomic_helper_commit_hw_done() here
@@ -7606,6 +7603,8 @@ static void intel_atomic_commit_tail(struct intel_atomic_state *state)
intel_verify_planes(state);
intel_sagv_post_plane_update(state);
+ if (state->modeset)
+ intel_set_cdclk_post_plane_update(state);
intel_pmdemand_post_plane_update(state);
drm_atomic_helper_commit_hw_done(&state->base);
--
2.45.3
From: Claudiu Beznea <claudiu.beznea.uj(a)bp.renesas.com>
The phy-rcar-gen3-usb2 driver exports 4 PHYs. The timing registers are
common to all PHYs, so there is no need to set them every time a PHY is
initialized. Set the timing registers only when the first PHY is initialized.
Fixes: f3b5a8d9b50d ("phy: rcar-gen3-usb2: Add R-Car Gen3 USB2 PHY driver")
Cc: stable(a)vger.kernel.org
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj(a)bp.renesas.com>
---
Changes in v2:
- collected tags
drivers/phy/renesas/phy-rcar-gen3-usb2.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/phy/renesas/phy-rcar-gen3-usb2.c b/drivers/phy/renesas/phy-rcar-gen3-usb2.c
index 21cf14ea3437..a89621d3f94b 100644
--- a/drivers/phy/renesas/phy-rcar-gen3-usb2.c
+++ b/drivers/phy/renesas/phy-rcar-gen3-usb2.c
@@ -467,8 +467,11 @@ static int rcar_gen3_phy_usb2_init(struct phy *p)
val = readl(usb2_base + USB2_INT_ENABLE);
val |= USB2_INT_ENABLE_UCOM_INTEN | rphy->int_enable_bits;
writel(val, usb2_base + USB2_INT_ENABLE);
- writel(USB2_SPD_RSM_TIMSET_INIT, usb2_base + USB2_SPD_RSM_TIMSET);
- writel(USB2_OC_TIMSET_INIT, usb2_base + USB2_OC_TIMSET);
+
+ if (!rcar_gen3_is_any_rphy_initialized(channel)) {
+ writel(USB2_SPD_RSM_TIMSET_INIT, usb2_base + USB2_SPD_RSM_TIMSET);
+ writel(USB2_OC_TIMSET_INIT, usb2_base + USB2_OC_TIMSET);
+ }
/* Initialize otg part (only if we initialize a PHY with IRQs). */
if (rphy->int_enable_bits)
--
2.43.0
From: Claudiu Beznea <claudiu.beznea.uj(a)bp.renesas.com>
The phy-rcar-gen3-usb2 driver exposes four individual PHYs that are
requested and configured by PHY users. The struct phy_ops APIs access the
same set of registers to configure all PHYs. Additionally, PHY settings can
be modified through sysfs or an IRQ handler. While some struct phy_ops APIs
are protected by a driver-wide mutex, others rely on individual
PHY-specific mutexes.
This approach can lead to various issues, including:
1/ the IRQ handler may interrupt PHY settings in progress, racing with
hardware configuration protected by a mutex lock
2/ due to msleep(20) in rcar_gen3_init_otg(), while a configuration thread
suspends to wait for the delay, another thread may try to configure
another PHY (with phy_init() + phy_power_on()); re-running the
phy_init() goes to the exact same configuration code, re-running the
same hardware configuration on the same set of registers (and bits)
which might impact the result of the msleep for the 1st configuring
thread
3/ sysfs can configure the hardware (through role_store()) and it can
still race with the phy_init()/phy_power_on() APIs calling into the
driver's struct phy_ops
To address these issues, add a spinlock to protect hardware register access
and driver private data structures (e.g., calls to
rcar_gen3_is_any_rphy_initialized()). Checking driver-specific data remains
necessary as all PHY instances share common settings. With this change,
the existing mutex protection is removed and the cleanup.h helpers are
used.
While at it, to keep the code simpler, do not skip
regulator_enable()/regulator_disable() APIs in
rcar_gen3_phy_usb2_power_on()/rcar_gen3_phy_usb2_power_off() as the
regulators enable/disable operations are reference counted anyway.
Fixes: f3b5a8d9b50d ("phy: rcar-gen3-usb2: Add R-Car Gen3 USB2 PHY driver")
Cc: stable(a)vger.kernel.org
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
Signed-off-by: Claudiu Beznea <claudiu.beznea.uj(a)bp.renesas.com>
---
Changes in v2:
- collected tags
drivers/phy/renesas/phy-rcar-gen3-usb2.c | 49 +++++++++++++-----------
1 file changed, 26 insertions(+), 23 deletions(-)
diff --git a/drivers/phy/renesas/phy-rcar-gen3-usb2.c b/drivers/phy/renesas/phy-rcar-gen3-usb2.c
index 826c9c4dd4c0..5c0ceba09b67 100644
--- a/drivers/phy/renesas/phy-rcar-gen3-usb2.c
+++ b/drivers/phy/renesas/phy-rcar-gen3-usb2.c
@@ -9,6 +9,7 @@
* Copyright (C) 2014 Cogent Embedded, Inc.
*/
+#include <linux/cleanup.h>
#include <linux/extcon-provider.h>
#include <linux/interrupt.h>
#include <linux/io.h>
@@ -118,7 +119,7 @@ struct rcar_gen3_chan {
struct regulator *vbus;
struct reset_control *rstc;
struct work_struct work;
- struct mutex lock; /* protects rphys[...].powered */
+ spinlock_t lock; /* protects access to hardware and driver data structure. */
enum usb_dr_mode dr_mode;
u32 obint_enable_bits;
bool extcon_host;
@@ -348,6 +349,8 @@ static ssize_t role_store(struct device *dev, struct device_attribute *attr,
bool is_b_device;
enum phy_mode cur_mode, new_mode;
+ guard(spinlock_irqsave)(&ch->lock);
+
if (!ch->is_otg_channel || !rcar_gen3_is_any_otg_rphy_initialized(ch))
return -EIO;
@@ -415,7 +418,7 @@ static void rcar_gen3_init_otg(struct rcar_gen3_chan *ch)
val = readl(usb2_base + USB2_ADPCTRL);
writel(val | USB2_ADPCTRL_IDPULLUP, usb2_base + USB2_ADPCTRL);
}
- msleep(20);
+ mdelay(20);
writel(0xffffffff, usb2_base + USB2_OBINTSTA);
writel(ch->obint_enable_bits, usb2_base + USB2_OBINTEN);
@@ -436,12 +439,14 @@ static irqreturn_t rcar_gen3_phy_usb2_irq(int irq, void *_ch)
if (pm_runtime_suspended(dev))
goto rpm_put;
- status = readl(usb2_base + USB2_OBINTSTA);
- if (status & ch->obint_enable_bits) {
- dev_vdbg(dev, "%s: %08x\n", __func__, status);
- writel(ch->obint_enable_bits, usb2_base + USB2_OBINTSTA);
- rcar_gen3_device_recognition(ch);
- ret = IRQ_HANDLED;
+ scoped_guard(spinlock, &ch->lock) {
+ status = readl(usb2_base + USB2_OBINTSTA);
+ if (status & ch->obint_enable_bits) {
+ dev_vdbg(dev, "%s: %08x\n", __func__, status);
+ writel(ch->obint_enable_bits, usb2_base + USB2_OBINTSTA);
+ rcar_gen3_device_recognition(ch);
+ ret = IRQ_HANDLED;
+ }
}
rpm_put:
@@ -456,6 +461,8 @@ static int rcar_gen3_phy_usb2_init(struct phy *p)
void __iomem *usb2_base = channel->base;
u32 val;
+ guard(spinlock_irqsave)(&channel->lock);
+
/* Initialize USB2 part */
val = readl(usb2_base + USB2_INT_ENABLE);
val |= USB2_INT_ENABLE_UCOM_INTEN | rphy->int_enable_bits;
@@ -479,6 +486,8 @@ static int rcar_gen3_phy_usb2_exit(struct phy *p)
void __iomem *usb2_base = channel->base;
u32 val;
+ guard(spinlock_irqsave)(&channel->lock);
+
rphy->initialized = false;
val = readl(usb2_base + USB2_INT_ENABLE);
@@ -498,16 +507,17 @@ static int rcar_gen3_phy_usb2_power_on(struct phy *p)
u32 val;
int ret = 0;
- mutex_lock(&channel->lock);
- if (!rcar_gen3_are_all_rphys_power_off(channel))
- goto out;
-
if (channel->vbus) {
ret = regulator_enable(channel->vbus);
if (ret)
- goto out;
+ return ret;
}
+ guard(spinlock_irqsave)(&channel->lock);
+
+ if (!rcar_gen3_are_all_rphys_power_off(channel))
+ goto out;
+
val = readl(usb2_base + USB2_USBCTR);
val |= USB2_USBCTR_PLL_RST;
writel(val, usb2_base + USB2_USBCTR);
@@ -517,7 +527,6 @@ static int rcar_gen3_phy_usb2_power_on(struct phy *p)
out:
/* The powered flag should be set for any other phys anyway */
rphy->powered = true;
- mutex_unlock(&channel->lock);
return 0;
}
@@ -528,18 +537,12 @@ static int rcar_gen3_phy_usb2_power_off(struct phy *p)
struct rcar_gen3_chan *channel = rphy->ch;
int ret = 0;
- mutex_lock(&channel->lock);
- rphy->powered = false;
-
- if (!rcar_gen3_are_all_rphys_power_off(channel))
- goto out;
+ scoped_guard(spinlock_irqsave, &channel->lock)
+ rphy->powered = false;
if (channel->vbus)
ret = regulator_disable(channel->vbus);
-out:
- mutex_unlock(&channel->lock);
-
return ret;
}
@@ -750,7 +753,7 @@ static int rcar_gen3_phy_usb2_probe(struct platform_device *pdev)
if (phy_data->no_adp_ctrl)
channel->obint_enable_bits = USB2_OBINT_IDCHG_EN;
- mutex_init(&channel->lock);
+ spin_lock_init(&channel->lock);
for (i = 0; i < NUM_OF_PHYS; i++) {
channel->rphys[i].phy = devm_phy_create(dev, NULL,
phy_data->phy_usb2_ops);
--
2.43.0
It was observed on sc7180 (A618 GPU) that GPU votes for the GX rail and
CNOC BCM nodes were not removed after GPU suspend. This was because we
skipped sending the 'prepare-slumber' request to the GMU during the
suspend sequence in some cases. So, make sure we always call the
prepare-slumber HFI during suspend. Also, calling prepare-slumber without
a prior oob-gpu handshake messes up the GMU firmware's internal state.
So, do that handshake when required.
Fixes: 4b565ca5a2cb ("drm/msm: Add A6XX device support")
Cc: stable(a)vger.kernel.org
Signed-off-by: Akhil P Oommen <quic_akhilpo(a)quicinc.com>
---
Changes in v2:
- Minor update to commit text and CC'ed Stable
- Link to v1: https://lore.kernel.org/r/20250226-adreno-sys-suspend-fix-v1-1-054261bba114…
---
drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 72 +++++++++++++++++++----------------
1 file changed, 39 insertions(+), 33 deletions(-)
diff --git a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
index 699b0dd34b18f0ec811e975779ba95991d485098..38c94915d4c9d6d33354502651a77c1f9e4648df 100644
--- a/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
+++ b/drivers/gpu/drm/msm/adreno/a6xx_gmu.c
@@ -1169,49 +1169,50 @@ static void a6xx_gmu_shutdown(struct a6xx_gmu *gmu)
struct a6xx_gpu *a6xx_gpu = container_of(gmu, struct a6xx_gpu, gmu);
struct adreno_gpu *adreno_gpu = &a6xx_gpu->base;
u32 val;
+ int ret;
/*
- * The GMU may still be in slumber unless the GPU started so check and
- * skip putting it back into slumber if so
+ * GMU firmware's internal power state gets messed up if we send "prepare_slumber" hfi when
+ * oob_gpu handshake wasn't done after the last wake up. So do a dummy handshake here when
+ * required
*/
- val = gmu_read(gmu, REG_A6XX_GPU_GMU_CX_GMU_RPMH_POWER_STATE);
+ if (adreno_gpu->base.needs_hw_init) {
+ if (a6xx_gmu_set_oob(&a6xx_gpu->gmu, GMU_OOB_GPU_SET))
+ goto force_off;
- if (val != 0xf) {
- int ret = a6xx_gmu_wait_for_idle(gmu);
+ a6xx_gmu_clear_oob(&a6xx_gpu->gmu, GMU_OOB_GPU_SET);
+ }
- /* If the GMU isn't responding assume it is hung */
- if (ret) {
- a6xx_gmu_force_off(gmu);
- return;
- }
+ ret = a6xx_gmu_wait_for_idle(gmu);
- a6xx_bus_clear_pending_transactions(adreno_gpu, a6xx_gpu->hung);
+ /* If the GMU isn't responding assume it is hung */
+ if (ret)
+ goto force_off;
- /* tell the GMU we want to slumber */
- ret = a6xx_gmu_notify_slumber(gmu);
- if (ret) {
- a6xx_gmu_force_off(gmu);
- return;
- }
+ a6xx_bus_clear_pending_transactions(adreno_gpu, a6xx_gpu->hung);
- ret = gmu_poll_timeout(gmu,
- REG_A6XX_GPU_GMU_AO_GPU_CX_BUSY_STATUS, val,
- !(val & A6XX_GPU_GMU_AO_GPU_CX_BUSY_STATUS_GPUBUSYIGNAHB),
- 100, 10000);
+ /* tell the GMU we want to slumber */
+ ret = a6xx_gmu_notify_slumber(gmu);
+ if (ret)
+ goto force_off;
- /*
- * Let the user know we failed to slumber but don't worry too
- * much because we are powering down anyway
- */
+ ret = gmu_poll_timeout(gmu,
+ REG_A6XX_GPU_GMU_AO_GPU_CX_BUSY_STATUS, val,
+ !(val & A6XX_GPU_GMU_AO_GPU_CX_BUSY_STATUS_GPUBUSYIGNAHB),
+ 100, 10000);
- if (ret)
- DRM_DEV_ERROR(gmu->dev,
- "Unable to slumber GMU: status = 0%x/0%x\n",
- gmu_read(gmu,
- REG_A6XX_GPU_GMU_AO_GPU_CX_BUSY_STATUS),
- gmu_read(gmu,
- REG_A6XX_GPU_GMU_AO_GPU_CX_BUSY_STATUS2));
- }
+ /*
+ * Let the user know we failed to slumber but don't worry too
+ * much because we are powering down anyway
+ */
+
+ if (ret)
+ DRM_DEV_ERROR(gmu->dev,
+ "Unable to slumber GMU: status = 0%x/0%x\n",
+ gmu_read(gmu,
+ REG_A6XX_GPU_GMU_AO_GPU_CX_BUSY_STATUS),
+ gmu_read(gmu,
+ REG_A6XX_GPU_GMU_AO_GPU_CX_BUSY_STATUS2));
/* Turn off HFI */
a6xx_hfi_stop(gmu);
@@ -1221,6 +1222,11 @@ static void a6xx_gmu_shutdown(struct a6xx_gmu *gmu)
/* Tell RPMh to power off the GPU */
a6xx_rpmh_stop(gmu);
+
+ return;
+
+force_off:
+ a6xx_gmu_force_off(gmu);
}
---
base-commit: 72d0af4accd965dc32f504440d74d0a4d18bf781
change-id: 20250110-adreno-sys-suspend-fix-c5bc7beea0c4
Best regards,
--
Akhil P Oommen <quic_akhilpo(a)quicinc.com>
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 415cadd505464d9a11ff5e0f6e0329c127849da5
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025022437-molecular-next-d0f6@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 415cadd505464d9a11ff5e0f6e0329c127849da5 Mon Sep 17 00:00:00 2001
From: Joshua Washington <joshwash(a)google.com>
Date: Fri, 14 Feb 2025 14:43:59 -0800
Subject: [PATCH] gve: set xdp redirect target only when it is available
Before this patch the NETDEV_XDP_ACT_NDO_XMIT XDP feature flag is set by
default as part of driver initialization, and is never cleared. However,
this flag differs from others in that it is used as an indicator for
whether the driver is ready to perform the ndo_xdp_xmit operation as
part of an XDP_REDIRECT. Kernel helpers
xdp_features_(set|clear)_redirect_target exist to convey this meaning.
This patch ensures that the netdev is only reported as a redirect target
when XDP queues exist to forward traffic.
Fixes: 39a7f4aa3e4a ("gve: Add XDP REDIRECT support for GQI-QPL format")
Cc: stable(a)vger.kernel.org
Reviewed-by: Praveen Kaligineedi <pkaligineedi(a)google.com>
Reviewed-by: Jeroen de Borst <jeroendb(a)google.com>
Signed-off-by: Joshua Washington <joshwash(a)google.com>
Link: https://patch.msgid.link/20250214224417.1237818-1-joshwash@google.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
diff --git a/drivers/net/ethernet/google/gve/gve.h b/drivers/net/ethernet/google/gve/gve.h
index 8167cc5fb0df..78d2a19593d1 100644
--- a/drivers/net/ethernet/google/gve/gve.h
+++ b/drivers/net/ethernet/google/gve/gve.h
@@ -1116,6 +1116,16 @@ static inline u32 gve_xdp_tx_start_queue_id(struct gve_priv *priv)
return gve_xdp_tx_queue_id(priv, 0);
}
+static inline bool gve_supports_xdp_xmit(struct gve_priv *priv)
+{
+ switch (priv->queue_format) {
+ case GVE_GQI_QPL_FORMAT:
+ return true;
+ default:
+ return false;
+ }
+}
+
/* gqi napi handler defined in gve_main.c */
int gve_napi_poll(struct napi_struct *napi, int budget);
diff --git a/drivers/net/ethernet/google/gve/gve_main.c b/drivers/net/ethernet/google/gve/gve_main.c
index 533e659b15b3..92237fb0b60c 100644
--- a/drivers/net/ethernet/google/gve/gve_main.c
+++ b/drivers/net/ethernet/google/gve/gve_main.c
@@ -1903,6 +1903,8 @@ static void gve_turndown(struct gve_priv *priv)
/* Stop tx queues */
netif_tx_disable(priv->dev);
+ xdp_features_clear_redirect_target(priv->dev);
+
gve_clear_napi_enabled(priv);
gve_clear_report_stats(priv);
@@ -1972,6 +1974,9 @@ static void gve_turnup(struct gve_priv *priv)
napi_schedule(&block->napi);
}
+ if (priv->num_xdp_queues && gve_supports_xdp_xmit(priv))
+ xdp_features_set_redirect_target(priv->dev, false);
+
gve_set_napi_enabled(priv);
}
@@ -2246,7 +2251,6 @@ static void gve_set_netdev_xdp_features(struct gve_priv *priv)
if (priv->queue_format == GVE_GQI_QPL_FORMAT) {
xdp_features = NETDEV_XDP_ACT_BASIC;
xdp_features |= NETDEV_XDP_ACT_REDIRECT;
- xdp_features |= NETDEV_XDP_ACT_NDO_XMIT;
xdp_features |= NETDEV_XDP_ACT_XSK_ZEROCOPY;
} else {
xdp_features = 0;
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x 5ae4dca718eacd0a56173a687a3736eb7e627c77
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2025022438-automated-recycled-cc12@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 5ae4dca718eacd0a56173a687a3736eb7e627c77 Mon Sep 17 00:00:00 2001
From: Lukasz Czechowski <lukasz.czechowski(a)thaumatec.com>
Date: Tue, 21 Jan 2025 13:56:04 +0100
Subject: [PATCH] arm64: dts: rockchip: Disable DMA for uart5 on px30-ringneck
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
UART controllers without flow control seem to behave unstably
when DMA is enabled. The issues were indicated in the message:
https://lore.kernel.org/linux-arm-kernel/CAMdYzYpXtMocCtCpZLU_xuWmOp2Ja_v0A…
In the case of the PX30-uQ7 Ringneck SoM, it was noticed that after a
couple of hours of UART communication a CPU stall occurred,
leading to the system becoming unresponsive.
After disabling the DMA, extensive UART communication tests for
up to two weeks were performed, and no issues were further
observed.
The flow control pins for uart5 are not available on PX30-uQ7
Ringneck, as configured by pinctrl-0, so the DMA properties are
removed in the SoM dtsi.
Cc: stable(a)vger.kernel.org
Fixes: c484cf93f61b ("arm64: dts: rockchip: add PX30-µQ7 (Ringneck) SoM with Haikou baseboard")
Reviewed-by: Quentin Schulz <quentin.schulz(a)cherry.de>
Signed-off-by: Lukasz Czechowski <lukasz.czechowski(a)thaumatec.com>
Link: https://lore.kernel.org/r/20250121125604.3115235-3-lukasz.czechowski@thauma…
Signed-off-by: Heiko Stuebner <heiko(a)sntech.de>
diff --git a/arch/arm64/boot/dts/rockchip/px30-ringneck.dtsi b/arch/arm64/boot/dts/rockchip/px30-ringneck.dtsi
index 2c87005c89bd..e80412abec08 100644
--- a/arch/arm64/boot/dts/rockchip/px30-ringneck.dtsi
+++ b/arch/arm64/boot/dts/rockchip/px30-ringneck.dtsi
@@ -397,6 +397,8 @@ &u2phy_host {
};
&uart5 {
+ /delete-property/ dmas;
+ /delete-property/ dma-names;
pinctrl-0 = <&uart5_xfer>;
};
[ Upstream commit 647cef20e649c576dff271e018d5d15d998b629d ]
Expected behaviour:
When the scheduler's limit is reached, pfifo_tail_enqueue() drops a
packet from the scheduler's queue and decreases the scheduler's qlen by
one. Then pfifo_tail_enqueue() enqueues the new packet and increases the
scheduler's qlen by one. Finally, pfifo_tail_enqueue() returns the
`NET_XMIT_CN` status code.
Weird behaviour:
If we set `sch->limit == 0` and trigger pfifo_tail_enqueue() on a
scheduler that holds no packets, the 'drop a packet' step does nothing,
so the scheduler's qlen stays at 0. We then enqueue the new packet and
increase the scheduler's qlen by one. In summary, pfifo_tail_enqueue()
can be leveraged to increase qlen by one while returning the
`NET_XMIT_CN` status code.
The problem is:
Let's say we have two qdiscs: Qdisc_A and Qdisc_B.
- Qdisc_A's type must have a '->graft()' function to create a parent/child relationship.
Let's say Qdisc_A's type is `hfsc`. Enqueueing a packet to this qdisc triggers `hfsc_enqueue`.
- Qdisc_B's type is pfifo_head_drop. Enqueueing a packet to this qdisc triggers `pfifo_tail_enqueue`.
- Qdisc_B is configured to have `sch->limit == 0`.
- Qdisc_A is configured to route enqueued packets to Qdisc_B.
Enqueueing a packet through Qdisc_A leads to:
- hfsc_enqueue(Qdisc_A) -> pfifo_tail_enqueue(Qdisc_B)
- Qdisc_B->q.qlen += 1
- pfifo_tail_enqueue() returns `NET_XMIT_CN`
- hfsc_enqueue() checks for `NET_XMIT_SUCCESS` and sees `NET_XMIT_CN` => hfsc_enqueue() doesn't increase the qlen of Qdisc_A.
The whole process leads to a situation where Qdisc_A->q.qlen == 0 and Qdisc_B->q.qlen == 1.
Replacing 'hfsc' with another type (for example: 'drr') still leads to the same problem.
This violates the design where the parent's qlen should equal the sum of its children's qlen.
Bug impact: This issue can be used for user->kernel privilege escalation when it is reachable.
Fixes: 57dbb2d83d10 ("sched: add head drop fifo queue")
Reported-by: Quang Le <quanglex97(a)gmail.com>
Signed-off-by: Quang Le <quanglex97(a)gmail.com>
Signed-off-by: Cong Wang <cong.wang(a)bytedance.com>
Link: https://patch.msgid.link/20250204005841.223511-2-xiyou.wangcong@gmail.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
[Lee: Backported to linux-6.6.y - fixed a minor surrounding diff conflict]
(cherry picked from commit e40cb34b7f247fe2e366fd192700d1b4f38196ca)
Signed-off-by: Lee Jones <lee(a)kernel.org>
---
- Applies cleanly to v6.1, v5.15, v5.10 and v5.4
net/sched/sch_fifo.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/net/sched/sch_fifo.c b/net/sched/sch_fifo.c
index e1040421b797..af5f2ab69b8d 100644
--- a/net/sched/sch_fifo.c
+++ b/net/sched/sch_fifo.c
@@ -39,6 +39,9 @@ static int pfifo_tail_enqueue(struct sk_buff *skb, struct Qdisc *sch,
{
unsigned int prev_backlog;
+ if (unlikely(READ_ONCE(sch->limit) == 0))
+ return qdisc_drop(skb, sch, to_free);
+
if (likely(sch->q.qlen < sch->limit))
return qdisc_enqueue_tail(skb, sch);
--
2.48.1.658.g4767266eb4-goog
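The qlen mismatch described in the commit message can be reproduced with a small userspace model. This is only a sketch under stated assumptions: the classes `HeadDropFifo` and `Parent` and the `fixed` switch are hypothetical stand-ins for pfifo_head_drop and a classful parent such as hfsc, and only the qlen bookkeeping is modeled, not the real qdisc API.

```python
# Status codes with the values used by the kernel's netdevice.h.
NET_XMIT_SUCCESS = 0x00
NET_XMIT_DROP = 0x01
NET_XMIT_CN = 0x02


class HeadDropFifo:
    """Toy model of pfifo_tail_enqueue(); `fixed` enables the new check."""

    def __init__(self, limit, fixed):
        self.limit = limit
        self.fixed = fixed
        self.queue = []

    def enqueue(self, pkt):
        if self.fixed and self.limit == 0:
            return NET_XMIT_DROP            # models qdisc_drop()
        if len(self.queue) < self.limit:
            self.queue.append(pkt)          # models qdisc_enqueue_tail()
            return NET_XMIT_SUCCESS
        if self.queue:
            self.queue.pop(0)               # drop the head to make room
        self.queue.append(pkt)              # ...then enqueue regardless
        return NET_XMIT_CN


class Parent:
    """Toy parent that only counts a packet on NET_XMIT_SUCCESS."""

    def __init__(self, child):
        self.child = child
        self.qlen = 0

    def enqueue(self, pkt):
        ret = self.child.enqueue(pkt)
        if ret == NET_XMIT_SUCCESS:
            self.qlen += 1
        return ret


def qlens_after_one_packet(fixed):
    parent = Parent(HeadDropFifo(limit=0, fixed=fixed))
    parent.enqueue("pkt")
    return parent.qlen, len(parent.child.queue)
```

Calling `qlens_after_one_packet(False)` models the pre-patch behaviour, where the child holds a packet the parent never counted; `qlens_after_one_packet(True)` models the early drop added by the patch.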
Hello maintainers,
I would like to report a potential lock ordering issue in the r8188eu
driver. This may lead to deadlocks under certain conditions.
The functions rtw_wx_set_wap() and rtw_wx_set_essid() acquire locks in
an order that contradicts the established locking hierarchy observed
in other parts of the driver:
1. They first take &pmlmepriv->scanned_queue.lock
2. Then call rtw_set_802_11_infrastructure_mode() which takes &pmlmepriv->lock
This is inverted compared to the common pattern seen in functions like
rtw_joinbss_event_prehandle(), rtw_createbss_cmd_callback(), and
others, which typically:
1. Take &pmlmepriv->lock first
2. Then take &pmlmepriv->scanned_queue.lock
This lock inversion creates a potential deadlock scenario when these
code paths execute concurrently.
Moreover, the call chain: rtw_wx_set_* ->
rtw_set_802_11_infrastructure_mode() -> rtw_free_assoc_resources()
could lead to recursive acquisition of &pmlmepriv->scanned_queue.lock,
potentially causing self-deadlock even without concurrency.
This issue exists in longterm kernels containing the r8188eu driver:
5.4.y (until 5.4.290)
5.10.y (until 5.10.234)
5.15.y (until 5.15.178)
6.1.y (until 6.1.129)
The r8188eu driver has been removed from upstream, but older
maintained versions (5.4.x–6.1.x) still include this driver and are
affected.
This issue was identified through static analysis. While I've verified
the locking patterns through code review, I'm not sufficiently
familiar with the driver's internals to propose a safe fix.
Thank you for your attention to this matter.
Best regards,
Gui-Dong Han
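The inversion described above can be illustrated with a tiny order checker. This is a sketch, not the static analysis used for the report: the helper `find_lock_inversions` is hypothetical, and the per-path lock sequences are written down by hand from the description in this message.

```python
from itertools import combinations


def find_lock_inversions(paths):
    """Return (first_path, second_path) pairs that nest two locks in
    opposite orders.

    `paths` maps a code-path name to the locks it takes, outermost first.
    """
    seen = {}        # (outer, inner) -> first path that used this order
    inversions = []
    for name, locks in paths.items():
        for outer, inner in combinations(locks, 2):
            if (inner, outer) in seen:
                inversions.append((seen[(inner, outer)], name))
            seen.setdefault((outer, inner), name)
    return inversions


# Lock orders as described in the report above.
REPORTED_PATHS = {
    "rtw_wx_set_wap": ["pmlmepriv->scanned_queue.lock", "pmlmepriv->lock"],
    "rtw_joinbss_event_prehandle": ["pmlmepriv->lock",
                                    "pmlmepriv->scanned_queue.lock"],
}
```

`find_lock_inversions(REPORTED_PATHS)` flags the two paths as conflicting, which is the ABBA pattern that can deadlock when they run concurrently.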
From: Kan Liang <kan.liang(a)linux.intel.com>
Perf doesn't work with a low freq.
perf record -e cpu_core/instructions/ppp -F 120
Error:
The sys_perf_event_open() syscall returned with 22 (Invalid argument)
for event (cpu_core/instructions/ppp).
"dmesg | grep -i perf" may provide additional information.
The limit_period() check avoids a low sampling period on a counter. It
is not intended to limit the frequency.
The check in x86_pmu_hw_config() should be limited to non-freq mode.
attr.sample_period and attr.sample_freq share a union, so
attr.sample_period should not be used to detect the freq mode.
Fixes: c46e665f0377 ("perf/x86: Add INST_RETIRED.ALL workarounds")
Closes: https://lore.kernel.org/lkml/20250115154949.3147-1-ravi.bangoria@amd.com/
Signed-off-by: Kan Liang <kan.liang(a)linux.intel.com>
Cc: Andi Kleen <ak(a)linux.intel.com>
Cc: Ravi Bangoria <ravi.bangoria(a)amd.com>
Cc: stable(a)vger.kernel.org
---
arch/x86/events/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 7b6430e5a77b..20ad5cca6ad2 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -630,7 +630,7 @@ int x86_pmu_hw_config(struct perf_event *event)
if (event->attr.type == event->pmu->type)
event->hw.config |= x86_pmu_get_event_config(event);
- if (event->attr.sample_period && x86_pmu.limit_period) {
+ if (!event->attr.freq && x86_pmu.limit_period) {
s64 left = event->attr.sample_period;
x86_pmu.limit_period(event, &left);
if (left > event->attr.sample_period)
--
2.38.1
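Why the old condition misfires can be shown with a userspace model of the union. This is a sketch under stated assumptions: `Attr` keeps only the three fields that matter here, `min_period` is a hypothetical stand-in for whatever floor limit_period() enforces, and the `fixed` flag switches between the old and new condition.

```python
import ctypes
import errno


class _Sample(ctypes.Union):
    # sample_period and sample_freq occupy the same storage.
    _fields_ = [("sample_period", ctypes.c_uint64),
                ("sample_freq", ctypes.c_uint64)]


class Attr(ctypes.Structure):
    _anonymous_ = ("u",)
    _fields_ = [("u", _Sample),
                ("freq", ctypes.c_uint)]   # 1 when sample_freq is in use


def hw_config(attr, min_period, fixed):
    """Model of the period check in x86_pmu_hw_config()."""
    if fixed:
        check_period = not attr.freq               # new condition
    else:
        check_period = bool(attr.sample_period)    # old condition
    if check_period:
        # Stand-in for limit_period(): raise the period to the floor.
        left = max(attr.sample_period, min_period)
        if left > attr.sample_period:
            return -errno.EINVAL
    return 0


def freq_attr(freq_hz):
    attr = Attr()
    attr.sample_freq = freq_hz
    attr.freq = 1
    return attr
```

For `freq_attr(120)`, reading `sample_period` yields 120 because of the shared storage, so the old condition treats the frequency as a too-small period and rejects the event, while the new condition skips the check.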
Hi all,
This series backports three upstream commits:
- 135ffc7 "bpf, vsock: Invoke proto::close on close()"
- fcdd224 "vsock: Keep the binding until socket destruction"
- 78dafe1 "vsock: Orphan socket after transport release"
Although this version of the kernel does not support sockmap, I think
backporting the first patch can be useful to reduce conflicts in future
backports [1]. It does not harm the system, but the comment it introduces
in the code can be misleading, so I added some words to the commit message
to explain the situation.
The other two commits are untouched, fixing a use-after-free [2] and a
null-ptr-deref [3] respectively.
[1]https://lore.kernel.org/stable/f7lr3ftzo66sl6phlcygh4xx4spga4b6je37fhawjr…
[2]https://lore.kernel.org/all/20250128-vsock-transport-vs-autobind-v3-0-1cf…
[3]https://lore.kernel.org/all/20250210-vsock-linger-nullderef-v3-0-ef6244d0…
Cheers,
Luigi
To: Stefano Garzarella <sgarzare(a)redhat.com>
To: Michal Luczaj <mhal(a)rbox.co>
To: stable(a)vger.kernel.org
Signed-off-by: Luigi Leonardi <leonardi(a)redhat.com>
---
Michal Luczaj (3):
bpf, vsock: Invoke proto::close on close()
vsock: Keep the binding until socket destruction
vsock: Orphan socket after transport release
net/vmw_vsock/af_vsock.c | 77 +++++++++++++++++++++++++++++++-----------------
1 file changed, 50 insertions(+), 27 deletions(-)
---
base-commit: 0cbb5f65e52f3e66410a7fe0edf75e1b2bf41e80
change-id: 20250220-backport_fix-9a9a58f64f14
Best regards,
--
Luigi Leonardi <leonardi(a)redhat.com>
Hi all,
This series backports three upstream commits:
- 135ffc7 "bpf, vsock: Invoke proto::close on close()"
- fcdd224 "vsock: Keep the binding until socket destruction"
- 78dafe1 "vsock: Orphan socket after transport release"
Although this version of the kernel does not support sockmap, I think
backporting the first patch can be useful to reduce conflicts in future
backports [1]. It does not harm the system, but the comment it introduces
in the code can be misleading, so I added some words to the commit message
to explain the situation.
The other two commits are untouched, fixing a use-after-free [2] and a
null-ptr-deref [3] respectively.
[1]https://lore.kernel.org/stable/f7lr3ftzo66sl6phlcygh4xx4spga4b6je37fhawjr…
[2]https://lore.kernel.org/all/20250128-vsock-transport-vs-autobind-v3-0-1cf…
[3]https://lore.kernel.org/all/20250210-vsock-linger-nullderef-v3-0-ef6244d0…
Cheers,
Luigi
To: Stefano Garzarella <sgarzare(a)redhat.com>
To: Michal Luczaj <mhal(a)rbox.co>
To: stable(a)vger.kernel.org
Signed-off-by: Luigi Leonardi <leonardi(a)redhat.com>
---
Michal Luczaj (3):
bpf, vsock: Invoke proto::close on close()
vsock: Keep the binding until socket destruction
vsock: Orphan socket after transport release
net/vmw_vsock/af_vsock.c | 77 +++++++++++++++++++++++++++++++-----------------
1 file changed, 50 insertions(+), 27 deletions(-)
---
base-commit: c16c81c81336c0912eb3542194f16215c0a40037
change-id: 20250220-backport_fix_5_15-27efd9233dc2
Best regards,
--
Luigi Leonardi <leonardi(a)redhat.com>
We've had instances of drivers returning invalid values from gpio_chip
callbacks. In several cases these return values would be propagated to
user-space and confuse programs that only expect 0 or negative errnos
from ioctl()s. Let's sanitize the return values of callbacks and make
sure we don't allow anyone to see invalid ones.
The first patch checks the return values of get_direction() in the kernel
where needed and is a backportable fix.
Signed-off-by: Bartosz Golaszewski <bartosz.golaszewski(a)linaro.org>
---
Bartosz Golaszewski (8):
gpiolib: check the return value of gpio_chip::get_direction()
gpiolib: sanitize the return value of gpio_chip::request()
gpiolib: sanitize the return value of gpio_chip::set_config()
gpiolib: sanitize the return value of gpio_chip::get()
gpiolib: sanitize the return value of gpio_chip::get_multiple()
gpiolib: sanitize the return value of gpio_chip::direction_output()
gpiolib: sanitize the return value of gpio_chip::direction_input()
gpiolib: sanitize the return value of gpio_chip::get_direction()
drivers/gpio/gpiolib.c | 144 +++++++++++++++++++++++++++++++++++---------
include/linux/gpio/driver.h | 6 +-
2 files changed, 120 insertions(+), 30 deletions(-)
---
base-commit: a13f6e0f405ed0d3bcfd37c692c7d7fa3c052154
change-id: 20241212-gpio-sanitize-retvals-f5f4e0d6f57d
Best regards,
--
Bartosz Golaszewski <bartosz.golaszewski(a)linaro.org>
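The idea of sanitizing a callback's return value can be sketched as a thin wrapper. This is an illustration only: `call_sanitized` is a hypothetical helper, and mapping invalid positive values to -EBADE is an assumption made here, since the cover letter does not name the error code used.

```python
import errno

# Assumption: report invalid values as -EBADE. Fall back to the Linux
# value (52) on platforms whose errno module lacks the constant.
EBADE = getattr(errno, "EBADE", 52)


def call_sanitized(callback, *args):
    """Call a driver-style callback that must return 0 or a negative errno.

    Any positive value is treated as a driver bug and replaced with
    -EBADE, so callers never observe it.
    """
    ret = callback(*args)
    if ret > 0:
        return -EBADE
    return ret
```

A caller such as an ioctl() handler can then rely on seeing only 0 or a negative errno, whatever the underlying driver returns.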
This patch series fixes refcount bugs.
Signed-off-by: Zijun Hu <quic_zijuhu(a)quicinc.com>
---
Changes in v2:
- Add 2 unittest patches + 1 refcount bug fix + 1 refcount comments patch
- Correct titles and commit messages
- Link to v1: https://lore.kernel.org/r/20241209-of_irq_fix-v1-0-782f1419c8a1@quicinc.com
---
Zijun Hu (9):
of: unittest: Add a case to test if API of_irq_parse_one() leaks refcount
of/irq: Fix device node refcount leakage in API of_irq_parse_one()
of: unittest: Add a case to test if API of_irq_parse_raw() leaks refcount
of/irq: Fix device node refcount leakage in API of_irq_parse_raw()
of/irq: Fix device node refcount leakages in of_irq_count()
of/irq: Fix device node refcount leakage in API irq_of_parse_and_map()
of/irq: Fix device node refcount leakages in of_irq_init()
of/irq: Add comments about refcount for API of_irq_find_parent()
of: resolver: Fix device node refcount leakage in of_resolve_phandles()
drivers/of/irq.c | 34 ++++++++++---
drivers/of/resolver.c | 2 +
drivers/of/unittest-data/tests-interrupts.dtsi | 13 +++++
drivers/of/unittest.c | 67 ++++++++++++++++++++++++++
4 files changed, 110 insertions(+), 6 deletions(-)
---
base-commit: 40fc0083a9dbcf2e81b1506274cb541f84d022ed
change-id: 20241208-of_irq_fix-659514bc9aa3
Best regards,
--
Zijun Hu <quic_zijuhu(a)quicinc.com>
There are two variables that indicate the interrupt type to be used
in the next test execution, global "irq_type" and test->irq_type.
The former is referenced from pci_endpoint_test_get_irq() to preserve
the current type for ioctl(PCITEST_GET_IRQTYPE).
In pci_endpoint_test_request_irq(), since this global variable is
referenced when an error occurs, an unintended error message is
displayed.
For example, the following message shows "MSI 3" even though the current
irq type has become "MSI-X".
# pcitest -i 2
pci-endpoint-test 0000:01:00.0: Failed to request IRQ 30 for MSI 3
SET IRQ TYPE TO MSI-X: NOT OKAY
Fix this issue by using test->irq_type instead of global "irq_type".
Cc: stable(a)vger.kernel.org
Fixes: b2ba9225e031 ("misc: pci_endpoint_test: Avoid using module parameter to determine irqtype")
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org>
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko(a)socionext.com>
---
drivers/misc/pci_endpoint_test.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/misc/pci_endpoint_test.c b/drivers/misc/pci_endpoint_test.c
index 9e56d200d2f0..acf3d8dab131 100644
--- a/drivers/misc/pci_endpoint_test.c
+++ b/drivers/misc/pci_endpoint_test.c
@@ -242,7 +242,7 @@ static int pci_endpoint_test_request_irq(struct pci_endpoint_test *test)
return 0;
fail:
- switch (irq_type) {
+ switch (test->irq_type) {
case IRQ_TYPE_INTX:
dev_err(dev, "Failed to request IRQ %d for Legacy\n",
pci_irq_vector(pdev, i));
--
2.25.1
After devm_request_irq() fails with error in
pci_endpoint_test_request_irq(), pci_endpoint_test_free_irq_vectors() is
called assuming that all IRQs have been released.
However, some requested IRQs remain unreleased, so there are still
/proc/irq/* entries remaining and this results in WARN() with the following
message:
remove_proc_entry: removing non-empty directory 'irq/30', leaking at least 'pci-endpoint-test.0'
WARNING: CPU: 0 PID: 202 at fs/proc/generic.c:719 remove_proc_entry+0x190/0x19c
To solve this issue, set the number of remaining IRQs to test->num_irqs
and release IRQs in advance by calling pci_endpoint_test_release_irq().
Cc: stable(a)vger.kernel.org
Fixes: e03327122e2c ("pci_endpoint_test: Add 2 ioctl commands")
Reviewed-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org>
Signed-off-by: Kunihiko Hayashi <hayashi.kunihiko(a)socionext.com>
---
drivers/misc/pci_endpoint_test.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/misc/pci_endpoint_test.c b/drivers/misc/pci_endpoint_test.c
index a3d2caa7a6bb..9e56d200d2f0 100644
--- a/drivers/misc/pci_endpoint_test.c
+++ b/drivers/misc/pci_endpoint_test.c
@@ -259,6 +259,9 @@ static int pci_endpoint_test_request_irq(struct pci_endpoint_test *test)
break;
}
+ test->num_irqs = i;
+ pci_endpoint_test_release_irq(test);
+
return ret;
}
--
2.25.1
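The cleanup added by this patch follows the usual "release what was already acquired" pattern, sketched below in isolation. The names `request_all_irqs`, `request_irq` and `release_irq` are hypothetical stand-ins and do not model devm_request_irq() itself.

```python
def request_all_irqs(vectors, request_irq, release_irq):
    """Request every vector; on failure, release the ones already held.

    Returns 0 on success or the first error code from request_irq().
    """
    requested = []
    for vec in vectors:
        err = request_irq(vec)
        if err:
            # Only the first len(requested) vectors were acquired, which
            # is what setting test->num_irqs = i expresses in the patch.
            for held in requested:
                release_irq(held)
            return err
        requested.append(vec)
    return 0
```

Without the release loop, the vectors acquired before the failure would stay registered, which is the state that leads to the remove_proc_entry warning quoted in the commit message.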