This reverts commit b3b274bc9d3d7307308aeaf75f70731765ac999a.
On the DragonBoard 820c (which uses APQ8096/MSM8996) this change causes the CPUs to downclock to roughly half speed under sustained load. The regression is visible both during boot and when running CPU stress workloads such as stress-ng: the CPUs initially ramp up to the expected frequency, then drop to a lower OPP even though the system is clearly CPU-bound.
Bisecting points to this commit, and reverting it restores the expected behaviour on the DragonBoard 820c: the CPUs track the cpufreq policy and run at full performance under load.
The exact interaction with the ACD is not yet fully understood and we would like to keep ACD in use to avoid possible SoC reliability issues. Until we have a better fix that preserves ACD while avoiding this performance regression, revert the bisected patch to restore the previous behaviour.
Fixes: b3b274bc9d3d ("clk: qcom: cpu-8996: simplify the cpu_clk_notifier_cb")
Cc: stable@vger.kernel.org # v6.3+
Link: https://lore.kernel.org/linux-arm-msm/20230113120544.59320-8-dmitry.baryshko...
Cc: Dmitry Baryshkov <dmitry.baryshkov@oss.qualcomm.com>
Signed-off-by: Christopher Obbard <christopher.obbard@linaro.org>
---
Hi all,
This series contains a single revert for a regression affecting the APQ8096/MSM8996 (DragonBoard 820c).
The commit being reverted, b3b274bc9d3d ("clk: qcom: cpu-8996: simplify the cpu_clk_notifier_cb"), introduces a significant performance issue where the CPUs downclock to ~50% of their expected frequency under sustained load. The problem is reproducible both at boot and when running CPU-bound workloads such as stress-ng.
Bisecting the issue pointed directly to this commit and reverting it restores correct cpufreq behaviour.
The root cause appears to be related to the interaction between the simplified notifier callback and ACD (Adaptive Clock Distribution). Since we would prefer to keep ACD enabled for SoC reliability reasons, a revert is the safest option until a proper fix is identified.
Full details are included in the commit message.
Feedback & suggestions welcome.
Cheers!
Christopher Obbard
---
 drivers/clk/qcom/clk-cpu-8996.c | 30 +++++++++++-------------------
 1 file changed, 11 insertions(+), 19 deletions(-)

diff --git a/drivers/clk/qcom/clk-cpu-8996.c b/drivers/clk/qcom/clk-cpu-8996.c
index 21d13c0841ed..028476931747 100644
--- a/drivers/clk/qcom/clk-cpu-8996.c
+++ b/drivers/clk/qcom/clk-cpu-8996.c
@@ -547,35 +547,27 @@ static int cpu_clk_notifier_cb(struct notifier_block *nb, unsigned long event,
 {
 	struct clk_cpu_8996_pmux *cpuclk = to_clk_cpu_8996_pmux_nb(nb);
 	struct clk_notifier_data *cnd = data;
+	int ret;
 
 	switch (event) {
 	case PRE_RATE_CHANGE:
+		ret = clk_cpu_8996_pmux_set_parent(&cpuclk->clkr.hw, ALT_INDEX);
 		qcom_cpu_clk_msm8996_acd_init(cpuclk->clkr.regmap);
-
-		/*
-		 * Avoid overvolting. clk_core_set_rate_nolock() walks from top
-		 * to bottom, so it will change the rate of the PLL before
-		 * chaging the parent of PMUX. This can result in pmux getting
-		 * clocked twice the expected rate.
-		 *
-		 * Manually switch to PLL/2 here.
-		 */
-		if (cnd->new_rate < DIV_2_THRESHOLD &&
-		    cnd->old_rate > DIV_2_THRESHOLD)
-			clk_cpu_8996_pmux_set_parent(&cpuclk->clkr.hw, SMUX_INDEX);
-
 		break;
-	case ABORT_RATE_CHANGE:
-		/* Revert manual change */
-		if (cnd->new_rate < DIV_2_THRESHOLD &&
-		    cnd->old_rate > DIV_2_THRESHOLD)
-			clk_cpu_8996_pmux_set_parent(&cpuclk->clkr.hw, ACD_INDEX);
+	case POST_RATE_CHANGE:
+		if (cnd->new_rate < DIV_2_THRESHOLD)
+			ret = clk_cpu_8996_pmux_set_parent(&cpuclk->clkr.hw,
+							   SMUX_INDEX);
+		else
+			ret = clk_cpu_8996_pmux_set_parent(&cpuclk->clkr.hw,
+							   ACD_INDEX);
 		break;
 	default:
+		ret = 0;
 		break;
 	}
 
-	return NOTIFY_OK;
+	return notifier_from_errno(ret);
 };
 
 static int qcom_cpu_clk_msm8996_driver_probe(struct platform_device *pdev)

---
base-commit: c17e270dfb342a782d69c4a7c4c32980455afd9c
change-id: 20251202-wip-obbardc-qcom-msm8096-clk-cpu-fix-downclock-b7561da4cb95

Best regards,