From: Shawn Guo shawnguo@kernel.org
A regression is seen with 6.6 -> 6.12 kernel upgrade on platforms where cpufreq-dt driver sets cpuinfo.transition_latency as CPUFREQ_ETERNAL (-1), due to that platform's DT doesn't provide the optional property 'clock-latency-ns'. The dbs sampling_rate was 10000 us on 6.6 and suddently becomes 6442450 us (4294967295 / 1000 * 1.5) on 6.12 for these platforms, because that the 10 ms cap for transition_delay_us was accidentally dropped by the commits below.
commit 37c6dccd6837 ("cpufreq: Remove LATENCY_MULTIPLIER") commit a755d0e2d41b ("cpufreq: Honour transition_latency over transition_delay_us") commit e13aa799c2a6 ("cpufreq: Change default transition delay to 2ms")
It slows down dbs governor's reacting to CPU loading change dramatically. Also, as transition_delay_us is used by schedutil governor as rate_limit_us, it shows a negative impact on device idle power consumption, because the device gets slightly less time in the lowest OPP.
Fix the regressions by adding the 10 ms cap on transition delay back.
Cc: stable@vger.kernel.org Fixes: 37c6dccd6837 ("cpufreq: Remove LATENCY_MULTIPLIER") Signed-off-by: Shawn Guo shawnguo@kernel.org --- drivers/cpufreq/cpufreq.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index fc7eace8b65b..36e0c85cb4e0 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -551,8 +551,13 @@ unsigned int cpufreq_policy_transition_delay_us(struct cpufreq_policy *policy)
latency = policy->cpuinfo.transition_latency / NSEC_PER_USEC; if (latency) - /* Give a 50% breathing room between updates */ - return latency + (latency >> 1); + /* + * Give a 50% breathing room between updates. + * And cap the transition delay to 10 ms for platforms + * where the latency is too high to be reasonable for + * reevaluating frequency. + */ + return min(latency + (latency >> 1), 10 * MSEC_PER_SEC);
return USEC_PER_MSEC; }
On Wed, Sep 10, 2025 at 02:53:12PM +0800, Shawn Guo wrote:
From: Shawn Guo shawnguo@kernel.org
A regression is seen with 6.6 -> 6.12 kernel upgrade on platforms where cpufreq-dt driver sets cpuinfo.transition_latency as CPUFREQ_ETERNAL (-1), due to that platform's DT doesn't provide the optional property 'clock-latency-ns'. The dbs sampling_rate was 10000 us on 6.6 and suddently becomes 6442450 us (4294967295 / 1000 * 1.5) on 6.12 for these platforms, because that the 10 ms cap for transition_delay_us was accidentally dropped by the commits below.
commit 37c6dccd6837 ("cpufreq: Remove LATENCY_MULTIPLIER") commit a755d0e2d41b ("cpufreq: Honour transition_latency over transition_delay_us") commit e13aa799c2a6 ("cpufreq: Change default transition delay to 2ms")
It slows down dbs governor's reacting to CPU loading change dramatically. Also, as transition_delay_us is used by schedutil governor as rate_limit_us, it shows a negative impact on device idle power consumption, because the device gets slightly less time in the lowest OPP.
Fix the regressions by adding the 10 ms cap on transition delay back.
Cc: stable@vger.kernel.org Fixes: 37c6dccd6837 ("cpufreq: Remove LATENCY_MULTIPLIER") Signed-off-by: Shawn Guo shawnguo@kernel.org
drivers/cpufreq/cpufreq.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index fc7eace8b65b..36e0c85cb4e0 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -551,8 +551,13 @@ unsigned int cpufreq_policy_transition_delay_us(struct cpufreq_policy *policy) latency = policy->cpuinfo.transition_latency / NSEC_PER_USEC; if (latency)
/* Give a 50% breathing room between updates */
return latency + (latency >> 1);
/*
* Give a 50% breathing room between updates.
* And cap the transition delay to 10 ms for platforms
* where the latency is too high to be reasonable for
* reevaluating frequency.
*/
return min(latency + (latency >> 1), 10 * MSEC_PER_SEC);
I guess it's more correct to use USEC_PER_MSEC instead, even if both have the value 1000. Will fix in v2.
Shawn
return USEC_PER_MSEC; } -- 2.43.0
On Wed, Sep 10, 2025 at 8:53 AM Shawn Guo shawnguo2@yeah.net wrote:
From: Shawn Guo shawnguo@kernel.org
A regression is seen with 6.6 -> 6.12 kernel upgrade on platforms where cpufreq-dt driver sets cpuinfo.transition_latency as CPUFREQ_ETERNAL (-1), due to that platform's DT doesn't provide the optional property 'clock-latency-ns'. The dbs sampling_rate was 10000 us on 6.6 and suddently becomes 6442450 us (4294967295 / 1000 * 1.5) on 6.12 for these platforms, because that the 10 ms cap for transition_delay_us was accidentally dropped by the commits below.
IIRC, this was not accidental.
Why do you want to address the issue in the cpufreq core instead of doing that in the cpufreq-dt driver?
CPUFREQ_ETERNAL doesn't appear to be a reasonable default for cpuinfo.transition_latency. Maybe just change the default there to 10 ms?
commit 37c6dccd6837 ("cpufreq: Remove LATENCY_MULTIPLIER") commit a755d0e2d41b ("cpufreq: Honour transition_latency over transition_delay_us") commit e13aa799c2a6 ("cpufreq: Change default transition delay to 2ms")
It slows down dbs governor's reacting to CPU loading change dramatically. Also, as transition_delay_us is used by schedutil governor as rate_limit_us, it shows a negative impact on device idle power consumption, because the device gets slightly less time in the lowest OPP.
Fix the regressions by adding the 10 ms cap on transition delay back.
Cc: stable@vger.kernel.org Fixes: 37c6dccd6837 ("cpufreq: Remove LATENCY_MULTIPLIER") Signed-off-by: Shawn Guo shawnguo@kernel.org
drivers/cpufreq/cpufreq.c | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c index fc7eace8b65b..36e0c85cb4e0 100644 --- a/drivers/cpufreq/cpufreq.c +++ b/drivers/cpufreq/cpufreq.c @@ -551,8 +551,13 @@ unsigned int cpufreq_policy_transition_delay_us(struct cpufreq_policy *policy)
latency = policy->cpuinfo.transition_latency / NSEC_PER_USEC; if (latency)
/* Give a 50% breathing room between updates */
return latency + (latency >> 1);
/*
* Give a 50% breathing room between updates.
* And cap the transition delay to 10 ms for platforms
* where the latency is too high to be reasonable for
* reevaluating frequency.
*/
return min(latency + (latency >> 1), 10 * MSEC_PER_SEC); return USEC_PER_MSEC;
}
2.43.0
On Fri, Sep 12, 2025 at 12:41:14PM +0200, Rafael J. Wysocki wrote:
On Wed, Sep 10, 2025 at 8:53 AM Shawn Guo shawnguo2@yeah.net wrote:
From: Shawn Guo shawnguo@kernel.org
A regression is seen with 6.6 -> 6.12 kernel upgrade on platforms where cpufreq-dt driver sets cpuinfo.transition_latency as CPUFREQ_ETERNAL (-1), due to that platform's DT doesn't provide the optional property 'clock-latency-ns'. The dbs sampling_rate was 10000 us on 6.6 and suddently becomes 6442450 us (4294967295 / 1000 * 1.5) on 6.12 for these platforms, because that the 10 ms cap for transition_delay_us was accidentally dropped by the commits below.
IIRC, this was not accidental.
I could be wrong, but my understanding is that the intention of Qais's commits is to drop 10 ms (and LATENCY_MULTIPLIER) as the *minimal* limit on transition_delay_us, so that it's possible to get a much less transition_delay_us on platforms like M1 mac mini where the transition latency is just tens of us. But it breaks platforms where 10 ms used to be the *maximum* limit.
Even if it's intentional to remove 10 ms as both the minimal and maximum limits, breaking some platforms must not be intentional, I guess :)
Why do you want to address the issue in the cpufreq core instead of doing that in the cpufreq-dt driver?
My intuition was to fix the regression at where the regression was introduced by recovering the code behavior.
CPUFREQ_ETERNAL doesn't appear to be a reasonable default for cpuinfo.transition_latency. Maybe just change the default there to 10 ms?
I think cpufreq-dt is doing what it's asked to do, no?
/* * Maximum transition latency is in nanoseconds - if it's unknown, * CPUFREQ_ETERNAL shall be used. */
Also, 10 ms will then be turned into 15 ms by:
/* Give a 50% breathing room between updates */ return latency + (latency >> 1);
Shawn
linux-stable-mirror@lists.linaro.org