On Tue, Mar 01, 2022 at 08:06:24PM -0800, Doug Smythies wrote:
On Tue, Mar 1, 2022 at 9:34 AM Rafael J. Wysocki <rafael@kernel.org> wrote:
I guess the numbers above could be reduced still by using a P-state below the max non-turbo one as a limit.
Yes, and for a test I did "rjw-3".
overruns: 1042. max overrun time: 9,769 uSec.
This would probably get worse then, though.
Yes, that was my expectation, but not what happened.
rjw-3: ave: 3.09 watts min: 3.01 watts max: 31.7 watts ave freq: 2.42 GHz. overruns: 12. (I did not expect this.) Max overrun time: 621 uSec.
Note 1: IRQs increased by 74%, i.e. it was going in and out of idle a lot more.
Note 2: We know that processor package power is highly temperature dependent. I forgot to let my coolant cool adequately after the kernel compile, and so had to throw out the first 4 power samples (20 minutes).
I retested both rjw-2 and rjw-3, albeit with shorter tests, and got 0 overruns in both cases.
One thought: can we consider trying the previous debug patch, which calls the util update when entering idle (time limited)?
In the current code, the RT/CFS/Deadline classes all have places that call cpufreq_update_util(); this patch makes sure it is also called from the idle path, so all four classes are covered, while still following the 'schedutil' principle of not introducing more system cost. And surely I could be missing some details here.
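For reference, and only as an illustrative sketch (not part of the patch; exact call sites and flags vary between kernel versions), CFS reaches cpufreq_update_util() through a small helper along these lines in kernel/sched/fair.c, and the RT and deadline classes invoke it from their own update/enqueue paths:

/*
 * Simplified sketch of the existing CFS hook; the real helper in
 * kernel/sched/fair.c carries extra comments and flag handling.
 */
static inline void cfs_rq_util_change(struct cfs_rq *cfs_rq, int flags)
{
	struct rq *rq = rq_of(cfs_rq);

	/* Only the root cfs_rq of this CPU drives a frequency update. */
	if (&rq->cfs == cfs_rq)
		cpufreq_update_util(rq, flags);
}

The idle path has no such hook today, which is what the patch below adds.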
Following is a cleaner version of the patch. The code could also be moved down into the internal loop:
while (!need_resched()) {
}
which would make it get called more frequently (a rough sketch of that placement follows the patch below).
---
diff --git a/kernel/sched/idle.c b/kernel/sched/idle.c
index d17b0a5ce6ac..e12688036725 100644
--- a/kernel/sched/idle.c
+++ b/kernel/sched/idle.c
@@ -258,15 +258,23 @@ static void cpuidle_idle_call(void)
  *
  * Called with polling cleared.
  */
+DEFINE_PER_CPU(u64, last_util_update_time);	/* in jiffies */
 static void do_idle(void)
 {
 	int cpu = smp_processor_id();
+	u64 expire;
 
 	/*
 	 * Check if we need to update blocked load
 	 */
 	nohz_run_idle_balance(cpu);
 
+	expire = __this_cpu_read(last_util_update_time) + HZ * 3;
+	if (unlikely(time_is_before_jiffies((unsigned long)expire))) {
+		cpufreq_update_util(this_rq(), 0);
+		__this_cpu_write(last_util_update_time, get_jiffies_64());
+	}
+
 	/*
 	 * If the arch has a polling bit, we maintain an invariant:
 	 *
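As a rough, untested sketch (assuming the current structure of do_idle(); surrounding lines are abbreviated), the alternative placement inside the internal loop mentioned above would look something like this:

	while (!need_resched()) {
		u64 expire;

		/* Same ~3 second per-CPU rate limit as in the hunk above. */
		expire = __this_cpu_read(last_util_update_time) + HZ * 3;
		if (unlikely(time_is_before_jiffies((unsigned long)expire))) {
			cpufreq_update_util(this_rq(), 0);
			__this_cpu_write(last_util_update_time, get_jiffies_64());
		}

		rmb();
		local_irq_disable();

		/* ... existing cpu_idle_poll() / cpuidle_idle_call() path ... */
	}

The check would then run on every iteration of the loop while the CPU stays in the idle loop, rather than once per idle entry, which is what makes it fire more often.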
Thanks,
Feng
ATM I'm not quite sure why this happens, but you seem to have some insight into it, so it would help if you shared it.
My insight seems questionable.
My thinking was that one cannot decide whether the pstate needs to go down based on such a localized look. The risk is that the higher periodic load might suffer overruns. Since my first test did exactly that, I violated my own "repeat all tests 3 times before reporting" rule. Now I am not sure what is going on. I will need more time to acquire traces and dig into it.
I also did a 1 hour intel_pstate_tracer test, with rjw-2, on an idle system and saw several long durations. This was expected, as this patch set wouldn't change durations by more than a few jiffies. There were 755 long durations (>6.1 seconds), with the longest at 327.7 seconds.
... Doug