The rate_limit_us for the schedutil governor is set to 500 ms by default on the ARM64 hikey board. That is far too high, even for a default value. Let's set the default transition_delay_us to something more realistic (10 ms); userspace can always set whatever value it wants.
Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 drivers/cpufreq/cpufreq-dt.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
index c943787d761e..70eac3fd89ac 100644
--- a/drivers/cpufreq/cpufreq-dt.c
+++ b/drivers/cpufreq/cpufreq-dt.c
@@ -275,6 +275,9 @@ static int cpufreq_init(struct cpufreq_policy *policy)
 
 	policy->cpuinfo.transition_latency = transition_latency;
 
+	/* Set the default transition delay to 10ms */
+	policy->transition_delay_us = 10 * USEC_PER_MSEC;
+
 	return 0;
 
 out_free_cpufreq_table:
Hi Viresh,
On Mon, May 22 2017 at 05:10, Viresh Kumar wrote:
> The rate_limit_us for the schedutil governor is set to 500 ms by default on the ARM64 hikey board. That is far too high, even for a default value. Let's set the default transition_delay_us to something more realistic (10 ms); userspace can always set whatever value it wants.
Just a thought - do you think we can treat the reported transition latency as a proxy for the "cost" of freq transitions? I.e. assume that on platforms with very fast frequency switching it's probably cheap to switch frequency and we want schedutil to respond quickly, whereas on platforms with big latencies, frequency switches might be expensive and we probably want hysteresis.
If that makes sense then maybe we could use 10 * transition_latency / NSEC_PER_USEC, when transition_latency is reported? Otherwise 10 ms seems sensible to me.
Cheers, Brendan
> Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
> ---
>  drivers/cpufreq/cpufreq-dt.c | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
> index c943787d761e..70eac3fd89ac 100644
> --- a/drivers/cpufreq/cpufreq-dt.c
> +++ b/drivers/cpufreq/cpufreq-dt.c
> @@ -275,6 +275,9 @@ static int cpufreq_init(struct cpufreq_policy *policy)
>
>  	policy->cpuinfo.transition_latency = transition_latency;
>
> +	/* Set the default transition delay to 10ms */
> +	policy->transition_delay_us = 10 * USEC_PER_MSEC;
> +
>  	return 0;
>
>  out_free_cpufreq_table:
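Brendan's suggestion can be sketched in plain C as follows. This is only an illustration, not kernel code: the helper name default_transition_delay_us() is hypothetical, the macros are redefined locally with the same values the kernel uses, and treating a latency of 0 as "not reported" is an assumption made for the sketch.

```c
#define NSEC_PER_USEC 1000U	/* same value as the kernel macro */
#define USEC_PER_MSEC 1000U	/* same value as the kernel macro */

/*
 * Hypothetical helper (not a kernel function): derive a default
 * transition delay in microseconds from a latency in nanoseconds.
 */
unsigned int default_transition_delay_us(unsigned int transition_latency_ns)
{
	/* Assumption: 0 means the platform reported no latency. */
	if (transition_latency_ns == 0)
		return 10 * USEC_PER_MSEC;	/* fixed 10 ms fallback */

	/* 10 * transition_latency / NSEC_PER_USEC, widened to avoid overflow. */
	return (unsigned int)(10ULL * transition_latency_ns / NSEC_PER_USEC);
}
```

With a reported latency of 500000 ns (500 us), the helper returns 5000 us, i.e. 5 ms; with no latency reported it falls back to 10000 us.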
On 22-05-17, 11:45, Brendan Jackman wrote:
> Hi Viresh,
> On Mon, May 22 2017 at 05:10, Viresh Kumar wrote:
> > The rate_limit_us for the schedutil governor is set to 500 ms by default on the ARM64 hikey board. That is far too high, even for a default value. Let's set the default transition_delay_us to something more realistic (10 ms); userspace can always set whatever value it wants.
> Just a thought - do you think we can treat the reported transition latency as a proxy for the "cost" of freq transitions? I.e. assume that on platforms with very fast frequency switching it's probably cheap to switch frequency and we want schedutil to respond quickly, whereas on platforms with big latencies, frequency switches might be expensive and we probably want hysteresis.
> If that makes sense then maybe we could use 10 * transition_latency / NSEC_PER_USEC, when transition_latency is reported? Otherwise 10 ms seems sensible to me.
So my platform (hikey) does provide transition-latency as 500 us. But schedutil multiplies that by LATENCY_MULTIPLIER (1000), which makes rate_limit_us 500000 (500 ms), and that is unacceptable.
@Rafael: Why does LATENCY_MULTIPLIER have such a high value? I am not sure I completely understand why we have this multiplier :(
On Mon, May 22, 2017 at 04:25:22PM +0530, Viresh Kumar wrote:
> On 22-05-17, 11:45, Brendan Jackman wrote:
> > Hi Viresh,
> > On Mon, May 22 2017 at 05:10, Viresh Kumar wrote:
> > > The rate_limit_us for the schedutil governor is set to 500 ms by default on the ARM64 hikey board. That is far too high, even for a default value. Let's set the default transition_delay_us to something more realistic (10 ms); userspace can always set whatever value it wants.
> > Just a thought - do you think we can treat the reported transition latency as a proxy for the "cost" of freq transitions? I.e. assume that on platforms with very fast frequency switching it's probably cheap to switch frequency and we want schedutil to respond quickly, whereas on platforms with big latencies, frequency switches might be expensive and we probably want hysteresis.
> > If that makes sense then maybe we could use 10 * transition_latency / NSEC_PER_USEC, when transition_latency is reported? Otherwise 10 ms seems sensible to me.
> So my platform (hikey) does provide transition-latency as 500 us. But schedutil multiplies that by LATENCY_MULTIPLIER (1000), which makes rate_limit_us 500000 (500 ms), and that is unacceptable.
> @Rafael: Why does LATENCY_MULTIPLIER have such a high value? I am not sure I completely understand why we have this multiplier :(
This afternoon Amit pointed me to this patch. Should it be fixed as below? Otherwise it seems to assign a value in 'ns' directly to a field in 'us' without any unit conversion.
diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
index 76877a6..dcc90fc 100644
--- a/kernel/sched/cpufreq_schedutil.c
+++ b/kernel/sched/cpufreq_schedutil.c
@@ -538,7 +538,7 @@ static int sugov_init(struct cpufreq_policy *policy)
 		unsigned int lat;
 
 		tunables->rate_limit_us = LATENCY_MULTIPLIER;
-		lat = policy->cpuinfo.transition_latency / NSEC_PER_USEC;
+		lat = policy->cpuinfo.transition_latency / NSEC_PER_MSEC;
 		if (lat)
 			tunables->rate_limit_us *= lat;
 	}
Thanks, Leo Yan
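To see what the proposed divisor change does to the numbers, the calculation in the hunk above can be reproduced outside the kernel. A minimal sketch, assuming a hypothetical function name sketch_rate_limit_us() and locally redefined macros whose values match those mentioned in the thread:

```c
#define NSEC_PER_USEC      1000U
#define NSEC_PER_MSEC      1000000U
#define LATENCY_MULTIPLIER 1000U	/* value stated in the thread */

/*
 * Hypothetical stand-in for the arithmetic in sugov_init(); the divisor
 * is a parameter so both variants of the hunk can be compared.
 */
unsigned int sketch_rate_limit_us(unsigned int transition_latency_ns,
				  unsigned int divisor)
{
	unsigned int rate_limit_us = LATENCY_MULTIPLIER;
	unsigned int lat = transition_latency_ns / divisor;

	/* A quotient of 0 leaves the bare multiplier in place. */
	if (lat)
		rate_limit_us *= lat;

	return rate_limit_us;
}
```

For a latency of 500000 ns, dividing by NSEC_PER_USEC yields 500000 us (500 ms). Dividing by NSEC_PER_MSEC truncates to 0, so the result stays at the bare multiplier of 1000 us; any latency below 1 ms behaves the same way.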
On 22-05-17, 19:17, Leo Yan wrote:
> This afternoon Amit pointed me to this patch. Should it be fixed as below? Otherwise it seems to assign a value in 'ns' directly to a field in 'us' without any unit conversion.
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index 76877a6..dcc90fc 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -538,7 +538,7 @@ static int sugov_init(struct cpufreq_policy *policy)
>  		unsigned int lat;
>
>  		tunables->rate_limit_us = LATENCY_MULTIPLIER;
> -		lat = policy->cpuinfo.transition_latency / NSEC_PER_USEC;
> +		lat = policy->cpuinfo.transition_latency / NSEC_PER_MSEC;
>  		if (lat)
>  			tunables->rate_limit_us *= lat;
>  	}
I will let Rafael comment as well. NSEC_PER_USEC is used by the earlier governors (ondemand/conservative) in exactly the same way as schedutil uses it.
On Monday, May 22, 2017 04:57:27 PM Viresh Kumar wrote:
> On 22-05-17, 19:17, Leo Yan wrote:
> > This afternoon Amit pointed me to this patch. Should it be fixed as below? Otherwise it seems to assign a value in 'ns' directly to a field in 'us' without any unit conversion.
> > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> > index 76877a6..dcc90fc 100644
> > --- a/kernel/sched/cpufreq_schedutil.c
> > +++ b/kernel/sched/cpufreq_schedutil.c
> > @@ -538,7 +538,7 @@ static int sugov_init(struct cpufreq_policy *policy)
> >  		unsigned int lat;
> >
> >  		tunables->rate_limit_us = LATENCY_MULTIPLIER;
> > -		lat = policy->cpuinfo.transition_latency / NSEC_PER_USEC;
> > +		lat = policy->cpuinfo.transition_latency / NSEC_PER_MSEC;
> >  		if (lat)
> >  			tunables->rate_limit_us *= lat;
> >  	}
> I will let Rafael comment as well. NSEC_PER_USEC is used by the earlier governors (ondemand/conservative) in exactly the same way as schedutil uses it.
The reason why it is used by schedutil is because the other governors used it that way. IOW, doesn't matter. :-)
Thanks, Rafael
On 27-06-17, 02:15, Rafael J. Wysocki wrote:
> On Monday, May 22, 2017 04:57:27 PM Viresh Kumar wrote:
> > On 22-05-17, 19:17, Leo Yan wrote:
> > > This afternoon Amit pointed me to this patch. Should it be fixed as below? Otherwise it seems to assign a value in 'ns' directly to a field in 'us' without any unit conversion.
> > > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> > > index 76877a6..dcc90fc 100644
> > > --- a/kernel/sched/cpufreq_schedutil.c
> > > +++ b/kernel/sched/cpufreq_schedutil.c
> > > @@ -538,7 +538,7 @@ static int sugov_init(struct cpufreq_policy *policy)
> > >  		unsigned int lat;
> > >
> > >  		tunables->rate_limit_us = LATENCY_MULTIPLIER;
> > > -		lat = policy->cpuinfo.transition_latency / NSEC_PER_USEC;
I think the above line is just fine and the below one is incorrect, as we wanted to convert transition latency to usec here (i.e. in the units of rate_limit_us).
> > > +		lat = policy->cpuinfo.transition_latency / NSEC_PER_MSEC;
> > >  		if (lat)
> > >  			tunables->rate_limit_us *= lat;
> > >  	}
> > I will let Rafael comment as well. NSEC_PER_USEC is used by the earlier governors (ondemand/conservative) in exactly the same way as schedutil uses it.
> The reason why it is used by schedutil is because the other governors used it that way. IOW, doesn't matter. :-)
But I feel the value of LATENCY_MULTIPLIER (1000) is way too high. It currently says that if a frequency switch takes time X, then we should wait another 999X before changing the frequency again.
Perhaps LATENCY_MULTIPLIER should be just 10 or 20 here. For a platform with a transition_latency of 500 us, rate_limit_us comes to 500 ms, which is absurd. We ideally want it to be around 10-20 ms here. And compared to other ARM platforms, a 500 us transition_latency is very low; it is normally around 1-3 ms on ARM32 platforms.
@Rafael: Would it be fine to lower the value of LATENCY_MULTIPLIER?
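The effect of a smaller multiplier can be checked with a few lines of C. This is a sketch only: rate_limit_for_multiplier() is a hypothetical helper, NSEC_PER_USEC is redefined locally, and the multipliers 10 and 20 are the candidates floated above, not values taken from the kernel.

```c
#define NSEC_PER_USEC 1000U	/* same value as the kernel macro */

/*
 * Hypothetical helper: same arithmetic as the schedutil default,
 * but with the multiplier passed in instead of fixed at 1000.
 */
unsigned int rate_limit_for_multiplier(unsigned int transition_latency_ns,
				       unsigned int multiplier)
{
	unsigned int lat = transition_latency_ns / NSEC_PER_USEC;

	/* A latency below 1 us leaves the bare multiplier as the result. */
	return lat ? multiplier * lat : multiplier;
}
```

For a 500 us latency this gives 5000 us with a multiplier of 10 and 10000 us with 20, against 500000 us with the current 1000. For a 3 ms latency, a multiplier of 20 gives 60000 us.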
Hi,
On Tue, Jun 27, 2017 at 6:20 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> On 27-06-17, 02:15, Rafael J. Wysocki wrote:
> > On Monday, May 22, 2017 04:57:27 PM Viresh Kumar wrote:
> > > On 22-05-17, 19:17, Leo Yan wrote:
> > > > This afternoon Amit pointed me to this patch. Should it be fixed as below? Otherwise it seems to assign a value in 'ns' directly to a field in 'us' without any unit conversion.
> > > > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> > > > index 76877a6..dcc90fc 100644
> > > > --- a/kernel/sched/cpufreq_schedutil.c
> > > > +++ b/kernel/sched/cpufreq_schedutil.c
> > > > @@ -538,7 +538,7 @@ static int sugov_init(struct cpufreq_policy *policy)
> > > >  		unsigned int lat;
> > > >
> > > >  		tunables->rate_limit_us = LATENCY_MULTIPLIER;
> > > > -		lat = policy->cpuinfo.transition_latency / NSEC_PER_USEC;
> I think the above line is just fine and the below one is incorrect, as we wanted to convert transition latency to usec here (i.e. in the units of rate_limit_us).
> > > > +		lat = policy->cpuinfo.transition_latency / NSEC_PER_MSEC;
> > > >  		if (lat)
> > > >  			tunables->rate_limit_us *= lat;
> > > >  	}
> > > I will let Rafael comment as well. NSEC_PER_USEC is used by the earlier governors (ondemand/conservative) in exactly the same way as schedutil uses it.
> > The reason why it is used by schedutil is because the other governors used it that way. IOW, doesn't matter. :-)
> But I feel the value of LATENCY_MULTIPLIER (1000) is way too high. It currently says that if a frequency switch takes time X, then we should wait another 999X before changing the frequency again.
> Perhaps LATENCY_MULTIPLIER should be just 10 or 20 here. For a platform with a transition_latency of 500 us, rate_limit_us comes to 500 ms, which is absurd. We ideally want it to be around 10-20 ms here. And compared to other ARM platforms, a 500 us transition_latency is very low; it is normally around 1-3 ms on ARM32 platforms.
> @Rafael: Would it be fine to lower the value of LATENCY_MULTIPLIER?
We can do that, but then I think we need to compensate for the change in the old governors code or there may be surprises.
Thanks, Rafael
On 27-06-17, 18:08, Rafael J. Wysocki wrote:
> On Tue, Jun 27, 2017 at 6:20 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > @Rafael: Would it be fine to lower the value of LATENCY_MULTIPLIER?
> We can do that, but then I think we need to compensate for the change in the old governors code or there may be surprises.
Why shouldn't we change the value of LATENCY_MULTIPLIER for old governors as well? They use the same calculations and the sampling rate there is also this bad (like rate_limit_us).
If we aren't going to change that for old governors, then we can create a local version of LATENCY_MULTIPLIER for schedutil I believe.
On Wednesday, June 28, 2017 09:44:55 AM Viresh Kumar wrote:
> On 27-06-17, 18:08, Rafael J. Wysocki wrote:
> > On Tue, Jun 27, 2017 at 6:20 AM, Viresh Kumar <viresh.kumar@linaro.org> wrote:
> > > @Rafael: Would it be fine to lower the value of LATENCY_MULTIPLIER?
> > We can do that, but then I think we need to compensate for the change in the old governors code or there may be surprises.
> Why shouldn't we change the value of LATENCY_MULTIPLIER for old governors as well? They use the same calculations and the sampling rate there is also this bad (like rate_limit_us).
On some systems. On other systems it isn't.
> If we aren't going to change that for old governors, then we can create a local version of LATENCY_MULTIPLIER for schedutil I believe.
OK, so at least for intel_pstate and acpi-cpufreq we want a 10 ms default which is what we have currently.
If you want to rework all that, make sure you preserve that.
Thanks, Rafael
linaro-kernel@lists.linaro.org