[+eas-dev]
On Mon, Oct 12, 2015 at 02:16:42AM -0700, Michael Turquette wrote:
Quoting Patrick Bellasi (2015-10-12 01:51:29)
On Fri, Oct 09, 2015 at 11:58:22AM +0100, Michael Turquette wrote:
Steve,
On Thu, Oct 8, 2015 at 1:59 AM, Steve Muckle steve.muckle@linaro.org wrote:
At Linaro Connect a couple weeks back there was some sentiment that taking the max of multiple capacity requests would be the only viable policy when cpufreq_sched is extended to support multiple sched classes. But I'm concerned that this is not workable - if CFS is requesting 1GHz worth of bandwidth on a 2GHz CPU, and DEADLINE is also requesting 1GHz of bandwidth, we would run the CPU at 1GHz and starve the CFS tasks indefinitely.
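The difference between the two policies can be sketched with a toy model (Python, purely for illustration; the 2 GHz ceiling and 1 GHz requests mirror the example above):

```python
# Toy aggregation of per-sched-class capacity requests, in GHz.
# Not real governor code; just illustrates the starvation problem.
def aggregate_max(requests):
    # "Max" policy: run at the largest single request.
    return max(requests)

def aggregate_sum(requests, cpu_max):
    # "Sum" policy: add requests, clamped at the hardware ceiling.
    return min(sum(requests), cpu_max)

requests = {"cfs": 1.0, "deadline": 1.0}   # each wants 1 GHz of a 2 GHz CPU
print(aggregate_max(requests.values()))         # 1.0 -> DEADLINE consumes it all, CFS starves
print(aggregate_sum(requests.values(), 2.0))    # 2.0 -> both classes fit
```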
For the scheduler classes I think that summing makes sense. For peripheral devices, though, summing places a much higher burden on the system integrator to figure out how much compute is needed to achieve a specific use case.
Are we still thinking about exposing an interface to device drivers?
With sched-DVFS we are able to select the CPU's OPP based on real (expected) task demand. Thus, I expect constraints from device drivers to be useful only in these use-cases:
a) the tasks activated by the driver are not "big enough" to trigger an OPP switch, but we still want to race-to-idle their execution, e.g. a lightweight control thread for an external accelerator, which could benefit from running at a higher OPP to reduce overall processing latencies
b) we do not know which tasks require a performance boost, e.g. something like the "boost pulse" exposed by the Interactive governor, where the input subsystem is assumed to trigger latency-sensitive operations and thus the whole system deserves to be boosted
Are there other classes of use-cases?
If these are the only use-cases we are thinking about, I'm wondering whether we could try to cover them all via SchedTune and the boost value it exposes.
The use-case a) is already covered by the first implementation of SchedTune. If we know which task should be boosted, technically we could expose the SchedTune interface to drivers to allow them to boosts specific tasks.
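As a rough sketch of the kind of boost-to-capacity mapping involved (Python, illustrative of SchedTune's approach rather than a verbatim copy of the kernel code; SCHED_CAPACITY_SCALE is the kernel's 1024 fixed-point capacity unit):

```python
SCHED_CAPACITY_SCALE = 1024  # kernel fixed-point unit for full CPU capacity

def boosted_util(util, boost_pct):
    """Illustrative SchedTune-style boost: the boost percentage consumes
    a share of the remaining capacity headroom above the task's raw
    utilization, inflating the capacity (and hence OPP) requested."""
    margin = (SCHED_CAPACITY_SCALE - util) * boost_pct // 100
    return util + margin

print(boosted_util(256, 0))    # 256: no boost, OPP follows raw demand
print(boosted_util(256, 50))   # 640: a small task now requests a higher OPP
```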
It sounds like it could work for me.
My concern is with bandwidth-related use-cases. I've observed high-speed MMC controllers, WiFi chips, GPUs and other devices whose CPU tasks were "small", but whose performance was adversely affected when the CPU ran at a slower rate. This is true even on a relatively quiet system.
I see your point, but I'm wondering whether most of these use-cases would not be better implemented using DEADLINE instead of FIFO/RR. For those cases the problem will be solved once DL is properly integrated with sched-DVFS. For the remaining use-cases where we still want (or are "limited") to use FIFO/RR, I lean toward the idea that a race-to-idle strategy could just work.
Tasks running on the CPU can be viewed as latencies from the perspective of a peripheral/IO device. TI and many other vendors have implemented out-of-tree solutions to hold a CPU at a minimum OPP/frequency when these drivers are running (usually with terrible hacks, but in some cases nicely wrapped up in runtime_pm_{get,put} callbacks).
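The pattern can be mocked up as follows (Python, all names hypothetical and for illustration only; in a real driver the add/remove calls would sit in the runtime_pm_{get,put} callbacks and talk to a constraint/QoS framework rather than this toy class):

```python
# Hypothetical mock of "hold a minimum OPP while the device is active".
class FreqConstraints:
    def __init__(self, default_khz):
        self.default_khz = default_khz
        self.floors = []               # currently active min-frequency requests

    def add_floor(self, khz):          # driver's runtime-resume path
        self.floors.append(khz)

    def remove_floor(self, khz):       # driver's runtime-suspend path
        self.floors.remove(khz)

    def effective_min(self):
        # The governor must honor the highest active floor.
        return max(self.floors, default=self.default_khz)

cpu = FreqConstraints(default_khz=200_000)
cpu.add_floor(1_000_000)
print(cpu.effective_min())   # 1000000 kHz while the device is active
cpu.remove_floor(1_000_000)
print(cpu.effective_min())   # back to 200000 kHz once it goes idle
```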
I know very well these scenarios, in the past I experimented a lot with x86 machines running OpenCL workloads offloaded on GPGPUs.
Quite frequently the bottleneck was the control thread running on the CPU, to the point that by co-scheduling two apps on the same GPU you could get better performance (for both apps) than by running one app at a time.
This calls for some kind of coordination on frequency selection between workloads running on the CPU and on the accelerator. But if you consider the specific GPGPU use-case, many times the source of knowledge about the required bandwidth is not kernel-space but user-space. The OpenCL run-time, like many other run-times, could provide valuable input to the scheduler about these dependencies.
Does this fit with your model of how schedtune is supposed to work? I
I think it's worth a try... provided that, if the results are promising, we eventually commit to replacing the CPUFreq-specific API exposed to drivers with a more generic interface exposed to both kernel- and user-space. If instead we end up with two different APIs to achieve the same goal, that will just be confusing.
have not looked at that stuff at all... are start_the_work() and stop_the_work() critical-section functions exposed to drivers?
Mmm... I don't quite get that question. Which functions/critical-sections are you referring to?
Regards, Mike
Cheers Patrick
Regarding the second use-case, b), this is a feature we try to address using the global boost value. Right now the only consumer of that global boost value is the FAIR scheduling class. However, it should be quite easy to extend it once sched-DVFS is integrated into the other scheduling classes.
However, I might be the only one concerned with that use case right now.
I'd think there has to be a summing of bandwidth requests from scheduler-class clients. MikeT raised the concern that in such schemes you often end up with a bunch of extra overhead because everyone adds their own fudge factor (Mike, please correct me if I'm misstating your concern here).
To be clear, I raised that point because I've actually seen that in the past when TI implemented out-of-tree cpu frequency constraint systems. I'm not being hypothetical :-)
That's the point, SchedTune aims at becoming a sort of (hopefully) official solution to setup "frequency constraints".
We should be able to control this in the scheduler classes though and ensure headroom is only added after the requests are combined.
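The difference is easy to show with invented numbers: if every client pads its own request and the governor then pads the combined total again, the padding compounds, whereas applying headroom once after combining does not (illustrative Python sketch, integer percentages to keep the arithmetic exact):

```python
# Illustrative only: compare per-client fudge factors that stack with
# governor headroom, vs. headroom applied once after combining.
def stacked_headroom(requests, client_pct=20, governor_pct=20):
    # Anti-pattern: each client pads its request, then the governor pads again.
    padded = [r + r * client_pct // 100 for r in requests]
    total = sum(padded)
    return total + total * governor_pct // 100

def single_headroom(requests, headroom_pct=20):
    # Raw requests are combined first; headroom is applied exactly once.
    total = sum(requests)
    return total + total * headroom_pct // 100

reqs = [300, 500, 200]          # raw demand from three clients, in MHz
print(stacked_headroom(reqs))   # 1440 MHz requested (20% padding, twice)
print(single_headroom(reqs))    # 1200 MHz requested (20% padding, once)
```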
Sounds promising. Everyone else in this thread supports aggregation (i.e. summing) over taking the maximum value, and I definitely won't argue the point on the list unless it presents a real problem in testing.
Regards, Mike
Thoughts?
-- Michael Turquette CEO BayLibre - At the Heart of Embedded Linux http://baylibre.com/
Cheers Patrick
-- #include <best/regards.h>
Patrick Bellasi
+Saravana (QC DVFS)
Caveat, I haven't read through the whole thread but wanted to comment on a few points:
However, is the _main_ goal of sched-DVFS to be energy-efficient?
I'm not sure it's the _main_ goal, but DVFS behavior does contribute heavily towards overall system energy-efficiency and performance, so it should be an important goal.
IMHO one of the "main" goals of sched-DVFS is to help provide (as much as possible) deterministic behavior. We have the chance to refactor CPUFreq to better integrate with the scheduler, and thus we should try to exploit this opportunity to improve the overall determinism of the solution.
Deterministic behavior would be great. However, we should also be mindful of response latency, so I would describe it as deterministic with minimal latency. I'm also concerned about trying to closely match dvfs response to scheduler demand. This is a policy decision and not always what you want.
But again, is the goal of sched-DVFS to be energy-efficient? I think that responsibility would be better assigned to other players, i.e. the scheduling classes.
I think scheduling classes can play a part in managing scheduler demand but don't think they are sufficient as the only DVFS management policy.
Specifically, if you care about responsiveness and energy-efficiency, you should use DEADLINE instead of FIFO/RR. Whereas if you go for the latter classes, then you should be aware that you get a race-to-idle behavior, whatever that means from an energy/power standpoint.
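For reference, a DEADLINE task declares its bandwidth explicitly (runtime/deadline/period, set via sched_setattr()), which gives the kernel something concrete to base admission control and frequency selection on. A simplified Python sketch of the bandwidth math (the real admission test is per root domain and caps total bandwidth below 100%; those details are omitted here):

```python
# Simplified sketch of SCHED_DEADLINE bandwidth accounting.
def dl_bandwidth(runtime_ns, period_ns):
    # A DEADLINE task's declared CPU bandwidth is runtime/period.
    return runtime_ns / period_ns

def admits(tasks, cap=1.0):
    # Toy admission control: total declared bandwidth must fit the cap.
    return sum(dl_bandwidth(r, p) for r, p in tasks) <= cap

# e.g. 10 ms of runtime every 30 ms period -> 1/3 of a CPU
tasks = [(10_000_000, 30_000_000), (15_000_000, 30_000_000)]
print(admits(tasks))                               # True: 1/3 + 1/2 fits
print(admits(tasks + [(20_000_000, 30_000_000)]))  # False: would exceed the cap
```

Summing these declared bandwidths is exactly the kind of aggregation discussed earlier in the thread, which FIFO/RR cannot offer since those classes carry no bandwidth information.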
I'm hoping you mean that DEADLINE is preferred over FIFO/RR as a better way for EAS to manage energy-efficiency. That makes sense to me for RT, but there still needs to be a plan for EAS to handle FIFO/RR in a sane manner.
- Bryan
(reflowed Bryan's mail a bit)
Quoting Huntsman, Bryan (2015-10-14 13:01:45)
+Saravana (QC DVFS)
Caveat, I haven't read through the whole thread but wanted to comment on a few points:
However, is the _main_ goal of sched-DVFS to be energy-efficient?
I'm not sure it's the _main_ goal, but DVFS behavior does contribute heavily towards overall system energy-efficiency and performance, so it should be an important goal.
At the risk of instigating a religious conflict, I think that I do know the _main_ goal of sched-dvfs: it is to coordinate the decisions taken by the frequency selection policy, the idle state selection policy and the scheduling/task placement policy.
What those policies look like is orthogonal to the goal of coordinating three separate actors in the Linux kernel that have sometimes worked at odds with each other in the past.
Put another way, the goal is to make it better ;-)
Regards, Mike