[+eas-dev]
On Mon, Oct 12, 2015 at 02:16:42AM -0700, Michael Turquette wrote:
Quoting Patrick Bellasi (2015-10-12 01:51:29)
On Fri, Oct 09, 2015 at 11:58:22AM +0100, Michael Turquette wrote:
Steve,
On Thu, Oct 8, 2015 at 1:59 AM, Steve Muckle steve.muckle@linaro.org wrote:
At Linaro Connect a couple weeks back there was some sentiment that taking the max of multiple capacity requests would be the only viable policy when cpufreq_sched is extended to support multiple sched classes. But I'm concerned that this is not workable - if CFS is requesting 1GHz worth of bandwidth on a 2GHz CPU, and DEADLINE is also requesting 1GHz of bandwidth, we would run the CPU at 1GHz and starve the CFS tasks indefinitely.
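The difference between the two policies can be sketched with a toy model (Python, purely for illustration; the 2 GHz ceiling and 1 GHz requests mirror the example above):

```python
# Toy aggregation of per-sched-class capacity requests, in GHz.
# Not real governor code; just illustrates the starvation problem.
def aggregate_max(requests):
    # "Max" policy: run at the largest single request.
    return max(requests)

def aggregate_sum(requests, cpu_max):
    # "Sum" policy: add requests, clamped at the hardware ceiling.
    return min(sum(requests), cpu_max)

requests = {"cfs": 1.0, "deadline": 1.0}   # each wants 1 GHz of a 2 GHz CPU
print(aggregate_max(requests.values()))         # 1.0 -> DEADLINE consumes it all, CFS starves
print(aggregate_sum(requests.values(), 2.0))    # 2.0 -> both classes fit
```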
For the scheduler classes I think that summing makes sense. For peripheral devices, though, summing places a much higher burden on the system integrator to figure out how much compute is needed to achieve a specific use case.
Are we still thinking about exposing an interface to device drivers?
With sched-DVFS we are able to select the CPU's OPP based on real (expected) task demand. Thus, I expect constraints from device drivers to be useful only in these use-cases:
a) the tasks activated by the driver are not "big enough" to trigger an OPP switch, but we still want to race-to-idle their execution, e.g. a lightweight control thread for an external accelerator, which could benefit from running at a higher OPP to reduce overall processing latencies
b) we do not know which tasks require a performance boost, e.g. something like the "boost pulse" exposed by the Interactive governor, where the input subsystem is assumed to trigger latency-sensitive operations and thus the whole system deserves to be boosted
Are there other classes of use-cases?
If these are the only use-cases we are thinking about, I'm wondering whether we could try to cover them all via SchedTune and the boost value it exposes.
The use-case a) is already covered by the first implementation of SchedTune. If we know which task should be boosted, technically we could expose the SchedTune interface to drivers to allow them to boosts specific tasks.
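As a rough sketch of the kind of boost-to-capacity mapping involved (Python, illustrative of SchedTune's approach rather than a verbatim copy of the kernel code; SCHED_CAPACITY_SCALE is the kernel's 1024 fixed-point capacity unit):

```python
SCHED_CAPACITY_SCALE = 1024  # kernel fixed-point unit for full CPU capacity

def boosted_util(util, boost_pct):
    """Illustrative SchedTune-style boost: the boost percentage consumes
    a share of the remaining capacity headroom above the task's raw
    utilization, inflating the capacity (and hence OPP) requested."""
    margin = (SCHED_CAPACITY_SCALE - util) * boost_pct // 100
    return util + margin

print(boosted_util(256, 0))    # 256: no boost, OPP follows raw demand
print(boosted_util(256, 50))   # 640: a small task now requests a higher OPP
```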
It sounds like it could work for me.
My concern is with bandwidth-related use-cases. I've observed high-speed MMC controllers, WiFi chips, GPUs and other devices whose CPU tasks were "small", but whose performance was adversely affected when the CPU ran at a slower rate. This is true even on a relatively quiet system.
I see your point, but I'm wondering whether most of these use-cases would not be better implemented using DEADLINE instead of FIFO/RR. For those cases the problem will be solved once DL is properly integrated with sched-DVFS. For the remaining use-cases where we still want (or are "limited") to use FIFO/RR, I lean toward the idea that a race-to-idle strategy could just work.
Tasks running on the CPU can be viewed as latencies from the perspective of a peripheral/IO device. TI and many other vendors have implemented out-of-tree solutions to hold a CPU at a minimum OPP/frequency when these drivers are running (usually with terrible hacks, but in some cases nicely wrapped up in runtime_pm_{get,put} callbacks).
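The pattern can be mocked up as follows (Python, all names hypothetical and for illustration only; in a real driver the add/remove calls would sit in the runtime_pm_{get,put} callbacks and talk to a constraint/QoS framework rather than this toy class):

```python
# Hypothetical mock of "hold a minimum OPP while the device is active".
class FreqConstraints:
    def __init__(self, default_khz):
        self.default_khz = default_khz
        self.floors = []               # currently active min-frequency requests

    def add_floor(self, khz):          # driver's runtime-resume path
        self.floors.append(khz)

    def remove_floor(self, khz):       # driver's runtime-suspend path
        self.floors.remove(khz)

    def effective_min(self):
        # The governor must honor the highest active floor.
        return max(self.floors, default=self.default_khz)

cpu = FreqConstraints(default_khz=200_000)
cpu.add_floor(1_000_000)
print(cpu.effective_min())   # 1000000 kHz while the device is active
cpu.remove_floor(1_000_000)
print(cpu.effective_min())   # back to 200000 kHz once it goes idle
```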
I know very well these scenarios, in the past I experimented a lot with x86 machines running OpenCL workloads offloaded on GPGPUs.
Quite frequently the bottleneck was the control thread running on the CPU, to the point that by co-scheduling two apps on the same GPU you could get better performance (for both apps) than by running one app at a time.
This calls for some kind of coordination on frequency selection between workloads running on the CPU and on the accelerator. But if you consider the specific GPGPU use-case, many times the source of knowledge about the required bandwidth is not kernel-space but user-space. The OpenCL run-time, like many other run-times, could provide valuable input to the scheduler about these dependencies.
Does this fit with your model of how schedtune is supposed to work? I
I think it's worth a try... provided that, if the results are promising, we eventually commit to replacing the CPUFreq-specific API exposed to drivers with a more generic interface exposed to both kernel- and user-space. If instead we end up with two different APIs to achieve the same goal, that will just be confusing.
have not looked at that stuff at all... are start_the_work() and stop_the_work() critical-section functions exposed to drivers?
Mmm... I don't quite get that question. Which functions/critical-sections are you referring to?
Regards, Mike
Cheers Patrick
Regarding the second use-case, b), this is a feature we try to address using the global boost value. Right now the only consumer of that global boost value is the FAIR scheduling class. However, it should be quite easy to extend it once sched-DVFS is integrated into the other scheduling classes.
However, I might be the only one concerned with that use case right now.
I'd think there has to be a summing of bandwidth requests from scheduler-class clients. MikeT raised the concern that in such schemes you often end up with a bunch of extra overhead because everyone adds their own fudge factor (Mike, please correct me if I'm misstating your concern here).
To be clear, I raised that point because I've actually seen that in the past when TI implemented out-of-tree cpu frequency constraint systems. I'm not being hypothetical :-)
That's the point, SchedTune aims at becoming a sort of (hopefully) official solution to setup "frequency constraints".
We should be able to control this in the scheduler classes though and ensure headroom is only added after the requests are combined.
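The difference is easy to show with invented numbers: if every client pads its own request and the governor then pads the combined total again, the padding compounds, whereas applying headroom once after combining does not (illustrative Python sketch, integer percentages to keep the arithmetic exact):

```python
# Illustrative only: compare per-client fudge factors that stack with
# governor headroom, vs. headroom applied once after combining.
def stacked_headroom(requests, client_pct=20, governor_pct=20):
    # Anti-pattern: each client pads its request, then the governor pads again.
    padded = [r + r * client_pct // 100 for r in requests]
    total = sum(padded)
    return total + total * governor_pct // 100

def single_headroom(requests, headroom_pct=20):
    # Raw requests are combined first; headroom is applied exactly once.
    total = sum(requests)
    return total + total * headroom_pct // 100

reqs = [300, 500, 200]          # raw demand from three clients, in MHz
print(stacked_headroom(reqs))   # 1440 MHz requested (20% padding, twice)
print(single_headroom(reqs))    # 1200 MHz requested (20% padding, once)
```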
Sounds promising. Everyone else in this thread supports aggregation (i.e. summing) over taking the maximum value, and I definitely won't argue the point on the list unless it presents a real problem in testing.
Regards, Mike
Thoughts?
-- Michael Turquette CEO BayLibre - At the Heart of Embedded Linux http://baylibre.com/
Cheers Patrick
-- #include <best/regards.h>
Patrick Bellasi
+Saravana (QC DVFS)
Caveat, I haven't read through the whole thread but wanted to comment on a few points:
However, is the _main_ goal of sched-DVFS to be energy-efficient?
I'm not sure it's the _main_ goal, but DVFS behavior does contribute heavily towards overall system energy-efficiency and performance, so it should be an important goal.
IMHO one of the "main" goals of sched-DVFS is to help provide (as much as possible) deterministic behavior. We have the chance to refactor CPUFreq to better integrate with the scheduler, and thus we should try to exploit this opportunity to improve the overall determinism of the solution.
Deterministic behavior would be great. However, we should also be mindful of response latency, so I would describe it as deterministic with minimal latency. I'm also concerned about trying to closely match dvfs response to scheduler demand. This is a policy decision and not always what you want.
But again, is the goal of sched-DVFS to be energy-efficient? I think that responsibility would be better assigned to other players, i.e. the scheduling classes.
I think scheduling classes can play a part in managing scheduler demand but don't think they are sufficient as the only DVFS management policy.
Specifically, if you care about responsiveness and energy-efficiency, you should use DEADLINE instead of FIFO/RR. Whereas if you go for the latter classes, then you should be aware that you get a race-to-idle behavior, whatever that means from an energy/power standpoint.
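For reference, a DEADLINE task declares its bandwidth explicitly (runtime/deadline/period, set via sched_setattr()), which gives the kernel something concrete to base admission control and frequency selection on. A simplified Python sketch of the bandwidth math (the real admission test is per root domain and caps total bandwidth below 100%; those details are omitted here):

```python
# Simplified sketch of SCHED_DEADLINE bandwidth accounting.
def dl_bandwidth(runtime_ns, period_ns):
    # A DEADLINE task's declared CPU bandwidth is runtime/period.
    return runtime_ns / period_ns

def admits(tasks, cap=1.0):
    # Toy admission control: total declared bandwidth must fit the cap.
    return sum(dl_bandwidth(r, p) for r, p in tasks) <= cap

# e.g. 10 ms of runtime every 30 ms period -> 1/3 of a CPU
tasks = [(10_000_000, 30_000_000), (15_000_000, 30_000_000)]
print(admits(tasks))                               # True: 1/3 + 1/2 fits
print(admits(tasks + [(20_000_000, 30_000_000)]))  # False: would exceed the cap
```

Summing these declared bandwidths is exactly the kind of aggregation discussed earlier in the thread, which FIFO/RR cannot offer since those classes carry no bandwidth information.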
I'm hoping you mean that DEADLINE is preferred over FIFO/RR as a better way for EAS to manage energy-efficiency. That makes sense to me for RT, but there still needs to be a plan for EAS to handle FIFO/RR in a sane manner.
- Bryan
(reflowed Bryan's mail a bit)
Quoting Huntsman, Bryan (2015-10-14 13:01:45)
+Saravana (QC DVFS)
Caveat, I haven't read through the whole thread but wanted to comment on a few points:
However, is the _main_ goal of sched-DVFS to be energy-efficient?
I'm not sure it's the _main_ goal, but DVFS behavior does contribute heavily towards overall system energy-efficiency and performance, so it should be an important goal.
At the risk of instigating a religious conflict, I think that I do know the _main_ goal of sched-dvfs: it is to coordinate the decisions taken by the frequency selection policy, the idle state selection policy and the scheduling/task placement policy.
What those policies look like is orthogonal to the goal of coordinating three separate actors in the Linux kernel that have sometimes worked at odds with each other in the past.
Put another way, the goal is to make it better ;-)
Regards, Mike