On 19 October 2015 at 15:02, Juri Lelli juri.lelli@arm.com wrote:
[+eas-dev]
On 19/10/15 09:06, Vincent Guittot wrote:
On 8 October 2015 at 12:28, Morten Rasmussen morten.rasmussen@arm.com wrote:
On Thu, Oct 08, 2015 at 10:54:15AM +0200, Vincent Guittot wrote:
On 8 October 2015 at 02:59, Steve Muckle steve.muckle@linaro.org wrote:
At Linaro Connect a couple weeks back there was some sentiment that taking the max of multiple capacity requests would be the only viable policy when cpufreq_sched is extended to support multiple sched classes. But I'm concerned that this is not workable - if CFS is requesting 1GHz worth of bandwidth on a 2GHz CPU, and DEADLINE is also requesting 1GHz of bandwidth, we would run the CPU at 1GHz and starve the CFS tasks indefinitely.
I'd think there has to be a summing of bandwidth requests from scheduler class clients. MikeT raised the concern that in such schemes you often end up with a bunch of extra overhead because everyone adds their own fudge factor (Mike, please correct me if I'm misstating your concern here). We should be able to control this in the scheduler classes though, and ensure headroom is only added after the requests are combined.
Thoughts?
I have always been in favor of summing rather than taking the maximum, for exactly the reason in the example you mention above. IIRC, we also said that the scheduler classes should not request more capacity than needed with regard to the schedtune knob position: if schedtune is set to maximum power saving, no margin should be added by any scheduler class other than what is needed to filter uncertainties in the CPU utilization computation. Regarding RT, it's a bit less straightforward, because we must ensure an unknown responsiveness constraint (unlike deadline), so we could easily end up requesting the maximum capacity just to be sure that this unknown constraint is met.
Agreed. I'm in favor of summing the requests, but with a minor twist. As Steve points out, and I have discussed with Juri as well, with three sched classes and using the max capacity request we would always request too little capacity if more than one class has tasks. Worst case we would request only a third of the required capacity: deadline would take it all and leave nothing for RT and CFS.
Summing the requests instead should be fine, but deadline might cause us to reserve too much capacity if we have short deadline tasks with a tight deadline. For example, a 2ms task (@max capacity) with a 4ms deadline and a 10ms period. In this case deadline would have to request 50% capacity (at least) to meet its deadline but it only uses 20% capacity (scale-invariant). Since deadline has higher priority than RT and CFS we can safely assume that they can use the remaining 30% without harming the deadline task. We can take this into account if we let deadline provide a utilization request (20%) and a minimum capacity request (50%). We would sum the utilization request with the utilization requests of RT and CFS. If sum < deadline_min_capacity, we would choose deadline_min_capacity instead of the sum to determine the capacity. What do you think? It might not be worth the trouble as there are plenty of other scenarios where we would request too much capacity for deadline tasks that can't be fixed.
I have some concern with a deadline min capacity field. If we take the example above, the request seems a bit too static; the deadline
We should be able to get away from having a special "min capacity" field for SCHED_DEADLINE, yes. Requests coming from SCHED_DEADLINE should always concern minimum requirements (to meet deadlines). However, I'll have to play a bit with all this before being 100% sure.
scheduler should request 50% only while the task is running. The 50% only makes sense if the task starts running at the beginning of the period, in order to use the complete 4ms time slot up to the deadline. But the request might even have to be increased to 100% if, for some reason such as another deadline task, an IRQ, or a preemption-disabled section, the task starts running in the last 2ms of the deadline time slot. Once the task has finished its running period, the request should go back to 0.
Capacity requests of SCHED_DEADLINE are supposed to be more stable, I think it's built in how the thing works. There is a particular instant of time, relative to each period, called "0-lag" point after which, if the task is not running, we are sure (by construction) that we can safely release a capacity request relative to a certain task. It is about theory behind SCHED_DEADLINE implementation, but we should be able to use this information to ask for the right capacity. As said above, I'll need more time to think this through and experiment with it.
I agree that reality is a bit more complex because we don't have an "immediate" change of the freq/capacity, so we must take into account the time needed to change the capacity of the CPU, but we should try to make the request as close as possible to the real requirement.
Using a min value just means that we are not able to evaluate the current capacity requirement of the deadline class and that we will just steal the capacity requested by the other classes, which is not a good solution IMHO.
Juri, what will be the granularity of the computation of the bandwidth in the patches you are going to send?
I'm not sure I get what you mean by granularity here. The patches will add a 0..100% bandwidth number, that we'll have to normalize to 0..1024, for SCHED_DEADLINE tasks currently active. Were you expecting something else?
My question was more about time granularity than range: is the bandwidth statically defined by the deadline parameters of the tasks, or is it updated once a task has consumed its runtime budget for the current period? But you have answered my question in your earlier comments.
Thanks Vincent
Thanks,
- Juri
As Vincent points out, RT is a bit tricky. AFAIK, it doesn't have any utilization tracking at all. I think we have to fix that somehow. Regarding responsiveness, RT doesn't provide any guarantees by design (in the way deadline does), so we shouldn't be violating any policies by slowing RT tasks down. The users might not be happy though, so we could favor performance for RT tasks to avoid breaking legacy software, and ask users who care about energy to migrate to deadline, where we actually know the performance constraints of the tasks.