On Tue, Jul 05, 2016 at 10:25:31AM +0100, Morten Rasmussen wrote:
[...]
	/*
	 * Change to use the load metric if two conditions are met:
	 * - load is 20% higher than util, meaning the task has spent
	 *   extra time in the runnable state waiting to run, or the
	 *   task has a higher priority than nice 0; in that case
	 *   consider using the load signal rather than the util signal;
	 * - load reaches the CPU "over-utilized" criteria.
	 */
	if ((load * capacity_margin > capacity_of(cpu) * 1024) &&
	    (load * 1024 > util * capacity_margin))
		util = load;
	else {
		/*
		 * Avoid the ping-pong issue: make sure the task can run
		 * at least once on the higher-capacity CPU.
		 */
		delta = se->avg.last_update_time - se->avg.last_migrate_time;
		if (delta < sysctl_sched_latency &&
		    capacity_of(cpu) == cpu_rq(cpu)->rd->max_cpu_capacity.val)
			util = load;
	}
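For concreteness (assuming capacity_margin = 1280, i.e. ~25% headroom; the value and the numbers below are illustrative, not taken from the patch), the two conditions work out to:

	load * 1280 > capacity_of(cpu) * 1024  <=>  load > 0.8 * capacity_of(cpu)
	load * 1024 > util * 1280              <=>  util < 0.8 * load

i.e. util is replaced by load once the task's load crosses ~80% of the current CPU's capacity while its util lags its load by 20% or more.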
This extra boost for tasks that have recently migrated isn't mentioned in the cover letter but seems to be a significant part of the actual patch.
Yes.
IIUC, you boost utilization of tasks that have recently migrated. Could you explain a little more about why it is needed?
First, the patch wants to boost utilization if the task has stayed in the runnable state for long enough; second, after migrating the task from a little core to a big core, it ensures the task keeps running on the big core for a while (at least long enough to run there once). This is why utilization is replaced by the load signal in these two scenarios.
I don't see why a task that has recently migrated little->big should not get to run at least once on the big cpu if the system is not above the tipping point (over-utilized).
Sorry, I introduced confusion. I agree with you.
The task was enqueued on a big rq recently, and nobody should have pulled it away before it had a chance to run at least once. We don't do load_balance() when below the tipping point. AFAICT, your recently-migrated condition only takes effect after the first wake-up on a big cpu (i.e. second wake, third wake, and so forth until sched_latency time has passed since the migration). The first wake-up was when the migration happened.
So to me, it looks like a mechanism to make the task keep waking up on a big cpu until sched_latency time has passed after a migration. The first wake-up should already be covered.
If so, I still think we need to set some barrier to avoid the task being easily migrated back to the little core. But I'm not sure whether sched_latency is a reasonable value, especially since, as you said, the first-wake-up argument for it no longer holds.
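(For scale: sysctl_sched_latency defaults to 6ms and, with the default SCHED_TUNABLESCALING_LOG policy, is scaled to 6ms * (1 + ilog2(nr_cpus)), e.g. 6ms * (1 + 3) = 24ms on an 8-CPU system; so that would be roughly the size of the boost window being discussed here.)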
The task will appear bigger each time it migrates regardless of whether it has migrate little->big or big->little. Doesn't it mean that you are likely to send tasks that have recently migrated big->little back to big immediately because of the boost?
Yes. We also want to avoid the ping-pong issue: once we have boosted the utilization signal, the task can keep running on the big cluster for a while.
Actually this patch wants to achieve a similar effect to HMP's up_threshold and down_threshold: if a task's load is over up_threshold, the task can stay on a big core for a while, and it will not be migrated back to a little core until its load drops below down_threshold.
I think I get what you want to achieve, but isn't it more a kind of one-way bias rather than a hysteresis like HMP has? You only try to keep tasks on big cpus.
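For comparison, HMP-style hysteresis behaves roughly like the sketch below (illustrative pseudo-code only; on_little_cpu(), on_big_cpu(), migrate_up() and migrate_down() are made-up helpers, not the actual HMP implementation):

	/* Sketch of two-threshold hysteresis, not real HMP code */
	if (on_little_cpu(p) && p->se.avg.load_avg > up_threshold)
		migrate_up(p);		/* little -> big */
	else if (on_big_cpu(p) && p->se.avg.load_avg < down_threshold)
		migrate_down(p);	/* big -> little */

The patch above, by contrast, only biases the up direction.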
After the task is migrated to a big core, it's hard for this condition to become true: (load * capacity_margin > capacity_of(cpu) * 1024).
So without the sched_latency condition, we easily fall back to the original util signal rather than the load signal.
This is why I don't pay much attention to the big-to-little direction and hope to rely on the decayed util signal: after the task's load becomes smaller, it can finally migrate back to a little core.
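(To illustrate, with capacity_margin = 1280: on a big core with capacity 1024, load must exceed 1024 * 1024 / 1280 = ~819 for the first condition to hold, whereas on a little core with capacity ~430 it only needs to exceed ~344; the capacity values are assumed for illustration.)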
So especially in the scenario of a single thread with a big load that does _NOT_ push the system over the EAS tipping point, the task can stay in the little cluster with much less chance of migrating to a big core. In the same scenario under HMP, the task's load_avg only needs to cross up_threshold occasionally and the task then gets a chance to stay on a big core. So HMP can achieve a much better performance result than EAS here.
This sounds like a scenario where you want to boost utilization of the task to get it out of the tipping point grey zone to improve latency.
Yes, exactly. For example, the task has util_avg ~300 but the little core has a fairly high capacity (~600), so in the end the task has no chance to migrate to a big core.
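(Plugging those numbers into a __task_fits()-style check from the EAS patches, util * capacity_margin <= capacity * 1024, and assuming capacity_margin = 1280: 300 * 1280 = 384000 <= 600 * 1024 = 614400, so the little CPU still appears to fit the task and wake-up placement never considers a big CPU.)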
Thanks,
Leo Yan