Hi Patrick,
On Tue, Oct 11, 2016 at 01:01:46PM +0100, Patrick Bellasi wrote:
On 10-Oct 16:35, Leo Yan wrote:
When a task running on a CPU has a utilization above 80% of that CPU's capacity, it is considered a misfit task and should be migrated to a higher-capacity CPU. However, migrating the running task can take more than 200ms, because of the long latency before load balance is triggered.
The latency is decided by two factors. The first is the load balance interval of the schedule domain: busy_factor * balance_interval. By default the cluster's schedule domain has busy_factor = 32 and balance_interval = 8ms, so the latency is 256ms. Setting busy_factor to 1 from the sysfs node of every schedule domain reduces this interval.
The second factor is triggering active load balance for the running task, which can be done by kicking an idle balance that pulls the running task to a big core. The function nohz_kick_needed() checks whether an idle CPU needs to be woken up for idle balance, but the previous code has no check for a misfit task on the rq, so idle balance is never triggered. As a result, the running task stays on the LITTLE core for a long time.
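As a rough worked example of the arithmetic above, the sketch below multiplies the two tunables quoted in the commit message. The helper name busy_balance_latency_ms() is hypothetical and is not a kernel function; it only models the busy_factor * balance_interval product.

```c
#include <assert.h>

/*
 * Hypothetical helper (not a kernel function): models the interval between
 * load balance attempts on a busy CPU as busy_factor * balance_interval.
 */
static inline unsigned int busy_balance_latency_ms(unsigned int busy_factor,
						   unsigned int balance_interval_ms)
{
	/* Sketch assumption: a factor of 0 is not meaningful here. */
	assert(busy_factor > 0);
	return busy_factor * balance_interval_ms;
}
```

With the defaults quoted above, busy_balance_latency_ms(32, 8) gives 256, and with busy_factor set to 1 it gives 8.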
In nohz_kick_needed we already check for cpu_overutilized, which in the "general case" (i.e. no boosting, no capping) should match misfit_task. I mean, when misfit_task is set the CPU is also always marked as overutilized, isn't it?
The checking code you mention is as below:
	if (rq->nr_running >= 2 &&
	    (!energy_aware() || cpu_overutilized(cpu)))
		return true;
So the condition requires at least two runnable tasks; if there is only one running task on the rq, it is hard to trigger nohz idle balance. This is what the patch tries to fix.
Maybe I can change the code as below:
	if (rq->nr_running >= 2 && !energy_aware())
		return true;

	if (energy_aware() && cpu_overutilized(cpu))
		return true;
This will give more chances to migrate tasks to a big core.
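To compare the variants discussed in this thread, here is a minimal userspace sketch. struct rq_model and the three kick_*() helpers are hypothetical stand-ins, energy_aware is passed in as a plain flag, and only the checks quoted in this thread are modelled; the rest of nohz_kick_needed() is left out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the few rq fields discussed in this thread. */
struct rq_model {
	unsigned int nr_running;	/* runnable tasks on this CPU */
	bool overutilized;		/* what cpu_overutilized(cpu) would return */
	bool misfit_task;		/* what rq->misfit_task would hold */
};

/* Existing check: always needs at least two runnable tasks. */
static inline bool kick_existing(const struct rq_model *rq, bool energy_aware)
{
	return rq->nr_running >= 2 && (!energy_aware || rq->overutilized);
}

/* Alternative above: with energy awareness, overutilization alone is enough. */
static inline bool kick_alternative(const struct rq_model *rq, bool energy_aware)
{
	if (rq->nr_running >= 2 && !energy_aware)
		return true;
	return energy_aware && rq->overutilized;
}

/* Posted patch: existing check plus the misfit task check. */
static inline bool kick_with_misfit(const struct rq_model *rq, bool energy_aware)
{
	return kick_existing(rq, energy_aware) ||
	       (energy_aware && rq->misfit_task);
}
```

For a single running task on an overutilized CPU with misfit_task set and energy awareness enabled, kick_existing() returns false, while both kick_alternative() and kick_with_misfit() return true.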
Actually, task_fits_max checks for the task fitting the _maximum_ capacity available in the system, which is tracked at root SD level. Thus, it normally checks if a task fits the 1024 (minus margin) capacity.
AFAIKS, the main difference between cpu_overutilized and misfit_task is that the latter (only) considers the "boosted" task utilization.
Thus, while a small boosted task does not mark a CPU as overutilized, the same task can still be marked as a misfitting one.
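A minimal sketch of that distinction, under loud assumptions: both checks are modelled with the 80% threshold from the commit message, a single task accounts for all of the CPU's utilization, and boosting is modelled as a plain additive margin. All names and numbers are hypothetical; which capacity each real check compares against, and how the boost is actually computed, are not modelled here.

```c
#include <stdbool.h>

/* 80% threshold taken from the commit message; assumed for both checks. */
#define FIT_THRESHOLD_PCT 80UL

/* Hypothetical: true when util exceeds 80% of capacity. */
static inline bool exceeds_threshold(unsigned long util, unsigned long capacity)
{
	return util * 100UL > capacity * FIT_THRESHOLD_PCT;
}

/* Models the CPU-level check: uses the plain (unboosted) utilization. */
static inline bool model_cpu_overutilized(unsigned long util,
					  unsigned long capacity)
{
	return exceeds_threshold(util, capacity);
}

/* Models the task-level check: uses utilization plus a boost margin. */
static inline bool model_misfit_task(unsigned long util,
				     unsigned long boost_margin,
				     unsigned long capacity)
{
	return exceeds_threshold(util + boost_margin, capacity);
}
```

With a hypothetical capacity of 400, a utilization of 250 and a boost margin of 100, model_cpu_overutilized() returns false while model_misfit_task() returns true. With a boost margin of 0 the two always agree, which matches the "general case" described earlier.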
Do you think that's the case captured by the following extra check condition?
This patch adds a misfit task check to nohz_kick_needed(), so that a misfit task can be quickly pulled to a higher-capacity CPU.
Tested this patch with Geekbench on the ARM Juno R2 board: for the multi-thread case the score improves from 2176 to 2281, a performance gain of ~4.8%.
Signed-off-by: Leo Yan <leo.yan@linaro.org>
---
 kernel/sched/fair.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index cf56241..dedb3e0 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -8939,6 +8939,10 @@ static inline bool nohz_kick_needed(struct rq *rq)
 	    (!energy_aware() || cpu_overutilized(cpu)))
 		return true;

+	/* Do idle load balance if there have misfit task */
+	if (energy_aware() && rq->misfit_task)
+		return true;
+
 	rcu_read_lock();
 	sd = rcu_dereference(per_cpu(sd_busy, cpu));
 	if (sd && !energy_aware()) {
--
1.9.1
--
#include <best/regards.h>

Patrick Bellasi