On Tue, May 29, 2018 at 01:16:17PM +0100, Chris Redpath wrote:
Hi,
Joel was looking at something else in android-4.14 wakeup path and he noticed that we have a difference in behavior for prefer_idle tasks when we're using find_best_target, so he asked if we could discuss that here.
Behavior in android-4.9 and earlier
When we find an idle CPU for a task which has the prefer_idle attribute, we immediately return that CPU from the energy-based wakeup CPU selection. This happens in slightly different places over the EAS and android versions but it is always true that we take the recommended idle CPU for tasks in this class.
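To make the older behavior concrete, here is a minimal toy sketch in plain C. It is not the kernel code: struct cpu_state and prefer_idle_shortcut() are hypothetical names invented for illustration, and "first allowed idle CPU" is a simplification of whatever idle CPU the real selection recommends.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified per-CPU state; not a kernel structure. */
struct cpu_state {
	bool idle;    /* CPU currently has nothing to run */
	bool allowed; /* task affinity permits this CPU */
};

/*
 * Toy model of the android-4.9 rule: a prefer_idle task takes an idle
 * CPU as soon as one is found, with no energy comparison afterwards.
 * Returns -1 when the shortcut does not apply and the caller would
 * continue with the energy-based selection.
 */
int prefer_idle_shortcut(const struct cpu_state *cpus, int nr_cpus,
			 bool prefer_idle)
{
	assert(cpus != NULL && nr_cpus > 0);

	if (!prefer_idle)
		return -1;

	for (int cpu = 0; cpu < nr_cpus; cpu++) {
		if (cpus[cpu].allowed && cpus[cpu].idle)
			return cpu; /* returned immediately */
	}
	return -1;
}
```

The point of the sketch is only the early return: once an idle CPU is found for a prefer_idle task, nothing else gets a say.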
How that changed in android-4.14
Thanks Chris and ARM team for the changes, the strf path looks much more mainline friendly now. :)
android-4.14 has two different wakeup paths, selected with a sched_feature FIND_BEST_TARGET. This defaults to true with the intent of preserving the previous behavior. The two paths differ, so I'll describe them separately below.
The two paths however share some common code - the way we integrated EAS with the regular wakeup code is different in android-4.14.
There were two reasons for doing this.
- minimize the differences in select_task_rq_fair wrt mainline code
- make better use of the per-sd overutilization flags
Since we have per-sd overutilised flags, we attempt to perform an EAS wakeup at the highest non-overutilised sched_domain - meaning that we can still perform an energy-aware wakeup for small tasks inside a non-overutilized group of small CPUs while potentially other groups of CPUs are overutilized.
The decision about attempting to use energy awareness is taken in the wake_energy function at the top of strf - all the cases where we can't use energy-awareness are ruled out, and the decision about using find_idlest_cpu/EAS for prefer_idle tasks is also made here.
If we are using energy aware wakeups, then we will find the highest non-overutilised SD to wake in.
In all cases where we do an energy aware wakeup but don't find any suitable candidate CPUs we will go on to use find_idlest_cpu.
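The domain selection and fallback described above can be sketched roughly as follows. This is a toy model under stated assumptions, not the android-4.14 source: struct toy_domain, the enum values and both function names are hypothetical, domains are modelled as a flat array ordered from lowest to highest level, and treating "no non-overutilised domain" as "use the regular wakeup code" is my assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a sched_domain with its per-sd flag. */
struct toy_domain {
	bool overutilized;
};

enum toy_wake_path {
	TOY_WAKE_ENERGY_AWARE,     /* energy-aware wakeup found a CPU */
	TOY_WAKE_FIND_IDLEST_CPU,  /* energy-aware wakeup found nothing */
	TOY_WAKE_REGULAR,          /* energy awareness not attempted */
};

/* Index of the highest level whose flag is clear, or -1 if none. */
int highest_non_overutilized(const struct toy_domain *sd, int nr_levels)
{
	int found = -1;

	assert(sd != NULL && nr_levels > 0);
	for (int lvl = 0; lvl < nr_levels; lvl++) {
		if (!sd[lvl].overutilized)
			found = lvl;
	}
	return found;
}

/*
 * candidate_cpu is the result of the energy-aware search inside the
 * chosen domain, or -1 when no suitable candidate was found.
 */
enum toy_wake_path choose_wake_path(const struct toy_domain *sd,
				    int nr_levels, bool want_energy,
				    int candidate_cpu)
{
	if (!want_energy || highest_non_overutilized(sd, nr_levels) < 0)
		return TOY_WAKE_REGULAR;
	if (candidate_cpu < 0)
		return TOY_WAKE_FIND_IDLEST_CPU;
	return TOY_WAKE_ENERGY_AWARE;
}
```

With levels {clear, clear, overutilized}, the sketch picks level 1, so a small task can still get an energy-aware wakeup even though the top-level domain is overutilized.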
One thing I wanted to mention is that there are some cases where, even if want_affine is set to 0 because want_energy = 1, we can still enter the select_idle_sibling path instead of doing an energy-aware wakeup. I discovered this when I was playing with cpusets. If sched_load_balance is set to 0 in the root cpuset, then it's possible that the main for_each_domain loop is effectively skipped, because all the domains would be detached from the rq. To trigger this, you could just do:

  mkdir /cpuset
  mount -t cpuset none /cpuset
  echo 0 > sched_load_balance
So in other words, I believe these cases also shouldn't end up turning off find_energy_efficient_cpu. Does that make sense or did I miss something?
android-4.14 sched_feat(FIND_BEST_TARGET) true wakeup path
When the FIND_BEST_TARGET sched feature is on (the default), we call find_best_target to populate the energy_env structure. This takes note of the prefer_idle flag and the task boost to change which task placement strategy will be used - the algorithm is the same as in previous versions of android.
However in android-4.14 (unintentionally) the prefer_idle task placement is not immediately acted upon - when a prefer_idle task is placed, we will select the first idle CPU we see *but* this will become the target CPU and we will still perform an energy diff and select between prev/target based upon energy requirement.
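The difference between the two behaviors can be shown with a small toy sketch. Everything here is hypothetical and only for illustration: pick_wakeup_cpu() is not a kernel function, and the energy diff is reduced to comparing one made-up cost number per CPU, which is far simpler than the real energy_env calculation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Toy model of the final CPU choice for a wakeup.
 *
 * energy_cost[cpu] is an invented per-CPU cost of running the task
 * there. With use_energy_diff == false a prefer_idle task takes the
 * idle CPU immediately (the older behavior). With use_energy_diff ==
 * true the idle CPU is only the target, and prev vs target is decided
 * by comparing costs (the android-4.14 behavior described above).
 */
int pick_wakeup_cpu(int prev_cpu, int idle_cpu, const int *energy_cost,
		    bool prefer_idle, bool use_energy_diff)
{
	assert(energy_cost != NULL);
	assert(prev_cpu >= 0 && idle_cpu >= 0);

	if (prefer_idle && !use_energy_diff)
		return idle_cpu;

	/* Move to the target only if it is strictly cheaper than prev. */
	return energy_cost[idle_cpu] < energy_cost[prev_cpu] ?
		idle_cpu : prev_cpu;
}
```

For example, with costs {10, 30}, prev_cpu 0 and idle CPU 1, the older rule returns CPU 1 while the energy-diff rule keeps the task on CPU 0, so a prefer_idle task can end up on a CPU that is not the idle one that was found.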
The open question is - now that we have realized that there is a different strategy in place, should we change it to be the same as the old version? I think that we should - it will be a simple change to use the idle cpu selected immediately without the energy diff.
I guess it's safer to keep the older version behavior since that's been well tested over the years and perhaps hasn't caused any issues that need fixing?
Also, if we are changing the behavior, I guess we could also make it such that if EAS_PREFER_IDLE is set and the task is prefer-idle, then we just run find_idlest_cpu unconditionally to find an idle CPU? That will also make sure that we are using the mainline find_idlest_cpu path for prefer-idle tasks. I think that would be a worthwhile change to make so we are even more aligned with the mainline slow path in hunting for an idle CPU. What do you think?
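Roughly, the idea is something like the toy sketch below. It is only an illustration of the proposal, not a patch: the enum and proposed_wake_path() are hypothetical names, and the three booleans stand in for the EAS_PREFER_IDLE feature, the task's prefer-idle attribute and the want_energy decision.

```c
#include <stdbool.h>

enum toy_wake_path {
	TOY_PATH_FIND_IDLEST_CPU, /* mainline-style slow path */
	TOY_PATH_ENERGY_AWARE,    /* energy-aware wakeup */
	TOY_PATH_REGULAR,         /* everything else */
};

/*
 * Sketch of the proposal: a prefer-idle task with the feature enabled
 * always goes to find_idlest_cpu, before energy awareness is even
 * considered.
 */
enum toy_wake_path proposed_wake_path(bool eas_prefer_idle_feat,
				      bool task_prefer_idle,
				      bool want_energy)
{
	if (eas_prefer_idle_feat && task_prefer_idle)
		return TOY_PATH_FIND_IDLEST_CPU;

	return want_energy ? TOY_PATH_ENERGY_AWARE : TOY_PATH_REGULAR;
}
```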
thanks,
- Joel