Hi Leo & Steve,
On 12/23/2015 03:14 AM, Leo Yan wrote:
Hi Steve,
Thanks for review.
On Tue, Dec 22, 2015 at 04:32:11PM -0800, Steve Muckle wrote:
On 12/09/2015 06:53 PM, Leo Yan wrote:
[...]
static int energy_aware_wake_cpu(struct task_struct *p, int target)
{
	struct sched_domain *sd;
	struct sched_group *sg, *sg_target;
	int target_cpu;
	struct cpumask target_cpus;

	sd = rcu_dereference(per_cpu(sd_ea, task_cpu(p)));
	if (!sd)
		return target;

	sg = sd->groups;
	sg_target = sg;

	cpumask_clear(&target_cpus);
	do {
		find_best_cpu_in_sg(&target_cpus, sg, p);
	} while (sg = sg->next, sg != sd->groups);
here we would break with the scalability model in cfs, where we first find the most suitable (busiest, idlest, most energy-efficient, ...) sg and then look within that group for the most suitable cpu. This is definitely a problem if EAS is supposed to be applicable out of the box to systems other than 2/2 or 2/4 (cluster/cpus).
One EAS design goal was that for various platforms (topology-wise: big.LITTLE vs. SMP, different # of sg's, different # of sd levels, different strategies in power and frequency domain layouts), all the differences are expressed through different values in the appropriate energy model. It is clearly not possible to introduce special EAS code paths for all the various topologies we want to support. We could introduce special behavior based on sd topology flags (existing or new ones), though, but not entirely new, platform-specific policies.
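To illustrate the conventional two-stage pattern (first pick the most suitable sg, then scan only that group's cpus), here is a simplified, self-contained userspace sketch. It is not kernel code: the structs, the energy_cost field and the helper names are all hypothetical stand-ins for what the energy model and find_idlest_group()/find_idlest_cpu()-style helpers would provide.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for a sched_group: a circular list of
 * groups, each with a per-group cost taken from some energy model.
 */
struct group {
	int cpus[4];
	int nr_cpus;
	int energy_cost;	/* stand-in for the energy model's verdict */
	struct group *next;
};

/* Stage 1: walk the group list once and remember the cheapest group. */
static struct group *find_best_group(struct group *head)
{
	struct group *g = head, *best = head;

	do {
		if (g->energy_cost < best->energy_cost)
			best = g;
		g = g->next;
	} while (g != head);

	return best;
}

/* Stage 2: scan the cpus of the chosen group only. */
static int find_best_cpu_in_group(struct group *g, const int *cpu_cost)
{
	int i, best = g->cpus[0];

	for (i = 1; i < g->nr_cpus; i++)
		if (cpu_cost[g->cpus[i]] < cpu_cost[best])
			best = g->cpus[i];

	return best;
}
```

The point of the two-stage split is that the per-cpu scan is bounded by the size of one group, so the cost scales with the topology instead of the total cpu count.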
if (cpumask_empty(&target_cpus))
I think you could just return task_cpu(p) here rather than setting the mask and continuing into find_power_efficient_cpu?
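Sketching Steve's suggestion as self-contained userspace code (all stubs hypothetical, not the real kernel cpumask/task API): when no candidate cpu was collected, fall back to task_cpu(p) instead of entering find_power_efficient_cpu() with an empty mask.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 8

/* Hypothetical userspace stand-ins for the kernel types/helpers. */
struct cpumask { unsigned long bits; };
struct task_struct { int cpu; };

static bool cpumask_empty(const struct cpumask *m) { return m->bits == 0; }
static int task_cpu(const struct task_struct *p) { return p->cpu; }

/* Stand-in for the real search over a non-empty candidate mask. */
static int find_power_efficient_cpu(const struct cpumask *m)
{
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (m->bits & (1UL << cpu))
			return cpu;	/* pick the first candidate */
	return -1;
}

static int wake_cpu(struct task_struct *p, const struct cpumask *target_cpus)
{
	/* The suggested early return: empty mask -> stay on task_cpu(p). */
	if (cpumask_empty(target_cpus))
		return task_cpu(p);

	return find_power_efficient_cpu(target_cpus);
}
```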
Good point. I will fix it.
Just to try to sync up ... these patches are against the issue related to patch 32/46 of RFCv5 on Hikey (SMP), where you guys saw that the low-intensity rt-app tests didn't show enough packing. The initial behavior on Hikey was that task_cpu(p) determined which cluster would be chosen to find the target cpu. But then there was also the argument that too much packing could lead to a higher OPP and more energy consumption (especially on a platform with a system-wide frequency domain, like Hikey). So maybe the initial approach in (vanilla) RFCv5 wasn't so bad after all?
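To make the "packing can raise the OPP" argument concrete, here is a toy calculation. All numbers are invented, the power model is the usual simplification P ~ busy_fraction * f * V^2, and idle-state savings are deliberately ignored, so this is only a sketch of the trade-off, not a statement about Hikey's real numbers.

```c
#include <assert.h>

/* Hypothetical OPPs: (frequency in GHz, voltage in V). */
struct opp { double freq_ghz; double volt; };

static const struct opp opp_lo = { 0.5, 0.9 };
static const struct opp opp_hi = { 1.0, 1.1 };

/* Dynamic energy per unit time for one cpu: busy * f * V^2. */
static double cpu_energy(double busy, const struct opp *o)
{
	return busy * o->freq_ghz * o->volt * o->volt;
}

/* Two tasks, each needing 60% of a cpu at the low OPP. */
static double energy_spread(void)
{
	/* spread: two cpus, each 60% busy, both stay at opp_lo */
	return 2.0 * cpu_energy(0.6, &opp_lo);
}

static double energy_packed(void)
{
	/*
	 * packed: 120% of low-OPP capacity doesn't fit on one cpu,
	 * so it runs at 2x frequency -> 60% busy at opp_hi; on a
	 * system-wide frequency domain every cpu pays the higher V.
	 */
	return cpu_energy(0.6, &opp_hi);
}
```

With these invented numbers, spreading costs ~0.486 units and packing ~0.726, i.e. packing loses once it forces the higher OPP, which is the concern about too much packing above; idle-state savings (not modeled here) pull in the other direction.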
Another open issue was against 22/46 of RFCv5, the one with the missing sg spanning the entire frequency domain on Hikey. This one is related to '[Eas-dev] [PATCH 1/4] sched/fair: EASv5: Fix CPU shared capacity issue'. Currently we have two proposals (struct cpumask sg_cap on the stack, or '[PATCH] sched: EAS & cpu hotplug interoperability').
IMHO, we should address both in RFCv6 early next year.
I also saw your suggestion to use the latest kernel to verify this patch; Dietmar also suggested in another email that I sync with EAS RFC 5.2. This patch depends heavily on the CPUs' utilization to select the target CPU, so I will first rebase on EAS RFC 5.2 and then continue profiling.
This is becoming very important, because everybody who has already played with this code on multiple systems knows how fragile the whole thing is.
E.g., Patrick is currently trying to make sched_freq more responsive, so he is proposing to change the way we use the per-cpu utilization signal (so far only shared ARM-internally). This is one of those cases where a change could have a negative effect on other EAS functionality on a specific platform.
That's why we should always use the latest code and make sure that we run at least the EAS tests in schedtest on big.LITTLE and SMP platforms, even when we work on related functionality like sched_freq or sched_tune. I'm planning to have Hikey integrated as an SMP platform into ARM's schedtest environment before RFCv6 hits LKML.
--
Dietmar