Hi Leo,
On 12/18/2015 03:29 AM, Leo Yan wrote:
On Thu, Dec 17, 2015 at 07:18:56PM +0000, Dietmar Eggemann wrote:
On 10/11/15 09:58, Leo Yan wrote:
[...]
Your idea is better than my patch, because it maps more clearly onto the hardware topology. I also have a few questions below so I can understand your patch better :)
(1) Hold cluster energy model (EM) data on a single cluster system
For a single cluster system, there is only an 'MC' sched domain (at the cpu level), so there is no 'CPU' sched domain to hold the cluster's EM data. So should we add a sched domain 'SYS' to hold the cluster's EM data, right?
Yes, this is the plan for a single cluster EAS system (s/'CPU'/'DIE'). You only need the SYS sd with one sg spanning the whole system; no EM info is needed on this sd for hikey.
diff --git a/arch/arm64/kernel/topology.c b/arch/arm64/kernel/topology.c
index 5883f9e262ec..805627f61fc5 100644
--- a/arch/arm64/kernel/topology.c
+++ b/arch/arm64/kernel/topology.c
@@ -354,6 +354,7 @@ static struct sched_domain_topology_level arm64_topology[] = {
 	{ cpu_coregroup_mask, cpu_corepower_flags, cpu_core_energy, SD_INIT_NAME(MC) },
 #endif
 	{ cpu_cpu_mask, NULL, cpu_cluster_energy, SD_INIT_NAME(DIE) },
+	{ cpu_cpu_mask, NULL, NULL, SD_INIT_NAME(SYS) },
 	{ NULL, },
 };
In the meantime I played a little bit with this patch, and I think you have to add some code to cover your functionality of having a top sd with one sg spanning all cpus. The original patch only covers the case where individual cpus are off-lined on a big.LITTLE system, and the single cluster system where we want to attach EM data to the SYS sd.
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 487bb3c4627b..8b6cb48b8985 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5929,7 +5929,9 @@ sd_parent_degenerate(struct sched_domain *sd, struct sched_domain *parent)
 				SD_PREFER_SIBLING |
 				SD_SHARE_POWERDOMAIN |
 				SD_SHARE_CAP_STATES);
-		if (parent->groups->sge) {
+		if (parent->groups->sge ||
+		    (sd->groups->sge && parent->groups == parent->groups->next &&
+		     parent->span_weight == num_online_cpus())) {
 			parent->flags &= ~SD_LOAD_BALANCE;
 			return 0;
 		}
The additional if condition is necessary since you don't want to attach EM data (parent->groups->sge) onto your SYS sd.
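To make the intent of that condition easier to see, here is a small user-space sketch of the same boolean logic. The structs (toy_group, toy_domain) and the function name are hypothetical stand-ins for the kernel's sched_group/sched_domain, just enough to model the patched test; this is an illustration, not kernel code:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, stripped-down stand-ins for the kernel's sched_group
 * and sched_domain -- just enough to model the patched condition. */
struct toy_group {
	struct toy_group *next;	/* groups form a circular list */
	void *sge;		/* energy model data, NULL if absent */
};

struct toy_domain {
	struct toy_group *groups;
	unsigned int span_weight;	/* number of cpus this sd spans */
};

/*
 * Mirrors the patched test in sd_parent_degenerate(): clear
 * SD_LOAD_BALANCE on 'parent' when the parent itself carries EM data,
 * or when the child does and the parent is a single group spanning
 * every online cpu (the SYS sd case).
 */
static bool should_clear_load_balance(struct toy_domain *sd,
				      struct toy_domain *parent,
				      unsigned int online_cpus)
{
	if (parent->groups->sge)
		return true;
	return sd->groups->sge &&
	       parent->groups == parent->groups->next &&
	       parent->span_weight == online_cpus;
}
```

With a DIE child carrying EM data and a SYS parent that has one group spanning all online cpus, the function returns true, which is why SYS ends up with SD_LOAD_BALANCE cleared.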
JUNO looks like this with these two changes applied on top of the patch 'sched: EAS & cpu hotplug interoperability':
root@genericarmv8:~# cat /proc/schedstat (only for cpu0)
cpu0 0 0 4436 1854 3368 1751 1034319260 310590900 2555
domain0 39 1136 1136 0 0 0 0 0 1136 1 1 0 0 0 0 0 1 1436 1335 94 ...
domain1 3f 900 852 46 25311 2 0 1 770 0 0 0 0 0 0 0 0 1429 1235 ...
domain2 3f 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 ...
$ cat /proc/sys/kernel/sched_domain/cpu0/domain*/{name,flags}
MC
DIE
SYS
33583
4143
4142 <-- !SD_LOAD_BALANCE
(2) Offer this 'sg_shared_cap = sd->parent->groups' thing on a platform like Hikey
(3) Let cpu and cluster level EM data survive if cpus get off-lined
Just curious, what benefit do we get from this? When a CPU or a whole cluster has been off-lined, we should not have to care about its EM data anymore.
If the entire cluster is offline, then yes, we shouldn't take those cpus or the cluster contribution into consideration. But I was talking about situations where all but one cpu of a cluster are offline, or about the cpus in the other cluster of a two-cluster system.
From the patch header:
For Energy-Aware Scheduling (EAS) to work properly, even in the case that cpus are hot-plugged out, the energy model (EM) data on all energy-aware sched domains has to be present for all online cpus.
Mainline sd hierarchy setup code will remove sd's which are not useful for task scheduling e.g. in the following situations:
1. Only one cpu remains in one cluster of a two cluster system.
This remaining cpu only has DIE and no MC sd.
2. A complete cluster in a two-cluster system is hot-plugged out.
The cpus of the remaining cluster only have MC and no DIE sd.
Obviously, this sd level can't be part of the actual scheduling decisions, so the CFS code has to change slightly for this to work. I attached the appropriate patch. It's only lightly tested on TC2 and JUNO and might not apply cleanly on the current code line.
Would be nice if you can test it and give feedback if you consider it as something you're fine with. Thanks!
Do you suggest I test this based on RFCv5 or RFCv6?
Definitely RFCv5.2. EAS RFCv6 does not exist yet.
-- Dietmar
[...]