Hello all,
I have an x86-based platform running Android, and I wanted to play
around with the EAS patches to see whether they would improve its
power numbers.
I had a few basic questions regarding this:
1) Can EAS be used with x86-based platforms? I see some arm/arm64
energy-model-related patches in the EAS integration tree
(git://linux-arm.org/linux-power.git), but no x86-specific changes.
Is that because no x86-specific changes are required, or just that
it is untested there?
2) Is EAS expected to show significant power savings on SMP systems,
or only on HMP systems?
3) Would any cpufreq/cpuidle integration be required specifically for
x86? If so, would I need to base it on the ARM code, or is there
other reference code?
4) Are there other in-flight patches that need to be applied on top
of the patches in the EAS integration tree for best results?
If the EAS patches can indeed be used on x86, I would be interested
in integrating them and providing results on my platform. Please
advise.
Regards,
Darren
Hello,
I'm pleased to announce that we have pushed a very early version of
some of the key features we intend to make available as EAS 1.2 this
year to Google's msm repository
( https://android.googlesource.com/kernel/msm.git/ ) as
android-msm-marlin-3.18-nougat-mr1-eas-experimental.
EAS 1.2 is intended to be the next iteration of EAS for AOSP,
including improvements to the wakeup path to better support
big.LITTLE, trialling other upstream scheduler enhancements such as
schedutil, and adding some important load/util-tracking enhancements
to PELT.
Although EAS 1.2 will be primarily focused on a 4.4-based kernel, we
are making this experimental branch available on the 3.18-based Pixel
kernel (marlin_defconfig) so that we have a readily available real
platform with an optimised userspace for experimentation.
There are some differences in the scheduler task wake-up path between
this release and that shipping in the Pixel kernel which should be
taken into account when using this kernel.
The most visible change in the wake-up path is the removal of the
is_big_little sysctl. Wake-up now uses a single cpu selection
algorithm (the one previously used for !is_big_little), modified to
remove the assumption that the highest-capacity cpus have the highest
logical cpu numbers. Max-capacity cpus can now be selected
independently of cpu topology for tasks belonging to a schedtune
group with boost applied, irrespective of cpu numbering. This changes
the iteration order of cpus when looking for a place to run these
tasks from [3,2],[1,0] to [2,3],[0,1], which affects runtime
configuration. Leaving the configuration unchanged is likely to have
only a small impact on lightly-loaded systems, where there will
usually be two idle high-capacity cpus, but the cpuset configuration
should nevertheless be matched to the new selection ordering to
restore the expectations used when tuning.
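To check which logical cpus are the high-capacity ones on a
particular device, the per-cpu capacity can be read from sysfs where
the kernel exposes it. A minimal sketch from a root shell, assuming
the cpu_capacity attribute is present (not every 3.18-based kernel
carries it):
# larger values indicate higher-capacity (big) cpus; the attribute
# is an assumption, so fall back gracefully where it is absent
for c in /sys/devices/system/cpu/cpu[0-9]*; do
    echo "$c: $(cat "$c"/cpu_capacity 2>/dev/null || echo 'not exposed')"
done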
In Pixel, cpusets are arranged such that one of the highest-capacity
cpus is available only to tasks belonging to the 'top-app' cpuset. In
combination with the cpu iteration order used for schedtune-boosted
tasks, we hope to find an empty cpu more often for these tasks to
wake on. As a result of the changed iteration order, the cpu reserved
for top-app should now be the lowest-numbered high-capacity cpu (#2
on Pixel). The impact of leaving this unchanged is likely to be small
for most light use cases. This is done in the init.rc:
The usual group setup for Pixel is in init.sailfish.rc - the part
which configures the cpusets for the tuning groups is normally as follows:
on property:sys.boot_completed=1
write /proc/sys/kernel/sched_boost 0
# update cpusets now that boot is complete and we want better load balancing
write /dev/cpuset/top-app/cpus 0-3
write /dev/cpuset/foreground/boost/cpus 0-2
write /dev/cpuset/foreground/cpus 0-2
write /dev/cpuset/background/cpus 0
write /dev/cpuset/system-background/cpus 0-2
As we wish to make cpu 2 the one which is only available for tasks in
the top-app group, we should exclude cpu 2 from the other groups.
on property:sys.boot_completed=1
write /proc/sys/kernel/sched_boost 0
# update cpusets now that boot is complete and we want better load balancing
write /dev/cpuset/top-app/cpus 0-3
write /dev/cpuset/foreground/boost/cpus 0-1,3
write /dev/cpuset/foreground/cpus 0-1,3
write /dev/cpuset/background/cpus 0
write /dev/cpuset/system-background/cpus 0-1,3
We normally do this at run time in a root shell rather than modifying
the init scripts.
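For example, applying the adjusted masks from a root shell over adb
looks like the following (a minimal sketch, assuming the standard
/dev/cpuset mount used by the Pixel init scripts):
adb root && adb shell
# reserve cpu 2 for top-app by removing it from the other groups
echo 0-3   > /dev/cpuset/top-app/cpus
echo 0-1,3 > /dev/cpuset/foreground/boost/cpus
echo 0-1,3 > /dev/cpuset/foreground/cpus
echo 0     > /dev/cpuset/background/cpus
echo 0-1,3 > /dev/cpuset/system-background/cpus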
The schedutil governor is present but not selected as the default
cpufreq governor.
It is important to note a slight difference in the meaning of the up
and down frequency-selection throttling between the 'sched' governor
(sched-dvfs) and 'schedutil': the 'sched' governor measures the time
since the last *frequency change*, whilst 'schedutil' measures the
time since the last *utilisation request*. This means the throttle
periods used for schedutil need to be shortened relative to
sched-dvfs to avoid staying at the maximum frequency for long periods
in UI-driven workloads.
We have been experimenting with up_rate_limit_usec set to 500 and
down_rate_limit_usec set to 2000 or 5000, which appears to give
results comparable to those of the 'sched' governor.
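Applied from a root shell, this looks roughly like the following (a
sketch using the tunable names above; the per-policy sysfs path is an
assumption and can differ between kernels):
# select schedutil and shorten its throttle windows to the values above
echo schedutil > /sys/devices/system/cpu/cpufreq/policy0/scaling_governor
echo 500  > /sys/devices/system/cpu/cpufreq/policy0/schedutil/up_rate_limit_usec
echo 2000 > /sys/devices/system/cpu/cpufreq/policy0/schedutil/down_rate_limit_usec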
The branch is based upon the mr1 kernel release and contains the
patches shown at the end of this mail.
They comprise six main areas of functionality.
* ec114ba...d2238c2 and 8646350...35ea67a
patches to reduce the delta between the msm kernel and the common kernel
* b055eba...d2e2970
introduce a backport of the upstream schedutil governor (but it is not the default
governor in marlin_defconfig)
* 7f7e79e...14531d4e
bring the energy-aware-scheduling calculations into line with our
mainline-focused implementation and backport capacity-based-scheduling to 3.18
* b75b728...407d2a7
integrate the current EAS 1.1 wakeup path with the mainline-focused
wakeup path and introduce a way to provide a common algorithm implementing
the alternate CPU search algorithm for schedtune boosted tasks
* f966249...1ad6d08
backport some important upstream CFS fixes to 3.18, fixing some critical
group-accounting issues which had a negative impact on the suitability of PELT
utilisation signals for Android
* 6ae4707
allow EAS to continue to calculate energy for systems which end up with
a single CPU in a sched domain
Best Regards,
Chris
Amit Pundir (3):
sched/walt: use do_div instead of division operator
ANDROID: sched/walt: fix build failure if FAIR_GROUP_SCHED=n
Revert "cgroup: Fix issues in allow_attach callback"
Brendan Jackman (2):
DEBUG: sched/fair: Fix missing sched_load_avg_cpu events
DEBUG: sched/fair: Fix sched_load_avg_cpu events for task_groups
Chris Redpath (17):
Revert "WIP: UTIL_EST: use estimated utilization on load balancing paths"
Revert "WIP: UTIL_EST: use estimated utilization on energy aware wakeup path"
Revert "WIP: UTIL_EST: sched/fair: use estimated utilization to drive CPUFreq"
Revert "WIP: UTIL_EST: switch to usage of tasks's estimated utilization"
sched: revert UTIL_EST usage from commit 6bf72ca7f1
Revert "WIP: UTIL_EST: sched/{core,fair}: add support to use estimated utilization"
Revert "WIP: UTIL_EST: sched/fair: add support for estimated utilization"
sched/fair: missing parts of 'optimize idle cpu selection for boosted tasks'
sched/fair: Fix uninitialised variable in idle_balance
Revert: UTIL_EST code from 'fix set_cfs_cpu_capacity when WALT is in use"
Unify whitespace layout with android-3.18
schedtune: Guarding against compile errors
sched/walt: Drop arch-specific timer access
Revert "DEBUG: UTIL_EST: sched: update tracepoint to report estimated CPU utilzation"
sched: This kernel expects sched_cfs_boost to be signed
schedutil: Fix linkage of schedutil and walt
config: Update marlin_defconfig to include schedutil governor
Dietmar Eggemann (20):
Revert "WIP: sched: Consider spare cpu capacity at task wake-up"
Partial Revert: "WIP: sched: Add cpu capacity awareness to wakeup balancing"
Experimental! arm64: Set SD_SHARE_CAP_STATES sched_domain flag on DIE level
Experimental!: sched/fair: Do not force want_affine eq. true if EAS is enabled
Experimental!: sched/fair: Decommission energy_aware_wake_cpu()
Fixup!: sched/fair.c: Set SchedTune specific struct energy_env.task
Experimental!: EAS: sched/fair: Re-integrate 'honor sync wakeups' into wakeup path
Experimental!: sched/fair: Code !is_big_little path into select_energy_cpu_brute()
Experimental!: sched: Remove sysctl_sched_is_big_little
sched/core: Remove remnants of commit fd5c98da1a42
Experimental!: sched/core: Add first cpu w/ max/min orig capacity to root domain
Experimental!: sched/fair: Change cpu iteration order in find_best_target()
sched/fair: Simplify backup_capacity handling in find_best_target()
Fixup!: sched/fair: Simplify target_util handling in find_best_target()
Fixup!: sched/fair: Simplify idle_idx handling in find_best_target()
Fixup!: sched/fair: Refactor min_util, new_util in find_best_target()
Fixup!: sched/fair: Simplify idle_idx handling in select_idle_sibling()
Fixup!: Return first idle cpu for prefer_idle task immediately
Fixup!: sched/fair: No need to 'and' current cpu w/ online mask in wakeup
sched: EAS & 'single cpu per cluster'/cpu hotplug interoperability
Dmitry Shmidt (1):
sched: Fix sysctl_sched_cfs_boost type to be int
Juri Lelli (3):
sched/cpufreq: make schedutil use WALT signal
trace/sched: add rq utilization signal for WALT
sched/walt: kill {min,max}_capacity
Ke Wang (1):
sched: tune: Fix lacking spinlock initialization
Morten Rasmussen (15):
sched/core: Fix power to capacity renaming in comment
sched/fair: Make the use of prev_cpu consistent in the wakeup path
sched/fair: Optimize find_idlest_cpu() when there is no choice
sched/core: Remove unnecessary NULL-pointer check
sched/core: Introduce SD_ASYM_CPUCAPACITY sched_domain topology flag
sched/core: Pass child domain into sd_init()
sched/core: Enable SD_BALANCE_WAKE for asymmetric capacity systems
sched/fair: Let asymmetric CPU configurations balance at wake-up
sched/fair: Compute task/cpu utilization at wake-up correctly
sched/fair: Consider spare capacity in find_idlest_group()
sched/fair: Add per-CPU min capacity to sched_group_capacity
sched/fair: Avoid pulling tasks from non-overloaded higher capacity groups
sched/fair: Fix incorrect comment for capacity_margin
Experimental!: sched/fair: Add energy_diff dead-zone margin
Experimental!: sched/fair: Energy-aware wake-up task placement
Patrick Bellasi (3):
FIXUP: sched/tune: update accouting before CPU capacity
FIX: sched/tune: move schedtune_nornalize_energy into fair.c
sched/tune: backport 'fix accounting for runnable tasks'
Peter Zijlstra (Intel) (3):
sched/fair: Apply more PELT fixes
sched/fair: Improve PELT stuff some more
sched/fair: Fix effective_load() to consistently use smoothed load
Petr Mladek (1):
kthread: allow to cancel kthread work
Srinath Sridharan (1):
eas/sched/fair: Fixing comments in find_best_target.
Steve Muckle (5):
sched/cpufreq: fix tunables for schedfreq governor
sched: backport cpufreq hooks from 4.9-rc4
sched: backport schedutil governor from 4.9-rc4
sched: cpufreq: use rt_avg as estimate of required RT CPU capacity
cpufreq: schedutil: add up/down frequency transition rate limits
Vincent Guittot (6):
sched: factorize attach entity
sched: factorize PELT update
sched: fix hierarchical order in rq->leaf_cfs_rq_list
sched: propagate load during synchronous attach/detach
sched: propagate asynchrous detach
sched: Multiple upstream load tracking changes
Viresh Kumar (1):
cpufreq: schedutil: move slow path from workqueue to SCHED_FIFO task
Yuyang Du (1):
sched/fair: Initiate a new task's util avg to a bounded value
kbuild test robot (2):
ANDROID: sched/tune: __pcpu_scope_cpu_boost_groups can be static
ANDROID: sched/tune: schedtune_allow_attach() can be static
arch/arm64/configs/marlin_defconfig | 2 +-
arch/arm64/kernel/topology.c | 7 +-
drivers/cpufreq/Kconfig | 27 +
drivers/cpufreq/Makefile | 2 +-
drivers/cpufreq/cpufreq.c | 32 +
drivers/cpufreq/cpufreq_governor_attr_set.c | 84 ++
include/linux/cgroup.h | 2 +-
include/linux/cpufreq.h | 49 ++
include/linux/kthread.h | 4 +
include/linux/sched.h | 20 +-
include/linux/sched/sysctl.h | 7 +-
include/trace/events/sched.h | 22 +-
init/Kconfig | 1 +
kernel/kthread.c | 96 +-
kernel/sched/Makefile | 2 +
kernel/sched/core.c | 84 +-
kernel/sched/cpufreq.c | 63 ++
kernel/sched/cpufreq_sched.c | 220 ++---
kernel/sched/cpufreq_schedutil.c | 762 ++++++++++++++++
kernel/sched/deadline.c | 3 +
kernel/sched/debug.c | 4 -
kernel/sched/fair.c | 1254 ++++++++++++++++++---------
kernel/sched/features.h | 5 -
kernel/sched/rt.c | 3 +
kernel/sched/sched.h | 84 +-
kernel/sched/tune.c | 5 +-
kernel/sched/tune.h | 3 +
kernel/sched/walt.c | 52 +-
kernel/sysctl.c | 7 -
29 files changed, 2261 insertions(+), 645 deletions(-)
create mode 100644 drivers/cpufreq/cpufreq_governor_attr_set.c
create mode 100644 kernel/sched/cpufreq.c
create mode 100644 kernel/sched/cpufreq_schedutil.c
--
1.9.1
Hi Guys,
All of this work was done by Steve before he left. I have made very
minor changes, merged a few patches, and rebased over 4.10-rc5.
More details can be found here:
https://projects.linaro.org/browse/PMWG-1018
With the Android UI and benchmarks, the latency of the cpufreq
response to certain scheduling events can be critical. Currently on
mainline tip, callbacks into schedutil are made from the scheduler
only if the target CPU of the event is the current CPU. This means
there are situations where a target CPU may not run schedutil for
some time.
One testcase that shows this behaviour: a task starts running on
CPU0, and then a new task is spawned on CPU0 by a task on CPU1. If
the system is configured such that new tasks receive maximum demand
initially, CPU0 should increase its frequency immediately; because of
the limitation above, it does not.
This patchset defers the callback into schedutil if the callback
would be remote (i.e. not for a CPU in the policy we are currently
running on). If the wakeup requires no preemption, a late callback
into schedutil is made, and schedutil is modified to deal correctly
with remote callbacks. If preemption does occur, the scheduler, and
schedutil, will run on the remote CPU anyway.
I will be doing further testing to get more performance numbers; for
now I just wanted some early feedback, so I am sending this to the
EAS list.
--
viresh
Steve Muckle (9):
sched: cpufreq: add cpu to update_util_data
irq_work: add irq_work_queue_on for !CONFIG_SMP
sched: cpufreq: extend irq work to support fast switches
sched: cpufreq: remove smp_processor_id() in remote paths
sched: create late cpufreq callback
sched: cpufreq: detect, process remote callbacks
cpufreq: governor: support scheduler cpufreq callbacks on remote CPUs
intel_pstate: ignore scheduler cpufreq callbacks on remote CPUs
sched: cpufreq: enable remote sched cpufreq callbacks
drivers/cpufreq/cpufreq_governor.c | 2 +-
drivers/cpufreq/intel_pstate.c | 3 ++
include/linux/irq_work.h | 7 ++++
include/linux/sched.h | 1 +
kernel/sched/core.c | 4 ++
kernel/sched/cpufreq.c | 1 +
kernel/sched/cpufreq_schedutil.c | 80 +++++++++++++++++++++++++++-----------
kernel/sched/fair.c | 6 ++-
kernel/sched/sched.h | 24 +++++++++++-
9 files changed, 102 insertions(+), 26 deletions(-)
--
2.7.1.410.g6faf27b
The current implementation of overutilization aborts energy-aware
scheduling if any cpu in the system is overutilized. This patch
introduces a per-sched-group overutilization flag instead of a single
system-wide flag. Load balancing is done at any sched domain in which
a sched group is overutilized. If energy-aware scheduling is enabled
and no sched group in a sched domain is overutilized, load balancing
is skipped for that sched domain and energy-aware scheduling
continues at that level.
The implementation is based on two points:
1. For every cpu in every sched domain, the first group is the group
that contains the cpu itself.
2. Sched groups are shared between cpus.
Thus, if a sched group is overutilized, the overutilized flag is set
at the first sched group of the parent sched domain. This ensures
load balancing at the overutilized sched domain level.
For example, consider a big.LITTLE system with two little cpus (CPU A
and CPU B) and two big cpus (CPU C and CPU D). In this system the
hierarchy will be as follows:
CPU A
SD level 1 - SG1 (CPU A), SG2 (CPU B)
SD level 2 - SG5 (CPU A, CPU B), SG6 (CPU C, CPU D)
RD
CPU B
SD level 1 - SG2 (CPU B), SG1 (CPU A)
SD level 2 - SG5 (CPU A, CPU B), SG6 (CPU C, CPU D)
RD
CPU C
SD level 1 - SG3 (CPU C), SG4 (CPU D)
SD level 2 - SG6 (CPU C, CPU D), SG5 (CPU A, CPU B)
RD
CPU D
SD level 1 - SG4 (CPU D), SG3 (CPU C)
SD level 2 - SG6 (CPU C, CPU D), SG5 (CPU A, CPU B)
RD
In the above system, if CPU A is overutilized, the overutilized flag
is set at SG5 (the first sched group of the parent sched domain).
Similarly, if CPU B is overutilized, the flag is set at SG5. During
load balancing at SD level 1, the overutilized flag is checked at the
first sched group of the parent sched domain (SG5). If there is no
parent sched domain, the flag is set/checked at the root domain. This
ensures that load balancing happens irrespective of which cpu in a
sched domain is overutilized.
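As a concrete illustration of the domain-level check added in
update_sd_lb_stats() below, and assuming the common default
capacity_margin of 1280 (roughly a 20% margin; the value is an
assumption for this tree):
total_capacity * 1024 < total_util * capacity_margin
2048 * 1024 < total_util * 1280
total_util > 1638 (= 80% of a 2048-capacity domain)
So a two-cpu domain of total capacity 2048 would propagate the
overutilized flag to its parent once its summed utilization exceeds
about 80% of its capacity.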
Signed-off-by: Thara Gopinath <thara.gopinath(a)linaro.org>
---
kernel/sched/fair.c | 108 ++++++++++++++++++++++++++++++++++++++++++---------
kernel/sched/sched.h | 1 +
2 files changed, 90 insertions(+), 19 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 01fa969..0c97e0a 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -4559,6 +4559,36 @@ static inline void hrtick_update(struct rq *rq)
static bool cpu_overutilized(int cpu);
+static bool
+is_sd_overutilized(struct sched_domain *sd, struct root_domain *rd)
+{
+ if (sd && sd->parent)
+ return sd->parent->groups->overutilized;
+
+ if (!rd)
+ return false;
+
+ return rd->overutilized;
+}
+
+static void
+set_sd_overutilized(struct sched_domain *sd, struct root_domain *rd)
+{
+ if (sd && sd->parent)
+ sd->parent->groups->overutilized = true;
+ else if (rd)
+ rd->overutilized = true;
+}
+
+static void
+clear_sd_overutilized(struct sched_domain *sd, struct root_domain *rd)
+{
+ if (sd && sd->parent)
+ sd->parent->groups->overutilized = false;
+ else if (rd)
+ rd->overutilized = false;
+}
+
/*
* The enqueue_task method is called before nr_running is
* increased. Here we update the fair scheduling stats and
@@ -4568,6 +4598,7 @@ static void
enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
{
struct cfs_rq *cfs_rq;
+ struct sched_domain *sd;
struct sched_entity *se = &p->se;
int task_new = !(flags & ENQUEUE_WAKEUP);
@@ -4603,9 +4634,12 @@ enqueue_task_fair(struct rq *rq, struct task_struct *p, int flags)
if (!se) {
add_nr_running(rq, 1);
- if (!task_new && !rq->rd->overutilized &&
- cpu_overutilized(rq->cpu))
- rq->rd->overutilized = true;
+ rcu_read_lock();
+ sd = rcu_dereference(rq->sd);
+ if (!task_new && !is_sd_overutilized(sd, rq->rd) &&
+ cpu_overutilized(rq->cpu))
+ set_sd_overutilized(sd, rq->rd);
+ rcu_read_unlock();
}
hrtick_update(rq);
}
@@ -5989,8 +6023,6 @@ static int select_energy_cpu_brute(struct task_struct *p, int prev_cpu)
unsigned long max_spare = 0;
struct sched_domain *sd;
- rcu_read_lock();
-
sd = rcu_dereference(per_cpu(sd_ea, prev_cpu));
if (!sd)
@@ -6028,7 +6060,6 @@ static int select_energy_cpu_brute(struct task_struct *p, int prev_cpu)
}
unlock:
- rcu_read_unlock();
if (energy_cpu == prev_cpu && !cpu_overutilized(prev_cpu))
return prev_cpu;
@@ -6063,10 +6094,16 @@ select_task_rq_fair(struct task_struct *p, int prev_cpu, int sd_flag, int wake_f
&& cpumask_test_cpu(cpu, tsk_cpus_allowed(p));
}
- if (energy_aware() && !(cpu_rq(prev_cpu)->rd->overutilized))
- return select_energy_cpu_brute(p, prev_cpu);
-
rcu_read_lock();
+ sd = rcu_dereference(cpu_rq(prev_cpu)->sd);
+ if (energy_aware() &&
+ !is_sd_overutilized(sd,
+ cpu_rq(cpu)->rd)) {
+ new_cpu = select_energy_cpu_brute(p, prev_cpu);
+ goto unlock;
+ }
+
+ sd = NULL;
for_each_domain(cpu, tmp) {
if (!(tmp->flags & SD_LOAD_BALANCE))
break;
@@ -6131,6 +6168,8 @@ select_task_rq_fair(struct task_struct *p, int prev_cpu, int sd_flag, int wake_f
}
/* while loop will break here if sd == NULL */
}
+
+unlock:
rcu_read_unlock();
return new_cpu;
@@ -7178,6 +7217,7 @@ struct sd_lb_stats {
struct sched_group *local; /* Local group in this sd */
unsigned long total_load; /* Total load of all groups in sd */
unsigned long total_capacity; /* Total capacity of all groups in sd */
+ unsigned long total_util; /* Total util of all groups in sd */
unsigned long avg_load; /* Average load across all groups in sd */
struct sg_lb_stats busiest_stat;/* Statistics of the busiest group */
@@ -7197,6 +7237,7 @@ static inline void init_sd_lb_stats(struct sd_lb_stats *sds)
.local = NULL,
.total_load = 0UL,
.total_capacity = 0UL,
+ .total_util = 0UL,
.busiest_stat = {
.avg_load = 0UL,
.sum_nr_running = 0,
@@ -7692,6 +7733,7 @@ next_group:
/* Now, start updating sd_lb_stats */
sds->total_load += sgs->group_load;
sds->total_capacity += sgs->group_capacity;
+ sds->total_util += sgs->group_util;
sg = sg->next;
} while (sg != env->sd->groups);
@@ -7701,17 +7743,26 @@ next_group:
env->src_grp_nr_running = sds->busiest_stat.sum_nr_running;
+ /* Setting overutilized flag might not be necessary here
+ * Revisit
+ */
if (!lb_sd_parent(env->sd)) {
/* update overload indicator if we are at root domain */
if (env->dst_rq->rd->overload != overload)
env->dst_rq->rd->overload = overload;
+ }
- /* Update over-utilization (tipping point, U >= 0) indicator */
- if (env->dst_rq->rd->overutilized != overutilized)
- env->dst_rq->rd->overutilized = overutilized;
- } else {
- if (!env->dst_rq->rd->overutilized && overutilized)
- env->dst_rq->rd->overutilized = true;
+ if (overutilized)
+ set_sd_overutilized(env->sd, env->dst_rq->rd);
+
+ /* If the domain util is greater that domain capacity, load balancing
+ * needs to be done at the next sched domain level as well
+ */
+ if (sds->total_capacity * 1024 < sds->total_util * capacity_margin) {
+ /* If already at the highest domain nothing can be done */
+ if (env->sd->parent)
+ set_sd_overutilized(env->sd->parent,
+ env->dst_rq->rd);
}
}
@@ -7932,8 +7983,11 @@ static struct sched_group *find_busiest_group(struct lb_env *env)
*/
update_sd_lb_stats(env, &sds);
- if (energy_aware() && !env->dst_rq->rd->overutilized)
- goto out_balanced;
+ /* Is this check really required here?? Revisit */
+ if (energy_aware()) {
+ if (!is_sd_overutilized(env->sd, env->dst_rq->rd))
+ goto out_balanced;
+ }
local = &sds.local_stat;
busiest = &sds.busiest_stat;
@@ -8000,6 +8054,12 @@ static struct sched_group *find_busiest_group(struct lb_env *env)
force_balance:
/* Looks like there is an imbalance. Compute it */
calculate_imbalance(env, &sds);
+
+ /* Is this the correct place to clear this flag? Should access
+ * to flag be locked? Revisit.
+ */
+ clear_sd_overutilized(env->sd, env->dst_rq->rd);
+
return sds.busiest;
out_balanced:
@@ -8790,6 +8850,11 @@ static void rebalance_domains(struct rq *rq, enum cpu_idle_type idle)
rcu_read_lock();
for_each_domain(cpu, sd) {
+ if (energy_aware()) {
+ if (!is_sd_overutilized(sd, rq->rd))
+ continue;
+ }
+
/*
* Decay the newidle max times here because this is a regular
* visit to all the domains. Decay ~1% per second.
@@ -9083,6 +9148,7 @@ static void task_tick_fair(struct rq *rq, struct task_struct *curr, int queued)
{
struct cfs_rq *cfs_rq;
struct sched_entity *se = &curr->se;
+ struct sched_domain *sd;
for_each_sched_entity(se) {
cfs_rq = cfs_rq_of(se);
@@ -9092,8 +9158,12 @@ static void task_tick_fair(struct rq *rq, struct task_struct *curr, int queued)
if (static_branch_unlikely(&sched_numa_balancing))
task_tick_numa(rq, curr);
- if (!rq->rd->overutilized && cpu_overutilized(task_cpu(curr)))
- rq->rd->overutilized = true;
+ rcu_read_lock();
+ sd = rcu_dereference(rq->sd);
+ if (!is_sd_overutilized(sd, rq->rd) &&
+ cpu_overutilized(task_cpu(curr)))
+ set_sd_overutilized(sd, rq->rd);
+ rcu_read_unlock();
}
/*
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index f99391d..90c48ac 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -913,6 +913,7 @@ struct sched_group {
unsigned int group_weight;
struct sched_group_capacity *sgc;
const struct sched_group_energy const *sge;
+ bool overutilized;
/*
* The CPUs this group covers.
--
2.1.4
In the energy-aware path for a woken task, an energy difference is
calculated in order to select a power-saving CPU. In some corner
cases the task utilization is 0; this means the task has run for a
very short time and has not crossed 1024us (1ms). The energy
difference then comes out as 0 when the task utilization is 0, so an
unexpected target CPU is selected in the scenario below:
If the task previously ran on CPUA, and CPUA is a higher-capacity
CPU, then when calculating the energy difference between CPUA and
another, lower-capacity CPU (CPUB), we get energy_diff = 0. The task
therefore sticks on CPUA and misses the chance to migrate to CPUB.
If the energy difference is calculated between two CPUs of the same
capacity, the task will always stay on the previous CPU, so the
calculation is pointless.
This patch checks whether the task's util_avg is 0 and, if so,
directly returns the 'target_cpu' selected by the power-efficiency
loop.
Signed-off-by: Leo Yan <leo.yan(a)linaro.org>
---
kernel/sched/fair.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index d1d5dad..7b1f65b 100755
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5616,6 +5616,9 @@ static int energy_aware_wake_cpu(struct task_struct *p, int target, int sync)
}
}
+ if (unlikely(!task_util(p)))
+ return target_cpu;
+
if (target_cpu != task_cpu(p)) {
struct energy_env eenv = {
.util_delta = task_util(p),
--
2.7.4
Hi Leo,
[CC'ing eas-dev].
As discussed, here's a useful breakdown of middleware configuration
examples and resources shared with us. The publicly available
resources provide additional context about how hinting is coupled to
SchedTune.
cpusets
https://android.googlesource.com/device/google/marlin/+/nougat-dr1-release/…
write /dev/cpuset/top-app/cpus 0-3
write /dev/cpuset/foreground/boost/cpus 0-2
write /dev/cpuset/foreground/cpus 0-2
write /dev/cpuset/background/cpus 0
write /dev/cpuset/system-background/cpus 0-2
cpuctl
https://android.googlesource.com/device/google/marlin/+/nougat-dr1-release/…
write /dev/cpuctl/cpu.shares 1024
write /dev/cpuctl/cpu.rt_runtime_us 800000
write /dev/cpuctl/cpu.rt_period_us 1000000
mkdir /dev/cpuctl/bg_non_interactive
chown system system /dev/cpuctl/bg_non_interactive/tasks
chmod 0666 /dev/cpuctl/bg_non_interactive/tasks
# 5.0 %
write /dev/cpuctl/bg_non_interactive/cpu.shares 52
write /dev/cpuctl/bg_non_interactive/cpu.rt_runtime_us 700000
write /dev/cpuctl/bg_non_interactive/cpu.rt_period_us 1000000
SchedTune
https://android.googlesource.com/device/google/marlin/+/nougat-dr1-release/…
write /dev/stune/foreground/schedtune.prefer_idle 1
write /dev/stune/top-app/schedtune.boost 10
write /dev/stune/top-app/schedtune.prefer_idle 1
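A quick way to confirm what the middleware actually applied is to
read the values back at run time from a root shell (a small sketch
over the paths quoted above):
# print each group's current settings, prefixed with the source file
grep . /dev/stune/*/schedtune.boost /dev/stune/*/schedtune.prefer_idle
grep . /dev/cpuset/*/cpus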
PowerHAL
https://android.googlesource.com/device/google/marlin/+/nougat-dr1-release/…
Robin