Hi Lei,
Thanks for looking at these patches. Have you seen the additional
optimization patch set that was released to go on top of 13.06? It's
linked from the 13.06 release notes at
http://releases.linaro.org/13.06/android/vexpress. The additional
patches contained there will be part of the 13.07 release. Please
note, however, that we're releasing these patches
against a 3.10-based kernel release and we have not attempted to
port or test them on 3.11.
That said, I will attempt to answer your questions inline.
On 29/07/13 06:51, Lei Wen wrote:
Hi list,
I am trying to port the Linaro MP patches to other kernel branches,
and I have found a regression caused by a recent kernel change.
The regression is that when we test the Linaro 13.06 release with an
MP3 scenario, we find CA7 is always busy, while one core of CA15
occasionally ramps up to an average CPU usage of about 2% and the
other core is idle most of the time.
But when we switch to our backported branch, we find both cores in
CA15 become active, each with a usage ratio of around 2%.
With further checking, I found that one recently merged mainline patch causes this:
https://lkml.org/lkml/2013/6/27/152
This patch changes the initial load of a newly forked task to the
largest value, which causes HMP to decide to move all such tasks to
the big cluster.
Since cpu3 (the first CPU of CA15) now gets busy, cpu4 is brought up
to share the load as a consequence. This means ALL 5 CPUs are used in
the MP3 scenario, which should make the power result look worse than
before...
What is your opinion on this regression, especially on how HMP should
behave given the larger initial load introduced by the merged patch?
In general, the changes introduced in Alex Shi's patch set will
probably conflict a bit with big.LITTLE MP patches. Our patches and
his both use the same load tracking metrics and having the two patch
sets present will probably have some subtle interactions even though
on the surface they look relatively benign. We don't intend to work
on forward-porting to 3.11 right now since the Linaro LSK is going
to stay on 3.10 and we and Linaro are supporting the patch set on
that release. Even if you were to remove all the patches from
141965c7 to 2509940f (Alex's switch to use load-tracking) before
integrating big.LITTLE MP, I can't guarantee that there isn't some
other behavioural change elsewhere which would increase power
consumption or damage performance, since we haven't tested it.
Specifically about the initial task load profile, we agree with Alex
that tasks probably should not always start with zero load for
performance reasons and have done something similar in the patch
titled "HMP: Force new non-kernel tasks onto big CPUs until load
stabilises" in the 13.06 Optimization pack. The aim of that patch is
to provide application tasks with access to the maximum compute
capacity as soon as possible, and to keep all other tasks migrating
between big and little clusters only on the basis of need. It's a
kind of initial performance-boost behaviour for user application
threads; after a couple of sched_periods the load profile will
represent the task's history reasonably accurately and task placement
will happen according to need as usual.
It's interesting that the effect is long-lasting enough to use both
A15s - are there tasks being created all the time which are
exercising the big CPUs? Alex only adds a small amount of busy time
to the accounting, so after 10-20ms its impact should be very small.
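For reference, my recollection of the merged change is roughly the
sketch below (paraphrased from the commit linked above, so please
check it against your actual tree). A new task is credited with one
sched_slice of history, all of it counted as busy, so its very first
tracked load is at the maximum even though the absolute amount of
'busy' credit is only a slice's worth:

void init_task_runnable_average(struct task_struct *p)
{
        u32 slice;

        p->se.avg.decay_count = 0;
        /* one slice, converted from ns to the ~1us units of the PELT sums */
        slice = sched_slice(task_cfs_rq(p), &p->se) >> 10;
        p->se.avg.runnable_avg_sum = slice;
        p->se.avg.runnable_avg_period = slice;
        __update_task_entity_contrib(&p->se);
}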
From the behaviour you describe, I feel that something else is
happening to interfere with task placement which is made more
visible by the patch you point to. The only way for you to be sure
of the cause is to investigate the behaviour: which tasks are
resident on the big CPUs, for how long, and what their load profiles
look like.
For investigating those kinds of issues, I generally use Streamline
or Kernelshark.
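If you want a quick look from inside the kernel before reaching for
those tools, something like the following (purely illustrative, not
part of any released patch, and assuming p is task_of(se) as in
hmp_up_migration()) placed next to the up-migration decision will log
each decision to /sys/kernel/debug/tracing/trace:

        /* illustrative only: log which tasks HMP pulls up and their load */
        trace_printk("hmp up-migrate: comm=%s pid=%d load_avg_ratio=%lu\n",
                     p->comm, task_pid_nr(p),
                     (unsigned long)se->avg.load_avg_ratio);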
In the 13.06 Optimization pack patch I mentioned we give each task
an initial load profile, but we chose to make it fully loaded in the
beginning. However, just giving a task an initial load profile is
not the whole story. In the kernels we have tested, the CPU that a
task runs on in its first few slices is largely governed by the
location of the parent task and the overall system load (it can be
balanced elsewhere within its cluster, but only if unbalanced), so
energy consumption and performance over the task's first few
scheduling decisions can be unpredictable.
In order to achieve our desired behaviour, we added further code in
select_task_rq_fair to place new tasks on a big CPU. I did not
expect the power consumption to increase much, since tasks will only
be on a big CPU for a short time unless they have a heavy enough
load to justify it, but it turns out that new tasks are started much
more often than expected in these low-power media use cases, so we
also added code to prevent giving a startup-boost to kernel threads,
rt tasks and indeed any tasks forked from init directly (further
patches in the 13.06 optimization pack). However, the init fork
limit is likely only to be a good idea if you are using an Android
userspace where all the apps generally fork from Zygote.
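To make that concrete, the decision boils down to something like the
sketch below. The helper name, the exact length of the boost window
and the way the checks are expressed are my own illustration rather
than the code in the 13.06 patches, so treat it as a summary of the
idea, not the implementation:

/*
 * Sketch only: should a freshly forked task get the 'start on a big
 * CPU' boost?  Assumes it is called from the fork/wake-up-new path,
 * where current is the parent.
 */
static bool hmp_fork_boost_wanted(struct task_struct *p)
{
        /* kernel threads have no mm; RT tasks are placed by the RT class */
        if (!p->mm || rt_task(p))
                return false;

        /* skip tasks forked directly from init (pid 1); on Android the
         * interesting application threads come from Zygote instead */
        if (task_pid_nr(current) == 1)
                return false;

        /* boost only while the tracked history is still short -
         * roughly a couple of scheduling periods */
        return p->se.avg.runnable_avg_period <
                        2 * (sysctl_sched_latency >> 10);
}

In select_task_rq_fair(), a fork balance for a task passing this test
would then be sent to a suitably idle CPU in the faster HMP domain
instead of going through the usual find_idlest_group() path.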
Actually, I have one patch which forbids moving newly forked tasks to
the faster cluster.
Not sure whether it would cause any other side effects. Comments are
welcome. :)
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -840,10 +840,26 @@ void init_task_runnable_average(struct task_struct *p)
p->se.avg.runnable_avg_period = slice;
__update_task_entity_contrib(&p->se);
}
+
+static inline bool task_is_new_born(struct task_struct *p)
+{
+ u32 slice;
+
+ /* treat the task as new-born until it has built up 4x the initial slice */
+ slice = sched_slice(task_cfs_rq(p), &p->se) >> 8;
+
+ return p->se.avg.runnable_avg_period < slice;
+}
+
#else
void init_task_runnable_average(struct task_struct *p)
{
}
+
+static inline bool task_is_new_born(struct task_struct *p)
+{
+ return true;
+}
#endif
/*
@@ -6331,7 +6347,7 @@ static unsigned int hmp_up_migration(int cpu, struct sched_entity *se)
< hmp_next_up_threshold)
return 0;
- if (se->avg.load_avg_ratio > hmp_up_threshold) {
+ if (!task_is_new_born(p) && se->avg.load_avg_ratio > hmp_up_threshold) {
/* Target domain load < ~94% */
if (hmp_domain_min_load(hmp_faster_domain(cpu), NULL)
> NICE_0_LOAD-64)
Thanks,
Lei
This patch keeps the artificial initial load for all tasks but stops
it from triggering up-migration of newly created ones. Why not just
avoid applying the initial load in the first place if you don't want
it to have an impact, or do you want to retain the load-balance
behaviour without impacting migration?
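To spell out the first option: it would just mean going back to an
empty initial window (sketched below, not an actual patch), so both
the load balancer and HMP see the load build up from real history
only - at the cost of losing the load-balance benefit Alex was after:

void init_task_runnable_average(struct task_struct *p)
{
        p->se.avg.decay_count = 0;
        p->se.avg.runnable_avg_sum = 0;
        p->se.avg.runnable_avg_period = 0;
        p->se.avg.load_avg_contrib = 0;
}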
Good Luck and Best Regards,
Chris