o This patch series optimizes power. Power optimization has to
resolve two factors: the first is to find methods to save power and
avoid unnecessary task migrations to the big cores; on the other
hand, it must not degrade performance. So this patch series is based
on the performance optimization patch series [1] and carries out
further work for power saving, to achieve the target: optimize power
without performance degradation.
RFCv3 introduced power optimization related patches, but those
patches were not general enough. E.g., RFCv3 defined the criteria
for a small task as: task_util(p) < 1/4 * cpu_capacity(cpu); it is
very hard to apply this criteria across all SoCs. This patch series
tries to figure out a more general method.
o Below is background info for the power optimization:
For the first step of power optimization, we should make sure tasks
in a cluster can spread out; this has two benefits: one is that every
cluster can run at a lower frequency, the other is that spreading
tasks within a cluster exploits the CPU capacity as much as possible
and avoids any CPU becoming overutilized, which as a result avoids
migrating tasks to the big cores. This is done by patch 0001; a
sketch of the idea follows below.
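For illustration only, here is a minimal sketch of the packing idea;
the function name find_packing_cpu() is hypothetical, not the actual
patch 0001 code:

	/*
	 * Hypothetical sketch of the packing idea behind patch 0001:
	 * pick the allowed CPU with the lowest original capacity that
	 * still has enough spare capacity for the waking task, so
	 * small tasks are packed onto little CPUs before any big CPU
	 * is considered.
	 */
	static int find_packing_cpu(struct task_struct *p)
	{
		unsigned long best_cap = ULONG_MAX;
		int best_cpu = -1;
		int cpu;

		for_each_cpu_and(cpu, &p->cpus_allowed, cpu_online_mask) {
			unsigned long cap = capacity_orig_of(cpu);

			/* skip CPUs whose spare capacity cannot hold the task */
			if (cap - cpu_util(cpu) < task_util(p))
				continue;

			/* prefer the lowest capacity CPU that fits */
			if (cap < best_cap) {
				best_cap = cap;
				best_cpu = cpu;
			}
		}

		return best_cpu;
	}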
If there are big tasks that really need to migrate onto a big core,
we should ensure that the big tasks are migrated to the big core
first, rather than small tasks. So patch 0002 introduces an rb tree
to track the biggest task on each RQ, and patch 0003 uses the rb tree
to migrate the biggest tasks to a higher capacity CPU; a sketch of
the tracking is shown below.
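For illustration, a minimal sketch of how such per-RQ tracking could
look with the kernel rbtree API; the rq->util_root and p->util_node
fields and the helper names are hypothetical, not the actual patch
0002:

	/*
	 * Hypothetical sketch of per-RQ tracking with an rb tree keyed
	 * by task utilization; util_root/util_node are made up for
	 * this example.
	 */
	static void enqueue_task_util(struct rq *rq, struct task_struct *p)
	{
		struct rb_node **link = &rq->util_root.rb_node, *parent = NULL;
		unsigned long util = task_util(p);

		while (*link) {
			struct task_struct *entry;

			parent = *link;
			entry = rb_entry(parent, struct task_struct, util_node);
			if (util < task_util(entry))
				link = &parent->rb_left;
			else
				link = &parent->rb_right;
		}

		rb_link_node(&p->util_node, parent, link);
		rb_insert_color(&p->util_node, &rq->util_root);
	}

	/* The biggest task is simply the right-most node. */
	static struct task_struct *biggest_task(struct rq *rq)
	{
		struct rb_node *node = rb_last(&rq->util_root);

		return node ? rb_entry(node, struct task_struct, util_node) : NULL;
	}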
Patch 0004 has the biggest effect on power saving: it checks whether
a wakeup task can run on a low capacity CPU. If so, it forces the
energy aware scheduling path even when the system is over the tipping
point. The criteria for a wakeup task being able to run on a low
capacity CPU is: some low capacity CPU's spare bandwidth can meet the
waking task's requirement; this ensures that even if the task keeps
running on a low capacity CPU, performance is not sacrificed. A
sketch of this check follows below.
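A minimal sketch of such a fit check (hypothetical helper, assuming
the big CPUs sit at SCHED_CAPACITY_SCALE; not the actual patch 0004
code):

	/*
	 * Hypothetical sketch of the patch 0004 criteria: the waking
	 * task can stay in the energy aware path if some low capacity
	 * CPU still has enough spare bandwidth for it.
	 */
	static bool task_fits_low_capacity_cpu(struct task_struct *p)
	{
		int cpu;

		for_each_cpu_and(cpu, &p->cpus_allowed, cpu_online_mask) {
			unsigned long cap = capacity_orig_of(cpu);

			/* only consider CPUs below the maximum capacity */
			if (cap == SCHED_CAPACITY_SCALE)
				continue;

			/* spare bandwidth on this CPU can hold the task */
			if (cap - cpu_util(cpu) >= task_util(p))
				return true;
		}

		return false;
	}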
o Test result:
First, applied the patch series "EASv5.2+: Performance Optimization
And Fixes" and tested power and performance; then, on top of that
code base, also applied this power saving patch series. Finally,
compared the power and performance data between the two.
For the power comparison the test case is video playback (1080p);
below are the results on the Juno board:
Items            | LITTLE Nrg | big Nrg    | Total Nrg
-----------------+------------+------------+-----------
Perf opt         | 11.0520992 |  9.7118762 | 20.7639754
Perf + Power opt | 11.4157602 |  8.7319138 | 20.147674
Comparison       |     +3.29% |    -10.09% |     -2.97%
[1] https://lists.linaro.org/pipermail/eas-dev/2016-October/000610.html
Leo Yan (4):
sched/fair: select lowest capacity CPU with packing tasks
sched/fair: support to track biggest task util on rq
sched/fair: migrate highest utilization task to higher capacity CPU
sched/fair: check if wakeup task can run low capacity CPU
include/linux/sched.h | 1 +
kernel/sched/fair.c | 213 +++++++++++++++++++++++++++++++++++++++++++++-----
kernel/sched/sched.h | 4 +
3 files changed, 200 insertions(+), 18 deletions(-)
--
1.9.1
Hi Patrick,
This patch series mainly has two purposes.
The first purpose is to adjust the range for the capacity index so
that the capacity index and the energy index have a similar range.
This helps tasks fall into a more reasonable PE filter region. This
is done by patch 1.
The second purpose is to support negative boosting values in the PE
filter, so that schedTune's algorithm has integrity and supports both
positive and negative boosting values. As we know, if we set the
boost to a positive value, the PE filter cut line rotates to the
right side, giving more chance to the (PB) region and reducing the
chance for the (PC) region, so finally we get the filter region
below:
^
(O) | / (PB)
| /
| /
| / `-> cut
|/
-------------------------->
/|
/ |
/ |
/ |
/ |
(PC) | (SO)
On the other hand, if the boost is set to a negative value, the PE
filter cut line rotates to the left side, so we get the filter region
below. This is done by patches 0002~0006. (A sketch of such a filter
follows the diagram.)
^
(O) \ | (PB)
\ |
\ |
\ |
\|
-------------------------->
|\
| \
| \
| \
| \
(PC) | \ (SO)
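For illustration only, a minimal payoff-style sketch of a PE filter
that handles both boost signs; the linear gain mapping and the helper
name are assumptions for this example, not the actual schedTune
implementation:

	/*
	 * Hypothetical PE filter supporting positive and negative
	 * boost: the boost value selects a pair of gains that rotate
	 * the accept/reject cut line between the (PB) and (PC)
	 * regions. The linear gain mapping below is an assumption for
	 * illustration, not the actual schedTune gain table.
	 */
	static bool pe_filter_accept(long nrg_delta, long cap_delta, int boost)
	{
		/* boost in [-100, 100] selects the cut line rotation */
		long nrg_gain = 100 - boost;	/* weight of the energy delta   */
		long cap_gain = 100 + boost;	/* weight of the capacity delta */
		long payoff;

		/* always accept region (O), always reject region (SO) */
		if (nrg_delta <= 0 && cap_delta >= 0)
			return true;
		if (nrg_delta >= 0 && cap_delta <= 0)
			return false;

		/*
		 * In the mixed (PB)/(PC) regions, accept when the
		 * weighted capacity gain outweighs the weighted energy
		 * cost (non-negative payoff).
		 */
		payoff = cap_delta * cap_gain - nrg_delta * nrg_gain;
		return payoff >= 0;
	}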
Patch 0007 is used to verify the PE filter table with LISA. I did
some testing on Hikey with TraceAnalysis::plotEDiffSpace() for PE
filtering and TraceAnalysis::plotTasks() for boosting signals; these
tests passed.
Changes in v2 (from v1):
* Refine patch 0001 to discount cap_delta in function energy_diff();
* Fix a bug and a typo in patch 0003;
* Refine patch 0004 to open the optimal and sub-optimal region checks
  when the configuration CONFIG_CGROUP_SCHEDTUNE is disabled;
* Add patch 0006 to support negative values for sysctl_sched_cfs_boost;
* Add patch 0007 to trace energy_diff properly.
Leo Yan (7):
sched/fair: discount capacity index for PE filter
sched/tune: minor fix for gain table
sched/tune: polish for PE gain table index
sched/tune: open optimal and sub-optimal regions for checking
sched/tune: add PE filter support for negative boosting
sched/tune: let sysctl_sched_cfs_boost support negative value
DEBUG: sched/tune: move energy_diff trace point
include/linux/sched/sysctl.h | 6 +--
kernel/sched/fair.c | 29 +++++++---
kernel/sched/tune.c | 124 +++++++++++++++++--------------------------
kernel/sysctl.c | 5 +-
4 files changed, 76 insertions(+), 88 deletions(-)
--
1.9.1
Subject: Re: [Eas-dev] [RFC PATCH v1 0/3] sched: Introduce Window Assisted Load Tracking
On 16-Sep 19:09, Dietmar Eggemann wrote:
> On 03/09/16 00:27, markivx@codeaurora.org wrote:
> > This patch series implements an alternative window assisted load tracking
> > mechanism in lieu of PELT based cpu utilization tracking. Testing has
> > shown that a window based non-decaying metric such as WALT guiding cpu
> > frequency and task placement decisions can improve performance/power
> > especially when running workloads more commonly found on mobile devices.
> > The aim of this series is to incorporate WALT accounting into the
> > scheduler and feed WALT statistics to schedutil in order to guide cpu
> > frequency selection. The implementation is detailed in the commit text
> > of Patch 1. The eventual goal is to also guide placement decisions
> > based on WALT statistics.
> >
> > WALT has existed in out-of-tree kernels for ARM/ARM64 commercialized
> > devices for a few years. This is an effort to bring WALT to mainline
> > as well as to test on multiple architectures and with varied workloads.
> >
> > This RFC version is mainly to preview what the code will look like on
> > mainline. Future RFC revisions will include a theoretical discussion and
> > benchmark results.
> >
> > Tested on an Intel x86_64 machine (on top of 4.7-rc6). (Benchmark
> > results will be sent out separately and as part of this message in the
> > next RFC version).
> >
> > Patch 1: Adds WALT tracking to the scheduler
> >
> > Patches 2-3: Temporary patches to bring in EAS/sched-freq like capacity
> > table and to use Intel PMC counters for more accurate
> > frequency invariant load tracking on X86. Included for
> > completeness but not meant for merging.
> >
> > include/linux/sched.h | 35 ++++++++++
> > include/linux/sched/sysctl.h | 2 +
> > include/trace/events/sched.h | 76 +++++++++++++++++++++
> > init/Kconfig | 9 +++
> > kernel/sched/Makefile | 1 +
> > kernel/sched/core.c | 29 ++++++++-
> > kernel/sched/cpufreq_schedutil.c | 44 ++++++++++++-
> > kernel/sched/cputime.c | 11 +++-
> > kernel/sched/debug.c | 10 +++
> > kernel/sched/fair.c | 7 +-
> > kernel/sched/sched.h | 13 ++++
> > kernel/sched/walt.c | 580 ++++++++++++++++++++++++++++++++++
> > kernel/sched/walt.h | 75 +++++++++++++++++++++
> > kernel/sysctl.c | 18 +++++
> > 14 files changed, 904 insertions(+), 6 deletions(-)
> >
>
> I caught a WALT related hard lockup on a v4.7 kernel with only patch 1 on top. Fairly easy to reproduce by watching a video in firefox browser on Ubuntu 16.04.
>
> $ addr2line -e vmlinux ffffffff810d835e
> kernel/sched/sched.h:1542
>
> $ addr2line -e vmlinux ffffffff810d29b0
> kernel/sched/sched.h:1538
>
> 1531 static inline int _double_lock_balance(struct rq *this_rq, struct rq *busiest)
> 1532 __releases(this_rq->lock)
> 1533 __acquires(busiest->lock)
> 1534 __acquires(this_rq->lock)
> 1535 {
> 1536 int ret = 0;
> 1537
> 1538 if (unlikely(!raw_spin_trylock(&busiest->lock))) {
> 1539 if (busiest < this_rq) {
> 1540 raw_spin_unlock(&this_rq->lock);
> 1541 raw_spin_lock(&busiest->lock);
> 1542 raw_spin_lock_nested(&this_rq->lock,
> 1543 SINGLE_DEPTH_NESTING);
> 1544 ret = 1;
> 1545 } else
> 1546 raw_spin_lock_nested(&busiest->lock,
> 1547 SINGLE_DEPTH_NESTING);
> 1548 }
> 1549 return ret;
> 1550 }
To me this issue seems related to the one fixed by this patch from
Todd:
https://android.googlesource.com/kernel/common/+/ab1b90f03a063f4ef9899835e9…
We noticed the issue while working on AOSP v3.18, but it is
potentially still present in mainline kernels since the
implementation of the locking functions has not been updated.
Here is how Todd described a possible race condition:
Thanks for the review. I've convinced myself that getting to
move_queued_task() with the two cpus being the same is possible (but
probably rare) since there are races between normal scheduler
migration and the forced migration via the cpu_migration_thread. If
the thread migrates naturally from the src to the dest and does it
after the last check in __migrate_task, we get into this case. This
can happen since we drop the rq lock during double_lock_balance
allowing a migration behind our back while we are re-acquiring the rq
lock.
And here is a summary of the analysis we did:
1. The double_(un)lock_balance() calls are mainly used by rt/deadline
   code, where there are proper checks that the two RQs are not the
   same.
   They are never used by core/fair, where the double_rq_(un)lock()
   calls are preferred.
2. All the usages of double_(un)lock_balance() are introduced in
   core/walt by WALT related patches.
   However, the invariant "RQs must not be the same" is not always
   guaranteed in these paths.
3. The implementation of double_(un)lock_balance() in both the
   Android kernel and mainline is "asymmetric". In the CONFIG_PREEMPT
   case at least, the locking call is implemented using
   double_rq_lock(), which provides the proper check on the RQs being
   different, while this check is not present in the unlocking
   function.
Juri has also got confirmation from PeterZ that the
double_(un)lock_balance() functions are not to be used if we cannot
guarantee that the RQs are different. However, the asymmetry is still
there, and thus this code deserves a patch in mainline as well, which
is the one Todd added to AOSP v3.18.
Perhaps a better solution for WALT would be to use the
double_rq_(un)lock() primitives instead of the
double_(un)lock_balance() ones, which also makes the code more
aligned with the locking APIs already used in the core scheduler.
For reference, double_rq_lock() is sketched below.
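Here is mainline's double_rq_lock() (kernel/sched/sched.h, SMP case),
quoted from memory so treat it as a sketch; note the explicit
rq1 == rq2 check that double_lock_balance() does not make:

	static inline void double_rq_lock(struct rq *rq1, struct rq *rq2)
		__acquires(rq1->lock)
		__acquires(rq2->lock)
	{
		BUG_ON(!irqs_disabled());
		if (rq1 == rq2) {
			/* same RQ: take the lock once, fake the second */
			raw_spin_lock(&rq1->lock);
			__acquire(rq2->lock);
		} else {
			/* different RQs: lock in address order */
			if (rq1 < rq2) {
				raw_spin_lock(&rq1->lock);
				raw_spin_lock_nested(&rq2->lock, SINGLE_DEPTH_NESTING);
			} else {
				raw_spin_lock(&rq2->lock);
				raw_spin_lock_nested(&rq1->lock, SINGLE_DEPTH_NESTING);
			}
		}
	}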
Cheers Patrick
> [ 118.795603] =============================================
> [ 118.795606] [ INFO: possible recursive locking detected ]
> [ 118.795609] 4.7.0-walt-v4 #3 Not tainted
> [ 118.795612] ---------------------------------------------
> [ 118.795615] rtkit-daemon/3133 is trying to acquire lock:
> [ 118.795619] (&rq->lock){-.-.-.}, at: [<ffffffff810d835e>] walt_fixup_busy_time+0x1ee/0x300
> [ 118.795635]
> [ 118.795635] but task is already holding lock:
> [ 118.795638] (&rq->lock){-.-.-.}, at: [<ffffffff810d29b0>] push_rt_task.part.39+0xb0/0x2a0
> [ 118.795650]
> [ 118.795650] other info that might help us debug this:
> [ 118.795653] Possible unsafe locking scenario:
> [ 118.795653]
> [ 118.795656] CPU0
> [ 118.795659] ----
> [ 118.795661] lock(&rq->lock);
> [ 118.795667] lock(&rq->lock);
> [ 118.795673]
> [ 118.795673] *** DEADLOCK ***
> [ 118.795673]
> [ 118.795676] May be due to missing lock nesting notation
> [ 118.795676]
> [ 118.795680] 1 lock held by rtkit-daemon/3133:
> [ 118.795682] #0: (&rq->lock){-.-.-.}, at: [<ffffffff810d29b0>] push_rt_task.part.39+0xb0/0x2a0
> [ 118.795692]
> [ 118.795692] stack backtrace:
> [ 118.795697] CPU: 1 PID: 3133 Comm: rtkit-daemon Not tainted 4.7.0-walt-v4 #3
> [ 118.795700] Hardware name: LENOVO 2537Z5F/2537Z5F, BIOS 6IET74WW (1.34 ) 10/25/2010
> [ 118.795703] 0000000000000000 ffff8800ad7e77a8 ffffffff8143001c ffffffff829e8fc0
> [ 118.795711] ffffffff829e8fc0 ffff8800ad7e7848 ffffffff810e5eab ffff880000000000
> [ 118.795722] 000000000003e01f ffffffff8235f800 ffff8800af3ccd40 000000000000032f
> [ 118.795729] Call Trace:
> [ 118.795735] BUG: sleeping function called from invalid context at kernel/irq/manage.c:110
> [ 118.795736] in_atomic(): 1, irqs_disabled(): 1, pid: 3133, name: rtkit-daemon
> [ 118.795736] INFO: lockdep is turned off.
> [ 118.795737] irq event stamp: 330
> [ 118.795741] hardirqs last enabled at (329): [<ffffffff818b675c>] _raw_spin_unlock_irq+0x2c/0x40
> [ 118.795743] hardirqs last disabled at (330): [<ffffffff818b6eeb>] _raw_spin_lock_irqsave+0x2b/0x90
> [ 118.795747] softirqs last enabled at (0): [<ffffffff810827b1>] copy_process.part.30+0x5c1/0x1e60
> [ 118.795749] softirqs last disabled at (0): [< (null)>] (null)
> [ 118.795750] CPU: 1 PID: 3133 Comm: rtkit-daemon Not tainted 4.7.0-walt-v4 #3
> [ 118.795751] Hardware name: LENOVO 2537Z5F/2537Z5F, BIOS 6IET74WW (1.34 ) 10/25/2010
> [ 118.795753] 0000000000000001 ffff8800ad7e7390 ffffffff8143001c ffff8800af3ccd40
> [ 118.795754] ffffffff81ca0267 ffff8800ad7e73b8 ffffffff810b3490 ffffffff81ca0267
> [ 118.795756] 000000000000006e 0000000000000000 ffff8800ad7e73e0 ffffffff810b3599
> [ 118.795756] Call Trace:
> [ 118.795763] [<ffffffff8143001c>] dump_stack+0x85/0xc9
> [ 118.795766] [<ffffffff810b3490>] ___might_sleep+0x180/0x240
> [ 118.795768] [<ffffffff810b3599>] __might_sleep+0x49/0x80
> [ 118.795771] [<ffffffff810fc838>] synchronize_irq+0x38/0xa0
> [ 118.795772] [<ffffffff810fbdfe>] ? __irq_put_desc_unlock+0x1e/0x40
> [ 118.795774] [<ffffffff810fcae9>] ? __disable_irq_nosync+0x49/0x70
> [ 118.795775] [<ffffffff810fcb3c>] disable_irq+0x1c/0x30
> [ 118.795787] [<ffffffffc0172a02>] e1000_netpoll+0xf2/0x120 [e1000e]
> [ 118.795791] [<ffffffff817a3518>] netpoll_poll_dev+0x78/0x2c0
> [ 118.795793] [<ffffffff817a3900>] netpoll_send_skb_on_dev+0x1a0/0x290
> [ 118.795795] [<ffffffff817a3ccf>] netpoll_send_udp+0x2df/0x470
> [ 118.795798] [<ffffffffc012ab32>] write_msg+0xb2/0xf0 [netconsole]
> [ 118.795800] [<ffffffff810f9489>] call_console_drivers.constprop.23+0x149/0x1e0
> [ 118.795802] [<ffffffff810fa334>] console_unlock+0x4e4/0x5b0
> [ 118.795803] [<ffffffff810fa7ae>] vprintk_emit+0x3ae/0x5d0
> [ 118.795805] [<ffffffff810fab29>] vprintk_default+0x29/0x40
> [ 118.795808] [<ffffffff811b7be2>] printk+0x4d/0x4f
> [ 118.795812] [<ffffffff810372c2>] show_trace_log_lvl+0x32/0x60
> [ 118.795814] [<ffffffff8103681f>] show_stack_log_lvl+0xff/0x180
> [ 118.795816] [<ffffffff81037335>] show_stack+0x25/0x50
> [ 118.795818] [<ffffffff8143001c>] dump_stack+0x85/0xc9
> [ 118.795821] [<ffffffff810e5eab>] __lock_acquire+0x193b/0x1940
> [ 118.795823] [<ffffffff810deab4>] ? cpuacct_charge+0xd4/0x1d0
> [ 118.795825] [<ffffffff8103d319>] ? sched_clock+0x9/0x10
> [ 118.795826] [<ffffffff810d1dda>] ? update_curr_rt+0x15a/0x300
> [ 118.795828] [<ffffffff810e6533>] lock_acquire+0xd3/0x220
> [ 118.795830] [<ffffffff810d835e>] ? walt_fixup_busy_time+0x1ee/0x300
> [ 118.795831] [<ffffffff818b638d>] _raw_spin_lock+0x3d/0x80
> [ 118.795833] [<ffffffff810d835e>] ? walt_fixup_busy_time+0x1ee/0x300
> [ 118.795834] [<ffffffff810d835e>] walt_fixup_busy_time+0x1ee/0x300
> [ 118.795836] [<ffffffff810b783c>] set_task_cpu+0xac/0x2e0
> [ 118.795837] [<ffffffff810d2a53>] push_rt_task.part.39+0x153/0x2a0
> [ 118.795839] [<ffffffff810d2cb7>] push_rt_tasks+0x17/0x30
> [ 118.795841] [<ffffffff811b6d3b>] __balance_callback+0x45/0x5c
> [ 118.795844] [<ffffffff818b0d96>] __schedule+0xaf6/0xbb0
> [ 118.795846] [<ffffffff818b0e8c>] schedule+0x3c/0x90
> [ 118.795847] [<ffffffff818b6053>] schedule_hrtimeout_range_clock+0xe3/0x140
> [ 118.795850] [<ffffffff811125c0>] ? hrtimer_init+0x230/0x230
> [ 118.795852] [<ffffffff818b6047>] ? schedule_hrtimeout_range_clock+0xd7/0x140
> [ 118.795853] [<ffffffff818b60c3>] schedule_hrtimeout_range+0x13/0x20
> [ 118.795858] [<ffffffff81261604>] poll_schedule_timeout+0x54/0x80
> [ 118.795859] [<ffffffff81262e67>] do_sys_poll+0x3a7/0x510
> [ 118.795861] [<ffffffff8103d319>] ? sched_clock+0x9/0x10
> [ 118.795864] [<ffffffff8129ac30>] ? ep_poll_callback+0x120/0x360
> [ 118.795866] [<ffffffff8103d319>] ? sched_clock+0x9/0x10
> [ 118.795867] [<ffffffff810d60c0>] ? __wake_up_sync_key+0x50/0x60
> [ 118.795869] [<ffffffff812617d0>] ? poll_select_copy_remaining+0x150/0x150
> [ 118.795871] [<ffffffff812617d0>] ? poll_select_copy_remaining+0x150/0x150
> [ 118.795873] [<ffffffff8103d319>] ? sched_clock+0x9/0x10
> [ 118.795875] [<ffffffff811eeced>] ? __might_fault+0x4d/0xa0
> [ 118.795877] [<ffffffff810e497d>] ? __lock_acquire+0x40d/0x1940
> [ 118.795879] [<ffffffff8103d319>] ? sched_clock+0x9/0x10
> [ 118.795880] [<ffffffff81261a2a>] ? poll_select_set_timeout+0x5a/0x90
> [ 118.795883] [<ffffffff8111a3a4>] ? ktime_get_ts64+0x84/0x180
> [ 118.795885] [<ffffffff810e424d>] ? trace_hardirqs_on+0xd/0x10
> [ 118.795886] [<ffffffff8111a3d6>] ? ktime_get_ts64+0xb6/0x180
> [ 118.795888] [<ffffffff81261a2a>] ? poll_select_set_timeout+0x5a/0x90
> [ 118.795889] [<ffffffff81263095>] SyS_poll+0x65/0xf0
> [ 118.795891] [<ffffffff818b7080>] entry_SYSCALL_64_fastpath+0x23/0xc1
> [ 118.796149] [<ffffffff8143001c>] dump_stack+0x85/0xc9
> [ 118.796153] [<ffffffff810e5eab>] __lock_acquire+0x193b/0x1940
> [ 118.796156] [<ffffffff810deab4>] ? cpuacct_charge+0xd4/0x1d0
> [ 118.796159] [<ffffffff8103d319>] ? sched_clock+0x9/0x10
> [ 118.796163] [<ffffffff810d1dda>] ? update_curr_rt+0x15a/0x300
> [ 118.796166] [<ffffffff810e6533>] lock_acquire+0xd3/0x220
> [ 118.796169] [<ffffffff810d835e>] ? walt_fixup_busy_time+0x1ee/0x300
> [ 118.796173] [<ffffffff818b638d>] _raw_spin_lock+0x3d/0x80
> [ 118.796176] [<ffffffff810d835e>] ? walt_fixup_busy_time+0x1ee/0x300
> [ 118.796179] [<ffffffff810d835e>] walt_fixup_busy_time+0x1ee/0x300
> [ 118.796183] [<ffffffff810b783c>] set_task_cpu+0xac/0x2e0
> [ 118.796187] [<ffffffff810d2a53>] push_rt_task.part.39+0x153/0x2a0
> [ 118.796190] [<ffffffff810d2cb7>] push_rt_tasks+0x17/0x30
> [ 118.796194] [<ffffffff811b6d3b>] __balance_callback+0x45/0x5c
> [ 118.796198] [<ffffffff818b0d96>] __schedule+0xaf6/0xbb0
> [ 118.796201] [<ffffffff818b0e8c>] schedule+0x3c/0x90
> [ 118.796204] [<ffffffff818b6053>] schedule_hrtimeout_range_clock+0xe3/0x140
> [ 118.796207] [<ffffffff811125c0>] ? hrtimer_init+0x230/0x230
> [ 118.796211] [<ffffffff818b6047>] ? schedule_hrtimeout_range_clock+0xd7/0x140
> [ 118.796215] [<ffffffff818b60c3>] schedule_hrtimeout_range+0x13/0x20
> [ 118.796218] [<ffffffff81261604>] poll_schedule_timeout+0x54/0x80
> [ 118.796221] [<ffffffff81262e67>] do_sys_poll+0x3a7/0x510
> [ 118.796225] [<ffffffff8103d319>] ? sched_clock+0x9/0x10
> [ 118.796228] [<ffffffff8129ac30>] ? ep_poll_callback+0x120/0x360
> [ 118.796232] [<ffffffff8103d319>] ? sched_clock+0x9/0x10
> [ 118.796235] [<ffffffff810d60c0>] ? __wake_up_sync_key+0x50/0x60
> [ 118.796239] [<ffffffff812617d0>] ? poll_select_copy_remaining+0x150/0x150
> [ 118.796242] [<ffffffff812617d0>] ? poll_select_copy_remaining+0x150/0x150
> [ 118.796246] [<ffffffff8103d319>] ? sched_clock+0x9/0x10
> [ 118.796249] [<ffffffff811eeced>] ? __might_fault+0x4d/0xa0
> [ 118.796253] [<ffffffff810e497d>] ? __lock_acquire+0x40d/0x1940
> [ 118.796257] [<ffffffff8103d319>] ? sched_clock+0x9/0x10
> [ 118.796260] [<ffffffff81261a2a>] ? poll_select_set_timeout+0x5a/0x90
> [ 118.796264] [<ffffffff8111a3a4>] ? ktime_get_ts64+0x84/0x180
> [ 118.796268] [<ffffffff810e424d>] ? trace_hardirqs_on+0xd/0x10
> [ 118.796271] [<ffffffff8111a3d6>] ? ktime_get_ts64+0xb6/0x180
> [ 118.796275] [<ffffffff81261a2a>] ? poll_select_set_timeout+0x5a/0x90
> [ 118.796278] [<ffffffff81263095>] SyS_poll+0x65/0xf0
> [ 118.796281] [<ffffffff818b7080>] entry_SYSCALL_64_fastpath+0x23/0xc1
> [ 128.972478] NMI watchdog: Watchdog detected hard LOCKUP on cpu 0dModules linked in: intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel arc4 aesni_intel iwldvm aes_x86_64 lrw gf128mul mac80211 glue_helper ablk_helper cryptd joydev iwlwifi snd_hda_codec_hdmi serio_raw snd_hda_codec_conexant snd_hda_codec_generic snd_hda_intel intel_ips snd_hda_codec cfg80211 snd_hda_core snd_hwdep snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi thinkpad_acpi snd_seq lpc_ich snd_seq_device mei_me snd_timer nvram mei snd soundcore mac_hid shpchp netconsole configfs parport_pc ppdev lp parport autofs4 hid_generic nouveau mxm_wmi i2c_algo_bit ttm drm_kms_helper syscopyarea firewire_ohci sysfillrect usbhid sysimgblt ahci fb_sys_fops e1000e psmouse hid libahci sdhci_pci firewire_core crc_itu_t sdhci drm ptp pps_core video wmi
> [ 128.972479] irq event stamp: 1026850
> [ 128.972480] hardirqs last enabled at (1026849): [<ffffffff811243a6>] tick_nohz_idle_enter+0x46/0x80
> [ 128.972481] hardirqs last disabled at (1026850): [<ffffffff810d6a7d>] cpu_startup_entry+0xcd/0x450
> [ 128.972481] softirqs last enabled at (1026834): [<ffffffff8108b181>] _local_bh_enable+0x21/0x50
> [ 128.972482] softirqs last disabled at (1026833): [<ffffffff8108c2b2>] irq_enter+0x72/0xa0
--
#include <best/regards.h>
Patrick Bellasi
Hi,
In preparation for the Connect LAS16 hacking sessions, I've been
trying to compare schedfreq with schedutil (with EAS) on Hikey
running Android 4.4.
In the document below you can view some preliminary results of what I
found. Please excuse the brevity; I'm sure way more comments would be
needed to make the document more understandable, but I hope it's still
useful and I wanted to share it as soon as possible. Please don't
hesitate to ask for more information or clarifications (best is by
commenting directly on the document, I think).
https://docs.google.com/document/d/1tMb9yfJgaZmVANbbhTWwjLQLA2v5tj-_aTLXDeS…
tl;dr;
- merging pain wasn't so bad
- schedutil relatively close to schedfreq and interactive (even if
  high percentiles seem to be quite off): perf is lower, but saves
  some energy
- schedutil driven by WALT generally improves figures (but not that
much)
- it remains to see the amount of work required to put schedtune on top
of schedutil
Best,
- Juri
This patch series is essentially based on Morten's patch "sched/fair:
Compute task/cpu utilization at wake-up more correctly"; it aims to
achieve a more accurate estimation of CPU utilization and thus choose
the proper CPU as much as possible.
There are currently two main issues with CPU utilization:
- Without Morten's patch, the CPU the task previously ran on carries
  stale utilization for the task; so after the task is woken up, if
  we add the previous CPU's utilization and the task's utilization,
  part of the task's utilization is actually counted twice. As a
  result, the previous CPU has less chance to be chosen for the task.
  So the patch "sched/fair: use cpu_util_wake() for energy awared
  path" builds on Morten's patch to calibrate the previous CPU's
  utilization value if the task has run on it (see the sketch after
  this list).
- Another well known issue is that an idle CPU's utilization keeps a
  stale value after the CPU enters idle states. So an idle CPU's
  utilization will not change until it is woken up again, which
  misleads target CPU selection.
  In the kernel, the function update_blocked_averages() can be called
  directly to update idle CPUs' utilization values. But this function
  acquires the CPU's rq lock, so it introduces lock contention
  between CPUs. This is the main concern, since it may introduce a
  potential performance issue; so the update is only done when the
  CPU is idle and its utilization value has not yet decayed to 0.
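As an illustration of the first fix, here is a sketch of
cpu_util_wake() in the spirit of Morten's patch (written from memory,
so treat it as a sketch rather than the exact patch):

	/*
	 * Estimate a CPU's utilization for a waking task p: discount
	 * p's own contribution if p last ran on that CPU, so the
	 * task's utilization is not counted twice when the caller
	 * adds task_util(p) back on top.
	 */
	static unsigned long cpu_util_wake(int cpu, struct task_struct *p)
	{
		unsigned long util, capacity;

		/* task has no contribution to this CPU, or is newly created */
		if (cpu != task_cpu(p) || !p->se.avg.last_update_time)
			return cpu_util(cpu);

		capacity = capacity_orig_of(cpu);
		util = max_t(long, cpu_util(cpu) - task_util(p), 0);

		return (util >= capacity) ? capacity : util;
	}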
Leo Yan (3):
sched/fair: use cpu_util_wake() for energy awared path
sched/fair: add trace point for sched_new_util
sched/fair: update idle CPUs utilization when wake task
Morten Rasmussen (1):
sched/fair: Compute task/cpu utilization at wake-up more correctly
include/trace/events/sched.h | 25 ++++++++++++++
kernel/sched/fair.c | 80 +++++++++++++++++++++++++++++++++++++++++++-
2 files changed, 104 insertions(+), 1 deletion(-)
--
1.9.1
Hi Patrick,
This patch series refines and enhances schedTune.
It has two main purposes. One purpose is to adjust the range for the
capacity index so that the capacity index and the energy index have a
similar range. This will help tasks fall into a more reasonable PE
filter region. This is done by patch 1.
Another target is to support negative boosting values in the PE
filter, so that schedTune's algorithm has integrity and supports both
positive and negative boosting values. This is done by patches 2~5.
Please note, this patch set is mainly intended for discussion. I have
_NOT_ done any testing on my side.
Leo Yan (5):
sched/fair: discount capacity index for PE filter
sched/tune: minor fix for gain table
sched/tune: polish for PE gain table index
sched/tune: open optimal and sub-optimal regions for checking
sched/tune: add PE filter support for negative boosting
kernel/sched/fair.c | 10 +++++
kernel/sched/tune.c | 111 +++++++++++++++++++++++-----------------------------
2 files changed, 58 insertions(+), 63 deletions(-)
--
1.9.1