Currently the energy calculation in EAS does not consider RT pressure, so it
is quite possible to select a CPU with high RT pressure for CFS tasks and
keep accumulating utilization on it; as a result, other CPUs with low RT
pressure lose the chance to run CFS tasks and to reduce contention between
CFS and RT tasks, which is not optimal from a performance point of view.
It also hurts power, because packing RT and CFS tasks onto a single CPU
makes it more likely that the CPU frequency will be increased.
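As a rough illustration of the direction (a sketch only, not the code in
these patches), the utilization used for CPU selection and energy estimation
should account for RT/DL pressure on top of the CFS utilization, so that
CPUs squeezed by RT activity look correspondingly less attractive. The
cpu_util_rt()/cpu_util_dl() helper names below are placeholders:

/*
 * Sketch of a summed-utilization helper in the spirit of the
 * cpu_util_sum() introduced by patch 2/4; cpu_util_rt()/cpu_util_dl()
 * stand for "utilization consumed by RT/DL" and are not necessarily
 * the helpers used in the actual patches.
 */
static inline unsigned long __cpu_util_sum(int cpu)
{
	struct rq *rq = cpu_rq(cpu);
	unsigned long util = cpu_util(cpu);	/* CFS utilization */

	util += cpu_util_rt(rq);		/* RT pressure */
	util += cpu_util_dl(rq);		/* DL pressure */

	/* Clamp to the CPU's original capacity */
	return min(util, capacity_orig_of(cpu));
}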
We can measure the summed CPU utilization and the CPU frequency, and use the
standard deviation across the CPUs of a cluster to check whether tasks are
spread well for a medium workload. Below is the comparison for video playback
on Hikey960, before and after applying this patch set (using the schedutil
CPUFreq governor):
                  Without Patch Set             |            With Patch Set
 CPU   Min(Util)  Mean(Util)  Max(Util)         |  Min(Util)  Mean(Util)  Max(Util)
  0        7          67         205            |      8          52         170
  1        4          53         227            |      9          47         188
  2        4          57         191            |      8          38         192
  3        4          35         165            |     16          47         146
 s.d.      1.5        13.3        25.9          |      3.9         5.83       20.9
  4        0          35         160            |     10          34         129
  5        0          24         129            |      0          30         115
  6        0          18         123            |      0          18          95
  7        0          12          84            |      0          21          73
 s.d.      0           9.8        31.2          |      5           7.5        24.4
The standard deviation of the mean CPU utilization decreases after applying
this patch set (LITTLE cluster: 13.3 vs 5.83, big cluster: 9.8 vs 7.5).
This is also confirmed by the average CPU frequency:
                 Without Patch Set   |   With Patch Set
                 Average Frequency   |   Average Frequency
 LITTLE cluster       737MHz         |        646MHz
 big cluster          916MHz         |        922MHz
Leo Yan (4):
sched/fair: Select maximum spare capacity for idle candidate CPUs
sched: Introduce cpu_util_sum()/__cpu_util_sum() functions
sched/fair: Consider RT pressure for find_best_target()
sched/fair: Consider RT/DL pressure for energy calculation
kernel/sched/fair.c | 22 +++++++++++++++++++---
kernel/sched/sched.h | 29 +++++++++++++++++++++++++++++
2 files changed, 48 insertions(+), 3 deletions(-)
--
1.9.1
Here are some patches that are generally minor changes, posted together.
Patches 1/5 and 2/5 are related to skipping cpufreq updates on the dequeue of
the last task before the CPU enters idle; that part is mostly just a rebase
of [1]. Patches 3/5 and 4/5 fix some minor things I noticed after the remote
cpufreq update work, and patch 5/5 is a small clean up of find_idlest_group.
Let me know your thoughts, and thanks. I've based these patches on peterz's
queue.git master branch.
[1] https://patchwork.kernel.org/patch/9936555/
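As background for patch 3/5: schedutil decides whether a CPU has been busy
by comparing its idle_calls counter against a previously saved value, and
with remote updates that counter has to come from the CPU being updated
rather than the CPU running the update. A simplified reconstruction of that
check, assuming the series adds a per-CPU tick_nohz_get_idle_calls_cpu()
variant (the exact posted code may differ):

/*
 * Simplified reconstruction of the schedutil busy check touched by
 * patch 3/5: read the idle_calls counter of the CPU being updated
 * (sg_cpu->cpu), not of the CPU performing the update.
 */
static bool sugov_cpu_is_busy(struct sugov_cpu *sg_cpu)
{
	unsigned long idle_calls = tick_nohz_get_idle_calls_cpu(sg_cpu->cpu);
	bool not_idle = idle_calls == sg_cpu->saved_idle_calls;

	sg_cpu->saved_idle_calls = idle_calls;
	return not_idle;
}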
Joel Fernandes (5):
Revert "sched/fair: Drop always true parameter of
update_cfs_rq_load_avg()"
sched/fair: Skip frequency update if CPU about to idle
cpufreq: schedutil: Use idle_calls counter of the remote CPU
sched/fair: Correct obsolete comment about cpufreq_update_util
sched/fair: remove impossible condition from find_idlest_group_cpu
include/linux/tick.h | 1 +
kernel/sched/cpufreq_schedutil.c | 2 +-
kernel/sched/fair.c | 44 ++++++++++++++++++++++++++++------------
kernel/sched/sched.h | 1 +
kernel/time/tick-sched.c | 13 ++++++++++++
5 files changed, 47 insertions(+), 14 deletions(-)
--
2.15.0.rc2.357.g7e34df9404-goog
capacity_spare_wake in the slow path influences the choice of the idlest
group, as we search for the group with maximum spare capacity. In scenarios
where RT pressure is high, a suboptimal group can be chosen, hurting the
performance of the task being woken up.
Several tests with results are included below to show improvements with
this change.
1) Hackbench on Pixel 2 Android device (4x4 ARM64 Octa core)
------------------------------------------------------------
Here we have RT activity running on the big CPU cluster, induced with rt-app,
with hackbench running in parallel. The RT tasks are bound to the 4 CPUs of
the big cluster (CPUs 4, 5, 6, 7) and have a 100ms period with runtime=20ms,
sleep=80ms.
Hackbench shows a big improvement (30.7%) when the number of tasks is 8, and
11.6% when it is 32. Note: data is completion time in seconds (lower is better).
The number of loops is 50000 for 8 and 16 tasks, and 20000 for 32 tasks.
+--------+-----+-------+-------------------+---------------------------+
| groups | fds | tasks | Without Patch | With Patch |
+--------+-----+-------+---------+---------+-----------------+---------+
| | | | Mean | Stdev | Mean | Stdev |
+--------+-----+-------+---------+---------+-----------------+---------+
| 1 | 8 | 8 | 1.0534 | 0.13722 | 0.7293 (+30.7%) | 0.02653 |
| 2 | 8 | 16 | 1.6219 | 0.16631 | 1.6391 (-1%) | 0.24001 |
| 4 | 8 | 32 | 1.2538 | 0.13086 | 1.1080 (+11.6%) | 0.16201 |
+--------+-----+-------+---------+---------+-----------------+---------+
2) Rohit ran the barrier.c test (details below) with the following improvements:
------------------------------------------------------------------------
This was Rohit's original use case for the patch he posted at [1]; however,
his recent tests show that this patch can replace his slow-path changes [1],
and there is no need to selectively scan/skip CPUs in find_idlest_group_cpu
in the slow path to get the improvement he sees.
barrier.c (OpenMP code) is used as a micro-benchmark. It does a number of
iterations with a barrier sync at the end of each for loop.
Here barrier.c runs along with ping on CPUs 0 and 1 as:
'ping -l 10000 -q -s 10 -f hostX'
barrier.c can be found at:
http://www.spinics.net/lists/kernel/msg2506955.html
Following are the results in iterations per second with this micro-benchmark
(higher is better), on a 2-socket, 44-core, 88-thread Intel x86 machine:
+--------+------------------+---------------------------+
|Threads | Without patch | With patch |
| | | |
+--------+--------+---------+-----------------+---------+
| | Mean | Std Dev | Mean | Std Dev |
+--------+--------+---------+-----------------+---------+
|1 | 539.36 | 60.16 | 572.54 (+6.15%) | 40.95 |
|2 | 481.01 | 19.32 | 530.64 (+10.32%)| 56.16 |
|4 | 474.78 | 22.28 | 479.46 (+0.99%) | 18.89 |
|8 | 450.06 | 24.91 | 447.82 (-0.50%) | 12.36 |
|16 | 436.99 | 22.57 | 441.88 (+1.12%) | 7.39 |
|32 | 388.28 | 55.59 | 429.4 (+10.59%)| 31.14 |
|64 | 314.62 | 6.33 | 311.81 (-0.89%) | 11.99 |
+--------+--------+---------+-----------------+---------+
3) ping+hackbench test on a bare-metal server (Rohit ran this test)
----------------------------------------------------------------
Here hackbench runs in threaded mode, along with ping on CPUs 0 and 1 as:
'ping -l 10000 -q -s 10 -f hostX'
This test runs on a 2-socket, 20-core, 40-thread Intel x86 machine.
The number of loops is 10000 and the runtime is in seconds (lower is better).
+--------------+-----------------+--------------------------+
|Task Groups | Without patch | With patch |
| +-------+---------+----------------+---------+
|(Groups of 40)| Mean | Std Dev | Mean | Std Dev |
+--------------+-------+---------+----------------+---------+
|1 | 0.851 | 0.007 | 0.828 (+2.77%)| 0.032 |
|2 | 1.083 | 0.203 | 1.087 (-0.37%)| 0.246 |
|4 | 1.601 | 0.051 | 1.611 (-0.62%)| 0.055 |
|8 | 2.837 | 0.060 | 2.827 (+0.35%)| 0.031 |
|16 | 5.139 | 0.133 | 5.107 (+0.63%)| 0.085 |
|25 | 7.569 | 0.142 | 7.503 (+0.88%)| 0.143 |
+--------------+-------+---------+----------------+---------+
[1] https://patchwork.kernel.org/patch/9991635/
Cc: Dietmar Eggemann <dietmar.eggemann(a)arm.com>
Cc: Vincent Guittot <vincent.guittot(a)linaro.org>
Cc: Morten Rasmussen <morten.rasmussen(a)arm.com>
Cc: Brendan Jackman <brendan.jackman(a)arm.com>
Cc: Matt Fleming <matt(a)codeblueprint.co.uk>
Tested-by: Rohit Jain <rohit.k.jain(a)oracle.com>
Signed-off-by: Joel Fernandes <joelaf(a)google.com>
---
kernel/sched/fair.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 740602ce799f..487e485b3560 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5742,7 +5742,7 @@ static int cpu_util_wake(int cpu, struct task_struct *p);
static unsigned long capacity_spare_wake(int cpu, struct task_struct *p)
{
- return capacity_orig_of(cpu) - cpu_util_wake(cpu, p);
+ return max_t(long, capacity_of(cpu) - cpu_util_wake(cpu, p), 0);
}
/*
--
2.15.0.rc2.357.g7e34df9404-goog
Hi,
I tried an experiment this weekend - basically I have RT threads bound
to the big CPUs running a fixed-period load, with hackbench running with
all CPUs allowed. The system is a Pixel 2, an ARM big.LITTLE 8-core (4x4).
Basically, I changed capacity_orig_of to capacity_of in capacity_spare_wake
and wake_cap, and I see a good performance improvement. That makes sense,
because wake_cap then sends the task wake-up to the slow path if RT activity
is eating into the CFS capacity on the prev/current CPU, and
capacity_spare_wake finds a better group, with the spare capacity reduced by
the RT pressure.
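For reference, here is roughly what the wake_cap side of that experiment
looks like (my reconstruction against a ~v4.14 fair.c, not a posted patch);
the only change is reading capacity_of() instead of capacity_orig_of():

/*
 * Reconstruction of the wake_cap() experiment described above, not a
 * posted patch: capacity_of() reflects the capacity left after RT/IRQ
 * pressure, while capacity_orig_of() does not.
 */
static int wake_cap(struct task_struct *p, int cpu, int prev_cpu)
{
	long min_cap, max_cap;

	min_cap = min(capacity_of(prev_cpu), capacity_of(cpu));
	max_cap = cpu_rq(cpu)->rd->max_cpu_capacity;

	/* Minimum capacity is close to max, no need to abort wake_affine */
	if (max_cap - min_cap < max_cap >> 3)
		return 0;

	/* Bring task utilization in sync with prev_cpu */
	sync_entity_load_avg(&p->se);

	return min_cap * 1024 < task_util(p) * capacity_margin;
}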
One concern I had with such a change to wake_cap was that it might affect
upstream cases that may still want to do a select_idle_sibling even when the
capacity on the previous/waker's CPU is not enough after deducting RT
pressure. In that case, I think the wake_cap change to use capacity_of would
cause those cases to enter the slow path.
Could you let me know your thoughts about such a change? I heard that
capacity_of was attempted before and there might be some cases to consider.
Is there anything from your previous experience with this change that you
could share? At least for capacity_spare_wake, the improvements seem
worthwhile, and dramatic in some cases. I also have some more changes in
mind for find_idlest_group, but I wanted to start a discussion on the spare
capacity idea first.
This is related to Rohit's work on RT capacity awareness; I was talking to
him and we were discussing ideas on the implementation.
thanks,
- Joel
The blocked load and shares of root cfs_rqs are currently only
updated by the CPU owning the rq. That means that if a CPU goes
suddenly from being busy to totally idle, its load and shares are
not updated.
Schedutil works around this problem by ignoring the util of CPUs
that were last updated more than a tick ago. However, the stale
load does impact task placement: code paths that look at load and
util (in particular the slow path of select_task_rq_fair) can
leave the idle CPUs unused while other CPUs become unnecessarily
overloaded. Furthermore, the stale shares can impact CPU time
allotment.
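For reference, the schedutil workaround mentioned above looks roughly like
this (a trimmed reconstruction with the iowait boost handling omitted; it
may not match upstream byte for byte):

/*
 * Trimmed reconstruction of the existing schedutil behaviour referred
 * to above: when aggregating utilization across a shared frequency
 * domain, skip any CPU whose last update is older than one tick, since
 * it has probably gone idle and its utilization may be stale.
 */
static unsigned int sugov_next_freq_shared(struct sugov_cpu *sg_cpu, u64 time)
{
	struct sugov_policy *sg_policy = sg_cpu->sg_policy;
	struct cpufreq_policy *policy = sg_policy->policy;
	unsigned long util = 0, max = 1;
	unsigned int j;

	for_each_cpu(j, policy->cpus) {
		struct sugov_cpu *j_sg_cpu = &per_cpu(sugov_cpu, j);
		unsigned long j_util, j_max;
		s64 delta_ns = time - j_sg_cpu->last_update;

		/* Ignore CPUs not updated within the last tick: likely idle */
		if (delta_ns > TICK_NSEC)
			continue;

		j_util = j_sg_cpu->util;
		j_max = j_sg_cpu->max;
		if (j_util * max > j_max * util) {
			util = j_util;
			max = j_max;
		}
	}

	return get_next_freq(sg_policy, util, max);
}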
Two complementary solutions are proposed here:
1. When a task wakes up, if necessary an idle CPU is woken as if to
perform a NOHZ idle balance, which is then aborted once the load
of NOHZ idle CPUs has been updated. This solves the problem but
brings with it extra CPU wakeups, which have an energy cost.
2. During newly-idle load balancing, the load of remote nohz-idle
CPUs in the sched_domain is updated. When all of the idle CPUs
were updated in that step, the nohz.next_update field
is pushed further into the future. This field is used to determine
the need for triggering the newly-added NOHZ kick. So if such
newly-idle balances are happening often enough, no additional CPU
wakeups are required to keep all the CPUs' loads updated (a rough
sketch of this path follows below).
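To make solution (2) a bit more concrete, here is a rough sketch (not the
posted patches; update_nohz_stats() is a placeholder name for "refresh this
remote CPU's blocked load", and nohz.next_update is taken from the
description above):

/*
 * Rough sketch of solution (2): during a newly-idle balance, refresh
 * the blocked load of the other nohz-idle CPUs; if all of them were
 * refreshed, push nohz.next_update out so the NOHZ kick from solution
 * (1) is not needed for a while. update_nohz_stats() is a placeholder.
 */
static void nohz_newidle_balance(struct rq *this_rq)
{
	unsigned long next = jiffies + msecs_to_jiffies(LOAD_AVG_PERIOD);
	bool all_done = true;
	int cpu;

	for_each_cpu(cpu, nohz.idle_cpus_mask) {
		if (cpu == this_rq->cpu)
			continue;

		/* Update the remote CPU's blocked load and shares */
		if (!update_nohz_stats(cpu_rq(cpu)))
			all_done = false;
	}

	/* Everything is fresh: defer the next forced NOHZ update */
	if (all_done)
		nohz.next_update = next;
}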
[eas-dev] Patch 2/3 here is to highlight a change I made from
Vincent's original patch, so that it can be reviewed more
easily - if the modification is accepted then I'll squash
it before posting this to LKML proper.
Brendan Jackman (2):
sched/fair: Refactor nohz blocked load updates
sched/fair: Update blocked load from newly idle balance
Vincent Guittot (1):
sched: force update of blocked load of idle cpus
kernel/sched/core.c | 1 +
kernel/sched/fair.c | 106 ++++++++++++++++++++++++++++++++++++++++++++-------
kernel/sched/sched.h | 2 +
3 files changed, 96 insertions(+), 13 deletions(-)
--
2.14.1
Changelog:
---------------------------------------------------------------------------
v1->v2:
* Changed the dynamic threshold calculation so that having global state
can be avoided.
v2->v3:
* Split up the patch for find_idlest_cpu and select_idle_sibling code
paths.
v3->v4:
* Rebased it to peterz's tree (apologies for wrong tree for v3)
v4->v5:
* Changed the threshold to 768 from 819 for easier shifts
* Changed the find_idlest_cpu code path to be simpler
* Changed the select_idle_core code path to search for
idlest+full_capacity core
* Added scaled capacity awareness to wake_affine_idle code path
---------------------------------------------------------------------------
During OLTP workload runs, threads can end up on CPUs with a lot of
softIRQ activity, thus delaying progress. For more reliable and
faster runs, if the system can spare it, these threads should be
scheduled on CPUs with lower IRQ/RT activity.
Currently, the scheduler takes into account the original capacity of
CPUs when providing 'hints' for the select_idle_sibling code path to return
an idle CPU. However, the rest of the select_idle_* code paths remain
capacity agnostic. Further, these code paths are only aware of the
original capacity and not of the capacity stolen by IRQ/RT activity.
This patch set introduces capacity awareness in the scheduler (CAS), which
avoids CPUs that might have their capacities reduced (due to IRQ/RT activity)
when trying to schedule threads (on the push side) in the system. This
awareness has been added to the fair scheduling class.
It does so using the following algorithm (a rough sketch in code follows
the list):
1) The scaled capacities are already calculated via the rt_avg accounting.
2) Any CPU which is running below 80% capacity is considered running low
on capacity.
3) During idle CPU search if a CPU is found running low on capacity, it
is skipped if better CPUs are available.
4) If none of the CPUs are better in terms of idleness and capacity, then
the low-capacity CPU is considered to be the best available CPU.
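A minimal sketch of the capacity test described above (illustrative only,
not the posted patches; the 768 threshold comes from the v4->v5 changelog
entry):

/*
 * Illustrative sketch of the "low on capacity" test, not the posted
 * patches: a CPU has full capacity when the capacity left after IRQ/RT
 * time (capacity_of()) is at least 768/1024 of its original capacity.
 */
#define CAPACITY_THRESHOLD	768	/* out of SCHED_CAPACITY_SCALE (1024) */

static inline bool full_capacity(int cpu)
{
	return capacity_of(cpu) * SCHED_CAPACITY_SCALE >=
	       capacity_orig_of(cpu) * CAPACITY_THRESHOLD;
}

During the idle-CPU search, a CPU failing this test is only picked when no
better (idle and full-capacity) candidate is found.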
The performance numbers:
---------------------------------------------------------------------------
CAS shows up to 1.5% improvement on x86 when running a 'SELECT' database
workload.
For the microbenchmark results, I used hackbench in process mode, along
with ping running on CPUs 0, 1 and 2 as:
'ping -l 10000 -q -s 10 -f hostX'
The results below should be read as:
* 'Baseline without ping' is how the workload would've behaved if there
was no IRQ activity.
* Compare 'Baseline with ping' and 'Baseline without ping' to see the
effect of ping
* Compare 'Baseline with ping' and 'CAS with ping' to see the improvement
CAS can give over baseline
Following are the runtime(s) with hackbench and ping activity as
described above (lower is better), on a 44 core 2 socket x86 machine:
+---------------+------+--------+--------+
|Num. |CAS |Baseline|Baseline|
|Tasks |with |with |without |
|(groups of 40) |ping |ping |ping |
+---------------+------+--------+--------+
| |Mean |Mean |Mean |
+---------------+------+--------+--------+
|1 | 0.55 | 0.59 | 0.53 |
|2 | 0.66 | 0.81 | 0.51 |
|4 | 0.99 | 1.16 | 0.95 |
|8 | 1.92 | 1.93 | 1.88 |
|16 | 3.24 | 3.26 | 3.15 |
|32 | 5.93 | 5.98 | 5.68 |
|64 | 11.55| 11.94 | 10.89 |
+---------------+------+--------+--------+
Rohit Jain (3):
sched/fair: Introduce scaled capacity awareness in find_idlest_cpu
code path
sched/fair: Introduce scaled capacity awareness in select_idle_sibling
code path
sched/fair: Introduce scaled capacity awareness in wake_affine_idle
code path
kernel/sched/fair.c | 66 ++++++++++++++++++++++++++++++++++++++++++-----------
1 file changed, 53 insertions(+), 13 deletions(-)
--
2.7.4