eas-dev November 2017

eas-dev@lists.linaro.org

11 participants
9 discussions

[PATCH 0/4] Consider RT Pressure for Energy Saving

by Leo Yan

Currently the energy calculation in EAS does not consider RT pressure. It is therefore quite possible to select, for CFS tasks, a CPU that has high RT pressure, so that total utilization accumulates on that CPU. As a result, the other CPUs with low RT pressure lose the chance to run CFS tasks and to reduce contention between CFS and RT tasks. From a performance view this is not optimal. It also harms power, because packing RT and CFS tasks on a single CPU more easily triggers a CPU frequency increase.

We can measure the summed CPU utilization and calculate its standard deviation to find out whether tasks are spread well within the same cluster for the middle workload case. Below is the comparison for video playback on Hikey960 before and after applying this patch set (using the schedutil CPUFreq governor):

            Without Patch Set:                 |  With Patch Set:
  CPU   Min(Util)  Mean(Util)  Mean(Util)      |  Min(Util)  Mean(Util)  Mean(Util)
  0     7          67          205             |  8          52          170
  1     4          53          227             |  9          47          188
  2     4          57          191             |  8          38          192
  3     4          35          165             |  16         47          146
  s.d.  1.5        13.3        25.9            |  3.9        5.83        20.9

  4     0          35          160             |  10         34          129
  5     0          24          129             |  0          30          115
  6     0          18          123             |  0          18          95
  7     0          12          84              |  0          21          73
  s.d.  0          9.8         31.2            |  5          7.5         24.4

The standard deviation of the mean CPU utilization decreases after applying this patch set (little cluster: 13.3 vs 5.83, big cluster: 9.8 vs 7.5). The average CPU frequency for each cluster is:

                   Without Patch Set:  |  With Patch Set:
                   Average Frequency   |  Average Frequency
  LITTLE Cluster   737MHz              |  646MHz
  big Cluster      916MHz              |  922MHz

Leo Yan (4):
  sched/fair: Select maximum spare capacity for idle candidate CPUs
  sched: Introduce cpu_util_sum()/__cpu_util_sum() functions
  sched/fair: Consider RT pressure for find_best_target()
  sched/fair: Consider RT/DL pressure for energy calculation

 kernel/sched/fair.c  | 22 +++++++++++++++++++---
 kernel/sched/sched.h | 29 +++++++++++++++++++++++++++++
 2 files changed, 48 insertions(+), 3 deletions(-)

--
1.9.1
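The s.d. rows in the utilization table above appear to be sample standard deviations (n - 1 in the denominator). Here is a minimal Python sketch under that assumption. The helper name cluster_stdev is illustrative only and not part of the patch set; the inputs are the first Mean(Util) column for the little cluster (CPUs 0-3).

```python
import statistics

# First Mean(Util) column for CPUs 0-3 (little cluster), copied from the table.
LITTLE_MEAN_UTIL_WITHOUT = [67, 53, 57, 35]
LITTLE_MEAN_UTIL_WITH = [52, 47, 38, 47]


def cluster_stdev(utils):
    """Sample standard deviation of per-CPU utilization within one cluster."""
    return statistics.stdev(utils)


if __name__ == "__main__":
    # A smaller value means utilization is spread more evenly across the CPUs.
    print("without patch set: %.2f" % cluster_stdev(LITTLE_MEAN_UTIL_WITHOUT))
    print("with patch set:    %.2f" % cluster_stdev(LITTLE_MEAN_UTIL_WITH))
```

The results agree with the reported 13.3 and 5.83 to within rounding. The same helper applied to the other columns can be used to check the remaining s.d. entries.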

8 years, 4 months

[PATCH] sched/fair: Consider RT/IRQ pressure in capacity_spare_wake

by Joel Fernandes

capacity_spare_wake in the slow path influences the choice of idlest groups, as we search for groups with maximum spare capacity. In scenarios where RT pressure is high, a suboptimal group can be chosen and hurt the performance of the task being woken up.

Several tests with results are included below to show improvements with this change.

1) Hackbench on Pixel 2 Android device (4x4 ARM64 Octa core)
------------------------------------------------------------
Here we have RT activity running on the big CPU cluster, induced with rt-app, and hackbench running in parallel. The RT tasks are bound to 4 CPUs on the big cluster (cpu 4,5,6,7) and have 100ms periodicity with runtime=20ms sleep=80ms.

Hackbench shows a big improvement when the number of tasks is 8 (about 30%) and 32 (about 11%).

Note: data is completion time in seconds (lower is better). Number of loops for 8 and 16 tasks is 50000, and for 32 tasks it's 20000.

+--------+-----+-------+-------------------+---------------------------+
| groups | fds | tasks |   Without Patch   |        With Patch         |
|        |     |       +---------+---------+-----------------+---------+
|        |     |       |  Mean   |  Stdev  |      Mean       |  Stdev  |
+--------+-----+-------+---------+---------+-----------------+---------+
| 1      | 8   | 8     | 1.0534  | 0.13722 | 0.7293 (+30.7%) | 0.02653 |
| 2      | 8   | 16    | 1.6219  | 0.16631 | 1.6391 (-1%)    | 0.24001 |
| 4      | 8   | 32    | 1.2538  | 0.13086 | 1.1080 (+11.6%) | 0.16201 |
+--------+-----+-------+---------+---------+-----------------+---------+

2) Rohit ran barrier.c test (details below) with following improvements:
------------------------------------------------------------------------
This was Rohit's original use case for a patch he posted at [1]. However, his recent tests showed that my patch can replace his slow path changes [1], and that there is no need to selectively scan/skip CPUs in find_idlest_group_cpu in the slow path to get the improvement he sees.

barrier.c (OpenMP code) is used as a micro-benchmark. It does a number of iterations with a barrier sync at the end of each for loop.

Here barrier.c is running along with ping on CPU 0 and 1 as:
'ping -l 10000 -q -s 10 -f hostX'

barrier.c can be found at:
http://www.spinics.net/lists/kernel/msg2506955.html

Following are the results for the iterations per second with this micro-benchmark (higher is better), on a 44 core, 2 socket, 88 thread Intel x86 machine:

+--------+------------------+---------------------------+
|Threads |  Without patch   |        With patch         |
|        |                  |                           |
+--------+--------+---------+-----------------+---------+
|        |  Mean  | Std Dev |      Mean       | Std Dev |
+--------+--------+---------+-----------------+---------+
|1       | 539.36 | 60.16   | 572.54 (+6.15%) | 40.95   |
|2       | 481.01 | 19.32   | 530.64 (+10.32%)| 56.16   |
|4       | 474.78 | 22.28   | 479.46 (+0.99%) | 18.89   |
|8       | 450.06 | 24.91   | 447.82 (-0.50%) | 12.36   |
|16      | 436.99 | 22.57   | 441.88 (+1.12%) | 7.39    |
|32      | 388.28 | 55.59   | 429.4  (+10.59%)| 31.14   |
|64      | 314.62 | 6.33    | 311.81 (-0.89%) | 11.99   |
+--------+--------+---------+-----------------+---------+

3) ping+hackbench test on bare-metal server (Rohit ran this test)
----------------------------------------------------------------
Here hackbench is running in threaded mode, along with ping running on CPU 0 and 1 as:
'ping -l 10000 -q -s 10 -f hostX'

This test is running on a 2 socket, 20 core, 40 thread Intel x86 machine. Number of loops is 10000 and runtime is in seconds (lower is better).

+--------------+-----------------+--------------------------+
|Task Groups   |  Without patch  |        With patch        |
|              +-------+---------+----------------+---------+
|(Groups of 40)| Mean  | Std Dev |      Mean      | Std Dev |
+--------------+-------+---------+----------------+---------+
|1             | 0.851 | 0.007   | 0.828 (+2.77%) | 0.032   |
|2             | 1.083 | 0.203   | 1.087 (-0.37%) | 0.246   |
|4             | 1.601 | 0.051   | 1.611 (-0.62%) | 0.055   |
|8             | 2.837 | 0.060   | 2.827 (+0.35%) | 0.031   |
|16            | 5.139 | 0.133   | 5.107 (+0.63%) | 0.085   |
|25            | 7.569 | 0.142   | 7.503 (+0.88%) | 0.143   |
+--------------+-------+---------+----------------+---------+

[1] https://patchwork.kernel.org/patch/9991635/

Matt Fleming also ran cyclictest and several different hackbench tests on his test machines to sanity-check that the patch doesn't harm any of his use cases.

Cc: Dietmar Eggemann <dietmar.eggemann(a)arm.com>
Cc: Vincent Guittot <vincent.guittot(a)linaro.org>
Cc: Morten Rasmussen <morten.rasmussen(a)arm.com>
Cc: Brendan Jackman <brendan.jackman(a)arm.com>
Tested-by: Rohit Jain <rohit.k.jain(a)oracle.com>
Tested-by: Matt Fleming <matt(a)codeblueprint.co.uk>
Signed-off-by: Joel Fernandes <joelaf(a)google.com>
---
 kernel/sched/fair.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 56f343b8e749..ba9609407cb9 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5724,7 +5724,7 @@ static int cpu_util_wake(int cpu, struct task_struct *p);
 
 static unsigned long capacity_spare_wake(int cpu, struct task_struct *p)
 {
-	return capacity_orig_of(cpu) - cpu_util_wake(cpu, p);
+	return max_t(long, capacity_of(cpu) - cpu_util_wake(cpu, p), 0);
 }
 
 /*
--
2.15.0.448.gf294e3d99a-goog
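The effect of the one-line change in capacity_spare_wake can be illustrated outside the kernel. The following is a minimal Python sketch, not kernel code. All capacity and utilization numbers are made up, and capacity_of() is modeled simply as the original capacity minus RT/IRQ pressure, which simplifies how the scheduler actually scales it.

```python
def spare_before(capacity_orig, util_wake):
    """Old behaviour: spare capacity measured against the original capacity."""
    return capacity_orig - util_wake


def spare_after(capacity_orig, rt_irq_pressure, util_wake):
    """Patched behaviour: measure against the capacity left for CFS, clamp at 0."""
    # Simplified stand-in for capacity_of(): original capacity minus pressure.
    capacity = capacity_orig - rt_irq_pressure
    # Mirrors max_t(long, ..., 0): never report negative spare capacity.
    return max(capacity - util_wake, 0)


# Two hypothetical CPUs; CPU A has slightly lower CFS utilization but RT load.
A = {"capacity_orig": 1024, "rt_irq_pressure": 400, "util_wake": 300}
B = {"capacity_orig": 1024, "rt_irq_pressure": 0, "util_wake": 350}

old_a = spare_before(A["capacity_orig"], A["util_wake"])
old_b = spare_before(B["capacity_orig"], B["util_wake"])
new_a = spare_after(A["capacity_orig"], A["rt_irq_pressure"], A["util_wake"])
new_b = spare_after(B["capacity_orig"], B["rt_irq_pressure"], B["util_wake"])

if __name__ == "__main__":
    print("before patch: A=%d B=%d" % (old_a, old_b))  # A looks better
    print("after patch:  A=%d B=%d" % (new_a, new_b))  # B looks better
```

With these numbers the old calculation prefers CPU A even though RT activity takes a large share of it, while the patched calculation prefers CPU B. The clamp matters because the kernel function returns unsigned long, so a negative difference would otherwise wrap to a very large value.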

8 years, 7 months

Energy Model Question regarding the Pixel 2

by Zachariah Kennedy

Good day!

I have noticed since release that the EM for the Pixel 2 doesn't cover each frequency step. There are 22 steps for the small cores and 31 steps for the big cores, but the EM has 22 tuples for the small cores and only 27 tuples for the big cores.

I have checked, and the Pixel 2 is using all frequency steps for both small and big cores, so why doesn't the EM account for the last 4 frequency steps of the big cores?

Thanks as always for taking the time to answer my questions.

Kind Regards,
Zachariah Kennedy

8 years, 7 months

[Integration Branch] Update 24-Nov-2017

by Michele DiGiorgio

Hello EAS developers,

This email is to inform you about the latest EAS integration branch, which was published last Friday. All the information on where to get the branch is available at:
https://developer.arm.com/open-source/energy-aware-scheduling/EAS%20Mainlin…

The integration branch was conceived to keep the latest EAS patches on track with tip/sched/core. Hence, on top of tip/sched/core the integration branch puts:
- some new scheduler features, i.e. patches that relate to the scheduler but are not main components of EAS
- EAS-core patches
- debug patches, i.e. trace events, procfs interfaces, etc.

Integration will happen every two weeks. The above website covers the main additions to each integration and the next work items for the ones that will follow.

Kind regards,
Michele

8 years, 8 months

An update on EAS development branches

by Chris Redpath

Hello eas-dev!

I'm pleased to announce that EAS development is moving to the next version of the android common kernel, android-4.9.

* EAS development will be done in a new android-4.9-eas-dev branch
* android-4.9-eas-dev will be merged into android-4.9 twice during the period January - June 2018
* EAS functionality in android-4.4 is frozen
* an android-4.4-eas-test branch is provided to help testing new EAS features on android-4.4 devices
* assembly of an android common kernel based upon 4.14 is underway

Q&A:

* Why have you moved to android-4.9?
  * Partners developing devices have largely completed their android-4.4 derived device kernels and continuous development of EAS features is disruptive to tuning efforts
  * Device kernels derived from android-4.9 are in active development

* Will you deliver new EAS patches to android-4.4?
  * The plan is to only do fixes for critical bugs for android-4.4

* How will you be confident your patches are OK when you don't have devices running android-4.9 kernels yet?
  * This is the reason that the android-4.4-eas-test branch exists
  * This branch will contain patches which are merged into android-4.9-eas-dev and can be used to help test on device kernels derived from android-4.4
  * The content will be whatever patches are necessary to be able to add patches from android-4.9-eas-dev cleanly, plus the patches from android-4.9-eas-dev
  * android-4.4-eas-test will be updated until we have a product quality device for testing with android-4.9 derived kernels

* What is the expected patch flow for testing eas-dev patches on android-4.4?
  * first cherry-pick the patches from android-4.4-eas-test to the device kernel
  * next cherry-pick in-development patches from android-4.9-eas-dev gerrit reviews
  * run tests to obtain power and performance numbers from real product-quality environments

* How critical are you going to be for patches sent to android-4.9-eas-dev?
  * Patches accepted there must be of good code quality and have at least one of the four necessary attributes:
    1. Must reduce energy consumption
    2. Must improve performance
    3. Must bring android EAS closer to mainline
    4. Must fix a bug
  * All patches must pass checkpatch.pl

* Given that you intend to merge android-4.9-eas-dev into android-4.9, will you freeze it at any time?
  * Yes. The intention is to have a 1 month stabilization ahead of each merge (January and June)
  * For the January merge, stabilization will begin December 1st, 2017
  * For the June merge, stabilization will begin May 1st, 2018
  * During stabilization, only fixes will be taken

* Will there be merges in-between January and June?
  * We do not plan to do this right now, but in principle it can be done

* When will android-4.4-eas-test update after android-4.9-eas-dev merges into android-4.9?
  * We intend to add patches to android-4.4-eas-test for review soon after merging them

* What happens if there is a bug in the merged branch?
  * A fix will be provided to android-4.9 and android-4.9-eas-dev
  * The fix will be reflected in android-4.4-eas-test

* Can I expect this to happen again any time soon?
  * Yes, there has been a new android common kernel based on a new LTS branch each year so far
  * Arm expects that pattern to continue
  * If the pattern holds, in October 2018 the target android kernel version for EAS development will be based on Linux 4.14
  * We currently plan to use the same branching structure with the version numbers changed
  * Dates are projections based upon previous android releases and are subject to change
  * The kernel versions of eas-dev and eas-test branches are driven by the availability of suitable development and testing platforms, so are also subject to change

* What happens when you move to a 4.14 kernel?
  * After changes are reviewed and merged into android-4.9 from android-4.9-eas-test, those changes will be pushed for review on the 4.14 android branch
  * Anything merged in android's 4.14 branch which is broken will also be patched

Warmest Regards,
Chris Redpath
Open Source Software Power Team @ arm

8 years, 8 months

[PATCH RFC 0/5] sched and cpufreq fixes/cleanups

by Joel Fernandes

Here are some patches that are generally minor changes and I am posting them together. Patches 1/5 and 2/5 are related to skipping cpufreq updates for the dequeue of the last task before the CPU enters idle. That's mostly just a rebase of [1]. Patches 3/5 and 4/5 fix some minor things I noticed after the remote cpufreq update work, and patch 5/5 is just a small clean up of find_idlest_group. Let me know your thoughts and thanks. I've based these patches on peterz's queue.git master branch.

[1] https://patchwork.kernel.org/patch/9936555/

Joel Fernandes (5):
  Revert "sched/fair: Drop always true parameter of update_cfs_rq_load_avg()"
  sched/fair: Skip frequency update if CPU about to idle
  cpufreq: schedutil: Use idle_calls counter of the remote CPU
  sched/fair: Correct obsolete comment about cpufreq_update_util
  sched/fair: remove impossible condition from find_idlest_group_cpu

 include/linux/tick.h             |  1 +
 kernel/sched/cpufreq_schedutil.c |  2 +-
 kernel/sched/fair.c              | 44 ++++++++++++++++++++++++++++------------
 kernel/sched/sched.h             |  1 +
 kernel/time/tick-sched.c         | 13 ++++++++++++
 5 files changed, 47 insertions(+), 14 deletions(-)

--
2.15.0.rc2.357.g7e34df9404-goog

8 years, 8 months

EAS r1.4 release

by Ian Rickards

ARM has released EAS r1.4.

The main changes in this release are:
* EAS refactoring improvements
* Fixes to sched-freq for big.LITTLE platforms
* Upstream PELT and load balance improvements
* Upstream schedutil changes
* Cumulative Runnable Average signal for OPP selection when using WALT
* Improved WALT integration with EAS

Linux-4.4 version is merged into AOSP common kernel 4.4:
https://android.googlesource.com/kernel/common/+/android-4.4

Linux-4.9 version is merged into AOSP common kernel 4.9:
https://android.googlesource.com/kernel/common/+/android-4.9

Release documentation is here:
https://developer.arm.com/-/media/developer/developers/open-source/energy-a…

Basic testing on ARM Juno & Hikey960 using LISA tests

--
ARM powersoftware team

8 years, 8 months

'wltests' automated power/performance comparisons

by Ian Rickards

wltests (workload tests)

ARM is pleased to announce a new automated test suite for benchmarking Linux scheduler & EAS improvements on Android workloads.

wltests is built on top of Lisa and Workload Automation (in-development version of WA v3) with the goal of:
* automatically running a range of Android-based tests on a platform, collecting performance and power metrics
* comparing different kernel versions and/or kernel options
* analyzing differences using Lisa-based notebooks
* easier porting to custom platforms

It is intended to allow full evaluation of EAS/scheduler changes with real Android workloads (for example PELT vs. WALT comparisons).

The current set of workloads is:
* Jankbench
* Exoplayer for video & audio playback tests
* Youtube (if gapps available on platform)
* PCmark
* Geekbench
* Homescreen (to measure steady state energy consumption)

Install the entire Lisa first according to the installation instructions (Lisa now includes an in-development version of WA v3):
https://github.com/ARM-software/lisa/wiki/Installation#required-dependencies

The VM can be used if you have incompatibilities with locally-installed python libraries.

Please see README.md in the wltests directory:
https://github.com/ARM-software/lisa/tree/master/tools/wltests

If you have concerns about results being published for in-development hardware, comment out the commercial benchmarks (PCmark & Geekbench) in the agenda:
tools/wltests/agendas/sched-evaluation-full.yaml

Platform - currently only one public platform (Linaro HiKey960):
tools/wltests/platforms/hikey960_android-4.4
(this actually works for 4.4 and 4.9 based kernels)

Adding a new platform is easy - 3 files in the platform directory.

Any questions please let us know!

--
ARM powersoftware team

8 years, 8 months

Microsoft Dynamics Users List

by Erin Marino

Hello there, I would like to know if you are interested in acquiring Microsoft Dynamics Users List. Information fields: Names, Title, Email, Phone, Company Name, Company URL, Company physical address, SIC Code, Industry, Company Size (Revenue and Employee). If you are interested, let me know your targeted geography so that I will get back to you with the counts and more information. Regards, Erin Marketing Executive If you are not interested in receiving further emails, please answer back with "overlook" in the title.

8 years, 8 months
