Re: [Eas-dev] Pelt vs Walt

8 Jan 2018

      On 05-Jan 16:13, Viresh Kumar wrote:
...
Hello,
I did some comparisons of Pelt and Walt and have some very interesting
performance results that I wanted to share with all of you. I haven't
got any power numbers as I don't have a setup for that.
Key points:

All the tests were done on Hikey960, with a 5V Fan placed over the
SoC to cool it down.

HDMI port was disconnected while running tests.

CONFIG_SCHED_TUNE was configured out to keep things simple.

Only the PCmark bench was tested, with help of workload automation.

The numbers below show the average of 3 runs, performed during a
single kernel boot cycle.

Pelt 8/16/32 are the half-life periods.

While testing Pelt, CONFIG_WALT was disabled.
+------------------+----------+------------+------------+------------+
| Test name        |   WALT   | Pelt 8 ms  | Pelt 16 ms | Pelt 32 ms |
+------------------+----------+------------+------------+------------+
| DataManipulation |   5341   | 5561       | 5453       | 5400       |
| PhotoEditingV2   |   9015   | 8577       | 7911       | 6043       |
| VideoEditing     |   0      | 4291       | 3746       | 3755       |
| WebV2            |   6202   | 6448       | 5465       | 4648       |
| Workv2           |   0      | 5697       | 5069       | 4517       |
| WritingV2        |   4302   | 4549       | 3811       | 3306       |
+------------------+----------+------------+------------+------------+

As you can see in the results Pelt 8 is very much comparable to the
Walt results now. Hurray ? :)
A detailed report is present here with some more useful numbers:
https://goo.gl/eCx4Pk
I don't have access to this report... I just sent a request.
...
How to replicate setup:

Android kernel tree:
https://git.linaro.org/people/vireshk/mylinux.git android-4.9-hikey
This has several patches over latest 4.9-hikey aosp tree.

Some patches to reduce disturbances, which Vincent shared earlier
with a document.

"thermal: Add debugfs support for cooling devices" and "cpufreq:
stats: New sysfs attribute for clearing statistics" are used to
read some more data from userspace after tests are done which can
be used to build conclusions on working of pelt/walt and how they
are behaving differently.
For example, we can know the amount of time we spent on individual
cpu frequencies while the test was running. And also the time for
which cpu-cooling and devfreq (ddr) has throttled some
frequencies.
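
The per-frequency residency mentioned above could be derived along these
lines. This is only a sketch: it assumes the standard cpufreq stats
"time_in_state" format ("<frequency in kHz> <time in 10 ms units>" per
line) and takes two snapshots, one before and one after the test. The
function names and the sample numbers are illustrative, not taken from
pelt_walt.sh or from any measured run.

```python
def parse_time_in_state(text):
    """Parse '<freq_khz> <ticks>' lines into a {freq_khz: ticks} dict."""
    table = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            table[int(parts[0])] = int(parts[1])
    return table


def residency(before, after):
    """Percentage of time spent at each frequency between two snapshots."""
    delta = {freq: after[freq] - before.get(freq, 0) for freq in after}
    total = sum(delta.values())
    if total == 0:
        return {freq: 0.0 for freq in delta}
    return {freq: 100.0 * ticks / total for freq, ticks in delta.items()}


# Made-up snapshots, for illustration only.
BEFORE = "533000 100\n999000 200\n1844000 50\n"
AFTER = "533000 150\n999000 200\n1844000 200\n"

if __name__ == "__main__":
    shares = residency(parse_time_in_state(BEFORE), parse_time_in_state(AFTER))
    for freq, pct in sorted(shares.items()):
        print(f"{freq} kHz: {pct:.1f}%")
```

If the statistics are cleared right before the run, the "before" snapshot
is simply all zeros and the same code applies.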

Pelt 16 and pelt 8 patches.

Are those the patches I shared a few weeks ago, on top of util_est?
http://www.linux-arm.org/git?p=linux-pb.git%3Ba=shortlog%3Bh=refs/heads/eas/...
There are two main observations regarding PELT speedups:
1) faster decay time: by speeding up the ramp-up we also get faster
   decay times, which ultimately makes PELT even more different from
   WALT, where utilization instead never decays.
   This can benefit benchmarks but can affect other interactive
   use-cases.
2) the constants you change affect LOAD too; do we know what the
   side effects are in this case?
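
To make point 1) concrete, here is a rough sketch of how the half-life
drives both ramp-up and decay. It models PELT as a plain geometric
series with one period of about 1 ms and ignores the kernel's
fixed-point arithmetic; the 80% and 20% thresholds and the function
names are arbitrary choices for illustration.

```python
def decay_factor(half_life_ms):
    """Per-period factor y such that y ** half_life_ms == 0.5."""
    return 0.5 ** (1.0 / half_life_ms)


def ramp_up_periods(half_life_ms, target=0.8):
    """Periods an always-running task needs to go from 0 to `target`."""
    y = decay_factor(half_life_ms)
    util, periods = 0.0, 0
    while util < target:
        # Old contribution decays by y, the new full period adds (1 - y).
        util = util * y + (1.0 - y)
        periods += 1
    return periods


def decay_periods(half_life_ms, start=1.0, floor=0.2):
    """Periods a sleeping task needs to decay from `start` down to `floor`."""
    y = decay_factor(half_life_ms)
    util, periods = start, 0
    while util > floor:
        util *= y
        periods += 1
    return periods


if __name__ == "__main__":
    for hl in (8, 16, 32):
        print(hl, ramp_up_periods(hl), decay_periods(hl))
```

In this simplified model the two numbers are identical for a given
half-life (about 19, 38 and 75 periods for 8, 16 and 32 ms), which is
the point above: a faster ramp-up always comes with an equally faster
decay.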
Moreover, as Leo pointed out, speeding up PELT can also have side
effects on overutilization, thus reducing the time we run in energy
aware mode.
All that considered, IMHO evaluating PELT speed-up requires a much
more extensive set of tests than just comparing 3 runs of PCMark.
Energy must definitely be one of the metrics and a more
comprehensive set of workloads is also required to get a full picture.
That's why we spent one month creating a simple and reproducible
workflow based on LISA/WA which allows us to collect a complete set of
measurements and easily share them.
...
The below changes are required to capture the extra data that I have
captured in my sheet above.
I have attached pelt_walt.sh script, which you need to push to /data:
    $ adb push pelt_walt.sh /data

And I have updated the pcmark plugin file to run the script and
collect data. That is attached as well.
Happy testing !!
Do we need another one? Can't you share wltest results instead?
That's also what Google ultimately wants to see as experimental
evaluation of proposed scheduler modifications.
...
I heard from Vincent earlier that ARM did similar testing earlier on
but never found anything significant. Why ? I may have an answer to
that, not sure though.
That's not completely true, we did testing and we are doing testing.
The branch above is part of the testing we are doing, on both PELT
speed-ups and util_est, which we still consider as part of the same
story to have a more WALT-like PELT.
Maybe it's just that for us testing requires more time to run all of
them? ;-)
...
I found a patch from Juri which someone is using:
https://android.googlesource.com/kernel/msm/+/b52bb1f248e4cef65edaece54a68c6...
and one of the problems here is that the patch hasn't updated the
__accumulated_sum_N32 array, but only runnable_avg_yN_inv and
runnable_avg_yN_sum.
That patch did not update __accumulated_sum_N32 because it was not
used in that kernel, a 3.18 code base, where PELT was updated
using a different set of support data structures: the ones modified
by the patch.
Regarding the results, however, there were benefits, and that's why
Pixel phones have been released with a 16ms PELT.
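
For reference, the half-life dependent constants discussed above can be
regenerated roughly as follows. This is a sketch, not the code used for
any of the patches in this thread: it assumes runnable_avg_yN_inv[i]
holds y^i scaled to 32 bits and that the maximum sum is the fixed point
of "decay, then add 1024". It does not generate __accumulated_sum_N32
or runnable_avg_yN_sum, and the function name is made up.

```python
def pelt_tables(half_life):
    """Return (yN_inv table, converged max sum) for a given half-life.

    half_life is in PELT periods (about 1 ms each).
    """
    y = 0.5 ** (1.0 / half_life)

    # Inverse multipliers: y^i scaled to 32-bit fixed point, i in [0, half_life).
    yn_inv = [int(0xFFFFFFFF * (y ** i)) for i in range(half_life)]

    # Maximum sum: iterate "decay by y, add one full period" until it stops growing.
    y_inv = int(0xFFFFFFFF * y)
    max_sum, last = 0, -1
    while max_sum != last:
        last = max_sum
        max_sum = ((max_sum * y_inv) >> 32) + 1024
    return yn_inv, max_sum


if __name__ == "__main__":
    for hl in (8, 16, 32):
        table, max_sum = pelt_tables(hl)
        print(f"half-life {hl}: max sum {max_sum}")
        print(", ".join(f"0x{v:08x}" for v in table))
```

Whatever generator is used, all the arrays the target kernel actually
reads have to be rebuilt from the same y, otherwise they silently
disagree with each other.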
