Hi,
There are ongoing patchsets on the mailing list that modify the scheduler load tracking area:
- Yuyang has rewritten the per-entity load tracking: https://lkml.org/lkml/2015/6/2/124
- Morten has also made some modifications to the load tracking: https://lkml.org/lkml/2015/5/13/448. Patches 01-12 modify the load tracking area; I haven't considered the end of the patchset, which implements the energy awareness, as it is out of the scope of the tests I wanted to do.
In order to get a better idea of the impact of each patchset on the performance of the scheduler, I have run some benchmarks on a quad-core ARM Cortex-A15 platform.
The list of benchmarks that I have run:
- perf bench sched pipe -l 1000000
- hackbench --loops 400 --datasize 4096
- memcpy
- sysbench test=threads
- sysbench test=cpu
- ebizzy
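For reference, here is a minimal sketch of how the sched pipe numbers can be collected over several iterations so that an average and a variation can be computed; the iteration count and the output parsing are only an example, not necessarily the exact script used for the figures below:

  #!/bin/sh
  # Repeat perf bench sched pipe and keep the reported ops/sec value of
  # each run; mean and variation can then be computed from the samples.
  ITERATIONS=10
  for i in $(seq 1 $ITERATIONS); do
      perf bench sched pipe -l 1000000 | awk '/ops\/sec/ { print $1 }'
  done > sched_pipe_samples.txt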
Here are the results:
- main: mainline kernel based on v4.1-rc6 (http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/, sha1 9ef7adfa7c0b548665ef3248228d548586e693ca)
- pelt: main + Yuyang's patches
- inv: main + Morten's patches
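For completeness, the base commit for "main" can be fetched as below (assuming the plain git URL that matches the cgit link above):

  # fetch the tip tree and check out the commit used as the base for the tests
  git clone git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git
  cd tip
  git checkout 9ef7adfa7c0b548665ef3248228d548586e693ca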
I have also run the benchmarks with and without CONFIG_SCHED_MC. The only impact of this config option on my platform is that the llc sched_domain pointer is set when the option is enabled (see the snippet after the list below for one way to check this).
- die: CONFIG_SCHED_MC is not set. There is 1 sched_domain level (the DIE level) with sched_domain flags: 0x102f
- mc: CONFIG_SCHED_MC is set. There is 1 sched_domain level (the MC level) with sched_domain flags: 0x22f
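This is only a sketch of how the sched_domain setup can be inspected on a running kernel; it relies on CONFIG_SCHED_DEBUG being enabled:

  # name and flags of the 1st (and only) sched_domain level of cpu0
  cat /proc/sys/kernel/sched_domain/cpu0/domain0/name
  cat /proc/sys/kernel/sched_domain/cpu0/domain0/flags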
(The main+die column gives the measured value; the other columns are expressed relative to it, and the +/- lines give the run-to-run variation.)

                          main+die   main+mc    pelt+die   pelt+mc    inv+die    inv+mc
sched pipe  ops/sec       45091.40   44.30%     83.40%     43.78%     83.35%     40.42%
            +/-           0.33%      3.79%      0.30%      2.63%      0.30%      0.60%
hackbench   duration (s)  7.84       98.33%     99.27%     95.51%     99.03%     97.54%
            +/-           0.37%      0.88%      1.08%      0.61%      1.18%      1.13%
memcopy     MB/s          4950.47    102.76%    100.76%    99.03%     99.44%     102.59%
            +/-           4.09%      6.13%      4.98%      2.19%      5.30%      7.26%
sysbench test=threads
                         main+die   main+mc    pelt+die   pelt+mc    inv+die    inv+mc
2 thrds/1 lock   events  5891.50    91.81%     94.81%     88.99%     99.18%     91.15%
                 +/-     0.39%      0.63%      0.92%      0.69%      0.43%      1.10%
3 thrds/1 lock   events  4061.83    86.10%     90.44%     82.56%     100.59%    86.45%
                 +/-     1.28%      2.08%      3.76%      1.11%      0.44%      1.87%
4 thrds/2 locks  events  6203.83    86.19%     89.09%     83.05%     99.61%     86.26%
                 +/-     1.69%      1.41%      2.78%      1.64%      0.88%      0.92%
5 thrds/2 locks  events  4062.00    137.43%    130.77%    132.53%    93.67%     136.80%
                 +/-     0.59%      0.89%      2.56%      2.06%      1.05%      1.29%
6 thrds/3 locks  events  5531.00    159.52%    109.85%    151.88%    96.11%     159.00%
                 +/-     1.59%      0.78%      1.76%      1.37%      2.72%      1.04%
ebizzy
                      main+die   main+mc    pelt+die   pelt+mc    inv+die    inv+mc
1 thread   records/s  6040.50    100.68%    99.60%     101.05%    98.64%     97.42%
           +/-        1.97%      1.50%      1.75%      0.90%      1.66%      0.95%
2 threads  records/s  9278.50    100.59%    101.21%    100.64%    100.71%    99.05%
           +/-        2.82%      0.86%      0.59%      0.63%      0.88%      1.50%
3 threads  records/s  11205.33   99.75%     101.41%    100.98%    100.16%    97.64%
           +/-        2.76%      2.13%      2.30%      1.51%      3.26%      2.58%
4 threads  records/s  10970.00   102.78%    99.59%     102.00%    107.24%    106.10%
           +/-        3.39%      4.68%      3.63%      5.75%      4.07%      4.41%
5 threads  records/s  11716.50   95.57%     93.81%     96.36%     98.51%     96.81%
           +/-        3.52%      4.95%      4.50%      5.27%      4.28%      5.51%
6 threads  records/s  11209.33   99.42%     100.33%    97.86%     99.38%     95.75%
           +/-        3.57%      2.95%      5.16%      6.84%      3.70%      3.57%
7 threads  records/s  11204.50   99.55%     99.31%     95.73%     99.02%     96.55%
           +/-        4.54%      4.22%      5.39%      3.71%      5.36%      3.69%
8 threads  records/s  17210.83   99.57%     100.65%    99.80%     100.16%    100.37%
           +/-        2.01%      1.88%      1.22%      2.25%      2.86%      1.69%
I have skipped the results of sysbench cpu as they are "exactly" the same with all kernels.
The first noticeable point is the impact of the LLC on sched pipe and, to a lesser extent, on hackbench.
Apart from that, the results don't show any clear performance advantage for one of the three kernels.
I have just seen that Yuyang has sent some performance figures for his patchset and, AFAICT, they don't show a clear performance advantage for one version of the kernel either.
Has anyone else also run some benchmarks with these patchsets?
Regards, Vincent