CC to linaro-kernel.
Good job, Chase!
A few comments on the testing!
1. This is performance testing, so the data is only useful for kernel comparisons. But different test runs may use different benchmark versions, as the log below shows, which makes the results vary and hard to compare (a version-recording sketch follows the log).
=====
additional disk space will be used.
Get:1 http://ports.ubuntu.com/ubuntu-ports/ vivid/universe dbench arm64 4.0-2 [1921 kB]
Fetched 1921 kB in 0s (19.2 MB/s)
=====
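A minimal sketch of what I mean (not LKP code; the dpkg package name and the result file name are just assumptions for illustration) - record the installed benchmark version next to every result so numbers from different versions are never compared silently:

=====
# Record the installed benchmark version next to the measurement, so a
# result is only compared against results from the same benchmark build.
import json
import subprocess

def benchmark_version(package="dbench"):
    """Return the installed Debian package version, or None if missing."""
    try:
        out = subprocess.check_output(
            ["dpkg-query", "-W", "-f=${Version}", package])
        return out.decode().strip()
    except subprocess.CalledProcessError:
        return None

def save_result(result, path="result.json", package="dbench"):
    """Store the measurement together with the benchmark version."""
    result["benchmark_version"] = benchmark_version(package)
    with open(path, "w") as f:
        json.dump(result, f, indent=2)

save_result({"throughput_mb_s": 105.205})
=====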
2. Performance tests often produce noisy results, which normally requires repeating the test and collecting statistics such as the average and standard deviation. We need to collect data from repeated runs and decide how many re-runs are needed; a rough sketch of the statistics step is below.
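For example (made-up numbers, only to show the idea):

=====
# Given throughput numbers from repeated runs, report mean and standard
# deviation so we can decide how many re-runs make the result stable.
import statistics

def summarize(samples):
    """samples: throughput values (MB/sec) from repeated benchmark runs."""
    mean = statistics.mean(samples)
    stdev = statistics.stdev(samples) if len(samples) > 1 else 0.0
    return {"runs": len(samples),
            "mean": mean,
            "stdev": stdev,
            "stdev_percent": 100.0 * stdev / mean if mean else 0.0}

# Example with made-up throughput numbers:
print(summarize([105.2, 103.8, 106.1, 104.5, 105.0]))
=====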
3. In this 'dbench' test, each operation is a performance result in its own right. We need to store all of them, including Count, AvgLat and MaxLat, not only the Throughput. The '8 clients / 8 procs' are test parameters, which are meaningless to store as a test case. For each benchmark we need to tune the parameters one by one on our test machines to find typical/meaningful values, and for most benchmarks it is worth re-testing with different parameters according to the boards we test on (see the parsing sketch after the output below).
$dbench 8 -c /usr/share/dbench/client.txt
....
 Operation      Count    AvgLat    MaxLat
 ----------------------------------------
 NTCreateX    2011265     0.848  1478.406
 Close        1477355     0.350  1165.450
 Rename         85163     1.263    62.960
 Unlink        406180     0.517  1287.522
 Deltree           48    57.127   186.366
 Mkdir             24     0.009     0.027
 Qpathinfo    1823148     0.567  1445.759
 Qfileinfo     319390     0.272   486.622
 Qfsinfo       334240     0.421  1161.980
 Sfileinfo     163808     0.558   993.785
 Find          704767     0.874  1164.246
 WriteX       1002240     0.032     9.801
 ReadX        3152551     0.032   662.566
 LockX           6550     0.011     0.727
 UnlockX         6550     0.005     0.535
 Flush         140954     0.613    53.237

Throughput 105.205 MB/sec  8 clients  8 procs  max_latency=1478.412 ms
wait for background monitors: perf-profile
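Something along these lines could keep the whole table (only a sketch against the output format above, not the parser LKP or Chase actually uses):

=====
# Parse the full dbench per-operation table (Count, AvgLat, MaxLat) plus
# the final throughput line, instead of keeping only the Throughput value.
import re

OP_RE = re.compile(r"^\s*(\w+)\s+(\d+)\s+([\d.]+)\s+([\d.]+)\s*$")
THROUGHPUT_RE = re.compile(
    r"Throughput\s+([\d.]+)\s+MB/sec.*max_latency=([\d.]+)\s+ms")

def parse_dbench(text):
    results = {"operations": {}}
    for line in text.splitlines():
        m = OP_RE.match(line)
        if m:
            name, count, avg, mx = m.groups()
            results["operations"][name] = {"Count": int(count),
                                           "AvgLat": float(avg),
                                           "MaxLat": float(mx)}
            continue
        m = THROUGHPUT_RE.search(line)
        if m:
            results["throughput_mb_s"] = float(m.group(1))
            results["max_latency_ms"] = float(m.group(2))
    return results
=====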
4. The perf tool output is very important for kernel developers: it tells us why we got this performance data and where to improve, and it is just as important as the test results themselves. So we should figure out how much perf data we can get out of the testing and collect it; a minimal collection sketch is below.
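As a starting point, something like this (just a sketch; the file names are arbitrary and the perf options may need tuning per benchmark) could archive a profile with every run:

=====
# Run the benchmark under 'perf record' (system-wide, with call graphs)
# and keep both the raw profile and a text report next to the result.
import subprocess

def run_with_perf(cmd, data_file="perf.data"):
    subprocess.check_call(
        ["perf", "record", "-a", "-g", "-o", data_file, "--"] + cmd)
    # A plain-text report is easier to attach to test results than perf.data.
    with open(data_file + ".txt", "w") as report:
        subprocess.check_call(
            ["perf", "report", "--stdio", "-i", data_file], stdout=report)

run_with_perf(["dbench", "8", "-c", "/usr/share/dbench/client.txt"])
=====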
On 06/10/2015 02:01 PM, Riku Voipio wrote:
On 9 June 2015 at 22:33, Milosz Wasilewski milosz.wasilewski@linaro.org wrote:
On 9 June 2015 at 18:50, Mark Brown broonie@linaro.org wrote:
On 8 June 2015 at 14:39, Riku Voipio riku.voipio@linaro.org wrote:
I've pushed my version of the LKP test definition: https://review.linaro.org/#/c/6382/ So I don't expect to work on that side anymore. I'll still fix the few benchmarks that don't build on AArch64.
Mark or Kevin, can you give the tests a spin in their current state?
I'm not sure how to get LAVA to run a test definition from a Gerrit review. Insofar as I'm able to review by looking at the code, it looks good.
It probably would be possible, but I also have no idea how to do that :)
Chase presented his attempt today. Here are the results: https://validation.linaro.org/dashboard/streams/anonymous/chase-qi/bundles/a... LKP produces a JSON file with the results. Chase took the file and translated it to LAVA results. If I understood correctly, the JSON schema is unified for all benchmarks so we will be able to run it in LAVA with just the list of benchmarks to use (similar to LTP).
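For illustration, a rough sketch of that translation step (not Chase's actual script; the file name, JSON layout and helper options are assumptions):

=====
# Read an LKP-style JSON result file and report each numeric metric as a
# LAVA test case via the lava-test-case helper from the LAVA test shell.
import json
import subprocess

def report_to_lava(json_path):
    with open(json_path) as f:
        results = json.load(f)
    for name, value in results.items():
        subprocess.check_call(
            ["lava-test-case", name,
             "--result", "pass",
             "--measurement", str(value)])

report_to_lava("dbench.json")
=====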
Looks great, thanks Chase!
I'm not sure this is wise unless we have a realistic intention of actually running these tests; we'd need to be very clear about that.
My plan is to include them in LSK testing. If everything goes fine, it will happen in ~2 weeks from now.
Ok, I think we can wait for that - I'm escaping to VAC at the end of this month, but I think you can manage fine without me :)
Riku