On 06/16/2015 03:19 AM, Kevin Hilman wrote:
Riku Voipio <riku.voipio@linaro.org> writes:
On 11 June 2015 at 15:18, Milosz Wasilewski <milosz.wasilewski@linaro.org> wrote:
On 11 June 2015 at 09:31, Alex Shi <alex.shi@linaro.org> wrote:
The following format would be better than the current one:

            | kernel 1 | kernel 2 |
benchmark 1 | value x  | value x2 |
benchmark 2 | value y  | value y2 |
Again, this isn't possible for a single run, as we're only running on a single kernel. Such a feature requires raw data post-processing and most likely will not be part of LAVA. LAVA will be used to run tests and store raw data; the comparison will happen somewhere else. I just had a meeting with the ARM ART (Android RunTime) team and they requested similar comparison features for their benchmarks. They are willing to share the code they're using, which includes a DB for storing the benchmark results and some scripts that do the comparison. So eventually we will get build-to-build or branch-to-branch comparison. For the moment, let's focus on collecting the benchmark data and making sure we store everything you need.
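Purely as an illustration of what such external post-processing could look like (none of this is LAVA or ART code; the SQLite file, the "results" table and its columns are all made up for the sketch), a kernel-to-kernel comparison over a results DB might be as simple as:

#!/usr/bin/env python
# Hypothetical sketch: assumes benchmark results have already been stored
# in a SQLite table results(kernel, benchmark, value) by some importer.
import sqlite3

def compare(db_path, kernel_a, kernel_b):
    """Print average benchmark values for two kernels side by side."""
    conn = sqlite3.connect(db_path)
    rows = conn.execute(
        "SELECT benchmark, kernel, AVG(value) FROM results "
        "WHERE kernel IN (?, ?) GROUP BY benchmark, kernel",
        (kernel_a, kernel_b))
    table = {}
    for benchmark, kernel, value in rows:
        table.setdefault(benchmark, {})[kernel] = value
    print("%-20s %12s %12s" % ("benchmark", kernel_a, kernel_b))
    for benchmark in sorted(table):
        print("%-20s %12.2f %12.2f" % (
            benchmark,
            table[benchmark].get(kernel_a, float("nan")),
            table[benchmark].get(kernel_b, float("nan"))))

if __name__ == "__main__":
    compare("results.db", "kernel-1", "kernel-2")

The point is only that the comparison step needs nothing from LAVA beyond the raw per-run values.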
It seems LKP itself has result-comparison tools. We could upload the raw data as a test run attachment to LAVA and extract it for post-processing with the LKP scripts. It is worth at least testing these before reinventing them.
From a kernel developer's PoV, I just want to reiterate that what's most important for these performance/benchmark "tests" is not the specific values/results, but the comparison of results against other kernels and the trends over time (a rough sketch of such a trend check follows below). The trending could be done with LAVA image reports, but the comparison to other kernels will likely need to be some external post-processing tool.
Of course, the first phase is the simple pass/fail so we know the benchmarks actually run (or why they didn't), but the more important phase is the ability to compare performance/benchmark results.
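To make the trending idea concrete, here is a rough sketch of flagging a build whose benchmark value drifts away from its recent history. The list-of-dicts input format and the 3-sigma threshold are assumptions, not anything LAVA exports today:

# Flag (build, benchmark) pairs whose value deviates from the mean of
# earlier samples by more than `threshold` standard deviations.
from collections import defaultdict
import statistics

def find_regressions(history, threshold=3.0):
    """history: [{'build': str, 'benchmark': str, 'value': float}, ...]"""
    series = defaultdict(list)          # benchmark -> [(build, value), ...]
    for sample in history:
        series[sample['benchmark']].append((sample['build'], sample['value']))

    regressions = []
    for benchmark, samples in series.items():
        for i in range(3, len(samples)):          # need a few samples first
            earlier = [v for _, v in samples[:i]]
            mean = statistics.mean(earlier)
            stdev = statistics.pstdev(earlier) or 1e-9
            build, value = samples[i]
            if abs(value - mean) > threshold * stdev:
                regressions.append((build, benchmark))
    return regressions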
Yes, definitely!
When the same benchmark gives different results on different kernels on the same board, we know one of the kernels has a performance issue. Performance monitoring tools such as vmstat/iostat, or kernel profiling data, will then give some clues for locating the problem. Furthermore, if the testing infrastructure could locate the offending commit via bisection, that would be perfect!
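For the bisection part, something along these lines could sit between the infrastructure and "git bisect run". This is only a sketch: it assumes the kernel under test has already been built and booted by some other step, and that the benchmark command prints one number on its last output line.

#!/usr/bin/env python
# Sketch of a "git bisect run" helper: run a benchmark command, parse a
# single throughput number from its last output line, and translate it
# into the exit codes git bisect expects (0 = good, non-zero = bad,
# 125 = skip this commit).
import subprocess
import sys

def main():
    threshold = float(sys.argv[1])
    cmd = sys.argv[2:]
    try:
        out = subprocess.check_output(cmd)
    except (OSError, subprocess.CalledProcessError):
        sys.exit(125)                    # benchmark did not run: skip commit
    try:
        value = float(out.decode().strip().splitlines()[-1])
    except (ValueError, IndexError):
        sys.exit(125)                    # unparsable output: skip commit
    sys.exit(0 if value >= threshold else 1)

if __name__ == "__main__":
    main()

Usage would be something like "git bisect run ./bisect_check.py 1500 ./run-benchmark.sh", where the script name, threshold and benchmark wrapper are, again, made up for the example.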