On 11 June 2015 at 09:31, Alex Shi alex.shi@linaro.org wrote:
On 06/11/2015 03:26 PM, Milosz Wasilewski wrote:
On 11 June 2015 at 08:17, Alex Shi alex.shi@linaro.org wrote:
On 06/11/2015 11:55 AM, Chase Qi wrote:
The parsing of test output is done by LKP. LKP saves metrics to JSON files, and our test definition decodes the JSON file and sends the metrics to LAVA. If we want to have all the sub-metrics, I guess patching the LKP test suite and sending it upstream is the right way to go. IMHO, it can be done, but not at this stage.
Maybe upstream LKP doesn't want our LAVA-specific parsing; we probably need to handle it ourselves. And if the test output can't be shown clearly and appropriately, it won't be so helpful for us.
There is nothing LAVA-specific there. Chase is using LKP output only, and LKP doesn't save the table you presented in any way. So if we want to have the data, LKP needs to be patched.
It seems there is some misunderstanding here. I didn't mean we don't need a parsing patch; that is needed. I just don't know whether LKP upstream would like to pick up this 'json file decode' script.
This script is part of our LAVA integration and doesn't need to go to upstream LKP. What's missing from LKP (if I understand correctly) are these values:
 Operation      Count    AvgLat    MaxLat
 ----------------------------------------
 NTCreateX    2011265     0.848  1478.406
 Close        1477355     0.350  1165.450
 Rename         85163     1.263    62.960
 Unlink        406180     0.517  1287.522
 Deltree           48    57.127   186.366
 Mkdir             24     0.009     0.027
 Qpathinfo    1823148     0.567  1445.759
 Qfileinfo     319390     0.272   486.622
 Qfsinfo       334240     0.421  1161.980
 Sfileinfo     163808     0.558   993.785
 Find          704767     0.874  1164.246
 WriteX       1002240     0.032     9.801
 ReadX        3152551     0.032   662.566
 LockX           6550     0.011     0.727
 UnlockX         6550     0.005     0.535
 Flush         140954     0.613    53.237
If they are important for you, LKP needs to be patched to include them in the LKP results, not LAVA results. Our LAVA integration takes only what LKP produces.
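To make it concrete, a patch along those lines would have to pull the per-operation rows out of dbench's output and flatten them into individual metrics. LKP's real stats parsers are not written in Python, so treat the following only as a sketch of the logic, and note that the key names (e.g. NTCreateX.avg_lat) are made up for illustration:

    import re
    import sys

    # Matches dbench per-operation rows such as:
    #  NTCreateX    2011265     0.848  1478.406
    ROW = re.compile(r'^\s*([A-Za-z]+)\s+(\d+)\s+([\d.]+)\s+([\d.]+)\s*$')

    def parse_dbench_ops(lines):
        """Yield (metric_name, value) pairs for every per-operation row."""
        for line in lines:
            m = ROW.match(line)
            if not m:
                continue
            op, count, avg_lat, max_lat = m.groups()
            # Hypothetical key names; an upstream patch would follow
            # whatever naming convention the LKP maintainers prefer.
            yield '%s.count' % op, int(count)
            yield '%s.avg_lat' % op, float(avg_lat)
            yield '%s.max_lat' % op, float(max_lat)

    if __name__ == '__main__':
        for key, value in parse_dbench_ops(sys.stdin):
            print('%s: %s' % (key, value))

Run over the dbench log this would print lines like 'NTCreateX.count: 2011265', which is the sort of flat key/value output LKP's result collection could then fold into its JSON results.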
The data from 'time' and 'perf' will also be saved by LKP. I think 'avg.json' is the right file to parse; it includes the benchmark metrics as well as the time and perf data. I added a 'LOOPS' parameter to the test definition to support repeated runs. If we run the test more than once, the data in avg.json will be the average of the runs. Here is a LAVA job example: https://validation.linaro.org/scheduler/job/382401
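For reference, the decode-and-report step is conceptually just the following minimal sketch, assuming avg.json is a flat metric-name-to-number mapping and that LAVA's lava-test-case helper is available in the test shell; the results path is hypothetical and the real test definition may differ:

    import json
    import subprocess

    # Assumed location of the averaged LKP results; the real path
    # depends on how the test definition invokes LKP.
    AVG_JSON = 'results/avg.json'

    with open(AVG_JSON) as f:
        metrics = json.load(f)

    for name, value in metrics.items():
        if not isinstance(value, (int, float)):
            continue  # avg.json is assumed to hold scalar averages only
        # Report each LKP metric as a LAVA test case with a measurement.
        # 'pass' is used only because LAVA requires a result; what we
        # actually care about is the measurement value.
        subprocess.check_call([
            'lava-test-case', name.replace(' ', '_'),
            '--result', 'pass',
            '--measurement', str(value),
        ])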
It is hard to figure out something useful from this link: https://validation.linaro.org/dashboard/streams/anonymous/chase-qi/bundles/8...
It seems it doesn't work now. Could you resend the report when everything is right?
It does work, here are detailed results: https://validation.linaro.org/dashboard/streams/anonymous/chase-qi/bundles/8...
Sorry for missing this.
As for what these results show, they are still organized as functional testing results, and they mix the profiling data with the benchmark data, and even report setup steps like 'split job' and 'setup local dbench' as benchmarks.
We'd better split out what we actually target, which is just the benchmark data. Also, we care about the measurement values rather than the 'pass' or 'fail' of the benchmarks.
That isn't possible at this stage; LAVA structures the results this way.
The following format would be better than the current one:

|             | kernel 1 | kernel 2 |
| benchmark 1 | value x  | value x2 |
| benchmark 2 | value y  | value y2 |
Again, this isn't possible for a single run, as we're only running on a single kernel. Such a feature requires raw-data postprocessing and most likely will not be part of LAVA. LAVA will be used to run tests and store raw data; the comparison will happen somewhere else. I just had a meeting with the ARM ART (Android RunTime) team and they requested similar comparison features for their benchmarks. They are willing to share the code they're using, which includes a DB for storing the benchmark results and some scripts that do the comparison. So eventually we will get build-to-build or branch-to-branch comparison. For the moment let's focus on collecting the benchmark data and making sure we store everything you need.
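Just to illustrate the kind of postprocessing that comparison would be: once the raw data from two runs is stored somewhere, a small script like the sketch below could line the metrics up side by side. The flat metric-to-value JSON layout and the file names are only my assumptions, not anything LAVA or LKP produces today:

    import json
    import sys

    def load(path):
        with open(path) as f:
            return json.load(f)

    def compare(path_a, path_b):
        """Print metrics from two runs side by side with the relative change."""
        a, b = load(path_a), load(path_b)
        print('%-40s %15s %15s %8s' % ('benchmark metric', 'kernel 1', 'kernel 2', 'change'))
        for key in sorted(set(a) & set(b)):
            va, vb = float(a[key]), float(b[key])
            change = (vb - va) / va * 100 if va else float('nan')
            print('%-40s %15.3f %15.3f %+7.1f%%' % (key, va, vb, change))

    if __name__ == '__main__':
        # e.g. python compare.py kernel1-avg.json kernel2-avg.json
        compare(sys.argv[1], sys.argv[2])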
As a further step, we'd better set up an automatic comparison to track whether some measurement has regressed on a new kernel version. At that point it is worth looking into the details.
Alex, we're not kernel hackers and we don't know what's important and what is not.
I know this; that is why I am explaining what is important or useful for kernel engineers.
Chase is asking for help identifying the important bits. Complaining that what we present is not what you want without details doesn't help :(
I am sorry if the feature requests look like mere complaints. I do appreciate what Riku and Chase have done on this job!
I guess we share the same goal: making the performance testing useful and reliable for kernel engineers in Linaro. Not something made in a hurry that no one likes using, because it is hard to get details and it misses useful info.
+1 on that. Let's try to identify the important data, store it, and have it ready for postprocessing. If LKP is missing something, we need to fix it upstream.
milosz