On 12 September 2016 at 12:09, Milosz Wasilewski milosz.wasilewski@linaro.org wrote:
On 12 September 2016 at 11:37, Neil Williams neil.williams@linaro.org wrote:
On 12 September 2016 at 10:32, Milosz Wasilewski milosz.wasilewski@linaro.org wrote:
On 12 September 2016 at 08:55, Neil Williams neil.williams@linaro.org wrote:
On 9 September 2016 at 14:09, Milosz Wasilewski milosz.wasilewski@linaro.org wrote:
Hi,
I'm trying to get the proper relationship between requested tests and results in LAVA v2. Here is an example job: https://validation.linaro.org/scheduler/job/1109234 and the results for this job: https://validation.linaro.org/results/1109234
I'll add notes to the docs for the 2016.11 release based on these responses and any feedback on this list.
How can I tell:
- which result matches which test?
There is a chevron in the test case detail page, directly after the test case name, which links to the point in the log where that test was reported. The same URL can also be determined in advance by knowing the job ID, the sequence of test definitions in the test job definition and the name of the test case.
The chevron seems to always point to #bottom of the log file.
That's the double chevron >> on the same line as the Job ID.
Below that, there is the test case name and a single chevron.
https://validation.linaro.org/results/1109234/1_lamp-test/mysql-show-databas...
mysql-show-databases >
Suggestions on making this clearer are welcome...
OK, I was looking at the wrong chevron :) This is OK once one knows how to use it.
The URL can also be assembled from the data available in the results, allowing parsers to go directly to that position in the log file.
Note: Unlike V1, the test shell does not wait until the test case entry has been created before moving on, so there can be an offset between the point linked from the result (where the test case entry was created) and the point slightly earlier in the log where the test itself was executed. The V1 wait behaviour caused various bugs because it had to block at the shell read command, which gets confused by other messages on the serial console. The offset is the consequence of removing that behaviour.
So:
https://validation.linaro.org/results/1109234/1_lamp-test/mysql-show-databas... links to https://validation.linaro.org/scheduler/job/1109234#results_1_lamp-test_mysq...
i.e. once you know the URL of the result, you can generate the URL of the point in the test job log where that result was created.
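A minimal shell sketch of that construction, using the names from the example above:

    #!/bin/sh
    # Values taken straight from the results data for job 1109234.
    SERVER="https://validation.linaro.org"
    JOB_ID=1109234
    SUITE="1_lamp-test"
    CASE="mysql-show-databases"

    # URL of the result itself:
    echo "${SERVER}/results/${JOB_ID}/${SUITE}/${CASE}"
    # URL of the point in the log where that result was created:
    echo "${SERVER}/scheduler/job/${JOB_ID}#results_${SUITE}_${CASE}"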
In the log file this section looks like:
    Received signal: <TESTCASE> TEST_CASE_ID=mysql-show-databases RESULT=pass
    case: mysql-show-databases
    definition: 1_lamp-test
    result: pass
So, in this case, there was no offset.
There is a REST API using the name of the test definition and the name of the test case.
My question was about API. Manually it's possible to do the matching even in v1.
I'm not sure what else you want from a REST API, other than having all of the data available to build the URL immediately after completion, without needing round-trip lookups to find hashes or other generated strings. A single call to the results for a completed test job provides all the information you need to build URLs for all test cases, including the links to the position within the log file for each test case. There is no "matching" required in V2 and no round-trips back to the server with more API calls. One call gets all the data. The job of a REST API is not to build those URLs for you; it is to provide enough information to predict those URLs in a single call. Are you looking for an API call which returns all the URLs pre-assembled?
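As a sketch (I'm assuming the YAML results export endpoint here; check your instance):

    # One call fetches every test case for the job; the suite and
    # case names in the export are enough to build all the URLs.
    curl -s "https://validation.linaro.org/results/1109234/yaml" -o results.yaml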
I don't need URLs at all. All I need is to know which test results come from which 'tests' in the job definition
Test suites come from the job definition, following the name specified by the job definition.
and if there is anything missing.
There is a specific calculation for this on the Results page for the test job. This checks that all test suites defined in the test job definition have provided results - that is why inline definitions show as omitted if no lava-test-case is used.
The important part is to know when a test screws something up and produces no results.
That is outside the control of LAVA *except* in the case where the test runner itself fails and thereby stops a subsequent test suite from executing. So if 1_smoke-tests falls over in a heap such that the job terminates or times out, the job will be Incomplete. If the test runner exits early, this will also be picked up as a test runner failure: "lava-test-runner exited with an error".
We need to be careful with terminology here.
test-suite - maps to the test definition in your git repo or the inline definition. If a test suite fails to execute, LAVA will report that.
test-set and test-case - individual lines within a test definition. If any of these fail, there is *nothing* LAVA can do about the rest of the test sets or test cases in that test definition. The reason is simple - lava-test-case can be called from custom scripts in your git repo and those scripts can easily call lava-test-case in loops. Therefore, there is nothing LAVA can do to anticipate whether a test definition is going to call lava-test-case 5 times or 100. Either could be deemed correct, either could be wrong.
In such situations, if the test writer *knows* that their test definition needs to call lava-test-case N times and must not call it N-1 or N+1 times, then the test definition needs to do that calculation itself (in a custom script) and make an explicit test case for that. These custom scripts massively improve the ability of test writers to run these tests in standalone mode without LAVA. The script does all the work and then, when it is time to report on the work, it can check whether it needs to output the results for LAVA, for the console or for something else.
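A rough sketch of that pattern (the helper scripts and the expected count are illustrative, not part of any LAVA API):

    #!/bin/sh
    # Illustrative custom script: run a known set of tests and
    # assert that the expected number actually ran.
    EXPECTED=5
    COUNT=0
    for test in $(./list-tests.sh); do      # hypothetical helper
        if ./run-one.sh "$test"; then       # hypothetical helper
            lava-test-case "$test" --result pass
        else
            lava-test-case "$test" --result fail
        fi
        COUNT=$((COUNT + 1))
    done
    # Explicit test case for the count itself:
    if [ "$COUNT" -eq "$EXPECTED" ]; then
        lava-test-case test-count --result pass
    else
        lava-test-case test-count --result fail
    fi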
The contents of the Lava Test Shell Definition are essentially hidden from LAVA until that content triggers lava-test-case (or lava-test-set) or exits. LAVA cannot introspect inside the Test Shell Definition as this goes down a rabbit hole of endless permutations depending on kernel config, device behaviour and remote source code (like LTP itself).
What will I have in the 'LAVA Results' then? Will the metadata present such a test as 'omitted'? It's also important to know which results come from which parametrized tests (when more than one parameter is present).
That can only be done by the test writer. When parameters are used, the purpose or meaning of those parameters needs to be declared into the results of that test job if anyone is later to be able to identify that meaning from those results. This can be done with explicit results and/or with specific names of test cases. It doesn't matter whether those results appear in LAVA or on the console when running the test in standalone mode - the parsing and calculation still needs to be done by the standalone script to express the meaning and/or purpose of whichever parameters may be of interest.
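For example (a sketch; the parameter names are illustrative), the reporting stage can encode the parameters into the test case name so the result is self-describing:

    # TST_CMDFILES and THREADS are illustrative parameters from the
    # job definition; $RESULT comes from the script's own parsing.
    lava-test-case "ltp-${TST_CMDFILES}-${THREADS}-threads" --result "$RESULT"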
The name of the test definition comes from the test job definition:
    - repository: http://git.linaro.org/lava-team/lava-functional-tests.git
      from: git
      path: lava-test-shell/single-node/singlenode03.yaml
      name: singlenode-advanced
The digit comes from the sequence of definitions in the list in the - test: action of the test job definition. So job 154736 on staging has three definitions in its test action: 0_env-dut-inline, 1_smoke_tests and 2_singlenode_advanced.
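So a test action along these lines (sketched from those names; the repository values are elided) produces exactly those prefixes, in order:

    - test:
        definitions:
        - repository: ...             # inline definition
          from: inline
          name: env-dut-inline        # becomes 0_env-dut-inline
        - repository: ...
          from: git
          name: smoke_tests           # becomes 1_smoke_tests
        - repository: ...
          from: git
          name: singlenode_advanced   # becomes 2_singlenode_advanced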
OK. So when I download the job definition and the test results, I should get the match by order of appearance. Is 'lava' always present in the results?
Yes.
The test case name comes directly from the call to lava-test-case.
When an inline test definition does not report any test cases (by not calling lava-test-case anywhere, just doing setup or diagnostic calls to put data into the logs) then the metadata shows that test definition as "omitted" and it has no entry in the results table.
omitted.0.inline.name: env-dut-inline
lava-test-case calls are not that interesting yet, as for example the test can return a different number of results based on the parameters passed.
However, lava-test-case can also be used to report results for things which are "hidden" within the scripts in the remote git repo. It is also the test-case which provides the link into the position in the job log file.
This approach ties tests to LAVA, which I don't like, as users requested the ability to run tests 'standalone'. So anything that takes the test in the direction of being 'LAVA specific' can't be used.
Then a custom script is going to be needed which does the parsing - including checking whether the correct number of tests has been run - and then produces data which is reported to LAVA (or something else). I do this for the django unit tests with lava-server. We have a single script, ./ci-run, that everyone runs to execute the tests locally (and in gerrit). The custom script sets up the environment, then runs ./ci-run | tee filename and then parses the file. Once it has done the checks it needs, it loops through its own data. At that point, it can check for lava-test-case in $PATH and use that, or dump to some other output, or call something else. This is what provides the standalone support, with LAVA picking up the results once the standalone script has done the execution and the parsing of the data.
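A cut-down sketch of that wrapper (the grep patterns, file names and test case names are illustrative):

    #!/bin/sh
    # Run the same tests everyone runs locally, capturing the output.
    ./ci-run | tee ci-run.log

    # Parse the captured output (illustrative patterns).
    PASSES=$(grep -c '^ok ' ci-run.log)
    FAILURES=$(grep -c '^not ok ' ci-run.log)

    # Report through LAVA only when the helper is in $PATH, so the
    # same script works standalone.
    if command -v lava-test-case >/dev/null 2>&1; then
        [ "$FAILURES" -eq 0 ] && RESULT=pass || RESULT=fail
        lava-test-case unit-tests --result "$RESULT" --measurement "$PASSES"
    else
        echo "passes: $PASSES failures: $FAILURES"
    fi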
In addition, each test job gets a set of LAVA results containing useful information like the commit hash of the test definition when it was cloned for this test job.
- if there are multiple occurrences of the same test with different
parameters, how to recognize the results?
Multiple occurrences show up in the results table: https://staging.validation.linaro.org/results/154736/2_singlenode-advanced (realpath_check occurs twice with separate results)
The question was about multiple occurrences of the same test definition.
Will occur as discrete entries in the results - prefixed with the order.
1_smoke_tests, 2_smoke_tests, etc.
For example we use subsets of LTP. So I would like to test:
- LTP - syscalls
- LTP - math
As I wrote above the test cases will be different, so they're not that interesting.
That is where test-set is useful. I'll be writing up more documentation on that today.
    lava-test-set start syscalls
    lava-test-case syscalls ...
    lava-test-set stop syscalls
    lava-test-set start math
    lava-test-case math ...
    lava-test-set stop math
This adds a set around those test cases by adding the test set to the URL.
/results/JOB_ID/2_smoke-tests/syscalls/syscall_one_test
This approach ties the test to LAVA, which is a 'no go' from my point of view. Besides that, there are other params which are important to know (see CTS: https://git.linaro.org/qa/test-definitions.git/blob/HEAD:/android/cts-host.y... or hackbench: https://git.linaro.org/qa/test-definitions.git/blob/HEAD:/ubuntu/hackbench.y...).
I disagree. You can do all the processing in the standalone script and still call lava-test-set at particular points if that proves to be useful, as part of the reporting stage at the end of the standalone script. The script still needs to output something sensible when run outside LAVA, so it still needs to do all the same checks and parsing. When it chooses to report to LAVA, it is able to use lava-test-set if that is useful or simply put everything through lava-test-case.
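i.e. the end of the standalone script can look something like this (a sketch; the set and case names are illustrative):

    # Reporting stage only - all the parsing happened earlier.
    if command -v lava-test-set >/dev/null 2>&1; then
        lava-test-set start syscalls
        lava-test-case syscall_one_test --result "$SYSCALL_RESULT"
        lava-test-set stop syscalls
    else
        echo "syscalls/syscall_one_test: $SYSCALL_RESULT"
    fi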
Analysis of the data within LAVA will need the relevant elements to be reported to LAVA - we cannot go into the Lava Test Shell Definition and *guess* how many times lava-test-case is meant to be called.
[cut]
Example of such jobs: https://validation.linaro.org/results/1107487 (not the best as the names are different) https://validation.linaro.org/scheduler/job/1113188/definition (job failed, so no results, but I'm trying to get this working)
That needs to be declared to LAVA via the test suite name or a test-set or via the test case names. LAVA cannot introspect into your remote git repo any more easily than you can.
Hmm, this approach implies there is only one parameter. How do I know if there is more than one?
That is up to the standalone script that does the parsing.
So if the default isn't clear, add a lava-test-case which tests that the default is what you expect - smoke-test-default-true: fail.
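For example (a sketch; the variable name is illustrative):

    # Explicit check that the default is what the test expects.
    if [ "${SMOKE_DEFAULT:-true}" = "true" ]; then
        lava-test-case smoke-test-default-true --result pass
    else
        lava-test-case smoke-test-default-true --result fail
    fi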
This looks like a regression from v1, which reported all params in result bundles (both defaults and those set in the job definition).
It's not a regression, it is a different method. Not all test writers always need or want all parameters to be reported. V2 gives the writer that control, where V1 just presumed to report everything whether the writer wanted it or not.
Doing these tests to support standalone testing means putting nearly all the logic of the test into a single script which can be run both standalone and in LAVA - the script simply has options which determine how that work is reported.