On 28 April 2017 at 14:55, Guillaume Tucker guillaume.tucker@collabora.com wrote:
Hi Milosz,
On 28/04/17 14:47, Milosz Wasilewski wrote:
On 28 April 2017 at 14:20, Guillaume Tucker guillaume.tucker@collabora.com wrote:
Hi Neil,
Thanks for your explanation, also sorry if I've somewhat misused the terms job, shell and test. Trying to piece everything together, I've made a small test definition to see how this would all work in practice. Here's a run:
In such a case it's easier to use in-line tests for prototyping, I guess.
Definitely, use inline test definitions for all such prototyping. It's what the LAVA software team and the QA team do all the time for this work. It is a step on from hacking sessions, particularly as an inline definition operates before a hacking session can be started and all details of the inline definition are visible to everyone looking at the test job. More eyes make it easier to debug.
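For example, the sort of single-line steps that go into the run: steps list of an inline definition while prototyping could look like this (only an illustration, the test names and commands are placeholders):

    # each step must be a single line of shell, pasted into the inline run: steps
    uname -a
    # lava-test-case is on $PATH inside the test shell; with --shell it records
    # pass or fail from the exit code of the command it runs
    lava-test-case kernel-version --shell uname -a
    lava-test-case proc-mounted --shell grep -q proc /proc/mounts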
Things like raising and lowering permissions are all userspace actions which are perfectly feasible in test writer scripts and perfectly good tools exist in lots of operating systems to do the work. Don't neglect the flexibility of test writer scripts - LAVA is hindered by everything having to work with the lowest common denominator - busybox and OpenEmbedded levels of shell support. Test writers have no such limitations and it is wrong of LAVA to make it hard to use sensible languages and sensible tools by putting too much magic into the test shell helpers.
There is still this perception that LAVA has a set of tests and we keep being asked to provide such a list. There never has been a list and part of the problem is that LAVA V1 has too many helpers that make it look like tests *must* use those. What we are pushing is that, no, there is no list, LAVA can run anything a test writer can create / compile. Anything a test writer can do once at a shell prompt can be done in LAVA.
[cut]
Right, except here I wanted to run a hello.sh script which needed to be downloaded from somewhere, so I created a git repo for that and put the YAML test definition in there as well.
Exactly. The script and the YAML calling the script live in the same git repo - then the YAML knows where to find the script and git can control the permissions etc.
If your test then needs another git repo, you have all the tools and the knowledge to clone that repo and adjust any permissions or add dependencies. There is no reason for LAVA to be involved in any of that, other than to get the device into a state where it can perform the requested actions.
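As a rough sketch (the repository URL, package names and paths are only placeholders), that kind of preparation in a test writer script can be as simple as:

    #!/bin/sh
    # setup.sh - preparation done by the test writer script, not by LAVA
    set -e

    # install the dependencies using the native package manager of the
    # deployed OS (apt-get is only an example here)
    apt-get update && apt-get install -y git curl

    # clone the extra repository the test needs (placeholder URL)
    git clone https://example.com/extra-tests.git /opt/extra-tests

    # fix up anything git does not track, e.g. execute permissions
    chmod +x /opt/extra-tests/run-all.sh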
OK, I think I generally understand this. One part I'm not too sure about is the claim that, on the one hand, LAVA test scripts should be portable and, on the other hand, that LAVA should not be involved with how test scripts actually manage to run on a system.
Portability is about developers taking your script and being able to run it outside LAVA. LAVA and all automation will always have issues with debugging why tests fail. Why make that harder by writing tests that can only be run inside LAVA? So this is portability of the test writer scripts called within the test shell definitions. The LAVA test shell helpers (things like lava-test-case, lava-test-runner and lava-sync) have to be portable in a different way - to work on any system, even if it only has a minimal busybox shell. Test writer scripts have no such limitation - you can write scripts in Java or Rust or NodeJS if your script configures the device appropriately.
The objective is that if any developer with access to suitable hardware can obtain a full log file of a LAVA test job, the developer should be able to deploy the images in their own ways, get the system to a prompt (or an emulated VM to a similar prompt) and just run the test, without having to know about LAVA at all. So portable scripts check $PATH for lava-test-case and just do the work to prepare the results, reporting via lava-test-case if that exists or via some other method.
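A minimal sketch of that pattern, assuming a plain POSIX shell (the function name and the example check are just illustrations):

    #!/bin/sh
    # report a result to LAVA when running in a test shell, else to stdout
    report_result() {
        name="$1"
        result="$2"    # "pass" or "fail"
        if command -v lava-test-case >/dev/null 2>&1; then
            lava-test-case "$name" --result "$result"
        else
            echo "$name: $result"
        fi
    }

    # example use: the same script works inside and outside LAVA
    if ping -c 1 127.0.0.1 >/dev/null 2>&1; then
        report_result loopback-ping pass
    else
        report_result loopback-ping fail
    fi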
Portability between operating systems is a slightly different problem. What we are driving for is tests which do not rely on the LAVA helpers to operate, except where the LAVA helpers provide critical services or information. This includes lava-test-case, because that's how results get recorded, and most of the MultiNode API, which provides synchronisation services and information about the group.
Start thinking about a developer with the same board and the same image files, wanting to replicate a LAVA test job. It is a lot easier if there are scripts which live outside LAVA to do the installation of dependencies, setting up of directories and possibly users etc. It is also a lot easier to debug that script when prototyping it outside LAVA.
Terminology:
LAVA test job: provides test shell definitions (and inline definitions for prototyping) as well as describing the deployment and booting of the device to a prompt - not portable between devices or operating system deployments. (This is a departure from V1 but is required to drop the magic done by V1 and make the steps explicit.) Inline definitions are also suitable for MultiNode synchronisation by being inserted between LAVA Test Shell Definitions which do the rest of the work.
LAVA Test Shell Definition: lives in a git repo alongside test writer scripts - not portable between operating system deployments, can easily use different scripts in different directories with the test job selecting which to use for which deployment. Ideally, a single run step which calls the appropriate test writer script. Must be single lines of shell, no redirects, functions or pipes.
LAVA Test Helpers: scripts maintained by LAVA, like lava-test-case, which are restricted down to the lowest common denominator - must be portable to all deployments, where necessary using deployment_data to customise content. Principally used to embed information from LAVA into the test shell and to support communication with LAVA during test jobs. Helpers which are too close to any one operating system are likely to be deprecated and removed after V1 is removed. Helpers which duplicate operating system support will also be deprecated and removed.
Test Writer scripts: Executed by LAVA and by developers, with no need to be restricted to only running in LAVA or only by developers. No need to be portable to different operating system deployments as the developer or LAVA can be told which one to use. Can use any language or method or tools which are available in the operating system, including compiling custom tools from source per job. No restrictions on compatibility. Scripts should detect the presence of lava-test-case in $PATH and report results to LAVA using it when it is available, else report to the user by whatever method is suitable. Scripts should also contain progress messages, debug statements, error handling, logging and other support to allow developers to see what is actually happening and to aid debugging. Scripts are commonly shared amongst test writers and should be self-contained where possible to support reuse. One script doing one task, in the Unix model. Can use shell redirects, functions, pipes and all other features of the available shell as well as any language, interpreter, compiler or utility which is available in the relevant deployment.
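As an illustration of that structure (all names here are placeholders, not an existing test), a self-contained test writer script might look like:

    #!/bin/sh
    # smoke-test.sh - one script, one task; runs inside or outside LAVA
    set -eu

    log() {
        # progress messages so a developer can follow the run outside LAVA
        echo "[smoke-test] $(date '+%H:%M:%S') $*"
    }

    record() {
        # report via lava-test-case when available, else print for the developer
        if command -v lava-test-case >/dev/null 2>&1; then
            lava-test-case "$1" --result "$2"
        else
            echo "RESULT: $1 $2"
        fi
    }

    log "starting smoke test"
    if grep -q MemTotal /proc/meminfo; then
        record meminfo-readable pass
    else
        record meminfo-readable fail
    fi
    log "smoke test finished"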
If that summary helps, I'll add it to the docs.
It sounds to me like it's rather making portability the user's problem: LAVA will happily schedule any job that can be successfully submitted, regardless of whether the test scripts involved would also manage to run on any other system.
LAVA V2 is all about giving the test writers more control. That comes from LAVA being much more hands-off and just letting the tests run. Yes, that does mean that a test which runs in LAVA should also be capable of running outside LAVA. There should be no restriction that LAVA only schedules test jobs which contain special magic needed to work with LAVA. LAVA V2 is all about scheduling test jobs on hardware without restricting the tests themselves to only being run in LAVA.
I think this is a matter of convention. If your test is only targeting one build/OS/device and you don't care about sharing the test - no problem, LAVA will happily run your test. But if you plan to re-use the test on different OSes, different shells and different boards, portability becomes an issue.
As above, there are different meanings for portability for different parts of the process.
Yes, so it's up to the user, and while it's usually a good thing to write portable scripts, LAVA does not enforce any level of portability. I think we all agree on this point :)
LAVA keeps the current helpers only as long as V1 submissions are accepted, to retain V1 compatibility as much as possible during the migration. Once V1 has gone away, improvements are planned for the test script helpers which will remove some of the ones that remain. (A number of them will simply go away because only V1 can currently use them anyway, like lava-test-case-attach.) The next development after that is to remove test shell helpers which could be used in V2 but which have already been shown to be problematic. A case in point is lava-network, which has always struggled because it cannot handle all possible network tool outputs with the limited shell support available in busybox ash. There are perfectly good tools out there already, customised to each operating system and maintained by those systems. There is no role for LAVA to reinvent those helpers.
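As a sketch only (the addresses and hostnames are placeholders), the same kind of information lava-network tried to gather can be collected directly with the OS's own tools on any deployment that ships them, reporting through lava-test-case from the exit codes:

    # log the interface state for debugging, then record simple checks
    ip -o addr show
    # "ip route get" exits non-zero if there is no route to the address
    lava-test-case default-route --shell ip route get 8.8.8.8
    # name resolution check using the C library resolver
    lava-test-case dns-lookup --shell getent hosts example.com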
This all arises from the lessons learnt during the final stages of V1. Rid LAVA of hidden magic - let the test writers use the best tools for the job according to the operating system in use and not add more support which duplicates or simply relocates support already available in the shell itself.
If a task is achievable within the test shell without a LAVA helper, there is no role for a LAVA helper in that task. (So the current package install helpers are likely to be deprecated and then removed in the future.)
If there is critical information or support required which only LAVA can do, e.g. information about the device configuration, synchronisation between MultiNode devices, access to external hardware inside LXC, then that does need some help from LAVA. However, once that minimal step is available, how that is used needs to be left to the test writers so that the most suitable tools can be applied when those are available.
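For example, the MultiNode synchronisation calls are exactly that kind of minimal step; everything around them stays in ordinary test writer scripts (a sketch only, the script names and message IDs are placeholders):

    #!/bin/sh
    # "server" role: only the synchronisation itself needs the LAVA helpers
    set -e

    ./start-server.sh            # ordinary test writer work, no LAVA involved

    # tell the rest of the MultiNode group that the server is ready
    lava-send server-ready

    # ...the "client" role would call: lava-wait server-ready

    # barrier: wait for every device in the group to reach this point
    lava-sync test-complete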
So far, this login command support would seem to fit perfectly into the "Don't repeat what the OS can do in the test shell" category. It differs from the QEMU GuestFS support, as that is required to provide access to the test shell in the first place (by mounting the guest filesystem). It is not part of the login process (which ends at the first shell prompt) and can be done with a script executed before the rest of the test shell continues.