Recently I've been thinking about the dispatcher a bit (as other mails should have indicated) and I've gotten to thinking about dependencies between actions. If you don't already know, a dispatcher job file mostly consists of a list of actions to execute, for example:
"actions": [ { "command": "deploy_linaro_image", "parameters": {"rootfs": "...", "hwpack": "..."} }, { "command": "lava_test_install", "parameters": {"tests": ["stream", "ltp"]} }, { "command": "boot_linaro_image" }, { "command": "lava_test_run", "parameters": {"test_name": "stream"} }, { "command": "lava_test_run", "parameters": {"test_name": "ltp"} }, { "command": "submit_results", "parameters": { "server": "http://localhost/lava-server/RPC2/", "stream": "/anonymous/test/" } } ]
I hope what the actions do is reasonably clear from their names.
What is easy-ish for us, but probably rather harder for a computer program, is to see the data dependencies between the different actions.
boot_linaro_image makes no sense if deploy_linaro_image failed.
Running tests with lava_test_run doesn't make sense if the test failed to install (or if boot_linaro_image failed).
But the lava_test_run actions are independent of each other: even if the stream test hangs, we could still run the ltp tests.
And we should always submit the results; that's a kind of special case (and I'm not sure submit_results should really be an action).
It seems like the way we (aim to, at least) handle this isn't too bad: basically any action can veto the running of any more actions (apart from the special case of submit_results).
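To make the current behaviour concrete, the veto logic amounts to something like this (a rough sketch only; the names here are illustrative, not the real dispatcher code):

    # Any failing action vetoes everything after it, except
    # submit_results, which always runs.
    def run_actions(actions, context):
        vetoed = False
        for action in actions:
            if action.name == 'submit_results':
                # Special case: results are submitted no matter what.
                action.run(context)
            elif not vetoed:
                try:
                    action.run(context)
                except Exception:
                    # Treat a failure as a veto of everything downstream.
                    vetoed = True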
But there's more than control flow going on here -- there is data flow too. The reason I'm writing this mail is that I'm working on testing images in qemu[1]. If we want to use a similar layout of the job files (and I think using the same commands even will be possible), then we'll have an action that builds an image and another that starts up qemu. The action that starts up qemu needs to know where the image built by the previous action is!
And of course I've been a bit sneaky here, because there's another, very important kind of data that needs to move around: the test results. Currently we assume that all the test results end up in a particular directory (either on the device, for ubuntu-based tests, or on the host, for android-based tests). This feels a bit grotty to me, and will need to change for tests run under qemu, and possibly for the multi-system tests that were discussed at the Connect.
There is an object in the dispatcher -- the context -- that encapsulates the state that persists through the run, so this is probably where the data should live. We could have a dictionary attached to the context: deploy_linaro_image for a qemu client type could stuff the path into this, and the boot_linaro_image action for a qemu client could read the path back (and complain appropriately if it's not there). Additionally, we could have a list of 'result locations' (which could be filesystem paths on the host, or locations on the linaro image) and the submit-results step could read from here to gather together the results.
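To pin that down a bit, this is roughly the shape I have in mind (a sketch only; none of these attributes exist on the real context today, and read_bundles_from is a made-up helper):

    class LavaContext:
        def __init__(self):
            # Data one action records for later actions to read,
            # keyed by convention as '<action_name>.<key>'.
            self.action_data = {}
            # Everywhere the submit step should look for result
            # bundles: host paths, or locations on the deployed image.
            self.result_locations = []

    def gather_results(context):
        # The submit step would walk result_locations rather than
        # assuming a single well-known directory.
        bundles = []
        for location in context.result_locations:
            bundles.extend(read_bundles_from(location))
        return bundles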
This feels like it will work, but is still a bit implicit -- it's still not obvious that boot_linaro_image depends on something deploy_linaro_image does -- but maybe this is information that should be maintained outside the dispatcher, in the job file, perhaps in a JSON schema for job files?
Apologies for the second brain dump today. I think these are the changes I want to make:
1) Change submit results to not be an action.
2) Add a result_locations list and action_data dictionary to LavaContext. My half-thought-through idea is that actions will use the action name as a prefix, e.g. deploy_linaro_image for a qemu client might set 'deploy_linaro_image.qemu_img_path' in this dict (see the sketch after this list).
3) Change lava-test and lava-android-test to store into result_locations and the submit step to read from there.
4) Use action data to have deploy_linaro_image and boot_linaro_image (and maybe lava_test_install and lava_test_run) talk to each other.
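As a sketch of points 2 and 4 (hypothetical function names, and I'm glossing over how actions are actually structured -- build_qemu_image is made up):

    def deploy_linaro_image_qemu(context, rootfs, hwpack):
        # Record where the built image ended up for later actions.
        image_path = build_qemu_image(rootfs, hwpack)
        context.action_data['deploy_linaro_image.qemu_img_path'] = image_path

    def boot_linaro_image_qemu(context):
        # Read the path back, complaining appropriately if it's absent.
        try:
            image_path = context.action_data['deploy_linaro_image.qemu_img_path']
        except KeyError:
            raise RuntimeError(
                "boot_linaro_image: no deployed image to boot; "
                "did deploy_linaro_image run?")
        # ... start qemu against image_path ...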
What do you guys think?
Cheers, mwh
[1] testing in qemu is perhaps not incredibly useful for us, but doing this forces me to confront some of the issues with testing images in a fast model, which is something we really want to do, as we can get access to the fast model of the cortex-a15 long before we'll get access to hardware
1) Change submit results to not be an action.
2) Add a result_locations list and action_data dictionary to LavaContext. My half-thought-through idea is that actions will use the action name as a prefix, e.g. deploy_linaro_image for a qemu client might set 'deploy_linaro_image.qemu_img_path' in this dict.
3) Change lava-test and lava-android-test to store into result_locations and the submit step to read from there.
4) Use action data to have deploy_linaro_image and boot_linaro_image (and maybe lava_test_install and lava_test_run) talk to each other.
About the actions: from the point of view of the LAVA components (lava-server, lava-dashboard, lava-scheduler), some actions don't need to be specified explicitly; or rather, they will certainly be executed. For example:
1. submit results -- if no results are submitted, the test has no meaning.
2. deploy and boot the test target -- if we don't deploy and boot, there is no target to test.
3. the installation of the various tests -- if a test is not installed, we can install it before running it.
They can, or should, be executed implicitly, I think.
Thanks, Yongqin Liu
On Fri, 11 Nov 2011 16:15:47 +0800, yong qin yongqin.liu@linaro.org wrote:
1) Change submit results to not be an action.
2) Add a result_locations list and action_data dictionary to LavaContext. My half-thought-through idea is that actions will use the action name as a prefix, e.g. deploy_linaro_image for a qemu client might set 'deploy_linaro_image.qemu_img_path' in this dict.
3) Change lava-test and lava-android-test to store into result_locations and the submit step to read from there.
4) Use action data to have deploy_linaro_image and boot_linaro_image (and maybe lava_test_install and lava_test_run) talk to each other.
About the actions: from the point of view of the LAVA components (lava-server, lava-dashboard, lava-scheduler), some actions don't need to be specified explicitly; or rather, they will certainly be executed. For example:
1. submit results -- if no results are submitted, the test has no meaning.
2. deploy and boot the test target -- if we don't deploy and boot, there is no target to test.
3. the installation of the various tests -- if a test is not installed, we can install it before running it.
They can, or should, be executed implicitly, I think.
That's a good point... but sadly I don't think it works, because the actions we'd want to implicitly execute require data -- we can't implicitly submit results without knowing where to send the results, we can't implicitly deploy without knowing what to deploy, and, in the case of out-of-tree tests at least, we can't know where to install a test from before we run it.
It does seem though that we can very quickly detect all of these problems even at submit_job time, which would be much friendlier to the user -- but I don't instantly see how to do this without some messy hard coding in the dispatcher.
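The messy hard coding I mean would look something like this (entirely hypothetical, and exactly the kind of table I'd rather not maintain by hand in the dispatcher):

    # Which actions need which earlier actions, checked at submit_job
    # time. The table itself is the messy hard-coded part.
    DEPENDS_ON = {
        'boot_linaro_image': ['deploy_linaro_image'],
        'lava_test_install': ['deploy_linaro_image'],
        'lava_test_run': ['lava_test_install', 'boot_linaro_image'],
    }

    def check_job(job):
        seen = set()
        for action in job['actions']:
            command = action['command']
            missing = [dep for dep in DEPENDS_ON.get(command, [])
                       if dep not in seen]
            if missing:
                raise ValueError("%s needs %s earlier in the job"
                                 % (command, ', '.join(missing)))
            seen.add(command)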
Cheers, mwh
1) Change submit results to not be an action.
So it would be implicit that we always want to submit results? I know there are times that we do jobs that do not submit results. For instance, when testing the scheduler, or other pieces of LAVA. Also, where would the results be submitted from? Still from the dispatcher? There's an unrelated issue we've discussed before with submitting results to streams that require authentication. One of the earlier options that came up for this was submitting results from the scheduler, however I'm not sure I like that any better.
- Add a result_locations list and action_data dictionary to
LavaContext. My half-thought through idea is that actions will use the action name as a prefix, e.g. deploy_linaro_image for a qemu client might set 'deploy_linaro_image.qemu_img_path' in this dict.
Those would be parameters to the action? The way it works right now is that we have a default location where all the test results go, and then they are gathered at the end. I'm not sure I understand how this helps to let the job submitter specify the location.
3) Change lava-test and lava-android-test to store into result_locations and the submit step to read from there.
See #2, we already do this I believe, we just don't give an option to change that location at runtime. This location is _inside_ the image because that's where the parsing takes place. I'm starting to wonder if you want it to be stored outside the image for qemu testing purposes? Since lava-test is the piece that parses it, it needs to be inside the image, but with lava-android-test it could certainly be outside.
4) Use action data to have deploy_linaro_image and boot_linaro_image (and maybe lava_test_install and lava_test_run) talk to each other.
Maybe if you showed how this might look in json it would make sense to me. I don't think we should be too restrictive with how it's used though. One thing that's come up before as a possible user of this is to have outputs that come out of some actions used in other actions. For instance, if we had a build_foo action that you pointed at a source tree and produced output, how could you point a subsequent step at the artifacts of that build for consumption?
Thanks, Paul Larson
On Mon, 14 Nov 2011 13:17:25 -0600, Paul Larson paul.larson@linaro.org wrote:
1) Change submit results to not be an action.
So it would be implicit that we always want to submit results? I know there are times that we do jobs that do not submit results. For instance, when testing the scheduler, or other pieces of LAVA.
Well, that's interesting data then :-)
Also, where would the results be submitted from? Still from the dispatcher?
Err, I was assuming so. I guess it doesn't have to be that way, but I think for people who run the dispatcher by hand, without the scheduler, it will be much easier if the dispatcher submits the results.
There's an unrelated issue we've discussed before with submitting results to streams that require authentication. One of the earlier options that came up for this was submitting results from the scheduler, however I'm not sure I like that any better.
I think we came up with a plan for this, didn't we? Have the scheduler generate a token when the job starts, stuff it in the job given to the dispatcher, and invalidate it when the job finishes.
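i.e. something along these lines (every name here is made up, just to pin down the shape of the plan):

    def run_job(scheduler, job):
        token = scheduler.create_auth_token(job)      # hypothetical
        job['auth_token'] = token                     # handed to the dispatcher
        try:
            # The dispatcher presents the token when submitting results.
            dispatch(job)
        finally:
            scheduler.invalidate_auth_token(token)    # hypothetical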
2) Add a result_locations list and action_data dictionary to LavaContext. My half-thought-through idea is that actions will use the action name as a prefix, e.g. deploy_linaro_image for a qemu client might set 'deploy_linaro_image.qemu_img_path' in this dict.
Those would be parameters to the action?
Not in the sense that those words mean things today, no. What I'm proposing is that actions record on the context where the result bundles are...
The way it works right now is that we have a default location where all the test results go, and then they are gathered at the end.
... rather than assuming they all end up in the same place.
I'm not sure I understand how this helps to let the job submitter specify the location.
That's not surprising because that's not what I was proposing :-) Apologies for lack of clarity on my side.
3) Change lava-test and lava-android-test to store into result_locations and the submit step to read from there.
See #2, we already do this I believe, we just don't give an option to change that location at runtime. This location is _inside_ the image because that's where the parsing takes place. I'm starting to wonder if you want it to be stored outside the image for qemu testing purposes? Since lava-test is the piece that parses it, it needs to be inside the image, but with lava-android-test it could certainly be outside.
Again, it's not so much that I want to prescribe where the test results go, but rather not assuming so much about where they are, or perhaps how to get at them -- for qemu, we're not going to boot a known good image, mount the tested rootfs, copy files around and then run an HTTP server to get the results onto the host (well, we _could_ I guess, but that would be properly crazy).
4) Use action data to have deploy_linaro_image and boot_linaro_image (and maybe lava_test_install and lava_test_run) talk to each other.
Maybe if you showed how this might look in json it would make sense to me.
The stuff I was trying to propose does not imply changing the json at all.
I don't think we should be too restrictive with how it's used though. One thing that's come up before as a possible user of this is to have outputs that come out of some actions used in other actions.
This sort of question is _precisely_ what I'm talking about! :)
And a point I was trying to make, probably not very clearly, is that we _already_ have outputs of actions that are used in other actions[1], and ideally any framework we build to have more of this sort of thing should incorporate what we have now, not be some parallel system.
[1] the bundles produced by lava_test_run and also the state of the testrootfs and testboot partitions, although that last one is a bit hidden and implicit
For instance, if we had a build_foo action that you pointed at a source tree and produced output, how could you point a subsequent step at the artifacts of that build for consumption?
I did half-heartedly think of inventing some kind of syntax for writing actions that could have outputs that could be referred to by later actions, something like:
{ "command": "build_kernel", "parameters": { "git_location": "git:..." }, "outputs": ["kernel_deb_location"] }, { "command": "inject_kernel", "parameters": { "deb_location": "${build_kernel.kernel_deb_location}" }, "outputs": ["hwpack_location"] }, { "command": "deploy_linaro_image", "parameters": { "rootfs": "...", "hwpack": "${inject_kernel.hwpack_location}" } },
... but this seems a bit over-engineered somehow?
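Though if we did go that way, resolving the references would at least be straightforward -- a rough sketch, where 'outputs' is a made-up flat dict of everything earlier actions have declared:

    import re

    _REF = re.compile(r'\$\{([^}]+)\}')

    def resolve_parameters(parameters, outputs):
        # Expand ${action.output} references in an action's parameters
        # using the outputs recorded by earlier actions.
        resolved = {}
        for name, value in parameters.items():
            if isinstance(value, str):
                value = _REF.sub(lambda m: outputs[m.group(1)], value)
            resolved[name] = value
        return resolved

    # e.g. resolve_parameters(
    #          {"hwpack": "${inject_kernel.hwpack_location}"},
    #          {"inject_kernel.hwpack_location": "hwpack.tar.gz"})
    # => {"hwpack": "hwpack.tar.gz"}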
Cheers, mwh