On 29.03.2012 06:33, Michael Hudson-Doyle wrote:
On Mon, 26 Mar 2012 19:55:59 +0200, Zygmunt Krynicki zygmunt.krynicki@linaro.org wrote:
I've registered another blueprint, this time on the LAVA Dispatcher. The goal is similar to the previous blueprint about the device manager: to gather feedback and to propose some small-step sub-blueprints that could be scheduled in 2012.04.
The general goal is to improve the way we run tests by making them more reliable and more featureful (richer in state) at the same time.
Please read and comment on the mailing list
https://blueprints.launchpad.net/lava-dispatcher/+spec/lava-dispatcher-de-em...
I basically like this. Let's do it.
Glad to hear that.
I think we can implement this incrementally by making a LavaAgentUsingClient or something in the dispatcher, although we'll have to make changes to the dispatcher too -- for example, the lava_test_run actions become a little... different (I guess the job file becomes a little less imperative? Or maybe there is a distinction between actions that execute on the host and those that execute on the board? Or something else?). But nothing impossible.
I'll write a more concrete proposal. I'm +1 on starting small and doing iterations, but I want to make it clear that the goal is to have something entirely different from what we have today. We'll start with the dispatcher source code, but let's keep our minds open ;-)
I'd like to lay down a plan for how the implementation will evolve; at each milestone we should be able to deploy this to production with full confidence.
+0.0 (current dispatcher tree)
+0.1, replace external programs with lava-serial; serial config is now constrained to a serial class (direct, networked) plus constructor data (rough sketch after this list)
Side note: with lava-device-manager the dispatcher would get this from the outside and would thus be 100% config-free
+0.2, add a mini master agent to the master rootfs and make it accept shell commands over IP; the master image is scripted through one RPC method similar to subprocess.Popen() (second sketch after this list). The dispatcher learns the board's IP over serial.
+0.3, add an improved master image agent with extra specialized methods for deployment; no shell during image deployment (download and copy-to-partition driven from Python)
+0.4, add a mini test agent to the test image before reboot: mount the test rootfs and unpack an agent tarball from the master image into it (so that agent code stays synchronized with the master image version); the test agent supplements the current serial-scripted code with simple methods (IP discovery, maybe shell execution as in +0.2)
+0.5, the test agent drives the whole test process; the dispatcher job is copied over by the master agent and data is saved to the test rootfs partition (TODO: maybe we should pick a better location?)
+0.6, the master agent takes over sending the data back to the dispatcher; no hacky/racy web servers, clean code on both ends
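
Here's a rough sketch of the +0.1 idea, just to make "class plus constructor data is the whole config" concrete. The class names are made up (this is not the actual lava-serial API) and pyserial is assumed for the direct case:

  # Rough sketch of +0.1: the only "config" is a serial class name plus
  # constructor data; class names here are illustrative, not a real API.
  import socket

  import serial  # pyserial, assumed for the direct case


  class DirectSerialConnection(object):
      """Console reached through a local device node such as /dev/ttyUSB0."""

      def __init__(self, device, baud_rate=115200):
          self._port = serial.Serial(device, baudrate=baud_rate, timeout=1)

      def read_line(self):
          return self._port.readline()

      def write(self, data):
          self._port.write(data)


  class NetworkedSerialConnection(object):
      """Console reached through a console server at (host, port)."""

      def __init__(self, host, port):
          self._sock = socket.create_connection((host, port))
          self._reader = self._sock.makefile("rb")

      def read_line(self):
          return self._reader.readline()

      def write(self, data):
          self._sock.sendall(data)


  def make_connection(serial_class, constructor_data):
      """Instantiate a connection from (class, constructor data) alone."""
      classes = {
          "direct": DirectSerialConnection,
          "networked": NetworkedSerialConnection,
      }
      return classes[serial_class](**constructor_data)

With lava-device-manager, both arguments to make_connection() would arrive from the outside, e.g. make_connection("networked", {"host": "console-server", "port": 7001}), leaving the dispatcher with no serial config of its own.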
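And a second sketch, for the +0.2 mini master agent as a single Popen-like RPC method. The stdlib XML-RPC modules are just one plausible transport, the method name and port are invented:

  # Runs on the master rootfs; serves one Popen-like method over IP.
  import subprocess
  from SimpleXMLRPCServer import SimpleXMLRPCServer


  def run_command(args):
      """Run a command on the board, subprocess.Popen-style."""
      proc = subprocess.Popen(
          args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
      stdout, stderr = proc.communicate()
      return {"returncode": proc.returncode,
              "stdout": stdout,
              "stderr": stderr}


  if __name__ == "__main__":
      # Listen on all interfaces; the dispatcher learns which address to
      # connect to by reading the board's IP off the serial console first.
      server = SimpleXMLRPCServer(("0.0.0.0", 8000), allow_none=True)
      server.register_function(run_command)
      server.serve_forever()

The dispatcher side would then be little more than xmlrpclib.ServerProxy("http://%s:8000/" % board_ip).run_command(["uname", "-a"]).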
How does this sound? I just wrote it off the top of my head, no deeper thoughts yet.
The only thing I'd change is that I don't really see a reason *not* to spam the test output over the serial line and show said spam in the scheduler web UI. We should also store it in neat files on the test image and make sure that those files are what the dashboard's view of events is based on.
Thinking about it, yeah, we may just spam the serial line for now. Ideally I'd like to get perfect separation of sources without losing correlated time. Imagine a scheduler page that has filters for kernel messages (without relying on flaky pattern matching) and can highlight them perfectly in the application run log; something like the sketch below.
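
A hypothetical record format, just to illustrate what I mean by separation with correlated time; one JSON object per line, tagged with its source and a shared timestamp:

  # Hypothetical log record: each console line carries its source and a
  # shared timestamp, so the UI can filter on "source" exactly instead of
  # pattern-matching kernel messages out of one interleaved stream.
  import json
  import time


  def log_record(source, line):
      # source: e.g. "kernel", "test-shell", "dispatcher"
      return json.dumps({"ts": time.time(), "source": source, "line": line})


  print(log_record("kernel", "[   12.345678] usb 1-1: new high-speed USB device"))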
ZK