Hi,
I was just chatting with Tyler and he asked me to share what the API I would like to use in the CI runtime to interact with LAVA would look like.
I figure it is useful to give you guys some context, so I am going to share a simple user story. I am going to use "LAVA/CI" in the following description as shorthand for LAVA plus the extra logic and dashboard we need to make this work.
The simplest user story for this tool (a rough sketch of such a job script follows the list):

1. Write a CI job
2. Check it into VCS
3. Tell LAVA/CI that the job exists
4. Ask LAVA/CI to run the job
5. LAVA/CI checks the job out of VCS
6. LAVA/CI starts running the job [a]
7. The job requests a slave machine that matches a specification
8. The job runs a series of commands on that machine
9. The job releases the machine
10. The job completes
11. The user polls LAVA/CI to see the result of the job
[a] I expect us to have a box that just sits there running these jobs (Python scripts). The scripts just run commands on other machines and shuffle files around, so 1 relatively modest computer should cope with running all the scripts at once.
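To make that concrete, here is a rough sketch of what one of those job scripts might look like. The lava_ci module and every call on it are placeholders for whatever API we settle on, not anything that exists today, and the repository URL and commands are made up.

# Hypothetical sketch of a CI job script.  The lava_ci module and all of
# its functions are placeholders for the API under discussion.
import subprocess

import lava_ci  # hypothetical client library


def main():
    # Step 7: request a slave machine matching a simple named specification
    machine = lava_ci.request_machine("x86_64")

    # Step 8: run a series of commands on that machine over SSH
    for command in ["git clone git://example.org/kernel.git",
                    "cd kernel && make -j4 zImage"]:
        subprocess.check_call(
            ["ssh", "%s@%s" % (machine.user, machine.host), command])

    # Step 9: release the machine when we are done with it
    lava_ci.release_machine(machine)


if __name__ == "__main__":
    main()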
Now, the job could be more complex. The most likely use is that an x86 machine is requested at the start of the job and used as a build slave, then the resulting image is booted and tested on an ARM board. You might perform one kernel build, derive several images from it for single-zImage testing, request one of each of your target boards, and execute your boot & test on those ARM boards in parallel. We could also request an ARM server part and several other machines as traffic generators to do some multi-node testing. The nice thing is that LAVA doesn't care - it is just handing out machines.
OK, so that is a bit of context out of the way. What do I actually want?
First, I want to ask LAVA for a machine based on a specification I give. For now, let's stick to a name for each type (x86_64 / Panda / Origen etc.). Later I think a list of keys specifying a minimum specification would be nice, such as arch=ARMv7, subarch=NEON, RAM=1GB, cpus=2. I would like to connect to it using SSH; whether that is direct or via a terminal server doesn't matter. As long as I am given sufficient information to connect and log in, I don't mind if this is key based or password based. If it is inconvenient to use SSH I am happy to add code to telnet into a board. I imagine SSH is what I will use for x86 machines.
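Purely as an illustration of the call I have in mind - the function name, the argument styles and the fields on the returned object are all made up:

# Hypothetical request API -- every name here is an assumption.
machine = lava_ci.request_machine("panda")   # by type name, for now

# ...or, later, by a minimum specification:
machine = lava_ci.request_machine(
    arch="ARMv7", subarch="NEON", ram="1GB", cpus=2)

# machine.host, machine.user and machine.ssh_key (or machine.password)
# would be enough information for me to SSH in.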
I would then like to ask LAVA to boot a machine I have reserved using a particular disk image. For x86 machines, I assume this will be from a selection of VM disk images that OpenStack can boot (I will need that listing). For ARM boards this will be a disk image I have published somewhere.
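Again purely illustrative - both call names and the image URL are guesses at what would be convenient, not an existing API:

# Hypothetical listing and boot calls.  For x86 the image would name one
# of the OpenStack VM images (hence the listing); for an ARM board it
# would point at a disk image I have published somewhere.
available = lava_ci.list_images("x86_64")
lava_ci.boot_machine(machine,
                     image="http://example.org/images/panda-test.img")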
Non-API related thought: I think it is reasonable to have an internal file server to store disk images that we create during builds, without having to push them up to snapshots.linaro.org and pull them back down. It makes far more sense to boot and test an image, then optionally upload it to the wider world. Let me know if we have this sort of temporary storage available.
I need to know if a machine is ready for me to use. I am happy to poll something.
I need to tell LAVA/CI that I have finished with a machine.
How I access that API doesn't matter to me as long as a Python library exists to allow me to interact with it without many headaches! Example code welcome :-)
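Since example code is welcome in both directions, here is the kind of polling and release I mean - every name is still an assumption about the eventual library:

import time

import lava_ci  # hypothetical client library, as above

# Poll until the reserved machine is ready for use.
while not lava_ci.machine_ready(machine):
    time.sleep(30)

# ...use the machine...

# Tell LAVA/CI I have finished with it.
lava_ci.release_machine(machine)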
If you are interested, this is an example configuration that builds a kernel: http://bazaar.launchpad.net/~linaro-automation/linaro-ci-runtime/trunk/view/...
-- James Tunnicliffe
Hey James,
Thanks for sharing your thoughts on this.
On Wed, Apr 24, 2013 at 06:37:40PM +0100, James Tunnicliffe wrote:
> Non-API related thought: I think it is reasonable to have an internal file server to store disk images that we create during builds, without having to push them up to snapshots.linaro.org and pull them back down. It makes far more sense to boot and test an image, then optionally upload it to the wider world. Let me know if we have this sort of temporary storage available.
We don't have something like this, but we probably should have one.
> I need to know if a machine is ready for me to use. I am happy to poll something.
> I need to tell LAVA/CI that I have finished with a machine.
If I understand correctly, your assumption is that you receive an interactive session on the requested device(s) and then issue commands on it. Is that correct?
Maybe it's too late to ask this, but did you consider the possibility of having the CI runtime produce "actual" LAVA jobs (i.e. a target device spec plus a non-interactive script), and then using an API to submit those jobs, poll for their completion (or block until completion, depending on the use case), and access/manipulate/address the job results, perhaps to use them as input for other jobs?
This approach would have the advantage that, since you don't directly control the device, you don't need to tell LAVA when you are finished with it: LAVA knows when the job you submitted is done. Besides, if a CI job crashes, LAVA won't sit forever waiting to be told that a given device is free, it doesn't need to care about handling those timeouts, and we don't need to worry about what the right timeout to wait for would be, etc.
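Roughly, from the CI runtime side it could look something like this - the endpoint URL and job file name are placeholders, and the scheduler.* method names and status values are from memory of our XML-RPC API, so treat them as something to double-check rather than a spec:

import time
import xmlrpclib  # Python 2 standard library

# Assumed XML-RPC endpoint and credentials; adjust to the real instance.
server = xmlrpclib.ServerProxy(
    "https://user:TOKEN@validation.example.org/RPC2/")

# A complete LAVA job: target device spec plus a non-interactive script.
job_definition = open("kernel-build-job.json").read()
job_id = server.scheduler.submit_job(job_definition)

# Poll (or block) until LAVA reports the job has finished.  LAVA owns the
# device for the whole run, so there is nothing for the CI job to release.
while server.scheduler.job_status(job_id)["job_status"] not in (
        "Complete", "Incomplete", "Canceled"):
    time.sleep(60)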
Does that make sense?
OTOH, I realize that having the ability to reserve a device and receive an interactive session on it is useful and would open up several other possibilities, so I don't necessarily think it is a bad idea at all.