Hello Neil,
On Fri, 31 May 2013 16:25:44 +0100 Neil Williams codehelp@debian.org wrote:
[]
Issues raised in http://bazaar.launchpad.net/~linaro-automation/linaro-android-build-tools/tr...
This is very similar to what the Multi-Node parent <-> child communication may need.
Thanks for looking into this, Neil, and for seeing possibilities for reuse in other scenarios. That's exactly why I wanted a wider discussion of this, led by Tyler, the architect for the entire team: with a purely "infrastructure" outlook we can miss something.
Getting a token: I see this as the service that starts the job has some secret allowing it to request the token, which is then passed onto the job.
Precisely what the Multi-Node setup could use. I'm thinking the token could be added as part of the jobdata in lava-dispatcher, commands.py:

    job = LavaTestJob(jobdata, oob_file, config, self.args.output_dir)
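To make the idea concrete, here's a minimal sketch of injecting a token into the raw JSON job definition before it reaches LavaTestJob. `attach_token` is a hypothetical helper, not existing lava-dispatcher code:

```python
import json

def attach_token(jobdata, token):
    """Add a per-job auth token to the job definition.

    `jobdata` is the raw JSON job definition string that commands.py
    already passes to LavaTestJob; the "token" key is an assumption
    for illustration, not an existing field.
    """
    job = json.loads(jobdata)
    job["token"] = token  # the test environment could read this back later
    return json.dumps(job)

# Hypothetical use in commands.py, just before constructing the job:
#   jobdata = attach_token(jobdata, token)
#   job = LavaTestJob(jobdata, oob_file, config, self.args.output_dir)
```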
My expectation for this Multi-Node support would be that the token would be written to the test environment and the test would use that to get the list of other clients for this multi-node job. Once each client has booted and "called home" to the parent using the token, clients could get the IP address and role of other clients. This then allows Multi-Node tests to make calls between clients to test their own protocols and other requirements.
Well, my question here would be: what are these "calls" that clients in a multi-node setup make among themselves? My first thought would be network testing, but as Antonio wrote in the other email, it doesn't have to be, and the primary communication channel is still serial.
I'm afraid I don't know enough about multinode to assess whether it shares enough commonality in authentication requirements with build slaves (which publish artifacts at the end of a build). Just to elaborate, with Jenkins build slaves we have:
1. A static build master which schedules builds. This system is considered trusted, as user/3rd-party code cannot be executed there.
2. The actual builds run on EC2 slaves, which come and go and also execute code specified by users. So we cannot trust the slave environment and don't want to expose any "persistent" authentication credentials there; instead, only one-off tokens should be used, so that if an attacker gets hold of one, they won't gain too much.
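The one-off token scheme in point 2 can be sketched in a few lines. All names here are illustrative, not an existing Jenkins or LAVA API; the point is just that the trusted master remembers each random token and invalidates it on first use, so a token leaked from a compromised slave is worth at most one publish:

```python
import os

class TokenIssuer:
    """Minimal sketch of one-off tokens issued by a trusted master."""

    def __init__(self):
        self._live = set()  # tokens issued but not yet redeemed

    def issue(self):
        """Hand a fresh random token to a job about to run on a slave."""
        token = os.urandom(16).hex()
        self._live.add(token)
        return token

    def redeem(self, token):
        """Valid exactly once: the token is removed on first use."""
        if token in self._live:
            self._live.remove(token)
            return True
        return False
```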
Frankly speaking, as far as I can tell, point 2 applies pretty much to the LAVA board pool too, with some differences: there's a finite set of boards (i.e. they're under more scrutiny), and they're firewalled. But again, I don't know what kind of security (vs. flexibility) you want for the multi-node setup and what's the best way to implement it...
I already have a protocol designed to do the file update thing (server has old file, client has new one but not old one) for a pet project that I was going to open source, but it is just a bit of fun and hasn't been tested, so even having got that far I would still tell other people to use rsync.
publish --token=<token> --type=<build_type> --strip=<strip> <build_id> <glob_pattern>...
This seems like a reasonable starting point. Let's make sure that it uses a configuration file to specify what to do with those build types etc. Preferably one that it can update from a public location so we don't have to re-spin the tool to add a new build type (though I guess we normally check it out of VCS as we go, so that works too).
Well, on the client side it's ideally just a single file which handles the obvious filtering options (like <glob_pattern> or --strip=) locally and passes the rest to the API/service. The server side can handle the options any way it wants; note that the options above don't require much "configuration", e.g. --type= just maps to a top-level download directory.
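A rough sketch of that client-side half, assuming --strip= behaves like `tar --strip-components` (my assumption; `collect_files` and its behavior are illustrative, not an existing tool):

```python
import glob
import os

def collect_files(patterns, strip=0):
    """Expand <glob_pattern> arguments locally and return
    (local_path, destination_path) pairs, with `strip` leading
    directory components removed from the destination. Everything
    else (token, build type, build id) would be passed through to
    the publishing service unchanged.
    """
    result = []
    for pattern in patterns:
        for path in sorted(glob.glob(pattern)):
            parts = path.split(os.sep)
            if strip < len(parts):
                dest = os.sep.join(parts[strip:])
            else:
                dest = parts[-1]  # never strip the filename itself
            result.append((path, dest))
    return result
```

So an invocation like publish --strip=2 ... 'out/target/*.img' would upload out/target/boot.img under the name boot.img.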
Just wondering how much overlap there could be between publishing for CI and publishing IP addresses between clients running a Multi-Node job.
I've only taken the briefest look at the Multi-Node aspects so far; there's a PDF I've shared with some rough flow ideas for Parent-Child-Communication.