Dave Pigott <dave.pigott@linaro.org> writes:
On 9 Jan 2013, at 02:18, Michael Hudson-Doyle <michael.hudson@linaro.org> wrote:
Dave Pigott <dave.pigott@linaro.org> writes:
OK. I'd still like to have a system ready to go as a standby in case the upgrade implodes so that we can continue to accept jobs at least.
Do you want me to create a cloud node called something like control-backup? I widened the cloud's pool of public IPs yesterday to include 192.168.0.x, so this *should* work. The reason we were hitting the stops before was that we had run out of public floating IPs. Fortunately the private pool has always been wide enough, because I had planned for that when we widened the IP address space. And just to reassure you: dnsmasq is configured never to serve addresses from the 192.168.0.x range.
Hm. I guess that makes sense, yes -- can you do that tomorrow?
Cheers, mwh
No problem. I'll email when it's done.
I can't remember the status of this, but in any case I've set up a fallback instance on dispatcher01. (I'd forgotten about the cloud-backup VM! That can be deleted now, I guess.) You can access this instance at http://fallback.validation.linaro.org/. Apache and LAVA on the server think they are serving "validation.linaro.org", and mod_headers trickery on gateway maps that name to and from fallback.validation.linaro.org -- this seems like better preparation for cutting over between control and this node.
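To illustrate the host-name mapping described above, here is a minimal Python sketch. It is not the actual gateway configuration, which lives in Apache and is not shown in this thread. The function names are hypothetical, and the assumption is that the mapping amounts to rewriting the Host header on the way in and any Location header on the way out.

```python
# Illustration only: the real mapping is done by Apache mod_headers on gateway.
# Function names here are hypothetical.
PUBLIC_HOST = "fallback.validation.linaro.org"
BACKEND_HOST = "validation.linaro.org"


def rewrite_request_headers(headers):
    """Return a copy of the request headers with Host mapped to the backend name."""
    out = dict(headers)
    if out.get("Host") == PUBLIC_HOST:
        out["Host"] = BACKEND_HOST
    return out


def rewrite_response_headers(headers):
    """Return a copy of the response headers with backend URLs mapped back."""
    out = dict(headers)
    location = out.get("Location")
    if location is not None:
        # Match on "//host" so an already-public URL is left untouched.
        out["Location"] = location.replace(
            "//" + BACKEND_HOST, "//" + PUBLIC_HOST, 1
        )
    return out
```

The point of mapping in both directions is that the server never needs to know it is being reached under the fallback name, so its own configuration can stay identical to control's.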
I've updated the Google doc at https://docs.google.com/a/linaro.org/document/d/1K_FrpM0qaDCKd6fRHyt_NDf10lb... to have some super super specific instructions around shuffling database data around (maybe these should be run via salt from gateway? That might eliminate some potential for slip-ups).
I've also updated the work items a bit on https://blueprints.launchpad.net/lava-lab/+spec/control-12.04. I wonder about these two:
Run full backup of control using something like "partimage": TODO
Do a backup of known important files (/usr/local/bin, /etc, /srv/lava, etc): TODO
I guess the former is a good idea, although if the upgrade explodes I favour installing precise from scratch over restoring control from a partition backup. For the latter, I think all these files are in salt now.
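If a file-level backup is still wanted as a belt-and-braces measure, the second work item could be sketched roughly as follows. This is only a sketch under assumptions: the directory list comes from the work item, while the function name and the use of a gzip-compressed tarball are my own choices, not anything already set up on control.

```python
import tarfile
from pathlib import Path

# Directories named in the work item; anything else would be an addition.
IMPORTANT_PATHS = ["/usr/local/bin", "/etc", "/srv/lava"]


def backup_paths(paths, archive):
    """Write a gzip-compressed tarball of the given paths.

    Paths that do not exist are skipped. Returns the paths actually archived.
    """
    archived = []
    with tarfile.open(archive, "w:gz") as tar:
        for p in map(Path, paths):
            if p.exists():
                tar.add(str(p))  # directories are added recursively
                archived.append(str(p))
    return archived
```

It would need to run as root to read everything under /etc, with the archive written somewhere off the machine being upgraded.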
I'd like to practice the "cut between control and fallback" steps next Monday, which will mean some small downtime. I'll send an announcement soon.
Cheers, mwh