On 7 Jan 2013, at 20:38, Michael Hudson-Doyle <michael.hudson@linaro.org> wrote:
Andy Doan <andy.doan@linaro.org> writes:
On 01/06/2013 06:21 PM, Michael Hudson-Doyle wrote:
Michael Hudson-Doyle <michael.hudson@canonical.com> writes:
I've started a Google Doc on this subject: https://docs.google.com/a/linaro.org/document/d/1K_FrpM0qaDCKd6fRHyt_NDf10lb...
I've updated this to include thoughts on how to upgrade postgres to 9.1. I'd like to do this soon (my next Monday? They're usually quiet). There will be a few minutes of total LAVA downtime.
That's cool, and including the script is really great.
OK. I'll send an announcement about the downtime today.
+1 - Made a lot of sense when I read through the script last night.
Setting up streaming replication to enable us to cut over to the failover system is starting to seem like too much work for this. We could just rsync the postgres data across after shutting down LAVA on control, and rsync it back again just before restarting it after the upgrade.
The other option is to just take the downtime while the upgrade happens. It really shouldn't take that long, but I can't really think of a way to gauge the potential cost ahead of time.
From what I read, the downtime would probably be less than 3 minutes. I hate to waste developer *days* to save 3 minutes of uptime.
OK. I'd still like to have a system ready to go as a standby in case the upgrade implodes so that we can continue to accept jobs at least.
Do you want me to create a cloud node called something like control-backup? I widened the pool of public IPs for the cloud yesterday to add 192.168.0.x, so this *should* work. The reason we were hitting the stops before was that we had run out of public floating IPs. As luck would have it, the private pool has always been wide enough, because I had planned for when we widened the IP address space. And just to reassure you: dnsmasq is configured to never serve from the 192.168.0.x address range.
Thanks
Dave