Hi Michael,
First, let me introduce Matt Hart, our new LAVA Lab Engineer. He and I are going to work on this failover and upgrade over the next week, so I'd like to find out where things stand. If you haven't had time to look at this, maybe Matt could pick it up?
Thanks
Dave
On 17 Apr 2013, at 01:35, Michael Hudson-Doyle <michael.hudson@linaro.org> wrote:
Michael Hudson-Doyle <michael.hudson@linaro.org> writes:
Hi all,
I'm going to test some scripts I wrote to fail the LAVA database over to another server in a couple of hours (we will use this during the upgrade of precise to control too). This will cause two forms of disruption:
- I've already offlined all boards and am waiting for the jobs to finish, so you might have to wait a little longer for your LAVA jobs to finish.
- There will be some very short moments of complete outage as the failover happens.
Apologies in advance if this causes you difficulties -- but I hope having better disaster recovery for the lab is a good goal :-)
So this didn't quite work out -- the outage was probably a few minutes in total, and the failover didn't succeed. Three problems:
- trivial syntax mistakes in my script (not a problem really)
- my scripts only changed http traffic to point at the failover node, not https
- the failover node was configured to serve lava at /, not /lava-server/ as we currently do for production (for extremely hysterical raisins)
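The second and third problems are the kind a pre-flight check could catch before any traffic is switched. What follows is only a sketch of such a check, not part of the actual failover scripts: the host name failover.example.org and both helper names are hypothetical, and it assumes the failover node should answer on both http and https under /lava-server/.

```python
import urllib.error
import urllib.request
from urllib.parse import urlunsplit


def failover_check_urls(host, prefix="/lava-server/", schemes=("http", "https")):
    """Build the URLs the failover node is expected to answer on.

    `host` is a hypothetical failover host name; `prefix` is the path
    production serves LAVA under.
    """
    # Normalise the prefix so "lava-server" and "/lava-server/" behave the same.
    if not prefix.startswith("/"):
        prefix = "/" + prefix
    if not prefix.endswith("/"):
        prefix += "/"
    return [urlunsplit((scheme, host, prefix, "", "")) for scheme in schemes]


def url_is_served(url, timeout=10):
    """Return True if `url` answers with HTTP 200, False on any error."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Covers connection failures, TLS errors and non-2xx responses.
        return False
```

Calling url_is_served() on each entry of failover_check_urls("failover.example.org") before switching traffic would flag a node that only answers on http, or that serves LAVA at / instead of /lava-server/.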
All the above is easily enough fixed, and I'll try again tomorrow.
Apologies again for the disruption.
Cheers, mwh
linaro-validation mailing list
linaro-validation@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-validation