OK. Progress (or not) update:
On 18 Oct 2012, at 03:01, Michael Hudson-Doyle <michael.hudson@linaro.org> wrote:
Dave Pigott <dave.pigott@linaro.org> writes:
So, to summarise, here is what I'll do, and the order in which I plan to do it:
- Take fast models offline
The two fast models were stuck running health jobs (and had been since yesterday afternoon), with nothing in the log. I cancelled the jobs, but they got stuck in cancelling. I did a kill -2 (SIGINT) on the processes. Still stuck. In the end I manually set the board status to offline.
- Take snapshots of dogfood, staging and fastmodels01/03 (can't do 02 as it's broken)
Sigh. Because of the cloud node states, the snapshotting is stuck. Although the instances are running, the control node can't see them.
So my plan from here is to update and reboot all the cloud nodes.
Thanks
Dave
- update/upgrade all cloud nodes
- reboot the cloud
- Work on fastmodels02
+1. Hopefully we don't have to go through all this too often!
Cheers, mwh