On 09/19/2012 06:01 AM, Alexander Sack wrote:
> Do we know why we have regular networking issues on master images still? Can we have an effort to nail this down? How can we do that?
We attempted to add some debugging in the past, but so far nothing has helped much. The biggest problem we have now is reproducing the issue. If you ignore the TC2 failures, which are skewing the results a little at the moment, we have about a 5% failure rate (for one two-week period we actually had 1%!). Of that 5%, 50% are network related:
  - pinging control fails
  - downloading *.tgz in master fails
So out of 100 runs we get about 2 "wget" type failures and 2 "ping" type failures. Regardless of how small the number is, it's _half_ of our issues, so we do get a good bang for our buck by improving it.