On 8 June 2012 06:01, Michael Hudson-Doyle michael.hudson@linaro.org wrote:
YongQin Liu yongqin.liu@linaro.org writes:
Hi, All
I just submitted 100 jobs with panda-ics-gcc47-tilt-stable-blob#18 android images. 4 of them failed because of the network problem: 3 on panda01 and 1 on panda06. You can see the details here:
https://docs.google.com/a/linaro.org/spreadsheet/ccc?key=0AnxpY5uv-BlNdG9zYT...
For the download problem with the image files, I guess we can set the retry number to 5.
Makes sense. I think waiting 5 minutes between retries is probably a touch excessive; maybe we should scale that down too.
This time I saw one job on panda01 succeed in downloading the image files on the 2nd try.
I've seen this happen a few times too.
I have filed a bug about this. https://bugs.launchpad.net/lava-dispatcher/+bug/1010285
Do you think setting the wait time to 1 minute is OK?
Thanks, Yongqin Liu
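To make the proposal concrete, here is a rough sketch of the retry behaviour being discussed: up to 5 attempts with a 1 minute wait between them. The function name, its parameters and the fetch callable are all hypothetical and are not taken from lava-dispatcher; the real download code may be structured quite differently.

```python
import time


def download_with_retry(fetch, attempts=5, wait_seconds=60, sleep=time.sleep):
    """Call fetch() until it succeeds or the attempts run out.

    Hypothetical helper for illustration only, not the lava-dispatcher API.
    fetch is any callable that performs the download and returns a result.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return fetch()
        except OSError as error:
            # In Python 3 most network failures (socket errors, URLError)
            # are OSError subclasses, so this is assumed to cover them.
            last_error = error
            if attempt < attempts:
                # Wait before the next try, but not after the final one.
                sleep(wait_seconds)
    raise last_error
```

With these defaults a download that fails every time gives up after 4 waits, so about 4 minutes of waiting in total, compared with about 20 minutes if the wait stayed at 5 minutes.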
For the network problem, is it related only to specific boards (like panda01), or to the entire lab network? Does anyone have any thoughts about it?
I don't know what's going on at all. There is flakiness somewhere, but I don't know if it's in the panda master kernel, the hardware of some of the pandas or somewhere in our lab's setup. It's interesting that it happens more on particular devices though, which suggests a more hardware-side problem (whether with the panda or a loose cable or something else).
Another thing I suggest is that we switch to the panda stable images in our health job here.
+1
Cheers, mwh