OK. It looks like the logic in job.py might be flawed. If rebooting fails, the job continues to execute the next command regardless. In this case, deploy_linaro_image failed, but the job carried on to the next step, which is to boot the test image. I'll try to unravel the logic and suggest a fix.
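To make the behaviour I'd expect concrete, here is a minimal sketch. It is not the actual job.py code: run_actions, ActionFailed and boot_test_image are made-up names for illustration, and deploy_linaro_image here is just a stand-in that always fails. The idea is simply that a failed action should stop the job rather than fall through to the next one.

```python
class ActionFailed(Exception):
    """Hypothetical exception raised by an action that could not complete."""


def run_actions(actions):
    """Run (name, callable) pairs in order and stop at the first failure.

    Returns the names of the actions that completed successfully.
    """
    completed = []
    for name, action in actions:
        try:
            action()
        except ActionFailed as exc:
            # Abort here instead of carrying on to the next action.
            print("%s failed: %s; skipping remaining actions" % (name, exc))
            break
        completed.append(name)
    return completed


def deploy_linaro_image():
    # Stand-in for the real deployment step; always fails in this sketch.
    raise ActionFailed("reboot failed")


def boot_test_image():
    # Stand-in (hypothetical name) for booting the test image.
    print("booting test image")


if __name__ == "__main__":
    run_actions([
        ("deploy_linaro_image", deploy_linaro_image),
        ("boot_test_image", boot_test_image),
    ])
```

With the two stand-ins above, the deploy step fails and the boot step is never attempted, which is the opposite of what the log shows happening today.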
Thanks
Dave
On 10 Oct 2012, at 08:35, Dave Pigott <dave.pigott@linaro.org> wrote:
Hi all,
I found an interesting health failure today on origen07
http://validation.linaro.org/lava-server/scheduler/job/35016/log_file
When you look at the log, you see that the board starts off at the u-boot prompt. It then tries to do a "reboot", which (obviously) fails. So naturally, it then does a hard reset, and this is where it does something very odd: it interrupts the boot and tries to boot the previously installed test image. I haven't yet looked at the dispatcher code to figure out why (that's my next job).
What then started alarm bells ringing was that I saw this:
1261680 bytes read
reading uInitrd
1532597 bytes read
reading board.dtb
** Unable to read "board.dtb" from mmc 0:5 **
So whatever the test image was, it was expecting a device tree blob. I would have assumed the blob had to be installed during deploy_linaro_image(), since if there is one it should just be part of the test boot deployment.
So I looked at the log from the previous job:
http://validation.linaro.org/lava-server/scheduler/job/34938/log_file#entry2...
and sure enough, you'll see at that mark the same issue.
So there are two things:
- There's some twisted logic in the dispatcher that's making it do odd things if it starts off in u-boot
- Do we have an issue with dtbs not being handled properly by lava, or is it just that the hwpack was incomplete?
Thanks
Dave