linaro-validation

linaro-validation@lists.linaro.org

818 discussions

by Andy Doan

I was just doing a manual update of staging and saw that it broke because the dashboard code now requires a newer version of linaro-dashboard-bundle. I doubt we've packaged and deployed that new version, so to address it for now I created a local branch for that module as well. This is mostly a note that we'll need to remember to make the appropriate updates (lava-manifests ...) when we do our next roll-out of dashboard changes.

13 years, 3 months

Re: [Linaro-validation] LAVA doc update

by Dave Pigott

Hi James,

Added the validation list to your e-mail for future reference. :)

Dave

On 8 Nov 2012, at 14:45, James Tunnicliffe <james.tunnicliffe(a)linaro.org> wrote:

> (sorry Dave, validation(a)linaro.org bounced. Not sure what your team's
> address is - please forward)
>
> http://askubuntu.com/questions/211603/problems-with-nat-adapater-since-upgr…
>
> This got me. NAT doesn't work in 12.10, but the proposed fix works:
>
> VBoxManage modifyvm "name" --natdnshostresolver1 on
> VBoxManage modifyvm "name" --natdnsproxy1 on
>
> Not sure if it is worth mentioning in the docs or not.
>
> --
> James Tunnicliffe

13 years, 3 months

Lab health checks

by Dave Pigott

OK. Since Connect and the usual catch up, I've only just got round to looking at the health failures on production. Quite a few problems, with a common theme...

------------ origen07 ------------
Network died. USB ethernet dongle was actually dead. Replaced it. Still doesn't work. We have Origens falling like flies at the moment!

------------ origen09 ------------
http://validation.linaro.org/lava-server/scheduler/job/37461
Looks like the board hung while getting the test images. Rebooted and put back online.

------------ panda01 ------------
http://validation.linaro.org/lava-server/scheduler/job/38038
Looks like control had a glitch and we kept getting "connection reset by peer" trying to wget. Back online.

------------ panda02 ------------
http://validation.linaro.org/lava-server/scheduler/job/38039
Same as panda01. Put back online and it *still* failed in the same way. Will investigate.

------------ panda05 ------------
http://validation.linaro.org/lava-server/scheduler/job/37905
This one would have been caught by the "output returns to get prompt" and/or the "try three times" boot fixes.

------------ panda06 ------------
http://validation.linaro.org/lava-server/scheduler/job/37726
Same as panda05.

------------ panda11 ------------
http://validation.linaro.org/lava-server/scheduler/job/36722
Very odd. Looks like the test image, or the test partition, is corrupted. Went on to the board and rebooted the test image, same problem. Will re-test. If it fails, will replace the SD card.

------------ panda13 ------------
http://validation.linaro.org/lava-server/scheduler/job/37019
Very similar to panda11, except that this was in the android image. Same action as panda11.

------------ panda17 ------------
http://validation.linaro.org/lava-server/scheduler/job/37291
The clue here is these lines:

mountall: fsck / [1088] terminated with status 4
mountall: Filesystem has errors: /
Errors were found while checking the disk drive for /.
Press F to attempt to fix the errors, I to ignore, S to skip mounting, or M for manual recovery

We never hit the prompt because it needed to do an fsck. Looked on the board and all ok now. Retest.

------------ panda21 ------------
http://validation.linaro.org/lava-server/scheduler/job/37864
Similar to panda11 and 13. Same action taken.

------------ panda22 ------------
http://validation.linaro.org/lava-server/scheduler/job/37294
Same as panda17.

------------ panda23 ------------
http://validation.linaro.org/lava-server/scheduler/job/37779
Same as panda05.

---------------- panda-es01 ----------------
http://validation.linaro.org/lava-server/scheduler/job/37676
Similar to panda11, 13 and 21.

---------------- snowball02 ----------------
http://validation.linaro.org/lava-server/scheduler/job/37295
Looks like the android test image just hangs on boot. Went on the board, launched it. Same thing. Retest.

---------------- snowball06 ----------------
http://validation.linaro.org/lava-server/scheduler/job/36157
eth0 didn't come up in the master image. Would have been fixed by the "reboot" fix.

13 years, 3 months

panda24 pulled out of looping mode in staging

by Andy Doan

Hey Guys, I'm doing a final test on panda24 before deploying a new dispatcher in production. This means it's no longer in the looping health job mode we were doing. -andy

13 years, 3 months

Staging failure

by Dave Pigott

http://staging.validation.linaro.org/scheduler/job/35505/log_file This is a bit odd. It got confused when we were in the u-boot prompt while trying to boot up the android test image. It may be some code flaw, though I can't see what, other than it took 5 minutes to get from reboot to that point. Perhaps this calls for a similar approach to booting test images, i.e. if it fails, try a couple more times. May be an edge case, but would put our reliability way up. Along with that, we may have to up the timeouts for booting, given we might do it 3 times. Thoughts? Dave

13 years, 3 months

LAVA runs fail for ci-LT-TI-working-tree-3_4

by Fathi Boudra

On 5 November 2012 17:35, David Long wrote:

> It looks to me like our TI 3.4 nightly LAVA runs are broken for all test
> cases (not just the TILT stuff). Is someone investigating this?

I'm looking at the ci-LT-TI-working-tree-3_4 stream. It seems the tests have failed to run since 2012-10-24. lava-test doesn't install successfully:

OperationFailed: executing u'apt-get -o Acquire::http::proxy=http://192.168.1.10:3128/ update' failed with code 100

I see other errors on other streams since the same date...

13 years, 3 months

Staging failures

by Dave Pigott

OK. We've had 3 failures out of 250 in 24 hours - 98.8% - better than the 16 or so we were getting, but still...

---------------- snowball10 ----------------
http://staging.validation.linaro.org/scheduler/job/35883
Test image kernel died on boot.

---------------- beaglexm04 ----------------
http://staging.validation.linaro.org/scheduler/job/35901 and http://staging.validation.linaro.org/scheduler/job/35838
Corrupted serial output in test image. Michael's new code caught it, but it was so corrupted it still failed. Next check went ok.

Conclusion: All 3 of these could have been fixed by attempting a reboot of the test image, just like we're now doing on the master. Of course, one could take the view that the corrupted serial line means a broken board, but the fact that it came back and ran consistently afterwards suggests we just hit some edge case. I'll file a bug.

I strongly believe that we are close to achieving our 99.9%+ target set by Alexander.

Thanks

Dave
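The arithmetic behind the 98.8% figure, and what the 99.9% target implies for a run of this size, can be sketched as follows. This is only an illustration: the function names `pass_rate` and `max_failures_for_target` are hypothetical and not part of any LAVA tool.

```python
import math
from fractions import Fraction


def pass_rate(total_jobs: int, failures: int) -> float:
    """Return the percentage of jobs that did not fail."""
    if total_jobs <= 0:
        raise ValueError("total_jobs must be positive")
    return 100.0 * (total_jobs - failures) / total_jobs


def max_failures_for_target(total_jobs: int, target_percent: str) -> int:
    """Largest number of failures that still meets the target pass rate.

    The target is passed as a string (e.g. "99.9") so it can be turned
    into an exact fraction, avoiding floating-point rounding surprises.
    """
    allowed_fraction = 1 - Fraction(target_percent) / 100
    return math.floor(total_jobs * allowed_fraction)


print(pass_rate(250, 3))                      # 3 failures out of 250 jobs
print(max_failures_for_target(250, "99.9"))   # failures allowed at 99.9%
```

At 250 jobs a day, a 99.9% target leaves room for less than one failure, so under this reading every remaining failure mode matters.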

13 years, 3 months

Staging health checks

by Dave Pigott

Hi all, I've just checked in and deployed a more general fix for the network failures on staging, and it's running through its paces now. Essentially, the change now tries to reboot three times if *anything* goes wrong, not just when the network fails to come up. Looking back at the failures of the last 24 hours (2 out of 158), both of those would have been fixed by my change. We'll know how well we've done in 24 hours' time. Thanks Dave
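The "try up to three times if anything goes wrong" idea can be sketched roughly as below. This is not the dispatcher's actual code: `boot_with_retries` and the `boot_once` callable are hypothetical names, and the sketch assumes any failure during boot surfaces as a Python exception.

```python
from typing import Callable


def boot_with_retries(boot_once: Callable[[], None], max_attempts: int = 3) -> int:
    """Call boot_once until it succeeds, up to max_attempts times.

    Returns the attempt number that succeeded. Any exception counts as a
    failed attempt, mirroring "if *anything* goes wrong".
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            boot_once()
            return attempt
        except Exception as exc:  # deliberately broad: retry on any failure
            last_error = exc
    raise RuntimeError(f"boot failed after {max_attempts} attempts") from last_error


# Usage: a stand-in boot function that fails twice, then succeeds.
calls = {"count": 0}


def flaky_boot() -> None:
    calls["count"] += 1
    if calls["count"] < 3:
        raise OSError("network did not come up")


print(boot_with_retries(flaky_boot))
```

Because each attempt can take as long as a full boot, the overall timeout for the boot step would need to cover up to `max_attempts` boots.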

13 years, 3 months

The architecture of Graphite

by Michael Hudson-Doyle

Although munin is working out fine for us currently, an option I'd prefer in some ways is graphite. This document: http://www.aosabook.org/en/graphite.html explains how graphite works in a way that made way more sense to me than anything else I'd read. Cheers, mwh

13 years, 3 months

Initial version of yaml testdef support

by Senthil Kumaran S

Hi,

I made code changes to support yaml based testdefs. I shall show you the demo tomorrow of what I have and we can discuss the yaml structure in more detail and get more stuff in.

My sample testdef looks like the following (testdef.yaml):

<snip1>
metadata:
    name: simple
    version: 1.0
    format: lava-test v1.0

environment:
    image-type: [beagle]

install:
    url:
    steps:

run:
    steps:
        - /bin/echo cache-coherency-switching - PASS
        - ls
        - pwd

parse:
    pattern: (?P<test_case_id>.*-*)\\s+:\\s+(?P<result>(PASS|FAIL))
    fixupdict:
        PASS: pass
        FAIL: fail
</snip1>

Following is a snippet from a sample run:

<snip2>
cache-coherency-switching - PASS
install.sh run.sh testdef.yaml
/lava/tests/0_simple
<LAVA_TEST_RUNNER>: 0_simple exited with: 0
0_simple-1351674922 build.txt cpuinfo.txt meminfo.txt pkgs.txt
<LAVA_TEST_RUNNER>: exiting
<LAVA_DISPATCHER>2012-10-31 02:44:54 PM INFO: lava_test_shell seems to have completed
<LAVA_DISPATCHER>2012-10-31 02:44:54 PM INFO: attempting a filesystem sync before power_off
linaro-test [rc=0]# sync
sync
linaro-test [rc=0]#
<LAVA_DISPATCHER>2012-10-31 02:44:57 PM INFO: [ACTION-E] lava_test_shell is finished successfully.
<LAVA_DISPATCHER>2012-10-31 02:44:57 PM INFO: Submitting the test result with parameters = {u'stream': u'/anonymous/stylesen/', u'server': u'http://10.155.13.219/RPC2/'}
dashboard-put-result: http://10.155.13.219/dashboard/permalink/bundle/9fdccd73c7e825c2eec7850e61d…
<LAVA_DISPATCHER>2012-10-31 02:44:57 PM INFO: Dashboard : http://10.155.13.219/dashboard/permalink/bundle/9fdccd73c7e825c2eec7850e61d…
</snip2>

Thank You.

--
Senthil Kumaran S
http://www.stylesen.org/
http://www.sasenthilkumaran.com/
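To illustrate how a `parse` pattern and `fixupdict` like the ones in the testdef might be applied to output lines, here is a minimal sketch. It is not the lava-test implementation: `parse_line` is a hypothetical helper, and the sketch assumes the doubled backslashes in the testdef are an escaping artifact, so the intended regular expression uses `\s+:\s+` as the separator.

```python
import re

# Assumption: the pattern from the testdef with "\\s" read as the regex "\s".
PATTERN = re.compile(r"(?P<test_case_id>.*-*)\s+:\s+(?P<result>(PASS|FAIL))")

# Maps the raw result strings in the output to normalised result values.
FIXUPDICT = {"PASS": "pass", "FAIL": "fail"}


def parse_line(line: str):
    """Return (test_case_id, normalised_result), or None if the line does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return match.group("test_case_id"), FIXUPDICT[match.group("result")]


print(parse_line("cache-coherency-switching : PASS"))
print(parse_line("cache-coherency-switching - PASS"))
```

Under this reading the separator has to be a colon, so a line of the form `cache-coherency-switching - PASS` (as echoed by the sample `run` step) would not be picked up as a result; that may be worth checking when the yaml structure is discussed.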

13 years, 3 months
