Now that we have to spend less time looking at failed health jobs, we should start looking at stuck jobs:
------------
origen02
------------
http://validation.linaro.org/lava-server/scheduler/job/39388
Been running since Nov. 20, 2012, 3:57 a.m.
Submitted its bundle, but just never actually stopped. At the end, it had failed because the Android home screen never displayed, i.e. bootanim never stopped. Don't know if that is relevant or not.
I cancelled the job but, as often happens in these cases, it ended up stuck in a continual cancelling state. Went onto control and did a kill -2.
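For reference, a rough sketch of that manual cleanup step; the pgrep pattern below is my assumption about how the stuck dispatcher shows up on control, not an exact recipe:

    # Hypothetical helper mirroring the manual cleanup: find the dispatcher
    # process for a stuck job on control and send it SIGINT (kill -2).
    # The "lava-dispatch" pattern is an assumption about the process name.
    import os
    import signal
    import subprocess

    def kill_stuck_dispatcher(job_id):
        try:
            out = subprocess.check_output(["pgrep", "-f", "lava-dispatch.*%d" % job_id])
        except subprocess.CalledProcessError:
            print("no dispatcher process found for job %d" % job_id)
            return
        for raw in out.split():
            pid = int(raw.decode())
            print("sending SIGINT to pid %d" % pid)
            os.kill(pid, signal.SIGINT)

    if __name__ == "__main__":
        kill_stuck_dispatcher(39388)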
----------------
snowball08
----------------
http://validation.linaro.org/lava-server/scheduler/job/39394
Running since Nov. 20, 2012, 4:44 a.m.
Again, it failed to get into the Android test image and submitted its results, but is stuck. Looked on control - no process. Cancelled the job; again, stuck. Did a kill -2.
----------------
snowball03
----------------
http://validation.linaro.org/lava-server/scheduler/job/39894
Running since Nov. 24, 2012, 5:54 p.m.
Same as the others. Failed to get into Android test image, stuck in cancelling. Kill -2.
----------------
snowball07
----------------
http://validation.linaro.org/lava-server/scheduler/job/39940
Running since Nov. 25, 2012, 4:39 a.m.
Same.
----------------
snowball04
----------------
http://validation.linaro.org/lava-server/scheduler/job/39970
Running since Nov. 25, 2012, 8:46 a.m.
Same.
Thanks
Dave
------------
panda12
------------
http://validation.linaro.org/lava-server/scheduler/job/39734
Failed to boot the test image, with lots of errors coming out. Went onto the board, booted into the test image and set the proxy, all fine, so a "reboot the test image and retry" fix would have sorted this one. Put back online.
Thanks
Dave
Hi all,
Two last night, which means we're averaging approximately one health check failure per day, i.e. roughly a 95% pass rate. Not great.
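Working backwards from those figures, one failure a day at a 95% pass rate puts us at somewhere around 20 health check runs a day; the daily run count below is inferred from those two numbers, not measured.

    # Back-of-the-envelope check of the 95% figure; the daily run count
    # is inferred from the stated numbers, not measured.
    failures_per_day = 1
    health_checks_per_day = 20  # inferred, not measured
    pass_rate = 100.0 * (health_checks_per_day - failures_per_day) / health_checks_per_day
    print("%.0f%% pass rate" % pass_rate)  # prints: 95% pass rate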
------------
panda04
------------
http://validation.linaro.org/lava-server/scheduler/job/39484
When it got into the test image, the device was spewing out lots of weird error messages. Went onto the board and rebooted the test image: same problem. The shell prompt was also corrupted. I wasn't sure whether this was a board, SD card or corrupt image deployment problem, so I booted the master image, and that seems fine. Putting it back online to see if it was a one-off corruption.
If the board passes this time, then the only way to fix this problem would be to arrange that, if things fail in the test image for some reason, we go round and do it all again, including deployment, because just rebooting the test image wouldn't have worked.
------------
panda06
------------
http://validation.linaro.org/lava-server/scheduler/job/39477
wget weirdness. It kept getting "Connection reset by peer" and then retrying. Putting it back online to see if it's a one-off glitch.
If the board passes this time, then the way to fix this problem is: if deployment fails, reboot to the master image and try again.
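For both boards the fix boils down to the same retry loop in the dispatcher; a minimal sketch of the idea, where the board methods are hypothetical placeholders rather than real dispatcher APIs:

    # Sketch of the "go round and do it all again" fix: on any failure,
    # fall back to the master image, redeploy and retry. The board methods
    # are hypothetical placeholders, not lava-dispatcher APIs.
    MAX_ATTEMPTS = 3

    def run_with_retries(board, image_url):
        for attempt in range(1, MAX_ATTEMPTS + 1):
            try:
                board.boot_master_image()      # known-good starting point
                board.deploy_image(image_url)  # covers the wget/deployment failures
                board.boot_test_image()        # covers the corrupt test image case
                return True
            except Exception as exc:
                print("attempt %d failed (%s), redeploying from scratch" % (attempt, exc))
        return False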
Thanks
Dave
Hi Andy & Michael,
About the problem where the telnet process consumes CPU (bug 1034218 <https://bugs.launchpad.net/linaro-android/+bug/1034218>):
For now I have tried two ways to verify it:
1. Run the CTS test by submitting a LAVA job.
In this case, the process that consumes CPU is telnet.
2. Run the CTS test from the command line with "lava-android-test run cts".
In this case, no process consumes 100% CPU, even though I also opened a telnet session at the same time.
So I guess the problem is the way we call the telnet command in lava-dispatcher.
From my investigation, it's the select syscall in telnet that consumes the CPU, so I suspect there is some place in lava-dispatcher that reads the output of telnet in a loop without sleeping in the loop, but I did not find such a place in lava-dispatcher.
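To show the kind of loop I am looking for, here is a contrived illustration; this is not lava-dispatcher code, just the suspected pattern:

    # Not lava-dispatcher code; just an illustration of the suspected pattern.
    import os
    import select

    def busy_read(fd):
        # Polling with a zero timeout and never sleeping: select() returns
        # immediately every time, so the loop spins and pins a CPU core.
        while True:
            readable, _, _ = select.select([fd], [], [], 0)
            if fd in readable:
                if not os.read(fd, 4096):
                    return

    def patient_read(fd):
        # Blocking in select() until output arrives (or a timeout expires)
        # lets the process sleep in the kernel, so CPU usage stays low.
        while True:
            readable, _, _ = select.select([fd], [], [], 1.0)
            if fd in readable:
                if not os.read(fd, 4096):
                    return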
What do you think?
Finally, I feel that this is a problem in lava-dispatcher, not in lava-android-test or CTS, so can we change it to be a lava-dispatcher bug?
--
Thanks,
Yongqin Liu
---------------------------------------------------------------
#mailing list
linaro-android(a)lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-android
linaro-validation(a)lists.linaro.org
http://lists.linaro.org/pipermail/linaro-validation
Hi all,
David Zinman contacted me with an urgent request. Apparently Mathieu's TC2 is playing up and he urgently needs a replacement. Could we liberate one from the lab while Mathieu returns his, and then deal with fixing it (if it's the motherboard, I have a spare; if it's the tile, we'll get ARM to swap it out)?
Looking at the TC2 load, it's not so bad that we're going to block on it.
Thanks
Dave
Just the one, although the report says two. Did someone else put one online that had failed?
---------------------
vexpress-tc2-01
---------------------
http://validation.linaro.org/lava-server/scheduler/job/39042
This *looks* like there was an existing telnet session on the board that hadn't closed. I went on the board and it was fine, so whatever/whoever it was had disconnected.
Rebooted and put back online.
Thanks
Dave
Hey Guys,
We currently use ubuntu-desktop RFSes for our daily health checks in LAVA. It's my understanding that the dev-platform team will be sunsetting the ubuntu-desktop images.
Due to this, I think we (the LAVA team) need to think about moving to a new set of images for our health checks so that they better reflect reality. 2012.11 should be producing some pre-built images for both server and nano. I think picking the pre-built server images probably makes more sense, since it gives us a little more coverage than nano, but I don't have a strong sense of whether it makes much difference, since we basically just do boot testing in our health checks.
= So what does this mean?
I think we'll need to file a 12.12 blueprint for the health check investigation work. Essentially, we should build a job for each device-type in production based on its health job, update the image URL to point at the new candidate, submit it 100 times and do a failure analysis. At that point we cross our fingers and hope the failures aren't worse than what we currently see. If they aren't, we can switch the image over; if they are, we'll need to work with the dev-platform team on getting the new issues addressed.
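For the submission step, a rough sketch of what I have in mind, assuming the scheduler's XML-RPC interface (scheduler.submit_job), with the endpoint URL and job file name as placeholders and auth token handling omitted:

    # Rough sketch: submit the candidate health check job N times so we can
    # do failure analysis on the results. The endpoint URL, job file name
    # and token handling are placeholders/assumptions, not a recipe.
    import xmlrpc.client

    SERVER = "https://validation.linaro.org/lava-server/RPC2/"  # placeholder endpoint
    JOB_FILE = "panda-health-candidate.json"                     # hypothetical candidate job
    RUNS = 100

    def main():
        server = xmlrpc.client.ServerProxy(SERVER)
        with open(JOB_FILE) as f:
            job_json = f.read()
        job_ids = []
        for i in range(RUNS):
            job_id = server.scheduler.submit_job(job_json)
            job_ids.append(job_id)
            print("run %3d -> job %s" % (i + 1, job_id))
        print("submitted %d jobs for failure analysis" % len(job_ids))

    if __name__ == "__main__":
        main()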
make sense?
who wants to help? :)
-andy