We're going to be talking about test case management in LAVA at the
Connect. I've brain-dumped some of my thoughts here:
https://wiki.linaro.org/Platform/LAVA/Specs/TestCaseManagement
Comments welcome. But if all you do is read it before coming to the
session, that's enough for me :-)
Cheers,
mwh
Hi,
I've made some code changes to support YAML-based testdefs. I'll show
you a demo tomorrow of what I have, and we can discuss the YAML
structure in more detail and get more stuff in.
My sample testdef looks like the following (testdef.yaml):
<snip1>
metadata:
    name: simple
    version: 1.0
    format: lava-test v1.0
environment:
    image-type: [beagle]
install:
    url:
    steps:
run:
    steps:
        - /bin/echo cache-coherency-switching - PASS
        - ls
        - pwd
parse:
    pattern: (?P<test_case_id>.*-*)\s+:\s+(?P<result>(PASS|FAIL))
    fixupdict:
        PASS: pass
        FAIL: fail
</snip1>
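As an aside, a pattern/fixupdict pair like the one above can be tried
out with a few lines of Python. This is just a standalone sketch, not
the real lava-test parser; note I've matched on the " - " separator
that appears in the sample output, whereas the testdef's own pattern
expects " : ":

```python
import re

# Standalone sketch of applying a parse pattern plus a fixupdict, in the
# spirit of the testdef above. This regex uses the " - " separator seen
# in the sample output, not the " : " the testdef's pattern expects.
pattern = re.compile(r"(?P<test_case_id>.*)\s+-\s+(?P<result>PASS|FAIL)")
fixupdict = {"PASS": "pass", "FAIL": "fail"}

def parse_line(line):
    """Return (test_case_id, result) for a matching line, else None."""
    match = pattern.search(line)
    if match is None:
        return None
    return match.group("test_case_id"), fixupdict[match.group("result")]

print(parse_line("cache-coherency-switching - PASS"))
# -> ('cache-coherency-switching', 'pass')
```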
Following is a snippet from a sample run:
<snip2>
cache-coherency-switching - PASS
install.sh
run.sh
testdef.yaml
/lava/tests/0_simple
<LAVA_TEST_RUNNER>: 0_simple exited with: 0
0_simple-1351674922 build.txt cpuinfo.txt meminfo.txt pkgs.txt
<LAVA_TEST_RUNNER>: exiting
<LAVA_DISPATCHER>2012-10-31 02:44:54 PM INFO: lava_test_shell seems to
have completed
<LAVA_DISPATCHER>2012-10-31 02:44:54 PM INFO: attempting a filesystem
sync before power_off
linaro-test [rc=0]# sync
sync
linaro-test [rc=0]# <LAVA_DISPATCHER>2012-10-31 02:44:57 PM INFO:
[ACTION-E] lava_test_shell is finished successfully.
<LAVA_DISPATCHER>2012-10-31 02:44:57 PM INFO: Submitting the test result
with parameters = {u'stream': u'/anonymous/stylesen/', u'server':
u'http://10.155.13.219/RPC2/'}
dashboard-put-result:
http://10.155.13.219/dashboard/permalink/bundle/9fdccd73c7e825c2eec7850e61d…
<LAVA_DISPATCHER>2012-10-31 02:44:57 PM INFO: Dashboard :
http://10.155.13.219/dashboard/permalink/bundle/9fdccd73c7e825c2eec7850e61d…
</snip2>
Thank You.
--
Senthil Kumaran S
http://www.stylesen.org/
http://www.sasenthilkumaran.com/
Hi all,
I just ran a CTS job on staging now, and found that the CPU usage of
the telnet process is nearly 100%.
Does anyone have any idea about that?
Below is the link of the collectd information installed on staging:
http://staging-metrics.validation.linaro.org:8080/collectd/bin/index.cgi?ho…
The high CPU usage started at 13:40 on CPU3 and moved to CPU1 at 14:30.
Below is the output of the top command:
top - 15:18:26 up 7 days, 5:27, 1 user, load average: 1.81, 1.80, 1.46
Tasks: 144 total, 2 running, 142 sleeping, 0 stopped, 0 zombie
Cpu(s): 2.5%us, 23.1%sy, 0.0%ni, 74.0%id, 0.3%wa, 0.0%hi, 0.0%si,
0.0%st
Mem: 8178504k total, 4866776k used, 3311728k free, 66408k buffers
Swap: 0k total, 0k used, 0k free, 3870820k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
31383 root 20 0 27516 1504 1224 R 100 0.0 97:09.40 telnet
3626 root 20 0 2437m 92m 8496 S 0 1.2 0:22.77 java
13383 lava-sta 20 0 198m 47m 5180 S 0 0.6 0:18.85 uwsgi
1 root 20 0 24460 2336 1244 S 0 0.0 0:00.98 init
2 root 20 0 0 0 0 S 0 0.0 0:00.00 kthreadd
3 root 20 0 0 0 0 S 0 0.0 0:08.43 ksoftirqd/0
Thanks,
Yongqin Liu
---------------------------------------------------------------
#mailing list
linaro-android(a)lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-android
linaro-validation(a)lists.linaro.org
http://lists.linaro.org/pipermail/linaro-validation
Hey Guys,
I just looked into the Panda10 health check failures over the past 24 hours.
The good news is my bad code isn't to blame. The bad news is that the SD
card appears to be having some issues.
Hi all,
I've just created a very basic munin installation for the lab:
http://munin.validation.linaro.org/
The monitoring wonk I consulted said that munin is perhaps not the
greatest way of getting graphs of your system but that it's probably the
easiest to set up. Better than nothing :-)
To add a system to munin you need to:
1) apt-get install munin-node on the system
2) Edit /etc/munin/munin-node.conf on the system to contain:
host_name XXX.validation.linaro.org
allow ^192\.168\.1\.32$
3) sudo service munin-node restart on the system
4) Add the following to /etc/munin/munin.conf on linaro-gateway:
[XXX.validation.linaro.org]
address 192.168.1.YYY
use_node_name yes
and that's it! The data viewable at http://munin.validation.linaro.org/
is generated by a */5 cron, so it takes a while for a new host to
appear. If someone wants to add dogfood, the compute nodes, the fast
model instances etc etc be my guest...
Once all the systems are added, the next thing is to start looking at
adding more us-specific metrics -- scheduler queue lengths, request
numbers and duration from django or apache, various postgres stats etc
etc. It would also be nice to add "events" to the graphs such as
rollouts and job start/ends but I don't know if that is supported.
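To give an idea of the shape of a custom metric: a munin plugin is just
an executable that prints graph configuration when run with "config" and
"field.value N" lines otherwise. Here's a minimal sketch in Python; the
queue_length() body is a placeholder assumption, not real lava-server
code:

```python
#!/usr/bin/env python
import sys

# Minimal sketch of a munin plugin for an app-specific metric. Munin
# calls the plugin with "config" to learn the graph layout, and with no
# argument to fetch values, which are plain "field.value N" lines.
# queue_length() is a placeholder standing in for a real query against
# the lava-server database.

def queue_length():
    return 0  # placeholder: would really ask the scheduler/postgres

def main(argv):
    if len(argv) > 1 and argv[1] == "config":
        print("graph_title LAVA scheduler queue length")
        print("graph_vlabel jobs")
        print("queue.label submitted jobs")
    else:
        print("queue.value %d" % queue_length())

if __name__ == "__main__":
    main(sys.argv)
```

Dropped into /etc/munin/plugins/ and made executable, something like
this should get picked up by munin-node on restart.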
Cheers,
mwh
Hi All,
Due to a glitch with UEFI and the latest kernels, we are forced to leave the TC2s offline until the issue is resolved. Ryan Harkin and I have been working to try and resolve this, but the best we could do is to get them to pass their health check (using sticking plaster, string and a large hammer) but they would then fail every test that was submitted to them, which would be kind of pointless. We're working actively to fix this problem, and I'll let you know when we're back up and running.
Thanks, and apologies once again,
Dave
Hi all,
It's all kinds of rough but I've just crossed a milestone: I ran the
dispatcher and had DS-5 capture energy data from my host while it was
running, using this branch (lots of which Andy wrote):
https://code.launchpad.net/~mwhudson/lava-dispatcher/signals/+merge/131128
Currently the output of streamline -report is attached to the test
result as an attribute, which is just awful. Either it should be parsed
into interesting data, or the -report output should be attached to the
test run in a useful way (or both). But it's a start! I'm attaching the
test definition and job file I used.
Cheers,
mwh
Hey Guys,
I just hit a really annoying issue while trying to upgrade control to
our latest lava-dispatcher code.
Everything works great in dogfood and staging. However, I guess the
python version on control is just different enough to cause a problem
with our new use of "configglue". The issue is with our "boot_cmds" that
are set by our device-type .conf files. The faulty snippet is roughly:
string_to_list(boot_cmds)
On a "normal" system, this produces an array of commands. On control we
get an encoding mess that doesn't work with u-boot, e.g.:
['m\x00\x00\x00m\x00\x00\x00c\x00\x00\x00 .......
I think the easiest fix is to change our master.py to call:
string_to_list(boot_cmds.encode('ascii'))
I'm doing another round of unit testing to prove this works before
attempting to deploy.
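For the record, the garbage above looks like the command text re-encoded
as UTF-32. Here's a quick illustration of both the apparent symptom and
the proposed fix (boot_cmds here is just a made-up sample string, and
string_to_list itself is not reproduced):

```python
# Illustration of the symptom and the proposed fix. The command string
# is a made-up sample, not our real boot_cmds config value.
boot_cmds = u"mmc init"

# The mess seen on control looks like the text re-encoded as UTF-32-LE,
# i.e. four bytes per character:
garbled = boot_cmds.encode("utf-32-le")
assert garbled.startswith(b"m\x00\x00\x00m\x00\x00\x00c\x00\x00\x00")

# The proposed fix forces a plain one-byte-per-character ASCII string
# before it gets split into boot commands for u-boot:
fixed = boot_cmds.encode("ascii")
assert fixed == b"mmc init"
```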
For now I've marked all the devices that execute from control as
offline. If the fix takes too long, I'll just revert to the previous
lava deployment.
-andy
It's often difficult to achieve pre-planned hacking goals at Connect, but
Michael and I spent a little time thinking about this topic and wanted
to try and lay out an agenda for LAVA next week.
The general thought process we took for these was:
* Is it beneficial to work on as a group?
* Is it something we'll benefit from even if we only wind up having 20
minutes, and could it also work well if we find the time to work on it
for 2 hours?
With that in mind we are thinking about these items:
= Galaxy Nexus Fastboot Hacking
Have a "learn fastboot" session based on the email thread from last week.
= NI battery simulator
I can show how this works
Zach can try and help with the TCP disconnect issue
= Versatile Express intro hacking
Dave can give us some education on how the VExpress works/boots/etc.
Possibly grab Ryan/Tixy to join.
= Deployment Type Improvements
I think this will roughly be a "get Antonio to teach us Chef" session.
Maybe think about how to get Chef built into open stack image with
cloud-init.
= monitoring – adding app-specific metrics to munin
Michael can talk to us a bit about Munin and how to add new custom
metrics. Then maybe we can hack on adding some, like:
* web/django stuff
* postgres stuff
* job stuff
- num flocks
- device type wait time at various %iles
- device type utilizations
------------
origen02
------------
http://validation.linaro.org/lava-server/scheduler/job/35250
When trying to boot the test image, it went into a panic. I went onto the board and booted it both into master and test images and it was fine. So it looks like it was a random glitch. Back online to re-test.
------------
origen09
------------
http://validation.linaro.org/lava-server/scheduler/job/35134
Dropped into initramfs on test image boot. Booted up test image, and it sat for ages doing recovery on mmcblk0p6, which is testrootfs. Let it complete fsck, then did a reboot. Still recording errors, so replaced sd card with new image. Looks like one of the rare sd card failures. A health check finding a real problem. :)
------------
panda04
------------
http://validation.linaro.org/lava-server/scheduler/job/35095
Android glitch of some sort - home screen problem. Put back online to retest.
--------------------
snowball06/08
--------------------
http://192.168.1.10/lava-server/scheduler/job/35179
eth0 failed to come up. We see this a lot with snowballs. Perhaps there's a known bug in the master image we use (12.02)?
Thanks
Dave