linaro-validation July 2012

linaro-validation@lists.linaro.org

22 participants
25 discussions

by YongQin Liu

Hi, All

I have written a script that connects an Android device to the WiFi in our LAVA lab. The script currently needs the SSID and password passed to it, but I think it would be better to keep them somewhere in our LAVA validation setup and have the script read them from there. For example, we could put a file at /etc/lava/devices/wifi with content like the following:

    SSID=LAVA-WiFiTest01
    PASSWD=PASSWORD
    SSID2=LAVA-5GWiFiTest01
    PASSWD2=PASSWORD2

Then the shell script can simply use "source /etc/lava/devices/wifi" to get the SSID and password information. Once the script is in lava-android-test, we can trigger it whenever a test needs it.

What do you think about where to put the device information? Is /etc/lava/devices/wifi OK? I think Bluetooth will have the same problem.

Thanks,
Yongqin Liu
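Tooling that is not a shell script could read the same file without "source". The following is a minimal Python sketch, assuming the file holds only plain KEY=VALUE lines as in the example above. The helper name parse_wifi_config is hypothetical and not part of LAVA.

```python
def parse_wifi_config(text):
    """Parse KEY=VALUE lines (the format proposed above) into a dict."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and shell-style comments.
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError("expected KEY=VALUE, got: %r" % raw_line)
        settings[key.strip()] = value.strip()
    return settings


# Sample content copied from the proposal; a real script would read
# /etc/lava/devices/wifi instead of this inline string.
SAMPLE = """\
SSID=LAVA-WiFiTest01
PASSWD=PASSWORD
SSID2=LAVA-5GWiFiTest01
PASSWD2=PASSWORD2
"""

wifi = parse_wifi_config(SAMPLE)
print(wifi["SSID"], wifi["SSID2"])
```

Unlike "source", this sketch does not handle shell quoting or variable expansion, so it only suits files that stay as simple as the example.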

12 years, 12 months

Already updated the enable_network_after_boot_android definition in file /srv/lava/instances/production/etc/lava-dispatcher/device-defaults.conf for production

by YongQin Liu

Hi, All

I have updated the enable_network_after_boot_android definition in /srv/lava/instances/production/etc/lava-dispatcher/device-defaults.conf for production. I just added the following two lines:

    # This is for android build where the network is not up by default. 1 or 0
    enable_network_after_boot_android = 1

This is just a notice.

BTW, where is the LAVA source deployed? When I run the find command in the /srv/lava/instances/production directory, I can't find the source files of lava-dispatcher.

Thanks,
Yongqin Liu
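To illustrate how a 1-or-0 setting like this can be read as a boolean, here is a small Python sketch using the standard configparser module. How lava-dispatcher itself parses device-defaults.conf is not stated in this message, so the sketch assumes plain "key = value" lines with no section header and prepends a placeholder section; read_device_defaults is a hypothetical helper.

```python
import configparser

# The two lines quoted in the message above.
SNIPPET = """\
# This is for android build where the network is not up by default. 1 or 0
enable_network_after_boot_android = 1
"""


def read_device_defaults(text):
    """Parse section-less key = value text by prepending a dummy section."""
    parser = configparser.ConfigParser()
    # "[defaults]" is a placeholder section name, not something LAVA defines.
    parser.read_string("[defaults]\n" + text)
    return parser["defaults"]


defaults = read_device_defaults(SNIPPET)
# getboolean() accepts "1"/"0" as well as yes/no, true/false, on/off.
enabled = defaults.getboolean("enable_network_after_boot_android", fallback=False)
print(enabled)
```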

12 years, 12 months

Health Job Failure Report: 2012-06-18->2012-07-01

by Andy Doan

We've seen a significant reduction in health job failures, but I still wanted to send out a report on these so people could see how things are still breaking. We've had 25 real health failures over the past 2 weeks.

By device type:
  6 snowball
  1 imx53
  1 vexpress
  2 beagle
  6 origen
  9 panda

By failure type:
  2 SD cards died (both on Origen)
  7 Serial Console Related:
    - 5 connection never established at start of job
    - 1 connection dropped during test
    - 1 garbage over serial line
  10 Network Related:
    - 3 network failed to come up
    - 3 ping unreachable error
    - 3 wget | tar type failure (kind of network we think)

NOTE: A report on something like this is semi-subjective, given that it's based on my interpretation of the failures. The raw information on this can be found at:

<https://docs.google.com/a/linaro.org/spreadsheet/ccc?key=0AnxpY5uv-BlNdG9zY…>
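The per-device counts can be tallied to cross-check the stated total of 25 and to see each board's share. A small Python sketch using only the numbers from the report; the variable names are illustrative.

```python
# Per-device counts as listed in the report above.
failures_by_device = {
    "snowball": 6,
    "imx53": 1,
    "vexpress": 1,
    "beagle": 2,
    "origen": 6,
    "panda": 9,
}

total = sum(failures_by_device.values())
# Sort from most to least failures to see which boards dominate.
ranked = sorted(failures_by_device.items(), key=lambda item: item[1], reverse=True)

print(total)
for device, count in ranked:
    print("%-9s %2d  %4.0f%%" % (device, count, 100.0 * count / total))
```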

12 years, 12 months

multi machine lava deployments

by Michael Hudson-Doyle

So we're reaching the point where we are going to want to have more than one machine involved in running LAVA production. The immediate cause is to avoid running fastmodels on the same machine as the database and web server and everything else, but I think as the system grows up we'll want to do this for more functions.

I know there has been some thinking about this in the past, but not to my knowledge anything written down. I've spent a good while today thinking about this sort of thing, so here's a bit of a brain dump.

The different parts of a LAVA instance
======================================

A LAVA instance is three things:

1) Code
2) Configuration
3) Data

Data comes in two kinds: a postgres database and "media" files (log files and bundles mostly for us).

There are a few parts to the configuration: there is the dispatcher config, the server/django config and the apache config. TBH, it would be better if the dispatcher config was derived from the database somehow. I still think we don't do a good job of handling the apache config, but it's a messy problem.

The Django configuration includes the information on how to reach the data, both media and db.

I think we'll need to define an "instance set" concept -- a list of machines and the instances on those machines that share code, configuration and data.

Multi-machine code
==================

This is easy IMHO: all machines should have the same code installed. With the appropriate ssh keys scattered around, it should be easy to write a fabric job or just a plain bash script to run ldt update on each machine.

Multi-machine data
==================

Accessing postgres from another machine is a solved problem, to put it mildly :-)

I don't have a good idea on how to access media files across the network. In an ideal world, we would have a Django storage backend that talked to something like http://ceph.com/ceph-storage/object-storage/ or http://hadoop.apache.org/hdfs/ -- we don't need anything like full file system semantics -- but for now, maybe just mounting the media files over NFS might be the quickest way to get things going.

Multi-machine configuration
===========================

I think by and large the configuration of each instance should be the same. This means we need a mechanism to distribute changes to the instances. One way would be to store the configuration in a branch, and have ldt update upgrade this branch too (I think it would even be fairly easy to have multiple copies of the configuration on disk, similar to the way we have multiple copies of the buildouts, and have each buildout point to a specific revision of the config).

We could also have the revision of the config branch to use be specified in the lava-manifest branch, but that doesn't sound friendly to third party deployments -- I think the config branch should be specified as a parameter of the instance set, and updating an instance set should update each instance to the latest version of the config branch. This will require a certain discipline in making changes to the branch!

All that said, we don't want the instances on each machine to be _completely_ identical, leading to...

Differentiating instances
=========================

The point of this exercise is not purely to scale horizontally; we want different instances to do different things.

I think the primary way we will differentiate instances is by which services they run: do they run uwsgi or not, do they run the scheduler or not, do they run celeryd or not? We already have a limited form of this, in that you can configure an instance to start the scheduler or not, but this will need systemizing.

In addition, one instance in a set will need to 'own' the database: when we upgrade an instance we want one and only one instance to run the migrations. I'd also like to push towards a model where we can do rolling upgrades, but that's a different kettle of fish I think ...

Setup issues
============

There will be a requirement to make sure ssh keys etc are set up on the various machines involved. Ideally this would be done via puppet, but for now I think we can just do it by hand...

Thoughts? After writing this email, I don't think that there is a huge amount of work to do a good job here; we shouldn't settle for hacks.

Cheers,
mwh
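To make the "instance set" idea above more concrete, here is a rough Python sketch of one way it could be modelled: each instance records its machine, the services it runs and whether it owns the database, and the set builds one update command per instance. Everything here is an assumption for illustration: the class and field names, the host names, the config branch name, the choice to update the database owner first, and the argument form of "ldt update" (the message only says to run it on each machine). The commands are built as lists and not executed.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    host: str                      # machine the instance lives on
    name: str                      # instance name on that machine
    services: frozenset = field(default_factory=frozenset)
    owns_database: bool = False    # only the owner runs migrations


@dataclass
class InstanceSet:
    config_branch: str             # shared config branch for every instance
    instances: list

    def database_owner(self):
        """Return the single instance allowed to run migrations."""
        owners = [i for i in self.instances if i.owns_database]
        if len(owners) != 1:
            raise ValueError("expected exactly one database owner, found %d" % len(owners))
        return owners[0]

    def update_commands(self):
        """Build (not run) one ssh command per instance, owner first."""
        owner = self.database_owner()
        ordered = [owner] + [i for i in self.instances if i is not owner]
        # The exact "ldt update" arguments are a guess for illustration.
        return [["ssh", i.host, "ldt", "update", i.name] for i in ordered]


# Hypothetical two-machine set: fastmodels kept off the main server.
production = InstanceSet(
    config_branch="example-config-branch",
    instances=[
        Instance("fastmodel01", "production", frozenset({"celeryd"})),
        Instance("control", "production", frozenset({"uwsgi", "scheduler"}),
                 owns_database=True),
    ],
)

for cmd in production.update_commands():
    print(" ".join(cmd))
```

Raising an error when the set has zero or several database owners is one way to enforce the "one and only one instance runs the migrations" rule before anything is touched.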

12 years, 12 months

LAVA 12.07 Release Planning

by Andy Doan

I wanted to send off a note talking about our 12.07 cycle plans for LAVA to describe what we are all working on and how it all fits in with our major team themes. As we've talked about in the past, we've got a few major themes we are focused on right now:

* reliability
* reporting/visualizations
* fast models
* external measurement sources (ie power measurement)

By theme, our 12.07 focus will be:

= Reliability:

Pre-built image job reliability (spring)
<https://blueprints.launchpad.net/lava-dispatcher/+spec/pre-built-reliability>

Monitoring Of Scheduler Queues: (andy d)
<https://blueprints.launchpad.net/lava-scheduler/+spec/scheduler-queue-monit…>

Squid Monitoring and Metrics: (spring)
<https://blueprints.launchpad.net/lava-lab/+spec/squid-analytics>

//kind of reliability

Update Cloud Deployment in Lab: (dave p)
<https://blueprints.launchpad.net/lava-lab/+spec/deploy-juju>

Create a Proper Staging/Dogfood Server: (dave p)
<https://blueprints.launchpad.net/lava-lab/+spec/proper-staging-server>

Update the CTS test to use the google package: (yongqin)
<https://blueprints.launchpad.net/lava-android-test/+spec/update-cts-test>

Monitor Deployment in Lab: (dave p)
<https://blueprints.launchpad.net/lava-lab/+spec/lab-monitors>

Black Box Test Actions (yongqin)
<https://blueprints.launchpad.net/lava-dispatcher/+spec/black-box-test-actio…>

= FastModels:

Support Fast Models: (amit and andy)
<https://blueprints.launchpad.net/lava-dispatcher/+spec/fast-model-support>

Ubuntu Fastmodel Support: (amit, maybe some andy)
<https://blueprints.launchpad.net/lava-dispatcher/+spec/ubuntu-fastmodels>

= Reporting / Visualizations:

Build an image status view for the QA services team: (michael)
<https://blueprints.launchpad.net/lava-dashboard/+spec/image-status-view-for…>

Notify a user when a test fails in LAVA: (spring)
<https://blueprints.launchpad.net/lava-dashboard/+spec/linaro-platforms-o-no…>

As the power measurement stuff still hasn't arrived, we don't have anything queued up for the "external measurement" theme. However, I think this release will help us take another big stride in reliability and reports.

ACTIONS needed from BP assignees:

Please look at each of your BPs and make sure the descriptions and, more importantly, the work items reflect the work as you understand it needs to be done.

-andy

12 years, 12 months
