On Thu, 31 Oct 2013 09:14:47 +0200
Ayman Hendawy <ayman.hendawy(a)gmail.com> wrote:
> Dear Neil,
Do not reply to individuals. Keep replies only to the list.
> Actually I wonder why it's not more open: why I can't get real-time
> access to the kit's serial console, why no debugger is available
> (suppose I have an application on top of the OS and need a debugger to
> find the exact line causing a problem), and why I don't have any way
> to access some of the kit's peripherals, such as the USB port.
>
> What I mean is: given the great effort behind LAVA, what limits it
> from giving its users deeper access to their kits? Why is it limited
> to submitting jobs?
The simple answer is that this protects the use of the boards by other
users. Submitting a job puts the device into a test image, where bugs in
the test image are confined to that test image. When the test ends (for
better or for worse), the board returns to a known, working state.
To do otherwise would make the admin burden unsustainable.
These are not general purpose debugging boards. These are test devices.
The hands-on debugging needs to be done in emulators or local boards -
preferably before the commits. LAVA is checking for side-effects of
developer changes, especially performance changes over time.
Access to the serial console of any LAVA device is restricted to the
lab admins. The devices do not belong to the developers, it isn't about
developers having access to "their" devices. The devices belong to LAVA
and are maintained as a service for all developers. Doing that requires
that LAVA imposes restrictions on what individual developers can do to
avoid individuals leaving the device in an unstable or unbootable state.
Many LAVA test jobs involve interim kernel builds - it is all too easy
to make a commit which gets turned into a LAVA job which leads to a
kernel panic in the test. If that was the main kernel for the device,
*someone* (i.e. the LAVA lab admins) would have to fix it. Restricting
tests to submitted jobs is that fix.
--
Neil Williams
=============
http://www.linux.codehelp.co.uk/
Hi,
I'm trying to complete the docs for the Linaro test suites and add a
section on running local .yaml files on the remote server. From the
lava-tool README I took the following command:
lava testdef submit
It kinda works:
milosz@milosz-nb:~/linaro/testcases/staging-test-definitions/openembedded$
lava testdef submit busybox.yaml
Creating job file...
device_type: rtsm_fvp_base-aemv8a
image: https://releases.linaro.org/13.09/openembedded/aarch64/vexpress64-openembed…
Created job file 'lava-tool-job.json'.
Server connection parameters:
server: https://mwasilew@validation.linaro.org/RPC2/
rpc_endpoint: RPC2
However, I must have done something wrong, as the connection wasn't
successful. I got the following exception:
Traceback (most recent call last):
File "/usr/local/bin/lava", line 9, in <module>
load_entry_point('lava-tool==0.7.1', 'console_scripts', 'lava')()
File "/usr/local/lib/python2.7/dist-packages/lava_tool-0.7.1-py2.7.egg/lava/tool/dispatcher.py",
line 153, in run
raise SystemExit(cls().dispatch(args))
File "/usr/local/lib/python2.7/dist-packages/lava_tool-0.7.1-py2.7.egg/lava/tool/dispatcher.py",
line 143, in dispatch
return command.invoke()
File "/usr/local/lib/python2.7/dist-packages/lava_tool-0.7.1-py2.7.egg/lava/testdef/commands.py",
line 101, in invoke
super(submit, self).submit(job_file)
File "/usr/local/lib/python2.7/dist-packages/lava_tool-0.7.1-py2.7.egg/lava/helper/command.py",
line 113, in submit
job_id = server.scheduler.submit_job(jobdata)
File "/usr/lib/python2.7/xmlrpclib.py", line 1224, in __call__
return self.__send(self.__name, args)
File "/usr/lib/python2.7/xmlrpclib.py", line 1578, in __request
verbose=self.__verbose
File "/usr/local/lib/python2.7/dist-packages/lava_tool-0.7.1-py2.7.egg/lava_tool/authtoken.py",
line 77, in request
response = self._opener.open(request)
File "/usr/lib/python2.7/urllib2.py", line 400, in open
response = self._open(req, data)
File "/usr/lib/python2.7/urllib2.py", line 418, in _open
'_open', req)
File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 1215, in https_open
return self.do_open(httplib.HTTPSConnection, req)
File "/usr/lib/python2.7/urllib2.py", line 1177, in do_open
raise URLError(err)
urllib2.URLError: <urlopen error [Errno 110] Connection timed out>
Are there any docs I can check to make this work?
milosz
Hi All,
In our LAVA setup we currently have our control server and then we have 4 remote workers. On these workers we have different device types distributed evenly to prevent complete loss of a particular type if one of our workers fails.
On each worker we have our per device configuration specified here: /srv/lava/instances/<instance>/etc/lava-dispatcher/devices
And our device-type configuration here: /srv/lava/instances/<instance>/etc/lava-dispatcher/device-types
In the device-types config files we are overriding the defaults with ARM-specific settings, such as the license_file in the case of Fast Models, or the partitions in the case of vexpress-tc2. The settings for a particular device type are the same for the instances running on all workers, which means we have the same <device-type>.conf on multiple machines. It would be good if I could define the ARM-specific settings in one place rather than on each dispatcher.
What I was wondering is whether the remote workers inherit the lava-dispatcher device-type configs from the master, or whether each worker's dispatcher is stand-alone. If I placed the device-type config files specific to our setup under /srv/lava/instances/<instance>/etc/lava-dispatcher/device-types on our control server, would all of the remote workers pull these settings in?
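Until such inheritance exists, one possible stop-gap is to push the shared configs out from the control server. A sketch only: the path is the one described above, while the instance name and worker hostnames are hypothetical placeholders.

```shell
# Stop-gap: push shared device-type configs from the control server to
# each remote worker. The path matches the layout described above; the
# instance name and worker hostnames are hypothetical placeholders.
SRC=/srv/lava/instances/production/etc/lava-dispatcher/device-types/
workers="worker1.example.com worker2.example.com"

# Print the rsync commands for review first; pipe the output to sh to
# actually apply them.
for w in $workers; do
    echo rsync -av "$SRC" "$w:$SRC"
done
```

Printing the commands before running them keeps the sketch safe to try on a live instance.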
Thanks
Dean
-- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No: 2557590
ARM Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No: 2548782
Hi all,
For reasons we don't fully understand, the LNG dispatcher has developed problems. This coincided with a bundle of servers in the TCWG rack deciding to reboot. We're working as a matter of priority to get the rack back up. It also appears that the serial console in that rack is giving us problems, so we're replacing that at the same time.
I'll try to keep you updated as we go along, but obviously, getting the rack back online is top priority.
Thanks
Dave
Hi All,
I have recently started seeing an issue with my health check jobs at the stage where the Linaro bootfs is deployed. It looks like the permissions are wrong somewhere.
I see no problems when running Android jobs. I have not changed anything on the TC2 boards, and I have not changed anything on the dispatcher apart from upgrading to the latest version of LAVA.
I have attached a log for more details.
Before I start looking into whether this is a problem with my specific LAVA environment, I just wanted to check whether this is a known issue. If so, are there any config changes I need to apply to prevent it from happening? Could this be to do with linaro-media-create?
Thanks
Dean
I have now resolved the issues mentioned in the previous email, and in doing so upgraded my LAVA instance to the latest version. This pulled in a commit to use 'ip route get' to detect which interface is connected to the lava dispatcher.
This has resulted in our TC2 jobs failing with the following error when attempting to boot the master image:
root@master [rc=0]# 2013-10-31 02:25:02 PM INFO: Waiting for network to come up
LC_ALL=C ping -W4 -c1 10.1.103.191
PING 10.1.103.191 (10.1.103.191) 56(84) bytes of data.
64 bytes from 10.1.103.191: icmp_req=1 ttl=62 time=0.370 ms
--- 10.1.103.191 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.370/0.370/0.370/0.000 ms
root@master [rc=0]#
ifconfig `ip route get 10.1.103.191 | cut -d ' ' -f3` | grep 'inet addr' |awk -F: '{split($2,a," "); print "<" a[1] ">"}'
10.1.99.1: error fetching interface information: Device not found
root@master [rc=0]#
2013-10-31 02:26:12 PM ERROR: Unable to determine target image IP address
2013-10-31 02:26:12 PM INFO: CriticalError
2013-10-31 02:26:12 PM WARNING: [ACTION-E] deploy_linaro_android_image is finished with error (Unable to determine target image IP address).
The reason for this is that the command assumes the interface name will be in the 3rd field when doing the cut. However, we are seeing that the output of ip route get actually looks like this:
$ ip route get 10.1.103.197
10.1.103.197 via 10.1.99.1 dev eth0 src 10.1.99.87
Cache
And the "cut -d ' ' -f3" bit then gives you 10.1.99.1, which ifconfig cannot cope with. I suspect the issue is that our dispatcher is on a different subnet from our target devices, hence the "via 10.1.99.1" in the output.
I agree that it makes sense to use 'ip route get' to determine the interface, but could you provide a more flexible parsing of the output? I can raise a Launchpad bug for this if you would like.
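For what it's worth, a parse that keys on the "dev" keyword rather than a fixed field position would cope with both shapes of output. A sketch only, not tested against every iproute2 version; the first sample line is the one from the log above, the second is a hypothetical directly-connected case.

```shell
# Two shapes of `ip route get` output: with a "via" hop (from the log
# above) and without one (hypothetical directly-connected case).
routed='10.1.103.197 via 10.1.99.1 dev eth0 src 10.1.99.87'
direct='10.1.103.197 dev eth0 src 10.1.103.50'

# Print the word that follows "dev", wherever it appears in the line.
get_iface() {
    echo "$1" | awk '{for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit }}'
}

get_iface "$routed"   # eth0
get_iface "$direct"   # eth0
```

Because awk scans for the keyword instead of counting fields, the presence or absence of "via <gateway>" no longer shifts the result.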
Thanks
Dean
> -----Original Message-----
> From: Dean Arnold
> Sent: 30 October 2013 10:21
> To: Dean Arnold; 'linaro-validation(a)lists.linaro.org Validation'
> Cc: Basil Eljuse; Ian Spray
> Subject: RE: null value in column "admin_notifications" violates not-null constraint
>
> Hi All
>
> I was wondering if you could help me please?
>
> I have managed to get my instance of LAVA working again; however, now
> I am seeing an issue where every time a job is submitted, it is
> submitted twice and both jobs grab the test resource, meaning we are
> seeing some bizarre behaviour in the test run. See attached log.
>
> I am also seeing this when I attempt an upgrade:
>
> + set +x
> + lava-server manage syncdb --noinput
> WARNING:root:This instance will not use sentry as SENTRY_DSN is not
> configured
> + set +x
> + lava-server manage migrate --noinput
> WARNING:root:This instance will not use sentry as SENTRY_DSN is not
> configured
> Traceback (most recent call last):
> File "/srv/lava/instances/production/bin/lava-server", line 55, in
> <module>
> lava_server.manage.main()
> File "/srv/lava/.cache/git-cache/exports/lava-server/2013-10-17-
> 97c7da5/lava_server/manage.py", line 128, in main
> run_with_dispatcher_class(LAVAServerDispatcher)
> File "/srv/lava/.cache/eggs/lava_tool-0.7-
> py2.7.egg/lava_tool/dispatcher.py", line 45, in
> run_with_dispatcher_class
> raise cls.run()
> File "/srv/lava/.cache/eggs/lava_tool-0.7-
> py2.7.egg/lava/tool/dispatcher.py", line 147, in run
> raise SystemExit(cls().dispatch(args))
> File "/srv/lava/.cache/eggs/lava_tool-0.7-
> py2.7.egg/lava/tool/dispatcher.py", line 137, in dispatch
> return command.invoke()
> File "/srv/lava/.cache/git-cache/exports/lava-server/2013-10-17-
> 97c7da5/lava_server/manage.py", line 116, in invoke
> execute_manager(settings, ['lava-server'] + self.args.command)
> File "/srv/lava/.cache/eggs/Django-1.4.2-
> py2.7.egg/django/core/management/__init__.py", line 459, in
> execute_manager
> utility.execute()
> File "/srv/lava/.cache/eggs/Django-1.4.2-
> py2.7.egg/django/core/management/__init__.py", line 382, in execute
> self.fetch_command(subcommand).run_from_argv(self.argv)
> File "/srv/lava/.cache/eggs/Django-1.4.2-
> py2.7.egg/django/core/management/base.py", line 196, in run_from_argv
> self.execute(*args, **options.__dict__)
> File "/srv/lava/.cache/eggs/Django-1.4.2-
> py2.7.egg/django/core/management/base.py", line 232, in execute
> output = self.handle(*args, **options)
> File "/srv/lava/.cache/eggs/South-0.7.5-
> py2.7.egg/south/management/commands/migrate.py", line 107, in handle
> ignore_ghosts = ignore_ghosts,
> File "/srv/lava/.cache/eggs/South-0.7.5-
> py2.7.egg/south/migration/__init__.py", line 199, in migrate_app
> applied_all = check_migration_histories(applied_all, delete_ghosts,
> ignore_ghosts)
> File "/srv/lava/.cache/eggs/South-0.7.5-
> py2.7.egg/south/migration/__init__.py", line 88, in
> check_migration_histories
> raise exceptions.GhostMigrations(ghosts)
> south.exceptions.GhostMigrations:
>
> ! These migrations are in the database but not on disk:
> <lava_scheduler_app:
> 0033_auto__add_field_testjob_admin_notifications>
> ! I'm not trusting myself; either fix this yourself by fiddling
> ! with the south_migrationhistory table, or pass --delete-ghost-migrations
> ! to South to have it delete ALL of these records (this may not be
> ! good).
> + die 'Failed to run database migrations'
> + echo 'Failed to run database migrations'
> + exit 1
>
> I suspect that this is the underlying issue. Could you please
> recommend the best way to fix a ghost migration issue? The message
> suggests fiddling with the south_migrationhistory table, but I would
> rather not hack away blindly :)
>
> Thanks
> Dean
>
>
> > -----Original Message-----
> > From: Dean Arnold
> > Sent: 25 October 2013 13:01
> > To: linaro-validation(a)lists.linaro.org Validation
> > Cc: Basil Eljuse; Ian Spray
> > Subject: null value in column "admin_notifications" violates not-null constraint
> >
> > Hi All,
> >
> > I have recently carried out an upgrade of LAVA and I am now seeing an
> > issue, where I am unable to trigger any jobs. The error listed in
> > /srv/lava/instances/production/var/log/lava-scheduler.log can be seen
> > below.
> >
> > I have checked the database column in question (admin_notifications in
> > the lava_scheduler_app_testjob table?) and its contents are, as the
> > error says, null. I have tried populating this column with a non-null
> > string in an attempt to make Django happy, but I am still seeing the
> > problem.
> >
> > I am not sure where the corruption happened; I presume something went
> > wrong in the upgrade stage. Would it be possible to give me an example
> > of what should be in this column, and I will add the data manually to
> > try and resolve the problem.
> >
> > Thanks
> > Dean
> >
> > ###############################
> >
> >
> > 2013-10-25 11:51:55,364 [ERROR]
> > [lava_scheduler_daemon.service.JobQueue] IntegrityError: null value in
> > column "admin_notifications" violates not-null constraint
> >
> > Traceback (most recent call last):
> > File "/usr/lib/python2.7/threading.py", line 524, in __bootstrap
> > self.__bootstrap_inner()
> > File "/usr/lib/python2.7/threading.py", line 551, in
> > __bootstrap_inner
> > self.run()
> > File "/usr/lib/python2.7/threading.py", line 504, in run
> > self.__target(*self.__args, **self.__kwargs)
> > --- <exception caught here> ---
> > File "/srv/lava/.cache/eggs/Twisted-12.1.0-py2.7-linux-
> > x86_64.egg/twisted/python/threadpool.py", line 167, in _worker
> > result = context.call(ctx, function, *args, **kwargs)
> > File "/srv/lava/.cache/eggs/Twisted-12.1.0-py2.7-linux-
> > x86_64.egg/twisted/python/context.py", line 118, in callWithContext
> > return self.currentContext().callWithContext(ctx, func, *args,
> > **kw)
> > File "/srv/lava/.cache/eggs/Twisted-12.1.0-py2.7-linux-
> > x86_64.egg/twisted/python/context.py", line 81, in callWithContext
> > return func(*args,**kw)
> > File "/srv/lava/.cache/git-cache/exports/lava-server/2013-10-17-
> > 97c7da5/lava_scheduler_daemon/dbjobsource.py", line 70, in wrapper
> > return func(*args, **kw)
> > File "/srv/lava/.cache/git-cache/exports/lava-server/2013-10-17-
> > 97c7da5/lava_scheduler_daemon/dbjobsource.py", line 242, in
> > getJobList_impl
> > job_list = self._assign_jobs(job_list)
> > File "/srv/lava/.cache/git-cache/exports/lava-server/2013-10-17-
> > 97c7da5/lava_scheduler_daemon/dbjobsource.py", line 205, in
> > _assign_jobs
> > job_list = self._get_health_check_jobs()
> > File "/srv/lava/.cache/git-cache/exports/lava-server/2013-10-17-
> > 97c7da5/lava_scheduler_daemon/dbjobsource.py", line 121, in
> > _get_health_check_jobs
> > job_list.append(self._getHealthCheckJobForBoard(device))
> > File "/srv/lava/.cache/git-cache/exports/lava-server/2013-10-17-
> > 97c7da5/lava_scheduler_daemon/dbjobsource.py", line 286, in
> > _getHealthCheckJobForBoard
> > return TestJob.from_json_and_user(job_json, user, True)
> > File "/srv/lava/.cache/git-cache/exports/lava-server/2013-10-17-
> > 97c7da5/lava_scheduler_app/models.py", line 622, in from_json_and_user
> > job.save()
> > File "/srv/lava/.cache/eggs/django_restricted_resource-0.2.7-
> > py2.7.egg/django_restricted_resource/models.py", line 71, in save
> > return super(RestrictedResource, self).save(*args, **kwargs)
> > File "/srv/lava/.cache/eggs/Django-1.4.2-
> > py2.7.egg/django/db/models/base.py", line 463, in save
> > self.save_base(using=using, force_insert=force_insert,
> > force_update=force_update)
> > File "/srv/lava/.cache/eggs/Django-1.4.2-
> > py2.7.egg/django/db/models/base.py", line 551, in save_base
> > result = manager._insert([self], fields=fields,
> > return_id=update_pk, using=using, raw=raw)
> > File "/srv/lava/.cache/eggs/Django-1.4.2-
> > py2.7.egg/django/db/models/manager.py", line 203, in _insert
> > return insert_query(self.model, objs, fields, **kwargs)
> > File "/srv/lava/.cache/eggs/Django-1.4.2-
> > py2.7.egg/django/db/models/query.py", line 1593, in insert_query
> > return query.get_compiler(using=using).execute_sql(return_id)
> > File "/srv/lava/.cache/eggs/Django-1.4.2-
> > py2.7.egg/django/db/models/sql/compiler.py", line 910, in execute_sql
> > cursor.execute(sql, params)
> > File "/srv/lava/.cache/eggs/Django-1.4.2-
> > py2.7.egg/django/db/backends/postgresql_psycopg2/base.py", line 52, in execute
> > return self.cursor.execute(query, args)
> > django.db.utils.IntegrityError: null value in column
> > "admin_notifications" violates not-null constraint
> >
> >
> > 2013-10-25 11:51:55,365 [ERROR] [sentry.errors] No servers configured,
> > and sentry not installed. Cannot send message