On Thu, 31 Oct 2013 09:14:47 +0200
Ayman Hendawy <ayman.hendawy(a)gmail.com> wrote:
> Dear Neil,
Do not reply to individuals. Keep replies only to the list.
> Actually I wonder why it's not more open: why I can't get real-time
> access to the kit's serial console, why no debugger is available
> (suppose I have an application on top of the OS and need a debugger to
> find the exact line causing a problem), and why I don't have any way
> to access some of the kit's peripherals, such as the USB port.
>
> What I mean is: given the great effort behind LAVA, what limits it
> from giving its users deeper access to their kits? Why is it limited
> to submitting jobs?
The simple answer is that this protects the use of the boards by other
users. Submitting a job puts the device into a test image, where bugs in
the test image are confined to that test image. When the test ends (for
better or for worse), the board returns to a known, working state.
To do otherwise would make the admin burden unsustainable.
These are not general purpose debugging boards. These are test devices.
The hands-on debugging needs to be done in emulators or local boards -
preferably before the commits. LAVA is checking for side-effects of
developer changes, especially performance changes over time.
Access to the serial console of any LAVA device is restricted to the
lab admins. The devices do not belong to the developers, it isn't about
developers having access to "their" devices. The devices belong to LAVA
and are maintained as a service for all developers. Doing that requires
that LAVA imposes restrictions on what individual developers can do to
avoid individuals leaving the device in an unstable or unbootable state.
Many LAVA test jobs involve interim kernel builds - it is all too easy
to make a commit which gets turned into a LAVA job which leads to a
kernel panic in the test. If that was the main kernel for the device,
*someone* (i.e. the LAVA lab admins) would have to fix it. Restricting
tests to submitted jobs is that fix.
--
Neil Williams
=============
http://www.linux.codehelp.co.uk/
Hi,
I'm trying to complete the docs for the Linaro test suites and add a
section on running local .yaml files on the remote server. From the
lava-tool README I took the following command:
lava testdef submit
It kinda works:
milosz@milosz-nb:~/linaro/testcases/staging-test-definitions/openembedded$
lava testdef submit busybox.yaml
Creating job file...
device_type: rtsm_fvp_base-aemv8a
image: https://releases.linaro.org/13.09/openembedded/aarch64/vexpress64-openembed…
Created job file 'lava-tool-job.json'.
Server connection parameters:
server: https://mwasilew@validation.linaro.org/RPC2/
rpc_endpoint: RPC2
However, I must have done something wrong, as the connection wasn't
successful. I got the following exception:
Traceback (most recent call last):
File "/usr/local/bin/lava", line 9, in <module>
load_entry_point('lava-tool==0.7.1', 'console_scripts', 'lava')()
File "/usr/local/lib/python2.7/dist-packages/lava_tool-0.7.1-py2.7.egg/lava/tool/dispatcher.py",
line 153, in run
raise SystemExit(cls().dispatch(args))
File "/usr/local/lib/python2.7/dist-packages/lava_tool-0.7.1-py2.7.egg/lava/tool/dispatcher.py",
line 143, in dispatch
return command.invoke()
File "/usr/local/lib/python2.7/dist-packages/lava_tool-0.7.1-py2.7.egg/lava/testdef/commands.py",
line 101, in invoke
super(submit, self).submit(job_file)
File "/usr/local/lib/python2.7/dist-packages/lava_tool-0.7.1-py2.7.egg/lava/helper/command.py",
line 113, in submit
job_id = server.scheduler.submit_job(jobdata)
File "/usr/lib/python2.7/xmlrpclib.py", line 1224, in __call__
return self.__send(self.__name, args)
File "/usr/lib/python2.7/xmlrpclib.py", line 1578, in __request
verbose=self.__verbose
File "/usr/local/lib/python2.7/dist-packages/lava_tool-0.7.1-py2.7.egg/lava_tool/authtoken.py",
line 77, in request
response = self._opener.open(request)
File "/usr/lib/python2.7/urllib2.py", line 400, in open
response = self._open(req, data)
File "/usr/lib/python2.7/urllib2.py", line 418, in _open
'_open', req)
File "/usr/lib/python2.7/urllib2.py", line 378, in _call_chain
result = func(*args)
File "/usr/lib/python2.7/urllib2.py", line 1215, in https_open
return self.do_open(httplib.HTTPSConnection, req)
File "/usr/lib/python2.7/urllib2.py", line 1177, in do_open
raise URLError(err)
urllib2.URLError: <urlopen error [Errno 110] Connection timed out>
Are there any docs I can check to make this work?
milosz
Hi All,
In our LAVA setup we currently have our control server and then we have 4 remote workers. On these workers we have different device types distributed evenly to prevent complete loss of a particular type if one of our workers fails.
On each worker we have our per device configuration specified here: /srv/lava/instances/<instance>/etc/lava-dispatcher/devices
And our device-type configuration here: /srv/lava/instances/<instance>/etc/lava-dispatcher/device-types
In the device-types config files we are overriding the defaults with ARM-specific settings, such as the license_file in the case of Fast Models, or the partitions in the case of vexpress-tc2. The settings for a particular device type are the same for the instances running on all workers, which means we have the same <device-type>.conf on multiple machines. It would be good if I could define the ARM-specific settings in one place rather than on each dispatcher.
What I was wondering is whether the remote workers inherit the lava-dispatcher device-type configs from the master, or whether each worker's dispatcher is stand-alone. If I placed the device-type config files specific to our setup under /srv/lava/instances/<instance>/etc/lava-dispatcher/device-types on our control server, would all of the remote workers pull these settings in?
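Until such inheritance exists, one possible stop-gap is to push the shared configs out from the control server. A sketch only: the path is the one described above, while the instance name and worker hostnames are hypothetical placeholders.

```shell
# Stop-gap: push shared device-type configs from the control server to
# each remote worker. The path matches the layout described above; the
# instance name and worker hostnames are hypothetical placeholders.
SRC=/srv/lava/instances/production/etc/lava-dispatcher/device-types/
workers="worker1.example.com worker2.example.com"

# Print the rsync commands for review first; pipe the output to sh to
# actually apply them.
for w in $workers; do
    echo rsync -av "$SRC" "$w:$SRC"
done
```

Printing the commands before running them keeps the sketch safe to try on a live instance.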
Thanks
Dean
-- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No: 2557590
ARM Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No: 2548782
Hi all,
For reasons we don't fully understand, the LNG dispatcher has developed problems. This coincided with a bundle of servers in the TCWG rack deciding to reboot. We're working as a matter of priority to get the rack back up. It also appears that the serial console in that rack is giving us problems, so we're replacing that at the same time.
I'll try to keep you updated as we go along, but obviously, getting the rack back online is top priority.
Thanks
Dave
Hi All,
I have recently started seeing an issue with my health check jobs at the stage where the Linaro bootfs is deployed. It looks like the permissions are wrong somewhere.
I see no problems when running Android jobs. I have not changed anything on the TC2 boards, and I have not changed anything on the dispatcher apart from upgrading to the latest version of LAVA.
I have attached a log for more details.
Before I start looking into whether this is a problem with my specific LAVA environment, I just wanted to check whether this is a known issue. If so, are there any config changes I need to apply to prevent it from happening? Could this be to do with linaro-media-create?
Thanks
Dean
I have now resolved the issues mentioned in the previous email, and in doing so upgraded my LAVA instance to the latest version. This pulled in a commit to use 'ip route get' to detect which interface is connected to the lava dispatcher.
This has resulted in our TC2 jobs failing with the following error when attempting to boot the master image:
root@master [rc=0]# 2013-10-31 02:25:02 PM INFO: Waiting for network to come up
LC_ALL=C ping -W4 -c1 10.1.103.191
PING 10.1.103.191 (10.1.103.191) 56(84) bytes of data.
64 bytes from 10.1.103.191: icmp_req=1 ttl=62 time=0.370 ms
--- 10.1.103.191 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.370/0.370/0.370/0.000 ms
root@master [rc=0]#
ifconfig `ip route get 10.1.103.191 | cut -d ' ' -f3` | grep 'inet addr' |awk -F: '{split($2,a," "); print "<" a[1] ">"}'
10.1.99.1: error fetching interface information: Device not found
root@master [rc=0]#
2013-10-31 02:26:12 PM ERROR: Unable to determine target image IP address
2013-10-31 02:26:12 PM INFO: CriticalError
2013-10-31 02:26:12 PM WARNING: [ACTION-E] deploy_linaro_android_image is finished with error (Unable to determine target image IP address).
The reason for this is that the command assumes the interface name will be in the 3rd field when doing the cut. However, we are seeing that the output of ip route get actually looks like this:
$ ip route get 10.1.103.197
10.1.103.197 via 10.1.99.1 dev eth0 src 10.1.99.87
Cache
And the "cut -d ' ' -f3" bit then gives you 10.1.99.1, which ifconfig cannot cope with. I suspect the issue is that our dispatcher is on a different subnet from our target devices, hence the "via 10.1.99.1" in the output.
I agree that it makes sense to use 'ip route get' to determine the interface, but could you provide a more flexible parsing of the output? I can raise a Launchpad bug for this if you would like.
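For what it's worth, a parse that keys on the "dev" keyword rather than a fixed field position would cope with both shapes of output. A sketch only, not tested against every iproute2 version; the first sample line is the one from the log above, the second is a hypothetical directly-connected case.

```shell
# Two shapes of `ip route get` output: with a "via" hop (from the log
# above) and without one (hypothetical directly-connected case).
routed='10.1.103.197 via 10.1.99.1 dev eth0 src 10.1.99.87'
direct='10.1.103.197 dev eth0 src 10.1.103.50'

# Print the word that follows "dev", wherever it appears in the line.
get_iface() {
    echo "$1" | awk '{for (i = 1; i < NF; i++) if ($i == "dev") { print $(i + 1); exit }}'
}

get_iface "$routed"   # eth0
get_iface "$direct"   # eth0
```

Because awk scans for the keyword instead of counting fields, the presence or absence of "via <gateway>" no longer shifts the result.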
Thanks
Dean
> -----Original Message-----
> From: Dean Arnold
> Sent: 30 October 2013 10:21
> To: Dean Arnold; 'linaro-validation(a)lists.linaro.org Validation'
> Cc: Basil Eljuse; Ian Spray
> Subject: RE: null value in column "admin_notifications" violates not-null constraint
>
> Hi All
>
> I was wondering if you could help me please?
>
> I have managed to get my instance of LAVA working again; however, now
> I am seeing an issue where every time a job is submitted, it is
> submitted twice and both jobs grab the test resource, meaning we are
> seeing some bizarre behaviour in the test run. See attached log.
>
> I am also seeing this when I attempt an upgrade:
>
> + set +x
> + lava-server manage syncdb --noinput
> WARNING:root:This instance will not use sentry as SENTRY_DSN is not
> configured
> + set +x
> + lava-server manage migrate --noinput
> WARNING:root:This instance will not use sentry as SENTRY_DSN is not
> configured
> Traceback (most recent call last):
> File "/srv/lava/instances/production/bin/lava-server", line 55, in
> <module>
> lava_server.manage.main()
> File "/srv/lava/.cache/git-cache/exports/lava-server/2013-10-17-
> 97c7da5/lava_server/manage.py", line 128, in main
> run_with_dispatcher_class(LAVAServerDispatcher)
> File "/srv/lava/.cache/eggs/lava_tool-0.7-
> py2.7.egg/lava_tool/dispatcher.py", line 45, in
> run_with_dispatcher_class
> raise cls.run()
> File "/srv/lava/.cache/eggs/lava_tool-0.7-
> py2.7.egg/lava/tool/dispatcher.py", line 147, in run
> raise SystemExit(cls().dispatch(args))
> File "/srv/lava/.cache/eggs/lava_tool-0.7-
> py2.7.egg/lava/tool/dispatcher.py", line 137, in dispatch
> return command.invoke()
> File "/srv/lava/.cache/git-cache/exports/lava-server/2013-10-17-
> 97c7da5/lava_server/manage.py", line 116, in invoke
> execute_manager(settings, ['lava-server'] + self.args.command)
> File "/srv/lava/.cache/eggs/Django-1.4.2-
> py2.7.egg/django/core/management/__init__.py", line 459, in
> execute_manager
> utility.execute()
> File "/srv/lava/.cache/eggs/Django-1.4.2-
> py2.7.egg/django/core/management/__init__.py", line 382, in execute
> self.fetch_command(subcommand).run_from_argv(self.argv)
> File "/srv/lava/.cache/eggs/Django-1.4.2-
> py2.7.egg/django/core/management/base.py", line 196, in run_from_argv
> self.execute(*args, **options.__dict__)
> File "/srv/lava/.cache/eggs/Django-1.4.2-
> py2.7.egg/django/core/management/base.py", line 232, in execute
> output = self.handle(*args, **options)
> File "/srv/lava/.cache/eggs/South-0.7.5-
> py2.7.egg/south/management/commands/migrate.py", line 107, in handle
> ignore_ghosts = ignore_ghosts,
> File "/srv/lava/.cache/eggs/South-0.7.5-
> py2.7.egg/south/migration/__init__.py", line 199, in migrate_app
> applied_all = check_migration_histories(applied_all, delete_ghosts,
> ignore_ghosts)
> File "/srv/lava/.cache/eggs/South-0.7.5-
> py2.7.egg/south/migration/__init__.py", line 88, in
> check_migration_histories
> raise exceptions.GhostMigrations(ghosts)
> south.exceptions.GhostMigrations:
>
> ! These migrations are in the database but not on disk:
> <lava_scheduler_app:
> 0033_auto__add_field_testjob_admin_notifications>
> ! I'm not trusting myself; either fix this yourself by fiddling
> ! with the south_migrationhistory table, or pass --delete-ghost-migrations
> ! to South to have it delete ALL of these records (this may not be
> ! good).
> + die 'Failed to run database migrations'
> + echo 'Failed to run database migrations'
> + exit 1
>
> I suspect that this is the underlying issue. Could you please
> recommend the best way to fix a ghost migration issue? The message
> suggests fiddling with the south_migrationhistory table, but I would
> rather not hack away blindly :)
>
> Thanks
> Dean
>
>
> > -----Original Message-----
> > From: Dean Arnold
> > Sent: 25 October 2013 13:01
> > To: linaro-validation(a)lists.linaro.org Validation
> > Cc: Basil Eljuse; Ian Spray
> > Subject: null value in column "admin_notifications" violates not-null constraint
> >
> > Hi All,
> >
> > I have recently carried out an upgrade of LAVA and I am now seeing an
> > issue, where I am unable to trigger any jobs. The error listed in
> > /srv/lava/instances/production/var/log/lava-scheduler.log can be seen
> > below.
> >
> > I have checked the database column in question (admin_notifications in
> > the lava_scheduler_app_testjob table?) and its contents are, as the
> > error says, null. I have tried populating this column with a non-null
> > string in an attempt to make Django happy, but I am still seeing the
> > problem.
> >
> > I am not sure where the corruption happened; I presume something went
> > wrong in the upgrade stage. Would it be possible to give me an example
> > of what should be in this column, and I will add the data manually to
> > try and resolve the problem.
> >
> > Thanks
> > Dean
> >
> > ###############################
> >
> >
> > 2013-10-25 11:51:55,364 [ERROR]
> > [lava_scheduler_daemon.service.JobQueue] IntegrityError: null value in
> > column "admin_notifications" violates not-null constraint
> >
> > Traceback (most recent call last):
> > File "/usr/lib/python2.7/threading.py", line 524, in __bootstrap
> > self.__bootstrap_inner()
> > File "/usr/lib/python2.7/threading.py", line 551, in
> > __bootstrap_inner
> > self.run()
> > File "/usr/lib/python2.7/threading.py", line 504, in run
> > self.__target(*self.__args, **self.__kwargs)
> > --- <exception caught here> ---
> > File "/srv/lava/.cache/eggs/Twisted-12.1.0-py2.7-linux-
> > x86_64.egg/twisted/python/threadpool.py", line 167, in _worker
> > result = context.call(ctx, function, *args, **kwargs)
> > File "/srv/lava/.cache/eggs/Twisted-12.1.0-py2.7-linux-
> > x86_64.egg/twisted/python/context.py", line 118, in callWithContext
> > return self.currentContext().callWithContext(ctx, func, *args,
> > **kw)
> > File "/srv/lava/.cache/eggs/Twisted-12.1.0-py2.7-linux-
> > x86_64.egg/twisted/python/context.py", line 81, in callWithContext
> > return func(*args,**kw)
> > File "/srv/lava/.cache/git-cache/exports/lava-server/2013-10-17-
> > 97c7da5/lava_scheduler_daemon/dbjobsource.py", line 70, in wrapper
> > return func(*args, **kw)
> > File "/srv/lava/.cache/git-cache/exports/lava-server/2013-10-17-
> > 97c7da5/lava_scheduler_daemon/dbjobsource.py", line 242, in
> > getJobList_impl
> > job_list = self._assign_jobs(job_list)
> > File "/srv/lava/.cache/git-cache/exports/lava-server/2013-10-17-
> > 97c7da5/lava_scheduler_daemon/dbjobsource.py", line 205, in
> > _assign_jobs
> > job_list = self._get_health_check_jobs()
> > File "/srv/lava/.cache/git-cache/exports/lava-server/2013-10-17-
> > 97c7da5/lava_scheduler_daemon/dbjobsource.py", line 121, in
> > _get_health_check_jobs
> > job_list.append(self._getHealthCheckJobForBoard(device))
> > File "/srv/lava/.cache/git-cache/exports/lava-server/2013-10-17-
> > 97c7da5/lava_scheduler_daemon/dbjobsource.py", line 286, in
> > _getHealthCheckJobForBoard
> > return TestJob.from_json_and_user(job_json, user, True)
> > File "/srv/lava/.cache/git-cache/exports/lava-server/2013-10-17-
> > 97c7da5/lava_scheduler_app/models.py", line 622, in from_json_and_user
> > job.save()
> > File "/srv/lava/.cache/eggs/django_restricted_resource-0.2.7-
> > py2.7.egg/django_restricted_resource/models.py", line 71, in save
> > return super(RestrictedResource, self).save(*args, **kwargs)
> > File "/srv/lava/.cache/eggs/Django-1.4.2-
> > py2.7.egg/django/db/models/base.py", line 463, in save
> > self.save_base(using=using, force_insert=force_insert,
> > force_update=force_update)
> > File "/srv/lava/.cache/eggs/Django-1.4.2-
> > py2.7.egg/django/db/models/base.py", line 551, in save_base
> > result = manager._insert([self], fields=fields,
> > return_id=update_pk, using=using, raw=raw)
> > File "/srv/lava/.cache/eggs/Django-1.4.2-
> > py2.7.egg/django/db/models/manager.py", line 203, in _insert
> > return insert_query(self.model, objs, fields, **kwargs)
> > File "/srv/lava/.cache/eggs/Django-1.4.2-
> > py2.7.egg/django/db/models/query.py", line 1593, in insert_query
> > return query.get_compiler(using=using).execute_sql(return_id)
> > File "/srv/lava/.cache/eggs/Django-1.4.2-
> > py2.7.egg/django/db/models/sql/compiler.py", line 910, in execute_sql
> > cursor.execute(sql, params)
> > File "/srv/lava/.cache/eggs/Django-1.4.2-
> > py2.7.egg/django/db/backends/postgresql_psycopg2/base.py", line 52, in execute
> > return self.cursor.execute(query, args)
> > django.db.utils.IntegrityError: null value in column
> > "admin_notifications" violates not-null constraint
> >
> >
> > 2013-10-25 11:51:55,365 [ERROR] [sentry.errors] No servers configured,
> > and sentry not installed. Cannot send message