I have now resolved the issues mentioned in the previous email and, in doing so, upgraded my LAVA instance to the latest version. This pulled in a commit that uses 'ip route get' to detect which interface is connected to the LAVA dispatcher.
This has resulted in our TC2 jobs failing with the following error when attempting to boot the master image:
root@master [rc=0]#
2013-10-31 02:25:02 PM INFO: Waiting for network to come up
LC_ALL=C ping -W4 -c1 10.1.103.191
PING 10.1.103.191 (10.1.103.191) 56(84) bytes of data.
64 bytes from 10.1.103.191: icmp_req=1 ttl=62 time=0.370 ms

--- 10.1.103.191 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.370/0.370/0.370/0.000 ms
root@master [rc=0]# ifconfig `ip route get 10.1.103.191 | cut -d ' ' -f3` | grep 'inet addr' |awk -F: '{split($2,a," "); print "<" a[1] ">"}'
10.1.99.1: error fetching interface information: Device not found
root@master [rc=0]#

2013-10-31 02:26:12 PM ERROR: Unable to determine target image IP address
2013-10-31 02:26:12 PM INFO: CriticalError
2013-10-31 02:26:12 PM WARNING: [ACTION-E] deploy_linaro_android_image is finished with error (Unable to determine target image IP address).
The reason is that the command assumes the interface name will be in the third field when doing the cut. However, on our setup 'ip route get' actually gives output like this (shown here for 10.1.103.197):
$ ip route get 10.1.103.197
10.1.103.197 via 10.1.99.1 dev eth0 src 10.1.99.87
    Cache
The "cut -d ' ' -f3" step then yields 10.1.99.1, which is the gateway address rather than an interface name, so ifconfig fails with "Device not found". I suspect this happens because our dispatcher is on a different subnet to our target devices, hence the "via 10.1.99.1" in the output.
I agree that it makes sense to use 'ip route get' to determine the interface, but I was wondering if you could provide more flexible parsing of its output, please? I can raise a Launchpad bug for this if you would like.
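To illustrate the kind of parsing I have in mind, here is a minimal sketch in Python. It is not taken from the dispatcher code: the function name and the same-subnet sample line are hypothetical. Rather than relying on a fixed field position, it returns the token that follows the "dev" keyword, which should work whether or not a "via <gateway>" hop is present.

```python
def interface_from_route(route_output):
    """Return the interface name that follows 'dev' in the output of
    'ip route get', or None if no 'dev' keyword is present."""
    tokens = route_output.split()
    # Stop one short of the end so tokens[i + 1] always exists.
    for i, token in enumerate(tokens[:-1]):
        if token == "dev":
            return tokens[i + 1]
    return None


# Routed case, as seen on our setup (dispatcher on a different subnet).
routed = "10.1.103.197 via 10.1.99.1 dev eth0 src 10.1.99.87\n    Cache"

# Hypothetical same-subnet case, where there is no "via <gateway>" part.
direct = "10.1.99.50 dev eth0 src 10.1.99.87\n    Cache"

print(interface_from_route(routed))
print(interface_from_route(direct))
```

Keying on "dev" instead of a column number means the same code handles both layouts; the resulting name can then be passed to ifconfig as before.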
Thanks,
Dean
-----Original Message-----
From: Dean Arnold
Sent: 30 October 2013 10:21
To: Dean Arnold; 'linaro-validation@lists.linaro.org Validation'
Cc: Basil Eljuse; Ian Spray
Subject: RE: null value in column "admin_notifications" violates not-null constraint
Hi All
I was wondering if you could help me please?
I have managed to get my instance of LAVA working again. However, I am now seeing an issue where every job that is submitted is submitted twice, and both jobs grab the test resource, so we are seeing some bizarre behaviour in the test run. See attached log.
I am also seeing this when I attempt an upgrade:
- set +x
- lava-server manage syncdb --noinput
WARNING:root:This instance will not use sentry as SENTRY_DSN is not configured
- set +x
- lava-server manage migrate --noinput
WARNING:root:This instance will not use sentry as SENTRY_DSN is not configured
Traceback (most recent call last):
  File "/srv/lava/instances/production/bin/lava-server", line 55, in <module>
    lava_server.manage.main()
  File "/srv/lava/.cache/git-cache/exports/lava-server/2013-10-17-97c7da5/lava_server/manage.py", line 128, in main
    run_with_dispatcher_class(LAVAServerDispatcher)
  File "/srv/lava/.cache/eggs/lava_tool-0.7-py2.7.egg/lava_tool/dispatcher.py", line 45, in run_with_dispatcher_class
    raise cls.run()
  File "/srv/lava/.cache/eggs/lava_tool-0.7-py2.7.egg/lava/tool/dispatcher.py", line 147, in run
    raise SystemExit(cls().dispatch(args))
  File "/srv/lava/.cache/eggs/lava_tool-0.7-py2.7.egg/lava/tool/dispatcher.py", line 137, in dispatch
    return command.invoke()
  File "/srv/lava/.cache/git-cache/exports/lava-server/2013-10-17-97c7da5/lava_server/manage.py", line 116, in invoke
    execute_manager(settings, ['lava-server'] + self.args.command)
  File "/srv/lava/.cache/eggs/Django-1.4.2-py2.7.egg/django/core/management/__init__.py", line 459, in execute_manager
    utility.execute()
  File "/srv/lava/.cache/eggs/Django-1.4.2-py2.7.egg/django/core/management/__init__.py", line 382, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/srv/lava/.cache/eggs/Django-1.4.2-py2.7.egg/django/core/management/base.py", line 196, in run_from_argv
    self.execute(*args, **options.__dict__)
  File "/srv/lava/.cache/eggs/Django-1.4.2-py2.7.egg/django/core/management/base.py", line 232, in execute
    output = self.handle(*args, **options)
  File "/srv/lava/.cache/eggs/South-0.7.5-py2.7.egg/south/management/commands/migrate.py", line 107, in handle
    ignore_ghosts = ignore_ghosts,
  File "/srv/lava/.cache/eggs/South-0.7.5-py2.7.egg/south/migration/__init__.py", line 199, in migrate_app
    applied_all = check_migration_histories(applied_all, delete_ghosts, ignore_ghosts)
  File "/srv/lava/.cache/eggs/South-0.7.5-py2.7.egg/south/migration/__init__.py", line 88, in check_migration_histories
    raise exceptions.GhostMigrations(ghosts)
south.exceptions.GhostMigrations:

 ! These migrations are in the database but not on disk:
    <lava_scheduler_app: 0033_auto__add_field_testjob_admin_notifications>
 ! I'm not trusting myself; either fix this yourself by fiddling
 ! with the south_migrationhistory table, or pass --delete-ghost-migrations
 ! to South to have it delete ALL of these records (this may not be good).
- die 'Failed to run database migrations'
- echo 'Failed to run database migrations'
- exit 1
I suspect that this is the underlying issue. Could you please recommend the best way to go about fixing a ghost migration? The message suggests fiddling with the south_migrationhistory table in the database, but I would rather not hack away blindly :)
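As I understand the error, a "ghost" is a migration that is recorded as applied in the database but has no corresponding file on disk. The following is only a minimal sketch of that comparison, not South's actual code: the function name is hypothetical, and the 0032 entry is made-up sample data (only the 0033 name comes from the error above).

```python
def find_ghost_migrations(applied_in_db, files_on_disk):
    """Return the migrations recorded in the database that have no
    matching migration file on disk, preserving database order."""
    on_disk = set(files_on_disk)
    return [name for name in applied_in_db if name not in on_disk]


# Hypothetical sample data; only the 0033 name is from the real error.
applied_in_db = [
    "0032_hypothetical_previous_migration",
    "0033_auto__add_field_testjob_admin_notifications",
]
files_on_disk = ["0032_hypothetical_previous_migration"]

print(find_ghost_migrations(applied_in_db, files_on_disk))
```

If that reading is right, the checked-out lava-server code is missing a migration that an earlier upgrade had already applied, which is why I would like advice before touching the table.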
Thanks,
Dean
-----Original Message-----
From: Dean Arnold
Sent: 25 October 2013 13:01
To: linaro-validation@lists.linaro.org Validation
Cc: Basil Eljuse; Ian Spray
Subject: null value in column "admin_notifications" violates not-null constraint
Hi All,
I have recently carried out an upgrade of LAVA and I am now unable to trigger any jobs. The error logged in /srv/lava/instances/production/var/log/lava-scheduler.log can be seen below.
I have checked the database column in question (admin_notifications in the lava_scheduler_app_testjob table?) and its contents are, as the error says, null. I have tried populating this column with a non-null string in an attempt to make Django happy, but I am still seeing the problem.
I am not sure where the corruption happened, but I presume something went wrong during the upgrade. Would it be possible to give me an example of what should be in this column, so that I can add the data manually to try to resolve the problem?
Thanks,
Dean
###############################
2013-10-25 11:51:55,364 [ERROR] [lava_scheduler_daemon.service.JobQueue] IntegrityError: null value in column "admin_notifications" violates not-null constraint
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 524, in __bootstrap
    self.__bootstrap_inner()
  File "/usr/lib/python2.7/threading.py", line 551, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 504, in run
    self.__target(*self.__args, **self.__kwargs)
--- <exception caught here> ---
  File "/srv/lava/.cache/eggs/Twisted-12.1.0-py2.7-linux-x86_64.egg/twisted/python/threadpool.py", line 167, in _worker
    result = context.call(ctx, function, *args, **kwargs)
  File "/srv/lava/.cache/eggs/Twisted-12.1.0-py2.7-linux-x86_64.egg/twisted/python/context.py", line 118, in callWithContext
    return self.currentContext().callWithContext(ctx, func, *args, **kw)
  File "/srv/lava/.cache/eggs/Twisted-12.1.0-py2.7-linux-x86_64.egg/twisted/python/context.py", line 81, in callWithContext
    return func(*args,**kw)
  File "/srv/lava/.cache/git-cache/exports/lava-server/2013-10-17-97c7da5/lava_scheduler_daemon/dbjobsource.py", line 70, in wrapper
    return func(*args, **kw)
  File "/srv/lava/.cache/git-cache/exports/lava-server/2013-10-17-97c7da5/lava_scheduler_daemon/dbjobsource.py", line 242, in getJobList_impl
    job_list = self._assign_jobs(job_list)
  File "/srv/lava/.cache/git-cache/exports/lava-server/2013-10-17-97c7da5/lava_scheduler_daemon/dbjobsource.py", line 205, in _assign_jobs
    job_list = self._get_health_check_jobs()
  File "/srv/lava/.cache/git-cache/exports/lava-server/2013-10-17-97c7da5/lava_scheduler_daemon/dbjobsource.py", line 121, in _get_health_check_jobs
    job_list.append(self._getHealthCheckJobForBoard(device))
  File "/srv/lava/.cache/git-cache/exports/lava-server/2013-10-17-97c7da5/lava_scheduler_daemon/dbjobsource.py", line 286, in _getHealthCheckJobForBoard
    return TestJob.from_json_and_user(job_json, user, True)
  File "/srv/lava/.cache/git-cache/exports/lava-server/2013-10-17-97c7da5/lava_scheduler_app/models.py", line 622, in from_json_and_user
    job.save()
  File "/srv/lava/.cache/eggs/django_restricted_resource-0.2.7-py2.7.egg/django_restricted_resource/models.py", line 71, in save
    return super(RestrictedResource, self).save(*args, **kwargs)
  File "/srv/lava/.cache/eggs/Django-1.4.2-py2.7.egg/django/db/models/base.py", line 463, in save
    self.save_base(using=using, force_insert=force_insert, force_update=force_update)
  File "/srv/lava/.cache/eggs/Django-1.4.2-py2.7.egg/django/db/models/base.py", line 551, in save_base
    result = manager._insert([self], fields=fields, return_id=update_pk, using=using, raw=raw)
  File "/srv/lava/.cache/eggs/Django-1.4.2-py2.7.egg/django/db/models/manager.py", line 203, in _insert
    return insert_query(self.model, objs, fields, **kwargs)
  File "/srv/lava/.cache/eggs/Django-1.4.2-py2.7.egg/django/db/models/query.py", line 1593, in insert_query
    return query.get_compiler(using=using).execute_sql(return_id)
  File "/srv/lava/.cache/eggs/Django-1.4.2-py2.7.egg/django/db/models/sql/compiler.py", line 910, in execute_sql
    cursor.execute(sql, params)
  File "/srv/lava/.cache/eggs/Django-1.4.2-py2.7.egg/django/db/backends/postgresql_psycopg2/base.py", line 52, in execute
    return self.cursor.execute(query, args)
django.db.utils.IntegrityError: null value in column "admin_notifications" violates not-null constraint
2013-10-25 11:51:55,365 [ERROR] [sentry.errors] No servers configured, and sentry not installed. Cannot send message