Hi Matt,

I have raised the following bug: https://bugs.launchpad.net/lava-dispatcher/+bug/1247005

Could you recommend the latest stable revision, prior to your changes, that I could pass to the lava-deployment tool?  At the moment I am unable to run any jobs at all, so I just need a revision I can use to keep things ticking over until this is fixed.

Thanks,

Dean

 

From: Matt Hart [mailto:matthew.hart@linaro.org]
Sent: 31 October 2013 17:23
To: Dean Arnold
Cc: linaro-validation@lists.linaro.org Validation; Ian Spray; Basil Eljuse
Subject: Re: [Linaro-validation] null value in column "admin_notifications" violates not-null constraint

Hi Dean,

I added that change and, yes, you're right: it looks like the code breaks when the dispatcher is on a different subnet. Please do file a Launchpad bug and I'll look at it when I get a chance.

Thanks,

Matt (in the LAVA Lab)

 

On 31 October 2013 10:04, Dean Arnold <Dean.Arnold@arm.com> wrote:

I have now resolved the issues mentioned in the previous email, and in doing so upgraded my LAVA instance to the latest version.  This pulled in a commit to use 'ip route get' to detect which interface is connected to the lava dispatcher.

This has resulted in our TC2 jobs failing with the following error when attempting to boot the master image:

root@master [rc=0]# 2013-10-31 02:25:02 PM INFO: Waiting for network to come up
 LC_ALL=C ping -W4 -c1 10.1.103.191
 PING 10.1.103.191 (10.1.103.191) 56(84) bytes of data.
 64 bytes from 10.1.103.191: icmp_req=1 ttl=62 time=0.370 ms

 --- 10.1.103.191 ping statistics ---
 1 packets transmitted, 1 received, 0% packet loss, time 0ms
 rtt min/avg/max/mdev = 0.370/0.370/0.370/0.000 ms
 root@master [rc=0]#
 ifconfig `ip route get 10.1.103.191 | cut -d ' ' -f3` | grep 'inet addr' |awk -F: '{split($2,a," "); print "<" a[1] ">"}'
 10.1.99.1: error fetching interface information: Device not found
 root@master [rc=0]#

2013-10-31 02:26:12 PM ERROR: Unable to determine target image IP address
2013-10-31 02:26:12 PM INFO: CriticalError
2013-10-31 02:26:12 PM WARNING: [ACTION-E] deploy_linaro_android_image is finished with error (Unable to determine target image IP address).

The reason is that the command assumes the interface name will be in the 3rd field when doing the cut.  However, on our setup the output of 'ip route get' actually looks like this:

$ ip route get 10.1.103.197
10.1.103.197 via 10.1.99.1 dev eth0  src 10.1.99.87
    Cache

The "cut -d ' ' -f3" step therefore returns 10.1.99.1, which is the gateway address rather than an interface name, and ifconfig cannot cope with it.  I suspect this happens because our dispatcher is on a different subnet from our target devices, hence the "via 10.1.99.1" in the output.

I agree that it makes sense to use 'ip route get' to determine the interface, but could you make the parsing of its output more flexible?  I can raise a Launchpad bug for this if you would like.
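
A minimal sketch of one more flexible approach, assuming the interface name always follows the literal token "dev" in the 'ip route get' output. That holds for the output quoted above, but it has not been checked against every iproute2 version. It is written in Python purely for illustration: the function names and the same-subnet sample are hypothetical, and this is not the dispatcher's actual code.

```python
def cut_field3(route_output):
    """Mimic `cut -d ' ' -f3` on the first output line (fields are 1-indexed)."""
    return route_output.splitlines()[0].split(" ")[2]


def interface_from_route(route_output):
    """Return the token that follows 'dev', or None if there is no such token."""
    tokens = route_output.split()
    for i, tok in enumerate(tokens[:-1]):
        if tok == "dev":
            return tokens[i + 1]
    return None


# Output quoted in this thread: the route goes via a gateway.
via_gateway = "10.1.103.197 via 10.1.99.1 dev eth0  src 10.1.99.87\n    Cache\n"
# Hypothetical same-subnet output for comparison (addresses are made up).
same_subnet = "10.1.99.50 dev eth0  src 10.1.99.87\n    Cache\n"

print(cut_field3(via_gateway))            # 10.1.99.1 (gateway, not an interface)
print(cut_field3(same_subnet))            # eth0
print(interface_from_route(via_gateway))  # eth0
print(interface_from_route(same_subnet))  # eth0
```

Keying on the "dev" token rather than a fixed field position gives the same answer whether or not a "via <gateway>" pair is present.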


Thanks
Dean

> -----Original Message-----
> From: Dean Arnold
> Sent: 30 October 2013 10:21
> To: Dean Arnold; 'linaro-validation@lists.linaro.org Validation'
> Cc: Basil Eljuse; Ian Spray
> Subject: RE: null value in column "admin_notifications" violates not-null constraint
>
> Hi All
>
> I was wondering if you could help me please?
>
> I have managed to get my instance of LAVA working again. However, I am
> now seeing an issue where every time a job is submitted, it is submitted
> twice and both jobs grab the test resource, so we are seeing some
> bizarre behaviour in the test run. See attached log.
>
> I am also seeing this when I attempt an upgrade:
>
> + set +x
> + lava-server manage syncdb --noinput
> WARNING:root:This instance will not use sentry as SENTRY_DSN is not
> configured
> + set +x
> + lava-server manage migrate --noinput
> WARNING:root:This instance will not use sentry as SENTRY_DSN is not
> configured
> Traceback (most recent call last):
>   File "/srv/lava/instances/production/bin/lava-server", line 55, in
> <module>
>     lava_server.manage.main()
>   File "/srv/lava/.cache/git-cache/exports/lava-server/2013-10-17-
> 97c7da5/lava_server/manage.py", line 128, in main
>     run_with_dispatcher_class(LAVAServerDispatcher)
>   File "/srv/lava/.cache/eggs/lava_tool-0.7-
> py2.7.egg/lava_tool/dispatcher.py", line 45, in
> run_with_dispatcher_class
>     raise cls.run()
>   File "/srv/lava/.cache/eggs/lava_tool-0.7-
> py2.7.egg/lava/tool/dispatcher.py", line 147, in run
>     raise SystemExit(cls().dispatch(args))
>   File "/srv/lava/.cache/eggs/lava_tool-0.7-
> py2.7.egg/lava/tool/dispatcher.py", line 137, in dispatch
>     return command.invoke()
>   File "/srv/lava/.cache/git-cache/exports/lava-server/2013-10-17-
> 97c7da5/lava_server/manage.py", line 116, in invoke
>     execute_manager(settings, ['lava-server'] + self.args.command)
>   File "/srv/lava/.cache/eggs/Django-1.4.2-
> py2.7.egg/django/core/management/__init__.py", line 459, in
> execute_manager
>     utility.execute()
>   File "/srv/lava/.cache/eggs/Django-1.4.2-
> py2.7.egg/django/core/management/__init__.py", line 382, in execute
>     self.fetch_command(subcommand).run_from_argv(self.argv)
>   File "/srv/lava/.cache/eggs/Django-1.4.2-
> py2.7.egg/django/core/management/base.py", line 196, in run_from_argv
>     self.execute(*args, **options.__dict__)
>   File "/srv/lava/.cache/eggs/Django-1.4.2-
> py2.7.egg/django/core/management/base.py", line 232, in execute
>     output = self.handle(*args, **options)
>   File "/srv/lava/.cache/eggs/South-0.7.5-
> py2.7.egg/south/management/commands/migrate.py", line 107, in handle
>     ignore_ghosts = ignore_ghosts,
>   File "/srv/lava/.cache/eggs/South-0.7.5-
> py2.7.egg/south/migration/__init__.py", line 199, in migrate_app
>     applied_all = check_migration_histories(applied_all, delete_ghosts,
> ignore_ghosts)
>   File "/srv/lava/.cache/eggs/South-0.7.5-
> py2.7.egg/south/migration/__init__.py", line 88, in
> check_migration_histories
>     raise exceptions.GhostMigrations(ghosts)
> south.exceptions.GhostMigrations:
>
>  ! These migrations are in the database but not on disk:
>     <lava_scheduler_app:
> 0033_auto__add_field_testjob_admin_notifications>
>  ! I'm not trusting myself; either fix this yourself by fiddling
>  ! with the south_migrationhistory table, or pass --delete-ghost-
> migrations
>  ! to South to have it delete ALL of these records (this may not be
> good).
> + die 'Failed to run database migrations'
> + echo 'Failed to run database migrations'
> + exit 1
>
> I suspect that this is the underlying issue.  Could you please
> recommend the best way to go about fixing a ghost migration issue?
> The message suggests fiddling with the database, but I would rather
> not hack away blindly :)
>
> Thanks
> Dean
>
>
> > -----Original Message-----
> > From: Dean Arnold
> > Sent: 25 October 2013 13:01
> > To: linaro-validation@lists.linaro.org Validation
> > Cc: Basil Eljuse; Ian Spray
> > Subject: null value in column "admin_notifications" violates not-null
> > constraint
> >
> > Hi All,
> >
> > I have recently carried out an upgrade of LAVA and I am now seeing an
> > issue, where I am unable to trigger any jobs.  The error listed in
> > /srv/lava/instances/production/var/log/lava-scheduler.log can be seen
> > below.
> >
> > I have checked the database column in question (admin_notifications in
> > the lava_scheduler_app_testjob table?) and its contents are, as the
> > error says, null.  I have tried populating this column with a non-null
> > string in an attempt to make Django happy, but I am still seeing the
> > problem.
> >
> > I am not sure where the corruption happened; I presume something went
> > wrong in the upgrade stage.  Could you give me an example of what
> > should be in this column? I will then add the data manually to try to
> > resolve the problem.
> >
> > Thanks
> > Dean
> >
> > ###############################
> >
> >
> > 2013-10-25 11:51:55,364 [ERROR]
> > [lava_scheduler_daemon.service.JobQueue] IntegrityError: null value in
> > column "admin_notifications" violates not-null constraint
> >
> > Traceback (most recent call last):
> >   File "/usr/lib/python2.7/threading.py", line 524, in __bootstrap
> >     self.__bootstrap_inner()
> >   File "/usr/lib/python2.7/threading.py", line 551, in
> > __bootstrap_inner
> >     self.run()
> >   File "/usr/lib/python2.7/threading.py", line 504, in run
> >     self.__target(*self.__args, **self.__kwargs)
> > --- <exception caught here> ---
> >   File "/srv/lava/.cache/eggs/Twisted-12.1.0-py2.7-linux-
> > x86_64.egg/twisted/python/threadpool.py", line 167, in _worker
> >     result = context.call(ctx, function, *args, **kwargs)
> >   File "/srv/lava/.cache/eggs/Twisted-12.1.0-py2.7-linux-
> > x86_64.egg/twisted/python/context.py", line 118, in callWithContext
> >     return self.currentContext().callWithContext(ctx, func, *args,
> > **kw)
> >   File "/srv/lava/.cache/eggs/Twisted-12.1.0-py2.7-linux-
> > x86_64.egg/twisted/python/context.py", line 81, in callWithContext
> >     return func(*args,**kw)
> >   File "/srv/lava/.cache/git-cache/exports/lava-server/2013-10-17-
> > 97c7da5/lava_scheduler_daemon/dbjobsource.py", line 70, in wrapper
> >     return func(*args, **kw)
> >   File "/srv/lava/.cache/git-cache/exports/lava-server/2013-10-17-
> > 97c7da5/lava_scheduler_daemon/dbjobsource.py", line 242, in
> > getJobList_impl
> >     job_list = self._assign_jobs(job_list)
> >   File "/srv/lava/.cache/git-cache/exports/lava-server/2013-10-17-
> > 97c7da5/lava_scheduler_daemon/dbjobsource.py", line 205, in
> > _assign_jobs
> >     job_list = self._get_health_check_jobs()
> >   File "/srv/lava/.cache/git-cache/exports/lava-server/2013-10-17-
> > 97c7da5/lava_scheduler_daemon/dbjobsource.py", line 121, in
> > _get_health_check_jobs
> >     job_list.append(self._getHealthCheckJobForBoard(device))
> >   File "/srv/lava/.cache/git-cache/exports/lava-server/2013-10-17-
> > 97c7da5/lava_scheduler_daemon/dbjobsource.py", line 286, in
> > _getHealthCheckJobForBoard
> >     return TestJob.from_json_and_user(job_json, user, True)
> >   File "/srv/lava/.cache/git-cache/exports/lava-server/2013-10-17-97c7da5/lava_scheduler_app/models.py", line 622, in from_json_and_user
> >     job.save()
> >   File "/srv/lava/.cache/eggs/django_restricted_resource-0.2.7-
> > py2.7.egg/django_restricted_resource/models.py", line 71, in save
> >     return super(RestrictedResource, self).save(*args, **kwargs)
> >   File "/srv/lava/.cache/eggs/Django-1.4.2-
> > py2.7.egg/django/db/models/base.py", line 463, in save
> >     self.save_base(using=using, force_insert=force_insert,
> > force_update=force_update)
> >   File "/srv/lava/.cache/eggs/Django-1.4.2-
> > py2.7.egg/django/db/models/base.py", line 551, in save_base
> >     result = manager._insert([self], fields=fields,
> > return_id=update_pk, using=using, raw=raw)
> >   File "/srv/lava/.cache/eggs/Django-1.4.2-
> > py2.7.egg/django/db/models/manager.py", line 203, in _insert
> >     return insert_query(self.model, objs, fields, **kwargs)
> >   File "/srv/lava/.cache/eggs/Django-1.4.2-
> > py2.7.egg/django/db/models/query.py", line 1593, in insert_query
> >     return query.get_compiler(using=using).execute_sql(return_id)
> >   File "/srv/lava/.cache/eggs/Django-1.4.2-
> > py2.7.egg/django/db/models/sql/compiler.py", line 910, in execute_sql
> >     cursor.execute(sql, params)
> >   File "/srv/lava/.cache/eggs/Django-1.4.2-py2.7.egg/django/db/backends/postgresql_psycopg2/base.py", line 52, in execute
> >     return self.cursor.execute(query, args)
> > django.db.utils.IntegrityError: null value in column
> > "admin_notifications" violates not-null constraint
> >
> >
> > 2013-10-25 11:51:55,365 [ERROR] [sentry.errors] No servers configured,
> > and sentry not installed. Cannot send message

-- IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium.  Thank you.

ARM Limited, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No:  2557590
ARM Holdings plc, Registered office 110 Fulbourn Road, Cambridge CB1 9NJ, Registered in England & Wales, Company No:  2548782

_______________________________________________
linaro-validation mailing list
linaro-validation@lists.linaro.org
http://lists.linaro.org/mailman/listinfo/linaro-validation

 

