Hi Neil,
Earlier today I configured two new virtual machines to test the master/worker setup, and I have now managed to get it working. I installed the master and worker from scratch, with the LAVA_DB_ALLOWREMOTE variable set on the master, and all went well. I was able to do a couple of qemu runs on the worker with no issues.
It looks as though the problem I was seeing on our production LAVA setup was down to my master configuration, which wasn't correct. When doing an upgrade of a master instance, the deployment tool doesn't carry out the steps for enabling remote workers, even with the LAVA_DB_ALLOWREMOTE variable set. I had then corrected some of the steps manually but missed a couple of bits, which is why I ended up with a *kind of* working setup.
When I did a fresh install on my test setup today, the lava-deployment-tool handled the extra steps perfectly.
Thanks for your help with this. Now that I have a working setup to diff my production version against, I can apply the missing bits of config to the master. Alternatively, I'll just do a fresh install and restore the database from a backup :)
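For anyone wanting to do the same comparison, here is a minimal sketch of diffing a config file from the working setup against the production copy. It uses Python's standard difflib; the function name, the "production/" and "working/" labels and the default file name are assumptions made for illustration, not anything taken from lava-deployment-tool.

```python
import difflib


def config_diff(working_text, production_text, name="pg_hba.conf"):
    """Return a unified diff (as one string) from the production config to the working one.

    The 'production/' and 'working/' prefixes are only labels for the diff
    header; they are hypothetical and do not refer to real paths.
    """
    return "".join(
        difflib.unified_diff(
            production_text.splitlines(keepends=True),
            working_text.splitlines(keepends=True),
            fromfile="production/" + name,
            tofile="working/" + name,
        )
    )


if __name__ == "__main__":
    production = "local all all peer\n"
    working = "local all all peer\nhost all all 0.0.0.0/0 trust\n"
    # Lines starting with '+' are present in the working setup only.
    print(config_diff(working, production))
```

Lines prefixed with "+" in the result are the bits of config that the working setup has and production lacks.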
I don't know if this is something you guys would like to be automated by the lava-deployment-tool during upgrades, or whether some steps on the wiki for the extra bit of config would be enough? If you need me to raise a ticket/bug for this, I can do.
Cheers,
Dean
-----Original Message-----
From: Neil Williams [mailto:codehelp@debian.org]
Sent: 09 October 2013 10:20
To: Dean Arnold
Cc: linaro-validation@lists.linaro.org Validation
Subject: Re: [Linaro-validation] More Remote Dispatcher Questions - possible fix
On Tue, 8 Oct 2013 19:43:17 +0100 Dean Arnold <Dean.Arnold@arm.com> wrote:
next - Use the master information as is
edit - Edit the master information
Please decide what to do [next]: next
./lava-deployment-tool: line 270: defaults_coordinator: command not found
Could this be what is causing my problems?
Sorry, I completely missed this first time around. This is a bug in lava-deployment-tool - it is what caused the lack of the /etc/lava-coordinator/lava-coordinator.conf file which you added manually. I'll fix that problem in lava-deployment-tool. It's a minor change:
diff --git a/lava-deployment-tool b/lava-deployment-tool
index 35486db..ca5d52e 100755
--- a/lava-deployment-tool
+++ b/lava-deployment-tool
@@ -708,6 +708,10 @@ wizard_coordinator () {
     true
 }
 
+defaults_coordinator () {
+    true
+}
+
 defaults_buildout () {
     true
 }
LAVA should still have set up postgresql correctly from a fresh install, especially as you mentioned that this is a fresh Ubuntu 12.04 LTS install.
My master instance of LAVA was installed on a clean Ubuntu server installation and the only time postgresql was added was during the deployment of LAVA. The port number hadn't changed, but the listen_addresses setting hadn't been set, and therefore it just wasn't listening for remote connections on that port at all.
I do intermittent reinstalls but most of the time the LAVA team are doing upgrades of existing clients.
Fresh installs tend to be tests in a VM which doesn't involve a remote dispatcher. The remote dispatchers are usually set up later but without these kinds of problems.
lava-deployment-tool should have added this line to the postgresql configuration:
"host all all 0.0.0.0/0 trust"
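As a rough way to check both settings discussed here, the sketch below inspects the text of pg_hba.conf for a "host" rule covering 0.0.0.0/0 and the text of postgresql.conf for an active listen_addresses setting. The function names are made up for illustration, and the parsing is deliberately simple (it assumes whitespace-separated pg_hba.conf fields and a single-quoted listen_addresses value), so treat it as a sanity check rather than a full parser.

```python
import re


def allows_remote_hosts(pg_hba_text):
    """Return True if an active 'host' rule covers every IPv4 address (0.0.0.0/0)."""
    for raw in pg_hba_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # ignore comments and blank lines
        fields = line.split()
        # Expected layout: host <database> <user> <address> <method>
        if len(fields) >= 5 and fields[0] == "host" and fields[3] == "0.0.0.0/0":
            return True
    return False


def listen_addresses(postgresql_conf_text):
    """Return the active listen_addresses value, or None if unset or commented out."""
    for raw in postgresql_conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()
        match = re.match(r"listen_addresses\s*=?\s*'([^']*)'", line)
        if match:
            return match.group(1)
    return None


if __name__ == "__main__":
    print(allows_remote_hosts("host all all 0.0.0.0/0 trust\n"))
    print(listen_addresses("#listen_addresses = 'localhost'\n"))
```

When listen_addresses is unset, PostgreSQL listens on localhost only, so a None result here is consistent with a worker being unable to connect even though the port number is unchanged.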
I think that shows where the problem originally lies and that led me to this resource which should become the basis of the missing docs.
http://www.linaro.org/connect-lcu13/resources/Q/lce13
ADVANCED LAVA LAB CONFIGURATION
Page 11 of this PDF
http://www.linaro.org/documents/download/74099337b34eb0ab4521fb574f3bcab751e5603eee579
*Before* running ./lava-deployment-tool on the *master*, there is a variable to be exported which allows for the remote database connection from remote workers:
export LAVA_DB_ALLOWREMOTE=yes
That command also works on *upgrades*, so this may actually fix your problem. Upgrade master with this environment variable set, then restart lava on the worker.
I'll see about adding a note to setupworker or installworker to ensure that this is set up on the master - the problem being that the master doesn't want this enabled by default and the security implications should be clearly set out in the docs.
The information regarding the configuration of postgresql was included for completeness only; I wasn't suggesting this was anything to do with LAVA.
Looks like a django error - your database connection is still not correct.
Is this something you have seen before?
No. I just googled SENTRY_DSN.
I too googled this, however as it is the first time we are setting up LAVA this way, and as this aspect of LAVA configuration is not documented,
... we plan to sort out missing docs like that ... there's already a card for it and I've added a comment to the card as a reminder about the magic environment variable.
The error message is very obscure - lava-deployment-tool could do with a way of testing whether it can see the database and reporting a sensible error message instead.
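A minimal sketch of the kind of check meant here, assuming the worker only needs to confirm that the master's postgresql port accepts TCP connections. The function name, the default port of 5432 and the wording of the messages are assumptions for illustration; this is not code from lava-deployment-tool, and it tests reachability only, not authentication.

```python
import socket


def check_database_reachable(host, port=5432, timeout=3.0):
    """Try a plain TCP connection to the database port and return (ok, message)."""
    target = "%s:%d" % (host, port)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True, "Connected to %s" % target
    except socket.timeout:
        return False, "Timed out connecting to %s; check firewalls and routing" % target
    except ConnectionRefusedError:
        return False, (
            "Connection to %s refused; is postgresql listening on that "
            "address (listen_addresses) and port?" % target
        )
    except OSError as exc:
        return False, "Could not reach %s: %s" % (target, exc)


if __name__ == "__main__":
    ok, message = check_database_reachable("127.0.0.1")
    print(message)
```

A refused connection points at the listen_addresses side of the problem, whereas a successful connection followed by a database error would point at the access rules instead.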
I didn't think it would be unreasonable to ask those who had written the system, and configured it this way before, whether they had seen this particular error, in case they were able to point me at a simple fix or a configuration step I may have missed, before I started investigating this further.
If you haven't seen this then fair enough, I will go carry on with my own investigations.
Sorry to leave you on a limb with this. The error you saw was unfamiliar, but after investigation, I hope this will fix the problem for you.
No - however, if you have a postgresql server installed on the worker, it is not required.
No there is no postgresql server installed on the worker.
OK, then this does need investigation because that kind of setup should "just work", especially on precise.
The initial use of setup instead of setupworker could have messed up the database configuration on the worker. It just looks like the worker cannot find the database.
After the mess I made of the previous install, I set up a VM to trial this, so this has been deployed from scratch using the updated commands you gave me.
Possibly the issue is with the configuration of the master. There has already been one issue with the postgresql setup, so possibly there are more. I will look into it.
It may well be with the configuration of the master, down to the missing environment variable.
--
Neil Williams