Hello
I have installed a fresh 2017.12 in a Docker container and I hit two problems:
- With the exception of lava-coordinator, all services (lava-master, lava-server, lava-slave, etc.) fail to start with:
service lava-xxxx start
lava-xxxx: unrecognized service
All other services (postgresql, apache2) start successfully.
- I then re-added the old init scripts, and when I add a worker with:
"lava-server manage workers add lab-slave"
I get:
Traceback (most recent call last):
  File "/usr/bin/lava-server", line 78, in <module>
    main()
  File "/usr/bin/lava-server", line 74, in main
    execute_from_command_line(django_options)
  File "/usr/lib/python2.7/dist-packages/django/core/management/__init__.py", line 367, in execute_from_command_line
    utility.execute()
  File "/usr/lib/python2.7/dist-packages/django/core/management/__init__.py", line 359, in execute
    self.fetch_command(subcommand).run_from_argv(self.argv)
  File "/usr/lib/python2.7/dist-packages/django/core/management/base.py", line 294, in run_from_argv
    self.execute(*args, **cmd_options)
  File "/usr/lib/python2.7/dist-packages/django/core/management/base.py", line 345, in execute
    output = self.handle(*args, **options)
  File "/usr/lib/python2.7/dist-packages/lava_server/management/commands/workers.py", line 77, in handle
    options["disabled"])
KeyError: 'disabled'
Note that I had previously run other commands successfully, such as adding users or device-types.
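(On the first problem, my understanding is that 2017.12 dropped the SysV init scripts in favour of systemd units, which would explain why the sysvinit "service" wrapper reports "unrecognized service" inside the container, where systemd is not running. Assuming a systemd-enabled container, the equivalent would presumably be:

    systemctl start lava-server lava-master lava-slave
    systemctl status lava-master

On the second problem, the traceback suggests the "add" sub-command reads a "disabled" option that its argument parser never registered in this release.)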
Regards
Possibly the hardest thing that can be done with LAVA is to integrate a new
device-type. Every story is different; every device has its own issues.
The LAVA software team would like to build a history of integration stories
to document the recovery process, provide integration hints for others, and
build a wider picture of the requirements of automation across the
ecosystem.
A key part of any device integration is how to recover a device which has
been taken offline by a broken build or broken test action. It is rare that
a new device is suitable for LAVA directly out of the box; there are
usually firmware updates, configuration changes, jumper settings or other
modifications required. Each of these steps needs to be documented so that
someone else can integrate the same board into their instance, and the same
steps are useful when recovering a bricked device. Having the Jinja2
template is not usually enough.
(As a side note on that, we do welcome all contributions of new Jinja2
templates. The one place where everyone looks for new device-type support
is lava_scheduler_app/tests/device-types/ in the master branch of
http://git.linaro.org/lava/lava-server.git. If you have integrated a
device-type which does not exist in that directory, please follow the
contribution guidelines in the documentation and submit your Jinja2
template for review.)
https://git.linaro.org/lava/lava-server.git/tree/lava_scheduler_app/tests/d…
Recent device integrations in LAVA are documented through a process of
comments on JIRA stories (all LAVA JIRA stories are accessible
anonymously). This sometimes turns out to be trial and error, so a README
document or similar is created for easy reference. I'll post some
of our recovery README documents on this thread later.
This is a request for contributions of your integration stories, your
device recovery README documents and your input on the minimum
requirements for automation in general.
It doesn't matter how unusual or how commonplace the hardware is. It
doesn't matter if it took months to integrate or days. Someone may need
your information to set up the same device in their instance. If your
device is available on the open market, someone could be thinking of how to
automate it.
Please post your recovery README texts and integration stories here. This list
is publicly archived, so this should make it easy for others to find your
contribution in their favourite search engine.
In future, as well as unit tests, the LAVA software team may ask for the
recovery notes and integration story to be posted here when reviewing
contributions of Jinja2 templates.
--
Neil Williams
=============
neil.williams(a)linaro.org
http://www.linux.codehelp.co.uk/
Hello!
Is there a way to set/override/disable the inactivity timeout?
While working with a (sometimes) slower board, we end up with "Terminal.stop: Inactivity timeout." I believe this is because the board fails to respond in a timely fashion and LAVA marks the connection as hung.
While searching for this in the docs, I only found a reference for hacking sessions.
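In case it helps frame the question: if this message comes from LAVA's connection handling rather than from the terminal server itself, and if it maps onto the job-level connection timeout, would raising that timeout in the job definition be the right approach? A minimal sketch with placeholder values (individual actions also seem to accept their own timeout blocks):

    timeouts:
      job:
        minutes: 30
      connection:
        minutes: 10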
Kind regards,
Dragos
Hi everyone!
Does LAVA offer a way to resize rootfs images (“ramdisk_files”, as they are referred to in the code) before deploying them?
We are working with our own Linux distro on a Cavium board and we need to resize the rootfs image (which is ext4) after the test overlays are added by LAVA. This step is necessary because we need to run a series of community tests that pull in a large number of package dependencies. Also, we do not want to resize the whole image by default, since that would have other CI storage side effects and also some policy implications (that I am not really aware of 😊).
Anyway, we are currently looking at adding/altering apply_overlay.py (pipeline/actions/) in order to run a resize2fs command after the “updated” ramdisk is unmounted. Just wondering if there is already a more elegant way of doing this.
We are also considering making some extensions, maybe getting to the point where we could specify this type of requirement in the device dictionary or the device template itself; specific code would then be executed only if such a setting is encountered at runtime, but that remains to be discussed.
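For concreteness, the host-side step we are considering after the unmount would be roughly the following (a sketch with placeholder names and sizes; growing an offline image means extending the backing file first, and resize2fs refuses to run without a clean e2fsck pass):

    truncate -s 2G rootfs.ext4   # extend the backing image file (placeholder size)
    e2fsck -f rootfs.ext4        # resize2fs requires a freshly checked filesystem
    resize2fs rootfs.ext4        # grow the ext4 filesystem to fill the file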
Thanks in advance!
Dragoș
Dear all,
I'm quite sure I've seen somewhere in the LAVA documentation an article explaining how to replay some LAVA jobs in a local mode, but I can't find this piece of information anymore.
By local mode, I mean:
- DUT at the developer desk, not in the test farm
- Jobs available on a Lava server
- Developer wants to replay/modify locally a job
Can you share any link or help on this?
Many thanks,
Denis
Hello everyone,
in our current project we have some devices that are not "directly"
supported by LAVA. I would like to ask for your opinion on which would be
the correct way to proceed.
The main problem is that our devices have an EFI implementation in the
firmware that makes the task of installing U-Boot very hard. To avoid
this, we thought about serving the image through an "emulated" USB
stick. On that front, we made some progress by setting up a secondary
device that uses the Linux g_mass_storage module to serve the image
through the USB OTG port.
Our setup is then:
1) The actual testing device
2) The secondary device which only serves the image with g_mass_storage
3) The host machine running the lava-slave application.
We thought that we could add a new device-type template to the LAVA
server that would somehow override the deployment and boot actions to
address our setup. We tried to look into the base.jinja2 file for some
sort of entry point that would allow us to, for example, run a script
that would first send the image to the secondary device and then load
the g_mass_storage module with the image file.
Unfortunately, it is still not clear to us how exactly we could
inject/override those steps and ignore, for instance, the boot loader
commands.
The first question would be, are we going in the right direction? Is
there any piece of documentation that would describe a similar problem?
What would be your recommendations on this topic?
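As a sketch of the kind of thing we are imagining (the script names are ours, and we are not sure these are the right hooks), the device dictionary seems to offer command hooks around power-on that might let a lab script push the image to the secondary device and load g_mass_storage before the DUT starts:

    {% extends 'our-efi-board.jinja2' %}
    {# hypothetical lab scripts running on the lava-slave host #}
    {% set pre_power_command = '/usr/local/lab-scripts/attach-usb-image.sh' %}
    {% set pre_os_command = '/usr/local/lab-scripts/detach-usb-image.sh' %}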
Thanks a lot in advance.
Best,
Alfonso
Hi,
I deploy the device with the GRUB (UEFI) method. It can now interrupt to the grub> prompt, run the boot commands and auto-login. I use the PXE ISO installation boot or the SATA boot, not the NFS boot.
However, the lava-overlay files are only added to the ramdisk (initrd.img), so they disappear when the OS boots. I therefore use transfer_overlay in my job, but it does nothing and the test shell cannot be found.
Please give me some help!
Job.yaml:
actions:
- deploy:
    to: tftp
    kernel:
      url: http://swlab004/1680141/vmlinuz-4.11.0-7.1.hxt.aarch64
      type: zimage
    ramdisk:
      url: http://swlab004/1680141/initramfs-4.11.0-7.1.hxt.aarch64.img
      compression: gz
      install_overlay: false
    preseed:
      url: http://swlab004/centos7/anaconda-ks_1026.cfg
    os: centos
    timeout:
      minutes: 5
- boot:
    timeout:
      minutes: 200
    connection: serial
    method: uefi-menu
    commands: ramdisk_boot
    auto_login:
      login_prompt: ".* login:"
      username: root
      password_prompt: 'Password:'
      password: root
    transfer_overlay:
      download_command: ifconfig;wget -S --progress=dot:giga
      unpack_command: tar -C / -xzf
    prompts:
    - 'Last login: .*'
    - '[root@.* .*]#'
    parameters:
      shutdown-message: "reboot: Restarting system"
- test:
    timeout:
      minutes: 50
    definitions:
    - repository:
        metadata:
          format: Lava-Test Test Definition 1.0
          name: smoke-tests-basic
          description: "Basic system test command for Linaro Ubuntu images. The test runs basic commands like pwd, uname, vmstat, ifconfig, lscpu, lsusb and lsb_release."
          maintainer:
          - hongyu.xu(a)hxt-semitech.com
          os:
          - centos
          scope:
          - functional
        run:
          steps:
          - echo "test1a:" "pass"
      from: inline
      name: smoke-tests1
      path: inline/smoke-tests.yaml
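One thing I am not sure about: if the prompts are matched as regular expressions, the unescaped brackets in '[root@.* .*]#' would form a character class instead of matching the literal prompt. Would escaping them, as a guess, be safer?

    prompts:
    - 'Last login: .*'
    - '\[root@.*\]#'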
Best Regards
XuHongyu
Greetings,
my embedded system is using Mender for over-the-air updates.
The way Mender works is that the embedded system needs to have two rootfs
partitions, where one is actively being used and the other is written to
when an over-the-air update is initiated. After the new rootfs is written,
Mender changes a variable in the bootloader environment and reboots, so
that the system boots into the new rootfs. If there is a problem booting
into the new rootfs, the bootloader is set to boot into the old one
again.
Has anyone here written LAVA tests for systems that use Mender? If so,
have you found that Mender complicates the test automation?
Any Mender boot test templates would also be appreciated.
--
Kind regards,
Arnstein Kleven
Hi all,
I boot my device with NFS in LAVA, but it cannot auto-login to the OS; I have put the logs in the attachment.
Please give me some ideas.
I can boot and auto-login successfully with SATA boot, using the same job.yaml file!
My job 454:
device_type: amberwing_rep1
job_name: amberwing_rep_nfs
priority: medium
visibility: public
metadata:
  docs-filename: amberwing_rep_nfs
timeouts:
  job:
    minutes: 300
  action:
    minutes: 200
  connection:
    minutes: 100
actions:
- deploy:
    to: tftp
    kernel:
      url: http://swlab004/centos7/iso20171026-0/images/pxeboot/vmlinuz
      type: zimage
    ramdisk:
      url: http://swlab004/centos7/iso20171026-0/images/pxeboot/initrd.img
      compression: xz
      install_overlay: false
    nfsrootfs:
      url: http://autotest002/tmp/rootfs_centos.tar.gz
      install_overlay: true
    preseed:
      url: http://swlab004/centos7/anaconda-ks_1026.cfg
    os: centos
    timeout:
      minutes: 5
- boot:
    timeout:
      minutes: 200
    connection: serial
    method: uefi-menu
    auto_login:
      login_prompt: '.* login:'
      username: root
      password_prompt: 'Password:'
      password: root
    commands: nfs_boot
    prompts:
    - 'Last login: .*'
    - '[root@.* .*]#'
    parameters:
      shutdown-message: "reboot: Restarting system"
- test:
    timeout:
      minutes: 50
    definitions:
    - repository:
        metadata:
          format: Lava-Test Test Definition 1.0
          name: smoke-tests-basic
          description: "Basic system test command for Linaro Ubuntu images. The test runs basic commands like pwd, uname, vmstat, ifconfig, lscpu, lsusb and lsb_release."
          maintainer:
          - hongyu.xu(a)hxt-semitech.com
          os:
          - centos
          scope:
          - functional
          devices:
          - amberwing_rep-022
        run:
          steps:
          - echo "test1a:" "pass"
      from: inline
      name: smoke-tests1
      path: inline/smoke-tests.yaml
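I also wonder whether the auto_login matching could be part of the problem: if login_prompt is treated as a regular expression, the leading '.*' might match too much of the output. Maybe a stricter pattern would help, as a guess:

    auto_login:
      login_prompt: 'login:'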
Best Regards
XuHongyu
LAVA requires a range of singleton daemon processes to schedule, organise,
publish and log the test job operations. 2017.12 will be retiring the
lava-server daemon and the /var/log/lava-server/lava-scheduler.log, moving
those operations into the lava-master daemon and the
/var/log/lava-server/lava-master.log. This is one of the steps in the
removal of V1.
This leads to a substantial change in how admins approach an instance
running 2017.12 or later.
When jobs don't appear to be running for some reason, the default action is
to check /var/log/lava-server/lava-scheduler.log - once the instance is
running 2017.12 or later, admins need to remember that it is correct for:
* /var/log/lava-server/lava-scheduler.log to be empty
* the status of the lava-server daemon to be active (exited)
* the lava-server service to not restart
At some point after 2017.12 has been running, admins can choose to archive
or purge /var/log/lava-server/lava-scheduler.log* and/or disable the
lava-server service using systemctl.
The scheduler from 2017.12 onwards will consist of:
* /var/log/lava-server/lava-master.log
* /var/log/lava-server/lava-logs.log
* the status of the lava-master daemon will be active (running)
* the lava-master service will restart when admins request it.
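For example, checking a 2017.12 instance could look like this (a sketch
using the service and log names above):

    systemctl status lava-master                # should be active (running)
    systemctl status lava-server                # active (exited) is correct
    tail /var/log/lava-server/lava-master.log   # scheduling now logs here
    systemctl disable lava-server               # optional, once the instance is stable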
--
Neil Williams
=============
neil.williams(a)linaro.org
http://www.linux.codehelp.co.uk/