On 8 February 2018 at 08:26, Liao, Guoqi <guoqi.liao@hxt-semitech.com> wrote:
Hi Guys:
When I am doing multi-node testing, I create a job definition like the one below. For example:
Sub-job 1 has finished booting and testing, but sub-job 2 is still booting. Sub-job 1 will then
remove the template file like <lava_dispatcher>/tmp/overlay****, so sub-job 2 can NOT download
the overlay**** file and fails in the end. My question is how to do sync between multi-node in the job
Synchronisation is done using the MultiNode API - the test shell simply calls lava-sync but for that to work, there needs to be a functional test shell in the first place.
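As a minimal sketch of what calling lava-sync from the test shell can look like, a Lava-Test test definition could contain something like the following. The definition name, the description and the message name setup-done are assumptions made up for illustration; they are not taken from the job in this thread.

```yaml
# Hypothetical test definition: every role runs the same steps.
metadata:
  format: Lava-Test Test Definition 1.0
  name: multinode-sync-sketch          # assumed name, illustration only
  description: "Wait for all devices in the group before continuing"
run:
  steps:
    # Blocks until every device in the MultiNode group has sent
    # the same message ("setup-done" is an arbitrary label).
    - lava-sync setup-done
    # Only reached once all devices have synchronised.
    - lava-test-case after-sync --result pass
```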
However, this is a different problem, to do with multiple usage of transfer_overlay.
What version of LAVA are you running? We have fixes for this in the upcoming release.
https://projects.linaro.org/browse/LAVA-1202
However, I don't think we've explicitly tested with MultiNode using transfer_overlay.
Definition?
My job definition:
protocols:
  lava-multinode:
    roles:
      foo:
This looks like a typical client:server MultiNode test job - it really does help if you describe the roles that way rather than using slang.
        tags:
        - board1
        device_type: **********
I'm assuming an internal device-type but it is worth exploring whether the device integration for this type can support adding the overlay to the rootfs in advance.
transfer_overlay is not a solution to using the same rootfs for multiple test jobs - there are still issues of persistence which will affect the utilities executed by the test shell. It would be much better to deploy a fresh rootfs each time and then let LAVA add the overlay to that rootfs, avoiding the need for transfer_overlay support. The rootfs can have whatever dependencies are required by the base system pre-installed, but a fresh rootfs each time means that the configuration is always the same at the start of each test job.
Keep things simple and only change one element at a time. Not deploying the rootfs each time means that the rootfs *can* change arbitrarily between test jobs, so you are not only changing the kernel in each test job, you are also inheriting unknown changes in the rootfs from the previous test job. The rootfs can be exactly the same tarball in every test job; that then means that all your results are reproducible because only the kernel is being changed in each test job. The small amount of time required to deploy a clean rootfs for each test job is tiny in comparison to the engineering time lost by trying to debug issues caused by a persistent rootfs.
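A rough sketch of a deploy action which provides a fresh rootfs for every test job, so that LAVA can add the overlay itself, might look like this. It assumes that the device integration supports an NFS rootfs with a TFTP deployment, which has not been checked for this device-type, and both URLs are placeholders rather than real locations.

```yaml
# Hypothetical deploy action, assuming NFS rootfs support on the device-type.
- deploy:
    role:
    - foo
    - bar
    to: tftp
    kernel:
      url: http://example.com/zImage            # placeholder URL
      type: zimage
    nfsrootfs:
      url: http://example.com/rootfs.tar.xz     # placeholder URL, same tarball every job
      compression: xz
    os: centos
    timeout:
      minutes: 80
```

With a deployment along these lines, the boot action would not need a transfer_overlay block because the overlay is already inside the rootfs when the device boots.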
        context:
          grub_method: centos
          grub_installed_device: (hd1,gpt1)
        count: 1
      bar:
        tags:
        - board2
        device_type: **********
        context:
          grub_method: centos
          grub_installed_device: (hd2,gpt1)
        count: 1
    timeout:
      minutes: 6
job_name: centos openjdk test
timeouts:
  job:
    minutes: 1500
  action:
    minutes: 50
  connection:
    minutes: 30
priority: medium
visibility: public
actions:
- deploy:
    role:
    - foo
    - bar
    kernel:
      url: http://********
      type: zimage
    os: centos
    timeout:
      minutes: 80
    to: tftp
- boot:
    timeout:
      minutes: 40
    role:
    - bar
    method: grub
    commands: centos_installed
    auto_login:
      login_prompt: 'login:'
      username: root
      password_prompt: 'Password:'
      password: root
    prompts:
    - 'root@localhost ~'
    transfer_overlay:
      download_command: rm -f /root/overlay* ; ifconfig ; wget -S --progress=dot:giga
      unpack_command: tar -C / -xaf
    parameters:
      shutdown-message: "reboot: Restarting system"
- boot:
    timeout:
      minutes: 40
    role:
    - foo
    method: grub
    commands: centos_installed
    auto_login:
      login_prompt: 'login:'
      username: root
      password_prompt: 'Password:'
      password: root
    prompts:
    - 'root@localhost ~'
    transfer_overlay:
      download_command: rm -f /root/overlay* ; ifconfig ; wget -S --progress=dot:giga
      unpack_command: tar -C / -xaf
    parameters:
      shutdown-message: "reboot: Restarting system"
- test:
    role:
    - foo
    - bar
    timeout:
      minutes: 50
    definitions:
    - repository: ssh://**********/test-definitions
      from: git
      branch: **********
      path: automated/linux/openjdk/openjdk-smoke.yaml
Any configuration, package installation or setup done by that test definition will persist into the next test job, which is known to cause reliability issues, difficulty in triaging failed results and other complications.
      name: openjdk-smoke
Thanks
B.R.
Guoqi
Lava-users mailing list Lava-users@lists.linaro.org https://lists.linaro.org/mailman/listinfo/lava-users