[Linaro-validation] timeout vs multi-node sync

18 Aug 2014


Hi,
I'm trying to run some Android tests using the multi-node API. To
make sure both nodes of the multi-node job are in a known state, I'm
using lava-wait with test_started/test_finished signals to sync
between the nodes. The signals are prefixed with unique names so that
they identify host-target lava-test-shell pairs. This works well when
only one test is scheduled in a job: if anything goes wrong, the worst
case is that the tests time out. However, when more tests are
scheduled in a single job, the flow control sometimes fails. It
happens when lava-test-shell times out on one node. In this scenario
I'm doing:
1. host (wait for test_started from target)
2. target -> test_started -> host
Tests are executed...
3. target (wait for test_finished from host)
4. time out on host
So in this scenario the target waits for the test_finished signal and
eventually times out as well, because the signal never comes.
Meanwhile the host node has already started executing the next
lava-test-shell, where it waits for the test_started signal from the
target. So the nodes go out of sync and the job produces no results.
Is there any way to avoid such a situation?
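
To make the sequence above concrete, here is a rough Python sketch of
how the desync could cascade from one test to the next. It is a toy
model, not LAVA code: the function name simulate, the abstract "ticks",
the per-node timeouts and the test durations are all hypothetical, and
it assumes the target's lava-test-shell timeout is longer than the
host's, which may not match a real job.

```python
# Toy model of the host/target handshake described above. NOT LAVA code:
# simulate(), the tick values and the timeouts are hypothetical.

def simulate(tests, host_timeout, target_timeout, host_test_time):
    """Replay one lava-test-shell per test on each node.

    Assumptions (not from LAVA itself):
    - time is counted in abstract ticks;
    - each node's timeout starts when that node enters its shell;
    - the target sends <name>_test_started as soon as it enters;
    - the host sends <name>_test_finished after host_test_time[name] ticks.
    Returns a list of (test name, "ok" or "timeout").
    """
    host_clock = 0    # tick at which the host enters its next shell
    target_clock = 0  # tick at which the target enters its next shell
    log = []
    for name in tests:
        host_deadline = host_clock + host_timeout
        target_deadline = target_clock + target_timeout
        started_sent = target_clock

        # Tick at which the host sends test_finished, if it ever does.
        host_done = None
        if started_sent <= host_deadline:
            finish = max(host_clock, started_sent) + host_test_time[name]
            if finish <= host_deadline:
                host_done = finish

        if host_done is None:
            # Host timed out; target keeps waiting until its own deadline.
            host_clock = host_deadline
            target_clock = target_deadline
            log.append((name, "timeout"))
        elif host_done > target_deadline:
            # Target gave up before test_finished arrived.
            host_clock = host_done
            target_clock = target_deadline
            log.append((name, "timeout"))
        else:
            # Both nodes leave the shell together and stay in sync.
            host_clock = target_clock = host_done
            log.append((name, "ok"))
    return log


# Only t1 is slow, yet every later test is lost as well.
durations = {"t1": 20, "t2": 3, "t3": 3}
for name, outcome in simulate(["t1", "t2", "t3"], 10, 30, durations):
    print(name, outcome)
```

In this model a single slow test (t1) makes the host leave its shell
while the target is still waiting, and the target then reaches every
following shell too late for the host, so t2 and t3 time out as well
even though they would pass on their own.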
Best Regards,
milosz
