Hi Vishal,
Comments inline:
On 26 Nov 2013, at 14:01, Vishal Bhoj vishal.bhoj@linaro.org wrote:
Hi Dave,
On 26 November 2013 19:10, Dave Pigott dave.pigott@linaro.org wrote:
On 25 Nov 2013, at 10:26, Dave Pigott dave.pigott@linaro.org wrote:
OK - for the moment, I've off-lined 04. Longer term we need a better solution.
I think that the problem is that adb is buggy. It is clearly *supposed* to work with multiple devices simultaneously, otherwise why have it running as a daemon?
Course of action:
- We can patch around it by having an "android" device tag, which will be guaranteed to only have one instance per LAVA worker node.
- We should also investigate if there is an adb update that fixes the simultaneous connection issue.
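To make the first point concrete, here is a rough sketch of the check the "android" tag is meant to guarantee: no worker node ends up with more than one tagged device. The function name and the two input structures are made up for illustration and are not part of LAVA; the device and host names are the ones from this thread.

```python
from collections import defaultdict


def hosts_with_multiple_android_devices(device_hosts, android_tagged):
    """Return {host: [devices]} for hosts carrying more than one tagged device.

    device_hosts   -- hypothetical mapping of device name -> worker hostname
    android_tagged -- hypothetical set of device names carrying the "android" tag
    """
    by_host = defaultdict(list)
    for device in sorted(android_tagged):
        by_host[device_hosts[device]].append(device)
    # Only hosts that violate the "one Android device per worker" rule are reported.
    return {host: devs for host, devs in by_host.items() if len(devs) > 1}


device_hosts = {
    "rtsm_fvp_base-aemv8a_02": "fastmodels02.localdomain",
    "rtsm_fvp_base-aemv8a_04": "fastmodels02.localdomain",
}
```

With both _02 and _04 tagged, fastmodels02.localdomain would be reported; with only _02 tagged, the result is empty.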
http://developer.android.com/tools/help/adb.html
This seems to imply that for any version of Android after 4.2.2 (Jelly Bean) we should be using adb 1.0.31. We're using 1.0.29, the default that ships with Ubuntu 12.04 LTS. The first Ubuntu release to ship 1.0.31 was raring. The latest adb is available as part of the Android SDK and should work on any Ubuntu version: http://dl.google.com/android/android-sdk_r22.3-linux.tgz
Installing adb along with the SDK may be tedious with the way we set up lava-dispatcher. It involves running
So perhaps we should look at whether we can get 1.0.31 running on 12.04, and see if it fixes any of the problems we're seeing. The documentation certainly suggests that running multiple adb sessions is supported.
It should work out of the box so we should download the package and install it.
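Whichever way we install it, it would be worth confirming which adb a worker actually picks up. A minimal sketch, assuming `adb version` prints a line of the form "Android Debug Bridge version 1.0.31"; the helper names are made up for illustration, and 1.0.31 is the minimum discussed above.

```python
import re
import subprocess

MIN_ADB = (1, 0, 31)  # minimum version discussed in this thread


def parse_adb_version(output):
    """Extract (major, minor, patch) from the text printed by `adb version`."""
    match = re.search(r"version (\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError("no adb version found in: %r" % output)
    return tuple(int(part) for part in match.groups())


def installed_adb_version(adb="adb"):
    """Run `adb version` and parse the result; needs adb on the PATH."""
    output = subprocess.check_output([adb, "version"]).decode()
    return parse_adb_version(output)


def is_new_enough(version, minimum=MIN_ADB):
    """Tuples compare element by element, so (1, 0, 29) < (1, 0, 31)."""
    return version >= minimum
```

On a worker this would be called as `is_new_enough(installed_adb_version())`.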
We use salt to control the server configuration, so we'll need to update the salt repo (lava-lab) to support this.
Just a note - we also see issues like this on non-fastmodel devices.
We have tried to fix this issue before. The bug is difficult to reproduce; last time we found that the failure was due to memory corruption. Is it possible to "export ADB_TRACE=all" in the dispatcher setup, so that we get logs whenever we see this failure?
Yeah - good plan. Will add that to the dispatcher. I'll open a bug for it.
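For the bug report, this is roughly what I have in mind, as a sketch only: the helper names are hypothetical and this is not how lava-dispatcher currently invokes adb. It just makes sure ADB_TRACE=all is in the environment of every adb call.

```python
import os
import subprocess


def adb_trace_env(base_env=None):
    """Return a copy of the environment with ADB_TRACE=all added.

    The caller's mapping is copied, never modified in place.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["ADB_TRACE"] = "all"
    return env


def run_adb(args, adb="adb"):
    """Run one adb command with tracing enabled and return its exit status."""
    return subprocess.call([adb] + list(args), env=adb_trace_env())
```

For example, `run_adb(["devices"])`. To capture server-side traces as well, the adb server would presumably have to be started from an environment that has the variable set, so an already running server may need restarting.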
This merge request will at least help us recover from the failure easily instead of waiting till someone reports it: https://staging.review.linaro.org/#/c/508/
Will review tomorrow (I'm officially out this afternoon)
Dave
Dave
Thanks
Dave
On 25 Nov 2013, at 09:47, Vishal Bhoj vishal.bhoj@linaro.org wrote:
Hi,
The models are currently stable. Here are the jobs for the release. Most of them have completed: http://validation.linaro.org/scheduler/job/87699 https://validation.linaro.org/scheduler/job/87700 https://validation.linaro.org/scheduler/job/87701
Submitted one more with partial test from job 87700: http://validation.linaro.org/scheduler/job/88187
It's okay to have multiple models per machine, but only one model per machine should be running Android; that configuration has proven to be stable. If we run Android on more than one model per machine, we get adb errors. Hence Yongqin requested that _04 be taken offline.
Currently juice is the only project where Android is booted extensively on "rtsm_fvp_base-aemv8a" models. It is preferred to have only one such model per H/W instance to keep the setup stable for Android.
Regards, Vishal
On 25 November 2013 15:01, Jakub Pavelek jakub.pavelek@linaro.org wrote:
Hi guys,
we should have a setup with one model per HW instance (otherwise our tests will not run reliably). Please help us get that up and running again; it is release week.
Br,
--jakub
On 24 November 2013 14:52, Antonio Terceiro antonio.terceiro@linaro.org wrote:
On Sat, Nov 23, 2013 at 12:54:40AM +0800, Yongqin Liu wrote:
Hi, Antonio
Thanks for the help. http://validation.linaro.org/scheduler/job/87680/log_file is booted up with the latest build#219.
But one thing I noticed is that both rtsm_fvp_base-aemv8a_02 and rtsm_fvp_base-aemv8a_04 are on fastmodels02.localdomain. Could you help to disable rtsm_fvp_base-aemv8a_04? Running two instances on one node at the same time may cause the adb problem.
There are fastmodels on the same machine (which may also cause the same problem?), so I doubt that taking _04 offline will make any difference.
--
Antonio Terceiro
Software Engineer - Linaro
http://www.linaro.org
linaro-validation mailing list linaro-validation@lists.linaro.org http://lists.linaro.org/mailman/listinfo/linaro-validation