On 29 January 2018 at 04:23, Anibal Limon <anibal.limon@linaro.org> wrote:

> Hi,
>
> I created a Piglit test definition [1]. Before trying it out on the
> V.L.O., I used test-runner to make a local run (db410c) without
> lava-dispatcher; the run took ~2 hours 5 minutes [2] (without
> download/flash). After that I executed it on the V.L.O., where it
> consumed 8 hours and then finished with a timeout; the test didn't
> reach the end and only executed 14486 of the 39249 tests.

By V.L.O. you mean LKFT. Please be careful with your terminology: V.L.O. is an entirely separate LAVA installation.
> So I downloaded the definition and executed the lava-run command on my
> desktop, and the full test execution took ~2 hours 38 minutes, which
> makes sense because it needs to download and flash the boot and rootfs
> images [4].
>
> I introduced a time call to measure the Piglit execution time in both
> test-runner and lava-dispatcher, and the execution times are the same,
> ~96 minutes in both runs; search for 'Thank you for running Piglit' in
> the logs [2][5].
>
> I noticed that when I ran lava-run on my desktop, CPU usage increased
> to 100% when it reached the step of actually running the test [6]. I
> did some debugging and discovered that the way results and
> notifications are read from the serial console is via pexpect and
> certain regexes; I attached gdb to the python process and found that
> the regexes execute over the whole serial buffer at very short
> intervals [7].

It's not the entire buffer (that would include the bootloader operations), it is the buffer since the last pattern match (the start of the last test run). This is standard pexpect behaviour. In most cases it doesn't cause any issues, but one fix is to match more often.
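To see why matching more often helps, here is a small simulation (plain Python with `re`, not actual pexpect internals) of the rescanning behaviour described above: each new chunk of console output triggers a regex search over everything accumulated since the last match, so a pattern that only matches rarely forces repeated scans of an ever-growing buffer:

```python
import re

def chars_scanned(chunks, pattern, match_often):
    """Return the total number of characters examined by repeated
    regex searches over an accumulating buffer.

    match_often=True resets the buffer after each match (what frequent
    per-test-case patterns achieve); match_often=False never resets,
    so every search rescans the whole accumulated buffer.
    """
    buf = ""
    total = 0
    for chunk in chunks:
        buf += chunk
        total += len(buf)          # cost of one search over the buffer
        if match_often and re.search(pattern, buf):
            buf = ""               # matched text is discarded
    return total

# 10,000 lines of synthetic test output
lines = ["some piglit result line %05d\n" % i for i in range(10_000)]
rare = chars_scanned(lines, r"result line", match_often=False)
frequent = chars_scanned(lines, r"result line", match_often=True)
print(rare // frequent)  # rescanning is thousands of times more work
```

The rarely-matching case scales quadratically with the amount of output, which is consistent with a job producing tens of thousands of result lines pinning a CPU at 100%.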
One solution is to "chunk" the operations so that results are reported in batches: redirect the output to local storage (emitting nothing to the console), then process the stored results and report them directly, one run per batch.
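As a rough sketch of that batched approach (the helper names and report format here are illustrative, not LAVA's API): results land in local storage first, then are read back and summarised per batch instead of streaming every individual result over the serial console.

```python
def summarize(batch):
    """Produce one report line for a batch of (name, status) results."""
    passed = sum(1 for _, status in batch if status == "pass")
    return "batch of %d: %d pass, %d other" % (
        len(batch), passed, len(batch) - passed)

def report_in_batches(results, batch_size=500):
    """Split the stored results into batches, one summary per batch."""
    summaries = []
    for i in range(0, len(results), batch_size):
        summaries.append(summarize(results[i:i + batch_size]))
    return summaries

# e.g. 1200 synthetic results produce 3 summary lines instead of 1200
results = [("test-%d" % i, "pass" if i % 7 else "fail")
           for i in range(1200)]
for line in report_in_batches(results):
    print(line)
```

The point is that the console (and therefore the pattern matcher) only ever sees one short line per batch, keeping the unmatched buffer small.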
It would also help not to use the verbose flag, or to redirect all of the Piglit output to file(s), to avoid flooding the test log. (We do something similar with the LAVA unit tests: redirect to a script which filters out the noise and reports just the useful content.)

> That could be the reason for the delay in the test execution, since
> the Piglit command takes the same time to run with test-runner and
> lava-dispatcher.

pexpect happens inside lava-run, so I'm not sure you are attributing the delay correctly. You describe the lava-run output as taking not much longer than the local run.
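An illustrative filter script in the spirit of "redirect to a script which filters out the noise" might look like the following; note that the result-line pattern and the sample lines are assumptions made for the example, not Piglit's actual output format:

```python
import re

# Assumed shape of a per-test result line; adjust to the real output.
RESULT_LINE = re.compile(r"^(pass|fail|skip|crash)\b")

def filter_log(lines):
    """Keep only lines that look like per-test results, dropping
    progress chatter and other verbose output."""
    return [line for line in lines if RESULT_LINE.match(line)]

noisy = [
    "running :: spec/arb_foo/bar\n",   # progress chatter, dropped
    "pass :: spec/arb_foo/bar\n",      # kept
    "loading test profile...\n",       # dropped
    "fail :: spec/arb_baz/qux\n",      # kept
]
print("".join(filter_log(noisy)), end="")
```

Piping the test's stdout through a filter like this keeps the test log to the lines LAVA actually needs to match.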
Test job 101274 on LKFT didn't get to the point of reporting any results from the test shell:

Start: 17:22
LXC ready, fastboot downloaded and unpacked: 17:25
Overlay created and device power reset: 17:27
First fastboot operation (rootfs): 17:27 to 17:55
  OKAY [846.845s]
  finished. total time: 1676.151s
Start of the LAVA test shell: 17:56
First output from the test: 17:58
First output which actually matched any patterns expected by LAVA: 01:22 the following day

So, yes, that was a large buffer which needed to be scanned. Breaking up the output into test sets and test cases would make things a lot easier and let the job run more smoothly.

> Best regards,
> Anibal
_______________________________________________
Lava-users mailing list
Lava-users@lists.linaro.org
https://lists.linaro.org/mailman/listinfo/lava-users