On 27.03.2012 13:13, Alexander Sack wrote:
On Tue, Mar 27, 2012 at 08:59:59AM +0200, Zygmunt Krynicki wrote:
On 27.03.2012 06:46, Paul Larson wrote:
I like this in general, but some things to think about:
- Raw serial log is *hugely* important. Without it, some kernel bugs
will simply not be visible reliably.
And we'll keep grabbing them as today. It is just the case that:
They will not contain test output, so kernel bumps will stand out
If we capture them unreliably, as we do today, then nothing breaks
- I'm starting to be convinced that we should have to depend on working
ethernet - this is already the case with android
This proposal does not depend on working ethernet on the test image
- Speaking of android... how does this affect testing on android? It
sounds as if it may be geared more towards ubuntu image testing
I need to check this, but I suspect we can implement an imperative agent (shell, or even Java if needed) that helps us run Android early initialization. AFAIK most of the work is already performed via adb.
- If we require ethernet, what happens when we want to do an ethernet
enablement test (this should be coming soon) that wants to bring up/down ethernet interfaces and connect/disconnect them?
As above, we don't require ethernet on the test image. If the test starts messing with ethernet then we simply lose the real-time log streaming. In either case the log is preserved and is recovered after reboot from the master image.
I am a bit concerned that we add more architectural complexity to LAVA by moving to ethernet and then implementing fallbacks and special cases for the times when ethernet doesn't work...
LAVA is unreliable today; this will fix it.
The base case is nothing. Everything is local. As a special exception, if Ethernet works then you also get _live_ log streaming. Normally the complete set of data would be recovered after the test is done, from the master image, and sent, reliably, to the system over Ethernet.
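To make the "local first, streaming as a bonus" idea concrete, here is a minimal Python sketch. It is not LAVA code; the class name `LocalFirstLog`, the `recover` helper and the `stream_addr` parameter are made up for illustration. Every line is written to local storage first, and the network copy is strictly best effort.

```python
import os
import socket


class LocalFirstLog:
    """Hypothetical logger: local file is authoritative, network is optional."""

    def __init__(self, path, stream_addr=None):
        self._file = open(path, "a", encoding="utf-8")
        self._sock = None
        if stream_addr is not None:
            try:
                # Live streaming is only attempted if Ethernet happens to work.
                self._sock = socket.create_connection(stream_addr, timeout=1.0)
            except OSError:
                self._sock = None  # no network: carry on with local-only logging

    def write(self, line):
        # Always persist locally first, so nothing depends on the network.
        self._file.write(line + "\n")
        self._file.flush()
        os.fsync(self._file.fileno())
        if self._sock is not None:
            try:
                self._sock.sendall((line + "\n").encode("utf-8"))
            except OSError:
                # The test may have taken the interface down: drop streaming only.
                self._sock.close()
                self._sock = None

    def close(self):
        self._file.close()
        if self._sock is not None:
            self._sock.close()


def recover(path):
    """What the master image would do after reboot: read the complete log."""
    with open(path, encoding="utf-8") as f:
        return f.read().splitlines()
```

If the connection never comes up, or dies halfway through the test, `recover()` still returns every line that was written. Only the live view is lost.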
If you only had serial, what can be done? What problems need to be solved?
Serial is unreliable for us today. We see consistent data loss, and we lose a lot of jobs as a consequence. This is because almost any serial corruption is fatal in our current environment.
Still, this is irrelevant. Even without serial the improvement is clear. The test execution is less interactive with my proposal (only the bootloader has to be scripted). Currently _everything_ is scripted on the root-logged-in serial line. That adds a lot of flakiness and a single missed byte can confuse the system.
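As a toy illustration of why one dropped byte is enough to derail a scripted serial session, consider prompt matching in Python. The prompt string and the function name are hypothetical, and this is not how the dispatcher is actually written; it only shows the failure mode.

```python
# Hypothetical shell prompt that a serial-driven script would wait for.
PROMPT = "root@testimage:~# "


def saw_prompt(serial_output, prompt=PROMPT):
    """Return True if the expected prompt appears in the captured output."""
    return prompt in serial_output


clean = "Booting...\n" + PROMPT
# Simulate the serial line dropping a single byte inside the prompt.
corrupted = clean.replace("testimage", "testmage", 1)

print(saw_prompt(clean))
print(saw_prompt(corrupted))
```

With the clean capture the script proceeds; with the corrupted one it keeps waiting for a prompt that already went by, and the job eventually times out. The less we script over serial, the fewer places this can happen.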
Are there other options?
I think this one is very good. The alleged complexity is really not there if you look at how complex our current setup is.
Thanks ZK