Hello list,
Regarding the integration of the LAVA [1] architecture with Android devices, we would like to reuse the existing infrastructure and framework design. Abrek [2] is a great test-suite framework for running test cases and benchmarks. However, because of the restrictions of the unusual Android runtime, we are considering an agent-based remote validation invocation mechanism as an extension to LAVA. A proof-of-concept implementation is attached.
** Why can't we execute LAVA/Abrek directly on Android devices?
LAVA/Abrek is written in Python, which means a solid Python runtime must be available on Android. CPython is mature and well designed, but it is not well tested on Android. Android has its own libc implementation, bionic, a minimal and specialized libc originally derived from NetBSD libc. However, bionic supports only a limited set of POSIX C APIs, and at this early stage it is hardly feasible to maintain Linaro-specific bionic modifications just to satisfy CPython. Google engineers change bionic rapidly, and we have no insight into their plans.
Therefore, we prefer not to modify the Android runtime. That is, we do not execute LAVA/Abrek directly in the Android environment.
** Yet another agent?
In fact, Android already provides an elegant approach for accessing the target environment: adb (Android Debug Bridge) [3], a versatile tool that lets users manage the state of an Android-powered device. adb itself is a client-server program that includes three components:
(1) A client, which runs on your development machine. You can invoke a client from a shell by issuing an adb command.
(2) A server, which runs as a background process on your development machine. The server manages communication between the client and the adb daemon running on the device.
(3) A daemon, which runs as a background process on each device instance.
The adb connection can be established over USB (the device needs the Android USB gadget driver enabled) or TCP/IP. adb is solid and powerful. For example, to test an Android UI, you can run this command on the host side:
$ adb shell monkey -v -p your.package.name 500
The above can be illustrated as:
Host commands --> adb protocol (USB or TCP/IP) --> Target receives the command --> execute "monkey"
Monkey is a program that runs on your emulator or device and generates pseudo-random streams of user events such as clicks, touches, or gestures, as well as a number of system-level events.
In this example, your Android application is launched, and 500 pseudo-random events are sent to it. We call this "agent-based remote invocation", and it is built in.
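For readers who want to drive the monkey example from Python on the host, here is a minimal sketch. It only wraps the adb command line quoted in this mail; the function names (monkey_command, run_monkey) and the injectable runner parameter are illustrative assumptions and not part of abrek or LAVA.

```python
import subprocess


def monkey_command(package, event_count, serial=None):
    """Build the adb argv that runs monkey against one package."""
    argv = ["adb"]
    if serial is not None:
        # "adb -s <serial>" selects one device when several are attached.
        argv += ["-s", serial]
    argv += ["shell", "monkey", "-v", "-p", package, str(event_count)]
    return argv


def run_monkey(package, event_count, serial=None, runner=subprocess.run):
    """Run monkey on the target and return whatever it printed.

    The runner argument is a hypothetical hook so that the function
    can be exercised without a device attached.
    """
    result = runner(
        monkey_command(package, event_count, serial),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

Calling run_monkey("your.package.name", 500) would issue the same command as the shell example above and hand the monkey log back to the caller, which is the piece a host-side test definition would then parse.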
** So, what's agent-based remote validation invocation for LAVA?
There are three key items: (a) Agent (b) Remote validation invocation (c) LAVA
We keep in mind that this should have no technical impact on the LAVA architecture; the "Agent" is just a helper. Originally, the client-server communication looks like the following:
LAVA server <----> LAVA client --> Abrek test suite
Since a Python runtime is hard to support on the target, the proposed model would be:
LAVA server <--> LAVA client --> Abrek test suite || adb extension (host) <--> adb (target) -> execute command
To integrate test items and benchmarks for Android, this proposal runs abrek on the host side. By using Android's standard tool, adb, to communicate with adbd (the adb daemon) on the target, test-case commands can be issued and their output returned. Files can also be pushed to and pulled from the target with adb. Sometimes the results are stored in a file on the target, and we can certainly cope with that case as well. Thus, running abrek on the host side along with the "Agent" should be a workable approach.
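The host-side "adb extension" described here could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name AdbAgent, its method names, and the runner hook are hypothetical, and the only adb sub-commands used are shell, push and pull.

```python
import subprocess


class AdbAgent:
    """Hypothetical host-side helper that forwards work to adbd on one target."""

    def __init__(self, serial=None, runner=subprocess.run):
        # Optional "-s <serial>" picks a specific device.
        self._base = ["adb"] + (["-s", serial] if serial else [])
        self._runner = runner

    def _adb(self, *args):
        result = self._runner(
            self._base + list(args),
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout

    def shell(self, command):
        """Run a command on the target and return its output."""
        return self._adb("shell", command)

    def push(self, local_path, remote_path):
        """Copy a file from the host to the target."""
        return self._adb("push", local_path, remote_path)

    def pull(self, remote_path, local_path):
        """Copy a file from the target back to the host."""
        return self._adb("pull", remote_path, local_path)

    def run_and_collect(self, command, remote_result, local_result):
        """Run a test whose results land in a file on the target, then fetch that file."""
        output = self.shell(command)
        self.pull(remote_result, local_result)
        return output
```

With something like this, a test definition running on the host would call run_and_collect() and then parse the pulled file exactly as it would parse a local log.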
** Show me the use case
The attached patch is a trivial proof-of-concept implementation by Jeremy Chang that adds a 'monkey' test definition file for abrek. Once TCP/IP or USB is ready, the procedure for Android monkey testing is as follows:
(1) abrek run monkey
(2) abrek dashboard put /anonymous/ monkey1297752359.0
In addition, if we wish to execute a native application on the Android device, abrek could ask adb to send the executable and related data to the target first, and then execute it as expected. The commands look like:
$ adb push <my-executable-file> /system/bin
$ adb push <my-data-file> /data
$ adb shell /system/bin/<my-executable-file>
The above instructions could be refined into "adb extension" as the part of LAVA client framework.
That's all. It could be straightforward and transparent.
Any suggestion is appreciated. Thank you in advance.
Sincerely, Jim Huang (jserv) http://0xlab.org/
[1] https://wiki.linaro.org/Platform/Validation/LAVA/Architecture
[2] https://wiki.linaro.org/Platform/Validation/AbrekTestsuites
[3] http://developer.android.com/guide/developing/tools/adb.html
On 15.02.2011 at 21:01, Jim Huang wrote:
Hi Jim, great work!
** Why can't we execute LAVA/Abrek directly on Android devices?
I agree that running abrek directly is not the right solution for Android. I think there is a class of use cases that also falls into this category:
* testing early silicon
* testing very primitive systems
* testing other foreign systems
In all of them Python is either unavailable or undesirable to run.
** So, what's agent-based remote validation invocation for LAVA?
There are three key items: (a) Agent (b) Remote validation invocation (c) LAVA
We keep in mind that we make no technical impact to LAVA architecture, and the "Agent" is just the "helper". Originally, the client-server communication looks like the following:
LAVA server<----> LAVA client --> Abrek test suite
Currently LAVA is just taking shape. Things like server and client are still not very well defined. You don't have to be constrained by this. The only thing that is somewhat well defined is abrek that was produced in the previous cycle. Abrek was designed to run _on_ the device. For use cases where Abrek is just interacting with the test device we may want to extend it sensibly to clearly separate those cases where necessary while retaining as much common code and user interface as possible.
There are a few things to consider here:
Device context: Currently abrek has some logic to probe the running system for context information. Context is loosely defined as the collection of relevant software and hardware information that might be interesting to analyze or that can be used to connect distinct test results in post-process analysis. If we are just interacting with a remote device we need to determine this information in some other way _and_ ensure we're not adding any dummy information from the "host" system.
Remote device registration: A single host may talk to a possibly large number of such devices. At the very least we should plan how we intend to manage the process of defining/adding/removing a device and how to allow remote tests to be invoked on registered devices, if possible (if the device supports this test).
Test classes: So far abrek tests could run on any Linux box (where "any" was poorly defined as "any Ubuntu-like system"). This is no longer the case so we should be able to classify tests (and devices) somehow.
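The first two considerations above (device context and device registration) can be sketched for the Android case using two adb commands, "adb devices" and "adb shell getprop". This is a rough illustration, not abrek code: the function names are hypothetical, and the choice of which properties count as "context" is an assumption.

```python
import re
import subprocess

# getprop prints lines of the form: [ro.product.model]: [sdk]
_PROP_LINE = re.compile(r"^\[(?P<key>[^\]]+)\]: \[(?P<value>.*)\]\s*$")

# Assumed subset of properties worth recording as context.
_CONTEXT_KEYS = (
    "ro.build.version.release",
    "ro.build.fingerprint",
    "ro.product.model",
    "ro.product.cpu.abi",
)


def parse_adb_devices(output):
    """Map serial -> state from the text printed by `adb devices`."""
    devices = {}
    for line in output.splitlines():
        if "\t" not in line:
            continue  # header line, blank line, or daemon start-up notice
        serial, state = line.split("\t", 1)
        devices[serial.strip()] = state.strip()
    return devices


def parse_getprop(output):
    """Turn `getprop` output into a dict of property name -> value."""
    props = {}
    for line in output.splitlines():
        match = _PROP_LINE.match(line)
        if match:
            props[match.group("key")] = match.group("value")
    return props


def device_context(serial, runner=subprocess.run):
    """Collect context from the target itself, never from the host."""
    result = runner(
        ["adb", "-s", serial, "shell", "getprop"],
        capture_output=True,
        text=True,
        check=True,
    )
    props = parse_getprop(result.stdout)
    return {key: props[key] for key in _CONTEXT_KEYS if key in props}
```

Only devices reported in the "device" state would be candidates for running tests; anything the host probes about itself stays out of the returned context.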
IMHO we could define a new tool (or just a new sub-command class for abrek), say abrek-remote that will be used instead of the normal abrek call.
Another crazy option would be to expose LAVA Job Dispatcher directly and allow people to run jobs. In this case one job would use abrek and some other tools to invoke tests, process results and send them to the dashboard, while another job (one for android) would not run abrek at all; instead it would call some other helper, while still reusing identical components for the "process results and send to result storage" phases. This is still in flux but has some advantages:
0) Jobs are simple text files that can be stored and shared with others
1) Jobs can encapsulate device information like which android device to connect to and how.
2) Jobs can still "call" to other parts of the LAVA stack such as result submission
3) Jobs can be extended locally (by LAVA plugins) so that anyone can develop specialized use cases for their very specific needs without altering the stack or having to write something completely custom.
Another upside is that such job definitions (and any required LAVA plugins) would just integrate with the rest of LAVA (farm backend, frontend, etc). You could work on a job description on your workstation and once it's finished you could add it to a job scheduler for automatic processing.
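To make the "jobs are simple text files" idea concrete, here is a purely hypothetical sketch. The JSON layout, the command names (run_test, submit_results), the device address and the Dispatcher class are all invented for illustration; nothing here reflects an agreed LAVA job format.

```python
import json

# Hypothetical job description; every field name is an assumption.
JOB_TEXT = """
{
  "device": {"type": "android", "transport": "tcp", "address": "192.168.1.20:5555"},
  "actions": [
    {"command": "run_test", "parameters": {"test": "monkey"}},
    {"command": "submit_results", "parameters": {"stream": "/anonymous/"}}
  ]
}
"""


class Dispatcher:
    """Maps command names found in a job file to handler callables."""

    def __init__(self):
        self._handlers = {}

    def register(self, command, handler):
        # A local plugin would call this to add its own command.
        self._handlers[command] = handler

    def run(self, job_text):
        job = json.loads(job_text)
        log = []
        for action in job["actions"]:
            # An unknown command raises KeyError here.
            handler = self._handlers[action["command"]]
            log.append(handler(job["device"], **action.get("parameters", {})))
        return log
```

An android job and a "classic" abrek job would differ only in which handlers their actions name, while the result submission step stays shared.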
** Show me the use case
The attached patch is just trivial proof-of concept implementation done by Jeremy Chang that adds a 'monkey' test definition file for abrek. Once TCP/IP or USB is ready, for Android's monkey testing, the procedure is like as following: (1) abrek run monkey (2) abrek dashboard put /anonymous/ monkey1297752359.0
Could you please forgive my android ignorance and tell me how I can run this test? Please include any hardware/software I should have. Can I run this on a beagle board with some android installation? Do I need a real android phone? This is just so that I can participate in the discussion and not make clueless and pointless arguments later.
Any suggestion is appreciated. Thank you in advance.
I hope we can work on making this solid.
2011/2/16 Zygmunt Krynicki zygmunt.krynicki@linaro.org:
On 15.02.2011 at 21:01, Jim Huang wrote:
Hi Jim, great work!
hi Zygmunt,
Thanks. It is my pleasure to work with Linaro validation team.
** Why can't we execute LAVA/Abrek directly on Android devices?
I agree that direct abrek is not the right solution for Android. I think there is a class of use cases that also falls into this category: * testing early silicon * testing very primitive systems * testing other foreign systems They all have either no python or running python is not desirable.
Exactly. Thanks for your explanation.
[...]
Currently LAVA is just taking shape. Things like server and client are still not very well defined. You don't have to be constrained by this. The only thing that is somewhat well defined is abrek that was produced in the previous cycle. Abrek was designed to run _on_ the device. For use cases where Abrek is just interacting with the test device we may want to extend it sensibly to clearly separate those cases where necessary while retaining as much common code and user interface as possible.
There are a few things to consider here:
Device context: Currently abrek has some logic to probe the running system for context information. Context is loosely defined as the collection of relevant software and hardware information that might be interesting to analyze or that can be used to connect distinct test results in post-process analysis. If we are just interacting with a remote device we need to determine this information in some other way _and_ ensure we're not adding any dummy information from the "host" system.
Remote device registration: A single host may talk to a possibly large number of such devices. At the very least we should plan how we intend to manage the process of defining/adding/removing a device and how to allow remote tests to be invoked on registered devices, if possible (if the device supports this test).
Test classes: So far abrek tests could run on any Linux box (where "any" was poorly defined as "any Ubuntu-like system"). This is no longer the case so we should be able to classify tests (and devices) somehow.
That sounds reasonable. Do you think the above changes can be made incrementally on top of the current LAVA implementation?
In Android, we can even reuse the infrastructure of ADB and DDMS [1] to provide a remote device registration service.
IMHO we could define a new tool (or just a new sub-command class for abrek), say abrek-remote that will be used instead of the normal abrek call.
Yes, I like the idea.
Another crazy option would be to expose LAVA Job Dispatcher directly and allow people to run jobs. In this case one job would use abrek and some other tools to invoke tests, process results and send them to the dashboard while other job (one for android) would not run abrek at all, instead it would call some other helper, while still reusing identical components for "process results and send to result storage" phases. This is still in flux
[...]
Another upside is that such job definitions (and any required LAVA plugins) would just integrate with the rest of LAVA (farm backend, frontend, etc). You could work on a job description on your workstation and once it's finished you could add it to a job scheduler for automatic processing.
Agree.
** Show me the use case
The attached patch is just trivial proof-of concept implementation done by Jeremy Chang that adds a 'monkey' test definition file for abrek. Once TCP/IP or USB is ready, for Android's monkey testing, the procedure is like as following: (1) abrek run monkey (2) abrek dashboard put /anonymous/ monkey1297752359.0
Could you please forgive my android ignorance and tell me how I can run this test? Please include any hardware/software I should have. Can I run this on a beagle board with some android installation? Do I need a real android phone? This is just so that I can participate in the discussion and not make clueless and pointless arguments later.
Jeremy, could you prepare the testing environment using Android emulator? We could write down the instructions in wiki for reference.
Any suggestion is appreciated. Thank you in advance.
I hope we can work on making this solid.
Sure!
Cheers, Jim Huang (jserv) http://0xlab.org/
[1] DDMS: http://developer.android.com/guide/developing/tools/ddms.html
[omit..]
** Show me the use case
The attached patch is just trivial proof-of concept implementation done by Jeremy Chang that adds a 'monkey' test definition file for abrek. Once TCP/IP or USB is ready, for Android's monkey testing, the procedure is like as following: (1) abrek run monkey (2) abrek dashboard put /anonymous/ monkey1297752359.0
Could you please forgive my android ignorance and tell me how I can run this test? Please include any hardware/software I should have. Can I run this on a beagle board with some android installation? Do I need a real android phone? This is just so that I can participate in the discussion and not make clueless and pointless arguments later.
Jeremy, could you prepare the testing environment using Android emulator? We could write down the instructions in wiki for reference.
The emulator is a good start. I created a reference page introducing an Android emulator-based testing environment: https://wiki.linaro.org/Platform/Android/EmulatorTestingEnvironment
Cheers, Jeremy Chang
Any suggestion is appreciated. Thank you in advance.
I hope we can work on making this solid.
Sure!
Cheers, Jim Huang (jserv) http://0xlab.org/
[1] DDMS: http://developer.android.com/guide/developing/tools/ddms.html
On 16.02.2011 at 10:28, Jeremy Chang wrote:
Jeremy, could you prepare the testing environment using Android emulator? We could write down the instructions in wiki for reference.
The emulator is a good start. I created a reference page introducing an Android emulator-based testing environment: https://wiki.linaro.org/Platform/Android/EmulatorTestingEnvironment
I'm following those instructions now. Thanks for setting that up! If I find any issues I'll let you know.
Thanks ZK
Hi all!
This is a very good analysis, and I would just add that Monkey is only one simple example where running tests with abrek is not an option. Another example from the Android world is CTS: http://source.android.com/compatibility/cts-intro.html. Certainly, not all test suites and frameworks are purely DUT-based; some of them have dependencies and functions on the PC side as well.
Zygmunt has already identified key points for Job Dispatcher to support this, but there is one thing I would like to comment on:
Another crazy option would be to expose LAVA Job Dispatcher directly and allow people to run jobs. In this case one job would use abrek and some other tools to invoke tests, process results and send them to the dashboard while other job (one for android) would not run abrek at all, instead it would call some other helper, while still reusing identical components for "process results and send to result storage" phases. This is still in flux but has some advantages: 0) Jobs are simple text files that can be stored and shared with others 1) Jobs can encapsulate device information like which android device to connect to and how. 2) Jobs can still "call" to other parts of the LAVA stack such as result submission 3) Jobs can be extended locally (by LAVA plugins) so that anyone can develop specialized use cases for their very specific needs without altering the stack or having to write something completely custom.
I think exposing Job Dispatcher directly would not be a good idea for the validation farm, where test jobs are queued for execution via the Scheduler (see LAVA architecture). Bypassing the job queue at the Dispatcher level should only be allowed in exceptional cases, e.g. canceling jobs for a server/board update or similar. There might be scenarios where unrestricted direct control is desirable, but that should only be allowed in local development environments.
It would be good to have this specified or discussed somewhere now, maybe in the Dispatcher blueprint, or maybe we need an additional blueprint or wiki spec for it?
Thanks! Mirsad
2011/2/16 Mirsad Vojnikovic mirsad.vojnikovic@linaro.org:
Hi all!
hi Mirsad,
This is very good analysis you have done, and I would just add that Monkey is only one simpler example where the test execution using abrek is not an option.
True.
Another example from Android world is CTS: http://source.android.com/compatibility/cts-intro.html. Most certainly, all test suites and frameworks are not only DUT-based. Some of them have more or less dependencies and functions on the PC side as well.
Yes, in fact, Android CTS relies on the PC side, too. The device will try to reset adb (through the USB gadget) several times in order to probe compatibility with the host (Windows/Linux/MacOSX).
Zygmunt has already identified key points for Job Dispatcher to support this, but one thing I would like to comment:
Another crazy option would be to expose LAVA Job Dispatcher directly and allow people to run jobs. In this case one job would use abrek and some other tools to invoke tests, process results and send them to the dashboard while other job (one for android) would not run abrek at all, instead it would call some other helper, while still reusing identical components for "process results and send to result storage" phases. This is still in flux but has some advantages: 0) Jobs are simple text files that can be stored and shared with others 1) Jobs can encapsulate device information like which android device to connect to and how. 2) Jobs can still "call" to other parts of the LAVA stack such as result submission 3) Jobs can be extended locally (by LAVA plugins) so that anyone can develop specialized use cases for their very specific needs without altering the stack or having to write something completely custom.
I think exposing Job Dispatcher directly would not be a good idea for validation farm, where test jobs are queued for execution via Scheduler (see LAVA architecture). Bypassing the job queue on the Dispatcher level should only be allowed in exceptional cases, i.e. canceling jobs for server/board update or similar. There might be scenarios where the unrestricted direct control is desirable, but that should be only allowed for local development environments.
Thanks for pointing this out.
It would be good to have this specified/discussed somewhere already now, maybe in the Dispatcher blueprint, or maybe we need additional blueprint or wiki spec for it?
Yes, it would be great.
Cheers, Jim Huang (jserv) http://0xlab.org/
On 16.02.2011 at 11:22, Mirsad Vojnikovic wrote:
Zygmunt has already identified key points for Job Dispatcher to support this, but one thing I would like to comment:
Another crazy option would be to expose LAVA Job Dispatcher directly and allow people to run jobs. In this case one job would use abrek and some other tools to invoke tests, process results and send them to the dashboard while other job (one for android) would not run abrek at all, instead it would call some other helper, while still reusing identical components for "process results and send to result storage" phases. This is still in flux but has some advantages: 0) Jobs are simple text files that can be stored and shared with others 1) Jobs can encapsulate device information like which android device to connect to and how. 2) Jobs can still "call" to other parts of the LAVA stack such as result submission 3) Jobs can be extended locally (by LAVA plugins) so that anyone can develop specialized use cases for their very specific needs without altering the stack or having to write something completely custom.
I think exposing Job Dispatcher directly would not be a good idea for validation farm, where test jobs are queued for execution via Scheduler (see LAVA architecture). Bypassing the job queue on the Dispatcher level should only be allowed in exceptional cases, i.e. canceling jobs for server/board update or similar. There might be scenarios where the unrestricted direct control is desirable, but that should be only allowed for local development environments.
Hi Mirsad.
I must have missed this part of your email, sorry for responding so late. My intention was not to expose the dispatcher to farm users but to offer it as a generic, higher-level abrek replacement.
In this case one could develop and experiment with jobs using their favorite text editor and a command line tool that effectively dispatches a job on a locally connected device (android or "classic" board). This would allow us to write custom code that is not yet in the released LAVA stack that can interact with a new class of device properly, like in our case, android.
It would be good to have this specified/discussed somewhere already now, maybe in the Dispatcher blueprint, or maybe we need additional blueprint or wiki spec for it?
I think this can wait until we 1) have resources that can be committed to do this 2) agree on what to do with android+abrek+lava in general.
Best regards ZK
On 16 February 2011 05:27, Zygmunt Krynicki zygmunt.krynicki@linaro.org wrote:
On 16.02.2011 at 11:22, Mirsad Vojnikovic wrote:
Zygmunt has already identified key points for Job Dispatcher to support this, but one thing I would like to comment:
Another crazy option would be to expose LAVA Job Dispatcher directly and allow people to run jobs. In this case one job would use abrek and some other tools to invoke tests, process results and send them to the dashboard while other job (one for android) would not run abrek at all, instead it would call some other helper, while still reusing identical components for "process results and send to result storage" phases. This is still in flux but has some advantages: 0) Jobs are simple text files that can be stored and shared with others 1) Jobs can encapsulate device information like which android device to connect to and how. 2) Jobs can still "call" to other parts of the LAVA stack such as result submission 3) Jobs can be extended locally (by LAVA plugins) so that anyone can develop specialized use cases for their very specific needs without altering the stack or having to write something completely custom.
I think exposing Job Dispatcher directly would not be a good idea for validation farm, where test jobs are queued for execution via Scheduler (see LAVA architecture). Bypassing the job queue on the Dispatcher level should only be allowed in exceptional cases, i.e. canceling jobs for server/board update or similar. There might be scenarios where the unrestricted direct control is desirable, but that should be only allowed for local development environments.
Hi Mirsad.
I must have missed this part of your email, sorry for responding so late. My intention was not to expose the dispatcher for farm users but for generic higher-level abrek replacement.
OK, then I understand better.
In this case one could develop and experiment with jobs using their favorite text editor and a command line tool that effectively dispatches a job on a locally connected device (android or "classic" board). This would allow us to write custom code that is not yet in the released LAVA stack that can interact with a new class of device properly, like in our case, android.
Totally agree with you, this is a valid user scenario where direct control is needed.
It would be good to have this specified/discussed somewhere already now, maybe in the Dispatcher blueprint, or maybe we need additional blueprint or wiki spec for it?
I think this can wait until we 1) have resources that can be committed to do this 2) agree on what to do with android+abrek+lava in general.
OK, sounds reasonable.
Best regards
ZK
Another crazy option would be to expose LAVA Job Dispatcher directly and allow people to run jobs. In this case one job would use abrek and some other tools to invoke tests, process results and send them to the dashboard while other job (one for android) would not run abrek at all, instead it would call some other helper, while still reusing identical components for "process results and send to result storage" phases. This is still in flux but has some advantages: 0) Jobs are simple text files that can be stored and shared with others 1) Jobs can encapsulate device information like which android device to connect to and how. 2) Jobs can still "call" to other parts of the LAVA stack such as result submission 3) Jobs can be extended locally (by LAVA plugins) so that anyone can develop specialized use cases for their very specific needs without altering the stack or having to write something completely custom.
I think exposing Job Dispatcher directly would not be a good idea for validation farm, where test jobs are queued for execution via Scheduler (see LAVA architecture). Bypassing the job queue on the Dispatcher level should only be allowed in exceptional cases, i.e. canceling jobs for server/board update or similar. There might be scenarios where the unrestricted direct control is desirable, but that should be only allowed for local development environments.
I hope you don't really think this is a crazy idea, because it's *exactly* what we discussed doing, and I think it is still the direction forward. Even back at the sprint, we talked about the fact that though we don't know exactly what our android images will look like yet (so specifics are hard to plan around), we do have a pretty good idea that things like abrek are not going to be a good fit there. Additionally, there are other tests that may eventually need to run from the outside. For that matter, the boot test itself is exactly one of those things. If a system doesn't boot, we still want to capture the serial log and save it for later debugging. We certainly need, and have already discussed the fact that the dispatcher has a server component that drives execution of such things. This isn't, in any way, bypassing the dispatcher, but rather using a different interface of it to run a non-abrek based test.
Despite that, this is still a useful exercise, and many thanks to Jim for putting this together to show us a proof of concept for how we can run tests remotely on android systems!
Thanks, Paul Larson
On 16.02.2011 at 15:10, Paul Larson wrote:
Another crazy option would be to expose LAVA Job Dispatcher directly and allow people to run jobs. In this case one job
I hope you don't really think this is a crazy idea, because it's *exactly* what we discussed doing, and I think it is still the direction
I did not give it much thought when I wrote that email but I think the direction is sensible and would be much more flexible than anything else. After more conversations and some experiments I think this is the way forward for all testing.
forward. Even back at the sprint, we talked about the fact that though we don't know exactly what our android images will look like yet (so specifics are hard to plan around), we do have a pretty good idea that things like abrek are not going to be a good fit there. Additionally, there are other tests that may eventually need to run from the outside. For that matter, the boot test itself is exactly one of those things. If a system doesn't boot, we still want to capture the serial log and
save it for later debugging. We certainly need, and have already discussed the fact that the dispatcher has a server component that drives execution of such things. This isn't, in any way, bypassing the dispatcher, but rather using a different interface of it to run a non-abrek based test.
Yeah, now that you mention this, I started thinking. Do we really need a daemon-like component for the dispatcher in general, or just in the farm environment? Can we assume that the "daemon" part of the implementation can be replaced with a command line tool that simply interacts with one device, strictly for development purposes? I'm thinking about stuff like volatile device state and device monitoring requirements.
Other than that it seems that our device overwatch + dispatcher looks quite similar to android debug bridge. I think this is good.
Thanks ZK
On Wed, Feb 16, 2011 at 8:16 AM, Zygmunt Krynicki < zygmunt.krynicki@linaro.org> wrote:
Yeah, when you mentioned this now I started thinking. Do we really need a daemon-like component for the dispatcher in general or just in the farm environment. Can we assume that having the implementation the "daemon" can be replaced with a command line tool that simply interacts with one device - strictly for development purpose? I'm thinking about stuff like volatile device state and device monitoring requirements.
The only reason I see for having a daemon at all is to pick up jobs from the queue. In reality, I think even having the queue is overkill for the moment. The actual job dispatcher portion, that is, the piece that takes a job control file, parses it, initiates deployment to a device, ... can absolutely be a command line piece. We're going to have to have one of these processes running for each job running anyway, and it greatly simplifies things not only for us, but for anyone who wants to run a smaller-scale version of this.
-Paul Larson
On 16 February 2011 06:24, Paul Larson paul.larson@linaro.org wrote:
On Wed, Feb 16, 2011 at 8:16 AM, Zygmunt Krynicki < zygmunt.krynicki@linaro.org> wrote:
Yeah, when you mentioned this now I started thinking. Do we really need a daemon-like component for the dispatcher in general or just in the farm environment. Can we assume that having the implementation the "daemon" can be replaced with a command line tool that simply interacts with one device - strictly for development purpose? I'm thinking about stuff like volatile device state and device monitoring requirements.
The only reason I see for having a daemon at all is to pick up jobs from the queue. In reality, I think even having the queue is overkill for the moment. The actual job dispatcher portion, that is, the piece that takes a job control file, parses it, initiates deployment to a device, ... can absolutely be a command line piece. We're going to have to have one of these processes running for each job running anyway, and it greatly simplifies things not only for us, but for anyone who wants to run a smaller-scale version of this.
Totally agree; that is what I meant by a three-layered Dispatcher, which you can see in the awful picture I sent previously when reviewing the Dispatcher. The first layer picks up jobs from the queue and assigns them to the second layer, which we call the server dispatcher today, and which then uses the third layer, called the client dispatcher, on the board. Having a smaller-scale version would be very useful as well.
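As a rough illustration of how the first two layers could be separated, here is a small sketch in which the per-job dispatcher is an ordinary command line entry point and the queue layer merely calls it once per job. The JSON job layout and the names dispatch_job, main and drain_queue are assumptions made for this example only.

```python
import argparse
import json


def dispatch_job(job):
    """Second layer: walk one parsed job and return the steps it would run."""
    return [step["command"] for step in job.get("actions", [])]


def main(argv=None):
    """Command line entry point: dispatch exactly one job control file."""
    parser = argparse.ArgumentParser(description="Dispatch one job control file")
    parser.add_argument("job_file", help="path to a JSON job description")
    args = parser.parse_args(argv)
    with open(args.job_file) as handle:
        job = json.load(handle)
    for command in dispatch_job(job):
        print("dispatching:", command)
    return 0


def drain_queue(queue):
    """First layer: hand queued job files to the dispatcher one at a time.

    `queue` is expected to be a collections.deque of file paths.
    """
    results = {}
    while queue:
        path = queue.popleft()
        results[path] = main([path])
    return results
```

A developer with one board on the desk would call main() directly (for example through sys.exit(main()) in a small script), while a farm would keep the queue layer in a long-running process; the dispatching code itself is the same in both cases.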
-Paul Larson
On 16 February 2011 06:10, Paul Larson paul.larson@linaro.org wrote:
Another crazy option would be to expose LAVA Job Dispatcher directly and allow people to run jobs. In this case one job would use abrek and some other tools to invoke tests, process results and send them to the dashboard while other job (one for android) would not run abrek at all, instead it would call some other helper, while still reusing identical components for "process results and send to result storage" phases. This is still in flux but has some advantages: 0) Jobs are simple text files that can be stored and shared with others 1) Jobs can encapsulate device information like which android device to connect to and how. 2) Jobs can still "call" to other parts of the LAVA stack such as result submission 3) Jobs can be extended locally (by LAVA plugins) so that anyone can develop specialized use cases for their very specific needs without altering the stack or having to write something completely custom.
I think exposing Job Dispatcher directly would not be a good idea for validation farm, where test jobs are queued for execution via Scheduler (see LAVA architecture). Bypassing the job queue on the Dispatcher level should only be allowed in exceptional cases, i.e. canceling jobs for server/board update or similar. There might be scenarios where the unrestricted direct control is desirable, but that should be only allowed for local development environments.
I hope you don't really think this is a crazy idea, because it's *exactly* what we discussed doing, and I think it is still the direction forward. Even back at the sprint, we talked about the fact that though we don't know exactly what our android images will look like yet (so specifics are hard to plan around), we do have a pretty good idea that things like abrek are not going to be a good fit there. Additionally, there are other tests that may eventually need to run from the outside. For that matter, the boot test itself is exactly one of those things. If a system doesn't boot, we still want to capture the serial log and save it for later debugging. We certainly need, and have already discussed the fact that the dispatcher has a server component that drives execution of such things. This isn't, in any way, bypassing the dispatcher, but rather using a different interface of it to run a non-abrek based test.
Maybe I expressed myself badly. By bypassing I meant bypassing the job queue, not the Dispatcher. I think it is a bad idea on the validation farm to bypass the job queue and execute jobs directly on the Dispatcher, regardless of whether they use abrek or some other test process or framework. I can understand that we need this functionality while developing the system, in order to enable and validate different parts separately from the others. But if this is how it should work in the final version, then we need to specify a way to track these direct test jobs from the Dispatcher.
Despite that, this is still a useful exercise, and many thanks to Jim for putting this together to show us a proof of concept for how we can run tests remotely on android systems!
Thanks, Paul Larson