On Tue, Feb 14, 2012 at 3:26 AM, Michael Hudson-Doyle michael.hudson@canonical.com wrote:
On Mon, 13 Feb 2012 22:27:25 +0100, Zygmunt Krynicki zygmunt.krynicki@linaro.org wrote:
Hi.
Fast model support is getting better. It seems that with the excellent patches by Peter Maydell we can now boot some kernels (I've only tried one tree, additional trees welcome :-). I'm currently building a from-scratch environment to ensure everything is accounted for and I understand how pieces interact.
Having said that I'd like to summarize how LAVA handles fast models:
Technically the solution is not unlike QEMU which you are all familiar with. The key differences are:
- Only NFS boot makes sense. There is no other sensible method that I know of. We could also use an SD card (virtual, obviously) but it is constrained to two gigabytes of data.
As mentioned in the other thread, it would be good to at least let ARM know that removing this limit would help us (if we can figure out how to do this).
We may figure out how to do this by reading the LISA source code that came with the model. That's a big task though (maybe grepping for mmc0 is low-hanging fruit; I did not check).
- The way we actually boot is complicated. There is no u-boot; the fast model interpreter actually starts an .axf file that can do anything (some examples include running tests and benchmarks without actually starting a kernel or anything like that). There is no way to easily load the kernel and pass a command line. To work around that we're using a special .axf file that uses fast model semihosting features to load the kernel/initrd from the host filesystem, as well as to set up the command line that will be passed to the booting kernel. This allows us to freely configure NFS services and point our virtual kernel at appropriate IP addresses and pathnames.
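For example, the command line we end up constructing is the usual NFS-root incantation. A sketch, with made-up server address and export path:

    /* Illustrative only: the sort of command line our .axf hands to the
     * kernel so that it mounts its root filesystem over NFS. The server
     * address and export path are hypothetical. */
    static const char kernel_cmdline[] =
        "console=ttyAMA0 "
        "root=/dev/nfs rw "
        "nfsroot=192.168.1.10:/srv/nfs/testrootfs,tcp "
        "ip=dhcp";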
So I guess I'd like to understand how this works in a bit more detail. Can you brain dump on the topic for a few minutes? :) What is "fast model semihosting"?
It's a way to have "syscalls" that connect the "bare hardware" (be it physical or emulated) to an external debugger or other monitor. You can find a short introduction in this blog [1]. For us it means we get to write bare-metal assembly that does the equivalent of open(), read(), write() and close(). The files being opened are on the machine that runs the fast model. You can also print debugging statements straight to the console this way (we could probably write a semihosting console driver if there is no such code yet) to get all of the output to the same tty that runs the model (model_shell). A more detailed explanation of this topic can be found in [2].
Fast model semihosting simply refers to using semihosting facilities in a fast model interpreter.
[1]: http://blogs.arm.com/software-enablement/418-semihosting-a-life-saver-during...
[2]: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0471c/CHDJHHD...
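To make this more concrete, here is a minimal sketch of what such calls look like from the guest side, written as C with inline assembly for ARM state (the operation numbers come from the ARM semihosting spec; the helpers are illustrative, not our actual .axf code):

    #include <stdint.h>

    /* Operation numbers from the ARM semihosting specification. */
    #define SYS_OPEN   0x01
    #define SYS_CLOSE  0x02
    #define SYS_WRITE0 0x04
    #define SYS_READ   0x06

    /* In ARM state the semihosting trap is "svc 0x123456": r0 holds the
     * operation number, r1 points at the argument block, and the result
     * comes back in r0. */
    static uintptr_t semihost(uintptr_t op, void *arg)
    {
        register uintptr_t r0 __asm__("r0") = op;
        register void     *r1 __asm__("r1") = arg;
        __asm__ volatile("svc 0x123456" : "+r"(r0) : "r"(r1) : "memory");
        return r0;
    }

    /* Print a NUL-terminated string on the host console, i.e. the tty
     * that runs the model (model_shell). */
    static void host_puts(const char *s)
    {
        semihost(SYS_WRITE0, (void *)s);
    }

    /* Open a file on the *host* filesystem; mode 0 means "r". Returns a
     * handle usable with SYS_READ/SYS_CLOSE, or -1 on failure. */
    static intptr_t host_open(const char *path, uintptr_t mode, uintptr_t len)
    {
        uintptr_t args[3] = { (uintptr_t)path, mode, len };
        return (intptr_t)semihost(SYS_OPEN, args);
    }

This is roughly the shape of what the special .axf does, with SYS_READ loops on top, to pull the kernel and initrd off the host filesystem.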
- After the machine starts up we immediately open a TCP/IP connection to a local TCP socket. We know which port is being used, so we can easily allocate ports up front. This port is now the traditional LAVA serial console.
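The dispatcher side is then just a plain TCP client. A minimal sketch (the port number here is made up, and a real client would also strip telnet IAC negotiation bytes from the stream):

    #include <arpa/inet.h>
    #include <stdio.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int main(void)
    {
        struct sockaddr_in addr = { 0 };
        int fd = socket(AF_INET, SOCK_STREAM, 0);

        addr.sin_family = AF_INET;
        addr.sin_port   = htons(5000);          /* port allocated up front */
        inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);

        if (fd < 0 || connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
            perror("connect");
            return 1;
        }

        /* Stream everything the model prints: our "serial console". */
        char buf[4096];
        ssize_t n;
        while ((n = read(fd, buf, sizeof buf)) > 0)
            fwrite(buf, 1, (size_t)n, stdout);

        close(fd);
        return 0;
    }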
I guess there is a risk here that we will miss early boot messages? This might not matter much.
There are other options but currently this seems to work quite okay.
Once we've found someone at ARM we can officially complain to about fast models, an option to have serial comms happen on the process stdin/stdout would be nice.
I think the reason they don't happen on the console is that by default we get four telnet ports to connect to (definitely more than one), so the logical question they'll ask is "which port should we redirect?". Maybe there is an option buried somewhere to make that happen, but so far I have not found it.
The rest of this looks like QEMU:
- you can access the filesystem easily (to gather results)
- we can use QEMU to chroot into the NFS root to install additional
software (emulation via a fast model is extremely slow)
In my testing, the pip install bzr+lp:lava-test step did not really work under QEMU. Maybe it does now, or maybe we can install a tarball or something.
I installed lava-test using a release tarball. That has worked pretty well.
In general I think that:
1) We need to reconsider how to do testing on very slow machines
2) What can be invoked on the host (part of installation, unless that wants to build stuff, result parsing and tracking)
3) What has to be invoked on the target (test code, system probes)
It's important to make the intent very clear. If we define that cmd_install installs something while in the "master image" on the "target", then we should not break that. I think it would be sensible to add a "host_chroot" mode that applies nicely to qemu and fast models. Very slow things that don't care about the architecture could be invoked in that mode without sacrificing performance.
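A minimal sketch of what that mode boils down to, assuming qemu-arm-static has already been copied into the rootfs and binfmt_misc is registered so that ARM binaries run transparently (the paths are hypothetical, and it needs root):

    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        const char *rootfs = "/srv/nfs/testrootfs"; /* hypothetical NFS root */

        if (chroot(rootfs) != 0 || chdir("/") != 0) {
            perror("chroot");
            return 1;
        }

        /* Anything exec'd from here on is an ARM binary, run on the host
         * under qemu-arm-static emulation -- e.g. unpacking a lava-test
         * release tarball. */
        execl("/bin/sh", "sh", "-c",
              "cd /opt && tar xzf lava-test.tar.gz", (char *)NULL);
        perror("execl");
        return 1;
    }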
Thanks ZK
On Tue, 14 Feb 2012 20:24:51 +0100, Zygmunt Krynicki zygmunt.krynicki@linaro.org wrote:
On Tue, Feb 14, 2012 at 3:26 AM, Michael Hudson-Doyle michael.hudson@canonical.com wrote:
On Mon, 13 Feb 2012 22:27:25 +0100, Zygmunt Krynicki zygmunt.krynicki@linaro.org wrote:
Hi.
Fast model support is getting better. It seems that with the excellent patches by Peter Maydell we can now boot some kernels (I've only tried one tree, additional trees welcome :-). I'm currently building a from-scratch environment to ensure everything is accounted for and I understand how pieces interact.
Having said that I'd like to summarize how LAVA handles fast models:
Technically the solution is not unlike QEMU which you are all familiar with. The key differences are:
- Only NFS boot makes sense. There is no other sensible method that I know of. We could also use an SD card (virtual, obviously) but it is constrained to two gigabytes of data.
As mentioned in the other thread, it would be good to at least let ARM know that removing this limit would help us (if we can figure out how to do this).
We may figure out how to do this by reading the LISA source code that came with the model. That's a big task though (maybe grepping for mmc0 is low-hanging fruit; I did not check).
That's not what I was suggesting! We should try to persuade ARM to do that. It may be that they can't do it in a reasonable timeframe, or maybe it's simply not a need that has been explained to them yet and is something they can do in a week.
- The way we actually boot is complicated. There is no u-boot; the fast model interpreter actually starts an .axf file that can do anything (some examples include running tests and benchmarks without actually starting a kernel or anything like that). There is no way to easily load the kernel and pass a command line. To work around that we're using a special .axf file that uses fast model semihosting features to load the kernel/initrd from the host filesystem, as well as to set up the command line that will be passed to the booting kernel. This allows us to freely configure NFS services and point our virtual kernel at appropriate IP addresses and pathnames.
So I guess I'd like to understand how this works in a bit more detail. Can you brain dump on the topic for a few minutes? :) What is "fast model semihosting"?
It's a way to have "syscalls" that connect the "bare hardware" (be it physical or emulated) to an external debugger or other monitor. You can find a short introduction in this blog [1]. For us it means we get to write bare-metal assembly that does the equivalent of open(), read(), write() and close(). The files being opened are on the machine that runs the fast model. You can also print debugging statements straight to the console this way (we could probably write a semihosting console driver if there is no such code yet) to get all of the output to the same tty that runs the model (model_shell). A more detailed explanation of this topic can be found in [2].
Fast model semihosting simply refers to using semihosting facilities in a fast model interpreter.
Thanks for that. Sounds a tiny little bit like it's using a JTAG-for-fast-models type facility?
- After the machine starts up we immediately open a TCP/IP connection to a local TCP socket. We know which port is being used, so we can easily allocate ports up front. This port is now the traditional LAVA serial console.
I guess there is a risk here that we will miss early boot messages? This might not matter much.
There are other options but currently this seems to work quite okay.
Fair enough.
Once we've found someone at ARM we can officially complain to about fast models, an option to have serial comms happen on the process stdin/stdout would be nice.
I think the reason they don't happen on the console is that by default we get four telnet ports to connect to (definitely more than one), so the logical question they'll ask is "which port should we redirect?". Maybe there is an option buried somewhere to make that happen, but so far I have not found it.
Again, I'm not saying that this is something we should do...
The rest of this looks like QEMU:
- you can access the filesystem easily (to gather results)
- we can use QEMU to chroot into the NFS root to install additional
software (emulation via a fast model is extremely slow)
In my testing, the pip install bzr+lp:lava-test step did not really work under QEMU. Maybe it does now, or maybe we can install a tarball or something.
I installed lava-test using a release tarball. That has worked pretty well.
OK. That makes sense.
In general I think that:
- We need to reconsider how to do testing on very slow machines
You mean a "drive from the outside" approach like lava-android-test uses may make sense?
- What can be invoked on the host (part of installation, unless that
wants to build stuff, result parsing and tracking)
Yeah, this sort of thing is a grey area currently. More below.
- What has to be invoked on the target (test code, system probes)
It's important to make the intent very clear. If we define that cmd_install installs something while in the "master image" on the "target", then we should not break that.
Well. We can't avoid breaking that if there *is no master image*.
Currently the dispatcher has the concept of a "reliable session", which is meant to be a target-like environment where things like compilation are possible. For master image based deployments, this is "booted into the master image, chrooted into a mounted testrootfs". For qemu, it is currently "boot the test image and hope that works", but it could be "chrooted into the testrootfs mounted on the host with qemu-arm-static in the right place", but that was less reliable than the other approach when I was testing this.
I think that it would be sensible to add a "host_chroot" mode that applies nicely to qemu and fast models. Very slow things that don't care about the architecture could be invoked in that mode without sacrificing performance.
This code exists already. See _chroot_into_rootfs_session in lava_dispatcher.client.qemu.LavaQEMUClient and surrounds. The problem is that qemu is some distance from perfect...
Maybe we can limit the things that lava-test install does to things that work under qemu -- I guess installing via dpkg usually works (unless it's something like mono) and gcc probably works ok? Maybe we can do something like scratchbox where gcc is magically a cross compiler running directly on the host?
Cheers, mwh
On Wed, Feb 15, 2012 at 12:31 AM, Michael Hudson-Doyle michael.hudson@canonical.com wrote:
On Tue, 14 Feb 2012 20:24:51 +0100, Zygmunt Krynicki zygmunt.krynicki@linaro.org wrote:
On Tue, Feb 14, 2012 at 3:26 AM, Michael Hudson-Doyle michael.hudson@canonical.com wrote:
On Mon, 13 Feb 2012 22:27:25 +0100, Zygmunt Krynicki zygmunt.krynicki@linaro.org wrote:
Hi.
Fast model support is getting better. It seems that with the excellent patches by Peter Maydell we can now boot some kernels (I've only tried one tree, additional trees welcome :-). I'm currently building a from-scratch environment to ensure everything is accounted for and I understand how pieces interact.
Having said that I'd like to summarize how LAVA handles fast models:
Technically the solution is not unlike QEMU which you are all familiar with. The key differences are:
- Only NFS boot makes sense. There is no other sensible method that I know of. We could also use an SD card (virtual, obviously) but it is constrained to two gigabytes of data.
As mentioned in the other thread, it would be good to at least let ARM know that removing this limit would help us (if we can figure out how to do this).
We may figure out how to do this by reading the LISA source code that came with the model. That's a big task though (maybe grepping for mmc0 is low-hanging fruit; I did not check).
That's not what I was suggesting! We should try to persuade ARM to do that. It may be that they can't do it in a reasonable timeframe, or maybe it's simply not a need that has been explained to them yet and is something they can do in a week.
- The way we actually boot is complicated. There is no u-boot; the fast model interpreter actually starts an .axf file that can do anything (some examples include running tests and benchmarks without actually starting a kernel or anything like that). There is no way to easily load the kernel and pass a command line. To work around that we're using a special .axf file that uses fast model semihosting features to load the kernel/initrd from the host filesystem, as well as to set up the command line that will be passed to the booting kernel. This allows us to freely configure NFS services and point our virtual kernel at appropriate IP addresses and pathnames.
So I guess I'd like to understand how this works in a bit more detail. Can you brain dump on the topic for a few minutes? :) What is "fast model semihosting"?
It's a way to have "syscalls" that connect the "bare hardware" (be it physical or emulated) to an external debugger or other monitor. You can find a short introduction in this blog [1]. For us it means we get to write bare-metal assembly that does the equivalent of open(), read(), write() and close(). The files being opened are on the machine that runs the fast model. You can also print debugging statements straight to the console this way (we could probably write a semihosting console driver if there is no such code yet) to get all of the output to the same tty that runs the model (model_shell). A more detailed explanation of this topic can be found in [2].
Fast model semihosting simply refers to using semihosting facilities in a fast model interpreter.
Thanks for that. Sounds a tiny little bit like it's using a JTAG-for-fast-models type facility?
Only in some ways; it's always device-driven. You cannot use it to program memory or flip register values.
PS: It's also a security risk, as it includes a funny SYS_SYSTEM call. Yay, I get to run commands as whoever is running the model ;-) Model chroots, anyone?
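For reference, from the guest it is about this simple, reusing the semihost() helper sketched earlier (0x12 is SYS_SYSTEM in the spec):

    #include <stdint.h>
    #include <string.h>

    /* semihost() as sketched earlier in the thread. */
    extern uintptr_t semihost(uintptr_t op, void *arg);

    /* SYS_SYSTEM (0x12): ask the host to run an arbitrary command -- as
     * whoever is running model_shell. Hence the security concern. */
    uintptr_t host_system(const char *cmd)
    {
        uintptr_t args[2] = { (uintptr_t)cmd, strlen(cmd) };
        return semihost(0x12, args);
    }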
- After the machine starts up we immediately open a TCP/IP connection to a local TCP socket. We know which port is being used, so we can easily allocate ports up front. This port is now the traditional LAVA serial console.
I guess there is a risk here that we will miss early boot messages? This might not matter much.
There are other options but currently this seems to work quite okay.
Fair enough.
Once we've found someone at ARM we can officially complain to about fast models, an option to have serial comms happen on the process stdin/stdout would be nice.
I think the reason they don't happen on the console is that by default we get four telnet ports to connect to (definitely more than one), so the logical question they'll ask is "which port should we redirect?". Maybe there is an option buried somewhere to make that happen, but so far I have not found it.
Again, I'm not saying that this is something we should do...
The rest of this looks like QEMU:
- you can access the filesystem easily (to gather results)
- we can use QEMU to chroot into the NFS root to install additional
software (emulation via a fast model is extremely slow)
In my testing, the pip install bzr+lp:lava-test step did not really work under QEMU. Maybe it does now, or maybe we can install a tarball or something.
I installed lava-test using a release tarball. That has worked pretty well.
OK. That makes sense.
In general I think that:
- We need to reconsider how to do testing on very slow machines
You mean a "drive from the outside" approach like lava-android-test uses may make sense?
Actually that's very reasonable for another reason. Driving adb from the outside is the 'lava-agent' idea I've been talking about lately. Instead of talking over a busy serial line, you use a packetized protocol over USB to talk to a piece of code that can do anything with your test device.
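The framing needs nothing fancy. A hypothetical sketch, since the real protocol is not designed yet: a 4-byte big-endian length prefix, then the payload:

    #include <arpa/inet.h>   /* htonl/ntohl */
    #include <stdint.h>
    #include <stdlib.h>
    #include <unistd.h>

    /* Write one length-prefixed packet to the agent. */
    static int send_packet(int fd, const void *payload, uint32_t len)
    {
        uint32_t hdr = htonl(len);
        if (write(fd, &hdr, sizeof hdr) != sizeof hdr)
            return -1;
        return write(fd, payload, len) == (ssize_t)len ? 0 : -1;
    }

    /* Read one packet; the caller frees the returned buffer. */
    static void *recv_packet(int fd, uint32_t *len_out)
    {
        uint32_t hdr;
        uint8_t *buf;
        ssize_t got, off = 0;

        if (read(fd, &hdr, sizeof hdr) != sizeof hdr)
            return NULL;
        *len_out = ntohl(hdr);
        if ((buf = malloc(*len_out)) == NULL)
            return NULL;
        while ((uint32_t)off < *len_out) {       /* short reads happen */
            got = read(fd, buf + off, *len_out - off);
            if (got <= 0) { free(buf); return NULL; }
            off += got;
        }
        return buf;
    }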
Here my main concern is speed. Every bit counts; this thing is slow as hell already.
- What can be invoked on the host (part of installation, unless that
wants to build stuff, result parsing and tracking)
Yeah, this sort of thing is a grey area currently. More below.
- What has to be invoked on the target (test code, system probes)
It's important to make the intent very clear. If we define that cmd_install installs something while in the "master image" on the "target", then we should not break that.
Well. We can't avoid breaking that if there *is no master image*.
Yes, but so far the master image was an ARM device. What I'd like to do is discuss how to sensibly migrate from that concept. The master image was supposed to have a reliable kernel with networking. Is that still relevant, if we can just shove files onto the image? Should we just ask people to build their benchmarks before starting the device for tests? What about existing tests? How does lava-test need to change to support this?
Currently the dispatcher has the concept of a "reliable session", which is meant to be a target-like environment where things like compilation are possible. For master image based deployments, this is "booted into the master image, chrooted into a mounted testrootfs". For qemu, it is currently "boot the test image and hope that works", but it could be "chrooted into the testrootfs mounted on the host with qemu-arm-static in the right place", but that was less reliable than the other approach when I was testing this.
Right, I know. I somewhat feel that it's an implementation detail that may vary from device to device, and we should think about what the framework (lava-test) is going to look like. Unlike dispatcher commands, it _has to_ be uniform and it _has to_ be backwards compatible.
I think that it would be sensible to add a "host_chroot" mode that applies nicely to qemu and fast models. Very slow things that don't care about the architecture could be invoked in that mode without sacrificing performance.
+1 I think that's a good abstraction
This code exists already. See _chroot_into_rootfs_session in lava_dispatcher.client.qemu.LavaQEMUClient and surrounds. The problem is that qemu is some distance from perfect...
Right, I saw that. I'll have more comments later.
Maybe we can limit the things that lava-test install does to things that work under qemu -- I guess installing via dpkg usually works (unless it's something like mono) and gcc probably works ok? Maybe we can do something like scratchbox where gcc is magically a cross compiler running directly on the host?
The thing is that our users may want to depend on the original behavior. Things like cross gcc doing bad stuff or qemu corrupting data or crashing are impossible to rule out. If anything, we should keep doing what we did: move slowly and in a compatible way while designing and implementing a new version of lava-test, with a sensible migration path that allows users to optimize such things.
Thanks ZK
On Wed, 15 Feb 2012 01:44:44 +0100, Zygmunt Krynicki zygmunt.krynicki@linaro.org wrote:
On Wed, Feb 15, 2012 at 12:31 AM, Michael Hudson-Doyle michael.hudson@canonical.com wrote:
On Tue, 14 Feb 2012 20:24:51 +0100, Zygmunt Krynicki zygmunt.krynicki@linaro.org wrote:
On Tue, Feb 14, 2012 at 3:26 AM, Michael Hudson-Doyle michael.hudson@canonical.com wrote:
On Mon, 13 Feb 2012 22:27:25 +0100, Zygmunt Krynicki zygmunt.krynicki@linaro.org wrote:
Hi.
Fast model support is getting better. It seems that with the excellent patches by Peter Maydell we can now boot some kernels (I've only tried one tree, additional trees welcome :-). I'm currently building a from-scratch environment to ensure everything is accounted for and I understand how pieces interact.
Having said that I'd like to summarize how LAVA handles fast models:
Technically the solution is not unlike QEMU which you are all familiar with. The key differences are:
- Only NFS boot makes sense. There is no other sensible method that I know of. We could also use an SD card (virtual, obviously) but it is constrained to two gigabytes of data.
As mentioned in the other thread, it would be good to at least let ARM know that removing this limit would help us (if we can figure out how to do this).
We may figure out how to do this by reading the LISA source code that came with the model. That's a big task though (maybe grepping for mmc0 is low-hanging fruit; I did not check).
That's not what I was suggesting! We should try to persuade ARM to do that. It may be that they can't do it in a reasonable timeframe, or maybe it's simply not a need that has been explained to them yet and is something they can do in a week.
- The way we actually boot is complicated. There is no u-boot; the fast model interpreter actually starts an .axf file that can do anything (some examples include running tests and benchmarks without actually starting a kernel or anything like that). There is no way to easily load the kernel and pass a command line. To work around that we're using a special .axf file that uses fast model semihosting features to load the kernel/initrd from the host filesystem, as well as to set up the command line that will be passed to the booting kernel. This allows us to freely configure NFS services and point our virtual kernel at appropriate IP addresses and pathnames.
So I guess I'd like to understand how this works in a bit more detail. Can you brain dump on the topic for a few minutes? :) What is "fast model semihosting"?
It's a way to have "syscalls" that connect the "bare hardware" (be it physical or emulated) to an external debugger or other monitor. You can find a short introduction in this blog [1]. For us it means we get to write bare-metal assembly that does the equivalent of open(), read(), write() and close(). The files being opened are on the machine that runs the fast model. You can also print debugging statements straight to the console this way (we could probably write a semihosting console driver if there is no such code yet) to get all of the output to the same tty that runs the model (model_shell). A more detailed explanation of this topic can be found in [2].
Fast model semihosting simply refers to using semihosting facilities in a fast model interpreter.
Thanks for that. Sounds a tiny little bit like it's using a JTAG-for-fast-models type facility?
Only in some ways; it's always device-driven. You cannot use it to program memory or flip register values.
Right.
PS: It's also a security risk, as it includes a funny SYS_SYSTEM call. Yay, I get to run commands as whoever is running the model ;-) Model chroots, anyone?
Argh. Spin up a vm for each test run? :)
- After the machine starts up we immediately open a TCP/IP connection to a local TCP socket. We know which port is being used, so we can easily allocate ports up front. This port is now the traditional LAVA serial console.
I guess there is a risk here that we will miss early boot messages? This might not matter much.
There are other options but currently this seems to work quite okay.
Fair enough.
Once we've found someone at ARM we can officially complain to about fast models, an option to have serial comms happen on the process stdin/stdout would be nice.
I think the reason they don't happen on the console is that by default we get four telnet ports to connect to (definitely more than one), so the logical question they'll ask is "which port should we redirect?". Maybe there is an option buried somewhere to make that happen, but so far I have not found it.
Again, I'm not saying that this is something we should do...
The rest of this looks like QEMU:
- you can access the filesystem easily (to gather results)
- we can use QEMU to chroot into the NFS root to install additional
software (emulation via a fast model is extremely slow)
In my testing, the pip install bzr+lp:lava-test step did not really work under QEMU. Maybe it does now, or maybe we can install a tarball or something.
I installed lava-test using a release tarball. That has worked pretty well.
OK. That makes sense.
In general I think that:
- We need to reconsider how to do testing on very slow machines
You mean a "drive from the outside" approach like lava-android-test uses may make sense?
Actually that's very reasonable for another reason. Driving adb from the outside is the 'lava-agent' idea I've been talking about lately. Instead of talking over a busy serial line, you use a packetized protocol over USB to talk to a piece of code that can do anything with your test device.
Right. Although in terms of the actual operations that get executed on the device I don't know how much difference this makes.
Here my main concern is speed. Every bit counts; this thing is slow as hell already.
How slow is slow?
- What can be invoked on the host (part of installation, unless that
wants to build stuff, result parsing and tracking)
Yeah, this sort of thing is a grey area currently. More below.
- What has to be invoked on the target (test code, system probes)
It's important to make the intent very clear. If we define that cmd_install installs something while in the "master image" on the "target", then we should not break that.
Well. We can't avoid breaking that if there *is no master image*.
Yes, but so far the master image was an ARM device. What I'd like to do is discuss how to sensibly migrate from that concept. The master image was supposed to have a reliable kernel with networking. Is that still relevant, if we can just shove files onto the image? Should we just ask people to build their benchmarks before starting the device for tests? What about existing tests? How does lava-test need to change to support this?
All decent questions.
Currently the dispatcher has the concept of a "reliable session", which is meant to be a target-like environment where things like compilation are possible. For master image based deployments, this is "booted into the master image, chrooted into a mounted testrootfs". For qemu, it is currently "boot the test image and hope that works", but it could be "chrooted into the testrootfs mounted on the host with qemu-arm-static in the right place", but that was less reliable than the other approach when I was testing this.
Right, I know. I somewhat feel that it's an implementation detail that may vary from device to device, and we should think about what the framework (lava-test) is going to look like. Unlike dispatcher commands, it _has to_ be uniform and it _has to_ be backwards compatible.
I think that it would be sensible to add a "host_chroot" mode that applies nicely to qemu and fast models. Very slow things that don't care about the architecture could be invoked in that mode without sacrificing performance.
+1 I think that's a good abstraction
You are +1ing your own idea here :-)
This code exists already. See _chroot_into_rootfs_session in lava_dispatcher.client.qemu.LavaQEMUClient and surrounds. The problem is that qemu is some distance from perfect...
Right, I saw that. I'll have more comments later.
Maybe we can limit the things that lava-test install does to things that work under qemu -- I guess installing via dpkg usually works (unless it's something like mono) and gcc probably works ok? Maybe we can do something like scratchbox where gcc is magically a cross compiler running directly on the host?
The thing is that our users may want to depend on the original behavior. Things like cross gcc doing bad stuff or qemu corrupting data or crashing are impossible to rule out. If anything, we should keep doing what we did: move slowly and in a compatible way while designing and implementing a new version of lava-test, with a sensible migration path that allows users to optimize such things.
I think running the installation in the model is going to be the most reliable thing for A15 testing -- given that one of the goals is to test KVM, I'd hope just plain networking works most of the time in the tested kernel. Not a perfect answer, of course.
Cheers, mwh