On Mon, 25 Jul 2016 12:56:13 +0200 Sjoerd Simons sjoerd.simons@collabora.co.uk wrote:
On Mon, 2016-07-25 at 10:53 +0100, Neil Williams wrote:
On Mon, 25 Jul 2016 09:29:28 +0200 Sjoerd Simons sjoerd.simons@collabora.co.uk wrote:
On Sun, 2016-07-24 at 14:29 +0100, Neil Williams wrote:
On Sun, 24 Jul 2016 00:21:34 +0200 Sjoerd Simons sjoerd.simons@collabora.co.uk wrote:
There is no support for submitting to specific target devices as this impedes both scheduling and lab management when needing to retire broken hardware.
Hmm, that's true though. FWIW, what I tend to (ab)use submitting to specific target devices for is mostly hacking sessions and the like, when I need to do some maintenance or other work that really needs one specific target device, rather than any regular jobs. It would be nice to cover that use case somehow.
Hacking sessions are for users though. As an admin, you already have direct access to the device. This was one of the reasons why V1 had all the device configuration on the dispatcher, so that local scripts could parse out the connection_command and power_on_cmd to get a way to get onto the device whilst it was Offline. (This is why we have maintenance mode on a per-device level.)
With V2, that information is available directly from the UI, so all the admin needs to do is take the device offline, ssh onto the dispatcher and have a web browser looking at the device detail page.
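To illustrate the kind of local script mentioned above for V1, here is a minimal sketch that pulls connection_command and power_on_cmd out of a device configuration. It assumes the configuration is plain "key = value" text; the sample content, the device type name and the helper names (parse_device_config, admin_commands) are hypothetical and not taken from LAVA itself.

```python
def parse_device_config(text):
    """Parse simple 'key = value' lines into a dict, skipping blanks and comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key = value line, ignore it
        settings[key.strip()] = value.strip()
    return settings


# Hypothetical sample content, assumed format only.
SAMPLE = """\
# example device configuration (made up for illustration)
device_type = example-board
connection_command = telnet localhost 7001
power_on_cmd = /usr/local/bin/power --on 3
"""


def admin_commands(text):
    """Return only the settings an admin needs to reach an offline device."""
    cfg = parse_device_config(text)
    wanted = ("connection_command", "power_on_cmd")
    return {key: cfg[key] for key in wanted if key in cfg}


if __name__ == "__main__":
    for key, value in admin_commands(SAMPLE).items():
        print(f"{key}: {value}")
```

An admin could then run the printed commands by hand while the device stays offline, without going through the scheduler.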
But that's basically doing by hand things that LAVA can already do for you.
Maybe I'm just too lazy, but I like telling LAVA to just go and boot a board for me with a rootfs of choice so that I can log in and do whatever needs to be done without having to resort to setting things up by hand.
Why do you need a rootfs in the first place?
With LAVA V2, the only software needed on the board is the bootloader - with the exception of devices supporting primary connections. There is nothing that needs to be done in a rootfs for a V2 device.
No need to wait for the hacking session to be scheduled (another job could always get in first: even at high priority, a health check takes precedence, or there could be another high priority job already in the queue).
In my experience health checks don't happen often enough to be problematic for this.
That's configurable. In a lab running 1,000 jobs a day it is routine.
For the other aspects, simply restricting submission to the device works well (which, depending on what gets done, is a good choice anyway).
Though a maintenance priority/type of job that runs even if the device is currently offline and trumps all other priorities would be really nice for these kinds of things. Though I bet you disagree on this aspect :)
Only a forced health check must ever run on a device which is offline. Health checks always take precedence over any priority settings.
Offline is a maintenance mode, especially for admins. That is the only purpose of having an offline status. Offline means that the device is currently unusable - it could be disconnected, bricked etc. It is up to the admin to be confident that it is safe to run a health check. There is also looping mode for repeating such tests.
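As a rough illustration of the rules described in this thread (health checks beat any priority, and only a forced health check runs on an offline device), here is a small sketch of job selection for a single device. It is not the LAVA scheduler; the Job fields, the numeric priority scale and the function names are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    priority: int = 50          # assumed scale: higher runs first
    health_check: bool = False
    forced: bool = False        # forced by an admin
    submitted: int = 0          # submission order, lower is older


def may_run(job, device_online):
    """An offline device only accepts a forced health check."""
    if device_online:
        return True
    return job.health_check and job.forced


def pick_next(queue, device_online):
    """Pick the next job: health checks first, then priority, then age."""
    candidates = [job for job in queue if may_run(job, device_online)]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda job: (not job.health_check, -job.priority, job.submitted),
    )


if __name__ == "__main__":
    queue = [
        Job("hacking-session", priority=100, submitted=1),
        Job("health-check", priority=0, health_check=True, submitted=2),
        Job("other-high-priority", priority=100, submitted=0),
    ]
    print(pick_next(queue, device_online=True).name)
    print(pick_next(queue, device_online=False))
```

In this sketch a high priority hacking session still waits behind the health check and behind an older job of the same priority, which is the scheduling delay mentioned earlier in the thread.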
We're updating the docs on health checks - stressing that a health check needs to test every type of action supported by the device type (except a hacking session as it still needs to be fully automated). The health check still needs to be quick but it also needs to be thorough.
Just because hacking sessions log a user in as root does *not* mean that this is a workable solution for administration - that confuses the issues. TestJobs, like hacking sessions, need to be ephemeral in terms of storage - that way admins can trust that users can't actually undo the admin setup just by using a hacking session themselves.
Given that a hacking session gives you root by definition, folks can do whatever they like on a board. Nothing is stopping someone in a hacking session from e.g. reflashing the bootloader :)
Exactly - it is up to the admins to sanction such users as that causes work for the admin. It depends on the device - with a device with sufficient support, the bootloader can be safely replaced by the testjob so it would not be a problem.
I don't see what operations are needed in V2 that can be done inside a hacking session, except possibly updating the U-Boot uEnv.txt, but that's possible to do from the bootloader shell as well.