Hello Dave,
On Fri, 11 Nov 2011 10:36:45 +0000 Dave Pigott dave.pigott@linaro.org wrote:
[...]
look and feel!), we're asking that you hold off submitting jobs to LAVA until we give the go-ahead.
"Hold-off" as in "please don't submit jobs now - they disrupt us", or "if you submit now, be prepared to not get expected results"?
I'll e-mail later to let you know when we are fully operational again.
Many thanks
Dave Pigott, Validation Engineer, Linaro
Hi Paul,
On 11 Nov 2011, at 10:56, Paul Sokolovsky wrote:
"Hold-off" as in "please don't submit jobs now - they disrupt us", or "if you submit now, be prepared to not get expected results"?
Mostly the first, in that we need a pipeline to be able to submit jobs and check them.
Thanks
Dave
On Fri, Nov 11, 2011 at 3:53 PM, Dave Pigott dave.pigott@linaro.org wrote:
Mostly the first, in that we need a pipeline to be able to submit jobs and check them.
Problem is that with CI we cannot really hold off submitting jobs because that happens ad-hoc when a build finishes.
I understand the need to have a clean lab pipe. However, for that we need to work on a mechanism that allows you to put submitted jobs into a queue that isn't processed etc.
It would also be useful to put jobs there during maintenance so the frontend doesn't fail when submitting its job.
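As a rough illustration of the kind of holding queue described above, here is a minimal in-memory sketch. It is not LAVA code: `HoldingQueue` and the `dispatch` callable (standing in for whatever hands a job to the real scheduler) are hypothetical names, and a real service would need persistent storage.

```python
from collections import deque
from typing import Callable, Deque, Dict


class HoldingQueue:
    """Accepts job submissions at any time; dispatches only when not paused."""

    def __init__(self, dispatch: Callable[[Dict], None]) -> None:
        self._dispatch = dispatch  # hypothetical hand-off to the real scheduler
        self._pending: Deque[Dict] = deque()
        self.paused = False

    def __len__(self) -> int:
        return len(self._pending)

    def submit(self, job: Dict) -> None:
        # Accepting a job never fails, so a CI frontend sees no error
        # even while the lab is under maintenance.
        self._pending.append(job)
        self.drain()

    def pause(self) -> None:
        # Maintenance starts: keep accepting jobs, stop processing them.
        self.paused = True

    def resume(self) -> None:
        # Maintenance ends: process everything that was held.
        self.paused = False
        self.drain()

    def drain(self) -> None:
        # Hand over held jobs in arrival order unless paused.
        while self._pending and not self.paused:
            self._dispatch(self._pending.popleft())
```

With this shape, the lab team would call `pause()` before maintenance and `resume()` afterwards, while build frontends keep calling `submit()` as usual.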
On Mon, 14 Nov 2011 12:18:04 +0100 Alexander Sack asac@linaro.org wrote:
Problem is that with CI we cannot really hold off submitting jobs because that happens ad-hoc when a build finishes.
I understand the need to have a clean lab pipe. However, for that we need to work on a mechanism that allows you to put submitted jobs into a queue that isn't processed etc.
It would also be useful to put jobs there during maintenance so the frontend doesn't fail when submitting its job.
I recently added exactly such a switch to allow non-fail behavior. I didn't actually activate it, though, as the previous LAVA downtime finished before I could. But we now have the infra in place to avoid downtime propagation.
And if it's useful for the Validation team to temporarily cut off any submissions from android-build, I can do it easily and with priority; just ask over IRC or email any time.
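A switch like the one mentioned above might look roughly like the following sketch. This is not the actual android-build code: `submit_after_build`, the `fail_on_error` flag, and the `submit` callable (assumed to return a job number and to raise an `OSError` subclass such as `ConnectionError` when the lab is unreachable) are all hypothetical.

```python
from typing import Callable, Optional


def submit_after_build(
    submit: Callable[[dict], int],
    job: dict,
    fail_on_error: bool = True,
) -> Optional[int]:
    """Submit a job once a build finishes.

    With fail_on_error=False, a lab outage is logged and swallowed,
    so the downtime does not propagate into a failed build.
    """
    try:
        return submit(job)
    except OSError as exc:  # ConnectionError and TimeoutError are OSError subclasses
        if fail_on_error:
            raise
        print(f"LAVA submission skipped: {exc}")
        return None
```

The trade-off is that a skipped submission is simply lost unless something else records it for later.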
On Mon, Nov 14, 2011 at 5:18 AM, Alexander Sack asac@linaro.org wrote:
Problem is that with CI we cannot really hold off submitting jobs because that happens ad-hoc when a build finishes.
Agreed, this isn't always easy, but it's neat to see that infrastructure already has a workaround for it.
I understand the need to have a clean lab pipe. However, for that we need to work on a mechanism that allows you to put submitted jobs into a queue that isn't processed etc.
It would also be useful to put jobs there during maintenance so the frontend doesn't fail when submitting its job.
There are a few issues to resolve around this. It would have to be a separate piece running somewhere else (like a cloud instance). To make it more reliable, we'd really need to run more than one of these as backups for one another, so that when/if we need to reboot them, the reboots could be phased. It would also lack the ability to tell you the job number, since it can't maintain a connection and give you that information, so there would be no direct way to reference the job without coming up with another unique identifier to go by.
A simpler approach in the short term, which has already been requested, is the ability to resubmit a failed job. This covers more situations than the one described here.
Thanks, Paul Larson
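To make the identifier problem concrete, here is a minimal sketch of a spool that hands back its own token at accept time and maps it to the real job number only once the job has been forwarded. Everything here is an assumption for illustration: `Spool`, `accept`, `flush`, and `job_id` are hypothetical names, the `submit` callable is assumed to return a job number and raise an `OSError` subclass when the lab is down, and state is kept in memory only.

```python
import uuid
from typing import Callable, Dict, Optional


class Spool:
    """Holds jobs while the lab is down and tracks them by a spool token."""

    def __init__(self, submit: Callable[[dict], int]) -> None:
        self._submit = submit                 # hypothetical call into the scheduler
        self._held: Dict[str, dict] = {}      # token -> job not yet forwarded
        self._job_ids: Dict[str, int] = {}    # token -> real job number

    def accept(self, job: dict) -> str:
        # No job number exists yet, so return a unique token instead.
        token = uuid.uuid4().hex
        self._held[token] = job
        return token

    def flush(self) -> None:
        # Try to forward every held job; failures stay held, so the next
        # flush doubles as a resubmission of the failed job.
        for token in list(self._held):
            try:
                self._job_ids[token] = self._submit(self._held[token])
            except OSError:
                continue
            del self._held[token]

    def job_id(self, token: str) -> Optional[int]:
        # None means the job has not reached the scheduler yet.
        return self._job_ids.get(token)
```

A caller would keep the token from `accept()` and poll `job_id(token)` until it returns a number.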