Hey folks,
I was checking today what we could do to optimize the job runs at LAVA, and noticed that the time it takes to download and cache the tarball is still the biggest bottleneck we have.
Checking job http://validation.linaro.org/lava-server/scheduler/job/23135, you can see at the top of the serial log that it took almost one hour just to download and cache the tarball. As I believe this could also be related to the connection we have available at the lab, and as we're caching the tarballs already, wouldn't it be possible to use zsync or a similar tool to speed up the download?
For the pre-built images we're already generating the zsync meta-data file together with the image itself, so it would be nice to check if that would really make any difference from the lab perspective.
Thanks,
Ricardo Salveti ricardo.salveti@linaro.org writes:
> Hey folks,
> I was checking today what we could do to optimize the job runs at LAVA, and noticed that the time it takes to download and cache the tarball is still the biggest bottleneck we have.
> Checking job http://validation.linaro.org/lava-server/scheduler/job/23135, you can see at the top of the serial log that it took almost one hour just to download and cache the tarball.
Ugh.
> As I believe this could also be related to the connection we have available at the lab, and as we're caching the tarballs already,
Heh well. Two points:
(a) we only cache the boot and rootfs tarballs currently. I doubt zsync will help there -- surely we'd need to keep the image around for that?
(b) we really shouldn't cache them for any jobs but health jobs :)
> wouldn't it be possible to use zsync or a similar tool to speed up the download?
> For the pre-built images we're already generating the zsync meta-data file together with the image itself, so it would be nice to check if that would really make any difference from the lab perspective.
That said, I'm sure we could use zsync to improve download times in the dispatcher. I don't know much about it though -- do you have to explicitly point zsync at the version of the file you've already downloaded? We _could_ probably keep the latest version of each "kind" of image we downloaded (somehow... imagine me waving my hands furiously at this point). What would be utterly awesome of course would be some kind of automagic thing for squid that used zsync without us knowing about it :)
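To make the hand-waving slightly more concrete, here is a rough sketch of how the dispatcher might pick a seed and call zsync. It relies on the zsync client's -i (seed input file) and -o (output file) options. Everything else is an assumption for illustration: the cache directory layout, the idea that the "kind" of an image is just a filename prefix, and the helper names newest_seed and zsync_command, none of which exist in lava-dispatcher.

```python
import os
from typing import List, Optional


def newest_seed(cache_dir: str, kind: str) -> Optional[str]:
    """Return the most recently modified cached file whose name starts with `kind`.

    Assumption: images of the same "kind" share a filename prefix.
    """
    candidates = [
        os.path.join(cache_dir, name)
        for name in os.listdir(cache_dir)
        if name.startswith(kind)
    ]
    if not candidates:
        return None
    return max(candidates, key=os.path.getmtime)


def zsync_command(zsync_url: str, output: str, seed: Optional[str]) -> List[str]:
    """Build the zsync argument list; the seed is optional.

    Without a seed, zsync has no local blocks to reuse and fetches everything.
    """
    cmd = ["zsync", "-o", output]
    if seed is not None:
        cmd += ["-i", seed]  # -i names a local file to scan for reusable blocks
    cmd.append(zsync_url)
    return cmd
```

The resulting list could be handed to subprocess.check_call; the sketch stops short of running zsync so that it stays independent of the lab setup.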
Cheers, mwh
On Sun, Jun 24, 2012 at 6:31 PM, Michael Hudson-Doyle michael.hudson@linaro.org wrote:
> Ricardo Salveti ricardo.salveti@linaro.org writes:
>> As I believe this could also be related to the connection we have available at the lab, and as we're caching the tarballs already,
> Heh well. Two points:
> (a) we only cache the boot and rootfs tarballs currently. I doubt zsync will help there -- surely we'd need to keep the image around for that?
I believe so, at least it'd be easier for zsync to understand.
>> wouldn't it be possible to use zsync or a similar tool to speed up the download?
>> For the pre-built images we're already generating the zsync meta-data file together with the image itself, so it would be nice to check if that would really make any difference from the lab perspective.
> That said, I'm sure we could use zsync to improve download times in the dispatcher. I don't know much about it though -- do you have to explicitly point zsync at the version of the file you've already downloaded? We _could_ probably keep the latest version of each "kind" of image we downloaded (somehow... imagine me waving my hands furiously at this point).
Yeah, that's what I was thinking about actually. If we have resources, we could store and cache the pre-built images used at LAVA for at least a few days (3 or 4). That would already be enough at least for the daily images to be way faster.
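A minimal sketch of that retention idea, assuming the cached images sit as plain files in one directory and that file modification time is a good enough proxy for age. The function name prune_image_cache and the 4-day default are made up for illustration and are not part of the dispatcher.

```python
import os
import time
from typing import List, Optional


def prune_image_cache(cache_dir: str, max_age_days: float = 4,
                      now: Optional[float] = None) -> List[str]:
    """Delete cached files older than `max_age_days`; return the removed names.

    `now` can be passed explicitly (seconds since the epoch) to make the
    behaviour reproducible; it defaults to the current time.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    removed = []
    for name in sorted(os.listdir(cache_dir)):
        path = os.path.join(cache_dir, name)
        if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
            os.remove(path)
            removed.append(name)
    return removed
```

Something like this could run from cron or at the start of each job, so that the newest daily image is always still around to seed the next download.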
> What would be utterly awesome of course would be some kind of automagic thing for squid that used zsync without us knowing about it :)
Sure, but I don't yet know any solution that would be smart enough to do what we need/want :-)
Thanks,
Ricardo Salveti ricardo.salveti@linaro.org writes:
> On Sun, Jun 24, 2012 at 6:31 PM, Michael Hudson-Doyle michael.hudson@linaro.org wrote:
>> Ricardo Salveti ricardo.salveti@linaro.org writes:
>>> As I believe this could also be related to the connection we have available at the lab, and as we're caching the tarballs already,
>> Heh well. Two points:
>> (a) we only cache the boot and rootfs tarballs currently. I doubt zsync will help there -- surely we'd need to keep the image around for that?
> I believe so, at least it'd be easier for zsync to understand.
>>> wouldn't it be possible to use zsync or a similar tool to speed up the download?
>>> For the pre-built images we're already generating the zsync meta-data file together with the image itself, so it would be nice to check if that would really make any difference from the lab perspective.
>> That said, I'm sure we could use zsync to improve download times in the dispatcher. I don't know much about it though -- do you have to explicitly point zsync at the version of the file you've already downloaded? We _could_ probably keep the latest version of each "kind" of image we downloaded (somehow... imagine me waving my hands furiously at this point).
> Yeah, that's what I was thinking about actually. If we have resources, we could store and cache the pre-built images used at LAVA for at least a few days (3 or 4). That would already be enough at least for the daily images to be way faster.
OK. I filed https://bugs.launchpad.net/lava-dispatcher/+bug/1017791 so as not to forget about this completely. No promises on when it can be implemented...
>> What would be utterly awesome of course would be some kind of automagic thing for squid that used zsync without us knowing about it :)
> Sure, but I don't yet know any solution that would be smart enough to do what we need/want :-)
Shame!
Cheers, mwh
linaro-validation@lists.linaro.org