On Mon, 3 Jun 2013 12:57:43 +0100 James Tunnicliffe james.tunnicliffe@linaro.org wrote:
[]
I know some of our ARM slaves are a bit CPU-light, but they also tend to have slow network connections. I am sure a bit of experimentation will tell us whether we should always move files off some slaves to an intermediary to do the hash+upload stuff.
Well, I'd personally aim to make client-side publishing support clean and lean, so it is easy to set up and run on any client, including not too powerful ones. Of course, some cases may need an intermediary (like when we need to publish [big] files from a non-networked board (hmm)), but those are niche cases.
Do we want to authenticate this sort of call? It should just be a dictionary or DB lookup, so authentication would probably cost more CPU time than the lookup itself. That said, it could be used to fish for files that already exist but that you don't have access to, so perhaps we need to filter the results per user...
My idea is that all publishing API calls are authed by that "security token". It is by definition limited-use: it can have source IP constraints, timing constraints (usable no earlier than 30 min after issuance and no later than 60 min), and other constraints, like: may be used for no more than 50 API calls, may publish no more than 10 files, etc.
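Such token constraints are easy to express as data plus a check. Below is a minimal Python sketch; the field names and limits are made up for the example (nothing here is an agreed schema), but it shows how cheap enforcing all of them per call would be:

```python
import ipaddress
import time

# Hypothetical token record; field names and values are illustrative only.
TOKEN = {
    "source_cidr": "10.0.0.0/8",  # only accept calls from this network
    "not_before": 1370260000,     # epoch seconds: token usable from here...
    "not_after": 1370263600,      # ...until here (a 60 min window)
    "max_calls": 50,              # at most 50 API calls
    "max_files": 10,              # may publish at most 10 files
    "calls_used": 0,
    "files_used": 0,
}

def token_allows(token, source_ip, now=None):
    """Return True only if a call from source_ip at time `now`
    satisfies every constraint attached to the token."""
    now = time.time() if now is None else now
    if not token["not_before"] <= now <= token["not_after"]:
        return False
    if ipaddress.ip_address(source_ip) not in ipaddress.ip_network(token["source_cidr"]):
        return False
    if token["calls_used"] >= token["max_calls"]:
        return False
    if token["files_used"] >= token["max_files"]:
        return False
    return True
```

The checks are a handful of comparisons, so the auth cost per call stays negligible next to the upload itself.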
OK, it occurs to me that I may not have broadcast my use case, which is: a server gets a token from the publishing service, then passes it to a slave. The slave uses it, and once the job has finished, the server should be able to inform the publishing service that the token is no longer required.
Per security practices, that's a worse solution than specifying constraints upfront. What if the server "forgets" to terminate the token's life? What if the server is DoSed, so it's unable to terminate a token which is meanwhile being used to do bad things? Otherwise, such a usecase is doable, of course.
I don't actually care if this is a public IP facing service.
My primary usecase is EC2 build slaves, for which the service is essentially public (well, we can allow it only for 10.* IPs, but then it's still open to all EC2 instances, including foreign ones).
I have already decided to create a proxy publishing service in the LAVA lab so slaves can share files between themselves internally, some of which can be tagged for upload. I can just run SSH publishing from the proxy.
That just follows the cumbersome design we currently have for Jenkins publishers. The more steps, the harder they are to get right, and then to maintain in the right state.
This can clearly be part of linaro-license-protection / some shared publishing service project, or a separate project. It is going to have quite a large overlap with any other publishing project though...
Yes, so if it's clearly a publishing service, then I'd encourage you to work on a design which solves all currently known publishing requirements and is flexible and generic enough to accommodate future ones (and those should basically be reducible to what we discussed: support sufficiently efficient publishing of arbitrary file sets, lean on the client side).
Actually, I specifically brought this question up to avoid the situation where other engineers go for "spur of the moment" adhoc implementations, and we end up with a bunch of crippled, insecure, hard-to-maintain publishing implementations (the current Jenkins one already has enough holes and enough pain to set up/debug).
If we only issue and use the tokens over HTTPS, I think the best practice is to not restrict use of the service beyond how long the token is issued for.
Well, the constraints above were just an example of what we can easily implement with an HTTP-based system (and not so easily with a PAM-based one). Of course, the idea is that token constraints are flexible: the scheduling server decides how many constraints to request in a token for a particular publishing client. I agree that the basic constraints to start with would be: source IP (important for EC2, maybe less important for LAVA) and max lifetime.
We can easily generate stats and set up alerts that point to potential abuse, but I would rather not have a job fail at the publishing step because they want to upload a large list of files.
api/add_link?type=md5&hash=1234abcd&<same as upload API from here>
It would be very easy to integrate this with most jobs, since it can all be done with CLI tools that are probably in the default install. Maybe not for Android/busybox - haven't looked. It would solve the OE problem quite easily and be flexible enough to allow other jobs to do the same.
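The hash-then-link flow really is just a few lines around standard tools. A rough Python sketch follows; the three callables are hypothetical stand-ins for the HTTP calls (hash query, api/add_link, plain upload), since those endpoints aren't specified yet:

```python
import hashlib

def md5_hex(data):
    """Hash an in-memory byte string (use file_md5 for big artifacts)."""
    return hashlib.md5(data).hexdigest()

def file_md5(path, chunk_size=1 << 20):
    """Stream-hash a file so large artifacts never need to fit in RAM."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def publish(digest, server_has_file, add_link, upload):
    """Cheap path first: if the server already has content with this
    hash, just ask it to link the content into place; otherwise fall
    back to a full upload."""
    if server_has_file(digest):
        add_link(digest)
        return "linked"
    upload()
    return "uploaded"
```

The equivalent on a slave with only a default install would be `md5sum` plus `curl`, which is why busybox coverage is the main open question.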
For file updates or creating a new file based on a diff, as long as both endpoints have the old file you can just create and send a binary diff. That would be simple HTTP[S] stuff as well.
We could do such optimizations, sure. Later, if needed ;-).
If the server has the old file and the client only has the new one, you are into interactive-protocol territory; it would probably be easier to work out how to get rsync to behave nicely than to invent something new. Probably involving a temporary SSH login that you can get over the HTTPS API, where the account login (password or key) is changed/deleted as the first login happens so it can't be re-used?
What about parallel publishers? What pool size of such manually-maintained accounts will be needed? What about race conditions and overall stability?
That way you don't need to mess with LDAP, PAM, Kerberos etc? I dunno, that was the first thing that came to mind!
Well, that feels fragile. And mixing up an HTTP service and a native service approach doesn't seem to be a good idea either. The whole idea of an HTTP service was to skip dealing with system-level (and thus "risky", both in terms of overall system stability and security) services. And we don't really have a usecase for full rsync behavior ("only changed *parts* of a file are uploaded") - we usually use tarballs with a gazillion of not-too-big files. So, even if only one file changed for a *build*, the tar will have lots of mtime changes in file headers interspersed with file contents, and *zip will smear that around, so a diff-style algorithm won't have much benefit.
On the other hand, Neil sent an email saying there are similar challenges for a multi-node LAVA setup. I haven't read through it yet, but my guess is that for (arbitrary) LAVA tests we'd rather use (and let our users use) standard tech like ssh/scp/rsync for inter-node communication; then we'd need "PAM"-level auth anyway, and it makes little sense to have a separate auth scheme just for publishing.
All I was saying was that writing an interactive protocol to update a file instead of sending a diff is a lot of effort and that we should at least think about (and probably prototype) something using rsync if we want to go down that route. I probably shouldn't have said more than that! I personally want to avoid that whole area since rsync + ssh has the same security headaches as we are currently living with and writing our own replacement seems bonkers.
I am glad to hear you don't think full-on rsync is useful :-)
Ok, so that's largely development/implementation details. I should say that I don't personally plan to work on this (unless assigned, of course, or unless it comes up on the critical path). That's why I handed it over to Tyler to spec out better and plan implementation. I'm glad it lies on your path, and I encourage you to pick up this task, as I still have a bunch of unfinished ones in my queue. In that regard, I'm just ping-ponging some requirements and ideas I had.
As for uploading tarballs, that sounds a lot like a case for the build not to create the tarball :-)
Well, it's not us who decide what our builds produce, and even assuming that changing that is justified, it's quite a big scope creep ;-).
I already have a protocol designed to do the file update thing (server has old file, client has new one but not old one) for a pet project that I was going to open source, but it is just a bit of fun and hasn't been tested, so even having got that far I would still tell other people to use rsync.
publish --token=<token> --type=<build_type> --strip=<strip> <build_id> <glob_pattern>...
This seems like a reasonable starting point. Let's make sure that it uses a configuration file to specify what to do with those build types etc. Preferably one that it can update from a public location, so we don't have to re-spin the tool to add a new build type (though I guess we normally check it out of VCS as we go, so that works too).
Well, on the client side, it's ideally just a single file which handles the obvious filtering options (like <glob_pattern> or --strip=) locally and passes the rest to the API/service. The server side can handle the options any way it wants; note that the options above don't require much "configuration" - for example, --type= just maps to a top-level download dir.
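As an illustration of how little the client would need to do, a sketch of the local filtering step in Python (the option semantics are my assumption from the command line above; --strip= is modeled on tar's --strip-components):

```python
import glob

def strip_components(path, strip):
    """Drop `strip` leading directory components from a path, keeping
    at least the file name, like tar's --strip-components."""
    parts = [p for p in path.split("/") if p]
    kept = parts[strip:] if strip < len(parts) else parts[-1:]
    return "/".join(kept)

def expand_files(patterns, strip=0):
    """Resolve <glob_pattern> arguments locally and pair each match
    with the destination path the server should use. Everything else
    (token, build type, build id) is simply forwarded to the API."""
    return [(path, strip_components(path, strip))
            for pattern in patterns
            for path in sorted(glob.glob(pattern))]
```

Everything beyond this stays server-side, so the client really can stay a single small file.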
Except that, well, we already have adhoc behavioral idiosyncrasies, like Android build flattening happening on the server. You can hardly "configure" that (though lambdas in YAML sound cool (for debugging :-D)). A better approach would be to move that stuff to the client side and have simple, well-defined publishing semantics.
Indeed! I don't know why that is done on server at the moment and I assumed that the publish script would just be a wrapper around curl with the destination path modified appropriately (or similar, you know me, I would write the whole thing in Python). I am sure there are good reasons for the current logic, but I don't know what they are :-)
James
On 29 May 2013 16:57, James Tunnicliffe james.tunnicliffe@linaro.org wrote:
Hi Paul,
Thanks for this. I need a mechanism for publishing from CI runtime jobs, so this is important to me. I did look into using SSH/SFTP, and while it is simple to do in a reasonably insecure way, it would be much better to have an HTTP[S]-based solution that uses temporary authentication tokens.
I was looking at this today because I woke up early worrying about it. Clearly I need more interesting stuff to think about! (and now, more sleep).
Anyway, it should be possible to do in Django:
https://docs.djangoproject.com/en/dev/topics/http/file-uploads/
https://pypi.python.org/pypi/django-transfer/0.2-2
http://wiki.nginx.org/HttpUploadModule
My own notes are more focused on an internal server that would upload files to releases/snapshots but could retain files until disk space was needed, acting as a cache. I was going to look at extending linaro-license-protection for this so there was no way to use the cache to avoid licenses. I was also going to have completely private files that you could only access if you had the authentication token that a job was given.
https://docs.google.com/a/linaro.org/document/d/1_ewb-xFDJc8Adk7AijV95XthGMv...
Feel free to add comments or insert more information and thoughts.
Note that for high performance uploads we probably want to hand off the upload to a web server. That django-transfer module doesn't support any Apache related upload stuff, which may mean that it doesn't exist. Moving to an nginx based solution would be easy enough if we needed to (we could replace the mod-xsendfile with the equivalent nginx call).
I think a prototype branch is ~1 day of engineering effort (no nginx, token based upload call in place, probably some kind of token request call, probably limited security, no proxy-publishing). Adding the rest of the features, testing etc probably takes it to more like 1 week.
James
On 29 May 2013 16:26, Paul Sokolovsky paul.sokolovsky@linaro.org wrote:
>
> Begin forwarded message:
>
> Date: Wed, 29 May 2013 17:19:31 +0300
> From: Paul Sokolovsky Paul.Sokolovsky@linaro.org
> To: Tyler Baker tyler.baker@linaro.org, Alan Bennett alan.bennett@linaro.org
> Cc: Senthil Kumaran senthil.kumaran@linaro.org, Fathi Boudra fathi.boudra@linaro.org
> Subject: Re: New publishing infra prototype report
>
> Hello Tyler,
>
> As brought up today on IRC, it's a month since the report below and the
> proposal for further steps, and I don't remember any reply. This whole
> publishing thing is peculiar: it sits still while it's not needed, but
> it lies in ambush there, ready to cause havoc at any time.
>
> For example, today Senthil came up with the question of how to publish
> to snapshots. Fortunately, it turned out to be a request for publishing
> a single file manually. But I know the guys are working on Fedora
> builds, and that will definitely need automated publishing (unless the
> initial requirements as provided by Fathi changed). And it's definitely
> needed for the CBuild migration, which I assume will be worked on next
> month.
>
> Btw, I discovered that a BP for this was already submitted by Danilo:
> https://blueprints.launchpad.net/linaro-infrastructure-misc/+spec/file-publi...
>
> Thanks,
> Paul
>
> On Mon, 29 Apr 2013 18:58:39 +0300 Paul Sokolovsky Paul.Sokolovsky@linaro.org wrote:
>
>> Hello,
>>
>> Last month I worked on a blueprint
>> https://blueprints.launchpad.net/linaro-android-infrastructure/+spec/prototy...
>> to prototype an implementation of a publishing framework which
>> wouldn't depend on particular Jenkins features (and misfeatures) and
>> could be reused for other services across the Linaro CI
>> infrastructure. Among these other projects are:
>>
>> 1. OpenEmbedded builds - efficient ("fresh only") publishing of
>> source tarballs and cache files.
>> 2. CBuild - publishing of toolchain build artifacts and logs.
>> 3. Fedora/Lava - publishing of build artifacts and logs.
>>
>> So, the good news is that it was possible to implement a publishing
>> system whose interface is a single script which hides all the
>> publishing complexity underneath. The implementation was cumbersome,
>> because the existing publishing backend was reused, but it already
>> opens the possibility for better logging, debugging, profiling, etc.
>>
>> With a proof-of-concept client side available, the main complexity
>> still lies in the server-side backend. It's clear that the current
>> "SFTP + SSH trigger script" approach doesn't scale well in terms of
>> ease of setup and security. I added my considerations on that topic
>> in the "Conclusions and Future Work" section of
>> http://bazaar.launchpad.net/~linaro-automation/linaro-android-build-tools/tr...
>>
>> So, the action items I suggest based on this report:
>>
>> 1. Tyler to consult with Fathi (Fedora), Marcin (OE) and me (CBuild)
>> and prepare an architecture/spec for the general publishing system.
>> It would be nice to BP this task to start in 13.05.
>> 2. Depending on the time required to prepare the spec, implementation
>> can be scheduled right away, or postponed until LCE13, so we have
>> another chance to discuss it face to face (as an adhoc meeting, or as
>> a session, if it's really worth it).
>>
>> Thanks,
>> Paul
>>
>> Linaro.org | Open source software for ARM SoCs
>> Follow Linaro: http://www.facebook.com/pages/Linaro
>> http://twitter.com/#%21/linaroorg - http://www.linaro.org/linaro-blog
>
> --
> Best Regards,
> Paul
-- James Tunnicliffe
-- Best Regards, Paul