On Thu, Feb 23, 2012 at 12:36:02PM +1300, Michael Hudson-Doyle wrote:
On Wed, 22 Feb 2012 10:10:05 +0000, Zygmunt Krynicki <zygmunt.krynicki@linaro.org> wrote:
On Wed, Feb 22, 2012 at 02:21:57PM +1300, Michael Hudson-Doyle wrote:
Hi all,
The LAVA team is working on support for private jobs -- we already have some support for private results, but if the log of the job that produced the results is publicly visible, this isn't much privacy.
The model for result security is that a set of results can be:
- anonymous (anyone can see, anyone can write)
- public (anyone can see, only owning user or group can write)
- private (only owning user or group can see or write)
Each non-anonymous set of results is owned by a group or user. I think this model is sufficiently flexible -- the only gap I can see is that it's not possible to have a stream where a subset of the people who can see it can submit results to it.
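To make the three cases concrete, here is a minimal sketch of the rules as described above. The names `Stream`, `can_see` and `can_write` are made up for illustration and are not the dashboard's actual API; ownership is assumed to be either one user or one group.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class Stream:
    # Hypothetical model: kind is "anonymous", "public" or "private".
    kind: str
    owner_user: Optional[str] = None
    owner_group: Optional[str] = None


def _is_owner(stream: Stream, user: Optional[str], groups: FrozenSet[str]) -> bool:
    """True if the user owns the stream directly or via group membership."""
    if user is None:
        return False
    if stream.owner_user is not None:
        return user == stream.owner_user
    return stream.owner_group in groups


def can_see(stream: Stream, user: Optional[str] = None,
            groups: FrozenSet[str] = frozenset()) -> bool:
    # Anonymous and public streams are visible to everyone.
    if stream.kind in ("anonymous", "public"):
        return True
    return _is_owner(stream, user, groups)


def can_write(stream: Stream, user: Optional[str] = None,
              groups: FrozenSet[str] = frozenset()) -> bool:
    # Only anonymous streams accept writes from everyone.
    if stream.kind == "anonymous":
        return True
    return _is_owner(stream, user, groups)
```

The gap mentioned above shows up directly: `can_see` and `can_write` share one ownership check, so there is no way to let only a subset of the readers of a private stream write to it.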
We may, one day, want to implement real permissions but for the moment I think the security model we have is sufficient.
'real permissions'?
Fine-grained permissions. "can_see", "can_read", "can_write", "can_remove", etc. + stacking, resolving deny+allow rules. All the good ACL stuff.
A bigger issue is the abuse of anonymous streams. I'd like to abolish them over the next few months. If anything, they were a workaround for the lack of OAuth support in early versions of the dashboard (something that has since proven a failure for our use case). We should IMO move everyone to non-anonymous streams and reserve anonymous streams for mass-filing of profiling information from end-users, something that we have yet to see being used.
Yeah. This should be easy to manage now, but I'm not sure how to arrange the changeover without getting every user to change their job descriptions all at once. Maybe we could say an authenticated request to put a bundle into /anonymous/foo just goes into /public/personal/$user/foo by magic.
We may do something special on v.l.o but in general anonymous streams should be ... anonymous ;)
Clearly it makes sense to have the set of people who can see the eventual results and see the job output be the same. Currently the former group is encoded in the stream name of the submit_results action, for example:
{ "command": "submit_results", "parameters": { "server": "http://locallava/RPC2/", "stream": "/private/personal/mwhudson/test/" } }
would place results in a stream called 'test' that only I can see or write to, whereas
"stream": "/public/team/linaro/kernel/"
identifies a stream that anyone can see but only members of the linaro group can put results in.
The scheduler *could* read this parameter out of the job JSON and enforce the privacy rules based on it, but that seems a bit fragile somehow. I think a top-level attribute in the JSON describing who can see the job would make sense -- we can then make sure the stream name on the submit_results action matches this.
Does the /{public,private}/{personal,team}/{team-or-user-name} syntax make sense to people? I think it's reasonably clear and nicely terse.
You've missed the /{any-other-name,} at the end (a single person can have any number of streams).
Right but the name of the stream is not part of the "who can see it" stuff.
Sure but it affects parsing
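For reference, a rough sketch of parsing the current pathname syntax, including the optional trailing stream name. This is an illustration only, not the dashboard's real parser: the `StreamPath` type and `parse_stream` function are hypothetical, and a trailing slash is assumed to be required.

```python
import re
from typing import NamedTuple, Optional


class StreamPath(NamedTuple):
    is_public: bool
    is_anonymous: bool
    owner_kind: Optional[str]  # "personal" or "team", None if anonymous
    owner: Optional[str]
    name: str                  # "" when no stream name is given


# /{public,private}/{personal,team}/{team-or-user-name}/{any-other-name,}/
_OWNED = re.compile(r"^/(public|private)/(personal|team)/([^/]+)/(?:([^/]+)/)?$")
# /anonymous/{name,}/
_ANONYMOUS = re.compile(r"^/anonymous/(?:([^/]+)/)?$")


def parse_stream(pathname: str) -> StreamPath:
    """Split a stream pathname into its privacy, owner and name parts."""
    match = _OWNED.match(pathname)
    if match:
        privacy, owner_kind, owner, name = match.groups()
        return StreamPath(privacy == "public", False, owner_kind, owner, name or "")
    match = _ANONYMOUS.match(pathname)
    if match:
        return StreamPath(True, True, None, None, match.group(1) or "")
    raise ValueError("not a valid stream pathname: %r" % (pathname,))
```

Only the first three components say who can see the stream; the name has to be parsed but plays no part in the access decision.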
Despite being the author I always forget if the privacy flag comes before the owner classification. The words "personal", "private" and "public" are easy to confuse. I was thinking that perhaps we should one day migrate towards something else. The stuff below is my random proposal:
~{team-or-person}/{private,}/{name,}
We should do as much validation at submit time as we can (rejecting jobs that submit to streams that do not exist, for example).
That will break the scheduler / dashboard separation model. You must also remember that scheduler and dashboard can use separate databases so you cannot reason about remote (dashboard) users without an explicit interface (that we don't have).
Well yes. I don't know how much of a benefit that separation is really -- some level of separation so that results can be submitted to a dashboard by a developer running tests on her desk is useful, but I don't know to what extent having the scheduler be able to send results to an entirely different dashboard is.
NONE :-)
Let's get rid of it; the only question is what this should look like.
On a side note. I think that the very first thing we should do is migrate Job to be a RestrictedResource. Then we can simply allow users to submit --private jobs, or delegate ownership to a --team they are a member of. This will immediately unlock a lot of testing that currently cannot happen (toolchain tests with restricted benchmarks).
Yep. That's on the list.
When that works we can see how we can bring both extensions closer so that users have a better experience. In my opinion that is to clearly define that scheduler _must_ be in the same database as the dashboard and to discard the full URL in favour of stream name. Less confusion, all validation possible, no real use cases lost (exactly who is using a private dispatcher to schedule tests to a public dashboard, or vice versa?)
Yeah, I agree. The question I was trying (badly) to ask is twofold:
- what do we want users to write in their job file?
Writing job files manually is also part of the problem but I understand what you are asking about. IMHO we should specify job security at submission time via a new API. It should not be a part of the document being sent. Perhaps we could combine that with setting up the destination stream. Then we would have a really clean user interface, 100% validated input and 100% reusable jobs. The only case that would be lost is the dispatcher being able to send stuff to the dashboard. In that case I think it should follow lava-test, i.e. lose the feature and restrict itself to making good bundles (that something else can send).
- (less important) how do we handle the transition from what we have now to the answer to 1)?
Introduce the new interface, monitor the old interface, beat people with a stick if they keep using the old interface, start rejecting the old interface in 2012.04.
Thanks ZK
PS: CC-ing to linaro-validation
On Thu, 23 Feb 2012 00:37:31 +0000, Zygmunt Krynicki <zygmunt.krynicki@linaro.org> wrote:
On Thu, Feb 23, 2012 at 12:36:02PM +1300, Michael Hudson-Doyle wrote:
On Wed, 22 Feb 2012 10:10:05 +0000, Zygmunt Krynicki <zygmunt.krynicki@linaro.org> wrote:
On Wed, Feb 22, 2012 at 02:21:57PM +1300, Michael Hudson-Doyle wrote:
Hi all,
The LAVA team is working on support for private jobs -- we already have some support for private results, but if the log of the job that produced the results is publicly visible, this isn't much privacy.
The model for result security is that a set of results can be:
- anonymous (anyone can see, anyone can write)
- public (anyone can see, only owning user or group can write)
- private (only owning user or group can see or write)
Each non-anonymous set of results is owned by a group or user. I think this model is sufficiently flexible -- the only gap I can see is that it's not possible to have a stream where a subset of the people who can see it can submit results to it.
We may, one day, want to implement real permissions but for the moment I think the security model we have is sufficient.
'real permissions'?
Fine-grained permissions. "can_see", "can_read", "can_write", "can_remove", etc. + stacking, resolving deny+allow rules. All the good ACL stuff.
Yeah. Let's avoid needing that for as long as possible.
A bigger issue is the abuse of anonymous streams. I'd like to abolish them over the next few months. If anything, they were a workaround for the lack of OAuth support in early versions of the dashboard (something that has since proven a failure for our use case). We should IMO move everyone to non-anonymous streams and reserve anonymous streams for mass-filing of profiling information from end-users, something that we have yet to see being used.
Yeah. This should be easy to manage now, but I'm not sure how to arrange the changeover without getting every user to change their job descriptions all at once. Maybe we could say an authenticated request to put a bundle into /anonymous/foo just goes into /public/personal/$user/foo by magic.
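The "by magic" redirect could be as small as the sketch below. The function name `effective_stream` is made up, and it assumes the caller has already authenticated the request and passes the username (or None for an unauthenticated request).

```python
from typing import Optional

ANONYMOUS_PREFIX = "/anonymous/"


def effective_stream(pathname: str, user: Optional[str] = None) -> str:
    """Map /anonymous/<rest> to /public/personal/<user>/<rest> when authenticated.

    Unauthenticated requests and non-anonymous pathnames pass through unchanged.
    """
    if user is not None and pathname.startswith(ANONYMOUS_PREFIX):
        rest = pathname[len(ANONYMOUS_PREFIX):]
        return "/public/personal/%s/%s" % (user, rest)
    return pathname
```

Note that this only rewrites the pathname; whether the target stream already exists or gets created on first use is a separate decision.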
We may do something special on v.l.o but in general anonymous streams should be ... anonymous ;)
Heh.
Clearly it makes sense to have the set of people who can see the eventual results and see the job output be the same. Currently the former group is encoded in the stream name of the submit_results action, for example:
{ "command": "submit_results", "parameters": { "server": "http://locallava/RPC2/", "stream": "/private/personal/mwhudson/test/" } }
would place results in a stream called 'test' that only I can see or write to, whereas
"stream": "/public/team/linaro/kernel/"
identifies a stream that anyone can see but only members of the linaro group can put results in.
The scheduler *could* read this parameter out of the job JSON and enforce the privacy rules based on it, but that seems a bit fragile somehow. I think a top-level attribute in the JSON describing who can see the job would make sense -- we can then make sure the stream name on the submit_results action matches this.
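A sketch of what that consistency check might look like at submit time. Both the top-level key name `"visibility"` and the `"actions"` list holding the submit_results action are assumptions made for this example; nothing above fixes either name.

```python
from typing import Any, Dict, List


def check_job_visibility(job: Dict[str, Any]) -> List[str]:
    """Return one error message per submit_results stream that does not
    sit under the job's declared visibility prefix."""
    # Hypothetical top-level attribute, e.g. "/private/personal/mwhudson/".
    visibility = job["visibility"]
    prefix = visibility if visibility.endswith("/") else visibility + "/"

    errors = []
    for action in job.get("actions", []):
        if action.get("command") != "submit_results":
            continue
        stream = action.get("parameters", {}).get("stream", "")
        if not stream.startswith(prefix):
            errors.append(
                "stream %r is not under job visibility %r" % (stream, prefix)
            )
    return errors


job = {
    "visibility": "/private/personal/mwhudson/",
    "actions": [
        {
            "command": "submit_results",
            "parameters": {
                "server": "http://locallava/RPC2/",
                "stream": "/private/personal/mwhudson/test/",
            },
        }
    ],
}
```

With the job above the check returns an empty list; pointing the same action at "/public/team/linaro/kernel/" would produce one error.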
Does the /{public,private}/{personal,team}/{team-or-user-name} syntax make sense to people? I think it's reasonably clear and nicely terse.
You've missed the /{any-other-name,} at the end (a single person can have any number of streams).
Right but the name of the stream is not part of the "who can see it" stuff.
Sure but it affects parsing
Despite being the author I always forget if the privacy flag comes before the owner classification. The words "personal", "private" and "public" are easy to confuse. I was thinking that perhaps we should one day migrate towards something else. The stuff below is my random proposal:
~{team-or-person}/{private,}/{name,}
I meant to say here that this is a bit ambiguous: groups and users have separate namespaces in Django AFAIK. But if we do all auth via Launchpad this isn't a real problem.
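To illustrate, here is one possible reading of the proposed syntax, treating both the "private" segment and the name as optional. The function name `parse_tilde_stream` is hypothetical and the interpretation of the braces is a guess, not something settled in this thread.

```python
from typing import Tuple


def parse_tilde_stream(pathname: str) -> Tuple[str, bool, str]:
    """Parse ~{team-or-person}/{private,}/{name,} into (owner, is_private, name)."""
    if not pathname.startswith("~"):
        raise ValueError("expected a leading '~': %r" % (pathname,))
    # Empty segments (from doubled or trailing slashes) are ignored.
    parts = [part for part in pathname[1:].split("/") if part]
    if not parts:
        raise ValueError("missing owner: %r" % (pathname,))

    owner, rest = parts[0], parts[1:]
    is_private = bool(rest) and rest[0] == "private"
    if is_private:
        rest = rest[1:]
    if len(rest) > 1:
        raise ValueError("too many components: %r" % (pathname,))
    name = rest[0] if rest else ""
    return owner, is_private, name
```

The result carries no user-versus-team marker, which is the ambiguity mentioned above, and under this reading a public stream cannot itself be named "private".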
We should do as much validation at submit time as we can (rejecting jobs that submit to streams that do not exist, for example).
That will break the scheduler / dashboard separation model. You must also remember that scheduler and dashboard can use separate databases so you cannot reason about remote (dashboard) users without an explicit interface (that we don't have).
Well yes. I don't know how much of a benefit that separation is really -- some level of separation so that results can be submitted to a dashboard by a developer running tests on her desk is useful, but I don't know to what extent having the scheduler be able to send results to an entirely different dashboard is.
NONE :-)
Let's get rid of it; the only question is what this should look like.
On a side note. I think that the very first thing we should do is migrate Job to be a RestrictedResource. Then we can simply allow users to submit --private jobs, or delegate ownership to a --team they are a member of. This will immediately unlock a lot of testing that currently cannot happen (toolchain tests with restricted benchmarks).
Yep. That's on the list.
When that works we can see how we can bring both extensions closer so that users have a better experience. In my opinion that is to clearly define that scheduler _must_ be in the same database as the dashboard and to discard the full URL in favour of stream name. Less confusion, all validation possible, no real use cases lost (exactly who is using a private dispatcher to schedule tests to a public dashboard, or vice versa?)
Yeah, I agree. The question I was trying (badly) to ask is twofold:
- what do we want users to write in their job file?
Writing job files manually is also part of the problem but I understand what you are asking about. IMHO we should specify job security at submission time via a new API. It should not be a part of the document being sent.
Ah, yes, maybe that makes sense...
Perhaps we could combine that with setting up the destination stream. Then we would have a really clean user interface, 100% validated input and 100% reusable jobs. The only case that would be lost is the dispatcher being able to send stuff to the dashboard. In that case I think it should follow lava-test, i.e. lose the feature and restrict itself to making good bundles (that something else can send).
While I think that might be a good direction to head in, I'm not sure it's a requirement for this work to do the submission outside the dispatcher (we could instead insert/modify the submit_results action in the dispatcher job file).
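Roughly what I have in mind for the insert/modify option, as a sketch only. It assumes the job is a dict with an "actions" list of {"command", "parameters"} entries; the function name `set_result_stream` is made up.

```python
import copy
from typing import Any, Dict, Optional


def set_result_stream(job: Dict[str, Any], stream: str,
                      server: Optional[str] = None) -> Dict[str, Any]:
    """Return a copy of the job whose submit_results action targets `stream`.

    Existing submit_results actions are modified in the copy; if there is
    none, one is appended. The caller's job dict is left untouched.
    """
    job = copy.deepcopy(job)
    actions = job.setdefault("actions", [])

    found = False
    for action in actions:
        if action.get("command") == "submit_results":
            parameters = action.setdefault("parameters", {})
            parameters["stream"] = stream
            if server is not None:
                parameters["server"] = server
            found = True

    if not found:
        parameters = {"stream": stream}
        if server is not None:
            parameters["server"] = server
        actions.append({"command": "submit_results", "parameters": parameters})
    return job
```

The scheduler would call this with the stream chosen at submission time, so the job file the user wrote stays reusable.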
Cheers, mwh
linaro-validation@lists.linaro.org