Zygmunt Krynicki <zygmunt.krynicki@linaro.org> writes:
On 19.10.2012 01:36, Michael Hudson-Doyle wrote:
Incidentally that's something we may collaborate on.
Yeah, so how does checkbox deal with this? I guess it doesn't quite have the concept of remote users submitting requests that jobs be run? (i.e. checkbox is more dispatcher than scheduler in lava terminology).
We have largely the same problem but in a different context (there are different internal users).
Checkbox has the concept of "whitelists" which basically specify the test scenario. Each item in the whitelist is a "job" (full test definition) that can use various checkbox "plugins" (like shell, manual and many others that I'm not familiar with). Checkbox then transforms the whitelist (resolving dependencies and things like that) and executes the tests much like dispatcher would.
I see.
There are several use cases that are currently broken
Such as?
From what I recall, mostly in the way upstream/downstream (and sometimes side-stream) relationships work. The actual details are specific to Canonical (I would gladly explain them in a private channel if you wish to know more), but the general idea is that without some API stability (and we offer none today) and script stability (which you can think of as another level of API), our downstream users (which are NOT just consumers) have a hard time following our releases.
The second issue, which is more directly addressed here, is that it is hard for actual tests to flow from team to team, so to get "stability" people prefer to keep similar/identical tests to themselves (not as in secret, but as in not easily collaborated upon).
Ah. Now I understand your interest in this topic :-)
One of the proposals would be to build a pypi-like directory of tests and use that as a base for namespacing (first-come first-served name allocation). I'm not entirely sure this would help to solve the problem but it's something that, if available, could give us another vector.
Hm. This is definitely an interesting idea. I had actually already thought that using user specified distutils- or debian-style versioning would make sense -- you would get the latest version by the chosen algorithm by default, but could still upload revisions of old versions if you wanted to.
I'd rather avoid debian-style versions in favor of a strict, constant-length version system. Let's not have a custom postgresql function for comparing versions again ;)
Well, maybe debian-style is overkill. What I really want is this property: for any two versions A and B with A < B it is possible to construct a version C such that A < C < B. No constant length version system can satisfy this, but it doesn't necessarily imply anything as complicated as debian.
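To make that property concrete, something like this toy sketch would do (purely illustrative, nothing to do with existing LAVA code; plain dotted versions compared as integer tuples):

# Illustrative only: a version can always be slotted between two others
# by appending another component, so no constant-length scheme works.
def parse_version(text):
    return tuple(int(part) for part in text.split('.'))

assert parse_version('1.0') < parse_version('1.0.1') < parse_version('1.1')
# A fixed major.minor scheme has no room for anything strictly
# between 1.0 and 1.1.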
Part of this would be a command line tool for fetching / publishing test definitions I guess. In fact this could almost be the main thing: it depends whether you want to produce (and host, I guess) a single site which is the centrepoint of the test definition world (like pypi.python.org is for Python stuff) or just the tools / protocols people use to run and work with their own repositories (testdef.validation.linaro.org or testdef.qa.ubuntu.com or whatever).
I think that there _should_ be a central repository simply because it means less fragmentation early on. From what I know people don't deploy their own pypi just to host their pet project. They only do that if they depend on the protocols and tools around pypi and want to keep the code private.
I guess I am a little skeptical of the amount of test reuse that's going to be possible between different users. I mean, if a testdef includes device tags that must be present on a device for a test run to be possible -- as power measurement tests might well do -- that sort of ties the testdef to *our lab*, never mind LAVA in general, unless the concept of and specific names of device tags becomes more widespread than I really expect at the moment.
I think that, as with pypi, even if there is a "single centrepoint of the test definition world", we should expect that sites will have local test repositories for one reason and another (as they do with pypi).
Having said what I did above, nothing can prevent others from re-implementing the same protocols or deploying their own archive but I think we should encourage working in the common pool as this will improve the ecosystem IMHO (look at easy_install, pip or even crate.io,
What is crate.io btw? Pypi with a prettier skin?
they would not have happened if there had been a bunch of competing pypi-like systems, none of them dominant). In other words, the value of pypi is the data that is stored there.
Well sure. But there are lots and lots and lots of sites that use, say, Django. How many are going to use Linaro's big.LITTLE tests? I think there is actually a difference here. I get the impression that your situation is somewhat different -- that you have checkbox users who really should be collaborating on test definitions but have no way of doing so right now.
Another way to handle namespacing is to include the name of the user / group that can update a resource in its name, ala branches on LP or repos on github (or bundle streams in LAVA). Not sure if that's a good idea for our use case or not.
I thought about one thing that would warrant ~user/project approach. Both pypi and launchpad are product-centric -- you go to shop for solutions looking for the product name. GitHub on the other hand is developer centric as $product can have any number of forks that are equally exposed.
I think for our goals we should focus on product-centric views. The actual code, wherever it exists, should be managed with other tools.
I /think/ I agree here...
I would not like to outgrow this concept to a DVCS or a code hosting tool.
I wonder if checkbox's rfc822ish format would be better than JSON for test interchange...
Probably, although it's still imperfect and handles binary data poorly.
Having slept on this (a few times) I think I'm leaning towards .ini files, somewhat like the stuff you came up with for lava-test.
[metadata]
name: stream
version: 1.0
format: lava-test v1.0

[install]
url: http://www.cs.virginia.edu/stream/FTP/Code/stream.c
steps: cc stream.c -O2 -fopenmp -o stream

[run]
steps: ./stream

[parse]
pattern: ^(?P<test_case_id>\w+):\W+(?P<measurement>\d+.\d+)

[parse:appendall]
units: MB/s
result: pass
for example. In my mind, the metadata section and the keys it has here are mandatory, all else is free-form (although there will be something somewhere that knows that this format string gives a meaning to the keys and values present in this example).
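Roughly, a loader for that could look like the following sketch (assuming Python and configparser; the section and key names are just the ones from the example above, nothing is settled):

import configparser

REQUIRED_METADATA = ('name', 'version', 'format')

def load_testdef(path):
    # Sketch only: insist on the [metadata] keys, pass everything else
    # through untouched for the named format to interpret.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)
    if 'metadata' not in parser:
        raise ValueError('testdef has no [metadata] section')
    metadata = dict(parser['metadata'])
    missing = [key for key in REQUIRED_METADATA if key not in metadata]
    if missing:
        raise ValueError('missing metadata keys: %s' % ', '.join(missing))
    other_sections = {name: dict(parser[name])
                      for name in parser.sections() if name != 'metadata'}
    return metadata, other_sections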
What I'd like to see in practice is a free-for-all web service that can hold test meta-data. I believe that as we go test meta-data will formalize, and at some point it may become possible to run a lava-test test from checkbox and a checkbox job in lava (given appropriate adapters on both sides) merely by specifying the name of the test.
So that's an argument for aiming for a single site? Maybe. Maybe you'd just give a URL of a testdef rather than the name of a test, so http://testdef.validation.linaro.org/stream rather than just 'stream'.
Imagine pip installing that each time. IMO it's better to stick to names rather than URLs, if we can.
Yes, this part is a good point. The other option would be to have per user/site configuration for a default site, but well. Less configuration -> better.
People already know how to manage names; URLs are something we can only google for.
The full URL could be usable for some kind of "packages" but that's not the primary scope of the proposal, I think. Packages are more complicated and secondary; the directory should merely point you at something that you can install from an absolute URL.
I don't think I understand what you mean here.
Initially it could be a simple RESTful interface based on a dumb HTTP server serving files from a tree structure.
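For example (the URL layout below is entirely made up, just to show how dumb the client side could stay):

import urllib.request

BASE_URL = 'http://testdef.example.org'  # placeholder, no such host

def fetch_testdef(name, version='latest'):
    # GET <base>/testdefs/<name>/<version> and hand back the raw file.
    url = '%s/testdefs/%s/%s' % (BASE_URL, name, version)
    with urllib.request.urlopen(url) as response:
        return response.read().decode('utf-8')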
And then could grow wiki like features? :-)
I'd rather not go there. IMHO it should only have search and CRUD actions on the content. Anything beyond that works better elsewhere (readthedocs / crate.io). Remember that it's not the 'appstore' experience that we are after here. The goal is to introduce a common component that people can converge and thrive on. This alone may give us better code re-usability, as we gain partial visibility into what other developers are doing _and_ we fix the release process for test definitions so that people can depend on them indefinitely.
One of the user stories we have is "which tests are available to run on board X with Y deployed to it?" -- if we use test repositories that are entirely disconnected from the LAVA database I think this becomes a bit harder to answer. Although one could make searching a required feature of a test repository...
I think that's something to do in stage 2 as we get a better understanding of what we have.
Hm. Not sure -- it really is something we want ASAP in lava.
In the end the perfect solution, for LAVA, might be LAVA-specific, and we should not sacrifice the generally useful aspects in the quest for something this narrow.
Some simple classifiers might help there:
Environment::Hardware::SoC::OMAP35xx
Environment::Hardware::Board::Panda Board ES
Environment::Hardware::Add-Ons::Linaro::ABCDXYZ-Power-Probe
Environment::Software::Linaro::Ubuntu Desktop
Environment::Software::Ubuntu::Ubuntu Desktop
I think KISS will need to be applied here.
But this requires building a sensible taxonomy, which is something I don't want to require in the first stage. The important part is to be _able_ to build one, as the meta-data format won't constrain you. As we go we can release "official" meta-data spec releases that standardize what certain things mean. This could then be used as a basis for reliable (as in no false positives) and advanced search tools.
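Even before any taxonomy exists, plain prefix matching over whatever classifiers people publish would be enough for a first cut at search, e.g. (illustrative only, classifiers taken from the example above):

def matches(classifiers, wanted_prefix):
    # Naive search: no taxonomy, just prefix matching over free-form strings.
    return any(entry.startswith(wanted_prefix) for entry in classifiers)

stream_classifiers = [
    'Environment::Hardware::Board::Panda Board ES',
    'Environment::Software::Linaro::Ubuntu Desktop',
]
print(matches(stream_classifiers, 'Environment::Hardware::Board'))  # True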
Right, let's just solve the problem in my face now in a way that doesn't flagrantly prevent more general solutions later.
This would allow us to try moving some of the experimental meta-data there and build the client parts. If the idea gains traction it could grow from there.
Some considerations:
- Some tests have to be private. I don't know how to solve that with namespaces. One idea that comes to mind is a .private. namespace that is explicitly non-global and can be provided by a local "test definition repository".
That would work, I think.
- It should probably be schema-free, serving simple rfc822 files with python-like classifiers (Test::Platform::Android anyone?), as this will allow free experimentation.
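The nice thing is that the stdlib already parses that kind of file; something like this sketch (the headers here are invented for illustration, nothing is agreed):

from email.parser import Parser

raw = """\
Name: stream
Version: 1.0b3
Classifier: Test::Platform::Android
Classifier: Environment::Hardware::Board::Panda Board ES
"""

testdef = Parser().parsestr(raw)
print(testdef['Name'], testdef['Version'])
print(testdef.get_all('Classifier'))  # repeated headers carry the classifiers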
FWIW, I think they're pedantically called "trove classifiers" :-)
Right, thanks!
I guess there would be two mandatory fields: name and version. And maybe format? So you could have
Yeah, name and version is a good start. Obviously each test definition will have a maintainer / owner but that's not something that has to be visible here (and it certainly won't be a part of what gets published "to the archive" if we go that far).
Name: stream
Version: 1.0b3
Format: LAVA testdef version 1.3
We could also prefix all non-standard (non-standardized) headers with a vendor string (-Linaro, -Canonical), or have a standard custom-extension header prefix as in HTTP's X-foo.
Blah. No thanks.
...
and everything else would only need to make sense to LAVA.
Then you would say client side:
$ testdef-get lava-stream
We definitely need a catchy name
But seriously. I'm not entirely sure that the command line tool will be a part of the "standard issue". The same way you use pip to install python stuff from pypi, you'd use lava to install test definitions into lava. I can't imagine how a generic tool could know how to interact with both lava and checkbox in a way that would still be useful. While your example isn't strictly about running tests (it's about defining them), I think it's important to emphasize that the protocols, and maybe the common repo, matter more than the tools, as those may be more domain-specific for a while.
You've lost me here, I'm afraid.
I really do care mostly about the experience of the people maintaining tests. *I* am (or at least, the LAVA team is) going to be writing the test execution stuff, so that more falls in the "do it once, forget about it" category.
Fetched lava-stream version 1.0b3
$ vi lava-stream.txt   # update stuff
$ testdef-push lava-stream.txt
ERROR: lava-stream version 1.0b3 already exists on server
$ vi lava-stream.txt   # Oops, update version
$ testdef-push lava-stream.txt
Uploaded lava-stream version 1.0b4
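Under the hood that push could be as little as the following sketch (the server URL, the PUT-per-version layout and the 409-means-duplicate convention are all just assumptions on my part):

import configparser
import urllib.error
import urllib.request

def push(path, base_url='http://testdef.example.org'):
    # Read name/version out of the testdef, upload it to a per-version
    # URL, and treat a conflict as "that version already exists".
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)
    name = parser['metadata']['name']
    version = parser['metadata']['version']
    url = '%s/testdefs/%s/%s' % (base_url, name, version)
    with open(path, 'rb') as testdef_file:
        request = urllib.request.Request(
            url, data=testdef_file.read(), method='PUT')
    try:
        urllib.request.urlopen(request)
        print('Uploaded %s version %s' % (name, version))
    except urllib.error.HTTPError as error:
        if error.code == 409:
            print('ERROR: %s version %s already exists on server'
                  % (name, version))
        else:
            raise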
I wonder if we could actually cheat and use pypi to prototype this.
Interesting idea.
I don't suppose they have a staging instance where I can register 20 tiny projects with oddball meta-data?
There is, it turns out: http://testpypi.python.org/pypi
http://wiki.python.org/moin/PyPiImplementations is also relevant if we want to just run our own instance of PyPI-like software.
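If we went down that route, each test definition could be registered as a near-empty package whose metadata carries what we care about, roughly like this (the project name and classifier are made up; whether the index accepts non-standard classifiers is exactly what the experiment would tell us):

from setuptools import setup

setup(
    # One throwaway "package" per test definition, just to exercise the
    # index; nothing here is real apart from the metadata fields.
    name='lava-testdef-stream',
    version='1.0b3',
    description='STREAM memory bandwidth test definition (prototype)',
    long_description=open('lava-stream.txt').read(),
    classifiers=[
        'Environment :: Hardware :: Board :: Panda Board ES',
    ],
)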
- It probably does not have to be the download server as anyone can
host tests themselves. Just meta-data would be kept there.
By metadata you mean the key-value data as listed above, right?
Yes
(For small tests that may be enough but I can envision tests with external code and resources)
Yeah, the way lava-test tests can specify URLs and bzr and git repos to be fetched needs to stay I think.
That's the part I hate the most about the current LAVA setup. I think that going forward they should go away and be converted into test definitions that describe the very same code you'd git clone or bzr branch.
I agree, but I don't know how practical this is. I need to work through some moderately complicated existing tests -- if a test includes a few C files that need compilation, I don't think squeezing that into a testdef file is really practical. But maybe if the testdef server hosts more than just the testdef file (so more like PyPI, in fact) this becomes bearable again... not sure how this relates to some of your comments around just hosting metadata.
The reason I believe that is that it will allow you to do reliable releases. Imagine the same distinction for pypi: no tarballs, just git URLs. I think that would defeat the long-term purpose of the directory. Remember that both the test "wrapper" / definition and the test code are consumed by users/testers, so _both_ should be released in the same, reliable way.
In addition to that, having "downloads" makes offline use easier. I'm not entirely sure how that would work with very high level tests that, say, apt-get install something from the archive and then run some arbitrary commands. One might be tempted to create a reproducible test environment where all the downloads are kept offline and versioned, but perhaps that kind of test simply needs to be explicitly marked as non-idempotent, and that's the actual value it provides.
Well. For things that are sufficiently declarative (specifying a git repo to clone or a package to install with apt-get) we can capture the versions that are obtained. For an arbitrary URL one could checksum the download and issue a warning if it has changed. But if the test code itself runs "apt-get update/install", well, then it's just not a test that supports being replayed on this level. As we don't have any support at all for re-running tests, I'm not sure I want to spend /too/ long worrying about this aspect...
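Capturing that could be as simple as this sketch (where the recorded checksum lives is left open):

import hashlib
import urllib.request

def fetch_and_check(url, recorded_sha256=None):
    # Download the resource, remember its checksum on the first run and
    # warn on later runs if the content no longer matches.
    with urllib.request.urlopen(url) as response:
        data = response.read()
    digest = hashlib.sha256(data).hexdigest()
    if recorded_sha256 is not None and digest != recorded_sha256:
        print('warning: %s changed since the recorded run' % url)
    return data, digest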
Cheers, mwh