On Wed, 27 Nov 2013 13:56:42 +1300, Michael Hudson-Doyle <michael.hudson@linaro.org> wrote:
I've been looking at moving my ghetto multinode stuff over to proper LAVA multinode on and off for a while now, and have something that I'm still not sure how best to handle: result aggregation.
MultiNode result bundle aggregation combines the completed results after all test cases have run (specifically, during the submit_results action), at which point no further actions will be executed. Aggregation itself happens off the device: not on the dispatcher, but on the server. This allows each node to send its result bundle as normal (via the dispatcher over XML-RPC); only the subid-zero job needs to hang around waiting for the other nodes to submit their individual results.
My question is: exactly what analysis do you need to do *on the device under test*, and can that be done via filters and image reports on the server?
If the analysis involves executing binaries compiled on the device, that would be a reason to copy the binaries between nodes over TCP/IP (or even to cache the binaries somewhere and run a second test to do the analysis); otherwise, the server is likely to provide more capable analysis than the device under test. It's a question of getting the output into a suitable format.
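For the copy-between-nodes case, a minimal sketch using the documented MultiNode message helpers (lava-send and lava-wait) together with tar and nc; the role split, port, paths and the parsing of lava-wait's output are illustrative and will need checking against your LAVA version and netcat implementation:

    # on the node that will run the analysis (the receiver):
    MY_IP=$(hostname -I | awk '{print $1}')   # assumes one usable interface
    nc -l -p 9999 > /tmp/binaries.tar &       # listen before announcing the address
    NC_PID=$!
    lava-send receiver-ready ipaddr=$MY_IP    # tell the other nodes where to connect
    wait $NC_PID
    mkdir -p /opt/analysis && tar -xf /tmp/binaries.tar -C /opt/analysis

    # on the node that built the binaries (the sender):
    # assumes lava-wait prints the message's key=value pairs; check the
    # MultiNode API documentation for where your version leaves the data
    IP=$(lava-wait receiver-ready | sed -n 's/.*ipaddr=\([0-9.]*\).*/\1/p')
    tar -cf - ./analysis-binaries | nc -q 1 "$IP" 9999   # netcat option syntax varies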
Once a MultiNode job is complete, there is a single result bundle which can contain all of the test result data from all of the nodes, including measurements. There is scope for a custom script to tune the output parsing so that the data in the result bundle is easier to analyse in an image report.
This is the way that MultiNode is designed to work - each test definition massages the test result output into whatever structure is most amenable to being compared and graphed using Image Reports on the server, not on a device under test.
Using the server also makes further data mining easy: the aggregated result bundle can be extracted and processed at any time, even many months after the original test completed, and tests run several weeks apart can be compared.
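As a concrete (if hedged) sketch of that kind of extraction, assuming the aggregated bundle has been downloaded as a JSON file and that the measurement of interest was recorded under a test case id of "my-test-case" (a placeholder name), summing one measurement across all nodes could look like:

    # sum a measurement across every node's test run in the aggregated bundle;
    # the test_runs/test_results layout is assumed from the dashboard bundle
    # format and may need adjusting for your version
    jq '[.test_runs[].test_results[]
         | select(.test_case_id == "my-test-case")
         | (.measurement | tonumber)] | add' bundle.json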
The motivating case here is having load generation distributed across various machines: to compute the req/s the server under test can actually sustain, I want to add up the number of requests each load generator made.
I can sort of see how to do this myself, basically something like this:
1. store the data on each node
2. arbitrarily pick one node to be the one that does the aggregation
LAVA does this arbitrarily as well - the bundles are aggregated by the job with subid zero, so 1234.0 aggregates the bundles for 1234.1, 1234.2, etc.
3. do tar | nc style things to get the data onto that node
4. analyze it there and store the results using lava-test-case
Storing only the analysed results inside a test case means that if the analysis later needs to be improved, the old data cannot be re-processed.
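The alternative is to record the raw data as measurements and leave the arithmetic to the server. A minimal sketch for the load-generation case, assuming each generator's load tool prints a total count (the tool name, its output format and the test case id are illustrative):

    # on each load-generator node: record the raw request count as a
    # measurement rather than computing req/s on the device
    REQUESTS=$(./run-load-test | awk '/^total requests:/ {print $3}')
    lava-test-case load-requests --result pass \
        --measurement "$REQUESTS" --units requests

Each node's count then lands in the aggregated bundle, where a filter and Image Report (or an extraction script like the jq sketch above) can add them up, and the analysis can be improved and re-run later without re-running the test.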
but I was wondering if the LAVA team have any advice here. In particular, steps 2. and 3. seem like something it would be reasonable for LAVA to provide helpers to do.
The LAVA support for this would be to use filters and Image Reports on the server, not helpers during the test: on the device, repeating the analysis means repeating the entire test (at which point the data changes under your feet).