On Thu, 28 Nov 2013 14:35:21 +1300 Michael Hudson-Doyle michael.hudson@linaro.org wrote:
Neil Williams codehelp@debian.org writes:
MultiNode result bundle aggregation combines completed results after all test cases have run (specifically, during the submit_results action), at which point no further actions will be executed. Aggregation itself happens off the device, and not even on the dispatcher: it happens on the server. This allows each node to send its result bundle as normal (via the dispatcher over XMLRPC), and it is only the subid-zero job which needs to hang around waiting for the other nodes to submit their individual results.
Right. And the "aggregation" that happens at this level is really just that the test runs produced by each node are put in a list? There's no possibility for me to interfere at this stage AIUI (which I think is probably fine and sensible :-p)
Yes, processing of the aggregated data can be done via filters and image reports or by downloading the bundle and running custom scripts.
My question is: exactly what analysis do you need to do *on the device under test*
It doesn't have to be on the/a device under test really... but the prototypical example would be the one I gave in my mail, summing the req/s reported by each loadgen node to arrive at a total req/s for the system as a whole.
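As a rough illustration (assuming the output from each node has been collected into a results/ directory and that httperf prints a line such as "Request rate: 123.4 req/s"), the summing itself is trivial:

    # Sum the request rate reported by each loadgen node.
    total=$(grep -h 'Request rate:' results/*.txt \
        | awk '{sum += $3} END {print sum}')
    echo "aggregate req/s: $total"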
and can that be done via filters and image reports on the server?
I don't know. Can filters and image reports sum the measurements across a bunch of separate test cases?
Stevan? Does Image Reports 2.0 have this support?
If the analysis involves executing binaries compiled on the device, that would be a reason to copy the binaries between nodes using TCP/IP (or even to cache the binaries somewhere and run a second test to do the analysis); otherwise, the server is likely to provide more competent analysis than the device under test. It's a question of getting the output into a suitable format.
Once a MultiNode job is complete, there is a single result bundle which can contain all of the test result data from all of the nodes, including measurements. There is scope for a custom script to optimise the parser to make the data in the result bundle easier to analyse in an image report.
Yeah, I think this is what I was sort of asking for.
:-) By custom script, I was thinking of a script on each node, written by each test writer, which prepares the output of a test routine for easier parsing in filters and image reports. How much work this needs to do depends on how the work on Image Reports 2.0 develops.
This is the way that MultiNode is designed to work - each test definition massages the test result output into whatever structure is most amenable to being compared and graphed using Image Reports on the server, not on a device under test.
Using the server also means that further data mining is easy: the aggregated result bundle can be extracted and processed at any time, including many months after the original test completed, or used to compare tests several weeks apart.
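As a sketch of the download-and-script route (test_runs containing test_results is the standard bundle layout, but the test case id req_per_sec is only an assumption here):

    # Sum a measurement across every test run in a downloaded
    # aggregated bundle, e.g. with jq:
    jq '[ .test_runs[].test_results[]
          | select(.test_case_id == "req_per_sec")
          | (.measurement | tonumber) ]
        | add' aggregated-bundle.json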
Well sure, I think it's a bad idea to throw the information that you are aggregating away. But it's nice to have the aggregate req/s in the measurement field so you can get a quick idea of performance changes.
Agreed. As far as measurement changes over time are concerned, that is absolutely the role of filters and image reports.
The motivating case here is having load generation distributed across various machines: to compute the req/s the server is actually able to manage, I want to add up the number of requests each load generator made.
So you would have nodes with different roles in the MultiNode job: generators and servers (or generators and surveyors). Wouldn't it also be possible for the surveyor node(s) to record the measurements during the test? This would be much like Antonio's initial suggestion of a "watching" KVM which does live monitoring & collection rather than aggregation after the event.
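A rough sketch of that surveyor pattern using the MultiNode synchronisation API (the message id, role names and cache file location are assumptions):

    # On each generator node, once its run has finished:
    rate=$(awk '/Request rate:/ {print $3}' httperf.log)
    lava-send loadgen-done rate="$rate"

    # On the surveyor node, wait for every generator to report:
    lava-wait-all loadgen-done generator
    # The received key=value pairs land in the MultiNode cache file.
    total=$(grep -o 'rate=[0-9.]*' /tmp/lava_multi_node_cache.txt \
        | cut -d= -f2 | awk '{sum += $1} END {print sum}')
    lava-test-case total-req-per-sec --result pass \
        --measurement "$total" --units req/s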
I can sort of see how to do this myself, basically something like this:
- store the data on each node
- arbitrarily pick one node to be the one that does the aggregation
LAVA does this arbitrarily as well - the bundles are aggregated by the job with subid zero, so 1234.0 aggregates the bundles from 1234.1, 1234.2 and so on.
Is there a way for the node to tell if it is running the job with subid 0?
It shouldn't need to know - whichever node does your load generation calculations will feed its results into the bundles, which will be aggregated by LAVA, and the whole set becomes available to filters and image reports.
- do tar | nc style things to get the data onto that node
- analyze it there and store the results using lava-test-case (roughly sketched below)
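Roughly (the port, paths and the way nodes learn the aggregator's address are illustrative only):

    # On the aggregating node, listen for another node's data
    # (repeat, or loop, once per sending node):
    mkdir -p /tmp/results
    nc -l -p 9000 | tar -xf - -C /tmp/results

    # On each of the other nodes (aggregator IP shared beforehand,
    # e.g. via lava-send/lava-wait; files assumed uniquely named):
    tar -cf - -C /tmp/results . | nc "$AGGREGATOR_IP" 9000

    # Back on the aggregator, analyze and record the result:
    total=$(grep -h 'Request rate:' /tmp/results/*.txt \
        | awk '{sum += $3} END {print sum}')
    lava-test-case total-req-per-sec --result pass \
        --measurement "$total" --units req/s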
Results inside a test case mean that if the analysis needs to be improved, old data cannot be re-processed.
Not necessarily -- for my tests I also save the entire httperf output as attachments and have scripts that analyze these to produce fancy graphs as well as putting the aggregate req/s in the measurement field. I guess what this means is that the aggregation is only a convenience really -- but probably a fairly important one.
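For reference, that pattern looks something like this in lava-test-shell (the file path is illustrative):

    # Record the aggregate as a measurement...
    lava-test-case httperf-run --result pass \
        --measurement "$total" --units req/s
    # ...and attach the raw output so it can be re-analyzed later.
    lava-test-case-attach httperf-run /tmp/results/httperf.log text/plain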
There are two important advantages to doing this inside Image Reports:
1. Image Reports are not reliant on TCP/IP connections, which some devices simply don't support during test runs.
2. Image Reports can easily work retrospectively across all existing MultiNode and singlenode jobs, whereas any change inside individual test jobs would not be able to pull data from older tests.
netcat and tar are the quick solution to this specific problem; the wider problem remains that LAVA should support calculations across test cases in the image reports.