Hi Carsten,
On Tue, Jan 04, 2022 at 03:13:08PM +0000, Carsten Haitzler wrote:
On 1/3/22 08:00, Leo Yan wrote:
[...]
Furthermore, I expect the bubble sort to be used for testing the CoreSight configuration, e.g. it can be used to test the strobing mode (and for validating AutoFDO).
What do you think about this?
I actually didn't include any autofdo testing as this was mostly a matter of tooling after you have collected a trace. Run through the trace data and then build up a good image of the execution of the target and that would probably belong in tooling outside of the kernel. The idea here was to see if we do collect sufficient amounts of data and that the data looks "sane".
Yeah, this is consistent with what Suzuki told me, namely that the main target of this patch set is to verify the CoreSight trace data quality.
This is all about checking whether we get only a single block, or only 2 or 3 blocks before it stops, or no blocks at all, and then applying various stresses on the kernel (memory heavy, cpu heavy) to see if anything greatly affects this.
The bubble sort does allow a basis to build some fdo tests on, but having a baseline of "does it collect data at all" to start with is a good call. I had not tested the strobing yet as that was probably another phase in this. Most of this was about getting the core infrastructure in to be able to add lots of little test tools we can run and the harnesses to run them and collect statistical data over time.
Just a side note - the asm loop is arm64 specific and thus it's good for testing an exact result from, but bubble sort is portable. It would allow us to use this for an Arm 64 platform like the Morello board. I've been keeping in mind "be somewhat portable" for this reason.
The only downside of keeping this test, I think, is that the whole test suite takes a bit longer to run. Is this a sufficient concern to remove this test from the patchset given the above?
So my essential purpose is to condense the test cases as much as possible :)
For example, although the Arm64 asm loop case and the bubble sort case have different execution flows, both of them actually verify the same complete process of recording and reporting CoreSight trace data (so they cover the CoreSight driver, the perf tool and the OpenCSD lib). Since we can pass a different loop count to a test program, and we already have one case that tests very small trace data with the Arm64 asm loop, why do we still need the bubble sort test case with a small array? More cases are not a bad thing, but if both cases exercise the same integration flow, I personally think two cases give us no significant benefit over a single case.
Throughout the whole patch set, my other concern is that some test cases are platform dependent. E.g. if the mainline kernel contains these test cases and a developer later reports a test case failure, it's difficult for us to figure out whether the failure is caused by platform factors (e.g. memory usage, timeout, etc.), or whether it exposes a real issue in the software components.
So for a test case requiring very small resources, we can set strict criteria; for a test case with a big chunk of trace data, we can report a percentage value as the profiling quality metric (e.g. we expect 1000 branch samples, but the result only contains 100 branch samples, so we can output the quality metric as 100 / 1000 = 10%). This allows us to easily conclude that the underlying mechanism works well, but that the profiling quality is bad because trace data was lost. In other words, we can convert the quality result from a binary pass/fail into a value in the range [0% .. 100%].
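To make the idea a bit more concrete, here is a rough sketch of what I mean. It is Python purely for illustration, and the names (trace_quality, case_passes) and the "strict" flag are hypothetical; nothing like this exists in the patch set. It also assumes a "strict" case must see every expected sample, while a large case passes as soon as any sample is seen and only reports the percentage:

```python
def trace_quality(expected_samples: int, observed_samples: int) -> float:
    """Return the profiling quality as a percentage in [0, 100]."""
    if expected_samples <= 0:
        raise ValueError("expected_samples must be positive")
    # Clamp the observed count so extra or negative values stay in range.
    observed = min(max(observed_samples, 0), expected_samples)
    return 100.0 * observed / expected_samples


def case_passes(expected_samples: int, observed_samples: int, strict: bool) -> bool:
    """Assumed policy: strict cases need 100%, others only need some data."""
    quality = trace_quality(expected_samples, observed_samples)
    if strict:
        return quality == 100.0
    return quality > 0.0


if __name__ == "__main__":
    # The example from above: 1000 branch samples expected, 100 found.
    print(f"quality: {trace_quality(1000, 100):.0f}%")
    print("large case passes:", case_passes(1000, 100, strict=False))
    print("small case passes:", case_passes(1000, 100, strict=True))
```

With this split, a failure in a strict case points at the mechanism itself, while a low percentage in a large case points at lost trace data on that platform.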
Thanks, Leo