On Fri, Jan 28, 2022 at 3:41 PM David Gow <davidgow@google.com> wrote:
On Thu, Jan 6, 2022 at 6:53 PM Jeremy Kerr <jk@codeconstruct.com.au> wrote:
Hi all,
Happy new year! I'm just picking up this thread again, after having a bunch of other things come up at the end of December. I've since implemented some of the direct feedback on the patch, but wanted to clarify some overall direction too:
Thanks for reaching back out -- and sorry for the delay. I've dusted everything off post-LCA now, and had another look over this.
I'm CCing both the KUnit mailing list and Daniel Latypov on this thread so we can avoid (or track) any potential conflicts with things like: https://lore.kernel.org/linux-kselftest/20211013191320.2490913-1-dlatypov@go...
[Whoops, I'm smart. Actually CCing them this time!]
One idea I've had in the past is to keep such a list around of "test suites to be run when KUnit is ready". This is partly because it's much nicer to have all the test suites run together as part of a single (K)TAP output, so this could be a way of implementing (at least part of) that.
I had a look at implementing this, but it doesn't seem to win us much with the current structure: since we call kunit_run_all_tests() before a filesystem is available, KUnit will always be "ready" (and the tests run) before we've had a chance to load modules, which may contain further tests.
I think the benefit of something like this isn't really with test modules so much as with built-in tests which are run manually. The thunderbolt test, for instance, currently initialises and runs tests itself[1,2], rather than via the KUnit executor, so that it can ensure some proportion of the stack is properly initialised. If there's a way of coalescing those into the same TAP output as other built-in tests, that'd be useful.
Thinking about it, though, I think it's a separate problem from module support, so not worth shoehorning in at this stage.
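(For illustration only, the kind of deferred list being talked about could look roughly like the sketch below. The kunit_defer_suite()/kunit_run_deferred_suites() names and the pending list are invented for this example and aren't existing KUnit API; locking is left out for brevity.)

#include <kunit/test.h>
#include <linux/errno.h>
#include <linux/list.h>
#include <linux/slab.h>

/*
 * Hypothetical "run when KUnit is ready" list: suites registered here are
 * held back and executed together later, so they can share one (K)TAP
 * document. None of these names exist in KUnit; this is only a sketch.
 */
struct kunit_deferred_suite {
	struct list_head list;
	struct kunit_suite *suite;
};

static LIST_HEAD(kunit_deferred_suites);

int kunit_defer_suite(struct kunit_suite *suite)
{
	struct kunit_deferred_suite *entry;

	entry = kzalloc(sizeof(*entry), GFP_KERNEL);
	if (!entry)
		return -ENOMEM;

	entry->suite = suite;
	list_add_tail(&entry->list, &kunit_deferred_suites);
	return 0;
}

void kunit_run_deferred_suites(void)
{
	struct kunit_deferred_suite *entry, *tmp;

	list_for_each_entry_safe(entry, tmp, &kunit_deferred_suites, list) {
		kunit_run_tests(entry->suite);
		list_del(&entry->list);
		kfree(entry);
	}
}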
One option would be to defer kunit_run_all_tests() until we think we have the full set of tests, but there's no specific point at which we know that all required modules are loaded. We could defer this to an explicit user-triggered "run the tests now" interface (debugfs?), but that might break expectations of the tests automatically executing on init.
Yeah, while I do think it'd be neat to have an interface to explicitly run (or re-run) tests, I think having tests run on module load is still the most sensible thing, particularly since that's what people are expecting at the moment (especially with tests which were ported from standalone modules to KUnit).
There was a plan to allow test suites to be triggered individually from debugfs at some point, but I think it got derailed because some tests weren't working if run twice, and some built-in-only tests had problems if run after userspace was brought up.
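(As a rough sketch of what such a per-suite trigger could look like: writing anything to a "run" file under a suite's debugfs directory would re-run the suite. The wiring below is invented for this example, deliberately ignores the re-run problems above, and does no locking.)

#include <kunit/test.h>
#include <linux/debugfs.h>
#include <linux/fs.h>
#include <linux/module.h>

/* Re-run the suite whenever anything is written to the "run" file. */
static ssize_t kunit_debugfs_run_write(struct file *file,
				       const char __user *buf,
				       size_t count, loff_t *ppos)
{
	struct kunit_suite *suite = file->private_data;

	kunit_run_tests(suite);
	return count;
}

static const struct file_operations kunit_debugfs_run_fops = {
	.owner = THIS_MODULE,
	.open = simple_open,
	.write = kunit_debugfs_run_write,
};

/* Hypothetical helper: add a write-only "run" trigger to a suite's dir. */
void kunit_debugfs_add_run_trigger(struct dentry *suite_dir,
				   struct kunit_suite *suite)
{
	debugfs_create_file("run", 0200, suite_dir, suite,
			    &kunit_debugfs_run_fops);
}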
In any case, I think we should get test suites in modules running on module load, and leave any debugfs-related shenanigans for a future patchset.
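(To make the module-load case concrete, a module notifier along the lines of the sketch below would do it. The mod->kunit_suites / mod->num_kunit_suites fields are assumed here, i.e. the sort of thing a patch like this would have to add; they aren't existing members of struct module.)

#include <kunit/test.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/notifier.h>

/*
 * Sketch: run a module's KUnit suites once the module goes live. This
 * assumes the module loader collects the suites into an array of pointers
 * at mod->kunit_suites with mod->num_kunit_suites entries (assumed fields,
 * not part of struct module today).
 */
static int kunit_module_notify(struct notifier_block *nb, unsigned long val,
			       void *data)
{
	struct module *mod = data;
	int i;

	if (val != MODULE_STATE_LIVE)
		return NOTIFY_OK;

	for (i = 0; i < mod->num_kunit_suites; i++)
		kunit_run_tests(mod->kunit_suites[i]);

	return NOTIFY_OK;
}

static struct notifier_block kunit_mod_nb = {
	.notifier_call = kunit_module_notify,
};

static int __init kunit_module_hook_init(void)
{
	return register_module_notifier(&kunit_mod_nb);
}
late_initcall(kunit_module_hook_init);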
Alternatively, I could properly split the TAP output, and just run tests whenever they're probed - either from the built-in set or as modules are loaded at arbitrary points in the future. However, I'm not sure of what the expectations on the consumer-side of the TAP output might be.
At the moment, the KUnit tooling will stop parsing after the first full TAP output, so if suites are outputting TAP separately, only the first one will be handled properly by kunit_tool. Of course, kunit_tool doesn't really handle modules by itself at the moment, so as long as each output is fed to kunit_tool separately, it's still possible to use it. (And adding a way for kunit_tool to handle multiple TAP outputs in the same file is something we plan to look into.)
So, I think this isn't a problem for modules at the moment, though again it could be a bit painful for things like the thunderbolt test where there are multiple TAP documents from built-ins.
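(To illustrate what splitting the output would mean in practice: each batch of suites, the built-ins and then each module, would emit its own self-contained TAP document, roughly as in the sketch below. The helper name is made up for the example, and kunit_tool would still need to be fed these one document at a time.)

#include <kunit/test.h>
#include <linux/printk.h>

/*
 * Sketch: run a batch of suites as one self-contained TAP document, with
 * its own version line and plan, rather than folding everything into a
 * single document printed at boot.
 */
static void kunit_run_suites_as_tap_doc(struct kunit_suite **suites,
					int num_suites)
{
	int i;

	pr_info("TAP version 14\n");
	pr_info("1..%d\n", num_suites);

	for (i = 0; i < num_suites; i++)
		kunit_run_tests(suites[i]);
}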
Note that the updated KTAP[3] spec does actually leave the option open for TAP output to effectively be "streaming" and to not state the total number of tests in advance, but I don't think that's the ideal way of handling it here.
Are there any preferences on the approach here?
So, after all that, I think the original plan makes the most sense, and if we find any cases which are particularly annoying we can either change things then, or (better still) adapt the tooling to handle them better.
Thanks again for looking into this!
-- David