The test_scanf case didn't actually use the KUnit infrastructure. The stack use explosion there is because gcc doesn't seem to combine stack allocations in many situations. I know gcc *sometimes* does that stack allocation combining, but not here. I suspect it might be related to type aliasing, with gcc only merging stack slots when they have the same types, so it gets triggered by the different result buffer sizes. Maybe.
I'll have to take another look at this test.
The build robot reported a stack explosion recently, but despite trying various configurations, GCC versions and x86/ARM targets I couldn't reproduce it. The robot got a ~8k stack, but in all my test builds GCC merged the stack structs and produced only ~100-200 bytes. Unfortunately I haven't been able to spend the time on this.
I wanted to avoid the quick fix of splitting the tests into multiple functions, because really that's saying "make stack use (which is up to GCC) < X". The ultimate stack reduction would be one-test-per-function, but that gives really bloated source. Any attempt to group several tests into a function relies on an assumption about what GCC will or won't merge on the stack, and that could change.
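To make the trade-off concrete, here is a rough userspace sketch (not the actual test_scanf code; every function name and array size below is made up for illustration). The grouped form puts two differently-typed result arrays in one frame, and whether they end up sharing stack space is the compiler's decision. The split form forces one frame per test with the GCC/Clang noinline attribute.

```c
#include <assert.h>
#include <stdio.h>

/* GCC/Clang attribute: keeps each test in its own stack frame. */
#define noinline_fn __attribute__((noinline))

/*
 * Grouped form: both arrays belong to one frame. The compiler may or may
 * not overlap them, so the frame can be close to the sum of the two sizes
 * or close to the larger one.
 */
static int grouped_tests(void)
{
	int failures = 0;

	{
		int r[64] = { 0 };

		if (sscanf("1 2", "%d %d", &r[0], &r[1]) != 2 ||
		    r[0] != 1 || r[1] != 2)
			failures++;
	}
	{
		long long r[64] = { 0 };

		if (sscanf("3 4", "%lld %lld", &r[0], &r[1]) != 2 ||
		    r[0] != 3 || r[1] != 4)
			failures++;
	}
	return failures;
}

/* Split form: each frame holds exactly one array, whatever GCC merges. */
static noinline_fn int test_int(void)
{
	int r[64] = { 0 };

	if (sscanf("1 2", "%d %d", &r[0], &r[1]) != 2)
		return 1;
	return (r[0] == 1 && r[1] == 2) ? 0 : 1;
}

static noinline_fn int test_long_long(void)
{
	long long r[64] = { 0 };

	if (sscanf("3 4", "%lld %lld", &r[0], &r[1]) != 2)
		return 1;
	return (r[0] == 3 && r[1] == 4) ? 0 : 1;
}

static int split_tests(void)
{
	return test_int() + test_long_long();
}

/* Both layouts must report the same result; only stack usage differs. */
static void check_equivalent(void)
{
	assert(grouped_tests() == split_tests());
}
```

Building with GCC's -fstack-usage (or -Wframe-larger-than=N) should show the per-function frame sizes, which is one way to see what a particular compiler version actually decided for grouped_tests().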
I think the best fix would be to re-work the code to use a work buffer instead of stack allocations (as it already does for the format strings). With the benefit of hindsight, this is what I should have done originally. When I have the time I'll work on it.
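Roughly what I have in mind, as a userspace sketch only: the buffer size, the helper names and the use of malloc() are all assumptions for illustration (kernel code would allocate with something like kmalloc() instead). Each test views the one shared buffer as whatever result type it needs, so no test has a large local array and the stack use no longer depends on what GCC merges.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical size: big enough for the largest result array any test needs. */
#define WORK_BUF_SIZE 256

/* One shared work buffer, allocated once, replacing per-test stack arrays. */
static void *work_buf;

static int work_buf_init(void)
{
	work_buf = malloc(WORK_BUF_SIZE);	/* suitably aligned for any type */
	return work_buf ? 0 : -1;
}

static void work_buf_exit(void)
{
	free(work_buf);
	work_buf = NULL;
}

/* Returns 0 on success. Uses the shared buffer as an array of int. */
static int test_two_ints(void)
{
	int *res = work_buf;

	assert(work_buf != NULL);
	memset(work_buf, 0, WORK_BUF_SIZE);
	if (sscanf("12 34", "%d %d", &res[0], &res[1]) != 2)
		return -1;
	return (res[0] == 12 && res[1] == 34) ? 0 : -1;
}

/* Returns 0 on success. Reuses the same buffer as an array of long long. */
static int test_two_long_longs(void)
{
	long long *res = work_buf;

	assert(work_buf != NULL);
	memset(work_buf, 0, WORK_BUF_SIZE);
	if (sscanf("-5 7000000000", "%lld %lld", &res[0], &res[1]) != 2)
		return -1;
	return (res[0] == -5 && res[1] == 7000000000LL) ? 0 : -1;
}
```

A caller would run work_buf_init() once, then the tests in sequence, then work_buf_exit(). The tests can't run concurrently against one buffer, which seems an acceptable restriction for a sequential selftest.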