+cc Steve
On Sun, May 25, 2025 at 06:18:42PM +0100, David Laight wrote:
On Fri, 23 May 2025 20:01:33 +0200 "Arnd Bergmann" arnd@arndb.de wrote:
On Fri, May 23, 2025, at 19:11, Kent Overstreet wrote:
On Fri, May 23, 2025 at 05:17:15PM +0200, Arnd Bergmann wrote:
- KASAN_STACK adds extra redzones for each variable
- KASAN_STACK further prevents stack slots from getting reused inside one function, in order to better pinpoint which instance caused problems like out-of-scope access
- passing structures by value causes them to be put on the stack on some architectures, even when the structure size is only one or two registers
We mainly do this with bkey_s_c, which is just two words: on x86_64, that gets passed in registers. Is riscv different?
Not sure, I think it's mostly older ABIs that are limited, either not passing structures in registers at all, or only possibly one but not two of them.
- sanitizers turn off optimizations that lead to better stack usage
- in some cases, the missed optimization ends up causing local variables to get spilled to the stack many times because of a combination of all the above.
Yeesh.
I suspect we should be running with a larger stack when the sanitizers are running, and perhaps tweak the warnings accordingly. I did a bunch of stack usage work after I found a kmsan build was blowing out the stack, but then running with max stack usage tracing enabled showed it to be a largely non issue on non-sanitizer builds, IIRC.
Enabling KASAN does double the available stack space. However, I don't think we should use that as an excuse to raise the per-function warning limit, because
- the majority of all function stacks do not grow that much when sanitizers are enabled
- allmodconfig enables KASAN and should still catch mistakes where a driver accidentally puts a large structure on the stack
That is rather annoying when you want to look at the generated code :-(
- 2KB on 64-bit targes is a really large limit. At some point in the past I had a series that lowered the limit to 1536 byte for 64-bit targets, but I never managed to get all the changes merged.
I've a cunning plan to do a proper static analysis of stack usage. It is a 'simple' matter of getting objtool to output all calls with the stack offset. Indirect calls need the function hashes from fine-ibt, but also need clang to support 'hash seeds' to disambiguate all the void (*)(void *) functions. That'll first barf at all recursion, and then, I expect, show a massive stack use inside snprintf() in some error path.
I suspect recursion will make the results you get with that approach useless.
We already have "trace max stack", but that only checks at process exit, so it doesn't tell you much.
We could do better with tracing - just inject a trampoline that checks the current stack usage against the maximum stack usage we've seen, and emits a trace event with a stack trace if it's greater.
(and now Steve's going to tell us he's already done this :)