On Wed, Sep 8, 2021 at 7:50 AM Eric Dumazet <edumazet@google.com> wrote:
> At least on my builds, do_tcp_getsockopt() uses less than 512 bytes of stack.
> Probably because tcp_zerocopy_receive() is _not_ inlined, by pure luck I suppose.
> Perhaps we should use noinline_for_stack here.
I agree that that is likely a good idea, but I also suspect that the stack growth may be related to other issues. So it being less than 512 bytes for you may be related to other random noise than inlining.
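To make the noinline_for_stack suggestion concrete, here is a minimal userspace sketch. In the kernel, noinline_for_stack is simply an alias for noinline; the stand-in macro, big_helper() and dispatch() below are hypothetical names for illustration only, not the actual tcp code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-in (assumption): the kernel defines this as plain noinline. */
#ifndef noinline_for_stack
#define noinline_for_stack __attribute__((__noinline__))
#endif

/*
 * Hypothetical helper with a large local buffer. Keeping it out of line
 * means the buffer lives in this function's own frame, and only while the
 * helper is actually running.
 */
static noinline_for_stack size_t big_helper(const char *src)
{
	char scratch[256];
	size_t n;

	assert(src != NULL);
	n = strlen(src);
	if (n >= sizeof(scratch))
		n = sizeof(scratch) - 1;
	memcpy(scratch, src, n);
	scratch[n] = '\0';
	return strlen(scratch);
}

/*
 * Hypothetical dispatcher, standing in for a big switch-style function.
 * If big_helper() were inlined here, its buffer could end up counted in
 * this function's frame even for options that never touch it.
 */
size_t dispatch(int opt, const char *arg)
{
	switch (opt) {
	case 1:
		return big_helper(arg);
	default:
		return 0;
	}
}
```

Whether the compiler would have inlined the helper without the annotation depends on the compiler version and options, which is exactly the "pure luck" part.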
In the past I've seen at least two patterns:
(a) not merging stack slots at all
(b) some odd "pattern allocator" problems, where I think gcc ended up re-using previous stack slots if they were the right size, but failing when previous allocations were fragmented
That (a) thing is what -fconserve-stack is all about, and we also used to have (iirc) -fno-defer-pop to avoid having function call argument stacks stick around.
And (b) is one of those "random allocation pattern" things, which depends on the phase of the moon, where gcc ends up treating the stack frame as a series of fixed-size allocations, but isn't very smart about it. Even if some allocations got free'd, they might be surrounded by others that didn't, and then gcc wouldn't re-use them if there's a bigger allocation afterwards. And similarly, I don't think gcc ever even joins together two free'd stack frame allocations.
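A toy model of that (b) behavior, purely for illustration: this is not gcc's actual algorithm, just a sketch of a slot allocator that reuses a free slot only when it is big enough and never merges neighbouring free slots. All names here (struct frame, frame_alloc(), frame_free(), demo_fragmented()) are made up, and the "big enough" reuse rule is an assumption.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SLOTS 16

struct slot {
	size_t size;
	int in_use;
};

struct frame {
	struct slot slots[MAX_SLOTS];
	size_t nslots;
	size_t bytes;	/* total frame size so far */
};

/* Reuse the first free slot that is big enough, else grow the frame. */
int frame_alloc(struct frame *f, size_t size)
{
	for (size_t i = 0; i < f->nslots; i++) {
		if (!f->slots[i].in_use && f->slots[i].size >= size) {
			f->slots[i].in_use = 1;
			return (int)i;
		}
	}
	assert(f->nslots < MAX_SLOTS);
	f->slots[f->nslots].size = size;
	f->slots[f->nslots].in_use = 1;
	f->bytes += size;
	return (int)f->nslots++;
}

/* Mark a slot free. Note: adjacent free slots are never joined. */
void frame_free(struct frame *f, int idx)
{
	assert(idx >= 0 && (size_t)idx < f->nslots);
	f->slots[idx].in_use = 0;
}

/* Same-size reuse works: the frame stays at 64 bytes. */
size_t demo_same_size(void)
{
	struct frame f = {0};
	int a = frame_alloc(&f, 64);

	frame_free(&f, a);
	(void)frame_alloc(&f, 64);
	return f.bytes;
}

/*
 * Two adjacent 64-byte slots get free'd, but a following 128-byte
 * allocation cannot use them, so the frame grows instead.
 */
size_t demo_fragmented(void)
{
	struct frame f = {0};
	int a = frame_alloc(&f, 64);
	int b = frame_alloc(&f, 64);

	(void)frame_alloc(&f, 64);	/* third slot stays live */
	frame_free(&f, a);
	frame_free(&f, b);
	(void)frame_alloc(&f, 128);
	return f.bytes;
}
```

In this model demo_fragmented() ends up with a 320-byte frame, where an allocator that merged the two free'd slots would need only 192. The allocation order alone decides which outcome you get.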
I also wouldn't be surprised at all if some of our hardening flags ended up causing the stack frame reuse to entirely fail. IOW, I could easily see things like INIT_STACK_ALL_ZERO causing the compiler to initialize all the stack frame allocations "early", so that their lifetimes all overlap.
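One way to poke at the hardening-flag theory is a small test function with two buffers in disjoint scopes, which a compiler is free to place in the same stack slot. The sketch below uses a hypothetical function, two_phases(); the idea is to build it with and without -ftrivial-auto-var-init=zero (the compiler option behind INIT_STACK_ALL_ZERO) and compare the frame size reported by gcc's -fstack-usage. No particular result is claimed here, and the compiler may well optimize the buffers away entirely at higher optimization levels.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical test function: "first" and "second" are never live at the
 * same time, so their slots can be shared. If auto-initialization were
 * hoisted to function entry, both would be live at once and the frame
 * would need room for both.
 */
size_t two_phases(const char *msg)
{
	size_t total = 0;

	assert(msg != NULL);
	{
		char first[128];

		strncpy(first, msg, sizeof(first) - 1);
		first[sizeof(first) - 1] = '\0';
		total += strlen(first);
	}
	{
		char second[128];

		strncpy(second, msg, sizeof(second) - 1);
		second[sizeof(second) - 1] = '\0';
		total += strlen(second);
	}
	return total;
}
```

Compiling this file with -fstack-usage writes a .su file listing the frame size per function, which makes the two builds easy to diff.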
So it could easily be about very subtle and random code generation choices that just change the order of allocation. A spill in the wrong place, things like that.
Or it could be about not-so-subtle big config option things.
Linus