On Wed, Sep 8, 2021 at 6:48 AM Thorsten Glaser <t.glaser@tarent.de> wrote:
> On Tue, 7 Sep 2021, Linus Torvalds wrote:
>> The do_tcp_getsockopt() one in tcp.c is a classic case of "lots of different case statements, many of them with their own struct allocations on stack, and all of them disjoint".
> Any compiler developers here? AFAIK the compiler knows the lifetime of function-local variables, so why not alias the actual memory locations and ranges to minimise stack usage?
At least on my builds, do_tcp_getsockopt() uses less than 512 bytes of stack.
Probably because tcp_zerocopy_receive() is _not_ inlined, by pure luck I suppose.
Perhaps we should use noinline_for_stack here.
diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index e8b48df73c852a48e51754ea98b1e08bf024bb9e..437910c096b202420518c9e5e5cd26b2194d8aa2 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2054,9 +2054,10 @@ static void tcp_zc_finalize_rx_tstamp(struct sock *sk,
 }

 #define TCP_ZEROCOPY_PAGE_BATCH_SIZE 32
-static int tcp_zerocopy_receive(struct sock *sk,
-				struct tcp_zerocopy_receive *zc,
-				struct scm_timestamping_internal *tss)
+static noinline_for_stack int
+tcp_zerocopy_receive(struct sock *sk,
+		     struct tcp_zerocopy_receive *zc,
+		     struct scm_timestamping_internal *tss)
 {
 	u32 length = 0, offset, vma_len, avail_len, copylen = 0;
 	unsigned long address = (unsigned long)zc->address;
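
For anyone who wants to see the effect outside the kernel tree, here is a minimal userspace sketch of the same idea. It is not the tcp.c code: struct big_scratch, handle_rare_option() and dispatch_option() are hypothetical names made up for illustration, and the noinline_for_stack define below is a stand-in that assumes GCC or Clang (in the kernel the macro expands to noinline). The point is only that a large on-stack object stays in the callee's frame when the callee is not inlined, so the dispatcher's frame does not grow by the size of every case's locals.

```c
#include <stddef.h>
#include <string.h>

/* Userspace stand-in (assumption): forbid inlining so the callee keeps its own frame. */
#define noinline_for_stack __attribute__((noinline))

/* Hypothetical large per-option scratch state, only needed by one case. */
struct big_scratch {
	unsigned char buf[1024];
};

/* The 1 KiB buffer is allocated in this function's frame, not the caller's. */
static noinline_for_stack int handle_rare_option(int seed)
{
	struct big_scratch s;
	size_t i;
	int sum = 0;

	memset(s.buf, seed & 0xff, sizeof(s.buf));
	for (i = 0; i < sizeof(s.buf); i++)
		sum += s.buf[i];
	return sum;
}

/* Hypothetical dispatcher with disjoint cases, in the style of a getsockopt handler. */
int dispatch_option(int optname, int seed)
{
	switch (optname) {
	case 1:
		/* Only this path pays for the large frame, and only while it runs. */
		return handle_rare_option(seed);
	case 2:
		return seed + 1;
	default:
		return -1;
	}
}
```

Building this with -fstack-usage (GCC) with and without the attribute should show whether the buffer ends up accounted to dispatch_option() or to handle_rare_option(); the exact numbers depend on compiler version and optimization level.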