On Tue, Dec 8, 2020 at 5:28 PM Mohamed Abuelfotoh, Hazem <abuehaze@amazon.com> wrote:
> Please try again, with a fixed tcp_rmem[1] on receiver, taking into
> account bigger memory requirement for MTU 9000
>
> Rationale : TCP should be ready to receive 10 full frames before
> autotuning takes place (these 10 MSS are typically in a single GRO
> packet)
>
> At 9000 MTU, one frame typically consumes 12KB (or 16KB on some arches/drivers)
>
> TCP uses a 50% factor rule, accounting 18000 bytes of kernel memory per MSS.
>
> ->
>
> echo "4096 180000 15728640" >/proc/sys/net/ipv4/tcp_rmem
>
> diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
> index 9e8a6c1aa0190cc248b3b99b073a4c6e45884cf5..81b5d9375860ae583e08045fb25b089c456c60ab 100644
> --- a/net/ipv4/tcp_input.c
> +++ b/net/ipv4/tcp_input.c
> @@ -534,6 +534,7 @@ static void tcp_init_buffer_space(struct sock *sk)
>
>         tp->rcv_ssthresh = min(tp->rcv_ssthresh, tp->window_clamp);
>         tp->snd_cwnd_stamp = tcp_jiffies32;
> +       tp->rcvq_space.space = min(tp->rcv_ssthresh, tp->rcvq_space.space);
>  }

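For reference, the arithmetic behind the 180000 figure in the quoted message can be sketched as below. This is only a rough model, not kernel code: the function names are made up for illustration, and it assumes the "50% factor rule" means each MTU-sized frame is charged twice its size against the receive buffer.

```python
# Sketch only: these names are illustrative, not kernel identifiers.

def bytes_accounted_per_mss(mtu: int, overhead_factor: int = 2) -> int:
    """Kernel memory charged per full frame under the 50% rule.

    Assumption: the 50% rule is modeled as charging 2 * MTU per frame,
    which gives the 18000 bytes per MSS quoted for MTU 9000.
    """
    return mtu * overhead_factor


def rmem_default_for(mtu: int, frames: int = 10) -> int:
    """tcp_rmem[1] needed to hold `frames` full frames before autotuning."""
    return frames * bytes_accounted_per_mss(mtu)


if __name__ == "__main__":
    print(rmem_default_for(9000))  # 10 frames at MTU 9000
    print(rmem_default_for(1500))  # same rule at the standard MTU
```

Real per-frame memory use depends on the driver and architecture (12KB or 16KB per frame, as noted above), so the result is a starting point for testing rather than an exact threshold.
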
Yes, this worked, and it looks like

echo "4096 140000 15728640" >/proc/sys/net/ipv4/tcp_rmem

is actually enough to trigger TCP autotuning. If the current default tcp_rmem[1] doesn't work well with 9000 MTU, I am curious to know whether there is a specific reason for choosing 131072 as tcp_rmem[1]. I think the number itself has to be divisible by the page size (4K) and by 16KB, given what you said about each jumbo frame consuming up to 16KB.

I think the idea behind the value of 131072 was that because TCP RWIN was set to 65535, we had to reserve twice this amount of memory -> 131072 bytes.
Assuming DRS works well, the exact value should matter only for unresponsive applications (slow to read/drain the receive queue), since DRS is delayed for them.