On Mon, Dec 7, 2020 at 4:37 PM Eric Dumazet <edumazet@google.com> wrote:
On Mon, Dec 7, 2020 at 12:41 PM Hazem Mohamed Abuelfotoh <abuehaze@amazon.com> wrote:
Previously, receiver buffer auto-tuning started after receiving one advertised window's worth of data. After the initial receiver buffer was raised by commit a337531b942b ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB"), it may take too long for TCP autotuning to start raising the receiver buffer size.

Commit 041a14d26715 ("tcp: start receiver buffer autotuning sooner") tried to decrease the threshold at which TCP auto-tuning starts, but it doesn't work well in some environments where the receiver has a large MTU (9001), especially with high-RTT connections. In these environments rcvq_space.space will be the same as rcv_wnd, so TCP autotuning will never start because the sender can't send more than rcv_wnd bytes in one round trip.

To address this issue, this patch decreases the initial rcvq_space.space so that TCP autotuning kicks in whenever the sender is able to send more than 5360 bytes in one round trip, regardless of the receiver's configured MTU.

Fixes: a337531b942b ("tcp: up initial rmem to 128KB and SYN rwin to around 64KB")
Fixes: 041a14d26715 ("tcp: start receiver buffer autotuning sooner")
Signed-off-by: Hazem Mohamed Abuelfotoh <abuehaze@amazon.com>
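To make the arithmetic in the changelog concrete, here is a small userspace sketch of the initial rcvq_space.space calculation before and after the patch. It is not kernel code: min_u32(), init_rcvq_space() and autotune_can_start() are hypothetical helpers, and the sample inputs suggested below (a 65535-byte rcv_wnd, an advmss of 8961 for a 9001-byte MTU) are assumptions for illustration. The 536-byte initial rcv_mss is the value implied by the 5360-byte figure above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TCP_INIT_CWND 10u	/* initial congestion window, in segments */

/* Hypothetical stand-in for the kernel's min_t(u32, a, b). */
static uint32_t min_u32(uint32_t a, uint32_t b)
{
	return a < b ? a : b;
}

/*
 * Initial autotuning threshold: the smaller of the advertised window
 * and TCP_INIT_CWND segments of the given MSS. Pass advmss to model
 * the old behaviour, or the initial rcv_mss to model the patch.
 */
static uint32_t init_rcvq_space(uint32_t rcv_wnd, uint32_t mss)
{
	assert(mss > 0);
	return min_u32(rcv_wnd, TCP_INIT_CWND * mss);
}

/*
 * Autotuning only reacts when more than 'space' bytes were copied in
 * one RTT. The sender cannot exceed rcv_wnd per round trip, so the
 * threshold is reachable only if it is strictly below rcv_wnd.
 */
static bool autotune_can_start(uint32_t space, uint32_t rcv_wnd)
{
	return rcv_wnd > space;
}
```

With the assumed inputs, init_rcvq_space(65535, 8961) is clamped to the full window, so the threshold is unreachable, while init_rcvq_space(65535, 536) gives the 5360 bytes mentioned in the changelog.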
 net/ipv4/tcp_input.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 389d1b340248..f0ffac9e937b 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -504,13 +504,14 @@ static void tcp_grow_window(struct sock *sk, const struct sk_buff *skb)
 static void tcp_init_buffer_space(struct sock *sk)
 {
 	int tcp_app_win = sock_net(sk)->ipv4.sysctl_tcp_app_win;
+	struct inet_connection_sock *icsk = inet_csk(sk);
 	struct tcp_sock *tp = tcp_sk(sk);
 	int maxwin;
 	if (!(sk->sk_userlocks & SOCK_SNDBUF_LOCK))
 		tcp_sndbuf_expand(sk);
-	tp->rcvq_space.space = min_t(u32, tp->rcv_wnd, TCP_INIT_CWND * tp->advmss);
+	tp->rcvq_space.space = min_t(u32, tp->rcv_wnd, TCP_INIT_CWND * icsk->icsk_ack.rcv_mss);
I find using icsk->icsk_ack.rcv_mss misleading.
I would either use TCP_MSS_DEFAULT, or maybe simply 0: since we have had no samples yet, there is little point in using a magic value.
0 will not work, since we use do_div(grow, tp->rcvq_space.space).
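For readers without the source at hand, a rough userspace model of that growth step shows why a zero initial value is a problem. This is a sketch based on my reading of tcp_rcv_space_adjust(), so treat the exact formula as an approximation. rcvwin_after_grow() is a hypothetical name, and the plain 64-bit division stands in for do_div().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of the receive-window growth step.
 * 'copied' is the number of bytes the application consumed in the
 * last RTT, 'space' is the previous rcvq_space.space.
 */
static uint64_t rcvwin_after_grow(uint32_t copied, uint32_t space,
				  uint32_t advmss)
{
	uint64_t rcvwin, grow;

	/* space is the divisor below: zero here would be a division by zero. */
	assert(space != 0);
	/* The growth step is only taken when copied exceeds space. */
	assert(copied > space);

	rcvwin = ((uint64_t)copied << 1) + 16ull * advmss;
	grow = rcvwin * (copied - space);
	grow /= space;			/* stands in for do_div(grow, space) */
	rcvwin += grow << 1;

	return rcvwin;
}
```

Any non-zero starting value keeps the division well defined, which is why the choice is between constants such as TCP_MSS_DEFAULT and not 0.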
Note that if a driver uses 16 KB of memory to hold a 1500-byte packet, then a 10-MSS GRO packet consumes 160 KB of memory, which is bigger than tcp_rmem[1]. TCP could decide to drop these fat packets.
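The memory arithmetic behind that remark can be sketched as follows. gro_truesize() and exceeds_rcvbuf() are hypothetical helpers, and the 131072-byte figure for tcp_rmem[1] assumes the 128KB default introduced by commit a337531b942b. A real system may be tuned differently.

```c
#include <stdbool.h>
#include <stdint.h>

/* Memory charged for a GRO packet built from 'segments' buffers. */
static uint64_t gro_truesize(uint32_t segments, uint32_t bytes_per_segment)
{
	return (uint64_t)segments * bytes_per_segment;
}

/* True when the charged memory alone exceeds the receive buffer. */
static bool exceeds_rcvbuf(uint64_t truesize, uint64_t rcvbuf)
{
	return truesize > rcvbuf;
}
```

With 10 segments at 16 * 1024 bytes each, gro_truesize() yields 163840 bytes, which exceeds an assumed 131072-byte tcp_rmem[1] even though the payload is only about 15 KB.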
I wonder whether your patch is working around a more fundamental issue. I am still unable to reproduce the problem.