[PATCH 1/2] tcp: fix races in tcp_abort()


Youngmin Nam

14 Mar 2025

9:24 a.m.

From: Eric Dumazet <edumazet@google.com>

tcp_abort() has the same issue as the one fixed in the prior patch in tcp_write_err().

commit 5ce4645c23cf5f048eb8e9ce49e514bababdee85 upstream.

To apply commit bac76cf89816bff06c4ec2f3df97dc34e150a1c4, this patch must be applied first.

In order to get consistent results from tcp_poll(), we must call sk_error_report() after tcp_done().

We can use tcp_done_with_error() to centralize this logic.

Fixes: c1e64e298b8c ("net: diag: Support destroying TCP sockets.")
Signed-off-by: Eric Dumazet <edumazet@google.com>
Acked-by: Neal Cardwell <ncardwell@google.com>
Link: https://lore.kernel.org/r/20240528125253.1966136-4-edumazet@google.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Cc: stable@vger.kernel.org # v5.10+
[youngmin: Resolved minor conflict in net/ipv4/tcp.c]
Signed-off-by: Youngmin Nam <youngmin.nam@samsung.com>
---
 net/ipv4/tcp.c | 6 +-----
 1 file changed, 1 insertion(+), 5 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 7ad82be40f34..9fe164aa185c 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -4630,13 +4630,9 @@ int tcp_abort(struct sock *sk, int err)
 	bh_lock_sock(sk);
 
 	if (!sock_flag(sk, SOCK_DEAD)) {
-		WRITE_ONCE(sk->sk_err, err);
-		/* This barrier is coupled with smp_rmb() in tcp_poll() */
-		smp_wmb();
-		sk_error_report(sk);
 		if (tcp_need_reset(sk->sk_state))
 			tcp_send_active_reset(sk, GFP_ATOMIC);
-		tcp_done(sk);
+		tcp_done_with_error(sk, err);
 	}
 
 	bh_unlock_sock(sk);
-- 
2.39.2
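
The ordering argument in this patch can be illustrated with a small user-space model. This is only a sketch, not kernel code: the poll_model_* names are hypothetical, the "poller" is simulated by snapshotting the socket at the moment the error report fires, and locking and memory barriers are left out entirely. It shows why reporting the error before the state change lets a woken poller see a socket that still looks open.

```c
#include <assert.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
enum poll_model_state { POLL_MODEL_ESTABLISHED, POLL_MODEL_CLOSE };

struct poll_model_sock {
	enum poll_model_state state;      /* models sk->sk_state */
	int err;                          /* models sk->sk_err */
	enum poll_model_state seen_state; /* state the woken poller observed */
	int seen_err;                     /* error the woken poller observed */
};

/* Stand-in for sk_error_report(): the woken poller snapshots the socket. */
static void poll_model_error_report(struct poll_model_sock *sk)
{
	sk->seen_state = sk->state;
	sk->seen_err = sk->err;
}

/* Stand-in for tcp_done(): move the socket to the closed state. */
static void poll_model_done(struct poll_model_sock *sk)
{
	sk->state = POLL_MODEL_CLOSE;
}

/* Old ordering in tcp_abort(): report first, close afterwards. */
static void poll_model_abort_old(struct poll_model_sock *sk, int err)
{
	sk->err = err;
	poll_model_error_report(sk);
	poll_model_done(sk);
}

/* Ordering described in the patch: record the error, close, then report. */
static void poll_model_abort_new(struct poll_model_sock *sk, int err)
{
	sk->err = err;
	poll_model_done(sk);
	poll_model_error_report(sk);
}

/* Exercise both orderings and check what the poller would have seen. */
static void poll_model_selfcheck(void)
{
	struct poll_model_sock a = { POLL_MODEL_ESTABLISHED, 0, POLL_MODEL_ESTABLISHED, 0 };
	struct poll_model_sock b = a;

	poll_model_abort_old(&a, 103);
	assert(a.seen_state == POLL_MODEL_ESTABLISHED); /* inconsistent view */

	poll_model_abort_new(&b, 103);
	assert(b.seen_state == POLL_MODEL_CLOSE);       /* consistent view */
	assert(b.seen_err == 103);
}
```

Calling poll_model_selfcheck() from a main() runs both orderings; in the model, only the second one lets the poller observe the error and the closed state together.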


Youngmin Nam

14 Mar

9:24 a.m.

New subject: [PATCH 2/2] tcp: fix forever orphan socket caused by tcp_abort

From: Xueming Feng <kuro@kuroa.me>

commit bac76cf89816bff06c4ec2f3df97dc34e150a1c4 upstream.

We had trouble closing zero-window FIN-WAIT-1 TCP sockets in our environment. This patch comes from the investigation.

Previously, tcp_abort() only sent a reset and called tcp_done() when the socket was not SOCK_DEAD, i.e. not an orphan. For an orphan socket, it only purged the write queue; it did not close the socket and left that to the timers.

Purging the write queue clears tp->packets_out and sk->sk_write_queue along the way. However, tcp_retransmit_timer() returns early based on !tp->packets_out, and tcp_probe_timer() returns early based on !sk->sk_write_queue.

As a result, ICSK_TIME_RETRANS and ICSK_TIME_PROBE0 are never rescheduled and the timers never kill the socket, turning a zero-window orphan into a forever orphan.
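
The two early returns described above can be modeled in user space to show why a purged orphan is never reaped. This is a rough sketch under stated assumptions, not the kernel's timer code: the orphan_model_* names, the retry limit, and the way a timer "kills" the socket are all invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define ORPHAN_MODEL_MAX_RETRIES 3 /* arbitrary limit chosen for the model */

/* Hypothetical model of the fields the commit message mentions. */
struct orphan_model_sock {
	int packets_out;     /* models tp->packets_out */
	int write_queue_len; /* models a non-empty sk->sk_write_queue */
	int retries;         /* timer firings that made progress */
	bool closed;         /* set once a timer gives up and kills the socket */
};

/* Models tcp_retransmit_timer(): returns early when nothing is in flight. */
static void orphan_model_retransmit_timer(struct orphan_model_sock *sk)
{
	if (!sk->packets_out)
		return;
	if (++sk->retries > ORPHAN_MODEL_MAX_RETRIES)
		sk->closed = true;
}

/* Models tcp_probe_timer(): returns early when the write queue is empty. */
static void orphan_model_probe_timer(struct orphan_model_sock *sk)
{
	if (!sk->write_queue_len)
		return;
	if (++sk->retries > ORPHAN_MODEL_MAX_RETRIES)
		sk->closed = true;
}

/* Models the purge: both counters the timers depend on are cleared. */
static void orphan_model_purge(struct orphan_model_sock *sk)
{
	sk->packets_out = 0;
	sk->write_queue_len = 0;
}

/* Fire both timers up to `ticks` times; report whether the socket died. */
static bool orphan_model_run_timers(struct orphan_model_sock *sk, int ticks)
{
	for (int i = 0; i < ticks && !sk->closed; i++) {
		orphan_model_retransmit_timer(sk);
		orphan_model_probe_timer(sk);
	}
	return sk->closed;
}

/* A socket with queued data is eventually killed; a purged one never is. */
static void orphan_model_selfcheck(void)
{
	struct orphan_model_sock live = { 1, 1, 0, false };
	struct orphan_model_sock purged = { 1, 1, 0, false };

	assert(orphan_model_run_timers(&live, 10));

	orphan_model_purge(&purged);
	assert(!orphan_model_run_timers(&purged, 1000));
}
```

In the model, the purged socket survives any number of timer rounds because both handlers return before doing any work, which corresponds to the "forever orphan" described above.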

This patch removes the SOCK_DEAD check in tcp_abort(), so that it sends a reset to the peer and closes the socket accordingly, preventing the timer-less orphan from occurring.

According to Lorenzo's email in the v1 thread, the check was there to prevent force-closing the same socket twice. That situation is now handled by testing for TCP_CLOSE under the socket lock and returning -ENOENT if the socket is already closed.

The -ENOENT code comes from the associated patch Lorenzo made for iproute2-ss (link attached below), which also conforms to RFC 9293.

At the end of the patch, tcp_write_queue_purge(sk) is removed because it is already called in tcp_done_with_error().

p.s. This is the same patch as v2. Resent because it was mislabeled "changes requested" on patchwork.kernel.org.

Link: https://patchwork.ozlabs.org/project/netdev/patch/1450773094-7978-3-git-send...
Fixes: c1e64e298b8c ("net: diag: Support destroying TCP sockets.")
Signed-off-by: Xueming Feng <kuro@kuroa.me>
Tested-by: Lorenzo Colitti <lorenzo@google.com>
Reviewed-by: Jason Xing <kerneljasonxing@gmail.com>
Reviewed-by: Eric Dumazet <edumazet@google.com>
Link: https://patch.msgid.link/20240826102327.1461482-1-kuro@kuroa.me
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Cc: stable@vger.kernel.org # v5.10+
Link: https://lore.kernel.org/lkml/Z9OZS%2Fhc+v5og6%2FU@perf/
[youngmin: Resolved minor conflict in net/ipv4/tcp.c]
Signed-off-by: Youngmin Nam <youngmin.nam@samsung.com>
---
 net/ipv4/tcp.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
index 9fe164aa185c..ff22060f9145 100644
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -4620,6 +4620,13 @@ int tcp_abort(struct sock *sk, int err)
 		/* Don't race with userspace socket closes such as tcp_close. */
 		lock_sock(sk);
 
+	/* Avoid closing the same socket twice. */
+	if (sk->sk_state == TCP_CLOSE) {
+		if (!has_current_bpf_ctx())
+			release_sock(sk);
+		return -ENOENT;
+	}
+
 	if (sk->sk_state == TCP_LISTEN) {
 		tcp_set_state(sk, TCP_CLOSE);
 		inet_csk_listen_stop(sk);
@@ -4629,15 +4636,12 @@ int tcp_abort(struct sock *sk, int err)
 	local_bh_disable();
 	bh_lock_sock(sk);
 
-	if (!sock_flag(sk, SOCK_DEAD)) {
-		if (tcp_need_reset(sk->sk_state))
-			tcp_send_active_reset(sk, GFP_ATOMIC);
-		tcp_done_with_error(sk, err);
-	}
+	if (tcp_need_reset(sk->sk_state))
+		tcp_send_active_reset(sk, GFP_ATOMIC);
+	tcp_done_with_error(sk, err);
 
 	bh_unlock_sock(sk);
 	local_bh_enable();
-	tcp_write_queue_purge(sk);
 	if (!has_current_bpf_ctx())
 		release_sock(sk);
 	return 0;
-- 
2.39.2
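
The behavioral change in this patch can be summarized with a simplified user-space model of the two control flows. It is a sketch only: the abort_model_* names are hypothetical, locking and the TCP_LISTEN branch are omitted, and the model assumes a reset is needed for every non-closed state, which is a simplification of tcp_need_reset().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical model; these types do not exist in the kernel. */
enum abort_model_state { ABORT_MODEL_FIN_WAIT1, ABORT_MODEL_CLOSE };

struct abort_model_sock {
	enum abort_model_state state; /* models sk->sk_state */
	bool dead;                    /* models SOCK_DEAD, i.e. an orphan */
	int err;                      /* models sk->sk_err */
	int resets_sent;              /* counts simulated RSTs */
};

/* Shared tail: send a reset and close the socket with an error. */
static void abort_model_reset_and_close(struct abort_model_sock *sk, int err)
{
	sk->resets_sent++;
	sk->err = err;
	sk->state = ABORT_MODEL_CLOSE;
}

/* Before the patch: orphans are skipped and left for the timers. */
static int abort_model_old(struct abort_model_sock *sk, int err)
{
	if (!sk->dead)
		abort_model_reset_and_close(sk, err);
	return 0;
}

/* After the patch: an already-closed socket is refused, all others close. */
static int abort_model_new(struct abort_model_sock *sk, int err)
{
	if (sk->state == ABORT_MODEL_CLOSE)
		return -ENOENT; /* avoid closing the same socket twice */
	abort_model_reset_and_close(sk, err);
	return 0;
}

/* Compare both flows on an orphaned FIN-WAIT-1 socket. */
static void abort_model_selfcheck(void)
{
	struct abort_model_sock a = { ABORT_MODEL_FIN_WAIT1, true, 0, 0 };
	struct abort_model_sock b = a;

	assert(abort_model_old(&a, ECONNABORTED) == 0);
	assert(a.state == ABORT_MODEL_FIN_WAIT1); /* orphan left open */

	assert(abort_model_new(&b, ECONNABORTED) == 0);
	assert(b.state == ABORT_MODEL_CLOSE);
	assert(abort_model_new(&b, ECONNABORTED) == -ENOENT);
	assert(b.resets_sent == 1);               /* second call did nothing */
}
```

The TCP_CLOSE test takes over the job the SOCK_DEAD check used to do: in the model, a second abort on the same socket returns -ENOENT without sending another reset.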

Greg KH

12:24 p.m.

New subject: [PATCH 2/2] tcp: fix forever orphan socket caused by tcp_abort

On Fri, Mar 14, 2025 at 06:24:46PM +0900, Youngmin Nam wrote:

...


Does not apply to 6.1.y or older, what did you want this applied to?

thanks,

greg k-h

Youngmin Nam

17 Mar

4:32 a.m.

New subject: [PATCH 2/2] tcp: fix forever orphan socket caused by tcp_abort

On Fri, Mar 14, 2025 at 01:24:26PM +0100, Greg KH wrote:

...


> Does not apply to 6.1.y or older, what did you want this applied to?
>
> thanks,
>
> greg k-h

Hi Greg,

Sorry about that. Let me resend these patches for 6.1 and 5.15.

As for 5.10, it seems to have more dependencies for the backport. I think the maintainer should handle it to ensure a safe backport.

Greg KH

14 Mar

12:24 p.m.

On Fri, Mar 14, 2025 at 06:24:45PM +0900, Youngmin Nam wrote:

...


Did not apply to 5.10.y, what did you want this added to?

thanks,

greg k-h

Youngmin Nam

17 Mar

4:36 a.m.

On Fri, Mar 14, 2025 at 01:24:09PM +0100, Greg KH wrote:

...


> Did not apply to 5.10.y, what did you want this added to?
>
> thanks,
>
> greg k-h

Hi Greg,

Sorry about that.

As for 5.10, it seems to have more dependencies for the backport. I think the maintainer should handle it to ensure a safe backport.

Sasha Levin

14 Mar

11:10 p.m.

[ Sasha's backport helper bot ]

Hi,

Summary of potential issues: ❌ Build failures detected

The upstream commit SHA1 provided is correct: 5ce4645c23cf5f048eb8e9ce49e514bababdee85

WARNING: Author mismatch between patch and upstream commit:
Backport author: Youngmin Nam <youngmin.nam@samsung.com>
Commit author: Eric Dumazet <edumazet@google.com>

Note: The patch differs from the upstream commit:
---
Failed to apply patch cleanly.
---

Results of testing on various branches:

| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-6.13.y       | Failed      | N/A        |
| stable/linux-6.12.y       | Failed      | N/A        |
| stable/linux-6.6.y        | Success     | Success    |
| stable/linux-6.1.y        | Success     | Success    |
| stable/linux-5.15.y       | Success     | Success    |
| stable/linux-5.10.y       | Failed      | N/A        |
| stable/linux-5.4.y        | Failed      | N/A        |

Build Errors:
Patch failed to apply on stable/linux-6.13.y. Reject:

diff a/net/ipv4/tcp.c b/net/ipv4/tcp.c	(rejected hunks)
@@ -4630,13 +4630,9 @@ int tcp_abort(struct sock *sk, int err)
 	bh_lock_sock(sk);

 	bh_unlock_sock(sk);

Patch failed to apply on stable/linux-6.12.y. Reject:

diff a/net/ipv4/tcp.c b/net/ipv4/tcp.c	(rejected hunks)
@@ -4630,13 +4630,9 @@ int tcp_abort(struct sock *sk, int err)
 	bh_lock_sock(sk);

 	bh_unlock_sock(sk);

Patch failed to apply on stable/linux-5.10.y. Reject:

diff a/net/ipv4/tcp.c b/net/ipv4/tcp.c	(rejected hunks)
@@ -4630,13 +4630,9 @@ int tcp_abort(struct sock *sk, int err)
 	bh_lock_sock(sk);

 	bh_unlock_sock(sk);

Patch failed to apply on stable/linux-5.4.y. Reject:

diff a/net/ipv4/tcp.c b/net/ipv4/tcp.c	(rejected hunks)
@@ -4630,13 +4630,9 @@ int tcp_abort(struct sock *sk, int err)
 	bh_lock_sock(sk);

 	bh_unlock_sock(sk);


linux-stable-mirror@lists.linaro.org
