On Mon, Feb 3, 2020 at 7:40 AM David Laight <David.Laight@aculab.com> wrote:
From: Eric Dumazet
Sent: 31 January 2020 22:54
On 1/31/20 2:11 PM, Neal Cardwell wrote:
I looked into fixing this, but my quick reading of the Linux tcp_rcv_state_process() code is that it should behave correctly and that a connection in FIN_WAIT_1 that receives a FIN/ACK should move to TIME_WAIT.
SeongJae, do you happen to have a tcpdump trace of the problematic sequence where the "process A" ends up in FIN_WAIT_2 when it should be in TIME_WAIT?
If I have time I will try to construct a packetdrill case to verify the behavior in this case.
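For reference, the transition described above can be written as a small state function. This is only a simplified model of the closing states in the RFC 793 state diagram, not the logic in tcp_rcv_state_process(); the function name and its boolean arguments are hypothetical and chosen for illustration.

```python
def next_state(state, fin, acks_our_fin):
    """Return the next TCP state after one incoming segment.

    Simplified model of the RFC 793 closing states only.
    `fin` is True if the segment carries FIN; `acks_our_fin` is True if
    the segment acknowledges the FIN we sent. Both names are hypothetical.
    """
    if state == "FIN_WAIT_1":
        if fin and acks_our_fin:
            return "TIME_WAIT"    # FIN/ACK in a single segment
        if fin:
            return "CLOSING"      # simultaneous close, our FIN not yet ACKed
        if acks_our_fin:
            return "FIN_WAIT_2"   # plain ACK of our FIN, peer FIN still pending
    elif state == "FIN_WAIT_2":
        if fin:
            return "TIME_WAIT"
    elif state == "CLOSING":
        if acks_our_fin:
            return "TIME_WAIT"
    return state                  # anything else leaves the state unchanged


# FIN/ACK arriving in FIN_WAIT_1 goes straight to TIME_WAIT.
print(next_state("FIN_WAIT_1", fin=True, acks_our_fin=True))
```

In this model a separate ACK followed by a FIN also ends in TIME_WAIT, via FIN_WAIT_2. The model handles one segment at a time, so it cannot express two packets being processed concurrently.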
Unfortunately you won't be able to reproduce the issue with packetdrill, since it involves packets being processed at the same time (a race window).
You might be able to force the timing race by adding a sleep in one of the code paths.
No good for a regression test, but OK for code testing.
Please take a look at packetdrill: it has no way to send more than one packet at a time.
Even if we modified packetdrill to feed packets to its tun device from multiple threads, the race window is tiny and you would have to run the packetdrill test thousands of times to trigger the race even once.
By contrast, the test SeongJae provided uses two threads and the regular TCP stack over the loopback interface, and it triggers the race more reliably.
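To illustrate the general shape of a two-thread loopback test, here is a minimal Python sketch. It is not SeongJae's reproducer: the function name, the iteration count, and the barrier that lines up the two close() calls are all assumptions made for illustration. It only exercises connection teardown from both ends at about the same time; it does not detect a stuck connection, so socket states would have to be inspected separately (for example with ss).

```python
import socket
import threading


def run_close_cycles(iterations=50):
    """Open and close loopback TCP connections from two threads.

    Hypothetical helper: returns how many connections the server
    thread accepted.
    """
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))      # port 0: the kernel picks a free port
    listener.listen(16)
    listener.settimeout(5.0)
    port = listener.getsockname()[1]

    barrier = threading.Barrier(2)       # lines up the two close() calls
    accepted = []

    def server():
        for _ in range(iterations):
            conn, _addr = listener.accept()
            accepted.append(1)
            barrier.wait(timeout=5.0)
            conn.close()                 # close at about the same time as the client

    def client():
        for _ in range(iterations):
            sock = socket.create_connection(("127.0.0.1", port), timeout=5.0)
            barrier.wait(timeout=5.0)
            sock.close()

    threads = [threading.Thread(target=server), threading.Thread(target=client)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    listener.close()
    return len(accepted)


if __name__ == "__main__":
    print(run_close_cycles())
```

Because both ends go through the regular TCP stack, the kernel is free to process the resulting segments on different CPUs, which a single-threaded packet injector cannot arrange.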