On Wed, Jul 28, 2021 at 09:32:42 -0700 Yuchung Cheng ycheng@google.com wrote:
On the other hand maybe we do not hear middlebox issues because this mechanism is working. So I am okay to avoid applying to stable and keep in net-next to test this new policy.
This change did indeed break our mail servers at Wikimedia, causing difficult-to-diagnose timeout errors when sending outgoing email. I resorted to bisecting the kernel, which led me to this commit. I have verified that setting the tcp_fastopen_blackhole_timeout_sec sysctl back to 3600 resolves the timeouts.
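For anyone hitting the same symptoms, here is a minimal sketch of how the value can be checked and put back to 3600 at runtime. It assumes the sysctl is exposed at the usual procfs path and that the script runs as root; the helper names read_timeout and restore_timeout are my own and purely illustrative.

```python
from pathlib import Path

# Usual procfs location of the sysctl discussed above (assumed to be
# present on the running kernel).
SYSCTL_PATH = Path("/proc/sys/net/ipv4/tcp_fastopen_blackhole_timeout_sec")


def read_timeout(path: Path = SYSCTL_PATH) -> int:
    """Return the current blackhole timeout in seconds."""
    return int(path.read_text().strip())


def restore_timeout(seconds: int = 3600, path: Path = SYSCTL_PATH) -> int:
    """Set the timeout to `seconds` if it differs; return the previous value.

    Writing to the real procfs path requires root.
    """
    previous = read_timeout(path)
    if previous != seconds:
        path.write_text(f"{seconds}\n")
    return previous


# Example use (as root): print("previous value:", restore_timeout())
```

A value written this way does not survive a reboot, so a persistent setting would still have to go into the system's sysctl configuration.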
Given that it is not clear how a user would discover that this sysctl default has changed, or know how to fix a middlebox somewhere on the path to their destination, I would love to see this change reverted.
Yours kindly, Jesse Hathaway