On Sat, Dec 5, 2020 at 1:03 PM Mohamed Abuelfotoh, Hazem <abuehaze@amazon.com> wrote:
Unfortunately, a few things are missing in this report.
What is the RTT between hosts in your test ?

>>>>> RTT in my test is 162 msec, but I am able to reproduce it with lower RTTs. For example, I could see the issue when downloading from a Google endpoint with an RTT of 16.7 msec. As mentioned in my previous e-mail, the issue is reproducible whenever the RTT exceeds 12 msec, given that the sender is using bbr.

RTT between the hosts where I run the iperf test:

# ping 54.199.163.187
PING 54.199.163.187 (54.199.163.187) 56(84) bytes of data.
64 bytes from 54.199.163.187: icmp_seq=1 ttl=33 time=162 ms
64 bytes from 54.199.163.187: icmp_seq=2 ttl=33 time=162 ms
64 bytes from 54.199.163.187: icmp_seq=3 ttl=33 time=162 ms
64 bytes from 54.199.163.187: icmp_seq=4 ttl=33 time=162 ms

RTT between my EC2 instances and the Google endpoint:

# ping 172.217.4.240
PING 172.217.4.240 (172.217.4.240) 56(84) bytes of data.
64 bytes from 172.217.4.240: icmp_seq=1 ttl=101 time=16.7 ms
64 bytes from 172.217.4.240: icmp_seq=2 ttl=101 time=16.7 ms
64 bytes from 172.217.4.240: icmp_seq=3 ttl=101 time=16.7 ms
64 bytes from 172.217.4.240: icmp_seq=4 ttl=101 time=16.7 ms

What driver is used at the receiving side ?

>>>>>> I am using ENA driver version 2.2.10g on the receiver, with scatter-gather enabled.

# ethtool -k eth0 | grep scatter-gather
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: off [fixed]
This ethtool output refers to TX scatter gather, which is not relevant for this bug.
I see the ENA driver might use 16 KB per incoming packet (if ENA_PAGE_SIZE is 16 KB).
Since I cannot reproduce this problem with another NIC on x86, I really wonder whether this is perhaps an issue with the ENA driver on PowerPC ?
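
To illustrate why a 16 KB per-packet buffer could matter, here is a rough back-of-the-envelope sketch. It assumes the socket receive buffer is charged for the whole driver buffer rather than only for the payload it carries. The payload size (1448 bytes) and the receive buffer size (128 KiB) are hypothetical values chosen for illustration; neither comes from this report.

```python
# Rough sketch: how much payload fits in a receive buffer when every
# incoming packet is charged for a full driver buffer.

ENA_BUFFER_BYTES = 16 * 1024    # per-packet buffer size mentioned above
ASSUMED_PAYLOAD_BYTES = 1448    # assumption: TCP payload of one 1500-byte MTU segment
ASSUMED_RCVBUF_BYTES = 128 * 1024  # assumption: hypothetical receive buffer size


def packets_that_fit(rcvbuf_bytes: int, per_packet_bytes: int) -> int:
    """Number of packets that can be queued before the buffer is exhausted."""
    return rcvbuf_bytes // per_packet_bytes


def payload_efficiency(payload_bytes: int, per_packet_bytes: int) -> float:
    """Fraction of the charged memory that is actual payload."""
    return payload_bytes / per_packet_bytes


if __name__ == "__main__":
    n = packets_that_fit(ASSUMED_RCVBUF_BYTES, ENA_BUFFER_BYTES)
    eff = payload_efficiency(ASSUMED_PAYLOAD_BYTES, ENA_BUFFER_BYTES)
    print(f"packets queued before buffer is full: {n}")
    print(f"payload held at that point: {n * ASSUMED_PAYLOAD_BYTES} bytes")
    print(f"payload / charged memory: {eff:.1%}")
```

Under these assumptions only a small fraction of the charged memory is payload, so the usable window would be far below the nominal buffer size. On a long-RTT path that would cap throughput unless the buffer grows.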