On Wed, 2024-01-31 at 10:06 -0500, Willem de Bruijn wrote:
Otherwise I'll start with the gro and so-txtime tests. They may not be so easily calibrated, as we cannot control the GRO timeout or the FQ max horizon.
Note that we can control the GRO timeout to some extent, via gro_flush_timeout, see commit 89abe628375301fedb68770644df845d49018d8b.
Unfortunately that is not enough for the 'large' gro tests. I think the root cause is that the process sending the packets can be de-scheduled - even the whole qemu VM can be, from the hypervisor CPU - causing an extremely large gap between consecutive packets.
I guess/hope that replacing multiple sendmsg() calls with a single sendmmsg() could improve the scenario a bit, but I fear it will not solve the issue completely.
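To make the sendmmsg() idea concrete, here is a minimal sketch of handing a whole burst to the kernel in one syscall, so the sender has fewer points at which it can be de-scheduled between packets. It is not the code in the gro selftest: send_batch() and BATCH_MAX are hypothetical names, the socket is assumed to be a connected datagram socket, and every message reuses the same payload.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* sendmmsg() and struct mmsghdr are Linux/GNU extensions */
#endif
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/uio.h>

#define BATCH_MAX 64 /* hypothetical cap, chosen only for this sketch */

/*
 * Queue 'count' copies of the same payload with a single sendmmsg() call.
 * 'fd' is assumed to be a connected datagram socket, so no destination
 * address is attached to the messages.
 * Returns the number of messages the kernel accepted, or -1 on error.
 */
static int send_batch(int fd, const void *payload, size_t len,
                      unsigned int count)
{
    struct mmsghdr msgs[BATCH_MAX];
    struct iovec iov;
    unsigned int i;

    assert(count <= BATCH_MAX);

    /* One iovec shared by all messages: the payload is only read. */
    iov.iov_base = (void *)payload;
    iov.iov_len = len;

    memset(msgs, 0, sizeof(msgs));
    for (i = 0; i < count; i++) {
        msgs[i].msg_hdr.msg_iov = &iov;
        msgs[i].msg_hdr.msg_iovlen = 1;
    }

    return sendmmsg(fd, msgs, count, 0);
}
```

The return value can be smaller than 'count', so a caller that needs the full burst has to check it. This also only narrows the window on the sender side: if the whole VM is de-scheduled, the gap can still appear.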
In such cases we can use the environment variable to either skip the test entirely or --my preference-- run it to get code coverage, but suppress a failure if due to timing (only). Sounds good?
Sounds good to me! I was wondering about skipping only the 'large' test, but suppressing the failure for that test alone when KSFT_MACHINE_SLOW=yes looks like a better option.
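The policy agreed on above can be sketched as follows: the test always runs, and a failure is forgiven only when it is timing related and the runner has set KSFT_MACHINE_SLOW=yes. The helper names and the 'timing_failure' flag are hypothetical, and whether the check ends up in the C test or in its wrapper script is left open here.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* True when the runner declares the machine slow via KSFT_MACHINE_SLOW=yes. */
static bool machine_is_slow(void)
{
    const char *val = getenv("KSFT_MACHINE_SLOW");

    return val && strcmp(val, "yes") == 0;
}

/*
 * Map a test outcome to an exit status.
 * 'timing_failure' is a hypothetical flag the test would set when the
 * failure can be explained by timing alone (e.g. packets not coalesced
 * because of a large gap between them).
 */
static int test_exit_status(bool passed, bool timing_failure)
{
    if (passed)
        return 0;
    /* Still ran the code for coverage; only the verdict is relaxed. */
    if (timing_failure && machine_is_slow())
        return 0;
    return 1;
}
```

Failures that are not timing related still fail on slow machines, which is what separates this from skipping the test.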
Thanks!
Paolo