Hello Jakub,
On Thu, Jul 10, 2025 at 06:45:35PM -0700, Jakub Kicinski wrote:
+MAX_WRITES: int = 40
FWIW the test takes 25sec on our debug-heavy VMs right now. I think we can crank the writes quite a bit.. ?
Sure. I will increase it to 40. On my VMs I get more than 30 hits every single time:
2025-07-11 08:30:08,904 - DEBUG - BPFtrace output: {'hits': 30} 2025-07-11 08:30:08,904 - DEBUG - MAPS coming from bpftrace = {'hits': 30}
+def ethtool_read_rx_tx_queue(interface_name: str) -> tuple[int, int]:
- """
- Read the number of RX and TX queues using ethtool. This will be used
- to restore it after the test
- """
- rx_queue = 0
- tx_queue = 0
- try:
ethtool_result = ethtool(f"-g {interface_name}").stdout
json=True please and you'll get a dict, on CLI you can try:
ethtool --json -g eth0
Sure. I was parsing manually because some options do not have --json format.
ethtool --json -l eth0 ethtool: bad command line argument(s)
I haven't checked upstream, but, if this feature is upstream, is it worth implementing it?
ethtool(f"-G {interface_name} rx {rx_val} tx {tx_val}")
This is setting _ring size_ not queue count. I suppose we want both, this and queue count to 1 (with ethtool -l / -L) The ring size of 1 is unlikely to work on real devices. I'd try setting it to 128 and 256 and if neither sticks just carry on with whatever was there.
Thanks. I creating a function to do it. Something as:
def configure_network(ifname: str) -> None: """Configure ring size and queue numbers"""
# Set defined queues to 1 to force congestion prev_queues = ethtool_get_queues_cnt(ifname) logging.debug("RX/TX/combined queues: %s", prev_queues) # Only set the queues to 1 if they exists in the device. I.e, they are > 0 ethtool_set_queues_cnt(ifname, tuple(1 if x > 0 else x for x in prev_queues)) defer(ethtool_set_queues_cnt, ifname, prev_queues)
# Try to set the ring size to some low value. # Do not fail if the hardware do not accepted desired values prev_ring_size = ethtool_get_ringsize(ifname) for size in [(1, 1), (128, 128), (256, 256)]: if ethtool_set_ringsize(ifname, size): # hardware accepted the desired ringsize logging.debug("Set RX/TX ringsize to: %s from %s", size, prev_ring_size) break defer(ethtool_set_ringsize, ifname, prev_ring_size)
"remote_ip": (
cfg.remote_addr_v["4"] if cfg.addr_ipver == "4" else cfg.remote_addr_v["6"]
),
this is already done for you cfg.addr is either v4 or v6 depending on what was provided in the env
Ack!
+# toggle the interface up and down, to cause some congestion
Let's not do this, you're missing disruptive annotation and for many drivers NAPI is stopped before queues https://github.com/linux-netdev/nipa/wiki/Guidance-for-test-authors#ksft_dis...
Sure. This is not needed to reproduce the issue. I just put it in there in order to create more entropy. Anyway, removing it.
+def main() -> None:
- """Main function to run the test"""
- netcons_load_module()
- test_check_dependencies()
- with NetDrvEpEnv(__file__, nsim_test=True) as cfg:
I think nsim_test=True will make the test run _only_ on netdevsim. But there's nothing netdevsim specific here right? You can remove the argument and let's have this run against real drivers, too?
Sure. that is our goal, by the end of the day. Being able to run it on real hardware.
Thanks of the review. I will send an updated version soon, and we can continue the discusion over there. --breno