Add a basic selftest for the netpoll polling mechanism, specifically targeting the netpoll poll() side.
The test creates a scenario where network transmission is running at maximum sppend, and netpoll needs to poll the NIC. This is achieved by:
1. Configuring a single RX/TX queue to create contention 2. Generating background traffic to saturate the interface 3. Sending netconsole messages to trigger netpoll polling 4. Using dynamic netconsole targets via configfs
The test validates a critical netpoll code path by monitoring traffic flow and ensuring netpoll_poll_dev() is called when the normal TX path is blocked. Perf probing confirms this test successfully triggers netpoll_poll_dev() in typical test runs.
This addresses a gap in netpoll test coverage for a path that is tricky for the network stack.
Signed-off-by: Breno Leitao leitao@debian.org --- Sending as an RFC for your appreciation, but it dpends on [1] which is stil under review. Once [1] lands, I will send this officially.
Link: https://lore.kernel.org/all/20250611-netdevsim_stat-v1-0-c11b657d96bf@debian... [1] --- tools/testing/selftests/drivers/net/Makefile | 1 + .../testing/selftests/drivers/net/netpoll_basic.py | 201 +++++++++++++++++++++ 2 files changed, 202 insertions(+)
diff --git a/tools/testing/selftests/drivers/net/Makefile b/tools/testing/selftests/drivers/net/Makefile index be780bcb73a3b..70d6e3a920b7f 100644 --- a/tools/testing/selftests/drivers/net/Makefile +++ b/tools/testing/selftests/drivers/net/Makefile @@ -15,6 +15,7 @@ TEST_PROGS := \ netcons_fragmented_msg.sh \ netcons_overflow.sh \ netcons_sysdata.sh \ + netpoll_basic.py \ ping.py \ queues.py \ stats.py \ diff --git a/tools/testing/selftests/drivers/net/netpoll_basic.py b/tools/testing/selftests/drivers/net/netpoll_basic.py new file mode 100755 index 0000000000000..8abdfb2b1eb6e --- /dev/null +++ b/tools/testing/selftests/drivers/net/netpoll_basic.py @@ -0,0 +1,201 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0 + +# This test aims to evaluate the netpoll polling mechanism (as in netpoll_poll_dev()). +# It presents a complex scenario where the network attempts to send a packet but fails, +# prompting it to poll the NIC from within the netpoll TX side. +# +# This has been a crucial path in netpoll that was previously untested. Jakub +# suggested using a single RX/TX queue, pushing traffic to the NIC, and then sending +# netpoll messages (via netconsole) to trigger the poll. `perf` probing of netpoll_poll_dev() +# showed that this test indeed triggers netpoll_poll_dev() once or twice in 10 iterations. + +# Author: Breno Leitao leitao@debian.org + +import errno +import os +import random +import string +import time + +from lib.py import ( + ethtool, + GenerateTraffic, + ksft_exit, + ksft_pr, + ksft_run, + KsftFailEx, + KsftSkipEx, + NetdevFamily, + NetDrvEpEnv, +) + +NETCONSOLE_CONFIGFS_PATH = "/sys/kernel/config/netconsole" +REMOTE_PORT = 6666 +LOCAL_PORT = 1514 +# Number of netcons messages to send. I usually see netpoll_poll_dev() +# being called at least once in 10 iterations. +ITERATIONS = 10 +DEBUG = False + + +def generate_random_netcons_name() -> str: + """Generate a random name starting with 'netcons'""" + random_suffix = "".join(random.choices(string.ascii_lowercase + string.digits, k=8)) + return f"netcons_{random_suffix}" + + +def get_stats(cfg: NetDrvEpEnv, netdevnl: NetdevFamily) -> dict[str, int]: + """Get the statistics for the interface""" + return netdevnl.qstats_get({"ifindex": cfg.ifindex}, dump=True)[0] + + +def set_single_rx_tx_queue(interface_name: str) -> None: + """Set the number of RX and TX queues to 1 using ethtool""" + try: + # This don't need to be reverted, since interfaces will be deleted after test + ethtool(f"-G {interface_name} rx 1 tx 1") + except Exception as e: + raise KsftSkipEx( + f"Failed to configure RX/TX queues: {e}. Ethtool not available?" + ) + + +def create_netconsole_target( + config_data: dict[str, str], + target_name: str, +) -> None: + """Create a netconsole dynamic target against the interfaces""" + ksft_pr(f"Using netconsole name: {target_name}") + try: + ksft_pr(f"Created target directory: {NETCONSOLE_CONFIGFS_PATH}/{target_name}") + os.makedirs(f"{NETCONSOLE_CONFIGFS_PATH}/{target_name}", exist_ok=True) + except OSError as e: + if e.errno != errno.EEXIST: + raise KsftFailEx(f"Failed to create netconsole target directory: {e}") + + try: + for key, value in config_data.items(): + if DEBUG: + ksft_pr(f"Setting {key} to {value}") + with open( + f"{NETCONSOLE_CONFIGFS_PATH}/{target_name}/{key}", + "w", + encoding="utf-8", + ) as f: + # Always convert to string to write to file + f.write(str(value)) + f.close() + + if DEBUG: + # Read all configuration values for debugging + for debug_key in config_data.keys(): + with open( + f"{NETCONSOLE_CONFIGFS_PATH}/{target_name}/{debug_key}", + "r", + encoding="utf-8", + ) as f: + content = f.read() + ksft_pr( + f"{NETCONSOLE_CONFIGFS_PATH}/{target_name}/{debug_key} {content}" + ) + + except Exception as e: + raise KsftFailEx(f"Failed to configure netconsole target: {e}") + + +def set_netconsole(cfg: NetDrvEpEnv, interface_name: str, target_name: str) -> None: + """Configure netconsole on the interface with the given target name""" + config_data = { + "extended": "1", + "dev_name": interface_name, + "local_port": LOCAL_PORT, + "remote_port": REMOTE_PORT, + "local_ip": cfg.addr_v["4"] if cfg.addr_ipver == "4" else cfg.addr_v["6"], + "remote_ip": ( + cfg.remote_addr_v["4"] if cfg.addr_ipver == "4" else cfg.remote_addr_v["6"] + ), + "remote_mac": "00:00:00:00:00:00", # Not important for this test + "enabled": "1", + } + + create_netconsole_target(config_data, target_name) + ksft_pr(f"Created netconsole target: {target_name} on interface {interface_name}") + + +def delete_netconsole_target(name: str) -> None: + """Delete a netconsole dynamic target""" + target_path = f"{NETCONSOLE_CONFIGFS_PATH}/{name}" + try: + if os.path.exists(target_path): + os.rmdir(target_path) + except OSError as e: + raise KsftFailEx(f"Failed to delete netconsole target: {e}") + + +def check_traffic_flowing(cfg: NetDrvEpEnv, netdevnl: NetdevFamily) -> int: + """Check if traffic is flowing on the interface""" + stat1 = get_stats(cfg, netdevnl) + time.sleep(1) + stat2 = get_stats(cfg, netdevnl) + pkts_per_sec = stat2["rx-packets"] - stat1["rx-packets"] + # Just make sure this will not fail even in slow/debug kernels + if pkts_per_sec < 10: + raise KsftFailEx(f"Traffic seems low: {pkts_per_sec}") + if DEBUG: + ksft_pr(f"Traffic per second {pkts_per_sec} ", pkts_per_sec) + + return pkts_per_sec + + +def do_netpoll_flush(cfg: NetDrvEpEnv, netdevnl: NetdevFamily) -> None: + """Print messages to the console, trying to trigger a netpoll poll""" + for i in range(int(ITERATIONS)): + pkts_per_s = check_traffic_flowing(cfg, netdevnl) + with open("/dev/kmsg", "w", encoding="utf-8") as kmsg: + kmsg.write(f"netcons test #{i}: ({pkts_per_s} packets/s)\n") + + +def test_netpoll(cfg: NetDrvEpEnv, netdevnl: NetdevFamily) -> None: + """Test netpoll by sending traffic to the interface and then sending netconsole messages to trigger a poll""" + target_name = generate_random_netcons_name() + ifname = cfg.dev["ifname"] + traffic = None + + try: + set_single_rx_tx_queue(ifname) + traffic = GenerateTraffic(cfg) + check_traffic_flowing(cfg, netdevnl) + set_netconsole(cfg, ifname, target_name) + do_netpoll_flush(cfg, netdevnl) + finally: + if traffic: + traffic.stop() + delete_netconsole_target(target_name) + + +def check_dependencies() -> None: + """Check if the dependencies are met""" + if not os.path.exists(NETCONSOLE_CONFIGFS_PATH): + raise KsftSkipEx( + f"Directory {NETCONSOLE_CONFIGFS_PATH} does not exist. CONFIG_NETCONSOLE_DYNAMIC might not be set." + ) + + +def main() -> None: + """Main function to run the test""" + check_dependencies() + netdevnl = NetdevFamily() + with NetDrvEpEnv(__file__, nsim_test=True) as cfg: + ksft_run( + [test_netpoll], + args=( + cfg, + netdevnl, + ), + ) + ksft_exit() + + +if __name__ == "__main__": + main()
--- base-commit: 5d6d67c4cb10a4b4d3ae35758d5eeed6239afdc8 change-id: 20250612-netpoll_test-a1324d2057c8
Best regards,
Breno Leitao wrote:
Add a basic selftest for the netpoll polling mechanism, specifically targeting the netpoll poll() side.
The test creates a scenario where network transmission is running at maximum sppend, and netpoll needs to poll the NIC. This is achieved by:
minor type: sppend/speed
- Configuring a single RX/TX queue to create contention
- Generating background traffic to saturate the interface
- Sending netconsole messages to trigger netpoll polling
- Using dynamic netconsole targets via configfs
The test validates a critical netpoll code path by monitoring traffic flow and ensuring netpoll_poll_dev() is called when the normal TX path is blocked. Perf probing confirms this test successfully triggers netpoll_poll_dev() in typical test runs.
So the test needs profiling to make it a pass/fail regression test? Then perhaps add it to TEST_FILES rather than TEST_PROGS. Unless exercising the code on its own is valuable enough.
Or is there another way that the packets could be observed, e.g., counters.
This addresses a gap in netpoll test coverage for a path that is tricky for the network stack.
Signed-off-by: Breno Leitao leitao@debian.org
Sending as an RFC for your appreciation, but it dpends on [1] which is stil under review. Once [1] lands, I will send this officially.
Link: https://lore.kernel.org/all/20250611-netdevsim_stat-v1-0-c11b657d96bf@debian... [1]
tools/testing/selftests/drivers/net/Makefile | 1 + .../testing/selftests/drivers/net/netpoll_basic.py | 201 +++++++++++++++++++++ 2 files changed, 202 insertions(+)
diff --git a/tools/testing/selftests/drivers/net/Makefile b/tools/testing/selftests/drivers/net/Makefile index be780bcb73a3b..70d6e3a920b7f 100644 --- a/tools/testing/selftests/drivers/net/Makefile +++ b/tools/testing/selftests/drivers/net/Makefile @@ -15,6 +15,7 @@ TEST_PROGS := \ netcons_fragmented_msg.sh \ netcons_overflow.sh \ netcons_sysdata.sh \
- netpoll_basic.py \ ping.py \ queues.py \ stats.py \
diff --git a/tools/testing/selftests/drivers/net/netpoll_basic.py b/tools/testing/selftests/drivers/net/netpoll_basic.py new file mode 100755 index 0000000000000..8abdfb2b1eb6e --- /dev/null +++ b/tools/testing/selftests/drivers/net/netpoll_basic.py @@ -0,0 +1,201 @@ +#!/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0
+# This test aims to evaluate the netpoll polling mechanism (as in netpoll_poll_dev()). +# It presents a complex scenario where the network attempts to send a packet but fails, +# prompting it to poll the NIC from within the netpoll TX side. +# +# This has been a crucial path in netpoll that was previously untested. Jakub +# suggested using a single RX/TX queue, pushing traffic to the NIC, and then sending +# netpoll messages (via netconsole) to trigger the poll. `perf` probing of netpoll_poll_dev() +# showed that this test indeed triggers netpoll_poll_dev() once or twice in 10 iterations.
+# Author: Breno Leitao leitao@debian.org
+import errno +import os +import random +import string +import time
+from lib.py import (
- ethtool,
- GenerateTraffic,
- ksft_exit,
- ksft_pr,
- ksft_run,
- KsftFailEx,
- KsftSkipEx,
- NetdevFamily,
- NetDrvEpEnv,
+)
+NETCONSOLE_CONFIGFS_PATH = "/sys/kernel/config/netconsole" +REMOTE_PORT = 6666 +LOCAL_PORT = 1514 +# Number of netcons messages to send. I usually see netpoll_poll_dev() +# being called at least once in 10 iterations. +ITERATIONS = 10
Is usually sufficient to avoid flakiness, or should this be cranked up?
+DEBUG = False
+def generate_random_netcons_name() -> str:
- """Generate a random name starting with 'netcons'"""
- random_suffix = "".join(random.choices(string.ascii_lowercase + string.digits, k=8))
- return f"netcons_{random_suffix}"
+def get_stats(cfg: NetDrvEpEnv, netdevnl: NetdevFamily) -> dict[str, int]:
- """Get the statistics for the interface"""
- return netdevnl.qstats_get({"ifindex": cfg.ifindex}, dump=True)[0]
+def set_single_rx_tx_queue(interface_name: str) -> None:
- """Set the number of RX and TX queues to 1 using ethtool"""
- try:
# This don't need to be reverted, since interfaces will be deleted after test
ethtool(f"-G {interface_name} rx 1 tx 1")
- except Exception as e:
raise KsftSkipEx(
f"Failed to configure RX/TX queues: {e}. Ethtool not available?"
)
+def create_netconsole_target(
- config_data: dict[str, str],
- target_name: str,
+) -> None:
- """Create a netconsole dynamic target against the interfaces"""
- ksft_pr(f"Using netconsole name: {target_name}")
- try:
ksft_pr(f"Created target directory: {NETCONSOLE_CONFIGFS_PATH}/{target_name}")
os.makedirs(f"{NETCONSOLE_CONFIGFS_PATH}/{target_name}", exist_ok=True)
- except OSError as e:
if e.errno != errno.EEXIST:
raise KsftFailEx(f"Failed to create netconsole target directory: {e}")
- try:
for key, value in config_data.items():
if DEBUG:
ksft_pr(f"Setting {key} to {value}")
with open(
f"{NETCONSOLE_CONFIGFS_PATH}/{target_name}/{key}",
"w",
encoding="utf-8",
) as f:
# Always convert to string to write to file
f.write(str(value))
f.close()
if DEBUG:
# Read all configuration values for debugging
for debug_key in config_data.keys():
with open(
f"{NETCONSOLE_CONFIGFS_PATH}/{target_name}/{debug_key}",
"r",
encoding="utf-8",
) as f:
content = f.read()
ksft_pr(
f"{NETCONSOLE_CONFIGFS_PATH}/{target_name}/{debug_key} {content}"
)
- except Exception as e:
raise KsftFailEx(f"Failed to configure netconsole target: {e}")
+def set_netconsole(cfg: NetDrvEpEnv, interface_name: str, target_name: str) -> None:
- """Configure netconsole on the interface with the given target name"""
- config_data = {
"extended": "1",
"dev_name": interface_name,
"local_port": LOCAL_PORT,
"remote_port": REMOTE_PORT,
"local_ip": cfg.addr_v["4"] if cfg.addr_ipver == "4" else cfg.addr_v["6"],
"remote_ip": (
cfg.remote_addr_v["4"] if cfg.addr_ipver == "4" else cfg.remote_addr_v["6"]
),
"remote_mac": "00:00:00:00:00:00", # Not important for this test
"enabled": "1",
- }
- create_netconsole_target(config_data, target_name)
- ksft_pr(f"Created netconsole target: {target_name} on interface {interface_name}")
+def delete_netconsole_target(name: str) -> None:
- """Delete a netconsole dynamic target"""
- target_path = f"{NETCONSOLE_CONFIGFS_PATH}/{name}"
- try:
if os.path.exists(target_path):
os.rmdir(target_path)
- except OSError as e:
raise KsftFailEx(f"Failed to delete netconsole target: {e}")
+def check_traffic_flowing(cfg: NetDrvEpEnv, netdevnl: NetdevFamily) -> int:
- """Check if traffic is flowing on the interface"""
- stat1 = get_stats(cfg, netdevnl)
- time.sleep(1)
Can the same be learned with sufficient precision when sleeping for only 100 msec? As tests are added, it's worth trying to keep their runtime short.
- stat2 = get_stats(cfg, netdevnl)
- pkts_per_sec = stat2["rx-packets"] - stat1["rx-packets"]
- # Just make sure this will not fail even in slow/debug kernels
- if pkts_per_sec < 10:
raise KsftFailEx(f"Traffic seems low: {pkts_per_sec}")
- if DEBUG:
ksft_pr(f"Traffic per second {pkts_per_sec} ", pkts_per_sec)
- return pkts_per_sec
+def do_netpoll_flush(cfg: NetDrvEpEnv, netdevnl: NetdevFamily) -> None:
- """Print messages to the console, trying to trigger a netpoll poll"""
- for i in range(int(ITERATIONS)):
pkts_per_s = check_traffic_flowing(cfg, netdevnl)
with open("/dev/kmsg", "w", encoding="utf-8") as kmsg:
kmsg.write(f"netcons test #{i}: ({pkts_per_s} packets/s)\n")
+def test_netpoll(cfg: NetDrvEpEnv, netdevnl: NetdevFamily) -> None:
- """Test netpoll by sending traffic to the interface and then sending netconsole messages to trigger a poll"""
- target_name = generate_random_netcons_name()
- ifname = cfg.dev["ifname"]
- traffic = None
- try:
set_single_rx_tx_queue(ifname)
traffic = GenerateTraffic(cfg)
check_traffic_flowing(cfg, netdevnl)
set_netconsole(cfg, ifname, target_name)
do_netpoll_flush(cfg, netdevnl)
- finally:
if traffic:
traffic.stop()
delete_netconsole_target(target_name)
+def check_dependencies() -> None:
- """Check if the dependencies are met"""
- if not os.path.exists(NETCONSOLE_CONFIGFS_PATH):
raise KsftSkipEx(
f"Directory {NETCONSOLE_CONFIGFS_PATH} does not exist. CONFIG_NETCONSOLE_DYNAMIC might not be set."
)
+def main() -> None:
- """Main function to run the test"""
- check_dependencies()
- netdevnl = NetdevFamily()
- with NetDrvEpEnv(__file__, nsim_test=True) as cfg:
ksft_run(
[test_netpoll],
args=(
cfg,
netdevnl,
),
)
- ksft_exit()
+if __name__ == "__main__":
- main()
base-commit: 5d6d67c4cb10a4b4d3ae35758d5eeed6239afdc8 change-id: 20250612-netpoll_test-a1324d2057c8
Best regards,
Breno Leitao leitao@debian.org
Hello Willem,
On Thu, Jun 12, 2025 at 10:35:54PM -0400, Willem de Bruijn wrote:
Breno Leitao wrote:
Add a basic selftest for the netpoll polling mechanism, specifically targeting the netpoll poll() side.
The test creates a scenario where network transmission is running at maximum sppend, and netpoll needs to poll the NIC. This is achieved by:
minor type: sppend/speed
Thanks! I will update.
- Configuring a single RX/TX queue to create contention
- Generating background traffic to saturate the interface
- Sending netconsole messages to trigger netpoll polling
- Using dynamic netconsole targets via configfs
The test validates a critical netpoll code path by monitoring traffic flow and ensuring netpoll_poll_dev() is called when the normal TX path is blocked. Perf probing confirms this test successfully triggers netpoll_poll_dev() in typical test runs.
So the test needs profiling to make it a pass/fail regression test? Then perhaps add it to TEST_FILES rather than TEST_PROGS. Unless exercising the code on its own is valuable enough.
Sorry for not being clear. This test doesn't depend on any profiling data. Basically I just run `perf probe` to guarantee that netpoll_poll_dev() was being called (as that was the goal of the test).
This test is self contained and should run at `make run_test` targets.
Or is there another way that the packets could be observed, e.g., counters.
Unfortunately netpoll doesn't expose any data, thus, it is hard to get it.
I have plans to create a configfs for netpoll, so, we can check for these numbers (as also configure some pre-defined values today, such as USEC_PER_POLL, MAX_SKBS, ip6h->version = 6; ip6h->priority = 0, etc.
In fact, I've an private PoC for this, but, I am modernizing the code first, and creating some selftests to help me with those changes later (given we have very little test on netpoll, and I aim to improve this, given how critical it is for some datacenter designs).
+NETCONSOLE_CONFIGFS_PATH = "/sys/kernel/config/netconsole" +REMOTE_PORT = 6666 +LOCAL_PORT = 1514 +# Number of netcons messages to send. I usually see netpoll_poll_dev() +# being called at least once in 10 iterations. +ITERATIONS = 10
Is usually sufficient to avoid flakiness, or should this be cranked up?
10 was the minimum number I was able to trigger it on my dev environment, either with default configuration and a debug heavy configuration, but, the higher the number, more change to trigger it. I can crank up it a bit more. Maybe 20?
+def check_traffic_flowing(cfg: NetDrvEpEnv, netdevnl: NetdevFamily) -> int:
- """Check if traffic is flowing on the interface"""
- stat1 = get_stats(cfg, netdevnl)
- time.sleep(1)
Can the same be learned with sufficient precision when sleeping for only 100 msec? As tests are added, it's worth trying to keep their runtime short.
100%. In fact, I don't need to wait for 1 seconds. In fact, we don't even need to check for traffic flowing after the traffic started. I've just added it to help me do develop the test.
We can either reduce it to 100ms or just remove it from the loop, without prejudice to the test itself. Maybe reducing it to 100 ms might help someone else that might debug this in the future, while just slowing down ITERATIONS * 0.1 seconds !?
Thanks for the review! --breno
Breno Leitao wrote:
Hello Willem,
On Thu, Jun 12, 2025 at 10:35:54PM -0400, Willem de Bruijn wrote:
Breno Leitao wrote:
Add a basic selftest for the netpoll polling mechanism, specifically targeting the netpoll poll() side.
The test creates a scenario where network transmission is running at maximum sppend, and netpoll needs to poll the NIC. This is achieved by:
minor type: sppend/speed
Thanks! I will update.
- Configuring a single RX/TX queue to create contention
- Generating background traffic to saturate the interface
- Sending netconsole messages to trigger netpoll polling
- Using dynamic netconsole targets via configfs
The test validates a critical netpoll code path by monitoring traffic flow and ensuring netpoll_poll_dev() is called when the normal TX path is blocked. Perf probing confirms this test successfully triggers netpoll_poll_dev() in typical test runs.
So the test needs profiling to make it a pass/fail regression test? Then perhaps add it to TEST_FILES rather than TEST_PROGS. Unless exercising the code on its own is valuable enough.
Sorry for not being clear. This test doesn't depend on any profiling data. Basically I just run `perf probe` to guarantee that netpoll_poll_dev() was being called (as that was the goal of the test).
This test is self contained and should run at `make run_test` targets.
Or is there another way that the packets could be observed, e.g., counters.
Unfortunately netpoll doesn't expose any data, thus, it is hard to get it.
I have plans to create a configfs for netpoll, so, we can check for these numbers (as also configure some pre-defined values today, such as USEC_PER_POLL, MAX_SKBS, ip6h->version = 6; ip6h->priority = 0, etc.
In fact, I've an private PoC for this, but, I am modernizing the code first, and creating some selftests to help me with those changes later (given we have very little test on netpoll, and I aim to improve this, given how critical it is for some datacenter designs).
+NETCONSOLE_CONFIGFS_PATH = "/sys/kernel/config/netconsole" +REMOTE_PORT = 6666 +LOCAL_PORT = 1514 +# Number of netcons messages to send. I usually see netpoll_poll_dev() +# being called at least once in 10 iterations. +ITERATIONS = 10
Is usually sufficient to avoid flakiness, or should this be cranked up?
10 was the minimum number I was able to trigger it on my dev environment, either with default configuration and a debug heavy configuration, but, the higher the number, more change to trigger it. I can crank up it a bit more. Maybe 20?
+def check_traffic_flowing(cfg: NetDrvEpEnv, netdevnl: NetdevFamily) -> int:
- """Check if traffic is flowing on the interface"""
- stat1 = get_stats(cfg, netdevnl)
- time.sleep(1)
Can the same be learned with sufficient precision when sleeping for only 100 msec? As tests are added, it's worth trying to keep their runtime short.
100%. In fact, I don't need to wait for 1 seconds. In fact, we don't even need to check for traffic flowing after the traffic started. I've just added it to help me do develop the test.
We can either reduce it to 100ms or just remove it from the loop, without prejudice to the test itself. Maybe reducing it to 100 ms might help someone else that might debug this in the future, while just slowing down ITERATIONS * 0.1 seconds !?
That makes sense. Or only keep it in DEBUG mode?
On Fri, Jun 13, 2025 at 09:43:35AM -0400, Willem de Bruijn wrote:
Breno Leitao wrote:
+def check_traffic_flowing(cfg: NetDrvEpEnv, netdevnl: NetdevFamily) -> int:
- """Check if traffic is flowing on the interface"""
- stat1 = get_stats(cfg, netdevnl)
- time.sleep(1)
Can the same be learned with sufficient precision when sleeping for only 100 msec? As tests are added, it's worth trying to keep their runtime short.
100%. In fact, I don't need to wait for 1 seconds. In fact, we don't even need to check for traffic flowing after the traffic started. I've just added it to help me do develop the test.
We can either reduce it to 100ms or just remove it from the loop, without prejudice to the test itself. Maybe reducing it to 100 ms might help someone else that might debug this in the future, while just slowing down ITERATIONS * 0.1 seconds !?
That makes sense. Or only keep it in DEBUG mode?
Even better, I will move it to DEBUG mode.
Thanks!
On Fri, 13 Jun 2025 05:47:50 -0700 Breno Leitao wrote:
Or is there another way that the packets could be observed, e.g., counters.
Unfortunately netpoll doesn't expose any data, thus, it is hard to get it.
I have plans to create a configfs for netpoll, so, we can check for these numbers (as also configure some pre-defined values today, such as USEC_PER_POLL, MAX_SKBS, ip6h->version = 6; ip6h->priority = 0, etc.
In fact, I've an private PoC for this, but, I am modernizing the code first, and creating some selftests to help me with those changes later (given we have very little test on netpoll, and I aim to improve this, given how critical it is for some datacenter designs).
FWIW you can steal bpftrace integration from this series: https://lore.kernel.org/all/20250421222827.283737-22-kuba@kernel.org/
On Fri, Jun 13, 2025 at 05:42:33PM -0700, Jakub Kicinski wrote:
On Fri, 13 Jun 2025 05:47:50 -0700 Breno Leitao wrote:
Or is there another way that the packets could be observed, e.g., counters.
Unfortunately netpoll doesn't expose any data, thus, it is hard to get it.
I have plans to create a configfs for netpoll, so, we can check for these numbers (as also configure some pre-defined values today, such as USEC_PER_POLL, MAX_SKBS, ip6h->version = 6; ip6h->priority = 0, etc.
In fact, I've an private PoC for this, but, I am modernizing the code first, and creating some selftests to help me with those changes later (given we have very little test on netpoll, and I aim to improve this, given how critical it is for some datacenter designs).
FWIW you can steal bpftrace integration from this series: https://lore.kernel.org/all/20250421222827.283737-22-kuba@kernel.org/
Yes, that would be great. I think we can iterate until we hit the poll path, otherwise we skip the test at timeout. Something as:
while (true): send msg if netpoll_poll_dev() was invoked: ksft_exit if timeout: raise KsftSkipEx As soon as your code lands, I will adapt the test to do so. Meanwhile, I will send the v1 for the netpoll, and later we can iterate.
Thanks for working on this bfptrace helper. This will be useful on other usecases as well.
--breno
On Fri, 20 Jun 2025 01:39:43 -0700 Breno Leitao wrote:
FWIW you can steal bpftrace integration from this series: https://lore.kernel.org/all/20250421222827.283737-22-kuba@kernel.org/
Yes, that would be great. I think we can iterate until we hit the poll path, otherwise we skip the test at timeout. Something as:
while (true): send msg if netpoll_poll_dev() was invoked: ksft_exit if timeout: raise KsftSkipEx As soon as your code lands, I will adapt the test to do so. Meanwhile, I will send the v1 for the netpoll, and later we can iterate.
Thanks for working on this bfptrace helper. This will be useful on other usecases as well.
Right, you're the second person I pointed that patch out to. Would be great if someone could steal that patch and make it a part of their series so that it gets merged :S But it's alright if you prefer to stick to non-bpftrace testing for now..
On Sat, Jun 21, 2025 at 06:51:21AM -0700, Jakub Kicinski wrote:
On Fri, 20 Jun 2025 01:39:43 -0700 Breno Leitao wrote:
FWIW you can steal bpftrace integration from this series: https://lore.kernel.org/all/20250421222827.283737-22-kuba@kernel.org/
Yes, that would be great. I think we can iterate until we hit the poll path, otherwise we skip the test at timeout. Something as:
while (true): send msg if netpoll_poll_dev() was invoked: ksft_exit if timeout: raise KsftSkipEx As soon as your code lands, I will adapt the test to do so. Meanwhile, I will send the v1 for the netpoll, and later we can iterate.
Thanks for working on this bfptrace helper. This will be useful on other usecases as well.
Right, you're the second person I pointed that patch out to. Would be great if someone could steal that patch and make it a part of their series so that it gets merged
I can do that. I was expecting your patches to be landed, and then I would reuse it. I was not expecting to ship it as part of my patchset.
So, the selftest for netpoll is already in the mailing list[1], so, we have two options, now:
1) Steal your patch and make [1] depend on it. 2) Merge the selftest [1] and, then, steal your patch by adding the bpftrace support in it.
What is your recommendation?
Link: https://lore.kernel.org/all/20250620-netpoll_test-v1-1-5068832f72fc@debian.o... [1]
Thanks, --breno
On Mon, 23 Jun 2025 02:16:12 -0700 Breno Leitao wrote:
So, the selftest for netpoll is already in the mailing list[1], so, we have two options, now:
- Steal your patch and make [1] depend on it.
- Merge the selftest [1] and, then, steal your patch by adding the bpftrace support in it.
What is your recommendation?
Let's see if [1] gets merged as is, if we need a v2 let's add the bpftrace patch?
On Mon, Jun 23, 2025 at 10:29:03AM -0700, Jakub Kicinski wrote:
On Mon, 23 Jun 2025 02:16:12 -0700 Breno Leitao wrote:
So, the selftest for netpoll is already in the mailing list[1], so, we have two options, now:
- Steal your patch and make [1] depend on it.
- Merge the selftest [1] and, then, steal your patch by adding the bpftrace support in it.
What is your recommendation?
Let's see if [1] gets merged as is, if we need a v2 let's add the bpftrace patch?
Sounds like a plan.
Thanks! --breno
linux-kselftest-mirror@lists.linaro.org