From: Allison Henderson allison.henderson@oracle.com
Hi All,
This series is a new selftest that Vegard, Chuck and I have been working on to provide some test coverage for RDS. I've made quite a few updates since the RFC sent a few weeks ago:
I've added several knobs to the script to tune network turbulence, and documented their usage in the README.txt. By default these options are left off.
Added an extra flag to specify the log location.
I've also added a flag to config.sh to skip the gcov configurations if the coverage report is not desired. run.sh has been adapted to skip the report if the required configs are not present, or if the required packages are not available.
A timeout has been added to prevent the test from hanging indefinitely.
The previous gcov issues have been resolved with an appropriate gcov patch, as well as some extra logic to detect incompatible gcov and gcc versions.
The shellcheck nits reported in the last review have been addressed.
In order to return an appropriate exit code, the run.sh script has been adapted to analyze the test.py strace, and determine if the test passed, failed or timed out.
RDS specific GCOV configs have been documented under Documentation/dev-tools/gcov.rst
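As an illustration of the exit-code handling described above, here is a minimal Python sketch of classifying the last line of an strace log. It is not the code in run.sh (which uses tail, grep and cut); the function name `classify_trace` and the verdict strings are made up for this example, and it assumes strace ends its log with a line of the form "+++ exited with N +++" or "+++ killed by SIGNAME +++".

```python
import re

# Assumed forms of the final strace line (a timestamp prefix is tolerated):
#   "+++ exited with N +++"   or   "+++ killed by SIGNAME +++"
_EXITED = re.compile(r"\+\+\+ exited with (\d+) \+\+\+")
_KILLED = re.compile(r"\+\+\+ killed by (\w+)")


def classify_trace(last_line):
    """Return a (verdict, exit_code) pair for the last line of an strace log."""
    killed = _KILLED.search(last_line)
    if killed:
        # SIGALRM is what signal.alarm() delivers when the test timeout fires
        if killed.group(1) == "SIGALRM":
            return ("timeout", 1)
        return ("fail", 1)

    exited = _EXITED.search(last_line)
    if exited is None:
        # the trace ended without an exit record
        return ("incomplete", 1)

    code = int(exited.group(1))
    return ("pass" if code == 0 else "fail", code)
```

A caller would read the last line of rds-strace.txt, pass it to `classify_trace`, and use the second element of the result as the process exit code.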
Questions and comments appreciated. Thanks everyone!
Allison
Vegard Nossum (3):
  .gitignore: add .gcda files
  net: rds: add option for GCOV profiling
  selftests: rds: add testing infrastructure
 .gitignore                                 |   1 +
 Documentation/dev-tools/gcov.rst           |  11 +
 MAINTAINERS                                |   1 +
 net/rds/Kconfig                            |   9 +
 net/rds/Makefile                           |   5 +
 tools/testing/selftests/Makefile           |   1 +
 tools/testing/selftests/net/rds/Makefile   |  13 +
 tools/testing/selftests/net/rds/README.txt |  41 ++++
 tools/testing/selftests/net/rds/config.sh  |  56 +++++
 tools/testing/selftests/net/rds/init.sh    |  69 ++++++
 tools/testing/selftests/net/rds/run.sh     | 271 +++++++++++++++++++++
 tools/testing/selftests/net/rds/test.py    | 251 +++++++++++++++++++
 12 files changed, 729 insertions(+)
 create mode 100644 tools/testing/selftests/net/rds/Makefile
 create mode 100644 tools/testing/selftests/net/rds/README.txt
 create mode 100755 tools/testing/selftests/net/rds/config.sh
 create mode 100755 tools/testing/selftests/net/rds/init.sh
 create mode 100755 tools/testing/selftests/net/rds/run.sh
 create mode 100644 tools/testing/selftests/net/rds/test.py
From: Vegard Nossum vegard.nossum@oracle.com
These files contain the runtime coverage data generated by gcov.
Signed-off-by: Vegard Nossum vegard.nossum@oracle.com
Signed-off-by: Chuck Lever chuck.lever@oracle.com
Signed-off-by: Allison Henderson allison.henderson@oracle.com
---
 .gitignore | 1 +
 1 file changed, 1 insertion(+)
diff --git a/.gitignore b/.gitignore
index c59dc60ba62e..8ff1b4138c56 100644
--- a/.gitignore
+++ b/.gitignore
@@ -24,6 +24,7 @@
 *.dwo
 *.elf
 *.gcno
+*.gcda
 *.gz
 *.i
 *.ko
From: Vegard Nossum vegard.nossum@oracle.com
To better our unit tests we need code coverage to be part of the kernel. This patch borrows heavily from how CONFIG_GCOV_PROFILE_FTRACE is implemented.
Reviewed-by: Chuck Lever chuck.lever@oracle.com
Signed-off-by: Vegard Nossum vegard.nossum@oracle.com
Signed-off-by: Allison Henderson allison.henderson@oracle.com
---
 net/rds/Kconfig  | 9 +++++++++
 net/rds/Makefile | 5 +++++
 2 files changed, 14 insertions(+)
diff --git a/net/rds/Kconfig b/net/rds/Kconfig
index 75cd696963b2..f007730aa2bb 100644
--- a/net/rds/Kconfig
+++ b/net/rds/Kconfig
@@ -26,3 +26,12 @@ config RDS_DEBUG
 	bool "RDS debugging messages"
 	depends on RDS
 	default n
+
+config GCOV_PROFILE_RDS
+	bool "Enable GCOV profiling on RDS"
+	depends on GCOV_KERNEL
+	help
+	  Enable GCOV profiling on RDS for checking which functions/lines
+	  are executed.
+
+	  If unsure, say N.
diff --git a/net/rds/Makefile b/net/rds/Makefile
index 8fdc118e2927..3af1ca1d965c 100644
--- a/net/rds/Makefile
+++ b/net/rds/Makefile
@@ -15,3 +15,8 @@ rds_tcp-y := tcp.o tcp_connect.o tcp_listen.o tcp_recv.o \
 			tcp_send.o tcp_stats.o
 
 ccflags-$(CONFIG_RDS_DEBUG) := -DRDS_DEBUG
+
+# for GCOV coverage profiling
+ifdef CONFIG_GCOV_PROFILE_RDS
+GCOV_PROFILE := y
+endif
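For readers following how this option is consumed: the selftest in this series only produces a coverage report when CONFIG_GCOV_KERNEL and CONFIG_GCOV_PROFILE_RDS are set and CONFIG_GCOV_PROFILE_ALL is not. The sketch below restates that check in Python, purely as an illustration. run.sh does it with `grep -x` against the .config file, and the function name `gcov_report_possible` is hypothetical.

```python
def gcov_report_possible(config_text):
    """Decide from the text of a kernel .config whether an RDS-only
    coverage report can be generated.

    Whole-line matching is used, comparable to `grep -x`, so a comment
    such as "# CONFIG_GCOV_PROFILE_ALL is not set" never counts as enabled.
    """
    lines = set(config_text.splitlines())
    required = ["CONFIG_GCOV_KERNEL=y", "CONFIG_GCOV_PROFILE_RDS=y"]
    forbidden = ["CONFIG_GCOV_PROFILE_ALL=y"]

    has_required = all(opt in lines for opt in required)
    has_forbidden = any(opt in lines for opt in forbidden)
    return has_required and not has_forbidden
```

It could be called as `gcov_report_possible(open(".config").read())` from the top of a configured kernel tree.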
On Tue, 25 Jun 2024 18:28:33 -0700 allison.henderson@oracle.com wrote:
From: Vegard Nossum vegard.nossum@oracle.com
To better our unit tests we need code coverage to be part of the kernel. This patch borrows heavily from how CONFIG_GCOV_PROFILE_FTRACE is implemented
Hi Florian, IIRC you were able to generate test coverage reports for nftables / netfilter. Is this the approach you used? I'm not sure how well adding a Kconfig knob for every module would scale..
From: Vegard Nossum vegard.nossum@oracle.com
This adds some basic self-testing infrastructure for RDS-TCP.
Signed-off-by: Vegard Nossum vegard.nossum@oracle.com
Signed-off-by: Chuck Lever chuck.lever@oracle.com
Signed-off-by: Allison Henderson allison.henderson@oracle.com
---
 Documentation/dev-tools/gcov.rst           |  11 +
 MAINTAINERS                                |   1 +
 tools/testing/selftests/Makefile           |   1 +
 tools/testing/selftests/net/rds/Makefile   |  13 +
 tools/testing/selftests/net/rds/README.txt |  41 ++++
 tools/testing/selftests/net/rds/config.sh  |  56 +++++
 tools/testing/selftests/net/rds/init.sh    |  69 ++++++
 tools/testing/selftests/net/rds/run.sh     | 271 +++++++++++++++++++++
 tools/testing/selftests/net/rds/test.py    | 251 +++++++++++++++++++
 9 files changed, 714 insertions(+)
diff --git a/Documentation/dev-tools/gcov.rst b/Documentation/dev-tools/gcov.rst
index 5fce2b06f229..dbd26b02ff3c 100644
--- a/Documentation/dev-tools/gcov.rst
+++ b/Documentation/dev-tools/gcov.rst
@@ -75,6 +75,17 @@
 Only files which are linked to the main kernel image or are compiled as
 kernel modules are supported by this mechanism.
 
+Module specific configs
+-----------------------
+
+Gcov kernel configs for specific modules are described below:
+
+CONFIG_GCOV_PROFILE_RDS:
+	Enables GCOV profiling on RDS for checking which functions or
+	lines are executed. This config is used by the rds selftest to
+	generate coverage reports. If left unset the report is omitted.
+
+
 Files
 -----
 
diff --git a/MAINTAINERS b/MAINTAINERS
index d648af07cbd6..861dbd2f15fd 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -18849,6 +18849,7 @@ S: Supported
 W: https://oss.oracle.com/projects/rds/
 F: Documentation/networking/rds.rst
 F: net/rds/
+F: tools/testing/selftests/net/rds/
RDT - RESOURCE ALLOCATION M: Fenghua Yu fenghua.yu@intel.com diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile index 9039f3709aff..5b01fe3277e2 100644 --- a/tools/testing/selftests/Makefile +++ b/tools/testing/selftests/Makefile @@ -66,6 +66,7 @@ TARGETS += net/mptcp TARGETS += net/openvswitch TARGETS += net/tcp_ao TARGETS += net/netfilter +TARGETS += net/rds TARGETS += nsfs TARGETS += perf_events TARGETS += pidfd diff --git a/tools/testing/selftests/net/rds/Makefile b/tools/testing/selftests/net/rds/Makefile new file mode 100644 index 000000000000..52fe54006eba --- /dev/null +++ b/tools/testing/selftests/net/rds/Makefile @@ -0,0 +1,13 @@ +# SPDX-License-Identifier: GPL-2.0 + +all: + @echo mk_build_dir="$(shell pwd)" > include.sh + +TEST_PROGS := run.sh \ + include.sh \ + test.py \ + init.sh + +EXTRA_CLEAN := /tmp/rds_logs + +include ../../lib.mk diff --git a/tools/testing/selftests/net/rds/README.txt b/tools/testing/selftests/net/rds/README.txt new file mode 100644 index 000000000000..dddf9d33848e --- /dev/null +++ b/tools/testing/selftests/net/rds/README.txt @@ -0,0 +1,41 @@ +RDS self-tests +============== + +These scripts provide a coverage test for RDS-TCP by creating a vm +with two network namespaces and running rds packets between them. +A loopback network is provisioned with optional probability of packet +loss or corruption. A workload of 50000 hashes, each 64 characters +in size, are passed over an RDS socket on this test network. A passing +test means the RDS-TCP stack was able to recover properly. The provided +config.sh can be used to compile the kernel with the necessary gcov +options. The kernel may optionally be configured to omit the coverage +report as well. + +USAGE: + run.sh [-d logdir] [-l packet_loss] [-c packet_corruption] + [-u packet_duplcate] + +OPTIONS: + -d Log directory. 
Defaults to /tmp/rds_logs + + -l Simulates a percentage of packet loss + + -c Simulates a percentage of packet corruption + + -u Simulates a percentage of packet duplication. + +EXAMPLE: + + # Create a suitable gcov enabled .config + tools/testing/selftests/net/rds/config.sh -g + + # Alternatly create a gcov disabled .config + tools/testing/selftests/net/rds/config.sh + + # build the kernel + make -j128 + + # launch the tests in a VM + tools/testing/selftests/net/rds/run.sh + +An HTML coverage report will be output in /tmp/rds_logs/coverage/. diff --git a/tools/testing/selftests/net/rds/config.sh b/tools/testing/selftests/net/rds/config.sh new file mode 100755 index 000000000000..3454f4856d69 --- /dev/null +++ b/tools/testing/selftests/net/rds/config.sh @@ -0,0 +1,56 @@ +#! /bin/bash +# SPDX-License-Identifier: GPL-2.0 + +set -e +set -u +set -x + +unset KBUILD_OUTPUT + +GENERATE_GCOV_REPORT=0 +while getopts "g" opt; do + case ${opt} in + g) + GENERATE_GCOV_REPORT=1 + ;; + :) + echo "USAGE: config.sh [-g]" + exit 1 + ;; + ?) + echo "Invalid option: -${OPTARG}." 
+ exit 1 + ;; + esac +done + +# start with a default config +make defconfig + +# no modules +scripts/config --disable CONFIG_MODULES + +# enable RDS +scripts/config --enable CONFIG_RDS +scripts/config --enable CONFIG_RDS_TCP + +if [ "$GENERATE_GCOV_REPORT" -eq 1 ]; then + # instrument RDS and only RDS + scripts/config --enable CONFIG_GCOV_KERNEL + scripts/config --disable GCOV_PROFILE_ALL + scripts/config --enable GCOV_PROFILE_RDS +else + scripts/config --disable CONFIG_GCOV_KERNEL + scripts/config --disable GCOV_PROFILE_ALL + scripts/config --disable GCOV_PROFILE_RDS +fi + +# need network namespaces to run tests with veth network interfaces +scripts/config --enable CONFIG_NET_NS +scripts/config --enable CONFIG_VETH + +# simulate packet loss +scripts/config --enable CONFIG_NET_SCH_NETEM + +# generate real .config without asking any questions +make olddefconfig diff --git a/tools/testing/selftests/net/rds/init.sh b/tools/testing/selftests/net/rds/init.sh new file mode 100755 index 000000000000..5d2577625769 --- /dev/null +++ b/tools/testing/selftests/net/rds/init.sh @@ -0,0 +1,69 @@ +#! /bin/bash +# SPDX-License-Identifier: GPL-2.0 + +set -e +set -u + +COLLECT_GCOV=0 +LOG_DIR=/tmp +PY_CMD="/usr/bin/python3" +PLOSS=0 +PCORRUPT=0 +PDUP=0 +while getopts "d:p:l:c:u:g" opt; do + case ${opt} in + d) + LOG_DIR=${OPTARG} + ;; + p) + PY_CMD=${OPTARG} + ;; + l) + PLOSS=${OPTARG} + ;; + c) + PCORRUPT=${OPTARG} + ;; + u) + PDUP=${OPTARG} + ;; + g) + COLLECT_GCOV=1 + ;; + :) + echo "USAGE: init.sh [-d logdir] [-p python_cmd] [-l packet_loss] [-c packet_corruption] " \ + "[-u packet_duplcate] [-g]" + exit 1 + ;; + ?) + echo "Invalid option: -${OPTARG}." + exit 1 + ;; + esac +done + +LOG_FILE="${LOG_DIR}/rds-strace.txt" + +mount -t proc none /proc +mount -t sysfs none /sys +mount -t tmpfs none /var/run +mount -t debugfs none /sys/kernel/debug + +echo running RDS tests... 
+echo Traces will be logged to "$LOG_FILE" +rm -f "$LOG_FILE" +strace -T -tt -o "$LOG_FILE" "$PY_CMD" "$(dirname "$0")/test.py" --timeout 300 -d "$LOG_DIR" \ + -l "$PLOSS" -c "$PCORRUPT" -u "$PDUP" || true + +if [ "$COLLECT_GCOV" -eq 1 ]; then + echo saving coverage data... + (set +x; cd /sys/kernel/debug/gcov; find ./* -name '*.gcda' | \ + while read -r f + do + cat < "/sys/kernel/debug/gcov/$f" > "/$f" + done) +fi + +dmesg > "${LOG_DIR}/dmesg.out" + +/usr/sbin/poweroff --no-wtmp --force diff --git a/tools/testing/selftests/net/rds/run.sh b/tools/testing/selftests/net/rds/run.sh new file mode 100755 index 000000000000..823631507230 --- /dev/null +++ b/tools/testing/selftests/net/rds/run.sh @@ -0,0 +1,271 @@ +#! /bin/bash +# SPDX-License-Identifier: GPL-2.0 + +set -e +set -u + +unset KBUILD_OUTPUT + +current_dir="$(realpath "$(dirname "$0")")" +build_dir="$current_dir" + +build_include="$current_dir/include.sh" +if test -f "$build_include"; then + # this include will define "$mk_build_dir" as the location the test was + # built. We will need this if the tests are installed in a location + # other than the kernel source + + source "$build_include" + build_dir="$mk_build_dir" +fi + +# This test requires kernel source and the *.gcda data therein +# Locate the top level of the kernel source, and the net/rds +# subfolder with the appropriate *.gcno object files +ksrc_dir="$(realpath "$build_dir"/../../../../../)" +kconfig="$ksrc_dir/.config" +obj_dir="$ksrc_dir/net/rds" + +GCOV_CMD=gcov + +# This script currently only works for x86_64 +ARCH="$(uname -m)" +case "${ARCH}" in +x86_64) + QEMU_BINARY=qemu-system-x86_64 + ;; +*) + echo "selftests: [SKIP] Unsupported architecture" + exit 4 + ;; +esac + +#check to see if the host has the required packages to generate a gcov report +check_gcov_env() +{ + if ! which "$GCOV_CMD" > /dev/null 2>&1; then + echo "Warning: Could not find gcov. 
" + GENERATE_GCOV_REPORT=0 + fi + + # the gcov version must match the gcc version + GCC_VER=$(gcc -dumpfullversion) + GCOV_VER=$($GCOV_CMD -v | grep gcov | awk '{print $3}'| awk 'BEGIN {FS="-"}{print $1}') + if [ "$GCOV_VER" != "$GCC_VER" ]; then + #attempt to find a matching gcov version + GCOV_CMD=gcov-$(gcc -dumpversion) + + if ! which "$GCOV_CMD" > /dev/null 2>&1; then + echo "Warning: Could not find an appropriate gcov installation. \ + gcov version must match gcc version" + GENERATE_GCOV_REPORT=0 + fi + + #recheck version number of found gcov executable + GCOV_VER=$($GCOV_CMD -v | grep gcov | awk '{print $3}'| \ + awk 'BEGIN {FS="-"}{print $1}') + if [ "$GCOV_VER" != "$GCC_VER" ]; then + echo "Warning: Could not find an appropriate gcov installation. \ + gcov version must match gcc version" + GENERATE_GCOV_REPORT=0 + else + echo "Warning: Mismatched gcc and gcov detected. Using $GCOV_CMD" + fi + fi + + if ! which gcovr > /dev/null 2>&1; then + echo "Warning: Could not find gcovr" + GENERATE_GCOV_REPORT=0 + fi +} + +# Check to see if the kconfig has the required configs to generate a coverage report +check_gcov_conf() +{ + if ! grep -x "CONFIG_GCOV_PROFILE_RDS=y" "$kconfig" > /dev/null 2>&1; then + echo "INFO: CONFIG_GCOV_PROFILE_RDS should be enabled for coverage reports" + GENERATE_GCOV_REPORT=0 + fi + if ! 
grep -x "CONFIG_GCOV_KERNEL=y" "$kconfig" > /dev/null 2>&1; then + echo "INFO: CONFIG_GCOV_KERNEL should be enabled for coverage reports" + GENERATE_GCOV_REPORT=0 + fi + if grep -x "CONFIG_GCOV_PROFILE_ALL=y" "$kconfig" > /dev/null 2>&1; then + echo "INFO: CONFIG_GCOV_PROFILE_ALL should be disabled for coverage reports" + GENERATE_GCOV_REPORT=0 + fi + + if [ "$GENERATE_GCOV_REPORT" -eq 0 ]; then + echo "To enable gcov reports, please run "\ + ""tools/testing/selftests/net/rds/config.sh -g" and rebuild the kernel" + else + # if we have the required kernel configs, proceed to check the environment to + # ensure we have the required gcov packages + check_gcov_env + fi +} + +# Kselftest framework requirement - SKIP code is 4. +check_conf_enabled() { + if ! grep -x "$1=y" "$kconfig" > /dev/null 2>&1; then + echo "selftests: [SKIP] This test requires $1 enabled" + echo "Please run tools/testing/selftests/net/rds/config.sh and rebuild the kernel" + exit 4 + fi +} +check_conf_disabled() { + if grep -x "$1=y" "$kconfig" > /dev/null 2>&1; then + echo "selftests: [SKIP] This test requires $1 disabled" + echo "Please run tools/testing/selftests/net/rds/config.sh and rebuild the kernel" + exit 4 + fi +} +check_conf() { + check_conf_enabled CONFIG_NET_SCH_NETEM + check_conf_enabled CONFIG_VETH + check_conf_enabled CONFIG_NET_NS + check_conf_enabled CONFIG_RDS_TCP + check_conf_enabled CONFIG_RDS + check_conf_disabled CONFIG_MODULES +} + +check_env() +{ + if ! test -d "$obj_dir"; then + echo "selftests: [SKIP] This test requires a kernel source tree" + exit 4 + fi + if ! test -e "$kconfig"; then + echo "selftests: [SKIP] This test requires a configured kernel source tree" + exit 4 + fi + if ! which strace > /dev/null 2>&1; then + echo "selftests: [SKIP] Could not run test without strace" + exit 4 + fi + if ! which tcpdump > /dev/null 2>&1; then + echo "selftests: [SKIP] Could not run test without tcpdump" + exit 4 + fi + if ! 
which "$QEMU_BINARY" > /dev/null 2>&1; then + echo "selftests: [SKIP] Could not run test without qemu" + exit 4 + fi + + if ! which python3 > /dev/null 2>&1; then + echo "selftests: [SKIP] Could not run test without python3" + exit 4 + fi + + python_major=$(python3 -c "import sys; print(sys.version_info[0])") + python_minor=$(python3 -c "import sys; print(sys.version_info[1])") + if [[ python_major -lt 3 || ( python_major -eq 3 && python_minor -lt 9 ) ]] ; then + echo "selftests: [SKIP] Could not run test without at least python3.9" + python3 -V + exit 4 + fi +} + +LOG_DIR=/tmp/rds_logs +PLOSS=0 +PCORRUPT=0 +PDUP=0 +GENERATE_GCOV_REPORT=1 +while getopts "d:l:c:u:" opt; do + case ${opt} in + d) + LOG_DIR=${OPTARG} + ;; + l) + PLOSS=${OPTARG} + ;; + c) + PCORRUPT=${OPTARG} + ;; + u) + PDUP=${OPTARG} + ;; + :) + echo "USAGE: run.sh [-d logdir] [-l packet_loss] [-c packet_corruption]" \ + "[-u packet_duplcate] [-g]" + exit 1 + ;; + ?) + echo "Invalid option: -${OPTARG}." + exit 1 + ;; + esac +done + + +check_env +check_conf + +check_gcov_conf +gflags="" +if [ "$GENERATE_GCOV_REPORT" -eq 1 ]; then + gflags="-g" +else + echo "Coverage report will be skipped" +fi + +#if we are running in a python environment, we need to capture that +#python bin so we can use the same python environment in the vm +PY_CMD=$(which python3) + +rm -fr "$LOG_DIR" +TRACE_FILE="${LOG_DIR}/rds-strace.txt" +mkdir -p "$LOG_DIR" + +# start a VM using a 9P root filesystem that maps to the host's / +# we pass ./init.sh from the same directory as we are in as the +# guest's init, which will run the tests and copy the coverage +# data back to the host filesystem. 
+$QEMU_BINARY \ + -enable-kvm \ + -cpu host \ + -smp 4 \ + -kernel "${ksrc_dir}/arch/x86/boot/bzImage" \ + -append "rootfstype=9p root=/dev/root rootflags=trans=virtio,version=9p2000.L rw \ + console=ttyS0 init=${current_dir}/init.sh -d ${LOG_DIR} -p ${PY_CMD} ${gflags} \ + -l ${PLOSS} -c ${PCORRUPT} -u ${PDUP} panic=-1" \ + -display none \ + -serial stdio \ + -fsdev local,id=fsdev0,path=/,security_model=none,multidevs=remap \ + -device virtio-9p-pci,fsdev=fsdev0,mount_tag=/dev/root \ + -no-reboot + +# generate a nice HTML coverage report +if [ "$GENERATE_GCOV_REPORT" -eq 1 ]; then + echo running gcovr... + gcovr -v -s --html-details --gcov-executable "$GCOV_CMD" --gcov-ignore-parse-errors \ + -o "${LOG_DIR}/coverage/" "${ksrc_dir}/net/rds/" +fi + +# extract the return code of the test script from the strace if it is there +if [ ! -f "$TRACE_FILE" ]; then + echo "FAIL: Test failed to complete" + exit 1 +fi + +set +e +tail -1 "$TRACE_FILE" | grep "killed by SIGALRM" > /dev/null 2>&1 +if [ $? -eq 0 ]; then + echo "FAIL: Test timed out" + exit 1 +fi + +tail -1 "$TRACE_FILE" | grep "exited with" +if [ $? -ne 0 ]; then + echo "FAIL: Test failed to complete" + exit 1 +fi + +test_rc=$(tail -1 "$TRACE_FILE" | grep -o 'exited with.*' | cut -d ' ' -f 3) +if [ "$test_rc" -eq 0 ]; then + echo "PASS: Test completed successfully" +else + echo "FAIL: Test failed" +fi + +exit "$test_rc" diff --git a/tools/testing/selftests/net/rds/test.py b/tools/testing/selftests/net/rds/test.py new file mode 100644 index 000000000000..4da3bb933842 --- /dev/null +++ b/tools/testing/selftests/net/rds/test.py @@ -0,0 +1,251 @@ +#! 
/usr/bin/env python3 +# SPDX-License-Identifier: GPL-2.0 + +import argparse +import ctypes +import errno +import hashlib +import os +import select +import signal +import socket +import subprocess +import sys +import atexit +from pwd import getpwuid +from os import stat + +libc = ctypes.cdll.LoadLibrary('libc.so.6') +setns = libc.setns + +net0 = 'net0' +net1 = 'net1' + +veth0 = 'veth0' +veth1 = 'veth1' + +# Convenience wrapper function for calling the subsystem ip command. +def ip(*args): + subprocess.check_call(['/usr/sbin/ip'] + list(args)) + +# Helper function for creating a socket inside a network namespace. +# We need this because otherwise RDS will detect that the two TCP +# sockets are on the same interface and use the loop transport instead +# of the TCP transport. +def netns_socket(netns, *args): + u0, u1 = socket.socketpair(socket.AF_UNIX, socket.SOCK_SEQPACKET) + + child = os.fork() + if child == 0: + # change network namespace + with open(f'/var/run/netns/{netns}') as f: + try: + ret = setns(f.fileno(), 0) + except IOError as e: + print(e.errno) + print(e) + + # create socket in target namespace + s = socket.socket(*args) + + # send resulting socket to parent + socket.send_fds(u0, [], [s.fileno()]) + + sys.exit(0) + + # receive socket from child + _, s, _, _ = socket.recv_fds(u1, 0, 1) + os.waitpid(child, 0) + u0.close() + u1.close() + return socket.fromfd(s[0], *args) + +#Parse out command line arguments. 
We take an optional +# timeout parameter and an optional log output folder +parser = argparse.ArgumentParser(description="init script args", + formatter_class=argparse.ArgumentDefaultsHelpFormatter) +parser.add_argument("-d", "--logdir", action="store", help="directory to store logs", default="/tmp") +parser.add_argument('--timeout', help="timeout to terminate hung test", type=int, default=0) +parser.add_argument('-l', '--loss', help="Simulate tcp packet loss", type=int, default=0) +parser.add_argument('-c', '--corruption', help="Simulate tcp packet corruption", type=int, default=0) +parser.add_argument('-u', '--duplicate', help="Simulate tcp packet duplication", type=int, default=0) +args = parser.parse_args() +logdir=args.logdir +packet_loss=args.loss +packet_corruption=args.corruption +packet_duplicate=args.duplicate + +ip('netns', 'add', net0) +ip('netns', 'add', net1) +ip('link', 'add', 'type', 'veth') + +addrs = [ + # we technically don't need different port numbers, but this will + # help identify traffic in the network analyzer + ('10.0.0.1', 10000), + ('10.0.0.2', 20000), +] + +# move interfaces to separate namespaces so they can no longer be +# bound directly; this prevents rds from switching over from the tcp +# transport to the loop transport. 
+ip('link', 'set', veth0, 'netns', net0, 'up') +ip('link', 'set', veth1, 'netns', net1, 'up') + +# add addresses +ip('-n', net0, 'addr', 'add', addrs[0][0] + '/32', 'dev', veth0) +ip('-n', net1, 'addr', 'add', addrs[1][0] + '/32', 'dev', veth1) + +# add routes +ip('-n', net0, 'route', 'add', addrs[1][0] + '/32', 'dev', veth0) +ip('-n', net1, 'route', 'add', addrs[0][0] + '/32', 'dev', veth1) + +# sanity check that our two interfaces/addresses are correctly set up +# and communicating by doing a single ping +ip('netns', 'exec', net0, 'ping', '-c', '1', addrs[1][0]) + +# Start a packet capture on each network +for net in [net0, net1]: + tcpdump_pid = os.fork() + if tcpdump_pid == 0: + pcap = logdir+'/'+net+'.pcap' + subprocess.check_call(['touch', pcap]) + user = getpwuid(stat(pcap).st_uid).pw_name + ip('netns', 'exec', net, '/usr/sbin/tcpdump', '-Z', user, '-i', 'any', '-w', pcap) + sys.exit(0) + +# simulate packet loss, duplication and corruption +for net, iface in [(net0, veth0), (net1, veth1)]: + ip('netns', 'exec', net, + '/usr/sbin/tc', 'qdisc', 'add', 'dev', iface, 'root', 'netem', + 'corrupt', str(packet_corruption)+'%', + 'loss', str(packet_loss)+'%', + 'duplicate', str(packet_duplicate)+'%', + ) + +# add a timeout +if args.timeout > 0: + signal.alarm(args.timeout) + +sockets = [ + netns_socket(net0, socket.AF_RDS, socket.SOCK_SEQPACKET), + netns_socket(net1, socket.AF_RDS, socket.SOCK_SEQPACKET), +] + +for s, addr in zip(sockets, addrs): + s.bind(addr) + s.setblocking(0) + +fileno_to_socket = { + s.fileno(): s for s in sockets +} + +addr_to_socket = { + addr: s for addr, s in zip(addrs, sockets) +} + +socket_to_addr = { + s: addr for addr, s in zip(addrs, sockets) +} + +send_hashes = {} +recv_hashes = {} + +ep = select.epoll() + +for s in sockets: + ep.register(s, select.EPOLLRDNORM) + +n = 50000 +nr_send = 0 +nr_recv = 0 + +while nr_send < n: + # Send as much as we can without blocking + print("sending...", nr_send, nr_recv) + while nr_send < n: + 
send_data = hashlib.sha256(f'packet {nr_send}'.encode('utf-8')).hexdigest().encode('utf-8') + + # pseudo-random send/receive pattern + sender = sockets[nr_send % 2] + receiver = sockets[1 - (nr_send % 3) % 2] + + try: + sender.sendto(send_data, socket_to_addr[receiver]) + send_hashes.setdefault((sender.fileno(), receiver.fileno()), hashlib.sha256()).update(f'<{send_data}>'.encode('utf-8')) + nr_send = nr_send + 1 + except BlockingIOError as e: + break + except OSError as e: + if e.errno in [errno.ENOBUFS, errno.ECONNRESET, errno.EPIPE]: + break + raise + + # Receive as much as we can without blocking + print("receiving...", nr_send, nr_recv) + while nr_recv < nr_send: + for fileno, eventmask in ep.poll(): + receiver = fileno_to_socket[fileno] + + if eventmask & select.EPOLLRDNORM: + while True: + try: + recv_data, address = receiver.recvfrom(1024) + sender = addr_to_socket[address] + recv_hashes.setdefault((sender.fileno(), receiver.fileno()), hashlib.sha256()).update(f'<{recv_data}>'.encode('utf-8')) + nr_recv = nr_recv + 1 + except BlockingIOError as e: + break + + # exercise net/rds/tcp.c:rds_tcp_sysctl_reset() + for net in [net0, net1]: + ip('netns', 'exec', net, '/usr/sbin/sysctl', 'net.rds.tcp.rds_tcp_rcvbuf=10000') + ip('netns', 'exec', net, '/usr/sbin/sysctl', 'net.rds.tcp.rds_tcp_sndbuf=10000') + +print("done", nr_send, nr_recv) + +# the Python socket module doesn't know these +RDS_INFO_FIRST = 10000 +RDS_INFO_LAST = 10017 + +nr_success = 0 +nr_error = 0 + +for s in sockets: + for optname in range(RDS_INFO_FIRST, RDS_INFO_LAST + 1): + # Sigh, the Python socket module doesn't allow us to pass + # buffer lengths greater than 1024 for some reason. RDS + # wants multiple pages. 
+ try: + s.getsockopt(socket.SOL_RDS, optname, 1024) + nr_success = nr_success + 1 + except OSError as e: + nr_error = nr_error + 1 + if e.errno == errno.ENOSPC: + # ignore + pass + +print(f"getsockopt(): {nr_success}/{nr_error}") + +print("Stopping network packet captures") +subprocess.check_call(['killall', '-q', 'tcpdump']) + +# We're done sending and receiving stuff, now let's check if what +# we received is what we sent. +for (sender, receiver), send_hash in send_hashes.items(): + recv_hash = recv_hashes.get((sender, receiver)) + + if recv_hash is None: + print("FAIL: No data received") + sys.exit(1) + + if send_hash.hexdigest() != recv_hash.hexdigest(): + print("FAIL: Send/recv mismatch") + print("hash expected:", send_hash.hexdigest()) + print("hash received:", recv_hash.hexdigest()) + sys.exit(1) + + print(f"{sender}/{receiver}: ok") + +print("Success") +sys.exit(0)
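The verification idea in test.py can be shown in isolation: one running SHA-256 per (sender, receiver) pair, updated on every send and every receive, then compared at the end. The sketch below is a simplified illustration, not the patch code. The helper names `record` and `verify` are invented here, and payloads are hashed as raw bytes rather than through the f-string formatting test.py uses.

```python
import hashlib


def record(hashes, sender, receiver, payload):
    """Fold one datagram into the running digest for its direction.

    The "<" and ">" delimiters keep message boundaries in the digest, so
    b"ab" followed by b"c" hashes differently from b"a" followed by b"bc".
    """
    digest = hashes.setdefault((sender, receiver), hashlib.sha256())
    digest.update(b"<" + payload + b">")


def verify(send_hashes, recv_hashes):
    """Return True only if every direction that sent data received an
    identical stream of datagrams in the same order."""
    for key, sent in send_hashes.items():
        received = recv_hashes.get(key)
        if received is None:
            return False
        if received.hexdigest() != sent.hexdigest():
            return False
    return True
```

Because the digest is cumulative, a lost, duplicated, corrupted or reordered datagram changes the final value. The check therefore passes only if delivery in each direction was complete and in order, which is presumably the behaviour the test expects RDS-TCP to preserve under the netem-induced loss, corruption and duplication.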
On Tue, 25 Jun 2024 18:28:34 -0700 allison.henderson@oracle.com wrote:
From: Vegard Nossum vegard.nossum@oracle.com
This adds some basic self-testing infrastructure for RDS-TCP.
Signed-off-by: Vegard Nossum vegard.nossum@oracle.com
Signed-off-by: Chuck Lever chuck.lever@oracle.com
Signed-off-by: Allison Henderson allison.henderson@oracle.com
 Documentation/dev-tools/gcov.rst           |  11 +
 MAINTAINERS                                |   1 +
 tools/testing/selftests/Makefile           |   1 +
 tools/testing/selftests/net/rds/Makefile   |  13 +
 tools/testing/selftests/net/rds/README.txt |  41 ++++
 tools/testing/selftests/net/rds/config.sh  |  56 +++++
 tools/testing/selftests/net/rds/init.sh    |  69 ++++++
 tools/testing/selftests/net/rds/run.sh     | 271 +++++++++++++++++++++
 tools/testing/selftests/net/rds/test.py    | 251 +++++++++++++++++++
Let's start with adding selftests, well integrated with kselftest infra. This is how we execute the tests in networking: https://github.com/linux-netdev/nipa/wiki/How-to-run-netdev-selftests-CI-sty...
If you want to use python please use tools/testing/selftests/net/lib/py/ instead of adding more wrappers.
On Thu, 2024-06-27 at 16:32 -0700, Jakub Kicinski wrote:
Let's start with adding selftests, well integrated with kselftest infra. This is how we execute the tests in networking: https://github.com/linux-netdev/nipa/wiki/How-to-run-netdev-selftests-CI-sty...
If you want to use python please use tools/testing/selftests/net/lib/py/ instead of adding more wrappers.
Alrighty, thank you for the review! Sorry for the delay, I was out of town last week and just saw the response. I will go through the link you've provided and update the scripts. Thank you!
Allison