Add a script to test various scenarios where a bridge is involved
in the fastpath. It runs tests in the forward path, and also in
a bridged path.
The setup is similar to a basic home router with multiple lan ports.
It uses 3 pairs of veth-devices. Each or all pairs can be
replaced by a pair of real interfaces, interconnected by wire.
This is necessary to test the behavior when dealing with
dsa ports, foreign (dsa) ports and switchdev userports that support
SWITCHDEV_OBJ_ID_PORT_VLAN.
See the head of the script for a detailed description.
Run without arguments to perform all tests on veth-devices.
Signed-off-by: Eric Woudstra <ericwouds(a)gmail.com>
---
This test script is written first for the proposed bridge-fastpath
patch-sets, but it's use is more general and can easily be expanded.
Changes in v3:
- Removed all warnings reported by shellcheck -x -e SC2317
- Improved del_pppoe(), check if interfaces are removed
- Added is_known_issue() to warn instead of error for known issues
- Link down and (hardware) interfaces to default netns at end of script
- Removed matching ip(v6) address
Changes in v2:
- Moved test-series to functions
- Moved code to set_pair_link() up/down
- Added conntrack zone to bridged traffic
- Test bridge chain prerouting in test without fastpath
and bridge chain forward in tests with fastpath
Some example outputs of this last version of patches from different
hardware, without and with patches:
ALL VETH:
=========
./bridge_fastpath.sh -t
Setup:
CLIENT 0
veth0cl
|
veth0rt
WAN
ROUTER
LAN1 LAN2
veth1rt veth2rt
| |
veth1cl veth2cl
CLIENT 1 CLIENT 2
Without patches:
PASS: unaware bridge, without encaps, without fastpath
PASS: unaware bridge, with single vlan encap, without fastpath
WARN: unaware bridge, with double q vlan encaps, without fastpath: ipv4/6: established bytes 0 < 4194304
WARN: unaware bridge, with 802.1ad vlan encaps, without fastpath: ipv4/6: established bytes 0 < 4194304
WARN: unaware bridge, with pppoe encap, without fastpath: ipv4/6: established bytes 0 < 4194304
WARN: unaware bridge, with pppoe-in-q encaps, without fastpath: ipv4/6: established bytes 0 < 4194304
PASS: aware bridge, without/without vlan encap, without fastpath
PASS: aware bridge, with/without vlan encap, without fastpath
PASS: aware bridge, with/with vlan encap, without fastpath
PASS: aware bridge, without/with vlan encap, without fastpath
PASS: forward, without vlan-device, without vlan encap, client1, without fastpath
PASS: forward, without vlan-device, without vlan encap, client1, with fastpath
PASS: forward, without vlan-device, with vlan encap, client1, without fastpath
WARN: forward, without vlan-device, with vlan encap, client1, with fastpath: ipv4/6: tcp broken
PASS: forward, with vlan-device, without vlan encap, client1, without fastpath
PASS: forward, with vlan-device, without vlan encap, client1, with fastpath
PASS: forward, with vlan-device, with vlan encap, client1, without fastpath
PASS: forward, with vlan-device, with vlan encap, client1, with fastpath
PASS: all tests passed
With patches:
PASS: unaware bridge, without encaps, without fastpath
PASS: unaware bridge, without encaps, with fastpath
PASS: unaware bridge, with single vlan encap, without fastpath
PASS: unaware bridge, with single vlan encap, with fastpath
PASS: unaware bridge, with double q vlan encaps, without fastpath
PASS: unaware bridge, with double q vlan encaps, with fastpath
PASS: unaware bridge, with 802.1ad vlan encaps, without fastpath
PASS: unaware bridge, with 802.1ad vlan encaps, with fastpath
PASS: unaware bridge, with pppoe encap, without fastpath
PASS: unaware bridge, with pppoe encap, with fastpath
PASS: unaware bridge, with pppoe-in-q encaps, without fastpath
PASS: unaware bridge, with pppoe-in-q encaps, with fastpath
PASS: aware bridge, without/without vlan encap, without fastpath
PASS: aware bridge, without/without vlan encap, with fastpath
PASS: aware bridge, with/without vlan encap, without fastpath
PASS: aware bridge, with/without vlan encap, with fastpath
PASS: aware bridge, with/with vlan encap, without fastpath
PASS: aware bridge, with/with vlan encap, with fastpath
PASS: aware bridge, without/with vlan encap, without fastpath
PASS: aware bridge, without/with vlan encap, with fastpath
PASS: forward, without vlan-device, without vlan encap, client1, without fastpath
PASS: forward, without vlan-device, without vlan encap, client1, with fastpath
PASS: forward, without vlan-device, with vlan encap, client1, without fastpath
PASS: forward, without vlan-device, with vlan encap, client1, with fastpath
PASS: forward, with vlan-device, without vlan encap, client1, without fastpath
PASS: forward, with vlan-device, without vlan encap, client1, with fastpath
PASS: forward, with vlan-device, with vlan encap, client1, without fastpath
PASS: forward, with vlan-device, with vlan encap, client1, with fastpath
PASS: all tests passed
BANANAPI-R3 (lan1 & lan2 are dsa):
============
Without patches:
./bridge_fastpath.sh -t -0 enu1u2,lan2 -1 enu1u1,lan1 -2 lan4,eth1
Setup:
CLIENT 0
enu1u2
|
lan2
WAN
ROUTER
LAN1 LAN2
lan1 eth1
| |
enu1u1 lan4
CLIENT 1 CLIENT 2
PASS: unaware bridge, without encaps, without fastpath
PASS: unaware bridge, with single vlan encap, without fastpath
WARN: unaware bridge, with pppoe encap, without fastpath: ipv4/6: established bytes 0 < 4194304
WARN: unaware bridge, with pppoe-in-q encaps, without fastpath: ipv4/6: established bytes 0 < 4194304
PASS: aware bridge, without/without vlan encap, without fastpath
PASS: aware bridge, with/without vlan encap, without fastpath
PASS: aware bridge, with/with vlan encap, without fastpath
PASS: aware bridge, without/with vlan encap, without fastpath
PASS: forward, without vlan-device, without vlan encap, client1, without fastpath
WARN: forward, without vlan-device, without vlan encap, client1, with fastpath: ipv4: counted bytes 2110480 > 2097152
WARN: forward, without vlan-device, without vlan encap, client1, with fastpath: ipv6: counted bytes 2116104 > 2097152
PASS: forward, without vlan-device, without vlan encap, client1, with hw_fastpath
PASS: forward, without vlan-device, without vlan encap, client2, without fastpath
PASS: forward, without vlan-device, without vlan encap, client2, with fastpath
PASS: forward, without vlan-device, without vlan encap, client2, with hw_fastpath
PASS: forward, without vlan-device, with vlan encap, client1, without fastpath
WARN: forward, without vlan-device, with vlan encap, client1, with fastpath: ipv4/6: tcp broken
WARN: forward, without vlan-device, with vlan encap, client1, with hw_fastpath: ipv4/6: tcp broken
PASS: forward, without vlan-device, with vlan encap, client2, without fastpath
WARN: forward, without vlan-device, with vlan encap, client2, with fastpath: ipv4/6: tcp broken
WARN: forward, without vlan-device, with vlan encap, client2, with hw_fastpath: ipv4/6: tcp broken
PASS: forward, with vlan-device, without vlan encap, client1, without fastpath
PASS: forward, with vlan-device, without vlan encap, client1, with fastpath
PASS: forward, with vlan-device, without vlan encap, client1, with hw_fastpath
PASS: forward, with vlan-device, without vlan encap, client2, without fastpath
WARN: forward, with vlan-device, without vlan encap, client2, with fastpath: ipv4: counted bytes 2122388 > 2097152
WARN: forward, with vlan-device, without vlan encap, client2, with fastpath: ipv6: counted bytes 2129280 > 2097152
WARN: forward, with vlan-device, without vlan encap, client2, with hw_fastpath: ipv4: counted bytes 2110428 > 2097152
WARN: forward, with vlan-device, without vlan encap, client2, with hw_fastpath: ipv6: counted bytes 2140144 > 2097152
PASS: forward, with vlan-device, with vlan encap, client1, without fastpath
PASS: forward, with vlan-device, with vlan encap, client1, with fastpath
PASS: forward, with vlan-device, with vlan encap, client1, with hw_fastpath
PASS: forward, with vlan-device, with vlan encap, client2, without fastpath
PASS: forward, with vlan-device, with vlan encap, client2, with fastpath
PASS: forward, with vlan-device, with vlan encap, client2, with hw_fastpath
PASS: all tests passed
With patches:
PASS: unaware bridge, without encaps, without fastpath
PASS: unaware bridge, without encaps, with fastpath
PASS: unaware bridge, without encaps, with hw_fastpath
PASS: unaware bridge, with single vlan encap, without fastpath
PASS: unaware bridge, with single vlan encap, with fastpath
PASS: unaware bridge, with single vlan encap, with hw_fastpath
PASS: unaware bridge, with pppoe encap, without fastpath
PASS: unaware bridge, with pppoe encap, with fastpath
PASS: unaware bridge, with pppoe encap, with hw_fastpath
PASS: unaware bridge, with pppoe-in-q encaps, without fastpath
PASS: unaware bridge, with pppoe-in-q encaps, with fastpath
PASS: unaware bridge, with pppoe-in-q encaps, with hw_fastpath
PASS: aware bridge, without/without vlan encap, without fastpath
PASS: aware bridge, without/without vlan encap, with fastpath
PASS: aware bridge, without/without vlan encap, with hw_fastpath
PASS: aware bridge, with/without vlan encap, without fastpath
PASS: aware bridge, with/without vlan encap, with fastpath
PASS: aware bridge, with/without vlan encap, with hw_fastpath
PASS: aware bridge, with/with vlan encap, without fastpath
PASS: aware bridge, with/with vlan encap, with fastpath
PASS: aware bridge, with/with vlan encap, with hw_fastpath
PASS: aware bridge, without/with vlan encap, without fastpath
PASS: aware bridge, without/with vlan encap, with fastpath
PASS: aware bridge, without/with vlan encap, with hw_fastpath
PASS: forward, without vlan-device, without vlan encap, client1, without fastpath
PASS: forward, without vlan-device, without vlan encap, client1, with fastpath
PASS: forward, without vlan-device, without vlan encap, client1, with hw_fastpath
PASS: forward, without vlan-device, without vlan encap, client2, without fastpath
PASS: forward, without vlan-device, without vlan encap, client2, with fastpath
PASS: forward, without vlan-device, without vlan encap, client2, with hw_fastpath
PASS: forward, without vlan-device, with vlan encap, client1, without fastpath
PASS: forward, without vlan-device, with vlan encap, client1, with fastpath
PASS: forward, without vlan-device, with vlan encap, client1, with hw_fastpath
PASS: forward, without vlan-device, with vlan encap, client2, without fastpath
PASS: forward, without vlan-device, with vlan encap, client2, with fastpath
PASS: forward, without vlan-device, with vlan encap, client2, with hw_fastpath
PASS: forward, with vlan-device, without vlan encap, client1, without fastpath
PASS: forward, with vlan-device, without vlan encap, client1, with fastpath
PASS: forward, with vlan-device, without vlan encap, client1, with hw_fastpath
PASS: forward, with vlan-device, without vlan encap, client2, without fastpath
PASS: forward, with vlan-device, without vlan encap, client2, with fastpath
PASS: forward, with vlan-device, without vlan encap, client2, with hw_fastpath
PASS: forward, with vlan-device, with vlan encap, client1, without fastpath
PASS: forward, with vlan-device, with vlan encap, client1, with fastpath
PASS: forward, with vlan-device, with vlan encap, client1, with hw_fastpath
PASS: forward, with vlan-device, with vlan encap, client2, without fastpath
PASS: forward, with vlan-device, with vlan encap, client2, with fastpath
PASS: forward, with vlan-device, with vlan encap, client2, with hw_fastpath
PASS: all tests passed
.../testing/selftests/net/netfilter/Makefile | 1 +
.../net/netfilter/bridge_fastpath.sh | 1055 +++++++++++++++++
2 files changed, 1056 insertions(+)
create mode 100755 tools/testing/selftests/net/netfilter/bridge_fastpath.sh
diff --git a/tools/testing/selftests/net/netfilter/Makefile b/tools/testing/selftests/net/netfilter/Makefile
index a98ed892f55f..e0de04333a3f 100644
--- a/tools/testing/selftests/net/netfilter/Makefile
+++ b/tools/testing/selftests/net/netfilter/Makefile
@@ -8,6 +8,7 @@ MNL_LDLIBS := $(shell $(HOSTPKG_CONFIG) --libs libmnl 2>/dev/null || echo -lmnl)
TEST_PROGS := br_netfilter.sh bridge_brouter.sh
TEST_PROGS += br_netfilter_queue.sh
+TEST_PROGS += bridge_fastpath.sh
TEST_PROGS += conntrack_dump_flush.sh
TEST_PROGS += conntrack_icmp_related.sh
TEST_PROGS += conntrack_ipip_mtu.sh
diff --git a/tools/testing/selftests/net/netfilter/bridge_fastpath.sh b/tools/testing/selftests/net/netfilter/bridge_fastpath.sh
new file mode 100755
index 000000000000..614497489edb
--- /dev/null
+++ b/tools/testing/selftests/net/netfilter/bridge_fastpath.sh
@@ -0,0 +1,1055 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+#
+# Check if conntrack, nft chain and fastpath is functional in setups
+# where a bridge is in the fastpath.
+#
+# Commandline options make it possible to use real ethernet pairs
+# instead of veth-device pairs. Any, or all, pairs can be tested using
+# real hardware pairs. This is can be useful to test dsa-ports,
+# switchdev (dsa) foreign ports and switchdev ports supporting
+# SWITCHDEV_OBJ_ID_PORT_VLAN.
+#
+# First tcp is tested. Conntrack and nft chain are tested using a counter.
+# When there is a fastpath possible between the interfaces then the
+# fastpath is also tested.
+# When there is a hardware offloaded fastpath possible between the
+# interfaces then the hardware offloaded path is also tested.
+#
+# Setup is as a typical router:
+#
+# nsclientwan
+# |
+# nsrt
+# | |
+# nsclient1 nsclient2
+#
+# Masquerading for ipv4 only.
+#
+# First check if a bridge table forward chain can be setup, skip
+# these tests if this is not possible.
+# Then check if a inet table forward chain can be setup, skip
+# these tests if this is not possible.
+#
+# Different setups of paths are tested that involve a bridge in the
+# fastpath. This can be in the forward-fastpath or in the bridge-fastpath.
+#
+# The first series, in the bridge-fastpath, using a vlan-unaware bridge.
+# Traffic with the following vlan-tags is checked:
+# a. without vlan
+# b. single vlan
+# c. double q vlan (only on veth-devices)
+# d. 802.1ad vlan (only on veth-devices)
+# e. pppoe (when available)
+# f. pppoe-in-q (when available)
+#
+# (for items c to f fastpath can only work when a conntrack zone is set)
+# (double tag testing results in broken tcp traffic on most hardware,
+# in this test setup, use '-a' argument to test it anyway)
+# (pppoe testing takes place if pppd and pppoe-server are installed)
+#
+# The second series, in the bridge-fastpath, using a vlan-aware bridge.
+# Here we test all combinations of ingress/egress with or without single
+# vlan encaps.
+#
+# The third series, in the forward-fastpath, using a vlan-aware bridge,
+# without a vlan-device linked to the master port. We test the same combinations
+# of ingress/egress with or without single vlan encaps.
+#
+# The fourth series, in the forward-fastpath, using a vlan-aware bridge,
+# with a vlan-device linked to the master port. We test the same combinations
+# of ingress/egress with or without single vlan encaps.
+#
+# Note 1: Using dsa userports on both sides of eth-pairs client1 or client2
+# gives erratic and unpredictable results. Use, for example, an usb-eth device
+# on the client side to test a dsa-userport.
+#
+# Note 2: Testing the hardware offloaded fastpath, it is not checked if the
+# packets do not follow the software fastpath instead. A universal way to
+# check this should be added at some point.
+#
+# Note 3: Some interfaces to test on the router side, are netns immutable.
+# Use the -d or --defaultnsrouter option so that the interfaces of the router
+# do not have to change netns. The router is build up in the default netns.
+#
+
+source lib.sh
+
+checktool "nft --version" "run test without nft"
+checktool "socat -h" "run test without socat"
+checktool "bridge -V" "run test without bridge"
+
+NR_OF_TESTS=4
+VID1=100
+VID2=101
+BRWAN=brwan
+BRLAN=brlan
+BRCL=brcl
+LINKUP_TIMEOUT=10
+PING_TIMEOUT=10
+SOCAT_TIMEOUT=10
+filesize=$((2 * 1024 * 1024))
+
+filein=$(mktemp)
+file1out=$(mktemp)
+file2out=$(mktemp)
+pppoeserveroptions=$(mktemp)
+pppoeserverpid=$(mktemp)
+
+setup_ns nsclientwan nsclientlan1 nsclientlan2
+
+ WAN=0 ; LAN1=1 ; LAN2=2 ; ADWAN=3 ; ADLAN=4
+nsa=( "$nsclientwan" "$nsclientlan1" "$nsclientlan2" ) # $nsrt $nsrt
+AD4=( '192.168.1.1' '192.168.2.101' '192.168.2.102' '192.168.1.2' '192.168.2.1' )
+AD6=( 'dead:1::1' 'dead:2::101' 'dead:2::102' 'dead:1::2' 'dead:2::1' )
+
+tests_string=$(seq 1 $NR_OF_TESTS)
+
+while [ "${1:-}" != '' ]; do
+ case "$1" in
+ '-0' | '--pairwan')
+ shift
+ vethcl[WAN]="${1%,*}"
+ vethrt[WAN]="${1#*,}"
+ ;;
+ '-1' | '--pairlan1')
+ shift
+ vethcl[LAN1]="${1%,*}"
+ vethrt[LAN1]="${1#*,}"
+ ;;
+ '-2' | '--pairlan2')
+ shift
+ vethcl[LAN2]="${1%,*}"
+ vethrt[LAN2]="${1#*,}"
+ ;;
+ '-s' | '--filesize')
+ shift
+ filesize=$1
+ ;;
+ '-p' | '--parts')
+ shift
+ tests_string=$1
+ ;;
+ '-4' | '--ipv4')
+ do_ipv4=1
+ ;;
+ '-6' | '--ipv6')
+ do_ipv6=1
+ ;;
+ '-n' | '--noskip')
+ noskip=1
+ ;;
+ '-d' | '--defaultnsrouter')
+ defaultnsrouter=1
+ ;;
+ '-f' | '--fixmac')
+ fixmac=1
+ ;;
+ '-t' | '--showtree')
+ showtree=1
+ ;;
+ *)
+ cat <<-EOF
+ Usage: $(basename "$0") [OPTION]...
+ -0 --pairwan eth0cl,eth0rt pair of real interfaces to use on wan side
+ -1 --pairlan1 eth1cl,eth1rt pair of real interfaces to use on lan1 side
+ -2 --pairlan2 eth2cl,eth2rt pair of real interfaces to use on lan2 side
+ -s --filesize filesize to use for testing in bytes
+ -p --parts partnumbers of tests to run, comma separated
+ -4|-6 --ipv4|--ipv6 test ipv4/6 only
+ -d --defaultnsrouter router in default network namespace, caution!
+ -f --fixmac change mac address when conflict found
+ -n --noskip also perform the normally skipped tests
+ -t --showtree show the tree of used interfaces
+ EOF
+ exit "$ksft_skip"
+ ;;
+ esac
+ shift
+done
+
+for i in ${tests_string//','/' '}; do
+ tests[i]="yes"
+done
+
+if [ -n "$defaultnsrouter" ]; then
+ nsrt="nsrt-$(mktemp -u XXXXXX)"
+ touch "/var/run/netns/$nsrt"
+ mount --bind /proc/1/ns/net "/var/run/netns/$nsrt"
+else
+ setup_ns nsrt
+fi
+nsa+=("$nsrt" "$nsrt")
+
+cleanup() {
+ if [ -n "$defaultnsrouter" ]; then
+ umount "/var/run/netns/$nsrt"
+ rm -f "/var/run/netns/$nsrt"
+ fi
+ cleanup_all_ns
+ rm -f "$filein" "$file1out" "$file2out" "$pppoeserveroptions" "$pppoeserverpid"
+}
+
+trap cleanup EXIT
+
+head -c "$filesize" < /dev/urandom > "$filein"
+
+check_mac()
+{
+ local ns=$1
+ local dev=$2
+ local othermacs=$3
+ local mac
+
+ mac=$(ip -net "$ns" -br link show dev "$dev" | \
+ grep -o -E '([[:xdigit:]]{1,2}:){5}[[:xdigit:]]{1,2}')
+
+ if [[ ! "$othermacs" =~ $mac ]]; then
+ echo "$mac"
+ return 0
+ fi
+ echo "WARN: Conflicting mac address $dev $mac" 1>&2
+
+ [ -z "$fixmac" ] && return 1
+
+ for (( j = 0 ; j < 10 ; j++ )); do
+ mac="${mac::6}$(printf %02x:%02x:%02x:%02x $((RANDOM%256)) \
+ $((RANDOM%256)) $((RANDOM%256)) $((RANDOM%256)))"
+ [[ "$othermacs" =~ $mac ]] && continue
+ echo "$mac"
+ ip -net "$ns" link set dev "$dev" address "$mac" 1>&2
+ return $?
+ done
+ return 1
+}
+
+is_link()
+{
+ local updown=$1
+ local ns=$2
+ local dev=$3
+
+ if ip -net "$ns" link show dev "$dev" "${updown,,}" 2>/dev/null | \
+ grep -q "state ${updown^^}"
+ then
+ return 0
+ fi
+ return 1
+}
+
+set_pair_link()
+{
+ local updown=$1
+ local all="${*:2}"
+ local lret=0
+ local i j
+
+ for i in $all; do
+ ns="${nsa[$i]}"
+ ip -net "$ns" link set "${vethcl[$i]}" "$updown"
+ lret=$((lret | $?))
+ ip -net "$nsrt" link set "${vethrt[$i]}" "$updown"
+ lret=$((lret | $?))
+ done
+ [ $lret -ne 0 ] && return 1
+
+ for j in $(seq 1 $((LINKUP_TIMEOUT * 5 ))); do
+ lret=0
+ for i in $all; do
+ ns="${nsa[$i]}"
+ is_link "$updown" "$ns" "${vethcl[$i]}"
+ lret=$((lret | $?))
+ is_link "$updown" "$nsrt" "${vethrt[$i]}"
+ lret=$((lret | $?))
+ done
+ [ $lret -eq 0 ] && break
+ sleep 0.2
+ done
+ return $lret
+}
+
+wait_ping()
+{
+ local i1=$1
+ local i2=$2
+ local ns1=${nsa[$i1]}
+ local j
+ local lret
+
+ for j in $(seq 1 $((PING_TIMEOUT * 5 ))); do
+ ip netns exec "$ns1" ping -c 1 -w $PING_TIMEOUT -i 0.2 \
+ -q "${AD4[$i2]}" >/dev/null 2>&1
+ lret=$?
+ [ $lret -le 1 ] && return $lret
+ sleep 0.2
+ done
+ return 1
+}
+
+add_addr()
+{
+ local i=$1
+ local dev=$2
+ local ns=${nsa[$i]}
+ local ad4=${AD4[$i]}
+ local ad6=${AD6[$i]}
+
+ ip -net "$ns" addr add "${ad4}/24" dev "$dev"
+ ip -net "$ns" addr add "${ad6}/64" dev "$dev" nodad
+ if [[ "$ns" == "nsclientlan"* ]]; then
+ ip -net "$ns" route add default via "${AD4[$ADLAN]}"
+ ip -net "$ns" route add default via "${AD6[$ADLAN]}"
+ elif [[ "$ns" == "nsclientwan"* ]]; then
+ ip -net "$ns" route add default via "${AD6[$ADWAN]}"
+ fi
+
+}
+
+del_addr()
+{
+ local i=$1
+ local dev=$2
+ local ns=${nsa[$i]}
+ local ad4=${AD4[$i]}
+ local ad6=${AD6[$i]}
+
+ if [[ "$ns" == "nsclientlan"* ]]; then
+ ip -net "$ns" route del default via "${AD6[$ADLAN]}"
+ ip -net "$ns" route del default via "${AD4[$ADLAN]}"
+ elif [[ "$ns" == "nsclientwan"* ]]; then
+ ip -net "$ns" route del default via "${AD6[$ADWAN]}"
+ fi
+ ip -net "$ns" addr del "${ad6}/64" dev "$dev" nodad
+ ip -net "$ns" addr del "${ad4}/24" dev "$dev"
+}
+
+set_client()
+{
+ local i=$1
+ local vlan=$2
+ local arg=$3
+ local ns=${nsa[$i]}
+ local vdev="${vethcl[$i]}"
+ local brdev="$BRCL"
+ local proto=""
+ local pvidslave=""
+
+ unset_client "$i"
+
+ if [[ "$vlan" == "qq" ]]; then
+ ip -net "$ns" link add link "$vdev" name "$vdev.$VID1" type vlan id $VID1
+ ip -net "$ns" link add link "$vdev.$VID1" name "$vdev.$VID1.$VID2" \
+ type vlan id $VID2
+ ip -net "$ns" link set "$vdev.$VID1" up
+ ip -net "$ns" link set "$vdev.$VID1.$VID2" up
+ add_addr "$i" "$vdev.$VID1.$VID2"
+ return
+ fi
+
+ [[ "$vlan" == "none" ]] && pvidslave="pvid untagged"
+ [[ "$vlan" == "ad" ]] && proto="vlan_protocol 802.1ad"
+
+ # shellcheck disable=SC2086
+ ip -net "$ns" link add "$brdev" type bridge vlan_filtering 1 vlan_default_pvid 0 $proto
+ ip -net "$ns" link set "$vdev" master "$brdev"
+ ip -net "$ns" link set "$brdev" up
+
+ # shellcheck disable=SC2086
+ bridge -net "$ns" vlan add dev "$vdev" vid $VID1 $pvidslave
+ bridge -net "$ns" vlan add dev "$brdev" vid $VID1 pvid untagged self
+
+ if [[ "$vlan" == "ad" ]]; then
+ ip -net "$ns" link add link "$brdev" name "$brdev.$VID2" type vlan id $VID2
+ brdev="$brdev.$VID2"
+ ip -net "$ns" link set "$brdev" up
+ fi
+
+ if [[ "$arg" != "noaddress" ]]; then
+ add_addr "$i" "$brdev"
+ fi
+}
+
+unset_client()
+{
+ local i=$1
+ local ns=${nsa[$i]}
+ local vdev="${vethcl[$i]}"
+ local brdev="$BRCL"
+
+ ip -net "$ns" link del "$brdev" type bridge 2>/dev/null
+ ip -net "$ns" link del "$vdev.$VID1" 2>/dev/null
+}
+
+add_pppoe()
+{
+ local i1=$1
+ local i2=$2
+ local dev1=$3
+ local dev2=$4
+ local desc=$5
+ local ns1=${nsa[$i1]}
+ local ns2=${nsa[$i2]}
+
+ ppp1=0
+ while [ -n "$(ip -net "$ns1" link show ppp$ppp1 2>/dev/null)" ]
+ do ((ppp1++)); done
+ echo "noauth defaultroute noipdefault unit $ppp1" >"$pppoeserveroptions"
+ ppp1="ppp$ppp1"
+
+ if ! ip netns exec "$ns1" pppoe-server -k -L "${AD4[$i1]}" -R "${AD4[$i2]}" \
+ -I "$dev1" -X "$pppoeserverpid" -O "$pppoeserveroptions" >/dev/null; then
+ echo "ERROR: $desc: failed to setup pppoe server" 1>&2
+ return 1
+ fi
+
+ if ! ip netns exec "$ns2" pppd plugin pppoe.so nic-"$dev2" persist holdoff 0 noauth \
+ defaultroute noipdefault noaccomp nodeflate noproxyarp nopcomp \
+ novj novjccomp linkname "selftest-$$" >/dev/null; then
+ echo "ERROR: $desc: failed to setup pppoe client" 1>&2
+ return 1
+ fi
+
+ if ! wait_ping "$i1" "$i2"; then
+ echo "ERROR: $desc: failed to setup functional pppoe connection" 1>&2
+ return 1
+ fi
+
+ ppp2=$(tail -n 1 < "/run/pppd/ppp-selftest-$$.pid")
+
+ ip -net "$ns1" addr add "${AD6[$i1]}/64" dev "$ppp1" nodad
+ ip -net "$ns2" addr add "${AD6[$i2]}/64" dev "$ppp2" nodad
+
+ return 0
+}
+
+del_pppoe()
+{
+ local i1=$1
+ local i2=$2
+ local dev1=$3
+ local dev2=$4
+ local ns1=${nsa[$i1]}
+ local ns2=${nsa[$i2]}
+ local i serverpid clientpid
+
+ serverpid="$(head -n 1 < "$pppoeserverpid")"
+ clientpid="$(head -n 1 < "/run/pppd/ppp-selftest-$$.pid")"
+
+ [[ -n "$ppp1" ]] && ip -net "$ns1" addr del "${AD6[$i1]}/64" dev "$ppp1"
+ [[ -n "$ppp2" ]] && ip -net "$ns2" addr del "${AD6[$i2]}/64" dev "$ppp2"
+
+ for i in $(seq 1 $((PING_TIMEOUT * 5 ))); do
+ if ip -net "$ns2" link show dev "$ppp2" 1>/dev/null 2>/dev/null; then
+ kill -9 "$clientpid" 2>/dev/null
+ elif ip -net "$ns1" link show dev "$ppp1" 1>/dev/null 2>/dev/null; then
+ kill -SIGTERM "$serverpid" 2>/dev/null
+ else return 0
+ fi
+ sleep 0.2
+ done
+ echo "ERROR: failed to remove pppoe connection" 1>&2
+ return 1
+}
+
+listener_ready()
+{
+ local ns=$1
+ local ipv=$2
+
+ ss -N "$ns" --ipv"$ipv" -lnt -o "sport = :8080" | grep -q 8080
+}
+
+test_tcp() {
+ local i1=$1
+ local i2=$2
+ local dofast=$3
+ local desc=$4
+ local ns1=${nsa[$i1]}
+ local ns2=${nsa[$i2]}
+ local i=-1
+ local lret=0
+ local ads=""
+ local ipv ad a lpid bytes limit error
+
+ if [ -n "$do_ipv4" ]; then ads="${AD4[$i2]}"
+ elif [ -n "$do_ipv6" ]; then ads="${AD6[$i2]}"
+ else ads="${AD4[$i2]} ${AD6[$i2]}"
+ fi
+ for ad in $ads; do
+ ((i++))
+ if [[ "$ad" =~ ":" ]]
+ then ipv="6"; a="[${ad}]"
+ else ipv="4"; a="${ad}"
+ fi
+
+ rm -f "$file1out" "$file2out"
+
+ # ip netns exec "$nsrt" nft reset counters >/dev/null
+ # But on some systems this results in 4GB values in packet and byte count, so:
+ (echo "flush ruleset"; ip netns exec "$nsrt" nft --stateless list ruleset) | \
+ ip netns exec "$nsrt" nft -f -
+
+ timeout "$SOCAT_TIMEOUT" ip netns exec "$ns2" socat TCP$ipv-LISTEN:8080,reuseaddr \
+ STDIO <"$filein" >"$file2out" 2>/dev/null &
+ lpid=$!
+ busywait 1000 listener_ready "$ns2" "$ipv"
+
+ timeout "$SOCAT_TIMEOUT" ip netns exec "$ns1" socat TCP$ipv:"$a":8080 \
+ STDIO <"$filein" >"$file1out" 2>/dev/null
+
+ if ! wait $lpid; then
+ error[i]="tcp broken"
+ continue
+ fi
+ if ! cmp "$filein" "$file1out" >/dev/null 2>&1; then
+ error[i]="file mismatch to ${ad}"
+ continue
+ fi
+ if ! cmp "$filein" "$file2out" >/dev/null 2>&1; then
+ error[i]="file mismatch from ${ad}"
+ continue
+ fi
+
+ limit=$((2 * filesize))
+ bytes=$(ip netns exec "$nsrt" nft list counter $family filter "check" | \
+ grep "packets" | cut -d' ' -f4)
+ if [ -z "$dofast" ] && [ "$bytes" -lt "$limit" ]; then
+
+ error[i]="established bytes $bytes < $limit"
+ continue
+ fi
+ if [ -n "$dofast" ] && [ "$bytes" -gt "$((limit/2))" ]; then
+ # Significant reduction of bytes expected
+ error[i]="counted bytes $bytes > $((limit/2))"
+ continue
+ fi
+
+ done
+
+ if [ -n "${error[0]}" ]; then
+ if [[ "${error[0]}" == "${error[1]}" ]]; then
+ error[0]="$desc: ipv4/6: ${error[0]}"
+ error[1]=""
+ else
+ error[0]="$desc: ipv4: ${error[0]}"
+ fi
+ fi
+ if [ -n "${error[1]}" ]; then
+ error[1]="$desc: ipv6: ${error[1]}"
+ fi
+
+ for i in 0 1; do
+ if [ -n "${error[i]}" ]; then
+ if is_known_issue "$desc: ${error[i]}"; then
+ echo "WARN: ${error[i]}" 1>&2
+ lret=$((lret | 1))
+ else
+ echo "ERROR: ${error[i]}" 1>&2
+ lret=$((lret | 2))
+ fi
+ fi
+ done
+ if [ $lret -eq 0 ]; then
+ echo "PASS: $desc"
+ fi
+ return $(( lret & 2 ))
+}
+
+known_issues=(
+'*unaware bridge,*with double q vlan encaps,*without fastpath*established*' # 1
+'*unaware bridge,*with 802.1ad vlan encaps,*without fastpath*established*' # 1
+'*unaware bridge,*with pppoe encap,*without fastpath*established*' # 1
+'*unaware bridge,*with pppoe-in-q encaps,*without fastpath*established*' # 1
+'*forward,*without vlan-device, without vlan encap,*with *fastpath:*counted*' # 2
+'*forward,*without vlan-device, with vlan encap,*with *fastpath:*tcp broken*' # 3
+'*forward,*with vlan-device, without vlan encap,*with *fastpath:*counted*' # 4
+)
+
+is_known_issue() {
+ local err=$1
+ for issue in "${known_issues[@]}"; do
+ # shellcheck disable=SC2053
+ [[ "$err" == $issue ]] && return 0
+ done
+ return 1
+}
+
+test_paths() {
+ local i1=$1
+ local i2=$2
+ local desc=$3
+ local ns1=${nsa[$i1]}
+ local ns2=${nsa[$i2]}
+
+
+ if ! setup_nftables "$i1" "$i2"; then
+ echo "ERROR: $desc: cannot setup nftables" 1>&2
+ return 1
+ fi
+ if ! test_tcp "$i1" "$i2" "" "$desc without fastpath"; then
+ return 1
+ fi
+
+ if ! setup_fastpath "$i1" "$i2" "" 2>/dev/null; then
+ return 0
+ fi
+ if ! test_tcp "$i1" "$i2" "fast" "$desc with fastpath"; then
+ return 1
+ fi
+
+ if ! setup_fastpath "$i1" "$i2" "hw" 2>/dev/null; then
+ return 0
+ fi
+ if ! test_tcp "$i1" "$i2" "fast" "$desc with hw_fastpath"; then
+ return 1
+ fi
+
+ return 0
+
+}
+
+add_masq()
+{
+ if [[ $family != "bridge" ]]; then
+ ip netns exec "$nsrt" nft -f - <<-EOF
+ table ip nat {
+ chain postrouting {
+ type nat hook postrouting priority 0;
+ oifname ${BRWAN} masquerade
+ }
+ }
+ EOF
+ else
+ return 0
+ fi
+}
+
+add_zone()
+{
+ local devs=$1
+
+ if [[ $family == "bridge" ]]; then
+ ip netns exec "$nsrt" nft -f - <<-EOF
+ table ${family} filter {
+ chain preroutingzones {
+ type filter hook prerouting priority -300;
+ iif ${devs} ct zone set 23
+ }
+ }
+ EOF
+ fi
+}
+
+setup_nftables()
+{
+ local devs="{ ${vethrt[$1]} , ${vethrt[$2]} }"
+ local i1=$1
+ local i2=$2
+
+ ip netns exec "$nsrt" nft flush ruleset
+
+ if ! add_masq; then
+ return 1
+ fi
+
+ add_zone "${devs}" 2>/dev/null
+
+ ip netns exec "$nsrt" nft -f - <<-EOF
+ table ${family} filter {
+ counter check { }
+ chain prerouting {
+ type filter hook prerouting priority 0; policy accept;
+ ct state established tcp dport 8080 counter name "check"
+ ct state established tcp sport 8080 counter name "check"
+ }
+ }
+ EOF
+}
+
+setup_fastpath()
+{
+ local devs="{ ${vethrt[$1]} , ${vethrt[$2]} }"
+ local arg=$3
+ local flags=""
+
+ [[ "$arg" == "hw" ]] && flags="flags offload"
+
+ ip netns exec "$nsrt" nft flush ruleset
+
+ if ! add_masq; then
+ return 1
+ fi
+
+ add_zone "${devs}" 2>/dev/null
+
+ ip netns exec "$nsrt" nft -f - <<-EOF
+ table ${family} filter {
+ counter check { }
+ flowtable f {
+ hook ingress priority filter
+ devices = ${devs}
+ ${flags}
+ }
+ chain forward {
+ type filter hook forward priority 0; policy accept;
+ counter name "check"
+ ct state established flow add @f
+ }
+ }
+ EOF
+}
+
+test_unaware_bridge()
+{
+ local lret=0
+ local i
+
+ for i in $LAN1 $LAN2; do
+ set_client "$i" none
+ done
+
+ test_paths $LAN1 $LAN2 "unaware bridge, without encaps, "
+ lret=$((lret | $?))
+
+ for i in $LAN1 $LAN2; do
+ set_client "$i" q
+ done
+
+ test_paths $LAN1 $LAN2 "unaware bridge, with single vlan encap, "
+ lret=$((lret | $?))
+
+ for i in $LAN1 $LAN2; do
+ set_client "$i" qq
+ done
+
+ # Skip testing double tagged packets on real hardware
+ if [ -n "$lan_all_veth" ] || [ -n "$noskip" ]; then
+
+ test_paths $LAN1 $LAN2 "unaware bridge, with double q vlan encaps, "
+ lret=$((lret | $?))
+
+ for i in $LAN1 $LAN2; do
+ set_client "$i" ad
+ done
+
+ test_paths $LAN1 $LAN2 "unaware bridge, with 802.1ad vlan encaps, "
+ lret=$((lret | $?))
+
+ fi
+ # End Skip testing double tagged packets
+
+ if [ -n "$(command -v pppd 2>/dev/null)" ] &&
+ [ -n "$(command -v pppoe-server 2>/dev/null)" ]; then
+ # Start pppoe
+
+ for i in $LAN1 $LAN2; do
+ set_client "$i" none noaddress
+ done
+
+ if add_pppoe $LAN1 $LAN2 "$BRCL" "$BRCL" "unaware bridge, with pppoe encap"; then
+ test_paths $LAN1 $LAN2 "unaware bridge, with pppoe encap, "
+ lret=$((lret | $?))
+ fi
+
+ del_pppoe $LAN1 $LAN2 "$BRCL" "$BRCL"
+ lret=$((lret | $?))
+
+ for i in $LAN1 $LAN2; do
+ set_client "$i" q noaddress
+ done
+
+ if add_pppoe $LAN1 $LAN2 "$BRCL" "$BRCL" "unaware bridge, with pppoe-in-q encaps"; then
+ test_paths $LAN1 $LAN2 "unaware bridge, with pppoe-in-q encaps, "
+ lret=$((lret | $?))
+ fi
+
+ del_pppoe $LAN1 $LAN2 "$BRCL" "$BRCL"
+ lret=$((lret | $?))
+
+ # End pppoe
+ fi
+
+ for i in $LAN1 $LAN2; do
+ unset_client "$i"
+ done
+ return $lret
+}
+
+test_aware_bridge()
+{
+ local lret=0
+ local i
+
+ for i in $LAN1 $LAN2; do
+ bridge -net "$nsrt" vlan add dev "${vethrt[$i]}" vid $VID1 pvid untagged
+ set_client "$i" none
+ done
+ test_paths $LAN1 $LAN2 "aware bridge, without/without vlan encap,"
+ lret=$((lret | $?))
+
+ i=$LAN1
+ bridge -net "$nsrt" vlan del dev "${vethrt[$i]}" vid $VID1 pvid untagged
+ bridge -net "$nsrt" vlan add dev "${vethrt[$i]}" vid $VID1
+ set_client $i q
+
+ test_paths $LAN1 $LAN2 "aware bridge, with/without vlan encap, "
+ lret=$((lret | $?))
+
+ i=$LAN2
+ bridge -net "$nsrt" vlan del dev "${vethrt[$i]}" vid $VID1 pvid untagged
+ bridge -net "$nsrt" vlan add dev "${vethrt[$i]}" vid $VID1
+ set_client $i q
+
+ test_paths $LAN1 $LAN2 "aware bridge, with/with vlan encap, "
+ lret=$((lret | $?))
+
+ i=$LAN1
+ bridge -net "$nsrt" vlan del dev "${vethrt[$i]}" vid $VID1
+ bridge -net "$nsrt" vlan add dev "${vethrt[$i]}" vid $VID1 pvid untagged
+ set_client $i none
+
+ test_paths $LAN1 $LAN2 "aware bridge, without/with vlan encap, "
+ lret=$((lret | $?))
+
+ i=$LAN1
+ bridge -net "$nsrt" vlan del dev "${vethrt[$i]}" vid $VID1 pvid untagged
+ unset_client $i
+ i=$LAN2
+ bridge -net "$nsrt" vlan del dev "${vethrt[$i]}" vid $VID1
+ unset_client $i
+
+ return $lret
+}
+
+test_forward_without_vlandev()
+{
+ local wo=$1
+ local lret=0
+ local i
+
+ [[ "$wo" == "" ]] && wo="without"
+
+ for i in $LAN1 $LAN2; do
+ bridge -net "$nsrt" vlan add dev "${vethrt[$i]}" vid $VID1 pvid untagged
+ set_client "$i" none
+ done
+
+ test_paths $LAN1 $WAN "forward, $wo vlan-device, without vlan encap, client1,"
+ lret=$((lret | $?))
+ if [ -z "$lan_all_veth" ] || [ -n "$noskip" ]; then
+ test_paths $LAN2 $WAN "forward, $wo vlan-device, without vlan encap, client2,"
+ lret=$((lret | $?))
+ fi
+
+ for i in $LAN1 $LAN2; do
+ bridge -net "$nsrt" vlan del dev "${vethrt[$i]}" vid $VID1 pvid untagged
+ bridge -net "$nsrt" vlan add dev "${vethrt[$i]}" vid $VID1
+ set_client "$i" q
+ done
+
+ test_paths $LAN1 $WAN "forward, $wo vlan-device, with vlan encap, client1,"
+ lret=$((lret | $?))
+ if [ -z "$lan_all_veth" ] || [ -n "$noskip" ]; then
+ test_paths $LAN2 $WAN "forward, $wo vlan-device, with vlan encap, client2,"
+ lret=$((lret | $?))
+ fi
+
+ for i in $LAN1 $LAN2; do
+ bridge -net "$nsrt" vlan del dev "${vethrt[$i]}" vid $VID1
+ unset_client "$i"
+ done
+ return $lret
+}
+
+test_forward_with_vlandev()
+{
+ test_forward_without_vlandev "with"
+ return $?
+}
+
+ret=0
+### Start Initial Setup ###
+
+for i in 4 6; do
+ ip netns exec "$nsrt" sysctl -q net.ipv$i.conf.all.forwarding=1
+done
+
+### Use brwan to make sure software fastpath is ###
+### direct xmit in other direction also ###
+
+ip -net "$nsrt" link add $BRWAN type bridge
+ret=$((ret | $?))
+ip -net "$nsrt" link set $BRWAN up
+ret=$((ret | $?))
+if [ $ret -ne 0 ]; then
+ echo "SKIP: Can't create bridge"
+ exit "$ksft_skip"
+fi
+
+# If both lan clients are veth-devices, only test 1 in the forward path
+if [ -z "${vethcl[$LAN1]}" ] && [ -z "${vethcl[$LAN2]}" ]; then
+ lan_all_veth=1
+fi
+
+for i in $WAN $LAN1 $LAN2; do
+ ns="${nsa[$i]}"
+ if [ -z "${vethcl[$i]}" ]; then
+ vethcl[i]="veth${i}cl"
+ vethrt[i]="veth${i}rt"
+ ip link add "${vethcl[$i]}" netns "$ns" type veth \
+ peer name "${vethrt[$i]}" netns "$nsrt"
+ ret=$((ret | $?))
+ else # Use pair of interconnected hardware interfaces
+ ip link set "${vethrt[$i]}" netns "$nsrt"
+ ret=$((ret | $?))
+ ip link set "${vethcl[$i]}" netns "$ns"
+ ret=$((ret | $?))
+ fi
+done
+if [ $ret -ne 0 ]; then
+ echo "SKIP: (v)eth pairs cannot be used"
+ exit "$ksft_skip"
+fi
+
+if [ -n "$showtree" ]; then
+ cat <<-EOF
+ Setup:
+ CLIENT 0
+ ${vethcl[$WAN]}
+ |
+ ${vethrt[$WAN]}
+ WAN
+ ROUTER
+ LAN1 LAN2
+ $(printf "%14.14s" "${vethrt[$LAN1]}") ${vethrt[$LAN2]}
+ | |
+ $(printf "%14.14s" "${vethcl[$LAN1]}") ${vethcl[$LAN2]}
+ CLIENT 1 CLIENT 2
+
+ EOF
+fi
+
+for n in nsclientwan nsclientlan; do
+ routerside=""; clientside=""
+ for i in $WAN $LAN1 $LAN2; do
+ ns="${nsa[$i]}"
+ [[ "$ns" != "$n"* ]] && continue
+ mac=$(check_mac "$ns" "${vethcl[$i]}" "$routerside $clientside")
+ ret=$((ret | $?))
+ clientside+=" $mac"
+ mac=$(check_mac "$nsrt" "${vethrt[$i]}" "$clientside")
+ ret=$((ret | $?))
+ routerside+=" $mac"
+ done
+done
+if [ $ret -ne 0 ]; then
+ echo "SKIP: conflicting mac address"
+ exit "$ksft_skip"
+fi
+
+set_pair_link up $WAN $LAN1 $LAN2
+ret=$((ret | $?))
+if [ $ret -ne 0 ]; then
+ echo "SKIP: setting (v)eth pairs link up failed"
+ exit "$ksft_skip"
+fi
+
+i=$WAN
+ip -net "$nsrt" link set "${vethrt[$i]}" master $BRWAN
+set_client $i none
+add_addr $ADWAN "$BRWAN"
+
+family="bridge"
+if ! setup_nftables $LAN1 $LAN2 2>/dev/null; then
+ echo "INFO: Cannot add nftables table $family"
+ tests[1]=""; tests[2]=""
+fi
+family="inet"
+if ! setup_nftables $WAN $LAN1 2>/dev/null; then
+ echo "INFO: Cannot add nftables table $family"
+ tests[3]=""; tests[4]=""
+fi
+
+### End Initial Setup ###
+
+if [ -n "${tests[1]}" ]; then
+ # Setup brlan as vlan unaware bridge
+ family="bridge"
+ ip -net "$nsrt" link add $BRLAN type bridge
+ ip -net "$nsrt" link set $BRLAN up
+ for i in $LAN1 $LAN2; do
+ ip -net "$nsrt" link set "${vethrt[$i]}" master $BRLAN
+ done
+ test_unaware_bridge
+ ret=$((ret | $?))
+ ip -net "$nsrt" link del $BRLAN type bridge
+fi
+
+if [ -n "${tests[2]}" ] || [ -n "${tests[3]}" ] || [ -n "${tests[4]}" ]; then
+ # Setup brlan as vlan aware bridge
+ family="bridge"
+
+ ip -net "$nsrt" link add $BRLAN type bridge vlan_filtering 1 vlan_default_pvid 0
+ ip -net "$nsrt" link set $BRLAN up
+ bridge -net "$nsrt" vlan add dev $BRLAN vid $VID1 pvid untagged self
+ add_addr $ADLAN "$BRLAN"
+ for i in $LAN1 $LAN2; do
+ ip -net "$nsrt" link set "${vethrt[$i]}" master $BRLAN
+ done
+
+ if [ -n "${tests[2]}" ]; then
+ test_aware_bridge
+ ret=$((ret | $?))
+ fi
+
+ family="inet"
+
+ if [ -n "${tests[3]}" ]; then
+ test_forward_without_vlandev
+ ret=$((ret | $?))
+ fi
+
+ if [ -n "${tests[4]}" ]; then
+ # Setup vlan-device linked to brlan master port
+ del_addr $ADLAN "$BRLAN"
+ ip -net "$nsrt" link set $BRLAN down
+ bridge -net "$nsrt" vlan del dev $BRLAN vid $VID1 pvid untagged self
+ bridge -net "$nsrt" vlan add dev $BRLAN vid $VID1 self
+ ip -net "$nsrt" link add link $BRLAN name $BRLAN.$VID1 type vlan id $VID1
+ ip -net "$nsrt" link set $BRLAN up
+ ip -net "$nsrt" link set "$BRLAN.$VID1" up
+ add_addr $ADLAN "$BRLAN.$VID1"
+ test_forward_with_vlandev
+ ret=$((ret | $?))
+ fi
+
+ ip -net "$nsrt" link del $BRLAN type bridge
+fi
+
+### Finish tests ###
+
+ip -net "$nsrt" link del $BRWAN type bridge
+
+for i in $WAN $LAN1 $LAN2; do
+ unset_client "$i"
+done
+
+set_pair_link down $WAN $LAN1 $LAN2
+
+for i in $WAN $LAN1 $LAN2; do
+ ns="${nsa[$i]}"
+ if [[ "${vethcl[$i]:0:4}" != "veth" ]]; then
+ ip netns exec "$ns" ip link set "${vethcl[$i]}" netns 1
+ fi
+ if [[ "${vethrt[$i]:0:4}" != "veth" ]]; then
+ ip netns exec "$nsrt" ip link set "${vethrt[$i]}" netns 1
+ fi
+done
+
+if [ $ret -eq 0 ]; then
+ echo "PASS: all tests passed"
+else
+ echo "ERROR: bridge fastpath test has failed"
+fi
+
+exit $ret
--
2.50.0
When trying to build the latest BPF selftests, with a debug kernel
config, Pahole 1.30 and CLang 20.1.8 (and GCC 15.2), I got these errors:
progs/dynptr_success.c:579:9: error: call to undeclared function 'bpf_dynptr_slice'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
579 | data = bpf_dynptr_slice(&ptr, 0, NULL, 1);
| ^
progs/dynptr_success.c:579:9: note: did you mean 'bpf_dynptr_size'?
.virtme/build-debug-btf//tools/include/vmlinux.h:120280:14: note: 'bpf_dynptr_size' declared here
120280 | extern __u32 bpf_dynptr_size(const struct bpf_dynptr *p) __weak __ksym;
| ^
progs/dynptr_success.c:579:7: error: incompatible integer to pointer conversion assigning to '__u64 *' (aka 'unsigned long long *') from 'int' [-Wint-conversion]
579 | data = bpf_dynptr_slice(&ptr, 0, NULL, 1);
| ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
progs/dynptr_success.c:596:9: error: call to undeclared function 'bpf_dynptr_slice'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
596 | data = bpf_dynptr_slice(&ptr, 0, NULL, 10);
| ^
progs/dynptr_success.c:596:7: error: incompatible integer to pointer conversion assigning to 'char *' from 'int' [-Wint-conversion]
596 | data = bpf_dynptr_slice(&ptr, 0, NULL, 10);
| ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
I don't have these errors without the debug kernel config from
kernel/configs/debug.config. With the debug kernel, bpf_dynptr_slice()
is not declared in vmlinux.h. It is declared there without debug.config.
The fix is similar to what is done in dynptr_fail.c which is also using
bpf_dynptr_slice(): bpf_kfuncs.h is now included.
Signed-off-by: Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
---
Notes:
- This patch looks wrong, I guess bpf_dynptr_slice() should be in
vmlinux.h even with a "debug" kernel, but it is not:
$ grep -cw bpf_dynptr_slice .virtme/build-debug-btf/tools/include/vmlinux.h
0
$ grep -w bpf_dynptr_slice .virtme/build-btf/tools/include/vmlinux.h
extern void *bpf_dynptr_slice(...) __weak __ksym;
- This is on top of bpf/master: commit 63d2247e2e37, tag bpf-fixes.
- I only see this error when using kernel/configs/debug.config.
- Because this has not been spot by the BPF CI, I wonder if I'm
building the BPF selftests properly... Here is what I did:
$ virtme-configkernel --arch x86_64 --defconfig \
--custom tools/testing/selftests/net/mptcp/config \
--custom kernel/configs/debug.config \
--custom tools/testing/selftests/bpf/config \
O=${PWD}/.virtme/build-debug-btf
$ ./scripts/config --file ${PWD}/.virtme/build-debug-btf/.config \
-e NET_NS_REFCNT_TRACKER -d SLUB_DEBUG_ON \
-d DEBUG_KMEMLEAK_AUTO_SCAN -e PANIC_ON_OOPS \
-e SOFTLOCKUP_DETECTOR -e BOOTPARAM_SOFTLOCKUP_PANIC \
-e HARDLOCKUP_DETECTOR -e BOOTPARAM_HUNG_TASK_PANIC \
-e DETECT_HUNG_TASK -e BOOTPARAM_HUNG_TASK_PANIC -e DEBUG_INFO \
-e DEBUG_INFO_DWARF_TOOLCHAIN_DEFAULT -e GDB_SCRIPTS \
-e DEBUG_INFO_DWARF4 -e DEBUG_INFO_COMPRESSED \
-e DEBUG_INFO_COMPRESSED_ZLIB -e DEBUG_INFO_BTF_MODULES \
-e MODULE_ALLOW_BTF_MISMATCH -d IA32_EMULATION -e DYNAMIC_DEBUG \
--set-val CONSOLE_LOGLEVEL_DEFAULT 8 -e FTRACE -e FUNCTION_TRACER \
-e DYNAMIC_FTRACE -e FTRACE_SYSCALLS -e HIST_TRIGGERS -e DEBUG_NET \
-m KUNIT -e KUNIT_DEBUGFS -d KUNIT_ALL_TESTS -m MPTCP_KUNIT_TEST \
-e BPF_JIT -e BPF_SYSCALL -e TUN -e CRYPTO_USER_API_HASH \
-e CRYPTO_SHA1 -e NET_SCH_TBF -e BRIDGE -d RETPOLINE -d PCCARD \
-d MACINTOSH_DRIVERS -d SOUND -d USB_SUPPORT -d NEW_LEDS -d SCSI \
-d SURFACE_PLATFORMS -d DRM -d FB -d ATA -d MISC_FILESYSTEMS
# sorry, long list used by the MPTCP CI to accelerate builds, etc.
$ make O=${PWD}/.virtme/build-debug-btf olddefconfig
$ make O=${PWD}/.virtme/build-debug-btf -j$(nproc) -l$(nproc)
$ make O=${PWD}/.virtme/build-debug-btf headers_install \
INSTALL_HDR_PATH=${PWD}/.virtme/headers
$ make O=${PWD}/.virtme/build-debug-btf \
KHDR_INCLUDES=-I${PWD}/.virtme/headers/includes \
-C tools/testing/selftests/bpf
- The errors I got should be reproducible using:
$ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
-e INPUT_EXTRA_ENV=INPUT_RUN_TESTS_ONLY=bpftest_all \
--pull always mptcp/mptcp-upstream-virtme-docker:latest \
auto-btf-debug
- These issues were originally spot by our MPTCP CI:
https://github.com/multipath-tcp/mptcp_net-next/actions/runs/18222911614/jo…
- No errors without kernel/configs/debug.config on the CI and on my side
- This CI got different issues, and I had to declare more kfuncs there:
https://github.com/multipath-tcp/mptcp_net-next/commit/4435d4da9f4f
but this CI is currently on top of 'net', with Jiri's patches from
https://lore.kernel.org/20251001122223.170830-1-jolsa@kernel.org
- The builds have been done from a clean build directory each time.
- Do you think the issue is on my side? Dependences? How the selftests
are built? I didn't change the way the BPF selftests are built for a
while. I had other issues with pahole 1.29, but fixed with 1.30.
- Feel free to discard this patch for a better solution (if any).
- I don't know which Fixes tag adding, but I doubt this patch is valid.
---
tools/testing/selftests/bpf/progs/dynptr_success.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/tools/testing/selftests/bpf/progs/dynptr_success.c b/tools/testing/selftests/bpf/progs/dynptr_success.c
index 127dea342e5a67dda33e0a39e84d135206d2f3f1..60daf5ce8eb283d8c8bf2d7853eda6313df4fa87 100644
--- a/tools/testing/selftests/bpf/progs/dynptr_success.c
+++ b/tools/testing/selftests/bpf/progs/dynptr_success.c
@@ -6,6 +6,7 @@
#include <stdbool.h>
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>
+#include "bpf_kfuncs.h"
#include "bpf_misc.h"
#include "errno.h"
---
base-commit: 63d2247e2e37d9c589a0a26aa4e684f736a45e29
change-id: 20251003-bpf-sft-fix-build-err-6-18-6a4c032f680a
Best regards,
--
Matthieu Baerts (NGI0) <matttbe(a)kernel.org>
Problem
=======
When host APEI is unable to claim a synchronous external abort (SEA)
during guest abort, today KVM directly injects an asynchronous SError
into the VCPU then resumes it. The injected SError usually results in
unpleasant guest kernel panic.
One of the major situation of guest SEA is when VCPU consumes recoverable
uncorrected memory error (UER), which is not uncommon at all in modern
datacenter servers with large amounts of physical memory. Although SError
and guest panic is sufficient to stop the propagation of corrupted memory,
there is room to recover from an UER in a more graceful manner.
Proposed Solution
=================
The idea is, we can replay the SEA to the faulting VCPU. If the memory
error consumption or the fault that cause SEA is not from guest kernel,
the blast radius can be limited to the poison-consuming guest process,
while the VM can keep running.
In addition, instead of doing under the hood without involving userspace,
there are benefits to redirect the SEA to VMM:
- VM customers care about the disruptions caused by memory errors, and
VMM usually has the responsibility to start the process of notifying
the customers of memory error events in their VMs. For example some
cloud provider emits a critical log in their observability UI [1], and
provides a playbook for customers on how to mitigate disruptions to
their workloads.
- VMM can protect future memory error consumption by unmapping the poisoned
pages from stage-2 page table with KVM userfault [2], or by splitting the
memslot that contains the poisoned pages.
- VMM can keep track of SEA events in the VM. When VMM thinks the status
on the host or the VM is bad enough, e.g. number of distinct SEAs
exceeds a threshold, it can restart the VM on another healthy host.
- Behavior parity with x86 architecture. When machine check exception
(MCE) is caused by VCPU, kernel or KVM signals userspace SIGBUS to
let VMM either recover from the MCE, or terminate itself with VM.
The prior RFC proposes to implement SIGBUS on arm64 as well, but
Marc preferred KVM exit over signal [3]. However, implementation
aside, returning SEA to VMM is on par with returning MCE to VMM.
Once SEA is redirected to VMM, among other actions, VMM is encouraged
to inject external aborts into the faulting VCPU.
New UAPIs
=========
This patchset introduces following userspace-visible changes to empower
VMM to control what happens for SEA on guest memory:
- KVM_CAP_ARM_SEA_TO_USER. While taking SEA, if userspace has enabled
this new capability at VM creation, and the SEA is not owned by kernel
allocated memory, instead of injecting SError, return KVM_EXIT_ARM_SEA
to userspace.
- KVM_EXIT_ARM_SEA. This is the VM exit reason VMM gets. The details
about the SEA is provided in arm_sea as much as possible, including
sanitized ESR value at EL2, faulting guest virtual and physical
addresses if available.
* From v2 [4]:
- Rebased on "[PATCH] KVM: arm64: nv: Handle SEAs due to VNCR redirection" [5]
and kvmarm/next commit 7b8346bd9fce ("KVM: arm64: Don't attempt vLPI
mappings when vPE allocation is disabled")
- Took the host_owns_sea implementation from Oliver [6, 7].
- Excluded the guest SEA injection patches.
- Updated selftest.
* From v1 [8]:
- Rebased on commit 4d62121ce9b5 ("KVM: arm64: vgic-debug: Avoid
dereferencing NULL ITE pointer").
- Sanitize ESR_EL2 before reporting it to userspace.
- Do not do KVM_EXIT_ARM_SEA when SEA is caused by memory allocated to
stage-2 translation table.
[1] https://cloud.google.com/solutions/sap/docs/manage-host-errors
[2] https://lore.kernel.org/kvm/20250109204929.1106563-1-jthoughton@google.com
[3] https://lore.kernel.org/kvm/86pljbqqh0.wl-maz@kernel.org
[4] https://lore.kernel.org/kvm/20250604050902.3944054-1-jiaqiyan@google.com/
[5] https://lore.kernel.org/kvmarm/20250729182342.3281742-1-oliver.upton@linux.…
[6] https://lore.kernel.org/kvm/aHFohmTb9qR_JG1E@linux.dev/#t
[7] https://lore.kernel.org/kvm/aHK-DPufhLy5Dtuk@linux.dev/
[8] https://lore.kernel.org/kvm/20250505161412.1926643-1-jiaqiyan@google.com
Jiaqi Yan (3):
KVM: arm64: VM exit to userspace to handle SEA
KVM: selftests: Test for KVM_EXIT_ARM_SEA
Documentation: kvm: new UAPI for handling SEA
Documentation/virt/kvm/api.rst | 61 ++++
arch/arm64/include/asm/kvm_host.h | 2 +
arch/arm64/kvm/arm.c | 5 +
arch/arm64/kvm/mmu.c | 68 +++-
include/uapi/linux/kvm.h | 10 +
tools/arch/arm64/include/asm/esr.h | 2 +
tools/testing/selftests/kvm/Makefile.kvm | 1 +
.../testing/selftests/kvm/arm64/sea_to_user.c | 327 ++++++++++++++++++
tools/testing/selftests/kvm/lib/kvm_util.c | 1 +
9 files changed, 476 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/kvm/arm64/sea_to_user.c
--
2.50.1.565.gc32cd1483b-goog
Fix a memory leak in netpoll and introduce netconsole selftests that
expose the issue when running with kmemleak detection enabled.
This patchset includes a selftest for netpoll with multiple concurrent
users (netconsole + bonding), which simulates the scenario from test[1]
that originally demonstrated the issue allegedly fixed by commit
efa95b01da18 ("netpoll: fix use after free") - a commit that is now
being reverted.
Sending this to "net" branch because this is a fix, and the selftest
might help with the backports validation.
Link: https://lore.kernel.org/lkml/96b940137a50e5c387687bb4f57de8b0435a653f.14048… [1]
Signed-off-by: Breno Leitao <leitao(a)debian.org>
---
Changes in v6:
- Expand the tests even more and some small fixups
- Moved the test to bonding selftests
- Link to v5: https://lore.kernel.org/r/20250918-netconsole_torture-v5-0-77e25e0a4eb6@deb…
Changes in v5:
- Set CONFIG_BONDING=m in selftests/drivers/net/config.
- Link to v4: https://lore.kernel.org/r/20250917-netconsole_torture-v4-0-0a5b3b8f81ce@deb…
Changes in v4:
- Added an additional selftest to test multiple netpoll users in
parallel
- Link to v3: https://lore.kernel.org/r/20250905-netconsole_torture-v3-0-875c7febd316@deb…
Changes in v3:
- This patchset is a merge of the fix and the selftest together as
recommended by Jakub.
Changes in v2:
- Reuse the netconsole creation from lib_netcons.sh. Thus, refactoring
the create_dynamic_target() (Jakub)
- Move the "wait" to after all the messages has been sent.
- Link to v1: https://lore.kernel.org/r/20250902-netconsole_torture-v1-1-03c6066598e9@deb…
---
Breno Leitao (4):
net: netpoll: fix incorrect refcount handling causing incorrect cleanup
selftest: netcons: refactor target creation
selftest: netcons: create a torture test
selftest: netcons: add test for netconsole over bonded interfaces
net/core/netpoll.c | 7 +++++--
tools/testing/selftests/drivers/net/Makefile | 1 +
tools/testing/selftests/drivers/net/bonding/Makefile | 2 ++
tools/testing/selftests/drivers/net/bonding/config | 4 ++++
tools/testing/selftests/drivers/net/bonding/netcons_over_bonding.sh | 221 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
tools/testing/selftests/drivers/net/lib/sh/lib_netcons.sh | 189 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++------------------
tools/testing/selftests/drivers/net/netcons_torture.sh | 127 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
7 files changed, 531 insertions(+), 20 deletions(-)
---
base-commit: f1455695d2d99894b65db233877acac9a0e120b9
change-id: 20250902-netconsole_torture-8fc23f0aca99
Best regards,
--
Breno Leitao <leitao(a)debian.org>
This series backports 19 patches to update minmax.h in the 5.15.y branch,
aligning it with v6.17-rc7.
The ultimate goal is to synchronize all longterm branches so that they
include the full set of minmax.h changes (6.12.y and 6.6.y were already
backported by me and are now aligned, 6.1.y is in progress).
The key motivation is to bring in commit d03eba99f5bf ("minmax: allow
min()/max()/clamp() if the arguments have the same signedness"), which
is missing in kernel 5.10.y.
In mainline, this change enables min()/max()/clamp() to accept mixed
argument types, provided both have the same signedness. Without it,
backported patches that use these forms may trigger compiler warnings,
which escalate to build failures when -Werror is enabled.
Andy Shevchenko (1):
minmax: deduplicate __unconst_integer_typeof()
David Laight (8):
minmax: fix indentation of __cmp_once() and __clamp_once()
minmax.h: add whitespace around operators and after commas
minmax.h: update some comments
minmax.h: reduce the #define expansion of min(), max() and clamp()
minmax.h: use BUILD_BUG_ON_MSG() for the lo < hi test in clamp()
minmax.h: move all the clamp() definitions after the min/max() ones
minmax.h: simplify the variants of clamp()
minmax.h: remove some #defines that are only expanded once
Herve Codina (1):
minmax: Introduce {min,max}_array()
Linus Torvalds (8):
minmax: avoid overly complicated constant expressions in VM code
minmax: make generic MIN() and MAX() macros available everywhere
minmax: add a few more MIN_T/MAX_T users
minmax: simplify and clarify min_t()/max_t() implementation
minmax: simplify min()/max()/clamp() implementation
minmax: don't use max() in situations that want a C constant
expression
minmax: improve macro expansion and type checking
minmax: fix up min3() and max3() too
Matthew Wilcox (Oracle) (1):
minmax: add in_range() macro
arch/arm/mm/pageattr.c | 6 +-
arch/um/drivers/mconsole_user.c | 2 +
arch/x86/mm/pgtable.c | 2 +-
drivers/edac/sb_edac.c | 4 +-
drivers/edac/skx_common.h | 1 -
drivers/gpu/drm/amd/amdgpu/amdgpu.h | 2 +
.../drm/amd/display/modules/hdcp/hdcp_ddc.c | 2 +
.../drm/amd/pm/powerplay/hwmgr/ppevvmath.h | 14 +-
.../amd/pm/swsmu/smu11/sienna_cichlid_ppt.c | 2 +
.../drm/arm/display/include/malidp_utils.h | 2 +-
.../display/komeda/komeda_pipeline_state.c | 24 +-
drivers/gpu/drm/drm_color_mgmt.c | 2 +-
drivers/gpu/drm/msm/adreno/a6xx_gmu.c | 6 -
drivers/gpu/drm/radeon/evergreen_cs.c | 2 +
drivers/hwmon/adt7475.c | 24 +-
drivers/input/touchscreen/cyttsp4_core.c | 2 +-
drivers/irqchip/irq-sun6i-r.c | 2 +-
drivers/md/dm-integrity.c | 4 +-
drivers/media/dvb-frontends/stv0367_priv.h | 3 +
.../net/ethernet/chelsio/cxgb3/cxgb3_main.c | 18 +-
.../net/ethernet/stmicro/stmmac/stmmac_main.c | 2 +-
drivers/net/fjes/fjes_main.c | 4 +-
drivers/nfc/pn544/i2c.c | 2 -
drivers/platform/x86/sony-laptop.c | 1 -
drivers/scsi/isci/init.c | 6 +-
.../pci/hive_isp_css_include/math_support.h | 5 -
drivers/virt/acrn/ioreq.c | 4 +-
fs/btrfs/misc.h | 2 -
fs/btrfs/tree-checker.c | 2 +-
fs/ext2/balloc.c | 2 -
fs/ext4/ext4.h | 2 -
fs/ufs/util.h | 6 -
include/linux/compiler.h | 9 +
include/linux/minmax.h | 264 +++++++++++++-----
kernel/trace/preemptirq_delay_test.c | 2 -
lib/btree.c | 1 -
lib/decompress_unlzma.c | 2 +
lib/logic_pio.c | 3 -
lib/vsprintf.c | 2 +-
lib/zstd/zstd_internal.h | 2 -
mm/zsmalloc.c | 1 -
net/ipv4/proc.c | 2 +-
net/ipv6/proc.c | 2 +-
net/netfilter/nf_nat_core.c | 6 +-
net/tipc/core.h | 2 +-
net/tipc/link.c | 10 +-
tools/testing/selftests/vm/mremap_test.c | 2 +
47 files changed, 289 insertions(+), 183 deletions(-)
--
2.47.3
Add the benchmark testcase "kprobe-multi-all", which will hook all the
kernel functions during the testing.
This series is separated out from [1].
Changes since V1:
* introduce trace_blacklist instead of copy-pasting strcmp in the 2nd
patch
* use fprintf() instead of printf() in 3rd patch
Link: https://lore.kernel.org/bpf/20250817024607.296117-1-dongml2@chinatelecom.cn/ [1]
Menglong Dong (3):
selftests/bpf: move get_ksyms and get_addrs to trace_helpers.c
selftests/bpf: skip recursive functions for kprobe_multi
selftests/bpf: add benchmark testing for kprobe-multi-all
tools/testing/selftests/bpf/bench.c | 4 +
.../selftests/bpf/benchs/bench_trigger.c | 53 ++++
.../selftests/bpf/benchs/run_bench_trigger.sh | 4 +-
.../bpf/prog_tests/kprobe_multi_test.c | 220 +---------------
.../selftests/bpf/progs/trigger_bench.c | 12 +
tools/testing/selftests/bpf/trace_helpers.c | 234 ++++++++++++++++++
tools/testing/selftests/bpf/trace_helpers.h | 3 +
7 files changed, 311 insertions(+), 219 deletions(-)
--
2.51.0