When a nexthop is added, without a gw address, the default scope was set to 'host'. Thus, when a source address is selected, 127.0.0.1 may be chosen but rejected when the route is used.
When using a route without a nexthop id, the scope can be configured in the route, thus the problem doesn't exist.
To explain more deeply: when a user creates a nexthop, it cannot specify the scope. To create it, the function nh_create_ipv4() calls fib_check_nh() with scope set to 0. fib_check_nh() calls fib_check_nh_nongw() wich was setting scope to 'host'. Then, nh_create_ipv4() calls fib_info_update_nhc_saddr() with scope set to 'host'. The src addr is chosen before the route is inserted.
When a 'standard' route (ie without a reference to a nexthop) is added, fib_create_info() calls fib_info_update_nhc_saddr() with the scope set by the user. iproute2 set the scope to 'link' by default.
Here is a way to reproduce the problem: ip netns add foo ip -n foo link set lo up ip netns add bar ip -n bar link set lo up sleep 1
ip -n foo link add name eth0 type dummy ip -n foo link set eth0 up ip -n foo address add 192.168.0.1/24 dev eth0
ip -n foo link add name veth0 type veth peer name veth1 netns bar ip -n foo link set veth0 up ip -n bar link set veth1 up
ip -n bar address add 192.168.1.1/32 dev veth1 ip -n bar route add default dev veth1
ip -n foo nexthop add id 1 dev veth0 ip -n foo route add 192.168.1.1 nhid 1
Try to get/use the route:
$ ip -n foo route get 192.168.1.1 RTNETLINK answers: Invalid argument $ ip netns exec foo ping -c1 192.168.1.1 ping: connect: Invalid argument
Try without nexthop group (iproute2 sets scope to 'link' by dflt): ip -n foo route del 192.168.1.1 ip -n foo route add 192.168.1.1 dev veth0
Try to get/use the route:
$ ip -n foo route get 192.168.1.1 192.168.1.1 dev veth0 src 192.168.0.1 uid 0 cache $ ip netns exec foo ping -c1 192.168.1.1 PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data. 64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.039 ms
--- 192.168.1.1 ping statistics --- 1 packets transmitted, 1 received, 0% packet loss, time 0ms rtt min/avg/max/mdev = 0.039/0.039/0.039/0.000 ms
CC: stable@vger.kernel.org Fixes: 597cfe4fc339 ("nexthop: Add support for IPv4 nexthops") Reported-by: Edwin Brossette edwin.brossette@6wind.com Signed-off-by: Nicolas Dichtel nicolas.dichtel@6wind.com ---
v2 -> v3: - no change
v1 -> v2: - remove useless arp off / fixed mac settings in the description
net/ipv4/fib_semantics.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c index a57ba23571c9..20177ecf5bdd 100644 --- a/net/ipv4/fib_semantics.c +++ b/net/ipv4/fib_semantics.c @@ -1230,7 +1230,7 @@ static int fib_check_nh_nongw(struct net *net, struct fib_nh *nh,
nh->fib_nh_dev = in_dev->dev; dev_hold_track(nh->fib_nh_dev, &nh->fib_nh_dev_tracker, GFP_ATOMIC); - nh->fib_nh_scope = RT_SCOPE_HOST; + nh->fib_nh_scope = RT_SCOPE_LINK; if (!netif_carrier_ok(nh->fib_nh_dev)) nh->fib_nh_flags |= RTNH_F_LINKDOWN; err = 0;
This test implement the scenario described in the commit "ip: fix dflt addr selection for connected nexthop". The test configures a nexthop object with an output device only (no gateway address) and a route that uses this nexthop. The goal is to check if the kernel selects a valid source address.
Link: https://lore.kernel.org/netdev/20220712095545.10947-1-nicolas.dichtel@6wind.... Signed-off-by: Nicolas Dichtel nicolas.dichtel@6wind.com ---
v2 -> v3: - add more details in the commit log
v1 -> v2: - add linux-kselftest@vger.kernel.org - clean trailing whitespaces - add a 'trap' on exit - remove useless sleep - remove useless arp off / fixed mac settings
tools/testing/selftests/net/Makefile | 2 +- .../selftests/net/fib_nexthop_nongw.sh | 119 ++++++++++++++++++ 2 files changed, 120 insertions(+), 1 deletion(-) create mode 100755 tools/testing/selftests/net/fib_nexthop_nongw.sh
diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile index ddad703ace34..db05b3764b77 100644 --- a/tools/testing/selftests/net/Makefile +++ b/tools/testing/selftests/net/Makefile @@ -11,7 +11,7 @@ TEST_PROGS += udpgso_bench.sh fib_rule_tests.sh msg_zerocopy.sh psock_snd.sh TEST_PROGS += udpgro_bench.sh udpgro.sh test_vxlan_under_vrf.sh reuseport_addr_any.sh TEST_PROGS += test_vxlan_fdb_changelink.sh so_txtime.sh ipv6_flowlabel.sh TEST_PROGS += tcp_fastopen_backup_key.sh fcnal-test.sh l2tp.sh traceroute.sh -TEST_PROGS += fin_ack_lat.sh fib_nexthop_multiprefix.sh fib_nexthops.sh +TEST_PROGS += fin_ack_lat.sh fib_nexthop_multiprefix.sh fib_nexthops.sh fib_nexthop_nongw.sh TEST_PROGS += altnames.sh icmp.sh icmp_redirect.sh ip6_gre_headroom.sh TEST_PROGS += route_localnet.sh TEST_PROGS += reuseaddr_ports_exhausted.sh diff --git a/tools/testing/selftests/net/fib_nexthop_nongw.sh b/tools/testing/selftests/net/fib_nexthop_nongw.sh new file mode 100755 index 000000000000..b7b928b38ce4 --- /dev/null +++ b/tools/testing/selftests/net/fib_nexthop_nongw.sh @@ -0,0 +1,119 @@ +#!/bin/bash +# SPDX-License-Identifier: GPL-2.0 +# +# ns: h1 | ns: h2 +# 192.168.0.1/24 | +# eth0 | +# | 192.168.1.1/32 +# veth0 <---|---> veth1 +# Validate source address selection for route without gateway + +PAUSE_ON_FAIL=no +VERBOSE=0 +ret=0 + +################################################################################ +# helpers + +log_test() +{ + local rc=$1 + local expected=$2 + local msg="$3" + + if [ ${rc} -eq ${expected} ]; then + printf "TEST: %-60s [ OK ]\n" "${msg}" + nsuccess=$((nsuccess+1)) + else + ret=1 + nfail=$((nfail+1)) + printf "TEST: %-60s [FAIL]\n" "${msg}" + if [ "${PAUSE_ON_FAIL}" = "yes" ]; then + echo + echo "hit enter to continue, 'q' to quit" + read a + [ "$a" = "q" ] && exit 1 + fi + fi + + [ "$VERBOSE" = "1" ] && echo +} + +run_cmd() +{ + local cmd="$*" + local out + local rc + + if [ "$VERBOSE" = "1" ]; then + echo "COMMAND: $cmd" + fi + + out=$(eval $cmd 2>&1) + rc=$? + if [ "$VERBOSE" = "1" -a -n "$out" ]; then + echo "$out" + fi + + [ "$VERBOSE" = "1" ] && echo + + return $rc +} + +################################################################################ +# config +setup() +{ + ip netns add h1 + ip -n h1 link set lo up + ip netns add h2 + ip -n h2 link set lo up + + # Add a fake eth0 to support an ip address + ip -n h1 link add name eth0 type dummy + ip -n h1 link set eth0 up + ip -n h1 address add 192.168.0.1/24 dev eth0 + + # Configure veths (same @mac, arp off) + ip -n h1 link add name veth0 type veth peer name veth1 netns h2 + ip -n h1 link set veth0 up + + ip -n h2 link set veth1 up + + # Configure @IP in the peer netns + ip -n h2 address add 192.168.1.1/32 dev veth1 + ip -n h2 route add default dev veth1 + + # Add a nexthop without @gw and use it in a route + ip -n h1 nexthop add id 1 dev veth0 + ip -n h1 route add 192.168.1.1 nhid 1 +} + +cleanup() +{ + ip netns del h1 2>/dev/null + ip netns del h2 2>/dev/null +} + +trap cleanup EXIT + +################################################################################ +# main + +while getopts :pv o +do + case $o in + p) PAUSE_ON_FAIL=yes;; + v) VERBOSE=1;; + esac +done + +cleanup +setup + +run_cmd ip -netns h1 route get 192.168.1.1 +log_test $? 0 "nexthop: get route with nexthop without gw" +run_cmd ip netns exec h1 ping -c1 192.168.1.1 +log_test $? 0 "nexthop: ping through nexthop without gw" + +exit $ret
Hello:
This series was applied to netdev/net.git (master) by Paolo Abeni pabeni@redhat.com:
On Wed, 13 Jul 2022 13:48:52 +0200 you wrote:
When a nexthop is added, without a gw address, the default scope was set to 'host'. Thus, when a source address is selected, 127.0.0.1 may be chosen but rejected when the route is used.
When using a route without a nexthop id, the scope can be configured in the route, thus the problem doesn't exist.
[...]
Here is the summary with links: - [net,v3,1/2] ip: fix dflt addr selection for connected nexthop https://git.kernel.org/netdev/net/c/747c14307214 - [net,v3,2/2] selftests/net: test nexthop without gw https://git.kernel.org/netdev/net/c/cd72e61bad14
You are awesome, thank you!
linux-kselftest-mirror@lists.linaro.org