From: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
BugLink: https://bugs.launchpad.net/bugs/1988809
__mkroute_input() uses fib_validate_source() to trigger an icmp redirect.
My understanding is that fib_validate_source() is used to know if the src
address and the gateway address are on the same link. For that,
fib_validate_source() returns 1 (same link) or 0 (not the same network).
__mkroute_input() is the only user of these positive values, all other
callers only look if the returned value is negative.
Since the below patch, fib_validate_source() didn't return anymore 1 when
both addresses are on the same network, because the route lookup returns
RT_SCOPE_LINK instead of RT_SCOPE_HOST. But this is, in fact, right.
Let's adapat the test to return 1 again when both addresses are on the same
link.
CC: stable(a)vger.kernel.org
Fixes: 747c14307214 ("ip: fix dflt addr selection for connected nexthop")
Reported-by: kernel test robot <yujie.liu(a)intel.com>
Reported-by: Heng Qi <hengqi(a)linux.alibaba.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Reviewed-by: David Ahern <dsahern(a)kernel.org>
Link: https://lore.kernel.org/r/20220829100121.3821-1-nicolas.dichtel@6wind.com
Signed-off-by: Jakub Kicinski <kuba(a)kernel.org>
(cherry-picked from commit eb55dc09b5dd040232d5de32812cc83001a23da6)
Signed-off-by: Luke Nowakowski-Krijger <luke.nowakowskikrijger(a)canonical.com>
---
net/ipv4/fib_frontend.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/fib_frontend.c b/net/ipv4/fib_frontend.c
index ef3e7a3e3a29e..d38c8ca93ba09 100644
--- a/net/ipv4/fib_frontend.c
+++ b/net/ipv4/fib_frontend.c
@@ -399,7 +399,7 @@ static int __fib_validate_source(struct sk_buff *skb, __be32 src, __be32 dst,
dev_match = dev_match || (res.type == RTN_LOCAL &&
dev == net->loopback_dev);
if (dev_match) {
- ret = FIB_RES_NHC(res)->nhc_scope >= RT_SCOPE_HOST;
+ ret = FIB_RES_NHC(res)->nhc_scope >= RT_SCOPE_LINK;
return ret;
}
if (no_addr)
@@ -411,7 +411,7 @@ static int __fib_validate_source(struct sk_buff *skb, __be32 src, __be32 dst,
ret = 0;
if (fib_lookup(net, &fl4, &res, FIB_LOOKUP_IGNORE_LINKSTATE) == 0) {
if (res.type == RTN_UNICAST)
- ret = FIB_RES_NHC(res)->nhc_scope >= RT_SCOPE_HOST;
+ ret = FIB_RES_NHC(res)->nhc_scope >= RT_SCOPE_LINK;
}
return ret;
--
2.34.1
From: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
BugLink: https://bugs.launchpad.net/bugs/1988809
When a nexthop is added, without a gw address, the default scope was set
to 'host'. Thus, when a source address is selected, 127.0.0.1 may be chosen
but rejected when the route is used.
When using a route without a nexthop id, the scope can be configured in the
route, thus the problem doesn't exist.
To explain more deeply: when a user creates a nexthop, it cannot specify
the scope. To create it, the function nh_create_ipv4() calls fib_check_nh()
with scope set to 0. fib_check_nh() calls fib_check_nh_nongw() wich was
setting scope to 'host'. Then, nh_create_ipv4() calls
fib_info_update_nhc_saddr() with scope set to 'host'. The src addr is
chosen before the route is inserted.
When a 'standard' route (ie without a reference to a nexthop) is added,
fib_create_info() calls fib_info_update_nhc_saddr() with the scope set by
the user. iproute2 set the scope to 'link' by default.
Here is a way to reproduce the problem:
ip netns add foo
ip -n foo link set lo up
ip netns add bar
ip -n bar link set lo up
sleep 1
ip -n foo link add name eth0 type dummy
ip -n foo link set eth0 up
ip -n foo address add 192.168.0.1/24 dev eth0
ip -n foo link add name veth0 type veth peer name veth1 netns bar
ip -n foo link set veth0 up
ip -n bar link set veth1 up
ip -n bar address add 192.168.1.1/32 dev veth1
ip -n bar route add default dev veth1
ip -n foo nexthop add id 1 dev veth0
ip -n foo route add 192.168.1.1 nhid 1
Try to get/use the route:
> $ ip -n foo route get 192.168.1.1
> RTNETLINK answers: Invalid argument
> $ ip netns exec foo ping -c1 192.168.1.1
> ping: connect: Invalid argument
Try without nexthop group (iproute2 sets scope to 'link' by dflt):
ip -n foo route del 192.168.1.1
ip -n foo route add 192.168.1.1 dev veth0
Try to get/use the route:
> $ ip -n foo route get 192.168.1.1
> 192.168.1.1 dev veth0 src 192.168.0.1 uid 0
> cache
> $ ip netns exec foo ping -c1 192.168.1.1
> PING 192.168.1.1 (192.168.1.1) 56(84) bytes of data.
> 64 bytes from 192.168.1.1: icmp_seq=1 ttl=64 time=0.039 ms
>
> --- 192.168.1.1 ping statistics ---
> 1 packets transmitted, 1 received, 0% packet loss, time 0ms
> rtt min/avg/max/mdev = 0.039/0.039/0.039/0.000 ms
CC: stable(a)vger.kernel.org
Fixes: 597cfe4fc339 ("nexthop: Add support for IPv4 nexthops")
Reported-by: Edwin Brossette <edwin.brossette(a)6wind.com>
Signed-off-by: Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
Link: https://lore.kernel.org/r/20220713114853.29406-1-nicolas.dichtel@6wind.com
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
(backported from commit 747c14307214b55dbd8250e1ab44cad8305756f1)
[lukenow: use dev_hold instead of dev_hold_track as dev_hold_track
requires some debugging infastructure that we don't want to backport]
Signed-off-by: Luke Nowakowski-Krijger <luke.nowakowskikrijger(a)canonical.com>
---
net/ipv4/fib_semantics.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv4/fib_semantics.c b/net/ipv4/fib_semantics.c
index f99ad4a98907d..16fe034615635 100644
--- a/net/ipv4/fib_semantics.c
+++ b/net/ipv4/fib_semantics.c
@@ -1217,7 +1217,7 @@ static int fib_check_nh_nongw(struct net *net, struct fib_nh *nh,
nh->fib_nh_dev = in_dev->dev;
dev_hold(nh->fib_nh_dev);
- nh->fib_nh_scope = RT_SCOPE_HOST;
+ nh->fib_nh_scope = RT_SCOPE_LINK;
if (!netif_carrier_ok(nh->fib_nh_dev))
nh->fib_nh_flags |= RTNH_F_LINKDOWN;
err = 0;
--
2.34.1
commit 25e9fbf0fd38868a429feabc38abebfc6dbf6542 upstream.
Both __device_attach_driver() and __driver_attach() check the return
code of the bus_type.match() function to see if the device needs to be
added to the deferred probe list. After adding the device to the list,
the logic attempts to bind the device to the driver anyway, as if the
device had matched with the driver, which is not correct.
If __device_attach_driver() detects that the device in question is not
ready to match with a driver on the bus, then it doesn't make sense for
the device to attempt to bind with the current driver or continue
attempting to match with any of the other drivers on the bus. So, update
the logic in __device_attach_driver() to reflect this.
If __driver_attach() detects that a driver tried to match with a device
that is not ready to match yet, then the driver should not attempt to bind
with the device. However, the driver can still attempt to match and bind
with other devices on the bus, as drivers can be bound to multiple
devices. So, update the logic in __driver_attach() to reflect this.
Fixes: 656b8035b0ee ("ARM: 8524/1: driver cohandle -EPROBE_DEFER from bus_type.match()")
Cc: stable(a)vger.kernel.org
Cc: Saravana Kannan <saravanak(a)google.com>
Reported-by: Guenter Roeck <linux(a)roeck-us.net>
Tested-by: Guenter Roeck <linux(a)roeck-us.net>
Tested-by: Linus Walleij <linus.walleij(a)linaro.org>
Reviewed-by: Saravana Kannan <saravanak(a)google.com>
Signed-off-by: Isaac J. Manjarres <isaacmanjarres(a)google.com>
---
drivers/base/dd.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index ff59a1851cb4..faaa0440b294 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -590,6 +590,11 @@ static int __device_attach_driver(struct device_driver *drv, void *_data)
} else if (ret == -EPROBE_DEFER) {
dev_dbg(dev, "Device match requests probe deferral\n");
driver_deferred_probe_add(dev);
+ /*
+ * Device can't match with a driver right now, so don't attempt
+ * to match or bind with other drivers on the bus.
+ */
+ return ret;
} else if (ret < 0) {
dev_dbg(dev, "Bus failed to match device: %d", ret);
return ret;
@@ -732,6 +737,11 @@ static int __driver_attach(struct device *dev, void *data)
} else if (ret == -EPROBE_DEFER) {
dev_dbg(dev, "Device match requests probe deferral\n");
driver_deferred_probe_add(dev);
+ /*
+ * Driver could not match with device, but may match with
+ * another device on the bus.
+ */
+ return 0;
} else if (ret < 0) {
dev_dbg(dev, "Bus failed to match device: %d", ret);
return ret;
--
2.37.2.789.g6183377224-goog
commit 25e9fbf0fd38868a429feabc38abebfc6dbf6542 upstream.
Both __device_attach_driver() and __driver_attach() check the return
code of the bus_type.match() function to see if the device needs to be
added to the deferred probe list. After adding the device to the list,
the logic attempts to bind the device to the driver anyway, as if the
device had matched with the driver, which is not correct.
If __device_attach_driver() detects that the device in question is not
ready to match with a driver on the bus, then it doesn't make sense for
the device to attempt to bind with the current driver or continue
attempting to match with any of the other drivers on the bus. So, update
the logic in __device_attach_driver() to reflect this.
If __driver_attach() detects that a driver tried to match with a device
that is not ready to match yet, then the driver should not attempt to bind
with the device. However, the driver can still attempt to match and bind
with other devices on the bus, as drivers can be bound to multiple
devices. So, update the logic in __driver_attach() to reflect this.
Fixes: 656b8035b0ee ("ARM: 8524/1: driver cohandle -EPROBE_DEFER from bus_type.match()")
Cc: stable(a)vger.kernel.org
Cc: Saravana Kannan <saravanak(a)google.com>
Reported-by: Guenter Roeck <linux(a)roeck-us.net>
Tested-by: Guenter Roeck <linux(a)roeck-us.net>
Tested-by: Linus Walleij <linus.walleij(a)linaro.org>
Reviewed-by: Saravana Kannan <saravanak(a)google.com>
Signed-off-by: Isaac J. Manjarres <isaacmanjarres(a)google.com>
---
drivers/base/dd.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index 26ba7a99b7d5..63390a416b44 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -738,6 +738,11 @@ static int __device_attach_driver(struct device_driver *drv, void *_data)
} else if (ret == -EPROBE_DEFER) {
dev_dbg(dev, "Device match requests probe deferral\n");
driver_deferred_probe_add(dev);
+ /*
+ * Device can't match with a driver right now, so don't attempt
+ * to match or bind with other drivers on the bus.
+ */
+ return ret;
} else if (ret < 0) {
dev_dbg(dev, "Bus failed to match device: %d", ret);
return ret;
@@ -891,6 +896,11 @@ static int __driver_attach(struct device *dev, void *data)
} else if (ret == -EPROBE_DEFER) {
dev_dbg(dev, "Device match requests probe deferral\n");
driver_deferred_probe_add(dev);
+ /*
+ * Driver could not match with device, but may match with
+ * another device on the bus.
+ */
+ return 0;
} else if (ret < 0) {
dev_dbg(dev, "Bus failed to match device: %d", ret);
return ret;
--
2.37.2.789.g6183377224-goog