On Tue, Aug 20, 2024 at 1:01 PM Mina Almasry almasrymina@google.com wrote:
On Mon, Aug 19, 2024 at 6:53 PM Jakub Kicinski kuba@kernel.org wrote:
On Mon, 19 Aug 2024 00:44:27 +0900 Taehee Yoo wrote:
@@ -9537,6 +9540,10 @@ static int dev_xdp_attach(struct net_device *dev, struct netlink_ext_ack *extack NL_SET_ERR_MSG(extack, "Native and generic XDP can't be active at the same time"); return -EEXIST; }
if (dev_get_max_mp_channel(dev) != -1) {
NL_SET_ERR_MSG(extack, "XDP can't be installed on a netdev using memory providers");
return -EINVAL;
}
Should we consider virtual interfaces like bonding, bridge, etc? Virtual interfaces as an upper interface of physical interfaces can still install XDP prog.
# ip link add bond0 type bond # ip link set eth0 master bond0 # ip link set bond0 xdp pin /sys/fs/bpf/x/y and # ip link set bond0 xdpgeneric pin /sys/fs/bpf/x/y
All virtual interfaces can install generic XDP prog. The bonding interface can install native XDP prog.
Good point. We may need some common helpers to place the checks for XDP. They are spread all over the place now.
Took a bit of a look here. Forgive me, I'm not that familiar with XDP and virtual interfaces, so I'm a bit unsure what to do here.
For veth, it seems, the device behind the veth is stored in veth_priv->peer, so it seems maybe a dev_get_max_mp_channel() check on veth_priv->peer is the way to go to disable this for veth? I think we need to do this check on creation of the veth and on the ndo_bpf of veth.
For bonding, it seems we need to add mp channel check in bond_xdp_set, and bond_enslave?
There are a few other drivers that define ndo_add_slave, seems a check in br_add_slave is needed as well.
This seems like a potentially deep rabbit hole with a few checks to add all of the place. Is this blocking the series? AFAICT if XDP fails with mp-bound queues with a benign error, that seems fine to me; I don't have a use case for memory providers + xdp yet. This should only be blocking if someone can repro a very serious error (kernel crash) or something with this combination.
I can try to add these checks locally and propose as a follow up series. Let me know if I'm on the right track with figuring out how to implement this, and, if you feel like it's blocking.
-- Thanks, Mina
I agree with the current approach, which uses the dev_get_min_mp_channel_count() in the dev_xdp_attach(). The only problem that I am concerned about is the dev_get_min_mp_channel_count() can't check lower interfaces. So, how about just making the current code to be able to check lower interfaces? Here is a rough modification and I tested it. it works well. Please look into this code.
diff --git a/net/core/dev.c b/net/core/dev.c index f6f40c682b83..87c7985cb242 100644 --- a/net/core/dev.c +++ b/net/core/dev.c @@ -6989,6 +6989,27 @@ static __latent_entropy void net_rx_action(struct softirq_action *h) bpf_net_ctx_clear(bpf_net_ctx); }
+static int __dev_get_min_mp_channel_count(struct net_device *dev, + struct netdev_nested_priv *priv) +{ + int i, max = 0; + + ASSERT_RTNL(); + + for (i = 0; i < dev->real_num_rx_queues; i++) + if (dev->_rx[i].mp_params.mp_priv) + /* The channel count is the idx plus 1. */ + max = i + 1; + + return max; +} + +u32 dev_get_min_mp_channel_count(const struct net_device *dev) +{ + return (u32)__dev_get_min_mp_channel_count((struct net_device *)dev, + NULL); +} + struct netdev_adjacent { struct net_device *dev; netdevice_tracker dev_tracker; @@ -9538,7 +9559,10 @@ static int dev_xdp_attach(struct net_device *dev, struct netlink_ext_ack *extack NL_SET_ERR_MSG(extack, "Native and generic XDP can't be active at the same time"); return -EEXIST; } - if (dev_get_min_mp_channel_count(dev)) { + + if (netdev_walk_all_lower_dev(dev, + __dev_get_min_mp_channel_count, + NULL)) { NL_SET_ERR_MSG(extack, "XDP can't be installed on a netdev using memory providers"); return -EINVAL; } @@ -9826,20 +9850,6 @@ int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack, return err; }
-u32 dev_get_min_mp_channel_count(const struct net_device *dev) -{ - u32 i, max = 0; - - ASSERT_RTNL(); - - for (i = 0; i < dev->real_num_rx_queues; i++) - if (dev->_rx[i].mp_params.mp_priv) - /* The channel count is the idx plus 1. */ - max = i + 1; - - return max; -} - /** * dev_index_reserve() - allocate an ifindex in a namespace * @net: the applicable net namespace
How to test: ip link add bond2 type bond ip link add bond1 master bond2 type bond ip link add bond0 master bond1 type bond ip link set eth0 master bond0 ip link set eth0 up ip link set bond0 up ip link set bond1 up ip link set bond2 up
ip link set bond2 xdp pin /sys/fs/bpf/x/y
./ncdevmem -s 192.168.1.4 -c 192.168.1.2 -f eth0 -l -p 5201 -v 7 -t 0 -q 1
# bond2 <-- xdp should not be installed. # | # bond1 <-- xdp should not be installed. # | # bond0 <-- xdp should not be installed. # | # eth0 <--memory provider is used.
The netdev_walk_all_lower_dev() calls the callback function (__dev_get_min_mp_channel_count) while it walks its own all lower interfaces recursively. If we want to check more conditions, we can just add checks in __dev_get_min_mp_channel_count() or change the callback function.
Note that currently dev_xdp_attach() checks upper interfaces with netdev_for_each_upper_dev_rcu() but it doesn't work recursively. I think It should be fixed to check upper interfaces recursively in a separate patch.
Thanks a lot! Taehee Yoo