-----Original Message----- From: Shradha Gupta shradhagupta@linux.microsoft.com Sent: Wednesday, January 31, 2024 2:54 AM To: Dexuan Cui decui@microsoft.com Cc: KY Srinivasan kys@microsoft.com; Haiyang Zhang haiyangz@microsoft.com; Wei Liu wei.liu@kernel.org; David S. Miller davem@davemloft.net; Eric Dumazet edumazet@google.com; Jakub Kicinski kuba@kernel.org; Paolo Abeni pabeni@redhat.com; Wojciech Drewek wojciech.drewek@intel.com; linux-hyperv@vger.kernel.org; netdev@vger.kernel.org; linux-kernel@vger.kernel.org; Shradha Gupta shradhagupta@microsoft.com; stable@vger.kernel.org Subject: Re: [PATCH] hv_netvsc:Register VF in netvsc_probe if NET_DEVICE_REGISTER missed
On Tue, Jan 30, 2024 at 08:13:21PM +0000, Dexuan Cui wrote:
From: Shradha Gupta shradhagupta@linux.microsoft.com Sent: Monday, January 29, 2024 11:19 PM [...] If hv_netvsc driver is removed and reloaded, the NET_DEVICE_REGISTER
s/removed/unloaded/ unloaded looks more accurate to me :-)
[...] Tested-on: Ubuntu22 Testcases: LISA testsuites verify_reload_hyperv_modules, perf_tcp_ntttcp_sriov
IMO the 3 lines can be removed: this bug is not specific to Ubuntu, and
the
test case names don't provide extra value to help understand the issue here and they might cause more questions unnecessarily, e.g. what's
LISA,
what exactly do the test cases do.
+/* Macros to define the context of vf registration */ +#define VF_REG_IN_PROBE 1 +#define VF_REG_IN_RECV_CBACK 2
I think VF_REG_IN_NOTIFIER is a better name? RECV_CBALL looks inaccurate to me.
@@ -2205,8 +2209,11 @@ static int netvsc_vf_join(struct net_device *vf_netdev, ndev->name, ret); goto upper_link_failed; }
- schedule_delayed_work(&ndev_ctx->vf_takeover,
VF_TAKEOVER_INT);
- /* If this registration is called from probe context vf_takeover
* is taken care of later in probe itself.
I suspect "later in probe itself" is not accurate. If 'context' is VF_REG_IN_PROBE, I suppose what happens here is: after netvsc_probe() finishes, the netvsc interface becomes available, so the user space will ifup it, and netvsc_open() will UP the VF
interface,
and netvsc_netdev_event() is called for the VF with event == NETDEV_POST_INIT (?) and NETDEV_CHANGE, and the data path is switched to the VF.
If my understanding is correct, I think in the case of 'context' == VF_REG_IN_PROBE, I suspect the "Align MTU of VF with master" and the "sync address list from ndev to VF" in __netvsc_vf_setup() are omitted? If so, should this be fixed? e.g. Not sure if the below is an
issue or not:
- a VF is bound to a NetVSC interface, and a user sets the MTUs to
- rmmod hv_netvsc
- modprobe hv_netvsc
- the netvsc interface uses MTU=1500 (the default), and the VF still
uses 1024.
@@ -2597,6 +2604,34 @@ static int netvsc_probe(struct hv_device *dev, }
list_add(&net_device_ctx->list, &netvsc_dev_list);
- /* When the hv_netvsc driver is removed and readded, the
s/removed and readded/unloaded and reloaded/
* NET_DEVICE_REGISTER for the vf device is replayed before
probe
* is complete. This is because register_netdevice_notifier() gets
* registered before vmbus_driver_register() so that callback func
* is set before probe and we don't miss events like
NETDEV_POST_INIT
* So, in this section we try to register each matching
Looks like there are spaces at the end of the line. We can move up a
few words
from the next line :-)
s/each matching/the matching/ A NetVSC interface has only 1 matching VF device.
* vf device that is present as a netdevice, knowing that it's
register
s/it's/its/
* call is not processed in the netvsc_netdev_notifier(as probing
is
* progress and get_netvsc_byslot fails).
*/
- for_each_netdev(dev_net(net), vf_netdev) {
if (vf_netdev->netdev_ops == &device_ops)
continue;
if (vf_netdev->type != ARPHRD_ETHER)
continue;
if (is_vlan_dev(vf_netdev))
continue;
if (netif_is_bond_master(vf_netdev))
continue;
The code above is duplicated from netvsc_netdev_event(). Can we add a new helper function is_matching_vf() to avoid the
duplication? Sure, I will do that. Thanks
netvsc_prepare_bonding(vf_netdev);
netvsc_register_vf(vf_netdev, VF_REG_IN_PROBE);
__netvsc_vf_setup(net, vf_netdev);
add a "break;' ?
With MANA devices and multiport support there, the individual ports are also net_devices. Wouldn't this be needed for such scenario(where we have multiple mana port net devices) to register them all?
Each device has separate probe() call, so only one VF will match in one netvsc_probe().
netvsc_prepare_bonding() & netvsc_register_vf() have get_netvsc_byslot(vf_netdev), but __netvsc_vf_setup() doesn't have. So, in case of multi-Vfs, this code will run "this" netvsc NIC with multiple VFs by __netvsc_vf_setup() which isn't correct.
You need to add the following lines before netvsc_prepare_bonding(vf_netdev) in netvsc_probe() to skip non-matching VFs:
if (net != get_netvsc_byslot(vf_netdev)) continue;
Thanks, - Haiyang