On Fri, 20 Sept 2024 at 00:56, Ido Schimmel <idosch@nvidia.com> wrote:
Hi,
Thanks for the patch and sorry for the late reply (was OOO).
On Mon, Sep 16, 2024 at 07:49:05PM +1000, Jamie Bainbridge wrote:
Running this test on a small system produces different failures on every run, in the tests checking deletions and in some of the flush tests. From different test runs:
TEST: Common host entries configuration tests (L2) [FAIL]
        Failed to delete L2 host entry
TEST: Common port group entries configuration tests (IPv4 (S, G)) [FAIL]
        IPv4 (S, G) entry with VLAN 10 not deleted when VLAN was not specified
TEST: Common port group entries configuration tests (IPv6 (*, G)) [FAIL]
        IPv6 (*, G) entry with VLAN 10 not deleted when VLAN was not specified
TEST: Flush tests [FAIL]
        Entry not flushed by specified VLAN ID
TEST: Flush tests [FAIL]
        IPv6 host entry not flushed by "nopermanent" state
Add a short sleep after deletion and flush to resolve this.
The port group entry is removed from the MDB entry's list synchronously, but the MDB entry itself is removed from the hash table asynchronously, and the MDB get query only returns an error if no entry is found there.
IOW, I think that when you do get a response after deletion, the entry you get is empty.
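As a rough userspace illustration of that state (not kernel code: the toy_ struct and function names are made up for this sketch, and it only models the lookup result, not the locking or the deferred removal), an entry that is still present in the table but has no ports and no host membership passes a plain NULL check, while a check that also looks at the two fields treats it as not found:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for an MDB entry. */
struct toy_mdb_entry {
	void *ports;      /* head of the port group list, NULL when empty */
	bool host_joined; /* true while the bridge itself is a member */
};

/* Only a missing entry counts as "not found". */
static bool toy_found_null_check(const struct toy_mdb_entry *mp)
{
	return mp != NULL;
}

/* An entry with no ports and no host membership also counts as "not found". */
static bool toy_found_empty_check(const struct toy_mdb_entry *mp)
{
	return !(!mp || (!mp->ports && !mp->host_joined));
}

/* Walk through the interesting cases. */
static void toy_self_check(void)
{
	int dummy_port = 0;
	struct toy_mdb_entry stale = { .ports = NULL, .host_joined = false };
	struct toy_mdb_entry live = { .ports = &dummy_port, .host_joined = false };

	/* Emptied entry still sitting in the table: the two checks disagree. */
	assert(toy_found_null_check(&stale));
	assert(!toy_found_empty_check(&stale));

	/* Entry that still has a port: both checks report it as found. */
	assert(toy_found_null_check(&live));
	assert(toy_found_empty_check(&live));
}
```

Calling toy_self_check() from a test program's main() exercises both cases; the "stale" case is the one that would correspond to a get query racing with the asynchronous removal.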
Can you please test the following patch [1] (w/o yours, obviously)?
[1]
diff --git a/net/bridge/br_mdb.c b/net/bridge/br_mdb.c
index bc37e47ad829..1a52a0bca086 100644
--- a/net/bridge/br_mdb.c
+++ b/net/bridge/br_mdb.c
@@ -1674,7 +1674,7 @@ int br_mdb_get(struct net_device *dev, struct nlattr *tb[], u32 portid, u32 seq,
 	spin_lock_bh(&br->multicast_lock);
 
 	mp = br_mdb_ip_get(br, &group);
-	if (!mp) {
+	if (!mp || (!mp->ports && !mp->host_joined)) {
 		NL_SET_ERR_MSG_MOD(extack, "MDB entry not found");
 		err = -ENOENT;
 		goto unlock;
This works perfectly for me. Previously I would get at least 2 failures in 10. Without my patch and with the above patch, 100 tests pass without any failure.
Many thanks for looking at this!
Jamie