Jakub Kicinski <kuba@kernel.org> writes:
On Mon, 11 Nov 2024 18:09:01 +0100 Petr Machata wrote:
Check that only one notification is produced for various FDB edit operations.
Regarding the ip_link_add() and ip_link_master() helpers. This pattern of action plus corresponding defer is bound to come up often, and a dedicated vocabulary to capture it will be handy. tunnel_create() and vlan_create() from forwarding/lib.sh are somewhat opaque and perhaps too kitchen-sinky, so I tried to go in the opposite direction with these ones, and wrapped only the bare minimum to schedule a corresponding cleanup.
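For illustration, the "action plus corresponding defer" pattern described above could look roughly like the sketch below. This is an assumption, not the code from the patch: the defer() and defer_run() functions here are a hypothetical minimal stand-in for the selftests' own defer machinery, and the helper bodies are guessed from the description (wrap one ip command, schedule the matching cleanup).

```shell
#!/bin/bash
# Hypothetical minimal defer stack, for illustration only.
DEFERRED=()

defer()
{
	# Store the cleanup command as one shell-quoted string so that
	# arguments survive until it is evaluated later.
	DEFERRED+=("$(printf '%q ' "$@")")
}

defer_run()
{
	local i

	# Run cleanups in reverse order of registration (LIFO).
	for ((i = ${#DEFERRED[@]} - 1; i >= 0; i--)); do
		eval "${DEFERRED[i]}"
	done
	DEFERRED=()
}

# Assumed shape of the helper: create the link, schedule its removal.
ip_link_add()
{
	local name=$1; shift

	ip link add name "$name" "$@"
	defer ip link del dev "$name"
}

# Assumed shape of the helper: enslave the link, schedule the release.
ip_link_master()
{
	local member=$1; shift
	local master=$1; shift

	ip link set dev "$member" master "$master"
	defer ip link set dev "$member" nomaster
}
```

With something like this, a test would call e.g. "ip_link_add br0 type bridge" and "ip_link_master dummy0 br0", and a single defer_run at exit would undo both in reverse order.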
Looks like it fails about half of the time :(
https://netdev.bots.linux.dev/flakes.html?min-flip=0&tn-needle=fdb-notif...
OK, I can't reproduce this. I tried in a VM, on actual HW, debug, no debug, no luck. But I see basically two failures:
- A "0 seen, 1 expected", which... I don't know, maybe it could just be a misplaced sleep. I don't see how, but it's a deterministic scenario, there shouldn't be anything racy here, either it emits or it doesn't, so some buffering issue is the only thing I can think of.
- Deadlocks. E.g. this, which looks like it deadlocked and timed out ("bad unlock balance detected" followed by "blocked for more than 122 seconds" et al.):
https://netdev-3.bots.linux.dev/vmksft-net-dbg/results/846621/18-fdb-notify-...
Like... how could this patchset even theoretically cause a deadlock?