On Sun, Jul 17, 2022 at 02:34:22PM +0200, netdev@kapio-technology.com wrote:
If I were to randomly guess at almost 4AM in the morning, it has to do with "bridge fdb add" rather than the "bridge fdb replace" that's used for the MAB selftest. The fact I pointed out a few revisions ago, that MAB needs to be opt-in, is now coming back to bite us. Since it's not opt-in, the mv88e6xxx driver always creates locked FDB entries, and when we try to "bridge fdb add", the kernel says "hey, the FDB entry is already there!". Is that it?
Yes, that sounds like a reasonable explanation, as it adds 'ext learned, offloaded' entries. If you try and replace the 'add' with 'replace' in those tests, does it work?
Well, you have access to the selftests too... But yes, that is the reason, and it works when I change 'add' to 'replace', although of course this isn't the correct solution.
As for how to opt into MAB. Hmm. MAB seems to be essentially CPU assisted learning, which creates locked FDB entries. I wonder whether we should reconsider the position that address learning makes no sense on locked ports, and say that "+locked -learning" means no MAB, and "+locked +learning" means MAB? This would make a bunch of things more natural to handle in the kernel, and would also give us the opt-in we need.
I have done the one and then the other. We need to have some final decision on this point. And remember that this gave rise to an extra patch to fix link-local learning if learning is turned on on a locked port, which resulted in the decision to allways have learning off on locked ports.
I think part of the reason for the back-and-forth was not making a very clear distinction between basic 802.1X using hostapd, and MAB. While I agree hostapd doesn't have what to do with learning, for MAB I'm still wondering. It's the same situation for mv88e6xxx's Port Association Vector in fact.
Side note, the VTU and ATU member violation printks annoy me so badly. They aren't stating something super useful and they're a DoS attack vector in itself, even if they're rate limited. I wonder whether we could just turn the prints into a set of ethtool counters and call it a day?
Sounds like a good idea to me. :-)
Thinking this through, what we really want is trace points here, otherwise we'd lose information about which MAC address/VID/FID was it that caused the violation.