On Thu, Jul 21, 2022 at 05:20:01PM +0300, Vladimir Oltean wrote:
On Thu, Jul 21, 2022 at 04:27:52PM +0300, Ido Schimmel wrote:
I tried looking for information about MAB online, but couldn't find detailed material that answers my questions, so my answers are based on what I believe is logical, which might be wrong.
I'm kind of in the same situation here.
:(
Currently, the bridge will forward packets to a locked entry which effectively means that an unauthorized host can cause the bridge to direct packets to it and sniff them. Yes, the host can't send any packets through the port (while locked) and can't overtake an existing (unlocked) FDB entry, but it still seems like an odd decision. IMO, the situation in mv88e6xxx is even worse because there an unauthorized host can cause packets to a certain DMAC to be blackholed via its zero-DPV entry.
Another (minor?) issue is that locked entries cannot roam between locked ports. Let's say that my user space MAB policy is to authorize MAC X if it appears behind one of the locked ports swp1-swp4. An unauthorized host behind locked port swp5 can generate packets with SMAC X, preventing the true owner of this MAC behind swp1 from ever being authorized.
In the mv88e6xxx offload implementation, the locked entries eventually age out from time to time, practically giving the true owner of the MAC address another chance every 5 minutes or so. In the pure software implementation of locked FDB entries I'm not quite sure. It wouldn't make much sense for the behavior to differ significantly though.
From what I can tell, the same happens in software, but this behavior does not really make sense to me. It differs from how other learned entries age/roam and can lead to problems such as the one described above. It is also not documented anywhere, so I can't tell if it's intentional or an oversight. We need to have a good reason for such a behavior other than the fact that it appears to conform to the quirks of one hardware implementation.
It seems like the main purpose of these locked entries is to signal to user space the presence of a certain MAC behind a locked port, but they should not be able to affect packet forwarding in the bridge, unlike regular entries.
So essentially what you want is for br_handle_frame_finish() to treat "dst = br_fdb_find_rcu(br, eth_hdr(skb)->h_dest, vid);" as NULL if test_bit(BR_FDB_LOCKED, &dst->flags) is true?
Yes. It's not clear to me why unauthorized hosts should be given the ability to affect packet forwarding in the bridge through these locked entries when their primary purpose seems to be notifying user space about the presence of the MAC. At the very least this should be explained in the commit message, to indicate that some thought went into this decision.
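To make the lookup change discussed above concrete, here is a minimal userspace sketch, not a kernel patch. The struct layout, FDB_FLAG_LOCKED and both function names are stand-ins invented for illustration; only the idea (treat a locked destination entry as a miss, so the packet is flooded instead of being steered to the unauthorized host) comes from this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical stand-in for the bridge's per-entry "locked" bit. */
#define FDB_FLAG_LOCKED (1u << 0)

/* Simplified FDB entry; the real bridge entry carries much more state. */
struct fdb_entry {
	uint8_t addr[ETH_ALEN];
	uint16_t vid;
	int port;
	unsigned int flags;
};

/* Plain lookup: returns any entry matching (addr, vid), locked or not. */
static const struct fdb_entry *fdb_find(const struct fdb_entry *tbl, size_t n,
					const uint8_t *addr, uint16_t vid)
{
	assert(tbl != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (tbl[i].vid == vid &&
		    memcmp(tbl[i].addr, addr, ETH_ALEN) == 0)
			return &tbl[i];
	}
	return NULL;
}

/*
 * Forwarding lookup: a locked entry is reported as "not found", so the
 * caller falls back to flooding instead of forwarding to the locked port.
 */
static const struct fdb_entry *fdb_find_for_forwarding(const struct fdb_entry *tbl,
							size_t n,
							const uint8_t *addr,
							uint16_t vid)
{
	const struct fdb_entry *dst = fdb_find(tbl, n, addr, vid);

	if (dst && (dst->flags & FDB_FLAG_LOCKED))
		return NULL;
	return dst;
}
```

In this model the locked entry still exists in the table, so user space could still be notified about it; it just never becomes a forwarding destination.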
Regarding a separate knob for MAB, I tend to agree we need it. Otherwise we cannot control which locked ports are able to populate the FDB with locked entries. I don't particularly like the fact that we overload an existing flag ("learning") for that. Any reason not to add an explicit flag ("mab")? At least with the current implementation, locked entries cannot roam between locked ports and cannot be refreshed, which differs from regular learning.
Well, assuming we model the software bridge closer to mv88e6xxx (where locked FDB entries can roam after a certain time), does this change things? In the software implementation I think it would make sense for them to be able to roam right away (the age-out interval in mv88e6xxx is just a compromise between responsiveness to roaming and resistance to DoS).
Exactly. If this is the best that we can do with mv88e6xxx, then so be it, but other implementations (software/hardware) do not have the same limitations and I don't see a reason to bend them.
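For illustration, a rough userspace model of "locked entries roam right away" could look like the sketch below. Everything in it (the table layout, fdb_learn(), the result enum) is hypothetical and simplified; it only encodes two rules taken from this discussion: a locked entry follows the SMAC to another locked port immediately, and a host behind a locked port cannot take over an existing unlocked entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Simplified, hypothetical FDB slot. */
struct fdb_entry {
	uint8_t addr[ETH_ALEN];
	int port;
	bool in_use;
	bool locked;
};

enum learn_result {
	LEARN_CREATED,    /* new entry installed */
	LEARN_REFRESHED,  /* same port, nothing moved */
	LEARN_ROAMED,     /* entry moved to the ingress port */
	LEARN_REFUSED,    /* locked port tried to take over an unlocked entry */
	LEARN_TABLE_FULL,
};

/* Learn source MAC "smac" seen on "port"; "port_locked" is the port's state. */
static enum learn_result fdb_learn(struct fdb_entry *tbl, size_t n, int port,
				   bool port_locked, const uint8_t *smac)
{
	struct fdb_entry *free_slot = NULL;

	assert(tbl != NULL);

	for (size_t i = 0; i < n; i++) {
		struct fdb_entry *e = &tbl[i];

		if (!e->in_use) {
			if (!free_slot)
				free_slot = e;
			continue;
		}
		if (memcmp(e->addr, smac, ETH_ALEN) != 0)
			continue;

		if (e->port == port)
			return LEARN_REFRESHED;
		/* Unauthorized host must not steal an authorized entry. */
		if (port_locked && !e->locked)
			return LEARN_REFUSED;
		/* Roam immediately; the entry stays locked on a locked port. */
		e->port = port;
		e->locked = port_locked;
		return LEARN_ROAMED;
	}

	if (!free_slot)
		return LEARN_TABLE_FULL;

	memcpy(free_slot->addr, smac, ETH_ALEN);
	free_slot->port = port;
	free_slot->locked = port_locked;
	free_slot->in_use = true;
	return LEARN_CREATED;
}
```

With this model, the swp5/swp1 scenario described earlier resolves itself: the spoofed entry on swp5 moves to swp1 as soon as the true owner sends a packet, without waiting for an age-out interval.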
Regarding "learning" vs. "mab" (or something else), the former is a well-defined flag available since forever. In 5.18 and 5.19 it can also be enabled together with "locked" and packets from an unauthorized host (modulo link-local ones) will not populate the FDB. I prefer not to change an existing behavior.
From a usability point of view, I think a new flag would be easier to explain than the fact that "learning on" behaves like A or B, based on whether "locked on" is set. The bridge can also be taught to forbid the new flag from being set when "locked" is not set.
A user space daemon that wants to try 802.1x and fall back to MAB can enable both flags, or enable "mab" after some timer expires.
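As a sketch of the flag dependency suggested above (rejecting "mab" unless "locked" is also set), the check could be modeled in userspace roughly as follows. The PORT_FLAG_* names and helper functions are made up for this example and are not existing bridge definitions.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-port flag bits, for illustration only. */
#define PORT_FLAG_LEARNING (1u << 0)
#define PORT_FLAG_LOCKED   (1u << 1)
#define PORT_FLAG_MAB      (1u << 2)

/* "mab" is only meaningful on a locked port, so reject it otherwise. */
static int port_flags_validate(unsigned int flags)
{
	if ((flags & PORT_FLAG_MAB) && !(flags & PORT_FLAG_LOCKED))
		return -EINVAL;
	return 0;
}

/* Apply a new flag set atomically: either all of it or none of it. */
static int port_flags_set(unsigned int *cur, unsigned int new_flags)
{
	int err;

	assert(cur != NULL);

	err = port_flags_validate(new_flags);
	if (err)
		return err;
	*cur = new_flags;
	return 0;
}

/* Only ports with both flags may populate the FDB with locked entries. */
static bool port_may_create_locked_entry(unsigned int flags)
{
	return (flags & PORT_FLAG_LOCKED) && (flags & PORT_FLAG_MAB);
}
```

In this model, a daemon that starts with 802.1x would first set only the locked flag and add the MAB flag later (for example when its timer expires). "learning" keeps its existing meaning and does not influence whether locked entries are created.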