This patch set extends the locked port feature for devices that are behind a locked port, but do not have the ability to authorize themselves as a supplicant using IEEE 802.1X. Such devices can be printers, meters or anything related to fixed installations. Instead of 802.1X authorization, devices can get access based on their MAC addresses being whitelisted.
For an authorization daemon to detect that a device is trying to get access through a locked port, the bridge will add the MAC address of the device to the FDB with a locked flag to it. Thus the authorization daemon can catch the FDB add event and check if the MAC address is in the whitelist and if so replace the FDB entry without the locked flag enabled, and thus open the port for the device.
This feature is known as MAC-Auth or MAC Authentication Bypass (MAB) in Cisco terminology, where the full MAB concept involves additional Cisco infrastructure for authorization. There is no real authentication process, as the MAC address of the device is the only input the authorization daemon, in the general case, has to base the decision if to unlock the port or not.
With this patch set, an implementation of the offloaded case is supplied for the mv88e6xxx driver. When a packet ingresses on a locked port, an ATU miss violation event will occur. When handling such ATU miss violation interrupts, the MAC address of the device is added to the FDB with a zero destination port vector (DPV) and the MAC address is communicated through the switchdev layer to the bridge, so that a FDB entry with the locked flag enabled can be added.
Log: v3: Added timers and lists in the driver (mv88e6xxx) to keep track of and remove locked entries.
v4: Leave out enforcing a limit to the number of locked entries in the bridge. Removed the timers in the driver and use the worker only. Add locked FDB flag to all drivers using port_fdb_add() from the dsa api and let all drivers ignore entries with this flag set. Change how to get the ageing timeout of locked entries. See global1_atu.c and switchdev.c. Use struct mv88e6xxx_port for locked entries variables instead of struct dsa_port.
v5: Added 'mab' flag to enable MAB/MacAuth feature, in a similar way to the locked feature flag.
In these implementations for the mv88e6xxx, the switchport must be configured with learning on.
To tell userspace about the behavior of the locked entries in the driver, a 'blackhole' FDB flag has been added, which locked FDB entries coming from the driver gets. Also the 'sticky' flag comes with those locked entries, as the drivers locked entries cannot roam.
Fixed issues with taking mutex locks, and added a function to read the fid, that supports all versions of the chipset family.
Hans Schultz (6): net: bridge: add locked entry fdb flag to extend locked port feature net: switchdev: add support for offloading of fdb locked flag drivers: net: dsa: add locked fdb entry flag to drivers net: dsa: mv88e6xxx: allow reading FID when handling ATU violations net: dsa: mv88e6xxx: MacAuth/MAB implementation selftests: forwarding: add test of MAC-Auth Bypass to locked port tests
drivers/net/dsa/b53/b53_common.c | 5 + drivers/net/dsa/b53/b53_priv.h | 1 + drivers/net/dsa/hirschmann/hellcreek.c | 5 + drivers/net/dsa/lan9303-core.c | 5 + drivers/net/dsa/lantiq_gswip.c | 5 + drivers/net/dsa/microchip/ksz_common.c | 5 + drivers/net/dsa/mt7530.c | 5 + drivers/net/dsa/mv88e6xxx/Makefile | 1 + drivers/net/dsa/mv88e6xxx/chip.c | 81 ++++- drivers/net/dsa/mv88e6xxx/chip.h | 19 ++ drivers/net/dsa/mv88e6xxx/global1.h | 1 + drivers/net/dsa/mv88e6xxx/global1_atu.c | 76 ++++- drivers/net/dsa/mv88e6xxx/port.c | 15 +- drivers/net/dsa/mv88e6xxx/port.h | 6 + drivers/net/dsa/mv88e6xxx/switchdev.c | 285 ++++++++++++++++++ drivers/net/dsa/mv88e6xxx/switchdev.h | 37 +++ drivers/net/dsa/ocelot/felix.c | 5 + drivers/net/dsa/qca/qca8k-common.c | 5 + drivers/net/dsa/qca/qca8k.h | 1 + drivers/net/dsa/sja1105/sja1105_main.c | 7 +- include/linux/if_bridge.h | 1 + include/net/dsa.h | 1 + include/net/switchdev.h | 3 + include/uapi/linux/if_link.h | 1 + include/uapi/linux/neighbour.h | 4 +- net/bridge/br.c | 5 +- net/bridge/br_fdb.c | 43 ++- net/bridge/br_input.c | 16 +- net/bridge/br_netlink.c | 9 +- net/bridge/br_private.h | 7 +- net/bridge/br_switchdev.c | 5 +- net/dsa/dsa_priv.h | 4 +- net/dsa/port.c | 7 +- net/dsa/slave.c | 4 +- net/dsa/switch.c | 10 +- .../net/forwarding/bridge_locked_port.sh | 107 ++++++- .../net/forwarding/bridge_sticky_fdb.sh | 21 +- 37 files changed, 768 insertions(+), 50 deletions(-) create mode 100644 drivers/net/dsa/mv88e6xxx/switchdev.c create mode 100644 drivers/net/dsa/mv88e6xxx/switchdev.h
Add an intermediate state for clients behind a locked port to allow for possible opening of the port for said clients. The clients mac address will be added with the locked flag set, denying access through the port for the mac address, but also creating a new FDB add event giving userspace daemons the ability to unlock the mac address. This feature corresponds to the Mac-Auth and MAC Authentication Bypass (MAB) named features. The latter defined by Cisco.
As locked FDB entries are a security measure to deny access for unauthorized hosts on specific ports, they will deny forwarding from any port to the (MAC, vlan) pair involved and locked entries will not be able by learning or otherwise to change the associated port.
Only the kernel can set this FDB entry flag, while userspace can read the flag and remove it by replacing or deleting the FDB entry.
Locked entries will age out with the set bridge ageing time.
Also add a 'blackhole' fdb flag, ensuring that no forwarding from any port to a destination MAC that has a FDB entry with this flag on will occur. The packets will thus be dropped.
Signed-off-by: Hans Schultz netdev@kapio-technology.com --- include/linux/if_bridge.h | 1 + include/uapi/linux/if_link.h | 1 + include/uapi/linux/neighbour.h | 4 +++- net/bridge/br_fdb.c | 29 +++++++++++++++++++++++++++++ net/bridge/br_input.c | 16 +++++++++++++++- net/bridge/br_netlink.c | 9 ++++++++- net/bridge/br_private.h | 4 +++- 7 files changed, 60 insertions(+), 4 deletions(-)
diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h index d62ef428e3aa..1668ac4d7adc 100644 --- a/include/linux/if_bridge.h +++ b/include/linux/if_bridge.h @@ -59,6 +59,7 @@ struct br_ip_list { #define BR_MRP_LOST_IN_CONT BIT(19) #define BR_TX_FWD_OFFLOAD BIT(20) #define BR_PORT_LOCKED BIT(21) +#define BR_PORT_MAB BIT(22)
#define BR_DEFAULT_AGEING_TIME (300 * HZ)
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h index e36d9d2c65a7..fcbd0b85ad53 100644 --- a/include/uapi/linux/if_link.h +++ b/include/uapi/linux/if_link.h @@ -560,6 +560,7 @@ enum { IFLA_BRPORT_MCAST_EHT_HOSTS_LIMIT, IFLA_BRPORT_MCAST_EHT_HOSTS_CNT, IFLA_BRPORT_LOCKED, + IFLA_BRPORT_MAB, __IFLA_BRPORT_MAX }; #define IFLA_BRPORT_MAX (__IFLA_BRPORT_MAX - 1) diff --git a/include/uapi/linux/neighbour.h b/include/uapi/linux/neighbour.h index a998bf761635..bc1440a56b70 100644 --- a/include/uapi/linux/neighbour.h +++ b/include/uapi/linux/neighbour.h @@ -52,7 +52,9 @@ enum { #define NTF_STICKY (1 << 6) #define NTF_ROUTER (1 << 7) /* Extended flags under NDA_FLAGS_EXT: */ -#define NTF_EXT_MANAGED (1 << 0) +#define NTF_EXT_MANAGED (1 << 0) +#define NTF_EXT_LOCKED (1 << 1) +#define NTF_EXT_BLACKHOLE (1 << 2)
/* * Neighbor Cache Entry States. diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c index e7f4fccb6adb..1962d9594a48 100644 --- a/net/bridge/br_fdb.c +++ b/net/bridge/br_fdb.c @@ -105,6 +105,7 @@ static int fdb_fill_info(struct sk_buff *skb, const struct net_bridge *br, struct nda_cacheinfo ci; struct nlmsghdr *nlh; struct ndmsg *ndm; + u32 ext_flags = 0;
nlh = nlmsg_put(skb, portid, seq, type, sizeof(*ndm), flags); if (nlh == NULL) @@ -125,11 +126,18 @@ static int fdb_fill_info(struct sk_buff *skb, const struct net_bridge *br, ndm->ndm_flags |= NTF_EXT_LEARNED; if (test_bit(BR_FDB_STICKY, &fdb->flags)) ndm->ndm_flags |= NTF_STICKY; + if (test_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags)) + ext_flags |= NTF_EXT_LOCKED; + if (test_bit(BR_FDB_BLACKHOLE, &fdb->flags)) + ext_flags |= NTF_EXT_BLACKHOLE;
if (nla_put(skb, NDA_LLADDR, ETH_ALEN, &fdb->key.addr)) goto nla_put_failure; if (nla_put_u32(skb, NDA_MASTER, br->dev->ifindex)) goto nla_put_failure; + if (nla_put_u32(skb, NDA_FLAGS_EXT, ext_flags)) + goto nla_put_failure; + ci.ndm_used = jiffies_to_clock_t(now - fdb->used); ci.ndm_confirmed = 0; ci.ndm_updated = jiffies_to_clock_t(now - fdb->updated); @@ -171,6 +179,7 @@ static inline size_t fdb_nlmsg_size(void) return NLMSG_ALIGN(sizeof(struct ndmsg)) + nla_total_size(ETH_ALEN) /* NDA_LLADDR */ + nla_total_size(sizeof(u32)) /* NDA_MASTER */ + + nla_total_size(sizeof(u32)) /* NDA_FLAGS_EXT */ + nla_total_size(sizeof(u16)) /* NDA_VLAN */ + nla_total_size(sizeof(struct nda_cacheinfo)) + nla_total_size(0) /* NDA_FDB_EXT_ATTRS */ @@ -879,6 +888,10 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source, &fdb->flags))) clear_bit(BR_FDB_ADDED_BY_EXT_LEARN, &fdb->flags); + if (source->flags & BR_PORT_MAB) + set_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags); + else + clear_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags); }
if (unlikely(test_bit(BR_FDB_ADDED_BY_USER, &flags))) @@ -1082,6 +1095,16 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source, modified = true; }
+ if (test_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags)) { + clear_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags); + modified = true; + } + + if (test_bit(BR_FDB_BLACKHOLE, &fdb->flags)) { + clear_bit(BR_FDB_BLACKHOLE, &fdb->flags); + modified = true; + } + if (fdb_handle_notify(fdb, notify)) modified = true;
@@ -1178,6 +1201,12 @@ int br_fdb_add(struct ndmsg *ndm, struct nlattr *tb[], vg = nbp_vlan_group(p); }
+ if (tb[NDA_FLAGS_EXT] && + (nla_get_u32(tb[NDA_FLAGS_EXT]) & (NTF_EXT_LOCKED | NTF_EXT_BLACKHOLE))) { + pr_info("bridge: RTM_NEWNEIGH has invalid extended flags\n"); + return -EINVAL; + } + if (tb[NDA_FDB_EXT_ATTRS]) { attr = tb[NDA_FDB_EXT_ATTRS]; err = nla_parse_nested(nfea_tb, NFEA_MAX, attr, diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c index 68b3e850bcb9..3d48aa7fa778 100644 --- a/net/bridge/br_input.c +++ b/net/bridge/br_input.c @@ -110,8 +110,19 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb br_fdb_find_rcu(br, eth_hdr(skb)->h_source, vid);
if (!fdb_src || READ_ONCE(fdb_src->dst) != p || - test_bit(BR_FDB_LOCAL, &fdb_src->flags)) + test_bit(BR_FDB_LOCAL, &fdb_src->flags) || + test_bit(BR_FDB_ENTRY_LOCKED, &fdb_src->flags)) { + if (!fdb_src || (READ_ONCE(fdb_src->dst) != p && + (p->flags & BR_LEARNING))) { + unsigned long flags = 0; + + if (p->flags & BR_PORT_MAB) { + __set_bit(BR_FDB_ENTRY_LOCKED, &flags); + br_fdb_update(br, p, eth_hdr(skb)->h_source, vid, flags); + } + } goto drop; + } }
nbp_switchdev_frame_mark(p, skb); @@ -185,6 +196,9 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb if (test_bit(BR_FDB_LOCAL, &dst->flags)) return br_pass_frame_up(skb);
+ if (test_bit(BR_FDB_BLACKHOLE, &dst->flags)) + goto drop; + if (now != dst->used) dst->used = now; br_forward(dst->dst, skb, local_rcv, false); diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c index 5aeb3646e74c..34483420c9e4 100644 --- a/net/bridge/br_netlink.c +++ b/net/bridge/br_netlink.c @@ -188,6 +188,7 @@ static inline size_t br_port_info_size(void) + nla_total_size(1) /* IFLA_BRPORT_NEIGH_SUPPRESS */ + nla_total_size(1) /* IFLA_BRPORT_ISOLATED */ + nla_total_size(1) /* IFLA_BRPORT_LOCKED */ + + nla_total_size(1) /* IFLA_BRPORT_MAB */ + nla_total_size(sizeof(struct ifla_bridge_id)) /* IFLA_BRPORT_ROOT_ID */ + nla_total_size(sizeof(struct ifla_bridge_id)) /* IFLA_BRPORT_BRIDGE_ID */ + nla_total_size(sizeof(u16)) /* IFLA_BRPORT_DESIGNATED_PORT */ @@ -274,7 +275,8 @@ static int br_port_fill_attrs(struct sk_buff *skb, nla_put_u8(skb, IFLA_BRPORT_MRP_IN_OPEN, !!(p->flags & BR_MRP_LOST_IN_CONT)) || nla_put_u8(skb, IFLA_BRPORT_ISOLATED, !!(p->flags & BR_ISOLATED)) || - nla_put_u8(skb, IFLA_BRPORT_LOCKED, !!(p->flags & BR_PORT_LOCKED))) + nla_put_u8(skb, IFLA_BRPORT_LOCKED, !!(p->flags & BR_PORT_LOCKED)) || + nla_put_u8(skb, IFLA_BRPORT_MAB, !!(p->flags & BR_PORT_MAB))) return -EMSGSIZE;
timerval = br_timer_value(&p->message_age_timer); @@ -876,6 +878,7 @@ static const struct nla_policy br_port_policy[IFLA_BRPORT_MAX + 1] = { [IFLA_BRPORT_NEIGH_SUPPRESS] = { .type = NLA_U8 }, [IFLA_BRPORT_ISOLATED] = { .type = NLA_U8 }, [IFLA_BRPORT_LOCKED] = { .type = NLA_U8 }, + [IFLA_BRPORT_MAB] = { .type = NLA_U8 }, [IFLA_BRPORT_BACKUP_PORT] = { .type = NLA_U32 }, [IFLA_BRPORT_MCAST_EHT_HOSTS_LIMIT] = { .type = NLA_U32 }, }; @@ -943,6 +946,10 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[], br_set_port_flag(p, tb, IFLA_BRPORT_NEIGH_SUPPRESS, BR_NEIGH_SUPPRESS); br_set_port_flag(p, tb, IFLA_BRPORT_ISOLATED, BR_ISOLATED); br_set_port_flag(p, tb, IFLA_BRPORT_LOCKED, BR_PORT_LOCKED); + br_set_port_flag(p, tb, IFLA_BRPORT_MAB, BR_PORT_MAB); + + if (!(p->flags & BR_PORT_LOCKED)) + p->flags &= ~BR_PORT_MAB;
changed_mask = old_flags ^ p->flags;
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index 06e5f6faa431..048e4afbc5a0 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -251,7 +251,9 @@ enum { BR_FDB_ADDED_BY_EXT_LEARN, BR_FDB_OFFLOADED, BR_FDB_NOTIFY, - BR_FDB_NOTIFY_INACTIVE + BR_FDB_NOTIFY_INACTIVE, + BR_FDB_ENTRY_LOCKED, + BR_FDB_BLACKHOLE, };
struct net_bridge_fdb_key {
On 26/08/2022 14:45, Hans Schultz wrote:
Add an intermediate state for clients behind a locked port to allow for possible opening of the port for said clients. The clients mac address will be added with the locked flag set, denying access through the port for the mac address, but also creating a new FDB add event giving userspace daemons the ability to unlock the mac address. This feature corresponds to the Mac-Auth and MAC Authentication Bypass (MAB) named features. The latter defined by Cisco.
As locked FDB entries are a security measure to deny access for unauthorized hosts on specific ports, they will deny forwarding from any port to the (MAC, vlan) pair involved and locked entries will not be able by learning or otherwise to change the associated port.
Only the kernel can set this FDB entry flag, while userspace can read the flag and remove it by replacing or deleting the FDB entry.
Locked entries will age out with the set bridge ageing time.
Also add a 'blackhole' fdb flag, ensuring that no forwarding from any port to a destination MAC that has a FDB entry with this flag on will occur. The packets will thus be dropped.
Signed-off-by: Hans Schultz netdev@kapio-technology.com
include/linux/if_bridge.h | 1 + include/uapi/linux/if_link.h | 1 + include/uapi/linux/neighbour.h | 4 +++- net/bridge/br_fdb.c | 29 +++++++++++++++++++++++++++++ net/bridge/br_input.c | 16 +++++++++++++++- net/bridge/br_netlink.c | 9 ++++++++- net/bridge/br_private.h | 4 +++- 7 files changed, 60 insertions(+), 4 deletions(-)
Hi, Please add the blackhole flag in a separate patch. A few more comments and questions below..
diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h index d62ef428e3aa..1668ac4d7adc 100644 --- a/include/linux/if_bridge.h +++ b/include/linux/if_bridge.h @@ -59,6 +59,7 @@ struct br_ip_list { #define BR_MRP_LOST_IN_CONT BIT(19) #define BR_TX_FWD_OFFLOAD BIT(20) #define BR_PORT_LOCKED BIT(21) +#define BR_PORT_MAB BIT(22) #define BR_DEFAULT_AGEING_TIME (300 * HZ) diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h index e36d9d2c65a7..fcbd0b85ad53 100644 --- a/include/uapi/linux/if_link.h +++ b/include/uapi/linux/if_link.h @@ -560,6 +560,7 @@ enum { IFLA_BRPORT_MCAST_EHT_HOSTS_LIMIT, IFLA_BRPORT_MCAST_EHT_HOSTS_CNT, IFLA_BRPORT_LOCKED,
- IFLA_BRPORT_MAB, __IFLA_BRPORT_MAX
}; #define IFLA_BRPORT_MAX (__IFLA_BRPORT_MAX - 1) diff --git a/include/uapi/linux/neighbour.h b/include/uapi/linux/neighbour.h index a998bf761635..bc1440a56b70 100644 --- a/include/uapi/linux/neighbour.h +++ b/include/uapi/linux/neighbour.h @@ -52,7 +52,9 @@ enum { #define NTF_STICKY (1 << 6) #define NTF_ROUTER (1 << 7) /* Extended flags under NDA_FLAGS_EXT: */ -#define NTF_EXT_MANAGED (1 << 0) +#define NTF_EXT_MANAGED (1 << 0) +#define NTF_EXT_LOCKED (1 << 1) +#define NTF_EXT_BLACKHOLE (1 << 2) /*
- Neighbor Cache Entry States.
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c index e7f4fccb6adb..1962d9594a48 100644 --- a/net/bridge/br_fdb.c +++ b/net/bridge/br_fdb.c @@ -105,6 +105,7 @@ static int fdb_fill_info(struct sk_buff *skb, const struct net_bridge *br, struct nda_cacheinfo ci; struct nlmsghdr *nlh; struct ndmsg *ndm;
- u32 ext_flags = 0;
nlh = nlmsg_put(skb, portid, seq, type, sizeof(*ndm), flags); if (nlh == NULL) @@ -125,11 +126,18 @@ static int fdb_fill_info(struct sk_buff *skb, const struct net_bridge *br, ndm->ndm_flags |= NTF_EXT_LEARNED; if (test_bit(BR_FDB_STICKY, &fdb->flags)) ndm->ndm_flags |= NTF_STICKY;
- if (test_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags))
ext_flags |= NTF_EXT_LOCKED;
- if (test_bit(BR_FDB_BLACKHOLE, &fdb->flags))
ext_flags |= NTF_EXT_BLACKHOLE;
if (nla_put(skb, NDA_LLADDR, ETH_ALEN, &fdb->key.addr)) goto nla_put_failure; if (nla_put_u32(skb, NDA_MASTER, br->dev->ifindex)) goto nla_put_failure;
- if (nla_put_u32(skb, NDA_FLAGS_EXT, ext_flags))
goto nla_put_failure;
- ci.ndm_used = jiffies_to_clock_t(now - fdb->used); ci.ndm_confirmed = 0; ci.ndm_updated = jiffies_to_clock_t(now - fdb->updated);
@@ -171,6 +179,7 @@ static inline size_t fdb_nlmsg_size(void) return NLMSG_ALIGN(sizeof(struct ndmsg)) + nla_total_size(ETH_ALEN) /* NDA_LLADDR */ + nla_total_size(sizeof(u32)) /* NDA_MASTER */
+ nla_total_size(sizeof(u32)) /* NDA_FLAGS_EXT */
- nla_total_size(sizeof(u16)) /* NDA_VLAN */
- nla_total_size(sizeof(struct nda_cacheinfo))
- nla_total_size(0) /* NDA_FDB_EXT_ATTRS */
@@ -879,6 +888,10 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source, &fdb->flags))) clear_bit(BR_FDB_ADDED_BY_EXT_LEARN, &fdb->flags);
if (source->flags & BR_PORT_MAB)
set_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags);
else
clear_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags);
Please add a test for that bit and only then change it.
}
if (unlikely(test_bit(BR_FDB_ADDED_BY_USER, &flags))) @@ -1082,6 +1095,16 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source, modified = true; }
- if (test_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags)) {
clear_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags);
modified = true;
- }
- if (test_bit(BR_FDB_BLACKHOLE, &fdb->flags)) {
clear_bit(BR_FDB_BLACKHOLE, &fdb->flags);
modified = true;
- }
- if (fdb_handle_notify(fdb, notify)) modified = true;
@@ -1178,6 +1201,12 @@ int br_fdb_add(struct ndmsg *ndm, struct nlattr *tb[], vg = nbp_vlan_group(p); }
- if (tb[NDA_FLAGS_EXT] &&
(nla_get_u32(tb[NDA_FLAGS_EXT]) & (NTF_EXT_LOCKED | NTF_EXT_BLACKHOLE))) {
pr_info("bridge: RTM_NEWNEIGH has invalid extended flags\n");
return -EINVAL;
- }
- if (tb[NDA_FDB_EXT_ATTRS]) { attr = tb[NDA_FDB_EXT_ATTRS]; err = nla_parse_nested(nfea_tb, NFEA_MAX, attr,
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c index 68b3e850bcb9..3d48aa7fa778 100644 --- a/net/bridge/br_input.c +++ b/net/bridge/br_input.c @@ -110,8 +110,19 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb br_fdb_find_rcu(br, eth_hdr(skb)->h_source, vid); if (!fdb_src || READ_ONCE(fdb_src->dst) != p ||
test_bit(BR_FDB_LOCAL, &fdb_src->flags))
test_bit(BR_FDB_LOCAL, &fdb_src->flags) ||
test_bit(BR_FDB_ENTRY_LOCKED, &fdb_src->flags)) {
if (!fdb_src || (READ_ONCE(fdb_src->dst) != p &&
(p->flags & BR_LEARNING))) {
please break the second part of the check on a new line instead
unsigned long flags = 0;
if (p->flags & BR_PORT_MAB) {
combine this and the above as one "if" block, no need to have a new one here which preferrably starts with the MAB check
__set_bit(BR_FDB_ENTRY_LOCKED, &flags);
br_fdb_update(br, p, eth_hdr(skb)->h_source, vid, flags);> + }
} goto drop;
}}
nbp_switchdev_frame_mark(p, skb); @@ -185,6 +196,9 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb if (test_bit(BR_FDB_LOCAL, &dst->flags)) return br_pass_frame_up(skb);
if (test_bit(BR_FDB_BLACKHOLE, &dst->flags))
goto drop;
Not happy about adding a new test in arguably the most used fast-path, but I don't see a better way to do blackhole right now. Could you please make it an unlikely() ?
I guess the blackhole flag will be allowed for user-space to set at some point, why not do it from the start?
Actually adding a BR_FDB_LOCAL and BR_FDB_BLACKHOLE would be a conflict above - the packet will be received. So you should move the blackhole check above the BR_FDB_LOCAL one if user-space is allowed to set it to any entry.
if (now != dst->used) dst->used = now; br_forward(dst->dst, skb, local_rcv, false);
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c index 5aeb3646e74c..34483420c9e4 100644 --- a/net/bridge/br_netlink.c +++ b/net/bridge/br_netlink.c @@ -188,6 +188,7 @@ static inline size_t br_port_info_size(void) + nla_total_size(1) /* IFLA_BRPORT_NEIGH_SUPPRESS */ + nla_total_size(1) /* IFLA_BRPORT_ISOLATED */ + nla_total_size(1) /* IFLA_BRPORT_LOCKED */
+ nla_total_size(1) /* IFLA_BRPORT_MAB */
- nla_total_size(sizeof(struct ifla_bridge_id)) /* IFLA_BRPORT_ROOT_ID */
- nla_total_size(sizeof(struct ifla_bridge_id)) /* IFLA_BRPORT_BRIDGE_ID */
- nla_total_size(sizeof(u16)) /* IFLA_BRPORT_DESIGNATED_PORT */
@@ -274,7 +275,8 @@ static int br_port_fill_attrs(struct sk_buff *skb, nla_put_u8(skb, IFLA_BRPORT_MRP_IN_OPEN, !!(p->flags & BR_MRP_LOST_IN_CONT)) || nla_put_u8(skb, IFLA_BRPORT_ISOLATED, !!(p->flags & BR_ISOLATED)) ||
nla_put_u8(skb, IFLA_BRPORT_LOCKED, !!(p->flags & BR_PORT_LOCKED)))
nla_put_u8(skb, IFLA_BRPORT_LOCKED, !!(p->flags & BR_PORT_LOCKED)) ||
return -EMSGSIZE;nla_put_u8(skb, IFLA_BRPORT_MAB, !!(p->flags & BR_PORT_MAB)))
timerval = br_timer_value(&p->message_age_timer); @@ -876,6 +878,7 @@ static const struct nla_policy br_port_policy[IFLA_BRPORT_MAX + 1] = { [IFLA_BRPORT_NEIGH_SUPPRESS] = { .type = NLA_U8 }, [IFLA_BRPORT_ISOLATED] = { .type = NLA_U8 }, [IFLA_BRPORT_LOCKED] = { .type = NLA_U8 },
- [IFLA_BRPORT_MAB] = { .type = NLA_U8 }, [IFLA_BRPORT_BACKUP_PORT] = { .type = NLA_U32 }, [IFLA_BRPORT_MCAST_EHT_HOSTS_LIMIT] = { .type = NLA_U32 },
}; @@ -943,6 +946,10 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[], br_set_port_flag(p, tb, IFLA_BRPORT_NEIGH_SUPPRESS, BR_NEIGH_SUPPRESS); br_set_port_flag(p, tb, IFLA_BRPORT_ISOLATED, BR_ISOLATED); br_set_port_flag(p, tb, IFLA_BRPORT_LOCKED, BR_PORT_LOCKED);
- br_set_port_flag(p, tb, IFLA_BRPORT_MAB, BR_PORT_MAB);
- if (!(p->flags & BR_PORT_LOCKED))
p->flags &= ~BR_PORT_MAB;
changed_mask = old_flags ^ p->flags; diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index 06e5f6faa431..048e4afbc5a0 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -251,7 +251,9 @@ enum { BR_FDB_ADDED_BY_EXT_LEARN, BR_FDB_OFFLOADED, BR_FDB_NOTIFY,
- BR_FDB_NOTIFY_INACTIVE
- BR_FDB_NOTIFY_INACTIVE,
- BR_FDB_ENTRY_LOCKED,
- BR_FDB_BLACKHOLE,
}; struct net_bridge_fdb_key {
Cheers, Nik
On Sat, Aug 27, 2022 at 02:30:25PM +0300, Nikolay Aleksandrov wrote:
On 26/08/2022 14:45, Hans Schultz wrote: Please add the blackhole flag in a separate patch.
+1
[...]
@@ -185,6 +196,9 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb if (test_bit(BR_FDB_LOCAL, &dst->flags)) return br_pass_frame_up(skb);
if (test_bit(BR_FDB_BLACKHOLE, &dst->flags))
goto drop;
Not happy about adding a new test in arguably the most used fast-path, but I don't see a better way to do blackhole right now. Could you please make it an unlikely() ?
I guess the blackhole flag will be allowed for user-space to set at some point, why not do it from the start?
Actually adding a BR_FDB_LOCAL and BR_FDB_BLACKHOLE would be a conflict above - the packet will be received. So you should move the blackhole check above the BR_FDB_LOCAL one if user-space is allowed to set it to any entry.
Agree about unlikely() and making it writeable from user space from the start. This flag is different from the "locked" flag that should only be ever set by the kernel.
Regarding BR_FDB_LOCAL, I think BR_FDB_BLACKHOLE should only be allowed with BR_FDB_LOCAL as these entries are similar in the following ways:
1. It doesn't make sense to associate a blackhole entry with a specific port. The packet will never be forwarded to this port, but dropped by the bridge. This means user space will add them on the bridge itself:
# bridge fdb add 00:11:22:33:44:55 dev br0 self local blackhole
2. If you agree that these entries should not be associated with a specific port, then it also does not make sense to subject them to ageing and roaming, just like existing local/permanent entries.
The above allows us to push the new check under the BR_FDB_LOCAL check:
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c index 68b3e850bcb9..4357445529a5 100644 --- a/net/bridge/br_input.c +++ b/net/bridge/br_input.c @@ -182,8 +182,11 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb if (dst) { unsigned long now = jiffies;
- if (test_bit(BR_FDB_LOCAL, &dst->flags)) + if (test_bit(BR_FDB_LOCAL, &dst->flags)) { + if (unlikely(test_bit(BR_FDB_BLACKHOLE, &dst->flags))) + goto drop; return br_pass_frame_up(skb); + }
if (now != dst->used) dst->used = now;
On 27/08/2022 16:17, Ido Schimmel wrote:
On Sat, Aug 27, 2022 at 02:30:25PM +0300, Nikolay Aleksandrov wrote:
On 26/08/2022 14:45, Hans Schultz wrote: Please add the blackhole flag in a separate patch.
+1
[...]
@@ -185,6 +196,9 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb if (test_bit(BR_FDB_LOCAL, &dst->flags)) return br_pass_frame_up(skb);
if (test_bit(BR_FDB_BLACKHOLE, &dst->flags))
goto drop;
Not happy about adding a new test in arguably the most used fast-path, but I don't see a better way to do blackhole right now. Could you please make it an unlikely() ?
I guess the blackhole flag will be allowed for user-space to set at some point, why not do it from the start?
Actually adding a BR_FDB_LOCAL and BR_FDB_BLACKHOLE would be a conflict above - the packet will be received. So you should move the blackhole check above the BR_FDB_LOCAL one if user-space is allowed to set it to any entry.
Agree about unlikely() and making it writeable from user space from the start. This flag is different from the "locked" flag that should only be ever set by the kernel.
Regarding BR_FDB_LOCAL, I think BR_FDB_BLACKHOLE should only be allowed with BR_FDB_LOCAL as these entries are similar in the following ways:
- It doesn't make sense to associate a blackhole entry with a specific
port. The packet will never be forwarded to this port, but dropped by the bridge. This means user space will add them on the bridge itself:
Right, good point.
# bridge fdb add 00:11:22:33:44:55 dev br0 self local blackhole
- If you agree that these entries should not be associated with a
specific port, then it also does not make sense to subject them to ageing and roaming, just like existing local/permanent entries.
The above allows us to push the new check under the BR_FDB_LOCAL check:
hmm.. so only the driver will be allowed to add non-BR_FDB_LOCAL blackhole entries with locked flag set as well, that sounds ok as they will be extern_learn and enforced by it. It is a little discrepancy as we cannot add similar entries in SW but it really doesn't make any sense to have blackhole fdbs pointing to a port. SW won't be able to have a locked entry w/ blackhole set, but I like that it is hidden in the fdb local case when fwding and that's good enough for me.
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c index 68b3e850bcb9..4357445529a5 100644 --- a/net/bridge/br_input.c +++ b/net/bridge/br_input.c @@ -182,8 +182,11 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb if (dst) { unsigned long now = jiffies;
if (test_bit(BR_FDB_LOCAL, &dst->flags))
if (test_bit(BR_FDB_LOCAL, &dst->flags)) {
if (unlikely(test_bit(BR_FDB_BLACKHOLE, &dst->flags)))
goto drop; return br_pass_frame_up(skb);
}
if (now != dst->used) dst->used = now;
On 2022-08-27 15:17, Ido Schimmel wrote:
On Sat, Aug 27, 2022 at 02:30:25PM +0300, Nikolay Aleksandrov wrote:
On 26/08/2022 14:45, Hans Schultz wrote: Please add the blackhole flag in a separate patch.
+1
[...]
@@ -185,6 +196,9 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb if (test_bit(BR_FDB_LOCAL, &dst->flags)) return br_pass_frame_up(skb);
if (test_bit(BR_FDB_BLACKHOLE, &dst->flags))
goto drop;
Not happy about adding a new test in arguably the most used fast-path, but I don't see a better way to do blackhole right now. Could you please make it an unlikely() ?
I guess the blackhole flag will be allowed for user-space to set at some point, why not do it from the start?
Actually adding a BR_FDB_LOCAL and BR_FDB_BLACKHOLE would be a conflict above - the packet will be received. So you should move the blackhole check above the BR_FDB_LOCAL one if user-space is allowed to set it to any entry.
Agree about unlikely() and making it writeable from user space from the start. This flag is different from the "locked" flag that should only be ever set by the kernel.
Regarding BR_FDB_LOCAL, I think BR_FDB_BLACKHOLE should only be allowed with BR_FDB_LOCAL as these entries are similar in the following ways:
- It doesn't make sense to associate a blackhole entry with a specific
port. The packet will never be forwarded to this port, but dropped by the bridge. This means user space will add them on the bridge itself:
# bridge fdb add 00:11:22:33:44:55 dev br0 self local blackhole
- If you agree that these entries should not be associated with a
specific port, then it also does not make sense to subject them to ageing and roaming, just like existing local/permanent entries.
The above allows us to push the new check under the BR_FDB_LOCAL check:
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c index 68b3e850bcb9..4357445529a5 100644 --- a/net/bridge/br_input.c +++ b/net/bridge/br_input.c @@ -182,8 +182,11 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb if (dst) { unsigned long now = jiffies;
if (test_bit(BR_FDB_LOCAL, &dst->flags))
if (test_bit(BR_FDB_LOCAL, &dst->flags)) {
if (unlikely(test_bit(BR_FDB_BLACKHOLE,
&dst->flags)))
goto drop; return br_pass_frame_up(skb);
} if (now != dst->used) dst->used = now;
It shall be so as suggested. :-)
On 2022-08-27 13:30, Nikolay Aleksandrov wrote:
On 26/08/2022 14:45, Hans Schultz wrote:
Add an intermediate state for clients behind a locked port to allow for possible opening of the port for said clients. The clients mac address will be added with the locked flag set, denying access through the port for the mac address, but also creating a new FDB add event giving userspace daemons the ability to unlock the mac address. This feature corresponds to the Mac-Auth and MAC Authentication Bypass (MAB) named features. The latter defined by Cisco.
As locked FDB entries are a security measure to deny access for unauthorized hosts on specific ports, they will deny forwarding from any port to the (MAC, vlan) pair involved and locked entries will not be able by learning or otherwise to change the associated port.
Only the kernel can set this FDB entry flag, while userspace can read the flag and remove it by replacing or deleting the FDB entry.
Locked entries will age out with the set bridge ageing time.
Also add a 'blackhole' fdb flag, ensuring that no forwarding from any port to a destination MAC that has a FDB entry with this flag on will occur. The packets will thus be dropped.
Signed-off-by: Hans Schultz netdev@kapio-technology.com
include/linux/if_bridge.h | 1 + include/uapi/linux/if_link.h | 1 + include/uapi/linux/neighbour.h | 4 +++- net/bridge/br_fdb.c | 29 +++++++++++++++++++++++++++++ net/bridge/br_input.c | 16 +++++++++++++++- net/bridge/br_netlink.c | 9 ++++++++- net/bridge/br_private.h | 4 +++- 7 files changed, 60 insertions(+), 4 deletions(-)
Hi, Please add the blackhole flag in a separate patch. A few more comments and questions below..
Okay, shall do. :-)
diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h index d62ef428e3aa..1668ac4d7adc 100644 --- a/include/linux/if_bridge.h +++ b/include/linux/if_bridge.h @@ -59,6 +59,7 @@ struct br_ip_list { #define BR_MRP_LOST_IN_CONT BIT(19) #define BR_TX_FWD_OFFLOAD BIT(20) #define BR_PORT_LOCKED BIT(21) +#define BR_PORT_MAB BIT(22)
#define BR_DEFAULT_AGEING_TIME (300 * HZ)
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h index e36d9d2c65a7..fcbd0b85ad53 100644 --- a/include/uapi/linux/if_link.h +++ b/include/uapi/linux/if_link.h @@ -560,6 +560,7 @@ enum { IFLA_BRPORT_MCAST_EHT_HOSTS_LIMIT, IFLA_BRPORT_MCAST_EHT_HOSTS_CNT, IFLA_BRPORT_LOCKED,
- IFLA_BRPORT_MAB, __IFLA_BRPORT_MAX
}; #define IFLA_BRPORT_MAX (__IFLA_BRPORT_MAX - 1) diff --git a/include/uapi/linux/neighbour.h b/include/uapi/linux/neighbour.h index a998bf761635..bc1440a56b70 100644 --- a/include/uapi/linux/neighbour.h +++ b/include/uapi/linux/neighbour.h @@ -52,7 +52,9 @@ enum { #define NTF_STICKY (1 << 6) #define NTF_ROUTER (1 << 7) /* Extended flags under NDA_FLAGS_EXT: */ -#define NTF_EXT_MANAGED (1 << 0) +#define NTF_EXT_MANAGED (1 << 0) +#define NTF_EXT_LOCKED (1 << 1) +#define NTF_EXT_BLACKHOLE (1 << 2)
/*
- Neighbor Cache Entry States.
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c index e7f4fccb6adb..1962d9594a48 100644 --- a/net/bridge/br_fdb.c +++ b/net/bridge/br_fdb.c @@ -105,6 +105,7 @@ static int fdb_fill_info(struct sk_buff *skb, const struct net_bridge *br, struct nda_cacheinfo ci; struct nlmsghdr *nlh; struct ndmsg *ndm;
u32 ext_flags = 0;
nlh = nlmsg_put(skb, portid, seq, type, sizeof(*ndm), flags); if (nlh == NULL)
@@ -125,11 +126,18 @@ static int fdb_fill_info(struct sk_buff *skb, const struct net_bridge *br, ndm->ndm_flags |= NTF_EXT_LEARNED; if (test_bit(BR_FDB_STICKY, &fdb->flags)) ndm->ndm_flags |= NTF_STICKY;
if (test_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags))
ext_flags |= NTF_EXT_LOCKED;
if (test_bit(BR_FDB_BLACKHOLE, &fdb->flags))
ext_flags |= NTF_EXT_BLACKHOLE;
if (nla_put(skb, NDA_LLADDR, ETH_ALEN, &fdb->key.addr)) goto nla_put_failure; if (nla_put_u32(skb, NDA_MASTER, br->dev->ifindex)) goto nla_put_failure;
if (nla_put_u32(skb, NDA_FLAGS_EXT, ext_flags))
goto nla_put_failure;
ci.ndm_used = jiffies_to_clock_t(now - fdb->used); ci.ndm_confirmed = 0; ci.ndm_updated = jiffies_to_clock_t(now - fdb->updated);
@@ -171,6 +179,7 @@ static inline size_t fdb_nlmsg_size(void) return NLMSG_ALIGN(sizeof(struct ndmsg)) + nla_total_size(ETH_ALEN) /* NDA_LLADDR */ + nla_total_size(sizeof(u32)) /* NDA_MASTER */
+ nla_total_size(sizeof(u32)) /* NDA_FLAGS_EXT */
- nla_total_size(sizeof(u16)) /* NDA_VLAN */
- nla_total_size(sizeof(struct nda_cacheinfo))
- nla_total_size(0) /* NDA_FDB_EXT_ATTRS */
@@ -879,6 +888,10 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source, &fdb->flags))) clear_bit(BR_FDB_ADDED_BY_EXT_LEARN, &fdb->flags);
if (source->flags & BR_PORT_MAB)
set_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags);
else
clear_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags);
Please add a test for that bit and only then change it.
Okay.
} if (unlikely(test_bit(BR_FDB_ADDED_BY_USER, &flags)))
@@ -1082,6 +1095,16 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source, modified = true; }
- if (test_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags)) {
clear_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags);
modified = true;
- }
- if (test_bit(BR_FDB_BLACKHOLE, &fdb->flags)) {
clear_bit(BR_FDB_BLACKHOLE, &fdb->flags);
modified = true;
- }
- if (fdb_handle_notify(fdb, notify)) modified = true;
@@ -1178,6 +1201,12 @@ int br_fdb_add(struct ndmsg *ndm, struct nlattr *tb[], vg = nbp_vlan_group(p); }
- if (tb[NDA_FLAGS_EXT] &&
(nla_get_u32(tb[NDA_FLAGS_EXT]) & (NTF_EXT_LOCKED |
NTF_EXT_BLACKHOLE))) {
pr_info("bridge: RTM_NEWNEIGH has invalid extended flags\n");
return -EINVAL;
- }
- if (tb[NDA_FDB_EXT_ATTRS]) { attr = tb[NDA_FDB_EXT_ATTRS]; err = nla_parse_nested(nfea_tb, NFEA_MAX, attr,
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c index 68b3e850bcb9..3d48aa7fa778 100644 --- a/net/bridge/br_input.c +++ b/net/bridge/br_input.c @@ -110,8 +110,19 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb br_fdb_find_rcu(br, eth_hdr(skb)->h_source, vid);
if (!fdb_src || READ_ONCE(fdb_src->dst) != p ||
test_bit(BR_FDB_LOCAL, &fdb_src->flags))
test_bit(BR_FDB_LOCAL, &fdb_src->flags) ||
test_bit(BR_FDB_ENTRY_LOCKED, &fdb_src->flags)) {
if (!fdb_src || (READ_ONCE(fdb_src->dst) != p &&
(p->flags & BR_LEARNING))) {
please break the second part of the check on a new line instead
unsigned long flags = 0;
if (p->flags & BR_PORT_MAB) {
combine this and the above as one "if" block, no need to have a new one here which preferrably starts with the MAB check
As the above if is an "or", and the MAB check works as an "and", it will have to be replicated for all "or" checks, thus twice in the above if. The first check in the above "or" is for new entries, while the second part of the "or" is to allow roaming to a locked port.
__set_bit(BR_FDB_ENTRY_LOCKED, &flags);
br_fdb_update(br, p, eth_hdr(skb)->h_source, vid, flags);>
}
} goto drop;
}
}
nbp_switchdev_frame_mark(p, skb);
@@ -185,6 +196,9 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb if (test_bit(BR_FDB_LOCAL, &dst->flags)) return br_pass_frame_up(skb);
if (test_bit(BR_FDB_BLACKHOLE, &dst->flags))
goto drop;
Not happy about adding a new test in arguably the most used fast-path, but I don't see a better way to do blackhole right now. Could you please make it an unlikely() ?
Surely.
I guess the blackhole flag will be allowed for user-space to set at some point, why not do it from the start?
I guess it should be so. :-)
Actually adding a BR_FDB_LOCAL and BR_FDB_BLACKHOLE would be a conflict above - the packet will be received. So you should move the blackhole check above the BR_FDB_LOCAL one if user-space is allowed to set it to any entry.
if (now != dst->used) dst->used = now; br_forward(dst->dst, skb, local_rcv, false);
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c index 5aeb3646e74c..34483420c9e4 100644 --- a/net/bridge/br_netlink.c +++ b/net/bridge/br_netlink.c @@ -188,6 +188,7 @@ static inline size_t br_port_info_size(void) + nla_total_size(1) /* IFLA_BRPORT_NEIGH_SUPPRESS */ + nla_total_size(1) /* IFLA_BRPORT_ISOLATED */ + nla_total_size(1) /* IFLA_BRPORT_LOCKED */
+ nla_total_size(1) /* IFLA_BRPORT_MAB */
- nla_total_size(sizeof(struct ifla_bridge_id)) /*
IFLA_BRPORT_ROOT_ID */ + nla_total_size(sizeof(struct ifla_bridge_id)) /* IFLA_BRPORT_BRIDGE_ID */ + nla_total_size(sizeof(u16)) /* IFLA_BRPORT_DESIGNATED_PORT */ @@ -274,7 +275,8 @@ static int br_port_fill_attrs(struct sk_buff *skb, nla_put_u8(skb, IFLA_BRPORT_MRP_IN_OPEN, !!(p->flags & BR_MRP_LOST_IN_CONT)) || nla_put_u8(skb, IFLA_BRPORT_ISOLATED, !!(p->flags & BR_ISOLATED)) ||
nla_put_u8(skb, IFLA_BRPORT_LOCKED, !!(p->flags &
BR_PORT_LOCKED)))
nla_put_u8(skb, IFLA_BRPORT_LOCKED, !!(p->flags &
BR_PORT_LOCKED)) ||
nla_put_u8(skb, IFLA_BRPORT_MAB, !!(p->flags & BR_PORT_MAB)))
return -EMSGSIZE;
timerval = br_timer_value(&p->message_age_timer);
@@ -876,6 +878,7 @@ static const struct nla_policy br_port_policy[IFLA_BRPORT_MAX + 1] = { [IFLA_BRPORT_NEIGH_SUPPRESS] = { .type = NLA_U8 }, [IFLA_BRPORT_ISOLATED] = { .type = NLA_U8 }, [IFLA_BRPORT_LOCKED] = { .type = NLA_U8 },
- [IFLA_BRPORT_MAB] = { .type = NLA_U8 }, [IFLA_BRPORT_BACKUP_PORT] = { .type = NLA_U32 }, [IFLA_BRPORT_MCAST_EHT_HOSTS_LIMIT] = { .type = NLA_U32 },
}; @@ -943,6 +946,10 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[], br_set_port_flag(p, tb, IFLA_BRPORT_NEIGH_SUPPRESS, BR_NEIGH_SUPPRESS); br_set_port_flag(p, tb, IFLA_BRPORT_ISOLATED, BR_ISOLATED); br_set_port_flag(p, tb, IFLA_BRPORT_LOCKED, BR_PORT_LOCKED);
br_set_port_flag(p, tb, IFLA_BRPORT_MAB, BR_PORT_MAB);
if (!(p->flags & BR_PORT_LOCKED))
p->flags &= ~BR_PORT_MAB;
changed_mask = old_flags ^ p->flags;
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index 06e5f6faa431..048e4afbc5a0 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -251,7 +251,9 @@ enum { BR_FDB_ADDED_BY_EXT_LEARN, BR_FDB_OFFLOADED, BR_FDB_NOTIFY,
- BR_FDB_NOTIFY_INACTIVE
- BR_FDB_NOTIFY_INACTIVE,
- BR_FDB_ENTRY_LOCKED,
- BR_FDB_BLACKHOLE,
};
struct net_bridge_fdb_key {
Cheers, Nik
On 2022-08-27 13:30, Nikolay Aleksandrov wrote:
@@ -879,6 +888,10 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source, &fdb->flags))) clear_bit(BR_FDB_ADDED_BY_EXT_LEARN, &fdb->flags);
if (source->flags & BR_PORT_MAB)
set_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags);
else
clear_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags);
Please add a test for that bit and only then change it.
Something like this?
@@ -749,6 +756,12 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source, &fdb->flags)))
clear_bit(BR_FDB_ADDED_BY_EXT_LEARN, &fdb->flags); + if (unlikely(test_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags))) { + if (!(source->flags & BR_PORT_MAB)) + clear_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags); + } else + if (source->flags & BR_PORT_MAB) + set_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags); }
if (unlikely(test_bit(BR_FDB_ADDED_BY_USER, &flags)))
On 2022-08-27 13:30, Nikolay Aleksandrov wrote:
@@ -879,6 +888,10 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source, &fdb->flags))) clear_bit(BR_FDB_ADDED_BY_EXT_LEARN, &fdb->flags);
if (source->flags & BR_PORT_MAB)
set_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags);
else
clear_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags);
Please add a test for that bit and only then change it.
Okay, I have revised this part now. I hope that it is suitable?
@@ -749,6 +756,10 @@ void br_fdb_update(struct net_bridge *br, struct net_bridge_port *source, &fdb->flags)))
clear_bit(BR_FDB_ADDED_BY_EXT_LEARN, &fdb->flags); + /* Allow roaming from an unauthorized port to an + * authorized port */ + if (unlikely(test_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags))) + clear_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags); }
if (unlikely(test_bit(BR_FDB_ADDED_BY_USER, &flags)))
On 2022-08-27 13:30, Nikolay Aleksandrov wrote:
On 26/08/2022 14:45, Hans Schultz wrote:
Hi, Please add the blackhole flag in a separate patch. A few more comments and questions below..
Hi, if userspace is to set this flag I think I need to change stuff in rtnetlink.c, as I will need to extent struct ndmsg with a new u32 entry as the old u8 flags is full. Maybe this is straight forward, but I am not so sure as I don't know that code too well. Maybe someone can give me a hint...?
On Mon, Aug 29, 2022 at 04:02:46PM +0200, netdev@kapio-technology.com wrote:
On 2022-08-27 13:30, Nikolay Aleksandrov wrote:
On 26/08/2022 14:45, Hans Schultz wrote:
Hi, Please add the blackhole flag in a separate patch. A few more comments and questions below..
Hi, if userspace is to set this flag I think I need to change stuff in rtnetlink.c, as I will need to extent struct ndmsg with a new u32 entry as the old u8 flags is full.
You cannot extend 'struct ndmsg'. That's why 'NDA_FLAGS_EXT' was introduced. See: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
'NTF_EXT_BLACKHOLE' belongs in 'NDA_FLAGS_EXT' like you have it now, but the kernel should not reject it in br_fdb_add().
Maybe this is straight forward, but I am not so sure as I don't know that code too well. Maybe someone can give me a hint...?
On 2022-08-29 18:12, Ido Schimmel wrote:
On Mon, Aug 29, 2022 at 04:02:46PM +0200, netdev@kapio-technology.com wrote:
On 2022-08-27 13:30, Nikolay Aleksandrov wrote:
On 26/08/2022 14:45, Hans Schultz wrote:
Hi, Please add the blackhole flag in a separate patch. A few more comments and questions below..
Hi, if userspace is to set this flag I think I need to change stuff in rtnetlink.c, as I will need to extent struct ndmsg with a new u32 entry as the old u8 flags is full.
You cannot extend 'struct ndmsg'. That's why 'NDA_FLAGS_EXT' was introduced. See: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
'NTF_EXT_BLACKHOLE' belongs in 'NDA_FLAGS_EXT' like you have it now, but the kernel should not reject it in br_fdb_add().
Maybe this is straight forward, but I am not so sure as I don't know that code too well. Maybe someone can give me a hint...?
Thanks! I see that I was in trouble there... :D
On 2022-08-29 18:12, Ido Schimmel wrote:
On Mon, Aug 29, 2022 at 04:02:46PM +0200, netdev@kapio-technology.com wrote:
On 2022-08-27 13:30, Nikolay Aleksandrov wrote:
On 26/08/2022 14:45, Hans Schultz wrote:
Hi, Please add the blackhole flag in a separate patch. A few more comments and questions below..
Hi, if userspace is to set this flag I think I need to change stuff in rtnetlink.c, as I will need to extent struct ndmsg with a new u32 entry as the old u8 flags is full.
You cannot extend 'struct ndmsg'. That's why 'NDA_FLAGS_EXT' was introduced. See: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
'NTF_EXT_BLACKHOLE' belongs in 'NDA_FLAGS_EXT' like you have it now, but the kernel should not reject it in br_fdb_add().
Maybe this is straight forward, but I am not so sure as I don't know that code too well. Maybe someone can give me a hint...?
The question I have is if I can use nlh_flags to send the extended flags further on to fdb_add_entry(), where I expect to need it? (the extended flags are u32, while nlh_flags are u16)
On Tue, Aug 30, 2022 at 04:19:16PM +0200, netdev@kapio-technology.com wrote:
On 2022-08-29 18:12, Ido Schimmel wrote:
On Mon, Aug 29, 2022 at 04:02:46PM +0200, netdev@kapio-technology.com wrote:
On 2022-08-27 13:30, Nikolay Aleksandrov wrote:
On 26/08/2022 14:45, Hans Schultz wrote:
Hi, Please add the blackhole flag in a separate patch. A few more comments and questions below..
Hi, if userspace is to set this flag I think I need to change stuff in rtnetlink.c, as I will need to extent struct ndmsg with a new u32 entry as the old u8 flags is full.
You cannot extend 'struct ndmsg'. That's why 'NDA_FLAGS_EXT' was introduced. See: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?i...
'NTF_EXT_BLACKHOLE' belongs in 'NDA_FLAGS_EXT' like you have it now, but the kernel should not reject it in br_fdb_add().
Maybe this is straight forward, but I am not so sure as I don't know that code too well. Maybe someone can give me a hint...?
The question I have is if I can use nlh_flags to send the extended flags further on to fdb_add_entry(), where I expect to need it?
A separate argument looks cleaner to me.
(the extended flags are u32, while nlh_flags are u16)
On Fri, Aug 26, 2022 at 01:45:33PM +0200, Hans Schultz wrote:
diff --git a/include/uapi/linux/neighbour.h b/include/uapi/linux/neighbour.h index a998bf761635..bc1440a56b70 100644 --- a/include/uapi/linux/neighbour.h +++ b/include/uapi/linux/neighbour.h @@ -52,7 +52,9 @@ enum { #define NTF_STICKY (1 << 6) #define NTF_ROUTER (1 << 7) /* Extended flags under NDA_FLAGS_EXT: */ -#define NTF_EXT_MANAGED (1 << 0) +#define NTF_EXT_MANAGED (1 << 0) +#define NTF_EXT_LOCKED (1 << 1) +#define NTF_EXT_BLACKHOLE (1 << 2)
A few lines below in the file there is a comment explaining NTF_EXT_MANAGED. Please document NTF_EXT_LOCKED and NTF_EXT_BLACKHOLE as well.
/*
- Neighbor Cache Entry States.
[...]
@@ -1082,6 +1095,16 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source, modified = true; }
- if (test_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags)) {
clear_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags);
modified = true;
- }
Should be able to use test_and_clear_bit():
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c index e7f4fccb6adb..e5561ee2bfac 100644 --- a/net/bridge/br_fdb.c +++ b/net/bridge/br_fdb.c @@ -1082,6 +1082,9 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source, modified = true; }
+ if (test_and_clear_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags)) + modified = true; + if (fdb_handle_notify(fdb, notify)) modified = true;
- if (test_bit(BR_FDB_BLACKHOLE, &fdb->flags)) {
clear_bit(BR_FDB_BLACKHOLE, &fdb->flags);
modified = true;
- }
This will need to change to allow user space to set the flag.
- if (fdb_handle_notify(fdb, notify)) modified = true;
@@ -1178,6 +1201,12 @@ int br_fdb_add(struct ndmsg *ndm, struct nlattr *tb[], vg = nbp_vlan_group(p); }
- if (tb[NDA_FLAGS_EXT] &&
(nla_get_u32(tb[NDA_FLAGS_EXT]) & (NTF_EXT_LOCKED | NTF_EXT_BLACKHOLE))) {
pr_info("bridge: RTM_NEWNEIGH has invalid extended flags\n");
return -EINVAL;
- }
- if (tb[NDA_FDB_EXT_ATTRS]) { attr = tb[NDA_FDB_EXT_ATTRS]; err = nla_parse_nested(nfea_tb, NFEA_MAX, attr,
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c index 68b3e850bcb9..3d48aa7fa778 100644 --- a/net/bridge/br_input.c +++ b/net/bridge/br_input.c @@ -110,8 +110,19 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb br_fdb_find_rcu(br, eth_hdr(skb)->h_source, vid); if (!fdb_src || READ_ONCE(fdb_src->dst) != p ||
test_bit(BR_FDB_LOCAL, &fdb_src->flags))
test_bit(BR_FDB_LOCAL, &fdb_src->flags) ||
test_bit(BR_FDB_ENTRY_LOCKED, &fdb_src->flags)) {
if (!fdb_src || (READ_ONCE(fdb_src->dst) != p &&
(p->flags & BR_LEARNING))) {
It looks like you are allowing a locked port to:
1. Overtake a local entry. Actually, it will be rejected by br_fdb_update() with a rate limited error message, but best to avoid it.
2. Overtake an entry pointing to an unlocked port. There is no reason for an authorized port to lose communication because an unauthorized port decided to spoof its MAC.
unsigned long flags = 0;
if (p->flags & BR_PORT_MAB) {
__set_bit(BR_FDB_ENTRY_LOCKED, &flags);
br_fdb_update(br, p, eth_hdr(skb)->h_source, vid, flags);
}
} goto drop;
}}
How about the below (untested):
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c index 68b3e850bcb9..9143a94a1c57 100644 --- a/net/bridge/br_input.c +++ b/net/bridge/br_input.c @@ -109,9 +109,18 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb struct net_bridge_fdb_entry *fdb_src = br_fdb_find_rcu(br, eth_hdr(skb)->h_source, vid);
- if (!fdb_src || READ_ONCE(fdb_src->dst) != p || - test_bit(BR_FDB_LOCAL, &fdb_src->flags)) + if (!fdb_src) { + if (p->flags & BR_PORT_MAB) { + __set_bit(BR_FDB_ENTRY_LOCKED, &flags); + br_fdb_update(br, p, eth_hdr(skb)->h_source, + vid, flags); + } + goto drop; + } else if (READ_ONCE(fdb_src->dst) != p || + test_bit(BR_FDB_LOCAL, &fdb_src->flags) || + test_bit(BR_FDB_LOCKED, &fdb_src->flags)) { goto drop; + } }
The semantics are very clear, IMO. On FDB miss, add a locked FDB entry and drop the packet. On FDB mismatch, drop the packet.
Entry can roam from an unauthorized port to an authorized port, but not the other way around. Not sure what is the use case for allowing roaming between unauthorized ports.
Note that with the above, locked entries are not refreshed and will therefore age out unless replaced by user space.
nbp_switchdev_frame_mark(p, skb); @@ -943,6 +946,10 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[], br_set_port_flag(p, tb, IFLA_BRPORT_NEIGH_SUPPRESS, BR_NEIGH_SUPPRESS); br_set_port_flag(p, tb, IFLA_BRPORT_ISOLATED, BR_ISOLATED); br_set_port_flag(p, tb, IFLA_BRPORT_LOCKED, BR_PORT_LOCKED);
- br_set_port_flag(p, tb, IFLA_BRPORT_MAB, BR_PORT_MAB);
- if (!(p->flags & BR_PORT_LOCKED))
p->flags &= ~BR_PORT_MAB;
Any reason not to emit an error if MAB is enabled while the port is unlocked? Something like this (untested):
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c index 5aeb3646e74c..18353a4c29e1 100644 --- a/net/bridge/br_netlink.c +++ b/net/bridge/br_netlink.c @@ -944,6 +944,12 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[], br_set_port_flag(p, tb, IFLA_BRPORT_ISOLATED, BR_ISOLATED); br_set_port_flag(p, tb, IFLA_BRPORT_LOCKED, BR_PORT_LOCKED);
+ if (!(p->flags & BR_PORT_LOCKED) && (p->flags & BR_PORT_MAB)) { + NL_SET_ERR_MSG(extack, "MAB cannot be enabled when port is unlocked"); + p->flags = old_flags; + return -EINVAL; + } + changed_mask = old_flags ^ p->flags;
err = br_switchdev_set_port_flag(p, p->flags, changed_mask, extack);
changed_mask = old_flags ^ p->flags; diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index 06e5f6faa431..048e4afbc5a0 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -251,7 +251,9 @@ enum { BR_FDB_ADDED_BY_EXT_LEARN, BR_FDB_OFFLOADED, BR_FDB_NOTIFY,
- BR_FDB_NOTIFY_INACTIVE
- BR_FDB_NOTIFY_INACTIVE,
- BR_FDB_ENTRY_LOCKED,
- BR_FDB_BLACKHOLE,
}; struct net_bridge_fdb_key { -- 2.30.2
On 2022-08-27 17:19, Ido Schimmel wrote:
On Fri, Aug 26, 2022 at 01:45:33PM +0200, Hans Schultz wrote:
nbp_switchdev_frame_mark(p, skb); @@ -943,6 +946,10 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[], br_set_port_flag(p, tb, IFLA_BRPORT_NEIGH_SUPPRESS, BR_NEIGH_SUPPRESS); br_set_port_flag(p, tb, IFLA_BRPORT_ISOLATED, BR_ISOLATED); br_set_port_flag(p, tb, IFLA_BRPORT_LOCKED, BR_PORT_LOCKED);
- br_set_port_flag(p, tb, IFLA_BRPORT_MAB, BR_PORT_MAB);
- if (!(p->flags & BR_PORT_LOCKED))
p->flags &= ~BR_PORT_MAB;
The reason for this is that I wanted it to be so that if you have MAB enabled (and locked of course) and unlock the port, it will automatically clear both flags instead of having to first disable MAB and then unlock the port.
Any reason not to emit an error if MAB is enabled while the port is unlocked? Something like this (untested):
diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c index 5aeb3646e74c..18353a4c29e1 100644 --- a/net/bridge/br_netlink.c +++ b/net/bridge/br_netlink.c @@ -944,6 +944,12 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[], br_set_port_flag(p, tb, IFLA_BRPORT_ISOLATED, BR_ISOLATED); br_set_port_flag(p, tb, IFLA_BRPORT_LOCKED, BR_PORT_LOCKED);
if (!(p->flags & BR_PORT_LOCKED) && (p->flags & BR_PORT_MAB)) {
NL_SET_ERR_MSG(extack, "MAB cannot be enabled when
port is unlocked");
p->flags = old_flags;
return -EINVAL;
}
changed_mask = old_flags ^ p->flags; err = br_switchdev_set_port_flag(p, p->flags, changed_mask,
extack);
On Sun, Aug 28, 2022 at 12:23:30PM +0200, netdev@kapio-technology.com wrote:
On 2022-08-27 17:19, Ido Schimmel wrote:
On Fri, Aug 26, 2022 at 01:45:33PM +0200, Hans Schultz wrote:
nbp_switchdev_frame_mark(p, skb); @@ -943,6 +946,10 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[], br_set_port_flag(p, tb, IFLA_BRPORT_NEIGH_SUPPRESS, BR_NEIGH_SUPPRESS); br_set_port_flag(p, tb, IFLA_BRPORT_ISOLATED, BR_ISOLATED); br_set_port_flag(p, tb, IFLA_BRPORT_LOCKED, BR_PORT_LOCKED);
- br_set_port_flag(p, tb, IFLA_BRPORT_MAB, BR_PORT_MAB);
- if (!(p->flags & BR_PORT_LOCKED))
p->flags &= ~BR_PORT_MAB;
The reason for this is that I wanted it to be so that if you have MAB enabled (and locked of course) and unlock the port, it will automatically clear both flags instead of having to first disable MAB and then unlock the port.
User space can just do:
# bridge link set dev swp1 locked off mab off
I prefer not to push such logic into the kernel and instead fail explicitly. I won't argue if more people are in favor.
On 2022-08-29 09:52, Ido Schimmel wrote:
On Sun, Aug 28, 2022 at 12:23:30PM +0200, netdev@kapio-technology.com wrote:
On 2022-08-27 17:19, Ido Schimmel wrote:
On Fri, Aug 26, 2022 at 01:45:33PM +0200, Hans Schultz wrote:
nbp_switchdev_frame_mark(p, skb); @@ -943,6 +946,10 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[], br_set_port_flag(p, tb, IFLA_BRPORT_NEIGH_SUPPRESS, BR_NEIGH_SUPPRESS); br_set_port_flag(p, tb, IFLA_BRPORT_ISOLATED, BR_ISOLATED); br_set_port_flag(p, tb, IFLA_BRPORT_LOCKED, BR_PORT_LOCKED);
- br_set_port_flag(p, tb, IFLA_BRPORT_MAB, BR_PORT_MAB);
- if (!(p->flags & BR_PORT_LOCKED))
p->flags &= ~BR_PORT_MAB;
The reason for this is that I wanted it to be so that if you have MAB enabled (and locked of course) and unlock the port, it will automatically clear both flags instead of having to first disable MAB and then unlock the port.
User space can just do:
# bridge link set dev swp1 locked off mab off
I prefer not to push such logic into the kernel and instead fail explicitly. I won't argue if more people are in favor.
I shall do it as you suggest. It sounds fair. :-)
On 29/08/2022 10:52, Ido Schimmel wrote:
On Sun, Aug 28, 2022 at 12:23:30PM +0200, netdev@kapio-technology.com wrote:
On 2022-08-27 17:19, Ido Schimmel wrote:
On Fri, Aug 26, 2022 at 01:45:33PM +0200, Hans Schultz wrote:
nbp_switchdev_frame_mark(p, skb); @@ -943,6 +946,10 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[], br_set_port_flag(p, tb, IFLA_BRPORT_NEIGH_SUPPRESS, BR_NEIGH_SUPPRESS); br_set_port_flag(p, tb, IFLA_BRPORT_ISOLATED, BR_ISOLATED); br_set_port_flag(p, tb, IFLA_BRPORT_LOCKED, BR_PORT_LOCKED);
- br_set_port_flag(p, tb, IFLA_BRPORT_MAB, BR_PORT_MAB);
- if (!(p->flags & BR_PORT_LOCKED))
p->flags &= ~BR_PORT_MAB;
The reason for this is that I wanted it to be so that if you have MAB enabled (and locked of course) and unlock the port, it will automatically clear both flags instead of having to first disable MAB and then unlock the port.
User space can just do:
# bridge link set dev swp1 locked off mab off
I prefer not to push such logic into the kernel and instead fail explicitly. I won't argue if more people are in favor.
+1
I prefer to fail explicitly too, actually I also had a comment about this but somehow have managed to delete it before sending my review. :)
On 2022-08-27 17:19, Ido Schimmel wrote:
On Fri, Aug 26, 2022 at 01:45:33PM +0200, Hans Schultz wrote: How about the below (untested):
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c index 68b3e850bcb9..9143a94a1c57 100644 --- a/net/bridge/br_input.c +++ b/net/bridge/br_input.c @@ -109,9 +109,18 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb struct net_bridge_fdb_entry *fdb_src = br_fdb_find_rcu(br, eth_hdr(skb)->h_source, vid);
if (!fdb_src || READ_ONCE(fdb_src->dst) != p ||
test_bit(BR_FDB_LOCAL, &fdb_src->flags))
if (!fdb_src) {
if (p->flags & BR_PORT_MAB) {
__set_bit(BR_FDB_ENTRY_LOCKED, &flags);
br_fdb_update(br, p,
eth_hdr(skb)->h_source,
vid, flags);
}
goto drop;
} else if (READ_ONCE(fdb_src->dst) != p ||
test_bit(BR_FDB_LOCAL, &fdb_src->flags) ||
test_bit(BR_FDB_LOCKED, &fdb_src->flags)) { goto drop;
} }
The semantics are very clear, IMO. On FDB miss, add a locked FDB entry and drop the packet. On FDB mismatch, drop the packet.
Entry can roam from an unauthorized port to an authorized port, but not the other way around. Not sure what is the use case for allowing roaming between unauthorized ports.
Note that with the above, locked entries are not refreshed and will therefore age out unless replaced by user space.
Okay I was under the impression that entries should be able to roam freely between authorized and unauthorized ports in the bridge as long as the locked flag is on when roaming to the MAB enabled port. As you know roaming is not a big issue with mv88e6xxx.
As I see this code, an entry cannot roam to an authorized port as there is no update after the port mismatch check and the packet is dropped as it should in this case in the locked section.
On 2022-08-27 17:19, Ido Schimmel wrote:
On Fri, Aug 26, 2022 at 01:45:33PM +0200, Hans Schultz wrote:
How about this?
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c index 1064a5b2d478..82bb50851716 100644 --- a/net/bridge/br_input.c +++ b/net/bridge/br_input.c @@ -103,8 +103,19 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb br_fdb_find_rcu(br, eth_hdr(skb)->h_source, vid);
if (!fdb_src || READ_ONCE(fdb_src->dst) != p || - test_bit(BR_FDB_LOCAL, &fdb_src->flags)) + test_bit(BR_FDB_LOCAL, &fdb_src->flags) || + test_bit(BR_FDB_ENTRY_LOCKED, &fdb_src->flags)) { + if (!fdb_src || ((READ_ONCE(fdb_src->dst) != p) && + (!unlikely(test_bit(BR_FDB_LOCAL, &fdb_src->flags))))) { + unsigned long flags = 0; + + if (p->flags & BR_PORT_MAB) { + __set_bit(BR_FDB_ENTRY_LOCKED, &flags); + br_fdb_update(br, p, eth_hdr(skb)->h_source, vid, flags); + } + } goto drop; + } }
nbp_switchdev_frame_mark(p, skb);
It will allow roaming to a MAB enabled port (no roaming to a simply locked port should be allowed of course), and it will not change a local entry and not rely on 'learning on' on the locked port of course. Roaming to an unlocked port will also be allowed, and the locked flag will be removed in this case according to code in br_fdb_update().
On 2022-08-27 17:19, Ido Schimmel wrote:
How about the below (untested):
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c index 68b3e850bcb9..9143a94a1c57 100644 --- a/net/bridge/br_input.c +++ b/net/bridge/br_input.c @@ -109,9 +109,18 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb struct net_bridge_fdb_entry *fdb_src = br_fdb_find_rcu(br, eth_hdr(skb)->h_source, vid);
if (!fdb_src || READ_ONCE(fdb_src->dst) != p ||
test_bit(BR_FDB_LOCAL, &fdb_src->flags))
if (!fdb_src) {
if (p->flags & BR_PORT_MAB) {
__set_bit(BR_FDB_ENTRY_LOCKED, &flags);
br_fdb_update(br, p,
eth_hdr(skb)->h_source,
vid, flags);
}
goto drop;
} else if (READ_ONCE(fdb_src->dst) != p ||
test_bit(BR_FDB_LOCAL, &fdb_src->flags) ||
test_bit(BR_FDB_LOCKED, &fdb_src->flags)) { goto drop;
} }
The semantics are very clear, IMO. On FDB miss, add a locked FDB entry and drop the packet. On FDB mismatch, drop the packet.
Entry can roam from an unauthorized port to an authorized port, but not the other way around. Not sure what is the use case for allowing roaming between unauthorized ports.
Note that with the above, locked entries are not refreshed and will therefore age out unless replaced by user space.
Okay, I got the semantics (locked/unlocked vs unauthorized/authorized) reversed, so I will go with your suggestion.
Used for MacAuth/MAB feature in the offloaded case. Send the fdb locked flag down through the switchdev and DSA layers.
Also add support for drivers to set additional flags on bridge FDB entries including the new 'blackhole' flag.
And add support for offloading of the MAB/MacAuth feature flag.
Signed-off-by: Hans Schultz netdev@kapio-technology.com --- include/net/switchdev.h | 3 +++ net/bridge/br.c | 5 ++++- net/bridge/br_fdb.c | 14 ++++++++++++-- net/bridge/br_private.h | 3 ++- net/bridge/br_switchdev.c | 5 +++-- net/dsa/dsa_priv.h | 4 +++- net/dsa/port.c | 7 ++++--- net/dsa/slave.c | 4 +++- net/dsa/switch.c | 6 +++--- 9 files changed, 37 insertions(+), 14 deletions(-)
diff --git a/include/net/switchdev.h b/include/net/switchdev.h index 7dcdc97c0bc3..437945179373 100644 --- a/include/net/switchdev.h +++ b/include/net/switchdev.h @@ -247,7 +247,10 @@ struct switchdev_notifier_fdb_info { const unsigned char *addr; u16 vid; u8 added_by_user:1, + sticky:1, is_local:1, + locked:1, + blackhole:1, offloaded:1; };
diff --git a/net/bridge/br.c b/net/bridge/br.c index 96e91d69a9a8..3671c0a12b7d 100644 --- a/net/bridge/br.c +++ b/net/bridge/br.c @@ -166,7 +166,10 @@ static int br_switchdev_event(struct notifier_block *unused, case SWITCHDEV_FDB_ADD_TO_BRIDGE: fdb_info = ptr; err = br_fdb_external_learn_add(br, p, fdb_info->addr, - fdb_info->vid, false); + fdb_info->vid, false, + fdb_info->locked, + fdb_info->blackhole, + fdb_info->sticky); if (err) { err = notifier_from_errno(err); break; diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c index 1962d9594a48..30126af9c42f 100644 --- a/net/bridge/br_fdb.c +++ b/net/bridge/br_fdb.c @@ -1147,7 +1147,7 @@ static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br, "FDB entry towards bridge must be permanent"); return -EINVAL; } - err = br_fdb_external_learn_add(br, p, addr, vid, true); + err = br_fdb_external_learn_add(br, p, addr, vid, true, false, false, false); } else { spin_lock_bh(&br->hash_lock); err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, nfea_tb); @@ -1383,7 +1383,8 @@ void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p)
int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, const unsigned char *addr, u16 vid, - bool swdev_notify) + bool swdev_notify, bool locked, + bool blackhole, bool sticky) { struct net_bridge_fdb_entry *fdb; bool modified = false; @@ -1403,6 +1404,15 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, if (!p) flags |= BIT(BR_FDB_LOCAL);
+ if (locked) + flags |= BIT(BR_FDB_ENTRY_LOCKED); + + if (blackhole) + flags |= BIT(BR_FDB_BLACKHOLE); + + if (sticky) + flags |= BIT(BR_FDB_STICKY); + fdb = fdb_create(br, p, addr, vid, flags); if (!fdb) { err = -ENOMEM; diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index 048e4afbc5a0..2836fec2dafb 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -812,7 +812,8 @@ int br_fdb_sync_static(struct net_bridge *br, struct net_bridge_port *p); void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p); int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, const unsigned char *addr, u16 vid, - bool swdev_notify); + bool swdev_notify, bool locked, + bool blackhole, bool sticky); int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p, const unsigned char *addr, u16 vid, bool swdev_notify); diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c index 8f3d76c751dd..9aaefbc07e02 100644 --- a/net/bridge/br_switchdev.c +++ b/net/bridge/br_switchdev.c @@ -71,8 +71,8 @@ bool nbp_switchdev_allowed_egress(const struct net_bridge_port *p, }
/* Flags that can be offloaded to hardware */ -#define BR_PORT_FLAGS_HW_OFFLOAD (BR_LEARNING | BR_FLOOD | \ - BR_MCAST_FLOOD | BR_BCAST_FLOOD | BR_PORT_LOCKED | \ +#define BR_PORT_FLAGS_HW_OFFLOAD (BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD | \ + BR_PORT_LOCKED | BR_PORT_MAB | \ BR_HAIRPIN_MODE | BR_ISOLATED | BR_MULTICAST_TO_UNICAST)
int br_switchdev_set_port_flag(struct net_bridge_port *p, @@ -136,6 +136,7 @@ static void br_switchdev_fdb_populate(struct net_bridge *br, item->added_by_user = test_bit(BR_FDB_ADDED_BY_USER, &fdb->flags); item->offloaded = test_bit(BR_FDB_OFFLOADED, &fdb->flags); item->is_local = test_bit(BR_FDB_LOCAL, &fdb->flags); + item->locked = test_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags); item->info.dev = (!p || item->is_local) ? br->dev : p->dev; item->info.ctx = ctx; } diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h index 614fbba8fe39..b05685418b20 100644 --- a/net/dsa/dsa_priv.h +++ b/net/dsa/dsa_priv.h @@ -65,6 +65,7 @@ struct dsa_notifier_fdb_info { const struct dsa_port *dp; const unsigned char *addr; u16 vid; + bool is_locked; struct dsa_db db; };
@@ -131,6 +132,7 @@ struct dsa_switchdev_event_work { unsigned char addr[ETH_ALEN]; u16 vid; bool host_addr; + bool is_locked; };
enum dsa_standalone_event { @@ -232,7 +234,7 @@ int dsa_port_vlan_msti(struct dsa_port *dp, const struct switchdev_vlan_msti *msti); int dsa_port_mtu_change(struct dsa_port *dp, int new_mtu); int dsa_port_fdb_add(struct dsa_port *dp, const unsigned char *addr, - u16 vid); + u16 vid, bool is_locked); int dsa_port_fdb_del(struct dsa_port *dp, const unsigned char *addr, u16 vid); int dsa_port_standalone_host_fdb_add(struct dsa_port *dp, diff --git a/net/dsa/port.c b/net/dsa/port.c index 7afc35db0c29..3399cc1f7be0 100644 --- a/net/dsa/port.c +++ b/net/dsa/port.c @@ -303,7 +303,7 @@ static int dsa_port_inherit_brport_flags(struct dsa_port *dp, struct netlink_ext_ack *extack) { const unsigned long mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD | - BR_BCAST_FLOOD | BR_PORT_LOCKED; + BR_BCAST_FLOOD | BR_PORT_LOCKED | BR_PORT_MAB; struct net_device *brport_dev = dsa_port_to_bridge_port(dp); int flag, err;
@@ -327,7 +327,7 @@ static void dsa_port_clear_brport_flags(struct dsa_port *dp) { const unsigned long val = BR_FLOOD | BR_MCAST_FLOOD | BR_BCAST_FLOOD; const unsigned long mask = BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD | - BR_BCAST_FLOOD | BR_PORT_LOCKED; + BR_BCAST_FLOOD | BR_PORT_LOCKED | BR_PORT_MAB; int flag, err;
for_each_set_bit(flag, &mask, 32) { @@ -954,12 +954,13 @@ int dsa_port_mtu_change(struct dsa_port *dp, int new_mtu) }
int dsa_port_fdb_add(struct dsa_port *dp, const unsigned char *addr, - u16 vid) + u16 vid, bool is_locked) { struct dsa_notifier_fdb_info info = { .dp = dp, .addr = addr, .vid = vid, + .is_locked = is_locked, .db = { .type = DSA_DB_BRIDGE, .bridge = *dp->bridge, diff --git a/net/dsa/slave.c b/net/dsa/slave.c index 345106b1ed78..a6486d7d9c7c 100644 --- a/net/dsa/slave.c +++ b/net/dsa/slave.c @@ -2908,6 +2908,7 @@ static void dsa_slave_switchdev_event_work(struct work_struct *work) container_of(work, struct dsa_switchdev_event_work, work); const unsigned char *addr = switchdev_work->addr; struct net_device *dev = switchdev_work->dev; + bool is_locked = switchdev_work->is_locked; u16 vid = switchdev_work->vid; struct dsa_switch *ds; struct dsa_port *dp; @@ -2923,7 +2924,7 @@ static void dsa_slave_switchdev_event_work(struct work_struct *work) else if (dp->lag) err = dsa_port_lag_fdb_add(dp, addr, vid); else - err = dsa_port_fdb_add(dp, addr, vid); + err = dsa_port_fdb_add(dp, addr, vid, is_locked); if (err) { dev_err(ds->dev, "port %d failed to add %pM vid %d to fdb: %d\n", @@ -3031,6 +3032,7 @@ static int dsa_slave_fdb_event(struct net_device *dev, ether_addr_copy(switchdev_work->addr, fdb_info->addr); switchdev_work->vid = fdb_info->vid; switchdev_work->host_addr = host_addr; + switchdev_work->is_locked = fdb_info->locked;
dsa_schedule_work(&switchdev_work->work);
diff --git a/net/dsa/switch.c b/net/dsa/switch.c index 4dfd68cf61c5..f29604f25c67 100644 --- a/net/dsa/switch.c +++ b/net/dsa/switch.c @@ -234,7 +234,7 @@ static int dsa_port_do_mdb_del(struct dsa_port *dp, }
static int dsa_port_do_fdb_add(struct dsa_port *dp, const unsigned char *addr, - u16 vid, struct dsa_db db) + u16 vid, bool is_locked, struct dsa_db db) { struct dsa_switch *ds = dp->ds; struct dsa_mac_addr *a; @@ -399,7 +399,7 @@ static int dsa_switch_host_fdb_add(struct dsa_switch *ds, dsa_switch_for_each_port(dp, ds) { if (dsa_port_host_address_match(dp, info->dp)) { err = dsa_port_do_fdb_add(dp, info->addr, info->vid, - info->db); + false, info->db); if (err) break; } @@ -438,7 +438,7 @@ static int dsa_switch_fdb_add(struct dsa_switch *ds, if (!ds->ops->port_fdb_add) return -EOPNOTSUPP;
- return dsa_port_do_fdb_add(dp, info->addr, info->vid, info->db); + return dsa_port_do_fdb_add(dp, info->addr, info->vid, info->is_locked, info->db); }
static int dsa_switch_fdb_del(struct dsa_switch *ds,
On Fri, Aug 26, 2022 at 01:45:34PM +0200, Hans Schultz wrote:
diff --git a/include/net/switchdev.h b/include/net/switchdev.h index 7dcdc97c0bc3..437945179373 100644 --- a/include/net/switchdev.h +++ b/include/net/switchdev.h @@ -247,7 +247,10 @@ struct switchdev_notifier_fdb_info { const unsigned char *addr; u16 vid; u8 added_by_user:1,
sticky:1,
If mv88e6xxx reports entries with 'is_local=1, locked=1, blackhole=1', then the 'sticky' bit can be removed for now (we will need it some day to support sticky entries notified from the bridge). This takes care of the discrepancy Nik mentioned here:
https://lore.kernel.org/netdev/d1de0337-ae16-7dca-b212-1a4e85129c31@blackwal...
is_local:1,
locked:1,
blackhole:1, offloaded:1;
};
On 27/08/2022 18:46, Ido Schimmel wrote:
On Fri, Aug 26, 2022 at 01:45:34PM +0200, Hans Schultz wrote:
diff --git a/include/net/switchdev.h b/include/net/switchdev.h index 7dcdc97c0bc3..437945179373 100644 --- a/include/net/switchdev.h +++ b/include/net/switchdev.h @@ -247,7 +247,10 @@ struct switchdev_notifier_fdb_info { const unsigned char *addr; u16 vid; u8 added_by_user:1,
sticky:1,
If mv88e6xxx reports entries with 'is_local=1, locked=1, blackhole=1', then the 'sticky' bit can be removed for now (we will need it some day to support sticky entries notified from the bridge). This takes care of the discrepancy Nik mentioned here:
https://lore.kernel.org/netdev/d1de0337-ae16-7dca-b212-1a4e85129c31@blackwal...
+1
is_local:1,
locked:1,
blackhole:1, offloaded:1;
};
On 2022-08-27 17:46, Ido Schimmel wrote:
On Fri, Aug 26, 2022 at 01:45:34PM +0200, Hans Schultz wrote:
diff --git a/include/net/switchdev.h b/include/net/switchdev.h index 7dcdc97c0bc3..437945179373 100644 --- a/include/net/switchdev.h +++ b/include/net/switchdev.h @@ -247,7 +247,10 @@ struct switchdev_notifier_fdb_info { const unsigned char *addr; u16 vid; u8 added_by_user:1,
sticky:1,
If mv88e6xxx reports entries with 'is_local=1, locked=1, blackhole=1', then the 'sticky' bit can be removed for now (we will need it some day to support sticky entries notified from the bridge). This takes care of the discrepancy Nik mentioned here:
https://lore.kernel.org/netdev/d1de0337-ae16-7dca-b212-1a4e85129c31@blackwal...
is_local:1,
locked:1,
blackhole:1, offloaded:1;
};
Right!
On Fri, Aug 26, 2022 at 01:45:34PM +0200, Hans Schultz wrote:
@@ -1403,6 +1404,15 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, if (!p) flags |= BIT(BR_FDB_LOCAL);
if (locked)
flags |= BIT(BR_FDB_ENTRY_LOCKED);
if (blackhole)
flags |= BIT(BR_FDB_BLACKHOLE);
if (sticky)
flags |= BIT(BR_FDB_STICKY);
While reviewing the test cases it occurred to me that the else branch (FDB entry already exists) needs modifications as well. Something like:
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c index e7f4fccb6adb..48f842a71597 100644 --- a/net/bridge/br_fdb.c +++ b/net/bridge/br_fdb.c @@ -1397,6 +1397,21 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, modified = true; }
+ if (local != test_bit(BR_FDB_LOCAL, &fdb->flags)) { + change_bit(BR_FDB_LOCAL, &fdb->flags); + modified = true; + } + + if (locked != test_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags)) { + change_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags); + modified = true; + } + + if (blackhole != test_bit(BR_FDB_BLACKHOLE, &fdb->flags)) { + change_bit(BR_FDB_BLACKHOLE, &fdb->flags); + modified = true; + } + if (swdev_notify) set_bit(BR_FDB_ADDED_BY_USER, &fdb->flags);
fdb = fdb_create(br, p, addr, vid, flags); if (!fdb) { err = -ENOMEM;
Ignore locked fdb entries coming in on all drivers.
Signed-off-by: Hans Schultz netdev@kapio-technology.com --- drivers/net/dsa/b53/b53_common.c | 5 +++++ drivers/net/dsa/b53/b53_priv.h | 1 + drivers/net/dsa/hirschmann/hellcreek.c | 5 +++++ drivers/net/dsa/lan9303-core.c | 5 +++++ drivers/net/dsa/lantiq_gswip.c | 5 +++++ drivers/net/dsa/microchip/ksz_common.c | 5 +++++ drivers/net/dsa/mt7530.c | 5 +++++ drivers/net/dsa/mv88e6xxx/chip.c | 5 +++++ drivers/net/dsa/ocelot/felix.c | 5 +++++ drivers/net/dsa/qca/qca8k-common.c | 5 +++++ drivers/net/dsa/qca/qca8k.h | 1 + drivers/net/dsa/sja1105/sja1105_main.c | 7 ++++++- include/net/dsa.h | 1 + net/dsa/switch.c | 4 ++-- 14 files changed, 56 insertions(+), 3 deletions(-)
diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c index 48cf344750ff..8a3c8a286d8c 100644 --- a/drivers/net/dsa/b53/b53_common.c +++ b/drivers/net/dsa/b53/b53_common.c @@ -1684,11 +1684,16 @@ static int b53_arl_op(struct b53_device *dev, int op, int port,
int b53_fdb_add(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, + bool is_locked, struct dsa_db db) { struct b53_device *priv = ds->priv; int ret;
+ /* Ignore locked entries */ + if (is_locked) + return 0; + /* 5325 and 5365 require some more massaging, but could * be supported eventually */ diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h index 795cbffd5c2b..19501b76b9df 100644 --- a/drivers/net/dsa/b53/b53_priv.h +++ b/drivers/net/dsa/b53/b53_priv.h @@ -362,6 +362,7 @@ int b53_vlan_del(struct dsa_switch *ds, int port, const struct switchdev_obj_port_vlan *vlan); int b53_fdb_add(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, + bool is_locked, struct dsa_db db); int b53_fdb_del(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, diff --git a/drivers/net/dsa/hirschmann/hellcreek.c b/drivers/net/dsa/hirschmann/hellcreek.c index 01f90994dedd..d93fc198dcb8 100644 --- a/drivers/net/dsa/hirschmann/hellcreek.c +++ b/drivers/net/dsa/hirschmann/hellcreek.c @@ -829,12 +829,17 @@ static int hellcreek_fdb_get(struct hellcreek *hellcreek,
static int hellcreek_fdb_add(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, + bool is_locked, struct dsa_db db) { struct hellcreek_fdb_entry entry = { 0 }; struct hellcreek *hellcreek = ds->priv; int ret;
+ /* Ignore locked entries */ + if (is_locked) + return 0; + dev_dbg(hellcreek->dev, "Add FDB entry for MAC=%pM\n", addr);
mutex_lock(&hellcreek->reg_lock); diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c index e03ff1f267bb..426b9ea668da 100644 --- a/drivers/net/dsa/lan9303-core.c +++ b/drivers/net/dsa/lan9303-core.c @@ -1190,10 +1190,15 @@ static void lan9303_port_fast_age(struct dsa_switch *ds, int port)
static int lan9303_port_fdb_add(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, + bool is_locked, struct dsa_db db) { struct lan9303 *chip = ds->priv;
+ /* Ignore locked entries */ + if (is_locked) + return 0; + dev_dbg(chip->dev, "%s(%d, %pM, %d)\n", __func__, port, addr, vid); if (vid) return -EOPNOTSUPP; diff --git a/drivers/net/dsa/lantiq_gswip.c b/drivers/net/dsa/lantiq_gswip.c index e531b93f3cb2..6622bf3c7249 100644 --- a/drivers/net/dsa/lantiq_gswip.c +++ b/drivers/net/dsa/lantiq_gswip.c @@ -1399,8 +1399,13 @@ static int gswip_port_fdb(struct dsa_switch *ds, int port,
static int gswip_port_fdb_add(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, + bool is_locked, struct dsa_db db) { + /* Ignore locked entries */ + if (is_locked) + return 0; + return gswip_port_fdb(ds, port, addr, vid, true); }
diff --git a/drivers/net/dsa/microchip/ksz_common.c b/drivers/net/dsa/microchip/ksz_common.c index 6bd69a7e6809..9786983114ba 100644 --- a/drivers/net/dsa/microchip/ksz_common.c +++ b/drivers/net/dsa/microchip/ksz_common.c @@ -1205,6 +1205,7 @@ static void ksz_port_fast_age(struct dsa_switch *ds, int port)
static int ksz_port_fdb_add(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, + bool is_locked, struct dsa_db db) { struct ksz_device *dev = ds->priv; @@ -1212,6 +1213,10 @@ static int ksz_port_fdb_add(struct dsa_switch *ds, int port, if (!dev->dev_ops->fdb_add) return -EOPNOTSUPP;
+ /* Ignore locked entries */ + if (is_locked) + return 0; + return dev->dev_ops->fdb_add(dev, port, addr, vid, db); }
diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c index 835807911be0..df657747fb09 100644 --- a/drivers/net/dsa/mt7530.c +++ b/drivers/net/dsa/mt7530.c @@ -1366,12 +1366,17 @@ mt7530_port_bridge_leave(struct dsa_switch *ds, int port, static int mt7530_port_fdb_add(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, + bool is_locked, struct dsa_db db) { struct mt7530_priv *priv = ds->priv; int ret; u8 port_mask = BIT(port);
+ /* Ignore locked entries */ + if (is_locked) + return 0; + mutex_lock(&priv->reg_mutex); mt7530_fdb_write(priv, vid, port_mask, addr, -1, STATIC_ENT); ret = mt7530_fdb_cmd(priv, MT7530_FDB_WRITE, NULL); diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index 07e9a4da924c..57ef20aac300 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -2714,11 +2714,16 @@ static int mv88e6xxx_vlan_msti_set(struct dsa_switch *ds,
static int mv88e6xxx_port_fdb_add(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, + bool is_locked, struct dsa_db db) { struct mv88e6xxx_chip *chip = ds->priv; int err;
+ /* Ignore locked entries */ + if (is_locked) + return 0; + mv88e6xxx_reg_lock(chip); err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid, MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC); diff --git a/drivers/net/dsa/ocelot/felix.c b/drivers/net/dsa/ocelot/felix.c index ee19ed96f284..65da02c42061 100644 --- a/drivers/net/dsa/ocelot/felix.c +++ b/drivers/net/dsa/ocelot/felix.c @@ -704,12 +704,17 @@ static int felix_fdb_dump(struct dsa_switch *ds, int port,
static int felix_fdb_add(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, + bool is_locked, struct dsa_db db) { struct net_device *bridge_dev = felix_classify_db(db); struct dsa_port *dp = dsa_to_port(ds, port); struct ocelot *ocelot = ds->priv;
+ /* Ignore locked entries */ + if (is_locked) + return 0; + if (IS_ERR(bridge_dev)) return PTR_ERR(bridge_dev);
diff --git a/drivers/net/dsa/qca/qca8k-common.c b/drivers/net/dsa/qca/qca8k-common.c index bba95613e218..e7022f528bbf 100644 --- a/drivers/net/dsa/qca/qca8k-common.c +++ b/drivers/net/dsa/qca/qca8k-common.c @@ -795,11 +795,16 @@ int qca8k_port_fdb_insert(struct qca8k_priv *priv, const u8 *addr,
int qca8k_port_fdb_add(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, + bool is_locked, struct dsa_db db) { struct qca8k_priv *priv = (struct qca8k_priv *)ds->priv; u16 port_mask = BIT(port);
+ /* Ignore locked entries */ + if (is_locked) + return 0; + return qca8k_port_fdb_insert(priv, addr, port_mask, vid); }
diff --git a/drivers/net/dsa/qca/qca8k.h b/drivers/net/dsa/qca/qca8k.h index e36ecc9777f4..cac21267211f 100644 --- a/drivers/net/dsa/qca/qca8k.h +++ b/drivers/net/dsa/qca/qca8k.h @@ -479,6 +479,7 @@ int qca8k_port_fdb_insert(struct qca8k_priv *priv, const u8 *addr, u16 port_mask, u16 vid); int qca8k_port_fdb_add(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, + bool is_locked, struct dsa_db db); int qca8k_port_fdb_del(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c index b03d0d0c3dbf..4419787ebdfb 100644 --- a/drivers/net/dsa/sja1105/sja1105_main.c +++ b/drivers/net/dsa/sja1105/sja1105_main.c @@ -1802,10 +1802,15 @@ int sja1105pqrs_fdb_del(struct dsa_switch *ds, int port,
static int sja1105_fdb_add(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, + bool is_locked, struct dsa_db db) { struct sja1105_private *priv = ds->priv;
+ /* Ignore locked entries */ + if (is_locked) + return 0; + if (!vid) { switch (db.type) { case DSA_DB_PORT: @@ -1944,7 +1949,7 @@ static int sja1105_mdb_add(struct dsa_switch *ds, int port, const struct switchdev_obj_port_mdb *mdb, struct dsa_db db) { - return sja1105_fdb_add(ds, port, mdb->addr, mdb->vid, db); + return sja1105_fdb_add(ds, port, mdb->addr, mdb->vid, false, db); }
static int sja1105_mdb_del(struct dsa_switch *ds, int port, diff --git a/include/net/dsa.h b/include/net/dsa.h index f2ce12860546..abda2806c5cb 100644 --- a/include/net/dsa.h +++ b/include/net/dsa.h @@ -1017,6 +1017,7 @@ struct dsa_switch_ops { */ int (*port_fdb_add)(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, + bool is_locked, struct dsa_db db); int (*port_fdb_del)(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, diff --git a/net/dsa/switch.c b/net/dsa/switch.c index f29604f25c67..849d5a39d021 100644 --- a/net/dsa/switch.c +++ b/net/dsa/switch.c @@ -243,7 +243,7 @@ static int dsa_port_do_fdb_add(struct dsa_port *dp, const unsigned char *addr,
/* No need to bother with refcounting for user ports */ if (!(dsa_port_is_cpu(dp) || dsa_port_is_dsa(dp))) - return ds->ops->port_fdb_add(ds, port, addr, vid, db); + return ds->ops->port_fdb_add(ds, port, addr, vid, is_locked, db);
mutex_lock(&dp->addr_lists_lock);
@@ -259,7 +259,7 @@ static int dsa_port_do_fdb_add(struct dsa_port *dp, const unsigned char *addr, goto out; }
- err = ds->ops->port_fdb_add(ds, port, addr, vid, db); + err = ds->ops->port_fdb_add(ds, port, addr, vid, is_locked, db); if (err) { kfree(a); goto out;
The FID is needed to get hold of which VID was involved in a violation, thus the need to be able to read the FID.
For convenience the function mv88e6xxx_g1_atu_op() has been used to read ATU violations, but the function invalidates reading the fid, so to both read ATU violations without zeroing the fid, and read the fid, functions have been added to ensure both are done correctly.
Signed-off-by: Hans Schultz netdev@kapio-technology.com --- drivers/net/dsa/mv88e6xxx/global1_atu.c | 60 ++++++++++++++++++++++--- 1 file changed, 55 insertions(+), 5 deletions(-)
diff --git a/drivers/net/dsa/mv88e6xxx/global1_atu.c b/drivers/net/dsa/mv88e6xxx/global1_atu.c index 40bd67a5c8e9..d9dfa1159cde 100644 --- a/drivers/net/dsa/mv88e6xxx/global1_atu.c +++ b/drivers/net/dsa/mv88e6xxx/global1_atu.c @@ -114,6 +114,19 @@ static int mv88e6xxx_g1_atu_op_wait(struct mv88e6xxx_chip *chip) return mv88e6xxx_g1_wait_bit(chip, MV88E6XXX_G1_ATU_OP, bit, 0); }
+static int mv88e6xxx_g1_read_atu_violation(struct mv88e6xxx_chip *chip) +{ + int err; + + err = mv88e6xxx_g1_write(chip, MV88E6XXX_G1_ATU_OP, + MV88E6XXX_G1_ATU_OP_BUSY | + MV88E6XXX_G1_ATU_OP_GET_CLR_VIOLATION); + if (err) + return err; + + return mv88e6xxx_g1_atu_op_wait(chip); +} + static int mv88e6xxx_g1_atu_op(struct mv88e6xxx_chip *chip, u16 fid, u16 op) { u16 val; @@ -159,6 +172,41 @@ int mv88e6xxx_g1_atu_get_next(struct mv88e6xxx_chip *chip, u16 fid) return mv88e6xxx_g1_atu_op(chip, fid, MV88E6XXX_G1_ATU_OP_GET_NEXT_DB); }
+static int mv88e6xxx_g1_atu_fid_read(struct mv88e6xxx_chip *chip, u16 *fid) +{ + u16 val = 0, upper = 0, op = 0; + int err = -EOPNOTSUPP; + + if (mv88e6xxx_num_databases(chip) > 256) { + err = mv88e6xxx_g1_read(chip, MV88E6352_G1_ATU_FID, &val); + val &= 0xfff; + if (err) + return err; + } else { + err = mv88e6xxx_g1_read(chip, MV88E6XXX_G1_ATU_OP, &op); + if (err) + return err; + if (mv88e6xxx_num_databases(chip) > 64) { + /* ATU DBNum[7:4] are located in ATU Control 15:12 */ + err = mv88e6xxx_g1_read(chip, MV88E6XXX_G1_ATU_CTL, &upper); + if (err) + return err; + + upper = (upper >> 8) & 0x00f0; + } else if (mv88e6xxx_num_databases(chip) > 16) { + /* ATU DBNum[5:4] are located in ATU Operation 9:8 */ + + upper = (op >> 4) & 0x30; + } + /* ATU DBNum[3:0] are located in ATU Operation 3:0 */ + + val = (op & 0xf) | upper; + } + *fid = val; + + return err; +} + /* Offset 0x0C: ATU Data Register */
static int mv88e6xxx_g1_atu_data_read(struct mv88e6xxx_chip *chip, @@ -353,14 +401,12 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id) { struct mv88e6xxx_chip *chip = dev_id; struct mv88e6xxx_atu_entry entry; - int spid; - int err; - u16 val; + int err, spid; + u16 val, fid;
mv88e6xxx_reg_lock(chip);
- err = mv88e6xxx_g1_atu_op(chip, 0, - MV88E6XXX_G1_ATU_OP_GET_CLR_VIOLATION); + err = mv88e6xxx_g1_read_atu_violation(chip); if (err) goto out;
@@ -368,6 +414,10 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id) if (err) goto out;
+ err = mv88e6xxx_g1_atu_fid_read(chip, &fid); + if (err) + goto out; + err = mv88e6xxx_g1_atu_data_read(chip, &entry); if (err) goto out;
This implementation for the Marvell mv88e6xxx chip series, is based on handling ATU miss violations occurring when packets ingress on a port that is locked. The mac address triggering the ATU miss violation will be added to the ATU with a zero-DPV, and is then communicated through switchdev to the bridge module, which adds a fdb entry with the fdb locked flag set. The entry is kept according to the bridges ageing time, thus simulating a dynamic entry.
Additionally the driver will set the sticky and blackhole flags, as the driver does not support roaming and forwarding from any port to a locked entry.
As this is essentially a form of CPU based learning, the amount of locked entries will be limited by a hardcoded value for now, so as to prevent DOS attacks.
Signed-off-by: Hans Schultz netdev@kapio-technology.com --- drivers/net/dsa/mv88e6xxx/Makefile | 1 + drivers/net/dsa/mv88e6xxx/chip.c | 76 +++++-- drivers/net/dsa/mv88e6xxx/chip.h | 19 ++ drivers/net/dsa/mv88e6xxx/global1.h | 1 + drivers/net/dsa/mv88e6xxx/global1_atu.c | 16 +- drivers/net/dsa/mv88e6xxx/port.c | 15 +- drivers/net/dsa/mv88e6xxx/port.h | 6 + drivers/net/dsa/mv88e6xxx/switchdev.c | 285 ++++++++++++++++++++++++ drivers/net/dsa/mv88e6xxx/switchdev.h | 37 +++ 9 files changed, 434 insertions(+), 22 deletions(-) create mode 100644 drivers/net/dsa/mv88e6xxx/switchdev.c create mode 100644 drivers/net/dsa/mv88e6xxx/switchdev.h
diff --git a/drivers/net/dsa/mv88e6xxx/Makefile b/drivers/net/dsa/mv88e6xxx/Makefile index c8eca2b6f959..be903a983780 100644 --- a/drivers/net/dsa/mv88e6xxx/Makefile +++ b/drivers/net/dsa/mv88e6xxx/Makefile @@ -15,3 +15,4 @@ mv88e6xxx-objs += port_hidden.o mv88e6xxx-$(CONFIG_NET_DSA_MV88E6XXX_PTP) += ptp.o mv88e6xxx-objs += serdes.o mv88e6xxx-objs += smi.o +mv88e6xxx-objs += switchdev.o diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index 57ef20aac300..8a17a02293c3 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -42,6 +42,7 @@ #include "ptp.h" #include "serdes.h" #include "smi.h" +#include "switchdev.h"
static void assert_reg_lock(struct mv88e6xxx_chip *chip) { @@ -916,6 +917,13 @@ static void mv88e6xxx_mac_link_down(struct dsa_switch *ds, int port, if (err) dev_err(chip->dev, "p%d: failed to force MAC link down\n", port); + else + if (mv88e6xxx_port_is_locked(chip, port)) { + err = mv88e6xxx_atu_locked_entry_flush(ds, port); + if (err) + dev_err(chip->dev, + "p%d: failed to clear locked entries\n", port); + } }
static void mv88e6xxx_mac_link_up(struct dsa_switch *ds, int port, @@ -1682,6 +1690,13 @@ static void mv88e6xxx_port_fast_age(struct dsa_switch *ds, int port) struct mv88e6xxx_chip *chip = ds->priv; int err;
+ if (mv88e6xxx_port_is_locked(chip, port)) { + err = mv88e6xxx_atu_locked_entry_flush(ds, port); + if (err) + dev_err(chip->ds->dev, "p%d: failed to clear locked entries: %d\n", + port, err); + } + mv88e6xxx_reg_lock(chip); err = mv88e6xxx_port_fast_age_fid(chip, port, 0); mv88e6xxx_reg_unlock(chip); @@ -1718,11 +1733,11 @@ static int mv88e6xxx_vtu_get(struct mv88e6xxx_chip *chip, u16 vid, return err; }
-static int mv88e6xxx_vtu_walk(struct mv88e6xxx_chip *chip, - int (*cb)(struct mv88e6xxx_chip *chip, - const struct mv88e6xxx_vtu_entry *entry, - void *priv), - void *priv) +int mv88e6xxx_vtu_walk(struct mv88e6xxx_chip *chip, + int (*cb)(struct mv88e6xxx_chip *chip, + const struct mv88e6xxx_vtu_entry *entry, + void *priv), + void *priv) { struct mv88e6xxx_vtu_entry entry = { .vid = mv88e6xxx_max_vid(chip), @@ -2724,6 +2739,9 @@ static int mv88e6xxx_port_fdb_add(struct dsa_switch *ds, int port, if (is_locked) return 0;
+ if (mv88e6xxx_port_is_locked(chip, port)) + mv88e6xxx_atu_locked_entry_find_purge(ds, port, addr, vid); + mv88e6xxx_reg_lock(chip); err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid, MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC); @@ -2737,12 +2755,17 @@ static int mv88e6xxx_port_fdb_del(struct dsa_switch *ds, int port, struct dsa_db db) { struct mv88e6xxx_chip *chip = ds->priv; - int err; + bool locked_found = false; + int err = 0;
- mv88e6xxx_reg_lock(chip); - err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid, 0); - mv88e6xxx_reg_unlock(chip); + if (mv88e6xxx_port_is_locked(chip, port)) + locked_found = mv88e6xxx_atu_locked_entry_find_purge(ds, port, addr, vid);
+ if (!locked_found) { + mv88e6xxx_reg_lock(chip); + err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid, 0); + mv88e6xxx_reg_unlock(chip); + } return err; }
@@ -3838,11 +3861,18 @@ static int mv88e6xxx_setup(struct dsa_switch *ds)
static int mv88e6xxx_port_setup(struct dsa_switch *ds, int port) { - return mv88e6xxx_setup_devlink_regions_port(ds, port); + int err; + + err = mv88e6xxx_setup_devlink_regions_port(ds, port); + if (!err) + return mv88e6xxx_init_violation_handler(ds, port); + + return err; }
static void mv88e6xxx_port_teardown(struct dsa_switch *ds, int port) { + mv88e6xxx_teardown_violation_handler(ds, port); mv88e6xxx_teardown_devlink_regions_port(ds, port); }
@@ -6517,7 +6547,7 @@ static int mv88e6xxx_port_pre_bridge_flags(struct dsa_switch *ds, int port, const struct mv88e6xxx_ops *ops;
if (flags.mask & ~(BR_LEARNING | BR_FLOOD | BR_MCAST_FLOOD | - BR_BCAST_FLOOD | BR_PORT_LOCKED)) + BR_BCAST_FLOOD | BR_PORT_LOCKED | BR_PORT_MAB)) return -EINVAL;
ops = chip->info->ops; @@ -6538,13 +6568,13 @@ static int mv88e6xxx_port_bridge_flags(struct dsa_switch *ds, int port, struct mv88e6xxx_chip *chip = ds->priv; int err = -EOPNOTSUPP;
- mv88e6xxx_reg_lock(chip); - if (flags.mask & BR_LEARNING) { bool learning = !!(flags.val & BR_LEARNING); u16 pav = learning ? (1 << port) : 0;
+ mv88e6xxx_reg_lock(chip); err = mv88e6xxx_port_set_assoc_vector(chip, port, pav); + mv88e6xxx_reg_unlock(chip); if (err) goto out; } @@ -6552,8 +6582,10 @@ static int mv88e6xxx_port_bridge_flags(struct dsa_switch *ds, int port, if (flags.mask & BR_FLOOD) { bool unicast = !!(flags.val & BR_FLOOD);
+ mv88e6xxx_reg_lock(chip); err = chip->info->ops->port_set_ucast_flood(chip, port, unicast); + mv88e6xxx_reg_unlock(chip); if (err) goto out; } @@ -6561,8 +6593,10 @@ static int mv88e6xxx_port_bridge_flags(struct dsa_switch *ds, int port, if (flags.mask & BR_MCAST_FLOOD) { bool multicast = !!(flags.val & BR_MCAST_FLOOD);
+ mv88e6xxx_reg_lock(chip); err = chip->info->ops->port_set_mcast_flood(chip, port, multicast); + mv88e6xxx_reg_unlock(chip); if (err) goto out; } @@ -6570,20 +6604,34 @@ static int mv88e6xxx_port_bridge_flags(struct dsa_switch *ds, int port, if (flags.mask & BR_BCAST_FLOOD) { bool broadcast = !!(flags.val & BR_BCAST_FLOOD);
+ mv88e6xxx_reg_lock(chip); err = mv88e6xxx_port_broadcast_sync(chip, port, broadcast); + mv88e6xxx_reg_unlock(chip); if (err) goto out; }
+ if (flags.mask & BR_PORT_MAB) { + chip->ports[port].mab = !!(flags.val & BR_PORT_MAB); + + if (!chip->ports[port].mab) + err = mv88e6xxx_atu_locked_entry_flush(ds, port); + else + err = 0; + } + if (flags.mask & BR_PORT_LOCKED) { bool locked = !!(flags.val & BR_PORT_LOCKED);
+ mv88e6xxx_reg_lock(chip); err = mv88e6xxx_port_set_lock(chip, port, locked); + mv88e6xxx_reg_unlock(chip); if (err) goto out; + + chip->ports[port].locked = locked; } out: - mv88e6xxx_reg_unlock(chip);
return err; } diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h index e693154cf803..a40884367bdf 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.h +++ b/drivers/net/dsa/mv88e6xxx/chip.h @@ -280,6 +280,16 @@ struct mv88e6xxx_port { unsigned int serdes_irq; char serdes_irq_name[64]; struct devlink_region *region; + + /* Locked port and MAB/MacAuth control flags */ + bool locked; + bool mab; + + /* List and maintenance of ATU locked entries */ + struct mutex ale_list_lock; + struct list_head ale_list; + struct delayed_work ale_work; + int ale_cnt; };
enum mv88e6xxx_region_id { @@ -399,6 +409,9 @@ struct mv88e6xxx_chip { int egress_dest_port; int ingress_dest_port;
+ /* Keep the register written age time for easy access */ + u8 age_time; + /* Per-port timestamping resources. */ struct mv88e6xxx_port_hwtstamp port_hwtstamp[DSA_MAX_PORTS];
@@ -802,6 +815,12 @@ static inline void mv88e6xxx_reg_unlock(struct mv88e6xxx_chip *chip) mutex_unlock(&chip->reg_lock); }
+int mv88e6xxx_vtu_walk(struct mv88e6xxx_chip *chip, + int (*cb)(struct mv88e6xxx_chip *chip, + const struct mv88e6xxx_vtu_entry *entry, + void *priv), + void *priv); + int mv88e6xxx_fid_map(struct mv88e6xxx_chip *chip, unsigned long *bitmap);
#endif /* _MV88E6XXX_CHIP_H */ diff --git a/drivers/net/dsa/mv88e6xxx/global1.h b/drivers/net/dsa/mv88e6xxx/global1.h index 65958b2a0d3a..503fbf216670 100644 --- a/drivers/net/dsa/mv88e6xxx/global1.h +++ b/drivers/net/dsa/mv88e6xxx/global1.h @@ -136,6 +136,7 @@ #define MV88E6XXX_G1_ATU_DATA_TRUNK 0x8000 #define MV88E6XXX_G1_ATU_DATA_TRUNK_ID_MASK 0x00f0 #define MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_MASK 0x3ff0 +#define MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_NO_EGRESS 0x0000 #define MV88E6XXX_G1_ATU_DATA_STATE_MASK 0x000f #define MV88E6XXX_G1_ATU_DATA_STATE_UC_UNUSED 0x0000 #define MV88E6XXX_G1_ATU_DATA_STATE_UC_AGE_1_OLDEST 0x0001 diff --git a/drivers/net/dsa/mv88e6xxx/global1_atu.c b/drivers/net/dsa/mv88e6xxx/global1_atu.c index d9dfa1159cde..4f2c61612fb4 100644 --- a/drivers/net/dsa/mv88e6xxx/global1_atu.c +++ b/drivers/net/dsa/mv88e6xxx/global1_atu.c @@ -12,6 +12,8 @@
#include "chip.h" #include "global1.h" +#include "port.h" +#include "switchdev.h"
/* Offset 0x01: ATU FID Register */
@@ -54,6 +56,7 @@ int mv88e6xxx_g1_atu_set_age_time(struct mv88e6xxx_chip *chip,
/* Round to nearest multiple of coeff */ age_time = (msecs + coeff / 2) / coeff; + chip->age_time = age_time;
err = mv88e6xxx_g1_read(chip, MV88E6XXX_G1_ATU_CTL, &val); if (err) @@ -418,6 +421,10 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id) if (err) goto out;
+ err = mv88e6xxx_g1_read(chip, MV88E6352_G1_ATU_FID, &fid); + if (err) + goto out; + err = mv88e6xxx_g1_atu_data_read(chip, &entry); if (err) goto out; @@ -426,6 +433,8 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id) if (err) goto out;
+ mv88e6xxx_reg_unlock(chip); + spid = entry.state;
if (val & MV88E6XXX_G1_ATU_OP_AGE_OUT_VIOLATION) { @@ -446,6 +455,12 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id) "ATU miss violation for %pM portvec %x spid %d\n", entry.mac, entry.portvec, spid); chip->ports[spid].atu_miss_violation++; + + if (fid && chip->ports[spid].mab) + err = mv88e6xxx_handle_violation(chip, spid, &entry, fid, + MV88E6XXX_G1_ATU_OP_MISS_VIOLATION); + if (err) + goto out; }
if (val & MV88E6XXX_G1_ATU_OP_FULL_VIOLATION) { @@ -454,7 +469,6 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id) entry.mac, entry.portvec, spid); chip->ports[spid].atu_full_violation++; } - mv88e6xxx_reg_unlock(chip);
return IRQ_HANDLED;
diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c index 90c55f23b7c9..ad054bca2907 100644 --- a/drivers/net/dsa/mv88e6xxx/port.c +++ b/drivers/net/dsa/mv88e6xxx/port.c @@ -14,9 +14,11 @@ #include <linux/phylink.h>
#include "chip.h" +#include "global1.h" #include "global2.h" #include "port.h" #include "serdes.h" +#include "switchdev.h"
int mv88e6xxx_port_read(struct mv88e6xxx_chip *chip, int port, int reg, u16 *val) @@ -1221,13 +1223,12 @@ int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port, if (err) return err;
- err = mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_ASSOC_VECTOR, ®); - if (err) - return err; - - reg &= ~MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT; - if (locked) - reg |= MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT; + reg = 0; + if (locked) { + reg = (1 << port); + reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG | + MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT; + }
return mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_ASSOC_VECTOR, reg); } diff --git a/drivers/net/dsa/mv88e6xxx/port.h b/drivers/net/dsa/mv88e6xxx/port.h index cb04243f37c1..9475bc6e95a2 100644 --- a/drivers/net/dsa/mv88e6xxx/port.h +++ b/drivers/net/dsa/mv88e6xxx/port.h @@ -231,6 +231,7 @@ #define MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT 0x2000 #define MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG 0x1000 #define MV88E6XXX_PORT_ASSOC_VECTOR_REFRESH_LOCKED 0x0800 +#define MV88E6XXX_PORT_ASSOC_VECTOR_PAV_MASK 0x07ff
/* Offset 0x0C: Port ATU Control */ #define MV88E6XXX_PORT_ATU_CTL 0x0c @@ -375,6 +376,11 @@ int mv88e6xxx_port_set_pvid(struct mv88e6xxx_chip *chip, int port, u16 pvid); int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port, bool locked);
+static inline bool mv88e6xxx_port_is_locked(struct mv88e6xxx_chip *chip, int port) +{ + return chip->ports[port].locked; +} + int mv88e6xxx_port_set_8021q_mode(struct mv88e6xxx_chip *chip, int port, u16 mode); int mv88e6095_port_tag_remap(struct mv88e6xxx_chip *chip, int port); diff --git a/drivers/net/dsa/mv88e6xxx/switchdev.c b/drivers/net/dsa/mv88e6xxx/switchdev.c new file mode 100644 index 000000000000..12472cbf3b85 --- /dev/null +++ b/drivers/net/dsa/mv88e6xxx/switchdev.c @@ -0,0 +1,285 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * switchdev.c + * + * Authors: + * Hans J. Schultz hans.schultz@westermo.com + * + */ + +#include <net/switchdev.h> +#include <linux/list.h> +#include "chip.h" +#include "global1.h" +#include "switchdev.h" + +static void mv88e6xxx_atu_locked_entry_purge(struct mv88e6xxx_atu_locked_entry *ale, + bool notify, bool take_nl_lock) +{ + struct switchdev_notifier_fdb_info info = { + .addr = ale->mac, + .vid = ale->vid, + .locked = true, + .offloaded = true, + }; + struct mv88e6xxx_atu_entry entry; + struct net_device *brport; + struct dsa_port *dp; + + entry.portvec = MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_NO_EGRESS; + entry.state = MV88E6XXX_G1_ATU_DATA_STATE_UC_UNUSED; + entry.trunk = false; + ether_addr_copy(entry.mac, ale->mac); + + mv88e6xxx_reg_lock(ale->chip); + mv88e6xxx_g1_atu_loadpurge(ale->chip, ale->fid, &entry); + mv88e6xxx_reg_unlock(ale->chip); + + dp = dsa_to_port(ale->chip->ds, ale->port); + + if (notify) { + if (take_nl_lock) + rtnl_lock(); + brport = dsa_port_to_bridge_port(dp); + + if (brport) { + call_switchdev_notifiers(SWITCHDEV_FDB_DEL_TO_BRIDGE, + brport, &info.info, NULL); + } else { + dev_err(ale->chip->dev, + "No bridge port for dsa port belonging to port %d\n", + ale->port); + } + if (take_nl_lock) + rtnl_unlock(); + } + + list_del(&ale->list); + kfree(ale); +} + +static void mv88e6xxx_atu_locked_entry_cleanup(struct work_struct *work) +{ + struct mv88e6xxx_port *p = container_of(work, struct mv88e6xxx_port, ale_work.work); + struct mv88e6xxx_atu_locked_entry *ale, *tmp; + + mutex_lock(&p->ale_list_lock); + list_for_each_entry_safe(ale, tmp, &p->ale_list, list) { + if (time_after(jiffies, ale->expires)) { + mv88e6xxx_atu_locked_entry_purge(ale, true, true); + p->ale_cnt--; + } + } + mutex_unlock(&p->ale_list_lock); + + mod_delayed_work(system_long_wq, &p->ale_work, msecs_to_jiffies(100)); +} + +static int mv88e6xxx_new_atu_locked_entry(struct mv88e6xxx_chip *chip, const unsigned char *addr, + int port, u16 fid, u16 vid, + struct mv88e6xxx_atu_locked_entry **alep) +{ + struct mv88e6xxx_atu_locked_entry *ale; + unsigned long now, age_time; + + ale = kmalloc(sizeof(*ale), GFP_ATOMIC); + if (!ale) + return -ENOMEM; + + ether_addr_copy(ale->mac, addr); + ale->chip = chip; + ale->port = port; + ale->fid = fid; + ale->vid = vid; + now = jiffies; + age_time = chip->age_time * chip->info->age_time_coeff; + ale->expires = now + age_time; + + *alep = ale; + return 0; +} + +struct mv88e6xxx_fid_search_ctx { + u16 fid_search; + u16 vid_found; +}; + +static int mv88e6xxx_find_vid_on_matching_fid(struct mv88e6xxx_chip *chip, + const struct mv88e6xxx_vtu_entry *entry, + void *priv) +{ + struct mv88e6xxx_fid_search_ctx *ctx = priv; + + if (ctx->fid_search == entry->fid) { + ctx->vid_found = entry->vid; + return 1; + } + + return 0; +} + +int mv88e6xxx_handle_violation(struct mv88e6xxx_chip *chip, int port, + struct mv88e6xxx_atu_entry *entry, + u16 fid, u16 type) +{ + struct switchdev_notifier_fdb_info info = { + .addr = entry->mac, + .vid = 0, + .sticky = true, + .locked = true, + .blackhole = true, + .offloaded = true, + }; + struct mv88e6xxx_atu_locked_entry *ale; + struct mv88e6xxx_fid_search_ctx ctx; + struct net_device *brport; + struct mv88e6xxx_port *p; + struct dsa_port *dp; + int err; + + if (!mv88e6xxx_is_invalid_port(chip, port)) + p = &chip->ports[port]; + else + return -ENODEV; + + ctx.fid_search = fid; + mv88e6xxx_reg_lock(chip); + err = mv88e6xxx_vtu_walk(chip, mv88e6xxx_find_vid_on_matching_fid, &ctx); + mv88e6xxx_reg_unlock(chip); + if (err < 0) + return err; + if (err == 1) + info.vid = ctx.vid_found; + else + return -ENODATA; + + switch (type) { + case MV88E6XXX_G1_ATU_OP_MISS_VIOLATION: + mutex_lock(&p->ale_list_lock); + if (p->ale_cnt >= ATU_LOCKED_ENTRIES_MAX) + goto exit; + mutex_unlock(&p->ale_list_lock); + entry->portvec = MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_NO_EGRESS; + entry->state = MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC; + entry->trunk = false; + + mv88e6xxx_reg_lock(chip); + err = mv88e6xxx_g1_atu_loadpurge(chip, fid, entry); + if (err) + goto fail; + mv88e6xxx_reg_unlock(chip); + + dp = dsa_to_port(chip->ds, port); + err = mv88e6xxx_new_atu_locked_entry(chip, entry->mac, port, fid, + info.vid, &ale); + if (err) + return err; + + mutex_lock(&p->ale_list_lock); + list_add(&ale->list, &p->ale_list); + p->ale_cnt++; + mutex_unlock(&p->ale_list_lock); + + rtnl_lock(); + brport = dsa_port_to_bridge_port(dp); + if (!brport) { + rtnl_unlock(); + return -ENODEV; + } + err = call_switchdev_notifiers(SWITCHDEV_FDB_ADD_TO_BRIDGE, + brport, &info.info, NULL); + rtnl_unlock(); + break; + } + + return err; + +fail: + mv88e6xxx_reg_unlock(chip); + return err; + +exit: + mutex_unlock(&p->ale_list_lock); + return 0; +} + +bool mv88e6xxx_atu_locked_entry_find_purge(struct dsa_switch *ds, int port, + const unsigned char *addr, u16 vid) +{ + struct mv88e6xxx_atu_locked_entry *ale, *tmp; + struct mv88e6xxx_chip *chip = ds->priv; + struct mv88e6xxx_port *p; + bool found = false; + + p = &chip->ports[port]; + mutex_lock(&p->ale_list_lock); + list_for_each_entry_safe(ale, tmp, &p->ale_list, list) { + if (ether_addr_equal(ale->mac, addr) == 0) { + if (ale->vid == vid) { + mv88e6xxx_atu_locked_entry_purge(ale, false, false); + p->ale_cnt--; + found = true; + break; + } + } + } + mutex_unlock(&p->ale_list_lock); + return found; +} + +int mv88e6xxx_atu_locked_entry_flush(struct dsa_switch *ds, int port) +{ + struct mv88e6xxx_atu_locked_entry *ale, *tmp; + struct mv88e6xxx_chip *chip = ds->priv; + struct mv88e6xxx_port *p; + + if (!mv88e6xxx_is_invalid_port(chip, port)) + p = &chip->ports[port]; + else + return -ENODEV; + + mutex_lock(&p->ale_list_lock); + list_for_each_entry_safe(ale, tmp, &p->ale_list, list) { + mv88e6xxx_atu_locked_entry_purge(ale, true, false); + p->ale_cnt--; + } + mutex_unlock(&p->ale_list_lock); + + return 0; +} + +int mv88e6xxx_init_violation_handler(struct dsa_switch *ds, int port) +{ + struct mv88e6xxx_chip *chip = ds->priv; + struct mv88e6xxx_port *p; + + if (!mv88e6xxx_is_invalid_port(chip, port)) + p = &chip->ports[port]; + else + return -ENODEV; + + INIT_LIST_HEAD(&p->ale_list); + mutex_init(&p->ale_list_lock); + p->ale_cnt = 0; + INIT_DELAYED_WORK(&p->ale_work, mv88e6xxx_atu_locked_entry_cleanup); + mod_delayed_work(system_long_wq, &p->ale_work, msecs_to_jiffies(500)); + + return 0; +} + +int mv88e6xxx_teardown_violation_handler(struct dsa_switch *ds, int port) +{ + struct mv88e6xxx_chip *chip = ds->priv; + struct mv88e6xxx_port *p; + + if (!mv88e6xxx_is_invalid_port(chip, port)) + p = &chip->ports[port]; + else + return -ENODEV; + + cancel_delayed_work(&p->ale_work); + mv88e6xxx_atu_locked_entry_flush(ds, port); + mutex_destroy(&p->ale_list_lock); + + return 0; +} diff --git a/drivers/net/dsa/mv88e6xxx/switchdev.h b/drivers/net/dsa/mv88e6xxx/switchdev.h new file mode 100644 index 000000000000..df2005c36f47 --- /dev/null +++ b/drivers/net/dsa/mv88e6xxx/switchdev.h @@ -0,0 +1,37 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later + * + * switchdev.h + * + * Authors: + * Hans J. Schultz hans.schultz@westermo.com + * + */ + +#ifndef DRIVERS_NET_DSA_MV88E6XXX_SWITCHDEV_H_ +#define DRIVERS_NET_DSA_MV88E6XXX_SWITCHDEV_H_ + +#include <net/switchdev.h> +#include "chip.h" + +#define ATU_LOCKED_ENTRIES_MAX 64 + +struct mv88e6xxx_atu_locked_entry { + struct list_head list; + struct mv88e6xxx_chip *chip; + int port; + u8 mac[ETH_ALEN]; + u16 fid; + u16 vid; + unsigned long expires; +}; + +int mv88e6xxx_handle_violation(struct mv88e6xxx_chip *chip, int port, + struct mv88e6xxx_atu_entry *entry, + u16 fid, u16 type); +bool mv88e6xxx_atu_locked_entry_find_purge(struct dsa_switch *ds, int port, + const unsigned char *addr, u16 vid); +int mv88e6xxx_atu_locked_entry_flush(struct dsa_switch *ds, int port); +int mv88e6xxx_init_violation_handler(struct dsa_switch *ds, int port); +int mv88e6xxx_teardown_violation_handler(struct dsa_switch *ds, int port); + +#endif /* DRIVERS_NET_DSA_MV88E6XXX_SWITCHDEV_H_ */
Verify that the MAC-Auth mechanism works by adding a FDB entry with the locked flag set, denying access until the FDB entry is replaced with a FDB entry without the locked flag set.
Also add a test that verifies that sticky FDB entries cannot roam.
Signed-off-by: Hans Schultz netdev@kapio-technology.com --- .../net/forwarding/bridge_locked_port.sh | 107 +++++++++++++++++- .../net/forwarding/bridge_sticky_fdb.sh | 21 +++- 2 files changed, 126 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/net/forwarding/bridge_locked_port.sh b/tools/testing/selftests/net/forwarding/bridge_locked_port.sh index 5b02b6b60ce7..b763b3b9fdf0 100755 --- a/tools/testing/selftests/net/forwarding/bridge_locked_port.sh +++ b/tools/testing/selftests/net/forwarding/bridge_locked_port.sh @@ -1,7 +1,15 @@ #!/bin/bash # SPDX-License-Identifier: GPL-2.0
-ALL_TESTS="locked_port_ipv4 locked_port_ipv6 locked_port_vlan" +ALL_TESTS=" + locked_port_ipv4 + locked_port_ipv6 + locked_port_vlan + locked_port_mab + locked_port_station_move + locked_port_mab_station_move +" + NUM_NETIFS=4 CHECK_TC="no" source lib.sh @@ -166,6 +174,103 @@ locked_port_ipv6() log_test "Locked port ipv6" }
+locked_port_mab() +{ + RET=0 + check_locked_port_support || return 0 + + ping_do $h1 192.0.2.2 + check_err $? "MAB: Ping did not work before locking port" + + bridge link set dev $swp1 locked on + bridge link set dev $swp1 learning on + if ! bridge link set dev $swp1 mab on 2>/dev/null; then + echo "SKIP: iproute2 too old; MacAuth feature not supported." + return $ksft_skip + fi + + ping_do $h1 192.0.2.2 + check_fail $? "MAB: Ping worked on locked port without FDB entry" + + bridge fdb show | grep `mac_get $h1` | grep -q "locked" + check_err $? "MAB: No locked fdb entry after ping on locked port" + + bridge fdb replace `mac_get $h1` dev $swp1 master static + + ping_do $h1 192.0.2.2 + check_err $? "MAB: Ping did not work with fdb entry without locked flag" + + bridge fdb del `mac_get $h1` dev $swp1 master + bridge link set dev $swp1 learning off + bridge link set dev $swp1 locked off + + log_test "Locked port MAB" +} + +# No roaming allowed to a simple locked port +locked_port_station_move() +{ + local mac=a0:b0:c0:c0:b0:a0 + + RET=0 + check_locked_port_support || return 0 + + bridge link set dev $swp1 locked on + bridge link set dev $swp1 learning on + + $MZ $h1 -q -t udp -a $mac -b rand + bridge fdb show dev $swp1 | grep -q "$mac vlan 1 master br0" + check_fail $? "Locked port station move: FDB entry on first injection" + + $MZ $h2 -q -t udp -a $mac -b rand + bridge fdb show dev $swp2 | grep -q "$mac vlan 1 master br0" + check_err $? "Locked port station move: Entry not found on unlocked port" + + $MZ $h1 -q -t udp -a $mac -b rand + bridge fdb show dev $swp1 | grep -q "$mac vlan 1 master br0" + check_fail $? "Locked port station move: entry roamed to locked port" + + log_test "Locked port station move" +} + +# Roaming to and from a MAB enabled port should work if sticky flag is not set +locked_port_mab_station_move() +{ + local mac=10:20:30:30:20:10 + + RET=0 + check_locked_port_support || return 0 + + bridge link set dev $swp1 locked on + bridge link set dev $swp1 learning on + if ! bridge link set dev $swp1 mab on 2>/dev/null; then + echo "SKIP: iproute2 too old; MacAuth feature not supported." + return $ksft_skip + fi + + $MZ $h1 -q -t udp -a $mac -b rand + if bridge fdb show dev $swp1 | grep -q "$mac vlan 1 master br0" | grep -q sticky; then + echo "SKIP: Roaming not possible with sticky flag, run sticky flag roaming test" + return $ksft_skip + fi + + bridge fdb show dev $swp1 | grep -q "$mac vlan 1 master br0 locked" + check_err $? "MAB station move: no locked entry on first injection" + + $MZ $h2 -q -t udp -a $mac -b rand + bridge fdb show dev $swp1 | grep -q "$mac vlan 1 master br0 locked" + check_fail $? "MAB station move: locked entry did not move" + + bridge fdb show dev $swp2 | grep -q "$mac vlan 1 master br0" + check_err $? "MAB station move: roamed entry not found" + + $MZ $h1 -q -t udp -a $mac -b rand + bridge fdb show dev $swp1 | grep -q "$mac vlan 1 master br0 locked" + check_err $? "MAB station move: entry did not roam back to locked port" + + log_test "Locked port MAB station move" +} + trap cleanup EXIT
setup_prepare diff --git a/tools/testing/selftests/net/forwarding/bridge_sticky_fdb.sh b/tools/testing/selftests/net/forwarding/bridge_sticky_fdb.sh index 1f8ef0eff862..bca77bc3fe09 100755 --- a/tools/testing/selftests/net/forwarding/bridge_sticky_fdb.sh +++ b/tools/testing/selftests/net/forwarding/bridge_sticky_fdb.sh @@ -1,7 +1,7 @@ #!/bin/bash # SPDX-License-Identifier: GPL-2.0
-ALL_TESTS="sticky" +ALL_TESTS="sticky sticky_no_roaming" NUM_NETIFS=4 TEST_MAC=de:ad:be:ef:13:37 source lib.sh @@ -59,6 +59,25 @@ sticky() log_test "Sticky fdb entry" }
+# No roaming allowed with the sticky flag set +sticky_no_roaming() +{ + local mac=a8:b4:c2:c2:b4:a8 + + RET=0 + + bridge link set dev $swp2 learning on + bridge fdb add $mac dev $swp1 master static sticky + bridge fdb show dev $swp1 | grep "$mac master br0" | grep -q sticky + check_err $? "Sticky no roaming: No sticky FDB entry found after adding" + + $MZ $h2 -q -t udp -c 10 -d 100msec -a $mac -b rand + bridge fdb show dev $swp2 | grep "$mac master br0" | grep -q sticky + check_fail $? "Sticky no roaming: Sticky entry roamed" + + log_test "Sticky no roaming" +} + trap cleanup EXIT
setup_prepare
On Fri, Aug 26, 2022 at 01:45:38PM +0200, Hans Schultz wrote:
-ALL_TESTS="locked_port_ipv4 locked_port_ipv6 locked_port_vlan" +ALL_TESTS="
- locked_port_ipv4
- locked_port_ipv6
- locked_port_vlan
- locked_port_mab
- locked_port_station_move
- locked_port_mab_station_move
+"
NUM_NETIFS=4 CHECK_TC="no" source lib.sh @@ -166,6 +174,103 @@ locked_port_ipv6() log_test "Locked port ipv6" } +locked_port_mab() +{
- RET=0
- check_locked_port_support || return 0
- ping_do $h1 192.0.2.2
- check_err $? "MAB: Ping did not work before locking port"
- bridge link set dev $swp1 locked on
- bridge link set dev $swp1 learning on
"locked on learning on" is counter intuitive and IMO very much a misconfiguration that we should have disallowed when the "locked" option was introduced. It is my understanding that the only reason we are even talking about it is because mv88e6xxx needs it for MAB for some reason. Please avoid leaking this implementation detail to user space and instead use the "MAB" flag to enable learning if you need it in mv88e6xxx.
- if ! bridge link set dev $swp1 mab on 2>/dev/null; then
echo "SKIP: iproute2 too old; MacAuth feature not supported."
return $ksft_skip
- fi
Please add a similar function to check_locked_port_support() and invoke it next to it.
- ping_do $h1 192.0.2.2
- check_fail $? "MAB: Ping worked on locked port without FDB entry"
- bridge fdb show | grep `mac_get $h1` | grep -q "locked"
- check_err $? "MAB: No locked fdb entry after ping on locked port"
- bridge fdb replace `mac_get $h1` dev $swp1 master static
- ping_do $h1 192.0.2.2
- check_err $? "MAB: Ping did not work with fdb entry without locked flag"
- bridge fdb del `mac_get $h1` dev $swp1 master
Missing:
bridge link set dev $swp1 mab off
- bridge link set dev $swp1 learning off
Can be removed assuming we get rid of "learning on" above.
- bridge link set dev $swp1 locked off
- log_test "Locked port MAB"
+}
+# No roaming allowed to a simple locked port +locked_port_station_move() +{
- local mac=a0:b0:c0:c0:b0:a0
- RET=0
- check_locked_port_support || return 0
- bridge link set dev $swp1 locked on
- bridge link set dev $swp1 learning on
Same comment as above.
- $MZ $h1 -q -t udp -a $mac -b rand
- bridge fdb show dev $swp1 | grep -q "$mac vlan 1 master br0"
- check_fail $? "Locked port station move: FDB entry on first injection"
- $MZ $h2 -q -t udp -a $mac -b rand
- bridge fdb show dev $swp2 | grep -q "$mac vlan 1 master br0"
- check_err $? "Locked port station move: Entry not found on unlocked port"
Looks like this is going to fail with offloaded data path as according to fdb_print_flags() in iproute2 both the "extern_learn" and "offload" flags will be printed before "master".
I suggest using "bridge fdb get" instead (didn't test, might need small tweaks, but you will figure it):
bridge fdb get $mac br br0 vlan 1 master 2> /dev/null | grep -q "$swp2"
Same in other places where "bridge fdb show" is used.
- $MZ $h1 -q -t udp -a $mac -b rand
- bridge fdb show dev $swp1 | grep -q "$mac vlan 1 master br0"
- check_fail $? "Locked port station move: entry roamed to locked port"
Missing:
bridge link set dev $swp1 locked off bridge fdb del $mac dev $swp1 master vlan 1
- log_test "Locked port station move"
+}
+# Roaming to and from a MAB enabled port should work if sticky flag is not set +locked_port_mab_station_move() +{
- local mac=10:20:30:30:20:10
- RET=0
- check_locked_port_support || return 0
- bridge link set dev $swp1 locked on
- bridge link set dev $swp1 learning on
Same comment as above.
- if ! bridge link set dev $swp1 mab on 2>/dev/null; then
Same comment as above.
echo "SKIP: iproute2 too old; MacAuth feature not supported."
return $ksft_skip
- fi
- $MZ $h1 -q -t udp -a $mac -b rand
- if bridge fdb show dev $swp1 | grep -q "$mac vlan 1 master br0" | grep -q sticky; then
Will need to change to "permanent" instead of "sticky".
echo "SKIP: Roaming not possible with sticky flag, run sticky flag roaming test"
return $ksft_skip
Missing cleanup before the return.
- fi
- bridge fdb show dev $swp1 | grep -q "$mac vlan 1 master br0 locked"
- check_err $? "MAB station move: no locked entry on first injection"
- $MZ $h2 -q -t udp -a $mac -b rand
- bridge fdb show dev $swp1 | grep -q "$mac vlan 1 master br0 locked"
- check_fail $? "MAB station move: locked entry did not move"
- bridge fdb show dev $swp2 | grep -q "$mac vlan 1 master br0"
Need to check that it does not roam with the "locked" flag set.
- check_err $? "MAB station move: roamed entry not found"
- $MZ $h1 -q -t udp -a $mac -b rand
- bridge fdb show dev $swp1 | grep -q "$mac vlan 1 master br0 locked"
- check_err $? "MAB station move: entry did not roam back to locked port"
This will need to change to "check_fail" assuming we don't allow roaming from an authorized port to an unauthorized port, which I believe makes sense.
Missing cleanup.
- log_test "Locked port MAB station move"
+}
trap cleanup EXIT setup_prepare
On 2022-08-27 20:21, Ido Schimmel wrote:
On Fri, Aug 26, 2022 at 01:45:38PM +0200, Hans Schultz wrote:
+locked_port_mab() +{
- RET=0
- check_locked_port_support || return 0
- ping_do $h1 192.0.2.2
- check_err $? "MAB: Ping did not work before locking port"
- bridge link set dev $swp1 locked on
- bridge link set dev $swp1 learning on
"locked on learning on" is counter intuitive and IMO very much a misconfiguration that we should have disallowed when the "locked" option was introduced. It is my understanding that the only reason we are even talking about it is because mv88e6xxx needs it for MAB for some reason.
As the way mv88e6xxx implements "learning off" is to remove port association for ingress packets on a port, but that breaks many other things such as refreshing ATU entries and violation interrupts, so it is needed and the question is then what is the worst to have 'learning on' on a locked port or to have the locked port enabling learning in the driver silently?
Opinions seem to differ. Note that even on locked ports without MAB, port association on ingress is still needed in future as I have a dynamic ATU patch set coming, that uses age out violation and hardware refreshing to let the hardware keep the dynamic entries as long as the authorized station is sending, but will age the entry out if the station keeps silent for the ageing time. But that patch set is dependent on this patch set, and I don't think I can send it before this is accepted...
Please avoid leaking this implementation detail to user space and instead use the "MAB" flag to enable learning if you need it in mv88e6xxx.
On Sun, Aug 28, 2022 at 02:00:29PM +0200, netdev@kapio-technology.com wrote:
On 2022-08-27 20:21, Ido Schimmel wrote:
On Fri, Aug 26, 2022 at 01:45:38PM +0200, Hans Schultz wrote:
+locked_port_mab() +{
- RET=0
- check_locked_port_support || return 0
- ping_do $h1 192.0.2.2
- check_err $? "MAB: Ping did not work before locking port"
- bridge link set dev $swp1 locked on
- bridge link set dev $swp1 learning on
"locked on learning on" is counter intuitive and IMO very much a misconfiguration that we should have disallowed when the "locked" option was introduced. It is my understanding that the only reason we are even talking about it is because mv88e6xxx needs it for MAB for some reason.
As the way mv88e6xxx implements "learning off" is to remove port association for ingress packets on a port, but that breaks many other things such as refreshing ATU entries and violation interrupts, so it is needed and the question is then what is the worst to have 'learning on' on a locked port or to have the locked port enabling learning in the driver silently?
Opinions seem to differ. Note that even on locked ports without MAB, port association on ingress is still needed in future as I have a dynamic ATU patch set coming, that uses age out violation and hardware refreshing to let the hardware keep the dynamic entries as long as the authorized station is sending, but will age the entry out if the station keeps silent for the ageing time. But that patch set is dependent on this patch set, and I don't think I can send it before this is accepted...
Can you explain how you envision user space to work once everything is merged? I want to make sure we have the full picture before more stuff is merged. From what you describe, I expect the following:
1. Create topology, assuming two unauthorized ports:
# ip link add name br0 type bridge no_linklocal_learn 1 (*) # ip link set dev swp1 master br0 # ip link set dev swp2 master br0 # bridge link set dev swp1 learning on locked on # bridge link set dev swp2 learning on locked on # ip link set dev swp1 up # ip link set dev swp2 up # ip link set dev br0 up
2. Assuming h1 behind swp1 was authorized using 802.1X:
# bridge fdb replace $H1_MAC dev swp1 master dynamic
3. Assuming 802.1X authentication failed for h2 behind swp2, enable MAB:
# bridge link set dev swp2 mab on
4. Assuming $H2_MAC is in our allow list:
# bridge fdb replace $H2_MAC dev swp2 master dynamic
Learning is on in order to refresh the dynamic entries that user space installed.
(*) Need to add support for this option in iproute2. Already exposed over netlink (see 'IFLA_BR_MULTI_BOOLOPT').
On 2022-08-29 09:40, Ido Schimmel wrote:
On Sun, Aug 28, 2022 at 02:00:29PM +0200, netdev@kapio-technology.com wrote:
On 2022-08-27 20:21, Ido Schimmel wrote:
On Fri, Aug 26, 2022 at 01:45:38PM +0200, Hans Schultz wrote:
+locked_port_mab() +{
- RET=0
- check_locked_port_support || return 0
- ping_do $h1 192.0.2.2
- check_err $? "MAB: Ping did not work before locking port"
- bridge link set dev $swp1 locked on
- bridge link set dev $swp1 learning on
"locked on learning on" is counter intuitive and IMO very much a misconfiguration that we should have disallowed when the "locked" option was introduced. It is my understanding that the only reason we are even talking about it is because mv88e6xxx needs it for MAB for some reason.
As the way mv88e6xxx implements "learning off" is to remove port association for ingress packets on a port, but that breaks many other things such as refreshing ATU entries and violation interrupts, so it is needed and the question is then what is the worst to have 'learning on' on a locked port or to have the locked port enabling learning in the driver silently?
Opinions seem to differ. Note that even on locked ports without MAB, port association on ingress is still needed in future as I have a dynamic ATU patch set coming, that uses age out violation and hardware refreshing to let the hardware keep the dynamic entries as long as the authorized station is sending, but will age the entry out if the station keeps silent for the ageing time. But that patch set is dependent on this patch set, and I don't think I can send it before this is accepted...
Can you explain how you envision user space to work once everything is merged? I want to make sure we have the full picture before more stuff is merged. From what you describe, I expect the following:
- Create topology, assuming two unauthorized ports:
# ip link add name br0 type bridge no_linklocal_learn 1 (*) # ip link set dev swp1 master br0 # ip link set dev swp2 master br0 # bridge link set dev swp1 learning on locked on # bridge link set dev swp2 learning on locked on
The final decision on this rests with you I would say. Actually I forgot to remove the port association in the driver in this version.
# ip link set dev swp1 up # ip link set dev swp2 up # ip link set dev br0 up
- Assuming h1 behind swp1 was authorized using 802.1X:
# bridge fdb replace $H1_MAC dev swp1 master dynamic
With the new MAB flag 'replace' is not needed when MAB is not enabled.
- Assuming 802.1X authentication failed for h2 behind swp2, enable
MAB:
# bridge link set dev swp2 mab on
- Assuming $H2_MAC is in our allow list:
# bridge fdb replace $H2_MAC dev swp2 master dynamic
Learning is on in order to refresh the dynamic entries that user space installed.
Yes, port association is needed for those reasons. :-)
(*) Need to add support for this option in iproute2. Already exposed over netlink (see 'IFLA_BR_MULTI_BOOLOPT').
Should I do that in this patch set?
On Mon, Aug 29, 2022 at 10:01:18AM +0200, netdev@kapio-technology.com wrote:
On 2022-08-29 09:40, Ido Schimmel wrote:
On Sun, Aug 28, 2022 at 02:00:29PM +0200, netdev@kapio-technology.com wrote:
On 2022-08-27 20:21, Ido Schimmel wrote:
On Fri, Aug 26, 2022 at 01:45:38PM +0200, Hans Schultz wrote:
+locked_port_mab() +{
- RET=0
- check_locked_port_support || return 0
- ping_do $h1 192.0.2.2
- check_err $? "MAB: Ping did not work before locking port"
- bridge link set dev $swp1 locked on
- bridge link set dev $swp1 learning on
"locked on learning on" is counter intuitive and IMO very much a misconfiguration that we should have disallowed when the "locked" option was introduced. It is my understanding that the only reason we are even talking about it is because mv88e6xxx needs it for MAB for some reason.
As the way mv88e6xxx implements "learning off" is to remove port association for ingress packets on a port, but that breaks many other things such as refreshing ATU entries and violation interrupts, so it is needed and the question is then what is the worst to have 'learning on' on a locked port or to have the locked port enabling learning in the driver silently?
Opinions seem to differ. Note that even on locked ports without MAB, port association on ingress is still needed in future as I have a dynamic ATU patch set coming, that uses age out violation and hardware refreshing to let the hardware keep the dynamic entries as long as the authorized station is sending, but will age the entry out if the station keeps silent for the ageing time. But that patch set is dependent on this patch set, and I don't think I can send it before this is accepted...
Can you explain how you envision user space to work once everything is merged? I want to make sure we have the full picture before more stuff is merged. From what you describe, I expect the following:
- Create topology, assuming two unauthorized ports:
# ip link add name br0 type bridge no_linklocal_learn 1 (*) # ip link set dev swp1 master br0 # ip link set dev swp2 master br0 # bridge link set dev swp1 learning on locked on # bridge link set dev swp2 learning on locked on
The final decision on this rests with you I would say.
If the requirement for this feature (with or without MAB) is to work with dynamic entries (which is not what is currently implemented in the selftests), then learning needs to be enabled for the sole reason of refreshing the dynamic entries added by user space. That is, updating 'fdb->updated' with current jiffies value.
So, is this the requirement? I checked the hostapd fork you posted some time ago and I get the impression that the answer is yes [1], but I want to verify I'm not missing something.
[1] https://github.com/westermo/hostapd/commit/95dc96f9e89131b2319f5eae8ae7ac998...
Actually I forgot to remove the port association in the driver in this version.
# ip link set dev swp1 up # ip link set dev swp2 up # ip link set dev br0 up
- Assuming h1 behind swp1 was authorized using 802.1X:
# bridge fdb replace $H1_MAC dev swp1 master dynamic
With the new MAB flag 'replace' is not needed when MAB is not enabled.
Yes, but replace works in both cases.
- Assuming 802.1X authentication failed for h2 behind swp2, enable MAB:
# bridge link set dev swp2 mab on
- Assuming $H2_MAC is in our allow list:
# bridge fdb replace $H2_MAC dev swp2 master dynamic
Learning is on in order to refresh the dynamic entries that user space installed.
Yes, port association is needed for those reasons. :-)
Given that the current tests use "static" entries that cannot age, is there a reason to have "learning on"?
(*) Need to add support for this option in iproute2. Already exposed over netlink (see 'IFLA_BR_MULTI_BOOLOPT').
Should I do that in this patch set?
No, I'm saying that this option is already exposed over netlink, but missing iproute2 support. No kernel changes needed.
On 2022-08-29 13:32, Ido Schimmel wrote:
The final decision on this rests with you I would say.
If the requirement for this feature (with or without MAB) is to work with dynamic entries (which is not what is currently implemented in the selftests), then learning needs to be enabled for the sole reason of refreshing the dynamic entries added by user space. That is, updating 'fdb->updated' with current jiffies value.
So, is this the requirement? I checked the hostapd fork you posted some time ago and I get the impression that the answer is yes [1], but I want to verify I'm not missing something.
[1] https://github.com/westermo/hostapd/commit/95dc96f9e89131b2319f5eae8ae7ac998...
I cannot say that it is a requirement with respect to the bridge implementation, but it is with the driver implementation. But you are right that it is to be used with dynamic entries.
# ip link set dev swp1 up # ip link set dev swp2 up # ip link set dev br0 up
- Assuming h1 behind swp1 was authorized using 802.1X:
# bridge fdb replace $H1_MAC dev swp1 master dynamic
With the new MAB flag 'replace' is not needed when MAB is not enabled.
Yes, but replace works in both cases.
Yes, of course.
- Assuming 802.1X authentication failed for h2 behind swp2, enable MAB:
# bridge link set dev swp2 mab on
- Assuming $H2_MAC is in our allow list:
# bridge fdb replace $H2_MAC dev swp2 master dynamic
Learning is on in order to refresh the dynamic entries that user space installed.
Yes, port association is needed for those reasons. :-)
Given that the current tests use "static" entries that cannot age, is there a reason to have "learning on"?
Port association is needed for MAB to work at all on mv88e6xxx, but for 802.1X port association is only needed for dynamic ATU entries.
(*) Need to add support for this option in iproute2. Already exposed over netlink (see 'IFLA_BR_MULTI_BOOLOPT').
Should I do that in this patch set?
No, I'm saying that this option is already exposed over netlink, but missing iproute2 support. No kernel changes needed.
Oh yes, I meant in the iproute2 accompanying patch set to this one?
On Mon, Aug 29, 2022 at 02:04:42PM +0200, netdev@kapio-technology.com wrote:
On 2022-08-29 13:32, Ido Schimmel wrote:
The final decision on this rests with you I would say.
If the requirement for this feature (with or without MAB) is to work with dynamic entries (which is not what is currently implemented in the selftests), then learning needs to be enabled for the sole reason of refreshing the dynamic entries added by user space. That is, updating 'fdb->updated' with current jiffies value.
So, is this the requirement? I checked the hostapd fork you posted some time ago and I get the impression that the answer is yes [1], but I want to verify I'm not missing something.
[1] https://github.com/westermo/hostapd/commit/95dc96f9e89131b2319f5eae8ae7ac998...
I cannot say that it is a requirement with respect to the bridge implementation, but it is with the driver implementation. But you are right that it is to be used with dynamic entries.
OK, so it's a requirement for both since we need both data paths to act the same.
[...]
Port association is needed for MAB to work at all on mv88e6xxx, but for 802.1X port association is only needed for dynamic ATU entries.
Ageing of dynamic entries in the bridge requires learning to be on as well, but in these test cases you are only using static entries and there is no reason to enable learning in the bridge for that. I prefer not to leak this mv88e6xxx implementation detail to user space and instead have the driver enable port association based on whether "learning" or "mab" is on.
[...]
Oh yes, I meant in the iproute2 accompanying patch set to this one?
You can send it as a standalone patch to iproute2-next: https://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git
Subject prefix should be "[PATCH iproute2-next]". See this commit for reference: https://git.kernel.org/pub/scm/network/iproute2/iproute2-next.git/commit/?id...
On 2022-08-29 16:37, Ido Schimmel wrote:
On Mon, Aug 29, 2022 at 02:04:42PM +0200, netdev@kapio-technology.com wrote:
On 2022-08-29 13:32, Ido Schimmel wrote: Port association is needed for MAB to work at all on mv88e6xxx, but for 802.1X port association is only needed for dynamic ATU entries.
Ageing of dynamic entries in the bridge requires learning to be on as well, but in these test cases you are only using static entries and there is no reason to enable learning in the bridge for that. I prefer not to leak this mv88e6xxx implementation detail to user space and instead have the driver enable port association based on whether "learning" or "mab" is on.
Then it makes most sense to have the mv88e6xxx driver enable port association when then port is locked, as it does now.
On Mon, Aug 29, 2022 at 05:08:23PM +0200, netdev@kapio-technology.com wrote:
On 2022-08-29 16:37, Ido Schimmel wrote:
On Mon, Aug 29, 2022 at 02:04:42PM +0200, netdev@kapio-technology.com wrote:
On 2022-08-29 13:32, Ido Schimmel wrote: Port association is needed for MAB to work at all on mv88e6xxx, but for 802.1X port association is only needed for dynamic ATU entries.
Ageing of dynamic entries in the bridge requires learning to be on as well, but in these test cases you are only using static entries and there is no reason to enable learning in the bridge for that. I prefer not to leak this mv88e6xxx implementation detail to user space and instead have the driver enable port association based on whether "learning" or "mab" is on.
Then it makes most sense to have the mv88e6xxx driver enable port association when then port is locked, as it does now.
As you wish, but like you wrote "802.1X port association is only needed for dynamic ATU entries" and in this case user space needs to enable learning (for refresh only) so you can really key off learning on "learning || mab". User space can decide to lock the port and work with static entries and then learning is not required.
On 2022-08-29 18:03, Ido Schimmel wrote:
On Mon, Aug 29, 2022 at 05:08:23PM +0200, netdev@kapio-technology.com wrote:
On 2022-08-29 16:37, Ido Schimmel wrote:
On Mon, Aug 29, 2022 at 02:04:42PM +0200, netdev@kapio-technology.com wrote:
On 2022-08-29 13:32, Ido Schimmel wrote: Port association is needed for MAB to work at all on mv88e6xxx, but for 802.1X port association is only needed for dynamic ATU entries.
Ageing of dynamic entries in the bridge requires learning to be on as well, but in these test cases you are only using static entries and there is no reason to enable learning in the bridge for that. I prefer not to leak this mv88e6xxx implementation detail to user space and instead have the driver enable port association based on whether "learning" or "mab" is on.
Then it makes most sense to have the mv88e6xxx driver enable port association when then port is locked, as it does now.
As you wish, but like you wrote "802.1X port association is only needed for dynamic ATU entries" and in this case user space needs to enable learning (for refresh only) so you can really key off learning on "learning || mab". User space can decide to lock the port and work with static entries and then learning is not required.
I will of course remove all "learning on" in the selftests, which is what I think you are referring to. In the previous I am referring to the code in the driver itself which I understand shall turn on port association with locked ports, e.g. no need for "learning on" when using the feature in general outside selftests...
On Mon, Aug 29, 2022 at 06:13:14PM +0200, netdev@kapio-technology.com wrote:
On 2022-08-29 18:03, Ido Schimmel wrote:
On Mon, Aug 29, 2022 at 05:08:23PM +0200, netdev@kapio-technology.com wrote:
On 2022-08-29 16:37, Ido Schimmel wrote:
On Mon, Aug 29, 2022 at 02:04:42PM +0200, netdev@kapio-technology.com wrote:
On 2022-08-29 13:32, Ido Schimmel wrote: Port association is needed for MAB to work at all on mv88e6xxx, but for 802.1X port association is only needed for dynamic ATU entries.
Ageing of dynamic entries in the bridge requires learning to be on as well, but in these test cases you are only using static entries and there is no reason to enable learning in the bridge for that. I prefer not to leak this mv88e6xxx implementation detail to user space and instead have the driver enable port association based on whether "learning" or "mab" is on.
Then it makes most sense to have the mv88e6xxx driver enable port association when then port is locked, as it does now.
As you wish, but like you wrote "802.1X port association is only needed for dynamic ATU entries" and in this case user space needs to enable learning (for refresh only) so you can really key off learning on "learning || mab". User space can decide to lock the port and work with static entries and then learning is not required.
I will of course remove all "learning on" in the selftests, which is what I think you are referring to. In the previous I am referring to the code in the driver itself which I understand shall turn on port association with locked ports, e.g. no need for "learning on" when using the feature in general outside selftests...
"learning on" is needed when dynamic FDB entries are used to authorize hosts. Without learning being enabled, the bridge driver (or the underlying hardware) will not refresh the entries during forwarding and they will age out, resulting in packet loss until the hosts are re-authorized.
Given the current test cases only use static entries, there is no need to enable learning on locked ports. This will change when test cases are added with dynamic entries.
Regarding mv88e6xxx, my understanding is that you also need learning enabled for MAB (I assume for the violation interrupts). Therefore, for mv88e6xxx, learning can be enabled if learning is on or MAB is on. Enabling it based on whether the port is locked or not seems inaccurate.
On 2022-09-03 16:47, Ido Schimmel wrote:
On Mon, Aug 29, 2022 at 06:13:14PM +0200, netdev@kapio-technology.com wrote:
On 2022-08-29 18:03, Ido Schimmel wrote:
On Mon, Aug 29, 2022 at 05:08:23PM +0200, netdev@kapio-technology.com wrote:
On 2022-08-29 16:37, Ido Schimmel wrote:
On Mon, Aug 29, 2022 at 02:04:42PM +0200, netdev@kapio-technology.com wrote:
On 2022-08-29 13:32, Ido Schimmel wrote: Port association is needed for MAB to work at all on mv88e6xxx, but for 802.1X port association is only needed for dynamic ATU entries.
Ageing of dynamic entries in the bridge requires learning to be on as well, but in these test cases you are only using static entries and there is no reason to enable learning in the bridge for that. I prefer not to leak this mv88e6xxx implementation detail to user space and instead have the driver enable port association based on whether "learning" or "mab" is on.
Then it makes most sense to have the mv88e6xxx driver enable port association when then port is locked, as it does now.
As you wish, but like you wrote "802.1X port association is only needed for dynamic ATU entries" and in this case user space needs to enable learning (for refresh only) so you can really key off learning on "learning || mab". User space can decide to lock the port and work with static entries and then learning is not required.
I will of course remove all "learning on" in the selftests, which is what I think you are referring to. In the previous I am referring to the code in the driver itself which I understand shall turn on port association with locked ports, e.g. no need for "learning on" when using the feature in general outside selftests...
"learning on" is needed when dynamic FDB entries are used to authorize hosts. Without learning being enabled, the bridge driver (or the underlying hardware) will not refresh the entries during forwarding and they will age out, resulting in packet loss until the hosts are re-authorized.
Given the current test cases only use static entries, there is no need to enable learning on locked ports. This will change when test cases are added with dynamic entries.
Regarding mv88e6xxx, my understanding is that you also need learning enabled for MAB (I assume for the violation interrupts). Therefore, for mv88e6xxx, learning can be enabled if learning is on or MAB is on. Enabling it based on whether the port is locked or not seems inaccurate.
Given that 'learning on' is needed for hardware refreshing of ATU entries (mv88e6xxx), and that will in the future be needed in general, I think it is best to enable it when a port is locked. Also the matter is that the locked feature needs to modify the register that contains the PAV. So I see it as natural that it is done there, as it will eventually have to be done there. That the selftests do not need it besides when activating MAB, I think, is a special case.
I am at the blackhole driver implementation now, as I suppose that the iproute2 command should work with the mv88e6xxx driver when adding blackhole entries (with a added selftest)? I decided to add the blackhole feature as new ops for drivers with functions blackhole_fdb_add() and blackhole_fdb_del(). Do you agree with that approach?
On Wed, Sep 07, 2022 at 11:10:07PM +0200, netdev@kapio-technology.com wrote:
I am at the blackhole driver implementation now, as I suppose that the iproute2 command should work with the mv88e6xxx driver when adding blackhole entries (with a added selftest)? I decided to add the blackhole feature as new ops for drivers with functions blackhole_fdb_add() and blackhole_fdb_del(). Do you agree with that approach?
I assume you are talking about extending 'dsa_switch_ops'? If so, it's up to the DSA maintainers to decide.
On 2022-09-08 09:59, Ido Schimmel wrote:
On Wed, Sep 07, 2022 at 11:10:07PM +0200, netdev@kapio-technology.com wrote:
I am at the blackhole driver implementation now, as I suppose that the iproute2 command should work with the mv88e6xxx driver when adding blackhole entries (with a added selftest)? I decided to add the blackhole feature as new ops for drivers with functions blackhole_fdb_add() and blackhole_fdb_del(). Do you agree with that approach?
I assume you are talking about extending 'dsa_switch_ops'?
Yes, that is the idea.
If so, it's up to the DSA maintainers to decide.
On Thu, Sep 08, 2022 at 01:14:59PM +0200, netdev@kapio-technology.com wrote:
On 2022-09-08 09:59, Ido Schimmel wrote:
On Wed, Sep 07, 2022 at 11:10:07PM +0200, netdev@kapio-technology.com wrote:
I am at the blackhole driver implementation now, as I suppose that the iproute2 command should work with the mv88e6xxx driver when adding blackhole entries (with a added selftest)? I decided to add the blackhole feature as new ops for drivers with functions blackhole_fdb_add() and blackhole_fdb_del(). Do you agree with that approach?
I assume you are talking about extending 'dsa_switch_ops'?
Yes, that is the idea.
If so, it's up to the DSA maintainers to decide.
What will be the usefulness of adding a blackhole FDB entry from user space?
On 2022-09-08 13:20, Vladimir Oltean wrote:
On Thu, Sep 08, 2022 at 01:14:59PM +0200, netdev@kapio-technology.com wrote:
On 2022-09-08 09:59, Ido Schimmel wrote:
On Wed, Sep 07, 2022 at 11:10:07PM +0200, netdev@kapio-technology.com wrote:
I am at the blackhole driver implementation now, as I suppose that the iproute2 command should work with the mv88e6xxx driver when adding blackhole entries (with a added selftest)? I decided to add the blackhole feature as new ops for drivers with functions blackhole_fdb_add() and blackhole_fdb_del(). Do you agree with that approach?
I assume you are talking about extending 'dsa_switch_ops'?
Yes, that is the idea.
If so, it's up to the DSA maintainers to decide.
What will be the usefulness of adding a blackhole FDB entry from user space?
With the software bridge it could be used to signal a untrusted host in connection with a locked port entry attempt. I don't see so much use other that test purposes with the driver though.
On Fri, Sep 09, 2022 at 03:11:56PM +0200, netdev@kapio-technology.com wrote:
On Wed, Sep 07, 2022 at 11:10:07PM +0200, netdev@kapio-technology.com wrote:
I am at the blackhole driver implementation now, as I suppose that the iproute2 command should work with the mv88e6xxx driver when adding blackhole entries (with a added selftest)? I decided to add the blackhole feature as new ops for drivers with functions blackhole_fdb_add() and blackhole_fdb_del(). Do you agree with that approach?
I assume you are talking about extending 'dsa_switch_ops'?
Yes, that is the idea.
If so, it's up to the DSA maintainers to decide.
What will be the usefulness of adding a blackhole FDB entry from user space?
With the software bridge it could be used to signal a untrusted host in connection with a locked port entry attempt. I don't see so much use other that test purposes with the driver though.
Not a huge selling point, to be honest. Can't the blackhole flag remain settable only in the device -> bridge direction, with user space just reading it?
On 2022-09-11 02:13, Vladimir Oltean wrote:
On Fri, Sep 09, 2022 at 03:11:56PM +0200, netdev@kapio-technology.com wrote:
On Wed, Sep 07, 2022 at 11:10:07PM +0200, netdev@kapio-technology.com wrote:
I am at the blackhole driver implementation now, as I suppose that the iproute2 command should work with the mv88e6xxx driver when adding blackhole entries (with a added selftest)? I decided to add the blackhole feature as new ops for drivers with functions blackhole_fdb_add() and blackhole_fdb_del(). Do you agree with that approach?
I assume you are talking about extending 'dsa_switch_ops'?
Yes, that is the idea.
If so, it's up to the DSA maintainers to decide.
What will be the usefulness of adding a blackhole FDB entry from user space?
With the software bridge it could be used to signal a untrusted host in connection with a locked port entry attempt. I don't see so much use other that test purposes with the driver though.
Not a huge selling point, to be honest. Can't the blackhole flag remain settable only in the device -> bridge direction, with user space just reading it?
That is possible, but it would of course not make sense to have selftests of the feature as that would not work unless there is a driver with this capability (now just mv88e6xxx).
On Sun, Sep 11, 2022 at 11:23:55AM +0200, netdev@kapio-technology.com wrote:
On 2022-09-11 02:13, Vladimir Oltean wrote:
On Fri, Sep 09, 2022 at 03:11:56PM +0200, netdev@kapio-technology.com wrote:
On Wed, Sep 07, 2022 at 11:10:07PM +0200, netdev@kapio-technology.com wrote: > I am at the blackhole driver implementation now, as I suppose that the > iproute2 command should work with the mv88e6xxx driver when adding blackhole > entries (with a added selftest)? > I decided to add the blackhole feature as new ops for drivers with functions > blackhole_fdb_add() and blackhole_fdb_del(). Do you agree with that approach?
I assume you are talking about extending 'dsa_switch_ops'?
Yes, that is the idea.
If so, it's up to the DSA maintainers to decide.
What will be the usefulness of adding a blackhole FDB entry from user space?
With the software bridge it could be used to signal a untrusted host in connection with a locked port entry attempt. I don't see so much use other that test purposes with the driver though.
Not a huge selling point, to be honest. Can't the blackhole flag remain settable only in the device -> bridge direction, with user space just reading it?
That is possible, but it would of course not make sense to have selftests of the feature as that would not work unless there is a driver with this capability (now just mv88e6xxx).
The new "blackhole" flag requires changes in the bridge driver and without allowing user space to add such entries, the only way to test these changes is with mv88e6xxx which many of us do not have...
On 2022-09-12 11:08, Ido Schimmel wrote:
On Sun, Sep 11, 2022 at 11:23:55AM +0200, netdev@kapio-technology.com wrote:
On 2022-09-11 02:13, Vladimir Oltean wrote:
On Fri, Sep 09, 2022 at 03:11:56PM +0200, netdev@kapio-technology.com wrote:
> On Wed, Sep 07, 2022 at 11:10:07PM +0200, netdev@kapio-technology.com wrote: > > I am at the blackhole driver implementation now, as I suppose that the > > iproute2 command should work with the mv88e6xxx driver when adding blackhole > > entries (with a added selftest)? > > I decided to add the blackhole feature as new ops for drivers with functions > > blackhole_fdb_add() and blackhole_fdb_del(). Do you agree with that approach? > > I assume you are talking about extending 'dsa_switch_ops'?
Yes, that is the idea.
> If so, it's up to the DSA maintainers to decide.
What will be the usefulness of adding a blackhole FDB entry from user space?
With the software bridge it could be used to signal a untrusted host in connection with a locked port entry attempt. I don't see so much use other that test purposes with the driver though.
Not a huge selling point, to be honest. Can't the blackhole flag remain settable only in the device -> bridge direction, with user space just reading it?
That is possible, but it would of course not make sense to have selftests of the feature as that would not work unless there is a driver with this capability (now just mv88e6xxx).
The new "blackhole" flag requires changes in the bridge driver and without allowing user space to add such entries, the only way to test these changes is with mv88e6xxx which many of us do not have...
I am now building from new system (comp), and the kernel selftests are not being installed correctly, so I haven't been able to run the selftests yet.
I have made a blackhole selftest, which looks like this:
test_blackhole_fdb() { RET=0
check_blackhole_fdb_support || return 0
tcpdump_start $h2 $MZ $h1 -q -t udp -a $h1 -b $h2 tcpdump_stop tcpdump_show | grep -q udp check_err $? "test_blackhole_fdb: No packet seen on initial" tcpdump_cleanup
bridge fdb add `mac_get $h2` dev br0 blackhole bridge fdb show dev br0 | grep -q "blackhole" check_err $? "test_blackhole_fdb: No blackhole FDB entry found"
tcpdump_start $h2 $MZ $h1 -q -t udp -a $h1 -b $h2 tcpdump_stop tcpdump_show | grep -q udp check_fail $? "test_blackhole_fdb: packet seen with blackhole fdb entry" tcpdump_cleanup
bridge fdb del `mac_get $h2` dev br0 blackhole bridge fdb show dev br0 | grep -q "blackhole" check_fail $? "test_blackhole_fdb: Blackhole FDB entry not deleted"
tcpdump_start $h2 $MZ $h1 -q -t udp -a $h1 -b $h2 tcpdump_stop tcpdump_show | grep -q udp check_err $? "test_blackhole_fdb: No packet seen after removing blackhole FDB entry" tcpdump_cleanup
log_test "Blackhole FDB entry test" }
the setup is simple and is the same as in bridge_sticky_fdb.sh.
Does the test look sound or is there obvious mistakes?
On Tue, Sep 20, 2022 at 11:29:12PM +0200, netdev@kapio-technology.com wrote:
I have made a blackhole selftest, which looks like this:
test_blackhole_fdb() { RET=0
check_blackhole_fdb_support || return 0 tcpdump_start $h2 $MZ $h1 -q -t udp -a $h1 -b $h2
I don't think you can give an interface name to '-a' and '-b'?
tcpdump_stop tcpdump_show | grep -q udp check_err $? "test_blackhole_fdb: No packet seen on initial" tcpdump_cleanup bridge fdb add `mac_get $h2` dev br0 blackhole bridge fdb show dev br0 | grep -q "blackhole"
Make this grep more specific so that we are sure it is the entry user space installed. Something like this:
bridge fdb get `mac_get $h2` br br0 | grep -q blackhole
check_err $? "test_blackhole_fdb: No blackhole FDB entry found" tcpdump_start $h2 $MZ $h1 -q -t udp -a $h1 -b $h2 tcpdump_stop tcpdump_show | grep -q udp check_fail $? "test_blackhole_fdb: packet seen with blackhole fdb
entry" tcpdump_cleanup
The tcpdump filter is not specific enough. It can catch other UDP packets (e.g., multicast) being received by $h2. Anyway, to be sure the feature works as expected we need to make sure that the packets are not even egressing $swp2. Checking that they are not received by $h2 is not enough. See this (untested) suggestion [1] that uses a tc filter on the egress of $swp2.
bridge fdb del `mac_get $h2` dev br0 blackhole bridge fdb show dev br0 | grep -q "blackhole" check_fail $? "test_blackhole_fdb: Blackhole FDB entry not deleted" tcpdump_start $h2 $MZ $h1 -q -t udp -a $h1 -b $h2 tcpdump_stop tcpdump_show | grep -q udp check_err $? "test_blackhole_fdb: No packet seen after removing
blackhole FDB entry" tcpdump_cleanup
log_test "Blackhole FDB entry test"
}
the setup is simple and is the same as in bridge_sticky_fdb.sh.
Does the test look sound or is there obvious mistakes?
[1] blackhole_fdb() { RET=0
tc filter add dev $swp2 egress protocol ip pref 1 handle 1 flower \ dst_ip 192.0.2.2 ip_proto udp dst_port 12345 action pass
$MZ $h1 -c 1 -p 128 -t udp "sp=54321,dp=12345" \ -a own -b `mac_get $h2` -A 192.0.2.1 -B 192.0.2.2 -q
tc_check_packets "dev $swp2 egress" 1 1 check_err $? "Packet not seen on egress before adding blackhole entry"
bridge fdb add `mac_get $h2` dev br0 blackhole bridge fdb get `mac_get $h2` br br0 | grep -q blackhole check_err $? "Blackhole entry not found"
$MZ $h1 -c 1 -p 128 -t udp "sp=54321,dp=12345" \ -a own -b `mac_get $h2` -A 192.0.2.1 -B 192.0.2.2 -q
tc_check_packets "dev $swp2 egress" 1 1 check_err $? "Packet seen on egress after adding blackhole entry"
# Check blackhole entries can be replaced. bridge fdb replace `mac_get $h2` dev $swp2 master static bridge fdb get `mac_get $h2` br br0 | grep -q blackhole check_fail $? "Blackhole entry found after replacement"
$MZ $h1 -c 1 -p 128 -t udp "sp=54321,dp=12345" \ -a own -b `mac_get $h2` -A 192.0.2.1 -B 192.0.2.2 -q
tc_check_packets "dev $swp2 egress" 1 2 check_err $? "Packet not seen on egress after replacing blackhole entry"
bridge fdb del `mac_get $h2` dev $swp2 master static tc filter del dev $swp2 egress protocol ip pref 1 handle 1 flower
log_test "Blackhole FDB entry" }
On 2022-09-21 09:15, Ido Schimmel wrote:
On Tue, Sep 20, 2022 at 11:29:12PM +0200, netdev@kapio-technology.com wrote:
I have made a blackhole selftest, which looks like this:
test_blackhole_fdb() { RET=0
check_blackhole_fdb_support || return 0 tcpdump_start $h2 $MZ $h1 -q -t udp -a $h1 -b $h2
I don't think you can give an interface name to '-a' and '-b'?
tcpdump_stop tcpdump_show | grep -q udp check_err $? "test_blackhole_fdb: No packet seen on initial" tcpdump_cleanup bridge fdb add `mac_get $h2` dev br0 blackhole bridge fdb show dev br0 | grep -q "blackhole"
Make this grep more specific so that we are sure it is the entry user space installed. Something like this:
bridge fdb get `mac_get $h2` br br0 | grep -q blackhole
check_err $? "test_blackhole_fdb: No blackhole FDB entry
found"
tcpdump_start $h2 $MZ $h1 -q -t udp -a $h1 -b $h2 tcpdump_stop tcpdump_show | grep -q udp check_fail $? "test_blackhole_fdb: packet seen with blackhole
fdb entry" tcpdump_cleanup
The tcpdump filter is not specific enough. It can catch other UDP packets (e.g., multicast) being received by $h2. Anyway, to be sure the feature works as expected we need to make sure that the packets are not even egressing $swp2. Checking that they are not received by $h2 is not enough. See this (untested) suggestion [1] that uses a tc filter on the egress of $swp2.
bridge fdb del `mac_get $h2` dev br0 blackhole bridge fdb show dev br0 | grep -q "blackhole" check_fail $? "test_blackhole_fdb: Blackhole FDB entry not
deleted"
tcpdump_start $h2 $MZ $h1 -q -t udp -a $h1 -b $h2 tcpdump_stop tcpdump_show | grep -q udp check_err $? "test_blackhole_fdb: No packet seen after
removing blackhole FDB entry" tcpdump_cleanup
log_test "Blackhole FDB entry test"
}
the setup is simple and is the same as in bridge_sticky_fdb.sh.
Does the test look sound or is there obvious mistakes?
[1] blackhole_fdb() { RET=0
tc filter add dev $swp2 egress protocol ip pref 1 handle 1 flower \ dst_ip 192.0.2.2 ip_proto udp dst_port 12345 action pass
$MZ $h1 -c 1 -p 128 -t udp "sp=54321,dp=12345" \ -a own -b `mac_get $h2` -A 192.0.2.1 -B 192.0.2.2 -q
tc_check_packets "dev $swp2 egress" 1 1 check_err $? "Packet not seen on egress before adding blackhole entry"
bridge fdb add `mac_get $h2` dev br0 blackhole bridge fdb get `mac_get $h2` br br0 | grep -q blackhole check_err $? "Blackhole entry not found"
$MZ $h1 -c 1 -p 128 -t udp "sp=54321,dp=12345" \ -a own -b `mac_get $h2` -A 192.0.2.1 -B 192.0.2.2 -q
tc_check_packets "dev $swp2 egress" 1 1 check_err $? "Packet seen on egress after adding blackhole entry"
# Check blackhole entries can be replaced. bridge fdb replace `mac_get $h2` dev $swp2 master static bridge fdb get `mac_get $h2` br br0 | grep -q blackhole check_fail $? "Blackhole entry found after replacement"
$MZ $h1 -c 1 -p 128 -t udp "sp=54321,dp=12345" \ -a own -b `mac_get $h2` -A 192.0.2.1 -B 192.0.2.2 -q
tc_check_packets "dev $swp2 egress" 1 2 check_err $? "Packet not seen on egress after replacing blackhole entry"
bridge fdb del `mac_get $h2` dev $swp2 master static tc filter del dev $swp2 egress protocol ip pref 1 handle 1 flower
log_test "Blackhole FDB entry" }
Thx, looks good. I have tried to run the test as far as I can manually, but I don't seem to have 'busywait' in the system, which tc_check_packets() depends on, and I couldn't find any 'busywait' in Buildroot.
netdev@kapio-technology.com writes:
Thx, looks good. I have tried to run the test as far as I can manually, but I don't seem to have 'busywait' in the system, which tc_check_packets() depends on, and I couldn't find any 'busywait' in Buildroot.
It's a helper defined in tools/testing/selftests/net/forwarding/lib.sh
On 2022-09-21 09:15, Ido Schimmel wrote:
# Check blackhole entries can be replaced. bridge fdb replace `mac_get $h2` dev $swp2 master static bridge fdb get `mac_get $h2` br br0 | grep -q blackhole check_fail $? "Blackhole entry found after replacement"
There seems to be a problem with replacing blackhole fdb entries as fdb_find_rcu() does not find the associated fdb entry (addr, vid) and I don't know why that is the case?
On 2022-09-23 13:34, netdev@kapio-technology.com wrote:
On 2022-09-21 09:15, Ido Schimmel wrote:
# Check blackhole entries can be replaced. bridge fdb replace `mac_get $h2` dev $swp2 master static bridge fdb get `mac_get $h2` br br0 | grep -q blackhole check_fail $? "Blackhole entry found after replacement"
There seems to be a problem with replacing blackhole fdb entries as fdb_find_rcu() does not find the associated fdb entry (addr, vid) and I don't know why that is the case?
I realize that the reason why fdb_find_rcu() does not find the entry is surely that it is stored in the bridge device fdb and not 'master' fdb.
On 2022-09-21 09:15, Ido Schimmel wrote:
# Check blackhole entries can be replaced. bridge fdb replace `mac_get $h2` dev $swp2 master static bridge fdb get `mac_get $h2` br br0 | grep -q blackhole check_fail $? "Blackhole entry found after replacement"
I am quite in doubt if the driver will be able to overwrite a blackhole entry added by userspace as the replace action must be to delete and then add the replacement afaics, but a NEWNEIGH event using port_fdb_add() will not succeed with that using the ops I use now. Otherwise it has be all with port_fb_add() with new lists keeping the userspace added blackhole fdb entries.
On 2022-09-21 09:15, Ido Schimmel wrote:
bridge fdb add `mac_get $h2` dev br0 blackhole
To make this work, I think we need to change the concept, so that blackhole FDB entries are added to ports connected to the bridge, thus bridge fdb add MAC dev $swpX master blackhole
This makes sense as the driver adds them based on the port where the SMAC is seen, even though the effect of the blackhole FDB entry is switch wide.
Adding them to the bridge (e.g. f.ex. br0) will not work in the SW bridge as the entries then are not found. We could deny this possibility or just document the use?
For offloaded I can change the add, so that it does a delete (even if none are present) and a add, thus facilitating the replace.
How does this sound?
Sorry for the delay, was away.
On Tue, Sep 27, 2022 at 10:33:10AM +0200, netdev@kapio-technology.com wrote:
On 2022-09-21 09:15, Ido Schimmel wrote:
bridge fdb add `mac_get $h2` dev br0 blackhole
To make this work, I think we need to change the concept, so that blackhole FDB entries are added to ports connected to the bridge, thus bridge fdb add MAC dev $swpX master blackhole
This makes sense as the driver adds them based on the port where the SMAC is seen, even though the effect of the blackhole FDB entry is switch wide.
Asking user space to associate a blackhole entry with a bridge port does not make sense to me because unlike regular entries, blackhole entries do not forward packets out of this port. Blackhole routes and nexthops are not associated with a device either.
Adding them to the bridge (e.g. f.ex. br0) will not work in the SW bridge as the entries then are not found.
Why not found? This works:
# bridge fdb add 00:11:22:33:44:55 dev br0 self local $ bridge fdb get 00:11:22:33:44:55 br br0 00:11:22:33:44:55 dev br0 master br0 permanent
With blackhole support I expect:
# bridge fdb add 00:11:22:33:44:55 dev br0 self local blackhole $ bridge fdb get 00:11:22:33:44:55 br br0 00:11:22:33:44:55 dev br0 master br0 permanent blackhole
On 2022-09-28 08:59, Ido Schimmel wrote:
Sorry for the delay, was away.
Good to have you back. :-)
On Tue, Sep 27, 2022 at 10:33:10AM +0200, netdev@kapio-technology.com wrote:
On 2022-09-21 09:15, Ido Schimmel wrote:
bridge fdb add `mac_get $h2` dev br0 blackhole
To make this work, I think we need to change the concept, so that blackhole FDB entries are added to ports connected to the bridge, thus bridge fdb add MAC dev $swpX master blackhole
This makes sense as the driver adds them based on the port where the SMAC is seen, even though the effect of the blackhole FDB entry is switch wide.
Asking user space to associate a blackhole entry with a bridge port does not make sense to me because unlike regular entries, blackhole entries do not forward packets out of this port. Blackhole routes and nexthops are not associated with a device either.
Adding them to the bridge (e.g. f.ex. br0) will not work in the SW bridge as the entries then are not found.
Why not found? This works:
# bridge fdb add 00:11:22:33:44:55 dev br0 self local $ bridge fdb get 00:11:22:33:44:55 br br0 00:11:22:33:44:55 dev br0 master br0 permanent
With blackhole support I expect:
# bridge fdb add 00:11:22:33:44:55 dev br0 self local blackhole $ bridge fdb get 00:11:22:33:44:55 br br0 00:11:22:33:44:55 dev br0 master br0 permanent blackhole
In my previous replies, I have notified that fdb_find_rcu() does not find the entry added with br0, and thus fdb_add_entry() that does the replace does not replace but adds a new entry. I have been thinking that it is because when added with br0 as dev it is added to dev br0's fdb, which is not the same as 'dev <Dev> master' fdb...
I think bridge fdb get works in a different way, as I know the get functionality gets all fdb entries from all devices and filters them (if I am not mistaken)...
On 2022-09-28 08:59, Ido Schimmel wrote:
Why not found? This works:
# bridge fdb add 00:11:22:33:44:55 dev br0 self local $ bridge fdb get 00:11:22:33:44:55 br br0
With: # bridge fdb replace 00.11.22.33.44.55 dev $swpX static
fdb_find_rcu() will not find the entry added with 'dev br0' above, and will thus add a new entry afaik.
On Wed, Sep 28, 2022 at 09:47:42AM +0200, netdev@kapio-technology.com wrote:
On 2022-09-28 08:59, Ido Schimmel wrote:
Why not found? This works:
# bridge fdb add 00:11:22:33:44:55 dev br0 self local $ bridge fdb get 00:11:22:33:44:55 br br0
With: # bridge fdb replace 00.11.22.33.44.55 dev $swpX static
fdb_find_rcu() will not find the entry added with 'dev br0' above, and will thus add a new entry afaik.
It needs "master" keyword:
$ bridge fdb get 00:11:22:33:44:55 br br0 Error: Fdb entry not found. # bridge fdb add 00:11:22:33:44:55 dev br0 self local $ bridge fdb get 00:11:22:33:44:55 br br0 00:11:22:33:44:55 dev br0 master br0 permanent # bridge fdb replace 00:11:22:33:44:55 dev dummy10 master static $ bridge fdb get 00:11:22:33:44:55 br br0 00:11:22:33:44:55 dev dummy10 master br0 static
"master" means manipulate the FDB of the master device. Therefore, the replace command manipulates the FDB of br0.
"self" (which is the default [1]) means manipulate the FDB of the device itself. In case of br0 it means manipulate the FDB of the bridge device. For physical devices it usually translates to manipulating the unicast address filter list.
[1] https://git.kernel.org/pub/scm/network/iproute2/iproute2.git/tree/bridge/fdb...
On 2022-09-28 10:46, Ido Schimmel wrote:
On Wed, Sep 28, 2022 at 09:47:42AM +0200, netdev@kapio-technology.com wrote:
It needs "master" keyword:
I must have forgotten it when I tested at some point...
It seems to be working, only that I am at a loss to why port_fdb_add() is called for each port after going through the DSA layer. I understand that all slave devices are listening on events, but I think the ops should only be called when the port matches? Also for some reason the blackhole (zero-DPV) ATU entry is not added on my device in case of no vlan (vid = 0), but there is also some phy problems that are unrelated to this patch set.
On 2022-09-28 10:46, Ido Schimmel wrote:
On Wed, Sep 28, 2022 at 09:47:42AM +0200, netdev@kapio-technology.com wrote:
On 2022-09-28 08:59, Ido Schimmel wrote:
BTW, I have added FDB flags in the DSA layer as a u16, so that now port_fdb_add() is as:
int (*port_fdb_add)(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, u16 fdb_flags, struct dsa_db db);
On 2022-09-28 10:46, Ido Schimmel wrote:
"master" means manipulate the FDB of the master device. Therefore, the replace command manipulates the FDB of br0.
"self" (which is the default [1]) means manipulate the FDB of the device itself. In case of br0 it means manipulate the FDB of the bridge device. For physical devices it usually translates to manipulating the unicast address filter list.
Hi Ido, can you check the selftests of the v6 I have sent out using the iproute2-next I have also sent?
On 2022-09-12 11:08, Ido Schimmel wrote:
The new "blackhole" flag requires changes in the bridge driver and without allowing user space to add such entries, the only way to test these changes is with mv88e6xxx which many of us do not have...
There seems to be a little inconvenience when adding/deleting blackhole entries, and that is since all slaves are the listeners to SWITCHDEV_FDB_ADD(DEL)_TO_DEVICE events and blackhole entries are not to any slave devices, the ops will be called for every slave device as there is no way to distinguish. This said, the add and del operations work.
On 2022-08-29 09:40, Ido Schimmel wrote:
On Sun, Aug 28, 2022 at 02:00:29PM +0200, netdev@kapio-technology.com wrote:
On 2022-08-27 20:21, Ido Schimmel wrote:
"locked on learning on" is counter intuitive and IMO very much a misconfiguration that we should have disallowed when the "locked" option was introduced. It is my understanding that the only reason we are even talking about it is because mv88e6xxx needs it for MAB for some reason.
As the way mv88e6xxx implements "learning off" is to remove port association for ingress packets on a port, but that breaks many other things such as refreshing ATU entries and violation interrupts, so it is needed and the question is then what is the worst to have 'learning on' on a locked port or to have the locked port enabling learning in the driver silently?
Opinions seem to differ. Note that even on locked ports without MAB, port association on ingress is still needed in future as I have a dynamic ATU patch set coming, that uses age out violation and hardware refreshing to let the hardware keep the dynamic entries as long as the authorized station is sending, but will age the entry out if the station keeps silent for the ageing time. But that patch set is dependent on this patch set, and I don't think I can send it before this is accepted...
# bridge link set dev swp1 learning on locked on # bridge link set dev swp2 learning on locked on
As we must think in how most drivers work, which I am not knowledgeable of, I think that it is probably the best to think of the way mv88e6xxx works as an outlier. If that is true, then I think the best option is to go with: #bridge link set dev $swp1 learning off locked on #bridge link set dev $swp2 learning off locked on
Then the cleanup side will just be: #bridge link set dev $swp1 locked off #bridge link set dev $swp2 locked off
The state 'learning off' is then consistent with the behavior of both the bridge and driver after the cleanup.
On 2022-08-27 20:21, Ido Schimmel wrote:
On Fri, Aug 26, 2022 at 01:45:38PM +0200, Hans Schultz wrote:
- $MZ $h2 -q -t udp -a $mac -b rand
- bridge fdb show dev $swp2 | grep -q "$mac vlan 1 master br0"
- check_err $? "Locked port station move: Entry not found on unlocked
port"
Looks like this is going to fail with offloaded data path as according to fdb_print_flags() in iproute2 both the "extern_learn" and "offload" flags will be printed before "master".
The output shows like: 74:e1:e1:2c:4f:18 dev eth8 vlan 1 master br0 extern_learn offload sticky locked blackhole
"sticky" will of course become "permanent", but I can still make it more resilient by piping grep.
I suppose that I will keep the "sticky_no_roaming" test even though it is not really needed here anymore?
On Mon, Aug 29, 2022 at 06:07:59PM +0200, netdev@kapio-technology.com wrote:
On 2022-08-27 20:21, Ido Schimmel wrote:
On Fri, Aug 26, 2022 at 01:45:38PM +0200, Hans Schultz wrote:
- $MZ $h2 -q -t udp -a $mac -b rand
- bridge fdb show dev $swp2 | grep -q "$mac vlan 1 master br0"
- check_err $? "Locked port station move: Entry not found on
unlocked port"
Looks like this is going to fail with offloaded data path as according to fdb_print_flags() in iproute2 both the "extern_learn" and "offload" flags will be printed before "master".
The output shows like: 74:e1:e1:2c:4f:18 dev eth8 vlan 1 master br0 extern_learn offload sticky locked blackhole
"sticky" will of course become "permanent", but I can still make it more resilient by piping grep.
OK.
I suppose that I will keep the "sticky_no_roaming" test even though it is not really needed here anymore?
You can send it separately if you want.
linux-kselftest-mirror@lists.linaro.org