This patch set extends the locked port feature for devices that are behind a locked port, but do not have the ability to authorize themselves as a supplicant using IEEE 802.1X. Such devices can be printers, meters or anything related to fixed installations. Instead of 802.1X authorization, devices can get access based on their MAC addresses being whitelisted.
For an authorization daemon to detect that a device is trying to get access through a locked port, the bridge will add the MAC address of the device to the FDB with a locked flag to it. Thus the authorization daemon can catch the FDB add event and check if the MAC address is in the whitelist and if so replace the FDB entry without the locked flag enabled, and thus open the port for the device.
This feature is known as MAC-Auth or MAC Authentication Bypass (MAB) in Cisco terminology, where the full MAB concept involves additional Cisco infrastructure for authorization. There is no real authentication process, as the MAC address of the device is the only input the authorization daemon, in the general case, has to base the decision if to unlock the port or not.
With this patch set, an implementation of the offloaded case is supplied for the mv88e6xxx driver. When a packet ingresses on a locked port, an ATU miss violation event will occur. When handling such ATU miss violation interrupts, the MAC address of the device is added to the FDB with a zero destination port vector (DPV) and the MAC address is communicated through the switchdev layer to the bridge, so that a FDB entry with the locked flag enabled can be added.
Log: v3: Added timers and lists in the driver (mv88e6xxx) to keep track of and remove locked entries.
v4: Leave out enforcing a limit to the number of locked entries in the bridge. Removed the timers in the driver and use the worker only. Add locked FDB flag to all drivers using port_fdb_add() from the dsa api and let all drivers ignore entries with this flag set. Change how to get the ageing timeout of locked entries. See global1_atu.c and switchdev.c. Use struct mv88e6xxx_port for locked entries variables instead of struct dsa_port.
Hans Schultz (6): net: bridge: add locked entry fdb flag to extend locked port feature net: switchdev: add support for offloading of fdb locked flag drivers: net: dsa: add locked fdb entry flag to drivers net: dsa: mv88e6xxx: allow reading FID when handling ATU violations net: dsa: mv88e6xxx: mac-auth/MAB implementation selftests: forwarding: add test of MAC-Auth Bypass to locked port tests
drivers/net/dsa/b53/b53_common.c | 5 + drivers/net/dsa/b53/b53_priv.h | 1 + drivers/net/dsa/hirschmann/hellcreek.c | 5 + drivers/net/dsa/lan9303-core.c | 5 + drivers/net/dsa/lantiq_gswip.c | 5 + drivers/net/dsa/microchip/ksz9477.c | 5 + drivers/net/dsa/mt7530.c | 5 + drivers/net/dsa/mv88e6xxx/Makefile | 1 + drivers/net/dsa/mv88e6xxx/chip.c | 54 +++- drivers/net/dsa/mv88e6xxx/chip.h | 15 + drivers/net/dsa/mv88e6xxx/global1.h | 1 + drivers/net/dsa/mv88e6xxx/global1_atu.c | 32 +- drivers/net/dsa/mv88e6xxx/port.c | 30 +- drivers/net/dsa/mv88e6xxx/port.h | 2 + drivers/net/dsa/mv88e6xxx/switchdev.c | 280 ++++++++++++++++++ drivers/net/dsa/mv88e6xxx/switchdev.h | 37 +++ drivers/net/dsa/ocelot/felix.c | 5 + drivers/net/dsa/qca8k.c | 5 + drivers/net/dsa/sja1105/sja1105_main.c | 5 + include/net/dsa.h | 7 + include/net/switchdev.h | 1 + include/uapi/linux/neighbour.h | 1 + net/bridge/br.c | 3 +- net/bridge/br_fdb.c | 19 +- net/bridge/br_input.c | 10 +- net/bridge/br_private.h | 5 +- net/bridge/br_switchdev.c | 1 + net/dsa/dsa_priv.h | 4 +- net/dsa/port.c | 7 +- net/dsa/slave.c | 4 +- net/dsa/switch.c | 10 +- .../net/forwarding/bridge_locked_port.sh | 30 +- 32 files changed, 566 insertions(+), 34 deletions(-) create mode 100644 drivers/net/dsa/mv88e6xxx/switchdev.c create mode 100644 drivers/net/dsa/mv88e6xxx/switchdev.h
Add an intermediate state for clients behind a locked port to allow for possible opening of the port for said clients. The clients mac address will be added with the locked flag set, denying access through the port for the mac address, but also creating a new FDB add event giving userspace daemons the ability to unlock the mac address. This feature corresponds to the Mac-Auth and MAC Authentication Bypass (MAB) named features. The latter defined by Cisco.
Only the kernel can set this FDB entry flag, while userspace can read the flag and remove it by replacing or deleting the FDB entry.
Signed-off-by: Hans Schultz netdev@kapio-technology.com --- include/uapi/linux/neighbour.h | 1 + net/bridge/br_fdb.c | 12 ++++++++++++ net/bridge/br_input.c | 10 +++++++++- net/bridge/br_private.h | 3 ++- 4 files changed, 24 insertions(+), 2 deletions(-)
diff --git a/include/uapi/linux/neighbour.h b/include/uapi/linux/neighbour.h index 39c565e460c7..76d65b481086 100644 --- a/include/uapi/linux/neighbour.h +++ b/include/uapi/linux/neighbour.h @@ -53,6 +53,7 @@ enum { #define NTF_ROUTER (1 << 7) /* Extended flags under NDA_FLAGS_EXT: */ #define NTF_EXT_MANAGED (1 << 0) +#define NTF_EXT_LOCKED (1 << 1)
/* * Neighbor Cache Entry States. diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c index e7f4fccb6adb..ee9064a536ae 100644 --- a/net/bridge/br_fdb.c +++ b/net/bridge/br_fdb.c @@ -105,6 +105,7 @@ static int fdb_fill_info(struct sk_buff *skb, const struct net_bridge *br, struct nda_cacheinfo ci; struct nlmsghdr *nlh; struct ndmsg *ndm; + u32 ext_flags = 0;
nlh = nlmsg_put(skb, portid, seq, type, sizeof(*ndm), flags); if (nlh == NULL) @@ -125,11 +126,16 @@ static int fdb_fill_info(struct sk_buff *skb, const struct net_bridge *br, ndm->ndm_flags |= NTF_EXT_LEARNED; if (test_bit(BR_FDB_STICKY, &fdb->flags)) ndm->ndm_flags |= NTF_STICKY; + if (test_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags)) + ext_flags |= NTF_EXT_LOCKED;
if (nla_put(skb, NDA_LLADDR, ETH_ALEN, &fdb->key.addr)) goto nla_put_failure; if (nla_put_u32(skb, NDA_MASTER, br->dev->ifindex)) goto nla_put_failure; + if (nla_put_u32(skb, NDA_FLAGS_EXT, ext_flags)) + goto nla_put_failure; + ci.ndm_used = jiffies_to_clock_t(now - fdb->used); ci.ndm_confirmed = 0; ci.ndm_updated = jiffies_to_clock_t(now - fdb->updated); @@ -171,6 +177,7 @@ static inline size_t fdb_nlmsg_size(void) return NLMSG_ALIGN(sizeof(struct ndmsg)) + nla_total_size(ETH_ALEN) /* NDA_LLADDR */ + nla_total_size(sizeof(u32)) /* NDA_MASTER */ + + nla_total_size(sizeof(u32)) /* NDA_FLAGS_EXT */ + nla_total_size(sizeof(u16)) /* NDA_VLAN */ + nla_total_size(sizeof(struct nda_cacheinfo)) + nla_total_size(0) /* NDA_FDB_EXT_ATTRS */ @@ -1082,6 +1089,11 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source, modified = true; }
+ if (test_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags)) { + clear_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags); + modified = true; + } + if (fdb_handle_notify(fdb, notify)) modified = true;
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c index 68b3e850bcb9..3d15548cfda6 100644 --- a/net/bridge/br_input.c +++ b/net/bridge/br_input.c @@ -110,8 +110,16 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb br_fdb_find_rcu(br, eth_hdr(skb)->h_source, vid);
if (!fdb_src || READ_ONCE(fdb_src->dst) != p || - test_bit(BR_FDB_LOCAL, &fdb_src->flags)) + test_bit(BR_FDB_LOCAL, &fdb_src->flags) || + test_bit(BR_FDB_ENTRY_LOCKED, &fdb_src->flags)) { + if (!fdb_src) { + unsigned long flags = 0; + + __set_bit(BR_FDB_ENTRY_LOCKED, &flags); + br_fdb_update(br, p, eth_hdr(skb)->h_source, vid, flags); + } goto drop; + } }
nbp_switchdev_frame_mark(p, skb); diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index 06e5f6faa431..47a3598d25c8 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -251,7 +251,8 @@ enum { BR_FDB_ADDED_BY_EXT_LEARN, BR_FDB_OFFLOADED, BR_FDB_NOTIFY, - BR_FDB_NOTIFY_INACTIVE + BR_FDB_NOTIFY_INACTIVE, + BR_FDB_ENTRY_LOCKED, };
struct net_bridge_fdb_key {
On Thu, Jul 07, 2022 at 05:29:25PM +0200, Hans Schultz wrote:
Add an intermediate state for clients behind a locked port to allow for possible opening of the port for said clients. The clients mac address will be added with the locked flag set, denying access through the port for the mac address, but also creating a new FDB add event giving userspace daemons the ability to unlock the mac address. This feature corresponds to the Mac-Auth and MAC Authentication Bypass (MAB) named features. The latter defined by Cisco.
Only the kernel can set this FDB entry flag, while userspace can read the flag and remove it by replacing or deleting the FDB entry.
Signed-off-by: Hans Schultz netdev@kapio-technology.com
include/uapi/linux/neighbour.h | 1 + net/bridge/br_fdb.c | 12 ++++++++++++ net/bridge/br_input.c | 10 +++++++++- net/bridge/br_private.h | 3 ++- 4 files changed, 24 insertions(+), 2 deletions(-)
diff --git a/include/uapi/linux/neighbour.h b/include/uapi/linux/neighbour.h index 39c565e460c7..76d65b481086 100644 --- a/include/uapi/linux/neighbour.h +++ b/include/uapi/linux/neighbour.h @@ -53,6 +53,7 @@ enum { #define NTF_ROUTER (1 << 7) /* Extended flags under NDA_FLAGS_EXT: */ #define NTF_EXT_MANAGED (1 << 0) +#define NTF_EXT_LOCKED (1 << 1) /*
- Neighbor Cache Entry States.
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c index e7f4fccb6adb..ee9064a536ae 100644 --- a/net/bridge/br_fdb.c +++ b/net/bridge/br_fdb.c @@ -105,6 +105,7 @@ static int fdb_fill_info(struct sk_buff *skb, const struct net_bridge *br, struct nda_cacheinfo ci; struct nlmsghdr *nlh; struct ndmsg *ndm;
- u32 ext_flags = 0;
nlh = nlmsg_put(skb, portid, seq, type, sizeof(*ndm), flags); if (nlh == NULL) @@ -125,11 +126,16 @@ static int fdb_fill_info(struct sk_buff *skb, const struct net_bridge *br, ndm->ndm_flags |= NTF_EXT_LEARNED; if (test_bit(BR_FDB_STICKY, &fdb->flags)) ndm->ndm_flags |= NTF_STICKY;
- if (test_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags))
ext_flags |= NTF_EXT_LOCKED;
if (nla_put(skb, NDA_LLADDR, ETH_ALEN, &fdb->key.addr)) goto nla_put_failure; if (nla_put_u32(skb, NDA_MASTER, br->dev->ifindex)) goto nla_put_failure;
- if (nla_put_u32(skb, NDA_FLAGS_EXT, ext_flags))
goto nla_put_failure;
- ci.ndm_used = jiffies_to_clock_t(now - fdb->used); ci.ndm_confirmed = 0; ci.ndm_updated = jiffies_to_clock_t(now - fdb->updated);
@@ -171,6 +177,7 @@ static inline size_t fdb_nlmsg_size(void) return NLMSG_ALIGN(sizeof(struct ndmsg)) + nla_total_size(ETH_ALEN) /* NDA_LLADDR */ + nla_total_size(sizeof(u32)) /* NDA_MASTER */
+ nla_total_size(sizeof(u32)) /* NDA_FLAGS_EXT */
Need to add validation that 'NTF_EXT_LOCKED' is not set in 'NDA_FLAGS_EXT' in entries installed by user space.
+ nla_total_size(sizeof(u16)) /* NDA_VLAN */ + nla_total_size(sizeof(struct nda_cacheinfo)) + nla_total_size(0) /* NDA_FDB_EXT_ATTRS */
@@ -1082,6 +1089,11 @@ static int fdb_add_entry(struct net_bridge *br, struct net_bridge_port *source, modified = true; }
- if (test_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags)) {
clear_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags);
modified = true;
- }
- if (fdb_handle_notify(fdb, notify)) modified = true;
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c index 68b3e850bcb9..3d15548cfda6 100644 --- a/net/bridge/br_input.c +++ b/net/bridge/br_input.c @@ -110,8 +110,16 @@ int br_handle_frame_finish(struct net *net, struct sock *sk, struct sk_buff *skb br_fdb_find_rcu(br, eth_hdr(skb)->h_source, vid); if (!fdb_src || READ_ONCE(fdb_src->dst) != p ||
test_bit(BR_FDB_LOCAL, &fdb_src->flags))
test_bit(BR_FDB_LOCAL, &fdb_src->flags) ||
test_bit(BR_FDB_ENTRY_LOCKED, &fdb_src->flags)) {
if (!fdb_src) {
unsigned long flags = 0;
__set_bit(BR_FDB_ENTRY_LOCKED, &flags);
br_fdb_update(br, p, eth_hdr(skb)->h_source, vid, flags);
}
We need a better definition of what 'BR_FDB_ENTRY_LOCKED' means. An entry marked with this flag is not good enough to let traffic with this SA ingress from a locked port, but if traffic is received from a different port with this DA, the bridge will forward it as known unicast.
If we are going to tell switchdev drivers to ignore locked entries, packets with this DA will be flooded as unknown unicast in hardware. For mv88e6xxx you write "The mac address triggering the ATU miss violation will be added to the ATU with a zero-DPV". What does it mean? Traffic with this DA will be blackholed?
It would be good to agree on one consistent behavior for all hardware implementations and the bridge driver. Do you know what happens in other implementations (e.g., Cisco)?
FWIW, to me it makes sense to have the bridge ignore locked entries and let traffic be flooded. If user space does not want traffic to egress a locked port, then it can turn off flooding on the port while it is locked.
goto drop;
}}
nbp_switchdev_frame_mark(p, skb); diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index 06e5f6faa431..47a3598d25c8 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -251,7 +251,8 @@ enum { BR_FDB_ADDED_BY_EXT_LEARN, BR_FDB_OFFLOADED, BR_FDB_NOTIFY,
- BR_FDB_NOTIFY_INACTIVE
- BR_FDB_NOTIFY_INACTIVE,
- BR_FDB_ENTRY_LOCKED,
}; struct net_bridge_fdb_key { -- 2.30.2
Used for Mac-auth/MAB feature in the offloaded case. Send flag through switchdev to driver.
Signed-off-by: Hans Schultz netdev@kapio-technology.com --- include/net/dsa.h | 6 ++++++ include/net/switchdev.h | 1 + net/bridge/br.c | 3 ++- net/bridge/br_fdb.c | 7 +++++-- net/bridge/br_private.h | 2 +- net/bridge/br_switchdev.c | 1 + net/dsa/dsa_priv.h | 4 +++- net/dsa/port.c | 7 ++++++- net/dsa/slave.c | 4 +++- net/dsa/switch.c | 6 +++--- 10 files changed, 31 insertions(+), 10 deletions(-)
diff --git a/include/net/dsa.h b/include/net/dsa.h index 14f07275852b..a5a843b2d67d 100644 --- a/include/net/dsa.h +++ b/include/net/dsa.h @@ -330,6 +330,12 @@ struct dsa_port { /* List of VLANs that CPU and DSA ports are members of. */ struct mutex vlans_lock; struct list_head vlans; + + /* List and maintenance of locked ATU entries */ + struct mutex locked_entries_list_lock; + struct list_head atu_locked_entries_list; + atomic_t atu_locked_entry_cnt; + struct delayed_work atu_work; };
/* TODO: ideally DSA ports would have a single dp->link_dp member, diff --git a/include/net/switchdev.h b/include/net/switchdev.h index aa0171d5786d..9f83c835ee62 100644 --- a/include/net/switchdev.h +++ b/include/net/switchdev.h @@ -245,6 +245,7 @@ struct switchdev_notifier_fdb_info { u16 vid; u8 added_by_user:1, is_local:1, + is_locked:1, offloaded:1; };
diff --git a/net/bridge/br.c b/net/bridge/br.c index 96e91d69a9a8..fe0a4741fcda 100644 --- a/net/bridge/br.c +++ b/net/bridge/br.c @@ -166,7 +166,8 @@ static int br_switchdev_event(struct notifier_block *unused, case SWITCHDEV_FDB_ADD_TO_BRIDGE: fdb_info = ptr; err = br_fdb_external_learn_add(br, p, fdb_info->addr, - fdb_info->vid, false); + fdb_info->vid, false, + fdb_info->is_locked); if (err) { err = notifier_from_errno(err); break; diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c index ee9064a536ae..32ebb18050b9 100644 --- a/net/bridge/br_fdb.c +++ b/net/bridge/br_fdb.c @@ -1136,7 +1136,7 @@ static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br, "FDB entry towards bridge must be permanent"); return -EINVAL; } - err = br_fdb_external_learn_add(br, p, addr, vid, true); + err = br_fdb_external_learn_add(br, p, addr, vid, true, false); } else { spin_lock_bh(&br->hash_lock); err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, nfea_tb); @@ -1366,7 +1366,7 @@ void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p)
int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, const unsigned char *addr, u16 vid, - bool swdev_notify) + bool swdev_notify, bool locked) { struct net_bridge_fdb_entry *fdb; bool modified = false; @@ -1386,6 +1386,9 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, if (!p) flags |= BIT(BR_FDB_LOCAL);
+ if (locked) + flags |= BIT(BR_FDB_ENTRY_LOCKED); + fdb = fdb_create(br, p, addr, vid, flags); if (!fdb) { err = -ENOMEM; diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index 47a3598d25c8..9082451b4d40 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -811,7 +811,7 @@ int br_fdb_sync_static(struct net_bridge *br, struct net_bridge_port *p); void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p); int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, const unsigned char *addr, u16 vid, - bool swdev_notify); + bool swdev_notify, bool locked); int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p, const unsigned char *addr, u16 vid, bool swdev_notify); diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c index 8f3d76c751dd..85e566b856e1 100644 --- a/net/bridge/br_switchdev.c +++ b/net/bridge/br_switchdev.c @@ -136,6 +136,7 @@ static void br_switchdev_fdb_populate(struct net_bridge *br, item->added_by_user = test_bit(BR_FDB_ADDED_BY_USER, &fdb->flags); item->offloaded = test_bit(BR_FDB_OFFLOADED, &fdb->flags); item->is_local = test_bit(BR_FDB_LOCAL, &fdb->flags); + item->is_locked = test_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags); item->info.dev = (!p || item->is_local) ? br->dev : p->dev; item->info.ctx = ctx; } diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h index d9722e49864b..42f47a94b0f0 100644 --- a/net/dsa/dsa_priv.h +++ b/net/dsa/dsa_priv.h @@ -65,6 +65,7 @@ struct dsa_notifier_fdb_info { const struct dsa_port *dp; const unsigned char *addr; u16 vid; + bool is_locked; struct dsa_db db; };
@@ -131,6 +132,7 @@ struct dsa_switchdev_event_work { unsigned char addr[ETH_ALEN]; u16 vid; bool host_addr; + bool is_locked; };
enum dsa_standalone_event { @@ -232,7 +234,7 @@ int dsa_port_vlan_msti(struct dsa_port *dp, const struct switchdev_vlan_msti *msti); int dsa_port_mtu_change(struct dsa_port *dp, int new_mtu); int dsa_port_fdb_add(struct dsa_port *dp, const unsigned char *addr, - u16 vid); + u16 vid, bool is_locked); int dsa_port_fdb_del(struct dsa_port *dp, const unsigned char *addr, u16 vid); int dsa_port_standalone_host_fdb_add(struct dsa_port *dp, diff --git a/net/dsa/port.c b/net/dsa/port.c index 3738f2d40a0b..8bdac9aabe5d 100644 --- a/net/dsa/port.c +++ b/net/dsa/port.c @@ -35,6 +35,7 @@ static void dsa_port_notify_bridge_fdb_flush(const struct dsa_port *dp, u16 vid) struct net_device *brport_dev = dsa_port_to_bridge_port(dp); struct switchdev_notifier_fdb_info info = { .vid = vid, + .is_locked = false, };
/* When the port becomes standalone it has already left the bridge. @@ -950,12 +951,13 @@ int dsa_port_mtu_change(struct dsa_port *dp, int new_mtu) }
int dsa_port_fdb_add(struct dsa_port *dp, const unsigned char *addr, - u16 vid) + u16 vid, bool is_locked) { struct dsa_notifier_fdb_info info = { .dp = dp, .addr = addr, .vid = vid, + .is_locked = is_locked, .db = { .type = DSA_DB_BRIDGE, .bridge = *dp->bridge, @@ -979,6 +981,7 @@ int dsa_port_fdb_del(struct dsa_port *dp, const unsigned char *addr, .dp = dp, .addr = addr, .vid = vid, + .is_locked = false, .db = { .type = DSA_DB_BRIDGE, .bridge = *dp->bridge, @@ -999,6 +1002,7 @@ static int dsa_port_host_fdb_add(struct dsa_port *dp, .dp = dp, .addr = addr, .vid = vid, + .is_locked = false, .db = db, };
@@ -1050,6 +1054,7 @@ static int dsa_port_host_fdb_del(struct dsa_port *dp, .dp = dp, .addr = addr, .vid = vid, + .is_locked = false, .db = db, };
diff --git a/net/dsa/slave.c b/net/dsa/slave.c index 801a5d445833..905b15e4eab9 100644 --- a/net/dsa/slave.c +++ b/net/dsa/slave.c @@ -2784,6 +2784,7 @@ static void dsa_slave_switchdev_event_work(struct work_struct *work) container_of(work, struct dsa_switchdev_event_work, work); const unsigned char *addr = switchdev_work->addr; struct net_device *dev = switchdev_work->dev; + bool is_locked = switchdev_work->is_locked; u16 vid = switchdev_work->vid; struct dsa_switch *ds; struct dsa_port *dp; @@ -2799,7 +2800,7 @@ static void dsa_slave_switchdev_event_work(struct work_struct *work) else if (dp->lag) err = dsa_port_lag_fdb_add(dp, addr, vid); else - err = dsa_port_fdb_add(dp, addr, vid); + err = dsa_port_fdb_add(dp, addr, vid, is_locked); if (err) { dev_err(ds->dev, "port %d failed to add %pM vid %d to fdb: %d\n", @@ -2907,6 +2908,7 @@ static int dsa_slave_fdb_event(struct net_device *dev, ether_addr_copy(switchdev_work->addr, fdb_info->addr); switchdev_work->vid = fdb_info->vid; switchdev_work->host_addr = host_addr; + switchdev_work->is_locked = fdb_info->is_locked;
dsa_schedule_work(&switchdev_work->work);
diff --git a/net/dsa/switch.c b/net/dsa/switch.c index 2b56218fc57c..32b1e7ac6373 100644 --- a/net/dsa/switch.c +++ b/net/dsa/switch.c @@ -234,7 +234,7 @@ static int dsa_port_do_mdb_del(struct dsa_port *dp, }
static int dsa_port_do_fdb_add(struct dsa_port *dp, const unsigned char *addr, - u16 vid, struct dsa_db db) + u16 vid, bool is_locked, struct dsa_db db) { struct dsa_switch *ds = dp->ds; struct dsa_mac_addr *a; @@ -398,7 +398,7 @@ static int dsa_switch_host_fdb_add(struct dsa_switch *ds, dsa_switch_for_each_port(dp, ds) { if (dsa_port_host_address_match(dp, info->dp)) { err = dsa_port_do_fdb_add(dp, info->addr, info->vid, - info->db); + false, info->db); if (err) break; } @@ -437,7 +437,7 @@ static int dsa_switch_fdb_add(struct dsa_switch *ds, if (!ds->ops->port_fdb_add) return -EOPNOTSUPP;
- return dsa_port_do_fdb_add(dp, info->addr, info->vid, info->db); + return dsa_port_do_fdb_add(dp, info->addr, info->vid, info->is_locked, info->db); }
static int dsa_switch_fdb_del(struct dsa_switch *ds,
On Thu, Jul 07, 2022 at 05:29:26PM +0200, Hans Schultz wrote:
Used for Mac-auth/MAB feature in the offloaded case. Send flag through switchdev to driver.
Signed-off-by: Hans Schultz netdev@kapio-technology.com
include/net/dsa.h | 6 ++++++ include/net/switchdev.h | 1 + net/bridge/br.c | 3 ++- net/bridge/br_fdb.c | 7 +++++-- net/bridge/br_private.h | 2 +- net/bridge/br_switchdev.c | 1 + net/dsa/dsa_priv.h | 4 +++- net/dsa/port.c | 7 ++++++- net/dsa/slave.c | 4 +++- net/dsa/switch.c | 6 +++--- 10 files changed, 31 insertions(+), 10 deletions(-)
diff --git a/include/net/dsa.h b/include/net/dsa.h index 14f07275852b..a5a843b2d67d 100644 --- a/include/net/dsa.h +++ b/include/net/dsa.h @@ -330,6 +330,12 @@ struct dsa_port { /* List of VLANs that CPU and DSA ports are members of. */ struct mutex vlans_lock; struct list_head vlans;
- /* List and maintenance of locked ATU entries */
- struct mutex locked_entries_list_lock;
- struct list_head atu_locked_entries_list;
- atomic_t atu_locked_entry_cnt;
- struct delayed_work atu_work;
Leftovers from an old change, please drop from the patch.
}; /* TODO: ideally DSA ports would have a single dp->link_dp member, diff --git a/include/net/switchdev.h b/include/net/switchdev.h index aa0171d5786d..9f83c835ee62 100644 --- a/include/net/switchdev.h +++ b/include/net/switchdev.h @@ -245,6 +245,7 @@ struct switchdev_notifier_fdb_info { u16 vid; u8 added_by_user:1, is_local:1,
is_locked:1, offloaded:1;
}; diff --git a/net/bridge/br.c b/net/bridge/br.c index 96e91d69a9a8..fe0a4741fcda 100644 --- a/net/bridge/br.c +++ b/net/bridge/br.c @@ -166,7 +166,8 @@ static int br_switchdev_event(struct notifier_block *unused, case SWITCHDEV_FDB_ADD_TO_BRIDGE: fdb_info = ptr; err = br_fdb_external_learn_add(br, p, fdb_info->addr,
fdb_info->vid, false);
fdb_info->vid, false,
if (err) { err = notifier_from_errno(err); break;fdb_info->is_locked);
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c index ee9064a536ae..32ebb18050b9 100644 --- a/net/bridge/br_fdb.c +++ b/net/bridge/br_fdb.c @@ -1136,7 +1136,7 @@ static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br, "FDB entry towards bridge must be permanent"); return -EINVAL; }
err = br_fdb_external_learn_add(br, p, addr, vid, true);
} else { spin_lock_bh(&br->hash_lock); err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, nfea_tb);err = br_fdb_external_learn_add(br, p, addr, vid, true, false);
@@ -1366,7 +1366,7 @@ void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p) int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, const unsigned char *addr, u16 vid,
bool swdev_notify)
bool swdev_notify, bool locked)
{ struct net_bridge_fdb_entry *fdb; bool modified = false; @@ -1386,6 +1386,9 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, if (!p) flags |= BIT(BR_FDB_LOCAL);
if (locked)
flags |= BIT(BR_FDB_ENTRY_LOCKED);
- fdb = fdb_create(br, p, addr, vid, flags); if (!fdb) { err = -ENOMEM;
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index 47a3598d25c8..9082451b4d40 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -811,7 +811,7 @@ int br_fdb_sync_static(struct net_bridge *br, struct net_bridge_port *p); void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p); int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, const unsigned char *addr, u16 vid,
bool swdev_notify);
bool swdev_notify, bool locked);
int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p, const unsigned char *addr, u16 vid, bool swdev_notify); diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c index 8f3d76c751dd..85e566b856e1 100644 --- a/net/bridge/br_switchdev.c +++ b/net/bridge/br_switchdev.c @@ -136,6 +136,7 @@ static void br_switchdev_fdb_populate(struct net_bridge *br, item->added_by_user = test_bit(BR_FDB_ADDED_BY_USER, &fdb->flags); item->offloaded = test_bit(BR_FDB_OFFLOADED, &fdb->flags); item->is_local = test_bit(BR_FDB_LOCAL, &fdb->flags);
- item->is_locked = test_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags); item->info.dev = (!p || item->is_local) ? br->dev : p->dev; item->info.ctx = ctx;
} diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h index d9722e49864b..42f47a94b0f0 100644 --- a/net/dsa/dsa_priv.h +++ b/net/dsa/dsa_priv.h @@ -65,6 +65,7 @@ struct dsa_notifier_fdb_info { const struct dsa_port *dp; const unsigned char *addr; u16 vid;
- bool is_locked;
drop
struct dsa_db db; }; @@ -131,6 +132,7 @@ struct dsa_switchdev_event_work { unsigned char addr[ETH_ALEN]; u16 vid; bool host_addr;
- bool is_locked;
drop
}; enum dsa_standalone_event { @@ -232,7 +234,7 @@ int dsa_port_vlan_msti(struct dsa_port *dp, const struct switchdev_vlan_msti *msti); int dsa_port_mtu_change(struct dsa_port *dp, int new_mtu); int dsa_port_fdb_add(struct dsa_port *dp, const unsigned char *addr,
u16 vid);
u16 vid, bool is_locked);
drop
int dsa_port_fdb_del(struct dsa_port *dp, const unsigned char *addr, u16 vid); int dsa_port_standalone_host_fdb_add(struct dsa_port *dp, diff --git a/net/dsa/port.c b/net/dsa/port.c index 3738f2d40a0b..8bdac9aabe5d 100644 --- a/net/dsa/port.c +++ b/net/dsa/port.c @@ -35,6 +35,7 @@ static void dsa_port_notify_bridge_fdb_flush(const struct dsa_port *dp, u16 vid) struct net_device *brport_dev = dsa_port_to_bridge_port(dp); struct switchdev_notifier_fdb_info info = { .vid = vid,
.is_locked = false,
drop
}; /* When the port becomes standalone it has already left the bridge. @@ -950,12 +951,13 @@ int dsa_port_mtu_change(struct dsa_port *dp, int new_mtu) } int dsa_port_fdb_add(struct dsa_port *dp, const unsigned char *addr,
u16 vid)
u16 vid, bool is_locked)
drop
{ struct dsa_notifier_fdb_info info = { .dp = dp, .addr = addr, .vid = vid,
.is_locked = is_locked,
drop
.db = { .type = DSA_DB_BRIDGE, .bridge = *dp->bridge,
@@ -979,6 +981,7 @@ int dsa_port_fdb_del(struct dsa_port *dp, const unsigned char *addr, .dp = dp, .addr = addr, .vid = vid,
.is_locked = false,
drop
.db = { .type = DSA_DB_BRIDGE, .bridge = *dp->bridge,
@@ -999,6 +1002,7 @@ static int dsa_port_host_fdb_add(struct dsa_port *dp, .dp = dp, .addr = addr, .vid = vid,
.is_locked = false,
drop
.db = db,
}; @@ -1050,6 +1054,7 @@ static int dsa_port_host_fdb_del(struct dsa_port *dp, .dp = dp, .addr = addr, .vid = vid,
.is_locked = false,
drop
.db = db,
}; diff --git a/net/dsa/slave.c b/net/dsa/slave.c index 801a5d445833..905b15e4eab9 100644 --- a/net/dsa/slave.c +++ b/net/dsa/slave.c @@ -2784,6 +2784,7 @@ static void dsa_slave_switchdev_event_work(struct work_struct *work) container_of(work, struct dsa_switchdev_event_work, work); const unsigned char *addr = switchdev_work->addr; struct net_device *dev = switchdev_work->dev;
- bool is_locked = switchdev_work->is_locked;
drop
u16 vid = switchdev_work->vid; struct dsa_switch *ds; struct dsa_port *dp; @@ -2799,7 +2800,7 @@ static void dsa_slave_switchdev_event_work(struct work_struct *work) else if (dp->lag) err = dsa_port_lag_fdb_add(dp, addr, vid); else
err = dsa_port_fdb_add(dp, addr, vid);
err = dsa_port_fdb_add(dp, addr, vid, is_locked);
drop
if (err) { dev_err(ds->dev, "port %d failed to add %pM vid %d to fdb: %d\n",
@@ -2907,6 +2908,7 @@ static int dsa_slave_fdb_event(struct net_device *dev, ether_addr_copy(switchdev_work->addr, fdb_info->addr); switchdev_work->vid = fdb_info->vid; switchdev_work->host_addr = host_addr;
- switchdev_work->is_locked = fdb_info->is_locked;
drop
dsa_schedule_work(&switchdev_work->work); diff --git a/net/dsa/switch.c b/net/dsa/switch.c index 2b56218fc57c..32b1e7ac6373 100644 --- a/net/dsa/switch.c +++ b/net/dsa/switch.c @@ -234,7 +234,7 @@ static int dsa_port_do_mdb_del(struct dsa_port *dp, } static int dsa_port_do_fdb_add(struct dsa_port *dp, const unsigned char *addr,
u16 vid, struct dsa_db db)
u16 vid, bool is_locked, struct dsa_db db)
drop
{ struct dsa_switch *ds = dp->ds; struct dsa_mac_addr *a; @@ -398,7 +398,7 @@ static int dsa_switch_host_fdb_add(struct dsa_switch *ds, dsa_switch_for_each_port(dp, ds) { if (dsa_port_host_address_match(dp, info->dp)) { err = dsa_port_do_fdb_add(dp, info->addr, info->vid,
info->db);
false, info->db);
drop
if (err) break; }
@@ -437,7 +437,7 @@ static int dsa_switch_fdb_add(struct dsa_switch *ds, if (!ds->ops->port_fdb_add) return -EOPNOTSUPP;
- return dsa_port_do_fdb_add(dp, info->addr, info->vid, info->db);
- return dsa_port_do_fdb_add(dp, info->addr, info->vid, info->is_locked, info->db);
drop
} static int dsa_switch_fdb_del(struct dsa_switch *ds, -- 2.30.2
On 2022-07-08 10:54, Vladimir Oltean wrote:
};
/* TODO: ideally DSA ports would have a single dp->link_dp member, diff --git a/include/net/switchdev.h b/include/net/switchdev.h index aa0171d5786d..9f83c835ee62 100644 --- a/include/net/switchdev.h +++ b/include/net/switchdev.h @@ -245,6 +245,7 @@ struct switchdev_notifier_fdb_info { u16 vid; u8 added_by_user:1, is_local:1,
is_locked:1, offloaded:1;
};
diff --git a/net/bridge/br.c b/net/bridge/br.c index 96e91d69a9a8..fe0a4741fcda 100644 --- a/net/bridge/br.c +++ b/net/bridge/br.c @@ -166,7 +166,8 @@ static int br_switchdev_event(struct notifier_block *unused, case SWITCHDEV_FDB_ADD_TO_BRIDGE: fdb_info = ptr; err = br_fdb_external_learn_add(br, p, fdb_info->addr,
fdb_info->vid, false);
fdb_info->vid, false,
if (err) { err = notifier_from_errno(err); break;fdb_info->is_locked);
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c index ee9064a536ae..32ebb18050b9 100644 --- a/net/bridge/br_fdb.c +++ b/net/bridge/br_fdb.c @@ -1136,7 +1136,7 @@ static int __br_fdb_add(struct ndmsg *ndm, struct net_bridge *br, "FDB entry towards bridge must be permanent"); return -EINVAL; }
err = br_fdb_external_learn_add(br, p, addr, vid, true);
} else { spin_lock_bh(&br->hash_lock); err = fdb_add_entry(br, p, addr, ndm, nlh_flags, vid, nfea_tb);err = br_fdb_external_learn_add(br, p, addr, vid, true, false);
@@ -1366,7 +1366,7 @@ void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p)
int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, const unsigned char *addr, u16 vid,
bool swdev_notify)
bool swdev_notify, bool locked)
{ struct net_bridge_fdb_entry *fdb; bool modified = false; @@ -1386,6 +1386,9 @@ int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, if (!p) flags |= BIT(BR_FDB_LOCAL);
if (locked)
flags |= BIT(BR_FDB_ENTRY_LOCKED);
- fdb = fdb_create(br, p, addr, vid, flags); if (!fdb) { err = -ENOMEM;
diff --git a/net/bridge/br_private.h b/net/bridge/br_private.h index 47a3598d25c8..9082451b4d40 100644 --- a/net/bridge/br_private.h +++ b/net/bridge/br_private.h @@ -811,7 +811,7 @@ int br_fdb_sync_static(struct net_bridge *br, struct net_bridge_port *p); void br_fdb_unsync_static(struct net_bridge *br, struct net_bridge_port *p); int br_fdb_external_learn_add(struct net_bridge *br, struct net_bridge_port *p, const unsigned char *addr, u16 vid,
bool swdev_notify);
bool swdev_notify, bool locked);
int br_fdb_external_learn_del(struct net_bridge *br, struct net_bridge_port *p, const unsigned char *addr, u16 vid, bool swdev_notify); diff --git a/net/bridge/br_switchdev.c b/net/bridge/br_switchdev.c index 8f3d76c751dd..85e566b856e1 100644 --- a/net/bridge/br_switchdev.c +++ b/net/bridge/br_switchdev.c @@ -136,6 +136,7 @@ static void br_switchdev_fdb_populate(struct net_bridge *br, item->added_by_user = test_bit(BR_FDB_ADDED_BY_USER, &fdb->flags); item->offloaded = test_bit(BR_FDB_OFFLOADED, &fdb->flags); item->is_local = test_bit(BR_FDB_LOCAL, &fdb->flags);
- item->is_locked = test_bit(BR_FDB_ENTRY_LOCKED, &fdb->flags); item->info.dev = (!p || item->is_local) ? br->dev : p->dev; item->info.ctx = ctx;
} diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h index d9722e49864b..42f47a94b0f0 100644 --- a/net/dsa/dsa_priv.h +++ b/net/dsa/dsa_priv.h @@ -65,6 +65,7 @@ struct dsa_notifier_fdb_info { const struct dsa_port *dp; const unsigned char *addr; u16 vid;
- bool is_locked;
drop
If this is dropped, the whole scheme will not work as userspace will not get notifications of new locked entries from the driver.
On 2022-07-08 10:54, Vladimir Oltean wrote:
struct dsa_db db; };
@@ -131,6 +132,7 @@ struct dsa_switchdev_event_work { unsigned char addr[ETH_ALEN]; u16 vid; bool host_addr;
- bool is_locked;
drop
};
enum dsa_standalone_event { @@ -232,7 +234,7 @@ int dsa_port_vlan_msti(struct dsa_port *dp, const struct switchdev_vlan_msti *msti); int dsa_port_mtu_change(struct dsa_port *dp, int new_mtu); int dsa_port_fdb_add(struct dsa_port *dp, const unsigned char *addr,
u16 vid);
u16 vid, bool is_locked);
drop
int dsa_port_fdb_del(struct dsa_port *dp, const unsigned char *addr, u16 vid); int dsa_port_standalone_host_fdb_add(struct dsa_port *dp, diff --git a/net/dsa/port.c b/net/dsa/port.c index 3738f2d40a0b..8bdac9aabe5d 100644 --- a/net/dsa/port.c +++ b/net/dsa/port.c @@ -35,6 +35,7 @@ static void dsa_port_notify_bridge_fdb_flush(const struct dsa_port *dp, u16 vid) struct net_device *brport_dev = dsa_port_to_bridge_port(dp); struct switchdev_notifier_fdb_info info = { .vid = vid,
.is_locked = false,
drop
};
/* When the port becomes standalone it has already left the bridge. @@ -950,12 +951,13 @@ int dsa_port_mtu_change(struct dsa_port *dp, int new_mtu) }
int dsa_port_fdb_add(struct dsa_port *dp, const unsigned char *addr,
u16 vid)
u16 vid, bool is_locked)
drop
{ struct dsa_notifier_fdb_info info = { .dp = dp, .addr = addr, .vid = vid,
.is_locked = is_locked,
drop
.db = { .type = DSA_DB_BRIDGE, .bridge = *dp->bridge,
@@ -979,6 +981,7 @@ int dsa_port_fdb_del(struct dsa_port *dp, const unsigned char *addr, .dp = dp, .addr = addr, .vid = vid,
.is_locked = false,
drop
.db = { .type = DSA_DB_BRIDGE, .bridge = *dp->bridge,
@@ -999,6 +1002,7 @@ static int dsa_port_host_fdb_add(struct dsa_port *dp, .dp = dp, .addr = addr, .vid = vid,
.is_locked = false,
drop
.db = db,
};
@@ -1050,6 +1054,7 @@ static int dsa_port_host_fdb_del(struct dsa_port *dp, .dp = dp, .addr = addr, .vid = vid,
.is_locked = false,
drop
.db = db,
};
diff --git a/net/dsa/slave.c b/net/dsa/slave.c index 801a5d445833..905b15e4eab9 100644 --- a/net/dsa/slave.c +++ b/net/dsa/slave.c @@ -2784,6 +2784,7 @@ static void dsa_slave_switchdev_event_work(struct work_struct *work) container_of(work, struct dsa_switchdev_event_work, work); const unsigned char *addr = switchdev_work->addr; struct net_device *dev = switchdev_work->dev;
- bool is_locked = switchdev_work->is_locked;
drop
u16 vid = switchdev_work->vid; struct dsa_switch *ds; struct dsa_port *dp; @@ -2799,7 +2800,7 @@ static void dsa_slave_switchdev_event_work(struct work_struct *work) else if (dp->lag) err = dsa_port_lag_fdb_add(dp, addr, vid); else
err = dsa_port_fdb_add(dp, addr, vid);
err = dsa_port_fdb_add(dp, addr, vid, is_locked);
drop
if (err) { dev_err(ds->dev, "port %d failed to add %pM vid %d to fdb: %d\n",
@@ -2907,6 +2908,7 @@ static int dsa_slave_fdb_event(struct net_device *dev, ether_addr_copy(switchdev_work->addr, fdb_info->addr); switchdev_work->vid = fdb_info->vid; switchdev_work->host_addr = host_addr;
- switchdev_work->is_locked = fdb_info->is_locked;
drop
dsa_schedule_work(&switchdev_work->work);
diff --git a/net/dsa/switch.c b/net/dsa/switch.c index 2b56218fc57c..32b1e7ac6373 100644 --- a/net/dsa/switch.c +++ b/net/dsa/switch.c @@ -234,7 +234,7 @@ static int dsa_port_do_mdb_del(struct dsa_port *dp, }
static int dsa_port_do_fdb_add(struct dsa_port *dp, const unsigned char *addr,
u16 vid, struct dsa_db db)
u16 vid, bool is_locked, struct dsa_db db)
drop
{ struct dsa_switch *ds = dp->ds; struct dsa_mac_addr *a; @@ -398,7 +398,7 @@ static int dsa_switch_host_fdb_add(struct dsa_switch *ds, dsa_switch_for_each_port(dp, ds) { if (dsa_port_host_address_match(dp, info->dp)) { err = dsa_port_do_fdb_add(dp, info->addr, info->vid,
info->db);
false, info->db);
drop
if (err) break; }
@@ -437,7 +437,7 @@ static int dsa_switch_fdb_add(struct dsa_switch *ds, if (!ds->ops->port_fdb_add) return -EOPNOTSUPP;
- return dsa_port_do_fdb_add(dp, info->addr, info->vid, info->db);
- return dsa_port_do_fdb_add(dp, info->addr, info->vid,
info->is_locked, info->db);
drop
}
static int dsa_switch_fdb_del(struct dsa_switch *ds,
2.30.2
Hi Vladimir and Ido,
I can either ignore locked entries early or late in the dsa/switchdev layers.
If I ignore early, I think it should be in br_switchdev_fdb_notify() in net/bridge/br_switchdev.c. If I ignore late, I would think that it should be jut before sending it to the driver(s), e.g. in dsa_port_do_fdb_add() in net/dsa/switch.c.
There is of course pros and cons of both options, but if the flag is never to be sent to the driver, then it should be ignored early.
If ignored late most of this patch should not be dropped.
Ignore locked fdb entries coming in on all drivers.
Signed-off-by: Hans Schultz netdev@kapio-technology.com --- drivers/net/dsa/b53/b53_common.c | 5 +++++ drivers/net/dsa/b53/b53_priv.h | 1 + drivers/net/dsa/hirschmann/hellcreek.c | 5 +++++ drivers/net/dsa/lan9303-core.c | 5 +++++ drivers/net/dsa/lantiq_gswip.c | 5 +++++ drivers/net/dsa/microchip/ksz9477.c | 5 +++++ drivers/net/dsa/mt7530.c | 5 +++++ drivers/net/dsa/mv88e6xxx/chip.c | 5 +++++ drivers/net/dsa/ocelot/felix.c | 5 +++++ drivers/net/dsa/qca8k.c | 5 +++++ drivers/net/dsa/sja1105/sja1105_main.c | 5 +++++ include/net/dsa.h | 1 + net/dsa/switch.c | 4 ++-- 13 files changed, 54 insertions(+), 2 deletions(-)
diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c index fbb32aa49b24..3567d4fcbaa7 100644 --- a/drivers/net/dsa/b53/b53_common.c +++ b/drivers/net/dsa/b53/b53_common.c @@ -1688,11 +1688,16 @@ static int b53_arl_op(struct b53_device *dev, int op, int port,
int b53_fdb_add(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, + bool is_locked, struct dsa_db db) { struct b53_device *priv = ds->priv; int ret;
+ /* Ignore locked entries */ + if (is_locked) + return 0; + /* 5325 and 5365 require some more massaging, but could * be supported eventually */ diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h index 795cbffd5c2b..19501b76b9df 100644 --- a/drivers/net/dsa/b53/b53_priv.h +++ b/drivers/net/dsa/b53/b53_priv.h @@ -362,6 +362,7 @@ int b53_vlan_del(struct dsa_switch *ds, int port, const struct switchdev_obj_port_vlan *vlan); int b53_fdb_add(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, + bool is_locked, struct dsa_db db); int b53_fdb_del(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, diff --git a/drivers/net/dsa/hirschmann/hellcreek.c b/drivers/net/dsa/hirschmann/hellcreek.c index ac1f3b3a7040..0125d901c988 100644 --- a/drivers/net/dsa/hirschmann/hellcreek.c +++ b/drivers/net/dsa/hirschmann/hellcreek.c @@ -829,12 +829,17 @@ static int hellcreek_fdb_get(struct hellcreek *hellcreek,
static int hellcreek_fdb_add(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, + bool is_locked, struct dsa_db db) { struct hellcreek_fdb_entry entry = { 0 }; struct hellcreek *hellcreek = ds->priv; int ret;
+ /* Ignore locked entries */ + if (is_locked) + return 0; + dev_dbg(hellcreek->dev, "Add FDB entry for MAC=%pM\n", addr);
mutex_lock(&hellcreek->reg_lock); diff --git a/drivers/net/dsa/lan9303-core.c b/drivers/net/dsa/lan9303-core.c index e03ff1f267bb..426b9ea668da 100644 --- a/drivers/net/dsa/lan9303-core.c +++ b/drivers/net/dsa/lan9303-core.c @@ -1190,10 +1190,15 @@ static void lan9303_port_fast_age(struct dsa_switch *ds, int port)
static int lan9303_port_fdb_add(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, + bool is_locked, struct dsa_db db) { struct lan9303 *chip = ds->priv;
+ /* Ignore locked entries */ + if (is_locked) + return 0; + dev_dbg(chip->dev, "%s(%d, %pM, %d)\n", __func__, port, addr, vid); if (vid) return -EOPNOTSUPP; diff --git a/drivers/net/dsa/lantiq_gswip.c b/drivers/net/dsa/lantiq_gswip.c index 8af4def38a98..4da7186935f4 100644 --- a/drivers/net/dsa/lantiq_gswip.c +++ b/drivers/net/dsa/lantiq_gswip.c @@ -1399,8 +1399,13 @@ static int gswip_port_fdb(struct dsa_switch *ds, int port,
static int gswip_port_fdb_add(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, + bool is_locked, struct dsa_db db) { + /* Ignore locked entries */ + if (is_locked) + return 0; + return gswip_port_fdb(ds, port, addr, vid, true); }
diff --git a/drivers/net/dsa/microchip/ksz9477.c b/drivers/net/dsa/microchip/ksz9477.c index ab40b700cf1a..c177eb97a072 100644 --- a/drivers/net/dsa/microchip/ksz9477.c +++ b/drivers/net/dsa/microchip/ksz9477.c @@ -481,6 +481,7 @@ static int ksz9477_port_vlan_del(struct dsa_switch *ds, int port,
static int ksz9477_port_fdb_add(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, + bool is_locked, struct dsa_db db) { struct ksz_device *dev = ds->priv; @@ -488,6 +489,10 @@ static int ksz9477_port_fdb_add(struct dsa_switch *ds, int port, u32 data; int ret = 0;
+ /* Ignore locked entries */ + if (is_locked) + return ret; + mutex_lock(&dev->alu_mutex);
/* find any entry with mac & vid */ diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c index 2b02d823d497..cbf922cf425e 100644 --- a/drivers/net/dsa/mt7530.c +++ b/drivers/net/dsa/mt7530.c @@ -1356,12 +1356,17 @@ mt7530_port_bridge_leave(struct dsa_switch *ds, int port, static int mt7530_port_fdb_add(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, + bool is_locked, struct dsa_db db) { struct mt7530_priv *priv = ds->priv; int ret; u8 port_mask = BIT(port);
+ /* Ignore locked entries */ + if (is_locked) + return 0; + mutex_lock(&priv->reg_mutex); mt7530_fdb_write(priv, vid, port_mask, addr, -1, STATIC_ENT); ret = mt7530_fdb_cmd(priv, MT7530_FDB_WRITE, NULL); diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index 5d2c57a7c708..d134153ef023 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -2717,11 +2717,16 @@ static int mv88e6xxx_vlan_msti_set(struct dsa_switch *ds,
static int mv88e6xxx_port_fdb_add(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, + bool is_locked, struct dsa_db db) { struct mv88e6xxx_chip *chip = ds->priv; int err;
+ /* Ignore locked entries */ + if (is_locked) + return 0; + mv88e6xxx_reg_lock(chip); err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid, MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC); diff --git a/drivers/net/dsa/ocelot/felix.c b/drivers/net/dsa/ocelot/felix.c index 3e07dc39007a..00ba495a7ab5 100644 --- a/drivers/net/dsa/ocelot/felix.c +++ b/drivers/net/dsa/ocelot/felix.c @@ -695,12 +695,17 @@ static int felix_fdb_dump(struct dsa_switch *ds, int port,
static int felix_fdb_add(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, + bool is_locked, struct dsa_db db) { struct net_device *bridge_dev = felix_classify_db(db); struct dsa_port *dp = dsa_to_port(ds, port); struct ocelot *ocelot = ds->priv;
+ /* Ignore locked entries */ + if (is_locked) + return 0; + if (IS_ERR(bridge_dev)) return PTR_ERR(bridge_dev);
diff --git a/drivers/net/dsa/qca8k.c b/drivers/net/dsa/qca8k.c index 2727d3169c25..6267e94af485 100644 --- a/drivers/net/dsa/qca8k.c +++ b/drivers/net/dsa/qca8k.c @@ -2369,11 +2369,16 @@ qca8k_port_fdb_insert(struct qca8k_priv *priv, const u8 *addr, static int qca8k_port_fdb_add(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, + bool is_locked, struct dsa_db db) { struct qca8k_priv *priv = (struct qca8k_priv *)ds->priv; u16 port_mask = BIT(port);
+ /* Ignore locked entries */ + if (is_locked) + return 0; + return qca8k_port_fdb_insert(priv, addr, port_mask, vid); }
diff --git a/drivers/net/dsa/sja1105/sja1105_main.c b/drivers/net/dsa/sja1105/sja1105_main.c index 72b6fc1932b5..9b5c2b56d06d 100644 --- a/drivers/net/dsa/sja1105/sja1105_main.c +++ b/drivers/net/dsa/sja1105/sja1105_main.c @@ -1802,10 +1802,15 @@ int sja1105pqrs_fdb_del(struct dsa_switch *ds, int port,
static int sja1105_fdb_add(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, + bool is_locked, struct dsa_db db) { struct sja1105_private *priv = ds->priv;
+ /* Ignore locked entries */ + if (is_locked) + return 0; + if (!vid) { switch (db.type) { case DSA_DB_PORT: diff --git a/include/net/dsa.h b/include/net/dsa.h index a5a843b2d67d..ebae403ce734 100644 --- a/include/net/dsa.h +++ b/include/net/dsa.h @@ -1010,6 +1010,7 @@ struct dsa_switch_ops { */ int (*port_fdb_add)(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, + bool is_locked, struct dsa_db db); int (*port_fdb_del)(struct dsa_switch *ds, int port, const unsigned char *addr, u16 vid, diff --git a/net/dsa/switch.c b/net/dsa/switch.c index 32b1e7ac6373..ea3b9c1acb8c 100644 --- a/net/dsa/switch.c +++ b/net/dsa/switch.c @@ -243,7 +243,7 @@ static int dsa_port_do_fdb_add(struct dsa_port *dp, const unsigned char *addr,
/* No need to bother with refcounting for user ports */ if (!(dsa_port_is_cpu(dp) || dsa_port_is_dsa(dp))) - return ds->ops->port_fdb_add(ds, port, addr, vid, db); + return ds->ops->port_fdb_add(ds, port, addr, vid, is_locked, db);
mutex_lock(&dp->addr_lists_lock);
@@ -259,7 +259,7 @@ static int dsa_port_do_fdb_add(struct dsa_port *dp, const unsigned char *addr, goto out; }
- err = ds->ops->port_fdb_add(ds, port, addr, vid, db); + err = ds->ops->port_fdb_add(ds, port, addr, vid, is_locked, db); if (err) { kfree(a); goto out;
Hi Hans,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on net/master] [also build test ERROR on shuah-kselftest/next linus/master v5.19-rc5] [cannot apply to net-next/master next-20220707] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Hans-Schultz/Extend-locked-po... base: https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git 07266d066301b97ad56a693f81b29b7ced429b27 config: i386-allyesconfig (https://download.01.org/0day-ci/archive/20220708/202207081529.riNiUWOf-lkp@i...) compiler: gcc-11 (Debian 11.3.0-3) 11.3.0 reproduce (this is a W=1 build): # https://github.com/intel-lab-lkp/linux/commit/ebd598d7ea6c015001489c4293da88... git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Hans-Schultz/Extend-locked-port-feature-with-FDB-locked-flag-MAC-Auth-MAB/20220707-233246 git checkout ebd598d7ea6c015001489c4293da887763491086 # save the config file mkdir build_dir && cp config build_dir/.config make W=1 O=build_dir ARCH=i386 SHELL=/bin/bash
If you fix the issue, kindly add following tag where applicable Reported-by: kernel test robot lkp@intel.com
All errors (new ones prefixed by >>):
drivers/net/dsa/sja1105/sja1105_main.c: In function 'sja1105_mdb_add':
drivers/net/dsa/sja1105/sja1105_main.c:1952:63: error: incompatible type for argument 5 of 'sja1105_fdb_add'
1952 | return sja1105_fdb_add(ds, port, mdb->addr, mdb->vid, db); | ^~ | | | struct dsa_db drivers/net/dsa/sja1105/sja1105_main.c:1805:33: note: expected 'bool' {aka '_Bool'} but argument is of type 'struct dsa_db' 1805 | bool is_locked, | ~~~~~^~~~~~~~~
drivers/net/dsa/sja1105/sja1105_main.c:1952:16: error: too few arguments to function 'sja1105_fdb_add'
1952 | return sja1105_fdb_add(ds, port, mdb->addr, mdb->vid, db); | ^~~~~~~~~~~~~~~ drivers/net/dsa/sja1105/sja1105_main.c:1803:12: note: declared here 1803 | static int sja1105_fdb_add(struct dsa_switch *ds, int port, | ^~~~~~~~~~~~~~~ drivers/net/dsa/sja1105/sja1105_main.c:1953:1: error: control reaches end of non-void function [-Werror=return-type] 1953 | } | ^ cc1: some warnings being treated as errors
vim +/sja1105_fdb_add +1952 drivers/net/dsa/sja1105/sja1105_main.c
5126ec72a094bd3 Vladimir Oltean 2021-08-08 1947 a52b2da778fc93e Vladimir Oltean 2021-01-09 1948 static int sja1105_mdb_add(struct dsa_switch *ds, int port, c26933639b5402c Vladimir Oltean 2022-02-25 1949 const struct switchdev_obj_port_mdb *mdb, c26933639b5402c Vladimir Oltean 2022-02-25 1950 struct dsa_db db) 291d1e72b756424 Vladimir Oltean 2019-05-02 1951 { c26933639b5402c Vladimir Oltean 2022-02-25 @1952 return sja1105_fdb_add(ds, port, mdb->addr, mdb->vid, db); 291d1e72b756424 Vladimir Oltean 2019-05-02 1953 } 291d1e72b756424 Vladimir Oltean 2019-05-02 1954
Hi Hans,
On Thu, Jul 07, 2022 at 05:29:27PM +0200, Hans Schultz wrote:
Ignore locked fdb entries coming in on all drivers.
Signed-off-by: Hans Schultz netdev@kapio-technology.com
A good patch should have a reason for the change in the commit message. This has no reason because there is no reason.
Think about it, you've said it yourself in patch 1:
| Only the kernel can set this FDB entry flag, while userspace can read | the flag and remove it by replacing or deleting the FDB entry.
So if user space will never add locked FDB entries to the bridge, then FDB entries with is_locked=true are never transported using SWITCHDEV_FDB_ADD_TO_DEVICE to drivers, and so, there is no reason at all to pass is_locked to drivers, just for them to ignore something that won't appear.
You just need this for SWITCHDEV_FDB_ADD_TO_BRIDGE, so please keep it only in those code paths, and remove it from net/dsa/slave.c as well.
drivers/net/dsa/b53/b53_common.c | 5 +++++ drivers/net/dsa/b53/b53_priv.h | 1 + drivers/net/dsa/hirschmann/hellcreek.c | 5 +++++ drivers/net/dsa/lan9303-core.c | 5 +++++ drivers/net/dsa/lantiq_gswip.c | 5 +++++ drivers/net/dsa/microchip/ksz9477.c | 5 +++++ drivers/net/dsa/mt7530.c | 5 +++++ drivers/net/dsa/mv88e6xxx/chip.c | 5 +++++ drivers/net/dsa/ocelot/felix.c | 5 +++++ drivers/net/dsa/qca8k.c | 5 +++++ drivers/net/dsa/sja1105/sja1105_main.c | 5 +++++ include/net/dsa.h | 1 + net/dsa/switch.c | 4 ++-- 13 files changed, 54 insertions(+), 2 deletions(-)
On 2022-07-08 10:49, Vladimir Oltean wrote:
Hi Hans,
On Thu, Jul 07, 2022 at 05:29:27PM +0200, Hans Schultz wrote:
Ignore locked fdb entries coming in on all drivers.
Signed-off-by: Hans Schultz netdev@kapio-technology.com
A good patch should have a reason for the change in the commit message. This has no reason because there is no reason.
Think about it, you've said it yourself in patch 1:
| Only the kernel can set this FDB entry flag, while userspace can read | the flag and remove it by replacing or deleting the FDB entry.
So if user space will never add locked FDB entries to the bridge, then FDB entries with is_locked=true are never transported using SWITCHDEV_FDB_ADD_TO_DEVICE to drivers, and so, there is no reason at all to pass is_locked to drivers, just for them to ignore something that won't appear.
Correct me if I am wrong, but since the bridge can add locked entries, and the ensuring fdb update will create a SWITCHDEV_FDB_ADD_TO_DEVICE, those entries should reach the driver. The policy to ignore those in the driver can be seen as either the right thing to do, or not yet implemented.
I remember Ido wrote at a point that the scheme they use is to trap various packets to the CPU and let the bridge add the locked entry, which I then understand is sent to the driver with a SWITCHDEV_FDB_ADD_TO_DEVICE event.
You just need this for SWITCHDEV_FDB_ADD_TO_BRIDGE, so please keep it only in those code paths, and remove it from net/dsa/slave.c as well.
drivers/net/dsa/b53/b53_common.c | 5 +++++ drivers/net/dsa/b53/b53_priv.h | 1 + drivers/net/dsa/hirschmann/hellcreek.c | 5 +++++ drivers/net/dsa/lan9303-core.c | 5 +++++ drivers/net/dsa/lantiq_gswip.c | 5 +++++ drivers/net/dsa/microchip/ksz9477.c | 5 +++++ drivers/net/dsa/mt7530.c | 5 +++++ drivers/net/dsa/mv88e6xxx/chip.c | 5 +++++ drivers/net/dsa/ocelot/felix.c | 5 +++++ drivers/net/dsa/qca8k.c | 5 +++++ drivers/net/dsa/sja1105/sja1105_main.c | 5 +++++ include/net/dsa.h | 1 + net/dsa/switch.c | 4 ++-- 13 files changed, 54 insertions(+), 2 deletions(-)
On Fri, Jul 08, 2022 at 11:06:24AM +0200, netdev@kapio-technology.com wrote:
On 2022-07-08 10:49, Vladimir Oltean wrote:
Hi Hans,
On Thu, Jul 07, 2022 at 05:29:27PM +0200, Hans Schultz wrote:
Ignore locked fdb entries coming in on all drivers.
Signed-off-by: Hans Schultz netdev@kapio-technology.com
A good patch should have a reason for the change in the commit message. This has no reason because there is no reason.
Think about it, you've said it yourself in patch 1:
| Only the kernel can set this FDB entry flag, while userspace can read | the flag and remove it by replacing or deleting the FDB entry.
So if user space will never add locked FDB entries to the bridge, then FDB entries with is_locked=true are never transported using SWITCHDEV_FDB_ADD_TO_DEVICE to drivers, and so, there is no reason at all to pass is_locked to drivers, just for them to ignore something that won't appear.
Correct me if I am wrong, but since the bridge can add locked entries, and the ensuring fdb update will create a SWITCHDEV_FDB_ADD_TO_DEVICE, those entries should reach the driver. The policy to ignore those in the driver can be seen as either the right thing to do, or not yet implemented.
I remember Ido wrote at a point that the scheme they use is to trap various packets to the CPU and let the bridge add the locked entry, which I then understand is sent to the driver with a SWITCHDEV_FDB_ADD_TO_DEVICE event.
Well, yes, but if I'm correct, the bridge right now can't create locked FDB entries, so is_locked will always be false in the ADD_TO_DEVICE direction.
When the possibility for it to be true will exist, _all_ switchdev drivers will need to be updated to ignore that (mlxsw, cpss, ocelot, rocker, prestera, etc etc), not just DSA. And you don't need to propagate the is_locked flag to all individual DSA sub-drivers when none care about is_locked in the ADD_TO_DEVICE direction, you can just ignore within DSA until needed otherwise.
You just need this for SWITCHDEV_FDB_ADD_TO_BRIDGE, so please keep it only in those code paths, and remove it from net/dsa/slave.c as well.
drivers/net/dsa/b53/b53_common.c | 5 +++++ drivers/net/dsa/b53/b53_priv.h | 1 + drivers/net/dsa/hirschmann/hellcreek.c | 5 +++++ drivers/net/dsa/lan9303-core.c | 5 +++++ drivers/net/dsa/lantiq_gswip.c | 5 +++++ drivers/net/dsa/microchip/ksz9477.c | 5 +++++ drivers/net/dsa/mt7530.c | 5 +++++ drivers/net/dsa/mv88e6xxx/chip.c | 5 +++++ drivers/net/dsa/ocelot/felix.c | 5 +++++ drivers/net/dsa/qca8k.c | 5 +++++ drivers/net/dsa/sja1105/sja1105_main.c | 5 +++++ include/net/dsa.h | 1 + net/dsa/switch.c | 4 ++-- 13 files changed, 54 insertions(+), 2 deletions(-)
On 2022-07-08 11:15, Vladimir Oltean wrote:
Well, yes, but if I'm correct, the bridge right now can't create locked FDB entries, so is_locked will always be false in the ADD_TO_DEVICE direction.
The bridge can create locked FDB entries with this patch set. That is part of the idea that a SW bridge will have the same features as the HW version, so first I made it in the bridge.
When the possibility for it to be true will exist, _all_ switchdev drivers will need to be updated to ignore that (mlxsw, cpss, ocelot, rocker, prestera, etc etc), not just DSA. And you don't need to propagate the is_locked flag to all individual DSA sub-drivers when none care about is_locked in the ADD_TO_DEVICE direction, you can just ignore within DSA until needed otherwise.
On 2022-07-08 11:15, Vladimir Oltean wrote:
When the possibility for it to be true will exist, _all_ switchdev drivers will need to be updated to ignore that (mlxsw, cpss, ocelot, rocker, prestera, etc etc), not just DSA. And you don't need to propagate the is_locked flag to all individual DSA sub-drivers when none care about is_locked in the ADD_TO_DEVICE direction, you can just ignore within DSA until needed otherwise.
Maybe I have it wrong, but I think that Ido requested me to send it to all the drivers, and have them ignore entries with is_locked=true ...
On Fri, Jul 08, 2022 at 11:50:33AM +0200, netdev@kapio-technology.com wrote:
On 2022-07-08 11:15, Vladimir Oltean wrote:
When the possibility for it to be true will exist, _all_ switchdev drivers will need to be updated to ignore that (mlxsw, cpss, ocelot, rocker, prestera, etc etc), not just DSA. And you don't need to propagate the is_locked flag to all individual DSA sub-drivers when none care about is_locked in the ADD_TO_DEVICE direction, you can just ignore within DSA until needed otherwise.
Maybe I have it wrong, but I think that Ido requested me to send it to all the drivers, and have them ignore entries with is_locked=true ...
I don't think Ido requested you to ignore is_locked from all DSA drivers, but instead from all switchdev drivers maybe. Quite different.
In any case I'm going to take a look at this patch set more closely and run the selftest on my Marvell switch, but I can't do this today unfortunately. I'll return with more comments.
On 2022-07-08 13:56, Vladimir Oltean wrote:
On Fri, Jul 08, 2022 at 11:50:33AM +0200, netdev@kapio-technology.com wrote:
On 2022-07-08 11:15, Vladimir Oltean wrote:
When the possibility for it to be true will exist, _all_ switchdev drivers will need to be updated to ignore that (mlxsw, cpss, ocelot, rocker, prestera, etc etc), not just DSA. And you don't need to propagate the is_locked flag to all individual DSA sub-drivers when none care about is_locked in the ADD_TO_DEVICE direction, you can just ignore within DSA until needed otherwise.
Maybe I have it wrong, but I think that Ido requested me to send it to all the drivers, and have them ignore entries with is_locked=true ...
I don't think Ido requested you to ignore is_locked from all DSA drivers, but instead from all switchdev drivers maybe. Quite different.
So without changing the signature on port_fdb_add(). If that is to avoid changing that signature, which needs to be changed anyhow for any switchcore driver to act on it, then my next patch set will change the signarure also as it is needed for creating dynamic ATU entries from userspace, which is needed to make the whole thing complete.
As it is already done (with the is_locked to the drivers) and needed for future application, I would like Ido to comment on it before I take action.
In any case I'm going to take a look at this patch set more closely and run the selftest on my Marvell switch, but I can't do this today unfortunately. I'll return with more comments.
Yes :-)
On Fri, Jul 08, 2022 at 02:34:25PM +0200, netdev@kapio-technology.com wrote:
On 2022-07-08 13:56, Vladimir Oltean wrote:
On Fri, Jul 08, 2022 at 11:50:33AM +0200, netdev@kapio-technology.com wrote:
On 2022-07-08 11:15, Vladimir Oltean wrote:
When the possibility for it to be true will exist, _all_ switchdev drivers will need to be updated to ignore that (mlxsw, cpss, ocelot, rocker, prestera, etc etc), not just DSA. And you don't need to propagate the is_locked flag to all individual DSA sub-drivers when none care about is_locked in the ADD_TO_DEVICE direction, you can just ignore within DSA until needed otherwise.
Maybe I have it wrong, but I think that Ido requested me to send it to all the drivers, and have them ignore entries with is_locked=true ...
I don't think Ido requested you to ignore is_locked from all DSA drivers, but instead from all switchdev drivers maybe. Quite different.
So without changing the signature on port_fdb_add(). If that is to avoid changing that signature, which needs to be changed anyhow for any switchcore driver to act on it, then my next patch set will change the signarure also as it is needed for creating dynamic ATU entries from userspace, which is needed to make the whole thing complete.
As it is already done (with the is_locked to the drivers) and needed for future application, I would like Ido to comment on it before I take action.
It's related to my reply here [1]. AFAICT, we have two classes of device drivers:
1. Drivers like mv88e6xxx that report locked entries to the bridge driver via 'SWITCHDEV_FDB_ADD_TO_BRIDGE'.
2. Drivers like mlxsw that trap packets that incurred an FDB miss to the bridge driver. These packets will cause the bridge driver to emit 'SWITCHDEV_FDB_ADD_TO_DEVICE' notifications with the locked flag.
If we can agree that locked entries are only meant to signal to user space that a certain MAC tried to gain authorization and that the bridge should ignore them while forwarding, then there is no point in generating the 'SWITCHDEV_FDB_ADD_TO_DEVICE' notifications. We should teach the bridge driver to suppress these so that there is no need to patch all the device drivers.
[1] https://lore.kernel.org/netdev/YsqLyxTRtUjzDj6D@shredder/
In any case I'm going to take a look at this patch set more closely and run the selftest on my Marvell switch, but I can't do this today unfortunately. I'll return with more comments.
Yes :-)
On 2022-07-10 10:35, Ido Schimmel wrote:
On Fri, Jul 08, 2022 at 02:34:25PM +0200, netdev@kapio-technology.com wrote:
On 2022-07-08 13:56, Vladimir Oltean wrote:
On Fri, Jul 08, 2022 at 11:50:33AM +0200, netdev@kapio-technology.com wrote:
On 2022-07-08 11:15, Vladimir Oltean wrote:
When the possibility for it to be true will exist, _all_ switchdev drivers will need to be updated to ignore that (mlxsw, cpss, ocelot, rocker, prestera, etc etc), not just DSA. And you don't need to propagate the is_locked flag to all individual DSA sub-drivers when none care about is_locked in the ADD_TO_DEVICE direction, you can just ignore within DSA until needed otherwise.
Maybe I have it wrong, but I think that Ido requested me to send it to all the drivers, and have them ignore entries with is_locked=true ...
I don't think Ido requested you to ignore is_locked from all DSA drivers, but instead from all switchdev drivers maybe. Quite different.
So without changing the signature on port_fdb_add(). If that is to avoid changing that signature, which needs to be changed anyhow for any switchcore driver to act on it, then my next patch set will change the signarure also as it is needed for creating dynamic ATU entries from userspace, which is needed to make the whole thing complete.
As it is already done (with the is_locked to the drivers) and needed for future application, I would like Ido to comment on it before I take action.
It's related to my reply here [1]. AFAICT, we have two classes of device drivers:
- Drivers like mv88e6xxx that report locked entries to the bridge
driver via 'SWITCHDEV_FDB_ADD_TO_BRIDGE'.
- Drivers like mlxsw that trap packets that incurred an FDB miss to
the bridge driver. These packets will cause the bridge driver to emit 'SWITCHDEV_FDB_ADD_TO_DEVICE' notifications with the locked flag.
If we can agree that locked entries are only meant to signal to user space that a certain MAC tried to gain authorization and that the bridge should ignore them while forwarding, then there is no point in generating the 'SWITCHDEV_FDB_ADD_TO_DEVICE' notifications. We should teach the bridge driver to suppress these so that there is no need to patch all the device drivers.
I do not know of all about what other switchcores there are and how they work, but those that I have knowledge of, it has been prudent in connection with the locked port feature to install Storm Prevention or zero-DPV (Destination Port Vector) FDB entries. I would think that that should be the case for other switchcores too. Those entries cannot normally be installed from userspace (of course special tools can do anything).
But if the decision is to drop locked entries at the DSA layer, I can do that. I just want to ensure that all considerations have been taken.
[1] https://lore.kernel.org/netdev/YsqLyxTRtUjzDj6D@shredder/
In any case I'm going to take a look at this patch set more closely and run the selftest on my Marvell switch, but I can't do this today unfortunately. I'll return with more comments.
Yes :-)
On Wed, Jul 13, 2022 at 09:09:58AM +0200, netdev@kapio-technology.com wrote:
On 2022-07-10 10:35, Ido Schimmel wrote:
On Fri, Jul 08, 2022 at 02:34:25PM +0200, netdev@kapio-technology.com wrote:
On 2022-07-08 13:56, Vladimir Oltean wrote:
On Fri, Jul 08, 2022 at 11:50:33AM +0200, netdev@kapio-technology.com wrote:
On 2022-07-08 11:15, Vladimir Oltean wrote:
When the possibility for it to be true will exist, _all_ switchdev drivers will need to be updated to ignore that (mlxsw, cpss, ocelot, rocker, prestera, etc etc), not just DSA. And you don't need to propagate the is_locked flag to all individual DSA sub-drivers when none care about is_locked in the ADD_TO_DEVICE direction, you can just ignore within DSA until needed otherwise.
Maybe I have it wrong, but I think that Ido requested me to send it to all the drivers, and have them ignore entries with is_locked=true ...
I don't think Ido requested you to ignore is_locked from all DSA drivers, but instead from all switchdev drivers maybe. Quite different.
So without changing the signature on port_fdb_add(). If that is to avoid changing that signature, which needs to be changed anyhow for any switchcore driver to act on it, then my next patch set will change the signarure also as it is needed for creating dynamic ATU entries from userspace, which is needed to make the whole thing complete.
As it is already done (with the is_locked to the drivers) and needed for future application, I would like Ido to comment on it before I take action.
It's related to my reply here [1]. AFAICT, we have two classes of device drivers:
- Drivers like mv88e6xxx that report locked entries to the bridge
driver via 'SWITCHDEV_FDB_ADD_TO_BRIDGE'.
- Drivers like mlxsw that trap packets that incurred an FDB miss to the
bridge driver. These packets will cause the bridge driver to emit 'SWITCHDEV_FDB_ADD_TO_DEVICE' notifications with the locked flag.
If we can agree that locked entries are only meant to signal to user space that a certain MAC tried to gain authorization and that the bridge should ignore them while forwarding, then there is no point in generating the 'SWITCHDEV_FDB_ADD_TO_DEVICE' notifications. We should teach the bridge driver to suppress these so that there is no need to patch all the device drivers.
I do not know of all about what other switchcores there are and how they work, but those that I have knowledge of, it has been prudent in connection with the locked port feature to install Storm Prevention or zero-DPV (Destination Port Vector) FDB entries.
What are "Storm Prevention" and "zero-DPV" FDB entries?
I would think that that should be the case for other switchcores too. Those entries cannot normally be installed from userspace (of course special tools can do anything).
But if the decision is to drop locked entries at the DSA layer, I can do that. I just want to ensure that all considerations have been taken.
There is no decision that I'm aware of. I'm simply trying to understand how FDB entries that have 'BR_FDB_ENTRY_LOCKED' set are handled in mv88e6xxx and other devices in this class. We have at least three different implementations to consolidate:
1. The bridge driver, pure software forwarding. The locked entry is dynamically created by the bridge. Packets received via the locked port with a SA corresponding to the locked entry will be dropped, but will refresh the entry. On the other hand, packets with a DA corresponding to the locked entry will be forwarded as known unicast through the locked port.
2. Hardware implementations like Spectrum that can be programmed to trap packets that incurred an FDB miss. Like in the first case, the locked entry is dynamically created by the bridge driver and also aged by it. Unlike in the first case, since this entry is not present in hardware, packets with a DA corresponding to the locked entry will be flooded as unknown unicast.
3. Hardware implementations like mv88e6xxx that fire an interrupt upon FDB miss. Need your help to understand how the above works there and why. Specifically, how locked entries are represented in hardware (if at all) and what is the significance of not installing corresponding entries in hardware.
[1] https://lore.kernel.org/netdev/YsqLyxTRtUjzDj6D@shredder/
In any case I'm going to take a look at this patch set more closely and run the selftest on my Marvell switch, but I can't do this today unfortunately. I'll return with more comments.
Yes :-)
On 2022-07-13 14:39, Ido Schimmel wrote:
On Wed, Jul 13, 2022 at 09:09:58AM +0200, netdev@kapio-technology.com wrote:
What are "Storm Prevention" and "zero-DPV" FDB entries?
They are both FDB entries that at the HW level drops all packets having a specific SA, thus using minimum resources. (thus the name "Storm Prevention" aka, protection against DOS attacks. We must remember that we operate with CPU based learning.)
There is no decision that I'm aware of. I'm simply trying to understand how FDB entries that have 'BR_FDB_ENTRY_LOCKED' set are handled in mv88e6xxx and other devices in this class. We have at least three different implementations to consolidate:
- The bridge driver, pure software forwarding. The locked entry is
dynamically created by the bridge. Packets received via the locked port with a SA corresponding to the locked entry will be dropped, but will refresh the entry. On the other hand, packets with a DA corresponding to the locked entry will be forwarded as known unicast through the locked port.
- Hardware implementations like Spectrum that can be programmed to
trap packets that incurred an FDB miss. Like in the first case, the locked entry is dynamically created by the bridge driver and also aged by it. Unlike in the first case, since this entry is not present in hardware, packets with a DA corresponding to the locked entry will be flooded as unknown unicast.
- Hardware implementations like mv88e6xxx that fire an interrupt upon
FDB miss. Need your help to understand how the above works there and why. Specifically, how locked entries are represented in hardware (if at all) and what is the significance of not installing corresponding entries in hardware.
With the mv88e6xxx, a miss violation with the SA occurs when there is no entry. If you then add a normal entry with the SA, the port is open for that SA of course. The zero-DPV entry is an entry that ensures that there is no more miss violation interrupts from that SA, while dropping all entries with the SA.
On Sun, Jul 17, 2022 at 02:21:47PM +0200, netdev@kapio-technology.com wrote:
On 2022-07-13 14:39, Ido Schimmel wrote:
On Wed, Jul 13, 2022 at 09:09:58AM +0200, netdev@kapio-technology.com wrote:
What are "Storm Prevention" and "zero-DPV" FDB entries?
They are both FDB entries that at the HW level drops all packets having a specific SA, thus using minimum resources. (thus the name "Storm Prevention" aka, protection against DOS attacks. We must remember that we operate with CPU based learning.)
DPV means Destination Port Vector, and an ATU entry with a DPV of 0 essentially means a FDB entry pointing nowhere, so it will drop the packet. That's a slight problem with Hans' implementation, the bridge thinks that the locked FDB entry belongs to port X, but in reality it matches on all bridged ports (since it matches by FID). FID allocation in mv88e6xxx is slightly strange, all VLAN-unaware bridge ports, belonging to any bridge, share the same FID, so the FDB databases are not exactly isolated from each other.
On 2022-07-17 14:57, Vladimir Oltean wrote:
On Sun, Jul 17, 2022 at 02:21:47PM +0200, netdev@kapio-technology.com wrote:
On 2022-07-13 14:39, Ido Schimmel wrote:
On Wed, Jul 13, 2022 at 09:09:58AM +0200, netdev@kapio-technology.com wrote:
What are "Storm Prevention" and "zero-DPV" FDB entries?
They are both FDB entries that at the HW level drops all packets having a specific SA, thus using minimum resources. (thus the name "Storm Prevention" aka, protection against DOS attacks. We must remember that we operate with CPU based learning.)
DPV means Destination Port Vector, and an ATU entry with a DPV of 0 essentially means a FDB entry pointing nowhere, so it will drop the packet. That's a slight problem with Hans' implementation, the bridge thinks that the locked FDB entry belongs to port X, but in reality it matches on all bridged ports (since it matches by FID). FID allocation in mv88e6xxx is slightly strange, all VLAN-unaware bridge ports, belonging to any bridge, share the same FID, so the FDB databases are not exactly isolated from each other.
But if the locked port is vlan aware and has a pvid, it should not block other ports. Besides the fid will be zero with vlan unaware afaik, and all with zero fid do not create locked entries.
On Sun, Jul 17, 2022 at 03:09:10PM +0200, netdev@kapio-technology.com wrote:
On 2022-07-17 14:57, Vladimir Oltean wrote:
On Sun, Jul 17, 2022 at 02:21:47PM +0200, netdev@kapio-technology.com wrote:
On 2022-07-13 14:39, Ido Schimmel wrote:
On Wed, Jul 13, 2022 at 09:09:58AM +0200, netdev@kapio-technology.com wrote:
What are "Storm Prevention" and "zero-DPV" FDB entries?
They are both FDB entries that at the HW level drops all packets having a specific SA, thus using minimum resources. (thus the name "Storm Prevention" aka, protection against DOS attacks. We must remember that we operate with CPU based learning.)
DPV means Destination Port Vector, and an ATU entry with a DPV of 0 essentially means a FDB entry pointing nowhere, so it will drop the packet. That's a slight problem with Hans' implementation, the bridge thinks that the locked FDB entry belongs to port X, but in reality it matches on all bridged ports (since it matches by FID). FID allocation in mv88e6xxx is slightly strange, all VLAN-unaware bridge ports, belonging to any bridge, share the same FID, so the FDB databases are not exactly isolated from each other.
But if the locked port is vlan aware and has a pvid, it should not block other ports.
I don't understand what you want to say by that. It will block all other packets with the same MAC SA that are classified to the same FID. In case of VLAN-aware bridges, the mv88e6xxx driver allocates a new FID for each VID (see mv88e6xxx_atu_new). In other words, if a locked port is VLAN-aware and has a pvid, then whatever the PVID may be, all ports in that same VLAN are still blocked in the same way.
Besides the fid will be zero with vlan unaware afaik, and all with zero fid do not create locked entries.
If by 0 you mean 1 (MV88E6XXX_FID_BRIDGED), then you are correct: ports with FID 0 (MV88E6XXX_FID_STANDALONE) should not create locked FDB entries, because they are, well, standalone and not bridged. Again I don't exactly see the relevance though.
On 2022-07-17 15:59, Vladimir Oltean wrote:
On Sun, Jul 17, 2022 at 03:09:10PM +0200, netdev@kapio-technology.com wrote:
On 2022-07-17 14:57, Vladimir Oltean wrote:
On Sun, Jul 17, 2022 at 02:21:47PM +0200, netdev@kapio-technology.com wrote:
On 2022-07-13 14:39, Ido Schimmel wrote:
On Wed, Jul 13, 2022 at 09:09:58AM +0200, netdev@kapio-technology.com wrote:
What are "Storm Prevention" and "zero-DPV" FDB entries?
They are both FDB entries that at the HW level drops all packets having a specific SA, thus using minimum resources. (thus the name "Storm Prevention" aka, protection against DOS attacks. We must remember that we operate with CPU based learning.)
DPV means Destination Port Vector, and an ATU entry with a DPV of 0 essentially means a FDB entry pointing nowhere, so it will drop the packet. That's a slight problem with Hans' implementation, the bridge thinks that the locked FDB entry belongs to port X, but in reality it matches on all bridged ports (since it matches by FID). FID allocation in mv88e6xxx is slightly strange, all VLAN-unaware bridge ports, belonging to any bridge, share the same FID, so the FDB databases are not exactly isolated from each other.
But if the locked port is vlan aware and has a pvid, it should not block other ports.
I don't understand what you want to say by that. It will block all other packets with the same MAC SA that are classified to the same FID. In case of VLAN-aware bridges, the mv88e6xxx driver allocates a new FID for each VID (see mv88e6xxx_atu_new). In other words, if a locked port is VLAN-aware and has a pvid, then whatever the PVID may be, all ports in that same VLAN are still blocked in the same way.
Maybe I am just trying to understand the problem you are posing, so afaics MAC addresses should be unique and having the same MAC address behind a locked port and a not-locked port seems like a mis-configuration regardless of vlan setup? As the zero-DPV entry only blocks the specific SA MAC on a specific vlan, which is behind a locked port, there shouldn't be any problem...?
If the host behind a locked port starts sending on another vlan than where it got the first locked entry, another locked entry will occur, as the locked entries are MAC + vlan.
Besides the fid will be zero with vlan unaware afaik, and all with zero fid do not create locked entries.
If by 0 you mean 1 (MV88E6XXX_FID_BRIDGED), then you are correct: ports with FID 0 (MV88E6XXX_FID_STANDALONE) should not create locked FDB entries, because they are, well, standalone and not bridged. Again I don't exactly see the relevance though.
On Sun, Jul 17, 2022 at 04:57:50PM +0200, netdev@kapio-technology.com wrote:
Maybe I am just trying to understand the problem you are posing, so afaics MAC addresses should be unique and having the same MAC address behind a locked port and a not-locked port seems like a mis-configuration regardless of vlan setup? As the zero-DPV entry only blocks the specific SA MAC on a specific vlan, which is behind a locked port, there shouldn't be any problem...?
If the host behind a locked port starts sending on another vlan than where it got the first locked entry, another locked entry will occur, as the locked entries are MAC + vlan.
I don't think it's an invalid configuration, I have a 17-port Marvell switch which I use as infrastructure to connect my PC with my board farm and to the Internet. I've cropped 4 out of those 17 ports for use in selftests, effectively now having 2 bridges (br0 used by the selftests and br-lan for systemd-networkd).
Currently all the traffic sent and received by the selftests is done through lan1-lan4, but if I wanted to run some bridge locked port tests with traffic from my PC, what I'd do is I'd connect a (locked) port from br0 to a port from br-lan, and my PC would thus gain indirect connectivity to the locked port.
Then I'd send a packet and the switch would create a locked FDB entry for my PC's MAC address, but that FDB entry would span across the entire MV88E6XXX_FID_BRIDGED, so practically speaking, it would block my PC's MAC address from doing anything, including accessing the Internet, i.e. traffic that has nothing at all to do with the locked port in br0. That isn't quite ok.
On 2022-07-17 17:08, Vladimir Oltean wrote:
On Sun, Jul 17, 2022 at 04:57:50PM +0200, netdev@kapio-technology.com wrote:
Maybe I am just trying to understand the problem you are posing, so afaics MAC addresses should be unique and having the same MAC address behind a locked port and a not-locked port seems like a mis-configuration regardless of vlan setup? As the zero-DPV entry only blocks the specific SA MAC on a specific vlan, which is behind a locked port, there shouldn't be any problem...?
If the host behind a locked port starts sending on another vlan than where it got the first locked entry, another locked entry will occur, as the locked entries are MAC + vlan.
I don't think it's an invalid configuration, I have a 17-port Marvell switch which I use as infrastructure to connect my PC with my board farm and to the Internet. I've cropped 4 out of those 17 ports for use in selftests, effectively now having 2 bridges (br0 used by the selftests and br-lan for systemd-networkd).
Currently all the traffic sent and received by the selftests is done through lan1-lan4, but if I wanted to run some bridge locked port tests with traffic from my PC, what I'd do is I'd connect a (locked) port from br0 to a port from br-lan, and my PC would thus gain indirect connectivity to the locked port.
Then I'd send a packet and the switch would create a locked FDB entry for my PC's MAC address, but that FDB entry would span across the entire MV88E6XXX_FID_BRIDGED, so practically speaking, it would block my PC's MAC address from doing anything, including accessing the Internet, i.e. traffic that has nothing at all to do with the locked port in br0. That isn't quite ok.
Okay, I see the problem you refer to. I think that we have to accept some limitations unless you think that just zeroing the specific port bit in the DPV would be a better solution, and there wouldn't be caveats with that besides having to do a FDB search etc to get the correct DPV if I am not too mistaken.
Also trunk ports is a limitation as that is not supported in this implementation.
On Sun, Jul 17, 2022 at 06:10:22PM +0200, netdev@kapio-technology.com wrote:
Okay, I see the problem you refer to. I think that we have to accept some limitations unless you think that just zeroing the specific port bit in the DPV would be a better solution, and there wouldn't be caveats with that besides having to do a FDB search etc to get the correct DPV if I am not too mistaken.
No, honestly I believe that what we should do to improve the limitation is to have proper ATU database separation between one VLAN-unaware bridge and another (i.e. what is now MV88E6XXX_FID_BRIDGED to be one FID per bridge).
On Sun, Jul 17, 2022 at 02:21:47PM +0200, netdev@kapio-technology.com wrote:
On 2022-07-13 14:39, Ido Schimmel wrote:
On Wed, Jul 13, 2022 at 09:09:58AM +0200, netdev@kapio-technology.com wrote:
What are "Storm Prevention" and "zero-DPV" FDB entries?
They are both FDB entries that at the HW level drops all packets having a specific SA, thus using minimum resources. (thus the name "Storm Prevention" aka, protection against DOS attacks. We must remember that we operate with CPU based learning.)
There is no decision that I'm aware of. I'm simply trying to understand how FDB entries that have 'BR_FDB_ENTRY_LOCKED' set are handled in mv88e6xxx and other devices in this class. We have at least three different implementations to consolidate:
- The bridge driver, pure software forwarding. The locked entry is
dynamically created by the bridge. Packets received via the locked port with a SA corresponding to the locked entry will be dropped, but will refresh the entry. On the other hand, packets with a DA corresponding to the locked entry will be forwarded as known unicast through the locked port.
- Hardware implementations like Spectrum that can be programmed to trap
packets that incurred an FDB miss. Like in the first case, the locked entry is dynamically created by the bridge driver and also aged by it. Unlike in the first case, since this entry is not present in hardware, packets with a DA corresponding to the locked entry will be flooded as unknown unicast.
- Hardware implementations like mv88e6xxx that fire an interrupt upon
FDB miss. Need your help to understand how the above works there and why. Specifically, how locked entries are represented in hardware (if at all) and what is the significance of not installing corresponding entries in hardware.
With the mv88e6xxx, a miss violation with the SA occurs when there is no entry. If you then add a normal entry with the SA, the port is open for that SA of course.
Good
The zero-DPV entry is an entry that ensures that there is no more miss violation interrupts from that SA, while dropping all entries with the SA.
Few questions:
1. Is it correct to think of this entry as an entry pointing to a special /dev/null port?
2. After installing this entry, you no longer get miss violation interrupts because packets with this SA incur a mismatch violation (src_port != /dev/null) and therefore discarded in hardware?
3. What happens to packets with a DA matching the zero-DPV entry, are they also discarded in hardware? If so, here we differ from the bridge driver implementation where such packets will be forwarded according to the locked entry and egress the locked port
4. The reason for installing this entry is to suppress further miss violation interrupts?
5. If not replaced, will this entry always age out after the ageing time? Not sure what can refresh it given that traffic does not ingress from the /dev/null port.
Thanks
On 2022-07-17 17:20, Ido Schimmel wrote:
On Sun, Jul 17, 2022 at 02:21:47PM +0200, netdev@kapio-technology.com wrote:
On 2022-07-13 14:39, Ido Schimmel wrote:
On Wed, Jul 13, 2022 at 09:09:58AM +0200, netdev@kapio-technology.com wrote:
What are "Storm Prevention" and "zero-DPV" FDB entries?
They are both FDB entries that at the HW level drops all packets having a specific SA, thus using minimum resources. (thus the name "Storm Prevention" aka, protection against DOS attacks. We must remember that we operate with CPU based learning.)
There is no decision that I'm aware of. I'm simply trying to understand how FDB entries that have 'BR_FDB_ENTRY_LOCKED' set are handled in mv88e6xxx and other devices in this class. We have at least three different implementations to consolidate:
- The bridge driver, pure software forwarding. The locked entry is
dynamically created by the bridge. Packets received via the locked port with a SA corresponding to the locked entry will be dropped, but will refresh the entry. On the other hand, packets with a DA corresponding to the locked entry will be forwarded as known unicast through the locked port.
- Hardware implementations like Spectrum that can be programmed to trap
packets that incurred an FDB miss. Like in the first case, the locked entry is dynamically created by the bridge driver and also aged by it. Unlike in the first case, since this entry is not present in hardware, packets with a DA corresponding to the locked entry will be flooded as unknown unicast.
- Hardware implementations like mv88e6xxx that fire an interrupt upon
FDB miss. Need your help to understand how the above works there and why. Specifically, how locked entries are represented in hardware (if at all) and what is the significance of not installing corresponding entries in hardware.
With the mv88e6xxx, a miss violation with the SA occurs when there is no entry. If you then add a normal entry with the SA, the port is open for that SA of course.
Good
The zero-DPV entry is an entry that ensures that there is no more miss violation interrupts from that SA, while dropping all entries with the SA.
Few questions:
- Is it correct to think of this entry as an entry pointing to a
special /dev/null port?
I guess you can think of it like that. It's internal to the chipset how it does it.
- After installing this entry, you no longer get miss violation
interrupts because packets with this SA incur a mismatch violation (src_port != /dev/null) and therefore discarded in hardware?
Yes, and mismatch violations are suppressed in this implementation when locking the port.
- What happens to packets with a DA matching the zero-DPV entry, are
they also discarded in hardware? If so, here we differ from the bridge driver implementation where such packets will be forwarded according to the locked entry and egress the locked port
I understand that egress will follow what is setup with regard to UC, MC and BC, though I haven't tested that. But no replies will get through of course as long as the port hasn't been opened for the iface behind the locked port.
- The reason for installing this entry is to suppress further miss
violation interrupts?
Yes, while still HW dropping all ingress packets with the same (SA-mac, vlan) on the port.
- If not replaced, will this entry always age out after the ageing
time? Not sure what can refresh it given that traffic does not ingress from the /dev/null port.
That is where my implementation keeps the entries in a list and removes them after the bridge timeout using a kernel worker and jiffies. So by default they age out after approx. 5 min.
Thanks
On Sun, Jul 17, 2022 at 05:53:22PM +0200, netdev@kapio-technology.com wrote:
- What happens to packets with a DA matching the zero-DPV entry, are
they also discarded in hardware? If so, here we differ from the bridge driver implementation where such packets will be forwarded according to the locked entry and egress the locked port
I understand that egress will follow what is setup with regard to UC, MC and BC, though I haven't tested that. But no replies will get through of course as long as the port hasn't been opened for the iface behind the locked port.
Here, should we be rather fixing the software bridge, if the current behavior is to forward packets towards locked FDB entries?
On Thu, Jul 21, 2022 at 02:59:35PM +0300, Vladimir Oltean wrote:
On Sun, Jul 17, 2022 at 05:53:22PM +0200, netdev@kapio-technology.com wrote:
- What happens to packets with a DA matching the zero-DPV entry, are
they also discarded in hardware? If so, here we differ from the bridge driver implementation where such packets will be forwarded according to the locked entry and egress the locked port
I understand that egress will follow what is setup with regard to UC, MC and BC, though I haven't tested that. But no replies will get through of course as long as the port hasn't been opened for the iface behind the locked port.
Here, should we be rather fixing the software bridge, if the current behavior is to forward packets towards locked FDB entries?
I think the bridge needs to be fixed, but not to discard packets. If I decided to lock a port, it means I do not blindly trust whoever who is behind the port, but instead want to authorize them first. Since an unauthorized user is able to create locked FDB entries we need to carefully define what they mean. I tried looking information about MAB online, but couldn't find detailed material that answers my questions, so my answers are based on what I believe is logical, which might be wrong.
Currently, the bridge will forward packets to a locked entry which effectively means that an unauthorized host can cause the bridge to direct packets to it and sniff them. Yes, the host can't send any packets through the port (while locked) and can't overtake an existing (unlocked) FDB entry, but it still seems like an odd decision. IMO, the situation in mv88e6xxx is even worse because there an unauthorized host can cause packets to a certain DMAC to be blackholed via its zero-DPV entry.
Another (minor?) issue is that locked entries cannot roam between locked ports. Lets say that my user space MAB policy is to authorize MAC X if it appears behind one of the locked ports swp1-swp4. An unauthorized host behind locked port swp5 can generate packets with SMAC X, preventing the true owner of this MAC behind swp1 from ever being authorized.
It seems like the main purpose of these locked entries is to signal to user space the presence of a certain MAC behind a locked port, but they should not be able to affect packet forwarding in the bridge, unlike regular entries.
Regarding a separate knob for MAB, I tend to agree we need it. Otherwise we cannot control which locked ports are able to populate the FDB with locked entries. I don't particularly like the fact that we overload an existing flag ("learning") for that. Any reason not to add an explicit flag ("mab")? At least with the current implementation, locked entries cannot roam between locked ports and cannot be refreshed, which differs from regular learning.
On Thu, Jul 21, 2022 at 04:27:52PM +0300, Ido Schimmel wrote:
I tried looking information about MAB online, but couldn't find detailed material that answers my questions, so my answers are based on what I believe is logical, which might be wrong.
I'm kind of in the same situation here.
Currently, the bridge will forward packets to a locked entry which effectively means that an unauthorized host can cause the bridge to direct packets to it and sniff them. Yes, the host can't send any packets through the port (while locked) and can't overtake an existing (unlocked) FDB entry, but it still seems like an odd decision. IMO, the situation in mv88e6xxx is even worse because there an unauthorized host can cause packets to a certain DMAC to be blackholed via its zero-DPV entry.
Another (minor?) issue is that locked entries cannot roam between locked ports. Lets say that my user space MAB policy is to authorize MAC X if it appears behind one of the locked ports swp1-swp4. An unauthorized host behind locked port swp5 can generate packets with SMAC X, preventing the true owner of this MAC behind swp1 from ever being authorized.
In the mv88e6xxx offload implementation, the locked entries eventually age out from time to time, practically giving the true owner of the MAC address another chance every 5 minutes or so. In the pure software implementation of locked FDB entries I'm not quite sure. It wouldn't make much sense for the behavior to differ significantly though.
It seems like the main purpose of these locked entries is to signal to user space the presence of a certain MAC behind a locked port, but they should not be able to affect packet forwarding in the bridge, unlike regular entries.
So essentially what you want is for br_handle_frame_finish() to treat "dst = br_fdb_find_rcu(br, eth_hdr(skb)->h_dest, vid);" as NULL if test_bit(BR_FDB_LOCKED, &dst->flags) is true?
Regarding a separate knob for MAB, I tend to agree we need it. Otherwise we cannot control which locked ports are able to populate the FDB with locked entries. I don't particularly like the fact that we overload an existing flag ("learning") for that. Any reason not to add an explicit flag ("mab")? At least with the current implementation, locked entries cannot roam between locked ports and cannot be refreshed, which differs from regular learning.
Well, assuming we model the software bridge closer to mv88e6xxx (where locked FDB entries can roam after a certain time), does this change things? In the software implementation I think it would make sense for them to be able to roam right away (the age-out interval in mv88e6xxx is just a compromise between responsiveness to roaming and resistance to DoS).
On Thu, Jul 21, 2022 at 05:20:01PM +0300, Vladimir Oltean wrote:
On Thu, Jul 21, 2022 at 04:27:52PM +0300, Ido Schimmel wrote:
I tried looking information about MAB online, but couldn't find detailed material that answers my questions, so my answers are based on what I believe is logical, which might be wrong.
I'm kind of in the same situation here.
:(
Currently, the bridge will forward packets to a locked entry which effectively means that an unauthorized host can cause the bridge to direct packets to it and sniff them. Yes, the host can't send any packets through the port (while locked) and can't overtake an existing (unlocked) FDB entry, but it still seems like an odd decision. IMO, the situation in mv88e6xxx is even worse because there an unauthorized host can cause packets to a certain DMAC to be blackholed via its zero-DPV entry.
Another (minor?) issue is that locked entries cannot roam between locked ports. Lets say that my user space MAB policy is to authorize MAC X if it appears behind one of the locked ports swp1-swp4. An unauthorized host behind locked port swp5 can generate packets with SMAC X, preventing the true owner of this MAC behind swp1 from ever being authorized.
In the mv88e6xxx offload implementation, the locked entries eventually age out from time to time, practically giving the true owner of the MAC address another chance every 5 minutes or so. In the pure software implementation of locked FDB entries I'm not quite sure. It wouldn't make much sense for the behavior to differ significantly though.
From what I can tell, the same happens in software, but this behavior does not really make sense to me. It differs from how other learned entries age/roam and can lead to problems such as the one described above. It is also not documented anywhere, so I can't tell if it's intentional or an oversight. We need to have a good reason for such a behavior other than the fact that it appears to conform to the quirks of one hardware implementation.
It seems like the main purpose of these locked entries is to signal to user space the presence of a certain MAC behind a locked port, but they should not be able to affect packet forwarding in the bridge, unlike regular entries.
So essentially what you want is for br_handle_frame_finish() to treat "dst = br_fdb_find_rcu(br, eth_hdr(skb)->h_dest, vid);" as NULL if test_bit(BR_FDB_LOCKED, &dst->flags) is true?
Yes. It's not clear to me why unauthorized hosts should be given the ability to affect packet forwarding in the bridge through these locked entries when their primary purpose seems to be notifying user space about the presence of the MAC. At the very least this should be explained in the commit message, to indicate that some thought went into this decision.
Regarding a separate knob for MAB, I tend to agree we need it. Otherwise we cannot control which locked ports are able to populate the FDB with locked entries. I don't particularly like the fact that we overload an existing flag ("learning") for that. Any reason not to add an explicit flag ("mab")? At least with the current implementation, locked entries cannot roam between locked ports and cannot be refreshed, which differs from regular learning.
Well, assuming we model the software bridge closer to mv88e6xxx (where locked FDB entries can roam after a certain time), does this change things? In the software implementation I think it would make sense for them to be able to roam right away (the age-out interval in mv88e6xxx is just a compromise between responsiveness to roaming and resistance to DoS).
Exactly. If this is the best that we can do with mv88e6xxx, then so be it, but other implementations (software/hardware) do not have the same limitations and I don't see a reason to bend them.
Regarding "learning" vs. "mab" (or something else), the former is a well-defined flag available since forever. In 5.18 and 5.19 it can also be enabled together with "locked" and packets from an unauthorized host (modulo link-local ones) will not populate the FDB. I prefer not to change an existing behavior.
From usability point of view, I think a new flag would be easier to explain than explaining that "learning on" behaves like A or B, based on whether "locked on" is set. The bridge can also be taught to forbid the new flag from being set when "locked" is not set.
A user space daemon that wants to try 802.1x and fallback to MAB can enable both flags or enable "mab" after some timer expires.
On 2022-07-24 13:10, Ido Schimmel wrote:
On Thu, Jul 21, 2022 at 05:20:01PM +0300, Vladimir Oltean wrote:
On Thu, Jul 21, 2022 at 04:27:52PM +0300, Ido Schimmel wrote:
I tried looking information about MAB online, but couldn't find detailed material that answers my questions, so my answers are based on what I believe is logical, which might be wrong.
I'm kind of in the same situation here.
:(
Sorry for being off here a while... here is my best link regarding MAB. Maybe it can answer some of your questions...
https://www.cisco.com/c/en/us/td/docs/solutions/Enterprise/Security/TrustSec...
On 2022-07-24 13:10, Ido Schimmel wrote:
In the mv88e6xxx offload implementation, the locked entries eventually age out from time to time, practically giving the true owner of the MAC address another chance every 5 minutes or so. In the pure software implementation of locked FDB entries I'm not quite sure. It wouldn't make much sense for the behavior to differ significantly though.
From what I can tell, the same happens in software, but this behavior does not really make sense to me. It differs from how other learned entries age/roam and can lead to problems such as the one described above. It is also not documented anywhere, so I can't tell if it's intentional or an oversight. We need to have a good reason for such a behavior other than the fact that it appears to conform to the quirks of one hardware implementation.
It seems like the main purpose of these locked entries is to signal to user space the presence of a certain MAC behind a locked port, but they should not be able to affect packet forwarding in the bridge, unlike regular entries.
So essentially what you want is for br_handle_frame_finish() to treat "dst = br_fdb_find_rcu(br, eth_hdr(skb)->h_dest, vid);" as NULL if test_bit(BR_FDB_LOCKED, &dst->flags) is true?
Yes. It's not clear to me why unauthorized hosts should be given the ability to affect packet forwarding in the bridge through these locked entries when their primary purpose seems to be notifying user space about the presence of the MAC. At the very least this should be explained in the commit message, to indicate that some thought went into this decision.
I guess you are right that the SW setup locked entries can be used to gain uni-directional traffic through a switch, which should really not be the case. In this case I expect the zero-DPV entries to not give this ability, which is the correct behaviour with MAB IMHO.
Regarding a separate knob for MAB, I tend to agree we need it. Otherwise we cannot control which locked ports are able to populate the FDB with locked entries. I don't particularly like the fact that we overload an existing flag ("learning") for that. Any reason not to add an explicit flag ("mab")? At least with the current implementation, locked entries cannot roam between locked ports and cannot be refreshed, which differs from regular learning.
Well, assuming we model the software bridge closer to mv88e6xxx (where locked FDB entries can roam after a certain time), does this change things? In the software implementation I think it would make sense for them to be able to roam right away (the age-out interval in mv88e6xxx is just a compromise between responsiveness to roaming and resistance to DoS).
Exactly. If this is the best that we can do with mv88e6xxx, then so be it, but other implementations (software/hardware) do not have the same limitations and I don't see a reason to bend them.
Regarding "learning" vs. "mab" (or something else), the former is a well-defined flag available since forever. In 5.18 and 5.19 it can also be enabled together with "locked" and packets from an unauthorized host (modulo link-local ones) will not populate the FDB. I prefer not to change an existing behavior.
From usability point of view, I think a new flag would be easier to explain than explaining that "learning on" behaves like A or B, based on whether "locked on" is set. The bridge can also be taught to forbid the new flag from being set when "locked" is not set.
A user space daemon that wants to try 802.1x and fallback to MAB can enable both flags or enable "mab" after some timer expires.
With this driver it is not really an option to use +learning for a opt-in for MAB. I think locked port should always have -learning before locking the port. In fact there is a problem in this implementation with MAB if -learning is applied after locking the port, as that will disable MAB, but also refresh and all other violation interrupts. So I guess I need to disable the learning flag to the driver when the port is locked, or even from the bridge?
On 2022-07-21 13:59, Vladimir Oltean wrote:
On Sun, Jul 17, 2022 at 05:53:22PM +0200, netdev@kapio-technology.com wrote:
- What happens to packets with a DA matching the zero-DPV entry, are
they also discarded in hardware? If so, here we differ from the bridge driver implementation where such packets will be forwarded according to the locked entry and egress the locked port
I understand that egress will follow what is setup with regard to UC, MC and BC, though I haven't tested that. But no replies will get through of course as long as the port hasn't been opened for the iface behind the locked port.
Here, should we be rather fixing the software bridge, if the current behavior is to forward packets towards locked FDB entries?
Yes, I think that locked entries should block egress to the respective hosts behind the locked port, which should be fixed in the bridge.
On 2022-07-13 14:39, Ido Schimmel wrote:
What are "Storm Prevention" and "zero-DPV" FDB entries?
For the zero-DPV entries, I can summarize:
Since a CPU can become saturated from constant SA Miss Violations from a denied source, source MAC address are masked by loading a zero-DPV (Destination Port Vector) entry in the ATU. As the address now appears in the database it will not cause more Miss Violations. ANY port trying to send a frame to this unauthorized address is discarded. Any locked port trying to use this unauthorized address has its frames discarded too (as the ports SA bit is not set in the ATU entry).
On Mon, Aug 01, 2022 at 05:33:49PM +0200, netdev@kapio-technology.com wrote:
On 2022-07-13 14:39, Ido Schimmel wrote:
What are "Storm Prevention" and "zero-DPV" FDB entries?
For the zero-DPV entries, I can summarize:
Since a CPU can become saturated from constant SA Miss Violations from a denied source, source MAC address are masked by loading a zero-DPV (Destination Port Vector) entry in the ATU. As the address now appears in the database it will not cause more Miss Violations. ANY port trying to send a frame to this unauthorized address is discarded. Any locked port trying to use this unauthorized address has its frames discarded too (as the ports SA bit is not set in the ATU entry).
What happens to unlocked ports that have learning enabled and are trying to use this address as SMAC? AFAICT, at least in the bridge driver, the locked entry will roam, but will keep the "locked" flag, which is probably not what we want. Let's see if we can agree on these semantics for a "locked" entry:
1. It discards packets with matching DMAC, regardless of ingress port. I read the document [1] you linked to in a different reply and could not find anything against this approach, so this might be fine or at least not very significant.
Note that this means that "locked" entries need to be notified to device drivers so that they will install a matching entry in the HW FDB.
2. It is not refreshed and has ageing enabled. That is, after initial installation it will be removed by the bridge driver after configured ageing time unless converted to a regular (unlocked) entry.
I assume this allows you to remove the timer implementation from your driver and let the bridge driver notify you about the removal of this entry.
3. With regards to roaming, the entry cannot roam between locked ports (they need to have learning disabled anyway), but can roam to an unlocked port, in which case it becomes a regular entry that can roam and age.
If we agree on these semantics, then I can try to verify that at least Spectrum can support them (it seems mv88e6xxx can).
P.S. Sorry for the delay, I'm busy with other tasks at the moment.
[1] https://www.cisco.com/c/en/us/td/docs/solutions/Enterprise/Security/TrustSec...
On 2022-08-09 11:20, Ido Schimmel wrote:
On Mon, Aug 01, 2022 at 05:33:49PM +0200, netdev@kapio-technology.com wrote:
On 2022-07-13 14:39, Ido Schimmel wrote:
What are "Storm Prevention" and "zero-DPV" FDB entries?
For the zero-DPV entries, I can summarize:
Since a CPU can become saturated from constant SA Miss Violations from a denied source, source MAC address are masked by loading a zero-DPV (Destination Port Vector) entry in the ATU. As the address now appears in the database it will not cause more Miss Violations. ANY port trying to send a frame to this unauthorized address is discarded. Any locked port trying to use this unauthorized address has its frames discarded too (as the ports SA bit is not set in the ATU entry).
What happens to unlocked ports that have learning enabled and are trying to use this address as SMAC? AFAICT, at least in the bridge driver, the locked entry will roam, but will keep the "locked" flag, which is probably not what we want. Let's see if we can agree on these semantics for a "locked" entry:
The next version of this will block forwarding to locked entries in the bridge, so they will behave like the zero-DPV entries.
- It discards packets with matching DMAC, regardless of ingress port.
I read the document [1] you linked to in a different reply and could not find anything against this approach, so this might be fine or at least not very significant.
Note that this means that "locked" entries need to be notified to device drivers so that they will install a matching entry in the HW FDB.
Okay, so as V4 does (just without the error noted).
- It is not refreshed and has ageing enabled. That is, after initial
installation it will be removed by the bridge driver after configured ageing time unless converted to a regular (unlocked) entry.
I assume this allows you to remove the timer implementation from your driver and let the bridge driver notify you about the removal of this entry.
Okay, but only if the scheme is not so that the driver creates the locked entries itself, unless you indicate that the driver notifies the bridge, which then notifies back to the driver and installs the zero-DPV entry? If not I think the current implementation for the mv88e6xxx is fine.
- With regards to roaming, the entry cannot roam between locked ports
(they need to have learning disabled anyway), but can roam to an unlocked port, in which case it becomes a regular entry that can roam and age.
If we agree on these semantics, then I can try to verify that at least Spectrum can support them (it seems mv88e6xxx can).
The consensus here is that at least for the mv88e6xxx, learning should be on and link local learning should be blocked by the userspace setting you pointed to earlier.
P.S. Sorry for the delay, I'm busy with other tasks at the moment.
I understand :-)
[1] https://www.cisco.com/c/en/us/td/docs/solutions/Enterprise/Security/TrustSec...
On Tue, Aug 09, 2022 at 10:00:49PM +0200, netdev@kapio-technology.com wrote:
On 2022-08-09 11:20, Ido Schimmel wrote:
On Mon, Aug 01, 2022 at 05:33:49PM +0200, netdev@kapio-technology.com wrote:
On 2022-07-13 14:39, Ido Schimmel wrote:
What are "Storm Prevention" and "zero-DPV" FDB entries?
For the zero-DPV entries, I can summarize:
Since a CPU can become saturated from constant SA Miss Violations from a denied source, source MAC address are masked by loading a zero-DPV (Destination Port Vector) entry in the ATU. As the address now appears in the database it will not cause more Miss Violations. ANY port trying to send a frame to this unauthorized address is discarded. Any locked port trying to use this unauthorized address has its frames discarded too (as the ports SA bit is not set in the ATU entry).
What happens to unlocked ports that have learning enabled and are trying to use this address as SMAC? AFAICT, at least in the bridge driver, the locked entry will roam, but will keep the "locked" flag, which is probably not what we want. Let's see if we can agree on these semantics for a "locked" entry:
The next version of this will block forwarding to locked entries in the bridge, so they will behave like the zero-DPV entries.
I'm talking about roaming, not forwarding. Let's say you have a locked entry with MAC X pointing to port Y. Now you get a packet with SMAC X from port Z which is unlocked. Will the FDB entry roam to port Z? I think it should, but at least in current implementation it seems that the "locked" flag will not be reset and having locked entries pointing to an unlocked port looks like a bug.
- It discards packets with matching DMAC, regardless of ingress port. I
read the document [1] you linked to in a different reply and could not find anything against this approach, so this might be fine or at least not very significant.
Note that this means that "locked" entries need to be notified to device drivers so that they will install a matching entry in the HW FDB.
Okay, so as V4 does (just without the error noted).
- It is not refreshed and has ageing enabled. That is, after initial
installation it will be removed by the bridge driver after configured ageing time unless converted to a regular (unlocked) entry.
I assume this allows you to remove the timer implementation from your driver and let the bridge driver notify you about the removal of this entry.
Okay, but only if the scheme is not so that the driver creates the locked entries itself, unless you indicate that the driver notifies the bridge, which then notifies back to the driver and installs the zero-DPV entry? If not I think the current implementation for the mv88e6xxx is fine.
I don't see a problem in having the driver notifying the bridge about the installation of this entry and the bridge notifying the driver that the entry needs to be removed. It removes complexity from device drivers like mv88e6xxx and doesn't add extra complexity to the bridge driver.
Actually, there is one complication, 'SWITCHDEV_FDB_ADD_TO_BRIDGE' will add the locked entry as externally learned, which means the bridge will not age it. Might need something like this:
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c index e7f4fccb6adb..5f73d0b44ed9 100644 --- a/net/bridge/br_fdb.c +++ b/net/bridge/br_fdb.c @@ -530,7 +530,8 @@ void br_fdb_cleanup(struct work_struct *work) unsigned long this_timer = f->updated + delay;
if (test_bit(BR_FDB_STATIC, &f->flags) || - test_bit(BR_FDB_ADDED_BY_EXT_LEARN, &f->flags)) { + (test_bit(BR_FDB_ADDED_BY_EXT_LEARN, &f->flags) && + !test_bit(BR_FDB_ENTRY_LOCKED, &f->flags))) { if (test_bit(BR_FDB_NOTIFY, &f->flags)) { if (time_after(this_timer, now)) work_delay = min(work_delay,
- With regards to roaming, the entry cannot roam between locked ports
(they need to have learning disabled anyway), but can roam to an unlocked port, in which case it becomes a regular entry that can roam and age.
If we agree on these semantics, then I can try to verify that at least Spectrum can support them (it seems mv88e6xxx can).
The consensus here is that at least for the mv88e6xxx, learning should be on and link local learning should be blocked by the userspace setting you pointed to earlier.
Why learning needs to be on in the bridge (not mv88e6xxx) driver?
On 2022-08-10 09:21, Ido Schimmel wrote:
On Tue, Aug 09, 2022 at 10:00:49PM +0200, netdev@kapio-technology.com wrote:
On 2022-08-09 11:20, Ido Schimmel wrote:
On Mon, Aug 01, 2022 at 05:33:49PM +0200, netdev@kapio-technology.com wrote:
On 2022-07-13 14:39, Ido Schimmel wrote:
What are "Storm Prevention" and "zero-DPV" FDB entries?
For the zero-DPV entries, I can summarize:
Since a CPU can become saturated from constant SA Miss Violations from a denied source, source MAC address are masked by loading a zero-DPV (Destination Port Vector) entry in the ATU. As the address now appears in the database it will not cause more Miss Violations. ANY port trying to send a frame to this unauthorized address is discarded. Any locked port trying to use this unauthorized address has its frames discarded too (as the ports SA bit is not set in the ATU entry).
What happens to unlocked ports that have learning enabled and are trying to use this address as SMAC? AFAICT, at least in the bridge driver, the locked entry will roam, but will keep the "locked" flag, which is probably not what we want. Let's see if we can agree on these semantics for a "locked" entry:
The next version of this will block forwarding to locked entries in the bridge, so they will behave like the zero-DPV entries.
I'm talking about roaming, not forwarding. Let's say you have a locked entry with MAC X pointing to port Y. Now you get a packet with SMAC X from port Z which is unlocked. Will the FDB entry roam to port Z? I think it should, but at least in current implementation it seems that the "locked" flag will not be reset and having locked entries pointing to an unlocked port looks like a bug.
Remember that zero-DPV entries blackhole (mask) the MAC, so whenever a packet appears with the same MAC on another port it is just dropped in the HW, so there is no possibility of doing any CPU processing in this case. Only after the timeout (5 min) can the MAC get a normal ATU on an open port. For the bridge to do what you suggest, a FDB search would be needed afaics, and this might be in a process sensitive part of the code, thus leading to too heavy a cost.
- It discards packets with matching DMAC, regardless of ingress port. I
read the document [1] you linked to in a different reply and could not find anything against this approach, so this might be fine or at least not very significant.
Note that this means that "locked" entries need to be notified to device drivers so that they will install a matching entry in the HW FDB.
Okay, so as V4 does (just without the error noted).
- It is not refreshed and has ageing enabled. That is, after initial
installation it will be removed by the bridge driver after configured ageing time unless converted to a regular (unlocked) entry.
I assume this allows you to remove the timer implementation from your driver and let the bridge driver notify you about the removal of this entry.
Okay, but only if the scheme is not so that the driver creates the locked entries itself, unless you indicate that the driver notifies the bridge, which then notifies back to the driver and installs the zero-DPV entry? If not I think the current implementation for the mv88e6xxx is fine.
I don't see a problem in having the driver notifying the bridge about the installation of this entry and the bridge notifying the driver that the entry needs to be removed. It removes complexity from device drivers like mv88e6xxx and doesn't add extra complexity to the bridge driver.
Actually, there is one complication, 'SWITCHDEV_FDB_ADD_TO_BRIDGE' will add the locked entry as externally learned, which means the bridge will not age it. Might need something like this:
diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c index e7f4fccb6adb..5f73d0b44ed9 100644 --- a/net/bridge/br_fdb.c +++ b/net/bridge/br_fdb.c @@ -530,7 +530,8 @@ void br_fdb_cleanup(struct work_struct *work) unsigned long this_timer = f->updated + delay;
if (test_bit(BR_FDB_STATIC, &f->flags) ||
test_bit(BR_FDB_ADDED_BY_EXT_LEARN, &f->flags)) {
(test_bit(BR_FDB_ADDED_BY_EXT_LEARN, &f->flags) &&
!test_bit(BR_FDB_ENTRY_LOCKED, &f->flags))) { if (test_bit(BR_FDB_NOTIFY, &f->flags)) { if (time_after(this_timer, now)) work_delay = min(work_delay,
There is a case of ownership of the FDB/ATU entry, which if I remember correctly, will point to the current implementation being the right way to do it, thus having the driver keeping ownership of the entry and thereby also ageing it, but I think Vladimir should have his say here.
- With regards to roaming, the entry cannot roam between locked ports
(they need to have learning disabled anyway), but can roam to an unlocked port, in which case it becomes a regular entry that can roam and age.
If we agree on these semantics, then I can try to verify that at least Spectrum can support them (it seems mv88e6xxx can).
The consensus here is that at least for the mv88e6xxx, learning should be on and link local learning should be blocked by the userspace setting you pointed to earlier.
Why learning needs to be on in the bridge (not mv88e6xxx) driver?
I think it is seen as 'cheating' to enable learning only in the driver behind the scenes, so kind of hackish. E.g. 'bridge -d link show' will then report 'learning off', while learning is on in the driver. And learning needs to be on for the driver as discussed earlier, which only gives rise to the link local learning problem, which is then solved by the user space setting.
On Wed, Aug 10, 2022 at 10:40:45AM +0200, netdev@kapio-technology.com wrote:
On 2022-08-10 09:21, Ido Schimmel wrote:
On Tue, Aug 09, 2022 at 10:00:49PM +0200, netdev@kapio-technology.com wrote:
On 2022-08-09 11:20, Ido Schimmel wrote:
On Mon, Aug 01, 2022 at 05:33:49PM +0200, netdev@kapio-technology.com wrote:
On 2022-07-13 14:39, Ido Schimmel wrote:
What are "Storm Prevention" and "zero-DPV" FDB entries?
For the zero-DPV entries, I can summarize:
Since a CPU can become saturated from constant SA Miss Violations from a denied source, source MAC address are masked by loading a zero-DPV (Destination Port Vector) entry in the ATU. As the address now appears in the database it will not cause more Miss Violations. ANY port trying to send a frame to this unauthorized address is discarded. Any locked port trying to use this unauthorized address has its frames discarded too (as the ports SA bit is not set in the ATU entry).
What happens to unlocked ports that have learning enabled and are trying to use this address as SMAC? AFAICT, at least in the bridge driver, the locked entry will roam, but will keep the "locked" flag, which is probably not what we want. Let's see if we can agree on these semantics for a "locked" entry:
The next version of this will block forwarding to locked entries in the bridge, so they will behave like the zero-DPV entries.
I'm talking about roaming, not forwarding. Let's say you have a locked entry with MAC X pointing to port Y. Now you get a packet with SMAC X from port Z which is unlocked. Will the FDB entry roam to port Z? I think it should, but at least in current implementation it seems that the "locked" flag will not be reset and having locked entries pointing to an unlocked port looks like a bug.
Remember that zero-DPV entries blackhole (mask) the MAC, so whenever a packet appears with the same MAC on another port it is just dropped in the
What do you mean by "same MAC"? As SMAC or DMAC? I'm talking about SMAC and when the packet is received via an unlocked port. Why would such a packet be dropped?
HW, so there is no possibility of doing any CPU processing in this case. Only after the timeout (5 min) can the MAC get a normal ATU on an open port. For the bridge to do what you suggest, a FDB search would be needed afaics, and this might be in a process sensitive part of the code, thus leading to too heavy a cost.
When learning is enabled the bridge already performs a lookup on the SMAC.
TBH, I don't think this is progressing well because there is too much discrepancy between how this feature works in the bridge driver and in the hardware you work with.
I suggest to first define the model in the bridge driver and then take care of the offload. My suggestion is to send another RFC with only the bridge changes with emphasize on the following aspects:
* Forwarding rules for "locked" entries. Do they drop matching packets? Forward them? Or not considered at all for forwarding?
* Roaming rules for "locked" entries. Can they roam between locked ports? Can they roam from a locked port to an unlocked port and vice versa? Or are they completely sticky?
* Ageing rule for "locked" entries. Are these entries subject to the ageing time or are they static? If they are not static, are they refreshed by incoming traffic from a locked port or not?
* MAB enablement. New option? Overload an existing one? No option?
The commit messages should explain these design choices and new tests cases should verify the desired behavior.
Once we have an agreement we can work out the switchdev/mv88e6xxx parts and eventually the entire thing can be merged together. Fair?
On 2022-08-11 13:28, Ido Schimmel wrote:
I'm talking about roaming, not forwarding. Let's say you have a locked entry with MAC X pointing to port Y. Now you get a packet with SMAC X from port Z which is unlocked. Will the FDB entry roam to port Z? I think it should, but at least in current implementation it seems that the "locked" flag will not be reset and having locked entries pointing to an unlocked port looks like a bug.
Yes, now I have tried to test with a case like this using the bridge and have verified the locked entry pointing to an unlocked port, which I agree seems to be a bug, which I will get fixed.
On 2022-08-10 09:21, Ido Schimmel wrote:
- It discards packets with matching DMAC, regardless of ingress port. I
read the document [1] you linked to in a different reply and could not find anything against this approach, so this might be fine or at least not very significant.
Note that this means that "locked" entries need to be notified to device drivers so that they will install a matching entry in the HW FDB.
I just want to be completely sure as what should be done in version 5 with locked entries from the bridge, as - if I should implement it so that they are sent to all the drivers, and the drivers then ignore them if they don't need to take action? (for the mv88e6xxx driver, it does not need them and can ignore but other drivers might need.)
On Tue, Aug 16, 2022 at 09:51:32AM +0200, netdev@kapio-technology.com wrote:
On 2022-08-10 09:21, Ido Schimmel wrote:
- It discards packets with matching DMAC, regardless of ingress port. I
read the document [1] you linked to in a different reply and could not find anything against this approach, so this might be fine or at least not very significant.
Note that this means that "locked" entries need to be notified to device drivers so that they will install a matching entry in the HW FDB.
I just want to be completely sure as what should be done in version 5 with locked entries from the bridge, as - if I should implement it so that they are sent to all the drivers, and the drivers then ignore them if they don't need to take action? (for the mv88e6xxx driver, it does not need them and can ignore but other drivers might need.)
Yes, I think that would be best. At least when mlxsw starts supporting MAB it will need to program the locked entry as an FDB with discard action.
To be clear, I'm aware that all drivers other than mv88e6xxx currently forbid a port from being locked, making it unlikely that they will receive such notifications, but if you do it now then we will not need more changes in the bridge when other drivers gain support for 802.1X/MAB.
On Fri, Jul 08, 2022 at 11:50:33AM +0200, netdev@kapio-technology.com wrote:
On 2022-07-08 11:15, Vladimir Oltean wrote:
When the possibility for it to be true will exist, _all_ switchdev drivers will need to be updated to ignore that (mlxsw, cpss, ocelot, rocker, prestera, etc etc), not just DSA. And you don't need to propagate the is_locked flag to all individual DSA sub-drivers when none care about is_locked in the ADD_TO_DEVICE direction, you can just ignore within DSA until needed otherwise.
Maybe I have it wrong, but I think that Ido requested me to send it to all the drivers, and have them ignore entries with is_locked=true ...
Yes, but re-read my message about what "all the drivers" means.
Hi Hans,
Thank you for the patch! Yet something to improve:
[auto build test ERROR on net/master] [also build test ERROR on shuah-kselftest/next linus/master v5.19-rc5] [cannot apply to net-next/master next-20220708] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Hans-Schultz/Extend-locked-po... base: https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git 07266d066301b97ad56a693f81b29b7ced429b27 config: i386-randconfig-a013 (https://download.01.org/0day-ci/archive/20220709/202207090438.PlGIeA4G-lkp@i...) compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project 562c3467a6738aa89203f72fc1d1343e5baadf3c) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/intel-lab-lkp/linux/commit/ebd598d7ea6c015001489c4293da88... git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Hans-Schultz/Extend-locked-port-feature-with-FDB-locked-flag-MAC-Auth-MAB/20220707-233246 git checkout ebd598d7ea6c015001489c4293da887763491086 # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=i386 SHELL=/bin/bash drivers/net/dsa/sja1105/
If you fix the issue, kindly add following tag where applicable Reported-by: kernel test robot lkp@intel.com
All errors (new ones prefixed by >>):
drivers/net/dsa/sja1105/sja1105_main.c:1952:58: error: too few arguments to function call, expected 6, have 5
return sja1105_fdb_add(ds, port, mdb->addr, mdb->vid, db); ~~~~~~~~~~~~~~~ ^ drivers/net/dsa/sja1105/sja1105_main.c:1803:12: note: 'sja1105_fdb_add' declared here static int sja1105_fdb_add(struct dsa_switch *ds, int port, ^ 1 error generated.
vim +1952 drivers/net/dsa/sja1105/sja1105_main.c
5126ec72a094bd3 Vladimir Oltean 2021-08-08 1947 a52b2da778fc93e Vladimir Oltean 2021-01-09 1948 static int sja1105_mdb_add(struct dsa_switch *ds, int port, c26933639b5402c Vladimir Oltean 2022-02-25 1949 const struct switchdev_obj_port_mdb *mdb, c26933639b5402c Vladimir Oltean 2022-02-25 1950 struct dsa_db db) 291d1e72b756424 Vladimir Oltean 2019-05-02 1951 { c26933639b5402c Vladimir Oltean 2022-02-25 @1952 return sja1105_fdb_add(ds, port, mdb->addr, mdb->vid, db); 291d1e72b756424 Vladimir Oltean 2019-05-02 1953 } 291d1e72b756424 Vladimir Oltean 2019-05-02 1954
For convenience the function mv88e6xxx_g1_atu_op() has been used to read ATU violations, but the function has other purposes and does not enable the possibility to read the FID when reading ATU violations.
The FID is needed to get hold of which VID was involved in the violation, thus the need to be able to read the FID.
Signed-off-by: Hans Schultz netdev@kapio-technology.com --- drivers/net/dsa/mv88e6xxx/global1_atu.c | 16 ++++++++++++++-- 1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/drivers/net/dsa/mv88e6xxx/global1_atu.c b/drivers/net/dsa/mv88e6xxx/global1_atu.c index 40bd67a5c8e9..5d120d53823c 100644 --- a/drivers/net/dsa/mv88e6xxx/global1_atu.c +++ b/drivers/net/dsa/mv88e6xxx/global1_atu.c @@ -114,6 +114,19 @@ static int mv88e6xxx_g1_atu_op_wait(struct mv88e6xxx_chip *chip) return mv88e6xxx_g1_wait_bit(chip, MV88E6XXX_G1_ATU_OP, bit, 0); }
+static int mv88e6xxx_g1_read_atu_violation(struct mv88e6xxx_chip *chip) +{ + int err; + + err = mv88e6xxx_g1_write(chip, MV88E6XXX_G1_ATU_OP, + MV88E6XXX_G1_ATU_OP_BUSY | + MV88E6XXX_G1_ATU_OP_GET_CLR_VIOLATION); + if (err) + return err; + + return mv88e6xxx_g1_atu_op_wait(chip); +} + static int mv88e6xxx_g1_atu_op(struct mv88e6xxx_chip *chip, u16 fid, u16 op) { u16 val; @@ -359,8 +372,7 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id)
mv88e6xxx_reg_lock(chip);
- err = mv88e6xxx_g1_atu_op(chip, 0, - MV88E6XXX_G1_ATU_OP_GET_CLR_VIOLATION); + err = mv88e6xxx_g1_read_atu_violation(chip); if (err) goto out;
This implementation for the Marvell mv88e6xxx chip series, is based on handling ATU miss violations occurring when packets ingress on a port that is locked. The mac address triggering the ATU miss violation will be added to the ATU with a zero-DPV, and is then communicated through switchdev to the bridge module, which adds a fdb entry with the fdb locked flag set. The entry is kept according to the bridges ageing time, thus simulating a dynamic entry.
As this is essentially a form of CPU based learning, the amount of locked entries will be limited by a hardcoded value for now, so as to prevent DOS attacks.
Signed-off-by: Hans Schultz netdev@kapio-technology.com --- drivers/net/dsa/mv88e6xxx/Makefile | 1 + drivers/net/dsa/mv88e6xxx/chip.c | 49 ++++- drivers/net/dsa/mv88e6xxx/chip.h | 15 ++ drivers/net/dsa/mv88e6xxx/global1.h | 1 + drivers/net/dsa/mv88e6xxx/global1_atu.c | 16 +- drivers/net/dsa/mv88e6xxx/port.c | 30 ++- drivers/net/dsa/mv88e6xxx/port.h | 2 + drivers/net/dsa/mv88e6xxx/switchdev.c | 280 ++++++++++++++++++++++++ drivers/net/dsa/mv88e6xxx/switchdev.h | 37 ++++ 9 files changed, 414 insertions(+), 17 deletions(-) create mode 100644 drivers/net/dsa/mv88e6xxx/switchdev.c create mode 100644 drivers/net/dsa/mv88e6xxx/switchdev.h
diff --git a/drivers/net/dsa/mv88e6xxx/Makefile b/drivers/net/dsa/mv88e6xxx/Makefile index c8eca2b6f959..be903a983780 100644 --- a/drivers/net/dsa/mv88e6xxx/Makefile +++ b/drivers/net/dsa/mv88e6xxx/Makefile @@ -15,3 +15,4 @@ mv88e6xxx-objs += port_hidden.o mv88e6xxx-$(CONFIG_NET_DSA_MV88E6XXX_PTP) += ptp.o mv88e6xxx-objs += serdes.o mv88e6xxx-objs += smi.o +mv88e6xxx-objs += switchdev.o diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index d134153ef023..c1fb7b5ba3ac 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -42,6 +42,7 @@ #include "ptp.h" #include "serdes.h" #include "smi.h" +#include "switchdev.h"
static void assert_reg_lock(struct mv88e6xxx_chip *chip) { @@ -919,6 +920,13 @@ static void mv88e6xxx_mac_link_down(struct dsa_switch *ds, int port, if (err) dev_err(chip->dev, "p%d: failed to force MAC link down\n", port); + else + if (mv88e6xxx_port_is_locked(chip, port)) { + err = mv88e6xxx_atu_locked_entry_flush(ds, port); + if (err) + dev_err(chip->dev, + "p%d: failed to clear locked entries\n", port); + } }
static void mv88e6xxx_mac_link_up(struct dsa_switch *ds, int port, @@ -1685,6 +1693,12 @@ static void mv88e6xxx_port_fast_age(struct dsa_switch *ds, int port) struct mv88e6xxx_chip *chip = ds->priv; int err;
+ if (mv88e6xxx_port_is_locked(chip, port)) + err = mv88e6xxx_atu_locked_entry_flush(ds, port); + if (err) + dev_err(chip->ds->dev, "p%d: failed to clear locked entries: %d\n", + port, err); + mv88e6xxx_reg_lock(chip); err = mv88e6xxx_port_fast_age_fid(chip, port, 0); mv88e6xxx_reg_unlock(chip); @@ -1721,11 +1735,11 @@ static int mv88e6xxx_vtu_get(struct mv88e6xxx_chip *chip, u16 vid, return err; }
-static int mv88e6xxx_vtu_walk(struct mv88e6xxx_chip *chip, - int (*cb)(struct mv88e6xxx_chip *chip, - const struct mv88e6xxx_vtu_entry *entry, - void *priv), - void *priv) +int mv88e6xxx_vtu_walk(struct mv88e6xxx_chip *chip, + int (*cb)(struct mv88e6xxx_chip *chip, + const struct mv88e6xxx_vtu_entry *entry, + void *priv), + void *priv) { struct mv88e6xxx_vtu_entry entry = { .vid = mv88e6xxx_max_vid(chip), @@ -2727,6 +2741,9 @@ static int mv88e6xxx_port_fdb_add(struct dsa_switch *ds, int port, if (is_locked) return 0;
+ if (mv88e6xxx_port_is_locked(chip, port)) + mv88e6xxx_atu_locked_entry_find_purge(ds, port, addr, vid); + mv88e6xxx_reg_lock(chip); err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid, MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC); @@ -2740,12 +2757,17 @@ static int mv88e6xxx_port_fdb_del(struct dsa_switch *ds, int port, struct dsa_db db) { struct mv88e6xxx_chip *chip = ds->priv; - int err; + bool locked_found = false; + int err = 0;
- mv88e6xxx_reg_lock(chip); - err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid, 0); - mv88e6xxx_reg_unlock(chip); + if (mv88e6xxx_port_is_locked(chip, port)) + locked_found = mv88e6xxx_atu_locked_entry_find_purge(ds, port, addr, vid);
+ if (!locked_found) { + mv88e6xxx_reg_lock(chip); + err = mv88e6xxx_port_db_load_purge(chip, port, addr, vid, 0); + mv88e6xxx_reg_unlock(chip); + } return err; }
@@ -3814,11 +3836,18 @@ static int mv88e6xxx_setup(struct dsa_switch *ds)
static int mv88e6xxx_port_setup(struct dsa_switch *ds, int port) { - return mv88e6xxx_setup_devlink_regions_port(ds, port); + int err; + + err = mv88e6xxx_setup_devlink_regions_port(ds, port); + if (!err) + return mv88e6xxx_init_violation_handler(ds, port); + + return err; }
static void mv88e6xxx_port_teardown(struct dsa_switch *ds, int port) { + mv88e6xxx_teardown_violation_handler(ds, port); mv88e6xxx_teardown_devlink_regions_port(ds, port); }
diff --git a/drivers/net/dsa/mv88e6xxx/chip.h b/drivers/net/dsa/mv88e6xxx/chip.h index 5e03cfe50156..3cc2db722ad9 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.h +++ b/drivers/net/dsa/mv88e6xxx/chip.h @@ -280,6 +280,12 @@ struct mv88e6xxx_port { unsigned int serdes_irq; char serdes_irq_name[64]; struct devlink_region *region; + + /* List and maintenance of ATU locked entries */ + struct mutex ale_list_lock; + struct list_head ale_list; + struct delayed_work ale_work; + int ale_cnt; };
enum mv88e6xxx_region_id { @@ -399,6 +405,9 @@ struct mv88e6xxx_chip { int egress_dest_port; int ingress_dest_port;
+ /* Keep the register written age time for easy access */ + u8 age_time; + /* Per-port timestamping resources. */ struct mv88e6xxx_port_hwtstamp port_hwtstamp[DSA_MAX_PORTS];
@@ -803,6 +812,12 @@ static inline void mv88e6xxx_reg_unlock(struct mv88e6xxx_chip *chip) mutex_unlock(&chip->reg_lock); }
+int mv88e6xxx_vtu_walk(struct mv88e6xxx_chip *chip, + int (*cb)(struct mv88e6xxx_chip *chip, + const struct mv88e6xxx_vtu_entry *entry, + void *priv), + void *priv); + int mv88e6xxx_fid_map(struct mv88e6xxx_chip *chip, unsigned long *bitmap);
#endif /* _MV88E6XXX_CHIP_H */ diff --git a/drivers/net/dsa/mv88e6xxx/global1.h b/drivers/net/dsa/mv88e6xxx/global1.h index 65958b2a0d3a..503fbf216670 100644 --- a/drivers/net/dsa/mv88e6xxx/global1.h +++ b/drivers/net/dsa/mv88e6xxx/global1.h @@ -136,6 +136,7 @@ #define MV88E6XXX_G1_ATU_DATA_TRUNK 0x8000 #define MV88E6XXX_G1_ATU_DATA_TRUNK_ID_MASK 0x00f0 #define MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_MASK 0x3ff0 +#define MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_NO_EGRESS 0x0000 #define MV88E6XXX_G1_ATU_DATA_STATE_MASK 0x000f #define MV88E6XXX_G1_ATU_DATA_STATE_UC_UNUSED 0x0000 #define MV88E6XXX_G1_ATU_DATA_STATE_UC_AGE_1_OLDEST 0x0001 diff --git a/drivers/net/dsa/mv88e6xxx/global1_atu.c b/drivers/net/dsa/mv88e6xxx/global1_atu.c index 5d120d53823c..d4ca976dac91 100644 --- a/drivers/net/dsa/mv88e6xxx/global1_atu.c +++ b/drivers/net/dsa/mv88e6xxx/global1_atu.c @@ -12,6 +12,8 @@
#include "chip.h" #include "global1.h" +#include "port.h" +#include "switchdev.h"
/* Offset 0x01: ATU FID Register */
@@ -54,6 +56,7 @@ int mv88e6xxx_g1_atu_set_age_time(struct mv88e6xxx_chip *chip,
/* Round to nearest multiple of coeff */ age_time = (msecs + coeff / 2) / coeff; + chip->age_time = age_time;
err = mv88e6xxx_g1_read(chip, MV88E6XXX_G1_ATU_CTL, &val); if (err) @@ -369,6 +372,7 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id) int spid; int err; u16 val; + u16 fid;
mv88e6xxx_reg_lock(chip);
@@ -380,6 +384,10 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id) if (err) goto out;
+ err = mv88e6xxx_g1_read(chip, MV88E6352_G1_ATU_FID, &fid); + if (err) + goto out; + err = mv88e6xxx_g1_atu_data_read(chip, &entry); if (err) goto out; @@ -388,6 +396,8 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id) if (err) goto out;
+ mv88e6xxx_reg_unlock(chip); + spid = entry.state;
if (val & MV88E6XXX_G1_ATU_OP_AGE_OUT_VIOLATION) { @@ -408,6 +418,11 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id) "ATU miss violation for %pM portvec %x spid %d\n", entry.mac, entry.portvec, spid); chip->ports[spid].atu_miss_violation++; + if (fid && mv88e6xxx_port_is_locked(chip, spid)) + err = mv88e6xxx_handle_violation(chip, spid, &entry, fid, + MV88E6XXX_G1_ATU_OP_MISS_VIOLATION); + if (err) + goto out; }
if (val & MV88E6XXX_G1_ATU_OP_FULL_VIOLATION) { @@ -416,7 +431,6 @@ static irqreturn_t mv88e6xxx_g1_atu_prob_irq_thread_fn(int irq, void *dev_id) entry.mac, entry.portvec, spid); chip->ports[spid].atu_full_violation++; } - mv88e6xxx_reg_unlock(chip);
return IRQ_HANDLED;
diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c index 795b3128768f..57eb25e9eb63 100644 --- a/drivers/net/dsa/mv88e6xxx/port.c +++ b/drivers/net/dsa/mv88e6xxx/port.c @@ -14,9 +14,11 @@ #include <linux/phylink.h>
#include "chip.h" +#include "global1.h" #include "global2.h" #include "port.h" #include "serdes.h" +#include "switchdev.h"
int mv88e6xxx_port_read(struct mv88e6xxx_chip *chip, int port, int reg, u16 *val) @@ -1239,6 +1241,17 @@ int mv88e6xxx_port_set_mirror(struct mv88e6xxx_chip *chip, int port, return err; }
+bool mv88e6xxx_port_is_locked(struct mv88e6xxx_chip *chip, int port) +{ + u16 reg; + + mv88e6xxx_reg_lock(chip); + mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_CTL0, ®); + mv88e6xxx_reg_unlock(chip); + + return reg & MV88E6XXX_PORT_CTL0_SA_FILT_DROP_ON_LOCK; +} + int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port, bool locked) { @@ -1257,13 +1270,18 @@ int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port, if (err) return err;
- err = mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_ASSOC_VECTOR, ®); - if (err) - return err; + if (!locked) { + err = mv88e6xxx_atu_locked_entry_flush(chip->ds, port); + if (err) + return err; + }
- reg &= ~MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT; - if (locked) - reg |= MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT; + reg = 0; + if (locked) { + reg = (1 << port); + reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG | + MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT; + }
return mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_ASSOC_VECTOR, reg); } diff --git a/drivers/net/dsa/mv88e6xxx/port.h b/drivers/net/dsa/mv88e6xxx/port.h index e0a705d82019..5c1d485e7442 100644 --- a/drivers/net/dsa/mv88e6xxx/port.h +++ b/drivers/net/dsa/mv88e6xxx/port.h @@ -231,6 +231,7 @@ #define MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT 0x2000 #define MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG 0x1000 #define MV88E6XXX_PORT_ASSOC_VECTOR_REFRESH_LOCKED 0x0800 +#define MV88E6XXX_PORT_ASSOC_VECTOR_PAV_MASK 0x07ff
/* Offset 0x0C: Port ATU Control */ #define MV88E6XXX_PORT_ATU_CTL 0x0c @@ -374,6 +375,7 @@ int mv88e6xxx_port_set_fid(struct mv88e6xxx_chip *chip, int port, u16 fid); int mv88e6xxx_port_get_pvid(struct mv88e6xxx_chip *chip, int port, u16 *pvid); int mv88e6xxx_port_set_pvid(struct mv88e6xxx_chip *chip, int port, u16 pvid);
+bool mv88e6xxx_port_is_locked(struct mv88e6xxx_chip *chip, int port); int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port, bool locked);
diff --git a/drivers/net/dsa/mv88e6xxx/switchdev.c b/drivers/net/dsa/mv88e6xxx/switchdev.c new file mode 100644 index 000000000000..e6c67fb741af --- /dev/null +++ b/drivers/net/dsa/mv88e6xxx/switchdev.c @@ -0,0 +1,280 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * switchdev.c + * + * Authors: + * Hans J. Schultz hans.schultz@westermo.com + * + */ + +#include <net/switchdev.h> +#include <linux/list.h> +#include "chip.h" +#include "global1.h" +#include "switchdev.h" + +static void mv88e6xxx_atu_locked_entry_purge(struct mv88e6xxx_atu_locked_entry *ale, bool notify) +{ + struct switchdev_notifier_fdb_info info = { + .addr = ale->mac, + .vid = ale->vid, + .is_locked = true, + .offloaded = true, + }; + struct mv88e6xxx_atu_entry entry; + struct net_device *brport; + struct dsa_port *dp; + + entry.portvec = MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_NO_EGRESS; + entry.state = MV88E6XXX_G1_ATU_DATA_STATE_UC_UNUSED; + entry.trunk = false; + ether_addr_copy(entry.mac, ale->mac); + + mv88e6xxx_reg_lock(ale->chip); + mv88e6xxx_g1_atu_loadpurge(ale->chip, ale->fid, &entry); + mv88e6xxx_reg_unlock(ale->chip); + + dp = dsa_to_port(ale->chip->ds, ale->port); + + if (notify) { + rtnl_lock(); + brport = dsa_port_to_bridge_port(dp); + + if (brport) { + call_switchdev_notifiers(SWITCHDEV_FDB_DEL_TO_BRIDGE, + brport, &info.info, NULL); + } else { + dev_err(ale->chip->dev, "No bridge port for dsa port belonging to port %d\n", + ale->port); + } + rtnl_unlock(); + } + + list_del(&ale->list); + kfree(ale); +} + +static void mv88e6xxx_atu_locked_entry_cleanup(struct work_struct *work) +{ + struct mv88e6xxx_port *p = container_of(work, struct mv88e6xxx_port, ale_work.work); + struct mv88e6xxx_atu_locked_entry *ale, *tmp; + + mutex_lock(&p->ale_list_lock); + list_for_each_entry_safe(ale, tmp, &p->ale_list, list) { + if (time_after(jiffies, ale->expires)) { + mv88e6xxx_atu_locked_entry_purge(ale, true); + p->ale_cnt--; + } + } + mutex_unlock(&p->ale_list_lock); + + mod_delayed_work(system_long_wq, &p->ale_work, msecs_to_jiffies(100)); +} + +static int mv88e6xxx_new_atu_locked_entry(struct mv88e6xxx_chip *chip, const unsigned char *addr, + int port, u16 fid, u16 vid, + struct mv88e6xxx_atu_locked_entry **alep) +{ + struct mv88e6xxx_atu_locked_entry *ale; + unsigned long now, age_time; + + ale = kmalloc(sizeof(*ale), GFP_ATOMIC); + if (!ale) + return -ENOMEM; + + ether_addr_copy(ale->mac, addr); + ale->chip = chip; + ale->port = port; + ale->fid = fid; + ale->vid = vid; + now = jiffies; + age_time = chip->age_time * chip->info->age_time_coeff; + ale->expires = now + age_time; + + *alep = ale; + return 0; +} + +struct mv88e6xxx_fid_search_ctx { + u16 fid_search; + u16 vid_found; +}; + +static int mv88e6xxx_find_vid_on_matching_fid(struct mv88e6xxx_chip *chip, + const struct mv88e6xxx_vtu_entry *entry, + void *priv) +{ + struct mv88e6xxx_fid_search_ctx *ctx = priv; + + if (ctx->fid_search == entry->fid) { + ctx->vid_found = entry->vid; + return 1; + } + + return 0; +} + +int mv88e6xxx_handle_violation(struct mv88e6xxx_chip *chip, int port, + struct mv88e6xxx_atu_entry *entry, + u16 fid, u16 type) +{ + struct switchdev_notifier_fdb_info info = { + .addr = entry->mac, + .vid = 0, + .is_locked = true, + .offloaded = true, + }; + struct mv88e6xxx_atu_locked_entry *ale; + struct mv88e6xxx_fid_search_ctx ctx; + struct net_device *brport; + struct mv88e6xxx_port *p; + struct dsa_port *dp; + int err; + + if (!mv88e6xxx_is_invalid_port(chip, port)) + p = &chip->ports[port]; + else + return -ENODEV; + + ctx.fid_search = fid; + mv88e6xxx_reg_lock(chip); + err = mv88e6xxx_vtu_walk(chip, mv88e6xxx_find_vid_on_matching_fid, &ctx); + mv88e6xxx_reg_unlock(chip); + if (err < 0) + return err; + if (err == 1) + info.vid = ctx.vid_found; + else + return -ENODATA; + + switch (type) { + case MV88E6XXX_G1_ATU_OP_MISS_VIOLATION: + mutex_lock(&p->ale_list_lock); + if (p->ale_cnt >= ATU_LOCKED_ENTRIES_MAX) + goto exit; + mutex_unlock(&p->ale_list_lock); + entry->portvec = MV88E6XXX_G1_ATU_DATA_PORT_VECTOR_NO_EGRESS; + entry->state = MV88E6XXX_G1_ATU_DATA_STATE_UC_STATIC; + entry->trunk = false; + + mv88e6xxx_reg_lock(chip); + err = mv88e6xxx_g1_atu_loadpurge(chip, fid, entry); + if (err) + goto fail; + mv88e6xxx_reg_unlock(chip); + + dp = dsa_to_port(chip->ds, port); + err = mv88e6xxx_new_atu_locked_entry(chip, entry->mac, port, fid, + info.vid, &ale); + if (err) + return err; + + mutex_lock(&p->ale_list_lock); + list_add(&ale->list, &p->ale_list); + p->ale_cnt++; + mutex_unlock(&p->ale_list_lock); + + rtnl_lock(); + brport = dsa_port_to_bridge_port(dp); + if (!brport) { + rtnl_unlock(); + return -ENODEV; + } + err = call_switchdev_notifiers(SWITCHDEV_FDB_ADD_TO_BRIDGE, + brport, &info.info, NULL); + rtnl_unlock(); + break; + } + + return err; + +fail: + mv88e6xxx_reg_unlock(chip); + return err; + +exit: + mutex_unlock(&p->ale_list_lock); + return 0; +} + +bool mv88e6xxx_atu_locked_entry_find_purge(struct dsa_switch *ds, int port, + const unsigned char *addr, u16 vid) +{ + struct mv88e6xxx_atu_locked_entry *ale, *tmp; + struct mv88e6xxx_chip *chip = ds->priv; + struct mv88e6xxx_port *p; + bool found = false; + + p = &chip->ports[port]; + mutex_lock(&p->ale_list_lock); + list_for_each_entry_safe(ale, tmp, &p->ale_list, list) { + if (ether_addr_equal(ale->mac, addr) == 0) { + if (ale->vid == vid) { + mv88e6xxx_atu_locked_entry_purge(ale, false); + p->ale_cnt--; + found = true; + break; + } + } + } + mutex_unlock(&p->ale_list_lock); + return found; +} + +int mv88e6xxx_atu_locked_entry_flush(struct dsa_switch *ds, int port) +{ + struct mv88e6xxx_atu_locked_entry *ale, *tmp; + struct mv88e6xxx_chip *chip = ds->priv; + struct mv88e6xxx_port *p; + + if (!mv88e6xxx_is_invalid_port(chip, port)) + p = &chip->ports[port]; + else + return -ENODEV; + + mutex_lock(&p->ale_list_lock); + list_for_each_entry_safe(ale, tmp, &p->ale_list, list) { + mv88e6xxx_atu_locked_entry_purge(ale, false); + p->ale_cnt--; + } + mutex_unlock(&p->ale_list_lock); + + return 0; +} + +int mv88e6xxx_init_violation_handler(struct dsa_switch *ds, int port) +{ + struct mv88e6xxx_chip *chip = ds->priv; + struct mv88e6xxx_port *p; + + if (!mv88e6xxx_is_invalid_port(chip, port)) + p = &chip->ports[port]; + else + return -ENODEV; + + INIT_LIST_HEAD(&p->ale_list); + mutex_init(&p->ale_list_lock); + p->ale_cnt = 0; + INIT_DELAYED_WORK(&p->ale_work, mv88e6xxx_atu_locked_entry_cleanup); + mod_delayed_work(system_long_wq, &p->ale_work, msecs_to_jiffies(500)); + + ds->ageing_time_min = 100; + return 0; +} + +int mv88e6xxx_teardown_violation_handler(struct dsa_switch *ds, int port) +{ + struct mv88e6xxx_chip *chip = ds->priv; + struct mv88e6xxx_port *p; + + if (!mv88e6xxx_is_invalid_port(chip, port)) + p = &chip->ports[port]; + else + return -ENODEV; + + cancel_delayed_work(&p->ale_work); + mv88e6xxx_atu_locked_entry_flush(ds, port); + mutex_destroy(&p->ale_list_lock); + + return 0; +} diff --git a/drivers/net/dsa/mv88e6xxx/switchdev.h b/drivers/net/dsa/mv88e6xxx/switchdev.h new file mode 100644 index 000000000000..df2005c36f47 --- /dev/null +++ b/drivers/net/dsa/mv88e6xxx/switchdev.h @@ -0,0 +1,37 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later + * + * switchdev.h + * + * Authors: + * Hans J. Schultz hans.schultz@westermo.com + * + */ + +#ifndef DRIVERS_NET_DSA_MV88E6XXX_SWITCHDEV_H_ +#define DRIVERS_NET_DSA_MV88E6XXX_SWITCHDEV_H_ + +#include <net/switchdev.h> +#include "chip.h" + +#define ATU_LOCKED_ENTRIES_MAX 64 + +struct mv88e6xxx_atu_locked_entry { + struct list_head list; + struct mv88e6xxx_chip *chip; + int port; + u8 mac[ETH_ALEN]; + u16 fid; + u16 vid; + unsigned long expires; +}; + +int mv88e6xxx_handle_violation(struct mv88e6xxx_chip *chip, int port, + struct mv88e6xxx_atu_entry *entry, + u16 fid, u16 type); +bool mv88e6xxx_atu_locked_entry_find_purge(struct dsa_switch *ds, int port, + const unsigned char *addr, u16 vid); +int mv88e6xxx_atu_locked_entry_flush(struct dsa_switch *ds, int port); +int mv88e6xxx_init_violation_handler(struct dsa_switch *ds, int port); +int mv88e6xxx_teardown_violation_handler(struct dsa_switch *ds, int port); + +#endif /* DRIVERS_NET_DSA_MV88E6XXX_SWITCHDEV_H_ */
Hi Hans,
Thank you for the patch! Perhaps something to improve:
[auto build test WARNING on net/master] [also build test WARNING on shuah-kselftest/next linus/master v5.19-rc5] [cannot apply to net-next/master next-20220707] [If your patch is applied to the wrong git tree, kindly drop us a note. And when submitting patch, we suggest to use '--base' as documented in https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Hans-Schultz/Extend-locked-po... base: https://git.kernel.org/pub/scm/linux/kernel/git/davem/net.git 07266d066301b97ad56a693f81b29b7ced429b27 config: x86_64-randconfig-a005 (https://download.01.org/0day-ci/archive/20220708/202207081751.PPV8GRpf-lkp@i...) compiler: clang version 15.0.0 (https://github.com/llvm/llvm-project 562c3467a6738aa89203f72fc1d1343e5baadf3c) reproduce (this is a W=1 build): wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross chmod +x ~/bin/make.cross # https://github.com/intel-lab-lkp/linux/commit/74f76ae0b0d4b12864a0cf5ff3c068... git remote add linux-review https://github.com/intel-lab-lkp/linux git fetch --no-tags linux-review Hans-Schultz/Extend-locked-port-feature-with-FDB-locked-flag-MAC-Auth-MAB/20220707-233246 git checkout 74f76ae0b0d4b12864a0cf5ff3c0685bc420aea3 # save the config file mkdir build_dir && cp config build_dir/.config COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash drivers/net/dsa/mv88e6xxx/
If you fix the issue, kindly add following tag where applicable Reported-by: kernel test robot lkp@intel.com
All warnings (new ones prefixed by >>):
drivers/net/dsa/mv88e6xxx/chip.c:1696:6: warning: variable 'err' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
if (mv88e6xxx_port_is_locked(chip, port)) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/net/dsa/mv88e6xxx/chip.c:1698:6: note: uninitialized use occurs here if (err) ^~~ drivers/net/dsa/mv88e6xxx/chip.c:1696:2: note: remove the 'if' if its condition is always true if (mv88e6xxx_port_is_locked(chip, port)) ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ drivers/net/dsa/mv88e6xxx/chip.c:1694:9: note: initialize the variable 'err' to silence this warning int err; ^ = 0 1 warning generated.
vim +1696 drivers/net/dsa/mv88e6xxx/chip.c
1690 1691 static void mv88e6xxx_port_fast_age(struct dsa_switch *ds, int port) 1692 { 1693 struct mv88e6xxx_chip *chip = ds->priv; 1694 int err; 1695
1696 if (mv88e6xxx_port_is_locked(chip, port))
1697 err = mv88e6xxx_atu_locked_entry_flush(ds, port); 1698 if (err) 1699 dev_err(chip->ds->dev, "p%d: failed to clear locked entries: %d\n", 1700 port, err); 1701 1702 mv88e6xxx_reg_lock(chip); 1703 err = mv88e6xxx_port_fast_age_fid(chip, port, 0); 1704 mv88e6xxx_reg_unlock(chip); 1705 1706 if (err) 1707 dev_err(chip->ds->dev, "p%d: failed to flush ATU: %d\n", 1708 port, err); 1709 } 1710
On Thu, Jul 07, 2022 at 05:29:29PM +0200, Hans Schultz wrote:
This implementation for the Marvell mv88e6xxx chip series, is based on handling ATU miss violations occurring when packets ingress on a port that is locked. The mac address triggering the ATU miss violation will be added to the ATU with a zero-DPV, and is then communicated through switchdev to the bridge module, which adds a fdb entry with the fdb locked flag set. The entry is kept according to the bridges ageing time, thus simulating a dynamic entry.
As this is essentially a form of CPU based learning, the amount of locked entries will be limited by a hardcoded value for now, so as to prevent DOS attacks.
Signed-off-by: Hans Schultz netdev@kapio-technology.com
static void assert_reg_lock(struct mv88e6xxx_chip *chip) { @@ -919,6 +920,13 @@ static void mv88e6xxx_mac_link_down(struct dsa_switch *ds, int port, if (err) dev_err(chip->dev, "p%d: failed to force MAC link down\n", port);
- else
if (mv88e6xxx_port_is_locked(chip, port)) {
err = mv88e6xxx_atu_locked_entry_flush(ds, port);
if (err)
dev_err(chip->dev,
"p%d: failed to clear locked entries\n", port);
}
} static void mv88e6xxx_mac_link_up(struct dsa_switch *ds, int port, @@ -1685,6 +1693,12 @@ static void mv88e6xxx_port_fast_age(struct dsa_switch *ds, int port) struct mv88e6xxx_chip *chip = ds->priv; int err;
- if (mv88e6xxx_port_is_locked(chip, port))
err = mv88e6xxx_atu_locked_entry_flush(ds, port);
- if (err)
Kernel build robot pointed this out too, but it's worth repeating to make sure it doesn't get missed. If the port isn't locked, this function returns a junk integer from the stack which isn't zero, so it's treated as an error code.
dev_err(chip->ds->dev, "p%d: failed to clear locked entries: %d\n",
port, err);
- mv88e6xxx_reg_lock(chip); err = mv88e6xxx_port_fast_age_fid(chip, port, 0); mv88e6xxx_reg_unlock(chip);
@@ -1721,11 +1735,11 @@ static int mv88e6xxx_vtu_get(struct mv88e6xxx_chip *chip, u16 vid, return err; } int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port, bool locked) { @@ -1257,13 +1270,18 @@ int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port, if (err) return err;
- err = mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_ASSOC_VECTOR, ®);
- if (err)
return err;
- if (!locked) {
err = mv88e6xxx_atu_locked_entry_flush(chip->ds, port);
Did you re-run the selftest with v4? Because this deadlocks due to the double reg_lock illustrated below:
+ vrf_name=vlan1 + ip vrf exec vlan1 ping 192.0.2.2 -c 10 -i 0.1 -w 5 + check_err 1 'Ping did not work after locking port and adding FDB entry' + local err=1 + local 'msg=Ping did not work after locking port and adding FDB entry' + [[ 0 -eq 0 ]] + [[ 1 -ne 0 ]] + RET=1 + retmsg='Ping did not work after locking port and adding FDB entry' + bridge link set dev lan2 locked off [ 733.273994] [ 733.275515] ============================================ [ 733.280823] WARNING: possible recursive locking detected [ 733.286133] 5.19.0-rc6-07010-ga9b9500ffaac-dirty #3293 Not tainted [ 733.292311] -------------------------------------------- [ 733.297613] kworker/0:0/601 is trying to acquire lock: [ 733.302751] ffff00000213c110 (&chip->reg_lock){+.+.}-{4:4}, at: mv88e6xxx_atu_locked_entry_purge+0x70/0x1a4 [ 733.312524] [ 733.312524] but task is already holding lock: [ 733.318351] ffff00000213c110 (&chip->reg_lock){+.+.}-{4:4}, at: mv88e6xxx_port_bridge_flags+0x48/0x234 [ 733.327674] [ 733.327674] other info that might help us debug this: [ 733.334198] Possible unsafe locking scenario: [ 733.334198] [ 733.340109] CPU0 [ 733.342549] ---- [ 733.344990] lock(&chip->reg_lock); [ 733.348567] lock(&chip->reg_lock); [ 733.352149] [ 733.352149] *** DEADLOCK *** [ 733.352149] [ 733.358063] May be due to missing lock nesting notation [ 733.358063] [ 733.364846] 6 locks held by kworker/0:0/601: [ 733.369115] #0: ffff00000000b748 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x1f4/0x6c4 [ 733.378541] #1: ffff80000c43bdc8 (deferred_process_work){+.+.}-{0:0}, at: process_one_work+0x1f4/0x6c4 [ 733.387966] #2: ffff80000b020cb0 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_lock+0x1c/0x30 [ 733.395660] #3: ffff80000b073370 ((switchdev_blocking_notif_chain).rwsem){++++}-{4:4}, at: blocking_notifier_call_chain+0x34/0xa0 [ 733.407432] #4: ffff00000213c110 (&chip->reg_lock){+.+.}-{4:4}, at: mv88e6xxx_port_bridge_flags+0x48/0x234 [ 733.417202] #5: ffff00000213e0f0 (&p->ale_list_lock){+.+.}-{4:4}, at: mv88e6xxx_atu_locked_entry_flush+0x4c/0xc0 [ 733.427495] [ 733.427495] stack backtrace: [ 733.431858] CPU: 0 PID: 601 Comm: kworker/0:0 Not tainted 5.19.0-rc6-07010-ga9b9500ffaac-dirty #3293 [ 733.440992] Hardware name: CZ.NIC Turris Mox Board (DT) [ 733.446219] Workqueue: events switchdev_deferred_process_work [ 733.451982] Call trace: [ 733.454424] dump_backtrace.part.0+0xcc/0xe0 [ 733.458702] show_stack+0x18/0x6c [ 733.462028] dump_stack_lvl+0x8c/0xb8 [ 733.465703] dump_stack+0x18/0x34 [ 733.469029] __lock_acquire+0x1074/0x1fd0 [ 733.473052] lock_acquire.part.0+0xe4/0x220 [ 733.477244] lock_acquire+0x68/0x8c [ 733.480744] __mutex_lock+0x9c/0x460 [ 733.484332] mutex_lock_nested+0x40/0x50 [ 733.488268] mv88e6xxx_atu_locked_entry_purge+0x70/0x1a4 [ 733.493592] mv88e6xxx_atu_locked_entry_flush+0x7c/0xc0 [ 733.498827] mv88e6xxx_port_set_lock+0xfc/0x10c [ 733.503374] mv88e6xxx_port_bridge_flags+0x200/0x234 [ 733.508351] dsa_port_bridge_flags+0x44/0xe0 [ 733.512635] dsa_slave_port_attr_set+0x1ec/0x230 [ 733.517268] __switchdev_handle_port_attr_set+0x58/0x100 [ 733.522594] switchdev_handle_port_attr_set+0x10/0x24 [ 733.527659] dsa_slave_switchdev_blocking_event+0x8c/0xd4 [ 733.533074] blocking_notifier_call_chain+0x6c/0xa0 [ 733.537968] switchdev_port_attr_notify.constprop.0+0x4c/0xb0 [ 733.543729] switchdev_port_attr_set_deferred+0x24/0x80 [ 733.548967] switchdev_deferred_process+0x90/0x164 [ 733.553774] switchdev_deferred_process_work+0x14/0x2c [ 733.558926] process_one_work+0x28c/0x6c4 [ 733.562949] worker_thread+0x74/0x450 [ 733.566623] kthread+0x118/0x11c [ 733.569860] ret_from_fork+0x10/0x20 ++ mac_get lan1 ++ local if_name=lan1 ++ jq -r '.[]["address"]' ++ ip -j link show dev lan1
I've tentatively fixed this in my tree by taking the register lock more localized to the sub-functions of mv88e6xxx_port_bridge_flags(), and not taking it at caller side for mv88e6xxx_port_set_lock(), but rather letting the callee take it:
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index 7b57ac121589..ec5954f32774 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -6557,41 +6557,47 @@ static int mv88e6xxx_port_bridge_flags(struct dsa_switch *ds, int port, struct mv88e6xxx_chip *chip = ds->priv; int err = -EOPNOTSUPP;
- mv88e6xxx_reg_lock(chip); - if (flags.mask & BR_LEARNING) { bool learning = !!(flags.val & BR_LEARNING); u16 pav = learning ? (1 << port) : 0;
+ mv88e6xxx_reg_lock(chip); err = mv88e6xxx_port_set_assoc_vector(chip, port, pav); + mv88e6xxx_reg_unlock(chip); if (err) - goto out; + return err; }
if (flags.mask & BR_FLOOD) { bool unicast = !!(flags.val & BR_FLOOD);
+ mv88e6xxx_reg_lock(chip); err = chip->info->ops->port_set_ucast_flood(chip, port, unicast); + mv88e6xxx_reg_unlock(chip); if (err) - goto out; + return err; }
if (flags.mask & BR_MCAST_FLOOD) { bool multicast = !!(flags.val & BR_MCAST_FLOOD);
+ mv88e6xxx_reg_lock(chip); err = chip->info->ops->port_set_mcast_flood(chip, port, multicast); + mv88e6xxx_reg_unlock(chip); if (err) - goto out; + return err; }
if (flags.mask & BR_BCAST_FLOOD) { bool broadcast = !!(flags.val & BR_BCAST_FLOOD);
+ mv88e6xxx_reg_lock(chip); err = mv88e6xxx_port_broadcast_sync(chip, port, broadcast); + mv88e6xxx_reg_unlock(chip); if (err) - goto out; + return err; }
if (flags.mask & BR_PORT_LOCKED) { @@ -6599,10 +6605,8 @@ static int mv88e6xxx_port_bridge_flags(struct dsa_switch *ds, int port,
err = mv88e6xxx_port_set_lock(chip, port, locked); if (err) - goto out; + return err; } -out: - mv88e6xxx_reg_unlock(chip);
return err; } diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c index 5e4d5e9265a4..2a60aded6fbe 100644 --- a/drivers/net/dsa/mv88e6xxx/port.c +++ b/drivers/net/dsa/mv88e6xxx/port.c @@ -1222,15 +1222,19 @@ int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port, u16 reg; int err;
+ mv88e6xxx_reg_lock(chip); err = mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_CTL0, ®); - if (err) + if (err) { + mv88e6xxx_reg_unlock(chip); return err; + }
reg &= ~MV88E6XXX_PORT_CTL0_SA_FILT_MASK; if (locked) reg |= MV88E6XXX_PORT_CTL0_SA_FILT_DROP_ON_LOCK;
err = mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_CTL0, reg); + mv88e6xxx_reg_unlock(chip); if (err) return err;
@@ -1247,7 +1251,11 @@ int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port, MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT; }
- return mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_ASSOC_VECTOR, reg); + mv88e6xxx_reg_lock(chip); + err = mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_ASSOC_VECTOR, reg); + mv88e6xxx_reg_unlock(chip); + + return err; }
int mv88e6xxx_port_set_8021q_mode(struct mv88e6xxx_chip *chip, int port,
if (err)
return err;
- }
- reg &= ~MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT;
- if (locked)
reg |= MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT;
- reg = 0;
- if (locked) {
reg = (1 << port);
reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG |
MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT;
- }
return mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_ASSOC_VECTOR, reg); } diff --git a/drivers/net/dsa/mv88e6xxx/port.h b/drivers/net/dsa/mv88e6xxx/port.h index e0a705d82019..5c1d485e7442 100644 --- a/drivers/net/dsa/mv88e6xxx/port.h +++ b/drivers/net/dsa/mv88e6xxx/port.h @@ -231,6 +231,7 @@ #define MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT 0x2000 #define MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG 0x1000 #define MV88E6XXX_PORT_ASSOC_VECTOR_REFRESH_LOCKED 0x0800 +#define MV88E6XXX_PORT_ASSOC_VECTOR_PAV_MASK 0x07ff /* Offset 0x0C: Port ATU Control */ #define MV88E6XXX_PORT_ATU_CTL 0x0c @@ -374,6 +375,7 @@ int mv88e6xxx_port_set_fid(struct mv88e6xxx_chip *chip, int port, u16 fid); int mv88e6xxx_port_get_pvid(struct mv88e6xxx_chip *chip, int port, u16 *pvid); int mv88e6xxx_port_set_pvid(struct mv88e6xxx_chip *chip, int port, u16 pvid); +bool mv88e6xxx_port_is_locked(struct mv88e6xxx_chip *chip, int port); int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port, bool locked);
Other than that, after going through some hoops and patching iproute2, the switch appears to do something, I do see locked FDB entries:
[root@mox:~/.../drivers/net/dsa] # bridge fdb show dev lan2 00:01:02:03:04:01 vlan 1 locked extern_learn offload master br0 00:01:02:03:04:02 vlan 1 master br0 permanent 00:01:02:03:04:02 master br0 permanent
however all selftests except for MAB appear to fail:
[root@mox:~/.../drivers/net/dsa] # ./bridge_locked_port.sh lan1 lan2 lan3 lan4 [ 2394.084584] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 [ 2398.984189] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 RTNETLINK answers: File exists TEST: Locked port ipv4 [FAIL] Ping did not work after locking port and adding FDB entry [ 2410.452638] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 [ 2415.366824] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 RTNETLINK answers: File exists TEST: Locked port ipv6 [FAIL] Ping6 did not work after locking port and adding FDB entry [ 2425.653013] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2425.663784] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2425.752771] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2425.853188] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2425.954193] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2426.054593] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2426.155646] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2426.256733] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2426.357159] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2426.457638] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2427.354018] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 [ 2430.679471] mv88e6xxx_g1_vtu_prob_irq_thread_fn: 33 callbacks suppressed [ 2430.679502] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2430.723742] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2430.783752] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2430.887742] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2430.991738] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2431.095714] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2431.199505] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2431.303700] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2431.407707] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2431.511674] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 RTNETLINK answers: File exists [ 2435.683201] mv88e6xxx_g1_vtu_prob_irq_thread_fn: 28 callbacks suppressed [ 2435.683232] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2435.787060] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2435.891223] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2435.995249] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2436.099218] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2436.203215] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2436.307207] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2436.411175] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2436.515179] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2436.619220] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2440.712164] mv88e6xxx_g1_vtu_prob_irq_thread_fn: 26 callbacks suppressed [ 2440.712194] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2440.812777] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2440.913630] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2441.014523] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2441.114196] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 2441.214961] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 TEST: Locked port vlan [FAIL] Ping through vlan did not work after locking port and adding FDB entry [ 2443.945008] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 [ 2448.131708] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 TEST: Locked port MAB [ OK ] [ 2455.139841] mv88e6085 d0032004.mdio-mii:10 lan3: Link is Down [ 2455.158623] br0: port 2(lan3) entered disabled state [ 2455.288477] mv88e6085 d0032004.mdio-mii:10 lan2: Link is Down [ 2455.309996] br0: port 1(lan2) entered disabled state [ 2455.446341] device lan3 left promiscuous mode [ 2455.452208] br0: port 2(lan3) entered disabled state [ 2455.574232] device lan2 left promiscuous mode [ 2455.580378] br0: port 1(lan2) entered disabled state [ 2455.906058] mv88e6085 d0032004.mdio-mii:10 lan4: Link is Down [ 2456.069399] mv88e6085 d0032004.mdio-mii:10 lan1: Link is Down
Compare this to the same selftest, run just before applying the mv88e6xxx MAB patch:
TEST: Locked port ipv4 [ OK ] TEST: Locked port ipv6 [ OK ] [ 119.080114] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 119.091009] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 119.180557] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 119.280916] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 119.381256] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 119.481618] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 119.581993] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 119.682416] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 119.782799] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 119.883177] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 124.100262] mv88e6xxx_g1_vtu_prob_irq_thread_fn: 33 callbacks suppressed [ 124.100293] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 124.136182] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 124.204408] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 124.312440] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 124.416384] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 124.520361] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 124.624347] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 124.728344] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 124.832336] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 124.936341] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 130.014610] mv88e6xxx_g1_vtu_prob_irq_thread_fn: 20 callbacks suppressed [ 130.014637] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 130.118292] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 130.219203] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 130.324289] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 130.335695] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 130.424680] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 130.525600] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 130.626466] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 130.728275] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 [ 130.829422] mv88e6085 d0032004.mdio-mii:10: VTU member violation for vid 100, source port 9 TEST: Locked port vlan [ OK ] [ 133.926477] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 [ 134.028380] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 [ 134.132571] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 [ 134.244056] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 [ 134.344703] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 [ 134.448416] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 [ 134.552692] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 [ 134.656642] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 [ 134.760786] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 [ 134.864605] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 [ 139.247209] mv88e6xxx_g1_atu_prob_irq_thread_fn: 41 callbacks suppressed [ 139.247241] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 [ 144.358773] mv88e6085 d0032004.mdio-mii:10: ATU miss violation for 00:01:02:03:04:01 portvec 4 spid 2 TEST: Locked port MAB [FAIL] MAB: No locked fdb entry after ping on locked port
In other words, this patch set makes MAB work and breaks everything else. I'm willing to investigate exactly what is it that breaks the other selftest, but not today. It may be related to the "RTNETLINK answers: File exists" messages, which themselves come from the commands | bridge fdb add 00:01:02:03:04:01 dev lan2 master static
If I were to randomly guess at almost 4AM in the morning, it has to do with "bridge fdb add" rather than the "bridge fdb replace" that's used for the MAB selftest. The fact I pointed out a few revisions ago, that MAB needs to be opt-in, is now coming back to bite us. Since it's not opt-in, the mv88e6xxx driver always creates locked FDB entries, and when we try to "bridge fdb add", the kernel says "hey, the FDB entry is already there!". Is that it?
As for how to opt into MAB. Hmm. MAB seems to be essentially CPU assisted learning, which creates locked FDB entries. I wonder whether we should reconsider the position that address learning makes no sense on locked ports, and say that "+locked -learning" means no MAB, and "+locked +learning" means MAB? This would make a bunch of things more natural to handle in the kernel, and would also give us the opt-in we need.
Side note, the VTU and ATU member violation printks annoy me so badly. They aren't stating something super useful and they're a DoS attack vector in itself, even if they're rate limited. I wonder whether we could just turn the prints into a set of ethtool counters and call it a day?
In other words, this patch set makes MAB work and breaks everything else. I'm willing to investigate exactly what is it that breaks the other selftest, but not today. It may be related to the "RTNETLINK answers: File exists" messages, which themselves come from the commands | bridge fdb add 00:01:02:03:04:01 dev lan2 master static
If I were to randomly guess at almost 4AM in the morning, it has to do with "bridge fdb add" rather than the "bridge fdb replace" that's used for the MAB selftest. The fact I pointed out a few revisions ago, that MAB needs to be opt-in, is now coming back to bite us. Since it's not opt-in, the mv88e6xxx driver always creates locked FDB entries, and when we try to "bridge fdb add", the kernel says "hey, the FDB entry is already there!". Is that it?
Yes, that sounds like a reasonable explanation, as it adds 'ext learned, offloaded' entries. If you try and replace the 'add' with 'replace' in those tests, does it work?
As for how to opt into MAB. Hmm. MAB seems to be essentially CPU assisted learning, which creates locked FDB entries. I wonder whether we should reconsider the position that address learning makes no sense on locked ports, and say that "+locked -learning" means no MAB, and "+locked +learning" means MAB? This would make a bunch of things more natural to handle in the kernel, and would also give us the opt-in we need.
I have done the one and then the other. We need to have some final decision on this point. And remember that this gave rise to an extra patch to fix link-local learning if learning is turned on on a locked port, which resulted in the decision to allways have learning off on locked ports.
Side note, the VTU and ATU member violation printks annoy me so badly. They aren't stating something super useful and they're a DoS attack vector in itself, even if they're rate limited. I wonder whether we could just turn the prints into a set of ethtool counters and call it a day?
Sounds like a good idea to me. :-)
On Sun, Jul 17, 2022 at 02:34:22PM +0200, netdev@kapio-technology.com wrote:
If I were to randomly guess at almost 4AM in the morning, it has to do with "bridge fdb add" rather than the "bridge fdb replace" that's used for the MAB selftest. The fact I pointed out a few revisions ago, that MAB needs to be opt-in, is now coming back to bite us. Since it's not opt-in, the mv88e6xxx driver always creates locked FDB entries, and when we try to "bridge fdb add", the kernel says "hey, the FDB entry is already there!". Is that it?
Yes, that sounds like a reasonable explanation, as it adds 'ext learned, offloaded' entries. If you try and replace the 'add' with 'replace' in those tests, does it work?
Well, you have access to the selftests too... But yes, that is the reason, and it works when I change 'add' to 'replace', although of course this isn't the correct solution.
As for how to opt into MAB. Hmm. MAB seems to be essentially CPU assisted learning, which creates locked FDB entries. I wonder whether we should reconsider the position that address learning makes no sense on locked ports, and say that "+locked -learning" means no MAB, and "+locked +learning" means MAB? This would make a bunch of things more natural to handle in the kernel, and would also give us the opt-in we need.
I have done the one and then the other. We need to have some final decision on this point. And remember that this gave rise to an extra patch to fix link-local learning if learning is turned on on a locked port, which resulted in the decision to allways have learning off on locked ports.
I think part of the reason for the back-and-forth was not making a very clear distinction between basic 802.1X using hostapd, and MAB. While I agree hostapd doesn't have what to do with learning, for MAB I'm still wondering. It's the same situation for mv88e6xxx's Port Association Vector in fact.
Side note, the VTU and ATU member violation printks annoy me so badly. They aren't stating something super useful and they're a DoS attack vector in itself, even if they're rate limited. I wonder whether we could just turn the prints into a set of ethtool counters and call it a day?
Sounds like a good idea to me. :-)
Thinking this through, what we really want is trace points here, otherwise we'd lose information about which MAC address/VID/FID was it that caused the violation.
On 2022-07-17 02:47, Vladimir Oltean wrote:
@@ -1721,11 +1735,11 @@ static int mv88e6xxx_vtu_get(struct mv88e6xxx_chip *chip, u16 vid, return err; }
int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port, bool locked) { @@ -1257,13 +1270,18 @@ int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port, if (err) return err;
- err = mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_ASSOC_VECTOR,
®);
- if (err)
return err;
- if (!locked) {
err = mv88e6xxx_atu_locked_entry_flush(chip->ds, port);
Did you re-run the selftest with v4? Because this deadlocks due to the double reg_lock illustrated below:
I did some selftest, but I didn't experience the problem of the double chip lock as it only happens on unlock which is only called where the MAB test ends.
- vrf_name=vlan1
- ip vrf exec vlan1 ping 192.0.2.2 -c 10 -i 0.1 -w 5
- check_err 1 'Ping did not work after locking port and adding FDB
entry'
- local err=1
- local 'msg=Ping did not work after locking port and adding FDB entry'
- [[ 0 -eq 0 ]]
- [[ 1 -ne 0 ]]
- RET=1
- retmsg='Ping did not work after locking port and adding FDB entry'
- bridge link set dev lan2 locked off
[ 733.273994] [ 733.275515] ============================================ [ 733.280823] WARNING: possible recursive locking detected [ 733.286133] 5.19.0-rc6-07010-ga9b9500ffaac-dirty #3293 Not tainted [ 733.292311] -------------------------------------------- [ 733.297613] kworker/0:0/601 is trying to acquire lock: [ 733.302751] ffff00000213c110 (&chip->reg_lock){+.+.}-{4:4}, at: mv88e6xxx_atu_locked_entry_purge+0x70/0x1a4 [ 733.312524] [ 733.312524] but task is already holding lock: [ 733.318351] ffff00000213c110 (&chip->reg_lock){+.+.}-{4:4}, at: mv88e6xxx_port_bridge_flags+0x48/0x234 [ 733.327674] [ 733.327674] other info that might help us debug this: [ 733.334198] Possible unsafe locking scenario: [ 733.334198] [ 733.340109] CPU0 [ 733.342549] ---- [ 733.344990] lock(&chip->reg_lock); [ 733.348567] lock(&chip->reg_lock); [ 733.352149] [ 733.352149] *** DEADLOCK *** [ 733.352149] [ 733.358063] May be due to missing lock nesting notation [ 733.358063] [ 733.364846] 6 locks held by kworker/0:0/601: [ 733.369115] #0: ffff00000000b748 ((wq_completion)events){+.+.}-{0:0}, at: process_one_work+0x1f4/0x6c4 [ 733.378541] #1: ffff80000c43bdc8 (deferred_process_work){+.+.}-{0:0}, at: process_one_work+0x1f4/0x6c4 [ 733.387966] #2: ffff80000b020cb0 (rtnl_mutex){+.+.}-{4:4}, at: rtnl_lock+0x1c/0x30 [ 733.395660] #3: ffff80000b073370 ((switchdev_blocking_notif_chain).rwsem){++++}-{4:4}, at: blocking_notifier_call_chain+0x34/0xa0 [ 733.407432] #4: ffff00000213c110 (&chip->reg_lock){+.+.}-{4:4}, at: mv88e6xxx_port_bridge_flags+0x48/0x234 [ 733.417202] #5: ffff00000213e0f0 (&p->ale_list_lock){+.+.}-{4:4}, at: mv88e6xxx_atu_locked_entry_flush+0x4c/0xc0 [ 733.427495] [ 733.427495] stack backtrace: [ 733.431858] CPU: 0 PID: 601 Comm: kworker/0:0 Not tainted 5.19.0-rc6-07010-ga9b9500ffaac-dirty #3293 [ 733.440992] Hardware name: CZ.NIC Turris Mox Board (DT) [ 733.446219] Workqueue: events switchdev_deferred_process_work [ 733.451982] Call trace: [ 733.454424] dump_backtrace.part.0+0xcc/0xe0 [ 733.458702] show_stack+0x18/0x6c [ 733.462028] dump_stack_lvl+0x8c/0xb8 [ 733.465703] dump_stack+0x18/0x34 [ 733.469029] __lock_acquire+0x1074/0x1fd0 [ 733.473052] lock_acquire.part.0+0xe4/0x220 [ 733.477244] lock_acquire+0x68/0x8c [ 733.480744] __mutex_lock+0x9c/0x460 [ 733.484332] mutex_lock_nested+0x40/0x50 [ 733.488268] mv88e6xxx_atu_locked_entry_purge+0x70/0x1a4 [ 733.493592] mv88e6xxx_atu_locked_entry_flush+0x7c/0xc0 [ 733.498827] mv88e6xxx_port_set_lock+0xfc/0x10c [ 733.503374] mv88e6xxx_port_bridge_flags+0x200/0x234 [ 733.508351] dsa_port_bridge_flags+0x44/0xe0 [ 733.512635] dsa_slave_port_attr_set+0x1ec/0x230 [ 733.517268] __switchdev_handle_port_attr_set+0x58/0x100 [ 733.522594] switchdev_handle_port_attr_set+0x10/0x24 [ 733.527659] dsa_slave_switchdev_blocking_event+0x8c/0xd4 [ 733.533074] blocking_notifier_call_chain+0x6c/0xa0 [ 733.537968] switchdev_port_attr_notify.constprop.0+0x4c/0xb0 [ 733.543729] switchdev_port_attr_set_deferred+0x24/0x80 [ 733.548967] switchdev_deferred_process+0x90/0x164 [ 733.553774] switchdev_deferred_process_work+0x14/0x2c [ 733.558926] process_one_work+0x28c/0x6c4 [ 733.562949] worker_thread+0x74/0x450 [ 733.566623] kthread+0x118/0x11c [ 733.569860] ret_from_fork+0x10/0x20 ++ mac_get lan1 ++ local if_name=lan1 ++ jq -r '.[]["address"]' ++ ip -j link show dev lan1
I've tentatively fixed this in my tree by taking the register lock more localized to the sub-functions of mv88e6xxx_port_bridge_flags(), and not taking it at caller side for mv88e6xxx_port_set_lock(), but rather letting the callee take it:
Yes, the double chip lock came from calling mv88e6xxx_atu_locked_entry_flush() in mv88e6xxx_port_set_lock() that is called from mv88e6xxx_port_bridge_flags() whic has the lock taken. I have fixed this now, but when I unlock a port with locked entries, I cannot have notify the bridge to clear the same locked entries by having mv88e6xxx_atu_locked_entry_flush() calling with the notify on, thus taking the rtnl lock, as this also results in a double lock. To remove the locked entries in the bridge FDB it is then necessary as I see it, to take the link down or have a userspace daemon clear them. I hope that is okay if documented?
diff --git a/drivers/net/dsa/mv88e6xxx/chip.c b/drivers/net/dsa/mv88e6xxx/chip.c index 7b57ac121589..ec5954f32774 100644 --- a/drivers/net/dsa/mv88e6xxx/chip.c +++ b/drivers/net/dsa/mv88e6xxx/chip.c @@ -6557,41 +6557,47 @@ static int mv88e6xxx_port_bridge_flags(struct dsa_switch *ds, int port, struct mv88e6xxx_chip *chip = ds->priv; int err = -EOPNOTSUPP;
- mv88e6xxx_reg_lock(chip);
- if (flags.mask & BR_LEARNING) { bool learning = !!(flags.val & BR_LEARNING); u16 pav = learning ? (1 << port) : 0;
err = mv88e6xxx_port_set_assoc_vector(chip, port, pav);mv88e6xxx_reg_lock(chip);
if (err)mv88e6xxx_reg_unlock(chip);
goto out;
return err;
}
if (flags.mask & BR_FLOOD) { bool unicast = !!(flags.val & BR_FLOOD);
mv88e6xxx_reg_lock(chip);
err = chip->info->ops->port_set_ucast_flood(chip, port, unicast);
mv88e6xxx_reg_unlock(chip);
if (err)
goto out;
return err;
}
if (flags.mask & BR_MCAST_FLOOD) { bool multicast = !!(flags.val & BR_MCAST_FLOOD);
mv88e6xxx_reg_lock(chip);
err = chip->info->ops->port_set_mcast_flood(chip, port, multicast);
mv88e6xxx_reg_unlock(chip);
if (err)
goto out;
return err;
}
if (flags.mask & BR_BCAST_FLOOD) { bool broadcast = !!(flags.val & BR_BCAST_FLOOD);
mv88e6xxx_reg_lock(chip);
err = mv88e6xxx_port_broadcast_sync(chip, port, broadcast);
mv88e6xxx_reg_unlock(chip);
if (err)
goto out;
return err;
}
if (flags.mask & BR_PORT_LOCKED) {
@@ -6599,10 +6605,8 @@ static int mv88e6xxx_port_bridge_flags(struct dsa_switch *ds, int port,
err = mv88e6xxx_port_set_lock(chip, port, locked); if (err)
goto out;
}return err;
-out:
mv88e6xxx_reg_unlock(chip);
return err;
} diff --git a/drivers/net/dsa/mv88e6xxx/port.c b/drivers/net/dsa/mv88e6xxx/port.c index 5e4d5e9265a4..2a60aded6fbe 100644 --- a/drivers/net/dsa/mv88e6xxx/port.c +++ b/drivers/net/dsa/mv88e6xxx/port.c @@ -1222,15 +1222,19 @@ int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port, u16 reg; int err;
- mv88e6xxx_reg_lock(chip); err = mv88e6xxx_port_read(chip, port, MV88E6XXX_PORT_CTL0, ®);
- if (err)
if (err) {
mv88e6xxx_reg_unlock(chip);
return err;
}
reg &= ~MV88E6XXX_PORT_CTL0_SA_FILT_MASK; if (locked) reg |= MV88E6XXX_PORT_CTL0_SA_FILT_DROP_ON_LOCK;
err = mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_CTL0, reg);
mv88e6xxx_reg_unlock(chip); if (err) return err;
@@ -1247,7 +1251,11 @@ int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port, MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT; }
- return mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_ASSOC_VECTOR,
reg);
- mv88e6xxx_reg_lock(chip);
- err = mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_ASSOC_VECTOR,
reg);
- mv88e6xxx_reg_unlock(chip);
- return err;
}
int mv88e6xxx_port_set_8021q_mode(struct mv88e6xxx_chip *chip, int port,
if (err)
return err;
- }
- reg &= ~MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT;
- if (locked)
reg |= MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT;
reg = 0;
if (locked) {
reg = (1 << port);
reg |= MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG |
MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT;
}
return mv88e6xxx_port_write(chip, port, MV88E6XXX_PORT_ASSOC_VECTOR,
reg); } diff --git a/drivers/net/dsa/mv88e6xxx/port.h b/drivers/net/dsa/mv88e6xxx/port.h index e0a705d82019..5c1d485e7442 100644 --- a/drivers/net/dsa/mv88e6xxx/port.h +++ b/drivers/net/dsa/mv88e6xxx/port.h @@ -231,6 +231,7 @@ #define MV88E6XXX_PORT_ASSOC_VECTOR_LOCKED_PORT 0x2000 #define MV88E6XXX_PORT_ASSOC_VECTOR_IGNORE_WRONG 0x1000 #define MV88E6XXX_PORT_ASSOC_VECTOR_REFRESH_LOCKED 0x0800 +#define MV88E6XXX_PORT_ASSOC_VECTOR_PAV_MASK 0x07ff
/* Offset 0x0C: Port ATU Control */ #define MV88E6XXX_PORT_ATU_CTL 0x0c @@ -374,6 +375,7 @@ int mv88e6xxx_port_set_fid(struct mv88e6xxx_chip *chip, int port, u16 fid); int mv88e6xxx_port_get_pvid(struct mv88e6xxx_chip *chip, int port, u16 *pvid); int mv88e6xxx_port_set_pvid(struct mv88e6xxx_chip *chip, int port, u16 pvid);
+bool mv88e6xxx_port_is_locked(struct mv88e6xxx_chip *chip, int port); int mv88e6xxx_port_set_lock(struct mv88e6xxx_chip *chip, int port, bool locked);
Verify that the MAC-Auth mechanism works by adding a FDB entry with the locked flag set, denying access until the FDB entry is replaced with a FDB entry without the locked flag set.
Signed-off-by: Hans Schultz netdev@kapio-technology.com --- .../net/forwarding/bridge_locked_port.sh | 30 ++++++++++++++++++- 1 file changed, 29 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/net/forwarding/bridge_locked_port.sh b/tools/testing/selftests/net/forwarding/bridge_locked_port.sh index 5b02b6b60ce7..1ee12d7b5c8b 100755 --- a/tools/testing/selftests/net/forwarding/bridge_locked_port.sh +++ b/tools/testing/selftests/net/forwarding/bridge_locked_port.sh @@ -1,7 +1,7 @@ #!/bin/bash # SPDX-License-Identifier: GPL-2.0
-ALL_TESTS="locked_port_ipv4 locked_port_ipv6 locked_port_vlan" +ALL_TESTS="locked_port_ipv4 locked_port_ipv6 locked_port_vlan locked_port_mab" NUM_NETIFS=4 CHECK_TC="no" source lib.sh @@ -166,6 +166,34 @@ locked_port_ipv6() log_test "Locked port ipv6" }
+locked_port_mab() +{ + RET=0 + check_locked_port_support || return 0 + + ping_do $h1 192.0.2.2 + check_err $? "MAB: Ping did not work before locking port" + + bridge link set dev $swp1 locked on + bridge link set dev $swp1 learning on + + ping_do $h1 192.0.2.2 + check_fail $? "MAB: Ping worked on locked port without FDB entry" + + bridge fdb show | grep `mac_get $h1` | grep -q "locked" + check_err $? "MAB: No locked fdb entry after ping on locked port" + + bridge fdb replace `mac_get $h1` dev $swp1 master static + + ping_do $h1 192.0.2.2 + check_err $? "MAB: Ping did not work with fdb entry without locked flag" + + bridge fdb del `mac_get $h1` dev $swp1 master + bridge link set dev $swp1 learning off + bridge link set dev $swp1 locked off + + log_test "Locked port MAB" +} trap cleanup EXIT
setup_prepare
On Thu, Jul 07, 2022 at 05:29:30PM +0200, Hans Schultz wrote:
+locked_port_mab() +{
- RET=0
- check_locked_port_support || return 0
- ping_do $h1 192.0.2.2
- check_err $? "MAB: Ping did not work before locking port"
- bridge link set dev $swp1 locked on
- bridge link set dev $swp1 learning on
I was under the impression that we agreed that learning does not need to be enabled in the bridge driver
- ping_do $h1 192.0.2.2
- check_fail $? "MAB: Ping worked on locked port without FDB entry"
- bridge fdb show | grep `mac_get $h1` | grep -q "locked"
- check_err $? "MAB: No locked fdb entry after ping on locked port"
- bridge fdb replace `mac_get $h1` dev $swp1 master static
- ping_do $h1 192.0.2.2
- check_err $? "MAB: Ping did not work with fdb entry without locked flag"
- bridge fdb del `mac_get $h1` dev $swp1 master
- bridge link set dev $swp1 learning off
- bridge link set dev $swp1 locked off
- log_test "Locked port MAB"
+} trap cleanup EXIT setup_prepare -- 2.30.2
On 2022-07-10 09:29, Ido Schimmel wrote:
On Thu, Jul 07, 2022 at 05:29:30PM +0200, Hans Schultz wrote:
+locked_port_mab() +{
- RET=0
- check_locked_port_support || return 0
- ping_do $h1 192.0.2.2
- check_err $? "MAB: Ping did not work before locking port"
- bridge link set dev $swp1 locked on
- bridge link set dev $swp1 learning on
I was under the impression that we agreed that learning does not need to be enabled in the bridge driver
Sorry, you are right. I forgot to change it here.
- ping_do $h1 192.0.2.2
- check_fail $? "MAB: Ping worked on locked port without FDB entry"
- bridge fdb show | grep `mac_get $h1` | grep -q "locked"
- check_err $? "MAB: No locked fdb entry after ping on locked port"
- bridge fdb replace `mac_get $h1` dev $swp1 master static
- ping_do $h1 192.0.2.2
- check_err $? "MAB: Ping did not work with fdb entry without locked
flag"
- bridge fdb del `mac_get $h1` dev $swp1 master
- bridge link set dev $swp1 learning off
- bridge link set dev $swp1 locked off
- log_test "Locked port MAB"
+} trap cleanup EXIT
setup_prepare
2.30.2
On Thu, 7 Jul 2022 17:29:24 +0200 Hans Schultz wrote:
[PATCH v4 net-next 0/6] Extend locked port feature with FDB locked flag (MAC-Auth/MAB)
Let's give it a day or two for feedback but the series does not apply cleanly to net-next so a rebase & repost will be needed even if it's otherwise perfect.
On 2022-07-07 17:29 +0200, Hans Schultz wrote:
This patch set extends the locked port feature for devices that are behind a locked port, but do not have the ability to authorize themselves as a supplicant using IEEE 802.1X. Such devices can be printers, meters or anything related to fixed installations. Instead of 802.1X authorization, devices can get access based on their MAC addresses being whitelisted.
^
Please consider using the alternate vocabulary discussed in Documentation/process/coding-style.rst §4.
linux-kselftest-mirror@lists.linaro.org