Synchronous Ethernet networks use a physical layer clock to syntonize the frequency across different network elements.
A basic SyncE node, as defined in ITU-T G.8264, consists of an Ethernet Equipment Clock (EEC) and can recover synchronization from its synchronization inputs: either traffic interfaces or external frequency sources. The EEC can synchronize its frequency (syntonize) to any of those sources. It can also select a synchronization source through priority tables and synchronization status messaging, and it provides the necessary filtering and holdover capabilities.
This patch series introduces a basic interface for reading the Ethernet Equipment Clock (EEC) state on a SyncE-capable device. The state gives information about the source of the syntonization signal (either the port itself or an external one) and the state of the EEC. This interface is required to implement Synchronization Status Messaging on upper layers.
RFC history:
v2:
- removed whitespace changes
- fix issues reported by test robot
v3:
- Changed naming from SyncE to EEC
- Clarify cover letter and commit message for patch 1
v4:
- Removed sync_source and pin_idx info
- Changed one structure to attributes
- Added EEC_SRC_PORT flag to indicate that the EEC is synchronized to the recovered clock of a port that returns the state
v5:
- add EEC source as an optional attribute
- implement support for recovered clocks
- align states returned by EEC to ITU-T G.781
v6:
- fix EEC clock state reporting
- add documentation
- fix descriptions in code comments
Maciej Machnikowski (6):
  ice: add support detecting features based on netlist
  rtnetlink: Add new RTM_GETEECSTATE message to get SyncE status
  ice: add support for reading SyncE DPLL state
  rtnetlink: Add support for SyncE recovered clock configuration
  ice: add support for SyncE recovered clocks
  docs: net: Add description of SyncE interfaces
 Documentation/networking/synce.rst | 88 ++++++
 drivers/net/ethernet/intel/ice/ice.h | 7 +
 .../net/ethernet/intel/ice/ice_adminq_cmd.h | 94 ++++++-
 drivers/net/ethernet/intel/ice/ice_common.c | 224 ++++++++++++++++
 drivers/net/ethernet/intel/ice/ice_common.h | 20 +-
 drivers/net/ethernet/intel/ice/ice_devids.h | 3 +
 drivers/net/ethernet/intel/ice/ice_lib.c | 6 +-
 drivers/net/ethernet/intel/ice/ice_main.c | 138 ++++++++++
 drivers/net/ethernet/intel/ice/ice_ptp.c | 34 +++
 drivers/net/ethernet/intel/ice/ice_ptp_hw.c | 49 ++++
 drivers/net/ethernet/intel/ice/ice_ptp_hw.h | 22 ++
 drivers/net/ethernet/intel/ice/ice_type.h | 1 +
 include/linux/netdevice.h | 33 +++
 include/uapi/linux/if_link.h | 57 ++++
 include/uapi/linux/rtnetlink.h | 10 +
 net/core/rtnetlink.c | 253 ++++++++++++++++++
 security/selinux/nlmsgtab.c | 6 +-
 17 files changed, 1041 insertions(+), 4 deletions(-)
 create mode 100644 Documentation/networking/synce.rst
Add new functions to check the netlist of a given board for:
- Recovered Clock device,
- Clock Generation Unit,
- Clock Multiplexer.
Initialize feature bits depending on detected components.
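The detection flow described above can be modelled in a few lines of standalone C. This is only a sketch: the node type and part number values are copied from the ice_adminq_cmd.h hunk of this patch, but the ex_-prefixed names, the in-memory netlist array and the feature bit values are hypothetical. The driver queries firmware through the admin queue, one index at a time, rather than scanning an array.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Values copied from the ice_adminq_cmd.h hunk in this patch. */
#define EX_NODE_TYPE_CLK_CTRL		9
#define EX_NODE_TYPE_CLK_MUX		10
#define EX_NODE_NR_ZL30632_80032	0x24
#define EX_NODE_NR_PKVL			0x31
#define EX_NODE_NR_GEN_CLK_MUX		0x47

/* Hypothetical feature bits standing in for ICE_F_CGU and friends. */
enum ex_feature {
	EX_F_CGU	= 1 << 0,
	EX_F_PHY_RCLK	= 1 << 1,
	EX_F_SMA_CTRL	= 1 << 2,
};

/* Hypothetical netlist entry; the driver reads these from firmware. */
struct ex_netlist_node {
	uint8_t type;
	uint8_t part_number;
};

/* Return true if a node with the given type and part number exists. */
static bool ex_find_node(const struct ex_netlist_node *nl, size_t n,
			 uint8_t type, uint8_t part_number)
{
	for (size_t i = 0; i < n; i++)
		if (nl[i].type == type && nl[i].part_number == part_number)
			return true;
	return false;
}

/* Derive the feature mask from the components found in the netlist. */
static unsigned int ex_detect_features(const struct ex_netlist_node *nl,
				       size_t n)
{
	unsigned int features = 0;

	if (ex_find_node(nl, n, EX_NODE_TYPE_CLK_MUX, EX_NODE_NR_GEN_CLK_MUX))
		features |= EX_F_SMA_CTRL;
	if (ex_find_node(nl, n, EX_NODE_TYPE_CLK_CTRL, EX_NODE_NR_PKVL))
		features |= EX_F_PHY_RCLK;
	if (ex_find_node(nl, n, EX_NODE_TYPE_CLK_CTRL,
			 EX_NODE_NR_ZL30632_80032))
		features |= EX_F_CGU;

	return features;
}
```

Keying each feature on a detected component, rather than on a device or subdevice ID, is what lets boards with only some of these parts advertise just the features they can support.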
Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com>
---
 drivers/net/ethernet/intel/ice/ice.h | 2 +
 .../net/ethernet/intel/ice/ice_adminq_cmd.h | 7 +-
 drivers/net/ethernet/intel/ice/ice_common.c | 123 ++++++++++++++++++
 drivers/net/ethernet/intel/ice/ice_common.h | 9 ++
 drivers/net/ethernet/intel/ice/ice_lib.c | 6 +-
 drivers/net/ethernet/intel/ice/ice_ptp_hw.c | 1 +
 drivers/net/ethernet/intel/ice/ice_type.h | 1 +
 7 files changed, 147 insertions(+), 2 deletions(-)
diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h index bf4ecd9a517c..3dc4caa41565 100644 --- a/drivers/net/ethernet/intel/ice/ice.h +++ b/drivers/net/ethernet/intel/ice/ice.h @@ -186,6 +186,8 @@
enum ice_feature { ICE_F_DSCP, + ICE_F_CGU, + ICE_F_PHY_RCLK, ICE_F_SMA_CTRL, ICE_F_MAX }; diff --git a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h index 4eef3488d86f..339c2a86f680 100644 --- a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h +++ b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h @@ -1297,6 +1297,8 @@ struct ice_aqc_link_topo_params { #define ICE_AQC_LINK_TOPO_NODE_TYPE_CAGE 6 #define ICE_AQC_LINK_TOPO_NODE_TYPE_MEZZ 7 #define ICE_AQC_LINK_TOPO_NODE_TYPE_ID_EEPROM 8 +#define ICE_AQC_LINK_TOPO_NODE_TYPE_CLK_CTRL 9 +#define ICE_AQC_LINK_TOPO_NODE_TYPE_CLK_MUX 10 #define ICE_AQC_LINK_TOPO_NODE_CTX_S 4 #define ICE_AQC_LINK_TOPO_NODE_CTX_M \ (0xF << ICE_AQC_LINK_TOPO_NODE_CTX_S) @@ -1333,7 +1335,10 @@ struct ice_aqc_link_topo_addr { struct ice_aqc_get_link_topo { struct ice_aqc_link_topo_addr addr; u8 node_part_num; -#define ICE_AQC_GET_LINK_TOPO_NODE_NR_PCA9575 0x21 +#define ICE_AQC_GET_LINK_TOPO_NODE_NR_PCA9575 0x21 +#define ICE_ACQ_GET_LINK_TOPO_NODE_NR_ZL30632_80032 0x24 +#define ICE_ACQ_GET_LINK_TOPO_NODE_NR_PKVL 0x31 +#define ICE_ACQ_GET_LINK_TOPO_NODE_NR_GEN_CLK_MUX 0x47 u8 rsvd[9]; };
diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c index b3066d0fea8b..35903b282885 100644 --- a/drivers/net/ethernet/intel/ice/ice_common.c +++ b/drivers/net/ethernet/intel/ice/ice_common.c @@ -274,6 +274,79 @@ ice_aq_get_link_topo_handle(struct ice_port_info *pi, u8 node_type, return ice_aq_send_cmd(pi->hw, &desc, NULL, 0, cd); }
+/** + * ice_aq_get_netlist_node + * @hw: pointer to the hw struct + * @cmd: get_link_topo AQ structure + * @node_part_number: output node part number if node found + * @node_handle: output node handle parameter if node found + */ +enum ice_status +ice_aq_get_netlist_node(struct ice_hw *hw, struct ice_aqc_get_link_topo *cmd, + u8 *node_part_number, u16 *node_handle) +{ + struct ice_aq_desc desc; + + ice_fill_dflt_direct_cmd_desc(&desc, ice_aqc_opc_get_link_topo); + desc.params.get_link_topo = *cmd; + + if (ice_aq_send_cmd(hw, &desc, NULL, 0, NULL)) + return ICE_ERR_NOT_SUPPORTED; + + if (node_handle) + *node_handle = + le16_to_cpu(desc.params.get_link_topo.addr.handle); + if (node_part_number) + *node_part_number = desc.params.get_link_topo.node_part_num; + + return ICE_SUCCESS; +} + +#define MAX_NETLIST_SIZE 10 +/** + * ice_find_netlist_node + * @hw: pointer to the hw struct + * @node_type_ctx: type of netlist node to look for + * @node_part_number: node part number to look for + * @node_handle: output parameter if node found - optional + * + * Find and return the node handle for a given node type and part number in the + * netlist. When found ICE_SUCCESS is returned, ICE_ERR_DOES_NOT_EXIST + * otherwise. If @node_handle provided, it would be set to found node handle. 
+ */ +enum ice_status +ice_find_netlist_node(struct ice_hw *hw, u8 node_type_ctx, u8 node_part_number, + u16 *node_handle) +{ + struct ice_aqc_get_link_topo cmd; + u8 rec_node_part_number; + enum ice_status status; + u16 rec_node_handle; + u8 idx; + + for (idx = 0; idx < MAX_NETLIST_SIZE; idx++) { + memset(&cmd, 0, sizeof(cmd)); + + cmd.addr.topo_params.node_type_ctx = + (node_type_ctx << ICE_AQC_LINK_TOPO_NODE_TYPE_S); + cmd.addr.topo_params.index = idx; + + status = ice_aq_get_netlist_node(hw, &cmd, + &rec_node_part_number, + &rec_node_handle); + if (status) + return status; + + if (rec_node_part_number == node_part_number) { + if (node_handle) + *node_handle = rec_node_handle; + return ICE_SUCCESS; + } + } + + return ICE_ERR_DOES_NOT_EXIST; +} + /** * ice_is_media_cage_present * @pi: port information structure @@ -5083,3 +5156,53 @@ bool ice_fw_supports_report_dflt_cfg(struct ice_hw *hw) } return false; } + +/** + * ice_is_phy_rclk_present_e810t + * @hw: pointer to the hw struct + * + * Check if the PHY Recovered Clock device is present in the netlist + */ +bool ice_is_phy_rclk_present_e810t(struct ice_hw *hw) +{ + if (ice_find_netlist_node(hw, ICE_AQC_LINK_TOPO_NODE_TYPE_CLK_CTRL, + ICE_ACQ_GET_LINK_TOPO_NODE_NR_PKVL, NULL)) + return false; + + return true; +} + +/** + * ice_is_cgu_present_e810t + * @hw: pointer to the hw struct + * + * Check if the Clock Generation Unit (CGU) device is present in the netlist + */ +bool ice_is_cgu_present_e810t(struct ice_hw *hw) +{ + if (!ice_find_netlist_node(hw, ICE_AQC_LINK_TOPO_NODE_TYPE_CLK_CTRL, + ICE_ACQ_GET_LINK_TOPO_NODE_NR_ZL30632_80032, + NULL)) { + hw->cgu_part_number = + ICE_ACQ_GET_LINK_TOPO_NODE_NR_ZL30632_80032; + return true; + } + return false; +} + +/** + * ice_is_clock_mux_present_e810t + * @hw: pointer to the hw struct + * + * Check if the Clock Multiplexer device is present in the netlist + */ +bool ice_is_clock_mux_present_e810t(struct ice_hw *hw) +{ + if (ice_find_netlist_node(hw, 
ICE_AQC_LINK_TOPO_NODE_TYPE_CLK_MUX, + ICE_ACQ_GET_LINK_TOPO_NODE_NR_GEN_CLK_MUX, + NULL)) + return false; + + return true; +} + diff --git a/drivers/net/ethernet/intel/ice/ice_common.h b/drivers/net/ethernet/intel/ice/ice_common.h index 65c1b3244264..b20a5c085246 100644 --- a/drivers/net/ethernet/intel/ice/ice_common.h +++ b/drivers/net/ethernet/intel/ice/ice_common.h @@ -89,6 +89,12 @@ ice_aq_get_phy_caps(struct ice_port_info *pi, bool qual_mods, u8 report_mode, struct ice_aqc_get_phy_caps_data *caps, struct ice_sq_cd *cd); enum ice_status +ice_aq_get_netlist_node(struct ice_hw *hw, struct ice_aqc_get_link_topo *cmd, + u8 *node_part_number, u16 *node_handle); +enum ice_status +ice_find_netlist_node(struct ice_hw *hw, u8 node_type_ctx, u8 node_part_number, + u16 *node_handle); +enum ice_status ice_aq_list_caps(struct ice_hw *hw, void *buf, u16 buf_size, u32 *cap_count, enum ice_adminq_opc opc, struct ice_sq_cd *cd); enum ice_status @@ -206,4 +212,7 @@ bool ice_fw_supports_lldp_fltr_ctrl(struct ice_hw *hw); enum ice_status ice_lldp_fltr_add_remove(struct ice_hw *hw, u16 vsi_num, bool add); bool ice_fw_supports_report_dflt_cfg(struct ice_hw *hw); +bool ice_is_phy_rclk_present_e810t(struct ice_hw *hw); +bool ice_is_cgu_present_e810t(struct ice_hw *hw); +bool ice_is_clock_mux_present_e810t(struct ice_hw *hw); #endif /* _ICE_COMMON_H_ */ diff --git a/drivers/net/ethernet/intel/ice/ice_lib.c b/drivers/net/ethernet/intel/ice/ice_lib.c index 40562600a8cf..2422215b7937 100644 --- a/drivers/net/ethernet/intel/ice/ice_lib.c +++ b/drivers/net/ethernet/intel/ice/ice_lib.c @@ -4183,8 +4183,12 @@ void ice_init_feature_support(struct ice_pf *pf) case ICE_DEV_ID_E810C_QSFP: case ICE_DEV_ID_E810C_SFP: ice_set_feature_support(pf, ICE_F_DSCP); - if (ice_is_e810t(&pf->hw)) + if (ice_is_clock_mux_present_e810t(&pf->hw)) ice_set_feature_support(pf, ICE_F_SMA_CTRL); + if (ice_is_phy_rclk_present_e810t(&pf->hw)) + ice_set_feature_support(pf, ICE_F_PHY_RCLK); + if 
(ice_is_cgu_present_e810t(&pf->hw)) + ice_set_feature_support(pf, ICE_F_CGU); break; default: break; diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c index 29f947c0cd2e..aa257db36765 100644 --- a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c +++ b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c @@ -800,3 +800,4 @@ bool ice_is_pca9575_present(struct ice_hw *hw)
return !status && handle; } + diff --git a/drivers/net/ethernet/intel/ice/ice_type.h b/drivers/net/ethernet/intel/ice/ice_type.h index 9e0c2923c62e..a9dc16641bd4 100644 --- a/drivers/net/ethernet/intel/ice/ice_type.h +++ b/drivers/net/ethernet/intel/ice/ice_type.h @@ -920,6 +920,7 @@ struct ice_hw { struct list_head rss_list_head; struct ice_mbx_snapshot mbx_snapshot; u16 io_expander_handle; + u8 cgu_part_number; };
/* Statistics collected by each port, VSI, VEB, and S-channel */
This patch introduces a basic interface for reading the Ethernet Equipment Clock (EEC) state on a SyncE-capable device. The reported state describes the EEC's operating mode (invalid, free-running, locked, locked with holdover acquired, or holdover). This interface is required to implement Synchronization Status Messaging on upper layers.
The initial implementation returns the SyncE EEC state in the IFLA_EEC_STATE attribute. The index of the input currently used as the source can optionally be returned in the IFLA_EEC_SRC_IDX attribute.
Drivers expose the EEC state by implementing the ndo_get_eec_state callback. The source index is read through the optional ndo_get_eec_src callback.
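From userspace, a request for this message and the parsing of its reply might look like the sketch below. It assumes the message type value (124), the if_eec_state_msg layout and the attribute numbering shown in this patch. All ex_-prefixed names are hypothetical and exist only for illustration, and since this is an RFC the interface may change. Socket setup and the send/receive calls are left out.

```c
#include <linux/netlink.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed from this patch; not present in released kernel headers. */
#define EX_RTM_GETEECSTATE	124
#define EX_IFLA_EEC_STATE	1
#define EX_IFLA_EEC_SRC_IDX	2

struct ex_if_eec_state_msg {
	uint32_t ifindex;
};

/*
 * Build a request in buf (which must be 4-byte aligned).
 * Returns the message length, or 0 if the buffer is too small.
 */
static size_t ex_build_eec_state_req(void *buf, size_t buflen,
				     uint32_t ifindex, uint32_t seq)
{
	struct nlmsghdr *nlh = buf;
	struct ex_if_eec_state_msg *msg;
	size_t len = NLMSG_LENGTH(sizeof(*msg));

	if (buflen < NLMSG_ALIGN(len))
		return 0;

	memset(buf, 0, NLMSG_ALIGN(len));
	nlh->nlmsg_len = len;
	nlh->nlmsg_type = EX_RTM_GETEECSTATE;
	nlh->nlmsg_flags = NLM_F_REQUEST;
	nlh->nlmsg_seq = seq;

	msg = NLMSG_DATA(nlh);
	msg->ifindex = ifindex;

	return len;
}

/*
 * Walk the attributes of a reply. Returns 0 if IFLA_EEC_STATE was found,
 * -1 otherwise. *has_src tells whether the optional source index is set.
 */
static int ex_parse_eec_reply(const struct nlmsghdr *nlh, uint32_t *state,
			      uint32_t *src_idx, int *has_src)
{
	const char *pos = (const char *)NLMSG_DATA(nlh) +
			  NLMSG_ALIGN(sizeof(struct ex_if_eec_state_msg));
	const char *end = (const char *)nlh + nlh->nlmsg_len;
	int has_state = 0;

	*has_src = 0;
	while (pos + NLA_HDRLEN <= end) {
		struct nlattr nla;

		memcpy(&nla, pos, sizeof(nla));
		if (nla.nla_len < NLA_HDRLEN || pos + nla.nla_len > end)
			return -1;	/* malformed attribute */

		if (nla.nla_len >= NLA_HDRLEN + sizeof(uint32_t)) {
			if (nla.nla_type == EX_IFLA_EEC_STATE) {
				memcpy(state, pos + NLA_HDRLEN, sizeof(*state));
				has_state = 1;
			} else if (nla.nla_type == EX_IFLA_EEC_SRC_IDX) {
				memcpy(src_idx, pos + NLA_HDRLEN,
				       sizeof(*src_idx));
				*has_src = 1;
			}
		}
		pos += NLA_ALIGN(nla.nla_len);
	}

	return has_state ? 0 : -1;
}
```

Treating IFLA_EEC_SRC_IDX as optional in the parser mirrors the kernel side, where the attribute is only emitted when the driver implements ndo_get_eec_src.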
Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com>
---
 include/linux/netdevice.h | 13 ++++++
 include/uapi/linux/if_link.h | 31 +++++++++++++
 include/uapi/linux/rtnetlink.h | 3 ++
 net/core/rtnetlink.c | 79 ++++++++++++++++++++++++++++++++++
 security/selinux/nlmsgtab.c | 3 +-
 5 files changed, 128 insertions(+), 1 deletion(-)
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h index 3ec42495a43a..ef2b381dae0c 100644 --- a/include/linux/netdevice.h +++ b/include/linux/netdevice.h @@ -1344,6 +1344,13 @@ struct netdev_net_notifier { * The caller must be under RCU read context. * int (*ndo_fill_forward_path)(struct net_device_path_ctx *ctx, struct net_device_path *path); * Get the forwarding path to reach the real device from the HW destination address + * int (*ndo_get_eec_state)(struct net_device *dev, enum if_eec_state *state, + * struct netlink_ext_ack *extack); + * Get state of physical layer frequency synchronization (SyncE) + * int (*ndo_get_eec_src)(struct net_device *dev, u32 *src, + * struct netlink_ext_ack *extack); + * Get the index of the source signal that's currently used as EEC's + * reference */ struct net_device_ops { int (*ndo_init)(struct net_device *dev); @@ -1563,6 +1570,12 @@ struct net_device_ops { struct net_device * (*ndo_get_peer_dev)(struct net_device *dev); int (*ndo_fill_forward_path)(struct net_device_path_ctx *ctx, struct net_device_path *path); + int (*ndo_get_eec_state)(struct net_device *dev, + enum if_eec_state *state, + struct netlink_ext_ack *extack); + int (*ndo_get_eec_src)(struct net_device *dev, + u32 *src, + struct netlink_ext_ack *extack); };
/** diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h index eebd3894fe89..8eae80f287e9 100644 --- a/include/uapi/linux/if_link.h +++ b/include/uapi/linux/if_link.h @@ -1273,4 +1273,35 @@ enum {
#define IFLA_MCTP_MAX (__IFLA_MCTP_MAX - 1)
+/* SyncE section */ + +enum if_eec_state { + IF_EEC_STATE_INVALID = 0, /* state is not valid */ + IF_EEC_STATE_FREERUN, /* clock is free-running */ + IF_EEC_STATE_LOCKED, /* clock is locked to the reference, + * but the holdover memory is not valid + */ + IF_EEC_STATE_LOCKED_HO_ACQ, /* clock is locked to the reference + * and holdover memory is valid + */ + IF_EEC_STATE_HOLDOVER, /* clock is in holdover mode */ +}; + +#define EEC_SRC_PORT (1 << 0) /* recovered clock from the port is + * currently the source for the EEC + */ + +struct if_eec_state_msg { + __u32 ifindex; +}; + +enum { + IFLA_EEC_UNSPEC, + IFLA_EEC_STATE, + IFLA_EEC_SRC_IDX, + __IFLA_EEC_MAX, +}; + +#define IFLA_EEC_MAX (__IFLA_EEC_MAX - 1) + #endif /* _UAPI_LINUX_IF_LINK_H */ diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h index 5888492a5257..1d8662afd6bd 100644 --- a/include/uapi/linux/rtnetlink.h +++ b/include/uapi/linux/rtnetlink.h @@ -185,6 +185,9 @@ enum { RTM_GETNEXTHOPBUCKET, #define RTM_GETNEXTHOPBUCKET RTM_GETNEXTHOPBUCKET
+ RTM_GETEECSTATE = 124, +#define RTM_GETEECSTATE RTM_GETEECSTATE + __RTM_MAX, #define RTM_MAX (((__RTM_MAX + 3) & ~3) - 1) }; diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 2af8aeeadadf..03bc773d0e69 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -5467,6 +5467,83 @@ static int rtnl_stats_dump(struct sk_buff *skb, struct netlink_callback *cb) return skb->len; }
+static int rtnl_fill_eec_state(struct sk_buff *skb, struct net_device *dev, + u32 portid, u32 seq, struct netlink_callback *cb, + int flags, struct netlink_ext_ack *extack) +{ + const struct net_device_ops *ops = dev->netdev_ops; + struct if_eec_state_msg *state_msg; + enum if_eec_state state; + struct nlmsghdr *nlh; + u32 src_idx; + int err; + + ASSERT_RTNL(); + + if (!ops->ndo_get_eec_state) + return -EOPNOTSUPP; + + err = ops->ndo_get_eec_state(dev, &state, extack); + if (err) + return err; + + nlh = nlmsg_put(skb, portid, seq, RTM_GETEECSTATE, sizeof(*state_msg), + flags); + if (!nlh) + return -EMSGSIZE; + + state_msg = nlmsg_data(nlh); + state_msg->ifindex = dev->ifindex; + + if (nla_put_u32(skb, IFLA_EEC_STATE, state)) + return -EMSGSIZE; + + if (!ops->ndo_get_eec_src) + goto end_msg; + + err = ops->ndo_get_eec_src(dev, &src_idx, extack); + if (err) + return err; + + if (nla_put_u32(skb, IFLA_EEC_SRC_IDX, src_idx)) + return -EMSGSIZE; + +end_msg: + nlmsg_end(skb, nlh); + return 0; +} + +static int rtnl_eec_state_get(struct sk_buff *skb, struct nlmsghdr *nlh, + struct netlink_ext_ack *extack) +{ + struct net *net = sock_net(skb->sk); + struct if_eec_state_msg *state; + struct net_device *dev; + struct sk_buff *nskb; + int err; + + state = nlmsg_data(nlh); + dev = __dev_get_by_index(net, state->ifindex); + if (!dev) { + NL_SET_ERR_MSG(extack, "unknown ifindex"); + return -ENODEV; + } + + nskb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL); + if (!nskb) + return -ENOBUFS; + + err = rtnl_fill_eec_state(nskb, dev, NETLINK_CB(skb).portid, + nlh->nlmsg_seq, NULL, nlh->nlmsg_flags, + extack); + if (err < 0) + kfree_skb(nskb); + else + err = rtnl_unicast(nskb, net, NETLINK_CB(skb).portid); + + return err; +} + /* Process one rtnetlink message. */
static int rtnetlink_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh, @@ -5692,4 +5769,6 @@ void __init rtnetlink_init(void)
rtnl_register(PF_UNSPEC, RTM_GETSTATS, rtnl_stats_get, rtnl_stats_dump, 0); + + rtnl_register(PF_UNSPEC, RTM_GETEECSTATE, rtnl_eec_state_get, NULL, 0); } diff --git a/security/selinux/nlmsgtab.c b/security/selinux/nlmsgtab.c index 94ea2a8b2bb7..2c66e722ea9c 100644 --- a/security/selinux/nlmsgtab.c +++ b/security/selinux/nlmsgtab.c @@ -91,6 +91,7 @@ static const struct nlmsg_perm nlmsg_route_perms[] = { RTM_NEWNEXTHOPBUCKET, NETLINK_ROUTE_SOCKET__NLMSG_WRITE }, { RTM_DELNEXTHOPBUCKET, NETLINK_ROUTE_SOCKET__NLMSG_WRITE }, { RTM_GETNEXTHOPBUCKET, NETLINK_ROUTE_SOCKET__NLMSG_READ }, + { RTM_GETEECSTATE, NETLINK_ROUTE_SOCKET__NLMSG_READ }, };
static const struct nlmsg_perm nlmsg_tcpdiag_perms[] = @@ -176,7 +177,7 @@ int selinux_nlmsg_lookup(u16 sclass, u16 nlmsg_type, u32 *perm) * structures at the top of this file with the new mappings * before updating the BUILD_BUG_ON() macro! */ - BUILD_BUG_ON(RTM_MAX != (RTM_NEWNEXTHOPBUCKET + 3)); + BUILD_BUG_ON(RTM_MAX != (RTM_GETEECSTATE + 3)); err = nlmsg_perm(nlmsg_type, perm, nlmsg_route_perms, sizeof(nlmsg_route_perms)); break;
Implement SyncE DPLL monitoring for E810-T devices. The periodic PTP work item polls the state of the DPLL and caches it in the pf structure. State changes are logged in the system log.
Cached state can be read using the RTM_GETEECSTATE rtnetlink message.
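The mapping from the raw DPLL status word to an EEC state, and the change detection done on every poll, can be sketched as standalone C. The bit positions are copied from the ice_aqc_get_cgu_dpll_status definitions in this patch and the decision order follows ice_get_zl_dpll_state. The ex_-prefixed names and the cache structure are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Mirrors enum if_eec_state from this series. */
enum ex_eec_state {
	EX_EEC_STATE_INVALID = 0,
	EX_EEC_STATE_FREERUN,
	EX_EEC_STATE_LOCKED,		/* locked, holdover memory not valid */
	EX_EEC_STATE_LOCKED_HO_ACQ,	/* locked, holdover memory valid */
	EX_EEC_STATE_HOLDOVER,
};

/* Bit layout of dpll_state as defined in this patch. */
#define EX_DPLL_STATE_LOCK	(1u << 0)
#define EX_DPLL_STATE_HO	(1u << 1)
#define EX_DPLL_STATE_HO_READY	(1u << 2)
#define EX_DPLL_CLK_REF_SHIFT	8
#define EX_DPLL_CLK_REF_SEL	(0x1Fu << EX_DPLL_CLK_REF_SHIFT)

/* Decode the status word; pin is optional and receives the active input. */
static enum ex_eec_state ex_decode_dpll_state(uint16_t dpll_state,
					      uint8_t *pin)
{
	if (pin)
		*pin = (dpll_state & EX_DPLL_CLK_REF_SEL) >>
		       EX_DPLL_CLK_REF_SHIFT;

	if (dpll_state & EX_DPLL_STATE_LOCK)
		return (dpll_state & EX_DPLL_STATE_HO_READY) ?
		       EX_EEC_STATE_LOCKED_HO_ACQ : EX_EEC_STATE_LOCKED;

	if ((dpll_state & EX_DPLL_STATE_HO) &&
	    (dpll_state & EX_DPLL_STATE_HO_READY))
		return EX_EEC_STATE_HOLDOVER;

	return EX_EEC_STATE_FREERUN;
}

/* Hypothetical stand-in for the cached fields in struct ice_pf. */
struct ex_dpll_cache {
	enum ex_eec_state state;
	uint8_t pin;
};

/* Update the cache; returns 1 when the state changed (a log point). */
static int ex_update_cache(struct ex_dpll_cache *cache, uint16_t dpll_state)
{
	uint8_t pin;
	enum ex_eec_state state = ex_decode_dpll_state(dpll_state, &pin);

	if (cache->state == state)
		return 0;

	cache->state = state;
	cache->pin = pin;
	return 1;
}
```

As in the patch, the holdover bit only counts when the holdover memory is ready; otherwise the DPLL is reported as free-running. The cached pin is refreshed only when the state changes.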
Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com>
---
 drivers/net/ethernet/intel/ice/ice.h | 5 ++
 .../net/ethernet/intel/ice/ice_adminq_cmd.h | 34 +++++++++++++
 drivers/net/ethernet/intel/ice/ice_common.c | 36 ++++++++++++++
 drivers/net/ethernet/intel/ice/ice_common.h | 5 +-
 drivers/net/ethernet/intel/ice/ice_devids.h | 3 ++
 drivers/net/ethernet/intel/ice/ice_main.c | 47 ++++++++++++++++++
 drivers/net/ethernet/intel/ice/ice_ptp.c | 34 +++++++++++++
 drivers/net/ethernet/intel/ice/ice_ptp_hw.c | 48 +++++++++++++++++++
 drivers/net/ethernet/intel/ice/ice_ptp_hw.h | 22 +++++++++
 9 files changed, 233 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/ice/ice.h b/drivers/net/ethernet/intel/ice/ice.h index 3dc4caa41565..1dff7ca704d4 100644 --- a/drivers/net/ethernet/intel/ice/ice.h +++ b/drivers/net/ethernet/intel/ice/ice.h @@ -609,6 +609,11 @@ struct ice_pf { #define ICE_VF_AGG_NODE_ID_START 65 #define ICE_MAX_VF_AGG_NODES 32 struct ice_agg_node vf_agg_node[ICE_MAX_VF_AGG_NODES]; + + enum if_eec_state synce_dpll_state; + u8 synce_dpll_pin; + enum if_eec_state ptp_dpll_state; + u8 ptp_dpll_pin; };
struct ice_netdev_priv { diff --git a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h index 339c2a86f680..11226af7a9a4 100644 --- a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h +++ b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h @@ -1808,6 +1808,36 @@ struct ice_aqc_add_rdma_qset_data { struct ice_aqc_add_tx_rdma_qset_entry rdma_qsets[]; };
+/* Get CGU DPLL status (direct 0x0C66) */ +struct ice_aqc_get_cgu_dpll_status { + u8 dpll_num; + u8 ref_state; +#define ICE_AQC_GET_CGU_DPLL_STATUS_REF_SW_LOS BIT(0) +#define ICE_AQC_GET_CGU_DPLL_STATUS_REF_SW_SCM BIT(1) +#define ICE_AQC_GET_CGU_DPLL_STATUS_REF_SW_CFM BIT(2) +#define ICE_AQC_GET_CGU_DPLL_STATUS_REF_SW_GST BIT(3) +#define ICE_AQC_GET_CGU_DPLL_STATUS_REF_SW_PFM BIT(4) +#define ICE_AQC_GET_CGU_DPLL_STATUS_REF_SW_ESYNC BIT(6) +#define ICE_AQC_GET_CGU_DPLL_STATUS_FAST_LOCK_EN BIT(7) + __le16 dpll_state; +#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_LOCK BIT(0) +#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_HO BIT(1) +#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_HO_READY BIT(2) +#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_FLHIT BIT(5) +#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_PSLHIT BIT(7) +#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_CLK_REF_SHIFT 8 +#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_CLK_REF_SEL \ + ICE_M(0x1F, ICE_AQC_GET_CGU_DPLL_STATUS_STATE_CLK_REF_SHIFT) +#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_MODE_SHIFT 13 +#define ICE_AQC_GET_CGU_DPLL_STATUS_STATE_MODE \ + ICE_M(0x7, ICE_AQC_GET_CGU_DPLL_STATUS_STATE_MODE_SHIFT) + __le32 phase_offset_h; + __le32 phase_offset_l; + u8 eec_mode; + u8 rsvd[1]; + __le16 node_handle; +}; + /* Configure Firmware Logging Command (indirect 0xFF09) * Logging Information Read Response (indirect 0xFF10) * Note: The 0xFF10 command has no input parameters. @@ -2039,6 +2069,7 @@ struct ice_aq_desc { struct ice_aqc_fw_logging fw_logging; struct ice_aqc_get_clear_fw_log get_clear_fw_log; struct ice_aqc_download_pkg download_pkg; + struct ice_aqc_get_cgu_dpll_status get_cgu_dpll_status; struct ice_aqc_driver_shared_params drv_shared_params; struct ice_aqc_set_mac_lb set_mac_lb; struct ice_aqc_alloc_free_res_cmd sw_res_ctrl; @@ -2205,6 +2236,9 @@ enum ice_adminq_opc { ice_aqc_opc_update_pkg = 0x0C42, ice_aqc_opc_get_pkg_info_list = 0x0C43,
+ /* 1588/SyncE commands/events */ + ice_aqc_opc_get_cgu_dpll_status = 0x0C66, + ice_aqc_opc_driver_shared_params = 0x0C90,
/* Standalone Commands/Events */ diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c index 35903b282885..8069141ac105 100644 --- a/drivers/net/ethernet/intel/ice/ice_common.c +++ b/drivers/net/ethernet/intel/ice/ice_common.c @@ -4644,6 +4644,42 @@ ice_dis_vsi_rdma_qset(struct ice_port_info *pi, u16 count, u32 *qset_teid, return ice_status_to_errno(status); }
+/** + * ice_aq_get_cgu_dpll_status + * @hw: pointer to the HW struct + * @dpll_num: DPLL index + * @ref_state: Reference clock state + * @dpll_state: DPLL state + * @phase_offset: Phase offset in ps + * @eec_mode: EEC_mode + * + * Get CGU DPLL status (0x0C66) + */ +enum ice_status +ice_aq_get_cgu_dpll_status(struct ice_hw *hw, u8 dpll_num, u8 *ref_state, + u16 *dpll_state, u64 *phase_offset, u8 *eec_mode) +{ + struct ice_aqc_get_cgu_dpll_status *cmd; + struct ice_aq_desc desc; + enum ice_status status; + + ice_fill_dflt_direct_cmd_desc(&desc, ice_aqc_opc_get_cgu_dpll_status); + cmd = &desc.params.get_cgu_dpll_status; + cmd->dpll_num = dpll_num; + + status = ice_aq_send_cmd(hw, &desc, NULL, 0, NULL); + if (!status) { + *ref_state = cmd->ref_state; + *dpll_state = le16_to_cpu(cmd->dpll_state); + *phase_offset = le32_to_cpu(cmd->phase_offset_h); + *phase_offset <<= 32; + *phase_offset += le32_to_cpu(cmd->phase_offset_l); + *eec_mode = cmd->eec_mode; + } + + return status; +} + /** * ice_replay_pre_init - replay pre initialization * @hw: pointer to the HW struct diff --git a/drivers/net/ethernet/intel/ice/ice_common.h b/drivers/net/ethernet/intel/ice/ice_common.h index b20a5c085246..aaed388a40a8 100644 --- a/drivers/net/ethernet/intel/ice/ice_common.h +++ b/drivers/net/ethernet/intel/ice/ice_common.h @@ -106,6 +106,7 @@ enum ice_status ice_aq_manage_mac_write(struct ice_hw *hw, const u8 *mac_addr, u8 flags, struct ice_sq_cd *cd); bool ice_is_e810(struct ice_hw *hw); +bool ice_is_e810t(struct ice_hw *hw); enum ice_status ice_clear_pf_cfg(struct ice_hw *hw); enum ice_status ice_aq_set_phy_cfg(struct ice_hw *hw, struct ice_port_info *pi, @@ -162,6 +163,9 @@ ice_cfg_vsi_rdma(struct ice_port_info *pi, u16 vsi_handle, u16 tc_bitmap, int ice_ena_vsi_rdma_qset(struct ice_port_info *pi, u16 vsi_handle, u8 tc, u16 *rdma_qset, u16 num_qsets, u32 *qset_teid); +enum ice_status +ice_aq_get_cgu_dpll_status(struct ice_hw *hw, u8 dpll_num, u8 *ref_state, + u16 *dpll_state, u64 
*phase_offset, u8 *eec_mode); int ice_dis_vsi_rdma_qset(struct ice_port_info *pi, u16 count, u32 *qset_teid, u16 *q_id); @@ -189,7 +193,6 @@ ice_stat_update40(struct ice_hw *hw, u32 reg, bool prev_stat_loaded, void ice_stat_update32(struct ice_hw *hw, u32 reg, bool prev_stat_loaded, u64 *prev_stat, u64 *cur_stat); -bool ice_is_e810t(struct ice_hw *hw); enum ice_status ice_sched_query_elem(struct ice_hw *hw, u32 node_teid, struct ice_aqc_txsched_elem_data *buf); diff --git a/drivers/net/ethernet/intel/ice/ice_devids.h b/drivers/net/ethernet/intel/ice/ice_devids.h index 61dd2f18dee8..0b654d417d29 100644 --- a/drivers/net/ethernet/intel/ice/ice_devids.h +++ b/drivers/net/ethernet/intel/ice/ice_devids.h @@ -58,4 +58,7 @@ /* Intel(R) Ethernet Connection E822-L 1GbE */ #define ICE_DEV_ID_E822L_SGMII 0x189A
+#define ICE_SUBDEV_ID_E810T 0x000E +#define ICE_SUBDEV_ID_E810T2 0x000F + #endif /* _ICE_DEVIDS_H_ */ diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c index f099797f35e3..da6cfe19259a 100644 --- a/drivers/net/ethernet/intel/ice/ice_main.c +++ b/drivers/net/ethernet/intel/ice/ice_main.c @@ -6240,6 +6240,51 @@ static void ice_napi_disable_all(struct ice_vsi *vsi) } }
+/** + * ice_get_eec_state - get state of SyncE DPLL + * @netdev: network interface device structure + * @state: state of SyncE DPLL + * @eec_flags: EEC state flags + * @extack: netlink extended ack + */ +static int +ice_get_eec_state(struct net_device *netdev, enum if_eec_state *state, + struct netlink_ext_ack *extack) +{ + struct ice_netdev_priv *np = netdev_priv(netdev); + struct ice_vsi *vsi = np->vsi; + struct ice_pf *pf = vsi->back; + + if (!ice_is_feature_supported(pf, ICE_F_CGU)) + return -EOPNOTSUPP; + + *state = pf->synce_dpll_state; + + return 0; +} + +/** + * ice_get_eec_src - get reference index of SyncE DPLL + * @netdev: network interface device structure + * @src: index of source reference of the SyncE DPLL + * @extack: netlink extended ack + */ +static int +ice_get_eec_src(struct net_device *netdev, u32 *src, + struct netlink_ext_ack *extack) +{ + struct ice_netdev_priv *np = netdev_priv(netdev); + struct ice_vsi *vsi = np->vsi; + struct ice_pf *pf = vsi->back; + + if (!ice_is_feature_supported(pf, ICE_F_CGU)) + return -EOPNOTSUPP; + + *src = pf->synce_dpll_pin; + + return 0; +} + /** * ice_down - Shutdown the connection * @vsi: The VSI being stopped @@ -8601,4 +8646,6 @@ static const struct net_device_ops ice_netdev_ops = { .ndo_bpf = ice_xdp, .ndo_xdp_xmit = ice_xdp_xmit, .ndo_xsk_wakeup = ice_xsk_wakeup, + .ndo_get_eec_state = ice_get_eec_state, + .ndo_get_eec_src = ice_get_eec_src, }; diff --git a/drivers/net/ethernet/intel/ice/ice_ptp.c b/drivers/net/ethernet/intel/ice/ice_ptp.c index bf7247c6f58e..a38d0ab4d6d5 100644 --- a/drivers/net/ethernet/intel/ice/ice_ptp.c +++ b/drivers/net/ethernet/intel/ice/ice_ptp.c @@ -1766,6 +1766,36 @@ static void ice_ptp_tx_tstamp_cleanup(struct ice_ptp_tx *tx) } }
+static void ice_handle_cgu_state(struct ice_pf *pf) +{ + enum if_eec_state cgu_state; + u8 pin; + + cgu_state = ice_get_zl_dpll_state(&pf->hw, ICE_CGU_DPLL_SYNCE, &pin); + if (pf->synce_dpll_state != cgu_state) { + pf->synce_dpll_state = cgu_state; + pf->synce_dpll_pin = pin; + + dev_warn(ice_pf_to_dev(pf), + "<DPLL%i> state changed to: %d, pin %d\n", + ICE_CGU_DPLL_SYNCE, + pf->synce_dpll_state, + pin); + } + + cgu_state = ice_get_zl_dpll_state(&pf->hw, ICE_CGU_DPLL_PTP, &pin); + if (pf->ptp_dpll_state != cgu_state) { + pf->ptp_dpll_state = cgu_state; + pf->ptp_dpll_pin = pin; + + dev_warn(ice_pf_to_dev(pf), + "<DPLL%i> state changed to: %d, pin %d\n", + ICE_CGU_DPLL_PTP, + pf->ptp_dpll_state, + pin); + } +} + static void ice_ptp_periodic_work(struct kthread_work *work) { struct ice_ptp *ptp = container_of(work, struct ice_ptp, work.work); @@ -1774,6 +1804,9 @@ static void ice_ptp_periodic_work(struct kthread_work *work) if (!test_bit(ICE_FLAG_PTP, pf->flags)) return;
+ if (ice_is_feature_supported(pf, ICE_F_CGU)) + ice_handle_cgu_state(pf); + ice_ptp_update_cached_phctime(pf);
ice_ptp_tx_tstamp_cleanup(&pf->ptp.port.tx); @@ -1958,3 +1991,4 @@ void ice_ptp_release(struct ice_pf *pf)
dev_info(ice_pf_to_dev(pf), "Removed PTP clock\n"); } + diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c index aa257db36765..7a9482918a20 100644 --- a/drivers/net/ethernet/intel/ice/ice_ptp_hw.c +++ b/drivers/net/ethernet/intel/ice/ice_ptp_hw.c @@ -375,6 +375,54 @@ static int ice_ptp_port_cmd_e810(struct ice_hw *hw, enum ice_ptp_tmr_cmd cmd) return 0; }
+/** + * ice_get_zl_dpll_state - get the state of the DPLL + * @hw: pointer to the hw struct + * @dpll_idx: Index of internal DPLL unit + * @pin: pointer to a buffer for returning currently active pin + * + * This function will read the state of the DPLL(dpll_idx). If optional + * parameter pin is given it'll be used to retrieve currently active pin. + * + * Return: state of the DPLL + */ +enum if_eec_state +ice_get_zl_dpll_state(struct ice_hw *hw, u8 dpll_idx, u8 *pin) +{ + enum ice_status status; + u64 phase_offset; + u16 dpll_state; + u8 ref_state; + u8 eec_mode; + + if (dpll_idx >= ICE_CGU_DPLL_MAX) + return IF_EEC_STATE_INVALID; + + status = ice_aq_get_cgu_dpll_status(hw, dpll_idx, &ref_state, + &dpll_state, &phase_offset, + &eec_mode); + if (status) + return IF_EEC_STATE_INVALID; + + if (pin) { + /* current ref pin in dpll_state_refsel_status_X register */ + *pin = (dpll_state & + ICE_AQC_GET_CGU_DPLL_STATUS_STATE_CLK_REF_SEL) >> + ICE_AQC_GET_CGU_DPLL_STATUS_STATE_CLK_REF_SHIFT; + } + + if (dpll_state & ICE_AQC_GET_CGU_DPLL_STATUS_STATE_LOCK) { + if (dpll_state & ICE_AQC_GET_CGU_DPLL_STATUS_STATE_HO_READY) + return IF_EEC_STATE_LOCKED_HO_ACQ; + else + return IF_EEC_STATE_LOCKED; + } else if ((dpll_state & ICE_AQC_GET_CGU_DPLL_STATUS_STATE_HO) && + (dpll_state & ICE_AQC_GET_CGU_DPLL_STATUS_STATE_HO_READY)) { + return IF_EEC_STATE_HOLDOVER; + } + return IF_EEC_STATE_FREERUN; +} + /* Device agnostic functions * * The following functions implement useful behavior to hide the differences diff --git a/drivers/net/ethernet/intel/ice/ice_ptp_hw.h b/drivers/net/ethernet/intel/ice/ice_ptp_hw.h index b2984b5c22c1..fcd543531b2c 100644 --- a/drivers/net/ethernet/intel/ice/ice_ptp_hw.h +++ b/drivers/net/ethernet/intel/ice/ice_ptp_hw.h @@ -33,6 +33,8 @@ int ice_ptp_init_phy_e810(struct ice_hw *hw); int ice_read_sma_ctrl_e810t(struct ice_hw *hw, u8 *data); int ice_write_sma_ctrl_e810t(struct ice_hw *hw, u8 data); bool ice_is_pca9575_present(struct ice_hw *hw); +enum 
if_eec_state +ice_get_zl_dpll_state(struct ice_hw *hw, u8 dpll_idx, u8 *pin);
#define PFTSYN_SEM_BYTES 4
@@ -98,4 +100,24 @@ bool ice_is_pca9575_present(struct ice_hw *hw); #define ICE_SMA_MAX_BIT_E810T 7 #define ICE_PCA9575_P1_OFFSET 8
+enum ice_e810t_cgu_dpll { + ICE_CGU_DPLL_SYNCE, + ICE_CGU_DPLL_PTP, + ICE_CGU_DPLL_MAX +}; + +enum ice_e810t_cgu_pins { + REF0P, + REF0N, + REF1P, + REF1N, + REF2P, + REF2N, + REF3P, + REF3N, + REF4P, + REF4N, + NUM_E810T_CGU_PINS +}; + #endif /* _ICE_PTP_HW_H_ */
Add support for RTNL messages for reading and configuring SyncE recovered clocks. The messages are:
RTM_GETRCLKRANGE: Reads the allowed pin index range for the recovered clock outputs. The range can be aligned to PHY outputs or to EEC inputs, whichever is better for a given application.
RTM_GETRCLKSTATE: Reads the state of the pins that output the recovered clock from a given port. The message contains the number of assigned clocks (IFLA_RCLK_STATE_COUNT) and N pin indexes in IFLA_RCLK_STATE_OUT_IDX.
RTM_SETRCLKSTATE: Sets the redirection of the recovered clock for a given pin.
Signed-off-by: Maciej Machnikowski maciej.machnikowski@intel.com
---
 include/linux/netdevice.h      |   9 ++
 include/uapi/linux/if_link.h   |  26 +++++
 include/uapi/linux/rtnetlink.h |   7 ++
 net/core/rtnetlink.c           | 174 +++++++++++++++++++++++++++++++++
 security/selinux/nlmsgtab.c    |   3 +
 5 files changed, 219 insertions(+)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index ef2b381dae0c..708bd8336155 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1576,6 +1576,15 @@ struct net_device_ops {
 	int			(*ndo_get_eec_src)(struct net_device *dev,
 						   u32 *src,
 						   struct netlink_ext_ack *extack);
+	int			(*ndo_get_rclk_range)(struct net_device *dev,
+						      u32 *min_idx, u32 *max_idx,
+						      struct netlink_ext_ack *extack);
+	int			(*ndo_set_rclk_out)(struct net_device *dev,
+						    u32 out_idx, bool ena,
+						    struct netlink_ext_ack *extack);
+	int			(*ndo_get_rclk_state)(struct net_device *dev,
+						      u32 out_idx, bool *ena,
+						      struct netlink_ext_ack *extack);
 };

 /**
diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h
index 8eae80f287e9..e27c153cfba3 100644
--- a/include/uapi/linux/if_link.h
+++ b/include/uapi/linux/if_link.h
@@ -1304,4 +1304,30 @@ enum {

 #define IFLA_EEC_MAX (__IFLA_EEC_MAX - 1)

+struct if_rclk_range_msg {
+	__u32 ifindex;
+};
+
+enum {
+	IFLA_RCLK_RANGE_UNSPEC,
+	IFLA_RCLK_RANGE_MIN_PIN,
+	IFLA_RCLK_RANGE_MAX_PIN,
+	__IFLA_RCLK_RANGE_MAX,
+};
+
+struct if_set_rclk_msg {
+	__u32 ifindex;
+	__u32 out_idx;
+	__u32 flags;
+};
+
+#define SET_RCLK_FLAGS_ENA	(1U << 0)
+
+enum {
+	IFLA_RCLK_STATE_UNSPEC,
+	IFLA_RCLK_STATE_OUT_IDX,
+	IFLA_RCLK_STATE_COUNT,
+	__IFLA_RCLK_STATE_MAX,
+};
+
 #endif /* _UAPI_LINUX_IF_LINK_H */
diff --git a/include/uapi/linux/rtnetlink.h b/include/uapi/linux/rtnetlink.h
index 1d8662afd6bd..6c0d96d56ec7 100644
--- a/include/uapi/linux/rtnetlink.h
+++ b/include/uapi/linux/rtnetlink.h
@@ -185,6 +185,13 @@ enum {
 	RTM_GETNEXTHOPBUCKET,
 #define RTM_GETNEXTHOPBUCKET	RTM_GETNEXTHOPBUCKET

+	RTM_GETRCLKRANGE = 120,
+#define RTM_GETRCLKRANGE	RTM_GETRCLKRANGE
+	RTM_GETRCLKSTATE = 121,
+#define RTM_GETRCLKSTATE	RTM_GETRCLKSTATE
+	RTM_SETRCLKSTATE = 122,
+#define RTM_SETRCLKSTATE	RTM_SETRCLKSTATE
+
 	RTM_GETEECSTATE = 124,
 #define RTM_GETEECSTATE	RTM_GETEECSTATE

diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 03bc773d0e69..bc1e050f6d38 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -5544,6 +5544,176 @@ static int rtnl_eec_state_get(struct sk_buff *skb, struct nlmsghdr *nlh,
 	return err;
 }

+static int rtnl_fill_rclk_range(struct sk_buff *skb, struct net_device *dev,
+				u32 portid, u32 seq,
+				struct netlink_callback *cb, int flags,
+				struct netlink_ext_ack *extack)
+{
+	const struct net_device_ops *ops = dev->netdev_ops;
+	struct if_rclk_range_msg *state_msg;
+	struct nlmsghdr *nlh;
+	u32 min_idx, max_idx;
+	int err;
+
+	ASSERT_RTNL();
+
+	if (!ops->ndo_get_rclk_range)
+		return -EOPNOTSUPP;
+
+	err = ops->ndo_get_rclk_range(dev, &min_idx, &max_idx, extack);
+	if (err)
+		return err;
+
+	nlh = nlmsg_put(skb, portid, seq, RTM_GETRCLKRANGE, sizeof(*state_msg),
+			flags);
+	if (!nlh)
+		return -EMSGSIZE;
+
+	state_msg = nlmsg_data(nlh);
+	state_msg->ifindex = dev->ifindex;
+
+	if (nla_put_u32(skb, IFLA_RCLK_RANGE_MIN_PIN, min_idx) ||
+	    nla_put_u32(skb, IFLA_RCLK_RANGE_MAX_PIN, max_idx))
+		return -EMSGSIZE;
+
+	nlmsg_end(skb, nlh);
+	return 0;
+}
+
+static int rtnl_rclk_range_get(struct sk_buff *skb, struct nlmsghdr *nlh,
+			       struct netlink_ext_ack *extack)
+{
+	struct net *net = sock_net(skb->sk);
+	struct if_eec_state_msg *state;
+	struct net_device *dev;
+	struct sk_buff *nskb;
+	int err;
+
+	state = nlmsg_data(nlh);
+	dev = __dev_get_by_index(net, state->ifindex);
+	if (!dev) {
+		NL_SET_ERR_MSG(extack, "unknown ifindex");
+		return -ENODEV;
+	}
+
+	nskb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!nskb)
+		return -ENOBUFS;
+
+	err = rtnl_fill_rclk_range(nskb, dev, NETLINK_CB(skb).portid,
+				   nlh->nlmsg_seq, NULL, nlh->nlmsg_flags,
+				   extack);
+	if (err < 0)
+		kfree_skb(nskb);
+	else
+		err = rtnl_unicast(nskb, net, NETLINK_CB(skb).portid);
+
+	return err;
+}
+
+static int rtnl_fill_rclk_state(struct sk_buff *skb, struct net_device *dev,
+				u32 portid, u32 seq,
+				struct netlink_callback *cb, int flags,
+				struct netlink_ext_ack *extack)
+{
+	const struct net_device_ops *ops = dev->netdev_ops;
+	u32 min_idx, max_idx, src_idx, count = 0;
+	struct if_eec_state_msg *state_msg;
+	struct nlmsghdr *nlh;
+	bool ena;
+	int err;
+
+	ASSERT_RTNL();
+
+	if (!ops->ndo_get_rclk_state || !ops->ndo_get_rclk_range)
+		return -EOPNOTSUPP;
+
+	err = ops->ndo_get_rclk_range(dev, &min_idx, &max_idx, extack);
+	if (err)
+		return err;
+
+	nlh = nlmsg_put(skb, portid, seq, RTM_GETRCLKSTATE, sizeof(*state_msg),
+			flags);
+	if (!nlh)
+		return -EMSGSIZE;
+
+	state_msg = nlmsg_data(nlh);
+	state_msg->ifindex = dev->ifindex;
+
+	for (src_idx = min_idx; src_idx <= max_idx; src_idx++) {
+		ops->ndo_get_rclk_state(dev, src_idx, &ena, extack);
+		if (!ena)
+			continue;
+
+		if (nla_put_u32(skb, IFLA_RCLK_STATE_OUT_IDX, src_idx))
+			return -EMSGSIZE;
+		count++;
+	}
+
+	if (nla_put_u32(skb, IFLA_RCLK_STATE_COUNT, count))
+		return -EMSGSIZE;
+
+	nlmsg_end(skb, nlh);
+	return 0;
+}
+
+static int rtnl_rclk_state_get(struct sk_buff *skb, struct nlmsghdr *nlh,
+			       struct netlink_ext_ack *extack)
+{
+	struct net *net = sock_net(skb->sk);
+	struct if_eec_state_msg *state;
+	struct net_device *dev;
+	struct sk_buff *nskb;
+	int err;
+
+	state = nlmsg_data(nlh);
+	dev = __dev_get_by_index(net, state->ifindex);
+	if (!dev) {
+		NL_SET_ERR_MSG(extack, "unknown ifindex");
+		return -ENODEV;
+	}
+
+	nskb = nlmsg_new(NLMSG_DEFAULT_SIZE, GFP_KERNEL);
+	if (!nskb)
+		return -ENOBUFS;
+
+	err = rtnl_fill_rclk_state(nskb, dev, NETLINK_CB(skb).portid,
+				   nlh->nlmsg_seq, NULL, nlh->nlmsg_flags,
+				   extack);
+	if (err < 0)
+		kfree_skb(nskb);
+	else
+		err = rtnl_unicast(nskb, net, NETLINK_CB(skb).portid);
+
+	return err;
+}
+
+static int rtnl_rclk_set(struct sk_buff *skb, struct nlmsghdr *nlh,
+			 struct netlink_ext_ack *extack)
+{
+	struct net *net = sock_net(skb->sk);
+	struct if_set_rclk_msg *state;
+	struct net_device *dev;
+	bool ena;
+	int err;
+
+	state = nlmsg_data(nlh);
+	dev = __dev_get_by_index(net, state->ifindex);
+	if (!dev) {
+		NL_SET_ERR_MSG(extack, "unknown ifindex");
+		return -ENODEV;
+	}
+
+	if (!dev->netdev_ops->ndo_set_rclk_out)
+		return -EOPNOTSUPP;
+
+	ena = !!(state->flags & SET_RCLK_FLAGS_ENA);
+	err = dev->netdev_ops->ndo_set_rclk_out(dev, state->out_idx, ena,
+						extack);
+
+	return err;
+}
+
 /* Process one rtnetlink message. */

 static int rtnetlink_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh,
@@ -5770,5 +5940,9 @@ void __init rtnetlink_init(void)
 	rtnl_register(PF_UNSPEC, RTM_GETSTATS, rtnl_stats_get, rtnl_stats_dump,
 		      0);

+	rtnl_register(PF_UNSPEC, RTM_GETRCLKRANGE, rtnl_rclk_range_get, NULL, 0);
+	rtnl_register(PF_UNSPEC, RTM_GETRCLKSTATE, rtnl_rclk_state_get, NULL, 0);
+	rtnl_register(PF_UNSPEC, RTM_SETRCLKSTATE, rtnl_rclk_set, NULL, 0);
+
 	rtnl_register(PF_UNSPEC, RTM_GETEECSTATE, rtnl_eec_state_get, NULL, 0);
 }
diff --git a/security/selinux/nlmsgtab.c b/security/selinux/nlmsgtab.c
index 2c66e722ea9c..57c7c85edd4d 100644
--- a/security/selinux/nlmsgtab.c
+++ b/security/selinux/nlmsgtab.c
@@ -91,6 +91,9 @@ static const struct nlmsg_perm nlmsg_route_perms[] =
 	{ RTM_NEWNEXTHOPBUCKET,	NETLINK_ROUTE_SOCKET__NLMSG_WRITE },
 	{ RTM_DELNEXTHOPBUCKET,	NETLINK_ROUTE_SOCKET__NLMSG_WRITE },
 	{ RTM_GETNEXTHOPBUCKET,	NETLINK_ROUTE_SOCKET__NLMSG_READ  },
+	{ RTM_GETRCLKRANGE,	NETLINK_ROUTE_SOCKET__NLMSG_READ  },
+	{ RTM_GETRCLKSTATE,	NETLINK_ROUTE_SOCKET__NLMSG_READ  },
+	{ RTM_SETRCLKSTATE,	NETLINK_ROUTE_SOCKET__NLMSG_WRITE },
 	{ RTM_GETEECSTATE,	NETLINK_ROUTE_SOCKET__NLMSG_READ  },
 };

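To make the wire format of the proposed RTM_SETRCLKSTATE request concrete, the sketch below assembles one in a userspace buffer without sending it. It is an assumption-laden illustration: the message type value, flag and payload layout are copied from the diff above into DEMO_* names because they are not expected to be present in installed uapi headers, and demo_build_set_rclk() is a hypothetical helper rather than part of the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <linux/netlink.h>

/* Values copied from the patch above (assumption: not in installed headers) */
#define DEMO_RTM_SETRCLKSTATE	122
#define DEMO_SET_RCLK_FLAGS_ENA	(1U << 0)

/* Mirrors struct if_set_rclk_msg from the patch */
struct demo_if_set_rclk_msg {
	uint32_t ifindex;
	uint32_t out_idx;
	uint32_t flags;
};

/* Build the request in buf, which must be suitably aligned for a
 * struct nlmsghdr. Returns the message length, or 0 if buf is too small.
 */
static size_t demo_build_set_rclk(void *buf, size_t buflen, uint32_t ifindex,
				  uint32_t out_idx, int enable)
{
	struct demo_if_set_rclk_msg *msg;
	struct nlmsghdr *nlh = buf;
	size_t len = NLMSG_LENGTH(sizeof(*msg));

	assert(buf != NULL);
	if (buflen < len)
		return 0;

	memset(buf, 0, len);
	nlh->nlmsg_len = len;
	nlh->nlmsg_type = DEMO_RTM_SETRCLKSTATE;
	nlh->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK;

	/* The fixed payload follows the aligned netlink header */
	msg = NLMSG_DATA(nlh);
	msg->ifindex = ifindex;
	msg->out_idx = out_idx;
	msg->flags = enable ? DEMO_SET_RCLK_FLAGS_ENA : 0;

	return len;
}
```

A caller would then pass the buffer to send() on an AF_NETLINK/NETLINK_ROUTE socket; that part is omitted because it only works on a kernel carrying this series.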
On 11/4/21 1:12 AM, Maciej Machnikowski wrote:
> Add support for RTNL messages for reading/configuring SyncE recovered clocks. The messages are: RTM_GETRCLKRANGE: Reads the allowed pin index range for the recovered clock outputs. This can be aligned to PHY outputs or to EEC inputs, whichever is better for a given application.
> RTM_GETRCLKSTATE: Reads the state of the pins that output the recovered clock from a given port. The message will contain the number of assigned clocks (IFLA_RCLK_STATE_COUNT) and N pin indexes in IFLA_RCLK_STATE_OUT_IDX attributes.
> RTM_SETRCLKSTATE: Sets the redirection of the recovered clock for a given pin.
> Signed-off-by: Maciej Machnikowski maciej.machnikowski@intel.com
Can't we just use a single RTM msg with nested attributes?
With a separate RTM msgtype for each SyncE attribute we will end up bloating the RTM msg namespace.
(These APIs could also be in ethtool, given it's directly querying the drivers.)
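As a rough picture of what the "single message with nested attributes" layout could look like on the wire, the sketch below encodes a range as one nest holding two u32 attributes. Everything named DEMO_* and both helper functions are hypothetical; no such attribute set exists in the posted series. Only struct nlattr, NLA_HDRLEN, NLA_ALIGN and NLA_F_NESTED come from linux/netlink.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <linux/netlink.h>

/* Hypothetical attribute IDs, for illustration only */
enum {
	DEMO_RCLK_UNSPEC,
	DEMO_RCLK_RANGE,	/* nest */
	DEMO_RCLK_RANGE_MIN,	/* u32, inside DEMO_RCLK_RANGE */
	DEMO_RCLK_RANGE_MAX,	/* u32, inside DEMO_RCLK_RANGE */
};

/* Append one u32 attribute at offset off and return the new offset */
static size_t demo_put_u32(unsigned char *buf, size_t off, uint16_t type,
			   uint32_t val)
{
	struct nlattr nla = {
		.nla_len = NLA_HDRLEN + sizeof(val),
		.nla_type = type,
	};

	memcpy(buf + off, &nla, sizeof(nla));
	memcpy(buf + off + NLA_HDRLEN, &val, sizeof(val));
	return off + NLA_ALIGN(nla.nla_len);
}

/* Encode { RANGE: { MIN, MAX } }. buf must hold at least 20 bytes.
 * Returns the number of bytes written.
 */
static size_t demo_encode_range(unsigned char *buf, uint32_t min_pin,
				uint32_t max_pin)
{
	struct nlattr nest = { .nla_type = DEMO_RCLK_RANGE | NLA_F_NESTED };
	size_t off = NLA_HDRLEN;	/* leave room for the nest header */

	assert(buf != NULL);
	off = demo_put_u32(buf, off, DEMO_RCLK_RANGE_MIN, min_pin);
	off = demo_put_u32(buf, off, DEMO_RCLK_RANGE_MAX, max_pin);

	/* The nest length covers its own header plus all children */
	nest.nla_len = off;
	memcpy(buf, &nest, sizeof(nest));
	return off;
}
```

With this shape, further properties could be added as new attribute IDs inside the nest instead of as new RTM message types.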
-----Original Message-----
From: Roopa Prabhu roopa@nvidia.com
Sent: Thursday, November 4, 2021 7:25 PM
To: Machnikowski, Maciej maciej.machnikowski@intel.com; netdev@vger.kernel.org; intel-wired-lan@lists.osuosl.org
Cc: richardcochran@gmail.com; abyagowi@fb.com; Nguyen, Anthony L anthony.l.nguyen@intel.com; davem@davemloft.net; kuba@kernel.org; linux-kselftest@vger.kernel.org; idosch@idosch.org; mkubecek@suse.cz; saeed@kernel.org; michael.chan@broadcom.com
Subject: Re: [PATCH net-next 4/6] rtnetlink: Add support for SyncE recovered clock configuration
> Can't we just use a single RTM msg with nested attributes?
> With a separate RTM msgtype for each SyncE attribute we will end up bloating the RTM msg namespace.
> (These APIs could also be in ethtool, given it's directly querying the drivers.)
I'm not a fan of merging those messages. The mergeable ones are GETRCLKRANGE and GETRCLKSTATE, but the get range function may result in a significantly longer call if obtaining the information about the underlying HW requires any HW calls. They are currently grouped in 3 categories:
- reading the boundaries in GetRclkRange (we can later add more to it, like configurable frequency limits)
- reading the current configuration
- setting the required configuration
I don't plan on adding any additional RTM msg types for those (and that's the reason why I pulled them before EEC state, which may have more messages in the future).
We also discussed the ethtool way in the past RFCs, but this concept is applicable to any transport layer, so I'd rather not limit it to Ethernet devices; it also applies to SONET, InfiniBand and others.
Regards Maciek
On Fri, Nov 05, 2021 at 12:17:19PM +0000, Machnikowski, Maciej wrote:
> We also discussed the ethtool way in the past RFCs, but this concept is applicable to any transport layer, so I'd rather not limit it to Ethernet devices; it also applies to SONET, InfiniBand and others.
I'm still not convinced that this doesn't belong in ethtool. I find it very weird to have a message called "Get Ethernet Equipment Clock State" in rtnetlink and not in ethtool. I also believe it was a mistake to add DCB to rtnetlink (for example, why is PAUSE configured via ethtool, but PFC via rtnetlink?)
-----Original Message-----
From: Ido Schimmel idosch@idosch.org
Sent: Sunday, November 7, 2021 3:21 PM
To: Machnikowski, Maciej maciej.machnikowski@intel.com
Subject: Re: [PATCH net-next 4/6] rtnetlink: Add support for SyncE recovered clock configuration
> I'm still not convinced that this doesn't belong in ethtool. I find it very weird to have a message called "Get Ethernet Equipment Clock State" in rtnetlink and not in ethtool. I also believe it was a mistake to add DCB to rtnetlink (for example, why is PAUSE configured via ethtool, but PFC via rtnetlink?)
We can use:
- SEC - Synchronous Equipment Clock
- EC - Equipment Clock
SyncE is a specific implementation of a more generic concept. I'd rather not limit it to Ethernet only, as there are more network types that already use this concept, such as SONET/SDH, PDH and GPON/EPON networks. Given the recent growth in timing applications, I expect more to follow.
Implement NDO functions for handling SyncE recovered clocks.
Signed-off-by: Maciej Machnikowski maciej.machnikowski@intel.com
---
 .../net/ethernet/intel/ice/ice_adminq_cmd.h | 53 +++++++++++
 drivers/net/ethernet/intel/ice/ice_common.c | 65 +++++++++++++
 drivers/net/ethernet/intel/ice/ice_common.h |  6 ++
 drivers/net/ethernet/intel/ice/ice_main.c   | 91 +++++++++++++++++++
 include/linux/netdevice.h                   | 11 +++
 5 files changed, 226 insertions(+)

diff --git a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
index 11226af7a9a4..dace00a35c44 100644
--- a/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
+++ b/drivers/net/ethernet/intel/ice/ice_adminq_cmd.h
@@ -1281,6 +1281,31 @@ struct ice_aqc_set_mac_lb {
 	u8 reserved[15];
 };

+/* Set PHY recovered clock output (direct 0x0630) */
+struct ice_aqc_set_phy_rec_clk_out {
+	u8 phy_output;
+	u8 port_num;
+	u8 flags;
+#define ICE_AQC_SET_PHY_REC_CLK_OUT_OUT_EN	BIT(0)
+#define ICE_AQC_SET_PHY_REC_CLK_OUT_CURR_PORT	0xFF
+	u8 rsvd;
+	__le32 freq;
+	u8 rsvd2[6];
+	__le16 node_handle;
+};
+
+/* Get PHY recovered clock output (direct 0x0631) */
+struct ice_aqc_get_phy_rec_clk_out {
+	u8 phy_output;
+	u8 port_num;
+	u8 flags;
+#define ICE_AQC_GET_PHY_REC_CLK_OUT_OUT_EN	BIT(0)
+	u8 rsvd;
+	__le32 freq;
+	u8 rsvd2[6];
+	__le16 node_handle;
+};
+
 struct ice_aqc_link_topo_params {
 	u8 lport_num;
 	u8 lport_num_valid;
@@ -1838,6 +1863,28 @@ struct ice_aqc_get_cgu_dpll_status {
 	__le16 node_handle;
 };

+/* Read CGU register (direct 0x0C6E) */
+struct ice_aqc_read_cgu_reg {
+	__le16 offset;
+#define ICE_AQC_READ_CGU_REG_MAX_DATA_LEN	16
+	u8 data_len;
+	u8 rsvd[13];
+};
+
+/* Read CGU register response (direct 0x0C6E) */
+struct ice_aqc_read_cgu_reg_resp {
+	u8 data[ICE_AQC_READ_CGU_REG_MAX_DATA_LEN];
+};
+
+/* Write CGU register (direct 0x0C6F) */
+struct ice_aqc_write_cgu_reg {
+	__le16 offset;
+#define ICE_AQC_WRITE_CGU_REG_MAX_DATA_LEN	7
+	u8 data_len;
+	u8 data[ICE_AQC_WRITE_CGU_REG_MAX_DATA_LEN];
+	u8 rsvd[6];
+};
+
 /* Configure Firmware Logging Command (indirect 0xFF09)
  * Logging Information Read Response (indirect 0xFF10)
  * Note: The 0xFF10 command has no input parameters.
@@ -2033,6 +2080,8 @@ struct ice_aq_desc {
 		struct ice_aqc_get_phy_caps get_phy;
 		struct ice_aqc_set_phy_cfg set_phy;
 		struct ice_aqc_restart_an restart_an;
+		struct ice_aqc_set_phy_rec_clk_out set_phy_rec_clk_out;
+		struct ice_aqc_get_phy_rec_clk_out get_phy_rec_clk_out;
 		struct ice_aqc_gpio read_write_gpio;
 		struct ice_aqc_sff_eeprom read_write_sff_param;
 		struct ice_aqc_set_port_id_led set_port_id_led;
@@ -2188,6 +2237,8 @@ enum ice_adminq_opc {
 	ice_aqc_opc_get_link_status			= 0x0607,
 	ice_aqc_opc_set_event_mask			= 0x0613,
 	ice_aqc_opc_set_mac_lb				= 0x0620,
+	ice_aqc_opc_set_phy_rec_clk_out			= 0x0630,
+	ice_aqc_opc_get_phy_rec_clk_out			= 0x0631,
 	ice_aqc_opc_get_link_topo			= 0x06E0,
 	ice_aqc_opc_set_port_id_led			= 0x06E9,
 	ice_aqc_opc_set_gpio				= 0x06EC,
@@ -2238,6 +2289,8 @@ enum ice_adminq_opc {

 	/* 1588/SyncE commands/events */
 	ice_aqc_opc_get_cgu_dpll_status			= 0x0C66,
+	ice_aqc_opc_read_cgu_reg			= 0x0C6E,
+	ice_aqc_opc_write_cgu_reg			= 0x0C6F,

 	ice_aqc_opc_driver_shared_params		= 0x0C90,

diff --git a/drivers/net/ethernet/intel/ice/ice_common.c b/drivers/net/ethernet/intel/ice/ice_common.c index 8069141ac105..29d302ea1e56 100644 --- a/drivers/net/ethernet/intel/ice/ice_common.c +++ b/drivers/net/ethernet/intel/ice/ice_common.c @@ -5242,3 +5242,68 @@ bool ice_is_clock_mux_present_e810t(struct ice_hw *hw) return true; }
+/** + * ice_aq_set_phy_rec_clk_out - set RCLK phy out + * @hw: pointer to the HW struct + * @phy_output: PHY reference clock output pin + * @enable: GPIO state to be applied + * @freq: PHY output frequency + * + * Set CGU reference priority (0x0630) + * Return 0 on success or negative value on failure. + */ +enum ice_status +ice_aq_set_phy_rec_clk_out(struct ice_hw *hw, u8 phy_output, bool enable, + u32 *freq) +{ + struct ice_aqc_set_phy_rec_clk_out *cmd; + struct ice_aq_desc desc; + enum ice_status status; + + ice_fill_dflt_direct_cmd_desc(&desc, ice_aqc_opc_set_phy_rec_clk_out); + cmd = &desc.params.set_phy_rec_clk_out; + cmd->phy_output = phy_output; + cmd->port_num = ICE_AQC_SET_PHY_REC_CLK_OUT_CURR_PORT; + cmd->flags = enable & ICE_AQC_SET_PHY_REC_CLK_OUT_OUT_EN; + cmd->freq = cpu_to_le32(*freq); + + status = ice_aq_send_cmd(hw, &desc, NULL, 0, NULL); + if (!status) + *freq = le32_to_cpu(cmd->freq); + + return status; +} + +/** + * ice_aq_get_phy_rec_clk_out + * @hw: pointer to the HW struct + * @phy_output: PHY reference clock output pin + * @port_num: Port number + * @flags: PHY flags + * @freq: PHY output frequency + * + * Get PHY recovered clock output (0x0631) + */ +enum ice_status +ice_aq_get_phy_rec_clk_out(struct ice_hw *hw, u8 phy_output, u8 *port_num, + u8 *flags, u32 *freq) +{ + struct ice_aqc_get_phy_rec_clk_out *cmd; + struct ice_aq_desc desc; + enum ice_status status; + + ice_fill_dflt_direct_cmd_desc(&desc, ice_aqc_opc_get_phy_rec_clk_out); + cmd = &desc.params.get_phy_rec_clk_out; + cmd->phy_output = phy_output; + cmd->port_num = *port_num; + + status = ice_aq_send_cmd(hw, &desc, NULL, 0, NULL); + if (!status) { + *port_num = cmd->port_num; + *flags = cmd->flags; + *freq = le32_to_cpu(cmd->freq); + } + + return status; +} + diff --git a/drivers/net/ethernet/intel/ice/ice_common.h b/drivers/net/ethernet/intel/ice/ice_common.h index aaed388a40a8..8a99c8364173 100644 --- a/drivers/net/ethernet/intel/ice/ice_common.h +++ 
b/drivers/net/ethernet/intel/ice/ice_common.h
@@ -166,6 +166,12 @@ ice_ena_vsi_rdma_qset(struct ice_port_info *pi, u16 vsi_handle, u8 tc,
 enum ice_status
 ice_aq_get_cgu_dpll_status(struct ice_hw *hw, u8 dpll_num, u8 *ref_state,
 			   u16 *dpll_state, u64 *phase_offset, u8 *eec_mode);
+enum ice_status
+ice_aq_set_phy_rec_clk_out(struct ice_hw *hw, u8 phy_output, bool enable,
+			   u32 *freq);
+enum ice_status
+ice_aq_get_phy_rec_clk_out(struct ice_hw *hw, u8 phy_output, u8 *port_num,
+			   u8 *flags, u32 *freq);
 int
 ice_dis_vsi_rdma_qset(struct ice_port_info *pi, u16 count, u32 *qset_teid,
 		      u16 *q_id);
diff --git a/drivers/net/ethernet/intel/ice/ice_main.c b/drivers/net/ethernet/intel/ice/ice_main.c
index da6cfe19259a..127fad8fc8a8 100644
--- a/drivers/net/ethernet/intel/ice/ice_main.c
+++ b/drivers/net/ethernet/intel/ice/ice_main.c
@@ -6285,6 +6285,94 @@ ice_get_eec_src(struct net_device *netdev, u32 *src,
 	return 0;
 }
 
+/**
+ * ice_get_rclk_range - get range of recovered clock indices
+ * @netdev: network interface device structure
+ * @min_idx: min rclk index
+ * @max_idx: max rclk index
+ * @extack: netlink extended ack
+ */
+static int
+ice_get_rclk_range(struct net_device *netdev, u32 *min_idx, u32 *max_idx,
+		   struct netlink_ext_ack *extack)
+{
+	struct ice_netdev_priv *np = netdev_priv(netdev);
+	struct ice_vsi *vsi = np->vsi;
+	struct ice_pf *pf = vsi->back;
+
+	if (!ice_is_feature_supported(pf, ICE_F_CGU))
+		return -EOPNOTSUPP;
+
+	*min_idx = REF1P;
+	*max_idx = REF1N;
+
+	return 0;
+}
+
+/**
+ * ice_set_rclk_out - set recovered clock redirection to the output pin
+ * @netdev: network interface device structure
+ * @out_idx: output index
+ * @ena: true will enable redirection, false will disable it
+ * @extack: netlink extended ack
+ */
+static int
+ice_set_rclk_out(struct net_device *netdev, u32 out_idx, bool ena,
+		 struct netlink_ext_ack *extack)
+{
+	struct ice_netdev_priv *np = netdev_priv(netdev);
+	struct ice_vsi *vsi = np->vsi;
+	struct ice_pf *pf = vsi->back;
+	enum ice_status ret;
+	u32 freq;
+
+	if (!ice_is_feature_supported(pf, ICE_F_CGU))
+		return -EOPNOTSUPP;
+
+	if (out_idx < REF1P || out_idx > REF1N)
+		return -EINVAL;
+
+	ret = ice_aq_set_phy_rec_clk_out(&pf->hw, out_idx - REF1P, ena, &freq);
+
+	return ice_status_to_errno(ret);
+}
+
+/**
+ * ice_get_rclk_state - Get state of recovered clock pin for a given netdev
+ * @netdev: network interface device structure
+ * @out_idx: output index
+ * @ena: returns true if the pin is enabled
+ * @extack: netlink extended ack
+ */
+static int
+ice_get_rclk_state(struct net_device *netdev, u32 out_idx, bool *ena,
+		   struct netlink_ext_ack *extack)
+{
+	u8 port_num = ICE_AQC_SET_PHY_REC_CLK_OUT_CURR_PORT;
+	struct ice_netdev_priv *np = netdev_priv(netdev);
+	struct ice_vsi *vsi = np->vsi;
+	struct ice_pf *pf = vsi->back;
+	enum ice_status ret;
+	u32 freq;
+	u8 flags;
+
+	if (!ice_is_feature_supported(pf, ICE_F_CGU))
+		return -EOPNOTSUPP;
+
+	if (out_idx < REF1P || out_idx > REF1N)
+		return -EINVAL;
+
+	ret = ice_aq_get_phy_rec_clk_out(&pf->hw, out_idx - REF1P, &port_num,
+					 &flags, &freq);
+
+	if (!ret && (flags & ICE_AQC_GET_PHY_REC_CLK_OUT_OUT_EN))
+		*ena = true;
+	else
+		*ena = false;
+
+	return ice_status_to_errno(ret);
+}
+
 /**
  * ice_down - Shutdown the connection
  * @vsi: The VSI being stopped
@@ -8648,4 +8736,7 @@ static const struct net_device_ops ice_netdev_ops = {
 	.ndo_xsk_wakeup = ice_xsk_wakeup,
 	.ndo_get_eec_state = ice_get_eec_state,
 	.ndo_get_eec_src = ice_get_eec_src,
+	.ndo_get_rclk_range = ice_get_rclk_range,
+	.ndo_set_rclk_out = ice_set_rclk_out,
+	.ndo_get_rclk_state = ice_get_rclk_state,
 };
diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 708bd8336155..9faa005506d1 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -1351,6 +1351,17 @@ struct netdev_net_notifier {
  *			    struct netlink_ext_ack *extack);
  *	Get the index of the source signal that's currently used as EEC's
  *	reference
+ * int (*ndo_get_rclk_range)(struct net_device *dev, u32 *min_idx, u32 *max_idx,
+ *			     struct netlink_ext_ack *extack);
+ *	Get range of valid output indices for the set/get Recovered Clock
+ *	functions
+ * int (*ndo_set_rclk_out)(struct net_device *dev, u32 out_idx, bool ena,
+ *			   struct netlink_ext_ack *extack);
+ *	Set the receive clock recovery redirection to a given Recovered Clock
+ *	output.
+ * int (*ndo_get_rclk_state)(struct net_device *dev, u32 out_idx, bool *ena,
+ *			     struct netlink_ext_ack *extack);
+ *	Get current state of the recovered clock to pin mapping.
  */
 struct net_device_ops {
 	int (*ndo_init)(struct net_device *dev);
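The range check and index translation shared by ice_set_rclk_out() and ice_get_rclk_state() can be sketched in isolation as below. This is only an illustration, not the driver code: the numeric values of REF1P and REF1N are assumed (the real ones live in the ice headers), and mock_set_phy_rec_clk_out() is a hypothetical stand-in for ice_aq_set_phy_rec_clk_out().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed values for illustration; the real REF1P/REF1N are defined by the ice driver. */
enum { REF1P = 2, REF1N = 3 };

/* Hypothetical stand-in for ice_aq_set_phy_rec_clk_out(): records what it was asked to do. */
static int last_phy_output = -1;
static bool last_enable;

static int mock_set_phy_rec_clk_out(uint8_t phy_output, bool enable)
{
	last_phy_output = phy_output;
	last_enable = enable;
	return 0;
}

/*
 * Same shape as ice_set_rclk_out(): reject indices outside [REF1P, REF1N],
 * then convert the pin index into a zero-based PHY output number.
 */
static int set_rclk_out(uint32_t out_idx, bool ena)
{
	if (out_idx < REF1P || out_idx > REF1N)
		return -EINVAL;

	return mock_set_phy_rec_clk_out((uint8_t)(out_idx - REF1P), ena);
}

/* Usage: the lowest valid pin index maps to PHY output 0. */
void set_rclk_out_example(void)
{
	assert(set_rclk_out(REF1P, true) == 0);
	assert(last_phy_output == 0 && last_enable);
}
```

The caller works with the pin indices reported by the range query, while the firmware command sees a zero-based PHY output number, so the subtraction must happen only after the bounds check has passed.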
Add Documentation/networking/synce.rst describing new RTNL messages and respective NDO ops supporting SyncE (Synchronous Ethernet).
Signed-off-by: Maciej Machnikowski <maciej.machnikowski@intel.com>
---
 Documentation/networking/synce.rst | 88 ++++++++++++++++++++++++++++++
 1 file changed, 88 insertions(+)
 create mode 100644 Documentation/networking/synce.rst
diff --git a/Documentation/networking/synce.rst b/Documentation/networking/synce.rst
new file mode 100644
index 000000000000..986b9e62809f
--- /dev/null
+++ b/Documentation/networking/synce.rst
@@ -0,0 +1,88 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+====================
+Synchronous Ethernet
+====================
+
+Synchronous Ethernet networks use a physical layer clock to syntonize
+the frequency across different network elements.
+
+A basic SyncE node, as defined in ITU-T G.8264, consists of an Ethernet
+Equipment Clock (EEC) and can recover synchronization
+from the synchronization inputs - either traffic interfaces or external
+frequency sources.
+The EEC can synchronize its frequency (syntonize) to any of those sources.
+It is also able to select a synchronization source through priority tables
+and synchronization status messaging. It also provides necessary
+filtering and holdover capabilities.
+
+The following interface can be applicable to different packet network types
+following ITU-T G.8261/G.8262 recommendations.
+
+Interface
+=========
+
+The following RTNL messages are used to read/configure SyncE recovered
+clocks.
+
+RTM_GETRCLKRANGE
+-----------------
+Reads the allowed pin index range for the recovered clock outputs.
+This can be aligned to PHY outputs or to EEC inputs, whichever is
+better for a given application.
+Will call the ndo_get_rclk_range function to read the allowed range
+of output pin indexes.
+Will call ndo_get_rclk_range to determine the allowed recovered clock
+range and return it in the IFLA_RCLK_RANGE_MIN_PIN and the
+IFLA_RCLK_RANGE_MAX_PIN attributes.
+
+RTM_GETRCLKSTATE
+-----------------
+Reads the state of the recovered pins that output the recovered clock from
+a given port. The message will contain the number of assigned clocks
+(IFLA_RCLK_STATE_COUNT) and N pin indexes in IFLA_RCLK_STATE_OUT_IDX.
+To support multiple recovered clock outputs from the same port, this message
+will return the IFLA_RCLK_STATE_COUNT attribute containing the number of
+active recovered clock outputs (N) and N IFLA_RCLK_STATE_OUT_IDX attributes
+listing the active output indexes.
+This message will call the ndo_get_rclk_range to determine the allowed
+recovered clock indexes and then will loop through them, calling
+the ndo_get_rclk_state for each of them.
+
+RTM_SETRCLKSTATE
+-----------------
+Sets the redirection of the recovered clock for a given pin. This message
+expects one attribute:
+struct if_set_rclk_msg {
+	__u32 ifindex; /* interface index */
+	__u32 out_idx; /* output index (from a valid range) */
+	__u32 flags; /* configuration flags */
+};
+
+Supported flags are:
+SET_RCLK_FLAGS_ENA - if set in flags - the given output will be enabled,
+                     if clear - the output will be disabled.
+
+RTM_GETEECSTATE
+----------------
+Reads the state of the EEC or equivalent physical clock synchronizer.
+This message returns the following attributes:
+IFLA_EEC_STATE - current state of the EEC or equivalent clock generator.
+                 The states returned in this attribute are aligned to the
+                 ITU-T G.781 and are:
+                  IF_EEC_STATE_INVALID - state is not valid
+                  IF_EEC_STATE_FREERUN - clock is free-running
+                  IF_EEC_STATE_LOCKED - clock is locked to the reference,
+                                        but the holdover memory is not valid
+                  IF_EEC_STATE_LOCKED_HO_ACQ - clock is locked to the reference
+                                               and holdover memory is valid
+                  IF_EEC_STATE_HOLDOVER - clock is in holdover mode
+State is read from the netdev calling the:
+int (*ndo_get_eec_state)(struct net_device *dev, enum if_eec_state *state,
+                         u32 *src_idx, struct netlink_ext_ack *extack);
+
+IFLA_EEC_SRC_IDX - optional attribute returning the index of the reference that
+                   is used for the current IFLA_EEC_STATE, i.e., the index of
+                   the pin that the EEC is locked to.
+
+Will be returned only if the ndo_get_eec_src is implemented.
\ No newline at end of file
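The RTM_GETRCLKSTATE flow described in the document (query the range, then query every index in it) could look roughly like the sketch below. It is a userspace model under stated assumptions, not the rtnetlink code from this series: struct rclk_ops, collect_rclk_state() and the mock callbacks are hypothetical names, and the two function pointers only stand in for ndo_get_rclk_range and ndo_get_rclk_state.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the ndo_get_rclk_range / ndo_get_rclk_state callbacks. */
struct rclk_ops {
	int (*get_range)(uint32_t *min_idx, uint32_t *max_idx);
	int (*get_state)(uint32_t out_idx, bool *ena);
};

/*
 * Walk the allowed index range and store every enabled output index in out[].
 * Returns the number of active outputs (what IFLA_RCLK_STATE_COUNT would
 * carry) or a negative errno.
 */
static int collect_rclk_state(const struct rclk_ops *ops, uint32_t *out, size_t cap)
{
	uint32_t min_idx, max_idx, idx;
	int count = 0;
	int err;

	if (!ops->get_range || !ops->get_state)
		return -EOPNOTSUPP;

	err = ops->get_range(&min_idx, &max_idx);
	if (err)
		return err;

	for (idx = min_idx; idx <= max_idx; idx++) {
		bool ena = false;

		err = ops->get_state(idx, &ena);
		if (err)
			return err;
		if (ena) {
			if ((size_t)count >= cap)
				return -EMSGSIZE;
			out[count++] = idx;
		}
		if (idx == UINT32_MAX)	/* avoid wrapping the loop counter */
			break;
	}
	return count;
}

/* Mock device: indices 2..4 are valid, only index 3 is enabled. */
static int mock_range(uint32_t *min_idx, uint32_t *max_idx)
{
	*min_idx = 2;
	*max_idx = 4;
	return 0;
}

static int mock_state(uint32_t out_idx, bool *ena)
{
	*ena = (out_idx == 3);
	return 0;
}

/* Usage: one active output, index 3, is reported for the mock device. */
void rclk_state_example(void)
{
	const struct rclk_ops ops = { mock_range, mock_state };
	uint32_t out[4];

	assert(collect_rclk_state(&ops, out, 4) == 1);
	assert(out[0] == 3);
}
```

Returning the count together with the list mirrors the IFLA_RCLK_STATE_COUNT plus N IFLA_RCLK_STATE_OUT_IDX layout, so a reader of the message can size its buffer before parsing the indexes.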
On Thu, 4 Nov 2021 09:12:31 +0100 Maciej Machnikowski wrote:
+Synchronous Ethernet networks use a physical layer clock to syntonize
+the frequency across different network elements.

+A basic SyncE node, as defined in ITU-T G.8264, consists of an Ethernet
+Equipment Clock (EEC) and can recover synchronization
+from the synchronization inputs - either traffic interfaces or external
+frequency sources.
+The EEC can synchronize its frequency (syntonize) to any of those sources.
+It is also able to select a synchronization source through priority tables
+and synchronization status messaging. It also provides necessary
+filtering and holdover capabilities.

+The following interface can be applicable to different packet network types
+following ITU-T G.8261/G.8262 recommendations.
Can we get a diagram in here in terms of how the port feeds its recovered Rx freq into EEC and that feeds freq of Tx on other ports?
I'm still struggling to understand your reasoning around not making EEC its own object. "We can do this later" seems like trading relatively little effort now for extra work for driver and application developers for ever.
Also patch 3 still has a kdoc warning.
-----Original Message-----
From: Jakub Kicinski <kuba@kernel.org>
Sent: Thursday, November 4, 2021 7:09 PM
To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
Subject: Re: [PATCH net-next 6/6] docs: net: Add description of SyncE interfaces
On Thu, 4 Nov 2021 09:12:31 +0100 Maciej Machnikowski wrote:
+Synchronous Ethernet networks use a physical layer clock to syntonize
+the frequency across different network elements.

+A basic SyncE node, as defined in ITU-T G.8264, consists of an Ethernet
+Equipment Clock (EEC) and can recover synchronization
+from the synchronization inputs - either traffic interfaces or external
+frequency sources.
+The EEC can synchronize its frequency (syntonize) to any of those sources.
+It is also able to select a synchronization source through priority tables
+and synchronization status messaging. It also provides necessary
+filtering and holdover capabilities.

+The following interface can be applicable to different packet network types
+following ITU-T G.8261/G.8262 recommendations.
Can we get a diagram in here in terms of how the port feeds its recovered Rx freq into EEC and that feeds freq of Tx on other ports?
Will try - yet my ASCII art skills are not very well developed :)
I'm still struggling to understand your reasoning around not making EEC its own object. "We can do this later" seems like trading relatively little effort now for extra work for driver and application developers for ever.
That's not the case. We need EEC, and the other subsystem we wanted to make is the DPLL subsystem. While an EEC can be a DPLL, it doesn't have to be, and it's also the other way round: a DPLL can have numerous different usages. When we add the DPLL subsystem support, the future work will be as simple as routing the EEC state read function to the DPLL subsystem. But if someone decides to use a different HW implementation, they will still be able to implement their own version of the API to handle it without a bigger DPLL block.
Also patch 3 still has a kdoc warning.
Will fix.
On Fri, 5 Nov 2021 11:51:48 +0000 Machnikowski, Maciej wrote:
I'm still struggling to understand your reasoning around not making EEC its own object. "We can do this later" seems like trading relatively little effort now for extra work for driver and application developers for ever.
That's not the case. We need EEC and the other subsystem we wanted to make is the DPLL subsystem. While EEC can be a DPLL - it doesn't have to, and it's also the other way round - the DPLL can have numerous different usages.
We wanted to create a DPLL object to the extent that as a SW guy I don't understand the difference between that and an EEC. Whatever category of *PLL etc. objects EEC is, that's what we want to model.
When we add the DPLL subsystem support the future work will be as simple as routing the EEC state read function to the DPLL subsystem. But if someone decides to use a different HW implementation he will still be able to implement his own version of API to handle it without a bigger DPLL block
All we want is something that's not a port to hang whatever attributes exist in RTM_GETEECSTATE.
-----Original Message-----
From: Jakub Kicinski <kuba@kernel.org>
Sent: Friday, November 5, 2021 10:30 PM
To: Machnikowski, Maciej <maciej.machnikowski@intel.com>
Subject: Re: [PATCH net-next 6/6] docs: net: Add description of SyncE interfaces
On Fri, 5 Nov 2021 11:51:48 +0000 Machnikowski, Maciej wrote:
I'm still struggling to understand your reasoning around not making EEC its own object. "We can do this later" seems like trading relatively little effort now for extra work for driver and application developers for ever.
That's not the case. We need EEC and the other subsystem we wanted to make is the DPLL subsystem. While EEC can be a DPLL - it doesn't have to, and it's also the other way round - the DPLL can have numerous different usages.
We wanted to create a DPLL object to the extent that as a SW guy I don't understand the difference between that and an EEC. Whatever category of *PLL etc. objects EEC is, that's what we want to model.
The DPLL has more uses than just EEC. For example, a timing card uses one to generate different frequencies synchronized to the 1PPS signal coming from the GNSS receiver.
Implementing the whole DPLL subsystem may be overkill for some basic solutions that are embedded inside the PHY and only handle the syntonization of the TX frequency to the RX one. In this case all they would report is the current state.
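To make the "basic solution" case concrete, here is a rough sketch of a PHY-embedded synchronizer that can only report its current state. The state names come from the proposed documentation, but their numeric values, the PHY status bits and both helper functions are assumptions made for this illustration only.

```c
#include <assert.h>
#include <string.h>

/* Names follow the proposed IF_EEC_STATE_* list; the numeric values are assumed. */
enum if_eec_state {
	IF_EEC_STATE_INVALID,
	IF_EEC_STATE_FREERUN,
	IF_EEC_STATE_LOCKED,
	IF_EEC_STATE_LOCKED_HO_ACQ,
	IF_EEC_STATE_HOLDOVER,
};

/* Hypothetical status bits of a PHY that syntonizes TX to RX with no separate DPLL block. */
#define PHY_TX_LOCKED_TO_RX 0x1u
#define PHY_STATUS_VALID    0x2u

/*
 * Such a device has no holdover memory, so it can only ever report
 * INVALID, FREERUN or LOCKED (locked, holdover memory not valid).
 */
static enum if_eec_state phy_eec_state(unsigned int phy_status)
{
	if (!(phy_status & PHY_STATUS_VALID))
		return IF_EEC_STATE_INVALID;
	if (phy_status & PHY_TX_LOCKED_TO_RX)
		return IF_EEC_STATE_LOCKED;
	return IF_EEC_STATE_FREERUN;
}

/* Human-readable name for a state, e.g. for a userspace status dump. */
static const char *eec_state_name(enum if_eec_state state)
{
	switch (state) {
	case IF_EEC_STATE_FREERUN:	return "freerun";
	case IF_EEC_STATE_LOCKED:	return "locked";
	case IF_EEC_STATE_LOCKED_HO_ACQ: return "locked-ho-acq";
	case IF_EEC_STATE_HOLDOVER:	return "holdover";
	case IF_EEC_STATE_INVALID:
	default:			return "invalid";
	}
}

/* Usage: a valid, locked PHY reports the LOCKED state. */
void phy_eec_example(void)
{
	enum if_eec_state s = phy_eec_state(PHY_STATUS_VALID | PHY_TX_LOCKED_TO_RX);

	assert(strcmp(eec_state_name(s), "locked") == 0);
}
```

Nothing in this sketch needs a source index or a pin model, which is the point being made: the state attribute alone is enough for the simplest hardware.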
When we add the DPLL subsystem support the future work will be as simple as routing the EEC state read function to the DPLL subsystem. But if someone decides to use a different HW implementation he will still be able to implement his own version of API to handle it without a bigger DPLL block
All we want is something that's not a port to hang whatever attributes exist in RTM_GETEECSTATE.
Routing to the DPLL object will be a specific use-case required only if we support advanced cases with external sources of frequency (like an atomic clock).