From: Edward Cree ecree.xilinx@gmail.com
The original semantics of ntuple filters with FLOW_RSS were not fully understood by all drivers, some ignoring the ring_cookie from the flow rule. Require this support to be explicitly declared by the driver for filters relying on it to be inserted, and add self- test coverage for this functionality. Also teach ethtool_check_max_channel() about this.
Edward Cree (5): net: ethtool: only allow set_rxnfc with rss + ring_cookie if driver opts in net: ethtool: account for RSS+RXNFC add semantics when checking channel count selftest: include dst-ip in ethtool ntuple rules selftest: validate RSS+ntuple filters with nonzero ring_cookie selftest: extend test_rss_context_queue_reconfigure for action addition
drivers/net/ethernet/sfc/ef100_ethtool.c | 1 + drivers/net/ethernet/sfc/ethtool.c | 1 + include/linux/ethtool.h | 4 + net/ethtool/common.c | 42 +++++++--- net/ethtool/ioctl.c | 5 ++ .../selftests/drivers/net/hw/rss_ctx.py | 79 +++++++++++++++++-- 6 files changed, 113 insertions(+), 19 deletions(-)
From: Edward Cree ecree.xilinx@gmail.com
Ethtool ntuple filters with FLOW_RSS were originally defined as adding the base queue ID (ring_cookie) to the value from the indirection table, so that the same table could distribute over more than one set of queues when used by different filters. However, some drivers / hardware ignore the ring_cookie, and simply use the indirection table entries as queue IDs directly. Thus, for drivers which have not opted in by setting ethtool_ops.cap_rss_rxnfc_adds to declare that they support the original (addition) semantics, reject in ethtool_set_rxnfc any filter which combines FLOW_RSS and a nonzero ring. (For a ring_cookie of zero, both behaviours are equivalent.) Set the cap bit in sfc, as it is known to support this feature.
Signed-off-by: Edward Cree ecree.xilinx@gmail.com --- drivers/net/ethernet/sfc/ef100_ethtool.c | 1 + drivers/net/ethernet/sfc/ethtool.c | 1 + include/linux/ethtool.h | 4 ++++ net/ethtool/ioctl.c | 5 +++++ 4 files changed, 11 insertions(+)
diff --git a/drivers/net/ethernet/sfc/ef100_ethtool.c b/drivers/net/ethernet/sfc/ef100_ethtool.c index 5c2551369812..6c3b74000d3b 100644 --- a/drivers/net/ethernet/sfc/ef100_ethtool.c +++ b/drivers/net/ethernet/sfc/ef100_ethtool.c @@ -59,6 +59,7 @@ const struct ethtool_ops ef100_ethtool_ops = { .get_rxfh_indir_size = efx_ethtool_get_rxfh_indir_size, .get_rxfh_key_size = efx_ethtool_get_rxfh_key_size, .rxfh_per_ctx_key = true, + .cap_rss_rxnfc_adds = true, .rxfh_priv_size = sizeof(struct efx_rss_context_priv), .get_rxfh = efx_ethtool_get_rxfh, .set_rxfh = efx_ethtool_set_rxfh, diff --git a/drivers/net/ethernet/sfc/ethtool.c b/drivers/net/ethernet/sfc/ethtool.c index bb1930818beb..83d715544f7f 100644 --- a/drivers/net/ethernet/sfc/ethtool.c +++ b/drivers/net/ethernet/sfc/ethtool.c @@ -263,6 +263,7 @@ const struct ethtool_ops efx_ethtool_ops = { .get_rxfh_indir_size = efx_ethtool_get_rxfh_indir_size, .get_rxfh_key_size = efx_ethtool_get_rxfh_key_size, .rxfh_per_ctx_key = true, + .cap_rss_rxnfc_adds = true, .rxfh_priv_size = sizeof(struct efx_rss_context_priv), .get_rxfh = efx_ethtool_get_rxfh, .set_rxfh = efx_ethtool_set_rxfh, diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h index 1199e308c8dd..299280c94d07 100644 --- a/include/linux/ethtool.h +++ b/include/linux/ethtool.h @@ -734,6 +734,9 @@ struct kernel_ethtool_ts_info { * @rxfh_per_ctx_key: device supports setting different RSS key for each * additional context. Netlink API should report hfunc, key, and input_xfrm * for every context, not just context 0. + * @cap_rss_rxnfc_adds: device supports nonzero ring_cookie in filters with + * %FLOW_RSS flag; the queue ID from the filter is added to the value from + * the indirection table to determine the delivery queue. * @rxfh_indir_space: max size of RSS indirection tables, if indirection table * size as returned by @get_rxfh_indir_size may change during lifetime * of the device. Leave as 0 if the table size is constant. @@ -956,6 +959,7 @@ struct ethtool_ops { u32 cap_rss_ctx_supported:1; u32 cap_rss_sym_xor_supported:1; u32 rxfh_per_ctx_key:1; + u32 cap_rss_rxnfc_adds:1; u32 rxfh_indir_space; u16 rxfh_key_space; u16 rxfh_priv_size; diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c index 7da94e26ced6..d86399bcf223 100644 --- a/net/ethtool/ioctl.c +++ b/net/ethtool/ioctl.c @@ -992,6 +992,11 @@ static noinline_for_stack int ethtool_set_rxnfc(struct net_device *dev, if (rc) return rc;
+ /* Nonzero ring with RSS only makes sense if NIC adds them together */ + if (info.flow_type & FLOW_RSS && !ops->cap_rss_rxnfc_adds && + ethtool_get_flow_spec_ring(info.fs.ring_cookie)) + return -EINVAL; + if (ops->get_rxfh) { struct ethtool_rxfh_param rxfh = {};
On Wed, Nov 13, 2024 at 12:13:09PM +0000, edward.cree@amd.com wrote:
From: Edward Cree ecree.xilinx@gmail.com
Ethtool ntuple filters with FLOW_RSS were originally defined as adding the base queue ID (ring_cookie) to the value from the indirection table, so that the same table could distribute over more than one set of queues when used by different filters. However, some drivers / hardware ignore the ring_cookie, and simply use the indirection table entries as queue IDs directly. Thus, for drivers which have not opted in by setting ethtool_ops.cap_rss_rxnfc_adds to declare that they support the original (addition) semantics, reject in ethtool_set_rxnfc any filter which combines FLOW_RSS and a nonzero ring. (For a ring_cookie of zero, both behaviours are equivalent.) Set the cap bit in sfc, as it is known to support this feature.
Signed-off-by: Edward Cree ecree.xilinx@gmail.com
Reviewed-by: Martin Habets habetsm.xilinx@gmail.com
drivers/net/ethernet/sfc/ef100_ethtool.c | 1 + drivers/net/ethernet/sfc/ethtool.c | 1 + include/linux/ethtool.h | 4 ++++ net/ethtool/ioctl.c | 5 +++++ 4 files changed, 11 insertions(+)
diff --git a/drivers/net/ethernet/sfc/ef100_ethtool.c b/drivers/net/ethernet/sfc/ef100_ethtool.c index 5c2551369812..6c3b74000d3b 100644 --- a/drivers/net/ethernet/sfc/ef100_ethtool.c +++ b/drivers/net/ethernet/sfc/ef100_ethtool.c @@ -59,6 +59,7 @@ const struct ethtool_ops ef100_ethtool_ops = { .get_rxfh_indir_size = efx_ethtool_get_rxfh_indir_size, .get_rxfh_key_size = efx_ethtool_get_rxfh_key_size, .rxfh_per_ctx_key = true,
- .cap_rss_rxnfc_adds = true, .rxfh_priv_size = sizeof(struct efx_rss_context_priv), .get_rxfh = efx_ethtool_get_rxfh, .set_rxfh = efx_ethtool_set_rxfh,
diff --git a/drivers/net/ethernet/sfc/ethtool.c b/drivers/net/ethernet/sfc/ethtool.c index bb1930818beb..83d715544f7f 100644 --- a/drivers/net/ethernet/sfc/ethtool.c +++ b/drivers/net/ethernet/sfc/ethtool.c @@ -263,6 +263,7 @@ const struct ethtool_ops efx_ethtool_ops = { .get_rxfh_indir_size = efx_ethtool_get_rxfh_indir_size, .get_rxfh_key_size = efx_ethtool_get_rxfh_key_size, .rxfh_per_ctx_key = true,
- .cap_rss_rxnfc_adds = true, .rxfh_priv_size = sizeof(struct efx_rss_context_priv), .get_rxfh = efx_ethtool_get_rxfh, .set_rxfh = efx_ethtool_set_rxfh,
diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h index 1199e308c8dd..299280c94d07 100644 --- a/include/linux/ethtool.h +++ b/include/linux/ethtool.h @@ -734,6 +734,9 @@ struct kernel_ethtool_ts_info {
- @rxfh_per_ctx_key: device supports setting different RSS key for each
- additional context. Netlink API should report hfunc, key, and input_xfrm
- for every context, not just context 0.
- @cap_rss_rxnfc_adds: device supports nonzero ring_cookie in filters with
- %FLOW_RSS flag; the queue ID from the filter is added to the value from
- the indirection table to determine the delivery queue.
- @rxfh_indir_space: max size of RSS indirection tables, if indirection table
- size as returned by @get_rxfh_indir_size may change during lifetime
- of the device. Leave as 0 if the table size is constant.
@@ -956,6 +959,7 @@ struct ethtool_ops { u32 cap_rss_ctx_supported:1; u32 cap_rss_sym_xor_supported:1; u32 rxfh_per_ctx_key:1;
- u32 cap_rss_rxnfc_adds:1; u32 rxfh_indir_space; u16 rxfh_key_space; u16 rxfh_priv_size;
diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c index 7da94e26ced6..d86399bcf223 100644 --- a/net/ethtool/ioctl.c +++ b/net/ethtool/ioctl.c @@ -992,6 +992,11 @@ static noinline_for_stack int ethtool_set_rxnfc(struct net_device *dev, if (rc) return rc;
- /* Nonzero ring with RSS only makes sense if NIC adds them together */
- if (info.flow_type & FLOW_RSS && !ops->cap_rss_rxnfc_adds &&
ethtool_get_flow_spec_ring(info.fs.ring_cookie))
return -EINVAL;
- if (ops->get_rxfh) { struct ethtool_rxfh_param rxfh = {};
Hi Edward,
On 13/11/2024 14:13, edward.cree@amd.com wrote:
From: Edward Cree ecree.xilinx@gmail.com
Ethtool ntuple filters with FLOW_RSS were originally defined as adding the base queue ID (ring_cookie) to the value from the indirection table, so that the same table could distribute over more than one set of queues when used by different filters.
TBH, I'm not sure I understand the difference? Perhaps you can share an example?
However, some drivers / hardware ignore the ring_cookie, and simply use the indirection table entries as queue IDs directly. Thus, for drivers which have not opted in by setting ethtool_ops.cap_rss_rxnfc_adds to declare that they support the original (addition) semantics, reject in ethtool_set_rxnfc any filter which combines FLOW_RSS and a nonzero ring. (For a ring_cookie of zero, both behaviours are equivalent.) Set the cap bit in sfc, as it is known to support this feature.
Signed-off-by: Edward Cree ecree.xilinx@gmail.com
diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c index 7da94e26ced6..d86399bcf223 100644 --- a/net/ethtool/ioctl.c +++ b/net/ethtool/ioctl.c @@ -992,6 +992,11 @@ static noinline_for_stack int ethtool_set_rxnfc(struct net_device *dev, if (rc) return rc;
- /* Nonzero ring with RSS only makes sense if NIC adds them together */
- if (info.flow_type & FLOW_RSS && !ops->cap_rss_rxnfc_adds &&
ethtool_get_flow_spec_ring(info.fs.ring_cookie))
return -EINVAL;
I believe this check shouldn't happen when we do ETHTOOL_SRXCLSRLDEL as flow_type is garbage, WDYT?
- if (ops->get_rxfh) { struct ethtool_rxfh_param rxfh = {};
On 25/11/2024 07:11, Gal Pressman wrote:
On 13/11/2024 14:13, edward.cree@amd.com wrote:
Ethtool ntuple filters with FLOW_RSS were originally defined as adding the base queue ID (ring_cookie) to the value from the indirection table, so that the same table could distribute over more than one set of queues when used by different filters.
TBH, I'm not sure I understand the difference? Perhaps you can share an example?
Something like this:
ethtool -X $intf context new equal 2 # creates context ID 1, table filled with 0s and 1s ethtool -N $intf <match fields...> context 1 # filter distributes traffic to queues 0 and 1 ethtool -N $intf <match fields...> context 1 action 2 # filter distributes traffic to queues 2 and 3
See the selftest in patch 4 for a concrete example of this. Some NICs were apparently sending the traffic from both filters to queues 0 and 1, and ignoring the 'action 2' on the second filter.
@@ -992,6 +992,11 @@ static noinline_for_stack int ethtool_set_rxnfc(struct net_device *dev, if (rc) return rc;
- /* Nonzero ring with RSS only makes sense if NIC adds them together */
- if (info.flow_type & FLOW_RSS && !ops->cap_rss_rxnfc_adds &&
ethtool_get_flow_spec_ring(info.fs.ring_cookie))
return -EINVAL;
I believe this check shouldn't happen when we do ETHTOOL_SRXCLSRLDEL as flow_type is garbage, WDYT?
Agreed; this check should only apply to ETHTOOL_SRXCLSRLINS. Do you want to send the fix or shall I?
Also, the check below it, dealing with sym-xor, looks like it's only relevant to ETHTOOL_SRXFH, since info.data is garbage for other commands. Ahmed, is my understanding correct there?
On 25/11/2024 15:21, Edward Cree wrote:
On 25/11/2024 07:11, Gal Pressman wrote:
On 13/11/2024 14:13, edward.cree@amd.com wrote:
Ethtool ntuple filters with FLOW_RSS were originally defined as adding the base queue ID (ring_cookie) to the value from the indirection table, so that the same table could distribute over more than one set of queues when used by different filters.
TBH, I'm not sure I understand the difference? Perhaps you can share an example?
Something like this:
ethtool -X $intf context new equal 2 # creates context ID 1, table filled with 0s and 1s ethtool -N $intf <match fields...> context 1 # filter distributes traffic to queues 0 and 1 ethtool -N $intf <match fields...> context 1 action 2 # filter distributes traffic to queues 2 and 3
See the selftest in patch 4 for a concrete example of this. Some NICs were apparently sending the traffic from both filters to queues 0 and 1, and ignoring the 'action 2' on the second filter.
Thanks, I did not know it works that way, is it actually documented anywhere?
@@ -992,6 +992,11 @@ static noinline_for_stack int ethtool_set_rxnfc(struct net_device *dev, if (rc) return rc;
- /* Nonzero ring with RSS only makes sense if NIC adds them together */
- if (info.flow_type & FLOW_RSS && !ops->cap_rss_rxnfc_adds &&
ethtool_get_flow_spec_ring(info.fs.ring_cookie))
return -EINVAL;
I believe this check shouldn't happen when we do ETHTOOL_SRXCLSRLDEL as flow_type is garbage, WDYT?
Agreed; this check should only apply to ETHTOOL_SRXCLSRLINS. Do you want to send the fix or shall I?
I will do it.
Also, the check below it, dealing with sym-xor, looks like it's only relevant to ETHTOOL_SRXFH, since info.data is garbage for other commands. Ahmed, is my understanding correct there?
Speaking of the below check, the sanity check depends on the order of operations, for example: 1. Enable symmetric xor 2. Request hash on src only = Error as expected, however:
1. Request hash on src only 2. Enable symmetric xor = Success :(.
I've been thinking of improving the situation, but that requires iterating over all flow types on symmetric xor enablement and that feels quite bad..
On 2024-11-25 7:10 a.m., Gal Pressman wrote:
On 25/11/2024 15:21, Edward Cree wrote:
On 25/11/2024 07:11, Gal Pressman wrote:
On 13/11/2024 14:13, edward.cree@amd.com wrote:
Ethtool ntuple filters with FLOW_RSS were originally defined as adding the base queue ID (ring_cookie) to the value from the indirection table, so that the same table could distribute over more than one set of queues when used by different filters.
TBH, I'm not sure I understand the difference? Perhaps you can share an example?
Something like this:
ethtool -X $intf context new equal 2 # creates context ID 1, table filled with 0s and 1s ethtool -N $intf <match fields...> context 1 # filter distributes traffic to queues 0 and 1 ethtool -N $intf <match fields...> context 1 action 2 # filter distributes traffic to queues 2 and 3
See the selftest in patch 4 for a concrete example of this. Some NICs were apparently sending the traffic from both filters to queues 0 and 1, and ignoring the 'action 2' on the second filter.
Thanks, I did not know it works that way, is it actually documented anywhere?
@@ -992,6 +992,11 @@ static noinline_for_stack int ethtool_set_rxnfc(struct net_device *dev, if (rc) return rc;
- /* Nonzero ring with RSS only makes sense if NIC adds them together */
- if (info.flow_type & FLOW_RSS && !ops->cap_rss_rxnfc_adds &&
ethtool_get_flow_spec_ring(info.fs.ring_cookie))
return -EINVAL;
I believe this check shouldn't happen when we do ETHTOOL_SRXCLSRLDEL as flow_type is garbage, WDYT?
Agreed; this check should only apply to ETHTOOL_SRXCLSRLINS. Do you want to send the fix or shall I?
I will do it.
Also, the check below it, dealing with sym-xor, looks like it's only relevant to ETHTOOL_SRXFH, since info.data is garbage for other commands. Ahmed, is my understanding correct there?
Speaking of the below check, the sanity check depends on the order of operations, for example:
- Enable symmetric xor
- Request hash on src only
= Error as expected, however:
Correct. The check below is to make sure that no ntuple that does not cover symmetric fields is added if symm-xor is enabled.
- Request hash on src only
- Enable symmetric xor
= Success :(.
I've been thinking of improving the situation, but that requires iterating over all flow types on symmetric xor enablement and that feels quite bad..
and delete/disable filters? may be just a warning to the user that some filters are not symmetric.
On 25/11/2024 16:20, Ahmed Zaki wrote:
On 2024-11-25 7:10 a.m., Gal Pressman wrote:
On 25/11/2024 15:21, Edward Cree wrote:
On 25/11/2024 07:11, Gal Pressman wrote:
On 13/11/2024 14:13, edward.cree@amd.com wrote:
Ethtool ntuple filters with FLOW_RSS were originally defined as adding the base queue ID (ring_cookie) to the value from the indirection table, so that the same table could distribute over more than one set of queues when used by different filters.
TBH, I'm not sure I understand the difference? Perhaps you can share an example?
Something like this:
ethtool -X $intf context new equal 2 # creates context ID 1, table filled with 0s and 1s ethtool -N $intf <match fields...> context 1 # filter distributes traffic to queues 0 and 1 ethtool -N $intf <match fields...> context 1 action 2 # filter distributes traffic to queues 2 and 3
See the selftest in patch 4 for a concrete example of this. Some NICs were apparently sending the traffic from both filters to queues 0 and 1, and ignoring the 'action 2' on the second filter.
Thanks, I did not know it works that way, is it actually documented anywhere?
@@ -992,6 +992,11 @@ static noinline_for_stack int ethtool_set_rxnfc(struct net_device *dev, if (rc) return rc; + /* Nonzero ring with RSS only makes sense if NIC adds them together */ + if (info.flow_type & FLOW_RSS && !ops->cap_rss_rxnfc_adds && + ethtool_get_flow_spec_ring(info.fs.ring_cookie)) + return -EINVAL;
I believe this check shouldn't happen when we do ETHTOOL_SRXCLSRLDEL as flow_type is garbage, WDYT?
Agreed; this check should only apply to ETHTOOL_SRXCLSRLINS. Do you want to send the fix or shall I?
I will do it.
Also, the check below it, dealing with sym-xor, looks like it's only relevant to ETHTOOL_SRXFH, since info.data is garbage for other commands. Ahmed, is my understanding correct there?
Speaking of the below check, the sanity check depends on the order of operations, for example:
- Enable symmetric xor
- Request hash on src only
= Error as expected, however:
Correct. The check below is to make sure that no ntuple that does not cover symmetric fields is added if symm-xor is enabled.
- Request hash on src only
- Enable symmetric xor
= Success :(.
I've been thinking of improving the situation, but that requires iterating over all flow types on symmetric xor enablement and that feels quite bad..
and delete/disable filters? may be just a warning to the user that some filters are not symmetric.
I think the right thing to do in that case is fail the symmetric xor enablement, but do we really want to query the driver for all flow types and check if an asymmetric filter exists?
On 25/11/2024 14:20, Ahmed Zaki wrote:
On 2024-11-25 7:10 a.m., Gal Pressman wrote:
On 25/11/2024 15:21, Edward Cree wrote:
Also, the check below it, dealing with sym-xor, looks like it's only relevant to ETHTOOL_SRXFH, since info.data is garbage for other commands. Ahmed, is my understanding correct there?
Speaking of the below check, the sanity check depends on the order of operations, for example:
- Enable symmetric xor
- Request hash on src only
= Error as expected, however:
Correct. The check below is to make sure that no ntuple that does not cover symmetric fields is added if symm-xor is enabled.
But symm-xor is about hashing, and is only relevant to traffic being directed by RSS. The user should still be allowed to, and the NIC should be able to handle, setting an ntuple filter (SRXCLSRLINS) that is asymmetric, to override the symmetric hashing for selected traffic. symm-xor should only constrain RXFH settings. And in fact even if you wanted to block asymm ntuple filters, the current code does not do that, since the info.data fields it looks at aren't populated for ntuple filters (whose filter fields are defined by info.fs). So the xfrm check should be under `if (info.cmd == ETHTOOL_SRXFH)`.
On 2024-11-25 7:42 a.m., Edward Cree wrote:
On 25/11/2024 14:20, Ahmed Zaki wrote:
On 2024-11-25 7:10 a.m., Gal Pressman wrote:
On 25/11/2024 15:21, Edward Cree wrote:
Also, the check below it, dealing with sym-xor, looks like it's only relevant to ETHTOOL_SRXFH, since info.data is garbage for other commands. Ahmed, is my understanding correct there?
Speaking of the below check, the sanity check depends on the order of operations, for example:
- Enable symmetric xor
- Request hash on src only
= Error as expected, however:
Correct. The check below is to make sure that no ntuple that does not cover symmetric fields is added if symm-xor is enabled.
But symm-xor is about hashing, and is only relevant to traffic being directed by RSS. The user should still be allowed to, and the NIC should be able to handle, setting an ntuple filter (SRXCLSRLINS) that is asymmetric, to override the symmetric hashing for selected traffic.
I agree, and in its first version, the sym-xor series was setting sym-xor per ntuple, not per netdev. So the NIC can support different RSS functions for different filters. Unfortunately, this was then changed to be per-netdev during review. At that point, these checks were added in nxfc path.
symm-xor should only constrain RXFH settings. And in fact even if you wanted to block asymm ntuple filters, the current code does not do that, since the info.data fields it looks at aren't populated for ntuple filters (whose filter fields are defined by info.fs). So the xfrm check should be under `if (info.cmd == ETHTOOL_SRXFH)`.
If it is not, then it is a bug. I will try to test later this week and send a fix if needed.
On 2024-11-25 12:01 p.m., Ahmed Zaki wrote:
On 2024-11-25 7:42 a.m., Edward Cree wrote:
On 25/11/2024 14:20, Ahmed Zaki wrote:
On 2024-11-25 7:10 a.m., Gal Pressman wrote:
On 25/11/2024 15:21, Edward Cree wrote:
Also, the check below it, dealing with sym-xor, looks like it's only relevant to ETHTOOL_SRXFH, since info.data is garbage for other commands. Ahmed, is my understanding correct there?
Speaking of the below check, the sanity check depends on the order of operations, for example:
- Enable symmetric xor
- Request hash on src only
= Error as expected, however:
Correct. The check below is to make sure that no ntuple that does not cover symmetric fields is added if symm-xor is enabled.
But symm-xor is about hashing, and is only relevant to traffic being directed by RSS. The user should still be allowed to, and the NIC should be able to handle, setting an ntuple filter (SRXCLSRLINS) that is asymmetric, to override the symmetric hashing for selected traffic.
I agree, and in its first version, the sym-xor series was setting sym- xor per ntuple, not per netdev. So the NIC can support different RSS functions for different filters. Unfortunately, this was then changed to be per-netdev during review. At that point, these checks were added in nxfc path.
symm-xor should only constrain RXFH settings. And in fact even if you wanted to block asymm ntuple filters, the current code does not do that, since the info.data fields it looks at aren't populated for ntuple filters (whose filter fields are defined by info.fs). So the xfrm check should be under `if (info.cmd == ETHTOOL_SRXFH)`.
If it is not, then it is a bug. I will try to test later this week and send a fix if needed.
Sorry, I misunderstood your original question. The check was actually intended for "rx-flow-hash":
# ethtool -X eth0 xfrm symmetric-xor # ethtool -N eth0 rx-flow-hash udp4 sdf Cannot change RX network flow hashing options: Invalid argument
and the NICs that support symm-xor do not use RSS on ntuple filters. So you are right, we should:
--- a/net/ethtool/ioctl.c +++ b/net/ethtool/ioctl.c @@ -997,7 +997,7 @@ static noinline_for_stack int ethtool_set_rxnfc(struct net_device *dev, ethtool_get_flow_spec_ring(info.fs.ring_cookie)) return -EINVAL;
- if (ops->get_rxfh) { + if (info.cmd == ETHTOOL_SRXFH && ops->get_rxfh) { struct ethtool_rxfh_param rxfh = {};
rc = ops->get_rxfh(dev, &rxfh);
Let me know if you want me to send this.
Thanks.
On 25/11/2024 14:10, Gal Pressman wrote:
Thanks, I did not know it works that way, is it actually documented anywhere?
Yes; the struct ethtool_rxnfc kerneldoc has: * For %ETHTOOL_SRXCLSRLINS, @fs specifies the rule to add or update. * @fs.@location either specifies the location to use or is a special * location value with %RX_CLS_LOC_SPECIAL flag set. On return, * @fs.@location is the actual rule location. If @fs.@flow_type * includes the %FLOW_RSS flag, @rss_context is the RSS context ID to * use for flow spreading traffic which matches this rule. The value * from the rxfh indirection table will be added to @fs.@ring_cookie * to choose which ring to deliver to.
I am also preparing a patch to add this info to the ethtool man page.
From: Edward Cree ecree.xilinx@gmail.com
In ethtool_check_max_channel(), the new RX count must not only cover the max queue indices in RSS indirection tables and RXNFC destinations separately, but must also, for RXNFC rules with FLOW_RSS, cover the sum of the destination queue and the maximum index in the associated RSS context's indirection table, since that is the highest queue that the rule can actually deliver traffic to. It could be argued that the max queue across all custom RSS contexts (ethtool_get_max_rss_ctx_channel()) need no longer be considered, since any context to which packets can actually be delivered will be targeted by some RXNFC rule and its max will thus be allowed for by ethtool_get_max_rxnfc_channel(). For simplicity we keep both checks, so even RSS contexts unused by any RXNFC rule must fit the channel count.
Signed-off-by: Edward Cree ecree.xilinx@gmail.com --- net/ethtool/common.c | 42 ++++++++++++++++++++++++++++++------------ 1 file changed, 30 insertions(+), 12 deletions(-)
diff --git a/net/ethtool/common.c b/net/ethtool/common.c index 0d62363dbd9d..05ce4f8080b3 100644 --- a/net/ethtool/common.c +++ b/net/ethtool/common.c @@ -538,6 +538,20 @@ static int ethtool_get_rxnfc_rule_count(struct net_device *dev) return info.rule_cnt; }
+/* Max offset for one RSS context */ +static u32 ethtool_get_rss_ctx_max_channel(struct ethtool_rxfh_context *ctx) +{ + u32 max_ring = 0; + u32 i, *tbl; + + if (WARN_ON_ONCE(!ctx)) + return 0; + tbl = ethtool_rxfh_context_indir(ctx); + for (i = 0; i < ctx->indir_size; i++) + max_ring = max(max_ring, tbl[i]); + return max_ring; +} + static int ethtool_get_max_rxnfc_channel(struct net_device *dev, u64 *max) { const struct ethtool_ops *ops = dev->ethtool_ops; @@ -574,10 +588,18 @@ static int ethtool_get_max_rxnfc_channel(struct net_device *dev, u64 *max)
if (rule_info.fs.ring_cookie != RX_CLS_FLOW_DISC && rule_info.fs.ring_cookie != RX_CLS_FLOW_WAKE && - !(rule_info.flow_type & FLOW_RSS) && - !ethtool_get_flow_spec_ring_vf(rule_info.fs.ring_cookie)) - max_ring = - max_t(u64, max_ring, rule_info.fs.ring_cookie); + !ethtool_get_flow_spec_ring_vf(rule_info.fs.ring_cookie)) { + u64 ring = rule_info.fs.ring_cookie; + + if (rule_info.flow_type & FLOW_RSS) { + struct ethtool_rxfh_context *ctx; + + ctx = xa_load(&dev->ethtool->rss_ctx, + rule_info.rss_context); + ring += ethtool_get_rss_ctx_max_channel(ctx); + } + max_ring = max_t(u64, max_ring, ring); + } }
kvfree(info); @@ -589,6 +611,7 @@ static int ethtool_get_max_rxnfc_channel(struct net_device *dev, u64 *max) return err; }
+/* Max offset across all of a device's RSS contexts */ static u32 ethtool_get_max_rss_ctx_channel(struct net_device *dev) { struct ethtool_rxfh_context *ctx; @@ -596,13 +619,8 @@ static u32 ethtool_get_max_rss_ctx_channel(struct net_device *dev) u32 max_ring = 0;
mutex_lock(&dev->ethtool->rss_lock); - xa_for_each(&dev->ethtool->rss_ctx, context, ctx) { - u32 i, *tbl; - - tbl = ethtool_rxfh_context_indir(ctx); - for (i = 0; i < ctx->indir_size; i++) - max_ring = max(max_ring, tbl[i]); - } + xa_for_each(&dev->ethtool->rss_ctx, context, ctx) + max_ring = max(max_ring, ethtool_get_rss_ctx_max_channel(ctx)); mutex_unlock(&dev->ethtool->rss_lock);
return max_ring; @@ -611,7 +629,7 @@ static u32 ethtool_get_max_rss_ctx_channel(struct net_device *dev) static u32 ethtool_get_max_rxfh_channel(struct net_device *dev) { struct ethtool_rxfh_param rxfh = {}; - u32 dev_size, current_max; + u32 dev_size, current_max = 0; int ret;
/* While we do track whether RSS context has an indirection
From: Edward Cree ecree.xilinx@gmail.com
sfc hardware does not support filters with only ipproto + dst-port; adding dst-ip to the flow spec allows the rss_ctx test to be run on these devices.
Signed-off-by: Edward Cree ecree.xilinx@gmail.com --- I'm not sure if this change will break the test for other drivers that perhaps support the old filter but not the new one. If so we might need to add an option to cfg to control this choice.
tools/testing/selftests/drivers/net/hw/rss_ctx.py | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/tools/testing/selftests/drivers/net/hw/rss_ctx.py b/tools/testing/selftests/drivers/net/hw/rss_ctx.py index 29995586993c..fb61dae20fd8 100755 --- a/tools/testing/selftests/drivers/net/hw/rss_ctx.py +++ b/tools/testing/selftests/drivers/net/hw/rss_ctx.py @@ -215,7 +215,7 @@ def test_rss_queue_reconfigure(cfg, main_ctx=True): defer(ethtool, f"-X {cfg.ifname} default") else: other_key = 'noise' - flow = f"flow-type tcp{cfg.addr_ipver} dst-port {port} context {ctx_id}" + flow = f"flow-type tcp{cfg.addr_ipver} dst-ip {cfg.addr} dst-port {port} context {ctx_id}" ntuple = ethtool_create(cfg, "-N", flow) defer(ethtool, f"-N {cfg.ifname} delete {ntuple}")
@@ -429,7 +429,7 @@ def test_rss_context(cfg, ctx_cnt=1, create_with_cfg=None): ksft_eq(max(data['rss-indirection-table']), 2 + i * 2 + 1, "Unexpected context cfg: " + str(data))
ports.append(rand_port()) - flow = f"flow-type tcp{cfg.addr_ipver} dst-port {ports[i]} context {ctx_id}" + flow = f"flow-type tcp{cfg.addr_ipver} dst-ip {cfg.addr} dst-port {ports[i]} context {ctx_id}" ntuple = ethtool_create(cfg, "-N", flow) defer(ethtool, f"-N {cfg.ifname} delete {ntuple}")
@@ -516,7 +516,7 @@ def test_rss_context_out_of_order(cfg, ctx_cnt=4): ctx.append(defer(ethtool, f"-X {cfg.ifname} context {ctx_id} delete"))
ports.append(rand_port()) - flow = f"flow-type tcp{cfg.addr_ipver} dst-port {ports[i]} context {ctx_id}" + flow = f"flow-type tcp{cfg.addr_ipver} dst-ip {cfg.addr} dst-port {ports[i]} context {ctx_id}" ntuple_id = ethtool_create(cfg, "-N", flow) ntuple.append(defer(ethtool, f"-N {cfg.ifname} delete {ntuple_id}"))
@@ -569,7 +569,7 @@ def test_rss_context_overlap(cfg, other_ctx=0):
port = rand_port() if other_ctx: - flow = f"flow-type tcp{cfg.addr_ipver} dst-port {port} context {other_ctx}" + flow = f"flow-type tcp{cfg.addr_ipver} dst-ip {cfg.addr} dst-port {port} context {other_ctx}" ntuple_id = ethtool_create(cfg, "-N", flow) ntuple = defer(ethtool, f"-N {cfg.ifname} delete {ntuple_id}")
@@ -587,7 +587,7 @@ def test_rss_context_overlap(cfg, other_ctx=0): # Now create a rule for context 1 and make sure traffic goes to a subset if other_ctx: ntuple.exec() - flow = f"flow-type tcp{cfg.addr_ipver} dst-port {port} context {ctx_id}" + flow = f"flow-type tcp{cfg.addr_ipver} dst-ip {cfg.addr} dst-port {port} context {ctx_id}" ntuple_id = ethtool_create(cfg, "-N", flow) defer(ethtool, f"-N {cfg.ifname} delete {ntuple_id}")
@@ -620,7 +620,7 @@ def test_delete_rss_context_busy(cfg):
# utilize context from ntuple filter port = rand_port() - flow = f"flow-type tcp{cfg.addr_ipver} dst-port {port} context {ctx_id}" + flow = f"flow-type tcp{cfg.addr_ipver} dst-ip {cfg.addr} dst-port {port} context {ctx_id}" ntuple_id = ethtool_create(cfg, "-N", flow) defer(ethtool, f"-N {cfg.ifname} delete {ntuple_id}")
On Wed, Nov 13, 2024 at 12:13:11PM +0000, edward.cree@amd.com wrote:
From: Edward Cree ecree.xilinx@gmail.com
sfc hardware does not support filters with only ipproto + dst-port; adding dst-ip to the flow spec allows the rss_ctx test to be run on these devices.
Signed-off-by: Edward Cree ecree.xilinx@gmail.com
Reviewed-by: Martin Habets habetsm.xilinx@gmail.com
I'm not sure if this change will break the test for other drivers that perhaps support the old filter but not the new one. If so we might need to add an option to cfg to control this choice.
tools/testing/selftests/drivers/net/hw/rss_ctx.py | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/tools/testing/selftests/drivers/net/hw/rss_ctx.py b/tools/testing/selftests/drivers/net/hw/rss_ctx.py index 29995586993c..fb61dae20fd8 100755 --- a/tools/testing/selftests/drivers/net/hw/rss_ctx.py +++ b/tools/testing/selftests/drivers/net/hw/rss_ctx.py @@ -215,7 +215,7 @@ def test_rss_queue_reconfigure(cfg, main_ctx=True): defer(ethtool, f"-X {cfg.ifname} default") else: other_key = 'noise'
flow = f"flow-type tcp{cfg.addr_ipver} dst-port {port} context {ctx_id}"
flow = f"flow-type tcp{cfg.addr_ipver} dst-ip {cfg.addr} dst-port {port} context {ctx_id}" ntuple = ethtool_create(cfg, "-N", flow) defer(ethtool, f"-N {cfg.ifname} delete {ntuple}")
@@ -429,7 +429,7 @@ def test_rss_context(cfg, ctx_cnt=1, create_with_cfg=None): ksft_eq(max(data['rss-indirection-table']), 2 + i * 2 + 1, "Unexpected context cfg: " + str(data)) ports.append(rand_port())
flow = f"flow-type tcp{cfg.addr_ipver} dst-port {ports[i]} context {ctx_id}"
flow = f"flow-type tcp{cfg.addr_ipver} dst-ip {cfg.addr} dst-port {ports[i]} context {ctx_id}" ntuple = ethtool_create(cfg, "-N", flow) defer(ethtool, f"-N {cfg.ifname} delete {ntuple}")
@@ -516,7 +516,7 @@ def test_rss_context_out_of_order(cfg, ctx_cnt=4): ctx.append(defer(ethtool, f"-X {cfg.ifname} context {ctx_id} delete")) ports.append(rand_port())
flow = f"flow-type tcp{cfg.addr_ipver} dst-port {ports[i]} context {ctx_id}"
flow = f"flow-type tcp{cfg.addr_ipver} dst-ip {cfg.addr} dst-port {ports[i]} context {ctx_id}" ntuple_id = ethtool_create(cfg, "-N", flow) ntuple.append(defer(ethtool, f"-N {cfg.ifname} delete {ntuple_id}"))
@@ -569,7 +569,7 @@ def test_rss_context_overlap(cfg, other_ctx=0): port = rand_port() if other_ctx:
flow = f"flow-type tcp{cfg.addr_ipver} dst-port {port} context {other_ctx}"
flow = f"flow-type tcp{cfg.addr_ipver} dst-ip {cfg.addr} dst-port {port} context {other_ctx}" ntuple_id = ethtool_create(cfg, "-N", flow) ntuple = defer(ethtool, f"-N {cfg.ifname} delete {ntuple_id}")
@@ -587,7 +587,7 @@ def test_rss_context_overlap(cfg, other_ctx=0): # Now create a rule for context 1 and make sure traffic goes to a subset if other_ctx: ntuple.exec()
- flow = f"flow-type tcp{cfg.addr_ipver} dst-port {port} context {ctx_id}"
- flow = f"flow-type tcp{cfg.addr_ipver} dst-ip {cfg.addr} dst-port {port} context {ctx_id}" ntuple_id = ethtool_create(cfg, "-N", flow) defer(ethtool, f"-N {cfg.ifname} delete {ntuple_id}")
@@ -620,7 +620,7 @@ def test_delete_rss_context_busy(cfg): # utilize context from ntuple filter port = rand_port()
- flow = f"flow-type tcp{cfg.addr_ipver} dst-port {port} context {ctx_id}"
- flow = f"flow-type tcp{cfg.addr_ipver} dst-ip {cfg.addr} dst-port {port} context {ctx_id}" ntuple_id = ethtool_create(cfg, "-N", flow) defer(ethtool, f"-N {cfg.ifname} delete {ntuple_id}")
From: Edward Cree ecree.xilinx@gmail.com
Test creates an ntuple filter with 'action 2' and an RSS context whose indirection table has entries 0 and 1. Resulting traffic should go to queues 2 and 3; verify that it never hits queues 0 and 1.
Signed-off-by: Edward Cree ecree.xilinx@gmail.com --- .../selftests/drivers/net/hw/rss_ctx.py | 41 ++++++++++++++++++- 1 file changed, 40 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/drivers/net/hw/rss_ctx.py b/tools/testing/selftests/drivers/net/hw/rss_ctx.py index fb61dae20fd8..8f62dc29bd26 100755 --- a/tools/testing/selftests/drivers/net/hw/rss_ctx.py +++ b/tools/testing/selftests/drivers/net/hw/rss_ctx.py @@ -633,6 +633,45 @@ def test_delete_rss_context_busy(cfg): pass
+def test_rss_ntuple_addition(cfg): + """ + Test that the queue offset (ring_cookie) of an ntuple rule is added + to the queue number read from the indirection table. + """ + + require_ntuple(cfg) + + queue_cnt = len(_get_rx_cnts(cfg)) + if queue_cnt < 4: + try: + ksft_pr(f"Increasing queue count {queue_cnt} -> 4") + ethtool(f"-L {cfg.ifname} combined 4") + defer(ethtool, f"-L {cfg.ifname} combined {queue_cnt}") + except: + raise KsftSkipEx("Not enough queues for the test") + + # Use queue 0 for normal traffic + ethtool(f"-X {cfg.ifname} equal 1") + defer(ethtool, f"-X {cfg.ifname} default") + + # create additional rss context + ctx_id = ethtool_create(cfg, "-X", "context new equal 2") + defer(ethtool, f"-X {cfg.ifname} context {ctx_id} delete") + + # utilize context from ntuple filter + port = rand_port() + flow = f"flow-type tcp{cfg.addr_ipver} dst-ip {cfg.addr} dst-port {port} context {ctx_id} action 2" + try: + ntuple_id = ethtool_create(cfg, "-N", flow) + except CmdExitFailure: + raise KsftSkipEx("Ntuple filter with RSS and nonzero action not supported") + defer(ethtool, f"-N {cfg.ifname} delete {ntuple_id}") + + _send_traffic_check(cfg, port, f"context {ctx_id}", { 'target': (2, 3), + 'empty' : (1,), + 'noise' : (0,) }) + + def main() -> None: with NetDrvEpEnv(__file__, nsim_test=False) as cfg: cfg.ethnl = EthtoolFamily() @@ -644,7 +683,7 @@ def main() -> None: test_rss_context_dump, test_rss_context_queue_reconfigure, test_rss_context_overlap, test_rss_context_overlap2, test_rss_context_out_of_order, test_rss_context4_create_with_cfg, - test_delete_rss_context_busy], + test_delete_rss_context_busy, test_rss_ntuple_addition], args=(cfg, )) ksft_exit()
From: Edward Cree ecree.xilinx@gmail.com
The combination of ntuple action (ring_cookie) and RSS context can cause an ntuple rule to target a higher queue than appears in any RSS indirection table or directly in the ntuple rule, since the two numbers are added together. Verify the logic that prevents reducing the queue count in this case.
Signed-off-by: Edward Cree ecree.xilinx@gmail.com --- I'm not able to run this selftest myself, because sfc doesn't support ethtool set-channels (-L). If someone else can verify that it works on their hardware that would be much appreciated.
.../selftests/drivers/net/hw/rss_ctx.py | 26 +++++++++++++++++++ 1 file changed, 26 insertions(+)
diff --git a/tools/testing/selftests/drivers/net/hw/rss_ctx.py b/tools/testing/selftests/drivers/net/hw/rss_ctx.py index 8f62dc29bd26..0b49ce7ae678 100755 --- a/tools/testing/selftests/drivers/net/hw/rss_ctx.py +++ b/tools/testing/selftests/drivers/net/hw/rss_ctx.py @@ -238,6 +238,32 @@ def test_rss_queue_reconfigure(cfg, main_ctx=True): else: raise Exception(f"Driver didn't prevent us from deactivating a used queue (context {ctx_id})")
+ if not main_ctx: + ethtool(f"-L {cfg.ifname} combined 4") + flow = f"flow-type tcp{cfg.addr_ipver} dst-ip {cfg.addr} dst-port {port} context {ctx_id} action 1" + try: + # this targets queue 4, which doesn't exist + ntuple2 = ethtool_create(cfg, "-N", flow) + except CmdExitFailure: + pass + else: + raise Exception(f"Driver didn't prevent us from targeting a nonexistent queue (context {ctx_id})") + # change the table to target queues 0 and 2 + ethtool(f"-X {cfg.ifname} {ctx_ref} weight 1 0 1 0") + # ntuple rule therefore targets queues 1 and 3 + ntuple2 = ethtool_create(cfg, "-N", flow) + # should replace existing filter + ksft_eq(ntuple, ntuple2) + _send_traffic_check(cfg, port, ctx_ref, { 'target': (1, 3), + 'noise' : (0, 2) }) + # Setting queue count to 3 should fail, queue 3 is used + try: + ethtool(f"-L {cfg.ifname} combined 3") + except CmdExitFailure: + pass + else: + raise Exception(f"Driver didn't prevent us from deactivating a used queue (context {ctx_id})") +
def test_rss_resize(cfg): """Test resizing of the RSS table.
Hello:
This series was applied to netdev/net-next.git (main) by Jakub Kicinski kuba@kernel.org:
On Wed, 13 Nov 2024 12:13:08 +0000 you wrote:
From: Edward Cree ecree.xilinx@gmail.com
The original semantics of ntuple filters with FLOW_RSS were not fully understood by all drivers, some ignoring the ring_cookie from the flow rule. Require this support to be explicitly declared by the driver for filters relying on it to be inserted, and add self- test coverage for this functionality. Also teach ethtool_check_max_channel() about this.
[...]
Here is the summary with links: - [net-next,1/5] net: ethtool: only allow set_rxnfc with rss + ring_cookie if driver opts in https://git.kernel.org/netdev/net-next/c/9e43ad7a1ede - [net-next,2/5] net: ethtool: account for RSS+RXNFC add semantics when checking channel count https://git.kernel.org/netdev/net-next/c/a64499f618b2 - [net-next,3/5] selftest: include dst-ip in ethtool ntuple rules https://git.kernel.org/netdev/net-next/c/b2d5b4c46856 - [net-next,4/5] selftest: validate RSS+ntuple filters with nonzero ring_cookie https://git.kernel.org/netdev/net-next/c/e9e8abfec214 - [net-next,5/5] selftest: extend test_rss_context_queue_reconfigure for action addition https://git.kernel.org/netdev/net-next/c/29a4bc1fe961
You are awesome, thank you!
linux-kselftest-mirror@lists.linaro.org