This series attempts to reduce the parsing overhead of IPv6 extension headers in GRO and GSO, by removing extension header specific code and enabling the frag0 fast path.
The following changes were made:
 - Removed some unnecessary HBH conditionals by adding HBH offload
   to inet6_offloads
 - Added a utility function to support frag0 fast path in
   ipv6_gro_receive
 - Added selftests for IPv6 packets with extension headers in GRO
Richard
v1 -> v2:
 - Added a minimum IPv6 extension header length constant to make code
   self documenting.
 - Added new selftest which checks that packets with different extension
   header payloads do not coalesce.
 - Added more info in the second commit message regarding the code changes.
 - v1:
   https://lore.kernel.org/netdev/f4eff69d-3917-4c42-8c6b-d09597ac4437@gmail.co...
Richard Gobert (3):
  net: gso: add HBH extension header offload support
  net: gro: parse ipv6 ext headers without frag0 invalidation
  selftests/net: fix GRO coalesce test and add ext header coalesce tests
 include/net/ipv6.h                |  1 +
 net/ipv6/exthdrs_offload.c        | 11 ++++
 net/ipv6/ip6_offload.c            | 76 +++++++++++++++++--------
 tools/testing/selftests/net/gro.c | 94 +++++++++++++++++++++++++++++--
 4 files changed, 152 insertions(+), 30 deletions(-)
This commit adds a net_offload for IPv6 Hop-by-Hop extension headers (as is already done for routing and dstopts), since HBH is supported in GSO and GRO. This allows removing the HBH-specific conditionals in GSO and GRO when pulling and parsing an incoming packet.
Signed-off-by: Richard Gobert richardbgobert@gmail.com
Reviewed-by: Willem de Bruijn willemb@google.com
---
 net/ipv6/exthdrs_offload.c | 11 +++++++++++
 net/ipv6/ip6_offload.c     | 25 +++++++++++--------------
 2 files changed, 22 insertions(+), 14 deletions(-)
diff --git a/net/ipv6/exthdrs_offload.c b/net/ipv6/exthdrs_offload.c
index 06750d65d480..4c00398f4dca 100644
--- a/net/ipv6/exthdrs_offload.c
+++ b/net/ipv6/exthdrs_offload.c
@@ -16,6 +16,10 @@ static const struct net_offload dstopt_offload = {
 	.flags		=	INET6_PROTO_GSO_EXTHDR,
 };
 
+static const struct net_offload hbh_offload = {
+	.flags		=	INET6_PROTO_GSO_EXTHDR,
+};
+
 int __init ipv6_exthdrs_offload_init(void)
 {
 	int ret;
@@ -28,9 +32,16 @@ int __init ipv6_exthdrs_offload_init(void)
 	if (ret)
 		goto out_rt;
 
+	ret = inet6_add_offload(&hbh_offload, IPPROTO_HOPOPTS);
+	if (ret)
+		goto out_dstopts;
+
 out:
 	return ret;
 
+out_dstopts:
+	inet6_del_offload(&dstopt_offload, IPPROTO_DSTOPTS);
+
 out_rt:
 	inet6_del_offload(&rthdr_offload, IPPROTO_ROUTING);
 	goto out;
diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
index d6314287338d..0e0b5fed0995 100644
--- a/net/ipv6/ip6_offload.c
+++ b/net/ipv6/ip6_offload.c
@@ -45,15 +45,13 @@ static int ipv6_gso_pull_exthdrs(struct sk_buff *skb, int proto)
 		struct ipv6_opt_hdr *opth;
 		int len;
 
-		if (proto != NEXTHDR_HOP) {
-			ops = rcu_dereference(inet6_offloads[proto]);
+		ops = rcu_dereference(inet6_offloads[proto]);
 
-			if (unlikely(!ops))
-				break;
+		if (unlikely(!ops))
+			break;
 
-			if (!(ops->flags & INET6_PROTO_GSO_EXTHDR))
-				break;
-		}
+		if (!(ops->flags & INET6_PROTO_GSO_EXTHDR))
+			break;
 
 		if (unlikely(!pskb_may_pull(skb, 8)))
 			break;
@@ -171,13 +169,12 @@ static int ipv6_exthdrs_len(struct ipv6hdr *iph,
 
 	proto = iph->nexthdr;
 	for (;;) {
-		if (proto != NEXTHDR_HOP) {
-			*opps = rcu_dereference(inet6_offloads[proto]);
-			if (unlikely(!(*opps)))
-				break;
-			if (!((*opps)->flags & INET6_PROTO_GSO_EXTHDR))
-				break;
-		}
+		*opps = rcu_dereference(inet6_offloads[proto]);
+		if (unlikely(!(*opps)))
+			break;
+		if (!((*opps)->flags & INET6_PROTO_GSO_EXTHDR))
+			break;
+
 		opth = (void *)opth + optlen;
 		optlen = ipv6_optlen(opth);
 		len += optlen;
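The effect of registering HBH in the offload table can be illustrated outside the kernel. The following is a minimal userspace sketch, not kernel code: `offloads[]`, `FLAG_EXTHDR`, `struct offload` and `is_gso_exthdr()` are hypothetical stand-ins for `inet6_offloads[]`, `INET6_PROTO_GSO_EXTHDR`, `struct net_offload` and the open-coded checks in the patch. Once Hop-by-Hop has its own table entry, a single lookup answers "is this an extension header we may skip?" and no `proto != NEXTHDR_HOP` special case is needed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Protocol numbers as assigned by IANA. */
enum {
	PROTO_HOPOPTS = 0,
	PROTO_TCP     = 6,
	PROTO_ROUTING = 43,
	PROTO_DSTOPTS = 60,
};

#define FLAG_EXTHDR 0x1	/* hypothetical stand-in for INET6_PROTO_GSO_EXTHDR */

struct offload {		/* hypothetical stand-in for struct net_offload */
	unsigned int flags;
};

static const struct offload exthdr_offload = { .flags = FLAG_EXTHDR };
static const struct offload tcp_offload    = { .flags = 0 };

/* Stand-in for inet6_offloads[]; entries that are not listed stay NULL. */
static const struct offload *offloads[256] = {
	[PROTO_HOPOPTS] = &exthdr_offload,	/* the entry this patch adds */
	[PROTO_ROUTING] = &exthdr_offload,
	[PROTO_DSTOPTS] = &exthdr_offload,
	[PROTO_TCP]     = &tcp_offload,
};

/* One table lookup decides whether proto is a skippable extension header. */
static bool is_gso_exthdr(unsigned char proto)
{
	const struct offload *ops = offloads[proto];

	return ops && (ops->flags & FLAG_EXTHDR);
}
```

With the HBH entry present, `is_gso_exthdr(PROTO_HOPOPTS)` is true through the same path as routing and destination options, while TCP and unregistered protocols terminate the walk.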
On 1/2/24 6:20 AM, Richard Gobert wrote:
This commit adds net_offload to IPv6 Hop-by-Hop extension headers (as it is done for routing and dstopts) since it is supported in GSO and GRO. This allows to remove specific HBH conditionals in GSO and GRO when pulling and parsing an incoming packet.
Signed-off-by: Richard Gobert richardbgobert@gmail.com Reviewed-by: Willem de Bruijn willemb@google.com
net/ipv6/exthdrs_offload.c | 11 +++++++++++ net/ipv6/ip6_offload.c | 25 +++++++++++-------------- 2 files changed, 22 insertions(+), 14 deletions(-)
Reviewed-by: David Ahern dsahern@kernel.org
On Tue, Jan 2, 2024 at 2:21 PM Richard Gobert richardbgobert@gmail.com wrote:
This commit adds net_offload to IPv6 Hop-by-Hop extension headers (as it is done for routing and dstopts) since it is supported in GSO and GRO. This allows to remove specific HBH conditionals in GSO and GRO when pulling and parsing an incoming packet.
Signed-off-by: Richard Gobert richardbgobert@gmail.com Reviewed-by: Willem de Bruijn willemb@google.com
Reviewed-by: Eric Dumazet edumazet@google.com
The existing code always pulls the IPv6 header and sets the transport offset initially. It then optionally pulls any extension headers in ipv6_gso_pull_exthdrs and sets the transport offset again on return from that call. Before calling ipv6_gso_pull_exthdrs, skb->data is moved to the start of the first extension header, and the frag0 optimization must be disabled because that function uses pskb_may_pull/pskb_pull instead of the skb_gro_ helpers. The code then sets the GRO offset to the TCP header with skb_gro_pull, sets the transport header, and finally returns skb->data to its position before this block.
This commit introduces a new helper function - ipv6_gro_pull_exthdrs - which is used in ipv6_gro_receive to pull ipv6 ext headers instead of ipv6_gso_pull_exthdrs. Thus, there is no modification of skb->data, all operations use skb_gro_* helpers, and the frag0 fast path can be taken for IPv6 packets with ext headers.
Signed-off-by: Richard Gobert richardbgobert@gmail.com
Reviewed-by: Willem de Bruijn willemb@google.com
---
 include/net/ipv6.h     |  1 +
 net/ipv6/ip6_offload.c | 51 +++++++++++++++++++++++++++++++++---------
 2 files changed, 42 insertions(+), 10 deletions(-)
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 78d38dd88aba..217240efa182 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -26,6 +26,7 @@ struct ip_tunnel_info;
 #define SIN6_LEN_RFC2133	24
 
 #define IPV6_MAXPLEN		65535
+#define IPV6_MIN_EXTHDR_LEN	8
 
 /*
  *	NextHeader field of IPv6 header
diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
index 0e0b5fed0995..c07111d8f56a 100644
--- a/net/ipv6/ip6_offload.c
+++ b/net/ipv6/ip6_offload.c
@@ -37,6 +37,40 @@
 		INDIRECT_CALL_L4(cb, f2, f1, head, skb);	\
 })
 
+static int ipv6_gro_pull_exthdrs(struct sk_buff *skb, int off, int proto)
+{
+	const struct net_offload *ops = NULL;
+	struct ipv6_opt_hdr *opth;
+
+	for (;;) {
+		int len;
+
+		ops = rcu_dereference(inet6_offloads[proto]);
+
+		if (unlikely(!ops))
+			break;
+
+		if (!(ops->flags & INET6_PROTO_GSO_EXTHDR))
+			break;
+
+		opth = skb_gro_header(skb, off + IPV6_MIN_EXTHDR_LEN, off);
+		if (unlikely(!opth))
+			break;
+
+		len = ipv6_optlen(opth);
+
+		opth = skb_gro_header(skb, off + len, off);
+		if (unlikely(!opth))
+			break;
+		proto = opth->nexthdr;
+
+		off += len;
+	}
+
+	skb_gro_pull(skb, off - skb_network_offset(skb));
+	return proto;
+}
+
 static int ipv6_gso_pull_exthdrs(struct sk_buff *skb, int proto)
 {
 	const struct net_offload *ops = NULL;
@@ -203,28 +237,25 @@ INDIRECT_CALLABLE_SCOPE struct sk_buff *ipv6_gro_receive(struct list_head *head,
 		goto out;
 
 	skb_set_network_header(skb, off);
-	skb_gro_pull(skb, sizeof(*iph));
-	skb_set_transport_header(skb, skb_gro_offset(skb));
 
-	flush += ntohs(iph->payload_len) != skb_gro_len(skb);
+	flush += ntohs(iph->payload_len) != skb->len - hlen;
 
 	proto = iph->nexthdr;
 	ops = rcu_dereference(inet6_offloads[proto]);
 	if (!ops || !ops->callbacks.gro_receive) {
-		pskb_pull(skb, skb_gro_offset(skb));
-		skb_gro_frag0_invalidate(skb);
-		proto = ipv6_gso_pull_exthdrs(skb, proto);
-		skb_gro_pull(skb, -skb_transport_offset(skb));
-		skb_reset_transport_header(skb);
-		__skb_push(skb, skb_gro_offset(skb));
+		proto = ipv6_gro_pull_exthdrs(skb, hlen, proto);
 
 		ops = rcu_dereference(inet6_offloads[proto]);
 		if (!ops || !ops->callbacks.gro_receive)
 			goto out;
 
-		iph = ipv6_hdr(skb);
+		iph = skb_gro_network_header(skb);
+	} else {
+		skb_gro_pull(skb, sizeof(*iph));
 	}
 
+	skb_set_transport_header(skb, skb_gro_offset(skb));
+
 	NAPI_GRO_CB(skb)->proto = proto;
 
 	flush--;
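The key property of ipv6_gro_pull_exthdrs is that it only advances an offset and never moves the data pointer. A minimal userspace sketch of that idea follows. It is an illustration under stated assumptions, not the kernel implementation: `struct pkt`, `pkt_header()`, `is_exthdr()` and `pull_exthdrs()` are hypothetical names, `pkt_header()` only loosely mirrors the `skb_gro_header(skb, hlen, off)` calling convention, the first fetch reads just the 2-byte fixed part of the header, and the final cursor is stored directly instead of being adjusted relative to the network offset.

```c
#include <stddef.h>
#include <stdint.h>

struct pkt {			/* hypothetical, greatly simplified packet */
	const uint8_t *data;	/* start of the bytes; never moved here */
	size_t len;		/* number of bytes available at data */
	size_t gro_off;		/* parse cursor, analogous to the GRO offset */
};

/* Return a pointer to offset off if the first hlen bytes exist, else NULL. */
static const uint8_t *pkt_header(const struct pkt *p, size_t hlen, size_t off)
{
	return hlen <= p->len ? p->data + off : NULL;
}

/* HBH (0), routing (43) and destination options (60) are skipped. */
static int is_exthdr(int proto)
{
	return proto == 0 || proto == 43 || proto == 60;
}

/*
 * Walk the extension header chain starting at byte offset off.
 * Returns the first protocol that is not a skippable extension header
 * (or the current one if the packet is truncated) and records where
 * parsing stopped in p->gro_off. p->data is left untouched.
 */
static int pull_exthdrs(struct pkt *p, size_t off, int proto)
{
	while (is_exthdr(proto)) {
		const uint8_t *h = pkt_header(p, off + 2, off);
		size_t len;

		if (!h)
			break;

		len = ((size_t)h[1] + 1) << 3;	/* same formula as ipv6_optlen() */

		h = pkt_header(p, off + len, off);
		if (!h)
			break;

		proto = h[0];			/* next header field */
		off += len;
	}

	p->gro_off = off;
	return proto;
}
```

Because the parser works on offsets only, nothing has to be undone after the walk, which is what lets the real code drop the pskb_pull/__skb_push pair shown as removed lines in the hunk.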
On 1/2/24 6:24 AM, Richard Gobert wrote:
The existing code always pulls the IPv6 header and sets the transport offset initially. Then optionally again pulls any extension headers in ipv6_gso_pull_exthdrs and sets the transport offset again on return from that call. skb->data is set at the start of the first extension header before calling ipv6_gso_pull_exthdrs, and must disable the frag0 optimization because that function uses pskb_may_pull/pskb_pull instead of skb_gro_ helpers. It sets the GRO offset to the TCP header with skb_gro_pull and sets the transport header. Then returns skb->data to its position before this block.
This commit introduces a new helper function - ipv6_gro_pull_exthdrs - which is used in ipv6_gro_receive to pull ipv6 ext headers instead of ipv6_gso_pull_exthdrs. Thus, there is no modification of skb->data, all operations use skb_gro_* helpers, and the frag0 fast path can be taken for IPv6 packets with ext headers.
Signed-off-by: Richard Gobert richardbgobert@gmail.com Reviewed-by: Willem de Bruijn willemb@google.com
include/net/ipv6.h | 1 + net/ipv6/ip6_offload.c | 51 +++++++++++++++++++++++++++++++++--------- 2 files changed, 42 insertions(+), 10 deletions(-)
Reviewed-by: David Ahern dsahern@kernel.org
On Tue, Jan 2, 2024 at 2:25 PM Richard Gobert richardbgobert@gmail.com wrote:
diff --git a/include/net/ipv6.h b/include/net/ipv6.h
index 78d38dd88aba..217240efa182 100644
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -26,6 +26,7 @@ struct ip_tunnel_info;
 #define SIN6_LEN_RFC2133	24
 
 #define IPV6_MAXPLEN		65535
+#define IPV6_MIN_EXTHDR_LEN	8

// Hmm see my following comment.

 /*
  *	NextHeader field of IPv6 header
diff --git a/net/ipv6/ip6_offload.c b/net/ipv6/ip6_offload.c
index 0e0b5fed0995..c07111d8f56a 100644
--- a/net/ipv6/ip6_offload.c
+++ b/net/ipv6/ip6_offload.c
@@ -37,6 +37,40 @@
 		INDIRECT_CALL_L4(cb, f2, f1, head, skb);	\
 })
 
+static int ipv6_gro_pull_exthdrs(struct sk_buff *skb, int off, int proto)
+{
+	const struct net_offload *ops = NULL;
+	struct ipv6_opt_hdr *opth;
+
+	for (;;) {
+		int len;
+
+		ops = rcu_dereference(inet6_offloads[proto]);
+
+		if (unlikely(!ops))
+			break;
+
+		if (!(ops->flags & INET6_PROTO_GSO_EXTHDR))
+			break;
+
+		opth = skb_gro_header(skb, off + IPV6_MIN_EXTHDR_LEN, off);

I do not see a compelling reason for adding yet another constant here.

I would stick to

	opth = skb_gro_header(skb, off + sizeof(*opth), off);

Consistency with similar helpers is desirable.

+		if (unlikely(!opth))
+			break;
+
+		len = ipv6_optlen(opth);
+
+		opth = skb_gro_header(skb, off + len, off);

Note this call will take care of precise pull.

+		if (unlikely(!opth))
+			break;
+		proto = opth->nexthdr;
+
+		off += len;
+	}
+
+	skb_gro_pull(skb, off - skb_network_offset(skb));
+	return proto;
+}
Eric Dumazet wrote:
On Tue, Jan 2, 2024 at 2:25 PM Richard Gobert richardbgobert@gmail.com wrote:
opth = skb_gro_header(skb, off + IPV6_MIN_EXTHDR_LEN, off);
I do not see a compelling reason for adding yet another constant here.
I would stick to
opth = skb_gro_header(skb, off + sizeof(*opth), off);
Consistency with similar helpers is desirable.
In terms of consistency - similar helper functions (ipv6_gso_pull_exthdrs, ipv6_parse_hopopts) also pull 8 bytes at the beginning of every IPv6 extension header, because the minimum extension header length is 8 bytes.
sizeof(*opth) = 2, so for an IPv6 packet with one extension header with a common length of 8 bytes, pskb_may_pull will be called twice: first with length = 2 and again with length = 8, which might not be ideal when parsing non-linear packets.
Willem suggested adding a constant to make the code more self-documenting.
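The arithmetic behind this argument can be checked with a small model. This is a hedged sketch, not kernel code: `ext_optlen()` restates the `ipv6_optlen()` formula, and `slow_pulls()` is a hypothetical model which assumes that every fetch reaching beyond the currently linear bytes costs one slow-path pull and that the linear area then grows to exactly the requested length.

```c
#include <stddef.h>
#include <stdint.h>

/* Total extension header length in bytes, as computed by ipv6_optlen(). */
static size_t ext_optlen(uint8_t hdrlen)
{
	return ((size_t)hdrlen + 1) << 3;
}

/*
 * Count slow-path pulls for the two-step fetch of one extension header:
 * first first_fetch bytes, then the full header length.
 * linear is the number of bytes already linear at the header start.
 */
static int slow_pulls(size_t linear, size_t first_fetch, uint8_t hdrlen)
{
	const size_t want[2] = { first_fetch, ext_optlen(hdrlen) };
	int slow = 0;

	for (int i = 0; i < 2; i++) {
		if (want[i] > linear) {
			slow++;
			linear = want[i];	/* assume the pull makes exactly this much linear */
		}
	}
	return slow;
}
```

Under this model, a fully non-linear 8-byte header costs two pulls when the first fetch is 2 bytes and one pull when it is 8 bytes. When the header bytes are already linear, both variants cost nothing, which is the common case the frag0 fast path targets.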
On Wed, Jan 3, 2024 at 2:08 PM Richard Gobert richardbgobert@gmail.com wrote:
Eric Dumazet wrote:
On Tue, Jan 2, 2024 at 2:25 PM Richard Gobert richardbgobert@gmail.com wrote:
opth = skb_gro_header(skb, off + IPV6_MIN_EXTHDR_LEN, off);
I do not see a compelling reason for adding yet another constant here.
I would stick to
opth = skb_gro_header(skb, off + sizeof(*opth), off);
Consistency with similar helpers is desirable.
In terms of consistency - similar helper functions (ipv6_gso_pull_exthdrs, ipv6_parse_hopopts) also pull 8 bytes at the beginning of every IPv6 extension header, because the minimum extension header length is 8 bytes.
sizeof(*opth) = 2, so for an IPv6 packet with one extension header with a common length of 8 bytes, pskb_may_pull will be called twice: first with length = 2 and again with length = 8, which might not be ideal when parsing non-linear packets.
Willem suggested adding a constant to make the code more self-documenting.
Hmm... I was looking at
skb_checksum_setup_ipv6(), it uses skb_maybe_pull_tail( ... sizeof(struct ipv6_opt_hdr))
ipv6_skip_exthdr() also uses sizeof(struct ipv6_opt_hdr)
ip6_tnl_parse_tlv_enc_lim also uses the same.
hbh_mt6(), ipv6header_mt6(), .. same...
ip6_find_1stfragopt(), get_ipv6_ext_hdrs(), tcf_csum_ipv6(), mip6_rthdr_offset() same
So it seems you found two helpers that went the other way.
If you think pulling 8 bytes first is a win, I would suggest a stand alone patch, adding the magic constant using it in all places, so that a casual reader can make sense of the magical 8 value.
Eric Dumazet wrote:
Hmm... I was looking at
skb_checksum_setup_ipv6(), it uses skb_maybe_pull_tail( ... sizeof(struct ipv6_opt_hdr))
ipv6_skip_exthdr() also uses sizeof(struct ipv6_opt_hdr)
ip6_tnl_parse_tlv_enc_lim also uses the same.
hbh_mt6(), ipv6header_mt6(), .. same...
ip6_find_1stfragopt(), get_ipv6_ext_hdrs(), tcf_csum_ipv6(), mip6_rthdr_offset() same
So it seems you found two helpers that went the other way.
If you think pulling 8 bytes first is a win, I would suggest a stand alone patch, adding the magic constant using it in all places, so that a casual reader can make sense of the magical 8 value.
I guess pulling 8 bytes first is not such a big advantage. I will submit a v3 with sizeof(*opth) as you suggested.
Currently there is no test which checks that IPv6 extension header packets successfully coalesce. This commit adds a test, which verifies two IPv6 packets with HBH extension headers do coalesce, and another test which checks that packets with different extension header data do not coalesce in GRO.
I changed the receive socket filter to accept a packet with one extension header. This change exposed a bug in the fragment test: the old BPF filter did not accept the fragment packet. I updated correct_num_pkts in the fragment test accordingly.
Signed-off-by: Richard Gobert richardbgobert@gmail.com
---
 tools/testing/selftests/net/gro.c | 94 +++++++++++++++++++++++++++++--
 1 file changed, 88 insertions(+), 6 deletions(-)
diff --git a/tools/testing/selftests/net/gro.c b/tools/testing/selftests/net/gro.c
index 30024d0ed373..6dbba8ec53a1 100644
--- a/tools/testing/selftests/net/gro.c
+++ b/tools/testing/selftests/net/gro.c
@@ -71,6 +71,12 @@
 #define MAX_PAYLOAD (IP_MAXPACKET - sizeof(struct tcphdr) - sizeof(struct ipv6hdr))
 #define NUM_LARGE_PKT (MAX_PAYLOAD / MSS)
 #define MAX_HDR_LEN (ETH_HLEN + sizeof(struct ipv6hdr) + sizeof(struct tcphdr))
+#define MIN_EXTHDR_SIZE 8
+#define EXT_PAYLOAD_1 "\x00\x00\x00\x00\x00\x00"
+#define EXT_PAYLOAD_2 "\x11\x11\x11\x11\x11\x11"
+
+#define ipv6_optlen(p)  (((p)->hdrlen+1) << 3) /* calculate IPv6 extension header len */
+#define BUILD_BUG_ON(condition) ((void)sizeof(char[1 - 2*!!(condition)]))
 
 static const char *addr6_src = "fdaa::2";
 static const char *addr6_dst = "fdaa::1";
@@ -104,7 +110,7 @@ static void setup_sock_filter(int fd)
 	const int dport_off = tcp_offset + offsetof(struct tcphdr, dest);
 	const int ethproto_off = offsetof(struct ethhdr, h_proto);
 	int optlen = 0;
-	int ipproto_off;
+	int ipproto_off, opt_ipproto_off;
 	int next_off;
 
 	if (proto == PF_INET)
@@ -116,14 +122,30 @@ static void setup_sock_filter(int fd)
 	if (strcmp(testname, "ip") == 0) {
 		if (proto == PF_INET)
 			optlen = sizeof(struct ip_timestamp);
-		else
-			optlen = sizeof(struct ip6_frag);
+		else {
+			BUILD_BUG_ON(sizeof(struct ip6_hbh) > MIN_EXTHDR_SIZE);
+			BUILD_BUG_ON(sizeof(struct ip6_dest) > MIN_EXTHDR_SIZE);
+			BUILD_BUG_ON(sizeof(struct ip6_frag) > MIN_EXTHDR_SIZE);
+
+			/* same size for HBH and Fragment extension header types */
+			optlen = MIN_EXTHDR_SIZE;
+			opt_ipproto_off = ETH_HLEN + sizeof(struct ipv6hdr)
+				+ offsetof(struct ip6_ext, ip6e_nxt);
+		}
 	}
 
+	/* this filter validates the following:
+	 *	- packet is IPv4/IPv6 according to the running test.
+	 *	- packet is TCP. Also handles the case of one extension header and then TCP.
+	 *	- checks the packet tcp dport equals to DPORT. Also handles the case of one
+	 *	  extension header and then TCP.
+	 */
 	struct sock_filter filter[] = {
 			BPF_STMT(BPF_LD  + BPF_H   + BPF_ABS, ethproto_off),
-			BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, ntohs(ethhdr_proto), 0, 7),
+			BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, ntohs(ethhdr_proto), 0, 9),
 			BPF_STMT(BPF_LD  + BPF_B   + BPF_ABS, ipproto_off),
+			BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, IPPROTO_TCP, 2, 0),
+			BPF_STMT(BPF_LD  + BPF_B   + BPF_ABS, opt_ipproto_off),
 			BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, IPPROTO_TCP, 0, 5),
 			BPF_STMT(BPF_LD  + BPF_H   + BPF_ABS, dport_off),
 			BPF_JUMP(BPF_JMP + BPF_JEQ + BPF_K, DPORT, 2, 0),
@@ -576,6 +598,40 @@ static void add_ipv4_ts_option(void *buf, void *optpkt)
 	iph->check = checksum_fold(iph, sizeof(struct iphdr) + optlen, 0);
 }
 
+static void add_ipv6_exthdr(void *buf, void *optpkt, __u8 exthdr_type, char *ext_payload)
+{
+	struct ipv6_opt_hdr *exthdr = (struct ipv6_opt_hdr *)(optpkt + tcp_offset);
+	struct ipv6hdr *iph = (struct ipv6hdr *)(optpkt + ETH_HLEN);
+	char *exthdr_payload_start = (char *)(exthdr + 1);
+
+	exthdr->hdrlen = 0;
+	exthdr->nexthdr = IPPROTO_TCP;
+
+	if (ext_payload)
+		memcpy(exthdr_payload_start, ext_payload, MIN_EXTHDR_SIZE - sizeof(*exthdr));
+
+	memcpy(optpkt, buf, tcp_offset);
+	memcpy(optpkt + tcp_offset + MIN_EXTHDR_SIZE, buf + tcp_offset,
+		sizeof(struct tcphdr) + PAYLOAD_LEN);
+
+	iph->nexthdr = exthdr_type;
+	iph->payload_len = htons(ntohs(iph->payload_len) + MIN_EXTHDR_SIZE);
+}
+
+static void send_ipv6_exthdr(int fd, struct sockaddr_ll *daddr, char *ext_data1, char *ext_data2)
+{
+	static char buf[MAX_HDR_LEN + PAYLOAD_LEN];
+	static char exthdr_pck[sizeof(buf) + MIN_EXTHDR_SIZE];
+
+	create_packet(buf, 0, 0, PAYLOAD_LEN, 0);
+	add_ipv6_exthdr(buf, exthdr_pck, IPPROTO_HOPOPTS, ext_data1);
+	write_packet(fd, exthdr_pck, total_hdr_len + PAYLOAD_LEN + MIN_EXTHDR_SIZE, daddr);
+
+	create_packet(buf, PAYLOAD_LEN * 1, 0, PAYLOAD_LEN, 0);
+	add_ipv6_exthdr(buf, exthdr_pck, IPPROTO_HOPOPTS, ext_data2);
+	write_packet(fd, exthdr_pck, total_hdr_len + PAYLOAD_LEN + MIN_EXTHDR_SIZE, daddr);
+}
+
 /* IPv4 options shouldn't coalesce */
 static void send_ip_options(int fd, struct sockaddr_ll *daddr)
 {
@@ -697,7 +753,7 @@ static void send_fragment6(int fd, struct sockaddr_ll *daddr)
 		create_packet(buf, PAYLOAD_LEN * i, 0, PAYLOAD_LEN, 0);
 		write_packet(fd, buf, bufpkt_len, daddr);
 	}
-
+	sleep(1);
 	create_packet(buf, PAYLOAD_LEN * 2, 0, PAYLOAD_LEN, 0);
 	memset(extpkt, 0, extpkt_len);
 
@@ -760,6 +816,7 @@ static void check_recv_pkts(int fd, int *correct_payload,
 	vlog("}, Total %d packets\nReceived {", correct_num_pkts);
 
 	while (1) {
+		ip_ext_len = 0;
 		pkt_size = recv(fd, buffer, IP_MAXPACKET + ETH_HLEN + 1, 0);
 		if (pkt_size < 0)
 			error(1, errno, "could not receive");
@@ -767,7 +824,7 @@ static void check_recv_pkts(int fd, int *correct_payload,
 		if (iph->version == 4)
 			ip_ext_len = (iph->ihl - 5) * 4;
 		else if (ip6h->version == 6 && ip6h->nexthdr != IPPROTO_TCP)
-			ip_ext_len = sizeof(struct ip6_frag);
+			ip_ext_len = MIN_EXTHDR_SIZE;
 
 		tcph = (struct tcphdr *)(buffer + tcp_offset + ip_ext_len);
 
@@ -880,7 +937,21 @@ static void gro_sender(void)
 			sleep(1);
 			write_packet(txfd, fin_pkt, total_hdr_len, &daddr);
 		} else if (proto == PF_INET6) {
+			sleep(1);
 			send_fragment6(txfd, &daddr);
+			sleep(1);
+			write_packet(txfd, fin_pkt, total_hdr_len, &daddr);
+
+			sleep(1);
+			/* send IPv6 packets with ext header with same payload */
+			send_ipv6_exthdr(txfd, &daddr, EXT_PAYLOAD_1, EXT_PAYLOAD_1);
+			sleep(1);
+			write_packet(txfd, fin_pkt, total_hdr_len, &daddr);
+
+			sleep(1);
+			/* send IPv6 packets with ext header with different payload */
+			send_ipv6_exthdr(txfd, &daddr, EXT_PAYLOAD_1, EXT_PAYLOAD_2);
+			sleep(1);
 			write_packet(txfd, fin_pkt, total_hdr_len, &daddr);
 		}
 	} else if (strcmp(testname, "large") == 0) {
@@ -997,6 +1068,17 @@ static void gro_receiver(void)
 			 */
 			printf("fragmented ip6 doesn't coalesce: ");
 			correct_payload[0] = PAYLOAD_LEN * 2;
+			correct_payload[1] = PAYLOAD_LEN;
+			correct_payload[2] = PAYLOAD_LEN;
+			check_recv_pkts(rxfd, correct_payload, 3);
+
+			printf("ipv6 with ext header does coalesce: ");
+			correct_payload[0] = PAYLOAD_LEN * 2;
+			check_recv_pkts(rxfd, correct_payload, 1);
+
+			printf("ipv6 with ext header with different payloads doesn't coalesce: ");
+			correct_payload[0] = PAYLOAD_LEN;
+			correct_payload[1] = PAYLOAD_LEN;
 			check_recv_pkts(rxfd, correct_payload, 2);
 		}
 	} else if (strcmp(testname, "large") == 0) {
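The intent of the updated socket filter, as listed in its comment, can be restated in plain C. The sketch below is an assumption-laden illustration rather than a translation of the BPF program: `accept_ipv6_tcp()`, `rd16()` and the `X_`-prefixed constants are hypothetical, it covers only the IPv6 case, and it assumes an untagged Ethernet frame with at most one 8-byte extension header in front of TCP.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define X_ETH_HLEN       14	/* Ethernet header length */
#define X_IP6_HLEN       40	/* fixed IPv6 header length */
#define X_EXTHDR_LEN     8	/* one minimum-size extension header */
#define X_ETHERTYPE_IPV6 0x86DD
#define X_PROTO_TCP      6

/* Read a big-endian 16-bit value. */
static uint16_t rd16(const uint8_t *p)
{
	return (uint16_t)(p[0] << 8 | p[1]);
}

/*
 * Accept an IPv6 frame carrying TCP to dport, either directly after the
 * IPv6 header or after exactly one 8-byte extension header.
 */
static bool accept_ipv6_tcp(const uint8_t *pkt, size_t len, uint16_t dport)
{
	size_t tcp_off = X_ETH_HLEN + X_IP6_HLEN;

	if (len < tcp_off + 4)
		return false;
	if (rd16(pkt + 12) != X_ETHERTYPE_IPV6)
		return false;

	if (pkt[X_ETH_HLEN + 6] != X_PROTO_TCP) {
		/* Not TCP directly: look at the ext header's next-header byte. */
		if (len < tcp_off + X_EXTHDR_LEN + 4)
			return false;
		if (pkt[tcp_off] != X_PROTO_TCP)
			return false;
		tcp_off += X_EXTHDR_LEN;
	}

	/* The TCP destination port sits 2 bytes into the TCP header. */
	return rd16(pkt + tcp_off + 2) == dport;
}
```

This version ties the port offset to whichever path matched, which is stricter than a filter that simply tries both offsets. It is meant only to make the two accepted packet layouts explicit.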
Richard Gobert wrote:
Currently there is no test which checks that IPv6 extension header packets successfully coalesce. This commit adds a test, which verifies two IPv6 packets with HBH extension headers do coalesce, and another test which checks that packets with different extension header data do not coalesce in GRO.
I changed the receive socket filter to accept a packet with one extension header. This change exposed a bug in the fragment test -- the old BPF did not accept the fragment packet. I updated correct_num_packets in the fragment test accordingly.
Signed-off-by: Richard Gobert richardbgobert@gmail.com
Reviewed-by: Willem de Bruijn willemb@google.com
Thanks for adding the second test.
+static void add_ipv6_exthdr(void *buf, void *optpkt, __u8 exthdr_type, char *ext_payload)
+{
+	struct ipv6_opt_hdr *exthdr = (struct ipv6_opt_hdr *)(optpkt + tcp_offset);
+	struct ipv6hdr *iph = (struct ipv6hdr *)(optpkt + ETH_HLEN);
+	char *exthdr_payload_start = (char *)(exthdr + 1);
+
+	exthdr->hdrlen = 0;
+	exthdr->nexthdr = IPPROTO_TCP;
+
+	if (ext_payload)
+		memcpy(exthdr_payload_start, ext_payload, MIN_EXTHDR_SIZE - sizeof(*exthdr));
minor nit, in case this gets respun: ext_payload is always true.
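For readers who want to experiment with what add_ipv6_exthdr does to a packet, here is a standalone sketch that works on raw bytes. It is a simplified illustration under stated assumptions: `insert_min_exthdr()` is a hypothetical name, the buffers start at the IPv6 header instead of the Ethernet header, the payload pointer is required to be non-NULL (in line with the nit above), and the extension header's next-header field is copied from the original IPv6 header instead of being hard-coded to TCP.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IP6_HDR_LEN     40	/* fixed IPv6 header size */
#define MIN_EXTHDR_SIZE 8	/* smallest extension header (hdrlen == 0) */

/*
 * Copy the IPv6 packet at in (in_len bytes) to out, inserting one 8-byte
 * extension header of type exthdr_type after the IPv6 header.
 * out must have room for in_len + MIN_EXTHDR_SIZE bytes.
 * Returns the length of the new packet.
 */
static size_t insert_min_exthdr(const uint8_t *in, size_t in_len, uint8_t *out,
				uint8_t exthdr_type, const uint8_t payload[6])
{
	uint8_t *ext = out + IP6_HDR_LEN;
	unsigned int plen;

	assert(in_len >= IP6_HDR_LEN);

	memcpy(out, in, IP6_HDR_LEN);

	ext[0] = in[6];		/* next header: what used to follow IPv6 */
	ext[1] = 0;		/* hdrlen 0 means (0 + 1) * 8 = 8 bytes */
	memcpy(ext + 2, payload, MIN_EXTHDR_SIZE - 2);

	/* Everything after the IPv6 header moves back by 8 bytes. */
	memcpy(ext + MIN_EXTHDR_SIZE, in + IP6_HDR_LEN, in_len - IP6_HDR_LEN);

	out[6] = exthdr_type;	/* IPv6 next header now names the ext header */

	/* payload_len is big-endian at bytes 4..5 and grows by 8. */
	plen = ((unsigned int)in[4] << 8 | in[5]) + MIN_EXTHDR_SIZE;
	out[4] = (uint8_t)(plen >> 8);
	out[5] = (uint8_t)(plen & 0xff);

	return in_len + MIN_EXTHDR_SIZE;
}
```

Calling it twice with the same 6-byte payload gives two packets that differ only in their TCP data, while different payloads give extension headers that differ, which are the two cases the new selftests exercise.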