This is based on Stephen's v4.14 patches, with the necessary merge conflicts resolved and adjustments made for the lack of timer_setup() on the 4.9 baseline.
Perf results on a gigabit-capable system, before and after patching, are below.
Series can also be found here:
https://github.com/ffainelli/linux/commits/fragment-stack-v4.9

Before patching:

   PerfTop:     457 irqs/sec  kernel:74.4%  exact:  0.0% [4000Hz cycles],  (all, 4 CPUs)
-------------------------------------------------------------------------------

    29.62%  [kernel]       [k] ip_defrag
     6.57%  [kernel]       [k] arch_cpu_idle
     1.72%  [kernel]       [k] v7_dma_inv_range
     1.68%  [kernel]       [k] __netif_receive_skb_core
     1.43%  [kernel]       [k] fib_table_lookup
     1.30%  [kernel]       [k] finish_task_switch
     1.08%  [kernel]       [k] ip_rcv
     1.01%  [kernel]       [k] skb_release_data
     0.99%  [kernel]       [k] __slab_free
     0.96%  [kernel]       [k] bcm_sysport_poll
     0.88%  [kernel]       [k] __netdev_alloc_skb
     0.87%  [kernel]       [k] tick_nohz_idle_enter
     0.86%  [kernel]       [k] dev_gro_receive
     0.85%  [kernel]       [k] _raw_spin_unlock_irqrestore
     0.84%  [kernel]       [k] __memzero
     0.74%  [kernel]       [k] tick_nohz_idle_exit
     0.73%  ld-2.24.so     [.] do_lookup_x
     0.66%  [kernel]       [k] kmem_cache_free
     0.66%  [kernel]       [k] bcm_sysport_rx_refill
     0.65%  [kernel]       [k] eth_type_trans

After patching:

   PerfTop:     170 irqs/sec  kernel:86.5%  exact:  0.0% [4000Hz cycles],  (all, 4 CPUs)
-------------------------------------------------------------------------------

     7.79%  [kernel]       [k] arch_cpu_idle
     5.14%  [kernel]       [k] v7_dma_inv_range
     4.20%  [kernel]       [k] ip_defrag
     3.89%  [kernel]       [k] __netif_receive_skb_core
     3.65%  [kernel]       [k] fib_table_lookup
     2.16%  [kernel]       [k] finish_task_switch
     1.93%  [kernel]       [k] _raw_spin_unlock_irqrestore
     1.90%  [kernel]       [k] ip_rcv
     1.84%  [kernel]       [k] bcm_sysport_poll
     1.83%  [kernel]       [k] __memzero
     1.65%  [kernel]       [k] __netdev_alloc_skb
     1.60%  [kernel]       [k] __slab_free
     1.49%  [kernel]       [k] __do_softirq
     1.49%  [kernel]       [k] bcm_sysport_rx_refill
     1.31%  [kernel]       [k] dma_cache_maint_page
     1.25%  [kernel]       [k] tick_nohz_idle_enter
     1.24%  [kernel]       [k] ip_route_input_noref
     1.17%  [kernel]       [k] eth_type_trans
     1.06%  [kernel]       [k] fib_validate_source
     1.03%  [kernel]       [k] inet_frag_find

Dan Carpenter (1):
  ipv4: frags: precedence bug in ip_expire()

Eric Dumazet (22):
  inet: frags: change inet_frags_init_net() return value
  inet: frags: add a pointer to struct netns_frags
  inet: frags: refactor ipfrag_init()
  inet: frags: refactor ipv6_frag_init()
  inet: frags: refactor lowpan_net_frag_init()
  ipv6: export ip6 fragments sysctl to unprivileged users
  rhashtable: add schedule points
  inet: frags: use rhashtables for reassembly units
  inet: frags: remove some helpers
  inet: frags: get rif of inet_frag_evicting()
  inet: frags: remove inet_frag_maybe_warn_overflow()
  inet: frags: break the 2GB limit for frags storage
  inet: frags: do not clone skb in ip_expire()
  ipv6: frags: rewrite ip6_expire_frag_queue()
  rhashtable: reorganize struct rhashtable layout
  inet: frags: reorganize struct netns_frags
  inet: frags: get rid of ipfrag_skb_cb/FRAG_CB
  inet: frags: fix ip6frag_low_thresh boundary
  net: speed up skb_rbtree_purge()
  net: pskb_trim_rcsum() and CHECKSUM_COMPLETE are friends
  net: add rb_to_skb() and other rb tree helpers
  net: sk_buff rbnode reorg

Florian Westphal (1):
  ipv6: defrag: drop non-last frags smaller than min mtu

Peter Oskolkov (4):
  ip: discard IPv4 datagrams with overlapping segments.
  net: modify skb_rbtree_purge to return the truesize of all purged skbs.
  ip: add helpers to process in-order fragments faster.
  ip: process in-order fragments efficiently

Taehee Yoo (1):
  ip: frags: fix crash in ip_do_fragment()

 Documentation/networking/ip-sysctl.txt  |  13 +-
 include/linux/rhashtable.h              |   4 +-
 include/linux/skbuff.h                  |  48 +-
 include/net/inet_frag.h                 | 133 +++--
 include/net/ip.h                        |   1 -
 include/net/ipv6.h                      |  26 +-
 include/uapi/linux/snmp.h               |   1 +
 lib/rhashtable.c                        |   5 +-
 net/core/skbuff.c                       |  31 +-
 net/ieee802154/6lowpan/6lowpan_i.h      |  26 +-
 net/ieee802154/6lowpan/reassembly.c     | 148 +++---
 net/ipv4/inet_fragment.c                | 379 ++++------------
 net/ipv4/ip_fragment.c                  | 573 +++++++++++++-----------
 net/ipv4/proc.c                         |   7 +-
 net/ipv4/tcp_input.c                    |  33 +-
 net/ipv6/netfilter/nf_conntrack_reasm.c | 100 ++---
 net/ipv6/proc.c                         |   5 +-
 net/ipv6/reassembly.c                   | 212 ++++-----
 18 files changed, 785 insertions(+), 960 deletions(-)