From: Florian Westphal fw@strlen.de
commit d209df3e7f7002d9099fdb0f6df0f972b4386a63 upstream
[ We hit the trace described in commit message with the kselftest/nft_trans_stress.sh. This patch diverges from the upstream one since kernel 4.14 does not have following symbols: nft_chain_filter_init, nf_tables_flowtable_notifier ]
We must register nfnetlink ops last, as that exposes nf_tables to userspace. Without this, we could theoretically get nfnetlink request before net->nft state has been initialized.
Fixes: 99633ab29b213 ("netfilter: nf_tables: complete net namespace support") Signed-off-by: Florian Westphal fw@strlen.de Signed-off-by: Pablo Neira Ayuso pablo@netfilter.org [apanyaki: backport to v4.14-stable] Signed-off-by: Andrew Paniakin apanyaki@amazon.com ---
[ 163.471426] Call Trace: [ 163.474901] netlink_dump+0x125/0x2d0 [ 163.479081] __netlink_dump_start+0x16a/0x1c0 [ 163.483589] nf_tables_gettable+0x151/0x180 [nf_tables] [ 163.488561] ? nf_tables_gettable+0x180/0x180 [nf_tables] [ 163.493658] nfnetlink_rcv_msg+0x222/0x250 [nfnetlink] [ 163.498608] ? __skb_try_recv_datagram+0x114/0x180 [ 163.503359] ? nfnetlink_net_exit_batch+0x60/0x60 [nfnetlink] [ 163.508590] netlink_rcv_skb+0x4d/0x130 [ 163.512832] nfnetlink_rcv+0x92/0x780 [nfnetlink] [ 163.517465] ? netlink_recvmsg+0x202/0x3e0 [ 163.521801] ? __kmalloc_node_track_caller+0x31/0x290 [ 163.526635] ? copy_msghdr_from_user+0xd5/0x150 [ 163.531216] ? __netlink_lookup+0xd0/0x130 [ 163.535536] netlink_unicast+0x196/0x240 [ 163.539759] netlink_sendmsg+0x2da/0x400 [ 163.544010] sock_sendmsg+0x36/0x40 [ 163.548030] SYSC_sendto+0x10e/0x140 [ 163.552119] ? __audit_syscall_entry+0xbc/0x110 [ 163.556741] ? syscall_trace_enter+0x1df/0x2e0 [ 163.561315] ? __audit_syscall_exit+0x231/0x2b0 [ 163.565857] do_syscall_64+0x67/0x110 [ 163.569930] entry_SYSCALL_64_after_hwframe+0x59/0xbe
Reproduce with debug logs clearly shows the nft initialization issue exactly as in ported patch description: [ 22.600051] nft load start [ 22.600858] nf_tables: (c) 2007-2009 Patrick McHardy kaber@trash.net [ 22.601241] nf_tables_gettable start: ffff888527c10000 [ 22.601271] register_pernet_subsys ffffffffa02ba0c0 [ 22.601274] netns ops_init ffffffffa02ba0c0 ffffffff821aeec0 [ 22.602506] nf_tables_dump_tables: ffff888527c10000 [ 22.603187] af_info list init done: ffffffff821aeec0 [ 22.604064] nf_tables_dump_tables: afi: (null) [ 22.604077] BUG: unable to handle kernel [ 22.604820] netns ops_init end ffffffffa02ba0c0 ffffffff821aeec0 [ 22.605698] NULL pointer dereference [ 22.606354] netns ops_init ffffffffa02ba0c0 ffff888527c10000
(gdb) p &init_net $2 = (struct net *) 0xffffffff821aeec0 <init_net> ffff888527c10000 is a testns1 namespaces
To reproduce this problem and test the fix I scripted following steps: - start Qemu VM - run nft_trans_stress.sh test - check dmesg logs for NULL pointer dereference - reboot via QMP and repeat
I tested the fix with our kernel regression tests (including kselftest) also.
net/netfilter/nf_tables_api.c | 15 +++++++++++---- 1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c index c683a45b8ae53..65495b528290b 100644 --- a/net/netfilter/nf_tables_api.c +++ b/net/netfilter/nf_tables_api.c @@ -6032,18 +6032,25 @@ static int __init nf_tables_module_init(void) goto err1; }
- err = nf_tables_core_module_init(); + err = register_pernet_subsys(&nf_tables_net_ops); if (err < 0) goto err2;
- err = nfnetlink_subsys_register(&nf_tables_subsys); + err = nf_tables_core_module_init(); if (err < 0) goto err3;
+ /* must be last */ + err = nfnetlink_subsys_register(&nf_tables_subsys); + if (err < 0) + goto err4; + pr_info("nf_tables: (c) 2007-2009 Patrick McHardy kaber@trash.net\n"); - return register_pernet_subsys(&nf_tables_net_ops); -err3: + return err; +err4: nf_tables_core_module_exit(); +err3: + unregister_pernet_subsys(&nf_tables_net_ops); err2: kfree(info); err1: