On Wed, May 29, 2019 at 12:25:39PM +0200, Stefan Bader wrote:
From: Jiri Wiesner <jwiesner@suse.com>
The *_frag_reasm() functions are susceptible to miscalculating the byte count of packet fragments in case the truesize of a head buffer changes. The truesize member may be changed by the call to skb_unclone(), leaving the fragment memory limit counter unbalanced even if all fragments are processed. This miscalculation goes unnoticed as long as the network namespace which holds the counter is not destroyed.
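The imbalance described above can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: every name (model_skb, model_netns, model_queue_fragment, model_unclone, model_reasm) and every byte count is hypothetical. It assumes that fragments are charged to the counter with their truesize when queued and that reassembly uncharges the summed truesize afterwards. The compensate branch shows one way the difference could be accounted for, by adding the change in truesize to the counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space stand-ins; these are not the kernel's structures. */
struct model_skb {
    long truesize;      /* bytes charged for this buffer */
};

struct model_netns {
    long frag_mem;      /* models the fragment memory limit counter */
};

/* Charge a fragment to the namespace counter when it is queued. */
static void model_queue_fragment(struct model_netns *ns,
                                 const struct model_skb *skb)
{
    assert(skb->truesize > 0);
    ns->frag_mem += skb->truesize;
}

/* Models an unclone step that reallocates the head and changes its truesize. */
static void model_unclone(struct model_skb *head, long growth)
{
    head->truesize += growth;
}

/*
 * Queue two fragments, reassemble them, and return what is left in the
 * counter. A balanced counter ends at zero.
 */
static long model_reasm(bool compensate)
{
    struct model_netns ns = { 0 };
    struct model_skb head = { 768 };    /* made-up sizes */
    struct model_skb tail = { 512 };
    long delta;

    model_queue_fragment(&ns, &head);
    model_queue_fragment(&ns, &tail);

    /* Record how much the head's truesize changes across the unclone. */
    delta = -head.truesize;
    model_unclone(&head, 128);
    delta += head.truesize;

    /* Account for the change so the later uncharge matches the charges. */
    if (compensate && delta)
        ns.frag_mem += delta;

    /* Reassembly uncharges the summed truesize of all fragments. */
    ns.frag_mem -= head.truesize + tail.truesize;
    return ns.frag_mem;
}
```

In this model, model_reasm(false) leaves a nonzero counter although both fragments were processed, while model_reasm(true) returns zero.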
Should an attempt be made to destroy a network namespace that holds an unbalanced fragment memory limit counter, the cleanup of the namespace never finishes. The thread handling the cleanup gets stuck in inet_frags_exit_net() waiting for the percpu counter to reach zero. The thread is usually in the running state with a stack trace similar to:
PID: 1073   TASK: ffff880626711440  CPU: 1   COMMAND: "kworker/u48:4"
 #5 [ffff880621563d48] _raw_spin_lock at ffffffff815f5480
 #6 [ffff880621563d48] inet_evict_bucket at ffffffff8158020b
 #7 [ffff880621563d80] inet_frags_exit_net at ffffffff8158051c
 #8 [ffff880621563db0] ops_exit_list at ffffffff814f5856
 #9 [ffff880621563dd8] cleanup_net at ffffffff814f67c0
#10 [ffff880621563e38] process_one_work at ffffffff81096f14
It is not possible to create new network namespaces, and processes that call unshare() end up being stuck in uninterruptible sleep state waiting to acquire the net_mutex.
The bug was observed in the IPv6 netfilter code by Per Sundstrom. I thank him for his analysis of the problem. The parts of this patch that apply to IPv4 and IPv6 fragment reassembly are preemptive measures.
Signed-off-by: Jiri Wiesner <jwiesner@suse.com>
Reported-by: Per Sundstrom <per.sundstrom@redqube.se>
Acked-by: Peter Oskolkov <posk@google.com>
Signed-off-by: David S. Miller <davem@davemloft.net>

(backported from commit ebaf39e6032faf77218220707fc3fa22487784e0)
[smb: context adjustments in net/ipv6/netfilter/nf_conntrack_reasm.c]
Signed-off-by: Stefan Bader <stefan.bader@canonical.com>
I can't take a patch for 4.4.y that is not in 4.9.y as anyone upgrading kernel versions would have a regression :(
Can you also provide a backport of the needed patches for 4.9.y for this issue so I can take these?
thanks,
greg k-h