On 6/21/21 10:29 AM, Steffen Klassert wrote:
On Fri, Jun 18, 2021 at 04:11:01PM +0200, Varad Gautam wrote:
Commit "xfrm: policy: Read seqcount outside of rcu-read side in xfrm_policy_lookup_bytype" [Linked] resolved a locking bug in xfrm_policy_lookup_bytype that caused an RCU reader-writer deadlock on the mutex wrapped by xfrm_policy_hash_generation on PREEMPT_RT since 77cc278f7b20 ("xfrm: policy: Use sequence counters with associated lock").
However, xfrm_sk_policy_lookup can still reach xfrm_policy_lookup_bytype while holding rcu_read_lock(), as:
xfrm_sk_policy_lookup()
  rcu_read_lock()
  security_xfrm_policy_lookup()
    xfrm_policy_lookup()
Hm, I don't see that call chain. security_xfrm_policy_lookup() calls a hook with the name xfrm_policy_lookup. The only LSM that has registered a function to that hook is selinux. It registers selinux_xfrm_policy_lookup() and I don't see how we can call xfrm_policy_lookup() from there.
Did you actually trigger that bug?
Right, I misread the call chain - security_xfrm_policy_lookup does not reach xfrm_policy_lookup, making this patch unnecessary. The bug I have is:
T1, holding hash_resize_mutex and sleeping inside synchronize_rcu:
  __schedule
  schedule
  schedule_timeout
  wait_for_completion
  __wait_rcu_gp
  synchronize_rcu
  xfrm_hash_resize
And T2, producing RCU stalls because it is blocked on the mutex:
  __schedule
  schedule
  __rt_mutex_slowlock
  rt_mutex_slowlock_locked
  rt_mutex_slowlock
  xfrm_policy_lookup_bytype.constprop.77
  __xfrm_policy_check
  udpv6_queue_rcv_one_skb
  __udp6_lib_rcv
  ip6_protocol_deliver_rcu
  ip6_input_finish
  ip6_input
  ip6_mc_input
  ipv6_rcv
  __netif_receive_skb_one_core
  process_backlog
  net_rx_action
  __softirqentry_text_start
  __local_bh_enable_ip
  ip6_finish_output2
  ip6_output
  ip6_send_skb
  udp_v6_send_skb
  udpv6_sendmsg
  sock_sendmsg
  ____sys_sendmsg
  ___sys_sendmsg
  __sys_sendmsg
  do_syscall_64
So, despite the patch here [1], there is another way to reach xfrm_policy_lookup_bytype from within an RCU read-side critical section, which on PREEMPT_RT will deadlock with xfrm_hash_resize. Does softirq processing on RT happen within rcu_read_lock/unlock? That would explain the stalls.
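
To make the suspected dependency concrete, here is a rough userspace model of the two tasks as a wait-for graph. It is only a sketch, not kernel code: the names (T1_RESIZE, T2_LOOKUP, has_wait_cycle, deadlocks) are made up for illustration, and it assumes that T2 is inside an RCU read-side section at the moment it blocks on the mutex, which is exactly the open question above.

```c
#include <stdbool.h>

/* Hypothetical model: none of these identifiers exist in the kernel. */
enum task { T1_RESIZE, T2_LOOKUP, NTASKS };

/*
 * waits_for[a] == b means task a cannot make progress until task b does.
 * A value of -1 means the task is not waiting on anyone.
 * Returns true if following the edges from 'start' leads back to 'start'.
 */
static bool has_wait_cycle(const int waits_for[NTASKS], int start)
{
	int cur = waits_for[start];

	for (int steps = 0; steps < NTASKS && cur >= 0; steps++) {
		if (cur == start)
			return true;
		cur = waits_for[cur];
	}
	return false;
}

/*
 * T2 always blocks on the mutex owned by T1 (the rt_mutex slowpath in the
 * T2 trace). T1 sits in synchronize_rcu() and therefore waits for T2 only
 * if T2 is currently inside an RCU read-side critical section.
 */
static bool deadlocks(bool lookup_in_rcu_read_section)
{
	int waits_for[NTASKS];

	waits_for[T2_LOOKUP] = T1_RESIZE;
	waits_for[T1_RESIZE] = lookup_in_rcu_read_section ? T2_LOOKUP : -1;

	return has_wait_cycle(waits_for, T1_RESIZE);
}
```

In this model deadlocks(true) reports a cycle and deadlocks(false) does not, which matches the idea that the mutex must not be taken while a read-side section is open.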
[1] https://lore.kernel.org/r/20210528160407.32127-1-varad.gautam@suse.com/
Regards, Varad