Hi Breno,
30 Oct 2024 15:02:45 Breno Leitao <leitao@debian.org>:
> The mptcp_sched_find() function must be called with the RCU read lock held, as it accesses RCU-protected data structures. This requirement was not properly enforced in the mptcp_init_sock() function, leading to an "RCU-list traversed in non-reader section" warning when CONFIG_PROVE_RCU_LIST is enabled.
>   net/mptcp/sched.c:44 RCU-list traversed in non-reader section!!
> Fix it by acquiring the RCU read lock before calling the mptcp_sched_find() function. This ensures that the function is invoked with the necessary RCU protection in place, as it accesses RCU-protected data structures.
Thank you for having looked at that, but there is already a fix:
https://lore.kernel.org/netdev/20241021-net-mptcp-sched-lock-v1-1-637759cf06...
This fix has even been applied in the net tree already:
https://git.kernel.org/netdev/net/c/3deb12c788c3
Did you not get conflicts when rebasing your branch on top of the latest version?
> Additionally, the patch breaks down the mptcp_init_sched() call into smaller parts, with the RCU read lock only covering the specific call to mptcp_sched_find(). This helps minimize the critical section, reducing the time during which RCU grace periods are blocked.
I agree with Eric (thank you for the review!): this creates other issues.
> The mptcp_sched_list_lock is not held in this case, and it is not clear if it is necessary.
It is not needed: the list is not modified here, only read under RCU.
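To illustrate the read-only case, here is a minimal userspace sketch, not the kernel code. The rcu_read_lock()/rcu_read_unlock() stubs only count nesting depth, and struct sched, sched_find() and init_sock_sched() are hypothetical, simplified stand-ins for the MPTCP scheduler list and its lookup. It only shows that a pure lookup needs the reader section around it and no writer-side list lock:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Userspace stand-ins for the kernel primitives (assumption: they only
 * track nesting depth so the lookup can check it runs in a reader section). */
static int rcu_depth;

static void rcu_read_lock(void)
{
	rcu_depth++;
}

static void rcu_read_unlock(void)
{
	assert(rcu_depth > 0);
	rcu_depth--;
}

/* Hypothetical, simplified scheduler entry; not the kernel's structure. */
struct sched {
	const char *name;
	const struct sched *next;
};

static const struct sched sched_example = { "example", NULL };
static const struct sched sched_default = { "default", &sched_example };
static const struct sched *sched_list = &sched_default;

/* Read-only traversal: the caller must be inside a reader section. */
static const struct sched *sched_find(const char *name)
{
	const struct sched *s;

	/* Loosely mimics the kind of check CONFIG_PROVE_RCU_LIST performs. */
	assert(rcu_depth > 0);

	for (s = sched_list; s; s = s->next)
		if (strcmp(s->name, name) == 0)
			return s;

	return NULL;
}

/* The caller takes the reader lock around the lookup. Nothing is added to
 * or removed from the list, so no list lock is taken. */
static int init_sock_sched(const char *name)
{
	const struct sched *s;
	int found;

	rcu_read_lock();
	s = sched_find(name);
	found = (s != NULL);
	rcu_read_unlock();

	return found ? 0 : -1;
}
```

Calling init_sock_sched("default") returns 0 and leaves the reader depth balanced, while calling sched_find() directly, outside the reader section, trips the assertion.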
Cheers,
Matt