On Wed, 2024-02-14 at 14:59 +0100, Marc Kleine-Budde wrote:
From: Ziqi Zhao <astrajoan@yahoo.com>
The following 3 locks can be acquired in conflicting orders, causing the deadlock reported in the Syzbot bug report:
- j1939_socks_lock
- active_session_list_lock
- sk_session_queue_lock
A reasonable fix is to change j1939_socks_lock to an rwlock, since in the rare situations where a write lock is required for the linked list that j1939_socks_lock is protecting, the code does not attempt to acquire any more locks. This would break the circular lock dependency, where, for example, the current thread already holds j1939_socks_lock and attempts to acquire sk_session_queue_lock, while at the same time another thread attempts to acquire j1939_socks_lock while holding sk_session_queue_lock.
Just FYI, Eric recently did a great job removing most rwlocks in networking code, see commit 0daf07e52709 ("raw: convert raw sockets to RCU").
I'm unsure if the same reasoning applies to CAN, but perhaps you could consider converting this code to RCU for net-next.
Cheers,
Paolo