On Sun, Apr 28, 2019 at 05:25:38PM -0400, Waiman Long wrote:
During my rwsem testing, it was found that after a down_read(), the reader count may occasionally become 0 or even negative. Consequently, a writer may steal the lock at that time and execute in parallel with the reader, thus breaking the mutual exclusion guarantee of the write lock. In other words, both readers and a writer can become rwsem owners simultaneously.
The current reader wakeup code makes a single pass, clearing waiter->task and putting each reader into wake_q before the reader count has been fully incremented. Once waiter->task is cleared, the corresponding reader may see it, finish its critical section and unlock, decrementing the count before the count is incremented. This is not a problem if there is only one reader to wake up, as the count has been pre-incremented by 1. It is a problem if there is more than one reader to be woken up, because a writer can then steal the lock.
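To make the arithmetic of that race concrete, here is a minimal user-space sketch, not kernel code. It assumes a simplified model in which the count is just the number of active readers (the real rwsem count uses a bias and flag bits), and the function name lowest_count_while_held() is hypothetical. It plays out the worst case: every woken reader except the last one unlocks before the waker adds the remaining increments.

```c
#include <assert.h>

/*
 * Simplified model (an assumption, not the kernel's count layout):
 * "count" is the number of readers accounted for in the semaphore.
 *
 * Returns the lowest value the count reaches while at least one woken
 * reader is still inside its critical section, for a one-pass wakeup
 * of "nreaders" readers.
 */
long lowest_count_while_held(int nreaders)
{
    long count = 1;      /* pre-incremented by 1 for the first reader */
    long lowest = count;

    assert(nreaders >= 1);

    /*
     * One pass: waiter->task has already been cleared for every reader,
     * but the remaining (nreaders - 1) increments have not been applied.
     * Worst case: all readers but the last finish and unlock first.
     */
    for (int i = 0; i < nreaders - 1; i++) {
        count--;                 /* an early reader unlocks */
        if (count < lowest)
            lowest = count;
    }

    return lowest;
}
```

In this model one reader gives 1 (safe), two readers give 0, and three give -1. A count of 0 or less while a reader still holds the lock is the window in which a writer could take it.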
The wakeup was actually done in 2 passes before the v4.9 commit 70800c3c0cc5 ("locking/rwsem: Scan the wait_list for readers only once"). To fix this problem, the wakeup is now done in two passes again. In the first pass, we collect the readers and count them. The reader count is then fully incremented. In the second pass, the waiter->task is then cleared and they are put into wake_q to be woken up later.
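The two-pass ordering described above could be sketched roughly as follows. This is a user-space illustration under stated assumptions, not the patch itself: struct waiter, the array-based wait list, the string stand-in for a task pointer, and wake_readers_two_pass() are all hypothetical names, the count is a plain reader count, and the sketch assumes the caller has already pre-incremented the count by 1 and that collection stops at the first waiting writer.

```c
#include <assert.h>
#include <stddef.h>

enum waiter_type { WAITING_FOR_READ, WAITING_FOR_WRITE };

/* Hypothetical stand-in for a wait-list entry. */
struct waiter {
    enum waiter_type type;
    const char *task;        /* non-NULL while the waiter is still queued */
};

/*
 * Wake the leading readers in waiters[0..n-1].
 * Assumes *count was already pre-incremented by 1 for the first reader.
 * Returns the number of readers placed into wake_q.
 */
size_t wake_readers_two_pass(long *count, struct waiter *waiters, size_t n,
                             const char **wake_q)
{
    size_t woken = 0;

    assert(count != NULL);

    /* Pass 1: only count the readers; nobody can see a cleared task yet. */
    while (woken < n && waiters[woken].type == WAITING_FOR_READ)
        woken++;

    if (woken == 0)
        return 0;

    /* Fully increment the reader count before any reader can run. */
    *count += (long)woken - 1;

    /* Pass 2: now clear waiter->task and queue each reader for wakeup. */
    for (size_t i = 0; i < woken; i++) {
        wake_q[i] = waiters[i].task;
        waiters[i].task = NULL;
    }

    return woken;
}
```

Because the count already covers every collected reader before the first task pointer is cleared, an early unlock by one reader can no longer drop the count to 0 while another woken reader still holds the lock.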
Fixes: 70800c3c0cc5 ("locking/rwsem: Scan the wait_list for readers only once")
It is effectively a revert of that patch, right? Just written more cleverly.