There are a couple of reports about a lockup in ldsem_down_read() without
anyone holding the write end of the ldisc semaphore:
lkml.kernel.org/r/20171121132855.ajdv4k6swzhvktl6@wfg-t540p.sh.intel.com
lkml.kernel.org/r/20180907045041.GF1110@shao2-debian

They all look like a missed wakeup. I wasn't able to reproduce it, but it
seems that a reader on another CPU can miss the waiter->task update and
schedule again, resulting in an indefinite (MAX_SCHEDULE_TIMEOUT) sleep.

Make sure the woken-up reader sees waiter->task == NULL.

Cc: stable@vger.kernel.org
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: Jiri Slaby <jslaby@suse.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
Reported-by: kernel test robot <rong.a.chen@intel.com>
Signed-off-by: Dmitry Safonov <dima@arista.com>
---
 drivers/tty/tty_ldsem.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
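
The handshake described in the changelog can be sketched in userspace as
follows. This is only an analogy under stated assumptions: it uses C11
release/acquire atomics and pthreads instead of the kernel's smp_wmb(),
READ_ONCE() and wake_up_process(), and the names demo_waiter, demo_reader,
demo_waker, demo_wake_reader and the lock_granted field are hypothetical,
not part of tty_ldsem.c.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct ldsem_waiter. */
struct demo_waiter {
	_Atomic(void *) task;	/* non-NULL while the reader must keep waiting */
	int lock_granted;	/* plain data published by the waker */
};

/* Reader side: loosely mirrors the loop in down_read_failed(). */
static void *demo_reader(void *arg)
{
	struct demo_waiter *w = arg;

	for (;;) {
		/* Reload on every pass; acquire pairs with the waker's release. */
		if (!atomic_load_explicit(&w->task, memory_order_acquire))
			break;
		sched_yield();	/* stands in for schedule_timeout() */
	}
	/* Safe to read: ordered after the waker's release store. */
	return w->lock_granted ? w : NULL;
}

/* Waker side: loosely mirrors __ldsem_wake_readers(). */
static void *demo_waker(void *arg)
{
	struct demo_waiter *w = arg;

	w->lock_granted = 1;
	/* Publish task == NULL; a real waker would then wake the sleeper. */
	atomic_store_explicit(&w->task, NULL, memory_order_release);
	return NULL;
}

/* Returns 1 if the reader left its loop and saw the waker's data. */
int demo_wake_reader(void)
{
	struct demo_waiter w;
	pthread_t reader, waker;
	void *seen = NULL;
	int err;

	atomic_init(&w.task, &w);	/* any non-NULL token means "still waiting" */
	w.lock_granted = 0;

	err = pthread_create(&reader, NULL, demo_reader, &w);
	assert(err == 0);
	err = pthread_create(&waker, NULL, demo_waker, &w);
	assert(err == 0);

	pthread_join(waker, NULL);
	pthread_join(reader, &seen);
	return seen == &w;
}
```

Calling demo_wake_reader() from a small main() and building with -pthread
should be enough to try it. The point of the sketch is that the reader
reloads the flag on each iteration, so a cached or compiler-hoisted value
cannot keep it waiting after the waker has cleared it.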

diff --git a/drivers/tty/tty_ldsem.c b/drivers/tty/tty_ldsem.c
index 0c98d88f795a..832accbbcb6d 100644
--- a/drivers/tty/tty_ldsem.c
+++ b/drivers/tty/tty_ldsem.c
@@ -118,6 +118,8 @@ static void __ldsem_wake_readers(struct ld_semaphore *sem)
 		tsk = waiter->task;
 		smp_mb();
 		waiter->task = NULL;
+		/* Make sure down_read_failed() will see !waiter->task update */
+		smp_wmb();
 		wake_up_process(tsk);
 		put_task_struct(tsk);
 	}
@@ -217,7 +219,7 @@ down_read_failed(struct ld_semaphore *sem, long count, long timeout)
 	for (;;) {
 		set_current_state(TASK_UNINTERRUPTIBLE);
 
-		if (!waiter.task)
+		if (!READ_ONCE(waiter.task))
 			break;
 		if (!timeout)
 			break;