Re: [PATCHv3 2/6] tty/ldsem: Update waiter->task before waking up reader

11 Sep 2018


      On Tue, Sep 11, 2018 at 02:04:49PM +0900, Sergey Senozhatsky wrote:
...
On (09/11/18 02:48), Dmitry Safonov wrote:
...
There is a couple of reports about lockup in ldsem_down_read() without
anyone holding write end of ldisc semaphore:
lkml.kernel.org/r/20171121132855.ajdv4k6swzhvktl6@wfg-t540p.sh.intel.com
lkml.kernel.org/r/20180907045041.GF1110@shao2-debian
They all looked like a missed wake up.
I wasn't lucky enough to reproduce it, but it seems like reader on
another CPU can miss waiter->task update and schedule again, resulting
in indefinite (MAX_SCHEDULE_TIMEOUT) sleep.
Certainly, something suspicious is going on.
...
@@ -118,6 +118,8 @@ static void __ldsem_wake_readers(struct ld_semaphore *sem)
   	tsk = waiter->task;
   	smp_mb();
   	waiter->task = NULL;

/* Make sure down_read_failed() will see !waiter->task update */


smp_wmb();

wake_up_process(tsk);

Hmm. I think wake_up_process() executes a full memory barrier, because
it accesses task state.
...
put_task_struct(tsk);

}
@@ -217,7 +219,7 @@ down_read_failed(struct ld_semaphore *sem, long count, long timeout)
   for (;;) {
   	set_current_state(TASK_UNINTERRUPTIBLE);
I think that set_current_state() also executes memory barrier. Just
because it accesses task state.
In both cases, the rationale, 'because it accesses task state' is
utterly wrong.
The reasoning can be found in the comment near set_current_state().