On 08/29/2018, 04:40 PM, Jiri Slaby wrote:
On 08/29/2018, 04:23 AM, Dmitry Safonov wrote:
tty_ldisc_reinit() doesn't race with neither tty_ldisc_hangup() nor set_ldisc() nor tty_ldisc_release() as they use tty lock. But it races with anyone who expects line discipline to be the same after hoding read semaphore in tty_ldisc_ref().
We've seen the following crash on v4.9.108 stable:
BUG: unable to handle kernel paging request at 0000000000002260 IP: [..] n_tty_receive_buf_common+0x5f/0x86d Workqueue: events_unbound flush_to_ldisc Call Trace: [..] n_tty_receive_buf2 [..] tty_ldisc_receive_buf [..] flush_to_ldisc [..] process_one_work [..] worker_thread [..] kthread [..] ret_from_fork
I think, tty_ldisc_reinit() should be called with ldisc_sem hold for writing, which will protect any reader against line discipline changes.
Note: I failed to reproduce the described crash, so obiviously can't guarantee that this is the place where line discipline was switched.
...
So what about: tty_ldisc_lock(tty, MAX_SCHEDULE_TIMEOUT); if (!tty->ldisc) ret = tty_ldisc_reinit(tty, tty->termios.c_line); tty_ldisc_unlock(tty);
if (!ret) tty->count++; return ret;
I forgot to add that I debugged a different NULL ptr deref to this very same root cause today (set_termios called with NULL tty->disc_data). So really, tty_reinit's ldisc change must be protected by the ldisc_sem, otherwise other threads will see tty->ldisc, but not tty->disc_data.
thanks,