From: Hui Li <caelli@tencent.com>

We hit a hang on a pty device: the reader was blocked in epoll on the master side, the writer was sleeping in wait_woken() inside n_tty_write() on the slave side, and the write buffer on the tty_port was full. The reader and writer were never woken again and stayed blocked forever.

The problem is caused by a race between the reader and the kworker:

n_tty_read(reader):  n_tty_receive_buf_common(kworker):
copy_from_read_buf()|
                    |room = N_TTY_BUF_SIZE - (ldata->read_head - tail)
                    |room <= 0
n_tty_kick_worker() |
                    |ldata->no_room = true

After a write to the slave device, the writer wakes the kworker to flush data from the tty_port to the reader. The kworker finds that the reader has no room to store the data, so room <= 0 holds. At that moment the reader consumes all the data in the read buffer and calls n_tty_kick_worker(), which checks ldata->no_room and finds it false, so the reader stops reading. The kworker then sets ldata->no_room = true and exits as well.

If the write buffer is not full, the writer wakes the kworker again on subsequent writes and the data gets flushed. But if the write buffer is full and the writer goes to sleep, the kworker is never woken again and the tty device stays blocked.
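The interleaving described above can be replayed deterministically in userspace. The following is only a sketch under stated assumptions: `struct model`, `room_for()`, `kick_worker()` and `replay_lost_wakeup()` are hypothetical names, not kernel code, and the worker's room handling is reduced to the single `room <= 0` test from the diagram.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define N_TTY_BUF_SIZE 4096

/* Hypothetical, heavily simplified stand-in for struct n_tty_data. */
struct model {
	size_t read_head;   /* producer index, advanced by the kworker */
	size_t read_tail;   /* consumer index, advanced by the reader */
	bool no_room;       /* worker stopped because the buffer was full */
	bool worker_queued; /* set when somebody restarts the worker */
};

/* Free space as the worker sees it, based on its snapshot of the tail. */
static long room_for(const struct model *m, size_t tail)
{
	assert(m->read_head - tail <= N_TTY_BUF_SIZE);
	return (long)N_TTY_BUF_SIZE - (long)(m->read_head - tail);
}

/* Model of n_tty_kick_worker(): restart the worker only if it stopped. */
static void kick_worker(struct model *m)
{
	if (m->no_room) {
		m->no_room = false;
		m->worker_queued = true;
	}
}

/* Replays the race; returns true if the worker is left stopped for good. */
static bool replay_lost_wakeup(void)
{
	struct model m = { .read_head = N_TTY_BUF_SIZE, .read_tail = 0 };

	/* kworker: samples the tail and finds the read buffer full. */
	long room = room_for(&m, m.read_tail);

	/* reader: drains everything, then kicks while no_room is still false. */
	m.read_tail = m.read_head;
	kick_worker(&m);

	/* kworker: acts on the stale room value, sets no_room and exits. */
	if (room <= 0)
		m.no_room = true;

	/* Buffer empty, worker stopped, nobody left to restart it. */
	return m.no_room && !m.worker_queued && m.read_head == m.read_tail;
}
```

In this model `replay_lost_wakeup()` returns true: the buffer is empty, `no_room` is set, and no kick is pending, which corresponds to the hang once the writer is asleep on a full write buffer.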
The problem can be solved by checking the read buffer size inside n_tty_receive_buf_common(): if the read buffer is empty and ldata->no_room is true, a call to n_tty_kick_worker() is needed to keep flushing data to the reader.
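As a rough userspace illustration of that fix, the sketch below uses C11 atomics, with `atomic_thread_fence(memory_order_seq_cst)` standing in for the kernel's smp_mb(). This is an assumption made for illustration; the two are not formally identical. `struct model`, `reader_drain()`, `worker_out_of_room()` and `run_order()` are hypothetical names. Each side publishes its own store, issues a full fence, and only then reads the other side's variable, so at least one of them observes the other's update and kicks the worker.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; field names mirror those in the patch. */
struct model {
	atomic_size_t read_head;
	atomic_size_t read_tail;
	atomic_bool no_room;
	atomic_int kicks; /* how many times the worker was restarted */
};

/* Model of n_tty_kick_worker(): restart the worker only if it stopped. */
static void kick_worker(struct model *m)
{
	if (atomic_load_explicit(&m->no_room, memory_order_relaxed)) {
		atomic_store_explicit(&m->no_room, false, memory_order_relaxed);
		atomic_fetch_add_explicit(&m->kicks, 1, memory_order_relaxed);
	}
}

/* Reader: publish the new tail, full fence, then look at no_room. */
static void reader_drain(struct model *m)
{
	size_t head = atomic_load_explicit(&m->read_head, memory_order_relaxed);

	atomic_store_explicit(&m->read_tail, head, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst); /* stands in for smp_mb() */
	kick_worker(m);
}

/* Worker: set no_room, full fence, then re-check for an empty buffer. */
static void worker_out_of_room(struct model *m)
{
	atomic_store_explicit(&m->no_room, true, memory_order_relaxed);
	atomic_thread_fence(memory_order_seq_cst); /* pairs with the reader's fence */
	if (atomic_load_explicit(&m->read_head, memory_order_relaxed) ==
	    atomic_load_explicit(&m->read_tail, memory_order_relaxed))
		kick_worker(m);
}

/* Runs both sides in one of the two sequential orders; returns the kick count. */
static int run_order(bool worker_first)
{
	struct model m;

	atomic_init(&m.read_head, (size_t)4096);
	atomic_init(&m.read_tail, (size_t)0);
	atomic_init(&m.no_room, false);
	atomic_init(&m.kicks, 0);

	if (worker_first) {
		worker_out_of_room(&m);
		reader_drain(&m);
	} else {
		reader_drain(&m);
		worker_out_of_room(&m);
	}
	/* Whoever ran last saw the other's store and cleared no_room. */
	assert(!atomic_load(&m.no_room));
	return atomic_load(&m.kicks);
}
```

`run_order()` only exercises the two sequential orders, in which exactly one side ends up kicking. The fences matter for the concurrent case: without them, each side's load could be satisfied before its own store becomes visible to the other, and neither side would kick.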
Cc: stable@vger.kernel.org
Fixes: 42458f41d08f ("n_tty: Ensure reader restarts worker for next reader")
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Hui Li <caelli@tencent.com>
---
Patch changelogs between v1 and v2:
    -add barrier inside n_tty_read and n_tty_receive_buf_common;
    -comment why barrier is needed;
    -access to ldata->no_room is changed with READ_ONCE and WRITE_ONCE;
Patch changelogs between v2 and v3:
    -in function n_tty_receive_buf_common, add unlikely to check
     ldata->no_room, eg: if (unlikely(ldata->no_room)), and READ_ONCE
     is removed here to get locality;
    -change comment for barrier to show the race condition to make
     comment easier to understand;
Patch changelogs between v3 and v4:
    -change subject from 'tty: fix a possible hang on tty device' to
     'tty: fix hang on tty device with no_room set' to make subject
     more obvious;
Patch changelogs between v4 and v5:
    -name is changed from cael to caelli, li is added as the family
     name and caelli is the full name.
Patch changelogs between v5 and v6:
    -change From and Signed-off-by, from 'caelli <juanfengpy@gmail.com>'
     to 'caelli <caelli@tencent.com>', the latter is my corporate address.
Patch changelogs between v6 and v7:
    -change name from caelli to 'Hui Li', which is my name in Chinese.
    -the comment for barrier is improved, and Fixes and Reviewed-by
     tags are added.

 drivers/tty/n_tty.c | 41 +++++++++++++++++++++++++++++++++++++----
 1 file changed, 37 insertions(+), 4 deletions(-)
diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index c8f56c9b1a1c..8c17304fffcf 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -204,8 +204,8 @@ static void n_tty_kick_worker(struct tty_struct *tty)
 	struct n_tty_data *ldata = tty->disc_data;
 
 	/* Did the input worker stop? Restart it */
-	if (unlikely(ldata->no_room)) {
-		ldata->no_room = 0;
+	if (unlikely(READ_ONCE(ldata->no_room))) {
+		WRITE_ONCE(ldata->no_room, 0);
 
 		WARN_RATELIMIT(tty->port->itty == NULL,
 				"scheduling with invalid itty\n");
@@ -1698,7 +1698,7 @@ n_tty_receive_buf_common(struct tty_struct *tty, const unsigned char *cp,
 			if (overflow && room < 0)
 				ldata->read_head--;
 			room = overflow;
-			ldata->no_room = flow && !room;
+			WRITE_ONCE(ldata->no_room, flow && !room);
 		} else
 			overflow = 0;
 
@@ -1729,6 +1729,27 @@ n_tty_receive_buf_common(struct tty_struct *tty, const unsigned char *cp,
 	} else
 		n_tty_check_throttle(tty);
 
+	if (unlikely(ldata->no_room)) {
+		/*
+		 * Barrier here is to ensure to read the latest read_tail in
+		 * chars_in_buffer() and to make sure that read_tail is not loaded
+		 * before ldata->no_room is set, otherwise, following race may occur:
+		 * n_tty_receive_buf_common()
+		 *				n_tty_read()
+		 *   if (!chars_in_buffer(tty))->false
+		 *				  copy_from_read_buf()
+		 *				    read_tail=commit_head
+		 *				  n_tty_kick_worker()
+		 *				    if (ldata->no_room)->false
+		 *   ldata->no_room = 1
+		 * Then both kworker and reader will fail to kick n_tty_kick_worker(),
+		 * smp_mb is paired with smp_mb() in n_tty_read().
+		 */
+		smp_mb();
+		if (!chars_in_buffer(tty))
+			n_tty_kick_worker(tty);
+	}
+
 	up_read(&tty->termios_rwsem);
 
 	return rcvd;
@@ -2282,8 +2303,25 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file,
 		if (time)
 			timeout = time;
 	}
-	if (old_tail != ldata->read_tail)
+	if (old_tail != ldata->read_tail) {
+		/*
+		 * Make sure no_room is not read in n_tty_kick_worker()
+		 * before setting ldata->read_tail in copy_from_read_buf(),
+		 * otherwise, following race may occur:
+		 * n_tty_read()
+		 *				n_tty_receive_buf_common()
+		 *   n_tty_kick_worker()
+		 *     if(ldata->no_room)->false
+		 *				  ldata->no_room = 1
+		 *				  if (!chars_in_buffer(tty))->false
+		 *   copy_from_read_buf()
+		 *     read_tail=commit_head
+		 * Both reader and kworker will fail to kick tty_buffer_restart_work(),
+		 * smp_mb is paired with smp_mb() in n_tty_receive_buf_common().
+		 */
+		smp_mb();
 		n_tty_kick_worker(tty);
+	}
 	up_read(&tty->termios_rwsem);
 
 	remove_wait_queue(&tty->read_wait, &wait);
On 17. 03. 23, 3:41, juanfengpy@gmail.com wrote:
> From: Hui Li <caelli@tencent.com>
>
> We hit a hang on a pty device: the reader was blocked in epoll on the master side, the writer was sleeping in wait_woken() inside n_tty_write() on the slave side, and the write buffer on the tty_port was full. The reader and writer were never woken again and stayed blocked forever.
>
> The problem is caused by a race between the reader and the kworker:
>
> n_tty_read(reader):  n_tty_receive_buf_common(kworker):
> copy_from_read_buf()|
>                     |room = N_TTY_BUF_SIZE - (ldata->read_head - tail)
>                     |room <= 0
> n_tty_kick_worker() |
>                     |ldata->no_room = true
>
> After a write to the slave device, the writer wakes the kworker to flush data from the tty_port to the reader. The kworker finds that the reader has no room to store the data, so room <= 0 holds. At that moment the reader consumes all the data in the read buffer and calls n_tty_kick_worker(), which checks ldata->no_room and finds it false, so the reader stops reading. The kworker then sets ldata->no_room = true and exits as well.

...

> --- a/drivers/tty/n_tty.c
> +++ b/drivers/tty/n_tty.c

...

> @@ -1729,6 +1729,27 @@ n_tty_receive_buf_common(struct tty_struct *tty, const unsigned char *cp,
>  	} else
>  		n_tty_check_throttle(tty);
>
> +	if (unlikely(ldata->no_room)) {
> +		/*
> +		 * Barrier here is to ensure to read the latest read_tail in
> +		 * chars_in_buffer() and to make sure that read_tail is not loaded
> +		 * before ldata->no_room is set,

I am not sure I would keep the following part of the comment in the code:

> otherwise, following race may occur:
> +		 * n_tty_receive_buf_common()
> +		 *				n_tty_read()
> +		 *   if (!chars_in_buffer(tty))->false
> +		 *				  copy_from_read_buf()
> +		 *				    read_tail=commit_head
> +		 *				  n_tty_kick_worker()
> +		 *				    if (ldata->no_room)->false
> +		 *   ldata->no_room = 1
> +		 * Then both kworker and reader will fail to kick n_tty_kick_worker(),
> +		 * smp_mb is paired with smp_mb() in n_tty_read().

I would only leave it ^^^ documented in the commit log, as you did.

> +		 */
> +		smp_mb();
> +		if (!chars_in_buffer(tty))
> +			n_tty_kick_worker(tty);
> +	}
> +
>  	up_read(&tty->termios_rwsem);
>
>  	return rcvd;
> @@ -2282,8 +2303,25 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file,
>  		if (time)
>  			timeout = time;
>  	}
> -	if (old_tail != ldata->read_tail)
> +	if (old_tail != ldata->read_tail) {
> +		/*
> +		 * Make sure no_room is not read in n_tty_kick_worker()
> +		 * before setting ldata->read_tail in copy_from_read_buf(),

The same here (it's only repeated). I think the above two lines are enough for the comment. We have git blame after all.

> +		 * otherwise, following race may occur:
> +		 * n_tty_read()
> +		 *				n_tty_receive_buf_common()
> +		 *   n_tty_kick_worker()
> +		 *     if(ldata->no_room)->false
> +		 *				  ldata->no_room = 1
> +		 *				  if (!chars_in_buffer(tty))->false
> +		 *   copy_from_read_buf()
> +		 *     read_tail=commit_head
> +		 * Both reader and kworker will fail to kick tty_buffer_restart_work(),
> +		 * smp_mb is paired with smp_mb() in n_tty_receive_buf_common().
> +		 */
> +		smp_mb();
>  		n_tty_kick_worker(tty);
> +	}
>  	up_read(&tty->termios_rwsem);
>
>  	remove_wait_queue(&tty->read_wait, &wait);
From: Hui Li <caelli@tencent.com>

We hit a hang on a pty device: the reader was blocked in epoll on the master side, the writer was sleeping in wait_woken() inside n_tty_write() on the slave side, and the write buffer on the tty_port was full. The reader and writer were never woken again and stayed blocked forever.

The problem is caused by a race between the reader and the kworker:

n_tty_read(reader):  n_tty_receive_buf_common(kworker):
copy_from_read_buf()|
                    |room = N_TTY_BUF_SIZE - (ldata->read_head - tail)
                    |room <= 0
n_tty_kick_worker() |
                    |ldata->no_room = true

After a write to the slave device, the writer wakes the kworker to flush data from the tty_port to the reader. The kworker finds that the reader has no room to store the data, so room <= 0 holds. At that moment the reader consumes all the data in the read buffer and calls n_tty_kick_worker(), which checks ldata->no_room and finds it false, so the reader stops reading. The kworker then sets ldata->no_room = true and exits as well.

If the write buffer is not full, the writer wakes the kworker again on subsequent writes and the data gets flushed. But if the write buffer is full and the writer goes to sleep, the kworker is never woken again and the tty device stays blocked.

The problem can be solved by checking the read buffer size inside n_tty_receive_buf_common(): if the read buffer is empty and ldata->no_room is true, a call to n_tty_kick_worker() is needed to keep flushing data to the reader.

Cc: stable@vger.kernel.org
Fixes: 42458f41d08f ("n_tty: Ensure reader restarts worker for next reader")
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Hui Li <caelli@tencent.com>
---
Patch changelogs between v1 and v2:
    -add barrier inside n_tty_read and n_tty_receive_buf_common;
    -comment why barrier is needed;
    -access to ldata->no_room is changed with READ_ONCE and WRITE_ONCE;
Patch changelogs between v2 and v3:
    -in function n_tty_receive_buf_common, add unlikely to check
     ldata->no_room, eg: if (unlikely(ldata->no_room)), and READ_ONCE
     is removed here to get locality;
    -change comment for barrier to show the race condition to make
     comment easier to understand;
Patch changelogs between v3 and v4:
    -change subject from 'tty: fix a possible hang on tty device' to
     'tty: fix hang on tty device with no_room set' to make subject
     more obvious;
Patch changelogs between v4 and v5:
    -name is changed from cael to caelli, li is added as the family
     name and caelli is the full name.
Patch changelogs between v5 and v6:
    -change From and Signed-off-by, from 'caelli <juanfengpy@gmail.com>'
     to 'caelli <caelli@tencent.com>', the latter is my corporate address.
Patch changelogs between v6 and v7:
    -change name from caelli to 'Hui Li', which is my name in Chinese.
    -the comment for barrier is improved, and Fixes and Reviewed-by
     tags are added.
Patch changelogs between v7 and v8:
    -Simplify the comments for barriers.

 drivers/tty/n_tty.c | 25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)
diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index c8f56c9b1a1c..4dff2f34e2d0 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -204,8 +204,8 @@ static void n_tty_kick_worker(struct tty_struct *tty)
 	struct n_tty_data *ldata = tty->disc_data;
 
 	/* Did the input worker stop? Restart it */
-	if (unlikely(ldata->no_room)) {
-		ldata->no_room = 0;
+	if (unlikely(READ_ONCE(ldata->no_room))) {
+		WRITE_ONCE(ldata->no_room, 0);
 
 		WARN_RATELIMIT(tty->port->itty == NULL,
 				"scheduling with invalid itty\n");
@@ -1698,7 +1698,7 @@ n_tty_receive_buf_common(struct tty_struct *tty, const unsigned char *cp,
 			if (overflow && room < 0)
 				ldata->read_head--;
 			room = overflow;
-			ldata->no_room = flow && !room;
+			WRITE_ONCE(ldata->no_room, flow && !room);
 		} else
 			overflow = 0;
 
@@ -1729,6 +1729,17 @@ n_tty_receive_buf_common(struct tty_struct *tty, const unsigned char *cp,
 	} else
 		n_tty_check_throttle(tty);
 
+	if (unlikely(ldata->no_room)) {
+		/*
+		 * Barrier here is to ensure to read the latest read_tail in
+		 * chars_in_buffer() and to make sure that read_tail is not loaded
+		 * before ldata->no_room is set.
+		 */
+		smp_mb();
+		if (!chars_in_buffer(tty))
+			n_tty_kick_worker(tty);
+	}
+
 	up_read(&tty->termios_rwsem);
 
 	return rcvd;
@@ -2282,8 +2293,14 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file,
 		if (time)
 			timeout = time;
 	}
-	if (old_tail != ldata->read_tail)
+	if (old_tail != ldata->read_tail) {
+		/*
+		 * Make sure no_room is not read in n_tty_kick_worker()
+		 * before setting ldata->read_tail in copy_from_read_buf().
+		 */
+		smp_mb();
 		n_tty_kick_worker(tty);
+	}
 	up_read(&tty->termios_rwsem);
 
 	remove_wait_queue(&tty->read_wait, &wait);
From: Hui Li <caelli@tencent.com>

It is possible to hang pty devices in this case: the reader was blocked in epoll on the master side, the writer was sleeping in wait_woken() inside n_tty_write() on the slave side, and the write buffer on the tty_port was full. The reader and writer were never woken again and stayed blocked forever.

The problem is caused by a race between the reader and the kworker:

n_tty_read(reader):  n_tty_receive_buf_common(kworker):
copy_from_read_buf()|
                    |room = N_TTY_BUF_SIZE - (ldata->read_head - tail)
                    |room <= 0
n_tty_kick_worker() |
                    |ldata->no_room = true

After a write to the slave device, the writer wakes the kworker to flush data from the tty_port to the reader. The kworker finds that the reader has no room to store the data, so room <= 0 holds. At that moment the reader consumes all the data in the read buffer and calls n_tty_kick_worker(), which checks ldata->no_room and finds it false, so the reader stops reading. The kworker then sets ldata->no_room = true and exits as well.

If the write buffer is not full, the writer wakes the kworker again on subsequent writes and the data gets flushed. But if the write buffer is full and the writer goes to sleep, the kworker is never woken again and the tty device stays blocked.

The problem can be solved by checking the read buffer size inside n_tty_receive_buf_common(): if the read buffer is empty and ldata->no_room is true, a call to n_tty_kick_worker() is needed to keep flushing data to the reader.

Cc: stable@vger.kernel.org
Fixes: 42458f41d08f ("n_tty: Ensure reader restarts worker for next reader")
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Hui Li <caelli@tencent.com>
---
Patch changelogs between v1 and v2:
    -add barrier inside n_tty_read and n_tty_receive_buf_common;
    -comment why barrier is needed;
    -access to ldata->no_room is changed with READ_ONCE and WRITE_ONCE;
Patch changelogs between v2 and v3:
    -in function n_tty_receive_buf_common, add unlikely to check
     ldata->no_room, eg: if (unlikely(ldata->no_room)), and READ_ONCE
     is removed here to get locality;
    -change comment for barrier to show the race condition to make
     comment easier to understand;
Patch changelogs between v3 and v4:
    -change subject from 'tty: fix a possible hang on tty device' to
     'tty: fix hang on tty device with no_room set' to make subject
     more obvious;
Patch changelogs between v4 and v5:
    -name is changed from cael to caelli, li is added as the family
     name and caelli is the full name.
Patch changelogs between v5 and v6:
    -change From and Signed-off-by, from 'caelli <juanfengpy@gmail.com>'
     to 'caelli <caelli@tencent.com>', the latter is my corporate address.
Patch changelogs between v6 and v7:
    -change name from caelli to 'Hui Li', which is my name in Chinese.
    -the comment for barrier is improved, and Fixes and Reviewed-by
     tags are added.
Patch changelogs between v7 and v8:
    -Simplify the comments for barriers.
Patch changelogs between v8 and v9:
    -change the commit messages as suggested by Bagas Sanjaya.

 drivers/tty/n_tty.c | 25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)
diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index c8f56c9b1a1c..4dff2f34e2d0 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -204,8 +204,8 @@ static void n_tty_kick_worker(struct tty_struct *tty)
 	struct n_tty_data *ldata = tty->disc_data;
 
 	/* Did the input worker stop? Restart it */
-	if (unlikely(ldata->no_room)) {
-		ldata->no_room = 0;
+	if (unlikely(READ_ONCE(ldata->no_room))) {
+		WRITE_ONCE(ldata->no_room, 0);
 
 		WARN_RATELIMIT(tty->port->itty == NULL,
 				"scheduling with invalid itty\n");
@@ -1698,7 +1698,7 @@ n_tty_receive_buf_common(struct tty_struct *tty, const unsigned char *cp,
 			if (overflow && room < 0)
 				ldata->read_head--;
 			room = overflow;
-			ldata->no_room = flow && !room;
+			WRITE_ONCE(ldata->no_room, flow && !room);
 		} else
 			overflow = 0;
 
@@ -1729,6 +1729,17 @@ n_tty_receive_buf_common(struct tty_struct *tty, const unsigned char *cp,
 	} else
 		n_tty_check_throttle(tty);
 
+	if (unlikely(ldata->no_room)) {
+		/*
+		 * Barrier here is to ensure to read the latest read_tail in
+		 * chars_in_buffer() and to make sure that read_tail is not loaded
+		 * before ldata->no_room is set.
+		 */
+		smp_mb();
+		if (!chars_in_buffer(tty))
+			n_tty_kick_worker(tty);
+	}
+
 	up_read(&tty->termios_rwsem);
 
 	return rcvd;
@@ -2282,8 +2293,14 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file,
 		if (time)
 			timeout = time;
 	}
-	if (old_tail != ldata->read_tail)
+	if (old_tail != ldata->read_tail) {
+		/*
+		 * Make sure no_room is not read in n_tty_kick_worker()
+		 * before setting ldata->read_tail in copy_from_read_buf().
+		 */
+		smp_mb();
 		n_tty_kick_worker(tty);
+	}
 	up_read(&tty->termios_rwsem);
 
 	remove_wait_queue(&tty->read_wait, &wait);
This is a note to let you know that I've just added the patch titled
tty: fix hang on tty device with no_room set
to my tty git tree which can be found at git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git in the tty-testing branch.
The patch will show up in the next release of the linux-next tree (usually sometime within the next 24 hours during the week.)
The patch will be merged to the tty-next branch sometime soon, after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
From 4903fde8047a28299d1fc79c1a0dcc255e928f12 Mon Sep 17 00:00:00 2001
From: Hui Li <caelli@tencent.com>
Date: Thu, 6 Apr 2023 10:44:50 +0800
Subject: tty: fix hang on tty device with no_room set
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

It is possible to hang pty devices in this case: the reader was blocked in epoll on the master side, the writer was sleeping in wait_woken() inside n_tty_write() on the slave side, and the write buffer on the tty_port was full. The reader and writer were never woken again and stayed blocked forever.

The problem is caused by a race between the reader and the kworker:

n_tty_read(reader):  n_tty_receive_buf_common(kworker):
copy_from_read_buf()|
                    |room = N_TTY_BUF_SIZE - (ldata->read_head - tail)
                    |room <= 0
n_tty_kick_worker() |
                    |ldata->no_room = true

After a write to the slave device, the writer wakes the kworker to flush data from the tty_port to the reader. The kworker finds that the reader has no room to store the data, so room <= 0 holds. At that moment the reader consumes all the data in the read buffer and calls n_tty_kick_worker(), which checks ldata->no_room and finds it false, so the reader stops reading. The kworker then sets ldata->no_room = true and exits as well.

If the write buffer is not full, the writer wakes the kworker again on subsequent writes and the data gets flushed. But if the write buffer is full and the writer goes to sleep, the kworker is never woken again and the tty device stays blocked.

The problem can be solved by checking the read buffer size inside n_tty_receive_buf_common(): if the read buffer is empty and ldata->no_room is true, a call to n_tty_kick_worker() is needed to keep flushing data to the reader.

Cc: stable@vger.kernel.org
Fixes: 42458f41d08f ("n_tty: Ensure reader restarts worker for next reader")
Reviewed-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Hui Li <caelli@tencent.com>
Message-ID: <1680749090-14106-1-git-send-email-caelli@tencent.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
---
 drivers/tty/n_tty.c | 25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)
diff --git a/drivers/tty/n_tty.c b/drivers/tty/n_tty.c
index 1c9e5d2ea7de..552e8a741562 100644
--- a/drivers/tty/n_tty.c
+++ b/drivers/tty/n_tty.c
@@ -203,8 +203,8 @@ static void n_tty_kick_worker(struct tty_struct *tty)
 	struct n_tty_data *ldata = tty->disc_data;
 
 	/* Did the input worker stop? Restart it */
-	if (unlikely(ldata->no_room)) {
-		ldata->no_room = 0;
+	if (unlikely(READ_ONCE(ldata->no_room))) {
+		WRITE_ONCE(ldata->no_room, 0);
 
 		WARN_RATELIMIT(tty->port->itty == NULL,
 				"scheduling with invalid itty\n");
@@ -1697,7 +1697,7 @@ n_tty_receive_buf_common(struct tty_struct *tty, const unsigned char *cp,
 			if (overflow && room < 0)
 				ldata->read_head--;
 			room = overflow;
-			ldata->no_room = flow && !room;
+			WRITE_ONCE(ldata->no_room, flow && !room);
 		} else
 			overflow = 0;
 
@@ -1728,6 +1728,17 @@ n_tty_receive_buf_common(struct tty_struct *tty, const unsigned char *cp,
 	} else
 		n_tty_check_throttle(tty);
 
+	if (unlikely(ldata->no_room)) {
+		/*
+		 * Barrier here is to ensure to read the latest read_tail in
+		 * chars_in_buffer() and to make sure that read_tail is not loaded
+		 * before ldata->no_room is set.
+		 */
+		smp_mb();
+		if (!chars_in_buffer(tty))
+			n_tty_kick_worker(tty);
+	}
+
 	up_read(&tty->termios_rwsem);
 
 	return rcvd;
@@ -2281,8 +2292,14 @@ static ssize_t n_tty_read(struct tty_struct *tty, struct file *file,
 		if (time)
 			timeout = time;
 	}
-	if (old_tail != ldata->read_tail)
+	if (old_tail != ldata->read_tail) {
+		/*
+		 * Make sure no_room is not read in n_tty_kick_worker()
+		 * before setting ldata->read_tail in copy_from_read_buf().
+		 */
+		smp_mb();
 		n_tty_kick_worker(tty);
+	}
 	up_read(&tty->termios_rwsem);
 
 	remove_wait_queue(&tty->read_wait, &wait);
This is a note to let you know that I've just added the patch titled
tty: fix hang on tty device with no_room set
to my tty git tree which can be found at git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git in the tty-next branch.
The patch will show up in the next release of the linux-next tree (usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release during the merge window.
If you have any questions about this process, please let me know.