Hi all,
Three fixes that worth to have in the @stable, as they were hit by different people, including Arista on v4.9 stable.
And for linux-next - adding lockdep asserts for line discipline changing code, verifying that write ldisc sem will be held forthwith.
The last patch is an optional and probably, timeout can be dropped for read_lock(). I'll do it if everyone agrees. (Or as per discussion with Peter in v3, just convert ldisc to a regular rwsem).
Thanks, Dima
Changes since v3: - Added tested-by Mark Rutland (thanks!) - Dropped patch with smp_wmb() - wrong idea - lockdep_assert_held() should be actually lockdep_assert_held_exclusive() - Described why tty_ldisc_open() can be called without ldisc_sem held for pty slave end (o_tty). - Added Peter's patch for dropping self-made lockdep annotations - Fix for a reader(s) of ldisc semaphore waiting for an active reader(s)
Changes since v2: - Added reviewed-by tags - Hopefully, fixed reported by 0-day issue. - Added optional fix for wait_readers decrement
Changes since v1: - Added tested-by/reported-by tags - Dropped 3/4 (locking tty pair for lockdep sake), Because of that - not adding lockdep_assert_held() in tty_ldisc_open() - Added 4/4 cleanup to inc tty->count only on success of tty_ldisc_reinit() - lock ldisc without (5*HZ) timeout in tty_reopen()
v1 link: lkml.kernel.org/r/20180829022353.23568-1-dima@arista.com v2 link: lkml.kernel.org/r/20180903165257.29227-1-dima@arista.com v3 link: lkml.kernel.org/r/20180911014821.26286-1-dima@arista.com
Cc: Daniel Axtens dja@axtens.net Cc: Dmitry Vyukov dvyukov@google.com Cc: Mark Rutland mark.rutland@arm.com Cc: Michael Neuling mikey@neuling.org Cc: Mikulas Patocka mpatocka@redhat.com Cc: Nathan March nathan@gt.net Cc: Pasi Kärkkäinen pasik@iki.fi Cc: Peter Hurley peter@hurleysoftware.com Cc: Peter Zijlstra peterz@infradead.org Cc: "Rong, Chen" rong.a.chen@intel.com Cc: Sergey Senozhatsky sergey.senozhatsky.work@gmail.com Cc: Tan Xiaojun tanxiaojun@huawei.com Cc: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp (please, ignore if I Cc'ed you mistakenly)
Dmitry Safonov (6): tty: Drop tty->count on tty_reopen() failure tty: Hold tty_ldisc_lock() during tty_reopen() tty/ldsem: Wake up readers after timed out down_write() tty: Simplify tty->count math in tty_reopen() tty/ldsem: Add lockdep asserts for ldisc_sem tty/ldsem: Decrement wait_readers on timeouted down_read()
Peter Zijlstra (1): tty/ldsem: Convert to regular lockdep annotations
drivers/tty/tty_io.c | 12 ++++++---- drivers/tty/tty_ldisc.c | 9 +++++++ drivers/tty/tty_ldsem.c | 62 ++++++++++++++++++++----------------------------- 3 files changed, 42 insertions(+), 41 deletions(-)
In case of tty_ldisc_reinit() failure, tty->count should be decremented back, otherwise we will never release_tty(). Tetsuo reported that it fixes noisy warnings on tty release like: pts pts4033: tty_release: tty->count(10529) != (#fd's(7) + #kopen's(0))
Fixes: commit 892d1fa7eaae ("tty: Destroy ldisc instance on hangup")
Cc: stable@vger.kernel.org # v4.6+ Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: Jiri Slaby jslaby@suse.com Reviewed-by: Jiri Slaby jslaby@suse.cz Tested-by: Jiri Slaby jslaby@suse.com Tested-by: Mark Rutland mark.rutland@arm.com Tested-by: Tetsuo Handa penguin-kernel@i-love.sakura.ne.jp Signed-off-by: Dmitry Safonov dima@arista.com --- drivers/tty/tty_io.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-)
diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c index 32bc3e3fe4d3..5e5da9acaf0a 100644 --- a/drivers/tty/tty_io.c +++ b/drivers/tty/tty_io.c @@ -1255,6 +1255,7 @@ static void tty_driver_remove_tty(struct tty_driver *driver, struct tty_struct * static int tty_reopen(struct tty_struct *tty) { struct tty_driver *driver = tty->driver; + int retval;
if (driver->type == TTY_DRIVER_TYPE_PTY && driver->subtype == PTY_TYPE_MASTER) @@ -1268,10 +1269,14 @@ static int tty_reopen(struct tty_struct *tty)
tty->count++;
- if (!tty->ldisc) - return tty_ldisc_reinit(tty, tty->termios.c_line); + if (tty->ldisc) + return 0;
- return 0; + retval = tty_ldisc_reinit(tty, tty->termios.c_line); + if (retval) + tty->count--; + + return retval; }
/**
tty_ldisc_reinit() doesn't race with neither tty_ldisc_hangup() nor set_ldisc() nor tty_ldisc_release() as they use tty lock. But it races with anyone who expects line discipline to be the same after hoding read semaphore in tty_ldisc_ref().
We've seen the following crash on v4.9.108 stable:
BUG: unable to handle kernel paging request at 0000000000002260 IP: [..] n_tty_receive_buf_common+0x5f/0x86d Workqueue: events_unbound flush_to_ldisc Call Trace: [..] n_tty_receive_buf2 [..] tty_ldisc_receive_buf [..] flush_to_ldisc [..] process_one_work [..] worker_thread [..] kthread [..] ret_from_fork
tty_ldisc_reinit() should be called with ldisc_sem hold for writing, which will protect any reader against line discipline changes.
Backport-first: b027e2298bd5 ("tty: fix data race between tty_init_dev and flush of buf") Cc: stable@vger.kernel.org
Cc: Greg Kroah-Hartman gregkh@linuxfoundation.org Cc: Jiri Slaby jslaby@suse.com Reviewed-by: Jiri Slaby jslaby@suse.cz Reported-by: syzbot+3aa9784721dfb90e984d@syzkaller.appspotmail.com Tested-by: Mark Rutland mark.rutland@arm.com Tested-by: Tetsuo Handa penguin-kernel@i-love.sakura.ne.jp Signed-off-by: Dmitry Safonov dima@arista.com --- drivers/tty/tty_io.c | 9 ++++++--- 1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c index 5e5da9acaf0a..a947719b4626 100644 --- a/drivers/tty/tty_io.c +++ b/drivers/tty/tty_io.c @@ -1255,7 +1255,7 @@ static void tty_driver_remove_tty(struct tty_driver *driver, struct tty_struct * static int tty_reopen(struct tty_struct *tty) { struct tty_driver *driver = tty->driver; - int retval; + int retval = 0;
if (driver->type == TTY_DRIVER_TYPE_PTY && driver->subtype == PTY_TYPE_MASTER) @@ -1267,15 +1267,18 @@ static int tty_reopen(struct tty_struct *tty) if (test_bit(TTY_EXCLUSIVE, &tty->flags) && !capable(CAP_SYS_ADMIN)) return -EBUSY;
- tty->count++; + tty_ldisc_lock(tty, MAX_SCHEDULE_TIMEOUT);
+ tty->count++; if (tty->ldisc) - return 0; + goto out_unlock;
retval = tty_ldisc_reinit(tty, tty->termios.c_line); if (retval) tty->count--;
+out_unlock: + tty_ldisc_unlock(tty); return retval; }
FYI, we noticed the following commit (built with gcc-4.9):
commit: f26eb68a52bb7302d220133440c1438143f2c7e1 ("[PATCHv4 2/7] tty: Hold tty_ldisc_lock() during tty_reopen()") url: https://github.com/0day-ci/linux/commits/Dmitry-Safonov/tty-Hold-write-ldisc... base: https://git.kernel.org/cgit/linux/kernel/git/gregkh/tty.git tty-testing
in testcase: locktorture with following parameters:
runtime: 300s test: cpuhotplug
test-description: This torture test consists of creating a number of kernel threads which acquire the lock and hold it for specific amount of time, thus simulating different critical region behaviors. test-url: https://www.kernel.org/doc/Documentation/locking/locktorture.txt
on test machine: qemu-system-i386 -enable-kvm -m 256M
caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
+----------------------------------------------------+------------+------------+ | | 59ce4b9e1e | f26eb68a52 | +----------------------------------------------------+------------+------------+ | boot_successes | 0 | 0 | | boot_failures | 12 | 14 | | INFO:trying_to_register_non-static_key | 12 | 14 | | WARNING:at_mm/page_alloc.c:#__alloc_pages_nodemask | 12 | 14 | | EIP:__alloc_pages_nodemask | 12 | 14 | | Mem-Info | 12 | 14 | | INFO:task_blocked_for_more_than#seconds | 0 | 8 | | Kernel_panic-not_syncing:hung_task:blocked_tasks | 0 | 8 | +----------------------------------------------------+------------+------------+
[ 245.080051] INFO: task lkp-setup-rootf:500 blocked for more than 120 seconds. [ 245.082850] Tainted: G W T 4.19.0-rc3-00014-gf26eb68 #1 [ 245.084673] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 245.086455] lkp-setup-rootf D 6516 500 498 0x00000000 [ 245.087766] Call Trace: [ 245.088391] __schedule+0x4b1/0xe80 [ 245.089322] schedule+0x25/0x60 [ 245.090178] schedule_timeout+0x275/0x4e0 [ 245.091105] ? __ldsem_down_write_nested+0x1aa/0x300 [ 245.092217] __ldsem_down_write_nested+0x1b2/0x300 [ 245.093286] ? ldsem_down_write+0x2a/0x2e [ 245.094305] ldsem_down_write+0x2a/0x2e [ 245.095211] tty_ldisc_lock+0x16/0x40 [ 245.096104] tty_reopen+0x3f/0xc0 [ 245.096892] tty_open+0x3cf/0x450 [ 245.097718] chrdev_open+0x9c/0x1f0 [ 245.098511] ? security_file_open+0x99/0xa0 [ 245.099512] do_dentry_open+0x3d5/0x5d0 [ 245.100428] ? inode_permission+0x95/0x180 [ 245.101351] ? cdev_put+0x30/0x30 [ 245.102138] vfs_open+0x2d/0x30 [ 245.102898] path_openat+0x9e1/0x17d0 [ 245.103733] do_filp_open+0x6a/0x120 [ 245.104541] ? _raw_spin_unlock+0x22/0x30 [ 245.105439] ? __alloc_fd+0xa5/0x170 [ 245.106309] do_sys_open+0x13a/0x250 [ 245.107118] sys_open+0x22/0x30 [ 245.107980] do_int80_syscall_32+0x94/0x1c0 [ 245.108982] entry_INT80_32+0xf3/0xf3 [ 245.109865] EIP: 0x6fe9c305 [ 245.110672] Code: Bad RIP value. [ 245.111607] EAX: ffffffda EBX: 0906aa54 ECX: 00008241 EDX: 000001b6 [ 245.113034] ESI: 00000241 EDI: 000001b6 EBP: 778ae6f4 ESP: 778ae5ec [ 245.114594] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 007b EFLAGS: 00000286 [ 245.116119] INFO: task jshn:542 blocked for more than 120 seconds. [ 245.117633] Tainted: G W T 4.19.0-rc3-00014-gf26eb68 #1 [ 245.119174] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 245.120929] jshn D 6708 542 541 0x00000000 [ 245.122290] Call Trace: [ 245.122871] ? __schedule+0x4b1/0xe80 [ 245.123749] ? schedule+0x25/0x60 [ 245.124554] ? __mutex_unlock_slowpath+0xbe/0x3b0 [ 245.125670] ? schedule_timeout+0x275/0x4e0 [ 245.126623] ? __ldsem_down_read_nested+0x137/0x2c0 [ 245.127726] ? __ldsem_down_read_nested+0x14b/0x2c0 [ 245.129035] ? ldsem_down_read+0x2a/0x30 [ 245.129943] ? uart_write+0x220/0x220 [ 245.130875] ? ldsem_down_read+0x2a/0x30 [ 245.131729] ? tty_ldisc_ref_wait+0x1e/0x40 [ 245.132637] ? tty_ioctl+0xe2/0x970 [ 245.133451] ? do_raw_spin_unlock+0xb2/0x130 [ 245.134501] ? _raw_spin_unlock+0x22/0x30 [ 245.135428] ? handle_mm_fault+0x16d1/0x1e80 [ 245.136368] ? send_break+0x110/0x110 [ 245.137240] ? do_vfs_ioctl+0x84/0x9f0 [ 245.138181] ? __do_page_fault+0x40c/0x840 [ 245.139146] ? up_read+0x67/0xd0 [ 245.139870] ? __fget_light+0x81/0x90 [ 245.140672] ? security_file_ioctl+0xe/0x50 [ 245.141676] ? ksys_ioctl+0x58/0x80 [ 245.142436] ? sys_ioctl+0x16/0x20 [ 245.143195] ? do_int80_syscall_32+0x94/0x1c0 [ 245.144160] ? entry_INT80_32+0xf3/0xf3 [ 245.145021] INFO: task dnsmasq:555 blocked for more than 120 seconds. [ 245.146852] Tainted: G W T 4.19.0-rc3-00014-gf26eb68 #1 [ 245.148826] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 245.151131] dnsmasq D 6772 555 1 0x00000000 [ 245.152449] Call Trace: [ 245.153009] __schedule+0x4b1/0xe80 [ 245.153808] schedule+0x25/0x60 [ 245.154561] ? __mutex_unlock_slowpath+0xbe/0x3b0 [ 245.155609] schedule_timeout+0x275/0x4e0 [ 245.156560] ? __ldsem_down_read_nested+0x137/0x2c0 [ 245.157693] __ldsem_down_read_nested+0x14b/0x2c0 [ 245.159001] ? ldsem_down_read+0x2a/0x30 [ 245.159885] ? uart_write+0x220/0x220 [ 245.160702] ldsem_down_read+0x2a/0x30 [ 245.161522] tty_ldisc_ref_wait+0x1e/0x40 [ 245.162581] tty_ioctl+0xe2/0x970 [ 245.163314] ? do_raw_spin_unlock+0xb2/0x130 [ 245.164286] ? _raw_spin_unlock+0x22/0x30 [ 245.165257] ? handle_mm_fault+0x16d1/0x1e80 [ 245.166260] ? send_break+0x110/0x110 [ 245.167082] do_vfs_ioctl+0x84/0x9f0 [ 245.167876] ? __do_page_fault+0x40c/0x840 [ 245.169051] ? up_read+0x67/0xd0 [ 245.169829] ? __fget_light+0x81/0x90 [ 245.170637] ? security_file_ioctl+0xe/0x50 [ 245.171570] ksys_ioctl+0x58/0x80 [ 245.172336] sys_ioctl+0x16/0x20 [ 245.173139] do_int80_syscall_32+0x94/0x1c0 [ 245.174104] entry_INT80_32+0xf3/0xf3 [ 245.175017] EIP: 0x6ff64648 [ 245.175657] Code: c0 74 1a 8d 83 ce e8 fe ff 89 44 24 04 89 3c 24 e8 21 9d ff ff 85 c0 0f 85 93 00 00 00 8b 83 68 44 00 00 85 c0 74 53 48 89 83 <68> 44 00 00 8b 93 5c 44 00 00 8b 04 82 89 04 24 e8 53 95 fd ff eb [ 245.180346] EAX: ffffffda EBX: 00000000 ECX: 00005401 EDX: 77a6c704 [ 245.181929] ESI: 77a6c704 EDI: 77a6c73c EBP: 77a6c9a8 ESP: 77a6c6e4 [ 245.183279] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 007b EFLAGS: 00000246 [ 245.184814] INFO: task dropbear:667 blocked for more than 120 seconds. [ 245.186388] Tainted: G W T 4.19.0-rc3-00014-gf26eb68 #1 [ 245.187992] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 245.189851] dropbear D 6772 667 1 0x00000000 [ 245.191265] Call Trace: [ 245.191865] __schedule+0x4b1/0xe80 [ 245.192643] schedule+0x25/0x60 [ 245.193355] ? __mutex_unlock_slowpath+0xbe/0x3b0 [ 245.194554] schedule_timeout+0x275/0x4e0 [ 245.195574] ? __ldsem_down_read_nested+0x137/0x2c0 [ 245.196694] __ldsem_down_read_nested+0x14b/0x2c0 [ 245.197744] ? ldsem_down_read+0x2a/0x30 [ 245.198686] ? uart_write+0x220/0x220 [ 245.199516] ldsem_down_read+0x2a/0x30 [ 245.200321] tty_ldisc_ref_wait+0x1e/0x40 [ 245.201219] tty_ioctl+0xe2/0x970 [ 245.202012] ? do_raw_spin_unlock+0xb2/0x130 [ 245.202937] ? _raw_spin_unlock+0x22/0x30 [ 245.203891] ? handle_mm_fault+0x16d1/0x1e80 [ 245.204884] ? send_break+0x110/0x110 [ 245.205834] do_vfs_ioctl+0x84/0x9f0 [ 245.206729] ? __do_page_fault+0x40c/0x840 [ 245.207752] ? up_read+0x67/0xd0 [ 245.208526] ? __fget_light+0x81/0x90 [ 245.209440] ? security_file_ioctl+0xe/0x50 [ 245.210531] ksys_ioctl+0x58/0x80 [ 245.211285] sys_ioctl+0x16/0x20 [ 245.212182] do_int80_syscall_32+0x94/0x1c0 [ 245.213163] entry_INT80_32+0xf3/0xf3 [ 245.214146] EIP: 0x6ff5f648 [ 245.214894] Code: 39 89 3e 4a 83 c1 04 83 c6 04 eb ef 85 d2 74 09 4a 8b 34 91 89 34 90 eb f3 5e 5f c3 56 8b 44 24 08 8b 4c 24 0c 8b 54 24 10 85 <d2> 74 0d 8b 31 89 30 4a 83 c1 04 83 c0 04 eb ef 5e c3 56 8b 44 24 [ 245.219039] EAX: ffffffda EBX: 00000000 ECX: 00005401 EDX: 77b3c984 [ 245.220389] ESI: 77b3c984 EDI: 77b3c9bc EBP: 77b3cc28 ESP: 77b3c964 [ 245.222023] DS: 007b ES: 007b FS: 0000 GS: 0000 SS: 007b EFLAGS: 00000246 [ 245.223596] INFO: lockdep is turned off. [ 245.224554] NMI backtrace for cpu 0 [ 245.225318] CPU: 0 PID: 18 Comm: khungtaskd Tainted: G W T 4.19.0-rc3-00014-gf26eb68 #1 [ 245.227454] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 [ 245.229272] Call Trace: [ 245.229879] ? dump_stack+0x76/0xb6 [ 245.230725] ? nmi_cpu_backtrace+0x9e/0xa0 [ 245.231630] ? nmi_trigger_cpumask_backtrace+0xaa/0xf0 [ 245.232783] ? debug_show_all_locks+0x112/0x120 [ 245.233863] ? watchdog+0x308/0x530 [ 245.234744] ? lapic_can_unplug_cpu+0xa0/0xa0 [ 245.235811] ? arch_trigger_cpumask_backtrace+0x15/0x20 [ 245.237103] ? watchdog+0x349/0x530 [ 245.237891] ? kthread+0x10f/0x140 [ 245.238634] ? reset_hung_task_detector+0x20/0x20 [ 245.239777] ? kthread_stop+0x210/0x210 [ 245.240642] ? ret_from_fork+0x2e/0x38 [ 245.241586] Kernel panic - not syncing: hung_task: blocked tasks [ 245.242902] CPU: 0 PID: 18 Comm: khungtaskd Tainted: G W T 4.19.0-rc3-00014-gf26eb68 #1 [ 245.244941] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014 [ 245.246929] Call Trace: [ 245.247486] ? dump_stack+0x76/0xb6 [ 245.248246] ? panic+0x97/0x1e0 [ 245.248991] ? watchdog+0x355/0x530 [ 245.249798] ? kthread+0x10f/0x140 [ 245.250598] ? reset_hung_task_detector+0x20/0x20 [ 245.251716] ? kthread_stop+0x210/0x210 [ 245.252617] ? ret_from_fork+0x2e/0x38 [ 245.253582] Kernel Offset: disabled [ 245.254385] Rebooting in 3 seconds..
Elapsed time: 240
#!/bin/bash
# To reproduce, # 1) save job-script and this script (both are attached in 0day report email) # 2) run this script with your compiled kernel and optional env $INSTALL_MOD_PATH
kernel=$1
initrds=( /osimage/openwrt/openwrt-i386-2016-03-16.cgz /lkp/lkp/lkp-i386.cgz )
HTTP_PREFIX=https://download.01.org/0day-ci/lkp-qemu wget --timestamping "${initrds[@]/#/$HTTP_PREFIX}"
{ cat "${initrds[@]//*//}" [[ $INSTALL_MOD_PATH ]] && ( cd "$INSTALL_MOD_PATH" find lib | cpio -o -H newc --quiet | gzip ) echo job-script | cpio -o -H newc --quiet | gzip } > initrd.img
qemu-img create -f qcow2 disk-vm-lkp-hsw-4ep1-openwrt-i386-14-0 256G qemu-img create -f qcow2 disk-vm-lkp-hsw-4ep1-openwrt-i386-14-1 256G
kvm=( qemu-system-i386 -enable-kvm -kernel $kernel -initrd initrd.img -m 256 -smp 1 -device e1000,netdev=net0 -netdev user,id=net0 -boot order=nc -no-reboot -watchdog i6300esb -watchdog-action debug -rtc base=localtime -drive file=disk-vm-lkp-hsw-4ep1-openwrt-i386-14-0,media=disk,if=virtio -drive file=disk-vm-lkp-hsw-4ep1-openwrt-i386-14-1,media=disk,if=virtio -serial stdio -display none -monitor null )
append=( ip=::::vm-lkp-hsw-4ep1-openwrt-i386-14::dhcp root=/dev/ram0 user=lkp job=/job-script ARCH=i386 kconfig=i386-randconfig-sb0-09131136 branch=linux-devel/devel-spot-201809131102 commit=f26eb68a52bb7302d220133440c1438143f2c7e1 BOOT_IMAGE=/pkg/linux/i386-randconfig-sb0-09131136/gcc-4.9/f26eb68a52bb7302d220133440c1438143f2c7e1/vmlinuz-4.19.0-rc3-00014-gf26eb68 max_uptime=1500 RESULT_ROOT=/result/locktorture/300s-cpuhotplug/vm-lkp-hsw-4ep1-openwrt-i386/openwrt-i386-2016-03-16.cgz/i386-randconfig-sb0-09131136/gcc-4.9/f26eb68a52bb7302d220133440c1438143f2c7e1/3 result_service=tmpfs debug apic=debug sysrq_always_enabled rcupdate.rcu_cpu_stall_timeout=100 net.ifnames=0 printk.devkmsg=on
To reproduce:
git clone https://github.com/intel/lkp-tests.git cd lkp-tests find lib/ | cpio -o -H newc --quiet | gzip > modules.cgz bin/lkp qemu -k <bzImage> -m modules.cgz job-script # job-script is attached in this email
Thanks, Rong, Chen
On Fri, 2018-09-14 at 17:01 +0800, kernel test robot wrote:
FYI, we noticed the following commit (built with gcc-4.9):
Thanks,
[..]
[ 245.080051] INFO: task lkp-setup-rootf:500 blocked for more than 120 seconds. [ 245.082850] Tainted: G W T 4.19.0-rc3-00014- gf26eb68 #1 [ 245.084673] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [ 245.086455] lkp-setup-rootf D 6516 500 498 0x00000000 [ 245.087766] Call Trace: [ 245.088391] __schedule+0x4b1/0xe80 [ 245.089322] schedule+0x25/0x60 [ 245.090178] schedule_timeout+0x275/0x4e0 [ 245.091105] ? __ldsem_down_write_nested+0x1aa/0x300 [ 245.092217] __ldsem_down_write_nested+0x1b2/0x300 [ 245.093286] ? ldsem_down_write+0x2a/0x2e [ 245.094305] ldsem_down_write+0x2a/0x2e [ 245.095211] tty_ldisc_lock+0x16/0x40 [ 245.096104] tty_reopen+0x3f/0xc0 [ 245.096892] tty_open+0x3cf/0x450 [ 245.097718] chrdev_open+0x9c/0x1f0 [ 245.098511] ? security_file_open+0x99/0xa0 [ 245.099512] do_dentry_open+0x3d5/0x5d0 [ 245.100428] ? inode_permission+0x95/0x180 [ 245.101351] ? cdev_put+0x30/0x30 [ 245.102138] vfs_open+0x2d/0x30 [ 245.102898] path_openat+0x9e1/0x17d0 [ 245.103733] do_filp_open+0x6a/0x120 [ 245.104541] ? _raw_spin_unlock+0x22/0x30 [ 245.105439] ? __alloc_fd+0xa5/0x170 [ 245.106309] do_sys_open+0x13a/0x250 [ 245.107118] sys_open+0x22/0x30 [ 245.107980] do_int80_syscall_32+0x94/0x1c0 [ 245.108982] entry_INT80_32+0xf3/0xf3
So, unfortunately, I'll still have to set timeout to 5 sec for tty_reopen(). The better fix would be to release read lock whenever a writer is pending (say n_tty_receive_buf_common() can fallback), but that's probably better to put in another patches set, as this one is already packed. Annoyingly, it means that open(tty) will have now more chances to fail then before.. But still it's better than a panic with null-ptr deref.
linux-stable-mirror@lists.linaro.org