Hi Peter,
On 20 August 2013 08:05, Peter Hurley peter@hurleysoftware.com wrote:
The key to why the flush_to_disc() worker thread cannot acquire the uart_port spinlock is in the first stack trace:
s3c24xx_serial_rx_chars raw_spin_unlock_irqrestore do_raw_spin_unlock dump_stack
So the real question here is why do_raw_spin_unlock() is doing a dump_stack()? Is one of the SPIN_BUG_ON() asserts in debug_spin_unlock() triggering?
Ahh, you are right and SPIN_BUG_ON() is indeed getting asserted.
I am embarrassed to say this but it happened due to my stupid mistake :(
And yes, non-rt kernels doesn't see this crash as we simply schedule work from our interrupt handlers instead of calling flush_to_ldisc()
This is how I got the wrong crash dump: - Applied my patch over latest linux-next - Some more debug messages
Tested: No crashes found..
Thought maybe I should try without my patch first, if it really crashes or not.
- Reverted my patch and got some conflicts due the "debug patch".. and ended up resolving them in the wrong way:
- spin_unlock_irqrestore(&port->lock, flags); pr_info("%s\n", __func__); tty_flip_buffer_push(&port->state->port); + spin_unlock_irqrestore(&port->lock, flags);
out: + spin_unlock_irqrestore(&port->lock, flags);
Got two unlocks and so the dump.. Sorry for the incorrect information :(
But yes there are some issues with the RT kernel, which are going to stay without this patchset.. Here are the logs with 3.10.6 + rt kernel (attached as well, in case lines are wrapped by gmail :( ):
root@localhost:~# [ 128.771800] ------------[ cut here ]------------ [ 128.771815] kernel BUG at /home/arm/work/kernel/linaro/lng/lng.git/kernel/rtmutex.c:738! [ 128.771829] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP THUMB2 [ 128.771844] Modules linked in: [ 128.771868] CPU: 0 PID: 1895 Comm: irq/85-12c20000 Not tainted 3.10.6-rt3+ #7 [ 128.771885] task: ef2d9cc0 ti: ee2b0000 task.ti: ee2b0000 [ 128.771915] PC is at rt_spin_lock_slowlock+0x18c/0x1a0 [ 128.771933] LR is at rt_spin_lock_slowlock+0x3f/0x1a0 [ 128.771956] pc : [<80406b9c>] lr : [<80406a4f>] psr: 60070033 [ 128.771956] sp : ee2b1d68 ip : 00000000 fp : 00000000 [ 128.771968] r10: ee6e641e r9 : ef2d9cc0 r8 : 00000000 [ 128.771982] r7 : ee6e7800 r6 : 00000000 r5 : ee2b0000 r4 : 806a9ccc [ 128.771995] r3 : ef2d9cc0 r2 : 00000000 r1 : ef2d9cc0 r0 : 00000000 [ 128.772011] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA Thumb Segment kernel [ 128.772026] Control: 50c5387d Table: aeac406a DAC: 00000015 [ 128.772041] Process irq/85-12c20000 (pid: 1895, stack limit = 0xee2b0238) [ 128.772054] Stack: (0xee2b1d68 to 0xee2b2000) [ 128.772080] 1d60: 00000000 00000000 00000000 00000000 00000400 80654380 [ 128.772105] 1d80: ef07d800 00000002 ef07d800 81767800 00000002 00000000 00000001 80043101 [ 128.772128] 1da0: 00000002 806a9ccc eea47400 0000002e ee6e7800 00000000 ee6e7800 ee6e641e [ 128.772150] 1dc0: 00000000 80288639 8028864d eea47400 ee6e651d 80278af5 81767380 800451c9 [ 128.772173] 1de0: 00000064 ef2d9e18 81767380 80654380 ef2d9cc0 80406a4f ef2d9e18 80037a0d [ 128.772195] 1e00: eea11250 ee2b0000 eea11240 eea11250 eea11240 80406a4f 00000000 ffffffff [ 128.772216] 1e20: ef33a000 80037be5 00000000 80406a4f 00000100 00000001 eea11250 ee2b0000 [ 128.772238] 1e40: eea11250 806582c0 ee75a500 80406a4f eea11250 ee2b0000 00000000 806582c0 [ 128.772260] 1e60: 0000003f 80406a4f eea11240 eea11250 00000000 8027bcdd 00000001 00000000 [ 128.772282] 1e80: ee2b1ebc 271ae71a eea11240 eea11240 eea47400 eea11250 80278a59 ee75a500 [ 128.772304] 1ea0: eea11308 00000001 ee6e651c 8027bbe5 00000000 806a9cac 806a9ccc 00000000 [ 128.772325] 1ec0: 806582c0 0000003f 00000000 ee75a300 00000000 8028d4db 00000000 8003ca2b [ 128.772347] 1ee0: 00000001 ee2b0000 806a9cac 00000001 00000055 00000000 ee75a320 80067da1 [ 128.772368] 1f00: ee75a300 8028dbdd ee75a300 8065c800 ee2b0000 80067db9 00000001 8067b750 [ 128.772390] 1f20: ee2b0000 80067e91 ee75a300 8065c800 ee75a300 00000000 00000000 80067cfd [ 128.772411] 1f40: ee2b1f58 ee6bdcd0 ee2b1f88 ee75a300 80067dd9 00000000 00000000 00000000 [ 128.772432] 1f60: 00000000 80033561 00000000 00000000 000000f8 ee75a300 00000000 00000000 [ 128.772453] 1f80: ee2b1f80 ee2b1f80 00000000 00000000 ee2b1f90 ee2b1f90 00000000 ee6bdcd0 [ 128.772473] 1fa0: 800334f5 00000000 00000000 8000cf3d 00000000 00000000 00000000 00000000 [ 128.772492] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 128.772512] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000 [ 128.772564] [<80406b9c>] (rt_spin_lock_slowlock+0x18c/0x1a0) from [<80288639>] (uart_start+0x1d/0x30) [ 128.772608] [<80288639>] (uart_start+0x1d/0x30) from [<80278af5>] (n_tty_receive_buf+0x9d/0xbf0) [ 128.772650] [<80278af5>] (n_tty_receive_buf+0x9d/0xbf0) from [<8027bbe5>] (flush_to_ldisc+0xb5/0xe8) [ 128.772697] [<8027bbe5>] (flush_to_ldisc+0xb5/0xe8) from [<8028d4db>] (s3c24xx_serial_rx_chars+0xbf/0x19c) [ 128.772742] [<8028d4db>] (s3c24xx_serial_rx_chars+0xbf/0x19c) from [<8028dbdd>] (s3c64xx_serial_handle_irq+0x31/0x38) [ 128.772793] [<8028dbdd>] (s3c64xx_serial_handle_irq+0x31/0x38) from [<80067db9>] (irq_forced_thread_fn+0x19/0x38) [ 128.772837] [<80067db9>] (irq_forced_thread_fn+0x19/0x38) from [<80067e91>] (irq_thread+0xb9/0x118) [ 128.772884] [<80067e91>] (irq_thread+0xb9/0x118) from [<80033561>] (kthread+0x6d/0x74) [ 128.772932] [<80033561>] (kthread+0x6d/0x74) from [<8000cf3d>] (ret_from_fork+0x11/0x20) [ 128.772962] Code: f7ff fae4 e7ac de02 (de02) 1d23 [ 129.122974] ---[ end trace 0000000000000002 ]--- [ 129.122994] note: irq/85-12c20000[1895] exited with preempt_count 1 [ 129.123041] Unable to handle kernel paging request at virtual address ffffffec [ 129.123049] pgd = 80004000 [ 129.123077] [ffffffec] *pgd=af7fd821, *pte=00000000, *ppte=00000000 [ 129.123095] Internal error: Oops: 17 [#2] PREEMPT SMP THUMB2 [ 129.123108] Modules linked in: [ 129.123131] CPU: 0 PID: 1895 Comm: irq/85-12c20000 Tainted: G D 3.10.6-rt3+ #7 [ 129.123144] task: ef2d9cc0 ti: ee2b0000 task.ti: ee2b0000 [ 129.123173] PC is at kthread_data+0xa/0x10 [ 129.123196] LR is at irq_thread_dtor+0x21/0x80 [ 129.123218] pc : [<80033886>] lr : [<80067d1d>] psr: 800700b3 [ 129.123218] sp : ee2b1bd8 ip : 00000031 fp : 20070093 [ 129.123231] r10: ef2d9cc0 r9 : 80680008 r8 : ee2b0000 [ 129.123243] r7 : ef2d9cc0 r6 : 806d8c88 r5 : 00000000 r4 : ef2d9cc0 [ 129.123254] r3 : 00000000 r2 : 80000000 r1 : ee2b1f38 r0 : ef2d9cc0 [ 129.123269] Flags: Nzcv IRQs off FIQs on Mode SVC_32 ISA Thumb Segment user [ 129.123283] Control: 50c5387d Table: aeac406a DAC: 00000015 [ 129.123297] Process irq/85-12c20000 (pid: 1895, stack limit = 0xee2b0238) [ 129.123307] Stack: (0xee2b1bd8 to 0xee2b2000) [ 129.123324] 1bc0: 80067cfd ef2da124 [ 129.123348] 1be0: 00000000 8003136f ee2b0000 002f002f 00000000 0000000b 00000000 00000000 [ 129.123370] 1c00: ee2b1d20 8001f619 20070093 00000001 8053407c 8001dccf 002f002f 806855ec [ 129.123392] 1c20: ee2b0000 806c5f84 0000000b ee2b0000 ee2b1d20 8052fd08 80680008 00000000 [ 129.123413] 1c40: 20070093 8000fcf1 ee2b0238 0000000b 8176f34c 00000000 00030001 00000004 [ 129.123434] 1c60: 0000de02 ee2b1d20 80406b9c ee6e641e 00000000 80008235 00000006 00000002 [ 129.123456] 1c80: 00000004 00000000 00030001 80406b9c 00000193 80651d84 0111c000 80021859 [ 129.123477] 1ca0: 00000001 80021c2b 8176dd84 8006dd4f 00000000 81770380 00000001 00000001 [ 129.123498] 1cc0: 00000001 00000000 00000400 80042a73 0000041d 000003e3 00000400 00000000 [ 129.123519] 1ce0: 80654380 ef07e950 80654380 ef07e8c0 00000000 80654380 ffffffff 00000000 [ 129.123540] 1d00: ee2b1d0c 80406b9e 60070033 ffffffff ee2b1d54 00000000 ef2d9cc0 8000cac1 [ 129.123561] 1d20: 00000000 ef2d9cc0 00000000 ef2d9cc0 806a9ccc ee2b0000 00000000 ee6e7800 [ 129.123581] 1d40: 00000000 ef2d9cc0 ee6e641e 00000000 00000000 ee2b1d68 80406a4f 80406b9c [ 129.123601] 1d60: 60070033 ffffffff 00000000 00000000 00000000 00000000 00000400 80654380 [ 129.123622] 1d80: ef07d800 00000002 ef07d800 81767800 00000002 00000000 00000001 80043101 [ 129.123643] 1da0: 00000002 806a9ccc eea47400 0000002e ee6e7800 00000000 ee6e7800 ee6e641e [ 129.123664] 1dc0: 00000000 80288639 8028864d eea47400 ee6e651d 80278af5 81767380 800451c9 [ 129.123686] 1de0: 00000064 ef2d9e18 81767380 80654380 ef2d9cc0 80406a4f ef2d9e18 80037a0d [ 129.123708] 1e00: eea11250 ee2b0000 eea11240 eea11250 eea11240 80406a4f 00000000 ffffffff [ 129.123729] 1e20: ef33a000 80037be5 00000000 80406a4f 00000100 00000001 eea11250 ee2b0000 [ 129.123750] 1e40: eea11250 806582c0 ee75a500 80406a4f eea11250 ee2b0000 00000000 806582c0 [ 129.123770] 1e60: 0000003f 80406a4f eea11240 eea11250 00000000 8027bcdd 00000001 00000000 [ 129.123792] 1e80: ee2b1ebc 271ae71a eea11240 eea11240 eea47400 eea11250 80278a59 ee75a500 [ 129.123813] 1ea0: eea11308 00000001 ee6e651c 8027bbe5 00000000 806a9cac 806a9ccc 00000000 [ 129.123834] 1ec0: 806582c0 0000003f 00000000 ee75a300 00000000 8028d4db 00000000 8003ca2b [ 129.123855] 1ee0: 00000001 ee2b0000 806a9cac 00000001 00000055 00000000 ee75a320 80067da1 [ 129.123876] 1f00: ee75a300 8028dbdd ee75a300 8065c800 ee2b0000 80067db9 00000001 8067b750 [ 129.123897] 1f20: ee2b0000 80067e91 ee75a300 8065c800 ee75a300 00000000 00000000 80067cfd [ 129.123918] 1f40: ee2b1f58 ee6bdcd0 ee2b1f88 ee75a300 80067dd9 00000000 00000000 00000000 [ 129.123938] 1f60: 00000000 80033561 00000000 00000000 000000f8 ee75a300 00000000 00000000 [ 129.123959] 1f80: ee2b1f80 ee2b1f80 00000001 00010001 ee2b1f90 ee2b1f90 00000000 ee6bdcd0 [ 129.123979] 1fa0: 800334f5 00000000 00000000 8000cf3d 00000000 00000000 00000000 00000000 [ 129.123998] 1fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 129.124017] 1fe0: 00000000 00000000 00000000 00000000 00000013 00000000 00000000 00000000 [ 129.124070] [<80033886>] (kthread_data+0xa/0x10) from [<80067d1d>] (irq_thread_dtor+0x21/0x80) [ 129.124116] [<80067d1d>] (irq_thread_dtor+0x21/0x80) from [<8003136f>] (task_work_run+0x73/0xac) [ 129.124156] [<8003136f>] (task_work_run+0x73/0xac) from [<8001f619>] (do_exit+0x215/0x6fc) [ 129.124189] [<8001f619>] (do_exit+0x215/0x6fc) from [<8000fcf1>] (die+0x155/0x1e8) [ 129.124221] [<8000fcf1>] (die+0x155/0x1e8) from [<80008235>] (do_undefinstr+0x69/0x14c) [ 129.124261] [<80008235>] (do_undefinstr+0x69/0x14c) from [<8000cac1>] (__und_svc_finish+0x1/0x40) [ 129.124272] Exception stack(0xee2b1d20 to 0xee2b1d68) [ 129.124296] 1d20: 00000000 ef2d9cc0 00000000 ef2d9cc0 806a9ccc ee2b0000 00000000 ee6e7800 [ 129.124318] 1d40: 00000000 ef2d9cc0 ee6e641e 00000000 00000000 ee2b1d68 80406a4f 80406b9c [ 129.124329] 1d60: 60070033 ffffffff [ 129.124372] [<8000cac1>] (__und_svc_finish+0x1/0x40) from [<80406b9c>] (rt_spin_lock_slowlock+0x18c/0x1a0) [ 129.124408] [<80406b9c>] (rt_spin_lock_slowlock+0x18c/0x1a0) from [<80288639>] (uart_start+0x1d/0x30) [ 129.124445] [<80288639>] (uart_start+0x1d/0x30) from [<80278af5>] (n_tty_receive_buf+0x9d/0xbf0) [ 129.124484] [<80278af5>] (n_tty_receive_buf+0x9d/0xbf0) from [<8027bbe5>] (flush_to_ldisc+0xb5/0xe8) [ 129.124527] [<8027bbe5>] (flush_to_ldisc+0xb5/0xe8) from [<8028d4db>] (s3c24xx_serial_rx_chars+0xbf/0x19c) [ 129.124569] [<8028d4db>] (s3c24xx_serial_rx_chars+0xbf/0x19c) from [<8028dbdd>] (s3c64xx_serial_handle_irq+0x31/0x38) [ 129.124615] [<8028dbdd>] (s3c64xx_serial_handle_irq+0x31/0x38) from [<80067db9>] (irq_forced_thread_fn+0x19/0x38) [ 129.124657] [<80067db9>] (irq_forced_thread_fn+0x19/0x38) from [<80067e91>] (irq_thread+0xb9/0x118) [ 129.124700] [<80067e91>] (irq_thread+0xb9/0x118) from [<80033561>] (kthread+0x6d/0x74) [ 129.124745] [<80033561>] (kthread+0x6d/0x74) from [<8000cf3d>] (ret_from_fork+0x11/0x20) [ 129.124775] Code: f85d eb04 f8d0 3264 (f853) 0c14
---------x-----------------x------------
Here we haven't scheduled work from tty_flip_buffer_push() because of following change in RT kernel:
commit 714cc6dc551136b8f88b53361b504a5f624308b9 Author: Ingo Molnar mingo@elte.hu Date: Fri Jul 3 08:30:01 2009 -0500
serial: 8250: Call flush_to_ldisc when the irq is threaded
Signed-off-by: Ingo Molnar mingo@elte.hu --- drivers/tty/tty_buffer.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/tty/tty_buffer.c b/drivers/tty/tty_buffer.c index 9121c1f..94d3215 100644 --- a/drivers/tty/tty_buffer.c +++ b/drivers/tty/tty_buffer.c @@ -517,10 +517,15 @@ void tty_flip_buffer_push(struct tty_port *port) buf->tail->commit = buf->tail->used; spin_unlock_irqrestore(&buf->lock, flags);
+#ifndef CONFIG_PREEMPT_RT_FULL if (port->low_latency) flush_to_ldisc(&buf->work); else schedule_work(&buf->work); +#else + flush_to_ldisc(&buf->work); +#endif + } EXPORT_SYMBOL(tty_flip_buffer_push);
@@ -545,4 +550,3 @@ void tty_buffer_init(struct tty_port *port) buf->memory_used = 0; INIT_WORK(&buf->work, flush_to_ldisc); }
---------x------------------x-------------------
How do you guys suggest me to fix this problem now? The fix I suggested? Or something specific to the RT kernel only? Or maybe send these patches for RT kernel only.
One more thing: We are calling tty_flip_buffer_push() with (from interrupt context) and without (from process context) port->lock held.. Shouldn't we make this flat, like don't acquire port->lock while calling tty_flip_buffer_push() as that may cause recursive locking?
Thanks for your help and patience.
-- viresh