Hello all,
I am working on an Android tablet built around the TI OMAP 4470 ES1 SoC, which has two ARM Cortex-A9 cores. The system runs Android 4.2.2 with Linux kernel 3.4.48.
My kernel configuration has these debugging options enabled:

CONFIG_CC_STACKPROTECTOR=y
CONFIG_DEBUG_SPINLOCK=y
CONFIG_DEBUG_MUTEXES=y
# CONFIG_DEBUG_LOCK_ALLOC is not set
CONFIG_TRACE_IRQFLAGS=y
CONFIG_DEBUG_ATOMIC_SLEEP=y
CONFIG_DEBUG_LOCK_ALLOC=y
CONFIG_PROVE_LOCKING=y
CONFIG_LOCKDEP=y
CONFIG_DEBUG_SLAB=y
CONFIG_DEBUG_SLAB_LEAK=y
#and FTRACE
CONFIG_FUNCTION_TRACER=y
CONFIG_FUNCTION_GRAPH_TRACER=y
CONFIG_STACK_TRACER=y
CONFIG_DYNAMIC_FTRACE=y

The system randomly resets due to memory corruption, and in at least one case it looks as if the stack is restored incorrectly. One of the crashes is as follows:

[484677.808807] Unable to handle kernel paging request at virtual address 0040049c
[484677.817077] pgd = d5ee8000
[484677.820220] [0040049c] *pgd=00000000
[484677.824615] Process UEventObserver (pid: 764, stack limit = 0xd5b2c2f8)
[484677.832000] Internal error: Oops: 805 [#1] PREEMPT SMP ARM
[484677.838287] Modules linked in: wlcore_sdio(O) wl18xx(O) wlcore(O) mac80211(O) cfg80211(O) compat(O) pvrsrvkm_sgx544_112(O)
[484677.851806] CPU: 0    Tainted: G        W  O  (3.4.48-dirty #1)
[484677.858459] PC is at lock_release+0x9c/0x134
[484677.863433] LR is at _raw_spin_lock_irqsave+0x64/0x70
[484677.869140] pc : [<c00a48f0>]    lr : [<c06b7850>]    psr: 60000193
[484677.869140] sp : d5b2db20  ip : d5b2daf0  fp : d5b2db54
[484677.882141] r10: 00000000  r9 : d5b2dbf4  r8 : 00000000
[484677.888122] r7 : d5a8bec0  r6 : d5b2c000  r5 : d58a8040  r4 : c00a3710
[484677.895416] r3 : 00400040  r2 : 00000000  r1 : 5bbd5bbc  r0 : 60000113
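For reference, the number in "Oops: 805" can be decoded by hand. The sketch below assumes it is the ARMv7 short-descriptor data fault status register (DFSR) value, and the table covers only a few common fault-status encodings. The function and table names are made up for illustration.

```python
# Decode the value printed as "Oops: 805", assuming it is an ARMv7
# short-descriptor DFSR. Only a few fault-status encodings are listed.
FAULT_STATUS = {
    0b00001: "alignment fault",
    0b00101: "translation fault (section, first level)",
    0b00111: "translation fault (page, second level)",
    0b01101: "permission fault (section, first level)",
    0b01111: "permission fault (page, second level)",
}


def decode_fsr(fsr):
    # FS is bit 10 concatenated with bits [3:0]; bit 11 is WnR (1 = write).
    fs = (((fsr >> 10) & 1) << 4) | (fsr & 0xF)
    return {
        "fs": fs,
        "kind": FAULT_STATUS.get(fs, "other"),
        "write": bool(fsr & (1 << 11)),
        "domain": (fsr >> 4) & 0xF,
    }


info = decode_fsr(0x805)
print(info)

# Distance between the faulting address and the value held in r3.
fault_addr = 0x0040049C
r3 = 0x00400040
offset = fault_addr - r3
print(hex(offset))
```

Under that assumption the fault is a write that hit a first-level translation fault, which agrees with the `*pgd=00000000` line. The faulting address is r3 + 0x45c, so the store may have gone through a bad pointer held in r3, although the log alone does not prove that.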

Decoded stack dump, with my annotations in parentheses:
[484678.126312] SP: 0xd5b2daa0:
[484678.131408] daa0  00000080 00000000 c0070cf8 00000000 c06b6cb4 c06b64c8 c00a48f0 60000193
[484678.141998] dac0  ffffffff d5b2db0c d5b2db54 d5b2dad8 c06b85d8(lr=__dabt_svc) c000839c(pc=do_DataAbort())
                      60000113(r0) 5bbd5bbc(r1)
[484678.152587] dae0  00000000(r2) 00400040(r3) c00a3710(pc=__lock_acquire)
push    {r4, r5, r6, r7, r8, r9, sl, fp, ip, lr, pc}
                d58a8040(4) d5b2c000(5) d5a8bec0(6) 00000000(7) d5b2dbf4(8)
[484678.163055] db00  00000000(9) d5b2db54(sl) d5b2daf0(fp) d5b2db20(ip) c06b7850(lr=_raw_spin_lock_irqsave) c00a48f0(pc=lock_release)
60000193 ffffffff
push    {r3, r4, r5, r6, fp, ip, lr, pc}
[484678.173522] db20  d5b2dd10(r3) d5b2dcf4(r4) d5a86338(r5) 00000000(r6) d5b2db5c(fp) d5b2db40(ip) c0144974(lr=__pollwait) c0070cd4(pc=add_wait_queue)
[484678.184082] db40  d5abb400 d5b2dbfc d5a86338 d5b2dc04 d5b2db74 d5b2db60 c050e470 c01448f0(pc=pollwake)
[484678.194702] db60  d5b2dcf4 d5b2dbfc d5b2db84 d5b2db78 c0500d84 c050e440 d5b2dbe4 d5b2db88
[484678.205200] db80  c0144cc4 c0500d64 d5b2dbac d5b2c000 00000000 00000000 00000000 d5b2dbf4

I found that the UEventObserver userspace process was calling the select() system call, and the kernel code path is:
__pollwait() 
-->      add_wait_queue()
------>     _raw_spin_lock_irqsave()
--------->        lock_release()


1. The first issue is that I do not understand why _raw_spin_lock_irqsave() calls lock_release() right after. It should call lock_acquire().
2. It looks like the registers popped from the stack are not correct:

It looks like the stack frame for the add_wait_queue() function is correct:
60000193(cpsr) ffffffff
push    {r3, r4, r5, r6, fp, ip, lr, pc}
[484678.173522] db20  d5b2dd10(r3) d5b2dcf4(r4) d5a86338(r5) 00000000(r6) d5b2db5c(fp) d5b2db40(ip) c0144974(lr=__pollwait) c0070cd4(pc=add_wait_queue)

cpsr=60000193 has the I bit set, so IRQs are masked on this core. Preemption therefore cannot happen, which means no other task on the same core can interfere with the current thread of execution.
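The CPSR reading can be checked bit by bit. This is a minimal sketch using the standard ARMv7 CPSR bit positions; the helper name `decode_cpsr` is made up for illustration.

```python
# Decode the mode, mask and condition-flag bits of an ARMv7 CPSR value.
MODES = {
    0x10: "USR", 0x11: "FIQ", 0x12: "IRQ", 0x13: "SVC",
    0x17: "ABT", 0x1B: "UND", 0x1F: "SYS",
}


def decode_cpsr(cpsr):
    flag_bits = (("N", 31), ("Z", 30), ("C", 29), ("V", 28))
    return {
        "mode": MODES.get(cpsr & 0x1F, "?"),
        "thumb": bool(cpsr & (1 << 5)),
        "fiq_masked": bool(cpsr & (1 << 6)),
        "irq_masked": bool(cpsr & (1 << 7)),
        "abort_masked": bool(cpsr & (1 << 8)),
        # Upper case means the flag is set, lower case means clear.
        "flags": "".join(
            name if cpsr & (1 << bit) else name.lower()
            for name, bit in flag_bits
        ),
    }


at_fault = decode_cpsr(0x60000193)  # psr from the oops
in_r0 = decode_cpsr(0x60000113)     # value held in r0
print(at_fault)
print(in_r0)
```

At the fault the CPU is in SVC mode with IRQs masked and FIQs still enabled. The value in r0 (60000113) differs only in the I bit, so it may be the flags word saved by the irqsave path before IRQs were masked. Note that masking IRQs only protects against the local core; the other Cortex-A9 can still touch the same memory.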

The stack frame for the call to lock_release() seems correct too:
push    {r4, r5, r6, r7, r8, r9, sl, fp, ip, lr, pc}
                d58a8040(4) d5b2c000(5) d5a8bec0(6) 00000000(7) d5b2dbf4(8)
[484678.163055] db00  00000000(9) d5b2db54(sl) d5b2daf0(fp) d5b2db20(ip) c06b7850(lr=_raw_spin_lock_irqsave) c00a48f0(pc=lock_release)
I found that:
      R5=d5b2c000 points to the thread_info
      R4=d58a8040 points to the current task_struct
But it looks like popping from the stack went wrong, and then the system crashed:
r6 : d5b2c000  r5 : d58a8040  r4 : c00a3710
Now R6=d5b2c000 (which is R5 on the stack), R5=d58a8040 (which is R4 on the stack), and R4=c00a3710 (this seems to be the LR of another stack frame).
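One way to cross-check this reading is to try an alternative interpretation of the same words. The sketch below assumes that the 18 words starting at 0xd5b2dad8 are not a function's push frame but the ARM struct pt_regs saved on exception entry (r0 to r10, fp, ip, sp, lr, pc, cpsr, orig_r0, in that order). That layout is an assumption about this dump, not something the log states, and the variable names are made up for illustration.

```python
# Assumed layout of ARM struct pt_regs: 18 consecutive 32-bit words.
PT_REGS = ["r0", "r1", "r2", "r3", "r4", "r5", "r6", "r7", "r8", "r9",
           "r10", "fp", "ip", "sp", "lr", "pc", "cpsr", "orig_r0"]

FRAME_BASE = 0xD5B2DAD8  # address of the word 60000113 in the dump

# The 18 words from 0xd5b2dad8 to 0xd5b2db1c, copied from the dump.
WORDS = [
    0x60000113, 0x5BBD5BBC, 0x00000000, 0x00400040, 0xC00A3710, 0xD58A8040,
    0xD5B2C000, 0xD5A8BEC0, 0x00000000, 0xD5B2DBF4, 0x00000000, 0xD5B2DB54,
    0xD5B2DAF0, 0xD5B2DB20, 0xC06B7850, 0xC00A48F0, 0x60000193, 0xFFFFFFFF,
]

# Register values printed in the oops header (psr is listed as cpsr).
OOPS = {
    "r0": 0x60000113, "r1": 0x5BBD5BBC, "r2": 0x00000000, "r3": 0x00400040,
    "r4": 0xC00A3710, "r5": 0xD58A8040, "r6": 0xD5B2C000, "r7": 0xD5A8BEC0,
    "r8": 0x00000000, "r9": 0xD5B2DBF4, "r10": 0x00000000,
    "fp": 0xD5B2DB54, "ip": 0xD5B2DAF0, "sp": 0xD5B2DB20,
    "lr": 0xC06B7850, "pc": 0xC00A48F0, "cpsr": 0x60000193,
}

frame = dict(zip(PT_REGS, WORDS))
frame_end = FRAME_BASE + 4 * len(WORDS)
mismatches = [name for name, value in OOPS.items() if frame[name] != value]

for i, name in enumerate(PT_REGS):
    print(f"{FRAME_BASE + 4 * i:08x}  {name:8s} {frame[name]:08x}")
print("frame ends at", hex(frame_end), "mismatches:", mismatches)
```

If the words line up with the oops registers and the frame ends exactly at the faulting sp, then r4, r5 and r6 in the oops may simply be the live register values at the time of the fault rather than values popped from a shifted frame. In that case the labels in my annotation above would be off by one slot.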

Has anyone faced this issue?
I suspect that enabling profiling when building the kernel causes this issue, because this is the generated code for add_wait_queue():
void add_wait_queue(wait_queue_head_t *q, wait_queue_t *wait)
{
c0070cc8:   e1a0c00d    mov ip, sp
c0070ccc:   e92dd878    push    {r3, r4, r5, r6, fp, ip, lr, pc}
c0070cd0:   e24cb004    sub fp, ip, #4
c0070cd4:   e92d4000    push    {lr}
c0070cd8:   ebfe8c62    bl  c0013e68 <__gnu_mcount_nc>

The branch to __gnu_mcount_nc() is generated by GCC when the -pg option is enabled. Could it somehow cause a problem with the stack setup?
Many thanks for reading this long mail. Any ideas are very much appreciated.

Thanks again.
--
Quan Cao
0976574864