On 18/04/2014 16:44, Richard Henderson wrote:
On 04/18/2014 07:00 AM, Mian M. Hamayun wrote:
Hello Peter & All,
I am trying to figure out a problem in qemu on aarch64 (with kvm enabled). I see the same problem in several versions of qemu (v2.0.0-rc3/rc2/rc1/rc0, master 2d03b49), so I believe that either I am missing something common to all of these versions or it is a genuine bug in qemu on aarch64.
The problem is triggered by the virtqueue_notify() function (in virtio_ring.c) in the guest kernel, and qemu fails in qemu_coroutine_new() while executing swapcontext(&old_uc, &uc) (see coroutine-ucontext.c:164). The sigsetjmp(old_env, 0) call just before it seems to work fine: it returns 0, and we then invoke swapcontext().
The host kernel reports: "qemu-system-aar[596]: bad frame in sys_rt_sigreturn: pc=004462e0 sp=7f8020f000" and kills the qemu process with a segmentation fault. The pc=004462e0 is the address of coroutine_trampoline(), but we never actually reach it when this particular crash happens.
Just to give you an idea of the code I am talking about:
$~/qemu[master]$ git blame -L 159,166 coroutine-ucontext.c
00dccaf1 (Kevin Wolf    2011-01-17 16:08:14 +0000 159)     makecontext(&uc, (void (*)(void))coroutine_trampoline,
00dccaf1 (Kevin Wolf    2011-01-17 16:08:14 +0000 160)                 2, arg.i[0], arg.i[1]);
00dccaf1 (Kevin Wolf    2011-01-17 16:08:14 +0000 161)
6ab7e546 (Peter Maydell 2013-02-20 15:21:09 +0000 162)     /* swapcontext() in, siglongjmp() back out */
6ab7e546 (Peter Maydell 2013-02-20 15:21:09 +0000 163)     if (!sigsetjmp(old_env, 0)) {
00dccaf1 (Kevin Wolf    2011-01-17 16:08:14 +0000 164)         swapcontext(&old_uc, &uc);
00dccaf1 (Kevin Wolf    2011-01-17 16:08:14 +0000 165)     }
00dccaf1 (Kevin Wolf    2011-01-17 16:08:14 +0000 166)     return &co->base;
My qemu configure/run commands are:
./configure --target-list=aarch64-softmmu \
    --cross-prefix=aarch64-linux-gnu- \
    --enable-fdt --enable-kvm --disable-werror \
    --audio-drv-list="" --static
./qemu-system-aarch64 \
    -enable-kvm -nographic -kernel Image \
    -drive if=none,file=disk_oe64.img,id=fs \
    -device virtio-blk-device,drive=fs \
    -m 1024 -M virt -cpu host \
    -append "earlyprintk console=ttyAMA0 mem=1024M rootwait root=/dev/vda rw init=/bin/sh"
Any ideas/comments on how to resolve this issue?
Note that a patch has just gone into glibc to rewrite setcontext et al for aarch64. I'd try using git glibc before looking too much deeper.
r~
That seems right to me as well: a setcontext-related bug for aarch64 was recently reported and fixed in glibc: https://sourceware.org/bugzilla/show_bug.cgi?id=16629
I have also tested the example posted at: https://sourceware.org/bugzilla/attachment.cgi?id=7435 and I get the following output on x86_64:
start f2
start f1
finish f2
finish f1
{ss_sp: (nil), ss_flags: 2, ss_size: 0}
{ss_sp: (nil), ss_flags: 2, ss_size: 0}
and this one on AArch64:
start f2
start f1
finish f2
finish f1
{ss_sp: (nil), ss_flags: 2, ss_size: 0}
*{ss_sp: 0x7ff7e723f0, ss_flags: 0, ss_size: 8192}*
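If the two bracketed lines are sigaltstack() query results (the field names suggest this, but the output alone does not confirm it), then ss_flags 2 is SS_DISABLE on Linux, and the second AArch64 line would mean that the alternate signal stack was changed by the context switches. A minimal sketch of such a check follows. It is not the bugzilla attachment: child_fn(), print_altstack(), altstack_survives_swapcontext() and CHECK_STACK_SIZE are hypothetical names used only here.

```c
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <stdlib.h>
#include <ucontext.h>

#define CHECK_STACK_SIZE (64 * 1024)  /* arbitrary size chosen for this sketch */

static ucontext_t main_uc, child_uc;

/* Does nothing; returning from it resumes uc_link, i.e. main_uc. */
static void child_fn(void)
{
}

static void print_altstack(const stack_t *ss)
{
    printf("{ss_sp: %p, ss_flags: %d, ss_size: %zu}\n",
           ss->ss_sp, ss->ss_flags, ss->ss_size);
}

/* Returns 1 if the alternate signal stack settings are unchanged after a
 * swapcontext() round trip, 0 if they differ. */
int altstack_survives_swapcontext(void)
{
    stack_t before, after;
    void *stack = malloc(CHECK_STACK_SIZE);
    int rc;

    assert(stack != NULL);
    rc = sigaltstack(NULL, &before);  /* query only, changes nothing */
    assert(rc == 0);
    rc = getcontext(&child_uc);
    assert(rc == 0);

    child_uc.uc_link = &main_uc;
    child_uc.uc_stack.ss_sp = stack;
    child_uc.uc_stack.ss_size = CHECK_STACK_SIZE;
    child_uc.uc_stack.ss_flags = 0;
    makecontext(&child_uc, child_fn, 0);

    /* Enter child_fn(); we resume here once it returns via uc_link. */
    rc = swapcontext(&main_uc, &child_uc);
    assert(rc == 0);

    rc = sigaltstack(NULL, &after);
    assert(rc == 0);
    (void)rc;

    print_altstack(&before);
    print_altstack(&after);
    free(stack);

    return before.ss_sp == after.ss_sp &&
           before.ss_flags == after.ss_flags &&
           before.ss_size == after.ss_size;
}
```

On a C library without the problem, both printed lines should match and the function should return 1. A return value of 0 would indicate that the context switch altered the alternate signal stack settings.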
I will first look into updating my glibc for AArch64 before coming back to qemu.
Thanks for your support,
Hamayun