From: Arnd Bergmann
Sent: 24 May 2023 12:18
On Wed, May 24, 2023, at 11:02, Naresh Kamboju wrote:
While running the LTP controllers tests, the following kernel crash was noticed on qemu-x86_64 in compat mode with stable-rc 6.3.4-rc2.
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Linux version 6.3.4-rc2 (tuxmake@tuxmake) (x86_64-linux-gnu-gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40) #1 SMP PREEMPT_DYNAMIC @1684862676
..
./runltp -f controllers
...
cpuset_inherit 11 TPASS: cpus: Inherited information is right!
cpuset_inherit 13 TPASS: mems: Inherited information is right!
<4>[ 1130.117922] int3: 0000 [#1] PREEMPT SMP PTI
<4>[ 1130.118132] CPU: 0 PID: 32748 Comm: cpuset_inherit_ Not tainted 6.3.4-rc2 #1
<4>[ 1130.118216] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.14.0-2 04/01/2014
<4>[ 1130.118320] RIP: 0010:__alloc_pages+0xeb/0x340
<4>[ 1130.118605] Code: 48 c1 e0 04 48 8d 84 01 00 13 00 00 48 89 45 a8 8b 05 d9 31 cf 01 85 c0 0f 85 05 02 00 00 89 d8 c1 e8 03 83 e0 03 89 45 c0 66 <90> 41 89 df 41 be 01 00 00 00 f6 c7 04 75 66 44 89 e6 89 df e8 ec
I haven't figured out what is going on here, but I tracked down the trapping instruction <90> to the middle of the 'xchg %ax,%ax' two-byte nop in:
ffffffff814218f4: 83 e0 03             and    $0x3,%eax
ffffffff814218f7: 89 45 c0             mov    %eax,-0x40(%rbp)
ffffffff814218fa: 66 90                xchg   %ax,%ax
ffffffff814218fc: 41 89 df             mov    %ebx,%r15d
ffffffff814218ff: 41 be 01 00 00 00    mov    $0x1,%r14d
which in turn is the cpusets_enabled() check in prepare_alloc_pages().
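For anyone wanting to repeat the byte-level check, here is a minimal sketch in Python of how the marked byte in the "Code:" line can be located and compared against the disassembly quoted above. The helper name parse_oops_code and the constants are made up for illustration and are not part of any kernel tooling; the sketch only assumes the usual oops convention that the byte at the reported RIP is wrapped in angle brackets.

```python
import re

# "Code:" line copied from the oops in this report.
CODE_LINE = (
    "<4>[ 1130.118605] Code: 48 c1 e0 04 48 8d 84 01 00 13 00 00 48 89 45 a8 "
    "8b 05 d9 31 cf 01 85 c0 0f 85 05 02 00 00 89 d8 c1 e8 03 83 e0 03 89 45 "
    "c0 66 <90> 41 89 df 41 be 01 00 00 00 f6 c7 04 75 66 44 89 e6 89 df e8 ec"
)

# Opcode bytes of the five disassembled instructions quoted above.
DISASM_BYTES = bytes.fromhex("83e003" "8945c0" "6690" "4189df" "41be01000000")


def parse_oops_code(line):
    """Hypothetical helper: return (dump_bytes, index_of_marked_byte)."""
    tokens = line.split("Code:", 1)[1].split()
    data = bytearray()
    marked = None
    for tok in tokens:
        m = re.fullmatch(r"<([0-9a-fA-F]{2})>", tok)
        if m:
            # The bracketed token is the byte at the reported RIP.
            marked = len(data)
            data.append(int(m.group(1), 16))
        else:
            data.append(int(tok, 16))
    return bytes(data), marked


data, marked = parse_oops_code(CODE_LINE)

# Where the disassembled snippet starts inside the dump (-1 if absent).
snippet_start = data.find(DISASM_BYTES)

print(f"marked byte 0x{data[marked]:02x} at dump offset {marked}")
print(f"preceding byte 0x{data[marked - 1]:02x}")
print(f"offset of marked byte inside the snippet: {marked - snippet_start}")
```

On this dump the marked byte sits right after a 0x66 prefix, and the bytes on either side line up with the disassembly, so the dump and the disassembly agree with each other. That says nothing about whether those were the bytes the CPU actually executed.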
Does that code actually match the call/return stack?
It is pretty much impossible to get a trap on an 0x90 byte. I think you'd need to jump to it and then get a page fault.
So I bet that isn't the code that was actually being executed. So either the fault address is garbage or something horrid(tm) has happened to the page tables.
David