在 2024/12/19 06:37, Qu Wenruo 写道:
在 2024/12/19 02:22, Naresh Kamboju 写道:
On Wed, 18 Dec 2024 at 17:33, Naresh Kamboju naresh.kamboju@linaro.org wrote:
The following kernel crash noticed on qemu-arm64 while running the Linux next-20241210 tag (to next-20241218) kernel built with - CONFIG_ARM64_64K_PAGES=y - CONFIG_ARM64_16K_PAGES=y and running LTP smoke tests.
First seen on Linux next-20241210. Good: next-20241209 Bad: next-20241210 and next-20241218
qemu-arm64: 9.1.2
Anyone noticed this ?
Anders bisected this reported regression and found, # first bad commit: [9c1d66793b6faa00106ae4c866359578bfc012d2] btrfs: validate system chunk array at btrfs_validate_super()
Weird, I run daily fstests with 64K page sized aarch64 VM.
But never hit a crash on this.
And the original crash call trace only points back to ext4, not btrfs.
Mind to test it with KASAN enabled?
Another thing is, how do you enable both 16K and 64K page size at the same time?
The Kconfig should only select one page size IIRC.
And for the bisection, does it focus on the test failure or the crash?
For the test failure it looks like some older btrfs-progs, causing invalid system chunk items, which got caught by the newer and more strict sanity checks.
For the crash, unfortunately I'm not able to reproduce using fstests. Will try LTP soon.
Thanks, Qu
Thanks, Qu
Test log:
tst_test.c:1799: TINFO: === Testing on btrfs === tst_test.c:1158: TINFO: Formatting /dev/loop0 with btrfs opts='' extra opts='' <6>[ 71.880167] BTRFS: device fsid d492b571-012c-40a9-b8e1-efc97408d3bc devid 1 transid 6 /dev/loop0 (7:0) scanned by chdir01 (476) tst_test.c:1170: TINFO: Mounting /dev/loop0 to /tmp/LTP_chdJeywxF/mntpoint fstyp=btrfs flags=0 <6>[ 71.960245] BTRFS info (device loop0): first mount of filesystem d492b571-012c-40a9-b8e1-efc97408d3bc <6>[ 71.970667] BTRFS info (device loop0): using crc32c (crc32c-arm64) checksum algorithm <2>[ 71.993486] BTRFS critical (device loop0): corrupt superblock syschunk array: chunk_start=22020096, invalid chunk sectorsize, have 65536 expect 4096 <3>[ 71.995802] BTRFS error (device loop0): superblock contains fatal errors <3>[ 72.014538] BTRFS error (device loop0): open_ctree failed: -22 tst_test.c:1170: TBROK: mount(/dev/loop0, mntpoint, btrfs, 0, (nil)) failed: EINVAL (22)
Summary: passed 48 failed 0 broken 1 skipped 0 warnings 0
Duration: 7.002s
===== symlink01 ===== command: symlink01 <12>[ 72.494428] /usr/local/bin/kirk[253]: starting test symlink01 (symlink01) symlink01 0 TINFO : Using /tmp/LTP_symmsYXet as tmpdir (tmpfs filesystem) symlink01 1 TPASS : Creation of symbolic link file to no object file is ok symlink01 2 TPASS : Creation of symbolic link file to no object file is ok symlink01 3 TPASS : Creation of symbolic link file and object file via symbolic link is ok symlink01 4 TPASS : Creating an existing symbolic link file error is caught symlink01 5 TPASS : Creating a symbolic link which exceeds maximum pathname error is caught
Summary: passed 5 failed 0 broken 0 skipped 0 warnings 0
Duration: 0.052s
===== stat04 ===== command: stat04 <12>[ 72.966706] /usr/local/bin/kirk[253]: starting test stat04 (stat04) tst_buffers.c:57: TINFO: Test is using guarded buffers tst_tmpdir.c:316: TINFO: Using /tmp/LTP_staEABwgV as tmpdir (tmpfs filesystem) <6>[ 73.447708] loop0: detected capacity change from 0 to 614400 tst_device.c:96: TINFO: Found free device 0 '/dev/loop0' tst_test.c:1860: TINFO: LTP version: 20240930 tst_test.c:1864: TINFO: Tested kernel: 6.13.0-rc3-next-20241218 #1 SMP PREEMPT @1734498806 aarch64 tst_test.c:1703: TINFO: Timeout per run is 0h 05m 24s stat04.c:60: TINFO: Formatting /dev/loop0 with ext2 opts='-b 4096' extra opts='' mke2fs 1.47.1 (20-May-2024) <3>[ 73.859753] operation not supported error, dev loop0, sector 614272 op 0x9:(WRITE_ZEROES) flags 0x10000800 phys_seg 0 prio class 0 stat04.c:61: TINFO: Mounting /dev/loop0 to /tmp/LTP_staEABwgV/mntpoint fstyp=ext2 flags=0 <6>[ 73.939263] EXT4-fs (loop0): mounting ext2 file system using the ext4 subsystem <1>[ 73.946378] Unable to handle kernel paging request at virtual address a8fff00000c0c224 <1>[ 73.947878] Mem abort info: <1>[ 73.949153] ESR = 0x0000000096000005 <1>[ 73.959105] EC = 0x25: DABT (current EL), IL = 32 bits <1>[ 73.960031] SET = 0, FnV = 0 <1>[ 73.960349] EA = 0, S1PTW = 0 <1>[ 73.960638] FSC = 0x05: level 1 translation fault <1>[ 73.961005] Data abort info: <1>[ 73.961293] ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000 <1>[ 73.963739] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 <1>[ 73.964980] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 <1>[ 73.967132] [a8fff00000c0c224] address between user and kernel address ranges <0>[ 73.968923] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP <4>[ 73.970516] Modules linked in: btrfs blake2b_generic xor xor_neon raid6_pq zstd_compress sm3_ce sm3 sha3_ce sha512_ce sha512_arm64 fuse drm backlight ip_tables x_tables <4>[ 73.974237] CPU: 1 UID: 0 PID: 529 Comm: stat04 Not tainted 6.13.0-rc3-next-20241218 #1 <4>[ 73.975359] Hardware name: linux,dummy-virt (DT) <4>[ 73.977170] pstate: 62402009 (nZCv daif +PAN -UAO +TCO -DIT -SSBS BTYPE=--) <4>[ 73.978295] pc : __kmalloc_node_noprof (mm/slub.c:492 mm/slub.c:505 mm/slub.c:532 mm/slub.c:3993 mm/slub.c:4152 mm/slub.c:4293 mm/slub.c:4300) <4>[ 73.980200] lr : alloc_cpumask_var_node (lib/cpumask.c:62 (discriminator 2)) <4>[ 73.981466] sp : ffff80008258f950 <4>[ 73.982228] x29: ffff80008258f970 x28: ffffa93389398000 x27: 0000000000000001 <4>[ 73.983875] x26: fffffc1fc0303080 x25: 00000000ffffffff x24: a8fff00000c0c224 <4>[ 73.985649] x23: 0000000000000cc0 x22: ffffa93387f51d0c x21: 00000000ffffffff <4>[ 73.986188] x20: fff00000c0010400 x19: 0000000000000008 x18: 0000000000000000 <4>[ 73.988686] x17: fff056cd748b0000 x16: ffff800080020000 x15: 0000000000000000 <4>[ 73.990276] x14: 0000000000002a66 x13: 0000000000004000 x12: 0000000000000001 <4>[ 73.992401] x11: 0000000000000002 x10: 0000000000004001 x9 : ffffa93387f51d0c <4>[ 73.993108] x8 : fff00000c2c99240 x7 : 0000000000000001 x6 : 0000000000000001 <4>[ 73.993886] x5 : fff00000c4879800 x4 : 0000000000000000 x3 : 000000000033a401 <4>[ 73.995550] x2 : 0000000000000000 x1 : a8fff00000c0c224 x0 : fff00000c0010400 <4>[ 73.997017] Call trace: <4>[ 73.998266] __kmalloc_node_noprof+0x100/0x4a0 P <4>[ 73.999716] alloc_cpumask_var_node (lib/cpumask.c:62 (discriminator 2)) <4>[ 74.000942] alloc_workqueue_attrs (kernel/workqueue.c:4624 (discriminator 1)) <4>[ 74.001327] apply_wqattrs_prepare (kernel/workqueue.c:5263) <4>[ 74.003095] apply_workqueue_attrs_locked (kernel/workqueue.c:5351) <4>[ 74.003855] alloc_workqueue (kernel/workqueue.c:5722 (discriminator 1) kernel/workqueue.c:5772 (discriminator 1)) <4>[ 74.005398] ext4_fill_super (fs/ext4/super.c:5484 fs/ext4/ super.c:5722) <4>[ 74.006132] get_tree_bdev_flags (fs/super.c:1636) <4>[ 74.007624] get_tree_bdev (fs/super.c:1660) <4>[ 74.008664] ext4_get_tree (fs/ext4/super.c:5755) <4>[ 74.009423] vfs_get_tree (fs/super.c:1814) <4>[ 74.009703] path_mount (fs/namespace.c:3556 fs/namespace.c:3883) <4>[ 74.010608] __arm64_sys_mount (fs/namespace.c:3896 fs/namespace.c:4107 fs/namespace.c:4084 fs/namespace.c:4084) <4>[ 74.011527] invoke_syscall.constprop.0 (arch/arm64/include/asm/syscall.h:61 arch/arm64/kernel/syscall.c:54) <4>[ 74.012798] do_el0_svc (include/linux/thread_info.h:135 (discriminator 2) arch/arm64/kernel/syscall.c:140 (discriminator 2) arch/arm64/kernel/syscall.c:151 (discriminator 2)) <4>[ 74.014042] el0_svc (arch/arm64/include/asm/irqflags.h:82 (discriminator 1) arch/arm64/include/asm/irqflags.h:123 (discriminator
- arch/arm64/include/asm/irqflags.h:136 (discriminator 1)
arch/arm64/kernel/entry-common.c:165 (discriminator 1) arch/arm64/kernel/entry-common.c:178 (discriminator 1) arch/arm64/kernel/entry-common.c:745 (discriminator 1)) <4>[ 74.014942] el0t_64_sync_handler (arch/arm64/kernel/entry- common.c:763) <4>[ 74.015917] el0t_64_sync (arch/arm64/kernel/entry.S:600) <0>[ 74.017042] Code: 12800019 b9402a82 aa1803e1 aa1403e0 (f8626b1a) All code ======== 0: 12800019 mov w25, #0xffffffff // #-1 4: b9402a82 ldr w2, [x20, #40] 8: aa1803e1 mov x1, x24 c: aa1403e0 mov x0, x20 10:* f8626b1a ldr x26, [x24, x2] <-- trapping instruction
Code starting with the faulting instruction
0: f8626b1a ldr x26, [x24, x2] <4>[ 74.019014] ---[ end trace 0000000000000000 ]--- tst_test.c:1763: TBROK: Test killed by SIGSEGV!
Summary: passed 0 failed 0 broken 1 skipped 0 warnings 0 tst_device.c:269: TWARN: ioctl(/dev/loop0, LOOP_CLR_FD, 0) no ENXIO for too long Tainted kernel: kernel died recently, i.e. there was an OOPS or BUG[0m Tainted kernel: ['kernel died recently, i.e. there was an OOPS or BUG'][0m Restarting SUT: host
===== df01_sh ===== command: df01.sh <12>[ 76.370093] /usr/local/bin/kirk[253]: starting test df01_sh (df01.sh) Tainted kernel: kernel died recently, i.e. there was an OOPS or BUG[0m <1>[ 76.603065] Unable to handle kernel paging request at virtual address a8fff00000c0c224 <1>[ 76.603922] Mem abort info: <1>[ 76.604197] ESR = 0x0000000096000005 <1>[ 76.604638] EC = 0x25: DABT (current EL), IL = 32 bits <1>[ 76.605128] SET = 0, FnV = 0 <1>[ 76.606996] EA = 0, S1PTW = 0 <1>[ 76.607274] FSC = 0x05: level 1 translation fault <1>[ 76.607611] Data abort info: <1>[ 76.607897] ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000 <1>[ 76.609765] CM = 0, WnR = 0, TnD = 0, TagAccess = 0 <1>[ 76.610958] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 <1>[ 76.611652] [a8fff00000c0c224] address between user and kernel address ranges <0>[ 76.612130] Internal error: Oops: 0000000096000005 [#2] PREEMPT SMP <4>[ 76.613305] Modules linked in: btrfs blake2b_generic xor xor_neon raid6_pq zstd_compress sm3_ce sm3 sha3_ce sha512_ce sha512_arm64 fuse drm backlight ip_tables x_tables <4>[ 76.617688] CPU: 1 UID: 0 PID: 553 Comm: df01.sh Tainted: G D 6.13.0-rc3-next-20241218 #1 <4>[ 76.620869] Tainted: [D]=DIE <4>[ 76.621184] Hardware name: linux,dummy-virt (DT) <4>[ 76.622671] pstate: 63402009 (nZCv daif +PAN -UAO +TCO +DIT -SSBS BTYPE=--) <4>[ 76.623693] pc : __kmalloc_node_noprof (mm/slub.c:492 mm/slub.c:505 mm/slub.c:532 mm/slub.c:3993 mm/slub.c:4152 mm/slub.c:4293 mm/slub.c:4300) <4>[ 76.624180] lr : __vmalloc_node_range_noprof (include/linux/slab.h:922 mm/vmalloc.c:3647 mm/vmalloc.c:3846) <4>[ 76.625290] sp : ffff80008258fa90 <4>[ 76.626275] x29: ffff80008258fab0 x28: fff00000c2c98e80 x27: fff00000c48fd100 <4>[ 76.626966] x26: fffffc1fc0303080 x25: 00000000ffffffff x24: a8fff00000c0c224 <4>[ 76.627599] x23: 0000000000000dc0 x22: ffffa93386d87390 x21: 00000000ffffffff <4>[ 76.628603] x20: fff00000c0010400 x19: 0000000000000008 x18: 0000000000000000 <4>[ 76.629618] x17: 0000000000000000 x16: ffff800082180000 x15: ffff800080000000 <4>[ 76.630999] x14: fff00000c00203f0 x13: 00000ffff8000821 x12: 0000000000000000 <4>[ 76.632089] x11: 0000000000000000 x10: 0000000000000000 x9 : ffffa93386d87390 <4>[ 76.634293] x8 : ffff80008258f908 x7 : fff00000c2c98e80 x6 : 0000000000010000 <4>[ 76.634816] x5 : ffffa93389379000 x4 : 0000000000000000 x3 : 000000000033b801 <4>[ 76.636355] x2 : 0000000000000000 x1 : a8fff00000c0c224 x0 : fff00000c0010400 <4>[ 76.638309] Call trace: <4>[ 76.639031] __kmalloc_node_noprof+0x100/0x4a0 P <4>[ 76.640890] __vmalloc_node_range_noprof (include/linux/slab.h:922 mm/vmalloc.c:3647 mm/vmalloc.c:3846) <4>[ 76.641267] copy_process (kernel/fork.c:314 (discriminator 1) kernel/fork.c:1061 (discriminator 1) kernel/fork.c:2176 (discriminator 1)) <4>[ 76.641795] kernel_clone (kernel/fork.c:2758) <4>[ 76.643003] __do_sys_clone (kernel/fork.c:2902) <4>[ 76.644078] __arm64_sys_clone (kernel/fork.c:2869) <4>[ 76.645306] invoke_syscall.constprop.0 (arch/arm64/include/asm/syscall.h:61 arch/arm64/kernel/syscall.c:54) <4>[ 76.646337] do_el0_svc (include/linux/thread_info.h:135 (discriminator 2) arch/arm64/kernel/syscall.c:140 (discriminator 2) arch/arm64/kernel/syscall.c:151 (discriminator 2)) <4>[ 76.646974] el0_svc (arch/arm64/include/asm/irqflags.h:82 (discriminator 1) arch/arm64/include/asm/irqflags.h:123 (discriminator
- arch/arm64/include/asm/irqflags.h:136 (discriminator 1)
arch/arm64/kernel/entry-common.c:165 (discriminator 1) arch/arm64/kernel/entry-common.c:178 (discriminator 1) arch/arm64/kernel/entry-common.c:745 (discriminator 1)) <4>[ 76.647709] el0t_64_sync_handler (arch/arm64/kernel/entry- common.c:763) <4>[ 76.649032] el0t_64_sync (arch/arm64/kernel/entry.S:600) <0>[ 76.649724] Code: 12800019 b9402a82 aa1803e1 aa1403e0 (f8626b1a)
<trim>
All code
0: 12800019 mov w25, #0xffffffff // #-1 4: b9402a82 ldr w2, [x20, #40] 8: aa1803e1 mov x1, x24 c: aa1403e0 mov x0, x20 10:* f8626b1a ldr x26, [x24, x2] <-- trapping instruction
Code starting with the faulting instruction
0: f8626b1a ldr x26, [x24, x2] <4>[ 79.647693] ---[ end trace 0000000000000000 ]--- <0>[ 79.649260] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b <2>[ 79.650229] SMP: stopping secondary CPUs <0>[ 79.651558] Kernel Offset: 0x293306a00000 from 0xffff800080000000 <0>[ 79.652015] PHYS_OFFSET: 0x40000000 <0>[ 79.652461] CPU features: 0x000,000000d0,60bef2d8,cb7e7f3f <0>[ 79.653039] Memory Limit: none <0>[ 79.653854] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---
Links:
- https://qa-reports.linaro.org/lkft/linux-next-master/build/ next-20241218/testrun/26396709/suite/log-parser-test/test/panic- multiline-kernel-panic-not-syncing-attempted-to-kill-init-exitcode/ history/ - https://qa-reports.linaro.org/lkft/linux-next-master/build/ next-20241212/testrun/26277241/suite/log-parser-test/test/panic- multiline-kernel-panic-not-syncing-attempted-to-kill-init-exitcode/log - https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/ tests/2qNMDhPFtR8j185QSvZMn989u84 - https://storage.tuxsuite.com/public/linaro/lkft/ builds/2qNMCQazNJteQLGCw7MnMtUwzkD/ - https://qa-reports.linaro.org/lkft/linux-next-master/build/ next-20241211/testrun/26266202/suite/log-parser-test/test/panic- multiline-kernel-panic-not-syncing-attempted-to-kill-init-exitcode/ details/
metadata:
git describe: next-20241210..next-20241218 git repo: https://git.kernel.org/pub/scm/linux/kernel/git/next/ linux-next.git kernel config: https://storage.tuxsuite.com/public/linaro/lkft/ builds/2qNMCQazNJteQLGCw7MnMtUwzkD/config build url: https://storage.tuxsuite.com/public/linaro/lkft/ builds/2qNMCQazNJteQLGCw7MnMtUwzkD/ toolchain: gcc-13 config: CONFIG_ARM64_64K_PAGES=y, CONFIG_ARM64_16K_PAGES=y arch: arm64 qemu: qemu-arm64 version 9.1.2
-- Linaro LKFT https://lkft.linaro.org