The qemu-x86_64 boot failed with today's Linux next-20240814 tag due to the following crash.
The catch here is that the crash is seen on both the x86_64 device and qemu-x86_64, but the x86_64 device is still able to boot successfully.
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Boot log:
---
[ 0.000000] Linux version 6.11.0-rc3-next-20240814 (tuxmake@tuxmake) (x86_64-linux-gnu-gcc (Debian 13.3.0-1) 13.3.0, GNU ld (GNU Binutils for Debian) 2.42.50.20240625) #1 SMP PREEMPT_DYNAMIC @1723614704
...
<6>[ 2.479915] scsi host0: ahci
<4>[ 2.484371] sysfs: cannot create duplicate filename '/devices/virtual/workqueue/scsi_tmf_-1073661392'
<4>[ 2.486170] CPU: 1 UID: 0 PID: 1 Comm: swapper/0 Not tainted 6.11.0-rc3-next-20240814 #1
<4>[ 2.486709] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-debian-1.16.3-2 04/01/2014
<4>[ 2.486709] Call Trace:
<4>[ 2.486709] <TASK>
<4>[ 2.486709] dump_stack_lvl+0x96/0xb0
<4>[ 2.486709] dump_stack+0x14/0x20
<4>[ 2.486709] sysfs_warn_dup+0x5f/0x80
<4>[ 2.486709] sysfs_create_dir_ns+0xd0/0xf0
<4>[ 2.486709] kobject_add_internal+0xa8/0x2e0
<4>[ 2.486709] kobject_add+0x97/0x100
<4>[ 2.486709] ? get_device_parent+0x109/0x1d0
<4>[ 2.486709] device_add+0xe4/0x880
<4>[ 2.486709] ? hrtimer_init+0x2b/0x80
<4>[ 2.486709] device_register+0x1e/0x30
<4>[ 2.486709] workqueue_sysfs_register+0x91/0x140
<4>[ 2.486709] __alloc_workqueue+0x664/0x800
<4>[ 2.486709] ? trace_preempt_on+0x1e/0x70
<4>[ 2.486709] ? __kthread_create_on_node+0x108/0x170
<4>[ 2.486709] alloc_workqueue+0x5a/0x80
<4>[ 2.486709] ? __kthread_create_on_node+0x108/0x170
<4>[ 2.486709] scsi_host_alloc+0x365/0x470
<4>[ 2.486709] ata_scsi_add_hosts+0xc2/0x130
<4>[ 2.486709] ata_host_register+0xb5/0x260
<4>[ 2.486709] ata_host_activate+0xe9/0x140
<4>[ 2.486709] ahci_host_activate+0x16a/0x190
<4>[ 2.486709] ahci_init_one+0xe0f/0x1080
<4>[ 2.486709] ? trace_preempt_on+0x1e/0x70
<4>[ 2.486709] local_pci_probe+0x48/0xa0
<4>[ 2.486709] pci_device_probe+0xc6/0x1f0
<4>[ 2.486709] really_probe+0xcc/0x3b0
<4>[ 2.486709] __driver_probe_device+0x7d/0x160
<4>[ 2.486709] driver_probe_device+0x24/0xa0
<4>[ 2.486709] __driver_attach+0xdd/0x1d0
<4>[ 2.486709] ? __pfx___driver_attach+0x10/0x10
<4>[ 2.486709] bus_for_each_dev+0x91/0xe0
<4>[ 2.486709] driver_attach+0x22/0x30
<4>[ 2.486709] bus_add_driver+0x118/0x240
<4>[ 2.486709] driver_register+0x62/0x120
<4>[ 2.486709] ? __pfx_ahci_pci_driver_init+0x10/0x10
<4>[ 2.486709] __pci_register_driver+0x62/0x70
<4>[ 2.486709] ahci_pci_driver_init+0x22/0x30
<4>[ 2.486709] do_one_initcall+0x62/0x250
<4>[ 2.486709] kernel_init_freeable+0x1ba/0x310
<4>[ 2.486709] ? __pfx_kernel_init+0x10/0x10
<4>[ 2.486709] kernel_init+0x1e/0x1d0
<4>[ 2.486709] ret_from_fork+0x41/0x60
<4>[ 2.486709] ? __pfx_kernel_init+0x10/0x10
<4>[ 2.486709] ret_from_fork_asm+0x1a/0x30
<4>[ 2.486709] </TASK>
<3>[ 2.508109] kobject: kobject_add_internal failed for scsi_tmf_-1073661392 with -EEXIST, don't try to register things with the same name in the same directory.
<4>[ 2.519098] scsi host1: failed to create tmf workq
<6>[ 2.524520] kworker/R-scsi_ (56) used greatest stack depth: 15464 bytes left
<6>[ 2.528402] scsi_eh_1 (55) used greatest stack depth: 14872 bytes left
<3>[ 2.540312] ahci 0000:00:1f.2: probe with driver ahci failed with error -12
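As a reading aid for the two error codes in the log above, here is a minimal user-space sketch. It assumes the usual Linux errno numbering (EEXIST is 17, ENOMEM is 12), so the final "error -12" corresponds to -ENOMEM. The helper name errname() is hypothetical and not a kernel API; it only decodes the two negative return values that show up in this log.

```c
#include <errno.h>

/*
 * Hypothetical helper (not a kernel API): map a negative kernel-style
 * return code to its symbolic name. Only the two codes seen in the
 * boot log are handled; everything else is reported as "unknown".
 */
static const char *errname(int ret)
{
	switch (-ret) {
	case EEXIST:
		return "EEXIST";	/* sysfs refused the duplicate name */
	case ENOMEM:
		return "ENOMEM";	/* what the ahci probe finally returned */
	default:
		return "unknown";
	}
}
```

Calling errname(-12) on a Linux system yields "ENOMEM", and errname(-EEXIST) yields "EEXIST".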
Full dmesg log:
-----------
- https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20240814/tes...

Reproduce script:
---
- https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/2kdXPeyCZIU...
- Qemu version: 8.2.4

Boot command:
/usr/bin/qemu-system-x86_64 -cpu Nehalem -machine q35 -nographic -nic none -m 4G -monitor none -no-reboot -smp 2 -kernel kernel/bzImage -append "console=ttyS0,115200 rootwait root=/dev/sda debug verbose console_msg_format=syslog systemd.log_level=warning rw earlycon" -drive file=rootfs.ext4,if=ide,format=raw

Build link:
------
- https://storage.tuxsuite.com/public/linaro/lkft/builds/2kdXMb8C1EcMoXxMdKTWd...
- https://storage.tuxsuite.com/public/linaro/lkft/builds/2kdXMb8C1EcMoXxMdKTWd...

metadata:
---
git_ref: master
git_describe: next-20240814
git_repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next
kernel_version: 6.11.0-rc3
arch: x86
device: qemu-x86_64
Please let me know if you need more information.
-- Linaro LKFT https://lkft.linaro.org
On Wed, 14 Aug 2024 at 15:15, Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
The qemu-x86_64 boot failed with today's Linux next-20240814 tag due to the following crash.
The catch here is that the crash is seen on both the x86_64 device and qemu-x86_64, but the x86_64 device is still able to boot successfully.
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Boot log:
[ 0.000000] Linux version 6.11.0-rc3-next-20240814 (tuxmake@tuxmake) (x86_64-linux-gnu-gcc (Debian 13.3.0-1) 13.3.0, GNU ld (GNU Binutils for Debian) 2.42.50.20240625) #1 SMP PREEMPT_DYNAMIC @1723614704
...
<6>[ 2.479915] scsi host0: ahci
<4>[ 2.484371] sysfs: cannot create duplicate filename '/devices/virtual/workqueue/scsi_tmf_-1073661392'
Anders bisected this to the following first bad commit; with that commit reverted, qemu-x86_64 boots successfully again.
# first bad commit: [b188c57af2b5c17a1e8f71a0358f330446a4f788] workqueue: Split alloc_workqueue into internal function and lockdep init
original report link: https://lore.kernel.org/all/CA+G9fYuD4-qKAX9nDS-3cy+HwGbyJ6WoD7bZ_QL0J__A++P...
- Naresh
On Wed, 14 Aug 2024 at 15:56, Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
On Wed, 14 Aug 2024 at 15:15, Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
The qemu-x86_64 boot failed with today's Linux next-20240814 tag due to the following crash.
The catch here is that the crash is seen on both the x86_64 device and qemu-x86_64, but the x86_64 device is still able to boot successfully.
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Boot log:
[ 0.000000] Linux version 6.11.0-rc3-next-20240814 (tuxmake@tuxmake) (x86_64-linux-gnu-gcc (Debian 13.3.0-1) 13.3.0, GNU ld (GNU Binutils for Debian) 2.42.50.20240625) #1 SMP PREEMPT_DYNAMIC @1723614704
...
<6>[ 2.479915] scsi host0: ahci
<4>[ 2.484371] sysfs: cannot create duplicate filename '/devices/virtual/workqueue/scsi_tmf_-1073661392'
Anders bisected this to the following first bad commit; with that commit reverted, qemu-x86_64 boots successfully again.
# first bad commit: [b188c57af2b5c17a1e8f71a0358f330446a4f788] workqueue: Split alloc_workqueue into internal function and lockdep init
This reported problem is still seen on today's Linux next-20240819 tag.
original report link: https://lore.kernel.org/all/CA+G9fYuD4-qKAX9nDS-3cy+HwGbyJ6WoD7bZ_QL0J__A++P...
Latest crash log link:
- https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20240819/tes...
-- Linaro LKFT https://lkft.linaro.org
- Naresh
On Wed, Aug 14, 2024 at 03:56:56PM +0530, Naresh Kamboju wrote:
On Wed, 14 Aug 2024 at 15:15, Naresh Kamboju <naresh.kamboju@linaro.org> wrote:
The qemu-x86_64 boot failed with today's Linux next-20240814 tag due to the following crash.
The catch here is that the crash is seen on both the x86_64 device and qemu-x86_64, but the x86_64 device is still able to boot successfully.
Reported-by: Linux Kernel Functional Testing <lkft@linaro.org>
Boot log:
[ 0.000000] Linux version 6.11.0-rc3-next-20240814 (tuxmake@tuxmake) (x86_64-linux-gnu-gcc (Debian 13.3.0-1) 13.3.0, GNU ld (GNU Binutils for Debian) 2.42.50.20240625) #1 SMP PREEMPT_DYNAMIC @1723614704
...
<6>[ 2.479915] scsi host0: ahci
<4>[ 2.484371] sysfs: cannot create duplicate filename '/devices/virtual/workqueue/scsi_tmf_-1073661392'

The number at the end of that name (-1073661392) is negative. This comes from:

	shost->tmf_work_q = alloc_workqueue("scsi_tmf_%d",
					WQ_UNBOUND | WQ_MEM_RECLAIM | WQ_SYSFS,
					1, shost->host_no);
shost->host_no comes from ida_alloc() and we have checked to ensure it's not negative. The problem is the va_args changes in workqueue.c as Anders's bisect shows.
kernel/workqueue.c

	static struct workqueue_struct *__alloc_workqueue(const char *fmt,
							  unsigned int flags,
							  int max_active, ...)	<-- This should be a "va_list args" now.

	struct workqueue_struct *alloc_workqueue(const char *fmt,
						 unsigned int flags,
						 int max_active, ...)
	{
		struct workqueue_struct *wq;
		va_list args;

		va_start(args, max_active);
		wq = __alloc_workqueue(fmt, flags, max_active, args);	<-- We're passing a va_list.
		va_end(args);
Any workqueue that has a format string is going to be broken now.
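The pattern described above can be sketched in user space. This is only an illustration under stated assumptions, not the kernel code: the names format_name() and format_name_v() and the NAME_LEN constant are hypothetical. The point it shows is that only the outer variadic function calls va_start()/va_end(), while the inner helper is declared with a va_list parameter and hands it straight to vsnprintf().

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>

#define NAME_LEN 32	/* hypothetical buffer size for this sketch */

/*
 * Inner helper: receives an already-started va_list. It must be
 * declared with a va_list parameter (not "...") and must not call
 * va_start() itself.
 */
static int format_name_v(char *buf, size_t len, const char *fmt, va_list args)
{
	return vsnprintf(buf, len, fmt, args);
}

/*
 * Outer variadic wrapper: owns va_start()/va_end() and forwards the
 * list to the helper.
 */
static int format_name(char *buf, size_t len, const char *fmt, ...)
{
	va_list args;
	int ret;

	va_start(args, fmt);
	ret = format_name_v(buf, len, fmt, args);
	va_end(args);

	return ret;
}
```

With this split, format_name(buf, sizeof(buf), "scsi_tmf_%d", 1) writes "scsi_tmf_1" into buf. If the helper were declared with "..." instead and ran its own va_start(), the "%d" would be formatted from the caller's va_list object rather than from the integer, which is undefined behaviour in C.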
regards, dan carpenter
Give this patch a shot and I'll resend if it fixes the bug.
Fixes: b188c57af2b5 ("workqueue: Split alloc_workqueue into internal function and lockdep init")
Signed-off-by: Dan Carpenter <dan.carpenter@linaro.org>
---
 kernel/workqueue.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index bfeeefeee332..2fb93f3088f9 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -5623,12 +5623,10 @@ static void wq_adjust_max_active(struct workqueue_struct *wq)
 	} while (activated);
 }
 
-__printf(1, 4)
 static struct workqueue_struct *__alloc_workqueue(const char *fmt,
 						  unsigned int flags,
-						  int max_active, ...)
+						  int max_active, va_list args)
 {
-	va_list args;
 	struct workqueue_struct *wq;
 	size_t wq_size;
 	int name_len;
@@ -5660,10 +5658,7 @@ static struct workqueue_struct *__alloc_workqueue(const char *fmt,
 		goto err_free_wq;
 	}
 
-	va_start(args, max_active);
 	name_len = vsnprintf(wq->name, sizeof(wq->name), fmt, args);
-	va_end(args);
-
 	if (name_len >= WQ_NAME_LEN)
 		pr_warn_once("workqueue: name exceeds WQ_NAME_LEN. Truncating to: %s\n",
 			     wq->name);
On Wed, Aug 21, 2024 at 02:54:28PM +0300, Dan Carpenter wrote:
Give this patch a shot and I'll resend if it fixes the bug.
Fixes: b188c57af2b5 ("workqueue: Split alloc_workqueue into internal function and lockdep init") Signed-off-by: Dan Carpenter dan.carpenter@linaro.org
Actually never mind. Matthew sent this same patch today so this bug is fixed on today's linux-next.
regards, dan carpenter