Following kernel DEBUG_LOCKS_WARN_ON(lock->magic != lock) noticed on arm64 dragonboard 410c device while booting Linux next-20220502 tag kernel.
warning: --------- [ 6.052275] ------------[ cut here ]------------ [ 6.052289] DEBUG_LOCKS_WARN_ON(lock->magic != lock) [ 6.052314] WARNING: CPU: 3 PID: 57 at kernel/locking/mutex.c:582 __mutex_lock+0x1dc/0x420 [ 6.052345] Modules linked in: [ 6.052361] CPU: 3 PID: 57 Comm: kworker/u8:2 Not tainted 5.18.0-rc5-next-20220502 #1 [ 6.052374] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT) [ 6.052384] Workqueue: events_unbound deferred_probe_work_func [ 6.052408] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 6.052421] pc : __mutex_lock+0x1dc/0x420 [ 6.052434] lr : __mutex_lock+0x1dc/0x420 [ 6.052446] sp : ffff80000c42b9a0 [ 6.052453] x29: ffff80000c42b9a0 x28: ffff80000b314860 x27: ffff00000408810d [ 6.052481] x26: ffff80000adb4bb8 x25: ffff000005220a74 x24: 0000000000000000 [ 6.052506] x23: 0000000000000000 x22: ffff800008bbef8c x21: 0000000000000002 [ 6.052533] x20: 0000000000000000 x19: ffff00000c1a5110 x18: 0000000000000000 [ 6.052558] x17: ffff800008bbd594 x16: ffff800008bc10f4 x15: ffff80000927b6c4 [ 6.052585] x14: 0000000000000000 x13: 284e4f5f4e524157 x12: 5f534b434f4c5f47 [ 6.052611] x11: 656820747563205b x10: 0000000000000029 x9 : ffff8000081cc7b8 [ 6.052637] x8 : 0000000000000029 x7 : 0000000000000013 x6 : 0000000000000001 [ 6.267204] x5 : ffff80000adb5000 x4 : ffff80000adb54f0 x3 : 0000000000000000 [ 6.274321] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000005265080 [ 6.281439] Call trace: [ 6.288524] __mutex_lock+0x1dc/0x420 [ 6.290789] mutex_lock_nested+0x4c/0x90 [ 6.294610] driver_set_override+0x12c/0x160 [ 6.298605] qcom_smd_register_edge+0x2d8/0x52c [ 6.302860] qcom_smd_probe+0x64/0x90 [ 6.307106] platform_probe+0x74/0x100 [ 6.310928] really_probe+0x1a4/0x3f4 [ 6.314572] __driver_probe_device+0x124/0x1a0 [ 6.318310] driver_probe_device+0x44/0x110 [ 6.322648] __device_attach_driver+0xb0/0x154 [ 6.326730] bus_for_each_drv+0x84/0xe0 [ 6.331240] __device_attach+0xe0/0x1d0 [ 6.334974] device_initial_probe+0x20/0x30 [ 6.338795] bus_probe_device+0xac/0xb4 [ 6.342958] deferred_probe_work_func+0xc8/0x120 [ 6.346784] process_one_work+0x280/0x6e0 [ 6.351641] worker_thread+0x230/0x434 [ 6.355546] kthread+0x10c/0x120 [ 6.359188] ret_from_fork+0x10/0x20 [ 6.362577] irq event stamp: 4049 [ 6.366132] hardirqs last enabled at (4049): [<ffff8000096ac248>] _raw_spin_unlock_irqrestore+0xa8/0xc4 [ 6.369392] hardirqs last disabled at (4048): [<ffff8000096ac968>] _raw_spin_lock_irqsave+0xd8/0x170 [ 6.378938] softirqs last enabled at (1496): [<ffff800008090cd8>] __do_softirq+0x528/0x654 [ 6.388046] softirqs last disabled at (1491): [<ffff80000812f99c>] __irq_exit_rcu+0x17c/0x1b0 [ 6.396120] ---[ end trace 0000000000000000 ]---
Reported-by: Linux Kernel Functional Testing lkft@linaro.org
metadata: git_ref: master git_repo: https://gitlab.com/Linaro/lkft/mirrors/next/linux-next git_sha: 9f9b9a2972eb8dcaad09d826c5c6d7488eaca3e6 git_describe: next-20220502 kernel_version: 5.18.0-rc5 kernel-config: https://builds.tuxbuild.com/28bnDcc7Sy8tyxRNdWwIkoP1qQK/config artifact-location: https://builds.tuxbuild.com/28bnDcc7Sy8tyxRNdWwIkoP1qQK toolchain: gcc-11
-- Linaro LKFT https://lkft.linaro.org
[1] https://lkft.validation.linaro.org/scheduler/job/4978421#L2608 [2] https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20220502/tes...
On Wed, May 4, 2022 at 11:09 AM Naresh Kamboju naresh.kamboju@linaro.org wrote:
Following kernel DEBUG_LOCKS_WARN_ON(lock->magic != lock) noticed on arm64 dragonboard 410c device while booting Linux next-20220502 tag kernel.
warning:
[ 6.052275] ------------[ cut here ]------------ [ 6.052289] DEBUG_LOCKS_WARN_ON(lock->magic != lock) [ 6.052314] WARNING: CPU: 3 PID: 57 at kernel/locking/mutex.c:582 __mutex_lock+0x1dc/0x420 [ 6.052345] Modules linked in: [ 6.052361] CPU: 3 PID: 57 Comm: kworker/u8:2 Not tainted 5.18.0-rc5-next-20220502 #1 [ 6.052374] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT) [ 6.052384] Workqueue: events_unbound deferred_probe_work_func [ 6.052408] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 6.052421] pc : __mutex_lock+0x1dc/0x420 [ 6.052434] lr : __mutex_lock+0x1dc/0x420 [ 6.052446] sp : ffff80000c42b9a0 [ 6.052453] x29: ffff80000c42b9a0 x28: ffff80000b314860 x27: ffff00000408810d [ 6.052481] x26: ffff80000adb4bb8 x25: ffff000005220a74 x24: 0000000000000000 [ 6.052506] x23: 0000000000000000 x22: ffff800008bbef8c x21: 0000000000000002 [ 6.052533] x20: 0000000000000000 x19: ffff00000c1a5110 x18: 0000000000000000 [ 6.052558] x17: ffff800008bbd594 x16: ffff800008bc10f4 x15: ffff80000927b6c4 [ 6.052585] x14: 0000000000000000 x13: 284e4f5f4e524157 x12: 5f534b434f4c5f47 [ 6.052611] x11: 656820747563205b x10: 0000000000000029 x9 : ffff8000081cc7b8 [ 6.052637] x8 : 0000000000000029 x7 : 0000000000000013 x6 : 0000000000000001 [ 6.267204] x5 : ffff80000adb5000 x4 : ffff80000adb54f0 x3 : 0000000000000000 [ 6.274321] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000005265080 [ 6.281439] Call trace: [ 6.288524] __mutex_lock+0x1dc/0x420 [ 6.290789] mutex_lock_nested+0x4c/0x90 [ 6.294610] driver_set_override+0x12c/0x160 [ 6.298605] qcom_smd_register_edge+0x2d8/0x52c
The problem is in
static int qcom_smd_create_chrdev(struct qcom_smd_edge *edge) { struct qcom_smd_device *qsdev;
qsdev = kzalloc(sizeof(*qsdev), GFP_KERNEL); if (!qsdev) return -ENOMEM;
qsdev->edge = edge; qsdev->rpdev.ops = &qcom_smd_device_ops; qsdev->rpdev.dev.parent = &edge->dev; qsdev->rpdev.dev.release = qcom_smd_release_device;
return rpmsg_ctrldev_register_device(&qsdev->rpdev); }
static inline int rpmsg_ctrldev_register_device(struct rpmsg_device *rpdev) { int ret;
strcpy(rpdev->id.name, "rpmsg_ctrl"); ret = driver_set_override(&rpdev->dev, &rpdev->driver_override, rpdev->id.name, strlen(rpdev->id.name)); if (ret) return ret;
ret = rpmsg_register_device(rpdev); if (ret) kfree(rpdev->driver_override);
return ret; }
This allocates an uninitialized device structure that is then passed driver_set_override() before calling device_register(), so the mutex is not initialized yet.
Arnd
On 04/05/2022 11:22, Arnd Bergmann wrote:
On Wed, May 4, 2022 at 11:09 AM Naresh Kamboju naresh.kamboju@linaro.org wrote:
Following kernel DEBUG_LOCKS_WARN_ON(lock->magic != lock) noticed on arm64 dragonboard 410c device while booting Linux next-20220502 tag kernel.
warning:
[ 6.052275] ------------[ cut here ]------------ [ 6.052289] DEBUG_LOCKS_WARN_ON(lock->magic != lock) [ 6.052314] WARNING: CPU: 3 PID: 57 at kernel/locking/mutex.c:582 __mutex_lock+0x1dc/0x420 [ 6.052345] Modules linked in: [ 6.052361] CPU: 3 PID: 57 Comm: kworker/u8:2 Not tainted 5.18.0-rc5-next-20220502 #1 [ 6.052374] Hardware name: Qualcomm Technologies, Inc. APQ 8016 SBC (DT) [ 6.052384] Workqueue: events_unbound deferred_probe_work_func [ 6.052408] pstate: 60000005 (nZCv daif -PAN -UAO -TCO -DIT -SSBS BTYPE=--) [ 6.052421] pc : __mutex_lock+0x1dc/0x420 [ 6.052434] lr : __mutex_lock+0x1dc/0x420 [ 6.052446] sp : ffff80000c42b9a0 [ 6.052453] x29: ffff80000c42b9a0 x28: ffff80000b314860 x27: ffff00000408810d [ 6.052481] x26: ffff80000adb4bb8 x25: ffff000005220a74 x24: 0000000000000000 [ 6.052506] x23: 0000000000000000 x22: ffff800008bbef8c x21: 0000000000000002 [ 6.052533] x20: 0000000000000000 x19: ffff00000c1a5110 x18: 0000000000000000 [ 6.052558] x17: ffff800008bbd594 x16: ffff800008bc10f4 x15: ffff80000927b6c4 [ 6.052585] x14: 0000000000000000 x13: 284e4f5f4e524157 x12: 5f534b434f4c5f47 [ 6.052611] x11: 656820747563205b x10: 0000000000000029 x9 : ffff8000081cc7b8 [ 6.052637] x8 : 0000000000000029 x7 : 0000000000000013 x6 : 0000000000000001 [ 6.267204] x5 : ffff80000adb5000 x4 : ffff80000adb54f0 x3 : 0000000000000000 [ 6.274321] x2 : 0000000000000000 x1 : 0000000000000000 x0 : ffff000005265080 [ 6.281439] Call trace: [ 6.288524] __mutex_lock+0x1dc/0x420 [ 6.290789] mutex_lock_nested+0x4c/0x90 [ 6.294610] driver_set_override+0x12c/0x160 [ 6.298605] qcom_smd_register_edge+0x2d8/0x52c
The problem is in
static int qcom_smd_create_chrdev(struct qcom_smd_edge *edge) { struct qcom_smd_device *qsdev;
qsdev = kzalloc(sizeof(*qsdev), GFP_KERNEL); if (!qsdev) return -ENOMEM; qsdev->edge = edge; qsdev->rpdev.ops = &qcom_smd_device_ops; qsdev->rpdev.dev.parent = &edge->dev; qsdev->rpdev.dev.release = qcom_smd_release_device; return rpmsg_ctrldev_register_device(&qsdev->rpdev);
}
static inline int rpmsg_ctrldev_register_device(struct rpmsg_device *rpdev) { int ret;
strcpy(rpdev->id.name, "rpmsg_ctrl"); ret = driver_set_override(&rpdev->dev, &rpdev->driver_override, rpdev->id.name, strlen(rpdev->id.name)); if (ret) return ret; ret = rpmsg_register_device(rpdev); if (ret) kfree(rpdev->driver_override); return ret;
}
This allocates an uninitialized device structure that is then passed driver_set_override() before calling device_register(), so the mutex is not initialized yet.
Folks, this was already reported by Marek and fixed:
https://lore.kernel.org/all/20220429195946.1061725-1-krzysztof.kozlowski@lin...
This has to go via Greg's tree.
Best regards, Krzysztof