cti_enable_hw() and cti_disable_hw() are called from an atomic context so shouldn't use runtime PM because it can result in a sleep when communicating with firmware.
Since commit 3c6656337852 ("Revert "firmware: arm_scmi: Add clock management to the SCMI power domain""), this causes a hang on Juno when running the Perf Coresight tests or running this command:
perf record -e cs_etm//u -- ls
This was also missed until the revert commit because pm_runtime_put() was called with the wrong device until commit 692c9a499b28 ("coresight: cti: Correct the parameter for pm_runtime_put")
With lock and scheduler debugging enabled the following is output:
coresight cti_sys0: cti_enable_hw -- dev:cti_sys0 parent: 20020000.cti BUG: sleeping function called from invalid context at drivers/base/power/runtime.c:1151 in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 330, name: perf-exec preempt_count: 2, expected: 0 RCU nest depth: 0, expected: 0 INFO: lockdep is turned off. irq event stamp: 0 hardirqs last enabled at (0): [<0000000000000000>] 0x0 hardirqs last disabled at (0): [<ffff80000822b394>] copy_process+0xa0c/0x1948 softirqs last enabled at (0): [<ffff80000822b394>] copy_process+0xa0c/0x1948 softirqs last disabled at (0): [<0000000000000000>] 0x0 CPU: 3 PID: 330 Comm: perf-exec Not tainted 6.0.0-00053-g042116d99298 #7 Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Sep 13 2022 Call trace: dump_backtrace+0x134/0x140 show_stack+0x20/0x58 dump_stack_lvl+0x8c/0xb8 dump_stack+0x18/0x34 __might_resched+0x180/0x228 __might_sleep+0x50/0x88 __pm_runtime_resume+0xac/0xb0 cti_enable+0x44/0x120 coresight_control_assoc_ectdev+0xc0/0x150 coresight_enable_path+0xb4/0x288 etm_event_start+0x138/0x170 etm_event_add+0x48/0x70 event_sched_in.isra.122+0xb4/0x280 merge_sched_in+0x1fc/0x3d0 visit_groups_merge.constprop.137+0x16c/0x4b0 ctx_sched_in+0x114/0x1f0 perf_event_sched_in+0x60/0x90 ctx_resched+0x68/0xb0 perf_event_exec+0x138/0x508 begin_new_exec+0x52c/0xd40 load_elf_binary+0x6b8/0x17d0 bprm_execve+0x360/0x7f8 do_execveat_common.isra.47+0x218/0x238 __arm64_sys_execve+0x48/0x60 invoke_syscall+0x4c/0x110 el0_svc_common.constprop.4+0xfc/0x120 do_el0_svc+0x34/0xc0 el0_svc+0x40/0x98 el0t_64_sync_handler+0x98/0xc0 el0t_64_sync+0x170/0x174
Fix the issue by removing the runtime PM calls completely. They are not needed here because it must have already been done when building the path for a trace.
Fixes: 835d722ba10a ("coresight: cti: Initial CoreSight CTI Driver") Reported-by: Aishwarya TCV Aishwarya.TCV@arm.com Reported-by: Cristian Marussi Cristian.Marussi@arm.com Signed-off-by: Suzuki Poulose Suzuki.Poulose@arm.com Signed-off-by: James Clark james.clark@arm.com --- drivers/hwtracing/coresight/coresight-cti-core.c | 3 --- 1 file changed, 3 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-cti-core.c b/drivers/hwtracing/coresight/coresight-cti-core.c index 8988b2ed2ea6..c5f7fc4e2552 100644 --- a/drivers/hwtracing/coresight/coresight-cti-core.c +++ b/drivers/hwtracing/coresight/coresight-cti-core.c @@ -94,7 +94,6 @@ static int cti_enable_hw(struct cti_drvdata *drvdata) unsigned long flags; int rc = 0;
- pm_runtime_get_sync(dev->parent); spin_lock_irqsave(&drvdata->spinlock, flags);
/* no need to do anything if enabled or unpowered*/ @@ -119,7 +118,6 @@ static int cti_enable_hw(struct cti_drvdata *drvdata) /* cannot enable due to error */ cti_err_not_enabled: spin_unlock_irqrestore(&drvdata->spinlock, flags); - pm_runtime_put(dev->parent); return rc; }
@@ -175,7 +173,6 @@ static int cti_disable_hw(struct cti_drvdata *drvdata) coresight_disclaim_device_unlocked(csdev); CS_LOCK(drvdata->base); spin_unlock(&drvdata->spinlock); - pm_runtime_put(dev->parent); return 0;
/* not disabled this call */
On 05/10/2022 14:14, James Clark wrote:
cti_enable_hw() and cti_disable_hw() are called from an atomic context so shouldn't use runtime PM because it can result in a sleep when communicating with firmware.
Hi Mike,
Are you able to test this on a Dragonboard please? I've only tested on Juno so far.
Thanks James
Since commit 3c6656337852 ("Revert "firmware: arm_scmi: Add clock management to the SCMI power domain""), this causes a hang on Juno when running the Perf Coresight tests or running this command:
perf record -e cs_etm//u -- ls
This was also missed until the revert commit because pm_runtime_put() was called with the wrong device until commit 692c9a499b28 ("coresight: cti: Correct the parameter for pm_runtime_put")
With lock and scheduler debugging enabled the following is output:
coresight cti_sys0: cti_enable_hw -- dev:cti_sys0 parent: 20020000.cti BUG: sleeping function called from invalid context at drivers/base/power/runtime.c:1151 in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 330, name: perf-exec preempt_count: 2, expected: 0 RCU nest depth: 0, expected: 0 INFO: lockdep is turned off. irq event stamp: 0 hardirqs last enabled at (0): [<0000000000000000>] 0x0 hardirqs last disabled at (0): [<ffff80000822b394>] copy_process+0xa0c/0x1948 softirqs last enabled at (0): [<ffff80000822b394>] copy_process+0xa0c/0x1948 softirqs last disabled at (0): [<0000000000000000>] 0x0 CPU: 3 PID: 330 Comm: perf-exec Not tainted 6.0.0-00053-g042116d99298 #7 Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Sep 13 2022 Call trace: dump_backtrace+0x134/0x140 show_stack+0x20/0x58 dump_stack_lvl+0x8c/0xb8 dump_stack+0x18/0x34 __might_resched+0x180/0x228 __might_sleep+0x50/0x88 __pm_runtime_resume+0xac/0xb0 cti_enable+0x44/0x120 coresight_control_assoc_ectdev+0xc0/0x150 coresight_enable_path+0xb4/0x288 etm_event_start+0x138/0x170 etm_event_add+0x48/0x70 event_sched_in.isra.122+0xb4/0x280 merge_sched_in+0x1fc/0x3d0 visit_groups_merge.constprop.137+0x16c/0x4b0 ctx_sched_in+0x114/0x1f0 perf_event_sched_in+0x60/0x90 ctx_resched+0x68/0xb0 perf_event_exec+0x138/0x508 begin_new_exec+0x52c/0xd40 load_elf_binary+0x6b8/0x17d0 bprm_execve+0x360/0x7f8 do_execveat_common.isra.47+0x218/0x238 __arm64_sys_execve+0x48/0x60 invoke_syscall+0x4c/0x110 el0_svc_common.constprop.4+0xfc/0x120 do_el0_svc+0x34/0xc0 el0_svc+0x40/0x98 el0t_64_sync_handler+0x98/0xc0 el0t_64_sync+0x170/0x174
Fix the issue by removing the runtime PM calls completely. They are not needed here because it must have already been done when building the path for a trace.
Fixes: 835d722ba10a ("coresight: cti: Initial CoreSight CTI Driver") Reported-by: Aishwarya TCV Aishwarya.TCV@arm.com Reported-by: Cristian Marussi Cristian.Marussi@arm.com Signed-off-by: Suzuki Poulose Suzuki.Poulose@arm.com Signed-off-by: James Clark james.clark@arm.com
drivers/hwtracing/coresight/coresight-cti-core.c | 3 --- 1 file changed, 3 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-cti-core.c b/drivers/hwtracing/coresight/coresight-cti-core.c index 8988b2ed2ea6..c5f7fc4e2552 100644 --- a/drivers/hwtracing/coresight/coresight-cti-core.c +++ b/drivers/hwtracing/coresight/coresight-cti-core.c @@ -94,7 +94,6 @@ static int cti_enable_hw(struct cti_drvdata *drvdata) unsigned long flags; int rc = 0;
- pm_runtime_get_sync(dev->parent); spin_lock_irqsave(&drvdata->spinlock, flags);
/* no need to do anything if enabled or unpowered*/ @@ -119,7 +118,6 @@ static int cti_enable_hw(struct cti_drvdata *drvdata) /* cannot enable due to error */ cti_err_not_enabled: spin_unlock_irqrestore(&drvdata->spinlock, flags);
- pm_runtime_put(dev->parent); return rc;
} @@ -175,7 +173,6 @@ static int cti_disable_hw(struct cti_drvdata *drvdata) coresight_disclaim_device_unlocked(csdev); CS_LOCK(drvdata->base); spin_unlock(&drvdata->spinlock);
- pm_runtime_put(dev->parent); return 0;
/* not disabled this call */
Tested on DB410c + kernel 6.0.0
Reviewed-by: Mike Leach mike.leach@linaro.org Tested-by: Mike Leach mike.leach@linaro.org
On Wed, 5 Oct 2022 at 14:16, James Clark james.clark@arm.com wrote:
cti_enable_hw() and cti_disable_hw() are called from an atomic context so shouldn't use runtime PM because it can result in a sleep when communicating with firmware.
Since commit 3c6656337852 ("Revert "firmware: arm_scmi: Add clock management to the SCMI power domain""), this causes a hang on Juno when running the Perf Coresight tests or running this command:
perf record -e cs_etm//u -- ls
This was also missed until the revert commit because pm_runtime_put() was called with the wrong device until commit 692c9a499b28 ("coresight: cti: Correct the parameter for pm_runtime_put")
With lock and scheduler debugging enabled the following is output:
coresight cti_sys0: cti_enable_hw -- dev:cti_sys0 parent: 20020000.cti BUG: sleeping function called from invalid context at drivers/base/power/runtime.c:1151 in_atomic(): 1, irqs_disabled(): 128, non_block: 0, pid: 330, name: perf-exec preempt_count: 2, expected: 0 RCU nest depth: 0, expected: 0 INFO: lockdep is turned off. irq event stamp: 0 hardirqs last enabled at (0): [<0000000000000000>] 0x0 hardirqs last disabled at (0): [<ffff80000822b394>] copy_process+0xa0c/0x1948 softirqs last enabled at (0): [<ffff80000822b394>] copy_process+0xa0c/0x1948 softirqs last disabled at (0): [<0000000000000000>] 0x0 CPU: 3 PID: 330 Comm: perf-exec Not tainted 6.0.0-00053-g042116d99298 #7 Hardware name: ARM LTD ARM Juno Development Platform/ARM Juno Development Platform, BIOS EDK II Sep 13 2022 Call trace: dump_backtrace+0x134/0x140 show_stack+0x20/0x58 dump_stack_lvl+0x8c/0xb8 dump_stack+0x18/0x34 __might_resched+0x180/0x228 __might_sleep+0x50/0x88 __pm_runtime_resume+0xac/0xb0 cti_enable+0x44/0x120 coresight_control_assoc_ectdev+0xc0/0x150 coresight_enable_path+0xb4/0x288 etm_event_start+0x138/0x170 etm_event_add+0x48/0x70 event_sched_in.isra.122+0xb4/0x280 merge_sched_in+0x1fc/0x3d0 visit_groups_merge.constprop.137+0x16c/0x4b0 ctx_sched_in+0x114/0x1f0 perf_event_sched_in+0x60/0x90 ctx_resched+0x68/0xb0 perf_event_exec+0x138/0x508 begin_new_exec+0x52c/0xd40 load_elf_binary+0x6b8/0x17d0 bprm_execve+0x360/0x7f8 do_execveat_common.isra.47+0x218/0x238 __arm64_sys_execve+0x48/0x60 invoke_syscall+0x4c/0x110 el0_svc_common.constprop.4+0xfc/0x120 do_el0_svc+0x34/0xc0 el0_svc+0x40/0x98 el0t_64_sync_handler+0x98/0xc0 el0t_64_sync+0x170/0x174
Fix the issue by removing the runtime PM calls completely. They are not needed here because it must have already been done when building the path for a trace.
Fixes: 835d722ba10a ("coresight: cti: Initial CoreSight CTI Driver") Reported-by: Aishwarya TCV Aishwarya.TCV@arm.com Reported-by: Cristian Marussi Cristian.Marussi@arm.com Signed-off-by: Suzuki Poulose Suzuki.Poulose@arm.com Signed-off-by: James Clark james.clark@arm.com
drivers/hwtracing/coresight/coresight-cti-core.c | 3 --- 1 file changed, 3 deletions(-)
diff --git a/drivers/hwtracing/coresight/coresight-cti-core.c b/drivers/hwtracing/coresight/coresight-cti-core.c index 8988b2ed2ea6..c5f7fc4e2552 100644 --- a/drivers/hwtracing/coresight/coresight-cti-core.c +++ b/drivers/hwtracing/coresight/coresight-cti-core.c @@ -94,7 +94,6 @@ static int cti_enable_hw(struct cti_drvdata *drvdata) unsigned long flags; int rc = 0;
pm_runtime_get_sync(dev->parent); spin_lock_irqsave(&drvdata->spinlock, flags); /* no need to do anything if enabled or unpowered*/
@@ -119,7 +118,6 @@ static int cti_enable_hw(struct cti_drvdata *drvdata) /* cannot enable due to error */ cti_err_not_enabled: spin_unlock_irqrestore(&drvdata->spinlock, flags);
pm_runtime_put(dev->parent); return rc;
}
@@ -175,7 +173,6 @@ static int cti_disable_hw(struct cti_drvdata *drvdata) coresight_disclaim_device_unlocked(csdev); CS_LOCK(drvdata->base); spin_unlock(&drvdata->spinlock);
pm_runtime_put(dev->parent); return 0; /* not disabled this call */
-- 2.28.0
Hi James
On 05/10/2022 14:14, James Clark wrote:
cti_enable_hw() and cti_disable_hw() are called from an atomic context so shouldn't use runtime PM because it can result in a sleep when communicating with firmware.
Since commit 3c6656337852 ("Revert "firmware: arm_scmi: Add clock management to the SCMI power domain""), this causes a hang on Juno when running the Perf Coresight tests or running this command:
perf record -e cs_etm//u -- ls
This was also missed until the revert commit because pm_runtime_put() was called with the wrong device until commit 692c9a499b28 ("coresight: cti: Correct the parameter for pm_runtime_put")
With lock and scheduler debugging enabled the following is output:
...
Fix the issue by removing the runtime PM calls completely. They are not needed here because it must have already been done when building the path for a trace.
Fixes: 835d722ba10a ("coresight: cti: Initial CoreSight CTI Driver") Reported-by: Aishwarya TCV Aishwarya.TCV@arm.com Reported-by: Cristian Marussi Cristian.Marussi@arm.com Signed-off-by: Suzuki Poulose Suzuki.Poulose@arm.com
I have applied this patch, changing the above to:
Suggested-by: ...
Suzuki