Hi folks,
I am trying to enable SMMU translation for the CoreSight TMC ETR component via the "iommus" property on the SC7180 SoC. As soon as I enable trace, I see random soft lockups, every time. To enable trace, I just select the ETR as the sink and an ETM as the source via sysfs, and the lockups start immediately. I have checked from the hardware side and there is no known erratum; the same setup works fine on the downstream kernel, which is very old (4.14). The problem is reproducible with KASAN disabled as well. I also dumped the stacks of the other CPUs during the soft lockup by adding dump_stack() to ipi_cpu_stop(), but there is nothing interesting in those logs since the other CPUs are idle.
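For anyone trying to reproduce this, the sysfs steps amount to roughly the following. This is a minimal sketch, not the exact commands used: the `enable_sink` and `enable_source` attributes are the standard CoreSight sysfs interface, but the helper name `enable_trace` and the device names `tmc_etr0` and `etm0` are assumptions, since device naming differs between kernel versions and boards.

```python
from pathlib import Path

# Standard location of CoreSight devices in sysfs.
CORESIGHT_ROOT = Path("/sys/bus/coresight/devices")


def enable_trace(sink: str, source: str, root: Path = CORESIGHT_ROOT) -> None:
    """Enable a CoreSight sink, then a source, by writing "1" to sysfs.

    `sink` and `source` are device directory names under `root`
    (hypothetical examples: "tmc_etr0" and "etm0").
    """
    # Enable the sink first: a source needs an enabled sink to build a path to.
    (root / sink / "enable_sink").write_text("1")
    # Enabling the source starts tracing into the selected sink.
    (root / source / "enable_source").write_text("1")
```

On a target, this would be run as root, for example `enable_trace("tmc_etr0", "etm0")`, after checking the actual device names under /sys/bus/coresight/devices.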
Logs for the soft lockup:
[ 69.725539] watchdog: BUG: soft lockup - CPU#0 stuck for 11s! [ksoftirqd/0:9]
[ 69.772709] irq event stamp: 144
[ 69.776047] hardirqs last enabled at (143): [<ffffffd05a0fb908>] run_ksoftirqd+0x34/0x70
[ 69.784452] hardirqs last disabled at (144): [<ffffffd05ab7bbd8>] __schedule+0x158/0x940
[ 69.792768] softirqs last enabled at (142): [<ffffffd05a0819dc>] __do_softirq+0x4bc/0x51c
[ 69.801255] softirqs last disabled at (69): [<ffffffd05a0fb904>] run_ksoftirqd+0x30/0x70
[ 69.824147] pstate: 60400009 (nZCv daif +PAN -UAO)
[ 69.829082] pc : skb_release_data+0x60/0x1a4
[ 69.833475] lr : skb_release_data+0x50/0x1a4
[ 69.837864] sp : ffffff8147cab770
[ 69.841279] x29: ffffff8147cab770 x28: ffffff8133277d00
[ 69.846744] x27: 0000000000000034 x26: ffffff814441a30e
[ 69.852204] x25: ffffff811f0c0000 x24: 0000000000000000
[ 69.857666] x23: ffffff81292cc000 x22: ffffff814441a380
[ 69.863133] x21: 0000000000000001 x20: ffffff814441a3a0
[ 69.868595] x19: ffffff81055a7540 x18: ffffff8114f80280
[ 69.874062] x17: 00000000000005a4 x16: ffffff8114f80278
[ 69.879526] x15: 0000000000000000 x14: 0000000000000001
[ 69.884994] x13: 00000000caf7cc01 x12: 00000000b6bc5499
[ 69.890453] x11: 0000000000000000 x10: dfffffd000000001
[ 69.895920] x9 : 0000000000000001 x8 : 0000000000000000
[ 69.901377] x7 : 0000000000000000 x6 : 0000000000000000
[ 69.906839] x5 : 0000000000000000 x4 : 0000000000000000
[ 69.912307] x3 : ffffffd05a99ea98 x2 : 0000000000000001
[ 69.917771] x1 : 0000000000000004 x0 : 0000000000000001
[ 69.923231] Call trace:
[ 69.925765] skb_release_data+0x60/0x1a4
[ 69.929801] skb_release_all+0x2c/0x38
[ 69.933669] consume_skb+0x48/0x6c
[ 69.937177] packet_rcv+0x5c/0x3f0
[ 69.940685] __netif_receive_skb_one_core+0x50/0x60
[ 69.945704] __netif_receive_skb+0x4c/0x8c
[ 69.949926] netif_receive_skb_internal+0xb8/0x198
[ 69.954856] napi_gro_receive+0x19c/0x360
[ 69.959030] r8152_poll+0x518/0x7b4 [r8152]
[ 69.963336] net_rx_action+0x114/0x32c
[ 69.967198] __do_softirq+0x258/0x51c
[ 69.970968] irq_exit+0xd8/0xf8
[ 69.974215] __handle_domain_irq+0x8c/0xc4
[ 69.978430] gic_handle_irq+0x100/0x1c4
[ 69.982378] el1_irq+0xbc/0x180
[ 69.985619] smpboot_thread_fn+0x84/0x2bc
[ 69.989753] kthread+0x128/0x138
[ 69.993081] ret_from_fork+0x10/0x18
[ 69.996769] Kernel panic - not syncing: softlockup: hung tasks
[ 70.017342] Call trace:
[ 70.019867] dump_backtrace+0x0/0x188
[ 70.023640] show_stack+0x20/0x2c
[ 70.027061] dump_stack+0xdc/0x144
[ 70.030573] panic+0x174/0x37c
[ 70.033724] softlockup_fn+0x0/0x60
[ 70.037324] __hrtimer_run_queues+0x264/0x498
[ 70.041808] hrtimer_interrupt+0xf0/0x22c
[ 70.045946] arch_timer_handler_phys+0x40/0x50
[ 70.050524] handle_percpu_devid_irq+0x8c/0x158
[ 70.055190] __handle_domain_irq+0x84/0xc4
[ 70.059412] gic_handle_irq+0x100/0x1c4
[ 70.063359] el1_irq+0xbc/0x180
[ 70.066595] skb_release_data+0x60/0x1a4
[ 70.070638] skb_release_all+0x2c/0x38
[ 70.074498] consume_skb+0x48/0x6c
[ 70.078007] packet_rcv+0x5c/0x3f0
[ 70.081518] __netif_receive_skb_one_core+0x50/0x60
[ 70.086533] __netif_receive_skb+0x4c/0x8c
[ 70.090746] netif_receive_skb_internal+0xb8/0x198
[ 70.095673] napi_gro_receive+0x19c/0x360
[ 70.099825] r8152_poll+0x518/0x7b4 [r8152]
[ 70.104129] net_rx_action+0x114/0x32c
[ 70.107984] __do_softirq+0x258/0x51c
[ 70.111760] irq_exit+0xd8/0xf8
[ 70.115003] __handle_domain_irq+0x8c/0xc4
[ 70.119215] gic_handle_irq+0x100/0x1c4
[ 70.123168] el1_irq+0xbc/0x180
[ 70.126410] smpboot_thread_fn+0x84/0x2bc
[ 70.130540] kthread+0x128/0x138
[ 70.133862] ret_from_fork+0x10/0x18
[ 70.137570] SMP: stopping secondary CPUs
[ 70.355132] Kernel Offset: 0x4a000000 from 0xffffffd010000000
[ 70.361047] PHYS_OFFSET: 0xfffffffac0000000
[ 70.365357] CPU features: 0x0006,2a80aa18
[ 70.369489] Memory Limit: none
Another such soft lockup:
[ 122.634764] watchdog: BUG: soft lockup - CPU#0 stuck for 11s! [sugov:0:1729]
[ 122.683618] irq event stamp: 0
[ 122.686813] hardirqs last enabled at (0): [<0000000000000000>] 0x0
[ 122.693310] hardirqs last disabled at (0): [<ffffffd254aec380>] copy_process+0x3e4/0xca8
[ 122.701655] softirqs last enabled at (0): [<ffffffd254aec398>] copy_process+0x3fc/0xca8
[ 122.710003] softirqs last disabled at (0): [<0000000000000000>] 0x0
[ 122.731030] pstate: 20400009 (nzCv daif +PAN -UAO)
[ 122.736001] pc : preempt_schedule_irq+0x48/0x84
[ 122.740697] lr : preempt_schedule_irq+0x44/0x84
[ 122.745389] sp : ffffffc015c13a80
[ 122.748828] x29: ffffffc015c13a80 x28: ffffff815285af80
[ 122.754330] x27: ffffff815285af80 x26: ffffffc010004000
[ 122.759833] x25: ffffffc010000000 x24: 0000000000000000
[ 122.765332] x23: 0000000080c00009 x22: ffffffd2555492e8
[ 122.770826] x21: 0000000000000060 x20: 00000000000000e0
[ 122.776321] x19: ffffff815285af80 x18: 0000000000000005
[ 122.781825] x17: 0000000000000000 x16: 0000000000000001
[ 122.787319] x15: 0000000000000010 x14: 0000000000000010
[ 122.792822] x13: 0000000000000000 x12: ffffffd254a9bb14
[ 122.798320] x11: 632d4b6b9e7ff300 x10: 632d4b6b9e7ff300
[ 122.803820] x9 : ffffffd25554750c x8 : 0000000000000000
[ 122.809318] x7 : ffffffd254ba17b4 x6 : 0000000000000000
[ 122.814815] x5 : 0000000000010000 x4 : 0000000000000080
[ 122.820312] x3 : ffffffc015c13a00 x2 : ffffffd254a93c04
[ 122.825809] x1 : ffffffc015c13a00 x0 : ffffffd255548028
[ 122.831309] Call trace:
[ 122.833876] preempt_schedule_irq+0x48/0x84
[ 122.838217] arm64_preempt_schedule_irq+0x4c/0x70
[ 122.843092] el1_irq+0xd4/0x180
[ 122.846364] __mutex_unlock_slowpath+0x260/0x290
[ 122.851150] mutex_unlock+0x24/0x30
[ 122.854780] dev_pm_opp_xlate_required_opp+0xd8/0x114
[ 122.860010] qcom_cpufreq_hw_target_index+0xf4/0x1a4
[ 122.865160] __cpufreq_driver_target+0x1e8/0x2f4
[ 122.869947] sugov_work+0x58/0x70
[ 122.873397] kthread_worker_fn+0x10c/0x1f8
[ 122.877652] kthread+0x11c/0x12c
[ 122.881014] ret_from_fork+0x10/0x18
[ 122.884733] Kernel panic - not syncing: softlockup: hung tasks
[ 122.905309] Call trace:
[ 122.907872] dump_backtrace+0x0/0x188
[ 122.911678] show_stack+0x20/0x2c
[ 122.915130] dump_stack+0xc8/0x124
[ 122.918667] panic+0x168/0x388
[ 122.921857] softlockup_fn+0x0/0x60
[ 122.925485] __hrtimer_run_queues+0x250/0x47c
[ 122.930003] hrtimer_interrupt+0xf0/0x22c
[ 122.934171] arch_timer_handler_phys+0x40/0x50
[ 122.938782] handle_percpu_devid_irq+0x8c/0x158
[ 122.943485] __handle_domain_irq+0x84/0xc4
[ 122.947737] gic_handle_irq+0xd0/0x178
[ 122.951625] el1_irq+0xbc/0x180
[ 122.954903] preempt_schedule_irq+0x48/0x84
[ 122.959239] arm64_preempt_schedule_irq+0x4c/0x70
[ 122.964110] el1_irq+0xd4/0x180
[ 122.967387] __mutex_unlock_slowpath+0x260/0x290
[ 122.972169] mutex_unlock+0x24/0x30
[ 122.975791] dev_pm_opp_xlate_required_opp+0xd8/0x114
[ 122.981015] qcom_cpufreq_hw_target_index+0xf4/0x1a4
[ 122.986150] __cpufreq_driver_target+0x1e8/0x2f4
[ 122.990933] sugov_work+0x58/0x70
[ 122.994378] kthread_worker_fn+0x10c/0x1f8
[ 122.998629] kthread+0x11c/0x12c
[ 123.001982] ret_from_fork+0x10/0x18
[ 123.005741] SMP: stopping secondary CPUs
[ 123.223202] Kernel Offset: 0x1244a00000 from 0xffffffc010000000
[ 123.229319] PHYS_OFFSET: 0xffffffdb80000000
[ 123.233655] CPU features: 0x0006,2a80aa18
[ 123.237809] Memory Limit: none
Thanks,
Sai