On 17/08/2023 08:13, Anshuman Khandual wrote:
Hello Junhao,
On 8/16/23 19:40, Suzuki K Poulose wrote:
From: Junhao He <hejunhao3@huawei.com>
smp_call_function_single() sends an IPI to the target processor together with a function call request. When the target processor receives the IPI, it executes the arm_trbe_remove_coresight_cpu() call request in the interrupt handler.
According to the device_unregister() stack information, if another process is using the device, down_write() may sleep, which can trigger deadlocks or unexpected errors.
  arm_trbe_remove_coresight_cpu
    coresight_unregister
      device_unregister
        device_del
          kobject_del
            __kobject_del
              sysfs_remove_dir
                kernfs_remove
                  down_write ---------> it may sleep
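As a rough illustration of the hazard in the stack above, here is a small userspace model (not kernel code, and not the actual patch). Every name prefixed with model_ is hypothetical: model_call_on_cpu() stands in for smp_call_function_single(), model_unregister() for coresight_unregister(), and model_might_sleep() for the kernel's might_sleep() check. It only shows why the sleeping step has to run outside the IPI callback.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

static bool in_atomic_ctx;            /* true while "inside the IPI handler" */
static int sleep_violations;          /* sleeping calls made while atomic */
static bool trbe_enabled[NR_CPUS];
static bool csdev_registered[NR_CPUS];

/* Hypothetical stand-in for might_sleep(): count misuse instead of warning. */
static void model_might_sleep(void)
{
	if (in_atomic_ctx)
		sleep_violations++;
}

/* Stand-in for coresight_unregister(): may block on a lock, so it may sleep. */
static void model_unregister(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	model_might_sleep();
	csdev_registered[cpu] = false;
}

/* Per-CPU hardware teardown: touches only local state and never sleeps. */
static void model_disable_on_cpu(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	trbe_enabled[cpu] = false;
}

/* Stand-in for smp_call_function_single(): fn runs in atomic context. */
static void model_call_on_cpu(int cpu, void (*fn)(int))
{
	in_atomic_ctx = true;
	fn(cpu);
	in_atomic_ctx = false;
}

/* Problematic shape: the unregister step runs inside the IPI callback. */
static void model_remove_cpu_buggy(int cpu)
{
	model_disable_on_cpu(cpu);
	model_unregister(cpu);
}

static void model_setup(void)
{
	in_atomic_ctx = false;
	sleep_violations = 0;
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		trbe_enabled[cpu] = true;
		csdev_registered[cpu] = true;
	}
}

/* Returns the number of sleeping calls made from atomic context. */
int model_remove_buggy(void)
{
	model_setup();
	for (int cpu = 0; cpu < NR_CPUS; cpu++)
		model_call_on_cpu(cpu, model_remove_cpu_buggy);
	return sleep_violations;
}

/* Split shape: only the non-sleeping part runs in the IPI callback. */
int model_remove_split(void)
{
	model_setup();
	for (int cpu = 0; cpu < NR_CPUS; cpu++) {
		model_call_on_cpu(cpu, model_disable_on_cpu);
		model_unregister(cpu);   /* process context, allowed to sleep */
	}
	return sleep_violations;
}
```

In this model, model_remove_buggy() records one violation per CPU, while model_remove_split() records none. Whether the real driver can be split along exactly these lines depends on what the per-CPU teardown has to touch, so treat this as a sketch of the idea only.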
But how did you actually detect this problem? Does it show up as a warning when lockdep debugging is enabled, or did it really happen during a real workload followed by a TRBE module unload? The problem seems plausible (and needs fixing); I am just wondering how it was triggered.
I was able to trigger this with just :
modprobe coresight-trbe; modprobe -r coresight-trbe;
With all the bells and whistles enabled in the kernel.
Suzuki