On 05/31/2011 08:48 PM, Paul Larson wrote:
Since there is no lp project (that I am aware of) for the pmqa tests, I think I just reported it to Amit directly at the time. Iirc, he said to follow up on it when he had someone working on the qa tests, which appears to be you, so I'm doing that now. :) Do you have a place where bugs for it should live?
No, not yet. Let me discuss creating a Launchpad project with the power management team.
We could put it against abrek I suppose, but that's not really right: it's a bug in the test suite, not the framework.
What I was seeing at the time was that avail_freq02 would never exit on beagleXM. It should take only a few seconds to complete, I think, based on what I saw on panda. But on beagleXM it would sit in this state for hours:
I doubt the problem is coming from the test suite. At first glance, I think it exposes a kernel bug. It is not normal to have a userspace program blocked in uninterruptible state, and moreover a kthread blocked too. There is probably a domino effect here, with a dangling lock in the kernel.
Can you tell me the kernel version where this problem appears? Does it happen with beagleXM only, or with more boards? If the former, it will be hard for me to reproduce the problem as I don't have one. Is there a solution for that? (e.g. access to a board farm, plus a magic finger to reboot the boards.)
root@linaro:~# abrek run pwrmgmt
[ 241.163391] INFO: task kworker/0:1:24 blocked for more than 120 seconds.
[ 241.170379] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 241.178680] INFO: task avail_freq02.sh:1078 blocked for more than 120 seconds.
[ 241.186218] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 361.186889] INFO: task kworker/0:1:24 blocked for more than 120 seconds.
[ 361.193878] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 361.202209] INFO: task avail_freq02.sh:1078 blocked for more than 120 seconds.
[ 361.209747] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Can you reproduce the problem with hung_task_panic set to 1, so that we get a full stack trace and more context?
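In case it helps, here is a minimal sketch of how that could be set before rerunning the test. It assumes the kernel was built with CONFIG_DETECT_HUNG_TASK and that you have a root shell on the board. The function name enable_hung_task_panic and its optional directory argument are only illustrative, not part of abrek or the pmqa tests.

```shell
#!/bin/sh
# Sketch: make the hung task detector panic instead of only logging a warning.
# The optional first argument overrides the sysctl directory (hypothetical
# convenience for trying this outside the board); the default is the real
# /proc/sys/kernel location.
#
# Possible usage on the board, as root:
#   enable_hung_task_panic && abrek run pwrmgmt

enable_hung_task_panic() {
    dir="${1:-/proc/sys/kernel}"

    # The file is absent when the hung task detector is not compiled in,
    # and not writable when we are not root.
    if [ ! -w "$dir/hung_task_panic" ]; then
        echo "cannot write $dir/hung_task_panic (detector missing, or not root?)" >&2
        return 1
    fi

    echo 1 > "$dir/hung_task_panic"

    # Report the resulting settings so they end up in the test log.
    echo "hung_task_panic=$(cat "$dir/hung_task_panic")"
    if [ -r "$dir/hung_task_timeout_secs" ]; then
        echo "hung_task_timeout_secs=$(cat "$dir/hung_task_timeout_secs")"
    fi
}
```

With this set, the board should panic once a task has been blocked for longer than the timeout, so it is best to capture the serial console output during the run.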
Thanks
  -- Daniel