Hi, Andy
I started a CTS job from the command line with the lava-dispatch command after I had left work (about 11:00 UTC), and the telnet process is now consuming 100% CPU (starting from about 12:25).
However, the lava-dispatch process has disappeared, maybe because my ssh connection from the company was disconnected, and the parent PID of the telnet process has become 1.
The process with PID 7287 is the telnet session connected to panda24:

liuyq0307@staging:~$ ps -ef|grep telnet
1005      7287     1 65 10:36 pts/4    03:23:52 /usr/bin/telnet serial2 7033
root     14409 14324  0 15:47 pts/1    00:00:00 /usr/bin/telnet serial1 7005
1008     14749 13592  0 15:48 pts/7    00:00:00 grep --color=auto telnet
liuyq0307@staging:~$
The output of the top command:

Tasks: 158 total,   2 running, 156 sleeping,   0 stopped,   0 zombie
Cpu(s):  2.9%us, 23.0%sy,  0.0%ni, 73.0%id,  1.0%wa,  0.0%hi,  0.1%si,  0.0%st
Mem:   8178504k total,  6159136k used,  2019368k free,    65632k buffers
Swap:        0k total,        0k used,        0k free,  4714592k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 7287 instance  20   0 27516 1520 1228 R  100  0.0 202:05.32 telnet
26256 root      20   0  881m  46m 5740 S    2  0.6  27:02.47 lava-server
 5005 instance  20   0 2439m 141m 8588 S    1  1.8   0:10.40 java
13804 liuyq030  20   0 17340 1296  912 R    0  0.0   0:00.01 top
19202 root      20   0 38792 1488 1016 S    0  0.0   5:46.67 adb
26284 postgres  20   0  127m  32m  28m S    0  0.0   0:38.28 postgres
    1 root      20   0 24460 2340 1244 S    0  0.0   0:00.97 init
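
For anyone who wants to check other hosts for the same symptom, here is a rough Python sketch that scans "ps -ef" style output for telnet processes whose parent PID is 1. The function name find_orphaned and the SAMPLE text are only illustrative (the sample is copied from the ps listing in this mail), and the parsing assumes the usual eight-column "ps -ef" layout:

```python
def find_orphaned(ps_output, command="telnet"):
    """Return (pid, ppid, cmd) tuples for matching processes whose parent is PID 1."""
    orphans = []
    for line in ps_output.splitlines():
        # Assumed column layout: UID PID PPID C STIME TTY TIME CMD
        fields = line.split(None, 7)
        if len(fields) < 8 or not fields[1].isdigit() or not fields[2].isdigit():
            continue  # skip the header line and anything malformed
        pid, ppid, cmd = int(fields[1]), int(fields[2]), fields[7]
        # Match on the executable name only, so "grep ... telnet" is ignored.
        if ppid == 1 and cmd.split()[0].endswith(command):
            orphans.append((pid, ppid, cmd))
    return orphans


# Sample taken from the ps listing in this mail.
SAMPLE = """\
1005      7287     1 65 10:36 pts/4    03:23:52 /usr/bin/telnet serial2 7033
root     14409 14324  0 15:47 pts/1    00:00:00 /usr/bin/telnet serial1 7005
1008     14749 13592  0 15:48 pts/7    00:00:00 grep --color=auto telnet
"""

if __name__ == "__main__":
    for pid, ppid, cmd in find_orphaned(SAMPLE):
        print(pid, ppid, cmd)
```

On a live machine the same function could be fed the text returned by subprocess.check_output(["ps", "-ef"]), decoded to a string.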
Thanks,
Yongqin Liu

On 14/11/2012, Andy Doan andy.doan@linaro.org wrote:
On 11/14/2012 12:48 AM, YongQin Liu wrote:
Hi, Andy & Michael
About the problem that the telnet process consumes CPU (bug 1034218, https://bugs.launchpad.net/linaro-android/+bug/1034218), so far I have tried two ways to verify it:
- Run the CTS test by submitting a LAVA job.
  In this case, the process that consumes the CPU is telnet.
- Run the CTS test from the command line with "lava-android-test run cts".
  In this case, no process consumes 100% CPU, even though I also had a telnet session open at the same time.
So I guess the problem is in the way we call the telnet command in lava-dispatcher. From my investigation, it is the select syscall in telnet that consumes the CPU, so I suspect that somewhere in lava-dispatcher the output of telnet is read in a loop without any sleep. However, I did not find such a place in lava-dispatcher.
What do you think about it?
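
To make the suspicion about select concrete, here is a small Python sketch of one general way a select loop can spin without any explicit busy loop: if a watched file descriptor reaches end-of-file (for example because the other side went away), select reports it as readable immediately on every call, and a loop that does not stop on the empty read never blocks. This is only an illustration on a POSIX pipe, not a confirmed diagnosis of what telnet or lava-dispatcher is doing; count_wakeups is a made-up name for this example.

```python
import os
import select
import time


def count_wakeups(duration=0.2):
    """Count how often select() returns for a pipe whose writer has gone away."""
    read_fd, write_fd = os.pipe()
    os.close(write_fd)  # the peer disappears: the read end is now at EOF
    wakeups = 0
    deadline = time.monotonic() + duration
    try:
        while time.monotonic() < deadline:
            # A 1 second timeout is requested, but it is never reached:
            # a descriptor at EOF always counts as "readable".
            readable, _, _ = select.select([read_fd], [], [], 1.0)
            if readable and os.read(read_fd, 1024) == b"":
                # A correct loop would break here; one that ignores the
                # empty read keeps spinning and burns CPU in select().
                wakeups += 1
    finally:
        os.close(read_fd)
    return wakeups


if __name__ == "__main__":
    print("select() returned", count_wakeups(), "times in 0.2 seconds")
```

If select actually blocked, the count would stay at zero for the whole interval; a large count shows the loop never sleeps even though a timeout was passed.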
Do you get 100% CPU when you run the job by hand, i.e. "lava-dispatch jobfile.json"?
This just doesn't make sense. I don't see how the telnet binary's usage of the "select" API is being influenced by its parent process.
Finally, I feel that this is a problem in lava-dispatcher, not in lava-android-test or CTS, so can we change it to be a lava-dispatcher bug?
That's fine. The core problem remains the same. The big question is: what do we need to try and debug next?