On Mon, 30 Jul 2018 at 01:42, Yongqin Liu yongqin.liu@linaro.org wrote:
On Fri, 27 Jul 2018 at 23:03, Bero Rosenkränzer Bernhard.Rosenkranzer@linaro.org wrote:
Hi, -11 is -EAGAIN -- AFAIK connect() returning EAGAIN usually means the listen() backlog is full. Unless the test's purpose is to check connections are accepted quickly under load, it may make sense to make the test handle EAGAIN instead of failing on it, e.g. change
xyz = connect(...);
to something more like
    int retries = 10;
    do {
        xyz = connect(...);
        if (xyz >= 0 || errno != EAGAIN)
            break;
        sleep(1);
    } while (--retries);
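For reference, a fuller version of that loop might look like the following. This is only a sketch: connect_with_retry and its parameters are made-up names, not part of VtsKernelNetTest, and it assumes a blocking socket for which retrying on EAGAIN is acceptable.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper (not from the VTS test): call connect() up to
 * max_attempts times, retrying only while it fails with EAGAIN and
 * sleeping delay_sec seconds between attempts.
 * Returns 0 on success, -1 on failure with errno set. */
int connect_with_retry(int fd, const struct sockaddr *addr, socklen_t addrlen,
                       int max_attempts, unsigned int delay_sec)
{
    int ret = -1;

    if (max_attempts <= 0) {
        errno = EINVAL;
        return -1;
    }

    for (int attempt = 0; attempt < max_attempts; attempt++) {
        ret = connect(fd, addr, addrlen);
        if (ret == 0 || errno != EAGAIN)
            break;                  /* success, or an error a retry will not fix */
        if (attempt + 1 < max_attempts)
            sleep(delay_sec);       /* wait before the next attempt */
    }
    return ret;
}
```

A caller would replace the plain connect() with something like connect_with_retry(fd, addr, addrlen, 10, 1) and still treat a final -1 as a test failure.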
Hmm, I don't think that is what is failing here: the test passes on the 4.14 and 4.9 kernels, and it also passes with other parameter combinations.
Beyond this particular case, what I most want to learn is how to debug kernel-side functions. Adding printk lines to find the function where the error really originates does not seem like a smart approach; maybe others have better methods for such cases.
I use ftrace all the time. It has all the advantages of printk without the downsides (atomic/interrupt context, modification of kernel behavior). The information is accumulated in ring buffers that can be retrieved and analysed once the test case has been executed. If you go that way you'll want to use 'trace-cmd' to control what gets traced. For trace analysis use kernelshark - it is really good and you see exactly what is happening on the entire system.
This tutorial here [1] gives you a lot of information on what I just wrote above - I especially recommend the LWN articles by Steve Rostedt.
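To make the ftrace suggestion a bit more concrete, here is a rough sketch of driving the function_graph tracer through the tracefs control files from C, which is essentially the interface trace-cmd wraps. Assumptions: the helper names tracefs_write and trace_call_graph are invented for this example; tracefs is mounted at /sys/kernel/tracing (or /sys/kernel/debug/tracing on older setups); the kernel was built with the function graph tracer and dynamic ftrace; and tcp_v4_connect, taken from the report below, is traceable on that kernel.

```c
#include <stdio.h>
#include <string.h>

/* Write `value` plus a newline into the control file `dir`/`file`, the same
 * way `echo value > file` would. Returns 0 on success, -1 on error. */
int tracefs_write(const char *dir, const char *file, const char *value)
{
    char path[512];
    int n = snprintf(path, sizeof(path), "%s/%s", dir, file);

    if (n < 0 || (size_t)n >= sizeof(path))
        return -1;                  /* path did not fit in the buffer */

    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;

    int ok = fprintf(f, "%s\n", value) >= 0;
    if (fclose(f) != 0)             /* buffered write errors surface here */
        ok = 0;
    return ok ? 0 : -1;
}

/* Hypothetical helper: restrict the function_graph tracer to the call graph
 * below `func` and switch tracing on. Returns 0 on success, -1 on error. */
int trace_call_graph(const char *tracefs_dir, const char *func)
{
    if (func == NULL || strlen(func) == 0)
        return -1;                  /* an empty filter would trace everything */

    if (tracefs_write(tracefs_dir, "tracing_on", "0") != 0 ||
        tracefs_write(tracefs_dir, "set_graph_function", func) != 0 ||
        tracefs_write(tracefs_dir, "current_tracer", "function_graph") != 0 ||
        tracefs_write(tracefs_dir, "tracing_on", "1") != 0)
        return -1;
    return 0;
}
```

Run as root with trace_call_graph("/sys/kernel/tracing", "tcp_v4_connect") before starting the test; afterwards the recorded call graph can be read from the trace file in the same directory. Functions the compiler inlined will not appear in the graph, so it narrows down where to look rather than pointing at the exact line.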
Last but not least I recommend patience - the road you're embarking on is long and arduous.
Hope this helps, Mathieu
[1]. https://jvns.ca/blog/2017/03/19/getting-started-with-ftrace/
Thanks, Yongqin Liu
On Fri, 27 Jul 2018 at 16:41, Yongqin Liu yongqin.liu@linaro.org wrote:
Hi, Sumit, John, Amit, All
I am investigating the VtsKernelNetTest failure on the HiKey 4.4 kernel. The problem is that the socket.connect call returns -11. By adding printk lines to the SYSCALL_DEFINE3 for connect in socket.c, I found that the error is returned by the line "err = sock->ops->connect(sock, (struct sockaddr *)&address, addrlen, sock->file->f_flags);" https://android.googlesource.com/kernel/hikey-linaro/+/android-hikey-linaro-...
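As a side note, a negative return value like this is a negated errno code, so it can be decoded from user space. A minimal sketch, assuming Linux on arm64 or x86, where EAGAIN is 11; describe_ret is a hypothetical helper name:

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: format a kernel-style return value such as -11
 * together with the matching errno description.
 * Returns what snprintf() returns. */
int describe_ret(int ret, char *buf, size_t len)
{
    if (ret >= 0)
        return snprintf(buf, len, "%d (success)", ret);
    return snprintf(buf, len, "%d (%s)", ret, strerror(-ret));
}
```

Passing -11 on such a system yields the text for EAGAIN, which matches the reading of the error given elsewhere in this thread.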
At that point I did not know which connect function was being called, so I searched for the .connect assignments in kernel/linaro/hisilicon-4.4/net/ipv4/
and, by adding more printk lines, found that the implementation is tcp_v4_connect in net/ipv4/tcp_ipv4.c: https://android.googlesource.com/kernel/hikey-linaro/+/android-hikey-linaro-...
There, again by adding printk lines, I found that the -11 is returned by the call to ip_route_connect: https://android.googlesource.com/kernel/hikey-linaro/+/android-hikey-linaro-...
Next I would need to go to the definition of ip_route_connect, add printks there, and so on, until I find the place where -11 is actually returned and can check the reason.
But this approach seems clumsy and time-consuming; I think there must be smarter methods that I need to learn.
Could you please give some suggestions on how to handle such cases?
Thanks in advance!
-- Best Regards, Yongqin Liu
#mailing list linaro-android@lists.linaro.org http://lists.linaro.org/mailman/listinfo/linaro-android _______________________________________________ linaro-android mailing list linaro-android@lists.linaro.org https://lists.linaro.org/mailman/listinfo/linaro-android
-- Best Regards, Yongqin Liu