On Fri, Oct 13, 2017 at 10:16:04AM -0500, Dan Rue wrote:
On Fri, Oct 13, 2017 at 12:28:46PM +0000, Greg KH wrote:
On Fri, Oct 13, 2017 at 12:23:37PM +0100, Mark Brown wrote:
On Fri, Oct 13, 2017 at 12:31:37PM +0200, Greg KH wrote:
I just looked at this for kernelci. I'm not sure what's going on with LKFT here and haven't talked to anyone working on it, but I'll bet it was the same.
Turns out that 4.9.55-rc1 did not work at all for networking, yet no tests seem to have caught it. Are we not testing something with a network here? You said you were using NFS, how did that work?
Looks like it wasn't networking in general but rather the specific non-IP protocol that DHCP userspace uses for DHCP packets that was broken. You're going to find that most people testing in labs use static network configurations, so they won't exercise DHCP by default. Quite apart from anything else, if you're working with embedded stuff, the woeful unwillingness of hardware manufacturers to provide unique MAC addresses on their development boards means that getting a consistent IP address to the boards becomes troublesome; it's a lot easier to just provide a static configuration.
In the LKFT lab, we do use dhcp, but we do not use 'dhclient', which seems to have been required to trigger this bug. We bring up interfaces two different ways: at boot time, which worked fine, and explicitly on hikey using 'udhcpc', which also worked fine.
I am able to reproduce the problem by running 'dhclient' on x15 explicitly. The problem does not occur on 4.9.54, but it does occur on 4.9.55-rc1 as well as 4.9.55 release.
4.9.54 (pass): https://lkft.validation.linaro.org/scheduler/job/46865
4.9.55-rc1 (fail): https://lkft.validation.linaro.org/scheduler/job/46859#L1068
4.9.55 (fail): https://lkft.validation.linaro.org/scheduler/job/46866#L1078
I haven't tested 4.9.56 yet but I will shortly and expect it to pass.
We do have some existing tests that use dhclient, in both LTP and our test-definitions repository, that we don't currently run. If it's a good idea, we could add such tests to exercise code paths that we are concerned about. I just want to be sure that we add test coverage strategically and not just as a knee-jerk reaction to one-time problems.
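For illustration only, here is a rough sketch of what a check covering both clients might look like. It is not taken from LTP or test-definitions: the names DHCP_CLIENTS, build_command and lease_obtained are hypothetical, and it assumes the stock one-shot options ('dhclient -1', 'udhcpc -n -q -i') behave as documented on the target image.

```python
import subprocess

# Hypothetical table of one-shot invocations (an assumption, not LKFT's setup).
# "dhclient -1" tries once for a lease and exits non-zero on failure.
# "udhcpc -n -q" exits if no lease is obtained (-n) and quits once it has one (-q).
DHCP_CLIENTS = {
    "dhclient": ["dhclient", "-1", "-v"],
    "udhcpc": ["udhcpc", "-n", "-q", "-i"],
}


def build_command(client, iface):
    """Return the argv list for a single lease attempt on iface."""
    return DHCP_CLIENTS[client] + [iface]


def lease_obtained(client, iface, run=subprocess.run, timeout=60):
    """Run one lease attempt; True only if the client exits with status 0.

    The run parameter exists so the check can be exercised without
    touching a real interface (pass in a fake runner).
    """
    try:
        result = run(build_command(client, iface), timeout=timeout)
    except (subprocess.TimeoutExpired, FileNotFoundError):
        # A hung client or a missing binary is treated as a failure.
        return False
    return result.returncode == 0
```

Calling lease_obtained("dhclient", "eth0") and lease_obtained("udhcpc", "eth0") on the same board (as root, with "eth0" standing in for the real interface name) would exercise both clients, so a regression that only one of them triggers would show up as a mismatch.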
Hope that helps clarify,
Yes, that does, thanks. We just got unlucky with the specific bug here.
greg k-h