Hello LKFT maintainers, CI operators,
First, I would like to say thank you to the people behind the LKFT project for validating stable kernels (and more), and for including some networking selftests in their test suites.
A lot of improvements have been made to the networking kselftests this year. At the last Netconf [1], we discussed how these tests are validated on stable kernels by CIs like the LKFT one, and we have some suggestions to improve the situation.

KSelftests from the same version
--------------------------------
According to the doc [2], kselftests should support all previous kernel versions, so the LKFT CI uses the kselftests from the latest stable release to validate all stable versions. Even if there are good reasons to do that, we would like to ask for an opt-out from this policy for the networking tests. With the increased complexity of these tests, the policy is hard to maintain, patches are hard to validate on all stable kernels before being applied, and in some situations backward compatibility is hard to put in place at all. As a result, many tests are failing on older kernels, and supporting those kernels looks like a lot of ongoing work.
Many networking tests validate internal behaviour that is not exposed to userspace. A typical example: some tests look at the raw packets being exchanged during a test, and this behaviour can change without modifying how userspace interacts with the kernel. The kernel could expose capabilities, but that does not seem natural to put in place for internal behaviours that are not exposed to end users. Workarounds could perhaps be used, e.g. looking at kernel symbols. But that doesn't always work, it increases the complexity, and "false positive" issues will often be noticed only after a patch hits stable, causing a bunch of tests to be ignored.
Regarding fixes, ideally they will come with a new or modified test that can also be backported, so that the coverage can continue to grow in stable versions too.
Do you think that, starting from kernel v6.12 (or earlier?), the LKFT CI could run the networking kselftests from the version that is being validated, and not from a newer one? In other words, running the selftests from v6.12.1 on a v6.12.1 kernel, rather than the ones from a future v6.16.y on a v6.12.42 kernel.
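To make the request concrete, here is a minimal sketch of how a CI job could pick the selftests that match the kernel under test. The helper name `selftests_tag` is hypothetical and not part of LKFT. It only assumes the usual tag naming (v6.12 for a mainline release, v6.12.1 for a stable release); local version suffixes are dropped and -rc releases are not handled.

```python
import re


def selftests_tag(kernel_release: str) -> str:
    """Map a kernel release string (as printed by `uname -r`) to the
    git tag whose selftests match that kernel.

    Hypothetical helper, for illustration only.
    """
    match = re.match(r"^(\d+)\.(\d+)(?:\.(\d+))?", kernel_release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {kernel_release!r}")
    major, minor, sublevel = match.groups()
    # A mainline release reports a ".0" sublevel but is tagged without it.
    if sublevel is None or sublevel == "0":
        return f"v{major}.{minor}"
    return f"v{major}.{minor}.{sublevel}"


if __name__ == "__main__":
    for release in ("6.12.0", "6.12.1", "6.12.42"):
        print(release, "->", selftests_tag(release))
```

With such a mapping, a v6.12.42 kernel would be tested with the tools/testing/selftests/net directory from the v6.12.42 tag, not with the one from a later release.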

Skipped tests
-------------
It looks like many tests are skipped:
- Some have been in a skip file [3] for a while: maybe they can be removed?
- Some are skipped because of missing tools: maybe they can be added? e.g. iputils, tshark, ipv6toolkit, etc.
- Some tests are in 'net', but in subdirectories, and hence not tested, e.g. forwarding, packetdrill, netfilter, tcp_ao. Could they be tested too?
How can we change this to increase the code coverage using existing tests?
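As one possible way to cover the subdirectories listed above, the kselftest Makefile accepts a TARGETS variable naming the directories to build and run. The sketch below only builds the command line; the helper `kselftest_command` and the `kernel_tree` parameter are hypothetical, it does not reflect how the LKFT test definitions invoke the tests, and whether each subdirectory exists as a separate target depends on the kernel version.

```python
# Subdirectories of 'net' mentioned above. Their exact location in the
# tree is an assumption: it can differ between kernel versions.
NET_SUBDIRS = ("forwarding", "packetdrill", "netfilter", "tcp_ao")


def kselftest_command(kernel_tree: str, subdirs=NET_SUBDIRS) -> list:
    """Build a 'make' command running 'net' and the given subdirectories.

    Hypothetical helper, for illustration only.
    """
    targets = ["net"] + [f"net/{name}" for name in subdirs]
    return [
        "make",
        "-C", f"{kernel_tree}/tools/testing/selftests",
        "TARGETS=" + " ".join(targets),
        "run_tests",
    ]


if __name__ == "__main__":
    # Print the command instead of running it.
    print(" ".join(kselftest_command("linux")))
```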

KVM
---
It looks like different VMs are being used to execute the different tests. Do these VMs benefit from any acceleration such as KVM? If not, some tests might fail because the environment is too slow.
The KSFT_MACHINE_SLOW=yes env var can be set to increase some tolerances and timeouts, or to skip some parts, but that might not be enough for some tests.
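As an illustration, here is a minimal sketch of how a runner could set this variable automatically when no acceleration is available. It assumes the check is done on the host launching the VMs, that KVM availability can be approximated by read/write access to /dev/kvm, and that the resulting environment is then passed to the tests inside the guest. The helper `selftest_env` is hypothetical.

```python
import os


def selftest_env(kvm_device: str = "/dev/kvm", base_env=None) -> dict:
    """Return the environment to use for the selftests.

    KSFT_MACHINE_SLOW=yes is added when the KVM device is not usable.
    Hypothetical helper, for illustration only.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Assumption: a readable and writable /dev/kvm means the VMs can be
    # accelerated. Without it, they fall back to slow emulation.
    if not os.access(kvm_device, os.R_OK | os.W_OK):
        env["KSFT_MACHINE_SLOW"] = "yes"
    return env


if __name__ == "__main__":
    print(selftest_env().get("KSFT_MACHINE_SLOW", "not set"))
```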

Notifications
-------------
In case of new regressions, who is being notified? Are the people listed in the MAINTAINERS file for the corresponding selftests notified, or do they need to monitor the results on their side?
Looking forward to improving the networking selftests results when validating stable kernels!

[1] https://netdev.bots.linux.dev/netconf/2024/
[2] https://docs.kernel.org/dev-tools/kselftest.html
[3] https://github.com/Linaro/test-definitions/blob/master/automated/linux/kself...

Cheers,
Matt