On Wed, 28 Feb 2024 at 23:40, Guenter Roeck <linux@roeck-us.net> wrote:
On 2/28/24 02:15, Geert Uytterhoeven wrote:
CC testing
On Wed, Feb 28, 2024 at 8:59 AM Guenter Roeck <linux@roeck-us.net> wrote:
On 2/27/24 23:25, Christophe Leroy wrote: [ ... ]
This test case is supposed to be as true to the "general case" as possible, so I have aligned the data along 14 + NET_IP_ALIGN. On ARM this will be a 16-byte boundary since NET_IP_ALIGN is 2. A driver that does not follow this may not be appropriately tested by this test case, but anyone is welcome to submit additional test cases that address this additional alignment concern.
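The arithmetic above can be sketched in plain userspace C. This is only an illustration under stated assumptions, not the test's actual code: `ETH_HLEN` and `NET_IP_ALIGN` simply mirror the values 14 and 2 quoted in the text, while `frame`, `ip_header_offset()` and `is_aligned()` are hypothetical names made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ETH_HLEN     14	/* Ethernet header length quoted above */
#define NET_IP_ALIGN 2	/* value quoted above for ARM; may differ elsewhere */

/* Hypothetical packet buffer, aligned to 16 bytes for the illustration. */
static _Alignas(16) unsigned char frame[64];

/* Offset of the IP header when NET_IP_ALIGN bytes of padding precede
 * the Ethernet header. */
static size_t ip_header_offset(void)
{
	return NET_IP_ALIGN + ETH_HLEN;
}

/* Returns nonzero if p lies on an align-byte boundary.
 * align must be a nonzero power of two. */
static int is_aligned(const void *p, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return ((uintptr_t)p & (align - 1)) == 0;
}
```

With the padding in place, `frame + ip_header_offset()` sits on a 16-byte boundary. Without it, `frame + ETH_HLEN` is only 2-byte aligned, which is the situation the unaligned test cases would exercise.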
But then this test case is becoming less and less true to the "general case" with this patch, whereas your initial implementation was almost perfect as it was covering most cases, a lot more than what we get with that patch applied.
NP with me if that is where people want to go. I'll simply disable checksum tests on all architectures which don't support unaligned accesses (so far it looks like that is only arm with thumb instructions, and possibly nios2). I personally find that less desirable and would have preferred a second configurable set of tests for unaligned accesses, but I have no problem with it.
IMHO the tests should validate the expected functionality. If a test fails, either functionality is missing or behaves incorrectly, or the test is wrong.
What is the point of writing tests for a core functionality like network checksumming that do not match the expected functionality?
Tough one. I can't enable CONFIG_NET_TEST on nios2, parisc, and arm with THUMB enabled due to crashes or hangs in gso tests. I accept that. Downside is that I have to disable CONFIG_NET_TEST on those architectures/platforms entirely, meaning a whole class of tests is missing for those architectures. I would prefer to have a configuration option such as CONFIG_NET_GSO_TEST to let me disable the problematic tests for the affected platforms so I can run all the other network unit tests. Yes, obviously something is wrong either with the affected tests or with the implementation of the tested functionality on the affected systems, but that could be handled separately if a separate configuration option existed, and new regressions in other tests on the affected architectures could be identified as they happen.
This case is similar. I'd prefer to have a separate configuration option, say, CONFIG_CHECKSUM_MISALIGNED_KUNIT, which I can disable to be able to run the common checksum tests on platforms / architectures which don't support unaligned accesses.
However, as I said, if the community wants to take a harsh stance, I have no problem with just disabling groups of tests entirely on platforms which have a problem with part of it.
Guenter
I think the ideal solution is for there to be some official stance on the required alignment, for every architecture to support that, and for the tests to exercise it. Now, judging from the sheer number of replies in this thread, it seems like there isn't any real agreement on that. (From my quick reading of some of the checksum code, my assumption was that this was either 1- or 2-byte alignment required, with 4-byte alignment being ideal for performance reasons in most setups.)
If different architectures have different alignment requirements (ouch!), my feeling is that the test suite should be written to the maximum such alignment (as any non-architecture-specific code will need to align things anyway), and architectures/drivers with non-aligned buffers can have their own tests. If it turns out there are a lot of such drivers/architectures, then we can add the extra config option.
If there is a config option to disable these tests, I'd rather it be of the form ARCH_HAS_UNALIGNED_CHECKSUM (or similar), which enables them. There's also the option of having the test 'skip' itself on a configuration which doesn't support it. That way it'll still show up in the list of tests, but with a description, like "Disabled due to checksum alignment requirements" or something, which may be more obvious to people debugging it later.
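To make the 'skip' idea concrete, here is a minimal userspace model of it. It is a sketch under assumptions, not the KUnit API: `test_status`, `test_result`, `csum16()` and `run_unaligned_checksum_test()` are all hypothetical names, and the `has_unaligned_checksum` argument stands in for whatever HAS_xxx-style symbol ends up describing the feature. In a real KUnit test the skip itself would presumably go through `kunit_skip()` with the same kind of reason string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum test_status { TEST_PASS, TEST_FAIL, TEST_SKIP };

struct test_result {
	enum test_status status;
	const char *reason;	/* non-NULL only for skipped tests */
};

/* Byte-wise 16-bit ones' complement checksum; safe at any alignment. */
static uint16_t csum16(const unsigned char *p, size_t len)
{
	uint32_t sum = 0;
	size_t i;

	assert(p != NULL);
	for (i = 0; i + 1 < len; i += 2)
		sum += ((uint32_t)p[i] << 8) | p[i + 1];
	if (i < len)			/* odd trailing byte */
		sum += (uint32_t)p[i] << 8;
	while (sum >> 16)		/* fold carries back in */
		sum = (sum & 0xffff) + (sum >> 16);
	return (uint16_t)~sum;
}

/* Skips, with a reason, when the feature flag is off; otherwise checks
 * that a copy of the data at an odd offset yields the same checksum. */
static struct test_result run_unaligned_checksum_test(int has_unaligned_checksum)
{
	static const unsigned char data[] = {
		0x45, 0x00, 0x00, 0x1c, 0xde, 0xad, 0xbe
	};
	unsigned char buf[sizeof(data) + 1];
	struct test_result res = { TEST_PASS, NULL };

	if (!has_unaligned_checksum) {
		res.status = TEST_SKIP;
		res.reason = "Disabled due to checksum alignment requirements";
		return res;
	}

	memcpy(buf + 1, data, sizeof(data));	/* odd offset within buf */
	if (csum16(buf + 1, sizeof(data)) != csum16(data, sizeof(data)))
		res.status = TEST_FAIL;
	return res;
}
```

The point of the design is that a skipped test still produces a result with a human-readable reason, instead of silently vanishing from the list when the whole suite is configured out.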
For the gso test hangs, I think it's probably quite sensible to have a config option for the GSO tests generally. I'd be more hesitant to have a separate CONFIG_NET_GSO_FREQUENTLY_BROKEN_TESTS, which is selected automatically by a bunch of architectures. At that point, I think we need to either just fix the bugs, or start thinking about a better solution for these tests / architectures.
One of the things I'm hoping to work on this year is some improvements to KUnit tooling to automatically run tests across a wider set of architectures and configs, so test authors can catch this sort of thing before even sending patches out. We can do a bit of this with the manual --arch <arch> option to kunit.py, but very few people will test things across more than a couple of architectures, and rarely will we get good testing on the less common architectures, like 32-bit ones, big endian ones, or ones with stricter alignment requirements. So we can do better there.
tl;dr: I think it's a good idea for tests to sit behind config options. Obviously the options shouldn't be too broad or too granular, but common sense usually prevails here. I'd rather not have config options explicitly for "broken" tests, though: if you have to, try to make the config option for the missing/broken feature (HAS_xxx) rather than for the test, if possible. Otherwise, 'skip' the test, with a suitable reason string if you can.
-- David