Hi,
I've been trying the hmm_tests as of today's commit:
a185a0995518 ("Merge tag 'linux-kselftest-kunit-6.1-rc1-2' ...)
and run into several issues that seemed worth reporting.
First, it seems the FIXTURE_TEARDOWN(hmm) in tools/testing/selftests/vm/hmm-tests.c using ASSERT_EQ(ret, 0); can run into an infinite loop of reporting the assertion failure. Dunno if it's a kselftests issue or it's a bug to use asserts in teardown. I hacked it up like this locally to proceed:
--- a/tools/testing/selftests/vm/hmm-tests.c
+++ b/tools/testing/selftests/vm/hmm-tests.c
@@ -154,6 +154,11 @@ FIXTURE_TEARDOWN(hmm)
 {
 	int ret = close(self->fd);

+	if (ret != 0) {
+		fprintf(stderr, "close returned (%d) fd is (%d)\n", ret,self->fd);
+		exit(1);
+	}
+
 	ASSERT_EQ(ret, 0);
 	self->fd = -1;
 }
Next, there are some tests that fail (and thus also trigger the issue above)
# RUN hmm.hmm_device_private.exclusive ...
# hmm-tests.c:1702:exclusive:Expected ret (-16) == 0 (0)
close returned (-1) fd is (3)
# exclusive: Test failed at step #1
# FAIL hmm.hmm_device_private.exclusive
not ok 20 hmm.hmm_device_private.exclusive
# RUN hmm.hmm_device_private.exclusive_mprotect ...
# hmm-tests.c:1756:exclusive_mprotect:Expected ret (-16) == 0 (0)
close returned (-1) fd is (3)
# exclusive_mprotect: Test failed at step #1
# FAIL hmm.hmm_device_private.exclusive_mprotect
not ok 21 hmm.hmm_device_private.exclusive_mprotect
# RUN hmm.hmm_device_private.exclusive_cow ...
# hmm-tests.c:1809:exclusive_cow:Expected ret (-16) == 0 (0)
close returned (-1) fd is (3)
# exclusive_cow: Test failed at step #1
# FAIL hmm.hmm_device_private.exclusive_cow
not ok 22 hmm.hmm_device_private.exclusive_cow
I'll try to check more closely but maybe if you can reproduce it too, you'll have more idea what's going on.
The next thing is more of a question/documentation suggestion. Tons of tests fail like this:
ok 24 hmm.hmm_device_private.hmm_cow_in_device
# RUN hmm.hmm_device_coherent.open_close ...
could not open hmm dmirror driver (/dev/hmm_dmirror2)
# SKIP DEVICE_COHERENT not available
# OK hmm.hmm_device_coherent.open_close
I assume this is because I run "test_hmm.sh smoke" without the SPM parameters. The help message doesn't say much about what to specify there for <spm_addr_dev0> <spm_addr_dev1>. Do these tests need a particular hardware? (unlike the rest?) Maybe it could be clarified.
Last thing, I noticed all these DEVICE_COHERENT tests ultimately count as OK, not SKIPPED, which would probably be more appropriate?
# FAILED: 51 / 54 tests passed.
# Totals: pass:50 fail:3 xfail:0 xpass:0 skip:1 error:0
(the skip:1 is due to test 9 "# SKIP Huge page could not be allocated" which is probably a misconfiguration on my part so I don't report that as an issue)
Thanks, Vlastimil
On 13.10.22 18:54, Vlastimil Babka wrote:
Hi,
I've been trying the hmm_tests as of today's commit:
a185a0995518 ("Merge tag 'linux-kselftest-kunit-6.1-rc1-2' ...)
and run into several issues that seemed worth reporting.
First, it seems the FIXTURE_TEARDOWN(hmm) in tools/testing/selftests/vm/hmm-tests.c using ASSERT_EQ(ret, 0); can run into an infinite loop of reporting the assertion failure. Dunno if it's a kselftests issue or it's a bug to use asserts in teardown. I hacked it up like this locally to proceed:
--- a/tools/testing/selftests/vm/hmm-tests.c
+++ b/tools/testing/selftests/vm/hmm-tests.c
@@ -154,6 +154,11 @@ FIXTURE_TEARDOWN(hmm)
 {
 	int ret = close(self->fd);

+	if (ret != 0) {
+		fprintf(stderr, "close returned (%d) fd is (%d)\n", ret,self->fd);
+		exit(1);
+	}
+
 	ASSERT_EQ(ret, 0);
 	self->fd = -1;
 }
Next, there are some tests that fail (and thus also trigger the issue above)
# RUN hmm.hmm_device_private.exclusive ... # hmm-tests.c:1702:exclusive:Expected ret (-16) == 0 (0) close returned (-1) fd is (3) # exclusive: Test failed at step #1 # FAIL hmm.hmm_device_private.exclusive not ok 20 hmm.hmm_device_private.exclusive # RUN hmm.hmm_device_private.exclusive_mprotect ... # hmm-tests.c:1756:exclusive_mprotect:Expected ret (-16) == 0 (0) close returned (-1) fd is (3) # exclusive_mprotect: Test failed at step #1 # FAIL hmm.hmm_device_private.exclusive_mprotect not ok 21 hmm.hmm_device_private.exclusive_mprotect # RUN hmm.hmm_device_private.exclusive_cow ... # hmm-tests.c:1809:exclusive_cow:Expected ret (-16) == 0 (0) close returned (-1) fd is (3) # exclusive_cow: Test failed at step #1 # FAIL hmm.hmm_device_private.exclusive_cow not ok 22 hmm.hmm_device_private.exclusive_cow
When did that test start failing? Was it still ok for 6.0?
On 10/13/22 11:01, David Hildenbrand wrote:
On 13.10.22 18:54, Vlastimil Babka wrote:
Hi,
I've been trying the hmm_tests as of today's commit:
a185a0995518 ("Merge tag 'linux-kselftest-kunit-6.1-rc1-2' ...)
and run into several issues that seemed worth reporting.
First, it seems the FIXTURE_TEARDOWN(hmm) in tools/testing/selftests/vm/hmm-tests.c using ASSERT_EQ(ret, 0); can run into an infinite loop of reporting the assertion failure. Dunno if it's a kselftests issue or it's a bug to use asserts in teardown. I hacked it up like this locally to proceed:
--- a/tools/testing/selftests/vm/hmm-tests.c +++ b/tools/testing/selftests/vm/hmm-tests.c @@ -154,6 +154,11 @@ FIXTURE_TEARDOWN(hmm) { int ret = close(self->fd); + if (ret != 0) { + fprintf(stderr, "close returned (%d) fd is (%d)\n", ret,self->fd); + exit(1); + }
ASSERT_EQ(ret, 0); self->fd = -1; }
Next, there are some tests that fail (and thus also trigger the issue above)
# RUN hmm.hmm_device_private.exclusive ... # hmm-tests.c:1702:exclusive:Expected ret (-16) == 0 (0) close returned (-1) fd is (3) # exclusive: Test failed at step #1 # FAIL hmm.hmm_device_private.exclusive not ok 20 hmm.hmm_device_private.exclusive # RUN hmm.hmm_device_private.exclusive_mprotect ... # hmm-tests.c:1756:exclusive_mprotect:Expected ret (-16) == 0 (0) close returned (-1) fd is (3) # exclusive_mprotect: Test failed at step #1 # FAIL hmm.hmm_device_private.exclusive_mprotect not ok 21 hmm.hmm_device_private.exclusive_mprotect # RUN hmm.hmm_device_private.exclusive_cow ... # hmm-tests.c:1809:exclusive_cow:Expected ret (-16) == 0 (0) close returned (-1) fd is (3) # exclusive_cow: Test failed at step #1 # FAIL hmm.hmm_device_private.exclusive_cow not ok 22 hmm.hmm_device_private.exclusive_cow
When did that test start failing? Was it still ok for 6.0?
commit 4fe89d07dcc2804c8b562f6c7896a45643d34b2f (tag: v6.0, linux/master)
# FAILED: 25 / 50 tests passed.
# Totals: pass:25 fail:25 xfail:0 xpass:0 skip:0 error:0
Looks good to me.
Possibly a change in 6.1, and we have time to fix them all. :)
thanks, -- Shuah
On 10/13/22 19:10, Shuah Khan wrote:
On 10/13/22 11:01, David Hildenbrand wrote:
On 13.10.22 18:54, Vlastimil Babka wrote:
Hi,
I've been trying the hmm_tests as of today's commit:
a185a0995518 ("Merge tag 'linux-kselftest-kunit-6.1-rc1-2' ...)
and run into several issues that seemed worth reporting.
First, it seems the FIXTURE_TEARDOWN(hmm) in tools/testing/selftests/vm/hmm-tests.c using ASSERT_EQ(ret, 0); can run into an infinite loop of reporting the assertion failure. Dunno if it's a kselftests issue or it's a bug to use asserts in teardown. I hacked it up like this locally to proceed:
--- a/tools/testing/selftests/vm/hmm-tests.c +++ b/tools/testing/selftests/vm/hmm-tests.c @@ -154,6 +154,11 @@ FIXTURE_TEARDOWN(hmm) { int ret = close(self->fd); + if (ret != 0) { + fprintf(stderr, "close returned (%d) fd is (%d)\n", ret,self->fd); + exit(1); + }
ASSERT_EQ(ret, 0); self->fd = -1; }
Next, there are some tests that fail (and thus also trigger the issue above)
# RUN hmm.hmm_device_private.exclusive ... # hmm-tests.c:1702:exclusive:Expected ret (-16) == 0 (0) close returned (-1) fd is (3) # exclusive: Test failed at step #1 # FAIL hmm.hmm_device_private.exclusive not ok 20 hmm.hmm_device_private.exclusive # RUN hmm.hmm_device_private.exclusive_mprotect ... # hmm-tests.c:1756:exclusive_mprotect:Expected ret (-16) == 0 (0) close returned (-1) fd is (3) # exclusive_mprotect: Test failed at step #1 # FAIL hmm.hmm_device_private.exclusive_mprotect not ok 21 hmm.hmm_device_private.exclusive_mprotect # RUN hmm.hmm_device_private.exclusive_cow ... # hmm-tests.c:1809:exclusive_cow:Expected ret (-16) == 0 (0) close returned (-1) fd is (3) # exclusive_cow: Test failed at step #1 # FAIL hmm.hmm_device_private.exclusive_cow not ok 22 hmm.hmm_device_private.exclusive_cow
When did that test start failing? Was it still ok for 6.0?
Didn't test yet, will try, in case it's my system/config specific thing.
commit 4fe89d07dcc2804c8b562f6c7896a45643d34b2f (tag: v6.0, linux/master)
# FAILED: 25 / 50 tests passed. # Totals: pass:25 fail:25 xfail:0 xpass:0 skip:0 error:0
Looks good to me.
Hmm but there's 25 that failed? Or are those also misreported SKIPs?
Possible change in 6.1 and we have to time fix them all. :)
thanks, -- Shuah
On 10/13/22 11:12, Vlastimil Babka wrote:
On 10/13/22 19:10, Shuah Khan wrote:
On 10/13/22 11:01, David Hildenbrand wrote:
On 13.10.22 18:54, Vlastimil Babka wrote:
Hi,
I've been trying the hmm_tests as of today's commit:
a185a0995518 ("Merge tag 'linux-kselftest-kunit-6.1-rc1-2' ...)
and run into several issues that seemed worth reporting.
First, it seems the FIXTURE_TEARDOWN(hmm) in tools/testing/selftests/vm/hmm-tests.c using ASSERT_EQ(ret, 0); can run into an infinite loop of reporting the assertion failure. Dunno if it's a kselftests issue or it's a bug to use asserts in teardown. I hacked it up like this locally to proceed:
--- a/tools/testing/selftests/vm/hmm-tests.c +++ b/tools/testing/selftests/vm/hmm-tests.c @@ -154,6 +154,11 @@ FIXTURE_TEARDOWN(hmm) { int ret = close(self->fd); + if (ret != 0) { + fprintf(stderr, "close returned (%d) fd is (%d)\n", ret,self->fd); + exit(1); + }
ASSERT_EQ(ret, 0); self->fd = -1; }
Next, there are some tests that fail (and thus also trigger the issue above)
# RUN hmm.hmm_device_private.exclusive ... # hmm-tests.c:1702:exclusive:Expected ret (-16) == 0 (0) close returned (-1) fd is (3) # exclusive: Test failed at step #1 # FAIL hmm.hmm_device_private.exclusive not ok 20 hmm.hmm_device_private.exclusive # RUN hmm.hmm_device_private.exclusive_mprotect ... # hmm-tests.c:1756:exclusive_mprotect:Expected ret (-16) == 0 (0) close returned (-1) fd is (3) # exclusive_mprotect: Test failed at step #1 # FAIL hmm.hmm_device_private.exclusive_mprotect not ok 21 hmm.hmm_device_private.exclusive_mprotect # RUN hmm.hmm_device_private.exclusive_cow ... # hmm-tests.c:1809:exclusive_cow:Expected ret (-16) == 0 (0) close returned (-1) fd is (3) # exclusive_cow: Test failed at step #1 # FAIL hmm.hmm_device_private.exclusive_cow not ok 22 hmm.hmm_device_private.exclusive_cow
When did that test start failing? Was it still ok for 6.0?
Didn't test yet, will try, in case it's my system/config specific thing.
commit 4fe89d07dcc2804c8b562f6c7896a45643d34b2f (tag: v6.0, linux/master)
# FAILED: 25 / 50 tests passed. # Totals: pass:25 fail:25 xfail:0 xpass:0 skip:0 error:0
Looks good to me.
Hmm but there's 25 that failed? Or are those also misreported SKIPs?
Likely the case. Here is an observation. All of these FAILs are coming from line 141 which is FIXTURE_SETUP. See the result:
# hmm-tests.c:141:open_close:Expected self->fd (-1) >= 0 (0)
# open_close: Test terminated by assertion
# FAIL hmm.hmm_device_private.open_close
not ok 1 hmm.hmm_device_private.open_close
However the code is:
FIXTURE_SETUP(hmm)
{
	self->page_size = sysconf(_SC_PAGE_SIZE);
	self->page_shift = ffs(self->page_size) - 1;

	self->fd = hmm_open(variant->device_number);
	if (self->fd < 0 && hmm_is_coherent_type(variant->device_number))
		SKIP(exit(0), "DEVICE_COHERENT not available");

Note: this is a SKIP(). It appears the test will be reported as a failure unless both of the above conditions are true. Perhaps the check should skip when either of these conditions is true.

	ASSERT_GE(self->fd, 0);
}
Looks like this test could use a review to see whether each of these conditions should be a FAIL or a SKIP. This problem exists in 6.0.
thanks, -- Shuah
On 10/13/22 19:12, Vlastimil Babka wrote:
On 10/13/22 19:10, Shuah Khan wrote:
On 10/13/22 11:01, David Hildenbrand wrote:
On 13.10.22 18:54, Vlastimil Babka wrote:
Hi,
I've been trying the hmm_tests as of today's commit:
a185a0995518 ("Merge tag 'linux-kselftest-kunit-6.1-rc1-2' ...)
and run into several issues that seemed worth reporting.
First, it seems the FIXTURE_TEARDOWN(hmm) in tools/testing/selftests/vm/hmm-tests.c using ASSERT_EQ(ret, 0); can run into an infinite loop of reporting the assertion failure. Dunno if it's a kselftests issue or it's a bug to use asserts in teardown. I hacked it up like this locally to proceed:
--- a/tools/testing/selftests/vm/hmm-tests.c +++ b/tools/testing/selftests/vm/hmm-tests.c @@ -154,6 +154,11 @@ FIXTURE_TEARDOWN(hmm) { int ret = close(self->fd); + if (ret != 0) { + fprintf(stderr, "close returned (%d) fd is (%d)\n", ret,self->fd); + exit(1); + }
ASSERT_EQ(ret, 0); self->fd = -1; }
Next, there are some tests that fail (and thus also trigger the issue above)
# RUN hmm.hmm_device_private.exclusive ... # hmm-tests.c:1702:exclusive:Expected ret (-16) == 0 (0) close returned (-1) fd is (3) # exclusive: Test failed at step #1 # FAIL hmm.hmm_device_private.exclusive not ok 20 hmm.hmm_device_private.exclusive # RUN hmm.hmm_device_private.exclusive_mprotect ... # hmm-tests.c:1756:exclusive_mprotect:Expected ret (-16) == 0 (0) close returned (-1) fd is (3) # exclusive_mprotect: Test failed at step #1 # FAIL hmm.hmm_device_private.exclusive_mprotect not ok 21 hmm.hmm_device_private.exclusive_mprotect # RUN hmm.hmm_device_private.exclusive_cow ... # hmm-tests.c:1809:exclusive_cow:Expected ret (-16) == 0 (0) close returned (-1) fd is (3) # exclusive_cow: Test failed at step #1 # FAIL hmm.hmm_device_private.exclusive_cow not ok 22 hmm.hmm_device_private.exclusive_cow
When did that test start failing? Was it still ok for 6.0?
Didn't test yet, will try, in case it's my system/config specific thing.
So it's actually all the same with v6.0 for me. The infinite loops, the test failures, the misreported SKIPs.
# RUN hmm.hmm_device_private.exclusive ...
# hmm-tests.c:1673:exclusive:Expected ret (-16) == 0 (0)
hmm close returned (-1) fd is (3)
# exclusive: Test failed at step #1
# FAIL hmm.hmm_device_private.exclusive
not ok 20 hmm.hmm_device_private.exclusive
# RUN hmm.hmm_device_private.exclusive_mprotect ...
# hmm-tests.c:1727:exclusive_mprotect:Expected ret (-16) == 0 (0)
hmm close returned (-1) fd is (3)
# exclusive_mprotect: Test failed at step #1
# FAIL hmm.hmm_device_private.exclusive_mprotect
not ok 21 hmm.hmm_device_private.exclusive_mprotect
# RUN hmm.hmm_device_private.exclusive_cow ...
# hmm-tests.c:1780:exclusive_cow:Expected ret (-16) == 0 (0)
hmm close returned (-1) fd is (3)
# exclusive_cow: Test failed at step #1
# FAIL hmm.hmm_device_private.exclusive_cow
not ok 22 hmm.hmm_device_private.exclusive_cow
When did that test start failing? Was it still ok for 6.0?
Didn't test yet, will try, in case it's my system/config specific thing.
So it's actually all the same with v6.0 for me. The infinite loops, the test failures, the misreported SKIPs.
# RUN hmm.hmm_device_private.exclusive ... # hmm-tests.c:1673:exclusive:Expected ret (-16) == 0 (0) hmm close returned (-1) fd is (3) # exclusive: Test failed at step #1 # FAIL hmm.hmm_device_private.exclusive not ok 20 hmm.hmm_device_private.exclusive # RUN hmm.hmm_device_private.exclusive_mprotect ... # hmm-tests.c:1727:exclusive_mprotect:Expected ret (-16) == 0 (0) hmm close returned (-1) fd is (3) # exclusive_mprotect: Test failed at step #1 # FAIL hmm.hmm_device_private.exclusive_mprotect not ok 21 hmm.hmm_device_private.exclusive_mprotect # RUN hmm.hmm_device_private.exclusive_cow ... # hmm-tests.c:1780:exclusive_cow:Expected ret (-16) == 0 (0) hmm close returned (-1) fd is (3) # exclusive_cow: Test failed at step #1 # FAIL hmm.hmm_device_private.exclusive_cow not ok 22 hmm.hmm_device_private.exclusive_cow
Is the kernel compiled with support? I have the feeling that we might simply be missing kernel support and that it's not handled gracefully ...
On 10/13/22 12:00, David Hildenbrand wrote:
When did that test start failing? Was it still ok for 6.0?
Didn't test yet, will try, in case it's my system/config specific thing.
So it's actually all the same with v6.0 for me. The infinite loops, the test failures, the misreported SKIPs.
I am not seeing infinite loops; I am seeing 25 failures, which could be skips.
Is the kernel compiled with support. I have the feeling that we might simply miss kernel support and it's not handled gracefully ...
Here is my config:
CONFIG_HMM_MIRROR=y
# CONFIG_TEST_HMM is not set
Okay here is what is going on - hmm_tests are supposed to be run from test_hmm.sh script. When I run this I see a message that tells me what to do.
sudo ./test_hmm.sh
./test_hmm.sh: You must have the following enabled in your kernel:
CONFIG_TEST_HMM=m
Running ./hmm_tests gives me all the failures. So it appears running hmm_tests executable won't work. This is expected as test_hmm.sh does the right setup before running the test. We have several tests that do that.
Vlastimil, can you try this and let me know what you see. I will compile with CONFIG_TEST_HMM=m and let you know what I see on my system.
thanks, -- Shuah
On 10/13/2022 9:38 PM, Shuah Khan wrote:
On 10/13/22 12:00, David Hildenbrand wrote:
When did that test start failing? Was it still ok for 6.0?
Didn't test yet, will try, in case it's my system/config specific thing.
So it's actually all the same with v6.0 for me. The infinite loops, the test failures, the misreported SKIPs.
I am not seeing infinite loops and seeing 25 failures which could be skips.
Is the kernel compiled with support. I have the feeling that we might simply miss kernel support and it's not handled gracefully ...
Here is my config CONFIG_HMM_MIRROR=y # CONFIG_TEST_HMM is not set
Okay here is what is going on - hmm_tests are supposed to be run from test_hmm.sh script. When I run this I see a message that tells me what to do.
sudo ./test_hmm.sh ./test_hmm.sh: You must have the following enabled in your kernel: CONFIG_TEST_HMM=m
Running ./hmm_tests gives me all the failures. So it appears running hmm_tests executable won't work. This is expected as test_hmm.sh does the right setup before running the test. We have several tests that do that.
Vlastimil, can you try this and let me know what you see. I will compile with CONFIG_TEST_HMM=m and let you know what I see on my system.
Right, I didn't mention it, sorry. I did have CONFIG_TEST_HMM=m and was running "test_hmm.sh smoke"
thanks, -- Shuah
Vlastimil Babka vbabka@suse.cz writes:
On 10/13/2022 9:38 PM, Shuah Khan wrote:
On 10/13/22 12:00, David Hildenbrand wrote:
When did that test start failing? Was it still ok for 6.0?
Didn't test yet, will try, in case it's my system/config specific thing.
So it's actually all the same with v6.0 for me. The infinite loops, the test failures, the misreported SKIPs.
I am not seeing infinite loops and seeing 25 failures which could be skips.
Is the kernel compiled with support. I have the feeling that we might simply miss kernel support and it's not handled gracefully ...
Here is my config CONFIG_HMM_MIRROR=y # CONFIG_TEST_HMM is not set
Okay here is what is going on - hmm_tests are supposed to be run from test_hmm.sh script. When I run this I see a message that tells me what to do.
sudo ./test_hmm.sh ./test_hmm.sh: You must have the following enabled in your kernel: CONFIG_TEST_HMM=m
Running ./hmm_tests gives me all the failures. So it appears running hmm_tests executable won't work. This is expected as test_hmm.sh does the right setup before running the test. We have several tests that do that.
Vlastimil, can you try this and let me know what you see. I will compile with CONFIG_TEST_HMM=m and let you know what I see on my system.
Right, I didn't mention it, sorry. I did have CONFIG_TEST_HMM=m and was running "test_hmm.sh smoke"
FWIW I tend not to use that script on my development machine, mainly because I either have the module built in or otherwise don't have modules installed in a place modprobe knows about.
Anyway I am not seeing test failures running hmm-tests directly. However I do observe both the issue of SKIP in FIXTURE_SETUP() being reported as a pass in the summary, and the infinite loop on ASSERT failure in FIXTURE_TEARDOWN.
There do seem to be some framework issues here which are causing this behaviour. Consider the following representative snippet:
#include "../kselftest_harness.h"
#include <stdio.h>
FIXTURE(test) {};
FIXTURE_SETUP(test) { SKIP(return, "skip"); }
FIXTURE_TEARDOWN(test) { ASSERT_TRUE(0); }
TEST_F(test, test) { printf("Running test\n"); }
TEST_HARNESS_MAIN
In this case the test will still be run even though SKIP() was called in FIXTURE_SETUP. The ASSERT_TRUE() during FIXTURE_TEARDOWN results in the infinite loop. So it looks to me like calling SKIP from FIXTURE_SETUP isn't supported, and calling ASSERT_*() in FIXTURE_TEARDOWN is also not allowed/supported by the kselftest framework.
Unlike hmm-tests though the above snippet reports correct pass/skip statistics with the teardown assertion removed. This is because there is also a bug in hmm-tests. Currently we have:
SKIP(exit(0), "DEVICE_COHERENT not available");
Which should really be:
SKIP(return, "DEVICE_COHERENT not available");
Of course that results in an infinite loop due to the associated assertion failure during teardown which is still called despite the SKIP in setup. Not sure if this is why it was originally coded this way.
- Alistair
thanks, -- Shuah
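The difference between SKIP(exit(0), ...) and SKIP(return, ...) described above can be modelled outside the harness. The following is only a sketch under stated assumptions: it assumes each test runs in a forked child, that the wrapper reports a skip to the parent through a dedicated exit status (255 here, an assumed value), and that exit status 0 means pass. The names run_in_child(), setup_exit_directly(), setup_return() and SKIP_EXIT_CODE are hypothetical and are not taken from kselftest_harness.h.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

enum result { RESULT_PASS, RESULT_SKIP, RESULT_FAIL };

/* Assumed exit status the wrapper uses to tell the parent "skipped". */
#define SKIP_EXIT_CODE 255

/* Models SKIP(exit(0), ...): the flag is set, but the child exits
 * at once with status 0, so the wrapper never gets to look at it. */
void setup_exit_directly(bool *skip)
{
	*skip = true;
	exit(0);
}

/* Models SKIP(return, ...): the flag is set and control goes back
 * to the wrapper, which can translate it into the skip status. */
void setup_return(bool *skip)
{
	*skip = true;
}

/* Runs "setup" in a child process and classifies the outcome the way
 * the parent would see it, purely from the child's exit status. */
enum result run_in_child(void (*setup)(bool *skip))
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return RESULT_FAIL;
	if (pid == 0) {
		bool skip = false;

		setup(&skip);
		_exit(skip ? SKIP_EXIT_CODE : 0);
	}
	if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
		return RESULT_FAIL;
	if (WEXITSTATUS(status) == SKIP_EXIT_CODE)
		return RESULT_SKIP;
	return WEXITSTATUS(status) == 0 ? RESULT_PASS : RESULT_FAIL;
}
```

In this model, run_in_child(setup_exit_directly) is classified as a pass and run_in_child(setup_return) as a skip, which matches the miscounted DEVICE_COHERENT results reported in this thread.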
Seems like this would fix both the SKIP in FIXTURE_SETUP and ASSERT in FIXTURE_TEARDOWN issues:
---
diff --git a/tools/testing/selftests/kselftest_harness.h b/tools/testing/selftests/kselftest_harness.h
index 25f4d54067c0..1998fe888f8f 100644
--- a/tools/testing/selftests/kselftest_harness.h
+++ b/tools/testing/selftests/kselftest_harness.h
@@ -387,12 +387,12 @@
 		if (setjmp(_metadata->env) == 0) { \
 			fixture_name##_setup(_metadata, &self, variant->data); \
 			/* Let setup failure terminate early. */ \
-			if (!_metadata->passed) \
+			if (!_metadata->passed || _metadata->skip) \
 				return; \
 			_metadata->setup_completed = true; \
 			fixture_name##_##test_name(_metadata, &self, variant->data); \
 		} \
-		if (_metadata->setup_completed) \
+		if (_metadata->setup_completed && setjmp(_metadata->env) == 0) \
 			fixture_name##_teardown(_metadata, &self, variant->data); \
 		__test_check_assert(_metadata); \
 	} \
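To illustrate why re-arming setjmp() before teardown would stop the loop, here is a stripped-down model of the control flow. It is a sketch only: it assumes a failing ASSERT_*() ends in longjmp(_metadata->env, 1), as the setjmp() in the patch context suggests, and struct meta, failing_teardown(), run_unguarded() and run_guarded() are hypothetical names. The max_calls parameter is a safety valve added so the demonstration terminates; the real harness has no such limit.

```c
#include <setjmp.h>
#include <stdbool.h>

/* Hypothetical stand-in for the parts of the test metadata used here. */
struct meta {
	jmp_buf env;
	bool setup_completed;
	int teardown_calls;
};

/* Teardown whose "assertion" always fails, modelled as a longjmp().
 * Stops jumping after max_calls invocations so the demo terminates. */
static void failing_teardown(struct meta *m, int max_calls)
{
	m->teardown_calls++;
	if (m->teardown_calls >= max_calls)
		return;
	longjmp(m->env, 1);
}

/* Old flow: the jump target is the setjmp() before setup, so every
 * failed teardown assertion lands there and runs teardown again. */
int run_unguarded(int max_calls)
{
	/* static: locals modified after setjmp() are indeterminate after longjmp() */
	static struct meta m;

	m.setup_completed = false;
	m.teardown_calls = 0;
	if (setjmp(m.env) == 0) {
		/* setup and the test body would run here */
		m.setup_completed = true;
	}
	if (m.setup_completed)
		failing_teardown(&m, max_calls);
	return m.teardown_calls;
}

/* Patched flow: setjmp() is called again just before teardown, so a
 * failed teardown assertion jumps back here and falls through. */
int run_guarded(int max_calls)
{
	static struct meta m;

	m.setup_completed = false;
	m.teardown_calls = 0;
	if (setjmp(m.env) == 0) {
		m.setup_completed = true;
	}
	if (m.setup_completed) {
		/* nested form of "setup_completed && setjmp(env) == 0" */
		if (setjmp(m.env) == 0)
			failing_teardown(&m, max_calls);
	}
	return m.teardown_calls;
}
```

With max_calls set to 5, run_unguarded() invokes teardown five times (it stops only because of the safety valve), while run_guarded() invokes it once.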
Alistair Popple apopple@nvidia.com writes:
Vlastimil Babka vbabka@suse.cz writes:
On 10/13/2022 9:38 PM, Shuah Khan wrote:
On 10/13/22 12:00, David Hildenbrand wrote:
When did that test start failing? Was it still ok for 6.0?
Didn't test yet, will try, in case it's my system/config specific thing.
So it's actually all the same with v6.0 for me. The infinite loops, the test failures, the misreported SKIPs.
I am not seeing infinite loops and seeing 25 failures which could be skips.
Is the kernel compiled with support. I have the feeling that we might simply miss kernel support and it's not handled gracefully ...
Here is my config CONFIG_HMM_MIRROR=y # CONFIG_TEST_HMM is not set
Okay here is what is going on - hmm_tests are supposed to be run from test_hmm.sh script. When I run this I see a message that tells me what to do.
sudo ./test_hmm.sh ./test_hmm.sh: You must have the following enabled in your kernel: CONFIG_TEST_HMM=m
Running ./hmm_tests gives me all the failures. So it appears running hmm_tests executable won't work. This is expected as test_hmm.sh does the right setup before running the test. We have several tests that do that.
Vlastimil, can you try this and let me know what you see. I will compile with CONFIG_TEST_HMM=m and let you know what I see on my system.
Right, I didn't mention it, sorry. I did have CONFIG_TEST_HMM=m and was running "test_hmm.sh smoke"
FWIW I tend not to use that script on my development machine, mainly because I either have the module built in or otherwise don't have modules installed in a place modprobe knows about.
Anyway I am not seeing test failures running hmm-tests directly. However I do observe both the issue of SKIP in FIXTURE_SETUP() being reported as a pass in the summary, and the infinite loop on ASSERT failure in FIXTURE_TEARDOWN.
There does seem to be some framework issues here which are causing this behaviour. Consider the following representitive snippet:
#include "../kselftest_harness.h"
#include <stdio.h>
FIXTURE(test) {};
FIXTURE_SETUP(test) { SKIP(return, "skip"); }
FIXTURE_TEARDOWN(test) { ASSERT_TRUE(0); }
TEST_F(test, test) { printf("Running test\n"); }
TEST_HARNESS_MAIN
In this case the test will still be run even though SKIP() was called in FIXTURE_SETUP. The ASSERT_TRUE() during FIXTURE_TEARDOWN results in the infinite loop. So it looks to me like calling SKIP from FIXTURE_SETUP isn't supported, and calling ASSERT_*() in FIXTURE_TEARDOWN is also not allowed/supported by the kselftest framework.
Unlike hmm-tests though the above snippet reports correct pass/skip statistics with the teardown assertion removed. This is because there is also a bug in hmm-tests. Currently we have:
SKIP(exit(0), "DEVICE_COHERENT not available");
Which should really be:
SKIP(return, "DEVICE_COHERENT not available");
Of course that results in an infinite loop due to the associated assertion failure during teardown which is still called despite the SKIP in setup. Not sure if this is why it was originally coded this way.
- Alistair
thanks, -- Shuah
On 10/14/22 05:21, Alistair Popple wrote:
Seems like this would fix both the SKIP in FIXTURE_SETUP and ASSERT in FIXTURE_TEARDOWN issues:
Yep, that fixed the infinite error loops for me, thanks.
...
Unlike hmm-tests though the above snippet reports correct pass/skip statistics with the teardown assertion removed. This is because there is also a bug in hmm-tests. Currently we have:
SKIP(exit(0), "DEVICE_COHERENT not available");
Which should really be:
SKIP(return, "DEVICE_COHERENT not available");
And with this on top, I got the skips due to DEVICE_COHERENT not available counted correctly.
Of course that results in an infinite loop due to the associated assertion failure during teardown which is still called despite the SKIP in setup. Not sure if this is why it was originally coded this way.
- Alistair
thanks, -- Shuah
On 10/13/22 20:00, David Hildenbrand wrote:
When did that test start failing? Was it still ok for 6.0?
Didn't test yet, will try, in case it's my system/config specific thing.
So it's actually all the same with v6.0 for me. The infinite loops, the test failures, the misreported SKIPs.
# RUN hmm.hmm_device_private.exclusive ... # hmm-tests.c:1673:exclusive:Expected ret (-16) == 0 (0) hmm close returned (-1) fd is (3) # exclusive: Test failed at step #1 # FAIL hmm.hmm_device_private.exclusive not ok 20 hmm.hmm_device_private.exclusive # RUN hmm.hmm_device_private.exclusive_mprotect ... # hmm-tests.c:1727:exclusive_mprotect:Expected ret (-16) == 0 (0) hmm close returned (-1) fd is (3) # exclusive_mprotect: Test failed at step #1 # FAIL hmm.hmm_device_private.exclusive_mprotect not ok 21 hmm.hmm_device_private.exclusive_mprotect # RUN hmm.hmm_device_private.exclusive_cow ... # hmm-tests.c:1780:exclusive_cow:Expected ret (-16) == 0 (0) hmm close returned (-1) fd is (3) # exclusive_cow: Test failed at step #1 # FAIL hmm.hmm_device_private.exclusive_cow not ok 22 hmm.hmm_device_private.exclusive_cow
Is the kernel compiled with support. I have the feeling that we might simply miss kernel support and it's not handled gracefully ...
If you mean CONFIG_DEVICE_PRIVATE=y then it's there. Couldn't find anything relevant that wouldn't be enabled.
On 10/13/22 10:54, Vlastimil Babka wrote:
Hi,
I've been trying the hmm_tests as of today's commit:
a185a0995518 ("Merge tag 'linux-kselftest-kunit-6.1-rc1-2' ...)
and run into several issues that seemed worth reporting.
First, it seems the FIXTURE_TEARDOWN(hmm) in tools/testing/selftests/vm/hmm-tests.c using ASSERT_EQ(ret, 0); can run into an infinite loop of reporting the assertion failure. Dunno if it's a kselftests issue or it's a bug to use asserts in teardown. I hacked it up like this locally to proceed:
kselftest pull requests didn't include any framework changes. I doubt that it is framework related.
--- a/tools/testing/selftests/vm/hmm-tests.c
+++ b/tools/testing/selftests/vm/hmm-tests.c
@@ -154,6 +154,11 @@ FIXTURE_TEARDOWN(hmm)
 {
 	int ret = close(self->fd);

+	if (ret != 0) {
+		fprintf(stderr, "close returned (%d) fd is (%d)\n", ret,self->fd);
+		exit(1);
+	}
+
 	ASSERT_EQ(ret, 0);
 	self->fd = -1;
 }
Next, there are some tests that fail (and thus also trigger the issue above)
# RUN hmm.hmm_device_private.exclusive ... # hmm-tests.c:1702:exclusive:Expected ret (-16) == 0 (0) close returned (-1) fd is (3) # exclusive: Test failed at step #1 # FAIL hmm.hmm_device_private.exclusive not ok 20 hmm.hmm_device_private.exclusive # RUN hmm.hmm_device_private.exclusive_mprotect ... # hmm-tests.c:1756:exclusive_mprotect:Expected ret (-16) == 0 (0) close returned (-1) fd is (3) # exclusive_mprotect: Test failed at step #1 # FAIL hmm.hmm_device_private.exclusive_mprotect not ok 21 hmm.hmm_device_private.exclusive_mprotect # RUN hmm.hmm_device_private.exclusive_cow ... # hmm-tests.c:1809:exclusive_cow:Expected ret (-16) == 0 (0) close returned (-1) fd is (3) # exclusive_cow: Test failed at step #1 # FAIL hmm.hmm_device_private.exclusive_cow not ok 22 hmm.hmm_device_private.exclusive_cow
I'll try to check more closely but maybe if you can reproduce it too, you'll have more idea what's going on.
Sounds good.
The next thing is more of a question/documentation suggestion. Tons of tests fail like this:
ok 24 hmm.hmm_device_private.hmm_cow_in_device # RUN hmm.hmm_device_coherent.open_close ... could not open hmm dmirror driver (/dev/hmm_dmirror2) # SKIP DEVICE_COHERENT not available # OK hmm.hmm_device_coherent.open_close
I assume this is because I run "test_hmm.sh smoke" without the SPM parameters. The help message doesn't say much about what to specify there for <spm_addr_dev0> <spm_addr_dev1>. Do these tests need a particular hardware? (unlike the rest?) Maybe it could be clarified.
Last thing, I noticed all these DEVICE_COHERENT tests ultimately count as OK, not SKIPPED, which would probably be more appropriate?
Anytime a test can't be run due to missing config, the result should be a SKIP. If that is not the case let's fix these cases.
# FAILED: 51 / 54 tests passed. # Totals: pass:50 fail:3 xfail:0 xpass:0 skip:1 error:0
(the skip:1 is due to test 9 "# SKIP Huge page could not be allocated" which is probably a misconfiguration on my part so I don't report that as an issue)
Skip is the right result in this case if it is indeed the result of misconfig.
thanks, -- Shuah
On 10/13/22 19:03, Shuah Khan wrote:
On 10/13/22 10:54, Vlastimil Babka wrote:
Hi,
I've been trying the hmm_tests as of today's commit:
a185a0995518 ("Merge tag 'linux-kselftest-kunit-6.1-rc1-2' ...)
and run into several issues that seemed worth reporting.
First, it seems the FIXTURE_TEARDOWN(hmm) in tools/testing/selftests/vm/hmm-tests.c using ASSERT_EQ(ret, 0); can run into an infinite loop of reporting the assertion failure. Dunno if it's a kselftests issue or it's a bug to use asserts in teardown. I hacked it up like this locally to proceed:
kselftest pull requests didn't include any framework changes. I doubt that it is framework related.
But is it OK to use e.g. ASSERT_EQ() in FIXTURE_TEARDOWN()?
--- a/tools/testing/selftests/vm/hmm-tests.c +++ b/tools/testing/selftests/vm/hmm-tests.c @@ -154,6 +154,11 @@ FIXTURE_TEARDOWN(hmm) { int ret = close(self->fd); + if (ret != 0) { + fprintf(stderr, "close returned (%d) fd is (%d)\n", ret,self->fd); + exit(1); + }
ASSERT_EQ(ret, 0); self->fd = -1; }
Next, there are some tests that fail (and thus also trigger the issue above)
# RUN hmm.hmm_device_private.exclusive ... # hmm-tests.c:1702:exclusive:Expected ret (-16) == 0 (0) close returned (-1) fd is (3) # exclusive: Test failed at step #1 # FAIL hmm.hmm_device_private.exclusive not ok 20 hmm.hmm_device_private.exclusive # RUN hmm.hmm_device_private.exclusive_mprotect ... # hmm-tests.c:1756:exclusive_mprotect:Expected ret (-16) == 0 (0) close returned (-1) fd is (3) # exclusive_mprotect: Test failed at step #1 # FAIL hmm.hmm_device_private.exclusive_mprotect not ok 21 hmm.hmm_device_private.exclusive_mprotect # RUN hmm.hmm_device_private.exclusive_cow ... # hmm-tests.c:1809:exclusive_cow:Expected ret (-16) == 0 (0) close returned (-1) fd is (3) # exclusive_cow: Test failed at step #1 # FAIL hmm.hmm_device_private.exclusive_cow not ok 22 hmm.hmm_device_private.exclusive_cow
I'll try to check more closely but maybe if you can reproduce it too, you'll have more idea what's going on.
Sounds good.
The next thing is more of a question/documentation suggestion. Tons of tests fail like this:
ok 24 hmm.hmm_device_private.hmm_cow_in_device
# RUN hmm.hmm_device_coherent.open_close ...
could not open hmm dmirror driver (/dev/hmm_dmirror2)
# SKIP DEVICE_COHERENT not available
# OK hmm.hmm_device_coherent.open_close
I assume this is because I run "test_hmm.sh smoke" without the SPM parameters. The help message doesn't say much about what to specify there for <spm_addr_dev0> <spm_addr_dev1>. Do these tests need a particular hardware? (unlike the rest?) Maybe it could be clarified.
Last thing, I noticed all these DEVICE_COHERENT tests ultimately count as OK, not SKIPPED, which would probably be more appropriate?
Anytime a test can't be run due to missing config, the result should be a SKIP. If that is not the case let's fix these cases.
# FAILED: 51 / 54 tests passed.
# Totals: pass:50 fail:3 xfail:0 xpass:0 skip:1 error:0
(the skip:1 is due to test 9 "# SKIP Huge page could not be allocated" which is probably a misconfiguration on my part so I don't report that as an issue)
Skip is the right result in this case if it is indeed the result of misconfig.
Right. My point is that more than 20 other tests reported "# SKIP DEVICE_COHERENT not available" yet were counted under pass: rather than skip:.
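To make the pass-versus-skip distinction concrete, here is a minimal standalone sketch, not taken from hmm-tests.c. It assumes a test that needs a device node and reports a distinct skip code when the node cannot be opened. The names run_or_skip, EXAMPLE_PASS and EXAMPLE_SKIP are hypothetical; the value 4 is chosen to mirror KSFT_SKIP from kselftest.h.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical local result codes; 4 mirrors KSFT_SKIP in kselftest.h. */
#define EXAMPLE_PASS 0
#define EXAMPLE_SKIP 4

/*
 * Open the device the test depends on. If it is not available, report
 * a skip instead of a pass, so the totals line counts it under skip:.
 */
static int run_or_skip(const char *dev_path)
{
	int fd = open(dev_path, O_RDWR);

	if (fd < 0)
		return EXAMPLE_SKIP;	/* prerequisite missing: not a pass */

	/* The real test body would run here. */

	close(fd);
	return EXAMPLE_PASS;
}
```

A caller would propagate the returned code as the test's result, so a missing device never shows up as a pass.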
thanks, -- Shuah
On Thu, Oct 13, 2022 at 06:54:24PM +0200, Vlastimil Babka wrote:
Hi,
I've been trying the hmm_tests as of today's commit:
a185a0995518 ("Merge tag 'linux-kselftest-kunit-6.1-rc1-2' ...)
and run into several issues that seemed worth reporting.
First, it seems the FIXTURE_TEARDOWN(hmm) in tools/testing/selftests/vm/hmm-tests.c using ASSERT_EQ(ret, 0); can run into an infinite loop of reporting the assertion failure. Dunno if it's a kselftests issue or it's a bug to use asserts in teardown. I hacked it up like this locally to proceed:
I've seen this too in other tests, it is a kselftests bug/limitation, AFAIK. You can't use assert macros in those functions.
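As an illustration of a teardown that avoids assert macros, here is a minimal standalone sketch. It does not use the kselftest harness and is not the fix mentioned elsewhere in this thread. The struct hmm_fixture and the function hmm_fixture_teardown are hypothetical stand-ins for the fixture state and the teardown body.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical stand-in for the fixture's per-test state. */
struct hmm_fixture {
	int fd;
};

/*
 * Teardown-style cleanup that never asserts: a failing close() is
 * logged and reported through the return value, so a failure here
 * cannot re-enter the teardown path.
 */
static int hmm_fixture_teardown(struct hmm_fixture *self)
{
	int ret = close(self->fd);

	if (ret != 0)
		fprintf(stderr, "close(%d) failed: %s\n",
			self->fd, strerror(errno));

	self->fd = -1;
	return ret;
}
```

The caller decides what to do with a non-zero return; the cleanup itself always finishes and leaves fd set to -1.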
Jason
On 2022-10-14 at 08:01, Jason Gunthorpe wrote:
I've seen this too in other tests, it is a kselftests bug/limitation, AFAIK. You can't use assert macros in those functions.
I vaguely remember looking at this when I reviewed Alex's patches that added device-coherent support. We wanted to have these checks in the fixture setup so that we wouldn't have to duplicate them in all the tests.
I'm not sure if I missed it in review, and Alex missed it in testing, or if this is a regression that happened more recently. Sorry for the trouble. It looks like Alistair already figured out a fix.
Regards, Felix
On Fri, Oct 14, 2022 at 11:03:39AM -0400, Felix Kuehling wrote:
On 2022-10-14 at 08:01, Jason Gunthorpe wrote:
I've seen this too in other tests, it is a kselftests bug/limitation, AFAIK. You can't use assert macros in those functions.
I vaguely remember looking at this when I reviewed Alex's patches that added device-coherent support. We wanted to have these checks in the fixture setup so that we wouldn't have to duplicate them in all the tests.
I'm not sure if I missed it in review, and Alex missed it in testing, or if this is a regression that happened more recently. Sorry for the trouble. It looks like Alistair already figured out a fix.
I think the design is fine, it is just surprising you can't call ASSERT/etc in the fixture codes. Hopefully something like Alistair's fix gets merged.
Jason
linux-kselftest-mirror@lists.linaro.org