On 23/03/2024 20:13, Mirsad Todorovac wrote:
On 3/19/24 14:58, Jason Gunthorpe wrote:
On Tue, Mar 12, 2024 at 07:35:40AM +0100, Mirsad Todorovac wrote:
Hi,
(This is verified on the second test box.)
In the most recent 6.8.0 release of the torvalds tree kernel, with the selftest configs on, the ./iommufd process appears to consume 99% of a CPU core for quite a while in an endless loop:
There is a "bug" in the kselftest framework where, if you call a kselftest assertion from the setup/teardown, it loops forever.
The fix I know is to replace the kselftest assertions with normal assert().
But I don't see an obvious thing here saying you are hitting that..
Jason
Hi,
I'm not that deep into kselftest for that intervention.
Yet, with the v6.8-11743-ga4145ce1e7bc build, ./iommufd did not get stuck. Instead I got these 10 failed tests:
# # RUN iommufd_dirty_tracking.domain_dirty128M_huge.enforce_dirty ...
# iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
# # enforce_dirty: Test terminated by assertion
# # FAIL iommufd_dirty_tracking.domain_dirty128M_huge.enforce_dirty
# not ok 156 iommufd_dirty_tracking.domain_dirty128M_huge.enforce_dirty
# # RUN iommufd_dirty_tracking.domain_dirty128M_huge.set_dirty_tracking ...
# iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
# # set_dirty_tracking: Test terminated by assertion
# # FAIL iommufd_dirty_tracking.domain_dirty128M_huge.set_dirty_tracking
# not ok 157 iommufd_dirty_tracking.domain_dirty128M_huge.set_dirty_tracking
# # RUN iommufd_dirty_tracking.domain_dirty128M_huge.device_dirty_capability ...
# iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
# # device_dirty_capability: Test terminated by assertion
# # FAIL iommufd_dirty_tracking.domain_dirty128M_huge.device_dirty_capability
# not ok 158 iommufd_dirty_tracking.domain_dirty128M_huge.device_dirty_capability
# # RUN iommufd_dirty_tracking.domain_dirty128M_huge.get_dirty_bitmap ...
# iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
# # get_dirty_bitmap: Test terminated by assertion
# # FAIL iommufd_dirty_tracking.domain_dirty128M_huge.get_dirty_bitmap
# not ok 159 iommufd_dirty_tracking.domain_dirty128M_huge.get_dirty_bitmap
# # RUN iommufd_dirty_tracking.domain_dirty128M_huge.get_dirty_bitmap_no_clear ...
# iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
# # get_dirty_bitmap_no_clear: Test terminated by assertion
# # FAIL iommufd_dirty_tracking.domain_dirty128M_huge.get_dirty_bitmap_no_clear
# not ok 160 iommufd_dirty_tracking.domain_dirty128M_huge.get_dirty_bitmap_no_clear
. . .
# # RUN iommufd_dirty_tracking.domain_dirty256M_huge.enforce_dirty ...
# iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
# # enforce_dirty: Test terminated by assertion
# # FAIL iommufd_dirty_tracking.domain_dirty256M_huge.enforce_dirty
# not ok 166 iommufd_dirty_tracking.domain_dirty256M_huge.enforce_dirty
# # RUN iommufd_dirty_tracking.domain_dirty256M_huge.set_dirty_tracking ...
# iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
# # set_dirty_tracking: Test terminated by assertion
# # FAIL iommufd_dirty_tracking.domain_dirty256M_huge.set_dirty_tracking
# not ok 167 iommufd_dirty_tracking.domain_dirty256M_huge.set_dirty_tracking
# # RUN iommufd_dirty_tracking.domain_dirty256M_huge.device_dirty_capability ...
# iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
# # device_dirty_capability: Test terminated by assertion
# # FAIL iommufd_dirty_tracking.domain_dirty256M_huge.device_dirty_capability
# not ok 168 iommufd_dirty_tracking.domain_dirty256M_huge.device_dirty_capability
# # RUN iommufd_dirty_tracking.domain_dirty256M_huge.get_dirty_bitmap ...
# iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
# # get_dirty_bitmap: Test terminated by assertion
# # FAIL iommufd_dirty_tracking.domain_dirty256M_huge.get_dirty_bitmap
# not ok 169 iommufd_dirty_tracking.domain_dirty256M_huge.get_dirty_bitmap
# # RUN iommufd_dirty_tracking.domain_dirty256M_huge.get_dirty_bitmap_no_clear ...
# iommufd: iommufd.c:1749: iommufd_dirty_tracking_setup: Assertion `vrc == self->buffer' failed.
# # get_dirty_bitmap_no_clear: Test terminated by assertion
# # FAIL iommufd_dirty_tracking.domain_dirty256M_huge.get_dirty_bitmap_no_clear
# not ok 170 iommufd_dirty_tracking.domain_dirty256M_huge.get_dirty_bitmap_no_clear
. . .
# # FAILED: 170 / 180 tests passed.
# # Totals: pass:170 fail:10 xfail:0 xpass:0 skip:0 error:0
not ok 1 selftests: iommu: iommufd # exit=1
It seems like the same assertion failed in all 10 failed tests?
... It means that the hugetlb mmap() failed, which is required for these specific tests, because we need to allocate a bigger IOVA range, and in hugepages, to exercise the test.
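One way to check whether the box has any hugetlb pages available before running the test is to look at the HugePages_* lines in /proc/meminfo. The following is only a minimal sketch: meminfo_field() and hugepages_free() are hypothetical helper names, not part of the iommufd selftest, and it assumes the usual "Key:   value" layout of /proc/meminfo.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Look up a "Key:   value" line in /proc/meminfo-style text and store the
 * numeric value in *out. Returns 0 on success, -1 if the key is missing.
 * Hypothetical helper, not part of the iommufd selftest.
 */
static int meminfo_field(const char *text, const char *key, long *out)
{
	const char *line = text;
	size_t klen;

	assert(text && key && out);
	klen = strlen(key);

	while (line && *line) {
		/* Match only at the start of a line, directly followed by ':' */
		if (strncmp(line, key, klen) == 0 && line[klen] == ':') {
			*out = strtol(line + klen + 1, NULL, 10);
			return 0;
		}
		/* Advance to the character after the next newline, if any */
		line = strchr(line, '\n');
		if (line)
			line++;
	}
	return -1;
}

/* Read /proc/meminfo and return HugePages_Free, or -1 if unavailable. */
static long hugepages_free(void)
{
	char buf[8192];
	size_t n;
	long val;
	FILE *f = fopen("/proc/meminfo", "r");

	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';

	return meminfo_field(buf, "HugePages_Free", &val) == 0 ? val : -1;
}
```

If hugepages_free() returns 0, a hugetlb mmap() of any size can be expected to fail. Assuming 2 MiB huge pages, the 128M variant alone would need 64 free pages.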
However, I am not smart enough to figure out why ...
Apparently, from the source, mmap() fails to allocate pages on the desired address:
1746         assert((uintptr_t)self->buffer % HUGEPAGE_SIZE == 0);
1747         vrc = mmap(self->buffer, variant->buffer_size, PROT_READ | PROT_WRITE,
1748                    mmap_flags, -1, 0);
→ 1749       assert(vrc == self->buffer);
1750
But I am not that deep into the source to figure out what was intended and what went wrong :-/
I can SKIP() the test rather than assert() in here if it helps. Though there are other tests that fail if no hugetlb pages are reserved.
But I am not sure if this is the problem here, as the initial bug email had an entirely different set of failures? Maybe all you need is an assert() and it gets into this state?
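The skip-instead-of-assert idea could look roughly like the sketch below. It is an illustration only, not the actual patch: the enum and classify_mmap() are hypothetical names, and it assumes that an outright mmap() failure (MAP_FAILED) is what should be treated as "no hugetlb pages, skip", while a mapping that lands at a different address stays a real failure.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical outcome codes for the fixture setup step */
enum setup_result { SETUP_OK, SETUP_SKIP, SETUP_FAIL };

/*
 * Classify the return value of the mmap() call in the setup:
 *   vrc  - what mmap() returned
 *   want - the address that was requested (self->buffer in the test)
 */
static enum setup_result classify_mmap(void *vrc, void *want)
{
	assert(want != NULL);

	/* mmap() itself failed, e.g. no hugetlb pages reserved: skip */
	if (vrc == MAP_FAILED)
		return SETUP_SKIP;
	/* mmap() succeeded but not at the requested address: real failure */
	if (vrc != want)
		return SETUP_FAIL;
	return SETUP_OK;
}
```

In the setup function, SETUP_SKIP would map to SKIP() and SETUP_FAIL to the existing assert().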
Joao