On Sun, Mar 22, 2020 at 09:31:04AM -0700, Shakeel Butt wrote:
On Sat, Mar 21, 2020 at 6:35 PM Rafael Aquini <aquini@redhat.com> wrote:
Changes from commit 9c4e6b1a7027f ("mm, mlock, vmscan: no more skipping pagevecs") break this test's expectation that the mlock syscall family immediately inserts recently faulted pages into the UNEVICTABLE_LRU when MCL_ONFAULT is passed to the syscall as part of its flag set.
mlock* syscalls do not provide any guarantee that the pages will be in the unevictable LRU, only that the pages will not be paged out. The test is checking something very internal to the kernel, and that is expected to break.
It was a check expected to be satisfied before the commit, though. Inserting mlocked pages directly into the unevictable LRU, skipping the pagevec, was established behavior before the aforementioned commit. Even though one could argue that userspace should not be aware of, or care about, such kernel internals, the program in question is not an ordinary userspace app but a kernel selftest that is supposed to check functional correctness.
There is no functional error introduced by the aforementioned commit, but it opens up a time window in which the recently faulted and locked pages might not yet have been put back into the UNEVICTABLE_LRU. A subsequent, immediate PFN flag check for the UNEVICTABLE bit can therefore trip on false-negative errors, as happens with this test.
This patch fixes the false negative by forcing a code path that calls a CPU pagevec drain right after the fault but before the PFN flag check takes place, which closes the race.
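For readers unfamiliar with the PFN flag check discussed above, here is a minimal sketch of the decoding involved, assuming the bit layout documented for /proc/self/pagemap and /proc/kpageflags. The helper names are hypothetical and this is not the selftest's actual code; the file I/O is left out, so only the pure arithmetic is shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit layout as described in the kernel's pagemap documentation. */
#define PM_PRESENT      (1ULL << 63)        /* page is present in RAM */
#define PM_PFN_MASK     ((1ULL << 55) - 1)  /* bits 0-54 hold the PFN */
#define KPF_UNEVICTABLE 18                  /* bit index in a kpageflags entry */

/* Byte offset of the 64-bit /proc/self/pagemap entry for a virtual address. */
static uint64_t pagemap_offset(uintptr_t vaddr, uint64_t page_size)
{
	assert(page_size > 0);
	return ((uint64_t)vaddr / page_size) * sizeof(uint64_t);
}

/* Extract the PFN from a pagemap entry; returns false if the page is not present. */
static bool pagemap_pfn(uint64_t entry, uint64_t *pfn)
{
	if (!(entry & PM_PRESENT))
		return false;
	*pfn = entry & PM_PFN_MASK;
	return true;
}

/* Byte offset of the 64-bit /proc/kpageflags entry for a PFN. */
static uint64_t kpageflags_offset(uint64_t pfn)
{
	return pfn * sizeof(uint64_t);
}

/* True if the kpageflags entry reports the page on the unevictable LRU. */
static bool page_is_unevictable(uint64_t kpageflags)
{
	return (kpageflags >> KPF_UNEVICTABLE) & 1;
}
```

A caller would pread() one 64-bit entry from each file at these offsets, which normally requires root. If the flag is read while the page is still sitting in a per-CPU pagevec, page_is_unevictable() returns false even though the page is mlocked, which is the false negative described above.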
Fixes: 9c4e6b1a7027f ("mm, mlock, vmscan: no more skipping pagevecs")
This fixes the test itself, not the mentioned patch, so this Fixes line is not needed.
If one bisects the kernel looking for the patch that causes the selftest to fail, that commit is going to show up as the culprit, hence the reference.