On Fri, Apr 18, 2025 at 12:56 PM Lorenzo Stoakes lorenzo.stoakes@oracle.com wrote:
On Fri, Apr 18, 2025 at 12:31:29PM -0700, Suren Baghdasaryan wrote:
On Fri, Apr 18, 2025 at 11:30 AM Lorenzo Stoakes lorenzo.stoakes@oracle.com wrote:
On Fri, Apr 18, 2025 at 10:49:54AM -0700, Suren Baghdasaryan wrote:
Test that /proc/pid/maps does not report unexpected holes in the address space when we concurrently remap a part of a vma into the middle of another vma. This remapping results in the destination vma being split into three parts and the part in the middle being patched back from, all done concurrently from under the reader. We should always see either original vma or the split one with no holes.
Signed-off-by: Suren Baghdasaryan surenb@google.com
Umm, but we haven't fixed this in the mremap code right? :) So isn't this test failing right now? :P
My understanding from meeting was you'd drop this commit/mark it skipped for now or something like this?
After you pointed out the issue in mremap_to() I spent some time investigating that code. IIUC the fact that mremap_to() does do_munmap() and move_vma() as two separate operations should not affect lockless reading because both those operations are done under mmap_write_lock() without dropping it in between. Since my lockless mechanism uses mmap_lock_speculate_xxx API to detect address space modifications and retry if concurrent update happen, it should be able to handle this case. The vma it reports should be either the version before mmap_write_lock was taken (before the mremap()) or after it got dropped (after mremap() is complete). Did I miss something more subtle?
Speaking to Liam, seems perhaps the implementation has changed since we first started looking at this?
I guess it's this speculation stuff that keeps you safe then, yeah we hold the write lock over both.
So actually ideal if we can avoid having to fix up the mremap implementation!
If you're sure that the speculation combined with mmap write lock keeps us safe then we should be ok I'd say.
Yeah, I tested that quite rigorously and confirmed (using the mm->mm_lock_seq) that mmap_write_lock is not dropped somewhere in the middle of those operations. I think we should be safe.
tools/testing/selftests/proc/proc-pid-vm.c | 92 ++++++++++++++++++++++ 1 file changed, 92 insertions(+)
diff --git a/tools/testing/selftests/proc/proc-pid-vm.c b/tools/testing/selftests/proc/proc-pid-vm.c index 39842e4ec45f..1aef2db7e893 100644 --- a/tools/testing/selftests/proc/proc-pid-vm.c +++ b/tools/testing/selftests/proc/proc-pid-vm.c @@ -663,6 +663,95 @@ static void test_maps_tearing_from_resize(int maps_fd, signal_state(mod_info, TEST_DONE); }
+static inline void remap_vma(const struct vma_modifier_info *mod_info) +{
/*
* Remap the last page of the next vma into the middle of the vma.
* This splits the current vma and the first and middle parts (the
* parts at lower addresses) become the last vma objserved in the
* first page and the first vma observed in the last page.
*/
assert(mremap(mod_info->next_addr + page_size * 2, page_size,
page_size, MREMAP_FIXED | MREMAP_MAYMOVE | MREMAP_DONTUNMAP,
mod_info->addr + page_size) != MAP_FAILED);
+}
+static inline void patch_vma(const struct vma_modifier_info *mod_info) +{
assert(!mprotect(mod_info->addr + page_size, page_size,
mod_info->prot));
+}
+static inline void check_remap_result(struct line_content *mod_last_line,
struct line_content *mod_first_line,
struct line_content *restored_last_line,
struct line_content *restored_first_line)
+{
/* Make sure vmas at the boundaries are changing */
assert(strcmp(mod_last_line->text, restored_last_line->text) != 0);
assert(strcmp(mod_first_line->text, restored_first_line->text) != 0);
+}
+static void test_maps_tearing_from_remap(int maps_fd,
struct vma_modifier_info *mod_info,
struct page_content *page1,
struct page_content *page2,
struct line_content *last_line,
struct line_content *first_line)
+{
struct line_content remapped_last_line;
struct line_content remapped_first_line;
struct line_content restored_last_line;
struct line_content restored_first_line;
wait_for_state(mod_info, SETUP_READY);
/* re-read the file to avoid using stale data from previous test */
read_boundary_lines(maps_fd, page1, page2, last_line, first_line);
mod_info->vma_modify = remap_vma;
mod_info->vma_restore = patch_vma;
mod_info->vma_mod_check = check_remap_result;
capture_mod_pattern(maps_fd, mod_info, page1, page2, last_line, first_line,
&remapped_last_line, &remapped_first_line,
&restored_last_line, &restored_first_line);
/* Now start concurrent modifications for test_duration_sec */
signal_state(mod_info, TEST_READY);
struct line_content new_last_line;
struct line_content new_first_line;
struct timespec start_ts, end_ts;
clock_gettime(CLOCK_MONOTONIC_COARSE, &start_ts);
do {
read_boundary_lines(maps_fd, page1, page2, &new_last_line, &new_first_line);
/* Check if we read vmas after remapping it */
if (!strcmp(new_last_line.text, remapped_last_line.text)) {
/*
* The vmas should be consistent with remap results,
* however if the vma was concurrently restored, it
* can be reported twice (first as split one, then
* as restored one) because we found it as the next vma
* again. In that case new first line will be the same
* as the last restored line.
*/
assert(!strcmp(new_first_line.text, remapped_first_line.text) ||
!strcmp(new_first_line.text, restored_last_line.text));
} else {
/* The vmas should be consistent with the original/resored state */
assert(!strcmp(new_last_line.text, restored_last_line.text) &&
!strcmp(new_first_line.text, restored_first_line.text));
}
clock_gettime(CLOCK_MONOTONIC_COARSE, &end_ts);
} while (end_ts.tv_sec - start_ts.tv_sec < test_duration_sec);
/* Signal the modifyer thread to stop and wait until it exits */
signal_state(mod_info, TEST_DONE);
+}
static int test_maps_tearing(void) { struct vma_modifier_info *mod_info; @@ -757,6 +846,9 @@ static int test_maps_tearing(void) test_maps_tearing_from_resize(maps_fd, mod_info, &page1, &page2, &last_line, &first_line);
test_maps_tearing_from_remap(maps_fd, mod_info, &page1, &page2,
&last_line, &first_line);
stop_vma_modifier(mod_info); free(page2.data);
-- 2.49.0.805.g082f7c87e0-goog