This commit removed an extra check for zero-length ranges, and folded it into the common validate_range() helper used by all UFFD ioctls.
It failed to notice though that UFFDIO_COPY *only* called validate_range on the dst range, not the src range. So removing this check actually let us proceed with zero-length source ranges, eventually hitting a BUG further down in the call stack.
The correct fix seems clear: call validate_range() on the src range too.
Other ioctls are not affected by this, as they only have one range, not two (src + dst).
Reported-by: syzbot+42309678e0bc7b32f8e9@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=42309678e0bc7b32f8e9 Signed-off-by: Axel Rasmussen axelrasmussen@google.com --- fs/userfaultfd.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 53a7220c4679..36d233759233 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -1759,6 +1759,9 @@ static int userfaultfd_copy(struct userfaultfd_ctx *ctx, sizeof(uffdio_copy)-sizeof(__s64))) goto out;
+ ret = validate_range(ctx->mm, uffdio_copy.src, uffdio_copy.len); + if (ret) + goto out; ret = validate_range(ctx->mm, uffdio_copy.dst, uffdio_copy.len); if (ret) goto out;
On 14/07/2023 19:29, Axel Rasmussen wrote:
This commit removed an extra check for zero-length ranges, and folded it into the common validate_range() helper used by all UFFD ioctls.
It failed to notice though that UFFDIO_COPY *only* called validate_range on the dst range, not the src range. So removing this check actually let us proceed with zero-length source ranges, eventually hitting a BUG further down in the call stack.
The correct fix seems clear: call validate_range() on the src range too.
Other ioctls are not affected by this, as they only have one range, not two (src + dst).
Reported-by: syzbot+42309678e0bc7b32f8e9@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=42309678e0bc7b32f8e9 Signed-off-by: Axel Rasmussen axelrasmussen@google.com
fs/userfaultfd.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 53a7220c4679..36d233759233 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -1759,6 +1759,9 @@ static int userfaultfd_copy(struct userfaultfd_ctx *ctx, sizeof(uffdio_copy)-sizeof(__s64))) goto out;
- ret = validate_range(ctx->mm, uffdio_copy.src, uffdio_copy.len);
- if (ret)
ret = validate_range(ctx->mm, uffdio_copy.dst, uffdio_copy.len); if (ret) goto out;goto out;
Hi Axel,
I've just noticed that this patch, now in mm-unstable, regresses the mkdirty mm selftest:
# [INFO] detected THP size: 2048 KiB TAP version 13 1..6 # [INFO] PTRACE write access ok 1 SIGSEGV generated, page not modified # [INFO] PTRACE write access to THP ok 2 SIGSEGV generated, page not modified # [INFO] Page migration ok 3 SIGSEGV generated, page not modified # [INFO] Page migration of THP ok 4 SIGSEGV generated, page not modified # [INFO] PTE-mapping a THP ok 5 SIGSEGV generated, page not modified # [INFO] UFFDIO_COPY not ok 6 UFFDIO_COPY failed Bail out! 1 out of 6 tests failed # Totals: pass:5 fail:1 xfail:0 xpass:0 skip:0 error:0
Whereas all 6 tests pass against v6.5-rc4.
I'm afraid I don't know the test well and haven't looked at what the issue might be, but noticed and thought I should point it out.
bisect log:
git bisect start # bad: [ad3232df3e410acc2229c9195479c5596c1d1f96] mm/memory_hotplug: embed vmem_altmap details in memory block git bisect bad ad3232df3e410acc2229c9195479c5596c1d1f96 # good: [5d0c230f1de8c7515b6567d9afba1f196fb4e2f4] Linux 6.5-rc4 git bisect good 5d0c230f1de8c7515b6567d9afba1f196fb4e2f4 # bad: [aa5712770e3f0edb31ae879cd6452d5c2111d4fb] mm: fix obsolete function name above debug_pagealloc_enabled_static() git bisect bad aa5712770e3f0edb31ae879cd6452d5c2111d4fb # bad: [bef1ff8723df303a06cdaffe64d95db2f7e7d4f6] mm: userfaultfd: support UFFDIO_POISON for hugetlbfs git bisect bad bef1ff8723df303a06cdaffe64d95db2f7e7d4f6 # good: [4f4469463e8571012d2602b39f14ed3e3dbd972a] selftests/mm: add gup test matrix in run_vmtests.sh git bisect good 4f4469463e8571012d2602b39f14ed3e3dbd972a # good: [5c0d69839ef4f560679919f3483a592741df74f8] memcg: drop kmem.limit_in_bytes git bisect good 5c0d69839ef4f560679919f3483a592741df74f8 # good: [b75f155a299729dc62de0ce6a9400d076298aa4c] mm: compaction: skip the memory hole rapidly when isolating free pages git bisect good b75f155a299729dc62de0ce6a9400d076298aa4c # good: [12b13121d9f4487301dd9fb765265b642b2f6d5d] mm/memcg: minor cleanup for MEM_CGROUP_ID_MAX git bisect good 12b13121d9f4487301dd9fb765265b642b2f6d5d # bad: [c9c368e75919c105aae072896e58d0ad4639e505] mm: userfaultfd: check for start + len overflow in validate_range: fix git bisect bad c9c368e75919c105aae072896e58d0ad4639e505 # good: [9e707995021bbdfc67ad83a985f7796bba580bed] mm-make-pte_marker_swapin_error-more-general-fix git bisect good 9e707995021bbdfc67ad83a985f7796bba580bed # good: [46b66377b696c43c89ed4b1cb3f56b64e8fd475b] mm: userfaultfd: check for start + len overflow in validate_range git bisect good 46b66377b696c43c89ed4b1cb3f56b64e8fd475b # first bad commit: [c9c368e75919c105aae072896e58d0ad4639e505] mm: userfaultfd: check for start + len overflow in validate_range: fix
Thanks, Ryan
On 10.08.23 17:53, Ryan Roberts wrote:
On 14/07/2023 19:29, Axel Rasmussen wrote:
This commit removed an extra check for zero-length ranges, and folded it into the common validate_range() helper used by all UFFD ioctls.
It failed to notice though that UFFDIO_COPY *only* called validate_range on the dst range, not the src range. So removing this check actually let us proceed with zero-length source ranges, eventually hitting a BUG further down in the call stack.
The correct fix seems clear: call validate_range() on the src range too.
Other ioctls are not affected by this, as they only have one range, not two (src + dst).
Reported-by: syzbot+42309678e0bc7b32f8e9@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=42309678e0bc7b32f8e9 Signed-off-by: Axel Rasmussen axelrasmussen@google.com
fs/userfaultfd.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 53a7220c4679..36d233759233 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -1759,6 +1759,9 @@ static int userfaultfd_copy(struct userfaultfd_ctx *ctx, sizeof(uffdio_copy)-sizeof(__s64))) goto out;
- ret = validate_range(ctx->mm, uffdio_copy.src, uffdio_copy.len);
- if (ret)
ret = validate_range(ctx->mm, uffdio_copy.dst, uffdio_copy.len); if (ret) goto out;goto out;
Hi Axel,
I've just noticed that this patch, now in mm-unstable, regresses the mkdirty mm selftest:
# [INFO] detected THP size: 2048 KiB TAP version 13 1..6 # [INFO] PTRACE write access ok 1 SIGSEGV generated, page not modified # [INFO] PTRACE write access to THP ok 2 SIGSEGV generated, page not modified # [INFO] Page migration ok 3 SIGSEGV generated, page not modified # [INFO] Page migration of THP ok 4 SIGSEGV generated, page not modified # [INFO] PTE-mapping a THP ok 5 SIGSEGV generated, page not modified # [INFO] UFFDIO_COPY not ok 6 UFFDIO_COPY failed Bail out! 1 out of 6 tests failed # Totals: pass:5 fail:1 xfail:0 xpass:0 skip:0 error:0
Whereas all 6 tests pass against v6.5-rc4.
I'm afraid I don't know the test well and haven't looked at what the issue might be, but noticed and thought I should point it out.
That test (written by me ;) ) essentially does
src = malloc(pagesize); dst = mmap(NULL, pagesize, PROT_READ, MAP_PRIVATE|MAP_ANON, -1, 0) ...
uffdio_copy.dst = (unsigned long) dst; uffdio_copy.src = (unsigned long) src; uffdio_copy.len = pagesize; uffdio_copy.mode = 0; if (ioctl(uffd, UFFDIO_COPY, &uffdio_copy)) { ...
So src might not be aligned to a full page.
According to the man page:
"EINVAL Either dst or len was not a multiple of the system page size, or the range specified by src and len or dst and len was invalid."
So, AFAIKT, there is no requirement for src to be page-aligned.
Using validate_range() on the src is wrong.
On Thu, Aug 10, 2023 at 9:49 AM David Hildenbrand david@redhat.com wrote:
On 10.08.23 17:53, Ryan Roberts wrote:
On 14/07/2023 19:29, Axel Rasmussen wrote:
This commit removed an extra check for zero-length ranges, and folded it into the common validate_range() helper used by all UFFD ioctls.
It failed to notice though that UFFDIO_COPY *only* called validate_range on the dst range, not the src range. So removing this check actually let us proceed with zero-length source ranges, eventually hitting a BUG further down in the call stack.
The correct fix seems clear: call validate_range() on the src range too.
Other ioctls are not affected by this, as they only have one range, not two (src + dst).
Reported-by: syzbot+42309678e0bc7b32f8e9@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=42309678e0bc7b32f8e9 Signed-off-by: Axel Rasmussen axelrasmussen@google.com
fs/userfaultfd.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/fs/userfaultfd.c b/fs/userfaultfd.c index 53a7220c4679..36d233759233 100644 --- a/fs/userfaultfd.c +++ b/fs/userfaultfd.c @@ -1759,6 +1759,9 @@ static int userfaultfd_copy(struct userfaultfd_ctx *ctx, sizeof(uffdio_copy)-sizeof(__s64))) goto out;
- ret = validate_range(ctx->mm, uffdio_copy.src, uffdio_copy.len);
- if (ret)
ret = validate_range(ctx->mm, uffdio_copy.dst, uffdio_copy.len); if (ret) goto out;goto out;
Hi Axel,
I've just noticed that this patch, now in mm-unstable, regresses the mkdirty mm selftest:
# [INFO] detected THP size: 2048 KiB TAP version 13 1..6 # [INFO] PTRACE write access ok 1 SIGSEGV generated, page not modified # [INFO] PTRACE write access to THP ok 2 SIGSEGV generated, page not modified # [INFO] Page migration ok 3 SIGSEGV generated, page not modified # [INFO] Page migration of THP ok 4 SIGSEGV generated, page not modified # [INFO] PTE-mapping a THP ok 5 SIGSEGV generated, page not modified # [INFO] UFFDIO_COPY not ok 6 UFFDIO_COPY failed Bail out! 1 out of 6 tests failed # Totals: pass:5 fail:1 xfail:0 xpass:0 skip:0 error:0
Whereas all 6 tests pass against v6.5-rc4.
I'm afraid I don't know the test well and haven't looked at what the issue might be, but noticed and thought I should point it out.
That test (written by me ;) ) essentially does
src = malloc(pagesize); dst = mmap(NULL, pagesize, PROT_READ, MAP_PRIVATE|MAP_ANON, -1, 0) ...
uffdio_copy.dst = (unsigned long) dst; uffdio_copy.src = (unsigned long) src; uffdio_copy.len = pagesize; uffdio_copy.mode = 0; if (ioctl(uffd, UFFDIO_COPY, &uffdio_copy)) { ...
So src might not be aligned to a full page.
According to the man page:
"EINVAL Either dst or len was not a multiple of the system page size, or the range specified by src and len or dst and len was invalid."
So, AFAIKT, there is no requirement for src to be page-aligned.
Using validate_range() on the src is wrong.
Thanks for the report and the suggestions! I sent a fixup patch which should resolve this [1]. At least, I ran the test in question a bunch of times and it passed reliably with this fix.
[1]: https://patchwork.kernel.org/project/linux-mm/patch/20230810192128.1855570-1...
-- Cheers,
David / dhildenb
linux-kselftest-mirror@lists.linaro.org