commit 95b31e35239e5e1689e3d965d692a313c71bd8ab upstream
This is a regression since linux v4.19, introduced with commit 648e921888ad96ea3dc922739e96716ad3225d7f.
Without this patch, one ethernet port of this board is not usable.
>From the currently maintained stable kernels, the patch 648e921888ad96ea3dc922739e96716ad3225d7f was also
backported to linux-4.14.y, so I suggest to backport the patch to v4.14 and v4.19+.
Best regards,
Georg
From: LinFeng <linfeng23(a)huawei.com>
We found that the !is_zero_page() in kvm_is_mmio_pfn() was
submmited in commit:90cff5a8cc("KVM: check for !is_zero_pfn() in
kvm_is_mmio_pfn()"), but reverted in commit:0ef2459983("kvm: fix
kvm_is_mmio_pfn() and rename to kvm_is_reserved_pfn()").
Maybe just adding !is_zero_page() to kvm_is_reserved_pfn() is too
rough. According to commit:e433e83bc3("KVM: MMU: Do not treat
ZONE_DEVICE pages as being reserved"), special handling in some
other flows is also need by zero_page, if we would treat zero_page
as being reserved.
Well, as fixing all functions reference to kvm_is_reserved_pfn() in
this patch, we found that only kvm_release_pfn_clean() and
kvm_get_pfn() don't need special handling.
So, we thought why not only check is_zero_page() in before get and
put page, and revert our last commit:31e813f38f("KVM: fix overflow
of zero page refcount with ksm running") in master.
Instead of adding !is_zero_page() in kvm_is_reserved_pfn(),
new idea is as follow:
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 7f9ee2929cfe..f9a1f9cf188e 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1695,7 +1695,8 @@ EXPORT_SYMBOL_GPL(kvm_release_page_clean);
void kvm_release_pfn_clean(kvm_pfn_t pfn)
{
- if (!is_error_noslot_pfn(pfn) && !kvm_is_reserved_pfn(pfn))
+ if (!is_error_noslot_pfn(pfn) &&
+ (!kvm_is_reserved_pfn(pfn) || is_zero_pfn(pfn)))
put_page(pfn_to_page(pfn));
}
EXPORT_SYMBOL_GPL(kvm_release_pfn_clean);
@@ -1734,7 +1735,7 @@ EXPORT_SYMBOL_GPL(kvm_set_pfn_accessed);
void kvm_get_pfn(kvm_pfn_t pfn)
{
- if (!kvm_is_reserved_pfn(pfn))
+ if (!kvm_is_reserved_pfn(pfn) || is_zero_pfn(pfn))
get_page(pfn_to_page(pfn));
}
EXPORT_SYMBOL_GPL(kvm_get_pfn);
We are confused why ZONE_DEVICE not do this, but treating it as
no reserved. Is it racy if we change only use the patch in cover letter,
but not the series patches.
And we check the code of v4.9.y v4.10.y v4.11.y v4.12.y, this bug exists
in v4.11.y and later, but not in v4.9.y v4.10.y or before.
After commit:e86c59b1b1("mm/ksm: improve deduplication of zero pages
with colouring"), ksm will use zero pages with colouring. This feature
was added in v4.11.y, so I wonder why v4.9.y has this bug.
We use crash tools attaching to /proc/kcore to check the refcount of
zero_page, then create and destroy vm. The refcount stays at 1 on v4.9.y,
well it increases only after v4.11.y. Are you sure it is the same bug
you run into? Is there something we missing?
LinFeng (1):
KVM: special handling of zero_page in some flows
Zhuang Yanying (1):
KVM: fix overflow of zero page refcount with ksm running
arch/x86/kvm/mmu.c | 2 ++
virt/kvm/kvm_main.c | 9 +++++----
2 files changed, 7 insertions(+), 4 deletions(-)
--
2.23.0
>>
>> Fixes CVE-2018-20669
>> Backported from v5.0-rc1
>> Patch 1/1
>
> Also, that cve was "supposed" to already be fixed in the 4.19.13 kernel
> release for some reason, and it's a drm issue, not a core access_ok()
> issue.
>
> So why is this needed for 4.14?
>
See https://access.redhat.com/security/cve/cve-2018-20669
Looks like Linus' fix was attacking this at the root cause, not only for DRM.
Also, i use https://www.linuxkernelcves.com/ as a research source,
and they claim that CVE not fixed in 4.19.
(and i'll check for the other LTS kernels as well)
>>
>> Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
>
> No s-o-by from you?
Ops. Will add this in a resend.
>> Want to give this work back to the community, as 4.14 is a SLTS.
>
> What is "SLTS"?
Super Long Term Supported kernel - thanks to guys like you :-)
4.14 really is that (Jan. 2024, as of https://www.kernel.org/category/releases.html)
>
> thanks,
>
> greg k-h
Thanks, and i have some other patches backported to 4.14 as CVE fixes,
which i'll propose in the next hours.
BR
Carsten
-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter
>From eb5a13ddc30824c20f1e2b662d2c821ad3808526 Mon Sep 17 00:00:00 2001
From: Linus Torvalds <torvalds(a)linux-foundation.org>
Date: Fri, 4 Jan 2019 12:56:09 -0800
Subject: [PATCH] make 'user_access_begin()' do 'access_ok()'
[ Upstream commit 594cc251fdd0d231d342d88b2fdff4bc42fb0690 ]
Fixes CVE-2018-20669
Backported from v5.0-rc1
Patch 1/1
Originally, the rule used to be that you'd have to do access_ok()
separately, and then user_access_begin() before actually doing the
direct (optimized) user access.
But experience has shown that people then decide not to do access_ok()
at all, and instead rely on it being implied by other operations or
similar. Which makes it very hard to verify that the access has
actually been range-checked.
If you use the unsafe direct user accesses, hardware features (either
SMAP - Supervisor Mode Access Protection - on x86, or PAN - Privileged
Access Never - on ARM) do force you to use user_access_begin(). But
nothing really forces the range check.
By putting the range check into user_access_begin(), we actually force
people to do the right thing (tm), and the range check vill be visible
near the actual accesses. We have way too long a history of people
trying to avoid them.
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
---
Rationale:
When working on stability and security for a project with 4.14 kernel,
i backported patches from upstream.
Want to give this work back to the community, as 4.14 is a SLTS.
---
arch/x86/include/asm/uaccess.h | 9 ++++++++-
drivers/gpu/drm/i915/i915_gem_execbuffer.c | 15 +++++++++++++--
include/linux/uaccess.h | 2 +-
kernel/compat.c | 6 ++----
kernel/exit.c | 6 ++----
lib/strncpy_from_user.c | 9 +++++----
lib/strnlen_user.c | 9 +++++----
7 files changed, 36 insertions(+), 20 deletions(-)
diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index 971830341061..fd00c5fba059 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -711,7 +711,14 @@ extern struct movsl_mask {
* checking before using them, but you have to surround them with the
* user_access_begin/end() pair.
*/
-#define user_access_begin() __uaccess_begin()
+static __must_check inline bool user_access_begin(const void __user *ptr, size_t len)
+{
+ if (unlikely(!access_ok(VERIFY_READ,ptr,len)))
+ return 0;
+ __uaccess_begin();
+ return 1;
+}
+#define user_access_begin(a,b) user_access_begin(a,b)
#define user_access_end() __uaccess_end()
#define unsafe_put_user(x, ptr, err_label) \
diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
index d99d05a91032..3a65a45184ad 100644
--- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
@@ -1566,7 +1566,9 @@ static int eb_copy_relocations(const struct i915_execbuffer *eb)
* happened we would make the mistake of assuming that the
* relocations were valid.
*/
- user_access_begin();
+ if (!user_access_begin(urelocs, size))
+ goto end_user;
+
for (copied = 0; copied < nreloc; copied++)
unsafe_put_user(-1,
&urelocs[copied].presumed_offset,
@@ -2649,7 +2651,16 @@ i915_gem_execbuffer2(struct drm_device *dev, void *data,
unsigned int i;
/* Copy the new buffer offsets back to the user's exec list. */
- user_access_begin();
+ /*
+ * Note: args->buffer_count * sizeof(*user_exec_list) does not overflow,
+ * because we checked this on entry.
+ *
+ * And this range already got effectively checked earlier
+ * when we did the "copy_from_user()" above.
+ */
+ if (!user_access_begin(user_exec_list, args->buffer_count * sizeof(*user_exec_list)))
+ goto end_user;
+
for (i = 0; i < args->buffer_count; i++) {
if (!(exec2_list[i].offset & UPDATE))
continue;
diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h
index 251e655d407f..76407748701b 100644
--- a/include/linux/uaccess.h
+++ b/include/linux/uaccess.h
@@ -267,7 +267,7 @@ extern long strncpy_from_unsafe(char *dst, const void *unsafe_addr, long count);
probe_kernel_read(&retval, addr, sizeof(retval))
#ifndef user_access_begin
-#define user_access_begin() do { } while (0)
+#define user_access_begin(ptr,len) access_ok(ptr, len)
#define user_access_end() do { } while (0)
#define unsafe_get_user(x, ptr, err) do { if (unlikely(__get_user(x, ptr))) goto err; } while (0)
#define unsafe_put_user(x, ptr, err) do { if (unlikely(__put_user(x, ptr))) goto err; } while (0)
diff --git a/kernel/compat.c b/kernel/compat.c
index 7e83733d4c95..a9f5de63dc90 100644
--- a/kernel/compat.c
+++ b/kernel/compat.c
@@ -437,10 +437,9 @@ long compat_get_bitmap(unsigned long *mask, const compat_ulong_t __user *umask,
bitmap_size = ALIGN(bitmap_size, BITS_PER_COMPAT_LONG);
nr_compat_longs = BITS_TO_COMPAT_LONGS(bitmap_size);
- if (!access_ok(VERIFY_READ, umask, bitmap_size / 8))
+ if (!user_access_begin(umask, bitmap_size / 8))
return -EFAULT;
- user_access_begin();
while (nr_compat_longs > 1) {
compat_ulong_t l1, l2;
unsafe_get_user(l1, umask++, Efault);
@@ -467,10 +466,9 @@ long compat_put_bitmap(compat_ulong_t __user *umask, unsigned long *mask,
bitmap_size = ALIGN(bitmap_size, BITS_PER_COMPAT_LONG);
nr_compat_longs = BITS_TO_COMPAT_LONGS(bitmap_size);
- if (!access_ok(VERIFY_WRITE, umask, bitmap_size / 8))
+ if (!user_access_begin(umask, bitmap_size / 8))
return -EFAULT;
- user_access_begin();
while (nr_compat_longs > 1) {
unsigned long m = *mask++;
unsafe_put_user((compat_ulong_t)m, umask++, Efault);
diff --git a/kernel/exit.c b/kernel/exit.c
index d1baf9c96c3e..c3ad546ba7c2 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -1597,10 +1597,9 @@ SYSCALL_DEFINE5(waitid, int, which, pid_t, upid, struct siginfo __user *,
if (!infop)
return err;
- if (!access_ok(VERIFY_WRITE, infop, sizeof(*infop)))
+ if (!user_access_begin(infop, sizeof(*infop)))
return -EFAULT;
- user_access_begin();
unsafe_put_user(signo, &infop->si_signo, Efault);
unsafe_put_user(0, &infop->si_errno, Efault);
unsafe_put_user(info.cause, &infop->si_code, Efault);
@@ -1725,10 +1724,9 @@ COMPAT_SYSCALL_DEFINE5(waitid,
if (!infop)
return err;
- if (!access_ok(VERIFY_WRITE, infop, sizeof(*infop)))
+ if (!user_access_begin(infop, sizeof(*infop)))
return -EFAULT;
- user_access_begin();
unsafe_put_user(signo, &infop->si_signo, Efault);
unsafe_put_user(0, &infop->si_errno, Efault);
unsafe_put_user(info.cause, &infop->si_code, Efault);
diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c
index e304b54c9c7d..023ba9f3b99f 100644
--- a/lib/strncpy_from_user.c
+++ b/lib/strncpy_from_user.c
@@ -115,10 +115,11 @@ long strncpy_from_user(char *dst, const char __user *src, long count)
kasan_check_write(dst, count);
check_object_size(dst, count, false);
- user_access_begin();
- retval = do_strncpy_from_user(dst, src, count, max);
- user_access_end();
- return retval;
+ if (user_access_begin(src, max)) {
+ retval = do_strncpy_from_user(dst, src, count, max);
+ user_access_end();
+ return retval;
+ }
}
return -EFAULT;
}
diff --git a/lib/strnlen_user.c b/lib/strnlen_user.c
index 184f80f7bacf..7f2db3fe311f 100644
--- a/lib/strnlen_user.c
+++ b/lib/strnlen_user.c
@@ -114,10 +114,11 @@ long strnlen_user(const char __user *str, long count)
unsigned long max = max_addr - src_addr;
long retval;
- user_access_begin();
- retval = do_strnlen_user(str, count, max);
- user_access_end();
- return retval;
+ if (user_access_begin(str, max)) {
+ retval = do_strnlen_user(str, count, max);
+ user_access_end();
+ return retval;
+ }
}
return 0;
}
--
2.17.1
-----------------
Mentor Graphics (Deutschland) GmbH, Arnulfstraße 201, 80634 München / Germany
Registergericht München HRB 106955, Geschäftsführer: Thomas Heurung, Alexander Walter