Hi Greg,
I am looking into fixing CVE-2017-12190 in chromeos-4.4.
The fix requires two patches, 95d78c28b5a85 ("fix unbalanced
page refcounting in bio_map_user_iov") and 2b04e8f6bbb1 ("more
bio_map_user_iov() leak fixes"). I noticed that the second patch
has not been applied to linux-4.4.y. Is that due to the conflict,
or was there a conscious decision not to apply it ?
On a side note, is there a working archive for stable(a)vger.kernel.org ?
The one listed at vger.kernel.org (gmane) doesn't work.
Thanks,
Guenter
The patch titled
Subject: kernel/acct.c: fix the acct->needcheck check in check_free_space()
has been added to the -mm tree. Its filename is
acct-fix-the-acct-needcheck-check-in-check_free_space.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/acct-fix-the-acct-needcheck-check-…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/acct-fix-the-acct-needcheck-check-…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/SubmitChecklist when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Oleg Nesterov <oleg(a)redhat.com>
Subject: kernel/acct.c: fix the acct->needcheck check in check_free_space()
As Tsukada explains, the time_is_before_jiffies(acct->needcheck) check is
very wrong, we need time_is_after_jiffies() to make sys_acct() work.
Ignoring the overflows, the code should "goto out" if needcheck > jiffies,
while currently it checks "needcheck < jiffies" and thus in the likely
case check_free_space() does nothing until jiffies overflow.
In particular this means that sys_acct() is simply broken, acct_on() sets
acct->needcheck = jiffies and expects that check_free_space() should set
acct->active = 1 after the free-space check, but this won't happen if
jiffies increments in between.
This was broken by commit 32dc73086015 ("get rid of timer in kern/acct.c")
in 2011, then another (correct) commit 795a2f22a8ea ("acct() should honour
the limits from the very beginning") made the problem more visible.
Link: http://lkml.kernel.org/r/20171213133940.GA6554@redhat.com
Fixes: 32dc73086015 ("get rid of timer in kern/acct.c")
Reported-by: TSUKADA Koutaro <tsukada(a)ascade.co.jp>
Suggested-by: TSUKADA Koutaro <tsukada(a)ascade.co.jp>
Signed-off-by: Oleg Nesterov <oleg(a)redhat.com>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/acct.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff -puN kernel/acct.c~acct-fix-the-acct-needcheck-check-in-check_free_space kernel/acct.c
--- a/kernel/acct.c~acct-fix-the-acct-needcheck-check-in-check_free_space
+++ a/kernel/acct.c
@@ -102,7 +102,7 @@ static int check_free_space(struct bsd_a
{
struct kstatfs sbuf;
- if (time_is_before_jiffies(acct->needcheck))
+ if (time_is_after_jiffies(acct->needcheck))
goto out;
/* May block */
_
Patches currently in -mm which might be from oleg(a)redhat.com are
acct-fix-the-acct-needcheck-check-in-check_free_space.patch
From: Changbin Du <changbin.du(a)intel.com>
The default NR_CPUS can be very large, but actual possible nr_cpu_ids
usually is very small. For my x86 distribution, the NR_CPUS is 8192 and
nr_cpu_ids is 4. About 2 pages are wasted.
Most machines don't have so many CPUs, so define a array with NR_CPUS
just wastes memory. So let's allocate the buffer dynamically when need.
With this change, the mutext tracing_cpumask_update_lock also can be
removed now, which was used to protect mask_str.
Link: http://lkml.kernel.org/r/1512013183-19107-1-git-send-email-changbin.du@inte…
Fixes: 36dfe9252bd4c ("ftrace: make use of tracing_cpumask")
Cc: stable(a)vger.kernel.org
Signed-off-by: Changbin Du <changbin.du(a)intel.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
---
kernel/trace/trace.c | 29 +++++++++--------------------
1 file changed, 9 insertions(+), 20 deletions(-)
diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 5815ec16edd4..9f3f043ba3b7 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -4178,37 +4178,30 @@ static const struct file_operations show_traces_fops = {
.llseek = seq_lseek,
};
-/*
- * The tracer itself will not take this lock, but still we want
- * to provide a consistent cpumask to user-space:
- */
-static DEFINE_MUTEX(tracing_cpumask_update_lock);
-
-/*
- * Temporary storage for the character representation of the
- * CPU bitmask (and one more byte for the newline):
- */
-static char mask_str[NR_CPUS + 1];
-
static ssize_t
tracing_cpumask_read(struct file *filp, char __user *ubuf,
size_t count, loff_t *ppos)
{
struct trace_array *tr = file_inode(filp)->i_private;
+ char *mask_str;
int len;
- mutex_lock(&tracing_cpumask_update_lock);
+ len = snprintf(NULL, 0, "%*pb\n",
+ cpumask_pr_args(tr->tracing_cpumask)) + 1;
+ mask_str = kmalloc(len, GFP_KERNEL);
+ if (!mask_str)
+ return -ENOMEM;
- len = snprintf(mask_str, count, "%*pb\n",
+ len = snprintf(mask_str, len, "%*pb\n",
cpumask_pr_args(tr->tracing_cpumask));
if (len >= count) {
count = -EINVAL;
goto out_err;
}
- count = simple_read_from_buffer(ubuf, count, ppos, mask_str, NR_CPUS+1);
+ count = simple_read_from_buffer(ubuf, count, ppos, mask_str, len);
out_err:
- mutex_unlock(&tracing_cpumask_update_lock);
+ kfree(mask_str);
return count;
}
@@ -4228,8 +4221,6 @@ tracing_cpumask_write(struct file *filp, const char __user *ubuf,
if (err)
goto err_unlock;
- mutex_lock(&tracing_cpumask_update_lock);
-
local_irq_disable();
arch_spin_lock(&tr->max_lock);
for_each_tracing_cpu(cpu) {
@@ -4252,8 +4243,6 @@ tracing_cpumask_write(struct file *filp, const char __user *ubuf,
local_irq_enable();
cpumask_copy(tr->tracing_cpumask, tracing_cpumask_new);
-
- mutex_unlock(&tracing_cpumask_update_lock);
free_cpumask_var(tracing_cpumask_new);
return count;
--
2.13.2
Hi all,
I've tested the following changes, belonging to merge commit f7dd3b1734e,
on top of 4.9.68 after a very easy backport from 4.10, and I think it
may be worthwhile adding them to 4.9.x:
x86/tsc: Limit the adjust value further
x86/tsc: Annotate printouts as firmware bug
x86/tsc: Force TSC_ADJUST register to value >= zero
x86/tsc: Validate TSC_ADJUST after resume
x86/tsc: Validate cpumask pointer before accessing it
x86/tsc: Fix broken CONFIG_X86_TSC=n build
x86/tsc: Try to adjust TSC if sync test fails
x86/tsc: Prepare warp test for TSC adjustment
x86/tsc: Move sync cleanup to a safe place
x86/tsc: Sync test only for the first cpu in a package
x86/tsc: Verify TSC_ADJUST from idle
x86/tsc: Store and check TSC ADJUST MSR
x86/tsc: Detect random warps
x86/tsc: Use X86_FEATURE_TSC_ADJUST in detect_art()
x86/tsc: Finalize the split of the TSC_RELIABLE flag
x86/tsc: Set TSC_KNOWN_FREQ and TSC_RELIABLE flags on Intel Atom SoCs
x86/tsc: Mark Intel ATOM_GOLDMONT TSC reliable
x86/tsc: Mark TSC frequency determined by CPUID as known
x86/tsc: Add X86_FEATURE_TSC_KNOWN_FREQ flag
These changes percisely fix an issue I am having with a relatively new
8-core Intel(R) Core(TM) i7-7820X with an updated ASUS BIOS (December 2017).
Under v4.9.68, the kernel fallbacks on the chosen clocksource to HPET which
just doesn't work - there is over a 200ms time drift that does not go
away even after repeated ntpdate sync attempts.
For further testing I've posted a branch for these changes here:
https://github.com/kernelim/linux tsc-fix-for-4.9.x
--
Dan Aloni
This is a note to let you know that I've just added the patch titled
staging: ion: Fix ion_cma_heap allocations
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From d98e6dbf42f73101128885a1e0ae672cd92b2e1a Mon Sep 17 00:00:00 2001
From: John Stultz <john.stultz(a)linaro.org>
Date: Fri, 8 Dec 2017 17:11:12 -0800
Subject: staging: ion: Fix ion_cma_heap allocations
In trying to add support for drm_hwcomposer to HiKey,
I've needed to utilize the ION CMA heap, and I've noticed
problems with allocations on newer kernels failing.
It seems back with 204f672255c2 ("ion: Use CMA APIs directly"),
the ion_cma_heap code was modified to use the CMA API, but
kept the arguments as buffer lengths rather then number of pages.
This results in errors as we don't have enough pages in CMA to
satisfy the exaggerated requests.
This patch converts the ion_cma_heap CMA API usage to properly
request pages.
It also fixes a minor issue in the allocation where in the error
path, the cma_release is called with the buffer->size value which
hasn't yet been set.
Cc: Laura Abbott <labbott(a)redhat.com>
Cc: Sumit Semwal <sumit.semwal(a)linaro.org>
Cc: Benjamin Gaignard <benjamin.gaignard(a)linaro.org>
Cc: Archit Taneja <architt(a)codeaurora.org>
Cc: Daniel Vetter <daniel(a)ffwll.ch>
Cc: Dmitry Shmidt <dimitrysh(a)google.com>
Cc: Todd Kjos <tkjos(a)google.com>
Cc: Amit Pundir <amit.pundir(a)linaro.org>
Fixes: 204f672255c2 ("staging: android: ion: Use CMA APIs directly")
Cc: stable <stable(a)vger.kernel.org>
Acked-by: Laura Abbott <labbott(a)redhat.com>
Signed-off-by: John Stultz <john.stultz(a)linaro.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/staging/android/ion/ion_cma_heap.c | 15 +++++++++++----
1 file changed, 11 insertions(+), 4 deletions(-)
diff --git a/drivers/staging/android/ion/ion_cma_heap.c b/drivers/staging/android/ion/ion_cma_heap.c
index dd5545d9990a..86196ffd2faf 100644
--- a/drivers/staging/android/ion/ion_cma_heap.c
+++ b/drivers/staging/android/ion/ion_cma_heap.c
@@ -39,9 +39,15 @@ static int ion_cma_allocate(struct ion_heap *heap, struct ion_buffer *buffer,
struct ion_cma_heap *cma_heap = to_cma_heap(heap);
struct sg_table *table;
struct page *pages;
+ unsigned long size = PAGE_ALIGN(len);
+ unsigned long nr_pages = size >> PAGE_SHIFT;
+ unsigned long align = get_order(size);
int ret;
- pages = cma_alloc(cma_heap->cma, len, 0, GFP_KERNEL);
+ if (align > CONFIG_CMA_ALIGNMENT)
+ align = CONFIG_CMA_ALIGNMENT;
+
+ pages = cma_alloc(cma_heap->cma, nr_pages, align, GFP_KERNEL);
if (!pages)
return -ENOMEM;
@@ -53,7 +59,7 @@ static int ion_cma_allocate(struct ion_heap *heap, struct ion_buffer *buffer,
if (ret)
goto free_mem;
- sg_set_page(table->sgl, pages, len, 0);
+ sg_set_page(table->sgl, pages, size, 0);
buffer->priv_virt = pages;
buffer->sg_table = table;
@@ -62,7 +68,7 @@ static int ion_cma_allocate(struct ion_heap *heap, struct ion_buffer *buffer,
free_mem:
kfree(table);
err:
- cma_release(cma_heap->cma, pages, buffer->size);
+ cma_release(cma_heap->cma, pages, nr_pages);
return -ENOMEM;
}
@@ -70,9 +76,10 @@ static void ion_cma_free(struct ion_buffer *buffer)
{
struct ion_cma_heap *cma_heap = to_cma_heap(buffer->heap);
struct page *pages = buffer->priv_virt;
+ unsigned long nr_pages = PAGE_ALIGN(buffer->size) >> PAGE_SHIFT;
/* release memory */
- cma_release(cma_heap->cma, pages, buffer->size);
+ cma_release(cma_heap->cma, pages, nr_pages);
/* release sg table */
sg_free_table(buffer->sg_table);
kfree(buffer->sg_table);
--
2.15.1
This is a note to let you know that I've just added the patch titled
USB: core: prevent malicious bNumInterfaces overflow
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 48a4ff1c7bb5a32d2e396b03132d20d552c0eca7 Mon Sep 17 00:00:00 2001
From: Alan Stern <stern(a)rowland.harvard.edu>
Date: Tue, 12 Dec 2017 14:25:13 -0500
Subject: USB: core: prevent malicious bNumInterfaces overflow
A malicious USB device with crafted descriptors can cause the kernel
to access unallocated memory by setting the bNumInterfaces value too
high in a configuration descriptor. Although the value is adjusted
during parsing, this adjustment is skipped in one of the error return
paths.
This patch prevents the problem by setting bNumInterfaces to 0
initially. The existing code already sets it to the proper value
after parsing is complete.
Signed-off-by: Alan Stern <stern(a)rowland.harvard.edu>
Reported-by: Andrey Konovalov <andreyknvl(a)google.com>
CC: <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/core/config.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/core/config.c b/drivers/usb/core/config.c
index 55b198ba629b..78e92d29f8d9 100644
--- a/drivers/usb/core/config.c
+++ b/drivers/usb/core/config.c
@@ -555,6 +555,9 @@ static int usb_parse_configuration(struct usb_device *dev, int cfgidx,
unsigned iad_num = 0;
memcpy(&config->desc, buffer, USB_DT_CONFIG_SIZE);
+ nintf = nintf_orig = config->desc.bNumInterfaces;
+ config->desc.bNumInterfaces = 0; // Adjusted later
+
if (config->desc.bDescriptorType != USB_DT_CONFIG ||
config->desc.bLength < USB_DT_CONFIG_SIZE ||
config->desc.bLength > size) {
@@ -568,7 +571,6 @@ static int usb_parse_configuration(struct usb_device *dev, int cfgidx,
buffer += config->desc.bLength;
size -= config->desc.bLength;
- nintf = nintf_orig = config->desc.bNumInterfaces;
if (nintf > USB_MAXINTERFACES) {
dev_warn(ddev, "config %d has too many interfaces: %d, "
"using maximum allowed: %d\n",
--
2.15.1
This is a note to let you know that I've just added the patch titled
Revert "USB: core: only clean up what we allocated"
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From cf4df407e0d7cde60a45369c2a3414d18e2d4fdd Mon Sep 17 00:00:00 2001
From: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Date: Wed, 13 Dec 2017 11:59:39 +0100
Subject: Revert "USB: core: only clean up what we allocated"
This reverts commit 32fd87b3bbf5f7a045546401dfe2894dbbf4d8c3.
Alan wrote a better fix for this...
Cc: Andrey Konovalov <andreyknvl(a)google.com>
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/core/config.c | 9 +++------
1 file changed, 3 insertions(+), 6 deletions(-)
diff --git a/drivers/usb/core/config.c b/drivers/usb/core/config.c
index 93b38471754e..55b198ba629b 100644
--- a/drivers/usb/core/config.c
+++ b/drivers/usb/core/config.c
@@ -764,21 +764,18 @@ void usb_destroy_configuration(struct usb_device *dev)
return;
if (dev->rawdescriptors) {
- for (i = 0; i < dev->descriptor.bNumConfigurations &&
- i < USB_MAXCONFIG; i++)
+ for (i = 0; i < dev->descriptor.bNumConfigurations; i++)
kfree(dev->rawdescriptors[i]);
kfree(dev->rawdescriptors);
dev->rawdescriptors = NULL;
}
- for (c = 0; c < dev->descriptor.bNumConfigurations &&
- c < USB_MAXCONFIG; c++) {
+ for (c = 0; c < dev->descriptor.bNumConfigurations; c++) {
struct usb_host_config *cf = &dev->config[c];
kfree(cf->string);
- for (i = 0; i < cf->desc.bNumInterfaces &&
- i < USB_MAXINTERFACES; i++) {
+ for (i = 0; i < cf->desc.bNumInterfaces; i++) {
if (cf->intf_cache[i])
kref_put(&cf->intf_cache[i]->ref,
usb_release_interface_cache);
--
2.15.1
When plugging in a USB webcam I see the following message:
xhci_hcd 0000:04:00.0: WARN Successful completion on short TX: needs
XHCI_TRUST_TX_LENGTH quirk?
handle_tx_event: 913 callbacks suppressed
All is quiet again with this patch (and I've done a fair but of soak
testing with the camera since).
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Daniel Thompson <daniel.thompson(a)linaro.org>
---
drivers/usb/host/xhci-pci.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index 7ef1274ef7f7..1aad89b8aba0 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -177,6 +177,9 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
xhci->quirks |= XHCI_TRUST_TX_LENGTH;
xhci->quirks |= XHCI_BROKEN_STREAMS;
}
+ if (pdev->vendor == PCI_VENDOR_ID_RENESAS &&
+ pdev->device == 0x0014)
+ xhci->quirks |= XHCI_TRUST_TX_LENGTH;
if (pdev->vendor == PCI_VENDOR_ID_RENESAS &&
pdev->device == 0x0015)
xhci->quirks |= XHCI_RESET_ON_RESUME;
--
2.14.2
This is a note to let you know that I've just added the patch titled
powerpc/64: Fix checksum folding in csum_tcpudp_nofold and ip_fast_csum_nofold
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
powerpc-64-fix-checksum-folding-in-csum_tcpudp_nofold-and-ip_fast_csum_nofold.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From b492f7e4e07a28e706db26cf4943bb0911435426 Mon Sep 17 00:00:00 2001
From: Paul Mackerras <paulus(a)ozlabs.org>
Date: Thu, 3 Nov 2016 16:10:55 +1100
Subject: powerpc/64: Fix checksum folding in csum_tcpudp_nofold and ip_fast_csum_nofold
From: Paul Mackerras <paulus(a)ozlabs.org>
commit b492f7e4e07a28e706db26cf4943bb0911435426 upstream.
These functions compute an IP checksum by computing a 64-bit sum and
folding it to 32 bits (the "nofold" in their names refers to folding
down to 16 bits). However, doing (u32) (s + (s >> 32)) is not
sufficient to fold a 64-bit sum to 32 bits correctly. The addition
can produce a carry out from bit 31, which needs to be added in to
the sum to produce the correct result.
To fix this, we copy the from64to32() function from lib/checksum.c
and use that.
Signed-off-by: Paul Mackerras <paulus(a)ozlabs.org>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/powerpc/include/asm/checksum.h | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)
--- a/arch/powerpc/include/asm/checksum.h
+++ b/arch/powerpc/include/asm/checksum.h
@@ -53,17 +53,25 @@ static inline __sum16 csum_fold(__wsum s
return (__force __sum16)(~((__force u32)sum + tmp) >> 16);
}
+static inline u32 from64to32(u64 x)
+{
+ /* add up 32-bit and 32-bit for 32+c bit */
+ x = (x & 0xffffffff) + (x >> 32);
+ /* add up carry.. */
+ x = (x & 0xffffffff) + (x >> 32);
+ return (u32)x;
+}
+
static inline __wsum csum_tcpudp_nofold(__be32 saddr, __be32 daddr, __u32 len,
__u8 proto, __wsum sum)
{
#ifdef __powerpc64__
- unsigned long s = (__force u32)sum;
+ u64 s = (__force u32)sum;
s += (__force u32)saddr;
s += (__force u32)daddr;
s += proto + len;
- s += (s >> 32);
- return (__force __wsum) s;
+ return (__force __wsum) from64to32(s);
#else
__asm__("\n\
addc %0,%0,%1 \n\
@@ -123,8 +131,7 @@ static inline __wsum ip_fast_csum_nofold
for (i = 0; i < ihl - 1; i++, ptr++)
s += *ptr;
- s += (s >> 32);
- return (__force __wsum)s;
+ return (__force __wsum)from64to32(s);
#else
__wsum sum, tmp;
Patches currently in stable-queue which might be from paulus(a)ozlabs.org are
queue-4.9/powerpc-64-fix-checksum-folding-in-csum_tcpudp_nofold-and-ip_fast_csum_nofold.patch
queue-4.9/powerpc-64-fix-checksum-folding-in-csum_add.patch
queue-4.9/powerpc-64-invalidate-process-table-caching-after-setting-process-table.patch
From: Wanpeng Li <wanpeng.li(a)hotmail.com>
------------[ cut here ]------------
Bad FPU state detected at kvm_put_guest_fpu+0xd8/0x2d0 [kvm], reinitializing FPU registers.
WARNING: CPU: 1 PID: 4594 at arch/x86/mm/extable.c:103 ex_handler_fprestore+0x88/0x90
CPU: 1 PID: 4594 Comm: qemu-system-x86 Tainted: G B OE 4.15.0-rc2+ #10
RIP: 0010:ex_handler_fprestore+0x88/0x90
Call Trace:
fixup_exception+0x4e/0x60
do_general_protection+0xff/0x270
general_protection+0x22/0x30
RIP: 0010:kvm_put_guest_fpu+0xd8/0x2d0 [kvm]
RSP: 0018:ffff8803d5627810 EFLAGS: 00010246
kvm_vcpu_reset+0x3b4/0x3c0 [kvm]
kvm_apic_accept_events+0x1c0/0x240 [kvm]
kvm_arch_vcpu_ioctl_run+0x1658/0x2fb0 [kvm]
kvm_vcpu_ioctl+0x479/0x880 [kvm]
do_vfs_ioctl+0x142/0x9a0
SyS_ioctl+0x74/0x80
do_syscall_64+0x15f/0x600
This can be reproduced by running any testcase in kvm-unit-tests since
the qemu userspace FPU context is not initialized, which results in the
init path from kvm_apic_accept_events() will load/put qemu userspace
FPU context w/o initialized. In addition, w/o this splatting we still
should initialize vcpu->arch.user_fpu instead of current->thread.fpu.
This patch fixes it by initializing qemu user FPU context if it is
uninitialized before KVM_RUN.
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Radim Krčmář <rkrcmar(a)redhat.com>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: stable(a)vger.kernel.org
Fixes: f775b13eedee (x86,kvm: move qemu/guest FPU switching out to vcpu_run)
Signed-off-by: Wanpeng Li <wanpeng.li(a)hotmail.com>
---
arch/x86/kvm/x86.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index a92b22f..063a643 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -7273,10 +7273,13 @@ static int complete_emulated_mmio(struct kvm_vcpu *vcpu)
int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *kvm_run)
{
- struct fpu *fpu = ¤t->thread.fpu;
+ struct fpu *fpu = &vcpu->arch.user_fpu;
int r;
- fpu__initialize(fpu);
+ if (!fpu->initialized) {
+ fpstate_init(&fpu->state);
+ fpu->initialized = 1;
+ }
kvm_sigset_activate(vcpu);
--
2.7.4
From: Vaibhav Jain <vaibhav(a)linux.vnet.ibm.com>
[ Upstream commit 07f5ab6002a4f0b633f3495157166f9f6180871b ]
Fix a boundary condition where in some cases an eeh event with state ==
pci_channel_io_perm_failure wont be passed on to a driver attached to
the virtual PCI device associated with a slice. This will happen in case
the slice just before (n-1) doesn't have any vPHB bus associated with
it, that results in an early return from cxl_pci_error_detected()
callback.
With state == pci_channel_io_perm_failure, the adapter will be removed
irrespective of the return value of cxl_vphb_error_detected(). So we now
always return PCI_ERS_RESULT_DISCONNECTED for this case i.e even if
the AFU isn't using a vPHB (currently returns PCI_ERS_RESULT_NONE).
Fixes: e4f5fc001a6("cxl: Do not create vPHB if there are no AFU configuration records")
Signed-off-by: Vaibhav Jain <vaibhav(a)linux.vnet.ibm.com>
Reviewed-by: Matthew R. Ochs <mrochs(a)linux.vnet.ibm.com>
Reviewed-by: Andrew Donnellan <andrew.donnellan(a)au1.ibm.com>
Acked-by: Frederic Barrat <fbarrat(a)linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Signed-off-by: Sasha Levin <alexander.levin(a)verizon.com>
---
drivers/misc/cxl/pci.c | 13 ++++++-------
1 file changed, 6 insertions(+), 7 deletions(-)
diff --git a/drivers/misc/cxl/pci.c b/drivers/misc/cxl/pci.c
index eef202d4399b..9b8628273b1b 100644
--- a/drivers/misc/cxl/pci.c
+++ b/drivers/misc/cxl/pci.c
@@ -1793,15 +1793,14 @@ static pci_ers_result_t cxl_pci_error_detected(struct pci_dev *pdev,
/* If we're permanently dead, give up. */
if (state == pci_channel_io_perm_failure) {
- /* Tell the AFU drivers; but we don't care what they
- * say, we're going away.
- */
for (i = 0; i < adapter->slices; i++) {
afu = adapter->afu[i];
- /* Only participate in EEH if we are on a virtual PHB */
- if (afu->phb == NULL)
- return PCI_ERS_RESULT_NONE;
- cxl_vphb_error_detected(afu, state);
+ /*
+ * Tell the AFU drivers; but we don't care what they
+ * say, we're going away.
+ */
+ if (afu->phb != NULL)
+ cxl_vphb_error_detected(afu, state);
}
return PCI_ERS_RESULT_DISCONNECT;
}
--
2.11.0
Fix child-node lookup during probe, which ended up searching the whole
device tree depth-first starting at the parent rather than just matching
on its children.
Note that the original premature free of the parent node has already
been fixed separately, but that fix was apparently never backported to
stable.
Fixes: 47654a162081 ("usb: chipidea: msm: Restore wrapper settings after reset")
Fixes: b74c43156c0c ("usb: chipidea: msm: ci_hdrc_msm_probe() missing of_node_get()")
Cc: stable <stable(a)vger.kernel.org> # 4.10: b74c43156c0c
Cc: Stephen Boyd <stephen.boyd(a)linaro.org>
Cc: Frank Rowand <frank.rowand(a)sony.com>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/usb/chipidea/ci_hdrc_msm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/chipidea/ci_hdrc_msm.c b/drivers/usb/chipidea/ci_hdrc_msm.c
index 3593ce0ec641..880009987460 100644
--- a/drivers/usb/chipidea/ci_hdrc_msm.c
+++ b/drivers/usb/chipidea/ci_hdrc_msm.c
@@ -247,7 +247,7 @@ static int ci_hdrc_msm_probe(struct platform_device *pdev)
if (ret)
goto err_mux;
- ulpi_node = of_find_node_by_name(of_node_get(pdev->dev.of_node), "ulpi");
+ ulpi_node = of_get_child_by_name(pdev->dev.of_node, "ulpi");
if (ulpi_node) {
phy_node = of_get_next_available_child(ulpi_node, NULL);
ci->hsic = of_device_is_compatible(phy_node, "qcom,usb-hsic-phy");
--
2.15.0
The patch titled
Subject: mm, oom_reaper: fix memory corruption
has been added to the -mm tree. Its filename is
mm-oom_reaper-fix-memory-corruption.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-oom_reaper-fix-memory-corruptio…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-oom_reaper-fix-memory-corruptio…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/SubmitChecklist when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Michal Hocko <mhocko(a)suse.com>
Subject: mm, oom_reaper: fix memory corruption
David Rientjes has reported the following memory corruption while the oom
reaper tries to unmap the victims address space
BUG: Bad page map in process oom_reaper pte:6353826300000000 pmd:00000000
addr:00007f50cab1d000 vm_flags:08100073 anon_vma:ffff9eea335603f0 mapping: (null) index:7f50cab1d
file: (null) fault: (null) mmap: (null) readpage: (null)
CPU: 2 PID: 1001 Comm: oom_reaper
Call Trace:
[<ffffffffa4bd967d>] dump_stack+0x4d/0x70
[<ffffffffa4a03558>] unmap_page_range+0x1068/0x1130
[<ffffffffa4a2e07f>] __oom_reap_task_mm+0xd5/0x16b
[<ffffffffa4a2e226>] oom_reaper+0xff/0x14c
[<ffffffffa48d6ad1>] kthread+0xc1/0xe0
Tetsuo Handa has noticed that the synchronization inside exit_mmap is
insufficient. We only synchronize with the oom reaper if
tsk_is_oom_victim which is not true if the final __mmput is called from a
different context than the oom victim exit path. This can trivially
happen from context of any task which has grabbed mm reference (e.g. to
read /proc/<pid>/ file which requires mm etc.). The race would look like
this
oom_reaper oom_victim task
mmget_not_zero
do_exit
mmput
__oom_reap_task_mm mmput
__mmput
exit_mmap
remove_vma
unmap_page_range
Fix this issue by providing a new mm_is_oom_victim() helper which operates
on the mm struct rather than a task. Any context which operates on a
remote mm struct should use this helper in place of tsk_is_oom_victim.
The flag is set in mark_oom_victim and never cleared so it is stable in
the exit_mmap path.
Debugged by Tetsuo Handa.
Link: http://lkml.kernel.org/r/20171210095130.17110-1-mhocko@kernel.org
Fixes: 212925802454 ("mm: oom: let oom_reap_task and exit_mmap run concurrently")
Signed-off-by: Michal Hocko <mhocko(a)suse.com>
Reported-by: David Rientjes <rientjes(a)google.com>
Acked-by: David Rientjes <rientjes(a)google.com>
Cc: Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
Cc: Andrea Argangeli <andrea(a)kernel.org>
Cc: <stable(a)vger.kernel.org> [4.14]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/oom.h | 9 +++++++++
include/linux/sched/coredump.h | 1 +
mm/mmap.c | 10 +++++-----
mm/oom_kill.c | 4 +++-
4 files changed, 18 insertions(+), 6 deletions(-)
diff -puN include/linux/oom.h~mm-oom_reaper-fix-memory-corruption include/linux/oom.h
--- a/include/linux/oom.h~mm-oom_reaper-fix-memory-corruption
+++ a/include/linux/oom.h
@@ -67,6 +67,15 @@ static inline bool tsk_is_oom_victim(str
}
/*
+ * Use this helper if tsk->mm != mm and the victim mm needs a special
+ * handling. This is guaranteed to stay true after once set.
+ */
+static inline bool mm_is_oom_victim(struct mm_struct *mm)
+{
+ return test_bit(MMF_OOM_VICTIM, &mm->flags);
+}
+
+/*
* Checks whether a page fault on the given mm is still reliable.
* This is no longer true if the oom reaper started to reap the
* address space which is reflected by MMF_UNSTABLE flag set in
diff -puN include/linux/sched/coredump.h~mm-oom_reaper-fix-memory-corruption include/linux/sched/coredump.h
--- a/include/linux/sched/coredump.h~mm-oom_reaper-fix-memory-corruption
+++ a/include/linux/sched/coredump.h
@@ -70,6 +70,7 @@ static inline int get_dumpable(struct mm
#define MMF_UNSTABLE 22 /* mm is unstable for copy_from_user */
#define MMF_HUGE_ZERO_PAGE 23 /* mm has ever used the global huge zero page */
#define MMF_DISABLE_THP 24 /* disable THP for all VMAs */
+#define MMF_OOM_VICTIM 25 /* mm is the oom victim */
#define MMF_DISABLE_THP_MASK (1 << MMF_DISABLE_THP)
#define MMF_INIT_MASK (MMF_DUMPABLE_MASK | MMF_DUMP_FILTER_MASK |\
diff -puN mm/mmap.c~mm-oom_reaper-fix-memory-corruption mm/mmap.c
--- a/mm/mmap.c~mm-oom_reaper-fix-memory-corruption
+++ a/mm/mmap.c
@@ -3019,20 +3019,20 @@ void exit_mmap(struct mm_struct *mm)
/* Use -1 here to ensure all VMAs in the mm are unmapped */
unmap_vmas(&tlb, vma, 0, -1);
- set_bit(MMF_OOM_SKIP, &mm->flags);
- if (unlikely(tsk_is_oom_victim(current))) {
+ if (unlikely(mm_is_oom_victim(mm))) {
/*
* Wait for oom_reap_task() to stop working on this
* mm. Because MMF_OOM_SKIP is already set before
* calling down_read(), oom_reap_task() will not run
* on this "mm" post up_write().
*
- * tsk_is_oom_victim() cannot be set from under us
- * either because current->mm is already set to NULL
+ * mm_is_oom_victim() cannot be set from under us
+ * either because victim->mm is already set to NULL
* under task_lock before calling mmput and oom_mm is
- * set not NULL by the OOM killer only if current->mm
+ * set not NULL by the OOM killer only if victim->mm
* is found not NULL while holding the task_lock.
*/
+ set_bit(MMF_OOM_SKIP, &mm->flags);
down_write(&mm->mmap_sem);
up_write(&mm->mmap_sem);
}
diff -puN mm/oom_kill.c~mm-oom_reaper-fix-memory-corruption mm/oom_kill.c
--- a/mm/oom_kill.c~mm-oom_reaper-fix-memory-corruption
+++ a/mm/oom_kill.c
@@ -683,8 +683,10 @@ static void mark_oom_victim(struct task_
return;
/* oom_mm is bound to the signal struct life time. */
- if (!cmpxchg(&tsk->signal->oom_mm, NULL, mm))
+ if (!cmpxchg(&tsk->signal->oom_mm, NULL, mm)) {
mmgrab(tsk->signal->oom_mm);
+ set_bit(MMF_OOM_VICTIM, &mm->flags);
+ }
/*
* Make sure that the task is woken up from uninterruptible sleep
_
Patches currently in -mm which might be from mhocko(a)suse.com are
mm-oom_reaper-fix-memory-corruption.patch
mm-drop-hotplug-lock-from-lru_add_drain_all.patch
mm-hugetlb-drop-hugepages_treat_as_movable-sysctl.patch
A verdict of NF_STOLEN after NF_QUEUE will cause an incorrect return value
and a potential kernel panic via double free of skb's
This was broken by commit 7034b566a4e7 ("netfilter: fix nf_queue handling")
and subsequently fixed in v4.10 by commit c63cbc460419 ("netfilter:
use switch() to handle verdict cases from nf_hook_slow()"). However that
commit cannot be cleanly cherry-picked to v4.9
Signed-off-by: Debabrata Banerjee <dbanerje(a)akamai.com>
---
This fix is only needed for v4.9 stable since v4.10+ does not have the
issue
---
net/netfilter/core.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/net/netfilter/core.c b/net/netfilter/core.c
index 004af030ef1a..d869ea50623e 100644
--- a/net/netfilter/core.c
+++ b/net/netfilter/core.c
@@ -364,6 +364,11 @@ int nf_hook_slow(struct sk_buff *skb, struct nf_hook_state *state)
ret = nf_queue(skb, state, &entry, verdict);
if (ret == 1 && entry)
goto next_hook;
+ } else {
+ /* Implicit handling for NF_STOLEN, as well as any other
+ * non conventional verdicts.
+ */
+ ret = 0;
}
return ret;
}
--
2.15.1
The patch titled
Subject: kernel: make groups_sort calling a responsibility group_info allocators
has been added to the -mm tree. Its filename is
kernel-make-groups_sort-calling-a-responsibility-group_info-allocators.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/kernel-make-groups_sort-calling-a-…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/kernel-make-groups_sort-calling-a-…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/SubmitChecklist when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Thiago Rafael Becker <thiago.becker(a)gmail.com>
Subject: kernel: make groups_sort calling a responsibility group_info allocators
In testing, we found that nfsd threads may call set_groups in parallel for
the same entry cached in auth.unix.gid, racing in the call of groups_sort,
corrupting the groups for that entry and leading to permission denials for
the client.
This patch:
- Make groups_sort globally visible.
- Move the call to groups_sort to the modifiers of group_info
- Remove the call to groups_sort from set_groups
Link: http://lkml.kernel.org/r/20171211151420.18655-1-thiago.becker@gmail.com
Signed-off-by: Thiago Rafael Becker <thiago.becker(a)gmail.com>
Reviewed-by: Matthew Wilcox <mawilcox(a)microsoft.com>
Reviewed-by: NeilBrown <neilb(a)suse.com>
Acked-by: "J. Bruce Fields" <bfields(a)fieldses.org>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: Martin Schwidefsky <schwidefsky(a)de.ibm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
arch/s390/kernel/compat_linux.c | 1 +
fs/nfsd/auth.c | 3 +++
include/linux/cred.h | 1 +
kernel/groups.c | 5 +++--
kernel/uid16.c | 1 +
net/sunrpc/auth_gss/gss_rpc_xdr.c | 1 +
net/sunrpc/auth_gss/svcauth_gss.c | 1 +
net/sunrpc/svcauth_unix.c | 2 ++
8 files changed, 13 insertions(+), 2 deletions(-)
diff -puN arch/s390/kernel/compat_linux.c~kernel-make-groups_sort-calling-a-responsibility-group_info-allocators arch/s390/kernel/compat_linux.c
--- a/arch/s390/kernel/compat_linux.c~kernel-make-groups_sort-calling-a-responsibility-group_info-allocators
+++ a/arch/s390/kernel/compat_linux.c
@@ -263,6 +263,7 @@ COMPAT_SYSCALL_DEFINE2(s390_setgroups16,
return retval;
}
+ groups_sort(group_info);
retval = set_current_groups(group_info);
put_group_info(group_info);
diff -puN fs/nfsd/auth.c~kernel-make-groups_sort-calling-a-responsibility-group_info-allocators fs/nfsd/auth.c
--- a/fs/nfsd/auth.c~kernel-make-groups_sort-calling-a-responsibility-group_info-allocators
+++ a/fs/nfsd/auth.c
@@ -60,6 +60,9 @@ int nfsd_setuser(struct svc_rqst *rqstp,
gi->gid[i] = exp->ex_anon_gid;
else
gi->gid[i] = rqgi->gid[i];
+
+ /* Each thread allocates its own gi, no race */
+ groups_sort(gi);
}
} else {
gi = get_group_info(rqgi);
diff -puN include/linux/cred.h~kernel-make-groups_sort-calling-a-responsibility-group_info-allocators include/linux/cred.h
--- a/include/linux/cred.h~kernel-make-groups_sort-calling-a-responsibility-group_info-allocators
+++ a/include/linux/cred.h
@@ -83,6 +83,7 @@ extern int set_current_groups(struct gro
extern void set_groups(struct cred *, struct group_info *);
extern int groups_search(const struct group_info *, kgid_t);
extern bool may_setgroups(void);
+extern void groups_sort(struct group_info *);
/*
* The security context of a task
diff -puN kernel/groups.c~kernel-make-groups_sort-calling-a-responsibility-group_info-allocators kernel/groups.c
--- a/kernel/groups.c~kernel-make-groups_sort-calling-a-responsibility-group_info-allocators
+++ a/kernel/groups.c
@@ -86,11 +86,12 @@ static int gid_cmp(const void *_a, const
return gid_gt(a, b) - gid_lt(a, b);
}
-static void groups_sort(struct group_info *group_info)
+void groups_sort(struct group_info *group_info)
{
sort(group_info->gid, group_info->ngroups, sizeof(*group_info->gid),
gid_cmp, NULL);
}
+EXPORT_SYMBOL(groups_sort);
/* a simple bsearch */
int groups_search(const struct group_info *group_info, kgid_t grp)
@@ -122,7 +123,6 @@ int groups_search(const struct group_inf
void set_groups(struct cred *new, struct group_info *group_info)
{
put_group_info(new->group_info);
- groups_sort(group_info);
get_group_info(group_info);
new->group_info = group_info;
}
@@ -206,6 +206,7 @@ SYSCALL_DEFINE2(setgroups, int, gidsetsi
return retval;
}
+ groups_sort(group_info);
retval = set_current_groups(group_info);
put_group_info(group_info);
diff -puN kernel/uid16.c~kernel-make-groups_sort-calling-a-responsibility-group_info-allocators kernel/uid16.c
--- a/kernel/uid16.c~kernel-make-groups_sort-calling-a-responsibility-group_info-allocators
+++ a/kernel/uid16.c
@@ -192,6 +192,7 @@ SYSCALL_DEFINE2(setgroups16, int, gidset
return retval;
}
+ groups_sort(group_info);
retval = set_current_groups(group_info);
put_group_info(group_info);
diff -puN net/sunrpc/auth_gss/gss_rpc_xdr.c~kernel-make-groups_sort-calling-a-responsibility-group_info-allocators net/sunrpc/auth_gss/gss_rpc_xdr.c
--- a/net/sunrpc/auth_gss/gss_rpc_xdr.c~kernel-make-groups_sort-calling-a-responsibility-group_info-allocators
+++ a/net/sunrpc/auth_gss/gss_rpc_xdr.c
@@ -231,6 +231,7 @@ static int gssx_dec_linux_creds(struct x
goto out_free_groups;
creds->cr_group_info->gid[i] = kgid;
}
+ groups_sort(creds->cr_group_info);
return 0;
out_free_groups:
diff -puN net/sunrpc/auth_gss/svcauth_gss.c~kernel-make-groups_sort-calling-a-responsibility-group_info-allocators net/sunrpc/auth_gss/svcauth_gss.c
--- a/net/sunrpc/auth_gss/svcauth_gss.c~kernel-make-groups_sort-calling-a-responsibility-group_info-allocators
+++ a/net/sunrpc/auth_gss/svcauth_gss.c
@@ -481,6 +481,7 @@ static int rsc_parse(struct cache_detail
goto out;
rsci.cred.cr_group_info->gid[i] = kgid;
}
+ groups_sort(rsci.cred.cr_group_info);
/* mech name */
len = qword_get(&mesg, buf, mlen);
diff -puN net/sunrpc/svcauth_unix.c~kernel-make-groups_sort-calling-a-responsibility-group_info-allocators net/sunrpc/svcauth_unix.c
--- a/net/sunrpc/svcauth_unix.c~kernel-make-groups_sort-calling-a-responsibility-group_info-allocators
+++ a/net/sunrpc/svcauth_unix.c
@@ -520,6 +520,7 @@ static int unix_gid_parse(struct cache_d
ug.gi->gid[i] = kgid;
}
+ groups_sort(ug.gi);
ugp = unix_gid_lookup(cd, uid);
if (ugp) {
struct cache_head *ch;
@@ -819,6 +820,7 @@ svcauth_unix_accept(struct svc_rqst *rqs
kgid_t kgid = make_kgid(&init_user_ns, svc_getnl(argv));
cred->cr_group_info->gid[i] = kgid;
}
+ groups_sort(cred->cr_group_info);
if (svc_getu32(argv) != htonl(RPC_AUTH_NULL) || svc_getu32(argv) != 0) {
*authp = rpc_autherr_badverf;
return SVC_DENIED;
_
Patches currently in -mm which might be from thiago.becker(a)gmail.com are
kernel-make-groups_sort-calling-a-responsibility-group_info-allocators.patch
Hi Eric,
A kernel bug report was opened against Ubuntu [0]. It was found that
reverting the following commit resolved this bug:
commit b2504a5dbef3305ef41988ad270b0e8ec289331c
Author: Eric Dumazet <edumazet(a)google.com>
Date: Tue Jan 31 10:20:32 2017 -0800
net: reduce skb_warn_bad_offload() noise
The regression was introduced as of v4.11-rc1 and still exists in
current mainline.
I was hoping to get your feedback, since you are the patch author. Do
you think gathering any additional data will help diagnose this issue,
or would it be best to submit a revert request?
This commit did in fact resolve another bug[1], but in the process
introduced this regression.
Thanks,
Joe
[0] http://pad.lv/1715609
[1] http://pad.lv/1705447
Hi,
This series corrects a number of issues with NT_PRFPREG regset, most
importantly an FCSR access API regression introduced with the addition of
MSA support, and then a few smaller issues with the get/set handlers.
I have decided to factor out non-MSA and MSA context helpers as the first
step to avoid the issue with excessive indentation that would inevitably
happen if the regression fix was applied to current code as it stands.
It shouldn't be a big deal with backporting as this code hasn't changed
much since the regression, and it will make any future bacports easier.
Only a call to `init_fp_ctx' will have to be trivially resolved (though
arguably commit ac9ad83bc318 ("MIPS: prevent FP context set via ptrace
being discarded"), which has added `init_fp_ctx', would be good to
backport as far as possible instead).
These changes have been verified by examining the register state recorded
in core dumps manually with GDB, as well as by running the GDB test suite.
No user of ptrace(2) PTRACE_GETREGSET and PTRACE_SETREGSET requests is
known for the MIPS port, so this part remains not covered, however it is
assumed to remain consistent with how the creation of core file works.
See individual patch descriptions for further details, and for changes
made since v1 to address concerns raised in the review.
Maciej
Hi,
On Tue, Dec 12, 2017 at 3:06 AM, Felipe Balbi <balbi(a)kernel.org> wrote:
>
> Hi,
>
> Douglas Anderson <dianders(a)chromium.org> writes:
>> On rk3288-veyron devices on Chrome OS it was found that plugging in an
>> Arduino-based USB device could cause the system to lockup, especially
>> if the CPU Frequency was at one of the slower operating points (like
>> 100 MHz / 200 MHz).
>>
>> Upon tracing, I found that the following was happening:
>> * The USB device (full speed) was connected to a high speed hub and
>> then to the rk3288. Thus, we were dealing with split transactions,
>> which is all handled in software on dwc2.
>> * Userspace was initiating a BULK IN transfer
>> * When we sent the SSPLIT (to start the split transaction), we got an
>> ACK. Good. Then we issued the CSPLIT.
>> * When we sent the CSPLIT, we got back a NAK. We immediately (from
>> the interrupt handler) started to retry and sent another SSPLIT.
>> * The device kept NAKing our CSPLIT, so we kept ping-ponging between
>> sending a SSPLIT and a CSPLIT, each time sending from the interrupt
>> handler.
>> * The handling of the interrupts was (because of the low CPU speed and
>> the inefficiency of the dwc2 interrupt handler) was actually taking
>> _longer_ than it took the other side to send the ACK/NAK. Thus we
>> were _always_ in the USB interrupt routine.
>> * The fact that USB interrupts were always going off was preventing
>> other things from happening in the system. This included preventing
>> the system from being able to transition to a higher CPU frequency.
>>
>> As I understand it, there is no requirement to retry super quickly
>> after a NAK, we just have to retry sometime in the future. Thus one
>> solution to the above is to just add a delay between getting a NAK and
>> retrying the transmission. If this delay is sufficiently long to get
>> out of the interrupt routine then the rest of the system will be able
>> to make forward progress. Even a 25 us delay would probably be
>> enough, but we'll be extra conservative and try to delay 1 ms (the
>> exact amount depends on HZ and the accuracy of the jiffy and how close
>> the current jiffy is to ticking, but could be as much as 20 ms or as
>> little as 1 ms).
>>
>> Presumably adding a delay like this could impact the USB throughput,
>> so we only add the delay with repeated NAKs.
>>
>> NOTE: Upon further testing of a pl2303 serial adapter, I found that
>> this fix may help with problems there. Specifically I found that the
>> pl2303 serial adapters tend to respond with a NAK when they have
>> nothing to say and thus we end with this same sequence.
>>
>> Signed-off-by: Douglas Anderson <dianders(a)chromium.org>
>> Cc: stable(a)vger.kernel.org
>> Reviewed-by: Julius Werner <jwerner(a)chromium.org>
>> Tested-by: Stefan Wahren <stefan.wahren(a)i2se.com>
>
> This seems too big for -rc or -stable inclusion.
I've removed the stable tag at your request. I originally added it at
your request in response to v2 of this patch. I'd agree that it's a
pretty big patch and therefore "risky" to pick back to stable. ...but
it does fix a bug reported by several people on the mailing lists, so
I'll leave it to your discretion. Previously in relation to the
stable tag, I had mentioned:
It's a little weird since it doesn't "fix" any specific
commit, so I guess it will be up to stable folks to decide how far to
go back. The dwc2 devices I work with are actually on 3.14, but we
have some pretty massive backports related to dwc2 there...
> In any case, this
> doesn't apply to my testing/next branch. Care to rebase and collect acks
> you received while doing that?
Sure, no problem. I've posted v4 with John Youn's Ack. The reason v3
didn't apply is that you've now got commit e99e88a9d2b0 ("treewide:
setup_timer() -> timer_setup()"). Originally my plan was to beat that
patch into the kernel and then I'd do the timer conversion myself.
That was patch #2 in the v3 series, AKA
<https://patchwork.kernel.org/patch/10032935/>. ...but since I failed
to beat Kees' patch in, I've now squashed patches #1 and #2 together
and resolved the trivial conflict.
If anyone were thinking of trying to backport this patch to older
kernels (where they presumably don't have Kees's timer patch) they can
always use the v3 patch posted here as a reference for how to make
things work. ;)
-Doug
Using PGDIR_SHIFT to identify espfix64 addresses on 5-level systems
was wrong, and it resulted in panics due to unhandled double faults.
Use P4D_SHIFT instead, which is correct on 4-level and 5-level
machines.
This fixes a panic when running x86 selftests on 5-level machines.
Fixes: 1d33b219563f ("x86/espfix: Add support for 5-level paging")
Cc: stable(a)vger.kernel.org
Cc: "Kirill A. Shutemov" <kirill(a)shutemov.name>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
---
arch/x86/kernel/traps.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index 74136fd16f49..c4e5b0a7516f 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -360,7 +360,7 @@ dotraplinkage void do_double_fault(struct pt_regs *regs, long error_code)
*
* No need for ist_enter here because we don't use RCU.
*/
- if (((long)regs->sp >> PGDIR_SHIFT) == ESPFIX_PGD_ENTRY &&
+ if (((long)regs->sp >> P4D_SHIFT) == ESPFIX_PGD_ENTRY &&
regs->cs == __KERNEL_CS &&
regs->ip == (unsigned long)native_irq_return_iret)
{
--
2.13.6
From: Marc Zyngier <marc.zyngier(a)arm.com>
Commit 5553b142be11e794ebc0805950b2e8313f93d718 upstream.
VTTBR_BADDR_MASK is used to sanity check the size and alignment of the
VTTBR address. It seems to currently be off by one, thereby only
allowing up to 39-bit addresses (instead of 40-bit) and also
insufficiently checking the alignment. This patch fixes it.
This patch is the 32bit pendent of Kristina's arm64 fix, and
she deserves the actual kudos for pinpointing that one.
Fixes: f7ed45be3ba52 ("KVM: ARM: World-switch implementation")
Cc: <stable(a)vger.kernel.org> # 3.9
Reported-by: Kristina Martsenko <kristina.martsenko(a)arm.com>
Reviewed-by: Christoffer Dall <christoffer.dall(a)linaro.org>
Signed-off-by: Marc Zyngier <marc.zyngier(a)arm.com>
Signed-off-by: Christoffer Dall <christoffer.dall(a)linaro.org>
---
arch/arm/include/asm/kvm_arm.h | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/arch/arm/include/asm/kvm_arm.h b/arch/arm/include/asm/kvm_arm.h
index 816db0bf2dd8..46b336df4ec1 100644
--- a/arch/arm/include/asm/kvm_arm.h
+++ b/arch/arm/include/asm/kvm_arm.h
@@ -161,8 +161,7 @@
#else
#define VTTBR_X (5 - KVM_T0SZ)
#endif
-#define VTTBR_BADDR_SHIFT (VTTBR_X - 1)
-#define VTTBR_BADDR_MASK (((1LLU << (40 - VTTBR_X)) - 1) << VTTBR_BADDR_SHIFT)
+#define VTTBR_BADDR_MASK (((1LLU << (40 - VTTBR_X)) - 1) << VTTBR_X)
#define VTTBR_VMID_SHIFT (48LLU)
#define VTTBR_VMID_MASK (0xffLLU << VTTBR_VMID_SHIFT)
--
2.14.2