This is a note to let you know that I've just added the patch titled
hrtimer: Reset hrtimer cpu base proper on CPU hotplug
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
hrtimer-reset-hrtimer-cpu-base-proper-on-cpu-hotplug.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From d5421ea43d30701e03cadc56a38854c36a8b4433 Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx(a)linutronix.de>
Date: Fri, 26 Jan 2018 14:54:32 +0100
Subject: hrtimer: Reset hrtimer cpu base proper on CPU hotplug
From: Thomas Gleixner <tglx(a)linutronix.de>
commit d5421ea43d30701e03cadc56a38854c36a8b4433 upstream.
The hrtimer interrupt code contains a hang detection and mitigation
mechanism, which prevents that a long delayed hrtimer interrupt causes a
continous retriggering of interrupts which prevent the system from making
progress. If a hang is detected then the timer hardware is programmed with
a certain delay into the future and a flag is set in the hrtimer cpu base
which prevents newly enqueued timers from reprogramming the timer hardware
prior to the chosen delay. The subsequent hrtimer interrupt after the delay
clears the flag and resumes normal operation.
If such a hang happens in the last hrtimer interrupt before a CPU is
unplugged then the hang_detected flag is set and stays that way when the
CPU is plugged in again. At that point the timer hardware is not armed and
it cannot be armed because the hang_detected flag is still active, so
nothing clears that flag. As a consequence the CPU does not receive hrtimer
interrupts and no timers expire on that CPU which results in RCU stalls and
other malfunctions.
Clear the flag along with some other less critical members of the hrtimer
cpu base to ensure starting from a clean state when a CPU is plugged in.
Thanks to Paul, Sebastian and Anna-Maria for their help to get down to the
root cause of that hard to reproduce heisenbug. Once understood it's
trivial and certainly justifies a brown paperbag.
Fixes: 41d2e4949377 ("hrtimer: Tune hrtimer_interrupt hang logic")
Reported-by: Paul E. McKenney <paulmck(a)linux.vnet.ibm.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Sebastian Sewior <bigeasy(a)linutronix.de>
Cc: Anna-Maria Gleixner <anna-maria(a)linutronix.de>
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801261447590.2067@nanos
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
kernel/time/hrtimer.c | 3 +++
1 file changed, 3 insertions(+)
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -669,7 +669,9 @@ static void hrtimer_reprogram(struct hrt
static inline void hrtimer_init_hres(struct hrtimer_cpu_base *base)
{
base->expires_next.tv64 = KTIME_MAX;
+ base->hang_detected = 0;
base->hres_active = 0;
+ base->next_timer = NULL;
}
/*
@@ -1615,6 +1617,7 @@ static void init_hrtimers_cpu(int cpu)
timerqueue_init_head(&cpu_base->clock_base[i].active);
}
+ cpu_base->active_bases = 0;
cpu_base->cpu = cpu;
hrtimer_init_hres(cpu_base);
}
Patches currently in stable-queue which might be from tglx(a)linutronix.de are
queue-4.4/prevent-timer-value-0-for-mwaitx.patch
queue-4.4/x86-ioapic-fix-incorrect-pointers-in-ioapic_setup_resources.patch
queue-4.4/x86-asm-32-make-sync_core-handle-missing-cpuid-on-all-32-bit-kernels.patch
queue-4.4/timers-plug-locking-race-vs.-timer-migration.patch
queue-4.4/x86-cpu-intel-introduce-macros-for-intel-family-numbers.patch
queue-4.4/x86-retpoline-fill-rsb-on-context-switch-for-affected-cpus.patch
queue-4.4/revert-module-add-retpoline-tag-to-vermagic.patch
queue-4.4/x86-microcode-intel-extend-bdw-late-loading-further-with-llc-size-check.patch
queue-4.4/time-avoid-undefined-behaviour-in-ktime_add_safe.patch
queue-4.4/sched-deadline-use-the-revised-wakeup-rule-for-suspending-constrained-dl-tasks.patch
queue-4.4/hrtimer-reset-hrtimer-cpu-base-proper-on-cpu-hotplug.patch
This is a note to let you know that I've just added the patch titled
x86/mm/64: Fix vmapped stack syncing on very-large-memory 4-level systems
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-mm-64-fix-vmapped-stack-syncing-on-very-large-memory-4-level-systems.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 5beda7d54eafece4c974cfa9fbb9f60fb18fd20a Mon Sep 17 00:00:00 2001
From: Andy Lutomirski <luto(a)kernel.org>
Date: Thu, 25 Jan 2018 13:12:14 -0800
Subject: x86/mm/64: Fix vmapped stack syncing on very-large-memory 4-level systems
From: Andy Lutomirski <luto(a)kernel.org>
commit 5beda7d54eafece4c974cfa9fbb9f60fb18fd20a upstream.
Neil Berrington reported a double-fault on a VM with 768GB of RAM that uses
large amounts of vmalloc space with PTI enabled.
The cause is that load_new_mm_cr3() was never fixed to take the 5-level pgd
folding code into account, so, on a 4-level kernel, the pgd synchronization
logic compiles away to exactly nothing.
Interestingly, the problem doesn't trigger with nopti. I assume this is
because the kernel is mapped with global pages if we boot with nopti. The
sequence of operations when we create a new task is that we first load its
mm while still running on the old stack (which crashes if the old stack is
unmapped in the new mm unless the TLB saves us), then we call
prepare_switch_to(), and then we switch to the new stack.
prepare_switch_to() pokes the new stack directly, which will populate the
mapping through vmalloc_fault(). I assume that we're getting lucky on
non-PTI systems -- the old stack's TLB entry stays alive long enough to
make it all the way through prepare_switch_to() and switch_to() so that we
make it to a valid stack.
Fixes: b50858ce3e2a ("x86/mm/vmalloc: Add 5-level paging support")
Reported-and-tested-by: Neil Berrington <neil.berrington(a)datacore.com>
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Konstantin Khlebnikov <khlebnikov(a)yandex-team.ru>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Link: https://lkml.kernel.org/r/346541c56caed61abbe693d7d2742b4a380c5001.15169145…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/mm/tlb.c | 34 +++++++++++++++++++++++++++++-----
1 file changed, 29 insertions(+), 5 deletions(-)
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -151,6 +151,34 @@ void switch_mm(struct mm_struct *prev, s
local_irq_restore(flags);
}
+static void sync_current_stack_to_mm(struct mm_struct *mm)
+{
+ unsigned long sp = current_stack_pointer;
+ pgd_t *pgd = pgd_offset(mm, sp);
+
+ if (CONFIG_PGTABLE_LEVELS > 4) {
+ if (unlikely(pgd_none(*pgd))) {
+ pgd_t *pgd_ref = pgd_offset_k(sp);
+
+ set_pgd(pgd, *pgd_ref);
+ }
+ } else {
+ /*
+ * "pgd" is faked. The top level entries are "p4d"s, so sync
+ * the p4d. This compiles to approximately the same code as
+ * the 5-level case.
+ */
+ p4d_t *p4d = p4d_offset(pgd, sp);
+
+ if (unlikely(p4d_none(*p4d))) {
+ pgd_t *pgd_ref = pgd_offset_k(sp);
+ p4d_t *p4d_ref = p4d_offset(pgd_ref, sp);
+
+ set_p4d(p4d, *p4d_ref);
+ }
+ }
+}
+
void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
struct task_struct *tsk)
{
@@ -226,11 +254,7 @@ void switch_mm_irqs_off(struct mm_struct
* mapped in the new pgd, we'll double-fault. Forcibly
* map it.
*/
- unsigned int index = pgd_index(current_stack_pointer);
- pgd_t *pgd = next->pgd + index;
-
- if (unlikely(pgd_none(*pgd)))
- set_pgd(pgd, init_mm.pgd[index]);
+ sync_current_stack_to_mm(next);
}
/* Stop remote flushes for the previous mm */
Patches currently in stable-queue which might be from luto(a)kernel.org are
queue-4.14/x86-mm-64-fix-vmapped-stack-syncing-on-very-large-memory-4-level-systems.patch
This is a note to let you know that I've just added the patch titled
x86/microcode/intel: Extend BDW late-loading further with LLC size check
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-microcode-intel-extend-bdw-late-loading-further-with-llc-size-check.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 7e702d17ed138cf4ae7c00e8c00681ed464587c7 Mon Sep 17 00:00:00 2001
From: Jia Zhang <zhang.jia(a)linux.alibaba.com>
Date: Tue, 23 Jan 2018 11:41:32 +0100
Subject: x86/microcode/intel: Extend BDW late-loading further with LLC size check
From: Jia Zhang <zhang.jia(a)linux.alibaba.com>
commit 7e702d17ed138cf4ae7c00e8c00681ed464587c7 upstream.
Commit b94b73733171 ("x86/microcode/intel: Extend BDW late-loading with a
revision check") reduced the impact of erratum BDF90 for Broadwell model
79.
The impact can be reduced further by checking the size of the last level
cache portion per core.
Tony: "The erratum says the problem only occurs on the large-cache SKUs.
So we only need to avoid the update if we are on a big cache SKU that is
also running old microcode."
For more details, see erratum BDF90 in document #334165 (Intel Xeon
Processor E7-8800/4800 v4 Product Family Specification Update) from
September 2017.
Fixes: b94b73733171 ("x86/microcode/intel: Extend BDW late-loading with a revision check")
Signed-off-by: Jia Zhang <zhang.jia(a)linux.alibaba.com>
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Acked-by: Tony Luck <tony.luck(a)intel.com>
Link: https://lkml.kernel.org/r/1516321542-31161-1-git-send-email-zhang.jia@linux…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kernel/cpu/microcode/intel.c | 20 ++++++++++++++++++--
1 file changed, 18 insertions(+), 2 deletions(-)
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -45,6 +45,9 @@ static const char ucode_path[] = "kernel
/* Current microcode patch used in early patching on the APs. */
static struct microcode_intel *intel_ucode_patch;
+/* last level cache size per core */
+static int llc_size_per_core;
+
static inline bool cpu_signatures_match(unsigned int s1, unsigned int p1,
unsigned int s2, unsigned int p2)
{
@@ -912,12 +915,14 @@ static bool is_blacklisted(unsigned int
/*
* Late loading on model 79 with microcode revision less than 0x0b000021
- * may result in a system hang. This behavior is documented in item
- * BDF90, #334165 (Intel Xeon Processor E7-8800/4800 v4 Product Family).
+ * and LLC size per core bigger than 2.5MB may result in a system hang.
+ * This behavior is documented in item BDF90, #334165 (Intel Xeon
+ * Processor E7-8800/4800 v4 Product Family).
*/
if (c->x86 == 6 &&
c->x86_model == INTEL_FAM6_BROADWELL_X &&
c->x86_mask == 0x01 &&
+ llc_size_per_core > 2621440 &&
c->microcode < 0x0b000021) {
pr_err_once("Erratum BDF90: late loading with revision < 0x0b000021 (0x%x) disabled.\n", c->microcode);
pr_err_once("Please consider either early loading through initrd/built-in or a potential BIOS update.\n");
@@ -975,6 +980,15 @@ static struct microcode_ops microcode_in
.apply_microcode = apply_microcode_intel,
};
+static int __init calc_llc_size_per_core(struct cpuinfo_x86 *c)
+{
+ u64 llc_size = c->x86_cache_size * 1024;
+
+ do_div(llc_size, c->x86_max_cores);
+
+ return (int)llc_size;
+}
+
struct microcode_ops * __init init_intel_microcode(void)
{
struct cpuinfo_x86 *c = &boot_cpu_data;
@@ -985,5 +999,7 @@ struct microcode_ops * __init init_intel
return NULL;
}
+ llc_size_per_core = calc_llc_size_per_core(c);
+
return µcode_intel_ops;
}
Patches currently in stable-queue which might be from zhang.jia(a)linux.alibaba.com are
queue-4.14/x86-microcode-intel-extend-bdw-late-loading-further-with-llc-size-check.patch
This is a note to let you know that I've just added the patch titled
x86/microcode: Fix again accessing initrd after having been freed
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-microcode-fix-again-accessing-initrd-after-having-been-freed.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 1d080f096fe33f031d26e19b3ef0146f66b8b0f1 Mon Sep 17 00:00:00 2001
From: Borislav Petkov <bp(a)suse.de>
Date: Tue, 23 Jan 2018 11:41:33 +0100
Subject: x86/microcode: Fix again accessing initrd after having been freed
From: Borislav Petkov <bp(a)suse.de>
commit 1d080f096fe33f031d26e19b3ef0146f66b8b0f1 upstream.
Commit 24c2503255d3 ("x86/microcode: Do not access the initrd after it has
been freed") fixed attempts to access initrd from the microcode loader
after it has been freed. However, a similar KASAN warning was reported
(stack trace edited):
smpboot: Booting Node 0 Processor 1 APIC 0x11
==================================================================
BUG: KASAN: use-after-free in find_cpio_data+0x9b5/0xa50
Read of size 1 at addr ffff880035ffd000 by task swapper/1/0
CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.14.8-slack #7
Hardware name: System manufacturer System Product Name/A88X-PLUS, BIOS 3003 03/10/2016
Call Trace:
dump_stack
print_address_description
kasan_report
? find_cpio_data
__asan_report_load1_noabort
find_cpio_data
find_microcode_in_initrd
__load_ucode_amd
load_ucode_amd_ap
load_ucode_ap
After some investigation, it turned out that a merge was done using the
wrong side to resolve, leading to picking up the previous state, before
the 24c2503255d3 fix. Therefore the Fixes tag below contains a merge
commit.
Revert the mismerge by catching the save_microcode_in_initrd_amd()
retval and thus letting the function exit with the last return statement
so that initrd_gone can be set to true.
Fixes: f26483eaedec ("Merge branch 'x86/urgent' into x86/microcode, to resolve conflicts")
Reported-by: <higuita(a)gmx.net>
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Link: https://bugzilla.kernel.org/show_bug.cgi?id=198295
Link: https://lkml.kernel.org/r/20180123104133.918-2-bp@alien8.de
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kernel/cpu/microcode/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kernel/cpu/microcode/core.c
+++ b/arch/x86/kernel/cpu/microcode/core.c
@@ -239,7 +239,7 @@ static int __init save_microcode_in_init
break;
case X86_VENDOR_AMD:
if (c->x86 >= 0x10)
- return save_microcode_in_initrd_amd(cpuid_eax(1));
+ ret = save_microcode_in_initrd_amd(cpuid_eax(1));
break;
default:
break;
Patches currently in stable-queue which might be from bp(a)suse.de are
queue-4.14/x86-microcode-intel-extend-bdw-late-loading-further-with-llc-size-check.patch
queue-4.14/x86-microcode-fix-again-accessing-initrd-after-having-been-freed.patch
This is a note to let you know that I've just added the patch titled
perf/x86/amd/power: Do not load AMD power module on !AMD platforms
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
perf-x86-amd-power-do-not-load-amd-power-module-on-amd-platforms.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 40d4071ce2d20840d224b4a77b5dc6f752c9ab15 Mon Sep 17 00:00:00 2001
From: Xiao Liang <xiliang(a)redhat.com>
Date: Mon, 22 Jan 2018 14:12:52 +0800
Subject: perf/x86/amd/power: Do not load AMD power module on !AMD platforms
From: Xiao Liang <xiliang(a)redhat.com>
commit 40d4071ce2d20840d224b4a77b5dc6f752c9ab15 upstream.
The AMD power module can be loaded on non AMD platforms, but unload fails
with the following Oops:
BUG: unable to handle kernel NULL pointer dereference at (null)
IP: __list_del_entry_valid+0x29/0x90
Call Trace:
perf_pmu_unregister+0x25/0xf0
amd_power_pmu_exit+0x1c/0xd23 [power]
SyS_delete_module+0x1a8/0x2b0
? exit_to_usermode_loop+0x8f/0xb0
entry_SYSCALL_64_fastpath+0x20/0x83
Return -ENODEV instead of 0 from the module init function if the CPU does
not match.
Fixes: c7ab62bfbe0e ("perf/x86/amd/power: Add AMD accumulated power reporting mechanism")
Signed-off-by: Xiao Liang <xiliang(a)redhat.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Link: https://lkml.kernel.org/r/20180122061252.6394-1-xiliang@redhat.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/events/amd/power.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/events/amd/power.c
+++ b/arch/x86/events/amd/power.c
@@ -277,7 +277,7 @@ static int __init amd_power_pmu_init(voi
int ret;
if (!x86_match_cpu(cpu_match))
- return 0;
+ return -ENODEV;
if (!boot_cpu_has(X86_FEATURE_ACC_POWER))
return -ENODEV;
Patches currently in stable-queue which might be from xiliang(a)redhat.com are
queue-4.14/perf-x86-amd-power-do-not-load-amd-power-module-on-amd-platforms.patch
This is a note to let you know that I've just added the patch titled
hrtimer: Reset hrtimer cpu base proper on CPU hotplug
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
hrtimer-reset-hrtimer-cpu-base-proper-on-cpu-hotplug.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From d5421ea43d30701e03cadc56a38854c36a8b4433 Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx(a)linutronix.de>
Date: Fri, 26 Jan 2018 14:54:32 +0100
Subject: hrtimer: Reset hrtimer cpu base proper on CPU hotplug
From: Thomas Gleixner <tglx(a)linutronix.de>
commit d5421ea43d30701e03cadc56a38854c36a8b4433 upstream.
The hrtimer interrupt code contains a hang detection and mitigation
mechanism, which prevents that a long delayed hrtimer interrupt causes a
continous retriggering of interrupts which prevent the system from making
progress. If a hang is detected then the timer hardware is programmed with
a certain delay into the future and a flag is set in the hrtimer cpu base
which prevents newly enqueued timers from reprogramming the timer hardware
prior to the chosen delay. The subsequent hrtimer interrupt after the delay
clears the flag and resumes normal operation.
If such a hang happens in the last hrtimer interrupt before a CPU is
unplugged then the hang_detected flag is set and stays that way when the
CPU is plugged in again. At that point the timer hardware is not armed and
it cannot be armed because the hang_detected flag is still active, so
nothing clears that flag. As a consequence the CPU does not receive hrtimer
interrupts and no timers expire on that CPU which results in RCU stalls and
other malfunctions.
Clear the flag along with some other less critical members of the hrtimer
cpu base to ensure starting from a clean state when a CPU is plugged in.
Thanks to Paul, Sebastian and Anna-Maria for their help to get down to the
root cause of that hard to reproduce heisenbug. Once understood it's
trivial and certainly justifies a brown paperbag.
Fixes: 41d2e4949377 ("hrtimer: Tune hrtimer_interrupt hang logic")
Reported-by: Paul E. McKenney <paulmck(a)linux.vnet.ibm.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Sebastian Sewior <bigeasy(a)linutronix.de>
Cc: Anna-Maria Gleixner <anna-maria(a)linutronix.de>
Link: https://lkml.kernel.org/r/alpine.DEB.2.20.1801261447590.2067@nanos
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
kernel/time/hrtimer.c | 3 +++
1 file changed, 3 insertions(+)
--- a/kernel/time/hrtimer.c
+++ b/kernel/time/hrtimer.c
@@ -655,7 +655,9 @@ static void hrtimer_reprogram(struct hrt
static inline void hrtimer_init_hres(struct hrtimer_cpu_base *base)
{
base->expires_next = KTIME_MAX;
+ base->hang_detected = 0;
base->hres_active = 0;
+ base->next_timer = NULL;
}
/*
@@ -1591,6 +1593,7 @@ int hrtimers_prepare_cpu(unsigned int cp
timerqueue_init_head(&cpu_base->clock_base[i].active);
}
+ cpu_base->active_bases = 0;
cpu_base->cpu = cpu;
hrtimer_init_hres(cpu_base);
return 0;
Patches currently in stable-queue which might be from tglx(a)linutronix.de are
queue-4.14/perf-x86-amd-power-do-not-load-amd-power-module-on-amd-platforms.patch
queue-4.14/x86-mm-64-fix-vmapped-stack-syncing-on-very-large-memory-4-level-systems.patch
queue-4.14/revert-module-add-retpoline-tag-to-vermagic.patch
queue-4.14/x86-microcode-intel-extend-bdw-late-loading-further-with-llc-size-check.patch
queue-4.14/hrtimer-reset-hrtimer-cpu-base-proper-on-cpu-hotplug.patch
queue-4.14/x86-microcode-fix-again-accessing-initrd-after-having-been-freed.patch
This is a note to let you know that I've just added the patch titled
x86/microcode/intel: Extend BDW late-loading further with LLC size check
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-microcode-intel-extend-bdw-late-loading-further-with-llc-size-check.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 7e702d17ed138cf4ae7c00e8c00681ed464587c7 Mon Sep 17 00:00:00 2001
From: Jia Zhang <zhang.jia(a)linux.alibaba.com>
Date: Tue, 23 Jan 2018 11:41:32 +0100
Subject: x86/microcode/intel: Extend BDW late-loading further with LLC size check
From: Jia Zhang <zhang.jia(a)linux.alibaba.com>
commit 7e702d17ed138cf4ae7c00e8c00681ed464587c7 upstream.
Commit b94b73733171 ("x86/microcode/intel: Extend BDW late-loading with a
revision check") reduced the impact of erratum BDF90 for Broadwell model
79.
The impact can be reduced further by checking the size of the last level
cache portion per core.
Tony: "The erratum says the problem only occurs on the large-cache SKUs.
So we only need to avoid the update if we are on a big cache SKU that is
also running old microcode."
For more details, see erratum BDF90 in document #334165 (Intel Xeon
Processor E7-8800/4800 v4 Product Family Specification Update) from
September 2017.
Fixes: b94b73733171 ("x86/microcode/intel: Extend BDW late-loading with a revision check")
Signed-off-by: Jia Zhang <zhang.jia(a)linux.alibaba.com>
Signed-off-by: Borislav Petkov <bp(a)suse.de>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Acked-by: Tony Luck <tony.luck(a)intel.com>
Link: https://lkml.kernel.org/r/1516321542-31161-1-git-send-email-zhang.jia@linux…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kernel/cpu/microcode/intel.c | 20 ++++++++++++++++++--
1 file changed, 18 insertions(+), 2 deletions(-)
--- a/arch/x86/kernel/cpu/microcode/intel.c
+++ b/arch/x86/kernel/cpu/microcode/intel.c
@@ -87,6 +87,9 @@ MODULE_DESCRIPTION("Microcode Update Dri
MODULE_AUTHOR("Tigran Aivazian <tigran(a)aivazian.fsnet.co.uk>");
MODULE_LICENSE("GPL");
+/* last level cache size per core */
+static int llc_size_per_core;
+
static int collect_cpu_info(int cpu_num, struct cpu_signature *csig)
{
struct cpuinfo_x86 *c = &cpu_data(cpu_num);
@@ -273,12 +276,14 @@ static bool is_blacklisted(unsigned int
/*
* Late loading on model 79 with microcode revision less than 0x0b000021
- * may result in a system hang. This behavior is documented in item
- * BDF90, #334165 (Intel Xeon Processor E7-8800/4800 v4 Product Family).
+ * and LLC size per core bigger than 2.5MB may result in a system hang.
+ * This behavior is documented in item BDF90, #334165 (Intel Xeon
+ * Processor E7-8800/4800 v4 Product Family).
*/
if (c->x86 == 6 &&
c->x86_model == 79 &&
c->x86_mask == 0x01 &&
+ llc_size_per_core > 2621440 &&
c->microcode < 0x0b000021) {
pr_err_once("Erratum BDF90: late loading with revision < 0x0b000021 (0x%x) disabled.\n", c->microcode);
pr_err_once("Please consider either early loading through initrd/built-in or a potential BIOS update.\n");
@@ -345,6 +350,15 @@ static struct microcode_ops microcode_in
.microcode_fini_cpu = microcode_fini_cpu,
};
+static int __init calc_llc_size_per_core(struct cpuinfo_x86 *c)
+{
+ u64 llc_size = c->x86_cache_size * 1024;
+
+ do_div(llc_size, c->x86_max_cores);
+
+ return (int)llc_size;
+}
+
struct microcode_ops * __init init_intel_microcode(void)
{
struct cpuinfo_x86 *c = &cpu_data(0);
@@ -355,6 +369,8 @@ struct microcode_ops * __init init_intel
return NULL;
}
+ llc_size_per_core = calc_llc_size_per_core(c);
+
return µcode_intel_ops;
}
Patches currently in stable-queue which might be from zhang.jia(a)linux.alibaba.com are
queue-3.18/x86-microcode-intel-extend-bdw-late-loading-further-with-llc-size-check.patch
On Tue, 2016-08-02 at 01:50:16 UTC, Stewart Smith wrote:
> According to the OPAL docs:
> https://github.com/open-power/skiboot/blob/skiboot-5.2.5/doc/opal-api/opal-…
> https://github.com/open-power/skiboot/blob/skiboot-5.2.5/doc/opal-api/opal-…
> OPAL_HARDWARE may be returned from OPAL_RTC_READ or OPAL_RTC_WRITE and this
> indicates either a transient or permanent error.
>
> Prior to this patch, Linux was not dealing with OPAL_HARDWARE being a
> permanent error particularly well, in that you could end up in a busy
> loop.
>
> This was not too hard to trigger on an AMI BMC based OpenPOWER machine
> doing a continuous "ipmitool mc reset cold" to the BMC, the result of
> that being that we'd get stuck in an infinite loop in opal_get_rtc_time.
>
> We now retry a few times before returning the error higher up the stack.
>
> Cc: stable(a)vger.kernel.org
> Signed-off-by: Stewart Smith <stewart(a)linux.vnet.ibm.com>
Applied to powerpc next, thanks.
https://git.kernel.org/powerpc/c/5b8b58063029f02da573120ef4dc90
cheers
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256
Hi Greg,
Pleae pull commits for Linux 3.18 .
I've sent a review request for all commits over a week ago and all
comments were addressed.
Thanks,
Sasha
=====
The following changes since commit a5d35deca214e095bf9d1745aa6c00dd7ced0517:
Linux 3.18.92 (2018-01-17 09:29:32 +0100)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/sashal/linux-stable.git tags/for-greg-3.18-28012018
for you to fetch changes up to df300edb5fa24f11a40cd7278fd03befa197595c:
staging: rtl8188eu: Fix incorrect response to SIOCGIWESSID (2018-01-20 20:06:55 -0500)
- ----------------------------------------------------------------
for-greg-3.18-28012018
- ----------------------------------------------------------------
Andrew Elble (1):
nfsd: check for use of the closed special stateid
Chun-Yeow Yeoh (1):
mac80211: fix the update of path metric for RANN frame
Colin Ian King (1):
usb: gadget: don't dereference g until after it has been null checked
Eduardo Otubo (1):
xen-netfront: remove warning when unloading module
Geert Uytterhoeven (1):
net: ethernet: xilinx: Mark XILINX_LL_TEMAC broken on 64-bit
Gustavo A. R. Silva (1):
scsi: ufs: ufshcd: fix potential NULL pointer dereference in ufshcd_config_vreg
Icenowy Zheng (1):
media: usbtv: add a new usbid
Larry Finger (1):
staging: rtl8188eu: Fix incorrect response to SIOCGIWESSID
Liran Alon (2):
KVM: x86: emulator: Return to user-mode on L1 CPL=0 emulation failure
KVM: x86: Don't re-execute instruction when not passing CR2 value
Michael Lyle (1):
bcache: check return value of register_shrinker
Robert Lippert (1):
hwmon: (pmbus) Use 64bit math for DIRECT format values
Tetsuo Handa (1):
quota: Check for register_shrinker() failure.
Trond Myklebust (1):
nfsd: CLOSE SHOULD return the invalid special stateid for NFSv4.x (x>0)
Wanpeng Li (2):
KVM: X86: Fix operand/address-size during instruction decoding
KVM: VMX: Fix rflags cache during vCPU reset
arch/x86/include/asm/kvm_host.h | 3 ++-
arch/x86/kvm/emulate.c | 7 +++++++
arch/x86/kvm/vmx.c | 4 ++--
arch/x86/kvm/x86.c | 2 +-
drivers/hwmon/pmbus/pmbus_core.c | 21 ++++++++++++---------
drivers/md/bcache/btree.c | 5 ++++-
drivers/media/usb/usbtv/usbtv-core.c | 1 +
drivers/net/ethernet/xilinx/Kconfig | 1 +
drivers/net/xen-netfront.c | 18 ++++++++++++++++++
drivers/scsi/ufs/ufshcd.c | 7 +++++--
drivers/staging/rtl8188eu/os_dep/ioctl_linux.c | 14 ++++----------
drivers/usb/gadget/composite.c | 7 +++++--
fs/nfsd/nfs4state.c | 15 +++++++++++++--
fs/quota/dquot.c | 3 ++-
net/mac80211/mesh_hwmp.c | 15 +++++++++------
15 files changed, 86 insertions(+), 37 deletions(-)
-----BEGIN PGP SIGNATURE-----
iQIcBAEBCAAGBQJabk+JAAoJEN6mb/eXdyzcsM4P/jdYg7OMJOze9ufNRFj2GyPM
Exw5hdnLFQXd6UTuM2qL08mTHmM4R8c7FGic9AQL/2hJHQE1k62BJL5HcHOd9uqg
IKWdzP1tDFenSsOpdhbUiI0lqIDRUbNBPLw/8nfQpWKPeaXkiqK7cExgEzMHY+bk
eQgWNTttm+1U8h6UgXjMf86XR7+CLO7X1ph+Nh6rticfEVLnVeUUKTZ+im2J59YU
dQHyav2C0d8QI2FOs6P3uh9y5KcdYP85ro1BI1s4Q9M0MaJbwFgTDajOFQPuiAgV
WciUCphNbZsDYHBYttvekVOynfHujuQp6HsqhDfxspy/Dwsz1ppxYP20Y9d3sfwv
g5mYY3JdxGJiQ9UoijlSMxnsvcu7epg4aBy6S4Hsh6Ojsw2HJyYxHYqBQJFed/oz
1dZDCWhLAHEkNaOrKzxZJU9T14VhAD2QoZSTf3KI0g8KxMpgQVv+KbcvO9nUAJPz
wJP5xWObjVHCVLR6xh7E/zKroR1CNRzGmh5XTV0JPjrAQvm6Fs+etVsfTGp19yZ6
B4MGCltIVf96OdLoFvkbod3n058U3ZFI3fqnsxNd37U46vS1k1nZp6BKEG6VU50R
AmzRaZUkx/oBdRUP+bHg4Md8zq8yI911q+A+bTkCY+krAcVNmRopGPB24Mg2bBpF
bEthrC6b7HCRfHJD+2C5
=2nGa
-----END PGP SIGNATURE-----
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256
Hi Greg,
Pleae pull commits for Linux 4.4 .
I've sent a review request for all commits over a week ago and all
comments were addressed.
Thanks,
Sasha
=====
The following changes since commit 42375c1120d5c90d7469ba264fb124f728b1a4f7:
Linux 4.4.112 (2018-01-17 09:35:33 +0100)
are available in the git repository at:
git://git.kernel.org/pub/scm/linux/kernel/git/sashal/linux-stable.git tags/for-greg-4.4-28012018
for you to fetch changes up to cfeb59f68fea0fddcb4b26c839cc098247814c88:
staging: rtl8188eu: Fix incorrect response to SIOCGIWESSID (2018-01-28 13:15:38 -0500)
- ----------------------------------------------------------------
for-greg-4.4-28012018
- ----------------------------------------------------------------
Andrew Elble (1):
nfsd: check for use of the closed special stateid
Christophe JAILLET (1):
drm/omap: Fix error handling path in 'omap_dmm_probe()'
Chun-Yeow Yeoh (1):
mac80211: fix the update of path metric for RANN frame
Colin Ian King (1):
usb: gadget: don't dereference g until after it has been null checked
Darrick J. Wong (1):
xfs: ubsan fixes
Eduardo Otubo (1):
xen-netfront: remove warning when unloading module
Felix Kuehling (2):
drm/amdgpu: Fix SDMA load/unload sequence on HWS disabled mode
drm/amdkfd: Fix SDMA oversubsription handling
Geert Uytterhoeven (1):
net: ethernet: xilinx: Mark XILINX_LL_TEMAC broken on 64-bit
Guilherme G. Piccoli (1):
scsi: aacraid: Prevent crash in case of free interrupt during scsi EH path
Gustavo A. R. Silva (1):
scsi: ufs: ufshcd: fix potential NULL pointer dereference in ufshcd_config_vreg
Hans de Goede (1):
ACPI / bus: Leave modalias empty for devices which are not present
Icenowy Zheng (1):
media: usbtv: add a new usbid
James Hogan (1):
cpufreq: Add Loongson machine dependencies
Josef Bacik (1):
btrfs: fix deadlock when writing out space cache
Larry Finger (1):
staging: rtl8188eu: Fix incorrect response to SIOCGIWESSID
Liran Alon (2):
KVM: x86: emulator: Return to user-mode on L1 CPL=0 emulation failure
KVM: x86: Don't re-execute instruction when not passing CR2 value
Michael Lyle (1):
bcache: check return value of register_shrinker
Nikita Leshenko (3):
KVM: x86: ioapic: Fix level-triggered EOI and IOAPIC reconfigure race
KVM: x86: ioapic: Clear Remote IRR when entry is switched to edge-triggered
KVM: x86: ioapic: Preserve read-only values in the redirection table
Robert Lippert (1):
hwmon: (pmbus) Use 64bit math for DIRECT format values
Tetsuo Handa (1):
quota: Check for register_shrinker() failure.
Trond Myklebust (3):
nfsd: CLOSE SHOULD return the invalid special stateid for NFSv4.x (x>0)
nfsd: Ensure we check stateid validity in the seqid operation checks
SUNRPC: Allow connect to return EHOSTUNREACH
Vasily Averin (2):
grace: replace BUG_ON by WARN_ONCE in exit_net hook
lockd: fix "list_add double add" caused by legacy signal interface
Wanpeng Li (2):
KVM: X86: Fix operand/address-size during instruction decoding
KVM: VMX: Fix rflags cache during vCPU reset
Yisheng Xie (1):
kmemleak: add scheduling point to kmemleak_scan()
shaoyunl (1):
drm/amdkfd: Fix SDMA ring buffer size calculation
zhangliping (1):
openvswitch: fix the incorrect flow action alloc size
arch/x86/include/asm/kvm_host.h | 3 +-
arch/x86/kvm/emulate.c | 7 ++++
arch/x86/kvm/ioapic.c | 20 +++++++--
arch/x86/kvm/vmx.c | 4 +-
arch/x86/kvm/x86.c | 2 +-
drivers/acpi/device_sysfs.c | 4 ++
drivers/cpufreq/Kconfig | 2 +
drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gfx_v7.c | 47 ++++++++++++++++------
drivers/gpu/drm/amd/amdkfd/kfd_mqd_manager_cik.c | 4 +-
.../gpu/drm/amd/amdkfd/kfd_process_queue_manager.c | 18 +++++++++
drivers/gpu/drm/omapdrm/omap_dmm_tiler.c | 3 +-
drivers/hwmon/pmbus/pmbus_core.c | 21 +++++-----
drivers/md/bcache/btree.c | 5 ++-
drivers/media/usb/usbtv/usbtv-core.c | 1 +
drivers/net/ethernet/xilinx/Kconfig | 1 +
drivers/net/xen-netfront.c | 18 +++++++++
drivers/scsi/aacraid/commsup.c | 2 +-
drivers/scsi/ufs/ufshcd.c | 7 +++-
drivers/staging/rtl8188eu/os_dep/ioctl_linux.c | 14 ++-----
drivers/usb/gadget/composite.c | 7 +++-
fs/btrfs/free-space-cache.c | 3 +-
fs/nfs_common/grace.c | 10 ++++-
fs/nfsd/nfs4state.c | 34 +++++++++-------
fs/quota/dquot.c | 3 +-
fs/xfs/xfs_aops.c | 6 +--
mm/kmemleak.c | 2 +
net/mac80211/mesh_hwmp.c | 15 ++++---
net/openvswitch/flow_netlink.c | 16 ++++----
net/sunrpc/xprtsock.c | 1 +
29 files changed, 197 insertions(+), 83 deletions(-)
-----BEGIN PGP SIGNATURE-----
iQIcBAEBCAAGBQJabk9/AAoJEN6mb/eXdyzcVCwQAMG2Lj8HgHf2AYNOmEtJeju6
DspErylBwcoOjpbc4LKgqCY9JVev8KWbSPC4PU3J3XeiAjizrqrkmLW83N7sj9IJ
EQ49sc84No6eKLEkxAVagPVuV1XAgZy8ywCxewC9a6l72PFY0IqGHGydxOTbXLhu
HLiwXfu7LtziVN8Le7Em0x21L0w9x8+/pFX79SHgarIAdiEwhjjvsSrEIFsNC/nQ
VFPpmsGxHjlU+hh6YX7k1UOPhE1PbouDuro1qFTwtsU4eCik38k/NmYXcV5elPBW
P3w41ErkeM4+oCD7ArVMjmqQts8+EEyHQELiHcitFhankCXqhiiRu2PETfeNeOiT
ym1qKnIqxaK3+moN7ZugAUX8PdB9TmqyNiOAUDnsr+RF4pD8RBGJ5yQaWwoUFat3
0GaZTq/c8B2NMQ7ugNK0nn1firXgPUIfvO22r7io5nDF1mzIUHj0qeYw14Pdiewi
Dsss4EzWMfP9tGHyoEFY8Z1Pg+6PKdlzuoBt8H+FUBi38NU/5Fx8t9t+TiPgKQq0
yaoToyMabCcLkFh+D8o22W0e/kTUijcxYu4OY6uvKKFYeTdHvPpF4N/CLrt/IaJz
3rQ5Op1GQnZ6ep9Z/bSkx6wvTu/0lf8PUGsRvAgV2SGQF6THcZMLA5WKhifWmVED
/6na9ECM6uGY3pNh1fFQ
=z5FJ
-----END PGP SIGNATURE-----
This is a note to let you know that I've just added the patch titled
vmxnet3: repair memory leak
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
vmxnet3-repair-memory-leak.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:37:09 CET 2018
From: Neil Horman <nhorman(a)tuxdriver.com>
Date: Mon, 22 Jan 2018 16:06:37 -0500
Subject: vmxnet3: repair memory leak
From: Neil Horman <nhorman(a)tuxdriver.com>
[ Upstream commit 848b159835ddef99cc4193083f7e786c3992f580 ]
with the introduction of commit
b0eb57cb97e7837ebb746404c2c58c6f536f23fa, it appears that rq->buf_info
is improperly handled. While it is heap allocated when an rx queue is
setup, and freed when torn down, an old line of code in
vmxnet3_rq_destroy was not properly removed, leading to rq->buf_info[0]
being set to NULL prior to its being freed, causing a memory leak, which
eventually exhausts the system on repeated create/destroy operations
(for example, when the mtu of a vmxnet3 interface is changed
frequently.
Fix is pretty straight forward, just move the NULL set to after the
free.
Tested by myself with successful results
Applies to net, and should likely be queued for stable, please
Signed-off-by: Neil Horman <nhorman(a)tuxdriver.com>
Reported-By: boyang(a)redhat.com
CC: boyang(a)redhat.com
CC: Shrikrishna Khare <skhare(a)vmware.com>
CC: "VMware, Inc." <pv-drivers(a)vmware.com>
CC: David S. Miller <davem(a)davemloft.net>
Acked-by: Shrikrishna Khare <skhare(a)vmware.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/vmxnet3/vmxnet3_drv.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/vmxnet3/vmxnet3_drv.c
+++ b/drivers/net/vmxnet3/vmxnet3_drv.c
@@ -1616,7 +1616,6 @@ static void vmxnet3_rq_destroy(struct vm
rq->rx_ring[i].basePA);
rq->rx_ring[i].base = NULL;
}
- rq->buf_info[i] = NULL;
}
if (rq->data_ring.base) {
@@ -1638,6 +1637,7 @@ static void vmxnet3_rq_destroy(struct vm
(rq->rx_ring[0].size + rq->rx_ring[1].size);
dma_free_coherent(&adapter->pdev->dev, sz, rq->buf_info[0],
rq->buf_info_pa);
+ rq->buf_info[0] = rq->buf_info[1] = NULL;
}
}
Patches currently in stable-queue which might be from nhorman(a)tuxdriver.com are
queue-4.9/sctp-do-not-allow-the-v4-socket-to-bind-a-v4mapped-v6-address.patch
queue-4.9/vmxnet3-repair-memory-leak.patch
queue-4.9/sctp-return-error-if-the-asoc-has-been-peeled-off-in-sctp_wait_for_sndbuf.patch
This is a note to let you know that I've just added the patch titled
tipc: fix a memory leak in tipc_nl_node_get_link()
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tipc-fix-a-memory-leak-in-tipc_nl_node_get_link.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:37:09 CET 2018
From: Cong Wang <xiyou.wangcong(a)gmail.com>
Date: Wed, 10 Jan 2018 12:50:25 -0800
Subject: tipc: fix a memory leak in tipc_nl_node_get_link()
From: Cong Wang <xiyou.wangcong(a)gmail.com>
[ Upstream commit 59b36613e85fb16ebf9feaf914570879cd5c2a21 ]
When tipc_node_find_by_name() fails, the nlmsg is not
freed.
While on it, switch to a goto label to properly
free it.
Fixes: be9c086715c ("tipc: narrow down exposure of struct tipc_node")
Reported-by: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Jon Maloy <jon.maloy(a)ericsson.com>
Cc: Ying Xue <ying.xue(a)windriver.com>
Signed-off-by: Cong Wang <xiyou.wangcong(a)gmail.com>
Acked-by: Ying Xue <ying.xue(a)windriver.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/tipc/node.c | 26 ++++++++++++++------------
1 file changed, 14 insertions(+), 12 deletions(-)
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -1848,36 +1848,38 @@ int tipc_nl_node_get_link(struct sk_buff
if (strcmp(name, tipc_bclink_name) == 0) {
err = tipc_nl_add_bc_link(net, &msg);
- if (err) {
- nlmsg_free(msg.skb);
- return err;
- }
+ if (err)
+ goto err_free;
} else {
int bearer_id;
struct tipc_node *node;
struct tipc_link *link;
node = tipc_node_find_by_name(net, name, &bearer_id);
- if (!node)
- return -EINVAL;
+ if (!node) {
+ err = -EINVAL;
+ goto err_free;
+ }
tipc_node_read_lock(node);
link = node->links[bearer_id].link;
if (!link) {
tipc_node_read_unlock(node);
- nlmsg_free(msg.skb);
- return -EINVAL;
+ err = -EINVAL;
+ goto err_free;
}
err = __tipc_nl_add_link(net, &msg, link, 0);
tipc_node_read_unlock(node);
- if (err) {
- nlmsg_free(msg.skb);
- return err;
- }
+ if (err)
+ goto err_free;
}
return genlmsg_reply(msg.skb, info);
+
+err_free:
+ nlmsg_free(msg.skb);
+ return err;
}
int tipc_nl_node_reset_link_stats(struct sk_buff *skb, struct genl_info *info)
Patches currently in stable-queue which might be from xiyou.wangcong(a)gmail.com are
queue-4.9/tipc-fix-a-memory-leak-in-tipc_nl_node_get_link.patch
queue-4.9/tun-fix-a-memory-leak-for-tfile-tx_array.patch
This is a note to let you know that I've just added the patch titled
tun: fix a memory leak for tfile->tx_array
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tun-fix-a-memory-leak-for-tfile-tx_array.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:37:09 CET 2018
From: Cong Wang <xiyou.wangcong(a)gmail.com>
Date: Mon, 15 Jan 2018 11:37:29 -0800
Subject: tun: fix a memory leak for tfile->tx_array
From: Cong Wang <xiyou.wangcong(a)gmail.com>
[ Upstream commit 4df0bfc79904b7169dc77dcce44598b1545721f9 ]
tfile->tun could be detached before we close the tun fd,
via tun_detach_all(), so it should not be used to check for
tfile->tx_array.
As Jason suggested, we probably have to clean it up
unconditionally both in __tun_deatch() and tun_detach_all(),
but this requires to check if it is initialized or not.
Currently skb_array_cleanup() doesn't have such a check,
so I check it in the caller and introduce a helper function,
it is a bit ugly but we can always improve it in net-next.
Reported-by: Dmitry Vyukov <dvyukov(a)google.com>
Fixes: 1576d9860599 ("tun: switch to use skb array for tx")
Cc: Jason Wang <jasowang(a)redhat.com>
Signed-off-by: Cong Wang <xiyou.wangcong(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/tun.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -525,6 +525,14 @@ static void tun_queue_purge(struct tun_f
skb_queue_purge(&tfile->sk.sk_error_queue);
}
+static void tun_cleanup_tx_array(struct tun_file *tfile)
+{
+ if (tfile->tx_array.ring.queue) {
+ skb_array_cleanup(&tfile->tx_array);
+ memset(&tfile->tx_array, 0, sizeof(tfile->tx_array));
+ }
+}
+
static void __tun_detach(struct tun_file *tfile, bool clean)
{
struct tun_file *ntfile;
@@ -566,8 +574,7 @@ static void __tun_detach(struct tun_file
tun->dev->reg_state == NETREG_REGISTERED)
unregister_netdevice(tun->dev);
}
- if (tun)
- skb_array_cleanup(&tfile->tx_array);
+ tun_cleanup_tx_array(tfile);
sock_put(&tfile->sk);
}
}
@@ -606,11 +613,13 @@ static void tun_detach_all(struct net_de
/* Drop read queue */
tun_queue_purge(tfile);
sock_put(&tfile->sk);
+ tun_cleanup_tx_array(tfile);
}
list_for_each_entry_safe(tfile, tmp, &tun->disabled, next) {
tun_enable_queue(tfile);
tun_queue_purge(tfile);
sock_put(&tfile->sk);
+ tun_cleanup_tx_array(tfile);
}
BUG_ON(tun->numdisabled != 0);
@@ -2363,6 +2372,8 @@ static int tun_chr_open(struct inode *in
sock_set_flag(&tfile->sk, SOCK_ZEROCOPY);
+ memset(&tfile->tx_array, 0, sizeof(tfile->tx_array));
+
return 0;
}
Patches currently in stable-queue which might be from xiyou.wangcong(a)gmail.com are
queue-4.9/tipc-fix-a-memory-leak-in-tipc_nl_node_get_link.patch
queue-4.9/tun-fix-a-memory-leak-for-tfile-tx_array.patch
This is a note to let you know that I've just added the patch titled
sctp: return error if the asoc has been peeled off in sctp_wait_for_sndbuf
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
sctp-return-error-if-the-asoc-has-been-peeled-off-in-sctp_wait_for_sndbuf.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:37:09 CET 2018
From: Xin Long <lucien.xin(a)gmail.com>
Date: Mon, 15 Jan 2018 17:01:36 +0800
Subject: sctp: return error if the asoc has been peeled off in sctp_wait_for_sndbuf
From: Xin Long <lucien.xin(a)gmail.com>
[ Upstream commit a0ff660058b88d12625a783ce9e5c1371c87951f ]
After commit cea0cc80a677 ("sctp: use the right sk after waking up from
wait_buf sleep"), it may change to lock another sk if the asoc has been
peeled off in sctp_wait_for_sndbuf.
However, the asoc's new sk could be already closed elsewhere, as it's in
the sendmsg context of the old sk that can't avoid the new sk's closing.
If the sk's last one refcnt is held by this asoc, later on after putting
this asoc, the new sk will be freed, while under it's own lock.
This patch is to revert that commit, but fix the old issue by returning
error under the old sk's lock.
Fixes: cea0cc80a677 ("sctp: use the right sk after waking up from wait_buf sleep")
Reported-by: syzbot+ac6ea7baa4432811eb50(a)syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin(a)gmail.com>
Acked-by: Neil Horman <nhorman(a)tuxdriver.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/sctp/socket.c | 16 ++++++----------
1 file changed, 6 insertions(+), 10 deletions(-)
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -83,7 +83,7 @@
static int sctp_writeable(struct sock *sk);
static void sctp_wfree(struct sk_buff *skb);
static int sctp_wait_for_sndbuf(struct sctp_association *asoc, long *timeo_p,
- size_t msg_len, struct sock **orig_sk);
+ size_t msg_len);
static int sctp_wait_for_packet(struct sock *sk, int *err, long *timeo_p);
static int sctp_wait_for_connect(struct sctp_association *, long *timeo_p);
static int sctp_wait_for_accept(struct sock *sk, long timeo);
@@ -1956,7 +1956,7 @@ static int sctp_sendmsg(struct sock *sk,
timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);
if (!sctp_wspace(asoc)) {
/* sk can be changed by peel off when waiting for buf. */
- err = sctp_wait_for_sndbuf(asoc, &timeo, msg_len, &sk);
+ err = sctp_wait_for_sndbuf(asoc, &timeo, msg_len);
if (err) {
if (err == -ESRCH) {
/* asoc is already dead. */
@@ -7439,12 +7439,12 @@ void sctp_sock_rfree(struct sk_buff *skb
/* Helper function to wait for space in the sndbuf. */
static int sctp_wait_for_sndbuf(struct sctp_association *asoc, long *timeo_p,
- size_t msg_len, struct sock **orig_sk)
+ size_t msg_len)
{
struct sock *sk = asoc->base.sk;
- int err = 0;
long current_timeo = *timeo_p;
DEFINE_WAIT(wait);
+ int err = 0;
pr_debug("%s: asoc:%p, timeo:%ld, msg_len:%zu\n", __func__, asoc,
*timeo_p, msg_len);
@@ -7473,17 +7473,13 @@ static int sctp_wait_for_sndbuf(struct s
release_sock(sk);
current_timeo = schedule_timeout(current_timeo);
lock_sock(sk);
- if (sk != asoc->base.sk) {
- release_sock(sk);
- sk = asoc->base.sk;
- lock_sock(sk);
- }
+ if (sk != asoc->base.sk)
+ goto do_error;
*timeo_p = current_timeo;
}
out:
- *orig_sk = sk;
finish_wait(&asoc->wait, &wait);
/* Release the association's refcnt. */
Patches currently in stable-queue which might be from lucien.xin(a)gmail.com are
queue-4.9/sctp-do-not-allow-the-v4-socket-to-bind-a-v4mapped-v6-address.patch
queue-4.9/pppoe-take-needed_headroom-of-lower-device-into-account-on-xmit.patch
queue-4.9/sctp-return-error-if-the-asoc-has-been-peeled-off-in-sctp_wait_for_sndbuf.patch
This is a note to let you know that I've just added the patch titled
sctp: do not allow the v4 socket to bind a v4mapped v6 address
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
sctp-do-not-allow-the-v4-socket-to-bind-a-v4mapped-v6-address.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:37:09 CET 2018
From: Xin Long <lucien.xin(a)gmail.com>
Date: Mon, 15 Jan 2018 17:02:00 +0800
Subject: sctp: do not allow the v4 socket to bind a v4mapped v6 address
From: Xin Long <lucien.xin(a)gmail.com>
[ Upstream commit c5006b8aa74599ce19104b31d322d2ea9ff887cc ]
The check in sctp_sockaddr_af is not robust enough to forbid binding a
v4mapped v6 addr on a v4 socket.
The worse thing is that v4 socket's bind_verify would not convert this
v4mapped v6 addr to a v4 addr. syzbot even reported a crash as the v4
socket bound a v6 addr.
This patch is to fix it by doing the common sa.sa_family check first,
then AF_INET check for v4mapped v6 addrs.
Fixes: 7dab83de50c7 ("sctp: Support ipv6only AF_INET6 sockets.")
Reported-by: syzbot+7b7b518b1228d2743963(a)syzkaller.appspotmail.com
Acked-by: Neil Horman <nhorman(a)tuxdriver.com>
Signed-off-by: Xin Long <lucien.xin(a)gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/sctp/socket.c | 14 ++++++--------
1 file changed, 6 insertions(+), 8 deletions(-)
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -332,16 +332,14 @@ static struct sctp_af *sctp_sockaddr_af(
if (len < sizeof (struct sockaddr))
return NULL;
+ if (!opt->pf->af_supported(addr->sa.sa_family, opt))
+ return NULL;
+
/* V4 mapped address are really of AF_INET family */
if (addr->sa.sa_family == AF_INET6 &&
- ipv6_addr_v4mapped(&addr->v6.sin6_addr)) {
- if (!opt->pf->af_supported(AF_INET, opt))
- return NULL;
- } else {
- /* Does this PF support this AF? */
- if (!opt->pf->af_supported(addr->sa.sa_family, opt))
- return NULL;
- }
+ ipv6_addr_v4mapped(&addr->v6.sin6_addr) &&
+ !opt->pf->af_supported(AF_INET, opt))
+ return NULL;
/* If we get this far, af is valid. */
af = sctp_get_af_specific(addr->sa.sa_family);
Patches currently in stable-queue which might be from lucien.xin(a)gmail.com are
queue-4.9/sctp-do-not-allow-the-v4-socket-to-bind-a-v4mapped-v6-address.patch
queue-4.9/pppoe-take-needed_headroom-of-lower-device-into-account-on-xmit.patch
queue-4.9/sctp-return-error-if-the-asoc-has-been-peeled-off-in-sctp_wait_for_sndbuf.patch
This is a note to let you know that I've just added the patch titled
pppoe: take ->needed_headroom of lower device into account on xmit
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
pppoe-take-needed_headroom-of-lower-device-into-account-on-xmit.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:37:09 CET 2018
From: Guillaume Nault <g.nault(a)alphalink.fr>
Date: Mon, 22 Jan 2018 18:06:37 +0100
Subject: pppoe: take ->needed_headroom of lower device into account on xmit
From: Guillaume Nault <g.nault(a)alphalink.fr>
[ Upstream commit 02612bb05e51df8489db5e94d0cf8d1c81f87b0c ]
In pppoe_sendmsg(), reserving dev->hard_header_len bytes of headroom
was probably fine before the introduction of ->needed_headroom in
commit f5184d267c1a ("net: Allow netdevices to specify needed head/tailroom").
But now, virtual devices typically advertise the size of their overhead
in dev->needed_headroom, so we must also take it into account in
skb_reserve().
Allocation size of skb is also updated to take dev->needed_tailroom
into account and replace the arbitrary 32 bytes with the real size of
a PPPoE header.
This issue was discovered by syzbot, who connected a pppoe socket to a
gre device which had dev->header_ops->create == ipgre_header and
dev->hard_header_len == 0. Therefore, PPPoE didn't reserve any
headroom, and dev_hard_header() crashed when ipgre_header() tried to
prepend its header to skb->data.
skbuff: skb_under_panic: text:000000001d390b3a len:31 put:24
head:00000000d8ed776f data:000000008150e823 tail:0x7 end:0xc0 dev:gre0
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:104!
invalid opcode: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in:
CPU: 1 PID: 3670 Comm: syzkaller801466 Not tainted
4.15.0-rc7-next-20180115+ #97
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
RIP: 0010:skb_panic+0x162/0x1f0 net/core/skbuff.c:100
RSP: 0018:ffff8801d9bd7840 EFLAGS: 00010282
RAX: 0000000000000083 RBX: ffff8801d4f083c0 RCX: 0000000000000000
RDX: 0000000000000083 RSI: 1ffff1003b37ae92 RDI: ffffed003b37aefc
RBP: ffff8801d9bd78a8 R08: 1ffff1003b37ae8a R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff86200de0
R13: ffffffff84a981ad R14: 0000000000000018 R15: ffff8801d2d34180
FS: 00000000019c4880(0000) GS:ffff8801db300000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000208bc000 CR3: 00000001d9111001 CR4: 00000000001606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
skb_under_panic net/core/skbuff.c:114 [inline]
skb_push+0xce/0xf0 net/core/skbuff.c:1714
ipgre_header+0x6d/0x4e0 net/ipv4/ip_gre.c:879
dev_hard_header include/linux/netdevice.h:2723 [inline]
pppoe_sendmsg+0x58e/0x8b0 drivers/net/ppp/pppoe.c:890
sock_sendmsg_nosec net/socket.c:630 [inline]
sock_sendmsg+0xca/0x110 net/socket.c:640
sock_write_iter+0x31a/0x5d0 net/socket.c:909
call_write_iter include/linux/fs.h:1775 [inline]
do_iter_readv_writev+0x525/0x7f0 fs/read_write.c:653
do_iter_write+0x154/0x540 fs/read_write.c:932
vfs_writev+0x18a/0x340 fs/read_write.c:977
do_writev+0xfc/0x2a0 fs/read_write.c:1012
SYSC_writev fs/read_write.c:1085 [inline]
SyS_writev+0x27/0x30 fs/read_write.c:1082
entry_SYSCALL_64_fastpath+0x29/0xa0
Admittedly PPPoE shouldn't be allowed to run on non Ethernet-like
interfaces, but reserving space for ->needed_headroom is a more
fundamental issue that needs to be addressed first.
Same problem exists for __pppoe_xmit(), which also needs to take
dev->needed_headroom into account in skb_cow_head().
Fixes: f5184d267c1a ("net: Allow netdevices to specify needed head/tailroom")
Reported-by: syzbot+ed0838d0fa4c4f2b528e20286e6dc63effc7c14d(a)syzkaller.appspotmail.com
Signed-off-by: Guillaume Nault <g.nault(a)alphalink.fr>
Reviewed-by: Xin Long <lucien.xin(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ppp/pppoe.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
--- a/drivers/net/ppp/pppoe.c
+++ b/drivers/net/ppp/pppoe.c
@@ -842,6 +842,7 @@ static int pppoe_sendmsg(struct socket *
struct pppoe_hdr *ph;
struct net_device *dev;
char *start;
+ int hlen;
lock_sock(sk);
if (sock_flag(sk, SOCK_DEAD) || !(sk->sk_state & PPPOX_CONNECTED)) {
@@ -860,16 +861,16 @@ static int pppoe_sendmsg(struct socket *
if (total_len > (dev->mtu + dev->hard_header_len))
goto end;
-
- skb = sock_wmalloc(sk, total_len + dev->hard_header_len + 32,
- 0, GFP_KERNEL);
+ hlen = LL_RESERVED_SPACE(dev);
+ skb = sock_wmalloc(sk, hlen + sizeof(*ph) + total_len +
+ dev->needed_tailroom, 0, GFP_KERNEL);
if (!skb) {
error = -ENOMEM;
goto end;
}
/* Reserve space for headers. */
- skb_reserve(skb, dev->hard_header_len);
+ skb_reserve(skb, hlen);
skb_reset_network_header(skb);
skb->dev = dev;
@@ -930,7 +931,7 @@ static int __pppoe_xmit(struct sock *sk,
/* Copy the data if there is no space for the header or if it's
* read-only.
*/
- if (skb_cow_head(skb, sizeof(*ph) + dev->hard_header_len))
+ if (skb_cow_head(skb, LL_RESERVED_SPACE(dev) + sizeof(*ph)))
goto abort;
__skb_push(skb, sizeof(*ph));
Patches currently in stable-queue which might be from g.nault(a)alphalink.fr are
queue-4.9/pppoe-take-needed_headroom-of-lower-device-into-account-on-xmit.patch
queue-4.9/ppp-unlock-all_ppp_mutex-before-registering-device.patch
This is a note to let you know that I've just added the patch titled
r8169: fix memory corruption on retrieval of hardware statistics.
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
r8169-fix-memory-corruption-on-retrieval-of-hardware-statistics.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:37:09 CET 2018
From: Francois Romieu <romieu(a)fr.zoreil.com>
Date: Fri, 26 Jan 2018 01:53:26 +0100
Subject: r8169: fix memory corruption on retrieval of hardware statistics.
From: Francois Romieu <romieu(a)fr.zoreil.com>
[ Upstream commit a78e93661c5fd30b9e1dee464b2f62f966883ef7 ]
Hardware statistics retrieval hurts in tight invocation loops.
Avoid extraneous write and enforce strict ordering of writes targeted to
the tally counters dump area address registers.
Signed-off-by: Francois Romieu <romieu(a)fr.zoreil.com>
Tested-by: Oliver Freyermuth <o.freyermuth(a)googlemail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/realtek/r8169.c | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -2222,19 +2222,14 @@ static bool rtl8169_do_counters(struct n
void __iomem *ioaddr = tp->mmio_addr;
dma_addr_t paddr = tp->counters_phys_addr;
u32 cmd;
- bool ret;
RTL_W32(CounterAddrHigh, (u64)paddr >> 32);
+ RTL_R32(CounterAddrHigh);
cmd = (u64)paddr & DMA_BIT_MASK(32);
RTL_W32(CounterAddrLow, cmd);
RTL_W32(CounterAddrLow, cmd | counter_cmd);
- ret = rtl_udelay_loop_wait_low(tp, &rtl_counters_cond, 10, 1000);
-
- RTL_W32(CounterAddrLow, 0);
- RTL_W32(CounterAddrHigh, 0);
-
- return ret;
+ return rtl_udelay_loop_wait_low(tp, &rtl_counters_cond, 10, 1000);
}
static bool rtl8169_reset_counters(struct net_device *dev)
Patches currently in stable-queue which might be from romieu(a)fr.zoreil.com are
queue-4.9/r8169-fix-memory-corruption-on-retrieval-of-hardware-statistics.patch
This is a note to let you know that I've just added the patch titled
ppp: unlock all_ppp_mutex before registering device
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ppp-unlock-all_ppp_mutex-before-registering-device.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:37:09 CET 2018
From: Guillaume Nault <g.nault(a)alphalink.fr>
Date: Wed, 10 Jan 2018 16:24:45 +0100
Subject: ppp: unlock all_ppp_mutex before registering device
From: Guillaume Nault <g.nault(a)alphalink.fr>
[ Upstream commit 0171c41835591e9aa2e384b703ef9a6ae367c610 ]
ppp_dev_uninit(), which is the .ndo_uninit() handler of PPP devices,
needs to lock pn->all_ppp_mutex. Therefore we mustn't call
register_netdevice() with pn->all_ppp_mutex already locked, or we'd
deadlock in case register_netdevice() fails and calls .ndo_uninit().
Fortunately, we can unlock pn->all_ppp_mutex before calling
register_netdevice(). This lock protects pn->units_idr, which isn't
used in the device registration process.
However, keeping pn->all_ppp_mutex locked during device registration
did ensure that no device in transient state would be published in
pn->units_idr. In practice, unlocking it before calling
register_netdevice() doesn't change this property: ppp_unit_register()
is called with 'ppp_mutex' locked and all searches done in
pn->units_idr hold this lock too.
Fixes: 8cb775bc0a34 ("ppp: fix device unregistration upon netns deletion")
Reported-and-tested-by: syzbot+367889b9c9e279219175(a)syzkaller.appspotmail.com
Signed-off-by: Guillaume Nault <g.nault(a)alphalink.fr>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ppp/ppp_generic.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
--- a/drivers/net/ppp/ppp_generic.c
+++ b/drivers/net/ppp/ppp_generic.c
@@ -1002,17 +1002,18 @@ static int ppp_unit_register(struct ppp
if (!ifname_is_set)
snprintf(ppp->dev->name, IFNAMSIZ, "ppp%i", ppp->file.index);
+ mutex_unlock(&pn->all_ppp_mutex);
+
ret = register_netdevice(ppp->dev);
if (ret < 0)
goto err_unit;
atomic_inc(&ppp_unit_count);
- mutex_unlock(&pn->all_ppp_mutex);
-
return 0;
err_unit:
+ mutex_lock(&pn->all_ppp_mutex);
unit_put(&pn->units_idr, ppp->file.index);
err:
mutex_unlock(&pn->all_ppp_mutex);
Patches currently in stable-queue which might be from g.nault(a)alphalink.fr are
queue-4.9/pppoe-take-needed_headroom-of-lower-device-into-account-on-xmit.patch
queue-4.9/ppp-unlock-all_ppp_mutex-before-registering-device.patch
This is a note to let you know that I've just added the patch titled
net: tcp: close sock if net namespace is exiting
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-tcp-close-sock-if-net-namespace-is-exiting.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:37:09 CET 2018
From: Dan Streetman <ddstreet(a)ieee.org>
Date: Thu, 18 Jan 2018 16:14:26 -0500
Subject: net: tcp: close sock if net namespace is exiting
From: Dan Streetman <ddstreet(a)ieee.org>
[ Upstream commit 4ee806d51176ba7b8ff1efd81f271d7252e03a1d ]
When a tcp socket is closed, if it detects that its net namespace is
exiting, close immediately and do not wait for FIN sequence.
For normal sockets, a reference is taken to their net namespace, so it will
never exit while the socket is open. However, kernel sockets do not take a
reference to their net namespace, so it may begin exiting while the kernel
socket is still open. In this case if the kernel socket is a tcp socket,
it will stay open trying to complete its close sequence. The sock's dst(s)
hold a reference to their interface, which are all transferred to the
namespace's loopback interface when the real interfaces are taken down.
When the namespace tries to take down its loopback interface, it hangs
waiting for all references to the loopback interface to release, which
results in messages like:
unregister_netdevice: waiting for lo to become free. Usage count = 1
These messages continue until the socket finally times out and closes.
Since the net namespace cleanup holds the net_mutex while calling its
registered pernet callbacks, any new net namespace initialization is
blocked until the current net namespace finishes exiting.
After this change, the tcp socket notices the exiting net namespace, and
closes immediately, releasing its dst(s) and their reference to the
loopback interface, which lets the net namespace continue exiting.
Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=97811
Signed-off-by: Dan Streetman <ddstreet(a)canonical.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/net/net_namespace.h | 10 ++++++++++
net/ipv4/tcp.c | 3 +++
net/ipv4/tcp_timer.c | 15 +++++++++++++++
3 files changed, 28 insertions(+)
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -213,6 +213,11 @@ int net_eq(const struct net *net1, const
return net1 == net2;
}
+static inline int check_net(const struct net *net)
+{
+ return atomic_read(&net->count) != 0;
+}
+
void net_drop_ns(void *);
#else
@@ -236,6 +241,11 @@ int net_eq(const struct net *net1, const
{
return 1;
}
+
+static inline int check_net(const struct net *net)
+{
+ return 1;
+}
#define net_drop_ns NULL
#endif
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2215,6 +2215,9 @@ adjudge_to_death:
tcp_send_active_reset(sk, GFP_ATOMIC);
__NET_INC_STATS(sock_net(sk),
LINUX_MIB_TCPABORTONMEMORY);
+ } else if (!check_net(sock_net(sk))) {
+ /* Not possible to send reset; just close */
+ tcp_set_state(sk, TCP_CLOSE);
}
}
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -50,11 +50,19 @@ static void tcp_write_err(struct sock *s
* to prevent DoS attacks. It is called when a retransmission timeout
* or zero probe timeout occurs on orphaned socket.
*
+ * Also close if our net namespace is exiting; in that case there is no
+ * hope of ever communicating again since all netns interfaces are already
+ * down (or about to be down), and we need to release our dst references,
+ * which have been moved to the netns loopback interface, so the namespace
+ * can finish exiting. This condition is only possible if we are a kernel
+ * socket, as those do not hold references to the namespace.
+ *
* Criteria is still not confirmed experimentally and may change.
* We kill the socket, if:
* 1. If number of orphaned sockets exceeds an administratively configured
* limit.
* 2. If we have strong memory pressure.
+ * 3. If our net namespace is exiting.
*/
static int tcp_out_of_resources(struct sock *sk, bool do_reset)
{
@@ -83,6 +91,13 @@ static int tcp_out_of_resources(struct s
__NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPABORTONMEMORY);
return 1;
}
+
+ if (!check_net(sock_net(sk))) {
+ /* Not possible to send reset; just close */
+ tcp_done(sk);
+ return 1;
+ }
+
return 0;
}
Patches currently in stable-queue which might be from ddstreet(a)ieee.org are
queue-4.9/net-tcp-close-sock-if-net-namespace-is-exiting.patch
This is a note to let you know that I've just added the patch titled
net: qdisc_pkt_len_init() should be more robust
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-qdisc_pkt_len_init-should-be-more-robust.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:37:09 CET 2018
From: Eric Dumazet <edumazet(a)google.com>
Date: Thu, 18 Jan 2018 19:59:19 -0800
Subject: net: qdisc_pkt_len_init() should be more robust
From: Eric Dumazet <edumazet(a)google.com>
[ Upstream commit 7c68d1a6b4db9012790af7ac0f0fdc0d2083422a ]
Without proper validation of DODGY packets, we might very well
feed qdisc_pkt_len_init() with invalid GSO packets.
tcp_hdrlen() might access out-of-bound data, so let's use
skb_header_pointer() and proper checks.
Whole story is described in commit d0c081b49137 ("flow_dissector:
properly cap thoff field")
We have the goal of validating DODGY packets earlier in the stack,
so we might very well revert this fix in the future.
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Cc: Willem de Bruijn <willemb(a)google.com>
Cc: Jason Wang <jasowang(a)redhat.com>
Reported-by: syzbot+9da69ebac7dddd804552(a)syzkaller.appspotmail.com
Acked-by: Jason Wang <jasowang(a)redhat.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/core/dev.c | 19 +++++++++++++++----
1 file changed, 15 insertions(+), 4 deletions(-)
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3083,10 +3083,21 @@ static void qdisc_pkt_len_init(struct sk
hdr_len = skb_transport_header(skb) - skb_mac_header(skb);
/* + transport layer */
- if (likely(shinfo->gso_type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6)))
- hdr_len += tcp_hdrlen(skb);
- else
- hdr_len += sizeof(struct udphdr);
+ if (likely(shinfo->gso_type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6))) {
+ const struct tcphdr *th;
+ struct tcphdr _tcphdr;
+
+ th = skb_header_pointer(skb, skb_transport_offset(skb),
+ sizeof(_tcphdr), &_tcphdr);
+ if (likely(th))
+ hdr_len += __tcp_hdrlen(th);
+ } else {
+ struct udphdr _udphdr;
+
+ if (skb_header_pointer(skb, skb_transport_offset(skb),
+ sizeof(_udphdr), &_udphdr))
+ hdr_len += sizeof(struct udphdr);
+ }
if (shinfo->gso_type & SKB_GSO_DODGY)
gso_segs = DIV_ROUND_UP(skb->len - hdr_len,
Patches currently in stable-queue which might be from edumazet(a)google.com are
queue-4.9/ipv6-fix-udpv6-sendmsg-crash-caused-by-too-small-mtu.patch
queue-4.9/flow_dissector-properly-cap-thoff-field.patch
queue-4.9/ipv6-ip6_make_skb-needs-to-clear-cork.base.dst.patch
queue-4.9/dccp-don-t-restart-ccid2_hc_tx_rto_expire-if-sk-in-closed-state.patch
queue-4.9/net-qdisc_pkt_len_init-should-be-more-robust.patch
This is a note to let you know that I've just added the patch titled
net: igmp: fix source address check for IGMPv3 reports
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-igmp-fix-source-address-check-for-igmpv3-reports.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:37:09 CET 2018
From: Felix Fietkau <nbd(a)nbd.name>
Date: Fri, 19 Jan 2018 11:50:46 +0100
Subject: net: igmp: fix source address check for IGMPv3 reports
From: Felix Fietkau <nbd(a)nbd.name>
[ Upstream commit ad23b750933ea7bf962678972a286c78a8fa36aa ]
Commit "net: igmp: Use correct source address on IGMPv3 reports"
introduced a check to validate the source address of locally generated
IGMPv3 packets.
Instead of checking the local interface address directly, it uses
inet_ifa_match(fl4->saddr, ifa), which checks if the address is on the
local subnet (or equal to the point-to-point address if used).
This breaks for point-to-point interfaces, so check against
ifa->ifa_local directly.
Cc: Kevin Cernekee <cernekee(a)chromium.org>
Fixes: a46182b00290 ("net: igmp: Use correct source address on IGMPv3 reports")
Reported-by: Sebastian Gottschall <s.gottschall(a)dd-wrt.com>
Signed-off-by: Felix Fietkau <nbd(a)nbd.name>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/igmp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/ipv4/igmp.c
+++ b/net/ipv4/igmp.c
@@ -332,7 +332,7 @@ static __be32 igmpv3_get_srcaddr(struct
return htonl(INADDR_ANY);
for_ifa(in_dev) {
- if (inet_ifa_match(fl4->saddr, ifa))
+ if (fl4->saddr == ifa->ifa_local)
return fl4->saddr;
} endfor_ifa(in_dev);
Patches currently in stable-queue which might be from nbd(a)nbd.name are
queue-4.9/net-igmp-fix-source-address-check-for-igmpv3-reports.patch
This is a note to let you know that I've just added the patch titled
lan78xx: Fix failure in USB Full Speed
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
lan78xx-fix-failure-in-usb-full-speed.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:37:09 CET 2018
From: Yuiko Oshino <yuiko.oshino(a)microchip.com>
Date: Mon, 15 Jan 2018 13:24:28 -0500
Subject: lan78xx: Fix failure in USB Full Speed
From: Yuiko Oshino <yuiko.oshino(a)microchip.com>
[ Upstream commit a5b1379afbfabf91e3a689e82ac619a7157336b3 ]
Fix initialize the uninitialized tx_qlen to an appropriate value when USB
Full Speed is used.
Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
Signed-off-by: Yuiko Oshino <yuiko.oshino(a)microchip.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/usb/lan78xx.c | 1 +
1 file changed, 1 insertion(+)
--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -2197,6 +2197,7 @@ static int lan78xx_reset(struct lan78xx_
buf = DEFAULT_BURST_CAP_SIZE / FS_USB_PKT_SIZE;
dev->rx_urb_size = DEFAULT_BURST_CAP_SIZE;
dev->rx_qlen = 4;
+ dev->tx_qlen = 4;
}
ret = lan78xx_write_reg(dev, BURST_CAP, buf);
Patches currently in stable-queue which might be from yuiko.oshino(a)microchip.com are
queue-4.9/lan78xx-fix-failure-in-usb-full-speed.patch
This is a note to let you know that I've just added the patch titled
mlxsw: spectrum_router: Don't log an error on missing neighbor
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
mlxsw-spectrum_router-don-t-log-an-error-on-missing-neighbor.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:37:09 CET 2018
From: Yuval Mintz <yuvalm(a)mellanox.com>
Date: Wed, 24 Jan 2018 10:02:09 +0100
Subject: mlxsw: spectrum_router: Don't log an error on missing neighbor
From: Yuval Mintz <yuvalm(a)mellanox.com>
[ Upstream commit 1ecdaea02ca6bfacf2ecda500dc1af51e9780c42 ]
Driver periodically samples all neighbors configured in device
in order to update the kernel regarding their state. When finding
an entry configured in HW that doesn't show in neigh_lookup()
driver logs an error message.
This introduces a race when removing multiple neighbors -
it's possible that a given entry would still be configured in HW
as its removal is still being processed but is already removed
from the kernel's neighbor tables.
Simply remove the error message and gracefully accept such events.
Fixes: c723c735fa6b ("mlxsw: spectrum_router: Periodically update the kernel's neigh table")
Fixes: 60f040ca11b9 ("mlxsw: spectrum_router: Periodically dump active IPv6 neighbours")
Signed-off-by: Yuval Mintz <yuvalm(a)mellanox.com>
Reviewed-by: Ido Schimmel <idosch(a)mellanox.com>
Signed-off-by: Jiri Pirko <jiri(a)mellanox.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c | 5 +----
1 file changed, 1 insertion(+), 4 deletions(-)
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
@@ -765,11 +765,8 @@ static void mlxsw_sp_router_neigh_ent_ip
dipn = htonl(dip);
dev = mlxsw_sp->rifs[rif]->dev;
n = neigh_lookup(&arp_tbl, &dipn, dev);
- if (!n) {
- netdev_err(dev, "Failed to find matching neighbour for IP=%pI4h\n",
- &dip);
+ if (!n)
return;
- }
netdev_dbg(dev, "Updating neighbour with IP=%pI4h\n", &dip);
neigh_event_send(n, NULL);
Patches currently in stable-queue which might be from yuvalm(a)mellanox.com are
queue-4.9/mlxsw-spectrum_router-don-t-log-an-error-on-missing-neighbor.patch
This is a note to let you know that I've just added the patch titled
net: Allow neigh contructor functions ability to modify the primary_key
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-allow-neigh-contructor-functions-ability-to-modify-the-primary_key.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:37:09 CET 2018
From: Jim Westfall <jwestfall(a)surrealistic.net>
Date: Sun, 14 Jan 2018 04:18:50 -0800
Subject: net: Allow neigh contructor functions ability to modify the primary_key
From: Jim Westfall <jwestfall(a)surrealistic.net>
[ Upstream commit 096b9854c04df86f03b38a97d40b6506e5730919 ]
Use n->primary_key instead of pkey to account for the possibility that a neigh
constructor function may have modified the primary_key value.
Signed-off-by: Jim Westfall <jwestfall(a)surrealistic.net>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/core/neighbour.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -496,7 +496,7 @@ struct neighbour *__neigh_create(struct
if (atomic_read(&tbl->entries) > (1 << nht->hash_shift))
nht = neigh_hash_grow(tbl, nht->hash_shift + 1);
- hash_val = tbl->hash(pkey, dev, nht->hash_rnd) >> (32 - nht->hash_shift);
+ hash_val = tbl->hash(n->primary_key, dev, nht->hash_rnd) >> (32 - nht->hash_shift);
if (n->parms->dead) {
rc = ERR_PTR(-EINVAL);
@@ -508,7 +508,7 @@ struct neighbour *__neigh_create(struct
n1 != NULL;
n1 = rcu_dereference_protected(n1->next,
lockdep_is_held(&tbl->lock))) {
- if (dev == n1->dev && !memcmp(n1->primary_key, pkey, key_len)) {
+ if (dev == n1->dev && !memcmp(n1->primary_key, n->primary_key, key_len)) {
if (want_ref)
neigh_hold(n1);
rc = n1;
Patches currently in stable-queue which might be from jwestfall(a)surrealistic.net are
queue-4.9/ipv4-make-neigh-lookup-keys-for-loopback-point-to-point-devices-be-inaddr_any.patch
queue-4.9/net-allow-neigh-contructor-functions-ability-to-modify-the-primary_key.patch
This is a note to let you know that I've just added the patch titled
ipv6: ip6_make_skb() needs to clear cork.base.dst
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ipv6-ip6_make_skb-needs-to-clear-cork.base.dst.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:37:09 CET 2018
From: Eric Dumazet <edumazet(a)google.com>
Date: Thu, 11 Jan 2018 22:31:18 -0800
Subject: ipv6: ip6_make_skb() needs to clear cork.base.dst
From: Eric Dumazet <edumazet(a)google.com>
[ Upstream commit 95ef498d977bf44ac094778fd448b98af158a3e6 ]
In my last patch, I missed fact that cork.base.dst was not initialized
in ip6_make_skb() :
If ip6_setup_cork() returns an error, we might attempt a dst_release()
on some random pointer.
Fixes: 862c03ee1deb ("ipv6: fix possible mem leaks in ipv6_make_skb()")
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Reported-by: syzbot <syzkaller(a)googlegroups.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv6/ip6_output.c | 1 +
1 file changed, 1 insertion(+)
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1800,6 +1800,7 @@ struct sk_buff *ip6_make_skb(struct sock
cork.base.flags = 0;
cork.base.addr = 0;
cork.base.opt = NULL;
+ cork.base.dst = NULL;
v6_cork.opt = NULL;
err = ip6_setup_cork(sk, &cork, &v6_cork, ipc6, rt, fl6);
if (err) {
Patches currently in stable-queue which might be from edumazet(a)google.com are
queue-4.9/ipv6-fix-udpv6-sendmsg-crash-caused-by-too-small-mtu.patch
queue-4.9/flow_dissector-properly-cap-thoff-field.patch
queue-4.9/ipv6-ip6_make_skb-needs-to-clear-cork.base.dst.patch
queue-4.9/dccp-don-t-restart-ccid2_hc_tx_rto_expire-if-sk-in-closed-state.patch
queue-4.9/net-qdisc_pkt_len_init-should-be-more-robust.patch
This is a note to let you know that I've just added the patch titled
ipv6: fix udpv6 sendmsg crash caused by too small MTU
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ipv6-fix-udpv6-sendmsg-crash-caused-by-too-small-mtu.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:37:09 CET 2018
From: Mike Maloney <maloney(a)google.com>
Date: Wed, 10 Jan 2018 12:45:10 -0500
Subject: ipv6: fix udpv6 sendmsg crash caused by too small MTU
From: Mike Maloney <maloney(a)google.com>
[ Upstream commit 749439bfac6e1a2932c582e2699f91d329658196 ]
The logic in __ip6_append_data() assumes that the MTU is at least large
enough for the headers. A device's MTU may be adjusted after being
added while sendmsg() is processing data, resulting in
__ip6_append_data() seeing any MTU. For an mtu smaller than the size of
the fragmentation header, the math results in a negative 'maxfraglen',
which causes problems when refragmenting any previous skb in the
skb_write_queue, leaving it possibly malformed.
Instead sendmsg returns EINVAL when the mtu is calculated to be less
than IPV6_MIN_MTU.
Found by syzkaller:
kernel BUG at ./include/linux/skbuff.h:2064!
invalid opcode: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in:
CPU: 1 PID: 14216 Comm: syz-executor5 Not tainted 4.13.0-rc4+ #2
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
task: ffff8801d0b68580 task.stack: ffff8801ac6b8000
RIP: 0010:__skb_pull include/linux/skbuff.h:2064 [inline]
RIP: 0010:__ip6_make_skb+0x18cf/0x1f70 net/ipv6/ip6_output.c:1617
RSP: 0018:ffff8801ac6bf570 EFLAGS: 00010216
RAX: 0000000000010000 RBX: 0000000000000028 RCX: ffffc90003cce000
RDX: 00000000000001b8 RSI: ffffffff839df06f RDI: ffff8801d9478ca0
RBP: ffff8801ac6bf780 R08: ffff8801cc3f1dbc R09: 0000000000000000
R10: ffff8801ac6bf7a0 R11: 43cb4b7b1948a9e7 R12: ffff8801cc3f1dc8
R13: ffff8801cc3f1d40 R14: 0000000000001036 R15: dffffc0000000000
FS: 00007f43d740c700(0000) GS:ffff8801dc100000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7834984000 CR3: 00000001d79b9000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
ip6_finish_skb include/net/ipv6.h:911 [inline]
udp_v6_push_pending_frames+0x255/0x390 net/ipv6/udp.c:1093
udpv6_sendmsg+0x280d/0x31a0 net/ipv6/udp.c:1363
inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:762
sock_sendmsg_nosec net/socket.c:633 [inline]
sock_sendmsg+0xca/0x110 net/socket.c:643
SYSC_sendto+0x352/0x5a0 net/socket.c:1750
SyS_sendto+0x40/0x50 net/socket.c:1718
entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x4512e9
RSP: 002b:00007f43d740bc08 EFLAGS: 00000216 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00000000007180a8 RCX: 00000000004512e9
RDX: 000000000000002e RSI: 0000000020d08000 RDI: 0000000000000005
RBP: 0000000000000086 R08: 00000000209c1000 R09: 000000000000001c
R10: 0000000000040800 R11: 0000000000000216 R12: 00000000004b9c69
R13: 00000000ffffffff R14: 0000000000000005 R15: 00000000202c2000
Code: 9e 01 fe e9 c5 e8 ff ff e8 7f 9e 01 fe e9 4a ea ff ff 48 89 f7 e8 52 9e 01 fe e9 aa eb ff ff e8 a8 b6 cf fd 0f 0b e8 a1 b6 cf fd <0f> 0b 49 8d 45 78 4d 8d 45 7c 48 89 85 78 fe ff ff 49 8d 85 ba
RIP: __skb_pull include/linux/skbuff.h:2064 [inline] RSP: ffff8801ac6bf570
RIP: __ip6_make_skb+0x18cf/0x1f70 net/ipv6/ip6_output.c:1617 RSP: ffff8801ac6bf570
Reported-by: syzbot <syzkaller(a)googlegroups.com>
Signed-off-by: Mike Maloney <maloney(a)google.com>
Reviewed-by: Eric Dumazet <edumazet(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv6/ip6_output.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1260,14 +1260,16 @@ static int ip6_setup_cork(struct sock *s
v6_cork->tclass = ipc6->tclass;
if (rt->dst.flags & DST_XFRM_TUNNEL)
mtu = np->pmtudisc >= IPV6_PMTUDISC_PROBE ?
- rt->dst.dev->mtu : dst_mtu(&rt->dst);
+ READ_ONCE(rt->dst.dev->mtu) : dst_mtu(&rt->dst);
else
mtu = np->pmtudisc >= IPV6_PMTUDISC_PROBE ?
- rt->dst.dev->mtu : dst_mtu(rt->dst.path);
+ READ_ONCE(rt->dst.dev->mtu) : dst_mtu(rt->dst.path);
if (np->frag_size < mtu) {
if (np->frag_size)
mtu = np->frag_size;
}
+ if (mtu < IPV6_MIN_MTU)
+ return -EINVAL;
cork->base.fragsize = mtu;
if (dst_allfrag(rt->dst.path))
cork->base.flags |= IPCORK_ALLFRAG;
Patches currently in stable-queue which might be from maloney(a)google.com are
queue-4.9/ipv6-fix-udpv6-sendmsg-crash-caused-by-too-small-mtu.patch
This is a note to let you know that I've just added the patch titled
ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ipv4-make-neigh-lookup-keys-for-loopback-point-to-point-devices-be-inaddr_any.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:37:09 CET 2018
From: Jim Westfall <jwestfall(a)surrealistic.net>
Date: Sun, 14 Jan 2018 04:18:51 -0800
Subject: ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY
From: Jim Westfall <jwestfall(a)surrealistic.net>
[ Upstream commit cd9ff4de0107c65d69d02253bb25d6db93c3dbc1 ]
Map all lookup neigh keys to INADDR_ANY for loopback/point-to-point devices
to avoid making an entry for every remote ip the device needs to talk to.
This used the be the old behavior but became broken in a263b3093641f
(ipv4: Make neigh lookups directly in output packet path) and later removed
in 0bb4087cbec0 (ipv4: Fix neigh lookup keying over loopback/point-to-point
devices) because it was broken.
Signed-off-by: Jim Westfall <jwestfall(a)surrealistic.net>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/net/arp.h | 3 +++
net/ipv4/arp.c | 7 ++++++-
2 files changed, 9 insertions(+), 1 deletion(-)
--- a/include/net/arp.h
+++ b/include/net/arp.h
@@ -19,6 +19,9 @@ static inline u32 arp_hashfn(const void
static inline struct neighbour *__ipv4_neigh_lookup_noref(struct net_device *dev, u32 key)
{
+ if (dev->flags & (IFF_LOOPBACK | IFF_POINTOPOINT))
+ key = INADDR_ANY;
+
return ___neigh_lookup_noref(&arp_tbl, neigh_key_eq32, arp_hashfn, &key, dev);
}
--- a/net/ipv4/arp.c
+++ b/net/ipv4/arp.c
@@ -223,11 +223,16 @@ static bool arp_key_eq(const struct neig
static int arp_constructor(struct neighbour *neigh)
{
- __be32 addr = *(__be32 *)neigh->primary_key;
+ __be32 addr;
struct net_device *dev = neigh->dev;
struct in_device *in_dev;
struct neigh_parms *parms;
+ u32 inaddr_any = INADDR_ANY;
+ if (dev->flags & (IFF_LOOPBACK | IFF_POINTOPOINT))
+ memcpy(neigh->primary_key, &inaddr_any, arp_tbl.key_len);
+
+ addr = *(__be32 *)neigh->primary_key;
rcu_read_lock();
in_dev = __in_dev_get_rcu(dev);
if (!in_dev) {
Patches currently in stable-queue which might be from jwestfall(a)surrealistic.net are
queue-4.9/ipv4-make-neigh-lookup-keys-for-loopback-point-to-point-devices-be-inaddr_any.patch
queue-4.9/net-allow-neigh-contructor-functions-ability-to-modify-the-primary_key.patch
This is a note to let you know that I've just added the patch titled
ipv6: Fix getsockopt() for sockets with default IPV6_AUTOFLOWLABEL
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ipv6-fix-getsockopt-for-sockets-with-default-ipv6_autoflowlabel.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:37:09 CET 2018
From: Ben Hutchings <ben.hutchings(a)codethink.co.uk>
Date: Mon, 22 Jan 2018 20:06:42 +0000
Subject: ipv6: Fix getsockopt() for sockets with default IPV6_AUTOFLOWLABEL
From: Ben Hutchings <ben.hutchings(a)codethink.co.uk>
[ Upstream commit e9191ffb65d8e159680ce0ad2224e1acbde6985c ]
Commit 513674b5a2c9 ("net: reevalulate autoflowlabel setting after
sysctl setting") removed the initialisation of
ipv6_pinfo::autoflowlabel and added a second flag to indicate
whether this field or the net namespace default should be used.
The getsockopt() handling for this case was not updated, so it
currently returns 0 for all sockets for which IPV6_AUTOFLOWLABEL is
not explicitly enabled. Fix it to return the effective value, whether
that has been set at the socket or net namespace level.
Fixes: 513674b5a2c9 ("net: reevalulate autoflowlabel setting after sysctl ...")
Signed-off-by: Ben Hutchings <ben.hutchings(a)codethink.co.uk>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/net/ipv6.h | 1 +
net/ipv6/ip6_output.c | 2 +-
net/ipv6/ipv6_sockglue.c | 2 +-
3 files changed, 3 insertions(+), 2 deletions(-)
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -290,6 +290,7 @@ int ipv6_flowlabel_opt_get(struct sock *
int flags);
int ip6_flowlabel_init(void);
void ip6_flowlabel_cleanup(void);
+bool ip6_autoflowlabel(struct net *net, const struct ipv6_pinfo *np);
static inline void fl6_sock_release(struct ip6_flowlabel *fl)
{
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -156,7 +156,7 @@ int ip6_output(struct net *net, struct s
!(IP6CB(skb)->flags & IP6SKB_REROUTED));
}
-static bool ip6_autoflowlabel(struct net *net, const struct ipv6_pinfo *np)
+bool ip6_autoflowlabel(struct net *net, const struct ipv6_pinfo *np)
{
if (!np->autoflowlabel_set)
return ip6_default_np_autolabel(net);
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -1316,7 +1316,7 @@ static int do_ipv6_getsockopt(struct soc
break;
case IPV6_AUTOFLOWLABEL:
- val = np->autoflowlabel;
+ val = ip6_autoflowlabel(sock_net(sk), np);
break;
default:
Patches currently in stable-queue which might be from ben.hutchings(a)codethink.co.uk are
queue-4.9/vsyscall-fix-permissions-for-emulate-mode-with-kaiser-pti.patch
queue-4.9/ipv6-fix-getsockopt-for-sockets-with-default-ipv6_autoflowlabel.patch
This is a note to let you know that I've just added the patch titled
gso: validate gso_type in GSO handlers
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
gso-validate-gso_type-in-gso-handlers.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:37:09 CET 2018
From: Willem de Bruijn <willemb(a)google.com>
Date: Fri, 19 Jan 2018 09:29:18 -0500
Subject: gso: validate gso_type in GSO handlers
From: Willem de Bruijn <willemb(a)google.com>
[ Upstream commit 121d57af308d0cf943f08f4738d24d3966c38cd9 ]
Validate gso_type during segmentation as SKB_GSO_DODGY sources
may pass packets where the gso_type does not match the contents.
Syzkaller was able to enter the SCTP gso handler with a packet of
gso_type SKB_GSO_TCPV4.
On entry of transport layer gso handlers, verify that the gso_type
matches the transport protocol.
Fixes: 90017accff61 ("sctp: Add GSO support")
Link: http://lkml.kernel.org/r/<001a1137452496ffc305617e5fe0(a)google.com>
Reported-by: syzbot+fee64147a25aecd48055(a)syzkaller.appspotmail.com
Signed-off-by: Willem de Bruijn <willemb(a)google.com>
Acked-by: Jason Wang <jasowang(a)redhat.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/tcp_offload.c | 3 +++
net/ipv4/udp_offload.c | 3 +++
net/ipv6/tcpv6_offload.c | 3 +++
net/ipv6/udp_offload.c | 3 +++
net/sctp/offload.c | 3 +++
5 files changed, 15 insertions(+)
--- a/net/ipv4/tcp_offload.c
+++ b/net/ipv4/tcp_offload.c
@@ -32,6 +32,9 @@ static void tcp_gso_tstamp(struct sk_buf
static struct sk_buff *tcp4_gso_segment(struct sk_buff *skb,
netdev_features_t features)
{
+ if (!(skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4))
+ return ERR_PTR(-EINVAL);
+
if (!pskb_may_pull(skb, sizeof(struct tcphdr)))
return ERR_PTR(-EINVAL);
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -205,6 +205,9 @@ static struct sk_buff *udp4_ufo_fragment
goto out;
}
+ if (!(skb_shinfo(skb)->gso_type & SKB_GSO_UDP))
+ goto out;
+
if (!pskb_may_pull(skb, sizeof(struct udphdr)))
goto out;
--- a/net/ipv6/tcpv6_offload.c
+++ b/net/ipv6/tcpv6_offload.c
@@ -46,6 +46,9 @@ static struct sk_buff *tcp6_gso_segment(
{
struct tcphdr *th;
+ if (!(skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6))
+ return ERR_PTR(-EINVAL);
+
if (!pskb_may_pull(skb, sizeof(*th)))
return ERR_PTR(-EINVAL);
--- a/net/ipv6/udp_offload.c
+++ b/net/ipv6/udp_offload.c
@@ -55,6 +55,9 @@ static struct sk_buff *udp6_ufo_fragment
const struct ipv6hdr *ipv6h;
struct udphdr *uh;
+ if (!(skb_shinfo(skb)->gso_type & SKB_GSO_UDP))
+ goto out;
+
if (!pskb_may_pull(skb, sizeof(struct udphdr)))
goto out;
--- a/net/sctp/offload.c
+++ b/net/sctp/offload.c
@@ -44,6 +44,9 @@ static struct sk_buff *sctp_gso_segment(
struct sk_buff *segs = ERR_PTR(-EINVAL);
struct sctphdr *sh;
+ if (!(skb_shinfo(skb)->gso_type & SKB_GSO_SCTP))
+ goto out;
+
sh = sctp_hdr(skb);
if (!pskb_may_pull(skb, sizeof(*sh)))
goto out;
Patches currently in stable-queue which might be from willemb(a)google.com are
queue-4.9/gso-validate-gso_type-in-gso-handlers.patch
queue-4.9/flow_dissector-properly-cap-thoff-field.patch
queue-4.9/net-qdisc_pkt_len_init-should-be-more-robust.patch
This is a note to let you know that I've just added the patch titled
ip6_gre: init dev->mtu and dev->hard_header_len correctly
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ip6_gre-init-dev-mtu-and-dev-hard_header_len-correctly.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:37:09 CET 2018
From: Alexey Kodanev <alexey.kodanev(a)oracle.com>
Date: Thu, 18 Jan 2018 20:51:12 +0300
Subject: ip6_gre: init dev->mtu and dev->hard_header_len correctly
From: Alexey Kodanev <alexey.kodanev(a)oracle.com>
[ Upstream commit 128bb975dc3c25d00de04e503e2fe0a780d04459 ]
Commit b05229f44228 ("gre6: Cleanup GREv6 transmit path,
call common GRE functions") moved dev->mtu initialization
from ip6gre_tunnel_setup() to ip6gre_tunnel_init(), as a
result, the previously set values, before ndo_init(), are
reset in the following cases:
* rtnl_create_link() can update dev->mtu from IFLA_MTU
parameter.
* ip6gre_tnl_link_config() is invoked before ndo_init() in
netlink and ioctl setup, so ndo_init() can reset MTU
adjustments with the lower device MTU as well, dev->mtu
and dev->hard_header_len.
Not applicable for ip6gretap because it has one more call
to ip6gre_tnl_link_config(tunnel, 1) in ip6gre_tap_init().
Fix the first case by updating dev->mtu with 'tb[IFLA_MTU]'
parameter if a user sets it manually on a device creation,
and fix the second one by moving ip6gre_tnl_link_config()
call after register_netdevice().
Fixes: b05229f44228 ("gre6: Cleanup GREv6 transmit path, call common GRE functions")
Fixes: db2ec95d1ba4 ("ip6_gre: Fix MTU setting")
Signed-off-by: Alexey Kodanev <alexey.kodanev(a)oracle.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv6/ip6_gre.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -337,11 +337,12 @@ static struct ip6_tnl *ip6gre_tunnel_loc
nt->dev = dev;
nt->net = dev_net(dev);
- ip6gre_tnl_link_config(nt, 1);
if (register_netdevice(dev) < 0)
goto failed_free;
+ ip6gre_tnl_link_config(nt, 1);
+
/* Can use a lockless transmit, unless we generate output sequences */
if (!(nt->parms.o_flags & TUNNEL_SEQ))
dev->features |= NETIF_F_LLTX;
@@ -1263,7 +1264,6 @@ static void ip6gre_netlink_parms(struct
static int ip6gre_tap_init(struct net_device *dev)
{
- struct ip6_tnl *tunnel;
int ret;
ret = ip6gre_tunnel_init_common(dev);
@@ -1272,10 +1272,6 @@ static int ip6gre_tap_init(struct net_de
dev->priv_flags |= IFF_LIVE_ADDR_CHANGE;
- tunnel = netdev_priv(dev);
-
- ip6gre_tnl_link_config(tunnel, 1);
-
return 0;
}
@@ -1370,7 +1366,6 @@ static int ip6gre_newlink(struct net *sr
nt->dev = dev;
nt->net = dev_net(dev);
- ip6gre_tnl_link_config(nt, !tb[IFLA_MTU]);
dev->features |= GRE6_FEATURES;
dev->hw_features |= GRE6_FEATURES;
@@ -1396,6 +1391,11 @@ static int ip6gre_newlink(struct net *sr
if (err)
goto out;
+ ip6gre_tnl_link_config(nt, !tb[IFLA_MTU]);
+
+ if (tb[IFLA_MTU])
+ ip6_tnl_change_mtu(dev, nla_get_u32(tb[IFLA_MTU]));
+
dev_hold(dev);
ip6gre_tunnel_link(ign, nt);
Patches currently in stable-queue which might be from alexey.kodanev(a)oracle.com are
queue-4.9/dccp-don-t-restart-ccid2_hc_tx_rto_expire-if-sk-in-closed-state.patch
queue-4.9/ip6_gre-init-dev-mtu-and-dev-hard_header_len-correctly.patch
This is a note to let you know that I've just added the patch titled
flow_dissector: properly cap thoff field
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
flow_dissector-properly-cap-thoff-field.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:37:09 CET 2018
From: Eric Dumazet <edumazet(a)google.com>
Date: Wed, 17 Jan 2018 14:21:13 -0800
Subject: flow_dissector: properly cap thoff field
From: Eric Dumazet <edumazet(a)google.com>
[ Upstream commit d0c081b49137cd3200f2023c0875723be66e7ce5 ]
syzbot reported yet another crash [1] that is caused by
insufficient validation of DODGY packets.
Two bugs are happening here to trigger the crash.
1) Flow dissection leaves with incorrect thoff field.
2) skb_probe_transport_header() sets transport header to this invalid
thoff, even if pointing after skb valid data.
3) qdisc_pkt_len_init() reads out-of-bound data because it
trusts tcp_hdrlen(skb)
Possible fixes :
- Full flow dissector validation before injecting bad DODGY packets in
the stack.
This approach was attempted here : https://patchwork.ozlabs.org/patch/
861874/
- Have more robust functions in the core.
This might be needed anyway for stable versions.
This patch fixes the flow dissection issue.
[1]
CPU: 1 PID: 3144 Comm: syzkaller271204 Not tainted 4.15.0-rc4-mm1+ #49
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
print_address_description+0x73/0x250 mm/kasan/report.c:256
kasan_report_error mm/kasan/report.c:355 [inline]
kasan_report+0x23b/0x360 mm/kasan/report.c:413
__asan_report_load2_noabort+0x14/0x20 mm/kasan/report.c:432
__tcp_hdrlen include/linux/tcp.h:35 [inline]
tcp_hdrlen include/linux/tcp.h:40 [inline]
qdisc_pkt_len_init net/core/dev.c:3160 [inline]
__dev_queue_xmit+0x20d3/0x2200 net/core/dev.c:3465
dev_queue_xmit+0x17/0x20 net/core/dev.c:3554
packet_snd net/packet/af_packet.c:2943 [inline]
packet_sendmsg+0x3ad5/0x60a0 net/packet/af_packet.c:2968
sock_sendmsg_nosec net/socket.c:628 [inline]
sock_sendmsg+0xca/0x110 net/socket.c:638
sock_write_iter+0x31a/0x5d0 net/socket.c:907
call_write_iter include/linux/fs.h:1776 [inline]
new_sync_write fs/read_write.c:469 [inline]
__vfs_write+0x684/0x970 fs/read_write.c:482
vfs_write+0x189/0x510 fs/read_write.c:544
SYSC_write fs/read_write.c:589 [inline]
SyS_write+0xef/0x220 fs/read_write.c:581
entry_SYSCALL_64_fastpath+0x1f/0x96
Fixes: 34fad54c2537 ("net: __skb_flow_dissect() must cap its return value")
Fixes: a6e544b0a88b ("flow_dissector: Jump to exit code in __skb_flow_dissect")
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Cc: Willem de Bruijn <willemb(a)google.com>
Reported-by: syzbot <syzkaller(a)googlegroups.com>
Acked-by: Jason Wang <jasowang(a)redhat.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/core/flow_dissector.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -550,8 +550,8 @@ ip_proto_again:
out_good:
ret = true;
- key_control->thoff = (u16)nhoff;
out:
+ key_control->thoff = min_t(u16, nhoff, skb ? skb->len : hlen);
key_basic->n_proto = proto;
key_basic->ip_proto = ip_proto;
@@ -559,7 +559,6 @@ out:
out_bad:
ret = false;
- key_control->thoff = min_t(u16, nhoff, skb ? skb->len : hlen);
goto out;
}
EXPORT_SYMBOL(__skb_flow_dissect);
Patches currently in stable-queue which might be from edumazet(a)google.com are
queue-4.9/ipv6-fix-udpv6-sendmsg-crash-caused-by-too-small-mtu.patch
queue-4.9/flow_dissector-properly-cap-thoff-field.patch
queue-4.9/ipv6-ip6_make_skb-needs-to-clear-cork.base.dst.patch
queue-4.9/dccp-don-t-restart-ccid2_hc_tx_rto_expire-if-sk-in-closed-state.patch
queue-4.9/net-qdisc_pkt_len_init-should-be-more-robust.patch
This is a note to let you know that I've just added the patch titled
dccp: don't restart ccid2_hc_tx_rto_expire() if sk in closed state
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
dccp-don-t-restart-ccid2_hc_tx_rto_expire-if-sk-in-closed-state.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:37:09 CET 2018
From: Alexey Kodanev <alexey.kodanev(a)oracle.com>
Date: Fri, 26 Jan 2018 15:14:16 +0300
Subject: dccp: don't restart ccid2_hc_tx_rto_expire() if sk in closed state
From: Alexey Kodanev <alexey.kodanev(a)oracle.com>
[ Upstream commit dd5684ecae3bd8e44b644f50e2c12c7e57fdfef5 ]
ccid2_hc_tx_rto_expire() timer callback always restarts the timer
again and can run indefinitely (unless it is stopped outside), and after
commit 120e9dabaf55 ("dccp: defer ccid_hc_tx_delete() at dismantle time"),
which moved ccid_hc_tx_delete() (also includes sk_stop_timer()) from
dccp_destroy_sock() to sk_destruct(), this started to happen quite often.
The timer prevents releasing the socket, as a result, sk_destruct() won't
be called.
Found with LTP/dccp_ipsec tests running on the bonding device,
which later couldn't be unloaded after the tests were completed:
unregister_netdevice: waiting for bond0 to become free. Usage count = 148
Fixes: 2a91aa396739 ("[DCCP] CCID2: Initial CCID2 (TCP-Like) implementation")
Signed-off-by: Alexey Kodanev <alexey.kodanev(a)oracle.com>
Reviewed-by: Eric Dumazet <edumazet(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/dccp/ccids/ccid2.c | 3 +++
1 file changed, 3 insertions(+)
--- a/net/dccp/ccids/ccid2.c
+++ b/net/dccp/ccids/ccid2.c
@@ -140,6 +140,9 @@ static void ccid2_hc_tx_rto_expire(unsig
ccid2_pr_debug("RTO_EXPIRE\n");
+ if (sk->sk_state == DCCP_CLOSED)
+ goto out;
+
/* back-off timer */
hc->tx_rto <<= 1;
if (hc->tx_rto > DCCP_RTO_MAX)
Patches currently in stable-queue which might be from alexey.kodanev(a)oracle.com are
queue-4.9/dccp-don-t-restart-ccid2_hc_tx_rto_expire-if-sk-in-closed-state.patch
queue-4.9/ip6_gre-init-dev-mtu-and-dev-hard_header_len-correctly.patch
This is a note to let you know that I've just added the patch titled
be2net: restore properly promisc mode after queues reconfiguration
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
be2net-restore-properly-promisc-mode-after-queues-reconfiguration.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:37:09 CET 2018
From: Ivan Vecera <cera(a)cera.cz>
Date: Fri, 19 Jan 2018 20:23:50 +0100
Subject: be2net: restore properly promisc mode after queues reconfiguration
From: Ivan Vecera <cera(a)cera.cz>
[ Upstream commit 52acf06451930eb4cefabd5ecea56e2d46c32f76 ]
The commit 622190669403 ("be2net: Request RSS capability of Rx interface
depending on number of Rx rings") modified be_update_queues() so the
IFACE (HW representation of the netdevice) is destroyed and then
re-created. This causes a regression because potential promiscuous mode
is not restored properly during be_open() because the driver thinks
that the HW has promiscuous mode already enabled.
Note that Lancer is not affected by this bug because RX-filter flags are
disabled during be_close() for this chipset.
Cc: Sathya Perla <sathya.perla(a)broadcom.com>
Cc: Ajit Khaparde <ajit.khaparde(a)broadcom.com>
Cc: Sriharsha Basavapatna <sriharsha.basavapatna(a)broadcom.com>
Cc: Somnath Kotur <somnath.kotur(a)broadcom.com>
Fixes: 622190669403 ("be2net: Request RSS capability of Rx interface depending on number of Rx rings")
Signed-off-by: Ivan Vecera <ivecera(a)redhat.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/emulex/benet/be_main.c | 9 +++++++++
1 file changed, 9 insertions(+)
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -4733,6 +4733,15 @@ int be_update_queues(struct be_adapter *
be_schedule_worker(adapter);
+ /*
+ * The IF was destroyed and re-created. We need to clear
+ * all promiscuous flags valid for the destroyed IF.
+ * Without this promisc mode is not restored during
+ * be_open() because the driver thinks that it is
+ * already enabled in HW.
+ */
+ adapter->if_flags &= ~BE_IF_FLAGS_ALL_PROMISCUOUS;
+
if (netif_running(netdev))
status = be_open(netdev);
Patches currently in stable-queue which might be from cera(a)cera.cz are
queue-4.9/be2net-restore-properly-promisc-mode-after-queues-reconfiguration.patch
This is a note to let you know that I've just added the patch titled
tun: fix a memory leak for tfile->tx_array
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tun-fix-a-memory-leak-for-tfile-tx_array.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Cong Wang <xiyou.wangcong(a)gmail.com>
Date: Mon, 15 Jan 2018 11:37:29 -0800
Subject: tun: fix a memory leak for tfile->tx_array
From: Cong Wang <xiyou.wangcong(a)gmail.com>
[ Upstream commit 4df0bfc79904b7169dc77dcce44598b1545721f9 ]
tfile->tun could be detached before we close the tun fd,
via tun_detach_all(), so it should not be used to check for
tfile->tx_array.
As Jason suggested, we probably have to clean it up
unconditionally both in __tun_deatch() and tun_detach_all(),
but this requires to check if it is initialized or not.
Currently skb_array_cleanup() doesn't have such a check,
so I check it in the caller and introduce a helper function,
it is a bit ugly but we can always improve it in net-next.
Reported-by: Dmitry Vyukov <dvyukov(a)google.com>
Fixes: 1576d9860599 ("tun: switch to use skb array for tx")
Cc: Jason Wang <jasowang(a)redhat.com>
Signed-off-by: Cong Wang <xiyou.wangcong(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/tun.c | 15 +++++++++++++--
1 file changed, 13 insertions(+), 2 deletions(-)
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -534,6 +534,14 @@ static void tun_queue_purge(struct tun_f
skb_queue_purge(&tfile->sk.sk_error_queue);
}
+static void tun_cleanup_tx_array(struct tun_file *tfile)
+{
+ if (tfile->tx_array.ring.queue) {
+ skb_array_cleanup(&tfile->tx_array);
+ memset(&tfile->tx_array, 0, sizeof(tfile->tx_array));
+ }
+}
+
static void __tun_detach(struct tun_file *tfile, bool clean)
{
struct tun_file *ntfile;
@@ -575,8 +583,7 @@ static void __tun_detach(struct tun_file
tun->dev->reg_state == NETREG_REGISTERED)
unregister_netdevice(tun->dev);
}
- if (tun)
- skb_array_cleanup(&tfile->tx_array);
+ tun_cleanup_tx_array(tfile);
sock_put(&tfile->sk);
}
}
@@ -616,11 +623,13 @@ static void tun_detach_all(struct net_de
/* Drop read queue */
tun_queue_purge(tfile);
sock_put(&tfile->sk);
+ tun_cleanup_tx_array(tfile);
}
list_for_each_entry_safe(tfile, tmp, &tun->disabled, next) {
tun_enable_queue(tfile);
tun_queue_purge(tfile);
sock_put(&tfile->sk);
+ tun_cleanup_tx_array(tfile);
}
BUG_ON(tun->numdisabled != 0);
@@ -2624,6 +2633,8 @@ static int tun_chr_open(struct inode *in
sock_set_flag(&tfile->sk, SOCK_ZEROCOPY);
+ memset(&tfile->tx_array, 0, sizeof(tfile->tx_array));
+
return 0;
}
Patches currently in stable-queue which might be from xiyou.wangcong(a)gmail.com are
queue-4.14/tipc-fix-a-memory-leak-in-tipc_nl_node_get_link.patch
queue-4.14/tun-fix-a-memory-leak-for-tfile-tx_array.patch
This is a note to let you know that I've just added the patch titled
vmxnet3: repair memory leak
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
vmxnet3-repair-memory-leak.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Neil Horman <nhorman(a)tuxdriver.com>
Date: Mon, 22 Jan 2018 16:06:37 -0500
Subject: vmxnet3: repair memory leak
From: Neil Horman <nhorman(a)tuxdriver.com>
[ Upstream commit 848b159835ddef99cc4193083f7e786c3992f580 ]
with the introduction of commit
b0eb57cb97e7837ebb746404c2c58c6f536f23fa, it appears that rq->buf_info
is improperly handled. While it is heap allocated when an rx queue is
setup, and freed when torn down, an old line of code in
vmxnet3_rq_destroy was not properly removed, leading to rq->buf_info[0]
being set to NULL prior to its being freed, causing a memory leak, which
eventually exhausts the system on repeated create/destroy operations
(for example, when the mtu of a vmxnet3 interface is changed
frequently.
Fix is pretty straight forward, just move the NULL set to after the
free.
Tested by myself with successful results
Applies to net, and should likely be queued for stable, please
Signed-off-by: Neil Horman <nhorman(a)tuxdriver.com>
Reported-By: boyang(a)redhat.com
CC: boyang(a)redhat.com
CC: Shrikrishna Khare <skhare(a)vmware.com>
CC: "VMware, Inc." <pv-drivers(a)vmware.com>
CC: David S. Miller <davem(a)davemloft.net>
Acked-by: Shrikrishna Khare <skhare(a)vmware.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/vmxnet3/vmxnet3_drv.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/vmxnet3/vmxnet3_drv.c
+++ b/drivers/net/vmxnet3/vmxnet3_drv.c
@@ -1616,7 +1616,6 @@ static void vmxnet3_rq_destroy(struct vm
rq->rx_ring[i].basePA);
rq->rx_ring[i].base = NULL;
}
- rq->buf_info[i] = NULL;
}
if (rq->data_ring.base) {
@@ -1638,6 +1637,7 @@ static void vmxnet3_rq_destroy(struct vm
(rq->rx_ring[0].size + rq->rx_ring[1].size);
dma_free_coherent(&adapter->pdev->dev, sz, rq->buf_info[0],
rq->buf_info_pa);
+ rq->buf_info[0] = rq->buf_info[1] = NULL;
}
}
Patches currently in stable-queue which might be from nhorman(a)tuxdriver.com are
queue-4.14/sctp-do-not-allow-the-v4-socket-to-bind-a-v4mapped-v6-address.patch
queue-4.14/vmxnet3-repair-memory-leak.patch
queue-4.14/sctp-reinit-stream-if-stream-outcnt-has-been-change-by-sinit-in-sendmsg.patch
queue-4.14/sctp-return-error-if-the-asoc-has-been-peeled-off-in-sctp_wait_for_sndbuf.patch
This is a note to let you know that I've just added the patch titled
tls: reset crypto_info when do_tls_setsockopt_tx fails
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tls-reset-crypto_info-when-do_tls_setsockopt_tx-fails.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Sabrina Dubroca <sd(a)queasysnail.net>
Date: Tue, 16 Jan 2018 16:04:28 +0100
Subject: tls: reset crypto_info when do_tls_setsockopt_tx fails
From: Sabrina Dubroca <sd(a)queasysnail.net>
[ Upstream commit 6db959c82eb039a151d95a0f8b7dea643657327a ]
The current code copies directly from userspace to ctx->crypto_send, but
doesn't always reinitialize it to 0 on failure. This causes any
subsequent attempt to use this setsockopt to fail because of the
TLS_CRYPTO_INFO_READY check, eventhough crypto_info is not actually
ready.
This should result in a correctly set up socket after the 3rd call, but
currently it does not:
size_t s = sizeof(struct tls12_crypto_info_aes_gcm_128);
struct tls12_crypto_info_aes_gcm_128 crypto_good = {
.info.version = TLS_1_2_VERSION,
.info.cipher_type = TLS_CIPHER_AES_GCM_128,
};
struct tls12_crypto_info_aes_gcm_128 crypto_bad_type = crypto_good;
crypto_bad_type.info.cipher_type = 42;
setsockopt(sock, SOL_TLS, TLS_TX, &crypto_bad_type, s);
setsockopt(sock, SOL_TLS, TLS_TX, &crypto_good, s - 1);
setsockopt(sock, SOL_TLS, TLS_TX, &crypto_good, s);
Fixes: 3c4d7559159b ("tls: kernel TLS support")
Signed-off-by: Sabrina Dubroca <sd(a)queasysnail.net>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/tls/tls_main.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -373,7 +373,7 @@ static int do_tls_setsockopt_tx(struct s
case TLS_CIPHER_AES_GCM_128: {
if (optlen != sizeof(struct tls12_crypto_info_aes_gcm_128)) {
rc = -EINVAL;
- goto out;
+ goto err_crypto_info;
}
rc = copy_from_user(
crypto_info,
@@ -388,7 +388,7 @@ static int do_tls_setsockopt_tx(struct s
}
default:
rc = -EINVAL;
- goto out;
+ goto err_crypto_info;
}
ctx->sk_write_space = sk->sk_write_space;
Patches currently in stable-queue which might be from sd(a)queasysnail.net are
queue-4.14/tls-reset-crypto_info-when-do_tls_setsockopt_tx-fails.patch
queue-4.14/tls-return-ebusy-if-crypto_info-is-already-set.patch
queue-4.14/tls-fix-sw_ctx-leak.patch
This is a note to let you know that I've just added the patch titled
tls: return -EBUSY if crypto_info is already set
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tls-return-ebusy-if-crypto_info-is-already-set.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Sabrina Dubroca <sd(a)queasysnail.net>
Date: Tue, 16 Jan 2018 16:04:27 +0100
Subject: tls: return -EBUSY if crypto_info is already set
From: Sabrina Dubroca <sd(a)queasysnail.net>
[ Upstream commit 877d17c79b66466942a836403773276e34fe3614 ]
do_tls_setsockopt_tx returns 0 without doing anything when crypto_info
is already set. Silent failure is confusing for users.
Fixes: 3c4d7559159b ("tls: kernel TLS support")
Signed-off-by: Sabrina Dubroca <sd(a)queasysnail.net>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/tls/tls_main.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -364,8 +364,10 @@ static int do_tls_setsockopt_tx(struct s
crypto_info = &ctx->crypto_send;
/* Currently we don't support set crypto info more than one time */
- if (TLS_CRYPTO_INFO_READY(crypto_info))
+ if (TLS_CRYPTO_INFO_READY(crypto_info)) {
+ rc = -EBUSY;
goto out;
+ }
switch (tmp_crypto_info.cipher_type) {
case TLS_CIPHER_AES_GCM_128: {
Patches currently in stable-queue which might be from sd(a)queasysnail.net are
queue-4.14/tls-reset-crypto_info-when-do_tls_setsockopt_tx-fails.patch
queue-4.14/tls-return-ebusy-if-crypto_info-is-already-set.patch
queue-4.14/tls-fix-sw_ctx-leak.patch
This is a note to let you know that I've just added the patch titled
tipc: fix a memory leak in tipc_nl_node_get_link()
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tipc-fix-a-memory-leak-in-tipc_nl_node_get_link.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Cong Wang <xiyou.wangcong(a)gmail.com>
Date: Wed, 10 Jan 2018 12:50:25 -0800
Subject: tipc: fix a memory leak in tipc_nl_node_get_link()
From: Cong Wang <xiyou.wangcong(a)gmail.com>
[ Upstream commit 59b36613e85fb16ebf9feaf914570879cd5c2a21 ]
When tipc_node_find_by_name() fails, the nlmsg is not
freed.
While on it, switch to a goto label to properly
free it.
Fixes: be9c086715c ("tipc: narrow down exposure of struct tipc_node")
Reported-by: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Jon Maloy <jon.maloy(a)ericsson.com>
Cc: Ying Xue <ying.xue(a)windriver.com>
Signed-off-by: Cong Wang <xiyou.wangcong(a)gmail.com>
Acked-by: Ying Xue <ying.xue(a)windriver.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/tipc/node.c | 26 ++++++++++++++------------
1 file changed, 14 insertions(+), 12 deletions(-)
--- a/net/tipc/node.c
+++ b/net/tipc/node.c
@@ -1848,36 +1848,38 @@ int tipc_nl_node_get_link(struct sk_buff
if (strcmp(name, tipc_bclink_name) == 0) {
err = tipc_nl_add_bc_link(net, &msg);
- if (err) {
- nlmsg_free(msg.skb);
- return err;
- }
+ if (err)
+ goto err_free;
} else {
int bearer_id;
struct tipc_node *node;
struct tipc_link *link;
node = tipc_node_find_by_name(net, name, &bearer_id);
- if (!node)
- return -EINVAL;
+ if (!node) {
+ err = -EINVAL;
+ goto err_free;
+ }
tipc_node_read_lock(node);
link = node->links[bearer_id].link;
if (!link) {
tipc_node_read_unlock(node);
- nlmsg_free(msg.skb);
- return -EINVAL;
+ err = -EINVAL;
+ goto err_free;
}
err = __tipc_nl_add_link(net, &msg, link, 0);
tipc_node_read_unlock(node);
- if (err) {
- nlmsg_free(msg.skb);
- return err;
- }
+ if (err)
+ goto err_free;
}
return genlmsg_reply(msg.skb, info);
+
+err_free:
+ nlmsg_free(msg.skb);
+ return err;
}
int tipc_nl_node_reset_link_stats(struct sk_buff *skb, struct genl_info *info)
Patches currently in stable-queue which might be from xiyou.wangcong(a)gmail.com are
queue-4.14/tipc-fix-a-memory-leak-in-tipc_nl_node_get_link.patch
queue-4.14/tun-fix-a-memory-leak-for-tfile-tx_array.patch
This is a note to let you know that I've just added the patch titled
tls: fix sw_ctx leak
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
tls-fix-sw_ctx-leak.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Sabrina Dubroca <sd(a)queasysnail.net>
Date: Tue, 16 Jan 2018 16:04:26 +0100
Subject: tls: fix sw_ctx leak
From: Sabrina Dubroca <sd(a)queasysnail.net>
[ Upstream commit cf6d43ef66f416282121f436ce1bee9a25199d52 ]
During setsockopt(SOL_TCP, TLS_TX), if initialization of the software
context fails in tls_set_sw_offload(), we leak sw_ctx. We also don't
reassign ctx->priv_ctx to NULL, so we can't even do another attempt to
set it up on the same socket, as it will fail with -EEXIST.
Fixes: 3c4d7559159b ('tls: kernel TLS support')
Signed-off-by: Sabrina Dubroca <sd(a)queasysnail.net>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/tls/tls_sw.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -697,18 +697,17 @@ int tls_set_sw_offload(struct sock *sk,
}
default:
rc = -EINVAL;
- goto out;
+ goto free_priv;
}
ctx->prepend_size = TLS_HEADER_SIZE + nonce_size;
ctx->tag_size = tag_size;
ctx->overhead_size = ctx->prepend_size + ctx->tag_size;
ctx->iv_size = iv_size;
- ctx->iv = kmalloc(iv_size + TLS_CIPHER_AES_GCM_128_SALT_SIZE,
- GFP_KERNEL);
+ ctx->iv = kmalloc(iv_size + TLS_CIPHER_AES_GCM_128_SALT_SIZE, GFP_KERNEL);
if (!ctx->iv) {
rc = -ENOMEM;
- goto out;
+ goto free_priv;
}
memcpy(ctx->iv, gcm_128_info->salt, TLS_CIPHER_AES_GCM_128_SALT_SIZE);
memcpy(ctx->iv + TLS_CIPHER_AES_GCM_128_SALT_SIZE, iv, iv_size);
@@ -756,7 +755,7 @@ int tls_set_sw_offload(struct sock *sk,
rc = crypto_aead_setauthsize(sw_ctx->aead_send, ctx->tag_size);
if (!rc)
- goto out;
+ return 0;
free_aead:
crypto_free_aead(sw_ctx->aead_send);
@@ -767,6 +766,9 @@ free_rec_seq:
free_iv:
kfree(ctx->iv);
ctx->iv = NULL;
+free_priv:
+ kfree(ctx->priv_ctx);
+ ctx->priv_ctx = NULL;
out:
return rc;
}
Patches currently in stable-queue which might be from sd(a)queasysnail.net are
queue-4.14/tls-reset-crypto_info-when-do_tls_setsockopt_tx-fails.patch
queue-4.14/tls-return-ebusy-if-crypto_info-is-already-set.patch
queue-4.14/tls-fix-sw_ctx-leak.patch
This is a note to let you know that I've just added the patch titled
sctp: reinit stream if stream outcnt has been change by sinit in sendmsg
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
sctp-reinit-stream-if-stream-outcnt-has-been-change-by-sinit-in-sendmsg.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Xin Long <lucien.xin(a)gmail.com>
Date: Mon, 15 Jan 2018 17:01:19 +0800
Subject: sctp: reinit stream if stream outcnt has been change by sinit in sendmsg
From: Xin Long <lucien.xin(a)gmail.com>
[ Upstream commit 625637bf4afa45204bd87e4218645182a919485a ]
After introducing sctp_stream structure, sctp uses stream->outcnt as the
out stream nums instead of c.sinit_num_ostreams.
However when users use sinit in cmsg, it only updates c.sinit_num_ostreams
in sctp_sendmsg. At that moment, stream->outcnt is still using previous
value. If it's value is not updated, the sinit_num_ostreams of sinit could
not really work.
This patch is to fix it by updating stream->outcnt and reiniting stream
if stream outcnt has been change by sinit in sendmsg.
Fixes: a83863174a61 ("sctp: prepare asoc stream for stream reconf")
Signed-off-by: Xin Long <lucien.xin(a)gmail.com>
Acked-by: Neil Horman <nhorman(a)tuxdriver.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/sctp/socket.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -1880,8 +1880,14 @@ static int sctp_sendmsg(struct sock *sk,
*/
if (sinit) {
if (sinit->sinit_num_ostreams) {
- asoc->c.sinit_num_ostreams =
- sinit->sinit_num_ostreams;
+ __u16 outcnt = sinit->sinit_num_ostreams;
+
+ asoc->c.sinit_num_ostreams = outcnt;
+ /* outcnt has been changed, so re-init stream */
+ err = sctp_stream_init(&asoc->stream, outcnt, 0,
+ GFP_KERNEL);
+ if (err)
+ goto out_free;
}
if (sinit->sinit_max_instreams) {
asoc->c.sinit_max_instreams =
Patches currently in stable-queue which might be from lucien.xin(a)gmail.com are
queue-4.14/sctp-do-not-allow-the-v4-socket-to-bind-a-v4mapped-v6-address.patch
queue-4.14/sctp-reinit-stream-if-stream-outcnt-has-been-change-by-sinit-in-sendmsg.patch
queue-4.14/pppoe-take-needed_headroom-of-lower-device-into-account-on-xmit.patch
queue-4.14/sctp-return-error-if-the-asoc-has-been-peeled-off-in-sctp_wait_for_sndbuf.patch
queue-4.14/netlink-reset-extack-earlier-in-netlink_rcv_skb.patch
This is a note to let you know that I've just added the patch titled
sctp: return error if the asoc has been peeled off in sctp_wait_for_sndbuf
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
sctp-return-error-if-the-asoc-has-been-peeled-off-in-sctp_wait_for_sndbuf.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Xin Long <lucien.xin(a)gmail.com>
Date: Mon, 15 Jan 2018 17:01:36 +0800
Subject: sctp: return error if the asoc has been peeled off in sctp_wait_for_sndbuf
From: Xin Long <lucien.xin(a)gmail.com>
[ Upstream commit a0ff660058b88d12625a783ce9e5c1371c87951f ]
After commit cea0cc80a677 ("sctp: use the right sk after waking up from
wait_buf sleep"), it may change to lock another sk if the asoc has been
peeled off in sctp_wait_for_sndbuf.
However, the asoc's new sk could be already closed elsewhere, as it's in
the sendmsg context of the old sk that can't avoid the new sk's closing.
If the sk's last one refcnt is held by this asoc, later on after putting
this asoc, the new sk will be freed, while under it's own lock.
This patch is to revert that commit, but fix the old issue by returning
error under the old sk's lock.
Fixes: cea0cc80a677 ("sctp: use the right sk after waking up from wait_buf sleep")
Reported-by: syzbot+ac6ea7baa4432811eb50(a)syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin(a)gmail.com>
Acked-by: Neil Horman <nhorman(a)tuxdriver.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/sctp/socket.c | 16 ++++++----------
1 file changed, 6 insertions(+), 10 deletions(-)
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -84,7 +84,7 @@
static int sctp_writeable(struct sock *sk);
static void sctp_wfree(struct sk_buff *skb);
static int sctp_wait_for_sndbuf(struct sctp_association *asoc, long *timeo_p,
- size_t msg_len, struct sock **orig_sk);
+ size_t msg_len);
static int sctp_wait_for_packet(struct sock *sk, int *err, long *timeo_p);
static int sctp_wait_for_connect(struct sctp_association *, long *timeo_p);
static int sctp_wait_for_accept(struct sock *sk, long timeo);
@@ -1961,7 +1961,7 @@ static int sctp_sendmsg(struct sock *sk,
timeo = sock_sndtimeo(sk, msg->msg_flags & MSG_DONTWAIT);
if (!sctp_wspace(asoc)) {
/* sk can be changed by peel off when waiting for buf. */
- err = sctp_wait_for_sndbuf(asoc, &timeo, msg_len, &sk);
+ err = sctp_wait_for_sndbuf(asoc, &timeo, msg_len);
if (err) {
if (err == -ESRCH) {
/* asoc is already dead. */
@@ -7825,12 +7825,12 @@ void sctp_sock_rfree(struct sk_buff *skb
/* Helper function to wait for space in the sndbuf. */
static int sctp_wait_for_sndbuf(struct sctp_association *asoc, long *timeo_p,
- size_t msg_len, struct sock **orig_sk)
+ size_t msg_len)
{
struct sock *sk = asoc->base.sk;
- int err = 0;
long current_timeo = *timeo_p;
DEFINE_WAIT(wait);
+ int err = 0;
pr_debug("%s: asoc:%p, timeo:%ld, msg_len:%zu\n", __func__, asoc,
*timeo_p, msg_len);
@@ -7859,17 +7859,13 @@ static int sctp_wait_for_sndbuf(struct s
release_sock(sk);
current_timeo = schedule_timeout(current_timeo);
lock_sock(sk);
- if (sk != asoc->base.sk) {
- release_sock(sk);
- sk = asoc->base.sk;
- lock_sock(sk);
- }
+ if (sk != asoc->base.sk)
+ goto do_error;
*timeo_p = current_timeo;
}
out:
- *orig_sk = sk;
finish_wait(&asoc->wait, &wait);
/* Release the association's refcnt. */
Patches currently in stable-queue which might be from lucien.xin(a)gmail.com are
queue-4.14/sctp-do-not-allow-the-v4-socket-to-bind-a-v4mapped-v6-address.patch
queue-4.14/sctp-reinit-stream-if-stream-outcnt-has-been-change-by-sinit-in-sendmsg.patch
queue-4.14/pppoe-take-needed_headroom-of-lower-device-into-account-on-xmit.patch
queue-4.14/sctp-return-error-if-the-asoc-has-been-peeled-off-in-sctp_wait_for_sndbuf.patch
queue-4.14/netlink-reset-extack-earlier-in-netlink_rcv_skb.patch
This is a note to let you know that I've just added the patch titled
sctp: do not allow the v4 socket to bind a v4mapped v6 address
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
sctp-do-not-allow-the-v4-socket-to-bind-a-v4mapped-v6-address.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Xin Long <lucien.xin(a)gmail.com>
Date: Mon, 15 Jan 2018 17:02:00 +0800
Subject: sctp: do not allow the v4 socket to bind a v4mapped v6 address
From: Xin Long <lucien.xin(a)gmail.com>
[ Upstream commit c5006b8aa74599ce19104b31d322d2ea9ff887cc ]
The check in sctp_sockaddr_af is not robust enough to forbid binding a
v4mapped v6 addr on a v4 socket.
The worse thing is that v4 socket's bind_verify would not convert this
v4mapped v6 addr to a v4 addr. syzbot even reported a crash as the v4
socket bound a v6 addr.
This patch is to fix it by doing the common sa.sa_family check first,
then AF_INET check for v4mapped v6 addrs.
Fixes: 7dab83de50c7 ("sctp: Support ipv6only AF_INET6 sockets.")
Reported-by: syzbot+7b7b518b1228d2743963(a)syzkaller.appspotmail.com
Acked-by: Neil Horman <nhorman(a)tuxdriver.com>
Signed-off-by: Xin Long <lucien.xin(a)gmail.com>
Acked-by: Marcelo Ricardo Leitner <marcelo.leitner(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/sctp/socket.c | 14 ++++++--------
1 file changed, 6 insertions(+), 8 deletions(-)
--- a/net/sctp/socket.c
+++ b/net/sctp/socket.c
@@ -334,16 +334,14 @@ static struct sctp_af *sctp_sockaddr_af(
if (len < sizeof (struct sockaddr))
return NULL;
+ if (!opt->pf->af_supported(addr->sa.sa_family, opt))
+ return NULL;
+
/* V4 mapped address are really of AF_INET family */
if (addr->sa.sa_family == AF_INET6 &&
- ipv6_addr_v4mapped(&addr->v6.sin6_addr)) {
- if (!opt->pf->af_supported(AF_INET, opt))
- return NULL;
- } else {
- /* Does this PF support this AF? */
- if (!opt->pf->af_supported(addr->sa.sa_family, opt))
- return NULL;
- }
+ ipv6_addr_v4mapped(&addr->v6.sin6_addr) &&
+ !opt->pf->af_supported(AF_INET, opt))
+ return NULL;
/* If we get this far, af is valid. */
af = sctp_get_af_specific(addr->sa.sa_family);
Patches currently in stable-queue which might be from lucien.xin(a)gmail.com are
queue-4.14/sctp-do-not-allow-the-v4-socket-to-bind-a-v4mapped-v6-address.patch
queue-4.14/sctp-reinit-stream-if-stream-outcnt-has-been-change-by-sinit-in-sendmsg.patch
queue-4.14/pppoe-take-needed_headroom-of-lower-device-into-account-on-xmit.patch
queue-4.14/sctp-return-error-if-the-asoc-has-been-peeled-off-in-sctp_wait_for_sndbuf.patch
queue-4.14/netlink-reset-extack-earlier-in-netlink_rcv_skb.patch
This is a note to let you know that I've just added the patch titled
r8169: fix memory corruption on retrieval of hardware statistics.
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
r8169-fix-memory-corruption-on-retrieval-of-hardware-statistics.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Francois Romieu <romieu(a)fr.zoreil.com>
Date: Fri, 26 Jan 2018 01:53:26 +0100
Subject: r8169: fix memory corruption on retrieval of hardware statistics.
From: Francois Romieu <romieu(a)fr.zoreil.com>
[ Upstream commit a78e93661c5fd30b9e1dee464b2f62f966883ef7 ]
Hardware statistics retrieval hurts in tight invocation loops.
Avoid extraneous write and enforce strict ordering of writes targeted to
the tally counters dump area address registers.
Signed-off-by: Francois Romieu <romieu(a)fr.zoreil.com>
Tested-by: Oliver Freyermuth <o.freyermuth(a)googlemail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/realtek/r8169.c | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -2239,19 +2239,14 @@ static bool rtl8169_do_counters(struct n
void __iomem *ioaddr = tp->mmio_addr;
dma_addr_t paddr = tp->counters_phys_addr;
u32 cmd;
- bool ret;
RTL_W32(CounterAddrHigh, (u64)paddr >> 32);
+ RTL_R32(CounterAddrHigh);
cmd = (u64)paddr & DMA_BIT_MASK(32);
RTL_W32(CounterAddrLow, cmd);
RTL_W32(CounterAddrLow, cmd | counter_cmd);
- ret = rtl_udelay_loop_wait_low(tp, &rtl_counters_cond, 10, 1000);
-
- RTL_W32(CounterAddrLow, 0);
- RTL_W32(CounterAddrHigh, 0);
-
- return ret;
+ return rtl_udelay_loop_wait_low(tp, &rtl_counters_cond, 10, 1000);
}
static bool rtl8169_reset_counters(struct net_device *dev)
Patches currently in stable-queue which might be from romieu(a)fr.zoreil.com are
queue-4.14/r8169-fix-memory-corruption-on-retrieval-of-hardware-statistics.patch
This is a note to let you know that I've just added the patch titled
pppoe: take ->needed_headroom of lower device into account on xmit
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
pppoe-take-needed_headroom-of-lower-device-into-account-on-xmit.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Guillaume Nault <g.nault(a)alphalink.fr>
Date: Mon, 22 Jan 2018 18:06:37 +0100
Subject: pppoe: take ->needed_headroom of lower device into account on xmit
From: Guillaume Nault <g.nault(a)alphalink.fr>
[ Upstream commit 02612bb05e51df8489db5e94d0cf8d1c81f87b0c ]
In pppoe_sendmsg(), reserving dev->hard_header_len bytes of headroom
was probably fine before the introduction of ->needed_headroom in
commit f5184d267c1a ("net: Allow netdevices to specify needed head/tailroom").
But now, virtual devices typically advertise the size of their overhead
in dev->needed_headroom, so we must also take it into account in
skb_reserve().
Allocation size of skb is also updated to take dev->needed_tailroom
into account and replace the arbitrary 32 bytes with the real size of
a PPPoE header.
This issue was discovered by syzbot, who connected a pppoe socket to a
gre device which had dev->header_ops->create == ipgre_header and
dev->hard_header_len == 0. Therefore, PPPoE didn't reserve any
headroom, and dev_hard_header() crashed when ipgre_header() tried to
prepend its header to skb->data.
skbuff: skb_under_panic: text:000000001d390b3a len:31 put:24
head:00000000d8ed776f data:000000008150e823 tail:0x7 end:0xc0 dev:gre0
------------[ cut here ]------------
kernel BUG at net/core/skbuff.c:104!
invalid opcode: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in:
CPU: 1 PID: 3670 Comm: syzkaller801466 Not tainted
4.15.0-rc7-next-20180115+ #97
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
RIP: 0010:skb_panic+0x162/0x1f0 net/core/skbuff.c:100
RSP: 0018:ffff8801d9bd7840 EFLAGS: 00010282
RAX: 0000000000000083 RBX: ffff8801d4f083c0 RCX: 0000000000000000
RDX: 0000000000000083 RSI: 1ffff1003b37ae92 RDI: ffffed003b37aefc
RBP: ffff8801d9bd78a8 R08: 1ffff1003b37ae8a R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: ffffffff86200de0
R13: ffffffff84a981ad R14: 0000000000000018 R15: ffff8801d2d34180
FS: 00000000019c4880(0000) GS:ffff8801db300000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00000000208bc000 CR3: 00000001d9111001 CR4: 00000000001606e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
skb_under_panic net/core/skbuff.c:114 [inline]
skb_push+0xce/0xf0 net/core/skbuff.c:1714
ipgre_header+0x6d/0x4e0 net/ipv4/ip_gre.c:879
dev_hard_header include/linux/netdevice.h:2723 [inline]
pppoe_sendmsg+0x58e/0x8b0 drivers/net/ppp/pppoe.c:890
sock_sendmsg_nosec net/socket.c:630 [inline]
sock_sendmsg+0xca/0x110 net/socket.c:640
sock_write_iter+0x31a/0x5d0 net/socket.c:909
call_write_iter include/linux/fs.h:1775 [inline]
do_iter_readv_writev+0x525/0x7f0 fs/read_write.c:653
do_iter_write+0x154/0x540 fs/read_write.c:932
vfs_writev+0x18a/0x340 fs/read_write.c:977
do_writev+0xfc/0x2a0 fs/read_write.c:1012
SYSC_writev fs/read_write.c:1085 [inline]
SyS_writev+0x27/0x30 fs/read_write.c:1082
entry_SYSCALL_64_fastpath+0x29/0xa0
Admittedly PPPoE shouldn't be allowed to run on non Ethernet-like
interfaces, but reserving space for ->needed_headroom is a more
fundamental issue that needs to be addressed first.
Same problem exists for __pppoe_xmit(), which also needs to take
dev->needed_headroom into account in skb_cow_head().
Fixes: f5184d267c1a ("net: Allow netdevices to specify needed head/tailroom")
Reported-by: syzbot+ed0838d0fa4c4f2b528e20286e6dc63effc7c14d(a)syzkaller.appspotmail.com
Signed-off-by: Guillaume Nault <g.nault(a)alphalink.fr>
Reviewed-by: Xin Long <lucien.xin(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ppp/pppoe.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
--- a/drivers/net/ppp/pppoe.c
+++ b/drivers/net/ppp/pppoe.c
@@ -842,6 +842,7 @@ static int pppoe_sendmsg(struct socket *
struct pppoe_hdr *ph;
struct net_device *dev;
char *start;
+ int hlen;
lock_sock(sk);
if (sock_flag(sk, SOCK_DEAD) || !(sk->sk_state & PPPOX_CONNECTED)) {
@@ -860,16 +861,16 @@ static int pppoe_sendmsg(struct socket *
if (total_len > (dev->mtu + dev->hard_header_len))
goto end;
-
- skb = sock_wmalloc(sk, total_len + dev->hard_header_len + 32,
- 0, GFP_KERNEL);
+ hlen = LL_RESERVED_SPACE(dev);
+ skb = sock_wmalloc(sk, hlen + sizeof(*ph) + total_len +
+ dev->needed_tailroom, 0, GFP_KERNEL);
if (!skb) {
error = -ENOMEM;
goto end;
}
/* Reserve space for headers. */
- skb_reserve(skb, dev->hard_header_len);
+ skb_reserve(skb, hlen);
skb_reset_network_header(skb);
skb->dev = dev;
@@ -930,7 +931,7 @@ static int __pppoe_xmit(struct sock *sk,
/* Copy the data if there is no space for the header or if it's
* read-only.
*/
- if (skb_cow_head(skb, sizeof(*ph) + dev->hard_header_len))
+ if (skb_cow_head(skb, LL_RESERVED_SPACE(dev) + sizeof(*ph)))
goto abort;
__skb_push(skb, sizeof(*ph));
Patches currently in stable-queue which might be from g.nault(a)alphalink.fr are
queue-4.14/pppoe-take-needed_headroom-of-lower-device-into-account-on-xmit.patch
queue-4.14/ppp-unlock-all_ppp_mutex-before-registering-device.patch
This is a note to let you know that I've just added the patch titled
ppp: unlock all_ppp_mutex before registering device
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ppp-unlock-all_ppp_mutex-before-registering-device.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Guillaume Nault <g.nault(a)alphalink.fr>
Date: Wed, 10 Jan 2018 16:24:45 +0100
Subject: ppp: unlock all_ppp_mutex before registering device
From: Guillaume Nault <g.nault(a)alphalink.fr>
[ Upstream commit 0171c41835591e9aa2e384b703ef9a6ae367c610 ]
ppp_dev_uninit(), which is the .ndo_uninit() handler of PPP devices,
needs to lock pn->all_ppp_mutex. Therefore we mustn't call
register_netdevice() with pn->all_ppp_mutex already locked, or we'd
deadlock in case register_netdevice() fails and calls .ndo_uninit().
Fortunately, we can unlock pn->all_ppp_mutex before calling
register_netdevice(). This lock protects pn->units_idr, which isn't
used in the device registration process.
However, keeping pn->all_ppp_mutex locked during device registration
did ensure that no device in transient state would be published in
pn->units_idr. In practice, unlocking it before calling
register_netdevice() doesn't change this property: ppp_unit_register()
is called with 'ppp_mutex' locked and all searches done in
pn->units_idr hold this lock too.
Fixes: 8cb775bc0a34 ("ppp: fix device unregistration upon netns deletion")
Reported-and-tested-by: syzbot+367889b9c9e279219175(a)syzkaller.appspotmail.com
Signed-off-by: Guillaume Nault <g.nault(a)alphalink.fr>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ppp/ppp_generic.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
--- a/drivers/net/ppp/ppp_generic.c
+++ b/drivers/net/ppp/ppp_generic.c
@@ -1003,17 +1003,18 @@ static int ppp_unit_register(struct ppp
if (!ifname_is_set)
snprintf(ppp->dev->name, IFNAMSIZ, "ppp%i", ppp->file.index);
+ mutex_unlock(&pn->all_ppp_mutex);
+
ret = register_netdevice(ppp->dev);
if (ret < 0)
goto err_unit;
atomic_inc(&ppp_unit_count);
- mutex_unlock(&pn->all_ppp_mutex);
-
return 0;
err_unit:
+ mutex_lock(&pn->all_ppp_mutex);
unit_put(&pn->units_idr, ppp->file.index);
err:
mutex_unlock(&pn->all_ppp_mutex);
Patches currently in stable-queue which might be from g.nault(a)alphalink.fr are
queue-4.14/pppoe-take-needed_headroom-of-lower-device-into-account-on-xmit.patch
queue-4.14/ppp-unlock-all_ppp_mutex-before-registering-device.patch
This is a note to let you know that I've just added the patch titled
nfp: use the correct index for link speed table
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
nfp-use-the-correct-index-for-link-speed-table.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Jakub Kicinski <jakub.kicinski(a)netronome.com>
Date: Mon, 15 Jan 2018 11:47:53 -0800
Subject: nfp: use the correct index for link speed table
From: Jakub Kicinski <jakub.kicinski(a)netronome.com>
[ Upstream commit 0d9c9f0f40ca262b67fc06a702b85f3976f5e1a1 ]
sts variable is holding link speed as well as state. We should
be using ls to index into ls_to_ethtool.
Fixes: 265aeb511bd5 ("nfp: add support for .get_link_ksettings()")
Signed-off-by: Jakub Kicinski <jakub.kicinski(a)netronome.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
@@ -306,7 +306,7 @@ nfp_net_get_link_ksettings(struct net_de
ls >= ARRAY_SIZE(ls_to_ethtool))
return 0;
- cmd->base.speed = ls_to_ethtool[sts];
+ cmd->base.speed = ls_to_ethtool[ls];
cmd->base.duplex = DUPLEX_FULL;
return 0;
Patches currently in stable-queue which might be from jakub.kicinski(a)netronome.com are
queue-4.14/nfp-use-the-correct-index-for-link-speed-table.patch
This is a note to let you know that I've just added the patch titled
netlink: reset extack earlier in netlink_rcv_skb
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
netlink-reset-extack-earlier-in-netlink_rcv_skb.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Xin Long <lucien.xin(a)gmail.com>
Date: Thu, 18 Jan 2018 14:48:03 +0800
Subject: netlink: reset extack earlier in netlink_rcv_skb
From: Xin Long <lucien.xin(a)gmail.com>
[ Upstream commit cd443f1e91ca600a092e780e8250cd6a2954b763 ]
Move up the extack reset/initialization in netlink_rcv_skb, so that
those 'goto ack' will not skip it. Otherwise, later on netlink_ack
may use the uninitialized extack and cause kernel crash.
Fixes: cbbdf8433a5f ("netlink: extack needs to be reset each time through loop")
Reported-by: syzbot+03bee3680a37466775e7(a)syzkaller.appspotmail.com
Signed-off-by: Xin Long <lucien.xin(a)gmail.com>
Acked-by: David Ahern <dsahern(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/netlink/af_netlink.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2400,6 +2400,7 @@ int netlink_rcv_skb(struct sk_buff *skb,
while (skb->len >= nlmsg_total_size(0)) {
int msglen;
+ memset(&extack, 0, sizeof(extack));
nlh = nlmsg_hdr(skb);
err = 0;
@@ -2414,7 +2415,6 @@ int netlink_rcv_skb(struct sk_buff *skb,
if (nlh->nlmsg_type < NLMSG_MIN_TYPE)
goto ack;
- memset(&extack, 0, sizeof(extack));
err = cb(skb, nlh, &extack);
if (err == -EINTR)
goto skip;
Patches currently in stable-queue which might be from lucien.xin(a)gmail.com are
queue-4.14/sctp-do-not-allow-the-v4-socket-to-bind-a-v4mapped-v6-address.patch
queue-4.14/sctp-reinit-stream-if-stream-outcnt-has-been-change-by-sinit-in-sendmsg.patch
queue-4.14/pppoe-take-needed_headroom-of-lower-device-into-account-on-xmit.patch
queue-4.14/sctp-return-error-if-the-asoc-has-been-peeled-off-in-sctp_wait_for_sndbuf.patch
queue-4.14/netlink-reset-extack-earlier-in-netlink_rcv_skb.patch
This is a note to let you know that I've just added the patch titled
netlink: extack needs to be reset each time through loop
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
netlink-extack-needs-to-be-reset-each-time-through-loop.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: David Ahern <dsahern(a)gmail.com>
Date: Wed, 10 Jan 2018 13:00:39 -0800
Subject: netlink: extack needs to be reset each time through loop
From: David Ahern <dsahern(a)gmail.com>
[ Upstream commit cbbdf8433a5f117b1a2119ea30fc651b61ef7570 ]
syzbot triggered the WARN_ON in netlink_ack testing the bad_attr value.
The problem is that netlink_rcv_skb loops over the skb repeatedly invoking
the callback and without resetting the extack leaving potentially stale
data. Initializing each time through avoids the WARN_ON.
Fixes: 2d4bc93368f5a ("netlink: extended ACK reporting")
Reported-by: syzbot+315fa6766d0f7c359327(a)syzkaller.appspotmail.com
Signed-off-by: David Ahern <dsahern(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/netlink/af_netlink.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -2393,7 +2393,7 @@ int netlink_rcv_skb(struct sk_buff *skb,
struct nlmsghdr *,
struct netlink_ext_ack *))
{
- struct netlink_ext_ack extack = {};
+ struct netlink_ext_ack extack;
struct nlmsghdr *nlh;
int err;
@@ -2414,6 +2414,7 @@ int netlink_rcv_skb(struct sk_buff *skb,
if (nlh->nlmsg_type < NLMSG_MIN_TYPE)
goto ack;
+ memset(&extack, 0, sizeof(extack));
err = cb(skb, nlh, &extack);
if (err == -EINTR)
goto skip;
Patches currently in stable-queue which might be from dsahern(a)gmail.com are
queue-4.14/net-ipv4-make-ip-route-get-match-iif-lo-rules-again.patch
queue-4.14/net-vrf-add-support-for-sends-to-local-broadcast-address.patch
queue-4.14/netlink-extack-needs-to-be-reset-each-time-through-loop.patch
queue-4.14/netlink-reset-extack-earlier-in-netlink_rcv_skb.patch
This is a note to let you know that I've just added the patch titled
net: vrf: Add support for sends to local broadcast address
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-vrf-add-support-for-sends-to-local-broadcast-address.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: David Ahern <dsahern(a)gmail.com>
Date: Wed, 24 Jan 2018 19:37:37 -0800
Subject: net: vrf: Add support for sends to local broadcast address
From: David Ahern <dsahern(a)gmail.com>
[ Upstream commit 1e19c4d689dc1e95bafd23ef68fbc0c6b9e05180 ]
Sukumar reported that sends to the local broadcast address
(255.255.255.255) are broken. Check for the address in vrf driver
and do not redirect to the VRF device - similar to multicast
packets.
With this change sockets can use SO_BINDTODEVICE to specify an
egress interface and receive responses. Note: the egress interface
can not be a VRF device but needs to be the enslaved device.
https://bugzilla.kernel.org/show_bug.cgi?id=198521
Reported-by: Sukumar Gopalakrishnan <sukumarg1973(a)gmail.com>
Signed-off-by: David Ahern <dsahern(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/vrf.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
--- a/drivers/net/vrf.c
+++ b/drivers/net/vrf.c
@@ -674,8 +674,9 @@ static struct sk_buff *vrf_ip_out(struct
struct sock *sk,
struct sk_buff *skb)
{
- /* don't divert multicast */
- if (ipv4_is_multicast(ip_hdr(skb)->daddr))
+ /* don't divert multicast or local broadcast */
+ if (ipv4_is_multicast(ip_hdr(skb)->daddr) ||
+ ipv4_is_lbcast(ip_hdr(skb)->daddr))
return skb;
if (qdisc_tx_is_default(vrf_dev))
Patches currently in stable-queue which might be from dsahern(a)gmail.com are
queue-4.14/net-ipv4-make-ip-route-get-match-iif-lo-rules-again.patch
queue-4.14/net-vrf-add-support-for-sends-to-local-broadcast-address.patch
queue-4.14/netlink-extack-needs-to-be-reset-each-time-through-loop.patch
queue-4.14/netlink-reset-extack-earlier-in-netlink_rcv_skb.patch
This is a note to let you know that I've just added the patch titled
net/tls: Only attach to sockets in ESTABLISHED state
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-tls-only-attach-to-sockets-in-established-state.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Ilya Lesokhin <ilyal(a)mellanox.com>
Date: Tue, 16 Jan 2018 15:31:52 +0200
Subject: net/tls: Only attach to sockets in ESTABLISHED state
From: Ilya Lesokhin <ilyal(a)mellanox.com>
[ Upstream commit d91c3e17f75f218022140dee18cf515292184a8f ]
Calling accept on a TCP socket with a TLS ulp attached results
in two sockets that share the same ulp context.
The ulp context is freed while a socket is destroyed, so
after one of the sockets is released, the second second will
trigger a use after free when it tries to access the ulp context
attached to it.
We restrict the TLS ulp to sockets in ESTABLISHED state
to prevent the scenario above.
Fixes: 3c4d7559159b ("tls: kernel TLS support")
Reported-by: syzbot+904e7cd6c5c741609228(a)syzkaller.appspotmail.com
Signed-off-by: Ilya Lesokhin <ilyal(a)mellanox.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/tls/tls_main.c | 9 +++++++++
1 file changed, 9 insertions(+)
--- a/net/tls/tls_main.c
+++ b/net/tls/tls_main.c
@@ -444,6 +444,15 @@ static int tls_init(struct sock *sk)
struct tls_context *ctx;
int rc = 0;
+ /* The TLS ulp is currently supported only for TCP sockets
+ * in ESTABLISHED state.
+ * Supporting sockets in LISTEN state will require us
+ * to modify the accept implementation to clone rather then
+ * share the ulp context.
+ */
+ if (sk->sk_state != TCP_ESTABLISHED)
+ return -ENOTSUPP;
+
/* allocate tls context */
ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
if (!ctx) {
Patches currently in stable-queue which might be from ilyal(a)mellanox.com are
queue-4.14/net-tls-only-attach-to-sockets-in-established-state.patch
This is a note to let you know that I've just added the patch titled
net/tls: Fix inverted error codes to avoid endless loop
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-tls-fix-inverted-error-codes-to-avoid-endless-loop.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: "r.hering(a)avm.de" <r.hering(a)avm.de>
Date: Fri, 12 Jan 2018 15:42:06 +0100
Subject: net/tls: Fix inverted error codes to avoid endless loop
From: "r.hering(a)avm.de" <r.hering(a)avm.de>
[ Upstream commit 30be8f8dba1bd2aff73e8447d59228471233a3d4 ]
sendfile() calls can hang endless with using Kernel TLS if a socket error occurs.
Socket error codes must be inverted by Kernel TLS before returning because
they are stored with positive sign. If returned non-inverted they are
interpreted as number of bytes sent, causing endless looping of the
splice mechanic behind sendfile().
Signed-off-by: Robert Hering <r.hering(a)avm.de>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/net/tls.h | 2 +-
net/tls/tls_sw.c | 4 ++--
2 files changed, 3 insertions(+), 3 deletions(-)
--- a/include/net/tls.h
+++ b/include/net/tls.h
@@ -168,7 +168,7 @@ static inline bool tls_is_pending_open_r
static inline void tls_err_abort(struct sock *sk)
{
- sk->sk_err = -EBADMSG;
+ sk->sk_err = EBADMSG;
sk->sk_error_report(sk);
}
--- a/net/tls/tls_sw.c
+++ b/net/tls/tls_sw.c
@@ -407,7 +407,7 @@ int tls_sw_sendmsg(struct sock *sk, stru
while (msg_data_left(msg)) {
if (sk->sk_err) {
- ret = sk->sk_err;
+ ret = -sk->sk_err;
goto send_end;
}
@@ -560,7 +560,7 @@ int tls_sw_sendpage(struct sock *sk, str
size_t copy, required_size;
if (sk->sk_err) {
- ret = sk->sk_err;
+ ret = -sk->sk_err;
goto sendpage_end;
}
Patches currently in stable-queue which might be from r.hering(a)avm.de are
queue-4.14/net-tls-fix-inverted-error-codes-to-avoid-endless-loop.patch
This is a note to let you know that I've just added the patch titled
net: tcp: close sock if net namespace is exiting
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-tcp-close-sock-if-net-namespace-is-exiting.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Dan Streetman <ddstreet(a)ieee.org>
Date: Thu, 18 Jan 2018 16:14:26 -0500
Subject: net: tcp: close sock if net namespace is exiting
From: Dan Streetman <ddstreet(a)ieee.org>
[ Upstream commit 4ee806d51176ba7b8ff1efd81f271d7252e03a1d ]
When a tcp socket is closed, if it detects that its net namespace is
exiting, close immediately and do not wait for FIN sequence.
For normal sockets, a reference is taken to their net namespace, so it will
never exit while the socket is open. However, kernel sockets do not take a
reference to their net namespace, so it may begin exiting while the kernel
socket is still open. In this case if the kernel socket is a tcp socket,
it will stay open trying to complete its close sequence. The sock's dst(s)
hold a reference to their interface, which are all transferred to the
namespace's loopback interface when the real interfaces are taken down.
When the namespace tries to take down its loopback interface, it hangs
waiting for all references to the loopback interface to release, which
results in messages like:
unregister_netdevice: waiting for lo to become free. Usage count = 1
These messages continue until the socket finally times out and closes.
Since the net namespace cleanup holds the net_mutex while calling its
registered pernet callbacks, any new net namespace initialization is
blocked until the current net namespace finishes exiting.
After this change, the tcp socket notices the exiting net namespace, and
closes immediately, releasing its dst(s) and their reference to the
loopback interface, which lets the net namespace continue exiting.
Link: https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1711407
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=97811
Signed-off-by: Dan Streetman <ddstreet(a)canonical.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/net/net_namespace.h | 10 ++++++++++
net/ipv4/tcp.c | 3 +++
net/ipv4/tcp_timer.c | 15 +++++++++++++++
3 files changed, 28 insertions(+)
--- a/include/net/net_namespace.h
+++ b/include/net/net_namespace.h
@@ -223,6 +223,11 @@ int net_eq(const struct net *net1, const
return net1 == net2;
}
+static inline int check_net(const struct net *net)
+{
+ return atomic_read(&net->count) != 0;
+}
+
void net_drop_ns(void *);
#else
@@ -246,6 +251,11 @@ int net_eq(const struct net *net1, const
{
return 1;
}
+
+static inline int check_net(const struct net *net)
+{
+ return 1;
+}
#define net_drop_ns NULL
#endif
--- a/net/ipv4/tcp.c
+++ b/net/ipv4/tcp.c
@@ -2273,6 +2273,9 @@ adjudge_to_death:
tcp_send_active_reset(sk, GFP_ATOMIC);
__NET_INC_STATS(sock_net(sk),
LINUX_MIB_TCPABORTONMEMORY);
+ } else if (!check_net(sock_net(sk))) {
+ /* Not possible to send reset; just close */
+ tcp_set_state(sk, TCP_CLOSE);
}
}
--- a/net/ipv4/tcp_timer.c
+++ b/net/ipv4/tcp_timer.c
@@ -50,11 +50,19 @@ static void tcp_write_err(struct sock *s
* to prevent DoS attacks. It is called when a retransmission timeout
* or zero probe timeout occurs on orphaned socket.
*
+ * Also close if our net namespace is exiting; in that case there is no
+ * hope of ever communicating again since all netns interfaces are already
+ * down (or about to be down), and we need to release our dst references,
+ * which have been moved to the netns loopback interface, so the namespace
+ * can finish exiting. This condition is only possible if we are a kernel
+ * socket, as those do not hold references to the namespace.
+ *
* Criteria is still not confirmed experimentally and may change.
* We kill the socket, if:
* 1. If number of orphaned sockets exceeds an administratively configured
* limit.
* 2. If we have strong memory pressure.
+ * 3. If our net namespace is exiting.
*/
static int tcp_out_of_resources(struct sock *sk, bool do_reset)
{
@@ -83,6 +91,13 @@ static int tcp_out_of_resources(struct s
__NET_INC_STATS(sock_net(sk), LINUX_MIB_TCPABORTONMEMORY);
return 1;
}
+
+ if (!check_net(sock_net(sk))) {
+ /* Not possible to send reset; just close */
+ tcp_done(sk);
+ return 1;
+ }
+
return 0;
}
Patches currently in stable-queue which might be from ddstreet(a)ieee.org are
queue-4.14/net-tcp-close-sock-if-net-namespace-is-exiting.patch
This is a note to let you know that I've just added the patch titled
net: qdisc_pkt_len_init() should be more robust
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-qdisc_pkt_len_init-should-be-more-robust.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Eric Dumazet <edumazet(a)google.com>
Date: Thu, 18 Jan 2018 19:59:19 -0800
Subject: net: qdisc_pkt_len_init() should be more robust
From: Eric Dumazet <edumazet(a)google.com>
[ Upstream commit 7c68d1a6b4db9012790af7ac0f0fdc0d2083422a ]
Without proper validation of DODGY packets, we might very well
feed qdisc_pkt_len_init() with invalid GSO packets.
tcp_hdrlen() might access out-of-bound data, so let's use
skb_header_pointer() and proper checks.
Whole story is described in commit d0c081b49137 ("flow_dissector:
properly cap thoff field")
We have the goal of validating DODGY packets earlier in the stack,
so we might very well revert this fix in the future.
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Cc: Willem de Bruijn <willemb(a)google.com>
Cc: Jason Wang <jasowang(a)redhat.com>
Reported-by: syzbot+9da69ebac7dddd804552(a)syzkaller.appspotmail.com
Acked-by: Jason Wang <jasowang(a)redhat.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/core/dev.c | 19 +++++++++++++++----
1 file changed, 15 insertions(+), 4 deletions(-)
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -3128,10 +3128,21 @@ static void qdisc_pkt_len_init(struct sk
hdr_len = skb_transport_header(skb) - skb_mac_header(skb);
/* + transport layer */
- if (likely(shinfo->gso_type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6)))
- hdr_len += tcp_hdrlen(skb);
- else
- hdr_len += sizeof(struct udphdr);
+ if (likely(shinfo->gso_type & (SKB_GSO_TCPV4 | SKB_GSO_TCPV6))) {
+ const struct tcphdr *th;
+ struct tcphdr _tcphdr;
+
+ th = skb_header_pointer(skb, skb_transport_offset(skb),
+ sizeof(_tcphdr), &_tcphdr);
+ if (likely(th))
+ hdr_len += __tcp_hdrlen(th);
+ } else {
+ struct udphdr _udphdr;
+
+ if (skb_header_pointer(skb, skb_transport_offset(skb),
+ sizeof(_udphdr), &_udphdr))
+ hdr_len += sizeof(struct udphdr);
+ }
if (shinfo->gso_type & SKB_GSO_DODGY)
gso_segs = DIV_ROUND_UP(skb->len - hdr_len,
Patches currently in stable-queue which might be from edumazet(a)google.com are
queue-4.14/ipv6-fix-udpv6-sendmsg-crash-caused-by-too-small-mtu.patch
queue-4.14/flow_dissector-properly-cap-thoff-field.patch
queue-4.14/ipv6-ip6_make_skb-needs-to-clear-cork.base.dst.patch
queue-4.14/dccp-don-t-restart-ccid2_hc_tx_rto_expire-if-sk-in-closed-state.patch
queue-4.14/net-qdisc_pkt_len_init-should-be-more-robust.patch
This is a note to let you know that I've just added the patch titled
net/mlx5: Fix get vector affinity helper function
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-mlx5-fix-get-vector-affinity-helper-function.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Saeed Mahameed <saeedm(a)mellanox.com>
Date: Thu, 4 Jan 2018 04:35:51 +0200
Subject: net/mlx5: Fix get vector affinity helper function
From: Saeed Mahameed <saeedm(a)mellanox.com>
[ Upstream commit 05e0cc84e00c54fb152d1f4b86bc211823a83d0c ]
mlx5_get_vector_affinity used to call pci_irq_get_affinity and after
reverting the patch that sets the device affinity via PCI_IRQ_AFFINITY
API, calling pci_irq_get_affinity becomes useless and it breaks RDMA
mlx5 users. To fix this, this patch provides an alternative way to
retrieve IRQ vector affinity using legacy IRQ API, following
smp_affinity read procfs implementation.
Fixes: 231243c82793 ("Revert mlx5: move affinity hints assignments to generic code")
Fixes: a435393acafb ("mlx5: move affinity hints assignments to generic code")
Cc: Sagi Grimberg <sagi(a)grimberg.me>
Signed-off-by: Saeed Mahameed <saeedm(a)mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/linux/mlx5/driver.h | 19 ++++++++++++++++++-
1 file changed, 18 insertions(+), 1 deletion(-)
--- a/include/linux/mlx5/driver.h
+++ b/include/linux/mlx5/driver.h
@@ -36,6 +36,7 @@
#include <linux/kernel.h>
#include <linux/completion.h>
#include <linux/pci.h>
+#include <linux/irq.h>
#include <linux/spinlock_types.h>
#include <linux/semaphore.h>
#include <linux/slab.h>
@@ -1194,7 +1195,23 @@ enum {
static inline const struct cpumask *
mlx5_get_vector_affinity(struct mlx5_core_dev *dev, int vector)
{
- return pci_irq_get_affinity(dev->pdev, MLX5_EQ_VEC_COMP_BASE + vector);
+ const struct cpumask *mask;
+ struct irq_desc *desc;
+ unsigned int irq;
+ int eqn;
+ int err;
+
+ err = mlx5_vector2eqn(dev, vector, &eqn, &irq);
+ if (err)
+ return NULL;
+
+ desc = irq_to_desc(irq);
+#ifdef CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK
+ mask = irq_data_get_effective_affinity_mask(&desc->irq_data);
+#else
+ mask = desc->irq_common_data.affinity;
+#endif
+ return mask;
}
#endif /* MLX5_DRIVER_H */
Patches currently in stable-queue which might be from saeedm(a)mellanox.com are
queue-4.14/net-mlx5-fix-get-vector-affinity-helper-function.patch
queue-4.14/net-ib-mlx5-don-t-disable-local-loopback-multicast-traffic-when-needed.patch
queue-4.14/net-mlx5e-fix-fixpoint-divide-exception-in-mlx5e_am_stats_compare.patch
This is a note to let you know that I've just added the patch titled
net/mlx5e: Fix fixpoint divide exception in mlx5e_am_stats_compare
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-mlx5e-fix-fixpoint-divide-exception-in-mlx5e_am_stats_compare.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Talat Batheesh <talatb(a)mellanox.com>
Date: Sun, 21 Jan 2018 05:30:42 +0200
Subject: net/mlx5e: Fix fixpoint divide exception in mlx5e_am_stats_compare
From: Talat Batheesh <talatb(a)mellanox.com>
[ Upstream commit e58edaa4863583b54409444f11b4f80dff0af1cd ]
Helmut reported a bug about division by zero while
running traffic and doing physical cable pull test.
When the cable unplugged the ppms become zero, so when
dividing the current ppms by the previous ppms in the
next dim iteration there is division by zero.
This patch prevent this division for both ppms and epms.
Fixes: c3164d2fc48f ("net/mlx5e: Added BW check for DIM decision mechanism")
Reported-by: Helmut Grauer <helmut.grauer(a)de.ibm.com>
Signed-off-by: Talat Batheesh <talatb(a)mellanox.com>
Signed-off-by: Saeed Mahameed <saeedm(a)mellanox.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/mellanox/mlx5/core/en_rx_am.c | 6 ++++++
1 file changed, 6 insertions(+)
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_rx_am.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_rx_am.c
@@ -197,9 +197,15 @@ static int mlx5e_am_stats_compare(struct
return (curr->bpms > prev->bpms) ? MLX5E_AM_STATS_BETTER :
MLX5E_AM_STATS_WORSE;
+ if (!prev->ppms)
+ return curr->ppms ? MLX5E_AM_STATS_BETTER :
+ MLX5E_AM_STATS_SAME;
+
if (IS_SIGNIFICANT_DIFF(curr->ppms, prev->ppms))
return (curr->ppms > prev->ppms) ? MLX5E_AM_STATS_BETTER :
MLX5E_AM_STATS_WORSE;
+ if (!prev->epms)
+ return MLX5E_AM_STATS_SAME;
if (IS_SIGNIFICANT_DIFF(curr->epms, prev->epms))
return (curr->epms < prev->epms) ? MLX5E_AM_STATS_BETTER :
Patches currently in stable-queue which might be from talatb(a)mellanox.com are
queue-4.14/net-mlx5e-fix-fixpoint-divide-exception-in-mlx5e_am_stats_compare.patch
This is a note to let you know that I've just added the patch titled
net: ipv4: Make "ip route get" match iif lo rules again.
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-ipv4-make-ip-route-get-match-iif-lo-rules-again.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Lorenzo Colitti <lorenzo(a)google.com>
Date: Thu, 11 Jan 2018 18:36:26 +0900
Subject: net: ipv4: Make "ip route get" match iif lo rules again.
From: Lorenzo Colitti <lorenzo(a)google.com>
[ Upstream commit 6503a30440962f1e1ccb8868816b4e18201218d4 ]
Commit 3765d35ed8b9 ("net: ipv4: Convert inet_rtm_getroute to rcu
versions of route lookup") broke "ip route get" in the presence
of rules that specify iif lo.
Host-originated traffic always has iif lo, because
ip_route_output_key_hash and ip6_route_output_flags set the flow
iif to LOOPBACK_IFINDEX. Thus, putting "iif lo" in an ip rule is a
convenient way to select only originated traffic and not forwarded
traffic.
inet_rtm_getroute used to match these rules correctly because
even though it sets the flow iif to 0, it called
ip_route_output_key which overwrites iif with LOOPBACK_IFINDEX.
But now that it calls ip_route_output_key_hash_rcu, the ifindex
will remain 0 and not match the iif lo in the rule. As a result,
"ip route get" will return ENETUNREACH.
Fixes: 3765d35ed8b9 ("net: ipv4: Convert inet_rtm_getroute to rcu versions of route lookup")
Tested: https://android.googlesource.com/kernel/tests/+/master/net/test/multinetwor… passes again
Signed-off-by: Lorenzo Colitti <lorenzo(a)google.com>
Acked-by: David Ahern <dsahern(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/route.c | 1 +
1 file changed, 1 insertion(+)
--- a/net/ipv4/route.c
+++ b/net/ipv4/route.c
@@ -2762,6 +2762,7 @@ static int inet_rtm_getroute(struct sk_b
if (err == 0 && rt->dst.error)
err = -rt->dst.error;
} else {
+ fl4.flowi4_iif = LOOPBACK_IFINDEX;
rt = ip_route_output_key_hash_rcu(net, &fl4, &res, skb);
err = 0;
if (IS_ERR(rt))
Patches currently in stable-queue which might be from lorenzo(a)google.com are
queue-4.14/net-ipv4-make-ip-route-get-match-iif-lo-rules-again.patch
This is a note to let you know that I've just added the patch titled
{net,ib}/mlx5: Don't disable local loopback multicast traffic when needed
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-ib-mlx5-don-t-disable-local-loopback-multicast-traffic-when-needed.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Eran Ben Elisha <eranbe(a)mellanox.com>
Date: Tue, 9 Jan 2018 11:41:10 +0200
Subject: {net,ib}/mlx5: Don't disable local loopback multicast traffic when needed
From: Eran Ben Elisha <eranbe(a)mellanox.com>
[ Upstream commit 8978cc921fc7fad3f4d6f91f1da01352aeeeff25 ]
There are systems platform information management interfaces (such as
HOST2BMC) for which we cannot disable local loopback multicast traffic.
Separate disable_local_lb_mc and disable_local_lb_uc capability bits so
driver will not disable multicast loopback traffic if not supported.
(It is expected that Firmware will not set disable_local_lb_mc if
HOST2BMC is running for example.)
Function mlx5_nic_vport_update_local_lb will do best effort to
disable/enable UC/MC loopback traffic and return success only in case it
succeeded to changed all allowed by Firmware.
Adapt mlx5_ib and mlx5e to support the new cap bits.
Fixes: 2c43c5a036be ("net/mlx5e: Enable local loopback in loopback selftest")
Fixes: c85023e153e3 ("IB/mlx5: Add raw ethernet local loopback support")
Fixes: bded747bb432 ("net/mlx5: Add raw ethernet local loopback firmware command")
Signed-off-by: Eran Ben Elisha <eranbe(a)mellanox.com>
Cc: kernel-team(a)fb.com
Signed-off-by: Saeed Mahameed <saeedm(a)mellanox.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/infiniband/hw/mlx5/main.c | 9 ++++--
drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c | 27 ++++++++++++------
drivers/net/ethernet/mellanox/mlx5/core/main.c | 3 --
drivers/net/ethernet/mellanox/mlx5/core/vport.c | 22 ++++++++++----
include/linux/mlx5/mlx5_ifc.h | 5 ++-
5 files changed, 44 insertions(+), 22 deletions(-)
--- a/drivers/infiniband/hw/mlx5/main.c
+++ b/drivers/infiniband/hw/mlx5/main.c
@@ -1276,7 +1276,8 @@ static int mlx5_ib_alloc_transport_domai
return err;
if ((MLX5_CAP_GEN(dev->mdev, port_type) != MLX5_CAP_PORT_TYPE_ETH) ||
- !MLX5_CAP_GEN(dev->mdev, disable_local_lb))
+ (!MLX5_CAP_GEN(dev->mdev, disable_local_lb_uc) &&
+ !MLX5_CAP_GEN(dev->mdev, disable_local_lb_mc)))
return err;
mutex_lock(&dev->lb_mutex);
@@ -1294,7 +1295,8 @@ static void mlx5_ib_dealloc_transport_do
mlx5_core_dealloc_transport_domain(dev->mdev, tdn);
if ((MLX5_CAP_GEN(dev->mdev, port_type) != MLX5_CAP_PORT_TYPE_ETH) ||
- !MLX5_CAP_GEN(dev->mdev, disable_local_lb))
+ (!MLX5_CAP_GEN(dev->mdev, disable_local_lb_uc) &&
+ !MLX5_CAP_GEN(dev->mdev, disable_local_lb_mc)))
return;
mutex_lock(&dev->lb_mutex);
@@ -4161,7 +4163,8 @@ static void *mlx5_ib_add(struct mlx5_cor
}
if ((MLX5_CAP_GEN(mdev, port_type) == MLX5_CAP_PORT_TYPE_ETH) &&
- MLX5_CAP_GEN(mdev, disable_local_lb))
+ (MLX5_CAP_GEN(mdev, disable_local_lb_uc) ||
+ MLX5_CAP_GEN(mdev, disable_local_lb_mc)))
mutex_init(&dev->lb_mutex);
dev->ib_active = true;
--- a/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en_selftest.c
@@ -238,15 +238,19 @@ static int mlx5e_test_loopback_setup(str
int err = 0;
/* Temporarily enable local_lb */
- if (MLX5_CAP_GEN(priv->mdev, disable_local_lb)) {
- mlx5_nic_vport_query_local_lb(priv->mdev, &lbtp->local_lb);
- if (!lbtp->local_lb)
- mlx5_nic_vport_update_local_lb(priv->mdev, true);
+ err = mlx5_nic_vport_query_local_lb(priv->mdev, &lbtp->local_lb);
+ if (err)
+ return err;
+
+ if (!lbtp->local_lb) {
+ err = mlx5_nic_vport_update_local_lb(priv->mdev, true);
+ if (err)
+ return err;
}
err = mlx5e_refresh_tirs(priv, true);
if (err)
- return err;
+ goto out;
lbtp->loopback_ok = false;
init_completion(&lbtp->comp);
@@ -256,16 +260,21 @@ static int mlx5e_test_loopback_setup(str
lbtp->pt.dev = priv->netdev;
lbtp->pt.af_packet_priv = lbtp;
dev_add_pack(&lbtp->pt);
+
+ return 0;
+
+out:
+ if (!lbtp->local_lb)
+ mlx5_nic_vport_update_local_lb(priv->mdev, false);
+
return err;
}
static void mlx5e_test_loopback_cleanup(struct mlx5e_priv *priv,
struct mlx5e_lbt_priv *lbtp)
{
- if (MLX5_CAP_GEN(priv->mdev, disable_local_lb)) {
- if (!lbtp->local_lb)
- mlx5_nic_vport_update_local_lb(priv->mdev, false);
- }
+ if (!lbtp->local_lb)
+ mlx5_nic_vport_update_local_lb(priv->mdev, false);
dev_remove_pack(&lbtp->pt);
mlx5e_refresh_tirs(priv, false);
--- a/drivers/net/ethernet/mellanox/mlx5/core/main.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/main.c
@@ -577,8 +577,7 @@ static int mlx5_core_set_hca_defaults(st
int ret = 0;
/* Disable local_lb by default */
- if ((MLX5_CAP_GEN(dev, port_type) == MLX5_CAP_PORT_TYPE_ETH) &&
- MLX5_CAP_GEN(dev, disable_local_lb))
+ if (MLX5_CAP_GEN(dev, port_type) == MLX5_CAP_PORT_TYPE_ETH)
ret = mlx5_nic_vport_update_local_lb(dev, false);
return ret;
--- a/drivers/net/ethernet/mellanox/mlx5/core/vport.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/vport.c
@@ -908,23 +908,33 @@ int mlx5_nic_vport_update_local_lb(struc
void *in;
int err;
- mlx5_core_dbg(mdev, "%s local_lb\n", enable ? "enable" : "disable");
+ if (!MLX5_CAP_GEN(mdev, disable_local_lb_mc) &&
+ !MLX5_CAP_GEN(mdev, disable_local_lb_uc))
+ return 0;
+
in = kvzalloc(inlen, GFP_KERNEL);
if (!in)
return -ENOMEM;
MLX5_SET(modify_nic_vport_context_in, in,
- field_select.disable_mc_local_lb, 1);
- MLX5_SET(modify_nic_vport_context_in, in,
nic_vport_context.disable_mc_local_lb, !enable);
-
- MLX5_SET(modify_nic_vport_context_in, in,
- field_select.disable_uc_local_lb, 1);
MLX5_SET(modify_nic_vport_context_in, in,
nic_vport_context.disable_uc_local_lb, !enable);
+ if (MLX5_CAP_GEN(mdev, disable_local_lb_mc))
+ MLX5_SET(modify_nic_vport_context_in, in,
+ field_select.disable_mc_local_lb, 1);
+
+ if (MLX5_CAP_GEN(mdev, disable_local_lb_uc))
+ MLX5_SET(modify_nic_vport_context_in, in,
+ field_select.disable_uc_local_lb, 1);
+
err = mlx5_modify_nic_vport_context(mdev, in, inlen);
+ if (!err)
+ mlx5_core_dbg(mdev, "%s local_lb\n",
+ enable ? "enable" : "disable");
+
kvfree(in);
return err;
}
--- a/include/linux/mlx5/mlx5_ifc.h
+++ b/include/linux/mlx5/mlx5_ifc.h
@@ -1023,8 +1023,9 @@ struct mlx5_ifc_cmd_hca_cap_bits {
u8 log_max_wq_sz[0x5];
u8 nic_vport_change_event[0x1];
- u8 disable_local_lb[0x1];
- u8 reserved_at_3e2[0x9];
+ u8 disable_local_lb_uc[0x1];
+ u8 disable_local_lb_mc[0x1];
+ u8 reserved_at_3e3[0x8];
u8 log_max_vlan_list[0x5];
u8 reserved_at_3f0[0x3];
u8 log_max_current_mc_list[0x5];
Patches currently in stable-queue which might be from eranbe(a)mellanox.com are
queue-4.14/net-ib-mlx5-don-t-disable-local-loopback-multicast-traffic-when-needed.patch
This is a note to let you know that I've just added the patch titled
net: Allow neigh contructor functions ability to modify the primary_key
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
net-allow-neigh-contructor-functions-ability-to-modify-the-primary_key.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Jim Westfall <jwestfall(a)surrealistic.net>
Date: Sun, 14 Jan 2018 04:18:50 -0800
Subject: net: Allow neigh contructor functions ability to modify the primary_key
From: Jim Westfall <jwestfall(a)surrealistic.net>
[ Upstream commit 096b9854c04df86f03b38a97d40b6506e5730919 ]
Use n->primary_key instead of pkey to account for the possibility that a neigh
constructor function may have modified the primary_key value.
Signed-off-by: Jim Westfall <jwestfall(a)surrealistic.net>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/core/neighbour.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -532,7 +532,7 @@ struct neighbour *__neigh_create(struct
if (atomic_read(&tbl->entries) > (1 << nht->hash_shift))
nht = neigh_hash_grow(tbl, nht->hash_shift + 1);
- hash_val = tbl->hash(pkey, dev, nht->hash_rnd) >> (32 - nht->hash_shift);
+ hash_val = tbl->hash(n->primary_key, dev, nht->hash_rnd) >> (32 - nht->hash_shift);
if (n->parms->dead) {
rc = ERR_PTR(-EINVAL);
@@ -544,7 +544,7 @@ struct neighbour *__neigh_create(struct
n1 != NULL;
n1 = rcu_dereference_protected(n1->next,
lockdep_is_held(&tbl->lock))) {
- if (dev == n1->dev && !memcmp(n1->primary_key, pkey, key_len)) {
+ if (dev == n1->dev && !memcmp(n1->primary_key, n->primary_key, key_len)) {
if (want_ref)
neigh_hold(n1);
rc = n1;
Patches currently in stable-queue which might be from jwestfall(a)surrealistic.net are
queue-4.14/ipv4-make-neigh-lookup-keys-for-loopback-point-to-point-devices-be-inaddr_any.patch
queue-4.14/net-allow-neigh-contructor-functions-ability-to-modify-the-primary_key.patch
This is a note to let you know that I've just added the patch titled
mlxsw: spectrum_router: Don't log an error on missing neighbor
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
mlxsw-spectrum_router-don-t-log-an-error-on-missing-neighbor.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Yuval Mintz <yuvalm(a)mellanox.com>
Date: Wed, 24 Jan 2018 10:02:09 +0100
Subject: mlxsw: spectrum_router: Don't log an error on missing neighbor
From: Yuval Mintz <yuvalm(a)mellanox.com>
[ Upstream commit 1ecdaea02ca6bfacf2ecda500dc1af51e9780c42 ]
Driver periodically samples all neighbors configured in device
in order to update the kernel regarding their state. When finding
an entry configured in HW that doesn't show in neigh_lookup()
driver logs an error message.
This introduces a race when removing multiple neighbors -
it's possible that a given entry would still be configured in HW
as its removal is still being processed but is already removed
from the kernel's neighbor tables.
Simply remove the error message and gracefully accept such events.
Fixes: c723c735fa6b ("mlxsw: spectrum_router: Periodically update the kernel's neigh table")
Fixes: 60f040ca11b9 ("mlxsw: spectrum_router: Periodically dump active IPv6 neighbours")
Signed-off-by: Yuval Mintz <yuvalm(a)mellanox.com>
Reviewed-by: Ido Schimmel <idosch(a)mellanox.com>
Signed-off-by: Jiri Pirko <jiri(a)mellanox.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c | 10 ++--------
1 file changed, 2 insertions(+), 8 deletions(-)
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_router.c
@@ -1531,11 +1531,8 @@ static void mlxsw_sp_router_neigh_ent_ip
dipn = htonl(dip);
dev = mlxsw_sp->router->rifs[rif]->dev;
n = neigh_lookup(&arp_tbl, &dipn, dev);
- if (!n) {
- netdev_err(dev, "Failed to find matching neighbour for IP=%pI4h\n",
- &dip);
+ if (!n)
return;
- }
netdev_dbg(dev, "Updating neighbour with IP=%pI4h\n", &dip);
neigh_event_send(n, NULL);
@@ -1562,11 +1559,8 @@ static void mlxsw_sp_router_neigh_ent_ip
dev = mlxsw_sp->router->rifs[rif]->dev;
n = neigh_lookup(&nd_tbl, &dip, dev);
- if (!n) {
- netdev_err(dev, "Failed to find matching neighbour for IP=%pI6c\n",
- &dip);
+ if (!n)
return;
- }
netdev_dbg(dev, "Updating neighbour with IP=%pI6c\n", &dip);
neigh_event_send(n, NULL);
Patches currently in stable-queue which might be from yuvalm(a)mellanox.com are
queue-4.14/mlxsw-spectrum_router-don-t-log-an-error-on-missing-neighbor.patch
This is a note to let you know that I've just added the patch titled
ipv6: ip6_make_skb() needs to clear cork.base.dst
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ipv6-ip6_make_skb-needs-to-clear-cork.base.dst.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Eric Dumazet <edumazet(a)google.com>
Date: Thu, 11 Jan 2018 22:31:18 -0800
Subject: ipv6: ip6_make_skb() needs to clear cork.base.dst
From: Eric Dumazet <edumazet(a)google.com>
[ Upstream commit 95ef498d977bf44ac094778fd448b98af158a3e6 ]
In my last patch, I missed fact that cork.base.dst was not initialized
in ip6_make_skb() :
If ip6_setup_cork() returns an error, we might attempt a dst_release()
on some random pointer.
Fixes: 862c03ee1deb ("ipv6: fix possible mem leaks in ipv6_make_skb()")
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Reported-by: syzbot <syzkaller(a)googlegroups.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv6/ip6_output.c | 1 +
1 file changed, 1 insertion(+)
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1735,6 +1735,7 @@ struct sk_buff *ip6_make_skb(struct sock
cork.base.flags = 0;
cork.base.addr = 0;
cork.base.opt = NULL;
+ cork.base.dst = NULL;
v6_cork.opt = NULL;
err = ip6_setup_cork(sk, &cork, &v6_cork, ipc6, rt, fl6);
if (err) {
Patches currently in stable-queue which might be from edumazet(a)google.com are
queue-4.14/ipv6-fix-udpv6-sendmsg-crash-caused-by-too-small-mtu.patch
queue-4.14/flow_dissector-properly-cap-thoff-field.patch
queue-4.14/ipv6-ip6_make_skb-needs-to-clear-cork.base.dst.patch
queue-4.14/dccp-don-t-restart-ccid2_hc_tx_rto_expire-if-sk-in-closed-state.patch
queue-4.14/net-qdisc_pkt_len_init-should-be-more-robust.patch
This is a note to let you know that I've just added the patch titled
ipv6: fix udpv6 sendmsg crash caused by too small MTU
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ipv6-fix-udpv6-sendmsg-crash-caused-by-too-small-mtu.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Mike Maloney <maloney(a)google.com>
Date: Wed, 10 Jan 2018 12:45:10 -0500
Subject: ipv6: fix udpv6 sendmsg crash caused by too small MTU
From: Mike Maloney <maloney(a)google.com>
[ Upstream commit 749439bfac6e1a2932c582e2699f91d329658196 ]
The logic in __ip6_append_data() assumes that the MTU is at least large
enough for the headers. A device's MTU may be adjusted after being
added while sendmsg() is processing data, resulting in
__ip6_append_data() seeing any MTU. For an mtu smaller than the size of
the fragmentation header, the math results in a negative 'maxfraglen',
which causes problems when refragmenting any previous skb in the
skb_write_queue, leaving it possibly malformed.
Instead sendmsg returns EINVAL when the mtu is calculated to be less
than IPV6_MIN_MTU.
Found by syzkaller:
kernel BUG at ./include/linux/skbuff.h:2064!
invalid opcode: 0000 [#1] SMP KASAN
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in:
CPU: 1 PID: 14216 Comm: syz-executor5 Not tainted 4.13.0-rc4+ #2
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
task: ffff8801d0b68580 task.stack: ffff8801ac6b8000
RIP: 0010:__skb_pull include/linux/skbuff.h:2064 [inline]
RIP: 0010:__ip6_make_skb+0x18cf/0x1f70 net/ipv6/ip6_output.c:1617
RSP: 0018:ffff8801ac6bf570 EFLAGS: 00010216
RAX: 0000000000010000 RBX: 0000000000000028 RCX: ffffc90003cce000
RDX: 00000000000001b8 RSI: ffffffff839df06f RDI: ffff8801d9478ca0
RBP: ffff8801ac6bf780 R08: ffff8801cc3f1dbc R09: 0000000000000000
R10: ffff8801ac6bf7a0 R11: 43cb4b7b1948a9e7 R12: ffff8801cc3f1dc8
R13: ffff8801cc3f1d40 R14: 0000000000001036 R15: dffffc0000000000
FS: 00007f43d740c700(0000) GS:ffff8801dc100000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7834984000 CR3: 00000001d79b9000 CR4: 00000000001406e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
ip6_finish_skb include/net/ipv6.h:911 [inline]
udp_v6_push_pending_frames+0x255/0x390 net/ipv6/udp.c:1093
udpv6_sendmsg+0x280d/0x31a0 net/ipv6/udp.c:1363
inet_sendmsg+0x11f/0x5e0 net/ipv4/af_inet.c:762
sock_sendmsg_nosec net/socket.c:633 [inline]
sock_sendmsg+0xca/0x110 net/socket.c:643
SYSC_sendto+0x352/0x5a0 net/socket.c:1750
SyS_sendto+0x40/0x50 net/socket.c:1718
entry_SYSCALL_64_fastpath+0x1f/0xbe
RIP: 0033:0x4512e9
RSP: 002b:00007f43d740bc08 EFLAGS: 00000216 ORIG_RAX: 000000000000002c
RAX: ffffffffffffffda RBX: 00000000007180a8 RCX: 00000000004512e9
RDX: 000000000000002e RSI: 0000000020d08000 RDI: 0000000000000005
RBP: 0000000000000086 R08: 00000000209c1000 R09: 000000000000001c
R10: 0000000000040800 R11: 0000000000000216 R12: 00000000004b9c69
R13: 00000000ffffffff R14: 0000000000000005 R15: 00000000202c2000
Code: 9e 01 fe e9 c5 e8 ff ff e8 7f 9e 01 fe e9 4a ea ff ff 48 89 f7 e8 52 9e 01 fe e9 aa eb ff ff e8 a8 b6 cf fd 0f 0b e8 a1 b6 cf fd <0f> 0b 49 8d 45 78 4d 8d 45 7c 48 89 85 78 fe ff ff 49 8d 85 ba
RIP: __skb_pull include/linux/skbuff.h:2064 [inline] RSP: ffff8801ac6bf570
RIP: __ip6_make_skb+0x18cf/0x1f70 net/ipv6/ip6_output.c:1617 RSP: ffff8801ac6bf570
Reported-by: syzbot <syzkaller(a)googlegroups.com>
Signed-off-by: Mike Maloney <maloney(a)google.com>
Reviewed-by: Eric Dumazet <edumazet(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv6/ip6_output.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -1206,14 +1206,16 @@ static int ip6_setup_cork(struct sock *s
v6_cork->tclass = ipc6->tclass;
if (rt->dst.flags & DST_XFRM_TUNNEL)
mtu = np->pmtudisc >= IPV6_PMTUDISC_PROBE ?
- rt->dst.dev->mtu : dst_mtu(&rt->dst);
+ READ_ONCE(rt->dst.dev->mtu) : dst_mtu(&rt->dst);
else
mtu = np->pmtudisc >= IPV6_PMTUDISC_PROBE ?
- rt->dst.dev->mtu : dst_mtu(rt->dst.path);
+ READ_ONCE(rt->dst.dev->mtu) : dst_mtu(rt->dst.path);
if (np->frag_size < mtu) {
if (np->frag_size)
mtu = np->frag_size;
}
+ if (mtu < IPV6_MIN_MTU)
+ return -EINVAL;
cork->base.fragsize = mtu;
if (dst_allfrag(rt->dst.path))
cork->base.flags |= IPCORK_ALLFRAG;
Patches currently in stable-queue which might be from maloney(a)google.com are
queue-4.14/ipv6-fix-udpv6-sendmsg-crash-caused-by-too-small-mtu.patch
This is a note to let you know that I've just added the patch titled
lan78xx: Fix failure in USB Full Speed
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
lan78xx-fix-failure-in-usb-full-speed.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Yuiko Oshino <yuiko.oshino(a)microchip.com>
Date: Mon, 15 Jan 2018 13:24:28 -0500
Subject: lan78xx: Fix failure in USB Full Speed
From: Yuiko Oshino <yuiko.oshino(a)microchip.com>
[ Upstream commit a5b1379afbfabf91e3a689e82ac619a7157336b3 ]
Fix initialize the uninitialized tx_qlen to an appropriate value when USB
Full Speed is used.
Fixes: 55d7de9de6c3 ("Microchip's LAN7800 family USB 2/3 to 10/100/1000 Ethernet device driver")
Signed-off-by: Yuiko Oshino <yuiko.oshino(a)microchip.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/usb/lan78xx.c | 1 +
1 file changed, 1 insertion(+)
--- a/drivers/net/usb/lan78xx.c
+++ b/drivers/net/usb/lan78xx.c
@@ -2396,6 +2396,7 @@ static int lan78xx_reset(struct lan78xx_
buf = DEFAULT_BURST_CAP_SIZE / FS_USB_PKT_SIZE;
dev->rx_urb_size = DEFAULT_BURST_CAP_SIZE;
dev->rx_qlen = 4;
+ dev->tx_qlen = 4;
}
ret = lan78xx_write_reg(dev, BURST_CAP, buf);
Patches currently in stable-queue which might be from yuiko.oshino(a)microchip.com are
queue-4.14/lan78xx-fix-failure-in-usb-full-speed.patch
This is a note to let you know that I've just added the patch titled
ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ipv4-make-neigh-lookup-keys-for-loopback-point-to-point-devices-be-inaddr_any.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Jim Westfall <jwestfall(a)surrealistic.net>
Date: Sun, 14 Jan 2018 04:18:51 -0800
Subject: ipv4: Make neigh lookup keys for loopback/point-to-point devices be INADDR_ANY
From: Jim Westfall <jwestfall(a)surrealistic.net>
[ Upstream commit cd9ff4de0107c65d69d02253bb25d6db93c3dbc1 ]
Map all lookup neigh keys to INADDR_ANY for loopback/point-to-point devices
to avoid making an entry for every remote ip the device needs to talk to.
This used the be the old behavior but became broken in a263b3093641f
(ipv4: Make neigh lookups directly in output packet path) and later removed
in 0bb4087cbec0 (ipv4: Fix neigh lookup keying over loopback/point-to-point
devices) because it was broken.
Signed-off-by: Jim Westfall <jwestfall(a)surrealistic.net>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/net/arp.h | 3 +++
net/ipv4/arp.c | 7 ++++++-
2 files changed, 9 insertions(+), 1 deletion(-)
--- a/include/net/arp.h
+++ b/include/net/arp.h
@@ -20,6 +20,9 @@ static inline u32 arp_hashfn(const void
static inline struct neighbour *__ipv4_neigh_lookup_noref(struct net_device *dev, u32 key)
{
+ if (dev->flags & (IFF_LOOPBACK | IFF_POINTOPOINT))
+ key = INADDR_ANY;
+
return ___neigh_lookup_noref(&arp_tbl, neigh_key_eq32, arp_hashfn, &key, dev);
}
--- a/net/ipv4/arp.c
+++ b/net/ipv4/arp.c
@@ -223,11 +223,16 @@ static bool arp_key_eq(const struct neig
static int arp_constructor(struct neighbour *neigh)
{
- __be32 addr = *(__be32 *)neigh->primary_key;
+ __be32 addr;
struct net_device *dev = neigh->dev;
struct in_device *in_dev;
struct neigh_parms *parms;
+ u32 inaddr_any = INADDR_ANY;
+ if (dev->flags & (IFF_LOOPBACK | IFF_POINTOPOINT))
+ memcpy(neigh->primary_key, &inaddr_any, arp_tbl.key_len);
+
+ addr = *(__be32 *)neigh->primary_key;
rcu_read_lock();
in_dev = __in_dev_get_rcu(dev);
if (!in_dev) {
Patches currently in stable-queue which might be from jwestfall(a)surrealistic.net are
queue-4.14/ipv4-make-neigh-lookup-keys-for-loopback-point-to-point-devices-be-inaddr_any.patch
queue-4.14/net-allow-neigh-contructor-functions-ability-to-modify-the-primary_key.patch
This is a note to let you know that I've just added the patch titled
ipv6: Fix getsockopt() for sockets with default IPV6_AUTOFLOWLABEL
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ipv6-fix-getsockopt-for-sockets-with-default-ipv6_autoflowlabel.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Ben Hutchings <ben.hutchings(a)codethink.co.uk>
Date: Mon, 22 Jan 2018 20:06:42 +0000
Subject: ipv6: Fix getsockopt() for sockets with default IPV6_AUTOFLOWLABEL
From: Ben Hutchings <ben.hutchings(a)codethink.co.uk>
[ Upstream commit e9191ffb65d8e159680ce0ad2224e1acbde6985c ]
Commit 513674b5a2c9 ("net: reevalulate autoflowlabel setting after
sysctl setting") removed the initialisation of
ipv6_pinfo::autoflowlabel and added a second flag to indicate
whether this field or the net namespace default should be used.
The getsockopt() handling for this case was not updated, so it
currently returns 0 for all sockets for which IPV6_AUTOFLOWLABEL is
not explicitly enabled. Fix it to return the effective value, whether
that has been set at the socket or net namespace level.
Fixes: 513674b5a2c9 ("net: reevalulate autoflowlabel setting after sysctl ...")
Signed-off-by: Ben Hutchings <ben.hutchings(a)codethink.co.uk>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/net/ipv6.h | 1 +
net/ipv6/ip6_output.c | 2 +-
net/ipv6/ipv6_sockglue.c | 2 +-
3 files changed, 3 insertions(+), 2 deletions(-)
--- a/include/net/ipv6.h
+++ b/include/net/ipv6.h
@@ -291,6 +291,7 @@ int ipv6_flowlabel_opt_get(struct sock *
int flags);
int ip6_flowlabel_init(void);
void ip6_flowlabel_cleanup(void);
+bool ip6_autoflowlabel(struct net *net, const struct ipv6_pinfo *np);
static inline void fl6_sock_release(struct ip6_flowlabel *fl)
{
--- a/net/ipv6/ip6_output.c
+++ b/net/ipv6/ip6_output.c
@@ -166,7 +166,7 @@ int ip6_output(struct net *net, struct s
!(IP6CB(skb)->flags & IP6SKB_REROUTED));
}
-static bool ip6_autoflowlabel(struct net *net, const struct ipv6_pinfo *np)
+bool ip6_autoflowlabel(struct net *net, const struct ipv6_pinfo *np)
{
if (!np->autoflowlabel_set)
return ip6_default_np_autolabel(net);
--- a/net/ipv6/ipv6_sockglue.c
+++ b/net/ipv6/ipv6_sockglue.c
@@ -1324,7 +1324,7 @@ static int do_ipv6_getsockopt(struct soc
break;
case IPV6_AUTOFLOWLABEL:
- val = np->autoflowlabel;
+ val = ip6_autoflowlabel(sock_net(sk), np);
break;
case IPV6_RECVFRAGSIZE:
Patches currently in stable-queue which might be from ben.hutchings(a)codethink.co.uk are
queue-4.14/ipv6-fix-getsockopt-for-sockets-with-default-ipv6_autoflowlabel.patch
This is a note to let you know that I've just added the patch titled
ip6_gre: init dev->mtu and dev->hard_header_len correctly
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ip6_gre-init-dev-mtu-and-dev-hard_header_len-correctly.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Alexey Kodanev <alexey.kodanev(a)oracle.com>
Date: Thu, 18 Jan 2018 20:51:12 +0300
Subject: ip6_gre: init dev->mtu and dev->hard_header_len correctly
From: Alexey Kodanev <alexey.kodanev(a)oracle.com>
[ Upstream commit 128bb975dc3c25d00de04e503e2fe0a780d04459 ]
Commit b05229f44228 ("gre6: Cleanup GREv6 transmit path,
call common GRE functions") moved dev->mtu initialization
from ip6gre_tunnel_setup() to ip6gre_tunnel_init(), as a
result, the previously set values, before ndo_init(), are
reset in the following cases:
* rtnl_create_link() can update dev->mtu from IFLA_MTU
parameter.
* ip6gre_tnl_link_config() is invoked before ndo_init() in
netlink and ioctl setup, so ndo_init() can reset MTU
adjustments with the lower device MTU as well, dev->mtu
and dev->hard_header_len.
Not applicable for ip6gretap because it has one more call
to ip6gre_tnl_link_config(tunnel, 1) in ip6gre_tap_init().
Fix the first case by updating dev->mtu with 'tb[IFLA_MTU]'
parameter if a user sets it manually on a device creation,
and fix the second one by moving ip6gre_tnl_link_config()
call after register_netdevice().
Fixes: b05229f44228 ("gre6: Cleanup GREv6 transmit path, call common GRE functions")
Fixes: db2ec95d1ba4 ("ip6_gre: Fix MTU setting")
Signed-off-by: Alexey Kodanev <alexey.kodanev(a)oracle.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv6/ip6_gre.c | 14 +++++++-------
1 file changed, 7 insertions(+), 7 deletions(-)
--- a/net/ipv6/ip6_gre.c
+++ b/net/ipv6/ip6_gre.c
@@ -337,11 +337,12 @@ static struct ip6_tnl *ip6gre_tunnel_loc
nt->dev = dev;
nt->net = dev_net(dev);
- ip6gre_tnl_link_config(nt, 1);
if (register_netdevice(dev) < 0)
goto failed_free;
+ ip6gre_tnl_link_config(nt, 1);
+
/* Can use a lockless transmit, unless we generate output sequences */
if (!(nt->parms.o_flags & TUNNEL_SEQ))
dev->features |= NETIF_F_LLTX;
@@ -1307,7 +1308,6 @@ static void ip6gre_netlink_parms(struct
static int ip6gre_tap_init(struct net_device *dev)
{
- struct ip6_tnl *tunnel;
int ret;
ret = ip6gre_tunnel_init_common(dev);
@@ -1316,10 +1316,6 @@ static int ip6gre_tap_init(struct net_de
dev->priv_flags |= IFF_LIVE_ADDR_CHANGE;
- tunnel = netdev_priv(dev);
-
- ip6gre_tnl_link_config(tunnel, 1);
-
return 0;
}
@@ -1411,12 +1407,16 @@ static int ip6gre_newlink(struct net *sr
nt->dev = dev;
nt->net = dev_net(dev);
- ip6gre_tnl_link_config(nt, !tb[IFLA_MTU]);
err = register_netdevice(dev);
if (err)
goto out;
+ ip6gre_tnl_link_config(nt, !tb[IFLA_MTU]);
+
+ if (tb[IFLA_MTU])
+ ip6_tnl_change_mtu(dev, nla_get_u32(tb[IFLA_MTU]));
+
dev_hold(dev);
ip6gre_tunnel_link(ign, nt);
Patches currently in stable-queue which might be from alexey.kodanev(a)oracle.com are
queue-4.14/dccp-don-t-restart-ccid2_hc_tx_rto_expire-if-sk-in-closed-state.patch
queue-4.14/ip6_gre-init-dev-mtu-and-dev-hard_header_len-correctly.patch
This is a note to let you know that I've just added the patch titled
gso: validate gso_type in GSO handlers
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
gso-validate-gso_type-in-gso-handlers.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Willem de Bruijn <willemb(a)google.com>
Date: Fri, 19 Jan 2018 09:29:18 -0500
Subject: gso: validate gso_type in GSO handlers
From: Willem de Bruijn <willemb(a)google.com>
[ Upstream commit 121d57af308d0cf943f08f4738d24d3966c38cd9 ]
Validate gso_type during segmentation as SKB_GSO_DODGY sources
may pass packets where the gso_type does not match the contents.
Syzkaller was able to enter the SCTP gso handler with a packet of
gso_type SKB_GSO_TCPV4.
On entry of transport layer gso handlers, verify that the gso_type
matches the transport protocol.
Fixes: 90017accff61 ("sctp: Add GSO support")
Link: http://lkml.kernel.org/r/<001a1137452496ffc305617e5fe0(a)google.com>
Reported-by: syzbot+fee64147a25aecd48055(a)syzkaller.appspotmail.com
Signed-off-by: Willem de Bruijn <willemb(a)google.com>
Acked-by: Jason Wang <jasowang(a)redhat.com>
Reviewed-by: Marcelo Ricardo Leitner <marcelo.leitner(a)gmail.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/esp4_offload.c | 3 +++
net/ipv4/tcp_offload.c | 3 +++
net/ipv4/udp_offload.c | 3 +++
net/ipv6/esp6_offload.c | 3 +++
net/ipv6/tcpv6_offload.c | 3 +++
net/ipv6/udp_offload.c | 3 +++
net/sctp/offload.c | 3 +++
7 files changed, 21 insertions(+)
--- a/net/ipv4/esp4_offload.c
+++ b/net/ipv4/esp4_offload.c
@@ -121,6 +121,9 @@ static struct sk_buff *esp4_gso_segment(
if (!xo)
goto out;
+ if (!(skb_shinfo(skb)->gso_type & SKB_GSO_ESP))
+ goto out;
+
seq = xo->seq.low;
x = skb->sp->xvec[skb->sp->len - 1];
--- a/net/ipv4/tcp_offload.c
+++ b/net/ipv4/tcp_offload.c
@@ -32,6 +32,9 @@ static void tcp_gso_tstamp(struct sk_buf
static struct sk_buff *tcp4_gso_segment(struct sk_buff *skb,
netdev_features_t features)
{
+ if (!(skb_shinfo(skb)->gso_type & SKB_GSO_TCPV4))
+ return ERR_PTR(-EINVAL);
+
if (!pskb_may_pull(skb, sizeof(struct tcphdr)))
return ERR_PTR(-EINVAL);
--- a/net/ipv4/udp_offload.c
+++ b/net/ipv4/udp_offload.c
@@ -203,6 +203,9 @@ static struct sk_buff *udp4_ufo_fragment
goto out;
}
+ if (!(skb_shinfo(skb)->gso_type & SKB_GSO_UDP))
+ goto out;
+
if (!pskb_may_pull(skb, sizeof(struct udphdr)))
goto out;
--- a/net/ipv6/esp6_offload.c
+++ b/net/ipv6/esp6_offload.c
@@ -148,6 +148,9 @@ static struct sk_buff *esp6_gso_segment(
if (!xo)
goto out;
+ if (!(skb_shinfo(skb)->gso_type & SKB_GSO_ESP))
+ goto out;
+
seq = xo->seq.low;
x = skb->sp->xvec[skb->sp->len - 1];
--- a/net/ipv6/tcpv6_offload.c
+++ b/net/ipv6/tcpv6_offload.c
@@ -46,6 +46,9 @@ static struct sk_buff *tcp6_gso_segment(
{
struct tcphdr *th;
+ if (!(skb_shinfo(skb)->gso_type & SKB_GSO_TCPV6))
+ return ERR_PTR(-EINVAL);
+
if (!pskb_may_pull(skb, sizeof(*th)))
return ERR_PTR(-EINVAL);
--- a/net/ipv6/udp_offload.c
+++ b/net/ipv6/udp_offload.c
@@ -42,6 +42,9 @@ static struct sk_buff *udp6_ufo_fragment
const struct ipv6hdr *ipv6h;
struct udphdr *uh;
+ if (!(skb_shinfo(skb)->gso_type & SKB_GSO_UDP))
+ goto out;
+
if (!pskb_may_pull(skb, sizeof(struct udphdr)))
goto out;
--- a/net/sctp/offload.c
+++ b/net/sctp/offload.c
@@ -45,6 +45,9 @@ static struct sk_buff *sctp_gso_segment(
struct sk_buff *segs = ERR_PTR(-EINVAL);
struct sctphdr *sh;
+ if (!(skb_shinfo(skb)->gso_type & SKB_GSO_SCTP))
+ goto out;
+
sh = sctp_hdr(skb);
if (!pskb_may_pull(skb, sizeof(*sh)))
goto out;
Patches currently in stable-queue which might be from willemb(a)google.com are
queue-4.14/gso-validate-gso_type-in-gso-handlers.patch
queue-4.14/flow_dissector-properly-cap-thoff-field.patch
queue-4.14/net-qdisc_pkt_len_init-should-be-more-robust.patch
This is a note to let you know that I've just added the patch titled
flow_dissector: properly cap thoff field
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
flow_dissector-properly-cap-thoff-field.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Eric Dumazet <edumazet(a)google.com>
Date: Wed, 17 Jan 2018 14:21:13 -0800
Subject: flow_dissector: properly cap thoff field
From: Eric Dumazet <edumazet(a)google.com>
[ Upstream commit d0c081b49137cd3200f2023c0875723be66e7ce5 ]
syzbot reported yet another crash [1] that is caused by
insufficient validation of DODGY packets.
Two bugs are happening here to trigger the crash.
1) Flow dissection leaves with incorrect thoff field.
2) skb_probe_transport_header() sets transport header to this invalid
thoff, even if pointing after skb valid data.
3) qdisc_pkt_len_init() reads out-of-bound data because it
trusts tcp_hdrlen(skb)
Possible fixes :
- Full flow dissector validation before injecting bad DODGY packets in
the stack.
This approach was attempted here : https://patchwork.ozlabs.org/patch/
861874/
- Have more robust functions in the core.
This might be needed anyway for stable versions.
This patch fixes the flow dissection issue.
[1]
CPU: 1 PID: 3144 Comm: syzkaller271204 Not tainted 4.15.0-rc4-mm1+ #49
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0x194/0x257 lib/dump_stack.c:53
print_address_description+0x73/0x250 mm/kasan/report.c:256
kasan_report_error mm/kasan/report.c:355 [inline]
kasan_report+0x23b/0x360 mm/kasan/report.c:413
__asan_report_load2_noabort+0x14/0x20 mm/kasan/report.c:432
__tcp_hdrlen include/linux/tcp.h:35 [inline]
tcp_hdrlen include/linux/tcp.h:40 [inline]
qdisc_pkt_len_init net/core/dev.c:3160 [inline]
__dev_queue_xmit+0x20d3/0x2200 net/core/dev.c:3465
dev_queue_xmit+0x17/0x20 net/core/dev.c:3554
packet_snd net/packet/af_packet.c:2943 [inline]
packet_sendmsg+0x3ad5/0x60a0 net/packet/af_packet.c:2968
sock_sendmsg_nosec net/socket.c:628 [inline]
sock_sendmsg+0xca/0x110 net/socket.c:638
sock_write_iter+0x31a/0x5d0 net/socket.c:907
call_write_iter include/linux/fs.h:1776 [inline]
new_sync_write fs/read_write.c:469 [inline]
__vfs_write+0x684/0x970 fs/read_write.c:482
vfs_write+0x189/0x510 fs/read_write.c:544
SYSC_write fs/read_write.c:589 [inline]
SyS_write+0xef/0x220 fs/read_write.c:581
entry_SYSCALL_64_fastpath+0x1f/0x96
Fixes: 34fad54c2537 ("net: __skb_flow_dissect() must cap its return value")
Fixes: a6e544b0a88b ("flow_dissector: Jump to exit code in __skb_flow_dissect")
Signed-off-by: Eric Dumazet <edumazet(a)google.com>
Cc: Willem de Bruijn <willemb(a)google.com>
Reported-by: syzbot <syzkaller(a)googlegroups.com>
Acked-by: Jason Wang <jasowang(a)redhat.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/core/flow_dissector.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -876,8 +876,8 @@ ip_proto_again:
out_good:
ret = true;
- key_control->thoff = (u16)nhoff;
out:
+ key_control->thoff = min_t(u16, nhoff, skb ? skb->len : hlen);
key_basic->n_proto = proto;
key_basic->ip_proto = ip_proto;
@@ -885,7 +885,6 @@ out:
out_bad:
ret = false;
- key_control->thoff = min_t(u16, nhoff, skb ? skb->len : hlen);
goto out;
}
EXPORT_SYMBOL(__skb_flow_dissect);
Patches currently in stable-queue which might be from edumazet(a)google.com are
queue-4.14/ipv6-fix-udpv6-sendmsg-crash-caused-by-too-small-mtu.patch
queue-4.14/flow_dissector-properly-cap-thoff-field.patch
queue-4.14/ipv6-ip6_make_skb-needs-to-clear-cork.base.dst.patch
queue-4.14/dccp-don-t-restart-ccid2_hc_tx_rto_expire-if-sk-in-closed-state.patch
queue-4.14/net-qdisc_pkt_len_init-should-be-more-robust.patch
This is a note to let you know that I've just added the patch titled
dccp: don't restart ccid2_hc_tx_rto_expire() if sk in closed state
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
dccp-don-t-restart-ccid2_hc_tx_rto_expire-if-sk-in-closed-state.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Alexey Kodanev <alexey.kodanev(a)oracle.com>
Date: Fri, 26 Jan 2018 15:14:16 +0300
Subject: dccp: don't restart ccid2_hc_tx_rto_expire() if sk in closed state
From: Alexey Kodanev <alexey.kodanev(a)oracle.com>
[ Upstream commit dd5684ecae3bd8e44b644f50e2c12c7e57fdfef5 ]
ccid2_hc_tx_rto_expire() timer callback always restarts the timer
again and can run indefinitely (unless it is stopped outside), and after
commit 120e9dabaf55 ("dccp: defer ccid_hc_tx_delete() at dismantle time"),
which moved ccid_hc_tx_delete() (also includes sk_stop_timer()) from
dccp_destroy_sock() to sk_destruct(), this started to happen quite often.
The timer prevents releasing the socket, as a result, sk_destruct() won't
be called.
Found with LTP/dccp_ipsec tests running on the bonding device,
which later couldn't be unloaded after the tests were completed:
unregister_netdevice: waiting for bond0 to become free. Usage count = 148
Fixes: 2a91aa396739 ("[DCCP] CCID2: Initial CCID2 (TCP-Like) implementation")
Signed-off-by: Alexey Kodanev <alexey.kodanev(a)oracle.com>
Reviewed-by: Eric Dumazet <edumazet(a)google.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/dccp/ccids/ccid2.c | 3 +++
1 file changed, 3 insertions(+)
--- a/net/dccp/ccids/ccid2.c
+++ b/net/dccp/ccids/ccid2.c
@@ -140,6 +140,9 @@ static void ccid2_hc_tx_rto_expire(unsig
ccid2_pr_debug("RTO_EXPIRE\n");
+ if (sk->sk_state == DCCP_CLOSED)
+ goto out;
+
/* back-off timer */
hc->tx_rto <<= 1;
if (hc->tx_rto > DCCP_RTO_MAX)
Patches currently in stable-queue which might be from alexey.kodanev(a)oracle.com are
queue-4.14/dccp-don-t-restart-ccid2_hc_tx_rto_expire-if-sk-in-closed-state.patch
queue-4.14/ip6_gre-init-dev-mtu-and-dev-hard_header_len-correctly.patch
This is a note to let you know that I've just added the patch titled
be2net: restore properly promisc mode after queues reconfiguration
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
be2net-restore-properly-promisc-mode-after-queues-reconfiguration.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Sun Jan 28 17:35:08 CET 2018
From: Ivan Vecera <cera(a)cera.cz>
Date: Fri, 19 Jan 2018 20:23:50 +0100
Subject: be2net: restore properly promisc mode after queues reconfiguration
From: Ivan Vecera <cera(a)cera.cz>
[ Upstream commit 52acf06451930eb4cefabd5ecea56e2d46c32f76 ]
The commit 622190669403 ("be2net: Request RSS capability of Rx interface
depending on number of Rx rings") modified be_update_queues() so the
IFACE (HW representation of the netdevice) is destroyed and then
re-created. This causes a regression because potential promiscuous mode
is not restored properly during be_open() because the driver thinks
that the HW has promiscuous mode already enabled.
Note that Lancer is not affected by this bug because RX-filter flags are
disabled during be_close() for this chipset.
Cc: Sathya Perla <sathya.perla(a)broadcom.com>
Cc: Ajit Khaparde <ajit.khaparde(a)broadcom.com>
Cc: Sriharsha Basavapatna <sriharsha.basavapatna(a)broadcom.com>
Cc: Somnath Kotur <somnath.kotur(a)broadcom.com>
Fixes: 622190669403 ("be2net: Request RSS capability of Rx interface depending on number of Rx rings")
Signed-off-by: Ivan Vecera <ivecera(a)redhat.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/net/ethernet/emulex/benet/be_main.c | 9 +++++++++
1 file changed, 9 insertions(+)
--- a/drivers/net/ethernet/emulex/benet/be_main.c
+++ b/drivers/net/ethernet/emulex/benet/be_main.c
@@ -4634,6 +4634,15 @@ int be_update_queues(struct be_adapter *
be_schedule_worker(adapter);
+ /*
+ * The IF was destroyed and re-created. We need to clear
+ * all promiscuous flags valid for the destroyed IF.
+ * Without this promisc mode is not restored during
+ * be_open() because the driver thinks that it is
+ * already enabled in HW.
+ */
+ adapter->if_flags &= ~BE_IF_FLAGS_ALL_PROMISCUOUS;
+
if (netif_running(netdev))
status = be_open(netdev);
Patches currently in stable-queue which might be from cera(a)cera.cz are
queue-4.14/be2net-restore-properly-promisc-mode-after-queues-reconfiguration.patch
This is a note to let you know that I've just added the patch titled
um: Remove copy&paste code from init.h
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
um-remove-copy-paste-code-from-init.h.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 30b11ee9ae23d78de66b9ae315880af17a64ba83 Mon Sep 17 00:00:00 2001
From: Richard Weinberger <richard(a)nod.at>
Date: Sun, 31 May 2015 22:15:58 +0200
Subject: um: Remove copy&paste code from init.h
From: Richard Weinberger <richard(a)nod.at>
commit 30b11ee9ae23d78de66b9ae315880af17a64ba83 upstream.
As we got rid of the __KERNEL__ abuse, we can directly
include linux/compiler.h now.
This also allows gcc 5 to build UML.
Reported-by: Hans-Werner Hilse <hwhilse(a)gmail.com>
Signed-off-by: Richard Weinberger <richard(a)nod.at>
Cc: Greg Hackmann <ghackmann(a)google.com>
Cc: Bernie Innocenti <codewiz(a)google.com>
Cc: Lorenzo Colitti <lorenzo(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/um/include/shared/init.h | 22 +---------------------
1 file changed, 1 insertion(+), 21 deletions(-)
--- a/arch/um/include/shared/init.h
+++ b/arch/um/include/shared/init.h
@@ -40,28 +40,8 @@
typedef int (*initcall_t)(void);
typedef void (*exitcall_t)(void);
-#ifdef __UM_HOST__
-#ifndef __section
-# define __section(S) __attribute__ ((__section__(#S)))
-#endif
-
-#if __GNUC__ == 3
-
-#if __GNUC_MINOR__ >= 3
-# define __used __attribute__((__used__))
-#else
-# define __used __attribute__((__unused__))
-#endif
-
-#else
-#if __GNUC__ == 4
-# define __used __attribute__((__used__))
-#endif
-#endif
-
-#else
#include <linux/compiler.h>
-#endif
+
/* These are for everybody (although not all archs will actually
discard it in modules) */
#define __init __section(.init.text)
Patches currently in stable-queue which might be from richard(a)nod.at are
queue-3.18/um-stop-abusing-__kernel__.patch
queue-3.18/um-link-vmlinux-with-no-pie.patch
queue-3.18/um-remove-copy-paste-code-from-init.h.patch
This is a note to let you know that I've just added the patch titled
um: Stop abusing __KERNEL__
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
um-stop-abusing-__kernel__.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 298e20ba8c197e8d429a6c8671550c41c7919033 Mon Sep 17 00:00:00 2001
From: Richard Weinberger <richard(a)nod.at>
Date: Sun, 31 May 2015 19:50:57 +0200
Subject: um: Stop abusing __KERNEL__
From: Richard Weinberger <richard(a)nod.at>
commit 298e20ba8c197e8d429a6c8671550c41c7919033 upstream.
Currently UML is abusing __KERNEL__ to distinguish between
kernel and host code (os-Linux). It is better to use a custom
define such that existing users of __KERNEL__ don't get confused.
Signed-off-by: Richard Weinberger <richard(a)nod.at>
Cc: Greg Hackmann <ghackmann(a)google.com>
Cc: Bernie Innocenti <codewiz(a)google.com>
Cc: Lorenzo Colitti <lorenzo(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/um/Makefile | 7 ++++---
arch/um/drivers/mconsole.h | 2 +-
arch/um/include/shared/init.h | 4 ++--
arch/um/include/shared/user.h | 2 +-
arch/x86/um/shared/sysdep/tls.h | 6 +++---
5 files changed, 11 insertions(+), 10 deletions(-)
--- a/arch/um/Makefile
+++ b/arch/um/Makefile
@@ -68,9 +68,10 @@ KBUILD_CFLAGS += $(CFLAGS) $(CFLAGS-y) -
KBUILD_AFLAGS += $(ARCH_INCLUDE)
-USER_CFLAGS = $(patsubst $(KERNEL_DEFINES),,$(patsubst -D__KERNEL__,,\
- $(patsubst -I%,,$(KBUILD_CFLAGS)))) $(ARCH_INCLUDE) $(MODE_INCLUDE) \
- $(filter -I%,$(CFLAGS)) -D_FILE_OFFSET_BITS=64 -idirafter include
+USER_CFLAGS = $(patsubst $(KERNEL_DEFINES),,$(patsubst -I%,,$(KBUILD_CFLAGS))) \
+ $(ARCH_INCLUDE) $(MODE_INCLUDE) $(filter -I%,$(CFLAGS)) \
+ -D_FILE_OFFSET_BITS=64 -idirafter include \
+ -D__KERNEL__ -D__UM_HOST__
#This will adjust *FLAGS accordingly to the platform.
include $(srctree)/$(ARCH_DIR)/Makefile-os-$(OS)
--- a/arch/um/drivers/mconsole.h
+++ b/arch/um/drivers/mconsole.h
@@ -7,7 +7,7 @@
#ifndef __MCONSOLE_H__
#define __MCONSOLE_H__
-#ifndef __KERNEL__
+#ifdef __UM_HOST__
#include <stdint.h>
#define u32 uint32_t
#endif
--- a/arch/um/include/shared/init.h
+++ b/arch/um/include/shared/init.h
@@ -40,7 +40,7 @@
typedef int (*initcall_t)(void);
typedef void (*exitcall_t)(void);
-#ifndef __KERNEL__
+#ifdef __UM_HOST__
#ifndef __section
# define __section(S) __attribute__ ((__section__(#S)))
#endif
@@ -131,7 +131,7 @@ extern struct uml_param __uml_setup_star
#define __uml_postsetup_call __used __section(.uml.postsetup.init)
#define __uml_exit_call __used __section(.uml.exitcall.exit)
-#ifndef __KERNEL__
+#ifdef __UM_HOST__
#define __define_initcall(level,fn) \
static initcall_t __initcall_##fn __used \
--- a/arch/um/include/shared/user.h
+++ b/arch/um/include/shared/user.h
@@ -17,7 +17,7 @@
#define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0]))
/* This is to get size_t */
-#ifdef __KERNEL__
+#ifndef __UM_HOST__
#include <linux/types.h>
#else
#include <stddef.h>
--- a/arch/x86/um/shared/sysdep/tls.h
+++ b/arch/x86/um/shared/sysdep/tls.h
@@ -1,7 +1,7 @@
#ifndef _SYSDEP_TLS_H
#define _SYSDEP_TLS_H
-# ifndef __KERNEL__
+#ifdef __UM_HOST__
/* Change name to avoid conflicts with the original one from <asm/ldt.h>, which
* may be named user_desc (but in 2.4 and in header matching its API was named
@@ -22,11 +22,11 @@ typedef struct um_dup_user_desc {
#endif
} user_desc_t;
-# else /* __KERNEL__ */
+#else /* __UM_HOST__ */
typedef struct user_desc user_desc_t;
-# endif /* __KERNEL__ */
+#endif /* __UM_HOST__ */
extern int os_set_thread_area(user_desc_t *info, int pid);
extern int os_get_thread_area(user_desc_t *info, int pid);
Patches currently in stable-queue which might be from richard(a)nod.at are
queue-3.18/um-stop-abusing-__kernel__.patch
queue-3.18/um-link-vmlinux-with-no-pie.patch
queue-3.18/um-remove-copy-paste-code-from-init.h.patch
If request is marked as NVME_REQ_CANCELLED, we don't need to retry for
requeuing it, and it should be completed immediately. Even simply from
the flag name, it needn't to be requeued.
Otherwise, it is easy to cause IO hang when IO is timed out in case of
PCI NVMe:
1) IO timeout is triggered, and nvme_timeout() tries to disable
device(nvme_dev_disable) and reset controller(nvme_reset_ctrl)
2) inside nvme_dev_disable(), queue is frozen and quiesced, and
try to cancel every request, but the timeout request can't be canceled
since it is completed by __blk_mq_complete_request() in blk_mq_rq_timed_out().
3) this timeout req is requeued via nvme_complete_rq(), but can't be
dispatched at all because queue is quiesced and hardware isn't ready,
finally nvme_wait_freeze() waits for ever in nvme_reset_work().
Cc: "jianchao.wang" <jianchao.w.wang(a)oracle.com>
Cc: Sagi Grimberg <sagi(a)grimberg.me>
Cc: Keith Busch <keith.busch(a)intel.com>
Cc: stable(a)vger.kernel.org
Reported-by: Xiao Liang <xiliang(a)redhat.com>
Signed-off-by: Ming Lei <ming.lei(a)redhat.com>
---
drivers/nvme/host/core.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 0ff03cf95f7f..5cd713a164cb 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -210,6 +210,8 @@ static inline bool nvme_req_needs_retry(struct request *req)
return false;
if (nvme_req(req)->retries >= nvme_max_retries)
return false;
+ if (nvme_req(req)->flags & NVME_REQ_CANCELLED)
+ return false;
return true;
}
--
2.9.5
This is a note to let you know that I've just added the patch titled
drm/vc4: Fix NULL pointer dereference in vc4_save_hang_state()
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
drm-vc4-fix-null-pointer-dereference-in-vc4_save_hang_state.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 17b11b76b87afe9f8be199d7a5f442497133e2b0 Mon Sep 17 00:00:00 2001
From: Boris Brezillon <boris.brezillon(a)free-electrons.com>
Date: Thu, 18 Jan 2018 15:58:21 +0100
Subject: drm/vc4: Fix NULL pointer dereference in vc4_save_hang_state()
From: Boris Brezillon <boris.brezillon(a)free-electrons.com>
commit 17b11b76b87afe9f8be199d7a5f442497133e2b0 upstream.
When saving BOs in the hang state we skip one entry of the
kernel_state->bo[] array, thus leaving it to NULL. This leads to a NULL
pointer dereference when, later in this function, we iterate over all
BOs to check their ->madv state.
Fixes: ca26d28bbaa3 ("drm/vc4: improve throughput by pipelining binning and rendering jobs")
Signed-off-by: Boris Brezillon <boris.brezillon(a)free-electrons.com>
Signed-off-by: Eric Anholt <eric(a)anholt.net>
Reviewed-by: Eric Anholt <eric(a)anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20180118145821.22344-1-boris.…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/gpu/drm/vc4/vc4_gem.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
--- a/drivers/gpu/drm/vc4/vc4_gem.c
+++ b/drivers/gpu/drm/vc4/vc4_gem.c
@@ -146,7 +146,7 @@ vc4_save_hang_state(struct drm_device *d
struct vc4_exec_info *exec[2];
struct vc4_bo *bo;
unsigned long irqflags;
- unsigned int i, j, unref_list_count, prev_idx;
+ unsigned int i, j, k, unref_list_count;
kernel_state = kcalloc(1, sizeof(*kernel_state), GFP_KERNEL);
if (!kernel_state)
@@ -182,24 +182,24 @@ vc4_save_hang_state(struct drm_device *d
return;
}
- prev_idx = 0;
+ k = 0;
for (i = 0; i < 2; i++) {
if (!exec[i])
continue;
for (j = 0; j < exec[i]->bo_count; j++) {
drm_gem_object_get(&exec[i]->bo[j]->base);
- kernel_state->bo[j + prev_idx] = &exec[i]->bo[j]->base;
+ kernel_state->bo[k++] = &exec[i]->bo[j]->base;
}
list_for_each_entry(bo, &exec[i]->unref_list, unref_head) {
drm_gem_object_get(&bo->base.base);
- kernel_state->bo[j + prev_idx] = &bo->base.base;
- j++;
+ kernel_state->bo[k++] = &bo->base.base;
}
- prev_idx = j + 1;
}
+ WARN_ON_ONCE(k != state->bo_count);
+
if (exec[0])
state->start_bin = exec[0]->ct0ca;
if (exec[1])
Patches currently in stable-queue which might be from boris.brezillon(a)free-electrons.com are
queue-4.14/drm-vc4-fix-null-pointer-dereference-in-vc4_save_hang_state.patch
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 17b11b76b87afe9f8be199d7a5f442497133e2b0 Mon Sep 17 00:00:00 2001
From: Boris Brezillon <boris.brezillon(a)free-electrons.com>
Date: Thu, 18 Jan 2018 15:58:21 +0100
Subject: [PATCH] drm/vc4: Fix NULL pointer dereference in
vc4_save_hang_state()
When saving BOs in the hang state we skip one entry of the
kernel_state->bo[] array, thus leaving it to NULL. This leads to a NULL
pointer dereference when, later in this function, we iterate over all
BOs to check their ->madv state.
Fixes: ca26d28bbaa3 ("drm/vc4: improve throughput by pipelining binning and rendering jobs")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Boris Brezillon <boris.brezillon(a)free-electrons.com>
Signed-off-by: Eric Anholt <eric(a)anholt.net>
Reviewed-by: Eric Anholt <eric(a)anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/20180118145821.22344-1-boris.…
diff --git a/drivers/gpu/drm/vc4/vc4_gem.c b/drivers/gpu/drm/vc4/vc4_gem.c
index e3e868cdee79..c94cce96544c 100644
--- a/drivers/gpu/drm/vc4/vc4_gem.c
+++ b/drivers/gpu/drm/vc4/vc4_gem.c
@@ -146,7 +146,7 @@ vc4_save_hang_state(struct drm_device *dev)
struct vc4_exec_info *exec[2];
struct vc4_bo *bo;
unsigned long irqflags;
- unsigned int i, j, unref_list_count, prev_idx;
+ unsigned int i, j, k, unref_list_count;
kernel_state = kcalloc(1, sizeof(*kernel_state), GFP_KERNEL);
if (!kernel_state)
@@ -182,7 +182,7 @@ vc4_save_hang_state(struct drm_device *dev)
return;
}
- prev_idx = 0;
+ k = 0;
for (i = 0; i < 2; i++) {
if (!exec[i])
continue;
@@ -197,7 +197,7 @@ vc4_save_hang_state(struct drm_device *dev)
WARN_ON(!refcount_read(&bo->usecnt));
refcount_inc(&bo->usecnt);
drm_gem_object_get(&exec[i]->bo[j]->base);
- kernel_state->bo[j + prev_idx] = &exec[i]->bo[j]->base;
+ kernel_state->bo[k++] = &exec[i]->bo[j]->base;
}
list_for_each_entry(bo, &exec[i]->unref_list, unref_head) {
@@ -205,12 +205,12 @@ vc4_save_hang_state(struct drm_device *dev)
* because they are naturally unpurgeable.
*/
drm_gem_object_get(&bo->base.base);
- kernel_state->bo[j + prev_idx] = &bo->base.base;
- j++;
+ kernel_state->bo[k++] = &bo->base.base;
}
- prev_idx = j + 1;
}
+ WARN_ON_ONCE(k != state->bo_count);
+
if (exec[0])
state->start_bin = exec[0]->ct0ca;
if (exec[1])
This is a note to let you know that I've just added the patch titled
eventpoll.h: add missing epoll event masks
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
eventpoll.h-add-missing-epoll-event-masks.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 7e040726850a106587485c21bdacc0bfc8a0cbed Mon Sep 17 00:00:00 2001
From: Greg KH <gregkh(a)linuxfoundation.org>
Date: Wed, 8 Mar 2017 19:03:44 +0100
Subject: eventpoll.h: add missing epoll event masks
From: Greg KH <gregkh(a)linuxfoundation.org>
commit 7e040726850a106587485c21bdacc0bfc8a0cbed upstream.
[resend due to me forgetting to cc: linux-api the first time around I
posted these back on Feb 23]
From: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
For some reason these values are not in the uapi header file, so any
libc has to define it themselves. To prevent them from needing to do
this, just have the kernel provide the correct values.
Reported-by: Elliott Hughes <enh(a)google.com>
Signed-off-by: Greg Hackmann <ghackmann(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/uapi/linux/eventpoll.h | 13 +++++++++++++
1 file changed, 13 insertions(+)
--- a/include/uapi/linux/eventpoll.h
+++ b/include/uapi/linux/eventpoll.h
@@ -26,6 +26,19 @@
#define EPOLL_CTL_DEL 2
#define EPOLL_CTL_MOD 3
+/* Epoll event masks */
+#define EPOLLIN 0x00000001
+#define EPOLLPRI 0x00000002
+#define EPOLLOUT 0x00000004
+#define EPOLLERR 0x00000008
+#define EPOLLHUP 0x00000010
+#define EPOLLRDNORM 0x00000040
+#define EPOLLRDBAND 0x00000080
+#define EPOLLWRNORM 0x00000100
+#define EPOLLWRBAND 0x00000200
+#define EPOLLMSG 0x00000400
+#define EPOLLRDHUP 0x00002000
+
/* Set exclusive wakeup mode for the target file descriptor */
#define EPOLLEXCLUSIVE (1 << 28)
Patches currently in stable-queue which might be from gregkh(a)linuxfoundation.org are
queue-4.9/orangefs-fix-deadlock-do-not-write-i_size-in-read_iter.patch
queue-4.9/prevent-timer-value-0-for-mwaitx.patch
queue-4.9/eventpoll.h-add-missing-epoll-event-masks.patch
queue-4.9/ipc-msg-make-msgrcv-work-with-long_min.patch
queue-4.9/reiserfs-don-t-preallocate-blocks-for-extended-attributes.patch
queue-4.9/fs-fcntl-f_setown-avoid-undefined-behaviour.patch
queue-4.9/drivers-base-cacheinfo-fix-x86-with-config_of-enabled.patch
queue-4.9/can-af_can-canfd_rcv-replace-warn_once-by-pr_warn_once.patch
queue-4.9/reiserfs-fix-race-in-prealloc-discard.patch
queue-4.9/can-af_can-can_rcv-replace-warn_once-by-pr_warn_once.patch
queue-4.9/netfilter-xt_osf-add-missing-permission-checks.patch
queue-4.9/hwpoison-memcg-forcibly-uncharge-lru-pages.patch
queue-4.9/vsyscall-fix-permissions-for-emulate-mode-with-kaiser-pti.patch
queue-4.9/x86-asm-32-make-sync_core-handle-missing-cpuid-on-all-32-bit-kernels.patch
queue-4.9/um-link-vmlinux-with-no-pie.patch
queue-4.9/mm-mmap.c-do-not-blow-on-prot_none-map_fixed-holes-in-the-stack.patch
queue-4.9/usbip-fix-potential-format-overflow-in-userspace-tools.patch
queue-4.9/usbip-fix-implicit-fallthrough-warning.patch
queue-4.9/orangefs-initialize-op-on-loop-restart-in-orangefs_devreq_read.patch
queue-4.9/cma-fix-calculation-of-aligned-offset.patch
queue-4.9/drivers-base-cacheinfo-fix-boot-error-message-when-acpi-is-enabled.patch
queue-4.9/mm-fix-100-cpu-kswapd-busyloop-on-unreclaimable-nodes.patch
queue-4.9/mm-page_alloc-fix-potential-false-positive-in-__zone_watermark_ok.patch
queue-4.9/kvm-arm-arm64-check-pagesize-when-allocating-a-hugepage-at-stage-2.patch
queue-4.9/input-trackpoint-force-3-buttons-if-0-button-is-reported.patch
queue-4.9/acpica-namespace-fix-operand-cache-leak.patch
queue-4.9/revert-module-add-retpoline-tag-to-vermagic.patch
queue-4.9/orangefs-use-list_for_each_entry_safe-in-purge_waiting_ops.patch
queue-4.9/acpi-scan-prefer-devices-without-_hid-_cid-for-_adr-matching.patch
queue-4.9/scsi-libiscsi-fix-shifting-of-did_requeue-host-byte.patch
queue-4.9/usbip-prevent-vhci_hcd-driver-from-leaking-a-socket-pointer-address.patch
queue-4.9/netfilter-nfnetlink_cthelper-add-missing-permission-checks.patch
This is a note to let you know that I've just added the patch titled
eventpoll.h: add missing epoll event masks
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
eventpoll.h-add-missing-epoll-event-masks.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 7e040726850a106587485c21bdacc0bfc8a0cbed Mon Sep 17 00:00:00 2001
From: Greg KH <gregkh(a)linuxfoundation.org>
Date: Wed, 8 Mar 2017 19:03:44 +0100
Subject: eventpoll.h: add missing epoll event masks
From: Greg KH <gregkh(a)linuxfoundation.org>
commit 7e040726850a106587485c21bdacc0bfc8a0cbed upstream.
[resend due to me forgetting to cc: linux-api the first time around I
posted these back on Feb 23]
From: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
For some reason these values are not in the uapi header file, so any
libc has to define it themselves. To prevent them from needing to do
this, just have the kernel provide the correct values.
Reported-by: Elliott Hughes <enh(a)google.com>
Signed-off-by: Greg Hackmann <ghackmann(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/uapi/linux/eventpoll.h | 13 +++++++++++++
1 file changed, 13 insertions(+)
--- a/include/uapi/linux/eventpoll.h
+++ b/include/uapi/linux/eventpoll.h
@@ -26,6 +26,19 @@
#define EPOLL_CTL_DEL 2
#define EPOLL_CTL_MOD 3
+/* Epoll event masks */
+#define EPOLLIN 0x00000001
+#define EPOLLPRI 0x00000002
+#define EPOLLOUT 0x00000004
+#define EPOLLERR 0x00000008
+#define EPOLLHUP 0x00000010
+#define EPOLLRDNORM 0x00000040
+#define EPOLLRDBAND 0x00000080
+#define EPOLLWRNORM 0x00000100
+#define EPOLLWRBAND 0x00000200
+#define EPOLLMSG 0x00000400
+#define EPOLLRDHUP 0x00002000
+
/*
* Request the handling of system wakeup events so as to prevent system suspends
* from happening while those events are being processed.
Patches currently in stable-queue which might be from gregkh(a)linuxfoundation.org are
queue-4.4/prevent-timer-value-0-for-mwaitx.patch
queue-4.4/fs-select-add-vmalloc-fallback-for-select-2.patch
queue-4.4/eventpoll.h-add-missing-epoll-event-masks.patch
queue-4.4/x86-ioapic-fix-incorrect-pointers-in-ioapic_setup_resources.patch
queue-4.4/ipc-msg-make-msgrcv-work-with-long_min.patch
queue-4.4/reiserfs-don-t-preallocate-blocks-for-extended-attributes.patch
queue-4.4/netfilter-x_tables-speed-up-jump-target-validation.patch
queue-4.4/netfilter-nf_dup_ipv6-set-again-flowi_flag_known_nh-at-flowi6_flags.patch
queue-4.4/mmc-sdhci-of-esdhc-add-remove-some-quirks-according-to-vendor-version.patch
queue-4.4/usbip-fix-stub_rx-harden-cmd_submit-path-to-handle-malicious-input.patch
queue-4.4/netfilter-fix-is_err_value-usage.patch
queue-4.4/fs-fcntl-f_setown-avoid-undefined-behaviour.patch
queue-4.4/netfilter-nf_ct_expect-remove-the-redundant-slash-when-policy-name-is-empty.patch
queue-4.4/drivers-base-cacheinfo-fix-x86-with-config_of-enabled.patch
queue-4.4/can-af_can-canfd_rcv-replace-warn_once-by-pr_warn_once.patch
queue-4.4/pm-sleep-declare-__tracedata-symbols-as-char-rather-than-char.patch
queue-4.4/pci-layerscape-add-fsl-ls2085a-pcie-compatible-id.patch
queue-4.4/ext2-don-t-clear-sgid-when-inheriting-acls.patch
queue-4.4/reiserfs-fix-race-in-prealloc-discard.patch
queue-4.4/can-af_can-can_rcv-replace-warn_once-by-pr_warn_once.patch
queue-4.4/netfilter-xt_osf-add-missing-permission-checks.patch
queue-4.4/netfilter-arp_tables-fix-invoking-32bit-iptable-p-input-accept-failed-in-64bit-kernel.patch
queue-4.4/hwpoison-memcg-forcibly-uncharge-lru-pages.patch
queue-4.4/vsyscall-fix-permissions-for-emulate-mode-with-kaiser-pti.patch
queue-4.4/x86-asm-32-make-sync_core-handle-missing-cpuid-on-all-32-bit-kernels.patch
queue-4.4/um-link-vmlinux-with-no-pie.patch
queue-4.4/mm-mmap.c-do-not-blow-on-prot_none-map_fixed-holes-in-the-stack.patch
queue-4.4/usbip-fix-potential-format-overflow-in-userspace-tools.patch
queue-4.4/timers-plug-locking-race-vs.-timer-migration.patch
queue-4.4/acpi-processor-avoid-reserving-io-regions-too-early.patch
queue-4.4/usbip-fix-implicit-fallthrough-warning.patch
queue-4.4/cma-fix-calculation-of-aligned-offset.patch
queue-4.4/netfilter-nf_conntrack_sip-extend-request-line-validation.patch
queue-4.4/netfilter-restart-search-if-moved-to-other-chain.patch
queue-4.4/x86-microcode-intel-fix-bdw-late-loading-revision-check.patch
queue-4.4/x86-cpu-intel-introduce-macros-for-intel-family-numbers.patch
queue-4.4/drivers-base-cacheinfo-fix-boot-error-message-when-acpi-is-enabled.patch
queue-4.4/pci-layerscape-fix-msg-tlp-drop-setting.patch
queue-4.4/netfilter-use-fwmark_reflect-in-nf_send_reset.patch
queue-4.4/x86-retpoline-fill-rsb-on-context-switch-for-affected-cpus.patch
queue-4.4/mm-page_alloc-fix-potential-false-positive-in-__zone_watermark_ok.patch
queue-4.4/input-trackpoint-force-3-buttons-if-0-button-is-reported.patch
queue-4.4/acpica-namespace-fix-operand-cache-leak.patch
queue-4.4/revert-module-add-retpoline-tag-to-vermagic.patch
queue-4.4/usbip-prevent-leaking-socket-pointer-address-in-messages.patch
queue-4.4/usbip-fix-stub_rx-get_pipe-to-validate-endpoint-number.patch
queue-4.4/acpi-scan-prefer-devices-without-_hid-_cid-for-_adr-matching.patch
queue-4.4/time-avoid-undefined-behaviour-in-ktime_add_safe.patch
queue-4.4/sched-deadline-use-the-revised-wakeup-rule-for-suspending-constrained-dl-tasks.patch
queue-4.4/reiserfs-don-t-clear-sgid-when-inheriting-acls.patch
queue-4.4/scsi-libiscsi-fix-shifting-of-did_requeue-host-byte.patch
queue-4.4/usbip-prevent-vhci_hcd-driver-from-leaking-a-socket-pointer-address.patch
queue-4.4/netfilter-nfnetlink_queue-reject-verdict-request-from-different-portid.patch
queue-4.4/netfilter-nfnetlink_cthelper-add-missing-permission-checks.patch
queue-4.4/usb-usbip-fix-possible-deadlocks-reported-by-lockdep.patch
This is a note to let you know that I've just added the patch titled
eventpoll.h: add missing epoll event masks
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
eventpoll.h-add-missing-epoll-event-masks.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 7e040726850a106587485c21bdacc0bfc8a0cbed Mon Sep 17 00:00:00 2001
From: Greg KH <gregkh(a)linuxfoundation.org>
Date: Wed, 8 Mar 2017 19:03:44 +0100
Subject: eventpoll.h: add missing epoll event masks
From: Greg KH <gregkh(a)linuxfoundation.org>
commit 7e040726850a106587485c21bdacc0bfc8a0cbed upstream.
[resend due to me forgetting to cc: linux-api the first time around I
posted these back on Feb 23]
From: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
For some reason these values are not in the uapi header file, so any
libc has to define it themselves. To prevent them from needing to do
this, just have the kernel provide the correct values.
Reported-by: Elliott Hughes <enh(a)google.com>
Signed-off-by: Greg Hackmann <ghackmann(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/uapi/linux/eventpoll.h | 13 +++++++++++++
1 file changed, 13 insertions(+)
--- a/include/uapi/linux/eventpoll.h
+++ b/include/uapi/linux/eventpoll.h
@@ -26,6 +26,19 @@
#define EPOLL_CTL_DEL 2
#define EPOLL_CTL_MOD 3
+/* Epoll event masks */
+#define EPOLLIN 0x00000001
+#define EPOLLPRI 0x00000002
+#define EPOLLOUT 0x00000004
+#define EPOLLERR 0x00000008
+#define EPOLLHUP 0x00000010
+#define EPOLLRDNORM 0x00000040
+#define EPOLLRDBAND 0x00000080
+#define EPOLLWRNORM 0x00000100
+#define EPOLLWRBAND 0x00000200
+#define EPOLLMSG 0x00000400
+#define EPOLLRDHUP 0x00002000
+
/*
* Request the handling of system wakeup events so as to prevent system suspends
* from happening while those events are being processed.
Patches currently in stable-queue which might be from gregkh(a)linuxfoundation.org are
queue-3.18/futex-prevent-overflow-by-strengthen-input-validation.patch
queue-3.18/af_key-fix-buffer-overread-in-parse_exthdrs.patch
queue-3.18/eventpoll.h-add-missing-epoll-event-masks.patch
queue-3.18/af_key-fix-buffer-overread-in-verify_address_len.patch
queue-3.18/ipc-msg-make-msgrcv-work-with-long_min.patch
queue-3.18/reiserfs-don-t-preallocate-blocks-for-extended-attributes.patch
queue-3.18/dm-thin-metadata-thin_max_concurrent_locks-should-be-6.patch
queue-3.18/fs-fcntl-f_setown-avoid-undefined-behaviour.patch
queue-3.18/netfilter-nf_ct_expect-remove-the-redundant-slash-when-policy-name-is-empty.patch
queue-3.18/input-twl4030-vibra-fix-error-bad-of_node_put-warning.patch
queue-3.18/can-af_can-canfd_rcv-replace-warn_once-by-pr_warn_once.patch
queue-3.18/dm-btree-fix-serious-bug-in-btree_split_beneath.patch
queue-3.18/reiserfs-fix-race-in-prealloc-discard.patch
queue-3.18/can-af_can-can_rcv-replace-warn_once-by-pr_warn_once.patch
queue-3.18/alsa-pcm-remove-yet-superfluous-warn_on.patch
queue-3.18/mips-ar7-ensure-the-port-type-s-fcr-value-is-used.patch
queue-3.18/netfilter-xt_osf-add-missing-permission-checks.patch
queue-3.18/hwpoison-memcg-forcibly-uncharge-lru-pages.patch
queue-3.18/pipe-avoid-round_pipe_size-nr_pages-overflow-on-32-bit.patch
queue-3.18/x86-asm-32-make-sync_core-handle-missing-cpuid-on-all-32-bit-kernels.patch
queue-3.18/um-link-vmlinux-with-no-pie.patch
queue-3.18/mm-mmap.c-do-not-blow-on-prot_none-map_fixed-holes-in-the-stack.patch
queue-3.18/usbip-fix-implicit-fallthrough-warning.patch
queue-3.18/arm-dts-kirkwood-fix-pin-muxing-of-mpp7-on-openblocks-a7.patch
queue-3.18/netfilter-nf_conntrack_sip-extend-request-line-validation.patch
queue-3.18/netfilter-restart-search-if-moved-to-other-chain.patch
queue-3.18/input-twl4030-vibra-fix-sibling-node-lookup.patch
queue-3.18/input-twl6040-vibra-fix-child-node-lookup.patch
queue-3.18/scsi-sg-disable-set_force_low_dma.patch
queue-3.18/phy-work-around-phys-references-to-usb-nop-xceiv-devices.patch
queue-3.18/input-twl6040-vibra-fix-dt-node-memory-management.patch
queue-3.18/alsa-hda-apply-the-existing-quirk-to-imac-14-1.patch
queue-3.18/input-88pm860x-ts-fix-child-node-lookup.patch
queue-3.18/arm64-kvm-fix-smccc-handling-of-unimplemented-smc-hvc-calls.patch
queue-3.18/gcov-disable-for-compile_test.patch
queue-3.18/scsi-libiscsi-fix-shifting-of-did_requeue-host-byte.patch
queue-3.18/netfilter-nfnetlink_cthelper-add-missing-permission-checks.patch
Neil Berrington reported a double-fault on a VM with 768GB of RAM that
uses large amounts of vmalloc space with PTI enabled.
The cause is that load_new_mm_cr3() was never fixed to take the
5-level pgd folding code into account, so, on a 4-level kernel, the
pgd synchronization logic compiles away to exactly nothing.
Interestingly, the problem doesn't trigger with nopti. I assume this
is because the kernel is mapped with global pages if we boot with
nopti. The sequence of operations when we create a new task is that
we first load its mm while still running on the old stack (which
crashes if the old stack is unmapped in the new mm unless the TLB
saves us), then we call prepare_switch_to(), and then we switch to the
new stack. prepare_switch_to() pokes the new stack directly, which
will populate the mapping through vmalloc_fault(). I assume that
we're getting lucky on non-PTI systems -- the old stack's TLB entry
stays alive long enough to make it all the way through
prepare_switch_to() and switch_to() so that we make it to a valid
stack.
Fixes: b50858ce3e2a ("x86/mm/vmalloc: Add 5-level paging support")
Cc: stable(a)vger.kernel.org
Reported-and-tested-by: Neil Berrington <neil.berrington(a)datacore.com>
Signed-off-by: Andy Lutomirski <luto(a)kernel.org>
---
arch/x86/mm/tlb.c | 34 +++++++++++++++++++++++++++++-----
1 file changed, 29 insertions(+), 5 deletions(-)
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index a1561957dccb..5bfe61a5e8e3 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -151,6 +151,34 @@ void switch_mm(struct mm_struct *prev, struct mm_struct *next,
local_irq_restore(flags);
}
+static void sync_current_stack_to_mm(struct mm_struct *mm)
+{
+ unsigned long sp = current_stack_pointer;
+ pgd_t *pgd = pgd_offset(mm, sp);
+
+ if (CONFIG_PGTABLE_LEVELS > 4) {
+ if (unlikely(pgd_none(*pgd))) {
+ pgd_t *pgd_ref = pgd_offset_k(sp);
+
+ set_pgd(pgd, *pgd_ref);
+ }
+ } else {
+ /*
+ * "pgd" is faked. The top level entries are "p4d"s, so sync
+ * the p4d. This compiles to approximately the same code as
+ * the 5-level case.
+ */
+ p4d_t *p4d = p4d_offset(pgd, sp);
+
+ if (unlikely(p4d_none(*p4d))) {
+ pgd_t *pgd_ref = pgd_offset_k(sp);
+ p4d_t *p4d_ref = p4d_offset(pgd_ref, sp);
+
+ set_p4d(p4d, *p4d_ref);
+ }
+ }
+}
+
void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
struct task_struct *tsk)
{
@@ -226,11 +254,7 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
* mapped in the new pgd, we'll double-fault. Forcibly
* map it.
*/
- unsigned int index = pgd_index(current_stack_pointer);
- pgd_t *pgd = next->pgd + index;
-
- if (unlikely(pgd_none(*pgd)))
- set_pgd(pgd, init_mm.pgd[index]);
+ sync_current_stack_to_mm(next);
}
/* Stop remote flushes for the previous mm */
--
2.14.3
commit 1005bccd7a4a ("crypto: caam - enable instantiation of all RNG4 state
handles") introduces a control when incrementing ent_delay which contains
the following comment above it:
/*
* If either SH were instantiated by somebody else
* (e.g. u-boot) then it is assumed that the entropy
* parameters are properly set and thus the function
* setting these (kick_trng(...)) is skipped.
* Also, if a handle was instantiated, do not change
* the TRNG parameters.
*/
This is a problem observed when sec_init() has been run in u-boot and
and TrustZone is enabled. We can fix this by instantiating all rng state
handles in u-boot but, on the Kernel side we should ensure that this
non-terminating path is dealt with.
Fixes: 1005bccd7a4a ("crypto: caam - enable instantiation of all RNG4 state
handles")
Reported-by: Ryan Harkin <ryan.harkin(a)linaro.org>
Cc: "Horia Geantă" <horia.geanta(a)nxp.com>
Cc: Aymen Sghaier <aymen.sghaier(a)nxp.com>
Cc: Fabio Estevam <fabio.estevam(a)nxp.com>
Cc: Peng Fan <peng.fan(a)nxp.com>
Cc: Herbert Xu <herbert(a)gondor.apana.org.au>
Cc: "David S. Miller" <davem(a)davemloft.net>
Cc: Lukas Auer <lukas.auer(a)aisec.fraunhofer.de>
Cc: <stable(a)vger.kernel.org> # 4.12+
Signed-off-by: Bryan O'Donoghue <pure.logic(a)nexus-software.ie>
---
drivers/crypto/caam/ctrl.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/crypto/caam/ctrl.c b/drivers/crypto/caam/ctrl.c
index 98986d3..0a1e96b 100644
--- a/drivers/crypto/caam/ctrl.c
+++ b/drivers/crypto/caam/ctrl.c
@@ -704,7 +704,10 @@ static int caam_probe(struct platform_device *pdev)
ent_delay);
kick_trng(pdev, ent_delay);
ent_delay += 400;
+ } else if (ctrlpriv->rng4_sh_init && inst_handles) {
+ ent_delay += 400;
}
+
/*
* if instantiate_rng(...) fails, the loop will rerun
* and the kick_trng(...) function will modfiy the
--
2.7.4
Upstream commit 1c9de5bf4286 ("usbip: vhci-hcd: Add USB3 SuperSpeed
support")
vhci_hcd clears all the bits port_status bits instead of clearing
just the USB_PORT_STAT_POWER bit when it handles ClearPortFeature:
USB_PORT_FEAT_POWER. This causes vhci_hcd attach to fail in a bad
state, leaving device unusable by the client. The device is still
attached and however client can't use it.
The problem was fixed as part of larger change to add USB3 Super
Speed support. This patch backports just the change to clear the
USB_PORT_STAT_POWER.
In addition, a minor formatting error in status file is fixed.
Signed-off-by: Shuah Khan <shuahkh(a)osg.samsung.com>
---
drivers/usb/usbip/vhci_hcd.c | 2 +-
drivers/usb/usbip/vhci_sysfs.c | 5 ++---
2 files changed, 3 insertions(+), 4 deletions(-)
diff --git a/drivers/usb/usbip/vhci_hcd.c b/drivers/usb/usbip/vhci_hcd.c
index 00d68945548e..2d96bfd34138 100644
--- a/drivers/usb/usbip/vhci_hcd.c
+++ b/drivers/usb/usbip/vhci_hcd.c
@@ -285,7 +285,7 @@ static int vhci_hub_control(struct usb_hcd *hcd, u16 typeReq, u16 wValue,
case USB_PORT_FEAT_POWER:
usbip_dbg_vhci_rh(
" ClearPortFeature: USB_PORT_FEAT_POWER\n");
- dum->port_status[rhport] = 0;
+ dum->port_status[rhport] &= ~USB_PORT_STAT_POWER;
dum->resuming = 0;
break;
case USB_PORT_FEAT_C_RESET:
diff --git a/drivers/usb/usbip/vhci_sysfs.c b/drivers/usb/usbip/vhci_sysfs.c
index 1c7f41a65565..ebf133c9eea3 100644
--- a/drivers/usb/usbip/vhci_sysfs.c
+++ b/drivers/usb/usbip/vhci_sysfs.c
@@ -53,7 +53,7 @@ static ssize_t status_show(struct device *dev, struct device_attribute *attr,
* a security hole, the change is made to use sockfd instead.
*/
out += sprintf(out,
- "prt sta spd bus dev sockfd local_busid\n");
+ "prt sta spd dev sockfd local_busid\n");
for (i = 0; i < VHCI_NPORTS; i++) {
struct vhci_device *vdev = port_to_vdev(i);
@@ -64,8 +64,7 @@ static ssize_t status_show(struct device *dev, struct device_attribute *attr,
if (vdev->ud.status == VDEV_ST_USED) {
out += sprintf(out, "%03u %08x ",
vdev->speed, vdev->devid);
- out += sprintf(out, "%16p ", vdev->ud.tcp_socket);
- out += sprintf(out, "%06u", vdev->ud.sockfd);
+ out += sprintf(out, "%06u ", vdev->ud.sockfd);
out += sprintf(out, "%s", dev_name(&vdev->udev->dev));
} else
--
2.14.1
commit 1005bccd7a4a ("crypto: caam - enable instantiation of all RNG4 state
handles") introduces a control when incrementing ent_delay which contains
the following comment above it:
/*
* If either SH were instantiated by somebody else
* (e.g. u-boot) then it is assumed that the entropy
* parameters are properly set and thus the function
* setting these (kick_trng(...)) is skipped.
* Also, if a handle was instantiated, do not change
* the TRNG parameters.
*/
This is a problem observed when sec_init() has been run in u-boot and
and TrustZone is enabled. We can fix this by instantiating all rng state
handles in u-boot but, on the Kernel side we should ensure that this
non-terminating path is dealt with.
Fixes: 1005bccd7a4a ("crypto: caam - enable instantiation of all RNG4 state
handles")
Reported-by: Ryan Harkin <ryan.harkin(a)linaro.org>
Cc: "Horia Geantă" <horia.geanta(a)nxp.com>
Cc: Aymen Sghaier <aymen.sghaier(a)nxp.com>
Cc: Fabio Estevam <fabio.estevam(a)nxp.com>
Cc: Peng Fan <peng.fan(a)nxp.com>
Cc: Herbert Xu <herbert(a)gondor.apana.org.au>
Cc: "David S. Miller" <davem(a)davemloft.net>
Cc: Lukas Auer <lukas.auer(a)aisec.fraunhofer.de>
Cc: <stable(a)vger.kernel.org> # 4.12+
Signed-off-by: Bryan O'Donoghue <pure.logic(a)nexus-software.ie>
---
drivers/crypto/caam/ctrl.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/crypto/caam/ctrl.c b/drivers/crypto/caam/ctrl.c
index 98986d3..0a1e96b 100644
--- a/drivers/crypto/caam/ctrl.c
+++ b/drivers/crypto/caam/ctrl.c
@@ -704,7 +704,10 @@ static int caam_probe(struct platform_device *pdev)
ent_delay);
kick_trng(pdev, ent_delay);
ent_delay += 400;
+ } else if (ctrlpriv->rng4_sh_init && inst_handles) {
+ ent_delay += 400;
}
+
/*
* if instantiate_rng(...) fails, the loop will rerun
* and the kick_trng(...) function will modfiy the
--
2.7.4
The patch
ASoC: compress: Correct handling of copy callback
has been applied to the asoc tree at
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git
All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.
You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.
If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.
Please add any relevant lists and maintainers to the CCs when replying
to this mail.
Thanks,
Mark
>From 290df4d3ab192821b66857c05346b23056ee9545 Mon Sep 17 00:00:00 2001
From: Charles Keepax <ckeepax(a)opensource.cirrus.com>
Date: Fri, 26 Jan 2018 13:08:43 +0000
Subject: [PATCH] ASoC: compress: Correct handling of copy callback
The soc_compr_copy callback is currently broken. Since the
changes to move the compr_ops over to the component the return
value is not correctly propagated, always returning zero on
success rather than the number of bytes copied. This causes
user-space to stall continuously reading as it does not believe
it has received any data.
Furthermore, the changes to move the compr_ops over to the
component iterate through the list of components and will call
the copy callback for any that have compressed ops. There isn't
currently any consensus on the mechanism to combine the results
of multiple copy callbacks.
To fix this issue for now halt searching the component list when
we locate a copy callback and return the result of that single
callback. Additional work should probably be done to look at the
other ops, tidy things up, and work out if we want to support
multiple components on a single compressed, but this is the only
fix required to get things working again.
Fixes: 9e7e3738ab0e ("ASoC: snd_soc_component_driver has snd_compr_ops")
Signed-off-by: Charles Keepax <ckeepax(a)opensource.cirrus.com>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Cc: stable(a)vger.kernel.org
---
sound/soc/soc-compress.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/sound/soc/soc-compress.c b/sound/soc/soc-compress.c
index d9b1e6417fb9..1507117d1185 100644
--- a/sound/soc/soc-compress.c
+++ b/sound/soc/soc-compress.c
@@ -944,7 +944,7 @@ static int soc_compr_copy(struct snd_compr_stream *cstream,
struct snd_soc_platform *platform = rtd->platform;
struct snd_soc_component *component;
struct snd_soc_rtdcom_list *rtdcom;
- int ret = 0, __ret;
+ int ret = 0;
mutex_lock_nested(&rtd->pcm_mutex, rtd->pcm_subclass);
@@ -965,10 +965,10 @@ static int soc_compr_copy(struct snd_compr_stream *cstream,
!component->driver->compr_ops->copy)
continue;
- __ret = component->driver->compr_ops->copy(cstream, buf, count);
- if (__ret < 0)
- ret = __ret;
+ ret = component->driver->compr_ops->copy(cstream, buf, count);
+ break;
}
+
err:
mutex_unlock(&rtd->pcm_mutex);
return ret;
--
2.15.1
This is a note to let you know that I've just added the patch titled
staging: lustre: separate a connection destroy from free struct
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From 9b046013e5837f8a58453d1e9f8e01d03adb7fe7 Mon Sep 17 00:00:00 2001
From: Dmitry Eremin <dmitry.eremin(a)intel.com>
Date: Thu, 25 Jan 2018 16:51:04 +0300
Subject: staging: lustre: separate a connection destroy from free struct
kib_conn
The logic of the original commit 4d99b2581eff ("staging: lustre: avoid
intensive reconnecting for ko2iblnd") was assumed conditional free of
struct kib_conn if the second argument free_conn in function
kiblnd_destroy_conn(struct kib_conn *conn, bool free_conn) is true.
But this hunk of code was dropped from original commit. As result the logic
works wrong and current code use struct kib_conn after free.
> drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
> 3317 kiblnd_destroy_conn(conn, !peer);
> ^^^^ Freed always (but should be conditionally)
> 3318
> 3319 spin_lock_irqsave(lock, flags);
> 3320 if (!peer)
> 3321 continue;
> 3322
> 3323 conn->ibc_peer = peer;
> ^^^^^^^^^^^^^^ Use after free
> 3324 if (peer->ibp_reconnected < KIB_RECONN_HIGH_RACE)
> 3325 list_add_tail(&conn->ibc_list,
> ^^^^^^^^^^^^^^ Use after free
> 3326 &kiblnd_data.kib_reconn_list);
> 3327 else
> 3328 list_add_tail(&conn->ibc_list,
> ^^^^^^^^^^^^^^ Use after free
> 3329 &kiblnd_data.kib_reconn_wait);
To avoid confusion this fix moved the freeing a struct kib_conn outside of
the function kiblnd_destroy_conn() and free as it was intended in original
commit.
Cc: <stable(a)vger.kernel.org> # v4.6
Fixes: 4d99b2581eff ("staging: lustre: avoid intensive reconnecting for ko2iblnd")
Signed-off-by: Dmitry Eremin <Dmitry.Eremin(a)intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger(a)intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c | 7 +++----
drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h | 2 +-
drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c | 6 ++++--
3 files changed, 8 insertions(+), 7 deletions(-)
diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
index 2ebc484385b3..ec84edfda271 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
@@ -824,14 +824,15 @@ struct kib_conn *kiblnd_create_conn(struct kib_peer *peer, struct rdma_cm_id *cm
return conn;
failed_2:
- kiblnd_destroy_conn(conn, true);
+ kiblnd_destroy_conn(conn);
+ kfree(conn);
failed_1:
kfree(init_qp_attr);
failed_0:
return NULL;
}
-void kiblnd_destroy_conn(struct kib_conn *conn, bool free_conn)
+void kiblnd_destroy_conn(struct kib_conn *conn)
{
struct rdma_cm_id *cmid = conn->ibc_cmid;
struct kib_peer *peer = conn->ibc_peer;
@@ -889,8 +890,6 @@ void kiblnd_destroy_conn(struct kib_conn *conn, bool free_conn)
rdma_destroy_id(cmid);
atomic_dec(&net->ibn_nconns);
}
-
- kfree(conn);
}
int kiblnd_close_peer_conns_locked(struct kib_peer *peer, int why)
diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h
index 171eced213f8..b18911d09e9a 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h
@@ -1016,7 +1016,7 @@ int kiblnd_close_peer_conns_locked(struct kib_peer *peer, int why);
struct kib_conn *kiblnd_create_conn(struct kib_peer *peer,
struct rdma_cm_id *cmid,
int state, int version);
-void kiblnd_destroy_conn(struct kib_conn *conn, bool free_conn);
+void kiblnd_destroy_conn(struct kib_conn *conn);
void kiblnd_close_conn(struct kib_conn *conn, int error);
void kiblnd_close_conn_locked(struct kib_conn *conn, int error);
diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
index 9b3328c5d1e7..b3e7f28eb978 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
@@ -3314,11 +3314,13 @@ kiblnd_connd(void *arg)
spin_unlock_irqrestore(lock, flags);
dropped_lock = 1;
- kiblnd_destroy_conn(conn, !peer);
+ kiblnd_destroy_conn(conn);
spin_lock_irqsave(lock, flags);
- if (!peer)
+ if (!peer) {
+ kfree(conn);
continue;
+ }
conn->ibc_peer = peer;
if (peer->ibp_reconnected < KIB_RECONN_HIGH_RACE)
--
2.16.1
If the device gets into a state where RXNEMP (RX FIFO not empty)
interrupt is asserted without RXOK (new frame received successfully)
interrupt being asserted, xcan_rx_poll() will continue to try to clear
RXNEMP without actually reading frames from RX FIFO. If the RX FIFO is
not empty, the interrupt will not be cleared and napi_schedule() will
just be called again.
This situation can occur when:
(a) xcan_rx() returns without reading RX FIFO due to an error condition.
The code tries to clear both RXOK and RXNEMP but RXNEMP will not clear
due to a frame still being in the FIFO. The frame will never be read
from the FIFO as RXOK is no longer set.
(b) A frame is received between xcan_rx_poll() reading interrupt status
and clearing RXOK. RXOK will be cleared, but RXNEMP will again remain
set as the new message is still in the FIFO.
I'm able to trigger case (b) by flooding the bus with frames under load.
There does not seem to be any benefit in using both RXNEMP and RXOK in
the way the driver does, and the polling example in the reference manual
(UG585 v1.10 18.3.7 Read Messages from RxFIFO) also says that either
RXOK or RXNEMP can be used for detecting incoming messages.
Fix the issue and simplify the RX processing by only using RXNEMP
without RXOK.
Tested with the integrated CAN on Zynq-7000 SoC.
Fixes: b1201e44f50b ("can: xilinx CAN controller support")
Signed-off-by: Anssi Hannula <anssi.hannula(a)bitwise.fi>
Cc: <stable(a)vger.kernel.org>
---
drivers/net/can/xilinx_can.c | 18 +++++-------------
1 file changed, 5 insertions(+), 13 deletions(-)
diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
index 389a960..1bda47a 100644
--- a/drivers/net/can/xilinx_can.c
+++ b/drivers/net/can/xilinx_can.c
@@ -101,7 +101,7 @@ enum xcan_reg {
#define XCAN_INTR_ALL (XCAN_IXR_TXOK_MASK | XCAN_IXR_BSOFF_MASK |\
XCAN_IXR_WKUP_MASK | XCAN_IXR_SLP_MASK | \
XCAN_IXR_RXNEMP_MASK | XCAN_IXR_ERROR_MASK | \
- XCAN_IXR_ARBLST_MASK | XCAN_IXR_RXOK_MASK)
+ XCAN_IXR_ARBLST_MASK)
/* CAN register bit shift - XCAN_<REG>_<BIT>_SHIFT */
#define XCAN_BTR_SJW_SHIFT 7 /* Synchronous jump width */
@@ -708,15 +708,7 @@ static int xcan_rx_poll(struct napi_struct *napi, int quota)
isr = priv->read_reg(priv, XCAN_ISR_OFFSET);
while ((isr & XCAN_IXR_RXNEMP_MASK) && (work_done < quota)) {
- if (isr & XCAN_IXR_RXOK_MASK) {
- priv->write_reg(priv, XCAN_ICR_OFFSET,
- XCAN_IXR_RXOK_MASK);
- work_done += xcan_rx(ndev);
- } else {
- priv->write_reg(priv, XCAN_ICR_OFFSET,
- XCAN_IXR_RXNEMP_MASK);
- break;
- }
+ work_done += xcan_rx(ndev);
priv->write_reg(priv, XCAN_ICR_OFFSET, XCAN_IXR_RXNEMP_MASK);
isr = priv->read_reg(priv, XCAN_ISR_OFFSET);
}
@@ -727,7 +719,7 @@ static int xcan_rx_poll(struct napi_struct *napi, int quota)
if (work_done < quota) {
napi_complete_done(napi, work_done);
ier = priv->read_reg(priv, XCAN_IER_OFFSET);
- ier |= (XCAN_IXR_RXOK_MASK | XCAN_IXR_RXNEMP_MASK);
+ ier |= XCAN_IXR_RXNEMP_MASK;
priv->write_reg(priv, XCAN_IER_OFFSET, ier);
}
return work_done;
@@ -799,9 +791,9 @@ static irqreturn_t xcan_interrupt(int irq, void *dev_id)
}
/* Check for the type of receive interrupt and Processing it */
- if (isr & (XCAN_IXR_RXNEMP_MASK | XCAN_IXR_RXOK_MASK)) {
+ if (isr & XCAN_IXR_RXNEMP_MASK) {
ier = priv->read_reg(priv, XCAN_IER_OFFSET);
- ier &= ~(XCAN_IXR_RXNEMP_MASK | XCAN_IXR_RXOK_MASK);
+ ier &= ~XCAN_IXR_RXNEMP_MASK;
priv->write_reg(priv, XCAN_IER_OFFSET, ier);
napi_schedule(&priv->napi);
}
--
2.8.3
The xilinx_can driver performs a software reset when an RX overrun is
detected. This causes the device to enter Configuration mode where no
messages are received or transmitted.
The documentation does not mention any need to perform a reset on an RX
overrun, and testing by inducing an RX overflow also indicated that the
device continues to work just fine without a reset.
Remove the software reset.
Tested with the integrated CAN on Zynq-7000 SoC.
Fixes: b1201e44f50b ("can: xilinx CAN controller support")
Signed-off-by: Anssi Hannula <anssi.hannula(a)bitwise.fi>
Cc: <stable(a)vger.kernel.org>
---
drivers/net/can/xilinx_can.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/drivers/net/can/xilinx_can.c b/drivers/net/can/xilinx_can.c
index 89aec07..389a960 100644
--- a/drivers/net/can/xilinx_can.c
+++ b/drivers/net/can/xilinx_can.c
@@ -600,7 +600,6 @@ static void xcan_err_interrupt(struct net_device *ndev, u32 isr)
if (isr & XCAN_IXR_RXOFLW_MASK) {
stats->rx_over_errors++;
stats->rx_errors++;
- priv->write_reg(priv, XCAN_SRR_OFFSET, XCAN_SRR_RESET_MASK);
if (skb) {
cf->can_id |= CAN_ERR_CRTL;
cf->data[1] |= CAN_ERR_CRTL_RX_OVERFLOW;
--
2.8.3
This is a note to let you know that I've just added the patch titled
staging: lustre: separate a connection destroy from free struct
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the staging-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From 9b046013e5837f8a58453d1e9f8e01d03adb7fe7 Mon Sep 17 00:00:00 2001
From: Dmitry Eremin <dmitry.eremin(a)intel.com>
Date: Thu, 25 Jan 2018 16:51:04 +0300
Subject: staging: lustre: separate a connection destroy from free struct
kib_conn
The logic of the original commit 4d99b2581eff ("staging: lustre: avoid
intensive reconnecting for ko2iblnd") was assumed conditional free of
struct kib_conn if the second argument free_conn in function
kiblnd_destroy_conn(struct kib_conn *conn, bool free_conn) is true.
But this hunk of code was dropped from original commit. As result the logic
works wrong and current code use struct kib_conn after free.
> drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
> 3317 kiblnd_destroy_conn(conn, !peer);
> ^^^^ Freed always (but should be conditionally)
> 3318
> 3319 spin_lock_irqsave(lock, flags);
> 3320 if (!peer)
> 3321 continue;
> 3322
> 3323 conn->ibc_peer = peer;
> ^^^^^^^^^^^^^^ Use after free
> 3324 if (peer->ibp_reconnected < KIB_RECONN_HIGH_RACE)
> 3325 list_add_tail(&conn->ibc_list,
> ^^^^^^^^^^^^^^ Use after free
> 3326 &kiblnd_data.kib_reconn_list);
> 3327 else
> 3328 list_add_tail(&conn->ibc_list,
> ^^^^^^^^^^^^^^ Use after free
> 3329 &kiblnd_data.kib_reconn_wait);
To avoid confusion this fix moved the freeing a struct kib_conn outside of
the function kiblnd_destroy_conn() and free as it was intended in original
commit.
Cc: <stable(a)vger.kernel.org> # v4.6
Fixes: 4d99b2581eff ("staging: lustre: avoid intensive reconnecting for ko2iblnd")
Signed-off-by: Dmitry Eremin <Dmitry.Eremin(a)intel.com>
Reviewed-by: Andreas Dilger <andreas.dilger(a)intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c | 7 +++----
drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h | 2 +-
drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c | 6 ++++--
3 files changed, 8 insertions(+), 7 deletions(-)
diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
index 2ebc484385b3..ec84edfda271 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
@@ -824,14 +824,15 @@ struct kib_conn *kiblnd_create_conn(struct kib_peer *peer, struct rdma_cm_id *cm
return conn;
failed_2:
- kiblnd_destroy_conn(conn, true);
+ kiblnd_destroy_conn(conn);
+ kfree(conn);
failed_1:
kfree(init_qp_attr);
failed_0:
return NULL;
}
-void kiblnd_destroy_conn(struct kib_conn *conn, bool free_conn)
+void kiblnd_destroy_conn(struct kib_conn *conn)
{
struct rdma_cm_id *cmid = conn->ibc_cmid;
struct kib_peer *peer = conn->ibc_peer;
@@ -889,8 +890,6 @@ void kiblnd_destroy_conn(struct kib_conn *conn, bool free_conn)
rdma_destroy_id(cmid);
atomic_dec(&net->ibn_nconns);
}
-
- kfree(conn);
}
int kiblnd_close_peer_conns_locked(struct kib_peer *peer, int why)
diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h
index 171eced213f8..b18911d09e9a 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h
@@ -1016,7 +1016,7 @@ int kiblnd_close_peer_conns_locked(struct kib_peer *peer, int why);
struct kib_conn *kiblnd_create_conn(struct kib_peer *peer,
struct rdma_cm_id *cmid,
int state, int version);
-void kiblnd_destroy_conn(struct kib_conn *conn, bool free_conn);
+void kiblnd_destroy_conn(struct kib_conn *conn);
void kiblnd_close_conn(struct kib_conn *conn, int error);
void kiblnd_close_conn_locked(struct kib_conn *conn, int error);
diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
index 9b3328c5d1e7..b3e7f28eb978 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
@@ -3314,11 +3314,13 @@ kiblnd_connd(void *arg)
spin_unlock_irqrestore(lock, flags);
dropped_lock = 1;
- kiblnd_destroy_conn(conn, !peer);
+ kiblnd_destroy_conn(conn);
spin_lock_irqsave(lock, flags);
- if (!peer)
+ if (!peer) {
+ kfree(conn);
continue;
+ }
conn->ibc_peer = peer;
if (peer->ibp_reconnected < KIB_RECONN_HIGH_RACE)
--
2.16.1
This is a note to let you know that I've just added the patch titled
ARM: net: bpf: fix tail call jumps
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
arm-net-bpf-fix-tail-call-jumps.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From f4483f2cc1fdc03488c8a1452e545545ae5bda93 Mon Sep 17 00:00:00 2001
From: Russell King <rmk+kernel(a)armlinux.org.uk>
Date: Sat, 13 Jan 2018 11:39:54 +0000
Subject: ARM: net: bpf: fix tail call jumps
From: Russell King <rmk+kernel(a)armlinux.org.uk>
commit f4483f2cc1fdc03488c8a1452e545545ae5bda93 upstream.
When a tail call fails, it is documented that the tail call should
continue execution at the following instruction. An example tail call
sequence is:
12: (85) call bpf_tail_call#12
13: (b7) r0 = 0
14: (95) exit
The ARM assembler for the tail call in this case ends up branching to
instruction 14 instead of instruction 13, resulting in the BPF filter
returning a non-zero value:
178: ldr r8, [sp, #588] ; insn 12
17c: ldr r6, [r8, r6]
180: ldr r8, [sp, #580]
184: cmp r8, r6
188: bcs 0x1e8
18c: ldr r6, [sp, #524]
190: ldr r7, [sp, #528]
194: cmp r7, #0
198: cmpeq r6, #32
19c: bhi 0x1e8
1a0: adds r6, r6, #1
1a4: adc r7, r7, #0
1a8: str r6, [sp, #524]
1ac: str r7, [sp, #528]
1b0: mov r6, #104
1b4: ldr r8, [sp, #588]
1b8: add r6, r8, r6
1bc: ldr r8, [sp, #580]
1c0: lsl r7, r8, #2
1c4: ldr r6, [r6, r7]
1c8: cmp r6, #0
1cc: beq 0x1e8
1d0: mov r8, #32
1d4: ldr r6, [r6, r8]
1d8: add r6, r6, #44
1dc: bx r6
1e0: mov r0, #0 ; insn 13
1e4: mov r1, #0
1e8: add sp, sp, #596 ; insn 14
1ec: pop {r4, r5, r6, r7, r8, sl, pc}
For other sequences, the tail call could end up branching midway through
the following BPF instructions, or maybe off the end of the function,
leading to unknown behaviours.
Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler")
Signed-off-by: Russell King <rmk+kernel(a)armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm/net/bpf_jit_32.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -949,7 +949,7 @@ static int emit_bpf_tail_call(struct jit
const u8 *tcc = bpf2a32[TCALL_CNT];
const int idx0 = ctx->idx;
#define cur_offset (ctx->idx - idx0)
-#define jmp_offset (out_offset - (cur_offset))
+#define jmp_offset (out_offset - (cur_offset) - 2)
u32 off, lo, hi;
/* if (index >= array->map.max_entries)
Patches currently in stable-queue which might be from rmk+kernel(a)armlinux.org.uk are
queue-4.14/arm-net-bpf-fix-stack-alignment.patch
queue-4.14/arm-net-bpf-fix-ldx-instructions.patch
queue-4.14/arm-net-bpf-fix-register-saving.patch
queue-4.14/arm-net-bpf-move-stack-documentation.patch
queue-4.14/arm-net-bpf-correct-stack-layout-documentation.patch
queue-4.14/arm-net-bpf-clarify-tail_call-index.patch
queue-4.14/arm-net-bpf-avoid-bx-instruction-on-non-thumb-capable-cpus.patch
queue-4.14/arm-net-bpf-fix-tail-call-jumps.patch
This is a note to let you know that I've just added the patch titled
ARM: net: bpf: fix stack alignment
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
arm-net-bpf-fix-stack-alignment.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From d1220efd23484c72c82d5471f05daeb35b5d1916 Mon Sep 17 00:00:00 2001
From: Russell King <rmk+kernel(a)armlinux.org.uk>
Date: Sat, 13 Jan 2018 16:10:07 +0000
Subject: ARM: net: bpf: fix stack alignment
From: Russell King <rmk+kernel(a)armlinux.org.uk>
commit d1220efd23484c72c82d5471f05daeb35b5d1916 upstream.
As per 2dede2d8e925 ("ARM EABI: stack pointer must be 64-bit aligned
after a CPU exception") the stack should be aligned to a 64-bit boundary
on EABI systems. Ensure that the eBPF JIT appropraitely aligns the
stack.
Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler")
Signed-off-by: Russell King <rmk+kernel(a)armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm/net/bpf_jit_32.c | 11 ++++++++---
1 file changed, 8 insertions(+), 3 deletions(-)
--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -179,8 +179,13 @@ static void jit_fill_hole(void *area, un
*ptr++ = __opcode_to_mem_arm(ARM_INST_UDF);
}
-/* Stack must be multiples of 16 Bytes */
-#define STACK_ALIGN(sz) (((sz) + 3) & ~3)
+#if defined(CONFIG_AEABI) && (__LINUX_ARM_ARCH__ >= 5)
+/* EABI requires the stack to be aligned to 64-bit boundaries */
+#define STACK_ALIGNMENT 8
+#else
+/* Stack must be aligned to 32-bit boundaries */
+#define STACK_ALIGNMENT 4
+#endif
/* Stack space for BPF_REG_2, BPF_REG_3, BPF_REG_4,
* BPF_REG_5, BPF_REG_7, BPF_REG_8, BPF_REG_9,
@@ -194,7 +199,7 @@ static void jit_fill_hole(void *area, un
+ SCRATCH_SIZE + \
+ 4 /* extra for skb_copy_bits buffer */)
-#define STACK_SIZE STACK_ALIGN(_STACK_SIZE)
+#define STACK_SIZE ALIGN(_STACK_SIZE, STACK_ALIGNMENT)
/* Get the offset of eBPF REGISTERs stored on scratch space. */
#define STACK_VAR(off) (STACK_SIZE-off-4)
Patches currently in stable-queue which might be from rmk+kernel(a)armlinux.org.uk are
queue-4.14/arm-net-bpf-fix-stack-alignment.patch
queue-4.14/arm-net-bpf-fix-ldx-instructions.patch
queue-4.14/arm-net-bpf-fix-register-saving.patch
queue-4.14/arm-net-bpf-move-stack-documentation.patch
queue-4.14/arm-net-bpf-correct-stack-layout-documentation.patch
queue-4.14/arm-net-bpf-clarify-tail_call-index.patch
queue-4.14/arm-net-bpf-avoid-bx-instruction-on-non-thumb-capable-cpus.patch
queue-4.14/arm-net-bpf-fix-tail-call-jumps.patch
This is a note to let you know that I've just added the patch titled
ARM: net: bpf: clarify tail_call index
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
arm-net-bpf-clarify-tail_call-index.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 091f02483df7b56615b524491f404e574c5e0668 Mon Sep 17 00:00:00 2001
From: Russell King <rmk+kernel(a)armlinux.org.uk>
Date: Sat, 13 Jan 2018 12:11:26 +0000
Subject: ARM: net: bpf: clarify tail_call index
From: Russell King <rmk+kernel(a)armlinux.org.uk>
commit 091f02483df7b56615b524491f404e574c5e0668 upstream.
As per 90caccdd8cc0 ("bpf: fix bpf_tail_call() x64 JIT"), the index used
for array lookup is defined to be 32-bit wide. Update a misleading
comment that suggests it is 64-bit wide.
Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler")
Signed-off-by: Russell King <rmk+kernel(a)armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm/net/bpf_jit_32.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -1016,7 +1016,7 @@ static int emit_bpf_tail_call(struct jit
emit_a32_mov_i(tmp[1], off, false, ctx);
emit(ARM_LDR_I(tmp2[1], ARM_SP, STACK_VAR(r2[1])), ctx);
emit(ARM_LDR_R(tmp[1], tmp2[1], tmp[1]), ctx);
- /* index (64 bit) */
+ /* index is 32-bit for arrays */
emit(ARM_LDR_I(tmp2[1], ARM_SP, STACK_VAR(r3[1])), ctx);
/* index >= array->map.max_entries */
emit(ARM_CMP_R(tmp2[1], tmp[1]), ctx);
Patches currently in stable-queue which might be from rmk+kernel(a)armlinux.org.uk are
queue-4.14/arm-net-bpf-fix-stack-alignment.patch
queue-4.14/arm-net-bpf-fix-ldx-instructions.patch
queue-4.14/arm-net-bpf-fix-register-saving.patch
queue-4.14/arm-net-bpf-move-stack-documentation.patch
queue-4.14/arm-net-bpf-correct-stack-layout-documentation.patch
queue-4.14/arm-net-bpf-clarify-tail_call-index.patch
queue-4.14/arm-net-bpf-avoid-bx-instruction-on-non-thumb-capable-cpus.patch
queue-4.14/arm-net-bpf-fix-tail-call-jumps.patch
This is a note to let you know that I've just added the patch titled
ARM: net: bpf: avoid 'bx' instruction on non-Thumb capable CPUs
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
arm-net-bpf-avoid-bx-instruction-on-non-thumb-capable-cpus.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From e9062481824384f00299971f923fecf6b3668001 Mon Sep 17 00:00:00 2001
From: Russell King <rmk+kernel(a)armlinux.org.uk>
Date: Sat, 13 Jan 2018 11:35:15 +0000
Subject: ARM: net: bpf: avoid 'bx' instruction on non-Thumb capable CPUs
From: Russell King <rmk+kernel(a)armlinux.org.uk>
commit e9062481824384f00299971f923fecf6b3668001 upstream.
Avoid the 'bx' instruction on CPUs that have no support for Thumb and
thus do not implement this instruction by moving the generation of this
opcode to a separate function that selects between:
bx reg
and
mov pc, reg
according to the capabilities of the CPU.
Fixes: 39c13c204bb1 ("arm: eBPF JIT compiler")
Signed-off-by: Russell King <rmk+kernel(a)armlinux.org.uk>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/arm/net/bpf_jit_32.c | 18 +++++++++++-------
1 file changed, 11 insertions(+), 7 deletions(-)
--- a/arch/arm/net/bpf_jit_32.c
+++ b/arch/arm/net/bpf_jit_32.c
@@ -285,16 +285,20 @@ static inline void emit_mov_i(const u8 r
emit_mov_i_no8m(rd, val, ctx);
}
-static inline void emit_blx_r(u8 tgt_reg, struct jit_ctx *ctx)
+static void emit_bx_r(u8 tgt_reg, struct jit_ctx *ctx)
{
- ctx->seen |= SEEN_CALL;
-#if __LINUX_ARM_ARCH__ < 5
- emit(ARM_MOV_R(ARM_LR, ARM_PC), ctx);
-
if (elf_hwcap & HWCAP_THUMB)
emit(ARM_BX(tgt_reg), ctx);
else
emit(ARM_MOV_R(ARM_PC, tgt_reg), ctx);
+}
+
+static inline void emit_blx_r(u8 tgt_reg, struct jit_ctx *ctx)
+{
+ ctx->seen |= SEEN_CALL;
+#if __LINUX_ARM_ARCH__ < 5
+ emit(ARM_MOV_R(ARM_LR, ARM_PC), ctx);
+ emit_bx_r(tgt_reg, ctx);
#else
emit(ARM_BLX_R(tgt_reg), ctx);
#endif
@@ -997,7 +1001,7 @@ static int emit_bpf_tail_call(struct jit
emit_a32_mov_i(tmp2[1], off, false, ctx);
emit(ARM_LDR_R(tmp[1], tmp[1], tmp2[1]), ctx);
emit(ARM_ADD_I(tmp[1], tmp[1], ctx->prologue_bytes), ctx);
- emit(ARM_BX(tmp[1]), ctx);
+ emit_bx_r(tmp[1], ctx);
/* out: */
if (out_offset == -1)
@@ -1166,7 +1170,7 @@ static void build_epilogue(struct jit_ct
emit(ARM_POP(reg_set), ctx);
/* Return back to the callee function */
if (!(ctx->seen & SEEN_CALL))
- emit(ARM_BX(ARM_LR), ctx);
+ emit_bx_r(ARM_LR, ctx);
#endif
}
Patches currently in stable-queue which might be from rmk+kernel(a)armlinux.org.uk are
queue-4.14/arm-net-bpf-fix-stack-alignment.patch
queue-4.14/arm-net-bpf-fix-ldx-instructions.patch
queue-4.14/arm-net-bpf-fix-register-saving.patch
queue-4.14/arm-net-bpf-move-stack-documentation.patch
queue-4.14/arm-net-bpf-correct-stack-layout-documentation.patch
queue-4.14/arm-net-bpf-clarify-tail_call-index.patch
queue-4.14/arm-net-bpf-avoid-bx-instruction-on-non-thumb-capable-cpus.patch
queue-4.14/arm-net-bpf-fix-tail-call-jumps.patch
This is a note to let you know that I've just added the patch titled
um: link vmlinux with -no-pie
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
um-link-vmlinux-with-no-pie.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 883354afbc109c57f925ccc19840055193da0cc0 Mon Sep 17 00:00:00 2001
From: Thomas Meyer <thomas(a)m3y3r.de>
Date: Sun, 20 Aug 2017 13:26:04 +0200
Subject: um: link vmlinux with -no-pie
From: Thomas Meyer <thomas(a)m3y3r.de>
commit 883354afbc109c57f925ccc19840055193da0cc0 upstream.
Debian's gcc defaults to pie. The global Makefile already defines the -fno-pie option.
Link UML dynamic kernel image also with -no-pie to fix the build.
Signed-off-by: Thomas Meyer <thomas(a)m3y3r.de>
Signed-off-by: Richard Weinberger <richard(a)nod.at>
Cc: Bernie Innocenti <codewiz(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/um/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/um/Makefile
+++ b/arch/um/Makefile
@@ -117,7 +117,7 @@ archheaders:
archprepare: include/generated/user_constants.h
LINK-$(CONFIG_LD_SCRIPT_STATIC) += -static
-LINK-$(CONFIG_LD_SCRIPT_DYN) += -Wl,-rpath,/lib
+LINK-$(CONFIG_LD_SCRIPT_DYN) += -Wl,-rpath,/lib $(call cc-option, -no-pie)
CFLAGS_NO_HARDENING := $(call cc-option, -fno-PIC,) $(call cc-option, -fno-pic,) \
$(call cc-option, -fno-stack-protector,) \
Patches currently in stable-queue which might be from thomas(a)m3y3r.de are
queue-4.9/um-link-vmlinux-with-no-pie.patch
This is a note to let you know that I've just added the patch titled
um: link vmlinux with -no-pie
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
um-link-vmlinux-with-no-pie.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 883354afbc109c57f925ccc19840055193da0cc0 Mon Sep 17 00:00:00 2001
From: Thomas Meyer <thomas(a)m3y3r.de>
Date: Sun, 20 Aug 2017 13:26:04 +0200
Subject: um: link vmlinux with -no-pie
From: Thomas Meyer <thomas(a)m3y3r.de>
commit 883354afbc109c57f925ccc19840055193da0cc0 upstream.
Debian's gcc defaults to pie. The global Makefile already defines the -fno-pie option.
Link UML dynamic kernel image also with -no-pie to fix the build.
Signed-off-by: Thomas Meyer <thomas(a)m3y3r.de>
Signed-off-by: Richard Weinberger <richard(a)nod.at>
Cc: Bernie Innocenti <codewiz(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/um/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/um/Makefile
+++ b/arch/um/Makefile
@@ -117,7 +117,7 @@ archheaders:
archprepare: include/generated/user_constants.h
LINK-$(CONFIG_LD_SCRIPT_STATIC) += -static
-LINK-$(CONFIG_LD_SCRIPT_DYN) += -Wl,-rpath,/lib
+LINK-$(CONFIG_LD_SCRIPT_DYN) += -Wl,-rpath,/lib $(call cc-option, -no-pie)
CFLAGS_NO_HARDENING := $(call cc-option, -fno-PIC,) $(call cc-option, -fno-pic,) \
$(call cc-option, -fno-stack-protector,) \
Patches currently in stable-queue which might be from thomas(a)m3y3r.de are
queue-4.4/um-link-vmlinux-with-no-pie.patch
This is a note to let you know that I've just added the patch titled
um: link vmlinux with -no-pie
to the 3.18-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
um-link-vmlinux-with-no-pie.patch
and it can be found in the queue-3.18 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 883354afbc109c57f925ccc19840055193da0cc0 Mon Sep 17 00:00:00 2001
From: Thomas Meyer <thomas(a)m3y3r.de>
Date: Sun, 20 Aug 2017 13:26:04 +0200
Subject: um: link vmlinux with -no-pie
From: Thomas Meyer <thomas(a)m3y3r.de>
commit 883354afbc109c57f925ccc19840055193da0cc0 upstream.
Debian's gcc defaults to pie. The global Makefile already defines the -fno-pie option.
Link UML dynamic kernel image also with -no-pie to fix the build.
Signed-off-by: Thomas Meyer <thomas(a)m3y3r.de>
Signed-off-by: Richard Weinberger <richard(a)nod.at>
Cc: Bernie Innocenti <codewiz(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/um/Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/um/Makefile
+++ b/arch/um/Makefile
@@ -116,7 +116,7 @@ archheaders:
archprepare: include/generated/user_constants.h
LINK-$(CONFIG_LD_SCRIPT_STATIC) += -static
-LINK-$(CONFIG_LD_SCRIPT_DYN) += -Wl,-rpath,/lib
+LINK-$(CONFIG_LD_SCRIPT_DYN) += -Wl,-rpath,/lib $(call cc-option, -no-pie)
CFLAGS_NO_HARDENING := $(call cc-option, -fno-PIC,) $(call cc-option, -fno-pic,) \
$(call cc-option, -fno-stack-protector,) \
Patches currently in stable-queue which might be from thomas(a)m3y3r.de are
queue-3.18/um-link-vmlinux-with-no-pie.patch
This is a note to let you know that I've just added the patch titled
orangefs: fix deadlock; do not write i_size in read_iter
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
orangefs-fix-deadlock-do-not-write-i_size-in-read_iter.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 6793f1c450b1533a5e9c2493490de771d38b24f9 Mon Sep 17 00:00:00 2001
From: Martin Brandenburg <martin(a)omnibond.com>
Date: Thu, 25 Jan 2018 19:39:44 -0500
Subject: orangefs: fix deadlock; do not write i_size in read_iter
From: Martin Brandenburg <martin(a)omnibond.com>
commit 6793f1c450b1533a5e9c2493490de771d38b24f9 upstream.
After do_readv_writev, the inode cache is invalidated anyway, so i_size
will never be read. It will be fetched from the server which will also
know about updates from other machines.
Fixes deadlock on 32-bit SMP.
See https://marc.info/?l=linux-fsdevel&m=151268557427760&w=2
Signed-off-by: Martin Brandenburg <martin(a)omnibond.com>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: Mike Marshall <hubcap(a)omnibond.com>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/orangefs/file.c | 7 ++-----
fs/orangefs/orangefs-kernel.h | 11 -----------
2 files changed, 2 insertions(+), 16 deletions(-)
--- a/fs/orangefs/file.c
+++ b/fs/orangefs/file.c
@@ -446,7 +446,7 @@ ssize_t orangefs_inode_read(struct inode
static ssize_t orangefs_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
{
struct file *file = iocb->ki_filp;
- loff_t pos = *(&iocb->ki_pos);
+ loff_t pos = iocb->ki_pos;
ssize_t rc = 0;
BUG_ON(iocb->private);
@@ -485,9 +485,6 @@ static ssize_t orangefs_file_write_iter(
}
}
- if (file->f_pos > i_size_read(file->f_mapping->host))
- orangefs_i_size_write(file->f_mapping->host, file->f_pos);
-
rc = generic_write_checks(iocb, iter);
if (rc <= 0) {
@@ -501,7 +498,7 @@ static ssize_t orangefs_file_write_iter(
* pos to the end of the file, so we will wait till now to set
* pos...
*/
- pos = *(&iocb->ki_pos);
+ pos = iocb->ki_pos;
rc = do_readv_writev(ORANGEFS_IO_WRITE,
file,
--- a/fs/orangefs/orangefs-kernel.h
+++ b/fs/orangefs/orangefs-kernel.h
@@ -570,17 +570,6 @@ do { \
sys_attr.mask = ORANGEFS_ATTR_SYS_ALL_SETABLE; \
} while (0)
-static inline void orangefs_i_size_write(struct inode *inode, loff_t i_size)
-{
-#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
- inode_lock(inode);
-#endif
- i_size_write(inode, i_size);
-#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
- inode_unlock(inode);
-#endif
-}
-
static inline void orangefs_set_timeout(struct dentry *dentry)
{
unsigned long time = jiffies + orangefs_dcache_timeout_msecs*HZ/1000;
Patches currently in stable-queue which might be from martin(a)omnibond.com are
queue-4.9/orangefs-fix-deadlock-do-not-write-i_size-in-read_iter.patch
queue-4.9/orangefs-initialize-op-on-loop-restart-in-orangefs_devreq_read.patch
queue-4.9/orangefs-use-list_for_each_entry_safe-in-purge_waiting_ops.patch
This is a note to let you know that I've just added the patch titled
Input: trackpoint - force 3 buttons if 0 button is reported
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
input-trackpoint-force-3-buttons-if-0-button-is-reported.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From f5d07b9e98022d50720e38aa936fc11c67868ece Mon Sep 17 00:00:00 2001
From: Aaron Ma <aaron.ma(a)canonical.com>
Date: Fri, 19 Jan 2018 09:43:39 -0800
Subject: Input: trackpoint - force 3 buttons if 0 button is reported
From: Aaron Ma <aaron.ma(a)canonical.com>
commit f5d07b9e98022d50720e38aa936fc11c67868ece upstream.
Lenovo introduced trackpoint compatible sticks with minimum PS/2 commands.
They supposed to reply with 0x02, 0x03, or 0x04 in response to the
"Read Extended ID" command, so we would know not to try certain extended
commands. Unfortunately even some trackpoints reporting the original IBM
version (0x01 firmware 0x0e) now respond with incorrect data to the "Get
Extended Buttons" command:
thinkpad_acpi: ThinkPad BIOS R0DET87W (1.87 ), EC unknown
thinkpad_acpi: Lenovo ThinkPad E470, model 20H1004SGE
psmouse serio2: trackpoint: IBM TrackPoint firmware: 0x0e, buttons: 0/0
Since there are no trackpoints without buttons, let's assume the trackpoint
has 3 buttons when we get 0 response to the extended buttons query.
Signed-off-by: Aaron Ma <aaron.ma(a)canonical.com>
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=196253
Signed-off-by: Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/input/mouse/trackpoint.c | 3 +++
1 file changed, 3 insertions(+)
--- a/drivers/input/mouse/trackpoint.c
+++ b/drivers/input/mouse/trackpoint.c
@@ -383,6 +383,9 @@ int trackpoint_detect(struct psmouse *ps
if (trackpoint_read(&psmouse->ps2dev, TP_EXT_BTN, &button_info)) {
psmouse_warn(psmouse, "failed to get extended button data, assuming 3 buttons\n");
button_info = 0x33;
+ } else if (!button_info) {
+ psmouse_warn(psmouse, "got 0 in extended button data, assuming 3 buttons\n");
+ button_info = 0x33;
}
psmouse->private = kzalloc(sizeof(struct trackpoint_data), GFP_KERNEL);
Patches currently in stable-queue which might be from aaron.ma(a)canonical.com are
queue-4.9/input-trackpoint-force-3-buttons-if-0-button-is-reported.patch
This is a note to let you know that I've just added the patch titled
usbip: prevent leaking socket pointer address in messages
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
usbip-prevent-leaking-socket-pointer-address-in-messages.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 90120d15f4c397272aaf41077960a157fc4212bf Mon Sep 17 00:00:00 2001
From: Shuah Khan <shuahkh(a)osg.samsung.com>
Date: Fri, 15 Dec 2017 10:50:09 -0700
Subject: usbip: prevent leaking socket pointer address in messages
From: Shuah Khan <shuahkh(a)osg.samsung.com>
commit 90120d15f4c397272aaf41077960a157fc4212bf upstream.
usbip driver is leaking socket pointer address in messages. Remove
the messages that aren't useful and print sockfd in the ones that
are useful for debugging.
Signed-off-by: Shuah Khan <shuahkh(a)osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/usbip/stub_dev.c | 3 +--
drivers/usb/usbip/usbip_common.c | 15 ++++-----------
drivers/usb/usbip/vhci_hcd.c | 2 +-
3 files changed, 6 insertions(+), 14 deletions(-)
--- a/drivers/usb/usbip/stub_dev.c
+++ b/drivers/usb/usbip/stub_dev.c
@@ -163,8 +163,7 @@ static void stub_shutdown_connection(str
* step 1?
*/
if (ud->tcp_socket) {
- dev_dbg(&sdev->udev->dev, "shutdown tcp_socket %p\n",
- ud->tcp_socket);
+ dev_dbg(&sdev->udev->dev, "shutdown sockfd %d\n", ud->sockfd);
kernel_sock_shutdown(ud->tcp_socket, SHUT_RDWR);
}
--- a/drivers/usb/usbip/usbip_common.c
+++ b/drivers/usb/usbip/usbip_common.c
@@ -317,18 +317,14 @@ int usbip_recv(struct socket *sock, void
struct msghdr msg;
struct kvec iov;
int total = 0;
-
/* for blocks of if (usbip_dbg_flag_xmit) */
char *bp = buf;
int osize = size;
- usbip_dbg_xmit("enter\n");
-
- if (!sock || !buf || !size) {
- pr_err("invalid arg, sock %p buff %p size %d\n", sock, buf,
- size);
+ if (!sock || !buf || !size)
return -EINVAL;
- }
+
+ usbip_dbg_xmit("enter\n");
do {
sock->sk->sk_allocation = GFP_NOIO;
@@ -341,11 +337,8 @@ int usbip_recv(struct socket *sock, void
msg.msg_flags = MSG_NOSIGNAL;
result = kernel_recvmsg(sock, &msg, &iov, 1, size, MSG_WAITALL);
- if (result <= 0) {
- pr_debug("receive sock %p buf %p size %u ret %d total %d\n",
- sock, buf, size, result, total);
+ if (result <= 0)
goto err;
- }
size -= result;
buf += result;
--- a/drivers/usb/usbip/vhci_hcd.c
+++ b/drivers/usb/usbip/vhci_hcd.c
@@ -778,7 +778,7 @@ static void vhci_shutdown_connection(str
/* need this? see stub_dev.c */
if (ud->tcp_socket) {
- pr_debug("shutdown tcp_socket %p\n", ud->tcp_socket);
+ pr_debug("shutdown sockfd %d\n", ud->sockfd);
kernel_sock_shutdown(ud->tcp_socket, SHUT_RDWR);
}
Patches currently in stable-queue which might be from shuahkh(a)osg.samsung.com are
queue-4.4/usbip-fix-stub_rx-harden-cmd_submit-path-to-handle-malicious-input.patch
queue-4.4/usbip-fix-potential-format-overflow-in-userspace-tools.patch
queue-4.4/usbip-fix-implicit-fallthrough-warning.patch
queue-4.4/usbip-prevent-leaking-socket-pointer-address-in-messages.patch
queue-4.4/usbip-fix-stub_rx-get_pipe-to-validate-endpoint-number.patch
queue-4.4/usbip-prevent-vhci_hcd-driver-from-leaking-a-socket-pointer-address.patch
queue-4.4/usb-usbip-fix-possible-deadlocks-reported-by-lockdep.patch
This is a note to let you know that I've just added the patch titled
usbip: fix stub_rx: get_pipe() to validate endpoint number
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
usbip-fix-stub_rx-get_pipe-to-validate-endpoint-number.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 635f545a7e8be7596b9b2b6a43cab6bbd5a88e43 Mon Sep 17 00:00:00 2001
From: Shuah Khan <shuahkh(a)osg.samsung.com>
Date: Thu, 7 Dec 2017 14:16:47 -0700
Subject: usbip: fix stub_rx: get_pipe() to validate endpoint number
From: Shuah Khan <shuahkh(a)osg.samsung.com>
commit 635f545a7e8be7596b9b2b6a43cab6bbd5a88e43 upstream.
get_pipe() routine doesn't validate the input endpoint number
and uses to reference ep_in and ep_out arrays. Invalid endpoint
number can trigger BUG(). Range check the epnum and returning
error instead of calling BUG().
Change caller stub_recv_cmd_submit() to handle the get_pipe()
error return.
Reported-by: Secunia Research <vuln(a)secunia.com>
Signed-off-by: Shuah Khan <shuahkh(a)osg.samsung.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/usbip/stub_rx.c | 18 +++++++++++-------
1 file changed, 11 insertions(+), 7 deletions(-)
--- a/drivers/usb/usbip/stub_rx.c
+++ b/drivers/usb/usbip/stub_rx.c
@@ -344,15 +344,15 @@ static int get_pipe(struct stub_device *
struct usb_host_endpoint *ep;
struct usb_endpoint_descriptor *epd = NULL;
+ if (epnum < 0 || epnum > 15)
+ goto err_ret;
+
if (dir == USBIP_DIR_IN)
ep = udev->ep_in[epnum & 0x7f];
else
ep = udev->ep_out[epnum & 0x7f];
- if (!ep) {
- dev_err(&sdev->interface->dev, "no such endpoint?, %d\n",
- epnum);
- BUG();
- }
+ if (!ep)
+ goto err_ret;
epd = &ep->desc;
if (usb_endpoint_xfer_control(epd)) {
@@ -383,9 +383,10 @@ static int get_pipe(struct stub_device *
return usb_rcvisocpipe(udev, epnum);
}
+err_ret:
/* NOT REACHED */
- dev_err(&sdev->interface->dev, "get pipe, epnum %d\n", epnum);
- return 0;
+ dev_err(&sdev->udev->dev, "get pipe() invalid epnum %d\n", epnum);
+ return -1;
}
static void masking_bogus_flags(struct urb *urb)
@@ -451,6 +452,9 @@ static void stub_recv_cmd_submit(struct
struct usb_device *udev = sdev->udev;
int pipe = get_pipe(sdev, pdu->base.ep, pdu->base.direction);
+ if (pipe == -1)
+ return;
+
priv = stub_priv_alloc(sdev, pdu);
if (!priv)
return;
Patches currently in stable-queue which might be from shuahkh(a)osg.samsung.com are
queue-4.4/usbip-fix-stub_rx-harden-cmd_submit-path-to-handle-malicious-input.patch
queue-4.4/usbip-fix-potential-format-overflow-in-userspace-tools.patch
queue-4.4/usbip-fix-implicit-fallthrough-warning.patch
queue-4.4/usbip-prevent-leaking-socket-pointer-address-in-messages.patch
queue-4.4/usbip-fix-stub_rx-get_pipe-to-validate-endpoint-number.patch
queue-4.4/usbip-prevent-vhci_hcd-driver-from-leaking-a-socket-pointer-address.patch
queue-4.4/usb-usbip-fix-possible-deadlocks-reported-by-lockdep.patch
This is a note to let you know that I've just added the patch titled
Input: trackpoint - force 3 buttons if 0 button is reported
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
input-trackpoint-force-3-buttons-if-0-button-is-reported.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From f5d07b9e98022d50720e38aa936fc11c67868ece Mon Sep 17 00:00:00 2001
From: Aaron Ma <aaron.ma(a)canonical.com>
Date: Fri, 19 Jan 2018 09:43:39 -0800
Subject: Input: trackpoint - force 3 buttons if 0 button is reported
From: Aaron Ma <aaron.ma(a)canonical.com>
commit f5d07b9e98022d50720e38aa936fc11c67868ece upstream.
Lenovo introduced trackpoint compatible sticks with minimum PS/2 commands.
They supposed to reply with 0x02, 0x03, or 0x04 in response to the
"Read Extended ID" command, so we would know not to try certain extended
commands. Unfortunately even some trackpoints reporting the original IBM
version (0x01 firmware 0x0e) now respond with incorrect data to the "Get
Extended Buttons" command:
thinkpad_acpi: ThinkPad BIOS R0DET87W (1.87 ), EC unknown
thinkpad_acpi: Lenovo ThinkPad E470, model 20H1004SGE
psmouse serio2: trackpoint: IBM TrackPoint firmware: 0x0e, buttons: 0/0
Since there are no trackpoints without buttons, let's assume the trackpoint
has 3 buttons when we get 0 response to the extended buttons query.
Signed-off-by: Aaron Ma <aaron.ma(a)canonical.com>
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=196253
Signed-off-by: Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/input/mouse/trackpoint.c | 3 +++
1 file changed, 3 insertions(+)
--- a/drivers/input/mouse/trackpoint.c
+++ b/drivers/input/mouse/trackpoint.c
@@ -383,6 +383,9 @@ int trackpoint_detect(struct psmouse *ps
if (trackpoint_read(&psmouse->ps2dev, TP_EXT_BTN, &button_info)) {
psmouse_warn(psmouse, "failed to get extended button data, assuming 3 buttons\n");
button_info = 0x33;
+ } else if (!button_info) {
+ psmouse_warn(psmouse, "got 0 in extended button data, assuming 3 buttons\n");
+ button_info = 0x33;
}
psmouse->private = kzalloc(sizeof(struct trackpoint_data), GFP_KERNEL);
Patches currently in stable-queue which might be from aaron.ma(a)canonical.com are
queue-4.4/input-trackpoint-force-3-buttons-if-0-button-is-reported.patch
This is a note to let you know that I've just added the patch titled
orangefs: fix deadlock; do not write i_size in read_iter
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
orangefs-fix-deadlock-do-not-write-i_size-in-read_iter.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 6793f1c450b1533a5e9c2493490de771d38b24f9 Mon Sep 17 00:00:00 2001
From: Martin Brandenburg <martin(a)omnibond.com>
Date: Thu, 25 Jan 2018 19:39:44 -0500
Subject: orangefs: fix deadlock; do not write i_size in read_iter
From: Martin Brandenburg <martin(a)omnibond.com>
commit 6793f1c450b1533a5e9c2493490de771d38b24f9 upstream.
After do_readv_writev, the inode cache is invalidated anyway, so i_size
will never be read. It will be fetched from the server which will also
know about updates from other machines.
Fixes deadlock on 32-bit SMP.
See https://marc.info/?l=linux-fsdevel&m=151268557427760&w=2
Signed-off-by: Martin Brandenburg <martin(a)omnibond.com>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: Mike Marshall <hubcap(a)omnibond.com>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/orangefs/file.c | 7 ++-----
fs/orangefs/orangefs-kernel.h | 11 -----------
2 files changed, 2 insertions(+), 16 deletions(-)
--- a/fs/orangefs/file.c
+++ b/fs/orangefs/file.c
@@ -446,7 +446,7 @@ ssize_t orangefs_inode_read(struct inode
static ssize_t orangefs_file_read_iter(struct kiocb *iocb, struct iov_iter *iter)
{
struct file *file = iocb->ki_filp;
- loff_t pos = *(&iocb->ki_pos);
+ loff_t pos = iocb->ki_pos;
ssize_t rc = 0;
BUG_ON(iocb->private);
@@ -486,9 +486,6 @@ static ssize_t orangefs_file_write_iter(
}
}
- if (file->f_pos > i_size_read(file->f_mapping->host))
- orangefs_i_size_write(file->f_mapping->host, file->f_pos);
-
rc = generic_write_checks(iocb, iter);
if (rc <= 0) {
@@ -502,7 +499,7 @@ static ssize_t orangefs_file_write_iter(
* pos to the end of the file, so we will wait till now to set
* pos...
*/
- pos = *(&iocb->ki_pos);
+ pos = iocb->ki_pos;
rc = do_readv_writev(ORANGEFS_IO_WRITE,
file,
--- a/fs/orangefs/orangefs-kernel.h
+++ b/fs/orangefs/orangefs-kernel.h
@@ -566,17 +566,6 @@ do { \
sys_attr.mask = ORANGEFS_ATTR_SYS_ALL_SETABLE; \
} while (0)
-static inline void orangefs_i_size_write(struct inode *inode, loff_t i_size)
-{
-#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
- inode_lock(inode);
-#endif
- i_size_write(inode, i_size);
-#if BITS_PER_LONG == 32 && defined(CONFIG_SMP)
- inode_unlock(inode);
-#endif
-}
-
static inline void orangefs_set_timeout(struct dentry *dentry)
{
unsigned long time = jiffies + orangefs_dcache_timeout_msecs*HZ/1000;
Patches currently in stable-queue which might be from martin(a)omnibond.com are
queue-4.14/orangefs-fix-deadlock-do-not-write-i_size-in-read_iter.patch
queue-4.14/orangefs-initialize-op-on-loop-restart-in-orangefs_devreq_read.patch
queue-4.14/orangefs-use-list_for_each_entry_safe-in-purge_waiting_ops.patch
This is a note to let you know that I've just added the patch titled
KVM: s390: add proper locking for CMMA migration bitmap
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
kvm-s390-add-proper-locking-for-cmma-migration-bitmap.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 1de1ea7efeb9e8543212210e34518b4049ccd285 Mon Sep 17 00:00:00 2001
From: Christian Borntraeger <borntraeger(a)de.ibm.com>
Date: Fri, 22 Dec 2017 10:54:20 +0100
Subject: KVM: s390: add proper locking for CMMA migration bitmap
From: Christian Borntraeger <borntraeger(a)de.ibm.com>
commit 1de1ea7efeb9e8543212210e34518b4049ccd285 upstream.
Some parts of the cmma migration bitmap is already protected
with the kvm->lock (e.g. the migration start). On the other
hand the read of the cmma bits is not protected against a
concurrent free, neither is the emulation of the ESSA instruction.
Let's extend the locking to all related ioctls by using
the slots lock for
- kvm_s390_vm_start_migration
- kvm_s390_vm_stop_migration
- kvm_s390_set_cmma_bits
- kvm_s390_get_cmma_bits
In addition to that, we use synchronize_srcu before freeing
the migration structure as all users hold kvm->srcu for read.
(e.g. the ESSA handler).
Reported-by: David Hildenbrand <david(a)redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger(a)de.ibm.com>
Fixes: 190df4a212a7 (KVM: s390: CMMA tracking, ESSA emulation, migration mode)
Reviewed-by: Claudio Imbrenda <imbrenda(a)linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: Cornelia Huck <cohuck(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/s390/kvm/kvm-s390.c | 18 +++++++++++-------
1 file changed, 11 insertions(+), 7 deletions(-)
--- a/arch/s390/kvm/kvm-s390.c
+++ b/arch/s390/kvm/kvm-s390.c
@@ -768,7 +768,7 @@ static void kvm_s390_sync_request_broadc
/*
* Must be called with kvm->srcu held to avoid races on memslots, and with
- * kvm->lock to avoid races with ourselves and kvm_s390_vm_stop_migration.
+ * kvm->slots_lock to avoid races with ourselves and kvm_s390_vm_stop_migration.
*/
static int kvm_s390_vm_start_migration(struct kvm *kvm)
{
@@ -824,7 +824,7 @@ static int kvm_s390_vm_start_migration(s
}
/*
- * Must be called with kvm->lock to avoid races with ourselves and
+ * Must be called with kvm->slots_lock to avoid races with ourselves and
* kvm_s390_vm_start_migration.
*/
static int kvm_s390_vm_stop_migration(struct kvm *kvm)
@@ -839,6 +839,8 @@ static int kvm_s390_vm_stop_migration(st
if (kvm->arch.use_cmma) {
kvm_s390_sync_request_broadcast(kvm, KVM_REQ_STOP_MIGRATION);
+ /* We have to wait for the essa emulation to finish */
+ synchronize_srcu(&kvm->srcu);
vfree(mgs->pgste_bitmap);
}
kfree(mgs);
@@ -848,14 +850,12 @@ static int kvm_s390_vm_stop_migration(st
static int kvm_s390_vm_set_migration(struct kvm *kvm,
struct kvm_device_attr *attr)
{
- int idx, res = -ENXIO;
+ int res = -ENXIO;
- mutex_lock(&kvm->lock);
+ mutex_lock(&kvm->slots_lock);
switch (attr->attr) {
case KVM_S390_VM_MIGRATION_START:
- idx = srcu_read_lock(&kvm->srcu);
res = kvm_s390_vm_start_migration(kvm);
- srcu_read_unlock(&kvm->srcu, idx);
break;
case KVM_S390_VM_MIGRATION_STOP:
res = kvm_s390_vm_stop_migration(kvm);
@@ -863,7 +863,7 @@ static int kvm_s390_vm_set_migration(str
default:
break;
}
- mutex_unlock(&kvm->lock);
+ mutex_unlock(&kvm->slots_lock);
return res;
}
@@ -1753,7 +1753,9 @@ long kvm_arch_vm_ioctl(struct file *filp
r = -EFAULT;
if (copy_from_user(&args, argp, sizeof(args)))
break;
+ mutex_lock(&kvm->slots_lock);
r = kvm_s390_get_cmma_bits(kvm, &args);
+ mutex_unlock(&kvm->slots_lock);
if (!r) {
r = copy_to_user(argp, &args, sizeof(args));
if (r)
@@ -1767,7 +1769,9 @@ long kvm_arch_vm_ioctl(struct file *filp
r = -EFAULT;
if (copy_from_user(&args, argp, sizeof(args)))
break;
+ mutex_lock(&kvm->slots_lock);
r = kvm_s390_set_cmma_bits(kvm, &args);
+ mutex_unlock(&kvm->slots_lock);
break;
}
default:
Patches currently in stable-queue which might be from borntraeger(a)de.ibm.com are
queue-4.14/kvm-s390-add-proper-locking-for-cmma-migration-bitmap.patch
This is a note to let you know that I've just added the patch titled
Input: xpad - add support for PDP Xbox One controllers
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
input-xpad-add-support-for-pdp-xbox-one-controllers.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From e5c9c6a885fad00aa559b49d8fc23a60e290824e Mon Sep 17 00:00:00 2001
From: Mark Furneaux <mark(a)furneaux.ca>
Date: Mon, 22 Jan 2018 11:24:17 -0800
Subject: Input: xpad - add support for PDP Xbox One controllers
From: Mark Furneaux <mark(a)furneaux.ca>
commit e5c9c6a885fad00aa559b49d8fc23a60e290824e upstream.
Adds support for the current lineup of Xbox One controllers from PDP
(Performance Designed Products). These controllers are very picky with
their initialization sequence and require an additional 2 packets before
they send any input reports.
Signed-off-by: Mark Furneaux <mark(a)furneaux.ca>
Reviewed-by: Cameron Gutman <aicommander(a)gmail.com>
Signed-off-by: Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/input/joystick/xpad.c | 19 +++++++++++++++++++
1 file changed, 19 insertions(+)
--- a/drivers/input/joystick/xpad.c
+++ b/drivers/input/joystick/xpad.c
@@ -229,6 +229,7 @@ static const struct xpad_device {
{ 0x0e6f, 0x0213, "Afterglow Gamepad for Xbox 360", 0, XTYPE_XBOX360 },
{ 0x0e6f, 0x021f, "Rock Candy Gamepad for Xbox 360", 0, XTYPE_XBOX360 },
{ 0x0e6f, 0x0246, "Rock Candy Gamepad for Xbox One 2015", 0, XTYPE_XBOXONE },
+ { 0x0e6f, 0x02ab, "PDP Controller for Xbox One", 0, XTYPE_XBOXONE },
{ 0x0e6f, 0x0301, "Logic3 Controller", 0, XTYPE_XBOX360 },
{ 0x0e6f, 0x0346, "Rock Candy Gamepad for Xbox One 2016", 0, XTYPE_XBOXONE },
{ 0x0e6f, 0x0401, "Logic3 Controller", 0, XTYPE_XBOX360 },
@@ -476,6 +477,22 @@ static const u8 xboxone_hori_init[] = {
};
/*
+ * This packet is required for some of the PDP pads to start
+ * sending input reports. One of those pads is (0x0e6f:0x02ab).
+ */
+static const u8 xboxone_pdp_init1[] = {
+ 0x0a, 0x20, 0x00, 0x03, 0x00, 0x01, 0x14
+};
+
+/*
+ * This packet is required for some of the PDP pads to start
+ * sending input reports. One of those pads is (0x0e6f:0x02ab).
+ */
+static const u8 xboxone_pdp_init2[] = {
+ 0x06, 0x20, 0x00, 0x02, 0x01, 0x00
+};
+
+/*
* A specific rumble packet is required for some PowerA pads to start
* sending input reports. One of those pads is (0x24c6:0x543a).
*/
@@ -505,6 +522,8 @@ static const struct xboxone_init_packet
XBOXONE_INIT_PKT(0x0e6f, 0x0165, xboxone_hori_init),
XBOXONE_INIT_PKT(0x0f0d, 0x0067, xboxone_hori_init),
XBOXONE_INIT_PKT(0x0000, 0x0000, xboxone_fw2015_init),
+ XBOXONE_INIT_PKT(0x0e6f, 0x02ab, xboxone_pdp_init1),
+ XBOXONE_INIT_PKT(0x0e6f, 0x02ab, xboxone_pdp_init2),
XBOXONE_INIT_PKT(0x24c6, 0x541a, xboxone_rumblebegin_init),
XBOXONE_INIT_PKT(0x24c6, 0x542a, xboxone_rumblebegin_init),
XBOXONE_INIT_PKT(0x24c6, 0x543a, xboxone_rumblebegin_init),
Patches currently in stable-queue which might be from mark(a)furneaux.ca are
queue-4.14/input-xpad-add-support-for-pdp-xbox-one-controllers.patch
This is a note to let you know that I've just added the patch titled
Input: trackpoint - force 3 buttons if 0 button is reported
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
input-trackpoint-force-3-buttons-if-0-button-is-reported.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From f5d07b9e98022d50720e38aa936fc11c67868ece Mon Sep 17 00:00:00 2001
From: Aaron Ma <aaron.ma(a)canonical.com>
Date: Fri, 19 Jan 2018 09:43:39 -0800
Subject: Input: trackpoint - force 3 buttons if 0 button is reported
From: Aaron Ma <aaron.ma(a)canonical.com>
commit f5d07b9e98022d50720e38aa936fc11c67868ece upstream.
Lenovo introduced trackpoint compatible sticks with minimum PS/2 commands.
They supposed to reply with 0x02, 0x03, or 0x04 in response to the
"Read Extended ID" command, so we would know not to try certain extended
commands. Unfortunately even some trackpoints reporting the original IBM
version (0x01 firmware 0x0e) now respond with incorrect data to the "Get
Extended Buttons" command:
thinkpad_acpi: ThinkPad BIOS R0DET87W (1.87 ), EC unknown
thinkpad_acpi: Lenovo ThinkPad E470, model 20H1004SGE
psmouse serio2: trackpoint: IBM TrackPoint firmware: 0x0e, buttons: 0/0
Since there are no trackpoints without buttons, let's assume the trackpoint
has 3 buttons when we get 0 response to the extended buttons query.
Signed-off-by: Aaron Ma <aaron.ma(a)canonical.com>
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=196253
Signed-off-by: Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/input/mouse/trackpoint.c | 3 +++
1 file changed, 3 insertions(+)
--- a/drivers/input/mouse/trackpoint.c
+++ b/drivers/input/mouse/trackpoint.c
@@ -383,6 +383,9 @@ int trackpoint_detect(struct psmouse *ps
if (trackpoint_read(ps2dev, TP_EXT_BTN, &button_info)) {
psmouse_warn(psmouse, "failed to get extended button data, assuming 3 buttons\n");
button_info = 0x33;
+ } else if (!button_info) {
+ psmouse_warn(psmouse, "got 0 in extended button data, assuming 3 buttons\n");
+ button_info = 0x33;
}
psmouse->private = kzalloc(sizeof(struct trackpoint_data), GFP_KERNEL);
Patches currently in stable-queue which might be from aaron.ma(a)canonical.com are
queue-4.14/input-trackpoint-force-3-buttons-if-0-button-is-reported.patch
This is a note to let you know that I've just added the patch titled
Btrfs: fix stale entries in readdir
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
btrfs-fix-stale-entries-in-readdir.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From e4fd493c0541d36953f7b9d3bfced67a1321792f Mon Sep 17 00:00:00 2001
From: Josef Bacik <jbacik(a)fb.com>
Date: Tue, 23 Jan 2018 15:17:05 -0500
Subject: Btrfs: fix stale entries in readdir
From: Josef Bacik <jbacik(a)fb.com>
commit e4fd493c0541d36953f7b9d3bfced67a1321792f upstream.
In fixing the readdir+pagefault deadlock I accidentally introduced a
stale entry regression in readdir. If we get close to full for the
temporary buffer, and then skip a few delayed deletions, and then try to
add another entry that won't fit, we will emit the entries we found and
retry. Unfortunately we delete entries from our del_list as we find
them, assuming we won't need them. However our pos will be with
whatever our last entry was, which could be before the delayed deletions
we skipped, so the next search will add the deleted entries back into
our readdir buffer. So instead don't delete entries we find in our
del_list so we can make sure we always find our delayed deletions. This
is a slight perf hit for readdir with lots of pending deletions, but
hopefully this isn't a common occurrence. If it is we can revist this
and optimize it.
Fixes: 23b5ec74943f ("btrfs: fix readdir deadlock with pagefault")
Signed-off-by: Josef Bacik <jbacik(a)fb.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/btrfs/delayed-inode.c | 26 ++++++++------------------
1 file changed, 8 insertions(+), 18 deletions(-)
--- a/fs/btrfs/delayed-inode.c
+++ b/fs/btrfs/delayed-inode.c
@@ -1677,28 +1677,18 @@ void btrfs_readdir_put_delayed_items(str
int btrfs_should_delete_dir_index(struct list_head *del_list,
u64 index)
{
- struct btrfs_delayed_item *curr, *next;
- int ret;
+ struct btrfs_delayed_item *curr;
+ int ret = 0;
- if (list_empty(del_list))
- return 0;
-
- list_for_each_entry_safe(curr, next, del_list, readdir_list) {
+ list_for_each_entry(curr, del_list, readdir_list) {
if (curr->key.offset > index)
break;
-
- list_del(&curr->readdir_list);
- ret = (curr->key.offset == index);
-
- if (refcount_dec_and_test(&curr->refs))
- kfree(curr);
-
- if (ret)
- return 1;
- else
- continue;
+ if (curr->key.offset == index) {
+ ret = 1;
+ break;
+ }
}
- return 0;
+ return ret;
}
/*
Patches currently in stable-queue which might be from jbacik(a)fb.com are
queue-4.14/btrfs-fix-stale-entries-in-readdir.patch
As I started backporting security fixes, I found a deadlock bug that was
fixed in a later release. This patch series contains backports for all
these problems.
Andrew Goodbody (1):
usb: usbip: Fix possible deadlocks reported by lockdep
Shuah Khan (3):
usbip: fix stub_rx: get_pipe() to validate endpoint number
usbip: fix stub_rx: harden CMD_SUBMIT path to handle malicious input
usbip: prevent leaking socket pointer address in messages
drivers/usb/usbip/stub_dev.c | 3 +-
drivers/usb/usbip/stub_rx.c | 46 ++++++++++++++++----
drivers/usb/usbip/usbip_common.c | 15 ++-----
drivers/usb/usbip/usbip_event.c | 5 ++-
drivers/usb/usbip/vhci_hcd.c | 90 +++++++++++++++++++++++-----------------
drivers/usb/usbip/vhci_rx.c | 30 ++++++++------
drivers/usb/usbip/vhci_sysfs.c | 19 +++++----
drivers/usb/usbip/vhci_tx.c | 14 ++++---
8 files changed, 134 insertions(+), 88 deletions(-)
--
2.14.1
The logic of the original commit 4d99b2581eff ("staging: lustre: avoid
intensive reconnecting for ko2iblnd") was assumed conditional free of
struct kib_conn if the second argument free_conn in function
kiblnd_destroy_conn(struct kib_conn *conn, bool free_conn) is true.
But this hunk of code was dropped from original commit. As result the logic
works wrong and current code use struct kib_conn after free.
> drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
> 3317 kiblnd_destroy_conn(conn, !peer);
> ^^^^ Freed always (but should be conditionally)
> 3318
> 3319 spin_lock_irqsave(lock, flags);
> 3320 if (!peer)
> 3321 continue;
> 3322
> 3323 conn->ibc_peer = peer;
> ^^^^^^^^^^^^^^ Use after free
> 3324 if (peer->ibp_reconnected < KIB_RECONN_HIGH_RACE)
> 3325 list_add_tail(&conn->ibc_list,
> ^^^^^^^^^^^^^^ Use after free
> 3326 &kiblnd_data.kib_reconn_list);
> 3327 else
> 3328 list_add_tail(&conn->ibc_list,
> ^^^^^^^^^^^^^^ Use after free
> 3329 &kiblnd_data.kib_reconn_wait);
To avoid confusion this fix moved the freeing a struct kib_conn outside of
the function kiblnd_destroy_conn() and free as it was intended in original
commit.
Cc: <stable(a)vger.kernel.org> # v4.6
Fixes: 4d99b2581eff ("staging: lustre: avoid intensive reconnecting for ko2iblnd")
Signed-off-by: Dmitry Eremin <Dmitry.Eremin(a)intel.com>
---
Changes in v4:
- fixed the issue with use after free by moving the freeing a struct
kib_conn outside of the function kiblnd_destroy_conn()
drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c | 7 +++----
drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h | 2 +-
drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c | 6 ++++--
3 files changed, 8 insertions(+), 7 deletions(-)
diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
index 2ebc484385b3..ec84edfda271 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.c
@@ -824,14 +824,15 @@ struct kib_conn *kiblnd_create_conn(struct kib_peer *peer, struct rdma_cm_id *cm
return conn;
failed_2:
- kiblnd_destroy_conn(conn, true);
+ kiblnd_destroy_conn(conn);
+ kfree(conn);
failed_1:
kfree(init_qp_attr);
failed_0:
return NULL;
}
-void kiblnd_destroy_conn(struct kib_conn *conn, bool free_conn)
+void kiblnd_destroy_conn(struct kib_conn *conn)
{
struct rdma_cm_id *cmid = conn->ibc_cmid;
struct kib_peer *peer = conn->ibc_peer;
@@ -889,8 +890,6 @@ void kiblnd_destroy_conn(struct kib_conn *conn, bool free_conn)
rdma_destroy_id(cmid);
atomic_dec(&net->ibn_nconns);
}
-
- kfree(conn);
}
int kiblnd_close_peer_conns_locked(struct kib_peer *peer, int why)
diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h
index 171eced213f8..b18911d09e9a 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd.h
@@ -1016,7 +1016,7 @@ int kiblnd_close_stale_conns_locked(struct kib_peer *peer,
struct kib_conn *kiblnd_create_conn(struct kib_peer *peer,
struct rdma_cm_id *cmid,
int state, int version);
-void kiblnd_destroy_conn(struct kib_conn *conn, bool free_conn);
+void kiblnd_destroy_conn(struct kib_conn *conn);
void kiblnd_close_conn(struct kib_conn *conn, int error);
void kiblnd_close_conn_locked(struct kib_conn *conn, int error);
diff --git a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
index 9b3328c5d1e7..b3e7f28eb978 100644
--- a/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
+++ b/drivers/staging/lustre/lnet/klnds/o2iblnd/o2iblnd_cb.c
@@ -3314,11 +3314,13 @@ static int kiblnd_resolve_addr(struct rdma_cm_id *cmid,
spin_unlock_irqrestore(lock, flags);
dropped_lock = 1;
- kiblnd_destroy_conn(conn, !peer);
+ kiblnd_destroy_conn(conn);
spin_lock_irqsave(lock, flags);
- if (!peer)
+ if (!peer) {
+ kfree(conn);
continue;
+ }
conn->ibc_peer = peer;
if (peer->ibp_reconnected < KIB_RECONN_HIGH_RACE)
--
1.8.3.1
--------------------------------------------------------------------
Joint Stock Company Intel A/O
Registered legal address: Krylatsky Hills Business Park,
17 Krylatskaya Str., Bldg 4, Moscow 121614,
Russian Federation
This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
For a while we've been having issues with seemingly random interrupts
coming from nvidia cards when resuming them. Originally the fix for this
was thought to be just re-arming the MSI interrupt registers right after
re-allocating our IRQs, however it seems a lot of what we do is both
wrong and not even nessecary.
This was made apparent by what appeared to be a regression in the
mainline kernel that started introducing suspend/resume issues for
nouveau:
a0c9259dc4e1 (irq/matrix: Spread interrupts on allocation)
After this commit was introduced, we started getting interrupts from the
GPU before we actually re-allocated our own IRQ (see references below)
and assigned the IRQ handler. Investigating this turned out that the
problem was not with the commit, but the fact that nouveau even
free/allocates it's irqs before and after suspend/resume.
For starters: drivers in the linux kernel haven't had to handle
freeing/re-allocating their IRQs during suspend/resume cycles for quite
a while now. Nouveau seems to be one of the few drivers left that still
does this, despite the fact there's no reason we actually need to since
disabling interrupts from the device side should be enough, as the
kernel is already smart enough to know to disable host-side interrupts
for us before going into suspend. Since we were tearing down our IRQs by
hand however, that means there was a short period during resume where
interrupts could be received before we re-allocated our IRQ which would
lead to us getting an unhandled IRQ. Since we never handle said IRQ and
re-arm the interrupt registers, this would cause us to miss all of the
interrupts from the GPU and cause our init process to start timing out
on anything requiring interrupts.
So, since this whole setup/teardown every suspend/resume cycle is
useless anyway, move irq setup/teardown into the pci subdev's ctor/dtor
functions instead so they're only called at driver load and driver
unload. This should fix most of the issues with pending interrupts on
resume, along with getting suspend/resume for nouveau to work again.
As well, this probably means we can also just remove the msi rearm call
inside nvkm_pci_init(). But since our main focus here is to fix
suspend/resume before 4.15, we'll save that for a later patch.
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Cc: Karol Herbst <kherbst(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Mike Galbraith <efault(a)gmx.de>
Cc: stable(a)vger.kernel.org
---
Changes since v2:
- Remove teardown, just reuse pci->irq to indicate when we're tearing
down the driver
drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c | 44 +++++++++++++++++---------
1 file changed, 29 insertions(+), 15 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c
index deb96de54b00..3b2cad639388 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c
@@ -71,6 +71,10 @@ nvkm_pci_intr(int irq, void *arg)
struct nvkm_pci *pci = arg;
struct nvkm_device *device = pci->subdev.device;
bool handled = false;
+
+ if (pci->irq < 0)
+ return IRQ_HANDLED;
+
nvkm_mc_intr_unarm(device);
if (pci->msi)
pci->func->msi_rearm(pci);
@@ -84,11 +88,6 @@ nvkm_pci_fini(struct nvkm_subdev *subdev, bool suspend)
{
struct nvkm_pci *pci = nvkm_pci(subdev);
- if (pci->irq >= 0) {
- free_irq(pci->irq, pci);
- pci->irq = -1;
- }
-
if (pci->agp.bridge)
nvkm_agp_fini(pci);
@@ -108,8 +107,20 @@ static int
nvkm_pci_oneinit(struct nvkm_subdev *subdev)
{
struct nvkm_pci *pci = nvkm_pci(subdev);
- if (pci_is_pcie(pci->pdev))
- return nvkm_pcie_oneinit(pci);
+ struct pci_dev *pdev = pci->pdev;
+ int ret;
+
+ if (pci_is_pcie(pci->pdev)) {
+ ret = nvkm_pcie_oneinit(pci);
+ if (ret)
+ return ret;
+ }
+
+ ret = request_irq(pdev->irq, nvkm_pci_intr, IRQF_SHARED, "nvkm", pci);
+ if (ret)
+ return ret;
+ pci->irq = pdev->irq;
+
return 0;
}
@@ -117,7 +128,6 @@ static int
nvkm_pci_init(struct nvkm_subdev *subdev)
{
struct nvkm_pci *pci = nvkm_pci(subdev);
- struct pci_dev *pdev = pci->pdev;
int ret;
if (pci->agp.bridge) {
@@ -131,28 +141,32 @@ nvkm_pci_init(struct nvkm_subdev *subdev)
if (pci->func->init)
pci->func->init(pci);
- ret = request_irq(pdev->irq, nvkm_pci_intr, IRQF_SHARED, "nvkm", pci);
- if (ret)
- return ret;
-
- pci->irq = pdev->irq;
-
/* Ensure MSI interrupts are armed, for the case where there are
* already interrupts pending (for whatever reason) at load time.
*/
if (pci->msi)
pci->func->msi_rearm(pci);
- return ret;
+ return 0;
}
static void *
nvkm_pci_dtor(struct nvkm_subdev *subdev)
{
struct nvkm_pci *pci = nvkm_pci(subdev);
+ int irq;
+
nvkm_agp_dtor(pci);
+
+ if (pci->irq >= 0) {
+ irq = pci->irq;
+ pci->irq = -1;
+ free_irq(irq, pci);
+ }
+
if (pci->msi)
pci_disable_msi(pci->pdev);
+
return nvkm_pci(subdev);
}
--
2.14.3
The patch titled
Subject: lib/strscpy: remove word-at-a-time optimization.
has been removed from the -mm tree. Its filename was
lib-strscpy-remove-word-at-a-time-optimization.patch
This patch was dropped because an updated version will be merged
------------------------------------------------------
From: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Subject: lib/strscpy: remove word-at-a-time optimization.
strscpy() performs the word-at-a-time optimistic reads. So it may may
access the memory past the end of the object, which is perfectly fine
since strscpy() doesn't use that (past-the-end) data and makes sure the
optimistic read won't cross a page boundary.
But KASAN doesn't know anything about that so it will complain. There are
several possible ways to address this issue, but none are perfect. See
https://lkml.kernel.org/r/9f0a9cf6-51f7-cd1f-5dc6-6d510a7b8ec4@virtuozzo.com
It seems the best solution is to simply disable word-at-a-time
optimization. My trivial testing shows that byte-at-a-time could be up to
x4.3 times slower than word-at-a-time. It may seems like a lot, but it's
actually ~1.2e-10 sec per symbol vs ~4.8e-10 sec per symbol on modern
hardware. And we don't use strscpy() in a performance critical paths to
copy large amounts of data, so it shouldn't matter anyway.
Link: http://lkml.kernel.org/r/20180109163745.3692-1-aryabinin@virtuozzo.com
Fixes: 30035e45753b7 ("string: provide strscpy()")
Signed-off-by: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Eryu Guan <eguan(a)redhat.com>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Chris Metcalf <metcalf(a)alum.mit.edu>
Cc: David Laight <David.Laight(a)ACULAB.COM>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
lib/string.c | 38 --------------------------------------
1 file changed, 38 deletions(-)
diff -puN lib/string.c~lib-strscpy-remove-word-at-a-time-optimization lib/string.c
--- a/lib/string.c~lib-strscpy-remove-word-at-a-time-optimization
+++ a/lib/string.c
@@ -29,7 +29,6 @@
#include <linux/errno.h>
#include <asm/byteorder.h>
-#include <asm/word-at-a-time.h>
#include <asm/page.h>
#ifndef __HAVE_ARCH_STRNCASECMP
@@ -177,45 +176,8 @@ EXPORT_SYMBOL(strlcpy);
*/
ssize_t strscpy(char *dest, const char *src, size_t count)
{
- const struct word_at_a_time constants = WORD_AT_A_TIME_CONSTANTS;
- size_t max = count;
long res = 0;
- if (count == 0)
- return -E2BIG;
-
-#ifdef CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS
- /*
- * If src is unaligned, don't cross a page boundary,
- * since we don't know if the next page is mapped.
- */
- if ((long)src & (sizeof(long) - 1)) {
- size_t limit = PAGE_SIZE - ((long)src & (PAGE_SIZE - 1));
- if (limit < max)
- max = limit;
- }
-#else
- /* If src or dest is unaligned, don't do word-at-a-time. */
- if (((long) dest | (long) src) & (sizeof(long) - 1))
- max = 0;
-#endif
-
- while (max >= sizeof(unsigned long)) {
- unsigned long c, data;
-
- c = *(unsigned long *)(src+res);
- if (has_zero(c, &data, &constants)) {
- data = prep_zero_mask(c, data, &constants);
- data = create_zero_mask(data);
- *(unsigned long *)(dest+res) = c & zero_bytemask(data);
- return res + find_zero(data);
- }
- *(unsigned long *)(dest+res) = c;
- res += sizeof(unsigned long);
- count -= sizeof(unsigned long);
- max -= sizeof(unsigned long);
- }
-
while (count) {
char c;
_
Patches currently in -mm which might be from aryabinin(a)virtuozzo.com are
mm-memcontrolc-try-harder-to-decrease-limit_in_bytes.patch
kasan-makefile-support-llvm-style-asan-parameters.patch
lib-ubsan-add-type-mismatch-handler-for-new-gcc-clang.patch
lib-ubsan-remove-returns-nonnull-attribute-checks.patch
lib-ubsan-remove-returns-nonnull-attribute-checks-fix.patch
For a while we've been having issues with seemingly random interrupts
coming from nvidia cards when resuming them. Originally the fix for this
was thought to be just re-arming the MSI interrupt registers right after
re-allocating our IRQs, however it seems a lot of what we do is both
wrong and not even nessecary.
This was made apparent by what appeared to be a regression in the
mainline kernel that started introducing suspend/resume issues for
nouveau:
a0c9259dc4e1 (irq/matrix: Spread interrupts on allocation)
After this commit was introduced, we started getting interrupts from the
GPU before we actually re-allocated our own IRQ (see references below)
and assigned the IRQ handler. Investigating this turned out that the
problem was not with the commit, but the fact that nouveau even
free/allocates it's irqs before and after suspend/resume.
For starters: drivers in the linux kernel haven't had to handle
freeing/re-allocating their IRQs during suspend/resume cycles for quite
a while now. Nouveau seems to be one of the few drivers left that still
does this, despite the fact there's no reason we actually need to since
disabling interrupts from the device side should be enough, as the
kernel is already smart enough to know to disable host-side interrupts
for us before going into suspend. Since we were tearing down our IRQs by
hand however, that means there was a short period during resume where
interrupts could be received before we re-allocated our IRQ which would
lead to us getting an unhandled IRQ. Since we never handle said IRQ and
re-arm the interrupt registers, this would cause us to miss all of the
interrupts from the GPU and cause our init process to start timing out
on anything requiring interrupts.
So, since this whole setup/teardown every suspend/resume cycle is
useless anyway, move irq setup/teardown into the pci subdev's ctor/dtor
functions instead so they're only called at driver load and driver
unload. This should fix most of the issues with pending interrupts on
resume, along with getting suspend/resume for nouveau to work again.
As well, this probably means we can also just remove the msi rearm call
inside nvkm_pci_init(). But since our main focus here is to fix
suspend/resume before 4.15, we'll save that for a later patch.
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Cc: Karol Herbst <kherbst(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Mike Galbraith <efault(a)gmx.de>
Cc: stable(a)vger.kernel.org
---
Changes since v1:
- Fix small typo in commit message
No functional changes
drivers/gpu/drm/nouveau/include/nvkm/subdev/pci.h | 1 +
drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c | 43 +++++++++++++++--------
2 files changed, 29 insertions(+), 15 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/pci.h b/drivers/gpu/drm/nouveau/include/nvkm/subdev/pci.h
index 23803cc859fd..378bfc8d5fa8 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/pci.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/pci.h
@@ -31,6 +31,7 @@ struct nvkm_pci {
} pcie;
bool msi;
+ bool teardown;
};
u32 nvkm_pci_rd32(struct nvkm_pci *, u16 addr);
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c
index deb96de54b00..4e020f05c99f 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c
@@ -71,6 +71,10 @@ nvkm_pci_intr(int irq, void *arg)
struct nvkm_pci *pci = arg;
struct nvkm_device *device = pci->subdev.device;
bool handled = false;
+
+ if (pci->teardown)
+ return IRQ_HANDLED;
+
nvkm_mc_intr_unarm(device);
if (pci->msi)
pci->func->msi_rearm(pci);
@@ -84,11 +88,6 @@ nvkm_pci_fini(struct nvkm_subdev *subdev, bool suspend)
{
struct nvkm_pci *pci = nvkm_pci(subdev);
- if (pci->irq >= 0) {
- free_irq(pci->irq, pci);
- pci->irq = -1;
- }
-
if (pci->agp.bridge)
nvkm_agp_fini(pci);
@@ -108,8 +107,20 @@ static int
nvkm_pci_oneinit(struct nvkm_subdev *subdev)
{
struct nvkm_pci *pci = nvkm_pci(subdev);
- if (pci_is_pcie(pci->pdev))
- return nvkm_pcie_oneinit(pci);
+ struct pci_dev *pdev = pci->pdev;
+ int ret;
+
+ if (pci_is_pcie(pci->pdev)) {
+ ret = nvkm_pcie_oneinit(pci);
+ if (ret)
+ return ret;
+ }
+
+ ret = request_irq(pdev->irq, nvkm_pci_intr, IRQF_SHARED, "nvkm", pci);
+ if (ret)
+ return ret;
+ pci->irq = pdev->irq;
+
return 0;
}
@@ -117,7 +128,6 @@ static int
nvkm_pci_init(struct nvkm_subdev *subdev)
{
struct nvkm_pci *pci = nvkm_pci(subdev);
- struct pci_dev *pdev = pci->pdev;
int ret;
if (pci->agp.bridge) {
@@ -131,28 +141,30 @@ nvkm_pci_init(struct nvkm_subdev *subdev)
if (pci->func->init)
pci->func->init(pci);
- ret = request_irq(pdev->irq, nvkm_pci_intr, IRQF_SHARED, "nvkm", pci);
- if (ret)
- return ret;
-
- pci->irq = pdev->irq;
-
/* Ensure MSI interrupts are armed, for the case where there are
* already interrupts pending (for whatever reason) at load time.
*/
if (pci->msi)
pci->func->msi_rearm(pci);
- return ret;
+ return 0;
}
static void *
nvkm_pci_dtor(struct nvkm_subdev *subdev)
{
struct nvkm_pci *pci = nvkm_pci(subdev);
+
nvkm_agp_dtor(pci);
+
+ if (pci->irq >= 0) {
+ pci->teardown = true;
+ free_irq(pci->irq, pci);
+ }
+
if (pci->msi)
pci_disable_msi(pci->pdev);
+
return nvkm_pci(subdev);
}
@@ -177,6 +189,7 @@ nvkm_pci_new_(const struct nvkm_pci_func *func, struct nvkm_device *device,
pci->func = func;
pci->pdev = device->func->pci(device)->pdev;
pci->irq = -1;
+ pci->teardown = false;
pci->pcie.speed = -1;
pci->pcie.width = -1;
--
2.14.3
For a while we've been having issues with seemingly random interrupts
coming from nvidia cards when resuming them. Originally the fix for this
was thought to be just re-arming the MSI interrupt registers right after
re-allocating our IRQs, however it seems a lot of what we do is both
wrong and not even nessecary.
This was made apparent by what appeared to be a regression in the
mainline kernel that started introducing suspend/resume issues for
nouveau:
a0c9259dc4e1 (irq/matrix: Spread interrupts on allocation)
After this commit was introduced, we started getting interrupts from the
GPU before we actually re-allocated our own IRQ (see references below)
and assigned the IRQ handler. Investigating this turned out that the
problem was not with the commit, but the fact that nouveau even
free/allocates it's irqs before and after suspend/resume.
For starters: drivers in the linux kernel haven't had to handle
freeing/re-allocating their IRQs during suspend/resume cycles for quite
a while now. Nouveau seems to be one of the few drivers left that still
does this, despite the fact there's no reason we actually need to since
disabling interrupts from the device side should be enough, as the
kernel is already smart enough to know to disable host-side interrupts
for us before going into suspend. Since we were tearing down our IRQs by
hand however, that means there was a short period during resume where
interrupts could be received before we re-allocated our IRQ which would
lead to us getting an unhandled IRQ. Since we never handle said IRQ and
re-arm the interrupt registers, this would cause us to miss all of the
interrupts from the GPU and cause our init process to start timing out
on anything requiring interrupts.
So, since this whole setup/teardown every suspend/resume cycle is
useless anyway, move irq setup/teardown into the pci subdev's ctor/dtor
functions instead so they're only called at driver load and driver
unload. This should fix most of the issues with pending interrupts on
resume, along with getting suspend/resume for nouveau to work again.
As well, this probably means we can also just remove the msi rearm call
inside nvkm_pci_init(). But since our main focus here is to fix
suspend/resume before 4.16, we'll save that for a later patch.
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Cc: Karol Herbst <kherbst(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Mike Galbraith <efault(a)gmx.de>
Cc: stable(a)vger.kernel.org
---
drivers/gpu/drm/nouveau/include/nvkm/subdev/pci.h | 1 +
drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c | 43 +++++++++++++++--------
2 files changed, 29 insertions(+), 15 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/include/nvkm/subdev/pci.h b/drivers/gpu/drm/nouveau/include/nvkm/subdev/pci.h
index 23803cc859fd..378bfc8d5fa8 100644
--- a/drivers/gpu/drm/nouveau/include/nvkm/subdev/pci.h
+++ b/drivers/gpu/drm/nouveau/include/nvkm/subdev/pci.h
@@ -31,6 +31,7 @@ struct nvkm_pci {
} pcie;
bool msi;
+ bool teardown;
};
u32 nvkm_pci_rd32(struct nvkm_pci *, u16 addr);
diff --git a/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c b/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c
index deb96de54b00..4e020f05c99f 100644
--- a/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c
+++ b/drivers/gpu/drm/nouveau/nvkm/subdev/pci/base.c
@@ -71,6 +71,10 @@ nvkm_pci_intr(int irq, void *arg)
struct nvkm_pci *pci = arg;
struct nvkm_device *device = pci->subdev.device;
bool handled = false;
+
+ if (pci->teardown)
+ return IRQ_HANDLED;
+
nvkm_mc_intr_unarm(device);
if (pci->msi)
pci->func->msi_rearm(pci);
@@ -84,11 +88,6 @@ nvkm_pci_fini(struct nvkm_subdev *subdev, bool suspend)
{
struct nvkm_pci *pci = nvkm_pci(subdev);
- if (pci->irq >= 0) {
- free_irq(pci->irq, pci);
- pci->irq = -1;
- }
-
if (pci->agp.bridge)
nvkm_agp_fini(pci);
@@ -108,8 +107,20 @@ static int
nvkm_pci_oneinit(struct nvkm_subdev *subdev)
{
struct nvkm_pci *pci = nvkm_pci(subdev);
- if (pci_is_pcie(pci->pdev))
- return nvkm_pcie_oneinit(pci);
+ struct pci_dev *pdev = pci->pdev;
+ int ret;
+
+ if (pci_is_pcie(pci->pdev)) {
+ ret = nvkm_pcie_oneinit(pci);
+ if (ret)
+ return ret;
+ }
+
+ ret = request_irq(pdev->irq, nvkm_pci_intr, IRQF_SHARED, "nvkm", pci);
+ if (ret)
+ return ret;
+ pci->irq = pdev->irq;
+
return 0;
}
@@ -117,7 +128,6 @@ static int
nvkm_pci_init(struct nvkm_subdev *subdev)
{
struct nvkm_pci *pci = nvkm_pci(subdev);
- struct pci_dev *pdev = pci->pdev;
int ret;
if (pci->agp.bridge) {
@@ -131,28 +141,30 @@ nvkm_pci_init(struct nvkm_subdev *subdev)
if (pci->func->init)
pci->func->init(pci);
- ret = request_irq(pdev->irq, nvkm_pci_intr, IRQF_SHARED, "nvkm", pci);
- if (ret)
- return ret;
-
- pci->irq = pdev->irq;
-
/* Ensure MSI interrupts are armed, for the case where there are
* already interrupts pending (for whatever reason) at load time.
*/
if (pci->msi)
pci->func->msi_rearm(pci);
- return ret;
+ return 0;
}
static void *
nvkm_pci_dtor(struct nvkm_subdev *subdev)
{
struct nvkm_pci *pci = nvkm_pci(subdev);
+
nvkm_agp_dtor(pci);
+
+ if (pci->irq >= 0) {
+ pci->teardown = true;
+ free_irq(pci->irq, pci);
+ }
+
if (pci->msi)
pci_disable_msi(pci->pdev);
+
return nvkm_pci(subdev);
}
@@ -177,6 +189,7 @@ nvkm_pci_new_(const struct nvkm_pci_func *func, struct nvkm_device *device,
pci->func = func;
pci->pdev = device->func->pci(device)->pdev;
pci->irq = -1;
+ pci->teardown = false;
pci->pcie.speed = -1;
pci->pcie.width = -1;
--
2.14.3
Add acpi_arch_get_root_pointer() for Xen PVH guests to communicate
the address of the RSDP table given to the kernel via Xen start info.
This makes the kernel boot again in PVH mode after on recent Xen the
RSDP was moved to higher addresses. So up to that change it was pure
luck that the legacy method to locate the RSDP was working when
running as PVH mode.
Cc: <stable(a)vger.kernel.org> # 4.11
Signed-off-by: Juergen Gross <jgross(a)suse.com>
---
arch/x86/xen/enlighten_pvh.c | 14 +++++++++++---
1 file changed, 11 insertions(+), 3 deletions(-)
diff --git a/arch/x86/xen/enlighten_pvh.c b/arch/x86/xen/enlighten_pvh.c
index 436c4f003e17..f08fd43f2aa2 100644
--- a/arch/x86/xen/enlighten_pvh.c
+++ b/arch/x86/xen/enlighten_pvh.c
@@ -16,15 +16,23 @@
/*
* PVH variables.
*
- * xen_pvh and pvh_bootparams need to live in data segment since they
- * are used after startup_{32|64}, which clear .bss, are invoked.
+ * xen_pvh, pvh_bootparams and pvh_start_info need to live in data segment
+ * since they are used after startup_{32|64}, which clear .bss, are invoked.
*/
bool xen_pvh __attribute__((section(".data"))) = 0;
struct boot_params pvh_bootparams __attribute__((section(".data")));
+struct hvm_start_info pvh_start_info __attribute__((section(".data")));
-struct hvm_start_info pvh_start_info;
unsigned int pvh_start_info_sz = sizeof(pvh_start_info);
+acpi_physical_address acpi_arch_get_root_pointer(void)
+{
+ if (xen_pvh)
+ return pvh_start_info.rsdp_paddr;
+
+ return 0;
+}
+
static void __init init_pvh_bootparams(void)
{
struct xen_memory_map memmap;
--
2.13.6
Add acpi_arch_get_root_pointer() for Xen PVH guests to communicate
the address of the RSDP table given to the kernel via Xen start info.
This makes the kernel boot again in PVH mode after on recent Xen the
RSDP was moved to higher addresses. So up to that change it was pure
luck that the legacy method to locate the RSDP was working when
running as PVH mode.
Cc: <stable(a)vger.kernel.org> # 4.11
Signed-off-by: Juergen Gross <jgross(a)suse.com>
---
arch/x86/xen/enlighten_pvh.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)
diff --git a/arch/x86/xen/enlighten_pvh.c b/arch/x86/xen/enlighten_pvh.c
index 436c4f003e17..9a5c3a7fe673 100644
--- a/arch/x86/xen/enlighten_pvh.c
+++ b/arch/x86/xen/enlighten_pvh.c
@@ -16,15 +16,24 @@
/*
* PVH variables.
*
- * xen_pvh and pvh_bootparams need to live in data segment since they
- * are used after startup_{32|64}, which clear .bss, are invoked.
+ * xen_pvh, pvh_bootparams and pvh_start_info need to live in data segment
+ * since they are used after startup_{32|64}, which clear .bss, are invoked.
*/
bool xen_pvh __attribute__((section(".data"))) = 0;
struct boot_params pvh_bootparams __attribute__((section(".data")));
+struct hvm_start_info pvh_start_info __attribute__((section(".data")));
-struct hvm_start_info pvh_start_info;
unsigned int pvh_start_info_sz = sizeof(pvh_start_info);
+acpi_physical_address acpi_arch_get_root_pointer(void)
+{
+ if (xen_pvh)
+ return pvh_start_info.rsdp_paddr;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(acpi_arch_get_root_pointer);
+
static void __init init_pvh_bootparams(void)
{
struct xen_memory_map memmap;
--
2.13.6
This is a note to let you know that I've just added the patch titled
android: binder: use VM_ALLOC to get vm area
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From aac6830ec1cb681544212838911cdc57f2638216 Mon Sep 17 00:00:00 2001
From: Ganesh Mahendran <opensource.ganesh(a)gmail.com>
Date: Wed, 10 Jan 2018 10:49:05 +0800
Subject: android: binder: use VM_ALLOC to get vm area
VM_IOREMAP is used to access hardware through a mechanism called
I/O mapped memory. Android binder is a IPC machanism which will
not access I/O memory.
And VM_IOREMAP has alignment requiement which may not needed in
binder.
__get_vm_area_node()
{
...
if (flags & VM_IOREMAP)
align = 1ul << clamp_t(int, fls_long(size),
PAGE_SHIFT, IOREMAP_MAX_ORDER);
...
}
This patch will save some kernel vm area, especially for 32bit os.
In 32bit OS, kernel vm area is only 240MB. We may got below
error when launching a app:
<3>[ 4482.440053] binder_alloc: binder_alloc_mmap_handler: 15728 8ce67000-8cf65000 get_vm_area failed -12
<3>[ 4483.218817] binder_alloc: binder_alloc_mmap_handler: 15745 8ce67000-8cf65000 get_vm_area failed -12
Signed-off-by: Ganesh Mahendran <opensource.ganesh(a)gmail.com>
Acked-by: Martijn Coenen <maco(a)android.com>
Acked-by: Todd Kjos <tkjos(a)google.com>
Cc: stable <stable(a)vger.kernel.org>
----
V3: update comments
V2: update comments
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/android/binder_alloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
index 07b866afee54..5a426c877dfb 100644
--- a/drivers/android/binder_alloc.c
+++ b/drivers/android/binder_alloc.c
@@ -670,7 +670,7 @@ int binder_alloc_mmap_handler(struct binder_alloc *alloc,
goto err_already_mapped;
}
- area = get_vm_area(vma->vm_end - vma->vm_start, VM_IOREMAP);
+ area = get_vm_area(vma->vm_end - vma->vm_start, VM_ALLOC);
if (area == NULL) {
ret = -ENOMEM;
failure_string = "get_vm_area";
--
2.16.1
This is a note to let you know that I've just added the patch titled
USB: serial: pl2303: new device id for Chilitag
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From d08dd3f3dd2ae351b793fc5b76abdbf0fd317b12 Mon Sep 17 00:00:00 2001
From: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Date: Thu, 25 Jan 2018 09:48:55 +0100
Subject: USB: serial: pl2303: new device id for Chilitag
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This adds a new device id for Chilitag devices to the pl2303 driver.
Reported-by: "Chu.Mike [朱堅宜]" <Mike-Chu(a)prolific.com.tw>
Cc: stable <stable(a)vger.kernel.org>
Acked-by: Johan Hovold <johan(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/serial/pl2303.c | 1 +
drivers/usb/serial/pl2303.h | 1 +
2 files changed, 2 insertions(+)
diff --git a/drivers/usb/serial/pl2303.c b/drivers/usb/serial/pl2303.c
index 57ae832a49ff..46dd09da2434 100644
--- a/drivers/usb/serial/pl2303.c
+++ b/drivers/usb/serial/pl2303.c
@@ -38,6 +38,7 @@ static const struct usb_device_id id_table[] = {
{ USB_DEVICE(PL2303_VENDOR_ID, PL2303_PRODUCT_ID_RSAQ2) },
{ USB_DEVICE(PL2303_VENDOR_ID, PL2303_PRODUCT_ID_DCU11) },
{ USB_DEVICE(PL2303_VENDOR_ID, PL2303_PRODUCT_ID_RSAQ3) },
+ { USB_DEVICE(PL2303_VENDOR_ID, PL2303_PRODUCT_ID_CHILITAG) },
{ USB_DEVICE(PL2303_VENDOR_ID, PL2303_PRODUCT_ID_PHAROS) },
{ USB_DEVICE(PL2303_VENDOR_ID, PL2303_PRODUCT_ID_ALDIGA) },
{ USB_DEVICE(PL2303_VENDOR_ID, PL2303_PRODUCT_ID_MMX) },
diff --git a/drivers/usb/serial/pl2303.h b/drivers/usb/serial/pl2303.h
index f98fd84890de..fcd72396a7b6 100644
--- a/drivers/usb/serial/pl2303.h
+++ b/drivers/usb/serial/pl2303.h
@@ -12,6 +12,7 @@
#define PL2303_PRODUCT_ID_DCU11 0x1234
#define PL2303_PRODUCT_ID_PHAROS 0xaaa0
#define PL2303_PRODUCT_ID_RSAQ3 0xaaa2
+#define PL2303_PRODUCT_ID_CHILITAG 0xaaa8
#define PL2303_PRODUCT_ID_ALDIGA 0x0611
#define PL2303_PRODUCT_ID_MMX 0x0612
#define PL2303_PRODUCT_ID_GPRS 0x0609
--
2.16.1
This is a note to let you know that I've just added the patch titled
android: binder: use VM_ALLOC to get vm area
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the char-misc-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From aac6830ec1cb681544212838911cdc57f2638216 Mon Sep 17 00:00:00 2001
From: Ganesh Mahendran <opensource.ganesh(a)gmail.com>
Date: Wed, 10 Jan 2018 10:49:05 +0800
Subject: android: binder: use VM_ALLOC to get vm area
VM_IOREMAP is used to access hardware through a mechanism called
I/O mapped memory. Android binder is a IPC machanism which will
not access I/O memory.
And VM_IOREMAP has alignment requiement which may not needed in
binder.
__get_vm_area_node()
{
...
if (flags & VM_IOREMAP)
align = 1ul << clamp_t(int, fls_long(size),
PAGE_SHIFT, IOREMAP_MAX_ORDER);
...
}
This patch will save some kernel vm area, especially for 32bit os.
In 32bit OS, kernel vm area is only 240MB. We may got below
error when launching a app:
<3>[ 4482.440053] binder_alloc: binder_alloc_mmap_handler: 15728 8ce67000-8cf65000 get_vm_area failed -12
<3>[ 4483.218817] binder_alloc: binder_alloc_mmap_handler: 15745 8ce67000-8cf65000 get_vm_area failed -12
Signed-off-by: Ganesh Mahendran <opensource.ganesh(a)gmail.com>
Acked-by: Martijn Coenen <maco(a)android.com>
Acked-by: Todd Kjos <tkjos(a)google.com>
Cc: stable <stable(a)vger.kernel.org>
----
V3: update comments
V2: update comments
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/android/binder_alloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
index 07b866afee54..5a426c877dfb 100644
--- a/drivers/android/binder_alloc.c
+++ b/drivers/android/binder_alloc.c
@@ -670,7 +670,7 @@ int binder_alloc_mmap_handler(struct binder_alloc *alloc,
goto err_already_mapped;
}
- area = get_vm_area(vma->vm_end - vma->vm_start, VM_IOREMAP);
+ area = get_vm_area(vma->vm_end - vma->vm_start, VM_ALLOC);
if (area == NULL) {
ret = -ENOMEM;
failure_string = "get_vm_area";
--
2.16.1
This is a note to let you know that I've just added the patch titled
USB: serial: pl2303: new device id for Chilitag
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the usb-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From d08dd3f3dd2ae351b793fc5b76abdbf0fd317b12 Mon Sep 17 00:00:00 2001
From: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Date: Thu, 25 Jan 2018 09:48:55 +0100
Subject: USB: serial: pl2303: new device id for Chilitag
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This adds a new device id for Chilitag devices to the pl2303 driver.
Reported-by: "Chu.Mike [朱堅宜]" <Mike-Chu(a)prolific.com.tw>
Cc: stable <stable(a)vger.kernel.org>
Acked-by: Johan Hovold <johan(a)kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/serial/pl2303.c | 1 +
drivers/usb/serial/pl2303.h | 1 +
2 files changed, 2 insertions(+)
diff --git a/drivers/usb/serial/pl2303.c b/drivers/usb/serial/pl2303.c
index 57ae832a49ff..46dd09da2434 100644
--- a/drivers/usb/serial/pl2303.c
+++ b/drivers/usb/serial/pl2303.c
@@ -38,6 +38,7 @@ static const struct usb_device_id id_table[] = {
{ USB_DEVICE(PL2303_VENDOR_ID, PL2303_PRODUCT_ID_RSAQ2) },
{ USB_DEVICE(PL2303_VENDOR_ID, PL2303_PRODUCT_ID_DCU11) },
{ USB_DEVICE(PL2303_VENDOR_ID, PL2303_PRODUCT_ID_RSAQ3) },
+ { USB_DEVICE(PL2303_VENDOR_ID, PL2303_PRODUCT_ID_CHILITAG) },
{ USB_DEVICE(PL2303_VENDOR_ID, PL2303_PRODUCT_ID_PHAROS) },
{ USB_DEVICE(PL2303_VENDOR_ID, PL2303_PRODUCT_ID_ALDIGA) },
{ USB_DEVICE(PL2303_VENDOR_ID, PL2303_PRODUCT_ID_MMX) },
diff --git a/drivers/usb/serial/pl2303.h b/drivers/usb/serial/pl2303.h
index f98fd84890de..fcd72396a7b6 100644
--- a/drivers/usb/serial/pl2303.h
+++ b/drivers/usb/serial/pl2303.h
@@ -12,6 +12,7 @@
#define PL2303_PRODUCT_ID_DCU11 0x1234
#define PL2303_PRODUCT_ID_PHAROS 0xaaa0
#define PL2303_PRODUCT_ID_RSAQ3 0xaaa2
+#define PL2303_PRODUCT_ID_CHILITAG 0xaaa8
#define PL2303_PRODUCT_ID_ALDIGA 0x0611
#define PL2303_PRODUCT_ID_MMX 0x0612
#define PL2303_PRODUCT_ID_GPRS 0x0609
--
2.16.1
The bounce buffer is gone from the MMC core, and now we found out
that there are some (crippled) i.MX boards out there that have broken
ADMA (cannot do scatter-gather), and also broken PIO so they must
use SDMA. Closer examination shows a less significant slowdown
also on SDMA-only capable Laptop hosts.
SDMA sets down the number of segments to one, so that each segment
gets turned into a singular request that ping-pongs to the block
layer before the next request/segment is issued.
Apparently it happens a lot that the block layer send requests
that include a lot of physically discontigous segments. My guess
is that this phenomenon is coming from the file system.
These devices that cannot handle scatterlists in hardware can see
major benefits from a DMA-contigous bounce buffer.
This patch accumulates those fragmented scatterlists in a physically
contigous bounce buffer so that we can issue bigger DMA data chunks
to/from the card.
When tested with thise PCI-integrated host (1217:8221) that
only supports SDMA:
0b:00.0 SD Host controller: O2 Micro, Inc. OZ600FJ0/OZ900FJ0/OZ600FJS
SD/MMC Card Reader Controller (rev 05)
This patch gave ~1Mbyte/s improved throughput on large reads and
writes when testing using iozone than without the patch.
dmesg:
sdhci-pci 0000:0b:00.0: SDHCI controller found [1217:8221] (rev 5)
mmc0 bounce up to 128 segments into one, max segment size 65536 bytes
mmc0: SDHCI controller on PCI [0000:0b:00.0] using DMA
On the i.MX SDHCI controllers on the crippled i.MX 25 and i.MX 35
the patch restores the performance to what it was before we removed
the bounce buffers, and then some: performance is better than ever
because we now allocate a bounce buffer the size of the maximum
single request the SDMA engine can handle. On the PCI laptop this
is 256K, whereas with the old bounce buffer code it was 64K max.
Cc: Pierre Ossman <pierre(a)ossman.eu>
Cc: Benoît Thébaudeau <benoit(a)wsystem.com>
Cc: Fabio Estevam <fabio.estevam(a)nxp.com>
Cc: Benjamin Beckmeyer <beckmeyer.b(a)rittal.de>
Cc: stable(a)vger.kernel.org # v4.14+
Fixes: de3ee99b097d ("mmc: Delete bounce buffer handling")
Tested-by: Benjamin Beckmeyer <beckmeyer.b(a)rittal.de>
Signed-off-by: Linus Walleij <linus.walleij(a)linaro.org>
---
ChangeLog v6->v7:
- Fix the directions on dma_sync_for[device|cpu]() so the
ownership of the buffer gets swapped properly and in the right
direction for every transfer. Didn't see this because x86 PCI is
DMA coherent...
- Tested and greelighted on i.MX 25.
- Also tested on the PCI version.
ChangeLog v5->v6:
- Again switch back to explicit sync of buffers. I want to get this
solution to work because it gives more control and it's more
elegant.
- Update host->max_req_size as noted by Adrian, hopefully this
fixes the i.MX. I was just lucky on my Intel laptop I guess:
the block stack never requested anything bigger than 64KB and
that was why it worked even if max_req_size was bigger than
what would fit in the bounce buffer.
- Copy the number of bytes in the mmc_data instead of the number
of bytes in the bounce buffer. For RX this is blksize * blocks
and for TX this is bytes_xfered.
- Break out a sdhci_sdma_address() for getting the DMA address
for either the raw sglist or the bounce buffer depending on
configuration.
- Add some explicit bounds check for the data so that we do not
attempt to copy more than the bounce buffer size even if the
block layer is erroneously configured.
- Move allocation of bounce buffer out to its own function.
- Use pr_[info|err] throughout so all debug prints from the
driver come out in the same manner and style.
- Use unsigned int for the bounce buffer size.
- Re-tested with iozone: we still get the same nice performance
improvements.
- Request a text on i.MX (hi Benjamin)
ChangeLog v4->v5:
- Go back to dma_alloc_coherent() as this apparently works better.
- Keep the other changes, cap for 64KB, fall back to single segments.
- Requesting a test of this on i.MX. (Sorry Benjamin.)
ChangeLog v3->v4:
- Cap the bounce buffer to 64KB instead of the biggest segment
as we experience diminishing returns with buffers > 64KB.
- Instead of using dma_alloc_coherent(), use good old devm_kmalloc()
and issue dma_sync_single_for*() to explicitly switch
ownership between CPU and the device. This way we exercise the
cache better and may consume less CPU.
- Bail out with single segments if we cannot allocate a bounce
buffer.
- Tested on the PCI SDHCI on my laptop: requesting a new test
on i.MX from Benjamin. (Please!)
ChangeLog v2->v3:
- Rewrite the commit message a bit
- Add Benjamin's Tested-by
- Add Fixes and stable tags
ChangeLog v1->v2:
- Skip the remapping and fiddling with the buffer, instead use
dma_alloc_coherent() and use a simple, coherent bounce buffer.
- Couple kernel messages to ->parent of the mmc_host as it relates
to the hardware characteristics.
---
drivers/mmc/host/sdhci.c | 169 ++++++++++++++++++++++++++++++++++++++++++++---
drivers/mmc/host/sdhci.h | 3 +
2 files changed, 164 insertions(+), 8 deletions(-)
diff --git a/drivers/mmc/host/sdhci.c b/drivers/mmc/host/sdhci.c
index e9290a3439d5..51d670b8104d 100644
--- a/drivers/mmc/host/sdhci.c
+++ b/drivers/mmc/host/sdhci.c
@@ -21,6 +21,7 @@
#include <linux/dma-mapping.h>
#include <linux/slab.h>
#include <linux/scatterlist.h>
+#include <linux/sizes.h>
#include <linux/swiotlb.h>
#include <linux/regulator/consumer.h>
#include <linux/pm_runtime.h>
@@ -502,8 +503,35 @@ static int sdhci_pre_dma_transfer(struct sdhci_host *host,
if (data->host_cookie == COOKIE_PRE_MAPPED)
return data->sg_count;
- sg_count = dma_map_sg(mmc_dev(host->mmc), data->sg, data->sg_len,
- mmc_get_dma_dir(data));
+ /* Bounce write requests to the bounce buffer */
+ if (host->bounce_buffer) {
+ unsigned int length = data->blksz * data->blocks;
+
+ if (length > host->bounce_buffer_size) {
+ pr_err("%s: asked for transfer of %u bytes exceeds bounce buffer %u bytes\n",
+ mmc_hostname(host->mmc), length,
+ host->bounce_buffer_size);
+ return -EIO;
+ }
+ if (mmc_get_dma_dir(data) == DMA_TO_DEVICE) {
+ /* Copy the data to the bounce buffer */
+ sg_copy_to_buffer(data->sg, data->sg_len,
+ host->bounce_buffer,
+ length);
+ }
+ /* Switch ownership to the DMA */
+ dma_sync_single_for_device(host->mmc->parent,
+ host->bounce_addr,
+ host->bounce_buffer_size,
+ mmc_get_dma_dir(data));
+ /* Just a dummy value */
+ sg_count = 1;
+ } else {
+ /* Just access the data directly from memory */
+ sg_count = dma_map_sg(mmc_dev(host->mmc),
+ data->sg, data->sg_len,
+ mmc_get_dma_dir(data));
+ }
if (sg_count == 0)
return -ENOSPC;
@@ -858,8 +886,13 @@ static void sdhci_prepare_data(struct sdhci_host *host, struct mmc_command *cmd)
SDHCI_ADMA_ADDRESS_HI);
} else {
WARN_ON(sg_cnt != 1);
- sdhci_writel(host, sg_dma_address(data->sg),
- SDHCI_DMA_ADDRESS);
+ /* Bounce buffer goes to work */
+ if (host->bounce_buffer)
+ sdhci_writel(host, host->bounce_addr,
+ SDHCI_DMA_ADDRESS);
+ else
+ sdhci_writel(host, sg_dma_address(data->sg),
+ SDHCI_DMA_ADDRESS);
}
}
@@ -2248,7 +2281,12 @@ static void sdhci_pre_req(struct mmc_host *mmc, struct mmc_request *mrq)
mrq->data->host_cookie = COOKIE_UNMAPPED;
- if (host->flags & SDHCI_REQ_USE_DMA)
+ /*
+ * No pre-mapping in the pre hook if we're using the bounce buffer,
+ * for that we would need two bounce buffers since one buffer is
+ * in flight when this is getting called.
+ */
+ if (host->flags & SDHCI_REQ_USE_DMA && !host->bounce_buffer)
sdhci_pre_dma_transfer(host, mrq->data, COOKIE_PRE_MAPPED);
}
@@ -2352,8 +2390,45 @@ static bool sdhci_request_done(struct sdhci_host *host)
struct mmc_data *data = mrq->data;
if (data && data->host_cookie == COOKIE_MAPPED) {
- dma_unmap_sg(mmc_dev(host->mmc), data->sg, data->sg_len,
- mmc_get_dma_dir(data));
+ if (host->bounce_buffer) {
+ /*
+ * On reads, copy the bounced data into the
+ * sglist
+ */
+ if (mmc_get_dma_dir(data) == DMA_FROM_DEVICE) {
+ unsigned int length = data->bytes_xfered;
+
+ if (length > host->bounce_buffer_size) {
+ pr_err("%s: bounce buffer is %u bytes but DMA claims to have transferred %u bytes\n",
+ mmc_hostname(host->mmc),
+ host->bounce_buffer_size,
+ data->bytes_xfered);
+ /* Cap it down and continue */
+ length = host->bounce_buffer_size;
+ }
+ dma_sync_single_for_cpu(
+ host->mmc->parent,
+ host->bounce_addr,
+ host->bounce_buffer_size,
+ DMA_FROM_DEVICE);
+ sg_copy_from_buffer(data->sg,
+ data->sg_len,
+ host->bounce_buffer,
+ length);
+ } else {
+ /* No copying, just switch ownership */
+ dma_sync_single_for_cpu(
+ host->mmc->parent,
+ host->bounce_addr,
+ host->bounce_buffer_size,
+ mmc_get_dma_dir(data));
+ }
+ } else {
+ /* Unmap the raw data */
+ dma_unmap_sg(mmc_dev(host->mmc), data->sg,
+ data->sg_len,
+ mmc_get_dma_dir(data));
+ }
data->host_cookie = COOKIE_UNMAPPED;
}
}
@@ -2543,6 +2618,14 @@ static void sdhci_adma_show_error(struct sdhci_host *host)
}
}
+static u32 sdhci_sdma_address(struct sdhci_host *host)
+{
+ if (host->bounce_buffer)
+ return host->bounce_addr;
+ else
+ return sg_dma_address(host->data->sg);
+}
+
static void sdhci_data_irq(struct sdhci_host *host, u32 intmask)
{
u32 command;
@@ -2636,7 +2719,8 @@ static void sdhci_data_irq(struct sdhci_host *host, u32 intmask)
*/
if (intmask & SDHCI_INT_DMA_END) {
u32 dmastart, dmanow;
- dmastart = sg_dma_address(host->data->sg);
+
+ dmastart = sdhci_sdma_address(host);
dmanow = dmastart + host->data->bytes_xfered;
/*
* Force update to the next DMA block boundary.
@@ -3217,6 +3301,68 @@ void __sdhci_read_caps(struct sdhci_host *host, u16 *ver, u32 *caps, u32 *caps1)
}
EXPORT_SYMBOL_GPL(__sdhci_read_caps);
+static int sdhci_allocate_bounce_buffer(struct sdhci_host *host)
+{
+ struct mmc_host *mmc = host->mmc;
+ unsigned int max_blocks;
+ unsigned int bounce_size;
+ int ret;
+
+ /*
+ * Cap the bounce buffer at 64KB. Using a bigger bounce buffer
+ * has diminishing returns, this is probably because SD/MMC
+ * cards are usually optimized to handle this size of requests.
+ */
+ bounce_size = SZ_64K;
+ /*
+ * Adjust downwards to maximum request size if this is less
+ * than our segment size, else hammer down the maximum
+ * request size to the maximum buffer size.
+ */
+ if (mmc->max_req_size < bounce_size)
+ bounce_size = mmc->max_req_size;
+ max_blocks = bounce_size / 512;
+
+ /*
+ * When we just support one segment, we can get significant
+ * speedups by the help of a bounce buffer to group scattered
+ * reads/writes together.
+ */
+ host->bounce_buffer = devm_kmalloc(mmc->parent,
+ bounce_size,
+ GFP_KERNEL);
+ if (!host->bounce_buffer) {
+ pr_err("%s: failed to allocate %u bytes for bounce buffer, falling back to single segments\n",
+ mmc_hostname(mmc),
+ bounce_size);
+ /*
+ * Exiting with zero here makes sure we proceed with
+ * mmc->max_segs == 1.
+ */
+ return 0;
+ }
+
+ host->bounce_addr = dma_map_single(mmc->parent,
+ host->bounce_buffer,
+ bounce_size,
+ DMA_BIDIRECTIONAL);
+ ret = dma_mapping_error(mmc->parent, host->bounce_addr);
+ if (ret)
+ /* Again fall back to max_segs == 1 */
+ return 0;
+ host->bounce_buffer_size = bounce_size;
+
+ /* Lie about this since we're bouncing */
+ mmc->max_segs = max_blocks;
+ mmc->max_seg_size = bounce_size;
+ mmc->max_req_size = bounce_size;
+
+ pr_info("%s bounce up to %u segments into one, max segment size %u bytes\n",
+ mmc_hostname(mmc), max_blocks, bounce_size);
+
+ return 0;
+}
+
int sdhci_setup_host(struct sdhci_host *host)
{
struct mmc_host *mmc;
@@ -3713,6 +3859,13 @@ int sdhci_setup_host(struct sdhci_host *host)
*/
mmc->max_blk_count = (host->quirks & SDHCI_QUIRK_NO_MULTIBLOCK) ? 1 : 65535;
+ if (mmc->max_segs == 1) {
+ /* This may alter mmc->*_blk_* parameters */
+ ret = sdhci_allocate_bounce_buffer(host);
+ if (ret)
+ return ret;
+ }
+
return 0;
unreg:
diff --git a/drivers/mmc/host/sdhci.h b/drivers/mmc/host/sdhci.h
index 54bc444c317f..1d7d61e25dbf 100644
--- a/drivers/mmc/host/sdhci.h
+++ b/drivers/mmc/host/sdhci.h
@@ -440,6 +440,9 @@ struct sdhci_host {
int irq; /* Device IRQ */
void __iomem *ioaddr; /* Mapped address */
+ char *bounce_buffer; /* For packing SDMA reads/writes */
+ dma_addr_t bounce_addr;
+ unsigned int bounce_buffer_size;
const struct sdhci_ops *ops; /* Low level hw interface */
--
2.14.3
In commit 2bfa0719ac2a ("usb: gadget: function: f_fs: pass
companion descriptor along") there is a pointer arithmetic
bug where the comp_desc is obtained as follows:
comp_desc = (struct usb_ss_ep_comp_descriptor *)(ds +
USB_DT_ENDPOINT_SIZE);
Since ds is a pointer to usb_endpoint_descriptor, adding
7 to it ends up going out of bounds (7 * sizeof(struct
usb_endpoint_descriptor), which is actually 7*9 bytes) past
the SS descriptor. As a result the maxburst value will be
read incorrectly, and the UDC driver will also get a garbage
comp_desc (assuming it uses it).
Since Felipe wrote, "Eventually, f_fs.c should be converted
to use config_ep_by_speed() like all other functions, though",
let's finally do it. This allows the other usb_ep fields to
be properly populated, such as maxpacket and mult. It also
eliminates the awkward speed-based descriptor lookup since
config_ep_by_speed() does that already using the ones found
in struct usb_function.
Fixes: 2bfa0719ac2a ("usb: gadget: function: f_fs: pass companion descriptor along")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jack Pham <jackp(a)codeaurora.org>
---
drivers/usb/gadget/function/f_fs.c | 38 +++++++-------------------------------
1 file changed, 7 insertions(+), 31 deletions(-)
diff --git a/drivers/usb/gadget/function/f_fs.c b/drivers/usb/gadget/function/f_fs.c
index 5f2dafb5..717b2de 100644
--- a/drivers/usb/gadget/function/f_fs.c
+++ b/drivers/usb/gadget/function/f_fs.c
@@ -1852,44 +1852,20 @@ static int ffs_func_eps_enable(struct ffs_function *func)
spin_lock_irqsave(&func->ffs->eps_lock, flags);
while(count--) {
- struct usb_endpoint_descriptor *ds;
- struct usb_ss_ep_comp_descriptor *comp_desc = NULL;
- int needs_comp_desc = false;
- int desc_idx;
-
- if (ffs->gadget->speed == USB_SPEED_SUPER) {
- desc_idx = 2;
- needs_comp_desc = true;
- } else if (ffs->gadget->speed == USB_SPEED_HIGH)
- desc_idx = 1;
- else
- desc_idx = 0;
-
- /* fall-back to lower speed if desc missing for current speed */
- do {
- ds = ep->descs[desc_idx];
- } while (!ds && --desc_idx >= 0);
-
- if (!ds) {
- ret = -EINVAL;
- break;
- }
-
ep->ep->driver_data = ep;
- ep->ep->desc = ds;
- if (needs_comp_desc) {
- comp_desc = (struct usb_ss_ep_comp_descriptor *)(ds +
- USB_DT_ENDPOINT_SIZE);
- ep->ep->maxburst = comp_desc->bMaxBurst + 1;
- ep->ep->comp_desc = comp_desc;
+ ret = config_ep_by_speed(func->gadget, &func->function, ep->ep);
+ if (ret) {
+ pr_err("%s: config_ep_by_speed(%s) returned %d\n",
+ __func__, ep->ep->name, ret);
+ break;
}
ret = usb_ep_enable(ep->ep);
if (likely(!ret)) {
epfile->ep = ep;
- epfile->in = usb_endpoint_dir_in(ds);
- epfile->isoc = usb_endpoint_xfer_isoc(ds);
+ epfile->in = usb_endpoint_dir_in(ep->ep->desc);
+ epfile->isoc = usb_endpoint_xfer_isoc(ep->ep->desc);
} else {
break;
}
--
2.9.1.200.gb1ec08f
On Mon, Jan 22, 2018 at 6:59 PM, Zhang, Ning A <ning.a.zhang(a)intel.com> wrote:
> hello, Greg, Andy, Thomas
>
> would you like to backport these two patches to LTS kernel?
>
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/com…
>
> x86/asm/32: Make sync_core() handle missing CPUID on all 32-bit kernels
>
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/com…
>
> x86/asm: Rewrite sync_core() to use IRET-to-self
>
I'd be in favor of backporting
1c52d859cb2d417e7216d3e56bb7fea88444cec9. I see no compelling reason
to backport the other one, since it doesn't fix a bug. Greg, can you
do this?
Please add this patch to stable 4.9
It exists in 4.14 and it is not applicable for earlier versions.
------------------------
commit c73322d098e4b6f5f0f0fa1330bf57e218775539
Author: Johannes Weiner <hannes(a)cmpxchg.org>
Date: Wed May 3 14:51:51 2017 -0700
mm: fix 100% CPU kswapd busyloop on unreclaimable nodes
-------------------------
This is a note to let you know that I've just added the patch titled
mm: fix 100% CPU kswapd busyloop on unreclaimable nodes
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
mm-fix-100-cpu-kswapd-busyloop-on-unreclaimable-nodes.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From c73322d098e4b6f5f0f0fa1330bf57e218775539 Mon Sep 17 00:00:00 2001
From: Johannes Weiner <hannes(a)cmpxchg.org>
Date: Wed, 3 May 2017 14:51:51 -0700
Subject: mm: fix 100% CPU kswapd busyloop on unreclaimable nodes
From: Johannes Weiner <hannes(a)cmpxchg.org>
commit c73322d098e4b6f5f0f0fa1330bf57e218775539 upstream.
Patch series "mm: kswapd spinning on unreclaimable nodes - fixes and
cleanups".
Jia reported a scenario in which the kswapd of a node indefinitely spins
at 100% CPU usage. We have seen similar cases at Facebook.
The kernel's current method of judging its ability to reclaim a node (or
whether to back off and sleep) is based on the amount of scanned pages
in proportion to the amount of reclaimable pages. In Jia's and our
scenarios, there are no reclaimable pages in the node, however, and the
condition for backing off is never met. Kswapd busyloops in an attempt
to restore the watermarks while having nothing to work with.
This series reworks the definition of an unreclaimable node based not on
scanning but on whether kswapd is able to actually reclaim pages in
MAX_RECLAIM_RETRIES (16) consecutive runs. This is the same criteria
the page allocator uses for giving up on direct reclaim and invoking the
OOM killer. If it cannot free any pages, kswapd will go to sleep and
leave further attempts to direct reclaim invocations, which will either
make progress and re-enable kswapd, or invoke the OOM killer.
Patch #1 fixes the immediate problem Jia reported, the remainder are
smaller fixlets, cleanups, and overall phasing out of the old method.
Patch #6 is the odd one out. It's a nice cleanup to get_scan_count(),
and directly related to #5, but in itself not relevant to the series.
If the whole series is too ambitious for 4.11, I would consider the
first three patches fixes, the rest cleanups.
This patch (of 9):
Jia He reports a problem with kswapd spinning at 100% CPU when
requesting more hugepages than memory available in the system:
$ echo 4000 >/proc/sys/vm/nr_hugepages
top - 13:42:59 up 3:37, 1 user, load average: 1.09, 1.03, 1.01
Tasks: 1 total, 1 running, 0 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.0 us, 12.5 sy, 0.0 ni, 85.5 id, 2.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 31371520 total, 30915136 used, 456384 free, 320 buffers
KiB Swap: 6284224 total, 115712 used, 6168512 free. 48192 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
76 root 20 0 0 0 0 R 100.0 0.000 217:17.29 kswapd3
At that time, there are no reclaimable pages left in the node, but as
kswapd fails to restore the high watermarks it refuses to go to sleep.
Kswapd needs to back away from nodes that fail to balance. Up until
commit 1d82de618ddd ("mm, vmscan: make kswapd reclaim in terms of
nodes") kswapd had such a mechanism. It considered zones whose
theoretically reclaimable pages it had reclaimed six times over as
unreclaimable and backed away from them. This guard was erroneously
removed as the patch changed the definition of a balanced node.
However, simply restoring this code wouldn't help in the case reported
here: there *are* no reclaimable pages that could be scanned until the
threshold is met. Kswapd would stay awake anyway.
Introduce a new and much simpler way of backing off. If kswapd runs
through MAX_RECLAIM_RETRIES (16) cycles without reclaiming a single
page, make it back off from the node. This is the same number of shots
direct reclaim takes before declaring OOM. Kswapd will go to sleep on
that node until a direct reclaimer manages to reclaim some pages, thus
proving the node reclaimable again.
[hannes(a)cmpxchg.org: check kswapd failure against the cumulative nr_reclaimed count]
Link: http://lkml.kernel.org/r/20170306162410.GB2090@cmpxchg.org
[shakeelb(a)google.com: fix condition for throttle_direct_reclaim]
Link: http://lkml.kernel.org/r/20170314183228.20152-1-shakeelb@google.com
Link: http://lkml.kernel.org/r/20170228214007.5621-2-hannes@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes(a)cmpxchg.org>
Signed-off-by: Shakeel Butt <shakeelb(a)google.com>
Reported-by: Jia He <hejianet(a)gmail.com>
Tested-by: Jia He <hejianet(a)gmail.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Acked-by: Hillf Danton <hillf.zj(a)alibaba-inc.com>
Acked-by: Minchan Kim <minchan(a)kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Dmitry Shmidt <dimitrysh(a)google.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/linux/mmzone.h | 2 ++
mm/internal.h | 6 ++++++
mm/page_alloc.c | 9 ++-------
mm/vmscan.c | 47 ++++++++++++++++++++++++++++++++---------------
mm/vmstat.c | 2 +-
5 files changed, 43 insertions(+), 23 deletions(-)
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -633,6 +633,8 @@ typedef struct pglist_data {
int kswapd_order;
enum zone_type kswapd_classzone_idx;
+ int kswapd_failures; /* Number of 'reclaimed == 0' runs */
+
#ifdef CONFIG_COMPACTION
int kcompactd_max_order;
enum zone_type kcompactd_classzone_idx;
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -74,6 +74,12 @@ static inline void set_page_refcounted(s
extern unsigned long highest_memmap_pfn;
/*
+ * Maximum number of reclaim retries without progress before the OOM
+ * killer is consider the only way forward.
+ */
+#define MAX_RECLAIM_RETRIES 16
+
+/*
* in mm/vmscan.c:
*/
extern int isolate_lru_page(struct page *page);
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3422,12 +3422,6 @@ bool gfp_pfmemalloc_allowed(gfp_t gfp_ma
}
/*
- * Maximum number of reclaim retries without any progress before OOM killer
- * is consider as the only way to move forward.
- */
-#define MAX_RECLAIM_RETRIES 16
-
-/*
* Checks whether it makes sense to retry the reclaim to make a forward progress
* for the given allocation request.
* The reclaim feedback represented by did_some_progress (any progress during
@@ -4385,7 +4379,8 @@ void show_free_areas(unsigned int filter
K(node_page_state(pgdat, NR_WRITEBACK_TEMP)),
K(node_page_state(pgdat, NR_UNSTABLE_NFS)),
node_page_state(pgdat, NR_PAGES_SCANNED),
- !pgdat_reclaimable(pgdat) ? "yes" : "no");
+ pgdat->kswapd_failures >= MAX_RECLAIM_RETRIES ?
+ "yes" : "no");
}
for_each_populated_zone(zone) {
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2606,6 +2606,15 @@ static bool shrink_node(pg_data_t *pgdat
} while (should_continue_reclaim(pgdat, sc->nr_reclaimed - nr_reclaimed,
sc->nr_scanned - nr_scanned, sc));
+ /*
+ * Kswapd gives up on balancing particular nodes after too
+ * many failures to reclaim anything from them and goes to
+ * sleep. On reclaim progress, reset the failure counter. A
+ * successful direct reclaim run will revive a dormant kswapd.
+ */
+ if (reclaimable)
+ pgdat->kswapd_failures = 0;
+
return reclaimable;
}
@@ -2680,10 +2689,6 @@ static void shrink_zones(struct zonelist
GFP_KERNEL | __GFP_HARDWALL))
continue;
- if (sc->priority != DEF_PRIORITY &&
- !pgdat_reclaimable(zone->zone_pgdat))
- continue; /* Let kswapd poll it */
-
/*
* If we already have plenty of memory free for
* compaction in this zone, don't free any more.
@@ -2820,7 +2825,7 @@ retry:
return 0;
}
-static bool pfmemalloc_watermark_ok(pg_data_t *pgdat)
+static bool allow_direct_reclaim(pg_data_t *pgdat)
{
struct zone *zone;
unsigned long pfmemalloc_reserve = 0;
@@ -2828,6 +2833,9 @@ static bool pfmemalloc_watermark_ok(pg_d
int i;
bool wmark_ok;
+ if (pgdat->kswapd_failures >= MAX_RECLAIM_RETRIES)
+ return true;
+
for (i = 0; i <= ZONE_NORMAL; i++) {
zone = &pgdat->node_zones[i];
if (!managed_zone(zone) ||
@@ -2908,7 +2916,7 @@ static bool throttle_direct_reclaim(gfp_
/* Throttle based on the first usable node */
pgdat = zone->zone_pgdat;
- if (pfmemalloc_watermark_ok(pgdat))
+ if (allow_direct_reclaim(pgdat))
goto out;
break;
}
@@ -2930,14 +2938,14 @@ static bool throttle_direct_reclaim(gfp_
*/
if (!(gfp_mask & __GFP_FS)) {
wait_event_interruptible_timeout(pgdat->pfmemalloc_wait,
- pfmemalloc_watermark_ok(pgdat), HZ);
+ allow_direct_reclaim(pgdat), HZ);
goto check_pending;
}
/* Throttle until kswapd wakes the process */
wait_event_killable(zone->zone_pgdat->pfmemalloc_wait,
- pfmemalloc_watermark_ok(pgdat));
+ allow_direct_reclaim(pgdat));
check_pending:
if (fatal_signal_pending(current))
@@ -3116,7 +3124,7 @@ static bool prepare_kswapd_sleep(pg_data
/*
* The throttled processes are normally woken up in balance_pgdat() as
- * soon as pfmemalloc_watermark_ok() is true. But there is a potential
+ * soon as allow_direct_reclaim() is true. But there is a potential
* race between when kswapd checks the watermarks and a process gets
* throttled. There is also a potential race if processes get
* throttled, kswapd wakes, a large process exits thereby balancing the
@@ -3130,6 +3138,10 @@ static bool prepare_kswapd_sleep(pg_data
if (waitqueue_active(&pgdat->pfmemalloc_wait))
wake_up_all(&pgdat->pfmemalloc_wait);
+ /* Hopeless node, leave it to direct reclaim */
+ if (pgdat->kswapd_failures >= MAX_RECLAIM_RETRIES)
+ return true;
+
for (i = 0; i <= classzone_idx; i++) {
struct zone *zone = pgdat->node_zones + i;
@@ -3216,9 +3228,9 @@ static int balance_pgdat(pg_data_t *pgda
count_vm_event(PAGEOUTRUN);
do {
+ unsigned long nr_reclaimed = sc.nr_reclaimed;
bool raise_priority = true;
- sc.nr_reclaimed = 0;
sc.reclaim_idx = classzone_idx;
/*
@@ -3297,7 +3309,7 @@ static int balance_pgdat(pg_data_t *pgda
* able to safely make forward progress. Wake them
*/
if (waitqueue_active(&pgdat->pfmemalloc_wait) &&
- pfmemalloc_watermark_ok(pgdat))
+ allow_direct_reclaim(pgdat))
wake_up_all(&pgdat->pfmemalloc_wait);
/* Check if kswapd should be suspending */
@@ -3308,10 +3320,14 @@ static int balance_pgdat(pg_data_t *pgda
* Raise priority if scanning rate is too low or there was no
* progress in reclaiming pages
*/
- if (raise_priority || !sc.nr_reclaimed)
+ nr_reclaimed = sc.nr_reclaimed - nr_reclaimed;
+ if (raise_priority || !nr_reclaimed)
sc.priority--;
} while (sc.priority >= 1);
+ if (!sc.nr_reclaimed)
+ pgdat->kswapd_failures++;
+
out:
/*
* Return the order kswapd stopped reclaiming at as
@@ -3511,6 +3527,10 @@ void wakeup_kswapd(struct zone *zone, in
if (!waitqueue_active(&pgdat->kswapd_wait))
return;
+ /* Hopeless node, leave it to direct reclaim */
+ if (pgdat->kswapd_failures >= MAX_RECLAIM_RETRIES)
+ return;
+
/* Only wake kswapd if all zones are unbalanced */
for (z = 0; z <= classzone_idx; z++) {
zone = pgdat->node_zones + z;
@@ -3781,9 +3801,6 @@ int node_reclaim(struct pglist_data *pgd
sum_zone_node_page_state(pgdat->node_id, NR_SLAB_RECLAIMABLE) <= pgdat->min_slab_pages)
return NODE_RECLAIM_FULL;
- if (!pgdat_reclaimable(pgdat))
- return NODE_RECLAIM_FULL;
-
/*
* Do not scan if the allocation should not be delayed.
*/
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1421,7 +1421,7 @@ static void zoneinfo_show_print(struct s
"\n node_unreclaimable: %u"
"\n start_pfn: %lu"
"\n node_inactive_ratio: %u",
- !pgdat_reclaimable(zone->zone_pgdat),
+ pgdat->kswapd_failures >= MAX_RECLAIM_RETRIES,
zone->zone_start_pfn,
zone->zone_pgdat->inactive_ratio);
seq_putc(m, '\n');
Patches currently in stable-queue which might be from hannes(a)cmpxchg.org are
queue-4.9/mm-fix-100-cpu-kswapd-busyloop-on-unreclaimable-nodes.patch
queue-4.9/mm-page_alloc-fix-potential-false-positive-in-__zone_watermark_ok.patch
Hi Sasha,
The build logs for linux-4.1 on ARM64 have become completely unreadable
after Olof has updated his toolchain, with many thousand warnings like:
> arm64.allmodconfig:
> /tmp/cc1QjtsZ.s:2500: Warning: ignoring incorrect section type for .init_array.00100
> /tmp/cc1QjtsZ.s:2513: Warning: ignoring incorrect section type for .fini_array.00100
> /tmp/cc3sXmLs.s:77: Warning: ignoring incorrect section type for .init_array.00100
> /tmp/cc3sXmLs.s:90: Warning: ignoring incorrect section type for .fini_array.00100
> /tmp/cc3J6cMe.s:1401: Warning: ignoring incorrect section type for .init_array.00100
>/tmp/cc3J6cMe.s:1414: Warning: ignoring incorrect section type for .fini_array.00100
> /tmp/ccbjphUv.s:1775: Warning: ignoring incorrect section type for .init_array.00100
> /tmp/ccbjphUv.s:1788: Warning: ignoring incorrect section type for .fini_array.00100
> /tmp/ccVPrJUF.s:50: Warning: ignoring incorrect section type for .init_array.00100
Could you pick up commit
cc622420798c ("gcov: disable for COMPILE_TEST")
like Greg did for 3.18 and 4.4?
Many configurations also produce
drivers/mtd/mtd_blkdevs.c:100:2: warning: switch condition has boolean
value [-Wswitch-bool]
which got fixed upstream with
cc7fce802290 ("mtd: blkdevs: fix switch-bool compilation warning")
Once those two are backported, the log files like
http://arm-soc.lixom.net/buildlogs/stable-rc/v4.1.49/
become much more useful.
Thanks,
Arnd
This is a note to let you know that I've just added the patch titled
Revert "module: Add retpoline tag to VERMAGIC"
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
revert-module-add-retpoline-tag-to-vermagic.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 5132ede0fe8092b043dae09a7cc32b8ae7272baa Mon Sep 17 00:00:00 2001
From: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Date: Wed, 24 Jan 2018 15:28:17 +0100
Subject: Revert "module: Add retpoline tag to VERMAGIC"
From: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
commit 5132ede0fe8092b043dae09a7cc32b8ae7272baa upstream.
This reverts commit 6cfb521ac0d5b97470883ff9b7facae264b7ab12.
Turns out distros do not want to make retpoline as part of their "ABI",
so this patch should not have been merged. Sorry Andi, this was my
fault, I suggested it when your original patch was the "correct" way of
doing this instead.
Reported-by: Jiri Kosina <jikos(a)kernel.org>
Fixes: 6cfb521ac0d5 ("module: Add retpoline tag to VERMAGIC")
Acked-by: Andi Kleen <ak(a)linux.intel.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: David Woodhouse <dwmw(a)amazon.co.uk>
Cc: rusty(a)rustcorp.com.au
Cc: arjan.van.de.ven(a)intel.com
Cc: jeyu(a)kernel.org
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/linux/vermagic.h | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)
--- a/include/linux/vermagic.h
+++ b/include/linux/vermagic.h
@@ -24,16 +24,10 @@
#ifndef MODULE_ARCH_VERMAGIC
#define MODULE_ARCH_VERMAGIC ""
#endif
-#ifdef RETPOLINE
-#define MODULE_VERMAGIC_RETPOLINE "retpoline "
-#else
-#define MODULE_VERMAGIC_RETPOLINE ""
-#endif
#define VERMAGIC_STRING \
UTS_RELEASE " " \
MODULE_VERMAGIC_SMP MODULE_VERMAGIC_PREEMPT \
MODULE_VERMAGIC_MODULE_UNLOAD MODULE_VERMAGIC_MODVERSIONS \
- MODULE_ARCH_VERMAGIC \
- MODULE_VERMAGIC_RETPOLINE
+ MODULE_ARCH_VERMAGIC
Patches currently in stable-queue which might be from gregkh(a)linuxfoundation.org are
queue-4.9/prevent-timer-value-0-for-mwaitx.patch
queue-4.9/ipc-msg-make-msgrcv-work-with-long_min.patch
queue-4.9/reiserfs-don-t-preallocate-blocks-for-extended-attributes.patch
queue-4.9/fs-fcntl-f_setown-avoid-undefined-behaviour.patch
queue-4.9/drivers-base-cacheinfo-fix-x86-with-config_of-enabled.patch
queue-4.9/can-af_can-canfd_rcv-replace-warn_once-by-pr_warn_once.patch
queue-4.9/reiserfs-fix-race-in-prealloc-discard.patch
queue-4.9/can-af_can-can_rcv-replace-warn_once-by-pr_warn_once.patch
queue-4.9/netfilter-xt_osf-add-missing-permission-checks.patch
queue-4.9/hwpoison-memcg-forcibly-uncharge-lru-pages.patch
queue-4.9/x86-asm-32-make-sync_core-handle-missing-cpuid-on-all-32-bit-kernels.patch
queue-4.9/mm-mmap.c-do-not-blow-on-prot_none-map_fixed-holes-in-the-stack.patch
queue-4.9/usbip-fix-potential-format-overflow-in-userspace-tools.patch
queue-4.9/usbip-fix-implicit-fallthrough-warning.patch
queue-4.9/orangefs-initialize-op-on-loop-restart-in-orangefs_devreq_read.patch
queue-4.9/cma-fix-calculation-of-aligned-offset.patch
queue-4.9/drivers-base-cacheinfo-fix-boot-error-message-when-acpi-is-enabled.patch
queue-4.9/mm-page_alloc-fix-potential-false-positive-in-__zone_watermark_ok.patch
queue-4.9/kvm-arm-arm64-check-pagesize-when-allocating-a-hugepage-at-stage-2.patch
queue-4.9/acpica-namespace-fix-operand-cache-leak.patch
queue-4.9/revert-module-add-retpoline-tag-to-vermagic.patch
queue-4.9/orangefs-use-list_for_each_entry_safe-in-purge_waiting_ops.patch
queue-4.9/acpi-scan-prefer-devices-without-_hid-_cid-for-_adr-matching.patch
queue-4.9/scsi-libiscsi-fix-shifting-of-did_requeue-host-byte.patch
queue-4.9/usbip-prevent-vhci_hcd-driver-from-leaking-a-socket-pointer-address.patch
queue-4.9/netfilter-nfnetlink_cthelper-add-missing-permission-checks.patch
This is a note to let you know that I've just added the patch titled
Revert "module: Add retpoline tag to VERMAGIC"
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
revert-module-add-retpoline-tag-to-vermagic.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 5132ede0fe8092b043dae09a7cc32b8ae7272baa Mon Sep 17 00:00:00 2001
From: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Date: Wed, 24 Jan 2018 15:28:17 +0100
Subject: Revert "module: Add retpoline tag to VERMAGIC"
From: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
commit 5132ede0fe8092b043dae09a7cc32b8ae7272baa upstream.
This reverts commit 6cfb521ac0d5b97470883ff9b7facae264b7ab12.
Turns out distros do not want to make retpoline as part of their "ABI",
so this patch should not have been merged. Sorry Andi, this was my
fault, I suggested it when your original patch was the "correct" way of
doing this instead.
Reported-by: Jiri Kosina <jikos(a)kernel.org>
Fixes: 6cfb521ac0d5 ("module: Add retpoline tag to VERMAGIC")
Acked-by: Andi Kleen <ak(a)linux.intel.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: David Woodhouse <dwmw(a)amazon.co.uk>
Cc: rusty(a)rustcorp.com.au
Cc: arjan.van.de.ven(a)intel.com
Cc: jeyu(a)kernel.org
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/linux/vermagic.h | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)
--- a/include/linux/vermagic.h
+++ b/include/linux/vermagic.h
@@ -24,16 +24,10 @@
#ifndef MODULE_ARCH_VERMAGIC
#define MODULE_ARCH_VERMAGIC ""
#endif
-#ifdef RETPOLINE
-#define MODULE_VERMAGIC_RETPOLINE "retpoline "
-#else
-#define MODULE_VERMAGIC_RETPOLINE ""
-#endif
#define VERMAGIC_STRING \
UTS_RELEASE " " \
MODULE_VERMAGIC_SMP MODULE_VERMAGIC_PREEMPT \
MODULE_VERMAGIC_MODULE_UNLOAD MODULE_VERMAGIC_MODVERSIONS \
- MODULE_ARCH_VERMAGIC \
- MODULE_VERMAGIC_RETPOLINE
+ MODULE_ARCH_VERMAGIC
Patches currently in stable-queue which might be from gregkh(a)linuxfoundation.org are
queue-4.4/prevent-timer-value-0-for-mwaitx.patch
queue-4.4/fs-select-add-vmalloc-fallback-for-select-2.patch
queue-4.4/x86-ioapic-fix-incorrect-pointers-in-ioapic_setup_resources.patch
queue-4.4/ipc-msg-make-msgrcv-work-with-long_min.patch
queue-4.4/reiserfs-don-t-preallocate-blocks-for-extended-attributes.patch
queue-4.4/netfilter-x_tables-speed-up-jump-target-validation.patch
queue-4.4/netfilter-nf_dup_ipv6-set-again-flowi_flag_known_nh-at-flowi6_flags.patch
queue-4.4/mmc-sdhci-of-esdhc-add-remove-some-quirks-according-to-vendor-version.patch
queue-4.4/netfilter-fix-is_err_value-usage.patch
queue-4.4/fs-fcntl-f_setown-avoid-undefined-behaviour.patch
queue-4.4/netfilter-nf_ct_expect-remove-the-redundant-slash-when-policy-name-is-empty.patch
queue-4.4/drivers-base-cacheinfo-fix-x86-with-config_of-enabled.patch
queue-4.4/can-af_can-canfd_rcv-replace-warn_once-by-pr_warn_once.patch
queue-4.4/pm-sleep-declare-__tracedata-symbols-as-char-rather-than-char.patch
queue-4.4/pci-layerscape-add-fsl-ls2085a-pcie-compatible-id.patch
queue-4.4/ext2-don-t-clear-sgid-when-inheriting-acls.patch
queue-4.4/reiserfs-fix-race-in-prealloc-discard.patch
queue-4.4/can-af_can-can_rcv-replace-warn_once-by-pr_warn_once.patch
queue-4.4/netfilter-xt_osf-add-missing-permission-checks.patch
queue-4.4/netfilter-arp_tables-fix-invoking-32bit-iptable-p-input-accept-failed-in-64bit-kernel.patch
queue-4.4/hwpoison-memcg-forcibly-uncharge-lru-pages.patch
queue-4.4/x86-asm-32-make-sync_core-handle-missing-cpuid-on-all-32-bit-kernels.patch
queue-4.4/mm-mmap.c-do-not-blow-on-prot_none-map_fixed-holes-in-the-stack.patch
queue-4.4/usbip-fix-potential-format-overflow-in-userspace-tools.patch
queue-4.4/timers-plug-locking-race-vs.-timer-migration.patch
queue-4.4/acpi-processor-avoid-reserving-io-regions-too-early.patch
queue-4.4/usbip-fix-implicit-fallthrough-warning.patch
queue-4.4/cma-fix-calculation-of-aligned-offset.patch
queue-4.4/netfilter-nf_conntrack_sip-extend-request-line-validation.patch
queue-4.4/netfilter-restart-search-if-moved-to-other-chain.patch
queue-4.4/x86-microcode-intel-fix-bdw-late-loading-revision-check.patch
queue-4.4/x86-cpu-intel-introduce-macros-for-intel-family-numbers.patch
queue-4.4/drivers-base-cacheinfo-fix-boot-error-message-when-acpi-is-enabled.patch
queue-4.4/pci-layerscape-fix-msg-tlp-drop-setting.patch
queue-4.4/netfilter-use-fwmark_reflect-in-nf_send_reset.patch
queue-4.4/x86-retpoline-fill-rsb-on-context-switch-for-affected-cpus.patch
queue-4.4/mm-page_alloc-fix-potential-false-positive-in-__zone_watermark_ok.patch
queue-4.4/acpica-namespace-fix-operand-cache-leak.patch
queue-4.4/revert-module-add-retpoline-tag-to-vermagic.patch
queue-4.4/acpi-scan-prefer-devices-without-_hid-_cid-for-_adr-matching.patch
queue-4.4/time-avoid-undefined-behaviour-in-ktime_add_safe.patch
queue-4.4/sched-deadline-use-the-revised-wakeup-rule-for-suspending-constrained-dl-tasks.patch
queue-4.4/reiserfs-don-t-clear-sgid-when-inheriting-acls.patch
queue-4.4/scsi-libiscsi-fix-shifting-of-did_requeue-host-byte.patch
queue-4.4/usbip-prevent-vhci_hcd-driver-from-leaking-a-socket-pointer-address.patch
queue-4.4/netfilter-nfnetlink_queue-reject-verdict-request-from-different-portid.patch
queue-4.4/netfilter-nfnetlink_cthelper-add-missing-permission-checks.patch
This is a note to let you know that I've just added the patch titled
Revert "module: Add retpoline tag to VERMAGIC"
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
revert-module-add-retpoline-tag-to-vermagic.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 5132ede0fe8092b043dae09a7cc32b8ae7272baa Mon Sep 17 00:00:00 2001
From: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Date: Wed, 24 Jan 2018 15:28:17 +0100
Subject: Revert "module: Add retpoline tag to VERMAGIC"
From: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
commit 5132ede0fe8092b043dae09a7cc32b8ae7272baa upstream.
This reverts commit 6cfb521ac0d5b97470883ff9b7facae264b7ab12.
Turns out distros do not want to make retpoline as part of their "ABI",
so this patch should not have been merged. Sorry Andi, this was my
fault, I suggested it when your original patch was the "correct" way of
doing this instead.
Reported-by: Jiri Kosina <jikos(a)kernel.org>
Fixes: 6cfb521ac0d5 ("module: Add retpoline tag to VERMAGIC")
Acked-by: Andi Kleen <ak(a)linux.intel.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: David Woodhouse <dwmw(a)amazon.co.uk>
Cc: rusty(a)rustcorp.com.au
Cc: arjan.van.de.ven(a)intel.com
Cc: jeyu(a)kernel.org
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
include/linux/vermagic.h | 8 +-------
1 file changed, 1 insertion(+), 7 deletions(-)
--- a/include/linux/vermagic.h
+++ b/include/linux/vermagic.h
@@ -31,17 +31,11 @@
#else
#define MODULE_RANDSTRUCT_PLUGIN
#endif
-#ifdef RETPOLINE
-#define MODULE_VERMAGIC_RETPOLINE "retpoline "
-#else
-#define MODULE_VERMAGIC_RETPOLINE ""
-#endif
#define VERMAGIC_STRING \
UTS_RELEASE " " \
MODULE_VERMAGIC_SMP MODULE_VERMAGIC_PREEMPT \
MODULE_VERMAGIC_MODULE_UNLOAD MODULE_VERMAGIC_MODVERSIONS \
MODULE_ARCH_VERMAGIC \
- MODULE_RANDSTRUCT_PLUGIN \
- MODULE_VERMAGIC_RETPOLINE
+ MODULE_RANDSTRUCT_PLUGIN
Patches currently in stable-queue which might be from gregkh(a)linuxfoundation.org are
queue-4.14/netfilter-xt_osf-add-missing-permission-checks.patch
queue-4.14/xfrm-fix-a-race-in-the-xdst-pcpu-cache.patch
queue-4.14/orangefs-initialize-op-on-loop-restart-in-orangefs_devreq_read.patch
queue-4.14/mm-page_alloc-fix-potential-false-positive-in-__zone_watermark_ok.patch
queue-4.14/revert-module-add-retpoline-tag-to-vermagic.patch
queue-4.14/orangefs-use-list_for_each_entry_safe-in-purge_waiting_ops.patch
queue-4.14/netfilter-nfnetlink_cthelper-add-missing-permission-checks.patch
From: Stefan Schake <stschake(a)gmail.com>
[ Upstream commit 253696ccd613fbdaa5aba1de44c461a058e0a114 ]
Synchronously disable the IRQ to make the following cancel_work_sync
invocation effective.
An interrupt in flight could enqueue further overflow mem work. As we
free the binner BO immediately following vc4_irq_uninstall this caused
a NULL pointer dereference in the work callback vc4_overflow_mem_work.
Link: https://github.com/anholt/linux/issues/114
Signed-off-by: Stefan Schake <stschake(a)gmail.com>
Fixes: d5b1a78a772f ("drm/vc4: Add support for drawing 3D frames.")
Signed-off-by: Eric Anholt <eric(a)anholt.net>
Reviewed-by: Eric Anholt <eric(a)anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/1510275907-993-2-git-send-ema…
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
---
drivers/gpu/drm/vc4/vc4_irq.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/gpu/drm/vc4/vc4_irq.c b/drivers/gpu/drm/vc4/vc4_irq.c
index 7d7af3a93d94..61b2e5377993 100644
--- a/drivers/gpu/drm/vc4/vc4_irq.c
+++ b/drivers/gpu/drm/vc4/vc4_irq.c
@@ -208,6 +208,9 @@ vc4_irq_postinstall(struct drm_device *dev)
{
struct vc4_dev *vc4 = to_vc4_dev(dev);
+ /* Undo the effects of a previous vc4_irq_uninstall. */
+ enable_irq(dev->irq);
+
/* Enable both the render done and out of memory interrupts. */
V3D_WRITE(V3D_INTENA, V3D_DRIVER_IRQS);
@@ -225,6 +228,9 @@ vc4_irq_uninstall(struct drm_device *dev)
/* Clear any pending interrupts we might have left. */
V3D_WRITE(V3D_INTCTL, V3D_DRIVER_IRQS);
+ /* Finish any interrupt handler still in flight. */
+ disable_irq(dev->irq);
+
cancel_work_sync(&vc4->overflow_mem_work);
}
--
2.11.0
From: Stefan Schake <stschake(a)gmail.com>
[ Upstream commit 253696ccd613fbdaa5aba1de44c461a058e0a114 ]
Synchronously disable the IRQ to make the following cancel_work_sync
invocation effective.
An interrupt in flight could enqueue further overflow mem work. As we
free the binner BO immediately following vc4_irq_uninstall this caused
a NULL pointer dereference in the work callback vc4_overflow_mem_work.
Link: https://github.com/anholt/linux/issues/114
Signed-off-by: Stefan Schake <stschake(a)gmail.com>
Fixes: d5b1a78a772f ("drm/vc4: Add support for drawing 3D frames.")
Signed-off-by: Eric Anholt <eric(a)anholt.net>
Reviewed-by: Eric Anholt <eric(a)anholt.net>
Link: https://patchwork.freedesktop.org/patch/msgid/1510275907-993-2-git-send-ema…
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
---
drivers/gpu/drm/vc4/vc4_irq.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/gpu/drm/vc4/vc4_irq.c b/drivers/gpu/drm/vc4/vc4_irq.c
index 094bc6a475c1..d45a7c0a7915 100644
--- a/drivers/gpu/drm/vc4/vc4_irq.c
+++ b/drivers/gpu/drm/vc4/vc4_irq.c
@@ -208,6 +208,9 @@ vc4_irq_postinstall(struct drm_device *dev)
{
struct vc4_dev *vc4 = to_vc4_dev(dev);
+ /* Undo the effects of a previous vc4_irq_uninstall. */
+ enable_irq(dev->irq);
+
/* Enable both the render done and out of memory interrupts. */
V3D_WRITE(V3D_INTENA, V3D_DRIVER_IRQS);
@@ -225,6 +228,9 @@ vc4_irq_uninstall(struct drm_device *dev)
/* Clear any pending interrupts we might have left. */
V3D_WRITE(V3D_INTCTL, V3D_DRIVER_IRQS);
+ /* Finish any interrupt handler still in flight. */
+ disable_irq(dev->irq);
+
cancel_work_sync(&vc4->overflow_mem_work);
}
--
2.11.0
The patch
ASoC: rt5514-spi: Check the validity of drvdata pointer on resume
has been applied to the asoc tree at
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git
All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.
You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.
If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.
Please add any relevant lists and maintainers to the CCs when replying
to this mail.
Thanks,
Mark
>From 509bf3a7d43ab173abc354df2a859229ede043c0 Mon Sep 17 00:00:00 2001
From: Marc Zyngier <marc.zyngier(a)arm.com>
Date: Wed, 24 Jan 2018 14:50:00 +0000
Subject: [PATCH] ASoC: rt5514-spi: Check the validity of drvdata pointer on
resume
The rt5514-spi driver seem to assume the validity of the drvdata pointer
on resume, which it may not be populated, leading to a not-so-nice crash.
This stems from the fact that rt5514_spi_pcm_probe() is never called on
my system (a kevin Chromebook). No idea why, but if it can happen, it
is worth fixing.
Fixes: e9c50aa6bd39 ("ASoC: rt5514-spi: check irq status to schedule data copy in resume function")
Signed-off-by: Marc Zyngier <marc.zyngier(a)arm.com>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Cc: stable(a)vger.kernel.org
---
sound/soc/codecs/rt5514-spi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/sound/soc/codecs/rt5514-spi.c b/sound/soc/codecs/rt5514-spi.c
index 2df91db765ac..9255afcf2c3a 100644
--- a/sound/soc/codecs/rt5514-spi.c
+++ b/sound/soc/codecs/rt5514-spi.c
@@ -482,7 +482,7 @@ static int __maybe_unused rt5514_resume(struct device *dev)
if (device_may_wakeup(dev))
disable_irq_wake(irq);
- if (rt5514_dsp->substream) {
+ if (rt5514_dsp && rt5514_dsp->substream) {
rt5514_spi_burst_read(RT5514_IRQ_CTRL, (u8 *)&buf, sizeof(buf));
if (buf[0] & RT5514_IRQ_STATUS_BIT)
rt5514_schedule_copy(rt5514_dsp);
--
2.15.1
From: Liran Alon <liran.alon(a)oracle.com>
[ Upstream commit 1f4dcb3b213235e642088709a1c54964d23365e9 ]
On this case, handle_emulation_failure() fills kvm_run with
internal-error information which it expects to be delivered
to user-mode for further processing.
However, the code reports a wrong return-value which makes KVM to never
return to user-mode on this scenario.
Fixes: 6d77dbfc88e3 ("KVM: inject #UD if instruction emulation fails and exit to
userspace")
Signed-off-by: Liran Alon <liran.alon(a)oracle.com>
Reviewed-by: Nikita Leshenko <nikita.leshchenko(a)oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com>
Reviewed-by: Wanpeng Li <wanpeng.li(a)hotmail.com>
Signed-off-by: Radim Krčmář <rkrcmar(a)redhat.com>
Signed-off-by: Sasha Levin <alexander.levin(a)microsoft.com>
---
arch/x86/kvm/x86.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index f973cfa8ff4f..3900d34980de 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -5153,7 +5153,7 @@ static int handle_emulation_failure(struct kvm_vcpu *vcpu)
vcpu->run->exit_reason = KVM_EXIT_INTERNAL_ERROR;
vcpu->run->internal.suberror = KVM_INTERNAL_ERROR_EMULATION;
vcpu->run->internal.ndata = 0;
- r = EMULATE_FAIL;
+ r = EMULATE_USER_EXIT;
}
kvm_queue_exception(vcpu, UD_VECTOR);
--
2.11.0
This is a note to let you know that I've just added the patch titled
xfrm: Fix a race in the xdst pcpu cache.
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
xfrm-fix-a-race-in-the-xdst-pcpu-cache.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 76a4201191814a0061cb5c861fafb9ecaa764846 Mon Sep 17 00:00:00 2001
From: Steffen Klassert <steffen.klassert(a)secunet.com>
Date: Wed, 10 Jan 2018 12:14:28 +0100
Subject: xfrm: Fix a race in the xdst pcpu cache.
From: Steffen Klassert <steffen.klassert(a)secunet.com>
commit 76a4201191814a0061cb5c861fafb9ecaa764846 upstream.
We need to run xfrm_resolve_and_create_bundle() with
bottom halves off. Otherwise we may reuse an already
released dst_enty when the xfrm lookup functions are
called from process context.
Fixes: c30d78c14a813db39a647b6a348b428 ("xfrm: add xdst pcpu cache")
Reported-by: Darius Ski <darius.ski(a)gmail.com>
Signed-off-by: Steffen Klassert <steffen.klassert(a)secunet.com>
Acked-by: David Miller <davem(a)davemloft.net>,
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/xfrm/xfrm_policy.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
--- a/net/xfrm/xfrm_policy.c
+++ b/net/xfrm/xfrm_policy.c
@@ -2056,8 +2056,11 @@ xfrm_bundle_lookup(struct net *net, cons
if (num_xfrms <= 0)
goto make_dummy_bundle;
+ local_bh_disable();
xdst = xfrm_resolve_and_create_bundle(pols, num_pols, fl, family,
- xflo->dst_orig);
+ xflo->dst_orig);
+ local_bh_enable();
+
if (IS_ERR(xdst)) {
err = PTR_ERR(xdst);
if (err != -EAGAIN)
@@ -2144,9 +2147,12 @@ struct dst_entry *xfrm_lookup(struct net
goto no_transform;
}
+ local_bh_disable();
xdst = xfrm_resolve_and_create_bundle(
pols, num_pols, fl,
family, dst_orig);
+ local_bh_enable();
+
if (IS_ERR(xdst)) {
xfrm_pols_put(pols, num_pols);
err = PTR_ERR(xdst);
Patches currently in stable-queue which might be from steffen.klassert(a)secunet.com are
queue-4.14/xfrm-fix-a-race-in-the-xdst-pcpu-cache.patch
This is a note to let you know that I've just added the patch titled
scsi: libiscsi: fix shifting of DID_REQUEUE host byte
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
scsi-libiscsi-fix-shifting-of-did_requeue-host-byte.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From eef9ffdf9cd39b2986367bc8395e2772bc1284ba Mon Sep 17 00:00:00 2001
From: Johannes Thumshirn <jthumshirn(a)suse.de>
Date: Mon, 9 Oct 2017 13:33:19 +0200
Subject: scsi: libiscsi: fix shifting of DID_REQUEUE host byte
From: Johannes Thumshirn <jthumshirn(a)suse.de>
commit eef9ffdf9cd39b2986367bc8395e2772bc1284ba upstream.
The SCSI host byte should be shifted left by 16 in order to have
scsi_decide_disposition() do the right thing (.i.e. requeue the
command).
Signed-off-by: Johannes Thumshirn <jthumshirn(a)suse.de>
Fixes: 661134ad3765 ("[SCSI] libiscsi, bnx2i: make bound ep check common")
Cc: Lee Duncan <lduncan(a)suse.com>
Cc: Hannes Reinecke <hare(a)suse.de>
Cc: Bart Van Assche <Bart.VanAssche(a)sandisk.com>
Cc: Chris Leech <cleech(a)redhat.com>
Acked-by: Lee Duncan <lduncan(a)suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/scsi/libiscsi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -1727,7 +1727,7 @@ int iscsi_queuecommand(struct Scsi_Host
if (test_bit(ISCSI_SUSPEND_BIT, &conn->suspend_tx)) {
reason = FAILURE_SESSION_IN_RECOVERY;
- sc->result = DID_REQUEUE;
+ sc->result = DID_REQUEUE << 16;
goto fault;
}
Patches currently in stable-queue which might be from jthumshirn(a)suse.de are
queue-4.9/scsi-libiscsi-fix-shifting-of-did_requeue-host-byte.patch
This is a note to let you know that I've just added the patch titled
reiserfs: don't preallocate blocks for extended attributes
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
reiserfs-don-t-preallocate-blocks-for-extended-attributes.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 54930dfeb46e978b447af0fb8ab4e181c1bf9d7a Mon Sep 17 00:00:00 2001
From: Jeff Mahoney <jeffm(a)suse.com>
Date: Thu, 22 Jun 2017 16:35:04 -0400
Subject: reiserfs: don't preallocate blocks for extended attributes
From: Jeff Mahoney <jeffm(a)suse.com>
commit 54930dfeb46e978b447af0fb8ab4e181c1bf9d7a upstream.
Most extended attributes will fit in a single block. More importantly,
we drop the reference to the inode while holding the transaction open
so the preallocated blocks aren't released. As a result, the inode
may be evicted before it's removed from the transaction's prealloc list
which can cause memory corruption.
Signed-off-by: Jeff Mahoney <jeffm(a)suse.com>
Signed-off-by: Jan Kara <jack(a)suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/reiserfs/bitmap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/reiserfs/bitmap.c
+++ b/fs/reiserfs/bitmap.c
@@ -1136,7 +1136,7 @@ static int determine_prealloc_size(reise
hint->prealloc_size = 0;
if (!hint->formatted_node && hint->preallocate) {
- if (S_ISREG(hint->inode->i_mode)
+ if (S_ISREG(hint->inode->i_mode) && !IS_PRIVATE(hint->inode)
&& hint->inode->i_size >=
REISERFS_SB(hint->th->t_super)->s_alloc_options.
preallocmin * hint->inode->i_sb->s_blocksize)
Patches currently in stable-queue which might be from jeffm(a)suse.com are
queue-4.9/reiserfs-don-t-preallocate-blocks-for-extended-attributes.patch
queue-4.9/reiserfs-fix-race-in-prealloc-discard.patch
This is a note to let you know that I've just added the patch titled
reiserfs: fix race in prealloc discard
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
reiserfs-fix-race-in-prealloc-discard.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 08db141b5313ac2f64b844fb5725b8d81744b417 Mon Sep 17 00:00:00 2001
From: Jeff Mahoney <jeffm(a)suse.com>
Date: Thu, 22 Jun 2017 16:47:34 -0400
Subject: reiserfs: fix race in prealloc discard
From: Jeff Mahoney <jeffm(a)suse.com>
commit 08db141b5313ac2f64b844fb5725b8d81744b417 upstream.
The main loop in __discard_prealloc is protected by the reiserfs write lock
which is dropped across schedules like the BKL it replaced. The problem is
that it checks the value, calls a routine that schedules, and then adjusts
the state. As a result, two threads that are calling
reiserfs_prealloc_discard at the same time can race when one calls
reiserfs_free_prealloc_block, the lock is dropped, and the other calls
reiserfs_free_prealloc_block with the same block number. In the right
circumstances, it can cause the prealloc count to go negative.
Signed-off-by: Jeff Mahoney <jeffm(a)suse.com>
Signed-off-by: Jan Kara <jack(a)suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/reiserfs/bitmap.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
--- a/fs/reiserfs/bitmap.c
+++ b/fs/reiserfs/bitmap.c
@@ -513,9 +513,17 @@ static void __discard_prealloc(struct re
"inode has negative prealloc blocks count.");
#endif
while (ei->i_prealloc_count > 0) {
- reiserfs_free_prealloc_block(th, inode, ei->i_prealloc_block);
- ei->i_prealloc_block++;
+ b_blocknr_t block_to_free;
+
+ /*
+ * reiserfs_free_prealloc_block can drop the write lock,
+ * which could allow another caller to free the same block.
+ * We can protect against it by modifying the prealloc
+ * state before calling it.
+ */
+ block_to_free = ei->i_prealloc_block++;
ei->i_prealloc_count--;
+ reiserfs_free_prealloc_block(th, inode, block_to_free);
dirty = 1;
}
if (dirty)
Patches currently in stable-queue which might be from jeffm(a)suse.com are
queue-4.9/reiserfs-don-t-preallocate-blocks-for-extended-attributes.patch
queue-4.9/reiserfs-fix-race-in-prealloc-discard.patch
This is a note to let you know that I've just added the patch titled
netfilter: nfnetlink_cthelper: Add missing permission checks
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
netfilter-nfnetlink_cthelper-add-missing-permission-checks.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 4b380c42f7d00a395feede754f0bc2292eebe6e5 Mon Sep 17 00:00:00 2001
From: Kevin Cernekee <cernekee(a)chromium.org>
Date: Sun, 3 Dec 2017 12:12:45 -0800
Subject: netfilter: nfnetlink_cthelper: Add missing permission checks
From: Kevin Cernekee <cernekee(a)chromium.org>
commit 4b380c42f7d00a395feede754f0bc2292eebe6e5 upstream.
The capability check in nfnetlink_rcv() verifies that the caller
has CAP_NET_ADMIN in the namespace that "owns" the netlink socket.
However, nfnl_cthelper_list is shared by all net namespaces on the
system. An unprivileged user can create user and net namespaces
in which he holds CAP_NET_ADMIN to bypass the netlink_net_capable()
check:
$ nfct helper list
nfct v1.4.4: netlink error: Operation not permitted
$ vpnns -- nfct helper list
{
.name = ftp,
.queuenum = 0,
.l3protonum = 2,
.l4protonum = 6,
.priv_data_len = 24,
.status = enabled,
};
Add capable() checks in nfnetlink_cthelper, as this is cleaner than
trying to generalize the solution.
Signed-off-by: Kevin Cernekee <cernekee(a)chromium.org>
Signed-off-by: Pablo Neira Ayuso <pablo(a)netfilter.org>
Acked-by: Michal Kubecek <mkubecek(a)suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/netfilter/nfnetlink_cthelper.c | 10 ++++++++++
1 file changed, 10 insertions(+)
--- a/net/netfilter/nfnetlink_cthelper.c
+++ b/net/netfilter/nfnetlink_cthelper.c
@@ -17,6 +17,7 @@
#include <linux/types.h>
#include <linux/list.h>
#include <linux/errno.h>
+#include <linux/capability.h>
#include <net/netlink.h>
#include <net/sock.h>
@@ -392,6 +393,9 @@ static int nfnl_cthelper_new(struct net
struct nfnl_cthelper *nlcth;
int ret = 0;
+ if (!capable(CAP_NET_ADMIN))
+ return -EPERM;
+
if (!tb[NFCTH_NAME] || !tb[NFCTH_TUPLE])
return -EINVAL;
@@ -595,6 +599,9 @@ static int nfnl_cthelper_get(struct net
struct nfnl_cthelper *nlcth;
bool tuple_set = false;
+ if (!capable(CAP_NET_ADMIN))
+ return -EPERM;
+
if (nlh->nlmsg_flags & NLM_F_DUMP) {
struct netlink_dump_control c = {
.dump = nfnl_cthelper_dump_table,
@@ -661,6 +668,9 @@ static int nfnl_cthelper_del(struct net
struct nfnl_cthelper *nlcth, *n;
int j = 0, ret;
+ if (!capable(CAP_NET_ADMIN))
+ return -EPERM;
+
if (tb[NFCTH_NAME])
helper_name = nla_data(tb[NFCTH_NAME]);
Patches currently in stable-queue which might be from cernekee(a)chromium.org are
queue-4.9/netfilter-xt_osf-add-missing-permission-checks.patch
queue-4.9/netfilter-nfnetlink_cthelper-add-missing-permission-checks.patch
This is a note to let you know that I've just added the patch titled
netfilter: xt_osf: Add missing permission checks
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
netfilter-xt_osf-add-missing-permission-checks.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 916a27901de01446bcf57ecca4783f6cff493309 Mon Sep 17 00:00:00 2001
From: Kevin Cernekee <cernekee(a)chromium.org>
Date: Tue, 5 Dec 2017 15:42:41 -0800
Subject: netfilter: xt_osf: Add missing permission checks
From: Kevin Cernekee <cernekee(a)chromium.org>
commit 916a27901de01446bcf57ecca4783f6cff493309 upstream.
The capability check in nfnetlink_rcv() verifies that the caller
has CAP_NET_ADMIN in the namespace that "owns" the netlink socket.
However, xt_osf_fingers is shared by all net namespaces on the
system. An unprivileged user can create user and net namespaces
in which he holds CAP_NET_ADMIN to bypass the netlink_net_capable()
check:
vpnns -- nfnl_osf -f /tmp/pf.os
vpnns -- nfnl_osf -f /tmp/pf.os -d
These non-root operations successfully modify the systemwide OS
fingerprint list. Add new capable() checks so that they can't.
Signed-off-by: Kevin Cernekee <cernekee(a)chromium.org>
Signed-off-by: Pablo Neira Ayuso <pablo(a)netfilter.org>
Acked-by: Michal Kubecek <mkubecek(a)suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/netfilter/xt_osf.c | 7 +++++++
1 file changed, 7 insertions(+)
--- a/net/netfilter/xt_osf.c
+++ b/net/netfilter/xt_osf.c
@@ -19,6 +19,7 @@
#include <linux/module.h>
#include <linux/kernel.h>
+#include <linux/capability.h>
#include <linux/if.h>
#include <linux/inetdevice.h>
#include <linux/ip.h>
@@ -69,6 +70,9 @@ static int xt_osf_add_callback(struct ne
struct xt_osf_finger *kf = NULL, *sf;
int err = 0;
+ if (!capable(CAP_NET_ADMIN))
+ return -EPERM;
+
if (!osf_attrs[OSF_ATTR_FINGER])
return -EINVAL;
@@ -113,6 +117,9 @@ static int xt_osf_remove_callback(struct
struct xt_osf_finger *sf;
int err = -ENOENT;
+ if (!capable(CAP_NET_ADMIN))
+ return -EPERM;
+
if (!osf_attrs[OSF_ATTR_FINGER])
return -EINVAL;
Patches currently in stable-queue which might be from cernekee(a)chromium.org are
queue-4.9/netfilter-xt_osf-add-missing-permission-checks.patch
queue-4.9/netfilter-nfnetlink_cthelper-add-missing-permission-checks.patch
This is a note to let you know that I've just added the patch titled
mm, page_alloc: fix potential false positive in __zone_watermark_ok
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
mm-page_alloc-fix-potential-false-positive-in-__zone_watermark_ok.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From b050e3769c6b4013bb937e879fc43bf1847ee819 Mon Sep 17 00:00:00 2001
From: Vlastimil Babka <vbabka(a)suse.cz>
Date: Wed, 15 Nov 2017 17:38:30 -0800
Subject: mm, page_alloc: fix potential false positive in __zone_watermark_ok
From: Vlastimil Babka <vbabka(a)suse.cz>
commit b050e3769c6b4013bb937e879fc43bf1847ee819 upstream.
Since commit 97a16fc82a7c ("mm, page_alloc: only enforce watermarks for
order-0 allocations"), __zone_watermark_ok() check for high-order
allocations will shortcut per-migratetype free list checks for
ALLOC_HARDER allocations, and return true as long as there's free page
of any migratetype. The intention is that ALLOC_HARDER can allocate
from MIGRATE_HIGHATOMIC free lists, while normal allocations can't.
However, as a side effect, the watermark check will then also return
true when there are pages only on the MIGRATE_ISOLATE list, or (prior to
CMA conversion to ZONE_MOVABLE) on the MIGRATE_CMA list. Since the
allocation cannot actually obtain isolated pages, and might not be able
to obtain CMA pages, this can result in a false positive.
The condition should be rare and perhaps the outcome is not a fatal one.
Still, it's better if the watermark check is correct. There also
shouldn't be a performance tradeoff here.
Link: http://lkml.kernel.org/r/20171102125001.23708-1-vbabka@suse.cz
Fixes: 97a16fc82a7c ("mm, page_alloc: only enforce watermarks for order-0 allocations")
Signed-off-by: Vlastimil Babka <vbabka(a)suse.cz>
Acked-by: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
mm/page_alloc.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2821,9 +2821,6 @@ bool __zone_watermark_ok(struct zone *z,
if (!area->nr_free)
continue;
- if (alloc_harder)
- return true;
-
for (mt = 0; mt < MIGRATE_PCPTYPES; mt++) {
if (!list_empty(&area->free_list[mt]))
return true;
@@ -2835,6 +2832,9 @@ bool __zone_watermark_ok(struct zone *z,
return true;
}
#endif
+ if (alloc_harder &&
+ !list_empty(&area->free_list[MIGRATE_HIGHATOMIC]))
+ return true;
}
return false;
}
Patches currently in stable-queue which might be from vbabka(a)suse.cz are
queue-4.9/mm-mmap.c-do-not-blow-on-prot_none-map_fixed-holes-in-the-stack.patch
queue-4.9/cma-fix-calculation-of-aligned-offset.patch
queue-4.9/mm-page_alloc-fix-potential-false-positive-in-__zone_watermark_ok.patch
This is a note to let you know that I've just added the patch titled
mm/mmap.c: do not blow on PROT_NONE MAP_FIXED holes in the stack
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
mm-mmap.c-do-not-blow-on-prot_none-map_fixed-holes-in-the-stack.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 561b5e0709e4a248c67d024d4d94b6e31e3edf2f Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko(a)suse.com>
Date: Mon, 10 Jul 2017 15:49:51 -0700
Subject: mm/mmap.c: do not blow on PROT_NONE MAP_FIXED holes in the stack
From: Michal Hocko <mhocko(a)suse.com>
commit 561b5e0709e4a248c67d024d4d94b6e31e3edf2f upstream.
Commit 1be7107fbe18 ("mm: larger stack guard gap, between vmas") has
introduced a regression in some rust and Java environments which are
trying to implement their own stack guard page. They are punching a new
MAP_FIXED mapping inside the existing stack Vma.
This will confuse expand_{downwards,upwards} into thinking that the
stack expansion would in fact get us too close to an existing non-stack
vma which is a correct behavior wrt safety. It is a real regression on
the other hand.
Let's work around the problem by considering PROT_NONE mapping as a part
of the stack. This is a gros hack but overflowing to such a mapping
would trap anyway an we only can hope that usespace knows what it is
doing and handle it propely.
Fixes: 1be7107fbe18 ("mm: larger stack guard gap, between vmas")
Link: http://lkml.kernel.org/r/20170705182849.GA18027@dhcp22.suse.cz
Signed-off-by: Michal Hocko <mhocko(a)suse.com>
Debugged-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Ben Hutchings <ben(a)decadent.org.uk>
Cc: Willy Tarreau <w(a)1wt.eu>
Cc: Oleg Nesterov <oleg(a)redhat.com>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: Hugh Dickins <hughd(a)google.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
mm/mmap.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2240,7 +2240,8 @@ int expand_upwards(struct vm_area_struct
gap_addr = TASK_SIZE;
next = vma->vm_next;
- if (next && next->vm_start < gap_addr) {
+ if (next && next->vm_start < gap_addr &&
+ (next->vm_flags & (VM_WRITE|VM_READ|VM_EXEC))) {
if (!(next->vm_flags & VM_GROWSUP))
return -ENOMEM;
/* Check that both stack segments have the same anon_vma? */
@@ -2324,7 +2325,8 @@ int expand_downwards(struct vm_area_stru
if (gap_addr > address)
return -ENOMEM;
prev = vma->vm_prev;
- if (prev && prev->vm_end > gap_addr) {
+ if (prev && prev->vm_end > gap_addr &&
+ (prev->vm_flags & (VM_WRITE|VM_READ|VM_EXEC))) {
if (!(prev->vm_flags & VM_GROWSDOWN))
return -ENOMEM;
/* Check that both stack segments have the same anon_vma? */
Patches currently in stable-queue which might be from mhocko(a)suse.com are
queue-4.9/hwpoison-memcg-forcibly-uncharge-lru-pages.patch
queue-4.9/mm-mmap.c-do-not-blow-on-prot_none-map_fixed-holes-in-the-stack.patch
This is a note to let you know that I've just added the patch titled
hwpoison, memcg: forcibly uncharge LRU pages
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
hwpoison-memcg-forcibly-uncharge-lru-pages.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 18365225f0440d09708ad9daade2ec11275c3df9 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko(a)suse.com>
Date: Fri, 12 May 2017 15:46:26 -0700
Subject: hwpoison, memcg: forcibly uncharge LRU pages
From: Michal Hocko <mhocko(a)suse.com>
commit 18365225f0440d09708ad9daade2ec11275c3df9 upstream.
Laurent Dufour has noticed that hwpoinsoned pages are kept charged. In
his particular case he has hit a bad_page("page still charged to
cgroup") when onlining a hwpoison page. While this looks like something
that shouldn't happen in the first place because onlining hwpages and
returning them to the page allocator makes only little sense it shows a
real problem.
hwpoison pages do not get freed usually so we do not uncharge them (at
least not since commit 0a31bc97c80c ("mm: memcontrol: rewrite uncharge
API")). Each charge pins memcg (since e8ea14cc6ead ("mm: memcontrol:
take a css reference for each charged page")) as well and so the
mem_cgroup and the associated state will never go away. Fix this leak
by forcibly uncharging a LRU hwpoisoned page in delete_from_lru_cache().
We also have to tweak uncharge_list because it cannot rely on zero ref
count for these pages.
[akpm(a)linux-foundation.org: coding-style fixes]
Fixes: 0a31bc97c80c ("mm: memcontrol: rewrite uncharge API")
Link: http://lkml.kernel.org/r/20170502185507.GB19165@dhcp22.suse.cz
Signed-off-by: Michal Hocko <mhocko(a)suse.com>
Reported-by: Laurent Dufour <ldufour(a)linux.vnet.ibm.com>
Tested-by: Laurent Dufour <ldufour(a)linux.vnet.ibm.com>
Reviewed-by: Balbir Singh <bsingharora(a)gmail.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
mm/memcontrol.c | 2 +-
mm/memory-failure.c | 7 +++++++
2 files changed, 8 insertions(+), 1 deletion(-)
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5531,7 +5531,7 @@ static void uncharge_list(struct list_he
next = page->lru.next;
VM_BUG_ON_PAGE(PageLRU(page), page);
- VM_BUG_ON_PAGE(page_count(page), page);
+ VM_BUG_ON_PAGE(!PageHWPoison(page) && page_count(page), page);
if (!page->mem_cgroup)
continue;
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -535,6 +535,13 @@ static int delete_from_lru_cache(struct
*/
ClearPageActive(p);
ClearPageUnevictable(p);
+
+ /*
+ * Poisoned page might never drop its ref count to 0 so we have
+ * to uncharge it manually from its memcg.
+ */
+ mem_cgroup_uncharge(p);
+
/*
* drop the page count elevated by isolate_lru_page()
*/
Patches currently in stable-queue which might be from mhocko(a)suse.com are
queue-4.9/hwpoison-memcg-forcibly-uncharge-lru-pages.patch
queue-4.9/mm-mmap.c-do-not-blow-on-prot_none-map_fixed-holes-in-the-stack.patch
This is a note to let you know that I've just added the patch titled
ipc: msg, make msgrcv work with LONG_MIN
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ipc-msg-make-msgrcv-work-with-long_min.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 999898355e08ae3b92dfd0a08db706e0c6703d30 Mon Sep 17 00:00:00 2001
From: Jiri Slaby <jslaby(a)suse.cz>
Date: Wed, 14 Dec 2016 15:06:07 -0800
Subject: ipc: msg, make msgrcv work with LONG_MIN
From: Jiri Slaby <jslaby(a)suse.cz>
commit 999898355e08ae3b92dfd0a08db706e0c6703d30 upstream.
When LONG_MIN is passed to msgrcv, one would expect to recieve any
message. But convert_mode does *msgtyp = -*msgtyp and -LONG_MIN is
undefined. In particular, with my gcc -LONG_MIN produces -LONG_MIN
again.
So handle this case properly by assigning LONG_MAX to *msgtyp if
LONG_MIN was specified as msgtyp to msgrcv.
This code:
long msg[] = { 100, 200 };
int m = msgget(IPC_PRIVATE, IPC_CREAT | 0644);
msgsnd(m, &msg, sizeof(msg), 0);
msgrcv(m, &msg, sizeof(msg), LONG_MIN, 0);
produces currently nothing:
msgget(IPC_PRIVATE, IPC_CREAT|0644) = 65538
msgsnd(65538, {100, "\310\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16, 0) = 0
msgrcv(65538, ...
Except a UBSAN warning:
UBSAN: Undefined behaviour in ipc/msg.c:745:13
negation of -9223372036854775808 cannot be represented in type 'long int':
With the patch, I see what I expect:
msgget(IPC_PRIVATE, IPC_CREAT|0644) = 0
msgsnd(0, {100, "\310\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16, 0) = 0
msgrcv(0, {100, "\310\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16, -9223372036854775808, 0) = 16
Link: http://lkml.kernel.org/r/20161024082633.10148-1-jslaby@suse.cz
Signed-off-by: Jiri Slaby <jslaby(a)suse.cz>
Cc: Davidlohr Bueso <dave(a)stgolabs.net>
Cc: Manfred Spraul <manfred(a)colorfullife.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
ipc/msg.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--- a/ipc/msg.c
+++ b/ipc/msg.c
@@ -763,7 +763,10 @@ static inline int convert_mode(long *msg
if (*msgtyp == 0)
return SEARCH_ANY;
if (*msgtyp < 0) {
- *msgtyp = -*msgtyp;
+ if (*msgtyp == LONG_MIN) /* -LONG_MIN is undefined */
+ *msgtyp = LONG_MAX;
+ else
+ *msgtyp = -*msgtyp;
return SEARCH_LESSEQUAL;
}
if (msgflg & MSG_EXCEPT)
Patches currently in stable-queue which might be from jslaby(a)suse.cz are
queue-4.9/ipc-msg-make-msgrcv-work-with-long_min.patch
queue-4.9/fs-fcntl-f_setown-avoid-undefined-behaviour.patch
queue-4.9/acpi-scan-prefer-devices-without-_hid-_cid-for-_adr-matching.patch
This is a note to let you know that I've just added the patch titled
fs/fcntl: f_setown, avoid undefined behaviour
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
fs-fcntl-f_setown-avoid-undefined-behaviour.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From fc3dc67471461c0efcb1ed22fb7595121d65fad9 Mon Sep 17 00:00:00 2001
From: Jiri Slaby <jslaby(a)suse.cz>
Date: Tue, 13 Jun 2017 13:35:51 +0200
Subject: fs/fcntl: f_setown, avoid undefined behaviour
From: Jiri Slaby <jslaby(a)suse.cz>
commit fc3dc67471461c0efcb1ed22fb7595121d65fad9 upstream.
fcntl(0, F_SETOWN, 0x80000000) triggers:
UBSAN: Undefined behaviour in fs/fcntl.c:118:7
negation of -2147483648 cannot be represented in type 'int':
CPU: 1 PID: 18261 Comm: syz-executor Not tainted 4.8.1-0-syzkaller #1
...
Call Trace:
...
[<ffffffffad8f0868>] ? f_setown+0x1d8/0x200
[<ffffffffad8f19a9>] ? SyS_fcntl+0x999/0xf30
[<ffffffffaed1fb00>] ? entry_SYSCALL_64_fastpath+0x23/0xc1
Fix that by checking the arg parameter properly (against INT_MAX) before
"who = -who". And return immediatelly with -EINVAL in case it is wrong.
Note that according to POSIX we can return EINVAL:
http://pubs.opengroup.org/onlinepubs/9699919799/functions/fcntl.html
[EINVAL]
The cmd argument is F_SETOWN and the value of the argument
is not valid as a process or process group identifier.
[v2] returns an error, v1 used to fail silently
[v3] implement proper check for the bad value INT_MIN
Signed-off-by: Jiri Slaby <jslaby(a)suse.cz>
Cc: Jeff Layton <jlayton(a)poochiereds.net>
Cc: "J. Bruce Fields" <bfields(a)fieldses.org>
Cc: Alexander Viro <viro(a)zeniv.linux.org.uk>
Cc: linux-fsdevel(a)vger.kernel.org
Signed-off-by: Jeff Layton <jlayton(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/fcntl.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -114,6 +114,10 @@ void f_setown(struct file *filp, unsigne
int who = arg;
type = PIDTYPE_PID;
if (who < 0) {
+ /* avoid overflow below */
+ if (who == INT_MIN)
+ return -EINVAL;
+
type = PIDTYPE_PGID;
who = -who;
}
Patches currently in stable-queue which might be from jslaby(a)suse.cz are
queue-4.9/ipc-msg-make-msgrcv-work-with-long_min.patch
queue-4.9/fs-fcntl-f_setown-avoid-undefined-behaviour.patch
queue-4.9/acpi-scan-prefer-devices-without-_hid-_cid-for-_adr-matching.patch
This is a note to let you know that I've just added the patch titled
cma: fix calculation of aligned offset
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
cma-fix-calculation-of-aligned-offset.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From e048cb32f69038aa1c8f11e5c1b331be4181659d Mon Sep 17 00:00:00 2001
From: Doug Berger <opendmb(a)gmail.com>
Date: Mon, 10 Jul 2017 15:49:44 -0700
Subject: cma: fix calculation of aligned offset
From: Doug Berger <opendmb(a)gmail.com>
commit e048cb32f69038aa1c8f11e5c1b331be4181659d upstream.
The align_offset parameter is used by bitmap_find_next_zero_area_off()
to represent the offset of map's base from the previous alignment
boundary; the function ensures that the returned index, plus the
align_offset, honors the specified align_mask.
The logic introduced by commit b5be83e308f7 ("mm: cma: align to physical
address, not CMA region position") has the cma driver calculate the
offset to the *next* alignment boundary. In most cases, the base
alignment is greater than that specified when making allocations,
resulting in a zero offset whether we align up or down. In the example
given with the commit, the base alignment (8MB) was half the requested
alignment (16MB) so the math also happened to work since the offset is
8MB in both directions. However, when requesting allocations with an
alignment greater than twice that of the base, the returned index would
not be correctly aligned.
Also, the align_order arguments of cma_bitmap_aligned_mask() and
cma_bitmap_aligned_offset() should not be negative so the argument type
was made unsigned.
Fixes: b5be83e308f7 ("mm: cma: align to physical address, not CMA region position")
Link: http://lkml.kernel.org/r/20170628170742.2895-1-opendmb@gmail.com
Signed-off-by: Angus Clark <angus(a)angusclark.org>
Signed-off-by: Doug Berger <opendmb(a)gmail.com>
Acked-by: Gregory Fong <gregory.0xf0(a)gmail.com>
Cc: Doug Berger <opendmb(a)gmail.com>
Cc: Angus Clark <angus(a)angusclark.org>
Cc: Laura Abbott <labbott(a)redhat.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Lucas Stach <l.stach(a)pengutronix.de>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Shiraz Hashim <shashim(a)codeaurora.org>
Cc: Jaewon Kim <jaewon31.kim(a)samsung.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Vlastimil Babka <vbabka(a)suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
mm/cma.c | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -54,7 +54,7 @@ unsigned long cma_get_size(const struct
}
static unsigned long cma_bitmap_aligned_mask(const struct cma *cma,
- int align_order)
+ unsigned int align_order)
{
if (align_order <= cma->order_per_bit)
return 0;
@@ -62,17 +62,14 @@ static unsigned long cma_bitmap_aligned_
}
/*
- * Find a PFN aligned to the specified order and return an offset represented in
- * order_per_bits.
+ * Find the offset of the base PFN from the specified align_order.
+ * The value returned is represented in order_per_bits.
*/
static unsigned long cma_bitmap_aligned_offset(const struct cma *cma,
- int align_order)
+ unsigned int align_order)
{
- if (align_order <= cma->order_per_bit)
- return 0;
-
- return (ALIGN(cma->base_pfn, (1UL << align_order))
- - cma->base_pfn) >> cma->order_per_bit;
+ return (cma->base_pfn & ((1UL << align_order) - 1))
+ >> cma->order_per_bit;
}
static unsigned long cma_bitmap_pages_to_bits(const struct cma *cma,
Patches currently in stable-queue which might be from opendmb(a)gmail.com are
queue-4.9/cma-fix-calculation-of-aligned-offset.patch
This is a note to let you know that I've just added the patch titled
ACPICA: Namespace: fix operand cache leak
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
acpica-namespace-fix-operand-cache-leak.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 3b2d69114fefa474fca542e51119036dceb4aa6f Mon Sep 17 00:00:00 2001
From: Seunghun Han <kkamagui(a)gmail.com>
Date: Wed, 26 Apr 2017 16:18:08 +0800
Subject: ACPICA: Namespace: fix operand cache leak
From: Seunghun Han <kkamagui(a)gmail.com>
commit 3b2d69114fefa474fca542e51119036dceb4aa6f upstream.
ACPICA commit a23325b2e583556eae88ed3f764e457786bf4df6
I found some ACPI operand cache leaks in ACPI early abort cases.
Boot log of ACPI operand cache leak is as follows:
>[ 0.174332] ACPI: Added _OSI(Module Device)
>[ 0.175504] ACPI: Added _OSI(Processor Device)
>[ 0.176010] ACPI: Added _OSI(3.0 _SCP Extensions)
>[ 0.177032] ACPI: Added _OSI(Processor Aggregator Device)
>[ 0.178284] ACPI: SCI (IRQ16705) allocation failed
>[ 0.179352] ACPI Exception: AE_NOT_ACQUIRED, Unable to install
System Control Interrupt handler (20160930/evevent-131)
>[ 0.180008] ACPI: Unable to start the ACPI Interpreter
>[ 0.181125] ACPI Error: Could not remove SCI handler
(20160930/evmisc-281)
>[ 0.184068] kmem_cache_destroy Acpi-Operand: Slab cache still has
objects
>[ 0.185358] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.10.0-rc3 #2
>[ 0.186820] Hardware name: innotek gmb_h virtual_box/virtual_box, BIOS
virtual_box 12/01/2006
>[ 0.188000] Call Trace:
>[ 0.188000] ? dump_stack+0x5c/0x7d
>[ 0.188000] ? kmem_cache_destroy+0x224/0x230
>[ 0.188000] ? acpi_sleep_proc_init+0x22/0x22
>[ 0.188000] ? acpi_os_delete_cache+0xa/0xd
>[ 0.188000] ? acpi_ut_delete_caches+0x3f/0x7b
>[ 0.188000] ? acpi_terminate+0x5/0xf
>[ 0.188000] ? acpi_init+0x288/0x32e
>[ 0.188000] ? __class_create+0x4c/0x80
>[ 0.188000] ? video_setup+0x7a/0x7a
>[ 0.188000] ? do_one_initcall+0x4e/0x1b0
>[ 0.188000] ? kernel_init_freeable+0x194/0x21a
>[ 0.188000] ? rest_init+0x80/0x80
>[ 0.188000] ? kernel_init+0xa/0x100
>[ 0.188000] ? ret_from_fork+0x25/0x30
When early abort is occurred due to invalid ACPI information, Linux kernel
terminates ACPI by calling acpi_terminate() function. The function calls
acpi_ns_terminate() function to delete namespace data and ACPI operand cache
(acpi_gbl_module_code_list).
But the deletion code in acpi_ns_terminate() function is wrapped in
ACPI_EXEC_APP definition, therefore the code is only executed when the
definition exists. If the define doesn't exist, ACPI operand cache
(acpi_gbl_module_code_list) is leaked, and stack dump is shown in kernel log.
This causes a security threat because the old kernel (<= 4.9) shows memory
locations of kernel functions in stack dump, therefore kernel ASLR can be
neutralized.
To fix ACPI operand leak for enhancing security, I made a patch which
removes the ACPI_EXEC_APP define in acpi_ns_terminate() function for
executing the deletion code unconditionally.
Link: https://github.com/acpica/acpica/commit/a23325b2
Signed-off-by: Seunghun Han <kkamagui(a)gmail.com>
Signed-off-by: Lv Zheng <lv.zheng(a)intel.com>
Signed-off-by: Bob Moore <robert.moore(a)intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Acked-by: Lee, Chun-Yi <jlee(a)suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/acpi/acpica/nsutils.c | 23 +++++++++--------------
1 file changed, 9 insertions(+), 14 deletions(-)
--- a/drivers/acpi/acpica/nsutils.c
+++ b/drivers/acpi/acpica/nsutils.c
@@ -594,25 +594,20 @@ struct acpi_namespace_node *acpi_ns_vali
void acpi_ns_terminate(void)
{
acpi_status status;
+ union acpi_operand_object *prev;
+ union acpi_operand_object *next;
ACPI_FUNCTION_TRACE(ns_terminate);
-#ifdef ACPI_EXEC_APP
- {
- union acpi_operand_object *prev;
- union acpi_operand_object *next;
+ /* Delete any module-level code blocks */
- /* Delete any module-level code blocks */
-
- next = acpi_gbl_module_code_list;
- while (next) {
- prev = next;
- next = next->method.mutex;
- prev->method.mutex = NULL; /* Clear the Mutex (cheated) field */
- acpi_ut_remove_reference(prev);
- }
+ next = acpi_gbl_module_code_list;
+ while (next) {
+ prev = next;
+ next = next->method.mutex;
+ prev->method.mutex = NULL; /* Clear the Mutex (cheated) field */
+ acpi_ut_remove_reference(prev);
}
-#endif
/*
* Free the entire namespace -- all nodes and all objects
Patches currently in stable-queue which might be from kkamagui(a)gmail.com are
queue-4.9/acpica-namespace-fix-operand-cache-leak.patch
This is a note to let you know that I've just added the patch titled
ACPI / scan: Prefer devices without _HID/_CID for _ADR matching
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
acpi-scan-prefer-devices-without-_hid-_cid-for-_adr-matching.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From c2a6bbaf0c5f90463a7011a295bbdb7e33c80b51 Mon Sep 17 00:00:00 2001
From: "Rafael J. Wysocki" <rafael.j.wysocki(a)intel.com>
Date: Fri, 30 Dec 2016 02:27:31 +0100
Subject: ACPI / scan: Prefer devices without _HID/_CID for _ADR matching
From: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
commit c2a6bbaf0c5f90463a7011a295bbdb7e33c80b51 upstream.
The way acpi_find_child_device() works currently is that, if there
are two (or more) devices with the same _ADR value in the same
namespace scope (which is not specifically allowed by the spec and
the OS behavior in that case is not defined), the first one of them
found to be present (with the help of _STA) will be returned.
This covers the majority of cases, but is not sufficient if some of
the devices in question have a _HID (or _CID) returning some valid
ACPI/PNP device IDs (which is disallowed by the spec) and the
ASL writers' expectation appears to be that the OS will match
devices without a valid ACPI/PNP device ID against a given bus
address first.
To cover this special case as well, modify find_child_checks()
to prefer devices without ACPI/PNP device IDs over devices that
have them.
Suggested-by: Mika Westerberg <mika.westerberg(a)linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Tested-by: Hans de Goede <hdegoede(a)redhat.com>
Signed-off-by: Jiri Slaby <jslaby(a)suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/acpi/glue.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
--- a/drivers/acpi/glue.c
+++ b/drivers/acpi/glue.c
@@ -99,13 +99,13 @@ static int find_child_checks(struct acpi
return -ENODEV;
/*
- * If the device has a _HID (or _CID) returning a valid ACPI/PNP
- * device ID, it is better to make it look less attractive here, so that
- * the other device with the same _ADR value (that may not have a valid
- * device ID) can be matched going forward. [This means a second spec
- * violation in a row, so whatever we do here is best effort anyway.]
+ * If the device has a _HID returning a valid ACPI/PNP device ID, it is
+ * better to make it look less attractive here, so that the other device
+ * with the same _ADR value (that may not have a valid device ID) can be
+ * matched going forward. [This means a second spec violation in a row,
+ * so whatever we do here is best effort anyway.]
*/
- return sta_present && list_empty(&adev->pnp.ids) ?
+ return sta_present && !adev->pnp.type.platform_id ?
FIND_CHILD_MAX_SCORE : FIND_CHILD_MIN_SCORE;
}
Patches currently in stable-queue which might be from rafael.j.wysocki(a)intel.com are
queue-4.9/acpica-namespace-fix-operand-cache-leak.patch
queue-4.9/acpi-scan-prefer-devices-without-_hid-_cid-for-_adr-matching.patch
This is a note to let you know that I've just added the patch titled
x86/ioapic: Fix incorrect pointers in ioapic_setup_resources()
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
x86-ioapic-fix-incorrect-pointers-in-ioapic_setup_resources.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 9d98bcec731756b8688b59ec998707924d716d7b Mon Sep 17 00:00:00 2001
From: Rui Wang <rui.y.wang(a)intel.com>
Date: Wed, 8 Jun 2016 14:59:52 +0800
Subject: x86/ioapic: Fix incorrect pointers in ioapic_setup_resources()
From: Rui Wang <rui.y.wang(a)intel.com>
commit 9d98bcec731756b8688b59ec998707924d716d7b upstream.
On a 4-socket Brickland system, hot-removing one ioapic is fine.
Hot-removing the 2nd one causes panic in mp_unregister_ioapic()
while calling release_resource().
It is because the iomem_res pointer has already been released
when removing the first ioapic.
To explain the use of &res[num] here: res is assigned to ioapic_resources,
and later in ioapic_insert_resources() we do:
struct resource *r = ioapic_resources;
for_each_ioapic(i) {
insert_resource(&iomem_resource, r);
r++;
}
Here 'r' is treated as an arry of 'struct resource', and the r++ ensures
that each element of the array is inserted separately. Thus we should call
release_resouce() on each element at &res[num].
Fix it by assigning the correct pointers to ioapics[i].iomem_res in
ioapic_setup_resources().
Signed-off-by: Rui Wang <rui.y.wang(a)intel.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: tony.luck(a)intel.com
Cc: linux-pci(a)vger.kernel.org
Cc: rjw(a)rjwysocki.net
Cc: linux-acpi(a)vger.kernel.org
Cc: bhelgaas(a)google.com
Link: http://lkml.kernel.org/r/1465369193-4816-3-git-send-email-rui.y.wang@intel.…
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Acked-by: Joerg Roedel <jroedel(a)suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
arch/x86/kernel/apic/io_apic.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/arch/x86/kernel/apic/io_apic.c
+++ b/arch/x86/kernel/apic/io_apic.c
@@ -2592,8 +2592,8 @@ static struct resource * __init ioapic_s
res[num].flags = IORESOURCE_MEM | IORESOURCE_BUSY;
snprintf(mem, IOAPIC_RESOURCE_NAME_SIZE, "IOAPIC %u", i);
mem += IOAPIC_RESOURCE_NAME_SIZE;
+ ioapics[i].iomem_res = &res[num];
num++;
- ioapics[i].iomem_res = res;
}
ioapic_resources = res;
Patches currently in stable-queue which might be from rui.y.wang(a)intel.com are
queue-4.4/x86-ioapic-fix-incorrect-pointers-in-ioapic_setup_resources.patch
This is a note to let you know that I've just added the patch titled
scsi: libiscsi: fix shifting of DID_REQUEUE host byte
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
scsi-libiscsi-fix-shifting-of-did_requeue-host-byte.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From eef9ffdf9cd39b2986367bc8395e2772bc1284ba Mon Sep 17 00:00:00 2001
From: Johannes Thumshirn <jthumshirn(a)suse.de>
Date: Mon, 9 Oct 2017 13:33:19 +0200
Subject: scsi: libiscsi: fix shifting of DID_REQUEUE host byte
From: Johannes Thumshirn <jthumshirn(a)suse.de>
commit eef9ffdf9cd39b2986367bc8395e2772bc1284ba upstream.
The SCSI host byte should be shifted left by 16 in order to have
scsi_decide_disposition() do the right thing (.i.e. requeue the
command).
Signed-off-by: Johannes Thumshirn <jthumshirn(a)suse.de>
Fixes: 661134ad3765 ("[SCSI] libiscsi, bnx2i: make bound ep check common")
Cc: Lee Duncan <lduncan(a)suse.com>
Cc: Hannes Reinecke <hare(a)suse.de>
Cc: Bart Van Assche <Bart.VanAssche(a)sandisk.com>
Cc: Chris Leech <cleech(a)redhat.com>
Acked-by: Lee Duncan <lduncan(a)suse.com>
Signed-off-by: Martin K. Petersen <martin.petersen(a)oracle.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/scsi/libiscsi.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/drivers/scsi/libiscsi.c
+++ b/drivers/scsi/libiscsi.c
@@ -1727,7 +1727,7 @@ int iscsi_queuecommand(struct Scsi_Host
if (test_bit(ISCSI_SUSPEND_BIT, &conn->suspend_tx)) {
reason = FAILURE_SESSION_IN_RECOVERY;
- sc->result = DID_REQUEUE;
+ sc->result = DID_REQUEUE << 16;
goto fault;
}
Patches currently in stable-queue which might be from jthumshirn(a)suse.de are
queue-4.4/scsi-libiscsi-fix-shifting-of-did_requeue-host-byte.patch
This is a note to let you know that I've just added the patch titled
reiserfs: fix race in prealloc discard
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
reiserfs-fix-race-in-prealloc-discard.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 08db141b5313ac2f64b844fb5725b8d81744b417 Mon Sep 17 00:00:00 2001
From: Jeff Mahoney <jeffm(a)suse.com>
Date: Thu, 22 Jun 2017 16:47:34 -0400
Subject: reiserfs: fix race in prealloc discard
From: Jeff Mahoney <jeffm(a)suse.com>
commit 08db141b5313ac2f64b844fb5725b8d81744b417 upstream.
The main loop in __discard_prealloc is protected by the reiserfs write lock
which is dropped across schedules like the BKL it replaced. The problem is
that it checks the value, calls a routine that schedules, and then adjusts
the state. As a result, two threads that are calling
reiserfs_prealloc_discard at the same time can race when one calls
reiserfs_free_prealloc_block, the lock is dropped, and the other calls
reiserfs_free_prealloc_block with the same block number. In the right
circumstances, it can cause the prealloc count to go negative.
Signed-off-by: Jeff Mahoney <jeffm(a)suse.com>
Signed-off-by: Jan Kara <jack(a)suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/reiserfs/bitmap.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
--- a/fs/reiserfs/bitmap.c
+++ b/fs/reiserfs/bitmap.c
@@ -513,9 +513,17 @@ static void __discard_prealloc(struct re
"inode has negative prealloc blocks count.");
#endif
while (ei->i_prealloc_count > 0) {
- reiserfs_free_prealloc_block(th, inode, ei->i_prealloc_block);
- ei->i_prealloc_block++;
+ b_blocknr_t block_to_free;
+
+ /*
+ * reiserfs_free_prealloc_block can drop the write lock,
+ * which could allow another caller to free the same block.
+ * We can protect against it by modifying the prealloc
+ * state before calling it.
+ */
+ block_to_free = ei->i_prealloc_block++;
ei->i_prealloc_count--;
+ reiserfs_free_prealloc_block(th, inode, block_to_free);
dirty = 1;
}
if (dirty)
Patches currently in stable-queue which might be from jeffm(a)suse.com are
queue-4.4/reiserfs-don-t-preallocate-blocks-for-extended-attributes.patch
queue-4.4/reiserfs-fix-race-in-prealloc-discard.patch
This is a note to let you know that I've just added the patch titled
reiserfs: Don't clear SGID when inheriting ACLs
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
reiserfs-don-t-clear-sgid-when-inheriting-acls.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 6883cd7f68245e43e91e5ee583b7550abf14523f Mon Sep 17 00:00:00 2001
From: Jan Kara <jack(a)suse.cz>
Date: Thu, 22 Jun 2017 09:32:49 +0200
Subject: reiserfs: Don't clear SGID when inheriting ACLs
From: Jan Kara <jack(a)suse.cz>
commit 6883cd7f68245e43e91e5ee583b7550abf14523f upstream.
When new directory 'DIR1' is created in a directory 'DIR0' with SGID bit
set, DIR1 is expected to have SGID bit set (and owning group equal to
the owning group of 'DIR0'). However when 'DIR0' also has some default
ACLs that 'DIR1' inherits, setting these ACLs will result in SGID bit on
'DIR1' to get cleared if user is not member of the owning group.
Fix the problem by moving posix_acl_update_mode() out of
__reiserfs_set_acl() into reiserfs_set_acl(). That way the function will
not be called when inheriting ACLs which is what we want as it prevents
SGID bit clearing and the mode has been properly set by
posix_acl_create() anyway.
Fixes: 073931017b49d9458aa351605b43a7e34598caef
CC: reiserfs-devel(a)vger.kernel.org
Signed-off-by: Jan Kara <jack(a)suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/reiserfs/xattr_acl.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
--- a/fs/reiserfs/xattr_acl.c
+++ b/fs/reiserfs/xattr_acl.c
@@ -37,7 +37,14 @@ reiserfs_set_acl(struct inode *inode, st
error = journal_begin(&th, inode->i_sb, jcreate_blocks);
reiserfs_write_unlock(inode->i_sb);
if (error == 0) {
+ if (type == ACL_TYPE_ACCESS && acl) {
+ error = posix_acl_update_mode(inode, &inode->i_mode,
+ &acl);
+ if (error)
+ goto unlock;
+ }
error = __reiserfs_set_acl(&th, inode, type, acl);
+unlock:
reiserfs_write_lock(inode->i_sb);
error2 = journal_end(&th);
reiserfs_write_unlock(inode->i_sb);
@@ -245,11 +252,6 @@ __reiserfs_set_acl(struct reiserfs_trans
switch (type) {
case ACL_TYPE_ACCESS:
name = POSIX_ACL_XATTR_ACCESS;
- if (acl) {
- error = posix_acl_update_mode(inode, &inode->i_mode, &acl);
- if (error)
- return error;
- }
break;
case ACL_TYPE_DEFAULT:
name = POSIX_ACL_XATTR_DEFAULT;
Patches currently in stable-queue which might be from jack(a)suse.cz are
queue-4.4/reiserfs-don-t-preallocate-blocks-for-extended-attributes.patch
queue-4.4/ext2-don-t-clear-sgid-when-inheriting-acls.patch
queue-4.4/reiserfs-fix-race-in-prealloc-discard.patch
queue-4.4/reiserfs-don-t-clear-sgid-when-inheriting-acls.patch
This is a note to let you know that I've just added the patch titled
reiserfs: don't preallocate blocks for extended attributes
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
reiserfs-don-t-preallocate-blocks-for-extended-attributes.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 54930dfeb46e978b447af0fb8ab4e181c1bf9d7a Mon Sep 17 00:00:00 2001
From: Jeff Mahoney <jeffm(a)suse.com>
Date: Thu, 22 Jun 2017 16:35:04 -0400
Subject: reiserfs: don't preallocate blocks for extended attributes
From: Jeff Mahoney <jeffm(a)suse.com>
commit 54930dfeb46e978b447af0fb8ab4e181c1bf9d7a upstream.
Most extended attributes will fit in a single block. More importantly,
we drop the reference to the inode while holding the transaction open
so the preallocated blocks aren't released. As a result, the inode
may be evicted before it's removed from the transaction's prealloc list
which can cause memory corruption.
Signed-off-by: Jeff Mahoney <jeffm(a)suse.com>
Signed-off-by: Jan Kara <jack(a)suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/reiserfs/bitmap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/reiserfs/bitmap.c
+++ b/fs/reiserfs/bitmap.c
@@ -1136,7 +1136,7 @@ static int determine_prealloc_size(reise
hint->prealloc_size = 0;
if (!hint->formatted_node && hint->preallocate) {
- if (S_ISREG(hint->inode->i_mode)
+ if (S_ISREG(hint->inode->i_mode) && !IS_PRIVATE(hint->inode)
&& hint->inode->i_size >=
REISERFS_SB(hint->th->t_super)->s_alloc_options.
preallocmin * hint->inode->i_sb->s_blocksize)
Patches currently in stable-queue which might be from jeffm(a)suse.com are
queue-4.4/reiserfs-don-t-preallocate-blocks-for-extended-attributes.patch
queue-4.4/reiserfs-fix-race-in-prealloc-discard.patch
This is a note to let you know that I've just added the patch titled
netfilter: xt_osf: Add missing permission checks
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
netfilter-xt_osf-add-missing-permission-checks.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 916a27901de01446bcf57ecca4783f6cff493309 Mon Sep 17 00:00:00 2001
From: Kevin Cernekee <cernekee(a)chromium.org>
Date: Tue, 5 Dec 2017 15:42:41 -0800
Subject: netfilter: xt_osf: Add missing permission checks
From: Kevin Cernekee <cernekee(a)chromium.org>
commit 916a27901de01446bcf57ecca4783f6cff493309 upstream.
The capability check in nfnetlink_rcv() verifies that the caller
has CAP_NET_ADMIN in the namespace that "owns" the netlink socket.
However, xt_osf_fingers is shared by all net namespaces on the
system. An unprivileged user can create user and net namespaces
in which he holds CAP_NET_ADMIN to bypass the netlink_net_capable()
check:
vpnns -- nfnl_osf -f /tmp/pf.os
vpnns -- nfnl_osf -f /tmp/pf.os -d
These non-root operations successfully modify the systemwide OS
fingerprint list. Add new capable() checks so that they can't.
Signed-off-by: Kevin Cernekee <cernekee(a)chromium.org>
Signed-off-by: Pablo Neira Ayuso <pablo(a)netfilter.org>
Acked-by: Michal Kubecek <mkubecek(a)suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/netfilter/xt_osf.c | 7 +++++++
1 file changed, 7 insertions(+)
--- a/net/netfilter/xt_osf.c
+++ b/net/netfilter/xt_osf.c
@@ -19,6 +19,7 @@
#include <linux/module.h>
#include <linux/kernel.h>
+#include <linux/capability.h>
#include <linux/if.h>
#include <linux/inetdevice.h>
#include <linux/ip.h>
@@ -69,6 +70,9 @@ static int xt_osf_add_callback(struct so
struct xt_osf_finger *kf = NULL, *sf;
int err = 0;
+ if (!capable(CAP_NET_ADMIN))
+ return -EPERM;
+
if (!osf_attrs[OSF_ATTR_FINGER])
return -EINVAL;
@@ -112,6 +116,9 @@ static int xt_osf_remove_callback(struct
struct xt_osf_finger *sf;
int err = -ENOENT;
+ if (!capable(CAP_NET_ADMIN))
+ return -EPERM;
+
if (!osf_attrs[OSF_ATTR_FINGER])
return -EINVAL;
Patches currently in stable-queue which might be from cernekee(a)chromium.org are
queue-4.4/netfilter-xt_osf-add-missing-permission-checks.patch
queue-4.4/netfilter-nfnetlink_cthelper-add-missing-permission-checks.patch
This is a note to let you know that I've just added the patch titled
netfilter: use fwmark_reflect in nf_send_reset
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
netfilter-use-fwmark_reflect-in-nf_send_reset.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From cc31d43b4154ad5a7d8aa5543255a93b7e89edc2 Mon Sep 17 00:00:00 2001
From: Pau Espin Pedrol <pau.espin(a)tessares.net>
Date: Fri, 6 Jan 2017 20:33:27 +0100
Subject: netfilter: use fwmark_reflect in nf_send_reset
From: Pau Espin Pedrol <pau.espin(a)tessares.net>
commit cc31d43b4154ad5a7d8aa5543255a93b7e89edc2 upstream.
Otherwise, RST packets generated by ipt_REJECT always have mark 0 when
the routing is checked later in the same code path.
Fixes: e110861f8609 ("net: add a sysctl to reflect the fwmark on replies")
Cc: Lorenzo Colitti <lorenzo(a)google.com>
Signed-off-by: Pau Espin Pedrol <pau.espin(a)tessares.net>
Signed-off-by: Pablo Neira Ayuso <pablo(a)netfilter.org>
Acked-by: Michal Kubecek <mkubecek(a)suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/netfilter/nf_reject_ipv4.c | 2 ++
net/ipv6/netfilter/nf_reject_ipv6.c | 3 +++
2 files changed, 5 insertions(+)
--- a/net/ipv4/netfilter/nf_reject_ipv4.c
+++ b/net/ipv4/netfilter/nf_reject_ipv4.c
@@ -124,6 +124,8 @@ void nf_send_reset(struct net *net, stru
/* ip_route_me_harder expects skb->dst to be set */
skb_dst_set_noref(nskb, skb_dst(oldskb));
+ nskb->mark = IP4_REPLY_MARK(net, oldskb->mark);
+
skb_reserve(nskb, LL_MAX_HEADER);
niph = nf_reject_iphdr_put(nskb, oldskb, IPPROTO_TCP,
ip4_dst_hoplimit(skb_dst(nskb)));
--- a/net/ipv6/netfilter/nf_reject_ipv6.c
+++ b/net/ipv6/netfilter/nf_reject_ipv6.c
@@ -157,6 +157,7 @@ void nf_send_reset6(struct net *net, str
fl6.daddr = oip6h->saddr;
fl6.fl6_sport = otcph->dest;
fl6.fl6_dport = otcph->source;
+ fl6.flowi6_mark = IP6_REPLY_MARK(net, oldskb->mark);
security_skb_classify_flow(oldskb, flowi6_to_flowi(&fl6));
dst = ip6_route_output(net, NULL, &fl6);
if (dst == NULL || dst->error) {
@@ -180,6 +181,8 @@ void nf_send_reset6(struct net *net, str
skb_dst_set(nskb, dst);
+ nskb->mark = fl6.flowi6_mark;
+
skb_reserve(nskb, hh_len + dst->header_len);
ip6h = nf_reject_ip6hdr_put(nskb, oldskb, IPPROTO_TCP,
ip6_dst_hoplimit(dst));
Patches currently in stable-queue which might be from pau.espin(a)tessares.net are
queue-4.4/netfilter-use-fwmark_reflect-in-nf_send_reset.patch
This is a note to let you know that I've just added the patch titled
netfilter: restart search if moved to other chain
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
netfilter-restart-search-if-moved-to-other-chain.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 95a8d19f28e6b29377a880c6264391a62e07fccc Mon Sep 17 00:00:00 2001
From: Florian Westphal <fw(a)strlen.de>
Date: Thu, 25 Aug 2016 15:33:29 +0200
Subject: netfilter: restart search if moved to other chain
From: Florian Westphal <fw(a)strlen.de>
commit 95a8d19f28e6b29377a880c6264391a62e07fccc upstream.
In case nf_conntrack_tuple_taken did not find a conflicting entry
check that all entries in this hash slot were tested and restart
in case an entry was moved to another chain.
Reported-by: Eric Dumazet <edumazet(a)google.com>
Fixes: ea781f197d6a ("netfilter: nf_conntrack: use SLAB_DESTROY_BY_RCU and get rid of call_rcu()")
Signed-off-by: Florian Westphal <fw(a)strlen.de>
Acked-by: Eric Dumazet <edumazet(a)google.com>
Signed-off-by: Pablo Neira Ayuso <pablo(a)netfilter.org>
Acked-by: Michal Kubecek <mkubecek(a)suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/netfilter/nf_conntrack_core.c | 7 +++++++
1 file changed, 7 insertions(+)
--- a/net/netfilter/nf_conntrack_core.c
+++ b/net/netfilter/nf_conntrack_core.c
@@ -719,6 +719,7 @@ nf_conntrack_tuple_taken(const struct nf
* least once for the stats anyway.
*/
rcu_read_lock_bh();
+ begin:
hlist_nulls_for_each_entry_rcu(h, n, &net->ct.hash[hash], hnnode) {
ct = nf_ct_tuplehash_to_ctrack(h);
if (ct != ignored_conntrack &&
@@ -730,6 +731,12 @@ nf_conntrack_tuple_taken(const struct nf
}
NF_CT_STAT_INC(net, searched);
}
+
+ if (get_nulls_value(n) != hash) {
+ NF_CT_STAT_INC(net, search_restart);
+ goto begin;
+ }
+
rcu_read_unlock_bh();
return 0;
Patches currently in stable-queue which might be from fw(a)strlen.de are
queue-4.4/netfilter-x_tables-speed-up-jump-target-validation.patch
queue-4.4/netfilter-arp_tables-fix-invoking-32bit-iptable-p-input-accept-failed-in-64bit-kernel.patch
queue-4.4/netfilter-restart-search-if-moved-to-other-chain.patch
queue-4.4/netfilter-nfnetlink_queue-reject-verdict-request-from-different-portid.patch
This is a note to let you know that I've just added the patch titled
netfilter: nfnetlink_queue: reject verdict request from different portid
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
netfilter-nfnetlink_queue-reject-verdict-request-from-different-portid.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 00a3101f561816e58de054a470484996f78eb5eb Mon Sep 17 00:00:00 2001
From: Liping Zhang <liping.zhang(a)spreadtrum.com>
Date: Mon, 8 Aug 2016 22:07:27 +0800
Subject: netfilter: nfnetlink_queue: reject verdict request from different portid
From: Liping Zhang <liping.zhang(a)spreadtrum.com>
commit 00a3101f561816e58de054a470484996f78eb5eb upstream.
Like NFQNL_MSG_VERDICT_BATCH do, we should also reject the verdict
request when the portid is not same with the initial portid(maybe
from another process).
Fixes: 97d32cf9440d ("netfilter: nfnetlink_queue: batch verdict support")
Signed-off-by: Liping Zhang <liping.zhang(a)spreadtrum.com>
Reviewed-by: Florian Westphal <fw(a)strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo(a)netfilter.org>
Acked-by: Michal Kubecek <mkubecek(a)suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/netfilter/nfnetlink_queue.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
--- a/net/netfilter/nfnetlink_queue.c
+++ b/net/netfilter/nfnetlink_queue.c
@@ -1053,10 +1053,8 @@ nfqnl_recv_verdict(struct sock *ctnl, st
struct net *net = sock_net(ctnl);
struct nfnl_queue_net *q = nfnl_queue_pernet(net);
- queue = instance_lookup(q, queue_num);
- if (!queue)
- queue = verdict_instance_lookup(q, queue_num,
- NETLINK_CB(skb).portid);
+ queue = verdict_instance_lookup(q, queue_num,
+ NETLINK_CB(skb).portid);
if (IS_ERR(queue))
return PTR_ERR(queue);
Patches currently in stable-queue which might be from liping.zhang(a)spreadtrum.com are
queue-4.4/netfilter-nf_ct_expect-remove-the-redundant-slash-when-policy-name-is-empty.patch
queue-4.4/netfilter-nfnetlink_queue-reject-verdict-request-from-different-portid.patch
This is a note to let you know that I've just added the patch titled
netfilter: nfnetlink_cthelper: Add missing permission checks
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
netfilter-nfnetlink_cthelper-add-missing-permission-checks.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 4b380c42f7d00a395feede754f0bc2292eebe6e5 Mon Sep 17 00:00:00 2001
From: Kevin Cernekee <cernekee(a)chromium.org>
Date: Sun, 3 Dec 2017 12:12:45 -0800
Subject: netfilter: nfnetlink_cthelper: Add missing permission checks
From: Kevin Cernekee <cernekee(a)chromium.org>
commit 4b380c42f7d00a395feede754f0bc2292eebe6e5 upstream.
The capability check in nfnetlink_rcv() verifies that the caller
has CAP_NET_ADMIN in the namespace that "owns" the netlink socket.
However, nfnl_cthelper_list is shared by all net namespaces on the
system. An unprivileged user can create user and net namespaces
in which he holds CAP_NET_ADMIN to bypass the netlink_net_capable()
check:
$ nfct helper list
nfct v1.4.4: netlink error: Operation not permitted
$ vpnns -- nfct helper list
{
.name = ftp,
.queuenum = 0,
.l3protonum = 2,
.l4protonum = 6,
.priv_data_len = 24,
.status = enabled,
};
Add capable() checks in nfnetlink_cthelper, as this is cleaner than
trying to generalize the solution.
Signed-off-by: Kevin Cernekee <cernekee(a)chromium.org>
Signed-off-by: Pablo Neira Ayuso <pablo(a)netfilter.org>
Acked-by: Michal Kubecek <mkubecek(a)suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/netfilter/nfnetlink_cthelper.c | 10 ++++++++++
1 file changed, 10 insertions(+)
--- a/net/netfilter/nfnetlink_cthelper.c
+++ b/net/netfilter/nfnetlink_cthelper.c
@@ -17,6 +17,7 @@
#include <linux/types.h>
#include <linux/list.h>
#include <linux/errno.h>
+#include <linux/capability.h>
#include <net/netlink.h>
#include <net/sock.h>
@@ -392,6 +393,9 @@ nfnl_cthelper_new(struct sock *nfnl, str
struct nfnl_cthelper *nlcth;
int ret = 0;
+ if (!capable(CAP_NET_ADMIN))
+ return -EPERM;
+
if (!tb[NFCTH_NAME] || !tb[NFCTH_TUPLE])
return -EINVAL;
@@ -595,6 +599,9 @@ nfnl_cthelper_get(struct sock *nfnl, str
struct nfnl_cthelper *nlcth;
bool tuple_set = false;
+ if (!capable(CAP_NET_ADMIN))
+ return -EPERM;
+
if (nlh->nlmsg_flags & NLM_F_DUMP) {
struct netlink_dump_control c = {
.dump = nfnl_cthelper_dump_table,
@@ -661,6 +668,9 @@ nfnl_cthelper_del(struct sock *nfnl, str
struct nfnl_cthelper *nlcth, *n;
int j = 0, ret;
+ if (!capable(CAP_NET_ADMIN))
+ return -EPERM;
+
if (tb[NFCTH_NAME])
helper_name = nla_data(tb[NFCTH_NAME]);
Patches currently in stable-queue which might be from cernekee(a)chromium.org are
queue-4.4/netfilter-xt_osf-add-missing-permission-checks.patch
queue-4.4/netfilter-nfnetlink_cthelper-add-missing-permission-checks.patch
This is a note to let you know that I've just added the patch titled
netfilter: nf_dup_ipv6: set again FLOWI_FLAG_KNOWN_NH at flowi6_flags
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
netfilter-nf_dup_ipv6-set-again-flowi_flag_known_nh-at-flowi6_flags.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 83170f3beccccd7ceb4f9a0ac0c4dc736afde90c Mon Sep 17 00:00:00 2001
From: Paolo Abeni <pabeni(a)redhat.com>
Date: Thu, 26 May 2016 19:08:10 +0200
Subject: netfilter: nf_dup_ipv6: set again FLOWI_FLAG_KNOWN_NH at flowi6_flags
From: Paolo Abeni <pabeni(a)redhat.com>
commit 83170f3beccccd7ceb4f9a0ac0c4dc736afde90c upstream.
With the commit 48e8aa6e3137 ("ipv6: Set FLOWI_FLAG_KNOWN_NH at
flowi6_flags") ip6_pol_route() callers were asked to to set the
FLOWI_FLAG_KNOWN_NH properly and xt_TEE was updated accordingly,
but with the later refactor in commit bbde9fc1824a ("netfilter:
factor out packet duplication for IPv4/IPv6") the flowi6_flags
update was lost.
This commit re-add it just before the routing decision.
Fixes: bbde9fc1824a ("netfilter: factor out packet duplication for IPv4/IPv6")
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
Signed-off-by: Pablo Neira Ayuso <pablo(a)netfilter.org>
Acked-by: Michal Kubecek <mkubecek(a)suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv6/netfilter/nf_dup_ipv6.c | 1 +
1 file changed, 1 insertion(+)
--- a/net/ipv6/netfilter/nf_dup_ipv6.c
+++ b/net/ipv6/netfilter/nf_dup_ipv6.c
@@ -33,6 +33,7 @@ static bool nf_dup_ipv6_route(struct net
fl6.daddr = *gw;
fl6.flowlabel = (__force __be32)(((iph->flow_lbl[0] & 0xF) << 16) |
(iph->flow_lbl[1] << 8) | iph->flow_lbl[2]);
+ fl6.flowi6_flags = FLOWI_FLAG_KNOWN_NH;
dst = ip6_route_output(net, NULL, &fl6);
if (dst->error) {
dst_release(dst);
Patches currently in stable-queue which might be from pabeni(a)redhat.com are
queue-4.4/netfilter-nf_dup_ipv6-set-again-flowi_flag_known_nh-at-flowi6_flags.patch
This is a note to let you know that I've just added the patch titled
netfilter: nf_ct_expect: remove the redundant slash when policy name is empty
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
netfilter-nf_ct_expect-remove-the-redundant-slash-when-policy-name-is-empty.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From b173a28f62cf929324a8a6adcc45adadce311d16 Mon Sep 17 00:00:00 2001
From: Liping Zhang <liping.zhang(a)spreadtrum.com>
Date: Mon, 8 Aug 2016 21:57:58 +0800
Subject: netfilter: nf_ct_expect: remove the redundant slash when policy name is empty
From: Liping Zhang <liping.zhang(a)spreadtrum.com>
commit b173a28f62cf929324a8a6adcc45adadce311d16 upstream.
The 'name' filed in struct nf_conntrack_expect_policy{} is not a
pointer, so check it is NULL or not will always return true. Even if the
name is empty, slash will always be displayed like follows:
# cat /proc/net/nf_conntrack_expect
297 l3proto = 2 proto=6 src=1.1.1.1 dst=2.2.2.2 sport=1 dport=1025 ftp/
^
Fixes: 3a8fc53a45c4 ("netfilter: nf_ct_helper: allocate 16 bytes for the helper and policy names")
Signed-off-by: Liping Zhang <liping.zhang(a)spreadtrum.com>
Signed-off-by: Pablo Neira Ayuso <pablo(a)netfilter.org>
Acked-by: Michal Kubecek <mkubecek(a)suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/netfilter/nf_conntrack_expect.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/net/netfilter/nf_conntrack_expect.c
+++ b/net/netfilter/nf_conntrack_expect.c
@@ -560,7 +560,7 @@ static int exp_seq_show(struct seq_file
helper = rcu_dereference(nfct_help(expect->master)->helper);
if (helper) {
seq_printf(s, "%s%s", expect->flags ? " " : "", helper->name);
- if (helper->expect_policy[expect->class].name)
+ if (helper->expect_policy[expect->class].name[0])
seq_printf(s, "/%s",
helper->expect_policy[expect->class].name);
}
Patches currently in stable-queue which might be from liping.zhang(a)spreadtrum.com are
queue-4.4/netfilter-nf_ct_expect-remove-the-redundant-slash-when-policy-name-is-empty.patch
queue-4.4/netfilter-nfnetlink_queue-reject-verdict-request-from-different-portid.patch
This is a note to let you know that I've just added the patch titled
netfilter: nf_conntrack_sip: extend request line validation
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
netfilter-nf_conntrack_sip-extend-request-line-validation.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 444f901742d054a4cd5ff045871eac5131646cfb Mon Sep 17 00:00:00 2001
From: Ulrich Weber <ulrich.weber(a)riverbed.com>
Date: Mon, 24 Oct 2016 18:07:23 +0200
Subject: netfilter: nf_conntrack_sip: extend request line validation
From: Ulrich Weber <ulrich.weber(a)riverbed.com>
commit 444f901742d054a4cd5ff045871eac5131646cfb upstream.
on SIP requests, so a fragmented TCP SIP packet from an allow header starting with
INVITE,NOTIFY,OPTIONS,REFER,REGISTER,UPDATE,SUBSCRIBE
Content-Length: 0
will not bet interpreted as an INVITE request. Also Request-URI must start with an alphabetic character.
Confirm with RFC 3261
Request-Line = Method SP Request-URI SP SIP-Version CRLF
Fixes: 30f33e6dee80 ("[NETFILTER]: nf_conntrack_sip: support method specific request/response handling")
Signed-off-by: Ulrich Weber <ulrich.weber(a)riverbed.com>
Acked-by: Marco Angaroni <marcoangaroni(a)gmail.com>
Signed-off-by: Pablo Neira Ayuso <pablo(a)netfilter.org>
Acked-by: Michal Kubecek <mkubecek(a)suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/netfilter/nf_conntrack_sip.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--- a/net/netfilter/nf_conntrack_sip.c
+++ b/net/netfilter/nf_conntrack_sip.c
@@ -1434,9 +1434,12 @@ static int process_sip_request(struct sk
handler = &sip_handlers[i];
if (handler->request == NULL)
continue;
- if (*datalen < handler->len ||
+ if (*datalen < handler->len + 2 ||
strncasecmp(*dptr, handler->method, handler->len))
continue;
+ if ((*dptr)[handler->len] != ' ' ||
+ !isalpha((*dptr)[handler->len+1]))
+ continue;
if (ct_sip_get_header(ct, *dptr, 0, *datalen, SIP_HDR_CSEQ,
&matchoff, &matchlen) <= 0) {
Patches currently in stable-queue which might be from ulrich.weber(a)riverbed.com are
queue-4.4/netfilter-nf_conntrack_sip-extend-request-line-validation.patch
This is a note to let you know that I've just added the patch titled
netfilter: fix IS_ERR_VALUE usage
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
netfilter-fix-is_err_value-usage.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 92b4423e3a0bc5d43ecde4bcad871f8b5ba04efd Mon Sep 17 00:00:00 2001
From: Pablo Neira Ayuso <pablo(a)netfilter.org>
Date: Fri, 29 Apr 2016 10:39:34 +0200
Subject: netfilter: fix IS_ERR_VALUE usage
From: Pablo Neira Ayuso <pablo(a)netfilter.org>
commit 92b4423e3a0bc5d43ecde4bcad871f8b5ba04efd upstream.
This is a forward-port of the original patch from Andrzej Hajda,
he said:
"IS_ERR_VALUE should be used only with unsigned long type.
Otherwise it can work incorrectly. To achieve this function
xt_percpu_counter_alloc is modified to return unsigned long,
and its result is assigned to temporary variable to perform
error checking, before assigning to .pcnt field.
The patch follows conclusion from discussion on LKML [1][2].
[1]: http://permalink.gmane.org/gmane.linux.kernel/2120927
[2]: http://permalink.gmane.org/gmane.linux.kernel/2150581"
Original patch from Andrzej is here:
http://patchwork.ozlabs.org/patch/582970/
This patch has clashed with input validation fixes for x_tables.
Signed-off-by: Pablo Neira Ayuso <pablo(a)netfilter.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Acked-by: Michal Kubecek <mkubecek(a)suse.cz>
---
include/linux/netfilter/x_tables.h | 6 +++---
net/ipv4/netfilter/arp_tables.c | 6 ++++--
net/ipv4/netfilter/ip_tables.c | 6 ++++--
net/ipv6/netfilter/ip6_tables.c | 6 ++++--
4 files changed, 15 insertions(+), 9 deletions(-)
--- a/include/linux/netfilter/x_tables.h
+++ b/include/linux/netfilter/x_tables.h
@@ -381,16 +381,16 @@ static inline unsigned long ifname_compa
* allows us to return 0 for single core systems without forcing
* callers to deal with SMP vs. NONSMP issues.
*/
-static inline u64 xt_percpu_counter_alloc(void)
+static inline unsigned long xt_percpu_counter_alloc(void)
{
if (nr_cpu_ids > 1) {
void __percpu *res = __alloc_percpu(sizeof(struct xt_counters),
sizeof(struct xt_counters));
if (res == NULL)
- return (u64) -ENOMEM;
+ return -ENOMEM;
- return (u64) (__force unsigned long) res;
+ return (__force unsigned long) res;
}
return 0;
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -511,11 +511,13 @@ find_check_entry(struct arpt_entry *e, c
{
struct xt_entry_target *t;
struct xt_target *target;
+ unsigned long pcnt;
int ret;
- e->counters.pcnt = xt_percpu_counter_alloc();
- if (IS_ERR_VALUE(e->counters.pcnt))
+ pcnt = xt_percpu_counter_alloc();
+ if (IS_ERR_VALUE(pcnt))
return -ENOMEM;
+ e->counters.pcnt = pcnt;
t = arpt_get_target(e);
target = xt_request_find_target(NFPROTO_ARP, t->u.user.name,
--- a/net/ipv4/netfilter/ip_tables.c
+++ b/net/ipv4/netfilter/ip_tables.c
@@ -653,10 +653,12 @@ find_check_entry(struct ipt_entry *e, st
unsigned int j;
struct xt_mtchk_param mtpar;
struct xt_entry_match *ematch;
+ unsigned long pcnt;
- e->counters.pcnt = xt_percpu_counter_alloc();
- if (IS_ERR_VALUE(e->counters.pcnt))
+ pcnt = xt_percpu_counter_alloc();
+ if (IS_ERR_VALUE(pcnt))
return -ENOMEM;
+ e->counters.pcnt = pcnt;
j = 0;
mtpar.net = net;
--- a/net/ipv6/netfilter/ip6_tables.c
+++ b/net/ipv6/netfilter/ip6_tables.c
@@ -666,10 +666,12 @@ find_check_entry(struct ip6t_entry *e, s
unsigned int j;
struct xt_mtchk_param mtpar;
struct xt_entry_match *ematch;
+ unsigned long pcnt;
- e->counters.pcnt = xt_percpu_counter_alloc();
- if (IS_ERR_VALUE(e->counters.pcnt))
+ pcnt = xt_percpu_counter_alloc();
+ if (IS_ERR_VALUE(pcnt))
return -ENOMEM;
+ e->counters.pcnt = pcnt;
j = 0;
mtpar.net = net;
Patches currently in stable-queue which might be from pablo(a)netfilter.org are
queue-4.4/netfilter-x_tables-speed-up-jump-target-validation.patch
queue-4.4/netfilter-nf_dup_ipv6-set-again-flowi_flag_known_nh-at-flowi6_flags.patch
queue-4.4/netfilter-fix-is_err_value-usage.patch
queue-4.4/netfilter-nf_ct_expect-remove-the-redundant-slash-when-policy-name-is-empty.patch
queue-4.4/netfilter-xt_osf-add-missing-permission-checks.patch
queue-4.4/netfilter-arp_tables-fix-invoking-32bit-iptable-p-input-accept-failed-in-64bit-kernel.patch
queue-4.4/netfilter-nf_conntrack_sip-extend-request-line-validation.patch
queue-4.4/netfilter-restart-search-if-moved-to-other-chain.patch
queue-4.4/netfilter-use-fwmark_reflect-in-nf_send_reset.patch
queue-4.4/netfilter-nfnetlink_queue-reject-verdict-request-from-different-portid.patch
queue-4.4/netfilter-nfnetlink_cthelper-add-missing-permission-checks.patch
This is a note to let you know that I've just added the patch titled
netfilter: arp_tables: fix invoking 32bit "iptable -P INPUT ACCEPT" failed in 64bit kernel
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
netfilter-arp_tables-fix-invoking-32bit-iptable-p-input-accept-failed-in-64bit-kernel.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 17a49cd549d9dc8707dc9262210166455c612dde Mon Sep 17 00:00:00 2001
From: Hongxu Jia <hongxu.jia(a)windriver.com>
Date: Tue, 29 Nov 2016 21:56:26 -0500
Subject: netfilter: arp_tables: fix invoking 32bit "iptable -P INPUT ACCEPT" failed in 64bit kernel
From: Hongxu Jia <hongxu.jia(a)windriver.com>
commit 17a49cd549d9dc8707dc9262210166455c612dde upstream.
Since 09d9686047db ("netfilter: x_tables: do compat validation via
translate_table"), it used compatr structure to assign newinfo
structure. In translate_compat_table of ip_tables.c and ip6_tables.c,
it used compatr->hook_entry to replace info->hook_entry and
compatr->underflow to replace info->underflow, but not do the same
replacement in arp_tables.c.
It caused invoking 32-bit "arptbale -P INPUT ACCEPT" failed in 64bit
kernel.
--------------------------------------
root@qemux86-64:~# arptables -P INPUT ACCEPT
root@qemux86-64:~# arptables -P INPUT ACCEPT
ERROR: Policy for `INPUT' offset 448 != underflow 0
arptables: Incompatible with this kernel
--------------------------------------
Fixes: 09d9686047db ("netfilter: x_tables: do compat validation via translate_table")
Signed-off-by: Hongxu Jia <hongxu.jia(a)windriver.com>
Acked-by: Florian Westphal <fw(a)strlen.de>
Signed-off-by: Pablo Neira Ayuso <pablo(a)netfilter.org>
Acked-by: Michal Kubecek <mkubecek(a)suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/ipv4/netfilter/arp_tables.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/net/ipv4/netfilter/arp_tables.c
+++ b/net/ipv4/netfilter/arp_tables.c
@@ -1339,8 +1339,8 @@ static int translate_compat_table(struct
newinfo->number = compatr->num_entries;
for (i = 0; i < NF_ARP_NUMHOOKS; i++) {
- newinfo->hook_entry[i] = info->hook_entry[i];
- newinfo->underflow[i] = info->underflow[i];
+ newinfo->hook_entry[i] = compatr->hook_entry[i];
+ newinfo->underflow[i] = compatr->underflow[i];
}
entry1 = newinfo->entries;
pos = entry1;
Patches currently in stable-queue which might be from hongxu.jia(a)windriver.com are
queue-4.4/netfilter-arp_tables-fix-invoking-32bit-iptable-p-input-accept-failed-in-64bit-kernel.patch
This is a note to let you know that I've just added the patch titled
mm, page_alloc: fix potential false positive in __zone_watermark_ok
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
mm-page_alloc-fix-potential-false-positive-in-__zone_watermark_ok.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From b050e3769c6b4013bb937e879fc43bf1847ee819 Mon Sep 17 00:00:00 2001
From: Vlastimil Babka <vbabka(a)suse.cz>
Date: Wed, 15 Nov 2017 17:38:30 -0800
Subject: mm, page_alloc: fix potential false positive in __zone_watermark_ok
From: Vlastimil Babka <vbabka(a)suse.cz>
commit b050e3769c6b4013bb937e879fc43bf1847ee819 upstream.
Since commit 97a16fc82a7c ("mm, page_alloc: only enforce watermarks for
order-0 allocations"), __zone_watermark_ok() check for high-order
allocations will shortcut per-migratetype free list checks for
ALLOC_HARDER allocations, and return true as long as there's free page
of any migratetype. The intention is that ALLOC_HARDER can allocate
from MIGRATE_HIGHATOMIC free lists, while normal allocations can't.
However, as a side effect, the watermark check will then also return
true when there are pages only on the MIGRATE_ISOLATE list, or (prior to
CMA conversion to ZONE_MOVABLE) on the MIGRATE_CMA list. Since the
allocation cannot actually obtain isolated pages, and might not be able
to obtain CMA pages, this can result in a false positive.
The condition should be rare and perhaps the outcome is not a fatal one.
Still, it's better if the watermark check is correct. There also
shouldn't be a performance tradeoff here.
Link: http://lkml.kernel.org/r/20171102125001.23708-1-vbabka@suse.cz
Fixes: 97a16fc82a7c ("mm, page_alloc: only enforce watermarks for order-0 allocations")
Signed-off-by: Vlastimil Babka <vbabka(a)suse.cz>
Acked-by: Mel Gorman <mgorman(a)techsingularity.net>
Cc: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
mm/page_alloc.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2468,9 +2468,6 @@ static bool __zone_watermark_ok(struct z
if (!area->nr_free)
continue;
- if (alloc_harder)
- return true;
-
for (mt = 0; mt < MIGRATE_PCPTYPES; mt++) {
if (!list_empty(&area->free_list[mt]))
return true;
@@ -2482,6 +2479,9 @@ static bool __zone_watermark_ok(struct z
return true;
}
#endif
+ if (alloc_harder &&
+ !list_empty(&area->free_list[MIGRATE_HIGHATOMIC]))
+ return true;
}
return false;
}
Patches currently in stable-queue which might be from vbabka(a)suse.cz are
queue-4.4/fs-select-add-vmalloc-fallback-for-select-2.patch
queue-4.4/mm-mmap.c-do-not-blow-on-prot_none-map_fixed-holes-in-the-stack.patch
queue-4.4/cma-fix-calculation-of-aligned-offset.patch
queue-4.4/mm-page_alloc-fix-potential-false-positive-in-__zone_watermark_ok.patch
This is a note to let you know that I've just added the patch titled
mm/mmap.c: do not blow on PROT_NONE MAP_FIXED holes in the stack
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
mm-mmap.c-do-not-blow-on-prot_none-map_fixed-holes-in-the-stack.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 561b5e0709e4a248c67d024d4d94b6e31e3edf2f Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko(a)suse.com>
Date: Mon, 10 Jul 2017 15:49:51 -0700
Subject: mm/mmap.c: do not blow on PROT_NONE MAP_FIXED holes in the stack
From: Michal Hocko <mhocko(a)suse.com>
commit 561b5e0709e4a248c67d024d4d94b6e31e3edf2f upstream.
Commit 1be7107fbe18 ("mm: larger stack guard gap, between vmas") has
introduced a regression in some rust and Java environments which are
trying to implement their own stack guard page. They are punching a new
MAP_FIXED mapping inside the existing stack Vma.
This will confuse expand_{downwards,upwards} into thinking that the
stack expansion would in fact get us too close to an existing non-stack
vma which is a correct behavior wrt safety. It is a real regression on
the other hand.
Let's work around the problem by considering PROT_NONE mapping as a part
of the stack. This is a gros hack but overflowing to such a mapping
would trap anyway an we only can hope that usespace knows what it is
doing and handle it propely.
Fixes: 1be7107fbe18 ("mm: larger stack guard gap, between vmas")
Link: http://lkml.kernel.org/r/20170705182849.GA18027@dhcp22.suse.cz
Signed-off-by: Michal Hocko <mhocko(a)suse.com>
Debugged-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Ben Hutchings <ben(a)decadent.org.uk>
Cc: Willy Tarreau <w(a)1wt.eu>
Cc: Oleg Nesterov <oleg(a)redhat.com>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: Hugh Dickins <hughd(a)google.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
mm/mmap.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2188,7 +2188,8 @@ int expand_upwards(struct vm_area_struct
gap_addr = TASK_SIZE;
next = vma->vm_next;
- if (next && next->vm_start < gap_addr) {
+ if (next && next->vm_start < gap_addr &&
+ (next->vm_flags & (VM_WRITE|VM_READ|VM_EXEC))) {
if (!(next->vm_flags & VM_GROWSUP))
return -ENOMEM;
/* Check that both stack segments have the same anon_vma? */
@@ -2273,7 +2274,8 @@ int expand_downwards(struct vm_area_stru
if (gap_addr > address)
return -ENOMEM;
prev = vma->vm_prev;
- if (prev && prev->vm_end > gap_addr) {
+ if (prev && prev->vm_end > gap_addr &&
+ (prev->vm_flags & (VM_WRITE|VM_READ|VM_EXEC))) {
if (!(prev->vm_flags & VM_GROWSDOWN))
return -ENOMEM;
/* Check that both stack segments have the same anon_vma? */
Patches currently in stable-queue which might be from mhocko(a)suse.com are
queue-4.4/fs-select-add-vmalloc-fallback-for-select-2.patch
queue-4.4/hwpoison-memcg-forcibly-uncharge-lru-pages.patch
queue-4.4/mm-mmap.c-do-not-blow-on-prot_none-map_fixed-holes-in-the-stack.patch
This is a note to let you know that I've just added the patch titled
ipc: msg, make msgrcv work with LONG_MIN
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ipc-msg-make-msgrcv-work-with-long_min.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 999898355e08ae3b92dfd0a08db706e0c6703d30 Mon Sep 17 00:00:00 2001
From: Jiri Slaby <jslaby(a)suse.cz>
Date: Wed, 14 Dec 2016 15:06:07 -0800
Subject: ipc: msg, make msgrcv work with LONG_MIN
From: Jiri Slaby <jslaby(a)suse.cz>
commit 999898355e08ae3b92dfd0a08db706e0c6703d30 upstream.
When LONG_MIN is passed to msgrcv, one would expect to recieve any
message. But convert_mode does *msgtyp = -*msgtyp and -LONG_MIN is
undefined. In particular, with my gcc -LONG_MIN produces -LONG_MIN
again.
So handle this case properly by assigning LONG_MAX to *msgtyp if
LONG_MIN was specified as msgtyp to msgrcv.
This code:
long msg[] = { 100, 200 };
int m = msgget(IPC_PRIVATE, IPC_CREAT | 0644);
msgsnd(m, &msg, sizeof(msg), 0);
msgrcv(m, &msg, sizeof(msg), LONG_MIN, 0);
produces currently nothing:
msgget(IPC_PRIVATE, IPC_CREAT|0644) = 65538
msgsnd(65538, {100, "\310\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16, 0) = 0
msgrcv(65538, ...
Except a UBSAN warning:
UBSAN: Undefined behaviour in ipc/msg.c:745:13
negation of -9223372036854775808 cannot be represented in type 'long int':
With the patch, I see what I expect:
msgget(IPC_PRIVATE, IPC_CREAT|0644) = 0
msgsnd(0, {100, "\310\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16, 0) = 0
msgrcv(0, {100, "\310\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"}, 16, -9223372036854775808, 0) = 16
Link: http://lkml.kernel.org/r/20161024082633.10148-1-jslaby@suse.cz
Signed-off-by: Jiri Slaby <jslaby(a)suse.cz>
Cc: Davidlohr Bueso <dave(a)stgolabs.net>
Cc: Manfred Spraul <manfred(a)colorfullife.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
ipc/msg.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--- a/ipc/msg.c
+++ b/ipc/msg.c
@@ -742,7 +742,10 @@ static inline int convert_mode(long *msg
if (*msgtyp == 0)
return SEARCH_ANY;
if (*msgtyp < 0) {
- *msgtyp = -*msgtyp;
+ if (*msgtyp == LONG_MIN) /* -LONG_MIN is undefined */
+ *msgtyp = LONG_MAX;
+ else
+ *msgtyp = -*msgtyp;
return SEARCH_LESSEQUAL;
}
if (msgflg & MSG_EXCEPT)
Patches currently in stable-queue which might be from jslaby(a)suse.cz are
queue-4.4/ipc-msg-make-msgrcv-work-with-long_min.patch
queue-4.4/fs-fcntl-f_setown-avoid-undefined-behaviour.patch
queue-4.4/pm-sleep-declare-__tracedata-symbols-as-char-rather-than-char.patch
queue-4.4/x86-cpu-intel-introduce-macros-for-intel-family-numbers.patch
queue-4.4/x86-retpoline-fill-rsb-on-context-switch-for-affected-cpus.patch
queue-4.4/acpi-scan-prefer-devices-without-_hid-_cid-for-_adr-matching.patch
queue-4.4/time-avoid-undefined-behaviour-in-ktime_add_safe.patch
This is a note to let you know that I've just added the patch titled
hwpoison, memcg: forcibly uncharge LRU pages
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
hwpoison-memcg-forcibly-uncharge-lru-pages.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 18365225f0440d09708ad9daade2ec11275c3df9 Mon Sep 17 00:00:00 2001
From: Michal Hocko <mhocko(a)suse.com>
Date: Fri, 12 May 2017 15:46:26 -0700
Subject: hwpoison, memcg: forcibly uncharge LRU pages
From: Michal Hocko <mhocko(a)suse.com>
commit 18365225f0440d09708ad9daade2ec11275c3df9 upstream.
Laurent Dufour has noticed that hwpoinsoned pages are kept charged. In
his particular case he has hit a bad_page("page still charged to
cgroup") when onlining a hwpoison page. While this looks like something
that shouldn't happen in the first place because onlining hwpages and
returning them to the page allocator makes only little sense it shows a
real problem.
hwpoison pages do not get freed usually so we do not uncharge them (at
least not since commit 0a31bc97c80c ("mm: memcontrol: rewrite uncharge
API")). Each charge pins memcg (since e8ea14cc6ead ("mm: memcontrol:
take a css reference for each charged page")) as well and so the
mem_cgroup and the associated state will never go away. Fix this leak
by forcibly uncharging a LRU hwpoisoned page in delete_from_lru_cache().
We also have to tweak uncharge_list because it cannot rely on zero ref
count for these pages.
[akpm(a)linux-foundation.org: coding-style fixes]
Fixes: 0a31bc97c80c ("mm: memcontrol: rewrite uncharge API")
Link: http://lkml.kernel.org/r/20170502185507.GB19165@dhcp22.suse.cz
Signed-off-by: Michal Hocko <mhocko(a)suse.com>
Reported-by: Laurent Dufour <ldufour(a)linux.vnet.ibm.com>
Tested-by: Laurent Dufour <ldufour(a)linux.vnet.ibm.com>
Reviewed-by: Balbir Singh <bsingharora(a)gmail.com>
Reviewed-by: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
mm/memcontrol.c | 2 +-
mm/memory-failure.c | 7 +++++++
2 files changed, 8 insertions(+), 1 deletion(-)
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5576,7 +5576,7 @@ static void uncharge_list(struct list_he
next = page->lru.next;
VM_BUG_ON_PAGE(PageLRU(page), page);
- VM_BUG_ON_PAGE(page_count(page), page);
+ VM_BUG_ON_PAGE(!PageHWPoison(page) && page_count(page), page);
if (!page->mem_cgroup)
continue;
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -539,6 +539,13 @@ static int delete_from_lru_cache(struct
*/
ClearPageActive(p);
ClearPageUnevictable(p);
+
+ /*
+ * Poisoned page might never drop its ref count to 0 so we have
+ * to uncharge it manually from its memcg.
+ */
+ mem_cgroup_uncharge(p);
+
/*
* drop the page count elevated by isolate_lru_page()
*/
Patches currently in stable-queue which might be from mhocko(a)suse.com are
queue-4.4/fs-select-add-vmalloc-fallback-for-select-2.patch
queue-4.4/hwpoison-memcg-forcibly-uncharge-lru-pages.patch
queue-4.4/mm-mmap.c-do-not-blow-on-prot_none-map_fixed-holes-in-the-stack.patch
This is a note to let you know that I've just added the patch titled
fs/fcntl: f_setown, avoid undefined behaviour
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
fs-fcntl-f_setown-avoid-undefined-behaviour.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From fc3dc67471461c0efcb1ed22fb7595121d65fad9 Mon Sep 17 00:00:00 2001
From: Jiri Slaby <jslaby(a)suse.cz>
Date: Tue, 13 Jun 2017 13:35:51 +0200
Subject: fs/fcntl: f_setown, avoid undefined behaviour
From: Jiri Slaby <jslaby(a)suse.cz>
commit fc3dc67471461c0efcb1ed22fb7595121d65fad9 upstream.
fcntl(0, F_SETOWN, 0x80000000) triggers:
UBSAN: Undefined behaviour in fs/fcntl.c:118:7
negation of -2147483648 cannot be represented in type 'int':
CPU: 1 PID: 18261 Comm: syz-executor Not tainted 4.8.1-0-syzkaller #1
...
Call Trace:
...
[<ffffffffad8f0868>] ? f_setown+0x1d8/0x200
[<ffffffffad8f19a9>] ? SyS_fcntl+0x999/0xf30
[<ffffffffaed1fb00>] ? entry_SYSCALL_64_fastpath+0x23/0xc1
Fix that by checking the arg parameter properly (against INT_MAX) before
"who = -who". And return immediatelly with -EINVAL in case it is wrong.
Note that according to POSIX we can return EINVAL:
http://pubs.opengroup.org/onlinepubs/9699919799/functions/fcntl.html
[EINVAL]
The cmd argument is F_SETOWN and the value of the argument
is not valid as a process or process group identifier.
[v2] returns an error, v1 used to fail silently
[v3] implement proper check for the bad value INT_MIN
Signed-off-by: Jiri Slaby <jslaby(a)suse.cz>
Cc: Jeff Layton <jlayton(a)poochiereds.net>
Cc: "J. Bruce Fields" <bfields(a)fieldses.org>
Cc: Alexander Viro <viro(a)zeniv.linux.org.uk>
Cc: linux-fsdevel(a)vger.kernel.org
Signed-off-by: Jeff Layton <jlayton(a)redhat.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/fcntl.c | 4 ++++
1 file changed, 4 insertions(+)
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -113,6 +113,10 @@ void f_setown(struct file *filp, unsigne
int who = arg;
type = PIDTYPE_PID;
if (who < 0) {
+ /* avoid overflow below */
+ if (who == INT_MIN)
+ return;
+
type = PIDTYPE_PGID;
who = -who;
}
Patches currently in stable-queue which might be from jslaby(a)suse.cz are
queue-4.4/ipc-msg-make-msgrcv-work-with-long_min.patch
queue-4.4/fs-fcntl-f_setown-avoid-undefined-behaviour.patch
queue-4.4/pm-sleep-declare-__tracedata-symbols-as-char-rather-than-char.patch
queue-4.4/x86-cpu-intel-introduce-macros-for-intel-family-numbers.patch
queue-4.4/x86-retpoline-fill-rsb-on-context-switch-for-affected-cpus.patch
queue-4.4/acpi-scan-prefer-devices-without-_hid-_cid-for-_adr-matching.patch
queue-4.4/time-avoid-undefined-behaviour-in-ktime_add_safe.patch
This is a note to let you know that I've just added the patch titled
ext2: Don't clear SGID when inheriting ACLs
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
ext2-don-t-clear-sgid-when-inheriting-acls.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From a992f2d38e4ce17b8c7d1f7f67b2de0eebdea069 Mon Sep 17 00:00:00 2001
From: Jan Kara <jack(a)suse.cz>
Date: Wed, 21 Jun 2017 14:34:15 +0200
Subject: ext2: Don't clear SGID when inheriting ACLs
From: Jan Kara <jack(a)suse.cz>
commit a992f2d38e4ce17b8c7d1f7f67b2de0eebdea069 upstream.
When new directory 'DIR1' is created in a directory 'DIR0' with SGID bit
set, DIR1 is expected to have SGID bit set (and owning group equal to
the owning group of 'DIR0'). However when 'DIR0' also has some default
ACLs that 'DIR1' inherits, setting these ACLs will result in SGID bit on
'DIR1' to get cleared if user is not member of the owning group.
Fix the problem by creating __ext2_set_acl() function that does not call
posix_acl_update_mode() and use it when inheriting ACLs. That prevents
SGID bit clearing and the mode has been properly set by
posix_acl_create() anyway.
Fixes: 073931017b49d9458aa351605b43a7e34598caef
CC: stable(a)vger.kernel.org
CC: linux-ext4(a)vger.kernel.org
Signed-off-by: Jan Kara <jack(a)suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
fs/ext2/acl.c | 36 ++++++++++++++++++++++--------------
1 file changed, 22 insertions(+), 14 deletions(-)
--- a/fs/ext2/acl.c
+++ b/fs/ext2/acl.c
@@ -178,11 +178,8 @@ ext2_get_acl(struct inode *inode, int ty
return acl;
}
-/*
- * inode->i_mutex: down
- */
-int
-ext2_set_acl(struct inode *inode, struct posix_acl *acl, int type)
+static int
+__ext2_set_acl(struct inode *inode, struct posix_acl *acl, int type)
{
int name_index;
void *value = NULL;
@@ -192,13 +189,6 @@ ext2_set_acl(struct inode *inode, struct
switch(type) {
case ACL_TYPE_ACCESS:
name_index = EXT2_XATTR_INDEX_POSIX_ACL_ACCESS;
- if (acl) {
- error = posix_acl_update_mode(inode, &inode->i_mode, &acl);
- if (error)
- return error;
- inode->i_ctime = CURRENT_TIME_SEC;
- mark_inode_dirty(inode);
- }
break;
case ACL_TYPE_DEFAULT:
@@ -225,6 +215,24 @@ ext2_set_acl(struct inode *inode, struct
}
/*
+ * inode->i_mutex: down
+ */
+int
+ext2_set_acl(struct inode *inode, struct posix_acl *acl, int type)
+{
+ int error;
+
+ if (type == ACL_TYPE_ACCESS && acl) {
+ error = posix_acl_update_mode(inode, &inode->i_mode, &acl);
+ if (error)
+ return error;
+ inode->i_ctime = CURRENT_TIME_SEC;
+ mark_inode_dirty(inode);
+ }
+ return __ext2_set_acl(inode, acl, type);
+}
+
+/*
* Initialize the ACLs of a new inode. Called from ext2_new_inode.
*
* dir->i_mutex: down
@@ -241,12 +249,12 @@ ext2_init_acl(struct inode *inode, struc
return error;
if (default_acl) {
- error = ext2_set_acl(inode, default_acl, ACL_TYPE_DEFAULT);
+ error = __ext2_set_acl(inode, default_acl, ACL_TYPE_DEFAULT);
posix_acl_release(default_acl);
}
if (acl) {
if (!error)
- error = ext2_set_acl(inode, acl, ACL_TYPE_ACCESS);
+ error = __ext2_set_acl(inode, acl, ACL_TYPE_ACCESS);
posix_acl_release(acl);
}
return error;
Patches currently in stable-queue which might be from jack(a)suse.cz are
queue-4.4/reiserfs-don-t-preallocate-blocks-for-extended-attributes.patch
queue-4.4/ext2-don-t-clear-sgid-when-inheriting-acls.patch
queue-4.4/reiserfs-fix-race-in-prealloc-discard.patch
queue-4.4/reiserfs-don-t-clear-sgid-when-inheriting-acls.patch
This is a note to let you know that I've just added the patch titled
cma: fix calculation of aligned offset
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
cma-fix-calculation-of-aligned-offset.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From e048cb32f69038aa1c8f11e5c1b331be4181659d Mon Sep 17 00:00:00 2001
From: Doug Berger <opendmb(a)gmail.com>
Date: Mon, 10 Jul 2017 15:49:44 -0700
Subject: cma: fix calculation of aligned offset
From: Doug Berger <opendmb(a)gmail.com>
commit e048cb32f69038aa1c8f11e5c1b331be4181659d upstream.
The align_offset parameter is used by bitmap_find_next_zero_area_off()
to represent the offset of map's base from the previous alignment
boundary; the function ensures that the returned index, plus the
align_offset, honors the specified align_mask.
The logic introduced by commit b5be83e308f7 ("mm: cma: align to physical
address, not CMA region position") has the cma driver calculate the
offset to the *next* alignment boundary. In most cases, the base
alignment is greater than that specified when making allocations,
resulting in a zero offset whether we align up or down. In the example
given with the commit, the base alignment (8MB) was half the requested
alignment (16MB) so the math also happened to work since the offset is
8MB in both directions. However, when requesting allocations with an
alignment greater than twice that of the base, the returned index would
not be correctly aligned.
Also, the align_order arguments of cma_bitmap_aligned_mask() and
cma_bitmap_aligned_offset() should not be negative so the argument type
was made unsigned.
Fixes: b5be83e308f7 ("mm: cma: align to physical address, not CMA region position")
Link: http://lkml.kernel.org/r/20170628170742.2895-1-opendmb@gmail.com
Signed-off-by: Angus Clark <angus(a)angusclark.org>
Signed-off-by: Doug Berger <opendmb(a)gmail.com>
Acked-by: Gregory Fong <gregory.0xf0(a)gmail.com>
Cc: Doug Berger <opendmb(a)gmail.com>
Cc: Angus Clark <angus(a)angusclark.org>
Cc: Laura Abbott <labbott(a)redhat.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Lucas Stach <l.stach(a)pengutronix.de>
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Shiraz Hashim <shashim(a)codeaurora.org>
Cc: Jaewon Kim <jaewon31.kim(a)samsung.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Signed-off-by: Vlastimil Babka <vbabka(a)suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
mm/cma.c | 15 ++++++---------
1 file changed, 6 insertions(+), 9 deletions(-)
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -54,7 +54,7 @@ unsigned long cma_get_size(const struct
}
static unsigned long cma_bitmap_aligned_mask(const struct cma *cma,
- int align_order)
+ unsigned int align_order)
{
if (align_order <= cma->order_per_bit)
return 0;
@@ -62,17 +62,14 @@ static unsigned long cma_bitmap_aligned_
}
/*
- * Find a PFN aligned to the specified order and return an offset represented in
- * order_per_bits.
+ * Find the offset of the base PFN from the specified align_order.
+ * The value returned is represented in order_per_bits.
*/
static unsigned long cma_bitmap_aligned_offset(const struct cma *cma,
- int align_order)
+ unsigned int align_order)
{
- if (align_order <= cma->order_per_bit)
- return 0;
-
- return (ALIGN(cma->base_pfn, (1UL << align_order))
- - cma->base_pfn) >> cma->order_per_bit;
+ return (cma->base_pfn & ((1UL << align_order) - 1))
+ >> cma->order_per_bit;
}
static unsigned long cma_bitmap_pages_to_bits(const struct cma *cma,
Patches currently in stable-queue which might be from opendmb(a)gmail.com are
queue-4.4/cma-fix-calculation-of-aligned-offset.patch
This is a note to let you know that I've just added the patch titled
ACPICA: Namespace: fix operand cache leak
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
acpica-namespace-fix-operand-cache-leak.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 3b2d69114fefa474fca542e51119036dceb4aa6f Mon Sep 17 00:00:00 2001
From: Seunghun Han <kkamagui(a)gmail.com>
Date: Wed, 26 Apr 2017 16:18:08 +0800
Subject: ACPICA: Namespace: fix operand cache leak
From: Seunghun Han <kkamagui(a)gmail.com>
commit 3b2d69114fefa474fca542e51119036dceb4aa6f upstream.
ACPICA commit a23325b2e583556eae88ed3f764e457786bf4df6
I found some ACPI operand cache leaks in ACPI early abort cases.
Boot log of ACPI operand cache leak is as follows:
>[ 0.174332] ACPI: Added _OSI(Module Device)
>[ 0.175504] ACPI: Added _OSI(Processor Device)
>[ 0.176010] ACPI: Added _OSI(3.0 _SCP Extensions)
>[ 0.177032] ACPI: Added _OSI(Processor Aggregator Device)
>[ 0.178284] ACPI: SCI (IRQ16705) allocation failed
>[ 0.179352] ACPI Exception: AE_NOT_ACQUIRED, Unable to install
System Control Interrupt handler (20160930/evevent-131)
>[ 0.180008] ACPI: Unable to start the ACPI Interpreter
>[ 0.181125] ACPI Error: Could not remove SCI handler
(20160930/evmisc-281)
>[ 0.184068] kmem_cache_destroy Acpi-Operand: Slab cache still has
objects
>[ 0.185358] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.10.0-rc3 #2
>[ 0.186820] Hardware name: innotek gmb_h virtual_box/virtual_box, BIOS
virtual_box 12/01/2006
>[ 0.188000] Call Trace:
>[ 0.188000] ? dump_stack+0x5c/0x7d
>[ 0.188000] ? kmem_cache_destroy+0x224/0x230
>[ 0.188000] ? acpi_sleep_proc_init+0x22/0x22
>[ 0.188000] ? acpi_os_delete_cache+0xa/0xd
>[ 0.188000] ? acpi_ut_delete_caches+0x3f/0x7b
>[ 0.188000] ? acpi_terminate+0x5/0xf
>[ 0.188000] ? acpi_init+0x288/0x32e
>[ 0.188000] ? __class_create+0x4c/0x80
>[ 0.188000] ? video_setup+0x7a/0x7a
>[ 0.188000] ? do_one_initcall+0x4e/0x1b0
>[ 0.188000] ? kernel_init_freeable+0x194/0x21a
>[ 0.188000] ? rest_init+0x80/0x80
>[ 0.188000] ? kernel_init+0xa/0x100
>[ 0.188000] ? ret_from_fork+0x25/0x30
When early abort is occurred due to invalid ACPI information, Linux kernel
terminates ACPI by calling acpi_terminate() function. The function calls
acpi_ns_terminate() function to delete namespace data and ACPI operand cache
(acpi_gbl_module_code_list).
But the deletion code in acpi_ns_terminate() function is wrapped in
ACPI_EXEC_APP definition, therefore the code is only executed when the
definition exists. If the define doesn't exist, ACPI operand cache
(acpi_gbl_module_code_list) is leaked, and stack dump is shown in kernel log.
This causes a security threat because the old kernel (<= 4.9) shows memory
locations of kernel functions in stack dump, therefore kernel ASLR can be
neutralized.
To fix ACPI operand leak for enhancing security, I made a patch which
removes the ACPI_EXEC_APP define in acpi_ns_terminate() function for
executing the deletion code unconditionally.
Link: https://github.com/acpica/acpica/commit/a23325b2
Signed-off-by: Seunghun Han <kkamagui(a)gmail.com>
Signed-off-by: Lv Zheng <lv.zheng(a)intel.com>
Signed-off-by: Bob Moore <robert.moore(a)intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Acked-by: Lee, Chun-Yi <jlee(a)suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/acpi/acpica/nsutils.c | 23 +++++++++--------------
1 file changed, 9 insertions(+), 14 deletions(-)
--- a/drivers/acpi/acpica/nsutils.c
+++ b/drivers/acpi/acpica/nsutils.c
@@ -593,25 +593,20 @@ struct acpi_namespace_node *acpi_ns_vali
void acpi_ns_terminate(void)
{
acpi_status status;
+ union acpi_operand_object *prev;
+ union acpi_operand_object *next;
ACPI_FUNCTION_TRACE(ns_terminate);
-#ifdef ACPI_EXEC_APP
- {
- union acpi_operand_object *prev;
- union acpi_operand_object *next;
+ /* Delete any module-level code blocks */
- /* Delete any module-level code blocks */
-
- next = acpi_gbl_module_code_list;
- while (next) {
- prev = next;
- next = next->method.mutex;
- prev->method.mutex = NULL; /* Clear the Mutex (cheated) field */
- acpi_ut_remove_reference(prev);
- }
+ next = acpi_gbl_module_code_list;
+ while (next) {
+ prev = next;
+ next = next->method.mutex;
+ prev->method.mutex = NULL; /* Clear the Mutex (cheated) field */
+ acpi_ut_remove_reference(prev);
}
-#endif
/*
* Free the entire namespace -- all nodes and all objects
Patches currently in stable-queue which might be from kkamagui(a)gmail.com are
queue-4.4/acpica-namespace-fix-operand-cache-leak.patch
This is a note to let you know that I've just added the patch titled
ACPI / scan: Prefer devices without _HID/_CID for _ADR matching
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
acpi-scan-prefer-devices-without-_hid-_cid-for-_adr-matching.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From c2a6bbaf0c5f90463a7011a295bbdb7e33c80b51 Mon Sep 17 00:00:00 2001
From: "Rafael J. Wysocki" <rafael.j.wysocki(a)intel.com>
Date: Fri, 30 Dec 2016 02:27:31 +0100
Subject: ACPI / scan: Prefer devices without _HID/_CID for _ADR matching
From: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
commit c2a6bbaf0c5f90463a7011a295bbdb7e33c80b51 upstream.
The way acpi_find_child_device() works currently is that, if there
are two (or more) devices with the same _ADR value in the same
namespace scope (which is not specifically allowed by the spec and
the OS behavior in that case is not defined), the first one of them
found to be present (with the help of _STA) will be returned.
This covers the majority of cases, but is not sufficient if some of
the devices in question have a _HID (or _CID) returning some valid
ACPI/PNP device IDs (which is disallowed by the spec) and the
ASL writers' expectation appears to be that the OS will match
devices without a valid ACPI/PNP device ID against a given bus
address first.
To cover this special case as well, modify find_child_checks()
to prefer devices without ACPI/PNP device IDs over devices that
have them.
Suggested-by: Mika Westerberg <mika.westerberg(a)linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Tested-by: Hans de Goede <hdegoede(a)redhat.com>
Signed-off-by: Jiri Slaby <jslaby(a)suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/acpi/glue.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
--- a/drivers/acpi/glue.c
+++ b/drivers/acpi/glue.c
@@ -99,13 +99,13 @@ static int find_child_checks(struct acpi
return -ENODEV;
/*
- * If the device has a _HID (or _CID) returning a valid ACPI/PNP
- * device ID, it is better to make it look less attractive here, so that
- * the other device with the same _ADR value (that may not have a valid
- * device ID) can be matched going forward. [This means a second spec
- * violation in a row, so whatever we do here is best effort anyway.]
+ * If the device has a _HID returning a valid ACPI/PNP device ID, it is
+ * better to make it look less attractive here, so that the other device
+ * with the same _ADR value (that may not have a valid device ID) can be
+ * matched going forward. [This means a second spec violation in a row,
+ * so whatever we do here is best effort anyway.]
*/
- return sta_present && list_empty(&adev->pnp.ids) ?
+ return sta_present && !adev->pnp.type.platform_id ?
FIND_CHILD_MAX_SCORE : FIND_CHILD_MIN_SCORE;
}
Patches currently in stable-queue which might be from rafael.j.wysocki(a)intel.com are
queue-4.4/pm-sleep-declare-__tracedata-symbols-as-char-rather-than-char.patch
queue-4.4/acpi-processor-avoid-reserving-io-regions-too-early.patch
queue-4.4/x86-cpu-intel-introduce-macros-for-intel-family-numbers.patch
queue-4.4/acpica-namespace-fix-operand-cache-leak.patch
queue-4.4/acpi-scan-prefer-devices-without-_hid-_cid-for-_adr-matching.patch
This is a note to let you know that I've just added the patch titled
ACPI / processor: Avoid reserving IO regions too early
to the 4.4-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
acpi-processor-avoid-reserving-io-regions-too-early.patch
and it can be found in the queue-4.4 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 86314751c7945fa0c67f459beeda2e7c610ca429 Mon Sep 17 00:00:00 2001
From: "Rafael J. Wysocki" <rafael.j.wysocki(a)intel.com>
Date: Thu, 2 Jun 2016 01:57:50 +0200
Subject: ACPI / processor: Avoid reserving IO regions too early
From: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
commit 86314751c7945fa0c67f459beeda2e7c610ca429 upstream.
Roland Dreier reports that one of his systems cannot boot because of
the changes made by commit ac212b6980d8 (ACPI / processor: Use common
hotplug infrastructure).
The problematic part of it is the request_region() call in
acpi_processor_get_info() that used to run at module init time before
the above commit and now it runs much earlier. Unfortunately, the
region(s) reserved by it fall into a range the PCI subsystem attempts
to reserve for AHCI IO BARs. As a result, the PCI reservation fails
and AHCI doesn't work, while previously the PCI reservation would
be made before acpi_processor_get_info() and it would succeed.
That request_region() call, however, was overlooked by commit
ac212b6980d8, as it is not necessary for the enumeration of the
processors. It only is needed when the ACPI processor driver
actually attempts to handle them which doesn't happen before
loading the ACPI processor driver module. Therefore that call
should have been moved from acpi_processor_get_info() into that
module.
Address the problem by moving the request_region() call in question
out of acpi_processor_get_info() and use the observation that the
region reserved by it is only needed if the FADT-based CPU
throttling method is going to be used, which means that it should
be sufficient to invoke it from acpi_processor_get_throttling_fadt().
Fixes: ac212b6980d8 (ACPI / processor: Use common hotplug infrastructure)
Reported-by: Roland Dreier <roland(a)purestorage.com>
Tested-by: Roland Dreier <roland(a)purestorage.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Acked-by: Joerg Roedel <jroedel(a)suse.de>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/acpi/acpi_processor.c | 9 ---------
drivers/acpi/processor_throttling.c | 9 +++++++++
2 files changed, 9 insertions(+), 9 deletions(-)
--- a/drivers/acpi/acpi_processor.c
+++ b/drivers/acpi/acpi_processor.c
@@ -331,15 +331,6 @@ static int acpi_processor_get_info(struc
pr->throttling.duty_width = acpi_gbl_FADT.duty_width;
pr->pblk = object.processor.pblk_address;
-
- /*
- * We don't care about error returns - we just try to mark
- * these reserved so that nobody else is confused into thinking
- * that this region might be unused..
- *
- * (In particular, allocating the IO range for Cardbus)
- */
- request_region(pr->throttling.address, 6, "ACPI CPU throttle");
}
/*
--- a/drivers/acpi/processor_throttling.c
+++ b/drivers/acpi/processor_throttling.c
@@ -676,6 +676,15 @@ static int acpi_processor_get_throttling
if (!pr->flags.throttling)
return -ENODEV;
+ /*
+ * We don't care about error returns - we just try to mark
+ * these reserved so that nobody else is confused into thinking
+ * that this region might be unused..
+ *
+ * (In particular, allocating the IO range for Cardbus)
+ */
+ request_region(pr->throttling.address, 6, "ACPI CPU throttle");
+
pr->throttling.state = 0;
duty_mask = pr->throttling.state_count - 1;
Patches currently in stable-queue which might be from rafael.j.wysocki(a)intel.com are
queue-4.4/pm-sleep-declare-__tracedata-symbols-as-char-rather-than-char.patch
queue-4.4/acpi-processor-avoid-reserving-io-regions-too-early.patch
queue-4.4/x86-cpu-intel-introduce-macros-for-intel-family-numbers.patch
queue-4.4/acpica-namespace-fix-operand-cache-leak.patch
queue-4.4/acpi-scan-prefer-devices-without-_hid-_cid-for-_adr-matching.patch
This is a note to let you know that I've just added the patch titled
netfilter: xt_osf: Add missing permission checks
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
netfilter-xt_osf-add-missing-permission-checks.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 916a27901de01446bcf57ecca4783f6cff493309 Mon Sep 17 00:00:00 2001
From: Kevin Cernekee <cernekee(a)chromium.org>
Date: Tue, 5 Dec 2017 15:42:41 -0800
Subject: netfilter: xt_osf: Add missing permission checks
From: Kevin Cernekee <cernekee(a)chromium.org>
commit 916a27901de01446bcf57ecca4783f6cff493309 upstream.
The capability check in nfnetlink_rcv() verifies that the caller
has CAP_NET_ADMIN in the namespace that "owns" the netlink socket.
However, xt_osf_fingers is shared by all net namespaces on the
system. An unprivileged user can create user and net namespaces
in which he holds CAP_NET_ADMIN to bypass the netlink_net_capable()
check:
vpnns -- nfnl_osf -f /tmp/pf.os
vpnns -- nfnl_osf -f /tmp/pf.os -d
These non-root operations successfully modify the systemwide OS
fingerprint list. Add new capable() checks so that they can't.
Signed-off-by: Kevin Cernekee <cernekee(a)chromium.org>
Signed-off-by: Pablo Neira Ayuso <pablo(a)netfilter.org>
Acked-by: Michal Kubecek <mkubecek(a)suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/netfilter/xt_osf.c | 7 +++++++
1 file changed, 7 insertions(+)
--- a/net/netfilter/xt_osf.c
+++ b/net/netfilter/xt_osf.c
@@ -19,6 +19,7 @@
#include <linux/module.h>
#include <linux/kernel.h>
+#include <linux/capability.h>
#include <linux/if.h>
#include <linux/inetdevice.h>
#include <linux/ip.h>
@@ -70,6 +71,9 @@ static int xt_osf_add_callback(struct ne
struct xt_osf_finger *kf = NULL, *sf;
int err = 0;
+ if (!capable(CAP_NET_ADMIN))
+ return -EPERM;
+
if (!osf_attrs[OSF_ATTR_FINGER])
return -EINVAL;
@@ -115,6 +119,9 @@ static int xt_osf_remove_callback(struct
struct xt_osf_finger *sf;
int err = -ENOENT;
+ if (!capable(CAP_NET_ADMIN))
+ return -EPERM;
+
if (!osf_attrs[OSF_ATTR_FINGER])
return -EINVAL;
Patches currently in stable-queue which might be from cernekee(a)chromium.org are
queue-4.14/netfilter-xt_osf-add-missing-permission-checks.patch
queue-4.14/netfilter-nfnetlink_cthelper-add-missing-permission-checks.patch
This is a note to let you know that I've just added the patch titled
netfilter: nfnetlink_cthelper: Add missing permission checks
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
netfilter-nfnetlink_cthelper-add-missing-permission-checks.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 4b380c42f7d00a395feede754f0bc2292eebe6e5 Mon Sep 17 00:00:00 2001
From: Kevin Cernekee <cernekee(a)chromium.org>
Date: Sun, 3 Dec 2017 12:12:45 -0800
Subject: netfilter: nfnetlink_cthelper: Add missing permission checks
From: Kevin Cernekee <cernekee(a)chromium.org>
commit 4b380c42f7d00a395feede754f0bc2292eebe6e5 upstream.
The capability check in nfnetlink_rcv() verifies that the caller
has CAP_NET_ADMIN in the namespace that "owns" the netlink socket.
However, nfnl_cthelper_list is shared by all net namespaces on the
system. An unprivileged user can create user and net namespaces
in which he holds CAP_NET_ADMIN to bypass the netlink_net_capable()
check:
$ nfct helper list
nfct v1.4.4: netlink error: Operation not permitted
$ vpnns -- nfct helper list
{
.name = ftp,
.queuenum = 0,
.l3protonum = 2,
.l4protonum = 6,
.priv_data_len = 24,
.status = enabled,
};
Add capable() checks in nfnetlink_cthelper, as this is cleaner than
trying to generalize the solution.
Signed-off-by: Kevin Cernekee <cernekee(a)chromium.org>
Signed-off-by: Pablo Neira Ayuso <pablo(a)netfilter.org>
Acked-by: Michal Kubecek <mkubecek(a)suse.cz>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
net/netfilter/nfnetlink_cthelper.c | 10 ++++++++++
1 file changed, 10 insertions(+)
--- a/net/netfilter/nfnetlink_cthelper.c
+++ b/net/netfilter/nfnetlink_cthelper.c
@@ -17,6 +17,7 @@
#include <linux/types.h>
#include <linux/list.h>
#include <linux/errno.h>
+#include <linux/capability.h>
#include <net/netlink.h>
#include <net/sock.h>
@@ -407,6 +408,9 @@ static int nfnl_cthelper_new(struct net
struct nfnl_cthelper *nlcth;
int ret = 0;
+ if (!capable(CAP_NET_ADMIN))
+ return -EPERM;
+
if (!tb[NFCTH_NAME] || !tb[NFCTH_TUPLE])
return -EINVAL;
@@ -611,6 +615,9 @@ static int nfnl_cthelper_get(struct net
struct nfnl_cthelper *nlcth;
bool tuple_set = false;
+ if (!capable(CAP_NET_ADMIN))
+ return -EPERM;
+
if (nlh->nlmsg_flags & NLM_F_DUMP) {
struct netlink_dump_control c = {
.dump = nfnl_cthelper_dump_table,
@@ -678,6 +685,9 @@ static int nfnl_cthelper_del(struct net
struct nfnl_cthelper *nlcth, *n;
int j = 0, ret;
+ if (!capable(CAP_NET_ADMIN))
+ return -EPERM;
+
if (tb[NFCTH_NAME])
helper_name = nla_data(tb[NFCTH_NAME]);
Patches currently in stable-queue which might be from cernekee(a)chromium.org are
queue-4.14/netfilter-xt_osf-add-missing-permission-checks.patch
queue-4.14/netfilter-nfnetlink_cthelper-add-missing-permission-checks.patch