Hi stable kernel maintainers,
please squash these amdgpu fixes together and backport them to all
applicable stable branches:
15e6b76880e65be24250e30986084b5569b7a06f "drm/amdgpu: Warn and update
pin_size values when destroying a pinned BO"
456607d816d89a442a3d5ec98b02c8bc950b5228 "drm/amdgpu: Don't warn on
destroying a pinned BO"
(These depend on commits a5ccfe5c20740f2fbf00291490cdf8d2373ec255 and
ddc21af4d0f37f42b33c54cb69b215997fe5b082, which already have Cc: stable)
--
Earthling Michel Dänzer | http://www.amd.com
Libre software enthusiast | Mesa and X developer
Two fixes for potential and real issues.
They look worth having in stable, as we've hit the issue on v4.9 stable.
And for linux-next: add lockdep asserts to the line discipline changing
code, verifying that the ldisc semaphore is held for write.
I couldn't verify that holding the write lock fixes the issue, as we've
hit it only once and I've failed to reproduce it.
While searching lkml I found people who probably hit the same crash, so
I'm Cc'ing them here (in the hope that one of them can give a Tested-by):
Cc: Daniel Axtens <dja(a)axtens.net>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: Michael Neuling <mikey(a)neuling.org>
Cc: Mikulas Patocka <mpatocka(a)redhat.com>
Cc: Pasi Kärkkäinen <pasik(a)iki.fi>
Cc: Peter Hurley <peter(a)hurleysoftware.com>
Cc: Sergey Senozhatsky <sergey.senozhatsky.work(a)gmail.com>
Cc: Tan Xiaojun <tanxiaojun(a)huawei.com>
(please, ignore if I Cc'ed you mistakenly)
Dmitry Safonov (4):
tty: Drop tty->count on tty_reopen() failure
tty: Hold tty_ldisc_lock() during tty_reopen()
tty: Lock tty pair in tty_init_dev()
tty/lockdep: Add ldisc_sem asserts
drivers/tty/tty_io.c | 21 +++++++++++++++------
drivers/tty/tty_ldisc.c | 12 ++++++++----
include/linux/tty.h | 4 ++++
3 files changed, 27 insertions(+), 10 deletions(-)
--
2.13.6
We need that to adjust the len of the 2nd transfer (called data in
spi-mem) if it's too long to fit in a SPI message or SPI transfer.
Fixes: c36ff266dc82 ("spi: Extend the core to ease integration of SPI memory controllers")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Chuanhua Han <chuanhua.han(a)nxp.com>
Reviewed-by: Boris Brezillon <boris.brezillon(a)bootlin.com>
---
Changes in v5:
-Add the validation check after the op->data.nbytes assignment
-Assign the "len" variable after defining it
-Remove the brackets on both sides of "op->data.nbytes"
Changes in v4:
-Rename variable name "opcode_addr_dummy_sum" to "len"
-The comparison of "spi_max_message_size(mem->spi)" and "len" was removed
-Adjust their order when comparing the sizes of "spi_max_message_size(mem->spi)" and "len"
-Changing the "unsigned long" type in the code to "size_t"
Changes in v3:
-Rename variable name "val" to "opcode_addr_dummy_sum"
-Move the validity check of the transfer size (i.e., "spi_max_message_size(mem->spi)" and
"opcode_addr_dummy_sum") into the "if (!ctlr->mem_ops || !ctlr->mem_ops->exec_op) {"
block and add the "spi_max_transfer_size(mem->spi)" and "opcode_addr_dummy_sum" comparison
-Adjust the formatting alignment of the code
-"(unsigned long)op->data.nbytes" was modified to "(unsigned long)(op->data.nbytes)"
Changes in v2:
-Place the adjusted transfer bytes code in spi_mem_adjust_op_size() and check
spi_max_message_size(mem->spi) value before subtracting opcode, addr and dummy bytes
-Change the code from fsl-espi controller to generic code(The adjustment of spi transmission
length was originally modified in the "drivers/spi/spi-fsl-espi.c" file, and now the adjustment
of transfer length is made in the "drivers/spi/spi-mem.c" file)
drivers/spi/spi-mem.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
diff --git a/drivers/spi/spi-mem.c b/drivers/spi/spi-mem.c
index e43842c..eb72dba 100644
--- a/drivers/spi/spi-mem.c
+++ b/drivers/spi/spi-mem.c
@@ -346,10 +346,25 @@ EXPORT_SYMBOL_GPL(spi_mem_get_name);
int spi_mem_adjust_op_size(struct spi_mem *mem, struct spi_mem_op *op)
{
struct spi_controller *ctlr = mem->spi->controller;
+ size_t len;
+
+ len = sizeof(op->cmd.opcode) + op->addr.nbytes + op->dummy.nbytes;
if (ctlr->mem_ops && ctlr->mem_ops->adjust_op_size)
return ctlr->mem_ops->adjust_op_size(mem, op);
+ if (!ctlr->mem_ops || !ctlr->mem_ops->exec_op) {
+ if (len > spi_max_transfer_size(mem->spi))
+ return -EINVAL;
+
+ op->data.nbytes = min3((size_t)op->data.nbytes,
+ spi_max_transfer_size(mem->spi),
+ spi_max_message_size(mem->spi) -
+ len);
+ if (!op->data.nbytes)
+ return -EINVAL;
+ }
+
return 0;
}
EXPORT_SYMBOL_GPL(spi_mem_adjust_op_size);
--
2.7.4
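
The clamping done in the fallback path of the patch above can be modelled in
userspace. This is only a minimal sketch, not the kernel code: struct op_sizes,
adjust_data_len() and the two limit parameters are hypothetical stand-ins for
struct spi_mem_op, spi_max_transfer_size() and spi_max_message_size(). The extra
"len > max_message" test is an assumption of this sketch to keep the
subtraction from wrapping; the patch itself only compares len against the
maximum transfer size.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the size fields of struct spi_mem_op. */
struct op_sizes {
	size_t opcode_len;	/* sizeof(op->cmd.opcode) in the patch */
	size_t addr_len;
	size_t dummy_len;
	size_t data_len;	/* adjusted in place */
};

static size_t min_size(size_t a, size_t b)
{
	return a < b ? a : b;
}

/*
 * Clamp data_len so that the data fits in one transfer and the whole
 * operation (opcode + address + dummy + data) fits in one message.
 * Returns 0 on success, -EINVAL if no data byte can be sent.
 */
int adjust_data_len(struct op_sizes *op, size_t max_transfer,
		    size_t max_message)
{
	size_t len;

	assert(op != NULL);
	len = op->opcode_len + op->addr_len + op->dummy_len;

	/* Sketch-only guard: also keeps "max_message - len" from wrapping. */
	if (len > max_transfer || len > max_message)
		return -EINVAL;

	op->data_len = min_size(op->data_len,
				min_size(max_transfer, max_message - len));
	if (!op->data_len)
		return -EINVAL;

	return 0;
}
```

With a 5 byte header and 64 byte limits, a 4096 byte request would be cut down
to 59 data bytes, so that header plus data still fit in a single message.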
This is the start of the stable review cycle for the 4.4.148 release.
There are 43 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Thu Aug 16 17:14:59 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.148-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.4.148-rc1
Guenter Roeck <linux(a)roeck-us.net>
x86/speculation/l1tf: Fix up CPU feature flags
Andi Kleen <ak(a)linux.intel.com>
x86/mm/kmmio: Make the tracer robust against L1TF
Andi Kleen <ak(a)linux.intel.com>
x86/mm/pat: Make set_memory_np() L1TF safe
Andi Kleen <ak(a)linux.intel.com>
x86/speculation/l1tf: Make pmd/pud_mknotpresent() invert
Andi Kleen <ak(a)linux.intel.com>
x86/speculation/l1tf: Invert all not present mappings
Michal Hocko <mhocko(a)suse.cz>
x86/speculation/l1tf: Fix up pte->pfn conversion for PAE
Vlastimil Babka <vbabka(a)suse.cz>
x86/speculation/l1tf: Protect PAE swap entries against L1TF
Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com>
x86/cpufeatures: Add detection of L1D cache flush support.
Vlastimil Babka <vbabka(a)suse.cz>
x86/speculation/l1tf: Extend 64bit swap file size limit
Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com>
x86/bugs: Move the l1tf function and define pr_fmt properly
Andi Kleen <ak(a)linux.intel.com>
x86/speculation/l1tf: Limit swap file size to MAX_PA/2
Andi Kleen <ak(a)linux.intel.com>
x86/speculation/l1tf: Disallow non privileged high MMIO PROT_NONE mappings
Dan Williams <dan.j.williams(a)intel.com>
mm: fix cache mode tracking in vm_insert_mixed()
Andy Lutomirski <luto(a)kernel.org>
mm: Add vm_insert_pfn_prot()
Andi Kleen <ak(a)linux.intel.com>
x86/speculation/l1tf: Add sysfs reporting for l1tf
Andi Kleen <ak(a)linux.intel.com>
x86/speculation/l1tf: Make sure the first page is always reserved
Andi Kleen <ak(a)linux.intel.com>
x86/speculation/l1tf: Protect PROT_NONE PTEs against speculation
Linus Torvalds <torvalds(a)linux-foundation.org>
x86/speculation/l1tf: Protect swap entries against L1TF
Linus Torvalds <torvalds(a)linux-foundation.org>
x86/speculation/l1tf: Change order of offset/type in swap entry
Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
mm: x86: move _PAGE_SWP_SOFT_DIRTY from bit 7 to bit 1
Dave Hansen <dave.hansen(a)linux.intel.com>
x86/mm: Fix swap entry comment and macro
Dave Hansen <dave.hansen(a)linux.intel.com>
x86/mm: Move swap offset/type up in PTE to work around erratum
Andi Kleen <ak(a)linux.intel.com>
x86/speculation/l1tf: Increase 32bit PAE __PHYSICAL_PAGE_SHIFT
Nick Desaulniers <ndesaulniers(a)google.com>
x86/irqflags: Provide a declaration for native_save_fl
Masami Hiramatsu <mhiramat(a)kernel.org>
kprobes/x86: Fix %p uses in error messages
Jiri Kosina <jkosina(a)suse.cz>
x86/speculation: Protect against userspace-userspace spectreRSB
Peter Zijlstra <peterz(a)infradead.org>
x86/paravirt: Fix spectre-v2 mitigations for paravirt guests
Oleksij Rempel <o.rempel(a)pengutronix.de>
ARM: dts: imx6sx: fix irq for pcie bridge
Michael Mera <dev(a)michaelmera.com>
IB/ocrdma: fix out of bounds access to local buffer
Jack Morgenstein <jackm(a)dev.mellanox.co.il>
IB/mlx4: Mark user MR as writable if actual virtual memory is writable
Jack Morgenstein <jackm(a)dev.mellanox.co.il>
IB/core: Make testing MR flags for writability a static inline function
Al Viro <viro(a)zeniv.linux.org.uk>
fix __legitimize_mnt()/mntput() race
Al Viro <viro(a)zeniv.linux.org.uk>
fix mntput/mntput race
Al Viro <viro(a)zeniv.linux.org.uk>
root dentries need RCU-delayed freeing
Bart Van Assche <bart.vanassche(a)wdc.com>
scsi: sr: Avoid that opening a CD-ROM hangs with runtime power management enabled
Hans de Goede <hdegoede(a)redhat.com>
ACPI / LPSS: Add missing prv_offset setting for byt/cht PWM devices
Juergen Gross <jgross(a)suse.com>
xen/netfront: don't cache skb_shinfo()
John David Anglin <dave.anglin(a)bell.net>
parisc: Define mb() and add memory barriers to assembler unlock sequences
Helge Deller <deller(a)gmx.de>
parisc: Enable CONFIG_MLONGCALLS by default
Kees Cook <keescook(a)chromium.org>
fork: unconditionally clear stack on fork
Thomas Egerer <hakke_007(a)gmx.de>
ipv4+ipv6: Make INET*_ESP select CRYPTO_ECHAINIV
Tadeusz Struk <tadeusz.struk(a)intel.com>
tpm: fix race condition in tpm_common_write()
Theodore Ts'o <tytso(a)mit.edu>
ext4: fix check to prevent initializing reserved inodes
-------------
Diffstat:
Makefile | 4 +-
arch/arm/boot/dts/imx6sx.dtsi | 2 +-
arch/parisc/Kconfig | 2 +-
arch/parisc/include/asm/barrier.h | 32 +++++++++++
arch/parisc/kernel/entry.S | 2 +
arch/parisc/kernel/pacache.S | 1 +
arch/parisc/kernel/syscall.S | 4 ++
arch/x86/include/asm/cpufeatures.h | 10 ++--
arch/x86/include/asm/irqflags.h | 2 +
arch/x86/include/asm/page_32_types.h | 9 +++-
arch/x86/include/asm/pgtable-2level.h | 17 ++++++
arch/x86/include/asm/pgtable-3level.h | 37 ++++++++++++-
arch/x86/include/asm/pgtable-invert.h | 32 +++++++++++
arch/x86/include/asm/pgtable.h | 84 +++++++++++++++++++++++------
arch/x86/include/asm/pgtable_64.h | 54 +++++++++++++++----
arch/x86/include/asm/pgtable_types.h | 10 ++--
arch/x86/include/asm/processor.h | 5 ++
arch/x86/kernel/cpu/bugs.c | 81 +++++++++++++++++-----------
arch/x86/kernel/cpu/common.c | 20 +++++++
arch/x86/kernel/kprobes/core.c | 4 +-
arch/x86/kernel/paravirt.c | 14 +++--
arch/x86/kernel/setup.c | 6 +++
arch/x86/mm/init.c | 23 ++++++++
arch/x86/mm/kmmio.c | 25 +++++----
arch/x86/mm/mmap.c | 21 ++++++++
arch/x86/mm/pageattr.c | 8 +--
drivers/acpi/acpi_lpss.c | 2 +
drivers/base/cpu.c | 8 +++
drivers/char/tpm/tpm-dev.c | 43 +++++++--------
drivers/infiniband/core/umem.c | 11 +---
drivers/infiniband/hw/mlx4/mr.c | 50 ++++++++++++++---
drivers/infiniband/hw/ocrdma/ocrdma_stats.c | 2 +-
drivers/net/xen-netfront.c | 8 +--
drivers/scsi/sr.c | 29 +++++++---
fs/dcache.c | 6 ++-
fs/ext4/ialloc.c | 5 +-
fs/ext4/super.c | 8 +--
fs/namespace.c | 28 +++++++++-
include/asm-generic/pgtable.h | 12 +++++
include/linux/cpu.h | 2 +
include/linux/mm.h | 2 +
include/linux/swapfile.h | 2 +
include/linux/thread_info.h | 6 +--
include/rdma/ib_verbs.h | 14 +++++
mm/memory.c | 62 +++++++++++++++++----
mm/mprotect.c | 49 +++++++++++++++++
mm/swapfile.c | 46 ++++++++++------
net/ipv4/Kconfig | 1 +
net/ipv6/Kconfig | 1 +
49 files changed, 714 insertions(+), 192 deletions(-)
Trivial backport of commit 3e536e222f293053; newer kernels have simply
moved the vararg macros.
Testing: 3.18 and 4.4 booted OK in qemu.
>8------------------------------------------------------8<
[backport of commit 3e536e222f293053 from mainline]
There is a window for racing when printing directly to task->comm,
allowing other threads to see a non-terminated string. The vsnprintf
function fills the buffer, counts the truncated chars, then finally
writes the \0 at the end.
    creator                     other
    vsnprintf:
      fill (not terminated)
      count the rest            trace_sched_waking(p):
      ...                         memcpy(comm, p->comm, TASK_COMM_LEN)
      write \0
The consequences depend on how 'other' uses the string. In our case,
it was copied into the tracing system's saved cmdlines, a buffer of
adjacent TASK_COMM_LEN-byte buffers (note the 'n' where 0 should be):
crash-arm64> x/1024s savedcmd->saved_cmdlines | grep 'evenk'
0xffffffd5b3818640: "irq/497-pwr_evenkworker/u16:12"
...and a strcpy out of there would cause stack corruption:
[224761.522292] Kernel panic - not syncing: stack-protector:
Kernel stack is corrupted in: ffffff9bf9783c78
crash-arm64> kbt | grep 'comm\|trace_print_context'
#6 0xffffff9bf9783c78 in trace_print_context+0x18c(+396)
comm (char [16]) = "irq/497-pwr_even"
crash-arm64> rd 0xffffffd4d0e17d14 8
ffffffd4d0e17d14: 2f71726900000000 5f7277702d373934 ....irq/497-pwr_
ffffffd4d0e17d24: 726f776b6e657665 3a3631752f72656b evenkworker/u16:
ffffffd4d0e17d34: f9780248ff003231 cede60e0ffffff9b 12..H.x......`..
ffffffd4d0e17d44: cede60c8ffffffd4 00000fffffffffd4 .....`..........
The workaround in e09e28671 (use strlcpy in __trace_find_cmdline) was
likely needed because of this same bug.
Solved by vsnprintf:ing to a local buffer, then using set_task_comm().
This way, there won't be a window where comm is not terminated.
Cc: stable(a)vger.kernel.org
Fixes: bc0c38d139ec7 ("ftrace: latency tracer infrastructure")
Reviewed-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
[backported to 3.18 / 4.4 by Snild]
Signed-off-by: Snild Dolkow <snild(a)sony.com>
---
kernel/kthread.c | 8 +++++++-
1 file changed, 7 insertions(+), 1 deletion(-)
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 850b255..ac6849e 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -313,10 +313,16 @@ struct task_struct *kthread_create_on_node(int (*threadfn)(void *data),
task = create->result;
if (!IS_ERR(task)) {
static const struct sched_param param = { .sched_priority = 0 };
+ char name[TASK_COMM_LEN];
va_list args;
va_start(args, namefmt);
- vsnprintf(task->comm, sizeof(task->comm), namefmt, args);
+ /*
+ * task is already visible to other tasks, so updating
+ * COMM must be protected.
+ */
+ vsnprintf(name, sizeof(name), namefmt, args);
+ set_task_comm(task, name);
va_end(args);
/*
* root may have changed our (kthreadd's) priority or CPU mask.
--
2.7.4
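
The "format into a local buffer, then publish" pattern used by the fix can be
sketched in userspace as follows. This is an illustration under assumptions,
not the kernel implementation: struct fake_task, publish_comm() and
set_name_fmt() are hypothetical names, and the task_lock() taken by the real
set_task_comm() is left out.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

#define COMM_LEN 16	/* mirrors TASK_COMM_LEN in the patch */

/* Hypothetical stand-in for the part of task_struct that matters here. */
struct fake_task {
	char comm[COMM_LEN];
};

/*
 * Publish an already terminated name. At most COMM_LEN - 1 bytes are
 * copied, so the last byte of comm is only ever written with 0 and a
 * concurrent reader never sees an unterminated buffer. The locking done
 * by the kernel's set_task_comm() is omitted in this sketch.
 */
static void publish_comm(struct fake_task *t, const char *name)
{
	size_t n = strlen(name);

	if (n >= COMM_LEN)
		n = COMM_LEN - 1;
	memcpy(t->comm, name, n);
	memset(t->comm + n, 0, COMM_LEN - n);
}

/* Format into a private buffer first, then publish the result. */
void set_name_fmt(struct fake_task *t, const char *fmt, ...)
{
	char name[COMM_LEN];
	va_list args;

	assert(t != NULL);
	va_start(args, fmt);
	/* Truncates to COMM_LEN - 1 characters and always terminates. */
	vsnprintf(name, sizeof(name), fmt, args);
	va_end(args);

	publish_comm(t, name);
}
```

The formatting step, which is where the unterminated intermediate state
exists, only ever touches the private buffer.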
This is the start of the stable review cycle for the 4.9.95 release.
There are 66 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Thu Apr 19 15:56:27 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.95-rc1…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.9.95-rc1
Phil Elwell <phil(a)raspberrypi.org>
lan78xx: Correctly indicate invalid OTP
Stefan Hajnoczi <stefanha(a)redhat.com>
vhost: fix vhost_vq_access_ok() log check
Tejaswi Tanikella <tejaswit(a)codeaurora.org>
slip: Check if rstate is initialized before uncompressing
Ka-Cheong Poon <ka-cheong.poon(a)oracle.com>
rds: MP-RDS may use an invalid c_path
Bassem Boubaker <bassem.boubaker(a)actia.fr>
cdc_ether: flag the Cinterion AHS8 modem by gemalto as WWAN
Marek Szyprowski <m.szyprowski(a)samsung.com>
hwmon: (ina2xx) Fix access to uninitialized mutex
Sudhir Sreedharan <ssreedharan(a)mvista.com>
rtl8187: Fix NULL pointer dereference in priv->conf_mutex
Szymon Janc <szymon.janc(a)codecoup.pl>
Bluetooth: Fix connection if directed advertising and privacy is used
Al Viro <viro(a)zeniv.linux.org.uk>
getname_kernel() needs to make sure that ->name != ->iname in long case
Vasily Gorbik <gor(a)linux.ibm.com>
s390/ipl: ensure loadparm valid flag is set
Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
s390/qdio: don't merge ERROR output buffers
Julian Wiedmann <jwi(a)linux.vnet.ibm.com>
s390/qdio: don't retry EQBS after CCQ 96
Dan Williams <dan.j.williams(a)intel.com>
nfit: fix region registration vs block-data-window ranges
Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
block/loop: fix deadlock after loop_set_status
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Revert "perf tests: Decompress kernel module before objdump"
Eric Biggers <ebiggers(a)google.com>
sunrpc: remove incorrect HMAC request initialization
Mark Rutland <mark.rutland(a)arm.com>
arm64: Kill PSCI_GET_VERSION as a variant-2 workaround
Mark Rutland <mark.rutland(a)arm.com>
arm64: Add ARM_SMCCC_ARCH_WORKAROUND_1 BP hardening support
Mark Rutland <mark.rutland(a)arm.com>
arm/arm64: smccc: Implement SMCCC v1.1 inline primitive
Mark Rutland <mark.rutland(a)arm.com>
arm/arm64: smccc: Make function identifiers an unsigned quantity
Mark Rutland <mark.rutland(a)arm.com>
firmware/psci: Expose SMCCC version through psci_ops
Mark Rutland <mark.rutland(a)arm.com>
firmware/psci: Expose PSCI conduit
Mark Rutland <mark.rutland(a)arm.com>
arm64: KVM: Add SMCCC_ARCH_WORKAROUND_1 fast handling
Mark Rutland <mark.rutland(a)arm.com>
arm64: KVM: Report SMCCC_ARCH_WORKAROUND_1 BP hardening support
Mark Rutland <mark.rutland(a)arm.com>
arm/arm64: KVM: Turn kvm_psci_version into a static inline
Mark Rutland <mark.rutland(a)arm.com>
arm64: KVM: Make PSCI_VERSION a fast path
Mark Rutland <mark.rutland(a)arm.com>
arm/arm64: KVM: Advertise SMCCC v1.1
Mark Rutland <mark.rutland(a)arm.com>
arm/arm64: KVM: Implement PSCI 1.0 support
Mark Rutland <mark.rutland(a)arm.com>
arm/arm64: KVM: Add smccc accessors to PSCI code
Mark Rutland <mark.rutland(a)arm.com>
arm/arm64: KVM: Add PSCI_VERSION helper
Mark Rutland <mark.rutland(a)arm.com>
arm/arm64: KVM: Consolidate the PSCI include files
Mark Rutland <mark.rutland(a)arm.com>
arm64: KVM: Increment PC after handling an SMC trap
Mark Rutland <mark.rutland(a)arm.com>
arm64: Branch predictor hardening for Cavium ThunderX2
Mark Rutland <mark.rutland(a)arm.com>
arm64: Implement branch predictor hardening for affected Cortex-A CPUs
Mark Rutland <mark.rutland(a)arm.com>
arm64: cpu_errata: Allow an erratum to be match for all revisions of a core
Mark Rutland <mark.rutland(a)arm.com>
arm64: cputype: Add missing MIDR values for Cortex-A72 and Cortex-A75
Mark Rutland <mark.rutland(a)arm.com>
arm64: entry: Apply BP hardening for suspicious interrupts from EL0
Mark Rutland <mark.rutland(a)arm.com>
arm64: entry: Apply BP hardening for high-priority synchronous exceptions
Mark Rutland <mark.rutland(a)arm.com>
arm64: KVM: Use per-CPU vector when BP hardening is enabled
Mark Rutland <mark.rutland(a)arm.com>
mm: Introduce lm_alias
Mark Rutland <mark.rutland(a)arm.com>
arm64: Move BP hardening to check_and_switch_context
Mark Rutland <mark.rutland(a)arm.com>
arm64: Add skeleton to harden the branch predictor against aliasing attacks
Mark Rutland <mark.rutland(a)arm.com>
arm64: Move post_ttbr_update_workaround to C code
Mark Rutland <mark.rutland(a)arm.com>
arm64: Factor out TTBR0_EL1 post-update workaround into a specific asm macro
Mark Rutland <mark.rutland(a)arm.com>
drivers/firmware: Expose psci_get_version through psci_ops structure
Mark Rutland <mark.rutland(a)arm.com>
arm64: cpufeature: Pass capability structure to ->enable callback
Mark Rutland <mark.rutland(a)arm.com>
arm64: Run enable method for errata work arounds on late CPUs
Mark Rutland <mark.rutland(a)arm.com>
arm64: cpufeature: __this_cpu_has_cap() shouldn't stop early
Mark Rutland <mark.rutland(a)arm.com>
arm64: uaccess: Mask __user pointers for __arch_{clear, copy_*}_user
Mark Rutland <mark.rutland(a)arm.com>
arm64: uaccess: Don't bother eliding access_ok checks in __{get, put}_user
Mark Rutland <mark.rutland(a)arm.com>
arm64: uaccess: Prevent speculative use of the current addr_limit
Mark Rutland <mark.rutland(a)arm.com>
arm64: entry: Ensure branch through syscall table is bounded under speculation
Mark Rutland <mark.rutland(a)arm.com>
arm64: Use pointer masking to limit uaccess speculation
Mark Rutland <mark.rutland(a)arm.com>
arm64: Make USER_DS an inclusive limit
Mark Rutland <mark.rutland(a)arm.com>
arm64: move TASK_* definitions to <asm/processor.h>
Mark Rutland <mark.rutland(a)arm.com>
arm64: Implement array_index_mask_nospec()
Mark Rutland <mark.rutland(a)arm.com>
arm64: barrier: Add CSDB macros to control data-value prediction
Arnd Bergmann <arnd(a)arndb.de>
radeon: hide pointless #warning when compile testing
Prashant Bhole <bhole_prashant_q7(a)lab.ntt.co.jp>
perf/core: Fix use-after-free in uprobe_perf_close()
Adrian Hunter <adrian.hunter(a)intel.com>
perf intel-pt: Fix timestamp following overflow
Adrian Hunter <adrian.hunter(a)intel.com>
perf intel-pt: Fix error recovery from missing TIP packet
Adrian Hunter <adrian.hunter(a)intel.com>
perf intel-pt: Fix sync_switch
Adrian Hunter <adrian.hunter(a)intel.com>
perf intel-pt: Fix overlap detection to identify consecutive buffers correctly
Dexuan Cui <decui(a)microsoft.com>
Drivers: hv: vmbus: do not mark HV_PCIE as perf_device
Helge Deller <deller(a)gmx.de>
parisc: Fix out of array access in match_pci_device()
Mauro Carvalho Chehab <mchehab(a)kernel.org>
media: v4l2-compat-ioctl32: don't oops on overlay
-------------
Diffstat:
Makefile | 4 +-
arch/arm/include/asm/kvm_host.h | 6 +
arch/arm/include/asm/kvm_mmu.h | 10 +
arch/arm/include/asm/kvm_psci.h | 27 -
arch/arm/kvm/arm.c | 11 +-
arch/arm/kvm/handle_exit.c | 4 +-
arch/arm/kvm/psci.c | 143 +-
arch/arm64/Kconfig | 17 +
arch/arm64/crypto/sha256-core.S | 2061 ++++++++++++++++++++
arch/arm64/crypto/sha512-core.S | 1085 +++++++++++
arch/arm64/include/asm/assembler.h | 19 +
arch/arm64/include/asm/barrier.h | 23 +
arch/arm64/include/asm/cpucaps.h | 3 +-
arch/arm64/include/asm/cputype.h | 6 +
arch/arm64/include/asm/kvm_host.h | 5 +
arch/arm64/include/asm/kvm_mmu.h | 38 +
arch/arm64/include/asm/kvm_psci.h | 27 -
arch/arm64/include/asm/memory.h | 15 -
arch/arm64/include/asm/mmu.h | 39 +
arch/arm64/include/asm/processor.h | 24 +
arch/arm64/include/asm/sysreg.h | 2 +
arch/arm64/include/asm/uaccess.h | 153 +-
arch/arm64/kernel/Makefile | 4 +
arch/arm64/kernel/arm64ksyms.c | 4 +-
arch/arm64/kernel/bpi.S | 75 +
arch/arm64/kernel/cpu_errata.c | 189 +-
arch/arm64/kernel/cpufeature.c | 10 +-
arch/arm64/kernel/entry.S | 25 +-
arch/arm64/kvm/handle_exit.c | 16 +-
arch/arm64/kvm/hyp/hyp-entry.S | 20 +-
arch/arm64/kvm/hyp/switch.c | 5 +-
arch/arm64/lib/clear_user.S | 6 +-
arch/arm64/lib/copy_in_user.S | 4 +-
arch/arm64/mm/context.c | 12 +
arch/arm64/mm/fault.c | 34 +-
arch/arm64/mm/proc.S | 7 +-
arch/parisc/kernel/drivers.c | 4 +
arch/s390/kernel/ipl.c | 1 +
drivers/acpi/nfit/core.c | 22 +-
drivers/block/loop.c | 12 +-
drivers/firmware/psci.c | 57 +-
drivers/gpu/drm/radeon/radeon_object.c | 3 +-
drivers/hv/channel_mgmt.c | 2 +-
drivers/hwmon/ina2xx.c | 3 +-
drivers/media/v4l2-core/v4l2-compat-ioctl32.c | 4 +-
drivers/net/slip/slhc.c | 5 +
drivers/net/usb/cdc_ether.c | 6 +
drivers/net/usb/lan78xx.c | 3 +-
drivers/net/wireless/realtek/rtl818x/rtl8187/dev.c | 2 +-
drivers/s390/cio/qdio_main.c | 42 +-
drivers/vhost/vhost.c | 8 +-
fs/namei.c | 3 +-
include/kvm/arm_psci.h | 51 +
include/linux/arm-smccc.h | 165 +-
include/linux/mm.h | 4 +
include/linux/psci.h | 14 +
include/net/bluetooth/hci_core.h | 2 +-
include/net/slhc_vj.h | 1 +
include/uapi/linux/psci.h | 3 +
kernel/events/core.c | 6 +
net/bluetooth/hci_conn.c | 29 +-
net/bluetooth/hci_event.c | 15 +-
net/bluetooth/l2cap_core.c | 2 +-
net/rds/send.c | 15 +-
net/sunrpc/auth_gss/gss_krb5_crypto.c | 3 -
tools/perf/tests/code-reading.c | 20 +-
.../perf/util/intel-pt-decoder/intel-pt-decoder.c | 64 +-
.../perf/util/intel-pt-decoder/intel-pt-decoder.h | 2 +-
tools/perf/util/intel-pt.c | 37 +-
69 files changed, 4423 insertions(+), 320 deletions(-)
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d814a49198eafa6163698bdd93961302f3a877a4 Mon Sep 17 00:00:00 2001
From: Ethan Lien <ethanlien(a)synology.com>
Date: Mon, 2 Jul 2018 15:44:58 +0800
Subject: [PATCH] btrfs: use correct compare function of dirty_metadata_bytes
We use a customized, nodesize batch value to update dirty_metadata_bytes.
We should also use the batch version of the compare function, or we will
easily take the fast path and get a false result from percpu_counter_compare().
Fixes: e2d845211eda ("Btrfs: use percpu counter for dirty metadata count")
CC: stable(a)vger.kernel.org # 4.4+
Signed-off-by: Ethan Lien <ethanlien(a)synology.com>
Reviewed-by: Nikolay Borisov <nborisov(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index 6023eed3e805..e3858b2fe014 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -959,8 +959,9 @@ static int btree_writepages(struct address_space *mapping,
fs_info = BTRFS_I(mapping->host)->root->fs_info;
/* this is a bit racy, but that's ok */
- ret = percpu_counter_compare(&fs_info->dirty_metadata_bytes,
- BTRFS_DIRTY_METADATA_THRESH);
+ ret = __percpu_counter_compare(&fs_info->dirty_metadata_bytes,
+ BTRFS_DIRTY_METADATA_THRESH,
+ fs_info->dirty_metadata_batch);
if (ret < 0)
return 0;
}
@@ -4134,8 +4135,9 @@ static void __btrfs_btree_balance_dirty(struct btrfs_fs_info *fs_info,
if (flush_delayed)
btrfs_balance_delayed_items(fs_info);
- ret = percpu_counter_compare(&fs_info->dirty_metadata_bytes,
- BTRFS_DIRTY_METADATA_THRESH);
+ ret = __percpu_counter_compare(&fs_info->dirty_metadata_bytes,
+ BTRFS_DIRTY_METADATA_THRESH,
+ fs_info->dirty_metadata_batch);
if (ret > 0) {
balance_dirty_pages_ratelimited(fs_info->btree_inode->i_mapping);
}
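
Why the batch value matters for the comparison can be shown with a simplified
userspace model of a per-CPU counter. This is a sketch under assumptions, not
the kernel's percpu_counter code: struct model_counter, model_add(),
model_compare() and the fixed NR_CPUS are hypothetical, and the numbers used
below are made up for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define NR_CPUS 4	/* assumption for this model */

/* Simplified per-CPU counter: a global count plus per-CPU deltas. */
struct model_counter {
	int64_t count;
	int32_t delta[NR_CPUS];
};

/* Add on one CPU; fold into the global count once |delta| reaches batch. */
void model_add(struct model_counter *c, int cpu, int64_t amount, int32_t batch)
{
	int64_t d;

	assert(cpu >= 0 && cpu < NR_CPUS);
	d = c->delta[cpu] + amount;
	if (d >= batch || d <= -batch) {
		c->count += d;
		c->delta[cpu] = 0;
	} else {
		c->delta[cpu] = (int32_t)d;
	}
}

/* Precise value: global count plus everything still held per CPU. */
static int64_t model_sum(const struct model_counter *c)
{
	int64_t sum = c->count;
	int cpu;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		sum += c->delta[cpu];
	return sum;
}

/*
 * Compare against rhs. The rough global count is trusted only if it is
 * further from rhs than the maximum error, batch * NR_CPUS. If batch is
 * smaller than the one used in model_add(), that bound is wrong.
 */
int model_compare(const struct model_counter *c, int64_t rhs, int32_t batch)
{
	int64_t count = c->count;

	if (llabs(count - rhs) > (int64_t)batch * NR_CPUS)
		return count > rhs ? 1 : -1;

	count = model_sum(c);
	if (count > rhs)
		return 1;
	if (count < rhs)
		return -1;
	return 0;
}
```

If each of the four CPUs adds 16000 with a batch of 16384, nothing is folded
into the global count. Comparing against 50000 with a batch of 32 then takes
the fast path and reports "less than", although the precise sum is 64000;
comparing with the batch that was used for the updates falls through to the
precise sum.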
The page migration code employs try_to_unmap() to try and unmap the
source page. This is accomplished by using rmap_walk to find all
vmas where the page is mapped. This search stops when page mapcount
is zero. For shared PMD huge pages, the page map count is always 1
no matter the number of mappings. Shared mappings are tracked via
the reference count of the PMD page. Therefore, try_to_unmap stops
prematurely and does not completely unmap all mappings of the source
page.
This problem can result in data corruption as writes to the original
source page can happen after contents of the page are copied to the
target page. Hence, data is lost.
This problem was originally seen as DB corruption of shared global
areas after a huge page was soft offlined due to ECC memory errors.
DB developers noticed they could reproduce the issue by (hotplug)
offlining memory used to back huge pages. A simple testcase can
reproduce the problem by creating a shared PMD mapping (note that
this must be at least PUD_SIZE in size and PUD_SIZE aligned (1GB on
x86)), and using migrate_pages() to migrate process pages between
nodes while continually writing to the huge pages being migrated.
To fix, have the try_to_unmap_one routine check for huge PMD sharing
by calling huge_pmd_unshare for hugetlbfs huge pages. If it is a
shared mapping it will be 'unshared' which removes the page table
entry and drops the reference on the PMD page. After this, flush
caches and TLB.
mmu notifiers are called before locking page tables, but we can not
be sure of PMD sharing until page tables are locked. Therefore,
check for the possibility of PMD sharing before locking so that
notifiers can prepare for the worst possible case.
Fixes: 39dde65c9940 ("shared page table for hugetlb page")
Cc: stable(a)vger.kernel.org
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
---
include/linux/hugetlb.h | 14 ++++++++++++++
mm/hugetlb.c | 40 +++++++++++++++++++++++++++++++++++++--
mm/rmap.c | 42 ++++++++++++++++++++++++++++++++++++++---
3 files changed, 91 insertions(+), 5 deletions(-)
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 36fa6a2a82e3..4ee95d8c8413 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -140,6 +140,8 @@ pte_t *huge_pte_alloc(struct mm_struct *mm,
pte_t *huge_pte_offset(struct mm_struct *mm,
unsigned long addr, unsigned long sz);
int huge_pmd_unshare(struct mm_struct *mm, unsigned long *addr, pte_t *ptep);
+void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
+ unsigned long *start, unsigned long *end);
struct page *follow_huge_addr(struct mm_struct *mm, unsigned long address,
int write);
struct page *follow_huge_pd(struct vm_area_struct *vma,
@@ -170,6 +172,18 @@ static inline unsigned long hugetlb_total_pages(void)
return 0;
}
+static inline int huge_pmd_unshare(struct mm_struct *mm, unsigned long *addr,
+ pte_t *ptep)
+{
+ return 0;
+}
+
+static inline void adjust_range_if_pmd_sharing_possible(
+ struct vm_area_struct *vma,
+ unsigned long *start, unsigned long *end)
+{
+}
+
#define follow_hugetlb_page(m,v,p,vs,a,b,i,w,n) ({ BUG(); 0; })
#define follow_huge_addr(mm, addr, write) ERR_PTR(-EINVAL)
#define copy_hugetlb_page_range(src, dst, vma) ({ BUG(); 0; })
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 3103099f64fd..a73c5728e961 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4548,6 +4548,9 @@ static unsigned long page_table_shareable(struct vm_area_struct *svma,
return saddr;
}
+#define _range_in_vma(vma, start, end) \
+ ((vma)->vm_start <= (start) && (end) <= (vma)->vm_end)
+
static bool vma_shareable(struct vm_area_struct *vma, unsigned long addr)
{
unsigned long base = addr & PUD_MASK;
@@ -4556,12 +4559,40 @@ static bool vma_shareable(struct vm_area_struct *vma, unsigned long addr)
/*
* check on proper vm_flags and page table alignment
*/
- if (vma->vm_flags & VM_MAYSHARE &&
- vma->vm_start <= base && end <= vma->vm_end)
+ if (vma->vm_flags & VM_MAYSHARE && _range_in_vma(vma, base, end))
return true;
return false;
}
+/*
+ * Determine if start,end range within vma could be mapped by shared pmd.
+ * If yes, adjust start and end to cover range associated with possible
+ * shared pmd mappings.
+ */
+void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
+ unsigned long *start, unsigned long *end)
+{
+ unsigned long check_addr = *start;
+
+ if (!(vma->vm_flags & VM_MAYSHARE))
+ return;
+
+ for (check_addr = *start; check_addr < *end; check_addr += PUD_SIZE) {
+ unsigned long a_start = check_addr & PUD_MASK;
+ unsigned long a_end = a_start + PUD_SIZE;
+
+ /*
+ * If sharing is possible, adjust start/end if necessary.
+ */
+ if (_range_in_vma(vma, a_start, a_end)) {
+ if (a_start < *start)
+ *start = a_start;
+ if (a_end > *end)
+ *end = a_end;
+ }
+ }
+}
+
/*
* Search for a shareable pmd page for hugetlb. In any case calls pmd_alloc()
* and returns the corresponding pte. While this is not necessary for the
@@ -4659,6 +4690,11 @@ int huge_pmd_unshare(struct mm_struct *mm, unsigned long *addr, pte_t *ptep)
{
return 0;
}
+
+void adjust_range_if_pmd_sharing_possible(struct vm_area_struct *vma,
+ unsigned long *start, unsigned long *end)
+{
+}
#define want_pmd_share() (0)
#endif /* CONFIG_ARCH_WANT_HUGE_PMD_SHARE */
diff --git a/mm/rmap.c b/mm/rmap.c
index eb477809a5c0..1e79fac3186b 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1362,11 +1362,21 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
}
/*
- * We have to assume the worse case ie pmd for invalidation. Note that
- * the page can not be free in this function as call of try_to_unmap()
- * must hold a reference on the page.
+ * For THP, we have to assume the worse case ie pmd for invalidation.
+ * For hugetlb, it could be much worse if we need to do pud
+ * invalidation in the case of pmd sharing.
+ *
+ * Note that the page can not be free in this function as call of
+ * try_to_unmap() must hold a reference on the page.
*/
end = min(vma->vm_end, start + (PAGE_SIZE << compound_order(page)));
+ if (PageHuge(page)) {
+ /*
+ * If sharing is possible, start and end will be adjusted
+ * accordingly.
+ */
+ adjust_range_if_pmd_sharing_possible(vma, &start, &end);
+ }
mmu_notifier_invalidate_range_start(vma->vm_mm, start, end);
while (page_vma_mapped_walk(&pvmw)) {
@@ -1409,6 +1419,32 @@ static bool try_to_unmap_one(struct page *page, struct vm_area_struct *vma,
subpage = page - page_to_pfn(page) + pte_pfn(*pvmw.pte);
address = pvmw.address;
+ if (PageHuge(page)) {
+ if (huge_pmd_unshare(mm, &address, pvmw.pte)) {
+ /*
+ * huge_pmd_unshare unmapped an entire PMD
+ * page. There is no way of knowing exactly
+ * which PMDs may be cached for this mm, so
+ * we must flush them all. start/end were
+ * already adjusted above to cover this range.
+ */
+ flush_cache_range(vma, start, end);
+ flush_tlb_range(vma, start, end);
+ mmu_notifier_invalidate_range(mm, start, end);
+
+ /*
+ * The ref count of the PMD page was dropped
+ * which is part of the way map counting
+ * is done for shared PMDs. Return 'true'
+ * here. When there is no other sharing,
+ * huge_pmd_unshare returns false and we will
+ * unmap the actual page and drop map count
+ * to zero.
+ */
+ page_vma_mapped_walk_done(&pvmw);
+ break;
+ }
+ }
if (IS_ENABLED(CONFIG_MIGRATION) &&
(flags & TTU_MIGRATION) &&
--
2.17.1
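
The range widening done by adjust_range_if_pmd_sharing_possible() can be tried
out in userspace with a small model. This is a sketch under assumptions:
struct model_vma, its may_share field (standing in for
vma->vm_flags & VM_MAYSHARE) and the MODEL_* constants are hypothetical, and
PUD_SIZE is fixed to the 1GB x86 value mentioned in the description.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PUD_SIZE (1ULL << 30)	/* 1GB, as on x86 */
#define MODEL_PUD_MASK (~(MODEL_PUD_SIZE - 1))

/* Hypothetical stand-in for the fields of vm_area_struct used here. */
struct model_vma {
	uint64_t vm_start;
	uint64_t vm_end;
	int may_share;	/* stands in for vma->vm_flags & VM_MAYSHARE */
};

static int range_in_vma(const struct model_vma *vma,
			uint64_t start, uint64_t end)
{
	return vma->vm_start <= start && end <= vma->vm_end;
}

/*
 * If [*start, *end) touches a PUD_SIZE aligned region that lies fully
 * inside a shareable vma, widen the range to cover that whole region,
 * because a shared PMD page maps all of it.
 */
void model_adjust_range(const struct model_vma *vma,
			uint64_t *start, uint64_t *end)
{
	uint64_t addr;

	assert(vma && start && end);
	if (!vma->may_share)
		return;

	for (addr = *start; addr < *end; addr += MODEL_PUD_SIZE) {
		uint64_t a_start = addr & MODEL_PUD_MASK;
		uint64_t a_end = a_start + MODEL_PUD_SIZE;

		if (range_in_vma(vma, a_start, a_end)) {
			if (a_start < *start)
				*start = a_start;
			if (a_end > *end)
				*end = a_end;
		}
	}
}
```

For a shareable vma covering 1GB to 3GB, a 2MB range starting at 1GB + 4MB is
widened to the full 1GB to 2GB region. A vma that does not contain a complete
aligned 1GB region leaves the range untouched.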