On Tue, Feb 27, 2018 at 02:17:15AM +0000, Harsh Shandilya wrote:
> On Tue 27 Feb, 2018, 1:47 AM Greg Kroah-Hartman, <gregkh(a)linuxfoundation.org>
> wrote:
>
> > This is the start of the stable review cycle for the 3.18.97 release.
> > There are 13 patches in this series, all will be posted as a response
> > to this one. If anyone has any issues with these being applied, please
> > let me know.
> >
> > Responses should be made by Wed Feb 28 20:15:12 UTC 2018.
> > Anything received after that time might be too late.
> >
> > The whole patch series can be found in one patch at:
> >
> > https://www.kernel.org/pub/linux/kernel/v3.x/stable-review/patch-3.18.97-rc…
> > or in the git tree and branch at:
> > git://
> > git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
> > linux-3.18.y
> > and the diffstat can be found below.
> >
>
> No regressions noticed on the OnePlus 3T. CAF's msm-3.18 tree requires
> reverting commit
> https://source.codeaurora.org/quic/la/kernel/msm-3.18/commit?id=1278f001ef9…
> to avoid conflicting with the patch titled "usb: gadget: f_fs: Process all
> descriptors during bind", kernel-common has no merge problems. Thanks for
> the update.
Thanks for testing this and letting me know.
greg k-h
From: Ben Gardner <gardner.ben(a)gmail.com>
commit fba4adbbf670577e605f9ad306629db6031cd48b upstream.
One I2C bus on my Atom E3845 board has been broken since 4.9.
It has two devices, both declared by ACPI and with built-in drivers.
There are two back-to-back transactions originating from the kernel, one
targeting each device. The first transaction works, the second one locks
up the I2C controller. The controller never recovers.
These kernel logs show up whenever an I2C transaction is attempted after
this failure.
i2c-designware-pci 0000:00:18.3: timeout in disabling adapter
i2c-designware-pci 0000:00:18.3: timeout waiting for bus ready
Waiting for the I2C controller status to indicate that it is enabled
before programming it fixes the issue.
I have tested this patch on 4.14 and 4.15.
Fixes: commit 2702ea7dbec5 ("i2c: designware: wait for disable/enable only if necessary")
Cc: linux-stable <stable(a)vger.kernel.org> # v4.9..v4.12
Signed-off-by: Ben Gardner <gardner.ben(a)gmail.com>
[Jarkko: Backported to v4.9..v4.12 before i2c-designware-core.c was
renamed to i2c-designware-master.c]
Signed-off-by: Jarkko Nikula <jarkko.nikula(a)linux.intel.com>
---
drivers/i2c/busses/i2c-designware-core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/i2c/busses/i2c-designware-core.c b/drivers/i2c/busses/i2c-designware-core.c
index 809f4d4e93a0..340e037b3224 100644
--- a/drivers/i2c/busses/i2c-designware-core.c
+++ b/drivers/i2c/busses/i2c-designware-core.c
@@ -507,7 +507,7 @@ static void i2c_dw_xfer_init(struct dw_i2c_dev *dev)
i2c_dw_disable_int(dev);
/* Enable the adapter */
- __i2c_dw_enable(dev, true);
+ __i2c_dw_enable_and_wait(dev, true);
/* Clear and enable interrupts */
dw_readl(dev, DW_IC_CLR_INTR);
--
2.16.1
To reproduce the lock up do the following
- connect otg host adapter and a USB device to the dual-role port
so that it is in host mode.
- suspend to mem.
- disconnect otg adapter.
- resume the system.
If we call dwc3_host_exit() before tasks are thawed
xhci_plat_remove() seems to lock up at the second usb_remove_hcd() call.
To work around this we queue the _dwc3_set_mode() work on
the system_freezable_wq.
Fixes: 41ce1456e1db ("usb: dwc3: core: make dwc3_set_mode() work properly")
Cc: <stable(a)vger.kernel.org> # v4.12+
Suggested-by: Manu Gautam <mgautam(a)codeaurora.org>
Signed-off-by: Roger Quadros <rogerq(a)ti.com>
---
drivers/usb/dwc3/core.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/usb/dwc3/core.c b/drivers/usb/dwc3/core.c
index f1d838a..e94bf91 100644
--- a/drivers/usb/dwc3/core.c
+++ b/drivers/usb/dwc3/core.c
@@ -175,7 +175,7 @@ void dwc3_set_mode(struct dwc3 *dwc, u32 mode)
dwc->desired_dr_role = mode;
spin_unlock_irqrestore(&dwc->lock, flags);
- queue_work(system_power_efficient_wq, &dwc->drd_work);
+ queue_work(system_freezable_wq, &dwc->drd_work);
}
u32 dwc3_core_fifo_space(struct dwc3_ep *dep, u8 type)
--
cheers,
-roger
Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki. Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki
Due to using gcc defines for configuration, some labels might be unused in
certain configurations. While adding a __maybe_unused to the label is
fine in general, the line has to be terminated with ';'. This is also
reflected in the gcc documentation, but gcc parsed the previous variant
without an error message.
This has been spotted while compiling with goto-cc, the compiler for the
CPROVER tool suite.
Signed-off-by: Norbert Manthey <nmanthey(a)amazon.de>
Signed-off-by: Michael Tautschnig <tautschn(a)amazon.co.uk>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: linux-kernel(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
---
kernel/sched/fair.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 26a71eb..3af8fb6 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -6716,7 +6716,7 @@ pick_next_task_fair(struct rq *rq, struct task_struct *prev, struct rq_flags *rf
p = task_of(se);
-done: __maybe_unused
+done: __maybe_unused;
#ifdef CONFIG_SMP
/*
* Move the next running task to the front of
--
2.7.4
Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
main office: Krausenstr. 38, 10117 Berlin
Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger
Ust-ID: DE289237879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B
The patch
ASoC: topology: Fix logical inversion in set_link_hw_format()
has been applied to the asoc tree at
https://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound.git
All being well this means that it will be integrated into the linux-next
tree (usually sometime in the next 24 hours) and sent to Linus during
the next merge window (or sooner if it is a bug fix), however if
problems are discovered then the patch may be dropped or reverted.
You may get further e-mails resulting from automated or manual testing
and review of the tree, please engage with people reporting problems and
send followup patches addressing any issues that are reported if needed.
If any updates are required or you are submitting further changes they
should be sent as incremental updates against current git, existing
patches will not be replaced.
Please add any relevant lists and maintainers to the CCs when replying
to this mail.
Thanks,
Mark
>From 7f34c3f38e9666216fda155b6ebededd1073f5aa Mon Sep 17 00:00:00 2001
From: Pan Xiuli <xiuli.pan(a)linux.intel.com>
Date: Fri, 23 Feb 2018 06:01:28 +0800
Subject: [PATCH] ASoC: topology: Fix logical inversion in set_link_hw_format()
The master/slave conventions are wrt. the codec. The topology code
contains a logical inversion, fix to follow ASoC usage.
Signed-off-by: Pan Xiuli <xiuli.pan(a)linux.intel.com>
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Cc: stable(a)vger.kernel.org
---
sound/soc/soc-topology.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/sound/soc/soc-topology.c b/sound/soc/soc-topology.c
index 01a50413c66f..5f507993c43e 100644
--- a/sound/soc/soc-topology.c
+++ b/sound/soc/soc-topology.c
@@ -1996,11 +1996,11 @@ static void set_link_hw_format(struct snd_soc_dai_link *link,
/* clock masters */
bclk_master = hw_config->bclk_master;
fsync_master = hw_config->fsync_master;
- if (!bclk_master && !fsync_master)
+ if (bclk_master && fsync_master)
link->dai_fmt |= SND_SOC_DAIFMT_CBM_CFM;
- else if (bclk_master && !fsync_master)
- link->dai_fmt |= SND_SOC_DAIFMT_CBS_CFM;
else if (!bclk_master && fsync_master)
+ link->dai_fmt |= SND_SOC_DAIFMT_CBS_CFM;
+ else if (bclk_master && !fsync_master)
link->dai_fmt |= SND_SOC_DAIFMT_CBM_CFS;
else
link->dai_fmt |= SND_SOC_DAIFMT_CBS_CFS;
--
2.16.1
On 02/26/18 20:47, Mark Brown wrote:
> On Mon, Feb 26, 2018 at 07:34:07PM +0100, Kirill Marinushkin wrote:
>
>> 2. The suggested commit breaks the existing ASoC.
>> The existing functionality already works with several existing ASoC by
>> Intel. The suggested commit will break it for the following reason:
>> The ALSA topology mechanism loads the binary topology file into ASoC. The
>> suggested commit modifies the parser, but the binaries are already created for
>> the existing functionality. As a result, all existing binaries will be parsed
>> incorrectly.
> Are there actual binaries using the existing code or is that just
> theoretical? My impression was that this was fixing things for existing
> binaries which were shipped for non-mainline kernels, is that not the
> case?
There is an actual conf, which can be converted into an actual binary:
`alsa-lib/src/conf/topology/broadwell/broadwell.conf`
Theoretically, there can be an unknown number of the binaries, not available publicly.
Broadwell exists for a long time, and there is no way to be sure that all the theoretically
existing binaries are modified together with this kernel patch.
On Mon, 2018-02-12 at 08:24 +0100, Paolo Valente wrote:
>
> > Il giorno 10 feb 2018, alle ore 09:29, Oleksandr Natalenko <oleksandr(a)natalenko.name> ha scritto:
> >
> > Hi.
> >
> > On pátek 9. února 2018 18:29:39 CET Mike Galbraith wrote:
> >> On Fri, 2018-02-09 at 14:21 +0100, Oleksandr Natalenko wrote:
> >>> In addition to this I think it should be worth considering CC'ing Greg
> >>> to pull this fix into 4.15 stable tree.
> >>
> >> This isn't one he can cherry-pick, some munging required, in which case
> >> he usually wants a properly tested backport.
> >>
> >> -Mike
> >
> > Maybe, this could be a good opportunity to push all the pending BFQ patches
> > into the stable 4.15 branch? Because IIUC currently BFQ in 4.15 is just
> > unusable.
> >
> > Paolo?
> >
>
> Of course ok for me, and thanks Oleksandr for proposing this. These
> commits should apply cleanly on 4.15, and FWIW have been tested, by me
> and BFQ users, on 4.15 too in these months.
How does that work without someone actually submitting patches? CC
stable and pass along a conveniently sorted cherry-pick list?
9b25bd0368d5 block, bfq: remove batches of confusing ifdefs
a34b024448eb block, bfq: consider also past I/O in soft real-time detection
4403e4e467c3 block, bfq: remove superfluous check in queue-merging setup
7b8fa3b900a0 block, bfq: let a queue be merged only shortly after starting I/O
1be6e8a964ee block, bfq: check low_latency flag in bfq_bfqq_save_state()
05e902835616 block, bfq: add missing rq_pos_tree update on rq removal
f0ba5ea2fe45 block, bfq: increase threshold to deem I/O as random
0d52af590552 block, bfq: release oom-queue ref to root group on exit
52257ffbfcaf block, bfq: put async queues for root bfq groups too
8abef10b3de1 bfq-iosched: don't call bfqg_and_blkg_put for !CONFIG_BFQ_GROUP_IOSCHED
8993d445df38 block, bfq: fix occurrences of request finish method's old name
8a8747dc01ce block, bfq: limit sectors served with interactive weight raising
a52a69ea89dc block, bfq: limit tags for writes and async I/O
a78773906147 block, bfq: add requeue-request hook
> ---
> >
> > block, bfq: add requeue-request hook
> > bfq-iosched: don't call bfqg_and_blkg_put for !CONFIG_BFQ_GROUP_IOSCHED
> > block, bfq: release oom-queue ref to root group on exit
> > block, bfq: put async queues for root bfq groups too
> > block, bfq: limit sectors served with interactive weight raising
> > block, bfq: limit tags for writes and async I/O
> > block, bfq: increase threshold to deem I/O as random
> > block, bfq: remove superfluous check in queue-merging setup
> > block, bfq: let a queue be merged only shortly after starting I/O
> > block, bfq: check low_latency flag in bfq_bfqq_save_state()
> > block, bfq: add missing rq_pos_tree update on rq removal
> > block, bfq: fix occurrences of request finish method's old name
> > block, bfq: consider also past I/O in soft real-time detection
> > block, bfq: remove batches of confusing ifdefs
> >
> >
>
When an MSI descriptor was not available, the error path would try to
unbind an irq that was never acquired - potentially unbinding an
unrelated irq.
Fixes: 4892c9b4ada9f9 ("xen: add support for MSI message groups")
Reported-by: Hooman Mirhadi <mirhadih(a)amazon.com>
CC: <stable(a)vger.kernel.org>
CC: Roger Pau Monné <roger.pau(a)citrix.com>
CC: David Vrabel <david.vrabel(a)citrix.com>
CC: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
CC: Eduardo Valentin <eduval(a)amazon.com>
CC: Juergen Gross <jgross(a)suse.com>
CC: Thomas Gleixner <tglx(a)linutronix.de>
CC: "K. Y. Srinivasan" <kys(a)microsoft.com>
CC: Liu Shuo <shuo.a.liu(a)intel.com>
CC: Anoob Soman <anoob.soman(a)citrix.com>
Signed-off-by: Amit Shah <aams(a)amazon.com>
---
drivers/xen/events/events_base.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/xen/events/events_base.c b/drivers/xen/events/events_base.c
index 1ab4bd1..b6b8b29 100644
--- a/drivers/xen/events/events_base.c
+++ b/drivers/xen/events/events_base.c
@@ -749,6 +749,7 @@ int xen_bind_pirq_msi_to_irq(struct pci_dev *dev, struct msi_desc *msidesc,
}
ret = irq_set_msi_desc(irq, msidesc);
+ i--;
if (ret < 0)
goto error_irq;
out:
--
2.7.3.AMZN
Amazon Development Center Germany GmbH
Berlin - Dresden - Aachen
main office: Krausenstr. 38, 10117 Berlin
Geschaeftsfuehrer: Dr. Ralf Herbrich, Christian Schlaeger
Ust-ID: DE289237879
Eingetragen am Amtsgericht Charlottenburg HRB 149173 B
commit 8e1eb3fa009aa7c0b944b3c8b26b07de0efb3200 upstream.
At entry userspace may have (maliciously) populated the extra registers
outside the syscall calling convention with arbitrary values that could
be useful in a speculative execution (Spectre style) attack.
Clear these registers to minimize the kernel's attack surface.
Note, this only clears the extra registers and not the unused
registers for syscalls less than 6 arguments, since those registers are
likely to be clobbered well before their values could be put to use
under speculation.
Note, Linus found that the XOR instructions can be executed with
minimized cost if interleaved with the PUSH instructions, and Ingo's
analysis found that R10 and R11 should be included in the register
clearing beyond the typical 'extra' syscall calling convention
registers.
Suggested-by: Linus Torvalds <torvalds(a)linux-foundation.org>
Reported-by: Andi Kleen <ak(a)linux.intel.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Cc: <stable(a)vger.kernel.org>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brian Gerst <brgerst(a)gmail.com>
Cc: Denys Vlasenko <dvlasenk(a)redhat.com>
Cc: H. Peter Anvin <hpa(a)zytor.com>
Cc: Josh Poimboeuf <jpoimboe(a)redhat.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Link: http://lkml.kernel.org/r/151787988577.7847.16733592218894189003.stgit@dwill…
[ Made small improvements to the changelog and the code comments. ]
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
---
arch/x86/entry/entry_64.S | 13 +++++++++++++
1 file changed, 13 insertions(+)
diff --git a/arch/x86/entry/entry_64.S b/arch/x86/entry/entry_64.S
index db5009ce065a..8d7e4d48db0d 100644
--- a/arch/x86/entry/entry_64.S
+++ b/arch/x86/entry/entry_64.S
@@ -176,13 +176,26 @@ GLOBAL(entry_SYSCALL_64_after_swapgs)
pushq %r8 /* pt_regs->r8 */
pushq %r9 /* pt_regs->r9 */
pushq %r10 /* pt_regs->r10 */
+ /*
+ * Clear extra registers that a speculation attack might
+ * otherwise want to exploit. Interleave XOR with PUSH
+ * for better uop scheduling:
+ */
+ xorq %r10, %r10 /* nospec r10 */
pushq %r11 /* pt_regs->r11 */
+ xorq %r11, %r11 /* nospec r11 */
pushq %rbx /* pt_regs->rbx */
+ xorl %ebx, %ebx /* nospec rbx */
pushq %rbp /* pt_regs->rbp */
+ xorl %ebp, %ebp /* nospec rbp */
pushq %r12 /* pt_regs->r12 */
+ xorq %r12, %r12 /* nospec r12 */
pushq %r13 /* pt_regs->r13 */
+ xorq %r13, %r13 /* nospec r13 */
pushq %r14 /* pt_regs->r14 */
+ xorq %r14, %r14 /* nospec r14 */
pushq %r15 /* pt_regs->r15 */
+ xorq %r15, %r15 /* nospec r15 */
/* IRQs are off. */
movq %rsp, %rdi
commit b70131de648c2b997d22f4653934438013f407a1 upstream.
V4L2 memory registrations are incompatible with filesystem-dax that
needs the ability to revoke dma access to a mapping at will, or
otherwise allow the kernel to wait for completion of DMA. The
filesystem-dax implementation breaks the traditional solution of
truncate of active file backed mappings since there is no page-cache
page we can orphan to sustain ongoing DMA.
If v4l2 wants to support long lived DMA mappings it needs to arrange to
hold a file lease or use some other mechanism so that the kernel can
coordinate revoking DMA access when the filesystem needs to truncate
mappings.
Link: http://lkml.kernel.org/r/151068940499.7446.12846708245365671207.stgit@dwill…
Fixes: 3565fce3a659 ("mm, x86: get_user_pages() for dax mappings")
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Reported-by: Jan Kara <jack(a)suse.cz>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Cc: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Doug Ledford <dledford(a)redhat.com>
Cc: Hal Rosenstock <hal.rosenstock(a)gmail.com>
Cc: Inki Dae <inki.dae(a)samsung.com>
Cc: Jason Gunthorpe <jgg(a)mellanox.com>
Cc: Jeff Moyer <jmoyer(a)redhat.com>
Cc: Joonyoung Shim <jy0922.shim(a)samsung.com>
Cc: Kyungmin Park <kyungmin.park(a)samsung.com>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Ross Zwisler <ross.zwisler(a)linux.intel.com>
Cc: Sean Hefty <sean.hefty(a)intel.com>
Cc: Seung-Woo Kim <sw0312.kim(a)samsung.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
---
drivers/media/v4l2-core/videobuf-dma-sg.c | 5 +++--
1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/media/v4l2-core/videobuf-dma-sg.c b/drivers/media/v4l2-core/videobuf-dma-sg.c
index 1db0af6c7f94..b6189a4958c5 100644
--- a/drivers/media/v4l2-core/videobuf-dma-sg.c
+++ b/drivers/media/v4l2-core/videobuf-dma-sg.c
@@ -185,12 +185,13 @@ static int videobuf_dma_init_user_locked(struct videobuf_dmabuf *dma,
dprintk(1, "init user [0x%lx+0x%lx => %d pages]\n",
data, size, dma->nr_pages);
- err = get_user_pages(data & PAGE_MASK, dma->nr_pages,
+ err = get_user_pages_longterm(data & PAGE_MASK, dma->nr_pages,
flags, dma->pages, NULL);
if (err != dma->nr_pages) {
dma->nr_pages = (err >= 0) ? err : 0;
- dprintk(1, "get_user_pages: err=%d [%d]\n", err, dma->nr_pages);
+ dprintk(1, "get_user_pages_longterm: err=%d [%d]\n", err,
+ dma->nr_pages);
return err < 0 ? err : -EINVAL;
}
return 0;
commit 2bb6d2837083de722bfdc369cb0d76ce188dd9b4 upstream.
Patch series "introduce get_user_pages_longterm()", v2.
Here is a new get_user_pages api for cases where a driver intends to
keep an elevated page count indefinitely. This is distinct from usages
like iov_iter_get_pages where the elevated page counts are transient.
The iov_iter_get_pages cases immediately turn around and submit the
pages to a device driver which will put_page when the i/o operation
completes (under kernel control).
In the longterm case userspace is responsible for dropping the page
reference at some undefined point in the future. This is untenable for
filesystem-dax case where the filesystem is in control of the lifetime
of the block / page and needs reasonable limits on how long it can wait
for pages in a mapping to become idle.
Fixing filesystems to actually wait for dax pages to be idle before
blocks from a truncate/hole-punch operation are repurposed is saved for
a later patch series.
Also, allowing longterm registration of dax mappings is a future patch
series that introduces a "map with lease" semantic where the kernel can
revoke a lease and force userspace to drop its page references.
I have also tagged these for -stable to purposely break cases that might
assume that longterm memory registrations for filesystem-dax mappings
were supported by the kernel. The behavior regression this policy
change implies is one of the reasons we maintain the "dax enabled.
Warning: EXPERIMENTAL, use at your own risk" notification when mounting
a filesystem in dax mode.
It is worth noting the device-dax interface does not suffer the same
constraints since it does not support file space management operations
like hole-punch.
This patch (of 4):
Until there is a solution to the dma-to-dax vs truncate problem it is
not safe to allow long standing memory registrations against
filesytem-dax vmas. Device-dax vmas do not have this problem and are
explicitly allowed.
This is temporary until a "memory registration with layout-lease"
mechanism can be implemented for the affected sub-systems (RDMA and
V4L2).
[akpm(a)linux-foundation.org: use kcalloc()]
Link: http://lkml.kernel.org/r/151068939435.7446.13560129395419350737.stgit@dwill…
Fixes: 3565fce3a659 ("mm, x86: get_user_pages() for dax mappings")
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Suggested-by: Christoph Hellwig <hch(a)lst.de>
Cc: Doug Ledford <dledford(a)redhat.com>
Cc: Hal Rosenstock <hal.rosenstock(a)gmail.com>
Cc: Inki Dae <inki.dae(a)samsung.com>
Cc: Jan Kara <jack(a)suse.cz>
Cc: Jason Gunthorpe <jgg(a)mellanox.com>
Cc: Jeff Moyer <jmoyer(a)redhat.com>
Cc: Joonyoung Shim <jy0922.shim(a)samsung.com>
Cc: Kyungmin Park <kyungmin.park(a)samsung.com>
Cc: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Ross Zwisler <ross.zwisler(a)linux.intel.com>
Cc: Sean Hefty <sean.hefty(a)intel.com>
Cc: Seung-Woo Kim <sw0312.kim(a)samsung.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
---
include/linux/dax.h | 5 ----
include/linux/fs.h | 20 ++++++++++++++++
include/linux/mm.h | 13 ++++++++++
mm/gup.c | 64 +++++++++++++++++++++++++++++++++++++++++++++++++++
4 files changed, 97 insertions(+), 5 deletions(-)
diff --git a/include/linux/dax.h b/include/linux/dax.h
index add6c4bc568f..ed9cf2f5cd06 100644
--- a/include/linux/dax.h
+++ b/include/linux/dax.h
@@ -61,11 +61,6 @@ static inline int dax_pmd_fault(struct vm_area_struct *vma, unsigned long addr,
int dax_pfn_mkwrite(struct vm_area_struct *, struct vm_fault *);
#define dax_mkwrite(vma, vmf, gb) dax_fault(vma, vmf, gb)
-static inline bool vma_is_dax(struct vm_area_struct *vma)
-{
- return vma->vm_file && IS_DAX(vma->vm_file->f_mapping->host);
-}
-
static inline bool dax_mapping(struct address_space *mapping)
{
return mapping->host && IS_DAX(mapping->host);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index d705ae084edd..745ea1b2e02c 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -18,6 +18,7 @@
#include <linux/bug.h>
#include <linux/mutex.h>
#include <linux/rwsem.h>
+#include <linux/mm_types.h>
#include <linux/capability.h>
#include <linux/semaphore.h>
#include <linux/fiemap.h>
@@ -3033,6 +3034,25 @@ static inline bool io_is_direct(struct file *filp)
return (filp->f_flags & O_DIRECT) || IS_DAX(filp->f_mapping->host);
}
+static inline bool vma_is_dax(struct vm_area_struct *vma)
+{
+ return vma->vm_file && IS_DAX(vma->vm_file->f_mapping->host);
+}
+
+static inline bool vma_is_fsdax(struct vm_area_struct *vma)
+{
+ struct inode *inode;
+
+ if (!vma->vm_file)
+ return false;
+ if (!vma_is_dax(vma))
+ return false;
+ inode = file_inode(vma->vm_file);
+ if (inode->i_mode == S_IFCHR)
+ return false; /* device-dax */
+ return true;
+}
+
static inline int iocb_flags(struct file *file)
{
int res = 0;
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2217e2f18247..8e506783631b 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1288,6 +1288,19 @@ long __get_user_pages_unlocked(struct task_struct *tsk, struct mm_struct *mm,
struct page **pages, unsigned int gup_flags);
long get_user_pages_unlocked(unsigned long start, unsigned long nr_pages,
struct page **pages, unsigned int gup_flags);
+#ifdef CONFIG_FS_DAX
+long get_user_pages_longterm(unsigned long start, unsigned long nr_pages,
+ unsigned int gup_flags, struct page **pages,
+ struct vm_area_struct **vmas);
+#else
+static inline long get_user_pages_longterm(unsigned long start,
+ unsigned long nr_pages, unsigned int gup_flags,
+ struct page **pages, struct vm_area_struct **vmas)
+{
+ return get_user_pages(start, nr_pages, gup_flags, pages, vmas);
+}
+#endif /* CONFIG_FS_DAX */
+
int get_user_pages_fast(unsigned long start, int nr_pages, int write,
struct page **pages);
diff --git a/mm/gup.c b/mm/gup.c
index c63a0341ae38..6c3b4e822946 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -982,6 +982,70 @@ long get_user_pages(unsigned long start, unsigned long nr_pages,
}
EXPORT_SYMBOL(get_user_pages);
+#ifdef CONFIG_FS_DAX
+/*
+ * This is the same as get_user_pages() in that it assumes we are
+ * operating on the current task's mm, but it goes further to validate
+ * that the vmas associated with the address range are suitable for
+ * longterm elevated page reference counts. For example, filesystem-dax
+ * mappings are subject to the lifetime enforced by the filesystem and
+ * we need guarantees that longterm users like RDMA and V4L2 only
+ * establish mappings that have a kernel enforced revocation mechanism.
+ *
+ * "longterm" == userspace controlled elevated page count lifetime.
+ * Contrast this to iov_iter_get_pages() usages which are transient.
+ */
+long get_user_pages_longterm(unsigned long start, unsigned long nr_pages,
+ unsigned int gup_flags, struct page **pages,
+ struct vm_area_struct **vmas_arg)
+{
+ struct vm_area_struct **vmas = vmas_arg;
+ struct vm_area_struct *vma_prev = NULL;
+ long rc, i;
+
+ if (!pages)
+ return -EINVAL;
+
+ if (!vmas) {
+ vmas = kcalloc(nr_pages, sizeof(struct vm_area_struct *),
+ GFP_KERNEL);
+ if (!vmas)
+ return -ENOMEM;
+ }
+
+ rc = get_user_pages(start, nr_pages, gup_flags, pages, vmas);
+
+ for (i = 0; i < rc; i++) {
+ struct vm_area_struct *vma = vmas[i];
+
+ if (vma == vma_prev)
+ continue;
+
+ vma_prev = vma;
+
+ if (vma_is_fsdax(vma))
+ break;
+ }
+
+ /*
+ * Either get_user_pages() failed, or the vma validation
+ * succeeded, in either case we don't need to put_page() before
+ * returning.
+ */
+ if (i >= rc)
+ goto out;
+
+ for (i = 0; i < rc; i++)
+ put_page(pages[i]);
+ rc = -EOPNOTSUPP;
+out:
+ if (vmas != vmas_arg)
+ kfree(vmas);
+ return rc;
+}
+EXPORT_SYMBOL(get_user_pages_longterm);
+#endif /* CONFIG_FS_DAX */
+
/**
* populate_vma_page_range() - populate a range of pages in the vma.
* @vma: target vma
This is a note to let you know that I've just added the patch titled
mm: Fix devm_memremap_pages() collision handling
to the 4.9-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
mm-fix-devm_memremap_pages-collision-handling.patch
and it can be found in the queue-4.9 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From foo@baz Mon Feb 26 20:55:53 CET 2018
From: Dan Williams <dan.j.williams(a)intel.com>
Date: Fri, 23 Feb 2018 14:06:10 -0800
Subject: mm: Fix devm_memremap_pages() collision handling
To: gregkh(a)linuxfoundation.org
Cc:
Message-ID: <151942357089.21775.3486425046348885247.stgit(a)dwillia2-desk3.amr.corp.intel.com>
From: Jan H. Schönherr <jschoenh(a)amazon.de>
commit 77dd66a3c67c93ab401ccc15efff25578be281fd upstream.
If devm_memremap_pages() detects a collision while adding entries
to the radix-tree, we call pgmap_radix_release(). Unfortunately,
the function removes *all* entries for the range -- including the
entries that caused the collision in the first place.
Modify pgmap_radix_release() to take an additional argument to
indicate where to stop, so that only newly added entries are removed
from the tree.
Cc: <stable(a)vger.kernel.org>
Fixes: 9476df7d80df ("mm: introduce find_dev_pagemap()")
Signed-off-by: Jan H. Schönherr <jschoenh(a)amazon.de>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
kernel/memremap.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
--- a/kernel/memremap.c
+++ b/kernel/memremap.c
@@ -194,7 +194,7 @@ void put_zone_device_page(struct page *p
}
EXPORT_SYMBOL(put_zone_device_page);
-static void pgmap_radix_release(struct resource *res)
+static void pgmap_radix_release(struct resource *res, resource_size_t end_key)
{
resource_size_t key, align_start, align_size, align_end;
@@ -203,8 +203,11 @@ static void pgmap_radix_release(struct r
align_end = align_start + align_size - 1;
mutex_lock(&pgmap_lock);
- for (key = res->start; key <= res->end; key += SECTION_SIZE)
+ for (key = res->start; key <= res->end; key += SECTION_SIZE) {
+ if (key >= end_key)
+ break;
radix_tree_delete(&pgmap_radix, key >> PA_SECTION_SHIFT);
+ }
mutex_unlock(&pgmap_lock);
}
@@ -255,7 +258,7 @@ static void devm_memremap_pages_release(
unlock_device_hotplug();
untrack_pfn(NULL, PHYS_PFN(align_start), align_size);
- pgmap_radix_release(res);
+ pgmap_radix_release(res, -1);
dev_WARN_ONCE(dev, pgmap->altmap && pgmap->altmap->alloc,
"%s: failed to free all reserved pages\n", __func__);
}
@@ -289,7 +292,7 @@ struct dev_pagemap *find_dev_pagemap(res
void *devm_memremap_pages(struct device *dev, struct resource *res,
struct percpu_ref *ref, struct vmem_altmap *altmap)
{
- resource_size_t key, align_start, align_size, align_end;
+ resource_size_t key = 0, align_start, align_size, align_end;
pgprot_t pgprot = PAGE_KERNEL;
struct dev_pagemap *pgmap;
struct page_map *page_map;
@@ -392,7 +395,7 @@ void *devm_memremap_pages(struct device
untrack_pfn(NULL, PHYS_PFN(align_start), align_size);
err_pfn_remap:
err_radix:
- pgmap_radix_release(res);
+ pgmap_radix_release(res, key);
devres_free(page_map);
return ERR_PTR(error);
}
Patches currently in stable-queue which might be from dan.j.williams(a)intel.com are
queue-4.9/mm-fix-devm_memremap_pages-collision-handling.patch
queue-4.9/ib-core-disable-memory-registration-of-filesystem-dax-vmas.patch
queue-4.9/mm-avoid-spurious-bad-pmd-warning-messages.patch
queue-4.9/mm-introduce-get_user_pages_longterm.patch
queue-4.9/mm-fail-get_vaddr_frames-for-filesystem-dax-mappings.patch
queue-4.9/fs-dax.c-fix-inefficiency-in-dax_writeback_mapping_range.patch
queue-4.9/device-dax-implement-split-to-catch-invalid-munmap-attempts.patch
queue-4.9/v4l2-disable-filesystem-dax-mapping-support.patch
queue-4.9/libnvdimm-dax-fix-1gb-aligned-namespaces-vs-physical-misalignment.patch
queue-4.9/x86-entry-64-clear-extra-registers-beyond-syscall-arguments-to-reduce-speculation-attack-surface.patch
queue-4.9/libnvdimm-fix-integer-overflow-static-analysis-warning.patch
commit b7f0554a56f21fb3e636a627450a9add030889be upstream.
Until there is a solution to the dma-to-dax vs truncate problem it is
not safe to allow V4L2, Exynos, and other frame vector users to create
long standing / irrevocable memory registrations against filesytem-dax
vmas.
[dan.j.williams(a)intel.com: add comment for vma_is_fsdax() check in get_vaddr_frames(), per Jan]
Link: http://lkml.kernel.org/r/151197874035.26211.4061781453123083667.stgit@dwill…
Link: http://lkml.kernel.org/r/151068939985.7446.15684639617389154187.stgit@dwill…
Fixes: 3565fce3a659 ("mm, x86: get_user_pages() for dax mappings")
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Cc: Inki Dae <inki.dae(a)samsung.com>
Cc: Seung-Woo Kim <sw0312.kim(a)samsung.com>
Cc: Joonyoung Shim <jy0922.shim(a)samsung.com>
Cc: Kyungmin Park <kyungmin.park(a)samsung.com>
Cc: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Cc: Mel Gorman <mgorman(a)suse.de>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Doug Ledford <dledford(a)redhat.com>
Cc: Hal Rosenstock <hal.rosenstock(a)gmail.com>
Cc: Jason Gunthorpe <jgg(a)mellanox.com>
Cc: Jeff Moyer <jmoyer(a)redhat.com>
Cc: Ross Zwisler <ross.zwisler(a)linux.intel.com>
Cc: Sean Hefty <sean.hefty(a)intel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
---
mm/frame_vector.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/mm/frame_vector.c b/mm/frame_vector.c
index db77dcb38afd..375a103d7a56 100644
--- a/mm/frame_vector.c
+++ b/mm/frame_vector.c
@@ -52,6 +52,18 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames,
ret = -EFAULT;
goto out;
}
+
+ /*
+ * While get_vaddr_frames() could be used for transient (kernel
+ * controlled lifetime) pinning of memory pages all current
+ * users establish long term (userspace controlled lifetime)
+ * page pinning. Treat get_vaddr_frames() like
+ * get_user_pages_longterm() and disallow it for filesystem-dax
+ * mappings.
+ */
+ if (vma_is_fsdax(vma))
+ return -EOPNOTSUPP;
+
if (!(vma->vm_flags & (VM_IO | VM_PFNMAP))) {
vec->got_ref = true;
vec->is_pfns = false;
From: Ross Zwisler <ross.zwisler(a)linux.intel.com>
commit d0f0931de936a0a468d7e59284d39581c16d3a73 upstream.
When the pmd_devmap() checks were added by 5c7fb56e5e3f ("mm, dax:
dax-pmd vs thp-pmd vs hugetlbfs-pmd") to add better support for DAX huge
pages, they were all added to the end of if() statements after existing
pmd_trans_huge() checks. So, things like:
- if (pmd_trans_huge(*pmd))
+ if (pmd_trans_huge(*pmd) || pmd_devmap(*pmd))
When further checks were added after pmd_trans_unstable() checks by
commit 7267ec008b5c ("mm: postpone page table allocation until we have
page to map") they were also added at the end of the conditional:
+ if (pmd_trans_unstable(fe->pmd) || pmd_devmap(*fe->pmd))
This ordering is fine for pmd_trans_huge(), but doesn't work for
pmd_trans_unstable(). This is because DAX huge pages trip the bad_pmd()
check inside of pmd_none_or_trans_huge_or_clear_bad() (called by
pmd_trans_unstable()), which prints out a warning and returns 1. So, we
do end up doing the right thing, but only after spamming dmesg with
suspicious looking messages:
mm/pgtable-generic.c:39: bad pmd ffff8808daa49b88(84000001006000a5)
Reorder these checks in a helper so that pmd_devmap() is checked first,
avoiding the error messages, and add a comment explaining why the
ordering is important.
Fixes: commit 7267ec008b5c ("mm: postpone page table allocation until we have page to map")
Link: http://lkml.kernel.org/r/20170522215749.23516-1-ross.zwisler@linux.intel.com
Signed-off-by: Ross Zwisler <ross.zwisler(a)linux.intel.com>
Reviewed-by: Jan Kara <jack(a)suse.cz>
Cc: Pawel Lebioda <pawel.lebioda(a)intel.com>
Cc: "Darrick J. Wong" <darrick.wong(a)oracle.com>
Cc: Alexander Viro <viro(a)zeniv.linux.org.uk>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Dave Hansen <dave.hansen(a)intel.com>
Cc: Matthew Wilcox <mawilcox(a)microsoft.com>
Cc: "Kirill A . Shutemov" <kirill.shutemov(a)linux.intel.com>
Cc: Dave Jiang <dave.jiang(a)intel.com>
Cc: Xiong Zhou <xzhou(a)redhat.com>
Cc: Eryu Guan <eguan(a)redhat.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
---
mm/memory.c | 40 ++++++++++++++++++++++++++++++----------
1 file changed, 30 insertions(+), 10 deletions(-)
diff --git a/mm/memory.c b/mm/memory.c
index e2e68767a373..d2db2c4eb0a4 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -2848,6 +2848,17 @@ static int __do_fault(struct fault_env *fe, pgoff_t pgoff,
return ret;
}
+/*
+ * The ordering of these checks is important for pmds with _PAGE_DEVMAP set.
+ * If we check pmd_trans_unstable() first we will trip the bad_pmd() check
+ * inside of pmd_none_or_trans_huge_or_clear_bad(). This will end up correctly
+ * returning 1 but not before it spams dmesg with the pmd_clear_bad() output.
+ */
+static int pmd_devmap_trans_unstable(pmd_t *pmd)
+{
+ return pmd_devmap(*pmd) || pmd_trans_unstable(pmd);
+}
+
static int pte_alloc_one_map(struct fault_env *fe)
{
struct vm_area_struct *vma = fe->vma;
@@ -2871,18 +2882,27 @@ static int pte_alloc_one_map(struct fault_env *fe)
map_pte:
/*
* If a huge pmd materialized under us just retry later. Use
- * pmd_trans_unstable() instead of pmd_trans_huge() to ensure the pmd
- * didn't become pmd_trans_huge under us and then back to pmd_none, as
- * a result of MADV_DONTNEED running immediately after a huge pmd fault
- * in a different thread of this mm, in turn leading to a misleading
- * pmd_trans_huge() retval. All we have to ensure is that it is a
- * regular pmd that we can walk with pte_offset_map() and we can do that
- * through an atomic read in C, which is what pmd_trans_unstable()
- * provides.
+ * pmd_trans_unstable() via pmd_devmap_trans_unstable() instead of
+ * pmd_trans_huge() to ensure the pmd didn't become pmd_trans_huge
+ * under us and then back to pmd_none, as a result of MADV_DONTNEED
+ * running immediately after a huge pmd fault in a different thread of
+ * this mm, in turn leading to a misleading pmd_trans_huge() retval.
+ * All we have to ensure is that it is a regular pmd that we can walk
+ * with pte_offset_map() and we can do that through an atomic read in
+ * C, which is what pmd_trans_unstable() provides.
*/
- if (pmd_trans_unstable(fe->pmd) || pmd_devmap(*fe->pmd))
+ if (pmd_devmap_trans_unstable(fe->pmd))
return VM_FAULT_NOPAGE;
+ /*
+ * At this point we know that our vmf->pmd points to a page of ptes
+ * and it cannot become pmd_none(), pmd_devmap() or pmd_trans_huge()
+ * for the duration of the fault. If a racing MADV_DONTNEED runs and
+ * we zap the ptes pointed to by our vmf->pmd, the vmf->ptl will still
+ * be valid and we will re-check to make sure the vmf->pte isn't
+ * pte_none() under vmf->ptl protection when we return to
+ * alloc_set_pte().
+ */
fe->pte = pte_offset_map_lock(vma->vm_mm, fe->pmd, fe->address,
&fe->ptl);
return 0;
@@ -3456,7 +3476,7 @@ static int handle_pte_fault(struct fault_env *fe)
fe->pte = NULL;
} else {
/* See comment in pte_alloc_one_map() */
- if (pmd_trans_unstable(fe->pmd) || pmd_devmap(*fe->pmd))
+ if (pmd_devmap_trans_unstable(fe->pmd))
return 0;
/*
* A regular pmd is established and it can't morph into a huge
commit 58738c495e15badd2015e19ff41f1f1ed55200bc upstream.
Dan reports:
The patch 62232e45f4a2: "libnvdimm: control (ioctl) messages for
nvdimm_bus and nvdimm devices" from Jun 8, 2015, leads to the
following static checker warning:
drivers/nvdimm/bus.c:1018 __nd_ioctl()
warn: integer overflows 'buf_len'
From a casual review, this seems like it might be a real bug. On
the first iteration we load some data into in_env[]. On the second
iteration we read a use controlled "in_size" from nd_cmd_in_size().
It can go up to UINT_MAX - 1. A high number means we will fill the
whole in_env[] buffer. But we potentially keep looping and adding
more to in_len so now it can be any value.
It simple enough to change, but it feels weird that we keep looping
even though in_env is totally full. Shouldn't we just return an
error if we don't have space for desc->in_num.
We keep looping because the size of the total input is allowed to be
bigger than the 'envelope' which is a subset of the payload that tells
us how much data to expect. For safety explicitly check that buf_len
does not overflow which is what the checker flagged.
Cc: <stable(a)vger.kernel.org>
Fixes: 62232e45f4a2: "libnvdimm: control (ioctl) messages for nvdimm_bus..."
Reported-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
---
drivers/nvdimm/bus.c | 11 ++++++-----
1 file changed, 6 insertions(+), 5 deletions(-)
diff --git a/drivers/nvdimm/bus.c b/drivers/nvdimm/bus.c
index 0392eb8a0dea..8311a93cabd8 100644
--- a/drivers/nvdimm/bus.c
+++ b/drivers/nvdimm/bus.c
@@ -812,16 +812,17 @@ static int __nd_ioctl(struct nvdimm_bus *nvdimm_bus, struct nvdimm *nvdimm,
int read_only, unsigned int ioctl_cmd, unsigned long arg)
{
struct nvdimm_bus_descriptor *nd_desc = nvdimm_bus->nd_desc;
- size_t buf_len = 0, in_len = 0, out_len = 0;
static char out_env[ND_CMD_MAX_ENVELOPE];
static char in_env[ND_CMD_MAX_ENVELOPE];
const struct nd_cmd_desc *desc = NULL;
unsigned int cmd = _IOC_NR(ioctl_cmd);
void __user *p = (void __user *) arg;
struct device *dev = &nvdimm_bus->dev;
- struct nd_cmd_pkg pkg;
const char *cmd_name, *dimm_name;
+ u32 in_len = 0, out_len = 0;
unsigned long cmd_mask;
+ struct nd_cmd_pkg pkg;
+ u64 buf_len = 0;
void *buf;
int rc, i;
@@ -882,7 +883,7 @@ static int __nd_ioctl(struct nvdimm_bus *nvdimm_bus, struct nvdimm *nvdimm,
}
if (cmd == ND_CMD_CALL) {
- dev_dbg(dev, "%s:%s, idx: %llu, in: %zu, out: %zu, len %zu\n",
+ dev_dbg(dev, "%s:%s, idx: %llu, in: %u, out: %u, len %llu\n",
__func__, dimm_name, pkg.nd_command,
in_len, out_len, buf_len);
@@ -912,9 +913,9 @@ static int __nd_ioctl(struct nvdimm_bus *nvdimm_bus, struct nvdimm *nvdimm,
out_len += out_size;
}
- buf_len = out_len + in_len;
+ buf_len = (u64) out_len + (u64) in_len;
if (buf_len > ND_IOCTL_MAX_BUFLEN) {
- dev_dbg(dev, "%s:%s cmd: %s buf_len: %zu > %d\n", __func__,
+ dev_dbg(dev, "%s:%s cmd: %s buf_len: %llu > %d\n", __func__,
dimm_name, cmd_name, buf_len,
ND_IOCTL_MAX_BUFLEN);
return -EINVAL;
From: Jan Kara <jack(a)suse.cz>
commit 1eb643d02b21412e603b42cdd96010a2ac31c05f upstream.
dax_writeback_mapping_range() fails to update iteration index when
searching radix tree for entries needing cache flushing. Thus each
pagevec worth of entries is searched starting from the start which is
inefficient and prone to livelocks. Update index properly.
Link: http://lkml.kernel.org/r/20170619124531.21491-1-jack@suse.cz
Fixes: 9973c98ecfda3 ("dax: add support for fsync/sync")
Signed-off-by: Jan Kara <jack(a)suse.cz>
Reviewed-by: Ross Zwisler <ross.zwisler(a)linux.intel.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
---
fs/dax.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/fs/dax.c b/fs/dax.c
index 800748f10b3d..71f87d74afe1 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -785,6 +785,7 @@ int dax_writeback_mapping_range(struct address_space *mapping,
if (ret < 0)
return ret;
}
+ start_index = indices[pvec.nr - 1] + 1;
}
return 0;
}
This is a note to let you know that I've just added the patch titled
drm/i915/breadcrumbs: Ignore unsubmitted signalers
to the 4.15-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
drm-i915-breadcrumbs-ignore-unsubmitted-signalers.patch
and it can be found in the queue-4.15 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 117172c8f9d40ba1de8cb35c6e614422faa03330 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Tue, 13 Feb 2018 09:01:54 +0000
Subject: drm/i915/breadcrumbs: Ignore unsubmitted signalers
From: Chris Wilson <chris(a)chris-wilson.co.uk>
commit 117172c8f9d40ba1de8cb35c6e614422faa03330 upstream.
When a request is preempted, it is unsubmitted from the HW queue and
removed from the active list of breadcrumbs. In the process, this
however triggers the signaler and it may see the clear rbtree with the
old, and still valid, seqno, or it may match the cleared seqno with the
now zero rq->global_seqno. This confuses the signaler into action and
signaling the fence.
Fixes: d6a2289d9d6b ("drm/i915: Remove the preempted request from the execution queue")
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin(a)linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen(a)linux.intel.com>
Cc: <stable(a)vger.kernel.org> # v4.12+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180206094633.30181-1-chris@…
(cherry picked from commit fd10e2ce9905030d922e179a8047a4d50daffd8e)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180213090154.17373-1-chris@…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/gpu/drm/i915/intel_breadcrumbs.c | 29 ++++++++++-------------------
1 file changed, 10 insertions(+), 19 deletions(-)
--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
@@ -552,29 +552,16 @@ void intel_engine_remove_wait(struct int
spin_unlock_irq(&b->rb_lock);
}
-static bool signal_valid(const struct drm_i915_gem_request *request)
-{
- return intel_wait_check_request(&request->signaling.wait, request);
-}
-
static bool signal_complete(const struct drm_i915_gem_request *request)
{
if (!request)
return false;
- /* If another process served as the bottom-half it may have already
- * signalled that this wait is already completed.
- */
- if (intel_wait_complete(&request->signaling.wait))
- return signal_valid(request);
-
- /* Carefully check if the request is complete, giving time for the
+ /*
+ * Carefully check if the request is complete, giving time for the
* seqno to be visible or if the GPU hung.
*/
- if (__i915_request_irq_complete(request))
- return true;
-
- return false;
+ return __i915_request_irq_complete(request);
}
static struct drm_i915_gem_request *to_signaler(struct rb_node *rb)
@@ -617,9 +604,13 @@ static int intel_breadcrumbs_signaler(vo
request = i915_gem_request_get_rcu(request);
rcu_read_unlock();
if (signal_complete(request)) {
- local_bh_disable();
- dma_fence_signal(&request->fence);
- local_bh_enable(); /* kick start the tasklets */
+ if (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
+ &request->fence.flags)) {
+ local_bh_disable();
+ dma_fence_signal(&request->fence);
+ GEM_BUG_ON(!i915_gem_request_completed(request));
+ local_bh_enable(); /* kick start the tasklets */
+ }
spin_lock_irq(&b->rb_lock);
Patches currently in stable-queue which might be from chris(a)chris-wilson.co.uk are
queue-4.15/drm-handle-unexpected-holes-in-color-eviction.patch
queue-4.15/drm-i915-breadcrumbs-ignore-unsubmitted-signalers.patch
This is a note to let you know that I've just added the patch titled
drm/i915/breadcrumbs: Ignore unsubmitted signalers
to the 4.14-stable tree which can be found at:
http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
The filename of the patch is:
drm-i915-breadcrumbs-ignore-unsubmitted-signalers.patch
and it can be found in the queue-4.14 subdirectory.
If you, or anyone else, feels it should not be added to the stable tree,
please let <stable(a)vger.kernel.org> know about it.
>From 117172c8f9d40ba1de8cb35c6e614422faa03330 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Tue, 13 Feb 2018 09:01:54 +0000
Subject: drm/i915/breadcrumbs: Ignore unsubmitted signalers
From: Chris Wilson <chris(a)chris-wilson.co.uk>
commit 117172c8f9d40ba1de8cb35c6e614422faa03330 upstream.
When a request is preempted, it is unsubmitted from the HW queue and
removed from the active list of breadcrumbs. In the process, this
however triggers the signaler and it may see the clear rbtree with the
old, and still valid, seqno, or it may match the cleared seqno with the
now zero rq->global_seqno. This confuses the signaler into action and
signaling the fence.
Fixes: d6a2289d9d6b ("drm/i915: Remove the preempted request from the execution queue")
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin(a)linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen(a)linux.intel.com>
Cc: <stable(a)vger.kernel.org> # v4.12+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180206094633.30181-1-chris@…
(cherry picked from commit fd10e2ce9905030d922e179a8047a4d50daffd8e)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20180213090154.17373-1-chris@…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/gpu/drm/i915/intel_breadcrumbs.c | 29 ++++++++++-------------------
1 file changed, 10 insertions(+), 19 deletions(-)
--- a/drivers/gpu/drm/i915/intel_breadcrumbs.c
+++ b/drivers/gpu/drm/i915/intel_breadcrumbs.c
@@ -541,29 +541,16 @@ void intel_engine_remove_wait(struct int
spin_unlock_irq(&b->rb_lock);
}
-static bool signal_valid(const struct drm_i915_gem_request *request)
-{
- return intel_wait_check_request(&request->signaling.wait, request);
-}
-
static bool signal_complete(const struct drm_i915_gem_request *request)
{
if (!request)
return false;
- /* If another process served as the bottom-half it may have already
- * signalled that this wait is already completed.
- */
- if (intel_wait_complete(&request->signaling.wait))
- return signal_valid(request);
-
- /* Carefully check if the request is complete, giving time for the
+ /*
+ * Carefully check if the request is complete, giving time for the
* seqno to be visible or if the GPU hung.
*/
- if (__i915_request_irq_complete(request))
- return true;
-
- return false;
+ return __i915_request_irq_complete(request);
}
static struct drm_i915_gem_request *to_signaler(struct rb_node *rb)
@@ -606,9 +593,13 @@ static int intel_breadcrumbs_signaler(vo
request = i915_gem_request_get_rcu(request);
rcu_read_unlock();
if (signal_complete(request)) {
- local_bh_disable();
- dma_fence_signal(&request->fence);
- local_bh_enable(); /* kick start the tasklets */
+ if (!test_bit(DMA_FENCE_FLAG_SIGNALED_BIT,
+ &request->fence.flags)) {
+ local_bh_disable();
+ dma_fence_signal(&request->fence);
+ GEM_BUG_ON(!i915_gem_request_completed(request));
+ local_bh_enable(); /* kick start the tasklets */
+ }
spin_lock_irq(&b->rb_lock);
Patches currently in stable-queue which might be from chris(a)chris-wilson.co.uk are
queue-4.14/drm-handle-unexpected-holes-in-color-eviction.patch
queue-4.14/drm-i915-breadcrumbs-ignore-unsubmitted-signalers.patch
From: Eric Biggers <ebiggers(a)google.com>
commit 4b34968e77ad09628cfb3c4a7daf2adc2cefc6e8 upstream.
[Please apply to 4.9-stable. I dropped the portion of the patch
pertaining to KEYCTL_RESTRICT_KEYRING which was added in v4.12.]
The asymmetric key type allows an X.509 certificate to be added even if
its signature's hash algorithm is not available in the crypto API. In
that case 'payload.data[asym_auth]' will be NULL. But the key
restriction code failed to check for this case before trying to use the
signature, resulting in a NULL pointer dereference in
key_or_keyring_common() or in restrict_link_by_signature().
Fix this by returning -ENOPKG when the signature is unsupported.
Reproducer when all the CONFIG_CRYPTO_SHA512* options are disabled and
keyctl has support for the 'restrict_keyring' command:
keyctl new_session
keyctl restrict_keyring @s asymmetric builtin_trusted
openssl req -new -sha512 -x509 -batch -nodes -outform der \
| keyctl padd asymmetric desc @s
Fixes: a511e1af8b12 ("KEYS: Move the point of trust determination to __key_link()")
Cc: <stable(a)vger.kernel.org> # v4.7+
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Signed-off-by: David Howells <dhowells(a)redhat.com>
---
crypto/asymmetric_keys/restrict.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/crypto/asymmetric_keys/restrict.c b/crypto/asymmetric_keys/restrict.c
index 19d1afb9890f..09b1374dc619 100644
--- a/crypto/asymmetric_keys/restrict.c
+++ b/crypto/asymmetric_keys/restrict.c
@@ -66,8 +66,9 @@ __setup("ca_keys=", ca_keys_setup);
*
* Returns 0 if the new certificate was accepted, -ENOKEY if we couldn't find a
* matching parent certificate in the trusted list, -EKEYREJECTED if the
- * signature check fails or the key is blacklisted and some other error if
- * there is a matching certificate but the signature check cannot be performed.
+ * signature check fails or the key is blacklisted, -ENOPKG if the signature
+ * uses unsupported crypto, or some other error if there is a matching
+ * certificate but the signature check cannot be performed.
*/
int restrict_link_by_signature(struct key *trust_keyring,
const struct key_type *type,
@@ -86,6 +87,8 @@ int restrict_link_by_signature(struct key *trust_keyring,
return -EOPNOTSUPP;
sig = payload->data[asym_auth];
+ if (!sig)
+ return -ENOPKG;
if (!sig->auth_ids[0] && !sig->auth_ids[1])
return -ENOKEY;
--
2.16.1.291.g4437f3f132-goog
From: Eric Biggers <ebiggers(a)google.com>
commit e0058f3a874ebb48b25be7ff79bc3b4e59929f90 upstream.
[Please apply to 3.18-stable. Patch was fixed up to drop usage of
ASN.1 state machine opcodes which were added in 4.3.]
In asn1_ber_decoder(), indefinitely-sized ASN.1 items were being passed
to the action functions before their lengths had been computed, using
the bogus length of 0x80 (ASN1_INDEFINITE_LENGTH). This resulted in
reading data past the end of the input buffer, when given a specially
crafted message.
Fix it by rearranging the code so that the indefinite length is resolved
before the action is called.
This bug was originally found by fuzzing the X.509 parser in userspace
using libFuzzer from the LLVM project.
KASAN report (cleaned up slightly):
BUG: KASAN: slab-out-of-bounds in memcpy ./include/linux/string.h:341 [inline]
BUG: KASAN: slab-out-of-bounds in x509_fabricate_name.constprop.1+0x1a4/0x940 crypto/asymmetric_keys/x509_cert_parser.c:366
Read of size 128 at addr ffff880035dd9eaf by task keyctl/195
CPU: 1 PID: 195 Comm: keyctl Not tainted 4.14.0-09238-g1d3b78bbc6e9 #26
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.11.0-20171110_100015-anatol 04/01/2014
Call Trace:
__dump_stack lib/dump_stack.c:17 [inline]
dump_stack+0xd1/0x175 lib/dump_stack.c:53
print_address_description+0x78/0x260 mm/kasan/report.c:252
kasan_report_error mm/kasan/report.c:351 [inline]
kasan_report+0x23f/0x350 mm/kasan/report.c:409
memcpy+0x1f/0x50 mm/kasan/kasan.c:302
memcpy ./include/linux/string.h:341 [inline]
x509_fabricate_name.constprop.1+0x1a4/0x940 crypto/asymmetric_keys/x509_cert_parser.c:366
asn1_ber_decoder+0xb4a/0x1fd0 lib/asn1_decoder.c:447
x509_cert_parse+0x1c7/0x620 crypto/asymmetric_keys/x509_cert_parser.c:89
x509_key_preparse+0x61/0x750 crypto/asymmetric_keys/x509_public_key.c:174
asymmetric_key_preparse+0xa4/0x150 crypto/asymmetric_keys/asymmetric_type.c:388
key_create_or_update+0x4d4/0x10a0 security/keys/key.c:850
SYSC_add_key security/keys/keyctl.c:122 [inline]
SyS_add_key+0xe8/0x290 security/keys/keyctl.c:62
entry_SYSCALL_64_fastpath+0x1f/0x96
Allocated by task 195:
__do_kmalloc_node mm/slab.c:3675 [inline]
__kmalloc_node+0x47/0x60 mm/slab.c:3682
kvmalloc ./include/linux/mm.h:540 [inline]
SYSC_add_key security/keys/keyctl.c:104 [inline]
SyS_add_key+0x19e/0x290 security/keys/keyctl.c:62
entry_SYSCALL_64_fastpath+0x1f/0x96
Fixes: 42d5ec27f873 ("X.509: Add an ASN.1 decoder")
Reported-by: Alexander Potapenko <glider(a)google.com>
Cc: <stable(a)vger.kernel.org> # v3.7+
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Signed-off-by: David Howells <dhowells(a)redhat.com>
---
lib/asn1_decoder.c | 43 ++++++++++++++++++++++++-------------------
1 file changed, 24 insertions(+), 19 deletions(-)
diff --git a/lib/asn1_decoder.c b/lib/asn1_decoder.c
index 7f453b6ff0d0..c3a768ae1a40 100644
--- a/lib/asn1_decoder.c
+++ b/lib/asn1_decoder.c
@@ -305,38 +305,43 @@ next_op:
/* Decide how to handle the operation */
switch (op) {
- case ASN1_OP_MATCH_ANY_ACT:
- case ASN1_OP_COND_MATCH_ANY_ACT:
- ret = actions[machine[pc + 1]](context, hdr, tag, data + dp, len);
- if (ret < 0)
- return ret;
- goto skip_data;
-
- case ASN1_OP_MATCH_ACT:
- case ASN1_OP_MATCH_ACT_OR_SKIP:
- case ASN1_OP_COND_MATCH_ACT_OR_SKIP:
- ret = actions[machine[pc + 2]](context, hdr, tag, data + dp, len);
- if (ret < 0)
- return ret;
- goto skip_data;
-
case ASN1_OP_MATCH:
case ASN1_OP_MATCH_OR_SKIP:
+ case ASN1_OP_MATCH_ACT:
+ case ASN1_OP_MATCH_ACT_OR_SKIP:
case ASN1_OP_MATCH_ANY:
+ case ASN1_OP_MATCH_ANY_ACT:
case ASN1_OP_COND_MATCH_OR_SKIP:
+ case ASN1_OP_COND_MATCH_ACT_OR_SKIP:
case ASN1_OP_COND_MATCH_ANY:
- skip_data:
+ case ASN1_OP_COND_MATCH_ANY_ACT:
+
if (!(flags & FLAG_CONS)) {
if (flags & FLAG_INDEFINITE_LENGTH) {
+ size_t tmp = dp;
+
ret = asn1_find_indefinite_length(
- data, datalen, &dp, &len, &errmsg);
+ data, datalen, &tmp, &len, &errmsg);
if (ret < 0)
goto error;
- } else {
- dp += len;
}
pr_debug("- LEAF: %zu\n", len);
}
+
+ if (op & ASN1_OP_MATCH__ACT) {
+ unsigned char act;
+
+ if (op & ASN1_OP_MATCH__ANY)
+ act = machine[pc + 1];
+ else
+ act = machine[pc + 2];
+ ret = actions[act](context, hdr, tag, data + dp, len);
+ if (ret < 0)
+ return ret;
+ }
+
+ if (!(flags & FLAG_CONS))
+ dp += len;
pc += asn1_op_lengths[op];
goto next_op;
--
2.16.1.291.g4437f3f132-goog