The patch titled
Subject: mm/shmem.c: thp, shmem: fix conflict of above-47bit hint address and PMD alignment
has been removed from the -mm tree. Its filename was
thp-shmem-fix-conflict-of-above-47bit-hint-address-and-pmd-alignment.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: "Kirill A. Shutemov" <kirill(a)shutemov.name>
Subject: mm/shmem.c: thp, shmem: fix conflict of above-47bit hint address and PMD alignment
Shmem/tmpfs tries to provide THP-friendly mappings if huge pages are
enabled, but this doesn't work well with an above-47bit hint address.
Normally, the kernel doesn't create userspace mappings above 47-bit, even
if the machine allows this (such as with 5-level paging on x86-64). Not
all user space is ready to handle wide addresses. It's known that at
least some JIT compilers use higher bits in pointers to encode their
information.
Userspace can ask for an allocation from the full address space by
specifying a hint address (with or without MAP_FIXED) above 47 bits. If
the application doesn't need a particular address but wants to allocate
from the whole address space, it can specify -1 as the hint address.
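For illustration, a minimal userspace sketch of this opt-in (the hint
value and length below are arbitrary choices, not part of the patch;
assumes x86-64 with 5-level paging):

/* Passing a hint above the 47-bit boundary (here -1, meaning "anywhere
 * in the full address space") lifts the default 47-bit limit for this
 * allocation. Without MAP_FIXED the hint is only a preference.
 */
#include <stdio.h>
#include <sys/mman.h>

int main(void)
{
	size_t len = 2UL << 20;	/* 2 MiB: one PMD worth of memory */
	void *p = mmap((void *)-1UL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}
	printf("mapped at %p\n", p);	/* may be above 47 bits */
	munmap(p, len);
	return 0;
}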
Unfortunately, this trick breaks THP alignment in shmem/tmpfs:
shmem_get_unmapped_area() would not try to allocate a PMD-aligned area if
*any* hint address was specified.
This can be fixed by requesting the aligned area only if allocation at
the user-specified hint address failed. The request with the inflated
length also takes the user-specified hint address, so we do not lose
allocation requests from the full address space.
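As a hypothetical way to observe the fix from userspace, the sketch
below maps a tmpfs-backed file with an above-47bit hint and checks PMD
alignment; the path, hint value and sizes are assumptions, and the
tmpfs mount is presumed to have huge pages enabled:

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	const char *path = "/dev/shm/thp-align-test"; /* assumed tmpfs mount */
	size_t len = 4UL << 20;			      /* two PMDs */
	int fd = open(path, O_RDWR | O_CREAT, 0600);

	if (fd < 0 || ftruncate(fd, len))
		return 1;

	/* High hint: opt in to the full address space, as described above. */
	void *p = mmap((void *)(1UL << 48), len, PROT_READ | PROT_WRITE,
		       MAP_SHARED, fd, 0);
	if (p == MAP_FAILED)
		return 1;

	/* With the fix, the fallback allocation is still PMD-aligned. */
	printf("PMD-aligned: %s\n",
	       ((unsigned long)p & ((2UL << 20) - 1)) ? "no" : "yes");

	munmap(p, len);
	close(fd);
	unlink(path);
	return 0;
}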
[kirill(a)shutemov.name: fold in a fixup]
Link: http://lkml.kernel.org/r/20191223231309.t6bh5hkbmokihpfu@box
Link: http://lkml.kernel.org/r/20191220142548.7118-3-kirill.shutemov@linux.intel.…
Fixes: b569bab78d8d ("x86/mm: Prepare to expose larger address space to userspace")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Cc: "Willhalm, Thomas" <thomas.willhalm(a)intel.com>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: "Bruggeman, Otto G" <otto.g.bruggeman(a)intel.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar(a)linux.vnet.ibm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/shmem.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
--- a/mm/shmem.c~thp-shmem-fix-conflict-of-above-47bit-hint-address-and-pmd-alignment
+++ a/mm/shmem.c
@@ -2107,9 +2107,10 @@ unsigned long shmem_get_unmapped_area(st
/*
* Our priority is to support MAP_SHARED mapped hugely;
* and support MAP_PRIVATE mapped hugely too, until it is COWed.
- * But if caller specified an address hint, respect that as before.
+ * But if caller specified an address hint and we allocated area there
+ * successfully, respect that as before.
*/
- if (uaddr)
+ if (uaddr == addr)
return addr;
if (shmem_huge != SHMEM_HUGE_FORCE) {
@@ -2143,7 +2144,7 @@ unsigned long shmem_get_unmapped_area(st
if (inflated_len < len)
return addr;
- inflated_addr = get_area(NULL, 0, inflated_len, 0, flags);
+ inflated_addr = get_area(NULL, uaddr, inflated_len, 0, flags);
if (IS_ERR_VALUE(inflated_addr))
return addr;
if (inflated_addr & ~PAGE_MASK)
_
Patches currently in -mm which might be from kirill(a)shutemov.name are
mm-page_alloc-skip-non-present-sections-on-zone-initialization.patch
The patch titled
Subject: mm/huge_memory.c: thp: fix conflict of above-47bit hint address and PMD alignment
has been removed from the -mm tree. Its filename was
thp-fix-conflict-of-above-47bit-hint-address-and-pmd-alignment.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: "Kirill A. Shutemov" <kirill(a)shutemov.name>
Subject: mm/huge_memory.c: thp: fix conflict of above-47bit hint address and PMD alignment
Patch series "Fix two above-47bit hint address vs. THP bugs".
The two get_unmapped_area() implementations have to be fixed to provide
THP-friendly mappings when an above-47bit hint address is specified.
This patch (of 2):
Filesystems, DAX in particular, use thp_get_unmapped_area() to provide
THP-friendly mappings.
Normally, the kernel doesn't create userspace mappings above 47-bit, even
if the machine allows this (such as with 5-level paging on x86-64). Not
all user space is ready to handle wide addresses. It's known that at
least some JIT compilers use higher bits in pointers to encode their
information.
Userspace can ask for an allocation from the full address space by
specifying a hint address (with or without MAP_FIXED) above 47 bits. If
the application doesn't need a particular address but wants to allocate
from the whole address space, it can specify -1 as the hint address.
Unfortunately, this trick breaks thp_get_unmapped_area(): the function
would not try to allocate a PMD-aligned area if *any* hint address was
specified.
Modify the routine to handle it correctly:
- Try to allocate the space at the specified hint address with the
  length padding required for PMD alignment.
- If that fails, retry without the length padding (but with the same
  hint address).
- If the returned address matches the hint address, return it.
- Otherwise, align the address as required for THP and return it.
The user-specified hint address is passed down to get_unmapped_area(),
so an above-47bit hint address will be taken into account without
breaking the alignment requirements; a simplified sketch of the
resulting flow follows.
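Condensed, the flow looks roughly like the sketch below (simplified
from the diff that follows, which is authoritative; the len_pad
arithmetic and the off/off_align overflow checks are elided):

#include <linux/err.h>
#include <linux/mm.h>

/* Sketch only: a simplified stand-in for __thp_get_unmapped_area(). */
static unsigned long thp_gua_sketch(struct file *filp, unsigned long hint,
				    unsigned long len, loff_t off,
				    unsigned long flags)
{
	/* 1. Try the hint address, padded so the result can be aligned. */
	unsigned long ret = current->mm->get_unmapped_area(filp, hint,
				len + PMD_SIZE, off >> PAGE_SHIFT, flags);

	/* 2. On failure, the caller retries without padding (same hint). */
	if (IS_ERR_VALUE(ret))
		return 0;

	/* 3. If the hint itself worked, respect it: do not realign. */
	if (ret == hint)
		return ret;

	/* 4. Otherwise, shift within the padded area to PMD alignment. */
	ret += (off - ret) & (PMD_SIZE - 1);
	return ret;
}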
Link: http://lkml.kernel.org/r/20191220142548.7118-2-kirill.shutemov@linux.intel.…
Fixes: b569bab78d8d ("x86/mm: Prepare to expose larger address space to userspace")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Reported-by: Thomas Willhalm <thomas.willhalm(a)intel.com>
Tested-by: Dan Williams <dan.j.williams(a)intel.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar(a)linux.vnet.ibm.com>
Cc: "Bruggeman, Otto G" <otto.g.bruggeman(a)intel.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/huge_memory.c | 38 ++++++++++++++++++++++++--------------
1 file changed, 24 insertions(+), 14 deletions(-)
--- a/mm/huge_memory.c~thp-fix-conflict-of-above-47bit-hint-address-and-pmd-alignment
+++ a/mm/huge_memory.c
@@ -527,13 +527,13 @@ void prep_transhuge_page(struct page *pa
set_compound_page_dtor(page, TRANSHUGE_PAGE_DTOR);
}
-static unsigned long __thp_get_unmapped_area(struct file *filp, unsigned long len,
+static unsigned long __thp_get_unmapped_area(struct file *filp,
+ unsigned long addr, unsigned long len,
loff_t off, unsigned long flags, unsigned long size)
{
- unsigned long addr;
loff_t off_end = off + len;
loff_t off_align = round_up(off, size);
- unsigned long len_pad;
+ unsigned long len_pad, ret;
if (off_end <= off_align || (off_end - off_align) < size)
return 0;
@@ -542,30 +542,40 @@ static unsigned long __thp_get_unmapped_
if (len_pad < len || (off + len_pad) < off)
return 0;
- addr = current->mm->get_unmapped_area(filp, 0, len_pad,
+ ret = current->mm->get_unmapped_area(filp, addr, len_pad,
off >> PAGE_SHIFT, flags);
- if (IS_ERR_VALUE(addr))
+
+ /*
+ * The failure might be due to length padding. The caller will retry
+ * without the padding.
+ */
+ if (IS_ERR_VALUE(ret))
return 0;
- addr += (off - addr) & (size - 1);
- return addr;
+ /*
+ * Do not try to align to THP boundary if allocation at the address
+ * hint succeeds.
+ */
+ if (ret == addr)
+ return addr;
+
+ ret += (off - ret) & (size - 1);
+ return ret;
}
unsigned long thp_get_unmapped_area(struct file *filp, unsigned long addr,
unsigned long len, unsigned long pgoff, unsigned long flags)
{
+ unsigned long ret;
loff_t off = (loff_t)pgoff << PAGE_SHIFT;
- if (addr)
- goto out;
if (!IS_DAX(filp->f_mapping->host) || !IS_ENABLED(CONFIG_FS_DAX_PMD))
goto out;
- addr = __thp_get_unmapped_area(filp, len, off, flags, PMD_SIZE);
- if (addr)
- return addr;
-
- out:
+ ret = __thp_get_unmapped_area(filp, addr, len, off, flags, PMD_SIZE);
+ if (ret)
+ return ret;
+out:
return current->mm->get_unmapped_area(filp, addr, len, pgoff, flags);
}
EXPORT_SYMBOL_GPL(thp_get_unmapped_area);
_
Patches currently in -mm which might be from kirill(a)shutemov.name are
mm-page_alloc-skip-non-present-sections-on-zone-initialization.patch
The patch titled
Subject: media/v4l2-core: set pages dirty upon releasing DMA buffers
has been added to the -mm tree. Its filename is
media-v4l2-core-set-pages-dirty-upon-releasing-dma-buffers.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/media-v4l2-core-set-pages-dirty-up…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/media-v4l2-core-set-pages-dirty-up…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: John Hubbard <jhubbard(a)nvidia.com>
Subject: media/v4l2-core: set pages dirty upon releasing DMA buffers
After DMA is complete and the device and CPU caches are synchronized,
it is still required to mark the CPU pages as dirty if the data came
from the device. However, this driver was issuing a bare put_page()
call without any set_page_dirty*() call.
Fix the problem by calling set_page_dirty_lock() if the CPU pages
potentially received data from the device.
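The underlying convention, as a sketch (the helper name here is
hypothetical; the actual one-line fix is in the diff below): pages
pinned with get_user_pages() that a device may have written to must be
marked dirty before the reference is dropped, and set_page_dirty_lock()
is the variant for callers that do not already hold the page lock.

#include <linux/dma-direction.h>
#include <linux/mm.h>

/* Sketch only: release pinned pages after DMA, dirtying them if the
 * device wrote into them (DMA_FROM_DEVICE), so the data is not lost
 * when the pages back a file.
 */
static void release_dma_pages(struct page **pages, int nr_pages,
			      enum dma_data_direction dir)
{
	int i;

	for (i = 0; i < nr_pages; i++) {
		if (dir == DMA_FROM_DEVICE)
			set_page_dirty_lock(pages[i]);
		put_page(pages[i]);
	}
}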
Link: http://lkml.kernel.org/r/20200107224558.2362728-11-jhubbard@nvidia.com
Signed-off-by: John Hubbard <jhubbard(a)nvidia.com>
Reviewed-by: Christoph Hellwig <hch(a)lst.de>
Acked-by: Hans Verkuil <hverkuil-cisco(a)xs4all.nl>
Cc: Mauro Carvalho Chehab <mchehab(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Cc: Alex Williamson <alex.williamson(a)redhat.com>
Cc: Aneesh Kumar K.V <aneesh.kumar(a)linux.ibm.com>
Cc: Björn Töpel <bjorn.topel(a)intel.com>
Cc: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Cc: Dan Williams <dan.j.williams(a)intel.com>
Cc: Ira Weiny <ira.weiny(a)intel.com>
Cc: Jan Kara <jack(a)suse.cz>
Cc: Jason Gunthorpe <jgg(a)mellanox.com>
Cc: Jason Gunthorpe <jgg(a)ziepe.ca>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: Jerome Glisse <jglisse(a)redhat.com>
Cc: Jonathan Corbet <corbet(a)lwn.net>
Cc: Kirill A. Shutemov <kirill(a)shutemov.name>
Cc: Leon Romanovsky <leonro(a)mellanox.com>
Cc: Mike Rapoport <rppt(a)linux.ibm.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
drivers/media/v4l2-core/videobuf-dma-sg.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--- a/drivers/media/v4l2-core/videobuf-dma-sg.c~media-v4l2-core-set-pages-dirty-upon-releasing-dma-buffers
+++ a/drivers/media/v4l2-core/videobuf-dma-sg.c
@@ -349,8 +349,11 @@ int videobuf_dma_free(struct videobuf_dm
BUG_ON(dma->sglen);
if (dma->pages) {
- for (i = 0; i < dma->nr_pages; i++)
+ for (i = 0; i < dma->nr_pages; i++) {
+ if (dma->direction == DMA_FROM_DEVICE)
+ set_page_dirty_lock(dma->pages[i]);
put_page(dma->pages[i]);
+ }
kfree(dma->pages);
dma->pages = NULL;
}
_
Patches currently in -mm which might be from jhubbard(a)nvidia.com are
mm-gup-factor-out-duplicate-code-from-four-routines.patch
mm-gup-move-try_get_compound_head-to-top-fix-minor-issues.patch
mm-devmap-refactor-1-based-refcounting-for-zone_device-pages.patch
goldish_pipe-rename-local-pin_user_pages-routine.patch
mm-fix-get_user_pages_remotes-handling-of-foll_longterm.patch
vfio-fix-foll_longterm-use-simplify-get_user_pages_remote-call.patch
mm-gup-allow-foll_force-for-get_user_pages_fast.patch
ib-umem-use-get_user_pages_fast-to-pin-dma-pages.patch
media-v4l2-core-set-pages-dirty-upon-releasing-dma-buffers.patch
mm-gup-introduce-pin_user_pages-and-foll_pin.patch
goldish_pipe-convert-to-pin_user_pages-and-put_user_page.patch
ib-corehwumem-set-foll_pin-via-pin_user_pages-fix-up-odp.patch
mm-process_vm_access-set-foll_pin-via-pin_user_pages_remote.patch
drm-via-set-foll_pin-via-pin_user_pages_fast.patch
fs-io_uring-set-foll_pin-via-pin_user_pages.patch
net-xdp-set-foll_pin-via-pin_user_pages.patch
media-v4l2-core-pin_user_pages-foll_pin-and-put_user_page-conversion.patch
vfio-mm-pin_user_pages-foll_pin-and-put_user_page-conversion.patch
powerpc-book3s64-convert-to-pin_user_pages-and-put_user_page.patch
mm-gup_benchmark-use-proper-foll_write-flags-instead-of-hard-coding-1.patch
mm-tree-wide-rename-put_user_page-to-unpin_user_page.patch