The patch titled
Subject: khugepaged: fix null-pointer dereference due to race
has been removed from the -mm tree. Its filename was
khugepaged-fix-null-pointer-dereference-due-to-race.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Subject: khugepaged: fix null-pointer dereference due to race
khugepaged has to drop mmap lock several times while collapsing a page.
The situation can change while the lock is dropped and we need to
re-validate that the VMA is still in place and the PMD is still subject
for collapse.
But we miss one corner case: while collapsing an anonymous pages the VMA
could be replaced with file VMA. If the file VMA doesn't have any
private pages we get NULL pointer dereference:
general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
anon_vma_lock_write include/linux/rmap.h:120 [inline]
collapse_huge_page mm/khugepaged.c:1110 [inline]
khugepaged_scan_pmd mm/khugepaged.c:1349 [inline]
khugepaged_scan_mm_slot mm/khugepaged.c:2110 [inline]
khugepaged_do_scan mm/khugepaged.c:2193 [inline]
khugepaged+0x3bba/0x5a10 mm/khugepaged.c:2238
The fix is to make sure that the VMA is anonymous in
hugepage_vma_revalidate(). The helper is only used for collapsing
anonymous pages.
Link: http://lkml.kernel.org/r/20200722121439.44328-1-kirill.shutemov@linux.intel…
Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Reported-by: syzbot+ed318e8b790ca72c5ad0(a)syzkaller.appspotmail.com
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Yang Shi <yang.shi(a)linux.alibaba.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/khugepaged.c | 3 +++
1 file changed, 3 insertions(+)
--- a/mm/khugepaged.c~khugepaged-fix-null-pointer-dereference-due-to-race
+++ a/mm/khugepaged.c
@@ -958,6 +958,9 @@ static int hugepage_vma_revalidate(struc
return SCAN_ADDRESS_RANGE;
if (!hugepage_vma_check(vma, vma->vm_flags))
return SCAN_VMA_CHECK;
+ /* Anon VMA expected */
+ if (!vma->anon_vma || vma->vm_ops)
+ return SCAN_VMA_CHECK;
return 0;
}
_
Patches currently in -mm which might be from kirill.shutemov(a)linux.intel.com are
The patch titled
Subject: mm/hugetlb: avoid hardcoding while checking if cma is enabled
has been removed from the -mm tree. Its filename was
mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Barry Song <song.bao.hua(a)hisilicon.com>
Subject: mm/hugetlb: avoid hardcoding while checking if cma is enabled
hugetlb_cma[0] can be NULL due to various reasons, for example, node0 has
no memory. so NULL hugetlb_cma[0] doesn't necessarily mean cma is not
enabled. gigantic pages might have been reserved on other nodes. This
patch fixes possible double reservation and CMA leak.
[akpm(a)linux-foundation.org: fix CONFIG_CMA=n warning]
[sfr(a)canb.auug.org.au: better checks before using hugetlb_cma]
Link: http://lkml.kernel.org/r/20200721205716.6dbaa56b@canb.auug.org.au
Link: http://lkml.kernel.org/r/20200710005726.36068-1-song.bao.hua@hisilicon.com
Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
Signed-off-by: Barry Song <song.bao.hua(a)hisilicon.com>
Acked-by: Roman Gushchin <guro(a)fb.com>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Jonathan Cameron <jonathan.cameron(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)
--- a/mm/hugetlb.c~mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled
+++ a/mm/hugetlb.c
@@ -45,7 +45,10 @@ int hugetlb_max_hstate __read_mostly;
unsigned int default_hstate_idx;
struct hstate hstates[HUGE_MAX_HSTATE];
+#ifdef CONFIG_CMA
static struct cma *hugetlb_cma[MAX_NUMNODES];
+#endif
+static unsigned long hugetlb_cma_size __initdata;
/*
* Minimum page order among possible hugepage sizes, set to a proper value
@@ -1235,9 +1238,10 @@ static void free_gigantic_page(struct pa
* If the page isn't allocated using the cma allocator,
* cma_release() returns false.
*/
- if (IS_ENABLED(CONFIG_CMA) &&
- cma_release(hugetlb_cma[page_to_nid(page)], page, 1 << order))
+#ifdef CONFIG_CMA
+ if (cma_release(hugetlb_cma[page_to_nid(page)], page, 1 << order))
return;
+#endif
free_contig_range(page_to_pfn(page), 1 << order);
}
@@ -1248,7 +1252,8 @@ static struct page *alloc_gigantic_page(
{
unsigned long nr_pages = 1UL << huge_page_order(h);
- if (IS_ENABLED(CONFIG_CMA)) {
+#ifdef CONFIG_CMA
+ {
struct page *page;
int node;
@@ -1262,6 +1267,7 @@ static struct page *alloc_gigantic_page(
return page;
}
}
+#endif
return alloc_contig_pages(nr_pages, gfp_mask, nid, nodemask);
}
@@ -2571,7 +2577,7 @@ static void __init hugetlb_hstate_alloc_
for (i = 0; i < h->max_huge_pages; ++i) {
if (hstate_is_gigantic(h)) {
- if (IS_ENABLED(CONFIG_CMA) && hugetlb_cma[0]) {
+ if (hugetlb_cma_size) {
pr_warn_once("HugeTLB: hugetlb_cma is enabled, skip boot time allocation\n");
break;
}
@@ -5654,7 +5660,6 @@ void move_hugetlb_state(struct page *old
}
#ifdef CONFIG_CMA
-static unsigned long hugetlb_cma_size __initdata;
static bool cma_reserve_called __initdata;
static int __init cmdline_parse_hugetlb_cma(char *p)
_
Patches currently in -mm which might be from song.bao.hua(a)hisilicon.com are
mm-hugetlb-split-hugetlb_cma-in-nodes-with-memory.patch
mm-cma-fix-the-name-of-cma-areas.patch
mm-hugetlb-fix-the-name-of-hugetlb-cma.patch
The patch titled
Subject: mm: memcg/slab: fix memory leak at non-root kmem_cache destroy
has been removed from the -mm tree. Its filename was
mm-memcg-slab-fix-memory-leak-at-non-root-kmem_cache-destroy.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Muchun Song <songmuchun(a)bytedance.com>
Subject: mm: memcg/slab: fix memory leak at non-root kmem_cache destroy
If the kmem_cache refcount is greater than one, we should not mark the
root kmem_cache as dying. If we mark the root kmem_cache dying
incorrectly, the non-root kmem_cache can never be destroyed. It resulted
in memory leak when memcg was destroyed. We can use the following steps
to reproduce.
1) Use kmem_cache_create() to create a new kmem_cache named A.
2) Coincidentally, the kmem_cache A is an alias for kmem_cache B,
so the refcount of B is just increased.
3) Use kmem_cache_destroy() to destroy the kmem_cache A, just
decrease the B's refcount but mark the B as dying.
4) Create a new memory cgroup and alloc memory from the kmem_cache
B. It leads to create a non-root kmem_cache for allocating memory.
5) When destroy the memory cgroup created in the step 4), the
non-root kmem_cache can never be destroyed.
If we repeat steps 4) and 5), this will cause a lot of memory leak. So
only when refcount reach zero, we mark the root kmem_cache as dying.
Link: http://lkml.kernel.org/r/20200716165103.83462-1-songmuchun@bytedance.com
Fixes: 92ee383f6daa ("mm: fix race between kmem_cache destroy, create and deactivate")
Signed-off-by: Muchun Song <songmuchun(a)bytedance.com>
Reviewed-by: Shakeel Butt <shakeelb(a)google.com>
Acked-by: Roman Gushchin <guro(a)fb.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Christoph Lameter <cl(a)linux.com>
Cc: Pekka Enberg <penberg(a)kernel.org>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/slab_common.c | 35 ++++++++++++++++++++++++++++-------
1 file changed, 28 insertions(+), 7 deletions(-)
--- a/mm/slab_common.c~mm-memcg-slab-fix-memory-leak-at-non-root-kmem_cache-destroy
+++ a/mm/slab_common.c
@@ -326,6 +326,14 @@ int slab_unmergeable(struct kmem_cache *
if (s->refcount < 0)
return 1;
+#ifdef CONFIG_MEMCG_KMEM
+ /*
+ * Skip the dying kmem_cache.
+ */
+ if (s->memcg_params.dying)
+ return 1;
+#endif
+
return 0;
}
@@ -886,12 +894,15 @@ static int shutdown_memcg_caches(struct
return 0;
}
-static void flush_memcg_workqueue(struct kmem_cache *s)
+static void memcg_set_kmem_cache_dying(struct kmem_cache *s)
{
spin_lock_irq(&memcg_kmem_wq_lock);
s->memcg_params.dying = true;
spin_unlock_irq(&memcg_kmem_wq_lock);
+}
+static void flush_memcg_workqueue(struct kmem_cache *s)
+{
/*
* SLAB and SLUB deactivate the kmem_caches through call_rcu. Make
* sure all registered rcu callbacks have been invoked.
@@ -923,10 +934,6 @@ static inline int shutdown_memcg_caches(
{
return 0;
}
-
-static inline void flush_memcg_workqueue(struct kmem_cache *s)
-{
-}
#endif /* CONFIG_MEMCG_KMEM */
void slab_kmem_cache_release(struct kmem_cache *s)
@@ -944,8 +951,6 @@ void kmem_cache_destroy(struct kmem_cach
if (unlikely(!s))
return;
- flush_memcg_workqueue(s);
-
get_online_cpus();
get_online_mems();
@@ -955,6 +960,22 @@ void kmem_cache_destroy(struct kmem_cach
if (s->refcount)
goto out_unlock;
+#ifdef CONFIG_MEMCG_KMEM
+ memcg_set_kmem_cache_dying(s);
+
+ mutex_unlock(&slab_mutex);
+
+ put_online_mems();
+ put_online_cpus();
+
+ flush_memcg_workqueue(s);
+
+ get_online_cpus();
+ get_online_mems();
+
+ mutex_lock(&slab_mutex);
+#endif
+
err = shutdown_memcg_caches(s);
if (!err)
err = shutdown_cache(s);
_
Patches currently in -mm which might be from songmuchun(a)bytedance.com are
mm-page_alloc-skip-setting-nodemask-when-we-are-in-interrupt.patch
The patch titled
Subject: mm/memcg: fix refcount error while moving and swapping
has been removed from the -mm tree. Its filename was
mm-memcg-fix-refcount-error-while-moving-and-swapping.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Hugh Dickins <hughd(a)google.com>
Subject: mm/memcg: fix refcount error while moving and swapping
It was hard to keep a test running, moving tasks between memcgs with
move_charge_at_immigrate, while swapping: mem_cgroup_id_get_many()'s
refcount is discovered to be 0 (supposedly impossible), so it is then
forced to REFCOUNT_SATURATED, and after thousands of warnings in quick
succession, the test is at last put out of misery by being OOM killed.
This is because of the way moved_swap accounting was saved up until the
task move gets completed in __mem_cgroup_clear_mc(), deferred from when
mem_cgroup_move_swap_account() actually exchanged old and new ids.
Concurrent activity can free up swap quicker than the task is scanned,
bringing id refcount down 0 (which should only be possible when
offlining).
Just skip that optimization: do that part of the accounting immediately.
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2007071431050.4726@eggly.anvils
Fixes: 615d66c37c75 ("mm: memcontrol: fix memcg id ref counter on swap charge move")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Alex Shi <alex.shi(a)linux.alibaba.com>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memcontrol.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/mm/memcontrol.c~mm-memcg-fix-refcount-error-while-moving-and-swapping
+++ a/mm/memcontrol.c
@@ -5669,7 +5669,6 @@ static void __mem_cgroup_clear_mc(void)
if (!mem_cgroup_is_root(mc.to))
page_counter_uncharge(&mc.to->memory, mc.moved_swap);
- mem_cgroup_id_get_many(mc.to, mc.moved_swap);
css_put_many(&mc.to->css, mc.moved_swap);
mc.moved_swap = 0;
@@ -5860,7 +5859,8 @@ put: /* get_mctgt_type() gets the page
ent = target.ent;
if (!mem_cgroup_move_swap_account(ent, mc.from, mc.to)) {
mc.precharge--;
- /* we fixup refcnts and charges later. */
+ mem_cgroup_id_get_many(mc.to, 1);
+ /* we fixup other refcnts and charges later. */
mc.moved_swap++;
}
break;
_
Patches currently in -mm which might be from hughd(a)google.com are
The patch titled
Subject: vfs/xattr: mm/shmem: kernfs: release simple xattr entry in a right way
has been removed from the -mm tree. Its filename was
vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Chengguang Xu <cgxu519(a)mykernel.net>
Subject: vfs/xattr: mm/shmem: kernfs: release simple xattr entry in a right way
After commit fdc85222d58e ("kernfs: kvmalloc xattr value instead of
kmalloc"), simple xattr entry is allocated with kvmalloc() instead of
kmalloc(), so we should release it with kvfree() instead of kfree().
Link: http://lkml.kernel.org/r/20200704051608.15043-1-cgxu519@mykernel.net
Fixes: fdc85222d58e ("kernfs: kvmalloc xattr value instead of kmalloc")
Signed-off-by: Chengguang Xu <cgxu519(a)mykernel.net>
Acked-by: Hugh Dickins <hughd(a)google.com>
Acked-by: Tejun Heo <tj(a)kernel.org>
Cc: Daniel Xu <dxu(a)dxuuu.xyz>
Cc: Chris Down <chris(a)chrisdown.name>
Cc: Andreas Dilger <adilger(a)dilger.ca>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Al Viro <viro(a)zeniv.linux.org.uk>
Cc: <stable(a)vger.kernel.org> [5.7]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/xattr.h | 3 ++-
mm/shmem.c | 2 +-
2 files changed, 3 insertions(+), 2 deletions(-)
--- a/include/linux/xattr.h~vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way
+++ a/include/linux/xattr.h
@@ -15,6 +15,7 @@
#include <linux/slab.h>
#include <linux/types.h>
#include <linux/spinlock.h>
+#include <linux/mm.h>
#include <uapi/linux/xattr.h>
struct inode;
@@ -94,7 +95,7 @@ static inline void simple_xattrs_free(st
list_for_each_entry_safe(xattr, node, &xattrs->head, list) {
kfree(xattr->name);
- kfree(xattr);
+ kvfree(xattr);
}
}
--- a/mm/shmem.c~vfs-xattr-mm-shmem-kernfs-release-simple-xattr-entry-in-a-right-way
+++ a/mm/shmem.c
@@ -3178,7 +3178,7 @@ static int shmem_initxattrs(struct inode
new_xattr->name = kmalloc(XATTR_SECURITY_PREFIX_LEN + len,
GFP_KERNEL);
if (!new_xattr->name) {
- kfree(new_xattr);
+ kvfree(new_xattr);
return -ENOMEM;
}
_
Patches currently in -mm which might be from cgxu519(a)mykernel.net are
The patch titled
Subject: mm/mmap.c: close race between munmap() and expand_upwards()/downwards()
has been removed from the -mm tree. Its filename was
mm-close-race-between-munmap-and-expand_upwards-downwards.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Subject: mm/mmap.c: close race between munmap() and expand_upwards()/downwards()
VMA with VM_GROWSDOWN or VM_GROWSUP flag set can change their size under
mmap_read_lock(). It can lead to race with __do_munmap():
Thread A Thread B
__do_munmap()
detach_vmas_to_be_unmapped()
mmap_write_downgrade()
expand_downwards()
vma->vm_start = address;
// The VMA now overlaps with
// VMAs detached by the Thread A
// page fault populates expanded part
// of the VMA
unmap_region()
// Zaps pagetables partly
// populated by Thread B
Similar race exists for expand_upwards().
The fix is to avoid downgrading mmap_lock in __do_munmap() if detached
VMAs are next to VM_GROWSDOWN or VM_GROWSUP VMA.
[akpm(a)linux-foundation.org: s/mmap_sem/mmap_lock/ in comment]
Link: http://lkml.kernel.org/r/20200709105309.42495-1-kirill.shutemov@linux.intel…
Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Reported-by: Jann Horn <jannh(a)google.com>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Reviewed-by: Yang Shi <yang.shi(a)linux.alibaba.com>
Cc: Oleg Nesterov <oleg(a)redhat.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org> [4.20+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/mmap.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
--- a/mm/mmap.c~mm-close-race-between-munmap-and-expand_upwards-downwards
+++ a/mm/mmap.c
@@ -2620,7 +2620,7 @@ static void unmap_region(struct mm_struc
* Create a list of vma's touched by the unmap, removing them from the mm's
* vma list as we go..
*/
-static void
+static bool
detach_vmas_to_be_unmapped(struct mm_struct *mm, struct vm_area_struct *vma,
struct vm_area_struct *prev, unsigned long end)
{
@@ -2645,6 +2645,17 @@ detach_vmas_to_be_unmapped(struct mm_str
/* Kill the cache */
vmacache_invalidate(mm);
+
+ /*
+ * Do not downgrade mmap_lock if we are next to VM_GROWSDOWN or
+ * VM_GROWSUP VMA. Such VMAs can change their size under
+ * down_read(mmap_lock) and collide with the VMA we are about to unmap.
+ */
+ if (vma && (vma->vm_flags & VM_GROWSDOWN))
+ return false;
+ if (prev && (prev->vm_flags & VM_GROWSUP))
+ return false;
+ return true;
}
/*
@@ -2825,7 +2836,8 @@ int __do_munmap(struct mm_struct *mm, un
}
/* Detach vmas from rbtree */
- detach_vmas_to_be_unmapped(mm, vma, prev, end);
+ if (!detach_vmas_to_be_unmapped(mm, vma, prev, end))
+ downgrade = false;
if (downgrade)
mmap_write_downgrade(mm);
_
Patches currently in -mm which might be from kirill.shutemov(a)linux.intel.com are
This is the start of the stable review cycle for the 4.9.148 release.
There are 22 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun Dec 30 11:31:00 UTC 2018.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.9.148-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.9.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.9.148-rc1
Gustavo A. R. Silva <gustavo(a)embeddedor.com>
drm/ioctl: Fix Spectre v1 vulnerabilities
Ivan Delalande <colona(a)arista.com>
proc/sysctl: don't return ENOMEM on lookup when a table is unregistering
Sergey Senozhatsky <sergey.senozhatsky.work(a)gmail.com>
panic: avoid deadlocks in re-entrant console drivers
Richard Weinberger <richard(a)nod.at>
ubifs: Handle re-linking of inodes correctly while recovery
Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
x86/fpu: Disable bottom halves while loading FPU registers
Colin Ian King <colin.king(a)canonical.com>
x86/mtrr: Don't copy uninitialized gentry fields back to userspace
Dexuan Cui <decui(a)microsoft.com>
Drivers: hv: vmbus: Return -EINVAL for the sys files for unopened channels
Christophe Leroy <christophe.leroy(a)c-s.fr>
gpio: max7301: fix driver for use with CONFIG_VMAP_STACK
Russell King <rmk+kernel(a)armlinux.org.uk>
mmc: omap_hsmmc: fix DMA API warning
Ulf Hansson <ulf.hansson(a)linaro.org>
mmc: core: Use a minimum 1600ms timeout when enabling CACHE ctrl
Ulf Hansson <ulf.hansson(a)linaro.org>
mmc: core: Allow BKOPS and CACHE ctrl even if no HPI support
Ulf Hansson <ulf.hansson(a)linaro.org>
mmc: core: Reset HPI enabled state during re-init and in case of errors
Jörgen Storvist <jorgen.storvist(a)gmail.com>
USB: serial: option: add Telit LN940 series
Jörgen Storvist <jorgen.storvist(a)gmail.com>
USB: serial: option: add Fibocom NL668 series
Jörgen Storvist <jorgen.storvist(a)gmail.com>
USB: serial: option: add Simcom SIM7500/SIM7600 (MBIM mode)
Tore Anderson <tore(a)fud.no>
USB: serial: option: add HP lt4132
Jörgen Storvist <jorgen.storvist(a)gmail.com>
USB: serial: option: add GosunCn ZTE WeLink ME3630
Mathias Nyman <mathias.nyman(a)linux.intel.com>
xhci: Don't prevent USB2 bus suspend in state check intended for USB3 only
Hui Peng <benquike(a)gmail.com>
USB: hso: Fix OOB memory access in hso_probe/hso_get_config_data
Bart Van Assche <bart.vanassche(a)wdc.com>
ib_srpt: Fix a use-after-free in __srpt_close_all_ch()
Mikulas Patocka <mpatocka(a)redhat.com>
block: fix infinite loop if the device loses discard capability
Jens Axboe <axboe(a)kernel.dk>
block: break discard submissions into the user defined size
-------------
Diffstat:
Makefile | 4 ++--
arch/x86/kernel/cpu/mtrr/if.c | 2 ++
arch/x86/kernel/fpu/signal.c | 4 ++--
block/blk-lib.c | 22 ++++++++++++++++++---
drivers/gpio/gpio-max7301.c | 12 +++---------
drivers/gpu/drm/drm_ioctl.c | 10 ++++++++--
drivers/hv/vmbus_drv.c | 20 +++++++++++++++++++
drivers/infiniband/ulp/srpt/ib_srpt.c | 4 ++--
drivers/mmc/core/mmc.c | 24 ++++++++++++++---------
drivers/mmc/host/omap_hsmmc.c | 12 +++++++++++-
drivers/net/usb/hso.c | 18 +++++++++++++++--
drivers/usb/host/xhci-hub.c | 3 ++-
drivers/usb/serial/option.c | 16 ++++++++++++++-
fs/proc/proc_sysctl.c | 13 ++++++------
fs/ubifs/replay.c | 37 +++++++++++++++++++++++++++++++++++
kernel/panic.c | 6 +++++-
16 files changed, 165 insertions(+), 42 deletions(-)
The following commit has been merged into the perf/urgent branch of tip:
Commit-ID: fe5ed7ab99c656bd2f5b79b49df0e9ebf2cead8a
Gitweb: https://git.kernel.org/tip/fe5ed7ab99c656bd2f5b79b49df0e9ebf2cead8a
Author: Oleg Nesterov <oleg(a)redhat.com>
AuthorDate: Thu, 23 Jul 2020 17:44:20 +02:00
Committer: Ingo Molnar <mingo(a)kernel.org>
CommitterDate: Fri, 24 Jul 2020 15:38:37 +02:00
uprobes: Change handle_swbp() to send SIGTRAP with si_code=SI_KERNEL, to fix GDB regression
If a tracee is uprobed and it hits int3 inserted by debugger, handle_swbp()
does send_sig(SIGTRAP, current, 0) which means si_code == SI_USER. This used
to work when this code was written, but then GDB started to validate si_code
and now it simply can't use breakpoints if the tracee has an active uprobe:
# cat test.c
void unused_func(void)
{
}
int main(void)
{
return 0;
}
# gcc -g test.c -o test
# perf probe -x ./test -a unused_func
# perf record -e probe_test:unused_func gdb ./test -ex run
GNU gdb (GDB) 10.0.50.20200714-git
...
Program received signal SIGTRAP, Trace/breakpoint trap.
0x00007ffff7ddf909 in dl_main () from /lib64/ld-linux-x86-64.so.2
(gdb)
The tracee hits the internal breakpoint inserted by GDB to monitor shared
library events but GDB misinterprets this SIGTRAP and reports a signal.
Change handle_swbp() to use force_sig(SIGTRAP), this matches do_int3_user()
and fixes the problem.
This is the minimal fix for -stable, arch/x86/kernel/uprobes.c is equally
wrong; it should use send_sigtrap(TRAP_TRACE) instead of send_sig(SIGTRAP),
but this doesn't confuse GDB and needs another x86-specific patch.
Reported-by: Aaron Merey <amerey(a)redhat.com>
Signed-off-by: Oleg Nesterov <oleg(a)redhat.com>
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Reviewed-by: Srikar Dronamraju <srikar(a)linux.vnet.ibm.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20200723154420.GA32043@redhat.com
---
kernel/events/uprobes.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index bb08628..5f8b0c5 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -2199,7 +2199,7 @@ static void handle_swbp(struct pt_regs *regs)
if (!uprobe) {
if (is_swbp > 0) {
/* No matching uprobe; signal SIGTRAP. */
- send_sig(SIGTRAP, current, 0);
+ force_sig(SIGTRAP);
} else {
/*
* Either we raced with uprobe_unregister() or we can't
From: Hugh Dickins <hughd(a)google.com>
Subject: mm/memcg: fix refcount error while moving and swapping
It was hard to keep a test running, moving tasks between memcgs with
move_charge_at_immigrate, while swapping: mem_cgroup_id_get_many()'s
refcount is discovered to be 0 (supposedly impossible), so it is then
forced to REFCOUNT_SATURATED, and after thousands of warnings in quick
succession, the test is at last put out of misery by being OOM killed.
This is because of the way moved_swap accounting was saved up until the
task move gets completed in __mem_cgroup_clear_mc(), deferred from when
mem_cgroup_move_swap_account() actually exchanged old and new ids.
Concurrent activity can free up swap quicker than the task is scanned,
bringing id refcount down 0 (which should only be possible when
offlining).
Just skip that optimization: do that part of the accounting immediately.
Link: http://lkml.kernel.org/r/alpine.LSU.2.11.2007071431050.4726@eggly.anvils
Fixes: 615d66c37c75 ("mm: memcontrol: fix memcg id ref counter on swap charge move")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Alex Shi <alex.shi(a)linux.alibaba.com>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memcontrol.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/mm/memcontrol.c~mm-memcg-fix-refcount-error-while-moving-and-swapping
+++ a/mm/memcontrol.c
@@ -5669,7 +5669,6 @@ static void __mem_cgroup_clear_mc(void)
if (!mem_cgroup_is_root(mc.to))
page_counter_uncharge(&mc.to->memory, mc.moved_swap);
- mem_cgroup_id_get_many(mc.to, mc.moved_swap);
css_put_many(&mc.to->css, mc.moved_swap);
mc.moved_swap = 0;
@@ -5860,7 +5859,8 @@ put: /* get_mctgt_type() gets the page
ent = target.ent;
if (!mem_cgroup_move_swap_account(ent, mc.from, mc.to)) {
mc.precharge--;
- /* we fixup refcnts and charges later. */
+ mem_cgroup_id_get_many(mc.to, 1);
+ /* we fixup other refcnts and charges later. */
mc.moved_swap++;
}
break;
_
target_unpopulated is incremented with nr_pages at the start of the
function, but the call to free_xenballooned_pages will only subtract
pgno number of pages, and thus the rest need to be subtracted before
returning or else accounting will be skewed.
Signed-off-by: Roger Pau Monné <roger.pau(a)citrix.com>
Cc: stable(a)vger.kernel.org
---
Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: Stefano Stabellini <sstabellini(a)kernel.org>
Cc: xen-devel(a)lists.xenproject.org
---
drivers/xen/balloon.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 77c57568e5d7..3cb10ed32557 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -630,6 +630,12 @@ int alloc_xenballooned_pages(int nr_pages, struct page **pages)
out_undo:
mutex_unlock(&balloon_mutex);
free_xenballooned_pages(pgno, pages);
+ /*
+ * NB: free_xenballooned_pages will only subtract pgno pages, but since
+ * target_unpopulated is incremented with nr_pages at the start we need
+ * to remove the remaining ones also, or accounting will be screwed.
+ */
+ balloon_stats.target_unpopulated -= nr_pages - pgno;
return ret;
}
EXPORT_SYMBOL(alloc_xenballooned_pages);
--
2.27.0
The VT-d spec requires (10.4.4 Global Command Register, TE field) that:
Hardware implementations supporting DMA draining must drain any in-flight
DMA read/write requests queued within the Root-Complex before completing
the translation enable command and reflecting the status of the command
through the TES field in the Global Status register.
Unfortunately, some integrated graphic devices fail to do so after some
kind of power state transition. As the result, the system might stuck in
iommu_disable_translation(), waiting for the completion of TE transition.
This provides a quirk list for those devices and skips TE disabling if
the qurik hits.
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=208363
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=206571
Tested-by: Koba Ko <koba.ko(a)canonical.com>
Tested-by: Jun Miao <jun.miao(a)windriver.com>
Cc: Ashok Raj <ashok.raj(a)intel.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Lu Baolu <baolu.lu(a)linux.intel.com>
---
Change since v1:
- Add below tags:
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=206571
Tested-by: Jun Miao <jun.miao(a)windriver.com>
drivers/iommu/intel/dmar.c | 1 +
drivers/iommu/intel/iommu.c | 27 +++++++++++++++++++++++++++
include/linux/dmar.h | 1 +
include/linux/intel-iommu.h | 2 ++
4 files changed, 31 insertions(+)
diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index 683b812c5c47..16f47041f1bf 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -1102,6 +1102,7 @@ static int alloc_iommu(struct dmar_drhd_unit *drhd)
}
drhd->iommu = iommu;
+ iommu->drhd = drhd;
return 0;
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index d759e7234e98..a459eac96754 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -356,6 +356,7 @@ static int intel_iommu_strict;
static int intel_iommu_superpage = 1;
static int iommu_identity_mapping;
static int intel_no_bounce;
+static int iommu_skip_te_disable;
#define IDENTMAP_GFX 2
#define IDENTMAP_AZALIA 4
@@ -1629,6 +1630,10 @@ static void iommu_disable_translation(struct intel_iommu *iommu)
u32 sts;
unsigned long flag;
+ if (iommu_skip_te_disable && iommu->drhd->gfx_dedicated &&
+ (cap_read_drain(iommu->cap) || cap_write_drain(iommu->cap)))
+ return;
+
raw_spin_lock_irqsave(&iommu->register_lock, flag);
iommu->gcmd &= ~DMA_GCMD_TE;
writel(iommu->gcmd, iommu->reg + DMAR_GCMD_REG);
@@ -4039,6 +4044,7 @@ static void __init init_no_remapping_devices(void)
/* This IOMMU has *only* gfx devices. Either bypass it or
set the gfx_mapped flag, as appropriate */
+ drhd->gfx_dedicated = 1;
if (!dmar_map_gfx) {
drhd->ignored = 1;
for_each_active_dev_scope(drhd->devices,
@@ -6182,6 +6188,27 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x0044, quirk_calpella_no_shadow_g
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x0062, quirk_calpella_no_shadow_gtt);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x006a, quirk_calpella_no_shadow_gtt);
+static void quirk_igfx_skip_te_disable(struct pci_dev *dev)
+{
+ unsigned short ver;
+
+ if (!IS_GFX_DEVICE(dev))
+ return;
+
+ ver = (dev->device >> 8) & 0xff;
+ if (ver != 0x45 && ver != 0x46 && ver != 0x4c &&
+ ver != 0x4e && ver != 0x8a && ver != 0x98 &&
+ ver != 0x9a)
+ return;
+
+ if (risky_device(dev))
+ return;
+
+ pci_info(dev, "Skip IOMMU disabling for graphics\n");
+ iommu_skip_te_disable = 1;
+}
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_ANY_ID, quirk_igfx_skip_te_disable);
+
/* On Tylersburg chipsets, some BIOSes have been known to enable the
ISOCH DMAR unit for the Azalia sound device, but not give it any
TLB entries, which causes it to deadlock. Check for that. We do
diff --git a/include/linux/dmar.h b/include/linux/dmar.h
index d7bf029df737..65565820328a 100644
--- a/include/linux/dmar.h
+++ b/include/linux/dmar.h
@@ -48,6 +48,7 @@ struct dmar_drhd_unit {
u16 segment; /* PCI domain */
u8 ignored:1; /* ignore drhd */
u8 include_all:1;
+ u8 gfx_dedicated:1; /* graphic dedicated */
struct intel_iommu *iommu;
};
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index 3e8fa1c7a1e6..04bd9279c3fb 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -600,6 +600,8 @@ struct intel_iommu {
struct iommu_device iommu; /* IOMMU core code handle */
int node;
u32 flags; /* Software defined flags */
+
+ struct dmar_drhd_unit *drhd;
};
/* PCI domain-device relationship */
--
2.17.1
commit 1dae7e0e58b484eaa43d530f211098fdeeb0f404 upstream.
[BUG]
There are several reported runaway balance, that balance is flooding the
log with "found X extents" where the X never changes.
[CAUSE]
Commit d2311e698578 ("btrfs: relocation: Delay reloc tree deletion after
merge_reloc_roots") introduced BTRFS_ROOT_DEAD_RELOC_TREE bit to
indicate that one subvolume has finished its tree blocks swap with its
reloc tree.
However if balance is canceled or hits ENOSPC halfway, we didn't clear
the BTRFS_ROOT_DEAD_RELOC_TREE bit, leaving that bit hanging forever
until unmount.
Any subvolume root with that bit, would cause backref cache to skip this
tree block, as it has finished its tree block swap. This would cause
all tree blocks of that root be ignored by balance, leading to runaway
balance.
[FIX]
Fix the problem by also clearing the BTRFS_ROOT_DEAD_RELOC_TREE bit for
the original subvolume of orphan reloc root.
Add an umount check for the stale bit still set.
Fixes: d2311e698578 ("btrfs: relocation: Delay reloc tree deletion after merge_reloc_roots")
Cc: <stable(a)vger.kernel.org> # 5.7.x
Signed-off-by: Qu Wenruo <wqu(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
---
fs/btrfs/disk-io.c | 1 +
fs/btrfs/relocation.c | 2 ++
2 files changed, 3 insertions(+)
diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index f71e4dbe1d8a..f00e64fee5dd 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -1998,6 +1998,7 @@ void btrfs_put_root(struct btrfs_root *root)
if (refcount_dec_and_test(&root->refs)) {
WARN_ON(!RB_EMPTY_ROOT(&root->inode_tree));
+ WARN_ON(test_bit(BTRFS_ROOT_DEAD_RELOC_TREE, &root->state));
if (root->anon_dev)
free_anon_bdev(root->anon_dev);
btrfs_drew_lock_destroy(&root->snapshot_lock);
diff --git a/fs/btrfs/relocation.c b/fs/btrfs/relocation.c
index 157452a5e110..f67d736c27a1 100644
--- a/fs/btrfs/relocation.c
+++ b/fs/btrfs/relocation.c
@@ -2642,6 +2642,8 @@ void merge_reloc_roots(struct reloc_control *rc)
root->reloc_root = NULL;
btrfs_put_root(reloc_root);
}
+ clear_bit(BTRFS_ROOT_DEAD_RELOC_TREE,
+ &root->state);
btrfs_put_root(root);
}
--
2.27.0
The flags passed to the wait_entry.func are passed onwards to
try_to_wake_up(), which has a very particular interpretation for its
wake_flags. In particular, beyond the published WF_SYNC, it has a few
internal flags as well. Since we passed the fence->error down the chain
via the flags argument, these ended up in the default_wake_function
confusing the kernel/sched.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2110
Fixes: ef4688497512 ("drm/i915: Propagate fence errors")
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v5.4+
---
drivers/gpu/drm/i915/i915_sw_fence.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/gpu/drm/i915/i915_sw_fence.c b/drivers/gpu/drm/i915/i915_sw_fence.c
index 295b9829e2da..4cd2038cbe35 100644
--- a/drivers/gpu/drm/i915/i915_sw_fence.c
+++ b/drivers/gpu/drm/i915/i915_sw_fence.c
@@ -164,9 +164,13 @@ static void __i915_sw_fence_wake_up_all(struct i915_sw_fence *fence,
do {
list_for_each_entry_safe(pos, next, &x->head, entry) {
- pos->func(pos,
- TASK_NORMAL, fence->error,
- &extra);
+ int wake_flags;
+
+ wake_flags = fence->error;
+ if (pos->func == autoremove_wake_function)
+ wake_flags = 0;
+
+ pos->func(pos, TASK_NORMAL, wake_flags, &extra);
}
if (list_empty(&extra))
--
2.20.1
Especially with memory hotplug, we can have offline sections (with a
garbage memmap) and overlapping zones. We have to make sure to only
touch initialized memmaps (online sections managed by the buddy) and that
the zone matches, to not move pages between zones.
To test if this can actually happen, I added a simple
BUG_ON(page_zone(page_i) != page_zone(page_j));
right before the swap. When hotplugging a 256M DIMM to a 4G x86-64 VM and
onlining the first memory block "online_movable" and the second memory
block "online_kernel", it will trigger the BUG, as both zones (NORMAL
and MOVABLE) overlap.
This might result in all kinds of weird situations (e.g., double
allocations, list corruptions, unmovable allocations ending up in the
movable zone).
Fixes: e900a918b098 ("mm: shuffle initial free memory to improve memory-side-cache utilization")
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: stable(a)vger.kernel.org # v5.2+
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Minchan Kim <minchan(a)kernel.org>
Cc: Huang Ying <ying.huang(a)intel.com>
Cc: Wei Yang <richard.weiyang(a)gmail.com>
Cc: Mel Gorman <mgorman(a)techsingularity.net>
Signed-off-by: David Hildenbrand <david(a)redhat.com>
---
mm/shuffle.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)
diff --git a/mm/shuffle.c b/mm/shuffle.c
index 44406d9977c77..dd13ab851b3ee 100644
--- a/mm/shuffle.c
+++ b/mm/shuffle.c
@@ -58,25 +58,25 @@ module_param_call(shuffle, shuffle_store, shuffle_show, &shuffle_param, 0400);
* For two pages to be swapped in the shuffle, they must be free (on a
* 'free_area' lru), have the same order, and have the same migratetype.
*/
-static struct page * __meminit shuffle_valid_page(unsigned long pfn, int order)
+static struct page * __meminit shuffle_valid_page(struct zone *zone,
+ unsigned long pfn, int order)
{
- struct page *page;
+ struct page *page = pfn_to_online_page(pfn);
/*
* Given we're dealing with randomly selected pfns in a zone we
* need to ask questions like...
*/
- /* ...is the pfn even in the memmap? */
- if (!pfn_valid_within(pfn))
+ /* ... is the page managed by the buddy? */
+ if (!page)
return NULL;
- /* ...is the pfn in a present section or a hole? */
- if (!pfn_in_present_section(pfn))
+ /* ... is the page assigned to the same zone? */
+ if (page_zone(page) != zone)
return NULL;
/* ...is the page free and currently on a free_area list? */
- page = pfn_to_page(pfn);
if (!PageBuddy(page))
return NULL;
@@ -123,7 +123,7 @@ void __meminit __shuffle_zone(struct zone *z)
* page_j randomly selected in the span @zone_start_pfn to
* @spanned_pages.
*/
- page_i = shuffle_valid_page(i, order);
+ page_i = shuffle_valid_page(z, i, order);
if (!page_i)
continue;
@@ -137,7 +137,7 @@ void __meminit __shuffle_zone(struct zone *z)
j = z->zone_start_pfn +
ALIGN_DOWN(get_random_long() % z->spanned_pages,
order_pages);
- page_j = shuffle_valid_page(j, order);
+ page_j = shuffle_valid_page(z, j, order);
if (page_j && page_j != page_i)
break;
}
--
2.26.2
Hi,
Please consider applying the following patches to the listed stable
releases.
The following patches were found to be missing in stable releases by the
Chrome OS missing patch robot. The patches meet the following criteria.
- The patch includes a Fixes: tag
Note that the Fixes: tag does not always point to the correct upstream
SHA. In that case the correct upstream SHA is listed below.
- The patch referenced in the Fixes: tag has been applied to the listed
stable release
- The patch has not been applied to that stable release
All patches have been applied to the listed stable releases and to at least
one Chrome OS branch. Resulting images have been build- and runtime-tested
(where applicable) on real hardware and with virtual hardware on
kerneltests.org.
Thanks,
Guenter
---
Upstream commit 2aeb18835476 ("perf/core: Fix locking for children siblings group read")
upstream: v4.13-rc2
Fixes: ba5213ae6b88 ("perf/core: Correct event creation with PERF_FORMAT_GROUP")
in linux-4.4.y: a8dd3dfefcf5
in linux-4.9.y: 50fe37e83e14
upstream: v4.13-rc1
Affected branches:
linux-4.4.y
linux-4.9.y (already applied)
Upstream commit d41f36a6464a ("spi: spi-fsl-dspi: Exit the ISR with IRQ_NONE when it's not ours")
upstream: v5.4-rc1
Fixes: 13aed2392741 ("spi: spi-fsl-dspi: use IRQF_SHARED mode to request IRQ")
in linux-4.14.y: c75e886e1270
in linux-4.19.y: eb336b9003b1
upstream: v5.0-rc1
Affected branches:
linux-4.14.y
linux-4.19.y
From: "Michael J. Ruhl" <michael.j.ruhl(a)intel.com>
Subject: io-mapping: indicate mapping failure
The !ATOMIC_IOMAP version of io_maping_init_wc will always return success,
even when the ioremap fails.
Since the ATOMIC_IOMAP version returns NULL when the init fails, and
callers check for a NULL return on error this is unexpected.
During a device probe, where the ioremap failed, a crash can look
like this:
BUG: unable to handle page fault for address: 0000000000210000
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
Oops: 0002 [#1] PREEMPT SMP
CPU: 0 PID: 177 Comm:
RIP: 0010:fill_page_dma [i915]
gen8_ppgtt_create [i915]
i915_ppgtt_create [i915]
intel_gt_init [i915]
i915_gem_init [i915]
i915_driver_probe [i915]
pci_device_probe
really_probe
driver_probe_device
The remap failure occurred much earlier in the probe. If it had
been propagated, the driver would have exited with an error.
Return NULL on ioremap failure.
[akpm(a)linux-foundation.org: detect ioremap_wc() errors earlier]
Link: http://lkml.kernel.org/r/20200721171936.81563-1-michael.j.ruhl@intel.com
Fixes: cafaf14a5d8f ("io-mapping: Always create a struct to hold metadata about the io-mapping")
Signed-off-by: Michael J. Ruhl <michael.j.ruhl(a)intel.com>
Reviewed-by: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Mike Rapoport <rppt(a)linux.ibm.com>
Cc: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Daniel Vetter <daniel(a)ffwll.ch>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/io-mapping.h | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
--- a/include/linux/io-mapping.h~io-mapping-indicate-mapping-failure
+++ a/include/linux/io-mapping.h
@@ -107,9 +107,12 @@ io_mapping_init_wc(struct io_mapping *io
resource_size_t base,
unsigned long size)
{
+ iomap->iomem = ioremap_wc(base, size);
+ if (!iomap->iomem)
+ return NULL;
+
iomap->base = base;
iomap->size = size;
- iomap->iomem = ioremap_wc(base, size);
#if defined(pgprot_noncached_wc) /* archs can't agree on a name ... */
iomap->prot = pgprot_noncached_wc(PAGE_KERNEL);
#elif defined(pgprot_writecombine)
_
From: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Subject: khugepaged: fix null-pointer dereference due to race
khugepaged has to drop mmap lock several times while collapsing a page.
The situation can change while the lock is dropped and we need to
re-validate that the VMA is still in place and the PMD is still subject
for collapse.
But we miss one corner case: while collapsing an anonymous pages the VMA
could be replaced with file VMA. If the file VMA doesn't have any
private pages we get NULL pointer dereference:
general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
anon_vma_lock_write include/linux/rmap.h:120 [inline]
collapse_huge_page mm/khugepaged.c:1110 [inline]
khugepaged_scan_pmd mm/khugepaged.c:1349 [inline]
khugepaged_scan_mm_slot mm/khugepaged.c:2110 [inline]
khugepaged_do_scan mm/khugepaged.c:2193 [inline]
khugepaged+0x3bba/0x5a10 mm/khugepaged.c:2238
The fix is to make sure that the VMA is anonymous in
hugepage_vma_revalidate(). The helper is only used for collapsing
anonymous pages.
Link: http://lkml.kernel.org/r/20200722121439.44328-1-kirill.shutemov@linux.intel…
Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Reported-by: syzbot+ed318e8b790ca72c5ad0(a)syzkaller.appspotmail.com
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Yang Shi <yang.shi(a)linux.alibaba.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/khugepaged.c | 3 +++
1 file changed, 3 insertions(+)
--- a/mm/khugepaged.c~khugepaged-fix-null-pointer-dereference-due-to-race
+++ a/mm/khugepaged.c
@@ -958,6 +958,9 @@ static int hugepage_vma_revalidate(struc
return SCAN_ADDRESS_RANGE;
if (!hugepage_vma_check(vma, vma->vm_flags))
return SCAN_VMA_CHECK;
+ /* Anon VMA expected */
+ if (!vma->anon_vma || vma->vm_ops)
+ return SCAN_VMA_CHECK;
return 0;
}
_
From: Barry Song <song.bao.hua(a)hisilicon.com>
Subject: mm/hugetlb: avoid hardcoding while checking if cma is enabled
hugetlb_cma[0] can be NULL due to various reasons, for example, node0 has
no memory. so NULL hugetlb_cma[0] doesn't necessarily mean cma is not
enabled. gigantic pages might have been reserved on other nodes. This
patch fixes possible double reservation and CMA leak.
[akpm(a)linux-foundation.org: fix CONFIG_CMA=n warning]
[sfr(a)canb.auug.org.au: better checks before using hugetlb_cma]
Link: http://lkml.kernel.org/r/20200721205716.6dbaa56b@canb.auug.org.au
Link: http://lkml.kernel.org/r/20200710005726.36068-1-song.bao.hua@hisilicon.com
Fixes: cf11e85fc08c ("mm: hugetlb: optionally allocate gigantic hugepages using cma")
Signed-off-by: Barry Song <song.bao.hua(a)hisilicon.com>
Acked-by: Roman Gushchin <guro(a)fb.com>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Jonathan Cameron <jonathan.cameron(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 15 ++++++++++-----
1 file changed, 10 insertions(+), 5 deletions(-)
--- a/mm/hugetlb.c~mm-hugetlb-avoid-hardcoding-while-checking-if-cma-is-enabled
+++ a/mm/hugetlb.c
@@ -45,7 +45,10 @@ int hugetlb_max_hstate __read_mostly;
unsigned int default_hstate_idx;
struct hstate hstates[HUGE_MAX_HSTATE];
+#ifdef CONFIG_CMA
static struct cma *hugetlb_cma[MAX_NUMNODES];
+#endif
+static unsigned long hugetlb_cma_size __initdata;
/*
* Minimum page order among possible hugepage sizes, set to a proper value
@@ -1235,9 +1238,10 @@ static void free_gigantic_page(struct pa
* If the page isn't allocated using the cma allocator,
* cma_release() returns false.
*/
- if (IS_ENABLED(CONFIG_CMA) &&
- cma_release(hugetlb_cma[page_to_nid(page)], page, 1 << order))
+#ifdef CONFIG_CMA
+ if (cma_release(hugetlb_cma[page_to_nid(page)], page, 1 << order))
return;
+#endif
free_contig_range(page_to_pfn(page), 1 << order);
}
@@ -1248,7 +1252,8 @@ static struct page *alloc_gigantic_page(
{
unsigned long nr_pages = 1UL << huge_page_order(h);
- if (IS_ENABLED(CONFIG_CMA)) {
+#ifdef CONFIG_CMA
+ {
struct page *page;
int node;
@@ -1262,6 +1267,7 @@ static struct page *alloc_gigantic_page(
return page;
}
}
+#endif
return alloc_contig_pages(nr_pages, gfp_mask, nid, nodemask);
}
@@ -2571,7 +2577,7 @@ static void __init hugetlb_hstate_alloc_
for (i = 0; i < h->max_huge_pages; ++i) {
if (hstate_is_gigantic(h)) {
- if (IS_ENABLED(CONFIG_CMA) && hugetlb_cma[0]) {
+ if (hugetlb_cma_size) {
pr_warn_once("HugeTLB: hugetlb_cma is enabled, skip boot time allocation\n");
break;
}
@@ -5654,7 +5660,6 @@ void move_hugetlb_state(struct page *old
}
#ifdef CONFIG_CMA
-static unsigned long hugetlb_cma_size __initdata;
static bool cma_reserve_called __initdata;
static int __init cmdline_parse_hugetlb_cma(char *p)
_
From: Muchun Song <songmuchun(a)bytedance.com>
Subject: mm: memcg/slab: fix memory leak at non-root kmem_cache destroy
If the kmem_cache refcount is greater than one, we should not mark the
root kmem_cache as dying. If we mark the root kmem_cache dying
incorrectly, the non-root kmem_cache can never be destroyed. It resulted
in memory leak when memcg was destroyed. We can use the following steps
to reproduce.
1) Use kmem_cache_create() to create a new kmem_cache named A.
2) Coincidentally, the kmem_cache A is an alias for kmem_cache B,
so the refcount of B is just increased.
3) Use kmem_cache_destroy() to destroy the kmem_cache A, just
decrease the B's refcount but mark the B as dying.
4) Create a new memory cgroup and alloc memory from the kmem_cache
B. It leads to create a non-root kmem_cache for allocating memory.
5) When destroy the memory cgroup created in the step 4), the
non-root kmem_cache can never be destroyed.
If we repeat steps 4) and 5), this will cause a lot of memory leak. So
only when refcount reach zero, we mark the root kmem_cache as dying.
Link: http://lkml.kernel.org/r/20200716165103.83462-1-songmuchun@bytedance.com
Fixes: 92ee383f6daa ("mm: fix race between kmem_cache destroy, create and deactivate")
Signed-off-by: Muchun Song <songmuchun(a)bytedance.com>
Reviewed-by: Shakeel Butt <shakeelb(a)google.com>
Acked-by: Roman Gushchin <guro(a)fb.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Christoph Lameter <cl(a)linux.com>
Cc: Pekka Enberg <penberg(a)kernel.org>
Cc: David Rientjes <rientjes(a)google.com>
Cc: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Cc: Shakeel Butt <shakeelb(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/slab_common.c | 35 ++++++++++++++++++++++++++++-------
1 file changed, 28 insertions(+), 7 deletions(-)
--- a/mm/slab_common.c~mm-memcg-slab-fix-memory-leak-at-non-root-kmem_cache-destroy
+++ a/mm/slab_common.c
@@ -326,6 +326,14 @@ int slab_unmergeable(struct kmem_cache *
if (s->refcount < 0)
return 1;
+#ifdef CONFIG_MEMCG_KMEM
+ /*
+ * Skip the dying kmem_cache.
+ */
+ if (s->memcg_params.dying)
+ return 1;
+#endif
+
return 0;
}
@@ -886,12 +894,15 @@ static int shutdown_memcg_caches(struct
return 0;
}
-static void flush_memcg_workqueue(struct kmem_cache *s)
+static void memcg_set_kmem_cache_dying(struct kmem_cache *s)
{
spin_lock_irq(&memcg_kmem_wq_lock);
s->memcg_params.dying = true;
spin_unlock_irq(&memcg_kmem_wq_lock);
+}
+static void flush_memcg_workqueue(struct kmem_cache *s)
+{
/*
* SLAB and SLUB deactivate the kmem_caches through call_rcu. Make
* sure all registered rcu callbacks have been invoked.
@@ -923,10 +934,6 @@ static inline int shutdown_memcg_caches(
{
return 0;
}
-
-static inline void flush_memcg_workqueue(struct kmem_cache *s)
-{
-}
#endif /* CONFIG_MEMCG_KMEM */
void slab_kmem_cache_release(struct kmem_cache *s)
@@ -944,8 +951,6 @@ void kmem_cache_destroy(struct kmem_cach
if (unlikely(!s))
return;
- flush_memcg_workqueue(s);
-
get_online_cpus();
get_online_mems();
@@ -955,6 +960,22 @@ void kmem_cache_destroy(struct kmem_cach
if (s->refcount)
goto out_unlock;
+#ifdef CONFIG_MEMCG_KMEM
+ memcg_set_kmem_cache_dying(s);
+
+ mutex_unlock(&slab_mutex);
+
+ put_online_mems();
+ put_online_cpus();
+
+ flush_memcg_workqueue(s);
+
+ get_online_cpus();
+ get_online_mems();
+
+ mutex_lock(&slab_mutex);
+#endif
+
err = shutdown_memcg_caches(s);
if (!err)
err = shutdown_cache(s);
_
From: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Subject: mm/mmap.c: close race between munmap() and expand_upwards()/downwards()
VMA with VM_GROWSDOWN or VM_GROWSUP flag set can change their size under
mmap_read_lock(). It can lead to race with __do_munmap():
Thread A Thread B
__do_munmap()
detach_vmas_to_be_unmapped()
mmap_write_downgrade()
expand_downwards()
vma->vm_start = address;
// The VMA now overlaps with
// VMAs detached by the Thread A
// page fault populates expanded part
// of the VMA
unmap_region()
// Zaps pagetables partly
// populated by Thread B
Similar race exists for expand_upwards().
The fix is to avoid downgrading mmap_lock in __do_munmap() if detached
VMAs are next to VM_GROWSDOWN or VM_GROWSUP VMA.
[akpm(a)linux-foundation.org: s/mmap_sem/mmap_lock/ in comment]
Link: http://lkml.kernel.org/r/20200709105309.42495-1-kirill.shutemov@linux.intel…
Fixes: dd2283f2605e ("mm: mmap: zap pages with read mmap_sem in munmap")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Reported-by: Jann Horn <jannh(a)google.com>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Reviewed-by: Yang Shi <yang.shi(a)linux.alibaba.com>
Cc: Oleg Nesterov <oleg(a)redhat.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: <stable(a)vger.kernel.org> [4.20+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/mmap.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
--- a/mm/mmap.c~mm-close-race-between-munmap-and-expand_upwards-downwards
+++ a/mm/mmap.c
@@ -2620,7 +2620,7 @@ static void unmap_region(struct mm_struc
* Create a list of vma's touched by the unmap, removing them from the mm's
* vma list as we go..
*/
-static void
+static bool
detach_vmas_to_be_unmapped(struct mm_struct *mm, struct vm_area_struct *vma,
struct vm_area_struct *prev, unsigned long end)
{
@@ -2645,6 +2645,17 @@ detach_vmas_to_be_unmapped(struct mm_str
/* Kill the cache */
vmacache_invalidate(mm);
+
+ /*
+ * Do not downgrade mmap_lock if we are next to VM_GROWSDOWN or
+ * VM_GROWSUP VMA. Such VMAs can change their size under
+ * down_read(mmap_lock) and collide with the VMA we are about to unmap.
+ */
+ if (vma && (vma->vm_flags & VM_GROWSDOWN))
+ return false;
+ if (prev && (prev->vm_flags & VM_GROWSUP))
+ return false;
+ return true;
}
/*
@@ -2825,7 +2836,8 @@ int __do_munmap(struct mm_struct *mm, un
}
/* Detach vmas from rbtree */
- detach_vmas_to_be_unmapped(mm, vma, prev, end);
+ if (!detach_vmas_to_be_unmapped(mm, vma, prev, end))
+ downgrade = false;
if (downgrade)
mmap_write_downgrade(mm);
_
From: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Currently, memalloc_nocma_{save/restore} API that prevents CMA area
in page allocation is implemented by using current_gfp_context(). However,
there are two problems of this implementation.
First, this doesn't work for allocation fastpath. In the fastpath,
original gfp_mask is used since current_gfp_context() is introduced in
order to control reclaim and it is on slowpath. So, CMA area can be
allocated through the allocation fastpath even if
memalloc_nocma_{save/restore} APIs are used. Currently, there is just
one user for these APIs and it has a fallback method to prevent actual
problem.
Second, clearing __GFP_MOVABLE in current_gfp_context() has a side effect
to exclude the memory on the ZONE_MOVABLE for allocation target.
To fix these problems, this patch changes the implementation to exclude
CMA area in page allocation. Main point of this change is using the
alloc_flags. alloc_flags is mainly used to control allocation so it fits
for excluding CMA area in allocation.
Fixes: d7fefcc8de91 (mm/cma: add PF flag to force non cma alloc)
Cc: <stable(a)vger.kernel.org>
Reviewed-by: Vlastimil Babka <vbabka(a)suse.cz>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
---
include/linux/sched/mm.h | 8 +-------
mm/page_alloc.c | 31 +++++++++++++++++++++----------
2 files changed, 22 insertions(+), 17 deletions(-)
diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index 480a4d1..17e0c31 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -177,12 +177,10 @@ static inline bool in_vfork(struct task_struct *tsk)
* Applies per-task gfp context to the given allocation flags.
* PF_MEMALLOC_NOIO implies GFP_NOIO
* PF_MEMALLOC_NOFS implies GFP_NOFS
- * PF_MEMALLOC_NOCMA implies no allocation from CMA region.
*/
static inline gfp_t current_gfp_context(gfp_t flags)
{
- if (unlikely(current->flags &
- (PF_MEMALLOC_NOIO | PF_MEMALLOC_NOFS | PF_MEMALLOC_NOCMA))) {
+ if (unlikely(current->flags & (PF_MEMALLOC_NOIO | PF_MEMALLOC_NOFS))) {
/*
* NOIO implies both NOIO and NOFS and it is a weaker context
* so always make sure it makes precedence
@@ -191,10 +189,6 @@ static inline gfp_t current_gfp_context(gfp_t flags)
flags &= ~(__GFP_IO | __GFP_FS);
else if (current->flags & PF_MEMALLOC_NOFS)
flags &= ~__GFP_FS;
-#ifdef CONFIG_CMA
- if (current->flags & PF_MEMALLOC_NOCMA)
- flags &= ~__GFP_MOVABLE;
-#endif
}
return flags;
}
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e028b87c..7336e94 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2790,7 +2790,7 @@ __rmqueue(struct zone *zone, unsigned int order, int migratetype,
* allocating from CMA when over half of the zone's free memory
* is in the CMA area.
*/
- if (migratetype == MIGRATE_MOVABLE &&
+ if (alloc_flags & ALLOC_CMA &&
zone_page_state(zone, NR_FREE_CMA_PAGES) >
zone_page_state(zone, NR_FREE_PAGES) / 2) {
page = __rmqueue_cma_fallback(zone, order);
@@ -2801,7 +2801,7 @@ __rmqueue(struct zone *zone, unsigned int order, int migratetype,
retry:
page = __rmqueue_smallest(zone, order, migratetype);
if (unlikely(!page)) {
- if (migratetype == MIGRATE_MOVABLE)
+ if (alloc_flags & ALLOC_CMA)
page = __rmqueue_cma_fallback(zone, order);
if (!page && __rmqueue_fallback(zone, order, migratetype,
@@ -3671,6 +3671,20 @@ alloc_flags_nofragment(struct zone *zone, gfp_t gfp_mask)
return alloc_flags;
}
+static inline unsigned int current_alloc_flags(gfp_t gfp_mask,
+ unsigned int alloc_flags)
+{
+#ifdef CONFIG_CMA
+ unsigned int pflags = current->flags;
+
+ if (!(pflags & PF_MEMALLOC_NOCMA) &&
+ gfp_migratetype(gfp_mask) == MIGRATE_MOVABLE)
+ alloc_flags |= ALLOC_CMA;
+
+#endif
+ return alloc_flags;
+}
+
/*
* get_page_from_freelist goes through the zonelist trying to allocate
* a page.
@@ -4316,10 +4330,8 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
} else if (unlikely(rt_task(current)) && !in_interrupt())
alloc_flags |= ALLOC_HARDER;
-#ifdef CONFIG_CMA
- if (gfp_migratetype(gfp_mask) == MIGRATE_MOVABLE)
- alloc_flags |= ALLOC_CMA;
-#endif
+ alloc_flags = current_alloc_flags(gfp_mask, alloc_flags);
+
return alloc_flags;
}
@@ -4620,7 +4632,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
reserve_flags = __gfp_pfmemalloc_flags(gfp_mask);
if (reserve_flags)
- alloc_flags = reserve_flags;
+ alloc_flags = current_alloc_flags(gfp_mask, reserve_flags);
/*
* Reset the nodemask and zonelist iterators if memory policies can be
@@ -4697,7 +4709,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
/* Avoid allocations with no watermarks from looping endlessly */
if (tsk_is_oom_victim(current) &&
- (alloc_flags == ALLOC_OOM ||
+ (alloc_flags & ALLOC_OOM ||
(gfp_mask & __GFP_NOMEMALLOC)))
goto nopage;
@@ -4785,8 +4797,7 @@ static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order,
if (should_fail_alloc_page(gfp_mask, order))
return false;
- if (IS_ENABLED(CONFIG_CMA) && ac->migratetype == MIGRATE_MOVABLE)
- *alloc_flags |= ALLOC_CMA;
+ *alloc_flags = current_alloc_flags(gfp_mask, *alloc_flags);
return true;
}
--
2.7.4
The patch titled
Subject: khugepaged: fix null-pointer dereference due to race
has been added to the -mm tree. Its filename is
khugepaged-fix-null-pointer-dereference-due-to-race.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/khugepaged-fix-null-pointer-derefe…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/khugepaged-fix-null-pointer-derefe…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: "Kirill A. Shutemov" <kirill.shutemov(a)linux.intel.com>
Subject: khugepaged: fix null-pointer dereference due to race
khugepaged has to drop mmap lock several times while collapsing a page.
The situation can change while the lock is dropped and we need to
re-validate that the VMA is still in place and the PMD is still subject
for collapse.
But we miss one corner case: while collapsing an anonymous pages the VMA
could be replaced with file VMA. If the file VMA doesn't have any
private pages we get NULL pointer dereference:
general protection fault, probably for non-canonical address 0xdffffc0000000000: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
anon_vma_lock_write include/linux/rmap.h:120 [inline]
collapse_huge_page mm/khugepaged.c:1110 [inline]
khugepaged_scan_pmd mm/khugepaged.c:1349 [inline]
khugepaged_scan_mm_slot mm/khugepaged.c:2110 [inline]
khugepaged_do_scan mm/khugepaged.c:2193 [inline]
khugepaged+0x3bba/0x5a10 mm/khugepaged.c:2238
The fix is to make sure that the VMA is anonymous in
hugepage_vma_revalidate(). The helper is only used for collapsing
anonymous pages.
Link: http://lkml.kernel.org/r/20200722121439.44328-1-kirill.shutemov@linux.intel…
Fixes: 99cb0dbd47a1 ("mm,thp: add read-only THP support for (non-shmem) FS")
Signed-off-by: Kirill A. Shutemov <kirill.shutemov(a)linux.intel.com>
Reported-by: syzbot+ed318e8b790ca72c5ad0(a)syzkaller.appspotmail.com
Reviewed-by: David Hildenbrand <david(a)redhat.com>
Acked-by: Yang Shi <yang.shi(a)linux.alibaba.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/khugepaged.c | 3 +++
1 file changed, 3 insertions(+)
--- a/mm/khugepaged.c~khugepaged-fix-null-pointer-dereference-due-to-race
+++ a/mm/khugepaged.c
@@ -958,6 +958,9 @@ static int hugepage_vma_revalidate(struc
return SCAN_ADDRESS_RANGE;
if (!hugepage_vma_check(vma, vma->vm_flags))
return SCAN_VMA_CHECK;
+ /* Anon VMA expected */
+ if (!vma->anon_vma || vma->vm_ops)
+ return SCAN_VMA_CHECK;
return 0;
}
_
Patches currently in -mm which might be from kirill.shutemov(a)linux.intel.com are
mm-close-race-between-munmap-and-expand_upwards-downwards.patch
khugepaged-fix-null-pointer-dereference-due-to-race.patch
The patch titled
Subject: mm/page_alloc: fix memalloc_nocma_{save/restore} APIs
has been added to the -mm tree. Its filename is
mm-page_alloc-fix-memalloc_nocma_save-restore-apis.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-page_alloc-fix-memalloc_nocma_s…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-page_alloc-fix-memalloc_nocma_s…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: js1304(a)gmail.com
Subject: mm/page_alloc: fix memalloc_nocma_{save/restore} APIs
Currently, memalloc_nocma_{save/restore} API that prevents CMA area
in page allocation is implemented by using current_gfp_context(). However,
there are two problems of this implementation.
First, this doesn't work for allocation fastpath. In the fastpath,
original gfp_mask is used since current_gfp_context() is introduced in
order to control reclaim and it is on slowpath. So, CMA area can be
allocated through the allocation fastpath even if
memalloc_nocma_{save/restore} APIs are used. Currently, there is just
one user for these APIs and it has a fallback method to prevent actual
problem.
Second, clearing __GFP_MOVABLE in current_gfp_context() has a side effect
to exclude the memory on the ZONE_MOVABLE for allocation target.
To fix these problems, this patch changes the implementation to exclude
CMA area in page allocation. Main point of this change is using the
alloc_flags. alloc_flags is mainly used to control allocation so it fits
for excluding CMA area in allocation.
Link: http://lkml.kernel.org/r/1595468942-29687-1-git-send-email-iamjoonsoo.kim@l…
Fixes: d7fefcc8de91 (mm/cma: add PF flag to force non cma alloc)
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Reviewed-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Christoph Hellwig <hch(a)infradead.org>
Cc: Roman Gushchin <guro(a)fb.com>
Cc: Mike Kravetz <mike.kravetz(a)oracle.com>
Cc: Naoya Horiguchi <n-horiguchi(a)ah.jp.nec.com>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: "Aneesh Kumar K . V" <aneesh.kumar(a)linux.ibm.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/sched/mm.h | 8 +-------
mm/page_alloc.c | 31 +++++++++++++++++++++----------
2 files changed, 22 insertions(+), 17 deletions(-)
--- a/include/linux/sched/mm.h~mm-page_alloc-fix-memalloc_nocma_save-restore-apis
+++ a/include/linux/sched/mm.h
@@ -177,12 +177,10 @@ static inline bool in_vfork(struct task_
* Applies per-task gfp context to the given allocation flags.
* PF_MEMALLOC_NOIO implies GFP_NOIO
* PF_MEMALLOC_NOFS implies GFP_NOFS
- * PF_MEMALLOC_NOCMA implies no allocation from CMA region.
*/
static inline gfp_t current_gfp_context(gfp_t flags)
{
- if (unlikely(current->flags &
- (PF_MEMALLOC_NOIO | PF_MEMALLOC_NOFS | PF_MEMALLOC_NOCMA))) {
+ if (unlikely(current->flags & (PF_MEMALLOC_NOIO | PF_MEMALLOC_NOFS))) {
/*
* NOIO implies both NOIO and NOFS and it is a weaker context
* so always make sure it makes precedence
@@ -191,10 +189,6 @@ static inline gfp_t current_gfp_context(
flags &= ~(__GFP_IO | __GFP_FS);
else if (current->flags & PF_MEMALLOC_NOFS)
flags &= ~__GFP_FS;
-#ifdef CONFIG_CMA
- if (current->flags & PF_MEMALLOC_NOCMA)
- flags &= ~__GFP_MOVABLE;
-#endif
}
return flags;
}
--- a/mm/page_alloc.c~mm-page_alloc-fix-memalloc_nocma_save-restore-apis
+++ a/mm/page_alloc.c
@@ -2790,7 +2790,7 @@ __rmqueue(struct zone *zone, unsigned in
* allocating from CMA when over half of the zone's free memory
* is in the CMA area.
*/
- if (migratetype == MIGRATE_MOVABLE &&
+ if (alloc_flags & ALLOC_CMA &&
zone_page_state(zone, NR_FREE_CMA_PAGES) >
zone_page_state(zone, NR_FREE_PAGES) / 2) {
page = __rmqueue_cma_fallback(zone, order);
@@ -2801,7 +2801,7 @@ __rmqueue(struct zone *zone, unsigned in
retry:
page = __rmqueue_smallest(zone, order, migratetype);
if (unlikely(!page)) {
- if (migratetype == MIGRATE_MOVABLE)
+ if (alloc_flags & ALLOC_CMA)
page = __rmqueue_cma_fallback(zone, order);
if (!page && __rmqueue_fallback(zone, order, migratetype,
@@ -3671,6 +3671,20 @@ alloc_flags_nofragment(struct zone *zone
return alloc_flags;
}
+static inline unsigned int current_alloc_flags(gfp_t gfp_mask,
+ unsigned int alloc_flags)
+{
+#ifdef CONFIG_CMA
+ unsigned int pflags = current->flags;
+
+ if (!(pflags & PF_MEMALLOC_NOCMA) &&
+ gfp_migratetype(gfp_mask) == MIGRATE_MOVABLE)
+ alloc_flags |= ALLOC_CMA;
+
+#endif
+ return alloc_flags;
+}
+
/*
* get_page_from_freelist goes through the zonelist trying to allocate
* a page.
@@ -4316,10 +4330,8 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
} else if (unlikely(rt_task(current)) && !in_interrupt())
alloc_flags |= ALLOC_HARDER;
-#ifdef CONFIG_CMA
- if (gfp_migratetype(gfp_mask) == MIGRATE_MOVABLE)
- alloc_flags |= ALLOC_CMA;
-#endif
+ alloc_flags = current_alloc_flags(gfp_mask, alloc_flags);
+
return alloc_flags;
}
@@ -4620,7 +4632,7 @@ retry:
reserve_flags = __gfp_pfmemalloc_flags(gfp_mask);
if (reserve_flags)
- alloc_flags = reserve_flags;
+ alloc_flags = current_alloc_flags(gfp_mask, reserve_flags);
/*
* Reset the nodemask and zonelist iterators if memory policies can be
@@ -4697,7 +4709,7 @@ retry:
/* Avoid allocations with no watermarks from looping endlessly */
if (tsk_is_oom_victim(current) &&
- (alloc_flags == ALLOC_OOM ||
+ (alloc_flags & ALLOC_OOM ||
(gfp_mask & __GFP_NOMEMALLOC)))
goto nopage;
@@ -4785,8 +4797,7 @@ static inline bool prepare_alloc_pages(g
if (should_fail_alloc_page(gfp_mask, order))
return false;
- if (IS_ENABLED(CONFIG_CMA) && ac->migratetype == MIGRATE_MOVABLE)
- *alloc_flags |= ALLOC_CMA;
+ *alloc_flags = current_alloc_flags(gfp_mask, *alloc_flags);
return true;
}
_
Patches currently in -mm which might be from iamjoonsoo.kim(a)lge.com are
mm-page_alloc-fix-memalloc_nocma_save-restore-apis.patch
mm-vmscan-make-active-inactive-ratio-as-1-1-for-anon-lru.patch
mm-vmscan-protect-the-workingset-on-anonymous-lru.patch
mm-workingset-prepare-the-workingset-detection-infrastructure-for-anon-lru.patch
mm-swapcache-support-to-handle-the-shadow-entries.patch
mm-swap-implement-workingset-detection-for-anonymous-lru.patch
mm-vmscan-restore-active-inactive-ratio-for-anonymous-lru.patch
mm-page_isolation-prefer-the-node-of-the-source-page.patch
mm-migrate-move-migration-helper-from-h-to-c.patch
mm-hugetlb-unify-migration-callbacks.patch
mm-migrate-clear-__gfp_reclaim-to-make-the-migration-callback-consistent-with-regular-thp-allocations.patch
mm-migrate-make-a-standard-migration-target-allocation-function.patch
mm-mempolicy-use-a-standard-migration-target-allocation-callback.patch
mm-page_alloc-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory-failure-remove-a-wrapper-for-alloc_migration_target.patch
mm-memory_hotplug-remove-a-wrapper-for-alloc_migration_target.patch
__tracepoint_string's have their string data stored in .rodata, and an
address to that data stored in the "__tracepoint_str" section. Functions
that refer to those strings refer to the symbol of the address. Compiler
optimization can replace those address references with references
directly to the string data. If the address doesn't appear to have other
uses, then it appears dead to the compiler and is removed. This can
break the /tracing/printk_formats sysfs node which iterates the
addresses stored in the "__tracepoint_str" section.
Like other strings stored in custom sections in this header, mark these
__used to inform the compiler that there are other non-obvious users of
the address, so they should still be emitted.
Cc: stable(a)vger.kernel.org
Reported-by: Tim Murray <timmurray(a)google.com>
Reported-by: Simon MacMullen <simonmacm(a)google.com>
Suggested-by: Greg Hackmann <ghackmann(a)google.com>
Signed-off-by: Nick Desaulniers <ndesaulniers(a)google.com>
---
We observe this in Clang; it seems that GCC doesn't do the "cleanup" of
the dead address.
Specifically, the Clang passes "Interprocedural Sparse Conditional
Constant Propagation" (IPSCCP) and GlobalOpt both try to removed the
address if no other uses exist after inlining the reference directly to
the string data.
We don't want to change the linkage of these variables, but we kind of
want optimization behavior to treat these function static strings as if
they had `extern` linkage, at least by not removing the address of the
string data from the custom section.
include/linux/tracepoint.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/tracepoint.h b/include/linux/tracepoint.h
index a1fecf311621..3a5b717d92e8 100644
--- a/include/linux/tracepoint.h
+++ b/include/linux/tracepoint.h
@@ -361,7 +361,7 @@ static inline struct tracepoint *tracepoint_ptr_deref(tracepoint_ptr_t *p)
static const char *___tp_str __tracepoint_string = str; \
___tp_str; \
})
-#define __tracepoint_string __attribute__((section("__tracepoint_str")))
+#define __tracepoint_string __attribute__((section("__tracepoint_str"), used))
#else
/*
* tracepoint_string() is used to save the string address for userspace
--
2.28.0.rc0.105.gf9edc3c819-goog
Reported by Forza on IRC that remounting with compression options does
not reflect the change in level, or at least it does not appear to do so
according to the messages:
mount -o compress=zstd:1 /dev/sda /mnt
mount -o remount,compress=zstd:15 /mnt
does not print the change to the level to syslog:
[ 41.366060] BTRFS info (device vda): use zstd compression, level 1
[ 41.368254] BTRFS info (device vda): disk space caching is enabled
[ 41.390429] BTRFS info (device vda): disk space caching is enabled
What really happens is that the message is lost but the level is actualy
changed.
There's another weird output, if compression is reset to 'no':
[ 45.413776] BTRFS info (device vda): use no compression, level 4
To fix that, save the previous compression level and print the message
in that case too and use separate message for 'no' compression.
CC: stable(a)vger.kernel.org # 4.19+
Signed-off-by: David Sterba <dsterba(a)suse.com>
---
fs/btrfs/super.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)
diff --git a/fs/btrfs/super.c b/fs/btrfs/super.c
index 5a9dc31d95c9..aa73422b0678 100644
--- a/fs/btrfs/super.c
+++ b/fs/btrfs/super.c
@@ -517,6 +517,7 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
char *compress_type;
bool compress_force = false;
enum btrfs_compression_type saved_compress_type;
+ int saved_compress_level;
bool saved_compress_force;
int no_compress = 0;
@@ -598,6 +599,7 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
info->compress_type : BTRFS_COMPRESS_NONE;
saved_compress_force =
btrfs_test_opt(info, FORCE_COMPRESS);
+ saved_compress_level = info->compress_level;
if (token == Opt_compress ||
token == Opt_compress_force ||
strncmp(args[0].from, "zlib", 4) == 0) {
@@ -642,6 +644,8 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
no_compress = 0;
} else if (strncmp(args[0].from, "no", 2) == 0) {
compress_type = "no";
+ info->compress_level = 0;
+ info->compress_type = 0;
btrfs_clear_opt(info->mount_opt, COMPRESS);
btrfs_clear_opt(info->mount_opt, FORCE_COMPRESS);
compress_force = false;
@@ -662,11 +666,11 @@ int btrfs_parse_options(struct btrfs_fs_info *info, char *options,
*/
btrfs_clear_opt(info->mount_opt, FORCE_COMPRESS);
}
- if ((btrfs_test_opt(info, COMPRESS) &&
- (info->compress_type != saved_compress_type ||
- compress_force != saved_compress_force)) ||
- (!btrfs_test_opt(info, COMPRESS) &&
- no_compress == 1)) {
+ if (no_compress == 1) {
+ btrfs_info(info, "use no compression");
+ } else if ((info->compress_type != saved_compress_type) ||
+ (compress_force != saved_compress_force) ||
+ (info->compress_level != saved_compress_level)) {
btrfs_info(info, "%s %s compression, level %d",
(compress_force) ? "force" : "use",
compress_type, info->compress_level);
--
2.25.0
Call dwc2_debugfs_exit() when usb_add_gadget_udc() has failed. This
ensure that the debugfs entries created by dwc2_debugfs_init() are
cleaned up in the error path.
Fixes: 207324a321a866 ("usb: dwc2: Postponed gadget registration to the udc class driver")
Signed-off-by: Martin Blumenstingl <martin.blumenstingl(a)googlemail.com>
---
This patch is compile-tested only. I found this while trying to
understand the latest changes to dwc2/platform.c.
drivers/usb/dwc2/platform.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/dwc2/platform.c b/drivers/usb/dwc2/platform.c
index c347d93eae64..02b6da7e21d7 100644
--- a/drivers/usb/dwc2/platform.c
+++ b/drivers/usb/dwc2/platform.c
@@ -582,12 +582,14 @@ static int dwc2_driver_probe(struct platform_device *dev)
retval = usb_add_gadget_udc(hsotg->dev, &hsotg->gadget);
if (retval) {
dwc2_hsotg_remove(hsotg);
- goto error_init;
+ goto error_debugfs;
}
}
#endif /* CONFIG_USB_DWC2_PERIPHERAL || CONFIG_USB_DWC2_DUAL_ROLE */
return 0;
+error_debugfs:
+ dwc2_debugfs_exit(hsotg);
error_init:
if (hsotg->params.activate_stm_id_vb_detection)
regulator_disable(hsotg->usb33d);
--
2.27.0
From: Gregory Herrero <gregory.herrero(a)oracle.com>
Currently, if a section has a relocation to '_mcount' symbol, a new
__mcount_loc entry will be added whatever the relocation type is.
This is problematic when a relocation to '_mcount' is in the middle of a
section and is not a call for ftrace use.
Such relocation could be generated with below code for example:
bool is_mcount(unsigned long addr)
{
return (target == (unsigned long) &_mcount);
}
With this snippet of code, ftrace will try to patch the mcount location
generated by this code on module load and fail with:
Call trace:
ftrace_bug+0xa0/0x28c
ftrace_process_locs+0x2f4/0x430
ftrace_module_init+0x30/0x38
load_module+0x14f0/0x1e78
__do_sys_finit_module+0x100/0x11c
__arm64_sys_finit_module+0x28/0x34
el0_svc_common+0x88/0x194
el0_svc_handler+0x38/0x8c
el0_svc+0x8/0xc
---[ end trace d828d06b36ad9d59 ]---
ftrace failed to modify
[<ffffa2dbf3a3a41c>] 0xffffa2dbf3a3a41c
actual: 66:a9:3c:90
Initializing ftrace call sites
ftrace record flags: 2000000
(0)
expected tramp: ffffa2dc6cf66724
So Limit the relocation type to R_AARCH64_CALL26 as in perl version of
recordmcount.
Fixes: ed60453fa8f8 ("ARM: 6511/1: ftrace: add ARM support for C version of recordmcount")
Signed-off-by: Gregory Herrero <gregory.herrero(a)oracle.com>
---
scripts/recordmcount.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/scripts/recordmcount.c b/scripts/recordmcount.c
index 7225107a9aaf..e59022b3f125 100644
--- a/scripts/recordmcount.c
+++ b/scripts/recordmcount.c
@@ -434,6 +434,11 @@ static int arm_is_fake_mcount(Elf32_Rel const *rp)
return 1;
}
+static int arm64_is_fake_mcount(Elf64_Rel const *rp)
+{
+ return ELF64_R_TYPE(w(rp->r_info)) != R_AARCH64_CALL26;
+}
+
/* 64-bit EM_MIPS has weird ELF64_Rela.r_info.
* http://techpubs.sgi.com/library/manuals/4000/007-4658-001/pdf/007-4658-001.…
* We interpret Table 29 Relocation Operation (Elf64_Rel, Elf64_Rela) [p.40]
@@ -547,6 +552,7 @@ static int do_file(char const *const fname)
make_nop = make_nop_arm64;
rel_type_nop = R_AARCH64_NONE;
ideal_nop = ideal_nop4_arm64;
+ is_fake_mcount64 = arm64_is_fake_mcount;
break;
case EM_IA_64: reltype = R_IA64_IMM64; break;
case EM_MIPS: /* reltype: e_class */ break;
--
2.27.0
If a stage-2 page-table contains an executable, read-only mapping at the
pte level (e.g. due to dirty logging being enabled), a subsequent write
fault to the same page which tries to install a larger block mapping
(e.g. due to dirty logging having been disabled) will erroneously inherit
the exec permission and consequently skip I-cache invalidation for the
rest of the block.
Ensure that exec permission is only inherited by write faults when the
new mapping is of the same size as the existing one. A subsequent
instruction abort will result in I-cache invalidation for the entire
block mapping.
Cc: Marc Zyngier <maz(a)kernel.org>
Cc: Quentin Perret <qperret(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Will Deacon <will(a)kernel.org>
---
Found by code inspection, rather than something actually going wrong.
arch/arm64/kvm/mmu.c | 9 +++++----
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 8c0035cab6b6..69dc36d1d486 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1326,7 +1326,7 @@ static bool stage2_get_leaf_entry(struct kvm *kvm, phys_addr_t addr,
return true;
}
-static bool stage2_is_exec(struct kvm *kvm, phys_addr_t addr)
+static bool stage2_is_exec(struct kvm *kvm, phys_addr_t addr, unsigned long sz)
{
pud_t *pudp;
pmd_t *pmdp;
@@ -1338,9 +1338,9 @@ static bool stage2_is_exec(struct kvm *kvm, phys_addr_t addr)
return false;
if (pudp)
- return kvm_s2pud_exec(pudp);
+ return sz == PUD_SIZE && kvm_s2pud_exec(pudp);
else if (pmdp)
- return kvm_s2pmd_exec(pmdp);
+ return sz == PMD_SIZE && kvm_s2pmd_exec(pmdp);
else
return kvm_s2pte_exec(ptep);
}
@@ -1958,7 +1958,8 @@ static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
* execute permissions, and we preserve whatever we have.
*/
needs_exec = exec_fault ||
- (fault_status == FSC_PERM && stage2_is_exec(kvm, fault_ipa));
+ (fault_status == FSC_PERM &&
+ stage2_is_exec(kvm, fault_ipa, vma_pagesize));
if (vma_pagesize == PUD_SIZE) {
pud_t new_pud = kvm_pfn_pud(pfn, mem_type);
--
2.28.0.rc0.105.gf9edc3c819-goog
target_unpopulated is incremented with nr_pages at the start of the
function, but the call to free_xenballooned_pages will only subtract
pgno number of pages, and thus the rest need to be subtracted before
returning or else accounting will be skewed.
Signed-off-by: Roger Pau Monné <roger.pau(a)citrix.com>
Cc: stable(a)vger.kernel.org
---
Cc: Boris Ostrovsky <boris.ostrovsky(a)oracle.com>
Cc: Juergen Gross <jgross(a)suse.com>
Cc: Stefano Stabellini <sstabellini(a)kernel.org>
Cc: xen-devel(a)lists.xenproject.org
---
drivers/xen/balloon.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/xen/balloon.c b/drivers/xen/balloon.c
index 77c57568e5d7..3cb10ed32557 100644
--- a/drivers/xen/balloon.c
+++ b/drivers/xen/balloon.c
@@ -630,6 +630,12 @@ int alloc_xenballooned_pages(int nr_pages, struct page **pages)
out_undo:
mutex_unlock(&balloon_mutex);
free_xenballooned_pages(pgno, pages);
+ /*
+ * NB: free_xenballooned_pages will only subtract pgno pages, but since
+ * target_unpopulated is incremented with nr_pages at the start we need
+ * to remove the remaining ones also, or accounting will be screwed.
+ */
+ balloon_stats.target_unpopulated -= nr_pages - pgno;
return ret;
}
EXPORT_SYMBOL(alloc_xenballooned_pages);
--
2.27.0
From: Tom Rix <trix(a)redhat.com>
clang static analysis flags this error
qat_uclo.c:297:3: warning: Attempt to free released memory
[unix.Malloc]
kfree(*init_tab_base);
^~~~~~~~~~~~~~~~~~~~~
When input *init_tab_base is null, the function allocates memory for
the head of the list. When there is problem allocating other list
elements the list is unwound and freed. Then a check is made if the
list head was allocated and is also freed.
Keeping track of the what may need to be freed is the variable 'tail_old'.
The unwinding/freeing block is
while (tail_old) {
mem_init = tail_old->next;
kfree(tail_old);
tail_old = mem_init;
}
The problem is that the first element of tail_old is also what was
allocated for the list head
init_header = kzalloc(sizeof(*init_header), GFP_KERNEL);
...
*init_tab_base = init_header;
flag = 1;
}
tail_old = init_header;
So *init_tab_base/init_header are freed twice.
There is another problem.
When the input *init_tab_base is non null the tail_old is calculated by
traveling down the list to first non null entry.
tail_old = init_header;
while (tail_old->next)
tail_old = tail_old->next;
When the unwinding free happens, the last entry of the input list will
be freed.
So the freeing needs a general changed.
If locally allocated the first element of tail_old is freed, else it
is skipped. As a bit of cleanup, reset *init_tab_base if it came in
as null.
Fixes: b4b7e67c917f ("crypto: qat - Intel(R) QAT ucode part of fw loader")
Signed-off-by: Tom Rix <trix(a)redhat.com>
---
drivers/crypto/qat/qat_common/qat_uclo.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/drivers/crypto/qat/qat_common/qat_uclo.c b/drivers/crypto/qat/qat_common/qat_uclo.c
index 4cc1f436b075..bff759e2f811 100644
--- a/drivers/crypto/qat/qat_common/qat_uclo.c
+++ b/drivers/crypto/qat/qat_common/qat_uclo.c
@@ -288,13 +288,18 @@ static int qat_uclo_create_batch_init_list(struct icp_qat_fw_loader_handle
}
return 0;
out_err:
+ /* Do not free the list head unless we allocated it. */
+ tail_old = tail_old->next;
+ if (flag) {
+ kfree(*init_tab_base);
+ *init_tab_base = NULL;
+ }
+
while (tail_old) {
mem_init = tail_old->next;
kfree(tail_old);
tail_old = mem_init;
}
- if (flag)
- kfree(*init_tab_base);
return -ENOMEM;
}
--
2.18.1
This is a note to let you know that I've just added the patch titled
/dev/mem: Add missing memory barriers for devmem_inode
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From b34e7e298d7a5ed76b3aa327c240c29f1ef6dd22 Mon Sep 17 00:00:00 2001
From: Eric Biggers <ebiggers(a)google.com>
Date: Wed, 15 Jul 2020 23:05:53 -0700
Subject: /dev/mem: Add missing memory barriers for devmem_inode
WRITE_ONCE() isn't the correct way to publish a pointer to a data
structure, since it doesn't include a write memory barrier. Therefore
other tasks may see that the pointer has been set but not see that the
pointed-to memory has finished being initialized yet. Instead a
primitive with "release" semantics is needed.
Use smp_store_release() for this.
The use of READ_ONCE() on the read side is still potentially correct if
there's no control dependency, i.e. if all memory being "published" is
transitively reachable via the pointer itself. But this pairing is
somewhat confusing and error-prone. So just upgrade the read side to
smp_load_acquire() so that it clearly pairs with smp_store_release().
Cc: Arnd Bergmann <arnd(a)arndb.de>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Russell King <linux(a)arm.linux.org.uk>
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Fixes: 3234ac664a87 ("/dev/mem: Revoke mappings when a driver claims the region")
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
Cc: stable <stable(a)vger.kernel.org>
Acked-by: Dan Williams <dan.j.williams(a)intel.com>
Link: https://lore.kernel.org/r/20200716060553.24618-1-ebiggers@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/char/mem.c | 10 +++++++---
1 file changed, 7 insertions(+), 3 deletions(-)
diff --git a/drivers/char/mem.c b/drivers/char/mem.c
index 934c92dcb9ab..687d4af6945d 100644
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -814,7 +814,8 @@ static struct inode *devmem_inode;
#ifdef CONFIG_IO_STRICT_DEVMEM
void revoke_devmem(struct resource *res)
{
- struct inode *inode = READ_ONCE(devmem_inode);
+ /* pairs with smp_store_release() in devmem_init_inode() */
+ struct inode *inode = smp_load_acquire(&devmem_inode);
/*
* Check that the initialization has completed. Losing the race
@@ -1028,8 +1029,11 @@ static int devmem_init_inode(void)
return rc;
}
- /* publish /dev/mem initialized */
- WRITE_ONCE(devmem_inode, inode);
+ /*
+ * Publish /dev/mem initialized.
+ * Pairs with smp_load_acquire() in revoke_devmem().
+ */
+ smp_store_release(&devmem_inode, inode);
return 0;
}
--
2.27.0
This is a note to let you know that I've just added the patch titled
binder: Don't use mmput() from shrinker function.
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From f867c771f98891841c217fa8459244ed0dd28921 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel(a)i-love.sakura.ne.jp>
Date: Fri, 17 Jul 2020 00:12:15 +0900
Subject: binder: Don't use mmput() from shrinker function.
syzbot is reporting that mmput() from shrinker function has a risk of
deadlock [1], for delayed_uprobe_add() from update_ref_ctr() calls
kzalloc(GFP_KERNEL) with delayed_uprobe_lock held, and
uprobe_clear_state() from __mmput() also holds delayed_uprobe_lock.
Commit a1b2289cef92ef0e ("android: binder: drop lru lock in isolate
callback") replaced mmput() with mmput_async() in order to avoid sleeping
with spinlock held. But this patch replaces mmput() with mmput_async() in
order not to start __mmput() from shrinker context.
[1] https://syzkaller.appspot.com/bug?id=bc9e7303f537c41b2b0cc2dfcea3fc42964c2d…
Reported-by: syzbot <syzbot+1068f09c44d151250c33(a)syzkaller.appspotmail.com>
Reported-by: syzbot <syzbot+e5344baa319c9a96edec(a)syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
Reviewed-by: Michal Hocko <mhocko(a)suse.com>
Acked-by: Todd Kjos <tkjos(a)google.com>
Acked-by: Christian Brauner <christian.brauner(a)ubuntu.com>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/4ba9adb2-43f5-2de0-22de-f6075c1fab50@i-love.sakur…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/android/binder_alloc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/android/binder_alloc.c b/drivers/android/binder_alloc.c
index 42c672f1584e..cbe6aa77d50d 100644
--- a/drivers/android/binder_alloc.c
+++ b/drivers/android/binder_alloc.c
@@ -947,7 +947,7 @@ enum lru_status binder_alloc_free_page(struct list_head *item,
trace_binder_unmap_user_end(alloc, index);
}
mmap_read_unlock(mm);
- mmput(mm);
+ mmput_async(mm);
trace_binder_unmap_kernel_start(alloc, index);
--
2.27.0
This is a note to let you know that I've just added the patch titled
iio: imu: st_lsm6dsx: reset hw ts after resume
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From a1bab9396c2d98c601ce81c27567159dfbc10c19 Mon Sep 17 00:00:00 2001
From: Lorenzo Bianconi <lorenzo(a)kernel.org>
Date: Mon, 13 Jul 2020 13:40:19 +0200
Subject: iio: imu: st_lsm6dsx: reset hw ts after resume
Reset hw time samples generator after system resume in order to avoid
disalignment between system and device time reference since FIFO
batching and time samples generator are disabled during suspend.
Fixes: 213451076bd3 ("iio: imu: st_lsm6dsx: add hw timestamp support")
Tested-by: Sean Nyekjaer <sean(a)geanix.com>
Signed-off-by: Lorenzo Bianconi <lorenzo(a)kernel.org>
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/imu/st_lsm6dsx/st_lsm6dsx.h | 3 +--
.../iio/imu/st_lsm6dsx/st_lsm6dsx_buffer.c | 23 ++++++++++++-------
drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_core.c | 2 +-
3 files changed, 17 insertions(+), 11 deletions(-)
diff --git a/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx.h b/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx.h
index d82ec6398222..d80ba2e688ed 100644
--- a/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx.h
+++ b/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx.h
@@ -436,8 +436,7 @@ int st_lsm6dsx_update_watermark(struct st_lsm6dsx_sensor *sensor,
u16 watermark);
int st_lsm6dsx_update_fifo(struct st_lsm6dsx_sensor *sensor, bool enable);
int st_lsm6dsx_flush_fifo(struct st_lsm6dsx_hw *hw);
-int st_lsm6dsx_set_fifo_mode(struct st_lsm6dsx_hw *hw,
- enum st_lsm6dsx_fifo_mode fifo_mode);
+int st_lsm6dsx_resume_fifo(struct st_lsm6dsx_hw *hw);
int st_lsm6dsx_read_fifo(struct st_lsm6dsx_hw *hw);
int st_lsm6dsx_read_tagged_fifo(struct st_lsm6dsx_hw *hw);
int st_lsm6dsx_check_odr(struct st_lsm6dsx_sensor *sensor, u32 odr, u8 *val);
diff --git a/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_buffer.c b/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_buffer.c
index afd00daeefb2..7de10bd636ea 100644
--- a/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_buffer.c
+++ b/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_buffer.c
@@ -184,8 +184,8 @@ static int st_lsm6dsx_update_decimators(struct st_lsm6dsx_hw *hw)
return err;
}
-int st_lsm6dsx_set_fifo_mode(struct st_lsm6dsx_hw *hw,
- enum st_lsm6dsx_fifo_mode fifo_mode)
+static int st_lsm6dsx_set_fifo_mode(struct st_lsm6dsx_hw *hw,
+ enum st_lsm6dsx_fifo_mode fifo_mode)
{
unsigned int data;
@@ -302,6 +302,18 @@ static int st_lsm6dsx_reset_hw_ts(struct st_lsm6dsx_hw *hw)
return 0;
}
+int st_lsm6dsx_resume_fifo(struct st_lsm6dsx_hw *hw)
+{
+ int err;
+
+ /* reset hw ts counter */
+ err = st_lsm6dsx_reset_hw_ts(hw);
+ if (err < 0)
+ return err;
+
+ return st_lsm6dsx_set_fifo_mode(hw, ST_LSM6DSX_FIFO_CONT);
+}
+
/*
* Set max bulk read to ST_LSM6DSX_MAX_WORD_LEN/ST_LSM6DSX_MAX_TAGGED_WORD_LEN
* in order to avoid a kmalloc for each bus access
@@ -675,12 +687,7 @@ int st_lsm6dsx_update_fifo(struct st_lsm6dsx_sensor *sensor, bool enable)
goto out;
if (fifo_mask) {
- /* reset hw ts counter */
- err = st_lsm6dsx_reset_hw_ts(hw);
- if (err < 0)
- goto out;
-
- err = st_lsm6dsx_set_fifo_mode(hw, ST_LSM6DSX_FIFO_CONT);
+ err = st_lsm6dsx_resume_fifo(hw);
if (err < 0)
goto out;
}
diff --git a/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_core.c b/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_core.c
index c8ddeb3f48ff..346c24281d26 100644
--- a/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_core.c
+++ b/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_core.c
@@ -2457,7 +2457,7 @@ static int __maybe_unused st_lsm6dsx_resume(struct device *dev)
}
if (hw->fifo_mask)
- err = st_lsm6dsx_set_fifo_mode(hw, ST_LSM6DSX_FIFO_CONT);
+ err = st_lsm6dsx_resume_fifo(hw);
return err;
}
--
2.27.0
This is a note to let you know that I've just added the patch titled
iio: dac: ad5592r: fix unbalanced mutex unlocks in ad5592r_read_raw()
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From 65afb0932a81c1de719ceee0db0b276094b10ac8 Mon Sep 17 00:00:00 2001
From: Alexandru Ardelean <alexandru.ardelean(a)analog.com>
Date: Mon, 6 Jul 2020 14:02:57 +0300
Subject: iio: dac: ad5592r: fix unbalanced mutex unlocks in ad5592r_read_raw()
There are 2 exit paths where the lock isn't held, but try to unlock the
mutex when exiting. In these places we should just return from the
function.
A neater approach would be to cleanup the ad5592r_read_raw(), but that
would make this patch more difficult to backport to stable versions.
Fixes 56ca9db862bf3: ("iio: dac: Add support for the AD5592R/AD5593R ADCs/DACs")
Reported-by: Charles Stanhope <charles.stanhope(a)gmail.com>
Signed-off-by: Alexandru Ardelean <alexandru.ardelean(a)analog.com>
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/dac/ad5592r-base.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/iio/dac/ad5592r-base.c b/drivers/iio/dac/ad5592r-base.c
index 5c4e5ff70380..cc4875660a69 100644
--- a/drivers/iio/dac/ad5592r-base.c
+++ b/drivers/iio/dac/ad5592r-base.c
@@ -413,7 +413,7 @@ static int ad5592r_read_raw(struct iio_dev *iio_dev,
s64 tmp = *val * (3767897513LL / 25LL);
*val = div_s64_rem(tmp, 1000000000LL, val2);
- ret = IIO_VAL_INT_PLUS_MICRO;
+ return IIO_VAL_INT_PLUS_MICRO;
} else {
int mult;
@@ -444,7 +444,7 @@ static int ad5592r_read_raw(struct iio_dev *iio_dev,
ret = IIO_VAL_INT;
break;
default:
- ret = -EINVAL;
+ return -EINVAL;
}
unlock:
--
2.27.0
From: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Currently, preventing cma area in page allocation is implemented by using
current_gfp_context(). However, there are two problems of this
implementation.
First, this doesn't work for allocation fastpath. In the fastpath,
original gfp_mask is used since current_gfp_context() is introduced in
order to control reclaim and it is on slowpath.
Second, clearing __GFP_MOVABLE has a side effect to exclude the memory
on the ZONE_MOVABLE for allocation target.
To fix these problems, this patch changes the implementation to exclude
cma area in page allocation. Main point of this change is using the
alloc_flags. alloc_flags is mainly used to control allocation so it fits
for excluding cma area in allocation.
Fixes: d7fefcc8de91 (mm/cma: add PF flag to force non cma alloc)
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
---
include/linux/sched/mm.h | 8 +-------
mm/page_alloc.c | 37 ++++++++++++++++++++++++-------------
2 files changed, 25 insertions(+), 20 deletions(-)
diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index 44ad5b7..6c652ec 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -175,14 +175,12 @@ static inline bool in_vfork(struct task_struct *tsk)
* Applies per-task gfp context to the given allocation flags.
* PF_MEMALLOC_NOIO implies GFP_NOIO
* PF_MEMALLOC_NOFS implies GFP_NOFS
- * PF_MEMALLOC_NOCMA implies no allocation from CMA region.
*/
static inline gfp_t current_gfp_context(gfp_t flags)
{
unsigned int pflags = READ_ONCE(current->flags);
- if (unlikely(pflags &
- (PF_MEMALLOC_NOIO | PF_MEMALLOC_NOFS | PF_MEMALLOC_NOCMA))) {
+ if (unlikely(pflags & (PF_MEMALLOC_NOIO | PF_MEMALLOC_NOFS))) {
/*
* NOIO implies both NOIO and NOFS and it is a weaker context
* so always make sure it makes precedence
@@ -191,10 +189,6 @@ static inline gfp_t current_gfp_context(gfp_t flags)
flags &= ~(__GFP_IO | __GFP_FS);
else if (pflags & PF_MEMALLOC_NOFS)
flags &= ~__GFP_FS;
-#ifdef CONFIG_CMA
- if (pflags & PF_MEMALLOC_NOCMA)
- flags &= ~__GFP_MOVABLE;
-#endif
}
return flags;
}
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 6416d08..b529220 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2791,7 +2791,7 @@ __rmqueue(struct zone *zone, unsigned int order, int migratetype,
* allocating from CMA when over half of the zone's free memory
* is in the CMA area.
*/
- if (migratetype == MIGRATE_MOVABLE &&
+ if (alloc_flags & ALLOC_CMA &&
zone_page_state(zone, NR_FREE_CMA_PAGES) >
zone_page_state(zone, NR_FREE_PAGES) / 2) {
page = __rmqueue_cma_fallback(zone, order);
@@ -2802,7 +2802,7 @@ __rmqueue(struct zone *zone, unsigned int order, int migratetype,
retry:
page = __rmqueue_smallest(zone, order, migratetype);
if (unlikely(!page)) {
- if (migratetype == MIGRATE_MOVABLE)
+ if (alloc_flags & ALLOC_CMA)
page = __rmqueue_cma_fallback(zone, order);
if (!page && __rmqueue_fallback(zone, order, migratetype,
@@ -3502,11 +3502,9 @@ static inline long __zone_watermark_unusable_free(struct zone *z,
if (likely(!alloc_harder))
unusable_free += z->nr_reserved_highatomic;
-#ifdef CONFIG_CMA
/* If allocation can't use CMA areas don't use free CMA pages */
- if (!(alloc_flags & ALLOC_CMA))
+ if (IS_ENABLED(CONFIG_CMA) && !(alloc_flags & ALLOC_CMA))
unusable_free += zone_page_state(z, NR_FREE_CMA_PAGES);
-#endif
return unusable_free;
}
@@ -3693,6 +3691,20 @@ alloc_flags_nofragment(struct zone *zone, gfp_t gfp_mask)
return alloc_flags;
}
+static inline unsigned int current_alloc_flags(gfp_t gfp_mask,
+ unsigned int alloc_flags)
+{
+#ifdef CONFIG_CMA
+ unsigned int pflags = current->flags;
+
+ if (!(pflags & PF_MEMALLOC_NOCMA) &&
+ gfp_migratetype(gfp_mask) == MIGRATE_MOVABLE)
+ alloc_flags |= ALLOC_CMA;
+
+#endif
+ return alloc_flags;
+}
+
/*
* get_page_from_freelist goes through the zonelist trying to allocate
* a page.
@@ -4339,10 +4351,8 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
} else if (unlikely(rt_task(current)) && !in_interrupt())
alloc_flags |= ALLOC_HARDER;
-#ifdef CONFIG_CMA
- if (gfp_migratetype(gfp_mask) == MIGRATE_MOVABLE)
- alloc_flags |= ALLOC_CMA;
-#endif
+ alloc_flags = current_alloc_flags(gfp_mask, alloc_flags);
+
return alloc_flags;
}
@@ -4642,8 +4652,10 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
wake_all_kswapds(order, gfp_mask, ac);
reserve_flags = __gfp_pfmemalloc_flags(gfp_mask);
- if (reserve_flags)
+ if (reserve_flags) {
alloc_flags = reserve_flags;
+ alloc_flags = current_alloc_flags(gfp_mask, alloc_flags);
+ }
/*
* Reset the nodemask and zonelist iterators if memory policies can be
@@ -4720,7 +4732,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
/* Avoid allocations with no watermarks from looping endlessly */
if (tsk_is_oom_victim(current) &&
- (alloc_flags == ALLOC_OOM ||
+ (alloc_flags & ALLOC_OOM ||
(gfp_mask & __GFP_NOMEMALLOC)))
goto nopage;
@@ -4808,8 +4820,7 @@ static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order,
if (should_fail_alloc_page(gfp_mask, order))
return false;
- if (IS_ENABLED(CONFIG_CMA) && ac->migratetype == MIGRATE_MOVABLE)
- *alloc_flags |= ALLOC_CMA;
+ *alloc_flags = current_alloc_flags(gfp_mask, *alloc_flags);
return true;
}
--
2.7.4
From: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Currently, memalloc_nocma_{save/restore} API that prevents CMA area
in page allocation is implemented by using current_gfp_context(). However,
there are two problems of this implementation.
First, this doesn't work for allocation fastpath. In the fastpath,
original gfp_mask is used since current_gfp_context() is introduced in
order to control reclaim and it is on slowpath. So, CMA area can be
allocated through the allocation fastpath even if
memalloc_nocma_{save/restore} APIs are used. Currently, there is just
one user for these APIs and it has a fallback method to prevent actual
problem.
Second, clearing __GFP_MOVABLE in current_gfp_context() has a side effect
to exclude the memory on the ZONE_MOVABLE for allocation target.
To fix these problems, this patch changes the implementation to exclude
CMA area in page allocation. Main point of this change is using the
alloc_flags. alloc_flags is mainly used to control allocation so it fits
for excluding CMA area in allocation.
Fixes: d7fefcc8de91 (mm/cma: add PF flag to force non cma alloc)
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
---
include/linux/sched/mm.h | 8 +-------
mm/page_alloc.c | 33 +++++++++++++++++++++++----------
2 files changed, 24 insertions(+), 17 deletions(-)
diff --git a/include/linux/sched/mm.h b/include/linux/sched/mm.h
index 480a4d1..17e0c31 100644
--- a/include/linux/sched/mm.h
+++ b/include/linux/sched/mm.h
@@ -177,12 +177,10 @@ static inline bool in_vfork(struct task_struct *tsk)
* Applies per-task gfp context to the given allocation flags.
* PF_MEMALLOC_NOIO implies GFP_NOIO
* PF_MEMALLOC_NOFS implies GFP_NOFS
- * PF_MEMALLOC_NOCMA implies no allocation from CMA region.
*/
static inline gfp_t current_gfp_context(gfp_t flags)
{
- if (unlikely(current->flags &
- (PF_MEMALLOC_NOIO | PF_MEMALLOC_NOFS | PF_MEMALLOC_NOCMA))) {
+ if (unlikely(current->flags & (PF_MEMALLOC_NOIO | PF_MEMALLOC_NOFS))) {
/*
* NOIO implies both NOIO and NOFS and it is a weaker context
* so always make sure it makes precedence
@@ -191,10 +189,6 @@ static inline gfp_t current_gfp_context(gfp_t flags)
flags &= ~(__GFP_IO | __GFP_FS);
else if (current->flags & PF_MEMALLOC_NOFS)
flags &= ~__GFP_FS;
-#ifdef CONFIG_CMA
- if (current->flags & PF_MEMALLOC_NOCMA)
- flags &= ~__GFP_MOVABLE;
-#endif
}
return flags;
}
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index e028b87c..08cb35c 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2790,7 +2790,7 @@ __rmqueue(struct zone *zone, unsigned int order, int migratetype,
* allocating from CMA when over half of the zone's free memory
* is in the CMA area.
*/
- if (migratetype == MIGRATE_MOVABLE &&
+ if (alloc_flags & ALLOC_CMA &&
zone_page_state(zone, NR_FREE_CMA_PAGES) >
zone_page_state(zone, NR_FREE_PAGES) / 2) {
page = __rmqueue_cma_fallback(zone, order);
@@ -2801,7 +2801,7 @@ __rmqueue(struct zone *zone, unsigned int order, int migratetype,
retry:
page = __rmqueue_smallest(zone, order, migratetype);
if (unlikely(!page)) {
- if (migratetype == MIGRATE_MOVABLE)
+ if (alloc_flags & ALLOC_CMA)
page = __rmqueue_cma_fallback(zone, order);
if (!page && __rmqueue_fallback(zone, order, migratetype,
@@ -3671,6 +3671,20 @@ alloc_flags_nofragment(struct zone *zone, gfp_t gfp_mask)
return alloc_flags;
}
+static inline unsigned int current_alloc_flags(gfp_t gfp_mask,
+ unsigned int alloc_flags)
+{
+#ifdef CONFIG_CMA
+ unsigned int pflags = current->flags;
+
+ if (!(pflags & PF_MEMALLOC_NOCMA) &&
+ gfp_migratetype(gfp_mask) == MIGRATE_MOVABLE)
+ alloc_flags |= ALLOC_CMA;
+
+#endif
+ return alloc_flags;
+}
+
/*
* get_page_from_freelist goes through the zonelist trying to allocate
* a page.
@@ -4316,10 +4330,8 @@ gfp_to_alloc_flags(gfp_t gfp_mask)
} else if (unlikely(rt_task(current)) && !in_interrupt())
alloc_flags |= ALLOC_HARDER;
-#ifdef CONFIG_CMA
- if (gfp_migratetype(gfp_mask) == MIGRATE_MOVABLE)
- alloc_flags |= ALLOC_CMA;
-#endif
+ alloc_flags = current_alloc_flags(gfp_mask, alloc_flags);
+
return alloc_flags;
}
@@ -4619,8 +4631,10 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
wake_all_kswapds(order, gfp_mask, ac);
reserve_flags = __gfp_pfmemalloc_flags(gfp_mask);
- if (reserve_flags)
+ if (reserve_flags) {
alloc_flags = reserve_flags;
+ alloc_flags = current_alloc_flags(gfp_mask, alloc_flags);
+ }
/*
* Reset the nodemask and zonelist iterators if memory policies can be
@@ -4697,7 +4711,7 @@ __alloc_pages_slowpath(gfp_t gfp_mask, unsigned int order,
/* Avoid allocations with no watermarks from looping endlessly */
if (tsk_is_oom_victim(current) &&
- (alloc_flags == ALLOC_OOM ||
+ (alloc_flags & ALLOC_OOM ||
(gfp_mask & __GFP_NOMEMALLOC)))
goto nopage;
@@ -4785,8 +4799,7 @@ static inline bool prepare_alloc_pages(gfp_t gfp_mask, unsigned int order,
if (should_fail_alloc_page(gfp_mask, order))
return false;
- if (IS_ENABLED(CONFIG_CMA) && ac->migratetype == MIGRATE_MOVABLE)
- *alloc_flags |= ALLOC_CMA;
+ *alloc_flags = current_alloc_flags(gfp_mask, *alloc_flags);
return true;
}
--
2.7.4
The VT-d spec requires (10.4.4 Global Command Register, TE field) that:
Hardware implementations supporting DMA draining must drain any in-flight
DMA read/write requests queued within the Root-Complex before completing
the translation enable command and reflecting the status of the command
through the TES field in the Global Status register.
Unfortunately, some integrated graphic devices fail to do so after some
kind of power state transition. As the result, the system might stuck in
iommu_disable_translation(), waiting for the completion of TE transition.
This provides a quirk list for those devices and skips TE disabling if
the qurik hits.
Fixes: https://bugzilla.kernel.org/show_bug.cgi?id=208363
Tested-by: Koba Ko <koba.ko(a)canonical.com>
Cc: Ashok Raj <ashok.raj(a)intel.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Lu Baolu <baolu.lu(a)linux.intel.com>
---
drivers/iommu/intel/dmar.c | 1 +
drivers/iommu/intel/iommu.c | 27 +++++++++++++++++++++++++++
include/linux/dmar.h | 1 +
include/linux/intel-iommu.h | 2 ++
4 files changed, 31 insertions(+)
diff --git a/drivers/iommu/intel/dmar.c b/drivers/iommu/intel/dmar.c
index 683b812c5c47..16f47041f1bf 100644
--- a/drivers/iommu/intel/dmar.c
+++ b/drivers/iommu/intel/dmar.c
@@ -1102,6 +1102,7 @@ static int alloc_iommu(struct dmar_drhd_unit *drhd)
}
drhd->iommu = iommu;
+ iommu->drhd = drhd;
return 0;
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 98390a6d8113..11418b14cc3f 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -356,6 +356,7 @@ static int intel_iommu_strict;
static int intel_iommu_superpage = 1;
static int iommu_identity_mapping;
static int intel_no_bounce;
+static int iommu_skip_te_disable;
#define IDENTMAP_GFX 2
#define IDENTMAP_AZALIA 4
@@ -1633,6 +1634,10 @@ static void iommu_disable_translation(struct intel_iommu *iommu)
u32 sts;
unsigned long flag;
+ if (iommu_skip_te_disable && iommu->drhd->gfx_dedicated &&
+ (cap_read_drain(iommu->cap) || cap_write_drain(iommu->cap)))
+ return;
+
raw_spin_lock_irqsave(&iommu->register_lock, flag);
iommu->gcmd &= ~DMA_GCMD_TE;
writel(iommu->gcmd, iommu->reg + DMAR_GCMD_REG);
@@ -4043,6 +4048,7 @@ static void __init init_no_remapping_devices(void)
/* This IOMMU has *only* gfx devices. Either bypass it or
set the gfx_mapped flag, as appropriate */
+ drhd->gfx_dedicated = 1;
if (!dmar_map_gfx) {
drhd->ignored = 1;
for_each_active_dev_scope(drhd->devices,
@@ -6160,6 +6166,27 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x0044, quirk_calpella_no_shadow_g
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x0062, quirk_calpella_no_shadow_gtt);
DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0x006a, quirk_calpella_no_shadow_gtt);
+static void quirk_igfx_skip_te_disable(struct pci_dev *dev)
+{
+ unsigned short ver;
+
+ if (!IS_GFX_DEVICE(dev))
+ return;
+
+ ver = (dev->device >> 8) & 0xff;
+ if (ver != 0x45 && ver != 0x46 && ver != 0x4c &&
+ ver != 0x4e && ver != 0x8a && ver != 0x98 &&
+ ver != 0x9a)
+ return;
+
+ if (risky_device(dev))
+ return;
+
+ pci_info(dev, "Skip IOMMU disabling for graphics\n");
+ iommu_skip_te_disable = 1;
+}
+DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, PCI_ANY_ID, quirk_igfx_skip_te_disable);
+
/* On Tylersburg chipsets, some BIOSes have been known to enable the
ISOCH DMAR unit for the Azalia sound device, but not give it any
TLB entries, which causes it to deadlock. Check for that. We do
diff --git a/include/linux/dmar.h b/include/linux/dmar.h
index d7bf029df737..65565820328a 100644
--- a/include/linux/dmar.h
+++ b/include/linux/dmar.h
@@ -48,6 +48,7 @@ struct dmar_drhd_unit {
u16 segment; /* PCI domain */
u8 ignored:1; /* ignore drhd */
u8 include_all:1;
+ u8 gfx_dedicated:1; /* graphic dedicated */
struct intel_iommu *iommu;
};
diff --git a/include/linux/intel-iommu.h b/include/linux/intel-iommu.h
index bf6009a344f5..329629e1e9de 100644
--- a/include/linux/intel-iommu.h
+++ b/include/linux/intel-iommu.h
@@ -600,6 +600,8 @@ struct intel_iommu {
struct iommu_device iommu; /* IOMMU core code handle */
int node;
u32 flags; /* Software defined flags */
+
+ struct dmar_drhd_unit *drhd;
};
/* PCI domain-device relationship */
--
2.17.1
This is a note to let you know that I've just added the patch titled
staging: wlan-ng: properly check endpoint types
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From faaff9765664009c1c7c65551d32e9ed3b1dda8f Mon Sep 17 00:00:00 2001
From: Rustam Kovhaev <rkovhaev(a)gmail.com>
Date: Wed, 22 Jul 2020 09:10:52 -0700
Subject: staging: wlan-ng: properly check endpoint types
As syzkaller detected, wlan-ng driver does not do sanity check of
endpoints in prism2sta_probe_usb(), add check for xfer direction and type
Reported-and-tested-by: syzbot+c2a1fa67c02faa0de723(a)syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?extid=c2a1fa67c02faa0de723
Signed-off-by: Rustam Kovhaev <rkovhaev(a)gmail.com>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20200722161052.999754-1-rkovhaev@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/staging/wlan-ng/prism2usb.c | 16 +++++++++++++++-
1 file changed, 15 insertions(+), 1 deletion(-)
diff --git a/drivers/staging/wlan-ng/prism2usb.c b/drivers/staging/wlan-ng/prism2usb.c
index 4689b2170e4f..456603fd26c0 100644
--- a/drivers/staging/wlan-ng/prism2usb.c
+++ b/drivers/staging/wlan-ng/prism2usb.c
@@ -61,11 +61,25 @@ static int prism2sta_probe_usb(struct usb_interface *interface,
const struct usb_device_id *id)
{
struct usb_device *dev;
-
+ const struct usb_endpoint_descriptor *epd;
+ const struct usb_host_interface *iface_desc = interface->cur_altsetting;
struct wlandevice *wlandev = NULL;
struct hfa384x *hw = NULL;
int result = 0;
+ if (iface_desc->desc.bNumEndpoints != 2) {
+ result = -ENODEV;
+ goto failed;
+ }
+
+ result = -EINVAL;
+ epd = &iface_desc->endpoint[1].desc;
+ if (!usb_endpoint_is_bulk_in(epd))
+ goto failed;
+ epd = &iface_desc->endpoint[2].desc;
+ if (!usb_endpoint_is_bulk_out(epd))
+ goto failed;
+
dev = interface_to_usbdev(interface);
wlandev = create_wlan();
if (!wlandev) {
--
2.27.0
This is a note to let you know that I've just added the patch titled
iio: dac: ad5592r: fix unbalanced mutex unlocks in ad5592r_read_raw()
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the staging-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From 65afb0932a81c1de719ceee0db0b276094b10ac8 Mon Sep 17 00:00:00 2001
From: Alexandru Ardelean <alexandru.ardelean(a)analog.com>
Date: Mon, 6 Jul 2020 14:02:57 +0300
Subject: iio: dac: ad5592r: fix unbalanced mutex unlocks in ad5592r_read_raw()
There are 2 exit paths where the lock isn't held, but try to unlock the
mutex when exiting. In these places we should just return from the
function.
A neater approach would be to cleanup the ad5592r_read_raw(), but that
would make this patch more difficult to backport to stable versions.
Fixes 56ca9db862bf3: ("iio: dac: Add support for the AD5592R/AD5593R ADCs/DACs")
Reported-by: Charles Stanhope <charles.stanhope(a)gmail.com>
Signed-off-by: Alexandru Ardelean <alexandru.ardelean(a)analog.com>
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/dac/ad5592r-base.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/iio/dac/ad5592r-base.c b/drivers/iio/dac/ad5592r-base.c
index 5c4e5ff70380..cc4875660a69 100644
--- a/drivers/iio/dac/ad5592r-base.c
+++ b/drivers/iio/dac/ad5592r-base.c
@@ -413,7 +413,7 @@ static int ad5592r_read_raw(struct iio_dev *iio_dev,
s64 tmp = *val * (3767897513LL / 25LL);
*val = div_s64_rem(tmp, 1000000000LL, val2);
- ret = IIO_VAL_INT_PLUS_MICRO;
+ return IIO_VAL_INT_PLUS_MICRO;
} else {
int mult;
@@ -444,7 +444,7 @@ static int ad5592r_read_raw(struct iio_dev *iio_dev,
ret = IIO_VAL_INT;
break;
default:
- ret = -EINVAL;
+ return -EINVAL;
}
unlock:
--
2.27.0
This is a note to let you know that I've just added the patch titled
iio: imu: st_lsm6dsx: reset hw ts after resume
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-testing branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will be merged to the staging-next branch sometime soon,
after it passes testing, and the merge window is open.
If you have any questions about this process, please let me know.
>From a1bab9396c2d98c601ce81c27567159dfbc10c19 Mon Sep 17 00:00:00 2001
From: Lorenzo Bianconi <lorenzo(a)kernel.org>
Date: Mon, 13 Jul 2020 13:40:19 +0200
Subject: iio: imu: st_lsm6dsx: reset hw ts after resume
Reset hw time samples generator after system resume in order to avoid
disalignment between system and device time reference since FIFO
batching and time samples generator are disabled during suspend.
Fixes: 213451076bd3 ("iio: imu: st_lsm6dsx: add hw timestamp support")
Tested-by: Sean Nyekjaer <sean(a)geanix.com>
Signed-off-by: Lorenzo Bianconi <lorenzo(a)kernel.org>
Cc: <Stable(a)vger.kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
---
drivers/iio/imu/st_lsm6dsx/st_lsm6dsx.h | 3 +--
.../iio/imu/st_lsm6dsx/st_lsm6dsx_buffer.c | 23 ++++++++++++-------
drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_core.c | 2 +-
3 files changed, 17 insertions(+), 11 deletions(-)
diff --git a/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx.h b/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx.h
index d82ec6398222..d80ba2e688ed 100644
--- a/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx.h
+++ b/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx.h
@@ -436,8 +436,7 @@ int st_lsm6dsx_update_watermark(struct st_lsm6dsx_sensor *sensor,
u16 watermark);
int st_lsm6dsx_update_fifo(struct st_lsm6dsx_sensor *sensor, bool enable);
int st_lsm6dsx_flush_fifo(struct st_lsm6dsx_hw *hw);
-int st_lsm6dsx_set_fifo_mode(struct st_lsm6dsx_hw *hw,
- enum st_lsm6dsx_fifo_mode fifo_mode);
+int st_lsm6dsx_resume_fifo(struct st_lsm6dsx_hw *hw);
int st_lsm6dsx_read_fifo(struct st_lsm6dsx_hw *hw);
int st_lsm6dsx_read_tagged_fifo(struct st_lsm6dsx_hw *hw);
int st_lsm6dsx_check_odr(struct st_lsm6dsx_sensor *sensor, u32 odr, u8 *val);
diff --git a/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_buffer.c b/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_buffer.c
index afd00daeefb2..7de10bd636ea 100644
--- a/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_buffer.c
+++ b/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_buffer.c
@@ -184,8 +184,8 @@ static int st_lsm6dsx_update_decimators(struct st_lsm6dsx_hw *hw)
return err;
}
-int st_lsm6dsx_set_fifo_mode(struct st_lsm6dsx_hw *hw,
- enum st_lsm6dsx_fifo_mode fifo_mode)
+static int st_lsm6dsx_set_fifo_mode(struct st_lsm6dsx_hw *hw,
+ enum st_lsm6dsx_fifo_mode fifo_mode)
{
unsigned int data;
@@ -302,6 +302,18 @@ static int st_lsm6dsx_reset_hw_ts(struct st_lsm6dsx_hw *hw)
return 0;
}
+int st_lsm6dsx_resume_fifo(struct st_lsm6dsx_hw *hw)
+{
+ int err;
+
+ /* reset hw ts counter */
+ err = st_lsm6dsx_reset_hw_ts(hw);
+ if (err < 0)
+ return err;
+
+ return st_lsm6dsx_set_fifo_mode(hw, ST_LSM6DSX_FIFO_CONT);
+}
+
/*
* Set max bulk read to ST_LSM6DSX_MAX_WORD_LEN/ST_LSM6DSX_MAX_TAGGED_WORD_LEN
* in order to avoid a kmalloc for each bus access
@@ -675,12 +687,7 @@ int st_lsm6dsx_update_fifo(struct st_lsm6dsx_sensor *sensor, bool enable)
goto out;
if (fifo_mask) {
- /* reset hw ts counter */
- err = st_lsm6dsx_reset_hw_ts(hw);
- if (err < 0)
- goto out;
-
- err = st_lsm6dsx_set_fifo_mode(hw, ST_LSM6DSX_FIFO_CONT);
+ err = st_lsm6dsx_resume_fifo(hw);
if (err < 0)
goto out;
}
diff --git a/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_core.c b/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_core.c
index c8ddeb3f48ff..346c24281d26 100644
--- a/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_core.c
+++ b/drivers/iio/imu/st_lsm6dsx/st_lsm6dsx_core.c
@@ -2457,7 +2457,7 @@ static int __maybe_unused st_lsm6dsx_resume(struct device *dev)
}
if (hw->fifo_mask)
- err = st_lsm6dsx_set_fifo_mode(hw, ST_LSM6DSX_FIFO_CONT);
+ err = st_lsm6dsx_resume_fifo(hw);
return err;
}
--
2.27.0
Changes since v2 [1]:
- Drop the "mem-quiet" pm-debug interface in favor of an explicit
hibernate_quiet_exec() helper that executes firmware activation, or
any other subsystem provided routine, in a system-quiet context.
(Rafael)
- Rework the sysfs interface to add an explicit trigger to run
activation under hibernate_quiet_exec(). Rename
ndbusX/firmware_activate to ndbusX/firmware/activate, and add a
ndbusX/firmware/capability. Some ndctl reworks are needed to catch up
with this change.
- The new ndbusX/firmware/capability attribute indicates the default
activation method / execution context between "live" and "suspend".
[1]: http://lore.kernel.org/r/159408711335.2385045.2567600405906448375.stgit@dwi…
---
Quoting the documentation:
Some persistent memory devices run a firmware locally on the device /
"DIMM" to perform tasks like media management, capacity provisioning,
and health monitoring. The process of updating that firmware typically
involves a reboot because it has implications for in-flight memory
transactions. However, reboots are disruptive and at least the Intel
persistent memory platform implementation, described by the Intel ACPI
DSM specification [1], has added support for activating firmware at
runtime.
[1]: https://docs.pmem.io/persistent-memory/
The approach taken is to abstract the Intel platform specific mechanism
behind a libnvdimm-generic sysfs interface. The interface could support
runtime-firmware-activation on another architecture without need to
change userspace tooling.
The ACPI NFIT implementation involves a set of device-specific-methods
(DSMs) to 'arm' individual devices for activation and bus-level
'trigger' method to execute the activation. Informational / enumeration
methods are also provided at the bus and device level.
One complicating aspect of the memory device firmware activation is that
the memory controller may need to be quiesced, no memory cycles, during
the activation. While the platform has mechanisms to support holding off
in-flight DMA during the activation, the device response to that delay
is potentially undefined. The platform may reject a runtime firmware
update if, for example a PCI-E device does not support its completion
timeout value being increased to meet the activation time. Outside of
device timeouts the quiesce period may also violate application
timeouts.
Given the above device and application timeout considerations the
implementation uses a new hibernate_quiet_exec() facility to carry-out
firmware activation. This imposes the same conditions that allow for a
stable memory image snapshot to be taken for a hibernate-to-disk
sequence. However, if desired, runtime activation without the hibernate
freeze can be forced as an override.
The ndctl utility grows the following extensions / commands to drive
this mechanism:
1/ The existing update-firmware command will 'arm' devices where the
firmware image is staged by default.
ndctl update-firmware all -f firmware_image.bin
2/ The existing ability to enumerate firmware-update capabilities now
includes firmware activate capabilities at the 'bus' and 'dimm/device'
level:
ndctl list -BDF -b nfit_test.0
[
{
"provider":"nfit_test.0",
"dev":"ndbus2",
"scrub_state":"idle",
"firmware":{
"activate_method":"suspend",
"activate_state":"idle"
},
"dimms":[
{
"dev":"nmem1",
"id":"cdab-0a-07e0-ffffffff",
"handle":0,
"phys_id":0,
"security":"disabled",
"firmware":{
"current_version":0,
"can_update":true
}
},
...
3/ The new activate-firmware command triggers firmware activation per
the platform enumerated context, "suspend" vs "live", or can be forced
to "live" if there is a explicit knowledge that allowing applications
and devices to race the quiesce timeout will have no adverse effects.
ndctl activate-firmware nfit_test.0 [--force]
These patches are passing an updated version of the ndctl
"firmware-update.sh" unit test (to be posted).
---
Dan Williams (11):
libnvdimm: Validate command family indices
ACPI: NFIT: Move bus_dsm_mask out of generic nvdimm_bus_descriptor
ACPI: NFIT: Define runtime firmware activation commands
tools/testing/nvdimm: Cleanup dimm index passing
tools/testing/nvdimm: Add command debug messages
tools/testing/nvdimm: Prepare nfit_ctl_test() for ND_CMD_CALL emulation
tools/testing/nvdimm: Emulate firmware activation commands
driver-core: Introduce DEVICE_ATTR_ADMIN_{RO,RW}
libnvdimm: Convert to DEVICE_ATTR_ADMIN_RO()
PM, libnvdimm: Add runtime firmware activation support
ACPI: NFIT: Add runtime firmware activate support
Documentation/ABI/testing/sysfs-bus-nfit | 19 +
Documentation/ABI/testing/sysfs-bus-nvdimm | 2
.../driver-api/nvdimm/firmware-activate.rst | 86 ++++
drivers/acpi/nfit/core.c | 142 +++++--
drivers/acpi/nfit/intel.c | 386 ++++++++++++++++++++
drivers/acpi/nfit/intel.h | 61 +++
drivers/acpi/nfit/nfit.h | 38 ++
drivers/nvdimm/bus.c | 16 +
drivers/nvdimm/core.c | 149 ++++++++
drivers/nvdimm/dimm_devs.c | 119 ++++++
drivers/nvdimm/namespace_devs.c | 2
drivers/nvdimm/nd-core.h | 1
drivers/nvdimm/pfn_devs.c | 2
drivers/nvdimm/region_devs.c | 2
include/linux/device.h | 4
include/linux/libnvdimm.h | 52 +++
include/linux/suspend.h | 6
include/linux/sysfs.h | 7
include/uapi/linux/ndctl.h | 5
kernel/power/hibernate.c | 97 +++++
tools/testing/nvdimm/test/nfit.c | 367 +++++++++++++++----
21 files changed, 1449 insertions(+), 114 deletions(-)
create mode 100644 Documentation/ABI/testing/sysfs-bus-nvdimm
create mode 100644 Documentation/driver-api/nvdimm/firmware-activate.rst
base-commit: 48778464bb7d346b47157d21ffde2af6b2d39110
The !ATOMIC_IOMAP version of io_maping_init_wc will always return
success, even when the ioremap fails.
Since the ATOMIC_IOMAP version returns NULL when the init fails, and
callers check for a NULL return on error this is unexpected.
During a device probe, where the ioremap failed, a crash can look
like this:
BUG: unable to handle page fault for address: 0000000000210000
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
Oops: 0002 [#1] PREEMPT SMP
CPU: 0 PID: 177 Comm:
RIP: 0010:fill_page_dma [i915]
gen8_ppgtt_create [i915]
i915_ppgtt_create [i915]
intel_gt_init [i915]
i915_gem_init [i915]
i915_driver_probe [i915]
pci_device_probe
really_probe
driver_probe_device
The remap failure occurred much earlier in the probe. If it had
been propagated, the driver would have exited with an error.
Return NULL on ioremap failure.
Fixes: cafaf14a5d8f ("io-mapping: Always create a struct to hold metadata about the io-mapping")
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Mike Rapoport <rppt(a)linux.ibm.com>
Cc: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: stable(a)vger.kernel.org
Signed-off-by: Michael J. Ruhl <michael.j.ruhl(a)intel.com>
---
v2: reflect review comments
---
include/linux/io-mapping.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/io-mapping.h b/include/linux/io-mapping.h
index 0beaa3eba155..5641e06cbcf7 100644
--- a/include/linux/io-mapping.h
+++ b/include/linux/io-mapping.h
@@ -118,7 +118,7 @@ io_mapping_init_wc(struct io_mapping *iomap,
iomap->prot = pgprot_noncached(PAGE_KERNEL);
#endif
- return iomap;
+ return iomap->iomem ? iomap : NULL;
}
static inline void
--
2.21.0
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: de2b41be8fcccb2f5b6c480d35df590476344201
Gitweb: https://git.kernel.org/tip/de2b41be8fcccb2f5b6c480d35df590476344201
Author: Joerg Roedel <jroedel(a)suse.de>
AuthorDate: Tue, 21 Jul 2020 11:34:48 +02:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Wed, 22 Jul 2020 09:38:37 +02:00
x86, vmlinux.lds: Page-align end of ..page_aligned sections
On x86-32 the idt_table with 256 entries needs only 2048 bytes. It is
page-aligned, but the end of the .bss..page_aligned section is not
guaranteed to be page-aligned.
As a result, objects from other .bss sections may end up on the same 4k
page as the idt_table, and will accidentially get mapped read-only during
boot, causing unexpected page-faults when the kernel writes to them.
This could be worked around by making the objects in the page aligned
sections page sized, but that's wrong.
Explicit sections which store only page aligned objects have an implicit
guarantee that the object is alone in the page in which it is placed. That
works for all objects except the last one. That's inconsistent.
Enforcing page sized objects for these sections would wreckage memory
sanitizers, because the object becomes artificially larger than it should
be and out of bound access becomes legit.
Align the end of the .bss..page_aligned and .data..page_aligned section on
page-size so all objects places in these sections are guaranteed to have
their own page.
[ tglx: Amended changelog ]
Signed-off-by: Joerg Roedel <jroedel(a)suse.de>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Reviewed-by: Kees Cook <keescook(a)chromium.org>
Cc: stable(a)vger.kernel.org
Link: https://lkml.kernel.org/r/20200721093448.10417-1-joro@8bytes.org
---
arch/x86/kernel/vmlinux.lds.S | 1 +
include/asm-generic/vmlinux.lds.h | 5 ++++-
2 files changed, 5 insertions(+), 1 deletion(-)
diff --git a/arch/x86/kernel/vmlinux.lds.S b/arch/x86/kernel/vmlinux.lds.S
index 3bfc8dd..9a03e5b 100644
--- a/arch/x86/kernel/vmlinux.lds.S
+++ b/arch/x86/kernel/vmlinux.lds.S
@@ -358,6 +358,7 @@ SECTIONS
.bss : AT(ADDR(.bss) - LOAD_OFFSET) {
__bss_start = .;
*(.bss..page_aligned)
+ . = ALIGN(PAGE_SIZE);
*(BSS_MAIN)
BSS_DECRYPTED
. = ALIGN(PAGE_SIZE);
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index db600ef..052e0f0 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -341,7 +341,8 @@
#define PAGE_ALIGNED_DATA(page_align) \
. = ALIGN(page_align); \
- *(.data..page_aligned)
+ *(.data..page_aligned) \
+ . = ALIGN(page_align);
#define READ_MOSTLY_DATA(align) \
. = ALIGN(align); \
@@ -737,7 +738,9 @@
. = ALIGN(bss_align); \
.bss : AT(ADDR(.bss) - LOAD_OFFSET) { \
BSS_FIRST_SECTIONS \
+ . = ALIGN(PAGE_SIZE); \
*(.bss..page_aligned) \
+ . = ALIGN(PAGE_SIZE); \
*(.dynbss) \
*(BSS_MAIN) \
*(COMMON) \
When CROSS_COMPILE is set (e.g. aarch64-linux-gnu-), if
$(CROSS_COMPILE)elfedit is found at /usr/bin/aarch64-linux-gnu-elfedit,
GCC_TOOLCHAIN_DIR will be set to /usr/bin/. --prefix= will be set to
/usr/bin/ and Clang as of 11 will search for both
$(prefix)aarch64-linux-gnu-$needle and $(prefix)$needle.
GCC searchs for $(prefix)aarch64-linux-gnu/$version/$needle,
$(prefix)aarch64-linux-gnu/$needle and $(prefix)$needle. In practice,
$(prefix)aarch64-linux-gnu/$needle rarely contains executables.
To better model how GCC's -B/--prefix takes in effect in practice, newer
Clang (since
https://github.com/llvm/llvm-project/commit/3452a0d8c17f7166f479706b293caf6…)
only searches for $(prefix)$needle. Currently it will find /usr/bin/as
instead of /usr/bin/aarch64-linux-gnu-as.
Set --prefix= to $(GCC_TOOLCHAIN_DIR)$(CROSS_COMPILE)
(/usr/bin/aarch64-linux-gnu-) so that newer Clang can find the
appropriate cross compiling GNU as (when -no-integrated-as is in
effect).
Cc: stable(a)vger.kernel.org
Reported-by: Nathan Chancellor <natechancellor(a)gmail.com>
Signed-off-by: Fangrui Song <maskray(a)google.com>
Reviewed-by: Nathan Chancellor <natechancellor(a)gmail.com>
Tested-by: Nathan Chancellor <natechancellor(a)gmail.com>
Tested-by: Nick Desaulniers <ndesaulniers(a)google.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/1099
---
Changes in v2:
* Updated description to add tags and the llvm-project commit link.
* Fixed a typo.
Changes in v3:
* Add Cc: stable(a)vger.kernel.org
---
Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Makefile b/Makefile
index 0b5f8538bde5..3ac83e375b61 100644
--- a/Makefile
+++ b/Makefile
@@ -567,7 +567,7 @@ ifneq ($(shell $(CC) --version 2>&1 | head -n 1 | grep clang),)
ifneq ($(CROSS_COMPILE),)
CLANG_FLAGS += --target=$(notdir $(CROSS_COMPILE:%-=%))
GCC_TOOLCHAIN_DIR := $(dir $(shell which $(CROSS_COMPILE)elfedit))
-CLANG_FLAGS += --prefix=$(GCC_TOOLCHAIN_DIR)
+CLANG_FLAGS += --prefix=$(GCC_TOOLCHAIN_DIR)$(CROSS_COMPILE)
GCC_TOOLCHAIN := $(realpath $(GCC_TOOLCHAIN_DIR)/..)
endif
ifneq ($(GCC_TOOLCHAIN),)
--
2.28.0.rc0.105.gf9edc3c819-goog
PASID and PRI capabilities are only enumerated in PF devices. VF devices
do not enumerate these capabilites. IOMMU drivers also need to enumerate
them before enabling features in the IOMMU. Extending the same support as
PASID feature discovery (pci_pasid_features) for PRI.
Signed-off-by: Ashok Raj <ashok.raj(a)intel.com>
v2: Fixed build failure from lkp when CONFIG_PRI=n
Almost all the PRI functions were called only when CONFIG_PASID is
set. Except the new pci_pri_supported().
previous version sent in error because i missed dry-run before recompile
:-(
To: Bjorn Helgaas <bhelgaas(a)google.com>
To: Joerg Roedel <joro(a)8bytes.com>
To: Lu Baolu <baolu.lu(a)intel.com>
Cc: stable(a)vger.kernel.org
Cc: linux-pci(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Cc: Ashok Raj <ashok.raj(a)intel.com>
Cc: iommu(a)lists.linux-foundation.org
---
drivers/iommu/intel/iommu.c | 2 +-
drivers/pci/ats.c | 13 +++++++++++++
include/linux/pci-ats.h | 4 ++++
3 files changed, 18 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index d759e7234e98..276452f5e6a7 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -2560,7 +2560,7 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
}
if (info->ats_supported && ecap_prs(iommu->ecap) &&
- pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_PRI))
+ pci_pri_supported(pdev))
info->pri_supported = 1;
}
}
diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
index b761c1f72f67..2e6cf0c700f7 100644
--- a/drivers/pci/ats.c
+++ b/drivers/pci/ats.c
@@ -325,6 +325,19 @@ int pci_prg_resp_pasid_required(struct pci_dev *pdev)
return pdev->pasid_required;
}
+
+/**
+ * pci_pri_supported - Check if PRI is supported.
+ * @pdev: PCI device structure
+ *
+ * Returns true if PRI capability is present, false otherwise.
+ */
+bool pci_pri_supported(struct pci_dev *pdev)
+{
+ /* VFs share the PF PRI configuration */
+ return !!(pci_physfn(pdev)->pri_cap);
+}
+EXPORT_SYMBOL_GPL(pci_pri_supported);
#endif /* CONFIG_PCI_PRI */
#ifdef CONFIG_PCI_PASID
diff --git a/include/linux/pci-ats.h b/include/linux/pci-ats.h
index f75c307f346d..df54cd5b15db 100644
--- a/include/linux/pci-ats.h
+++ b/include/linux/pci-ats.h
@@ -28,6 +28,10 @@ int pci_enable_pri(struct pci_dev *pdev, u32 reqs);
void pci_disable_pri(struct pci_dev *pdev);
int pci_reset_pri(struct pci_dev *pdev);
int pci_prg_resp_pasid_required(struct pci_dev *pdev);
+bool pci_pri_supported(struct pci_dev *pdev);
+#else
+static inline bool pci_pri_supported(struct pci_dev *pdev)
+{ return false; }
#endif /* CONFIG_PCI_PRI */
#ifdef CONFIG_PCI_PASID
--
2.7.4
The patch titled
Subject: mm: fix kthread_use_mm() vs TLB invalidate
has been added to the -mm tree. Its filename is
mm-fix-kthread_use_mm-vs-tlb-invalidate.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/mm-fix-kthread_use_mm-vs-tlb-inval…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/mm-fix-kthread_use_mm-vs-tlb-inval…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Peter Zijlstra <peterz(a)infradead.org>
Subject: mm: fix kthread_use_mm() vs TLB invalidate
For SMP systems using IPI based TLB invalidation, looking at
current->active_mm is entirely reasonable. This then presents the
following race condition:
CPU0 CPU1
flush_tlb_mm(mm) use_mm(mm)
<send-IPI>
tsk->active_mm = mm;
<IPI>
if (tsk->active_mm == mm)
// flush TLBs
</IPI>
switch_mm(old_mm,mm,tsk);
Where it is possible the IPI flushed the TLBs for @old_mm, not @mm,
because the IPI lands before we actually switched.
Avoid this by disabling IRQs across changing ->active_mm and switch_mm().
[ There are all sorts of reasons this might be harmless for various
architecture specific reasons, but best not leave the door open at all. ]
Link: http://lkml.kernel.org/r/20200721154106.GE10769@hirez.programming.kicks-ass…
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Reported-by: Andy Lutomirski <luto(a)amacapital.net>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: Kees Cook <keescook(a)chromium.org>
Cc: Jann Horn <jannh(a)google.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Nicholas Piggin <npiggin(a)gmail.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers(a)efficios.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/kthread.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
--- a/kernel/kthread.c~mm-fix-kthread_use_mm-vs-tlb-invalidate
+++ a/kernel/kthread.c
@@ -1239,13 +1239,15 @@ void kthread_use_mm(struct mm_struct *mm
WARN_ON_ONCE(tsk->mm);
task_lock(tsk);
+ local_irq_disable();
active_mm = tsk->active_mm;
if (active_mm != mm) {
mmgrab(mm);
tsk->active_mm = mm;
}
tsk->mm = mm;
- switch_mm(active_mm, mm, tsk);
+ switch_mm_irqs_off(active_mm, mm, tsk);
+ local_irq_enable();
task_unlock(tsk);
#ifdef finish_arch_post_lock_switch
finish_arch_post_lock_switch();
@@ -1274,9 +1276,11 @@ void kthread_unuse_mm(struct mm_struct *
task_lock(tsk);
sync_mm_rss(mm);
+ local_irq_disable();
tsk->mm = NULL;
/* active_mm is still 'mm' */
enter_lazy_tlb(mm, tsk);
+ local_irq_enable();
task_unlock(tsk);
}
EXPORT_SYMBOL_GPL(kthread_unuse_mm);
_
Patches currently in -mm which might be from peterz(a)infradead.org are
mm-fix-kthread_use_mm-vs-tlb-invalidate.patch
The patch titled
Subject: io-mapping: indicate mapping failure
has been added to the -mm tree. Its filename is
io-mapping-indicate-mapping-failure.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/io-mapping-indicate-mapping-failur…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/io-mapping-indicate-mapping-failur…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: "Michael J. Ruhl" <michael.j.ruhl(a)intel.com>
Subject: io-mapping: indicate mapping failure
The !ATOMIC_IOMAP version of io_maping_init_wc will always return success,
even when the ioremap fails.
Since the ATOMIC_IOMAP version returns NULL when the init fails, and
callers check for a NULL return on error this is unexpected.
During a device probe, where the ioremap failed, a crash can look
like this:
BUG: unable to handle page fault for address: 0000000000210000
#PF: supervisor write access in kernel mode
#PF: error_code(0x0002) - not-present page
Oops: 0002 [#1] PREEMPT SMP
CPU: 0 PID: 177 Comm:
RIP: 0010:fill_page_dma [i915]
gen8_ppgtt_create [i915]
i915_ppgtt_create [i915]
intel_gt_init [i915]
i915_gem_init [i915]
i915_driver_probe [i915]
pci_device_probe
really_probe
driver_probe_device
The remap failure occurred much earlier in the probe. If it had
been propagated, the driver would have exited with an error.
Return NULL on ioremap failure.
Link: http://lkml.kernel.org/r/20200721171936.81563-1-michael.j.ruhl@intel.com
Fixes: cafaf14a5d8f ("io-mapping: Always create a struct to hold metadata about the io-mapping")
Signed-off-by: Michael J. Ruhl <michael.j.ruhl(a)intel.com>
Reviewed-by: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Mike Rapoport <rppt(a)linux.ibm.com>
Cc: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/io-mapping.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/include/linux/io-mapping.h~io-mapping-indicate-mapping-failure
+++ a/include/linux/io-mapping.h
@@ -118,7 +118,7 @@ io_mapping_init_wc(struct io_mapping *io
iomap->prot = pgprot_noncached(PAGE_KERNEL);
#endif
- return iomap;
+ return iomap->iomem ? iomap : NULL;
}
static inline void
_
Patches currently in -mm which might be from michael.j.ruhl(a)intel.com are
io-mapping-indicate-mapping-failure.patch
The patch titled
Subject: fork: silence a false postive warning in __mmdrop
has been added to the -mm tree. Its filename is
fork-silence-a-false-postive-warning-in-__mmdrop.patch
This patch should soon appear at
http://ozlabs.org/~akpm/mmots/broken-out/fork-silence-a-false-postive-warni…
and later at
http://ozlabs.org/~akpm/mmotm/broken-out/fork-silence-a-false-postive-warni…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Qian Cai <cai(a)lca.pw>
Subject: fork: silence a false postive warning in __mmdrop
commit bf2c59fce407 ("sched/core: Fix illegal RCU from offline CPUs")
delayed,
idle->active_mm = &init_mm;
into finish_cpu() instead of idle_task_exit() which results in a false
positive warning that was originally designed in the commit 3eda69c92d47
("kernel/fork.c: detect early free of a live mm").
WARNING: CPU: 127 PID: 72976 at kernel/fork.c:697
__mmdrop+0x230/0x2c0
do_exit+0x424/0xfa0
Call Trace:
do_exit+0x424/0xfa0
do_group_exit+0x64/0xd0
sys_exit_group+0x24/0x30
system_call_exception+0x108/0x1d0
system_call_common+0xf0/0x278
Link: http://lkml.kernel.org/r/20200604150344.1796-1-cai@lca.pw
Fixes: bf2c59fce407 ("sched/core: Fix illegal RCU from offline CPUs")
Signed-off-by: Qian Cai <cai(a)lca.pw>
Cc: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Cc: Michael Ellerman <mpe(a)ellerman.id.au> (powerpc)
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/fork.c | 1 -
1 file changed, 1 deletion(-)
--- a/kernel/fork.c~fork-silence-a-false-postive-warning-in-__mmdrop
+++ a/kernel/fork.c
@@ -694,7 +694,6 @@ void __mmdrop(struct mm_struct *mm)
{
BUG_ON(mm == &init_mm);
WARN_ON_ONCE(mm == current->mm);
- WARN_ON_ONCE(mm == current->active_mm);
mm_free_pgd(mm);
destroy_context(mm);
mmu_notifier_subscriptions_destroy(mm);
_
Patches currently in -mm which might be from cai(a)lca.pw are
fork-silence-a-false-postive-warning-in-__mmdrop.patch
mm-page_alloc-silence-a-kasan-false-positive.patch
mm-kmemleak-silence-kcsan-splats-in-checksum.patch
mm-frontswap-mark-various-intentional-data-races.patch
mm-page_io-mark-various-intentional-data-races.patch
mm-page_io-mark-various-intentional-data-races-v2.patch
mm-swap_state-mark-various-intentional-data-races.patch
mm-swapfile-fix-and-annotate-various-data-races.patch
mm-swapfile-fix-and-annotate-various-data-races-v2.patch
mm-page_counter-fix-various-data-races-at-memsw.patch
mm-memcontrol-fix-a-data-race-in-scan-count.patch
mm-list_lru-fix-a-data-race-in-list_lru_count_one.patch
mm-mempool-fix-a-data-race-in-mempool_free.patch
mm-rmap-annotate-a-data-race-at-tlb_flush_batched.patch
mm-swap-annotate-data-races-for-lru_rotate_pvecs.patch
mm-annotate-a-data-race-in-page_zonenum.patch
When CROSS_COMPILE is set (e.g. aarch64-linux-gnu-), if
$(CROSS_COMPILE)elfedit is found at /usr/bin/aarch64-linux-gnu-elfedit,
GCC_TOOLCHAIN_DIR will be set to /usr/bin/. --prefix= will be set to
/usr/bin/ and Clang as of 11 will search for both
$(prefix)aarch64-linux-gnu-$needle and $(prefix)$needle.
GCC searchs for $(prefix)aarch64-linux-gnu/$version/$needle,
$(prefix)aarch64-linux-gnu/$needle and $(prefix)$needle. In practice,
$(prefix)aarch64-linux-gnu/$needle rarely contains executables.
To better model how GCC's -B/--prefix takes in effect in practice, newer
Clang (since
https://github.com/llvm/llvm-project/commit/3452a0d8c17f7166f479706b293caf6…)
only searches for $(prefix)$needle. Currently it will find /usr/bin/as
instead of /usr/bin/aarch64-linux-gnu-as.
Set --prefix= to $(GCC_TOOLCHAIN_DIR)$(CROSS_COMPILE)
(/usr/bin/aarch64-linux-gnu-) so that newer Clang can find the
appropriate cross compiling GNU as (when -no-integrated-as is in
effect).
Reported-by: Nathan Chancellor <natechancellor(a)gmail.com>
Signed-off-by: Fangrui Song <maskray(a)google.com>
Reviewed-by: Nathan Chancellor <natechancellor(a)gmail.com>
Tested-by: Nathan Chancellor <natechancellor(a)gmail.com>
Tested-by: Nick Desaulniers <ndesaulniers(a)google.com>
Link: https://github.com/ClangBuiltLinux/linux/issues/1099
---
Changes in v2:
* Updated description to add tags and the llvm-project commit link.
* Fixed a typo.
---
Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Makefile b/Makefile
index 0b5f8538bde5..3ac83e375b61 100644
--- a/Makefile
+++ b/Makefile
@@ -567,7 +567,7 @@ ifneq ($(shell $(CC) --version 2>&1 | head -n 1 | grep clang),)
ifneq ($(CROSS_COMPILE),)
CLANG_FLAGS += --target=$(notdir $(CROSS_COMPILE:%-=%))
GCC_TOOLCHAIN_DIR := $(dir $(shell which $(CROSS_COMPILE)elfedit))
-CLANG_FLAGS += --prefix=$(GCC_TOOLCHAIN_DIR)
+CLANG_FLAGS += --prefix=$(GCC_TOOLCHAIN_DIR)$(CROSS_COMPILE)
GCC_TOOLCHAIN := $(realpath $(GCC_TOOLCHAIN_DIR)/..)
endif
ifneq ($(GCC_TOOLCHAIN),)
--
2.28.0.rc0.105.gf9edc3c819-goog
PASID and PRI capabilities are only enumerated in PF devices. VF devices
do not enumerate these capabilites. IOMMU drivers also need to enumerate
them before enabling features in the IOMMU. Extending the same support as
PASID feature discovery (pci_pasid_features) for PRI.
Signed-off-by: Ashok Raj <ashok.raj(a)intel.com>
v2: Fixed build failure from lkp when CONFIG_PRI=n
Almost all the PRI functions were called only when CONFIG_PASID is
set. Except the new pci_pri_supported().
To: Bjorn Helgaas <bhelgaas(a)google.com>
To: Joerg Roedel <joro(a)8bytes.com>
To: Lu Baolu <baolu.lu(a)intel.com>
Cc: stable(a)vger.kernel.org
Cc: linux-pci(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Cc: Ashok Raj <ashok.raj(a)intel.com>
Cc: iommu(a)lists.linux-foundation.org
---
drivers/iommu/intel/iommu.c | 2 +-
drivers/pci/ats.c | 14 ++++++++++++++
include/linux/pci-ats.h | 4 ++++
3 files changed, 19 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index d759e7234e98..276452f5e6a7 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -2560,7 +2560,7 @@ static struct dmar_domain *dmar_insert_one_dev_info(struct intel_iommu *iommu,
}
if (info->ats_supported && ecap_prs(iommu->ecap) &&
- pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_PRI))
+ pci_pri_supported(pdev))
info->pri_supported = 1;
}
}
diff --git a/drivers/pci/ats.c b/drivers/pci/ats.c
index b761c1f72f67..ffb4de8c5a77 100644
--- a/drivers/pci/ats.c
+++ b/drivers/pci/ats.c
@@ -461,6 +461,20 @@ int pci_pasid_features(struct pci_dev *pdev)
}
EXPORT_SYMBOL_GPL(pci_pasid_features);
+/**
+ * pci_pri_supported - Check if PRI is supported.
+ * @pdev: PCI device structure
+ *
+ * Returns false when no PRI capability is present.
+ * Returns true if PRI feature is supported and enabled
+ */
+bool pci_pri_supported(struct pci_dev *pdev)
+{
+ /* VFs share the PF PRI configuration */
+ return !!(pci_physfn(pdev)->pri_cap);
+}
+EXPORT_SYMBOL_GPL(pci_pri_supported);
+
#define PASID_NUMBER_SHIFT 8
#define PASID_NUMBER_MASK (0x1f << PASID_NUMBER_SHIFT)
/**
diff --git a/include/linux/pci-ats.h b/include/linux/pci-ats.h
index f75c307f346d..fc989295daf3 100644
--- a/include/linux/pci-ats.h
+++ b/include/linux/pci-ats.h
@@ -28,6 +28,10 @@ int pci_enable_pri(struct pci_dev *pdev, u32 reqs);
void pci_disable_pri(struct pci_dev *pdev);
int pci_reset_pri(struct pci_dev *pdev);
int pci_prg_resp_pasid_required(struct pci_dev *pdev);
+bool pci_pri_supported(struct pci_dev *pdev);
+#else
+bool pci_pri_supported(struct pci_dev *pdev)
+{ return false; }
#endif /* CONFIG_PCI_PRI */
#ifdef CONFIG_PCI_PASID
--
2.7.4
The IMA_APPRAISE_BOOTPARAM config allows enabling different "ima_appraise="
modes - log, fix, enforce - at run time, but not when IMA architecture
specific policies are enabled. This prevents properly labeling the
filesystem on systems where secure boot is supported, but not enabled on the
platform. Only when secure boot is actually enabled should these IMA
appraise modes be disabled.
This patch removes the compile time dependency and makes it a runtime
decision, based on the secure boot state of that platform.
Test results as follows:
-> x86-64 with secure boot enabled
[ 0.015637] Kernel command line: <...> ima_policy=appraise_tcb ima_appraise=fix
[ 0.015668] ima: Secure boot enabled: ignoring ima_appraise=fix boot parameter option
-> powerpc with secure boot disabled
[ 0.000000] Kernel command line: <...> ima_policy=appraise_tcb ima_appraise=fix
[ 0.000000] Secure boot mode disabled
-> Running the system without secure boot and with both options set:
CONFIG_IMA_APPRAISE_BOOTPARAM=y
CONFIG_IMA_ARCH_POLICY=y
Audit prompts "missing-hash" but still allow execution and, consequently,
filesystem labeling:
type=INTEGRITY_DATA msg=audit(07/09/2020 12:30:27.778:1691) : pid=4976
uid=root auid=root ses=2
subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 op=appraise_data
cause=missing-hash comm=bash name=/usr/bin/evmctl dev="dm-0" ino=493150
res=no
Cc: stable(a)vger.kernel.org
Fixes: d958083a8f64 ("x86/ima: define arch_get_ima_policy() for x86")
Signed-off-by: Bruno Meneguele <bmeneg(a)redhat.com>
---
v6:
- explictly print the bootparam being ignored to the user (Mimi)
v5:
- add pr_info() to inform user the ima_appraise= boot param is being
ignored due to secure boot enabled (Nayna)
- add some testing results to commit log
v4:
- instead of change arch_policy loading code, check secure boot state at
"ima_appraise=" parameter handler (Mimi)
v3:
- extend secure boot arch checker to also consider trusted boot
- enforce IMA appraisal when secure boot is effectively enabled (Nayna)
- fix ima_appraise flag assignment by or'ing it (Mimi)
v2:
- pr_info() message prefix correction
security/integrity/ima/Kconfig | 2 +-
security/integrity/ima/ima_appraise.c | 6 ++++++
2 files changed, 7 insertions(+), 1 deletion(-)
diff --git a/security/integrity/ima/Kconfig b/security/integrity/ima/Kconfig
index edde88dbe576..62dc11a5af01 100644
--- a/security/integrity/ima/Kconfig
+++ b/security/integrity/ima/Kconfig
@@ -232,7 +232,7 @@ config IMA_APPRAISE_REQUIRE_POLICY_SIGS
config IMA_APPRAISE_BOOTPARAM
bool "ima_appraise boot parameter"
- depends on IMA_APPRAISE && !IMA_ARCH_POLICY
+ depends on IMA_APPRAISE
default y
help
This option enables the different "ima_appraise=" modes
diff --git a/security/integrity/ima/ima_appraise.c b/security/integrity/ima/ima_appraise.c
index a9649b04b9f1..28a59508c6bd 100644
--- a/security/integrity/ima/ima_appraise.c
+++ b/security/integrity/ima/ima_appraise.c
@@ -19,6 +19,12 @@
static int __init default_appraise_setup(char *str)
{
#ifdef CONFIG_IMA_APPRAISE_BOOTPARAM
+ if (arch_ima_get_secureboot()) {
+ pr_info("Secure boot enabled: ignoring ima_appraise=%s boot parameter option",
+ str);
+ return 1;
+ }
+
if (strncmp(str, "off", 3) == 0)
ima_appraise = 0;
else if (strncmp(str, "log", 3) == 0)
--
2.26.2
This is a note to let you know that I've just added the patch titled
serial: 8250_mtk: Fix high-speed baud rates clamping
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 551e553f0d4ab623e2a6f424ab5834f9c7b5229c Mon Sep 17 00:00:00 2001
From: Serge Semin <Sergey.Semin(a)baikalelectronics.ru>
Date: Tue, 14 Jul 2020 15:41:12 +0300
Subject: serial: 8250_mtk: Fix high-speed baud rates clamping
Commit 7b668c064ec3 ("serial: 8250: Fix max baud limit in generic 8250
port") fixed limits of a baud rate setting for a generic 8250 port.
In other words since that commit the baud rate has been permitted to be
within [uartclk / 16 / UART_DIV_MAX; uartclk / 16], which is absolutely
normal for a standard 8250 UART port. But there are custom 8250 ports,
which provide extended baud rate limits. In particular the Mediatek 8250
port can work with baud rates up to "uartclk" speed.
Normally that and any other peculiarity is supposed to be handled in a
custom set_termios() callback implemented in the vendor-specific
8250-port glue-driver. Currently that is how it's done for the most of
the vendor-specific 8250 ports, but for some reason for Mediatek a
solution has been spread out to both the glue-driver and to the generic
8250-port code. Due to that a bug has been introduced, which permitted the
extended baud rate limit for all even for standard 8250-ports. The bug
has been fixed by the commit 7b668c064ec3 ("serial: 8250: Fix max baud
limit in generic 8250 port") by narrowing the baud rates limit back down to
the normal bounds. Unfortunately by doing so we also broke the
Mediatek-specific extended bauds feature.
A fix of the problem described above is twofold. First since we can't get
back the extended baud rate limits feature to the generic set_termios()
function and that method supports only a standard baud rates range, the
requested baud rate must be locally stored before calling it and then
restored back to the new termios structure after the generic set_termios()
finished its magic business. By doing so we still use the
serial8250_do_set_termios() method to set the LCR/MCR/FCR/etc. registers,
while the extended baud rate setting procedure will be performed later in
the custom Mediatek-specific set_termios() callback. Second since a true
baud rate is now fully calculated in the custom set_termios() method we
need to locally update the port timeout by calling the
uart_update_timeout() function. After the fixes described above are
implemented in the 8250_mtk.c driver, the Mediatek 8250-port should
get back to normally working with extended baud rates.
Link: https://lore.kernel.org/linux-serial/20200701211337.3027448-1-danielwinkler…
Fixes: 7b668c064ec3 ("serial: 8250: Fix max baud limit in generic 8250 port")
Reported-by: Daniel Winkler <danielwinkler(a)google.com>
Signed-off-by: Serge Semin <Sergey.Semin(a)baikalelectronics.ru>
Cc: stable <stable(a)vger.kernel.org>
Tested-by: Claire Chang <tientzu(a)chromium.org>
Link: https://lore.kernel.org/r/20200714124113.20918-1-Sergey.Semin@baikalelectro…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/8250/8250_mtk.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)
diff --git a/drivers/tty/serial/8250/8250_mtk.c b/drivers/tty/serial/8250/8250_mtk.c
index f839380c2f4c..98b8a3e30733 100644
--- a/drivers/tty/serial/8250/8250_mtk.c
+++ b/drivers/tty/serial/8250/8250_mtk.c
@@ -306,8 +306,21 @@ mtk8250_set_termios(struct uart_port *port, struct ktermios *termios,
}
#endif
+ /*
+ * Store the requested baud rate before calling the generic 8250
+ * set_termios method. Standard 8250 port expects bauds to be
+ * no higher than (uartclk / 16) so the baud will be clamped if it
+ * gets out of that bound. Mediatek 8250 port supports speed
+ * higher than that, therefore we'll get original baud rate back
+ * after calling the generic set_termios method and recalculate
+ * the speed later in this method.
+ */
+ baud = tty_termios_baud_rate(termios);
+
serial8250_do_set_termios(port, termios, old);
+ tty_termios_encode_baud_rate(termios, baud, baud);
+
/*
* Mediatek UARTs use an extra highspeed register (MTK_UART_HIGHS)
*
@@ -339,6 +352,11 @@ mtk8250_set_termios(struct uart_port *port, struct ktermios *termios,
*/
spin_lock_irqsave(&port->lock, flags);
+ /*
+ * Update the per-port timeout.
+ */
+ uart_update_timeout(port, termios->c_cflag, baud);
+
/* set DLAB we have cval saved in up->lcr from the call to the core */
serial_port_out(port, UART_LCR, up->lcr | UART_LCR_DLAB);
serial_dl_write(up, quot);
--
2.27.0
This is a note to let you know that I've just added the patch titled
serial: tegra: fix CREAD handling for PIO
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From b374c562ee7ab3f3a1daf959c01868bae761571c Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan(a)kernel.org>
Date: Fri, 10 Jul 2020 15:59:46 +0200
Subject: serial: tegra: fix CREAD handling for PIO
Commit 33ae787b74fc ("serial: tegra: add support to ignore read") added
support for dropping input in case CREAD isn't set, but for PIO the
ignore_status_mask wasn't checked until after the character had been
put in the receive buffer.
Note that the NULL tty-port test is bogus and will be removed by a
follow-on patch.
Fixes: 33ae787b74fc ("serial: tegra: add support to ignore read")
Cc: stable <stable(a)vger.kernel.org> # 5.4
Cc: Shardar Shariff Md <smohammed(a)nvidia.com>
Cc: Krishna Yarlagadda <kyarlagadda(a)nvidia.com>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
Acked-by: Thierry Reding <treding(a)nvidia.com>
Link: https://lore.kernel.org/r/20200710135947.2737-2-johan@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/serial-tegra.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/tty/serial/serial-tegra.c b/drivers/tty/serial/serial-tegra.c
index 8de8bac9c6c7..b3bbee6b6702 100644
--- a/drivers/tty/serial/serial-tegra.c
+++ b/drivers/tty/serial/serial-tegra.c
@@ -653,11 +653,14 @@ static void tegra_uart_handle_rx_pio(struct tegra_uart_port *tup,
ch = (unsigned char) tegra_uart_read(tup, UART_RX);
tup->uport.icount.rx++;
- if (!uart_handle_sysrq_char(&tup->uport, ch) && tty)
- tty_insert_flip_char(tty, ch, flag);
+ if (uart_handle_sysrq_char(&tup->uport, ch))
+ continue;
if (tup->uport.ignore_status_mask & UART_LSR_DR)
continue;
+
+ if (tty)
+ tty_insert_flip_char(tty, ch, flag);
} while (1);
}
--
2.27.0
This is the start of the stable review cycle for the 4.4.231 release.
There are 58 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Wed, 22 Jul 2020 15:27:31 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.4.231-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.4.231-rc1
Vincent Guittot <vincent.guittot(a)linaro.org>
sched/fair: handle case of task_h_load() returning 0
Michał Mirosław <mirq-linux(a)rere.qmqm.pl>
misc: atmel-ssc: lock with mutex instead of spinlock
Krzysztof Kozlowski <krzk(a)kernel.org>
dmaengine: fsl-edma: Fix NULL pointer exception in fsl_edma_tx_handler
Vishwas M <vishwas.reddy.vr(a)gmail.com>
hwmon: (emc2103) fix unable to change fan pwm1_enable attribute
Huacai Chen <chenhc(a)lemote.com>
MIPS: Fix build for LTS kernel caused by backporting lpj adjustment
Esben Haabendal <esben(a)geanix.com>
uio_pdrv_genirq: fix use without device tree and no interrupt
David Pedersen <limero1337(a)gmail.com>
Input: i8042 - add Lenovo XiaoXin Air 12 to i8042 nomux list
Alexander Usyskin <alexander.usyskin(a)intel.com>
mei: bus: don't clean driver pointer
Chirantan Ekbote <chirantan(a)chromium.org>
fuse: Fix parameter for FS_IOC_{GET,SET}FLAGS
Alexander Lobakin <alobakin(a)pm.me>
virtio: virtio_console: add missing MODULE_DEVICE_TABLE() for rproc serial
AceLan Kao <acelan.kao(a)canonical.com>
USB: serial: option: add Quectel EG95 LTE modem
Jörgen Storvist <jorgen.storvist(a)gmail.com>
USB: serial: option: add GosunCn GM500 series
Igor Moura <imphilippini(a)gmail.com>
USB: serial: ch341: add new Product ID for CH340
James Hilliard <james.hilliard1(a)gmail.com>
USB: serial: cypress_m8: enable Simply Automated UPB PIM
Johan Hovold <johan(a)kernel.org>
USB: serial: iuu_phoenix: fix memory corruption
Zhang Qiang <qiang.zhang(a)windriver.com>
usb: gadget: function: fix missing spinlock in f_uac1_legacy
Peter Chen <peter.chen(a)nxp.com>
usb: chipidea: core: add wakeup support for extcon
Tom Rix <trix(a)redhat.com>
USB: c67x00: fix use after free in c67x00_giveback_urb
Takashi Iwai <tiwai(a)suse.de>
ALSA: usb-audio: Fix race against the error recovery URB submission
Takashi Iwai <tiwai(a)suse.de>
ALSA: line6: Perform sanity check for each URB creation
Takashi Iwai <tiwai(a)suse.de>
usb: core: Add a helper function to check the validity of EP type in URB
Dmitry Torokhov <dmitry.torokhov(a)gmail.com>
HID: magicmouse: do not set up autorepeat
Álvaro Fernández Rojas <noltari(a)gmail.com>
mtd: rawnand: brcmnand: fix CS0 layout
Jin Yao <yao.jin(a)linux.intel.com>
perf stat: Zero all the 'ena' and 'run' array slot stats for interval mode
Dan Carpenter <dan.carpenter(a)oracle.com>
staging: comedi: verify array index is correct before using it
Michał Mirosław <mirq-linux(a)rere.qmqm.pl>
usb: gadget: udc: atmel: fix uninitialized read in debug printk
Sasha Levin <sashal(a)kernel.org>
Revert "usb/ohci-platform: Fix a warning when hibernating"
Sasha Levin <sashal(a)kernel.org>
Revert "usb/xhci-plat: Set PM runtime as active on resume"
Sasha Levin <sashal(a)kernel.org>
Revert "usb/ehci-platform: Set PM runtime as active on resume"
Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
i2c: eg20t: Load module automatically if ID matches
Eric Dumazet <edumazet(a)google.com>
tcp: md5: allow changing MD5 keys in all socket states
Eric Dumazet <edumazet(a)google.com>
tcp: md5: refine tcp_md5_do_add()/tcp_md5_hash_key() barriers
Eric Dumazet <edumazet(a)google.com>
tcp: md5: add missing memory barriers in tcp_md5_do_add()/tcp_md5_hash_key()
Christoph Paasch <cpaasch(a)apple.com>
tcp: make sure listeners don't initialize congestion-control state
Sean Tranchetti <stranche(a)codeaurora.org>
genetlink: remove genl_bind
Martin Varghese <martin.varghese(a)nokia.com>
net: Added pointer check for dst->ops->neigh_lookup in dst_neigh_lookup_skb
Eric Dumazet <edumazet(a)google.com>
llc: make sure applications use ARPHRD_ETHER
Xin Long <lucien.xin(a)gmail.com>
l2tp: remove skb_dst_set() from l2tp_xmit_skb()
Sabrina Dubroca <sd(a)queasysnail.net>
ipv4: fill fl4_icmp_{type,code} in ping_v4_sendmsg
Davide Caratti <dcaratti(a)redhat.com>
bnxt_en: fix NULL dereference in case SR-IOV configuration fails
Vineet Gupta <vgupta(a)synopsys.com>
ARC: elf: use right ELF_ARCH
Vineet Gupta <vgupta(a)synopsys.com>
ARC: entry: fix potential EFA clobber when TIF_SYSCALL_TRACE
Tom Rix <trix(a)redhat.com>
drm/radeon: fix double free
Boris Burkov <boris(a)bur.io>
btrfs: fix fatal extent_buffer readahead vs releasepage race
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Revert "ath9k: Fix general protection fault in ath9k_hif_usb_rx_cb"
Paolo Bonzini <pbonzini(a)redhat.com>
KVM: x86: bit 8 of non-leaf PDPEs is not reserved
Hector Martin <marcan(a)marcan.st>
ALSA: usb-audio: add quirk for MacroSilicon MS2109
Hui Wang <hui.wang(a)canonical.com>
ALSA: hda - let hs_mic be picked ahead of hp_mic
xidongwang <wangxidong_97(a)163.com>
ALSA: opl3: fix infoleak in opl3
Wei Li <liwei391(a)huawei.com>
arm64: kgdb: Fix single-step exception handling oops
Vinod Koul <vkoul(a)kernel.org>
ALSA: compress: fix partial_drain completion state
Andre Edich <andre.edich(a)microchip.com>
smsc95xx: avoid memory leak in smsc95xx_bind
Andre Edich <andre.edich(a)microchip.com>
smsc95xx: check return value of smsc95xx_reset
Li Heng <liheng40(a)huawei.com>
net: cxgb4: fix return error value in t4_prep_fw
Tomas Henzl <thenzl(a)redhat.com>
scsi: mptscsih: Fix read sense data size
Zhenzhong Duan <zhenzhong.duan(a)gmail.com>
spi: spidev: fix a potential use-after-free in spidev_release()
Zhenzhong Duan <zhenzhong.duan(a)gmail.com>
spi: spidev: fix a race between spidev_release and spidev_remove
Christian Borntraeger <borntraeger(a)de.ibm.com>
KVM: s390: reduce number of IO pins to 1
-------------
Diffstat:
Makefile | 4 +-
arch/arc/include/asm/elf.h | 2 +-
arch/arc/kernel/entry.S | 16 +++-----
arch/arm64/kernel/kgdb.c | 2 +-
arch/mips/kernel/time.c | 13 ++-----
arch/s390/include/asm/kvm_host.h | 8 ++--
arch/x86/kvm/mmu.c | 2 +-
drivers/char/virtio_console.c | 3 +-
drivers/dma/fsl-edma.c | 7 ++++
drivers/gpu/drm/radeon/ci_dpm.c | 7 ++--
drivers/hid/hid-magicmouse.c | 6 +++
drivers/hwmon/emc2103.c | 2 +-
drivers/i2c/busses/i2c-eg20t.c | 1 +
drivers/input/serio/i8042-x86ia64io.h | 7 ++++
drivers/message/fusion/mptscsih.c | 4 +-
drivers/misc/atmel-ssc.c | 24 ++++++------
drivers/misc/mei/bus.c | 3 +-
drivers/mtd/nand/brcmnand/brcmnand.c | 5 ++-
drivers/net/ethernet/broadcom/bnxt/bnxt_sriov.c | 2 +-
drivers/net/ethernet/chelsio/cxgb4/t4_hw.c | 8 ++--
drivers/net/usb/smsc95xx.c | 9 ++++-
drivers/net/wireless/ath/ath9k/hif_usb.c | 48 ++++++-----------------
drivers/net/wireless/ath/ath9k/hif_usb.h | 5 ---
drivers/spi/spidev.c | 24 ++++++------
drivers/staging/comedi/drivers/addi_apci_1500.c | 10 +++--
drivers/uio/uio_pdrv_genirq.c | 2 +-
drivers/usb/c67x00/c67x00-sched.c | 2 +-
drivers/usb/chipidea/core.c | 24 ++++++++++++
drivers/usb/core/urb.c | 30 ++++++++++++--
drivers/usb/gadget/function/f_uac1.c | 2 +
drivers/usb/gadget/udc/atmel_usba_udc.c | 2 +-
drivers/usb/host/ehci-platform.c | 5 ---
drivers/usb/host/ohci-platform.c | 5 ---
drivers/usb/host/xhci-plat.c | 11 +-----
drivers/usb/serial/ch341.c | 1 +
drivers/usb/serial/cypress_m8.c | 2 +
drivers/usb/serial/cypress_m8.h | 3 ++
drivers/usb/serial/iuu_phoenix.c | 8 ++--
drivers/usb/serial/option.c | 6 +++
fs/btrfs/extent_io.c | 40 +++++++++++--------
fs/fuse/file.c | 12 +++++-
include/linux/usb.h | 2 +
include/net/dst.h | 10 ++++-
include/net/genetlink.h | 8 ----
include/sound/compress_driver.h | 10 ++++-
kernel/sched/fair.c | 10 ++++-
net/ipv4/ping.c | 3 ++
net/ipv4/tcp.c | 13 ++++---
net/ipv4/tcp_cong.c | 2 +-
net/ipv4/tcp_ipv4.c | 15 +++++--
net/l2tp/l2tp_core.c | 5 +--
net/llc/af_llc.c | 10 +++--
net/netlink/genetlink.c | 52 -------------------------
sound/core/compress_offload.c | 4 ++
sound/drivers/opl3/opl3_synth.c | 2 +
sound/pci/hda/hda_auto_parser.c | 6 +++
sound/usb/line6/capture.c | 2 +
sound/usb/line6/playback.c | 2 +
sound/usb/midi.c | 17 +++++---
sound/usb/quirks-table.h | 52 +++++++++++++++++++++++++
tools/perf/util/stat.c | 6 ++-
61 files changed, 358 insertions(+), 250 deletions(-)
This is a note to let you know that I've just added the patch titled
tty: xilinx_uartps: Really fix id assignment
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 22a82fa7d6c3e16d56a036b1fa697a39b954adf0 Mon Sep 17 00:00:00 2001
From: Helmut Grohne <helmut.grohne(a)intenta.de>
Date: Mon, 13 Jul 2020 09:32:28 +0200
Subject: tty: xilinx_uartps: Really fix id assignment
The problems started with the revert (18cc7ac8a28e28). The
cdns_uart_console.index is statically assigned -1. When the port is
registered, Linux assigns consecutive numbers to it. It turned out that
when using ttyPS1 as console, the index is not updated as we are reusing
the same cdns_uart_console instance for multiple ports. When registering
ttyPS0, it gets updated from -1 to 0, but when registering ttyPS1, it
already is 0 and not updated.
That led to 2ae11c46d5fdc4. It assigns the index prior to registering
the uart_driver once. Unfortunately, that ended up breaking the
situation where the probe order does not match the id order. When using
the same device tree for both uboot and linux, it is important that the
serial0 alias points to the console. So some boards reverse those
aliases. This was reported by Jan Kiszka. The proposed fix was reverting
the index assignment and going back to the previous iteration.
However such a reversed assignement (serial0 -> uart1, serial1 -> uart0)
was already partially broken by the revert (18cc7ac8a28e28). While the
ttyPS device works, the kmsg connection is already broken and kernel
messages go missing. Reverting the id assignment does not fix this.
>From the xilinx_uartps driver pov (after reverting the refactoring
commits), there can be only one console. This manifests in static
variables console_pprt and cdns_uart_console. These variables are not
properly linked and can go out of sync. The cdns_uart_console.index is
important for uart_add_one_port. We call that function for each port -
one of which hopefully is the console. If it isn't, the CON_ENABLED flag
is not set and console_port is cleared. The next cdns_uart_probe call
then tries to register the next port using that same cdns_uart_console.
It is important that console_port and cdns_uart_console (and its index
in particular) stay in sync. The index assignment implemented by
Shubhrajyoti Datta is correct in principle. It just may have to happen a
second time if the first cdns_uart_probe call didn't encounter the
console device. And we shouldn't change the index once the console uart
is registered.
Reported-by: Shubhrajyoti Datta <shubhrajyoti.datta(a)xilinx.com>
Reported-by: Jan Kiszka <jan.kiszka(a)web.de>
Link: https://lore.kernel.org/linux-serial/f4092727-d8f5-5f91-2c9f-76643aace993@s…
Fixes: 18cc7ac8a28e28 ("Revert "serial: uartps: Register own uart console and driver structures"")
Fixes: 2ae11c46d5fdc4 ("tty: xilinx_uartps: Fix missing id assignment to the console")
Fixes: 76ed2e10579671 ("Revert "tty: xilinx_uartps: Fix missing id assignment to the console"")
Signed-off-by: Helmut Grohne <helmut.grohne(a)intenta.de>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20200713073227.GA3805@laureti-dev
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/xilinx_uartps.c | 8 ++++++--
1 file changed, 6 insertions(+), 2 deletions(-)
diff --git a/drivers/tty/serial/xilinx_uartps.c b/drivers/tty/serial/xilinx_uartps.c
index 672cfa075e28..2833f1418d6d 100644
--- a/drivers/tty/serial/xilinx_uartps.c
+++ b/drivers/tty/serial/xilinx_uartps.c
@@ -1580,8 +1580,10 @@ static int cdns_uart_probe(struct platform_device *pdev)
* If register_console() don't assign value, then console_port pointer
* is cleanup.
*/
- if (!console_port)
+ if (!console_port) {
+ cdns_uart_console.index = id;
console_port = port;
+ }
#endif
rc = uart_add_one_port(&cdns_uart_uart_driver, port);
@@ -1594,8 +1596,10 @@ static int cdns_uart_probe(struct platform_device *pdev)
#ifdef CONFIG_SERIAL_XILINX_PS_UART_CONSOLE
/* This is not port which is used for console that's why clean it up */
if (console_port == port &&
- !(cdns_uart_uart_driver.cons->flags & CON_ENABLED))
+ !(cdns_uart_uart_driver.cons->flags & CON_ENABLED)) {
console_port = NULL;
+ cdns_uart_console.index = -1;
+ }
#endif
cdns_uart_data->cts_override = of_property_read_bool(pdev->dev.of_node,
--
2.27.0
This is a note to let you know that I've just added the patch titled
vt: Reject zero-sized screen buffer size.
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From ce684552a266cb1c7cc2f7e623f38567adec6653 Mon Sep 17 00:00:00 2001
From: Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
Date: Sun, 12 Jul 2020 20:10:12 +0900
Subject: vt: Reject zero-sized screen buffer size.
syzbot is reporting general protection fault in do_con_write() [1] caused
by vc->vc_screenbuf == ZERO_SIZE_PTR caused by vc->vc_screenbuf_size == 0
caused by vc->vc_cols == vc->vc_rows == vc->vc_size_row == 0 caused by
fb_set_var() from ioctl(FBIOPUT_VSCREENINFO) on /dev/fb0 , for
gotoxy(vc, 0, 0) from reset_terminal() from vc_init() from vc_allocate()
from con_install() from tty_init_dev() from tty_open() on such console
causes vc->vc_pos == 0x10000000e due to
((unsigned long) ZERO_SIZE_PTR) + -1U * 0 + (-1U << 1).
I don't think that a console with 0 column or 0 row makes sense. And it
seems that vc_do_resize() does not intend to allow resizing a console to
0 column or 0 row due to
new_cols = (cols ? cols : vc->vc_cols);
new_rows = (lines ? lines : vc->vc_rows);
exception.
Theoretically, cols and rows can be any range as long as
0 < cols * rows * 2 <= KMALLOC_MAX_SIZE is satisfied (e.g.
cols == 1048576 && rows == 2 is possible) because of
vc->vc_size_row = vc->vc_cols << 1;
vc->vc_screenbuf_size = vc->vc_rows * vc->vc_size_row;
in visual_init() and kzalloc(vc->vc_screenbuf_size) in vc_allocate().
Since we can detect cols == 0 or rows == 0 via screenbuf_size = 0 in
visual_init(), we can reject kzalloc(0). Then, vc_allocate() will return
an error, and con_write() will not be called on a console with 0 column
or 0 row.
We need to make sure that integer overflow in visual_init() won't happen.
Since vc_do_resize() restricts cols <= 32767 and rows <= 32767, applying
1 <= cols <= 32767 and 1 <= rows <= 32767 restrictions to vc_allocate()
will be practically fine.
This patch does not touch con_init(), for returning -EINVAL there
does not help when we are not returning -ENOMEM.
[1] https://syzkaller.appspot.com/bug?extid=017265e8553724e514e8
Reported-and-tested-by: syzbot <syzbot+017265e8553724e514e8(a)syzkaller.appspotmail.com>
Signed-off-by: Tetsuo Handa <penguin-kernel(a)I-love.SAKURA.ne.jp>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20200712111013.11881-1-penguin-kernel@I-love.SAKU…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/vt/vt.c | 29 ++++++++++++++++++-----------
1 file changed, 18 insertions(+), 11 deletions(-)
diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index 48a8199f7845..42d8c67a481f 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -1092,10 +1092,19 @@ static const struct tty_port_operations vc_port_ops = {
.destruct = vc_port_destruct,
};
+/*
+ * Change # of rows and columns (0 means unchanged/the size of fg_console)
+ * [this is to be used together with some user program
+ * like resize that changes the hardware videomode]
+ */
+#define VC_MAXCOL (32767)
+#define VC_MAXROW (32767)
+
int vc_allocate(unsigned int currcons) /* return 0 on success */
{
struct vt_notifier_param param;
struct vc_data *vc;
+ int err;
WARN_CONSOLE_UNLOCKED();
@@ -1125,6 +1134,11 @@ int vc_allocate(unsigned int currcons) /* return 0 on success */
if (!*vc->vc_uni_pagedir_loc)
con_set_default_unimap(vc);
+ err = -EINVAL;
+ if (vc->vc_cols > VC_MAXCOL || vc->vc_rows > VC_MAXROW ||
+ vc->vc_screenbuf_size > KMALLOC_MAX_SIZE || !vc->vc_screenbuf_size)
+ goto err_free;
+ err = -ENOMEM;
vc->vc_screenbuf = kzalloc(vc->vc_screenbuf_size, GFP_KERNEL);
if (!vc->vc_screenbuf)
goto err_free;
@@ -1143,7 +1157,7 @@ int vc_allocate(unsigned int currcons) /* return 0 on success */
visual_deinit(vc);
kfree(vc);
vc_cons[currcons].d = NULL;
- return -ENOMEM;
+ return err;
}
static inline int resize_screen(struct vc_data *vc, int width, int height,
@@ -1158,14 +1172,6 @@ static inline int resize_screen(struct vc_data *vc, int width, int height,
return err;
}
-/*
- * Change # of rows and columns (0 means unchanged/the size of fg_console)
- * [this is to be used together with some user program
- * like resize that changes the hardware videomode]
- */
-#define VC_RESIZE_MAXCOL (32767)
-#define VC_RESIZE_MAXROW (32767)
-
/**
* vc_do_resize - resizing method for the tty
* @tty: tty being resized
@@ -1201,7 +1207,7 @@ static int vc_do_resize(struct tty_struct *tty, struct vc_data *vc,
user = vc->vc_resize_user;
vc->vc_resize_user = 0;
- if (cols > VC_RESIZE_MAXCOL || lines > VC_RESIZE_MAXROW)
+ if (cols > VC_MAXCOL || lines > VC_MAXROW)
return -EINVAL;
new_cols = (cols ? cols : vc->vc_cols);
@@ -1212,7 +1218,7 @@ static int vc_do_resize(struct tty_struct *tty, struct vc_data *vc,
if (new_cols == vc->vc_cols && new_rows == vc->vc_rows)
return 0;
- if (new_screen_size > KMALLOC_MAX_SIZE)
+ if (new_screen_size > KMALLOC_MAX_SIZE || !new_screen_size)
return -EINVAL;
newscreen = kzalloc(new_screen_size, GFP_USER);
if (!newscreen)
@@ -3393,6 +3399,7 @@ static int __init con_init(void)
INIT_WORK(&vc_cons[currcons].SAK_work, vc_SAK);
tty_port_init(&vc->port);
visual_init(vc, currcons, 1);
+ /* Assuming vc->vc_{cols,rows,screenbuf_size} are sane here. */
vc->vc_screenbuf = kzalloc(vc->vc_screenbuf_size, GFP_NOWAIT);
vc_init(vc, vc->vc_rows, vc->vc_cols,
currcons || !vc->vc_sw->con_save_screen);
--
2.27.0
The !ATOMIC_IOMAP version of io_maping_init_wc will always return
success, even when the ioremap fails.
Since the ATOMIC_IOMAP version returns NULL when the init fails, and
callers check for a NULL return on error this is unexpected.
Return NULL on ioremap failure.
Fixes: cafaf14a5d8f ("io-mapping: Always create a struct to hold metadata about the io-mapping"
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Mike Rapoport <rppt(a)linux.ibm.com>
Cc: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: stable(a)vger.kernel.org
Signed-off-by: Michael J. Ruhl <michael.j.ruhl(a)intel.com>
---
include/linux/io-mapping.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/io-mapping.h b/include/linux/io-mapping.h
index 0beaa3eba155..5641e06cbcf7 100644
--- a/include/linux/io-mapping.h
+++ b/include/linux/io-mapping.h
@@ -118,7 +118,7 @@ io_mapping_init_wc(struct io_mapping *iomap,
iomap->prot = pgprot_noncached(PAGE_KERNEL);
#endif
- return iomap;
+ return iomap->iomem ? iomap : NULL;
}
static inline void
--
2.21.0
Sometimes it is good to know when your mapping failed.
Fixes: cafaf14a5d8f ("io-mapping: Always create a struct to hold metadata about the io-mapping"
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: Mike Rapoport <rppt(a)linux.ibm.com>
Cc: Andy Shevchenko <andriy.shevchenko(a)linux.intel.com>
Cc: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: stable(a)vger.kernel.org
Signed-off-by: Michael J. Ruhl <michael.j.ruhl(a)intel.com>
---
include/linux/io-mapping.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/io-mapping.h b/include/linux/io-mapping.h
index 0beaa3eba155..5641e06cbcf7 100644
--- a/include/linux/io-mapping.h
+++ b/include/linux/io-mapping.h
@@ -118,7 +118,7 @@ io_mapping_init_wc(struct io_mapping *iomap,
iomap->prot = pgprot_noncached(PAGE_KERNEL);
#endif
- return iomap;
+ return iomap->iomem ? iomap : NULL;
}
static inline void
--
2.21.0
plane->index is NOT the index of the color plane in a YUV frame.
Actually, a YUV frame is represented by a single drm_plane, even though
it contains three Y, U, V planes.
v2-v3: No change
Cc: stable(a)vger.kernel.org # v5.3
Fixes: 90b86fcc47b4 ("DRM: Add KMS driver for the Ingenic JZ47xx SoCs")
Signed-off-by: Paul Cercueil <paul(a)crapouillou.net>
Acked-by: Sam Ravnborg <sam(a)ravnborg.org>
---
drivers/gpu/drm/ingenic/ingenic-drm.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/ingenic/ingenic-drm.c b/drivers/gpu/drm/ingenic/ingenic-drm.c
index deb37b4a8e91..606d8acb0954 100644
--- a/drivers/gpu/drm/ingenic/ingenic-drm.c
+++ b/drivers/gpu/drm/ingenic/ingenic-drm.c
@@ -386,7 +386,7 @@ static void ingenic_drm_plane_atomic_update(struct drm_plane *plane,
addr = drm_fb_cma_get_gem_addr(state->fb, state, 0);
width = state->src_w >> 16;
height = state->src_h >> 16;
- cpp = state->fb->format->cpp[plane->index];
+ cpp = state->fb->format->cpp[0];
priv->dma_hwdesc->addr = addr;
priv->dma_hwdesc->cmd = width * height * cpp / 4;
--
2.27.0
This is a note to let you know that I've just added the patch titled
usb: xhci: Fix ASM2142/ASM3142 DMA addressing
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From dbb0897e805f2ab1b8bc358f6c3d878a376b8897 Mon Sep 17 00:00:00 2001
From: Forest Crossman <cyrozap(a)gmail.com>
Date: Fri, 17 Jul 2020 06:27:34 -0500
Subject: usb: xhci: Fix ASM2142/ASM3142 DMA addressing
The ASM2142/ASM3142 (same PCI IDs) does not support full 64-bit DMA
addresses, which can cause silent memory corruption or IOMMU errors on
platforms that use the upper bits. Add the XHCI_NO_64BIT_SUPPORT quirk
to fix this issue.
Signed-off-by: Forest Crossman <cyrozap(a)gmail.com>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20200717112734.328432-1-cyrozap@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/host/xhci-pci.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/usb/host/xhci-pci.c b/drivers/usb/host/xhci-pci.c
index ef513c2fb843..9234c82e70e4 100644
--- a/drivers/usb/host/xhci-pci.c
+++ b/drivers/usb/host/xhci-pci.c
@@ -265,6 +265,9 @@ static void xhci_pci_quirks(struct device *dev, struct xhci_hcd *xhci)
if (pdev->vendor == PCI_VENDOR_ID_ASMEDIA &&
pdev->device == 0x1142)
xhci->quirks |= XHCI_TRUST_TX_LENGTH;
+ if (pdev->vendor == PCI_VENDOR_ID_ASMEDIA &&
+ pdev->device == 0x2142)
+ xhci->quirks |= XHCI_NO_64BIT_SUPPORT;
if (pdev->vendor == PCI_VENDOR_ID_ASMEDIA &&
pdev->device == PCI_DEVICE_ID_ASMEDIA_1042A_XHCI)
--
2.27.0
This is a note to let you know that I've just added the patch titled
usb: xhci-mtk: fix the failure of bandwidth allocation
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 5ce1a24dd98c00a57a8fa13660648abf7e08e3ef Mon Sep 17 00:00:00 2001
From: Chunfeng Yun <chunfeng.yun(a)mediatek.com>
Date: Fri, 10 Jul 2020 13:57:52 +0800
Subject: usb: xhci-mtk: fix the failure of bandwidth allocation
The wMaxPacketSize field of endpoint descriptor may be zero
as default value in alternate interface, and they are not
actually selected when start stream, so skip them when try to
allocate bandwidth.
Cc: stable <stable(a)vger.kernel.org>
Fixes: 0cbd4b34cda9 ("xhci: mediatek: support MTK xHCI host controller")
Signed-off-by: Chunfeng Yun <chunfeng.yun(a)mediatek.com>
Link: https://lore.kernel.org/r/1594360672-2076-1-git-send-email-chunfeng.yun@med…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/usb/host/xhci-mtk-sch.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/drivers/usb/host/xhci-mtk-sch.c b/drivers/usb/host/xhci-mtk-sch.c
index fea555570ad4..45c54d56ecbd 100644
--- a/drivers/usb/host/xhci-mtk-sch.c
+++ b/drivers/usb/host/xhci-mtk-sch.c
@@ -557,6 +557,10 @@ static bool need_bw_sch(struct usb_host_endpoint *ep,
if (is_fs_or_ls(speed) && !has_tt)
return false;
+ /* skip endpoint with zero maxpkt */
+ if (usb_endpoint_maxp(&ep->desc) == 0)
+ return false;
+
return true;
}
--
2.27.0
From: Marc Kleine-Budde <mkl(a)pengutronix.de>
[ Upstream commit e84861fec32dee8a2e62bbaa52cded6b05a2a456 ]
This function is used by dev_get_regmap() to retrieve a regmap for the
specified device. If the device has more than one regmap, the name parameter
can be used to specify one.
The code here uses a pointer comparison to check for equal strings. This
however will probably always fail, as the regmap->name is allocated via
kstrdup_const() from the regmap's config->name.
Fix this by using strcmp() instead.
Signed-off-by: Marc Kleine-Budde <mkl(a)pengutronix.de>
Link: https://lore.kernel.org/r/20200703103315.267996-1-mkl@pengutronix.de
Signed-off-by: Mark Brown <broonie(a)kernel.org>
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
drivers/base/regmap/regmap.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/base/regmap/regmap.c b/drivers/base/regmap/regmap.c
index 77cabde977edd..4a4efc6f54b55 100644
--- a/drivers/base/regmap/regmap.c
+++ b/drivers/base/regmap/regmap.c
@@ -1106,7 +1106,7 @@ static int dev_get_regmap_match(struct device *dev, void *res, void *data)
/* If the user didn't specify a name match any */
if (data)
- return (*r)->name == data;
+ return !strcmp((*r)->name, data);
else
return 1;
}
--
2.25.1
Hi,
Multiple users reported NFS causes NULL pointer dereference [1] on Ubuntu, due to commit "SUNRPC: Add "@len" parameter to gss_unwrap()" and commit "SUNRPC: Fix GSS privacy computation of auth->au_ralign".
The same issue happens on upstream stable 5.4.y branch.
The mainline kernel doesn't have this issue though.
Should we revert them? Or is there any missing commits need to be backported to v5.4?
[1] https://bugs.launchpad.net/bugs/1886277
Kai-Heng
Currently nvme_tcp_try_send_data() doesn't use kernel_sendpage() to
send slab pages. But for pages allocated by __get_free_pages() without
__GFP_COMP, which also have refcount as 0, they are still sent by
kernel_sendpage() to remote end, this is problematic.
When bcache uses a remote NVMe SSD via nvme-over-tcp as its cache
device, writing meta data e.g. cache_set->disk_buckets to remote SSD may
trigger a kernel panic due to the above problem. Bcause the meta data
pages for cache_set->disk_buckets are allocated by __get_free_pages()
without __GFP_COMP.
This problem should be fixed both in upper layer driver (bcache) and
nvme-over-tcp code. This patch fixes the nvme-over-tcp code by checking
whether the page refcount is 0, if yes then don't use kernel_sendpage()
and call sock_no_sendpage() to send the page into network stack.
The code comments in this patch is copied and modified from drbd where
the similar problem already gets solved by Philipp Reisner. This is the
best code comment including my own version.
Signed-off-by: Coly Li <colyli(a)suse.de>
Cc: Chaitanya Kulkarni <chaitanya.kulkarni(a)wdc.com>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Hannes Reinecke <hare(a)suse.de>
Cc: Jan Kara <jack(a)suse.com>
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: Mikhail Skorzhinskii <mskorzhinskiy(a)solarflare.com>
Cc: Philipp Reisner <philipp.reisner(a)linbit.com>
Cc: Sagi Grimberg <sagi(a)grimberg.me>
Cc: Vlastimil Babka <vbabka(a)suse.com>
Cc: stable(a)vger.kernel.org
---
Changelog:
v2: fix typo in patch subject.
v1: the initial version.
drivers/nvme/host/tcp.c | 13 +++++++++++--
1 file changed, 11 insertions(+), 2 deletions(-)
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 79ef2b8e2b3c..faa71db7522a 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -887,8 +887,17 @@ static int nvme_tcp_try_send_data(struct nvme_tcp_request *req)
else
flags |= MSG_MORE | MSG_SENDPAGE_NOTLAST;
- /* can't zcopy slab pages */
- if (unlikely(PageSlab(page))) {
+ /*
+ * e.g. XFS meta- & log-data is in slab pages, or bcache meta
+ * data pages, or other high order pages allocated by
+ * __get_free_pages() without __GFP_COMP, which have a page_count
+ * of 0 and/or have PageSlab() set. We cannot use send_page for
+ * those, as that does get_page(); put_page(); and would cause
+ * either a VM_BUG directly, or __page_cache_release a page that
+ * would actually still be referenced by someone, leading to some
+ * obscure delayed Oops somewhere else.
+ */
+ if (unlikely(PageSlab(page) || page_count(page) < 1)) {
ret = sock_no_sendpage(queue->sock, page, offset, len,
flags);
} else {
--
2.26.2
arm build failed on stable-rc 5.4 branch.
make -sk KBUILD_BUILD_USER=TuxBuild -C/linux -j32 ARCH=arm
CROSS_COMPILE=arm-linux-gnueabihf- HOSTCC=gcc CC="sccache
arm-linux-gnueabihf-gcc" O=build zImage
#
../drivers/firmware/efi/arm-init.c: In function ‘efifb_add_links’:
../drivers/firmware/efi/arm-init.c:327:12: error: implicit declaration
of function ‘get_dev_from_fwnode’
[-Werror=implicit-function-declaration]
327 | sup_dev = get_dev_from_fwnode(&sup_np->fwnode);
| ^~~~~~~~~~~~~~~~~~~
../drivers/firmware/efi/arm-init.c:327:10: warning: assignment to
‘struct device *’ from ‘int’ makes pointer from integer without a cast
[-Wint-conversion]
327 | sup_dev = get_dev_from_fwnode(&sup_np->fwnode);
| ^
../drivers/firmware/efi/arm-init.c: At top level:
../drivers/firmware/efi/arm-init.c:352:3: error: ‘const struct
fwnode_operations’ has no member named ‘add_links’
352 | .add_links = efifb_add_links,
| ^~~~~~~~~
../drivers/firmware/efi/arm-init.c:352:15: error: initialization of
‘struct fwnode_handle * (*)(struct fwnode_handle *)’ from incompatible
pointer type ‘int (*)(const struct fwnode_handle *, struct device *)’
[-Werror=incompatible-pointer-types]
352 | .add_links = efifb_add_links,
| ^~~~~~~~~~~~~~~
../drivers/firmware/efi/arm-init.c:352:15: note: (near initialization
for ‘efifb_fwnode_ops.get’)
seems like this is coming from the below patch
--
efi/arm: Defer probe of PCIe backed efifb on DT systems
[ Upstream commit 64c8a0cd0a535891d5905c3a1651150f0f141439 ]
The new of_devlink support breaks PCIe probing on ARM platforms booting
via UEFI if the firmware exposes a EFI framebuffer that is backed by a
PCI device. The reason is that the probing order gets reversed,
resulting in a resource conflict on the framebuffer memory window when
the PCIe probes last, causing it to give up entirely.
Given that we rely on PCI quirks to deal with EFI framebuffers that get
moved around in memory, we cannot simply drop the memory reservation, so
instead, let's use the device link infrastructure to register this
dependency, and force the probing to occur in the expected order.
Co-developed-by: Saravana Kannan <saravanak(a)google.com>
Signed-off-by: Ard Biesheuvel <ardb(a)kernel.org>
Signed-off-by: Saravana Kannan <saravanak(a)google.com>
Signed-off-by: Ard Biesheuvel <ardb(a)kernel.org>
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
Link: https://lore.kernel.org/r/20200113172245.27925-9-ardb@kernel.org
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
--
Linaro LKFT
https://lkft.linaro.org
Although I was not originally involved in the development of these
patches, I recently came across them while looking over the source:
upstream commit 429120f3df2d ("block: fix splitting segments on boundary masks")
Cc: stable(a)vger.kernel.org # v5.1+
Fixes: dcebd755926b ("block: use bio_for_each_bvec() to compute multi-page bvec count")
upstream commit 4a2f704eb2d8 ("block: fix get_max_segment_size() overflow on 32bit arch")
Fixes: 429120f3df2d ("block: fix splitting segments on boundary masks")
https://www.spinics.net/lists/linux-block/msg48605.htmlhttps://www.spinics.net/lists/linux-block/msg48959.html
The first patch mentions fixing problems with filesystem corruption, so
it seems important, but it has never been included in any -stable
kernel. Is there a specific reason these patches have been excluded
from -stable, or is it just a mistake?
These patches would be relevant to kernel versions 5.1 - 5.4, with 5.4.y
being the only active stable kernel series in that range. The patches
apply cleanly on top of 5.4.51.
Tony Battersby
Cybernetics
Hello,
Commit ab6f762f0f53162d41 Linus' HEAD.
printk_deferred() does not make sure that it's safe to write to
per-CPU data, which causes problems when printk_deferred() is
invoked "too early", before per-CPU areas are initialized. There
are multiple bug reports, e.g.
https://bugzilla.kernel.org/show_bug.cgi?id=206847
-ss
The patch below does not apply to the 5.7-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 92e0575b99835b5b3aaab2132dd551e0e04eb96a Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Ville=20Syrj=C3=A4l=C3=A4?= <ville.syrjala(a)linux.intel.com>
Date: Sat, 11 Jul 2020 11:03:36 +0300
Subject: [PATCH] drm/i915: Recalculate FBC w/a stride when needed
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Currently we're failing to recalculate the gen9 FBC w/a stride
unless something more drastic than just the modifier itself has
changed. This often leaves us with FBC enabled with the linear
fbdev framebuffer without the w/a stride enabled. That will cause
an immediate underrun and FBC will get promptly disabled.
Fix the problem by checking if the w/a stride is about to change,
and go through the full dance if so. This part of the FBC code
is still pretty much a disaster and will need lots more work.
But this should at least fix the immediate issue.
v2: Deactivate FBC when the modifier changes since that will
likely require resetting the w/a CFB stride
Cc: stable(a)vger.kernel.org
Signed-off-by: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200711080336.13423-1-ville.…
Reviewed-by: José Roberto de Souza <jose.souza(a)intel.com>
(cherry picked from commit 0428ab013fdd39dbfb8f4cd8ad2b60af3776c6b9)
Signed-off-by: Jani Nikula <jani.nikula(a)intel.com>
diff --git a/drivers/gpu/drm/i915/display/intel_fbc.c b/drivers/gpu/drm/i915/display/intel_fbc.c
index a65d9d8b79a7..412572f88b67 100644
--- a/drivers/gpu/drm/i915/display/intel_fbc.c
+++ b/drivers/gpu/drm/i915/display/intel_fbc.c
@@ -719,6 +719,25 @@ static bool intel_fbc_cfb_size_changed(struct drm_i915_private *dev_priv)
fbc->compressed_fb.size * fbc->threshold;
}
+static u16 intel_fbc_gen9_wa_cfb_stride(struct drm_i915_private *dev_priv)
+{
+ struct intel_fbc *fbc = &dev_priv->fbc;
+ struct intel_fbc_state_cache *cache = &fbc->state_cache;
+
+ if ((IS_GEN9_BC(dev_priv) || IS_BROXTON(dev_priv)) &&
+ cache->fb.modifier != I915_FORMAT_MOD_X_TILED)
+ return DIV_ROUND_UP(cache->plane.src_w, 32 * fbc->threshold) * 8;
+ else
+ return 0;
+}
+
+static bool intel_fbc_gen9_wa_cfb_stride_changed(struct drm_i915_private *dev_priv)
+{
+ struct intel_fbc *fbc = &dev_priv->fbc;
+
+ return fbc->params.gen9_wa_cfb_stride != intel_fbc_gen9_wa_cfb_stride(dev_priv);
+}
+
static bool intel_fbc_can_enable(struct drm_i915_private *dev_priv)
{
struct intel_fbc *fbc = &dev_priv->fbc;
@@ -877,6 +896,7 @@ static void intel_fbc_get_reg_params(struct intel_crtc *crtc,
params->crtc.i9xx_plane = to_intel_plane(crtc->base.primary)->i9xx_plane;
params->fb.format = cache->fb.format;
+ params->fb.modifier = cache->fb.modifier;
params->fb.stride = cache->fb.stride;
params->cfb_size = intel_fbc_calculate_cfb_size(dev_priv, cache);
@@ -906,6 +926,9 @@ static bool intel_fbc_can_flip_nuke(const struct intel_crtc_state *crtc_state)
if (params->fb.format != cache->fb.format)
return false;
+ if (params->fb.modifier != cache->fb.modifier)
+ return false;
+
if (params->fb.stride != cache->fb.stride)
return false;
@@ -1185,7 +1208,8 @@ void intel_fbc_enable(struct intel_atomic_state *state,
if (fbc->crtc) {
if (fbc->crtc != crtc ||
- !intel_fbc_cfb_size_changed(dev_priv))
+ (!intel_fbc_cfb_size_changed(dev_priv) &&
+ !intel_fbc_gen9_wa_cfb_stride_changed(dev_priv)))
goto out;
__intel_fbc_disable(dev_priv);
@@ -1207,12 +1231,7 @@ void intel_fbc_enable(struct intel_atomic_state *state,
goto out;
}
- if ((IS_GEN9_BC(dev_priv) || IS_BROXTON(dev_priv)) &&
- plane_state->hw.fb->modifier != I915_FORMAT_MOD_X_TILED)
- cache->gen9_wa_cfb_stride =
- DIV_ROUND_UP(cache->plane.src_w, 32 * fbc->threshold) * 8;
- else
- cache->gen9_wa_cfb_stride = 0;
+ cache->gen9_wa_cfb_stride = intel_fbc_gen9_wa_cfb_stride(dev_priv);
drm_dbg_kms(&dev_priv->drm, "Enabling FBC on pipe %c\n",
pipe_name(crtc->pipe));
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index f79f118bf192..ae99a9190200 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -440,6 +440,7 @@ struct intel_fbc {
struct {
const struct drm_format_info *format;
unsigned int stride;
+ u64 modifier;
} fb;
int cfb_size;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 110f9efa858f584c6bed177cd48d0c0f526940e1 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Mon, 13 Jul 2020 17:05:49 +0100
Subject: [PATCH] drm/i915/gt: Only swap to a random sibling once upon creation
The danger in switching at random upon intel_context_pin is that the
context may still actually be inflight, as it will not be scheduled out
until a context switch after it is complete -- that may be a long time
after we do a final intel_context_unpin.
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2118
Fixes: 6d06779e8672 ("drm/i915: Load balancing across a virtual engine")
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v5.3+
Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin(a)intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200713160549.17344-1-chris@…
(cherry picked from commit 90a987205c6cf74116a102ed446d22d92cdaf915)
Signed-off-by: Jani Nikula <jani.nikula(a)intel.com>
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index d270d2db6f0a..cb07e1d2a353 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -5396,13 +5396,8 @@ static void virtual_engine_initial_hint(struct virtual_engine *ve)
* typically be the first we inspect for submission.
*/
swp = prandom_u32_max(ve->num_siblings);
- if (!swp)
- return;
-
- swap(ve->siblings[swp], ve->siblings[0]);
- if (!intel_engine_has_relative_mmio(ve->siblings[0]))
- virtual_update_register_offsets(ve->context.lrc_reg_state,
- ve->siblings[0]);
+ if (swp)
+ swap(ve->siblings[swp], ve->siblings[0]);
}
static int virtual_context_alloc(struct intel_context *ce)
@@ -5415,15 +5410,9 @@ static int virtual_context_alloc(struct intel_context *ce)
static int virtual_context_pin(struct intel_context *ce)
{
struct virtual_engine *ve = container_of(ce, typeof(*ve), context);
- int err;
/* Note: we must use a real engine class for setting up reg state */
- err = __execlists_context_pin(ce, ve->siblings[0]);
- if (err)
- return err;
-
- virtual_engine_initial_hint(ve);
- return 0;
+ return __execlists_context_pin(ce, ve->siblings[0]);
}
static void virtual_context_enter(struct intel_context *ce)
@@ -5770,6 +5759,7 @@ intel_execlists_create_virtual(struct intel_engine_cs **siblings,
ve->base.flags |= I915_ENGINE_IS_VIRTUAL;
+ virtual_engine_initial_hint(ve);
return &ve->context;
err_put:
On Mon, Jun 29, 2020 at 04:28:05PM +0200, SeongJae Park wrote:
> Hello,
>
>
> With my little script, I found below commits in the mainline tree are more than
> 1 week old and fixing commits that back-ported in v5.4..v5.4.49, but not merged
> in the stable/linux-5.4.y tree. Are those need to be merged in but missed or
> dealyed?
>
> 9210c075cef2 ("nvme-pci: avoid race between nvme_reap_pending_cqes() and nvme_poll()")
I tried this first patch, and it doesn't apply to the 5.4.y tree, so are
you sure you tried these yourself?
If so, please send a series of backported patches that you have
successfully tested, or if a patch applies cleanly, just the git id.
thanks,
greg k-h
Commit 005c34ae4b44f085120d7f371121ec7ded677761 upstream.
The GIC driver uses a RMW sequence to update the affinity, and
relies on the gic_lock_irqsave/gic_unlock_irqrestore sequences
to update it atomically.
But these sequences only expand into anything meaningful if
the BL_SWITCHER option is selected, which almost never happens.
It also turns out that using a RMW and locks is just as silly,
as the GIC distributor supports byte accesses for the GICD_TARGETRn
registers, which when used make the update atomic by definition.
Drop the terminally broken code and replace it by a byte write.
Fixes: 04c8b0f82c7d ("irqchip/gic: Make locking a BL_SWITCHER only feature")
Cc: stable(a)vger.kernel.org
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
---
drivers/irqchip/irq-gic.c | 13 +++----------
1 file changed, 3 insertions(+), 10 deletions(-)
diff --git a/drivers/irqchip/irq-gic.c b/drivers/irqchip/irq-gic.c
index d6c404b3584d..006b17593c12 100644
--- a/drivers/irqchip/irq-gic.c
+++ b/drivers/irqchip/irq-gic.c
@@ -324,10 +324,8 @@ static int gic_irq_set_vcpu_affinity(struct irq_data *d, void *vcpu)
static int gic_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
bool force)
{
- void __iomem *reg = gic_dist_base(d) + GIC_DIST_TARGET + (gic_irq(d) & ~3);
- unsigned int cpu, shift = (gic_irq(d) % 4) * 8;
- u32 val, mask, bit;
- unsigned long flags;
+ void __iomem *reg = gic_dist_base(d) + GIC_DIST_TARGET + gic_irq(d);
+ unsigned int cpu;
if (!force)
cpu = cpumask_any_and(mask_val, cpu_online_mask);
@@ -337,12 +335,7 @@ static int gic_set_affinity(struct irq_data *d, const struct cpumask *mask_val,
if (cpu >= NR_GIC_CPU_IF || cpu >= nr_cpu_ids)
return -EINVAL;
- gic_lock_irqsave(flags);
- mask = 0xff << shift;
- bit = gic_cpu_map[cpu] << shift;
- val = readl_relaxed(reg) & ~mask;
- writel_relaxed(val | bit, reg);
- gic_unlock_irqrestore(flags);
+ writeb_relaxed(gic_cpu_map[cpu], reg);
return IRQ_SET_MASK_OK_DONE;
}
--
2.27.0
[ Upstream commit 01cfcde9c26d8555f0e6e9aea9d6049f87683998 ]
task_h_load() can return 0 in some situations like running stress-ng
mmapfork, which forks thousands of threads, in a sched group on a 224 cores
system. The load balance doesn't handle this correctly because
env->imbalance never decreases and it will stop pulling tasks only after
reaching loop_max, which can be equal to the number of running tasks of
the cfs. Make sure that imbalance will be decreased by at least 1.
We can't simply ensure that task_h_load() returns at least one because it
would imply to handle underflow in other places.
Signed-off-by: Vincent Guittot <vincent.guittot(a)linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
[removed misfit part which was not implemented yet]
Signed-off-by: Vincent Guittot <vincent.guittot(a)linaro.org>
Reviewed-by: Valentin Schneider <valentin.schneider(a)arm.com>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann(a)arm.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann(a)arm.com>
Cc: <stable(a)vger.kernel.org> # v4.19 v4.14 v4.9 v4.4
cc: Sasha Levin <sashal(a)kernel.org>
Link: https://lkml.kernel.org/r/20200710152426.16981-1-vincent.guittot@linaro.org
---
This patch also applies on v4.14.188 v4.9.230 and v4.4.230
kernel/sched/fair.c | 10 +++++++++-
1 file changed, 9 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 92b1e71f13c8..d8c249e6dcb7 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -7337,7 +7337,15 @@ static int detach_tasks(struct lb_env *env)
if (!can_migrate_task(p, env))
goto next;
- load = task_h_load(p);
+ /*
+ * Depending of the number of CPUs and tasks and the
+ * cgroup hierarchy, task_h_load() can return a null
+ * value. Make sure that env->imbalance decreases
+ * otherwise detach_tasks() will stop only after
+ * detaching up to loop_max tasks.
+ */
+ load = max_t(unsigned long, task_h_load(p), 1);
+
if (sched_feat(LB_MIN) && load < 16 && !env->sd->nr_balance_failed)
goto next;
--
2.17.1
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 15956689a0e60aa0c795174f3c310b60d8794235 Mon Sep 17 00:00:00 2001
From: Will Deacon <will(a)kernel.org>
Date: Fri, 3 Jul 2020 12:08:42 +0100
Subject: [PATCH] arm64: compat: Ensure upper 32 bits of x0 are zero on syscall
return
Although we zero the upper bits of x0 on entry to the kernel from an
AArch32 task, we do not clear them on the exception return path and can
therefore expose 64-bit sign extended syscall return values to userspace
via interfaces such as the 'perf_regs' ABI, which deal exclusively with
64-bit registers.
Explicitly clear the upper 32 bits of x0 on return from a compat system
call.
Cc: <stable(a)vger.kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Keno Fischer <keno(a)juliacomputing.com>
Cc: Luis Machado <luis.machado(a)linaro.org>
Signed-off-by: Will Deacon <will(a)kernel.org>
diff --git a/arch/arm64/include/asm/syscall.h b/arch/arm64/include/asm/syscall.h
index 65299a2dcf9c..cfc0672013f6 100644
--- a/arch/arm64/include/asm/syscall.h
+++ b/arch/arm64/include/asm/syscall.h
@@ -34,6 +34,10 @@ static inline long syscall_get_error(struct task_struct *task,
struct pt_regs *regs)
{
unsigned long error = regs->regs[0];
+
+ if (is_compat_thread(task_thread_info(task)))
+ error = sign_extend64(error, 31);
+
return IS_ERR_VALUE(error) ? error : 0;
}
@@ -47,7 +51,13 @@ static inline void syscall_set_return_value(struct task_struct *task,
struct pt_regs *regs,
int error, long val)
{
- regs->regs[0] = (long) error ? error : val;
+ if (error)
+ val = error;
+
+ if (is_compat_thread(task_thread_info(task)))
+ val = lower_32_bits(val);
+
+ regs->regs[0] = val;
}
#define SYSCALL_MAX_ARGS 6
diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index 7c14466a12af..98a26d4e7b0c 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -50,6 +50,9 @@ static void invoke_syscall(struct pt_regs *regs, unsigned int scno,
ret = do_ni_syscall(regs, scno);
}
+ if (is_compat_task())
+ ret = lower_32_bits(ret);
+
regs->regs[0] = ret;
}
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 15956689a0e60aa0c795174f3c310b60d8794235 Mon Sep 17 00:00:00 2001
From: Will Deacon <will(a)kernel.org>
Date: Fri, 3 Jul 2020 12:08:42 +0100
Subject: [PATCH] arm64: compat: Ensure upper 32 bits of x0 are zero on syscall
return
Although we zero the upper bits of x0 on entry to the kernel from an
AArch32 task, we do not clear them on the exception return path and can
therefore expose 64-bit sign extended syscall return values to userspace
via interfaces such as the 'perf_regs' ABI, which deal exclusively with
64-bit registers.
Explicitly clear the upper 32 bits of x0 on return from a compat system
call.
Cc: <stable(a)vger.kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Keno Fischer <keno(a)juliacomputing.com>
Cc: Luis Machado <luis.machado(a)linaro.org>
Signed-off-by: Will Deacon <will(a)kernel.org>
diff --git a/arch/arm64/include/asm/syscall.h b/arch/arm64/include/asm/syscall.h
index 65299a2dcf9c..cfc0672013f6 100644
--- a/arch/arm64/include/asm/syscall.h
+++ b/arch/arm64/include/asm/syscall.h
@@ -34,6 +34,10 @@ static inline long syscall_get_error(struct task_struct *task,
struct pt_regs *regs)
{
unsigned long error = regs->regs[0];
+
+ if (is_compat_thread(task_thread_info(task)))
+ error = sign_extend64(error, 31);
+
return IS_ERR_VALUE(error) ? error : 0;
}
@@ -47,7 +51,13 @@ static inline void syscall_set_return_value(struct task_struct *task,
struct pt_regs *regs,
int error, long val)
{
- regs->regs[0] = (long) error ? error : val;
+ if (error)
+ val = error;
+
+ if (is_compat_thread(task_thread_info(task)))
+ val = lower_32_bits(val);
+
+ regs->regs[0] = val;
}
#define SYSCALL_MAX_ARGS 6
diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index 7c14466a12af..98a26d4e7b0c 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -50,6 +50,9 @@ static void invoke_syscall(struct pt_regs *regs, unsigned int scno,
ret = do_ni_syscall(regs, scno);
}
+ if (is_compat_task())
+ ret = lower_32_bits(ret);
+
regs->regs[0] = ret;
}
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 15956689a0e60aa0c795174f3c310b60d8794235 Mon Sep 17 00:00:00 2001
From: Will Deacon <will(a)kernel.org>
Date: Fri, 3 Jul 2020 12:08:42 +0100
Subject: [PATCH] arm64: compat: Ensure upper 32 bits of x0 are zero on syscall
return
Although we zero the upper bits of x0 on entry to the kernel from an
AArch32 task, we do not clear them on the exception return path and can
therefore expose 64-bit sign extended syscall return values to userspace
via interfaces such as the 'perf_regs' ABI, which deal exclusively with
64-bit registers.
Explicitly clear the upper 32 bits of x0 on return from a compat system
call.
Cc: <stable(a)vger.kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Cc: Keno Fischer <keno(a)juliacomputing.com>
Cc: Luis Machado <luis.machado(a)linaro.org>
Signed-off-by: Will Deacon <will(a)kernel.org>
diff --git a/arch/arm64/include/asm/syscall.h b/arch/arm64/include/asm/syscall.h
index 65299a2dcf9c..cfc0672013f6 100644
--- a/arch/arm64/include/asm/syscall.h
+++ b/arch/arm64/include/asm/syscall.h
@@ -34,6 +34,10 @@ static inline long syscall_get_error(struct task_struct *task,
struct pt_regs *regs)
{
unsigned long error = regs->regs[0];
+
+ if (is_compat_thread(task_thread_info(task)))
+ error = sign_extend64(error, 31);
+
return IS_ERR_VALUE(error) ? error : 0;
}
@@ -47,7 +51,13 @@ static inline void syscall_set_return_value(struct task_struct *task,
struct pt_regs *regs,
int error, long val)
{
- regs->regs[0] = (long) error ? error : val;
+ if (error)
+ val = error;
+
+ if (is_compat_thread(task_thread_info(task)))
+ val = lower_32_bits(val);
+
+ regs->regs[0] = val;
}
#define SYSCALL_MAX_ARGS 6
diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index 7c14466a12af..98a26d4e7b0c 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -50,6 +50,9 @@ static void invoke_syscall(struct pt_regs *regs, unsigned int scno,
ret = do_ni_syscall(regs, scno);
}
+ if (is_compat_task())
+ ret = lower_32_bits(ret);
+
regs->regs[0] = ret;
}
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From ac2081cdc4d99c57f219c1a6171526e0fa0a6fff Mon Sep 17 00:00:00 2001
From: Will Deacon <will(a)kernel.org>
Date: Thu, 2 Jul 2020 21:16:20 +0100
Subject: [PATCH] arm64: ptrace: Consistently use pseudo-singlestep exceptions
Although the arm64 single-step state machine can be fast-forwarded in
cases where we wish to generate a SIGTRAP without actually executing an
instruction, this has two major limitations outside of simply skipping
an instruction due to emulation.
1. Stepping out of a ptrace signal stop into a signal handler where
SIGTRAP is blocked. Fast-forwarding the stepping state machine in
this case will result in a forced SIGTRAP, with the handler reset to
SIG_DFL.
2. The hardware implicitly fast-forwards the state machine when executing
an SVC instruction for issuing a system call. This can interact badly
with subsequent ptrace stops signalled during the execution of the
system call (e.g. SYSCALL_EXIT or seccomp traps), as they may corrupt
the stepping state by updating the PSTATE for the tracee.
Resolve both of these issues by injecting a pseudo-singlestep exception
on entry to a signal handler and also on return to userspace following a
system call.
Cc: <stable(a)vger.kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Tested-by: Luis Machado <luis.machado(a)linaro.org>
Reported-by: Keno Fischer <keno(a)juliacomputing.com>
Signed-off-by: Will Deacon <will(a)kernel.org>
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index 6ea8b6a26ae9..5e784e16ee89 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -93,6 +93,7 @@ void arch_release_task_struct(struct task_struct *tsk);
#define _TIF_SYSCALL_EMU (1 << TIF_SYSCALL_EMU)
#define _TIF_UPROBE (1 << TIF_UPROBE)
#define _TIF_FSCHECK (1 << TIF_FSCHECK)
+#define _TIF_SINGLESTEP (1 << TIF_SINGLESTEP)
#define _TIF_32BIT (1 << TIF_32BIT)
#define _TIF_SVE (1 << TIF_SVE)
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 68b7f34a08f5..057d4aa1af4d 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -1818,12 +1818,23 @@ static void tracehook_report_syscall(struct pt_regs *regs,
saved_reg = regs->regs[regno];
regs->regs[regno] = dir;
- if (dir == PTRACE_SYSCALL_EXIT)
+ if (dir == PTRACE_SYSCALL_ENTER) {
+ if (tracehook_report_syscall_entry(regs))
+ forget_syscall(regs);
+ regs->regs[regno] = saved_reg;
+ } else if (!test_thread_flag(TIF_SINGLESTEP)) {
tracehook_report_syscall_exit(regs, 0);
- else if (tracehook_report_syscall_entry(regs))
- forget_syscall(regs);
+ regs->regs[regno] = saved_reg;
+ } else {
+ regs->regs[regno] = saved_reg;
- regs->regs[regno] = saved_reg;
+ /*
+ * Signal a pseudo-step exception since we are stepping but
+ * tracer modifications to the registers may have rewound the
+ * state machine.
+ */
+ tracehook_report_syscall_exit(regs, 1);
+ }
}
int syscall_trace_enter(struct pt_regs *regs)
@@ -1851,12 +1862,14 @@ int syscall_trace_enter(struct pt_regs *regs)
void syscall_trace_exit(struct pt_regs *regs)
{
+ unsigned long flags = READ_ONCE(current_thread_info()->flags);
+
audit_syscall_exit(regs);
- if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
+ if (flags & _TIF_SYSCALL_TRACEPOINT)
trace_sys_exit(regs, regs_return_value(regs));
- if (test_thread_flag(TIF_SYSCALL_TRACE))
+ if (flags & (_TIF_SYSCALL_TRACE | _TIF_SINGLESTEP))
tracehook_report_syscall(regs, PTRACE_SYSCALL_EXIT);
rseq_syscall(regs);
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 801d56cdf701..3b4f31f35e45 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -800,7 +800,6 @@ static void setup_restart_syscall(struct pt_regs *regs)
*/
static void handle_signal(struct ksignal *ksig, struct pt_regs *regs)
{
- struct task_struct *tsk = current;
sigset_t *oldset = sigmask_to_save();
int usig = ksig->sig;
int ret;
@@ -824,14 +823,8 @@ static void handle_signal(struct ksignal *ksig, struct pt_regs *regs)
*/
ret |= !valid_user_regs(®s->user_regs, current);
- /*
- * Fast forward the stepping logic so we step into the signal
- * handler.
- */
- if (!ret)
- user_fastforward_single_step(tsk);
-
- signal_setup_done(ret, ksig, 0);
+ /* Step into the signal handler if we are stepping */
+ signal_setup_done(ret, ksig, test_thread_flag(TIF_SINGLESTEP));
}
/*
diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index 5f5b868292f5..7c14466a12af 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -139,7 +139,7 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
if (!has_syscall_work(flags) && !IS_ENABLED(CONFIG_DEBUG_RSEQ)) {
local_daif_mask();
flags = current_thread_info()->flags;
- if (!has_syscall_work(flags)) {
+ if (!has_syscall_work(flags) && !(flags & _TIF_SINGLESTEP)) {
/*
* We're off to userspace, where interrupts are
* always enabled after we restore the flags from
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From ac2081cdc4d99c57f219c1a6171526e0fa0a6fff Mon Sep 17 00:00:00 2001
From: Will Deacon <will(a)kernel.org>
Date: Thu, 2 Jul 2020 21:16:20 +0100
Subject: [PATCH] arm64: ptrace: Consistently use pseudo-singlestep exceptions
Although the arm64 single-step state machine can be fast-forwarded in
cases where we wish to generate a SIGTRAP without actually executing an
instruction, this has two major limitations outside of simply skipping
an instruction due to emulation.
1. Stepping out of a ptrace signal stop into a signal handler where
SIGTRAP is blocked. Fast-forwarding the stepping state machine in
this case will result in a forced SIGTRAP, with the handler reset to
SIG_DFL.
2. The hardware implicitly fast-forwards the state machine when executing
an SVC instruction for issuing a system call. This can interact badly
with subsequent ptrace stops signalled during the execution of the
system call (e.g. SYSCALL_EXIT or seccomp traps), as they may corrupt
the stepping state by updating the PSTATE for the tracee.
Resolve both of these issues by injecting a pseudo-singlestep exception
on entry to a signal handler and also on return to userspace following a
system call.
Cc: <stable(a)vger.kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Tested-by: Luis Machado <luis.machado(a)linaro.org>
Reported-by: Keno Fischer <keno(a)juliacomputing.com>
Signed-off-by: Will Deacon <will(a)kernel.org>
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index 6ea8b6a26ae9..5e784e16ee89 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -93,6 +93,7 @@ void arch_release_task_struct(struct task_struct *tsk);
#define _TIF_SYSCALL_EMU (1 << TIF_SYSCALL_EMU)
#define _TIF_UPROBE (1 << TIF_UPROBE)
#define _TIF_FSCHECK (1 << TIF_FSCHECK)
+#define _TIF_SINGLESTEP (1 << TIF_SINGLESTEP)
#define _TIF_32BIT (1 << TIF_32BIT)
#define _TIF_SVE (1 << TIF_SVE)
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 68b7f34a08f5..057d4aa1af4d 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -1818,12 +1818,23 @@ static void tracehook_report_syscall(struct pt_regs *regs,
saved_reg = regs->regs[regno];
regs->regs[regno] = dir;
- if (dir == PTRACE_SYSCALL_EXIT)
+ if (dir == PTRACE_SYSCALL_ENTER) {
+ if (tracehook_report_syscall_entry(regs))
+ forget_syscall(regs);
+ regs->regs[regno] = saved_reg;
+ } else if (!test_thread_flag(TIF_SINGLESTEP)) {
tracehook_report_syscall_exit(regs, 0);
- else if (tracehook_report_syscall_entry(regs))
- forget_syscall(regs);
+ regs->regs[regno] = saved_reg;
+ } else {
+ regs->regs[regno] = saved_reg;
- regs->regs[regno] = saved_reg;
+ /*
+ * Signal a pseudo-step exception since we are stepping but
+ * tracer modifications to the registers may have rewound the
+ * state machine.
+ */
+ tracehook_report_syscall_exit(regs, 1);
+ }
}
int syscall_trace_enter(struct pt_regs *regs)
@@ -1851,12 +1862,14 @@ int syscall_trace_enter(struct pt_regs *regs)
void syscall_trace_exit(struct pt_regs *regs)
{
+ unsigned long flags = READ_ONCE(current_thread_info()->flags);
+
audit_syscall_exit(regs);
- if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
+ if (flags & _TIF_SYSCALL_TRACEPOINT)
trace_sys_exit(regs, regs_return_value(regs));
- if (test_thread_flag(TIF_SYSCALL_TRACE))
+ if (flags & (_TIF_SYSCALL_TRACE | _TIF_SINGLESTEP))
tracehook_report_syscall(regs, PTRACE_SYSCALL_EXIT);
rseq_syscall(regs);
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 801d56cdf701..3b4f31f35e45 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -800,7 +800,6 @@ static void setup_restart_syscall(struct pt_regs *regs)
*/
static void handle_signal(struct ksignal *ksig, struct pt_regs *regs)
{
- struct task_struct *tsk = current;
sigset_t *oldset = sigmask_to_save();
int usig = ksig->sig;
int ret;
@@ -824,14 +823,8 @@ static void handle_signal(struct ksignal *ksig, struct pt_regs *regs)
*/
ret |= !valid_user_regs(®s->user_regs, current);
- /*
- * Fast forward the stepping logic so we step into the signal
- * handler.
- */
- if (!ret)
- user_fastforward_single_step(tsk);
-
- signal_setup_done(ret, ksig, 0);
+ /* Step into the signal handler if we are stepping */
+ signal_setup_done(ret, ksig, test_thread_flag(TIF_SINGLESTEP));
}
/*
diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index 5f5b868292f5..7c14466a12af 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -139,7 +139,7 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
if (!has_syscall_work(flags) && !IS_ENABLED(CONFIG_DEBUG_RSEQ)) {
local_daif_mask();
flags = current_thread_info()->flags;
- if (!has_syscall_work(flags)) {
+ if (!has_syscall_work(flags) && !(flags & _TIF_SINGLESTEP)) {
/*
* We're off to userspace, where interrupts are
* always enabled after we restore the flags from
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From ac2081cdc4d99c57f219c1a6171526e0fa0a6fff Mon Sep 17 00:00:00 2001
From: Will Deacon <will(a)kernel.org>
Date: Thu, 2 Jul 2020 21:16:20 +0100
Subject: [PATCH] arm64: ptrace: Consistently use pseudo-singlestep exceptions
Although the arm64 single-step state machine can be fast-forwarded in
cases where we wish to generate a SIGTRAP without actually executing an
instruction, this has two major limitations outside of simply skipping
an instruction due to emulation.
1. Stepping out of a ptrace signal stop into a signal handler where
SIGTRAP is blocked. Fast-forwarding the stepping state machine in
this case will result in a forced SIGTRAP, with the handler reset to
SIG_DFL.
2. The hardware implicitly fast-forwards the state machine when executing
an SVC instruction for issuing a system call. This can interact badly
with subsequent ptrace stops signalled during the execution of the
system call (e.g. SYSCALL_EXIT or seccomp traps), as they may corrupt
the stepping state by updating the PSTATE for the tracee.
Resolve both of these issues by injecting a pseudo-singlestep exception
on entry to a signal handler and also on return to userspace following a
system call.
Cc: <stable(a)vger.kernel.org>
Cc: Mark Rutland <mark.rutland(a)arm.com>
Tested-by: Luis Machado <luis.machado(a)linaro.org>
Reported-by: Keno Fischer <keno(a)juliacomputing.com>
Signed-off-by: Will Deacon <will(a)kernel.org>
diff --git a/arch/arm64/include/asm/thread_info.h b/arch/arm64/include/asm/thread_info.h
index 6ea8b6a26ae9..5e784e16ee89 100644
--- a/arch/arm64/include/asm/thread_info.h
+++ b/arch/arm64/include/asm/thread_info.h
@@ -93,6 +93,7 @@ void arch_release_task_struct(struct task_struct *tsk);
#define _TIF_SYSCALL_EMU (1 << TIF_SYSCALL_EMU)
#define _TIF_UPROBE (1 << TIF_UPROBE)
#define _TIF_FSCHECK (1 << TIF_FSCHECK)
+#define _TIF_SINGLESTEP (1 << TIF_SINGLESTEP)
#define _TIF_32BIT (1 << TIF_32BIT)
#define _TIF_SVE (1 << TIF_SVE)
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 68b7f34a08f5..057d4aa1af4d 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -1818,12 +1818,23 @@ static void tracehook_report_syscall(struct pt_regs *regs,
saved_reg = regs->regs[regno];
regs->regs[regno] = dir;
- if (dir == PTRACE_SYSCALL_EXIT)
+ if (dir == PTRACE_SYSCALL_ENTER) {
+ if (tracehook_report_syscall_entry(regs))
+ forget_syscall(regs);
+ regs->regs[regno] = saved_reg;
+ } else if (!test_thread_flag(TIF_SINGLESTEP)) {
tracehook_report_syscall_exit(regs, 0);
- else if (tracehook_report_syscall_entry(regs))
- forget_syscall(regs);
+ regs->regs[regno] = saved_reg;
+ } else {
+ regs->regs[regno] = saved_reg;
- regs->regs[regno] = saved_reg;
+ /*
+ * Signal a pseudo-step exception since we are stepping but
+ * tracer modifications to the registers may have rewound the
+ * state machine.
+ */
+ tracehook_report_syscall_exit(regs, 1);
+ }
}
int syscall_trace_enter(struct pt_regs *regs)
@@ -1851,12 +1862,14 @@ int syscall_trace_enter(struct pt_regs *regs)
void syscall_trace_exit(struct pt_regs *regs)
{
+ unsigned long flags = READ_ONCE(current_thread_info()->flags);
+
audit_syscall_exit(regs);
- if (test_thread_flag(TIF_SYSCALL_TRACEPOINT))
+ if (flags & _TIF_SYSCALL_TRACEPOINT)
trace_sys_exit(regs, regs_return_value(regs));
- if (test_thread_flag(TIF_SYSCALL_TRACE))
+ if (flags & (_TIF_SYSCALL_TRACE | _TIF_SINGLESTEP))
tracehook_report_syscall(regs, PTRACE_SYSCALL_EXIT);
rseq_syscall(regs);
diff --git a/arch/arm64/kernel/signal.c b/arch/arm64/kernel/signal.c
index 801d56cdf701..3b4f31f35e45 100644
--- a/arch/arm64/kernel/signal.c
+++ b/arch/arm64/kernel/signal.c
@@ -800,7 +800,6 @@ static void setup_restart_syscall(struct pt_regs *regs)
*/
static void handle_signal(struct ksignal *ksig, struct pt_regs *regs)
{
- struct task_struct *tsk = current;
sigset_t *oldset = sigmask_to_save();
int usig = ksig->sig;
int ret;
@@ -824,14 +823,8 @@ static void handle_signal(struct ksignal *ksig, struct pt_regs *regs)
*/
ret |= !valid_user_regs(®s->user_regs, current);
- /*
- * Fast forward the stepping logic so we step into the signal
- * handler.
- */
- if (!ret)
- user_fastforward_single_step(tsk);
-
- signal_setup_done(ret, ksig, 0);
+ /* Step into the signal handler if we are stepping */
+ signal_setup_done(ret, ksig, test_thread_flag(TIF_SINGLESTEP));
}
/*
diff --git a/arch/arm64/kernel/syscall.c b/arch/arm64/kernel/syscall.c
index 5f5b868292f5..7c14466a12af 100644
--- a/arch/arm64/kernel/syscall.c
+++ b/arch/arm64/kernel/syscall.c
@@ -139,7 +139,7 @@ static void el0_svc_common(struct pt_regs *regs, int scno, int sc_nr,
if (!has_syscall_work(flags) && !IS_ENABLED(CONFIG_DEBUG_RSEQ)) {
local_daif_mask();
flags = current_thread_info()->flags;
- if (!has_syscall_work(flags)) {
+ if (!has_syscall_work(flags) && !(flags & _TIF_SINGLESTEP)) {
/*
* We're off to userspace, where interrupts are
* always enabled after we restore the flags from
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 263a0269a59c0b4145829462a107fe7f7327105f Mon Sep 17 00:00:00 2001
From: Dinh Nguyen <dinguyen(a)kernel.org>
Date: Mon, 29 Jun 2020 11:25:43 -0500
Subject: [PATCH] arm64: dts: stratix10: add status to qspi dts node
Add status = "okay" to QSPI node.
Fixes: 0cb140d07fc75 ("arm64: dts: stratix10: Add QSPI support for Stratix10")
Cc: linux-stable <stable(a)vger.kernel.org> # >= v5.6
Signed-off-by: Dinh Nguyen <dinguyen(a)kernel.org>
diff --git a/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk.dts b/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk.dts
index f6c4a15079d3..feadd21bc0dc 100644
--- a/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk.dts
+++ b/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk.dts
@@ -155,6 +155,7 @@
};
&qspi {
+ status = "okay";
flash@0 {
#address-cells = <1>;
#size-cells = <1>;
diff --git a/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk_nand.dts b/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk_nand.dts
index 9946515b8afd..4000c393243d 100644
--- a/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk_nand.dts
+++ b/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk_nand.dts
@@ -188,6 +188,7 @@
};
&qspi {
+ status = "okay";
flash@0 {
#address-cells = <1>;
#size-cells = <1>;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 263a0269a59c0b4145829462a107fe7f7327105f Mon Sep 17 00:00:00 2001
From: Dinh Nguyen <dinguyen(a)kernel.org>
Date: Mon, 29 Jun 2020 11:25:43 -0500
Subject: [PATCH] arm64: dts: stratix10: add status to qspi dts node
Add status = "okay" to QSPI node.
Fixes: 0cb140d07fc75 ("arm64: dts: stratix10: Add QSPI support for Stratix10")
Cc: linux-stable <stable(a)vger.kernel.org> # >= v5.6
Signed-off-by: Dinh Nguyen <dinguyen(a)kernel.org>
diff --git a/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk.dts b/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk.dts
index f6c4a15079d3..feadd21bc0dc 100644
--- a/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk.dts
+++ b/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk.dts
@@ -155,6 +155,7 @@
};
&qspi {
+ status = "okay";
flash@0 {
#address-cells = <1>;
#size-cells = <1>;
diff --git a/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk_nand.dts b/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk_nand.dts
index 9946515b8afd..4000c393243d 100644
--- a/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk_nand.dts
+++ b/arch/arm64/boot/dts/altera/socfpga_stratix10_socdk_nand.dts
@@ -188,6 +188,7 @@
};
&qspi {
+ status = "okay";
flash@0 {
#address-cells = <1>;
#size-cells = <1>;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 4237c625304b212a3f30adf787901082082511ec Mon Sep 17 00:00:00 2001
From: Tim Harvey <tharvey(a)gateworks.com>
Date: Tue, 23 Jun 2020 12:06:54 -0700
Subject: [PATCH] ARM: dts: imx6qdl-gw551x: fix audio SSI
The audio codec on the GW551x routes to ssi1. It fixes audio capture on
the device.
Cc: stable(a)vger.kernel.org
Fixes: 3117e851cef1 ("ARM: dts: imx: Add TDA19971 HDMI Receiver to GW551x")
Signed-off-by: Tim Harvey <tharvey(a)gateworks.com>
Signed-off-by: Shawn Guo <shawnguo(a)kernel.org>
diff --git a/arch/arm/boot/dts/imx6qdl-gw551x.dtsi b/arch/arm/boot/dts/imx6qdl-gw551x.dtsi
index c38e86eedcc0..8c33510c9519 100644
--- a/arch/arm/boot/dts/imx6qdl-gw551x.dtsi
+++ b/arch/arm/boot/dts/imx6qdl-gw551x.dtsi
@@ -110,7 +110,7 @@
simple-audio-card,frame-master = <&sound_codec>;
sound_cpu: simple-audio-card,cpu {
- sound-dai = <&ssi2>;
+ sound-dai = <&ssi1>;
};
sound_codec: simple-audio-card,codec {
From: Finley Xiao <finley.xiao(a)rock-chips.com>
commit 371a3bc79c11b707d7a1b7a2c938dc3cc042fffb upstream.
The function cpu_power_to_freq is used to find a frequency and set the
cooling device to consume at most the power to be converted. For example,
if the power to be converted is 80mW, and the em table is as follow.
struct em_cap_state table[] = {
/* KHz mW */
{ 1008000, 36, 0 },
{ 1200000, 49, 0 },
{ 1296000, 59, 0 },
{ 1416000, 72, 0 },
{ 1512000, 86, 0 },
};
The target frequency should be 1416000KHz, not 1512000KHz.
Fixes: 349d39dc5739 ("thermal: cpu_cooling: merge frequency and power tables")
Cc: <stable(a)vger.kernel.org> # v4.13+
Signed-off-by: Finley Xiao <finley.xiao(a)rock-chips.com>
Acked-by: Viresh Kumar <viresh.kumar(a)linaro.org>
Reviewed-by: Amit Kucheria <amit.kucheria(a)linaro.org>
Signed-off-by: Daniel Lezcano <daniel.lezcano(a)linaro.org>
Link: https://lore.kernel.org/r/20200619090825.32747-1-finley.xiao@rock-chips.com
Signed-off-by: Viresh Kumar <viresh.kumar(a)linaro.org>
---
Hi Greg,
I am resending this as I got your emails of this failing on 4.14, 4.19
and 5.4. This should be applied to all three of them.
@Finley: I hope I have done it correctly, please do check it as this
required me to rewrite the code to adapt to previous kernels.
drivers/thermal/cpu_cooling.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c
index 908a8014cf76..1f4387a5ceae 100644
--- a/drivers/thermal/cpu_cooling.c
+++ b/drivers/thermal/cpu_cooling.c
@@ -280,11 +280,11 @@ static u32 cpu_power_to_freq(struct cpufreq_cooling_device *cpufreq_cdev,
int i;
struct freq_table *freq_table = cpufreq_cdev->freq_table;
- for (i = 1; i <= cpufreq_cdev->max_level; i++)
- if (power > freq_table[i].power)
+ for (i = 0; i < cpufreq_cdev->max_level; i++)
+ if (power >= freq_table[i].power)
break;
- return freq_table[i - 1].frequency;
+ return freq_table[i].frequency;
}
/**
--
2.25.0.rc1.19.g042ed3e048af
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 6348dd291e3653534a9e28e6917569bc9967b35b Mon Sep 17 00:00:00 2001
From: Charan Teja Kalla <charante(a)codeaurora.org>
Date: Fri, 19 Jun 2020 17:27:19 +0530
Subject: [PATCH] dmabuf: use spinlock to access dmabuf->name
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
There exists a sleep-while-atomic bug while accessing the dmabuf->name
under mutex in the dmabuffs_dname(). This is caused from the SELinux
permissions checks on a process where it tries to validate the inherited
files from fork() by traversing them through iterate_fd() (which
traverse files under spin_lock) and call
match_file(security/selinux/hooks.c) where the permission checks happen.
This audit information is logged using dump_common_audit_data() where it
calls d_path() to get the file path name. If the file check happen on
the dmabuf's fd, then it ends up in ->dmabuffs_dname() and use mutex to
access dmabuf->name. The flow will be like below:
flush_unauthorized_files()
iterate_fd()
spin_lock() --> Start of the atomic section.
match_file()
file_has_perm()
avc_has_perm()
avc_audit()
slow_avc_audit()
common_lsm_audit()
dump_common_audit_data()
audit_log_d_path()
d_path()
dmabuffs_dname()
mutex_lock()--> Sleep while atomic.
Call trace captured (on 4.19 kernels) is below:
___might_sleep+0x204/0x208
__might_sleep+0x50/0x88
__mutex_lock_common+0x5c/0x1068
__mutex_lock_common+0x5c/0x1068
mutex_lock_nested+0x40/0x50
dmabuffs_dname+0xa0/0x170
d_path+0x84/0x290
audit_log_d_path+0x74/0x130
common_lsm_audit+0x334/0x6e8
slow_avc_audit+0xb8/0xf8
avc_has_perm+0x154/0x218
file_has_perm+0x70/0x180
match_file+0x60/0x78
iterate_fd+0x128/0x168
selinux_bprm_committing_creds+0x178/0x248
security_bprm_committing_creds+0x30/0x48
install_exec_creds+0x1c/0x68
load_elf_binary+0x3a4/0x14e0
search_binary_handler+0xb0/0x1e0
So, use spinlock to access dmabuf->name to avoid sleep-while-atomic.
Cc: <stable(a)vger.kernel.org> [5.3+]
Signed-off-by: Charan Teja Kalla <charante(a)codeaurora.org>
Reviewed-by: Michael J. Ruhl <michael.j.ruhl(a)intel.com>
Acked-by: Christian König <christian.koenig(a)amd.com>
[sumits: added comment to spinlock_t definition to avoid warning]
Signed-off-by: Sumit Semwal <sumit.semwal(a)linaro.org>
Link: https://patchwork.freedesktop.org/patch/msgid/a83e7f0d-4e54-9848-4b58-e1acd…
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 412629601ad3..1ca609f66fdf 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -45,10 +45,10 @@ static char *dmabuffs_dname(struct dentry *dentry, char *buffer, int buflen)
size_t ret = 0;
dmabuf = dentry->d_fsdata;
- dma_resv_lock(dmabuf->resv, NULL);
+ spin_lock(&dmabuf->name_lock);
if (dmabuf->name)
ret = strlcpy(name, dmabuf->name, DMA_BUF_NAME_LEN);
- dma_resv_unlock(dmabuf->resv);
+ spin_unlock(&dmabuf->name_lock);
return dynamic_dname(dentry, buffer, buflen, "/%s:%s",
dentry->d_name.name, ret > 0 ? name : "");
@@ -338,8 +338,10 @@ static long dma_buf_set_name(struct dma_buf *dmabuf, const char __user *buf)
kfree(name);
goto out_unlock;
}
+ spin_lock(&dmabuf->name_lock);
kfree(dmabuf->name);
dmabuf->name = name;
+ spin_unlock(&dmabuf->name_lock);
out_unlock:
dma_resv_unlock(dmabuf->resv);
@@ -402,10 +404,10 @@ static void dma_buf_show_fdinfo(struct seq_file *m, struct file *file)
/* Don't count the temporary reference taken inside procfs seq_show */
seq_printf(m, "count:\t%ld\n", file_count(dmabuf->file) - 1);
seq_printf(m, "exp_name:\t%s\n", dmabuf->exp_name);
- dma_resv_lock(dmabuf->resv, NULL);
+ spin_lock(&dmabuf->name_lock);
if (dmabuf->name)
seq_printf(m, "name:\t%s\n", dmabuf->name);
- dma_resv_unlock(dmabuf->resv);
+ spin_unlock(&dmabuf->name_lock);
}
static const struct file_operations dma_buf_fops = {
@@ -542,6 +544,7 @@ struct dma_buf *dma_buf_export(const struct dma_buf_export_info *exp_info)
dmabuf->size = exp_info->size;
dmabuf->exp_name = exp_info->exp_name;
dmabuf->owner = exp_info->owner;
+ spin_lock_init(&dmabuf->name_lock);
init_waitqueue_head(&dmabuf->poll);
dmabuf->cb_excl.poll = dmabuf->cb_shared.poll = &dmabuf->poll;
dmabuf->cb_excl.active = dmabuf->cb_shared.active = 0;
diff --git a/include/linux/dma-buf.h b/include/linux/dma-buf.h
index ab0c156abee6..a2ca294eaebe 100644
--- a/include/linux/dma-buf.h
+++ b/include/linux/dma-buf.h
@@ -311,6 +311,7 @@ struct dma_buf {
void *vmap_ptr;
const char *exp_name;
const char *name;
+ spinlock_t name_lock; /* spinlock to protect name access */
struct module *owner;
struct list_head list_node;
void *priv;
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From a683f390b93f4d1292f849fc48d28e322046120f Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx(a)linutronix.de>
Date: Mon, 4 Jul 2016 09:50:36 +0000
Subject: [PATCH] timers: Forward the wheel clock whenever possible
The wheel clock is stale when a CPU goes into a long idle sleep. This has the
side effect that timers which are queued end up in the outer wheel levels.
That results in coarser granularity.
To solve this, we keep track of the idle state and forward the wheel clock
whenever possible.
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Arjan van de Ven <arjan(a)infradead.org>
Cc: Chris Mason <clm(a)fb.com>
Cc: Eric Dumazet <edumazet(a)google.com>
Cc: Frederic Weisbecker <fweisbec(a)gmail.com>
Cc: George Spelvin <linux(a)sciencehorizons.net>
Cc: Josh Triplett <josh(a)joshtriplett.org>
Cc: Len Brown <lenb(a)kernel.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Paul E. McKenney <paulmck(a)linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: rt(a)linutronix.de
Link: http://lkml.kernel.org/r/20160704094342.512039360@linutronix.de
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
diff --git a/kernel/time/tick-internal.h b/kernel/time/tick-internal.h
index 966a5a6fdd0a..f738251000fe 100644
--- a/kernel/time/tick-internal.h
+++ b/kernel/time/tick-internal.h
@@ -164,3 +164,4 @@ static inline void timers_update_migration(bool update_nohz) { }
DECLARE_PER_CPU(struct hrtimer_cpu_base, hrtimer_bases);
extern u64 get_next_timer_interrupt(unsigned long basej, u64 basem);
+void timer_clear_idle(void);
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 69abc7bfe80f..5d81f9aa30d2 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -700,6 +700,12 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
delta = next_tick - basemono;
if (delta <= (u64)TICK_NSEC) {
tick.tv64 = 0;
+
+ /*
+ * Tell the timer code that the base is not idle, i.e. undo
+ * the effect of get_next_timer_interrupt():
+ */
+ timer_clear_idle();
/*
* We've not stopped the tick yet, and there's a timer in the
* next period, so no point in stopping it either, bail.
@@ -809,6 +815,12 @@ static void tick_nohz_restart_sched_tick(struct tick_sched *ts, ktime_t now)
tick_do_update_jiffies64(now);
cpu_load_update_nohz_stop();
+ /*
+ * Clear the timer idle flag, so we avoid IPIs on remote queueing and
+ * the clock forward checks in the enqueue path:
+ */
+ timer_clear_idle();
+
calc_load_exit_idle();
touch_softlockup_watchdog_sched();
/*
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 658051c97a3c..9339d71ee998 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -196,9 +196,11 @@ struct timer_base {
spinlock_t lock;
struct timer_list *running_timer;
unsigned long clk;
+ unsigned long next_expiry;
unsigned int cpu;
bool migration_enabled;
bool nohz_active;
+ bool is_idle;
DECLARE_BITMAP(pending_map, WHEEL_SIZE);
struct hlist_head vectors[WHEEL_SIZE];
} ____cacheline_aligned;
@@ -519,24 +521,37 @@ static void internal_add_timer(struct timer_base *base, struct timer_list *timer
{
__internal_add_timer(base, timer);
+ if (!IS_ENABLED(CONFIG_NO_HZ_COMMON) || !base->nohz_active)
+ return;
+
/*
- * Check whether the other CPU is in dynticks mode and needs
- * to be triggered to reevaluate the timer wheel. We are
- * protected against the other CPU fiddling with the timer by
- * holding the timer base lock. This also makes sure that a
- * CPU on the way to stop its tick can not evaluate the timer
- * wheel.
- *
- * Spare the IPI for deferrable timers on idle targets though.
- * The next busy ticks will take care of it. Except full dynticks
- * require special care against races with idle_cpu(), lets deal
- * with that later.
+ * TODO: This wants some optimizing similar to the code below, but we
+ * will do that when we switch from push to pull for deferrable timers.
*/
- if (IS_ENABLED(CONFIG_NO_HZ_COMMON) && base->nohz_active) {
- if (!(timer->flags & TIMER_DEFERRABLE) ||
- tick_nohz_full_cpu(base->cpu))
+ if (timer->flags & TIMER_DEFERRABLE) {
+ if (tick_nohz_full_cpu(base->cpu))
wake_up_nohz_cpu(base->cpu);
+ return;
}
+
+ /*
+ * We might have to IPI the remote CPU if the base is idle and the
+ * timer is not deferrable. If the other CPU is on the way to idle
+ * then it can't set base->is_idle as we hold the base lock:
+ */
+ if (!base->is_idle)
+ return;
+
+ /* Check whether this is the new first expiring timer: */
+ if (time_after_eq(timer->expires, base->next_expiry))
+ return;
+
+ /*
+ * Set the next expiry time and kick the CPU so it can reevaluate the
+ * wheel:
+ */
+ base->next_expiry = timer->expires;
+ wake_up_nohz_cpu(base->cpu);
}
#ifdef CONFIG_TIMER_STATS
@@ -844,10 +859,11 @@ static inline struct timer_base *get_timer_base(u32 tflags)
return get_timer_cpu_base(tflags, tflags & TIMER_CPUMASK);
}
-static inline struct timer_base *get_target_base(struct timer_base *base,
- unsigned tflags)
+#ifdef CONFIG_NO_HZ_COMMON
+static inline struct timer_base *
+__get_target_base(struct timer_base *base, unsigned tflags)
{
-#if defined(CONFIG_NO_HZ_COMMON) && defined(CONFIG_SMP)
+#ifdef CONFIG_SMP
if ((tflags & TIMER_PINNED) || !base->migration_enabled)
return get_timer_this_cpu_base(tflags);
return get_timer_cpu_base(tflags, get_nohz_timer_target());
@@ -856,6 +872,43 @@ static inline struct timer_base *get_target_base(struct timer_base *base,
#endif
}
+static inline void forward_timer_base(struct timer_base *base)
+{
+ /*
+ * We only forward the base when it's idle and we have a delta between
+ * base clock and jiffies.
+ */
+ if (!base->is_idle || (long) (jiffies - base->clk) < 2)
+ return;
+
+ /*
+ * If the next expiry value is > jiffies, then we fast forward to
+ * jiffies otherwise we forward to the next expiry value.
+ */
+ if (time_after(base->next_expiry, jiffies))
+ base->clk = jiffies;
+ else
+ base->clk = base->next_expiry;
+}
+#else
+static inline struct timer_base *
+__get_target_base(struct timer_base *base, unsigned tflags)
+{
+ return get_timer_this_cpu_base(tflags);
+}
+
+static inline void forward_timer_base(struct timer_base *base) { }
+#endif
+
+static inline struct timer_base *
+get_target_base(struct timer_base *base, unsigned tflags)
+{
+ struct timer_base *target = __get_target_base(base, tflags);
+
+ forward_timer_base(target);
+ return target;
+}
+
/*
* We are using hashed locking: Holding per_cpu(timer_bases[x]).lock means
* that all timers which are tied to this base are locked, and the base itself
@@ -1417,16 +1470,49 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem)
spin_lock(&base->lock);
nextevt = __next_timer_interrupt(base);
- spin_unlock(&base->lock);
+ base->next_expiry = nextevt;
+ /*
+ * We have a fresh next event. Check whether we can forward the base:
+ */
+ if (time_after(nextevt, jiffies))
+ base->clk = jiffies;
+ else if (time_after(nextevt, base->clk))
+ base->clk = nextevt;
- if (time_before_eq(nextevt, basej))
+ if (time_before_eq(nextevt, basej)) {
expires = basem;
- else
+ base->is_idle = false;
+ } else {
expires = basem + (nextevt - basej) * TICK_NSEC;
+ /*
+ * If we expect to sleep more than a tick, mark the base idle:
+ */
+ if ((expires - basem) > TICK_NSEC)
+ base->is_idle = true;
+ }
+ spin_unlock(&base->lock);
return cmp_next_hrtimer_event(basem, expires);
}
+/**
+ * timer_clear_idle - Clear the idle state of the timer base
+ *
+ * Called with interrupts disabled
+ */
+void timer_clear_idle(void)
+{
+ struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_STD]);
+
+ /*
+ * We do this unlocked. The worst outcome is a remote enqueue sending
+ * a pointless IPI, but taking the lock would just make the window for
+ * sending the IPI a few instructions smaller for the cost of taking
+ * the lock in the exit from idle path.
+ */
+ base->is_idle = false;
+}
+
static int collect_expired_timers(struct timer_base *base,
struct hlist_head *heads)
{
@@ -1440,7 +1526,7 @@ static int collect_expired_timers(struct timer_base *base,
/*
* If the next timer is ahead of time forward to current
- * jiffies, otherwise forward to the next expiry time.
+ * jiffies, otherwise forward to the next expiry time:
*/
if (time_after(next, jiffies)) {
/* The call site will increment clock! */
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From a683f390b93f4d1292f849fc48d28e322046120f Mon Sep 17 00:00:00 2001
From: Thomas Gleixner <tglx(a)linutronix.de>
Date: Mon, 4 Jul 2016 09:50:36 +0000
Subject: [PATCH] timers: Forward the wheel clock whenever possible
The wheel clock is stale when a CPU goes into a long idle sleep. This has the
side effect that timers which are queued end up in the outer wheel levels.
That results in coarser granularity.
To solve this, we keep track of the idle state and forward the wheel clock
whenever possible.
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Arjan van de Ven <arjan(a)infradead.org>
Cc: Chris Mason <clm(a)fb.com>
Cc: Eric Dumazet <edumazet(a)google.com>
Cc: Frederic Weisbecker <fweisbec(a)gmail.com>
Cc: George Spelvin <linux(a)sciencehorizons.net>
Cc: Josh Triplett <josh(a)joshtriplett.org>
Cc: Len Brown <lenb(a)kernel.org>
Cc: Linus Torvalds <torvalds(a)linux-foundation.org>
Cc: Paul E. McKenney <paulmck(a)linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: rt(a)linutronix.de
Link: http://lkml.kernel.org/r/20160704094342.512039360@linutronix.de
Signed-off-by: Ingo Molnar <mingo(a)kernel.org>
diff --git a/kernel/time/tick-internal.h b/kernel/time/tick-internal.h
index 966a5a6fdd0a..f738251000fe 100644
--- a/kernel/time/tick-internal.h
+++ b/kernel/time/tick-internal.h
@@ -164,3 +164,4 @@ static inline void timers_update_migration(bool update_nohz) { }
DECLARE_PER_CPU(struct hrtimer_cpu_base, hrtimer_bases);
extern u64 get_next_timer_interrupt(unsigned long basej, u64 basem);
+void timer_clear_idle(void);
diff --git a/kernel/time/tick-sched.c b/kernel/time/tick-sched.c
index 69abc7bfe80f..5d81f9aa30d2 100644
--- a/kernel/time/tick-sched.c
+++ b/kernel/time/tick-sched.c
@@ -700,6 +700,12 @@ static ktime_t tick_nohz_stop_sched_tick(struct tick_sched *ts,
delta = next_tick - basemono;
if (delta <= (u64)TICK_NSEC) {
tick.tv64 = 0;
+
+ /*
+ * Tell the timer code that the base is not idle, i.e. undo
+ * the effect of get_next_timer_interrupt():
+ */
+ timer_clear_idle();
/*
* We've not stopped the tick yet, and there's a timer in the
* next period, so no point in stopping it either, bail.
@@ -809,6 +815,12 @@ static void tick_nohz_restart_sched_tick(struct tick_sched *ts, ktime_t now)
tick_do_update_jiffies64(now);
cpu_load_update_nohz_stop();
+ /*
+ * Clear the timer idle flag, so we avoid IPIs on remote queueing and
+ * the clock forward checks in the enqueue path:
+ */
+ timer_clear_idle();
+
calc_load_exit_idle();
touch_softlockup_watchdog_sched();
/*
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 658051c97a3c..9339d71ee998 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -196,9 +196,11 @@ struct timer_base {
spinlock_t lock;
struct timer_list *running_timer;
unsigned long clk;
+ unsigned long next_expiry;
unsigned int cpu;
bool migration_enabled;
bool nohz_active;
+ bool is_idle;
DECLARE_BITMAP(pending_map, WHEEL_SIZE);
struct hlist_head vectors[WHEEL_SIZE];
} ____cacheline_aligned;
@@ -519,24 +521,37 @@ static void internal_add_timer(struct timer_base *base, struct timer_list *timer
{
__internal_add_timer(base, timer);
+ if (!IS_ENABLED(CONFIG_NO_HZ_COMMON) || !base->nohz_active)
+ return;
+
/*
- * Check whether the other CPU is in dynticks mode and needs
- * to be triggered to reevaluate the timer wheel. We are
- * protected against the other CPU fiddling with the timer by
- * holding the timer base lock. This also makes sure that a
- * CPU on the way to stop its tick can not evaluate the timer
- * wheel.
- *
- * Spare the IPI for deferrable timers on idle targets though.
- * The next busy ticks will take care of it. Except full dynticks
- * require special care against races with idle_cpu(), lets deal
- * with that later.
+ * TODO: This wants some optimizing similar to the code below, but we
+ * will do that when we switch from push to pull for deferrable timers.
*/
- if (IS_ENABLED(CONFIG_NO_HZ_COMMON) && base->nohz_active) {
- if (!(timer->flags & TIMER_DEFERRABLE) ||
- tick_nohz_full_cpu(base->cpu))
+ if (timer->flags & TIMER_DEFERRABLE) {
+ if (tick_nohz_full_cpu(base->cpu))
wake_up_nohz_cpu(base->cpu);
+ return;
}
+
+ /*
+ * We might have to IPI the remote CPU if the base is idle and the
+ * timer is not deferrable. If the other CPU is on the way to idle
+ * then it can't set base->is_idle as we hold the base lock:
+ */
+ if (!base->is_idle)
+ return;
+
+ /* Check whether this is the new first expiring timer: */
+ if (time_after_eq(timer->expires, base->next_expiry))
+ return;
+
+ /*
+ * Set the next expiry time and kick the CPU so it can reevaluate the
+ * wheel:
+ */
+ base->next_expiry = timer->expires;
+ wake_up_nohz_cpu(base->cpu);
}
#ifdef CONFIG_TIMER_STATS
@@ -844,10 +859,11 @@ static inline struct timer_base *get_timer_base(u32 tflags)
return get_timer_cpu_base(tflags, tflags & TIMER_CPUMASK);
}
-static inline struct timer_base *get_target_base(struct timer_base *base,
- unsigned tflags)
+#ifdef CONFIG_NO_HZ_COMMON
+static inline struct timer_base *
+__get_target_base(struct timer_base *base, unsigned tflags)
{
-#if defined(CONFIG_NO_HZ_COMMON) && defined(CONFIG_SMP)
+#ifdef CONFIG_SMP
if ((tflags & TIMER_PINNED) || !base->migration_enabled)
return get_timer_this_cpu_base(tflags);
return get_timer_cpu_base(tflags, get_nohz_timer_target());
@@ -856,6 +872,43 @@ static inline struct timer_base *get_target_base(struct timer_base *base,
#endif
}
+static inline void forward_timer_base(struct timer_base *base)
+{
+ /*
+ * We only forward the base when it's idle and we have a delta between
+ * base clock and jiffies.
+ */
+ if (!base->is_idle || (long) (jiffies - base->clk) < 2)
+ return;
+
+ /*
+ * If the next expiry value is > jiffies, then we fast forward to
+ * jiffies otherwise we forward to the next expiry value.
+ */
+ if (time_after(base->next_expiry, jiffies))
+ base->clk = jiffies;
+ else
+ base->clk = base->next_expiry;
+}
+#else
+static inline struct timer_base *
+__get_target_base(struct timer_base *base, unsigned tflags)
+{
+ return get_timer_this_cpu_base(tflags);
+}
+
+static inline void forward_timer_base(struct timer_base *base) { }
+#endif
+
+static inline struct timer_base *
+get_target_base(struct timer_base *base, unsigned tflags)
+{
+ struct timer_base *target = __get_target_base(base, tflags);
+
+ forward_timer_base(target);
+ return target;
+}
+
/*
* We are using hashed locking: Holding per_cpu(timer_bases[x]).lock means
* that all timers which are tied to this base are locked, and the base itself
@@ -1417,16 +1470,49 @@ u64 get_next_timer_interrupt(unsigned long basej, u64 basem)
spin_lock(&base->lock);
nextevt = __next_timer_interrupt(base);
- spin_unlock(&base->lock);
+ base->next_expiry = nextevt;
+ /*
+ * We have a fresh next event. Check whether we can forward the base:
+ */
+ if (time_after(nextevt, jiffies))
+ base->clk = jiffies;
+ else if (time_after(nextevt, base->clk))
+ base->clk = nextevt;
- if (time_before_eq(nextevt, basej))
+ if (time_before_eq(nextevt, basej)) {
expires = basem;
- else
+ base->is_idle = false;
+ } else {
expires = basem + (nextevt - basej) * TICK_NSEC;
+ /*
+ * If we expect to sleep more than a tick, mark the base idle:
+ */
+ if ((expires - basem) > TICK_NSEC)
+ base->is_idle = true;
+ }
+ spin_unlock(&base->lock);
return cmp_next_hrtimer_event(basem, expires);
}
+/**
+ * timer_clear_idle - Clear the idle state of the timer base
+ *
+ * Called with interrupts disabled
+ */
+void timer_clear_idle(void)
+{
+ struct timer_base *base = this_cpu_ptr(&timer_bases[BASE_STD]);
+
+ /*
+ * We do this unlocked. The worst outcome is a remote enqueue sending
+ * a pointless IPI, but taking the lock would just make the window for
+ * sending the IPI a few instructions smaller for the cost of taking
+ * the lock in the exit from idle path.
+ */
+ base->is_idle = false;
+}
+
static int collect_expired_timers(struct timer_base *base,
struct hlist_head *heads)
{
@@ -1440,7 +1526,7 @@ static int collect_expired_timers(struct timer_base *base,
/*
* If the next timer is ahead of time forward to current
- * jiffies, otherwise forward to the next expiry time.
+ * jiffies, otherwise forward to the next expiry time:
*/
if (time_after(next, jiffies)) {
/* The call site will increment clock! */
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: 517a660b46de - regmap: debugfs: Don't sleep while atomic for fast_io regmaps
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://cki-artifacts.s3.us-east-2.amazonaws.com/index.html?prefix=dataware…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
Host 2:
✅ Boot test
✅ ACPI table test
✅ ACPI enabled test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
ppc64le:
Host 1:
✅ Boot test
🚧 ✅ kdump - sysrq-c
Host 2:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
⚡⚡⚡ LTP
⚡⚡⚡ Loopdev Sanity
⚡⚡⚡ Memory function: memfd_create
⚡⚡⚡ AMTU (Abstract Machine Test Utility)
⚡⚡⚡ Networking bridge: sanity
⚡⚡⚡ Ethernet drivers sanity
⚡⚡⚡ Networking socket: fuzz
⚡⚡⚡ Networking route: pmtu
⚡⚡⚡ Networking route_func - local
⚡⚡⚡ Networking route_func - forward
⚡⚡⚡ Networking TCP: keepalive test
⚡⚡⚡ Networking UDP: socket
⚡⚡⚡ Networking tunnel: geneve basic test
⚡⚡⚡ Networking tunnel: gre basic
⚡⚡⚡ L2TP basic test
⚡⚡⚡ Networking tunnel: vxlan basic
⚡⚡⚡ Networking ipsec: basic netns - tunnel
⚡⚡⚡ Libkcapi AF_ALG test
⚡⚡⚡ pciutils: update pci ids test
⚡⚡⚡ ALSA PCM loopback test
⚡⚡⚡ ALSA Control (mixer) Userspace Element test
🚧 ⚡⚡⚡ CIFS Connectathon
🚧 ⚡⚡⚡ POSIX pjd-fstest suites
🚧 ⚡⚡⚡ jvm - DaCapo Benchmark Suite
🚧 ⚡⚡⚡ jvm - jcstress tests
🚧 ⚡⚡⚡ Memory function: kaslr
🚧 ⚡⚡⚡ Networking firewall: basic netfilter test
🚧 ⚡⚡⚡ audit: audit testsuite test
🚧 ⚡⚡⚡ trace: ftrace/tracer
s390x:
Host 1:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ stress: stress-ng
🚧 ✅ Storage blktests
Host 2:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
🚧 ⚡⚡⚡ kdump - sysrq-c
🚧 ⚡⚡⚡ kdump - file-load
Host 3:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
x86_64:
Host 1:
✅ Boot test
🚧 ✅ kdump - sysrq-c
🚧 ✅ kdump - file-load
Host 2:
✅ Boot test
✅ ACPI table test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: sanity smoke test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 3:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ CPU: Frequency Driver Test
🚧 ✅ CPU: Idle Test
🚧 ✅ xfstests - btrfs
🚧 ✅ IOMMU boot test
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ power-management: cpupower/sanity test
🚧 ✅ Storage blktests
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
[ Upstream commit 01cfcde9c26d8555f0e6e9aea9d6049f87683998 ]
task_h_load() can return 0 in some situations like running stress-ng
mmapfork, which forks thousands of threads, in a sched group on a 224 cores
system. The load balance doesn't handle this correctly because
env->imbalance never decreases and it will stop pulling tasks only after
reaching loop_max, which can be equal to the number of running tasks of
the cfs. Make sure that imbalance will be decreased by at least 1.
misfit task is the other feature that doesn't handle correctly such
situation although it's probably more difficult to face the problem
because of the smaller number of CPUs and running tasks on heterogenous
system.
We can't simply ensure that task_h_load() returns at least one because it
would imply to handle underflow in other places.
Signed-off-by: Vincent Guittot <vincent.guittot(a)linaro.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
Reviewed-by: Valentin Schneider <valentin.schneider(a)arm.com>
Reviewed-by: Dietmar Eggemann <dietmar.eggemann(a)arm.com>
Tested-by: Dietmar Eggemann <dietmar.eggemann(a)arm.com>
Cc: <stable(a)vger.kernel.org> # v5.4
cc: Sasha Levin <sashal(a)kernel.org>
Link: https://lkml.kernel.org/r/20200710152426.16981-1-vincent.guittot@linaro.org
---
kernel/sched/fair.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 2f81e4ae844e..9b16080093be 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -3824,7 +3824,11 @@ static inline void update_misfit_status(struct task_struct *p, struct rq *rq)
return;
}
- rq->misfit_task_load = task_h_load(p);
+ /*
+ * Make sure that misfit_task_load will not be null even if
+ * task_h_load() returns 0.
+ */
+ rq->misfit_task_load = max_t(unsigned long, task_h_load(p), 1);
}
#else /* CONFIG_SMP */
@@ -7407,7 +7411,15 @@ static int detach_tasks(struct lb_env *env)
if (!can_migrate_task(p, env))
goto next;
- load = task_h_load(p);
+ /*
+ * Depending of the number of CPUs and tasks and the
+ * cgroup hierarchy, task_h_load() can return a null
+ * value. Make sure that env->imbalance decreases
+ * otherwise detach_tasks() will stop only after
+ * detaching up to loop_max tasks.
+ */
+ load = max_t(unsigned long, task_h_load(p), 1);
+
if (sched_feat(LB_MIN) && load < 16 && !env->sd->nr_balance_failed)
goto next;
--
2.17.1
This is a note to let you know that I've just added the patch titled
staging: comedi: addi_apci_1564: check INSN_CONFIG_DIGITAL_TRIG shift
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 926234f1b8434c4409aa4c53637aa3362ca07cea Mon Sep 17 00:00:00 2001
From: Ian Abbott <abbotti(a)mev.co.uk>
Date: Fri, 17 Jul 2020 15:52:56 +0100
Subject: staging: comedi: addi_apci_1564: check INSN_CONFIG_DIGITAL_TRIG shift
The `INSN_CONFIG` comedi instruction with sub-instruction code
`INSN_CONFIG_DIGITAL_TRIG` includes a base channel in `data[3]`. This is
used as a right shift amount for other bitmask values without being
checked. Shift amounts greater than or equal to 32 will result in
undefined behavior. Add code to deal with this.
Fixes: 1e15687ea472 ("staging: comedi: addi_apci_1564: add Change-of-State interrupt subdevice and required functions")
Cc: <stable(a)vger.kernel.org> #3.17+
Signed-off-by: Ian Abbott <abbotti(a)mev.co.uk>
Link: https://lore.kernel.org/r/20200717145257.112660-4-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
.../staging/comedi/drivers/addi_apci_1564.c | 20 +++++++++++++------
1 file changed, 14 insertions(+), 6 deletions(-)
diff --git a/drivers/staging/comedi/drivers/addi_apci_1564.c b/drivers/staging/comedi/drivers/addi_apci_1564.c
index 10501fe6bb25..1268ba34be5f 100644
--- a/drivers/staging/comedi/drivers/addi_apci_1564.c
+++ b/drivers/staging/comedi/drivers/addi_apci_1564.c
@@ -331,14 +331,22 @@ static int apci1564_cos_insn_config(struct comedi_device *dev,
unsigned int *data)
{
struct apci1564_private *devpriv = dev->private;
- unsigned int shift, oldmask;
+ unsigned int shift, oldmask, himask, lomask;
switch (data[0]) {
case INSN_CONFIG_DIGITAL_TRIG:
if (data[1] != 0)
return -EINVAL;
shift = data[3];
- oldmask = (1U << shift) - 1;
+ if (shift < 32) {
+ oldmask = (1U << shift) - 1;
+ himask = data[4] << shift;
+ lomask = data[5] << shift;
+ } else {
+ oldmask = 0xffffffffu;
+ himask = 0;
+ lomask = 0;
+ }
switch (data[2]) {
case COMEDI_DIGITAL_TRIG_DISABLE:
devpriv->ctrl = 0;
@@ -362,8 +370,8 @@ static int apci1564_cos_insn_config(struct comedi_device *dev,
devpriv->mode2 &= oldmask;
}
/* configure specified channels */
- devpriv->mode1 |= data[4] << shift;
- devpriv->mode2 |= data[5] << shift;
+ devpriv->mode1 |= himask;
+ devpriv->mode2 |= lomask;
break;
case COMEDI_DIGITAL_TRIG_ENABLE_LEVELS:
if (devpriv->ctrl != (APCI1564_DI_IRQ_ENA |
@@ -380,8 +388,8 @@ static int apci1564_cos_insn_config(struct comedi_device *dev,
devpriv->mode2 &= oldmask;
}
/* configure specified channels */
- devpriv->mode1 |= data[4] << shift;
- devpriv->mode2 |= data[5] << shift;
+ devpriv->mode1 |= himask;
+ devpriv->mode2 |= lomask;
break;
default:
return -EINVAL;
--
2.27.0
This is a note to let you know that I've just added the patch titled
staging: comedi: addi_apci_1500: check INSN_CONFIG_DIGITAL_TRIG shift
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From fc846e9db67c7e808d77bf9e2ef3d49e3820ce5d Mon Sep 17 00:00:00 2001
From: Ian Abbott <abbotti(a)mev.co.uk>
Date: Fri, 17 Jul 2020 15:52:57 +0100
Subject: staging: comedi: addi_apci_1500: check INSN_CONFIG_DIGITAL_TRIG shift
The `INSN_CONFIG` comedi instruction with sub-instruction code
`INSN_CONFIG_DIGITAL_TRIG` includes a base channel in `data[3]`. This is
used as a right shift amount for other bitmask values without being
checked. Shift amounts greater than or equal to 32 will result in
undefined behavior. Add code to deal with this, adjusting the checks
for invalid channels so that enabled channel bits that would have been
lost by shifting are also checked for validity. Only channels 0 to 15
are valid.
Fixes: a8c66b684efaf ("staging: comedi: addi_apci_1500: rewrite the subdevice support functions")
Cc: <stable(a)vger.kernel.org> #4.0+: ef75e14a6c93: staging: comedi: verify array index is correct before using it
Cc: <stable(a)vger.kernel.org> #4.0+
Signed-off-by: Ian Abbott <abbotti(a)mev.co.uk>
Link: https://lore.kernel.org/r/20200717145257.112660-5-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
.../staging/comedi/drivers/addi_apci_1500.c | 24 +++++++++++++++----
1 file changed, 19 insertions(+), 5 deletions(-)
diff --git a/drivers/staging/comedi/drivers/addi_apci_1500.c b/drivers/staging/comedi/drivers/addi_apci_1500.c
index 689acd69a1b9..816dd25b9d0e 100644
--- a/drivers/staging/comedi/drivers/addi_apci_1500.c
+++ b/drivers/staging/comedi/drivers/addi_apci_1500.c
@@ -452,13 +452,14 @@ static int apci1500_di_cfg_trig(struct comedi_device *dev,
struct apci1500_private *devpriv = dev->private;
unsigned int trig = data[1];
unsigned int shift = data[3];
- unsigned int hi_mask = data[4] << shift;
- unsigned int lo_mask = data[5] << shift;
- unsigned int chan_mask = hi_mask | lo_mask;
- unsigned int old_mask = (1 << shift) - 1;
+ unsigned int hi_mask;
+ unsigned int lo_mask;
+ unsigned int chan_mask;
+ unsigned int old_mask;
unsigned int pm;
unsigned int pt;
unsigned int pp;
+ unsigned int invalid_chan;
if (trig > 1) {
dev_dbg(dev->class_dev,
@@ -466,7 +467,20 @@ static int apci1500_di_cfg_trig(struct comedi_device *dev,
return -EINVAL;
}
- if (chan_mask > 0xffff) {
+ if (shift <= 16) {
+ hi_mask = data[4] << shift;
+ lo_mask = data[5] << shift;
+ old_mask = (1U << shift) - 1;
+ invalid_chan = (data[4] | data[5]) >> (16 - shift);
+ } else {
+ hi_mask = 0;
+ lo_mask = 0;
+ old_mask = 0xffff;
+ invalid_chan = data[4] | data[5];
+ }
+ chan_mask = hi_mask | lo_mask;
+
+ if (invalid_chan) {
dev_dbg(dev->class_dev, "invalid digital trigger channel\n");
return -EINVAL;
}
--
2.27.0
This is a note to let you know that I've just added the patch titled
staging: comedi: ni_6527: fix INSN_CONFIG_DIGITAL_TRIG support
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From f07804ec77d77f8a9dcf570a24154e17747bc82f Mon Sep 17 00:00:00 2001
From: Ian Abbott <abbotti(a)mev.co.uk>
Date: Fri, 17 Jul 2020 15:52:54 +0100
Subject: staging: comedi: ni_6527: fix INSN_CONFIG_DIGITAL_TRIG support
`ni6527_intr_insn_config()` processes `INSN_CONFIG` comedi instructions
for the "interrupt" subdevice. When `data[0]` is
`INSN_CONFIG_DIGITAL_TRIG` it is configuring the digital trigger. When
`data[2]` is `COMEDI_DIGITAL_TRIG_ENABLE_EDGES` it is configuring rising
and falling edge detection for the digital trigger, using a base channel
number (or shift amount) in `data[3]`, a rising edge bitmask in
`data[4]` and falling edge bitmask in `data[5]`.
If the base channel number (shift amount) is greater than or equal to
the number of channels (24) of the digital input subdevice, there are no
changes to the rising and falling edges, so the mask of channels to be
changed can be set to 0, otherwise the mask of channels to be changed,
and the rising and falling edge bitmasks are shifted by the base channel
number before calling `ni6527_set_edge_detection()` to change the
appropriate registers. Unfortunately, the code is comparing the base
channel (shift amount) to the interrupt subdevice's number of channels
(1) instead of the digital input subdevice's number of channels (24).
Fix it by comparing to 32 because all shift amounts for an `unsigned
int` must be less than that and everything from bit 24 upwards is
ignored by `ni6527_set_edge_detection()` anyway.
Fixes: 110f9e687c1a8 ("staging: comedi: ni_6527: support INSN_CONFIG_DIGITAL_TRIG")
Cc: <stable(a)vger.kernel.org> # 3.17+
Signed-off-by: Ian Abbott <abbotti(a)mev.co.uk>
Link: https://lore.kernel.org/r/20200717145257.112660-2-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/staging/comedi/drivers/ni_6527.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/staging/comedi/drivers/ni_6527.c b/drivers/staging/comedi/drivers/ni_6527.c
index 4d1eccb5041d..4518c2680b7c 100644
--- a/drivers/staging/comedi/drivers/ni_6527.c
+++ b/drivers/staging/comedi/drivers/ni_6527.c
@@ -332,7 +332,7 @@ static int ni6527_intr_insn_config(struct comedi_device *dev,
case COMEDI_DIGITAL_TRIG_ENABLE_EDGES:
/* check shift amount */
shift = data[3];
- if (shift >= s->n_chan) {
+ if (shift >= 32) {
mask = 0;
rising = 0;
falling = 0;
--
2.27.0
This is a note to let you know that I've just added the patch titled
staging: comedi: addi_apci_1032: check INSN_CONFIG_DIGITAL_TRIG shift
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 0bd0db42a030b75c20028c7ba6e327b9cb554116 Mon Sep 17 00:00:00 2001
From: Ian Abbott <abbotti(a)mev.co.uk>
Date: Fri, 17 Jul 2020 15:52:55 +0100
Subject: staging: comedi: addi_apci_1032: check INSN_CONFIG_DIGITAL_TRIG shift
The `INSN_CONFIG` comedi instruction with sub-instruction code
`INSN_CONFIG_DIGITAL_TRIG` includes a base channel in `data[3]`. This is
used as a right shift amount for other bitmask values without being
checked. Shift amounts greater than or equal to 32 will result in
undefined behavior. Add code to deal with this.
Fixes: 33cdce6293dcc ("staging: comedi: addi_apci_1032: conform to new INSN_CONFIG_DIGITAL_TRIG")
Cc: <stable(a)vger.kernel.org> #3.8+
Signed-off-by: Ian Abbott <abbotti(a)mev.co.uk>
Link: https://lore.kernel.org/r/20200717145257.112660-3-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
.../staging/comedi/drivers/addi_apci_1032.c | 20 +++++++++++++------
1 file changed, 14 insertions(+), 6 deletions(-)
diff --git a/drivers/staging/comedi/drivers/addi_apci_1032.c b/drivers/staging/comedi/drivers/addi_apci_1032.c
index 560649be9d13..e035c9f757a1 100644
--- a/drivers/staging/comedi/drivers/addi_apci_1032.c
+++ b/drivers/staging/comedi/drivers/addi_apci_1032.c
@@ -106,14 +106,22 @@ static int apci1032_cos_insn_config(struct comedi_device *dev,
unsigned int *data)
{
struct apci1032_private *devpriv = dev->private;
- unsigned int shift, oldmask;
+ unsigned int shift, oldmask, himask, lomask;
switch (data[0]) {
case INSN_CONFIG_DIGITAL_TRIG:
if (data[1] != 0)
return -EINVAL;
shift = data[3];
- oldmask = (1U << shift) - 1;
+ if (shift < 32) {
+ oldmask = (1U << shift) - 1;
+ himask = data[4] << shift;
+ lomask = data[5] << shift;
+ } else {
+ oldmask = 0xffffffffu;
+ himask = 0;
+ lomask = 0;
+ }
switch (data[2]) {
case COMEDI_DIGITAL_TRIG_DISABLE:
devpriv->ctrl = 0;
@@ -136,8 +144,8 @@ static int apci1032_cos_insn_config(struct comedi_device *dev,
devpriv->mode2 &= oldmask;
}
/* configure specified channels */
- devpriv->mode1 |= data[4] << shift;
- devpriv->mode2 |= data[5] << shift;
+ devpriv->mode1 |= himask;
+ devpriv->mode2 |= lomask;
break;
case COMEDI_DIGITAL_TRIG_ENABLE_LEVELS:
if (devpriv->ctrl != (APCI1032_CTRL_INT_ENA |
@@ -154,8 +162,8 @@ static int apci1032_cos_insn_config(struct comedi_device *dev,
devpriv->mode2 &= oldmask;
}
/* configure specified channels */
- devpriv->mode1 |= data[4] << shift;
- devpriv->mode2 |= data[5] << shift;
+ devpriv->mode1 |= himask;
+ devpriv->mode2 |= lomask;
break;
default:
return -EINVAL;
--
2.27.0
From: Michael Trimarchi <michael(a)amarulasolutions.com>
The current pin muxing scheme muxes GPIO_1 pad for USB_OTG_ID
because of which when card is inserted, usb otg is enumerated
and the card is never detected.
[ 64.492645] cfg80211: failed to load regulatory.db
[ 64.492657] imx-sdma 20ec000.sdma: external firmware not found, using ROM firmware
[ 76.343711] ci_hdrc ci_hdrc.0: EHCI Host Controller
[ 76.349742] ci_hdrc ci_hdrc.0: new USB bus registered, assigned bus number 2
[ 76.388862] ci_hdrc ci_hdrc.0: USB 2.0 started, EHCI 1.00
[ 76.396650] usb usb2: New USB device found, idVendor=1d6b, idProduct=0002, bcdDevice= 5.08
[ 76.405412] usb usb2: New USB device strings: Mfr=3, Product=2, SerialNumber=1
[ 76.412763] usb usb2: Product: EHCI Host Controller
[ 76.417666] usb usb2: Manufacturer: Linux 5.8.0-rc1-next-20200618 ehci_hcd
[ 76.424623] usb usb2: SerialNumber: ci_hdrc.0
[ 76.431755] hub 2-0:1.0: USB hub found
[ 76.435862] hub 2-0:1.0: 1 port detected
The TRM mentions GPIO_1 pad should be muxed/assigned for card detect
and ENET_RX_ER pad for USB_OTG_ID for proper operation.
This patch fixes pin muxing as per TRM and is tested on a
i.Core 1.5 MX6 DL SOM.
[ 22.449165] mmc0: host does not support reading read-only switch, assuming write-enable
[ 22.459992] mmc0: new high speed SDHC card at address 0001
[ 22.469725] mmcblk0: mmc0:0001 EB1QT 29.8 GiB
[ 22.478856] mmcblk0: p1 p2
Fixes: 6df11287f7c9 ("ARM: dts: imx6q: Add Engicam i.CoreM6 Quad/Dual initial support")
Cc: stable(a)vger.kernel.org
Signed-off-by: Michael Trimarchi <michael(a)amarulasolutions.com>
Signed-off-by: Suniel Mahesh <sunil(a)amarulasolutions.com>
---
Changes for v3:
- Changed subject of the patch, added fixes tag and copied stable kernel
as suggested by Shawn Guo.
Changes for v2:
- Changed patch description as suggested by Michael Trimarchi to make it
more readable/understandable.
NOTE:
- patch tested on i.Core 1.5 MX6 DL
---
arch/arm/boot/dts/imx6qdl-icore.dtsi | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/arch/arm/boot/dts/imx6qdl-icore.dtsi b/arch/arm/boot/dts/imx6qdl-icore.dtsi
index f2f475e..23c318d 100644
--- a/arch/arm/boot/dts/imx6qdl-icore.dtsi
+++ b/arch/arm/boot/dts/imx6qdl-icore.dtsi
@@ -398,7 +398,7 @@
pinctrl_usbotg: usbotggrp {
fsl,pins = <
- MX6QDL_PAD_GPIO_1__USB_OTG_ID 0x17059
+ MX6QDL_PAD_ENET_RX_ER__USB_OTG_ID 0x17059
>;
};
@@ -410,6 +410,7 @@
MX6QDL_PAD_SD1_DAT1__SD1_DATA1 0x17070
MX6QDL_PAD_SD1_DAT2__SD1_DATA2 0x17070
MX6QDL_PAD_SD1_DAT3__SD1_DATA3 0x17070
+ MX6QDL_PAD_GPIO_1__GPIO1_IO01 0x1b0b0
>;
};
--
2.7.4
This is a note to let you know that I've just added the patch titled
staging: rtl8712: handle firmware load failure
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From b4383c971bc5263efe2b0915ba67ebf2bf3f1ee5 Mon Sep 17 00:00:00 2001
From: Rustam Kovhaev <rkovhaev(a)gmail.com>
Date: Thu, 16 Jul 2020 08:13:26 -0700
Subject: staging: rtl8712: handle firmware load failure
when firmware fails to load we should not call unregister_netdev()
this patch fixes a race condition between rtl871x_load_fw_cb() and
r871xu_dev_remove() and fixes the bug reported by syzbot
Reported-by: syzbot+80899a8a8efe8968cde7(a)syzkaller.appspotmail.com
Link: https://syzkaller.appspot.com/bug?extid=80899a8a8efe8968cde7
Signed-off-by: Rustam Kovhaev <rkovhaev(a)gmail.com>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20200716151324.1036204-1-rkovhaev@gmail.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/staging/rtl8712/hal_init.c | 3 ++-
drivers/staging/rtl8712/usb_intf.c | 11 ++++++++---
2 files changed, 10 insertions(+), 4 deletions(-)
diff --git a/drivers/staging/rtl8712/hal_init.c b/drivers/staging/rtl8712/hal_init.c
index ed51023b85a0..715f1fe8b472 100644
--- a/drivers/staging/rtl8712/hal_init.c
+++ b/drivers/staging/rtl8712/hal_init.c
@@ -33,7 +33,6 @@ static void rtl871x_load_fw_cb(const struct firmware *firmware, void *context)
{
struct _adapter *adapter = context;
- complete(&adapter->rtl8712_fw_ready);
if (!firmware) {
struct usb_device *udev = adapter->dvobjpriv.pusbdev;
struct usb_interface *usb_intf = adapter->pusb_intf;
@@ -41,11 +40,13 @@ static void rtl871x_load_fw_cb(const struct firmware *firmware, void *context)
dev_err(&udev->dev, "r8712u: Firmware request failed\n");
usb_put_dev(udev);
usb_set_intfdata(usb_intf, NULL);
+ complete(&adapter->rtl8712_fw_ready);
return;
}
adapter->fw = firmware;
/* firmware available - start netdev */
register_netdev(adapter->pnetdev);
+ complete(&adapter->rtl8712_fw_ready);
}
static const char firmware_file[] = "rtlwifi/rtl8712u.bin";
diff --git a/drivers/staging/rtl8712/usb_intf.c b/drivers/staging/rtl8712/usb_intf.c
index a87562f632a7..2fcd65260f4c 100644
--- a/drivers/staging/rtl8712/usb_intf.c
+++ b/drivers/staging/rtl8712/usb_intf.c
@@ -595,13 +595,17 @@ static void r871xu_dev_remove(struct usb_interface *pusb_intf)
if (pnetdev) {
struct _adapter *padapter = netdev_priv(pnetdev);
- usb_set_intfdata(pusb_intf, NULL);
- release_firmware(padapter->fw);
/* never exit with a firmware callback pending */
wait_for_completion(&padapter->rtl8712_fw_ready);
+ pnetdev = usb_get_intfdata(pusb_intf);
+ usb_set_intfdata(pusb_intf, NULL);
+ if (!pnetdev)
+ goto firmware_load_fail;
+ release_firmware(padapter->fw);
if (drvpriv.drv_registered)
padapter->surprise_removed = true;
- unregister_netdev(pnetdev); /* will call netdev_close() */
+ if (pnetdev->reg_state != NETREG_UNINITIALIZED)
+ unregister_netdev(pnetdev); /* will call netdev_close() */
flush_scheduled_work();
udelay(1);
/* Stop driver mlme relation timer */
@@ -614,6 +618,7 @@ static void r871xu_dev_remove(struct usb_interface *pusb_intf)
*/
usb_put_dev(udev);
}
+firmware_load_fail:
/* If we didn't unplug usb dongle and remove/insert module, driver
* fails on sitesurvey for the first time when device is up.
* Reset usb port for sitesurvey fail issue.
--
2.27.0
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: 84bda0e59a9c - Input: mms114 - add extra compatible for mms345l
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://cki-artifacts.s3.us-east-2.amazonaws.com/index.html?prefix=dataware…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
🚧 ❌ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
Host 2:
✅ Boot test
✅ ACPI table test
✅ ACPI enabled test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
ppc64le:
Host 1:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
🚧 ❌ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
Host 2:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 3:
✅ Boot test
🚧 ✅ kdump - sysrq-c
s390x:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
🚧 ⚡⚡⚡ kdump - sysrq-c
🚧 ⚡⚡⚡ kdump - file-load
Host 2:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 3:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ stress: stress-ng
🚧 ✅ Storage blktests
x86_64:
Host 1:
✅ Boot test
✅ ACPI table test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: sanity smoke test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 2:
✅ Boot test
🚧 ✅ kdump - sysrq-c
🚧 ✅ kdump - file-load
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
⚡⚡⚡ xfstests - ext4
⚡⚡⚡ xfstests - xfs
⚡⚡⚡ selinux-policy: serge-testsuite
⚡⚡⚡ storage: software RAID testing
⚡⚡⚡ stress: stress-ng
🚧 ⚡⚡⚡ CPU: Frequency Driver Test
🚧 ⚡⚡⚡ CPU: Idle Test
🚧 ⚡⚡⚡ xfstests - btrfs
🚧 ⚡⚡⚡ IOMMU boot test
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ power-management: cpupower/sanity test
🚧 ⚡⚡⚡ Storage blktests
Host 4:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
⚡⚡⚡ xfstests - ext4
⚡⚡⚡ xfstests - xfs
⚡⚡⚡ selinux-policy: serge-testsuite
⚡⚡⚡ storage: software RAID testing
⚡⚡⚡ stress: stress-ng
🚧 ⚡⚡⚡ CPU: Frequency Driver Test
🚧 ⚡⚡⚡ CPU: Idle Test
🚧 ⚡⚡⚡ xfstests - btrfs
🚧 ⚡⚡⚡ IOMMU boot test
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ power-management: cpupower/sanity test
🚧 ⚡⚡⚡ Storage blktests
Host 5:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ❌ CPU: Frequency Driver Test
🚧 ✅ CPU: Idle Test
🚧 ✅ xfstests - btrfs
🚧 ✅ IOMMU boot test
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ power-management: cpupower/sanity test
🚧 ✅ Storage blktests
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: bb8351a001c2 - io_uring: fix recvmsg memory leak with buffer selection
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://cki-artifacts.s3.us-east-2.amazonaws.com/index.html?prefix=dataware…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ❌ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
Host 2:
✅ Boot test
✅ ACPI table test
✅ ACPI enabled test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
ppc64le:
Host 1:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
🚧 ❌ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
Host 2:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
⚡⚡⚡ Boot test
🚧 ⚡⚡⚡ kdump - sysrq-c
Host 3:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 4:
✅ Boot test
🚧 ✅ kdump - sysrq-c
s390x:
Host 1:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ stress: stress-ng
🚧 ✅ Storage blktests
Host 2:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
🚧 ⚡⚡⚡ kdump - sysrq-c
🚧 ⚡⚡⚡ kdump - file-load
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ⚡⚡⚡ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
x86_64:
Host 1:
✅ Boot test
✅ ACPI table test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: sanity smoke test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 2:
✅ Boot test
🚧 ✅ kdump - sysrq-c
🚧 ✅ kdump - file-load
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ CPU: Frequency Driver Test
🚧 ✅ CPU: Idle Test
🚧 ✅ xfstests - btrfs
🚧 ⚡⚡⚡ IOMMU boot test
🚧 ⚡⚡⚡ IPMI driver test
🚧 ⚡⚡⚡ IPMItool loop stress test
🚧 ⚡⚡⚡ power-management: cpupower/sanity test
🚧 ⚡⚡⚡ Storage blktests
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
Hi,
I forgot to mark this one with stable, can you please cherry-pick it
for 5.7-stable? It picks cleanly. Thanks!
commit 681fda8d27a66f7e65ff7f2d200d7635e64a8d05
Author: Pavel Begunkov <asml.silence(a)gmail.com>
Date: Wed Jul 15 22:20:45 2020 +0300
io_uring: fix recvmsg memory leak with buffer selection
--
Jens Axboe
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: dfb39f15e19b - spi: spi-fsl-dspi: Fix lockup if device is shutdown during SPI transfer
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://cki-artifacts.s3.us-east-2.amazonaws.com/index.html?prefix=dataware…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ ACPI table test
✅ ACPI enabled test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ⚡⚡⚡ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 2:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
ppc64le:
Host 1:
✅ Boot test
🚧 ✅ kdump - sysrq-c
Host 2:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
🚧 ❌ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
⚡⚡⚡ Networking route_func - local
⚡⚡⚡ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 4:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
s390x:
Host 1:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 2:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ stress: stress-ng
🚧 ✅ Storage blktests
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
🚧 ⚡⚡⚡ kdump - sysrq-c
🚧 ⚡⚡⚡ kdump - file-load
x86_64:
Host 1:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ CPU: Frequency Driver Test
🚧 ✅ CPU: Idle Test
🚧 ❌ xfstests - btrfs
🚧 ✅ IOMMU boot test
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ power-management: cpupower/sanity test
🚧 ✅ Storage blktests
Host 2:
✅ Boot test
🚧 ✅ kdump - sysrq-c
🚧 ✅ kdump - file-load
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ ACPI table test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: sanity smoke test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ⚡⚡⚡ audit: audit testsuite test
🚧 ⚡⚡⚡ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: 3766d840ebc3 - spi: spi-fsl-dspi: Fix lockup if device is shutdown during SPI transfer
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://cki-artifacts.s3.us-east-2.amazonaws.com/index.html?prefix=dataware…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ ACPI table test
✅ ACPI enabled test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
⚡⚡⚡ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
⚡⚡⚡ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 2:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
🚧 ❌ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ⚡⚡⚡ Storage blktests
Host 3:
✅ Boot test
✅ ACPI table test
✅ ACPI enabled test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
ppc64le:
Host 1:
✅ Boot test
🚧 ✅ kdump - sysrq-c
Host 2:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 3:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
🚧 ❌ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
s390x:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
🚧 ⚡⚡⚡ kdump - sysrq-c
🚧 ⚡⚡⚡ kdump - file-load
Host 2:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 3:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ stress: stress-ng
🚧 ❌ Storage blktests
x86_64:
Host 1:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ CPU: Frequency Driver Test
🚧 ✅ CPU: Idle Test
🚧 ✅ xfstests - btrfs
🚧 ✅ IOMMU boot test
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ power-management: cpupower/sanity test
🚧 ✅ Storage blktests
Host 2:
✅ Boot test
✅ ACPI table test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: sanity smoke test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 3:
✅ Boot test
🚧 ✅ kdump - sysrq-c
🚧 ✅ kdump - file-load
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
On Fri, Jul 17, 2020 at 02:43:52AM +0200, Karol Herbst wrote:
>On Fri, Jul 17, 2020 at 1:54 AM Bjorn Helgaas <helgaas(a)kernel.org> wrote:
>>
>> [+cc Sasha -- stable kernel regression]
>> [+cc Patrick, Kai-Heng, LKML]
>>
>> On Fri, Jul 17, 2020 at 12:10:39AM +0200, Karol Herbst wrote:
>> > On Tue, Jul 7, 2020 at 9:30 PM Karol Herbst <kherbst(a)redhat.com> wrote:
>> > >
>> > > Hi everybody,
>> > >
>> > > with the mentioned commit Nouveau isn't able to load firmware onto the
>> > > GPU on one of my systems here. Even though the issue doesn't always
>> > > happen I am quite confident this is the commit breaking it.
>> > >
>> > > I am still digging into the issue and trying to figure out what
>> > > exactly breaks, but it shows up in different ways. Either we are not
>> > > able to boot the engines on the GPU or the GPU becomes unresponsive.
>> > > Btw, this is also a system where our runtime power management issue
>> > > shows up, so maybe there is indeed something funky with the bridge
>> > > controller.
>> > >
>> > > Just pinging you in case you have an idea on how this could break Nouveau
>> > >
>> > > most of the times it shows up like this:
>> > > nouveau 0000:01:00.0: acr: AHESASC binary failed
>> > >
>> > > Sometimes it works at boot and fails at runtime resuming with random
>> > > faults. So I will be investigating a bit more, but yeah... I am super
>> > > sure the commit triggered this issue, no idea if it actually causes
>> > > it.
>> >
>> > so yeah.. I reverted that locally and never ran into issues again.
>> > Still valid on latest 5.7. So can we get this reverted or properly
>> > fixed? This breaks runtime pm for us on at least some hardware.
>>
>> Yeah, that stinks. We had another similar report from Patrick:
>>
>> https://lore.kernel.org/r/CAErSpo5sTeK_my1dEhWp7aHD0xOp87+oHYWkTjbL7ALgDbXo…
>>
>> Apparently the problem is ec411e02b7a2 ("PCI/PM: Assume ports without
>> DLL Link Active train links in 100 ms"), which Patrick found was
>> backported to v5.4.49 as 828b192c57e8, and you found was backported to
>> v5.7.6 as afaff825e3a4.
>>
>> Oddly, Patrick reported that v5.7.7 worked correctly, even though it
>> still contains afaff825e3a4.
>>
>> I guess in the absence of any other clues we'll have to revert it.
>> I hate to do that because that means we'll have slow resume of
>> Thunderbolt-connected devices again, but that's better than having
>> GPUs completely broken.
>>
>> Could you and Patrick open bugzilla.kernel.org reports, attach dmesg
>> logs and "sudo lspci -vv" output, and add the URLs to Kai-Heng's
>> original report at https://bugzilla.kernel.org/show_bug.cgi?id=206837
>> and to this thread?
>>
>> There must be a way to fix the slow resume problem without breaking
>> the GPUs.
>>
>
>I wouldn't be surprised if this is related to the Intel bridge we
>check against for Nouveau.. I still have to check on another laptop
>with the same bridge our workaround was required as well but wouldn't
>be surprised if it shows the same problem. Will get you the
>information from both systems tomorrow then.
I take it that ec411e02b7a2 will be reverted upstream?
--
Thanks,
Sasha
The following commit has been merged into the irq/urgent branch of tip:
Commit-ID: baedb87d1b53532f81b4bd0387f83b05d4f7eb9a
Gitweb: https://git.kernel.org/tip/baedb87d1b53532f81b4bd0387f83b05d4f7eb9a
Author: Thomas Gleixner <tglx(a)linutronix.de>
AuthorDate: Fri, 17 Jul 2020 18:00:02 +02:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Fri, 17 Jul 2020 23:30:43 +02:00
genirq/affinity: Handle affinity setting on inactive interrupts correctly
Setting interrupt affinity on inactive interrupts is inconsistent when
hierarchical irq domains are enabled. The core code should just store the
affinity and not call into the irq chip driver for inactive interrupts
because the chip drivers may not be in a state to handle such requests.
X86 has a hacky workaround for that but all other irq chips have not which
causes problems e.g. on GIC V3 ITS.
Instead of adding more ugly hacks all over the place, solve the problem in
the core code. If the affinity is set on an inactive interrupt then:
- Store it in the irq descriptors affinity mask
- Update the effective affinity to reflect that so user space has
a consistent view
- Don't call into the irq chip driver
This is the core equivalent of the X86 workaround and works correctly
because the affinity setting is established in the irq chip when the
interrupt is activated later on.
Note, that this is only effective when hierarchical irq domains are enabled
by the architecture. Doing it unconditionally would break legacy irq chip
implementations.
For hierarchial irq domains this works correctly as none of the drivers can
have a dependency on affinity setting in inactive state by design.
Remove the X86 workaround as it is not longer required.
Fixes: 02edee152d6e ("x86/apic/vector: Ignore set_affinity call for inactive interrupts")
Reported-by: Ali Saidi <alisaidi(a)amazon.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Tested-by: Ali Saidi <alisaidi(a)amazon.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20200529015501.15771-1-alisaidi@amazon.com
Link: https://lkml.kernel.org/r/877dv2rv25.fsf@nanos.tec.linutronix.de
---
arch/x86/kernel/apic/vector.c | 22 ++++----------------
kernel/irq/manage.c | 37 ++++++++++++++++++++++++++++++++--
2 files changed, 40 insertions(+), 19 deletions(-)
diff --git a/arch/x86/kernel/apic/vector.c b/arch/x86/kernel/apic/vector.c
index cc8b16f..7649da2 100644
--- a/arch/x86/kernel/apic/vector.c
+++ b/arch/x86/kernel/apic/vector.c
@@ -446,12 +446,10 @@ static int x86_vector_activate(struct irq_domain *dom, struct irq_data *irqd,
trace_vector_activate(irqd->irq, apicd->is_managed,
apicd->can_reserve, reserve);
- /* Nothing to do for fixed assigned vectors */
- if (!apicd->can_reserve && !apicd->is_managed)
- return 0;
-
raw_spin_lock_irqsave(&vector_lock, flags);
- if (reserve || irqd_is_managed_and_shutdown(irqd))
+ if (!apicd->can_reserve && !apicd->is_managed)
+ assign_irq_vector_any_locked(irqd);
+ else if (reserve || irqd_is_managed_and_shutdown(irqd))
vector_assign_managed_shutdown(irqd);
else if (apicd->is_managed)
ret = activate_managed(irqd);
@@ -774,20 +772,10 @@ void lapic_offline(void)
static int apic_set_affinity(struct irq_data *irqd,
const struct cpumask *dest, bool force)
{
- struct apic_chip_data *apicd = apic_chip_data(irqd);
int err;
- /*
- * Core code can call here for inactive interrupts. For inactive
- * interrupts which use managed or reservation mode there is no
- * point in going through the vector assignment right now as the
- * activation will assign a vector which fits the destination
- * cpumask. Let the core code store the destination mask and be
- * done with it.
- */
- if (!irqd_is_activated(irqd) &&
- (apicd->is_managed || apicd->can_reserve))
- return IRQ_SET_MASK_OK;
+ if (WARN_ON_ONCE(!irqd_is_activated(irqd)))
+ return -EIO;
raw_spin_lock(&vector_lock);
cpumask_and(vector_searchmask, dest, cpu_online_mask);
diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index 7619111..2a9fec5 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -195,9 +195,9 @@ void irq_set_thread_affinity(struct irq_desc *desc)
set_bit(IRQTF_AFFINITY, &action->thread_flags);
}
+#ifdef CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK
static void irq_validate_effective_affinity(struct irq_data *data)
{
-#ifdef CONFIG_GENERIC_IRQ_EFFECTIVE_AFF_MASK
const struct cpumask *m = irq_data_get_effective_affinity_mask(data);
struct irq_chip *chip = irq_data_get_irq_chip(data);
@@ -205,9 +205,19 @@ static void irq_validate_effective_affinity(struct irq_data *data)
return;
pr_warn_once("irq_chip %s did not update eff. affinity mask of irq %u\n",
chip->name, data->irq);
-#endif
}
+static inline void irq_init_effective_affinity(struct irq_data *data,
+ const struct cpumask *mask)
+{
+ cpumask_copy(irq_data_get_effective_affinity_mask(data), mask);
+}
+#else
+static inline void irq_validate_effective_affinity(struct irq_data *data) { }
+static inline void irq_init_effective_affinity(struct irq_data *data,
+ const struct cpumask *mask) { }
+#endif
+
int irq_do_set_affinity(struct irq_data *data, const struct cpumask *mask,
bool force)
{
@@ -304,6 +314,26 @@ static int irq_try_set_affinity(struct irq_data *data,
return ret;
}
+static bool irq_set_affinity_deactivated(struct irq_data *data,
+ const struct cpumask *mask, bool force)
+{
+ struct irq_desc *desc = irq_data_to_desc(data);
+
+ /*
+ * If the interrupt is not yet activated, just store the affinity
+ * mask and do not call the chip driver at all. On activation the
+ * driver has to make sure anyway that the interrupt is in a
+ * useable state so startup works.
+ */
+ if (!IS_ENABLED(CONFIG_IRQ_DOMAIN_HIERARCHY) || irqd_is_activated(data))
+ return false;
+
+ cpumask_copy(desc->irq_common_data.affinity, mask);
+ irq_init_effective_affinity(data, mask);
+ irqd_set(data, IRQD_AFFINITY_SET);
+ return true;
+}
+
int irq_set_affinity_locked(struct irq_data *data, const struct cpumask *mask,
bool force)
{
@@ -314,6 +344,9 @@ int irq_set_affinity_locked(struct irq_data *data, const struct cpumask *mask,
if (!chip || !chip->irq_set_affinity)
return -EINVAL;
+ if (irq_set_affinity_deactivated(data, mask, force))
+ return 0;
+
if (irq_can_move_pcntxt(data) && !irqd_is_setaffinity_pending(data)) {
ret = irq_try_set_affinity(data, mask, force);
} else {
When an expiration delta falls into the last level of the wheel, we want
to compare that delta against the maximum possible delay and reduce our
delta to fit in if necessary.
However instead of comparing the delta against the maximum, we are
comparing the actual expiry against the maximum. Then instead of fixing
the delta to fit in, we set the maximum delta as the expiry value.
This can result in various undesired outcomes, the worst possible one
being a timer expiring 15 days ahead to fire immediately.
Fixes: 500462a9de65 ("timers: Switch to a non-cascading wheel")
Signed-off-by: Frederic Weisbecker <frederic(a)kernel.org>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Anna-Maria Behnsen <anna-maria(a)linutronix.de>
Cc: Juri Lelli <juri.lelli(a)redhat.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
---
kernel/time/timer.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/time/timer.c b/kernel/time/timer.c
index 9a838d38dbe6..df1ff803acc4 100644
--- a/kernel/time/timer.c
+++ b/kernel/time/timer.c
@@ -521,8 +521,8 @@ static int calc_wheel_index(unsigned long expires, unsigned long clk)
* Force expire obscene large timeouts to expire at the
* capacity limit of the wheel.
*/
- if (expires >= WHEEL_TIMEOUT_CUTOFF)
- expires = WHEEL_TIMEOUT_MAX;
+ if (delta >= WHEEL_TIMEOUT_CUTOFF)
+ expires = clk + WHEEL_TIMEOUT_MAX;
idx = calc_index(expires, LVL_DEPTH - 1);
}
--
2.26.2
The "FIRMWARE_EFI_EMBEDDED" enum is a "where", not a "what". It
should not be distinguished separately from just "FIRMWARE", as this
confuses the LSMs about what is being loaded. Additionally, there was
no actual validation of the firmware contents happening.
Fixes: e4c2c0ff00ec ("firmware: Add new platform fallback mechanism and firmware_request_platform()")
Cc: stable(a)vger.kernel.org
Signed-off-by: Kees Cook <keescook(a)chromium.org>
---
To aid in backporting, this change is made before moving
kernel_read_file() to separate header/source files.
---
drivers/base/firmware_loader/fallback_platform.c | 2 +-
include/linux/fs.h | 3 +--
2 files changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/base/firmware_loader/fallback_platform.c b/drivers/base/firmware_loader/fallback_platform.c
index 685edb7dd05a..6958ab1a8059 100644
--- a/drivers/base/firmware_loader/fallback_platform.c
+++ b/drivers/base/firmware_loader/fallback_platform.c
@@ -17,7 +17,7 @@ int firmware_fallback_platform(struct fw_priv *fw_priv, u32 opt_flags)
if (!(opt_flags & FW_OPT_FALLBACK_PLATFORM))
return -ENOENT;
- rc = security_kernel_load_data(LOADING_FIRMWARE_EFI_EMBEDDED);
+ rc = security_kernel_load_data(LOADING_FIRMWARE);
if (rc)
return rc;
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 95fc775ed937..f50a35d54a61 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2993,11 +2993,10 @@ static inline void i_readcount_inc(struct inode *inode)
#endif
extern int do_pipe_flags(int *, int);
-/* This is a list of *what* is being read, not *how*. */
+/* This is a list of *what* is being read, not *how* nor *where*. */
#define __kernel_read_file_id(id) \
id(UNKNOWN, unknown) \
id(FIRMWARE, firmware) \
- id(FIRMWARE_EFI_EMBEDDED, firmware) \
id(MODULE, kernel-module) \
id(KEXEC_IMAGE, kexec-image) \
id(KEXEC_INITRAMFS, kexec-initramfs) \
--
2.25.1
Hello,
We ran automated tests on a recent commit from this kernel tree:
Kernel repo: https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git
Commit: f8de612e6e23 - Linux 5.7.10-rc1
The results of these automated tests are provided below.
Overall result: PASSED
Merge: OK
Compile: OK
Tests: OK
All kernel binaries, config files, and logs are available for download here:
https://cki-artifacts.s3.us-east-2.amazonaws.com/index.html?prefix=dataware…
Please reply to this email if you have any questions about the tests that we
ran or if you have any suggestions on how to make future tests more effective.
,-. ,-.
( C ) ( K ) Continuous
`-',-.`-' Kernel
( I ) Integration
`-'
______________________________________________________________________________
Compile testing
---------------
We compiled the kernel for 4 architectures:
aarch64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
ppc64le:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
s390x:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
x86_64:
make options: -j30 INSTALL_MOD_STRIP=1 targz-pkg
Hardware testing
----------------
We booted each kernel and ran the following tests:
aarch64:
Host 1:
✅ Boot test
✅ ACPI table test
✅ ACPI enabled test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 2:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
⚡⚡⚡ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
Host 3:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
ppc64le:
Host 1:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
⚡⚡⚡ selinux-policy: serge-testsuite
✅ storage: software RAID testing
🚧 ❌ xfstests - btrfs
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ Storage blktests
Host 2:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
🚧 ⚡⚡⚡ kdump - sysrq-c
s390x:
Host 1:
✅ Boot test
✅ selinux-policy: serge-testsuite
✅ stress: stress-ng
🚧 ✅ Storage blktests
Host 2:
✅ Boot test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
🚧 ⚡⚡⚡ kdump - sysrq-c
🚧 ⚡⚡⚡ kdump - file-load
x86_64:
Host 1:
✅ Boot test
✅ ACPI table test
✅ Podman system integration test - as root
✅ Podman system integration test - as user
✅ LTP
✅ Loopdev Sanity
✅ Memory function: memfd_create
✅ AMTU (Abstract Machine Test Utility)
✅ Networking bridge: sanity
✅ Ethernet drivers sanity
✅ Networking socket: fuzz
✅ Networking: igmp conformance test
✅ Networking route: pmtu
✅ Networking route_func - local
✅ Networking route_func - forward
✅ Networking TCP: keepalive test
✅ Networking UDP: socket
✅ Networking tunnel: geneve basic test
✅ Networking tunnel: gre basic
✅ L2TP basic test
✅ Networking tunnel: vxlan basic
✅ Networking ipsec: basic netns - transport
✅ Networking ipsec: basic netns - tunnel
✅ Libkcapi AF_ALG test
✅ pciutils: sanity smoke test
✅ pciutils: update pci ids test
✅ ALSA PCM loopback test
✅ ALSA Control (mixer) Userspace Element test
✅ storage: SCSI VPD
🚧 ✅ CIFS Connectathon
🚧 ✅ POSIX pjd-fstest suites
🚧 ✅ jvm - DaCapo Benchmark Suite
🚧 ✅ jvm - jcstress tests
🚧 ✅ Memory function: kaslr
🚧 ✅ Networking firewall: basic netfilter test
🚧 ✅ audit: audit testsuite test
🚧 ✅ trace: ftrace/tracer
🚧 ✅ kdump - kexec_boot
Host 2:
✅ Boot test
🚧 ✅ kdump - sysrq-c
🚧 ✅ kdump - file-load
Host 3:
⚡ Internal infrastructure issues prevented one or more tests (marked
with ⚡⚡⚡) from running on this architecture.
This is not the fault of the kernel that was tested.
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
⚡⚡⚡ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ CPU: Frequency Driver Test
🚧 ✅ CPU: Idle Test
🚧 ✅ xfstests - btrfs
🚧 ✅ IOMMU boot test
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ power-management: cpupower/sanity test
🚧 ✅ Storage blktests
Host 4:
✅ Boot test
✅ xfstests - ext4
✅ xfstests - xfs
✅ selinux-policy: serge-testsuite
✅ storage: software RAID testing
✅ stress: stress-ng
🚧 ✅ CPU: Frequency Driver Test
🚧 ✅ CPU: Idle Test
🚧 ❌ xfstests - btrfs
🚧 ✅ IOMMU boot test
🚧 ✅ IPMI driver test
🚧 ✅ IPMItool loop stress test
🚧 ✅ power-management: cpupower/sanity test
🚧 ✅ Storage blktests
Test sources: https://gitlab.com/cki-project/kernel-tests
💚 Pull requests are welcome for new tests or improvements to existing tests!
Aborted tests
-------------
Tests that didn't complete running successfully are marked with ⚡⚡⚡.
If this was caused by an infrastructure issue, we try to mark that
explicitly in the report.
Waived tests
------------
If the test run included waived tests, they are marked with 🚧. Such tests are
executed but their results are not taken into account. Tests are waived when
their results are not reliable enough, e.g. when they're just introduced or are
being fixed.
Testing timeout
---------------
We aim to provide a report within reasonable timeframe. Tests that haven't
finished running yet are marked with ⏱.
The EFI platform firmware fallback would clobber any pre-allocated
buffers. Instead, correctly refuse to reallocate when too small (as
already done in the sysfs fallback), or perform allocation normally
when needed.
Fixes: e4c2c0ff00ec ("firmware: Add new platform fallback mechanism and firm ware_request_platform()")
Cc: stable(a)vger.kernel.org
Signed-off-by: Kees Cook <keescook(a)chromium.org>
---
To aid in backporting, this change is made before moving
kernel_read_file() to separate header/source files.
---
drivers/base/firmware_loader/fallback_platform.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/base/firmware_loader/fallback_platform.c b/drivers/base/firmware_loader/fallback_platform.c
index cdd2c9a9f38a..685edb7dd05a 100644
--- a/drivers/base/firmware_loader/fallback_platform.c
+++ b/drivers/base/firmware_loader/fallback_platform.c
@@ -25,7 +25,10 @@ int firmware_fallback_platform(struct fw_priv *fw_priv, u32 opt_flags)
if (rc)
return rc; /* rc == -ENOENT when the fw was not found */
- fw_priv->data = vmalloc(size);
+ if (fw_priv->data && size > fw_priv->allocated_size)
+ return -ENOMEM;
+ if (!fw_priv->data)
+ fw_priv->data = vmalloc(size);
if (!fw_priv->data)
return -ENOMEM;
--
2.25.1
For some block devices which large capacity (e.g. 8TB) but small io_opt
size (e.g. 8 sectors), in bcache_device_init() the stripes number calcu-
lated by,
DIV_ROUND_UP_ULL(sectors, d->stripe_size);
might be overflow to the unsigned int bcache_device->nr_stripes.
This patch uses the uint64_t variable to store DIV_ROUND_UP_ULL()
and after the value is checked to be available in unsigned int range,
sets it to bache_device->nr_stripes. Then the overflow is avoided.
Reported-by: Ken Raeburn <raeburn(a)redhat.com>
Signed-off-by: Coly Li <colyli(a)suse.de>
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1783075
Cc: stable(a)vger.kernel.org
---
Changelog:
v2: Improve overflow fix on 32bit machine, suggested by Jens and Ken.
v1: initial version.
drivers/md/bcache/super.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index a239fcaec70b..7615be9d4498 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -886,19 +886,19 @@ static int bcache_device_init(struct bcache_device *d, unsigned int block_size,
struct request_queue *q;
const size_t max_stripes = min_t(size_t, INT_MAX,
SIZE_MAX / sizeof(atomic_t));
- size_t n;
+ uint64_t n;
int idx;
if (!d->stripe_size)
d->stripe_size = 1 << 31;
- d->nr_stripes = DIV_ROUND_UP_ULL(sectors, d->stripe_size);
-
- if (!d->nr_stripes || d->nr_stripes > max_stripes) {
- pr_err("nr_stripes too large or invalid: %u (start sector beyond end of disk?)\n",
- (unsigned int)d->nr_stripes);
+ n = DIV_ROUND_UP_ULL(sectors, d->stripe_size);
+ if (!n || n > max_stripes) {
+ pr_err("nr_stripes too large or invalid: %llu (start sector beyond end of disk?)\n",
+ n);
return -ENOMEM;
}
+ d->nr_stripes = n;
n = d->nr_stripes * sizeof(atomic_t);
d->stripe_sectors_dirty = kvzalloc(n, GFP_KERNEL);
--
2.26.2
User Forza reported on IRC that some invalid combinations of file
attributes are accepted by chattr.
The NODATACOW and compression file flags/attributes are mutually
exclusive, but they could be set by 'chattr +c +C' on an empty file. The
nodatacow will be in effect because it's checked first in
btrfs_run_delalloc_range.
Extend the flag validation to catch the following cases:
- input flags are conflicting
- old and new flags are conflicting
- initialize the local variable with inode flags after inode ls locked
CC: stable(a)vger.kernel.org # 4.4+
Signed-off-by: David Sterba <dsterba(a)suse.com>
---
fs/btrfs/ioctl.c | 30 ++++++++++++++++++++++--------
1 file changed, 22 insertions(+), 8 deletions(-)
diff --git a/fs/btrfs/ioctl.c b/fs/btrfs/ioctl.c
index 3a566cf71fc6..0c13bb38425b 100644
--- a/fs/btrfs/ioctl.c
+++ b/fs/btrfs/ioctl.c
@@ -164,8 +164,11 @@ static int btrfs_ioctl_getflags(struct file *file, void __user *arg)
return 0;
}
-/* Check if @flags are a supported and valid set of FS_*_FL flags */
-static int check_fsflags(unsigned int flags)
+/*
+ * Check if @flags are a supported and valid set of FS_*_FL flags and that
+ * the old and new flags are not conflicting
+ */
+static int check_fsflags(unsigned int old_flags, unsigned int flags)
{
if (flags & ~(FS_IMMUTABLE_FL | FS_APPEND_FL | \
FS_NOATIME_FL | FS_NODUMP_FL | \
@@ -174,9 +177,19 @@ static int check_fsflags(unsigned int flags)
FS_NOCOW_FL))
return -EOPNOTSUPP;
+ /* COMPR and NOCOMP on new/old are valid */
if ((flags & FS_NOCOMP_FL) && (flags & FS_COMPR_FL))
return -EINVAL;
+ if ((flags & FS_COMPR_FL) && (flags & FS_NOCOW_FL))
+ return -EINVAL;
+
+ /* NOCOW and compression options are mutually exclusive */
+ if ((old_flags & FS_NOCOW_FL) && (flags & (FS_COMPR_FL | FS_NOCOMP_FL)))
+ return -EINVAL;
+ if ((flags & FS_NOCOW_FL) && (old_flags & (FS_COMPR_FL | FS_NOCOMP_FL)))
+ return -EINVAL;
+
return 0;
}
@@ -190,7 +203,7 @@ static int btrfs_ioctl_setflags(struct file *file, void __user *arg)
unsigned int fsflags, old_fsflags;
int ret;
const char *comp = NULL;
- u32 binode_flags = binode->flags;
+ u32 binode_flags;
if (!inode_owner_or_capable(inode))
return -EPERM;
@@ -201,22 +214,23 @@ static int btrfs_ioctl_setflags(struct file *file, void __user *arg)
if (copy_from_user(&fsflags, arg, sizeof(fsflags)))
return -EFAULT;
- ret = check_fsflags(fsflags);
- if (ret)
- return ret;
-
ret = mnt_want_write_file(file);
if (ret)
return ret;
inode_lock(inode);
-
fsflags = btrfs_mask_fsflags_for_type(inode, fsflags);
old_fsflags = btrfs_inode_flags_to_fsflags(binode->flags);
+
ret = vfs_ioc_setflags_prepare(inode, old_fsflags, fsflags);
if (ret)
goto out_unlock;
+ ret = check_fsflags(old_fsflags, fsflags);
+ if (ret)
+ goto out_unlock;
+
+ binode_flags = binode->flags;
if (fsflags & FS_SYNC_FL)
binode_flags |= BTRFS_INODE_SYNC;
else
--
2.25.0
For some block devices which large capacity (e.g. 8TB) but small io_opt
size (e.g. 8 sectors), in bcache_device_init() the stripes number calcu-
lated by,
DIV_ROUND_UP_ULL(sectors, d->stripe_size);
might be overflow to the unsigned int bcache_device->nr_stripes.
This patch uses an unsigned long variable to store DIV_ROUND_UP_ULL()
and after the value is checked to be available in unsigned int range,
sets it to bache_device->nr_stripes. Then the overflow is avoided.
Reported-by: Ken Raeburn <raeburn(a)redhat.com>
Signed-off-by: Coly Li <colyli(a)suse.de>
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1783075
Cc: stable(a)vger.kernel.org
---
drivers/md/bcache/super.c | 12 ++++++------
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index a239fcaec70b..0c25ebc035b1 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -886,19 +886,19 @@ static int bcache_device_init(struct bcache_device *d, unsigned int block_size,
struct request_queue *q;
const size_t max_stripes = min_t(size_t, INT_MAX,
SIZE_MAX / sizeof(atomic_t));
- size_t n;
+ unsigned long n;
int idx;
if (!d->stripe_size)
d->stripe_size = 1 << 31;
- d->nr_stripes = DIV_ROUND_UP_ULL(sectors, d->stripe_size);
-
- if (!d->nr_stripes || d->nr_stripes > max_stripes) {
- pr_err("nr_stripes too large or invalid: %u (start sector beyond end of disk?)\n",
- (unsigned int)d->nr_stripes);
+ n = DIV_ROUND_UP_ULL(sectors, d->stripe_size);
+ if (!n || n > max_stripes) {
+ pr_err("nr_stripes too large or invalid: %lu (start sector beyond end of disk?)\n",
+ n);
return -ENOMEM;
}
+ d->nr_stripes = n;
n = d->nr_stripes * sizeof(atomic_t);
d->stripe_sectors_dirty = kvzalloc(n, GFP_KERNEL);
--
2.26.2
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 5c49056ad9f3c786f7716da2dd47e4488fc6bd25 Mon Sep 17 00:00:00 2001
From: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
Date: Sun, 7 Jun 2020 16:53:53 +0100
Subject: [PATCH] iio:humidity:hts221 Fix alignment and data leak issues
One of a class of bugs pointed out by Lars in a recent review.
iio_push_to_buffers_with_timestamp assumes the buffer used is aligned
to the size of the timestamp (8 bytes). This is not guaranteed in
this driver which uses an array of smaller elements on the stack.
As Lars also noted this anti pattern can involve a leak of data to
userspace and that indeed can happen here. We close both issues by
moving to a suitable structure in the iio_priv() data.
This data is allocated with kzalloc so no data can leak
apart from previous readings.
Explicit alignment of ts needed to ensure consistent padding
on all architectures (particularly x86_32 with it's 4 byte alignment
of s64)
Fixes: e4a70e3e7d84 ("iio: humidity: add support to hts221 rh/temp combo device")
Reported-by: Lars-Peter Clausen <lars(a)metafoo.de>
Acked-by: Lorenzo Bianconi <lorenzo(a)kernel.org>
Signed-off-by: Jonathan Cameron <Jonathan.Cameron(a)huawei.com>
Cc: <Stable(a)vger.kernel.org>
diff --git a/drivers/iio/humidity/hts221.h b/drivers/iio/humidity/hts221.h
index 7d6771f7cf47..b2eb5abeaccd 100644
--- a/drivers/iio/humidity/hts221.h
+++ b/drivers/iio/humidity/hts221.h
@@ -14,8 +14,6 @@
#include <linux/iio/iio.h>
-#define HTS221_DATA_SIZE 2
-
enum hts221_sensor_type {
HTS221_SENSOR_H,
HTS221_SENSOR_T,
@@ -39,6 +37,11 @@ struct hts221_hw {
bool enabled;
u8 odr;
+ /* Ensure natural alignment of timestamp */
+ struct {
+ __le16 channels[2];
+ s64 ts __aligned(8);
+ } scan;
};
extern const struct dev_pm_ops hts221_pm_ops;
diff --git a/drivers/iio/humidity/hts221_buffer.c b/drivers/iio/humidity/hts221_buffer.c
index 9fb3f33614d4..ba7d413d75ba 100644
--- a/drivers/iio/humidity/hts221_buffer.c
+++ b/drivers/iio/humidity/hts221_buffer.c
@@ -160,7 +160,6 @@ static const struct iio_buffer_setup_ops hts221_buffer_ops = {
static irqreturn_t hts221_buffer_handler_thread(int irq, void *p)
{
- u8 buffer[ALIGN(2 * HTS221_DATA_SIZE, sizeof(s64)) + sizeof(s64)];
struct iio_poll_func *pf = p;
struct iio_dev *iio_dev = pf->indio_dev;
struct hts221_hw *hw = iio_priv(iio_dev);
@@ -170,18 +169,20 @@ static irqreturn_t hts221_buffer_handler_thread(int irq, void *p)
/* humidity data */
ch = &iio_dev->channels[HTS221_SENSOR_H];
err = regmap_bulk_read(hw->regmap, ch->address,
- buffer, HTS221_DATA_SIZE);
+ &hw->scan.channels[0],
+ sizeof(hw->scan.channels[0]));
if (err < 0)
goto out;
/* temperature data */
ch = &iio_dev->channels[HTS221_SENSOR_T];
err = regmap_bulk_read(hw->regmap, ch->address,
- buffer + HTS221_DATA_SIZE, HTS221_DATA_SIZE);
+ &hw->scan.channels[1],
+ sizeof(hw->scan.channels[1]));
if (err < 0)
goto out;
- iio_push_to_buffers_with_timestamp(iio_dev, buffer,
+ iio_push_to_buffers_with_timestamp(iio_dev, &hw->scan,
iio_get_time_ns(iio_dev));
out:
The `INSN_CONFIG` comedi instruction with sub-instruction code
`INSN_CONFIG_DIGITAL_TRIG` includes a base channel in `data[3]`. This is
used as a right shift amount for other bitmask values without being
checked. Shift amounts greater than or equal to 32 will result in
undefined behavior. Add code to deal with this, adjusting the checks
for invalid channels so that enabled channel bits that would have been
lost by shifting are also checked for validity. Only channels 0 to 15
are valid.
Fixes: a8c66b684efaf ("staging: comedi: addi_apci_1500: rewrite the subdevice support functions")
Cc: <stable(a)vger.kernel.org> #4.0+: ef75e14a6c93: staging: comedi: verify array index is correct before using it
Cc: <stable(a)vger.kernel.org> #4.0+
Signed-off-by: Ian Abbott <abbotti(a)mev.co.uk>
---
.../staging/comedi/drivers/addi_apci_1500.c | 24 +++++++++++++++----
1 file changed, 19 insertions(+), 5 deletions(-)
diff --git a/drivers/staging/comedi/drivers/addi_apci_1500.c b/drivers/staging/comedi/drivers/addi_apci_1500.c
index 689acd69a1b9..816dd25b9d0e 100644
--- a/drivers/staging/comedi/drivers/addi_apci_1500.c
+++ b/drivers/staging/comedi/drivers/addi_apci_1500.c
@@ -452,13 +452,14 @@ static int apci1500_di_cfg_trig(struct comedi_device *dev,
struct apci1500_private *devpriv = dev->private;
unsigned int trig = data[1];
unsigned int shift = data[3];
- unsigned int hi_mask = data[4] << shift;
- unsigned int lo_mask = data[5] << shift;
- unsigned int chan_mask = hi_mask | lo_mask;
- unsigned int old_mask = (1 << shift) - 1;
+ unsigned int hi_mask;
+ unsigned int lo_mask;
+ unsigned int chan_mask;
+ unsigned int old_mask;
unsigned int pm;
unsigned int pt;
unsigned int pp;
+ unsigned int invalid_chan;
if (trig > 1) {
dev_dbg(dev->class_dev,
@@ -466,7 +467,20 @@ static int apci1500_di_cfg_trig(struct comedi_device *dev,
return -EINVAL;
}
- if (chan_mask > 0xffff) {
+ if (shift <= 16) {
+ hi_mask = data[4] << shift;
+ lo_mask = data[5] << shift;
+ old_mask = (1U << shift) - 1;
+ invalid_chan = (data[4] | data[5]) >> (16 - shift);
+ } else {
+ hi_mask = 0;
+ lo_mask = 0;
+ old_mask = 0xffff;
+ invalid_chan = data[4] | data[5];
+ }
+ chan_mask = hi_mask | lo_mask;
+
+ if (invalid_chan) {
dev_dbg(dev->class_dev, "invalid digital trigger channel\n");
return -EINVAL;
}
--
2.27.0