The patch titled
Subject: mm/vmalloc: separate put pages and flush VM flags
has been removed from the -mm tree. Its filename was
mm-vmalloc-separate-put-pages-and-flush-vm-flags.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Rick Edgecombe <rick.p.edgecombe(a)intel.com>
Subject: mm/vmalloc: separate put pages and flush VM flags
When VM_MAP_PUT_PAGES was added, it was defined with the same value as
VM_FLUSH_RESET_PERMS. This doesn't seem like it will cause any big
functional problems other than some excess flushing for VM_MAP_PUT_PAGES
allocations.
Redefine VM_MAP_PUT_PAGES to have its own value. Also, rearrange things
so flags are less likely to be missed in the future.
Link: https://lkml.kernel.org/r/20210122233706.9304-1-rick.p.edgecombe@intel.com
Fixes: b944afc9d64d ("mm: add a VM_MAP_PUT_PAGES flag for vmap")
Signed-off-by: Rick Edgecombe <rick.p.edgecombe(a)intel.com>
Suggested-by: Matthew Wilcox <willy(a)infradead.org>
Cc: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Christoph Hellwig <hch(a)lst.de>
Cc: Daniel Axtens <dja(a)axtens.net>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/vmalloc.h | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)
--- a/include/linux/vmalloc.h~mm-vmalloc-separate-put-pages-and-flush-vm-flags
+++ a/include/linux/vmalloc.h
@@ -24,7 +24,8 @@ struct notifier_block; /* in notifier.h
#define VM_UNINITIALIZED 0x00000020 /* vm_struct is not fully initialized */
#define VM_NO_GUARD 0x00000040 /* don't add guard page */
#define VM_KASAN 0x00000080 /* has allocated kasan shadow memory */
-#define VM_MAP_PUT_PAGES 0x00000100 /* put pages and free array in vfree */
+#define VM_FLUSH_RESET_PERMS 0x00000100 /* reset direct map and flush TLB on unmap, can't be freed in atomic context */
+#define VM_MAP_PUT_PAGES 0x00000200 /* put pages and free array in vfree */
/*
* VM_KASAN is used slighly differently depending on CONFIG_KASAN_VMALLOC.
@@ -37,12 +38,6 @@ struct notifier_block; /* in notifier.h
* determine which allocations need the module shadow freed.
*/
-/*
- * Memory with VM_FLUSH_RESET_PERMS cannot be freed in an interrupt or with
- * vfree_atomic().
- */
-#define VM_FLUSH_RESET_PERMS 0x00000100 /* Reset direct map and flush TLB on unmap */
-
/* bits [20..32] reserved for arch specific ioremap internals */
/*
_
Patches currently in -mm which might be from rick.p.edgecombe(a)intel.com are
I'm announcing the release of the 4.4.256 kernel.
This, and the 4.9.256 release are a little bit "different" than normal.
This contains only 1 patch, just the version bump from .255 to .256 which ends
up causing the userspace-visable LINUX_VERSION_CODE to behave a bit differently
than normal due to the "overflow".
With this release, KERNEL_VERSION(4, 4, 256) is the same as KERNEL_VERSION(4, 5, 0).
Nothing in the kernel build itself breaks with this change, but given that this
is a userspace visible change, and some crazy tools (like glibc and gcc) have
logic that checks the kernel version for different reasons, I wanted to do this
release as an "empty" release to ensure that everything still works properly.
So, this is a YOU MUST UPGRADE requirement of a release. If you rely on the
4.4.y kernel, please throw this release into your test builds and rebuild the
world and let us know if anything breaks, or if all is well.
Go forth and do full system rebuilds! Yocto and Gentoo are great for this, as
will systems that use buildroot.
I'll try to hold off on doing a "real" 4.4.y release for a week to give
everyone a chance to test this out and get back to me. The pending patches in
the 4.4.y queue are pretty serious, so I am loath to wait longer than that,
consider yourself warned...
The updated 4.4.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-4.4.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Greg Kroah-Hartman (1):
Linux 4.4.256
The patch titled
Subject: mm, compaction: move high_pfn to the for loop scope
has been removed from the -mm tree. Its filename was
mm-compaction-move-high_pfn-to-the-for-loop-scope.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Rokudo Yan <wu-yan(a)tcl.com>
Subject: mm, compaction: move high_pfn to the for loop scope
In fast_isolate_freepages, high_pfn will be used if a prefered one(PFN >=
low_fn) not found. But the high_pfn is not reset before searching an free
area, so when it was used as freepage, it may from another free area
searched before. And move_freelist_head(freelist, freepage) will have
unexpected behavior(eg. corrupt the MOVABLE freelist)
Unable to handle kernel paging request at virtual address dead000000000200
Mem abort info:
ESR = 0x96000044
Exception class = DABT (current EL), IL = 32 bits
SET = 0, FnV = 0
EA = 0, S1PTW = 0
Data abort info:
ISV = 0, ISS = 0x00000044
CM = 0, WnR = 1
[dead000000000200] address between user and kernel address ranges
-000|list_cut_before(inline)
-000|move_freelist_head(inline)
-000|fast_isolate_freepages(inline)
-000|isolate_freepages(inline)
-000|compaction_alloc(?, ?)
-001|unmap_and_move(inline)
-001|migrate_pages([NSD:0xFFFFFF80088CBBD0] from = 0xFFFFFF80088CBD88, [NSD:0xFFFFFF80088CBBC8] get_new_p
-002|__read_once_size(inline)
-002|static_key_count(inline)
-002|static_key_false(inline)
-002|trace_mm_compaction_migratepages(inline)
-002|compact_zone(?, [NSD:0xFFFFFF80088CBCB0] capc = 0x0)
-003|kcompactd_do_work(inline)
-003|kcompactd([X19] p = 0xFFFFFF93227FBC40)
-004|kthread([X20] _create = 0xFFFFFFE1AFB26380)
-005|ret_from_fork(asm)
---|end of frame
The issue was reported on an smart phone product with 6GB ram and 3GB zram
as swap device.
This patch fixes the issue by reset high_pfn before searching each free
area, which ensure freepage and freelist match when call
move_freelist_head in fast_isolate_freepages().
Link: http://lkml.kernel.org/r/20190118175136.31341-12-mgorman@techsingularity.net
Link: https://lkml.kernel.org/r/20210112094720.1238444-1-wu-yan@tcl.com
Fixes: 5a811889de10f1eb ("mm, compaction: use free lists to quickly locate a migration target")
Signed-off-by: Rokudo Yan <wu-yan(a)tcl.com>
Acked-by: Mel Gorman <mgorman(a)techsingularity.net>
Acked-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/compaction.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/mm/compaction.c~mm-compaction-move-high_pfn-to-the-for-loop-scope
+++ a/mm/compaction.c
@@ -1342,7 +1342,7 @@ fast_isolate_freepages(struct compact_co
{
unsigned int limit = min(1U, freelist_scan_limit(cc) >> 1);
unsigned int nr_scanned = 0;
- unsigned long low_pfn, min_pfn, high_pfn = 0, highest = 0;
+ unsigned long low_pfn, min_pfn, highest = 0;
unsigned long nr_isolated = 0;
unsigned long distance;
struct page *page = NULL;
@@ -1387,6 +1387,7 @@ fast_isolate_freepages(struct compact_co
struct page *freepage;
unsigned long flags;
unsigned int order_scanned = 0;
+ unsigned long high_pfn = 0;
if (!area->nr_free)
continue;
_
Patches currently in -mm which might be from wu-yan(a)tcl.com are
zsmalloc-account-the-number-of-compacted-pages-correctly.patch
The patch titled
Subject: mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active
has been removed from the -mm tree. Its filename was
mm-hugetlb-remove-vm_bug_on_page-from-page_huge_active.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Muchun Song <songmuchun(a)bytedance.com>
Subject: mm: hugetlb: remove VM_BUG_ON_PAGE from page_huge_active
The page_huge_active() can be called from scan_movable_pages() which do
not hold a reference count to the HugeTLB page. So when we call
page_huge_active() from scan_movable_pages(), the HugeTLB page can be
freed parallel. Then we will trigger a BUG_ON which is in the
page_huge_active() when CONFIG_DEBUG_VM is enabled. Just remove the
VM_BUG_ON_PAGE.
Link: https://lkml.kernel.org/r/20210115124942.46403-6-songmuchun@bytedance.com
Fixes: 7e1f049efb86 ("mm: hugetlb: cleanup using paeg_huge_active()")
Signed-off-by: Muchun Song <songmuchun(a)bytedance.com>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Yang Shi <shy828301(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
--- a/mm/hugetlb.c~mm-hugetlb-remove-vm_bug_on_page-from-page_huge_active
+++ a/mm/hugetlb.c
@@ -1361,8 +1361,7 @@ struct hstate *size_to_hstate(unsigned l
*/
bool page_huge_active(struct page *page)
{
- VM_BUG_ON_PAGE(!PageHuge(page), page);
- return PageHead(page) && PagePrivate(&page[1]);
+ return PageHeadHuge(page) && PagePrivate(&page[1]);
}
/* never called for tail page */
_
Patches currently in -mm which might be from songmuchun(a)bytedance.com are
mm-memcontrol-optimize-per-lruvec-stats-counter-memory-usage.patch
mm-memcontrol-fix-nr_anon_thps-accounting-in-charge-moving.patch
mm-memcontrol-convert-nr_anon_thps-account-to-pages.patch
mm-memcontrol-convert-nr_file_thps-account-to-pages.patch
mm-memcontrol-convert-nr_shmem_thps-account-to-pages.patch
mm-memcontrol-convert-nr_shmem_pmdmapped-account-to-pages.patch
mm-memcontrol-convert-nr_file_pmdmapped-account-to-pages.patch
mm-memcontrol-make-the-slab-calculation-consistent.patch
mm-memcontrol-replace-the-loop-with-a-list_for_each_entry.patch
hugetlb-convert-page_huge_active-hpagemigratable-flag-fix.patch
The patch titled
Subject: mm: hugetlb: fix a race between isolating and freeing page
has been removed from the -mm tree. Its filename was
mm-hugetlb-fix-a-race-between-isolating-and-freeing-page.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Muchun Song <songmuchun(a)bytedance.com>
Subject: mm: hugetlb: fix a race between isolating and freeing page
There is a race between isolate_huge_page() and __free_huge_page().
CPU0: CPU1:
if (PageHuge(page))
put_page(page)
__free_huge_page(page)
spin_lock(&hugetlb_lock)
update_and_free_page(page)
set_compound_page_dtor(page,
NULL_COMPOUND_DTOR)
spin_unlock(&hugetlb_lock)
isolate_huge_page(page)
// trigger BUG_ON
VM_BUG_ON_PAGE(!PageHead(page), page)
spin_lock(&hugetlb_lock)
page_huge_active(page)
// trigger BUG_ON
VM_BUG_ON_PAGE(!PageHuge(page), page)
spin_unlock(&hugetlb_lock)
When we isolate a HugeTLB page on CPU0. Meanwhile, we free it to the
buddy allocator on CPU1. Then, we can trigger a BUG_ON on CPU0. Because
it is already freed to the buddy allocator.
Link: https://lkml.kernel.org/r/20210115124942.46403-5-songmuchun@bytedance.com
Fixes: c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
Signed-off-by: Muchun Song <songmuchun(a)bytedance.com>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Yang Shi <shy828301(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
--- a/mm/hugetlb.c~mm-hugetlb-fix-a-race-between-isolating-and-freeing-page
+++ a/mm/hugetlb.c
@@ -5594,9 +5594,9 @@ bool isolate_huge_page(struct page *page
{
bool ret = true;
- VM_BUG_ON_PAGE(!PageHead(page), page);
spin_lock(&hugetlb_lock);
- if (!page_huge_active(page) || !get_page_unless_zero(page)) {
+ if (!PageHeadHuge(page) || !page_huge_active(page) ||
+ !get_page_unless_zero(page)) {
ret = false;
goto unlock;
}
_
Patches currently in -mm which might be from songmuchun(a)bytedance.com are
mm-memcontrol-optimize-per-lruvec-stats-counter-memory-usage.patch
mm-memcontrol-fix-nr_anon_thps-accounting-in-charge-moving.patch
mm-memcontrol-convert-nr_anon_thps-account-to-pages.patch
mm-memcontrol-convert-nr_file_thps-account-to-pages.patch
mm-memcontrol-convert-nr_shmem_thps-account-to-pages.patch
mm-memcontrol-convert-nr_shmem_pmdmapped-account-to-pages.patch
mm-memcontrol-convert-nr_file_pmdmapped-account-to-pages.patch
mm-memcontrol-make-the-slab-calculation-consistent.patch
mm-memcontrol-replace-the-loop-with-a-list_for_each_entry.patch
hugetlb-convert-page_huge_active-hpagemigratable-flag-fix.patch
The patch titled
Subject: mm: hugetlb: fix a race between freeing and dissolving the page
has been removed from the -mm tree. Its filename was
mm-hugetlb-fix-a-race-between-freeing-and-dissolving-the-page.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Muchun Song <songmuchun(a)bytedance.com>
Subject: mm: hugetlb: fix a race between freeing and dissolving the page
There is a race condition between __free_huge_page()
and dissolve_free_huge_page().
CPU0: CPU1:
// page_count(page) == 1
put_page(page)
__free_huge_page(page)
dissolve_free_huge_page(page)
spin_lock(&hugetlb_lock)
// PageHuge(page) && !page_count(page)
update_and_free_page(page)
// page is freed to the buddy
spin_unlock(&hugetlb_lock)
spin_lock(&hugetlb_lock)
clear_page_huge_active(page)
enqueue_huge_page(page)
// It is wrong, the page is already freed
spin_unlock(&hugetlb_lock)
The race windows is between put_page() and dissolve_free_huge_page().
We should make sure that the page is already on the free list
when it is dissolved.
As a result __free_huge_page would corrupt page(s) already in the buddy
allocator.
Link: https://lkml.kernel.org/r/20210115124942.46403-4-songmuchun@bytedance.com
Fixes: c8721bbbdd36 ("mm: memory-hotplug: enable memory hotplug to handle hugepage")
Signed-off-by: Muchun Song <songmuchun(a)bytedance.com>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Yang Shi <shy828301(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 39 +++++++++++++++++++++++++++++++++++++++
1 file changed, 39 insertions(+)
--- a/mm/hugetlb.c~mm-hugetlb-fix-a-race-between-freeing-and-dissolving-the-page
+++ a/mm/hugetlb.c
@@ -79,6 +79,21 @@ DEFINE_SPINLOCK(hugetlb_lock);
static int num_fault_mutexes;
struct mutex *hugetlb_fault_mutex_table ____cacheline_aligned_in_smp;
+static inline bool PageHugeFreed(struct page *head)
+{
+ return page_private(head + 4) == -1UL;
+}
+
+static inline void SetPageHugeFreed(struct page *head)
+{
+ set_page_private(head + 4, -1UL);
+}
+
+static inline void ClearPageHugeFreed(struct page *head)
+{
+ set_page_private(head + 4, 0);
+}
+
/* Forward declaration */
static int hugetlb_acct_memory(struct hstate *h, long delta);
@@ -1028,6 +1043,7 @@ static void enqueue_huge_page(struct hst
list_move(&page->lru, &h->hugepage_freelists[nid]);
h->free_huge_pages++;
h->free_huge_pages_node[nid]++;
+ SetPageHugeFreed(page);
}
static struct page *dequeue_huge_page_node_exact(struct hstate *h, int nid)
@@ -1044,6 +1060,7 @@ static struct page *dequeue_huge_page_no
list_move(&page->lru, &h->hugepage_activelist);
set_page_refcounted(page);
+ ClearPageHugeFreed(page);
h->free_huge_pages--;
h->free_huge_pages_node[nid]--;
return page;
@@ -1505,6 +1522,7 @@ static void prep_new_huge_page(struct hs
spin_lock(&hugetlb_lock);
h->nr_huge_pages++;
h->nr_huge_pages_node[nid]++;
+ ClearPageHugeFreed(page);
spin_unlock(&hugetlb_lock);
}
@@ -1755,6 +1773,7 @@ int dissolve_free_huge_page(struct page
{
int rc = -EBUSY;
+retry:
/* Not to disrupt normal path by vainly holding hugetlb_lock */
if (!PageHuge(page))
return 0;
@@ -1771,6 +1790,26 @@ int dissolve_free_huge_page(struct page
int nid = page_to_nid(head);
if (h->free_huge_pages - h->resv_huge_pages == 0)
goto out;
+
+ /*
+ * We should make sure that the page is already on the free list
+ * when it is dissolved.
+ */
+ if (unlikely(!PageHugeFreed(head))) {
+ spin_unlock(&hugetlb_lock);
+ cond_resched();
+
+ /*
+ * Theoretically, we should return -EBUSY when we
+ * encounter this race. In fact, we have a chance
+ * to successfully dissolve the page if we do a
+ * retry. Because the race window is quite small.
+ * If we seize this opportunity, it is an optimization
+ * for increasing the success rate of dissolving page.
+ */
+ goto retry;
+ }
+
/*
* Move PageHWPoison flag from head page to the raw error page,
* which makes any subpages rather than the error page reusable.
_
Patches currently in -mm which might be from songmuchun(a)bytedance.com are
mm-memcontrol-optimize-per-lruvec-stats-counter-memory-usage.patch
mm-memcontrol-fix-nr_anon_thps-accounting-in-charge-moving.patch
mm-memcontrol-convert-nr_anon_thps-account-to-pages.patch
mm-memcontrol-convert-nr_file_thps-account-to-pages.patch
mm-memcontrol-convert-nr_shmem_thps-account-to-pages.patch
mm-memcontrol-convert-nr_shmem_pmdmapped-account-to-pages.patch
mm-memcontrol-convert-nr_file_pmdmapped-account-to-pages.patch
mm-memcontrol-make-the-slab-calculation-consistent.patch
mm-memcontrol-replace-the-loop-with-a-list_for_each_entry.patch
hugetlb-convert-page_huge_active-hpagemigratable-flag-fix.patch
The patch titled
Subject: mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page
has been removed from the -mm tree. Its filename was
mm-hugetlbfs-fix-cannot-migrate-the-fallocated-hugetlb-page.patch
This patch was dropped because it was merged into mainline or a subsystem tree
------------------------------------------------------
From: Muchun Song <songmuchun(a)bytedance.com>
Subject: mm: hugetlbfs: fix cannot migrate the fallocated HugeTLB page
If a new hugetlb page is allocated during fallocate it will not be marked
as active (set_page_huge_active) which will result in a later
isolate_huge_page failure when the page migration code would like to move
that page. Such a failure would be unexpected and wrong.
Only export set_page_huge_active, just leave clear_page_huge_active as
static. Because there are no external users.
Link: https://lkml.kernel.org/r/20210115124942.46403-3-songmuchun@bytedance.com
Fixes: 70c3547e36f5 (hugetlbfs: add hugetlbfs_fallocate())
Signed-off-by: Muchun Song <songmuchun(a)bytedance.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Cc: David Hildenbrand <david(a)redhat.com>
Cc: Yang Shi <shy828301(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/hugetlbfs/inode.c | 3 ++-
include/linux/hugetlb.h | 2 ++
mm/hugetlb.c | 2 +-
3 files changed, 5 insertions(+), 2 deletions(-)
--- a/fs/hugetlbfs/inode.c~mm-hugetlbfs-fix-cannot-migrate-the-fallocated-hugetlb-page
+++ a/fs/hugetlbfs/inode.c
@@ -735,9 +735,10 @@ static long hugetlbfs_fallocate(struct f
mutex_unlock(&hugetlb_fault_mutex_table[hash]);
+ set_page_huge_active(page);
/*
* unlock_page because locked by add_to_page_cache()
- * page_put due to reference from alloc_huge_page()
+ * put_page() due to reference from alloc_huge_page()
*/
unlock_page(page);
put_page(page);
--- a/include/linux/hugetlb.h~mm-hugetlbfs-fix-cannot-migrate-the-fallocated-hugetlb-page
+++ a/include/linux/hugetlb.h
@@ -770,6 +770,8 @@ static inline void huge_ptep_modify_prot
}
#endif
+void set_page_huge_active(struct page *page);
+
#else /* CONFIG_HUGETLB_PAGE */
struct hstate {};
--- a/mm/hugetlb.c~mm-hugetlbfs-fix-cannot-migrate-the-fallocated-hugetlb-page
+++ a/mm/hugetlb.c
@@ -1349,7 +1349,7 @@ bool page_huge_active(struct page *page)
}
/* never called for tail page */
-static void set_page_huge_active(struct page *page)
+void set_page_huge_active(struct page *page)
{
VM_BUG_ON_PAGE(!PageHeadHuge(page), page);
SetPagePrivate(&page[1]);
_
Patches currently in -mm which might be from songmuchun(a)bytedance.com are
mm-memcontrol-optimize-per-lruvec-stats-counter-memory-usage.patch
mm-memcontrol-fix-nr_anon_thps-accounting-in-charge-moving.patch
mm-memcontrol-convert-nr_anon_thps-account-to-pages.patch
mm-memcontrol-convert-nr_file_thps-account-to-pages.patch
mm-memcontrol-convert-nr_shmem_thps-account-to-pages.patch
mm-memcontrol-convert-nr_shmem_pmdmapped-account-to-pages.patch
mm-memcontrol-convert-nr_file_pmdmapped-account-to-pages.patch
mm-memcontrol-make-the-slab-calculation-consistent.patch
mm-memcontrol-replace-the-loop-with-a-list_for_each_entry.patch
hugetlb-convert-page_huge_active-hpagemigratable-flag-fix.patch
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 7e0a9220467dbcfdc5bc62825724f3e52e50ab31 Mon Sep 17 00:00:00 2001
From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org>
Date: Fri, 29 Jan 2021 10:13:53 -0500
Subject: [PATCH] fgraph: Initialize tracing_graph_pause at task creation
On some archs, the idle task can call into cpu_suspend(). The cpu_suspend()
will disable or pause function graph tracing, as there's some paths in
bringing down the CPU that can have issues with its return address being
modified. The task_struct structure has a "tracing_graph_pause" atomic
counter, that when set to something other than zero, the function graph
tracer will not modify the return address.
The problem is that the tracing_graph_pause counter is initialized when the
function graph tracer is enabled. This can corrupt the counter for the idle
task if it is suspended in these architectures.
CPU 1 CPU 2
----- -----
do_idle()
cpu_suspend()
pause_graph_tracing()
task_struct->tracing_graph_pause++ (0 -> 1)
start_graph_tracing()
for_each_online_cpu(cpu) {
ftrace_graph_init_idle_task(cpu)
task-struct->tracing_graph_pause = 0 (1 -> 0)
unpause_graph_tracing()
task_struct->tracing_graph_pause-- (0 -> -1)
The above should have gone from 1 to zero, and enabled function graph
tracing again. But instead, it is set to -1, which keeps it disabled.
There's no reason that the field tracing_graph_pause on the task_struct can
not be initialized at boot up.
Cc: stable(a)vger.kernel.org
Fixes: 380c4b1411ccd ("tracing/function-graph-tracer: append the tracing_graph_flag")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=211339
Reported-by: pierre.gondois(a)arm.com
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
diff --git a/init/init_task.c b/init/init_task.c
index 8a992d73e6fb..3711cdaafed2 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -198,7 +198,8 @@ struct task_struct init_task
.lockdep_recursion = 0,
#endif
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
- .ret_stack = NULL,
+ .ret_stack = NULL,
+ .tracing_graph_pause = ATOMIC_INIT(0),
#endif
#if defined(CONFIG_TRACING) && defined(CONFIG_PREEMPTION)
.trace_recursion = 0,
diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
index 73edb9e4f354..29a6ebeebc9e 100644
--- a/kernel/trace/fgraph.c
+++ b/kernel/trace/fgraph.c
@@ -394,7 +394,6 @@ static int alloc_retstack_tasklist(struct ftrace_ret_stack **ret_stack_list)
}
if (t->ret_stack == NULL) {
- atomic_set(&t->tracing_graph_pause, 0);
atomic_set(&t->trace_overrun, 0);
t->curr_ret_stack = -1;
t->curr_ret_depth = -1;
@@ -489,7 +488,6 @@ static DEFINE_PER_CPU(struct ftrace_ret_stack *, idle_ret_stack);
static void
graph_init_task(struct task_struct *t, struct ftrace_ret_stack *ret_stack)
{
- atomic_set(&t->tracing_graph_pause, 0);
atomic_set(&t->trace_overrun, 0);
t->ftrace_timestamp = 0;
/* make curr_ret_stack visible before we add the ret_stack */
The TypeC FIA can be powered down if the TC-COLD power state is allowed,
so block the TC-COLD state when initializing the FIA.
Note that this isn't needed on ICL where the FIA is never modular and
which has no generic way to block TC-COLD (except for platforms with a
legacy TypeC port and on those too only via these legacy ports, not via
a DP-alt/TBT port).
Cc: <stable(a)vger.kernel.org> # v5.10+
Cc: José Roberto de Souza <jose.souza(a)intel.com>
Reported-by: Paul Menzel <pmenzel(a)molgen.mpg.de>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/3027
Signed-off-by: Imre Deak <imre.deak(a)intel.com>
---
drivers/gpu/drm/i915/display/intel_tc.c | 67 ++++++++++++++-----------
1 file changed, 37 insertions(+), 30 deletions(-)
diff --git a/drivers/gpu/drm/i915/display/intel_tc.c b/drivers/gpu/drm/i915/display/intel_tc.c
index 27dc2dad6809c..2cefc13535a0f 100644
--- a/drivers/gpu/drm/i915/display/intel_tc.c
+++ b/drivers/gpu/drm/i915/display/intel_tc.c
@@ -23,36 +23,6 @@ static const char *tc_port_mode_name(enum tc_port_mode mode)
return names[mode];
}
-static void
-tc_port_load_fia_params(struct drm_i915_private *i915,
- struct intel_digital_port *dig_port)
-{
- enum port port = dig_port->base.port;
- enum tc_port tc_port = intel_port_to_tc(i915, port);
- u32 modular_fia;
-
- if (INTEL_INFO(i915)->display.has_modular_fia) {
- modular_fia = intel_uncore_read(&i915->uncore,
- PORT_TX_DFLEXDPSP(FIA1));
- drm_WARN_ON(&i915->drm, modular_fia == 0xffffffff);
- modular_fia &= MODULAR_FIA_MASK;
- } else {
- modular_fia = 0;
- }
-
- /*
- * Each Modular FIA instance houses 2 TC ports. In SOC that has more
- * than two TC ports, there are multiple instances of Modular FIA.
- */
- if (modular_fia) {
- dig_port->tc_phy_fia = tc_port / 2;
- dig_port->tc_phy_fia_idx = tc_port % 2;
- } else {
- dig_port->tc_phy_fia = FIA1;
- dig_port->tc_phy_fia_idx = tc_port;
- }
-}
-
static enum intel_display_power_domain
tc_cold_get_power_domain(struct intel_digital_port *dig_port)
{
@@ -646,6 +616,43 @@ void intel_tc_port_put_link(struct intel_digital_port *dig_port)
mutex_unlock(&dig_port->tc_lock);
}
+static bool
+tc_has_modular_fia(struct drm_i915_private *i915, struct intel_digital_port *dig_port)
+{
+ intel_wakeref_t wakeref;
+ u32 val;
+
+ if (!INTEL_INFO(i915)->display.has_modular_fia)
+ return false;
+
+ wakeref = tc_cold_block(dig_port);
+ val = intel_uncore_read(&i915->uncore, PORT_TX_DFLEXDPSP(FIA1));
+ tc_cold_unblock(dig_port, wakeref);
+
+ drm_WARN_ON(&i915->drm, val == 0xffffffff);
+
+ return val & MODULAR_FIA_MASK;
+}
+
+static void
+tc_port_load_fia_params(struct drm_i915_private *i915, struct intel_digital_port *dig_port)
+{
+ enum port port = dig_port->base.port;
+ enum tc_port tc_port = intel_port_to_tc(i915, port);
+
+ /*
+ * Each Modular FIA instance houses 2 TC ports. In SOC that has more
+ * than two TC ports, there are multiple instances of Modular FIA.
+ */
+ if (tc_has_modular_fia(i915, dig_port)) {
+ dig_port->tc_phy_fia = tc_port / 2;
+ dig_port->tc_phy_fia_idx = tc_port % 2;
+ } else {
+ dig_port->tc_phy_fia = FIA1;
+ dig_port->tc_phy_fia_idx = tc_port;
+ }
+}
+
void intel_tc_port_init(struct intel_digital_port *dig_port, bool is_legacy)
{
struct drm_i915_private *i915 = to_i915(dig_port->base.base.dev);
--
2.25.1
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 3241929b67d28c83945d3191c6816a3271fd6b85 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Pali=20Roh=C3=A1r?= <pali(a)kernel.org>
Date: Mon, 1 Feb 2021 16:08:03 +0100
Subject: [PATCH] usb: host: xhci: mvebu: make USB 3.0 PHY optional for Armada
3720
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Older ATF does not provide SMC call for USB 3.0 phy power on functionality
and therefore initialization of xhci-hcd is failing when older version of
ATF is used. In this case phy_power_on() function returns -EOPNOTSUPP.
[ 3.108467] mvebu-a3700-comphy d0018300.phy: unsupported SMC call, try updating your firmware
[ 3.117250] phy phy-d0018300.phy.0: phy poweron failed --> -95
[ 3.123465] xhci-hcd: probe of d0058000.usb failed with error -95
This patch introduces a new plat_setup callback for xhci platform drivers
which is called prior calling usb_add_hcd() function. This function at its
beginning skips PHY init if hcd->skip_phy_initialization is set.
Current init_quirk callback for xhci platform drivers is called from
xhci_plat_setup() function which is called after chip reset completes.
It happens in the middle of the usb_add_hcd() function and therefore this
callback cannot be used for setting if PHY init should be skipped or not.
For Armada 3720 this patch introduce a new xhci_mvebu_a3700_plat_setup()
function configured as a xhci platform plat_setup callback. This new
function calls phy_power_on() and in case it returns -EOPNOTSUPP then
XHCI_SKIP_PHY_INIT quirk is set to instruct xhci-plat to skip PHY
initialization.
This patch fixes above failure by ignoring 'not supported' error in
xhci-hcd driver. In this case it is expected that phy is already power on.
It fixes initialization of xhci-hcd on Espressobin boards where is older
Marvell's Arm Trusted Firmware without SMC call for USB 3.0 phy power.
This is regression introduced in commit bd3d25b07342 ("arm64: dts: marvell:
armada-37xx: link USB hosts with their PHYs") where USB 3.0 phy was defined
and therefore xhci-hcd on Espressobin with older ATF started failing.
Fixes: bd3d25b07342 ("arm64: dts: marvell: armada-37xx: link USB hosts with their PHYs")
Cc: <stable(a)vger.kernel.org> # 5.1+: ea17a0f153af: phy: marvell: comphy: Convert internal SMCC firmware return codes to errno
Cc: <stable(a)vger.kernel.org> # 5.1+: f768e718911e: usb: host: xhci-plat: add priv quirk for skip PHY initialization
Tested-by: Tomasz Maciej Nowak <tmn505(a)gmail.com>
Tested-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com> # On R-Car
Reviewed-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com> # xhci-plat
Acked-by: Mathias Nyman <mathias.nyman(a)linux.intel.com>
Signed-off-by: Pali Rohár <pali(a)kernel.org>
Link: https://lore.kernel.org/r/20210201150803.7305-1-pali@kernel.org
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/host/xhci-mvebu.c b/drivers/usb/host/xhci-mvebu.c
index 60651a50770f..8ca1a235d164 100644
--- a/drivers/usb/host/xhci-mvebu.c
+++ b/drivers/usb/host/xhci-mvebu.c
@@ -8,6 +8,7 @@
#include <linux/mbus.h>
#include <linux/of.h>
#include <linux/platform_device.h>
+#include <linux/phy/phy.h>
#include <linux/usb.h>
#include <linux/usb/hcd.h>
@@ -74,6 +75,47 @@ int xhci_mvebu_mbus_init_quirk(struct usb_hcd *hcd)
return 0;
}
+int xhci_mvebu_a3700_plat_setup(struct usb_hcd *hcd)
+{
+ struct xhci_hcd *xhci = hcd_to_xhci(hcd);
+ struct device *dev = hcd->self.controller;
+ struct phy *phy;
+ int ret;
+
+ /* Old bindings miss the PHY handle */
+ phy = of_phy_get(dev->of_node, "usb3-phy");
+ if (IS_ERR(phy) && PTR_ERR(phy) == -EPROBE_DEFER)
+ return -EPROBE_DEFER;
+ else if (IS_ERR(phy))
+ goto phy_out;
+
+ ret = phy_init(phy);
+ if (ret)
+ goto phy_put;
+
+ ret = phy_set_mode(phy, PHY_MODE_USB_HOST_SS);
+ if (ret)
+ goto phy_exit;
+
+ ret = phy_power_on(phy);
+ if (ret == -EOPNOTSUPP) {
+ /* Skip initializatin of XHCI PHY when it is unsupported by firmware */
+ dev_warn(dev, "PHY unsupported by firmware\n");
+ xhci->quirks |= XHCI_SKIP_PHY_INIT;
+ }
+ if (ret)
+ goto phy_exit;
+
+ phy_power_off(phy);
+phy_exit:
+ phy_exit(phy);
+phy_put:
+ of_phy_put(phy);
+phy_out:
+
+ return 0;
+}
+
int xhci_mvebu_a3700_init_quirk(struct usb_hcd *hcd)
{
struct xhci_hcd *xhci = hcd_to_xhci(hcd);
diff --git a/drivers/usb/host/xhci-mvebu.h b/drivers/usb/host/xhci-mvebu.h
index 3be021793cc8..01bf3fcb3eca 100644
--- a/drivers/usb/host/xhci-mvebu.h
+++ b/drivers/usb/host/xhci-mvebu.h
@@ -12,6 +12,7 @@ struct usb_hcd;
#if IS_ENABLED(CONFIG_USB_XHCI_MVEBU)
int xhci_mvebu_mbus_init_quirk(struct usb_hcd *hcd);
+int xhci_mvebu_a3700_plat_setup(struct usb_hcd *hcd);
int xhci_mvebu_a3700_init_quirk(struct usb_hcd *hcd);
#else
static inline int xhci_mvebu_mbus_init_quirk(struct usb_hcd *hcd)
@@ -19,6 +20,11 @@ static inline int xhci_mvebu_mbus_init_quirk(struct usb_hcd *hcd)
return 0;
}
+static inline int xhci_mvebu_a3700_plat_setup(struct usb_hcd *hcd)
+{
+ return 0;
+}
+
static inline int xhci_mvebu_a3700_init_quirk(struct usb_hcd *hcd)
{
return 0;
diff --git a/drivers/usb/host/xhci-plat.c b/drivers/usb/host/xhci-plat.c
index 4d34f6005381..c1edcc9b13ce 100644
--- a/drivers/usb/host/xhci-plat.c
+++ b/drivers/usb/host/xhci-plat.c
@@ -44,6 +44,16 @@ static void xhci_priv_plat_start(struct usb_hcd *hcd)
priv->plat_start(hcd);
}
+static int xhci_priv_plat_setup(struct usb_hcd *hcd)
+{
+ struct xhci_plat_priv *priv = hcd_to_xhci_priv(hcd);
+
+ if (!priv->plat_setup)
+ return 0;
+
+ return priv->plat_setup(hcd);
+}
+
static int xhci_priv_init_quirk(struct usb_hcd *hcd)
{
struct xhci_plat_priv *priv = hcd_to_xhci_priv(hcd);
@@ -111,6 +121,7 @@ static const struct xhci_plat_priv xhci_plat_marvell_armada = {
};
static const struct xhci_plat_priv xhci_plat_marvell_armada3700 = {
+ .plat_setup = xhci_mvebu_a3700_plat_setup,
.init_quirk = xhci_mvebu_a3700_init_quirk,
};
@@ -330,7 +341,14 @@ static int xhci_plat_probe(struct platform_device *pdev)
hcd->tpl_support = of_usb_host_tpl_support(sysdev->of_node);
xhci->shared_hcd->tpl_support = hcd->tpl_support;
- if (priv && (priv->quirks & XHCI_SKIP_PHY_INIT))
+
+ if (priv) {
+ ret = xhci_priv_plat_setup(hcd);
+ if (ret)
+ goto disable_usb_phy;
+ }
+
+ if ((xhci->quirks & XHCI_SKIP_PHY_INIT) || (priv && (priv->quirks & XHCI_SKIP_PHY_INIT)))
hcd->skip_phy_initialization = 1;
if (priv && (priv->quirks & XHCI_SG_TRB_CACHE_SIZE_QUIRK))
diff --git a/drivers/usb/host/xhci-plat.h b/drivers/usb/host/xhci-plat.h
index 1fb149d1fbce..561d0b7bce09 100644
--- a/drivers/usb/host/xhci-plat.h
+++ b/drivers/usb/host/xhci-plat.h
@@ -13,6 +13,7 @@
struct xhci_plat_priv {
const char *firmware_name;
unsigned long long quirks;
+ int (*plat_setup)(struct usb_hcd *);
void (*plat_start)(struct usb_hcd *);
int (*init_quirk)(struct usb_hcd *);
int (*suspend_quirk)(struct usb_hcd *);
Dear Friend:
I am sorry to encroach into your privacy in this manner, I find
it appropriate to offer you my partnership in business, I only
pray at this time that your email address is still valid. I am
contacting you because my status would not permit me to do this
alone.
I want to solicit your attention to receive money on my behalf.
Can I trust you to do business?
I will send you the full details when I receive your reply.
Best wishes.
The erratum 1024718 affects Cortex-A55 r0p0 to r2p0. However
we apply the work around for r0p0 - r1p0. Unfortunately this
won't be fixed for the future revisions for the CPU. Thus
extend the work around for all versions of A55, to cover
for r2p0 and any future revisions.
Cc: stable(a)vger.kernel.org
Cc: Catalin Marinas <catalin.marinas(a)arm.com>
Cc: Will Deacon <will(a)kernel.org>
Cc: James Morse <james.morse(a)arm.com>
Cc: Kunihiko Hayashi <hayashi.kunihiko(a)socionext.com>
Signed-off-by: Suzuki K Poulose <suzuki.poulose(a)arm.com>
---
arch/arm64/kernel/cpufeature.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index e99eddec0a46..db400ca77427 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -1455,7 +1455,7 @@ static bool cpu_has_broken_dbm(void)
/* List of CPUs which have broken DBM support. */
static const struct midr_range cpus[] = {
#ifdef CONFIG_ARM64_ERRATUM_1024718
- MIDR_RANGE(MIDR_CORTEX_A55, 0, 0, 1, 0), // A55 r0p0 -r1p0
+ MIDR_ALL_VERSIONS(MIDR_CORTEX_A55),
/* Kryo4xx Silver (rdpe => r1p0) */
MIDR_REV(MIDR_QCOM_KRYO_4XX_SILVER, 0xd, 0xe),
#endif
--
2.24.1
The first three patches are fixes for XSA-332. The avoid WARN splats
and a performance issue with interdomain events.
Patches 4 and 5 are some additions to event handling in order to add
some per pv-device statistics to sysfs and the ability to have a per
backend device spurious event delay control.
Patches 6 and 7 are minor fixes I had lying around.
Juergen Gross (7):
xen/events: reset affinity of 2-level event initially
xen/events: don't unmask an event channel when an eoi is pending
xen/events: fix lateeoi irq acknowledgement
xen/events: link interdomain events to associated xenbus device
xen/events: add per-xenbus device event statistics and settings
xen/evtch: use smp barriers for user event ring
xen/evtchn: read producer index only once
drivers/block/xen-blkback/xenbus.c | 2 +-
drivers/net/xen-netback/interface.c | 16 ++--
drivers/xen/events/events_2l.c | 20 +++++
drivers/xen/events/events_base.c | 133 ++++++++++++++++++++++------
drivers/xen/evtchn.c | 6 +-
drivers/xen/pvcalls-back.c | 4 +-
drivers/xen/xen-pciback/xenbus.c | 2 +-
drivers/xen/xen-scsiback.c | 2 +-
drivers/xen/xenbus/xenbus_probe.c | 66 ++++++++++++++
include/xen/events.h | 7 +-
include/xen/xenbus.h | 7 ++
11 files changed, 217 insertions(+), 48 deletions(-)
--
2.26.2
From: Xiao Ni <xni(a)redhat.com>
One customer reports a crash problem which causes by flush request. It
triggers a warning before crash.
/* new request after previous flush is completed */
if (ktime_after(req_start, mddev->prev_flush_start)) {
WARN_ON(mddev->flush_bio);
mddev->flush_bio = bio;
bio = NULL;
}
The WARN_ON is triggered. We use spin lock to protect prev_flush_start and
flush_bio in md_flush_request. But there is no lock protection in
md_submit_flush_data. It can set flush_bio to NULL first because of
compiler reordering write instructions.
For example, flush bio1 sets flush bio to NULL first in
md_submit_flush_data. An interrupt or vmware causing an extended stall
happen between updating flush_bio and prev_flush_start. Because flush_bio
is NULL, flush bio2 can get the lock and submit to underlayer disks. Then
flush bio1 updates prev_flush_start after the interrupt or extended stall.
Then flush bio3 enters in md_flush_request. The start time req_start is
behind prev_flush_start. The flush_bio is not NULL(flush bio2 hasn't
finished). So it can trigger the WARN_ON now. Then it calls INIT_WORK
again. INIT_WORK() will re-initialize the list pointers in the
work_struct, which then can result in a corrupted work list and the
work_struct queued a second time. With the work list corrupted, it can
lead in invalid work items being used and cause a crash in
process_one_work.
We need to make sure only one flush bio can be handled at one same time.
So add spin lock in md_submit_flush_data to protect prev_flush_start and
flush_bio in an atomic way.
Reviewed-by: David Jeffery <djeffery(a)redhat.com>
Signed-off-by: Xiao Ni <xni(a)redhat.com>
Signed-off-by: Song Liu <songliubraving(a)fb.com>
[jwang: backport dc5d17a3c39b06aef866afca19245a9cfb533a79 to 4.19]
Signed-off-by: Jack Wang <jinpu.wang(a)cloud.ionos.com>
---
drivers/md/md.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/md/md.c b/drivers/md/md.c
index ea139d0c0bc3..2bd60bd9e2ca 100644
--- a/drivers/md/md.c
+++ b/drivers/md/md.c
@@ -639,8 +639,10 @@ static void md_submit_flush_data(struct work_struct *ws)
* could wait for this and below md_handle_request could wait for those
* bios because of suspend check
*/
+ spin_lock_irq(&mddev->lock);
mddev->last_flush = mddev->start_flush;
mddev->flush_bio = NULL;
+ spin_unlock_irq(&mddev->lock);
wake_up(&mddev->sb_wait);
if (bio->bi_iter.bi_size == 0) {
--
2.25.1
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From f295c8cfec833c2707ff1512da10d65386dde7af Mon Sep 17 00:00:00 2001
From: Dave Airlie <airlied(a)redhat.com>
Date: Mon, 1 Feb 2021 10:56:32 +1000
Subject: [PATCH] drm/nouveau: fix dma syncing warning with debugging on.
Since I wrote the below patch if you run a debug kernel you can a
dma debug warning like:
nouveau 0000:1f:00.0: DMA-API: device driver tries to sync DMA memory it has not allocated [device address=0x000000016e012000] [size=4096 bytes]
The old nouveau code wasn't consolidate the pages like the ttm code,
but the dma-debug expects the sync code to give it the same base/range
pairs as the allocator.
Fix the nouveau sync code to consolidate pages before calling the
sync code.
Fixes: bd549d35b4be0 ("nouveau: use ttm populate mapping functions. (v2)")
Reported-by: Lyude Paul <lyude(a)redhat.com>
Reviewed-by: Ben Skeggs <bskeggs(a)redhat.com>
Signed-off-by: Dave Airlie <airlied(a)redhat.com>
Link: https://patchwork.freedesktop.org/patch/417588/
diff --git a/drivers/gpu/drm/nouveau/nouveau_bo.c b/drivers/gpu/drm/nouveau/nouveau_bo.c
index c85b1af06b7b..7ea367a5444d 100644
--- a/drivers/gpu/drm/nouveau/nouveau_bo.c
+++ b/drivers/gpu/drm/nouveau/nouveau_bo.c
@@ -547,7 +547,7 @@ nouveau_bo_sync_for_device(struct nouveau_bo *nvbo)
{
struct nouveau_drm *drm = nouveau_bdev(nvbo->bo.bdev);
struct ttm_tt *ttm_dma = (struct ttm_tt *)nvbo->bo.ttm;
- int i;
+ int i, j;
if (!ttm_dma)
return;
@@ -556,10 +556,21 @@ nouveau_bo_sync_for_device(struct nouveau_bo *nvbo)
if (nvbo->force_coherent)
return;
- for (i = 0; i < ttm_dma->num_pages; i++)
+ for (i = 0; i < ttm_dma->num_pages; ++i) {
+ struct page *p = ttm_dma->pages[i];
+ size_t num_pages = 1;
+
+ for (j = i + 1; j < ttm_dma->num_pages; ++j) {
+ if (++p != ttm_dma->pages[j])
+ break;
+
+ ++num_pages;
+ }
dma_sync_single_for_device(drm->dev->dev,
ttm_dma->dma_address[i],
- PAGE_SIZE, DMA_TO_DEVICE);
+ num_pages * PAGE_SIZE, DMA_TO_DEVICE);
+ i += num_pages;
+ }
}
void
@@ -567,7 +578,7 @@ nouveau_bo_sync_for_cpu(struct nouveau_bo *nvbo)
{
struct nouveau_drm *drm = nouveau_bdev(nvbo->bo.bdev);
struct ttm_tt *ttm_dma = (struct ttm_tt *)nvbo->bo.ttm;
- int i;
+ int i, j;
if (!ttm_dma)
return;
@@ -576,9 +587,21 @@ nouveau_bo_sync_for_cpu(struct nouveau_bo *nvbo)
if (nvbo->force_coherent)
return;
- for (i = 0; i < ttm_dma->num_pages; i++)
+ for (i = 0; i < ttm_dma->num_pages; ++i) {
+ struct page *p = ttm_dma->pages[i];
+ size_t num_pages = 1;
+
+ for (j = i + 1; j < ttm_dma->num_pages; ++j) {
+ if (++p != ttm_dma->pages[j])
+ break;
+
+ ++num_pages;
+ }
+
dma_sync_single_for_cpu(drm->dev->dev, ttm_dma->dma_address[i],
- PAGE_SIZE, DMA_FROM_DEVICE);
+ num_pages * PAGE_SIZE, DMA_FROM_DEVICE);
+ i += num_pages;
+ }
}
void nouveau_bo_add_io_reserve_lru(struct ttm_buffer_object *bo)
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 29b32839725f8c89a41cb6ee054c85f3116ea8b5 Mon Sep 17 00:00:00 2001
From: Nadav Amit <namit(a)vmware.com>
Date: Wed, 27 Jan 2021 09:53:17 -0800
Subject: [PATCH] iommu/vt-d: Do not use flush-queue when caching-mode is on
When an Intel IOMMU is virtualized, and a physical device is
passed-through to the VM, changes of the virtual IOMMU need to be
propagated to the physical IOMMU. The hypervisor therefore needs to
monitor PTE mappings in the IOMMU page-tables. Intel specifications
provide "caching-mode" capability that a virtual IOMMU uses to report
that the IOMMU is virtualized and a TLB flush is needed after mapping to
allow the hypervisor to propagate virtual IOMMU mappings to the physical
IOMMU. To the best of my knowledge no real physical IOMMU reports
"caching-mode" as turned on.
Synchronizing the virtual and the physical IOMMU tables is expensive if
the hypervisor is unaware which PTEs have changed, as the hypervisor is
required to walk all the virtualized tables and look for changes.
Consequently, domain flushes are much more expensive than page-specific
flushes on virtualized IOMMUs with passthrough devices. The kernel
therefore exploited the "caching-mode" indication to avoid domain
flushing and use page-specific flushing in virtualized environments. See
commit 78d5f0f500e6 ("intel-iommu: Avoid global flushes with caching
mode.")
This behavior changed after commit 13cf01744608 ("iommu/vt-d: Make use
of iova deferred flushing"). Now, when batched TLB flushing is used (the
default), full TLB domain flushes are performed frequently, requiring
the hypervisor to perform expensive synchronization between the virtual
TLB and the physical one.
Getting batched TLB flushes to use page-specific invalidations again in
such circumstances is not easy, since the TLB invalidation scheme
assumes that "full" domain TLB flushes are performed for scalability.
Disable batched TLB flushes when caching-mode is on, as the performance
benefit from using batched TLB invalidations is likely to be much
smaller than the overhead of the virtual-to-physical IOMMU page-tables
synchronization.
Fixes: 13cf01744608 ("iommu/vt-d: Make use of iova deferred flushing")
Signed-off-by: Nadav Amit <namit(a)vmware.com>
Cc: David Woodhouse <dwmw2(a)infradead.org>
Cc: Lu Baolu <baolu.lu(a)linux.intel.com>
Cc: Joerg Roedel <joro(a)8bytes.org>
Cc: Will Deacon <will(a)kernel.org>
Cc: stable(a)vger.kernel.org
Acked-by: Lu Baolu <baolu.lu(a)linux.intel.com>
Link: https://lore.kernel.org/r/20210127175317.1600473-1-namit@vmware.com
Signed-off-by: Joerg Roedel <jroedel(a)suse.de>
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index f665322a0991..06b00b5363d8 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -5440,6 +5440,36 @@ intel_iommu_domain_set_attr(struct iommu_domain *domain,
return ret;
}
+static bool domain_use_flush_queue(void)
+{
+ struct dmar_drhd_unit *drhd;
+ struct intel_iommu *iommu;
+ bool r = true;
+
+ if (intel_iommu_strict)
+ return false;
+
+ /*
+ * The flush queue implementation does not perform page-selective
+ * invalidations that are required for efficient TLB flushes in virtual
+ * environments. The benefit of batching is likely to be much lower than
+ * the overhead of synchronizing the virtual and physical IOMMU
+ * page-tables.
+ */
+ rcu_read_lock();
+ for_each_active_iommu(iommu, drhd) {
+ if (!cap_caching_mode(iommu->cap))
+ continue;
+
+ pr_warn_once("IOMMU batching is disabled due to virtualization");
+ r = false;
+ break;
+ }
+ rcu_read_unlock();
+
+ return r;
+}
+
static int
intel_iommu_domain_get_attr(struct iommu_domain *domain,
enum iommu_attr attr, void *data)
@@ -5450,7 +5480,7 @@ intel_iommu_domain_get_attr(struct iommu_domain *domain,
case IOMMU_DOMAIN_DMA:
switch (attr) {
case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE:
- *(int *)data = !intel_iommu_strict;
+ *(int *)data = domain_use_flush_queue();
return 0;
default:
return -ENODEV;
Upstream commit 81b704d3e4674e09781d331df73d76675d5ad8cb
Applies to 4.4-stable tree
----------->8---------------
From: "Rafael J. Wysocki" <rafael.j.wysocki(a)intel.com>
Date: Thu, 14 Jan 2021 19:34:22 +0100
Subject: [PATCH] ACPI: thermal: Do not call acpi_thermal_check() directly
Calling acpi_thermal_check() from acpi_thermal_notify() directly
is problematic if _TMP triggers Notify () on the thermal zone for
which it has been evaluated (which happens on some systems), because
it causes a new acpi_thermal_notify() invocation to be queued up
every time and if that takes place too often, an indefinite number of
pending work items may accumulate in kacpi_notify_wq over time.
Besides, it is not really useful to queue up a new invocation of
acpi_thermal_check() if one of them is pending already.
For these reasons, rework acpi_thermal_notify() to queue up a thermal
check instead of calling acpi_thermal_check() directly and only allow
one thermal check to be pending at a time. Moreover, only allow one
acpi_thermal_check_fn() instance at a time to run
thermal_zone_device_update() for one thermal zone and make it return
early if it sees other instances running for the same thermal zone.
While at it, fold acpi_thermal_check() into acpi_thermal_check_fn(),
as it is only called from there after the other changes made here.
[This issue appears to have been exposed by commit 6d25be5782e4
("sched/core, workqueues: Distangle worker accounting from rq
lock"), but it is unclear why it was not visible earlier.]
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=208877
Reported-by: Stephen Berman <stephen.berman(a)gmx.net>
Diagnosed-by: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Tested-by: Stephen Berman <stephen.berman(a)gmx.net>
Cc: All applicable <stable(a)vger.kernel.org>
[bigeasy: Backported to v4.4.y, use atomic_t instead of refcount_t]
Signed-off-by: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
---
drivers/acpi/thermal.c | 54 +++++++++++++++++++++++++++++-------------
1 file changed, 38 insertions(+), 16 deletions(-)
diff --git a/drivers/acpi/thermal.c b/drivers/acpi/thermal.c
index 82707f9824cae..b4826335ad0b3 100644
--- a/drivers/acpi/thermal.c
+++ b/drivers/acpi/thermal.c
@@ -188,6 +188,8 @@ struct acpi_thermal {
int tz_enabled;
int kelvin_offset;
struct work_struct thermal_check_work;
+ struct mutex thermal_check_lock;
+ atomic_t thermal_check_count;
};
/* --------------------------------------------------------------------------
@@ -513,16 +515,6 @@ static int acpi_thermal_get_trip_points(struct acpi_thermal *tz)
return 0;
}
-static void acpi_thermal_check(void *data)
-{
- struct acpi_thermal *tz = data;
-
- if (!tz->tz_enabled)
- return;
-
- thermal_zone_device_update(tz->thermal_zone);
-}
-
/* sys I/F for generic thermal sysfs support */
static int thermal_get_temp(struct thermal_zone_device *thermal, int *temp)
@@ -556,6 +548,8 @@ static int thermal_get_mode(struct thermal_zone_device *thermal,
return 0;
}
+static void acpi_thermal_check_fn(struct work_struct *work);
+
static int thermal_set_mode(struct thermal_zone_device *thermal,
enum thermal_device_mode mode)
{
@@ -581,7 +575,7 @@ static int thermal_set_mode(struct thermal_zone_device *thermal,
ACPI_DEBUG_PRINT((ACPI_DB_INFO,
"%s kernel ACPI thermal control\n",
tz->tz_enabled ? "Enable" : "Disable"));
- acpi_thermal_check(tz);
+ acpi_thermal_check_fn(&tz->thermal_check_work);
}
return 0;
}
@@ -950,6 +944,12 @@ static void acpi_thermal_unregister_thermal_zone(struct acpi_thermal *tz)
Driver Interface
-------------------------------------------------------------------------- */
+static void acpi_queue_thermal_check(struct acpi_thermal *tz)
+{
+ if (!work_pending(&tz->thermal_check_work))
+ queue_work(acpi_thermal_pm_queue, &tz->thermal_check_work);
+}
+
static void acpi_thermal_notify(struct acpi_device *device, u32 event)
{
struct acpi_thermal *tz = acpi_driver_data(device);
@@ -960,17 +960,17 @@ static void acpi_thermal_notify(struct acpi_device *device, u32 event)
switch (event) {
case ACPI_THERMAL_NOTIFY_TEMPERATURE:
- acpi_thermal_check(tz);
+ acpi_queue_thermal_check(tz);
break;
case ACPI_THERMAL_NOTIFY_THRESHOLDS:
acpi_thermal_trips_update(tz, ACPI_TRIPS_REFRESH_THRESHOLDS);
- acpi_thermal_check(tz);
+ acpi_queue_thermal_check(tz);
acpi_bus_generate_netlink_event(device->pnp.device_class,
dev_name(&device->dev), event, 0);
break;
case ACPI_THERMAL_NOTIFY_DEVICES:
acpi_thermal_trips_update(tz, ACPI_TRIPS_REFRESH_DEVICES);
- acpi_thermal_check(tz);
+ acpi_queue_thermal_check(tz);
acpi_bus_generate_netlink_event(device->pnp.device_class,
dev_name(&device->dev), event, 0);
break;
@@ -1070,7 +1070,27 @@ static void acpi_thermal_check_fn(struct work_struct *work)
{
struct acpi_thermal *tz = container_of(work, struct acpi_thermal,
thermal_check_work);
- acpi_thermal_check(tz);
+
+ if (!tz->tz_enabled)
+ return;
+ /*
+ * In general, it is not sufficient to check the pending bit, because
+ * subsequent instances of this function may be queued after one of them
+ * has started running (e.g. if _TMP sleeps). Avoid bailing out if just
+ * one of them is running, though, because it may have done the actual
+ * check some time ago, so allow at least one of them to block on the
+ * mutex while another one is running the update.
+ */
+ if (!atomic_add_unless(&tz->thermal_check_count, -1, 1))
+ return;
+
+ mutex_lock(&tz->thermal_check_lock);
+
+ thermal_zone_device_update(tz->thermal_zone);
+
+ atomic_inc(&tz->thermal_check_count);
+
+ mutex_unlock(&tz->thermal_check_lock);
}
static int acpi_thermal_add(struct acpi_device *device)
@@ -1102,6 +1122,8 @@ static int acpi_thermal_add(struct acpi_device *device)
if (result)
goto free_memory;
+ atomic_set(&tz->thermal_check_count, 3);
+ mutex_init(&tz->thermal_check_lock);
INIT_WORK(&tz->thermal_check_work, acpi_thermal_check_fn);
pr_info(PREFIX "%s [%s] (%ld C)\n", acpi_device_name(device),
@@ -1167,7 +1189,7 @@ static int acpi_thermal_resume(struct device *dev)
tz->state.active |= tz->trips.active[i].flags.enabled;
}
- queue_work(acpi_thermal_pm_queue, &tz->thermal_check_work);
+ acpi_queue_thermal_check(tz);
return AE_OK;
}
--
2.30.0
This is the start of the stable review cycle for the 5.10.14 release.
There are 57 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun, 07 Feb 2021 14:06:42 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.14-rc…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.10.14-rc1
Peter Zijlstra <peterz(a)infradead.org>
workqueue: Restrict affinity change to rescuer
Peter Zijlstra <peterz(a)infradead.org>
kthread: Extract KTHREAD_IS_PER_CPU
Gayatri Kammela <gayatri.kammela(a)intel.com>
x86/cpu: Add another Alder Lake CPU to the Intel family
Josh Poimboeuf <jpoimboe(a)redhat.com>
objtool: Don't fail the kernel build on fatal errors
Oded Gabbay <ogabbay(a)kernel.org>
habanalabs: disable FW events on device removal
Oded Gabbay <ogabbay(a)kernel.org>
habanalabs: fix backward compatibility of idle check
Ofir Bitton <obitton(a)habana.ai>
habanalabs: zero pci counters packet before submit to FW
Vladimir Stempen <vladimir.stempen(a)amd.com>
drm/amd/display: Fixed corruptions on HPDRX link loss restore
Nicholas Kazlauskas <nicholas.kazlauskas(a)amd.com>
drm/amd/display: Use hardware sequencer functions for PG control
Bing Guo <bing.guo(a)amd.com>
drm/amd/display: Change function decide_dp_link_settings to avoid infinite looping
Aric Cyr <aric.cyr(a)amd.com>
drm/amd/display: Allow PSTATE chnage when no displays are enabled
Jake Wang <haonan.wang2(a)amd.com>
drm/amd/display: Update dram_clock_change_latency for DCN2.1
Michael Ellerman <mpe(a)ellerman.id.au>
selftests/powerpc: Only test lwm/stmw on big endian
Jeannie Stevenson <jeanniestevenson(a)protonmail.com>
platform/x86: thinkpad_acpi: Add P53/73 firmware to fan_quirk_table for dual fan control
Chaitanya Kulkarni <chaitanya.kulkarni(a)wdc.com>
nvmet: set right status on error in id-ns handler
Klaus Jensen <k.jensen(a)samsung.com>
nvme-pci: allow use of cmb on v1.4 controllers
Chao Leng <lengchao(a)huawei.com>
nvme-tcp: avoid request double completion for concurrent nvme_tcp_timeout
Chao Leng <lengchao(a)huawei.com>
nvme-rdma: avoid request double completion for concurrent nvme_rdma_timeout
Revanth Rajashekar <revanth.rajashekar(a)intel.com>
nvme: check the PRINFO bit before deciding the host buffer length
lianzhi chang <changlianzhi(a)uniontech.com>
udf: fix the problem that the disc content is not displayed
Sowjanya Komatineni <skomatineni(a)nvidia.com>
i2c: tegra: Create i2c_writesl_vi() to use with VI I2C for filling TX FIFO
Kai-Chuan Hsieh <kaichuan.hsieh(a)canonical.com>
ALSA: hda: Add Cometlake-R PCI ID
Brian King <brking(a)linux.vnet.ibm.com>
scsi: ibmvfc: Set default timeout to avoid crash during migration
Felix Fietkau <nbd(a)nbd.name>
mac80211: fix encryption key selection for 802.3 xmit
Felix Fietkau <nbd(a)nbd.name>
mac80211: fix fast-rx encryption check
Shayne Chen <shayne.chen(a)mediatek.com>
mac80211: fix incorrect strlen of .write in debugfs
Josh Poimboeuf <jpoimboe(a)redhat.com>
objtool: Don't add empty symbols to the rbtree
Kai Vehmanen <kai.vehmanen(a)linux.intel.com>
ALSA: hda: Add AlderLake-P PCI ID and HDMI codec vid
Kai-Heng Feng <kai.heng.feng(a)canonical.com>
ASoC: SOF: Intel: hda: Resume codec to do jack detection
Dinghao Liu <dinghao.liu(a)zju.edu.cn>
scsi: fnic: Fix memleak in vnic_dev_init_devcmd2
Javed Hasan <jhasan(a)marvell.com>
scsi: libfc: Avoid invoking response handler twice if ep is already completed
Martin Wilck <mwilck(a)suse.com>
scsi: scsi_transport_srp: Don't block target in failfast state
Peter Zijlstra <peterz(a)infradead.org>
x86: __always_inline __{rd,wr}msr()
Peter Zijlstra <peterz(a)infradead.org>
locking/lockdep: Avoid noinstr warning for DEBUG_LOCKDEP
Oded Gabbay <ogabbay(a)kernel.org>
habanalabs: fix dma_addr passed to dma_mmap_coherent
Arnold Gozum <arngozum(a)gmail.com>
platform/x86: intel-vbtn: Support for tablet mode on Dell Inspiron 7352
Hans de Goede <hdegoede(a)redhat.com>
platform/x86: touchscreen_dmi: Add swap-x-y quirk for Goodix touchscreen on Estar Beauty HD tablet
Srinivas Pandruvada <srinivas.pandruvada(a)linux.intel.com>
tools/power/x86/intel-speed-select: Set higher of cpuinfo_max_freq or base_frequency
Srinivas Pandruvada <srinivas.pandruvada(a)linux.intel.com>
tools/power/x86/intel-speed-select: Set scaling_max_freq to base_frequency
Tony Lindgren <tony(a)atomide.com>
phy: cpcap-usb: Fix warning for missing regulator_disable
Nadav Amit <namit(a)vmware.com>
iommu/vt-d: Do not use flush-queue when caching-mode is on
Nick Desaulniers <ndesaulniers(a)google.com>
ARM: 9025/1: Kconfig: CPU_BIG_ENDIAN depends on !LD_IS_LLD
Mike Rapoport <rppt(a)kernel.org>
Revert "x86/setup: don't remove E820_TYPE_RAM for pfn 0"
Catalin Marinas <catalin.marinas(a)arm.com>
arm64: Do not pass tagged addresses to __is_lm_address()
Vincenzo Frascino <vincenzo.frascino(a)arm.com>
arm64: Fix kernel address detection of __is_lm_address()
Robin Murphy <robin.murphy(a)arm.com>
arm64: dts: meson: Describe G12b GPU as coherent
Robin Murphy <robin.murphy(a)arm.com>
drm/panfrost: Support cache-coherent integrations
Robin Murphy <robin.murphy(a)arm.com>
iommu/io-pgtable-arm: Support coherency for Mali LPAE
Lijun Pan <ljp(a)linux.ibm.com>
ibmvnic: Ensure that CRQ entry read are correctly ordered
Rasmus Villemoes <rasmus.villemoes(a)prevas.dk>
net: switchdev: don't set port_obj_info->handled true when -EOPNOTSUPP
Pan Bian <bianpan2016(a)163.com>
net: dsa: bcm_sf2: put device node before return
Ido Schimmel <idosch(a)nvidia.com>
mlxsw: spectrum_span: Do not overwrite policer configuration
Voon Weifeng <weifeng.voon(a)intel.com>
stmmac: intel: Configure EHL PSE0 GbE and PSE1 GbE to 32 bits DMA addressing
Kevin Hao <haokexin(a)gmail.com>
net: octeontx2: Make sure the buffer is 128 byte aligned
Pan Bian <bianpan2016(a)163.com>
net: fec: put child node on error path
Pan Bian <bianpan2016(a)163.com>
net: stmmac: dwmac-intel-plat: remove config data on error
Marek Vasut <marex(a)denx.de>
net: dsa: microchip: Adjust reset release timing to match reference reset circuit
-------------
Diffstat:
Makefile | 4 +-
arch/arm/mm/Kconfig | 1 +
arch/arm64/boot/dts/amlogic/meson-g12b.dtsi | 4 ++
arch/arm64/include/asm/memory.h | 10 ++---
arch/arm64/mm/physaddr.c | 2 +-
arch/x86/include/asm/intel-family.h | 1 +
arch/x86/include/asm/msr.h | 4 +-
arch/x86/kernel/setup.c | 20 +++++-----
.../amd/display/dc/clk_mgr/dcn30/dcn30_clk_mgr.c | 6 ++-
drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c | 7 +++-
.../drm/amd/display/dc/dcn10/dcn10_hw_sequencer.c | 18 +++++++--
drivers/gpu/drm/amd/display/dc/dcn20/dcn20_hwseq.c | 9 ++++-
.../gpu/drm/amd/display/dc/dcn21/dcn21_resource.c | 2 +-
drivers/gpu/drm/panfrost/panfrost_device.h | 1 +
drivers/gpu/drm/panfrost/panfrost_drv.c | 2 +
drivers/gpu/drm/panfrost/panfrost_gem.c | 2 +
drivers/gpu/drm/panfrost/panfrost_mmu.c | 1 +
drivers/i2c/busses/i2c-tegra.c | 22 ++++++++++-
drivers/iommu/intel/iommu.c | 5 +++
drivers/iommu/io-pgtable-arm.c | 11 +++++-
drivers/misc/habanalabs/common/device.c | 9 +++++
drivers/misc/habanalabs/common/firmware_if.c | 5 +++
drivers/misc/habanalabs/common/habanalabs_ioctl.c | 2 +
drivers/misc/habanalabs/gaudi/gaudi.c | 3 +-
drivers/misc/habanalabs/goya/goya.c | 3 +-
drivers/net/dsa/bcm_sf2.c | 8 +++-
drivers/net/dsa/microchip/ksz_common.c | 2 +-
drivers/net/ethernet/freescale/fec_main.c | 3 +-
drivers/net/ethernet/ibm/ibmvnic.c | 6 +++
.../ethernet/marvell/octeontx2/nic/otx2_common.c | 3 +-
.../net/ethernet/mellanox/mlxsw/spectrum_span.c | 6 +++
.../net/ethernet/mellanox/mlxsw/spectrum_span.h | 1 +
.../net/ethernet/stmicro/stmmac/dwmac-intel-plat.c | 4 +-
drivers/net/ethernet/stmicro/stmmac/dwmac-intel.c | 2 +
drivers/nvme/host/core.c | 17 ++++++++-
drivers/nvme/host/pci.c | 14 +++++++
drivers/nvme/host/rdma.c | 15 ++++++--
drivers/nvme/host/tcp.c | 14 +++++--
drivers/nvme/target/admin-cmd.c | 8 +++-
drivers/phy/motorola/phy-cpcap-usb.c | 19 +++++++---
drivers/platform/x86/intel-vbtn.c | 6 +++
drivers/platform/x86/thinkpad_acpi.c | 1 +
drivers/platform/x86/touchscreen_dmi.c | 18 +++++++++
drivers/scsi/fnic/vnic_dev.c | 8 ++--
drivers/scsi/ibmvscsi/ibmvfc.c | 4 +-
drivers/scsi/libfc/fc_exch.c | 16 +++++++-
drivers/scsi/scsi_transport_srp.c | 9 ++++-
fs/udf/super.c | 7 ++--
include/linux/kthread.h | 3 ++
include/linux/nvme.h | 6 +++
kernel/kthread.c | 27 ++++++++++++-
kernel/locking/lockdep.c | 7 +++-
kernel/smpboot.c | 1 +
kernel/workqueue.c | 9 ++---
net/mac80211/debugfs.c | 44 ++++++++++------------
net/mac80211/rx.c | 2 +
net/mac80211/tx.c | 27 +++++++------
net/switchdev/switchdev.c | 23 ++++++-----
sound/pci/hda/hda_intel.c | 6 +++
sound/pci/hda/patch_hdmi.c | 1 +
sound/soc/sof/intel/hda-codec.c | 3 +-
tools/objtool/check.c | 14 +++----
tools/objtool/elf.c | 7 ++++
tools/power/x86/intel-speed-select/isst-config.c | 32 ++++++++++++++++
.../powerpc/alignment/alignment_handler.c | 5 ++-
65 files changed, 427 insertions(+), 135 deletions(-)
Hi,
commit 64f55156f7adedb1ac5bb9cdbcbc9ac05ff5a724 upstream
The requested patch allows the iwlwifi driver to work with newer AX200
wifi cards when MSI-X interrupts are not available. Without it,
bringing an interface "up" fails with a Microcode SW error. Xen PCI
passthrough with a linux stubdom doesn't enable MSI-X which triggers
this condition.
I think it's only applicable to 5.4 because it's in 5.10 and earlier
kernels don't have AX200 support.
I'm making this request to stable and CC-ing netdev since I saw a
message [1] on netdev saying:
"""
We're actually experimenting with letting Greg take networking patches
into stable like he does for every other tree. If the patch doesn't
appear in the next stable release please poke stable@ directly.
"""
Thanks,
Jason
[1] https://lore.kernel.org/netdev/20210129193030.46ef3b17@kicinski-fedora-pc1c…
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 2dcb3964544177c51853a210b6ad400de78ef17d Mon Sep 17 00:00:00 2001
From: Roman Gushchin <guro(a)fb.com>
Date: Thu, 4 Feb 2021 18:32:36 -0800
Subject: [PATCH] memblock: do not start bottom-up allocations with kernel_end
With kaslr the kernel image is placed at a random place, so starting the
bottom-up allocation with the kernel_end can result in an allocation
failure and a warning like this one:
hugetlb_cma: reserve 2048 MiB, up to 2048 MiB per node
------------[ cut here ]------------
memblock: bottom-up allocation failed, memory hotremove may be affected
WARNING: CPU: 0 PID: 0 at mm/memblock.c:332 memblock_find_in_range_node+0x178/0x25a
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 5.10.0+ #1169
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014
RIP: 0010:memblock_find_in_range_node+0x178/0x25a
Code: e9 6d ff ff ff 48 85 c0 0f 85 da 00 00 00 80 3d 9b 35 df 00 00 75 15 48 c7 c7 c0 75 59 88 c6 05 8b 35 df 00 01 e8 25 8a fa ff <0f> 0b 48 c7 44 24 20 ff ff ff ff 44 89 e6 44 89 ea 48 c7 c1 70 5c
RSP: 0000:ffffffff88803d18 EFLAGS: 00010086 ORIG_RAX: 0000000000000000
RAX: 0000000000000000 RBX: 0000000240000000 RCX: 00000000ffffdfff
RDX: 00000000ffffdfff RSI: 00000000ffffffea RDI: 0000000000000046
RBP: 0000000100000000 R08: ffffffff88922788 R09: 0000000000009ffb
R10: 00000000ffffe000 R11: 3fffffffffffffff R12: 0000000000000000
R13: 0000000000000000 R14: 0000000080000000 R15: 00000001fb42c000
FS: 0000000000000000(0000) GS:ffffffff88f71000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffa080fb401000 CR3: 00000001fa80a000 CR4: 00000000000406b0
Call Trace:
memblock_alloc_range_nid+0x8d/0x11e
cma_declare_contiguous_nid+0x2c4/0x38c
hugetlb_cma_reserve+0xdc/0x128
flush_tlb_one_kernel+0xc/0x20
native_set_fixmap+0x82/0xd0
flat_get_apic_id+0x5/0x10
register_lapic_address+0x8e/0x97
setup_arch+0x8a5/0xc3f
start_kernel+0x66/0x547
load_ucode_bsp+0x4c/0xcd
secondary_startup_64_no_verify+0xb0/0xbb
random: get_random_bytes called from __warn+0xab/0x110 with crng_init=0
---[ end trace f151227d0b39be70 ]---
At the same time, the kernel image is protected with memblock_reserve(),
so we can just start searching at PAGE_SIZE. In this case the bottom-up
allocation has the same chances to success as a top-down allocation, so
there is no reason to fallback in the case of a failure. All together it
simplifies the logic.
Link: https://lkml.kernel.org/r/20201217201214.3414100-2-guro@fb.com
Fixes: 8fabc623238e ("powerpc: Ensure that swiotlb buffer is allocated from low memory")
Signed-off-by: Roman Gushchin <guro(a)fb.com>
Reviewed-by: Mike Rapoport <rppt(a)linux.ibm.com>
Cc: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Wonhyuk Yang <vvghjk1234(a)gmail.com>
Cc: Thiago Jung Bauermann <bauerman(a)linux.ibm.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/memblock.c b/mm/memblock.c
index 1eaaec1e7687..8d9b5f1e7040 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -275,14 +275,6 @@ __memblock_find_range_top_down(phys_addr_t start, phys_addr_t end,
*
* Find @size free area aligned to @align in the specified range and node.
*
- * When allocation direction is bottom-up, the @start should be greater
- * than the end of the kernel image. Otherwise, it will be trimmed. The
- * reason is that we want the bottom-up allocation just near the kernel
- * image so it is highly likely that the allocated memory and the kernel
- * will reside in the same node.
- *
- * If bottom-up allocation failed, will try to allocate memory top-down.
- *
* Return:
* Found address on success, 0 on failure.
*/
@@ -291,8 +283,6 @@ static phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t size,
phys_addr_t end, int nid,
enum memblock_flags flags)
{
- phys_addr_t kernel_end, ret;
-
/* pump up @end */
if (end == MEMBLOCK_ALLOC_ACCESSIBLE ||
end == MEMBLOCK_ALLOC_KASAN)
@@ -301,40 +291,13 @@ static phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t size,
/* avoid allocating the first page */
start = max_t(phys_addr_t, start, PAGE_SIZE);
end = max(start, end);
- kernel_end = __pa_symbol(_end);
-
- /*
- * try bottom-up allocation only when bottom-up mode
- * is set and @end is above the kernel image.
- */
- if (memblock_bottom_up() && end > kernel_end) {
- phys_addr_t bottom_up_start;
-
- /* make sure we will allocate above the kernel */
- bottom_up_start = max(start, kernel_end);
- /* ok, try bottom-up allocation first */
- ret = __memblock_find_range_bottom_up(bottom_up_start, end,
- size, align, nid, flags);
- if (ret)
- return ret;
-
- /*
- * we always limit bottom-up allocation above the kernel,
- * but top-down allocation doesn't have the limit, so
- * retrying top-down allocation may succeed when bottom-up
- * allocation failed.
- *
- * bottom-up allocation is expected to be fail very rarely,
- * so we use WARN_ONCE() here to see the stack trace if
- * fail happens.
- */
- WARN_ONCE(IS_ENABLED(CONFIG_MEMORY_HOTREMOVE),
- "memblock: bottom-up allocation failed, memory hotremove may be affected\n");
- }
-
- return __memblock_find_range_top_down(start, end, size, align, nid,
- flags);
+ if (memblock_bottom_up())
+ return __memblock_find_range_bottom_up(start, end, size, align,
+ nid, flags);
+ else
+ return __memblock_find_range_top_down(start, end, size, align,
+ nid, flags);
}
/**
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 2dcb3964544177c51853a210b6ad400de78ef17d Mon Sep 17 00:00:00 2001
From: Roman Gushchin <guro(a)fb.com>
Date: Thu, 4 Feb 2021 18:32:36 -0800
Subject: [PATCH] memblock: do not start bottom-up allocations with kernel_end
With kaslr the kernel image is placed at a random place, so starting the
bottom-up allocation with the kernel_end can result in an allocation
failure and a warning like this one:
hugetlb_cma: reserve 2048 MiB, up to 2048 MiB per node
------------[ cut here ]------------
memblock: bottom-up allocation failed, memory hotremove may be affected
WARNING: CPU: 0 PID: 0 at mm/memblock.c:332 memblock_find_in_range_node+0x178/0x25a
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 5.10.0+ #1169
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014
RIP: 0010:memblock_find_in_range_node+0x178/0x25a
Code: e9 6d ff ff ff 48 85 c0 0f 85 da 00 00 00 80 3d 9b 35 df 00 00 75 15 48 c7 c7 c0 75 59 88 c6 05 8b 35 df 00 01 e8 25 8a fa ff <0f> 0b 48 c7 44 24 20 ff ff ff ff 44 89 e6 44 89 ea 48 c7 c1 70 5c
RSP: 0000:ffffffff88803d18 EFLAGS: 00010086 ORIG_RAX: 0000000000000000
RAX: 0000000000000000 RBX: 0000000240000000 RCX: 00000000ffffdfff
RDX: 00000000ffffdfff RSI: 00000000ffffffea RDI: 0000000000000046
RBP: 0000000100000000 R08: ffffffff88922788 R09: 0000000000009ffb
R10: 00000000ffffe000 R11: 3fffffffffffffff R12: 0000000000000000
R13: 0000000000000000 R14: 0000000080000000 R15: 00000001fb42c000
FS: 0000000000000000(0000) GS:ffffffff88f71000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffa080fb401000 CR3: 00000001fa80a000 CR4: 00000000000406b0
Call Trace:
memblock_alloc_range_nid+0x8d/0x11e
cma_declare_contiguous_nid+0x2c4/0x38c
hugetlb_cma_reserve+0xdc/0x128
flush_tlb_one_kernel+0xc/0x20
native_set_fixmap+0x82/0xd0
flat_get_apic_id+0x5/0x10
register_lapic_address+0x8e/0x97
setup_arch+0x8a5/0xc3f
start_kernel+0x66/0x547
load_ucode_bsp+0x4c/0xcd
secondary_startup_64_no_verify+0xb0/0xbb
random: get_random_bytes called from __warn+0xab/0x110 with crng_init=0
---[ end trace f151227d0b39be70 ]---
At the same time, the kernel image is protected with memblock_reserve(),
so we can just start searching at PAGE_SIZE. In this case the bottom-up
allocation has the same chances to success as a top-down allocation, so
there is no reason to fallback in the case of a failure. All together it
simplifies the logic.
Link: https://lkml.kernel.org/r/20201217201214.3414100-2-guro@fb.com
Fixes: 8fabc623238e ("powerpc: Ensure that swiotlb buffer is allocated from low memory")
Signed-off-by: Roman Gushchin <guro(a)fb.com>
Reviewed-by: Mike Rapoport <rppt(a)linux.ibm.com>
Cc: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Wonhyuk Yang <vvghjk1234(a)gmail.com>
Cc: Thiago Jung Bauermann <bauerman(a)linux.ibm.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/memblock.c b/mm/memblock.c
index 1eaaec1e7687..8d9b5f1e7040 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -275,14 +275,6 @@ __memblock_find_range_top_down(phys_addr_t start, phys_addr_t end,
*
* Find @size free area aligned to @align in the specified range and node.
*
- * When allocation direction is bottom-up, the @start should be greater
- * than the end of the kernel image. Otherwise, it will be trimmed. The
- * reason is that we want the bottom-up allocation just near the kernel
- * image so it is highly likely that the allocated memory and the kernel
- * will reside in the same node.
- *
- * If bottom-up allocation failed, will try to allocate memory top-down.
- *
* Return:
* Found address on success, 0 on failure.
*/
@@ -291,8 +283,6 @@ static phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t size,
phys_addr_t end, int nid,
enum memblock_flags flags)
{
- phys_addr_t kernel_end, ret;
-
/* pump up @end */
if (end == MEMBLOCK_ALLOC_ACCESSIBLE ||
end == MEMBLOCK_ALLOC_KASAN)
@@ -301,40 +291,13 @@ static phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t size,
/* avoid allocating the first page */
start = max_t(phys_addr_t, start, PAGE_SIZE);
end = max(start, end);
- kernel_end = __pa_symbol(_end);
-
- /*
- * try bottom-up allocation only when bottom-up mode
- * is set and @end is above the kernel image.
- */
- if (memblock_bottom_up() && end > kernel_end) {
- phys_addr_t bottom_up_start;
-
- /* make sure we will allocate above the kernel */
- bottom_up_start = max(start, kernel_end);
- /* ok, try bottom-up allocation first */
- ret = __memblock_find_range_bottom_up(bottom_up_start, end,
- size, align, nid, flags);
- if (ret)
- return ret;
-
- /*
- * we always limit bottom-up allocation above the kernel,
- * but top-down allocation doesn't have the limit, so
- * retrying top-down allocation may succeed when bottom-up
- * allocation failed.
- *
- * bottom-up allocation is expected to be fail very rarely,
- * so we use WARN_ONCE() here to see the stack trace if
- * fail happens.
- */
- WARN_ONCE(IS_ENABLED(CONFIG_MEMORY_HOTREMOVE),
- "memblock: bottom-up allocation failed, memory hotremove may be affected\n");
- }
-
- return __memblock_find_range_top_down(start, end, size, align, nid,
- flags);
+ if (memblock_bottom_up())
+ return __memblock_find_range_bottom_up(start, end, size, align,
+ nid, flags);
+ else
+ return __memblock_find_range_top_down(start, end, size, align,
+ nid, flags);
}
/**
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 2dcb3964544177c51853a210b6ad400de78ef17d Mon Sep 17 00:00:00 2001
From: Roman Gushchin <guro(a)fb.com>
Date: Thu, 4 Feb 2021 18:32:36 -0800
Subject: [PATCH] memblock: do not start bottom-up allocations with kernel_end
With kaslr the kernel image is placed at a random place, so starting the
bottom-up allocation with the kernel_end can result in an allocation
failure and a warning like this one:
hugetlb_cma: reserve 2048 MiB, up to 2048 MiB per node
------------[ cut here ]------------
memblock: bottom-up allocation failed, memory hotremove may be affected
WARNING: CPU: 0 PID: 0 at mm/memblock.c:332 memblock_find_in_range_node+0x178/0x25a
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 5.10.0+ #1169
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-1.fc33 04/01/2014
RIP: 0010:memblock_find_in_range_node+0x178/0x25a
Code: e9 6d ff ff ff 48 85 c0 0f 85 da 00 00 00 80 3d 9b 35 df 00 00 75 15 48 c7 c7 c0 75 59 88 c6 05 8b 35 df 00 01 e8 25 8a fa ff <0f> 0b 48 c7 44 24 20 ff ff ff ff 44 89 e6 44 89 ea 48 c7 c1 70 5c
RSP: 0000:ffffffff88803d18 EFLAGS: 00010086 ORIG_RAX: 0000000000000000
RAX: 0000000000000000 RBX: 0000000240000000 RCX: 00000000ffffdfff
RDX: 00000000ffffdfff RSI: 00000000ffffffea RDI: 0000000000000046
RBP: 0000000100000000 R08: ffffffff88922788 R09: 0000000000009ffb
R10: 00000000ffffe000 R11: 3fffffffffffffff R12: 0000000000000000
R13: 0000000000000000 R14: 0000000080000000 R15: 00000001fb42c000
FS: 0000000000000000(0000) GS:ffffffff88f71000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: ffffa080fb401000 CR3: 00000001fa80a000 CR4: 00000000000406b0
Call Trace:
memblock_alloc_range_nid+0x8d/0x11e
cma_declare_contiguous_nid+0x2c4/0x38c
hugetlb_cma_reserve+0xdc/0x128
flush_tlb_one_kernel+0xc/0x20
native_set_fixmap+0x82/0xd0
flat_get_apic_id+0x5/0x10
register_lapic_address+0x8e/0x97
setup_arch+0x8a5/0xc3f
start_kernel+0x66/0x547
load_ucode_bsp+0x4c/0xcd
secondary_startup_64_no_verify+0xb0/0xbb
random: get_random_bytes called from __warn+0xab/0x110 with crng_init=0
---[ end trace f151227d0b39be70 ]---
At the same time, the kernel image is protected with memblock_reserve(),
so we can just start searching at PAGE_SIZE. In this case the bottom-up
allocation has the same chances to success as a top-down allocation, so
there is no reason to fallback in the case of a failure. All together it
simplifies the logic.
Link: https://lkml.kernel.org/r/20201217201214.3414100-2-guro@fb.com
Fixes: 8fabc623238e ("powerpc: Ensure that swiotlb buffer is allocated from low memory")
Signed-off-by: Roman Gushchin <guro(a)fb.com>
Reviewed-by: Mike Rapoport <rppt(a)linux.ibm.com>
Cc: Joonsoo Kim <iamjoonsoo.kim(a)lge.com>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Wonhyuk Yang <vvghjk1234(a)gmail.com>
Cc: Thiago Jung Bauermann <bauerman(a)linux.ibm.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/memblock.c b/mm/memblock.c
index 1eaaec1e7687..8d9b5f1e7040 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -275,14 +275,6 @@ __memblock_find_range_top_down(phys_addr_t start, phys_addr_t end,
*
* Find @size free area aligned to @align in the specified range and node.
*
- * When allocation direction is bottom-up, the @start should be greater
- * than the end of the kernel image. Otherwise, it will be trimmed. The
- * reason is that we want the bottom-up allocation just near the kernel
- * image so it is highly likely that the allocated memory and the kernel
- * will reside in the same node.
- *
- * If bottom-up allocation failed, will try to allocate memory top-down.
- *
* Return:
* Found address on success, 0 on failure.
*/
@@ -291,8 +283,6 @@ static phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t size,
phys_addr_t end, int nid,
enum memblock_flags flags)
{
- phys_addr_t kernel_end, ret;
-
/* pump up @end */
if (end == MEMBLOCK_ALLOC_ACCESSIBLE ||
end == MEMBLOCK_ALLOC_KASAN)
@@ -301,40 +291,13 @@ static phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t size,
/* avoid allocating the first page */
start = max_t(phys_addr_t, start, PAGE_SIZE);
end = max(start, end);
- kernel_end = __pa_symbol(_end);
-
- /*
- * try bottom-up allocation only when bottom-up mode
- * is set and @end is above the kernel image.
- */
- if (memblock_bottom_up() && end > kernel_end) {
- phys_addr_t bottom_up_start;
-
- /* make sure we will allocate above the kernel */
- bottom_up_start = max(start, kernel_end);
- /* ok, try bottom-up allocation first */
- ret = __memblock_find_range_bottom_up(bottom_up_start, end,
- size, align, nid, flags);
- if (ret)
- return ret;
-
- /*
- * we always limit bottom-up allocation above the kernel,
- * but top-down allocation doesn't have the limit, so
- * retrying top-down allocation may succeed when bottom-up
- * allocation failed.
- *
- * bottom-up allocation is expected to be fail very rarely,
- * so we use WARN_ONCE() here to see the stack trace if
- * fail happens.
- */
- WARN_ONCE(IS_ENABLED(CONFIG_MEMORY_HOTREMOVE),
- "memblock: bottom-up allocation failed, memory hotremove may be affected\n");
- }
-
- return __memblock_find_range_top_down(start, end, size, align, nid,
- flags);
+ if (memblock_bottom_up())
+ return __memblock_find_range_bottom_up(start, end, size, align,
+ nid, flags);
+ else
+ return __memblock_find_range_top_down(start, end, size, align,
+ nid, flags);
}
/**
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 24321ac668e452a4942598533d267805f291fdc9 Mon Sep 17 00:00:00 2001
From: Raoni Fassina Firmino <raoni(a)linux.ibm.com>
Date: Mon, 1 Feb 2021 17:05:05 -0300
Subject: [PATCH] powerpc/64/signal: Fix regression in __kernel_sigtramp_rt64()
semantics
Commit 0138ba5783ae ("powerpc/64/signal: Balance return predictor
stack in signal trampoline") changed __kernel_sigtramp_rt64() VDSO and
trampoline code, and introduced a regression in the way glibc's
backtrace()[1] detects the signal-handler stack frame. Apart from the
practical implications, __kernel_sigtramp_rt64() was a VDSO function
with the semantics that it is a function you can call from userspace
to end a signal handling. Now this semantics are no longer valid.
I believe the aforementioned change affects all releases since 5.9.
This patch tries to fix both the semantics and practical aspect of
__kernel_sigtramp_rt64() returning it to the previous code, whilst
keeping the intended behaviour of 0138ba5783ae by adding a new symbol
to serve as the jump target from the kernel to the trampoline. Now the
trampoline has two parts, a new entry point and the old return point.
[1] https://lists.ozlabs.org/pipermail/linuxppc-dev/2021-January/223194.html
Fixes: 0138ba5783ae ("powerpc/64/signal: Balance return predictor stack in signal trampoline")
Cc: stable(a)vger.kernel.org # v5.9+
Signed-off-by: Raoni Fassina Firmino <raoni(a)linux.ibm.com>
Acked-by: Nicholas Piggin <npiggin(a)gmail.com>
[mpe: Minor tweaks to change log formatting, add stable tag]
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/20210201200505.iz46ubcizipnkcxe@work-tp
diff --git a/arch/powerpc/kernel/vdso64/sigtramp.S b/arch/powerpc/kernel/vdso64/sigtramp.S
index bbf68cd01088..2d4067561293 100644
--- a/arch/powerpc/kernel/vdso64/sigtramp.S
+++ b/arch/powerpc/kernel/vdso64/sigtramp.S
@@ -15,11 +15,20 @@
.text
+/*
+ * __kernel_start_sigtramp_rt64 and __kernel_sigtramp_rt64 together
+ * are one function split in two parts. The kernel jumps to the former
+ * and the signal handler indirectly (by blr) returns to the latter.
+ * __kernel_sigtramp_rt64 needs to point to the return address so
+ * glibc can correctly identify the trampoline stack frame.
+ */
.balign 8
.balign IFETCH_ALIGN_BYTES
-V_FUNCTION_BEGIN(__kernel_sigtramp_rt64)
+V_FUNCTION_BEGIN(__kernel_start_sigtramp_rt64)
.Lsigrt_start:
bctrl /* call the handler */
+V_FUNCTION_END(__kernel_start_sigtramp_rt64)
+V_FUNCTION_BEGIN(__kernel_sigtramp_rt64)
addi r1, r1, __SIGNAL_FRAMESIZE
li r0,__NR_rt_sigreturn
sc
diff --git a/arch/powerpc/kernel/vdso64/vdso64.lds.S b/arch/powerpc/kernel/vdso64/vdso64.lds.S
index 6164d1a1ba11..2f3c359cacd3 100644
--- a/arch/powerpc/kernel/vdso64/vdso64.lds.S
+++ b/arch/powerpc/kernel/vdso64/vdso64.lds.S
@@ -131,4 +131,4 @@ VERSION
/*
* Make the sigreturn code visible to the kernel.
*/
-VDSO_sigtramp_rt64 = __kernel_sigtramp_rt64;
+VDSO_sigtramp_rt64 = __kernel_start_sigtramp_rt64;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From c1c35cf78bfab31b8cb455259524395c9e4c7cd6 Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini(a)redhat.com>
Date: Fri, 13 Nov 2020 08:30:38 -0500
Subject: [PATCH] KVM: x86: cleanup CR3 reserved bits checks
If not in long mode, the low bits of CR3 are reserved but not enforced to
be zero, so remove those checks. If in long mode, however, the MBZ bits
extend down to the highest physical address bit of the guest, excluding
the encryption bit.
Make the checks consistent with the above, and match them between
nested_vmcb_checks and KVM_SET_SREGS.
Cc: stable(a)vger.kernel.org
Fixes: 761e41693465 ("KVM: nSVM: Check that MBZ bits in CR3 and CR4 are not set on vmrun of nested guests")
Fixes: a780a3ea6282 ("KVM: X86: Fix reserved bits check for MOV to CR3")
Reviewed-by: Sean Christopherson <seanjc(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 7a605ad8254d..db30670dd8c4 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -231,6 +231,7 @@ static bool nested_vmcb_check_controls(struct vmcb_control_area *control)
static bool nested_vmcb_checks(struct vcpu_svm *svm, struct vmcb *vmcb12)
{
+ struct kvm_vcpu *vcpu = &svm->vcpu;
bool vmcb12_lma;
if ((vmcb12->save.efer & EFER_SVME) == 0)
@@ -244,18 +245,10 @@ static bool nested_vmcb_checks(struct vcpu_svm *svm, struct vmcb *vmcb12)
vmcb12_lma = (vmcb12->save.efer & EFER_LME) && (vmcb12->save.cr0 & X86_CR0_PG);
- if (!vmcb12_lma) {
- if (vmcb12->save.cr4 & X86_CR4_PAE) {
- if (vmcb12->save.cr3 & MSR_CR3_LEGACY_PAE_RESERVED_MASK)
- return false;
- } else {
- if (vmcb12->save.cr3 & MSR_CR3_LEGACY_RESERVED_MASK)
- return false;
- }
- } else {
+ if (vmcb12_lma) {
if (!(vmcb12->save.cr4 & X86_CR4_PAE) ||
!(vmcb12->save.cr0 & X86_CR0_PE) ||
- (vmcb12->save.cr3 & MSR_CR3_LONG_MBZ_MASK))
+ (vmcb12->save.cr3 & vcpu->arch.cr3_lm_rsvd_bits))
return false;
}
if (!kvm_is_valid_cr4(&svm->vcpu, vmcb12->save.cr4))
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 0fe874ae5498..6e7d070f8b86 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -403,9 +403,6 @@ static inline bool gif_set(struct vcpu_svm *svm)
}
/* svm.c */
-#define MSR_CR3_LEGACY_RESERVED_MASK 0xfe7U
-#define MSR_CR3_LEGACY_PAE_RESERVED_MASK 0x7U
-#define MSR_CR3_LONG_MBZ_MASK 0xfff0000000000000U
#define MSR_INVALID 0xffffffffU
extern int sev;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 42b28d0f0311..c1650e26715b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9624,6 +9624,8 @@ static bool kvm_is_valid_sregs(struct kvm_vcpu *vcpu, struct kvm_sregs *sregs)
*/
if (!(sregs->cr4 & X86_CR4_PAE) || !(sregs->efer & EFER_LMA))
return false;
+ if (sregs->cr3 & vcpu->arch.cr3_lm_rsvd_bits)
+ return false;
} else {
/*
* Not in 64-bit mode: EFER.LMA is clear and the code
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 7131636e7ea5b50ca910f8953f6365ef2d1f741c Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini(a)redhat.com>
Date: Thu, 28 Jan 2021 11:45:00 -0500
Subject: [PATCH] KVM: x86: Allow guests to see MSR_IA32_TSX_CTRL even if
tsx=off
Userspace that does not know about KVM_GET_MSR_FEATURE_INDEX_LIST
will generally use the default value for MSR_IA32_ARCH_CAPABILITIES.
When this happens and the host has tsx=on, it is possible to end up with
virtual machines that have HLE and RTM disabled, but TSX_CTRL available.
If the fleet is then switched to tsx=off, kvm_get_arch_capabilities()
will clear the ARCH_CAP_TSX_CTRL_MSR bit and it will not be possible to
use the tsx=off hosts as migration destinations, even though the guests
do not have TSX enabled.
To allow this migration, allow guests to write to their TSX_CTRL MSR,
while keeping the host MSR unchanged for the entire life of the guests.
This ensures that TSX remains disabled and also saves MSR reads and
writes, and it's okay to do because with tsx=off we know that guests will
not have the HLE and RTM features in their CPUID. (If userspace sets
bogus CPUID data, we do not expect HLE and RTM to work in guests anyway).
Cc: stable(a)vger.kernel.org
Fixes: cbbaa2727aa3 ("KVM: x86: fix presentation of TSX feature in ARCH_CAPABILITIES")
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index cc60b1fc3ee7..eb69fef57485 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6860,11 +6860,20 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
switch (index) {
case MSR_IA32_TSX_CTRL:
/*
- * No need to pass TSX_CTRL_CPUID_CLEAR through, so
- * let's avoid changing CPUID bits under the host
- * kernel's feet.
+ * TSX_CTRL_CPUID_CLEAR is handled in the CPUID
+ * interception. Keep the host value unchanged to avoid
+ * changing CPUID bits under the host kernel's feet.
+ *
+ * hle=0, rtm=0, tsx_ctrl=1 can be found with some
+ * combinations of new kernel and old userspace. If
+ * those guests run on a tsx=off host, do allow guests
+ * to use TSX_CTRL, but do not change the value on the
+ * host so that TSX remains always disabled.
*/
- vmx->guest_uret_msrs[j].mask = ~(u64)TSX_CTRL_CPUID_CLEAR;
+ if (boot_cpu_has(X86_FEATURE_RTM))
+ vmx->guest_uret_msrs[j].mask = ~(u64)TSX_CTRL_CPUID_CLEAR;
+ else
+ vmx->guest_uret_msrs[j].mask = 0;
break;
default:
vmx->guest_uret_msrs[j].mask = -1ull;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 76bce832cade..b05a1fe9dae9 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1394,16 +1394,24 @@ static u64 kvm_get_arch_capabilities(void)
if (!boot_cpu_has_bug(X86_BUG_MDS))
data |= ARCH_CAP_MDS_NO;
- /*
- * On TAA affected systems:
- * - nothing to do if TSX is disabled on the host.
- * - we emulate TSX_CTRL if present on the host.
- * This lets the guest use VERW to clear CPU buffers.
- */
- if (!boot_cpu_has(X86_FEATURE_RTM))
- data &= ~(ARCH_CAP_TAA_NO | ARCH_CAP_TSX_CTRL_MSR);
- else if (!boot_cpu_has_bug(X86_BUG_TAA))
+ if (!boot_cpu_has(X86_FEATURE_RTM)) {
+ /*
+ * If RTM=0 because the kernel has disabled TSX, the host might
+ * have TAA_NO or TSX_CTRL. Clear TAA_NO (the guest sees RTM=0
+ * and therefore knows that there cannot be TAA) but keep
+ * TSX_CTRL: some buggy userspaces leave it set on tsx=on hosts,
+ * and we want to allow migrating those guests to tsx=off hosts.
+ */
+ data &= ~ARCH_CAP_TAA_NO;
+ } else if (!boot_cpu_has_bug(X86_BUG_TAA)) {
data |= ARCH_CAP_TAA_NO;
+ } else {
+ /*
+ * Nothing to do here; we emulate TSX_CTRL if present on the
+ * host so the guest can choose between disabling TSX or
+ * using VERW to clear CPU buffers.
+ */
+ }
return data;
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 7131636e7ea5b50ca910f8953f6365ef2d1f741c Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini(a)redhat.com>
Date: Thu, 28 Jan 2021 11:45:00 -0500
Subject: [PATCH] KVM: x86: Allow guests to see MSR_IA32_TSX_CTRL even if
tsx=off
Userspace that does not know about KVM_GET_MSR_FEATURE_INDEX_LIST
will generally use the default value for MSR_IA32_ARCH_CAPABILITIES.
When this happens and the host has tsx=on, it is possible to end up with
virtual machines that have HLE and RTM disabled, but TSX_CTRL available.
If the fleet is then switched to tsx=off, kvm_get_arch_capabilities()
will clear the ARCH_CAP_TSX_CTRL_MSR bit and it will not be possible to
use the tsx=off hosts as migration destinations, even though the guests
do not have TSX enabled.
To allow this migration, allow guests to write to their TSX_CTRL MSR,
while keeping the host MSR unchanged for the entire life of the guests.
This ensures that TSX remains disabled and also saves MSR reads and
writes, and it's okay to do because with tsx=off we know that guests will
not have the HLE and RTM features in their CPUID. (If userspace sets
bogus CPUID data, we do not expect HLE and RTM to work in guests anyway).
Cc: stable(a)vger.kernel.org
Fixes: cbbaa2727aa3 ("KVM: x86: fix presentation of TSX feature in ARCH_CAPABILITIES")
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index cc60b1fc3ee7..eb69fef57485 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6860,11 +6860,20 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
switch (index) {
case MSR_IA32_TSX_CTRL:
/*
- * No need to pass TSX_CTRL_CPUID_CLEAR through, so
- * let's avoid changing CPUID bits under the host
- * kernel's feet.
+ * TSX_CTRL_CPUID_CLEAR is handled in the CPUID
+ * interception. Keep the host value unchanged to avoid
+ * changing CPUID bits under the host kernel's feet.
+ *
+ * hle=0, rtm=0, tsx_ctrl=1 can be found with some
+ * combinations of new kernel and old userspace. If
+ * those guests run on a tsx=off host, do allow guests
+ * to use TSX_CTRL, but do not change the value on the
+ * host so that TSX remains always disabled.
*/
- vmx->guest_uret_msrs[j].mask = ~(u64)TSX_CTRL_CPUID_CLEAR;
+ if (boot_cpu_has(X86_FEATURE_RTM))
+ vmx->guest_uret_msrs[j].mask = ~(u64)TSX_CTRL_CPUID_CLEAR;
+ else
+ vmx->guest_uret_msrs[j].mask = 0;
break;
default:
vmx->guest_uret_msrs[j].mask = -1ull;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 76bce832cade..b05a1fe9dae9 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1394,16 +1394,24 @@ static u64 kvm_get_arch_capabilities(void)
if (!boot_cpu_has_bug(X86_BUG_MDS))
data |= ARCH_CAP_MDS_NO;
- /*
- * On TAA affected systems:
- * - nothing to do if TSX is disabled on the host.
- * - we emulate TSX_CTRL if present on the host.
- * This lets the guest use VERW to clear CPU buffers.
- */
- if (!boot_cpu_has(X86_FEATURE_RTM))
- data &= ~(ARCH_CAP_TAA_NO | ARCH_CAP_TSX_CTRL_MSR);
- else if (!boot_cpu_has_bug(X86_BUG_TAA))
+ if (!boot_cpu_has(X86_FEATURE_RTM)) {
+ /*
+ * If RTM=0 because the kernel has disabled TSX, the host might
+ * have TAA_NO or TSX_CTRL. Clear TAA_NO (the guest sees RTM=0
+ * and therefore knows that there cannot be TAA) but keep
+ * TSX_CTRL: some buggy userspaces leave it set on tsx=on hosts,
+ * and we want to allow migrating those guests to tsx=off hosts.
+ */
+ data &= ~ARCH_CAP_TAA_NO;
+ } else if (!boot_cpu_has_bug(X86_BUG_TAA)) {
data |= ARCH_CAP_TAA_NO;
+ } else {
+ /*
+ * Nothing to do here; we emulate TSX_CTRL if present on the
+ * host so the guest can choose between disabling TSX or
+ * using VERW to clear CPU buffers.
+ */
+ }
return data;
}
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 7131636e7ea5b50ca910f8953f6365ef2d1f741c Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini(a)redhat.com>
Date: Thu, 28 Jan 2021 11:45:00 -0500
Subject: [PATCH] KVM: x86: Allow guests to see MSR_IA32_TSX_CTRL even if
tsx=off
Userspace that does not know about KVM_GET_MSR_FEATURE_INDEX_LIST
will generally use the default value for MSR_IA32_ARCH_CAPABILITIES.
When this happens and the host has tsx=on, it is possible to end up with
virtual machines that have HLE and RTM disabled, but TSX_CTRL available.
If the fleet is then switched to tsx=off, kvm_get_arch_capabilities()
will clear the ARCH_CAP_TSX_CTRL_MSR bit and it will not be possible to
use the tsx=off hosts as migration destinations, even though the guests
do not have TSX enabled.
To allow this migration, allow guests to write to their TSX_CTRL MSR,
while keeping the host MSR unchanged for the entire life of the guests.
This ensures that TSX remains disabled and also saves MSR reads and
writes, and it's okay to do because with tsx=off we know that guests will
not have the HLE and RTM features in their CPUID. (If userspace sets
bogus CPUID data, we do not expect HLE and RTM to work in guests anyway).
Cc: stable(a)vger.kernel.org
Fixes: cbbaa2727aa3 ("KVM: x86: fix presentation of TSX feature in ARCH_CAPABILITIES")
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index cc60b1fc3ee7..eb69fef57485 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6860,11 +6860,20 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
switch (index) {
case MSR_IA32_TSX_CTRL:
/*
- * No need to pass TSX_CTRL_CPUID_CLEAR through, so
- * let's avoid changing CPUID bits under the host
- * kernel's feet.
+ * TSX_CTRL_CPUID_CLEAR is handled in the CPUID
+ * interception. Keep the host value unchanged to avoid
+ * changing CPUID bits under the host kernel's feet.
+ *
+ * hle=0, rtm=0, tsx_ctrl=1 can be found with some
+ * combinations of new kernel and old userspace. If
+ * those guests run on a tsx=off host, do allow guests
+ * to use TSX_CTRL, but do not change the value on the
+ * host so that TSX remains always disabled.
*/
- vmx->guest_uret_msrs[j].mask = ~(u64)TSX_CTRL_CPUID_CLEAR;
+ if (boot_cpu_has(X86_FEATURE_RTM))
+ vmx->guest_uret_msrs[j].mask = ~(u64)TSX_CTRL_CPUID_CLEAR;
+ else
+ vmx->guest_uret_msrs[j].mask = 0;
break;
default:
vmx->guest_uret_msrs[j].mask = -1ull;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 76bce832cade..b05a1fe9dae9 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1394,16 +1394,24 @@ static u64 kvm_get_arch_capabilities(void)
if (!boot_cpu_has_bug(X86_BUG_MDS))
data |= ARCH_CAP_MDS_NO;
- /*
- * On TAA affected systems:
- * - nothing to do if TSX is disabled on the host.
- * - we emulate TSX_CTRL if present on the host.
- * This lets the guest use VERW to clear CPU buffers.
- */
- if (!boot_cpu_has(X86_FEATURE_RTM))
- data &= ~(ARCH_CAP_TAA_NO | ARCH_CAP_TSX_CTRL_MSR);
- else if (!boot_cpu_has_bug(X86_BUG_TAA))
+ if (!boot_cpu_has(X86_FEATURE_RTM)) {
+ /*
+ * If RTM=0 because the kernel has disabled TSX, the host might
+ * have TAA_NO or TSX_CTRL. Clear TAA_NO (the guest sees RTM=0
+ * and therefore knows that there cannot be TAA) but keep
+ * TSX_CTRL: some buggy userspaces leave it set on tsx=on hosts,
+ * and we want to allow migrating those guests to tsx=off hosts.
+ */
+ data &= ~ARCH_CAP_TAA_NO;
+ } else if (!boot_cpu_has_bug(X86_BUG_TAA)) {
data |= ARCH_CAP_TAA_NO;
+ } else {
+ /*
+ * Nothing to do here; we emulate TSX_CTRL if present on the
+ * host so the guest can choose between disabling TSX or
+ * using VERW to clear CPU buffers.
+ */
+ }
return data;
}
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 7131636e7ea5b50ca910f8953f6365ef2d1f741c Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini(a)redhat.com>
Date: Thu, 28 Jan 2021 11:45:00 -0500
Subject: [PATCH] KVM: x86: Allow guests to see MSR_IA32_TSX_CTRL even if
tsx=off
Userspace that does not know about KVM_GET_MSR_FEATURE_INDEX_LIST
will generally use the default value for MSR_IA32_ARCH_CAPABILITIES.
When this happens and the host has tsx=on, it is possible to end up with
virtual machines that have HLE and RTM disabled, but TSX_CTRL available.
If the fleet is then switched to tsx=off, kvm_get_arch_capabilities()
will clear the ARCH_CAP_TSX_CTRL_MSR bit and it will not be possible to
use the tsx=off hosts as migration destinations, even though the guests
do not have TSX enabled.
To allow this migration, allow guests to write to their TSX_CTRL MSR,
while keeping the host MSR unchanged for the entire life of the guests.
This ensures that TSX remains disabled and also saves MSR reads and
writes, and it's okay to do because with tsx=off we know that guests will
not have the HLE and RTM features in their CPUID. (If userspace sets
bogus CPUID data, we do not expect HLE and RTM to work in guests anyway).
Cc: stable(a)vger.kernel.org
Fixes: cbbaa2727aa3 ("KVM: x86: fix presentation of TSX feature in ARCH_CAPABILITIES")
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index cc60b1fc3ee7..eb69fef57485 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6860,11 +6860,20 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
switch (index) {
case MSR_IA32_TSX_CTRL:
/*
- * No need to pass TSX_CTRL_CPUID_CLEAR through, so
- * let's avoid changing CPUID bits under the host
- * kernel's feet.
+ * TSX_CTRL_CPUID_CLEAR is handled in the CPUID
+ * interception. Keep the host value unchanged to avoid
+ * changing CPUID bits under the host kernel's feet.
+ *
+ * hle=0, rtm=0, tsx_ctrl=1 can be found with some
+ * combinations of new kernel and old userspace. If
+ * those guests run on a tsx=off host, do allow guests
+ * to use TSX_CTRL, but do not change the value on the
+ * host so that TSX remains always disabled.
*/
- vmx->guest_uret_msrs[j].mask = ~(u64)TSX_CTRL_CPUID_CLEAR;
+ if (boot_cpu_has(X86_FEATURE_RTM))
+ vmx->guest_uret_msrs[j].mask = ~(u64)TSX_CTRL_CPUID_CLEAR;
+ else
+ vmx->guest_uret_msrs[j].mask = 0;
break;
default:
vmx->guest_uret_msrs[j].mask = -1ull;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 76bce832cade..b05a1fe9dae9 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1394,16 +1394,24 @@ static u64 kvm_get_arch_capabilities(void)
if (!boot_cpu_has_bug(X86_BUG_MDS))
data |= ARCH_CAP_MDS_NO;
- /*
- * On TAA affected systems:
- * - nothing to do if TSX is disabled on the host.
- * - we emulate TSX_CTRL if present on the host.
- * This lets the guest use VERW to clear CPU buffers.
- */
- if (!boot_cpu_has(X86_FEATURE_RTM))
- data &= ~(ARCH_CAP_TAA_NO | ARCH_CAP_TSX_CTRL_MSR);
- else if (!boot_cpu_has_bug(X86_BUG_TAA))
+ if (!boot_cpu_has(X86_FEATURE_RTM)) {
+ /*
+ * If RTM=0 because the kernel has disabled TSX, the host might
+ * have TAA_NO or TSX_CTRL. Clear TAA_NO (the guest sees RTM=0
+ * and therefore knows that there cannot be TAA) but keep
+ * TSX_CTRL: some buggy userspaces leave it set on tsx=on hosts,
+ * and we want to allow migrating those guests to tsx=off hosts.
+ */
+ data &= ~ARCH_CAP_TAA_NO;
+ } else if (!boot_cpu_has_bug(X86_BUG_TAA)) {
data |= ARCH_CAP_TAA_NO;
+ } else {
+ /*
+ * Nothing to do here; we emulate TSX_CTRL if present on the
+ * host so the guest can choose between disabling TSX or
+ * using VERW to clear CPU buffers.
+ */
+ }
return data;
}
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 7131636e7ea5b50ca910f8953f6365ef2d1f741c Mon Sep 17 00:00:00 2001
From: Paolo Bonzini <pbonzini(a)redhat.com>
Date: Thu, 28 Jan 2021 11:45:00 -0500
Subject: [PATCH] KVM: x86: Allow guests to see MSR_IA32_TSX_CTRL even if
tsx=off
Userspace that does not know about KVM_GET_MSR_FEATURE_INDEX_LIST
will generally use the default value for MSR_IA32_ARCH_CAPABILITIES.
When this happens and the host has tsx=on, it is possible to end up with
virtual machines that have HLE and RTM disabled, but TSX_CTRL available.
If the fleet is then switched to tsx=off, kvm_get_arch_capabilities()
will clear the ARCH_CAP_TSX_CTRL_MSR bit and it will not be possible to
use the tsx=off hosts as migration destinations, even though the guests
do not have TSX enabled.
To allow this migration, allow guests to write to their TSX_CTRL MSR,
while keeping the host MSR unchanged for the entire life of the guests.
This ensures that TSX remains disabled and also saves MSR reads and
writes, and it's okay to do because with tsx=off we know that guests will
not have the HLE and RTM features in their CPUID. (If userspace sets
bogus CPUID data, we do not expect HLE and RTM to work in guests anyway).
Cc: stable(a)vger.kernel.org
Fixes: cbbaa2727aa3 ("KVM: x86: fix presentation of TSX feature in ARCH_CAPABILITIES")
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index cc60b1fc3ee7..eb69fef57485 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -6860,11 +6860,20 @@ static int vmx_create_vcpu(struct kvm_vcpu *vcpu)
switch (index) {
case MSR_IA32_TSX_CTRL:
/*
- * No need to pass TSX_CTRL_CPUID_CLEAR through, so
- * let's avoid changing CPUID bits under the host
- * kernel's feet.
+ * TSX_CTRL_CPUID_CLEAR is handled in the CPUID
+ * interception. Keep the host value unchanged to avoid
+ * changing CPUID bits under the host kernel's feet.
+ *
+ * hle=0, rtm=0, tsx_ctrl=1 can be found with some
+ * combinations of new kernel and old userspace. If
+ * those guests run on a tsx=off host, do allow guests
+ * to use TSX_CTRL, but do not change the value on the
+ * host so that TSX remains always disabled.
*/
- vmx->guest_uret_msrs[j].mask = ~(u64)TSX_CTRL_CPUID_CLEAR;
+ if (boot_cpu_has(X86_FEATURE_RTM))
+ vmx->guest_uret_msrs[j].mask = ~(u64)TSX_CTRL_CPUID_CLEAR;
+ else
+ vmx->guest_uret_msrs[j].mask = 0;
break;
default:
vmx->guest_uret_msrs[j].mask = -1ull;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 76bce832cade..b05a1fe9dae9 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1394,16 +1394,24 @@ static u64 kvm_get_arch_capabilities(void)
if (!boot_cpu_has_bug(X86_BUG_MDS))
data |= ARCH_CAP_MDS_NO;
- /*
- * On TAA affected systems:
- * - nothing to do if TSX is disabled on the host.
- * - we emulate TSX_CTRL if present on the host.
- * This lets the guest use VERW to clear CPU buffers.
- */
- if (!boot_cpu_has(X86_FEATURE_RTM))
- data &= ~(ARCH_CAP_TAA_NO | ARCH_CAP_TSX_CTRL_MSR);
- else if (!boot_cpu_has_bug(X86_BUG_TAA))
+ if (!boot_cpu_has(X86_FEATURE_RTM)) {
+ /*
+ * If RTM=0 because the kernel has disabled TSX, the host might
+ * have TAA_NO or TSX_CTRL. Clear TAA_NO (the guest sees RTM=0
+ * and therefore knows that there cannot be TAA) but keep
+ * TSX_CTRL: some buggy userspaces leave it set on tsx=on hosts,
+ * and we want to allow migrating those guests to tsx=off hosts.
+ */
+ data &= ~ARCH_CAP_TAA_NO;
+ } else if (!boot_cpu_has_bug(X86_BUG_TAA)) {
data |= ARCH_CAP_TAA_NO;
+ } else {
+ /*
+ * Nothing to do here; we emulate TSX_CTRL if present on the
+ * host so the guest can choose between disabling TSX or
+ * using VERW to clear CPU buffers.
+ */
+ }
return data;
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From aec18a57edad562d620f7d19016de1fc0cc2208c Mon Sep 17 00:00:00 2001
From: Pavel Begunkov <asml.silence(a)gmail.com>
Date: Thu, 4 Feb 2021 19:22:46 +0000
Subject: [PATCH] io_uring: drop mm/files between task_work_submit
Since SQPOLL task can be shared and so task_work entries can be a mix of
them, we need to drop mm and files before trying to issue next request.
Cc: stable(a)vger.kernel.org # 5.10+
Signed-off-by: Pavel Begunkov <asml.silence(a)gmail.com>
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 5d3348d66f06..1f68105a41ed 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -2205,6 +2205,9 @@ static void __io_req_task_submit(struct io_kiocb *req)
else
__io_req_task_cancel(req, -EFAULT);
mutex_unlock(&ctx->uring_lock);
+
+ if (ctx->flags & IORING_SETUP_SQPOLL)
+ io_sq_thread_drop_mm_files();
}
static void io_req_task_submit(struct callback_head *cb)
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 89fa15ecdca7eb46a711476b961f70a74765bbe4 Mon Sep 17 00:00:00 2001
From: Huang Rui <ray.huang(a)amd.com>
Date: Sat, 30 Jan 2021 17:14:30 +0800
Subject: [PATCH] drm/amdgpu: fix the issue that retry constantly once the
buffer is oversize
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
We cannot modify initial_domain every time while the retry starts. That
will cause the busy waiting that unable to switch to GTT while the vram
is not enough.
Fixes: f8aab60422c3 ("drm/amdgpu: Initialise drm_gem_object_funcs for imported BOs")
Signed-off-by: Huang Rui <ray.huang(a)amd.com>
Reviewed-by: Alex Deucher <alexander.deucher(a)amd.com>
Reviewed-by: Christian König <christian.koenig(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
index d0a1fee1f5f6..174a73eb23f0 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c
@@ -269,8 +269,8 @@ int amdgpu_gem_create_ioctl(struct drm_device *dev, void *data,
resv = vm->root.base.bo->tbo.base.resv;
}
-retry:
initial_domain = (u32)(0xffffffff & args->in.domains);
+retry:
r = amdgpu_gem_object_create(adev, size, args->in.alignment,
initial_domain,
flags, ttm_bo_type_device, resv, &gobj);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 19a23da53932bc8011220bd8c410cb76012de004 Mon Sep 17 00:00:00 2001
From: Peter Gonda <pgonda(a)google.com>
Date: Wed, 27 Jan 2021 08:15:24 -0800
Subject: [PATCH] Fix unsynchronized access to sev members through
svm_register_enc_region
Grab kvm->lock before pinning memory when registering an encrypted
region; sev_pin_memory() relies on kvm->lock being held to ensure
correctness when checking and updating the number of pinned pages.
Add a lockdep assertion to help prevent future regressions.
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: "H. Peter Anvin" <hpa(a)zytor.com>
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Joerg Roedel <joro(a)8bytes.org>
Cc: Tom Lendacky <thomas.lendacky(a)amd.com>
Cc: Brijesh Singh <brijesh.singh(a)amd.com>
Cc: Sean Christopherson <seanjc(a)google.com>
Cc: x86(a)kernel.org
Cc: kvm(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Fixes: 1e80fdc09d12 ("KVM: SVM: Pin guest memory when SEV is active")
Signed-off-by: Peter Gonda <pgonda(a)google.com>
V2
- Fix up patch description
- Correct file paths svm.c -> sev.c
- Add unlock of kvm->lock on sev_pin_memory error
V1
- https://lore.kernel.org/kvm/20210126185431.1824530-1-pgonda@google.com/
Message-Id: <20210127161524.2832400-1-pgonda(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index ac652bc476ae..48017fef1cd9 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -342,6 +342,8 @@ static struct page **sev_pin_memory(struct kvm *kvm, unsigned long uaddr,
unsigned long first, last;
int ret;
+ lockdep_assert_held(&kvm->lock);
+
if (ulen == 0 || uaddr + ulen < uaddr)
return ERR_PTR(-EINVAL);
@@ -1119,12 +1121,20 @@ int svm_register_enc_region(struct kvm *kvm,
if (!region)
return -ENOMEM;
+ mutex_lock(&kvm->lock);
region->pages = sev_pin_memory(kvm, range->addr, range->size, ®ion->npages, 1);
if (IS_ERR(region->pages)) {
ret = PTR_ERR(region->pages);
+ mutex_unlock(&kvm->lock);
goto e_free;
}
+ region->uaddr = range->addr;
+ region->size = range->size;
+
+ list_add_tail(®ion->list, &sev->regions_list);
+ mutex_unlock(&kvm->lock);
+
/*
* The guest may change the memory encryption attribute from C=0 -> C=1
* or vice versa for this memory range. Lets make sure caches are
@@ -1133,13 +1143,6 @@ int svm_register_enc_region(struct kvm *kvm,
*/
sev_clflush_pages(region->pages, region->npages);
- region->uaddr = range->addr;
- region->size = range->size;
-
- mutex_lock(&kvm->lock);
- list_add_tail(®ion->list, &sev->regions_list);
- mutex_unlock(&kvm->lock);
-
return ret;
e_free:
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 19a23da53932bc8011220bd8c410cb76012de004 Mon Sep 17 00:00:00 2001
From: Peter Gonda <pgonda(a)google.com>
Date: Wed, 27 Jan 2021 08:15:24 -0800
Subject: [PATCH] Fix unsynchronized access to sev members through
svm_register_enc_region
Grab kvm->lock before pinning memory when registering an encrypted
region; sev_pin_memory() relies on kvm->lock being held to ensure
correctness when checking and updating the number of pinned pages.
Add a lockdep assertion to help prevent future regressions.
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Ingo Molnar <mingo(a)redhat.com>
Cc: "H. Peter Anvin" <hpa(a)zytor.com>
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Joerg Roedel <joro(a)8bytes.org>
Cc: Tom Lendacky <thomas.lendacky(a)amd.com>
Cc: Brijesh Singh <brijesh.singh(a)amd.com>
Cc: Sean Christopherson <seanjc(a)google.com>
Cc: x86(a)kernel.org
Cc: kvm(a)vger.kernel.org
Cc: stable(a)vger.kernel.org
Cc: linux-kernel(a)vger.kernel.org
Fixes: 1e80fdc09d12 ("KVM: SVM: Pin guest memory when SEV is active")
Signed-off-by: Peter Gonda <pgonda(a)google.com>
V2
- Fix up patch description
- Correct file paths svm.c -> sev.c
- Add unlock of kvm->lock on sev_pin_memory error
V1
- https://lore.kernel.org/kvm/20210126185431.1824530-1-pgonda@google.com/
Message-Id: <20210127161524.2832400-1-pgonda(a)google.com>
Signed-off-by: Paolo Bonzini <pbonzini(a)redhat.com>
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index ac652bc476ae..48017fef1cd9 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -342,6 +342,8 @@ static struct page **sev_pin_memory(struct kvm *kvm, unsigned long uaddr,
unsigned long first, last;
int ret;
+ lockdep_assert_held(&kvm->lock);
+
if (ulen == 0 || uaddr + ulen < uaddr)
return ERR_PTR(-EINVAL);
@@ -1119,12 +1121,20 @@ int svm_register_enc_region(struct kvm *kvm,
if (!region)
return -ENOMEM;
+ mutex_lock(&kvm->lock);
region->pages = sev_pin_memory(kvm, range->addr, range->size, ®ion->npages, 1);
if (IS_ERR(region->pages)) {
ret = PTR_ERR(region->pages);
+ mutex_unlock(&kvm->lock);
goto e_free;
}
+ region->uaddr = range->addr;
+ region->size = range->size;
+
+ list_add_tail(®ion->list, &sev->regions_list);
+ mutex_unlock(&kvm->lock);
+
/*
* The guest may change the memory encryption attribute from C=0 -> C=1
* or vice versa for this memory range. Lets make sure caches are
@@ -1133,13 +1143,6 @@ int svm_register_enc_region(struct kvm *kvm,
*/
sev_clflush_pages(region->pages, region->npages);
- region->uaddr = range->addr;
- region->size = range->size;
-
- mutex_lock(&kvm->lock);
- list_add_tail(®ion->list, &sev->regions_list);
- mutex_unlock(&kvm->lock);
-
return ret;
e_free:
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 7018c897c2f243d4b5f1b94bc6b4831a7eab80fb Mon Sep 17 00:00:00 2001
From: Dan Williams <dan.j.williams(a)intel.com>
Date: Mon, 1 Feb 2021 16:20:40 -0800
Subject: [PATCH] libnvdimm/dimm: Avoid race between probe and
available_slots_show()
Richard reports that the following test:
(while true; do
cat /sys/bus/nd/devices/nmem*/available_slots 2>&1 > /dev/null
done) &
while true; do
for i in $(seq 0 4); do
echo nmem$i > /sys/bus/nd/drivers/nvdimm/bind
done
for i in $(seq 0 4); do
echo nmem$i > /sys/bus/nd/drivers/nvdimm/unbind
done
done
...fails with a crash signature like:
divide error: 0000 [#1] SMP KASAN PTI
RIP: 0010:nd_label_nfree+0x134/0x1a0 [libnvdimm]
[..]
Call Trace:
available_slots_show+0x4e/0x120 [libnvdimm]
dev_attr_show+0x42/0x80
? memset+0x20/0x40
sysfs_kf_seq_show+0x218/0x410
The root cause is that available_slots_show() consults driver-data, but
fails to synchronize against device-unbind setting up a TOCTOU race to
access uninitialized memory.
Validate driver-data under the device-lock.
Fixes: 4d88a97aa9e8 ("libnvdimm, nvdimm: dimm driver and base libnvdimm device-driver infrastructure")
Cc: <stable(a)vger.kernel.org>
Cc: Vishal Verma <vishal.l.verma(a)intel.com>
Cc: Dave Jiang <dave.jiang(a)intel.com>
Cc: Ira Weiny <ira.weiny(a)intel.com>
Cc: Coly Li <colyli(a)suse.com>
Reported-by: Richard Palethorpe <rpalethorpe(a)suse.com>
Acked-by: Richard Palethorpe <rpalethorpe(a)suse.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
diff --git a/drivers/nvdimm/dimm_devs.c b/drivers/nvdimm/dimm_devs.c
index b59032e0859b..9d208570d059 100644
--- a/drivers/nvdimm/dimm_devs.c
+++ b/drivers/nvdimm/dimm_devs.c
@@ -335,16 +335,16 @@ static ssize_t state_show(struct device *dev, struct device_attribute *attr,
}
static DEVICE_ATTR_RO(state);
-static ssize_t available_slots_show(struct device *dev,
- struct device_attribute *attr, char *buf)
+static ssize_t __available_slots_show(struct nvdimm_drvdata *ndd, char *buf)
{
- struct nvdimm_drvdata *ndd = dev_get_drvdata(dev);
+ struct device *dev;
ssize_t rc;
u32 nfree;
if (!ndd)
return -ENXIO;
+ dev = ndd->dev;
nvdimm_bus_lock(dev);
nfree = nd_label_nfree(ndd);
if (nfree - 1 > nfree) {
@@ -356,6 +356,18 @@ static ssize_t available_slots_show(struct device *dev,
nvdimm_bus_unlock(dev);
return rc;
}
+
+static ssize_t available_slots_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ ssize_t rc;
+
+ nd_device_lock(dev);
+ rc = __available_slots_show(dev_get_drvdata(dev), buf);
+ nd_device_unlock(dev);
+
+ return rc;
+}
static DEVICE_ATTR_RO(available_slots);
__weak ssize_t security_show(struct device *dev,
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 7018c897c2f243d4b5f1b94bc6b4831a7eab80fb Mon Sep 17 00:00:00 2001
From: Dan Williams <dan.j.williams(a)intel.com>
Date: Mon, 1 Feb 2021 16:20:40 -0800
Subject: [PATCH] libnvdimm/dimm: Avoid race between probe and
available_slots_show()
Richard reports that the following test:
(while true; do
cat /sys/bus/nd/devices/nmem*/available_slots 2>&1 > /dev/null
done) &
while true; do
for i in $(seq 0 4); do
echo nmem$i > /sys/bus/nd/drivers/nvdimm/bind
done
for i in $(seq 0 4); do
echo nmem$i > /sys/bus/nd/drivers/nvdimm/unbind
done
done
...fails with a crash signature like:
divide error: 0000 [#1] SMP KASAN PTI
RIP: 0010:nd_label_nfree+0x134/0x1a0 [libnvdimm]
[..]
Call Trace:
available_slots_show+0x4e/0x120 [libnvdimm]
dev_attr_show+0x42/0x80
? memset+0x20/0x40
sysfs_kf_seq_show+0x218/0x410
The root cause is that available_slots_show() consults driver-data, but
fails to synchronize against device-unbind setting up a TOCTOU race to
access uninitialized memory.
Validate driver-data under the device-lock.
Fixes: 4d88a97aa9e8 ("libnvdimm, nvdimm: dimm driver and base libnvdimm device-driver infrastructure")
Cc: <stable(a)vger.kernel.org>
Cc: Vishal Verma <vishal.l.verma(a)intel.com>
Cc: Dave Jiang <dave.jiang(a)intel.com>
Cc: Ira Weiny <ira.weiny(a)intel.com>
Cc: Coly Li <colyli(a)suse.com>
Reported-by: Richard Palethorpe <rpalethorpe(a)suse.com>
Acked-by: Richard Palethorpe <rpalethorpe(a)suse.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
diff --git a/drivers/nvdimm/dimm_devs.c b/drivers/nvdimm/dimm_devs.c
index b59032e0859b..9d208570d059 100644
--- a/drivers/nvdimm/dimm_devs.c
+++ b/drivers/nvdimm/dimm_devs.c
@@ -335,16 +335,16 @@ static ssize_t state_show(struct device *dev, struct device_attribute *attr,
}
static DEVICE_ATTR_RO(state);
-static ssize_t available_slots_show(struct device *dev,
- struct device_attribute *attr, char *buf)
+static ssize_t __available_slots_show(struct nvdimm_drvdata *ndd, char *buf)
{
- struct nvdimm_drvdata *ndd = dev_get_drvdata(dev);
+ struct device *dev;
ssize_t rc;
u32 nfree;
if (!ndd)
return -ENXIO;
+ dev = ndd->dev;
nvdimm_bus_lock(dev);
nfree = nd_label_nfree(ndd);
if (nfree - 1 > nfree) {
@@ -356,6 +356,18 @@ static ssize_t available_slots_show(struct device *dev,
nvdimm_bus_unlock(dev);
return rc;
}
+
+static ssize_t available_slots_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ ssize_t rc;
+
+ nd_device_lock(dev);
+ rc = __available_slots_show(dev_get_drvdata(dev), buf);
+ nd_device_unlock(dev);
+
+ return rc;
+}
static DEVICE_ATTR_RO(available_slots);
__weak ssize_t security_show(struct device *dev,
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 7018c897c2f243d4b5f1b94bc6b4831a7eab80fb Mon Sep 17 00:00:00 2001
From: Dan Williams <dan.j.williams(a)intel.com>
Date: Mon, 1 Feb 2021 16:20:40 -0800
Subject: [PATCH] libnvdimm/dimm: Avoid race between probe and
available_slots_show()
Richard reports that the following test:
(while true; do
cat /sys/bus/nd/devices/nmem*/available_slots 2>&1 > /dev/null
done) &
while true; do
for i in $(seq 0 4); do
echo nmem$i > /sys/bus/nd/drivers/nvdimm/bind
done
for i in $(seq 0 4); do
echo nmem$i > /sys/bus/nd/drivers/nvdimm/unbind
done
done
...fails with a crash signature like:
divide error: 0000 [#1] SMP KASAN PTI
RIP: 0010:nd_label_nfree+0x134/0x1a0 [libnvdimm]
[..]
Call Trace:
available_slots_show+0x4e/0x120 [libnvdimm]
dev_attr_show+0x42/0x80
? memset+0x20/0x40
sysfs_kf_seq_show+0x218/0x410
The root cause is that available_slots_show() consults driver-data, but
fails to synchronize against device-unbind setting up a TOCTOU race to
access uninitialized memory.
Validate driver-data under the device-lock.
Fixes: 4d88a97aa9e8 ("libnvdimm, nvdimm: dimm driver and base libnvdimm device-driver infrastructure")
Cc: <stable(a)vger.kernel.org>
Cc: Vishal Verma <vishal.l.verma(a)intel.com>
Cc: Dave Jiang <dave.jiang(a)intel.com>
Cc: Ira Weiny <ira.weiny(a)intel.com>
Cc: Coly Li <colyli(a)suse.com>
Reported-by: Richard Palethorpe <rpalethorpe(a)suse.com>
Acked-by: Richard Palethorpe <rpalethorpe(a)suse.com>
Signed-off-by: Dan Williams <dan.j.williams(a)intel.com>
diff --git a/drivers/nvdimm/dimm_devs.c b/drivers/nvdimm/dimm_devs.c
index b59032e0859b..9d208570d059 100644
--- a/drivers/nvdimm/dimm_devs.c
+++ b/drivers/nvdimm/dimm_devs.c
@@ -335,16 +335,16 @@ static ssize_t state_show(struct device *dev, struct device_attribute *attr,
}
static DEVICE_ATTR_RO(state);
-static ssize_t available_slots_show(struct device *dev,
- struct device_attribute *attr, char *buf)
+static ssize_t __available_slots_show(struct nvdimm_drvdata *ndd, char *buf)
{
- struct nvdimm_drvdata *ndd = dev_get_drvdata(dev);
+ struct device *dev;
ssize_t rc;
u32 nfree;
if (!ndd)
return -ENXIO;
+ dev = ndd->dev;
nvdimm_bus_lock(dev);
nfree = nd_label_nfree(ndd);
if (nfree - 1 > nfree) {
@@ -356,6 +356,18 @@ static ssize_t available_slots_show(struct device *dev,
nvdimm_bus_unlock(dev);
return rc;
}
+
+static ssize_t available_slots_show(struct device *dev,
+ struct device_attribute *attr, char *buf)
+{
+ ssize_t rc;
+
+ nd_device_lock(dev);
+ rc = __available_slots_show(dev_get_drvdata(dev), buf);
+ nd_device_unlock(dev);
+
+ return rc;
+}
static DEVICE_ATTR_RO(available_slots);
__weak ssize_t security_show(struct device *dev,
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 4c457e8cb75eda91906a4f89fc39bde3f9a43922 Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz(a)kernel.org>
Date: Sat, 23 Jan 2021 12:27:59 +0000
Subject: [PATCH] genirq/msi: Activate Multi-MSI early when
MSI_FLAG_ACTIVATE_EARLY is set
When MSI_FLAG_ACTIVATE_EARLY is set (which is the case for PCI),
__msi_domain_alloc_irqs() performs the activation of the interrupt (which
in the case of PCI results in the endpoint being programmed) as soon as the
interrupt is allocated.
But it appears that this is only done for the first vector, introducing an
inconsistent behaviour for PCI Multi-MSI.
Fix it by iterating over the number of vectors allocated to each MSI
descriptor. This is easily achieved by introducing a new
"for_each_msi_vector" iterator, together with a tiny bit of refactoring.
Fixes: f3b0946d629c ("genirq/msi: Make sure PCI MSIs are activated early")
Reported-by: Shameer Kolothum <shameerali.kolothum.thodi(a)huawei.com>
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi(a)huawei.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20210123122759.1781359-1-maz@kernel.org
diff --git a/include/linux/msi.h b/include/linux/msi.h
index 360a0a7e7341..aef35fd1cf11 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -178,6 +178,12 @@ struct msi_desc {
list_for_each_entry((desc), dev_to_msi_list((dev)), list)
#define for_each_msi_entry_safe(desc, tmp, dev) \
list_for_each_entry_safe((desc), (tmp), dev_to_msi_list((dev)), list)
+#define for_each_msi_vector(desc, __irq, dev) \
+ for_each_msi_entry((desc), (dev)) \
+ if ((desc)->irq) \
+ for (__irq = (desc)->irq; \
+ __irq < ((desc)->irq + (desc)->nvec_used); \
+ __irq++)
#ifdef CONFIG_IRQ_MSI_IOMMU
static inline const void *msi_desc_get_iommu_cookie(struct msi_desc *desc)
diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c
index dc0e2d7fbdfd..b338d622f26e 100644
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -436,22 +436,22 @@ int __msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
can_reserve = msi_check_reservation_mode(domain, info, dev);
- for_each_msi_entry(desc, dev) {
- virq = desc->irq;
- if (desc->nvec_used == 1)
- dev_dbg(dev, "irq %d for MSI\n", virq);
- else
+ /*
+ * This flag is set by the PCI layer as we need to activate
+ * the MSI entries before the PCI layer enables MSI in the
+ * card. Otherwise the card latches a random msi message.
+ */
+ if (!(info->flags & MSI_FLAG_ACTIVATE_EARLY))
+ goto skip_activate;
+
+ for_each_msi_vector(desc, i, dev) {
+ if (desc->irq == i) {
+ virq = desc->irq;
dev_dbg(dev, "irq [%d-%d] for MSI\n",
virq, virq + desc->nvec_used - 1);
- /*
- * This flag is set by the PCI layer as we need to activate
- * the MSI entries before the PCI layer enables MSI in the
- * card. Otherwise the card latches a random msi message.
- */
- if (!(info->flags & MSI_FLAG_ACTIVATE_EARLY))
- continue;
+ }
- irq_data = irq_domain_get_irq_data(domain, desc->irq);
+ irq_data = irq_domain_get_irq_data(domain, i);
if (!can_reserve) {
irqd_clr_can_reserve(irq_data);
if (domain->flags & IRQ_DOMAIN_MSI_NOMASK_QUIRK)
@@ -462,28 +462,24 @@ int __msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
goto cleanup;
}
+skip_activate:
/*
* If these interrupts use reservation mode, clear the activated bit
* so request_irq() will assign the final vector.
*/
if (can_reserve) {
- for_each_msi_entry(desc, dev) {
- irq_data = irq_domain_get_irq_data(domain, desc->irq);
+ for_each_msi_vector(desc, i, dev) {
+ irq_data = irq_domain_get_irq_data(domain, i);
irqd_clr_activated(irq_data);
}
}
return 0;
cleanup:
- for_each_msi_entry(desc, dev) {
- struct irq_data *irqd;
-
- if (desc->irq == virq)
- break;
-
- irqd = irq_domain_get_irq_data(domain, desc->irq);
- if (irqd_is_activated(irqd))
- irq_domain_deactivate_irq(irqd);
+ for_each_msi_vector(desc, i, dev) {
+ irq_data = irq_domain_get_irq_data(domain, i);
+ if (irqd_is_activated(irq_data))
+ irq_domain_deactivate_irq(irq_data);
}
msi_domain_free_irqs(domain, dev);
return ret;
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 4c457e8cb75eda91906a4f89fc39bde3f9a43922 Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz(a)kernel.org>
Date: Sat, 23 Jan 2021 12:27:59 +0000
Subject: [PATCH] genirq/msi: Activate Multi-MSI early when
MSI_FLAG_ACTIVATE_EARLY is set
When MSI_FLAG_ACTIVATE_EARLY is set (which is the case for PCI),
__msi_domain_alloc_irqs() performs the activation of the interrupt (which
in the case of PCI results in the endpoint being programmed) as soon as the
interrupt is allocated.
But it appears that this is only done for the first vector, introducing an
inconsistent behaviour for PCI Multi-MSI.
Fix it by iterating over the number of vectors allocated to each MSI
descriptor. This is easily achieved by introducing a new
"for_each_msi_vector" iterator, together with a tiny bit of refactoring.
Fixes: f3b0946d629c ("genirq/msi: Make sure PCI MSIs are activated early")
Reported-by: Shameer Kolothum <shameerali.kolothum.thodi(a)huawei.com>
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi(a)huawei.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20210123122759.1781359-1-maz@kernel.org
diff --git a/include/linux/msi.h b/include/linux/msi.h
index 360a0a7e7341..aef35fd1cf11 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -178,6 +178,12 @@ struct msi_desc {
list_for_each_entry((desc), dev_to_msi_list((dev)), list)
#define for_each_msi_entry_safe(desc, tmp, dev) \
list_for_each_entry_safe((desc), (tmp), dev_to_msi_list((dev)), list)
+#define for_each_msi_vector(desc, __irq, dev) \
+ for_each_msi_entry((desc), (dev)) \
+ if ((desc)->irq) \
+ for (__irq = (desc)->irq; \
+ __irq < ((desc)->irq + (desc)->nvec_used); \
+ __irq++)
#ifdef CONFIG_IRQ_MSI_IOMMU
static inline const void *msi_desc_get_iommu_cookie(struct msi_desc *desc)
diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c
index dc0e2d7fbdfd..b338d622f26e 100644
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -436,22 +436,22 @@ int __msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
can_reserve = msi_check_reservation_mode(domain, info, dev);
- for_each_msi_entry(desc, dev) {
- virq = desc->irq;
- if (desc->nvec_used == 1)
- dev_dbg(dev, "irq %d for MSI\n", virq);
- else
+ /*
+ * This flag is set by the PCI layer as we need to activate
+ * the MSI entries before the PCI layer enables MSI in the
+ * card. Otherwise the card latches a random msi message.
+ */
+ if (!(info->flags & MSI_FLAG_ACTIVATE_EARLY))
+ goto skip_activate;
+
+ for_each_msi_vector(desc, i, dev) {
+ if (desc->irq == i) {
+ virq = desc->irq;
dev_dbg(dev, "irq [%d-%d] for MSI\n",
virq, virq + desc->nvec_used - 1);
- /*
- * This flag is set by the PCI layer as we need to activate
- * the MSI entries before the PCI layer enables MSI in the
- * card. Otherwise the card latches a random msi message.
- */
- if (!(info->flags & MSI_FLAG_ACTIVATE_EARLY))
- continue;
+ }
- irq_data = irq_domain_get_irq_data(domain, desc->irq);
+ irq_data = irq_domain_get_irq_data(domain, i);
if (!can_reserve) {
irqd_clr_can_reserve(irq_data);
if (domain->flags & IRQ_DOMAIN_MSI_NOMASK_QUIRK)
@@ -462,28 +462,24 @@ int __msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
goto cleanup;
}
+skip_activate:
/*
* If these interrupts use reservation mode, clear the activated bit
* so request_irq() will assign the final vector.
*/
if (can_reserve) {
- for_each_msi_entry(desc, dev) {
- irq_data = irq_domain_get_irq_data(domain, desc->irq);
+ for_each_msi_vector(desc, i, dev) {
+ irq_data = irq_domain_get_irq_data(domain, i);
irqd_clr_activated(irq_data);
}
}
return 0;
cleanup:
- for_each_msi_entry(desc, dev) {
- struct irq_data *irqd;
-
- if (desc->irq == virq)
- break;
-
- irqd = irq_domain_get_irq_data(domain, desc->irq);
- if (irqd_is_activated(irqd))
- irq_domain_deactivate_irq(irqd);
+ for_each_msi_vector(desc, i, dev) {
+ irq_data = irq_domain_get_irq_data(domain, i);
+ if (irqd_is_activated(irq_data))
+ irq_domain_deactivate_irq(irq_data);
}
msi_domain_free_irqs(domain, dev);
return ret;
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 4c457e8cb75eda91906a4f89fc39bde3f9a43922 Mon Sep 17 00:00:00 2001
From: Marc Zyngier <maz(a)kernel.org>
Date: Sat, 23 Jan 2021 12:27:59 +0000
Subject: [PATCH] genirq/msi: Activate Multi-MSI early when
MSI_FLAG_ACTIVATE_EARLY is set
When MSI_FLAG_ACTIVATE_EARLY is set (which is the case for PCI),
__msi_domain_alloc_irqs() performs the activation of the interrupt (which
in the case of PCI results in the endpoint being programmed) as soon as the
interrupt is allocated.
But it appears that this is only done for the first vector, introducing an
inconsistent behaviour for PCI Multi-MSI.
Fix it by iterating over the number of vectors allocated to each MSI
descriptor. This is easily achieved by introducing a new
"for_each_msi_vector" iterator, together with a tiny bit of refactoring.
Fixes: f3b0946d629c ("genirq/msi: Make sure PCI MSIs are activated early")
Reported-by: Shameer Kolothum <shameerali.kolothum.thodi(a)huawei.com>
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Tested-by: Shameer Kolothum <shameerali.kolothum.thodi(a)huawei.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20210123122759.1781359-1-maz@kernel.org
diff --git a/include/linux/msi.h b/include/linux/msi.h
index 360a0a7e7341..aef35fd1cf11 100644
--- a/include/linux/msi.h
+++ b/include/linux/msi.h
@@ -178,6 +178,12 @@ struct msi_desc {
list_for_each_entry((desc), dev_to_msi_list((dev)), list)
#define for_each_msi_entry_safe(desc, tmp, dev) \
list_for_each_entry_safe((desc), (tmp), dev_to_msi_list((dev)), list)
+#define for_each_msi_vector(desc, __irq, dev) \
+ for_each_msi_entry((desc), (dev)) \
+ if ((desc)->irq) \
+ for (__irq = (desc)->irq; \
+ __irq < ((desc)->irq + (desc)->nvec_used); \
+ __irq++)
#ifdef CONFIG_IRQ_MSI_IOMMU
static inline const void *msi_desc_get_iommu_cookie(struct msi_desc *desc)
diff --git a/kernel/irq/msi.c b/kernel/irq/msi.c
index dc0e2d7fbdfd..b338d622f26e 100644
--- a/kernel/irq/msi.c
+++ b/kernel/irq/msi.c
@@ -436,22 +436,22 @@ int __msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
can_reserve = msi_check_reservation_mode(domain, info, dev);
- for_each_msi_entry(desc, dev) {
- virq = desc->irq;
- if (desc->nvec_used == 1)
- dev_dbg(dev, "irq %d for MSI\n", virq);
- else
+ /*
+ * This flag is set by the PCI layer as we need to activate
+ * the MSI entries before the PCI layer enables MSI in the
+ * card. Otherwise the card latches a random msi message.
+ */
+ if (!(info->flags & MSI_FLAG_ACTIVATE_EARLY))
+ goto skip_activate;
+
+ for_each_msi_vector(desc, i, dev) {
+ if (desc->irq == i) {
+ virq = desc->irq;
dev_dbg(dev, "irq [%d-%d] for MSI\n",
virq, virq + desc->nvec_used - 1);
- /*
- * This flag is set by the PCI layer as we need to activate
- * the MSI entries before the PCI layer enables MSI in the
- * card. Otherwise the card latches a random msi message.
- */
- if (!(info->flags & MSI_FLAG_ACTIVATE_EARLY))
- continue;
+ }
- irq_data = irq_domain_get_irq_data(domain, desc->irq);
+ irq_data = irq_domain_get_irq_data(domain, i);
if (!can_reserve) {
irqd_clr_can_reserve(irq_data);
if (domain->flags & IRQ_DOMAIN_MSI_NOMASK_QUIRK)
@@ -462,28 +462,24 @@ int __msi_domain_alloc_irqs(struct irq_domain *domain, struct device *dev,
goto cleanup;
}
+skip_activate:
/*
* If these interrupts use reservation mode, clear the activated bit
* so request_irq() will assign the final vector.
*/
if (can_reserve) {
- for_each_msi_entry(desc, dev) {
- irq_data = irq_domain_get_irq_data(domain, desc->irq);
+ for_each_msi_vector(desc, i, dev) {
+ irq_data = irq_domain_get_irq_data(domain, i);
irqd_clr_activated(irq_data);
}
}
return 0;
cleanup:
- for_each_msi_entry(desc, dev) {
- struct irq_data *irqd;
-
- if (desc->irq == virq)
- break;
-
- irqd = irq_domain_get_irq_data(domain, desc->irq);
- if (irqd_is_activated(irqd))
- irq_domain_deactivate_irq(irqd);
+ for_each_msi_vector(desc, i, dev) {
+ irq_data = irq_domain_get_irq_data(domain, i);
+ if (irqd_is_activated(irq_data))
+ irq_domain_deactivate_irq(irq_data);
}
msi_domain_free_irqs(domain, dev);
return ret;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 97c753e62e6c31a404183898d950d8c08d752dbd Mon Sep 17 00:00:00 2001
From: Masami Hiramatsu <mhiramat(a)kernel.org>
Date: Thu, 28 Jan 2021 00:37:51 +0900
Subject: [PATCH] tracing/kprobe: Fix to support kretprobe events on unloaded
modules
Fix kprobe_on_func_entry() returns error code instead of false so that
register_kretprobe() can return an appropriate error code.
append_trace_kprobe() expects the kprobe registration returns -ENOENT
when the target symbol is not found, and it checks whether the target
module is unloaded or not. If the target module doesn't exist, it
defers to probe the target symbol until the module is loaded.
However, since register_kretprobe() returns -EINVAL instead of -ENOENT
in that case, it always fail on putting the kretprobe event on unloaded
modules. e.g.
Kprobe event:
/sys/kernel/debug/tracing # echo p xfs:xfs_end_io >> kprobe_events
[ 16.515574] trace_kprobe: This probe might be able to register after target module is loaded. Continue.
Kretprobe event: (p -> r)
/sys/kernel/debug/tracing # echo r xfs:xfs_end_io >> kprobe_events
sh: write error: Invalid argument
/sys/kernel/debug/tracing # cat error_log
[ 41.122514] trace_kprobe: error: Failed to register probe event
Command: r xfs:xfs_end_io
^
To fix this bug, change kprobe_on_func_entry() to detect symbol lookup
failure and return -ENOENT in that case. Otherwise it returns -EINVAL
or 0 (succeeded, given address is on the entry).
Link: https://lkml.kernel.org/r/161176187132.1067016.8118042342894378981.stgit@de…
Cc: stable(a)vger.kernel.org
Fixes: 59158ec4aef7 ("tracing/kprobes: Check the probe on unloaded module correctly")
Reported-by: Jianlin Lv <Jianlin.Lv(a)arm.com>
Signed-off-by: Masami Hiramatsu <mhiramat(a)kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
diff --git a/include/linux/kprobes.h b/include/linux/kprobes.h
index b3a36b0cfc81..1883a4a9f16a 100644
--- a/include/linux/kprobes.h
+++ b/include/linux/kprobes.h
@@ -266,7 +266,7 @@ extern void kprobes_inc_nmissed_count(struct kprobe *p);
extern bool arch_within_kprobe_blacklist(unsigned long addr);
extern int arch_populate_kprobe_blacklist(void);
extern bool arch_kprobe_on_func_entry(unsigned long offset);
-extern bool kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset);
+extern int kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset);
extern bool within_kprobe_blacklist(unsigned long addr);
extern int kprobe_add_ksym_blacklist(unsigned long entry);
diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index f7fb5d135930..1a5bc321e0a5 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1954,29 +1954,45 @@ bool __weak arch_kprobe_on_func_entry(unsigned long offset)
return !offset;
}
-bool kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset)
+/**
+ * kprobe_on_func_entry() -- check whether given address is function entry
+ * @addr: Target address
+ * @sym: Target symbol name
+ * @offset: The offset from the symbol or the address
+ *
+ * This checks whether the given @addr+@offset or @sym+@offset is on the
+ * function entry address or not.
+ * This returns 0 if it is the function entry, or -EINVAL if it is not.
+ * And also it returns -ENOENT if it fails the symbol or address lookup.
+ * Caller must pass @addr or @sym (either one must be NULL), or this
+ * returns -EINVAL.
+ */
+int kprobe_on_func_entry(kprobe_opcode_t *addr, const char *sym, unsigned long offset)
{
kprobe_opcode_t *kp_addr = _kprobe_addr(addr, sym, offset);
if (IS_ERR(kp_addr))
- return false;
+ return PTR_ERR(kp_addr);
- if (!kallsyms_lookup_size_offset((unsigned long)kp_addr, NULL, &offset) ||
- !arch_kprobe_on_func_entry(offset))
- return false;
+ if (!kallsyms_lookup_size_offset((unsigned long)kp_addr, NULL, &offset))
+ return -ENOENT;
- return true;
+ if (!arch_kprobe_on_func_entry(offset))
+ return -EINVAL;
+
+ return 0;
}
int register_kretprobe(struct kretprobe *rp)
{
- int ret = 0;
+ int ret;
struct kretprobe_instance *inst;
int i;
void *addr;
- if (!kprobe_on_func_entry(rp->kp.addr, rp->kp.symbol_name, rp->kp.offset))
- return -EINVAL;
+ ret = kprobe_on_func_entry(rp->kp.addr, rp->kp.symbol_name, rp->kp.offset);
+ if (ret)
+ return ret;
if (kretprobe_blacklist_size) {
addr = kprobe_addr(&rp->kp);
diff --git a/kernel/trace/trace_kprobe.c b/kernel/trace/trace_kprobe.c
index e6fba1798771..56c7fbff7bd7 100644
--- a/kernel/trace/trace_kprobe.c
+++ b/kernel/trace/trace_kprobe.c
@@ -221,9 +221,9 @@ bool trace_kprobe_on_func_entry(struct trace_event_call *call)
{
struct trace_kprobe *tk = trace_kprobe_primary_from_call(call);
- return tk ? kprobe_on_func_entry(tk->rp.kp.addr,
+ return tk ? (kprobe_on_func_entry(tk->rp.kp.addr,
tk->rp.kp.addr ? NULL : tk->rp.kp.symbol_name,
- tk->rp.kp.addr ? 0 : tk->rp.kp.offset) : false;
+ tk->rp.kp.addr ? 0 : tk->rp.kp.offset) == 0) : false;
}
bool trace_kprobe_error_injectable(struct trace_event_call *call)
@@ -828,9 +828,11 @@ static int trace_kprobe_create(int argc, const char *argv[])
}
if (is_return)
flags |= TPARG_FL_RETURN;
- if (kprobe_on_func_entry(NULL, symbol, offset))
+ ret = kprobe_on_func_entry(NULL, symbol, offset);
+ if (ret == 0)
flags |= TPARG_FL_FENTRY;
- if (offset && is_return && !(flags & TPARG_FL_FENTRY)) {
+ /* Defer the ENOENT case until register kprobe */
+ if (ret == -EINVAL && is_return) {
trace_probe_log_err(0, BAD_RETPROBE);
goto parse_error;
}
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 7e0a9220467dbcfdc5bc62825724f3e52e50ab31 Mon Sep 17 00:00:00 2001
From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org>
Date: Fri, 29 Jan 2021 10:13:53 -0500
Subject: [PATCH] fgraph: Initialize tracing_graph_pause at task creation
On some archs, the idle task can call into cpu_suspend(). The cpu_suspend()
will disable or pause function graph tracing, as there's some paths in
bringing down the CPU that can have issues with its return address being
modified. The task_struct structure has a "tracing_graph_pause" atomic
counter, that when set to something other than zero, the function graph
tracer will not modify the return address.
The problem is that the tracing_graph_pause counter is initialized when the
function graph tracer is enabled. This can corrupt the counter for the idle
task if it is suspended in these architectures.
CPU 1 CPU 2
----- -----
do_idle()
cpu_suspend()
pause_graph_tracing()
task_struct->tracing_graph_pause++ (0 -> 1)
start_graph_tracing()
for_each_online_cpu(cpu) {
ftrace_graph_init_idle_task(cpu)
task-struct->tracing_graph_pause = 0 (1 -> 0)
unpause_graph_tracing()
task_struct->tracing_graph_pause-- (0 -> -1)
The above should have gone from 1 to zero, and enabled function graph
tracing again. But instead, it is set to -1, which keeps it disabled.
There's no reason that the field tracing_graph_pause on the task_struct can
not be initialized at boot up.
Cc: stable(a)vger.kernel.org
Fixes: 380c4b1411ccd ("tracing/function-graph-tracer: append the tracing_graph_flag")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=211339
Reported-by: pierre.gondois(a)arm.com
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
diff --git a/init/init_task.c b/init/init_task.c
index 8a992d73e6fb..3711cdaafed2 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -198,7 +198,8 @@ struct task_struct init_task
.lockdep_recursion = 0,
#endif
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
- .ret_stack = NULL,
+ .ret_stack = NULL,
+ .tracing_graph_pause = ATOMIC_INIT(0),
#endif
#if defined(CONFIG_TRACING) && defined(CONFIG_PREEMPTION)
.trace_recursion = 0,
diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
index 73edb9e4f354..29a6ebeebc9e 100644
--- a/kernel/trace/fgraph.c
+++ b/kernel/trace/fgraph.c
@@ -394,7 +394,6 @@ static int alloc_retstack_tasklist(struct ftrace_ret_stack **ret_stack_list)
}
if (t->ret_stack == NULL) {
- atomic_set(&t->tracing_graph_pause, 0);
atomic_set(&t->trace_overrun, 0);
t->curr_ret_stack = -1;
t->curr_ret_depth = -1;
@@ -489,7 +488,6 @@ static DEFINE_PER_CPU(struct ftrace_ret_stack *, idle_ret_stack);
static void
graph_init_task(struct task_struct *t, struct ftrace_ret_stack *ret_stack)
{
- atomic_set(&t->tracing_graph_pause, 0);
atomic_set(&t->trace_overrun, 0);
t->ftrace_timestamp = 0;
/* make curr_ret_stack visible before we add the ret_stack */
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 7e0a9220467dbcfdc5bc62825724f3e52e50ab31 Mon Sep 17 00:00:00 2001
From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org>
Date: Fri, 29 Jan 2021 10:13:53 -0500
Subject: [PATCH] fgraph: Initialize tracing_graph_pause at task creation
On some archs, the idle task can call into cpu_suspend(). The cpu_suspend()
will disable or pause function graph tracing, as there's some paths in
bringing down the CPU that can have issues with its return address being
modified. The task_struct structure has a "tracing_graph_pause" atomic
counter, that when set to something other than zero, the function graph
tracer will not modify the return address.
The problem is that the tracing_graph_pause counter is initialized when the
function graph tracer is enabled. This can corrupt the counter for the idle
task if it is suspended in these architectures.
CPU 1 CPU 2
----- -----
do_idle()
cpu_suspend()
pause_graph_tracing()
task_struct->tracing_graph_pause++ (0 -> 1)
start_graph_tracing()
for_each_online_cpu(cpu) {
ftrace_graph_init_idle_task(cpu)
task-struct->tracing_graph_pause = 0 (1 -> 0)
unpause_graph_tracing()
task_struct->tracing_graph_pause-- (0 -> -1)
The above should have gone from 1 to zero, and enabled function graph
tracing again. But instead, it is set to -1, which keeps it disabled.
There's no reason that the field tracing_graph_pause on the task_struct can
not be initialized at boot up.
Cc: stable(a)vger.kernel.org
Fixes: 380c4b1411ccd ("tracing/function-graph-tracer: append the tracing_graph_flag")
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=211339
Reported-by: pierre.gondois(a)arm.com
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
diff --git a/init/init_task.c b/init/init_task.c
index 8a992d73e6fb..3711cdaafed2 100644
--- a/init/init_task.c
+++ b/init/init_task.c
@@ -198,7 +198,8 @@ struct task_struct init_task
.lockdep_recursion = 0,
#endif
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
- .ret_stack = NULL,
+ .ret_stack = NULL,
+ .tracing_graph_pause = ATOMIC_INIT(0),
#endif
#if defined(CONFIG_TRACING) && defined(CONFIG_PREEMPTION)
.trace_recursion = 0,
diff --git a/kernel/trace/fgraph.c b/kernel/trace/fgraph.c
index 73edb9e4f354..29a6ebeebc9e 100644
--- a/kernel/trace/fgraph.c
+++ b/kernel/trace/fgraph.c
@@ -394,7 +394,6 @@ static int alloc_retstack_tasklist(struct ftrace_ret_stack **ret_stack_list)
}
if (t->ret_stack == NULL) {
- atomic_set(&t->tracing_graph_pause, 0);
atomic_set(&t->trace_overrun, 0);
t->curr_ret_stack = -1;
t->curr_ret_depth = -1;
@@ -489,7 +488,6 @@ static DEFINE_PER_CPU(struct ftrace_ret_stack *, idle_ret_stack);
static void
graph_init_task(struct task_struct *t, struct ftrace_ret_stack *ret_stack)
{
- atomic_set(&t->tracing_graph_pause, 0);
atomic_set(&t->trace_overrun, 0);
t->ftrace_timestamp = 0;
/* make curr_ret_stack visible before we add the ret_stack */
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 9917f0e3cdba7b9f1a23f70e3f70b1a106be54a8 Mon Sep 17 00:00:00 2001
From: Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
Date: Mon, 1 Feb 2021 21:47:20 +0900
Subject: [PATCH] usb: renesas_usbhs: Clear pipe running flag in
usbhs_pkt_pop()
Should clear the pipe running flag in usbhs_pkt_pop(). Otherwise,
we cannot use this pipe after dequeue was called while the pipe was
running.
Fixes: 8355b2b3082d ("usb: renesas_usbhs: fix the behavior of some usbhs_pkt_handle")
Reported-by: Tho Vu <tho.vu.wh(a)renesas.com>
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
Link: https://lore.kernel.org/r/1612183640-8898-1-git-send-email-yoshihiro.shimod…
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/renesas_usbhs/fifo.c b/drivers/usb/renesas_usbhs/fifo.c
index ac9a81ae8216..e6fa13701808 100644
--- a/drivers/usb/renesas_usbhs/fifo.c
+++ b/drivers/usb/renesas_usbhs/fifo.c
@@ -126,6 +126,7 @@ struct usbhs_pkt *usbhs_pkt_pop(struct usbhs_pipe *pipe, struct usbhs_pkt *pkt)
}
usbhs_pipe_clear_without_sequence(pipe, 0, 0);
+ usbhs_pipe_running(pipe, 0);
__usbhsf_pkt_del(pkt);
}
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 9917f0e3cdba7b9f1a23f70e3f70b1a106be54a8 Mon Sep 17 00:00:00 2001
From: Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
Date: Mon, 1 Feb 2021 21:47:20 +0900
Subject: [PATCH] usb: renesas_usbhs: Clear pipe running flag in
usbhs_pkt_pop()
Should clear the pipe running flag in usbhs_pkt_pop(). Otherwise,
we cannot use this pipe after dequeue was called while the pipe was
running.
Fixes: 8355b2b3082d ("usb: renesas_usbhs: fix the behavior of some usbhs_pkt_handle")
Reported-by: Tho Vu <tho.vu.wh(a)renesas.com>
Signed-off-by: Yoshihiro Shimoda <yoshihiro.shimoda.uh(a)renesas.com>
Link: https://lore.kernel.org/r/1612183640-8898-1-git-send-email-yoshihiro.shimod…
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/renesas_usbhs/fifo.c b/drivers/usb/renesas_usbhs/fifo.c
index ac9a81ae8216..e6fa13701808 100644
--- a/drivers/usb/renesas_usbhs/fifo.c
+++ b/drivers/usb/renesas_usbhs/fifo.c
@@ -126,6 +126,7 @@ struct usbhs_pkt *usbhs_pkt_pop(struct usbhs_pipe *pipe, struct usbhs_pkt *pkt)
}
usbhs_pipe_clear_without_sequence(pipe, 0, 0);
+ usbhs_pipe_running(pipe, 0);
__usbhsf_pkt_del(pkt);
}
Upstream commit 81b704d3e4674e09781d331df73d76675d5ad8cb
Applies to 4.9-stable tree
----------->8---------------
From: "Rafael J. Wysocki" <rafael.j.wysocki(a)intel.com>
Date: Thu, 14 Jan 2021 19:34:22 +0100
Subject: [PATCH] ACPI: thermal: Do not call acpi_thermal_check() directly
Calling acpi_thermal_check() from acpi_thermal_notify() directly
is problematic if _TMP triggers Notify () on the thermal zone for
which it has been evaluated (which happens on some systems), because
it causes a new acpi_thermal_notify() invocation to be queued up
every time and if that takes place too often, an indefinite number of
pending work items may accumulate in kacpi_notify_wq over time.
Besides, it is not really useful to queue up a new invocation of
acpi_thermal_check() if one of them is pending already.
For these reasons, rework acpi_thermal_notify() to queue up a thermal
check instead of calling acpi_thermal_check() directly and only allow
one thermal check to be pending at a time. Moreover, only allow one
acpi_thermal_check_fn() instance at a time to run
thermal_zone_device_update() for one thermal zone and make it return
early if it sees other instances running for the same thermal zone.
While at it, fold acpi_thermal_check() into acpi_thermal_check_fn(),
as it is only called from there after the other changes made here.
[This issue appears to have been exposed by commit 6d25be5782e4
("sched/core, workqueues: Distangle worker accounting from rq
lock"), but it is unclear why it was not visible earlier.]
BugLink: https://bugzilla.kernel.org/show_bug.cgi?id=208877
Reported-by: Stephen Berman <stephen.berman(a)gmx.net>
Diagnosed-by: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Tested-by: Stephen Berman <stephen.berman(a)gmx.net>
Cc: All applicable <stable(a)vger.kernel.org>
[bigeasy: Backported to v4.9.y, use atomic_t instead of refcount_t]
Signed-off-by: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
---
drivers/acpi/thermal.c | 55 +++++++++++++++++++++++++++++-------------
1 file changed, 38 insertions(+), 17 deletions(-)
diff --git a/drivers/acpi/thermal.c b/drivers/acpi/thermal.c
index 35e8fbca10ad5..c53c88b531639 100644
--- a/drivers/acpi/thermal.c
+++ b/drivers/acpi/thermal.c
@@ -188,6 +188,8 @@ struct acpi_thermal {
int tz_enabled;
int kelvin_offset;
struct work_struct thermal_check_work;
+ struct mutex thermal_check_lock;
+ atomic_t thermal_check_count;
};
/* --------------------------------------------------------------------------
@@ -513,17 +515,6 @@ static int acpi_thermal_get_trip_points(struct acpi_thermal *tz)
return 0;
}
-static void acpi_thermal_check(void *data)
-{
- struct acpi_thermal *tz = data;
-
- if (!tz->tz_enabled)
- return;
-
- thermal_zone_device_update(tz->thermal_zone,
- THERMAL_EVENT_UNSPECIFIED);
-}
-
/* sys I/F for generic thermal sysfs support */
static int thermal_get_temp(struct thermal_zone_device *thermal, int *temp)
@@ -557,6 +548,8 @@ static int thermal_get_mode(struct thermal_zone_device *thermal,
return 0;
}
+static void acpi_thermal_check_fn(struct work_struct *work);
+
static int thermal_set_mode(struct thermal_zone_device *thermal,
enum thermal_device_mode mode)
{
@@ -582,7 +575,7 @@ static int thermal_set_mode(struct thermal_zone_device *thermal,
ACPI_DEBUG_PRINT((ACPI_DB_INFO,
"%s kernel ACPI thermal control\n",
tz->tz_enabled ? "Enable" : "Disable"));
- acpi_thermal_check(tz);
+ acpi_thermal_check_fn(&tz->thermal_check_work);
}
return 0;
}
@@ -951,6 +944,12 @@ static void acpi_thermal_unregister_thermal_zone(struct acpi_thermal *tz)
Driver Interface
-------------------------------------------------------------------------- */
+static void acpi_queue_thermal_check(struct acpi_thermal *tz)
+{
+ if (!work_pending(&tz->thermal_check_work))
+ queue_work(acpi_thermal_pm_queue, &tz->thermal_check_work);
+}
+
static void acpi_thermal_notify(struct acpi_device *device, u32 event)
{
struct acpi_thermal *tz = acpi_driver_data(device);
@@ -961,17 +960,17 @@ static void acpi_thermal_notify(struct acpi_device *device, u32 event)
switch (event) {
case ACPI_THERMAL_NOTIFY_TEMPERATURE:
- acpi_thermal_check(tz);
+ acpi_queue_thermal_check(tz);
break;
case ACPI_THERMAL_NOTIFY_THRESHOLDS:
acpi_thermal_trips_update(tz, ACPI_TRIPS_REFRESH_THRESHOLDS);
- acpi_thermal_check(tz);
+ acpi_queue_thermal_check(tz);
acpi_bus_generate_netlink_event(device->pnp.device_class,
dev_name(&device->dev), event, 0);
break;
case ACPI_THERMAL_NOTIFY_DEVICES:
acpi_thermal_trips_update(tz, ACPI_TRIPS_REFRESH_DEVICES);
- acpi_thermal_check(tz);
+ acpi_queue_thermal_check(tz);
acpi_bus_generate_netlink_event(device->pnp.device_class,
dev_name(&device->dev), event, 0);
break;
@@ -1071,7 +1070,27 @@ static void acpi_thermal_check_fn(struct work_struct *work)
{
struct acpi_thermal *tz = container_of(work, struct acpi_thermal,
thermal_check_work);
- acpi_thermal_check(tz);
+
+ if (!tz->tz_enabled)
+ return;
+ /*
+ * In general, it is not sufficient to check the pending bit, because
+ * subsequent instances of this function may be queued after one of them
+ * has started running (e.g. if _TMP sleeps). Avoid bailing out if just
+ * one of them is running, though, because it may have done the actual
+ * check some time ago, so allow at least one of them to block on the
+ * mutex while another one is running the update.
+ */
+ if (!atomic_add_unless(&tz->thermal_check_count, -1, 1))
+ return;
+
+ mutex_lock(&tz->thermal_check_lock);
+
+ thermal_zone_device_update(tz->thermal_zone, THERMAL_EVENT_UNSPECIFIED);
+
+ atomic_inc(&tz->thermal_check_count);
+
+ mutex_unlock(&tz->thermal_check_lock);
}
static int acpi_thermal_add(struct acpi_device *device)
@@ -1103,6 +1122,8 @@ static int acpi_thermal_add(struct acpi_device *device)
if (result)
goto free_memory;
+ atomic_set(&tz->thermal_check_count, 3);
+ mutex_init(&tz->thermal_check_lock);
INIT_WORK(&tz->thermal_check_work, acpi_thermal_check_fn);
pr_info(PREFIX "%s [%s] (%ld C)\n", acpi_device_name(device),
@@ -1168,7 +1189,7 @@ static int acpi_thermal_resume(struct device *dev)
tz->state.active |= tz->trips.active[i].flags.enabled;
}
- queue_work(acpi_thermal_pm_queue, &tz->thermal_check_work);
+ acpi_queue_thermal_check(tz);
return AE_OK;
}
--
2.30.0
This is the start of the stable review cycle for the 5.4.96 release.
There are 32 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun, 07 Feb 2021 14:06:42 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.4.96-rc1…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.4.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.4.96-rc1
Peter Zijlstra <peterz(a)infradead.org>
workqueue: Restrict affinity change to rescuer
Peter Zijlstra <peterz(a)infradead.org>
kthread: Extract KTHREAD_IS_PER_CPU
Josh Poimboeuf <jpoimboe(a)redhat.com>
objtool: Don't fail on missing symbol table
Bing Guo <bing.guo(a)amd.com>
drm/amd/display: Change function decide_dp_link_settings to avoid infinite looping
Jake Wang <haonan.wang2(a)amd.com>
drm/amd/display: Update dram_clock_change_latency for DCN2.1
Michael Ellerman <mpe(a)ellerman.id.au>
selftests/powerpc: Only test lwm/stmw on big endian
Revanth Rajashekar <revanth.rajashekar(a)intel.com>
nvme: check the PRINFO bit before deciding the host buffer length
lianzhi chang <changlianzhi(a)uniontech.com>
udf: fix the problem that the disc content is not displayed
Kai-Chuan Hsieh <kaichuan.hsieh(a)canonical.com>
ALSA: hda: Add Cometlake-R PCI ID
Brian King <brking(a)linux.vnet.ibm.com>
scsi: ibmvfc: Set default timeout to avoid crash during migration
Felix Fietkau <nbd(a)nbd.name>
mac80211: fix fast-rx encryption check
Kai-Heng Feng <kai.heng.feng(a)canonical.com>
ASoC: SOF: Intel: hda: Resume codec to do jack detection
Dinghao Liu <dinghao.liu(a)zju.edu.cn>
scsi: fnic: Fix memleak in vnic_dev_init_devcmd2
Javed Hasan <jhasan(a)marvell.com>
scsi: libfc: Avoid invoking response handler twice if ep is already completed
Martin Wilck <mwilck(a)suse.com>
scsi: scsi_transport_srp: Don't block target in failfast state
Peter Zijlstra <peterz(a)infradead.org>
x86: __always_inline __{rd,wr}msr()
Arnold Gozum <arngozum(a)gmail.com>
platform/x86: intel-vbtn: Support for tablet mode on Dell Inspiron 7352
Hans de Goede <hdegoede(a)redhat.com>
platform/x86: touchscreen_dmi: Add swap-x-y quirk for Goodix touchscreen on Estar Beauty HD tablet
Tony Lindgren <tony(a)atomide.com>
phy: cpcap-usb: Fix warning for missing regulator_disable
Eric Dumazet <edumazet(a)google.com>
net_sched: gen_estimator: support large ewma log
ethanwu <ethanwu(a)synology.com>
btrfs: backref, use correct count to resolve normal data refs
ethanwu <ethanwu(a)synology.com>
btrfs: backref, only search backref entries from leaves of the same root
ethanwu <ethanwu(a)synology.com>
btrfs: backref, don't add refs from shared block when resolving normal backref
ethanwu <ethanwu(a)synology.com>
btrfs: backref, only collect file extent items matching backref offset
Enke Chen <enchen(a)paloaltonetworks.com>
tcp: make TCP_USER_TIMEOUT accurate for zero window probes
Catalin Marinas <catalin.marinas(a)arm.com>
arm64: Do not pass tagged addresses to __is_lm_address()
Vincenzo Frascino <vincenzo.frascino(a)arm.com>
arm64: Fix kernel address detection of __is_lm_address()
Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
ACPI: thermal: Do not call acpi_thermal_check() directly
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Revert "Revert "block: end bio with BLK_STS_AGAIN in case of non-mq devs and REQ_NOWAIT""
Lijun Pan <ljp(a)linux.ibm.com>
ibmvnic: Ensure that CRQ entry read are correctly ordered
Rasmus Villemoes <rasmus.villemoes(a)prevas.dk>
net: switchdev: don't set port_obj_info->handled true when -EOPNOTSUPP
Pan Bian <bianpan2016(a)163.com>
net: dsa: bcm_sf2: put device node before return
-------------
Diffstat:
Makefile | 4 +-
arch/arm64/include/asm/memory.h | 10 +-
arch/arm64/mm/physaddr.c | 2 +-
arch/x86/include/asm/msr.h | 4 +-
block/blk-core.c | 11 +-
drivers/acpi/thermal.c | 55 +++++---
drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c | 3 +
.../gpu/drm/amd/display/dc/dcn21/dcn21_resource.c | 2 +-
drivers/net/dsa/bcm_sf2.c | 8 +-
drivers/net/ethernet/ibm/ibmvnic.c | 6 +
drivers/nvme/host/core.c | 17 ++-
drivers/phy/motorola/phy-cpcap-usb.c | 19 ++-
drivers/platform/x86/intel-vbtn.c | 6 +
drivers/platform/x86/touchscreen_dmi.c | 18 +++
drivers/scsi/fnic/vnic_dev.c | 8 +-
drivers/scsi/ibmvscsi/ibmvfc.c | 4 +-
drivers/scsi/libfc/fc_exch.c | 16 ++-
drivers/scsi/scsi_transport_srp.c | 9 +-
fs/btrfs/backref.c | 157 +++++++++++++--------
fs/udf/super.c | 7 +-
include/linux/kthread.h | 3 +
include/net/tcp.h | 1 +
kernel/kthread.c | 27 +++-
kernel/smpboot.c | 1 +
kernel/workqueue.c | 9 +-
net/core/gen_estimator.c | 11 +-
net/ipv4/tcp_input.c | 1 +
net/ipv4/tcp_output.c | 2 +
net/ipv4/tcp_timer.c | 18 +++
net/mac80211/rx.c | 2 +
net/switchdev/switchdev.c | 23 +--
sound/pci/hda/hda_intel.c | 3 +
sound/soc/sof/intel/hda-codec.c | 3 +-
tools/objtool/elf.c | 7 +-
.../powerpc/alignment/alignment_handler.c | 5 +-
35 files changed, 348 insertions(+), 134 deletions(-)
I'm announcing the release of the 5.4.96 kernel.
All users of the 5.4 kernel series must upgrade.
The updated 5.4.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-5.4.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2
arch/arm64/include/asm/memory.h | 10
arch/arm64/mm/physaddr.c | 2
arch/x86/include/asm/msr.h | 4
block/blk-core.c | 11
drivers/acpi/thermal.c | 55 ++-
drivers/gpu/drm/amd/display/dc/core/dc_link_dp.c | 3
drivers/gpu/drm/amd/display/dc/dcn21/dcn21_resource.c | 2
drivers/net/dsa/bcm_sf2.c | 8
drivers/net/ethernet/ibm/ibmvnic.c | 6
drivers/nvme/host/core.c | 17 -
drivers/phy/motorola/phy-cpcap-usb.c | 19 -
drivers/platform/x86/intel-vbtn.c | 6
drivers/platform/x86/touchscreen_dmi.c | 18 +
drivers/scsi/fnic/vnic_dev.c | 8
drivers/scsi/ibmvscsi/ibmvfc.c | 4
drivers/scsi/libfc/fc_exch.c | 16 -
drivers/scsi/scsi_transport_srp.c | 9
fs/btrfs/backref.c | 157 ++++++----
fs/udf/super.c | 7
include/linux/kthread.h | 3
include/net/tcp.h | 1
kernel/kthread.c | 27 +
kernel/smpboot.c | 1
kernel/workqueue.c | 9
net/core/gen_estimator.c | 11
net/ipv4/tcp_input.c | 1
net/ipv4/tcp_output.c | 2
net/ipv4/tcp_timer.c | 18 +
net/mac80211/rx.c | 2
net/switchdev/switchdev.c | 23 -
sound/pci/hda/hda_intel.c | 3
sound/soc/sof/intel/hda-codec.c | 3
tools/objtool/elf.c | 7
tools/testing/selftests/powerpc/alignment/alignment_handler.c | 5
35 files changed, 347 insertions(+), 133 deletions(-)
Arnold Gozum (1):
platform/x86: intel-vbtn: Support for tablet mode on Dell Inspiron 7352
Bing Guo (1):
drm/amd/display: Change function decide_dp_link_settings to avoid infinite looping
Brian King (1):
scsi: ibmvfc: Set default timeout to avoid crash during migration
Catalin Marinas (1):
arm64: Do not pass tagged addresses to __is_lm_address()
Dinghao Liu (1):
scsi: fnic: Fix memleak in vnic_dev_init_devcmd2
Enke Chen (1):
tcp: make TCP_USER_TIMEOUT accurate for zero window probes
Eric Dumazet (1):
net_sched: gen_estimator: support large ewma log
Felix Fietkau (1):
mac80211: fix fast-rx encryption check
Greg Kroah-Hartman (2):
Revert "Revert "block: end bio with BLK_STS_AGAIN in case of non-mq devs and REQ_NOWAIT""
Linux 5.4.96
Hans de Goede (1):
platform/x86: touchscreen_dmi: Add swap-x-y quirk for Goodix touchscreen on Estar Beauty HD tablet
Jake Wang (1):
drm/amd/display: Update dram_clock_change_latency for DCN2.1
Javed Hasan (1):
scsi: libfc: Avoid invoking response handler twice if ep is already completed
Josh Poimboeuf (1):
objtool: Don't fail on missing symbol table
Kai-Chuan Hsieh (1):
ALSA: hda: Add Cometlake-R PCI ID
Kai-Heng Feng (1):
ASoC: SOF: Intel: hda: Resume codec to do jack detection
Lijun Pan (1):
ibmvnic: Ensure that CRQ entry read are correctly ordered
Martin Wilck (1):
scsi: scsi_transport_srp: Don't block target in failfast state
Michael Ellerman (1):
selftests/powerpc: Only test lwm/stmw on big endian
Pan Bian (1):
net: dsa: bcm_sf2: put device node before return
Peter Zijlstra (3):
x86: __always_inline __{rd,wr}msr()
kthread: Extract KTHREAD_IS_PER_CPU
workqueue: Restrict affinity change to rescuer
Rafael J. Wysocki (1):
ACPI: thermal: Do not call acpi_thermal_check() directly
Rasmus Villemoes (1):
net: switchdev: don't set port_obj_info->handled true when -EOPNOTSUPP
Revanth Rajashekar (1):
nvme: check the PRINFO bit before deciding the host buffer length
Tony Lindgren (1):
phy: cpcap-usb: Fix warning for missing regulator_disable
Vincenzo Frascino (1):
arm64: Fix kernel address detection of __is_lm_address()
ethanwu (4):
btrfs: backref, only collect file extent items matching backref offset
btrfs: backref, don't add refs from shared block when resolving normal backref
btrfs: backref, only search backref entries from leaves of the same root
btrfs: backref, use correct count to resolve normal data refs
lianzhi chang (1):
udf: fix the problem that the disc content is not displayed
I'm announcing the release of the 4.19.174 kernel.
All users of the 4.19 kernel series must upgrade.
The updated 4.19.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-4.19.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2
arch/x86/include/asm/msr.h | 4
drivers/acpi/thermal.c | 55 ++++++----
drivers/net/dsa/bcm_sf2.c | 8 +
drivers/net/ethernet/ibm/ibmvnic.c | 6 +
drivers/phy/motorola/phy-cpcap-usb.c | 19 ++-
drivers/platform/x86/intel-vbtn.c | 6 +
drivers/platform/x86/touchscreen_dmi.c | 18 +++
drivers/scsi/ibmvscsi/ibmvfc.c | 4
drivers/scsi/libfc/fc_exch.c | 16 ++
drivers/scsi/scsi_transport_srp.c | 9 +
include/linux/kthread.h | 3
kernel/kthread.c | 27 ++++
kernel/smpboot.c | 1
kernel/sysctl.c | 40 +++++++
kernel/workqueue.c | 9 -
net/core/gen_estimator.c | 11 +-
net/mac80211/rx.c | 2
tools/objtool/elf.c | 7 -
tools/testing/selftests/powerpc/alignment/alignment_handler.c | 5
20 files changed, 205 insertions(+), 47 deletions(-)
Arnold Gozum (1):
platform/x86: intel-vbtn: Support for tablet mode on Dell Inspiron 7352
Brian King (1):
scsi: ibmvfc: Set default timeout to avoid crash during migration
Christian Brauner (1):
sysctl: handle overflow in proc_get_long
Eric Dumazet (1):
net_sched: gen_estimator: support large ewma log
Felix Fietkau (1):
mac80211: fix fast-rx encryption check
Greg Kroah-Hartman (1):
Linux 4.19.174
Hans de Goede (1):
platform/x86: touchscreen_dmi: Add swap-x-y quirk for Goodix touchscreen on Estar Beauty HD tablet
Javed Hasan (1):
scsi: libfc: Avoid invoking response handler twice if ep is already completed
Josh Poimboeuf (1):
objtool: Don't fail on missing symbol table
Lijun Pan (1):
ibmvnic: Ensure that CRQ entry read are correctly ordered
Martin Wilck (1):
scsi: scsi_transport_srp: Don't block target in failfast state
Michael Ellerman (1):
selftests/powerpc: Only test lwm/stmw on big endian
Pan Bian (1):
net: dsa: bcm_sf2: put device node before return
Peter Zijlstra (3):
x86: __always_inline __{rd,wr}msr()
kthread: Extract KTHREAD_IS_PER_CPU
workqueue: Restrict affinity change to rescuer
Rafael J. Wysocki (1):
ACPI: thermal: Do not call acpi_thermal_check() directly
Tony Lindgren (1):
phy: cpcap-usb: Fix warning for missing regulator_disable
I'm announcing the release of the 4.14.220 kernel.
All users of the 4.14 kernel series must upgrade.
The updated 4.14.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-4.14.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 -
arch/x86/include/asm/msr.h | 4 +-
drivers/acpi/thermal.c | 55 ++++++++++++++++++++++++-----------
drivers/base/core.c | 19 ++++++++++--
drivers/net/dsa/bcm_sf2.c | 8 +++--
drivers/net/ethernet/ibm/ibmvnic.c | 6 +++
drivers/phy/motorola/phy-cpcap-usb.c | 19 ++++++++----
drivers/scsi/ibmvscsi/ibmvfc.c | 4 +-
drivers/scsi/libfc/fc_exch.c | 16 ++++++++--
drivers/scsi/scsi_transport_srp.c | 9 +++++
include/linux/kthread.h | 3 +
kernel/kthread.c | 27 ++++++++++++++++-
kernel/smpboot.c | 1
net/core/gen_estimator.c | 11 ++++---
net/mac80211/rx.c | 2 +
net/sched/sch_api.c | 3 +
tools/objtool/elf.c | 7 +++-
17 files changed, 154 insertions(+), 42 deletions(-)
Benjamin Gaignard (1):
base: core: Remove WARN_ON from link dependencies check
Brian King (1):
scsi: ibmvfc: Set default timeout to avoid crash during migration
Eric Dumazet (2):
net_sched: reject silly cell_log in qdisc_get_rtab()
net_sched: gen_estimator: support large ewma log
Felix Fietkau (1):
mac80211: fix fast-rx encryption check
Greg Kroah-Hartman (1):
Linux 4.14.220
Javed Hasan (1):
scsi: libfc: Avoid invoking response handler twice if ep is already completed
Josh Poimboeuf (1):
objtool: Don't fail on missing symbol table
Lijun Pan (1):
ibmvnic: Ensure that CRQ entry read are correctly ordered
Martin Wilck (1):
scsi: scsi_transport_srp: Don't block target in failfast state
Pan Bian (1):
net: dsa: bcm_sf2: put device node before return
Peter Zijlstra (2):
x86: __always_inline __{rd,wr}msr()
kthread: Extract KTHREAD_IS_PER_CPU
Rafael J. Wysocki (2):
ACPI: thermal: Do not call acpi_thermal_check() directly
driver core: Extend device_is_dependent()
Tony Lindgren (1):
phy: cpcap-usb: Fix warning for missing regulator_disable
This is the start of the stable review cycle for the 4.19.174 release.
There are 17 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Sun, 07 Feb 2021 14:06:42 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v4.x/stable-review/patch-4.19.174-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-4.19.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 4.19.174-rc1
Peter Zijlstra <peterz(a)infradead.org>
workqueue: Restrict affinity change to rescuer
Peter Zijlstra <peterz(a)infradead.org>
kthread: Extract KTHREAD_IS_PER_CPU
Josh Poimboeuf <jpoimboe(a)redhat.com>
objtool: Don't fail on missing symbol table
Michael Ellerman <mpe(a)ellerman.id.au>
selftests/powerpc: Only test lwm/stmw on big endian
Brian King <brking(a)linux.vnet.ibm.com>
scsi: ibmvfc: Set default timeout to avoid crash during migration
Felix Fietkau <nbd(a)nbd.name>
mac80211: fix fast-rx encryption check
Javed Hasan <jhasan(a)marvell.com>
scsi: libfc: Avoid invoking response handler twice if ep is already completed
Martin Wilck <mwilck(a)suse.com>
scsi: scsi_transport_srp: Don't block target in failfast state
Peter Zijlstra <peterz(a)infradead.org>
x86: __always_inline __{rd,wr}msr()
Arnold Gozum <arngozum(a)gmail.com>
platform/x86: intel-vbtn: Support for tablet mode on Dell Inspiron 7352
Hans de Goede <hdegoede(a)redhat.com>
platform/x86: touchscreen_dmi: Add swap-x-y quirk for Goodix touchscreen on Estar Beauty HD tablet
Tony Lindgren <tony(a)atomide.com>
phy: cpcap-usb: Fix warning for missing regulator_disable
Eric Dumazet <edumazet(a)google.com>
net_sched: gen_estimator: support large ewma log
Christian Brauner <christian(a)brauner.io>
sysctl: handle overflow in proc_get_long
Rafael J. Wysocki <rafael.j.wysocki(a)intel.com>
ACPI: thermal: Do not call acpi_thermal_check() directly
Lijun Pan <ljp(a)linux.ibm.com>
ibmvnic: Ensure that CRQ entry read are correctly ordered
Pan Bian <bianpan2016(a)163.com>
net: dsa: bcm_sf2: put device node before return
-------------
Diffstat:
Makefile | 4 +-
arch/x86/include/asm/msr.h | 4 +-
drivers/acpi/thermal.c | 55 +++++++++++++++-------
drivers/net/dsa/bcm_sf2.c | 8 +++-
drivers/net/ethernet/ibm/ibmvnic.c | 6 +++
drivers/phy/motorola/phy-cpcap-usb.c | 19 +++++---
drivers/platform/x86/intel-vbtn.c | 6 +++
drivers/platform/x86/touchscreen_dmi.c | 18 +++++++
drivers/scsi/ibmvscsi/ibmvfc.c | 4 +-
drivers/scsi/libfc/fc_exch.c | 16 ++++++-
drivers/scsi/scsi_transport_srp.c | 9 +++-
include/linux/kthread.h | 3 ++
kernel/kthread.c | 27 ++++++++++-
kernel/smpboot.c | 1 +
kernel/sysctl.c | 40 +++++++++++++++-
kernel/workqueue.c | 9 ++--
net/core/gen_estimator.c | 11 +++--
net/mac80211/rx.c | 2 +
tools/objtool/elf.c | 7 ++-
.../powerpc/alignment/alignment_handler.c | 5 +-
20 files changed, 206 insertions(+), 48 deletions(-)
We expect toolchains to produce these new debug info sections as part of
DWARF v5. Add explicit placements to prevent the linker warnings from
--orphan-section=warn.
Compilers may produce such sections with explicit -gdwarf-5, or based on
the implicit default version of DWARF when -g is used via DEBUG_INFO.
This implicit default changes over time, and has changed to DWARF v5
with GCC 11.
.debug_sup was mentioned in review, but without compilers producing it
today, let's wait to add it until it becomes necessary.
Cc: stable(a)vger.kernel.org
Link: https://bugzilla.redhat.com/show_bug.cgi?id=1922707
Reported-by: Chris Murphy <lists(a)colorremedies.com>
Suggested-by: Fangrui Song <maskray(a)google.com>
Reviewed-by: Nathan Chancellor <nathan(a)kernel.org>
Tested-by: Sedat Dilek <sedat.dilek(a)gmail.com>
Signed-off-by: Nick Desaulniers <ndesaulniers(a)google.com>
---
include/asm-generic/vmlinux.lds.h | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/include/asm-generic/vmlinux.lds.h b/include/asm-generic/vmlinux.lds.h
index 34b7e0d2346c..1e7cde4bd3f9 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -842,8 +842,13 @@
/* DWARF 4 */ \
.debug_types 0 : { *(.debug_types) } \
/* DWARF 5 */ \
+ .debug_addr 0 : { *(.debug_addr) } \
+ .debug_line_str 0 : { *(.debug_line_str) } \
+ .debug_loclists 0 : { *(.debug_loclists) } \
.debug_macro 0 : { *(.debug_macro) } \
- .debug_addr 0 : { *(.debug_addr) }
+ .debug_names 0 : { *(.debug_names) } \
+ .debug_rnglists 0 : { *(.debug_rnglists) } \
+ .debug_str_offsets 0 : { *(.debug_str_offsets) }
/* Stabs debugging sections. */
#define STABS_DEBUG \
--
2.30.0.365.g02bc693789-goog
Do you really want companies only?
If you detail how to install the new kernel
(and any other prerequisites) and if 'ordinary use'
is sufficient, I'm sure others would help.
regards
--
Dave Pawson
XSLT XSL-FO FAQ.
Docbook FAQ.
This is an automatic generated email to let you know that the following patch were queued:
Subject: media: i2c: max9286: fix access to unallocated memory
Author: Tomi Valkeinen <tomi.valkeinen(a)ideasonboard.com>
Date: Mon Jan 18 09:14:46 2021 +0100
The asd allocated with v4l2_async_notifier_add_fwnode_subdev() must be
of size max9286_asd, otherwise access to max9286_asd->source will go to
unallocated memory.
Fixes: 86d37bf31af6 ("media: i2c: max9286: Allocate v4l2_async_subdev dynamically")
Signed-off-by: Tomi Valkeinen <tomi.valkeinen(a)ideasonboard.com>
Cc: stable(a)vger.kernel.org # v5.10+
Reviewed-by: Laurent Pinchart <laurent.pinchart(a)ideasonboard.com>
Reviewed-by: Kieran Bingham <kieran.bingham+renesas(a)ideasonboard.com>
Tested-by: Kieran Bingham <kieran.bingham+renesas(a)ideasonboard.com>
Signed-off-by: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei(a)kernel.org>
drivers/media/i2c/max9286.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
---
diff --git a/drivers/media/i2c/max9286.c b/drivers/media/i2c/max9286.c
index c82c1493e099..b1e2476d3c9e 100644
--- a/drivers/media/i2c/max9286.c
+++ b/drivers/media/i2c/max9286.c
@@ -580,7 +580,7 @@ static int max9286_v4l2_notifier_register(struct max9286_priv *priv)
asd = v4l2_async_notifier_add_fwnode_subdev(&priv->notifier,
source->fwnode,
- sizeof(*asd));
+ sizeof(struct max9286_asd));
if (IS_ERR(asd)) {
dev_err(dev, "Failed to add subdev for source %u: %ld",
i, PTR_ERR(asd));
On Thu, Feb 04, 2021 at 05:59:42AM +0000, Jari Ruusu wrote:
> Greg,
> I hope that your linux kernel release scripts are
> implemented in a way that understands that PATCHLEVEL= and
> SUBLEVEL= numbers in top-level linux Makefile are encoded
> as 8-bit numbers for LINUX_VERSION_CODE and
> KERNEL_VERSION() macros, and must stay in range 0...255.
> These 8-bit limits are hardcoded in both kernel source and
> userspace ABI.
>
> After 4.9.255 and 4.4.255, your scripts should be
> incrementing a number in EXTRAVERSION= in top-level
> linux Makefile.
Should already be fixed in linux-next, right?
thanks,
greg k-h
The patch titled
Subject: tmpfs: don't use 64-bit inodes by defulat with 32-bit ino_t
has been removed from the -mm tree. Its filename was
tmpfs-dont-use-64-bit-inodes-by-defulat-with-32-bit-ino_t.patch
This patch was dropped because an alternative patch was merged
------------------------------------------------------
From: Seth Forshee <seth.forshee(a)canonical.com>
Subject: tmpfs: don't use 64-bit inodes by defulat with 32-bit ino_t
Currently there seems to be an assumption in tmpfs that 64-bit
architectures also have a 64-bit ino_t. This is not true; s390 at least
has a 32-bit ino_t. With CONFIG_TMPFS_INODE64=y tmpfs mounts will get
64-bit inode numbers and display "inode64" in the mount options, but
passing the "inode64" mount option will fail. This leads to the following
behavior:
# mkdir mnt
# mount -t tmpfs nodev mnt
# mount -o remount,rw mnt
mount: /home/ubuntu/mnt: mount point not mounted or bad option.
As mount sees "inode64" in the mount options and thus passes it in the
options for the remount.
Ideally CONFIG_TMPFS_INODE64 would depend on sizeof(ino_t) < 8, but I
don't think it's possible to test for this (potentially
CONFIG_ARCH_HAS_64BIT_INO_T or similar could be added, but I'm not sure
whether or not that is wanted). So fix this by simply refusing to honor
the CONFIG_TMPFS_INODE64 setting when sizeof(ino_t) < 8.
Link: https://lkml.kernel.org/r/20210205202159.505612-1-seth.forshee@canonical.com
Fixes: ea3271f7196c ("tmpfs: support 64-bit inums per-sb")
Signed-off-by: Seth Forshee <seth.forshee(a)canonical.com>
Cc: Chris Down <chris(a)chrisdown.name>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Amir Goldstein <amir73il(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/shmem.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/shmem.c~tmpfs-dont-use-64-bit-inodes-by-defulat-with-32-bit-ino_t
+++ a/mm/shmem.c
@@ -3733,7 +3733,7 @@ static int shmem_fill_super(struct super
ctx->blocks = shmem_default_max_blocks();
if (!(ctx->seen & SHMEM_SEEN_INODES))
ctx->inodes = shmem_default_max_inodes();
- if (!(ctx->seen & SHMEM_SEEN_INUMS))
+ if (!(ctx->seen & SHMEM_SEEN_INUMS) && sizeof(ino_t) >= 8)
ctx->full_inums = IS_ENABLED(CONFIG_TMPFS_INODE64);
} else {
sb->s_flags |= SB_NOUSER;
_
Patches currently in -mm which might be from seth.forshee(a)canonical.com are
tmpfs-disallow-config_tmpfs_inode64-on-s390.patch
The patch titled
Subject: tmpfs: disallow CONFIG_TMPFS_INODE64 on s390
has been added to the -mm tree. Its filename is
tmpfs-disallow-config_tmpfs_inode64-on-s390.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/tmpfs-disallow-config_tmpfs_inode…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/tmpfs-disallow-config_tmpfs_inode…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Seth Forshee <seth.forshee(a)canonical.com>
Subject: tmpfs: disallow CONFIG_TMPFS_INODE64 on s390
Currently there is an assumption in tmpfs that 64-bit architectures also
have a 64-bit ino_t. This is not true on s390 which has a 32-bit ino_t.
With CONFIG_TMPFS_INODE64=y tmpfs mounts will get 64-bit inode numbers and
display "inode64" in the mount options, but passing the "inode64" mount
option will fail. This leads to the following behavior:
# mkdir mnt
# mount -t tmpfs nodev mnt
# mount -o remount,rw mnt
mount: /home/ubuntu/mnt: mount point not mounted or bad option.
As mount sees "inode64" in the mount options and thus passes it in the
options for the remount.
So prevent CONFIG_TMPFS_INODE64 from being selected on s390.
Link: https://lkml.kernel.org/r/20210205230620.518245-1-seth.forshee@canonical.com
Fixes: ea3271f7196c ("tmpfs: support 64-bit inums per-sb")
Signed-off-by: Seth Forshee <seth.forshee(a)canonical.com>
Cc: Chris Down <chris(a)chrisdown.name>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Amir Goldstein <amir73il(a)gmail.com>
Cc: Heiko Carstens <hca(a)linux.ibm.com>
Cc: Vasily Gorbik <gor(a)linux.ibm.com>
Cc: Christian Borntraeger <borntraeger(a)de.ibm.com>
Cc: <stable(a)vger.kernel.org> [5.9+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
fs/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/fs/Kconfig~tmpfs-disallow-config_tmpfs_inode64-on-s390
+++ a/fs/Kconfig
@@ -203,7 +203,7 @@ config TMPFS_XATTR
config TMPFS_INODE64
bool "Use 64-bit ino_t by default in tmpfs"
- depends on TMPFS && 64BIT
+ depends on TMPFS && 64BIT && !S390
default n
help
tmpfs has historically used only inode numbers as wide as an unsigned
_
Patches currently in -mm which might be from seth.forshee(a)canonical.com are
tmpfs-disallow-config_tmpfs_inode64-on-s390.patch
tmpfs-dont-use-64-bit-inodes-by-defulat-with-32-bit-ino_t.patch
From: Aurelien Aptel <aaptel(a)suse.com>
Assuming
- //HOST/a is mounted on /mnt
- //HOST/b is mounted on /mnt/b
On a slow connection, running 'df' and killing it while it's
processing /mnt/b can make cifs_get_inode_info() returns -ERESTARTSYS.
This triggers the following chain of events:
=> the dentry revalidation fail
=> dentry is put and released
=> superblock associated with the dentry is put
=> /mnt/b is unmounted
This patch makes cifs_d_revalidate() return the error instead of 0
(invalid) when cifs_revalidate_dentry() fails, except for ENOENT (file
deleted) and ESTALE (file recreated).
Signed-off-by: Aurelien Aptel <aaptel(a)suse.com>
Suggested-by: Shyam Prasad N <nspmangalore(a)gmail.com>
CC: stable(a)vger.kernel.org
---
fs/cifs/dir.c | 22 ++++++++++++++++++++--
1 file changed, 20 insertions(+), 2 deletions(-)
diff --git a/fs/cifs/dir.c b/fs/cifs/dir.c
index 68900f1629bff..97ac363b5df16 100644
--- a/fs/cifs/dir.c
+++ b/fs/cifs/dir.c
@@ -737,6 +737,7 @@ static int
cifs_d_revalidate(struct dentry *direntry, unsigned int flags)
{
struct inode *inode;
+ int rc;
if (flags & LOOKUP_RCU)
return -ECHILD;
@@ -746,8 +747,25 @@ cifs_d_revalidate(struct dentry *direntry, unsigned int flags)
if ((flags & LOOKUP_REVAL) && !CIFS_CACHE_READ(CIFS_I(inode)))
CIFS_I(inode)->time = 0; /* force reval */
- if (cifs_revalidate_dentry(direntry))
- return 0;
+ rc = cifs_revalidate_dentry(direntry);
+ if (rc) {
+ cifs_dbg(FYI, "cifs_revalidate_dentry failed with rc=%d", rc);
+ switch (rc) {
+ case -ENOENT:
+ case -ESTALE:
+ /*
+ * Those errors mean the dentry is invalid
+ * (file was deleted or recreated)
+ */
+ return 0;
+ default:
+ /*
+ * Otherwise some unexpected error happened
+ * report it as-is to VFS layer
+ */
+ return rc;
+ }
+ }
else {
/*
* If the inode wasn't known to be a dfs entry when
--
2.29.2
The patch titled
Subject: tmpfs: don't use 64-bit inodes by defulat with 32-bit ino_t
has been added to the -mm tree. Its filename is
tmpfs-dont-use-64-bit-inodes-by-defulat-with-32-bit-ino_t.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/tmpfs-dont-use-64-bit-inodes-by-d…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/tmpfs-dont-use-64-bit-inodes-by-d…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Seth Forshee <seth.forshee(a)canonical.com>
Subject: tmpfs: don't use 64-bit inodes by defulat with 32-bit ino_t
Currently there seems to be an assumption in tmpfs that 64-bit
architectures also have a 64-bit ino_t. This is not true; s390 at least
has a 32-bit ino_t. With CONFIG_TMPFS_INODE64=y tmpfs mounts will get
64-bit inode numbers and display "inode64" in the mount options, but
passing the "inode64" mount option will fail. This leads to the following
behavior:
# mkdir mnt
# mount -t tmpfs nodev mnt
# mount -o remount,rw mnt
mount: /home/ubuntu/mnt: mount point not mounted or bad option.
As mount sees "inode64" in the mount options and thus passes it in the
options for the remount.
Ideally CONFIG_TMPFS_INODE64 would depend on sizeof(ino_t) < 8, but I
don't think it's possible to test for this (potentially
CONFIG_ARCH_HAS_64BIT_INO_T or similar could be added, but I'm not sure
whether or not that is wanted). So fix this by simply refusing to honor
the CONFIG_TMPFS_INODE64 setting when sizeof(ino_t) < 8.
Link: https://lkml.kernel.org/r/20210205202159.505612-1-seth.forshee@canonical.com
Fixes: ea3271f7196c ("tmpfs: support 64-bit inums per-sb")
Signed-off-by: Seth Forshee <seth.forshee(a)canonical.com>
Cc: Chris Down <chris(a)chrisdown.name>
Cc: Hugh Dickins <hughd(a)google.com>
Cc: Amir Goldstein <amir73il(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/shmem.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
--- a/mm/shmem.c~tmpfs-dont-use-64-bit-inodes-by-defulat-with-32-bit-ino_t
+++ a/mm/shmem.c
@@ -3733,7 +3733,7 @@ static int shmem_fill_super(struct super
ctx->blocks = shmem_default_max_blocks();
if (!(ctx->seen & SHMEM_SEEN_INODES))
ctx->inodes = shmem_default_max_inodes();
- if (!(ctx->seen & SHMEM_SEEN_INUMS))
+ if (!(ctx->seen & SHMEM_SEEN_INUMS) && sizeof(ino_t) >= 8)
ctx->full_inums = IS_ENABLED(CONFIG_TMPFS_INODE64);
} else {
sb->s_flags |= SB_NOUSER;
_
Patches currently in -mm which might be from seth.forshee(a)canonical.com are
tmpfs-dont-use-64-bit-inodes-by-defulat-with-32-bit-ino_t.patch
The following commit has been merged into the irq/urgent branch of tip:
Commit-ID: 4c7bcb51ae25f79e3733982e5d0cd8ce8640ddfc
Gitweb: https://git.kernel.org/tip/4c7bcb51ae25f79e3733982e5d0cd8ce8640ddfc
Author: Hans de Goede <hdegoede(a)redhat.com>
AuthorDate: Mon, 21 Dec 2020 19:56:47 +01:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Fri, 05 Feb 2021 20:48:28 +01:00
genirq: Prevent [devm_]irq_alloc_desc from returning irq 0
Since commit a85a6c86c25b ("driver core: platform: Clarify that IRQ 0
is invalid"), having a linux-irq with number 0 will trigger a WARN()
when calling platform_get_irq*() to retrieve that linux-irq.
Since [devm_]irq_alloc_desc allocs a single irq and since irq 0 is not used
on some systems, it can return 0, triggering that WARN(). This happens
e.g. on Intel Bay Trail and Cherry Trail devices using the LPE audio engine
for HDMI audio:
0 is an invalid IRQ number
WARNING: CPU: 3 PID: 472 at drivers/base/platform.c:238 platform_get_irq_optional+0x108/0x180
Modules linked in: snd_hdmi_lpe_audio(+) ...
Call Trace:
platform_get_irq+0x17/0x30
hdmi_lpe_audio_probe+0x4a/0x6c0 [snd_hdmi_lpe_audio]
---[ end trace ceece38854223a0b ]---
Change the 'from' parameter passed to __[devm_]irq_alloc_descs() by the
[devm_]irq_alloc_desc macros from 0 to 1, so that these macros will no
longer return 0.
Fixes: a85a6c86c25b ("driver core: platform: Clarify that IRQ 0 is invalid")
Signed-off-by: Hans de Goede <hdegoede(a)redhat.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20201221185647.226146-1-hdegoede@redhat.com
---
include/linux/irq.h | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/include/linux/irq.h b/include/linux/irq.h
index 4aeb1c4..2efde6a 100644
--- a/include/linux/irq.h
+++ b/include/linux/irq.h
@@ -928,7 +928,7 @@ int __devm_irq_alloc_descs(struct device *dev, int irq, unsigned int from,
__irq_alloc_descs(irq, from, cnt, node, THIS_MODULE, NULL)
#define irq_alloc_desc(node) \
- irq_alloc_descs(-1, 0, 1, node)
+ irq_alloc_descs(-1, 1, 1, node)
#define irq_alloc_desc_at(at, node) \
irq_alloc_descs(at, at, 1, node)
@@ -943,7 +943,7 @@ int __devm_irq_alloc_descs(struct device *dev, int irq, unsigned int from,
__devm_irq_alloc_descs(dev, irq, from, cnt, node, THIS_MODULE, NULL)
#define devm_irq_alloc_desc(dev, node) \
- devm_irq_alloc_descs(dev, -1, 0, 1, node)
+ devm_irq_alloc_descs(dev, -1, 1, 1, node)
#define devm_irq_alloc_desc_at(dev, at, node) \
devm_irq_alloc_descs(dev, at, at, 1, node)
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: c4bed4b96918ff1d062ee81fdae4d207da4fa9b0
Gitweb: https://git.kernel.org/tip/c4bed4b96918ff1d062ee81fdae4d207da4fa9b0
Author: Lai Jiangshan <laijs(a)linux.alibaba.com>
AuthorDate: Thu, 04 Feb 2021 23:27:06 +08:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Fri, 05 Feb 2021 20:13:11 +01:00
x86/debug: Prevent data breakpoints on __per_cpu_offset
When FSGSBASE is enabled, paranoid_entry() fetches the per-CPU GSBASE value
via __per_cpu_offset or pcpu_unit_offsets.
When a data breakpoint is set on __per_cpu_offset[cpu] (read-write
operation), the specific CPU will be stuck in an infinite #DB loop.
RCU will try to send an NMI to the specific CPU, but it is not working
either since NMI also relies on paranoid_entry(). Which means it's
undebuggable.
Fixes: eaad981291ee3("x86/entry/64: Introduce the FIND_PERCPU_BASE macro")
Signed-off-by: Lai Jiangshan <laijs(a)linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20210204152708.21308-1-jiangshanlai@gmail.com
---
arch/x86/kernel/hw_breakpoint.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)
diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index 6694c0f..012ed82 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -269,6 +269,20 @@ static inline bool within_cpu_entry(unsigned long addr, unsigned long end)
CPU_ENTRY_AREA_TOTAL_SIZE))
return true;
+ /*
+ * When FSGSBASE is enabled, paranoid_entry() fetches the per-CPU
+ * GSBASE value via __per_cpu_offset or pcpu_unit_offsets.
+ */
+#ifdef CONFIG_SMP
+ if (within_area(addr, end, (unsigned long)__per_cpu_offset,
+ sizeof(unsigned long) * nr_cpu_ids))
+ return true;
+#else
+ if (within_area(addr, end, (unsigned long)&pcpu_unit_offsets,
+ sizeof(pcpu_unit_offsets)))
+ return true;
+#endif
+
for_each_possible_cpu(cpu) {
/* The original rw GDT is being used after load_direct_gdt() */
if (within_area(addr, end, (unsigned long)get_cpu_gdt_rw(cpu),
The following commit has been merged into the x86/urgent branch of tip:
Commit-ID: 3943abf2dbfae9ea4d2da05c1db569a0603f76da
Gitweb: https://git.kernel.org/tip/3943abf2dbfae9ea4d2da05c1db569a0603f76da
Author: Lai Jiangshan <laijs(a)linux.alibaba.com>
AuthorDate: Thu, 04 Feb 2021 23:27:07 +08:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Fri, 05 Feb 2021 20:13:12 +01:00
x86/debug: Prevent data breakpoints on cpu_dr7
local_db_save() is called at the start of exc_debug_kernel(), reads DR7 and
disables breakpoints to prevent recursion.
When running in a guest (X86_FEATURE_HYPERVISOR), local_db_save() reads the
per-cpu variable cpu_dr7 to check whether a breakpoint is active or not
before it accesses DR7.
A data breakpoint on cpu_dr7 therefore results in infinite #DB recursion.
Disallow data breakpoints on cpu_dr7 to prevent that.
Fixes: 84b6a3491567a("x86/entry: Optimize local_db_save() for virt")
Signed-off-by: Lai Jiangshan <laijs(a)linux.alibaba.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/20210204152708.21308-2-jiangshanlai@gmail.com
---
arch/x86/kernel/hw_breakpoint.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/arch/x86/kernel/hw_breakpoint.c b/arch/x86/kernel/hw_breakpoint.c
index 012ed82..668a4a6 100644
--- a/arch/x86/kernel/hw_breakpoint.c
+++ b/arch/x86/kernel/hw_breakpoint.c
@@ -307,6 +307,14 @@ static inline bool within_cpu_entry(unsigned long addr, unsigned long end)
(unsigned long)&per_cpu(cpu_tlbstate, cpu),
sizeof(struct tlb_state)))
return true;
+
+ /*
+ * When in guest (X86_FEATURE_HYPERVISOR), local_db_save()
+ * will read per-cpu cpu_dr7 before clear dr7 register.
+ */
+ if (within_area(addr, end, (unsigned long)&per_cpu(cpu_dr7, cpu),
+ sizeof(cpu_dr7)))
+ return true;
}
return false;
Right now SUBLEVEL is overflowing, and some userspace may start treating
4.9.256 as 4.10. While out of tree modules have different ways of
extracting the version number (and we're generally ok with breaking
them), we do care about breaking userspace and it would appear that this
overflow might do just that.
Our rules around userspace ABI in the stable kernel are pretty simple:
we don't break it. Thus, while userspace may be checking major/minor, it
shouldn't be doing anything with sublevel.
This patch applies a big band-aid to the 4.9 and 4.4 kernels in the form
of clamping their sublevel to 255.
The clamp is done for the purpose of LINUX_VERSION_CODE only, and
extracting the version number from the Makefile or "make kernelversion"
will continue to work as intended.
We might need to do it later in newer trees, but maybe we'll have a
better solution by then, so I'm ignoring that problem for now.
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
Makefile | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/Makefile b/Makefile
index 69af44d3dcd14..93f7ce06fc8c9 100644
--- a/Makefile
+++ b/Makefile
@@ -1141,7 +1141,7 @@ endef
define filechk_version.h
(echo \#define LINUX_VERSION_CODE $(shell \
- expr $(VERSION) \* 65536 + 0$(PATCHLEVEL) \* 256 + 0$(SUBLEVEL)); \
+ expr $(VERSION) \* 65536 + 0$(PATCHLEVEL) \* 256 + 255); \
echo '#define KERNEL_VERSION(a,b,c) (((a) << 16) + ((b) << 8) + (c))';)
endef
--
2.27.0
This is a note to let you know that I've just added the patch titled
staging: rtl8188eu: Add Edimax EW-7811UN V2 to device table
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-next branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will also be merged in the next major kernel release
during the merge window.
If you have any questions about this process, please let me know.
>From 7a8d2f1908a59003e55ef8691d09efb7fbc51625 Mon Sep 17 00:00:00 2001
From: Martin Kaiser <martin(a)kaiser.cx>
Date: Thu, 4 Feb 2021 09:52:17 +0100
Subject: staging: rtl8188eu: Add Edimax EW-7811UN V2 to device table
The Edimax EW-7811UN V2 uses an RTL8188EU chipset and works with this
driver.
Signed-off-by: Martin Kaiser <martin(a)kaiser.cx>
Cc: stable <stable(a)vger.kernel.org>
Link: https://lore.kernel.org/r/20210204085217.9743-1-martin@kaiser.cx
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/staging/rtl8188eu/os_dep/usb_intf.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/staging/rtl8188eu/os_dep/usb_intf.c b/drivers/staging/rtl8188eu/os_dep/usb_intf.c
index 43ebd11b53fe..efad43d8e465 100644
--- a/drivers/staging/rtl8188eu/os_dep/usb_intf.c
+++ b/drivers/staging/rtl8188eu/os_dep/usb_intf.c
@@ -41,6 +41,7 @@ static const struct usb_device_id rtw_usb_id_tbl[] = {
{USB_DEVICE(0x2357, 0x0111)}, /* TP-Link TL-WN727N v5.21 */
{USB_DEVICE(0x2C4E, 0x0102)}, /* MERCUSYS MW150US v2 */
{USB_DEVICE(0x0df6, 0x0076)}, /* Sitecom N150 v2 */
+ {USB_DEVICE(0x7392, 0xb811)}, /* Edimax EW-7811UN V2 */
{USB_DEVICE(USB_VENDER_ID_REALTEK, 0xffef)}, /* Rosewill RNX-N150NUB */
{} /* Terminating entry */
};
--
2.30.0
From: Filipe Laíns <lains(a)riseup.net>
In e400071a805d6229223a98899e9da8c6233704a1 I added support for the
receiver that comes with the G602 device, but unfortunately I screwed up
during testing and it seems the keyboard events were actually not being
sent to userspace.
This resulted in keyboard events being broken in userspace, please
backport the fix.
The receiver uses the normal 0x01 Logitech keyboard report descriptor,
as expected, so it is just a matter of flagging it as supported.
Reported in
https://github.com/libratbag/libratbag/issues/1124
Fixes: e400071a805d6 ("HID: logitech-dj: add the G602 receiver")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Filipe Laíns <lains(a)riseup.net>
---
Changes in v2:
- added missing Fixes: anc Cc: tags
drivers/hid/hid-logitech-dj.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/hid/hid-logitech-dj.c b/drivers/hid/hid-logitech-dj.c
index 6596c81947a8..2703333edc34 100644
--- a/drivers/hid/hid-logitech-dj.c
+++ b/drivers/hid/hid-logitech-dj.c
@@ -981,6 +981,7 @@ static void logi_hidpp_recv_queue_notif(struct hid_device *hdev,
case 0x07:
device_type = "eQUAD step 4 Gaming";
logi_hidpp_dev_conn_notif_equad(hdev, hidpp_report, &workitem);
+ workitem.reports_supported |= STD_KEYBOARD;
break;
case 0x08:
device_type = "eQUAD step 4 for gamepads";
--
2.30.0