The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From b02414c8f045ab3b9afc816c3735bc98c5c3d262 Mon Sep 17 00:00:00 2001
From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org>
Date: Mon, 2 Nov 2020 15:31:27 -0500
Subject: [PATCH] ring-buffer: Fix recursion protection transitions between
interrupt context
The recursion protection of the ring buffer depends on preempt_count() to be
correct. But it is possible that the ring buffer gets called after an
interrupt comes in but before it updates the preempt_count(). This will
trigger a false positive in the recursion code.
Use the same trick from the ftrace function callback recursion code which
uses a "transition" bit that gets set, to allow for a single recursion for
to handle transitions between contexts.
Cc: stable(a)vger.kernel.org
Fixes: 567cd4da54ff4 ("ring-buffer: User context bit recursion checking")
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 7f45fd9d5a45..dc83b3fa9fe7 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -438,14 +438,16 @@ enum {
};
/*
* Used for which event context the event is in.
- * NMI = 0
- * IRQ = 1
- * SOFTIRQ = 2
- * NORMAL = 3
+ * TRANSITION = 0
+ * NMI = 1
+ * IRQ = 2
+ * SOFTIRQ = 3
+ * NORMAL = 4
*
* See trace_recursive_lock() comment below for more details.
*/
enum {
+ RB_CTX_TRANSITION,
RB_CTX_NMI,
RB_CTX_IRQ,
RB_CTX_SOFTIRQ,
@@ -3014,10 +3016,10 @@ rb_wakeups(struct trace_buffer *buffer, struct ring_buffer_per_cpu *cpu_buffer)
* a bit of overhead in something as critical as function tracing,
* we use a bitmask trick.
*
- * bit 0 = NMI context
- * bit 1 = IRQ context
- * bit 2 = SoftIRQ context
- * bit 3 = normal context.
+ * bit 1 = NMI context
+ * bit 2 = IRQ context
+ * bit 3 = SoftIRQ context
+ * bit 4 = normal context.
*
* This works because this is the order of contexts that can
* preempt other contexts. A SoftIRQ never preempts an IRQ
@@ -3040,6 +3042,30 @@ rb_wakeups(struct trace_buffer *buffer, struct ring_buffer_per_cpu *cpu_buffer)
* The least significant bit can be cleared this way, and it
* just so happens that it is the same bit corresponding to
* the current context.
+ *
+ * Now the TRANSITION bit breaks the above slightly. The TRANSITION bit
+ * is set when a recursion is detected at the current context, and if
+ * the TRANSITION bit is already set, it will fail the recursion.
+ * This is needed because there's a lag between the changing of
+ * interrupt context and updating the preempt count. In this case,
+ * a false positive will be found. To handle this, one extra recursion
+ * is allowed, and this is done by the TRANSITION bit. If the TRANSITION
+ * bit is already set, then it is considered a recursion and the function
+ * ends. Otherwise, the TRANSITION bit is set, and that bit is returned.
+ *
+ * On the trace_recursive_unlock(), the TRANSITION bit will be the first
+ * to be cleared. Even if it wasn't the context that set it. That is,
+ * if an interrupt comes in while NORMAL bit is set and the ring buffer
+ * is called before preempt_count() is updated, since the check will
+ * be on the NORMAL bit, the TRANSITION bit will then be set. If an
+ * NMI then comes in, it will set the NMI bit, but when the NMI code
+ * does the trace_recursive_unlock() it will clear the TRANSTION bit
+ * and leave the NMI bit set. But this is fine, because the interrupt
+ * code that set the TRANSITION bit will then clear the NMI bit when it
+ * calls trace_recursive_unlock(). If another NMI comes in, it will
+ * set the TRANSITION bit and continue.
+ *
+ * Note: The TRANSITION bit only handles a single transition between context.
*/
static __always_inline int
@@ -3055,8 +3081,16 @@ trace_recursive_lock(struct ring_buffer_per_cpu *cpu_buffer)
bit = pc & NMI_MASK ? RB_CTX_NMI :
pc & HARDIRQ_MASK ? RB_CTX_IRQ : RB_CTX_SOFTIRQ;
- if (unlikely(val & (1 << (bit + cpu_buffer->nest))))
- return 1;
+ if (unlikely(val & (1 << (bit + cpu_buffer->nest)))) {
+ /*
+ * It is possible that this was called by transitioning
+ * between interrupt context, and preempt_count() has not
+ * been updated yet. In this case, use the TRANSITION bit.
+ */
+ bit = RB_CTX_TRANSITION;
+ if (val & (1 << (bit + cpu_buffer->nest)))
+ return 1;
+ }
val |= (1 << (bit + cpu_buffer->nest));
cpu_buffer->current_context = val;
@@ -3071,8 +3105,8 @@ trace_recursive_unlock(struct ring_buffer_per_cpu *cpu_buffer)
cpu_buffer->current_context - (1 << cpu_buffer->nest);
}
-/* The recursive locking above uses 4 bits */
-#define NESTED_BITS 4
+/* The recursive locking above uses 5 bits */
+#define NESTED_BITS 5
/**
* ring_buffer_nest_start - Allow to trace while nested
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From b02414c8f045ab3b9afc816c3735bc98c5c3d262 Mon Sep 17 00:00:00 2001
From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org>
Date: Mon, 2 Nov 2020 15:31:27 -0500
Subject: [PATCH] ring-buffer: Fix recursion protection transitions between
interrupt context
The recursion protection of the ring buffer depends on preempt_count() to be
correct. But it is possible that the ring buffer gets called after an
interrupt comes in but before it updates the preempt_count(). This will
trigger a false positive in the recursion code.
Use the same trick from the ftrace function callback recursion code which
uses a "transition" bit that gets set, to allow for a single recursion for
to handle transitions between contexts.
Cc: stable(a)vger.kernel.org
Fixes: 567cd4da54ff4 ("ring-buffer: User context bit recursion checking")
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 7f45fd9d5a45..dc83b3fa9fe7 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -438,14 +438,16 @@ enum {
};
/*
* Used for which event context the event is in.
- * NMI = 0
- * IRQ = 1
- * SOFTIRQ = 2
- * NORMAL = 3
+ * TRANSITION = 0
+ * NMI = 1
+ * IRQ = 2
+ * SOFTIRQ = 3
+ * NORMAL = 4
*
* See trace_recursive_lock() comment below for more details.
*/
enum {
+ RB_CTX_TRANSITION,
RB_CTX_NMI,
RB_CTX_IRQ,
RB_CTX_SOFTIRQ,
@@ -3014,10 +3016,10 @@ rb_wakeups(struct trace_buffer *buffer, struct ring_buffer_per_cpu *cpu_buffer)
* a bit of overhead in something as critical as function tracing,
* we use a bitmask trick.
*
- * bit 0 = NMI context
- * bit 1 = IRQ context
- * bit 2 = SoftIRQ context
- * bit 3 = normal context.
+ * bit 1 = NMI context
+ * bit 2 = IRQ context
+ * bit 3 = SoftIRQ context
+ * bit 4 = normal context.
*
* This works because this is the order of contexts that can
* preempt other contexts. A SoftIRQ never preempts an IRQ
@@ -3040,6 +3042,30 @@ rb_wakeups(struct trace_buffer *buffer, struct ring_buffer_per_cpu *cpu_buffer)
* The least significant bit can be cleared this way, and it
* just so happens that it is the same bit corresponding to
* the current context.
+ *
+ * Now the TRANSITION bit breaks the above slightly. The TRANSITION bit
+ * is set when a recursion is detected at the current context, and if
+ * the TRANSITION bit is already set, it will fail the recursion.
+ * This is needed because there's a lag between the changing of
+ * interrupt context and updating the preempt count. In this case,
+ * a false positive will be found. To handle this, one extra recursion
+ * is allowed, and this is done by the TRANSITION bit. If the TRANSITION
+ * bit is already set, then it is considered a recursion and the function
+ * ends. Otherwise, the TRANSITION bit is set, and that bit is returned.
+ *
+ * On the trace_recursive_unlock(), the TRANSITION bit will be the first
+ * to be cleared. Even if it wasn't the context that set it. That is,
+ * if an interrupt comes in while NORMAL bit is set and the ring buffer
+ * is called before preempt_count() is updated, since the check will
+ * be on the NORMAL bit, the TRANSITION bit will then be set. If an
+ * NMI then comes in, it will set the NMI bit, but when the NMI code
+ * does the trace_recursive_unlock() it will clear the TRANSTION bit
+ * and leave the NMI bit set. But this is fine, because the interrupt
+ * code that set the TRANSITION bit will then clear the NMI bit when it
+ * calls trace_recursive_unlock(). If another NMI comes in, it will
+ * set the TRANSITION bit and continue.
+ *
+ * Note: The TRANSITION bit only handles a single transition between context.
*/
static __always_inline int
@@ -3055,8 +3081,16 @@ trace_recursive_lock(struct ring_buffer_per_cpu *cpu_buffer)
bit = pc & NMI_MASK ? RB_CTX_NMI :
pc & HARDIRQ_MASK ? RB_CTX_IRQ : RB_CTX_SOFTIRQ;
- if (unlikely(val & (1 << (bit + cpu_buffer->nest))))
- return 1;
+ if (unlikely(val & (1 << (bit + cpu_buffer->nest)))) {
+ /*
+ * It is possible that this was called by transitioning
+ * between interrupt context, and preempt_count() has not
+ * been updated yet. In this case, use the TRANSITION bit.
+ */
+ bit = RB_CTX_TRANSITION;
+ if (val & (1 << (bit + cpu_buffer->nest)))
+ return 1;
+ }
val |= (1 << (bit + cpu_buffer->nest));
cpu_buffer->current_context = val;
@@ -3071,8 +3105,8 @@ trace_recursive_unlock(struct ring_buffer_per_cpu *cpu_buffer)
cpu_buffer->current_context - (1 << cpu_buffer->nest);
}
-/* The recursive locking above uses 4 bits */
-#define NESTED_BITS 4
+/* The recursive locking above uses 5 bits */
+#define NESTED_BITS 5
/**
* ring_buffer_nest_start - Allow to trace while nested
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From da7d554f7c62d0c17c1ac3cc2586473c2d99f0bd Mon Sep 17 00:00:00 2001
From: Alexander Aring <aahringo(a)redhat.com>
Date: Mon, 26 Oct 2020 10:52:29 -0400
Subject: [PATCH] gfs2: Wake up when sd_glock_disposal becomes zero
Commit fc0e38dae645 ("GFS2: Fix glock deallocation race") fixed a
sd_glock_disposal accounting bug by adding a missing atomic_dec
statement, but it failed to wake up sd_glock_wait when that decrement
causes sd_glock_disposal to reach zero. As a consequence,
gfs2_gl_hash_clear can now run into a 10-minute timeout instead of
being woken up. Add the missing wakeup.
Fixes: fc0e38dae645 ("GFS2: Fix glock deallocation race")
Cc: stable(a)vger.kernel.org # v2.6.39+
Signed-off-by: Alexander Aring <aahringo(a)redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba(a)redhat.com>
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 5441c17562c5..d98a2e5dab9f 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -1078,7 +1078,8 @@ int gfs2_glock_get(struct gfs2_sbd *sdp, u64 number,
out_free:
kfree(gl->gl_lksb.sb_lvbptr);
kmem_cache_free(cachep, gl);
- atomic_dec(&sdp->sd_glock_disposal);
+ if (atomic_dec_and_test(&sdp->sd_glock_disposal))
+ wake_up(&sdp->sd_glock_wait);
out:
return ret;
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From da7d554f7c62d0c17c1ac3cc2586473c2d99f0bd Mon Sep 17 00:00:00 2001
From: Alexander Aring <aahringo(a)redhat.com>
Date: Mon, 26 Oct 2020 10:52:29 -0400
Subject: [PATCH] gfs2: Wake up when sd_glock_disposal becomes zero
Commit fc0e38dae645 ("GFS2: Fix glock deallocation race") fixed a
sd_glock_disposal accounting bug by adding a missing atomic_dec
statement, but it failed to wake up sd_glock_wait when that decrement
causes sd_glock_disposal to reach zero. As a consequence,
gfs2_gl_hash_clear can now run into a 10-minute timeout instead of
being woken up. Add the missing wakeup.
Fixes: fc0e38dae645 ("GFS2: Fix glock deallocation race")
Cc: stable(a)vger.kernel.org # v2.6.39+
Signed-off-by: Alexander Aring <aahringo(a)redhat.com>
Signed-off-by: Andreas Gruenbacher <agruenba(a)redhat.com>
diff --git a/fs/gfs2/glock.c b/fs/gfs2/glock.c
index 5441c17562c5..d98a2e5dab9f 100644
--- a/fs/gfs2/glock.c
+++ b/fs/gfs2/glock.c
@@ -1078,7 +1078,8 @@ int gfs2_glock_get(struct gfs2_sbd *sdp, u64 number,
out_free:
kfree(gl->gl_lksb.sb_lvbptr);
kmem_cache_free(cachep, gl);
- atomic_dec(&sdp->sd_glock_disposal);
+ if (atomic_dec_and_test(&sdp->sd_glock_disposal))
+ wake_up(&sdp->sd_glock_wait);
out:
return ret;
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 3f08842098e842c51e3b97d0dcdebf810b32558e Mon Sep 17 00:00:00 2001
From: Shijie Luo <luoshijie1(a)huawei.com>
Date: Sun, 1 Nov 2020 17:07:40 -0800
Subject: [PATCH] mm: mempolicy: fix potential pte_unmap_unlock pte error
When flags in queue_pages_pte_range don't have MPOL_MF_MOVE or
MPOL_MF_MOVE_ALL bits, code breaks and passing origin pte - 1 to
pte_unmap_unlock seems like not a good idea.
queue_pages_pte_range can run in MPOL_MF_MOVE_ALL mode which doesn't
migrate misplaced pages but returns with EIO when encountering such a
page. Since commit a7f40cfe3b7a ("mm: mempolicy: make mbind() return
-EIO when MPOL_MF_STRICT is specified") and early break on the first pte
in the range results in pte_unmap_unlock on an underflow pte. This can
lead to lockups later on when somebody tries to lock the pte resp.
page_table_lock again..
Fixes: a7f40cfe3b7a ("mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified")
Signed-off-by: Shijie Luo <luoshijie1(a)huawei.com>
Signed-off-by: Miaohe Lin <linmiaohe(a)huawei.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Feilong Lin <linfeilong(a)huawei.com>
Cc: Shijie Luo <luoshijie1(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Link: https://lkml.kernel.org/r/20201019074853.50856-1-luoshijie1@huawei.com
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 3fde772ef5ef..3ca4898f3f24 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -525,7 +525,7 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
unsigned long flags = qp->flags;
int ret;
bool has_unmovable = false;
- pte_t *pte;
+ pte_t *pte, *mapped_pte;
spinlock_t *ptl;
ptl = pmd_trans_huge_lock(pmd, vma);
@@ -539,7 +539,7 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
if (pmd_trans_unstable(pmd))
return 0;
- pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
+ mapped_pte = pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
for (; addr != end; pte++, addr += PAGE_SIZE) {
if (!pte_present(*pte))
continue;
@@ -571,7 +571,7 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
} else
break;
}
- pte_unmap_unlock(pte - 1, ptl);
+ pte_unmap_unlock(mapped_pte, ptl);
cond_resched();
if (has_unmovable)
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 3f08842098e842c51e3b97d0dcdebf810b32558e Mon Sep 17 00:00:00 2001
From: Shijie Luo <luoshijie1(a)huawei.com>
Date: Sun, 1 Nov 2020 17:07:40 -0800
Subject: [PATCH] mm: mempolicy: fix potential pte_unmap_unlock pte error
When flags in queue_pages_pte_range don't have MPOL_MF_MOVE or
MPOL_MF_MOVE_ALL bits, code breaks and passing origin pte - 1 to
pte_unmap_unlock seems like not a good idea.
queue_pages_pte_range can run in MPOL_MF_MOVE_ALL mode which doesn't
migrate misplaced pages but returns with EIO when encountering such a
page. Since commit a7f40cfe3b7a ("mm: mempolicy: make mbind() return
-EIO when MPOL_MF_STRICT is specified") and early break on the first pte
in the range results in pte_unmap_unlock on an underflow pte. This can
lead to lockups later on when somebody tries to lock the pte resp.
page_table_lock again..
Fixes: a7f40cfe3b7a ("mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified")
Signed-off-by: Shijie Luo <luoshijie1(a)huawei.com>
Signed-off-by: Miaohe Lin <linmiaohe(a)huawei.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Feilong Lin <linfeilong(a)huawei.com>
Cc: Shijie Luo <luoshijie1(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Link: https://lkml.kernel.org/r/20201019074853.50856-1-luoshijie1@huawei.com
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 3fde772ef5ef..3ca4898f3f24 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -525,7 +525,7 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
unsigned long flags = qp->flags;
int ret;
bool has_unmovable = false;
- pte_t *pte;
+ pte_t *pte, *mapped_pte;
spinlock_t *ptl;
ptl = pmd_trans_huge_lock(pmd, vma);
@@ -539,7 +539,7 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
if (pmd_trans_unstable(pmd))
return 0;
- pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
+ mapped_pte = pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
for (; addr != end; pte++, addr += PAGE_SIZE) {
if (!pte_present(*pte))
continue;
@@ -571,7 +571,7 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
} else
break;
}
- pte_unmap_unlock(pte - 1, ptl);
+ pte_unmap_unlock(mapped_pte, ptl);
cond_resched();
if (has_unmovable)
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 3f08842098e842c51e3b97d0dcdebf810b32558e Mon Sep 17 00:00:00 2001
From: Shijie Luo <luoshijie1(a)huawei.com>
Date: Sun, 1 Nov 2020 17:07:40 -0800
Subject: [PATCH] mm: mempolicy: fix potential pte_unmap_unlock pte error
When flags in queue_pages_pte_range don't have MPOL_MF_MOVE or
MPOL_MF_MOVE_ALL bits, code breaks and passing origin pte - 1 to
pte_unmap_unlock seems like not a good idea.
queue_pages_pte_range can run in MPOL_MF_MOVE_ALL mode which doesn't
migrate misplaced pages but returns with EIO when encountering such a
page. Since commit a7f40cfe3b7a ("mm: mempolicy: make mbind() return
-EIO when MPOL_MF_STRICT is specified") and early break on the first pte
in the range results in pte_unmap_unlock on an underflow pte. This can
lead to lockups later on when somebody tries to lock the pte resp.
page_table_lock again..
Fixes: a7f40cfe3b7a ("mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified")
Signed-off-by: Shijie Luo <luoshijie1(a)huawei.com>
Signed-off-by: Miaohe Lin <linmiaohe(a)huawei.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Feilong Lin <linfeilong(a)huawei.com>
Cc: Shijie Luo <luoshijie1(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Link: https://lkml.kernel.org/r/20201019074853.50856-1-luoshijie1@huawei.com
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 3fde772ef5ef..3ca4898f3f24 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -525,7 +525,7 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
unsigned long flags = qp->flags;
int ret;
bool has_unmovable = false;
- pte_t *pte;
+ pte_t *pte, *mapped_pte;
spinlock_t *ptl;
ptl = pmd_trans_huge_lock(pmd, vma);
@@ -539,7 +539,7 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
if (pmd_trans_unstable(pmd))
return 0;
- pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
+ mapped_pte = pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
for (; addr != end; pte++, addr += PAGE_SIZE) {
if (!pte_present(*pte))
continue;
@@ -571,7 +571,7 @@ static int queue_pages_pte_range(pmd_t *pmd, unsigned long addr,
} else
break;
}
- pte_unmap_unlock(pte - 1, ptl);
+ pte_unmap_unlock(mapped_pte, ptl);
cond_resched();
if (has_unmovable)
The patch below does not apply to the 5.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 8de15e920dc85d1705ab9c202c95d56845bc2d48 Mon Sep 17 00:00:00 2001
From: Roman Gushchin <guro(a)fb.com>
Date: Sun, 1 Nov 2020 17:07:34 -0800
Subject: [PATCH] mm: memcg: link page counters to root if use_hierarchy is
false
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Richard reported a warning which can be reproduced by running the LTP
madvise6 test (cgroup v1 in the non-hierarchical mode should be used):
WARNING: CPU: 0 PID: 12 at mm/page_counter.c:57 page_counter_uncharge (mm/page_counter.c:57 mm/page_counter.c:50 mm/page_counter.c:156)
Modules linked in:
CPU: 0 PID: 12 Comm: kworker/0:1 Not tainted 5.9.0-rc7-22-default #77
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.13.0-48-gd9c812d-rebuilt.opensuse.org 04/01/2014
Workqueue: events drain_local_stock
RIP: 0010:page_counter_uncharge (mm/page_counter.c:57 mm/page_counter.c:50 mm/page_counter.c:156)
Call Trace:
__memcg_kmem_uncharge (mm/memcontrol.c:3022)
drain_obj_stock (./include/linux/rcupdate.h:689 mm/memcontrol.c:3114)
drain_local_stock (mm/memcontrol.c:2255)
process_one_work (./arch/x86/include/asm/jump_label.h:25 ./include/linux/jump_label.h:200 ./include/trace/events/workqueue.h:108 kernel/workqueue.c:2274)
worker_thread (./include/linux/list.h:282 kernel/workqueue.c:2416)
kthread (kernel/kthread.c:292)
ret_from_fork (arch/x86/entry/entry_64.S:300)
The problem occurs because in the non-hierarchical mode non-root page
counters are not linked to root page counters, so the charge is not
propagated to the root memory cgroup.
After the removal of the original memory cgroup and reparenting of the
object cgroup, the root cgroup might be uncharged by draining a objcg
stock, for example. It leads to an eventual underflow of the charge and
triggers a warning.
Fix it by linking all page counters to corresponding root page counters
in the non-hierarchical mode.
Please note, that in the non-hierarchical mode all objcgs are always
reparented to the root memory cgroup, even if the hierarchy has more
than 1 level. This patch doesn't change it.
The patch also doesn't affect how the hierarchical mode is working,
which is the only sane and truly supported mode now.
Thanks to Richard for reporting, debugging and providing an alternative
version of the fix!
Fixes: bf4f059954dc ("mm: memcg/slab: obj_cgroup API")
Reported-by: <ltp(a)lists.linux.it>
Signed-off-by: Roman Gushchin <guro(a)fb.com>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
Reviewed-by: Shakeel Butt <shakeelb(a)google.com>
Reviewed-by: Michal Koutný <mkoutny(a)suse.com>
Acked-by: Johannes Weiner <hannes(a)cmpxchg.org>
Cc: Michal Hocko <mhocko(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Link: https://lkml.kernel.org/r/20201026231326.3212225-1-guro@fb.com
Debugged-by: Richard Palethorpe <rpalethorpe(a)suse.com>
Signed-off-by: Linus Torvalds <torvalds(a)linux-foundation.org>
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index c3b6dc7d5c94..3dcbf24d2227 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5345,17 +5345,22 @@ mem_cgroup_css_alloc(struct cgroup_subsys_state *parent_css)
memcg->swappiness = mem_cgroup_swappiness(parent);
memcg->oom_kill_disable = parent->oom_kill_disable;
}
- if (parent && parent->use_hierarchy) {
+ if (!parent) {
+ page_counter_init(&memcg->memory, NULL);
+ page_counter_init(&memcg->swap, NULL);
+ page_counter_init(&memcg->kmem, NULL);
+ page_counter_init(&memcg->tcpmem, NULL);
+ } else if (parent->use_hierarchy) {
memcg->use_hierarchy = true;
page_counter_init(&memcg->memory, &parent->memory);
page_counter_init(&memcg->swap, &parent->swap);
page_counter_init(&memcg->kmem, &parent->kmem);
page_counter_init(&memcg->tcpmem, &parent->tcpmem);
} else {
- page_counter_init(&memcg->memory, NULL);
- page_counter_init(&memcg->swap, NULL);
- page_counter_init(&memcg->kmem, NULL);
- page_counter_init(&memcg->tcpmem, NULL);
+ page_counter_init(&memcg->memory, &root_mem_cgroup->memory);
+ page_counter_init(&memcg->swap, &root_mem_cgroup->swap);
+ page_counter_init(&memcg->kmem, &root_mem_cgroup->kmem);
+ page_counter_init(&memcg->tcpmem, &root_mem_cgroup->tcpmem);
/*
* Deeper hierachy with use_hierarchy == false doesn't make
* much sense so let cgroup subsystem know about this
Here are backports of some fixes to the 4.14 stable branch.
I tested the blktrace fix with the script referenced in the commit
message.
I wasn't able to test the i40e changes (no hardware and no reproducer
available).
Ben.
--
Ben Hutchings, Software Developer Codethink Ltd
https://www.codethink.co.uk/ Dale House, 35 Dale Street
Manchester, M1 2HF, United Kingdom
Here are backports of some fixes to the 4.19 stable branch.
I tested the blktrace fix with the script referenced in the commit
message.
I tested the btrfs changes with the reproducers for CVE-2019-19039,
CVE-2019-19377, and CVE-2019-19816, and checked for regressions with
xfstests.
Ben.
--
Ben Hutchings, Software Developer Codethink Ltd
https://www.codethink.co.uk/ Dale House, 35 Dale Street
Manchester, M1 2HF, United Kingdom
This is a note to let you know that I've just added the patch titled
USB: serial: cyberjack: fix write-URB completion race
to my usb git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/usb.git
in the usb-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 985616f0457d9f555fff417d0da56174f70cc14f Mon Sep 17 00:00:00 2001
From: Johan Hovold <johan(a)kernel.org>
Date: Mon, 26 Oct 2020 09:25:48 +0100
Subject: USB: serial: cyberjack: fix write-URB completion race
The write-URB busy flag was being cleared before the completion handler
was done with the URB, something which could lead to corrupt transfers
due to a racing write request if the URB is resubmitted.
Fixes: 507ca9bc0476 ("[PATCH] USB: add ability for usb-serial drivers to determine if their write urb is currently being used.")
Cc: stable <stable(a)vger.kernel.org> # 2.6.13
Reviewed-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/usb/serial/cyberjack.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/serial/cyberjack.c b/drivers/usb/serial/cyberjack.c
index 821970609695..2e40908963da 100644
--- a/drivers/usb/serial/cyberjack.c
+++ b/drivers/usb/serial/cyberjack.c
@@ -357,11 +357,12 @@ static void cyberjack_write_bulk_callback(struct urb *urb)
struct device *dev = &port->dev;
int status = urb->status;
unsigned long flags;
+ bool resubmitted = false;
- set_bit(0, &port->write_urbs_free);
if (status) {
dev_dbg(dev, "%s - nonzero write bulk status received: %d\n",
__func__, status);
+ set_bit(0, &port->write_urbs_free);
return;
}
@@ -394,6 +395,8 @@ static void cyberjack_write_bulk_callback(struct urb *urb)
goto exit;
}
+ resubmitted = true;
+
dev_dbg(dev, "%s - priv->wrsent=%d\n", __func__, priv->wrsent);
dev_dbg(dev, "%s - priv->wrfilled=%d\n", __func__, priv->wrfilled);
@@ -410,6 +413,8 @@ static void cyberjack_write_bulk_callback(struct urb *urb)
exit:
spin_unlock_irqrestore(&priv->lock, flags);
+ if (!resubmitted)
+ set_bit(0, &port->write_urbs_free);
usb_serial_port_softint(port);
}
--
2.29.2
We can get a crash when disconnecting the iSCSI session,
the call trace like this:
[ffff00002a00fb70] kfree at ffff00000830e224
[ffff00002a00fba0] ses_intf_remove at ffff000001f200e4
[ffff00002a00fbd0] device_del at ffff0000086b6a98
[ffff00002a00fc50] device_unregister at ffff0000086b6d58
[ffff00002a00fc70] __scsi_remove_device at ffff00000870608c
[ffff00002a00fca0] scsi_remove_device at ffff000008706134
[ffff00002a00fcc0] __scsi_remove_target at ffff0000087062e4
[ffff00002a00fd10] scsi_remove_target at ffff0000087064c0
[ffff00002a00fd70] __iscsi_unbind_session at ffff000001c872c4
[ffff00002a00fdb0] process_one_work at ffff00000810f35c
[ffff00002a00fe00] worker_thread at ffff00000810f648
[ffff00002a00fe70] kthread at ffff000008116e98
In ses_intf_add, components count can be 0, and kcalloc 0 size scomp,
but not saved at edev->component[i].scratch
In this situation, edev->component[0].scratch is an invalid pointer,
when kfree it in ses_intf_remove_enclosure, a crash like above would happen
The call trace also could be other random cases when kfree cannot detect
the invalid pointer
We should not use edev->component[] array when we get components count is 0
We also need check index when use edev->component[] array in
ses_enclosure_data_process
Tested-by: Zeng Zhicong <timmyzeng(a)163.com>
Cc: stable <stable(a)vger.kernel.org> # 2.6.25+
Signed-off-by: Ding Hui <dinghui(a)sangfor.com.cn>
---
drivers/scsi/ses.c | 18 ++++++++++--------
1 file changed, 10 insertions(+), 8 deletions(-)
diff --git a/drivers/scsi/ses.c b/drivers/scsi/ses.c
index c2afba2a5414..f5ef0a91f0eb 100644
--- a/drivers/scsi/ses.c
+++ b/drivers/scsi/ses.c
@@ -477,9 +477,6 @@ static int ses_enclosure_find_by_addr(struct enclosure_device *edev,
int i;
struct ses_component *scomp;
- if (!edev->component[0].scratch)
- return 0;
-
for (i = 0; i < edev->components; i++) {
scomp = edev->component[i].scratch;
if (scomp->addr != efd->addr)
@@ -565,8 +562,10 @@ static void ses_enclosure_data_process(struct enclosure_device *edev,
components++,
type_ptr[0],
name);
- else
+ else if (components < edev->components)
ecomp = &edev->component[components++];
+ else
+ ecomp = ERR_PTR(-EINVAL);
if (!IS_ERR(ecomp)) {
if (addl_desc_ptr)
@@ -731,9 +730,11 @@ static int ses_intf_add(struct device *cdev,
buf = NULL;
}
page2_not_supported:
- scomp = kcalloc(components, sizeof(struct ses_component), GFP_KERNEL);
- if (!scomp)
- goto err_free;
+ if (components > 0) {
+ scomp = kcalloc(components, sizeof(struct ses_component), GFP_KERNEL);
+ if (!scomp)
+ goto err_free;
+ }
edev = enclosure_register(cdev->parent, dev_name(&sdev->sdev_gendev),
components, &ses_enclosure_callbacks);
@@ -813,7 +814,8 @@ static void ses_intf_remove_enclosure(struct scsi_device *sdev)
kfree(ses_dev->page2);
kfree(ses_dev);
- kfree(edev->component[0].scratch);
+ if (edev->components > 0)
+ kfree(edev->component[0].scratch);
put_device(&edev->edev);
enclosure_unregister(edev);
--
2.17.1
The patch titled
Subject: hugetlbfs: fix anon huge page migration race
has been added to the -mm tree. Its filename is
hugetlbfs-fix-anon-huge-page-migration-race.patch
This patch should soon appear at
https://ozlabs.org/~akpm/mmots/broken-out/hugetlbfs-fix-anon-huge-page-migr…
and later at
https://ozlabs.org/~akpm/mmotm/broken-out/hugetlbfs-fix-anon-huge-page-migr…
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next and is updated
there every 3-4 working days
------------------------------------------------------
From: Mike Kravetz <mike.kravetz(a)oracle.com>
Subject: hugetlbfs: fix anon huge page migration race
Qian Cai reported the following BUG in [1]
[ 6147.019063][T45242] LTP: starting move_pages12
[ 6147.475680][T64921] BUG: unable to handle page fault for address: ffffffffffffffe0
...
[ 6147.525866][T64921] RIP: 0010:anon_vma_interval_tree_iter_first+0xa2/0x170
avc_start_pgoff at mm/interval_tree.c:63
[ 6147.620914][T64921] Call Trace:
[ 6147.624078][T64921] rmap_walk_anon+0x141/0xa30
rmap_walk_anon at mm/rmap.c:1864
[ 6147.628639][T64921] try_to_unmap+0x209/0x2d0
try_to_unmap at mm/rmap.c:1763
[ 6147.633026][T64921] ? rmap_walk_locked+0x140/0x140
[ 6147.637936][T64921] ? page_remove_rmap+0x1190/0x1190
[ 6147.643020][T64921] ? page_not_mapped+0x10/0x10
[ 6147.647668][T64921] ? page_get_anon_vma+0x290/0x290
[ 6147.652664][T64921] ? page_mapcount_is_zero+0x10/0x10
[ 6147.657838][T64921] ? hugetlb_page_mapping_lock_write+0x97/0x180
[ 6147.663972][T64921] migrate_pages+0x1005/0x1fb0
[ 6147.668617][T64921] ? remove_migration_pte+0xac0/0xac0
[ 6147.673875][T64921] move_pages_and_store_status.isra.47+0xd7/0x1a0
[ 6147.680181][T64921] ? migrate_pages+0x1fb0/0x1fb0
[ 6147.685002][T64921] __x64_sys_move_pages+0xa5c/0x1100
[ 6147.690176][T64921] ? trace_hardirqs_on+0x20/0x1b5
[ 6147.695084][T64921] ? move_pages_and_store_status.isra.47+0x1a0/0x1a0
[ 6147.701653][T64921] ? rcu_read_lock_sched_held+0xaa/0xd0
[ 6147.707088][T64921] ? switch_fpu_return+0x196/0x400
[ 6147.712083][T64921] ? lockdep_hardirqs_on_prepare+0x38c/0x550
[ 6147.717954][T64921] ? do_syscall_64+0x24/0x310
[ 6147.722513][T64921] do_syscall_64+0x5f/0x310
[ 6147.726897][T64921] ? trace_hardirqs_off+0x12/0x1a0
[ 6147.731894][T64921] ? asm_exc_page_fault+0x8/0x30
[ 6147.736714][T64921] entry_SYSCALL_64_after_hwframe+0x44/0xa9
Hugh Dickens diagnosed this as a migration bug caused by code introduced
to use i_mmap_rwsem for pmd sharing synchronization. Specifically, the
routine unmap_and_move_huge_page() is always passing the TTU_RMAP_LOCKED
flag to try_to_unmap() while holding i_mmap_rwsem. This is wrong for
anon pages as the anon_vma_lock should be held in this case. Further
analysis suggested that i_mmap_rwsem was not required to he held at all
when calling try_to_unmap for anon pages as an anon page could never be
part of a shared pmd mapping.
Discussion also revealed that the hack in hugetlb_page_mapping_lock_write
to drop page lock and acquire i_mmap_rwsem is wrong. There is no way to
keep mapping valid while dropping page lock.
This patch does the following:
- Do not take i_mmap_rwsem and set TTU_RMAP_LOCKED for anon pages when
calling try_to_unmap.
- Remove the hacky code in hugetlb_page_mapping_lock_write. The routine
will now simply do a 'trylock' while still holding the page lock. If
the trylock fails, it will return NULL. This could impact the callers:
- migration calling code will receive -EAGAIN and retry up to the hard
coded limit (10).
- memory error code will treat the page as BUSY. This will force
killing (SIGKILL) instead of SIGBUS any mapping tasks.
Do note that this change in behavior only happens when there is a race.
None of the standard kernel testing suites actually hit this race, but
it is possible.
[1] https://lore.kernel.org/lkml/20200708012044.GC992@lca.pw/
[2] https://lore.kernel.org/linux-mm/alpine.LSU.2.11.2010071833100.2214@eggly.a…
Link: https://lkml.kernel.org/r/20201105195058.78401-1-mike.kravetz@oracle.com
Fixes: c0d0381ade79 ("hugetlbfs: use i_mmap_rwsem for more pmd sharing synchronization")
Signed-off-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Reported-by: Qian Cai <cai(a)lca.pw>
Suggested-by: Hugh Dickins <hughd(a)google.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/hugetlb.c | 90 ++----------------------------------------
mm/memory-failure.c | 36 +++++++---------
mm/migrate.c | 46 +++++++++++----------
mm/rmap.c | 5 --
4 files changed, 48 insertions(+), 129 deletions(-)
--- a/mm/hugetlb.c~hugetlbfs-fix-anon-huge-page-migration-race
+++ a/mm/hugetlb.c
@@ -1568,103 +1568,23 @@ int PageHeadHuge(struct page *page_head)
}
/*
- * Find address_space associated with hugetlbfs page.
- * Upon entry page is locked and page 'was' mapped although mapped state
- * could change. If necessary, use anon_vma to find vma and associated
- * address space. The returned mapping may be stale, but it can not be
- * invalid as page lock (which is held) is required to destroy mapping.
- */
-static struct address_space *_get_hugetlb_page_mapping(struct page *hpage)
-{
- struct anon_vma *anon_vma;
- pgoff_t pgoff_start, pgoff_end;
- struct anon_vma_chain *avc;
- struct address_space *mapping = page_mapping(hpage);
-
- /* Simple file based mapping */
- if (mapping)
- return mapping;
-
- /*
- * Even anonymous hugetlbfs mappings are associated with an
- * underlying hugetlbfs file (see hugetlb_file_setup in mmap
- * code). Find a vma associated with the anonymous vma, and
- * use the file pointer to get address_space.
- */
- anon_vma = page_lock_anon_vma_read(hpage);
- if (!anon_vma)
- return mapping; /* NULL */
-
- /* Use first found vma */
- pgoff_start = page_to_pgoff(hpage);
- pgoff_end = pgoff_start + pages_per_huge_page(page_hstate(hpage)) - 1;
- anon_vma_interval_tree_foreach(avc, &anon_vma->rb_root,
- pgoff_start, pgoff_end) {
- struct vm_area_struct *vma = avc->vma;
-
- mapping = vma->vm_file->f_mapping;
- break;
- }
-
- anon_vma_unlock_read(anon_vma);
- return mapping;
-}
-
-/*
* Find and lock address space (mapping) in write mode.
*
- * Upon entry, the page is locked which allows us to find the mapping
- * even in the case of an anon page. However, locking order dictates
- * the i_mmap_rwsem be acquired BEFORE the page lock. This is hugetlbfs
- * specific. So, we first try to lock the sema while still holding the
- * page lock. If this works, great! If not, then we need to drop the
- * page lock and then acquire i_mmap_rwsem and reacquire page lock. Of
- * course, need to revalidate state along the way.
+ * Upon entry, the page is locked which means that page_mapping() is
+ * stable. Due to locking order, we can only trylock_write. If we can
+ * not get the lock, simply return NULL to caller.
*/
struct address_space *hugetlb_page_mapping_lock_write(struct page *hpage)
{
- struct address_space *mapping, *mapping2;
+ struct address_space *mapping = page_mapping(hpage);
- mapping = _get_hugetlb_page_mapping(hpage);
-retry:
if (!mapping)
return mapping;
- /*
- * If no contention, take lock and return
- */
if (i_mmap_trylock_write(mapping))
return mapping;
- /*
- * Must drop page lock and wait on mapping sema.
- * Note: Once page lock is dropped, mapping could become invalid.
- * As a hack, increase map count until we lock page again.
- */
- atomic_inc(&hpage->_mapcount);
- unlock_page(hpage);
- i_mmap_lock_write(mapping);
- lock_page(hpage);
- atomic_add_negative(-1, &hpage->_mapcount);
-
- /* verify page is still mapped */
- if (!page_mapped(hpage)) {
- i_mmap_unlock_write(mapping);
- return NULL;
- }
-
- /*
- * Get address space again and verify it is the same one
- * we locked. If not, drop lock and retry.
- */
- mapping2 = _get_hugetlb_page_mapping(hpage);
- if (mapping2 != mapping) {
- i_mmap_unlock_write(mapping);
- mapping = mapping2;
- goto retry;
- }
-
- return mapping;
+ return NULL;
}
pgoff_t __basepage_index(struct page *page)
--- a/mm/memory-failure.c~hugetlbfs-fix-anon-huge-page-migration-race
+++ a/mm/memory-failure.c
@@ -1057,27 +1057,25 @@ static bool hwpoison_user_mappings(struc
if (!PageHuge(hpage)) {
unmap_success = try_to_unmap(hpage, ttu);
} else {
- /*
- * For hugetlb pages, try_to_unmap could potentially call
- * huge_pmd_unshare. Because of this, take semaphore in
- * write mode here and set TTU_RMAP_LOCKED to indicate we
- * have taken the lock at this higer level.
- *
- * Note that the call to hugetlb_page_mapping_lock_write
- * is necessary even if mapping is already set. It handles
- * ugliness of potentially having to drop page lock to obtain
- * i_mmap_rwsem.
- */
- mapping = hugetlb_page_mapping_lock_write(hpage);
-
- if (mapping) {
- unmap_success = try_to_unmap(hpage,
+ if (!PageAnon(hpage)) {
+ /*
+ * For hugetlb pages in shared mappings, try_to_unmap
+ * could potentially call huge_pmd_unshare. Because of
+ * this, take semaphore in write mode here and set
+ * TTU_RMAP_LOCKED to indicate we have taken the lock
+ * at this higer level.
+ */
+ mapping = hugetlb_page_mapping_lock_write(hpage);
+ if (mapping) {
+ unmap_success = try_to_unmap(hpage,
ttu|TTU_RMAP_LOCKED);
- i_mmap_unlock_write(mapping);
+ i_mmap_unlock_write(mapping);
+ } else {
+ pr_info("Memory failure: %#lx: could not lock mapping for mapped huge page\n", pfn);
+ unmap_success = false;
+ }
} else {
- pr_info("Memory failure: %#lx: could not find mapping for mapped huge page\n",
- pfn);
- unmap_success = false;
+ unmap_success = try_to_unmap(hpage, ttu);
}
}
if (!unmap_success)
--- a/mm/migrate.c~hugetlbfs-fix-anon-huge-page-migration-race
+++ a/mm/migrate.c
@@ -1328,34 +1328,38 @@ static int unmap_and_move_huge_page(new_
goto put_anon;
if (page_mapped(hpage)) {
- /*
- * try_to_unmap could potentially call huge_pmd_unshare.
- * Because of this, take semaphore in write mode here and
- * set TTU_RMAP_LOCKED to let lower levels know we have
- * taken the lock.
- */
- mapping = hugetlb_page_mapping_lock_write(hpage);
- if (unlikely(!mapping))
- goto unlock_put_anon;
-
- try_to_unmap(hpage,
- TTU_MIGRATION|TTU_IGNORE_MLOCK|TTU_IGNORE_ACCESS|
- TTU_RMAP_LOCKED);
+ bool mapping_locked = false;
+ enum ttu_flags ttu = TTU_MIGRATION|TTU_IGNORE_MLOCK|
+ TTU_IGNORE_ACCESS;
+
+ if (!PageAnon(hpage)) {
+ /*
+ * In shared mappings, try_to_unmap could potentially
+ * call huge_pmd_unshare. Because of this, take
+ * semaphore in write mode here and set TTU_RMAP_LOCKED
+ * to let lower levels know we have taken the lock.
+ */
+ mapping = hugetlb_page_mapping_lock_write(hpage);
+ if (unlikely(!mapping))
+ goto unlock_put_anon;
+
+ mapping_locked = true;
+ ttu |= TTU_RMAP_LOCKED;
+ }
+
+ try_to_unmap(hpage, ttu);
page_was_mapped = 1;
- /*
- * Leave mapping locked until after subsequent call to
- * remove_migration_ptes()
- */
+
+ if (mapping_locked)
+ i_mmap_unlock_write(mapping);
}
if (!page_mapped(hpage))
rc = move_to_new_page(new_hpage, hpage, mode);
- if (page_was_mapped) {
+ if (page_was_mapped)
remove_migration_ptes(hpage,
- rc == MIGRATEPAGE_SUCCESS ? new_hpage : hpage, true);
- i_mmap_unlock_write(mapping);
- }
+ rc == MIGRATEPAGE_SUCCESS ? new_hpage : hpage, false);
unlock_put_anon:
unlock_page(new_hpage);
--- a/mm/rmap.c~hugetlbfs-fix-anon-huge-page-migration-race
+++ a/mm/rmap.c
@@ -1413,9 +1413,6 @@ static bool try_to_unmap_one(struct page
/*
* If sharing is possible, start and end will be adjusted
* accordingly.
- *
- * If called for a huge page, caller must hold i_mmap_rwsem
- * in write mode as it is possible to call huge_pmd_unshare.
*/
adjust_range_if_pmd_sharing_possible(vma, &range.start,
&range.end);
@@ -1462,7 +1459,7 @@ static bool try_to_unmap_one(struct page
subpage = page - page_to_pfn(page) + pte_pfn(*pvmw.pte);
address = pvmw.address;
- if (PageHuge(page)) {
+ if (PageHuge(page) && !PageAnon(page)) {
/*
* To call huge_pmd_unshare, i_mmap_rwsem must be
* held in write mode. Caller needs to explicitly
_
Patches currently in -mm which might be from mike.kravetz(a)oracle.com are
hugetlbfs-fix-anon-huge-page-migration-race.patch
bpftrace parses the kernel headers and uses Clang under the hood. Remove
the version check when __BPF_TRACING__ is defined (as bpftrace does) so
that this tool can continue to parse kernel headers, even with older
clang sources.
Cc: <stable(a)vger.kernel.org>
Fixes: commit 1f7a44f63e6c ("compiler-clang: add build check for clang 10.0.1")
Reported-by: Chen Yu <yu.chen.surf(a)gmail.com>
Reported-by: Jarkko Sakkinen <jarkko(a)kernel.org>
Signed-off-by: Nick Desaulniers <ndesaulniers(a)google.com>
---
include/linux/compiler-clang.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
index dd7233c48bf3..98cff1b4b088 100644
--- a/include/linux/compiler-clang.h
+++ b/include/linux/compiler-clang.h
@@ -8,8 +8,10 @@
+ __clang_patchlevel__)
#if CLANG_VERSION < 100001
+#ifndef __BPF_TRACING__
# error Sorry, your version of Clang is too old - please use 10.0.1 or newer.
#endif
+#endif
/* Compiler specific definitions for Clang compiler */
--
2.29.1.341.ge80a0c044ae-goog
From: Andrew Jones <drjones(a)redhat.com>
ID registers are RAZ until they've been allocated a purpose, but
that doesn't mean they should be removed from the KVM_GET_REG_LIST
list. So far we only have one register, SYS_ID_AA64ZFR0_EL1, that
is hidden from userspace when its function, SVE, is not present.
Expose SYS_ID_AA64ZFR0_EL1 to userspace as RAZ when SVE is not
implemented. Removing the userspace visibility checks is enough
to reexpose it, as it will already return zero to userspace when
SVE is not present. The register already behaves as RAZ for the
guest when SVE is not present.
Fixes: 73433762fcae ("KVM: arm64/sve: System register context switch and access support")
Reported-by: 张东旭 <xu910121(a)sina.com>
Signed-off-by: Andrew Jones <drjones(a)redhat.com>
Signed-off-by: Marc Zyngier <maz(a)kernel.org>
Cc: stable(a)vger.kernel.org#v5.2+
Link: https://lore.kernel.org/r/20201105091022.15373-2-drjones@redhat.com
---
arch/arm64/kvm/sys_regs.c | 18 +-----------------
1 file changed, 1 insertion(+), 17 deletions(-)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index 983994f01a63..3af306e6b9cd 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1193,16 +1193,6 @@ static unsigned int sve_visibility(const struct kvm_vcpu *vcpu,
return REG_HIDDEN_USER | REG_HIDDEN_GUEST;
}
-/* Visibility overrides for SVE-specific ID registers */
-static unsigned int sve_id_visibility(const struct kvm_vcpu *vcpu,
- const struct sys_reg_desc *rd)
-{
- if (vcpu_has_sve(vcpu))
- return 0;
-
- return REG_HIDDEN_USER;
-}
-
/* Generate the emulated ID_AA64ZFR0_EL1 value exposed to the guest */
static u64 guest_id_aa64zfr0_el1(const struct kvm_vcpu *vcpu)
{
@@ -1229,9 +1219,6 @@ static int get_id_aa64zfr0_el1(struct kvm_vcpu *vcpu,
{
u64 val;
- if (WARN_ON(!vcpu_has_sve(vcpu)))
- return -ENOENT;
-
val = guest_id_aa64zfr0_el1(vcpu);
return reg_to_user(uaddr, &val, reg->id);
}
@@ -1244,9 +1231,6 @@ static int set_id_aa64zfr0_el1(struct kvm_vcpu *vcpu,
int err;
u64 val;
- if (WARN_ON(!vcpu_has_sve(vcpu)))
- return -ENOENT;
-
err = reg_from_user(&val, uaddr, id);
if (err)
return err;
@@ -1509,7 +1493,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
ID_SANITISED(ID_AA64PFR1_EL1),
ID_UNALLOCATED(4,2),
ID_UNALLOCATED(4,3),
- { SYS_DESC(SYS_ID_AA64ZFR0_EL1), access_id_aa64zfr0_el1, .get_user = get_id_aa64zfr0_el1, .set_user = set_id_aa64zfr0_el1, .visibility = sve_id_visibility },
+ { SYS_DESC(SYS_ID_AA64ZFR0_EL1), access_id_aa64zfr0_el1, .get_user = get_id_aa64zfr0_el1, .set_user = set_id_aa64zfr0_el1, },
ID_UNALLOCATED(4,5),
ID_UNALLOCATED(4,6),
ID_UNALLOCATED(4,7),
--
2.28.0
This is a note to let you know that I've just added the patch titled
serial: txx9: add missing platform_driver_unregister() on error in
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 0c5fc92622ed5531ff324b20f014e9e3092f0187 Mon Sep 17 00:00:00 2001
From: Qinglang Miao <miaoqinglang(a)huawei.com>
Date: Tue, 3 Nov 2020 16:49:42 +0800
Subject: serial: txx9: add missing platform_driver_unregister() on error in
serial_txx9_init
Add the missing platform_driver_unregister() before return
from serial_txx9_init in the error handling case when failed
to register serial_txx9_pci_driver with macro ENABLE_SERIAL_TXX9_PCI
defined.
Fixes: ab4382d27412 ("tty: move drivers/serial/ to drivers/tty/serial/")
Signed-off-by: Qinglang Miao <miaoqinglang(a)huawei.com>
Link: https://lore.kernel.org/r/20201103084942.109076-1-miaoqinglang@huawei.com
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/serial_txx9.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/drivers/tty/serial/serial_txx9.c b/drivers/tty/serial/serial_txx9.c
index b4d89e31730e..7a07e7272de1 100644
--- a/drivers/tty/serial/serial_txx9.c
+++ b/drivers/tty/serial/serial_txx9.c
@@ -1280,6 +1280,9 @@ static int __init serial_txx9_init(void)
#ifdef ENABLE_SERIAL_TXX9_PCI
ret = pci_register_driver(&serial_txx9_pci_driver);
+ if (ret) {
+ platform_driver_unregister(&serial_txx9_plat_driver);
+ }
#endif
if (ret == 0)
goto out;
--
2.29.2
This is a note to let you know that I've just added the patch titled
tty: fix crash in release_tty if tty->port is not set
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 4466d6d2f80c1193e0845d110277c56da77a6418 Mon Sep 17 00:00:00 2001
From: Matthias Reichl <hias(a)horus.com>
Date: Thu, 5 Nov 2020 13:34:32 +0100
Subject: tty: fix crash in release_tty if tty->port is not set
Commit 2ae0b31e0face ("tty: don't crash in tty_init_dev when missing
tty_port") didn't fully prevent the crash as the cleanup path in
tty_init_dev() calls release_tty() which dereferences tty->port
without checking it for non-null.
Add tty->port checks to release_tty to avoid the kernel crash.
Fixes: 2ae0b31e0face ("tty: don't crash in tty_init_dev when missing tty_port")
Signed-off-by: Matthias Reichl <hias(a)horus.com>
Link: https://lore.kernel.org/r/20201105123432.4448-1-hias@horus.com
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/tty_io.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index 7a4c02548fb3..9f8b9a567b35 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -1515,10 +1515,12 @@ static void release_tty(struct tty_struct *tty, int idx)
tty->ops->shutdown(tty);
tty_save_termios(tty);
tty_driver_remove_tty(tty->driver, tty);
- tty->port->itty = NULL;
+ if (tty->port)
+ tty->port->itty = NULL;
if (tty->link)
tty->link->port->itty = NULL;
- tty_buffer_cancel_work(tty->port);
+ if (tty->port)
+ tty_buffer_cancel_work(tty->port);
if (tty->link)
tty_buffer_cancel_work(tty->link->port);
--
2.29.2
This is a note to let you know that I've just added the patch titled
tty: serial: imx: enable earlycon by default if IMX_SERIAL_CONSOLE is
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 427627a23c3e86e31113f9db9bfdca41698a0ee5 Mon Sep 17 00:00:00 2001
From: Lucas Stach <l.stach(a)pengutronix.de>
Date: Thu, 5 Nov 2020 21:40:26 +0100
Subject: tty: serial: imx: enable earlycon by default if IMX_SERIAL_CONSOLE is
enabled
Since 699cc4dfd140 (tty: serial: imx: add imx earlycon driver), the earlycon
part of imx serial is a separate driver and isn't necessarily enabled anymore
when the console is enabled. This causes users to loose the earlycon
functionality when upgrading their kenrel configuration via oldconfig.
Enable earlycon by default when IMX_SERIAL_CONSOLE is enabled.
Fixes: 699cc4dfd140 (tty: serial: imx: add imx earlycon driver)
Reviewed-by: Fabio Estevam <festevam(a)gmail.com>
Reviewed-by: Fugang Duan <fugang.duan(a)nxp.com>
Signed-off-by: Lucas Stach <l.stach(a)pengutronix.de>
Link: https://lore.kernel.org/r/20201105204026.1818219-1-l.stach@pengutronix.de
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/Kconfig | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/tty/serial/Kconfig b/drivers/tty/serial/Kconfig
index 1044fc387691..28f22e58639c 100644
--- a/drivers/tty/serial/Kconfig
+++ b/drivers/tty/serial/Kconfig
@@ -522,6 +522,7 @@ config SERIAL_IMX_EARLYCON
depends on OF
select SERIAL_EARLYCON
select SERIAL_CORE_CONSOLE
+ default y if SERIAL_IMX_CONSOLE
help
If you have enabled the earlycon on the Freescale IMX
CPU you can make it the earlycon by answering Y to this option.
--
2.29.2
This is a note to let you know that I've just added the patch titled
serial: 8250_mtk: Fix uart_get_baud_rate warning
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From 912ab37c798770f21b182d656937072b58553378 Mon Sep 17 00:00:00 2001
From: Claire Chang <tientzu(a)chromium.org>
Date: Mon, 2 Nov 2020 20:07:49 +0800
Subject: serial: 8250_mtk: Fix uart_get_baud_rate warning
Mediatek 8250 port supports speed higher than uartclk / 16. If the baud
rates in both the new and the old termios setting are higher than
uartclk / 16, the WARN_ON in uart_get_baud_rate() will be triggered.
Passing NULL as the old termios so uart_get_baud_rate() will use
uartclk / 16 - 1 as the new baud rate which will be replaced by the
original baud rate later by tty_termios_encode_baud_rate() in
mtk8250_set_termios().
Fixes: 551e553f0d4a ("serial: 8250_mtk: Fix high-speed baud rates clamping")
Signed-off-by: Claire Chang <tientzu(a)chromium.org>
Link: https://lore.kernel.org/r/20201102120749.374458-1-tientzu@chromium.org
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/8250/8250_mtk.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/tty/serial/8250/8250_mtk.c b/drivers/tty/serial/8250/8250_mtk.c
index 41f4120abdf2..fa876e2c13e5 100644
--- a/drivers/tty/serial/8250/8250_mtk.c
+++ b/drivers/tty/serial/8250/8250_mtk.c
@@ -317,7 +317,7 @@ mtk8250_set_termios(struct uart_port *port, struct ktermios *termios,
*/
baud = tty_termios_baud_rate(termios);
- serial8250_do_set_termios(port, termios, old);
+ serial8250_do_set_termios(port, termios, NULL);
tty_termios_encode_baud_rate(termios, baud, baud);
--
2.29.2
This is a note to let you know that I've just added the patch titled
staging: rtl8723bs: Add 024c:0627 to the list of SDIO device-ids
to my staging git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
in the staging-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
>From aee9dccc5b64e878cf1b18207436e73f66d74157 Mon Sep 17 00:00:00 2001
From: Brian O'Keefe <bokeefe(a)alum.wpi.edu>
Date: Fri, 6 Nov 2020 10:10:34 -0500
Subject: staging: rtl8723bs: Add 024c:0627 to the list of SDIO device-ids
Add 024c:0627 to the list of SDIO device-ids, based on hardware found in
the wild. This hardware exists on at least some Acer SW1-011 tablets.
Signed-off-by: Brian O'Keefe <bokeefe(a)alum.wpi.edu>
Reviewed-by: Hans de Goede <hdegoede(a)redhat.com>
Link: https://lore.kernel.org/r/b9e1523f-2ba7-fb82-646a-37f095b4440e@alum.wpi.edu
Cc: stable <stable(a)vger.kernel.org>
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/staging/rtl8723bs/os_dep/sdio_intf.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/staging/rtl8723bs/os_dep/sdio_intf.c b/drivers/staging/rtl8723bs/os_dep/sdio_intf.c
index 79b55ec827a4..b2208e5f190a 100644
--- a/drivers/staging/rtl8723bs/os_dep/sdio_intf.c
+++ b/drivers/staging/rtl8723bs/os_dep/sdio_intf.c
@@ -20,6 +20,7 @@ static const struct sdio_device_id sdio_ids[] = {
{ SDIO_DEVICE(0x024c, 0x0525), },
{ SDIO_DEVICE(0x024c, 0x0623), },
{ SDIO_DEVICE(0x024c, 0x0626), },
+ { SDIO_DEVICE(0x024c, 0x0627), },
{ SDIO_DEVICE(0x024c, 0xb723), },
{ /* end: all zeroes */ },
};
--
2.29.2
The patch below does not apply to the 5.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From f5449e74802c1112dea984aec8af7a33c4516af1 Mon Sep 17 00:00:00 2001
From: Jason Gunthorpe <jgg(a)ziepe.ca>
Date: Mon, 14 Sep 2020 08:59:56 -0300
Subject: [PATCH] RDMA/ucma: Rework ucma_migrate_id() to avoid races with
destroy
ucma_destroy_id() assumes that all things accessing the ctx will do so via
the xarray. This assumption violated only in the case the FD is being
closed, then the ctx is reached via the ctx_list. Normally this is OK
since ucma_destroy_id() cannot run concurrenty with release(), however
with ucma_migrate_id() is involved this can violated as the close of the
2nd FD can run concurrently with destroy on the first:
CPU0 CPU1
ucma_destroy_id(fda)
ucma_migrate_id(fda -> fdb)
ucma_get_ctx()
xa_lock()
_ucma_find_context()
xa_erase()
xa_unlock()
xa_lock()
ctx->file = new_file
list_move()
xa_unlock()
ucma_put_ctx()
ucma_close(fdb)
_destroy_id()
kfree(ctx)
_destroy_id()
wait_for_completion()
// boom, ctx was freed
The ctx->file must be modified under the handler and xa_lock, and prior to
modification the ID must be rechecked that it is still reachable from
cur_file, ie there is no parallel destroy or migrate.
To make this work remove the double locking and streamline the control
flow. The double locking was obsoleted by the handler lock now directly
preventing new uevents from being created, and the ctx_list cannot be read
while holding fgets on both files. Removing the double locking also
removes the need to check for the same file.
Fixes: 88314e4dda1e ("RDMA/cma: add support for rdma_migrate_id()")
Link: https://lore.kernel.org/r/0-v1-05c5a4090305+3a872-ucma_syz_migrate_jgg@nvid…
Reported-and-tested-by: syzbot+cc6fc752b3819e082d0c(a)syzkaller.appspotmail.com
Signed-off-by: Jason Gunthorpe <jgg(a)nvidia.com>
diff --git a/drivers/infiniband/core/ucma.c b/drivers/infiniband/core/ucma.c
index a5595bd489b0..b3a7dbb12f25 100644
--- a/drivers/infiniband/core/ucma.c
+++ b/drivers/infiniband/core/ucma.c
@@ -1587,45 +1587,15 @@ static ssize_t ucma_leave_multicast(struct ucma_file *file,
return ret;
}
-static void ucma_lock_files(struct ucma_file *file1, struct ucma_file *file2)
-{
- /* Acquire mutex's based on pointer comparison to prevent deadlock. */
- if (file1 < file2) {
- mutex_lock(&file1->mut);
- mutex_lock_nested(&file2->mut, SINGLE_DEPTH_NESTING);
- } else {
- mutex_lock(&file2->mut);
- mutex_lock_nested(&file1->mut, SINGLE_DEPTH_NESTING);
- }
-}
-
-static void ucma_unlock_files(struct ucma_file *file1, struct ucma_file *file2)
-{
- if (file1 < file2) {
- mutex_unlock(&file2->mut);
- mutex_unlock(&file1->mut);
- } else {
- mutex_unlock(&file1->mut);
- mutex_unlock(&file2->mut);
- }
-}
-
-static void ucma_move_events(struct ucma_context *ctx, struct ucma_file *file)
-{
- struct ucma_event *uevent, *tmp;
-
- list_for_each_entry_safe(uevent, tmp, &ctx->file->event_list, list)
- if (uevent->ctx == ctx)
- list_move_tail(&uevent->list, &file->event_list);
-}
-
static ssize_t ucma_migrate_id(struct ucma_file *new_file,
const char __user *inbuf,
int in_len, int out_len)
{
struct rdma_ucm_migrate_id cmd;
struct rdma_ucm_migrate_resp resp;
+ struct ucma_event *uevent, *tmp;
struct ucma_context *ctx;
+ LIST_HEAD(event_list);
struct fd f;
struct ucma_file *cur_file;
int ret = 0;
@@ -1641,43 +1611,52 @@ static ssize_t ucma_migrate_id(struct ucma_file *new_file,
ret = -EINVAL;
goto file_put;
}
+ cur_file = f.file->private_data;
/* Validate current fd and prevent destruction of id. */
- ctx = ucma_get_ctx(f.file->private_data, cmd.id);
+ ctx = ucma_get_ctx(cur_file, cmd.id);
if (IS_ERR(ctx)) {
ret = PTR_ERR(ctx);
goto file_put;
}
rdma_lock_handler(ctx->cm_id);
- cur_file = ctx->file;
- if (cur_file == new_file) {
- mutex_lock(&cur_file->mut);
- resp.events_reported = ctx->events_reported;
- mutex_unlock(&cur_file->mut);
- goto response;
- }
-
/*
- * Migrate events between fd's, maintaining order, and avoiding new
- * events being added before existing events.
+ * ctx->file can only be changed under the handler & xa_lock. xa_load()
+ * must be checked again to ensure the ctx hasn't begun destruction
+ * since the ucma_get_ctx().
*/
- ucma_lock_files(cur_file, new_file);
xa_lock(&ctx_table);
-
- list_move_tail(&ctx->list, &new_file->ctx_list);
- ucma_move_events(ctx, new_file);
+ if (_ucma_find_context(cmd.id, cur_file) != ctx) {
+ xa_unlock(&ctx_table);
+ ret = -ENOENT;
+ goto err_unlock;
+ }
ctx->file = new_file;
+ xa_unlock(&ctx_table);
+
+ mutex_lock(&cur_file->mut);
+ list_del(&ctx->list);
+ /*
+ * At this point lock_handler() prevents addition of new uevents for
+ * this ctx.
+ */
+ list_for_each_entry_safe(uevent, tmp, &cur_file->event_list, list)
+ if (uevent->ctx == ctx)
+ list_move_tail(&uevent->list, &event_list);
resp.events_reported = ctx->events_reported;
+ mutex_unlock(&cur_file->mut);
- xa_unlock(&ctx_table);
- ucma_unlock_files(cur_file, new_file);
+ mutex_lock(&new_file->mut);
+ list_add_tail(&ctx->list, &new_file->ctx_list);
+ list_splice_tail(&event_list, &new_file->event_list);
+ mutex_unlock(&new_file->mut);
-response:
if (copy_to_user(u64_to_user_ptr(cmd.response),
&resp, sizeof(resp)))
ret = -EFAULT;
+err_unlock:
rdma_unlock_handler(ctx->cm_id);
ucma_put_ctx(ctx);
file_put:
Option "mediatek,str-clock-on" means to keep clock on during system
suspend and resume. Some platform will flush register settings if clock has
been disabled when system is suspended. Set this option to avoid clock off.
Fixes: 0cbd4b34cda9 ("xhci: mediatek: support MTK xHCI host controller")
Signed-off-by: Macpaul Lin <macpaul.lin(a)mediatek.com>
Cc: stable(a)vger.kernel.org
---
Changes for v3:
- Remove unnecessary Change-Id in commit message.
- Add "Fixes" tag as a bug fix on phone system.
Changes for v2:
- Rename "mediatek,keep-clock-on" to "mediatek,str-clock-on" which implies
this option related to STR functions.
- After discussion with Chunfeng, resend dt-bindings descritption based on
mediatek,mtk-xhci.txt instead of yaml format.
.../devicetree/bindings/usb/mediatek,mtk-xhci.txt | 3 +++
1 file changed, 3 insertions(+)
diff --git a/Documentation/devicetree/bindings/usb/mediatek,mtk-xhci.txt b/Documentation/devicetree/bindings/usb/mediatek,mtk-xhci.txt
index 42d8814..fc93bcf 100644
--- a/Documentation/devicetree/bindings/usb/mediatek,mtk-xhci.txt
+++ b/Documentation/devicetree/bindings/usb/mediatek,mtk-xhci.txt
@@ -37,6 +37,9 @@ Required properties:
Optional properties:
- wakeup-source : enable USB remote wakeup;
+ - mediatek,str-clock-on: Keep clock on during system suspend and resume.
+ Some platform will flush register settings if clock has been disabled
+ when system is suspended.
- mediatek,syscon-wakeup : phandle to syscon used to access the register
of the USB wakeup glue layer between xHCI and SPM; it depends on
"wakeup-source", and has two arguments:
--
1.7.9.5
Commit eddd2a4c675c ("staging: comedi: cb_pcidas: refactor
write_calibration_bitstream()") inadvertently removed one of the
`udelay(1)` calls when writing to the calibration register in
`cb_pcidas_calib_write()`. Reinstate the delay. It may seem strange
that the delay is placed before the register write, but this function is
called in a loop so the extra delay can make a difference.
This _might_ solve reported issues reading analog inputs on a
PCIe-DAS1602/16 card where the analog input values "were scaled in a
strange way that didn't make sense". On the same hardware running a
system with a 3.13 kernel, and then a system with a 4.4 kernel, but with
the same application software, the system with the 3.13 kernel was fine,
but the one with the 4.4 kernel exhibited the problem. Of the 90
changes to the driver between those kernel versions, this change looked
like the most likely culprit.
Fixes: eddd2a4c675c ("staging: comedi: cb_pcidas: refactor write_calibration_bitstream()")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Ian Abbott <abbotti(a)mev.co.uk>
---
drivers/staging/comedi/drivers/cb_pcidas.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/staging/comedi/drivers/cb_pcidas.c b/drivers/staging/comedi/drivers/cb_pcidas.c
index 48ec2ee953dc..4f2ac39aa619 100644
--- a/drivers/staging/comedi/drivers/cb_pcidas.c
+++ b/drivers/staging/comedi/drivers/cb_pcidas.c
@@ -529,6 +529,7 @@ static void cb_pcidas_calib_write(struct comedi_device *dev,
if (trimpot) {
/* select trimpot */
calib_bits |= PCIDAS_CALIB_TRIM_SEL;
+ udelay(1);
outw(calib_bits, devpriv->pcibar1 + PCIDAS_CALIB_REG);
}
--
2.28.0
We were relying on GNU ld's ability to re-link executable files in order
to extract our VDSO symbols. This behavior was deemed a bug as of
binutils-2.35 (specifically the binutils-gdb commit a87e1817a4 ("Have
the linker fail if any attempt to link in an executable is made."), but as that
has been backported to at least Debian's binutils-2.34 in may manifest in other
places.
The previous version of this was a bit of a mess: we were linking a
static executable version of the VDSO, containing only a subset of the
input symbols, which we then linked into the kernel. This worked, but
certainly wasn't a supported path through the toolchain. Instead this
new version parses the textual output of nm to produce a symbol table.
Both rely on near-zero addresses being linkable, but as we rely on weak
undefined symbols being linkable elsewhere I don't view this as a major
issue.
Fixes: e2c0cdfba7f6 ("RISC-V: User-facing API")
Cc: clang-built-linux(a)googlegroups.com
Cc: stable(a)vger.kernel.org
Signed-off-by: Palmer Dabbelt <palmerdabbelt(a)google.com>
---
Changes since v2 <20201019235630.762886-1-palmerdabbelt(a)google.com>:
* Uses $(srctree)/$(src) to allow for out-of-tree builds.
Changes since v1 <20201017002500.503011-1-palmerdabbelt(a)google.com>:
* Uses $(NM) instead of $(CROSS_COMPILE)nm. We use the $(CROSS_COMPILE) form
elsewhere in this file, but we'll fix that later.
* Removed the unnecesary .map file creation.
---
arch/riscv/kernel/vdso/.gitignore | 1 +
arch/riscv/kernel/vdso/Makefile | 17 ++++++++---------
arch/riscv/kernel/vdso/so2s.sh | 6 ++++++
3 files changed, 15 insertions(+), 9 deletions(-)
create mode 100755 arch/riscv/kernel/vdso/so2s.sh
diff --git a/arch/riscv/kernel/vdso/.gitignore b/arch/riscv/kernel/vdso/.gitignore
index 11ebee9e4c1d..3a19def868ec 100644
--- a/arch/riscv/kernel/vdso/.gitignore
+++ b/arch/riscv/kernel/vdso/.gitignore
@@ -1,3 +1,4 @@
# SPDX-License-Identifier: GPL-2.0-only
vdso.lds
*.tmp
+vdso-syms.S
diff --git a/arch/riscv/kernel/vdso/Makefile b/arch/riscv/kernel/vdso/Makefile
index 478e7338ddc1..a8ecf102e09b 100644
--- a/arch/riscv/kernel/vdso/Makefile
+++ b/arch/riscv/kernel/vdso/Makefile
@@ -43,19 +43,14 @@ $(obj)/vdso.o: $(obj)/vdso.so
SYSCFLAGS_vdso.so.dbg = $(c_flags)
$(obj)/vdso.so.dbg: $(src)/vdso.lds $(obj-vdso) FORCE
$(call if_changed,vdsold)
+SYSCFLAGS_vdso.so.dbg = -shared -s -Wl,-soname=linux-vdso.so.1 \
+ -Wl,--build-id -Wl,--hash-style=both
# We also create a special relocatable object that should mirror the symbol
# table and layout of the linked DSO. With ld --just-symbols we can then
# refer to these symbols in the kernel code rather than hand-coded addresses.
-
-SYSCFLAGS_vdso.so.dbg = -shared -s -Wl,-soname=linux-vdso.so.1 \
- -Wl,--build-id -Wl,--hash-style=both
-$(obj)/vdso-dummy.o: $(src)/vdso.lds $(obj)/rt_sigreturn.o FORCE
- $(call if_changed,vdsold)
-
-LDFLAGS_vdso-syms.o := -r --just-symbols
-$(obj)/vdso-syms.o: $(obj)/vdso-dummy.o FORCE
- $(call if_changed,ld)
+$(obj)/vdso-syms.S: $(obj)/vdso.so FORCE
+ $(call if_changed,so2s)
# strip rule for the .so file
$(obj)/%.so: OBJCOPYFLAGS := -S
@@ -73,6 +68,10 @@ quiet_cmd_vdsold = VDSOLD $@
$(patsubst %, -G __vdso_%, $(vdso-syms)) $@.tmp $@ && \
rm $@.tmp
+# Extracts
+quiet_cmd_so2s = SO2S $@
+ cmd_so2s = $(NM) -D $< | $(srctree)/$(src)/so2s.sh > $@
+
# install commands for the unstripped file
quiet_cmd_vdso_install = INSTALL $@
cmd_vdso_install = cp $(obj)/$@.dbg $(MODLIB)/vdso/$@
diff --git a/arch/riscv/kernel/vdso/so2s.sh b/arch/riscv/kernel/vdso/so2s.sh
new file mode 100755
index 000000000000..3c5b43207658
--- /dev/null
+++ b/arch/riscv/kernel/vdso/so2s.sh
@@ -0,0 +1,6 @@
+#!/bin/sh
+# SPDX-License-Identifier: GPL-2.0+
+# Copyright 2020 Palmer Dabbelt <palmerdabbelt(a)google.com>
+
+sed 's!\([0-9a-f]*\) T \([a-z0-9_]*\)@@LINUX_4.15!.global \2\n.set \2,0x\1!' \
+| grep '^\.'
--
2.29.0.rc1.297.gfa9743e501-goog
From: Zhuguangqing <zhuguangqing(a)xiaomi.com>
If state has not changed successfully and we updated cpufreq_state,
next time when the new state is equal to cpufreq_state (not changed
successfully last time), we will return directly and miss a
freq_qos_update_request() that should have been.
Fixes: 5130802ddbb1 ("thermal: cpu_cooling: Switch to QoS requests for freq limits")
Cc: v5.4+ <stable(a)vger.kernel.org> # v5.4+
Signed-off-by: Zhuguangqing <zhuguangqing(a)xiaomi.com>
Acked-by: Viresh Kumar <viresh.kumar(a)linaro.org>
---
v2:
- Add Fixes: 5130802ddbb1 in log.
- Add Cc: v5.4+ <stable(a)vger.kernel.org> # v5.4+ in log.
- Add Acked-by: Viresh Kumar <viresh.kumar(a)linaro.org> in log.
- Delete an extra blank line.
---
drivers/thermal/cpufreq_cooling.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/drivers/thermal/cpufreq_cooling.c b/drivers/thermal/cpufreq_cooling.c
index cc2959f22f01..612f063c1cfc 100644
--- a/drivers/thermal/cpufreq_cooling.c
+++ b/drivers/thermal/cpufreq_cooling.c
@@ -438,13 +438,11 @@ static int cpufreq_set_cur_state(struct thermal_cooling_device *cdev,
if (cpufreq_cdev->cpufreq_state == state)
return 0;
- cpufreq_cdev->cpufreq_state = state;
-
frequency = get_state_freq(cpufreq_cdev, state);
ret = freq_qos_update_request(&cpufreq_cdev->qos_req, frequency);
-
if (ret > 0) {
+ cpufreq_cdev->cpufreq_state = state;
cpus = cpufreq_cdev->policy->cpus;
max_capacity = arch_scale_cpu_capacity(cpumask_first(cpus));
capacity = frequency * max_capacity;
--
2.17.1
From: Eric Biggers <ebiggers(a)google.com>
Commit 3f69cc60768b ("crypto: af_alg - Allow arbitrarily long algorithm
names") made the kernel start accepting arbitrarily long algorithm names
in sockaddr_alg. However, the actual length of the salg_name field
stayed at the original 64 bytes.
This is broken because the kernel can access indices >= 64 in salg_name,
which is undefined behavior -- even though the memory that is accessed
is still located within the sockaddr structure. It would only be
defined behavior if the array were properly marked as arbitrary-length
(either by making it a flexible array, which is the recommended way
these days, or by making it an array of length 0 or 1).
We can't simply change salg_name into a flexible array, since that would
break source compatibility with userspace programs that embed
sockaddr_alg into another struct, or (more commonly) declare a
sockaddr_alg like 'struct sockaddr_alg sa = { .salg_name = "foo" };'.
One solution would be to change salg_name into a flexible array only
when '#ifdef __KERNEL__'. However, that would keep userspace without an
easy way to actually use the longer algorithm names.
Instead, add a new structure 'sockaddr_alg_new' that has the flexible
array field, and expose it to both userspace and the kernel.
Make the kernel use it correctly in alg_bind().
This addresses the syzbot report
"UBSAN: array-index-out-of-bounds in alg_bind"
(https://syzkaller.appspot.com/bug?extid=92ead4eb8e26a26d465e).
Reported-by: syzbot+92ead4eb8e26a26d465e(a)syzkaller.appspotmail.com
Fixes: 3f69cc60768b ("crypto: af_alg - Allow arbitrarily long algorithm names")
Cc: <stable(a)vger.kernel.org> # v4.12+
Signed-off-by: Eric Biggers <ebiggers(a)google.com>
---
crypto/af_alg.c | 10 +++++++---
include/uapi/linux/if_alg.h | 16 ++++++++++++++++
2 files changed, 23 insertions(+), 3 deletions(-)
diff --git a/crypto/af_alg.c b/crypto/af_alg.c
index d11db80d24cd1..9acb9d2c4bcf9 100644
--- a/crypto/af_alg.c
+++ b/crypto/af_alg.c
@@ -147,7 +147,7 @@ static int alg_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
const u32 allowed = CRYPTO_ALG_KERN_DRIVER_ONLY;
struct sock *sk = sock->sk;
struct alg_sock *ask = alg_sk(sk);
- struct sockaddr_alg *sa = (void *)uaddr;
+ struct sockaddr_alg_new *sa = (void *)uaddr;
const struct af_alg_type *type;
void *private;
int err;
@@ -155,7 +155,11 @@ static int alg_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
if (sock->state == SS_CONNECTED)
return -EINVAL;
- if (addr_len < sizeof(*sa))
+ BUILD_BUG_ON(offsetof(struct sockaddr_alg_new, salg_name) !=
+ offsetof(struct sockaddr_alg, salg_name));
+ BUILD_BUG_ON(offsetof(struct sockaddr_alg, salg_name) != sizeof(*sa));
+
+ if (addr_len < sizeof(*sa) + 1)
return -EINVAL;
/* If caller uses non-allowed flag, return error. */
@@ -163,7 +167,7 @@ static int alg_bind(struct socket *sock, struct sockaddr *uaddr, int addr_len)
return -EINVAL;
sa->salg_type[sizeof(sa->salg_type) - 1] = 0;
- sa->salg_name[sizeof(sa->salg_name) + addr_len - sizeof(*sa) - 1] = 0;
+ sa->salg_name[addr_len - sizeof(*sa) - 1] = 0;
type = alg_get_type(sa->salg_type);
if (PTR_ERR(type) == -ENOENT) {
diff --git a/include/uapi/linux/if_alg.h b/include/uapi/linux/if_alg.h
index 60b7c2efd921c..dc52a11ba6d15 100644
--- a/include/uapi/linux/if_alg.h
+++ b/include/uapi/linux/if_alg.h
@@ -24,6 +24,22 @@ struct sockaddr_alg {
__u8 salg_name[64];
};
+/*
+ * Linux v4.12 and later removed the 64-byte limit on salg_name[]; it's now an
+ * arbitrary-length field. We had to keep the original struct above for source
+ * compatibility with existing userspace programs, though. Use the new struct
+ * below if support for very long algorithm names is needed. To do this,
+ * allocate 'sizeof(struct sockaddr_alg_new) + strlen(algname) + 1' bytes, and
+ * copy algname (including the null terminator) into salg_name.
+ */
+struct sockaddr_alg_new {
+ __u16 salg_family;
+ __u8 salg_type[14];
+ __u32 salg_feat;
+ __u32 salg_mask;
+ __u8 salg_name[];
+};
+
struct af_alg_iv {
__u32 ivlen;
__u8 iv[0];
base-commit: 3650b228f83adda7e5ee532e2b90429c03f7b9ec
--
2.29.1
From: Oliver O'Halloran <oohall(a)gmail.com>
[ Upstream commit f6bac19cf65c5be21d14a0c9684c8f560f2096dd ]
When building with W=1 we get the following warning:
arch/powerpc/platforms/powernv/smp.c: In function ‘pnv_smp_cpu_kill_self’:
arch/powerpc/platforms/powernv/smp.c:276:16: error: suggest braces around
empty body in an ‘if’ statement [-Werror=empty-body]
276 | cpu, srr1);
| ^
cc1: all warnings being treated as errors
The full context is this block:
if (srr1 && !generic_check_cpu_restart(cpu))
DBG("CPU%d Unexpected exit while offline srr1=%lx!\n",
cpu, srr1);
When building with DEBUG undefined DBG() expands to nothing and GCC emits
the warning due to the lack of braces around an empty statement.
Signed-off-by: Oliver O'Halloran <oohall(a)gmail.com>
Reviewed-by: Joel Stanley <joel(a)jms.id.au>
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/20200804005410.146094-2-oohall@gmail.com
Signed-off-by: Sasha Levin <sashal(a)kernel.org>
---
arch/powerpc/platforms/powernv/smp.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/powerpc/platforms/powernv/smp.c b/arch/powerpc/platforms/powernv/smp.c
index 8d49ba370c504..889c3dbec6fb9 100644
--- a/arch/powerpc/platforms/powernv/smp.c
+++ b/arch/powerpc/platforms/powernv/smp.c
@@ -47,7 +47,7 @@
#include <asm/udbg.h>
#define DBG(fmt...) udbg_printf(fmt)
#else
-#define DBG(fmt...)
+#define DBG(fmt...) do { } while (0)
#endif
static void pnv_smp_setup_cpu(int cpu)
--
2.25.1
Noticed this when trying to compile with -Wall on a kernel fork. We potentially
don't set width here, which causes the compiler to complain about width
potentially being uninitialized in drm_cvt_modes(). So, let's fix that.
Changes since v1:
* Don't emit an error as this code isn't reachable, just mark it as such
Changes since v2:
* Remove now unused variable
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
Cc: <stable(a)vger.kernel.org> # v5.9+
Fixes: 3f649ab728cd ("treewide: Remove uninitialized_var() usage")
Signed-off-by: Lyude Paul <lyude(a)redhat.com>
---
drivers/gpu/drm/drm_edid.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 631125b46e04..b84efd538a70 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -3114,6 +3114,8 @@ static int drm_cvt_modes(struct drm_connector *connector,
case 0x0c:
width = height * 15 / 9;
break;
+ default:
+ unreachable();
}
for (j = 1; j < 5; j++) {
--
2.28.0
On Tue, Nov 03, 2020 at 02:51:32PM -0800, Jian Cai wrote:
> Dear Stable kernel maintainers,
>
> Please consider cherry picking the following commits (ordered by commit
> time) ino linux-5.4.y.
>
> ffedeeb780dc linkage: Introduce new macros for assembler symbols
>
> 35e61c77ef38 arm64: asm: Add new-style position independent function
> annotations
>
> 3ac0f4526dfb arm64: lib: Use modern annotations for assembly functions
>
> ec9d78070de9 arm64: Change .weak to SYM_FUNC_START_WEAK_PI for
> arch/arm64/lib/mem*.S
>
> The first three are required to apply the last patch. This would unblock
> Chrome OS to build with LLVM's integrated assembler (Please see
> http://crbug.com/1143847 for details).
I've done this, but does this also provide this functionality for x86?
thanks,
greg k-h
(adding stable and Greg KH for additional review)
On Thu, 2020-11-05 at 17:29 +0530, Dwaipayan Ray wrote:
> checkpatch doesn't report warnings for many common mistakes
> in emails. Some of which are trailing commas and incorrect
> use of email comments.
I presume you've tested this against the git tree.
Can you send me a file with the BAD_SIGN_OFF messages generated
and if possible the git SHA-1s of the commits?
> At the same time several false positives are reported due to
> incorrect handling of mail comments. The most common of which
> is due to the pattern:
>
> <stable(a)vger.kernel.org> # X.X
>
> Improve email parsing in checkpatch.
>
> Some general comment rules are defined:
>
> - Multiple name comments should not be allowed.
> - Comments inside address should not be allowed.
> - In general comments should be enclosed within parentheses.
> Exception for stable(a)vger.kernel.org # X.X
not just vger.kernel.org, but this should also allow stable(a)kernel.org
and only allow cc: and not any other -by: type for that email address.
A process preference question for Greg and the stable team:
The most common stable forms are
stable(a)vger.kernel.org # version info
then
stable(a)vger.kernel.org [ version info ]
with some other relatively infrequently used outlier styles, some
that use parentheses, but this is not frequent.
It might be sensible to standardize on the "# version info" trailer
comment version info style and warn on any other form.
A somewhat common style for the stable address is to use a name
before the stable address which describes the version info:
Perhaps any name before stable should be warned and the version
should be a comment.
Here's a list of the stable addresses with "version name" then
stable address in the git tree and other outlier styles.
24 linux-stable <stable(a)vger.kernel.org>
21 5.4+ <stable(a)vger.kernel.org>
14 All applicable <stable(a)vger.kernel.org>
6 3.10+ <stable(a)vger.kernel.org>
5 5.9+ <stable(a)vger.kernel.org>
5 5.3+ <stable(a)vger.kernel.org>
5 5.1+ <stable(a)vger.kernel.org>
4 5.6+ <stable(a)vger.kernel.org>
4 4.20+ <stable(a)vger.kernel.org>
3 Stable Team <stable(a)vger.kernel.org>
3 4.19+ <stable(a)vger.kernel.org>
3 4.15+ <stable(a)vger.kernel.org>
3 4.10+ <stable(a)vger.kernel.org>
2 stable(a)vger.kernel.org (v2.6.12+)
2 5.2+ <stable(a)vger.kernel.org>
2 4.16+ <stable(a)vger.kernel.org>
1 v5.8+ <stable(a)vger.kernel.org>
1 v5.7+ <stable(a)vger.kernel.org>
1 v5.6+ <stable(a)vger.kernel.org>
1 v5.3+ <stable(a)vger.kernel.org>
1 v5.0+ <stable(a)vger.kernel.org>
1 v4.9+ <stable(a)vger.kernel.org>
1 <stable(a)vger.kernel.org> v5.0+
1 <stable(a)vger.kernel.org> +v4.18
1 stable(a)vger.kernel.org (3.14+)
1 5.8+ <stable(a)vger.kernel.org>
1 5.5+ <stable(a)vger.kernel.org>
1 5.0+ <stable(a)vger.kernel.org>
1 4.18+ <stable(a)vger.kernel.org>
1 4.14+ <stable(a)vger.kernel.org>
1 4.13+ <stable(a)vger.kernel.org>
1 4.0+ <stable(a)vger.kernel.org>
1 3.15+ <stable(a)vger.kernel.org>
1 3.11+ <stable(a)vger.kernel.org>
> Improvements to parsing:
>
> - Detect and report unexpected content after email.
> - Quoted names are excluded from comment parsing.
> - Trailing dots or commas in email are removed during
> formatting. Correspondingly a BAD_SIGN_OFF warning
> is emitted.
> - Improperly quoted email like '"name <address>"' are now
> warned about.
All of the above seems right but perhaps the comment style for any
<foo>-by: lines should also allow # comments.
The below is just comments on the patch itself.
> diff --git a/scripts/checkpatch.pl b/scripts/checkpatch.pl
[]
> @@ -2800,9 +2806,57 @@ sub process {
> $dequoted =~ s/" </ </;
> # Don't force email to have quotes
> # Allow just an angle bracketed address
> - if (!same_email_addresses($email, $suggested_email, 0)) {
> + if (!same_email_addresses($email, $suggested_email)) {
> + if (WARN("BAD_SIGN_OFF",
> + "email address '$email' might be better as '$suggested_email'\n" . $herecurr) &&
> + $fix) {
trivia:
Please always align $fix with tabs to the if and then 4 spaces to the
open parenthesis.
> + # Comments must begin only with (
> + # or # in case of stable(a)vger.kernel.org
> + if ($email =~ /^.*stable\@vger/) {
I believe this should be
if ($email =~ /^stable\@(?:vger\.)?kernel.org$/) {
> + if ($comment ne "" && $comment !~ /^#.+/) {
> + if (WARN("BAD_SIGN_OFF",
> + "Invalid comment format for stable: '$email', prefer parentheses\n" . $herecurr) &&
Prefer #
> + $fix) {
> + my $new_comment = $comment;
> + $new_comment =~ s/^[ \(\[]+|[ \)\]]+$//g;
Does the comment include any leading whitespace here?
I presumed not given the $comment !~ /^#/ test above.
I'm announcing the release of the 5.9.6 kernel.
This is a build-fix release, if 5.9.5 built properly for you, wonderful,
no need to upgrade. If 5.9.5 did not build for you, then move to 5.9.6
please. Sorry for the inconvenience.
The updated 5.9.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-5.9.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2 +-
sound/soc/sof/intel/hda-codec.c | 5 +++++
2 files changed, 6 insertions(+), 1 deletion(-)
Greg Kroah-Hartman (1):
Linux 5.9.6
Pierre-Louis Bossart (1):
ASOC: SOF: Intel: hda-codec: move unused label to correct position
Commit "mtd: cfi_cmdset_0002: Add support for polling status register"
added support for polling the status rather than using DQ polling.
However, status register is used only when DQ polling is missing.
Lets use status register when available as it is superior to DQ polling.
Signed-off-by: Joakim Tjernlund <joakim.tjernlund(a)infinera.com>
Cc: stable(a)vger.kernel.org
---
drivers/mtd/chips/cfi_cmdset_0002.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/mtd/chips/cfi_cmdset_0002.c b/drivers/mtd/chips/cfi_cmdset_0002.c
index a1f3e1031c3d..ee9b322e63bb 100644
--- a/drivers/mtd/chips/cfi_cmdset_0002.c
+++ b/drivers/mtd/chips/cfi_cmdset_0002.c
@@ -117,7 +117,7 @@ static struct mtd_chip_driver cfi_amdstd_chipdrv = {
static int cfi_use_status_reg(struct cfi_private *cfi)
{
struct cfi_pri_amdstd *extp = cfi->cmdset_priv;
- u8 poll_mask = CFI_POLL_STATUS_REG | CFI_POLL_DQ;
+ u8 poll_mask = CFI_POLL_STATUS_REG;
return extp->MinorVersion >= '5' &&
(extp->SoftwareFeatures & poll_mask) == CFI_POLL_STATUS_REG;
--
2.26.2
This is an automatic generated email to let you know that the following patch were queued:
Subject: media: ipu3-cio2: Remove traces of returned buffers
Author: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Date: Mon Oct 12 17:25:28 2020 +0200
If starting a video buffer queue fails, the buffers are returned to
videobuf2. Remove the reference to the buffer from the driver's queue as
well.
Fixes: c2a6a07afe4a ("media: intel-ipu3: cio2: add new MIPI-CSI2 driver")
Signed-off-by: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Cc: stable(a)vger.kernel.org # v4.16 and up
Reviewed-by: Andy Shevchenko <andy.shevchenko(a)gmail.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart(a)ideasonboard.com>
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei(a)kernel.org>
drivers/media/pci/intel/ipu3/ipu3-cio2.c | 1 +
1 file changed, 1 insertion(+)
---
diff --git a/drivers/media/pci/intel/ipu3/ipu3-cio2.c b/drivers/media/pci/intel/ipu3/ipu3-cio2.c
index d9baa8bfe54f..51c4dd6a8f9a 100644
--- a/drivers/media/pci/intel/ipu3/ipu3-cio2.c
+++ b/drivers/media/pci/intel/ipu3/ipu3-cio2.c
@@ -791,6 +791,7 @@ static void cio2_vb2_return_all_buffers(struct cio2_queue *q,
atomic_dec(&q->bufs_queued);
vb2_buffer_done(&q->bufs[i]->vbb.vb2_buf,
state);
+ q->bufs[i] = NULL;
}
}
}
This is an automatic generated email to let you know that the following patch were queued:
Subject: media: ipu3-cio2: Make the field on subdev format V4L2_FIELD_NONE
Author: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Date: Fri Oct 9 15:56:05 2020 +0200
The ipu3-cio2 doesn't make use of the field and this is reflected in V4L2
buffers as well as the try format. Do this in active format, too.
Fixes: c2a6a07afe4a ("media: intel-ipu3: cio2: add new MIPI-CSI2 driver")
Signed-off-by: Sakari Ailus <sakari.ailus(a)linux.intel.com>
Reviewed-by: Bingbu Cao <bingbu.cao(a)intel.com>
Reviewed-by: Andy Shevchenko <andy.shevchenko(a)gmail.com>
Reviewed-by: Laurent Pinchart <laurent.pinchart(a)ideasonboard.com>
Cc: stable(a)vger.kernel.org # v4.16 and up
Signed-off-by: Mauro Carvalho Chehab <mchehab+huawei(a)kernel.org>
drivers/media/pci/intel/ipu3/ipu3-cio2.c | 1 +
1 file changed, 1 insertion(+)
---
diff --git a/drivers/media/pci/intel/ipu3/ipu3-cio2.c b/drivers/media/pci/intel/ipu3/ipu3-cio2.c
index 72095f8a4d46..87d040e176f7 100644
--- a/drivers/media/pci/intel/ipu3/ipu3-cio2.c
+++ b/drivers/media/pci/intel/ipu3/ipu3-cio2.c
@@ -1285,6 +1285,7 @@ static int cio2_subdev_set_fmt(struct v4l2_subdev *sd,
fmt->format.width = min_t(u32, fmt->format.width, CIO2_IMAGE_MAX_WIDTH);
fmt->format.height = min_t(u32, fmt->format.height,
CIO2_IMAGE_MAX_LENGTH);
+ fmt->format.field = V4L2_FIELD_NONE;
mutex_lock(&q->subdev_lock);
*mbus = fmt->format;
From: "Steven Rostedt (VMware)" <rostedt(a)goodmis.org>
The recursion protection of the ring buffer depends on preempt_count() to be
correct. But it is possible that the ring buffer gets called after an
interrupt comes in but before it updates the preempt_count(). This will
trigger a false positive in the recursion code.
Use the same trick from the ftrace function callback recursion code which
uses a "transition" bit that gets set, to allow for a single recursion for
to handle transitions between contexts.
Cc: stable(a)vger.kernel.org
Fixes: 567cd4da54ff4 ("ring-buffer: User context bit recursion checking")
Signed-off-by: Steven Rostedt (VMware) <rostedt(a)goodmis.org>
---
kernel/trace/ring_buffer.c | 58 ++++++++++++++++++++++++++++++--------
1 file changed, 46 insertions(+), 12 deletions(-)
diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 7f45fd9d5a45..dc83b3fa9fe7 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -438,14 +438,16 @@ enum {
};
/*
* Used for which event context the event is in.
- * NMI = 0
- * IRQ = 1
- * SOFTIRQ = 2
- * NORMAL = 3
+ * TRANSITION = 0
+ * NMI = 1
+ * IRQ = 2
+ * SOFTIRQ = 3
+ * NORMAL = 4
*
* See trace_recursive_lock() comment below for more details.
*/
enum {
+ RB_CTX_TRANSITION,
RB_CTX_NMI,
RB_CTX_IRQ,
RB_CTX_SOFTIRQ,
@@ -3014,10 +3016,10 @@ rb_wakeups(struct trace_buffer *buffer, struct ring_buffer_per_cpu *cpu_buffer)
* a bit of overhead in something as critical as function tracing,
* we use a bitmask trick.
*
- * bit 0 = NMI context
- * bit 1 = IRQ context
- * bit 2 = SoftIRQ context
- * bit 3 = normal context.
+ * bit 1 = NMI context
+ * bit 2 = IRQ context
+ * bit 3 = SoftIRQ context
+ * bit 4 = normal context.
*
* This works because this is the order of contexts that can
* preempt other contexts. A SoftIRQ never preempts an IRQ
@@ -3040,6 +3042,30 @@ rb_wakeups(struct trace_buffer *buffer, struct ring_buffer_per_cpu *cpu_buffer)
* The least significant bit can be cleared this way, and it
* just so happens that it is the same bit corresponding to
* the current context.
+ *
+ * Now the TRANSITION bit breaks the above slightly. The TRANSITION bit
+ * is set when a recursion is detected at the current context, and if
+ * the TRANSITION bit is already set, it will fail the recursion.
+ * This is needed because there's a lag between the changing of
+ * interrupt context and updating the preempt count. In this case,
+ * a false positive will be found. To handle this, one extra recursion
+ * is allowed, and this is done by the TRANSITION bit. If the TRANSITION
+ * bit is already set, then it is considered a recursion and the function
+ * ends. Otherwise, the TRANSITION bit is set, and that bit is returned.
+ *
+ * On the trace_recursive_unlock(), the TRANSITION bit will be the first
+ * to be cleared. Even if it wasn't the context that set it. That is,
+ * if an interrupt comes in while NORMAL bit is set and the ring buffer
+ * is called before preempt_count() is updated, since the check will
+ * be on the NORMAL bit, the TRANSITION bit will then be set. If an
+ * NMI then comes in, it will set the NMI bit, but when the NMI code
+ * does the trace_recursive_unlock() it will clear the TRANSTION bit
+ * and leave the NMI bit set. But this is fine, because the interrupt
+ * code that set the TRANSITION bit will then clear the NMI bit when it
+ * calls trace_recursive_unlock(). If another NMI comes in, it will
+ * set the TRANSITION bit and continue.
+ *
+ * Note: The TRANSITION bit only handles a single transition between context.
*/
static __always_inline int
@@ -3055,8 +3081,16 @@ trace_recursive_lock(struct ring_buffer_per_cpu *cpu_buffer)
bit = pc & NMI_MASK ? RB_CTX_NMI :
pc & HARDIRQ_MASK ? RB_CTX_IRQ : RB_CTX_SOFTIRQ;
- if (unlikely(val & (1 << (bit + cpu_buffer->nest))))
- return 1;
+ if (unlikely(val & (1 << (bit + cpu_buffer->nest)))) {
+ /*
+ * It is possible that this was called by transitioning
+ * between interrupt context, and preempt_count() has not
+ * been updated yet. In this case, use the TRANSITION bit.
+ */
+ bit = RB_CTX_TRANSITION;
+ if (val & (1 << (bit + cpu_buffer->nest)))
+ return 1;
+ }
val |= (1 << (bit + cpu_buffer->nest));
cpu_buffer->current_context = val;
@@ -3071,8 +3105,8 @@ trace_recursive_unlock(struct ring_buffer_per_cpu *cpu_buffer)
cpu_buffer->current_context - (1 << cpu_buffer->nest);
}
-/* The recursive locking above uses 4 bits */
-#define NESTED_BITS 4
+/* The recursive locking above uses 5 bits */
+#define NESTED_BITS 5
/**
* ring_buffer_nest_start - Allow to trace while nested
--
2.28.0
The patch below does not apply to the 5.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From d5e8782129c22036425f29f9b6a254895482d7bd Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Thu, 15 Oct 2020 12:59:54 +0100
Subject: [PATCH] drm/i915/gem: Support parsing of oversize batches
Matthew Auld noted that on more recent systems (such as the parser for
gen9) we may have objects that are larger than expected by the GEM uAPI
(i.e. greater than u32). These objects would have incorrect implicit
batch lengths, causing the parser to reject them for being incomplete,
or worse.
Based on a patch by Matthew Auld.
Reported-by: Matthew Auld <matthew.auld(a)intel.com>
Fixes: 435e8fc059db ("drm/i915: Allow parsing of unsized batches")
Testcase: igt/gem_exec_params/larger-than-life-batch
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld(a)intel.com>
Cc: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield(a)intel.com>
Reviewed-by: Matthew Auld <matthew.auld(a)intel.com>
Cc: stable(a)vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20201015115954.871-1-chris@ch…
(cherry picked from commit 57b2d834bf235daab388c3ba12d035c820ae09c6)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 4b09bcd70cf4..1904e6e5ea64 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -287,8 +287,8 @@ struct i915_execbuffer {
u64 invalid_flags; /** Set of execobj.flags that are invalid */
u32 context_flags; /** Set of execobj.flags to insert from the ctx */
+ u64 batch_len; /** Length of batch within object */
u32 batch_start_offset; /** Location within object of batch */
- u32 batch_len; /** Length of batch within object */
u32 batch_flags; /** Flags composed for emit_bb_start() */
struct intel_gt_buffer_pool_node *batch_pool; /** pool node for batch buffer */
@@ -871,6 +871,10 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb)
if (eb->batch_len == 0)
eb->batch_len = eb->batch->vma->size - eb->batch_start_offset;
+ if (unlikely(eb->batch_len == 0)) { /* impossible! */
+ drm_dbg(&i915->drm, "Invalid batch length\n");
+ return -EINVAL;
+ }
return 0;
@@ -2424,7 +2428,7 @@ static int eb_parse(struct i915_execbuffer *eb)
struct drm_i915_private *i915 = eb->i915;
struct intel_gt_buffer_pool_node *pool = eb->batch_pool;
struct i915_vma *shadow, *trampoline, *batch;
- unsigned int len;
+ unsigned long len;
int err;
if (!eb_use_cmdparser(eb)) {
@@ -2449,6 +2453,8 @@ static int eb_parse(struct i915_execbuffer *eb)
} else {
len += I915_CMD_PARSER_TRAMPOLINE_SIZE;
}
+ if (unlikely(len < eb->batch_len)) /* last paranoid check of overflow */
+ return -EINVAL;
if (!pool) {
pool = intel_gt_get_buffer_pool(eb->engine->gt, len);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From d5e8782129c22036425f29f9b6a254895482d7bd Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Thu, 15 Oct 2020 12:59:54 +0100
Subject: [PATCH] drm/i915/gem: Support parsing of oversize batches
Matthew Auld noted that on more recent systems (such as the parser for
gen9) we may have objects that are larger than expected by the GEM uAPI
(i.e. greater than u32). These objects would have incorrect implicit
batch lengths, causing the parser to reject them for being incomplete,
or worse.
Based on a patch by Matthew Auld.
Reported-by: Matthew Auld <matthew.auld(a)intel.com>
Fixes: 435e8fc059db ("drm/i915: Allow parsing of unsized batches")
Testcase: igt/gem_exec_params/larger-than-life-batch
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld(a)intel.com>
Cc: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield(a)intel.com>
Reviewed-by: Matthew Auld <matthew.auld(a)intel.com>
Cc: stable(a)vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20201015115954.871-1-chris@ch…
(cherry picked from commit 57b2d834bf235daab388c3ba12d035c820ae09c6)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 4b09bcd70cf4..1904e6e5ea64 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -287,8 +287,8 @@ struct i915_execbuffer {
u64 invalid_flags; /** Set of execobj.flags that are invalid */
u32 context_flags; /** Set of execobj.flags to insert from the ctx */
+ u64 batch_len; /** Length of batch within object */
u32 batch_start_offset; /** Location within object of batch */
- u32 batch_len; /** Length of batch within object */
u32 batch_flags; /** Flags composed for emit_bb_start() */
struct intel_gt_buffer_pool_node *batch_pool; /** pool node for batch buffer */
@@ -871,6 +871,10 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb)
if (eb->batch_len == 0)
eb->batch_len = eb->batch->vma->size - eb->batch_start_offset;
+ if (unlikely(eb->batch_len == 0)) { /* impossible! */
+ drm_dbg(&i915->drm, "Invalid batch length\n");
+ return -EINVAL;
+ }
return 0;
@@ -2424,7 +2428,7 @@ static int eb_parse(struct i915_execbuffer *eb)
struct drm_i915_private *i915 = eb->i915;
struct intel_gt_buffer_pool_node *pool = eb->batch_pool;
struct i915_vma *shadow, *trampoline, *batch;
- unsigned int len;
+ unsigned long len;
int err;
if (!eb_use_cmdparser(eb)) {
@@ -2449,6 +2453,8 @@ static int eb_parse(struct i915_execbuffer *eb)
} else {
len += I915_CMD_PARSER_TRAMPOLINE_SIZE;
}
+ if (unlikely(len < eb->batch_len)) /* last paranoid check of overflow */
+ return -EINVAL;
if (!pool) {
pool = intel_gt_get_buffer_pool(eb->engine->gt, len);
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From d5e8782129c22036425f29f9b6a254895482d7bd Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Thu, 15 Oct 2020 12:59:54 +0100
Subject: [PATCH] drm/i915/gem: Support parsing of oversize batches
Matthew Auld noted that on more recent systems (such as the parser for
gen9) we may have objects that are larger than expected by the GEM uAPI
(i.e. greater than u32). These objects would have incorrect implicit
batch lengths, causing the parser to reject them for being incomplete,
or worse.
Based on a patch by Matthew Auld.
Reported-by: Matthew Auld <matthew.auld(a)intel.com>
Fixes: 435e8fc059db ("drm/i915: Allow parsing of unsized batches")
Testcase: igt/gem_exec_params/larger-than-life-batch
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld(a)intel.com>
Cc: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield(a)intel.com>
Reviewed-by: Matthew Auld <matthew.auld(a)intel.com>
Cc: stable(a)vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20201015115954.871-1-chris@ch…
(cherry picked from commit 57b2d834bf235daab388c3ba12d035c820ae09c6)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 4b09bcd70cf4..1904e6e5ea64 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -287,8 +287,8 @@ struct i915_execbuffer {
u64 invalid_flags; /** Set of execobj.flags that are invalid */
u32 context_flags; /** Set of execobj.flags to insert from the ctx */
+ u64 batch_len; /** Length of batch within object */
u32 batch_start_offset; /** Location within object of batch */
- u32 batch_len; /** Length of batch within object */
u32 batch_flags; /** Flags composed for emit_bb_start() */
struct intel_gt_buffer_pool_node *batch_pool; /** pool node for batch buffer */
@@ -871,6 +871,10 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb)
if (eb->batch_len == 0)
eb->batch_len = eb->batch->vma->size - eb->batch_start_offset;
+ if (unlikely(eb->batch_len == 0)) { /* impossible! */
+ drm_dbg(&i915->drm, "Invalid batch length\n");
+ return -EINVAL;
+ }
return 0;
@@ -2424,7 +2428,7 @@ static int eb_parse(struct i915_execbuffer *eb)
struct drm_i915_private *i915 = eb->i915;
struct intel_gt_buffer_pool_node *pool = eb->batch_pool;
struct i915_vma *shadow, *trampoline, *batch;
- unsigned int len;
+ unsigned long len;
int err;
if (!eb_use_cmdparser(eb)) {
@@ -2449,6 +2453,8 @@ static int eb_parse(struct i915_execbuffer *eb)
} else {
len += I915_CMD_PARSER_TRAMPOLINE_SIZE;
}
+ if (unlikely(len < eb->batch_len)) /* last paranoid check of overflow */
+ return -EINVAL;
if (!pool) {
pool = intel_gt_get_buffer_pool(eb->engine->gt, len);
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From d5e8782129c22036425f29f9b6a254895482d7bd Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Thu, 15 Oct 2020 12:59:54 +0100
Subject: [PATCH] drm/i915/gem: Support parsing of oversize batches
Matthew Auld noted that on more recent systems (such as the parser for
gen9) we may have objects that are larger than expected by the GEM uAPI
(i.e. greater than u32). These objects would have incorrect implicit
batch lengths, causing the parser to reject them for being incomplete,
or worse.
Based on a patch by Matthew Auld.
Reported-by: Matthew Auld <matthew.auld(a)intel.com>
Fixes: 435e8fc059db ("drm/i915: Allow parsing of unsized batches")
Testcase: igt/gem_exec_params/larger-than-life-batch
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld(a)intel.com>
Cc: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield(a)intel.com>
Reviewed-by: Matthew Auld <matthew.auld(a)intel.com>
Cc: stable(a)vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20201015115954.871-1-chris@ch…
(cherry picked from commit 57b2d834bf235daab388c3ba12d035c820ae09c6)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 4b09bcd70cf4..1904e6e5ea64 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -287,8 +287,8 @@ struct i915_execbuffer {
u64 invalid_flags; /** Set of execobj.flags that are invalid */
u32 context_flags; /** Set of execobj.flags to insert from the ctx */
+ u64 batch_len; /** Length of batch within object */
u32 batch_start_offset; /** Location within object of batch */
- u32 batch_len; /** Length of batch within object */
u32 batch_flags; /** Flags composed for emit_bb_start() */
struct intel_gt_buffer_pool_node *batch_pool; /** pool node for batch buffer */
@@ -871,6 +871,10 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb)
if (eb->batch_len == 0)
eb->batch_len = eb->batch->vma->size - eb->batch_start_offset;
+ if (unlikely(eb->batch_len == 0)) { /* impossible! */
+ drm_dbg(&i915->drm, "Invalid batch length\n");
+ return -EINVAL;
+ }
return 0;
@@ -2424,7 +2428,7 @@ static int eb_parse(struct i915_execbuffer *eb)
struct drm_i915_private *i915 = eb->i915;
struct intel_gt_buffer_pool_node *pool = eb->batch_pool;
struct i915_vma *shadow, *trampoline, *batch;
- unsigned int len;
+ unsigned long len;
int err;
if (!eb_use_cmdparser(eb)) {
@@ -2449,6 +2453,8 @@ static int eb_parse(struct i915_execbuffer *eb)
} else {
len += I915_CMD_PARSER_TRAMPOLINE_SIZE;
}
+ if (unlikely(len < eb->batch_len)) /* last paranoid check of overflow */
+ return -EINVAL;
if (!pool) {
pool = intel_gt_get_buffer_pool(eb->engine->gt, len);
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From d5e8782129c22036425f29f9b6a254895482d7bd Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Thu, 15 Oct 2020 12:59:54 +0100
Subject: [PATCH] drm/i915/gem: Support parsing of oversize batches
Matthew Auld noted that on more recent systems (such as the parser for
gen9) we may have objects that are larger than expected by the GEM uAPI
(i.e. greater than u32). These objects would have incorrect implicit
batch lengths, causing the parser to reject them for being incomplete,
or worse.
Based on a patch by Matthew Auld.
Reported-by: Matthew Auld <matthew.auld(a)intel.com>
Fixes: 435e8fc059db ("drm/i915: Allow parsing of unsized batches")
Testcase: igt/gem_exec_params/larger-than-life-batch
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld(a)intel.com>
Cc: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield(a)intel.com>
Reviewed-by: Matthew Auld <matthew.auld(a)intel.com>
Cc: stable(a)vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20201015115954.871-1-chris@ch…
(cherry picked from commit 57b2d834bf235daab388c3ba12d035c820ae09c6)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 4b09bcd70cf4..1904e6e5ea64 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -287,8 +287,8 @@ struct i915_execbuffer {
u64 invalid_flags; /** Set of execobj.flags that are invalid */
u32 context_flags; /** Set of execobj.flags to insert from the ctx */
+ u64 batch_len; /** Length of batch within object */
u32 batch_start_offset; /** Location within object of batch */
- u32 batch_len; /** Length of batch within object */
u32 batch_flags; /** Flags composed for emit_bb_start() */
struct intel_gt_buffer_pool_node *batch_pool; /** pool node for batch buffer */
@@ -871,6 +871,10 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb)
if (eb->batch_len == 0)
eb->batch_len = eb->batch->vma->size - eb->batch_start_offset;
+ if (unlikely(eb->batch_len == 0)) { /* impossible! */
+ drm_dbg(&i915->drm, "Invalid batch length\n");
+ return -EINVAL;
+ }
return 0;
@@ -2424,7 +2428,7 @@ static int eb_parse(struct i915_execbuffer *eb)
struct drm_i915_private *i915 = eb->i915;
struct intel_gt_buffer_pool_node *pool = eb->batch_pool;
struct i915_vma *shadow, *trampoline, *batch;
- unsigned int len;
+ unsigned long len;
int err;
if (!eb_use_cmdparser(eb)) {
@@ -2449,6 +2453,8 @@ static int eb_parse(struct i915_execbuffer *eb)
} else {
len += I915_CMD_PARSER_TRAMPOLINE_SIZE;
}
+ if (unlikely(len < eb->batch_len)) /* last paranoid check of overflow */
+ return -EINVAL;
if (!pool) {
pool = intel_gt_get_buffer_pool(eb->engine->gt, len);
The patch below does not apply to the 4.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From d5e8782129c22036425f29f9b6a254895482d7bd Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Thu, 15 Oct 2020 12:59:54 +0100
Subject: [PATCH] drm/i915/gem: Support parsing of oversize batches
Matthew Auld noted that on more recent systems (such as the parser for
gen9) we may have objects that are larger than expected by the GEM uAPI
(i.e. greater than u32). These objects would have incorrect implicit
batch lengths, causing the parser to reject them for being incomplete,
or worse.
Based on a patch by Matthew Auld.
Reported-by: Matthew Auld <matthew.auld(a)intel.com>
Fixes: 435e8fc059db ("drm/i915: Allow parsing of unsized batches")
Testcase: igt/gem_exec_params/larger-than-life-batch
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Matthew Auld <matthew.auld(a)intel.com>
Cc: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Cc: Jon Bloomfield <jon.bloomfield(a)intel.com>
Reviewed-by: Matthew Auld <matthew.auld(a)intel.com>
Cc: stable(a)vger.kernel.org
Link: https://patchwork.freedesktop.org/patch/msgid/20201015115954.871-1-chris@ch…
(cherry picked from commit 57b2d834bf235daab388c3ba12d035c820ae09c6)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 4b09bcd70cf4..1904e6e5ea64 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -287,8 +287,8 @@ struct i915_execbuffer {
u64 invalid_flags; /** Set of execobj.flags that are invalid */
u32 context_flags; /** Set of execobj.flags to insert from the ctx */
+ u64 batch_len; /** Length of batch within object */
u32 batch_start_offset; /** Location within object of batch */
- u32 batch_len; /** Length of batch within object */
u32 batch_flags; /** Flags composed for emit_bb_start() */
struct intel_gt_buffer_pool_node *batch_pool; /** pool node for batch buffer */
@@ -871,6 +871,10 @@ static int eb_lookup_vmas(struct i915_execbuffer *eb)
if (eb->batch_len == 0)
eb->batch_len = eb->batch->vma->size - eb->batch_start_offset;
+ if (unlikely(eb->batch_len == 0)) { /* impossible! */
+ drm_dbg(&i915->drm, "Invalid batch length\n");
+ return -EINVAL;
+ }
return 0;
@@ -2424,7 +2428,7 @@ static int eb_parse(struct i915_execbuffer *eb)
struct drm_i915_private *i915 = eb->i915;
struct intel_gt_buffer_pool_node *pool = eb->batch_pool;
struct i915_vma *shadow, *trampoline, *batch;
- unsigned int len;
+ unsigned long len;
int err;
if (!eb_use_cmdparser(eb)) {
@@ -2449,6 +2453,8 @@ static int eb_parse(struct i915_execbuffer *eb)
} else {
len += I915_CMD_PARSER_TRAMPOLINE_SIZE;
}
+ if (unlikely(len < eb->batch_len)) /* last paranoid check of overflow */
+ return -EINVAL;
if (!pool) {
pool = intel_gt_get_buffer_pool(eb->engine->gt, len);
The patch below does not apply to the 5.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 4a9bb58aba6db4eba2a8b3aa1edc415c94a669a8 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Tue, 15 Sep 2020 14:49:21 +0100
Subject: [PATCH] drm/i915/gt: Wait for CSB entries on Tigerlake
On Tigerlake, we are seeing a repeat of commit d8f505311717 ("drm/i915/icl:
Forcibly evict stale csb entries") where, presumably, due to a missing
Global Observation Point synchronisation, the write pointer of the CSB
ringbuffer is updated _prior_ to the contents of the ringbuffer. That is
we see the GPU report more context-switch entries for us to parse, but
those entries have not been written, leading us to process stale events,
and eventually report a hung GPU.
However, this effect appears to be much more severe than we previously
saw on Icelake (though it might be best if we try the same approach
there as well and measure), and Bruce suggested the good idea of resetting
the CSB entry after use so that we can detect when it has been updated by
the GPU. By instrumenting how long that may be, we can set a reliable
upper bound for how long we should wait for:
513 late, avg of 61 retries (590 ns), max of 1061 retries (10099 ns)
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2045
References: d8f505311717 ("drm/i915/icl: Forcibly evict stale csb entries")
References: HSDES#22011327657, HSDES#1508287568
Suggested-by: Bruce Chang <yu.bruce.chang(a)intel.com>
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Bruce Chang <yu.bruce.chang(a)intel.com>
Cc: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Cc: stable(a)vger.kernel.org # v5.4
Reviewed-by: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915134923.30088-2-chris@…
(cherry picked from commit 233c1ae3c83f21046c6c4083da904163ece8f110)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 1a7bbbb16356..a32aabce7901 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -2497,9 +2497,22 @@ invalidate_csb_entries(const u64 *first, const u64 *last)
*/
static inline bool gen12_csb_parse(const u64 *csb)
{
- u64 entry = READ_ONCE(*csb);
- bool ctx_away_valid = GEN12_CSB_CTX_VALID(upper_32_bits(entry));
- bool new_queue =
+ bool ctx_away_valid;
+ bool new_queue;
+ u64 entry;
+
+ /* HSD#22011248461 */
+ entry = READ_ONCE(*csb);
+ if (unlikely(entry == -1)) {
+ preempt_disable();
+ if (wait_for_atomic_us((entry = READ_ONCE(*csb)) != -1, 50))
+ GEM_WARN_ON("50us CSB timeout");
+ preempt_enable();
+ }
+ WRITE_ONCE(*(u64 *)csb, -1);
+
+ ctx_away_valid = GEN12_CSB_CTX_VALID(upper_32_bits(entry));
+ new_queue =
lower_32_bits(entry) & GEN12_CTX_STATUS_SWITCHED_TO_NEW_QUEUE;
/*
@@ -4006,6 +4019,8 @@ static void reset_csb_pointers(struct intel_engine_cs *engine)
WRITE_ONCE(*execlists->csb_write, reset_value);
wmb(); /* Make sure this is visible to HW (paranoia?) */
+ /* Check that the GPU does indeed update the CSB entries! */
+ memset(execlists->csb_status, -1, (reset_value + 1) * sizeof(u64));
invalidate_csb_entries(&execlists->csb_status[0],
&execlists->csb_status[reset_value]);
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From 4a9bb58aba6db4eba2a8b3aa1edc415c94a669a8 Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Tue, 15 Sep 2020 14:49:21 +0100
Subject: [PATCH] drm/i915/gt: Wait for CSB entries on Tigerlake
On Tigerlake, we are seeing a repeat of commit d8f505311717 ("drm/i915/icl:
Forcibly evict stale csb entries") where, presumably, due to a missing
Global Observation Point synchronisation, the write pointer of the CSB
ringbuffer is updated _prior_ to the contents of the ringbuffer. That is
we see the GPU report more context-switch entries for us to parse, but
those entries have not been written, leading us to process stale events,
and eventually report a hung GPU.
However, this effect appears to be much more severe than we previously
saw on Icelake (though it might be best if we try the same approach
there as well and measure), and Bruce suggested the good idea of resetting
the CSB entry after use so that we can detect when it has been updated by
the GPU. By instrumenting how long that may be, we can set a reliable
upper bound for how long we should wait for:
513 late, avg of 61 retries (590 ns), max of 1061 retries (10099 ns)
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/2045
References: d8f505311717 ("drm/i915/icl: Forcibly evict stale csb entries")
References: HSDES#22011327657, HSDES#1508287568
Suggested-by: Bruce Chang <yu.bruce.chang(a)intel.com>
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Bruce Chang <yu.bruce.chang(a)intel.com>
Cc: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Cc: stable(a)vger.kernel.org # v5.4
Reviewed-by: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200915134923.30088-2-chris@…
(cherry picked from commit 233c1ae3c83f21046c6c4083da904163ece8f110)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
diff --git a/drivers/gpu/drm/i915/gt/intel_lrc.c b/drivers/gpu/drm/i915/gt/intel_lrc.c
index 1a7bbbb16356..a32aabce7901 100644
--- a/drivers/gpu/drm/i915/gt/intel_lrc.c
+++ b/drivers/gpu/drm/i915/gt/intel_lrc.c
@@ -2497,9 +2497,22 @@ invalidate_csb_entries(const u64 *first, const u64 *last)
*/
static inline bool gen12_csb_parse(const u64 *csb)
{
- u64 entry = READ_ONCE(*csb);
- bool ctx_away_valid = GEN12_CSB_CTX_VALID(upper_32_bits(entry));
- bool new_queue =
+ bool ctx_away_valid;
+ bool new_queue;
+ u64 entry;
+
+ /* HSD#22011248461 */
+ entry = READ_ONCE(*csb);
+ if (unlikely(entry == -1)) {
+ preempt_disable();
+ if (wait_for_atomic_us((entry = READ_ONCE(*csb)) != -1, 50))
+ GEM_WARN_ON("50us CSB timeout");
+ preempt_enable();
+ }
+ WRITE_ONCE(*(u64 *)csb, -1);
+
+ ctx_away_valid = GEN12_CSB_CTX_VALID(upper_32_bits(entry));
+ new_queue =
lower_32_bits(entry) & GEN12_CTX_STATUS_SWITCHED_TO_NEW_QUEUE;
/*
@@ -4006,6 +4019,8 @@ static void reset_csb_pointers(struct intel_engine_cs *engine)
WRITE_ONCE(*execlists->csb_write, reset_value);
wmb(); /* Make sure this is visible to HW (paranoia?) */
+ /* Check that the GPU does indeed update the CSB entries! */
+ memset(execlists->csb_status, -1, (reset_value + 1) * sizeof(u64));
invalidate_csb_entries(&execlists->csb_status[0],
&execlists->csb_status[reset_value]);
The patch below does not apply to the 4.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From c60b93cd4862d108214a14e655358ea714d7a12a Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Mon, 28 Sep 2020 22:59:42 +0100
Subject: [PATCH] drm/i915: Avoid mixing integer types during batch copies
Be consistent and use unsigned long throughout the chunk copies to
avoid the inherent clumsiness of mixing integer types of different
widths and signs. Failing to take acount of a wider unsigned type when
using min_t can lead to treating it as a negative, only for it flip back
to a large unsigned value after passing a boundary check.
Fixes: ed13033f0287 ("drm/i915/cmdparser: Only cache the dst vmap")
Testcase: igt/gen9_exec_parse/bb-large
Reported-by: "Candelaria, Jared" <jared.candelaria(a)intel.com>
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen(a)linux.intel.com>
Cc: "Candelaria, Jared" <jared.candelaria(a)intel.com>
Cc: "Bloomfield, Jon" <jon.bloomfield(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v4.9+
Reviewed-by: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200928215942.31917-1-chris@…
(cherry picked from commit b7eeb2b4132ccf1a7d38f434cde7043913d1ed3c)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 5509946f1a1d..4b09bcd70cf4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2267,8 +2267,8 @@ struct eb_parse_work {
struct i915_vma *batch;
struct i915_vma *shadow;
struct i915_vma *trampoline;
- unsigned int batch_offset;
- unsigned int batch_length;
+ unsigned long batch_offset;
+ unsigned long batch_length;
};
static int __eb_parse(struct dma_fence_work *work)
@@ -2338,6 +2338,9 @@ static int eb_parse_pipeline(struct i915_execbuffer *eb,
struct eb_parse_work *pw;
int err;
+ GEM_BUG_ON(overflows_type(eb->batch_start_offset, pw->batch_offset));
+ GEM_BUG_ON(overflows_type(eb->batch_len, pw->batch_length));
+
pw = kzalloc(sizeof(*pw), GFP_KERNEL);
if (!pw)
return -ENOMEM;
diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 5ac4a999f05a..e88970256e8e 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -1136,7 +1136,7 @@ find_reg(const struct intel_engine_cs *engine, u32 addr)
/* Returns a vmap'd pointer to dst_obj, which the caller must unmap */
static u32 *copy_batch(struct drm_i915_gem_object *dst_obj,
struct drm_i915_gem_object *src_obj,
- u32 offset, u32 length)
+ unsigned long offset, unsigned long length)
{
bool needs_clflush;
void *dst, *src;
@@ -1166,8 +1166,8 @@ static u32 *copy_batch(struct drm_i915_gem_object *dst_obj,
}
}
if (IS_ERR(src)) {
+ unsigned long x, n;
void *ptr;
- int x, n;
/*
* We can avoid clflushing partial cachelines before the write
@@ -1184,7 +1184,7 @@ static u32 *copy_batch(struct drm_i915_gem_object *dst_obj,
ptr = dst;
x = offset_in_page(offset);
for (n = offset >> PAGE_SHIFT; length; n++) {
- int len = min_t(int, length, PAGE_SIZE - x);
+ int len = min(length, PAGE_SIZE - x);
src = kmap_atomic(i915_gem_object_get_page(src_obj, n));
if (needs_clflush)
@@ -1414,8 +1414,8 @@ static bool shadow_needs_clflush(struct drm_i915_gem_object *obj)
*/
int intel_engine_cmd_parser(struct intel_engine_cs *engine,
struct i915_vma *batch,
- u32 batch_offset,
- u32 batch_length,
+ unsigned long batch_offset,
+ unsigned long batch_length,
struct i915_vma *shadow,
bool trampoline)
{
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 72a9449b674e..eef9a821c49c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1949,8 +1949,8 @@ void intel_engine_init_cmd_parser(struct intel_engine_cs *engine);
void intel_engine_cleanup_cmd_parser(struct intel_engine_cs *engine);
int intel_engine_cmd_parser(struct intel_engine_cs *engine,
struct i915_vma *batch,
- u32 batch_offset,
- u32 batch_length,
+ unsigned long batch_offset,
+ unsigned long batch_length,
struct i915_vma *shadow,
bool trampoline);
#define I915_CMD_PARSER_TRAMPOLINE_SIZE 8
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From c60b93cd4862d108214a14e655358ea714d7a12a Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Mon, 28 Sep 2020 22:59:42 +0100
Subject: [PATCH] drm/i915: Avoid mixing integer types during batch copies
Be consistent and use unsigned long throughout the chunk copies to
avoid the inherent clumsiness of mixing integer types of different
widths and signs. Failing to take acount of a wider unsigned type when
using min_t can lead to treating it as a negative, only for it flip back
to a large unsigned value after passing a boundary check.
Fixes: ed13033f0287 ("drm/i915/cmdparser: Only cache the dst vmap")
Testcase: igt/gen9_exec_parse/bb-large
Reported-by: "Candelaria, Jared" <jared.candelaria(a)intel.com>
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen(a)linux.intel.com>
Cc: "Candelaria, Jared" <jared.candelaria(a)intel.com>
Cc: "Bloomfield, Jon" <jon.bloomfield(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v4.9+
Reviewed-by: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200928215942.31917-1-chris@…
(cherry picked from commit b7eeb2b4132ccf1a7d38f434cde7043913d1ed3c)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 5509946f1a1d..4b09bcd70cf4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2267,8 +2267,8 @@ struct eb_parse_work {
struct i915_vma *batch;
struct i915_vma *shadow;
struct i915_vma *trampoline;
- unsigned int batch_offset;
- unsigned int batch_length;
+ unsigned long batch_offset;
+ unsigned long batch_length;
};
static int __eb_parse(struct dma_fence_work *work)
@@ -2338,6 +2338,9 @@ static int eb_parse_pipeline(struct i915_execbuffer *eb,
struct eb_parse_work *pw;
int err;
+ GEM_BUG_ON(overflows_type(eb->batch_start_offset, pw->batch_offset));
+ GEM_BUG_ON(overflows_type(eb->batch_len, pw->batch_length));
+
pw = kzalloc(sizeof(*pw), GFP_KERNEL);
if (!pw)
return -ENOMEM;
diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 5ac4a999f05a..e88970256e8e 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -1136,7 +1136,7 @@ find_reg(const struct intel_engine_cs *engine, u32 addr)
/* Returns a vmap'd pointer to dst_obj, which the caller must unmap */
static u32 *copy_batch(struct drm_i915_gem_object *dst_obj,
struct drm_i915_gem_object *src_obj,
- u32 offset, u32 length)
+ unsigned long offset, unsigned long length)
{
bool needs_clflush;
void *dst, *src;
@@ -1166,8 +1166,8 @@ static u32 *copy_batch(struct drm_i915_gem_object *dst_obj,
}
}
if (IS_ERR(src)) {
+ unsigned long x, n;
void *ptr;
- int x, n;
/*
* We can avoid clflushing partial cachelines before the write
@@ -1184,7 +1184,7 @@ static u32 *copy_batch(struct drm_i915_gem_object *dst_obj,
ptr = dst;
x = offset_in_page(offset);
for (n = offset >> PAGE_SHIFT; length; n++) {
- int len = min_t(int, length, PAGE_SIZE - x);
+ int len = min(length, PAGE_SIZE - x);
src = kmap_atomic(i915_gem_object_get_page(src_obj, n));
if (needs_clflush)
@@ -1414,8 +1414,8 @@ static bool shadow_needs_clflush(struct drm_i915_gem_object *obj)
*/
int intel_engine_cmd_parser(struct intel_engine_cs *engine,
struct i915_vma *batch,
- u32 batch_offset,
- u32 batch_length,
+ unsigned long batch_offset,
+ unsigned long batch_length,
struct i915_vma *shadow,
bool trampoline)
{
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 72a9449b674e..eef9a821c49c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1949,8 +1949,8 @@ void intel_engine_init_cmd_parser(struct intel_engine_cs *engine);
void intel_engine_cleanup_cmd_parser(struct intel_engine_cs *engine);
int intel_engine_cmd_parser(struct intel_engine_cs *engine,
struct i915_vma *batch,
- u32 batch_offset,
- u32 batch_length,
+ unsigned long batch_offset,
+ unsigned long batch_length,
struct i915_vma *shadow,
bool trampoline);
#define I915_CMD_PARSER_TRAMPOLINE_SIZE 8
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From c60b93cd4862d108214a14e655358ea714d7a12a Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Mon, 28 Sep 2020 22:59:42 +0100
Subject: [PATCH] drm/i915: Avoid mixing integer types during batch copies
Be consistent and use unsigned long throughout the chunk copies to
avoid the inherent clumsiness of mixing integer types of different
widths and signs. Failing to take acount of a wider unsigned type when
using min_t can lead to treating it as a negative, only for it flip back
to a large unsigned value after passing a boundary check.
Fixes: ed13033f0287 ("drm/i915/cmdparser: Only cache the dst vmap")
Testcase: igt/gen9_exec_parse/bb-large
Reported-by: "Candelaria, Jared" <jared.candelaria(a)intel.com>
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen(a)linux.intel.com>
Cc: "Candelaria, Jared" <jared.candelaria(a)intel.com>
Cc: "Bloomfield, Jon" <jon.bloomfield(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v4.9+
Reviewed-by: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200928215942.31917-1-chris@…
(cherry picked from commit b7eeb2b4132ccf1a7d38f434cde7043913d1ed3c)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 5509946f1a1d..4b09bcd70cf4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2267,8 +2267,8 @@ struct eb_parse_work {
struct i915_vma *batch;
struct i915_vma *shadow;
struct i915_vma *trampoline;
- unsigned int batch_offset;
- unsigned int batch_length;
+ unsigned long batch_offset;
+ unsigned long batch_length;
};
static int __eb_parse(struct dma_fence_work *work)
@@ -2338,6 +2338,9 @@ static int eb_parse_pipeline(struct i915_execbuffer *eb,
struct eb_parse_work *pw;
int err;
+ GEM_BUG_ON(overflows_type(eb->batch_start_offset, pw->batch_offset));
+ GEM_BUG_ON(overflows_type(eb->batch_len, pw->batch_length));
+
pw = kzalloc(sizeof(*pw), GFP_KERNEL);
if (!pw)
return -ENOMEM;
diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 5ac4a999f05a..e88970256e8e 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -1136,7 +1136,7 @@ find_reg(const struct intel_engine_cs *engine, u32 addr)
/* Returns a vmap'd pointer to dst_obj, which the caller must unmap */
static u32 *copy_batch(struct drm_i915_gem_object *dst_obj,
struct drm_i915_gem_object *src_obj,
- u32 offset, u32 length)
+ unsigned long offset, unsigned long length)
{
bool needs_clflush;
void *dst, *src;
@@ -1166,8 +1166,8 @@ static u32 *copy_batch(struct drm_i915_gem_object *dst_obj,
}
}
if (IS_ERR(src)) {
+ unsigned long x, n;
void *ptr;
- int x, n;
/*
* We can avoid clflushing partial cachelines before the write
@@ -1184,7 +1184,7 @@ static u32 *copy_batch(struct drm_i915_gem_object *dst_obj,
ptr = dst;
x = offset_in_page(offset);
for (n = offset >> PAGE_SHIFT; length; n++) {
- int len = min_t(int, length, PAGE_SIZE - x);
+ int len = min(length, PAGE_SIZE - x);
src = kmap_atomic(i915_gem_object_get_page(src_obj, n));
if (needs_clflush)
@@ -1414,8 +1414,8 @@ static bool shadow_needs_clflush(struct drm_i915_gem_object *obj)
*/
int intel_engine_cmd_parser(struct intel_engine_cs *engine,
struct i915_vma *batch,
- u32 batch_offset,
- u32 batch_length,
+ unsigned long batch_offset,
+ unsigned long batch_length,
struct i915_vma *shadow,
bool trampoline)
{
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 72a9449b674e..eef9a821c49c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1949,8 +1949,8 @@ void intel_engine_init_cmd_parser(struct intel_engine_cs *engine);
void intel_engine_cleanup_cmd_parser(struct intel_engine_cs *engine);
int intel_engine_cmd_parser(struct intel_engine_cs *engine,
struct i915_vma *batch,
- u32 batch_offset,
- u32 batch_length,
+ unsigned long batch_offset,
+ unsigned long batch_length,
struct i915_vma *shadow,
bool trampoline);
#define I915_CMD_PARSER_TRAMPOLINE_SIZE 8
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From c60b93cd4862d108214a14e655358ea714d7a12a Mon Sep 17 00:00:00 2001
From: Chris Wilson <chris(a)chris-wilson.co.uk>
Date: Mon, 28 Sep 2020 22:59:42 +0100
Subject: [PATCH] drm/i915: Avoid mixing integer types during batch copies
Be consistent and use unsigned long throughout the chunk copies to
avoid the inherent clumsiness of mixing integer types of different
widths and signs. Failing to take acount of a wider unsigned type when
using min_t can lead to treating it as a negative, only for it flip back
to a large unsigned value after passing a boundary check.
Fixes: ed13033f0287 ("drm/i915/cmdparser: Only cache the dst vmap")
Testcase: igt/gen9_exec_parse/bb-large
Reported-by: "Candelaria, Jared" <jared.candelaria(a)intel.com>
Signed-off-by: Chris Wilson <chris(a)chris-wilson.co.uk>
Cc: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen(a)linux.intel.com>
Cc: "Candelaria, Jared" <jared.candelaria(a)intel.com>
Cc: "Bloomfield, Jon" <jon.bloomfield(a)intel.com>
Cc: <stable(a)vger.kernel.org> # v4.9+
Reviewed-by: Mika Kuoppala <mika.kuoppala(a)linux.intel.com>
Link: https://patchwork.freedesktop.org/patch/msgid/20200928215942.31917-1-chris@…
(cherry picked from commit b7eeb2b4132ccf1a7d38f434cde7043913d1ed3c)
Signed-off-by: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
diff --git a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
index 5509946f1a1d..4b09bcd70cf4 100644
--- a/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
+++ b/drivers/gpu/drm/i915/gem/i915_gem_execbuffer.c
@@ -2267,8 +2267,8 @@ struct eb_parse_work {
struct i915_vma *batch;
struct i915_vma *shadow;
struct i915_vma *trampoline;
- unsigned int batch_offset;
- unsigned int batch_length;
+ unsigned long batch_offset;
+ unsigned long batch_length;
};
static int __eb_parse(struct dma_fence_work *work)
@@ -2338,6 +2338,9 @@ static int eb_parse_pipeline(struct i915_execbuffer *eb,
struct eb_parse_work *pw;
int err;
+ GEM_BUG_ON(overflows_type(eb->batch_start_offset, pw->batch_offset));
+ GEM_BUG_ON(overflows_type(eb->batch_len, pw->batch_length));
+
pw = kzalloc(sizeof(*pw), GFP_KERNEL);
if (!pw)
return -ENOMEM;
diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c
index 5ac4a999f05a..e88970256e8e 100644
--- a/drivers/gpu/drm/i915/i915_cmd_parser.c
+++ b/drivers/gpu/drm/i915/i915_cmd_parser.c
@@ -1136,7 +1136,7 @@ find_reg(const struct intel_engine_cs *engine, u32 addr)
/* Returns a vmap'd pointer to dst_obj, which the caller must unmap */
static u32 *copy_batch(struct drm_i915_gem_object *dst_obj,
struct drm_i915_gem_object *src_obj,
- u32 offset, u32 length)
+ unsigned long offset, unsigned long length)
{
bool needs_clflush;
void *dst, *src;
@@ -1166,8 +1166,8 @@ static u32 *copy_batch(struct drm_i915_gem_object *dst_obj,
}
}
if (IS_ERR(src)) {
+ unsigned long x, n;
void *ptr;
- int x, n;
/*
* We can avoid clflushing partial cachelines before the write
@@ -1184,7 +1184,7 @@ static u32 *copy_batch(struct drm_i915_gem_object *dst_obj,
ptr = dst;
x = offset_in_page(offset);
for (n = offset >> PAGE_SHIFT; length; n++) {
- int len = min_t(int, length, PAGE_SIZE - x);
+ int len = min(length, PAGE_SIZE - x);
src = kmap_atomic(i915_gem_object_get_page(src_obj, n));
if (needs_clflush)
@@ -1414,8 +1414,8 @@ static bool shadow_needs_clflush(struct drm_i915_gem_object *obj)
*/
int intel_engine_cmd_parser(struct intel_engine_cs *engine,
struct i915_vma *batch,
- u32 batch_offset,
- u32 batch_length,
+ unsigned long batch_offset,
+ unsigned long batch_length,
struct i915_vma *shadow,
bool trampoline)
{
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index 72a9449b674e..eef9a821c49c 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -1949,8 +1949,8 @@ void intel_engine_init_cmd_parser(struct intel_engine_cs *engine);
void intel_engine_cleanup_cmd_parser(struct intel_engine_cs *engine);
int intel_engine_cmd_parser(struct intel_engine_cs *engine,
struct i915_vma *batch,
- u32 batch_offset,
- u32 batch_length,
+ unsigned long batch_offset,
+ unsigned long batch_length,
struct i915_vma *shadow,
bool trampoline);
#define I915_CMD_PARSER_TRAMPOLINE_SIZE 8
I'm announcing the release of the 5.9.4 kernel.
This is only a bugfix for the 5.9.3 kernel release which had some
problems with some symlinks for the powerpc selftests. This problem was
caused by issues in going from git->patch->quilt->git and things got a
bit messed up. To resolve this, I reverted the offending patch and a
prerequsite one, and then used 'git cherry-pick' to backport the patches
properly, which preserved the links correctly.
Many thanks to Marc Aurèle La Franc and others for helping me notice
this and provide some solutions for it.
If you had no issues with these files in 5.9.3, no need to upgrade at
this time.
The updated 5.9.y git tree can be found at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git linux-5.9.y
and can be browsed at the normal kernel.org git web browser:
https://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=summary
thanks,
greg k-h
------------
Makefile | 2
tools/testing/selftests/powerpc/copyloops/copy_mc_64.S | 243 -----------
tools/testing/selftests/powerpc/copyloops/memcpy_mcsafe_64.S | 1
3 files changed, 2 insertions(+), 244 deletions(-)
Dan Williams (2):
x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()
x86/copy_mc: Introduce copy_mc_enhanced_fast_string()
Greg Kroah-Hartman (3):
Revert "x86/copy_mc: Introduce copy_mc_enhanced_fast_string()"
Revert "x86, powerpc: Rename memcpy_mcsafe() to copy_mc_to_{user, kernel}()"
Linux 5.9.4
ID registers are RAZ until they've been allocated a purpose, but
that doesn't mean they should be removed from the KVM_GET_REG_LIST
list. So far we only have one register, SYS_ID_AA64ZFR0_EL1, that
is hidden from userspace when its function, SVE, is not present.
Expose SYS_ID_AA64ZFR0_EL1 to userspace as RAZ when SVE is not
implemented. Removing the userspace visibility checks is enough
to reexpose it, as it will already return zero to userspace when
SVE is not present. The register already behaves as RAZ for the
guest when SVE is not present.
Fixes: 73433762fcae ("KVM: arm64/sve: System register context switch and access support")
Cc: stable(a)vger.kernel.org#v5.2+
Reported-by: 张东旭 <xu910121(a)sina.com>
Signed-off-by: Andrew Jones <drjones(a)redhat.com>
---
arch/arm64/kvm/sys_regs.c | 18 +-----------------
1 file changed, 1 insertion(+), 17 deletions(-)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index fb12d3ef423a..6ff0c15531ca 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1195,16 +1195,6 @@ static unsigned int sve_visibility(const struct kvm_vcpu *vcpu,
return REG_HIDDEN_USER | REG_HIDDEN_GUEST;
}
-/* Visibility overrides for SVE-specific ID registers */
-static unsigned int sve_id_visibility(const struct kvm_vcpu *vcpu,
- const struct sys_reg_desc *rd)
-{
- if (vcpu_has_sve(vcpu))
- return 0;
-
- return REG_HIDDEN_USER;
-}
-
/* Generate the emulated ID_AA64ZFR0_EL1 value exposed to the guest */
static u64 guest_id_aa64zfr0_el1(const struct kvm_vcpu *vcpu)
{
@@ -1231,9 +1221,6 @@ static int get_id_aa64zfr0_el1(struct kvm_vcpu *vcpu,
{
u64 val;
- if (WARN_ON(!vcpu_has_sve(vcpu)))
- return -ENOENT;
-
val = guest_id_aa64zfr0_el1(vcpu);
return reg_to_user(uaddr, &val, reg->id);
}
@@ -1246,9 +1233,6 @@ static int set_id_aa64zfr0_el1(struct kvm_vcpu *vcpu,
int err;
u64 val;
- if (WARN_ON(!vcpu_has_sve(vcpu)))
- return -ENOENT;
-
err = reg_from_user(&val, uaddr, id);
if (err)
return err;
@@ -1518,7 +1502,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
ID_SANITISED(ID_AA64PFR1_EL1),
ID_UNALLOCATED(4,2),
ID_UNALLOCATED(4,3),
- { SYS_DESC(SYS_ID_AA64ZFR0_EL1), access_id_aa64zfr0_el1, .get_user = get_id_aa64zfr0_el1, .set_user = set_id_aa64zfr0_el1, .visibility = sve_id_visibility },
+ { SYS_DESC(SYS_ID_AA64ZFR0_EL1), access_id_aa64zfr0_el1, .get_user = get_id_aa64zfr0_el1, .set_user = set_id_aa64zfr0_el1, },
ID_UNALLOCATED(4,5),
ID_UNALLOCATED(4,6),
ID_UNALLOCATED(4,7),
--
2.26.2
The parallel-port restore operations is called when a driver claims the
port and is supposed to restore the provided state (e.g. saved when
releasing the port).
Fixes: b69578df7e98 ("USB: usbserial: mos7720: add support for parallel port on moschip 7715")
Cc: stable <stable(a)vger.kernel.org> # 2.6.35
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/usb/serial/mos7720.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/usb/serial/mos7720.c b/drivers/usb/serial/mos7720.c
index 5eed1078fac8..5a5d2a95070e 100644
--- a/drivers/usb/serial/mos7720.c
+++ b/drivers/usb/serial/mos7720.c
@@ -639,6 +639,8 @@ static void parport_mos7715_restore_state(struct parport *pp,
spin_unlock(&release_lock);
return;
}
+ mos_parport->shadowDCR = s->u.pc.ctr;
+ mos_parport->shadowECR = s->u.pc.ecr;
write_parport_reg_nonblock(mos_parport, MOS7720_DCR,
mos_parport->shadowDCR);
write_parport_reg_nonblock(mos_parport, MOS7720_ECR,
--
2.26.2
The reliance on the remoteproc's state for determining when to send
sysmon notifications to a remote processor is racy with regard to
concurrent remoteproc operations.
Further more the advertisement of the state of other remote processor to
a newly started remote processor might not only send the wrong state,
but might result in a stream of state changes that are out of order.
Address this by introducing state tracking within the sysmon instances
themselves and extend the locking to ensure that the notifications are
consistent with this state.
The use of a big lock for all instances will cause contention for
concurrent remote processor state transitions, but the correctness of
the remote processors' view of their peers is more important.
Fixes: 1f36ab3f6e3b ("remoteproc: sysmon: Inform current rproc about all active rprocs")
Fixes: 1877f54f75ad ("remoteproc: sysmon: Add notifications for events")
Fixes: 1fb82ee806d1 ("remoteproc: qcom: Introduce sysmon")
Cc: stable(a)vger.kernel.org
Signed-off-by: Bjorn Andersson <bjorn.andersson(a)linaro.org>
---
drivers/remoteproc/qcom_sysmon.c | 20 ++++++++++++++++----
1 file changed, 16 insertions(+), 4 deletions(-)
diff --git a/drivers/remoteproc/qcom_sysmon.c b/drivers/remoteproc/qcom_sysmon.c
index 9eb2f6bccea6..1e507b66354a 100644
--- a/drivers/remoteproc/qcom_sysmon.c
+++ b/drivers/remoteproc/qcom_sysmon.c
@@ -22,6 +22,8 @@ struct qcom_sysmon {
struct rproc_subdev subdev;
struct rproc *rproc;
+ int state;
+
struct list_head node;
const char *name;
@@ -448,7 +450,10 @@ static int sysmon_prepare(struct rproc_subdev *subdev)
.ssr_event = SSCTL_SSR_EVENT_BEFORE_POWERUP
};
+ mutex_lock(&sysmon_lock);
+ sysmon->state = SSCTL_SSR_EVENT_BEFORE_POWERUP;
blocking_notifier_call_chain(&sysmon_notifiers, 0, (void *)&event);
+ mutex_unlock(&sysmon_lock);
return 0;
}
@@ -472,15 +477,16 @@ static int sysmon_start(struct rproc_subdev *subdev)
.ssr_event = SSCTL_SSR_EVENT_AFTER_POWERUP
};
+ mutex_lock(&sysmon_lock);
+ sysmon->state = SSCTL_SSR_EVENT_AFTER_POWERUP;
blocking_notifier_call_chain(&sysmon_notifiers, 0, (void *)&event);
- mutex_lock(&sysmon_lock);
list_for_each_entry(target, &sysmon_list, node) {
- if (target == sysmon ||
- target->rproc->state != RPROC_RUNNING)
+ if (target == sysmon)
continue;
event.subsys_name = target->name;
+ event.ssr_event = target->state;
if (sysmon->ssctl_version == 2)
ssctl_send_event(sysmon, &event);
@@ -500,7 +506,10 @@ static void sysmon_stop(struct rproc_subdev *subdev, bool crashed)
.ssr_event = SSCTL_SSR_EVENT_BEFORE_SHUTDOWN
};
+ mutex_lock(&sysmon_lock);
+ sysmon->state = SSCTL_SSR_EVENT_BEFORE_SHUTDOWN;
blocking_notifier_call_chain(&sysmon_notifiers, 0, (void *)&event);
+ mutex_unlock(&sysmon_lock);
/* Don't request graceful shutdown if we've crashed */
if (crashed)
@@ -521,7 +530,10 @@ static void sysmon_unprepare(struct rproc_subdev *subdev)
.ssr_event = SSCTL_SSR_EVENT_AFTER_SHUTDOWN
};
+ mutex_lock(&sysmon_lock);
+ sysmon->state = SSCTL_SSR_EVENT_AFTER_SHUTDOWN;
blocking_notifier_call_chain(&sysmon_notifiers, 0, (void *)&event);
+ mutex_unlock(&sysmon_lock);
}
/**
@@ -538,7 +550,7 @@ static int sysmon_notify(struct notifier_block *nb, unsigned long event,
struct sysmon_event *sysmon_event = data;
/* Skip non-running rprocs and the originating instance */
- if (rproc->state != RPROC_RUNNING ||
+ if (sysmon->state != SSCTL_SSR_EVENT_AFTER_POWERUP ||
!strcmp(sysmon_event->subsys_name, sysmon->name)) {
dev_dbg(sysmon->dev, "not notifying %s\n", sysmon->name);
return NOTIFY_DONE;
--
2.28.0
The following commit has been merged into the core/urgent branch of tip:
Commit-ID: 9d820f68b2bdba5b2e7bf135123c3f57c5051d05
Gitweb: https://git.kernel.org/tip/9d820f68b2bdba5b2e7bf135123c3f57c5051d05
Author: Thomas Gleixner <tglx(a)linutronix.de>
AuthorDate: Wed, 04 Nov 2020 14:06:23 +01:00
Committer: Thomas Gleixner <tglx(a)linutronix.de>
CommitterDate: Wed, 04 Nov 2020 18:06:14 +01:00
entry: Fix the incorrect ordering of lockdep and RCU check
When an exception/interrupt hits kernel space and the kernel is not
currently in the idle task then RCU must be watching.
irqentry_enter() validates this via rcu_irq_enter_check_tick(), which in
turn invokes lockdep when taking a lock. But at that point lockdep does not
yet know about the fact that interrupts have been disabled by the CPU,
which triggers a lockdep splat complaining about inconsistent state.
Invoking trace_hardirqs_off() before rcu_irq_enter_check_tick() defeats the
point of rcu_irq_enter_check_tick() because trace_hardirqs_off() uses RCU.
So use the same sequence as for the idle case and tell lockdep about the
irq state change first, invoke the RCU check and then do the lockdep and
tracer update.
Fixes: a5497bab5f72 ("entry: Provide generic interrupt entry/exit code")
Reported-by: Mark Rutland <mark.rutland(a)arm.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>
Tested-by: Mark Rutland <mark.rutland(a)arm.com>
Cc: stable(a)vger.kernel.org
Link: https://lore.kernel.org/r/87y2jhl19s.fsf@nanos.tec.linutronix.de
---
kernel/entry/common.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/kernel/entry/common.c b/kernel/entry/common.c
index 2b83666..e9e2df3 100644
--- a/kernel/entry/common.c
+++ b/kernel/entry/common.c
@@ -337,10 +337,10 @@ noinstr irqentry_state_t irqentry_enter(struct pt_regs *regs)
* already contains a warning when RCU is not watching, so no point
* in having another one here.
*/
+ lockdep_hardirqs_off(CALLER_ADDR0);
instrumentation_begin();
rcu_irq_enter_check_tick();
- /* Use the combo lockdep/tracing function */
- trace_hardirqs_off();
+ trace_hardirqs_off_finish();
instrumentation_end();
return ret;
ID registers are RAZ until they've been allocated a purpose, but
that doesn't mean they should be removed from the KVM_GET_REG_LIST
list. So far we only have one register, SYS_ID_AA64ZFR0_EL1, that
is hidden from userspace when its function is not present. Removing
the userspace visibility checks is enough to reexpose it, as it
already behaves as RAZ when the function is not present.
Fixes: 73433762fcae ("KVM: arm64/sve: System register context switch and access support")
Cc: <stable(a)vger.kernel.org> # v5.2+
Reported-by: 张东旭 <xu910121(a)sina.com>
Signed-off-by: Andrew Jones <drjones(a)redhat.com>
---
arch/arm64/kvm/sys_regs.c | 18 +-----------------
1 file changed, 1 insertion(+), 17 deletions(-)
diff --git a/arch/arm64/kvm/sys_regs.c b/arch/arm64/kvm/sys_regs.c
index fb12d3ef423a..6ff0c15531ca 100644
--- a/arch/arm64/kvm/sys_regs.c
+++ b/arch/arm64/kvm/sys_regs.c
@@ -1195,16 +1195,6 @@ static unsigned int sve_visibility(const struct kvm_vcpu *vcpu,
return REG_HIDDEN_USER | REG_HIDDEN_GUEST;
}
-/* Visibility overrides for SVE-specific ID registers */
-static unsigned int sve_id_visibility(const struct kvm_vcpu *vcpu,
- const struct sys_reg_desc *rd)
-{
- if (vcpu_has_sve(vcpu))
- return 0;
-
- return REG_HIDDEN_USER;
-}
-
/* Generate the emulated ID_AA64ZFR0_EL1 value exposed to the guest */
static u64 guest_id_aa64zfr0_el1(const struct kvm_vcpu *vcpu)
{
@@ -1231,9 +1221,6 @@ static int get_id_aa64zfr0_el1(struct kvm_vcpu *vcpu,
{
u64 val;
- if (WARN_ON(!vcpu_has_sve(vcpu)))
- return -ENOENT;
-
val = guest_id_aa64zfr0_el1(vcpu);
return reg_to_user(uaddr, &val, reg->id);
}
@@ -1246,9 +1233,6 @@ static int set_id_aa64zfr0_el1(struct kvm_vcpu *vcpu,
int err;
u64 val;
- if (WARN_ON(!vcpu_has_sve(vcpu)))
- return -ENOENT;
-
err = reg_from_user(&val, uaddr, id);
if (err)
return err;
@@ -1518,7 +1502,7 @@ static const struct sys_reg_desc sys_reg_descs[] = {
ID_SANITISED(ID_AA64PFR1_EL1),
ID_UNALLOCATED(4,2),
ID_UNALLOCATED(4,3),
- { SYS_DESC(SYS_ID_AA64ZFR0_EL1), access_id_aa64zfr0_el1, .get_user = get_id_aa64zfr0_el1, .set_user = set_id_aa64zfr0_el1, .visibility = sve_id_visibility },
+ { SYS_DESC(SYS_ID_AA64ZFR0_EL1), access_id_aa64zfr0_el1, .get_user = get_id_aa64zfr0_el1, .set_user = set_id_aa64zfr0_el1, },
ID_UNALLOCATED(4,5),
ID_UNALLOCATED(4,6),
ID_UNALLOCATED(4,7),
--
2.26.2
The patch below does not apply to the 5.9-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
>From f9c9104288da543cd64f186f9e2fba389f415630 Mon Sep 17 00:00:00 2001
From: Damien Le Moal <damien.lemoal(a)wdc.com>
Date: Thu, 29 Oct 2020 20:04:59 +0900
Subject: [PATCH] null_blk: Fix zone reset all tracing
In the cae of the REQ_OP_ZONE_RESET_ALL operation, the command sector is
ignored and the operation is applied to all sequential zones. For these
commands, tracing the effect of the command using the command sector to
determine the target zone is thus incorrect.
Fix null_zone_mgmt() zone condition tracing in the case of
REQ_OP_ZONE_RESET_ALL to apply tracing to all sequential zones that are
not already empty.
Fixes: 766c3297d7e1 ("null_blk: add trace in null_blk_zoned.c")
Signed-off-by: Damien Le Moal <damien.lemoal(a)wdc.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/drivers/block/null_blk_zoned.c b/drivers/block/null_blk_zoned.c
index 98056c88926b..b637b16a5f54 100644
--- a/drivers/block/null_blk_zoned.c
+++ b/drivers/block/null_blk_zoned.c
@@ -475,9 +475,14 @@ static blk_status_t null_zone_mgmt(struct nullb_cmd *cmd, enum req_opf op,
switch (op) {
case REQ_OP_ZONE_RESET_ALL:
- for (i = dev->zone_nr_conv; i < dev->nr_zones; i++)
- null_reset_zone(dev, &dev->zones[i]);
- break;
+ for (i = dev->zone_nr_conv; i < dev->nr_zones; i++) {
+ zone = &dev->zones[i];
+ if (zone->cond != BLK_ZONE_COND_EMPTY) {
+ null_reset_zone(dev, zone);
+ trace_nullb_zone_op(cmd, i, zone->cond);
+ }
+ }
+ return BLK_STS_OK;
case REQ_OP_ZONE_RESET:
ret = null_reset_zone(dev, zone);
break;
Commit 393f203f5fd5 ("x86_64: kasan: add interceptors for
memset/memmove/memcpy functions") added .weak directives to
arch/x86/lib/mem*_64.S instead of changing the existing ENTRY macros to
WEAK. This can lead to the assembly snippet `.weak memcpy ... .globl
memcpy` which will produce a STB_WEAK memcpy with GNU as but STB_GLOBAL
memcpy with LLVM's integrated assembler before LLVM 12. LLVM 12 (since
https://reviews.llvm.org/D90108) will error on such an overridden symbol
binding.
Commit ef1e03152cb0 ("x86/asm: Make some functions local") changed ENTRY in
arch/x86/lib/memcpy_64.S to SYM_FUNC_START_LOCAL, which was ineffective due to
the preceding .weak directive.
Use the appropriate SYM_FUNC_START_WEAK instead.
Fixes: 393f203f5fd5 ("x86_64: kasan: add interceptors for memset/memmove/memcpy functions")
Fixes: ef1e03152cb0 ("x86/asm: Make some functions local")
Reported-by: Sami Tolvanen <samitolvanen(a)google.com>
Signed-off-by: Fangrui Song <maskray(a)google.com>
Cc: <stable(a)vger.kernel.org>
---
Changes in v2
* Correct the message: SYM_FUNC_START_WEAK was not available at commit 393f203f5fd5.
* Mention Fixes: ef1e03152cb0
---
arch/x86/lib/memcpy_64.S | 4 +---
arch/x86/lib/memmove_64.S | 4 +---
arch/x86/lib/memset_64.S | 4 +---
3 files changed, 3 insertions(+), 9 deletions(-)
diff --git a/arch/x86/lib/memcpy_64.S b/arch/x86/lib/memcpy_64.S
index 037faac46b0c..1e299ac73c86 100644
--- a/arch/x86/lib/memcpy_64.S
+++ b/arch/x86/lib/memcpy_64.S
@@ -16,8 +16,6 @@
* to a jmp to memcpy_erms which does the REP; MOVSB mem copy.
*/
-.weak memcpy
-
/*
* memcpy - Copy a memory block.
*
@@ -30,7 +28,7 @@
* rax original destination
*/
SYM_FUNC_START_ALIAS(__memcpy)
-SYM_FUNC_START_LOCAL(memcpy)
+SYM_FUNC_START_WEAK(memcpy)
ALTERNATIVE_2 "jmp memcpy_orig", "", X86_FEATURE_REP_GOOD, \
"jmp memcpy_erms", X86_FEATURE_ERMS
diff --git a/arch/x86/lib/memmove_64.S b/arch/x86/lib/memmove_64.S
index 7ff00ea64e4f..41902fe8b859 100644
--- a/arch/x86/lib/memmove_64.S
+++ b/arch/x86/lib/memmove_64.S
@@ -24,9 +24,7 @@
* Output:
* rax: dest
*/
-.weak memmove
-
-SYM_FUNC_START_ALIAS(memmove)
+SYM_FUNC_START_WEAK(memmove)
SYM_FUNC_START(__memmove)
mov %rdi, %rax
diff --git a/arch/x86/lib/memset_64.S b/arch/x86/lib/memset_64.S
index 9ff15ee404a4..0bfd26e4ca9e 100644
--- a/arch/x86/lib/memset_64.S
+++ b/arch/x86/lib/memset_64.S
@@ -6,8 +6,6 @@
#include <asm/alternative-asm.h>
#include <asm/export.h>
-.weak memset
-
/*
* ISO C memset - set a memory block to a byte value. This function uses fast
* string to get better performance than the original function. The code is
@@ -19,7 +17,7 @@
*
* rax original destination
*/
-SYM_FUNC_START_ALIAS(memset)
+SYM_FUNC_START_WEAK(memset)
SYM_FUNC_START(__memset)
/*
* Some CPUs support enhanced REP MOVSB/STOSB feature. It is recommended
--
2.29.1.341.ge80a0c044ae-goog
The driver must not call tty_wakeup() while holding its private lock as
line disciplines are allowed to call back into write() from
write_wakeup(), leading to a deadlock.
Also remove the unneeded work struct that was used to defer wakeup in
order to work around a possible race in ancient times (see comment about
n_tty write_chan() in commit 14b54e39b412 ("USB: serial: remove
changelogs and old todo entries")).
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable(a)vger.kernel.org
Signed-off-by: Johan Hovold <johan(a)kernel.org>
---
drivers/usb/serial/digi_acceleport.c | 45 ++++++++--------------------
1 file changed, 13 insertions(+), 32 deletions(-)
diff --git a/drivers/usb/serial/digi_acceleport.c b/drivers/usb/serial/digi_acceleport.c
index 91055a191995..0d606fa9fdca 100644
--- a/drivers/usb/serial/digi_acceleport.c
+++ b/drivers/usb/serial/digi_acceleport.c
@@ -19,7 +19,6 @@
#include <linux/tty_flip.h>
#include <linux/module.h>
#include <linux/spinlock.h>
-#include <linux/workqueue.h>
#include <linux/uaccess.h>
#include <linux/usb.h>
#include <linux/wait.h>
@@ -198,14 +197,12 @@ struct digi_port {
int dp_throttle_restart;
wait_queue_head_t dp_flush_wait;
wait_queue_head_t dp_close_wait; /* wait queue for close */
- struct work_struct dp_wakeup_work;
struct usb_serial_port *dp_port;
};
/* Local Function Declarations */
-static void digi_wakeup_write_lock(struct work_struct *work);
static int digi_write_oob_command(struct usb_serial_port *port,
unsigned char *buf, int count, int interruptible);
static int digi_write_inb_command(struct usb_serial_port *port,
@@ -356,26 +353,6 @@ __releases(lock)
return timeout;
}
-
-/*
- * Digi Wakeup Write
- *
- * Wake up port, line discipline, and tty processes sleeping
- * on writes.
- */
-
-static void digi_wakeup_write_lock(struct work_struct *work)
-{
- struct digi_port *priv =
- container_of(work, struct digi_port, dp_wakeup_work);
- struct usb_serial_port *port = priv->dp_port;
- unsigned long flags;
-
- spin_lock_irqsave(&priv->dp_port_lock, flags);
- tty_port_tty_wakeup(&port->port);
- spin_unlock_irqrestore(&priv->dp_port_lock, flags);
-}
-
/*
* Digi Write OOB Command
*
@@ -986,6 +963,7 @@ static void digi_write_bulk_callback(struct urb *urb)
unsigned long flags;
int ret = 0;
int status = urb->status;
+ bool wakeup;
/* port and serial sanity check */
if (port == NULL || (priv = usb_get_serial_port_data(port)) == NULL) {
@@ -1012,6 +990,7 @@ static void digi_write_bulk_callback(struct urb *urb)
}
/* try to send any buffered data on this port */
+ wakeup = true;
spin_lock_irqsave(&priv->dp_port_lock, flags);
priv->dp_write_urb_in_use = 0;
if (priv->dp_out_buf_len > 0) {
@@ -1027,19 +1006,18 @@ static void digi_write_bulk_callback(struct urb *urb)
if (ret == 0) {
priv->dp_write_urb_in_use = 1;
priv->dp_out_buf_len = 0;
+ wakeup = false;
}
}
- /* wake up processes sleeping on writes immediately */
- tty_port_tty_wakeup(&port->port);
- /* also queue up a wakeup at scheduler time, in case we */
- /* lost the race in write_chan(). */
- schedule_work(&priv->dp_wakeup_work);
-
spin_unlock_irqrestore(&priv->dp_port_lock, flags);
+
if (ret && ret != -EPERM)
dev_err_console(port,
"%s: usb_submit_urb failed, ret=%d, port=%d\n",
__func__, ret, priv->dp_port_num);
+
+ if (wakeup)
+ tty_port_tty_wakeup(&port->port);
}
static int digi_write_room(struct tty_struct *tty)
@@ -1239,7 +1217,6 @@ static int digi_port_init(struct usb_serial_port *port, unsigned port_num)
init_waitqueue_head(&priv->dp_transmit_idle_wait);
init_waitqueue_head(&priv->dp_flush_wait);
init_waitqueue_head(&priv->dp_close_wait);
- INIT_WORK(&priv->dp_wakeup_work, digi_wakeup_write_lock);
priv->dp_port = port;
init_waitqueue_head(&port->write_wait);
@@ -1508,13 +1485,14 @@ static int digi_read_oob_callback(struct urb *urb)
rts = C_CRTSCTS(tty);
if (tty && opcode == DIGI_CMD_READ_INPUT_SIGNALS) {
+ bool wakeup = false;
+
spin_lock_irqsave(&priv->dp_port_lock, flags);
/* convert from digi flags to termiox flags */
if (val & DIGI_READ_INPUT_SIGNALS_CTS) {
priv->dp_modem_signals |= TIOCM_CTS;
- /* port must be open to use tty struct */
if (rts)
- tty_port_tty_wakeup(&port->port);
+ wakeup = true;
} else {
priv->dp_modem_signals &= ~TIOCM_CTS;
/* port must be open to use tty struct */
@@ -1533,6 +1511,9 @@ static int digi_read_oob_callback(struct urb *urb)
priv->dp_modem_signals &= ~TIOCM_CD;
spin_unlock_irqrestore(&priv->dp_port_lock, flags);
+
+ if (wakeup)
+ tty_port_tty_wakeup(&port->port);
} else if (opcode == DIGI_CMD_TRANSMIT_IDLE) {
spin_lock_irqsave(&priv->dp_port_lock, flags);
priv->dp_transmit_idle = 1;
--
2.26.2
This is a bit of a mess, to put it mildly. But, it's a bug
that seems to have gone unticked up to now, probably because
nobody uses MPX. The other alternative to this fix is to just
deprecate MPX, even in -stable kernels.
MPX has the arch_unmap() hook inside of munmap() because MPX
uses bounds tables that protect other areas of memory. When
memory is unmapped, there is also a need to unmap the MPX
bounds tables. Barring this, unused bounds tables can eat 80%
of the address space.
But, the recursive do_munmap() that gets called vi arch_unmap()
wreaks havoc with __do_munmap()'s state. It can result in
freeing populated page tables, accessing bogus VMA state,
double-freed VMAs and more.
To fix this, call arch_unmap() before __do_unmap() has a chance
to do anything meaningful. Also, remove the 'vma' argument
and force the MPX code to do its own, independent VMA lookup.
For the common success case this is functionally identical to
what was there before. For the munmap() failure case, it's
possible that some MPX tables will be zapped for memory that
continues to be in use. But, this is an extraordinarily
unlikely scenario and the harm would be that MPX provides no
protection since the bounds table got reset (zeroed).
I can't imagine anyone doing this:
ptr = mmap();
// use ptr
ret = munmap(ptr);
if (ret)
// oh, there was an error, I'll
// keep using ptr.
Because if you're doing munmap(), you are *done* with the
memory. There's probably no good data in there _anyway_.
This passes the original reproducer from Richard Biener as
well as the existing mpx selftests/.
====
The long story:
munmap() has a couple of pieces:
1. Find the affected VMA(s)
2. Split the start/end one(s) if neceesary
3. Pull the VMAs out of the rbtree
4. Actually zap the memory via unmap_region(), including
freeing page tables (or queueing them to be freed).
5. Fixup some of the accounting (like fput()) and actually
free the VMA itself.
I decided to put the arch_unmap() call right afer #3. This
was *just* before mmap_sem looked like it might get downgraded
(it won't in this context), but it looked right. It wasn't.
Richard Biener reported a test that shows this in dmesg:
[1216548.787498] BUG: Bad rss-counter state mm:0000000017ce560b idx:1 val:551
[1216548.787500] BUG: non-zero pgtables_bytes on freeing mm: 24576
What triggered this was the recursive do_munmap() called via
arch_unmap(). It was freeing page tables that has not been
properly zapped.
But, the problem was bigger than this. For one, arch_unmap()
can free VMAs. But, the calling __do_munmap() has variables
that *point* to VMAs and obviously can't handle them just
getting freed while the pointer is still valid.
I tried a couple of things here. First, I tried to fix the page
table freeing problem in isolation, but I then found the VMA
issue. I also tried having the MPX code return a flag if it
modified the rbtree which would force __do_munmap() to re-walk
to restart. That spiralled out of control in complexity pretty
fast.
Just moving arch_unmap() and accepting that the bonkers failure
case might eat some bounds tables seems like the simplest viable
fix.
Reported-by: Richard Biener <rguenther(a)suse.de>
Cc: Michal Hocko <mhocko(a)suse.com>
Cc: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Andy Lutomirski <luto(a)amacapital.net>
Cc: x86(a)kernel.org
Cc: Andrew Morton <akpm(a)linux-foundation.org>
Cc: linux-kernel(a)vger.kernel.org
Cc: linux-mm(a)kvack.org
Cc: stable(a)vger.kernel.org
---
b/arch/x86/include/asm/mmu_context.h | 6 +++---
b/arch/x86/include/asm/mpx.h | 5 ++---
b/arch/x86/mm/mpx.c | 10 ++++++----
b/include/asm-generic/mm_hooks.h | 1 -
b/mm/mmap.c | 15 ++++++++-------
5 files changed, 19 insertions(+), 18 deletions(-)
diff -puN mm/mmap.c~mpx-rss-pass-no-vma mm/mmap.c
--- a/mm/mmap.c~mpx-rss-pass-no-vma 2019-04-01 06:56:53.409411123 -0700
+++ b/mm/mmap.c 2019-04-01 06:56:53.423411123 -0700
@@ -2731,9 +2731,17 @@ int __do_munmap(struct mm_struct *mm, un
return -EINVAL;
len = PAGE_ALIGN(len);
+ end = start + len;
if (len == 0)
return -EINVAL;
+ /*
+ * arch_unmap() might do unmaps itself. It must be called
+ * and finish any rbtree manipulation before this code
+ * runs and also starts to manipulate the rbtree.
+ */
+ arch_unmap(mm, start, end);
+
/* Find the first overlapping VMA */
vma = find_vma(mm, start);
if (!vma)
@@ -2742,7 +2750,6 @@ int __do_munmap(struct mm_struct *mm, un
/* we have start < vma->vm_end */
/* if it doesn't overlap, we have nothing.. */
- end = start + len;
if (vma->vm_start >= end)
return 0;
@@ -2812,12 +2819,6 @@ int __do_munmap(struct mm_struct *mm, un
/* Detach vmas from rbtree */
detach_vmas_to_be_unmapped(mm, vma, prev, end);
- /*
- * mpx unmap needs to be called with mmap_sem held for write.
- * It is safe to call it before unmap_region().
- */
- arch_unmap(mm, vma, start, end);
-
if (downgrade)
downgrade_write(&mm->mmap_sem);
diff -puN arch/x86/include/asm/mmu_context.h~mpx-rss-pass-no-vma arch/x86/include/asm/mmu_context.h
--- a/arch/x86/include/asm/mmu_context.h~mpx-rss-pass-no-vma 2019-04-01 06:56:53.412411123 -0700
+++ b/arch/x86/include/asm/mmu_context.h 2019-04-01 06:56:53.423411123 -0700
@@ -277,8 +277,8 @@ static inline void arch_bprm_mm_init(str
mpx_mm_init(mm);
}
-static inline void arch_unmap(struct mm_struct *mm, struct vm_area_struct *vma,
- unsigned long start, unsigned long end)
+static inline void arch_unmap(struct mm_struct *mm, unsigned long start,
+ unsigned long end)
{
/*
* mpx_notify_unmap() goes and reads a rarely-hot
@@ -298,7 +298,7 @@ static inline void arch_unmap(struct mm_
* consistently wrong.
*/
if (unlikely(cpu_feature_enabled(X86_FEATURE_MPX)))
- mpx_notify_unmap(mm, vma, start, end);
+ mpx_notify_unmap(mm, start, end);
}
/*
diff -puN include/asm-generic/mm_hooks.h~mpx-rss-pass-no-vma include/asm-generic/mm_hooks.h
--- a/include/asm-generic/mm_hooks.h~mpx-rss-pass-no-vma 2019-04-01 06:56:53.414411123 -0700
+++ b/include/asm-generic/mm_hooks.h 2019-04-01 06:56:53.423411123 -0700
@@ -18,7 +18,6 @@ static inline void arch_exit_mmap(struct
}
static inline void arch_unmap(struct mm_struct *mm,
- struct vm_area_struct *vma,
unsigned long start, unsigned long end)
{
}
diff -puN arch/x86/mm/mpx.c~mpx-rss-pass-no-vma arch/x86/mm/mpx.c
--- a/arch/x86/mm/mpx.c~mpx-rss-pass-no-vma 2019-04-01 06:56:53.416411123 -0700
+++ b/arch/x86/mm/mpx.c 2019-04-01 06:56:53.423411123 -0700
@@ -881,9 +881,10 @@ static int mpx_unmap_tables(struct mm_st
* the virtual address region start...end have already been split if
* necessary, and the 'vma' is the first vma in this range (start -> end).
*/
-void mpx_notify_unmap(struct mm_struct *mm, struct vm_area_struct *vma,
- unsigned long start, unsigned long end)
+void mpx_notify_unmap(struct mm_struct *mm, unsigned long start,
+ unsigned long end)
{
+ struct vm_area_struct *vma;
int ret;
/*
@@ -902,11 +903,12 @@ void mpx_notify_unmap(struct mm_struct *
* which should not occur normally. Being strict about it here
* helps ensure that we do not have an exploitable stack overflow.
*/
- do {
+ vma = find_vma(mm, start);
+ while (vma && vma->vm_start < end) {
if (vma->vm_flags & VM_MPX)
return;
vma = vma->vm_next;
- } while (vma && vma->vm_start < end);
+ }
ret = mpx_unmap_tables(mm, start, end);
if (ret)
diff -puN arch/x86/include/asm/mpx.h~mpx-rss-pass-no-vma arch/x86/include/asm/mpx.h
--- a/arch/x86/include/asm/mpx.h~mpx-rss-pass-no-vma 2019-04-01 06:56:53.418411123 -0700
+++ b/arch/x86/include/asm/mpx.h 2019-04-01 06:56:53.424411123 -0700
@@ -78,8 +78,8 @@ static inline void mpx_mm_init(struct mm
*/
mm->context.bd_addr = MPX_INVALID_BOUNDS_DIR;
}
-void mpx_notify_unmap(struct mm_struct *mm, struct vm_area_struct *vma,
- unsigned long start, unsigned long end);
+void mpx_notify_unmap(struct mm_struct *mm, unsigned long start,
+ unsigned long end);
unsigned long mpx_unmapped_area_check(unsigned long addr, unsigned long len,
unsigned long flags);
@@ -100,7 +100,6 @@ static inline void mpx_mm_init(struct mm
{
}
static inline void mpx_notify_unmap(struct mm_struct *mm,
- struct vm_area_struct *vma,
unsigned long start, unsigned long end)
{
}
_
From: Nicholas Piggin <npiggin(a)gmail.com>
commit d53c3dfb23c45f7d4f910c3a3ca84bf0a99c6143 upstream.
Reading and modifying current->mm and current->active_mm and switching
mm should be done with irqs off, to prevent races seeing an intermediate
state.
This is similar to commit 38cf307c1f20 ("mm: fix kthread_use_mm() vs TLB
invalidate"). At exec-time when the new mm is activated, the old one
should usually be single-threaded and no longer used, unless something
else is holding an mm_users reference (which may be possible).
Absent other mm_users, there is also a race with preemption and lazy tlb
switching. Consider the kernel_execve case where the current thread is
using a lazy tlb active mm:
call_usermodehelper()
kernel_execve()
old_mm = current->mm;
active_mm = current->active_mm;
*** preempt *** --------------------> schedule()
prev->active_mm = NULL;
mmdrop(prev active_mm);
...
<-------------------- schedule()
current->mm = mm;
current->active_mm = mm;
if (!old_mm)
mmdrop(active_mm);
If we switch back to the kernel thread from a different mm, there is a
double free of the old active_mm, and a missing free of the new one.
Closing this race only requires interrupts to be disabled while ->mm
and ->active_mm are being switched, but the TLB problem requires also
holding interrupts off over activate_mm. Unfortunately not all archs
can do that yet, e.g., arm defers the switch if irqs are disabled and
expects finish_arch_post_lock_switch() to be called to complete the
flush; um takes a blocking lock in activate_mm().
So as a first step, disable interrupts across the mm/active_mm updates
to close the lazy tlb preempt race, and provide an arch option to
extend that to activate_mm which allows architectures doing IPI based
TLB shootdowns to close the second race.
This is a bit ugly, but in the interest of fixing the bug and backporting
before all architectures are converted this is a compromise.
Signed-off-by: Nicholas Piggin <npiggin(a)gmail.com>
Acked-by: Peter Zijlstra (Intel) <peterz(a)infradead.org>
[mpe: Manual backport to 4.19 due to membarrier_exec_mmap(mm) changes]
Signed-off-by: Michael Ellerman <mpe(a)ellerman.id.au>
Link: https://lore.kernel.org/r/20200914045219.3736466-2-npiggin@gmail.com
---
arch/Kconfig | 7 +++++++
fs/exec.c | 15 ++++++++++++++-
2 files changed, 21 insertions(+), 1 deletion(-)
diff --git a/arch/Kconfig b/arch/Kconfig
index a336548487e6..e3a030f7a722 100644
--- a/arch/Kconfig
+++ b/arch/Kconfig
@@ -366,6 +366,13 @@ config HAVE_RCU_TABLE_FREE
config HAVE_RCU_TABLE_INVALIDATE
bool
+config ARCH_WANT_IRQS_OFF_ACTIVATE_MM
+ bool
+ help
+ Temporary select until all architectures can be converted to have
+ irqs disabled over activate_mm. Architectures that do IPI based TLB
+ shootdowns should enable this.
+
config ARCH_HAVE_NMI_SAFE_CMPXCHG
bool
diff --git a/fs/exec.c b/fs/exec.c
index cece8c14f377..52788644c4af 100644
--- a/fs/exec.c
+++ b/fs/exec.c
@@ -1028,10 +1028,23 @@ static int exec_mmap(struct mm_struct *mm)
}
}
task_lock(tsk);
+
+ local_irq_disable();
active_mm = tsk->active_mm;
- tsk->mm = mm;
tsk->active_mm = mm;
+ tsk->mm = mm;
+ /*
+ * This prevents preemption while active_mm is being loaded and
+ * it and mm are being updated, which could cause problems for
+ * lazy tlb mm refcounting when these are updated by context
+ * switches. Not all architectures can handle irqs off over
+ * activate_mm yet.
+ */
+ if (!IS_ENABLED(CONFIG_ARCH_WANT_IRQS_OFF_ACTIVATE_MM))
+ local_irq_enable();
activate_mm(active_mm, mm);
+ if (IS_ENABLED(CONFIG_ARCH_WANT_IRQS_OFF_ACTIVATE_MM))
+ local_irq_enable();
tsk->mm->vmacache_seqnum = 0;
vmacache_flush(tsk);
task_unlock(tsk);
--
2.25.1
The patch titled
Subject: mm/gup: use unpin_user_pages() in check_and_migrate_cma_pages()
has been removed from the -mm tree. Its filename was
mm-gup-use-unpin_user_pages-in-check_and_migrate_cma_pages.patch
This patch was dropped because an alternative patch was merged
------------------------------------------------------
From: Jason Gunthorpe <jgg(a)nvidia.com>
Subject: mm/gup: use unpin_user_pages() in check_and_migrate_cma_pages()
When FOLL_PIN is passed to __get_user_pages() the page list must be put
back using unpin_user_pages() otherwise the page pin reference persists in
a corrupted state.
Link: https://lkml.kernel.org/r/0-v1-976effcd4468+d4-gup_cma_fix_jgg@nvidia.com
Fixes: 3faa52c03f44 ("mm/gup: track FOLL_PIN pages")
Signed-off-by: Jason Gunthorpe <jgg(a)nvidia.com>
Cc: Aneesh Kumar K.V <aneesh.kumar(a)linux.ibm.com>
Cc: Claudio Imbrenda <imbrenda(a)linux.ibm.com>
Cc: Ira Weiny <ira.weiny(a)intel.com>
Cc: Jan Kara <jack(a)suse.cz>
Cc: John Hubbard <jhubbard(a)nvidia.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/gup.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
--- a/mm/gup.c~mm-gup-use-unpin_user_pages-in-check_and_migrate_cma_pages
+++ a/mm/gup.c
@@ -1647,8 +1647,11 @@ check_again:
/*
* drop the above get_user_pages reference.
*/
- for (i = 0; i < nr_pages; i++)
- put_page(pages[i]);
+ if (gup_flags & FOLL_PIN)
+ unpin_user_pages(pages, nr_pages);
+ else
+ for (i = 0; i < nr_pages; i++)
+ put_page(pages[i]);
if (migrate_pages(&cma_page_list, alloc_migration_target, NULL,
(unsigned long)&mtc, MIGRATE_SYNC, MR_CONTIG_RANGE)) {
_
Patches currently in -mm which might be from jgg(a)nvidia.com are
mm-gup-use-unpin_user_pages-in-__gup_longterm_locked.patch
When tpm_get_random() was introduced, it defined the following API for the
return value:
1. A positive value tells how many bytes of random data was generated.
2. A negative value on error.
However, in the call sites the API was used incorrectly, i.e. as it would
only return negative values and otherwise zero. Returning he positive read
counts to the user space does not make any possible sense.
Fix this by returning -EIO when tpm_get_random() returns a positive value.
Fixes: 41ab999c80f1 ("tpm: Move tpm_get_random api into the TPM device driver")
Cc: stable(a)vger.kernel.org
Cc: Mimi Zohar <zohar(a)linux.ibm.com>
Cc: "James E.J. Bottomley" <James.Bottomley(a)HansenPartnership.com>
Cc: David Howells <dhowells(a)redhat.com>
Cc: Kent Yoder <key(a)linux.vnet.ibm.com>
Signed-off-by: Jarkko Sakkinen <jarkko.sakkinen(a)linux.intel.com>
---
security/keys/trusted-keys/trusted_tpm1.c | 20 +++++++++++++++++---
1 file changed, 17 insertions(+), 3 deletions(-)
diff --git a/security/keys/trusted-keys/trusted_tpm1.c b/security/keys/trusted-keys/trusted_tpm1.c
index b9fe02e5f84f..c7b1701cdac5 100644
--- a/security/keys/trusted-keys/trusted_tpm1.c
+++ b/security/keys/trusted-keys/trusted_tpm1.c
@@ -403,9 +403,12 @@ static int osap(struct tpm_buf *tb, struct osapsess *s,
int ret;
ret = tpm_get_random(chip, ononce, TPM_NONCE_SIZE);
- if (ret != TPM_NONCE_SIZE)
+ if (ret < 0)
return ret;
+ if (ret != TPM_NONCE_SIZE)
+ return -EIO;
+
tpm_buf_reset(tb, TPM_TAG_RQU_COMMAND, TPM_ORD_OSAP);
tpm_buf_append_u16(tb, type);
tpm_buf_append_u32(tb, handle);
@@ -496,8 +499,12 @@ static int tpm_seal(struct tpm_buf *tb, uint16_t keytype,
goto out;
ret = tpm_get_random(chip, td->nonceodd, TPM_NONCE_SIZE);
+ if (ret < 0)
+ return ret;
+
if (ret != TPM_NONCE_SIZE)
- goto out;
+ return -EIO;
+
ordinal = htonl(TPM_ORD_SEAL);
datsize = htonl(datalen);
pcrsize = htonl(pcrinfosize);
@@ -601,9 +608,12 @@ static int tpm_unseal(struct tpm_buf *tb,
ordinal = htonl(TPM_ORD_UNSEAL);
ret = tpm_get_random(chip, nonceodd, TPM_NONCE_SIZE);
+ if (ret < 0)
+ return ret;
+
if (ret != TPM_NONCE_SIZE) {
pr_info("trusted_key: tpm_get_random failed (%d)\n", ret);
- return ret;
+ return -EIO;
}
ret = TSS_authhmac(authdata1, keyauth, TPM_NONCE_SIZE,
enonce1, nonceodd, cont, sizeof(uint32_t),
@@ -1013,8 +1023,12 @@ static int trusted_instantiate(struct key *key,
case Opt_new:
key_len = payload->key_len;
ret = tpm_get_random(chip, payload->key, key_len);
+ if (ret < 0)
+ goto out;
+
if (ret != key_len) {
pr_info("trusted_key: key_create failed (%d)\n", ret);
+ ret = -EIO;
goto out;
}
if (tpm2)
--
2.25.1
From: Gao Xiang <hsiangkao(a)redhat.com>
EROFS has _only one_ ondisk timestamp (ctime is currently
documented and recorded, we might also record mtime instead
with a new compat feature if needed) for each extended inode
since EROFS isn't mainly for archival purposes so no need to
keep all timestamps on disk especially for Android scenarios
due to security concerns. Also, romfs/cramfs don't have their
own on-disk timestamp, and squashfs only records mtime instead.
Let's also derive access time from ondisk timestamp rather than
leaving it empty, and if mtime/atime for each file are really
needed for specific scenarios as well, we can also use xattrs
to record them then.
Reported-by: nl6720 <nl6720(a)gmail.com>
[ Gao Xiang: It'd be better to backport for user-friendly concern. ]
Fixes: 431339ba9042 ("staging: erofs: add inode operations")
Cc: stable <stable(a)vger.kernel.org> # 4.19+
Signed-off-by: Gao Xiang <hsiangkao(a)redhat.com>
---
fs/erofs/inode.c | 21 +++++++++++----------
1 file changed, 11 insertions(+), 10 deletions(-)
diff --git a/fs/erofs/inode.c b/fs/erofs/inode.c
index 139d0bed42f8..3e21c0e8adae 100644
--- a/fs/erofs/inode.c
+++ b/fs/erofs/inode.c
@@ -107,11 +107,9 @@ static struct page *erofs_read_inode(struct inode *inode,
i_gid_write(inode, le32_to_cpu(die->i_gid));
set_nlink(inode, le32_to_cpu(die->i_nlink));
- /* ns timestamp */
- inode->i_mtime.tv_sec = inode->i_ctime.tv_sec =
- le64_to_cpu(die->i_ctime);
- inode->i_mtime.tv_nsec = inode->i_ctime.tv_nsec =
- le32_to_cpu(die->i_ctime_nsec);
+ /* extended inode has its own timestamp */
+ inode->i_ctime.tv_sec = le64_to_cpu(die->i_ctime);
+ inode->i_ctime.tv_nsec = le32_to_cpu(die->i_ctime_nsec);
inode->i_size = le64_to_cpu(die->i_size);
@@ -149,11 +147,9 @@ static struct page *erofs_read_inode(struct inode *inode,
i_gid_write(inode, le16_to_cpu(dic->i_gid));
set_nlink(inode, le16_to_cpu(dic->i_nlink));
- /* use build time to derive all file time */
- inode->i_mtime.tv_sec = inode->i_ctime.tv_sec =
- sbi->build_time;
- inode->i_mtime.tv_nsec = inode->i_ctime.tv_nsec =
- sbi->build_time_nsec;
+ /* use build time for compact inodes */
+ inode->i_ctime.tv_sec = sbi->build_time;
+ inode->i_ctime.tv_nsec = sbi->build_time_nsec;
inode->i_size = le32_to_cpu(dic->i_size);
if (erofs_inode_is_data_compressed(vi->datalayout))
@@ -167,6 +163,11 @@ static struct page *erofs_read_inode(struct inode *inode,
goto err_out;
}
+ inode->i_mtime.tv_sec = inode->i_ctime.tv_sec;
+ inode->i_atime.tv_sec = inode->i_ctime.tv_sec;
+ inode->i_mtime.tv_nsec = inode->i_ctime.tv_nsec;
+ inode->i_atime.tv_nsec = inode->i_ctime.tv_nsec;
+
if (!nblks)
/* measure inode.i_blocks as generic filesystems */
inode->i_blocks = roundup(inode->i_size, EROFS_BLKSIZ) >> 9;
--
2.24.0
From: Fangrui Song <maskray(a)google.com>
Commit 393f203f5fd5 ("x86_64: kasan: add interceptors for
memset/memmove/memcpy functions") added .weak directives to
arch/x86/lib/mem*_64.S instead of changing the existing ENTRY macros to
WEAK. This can lead to the assembly snippet `.weak memcpy ... .globl
memcpy` which will produce a STB_WEAK memcpy with GNU as but STB_GLOBAL
memcpy with LLVM's integrated assembler before LLVM 12. LLVM 12 (since
https://reviews.llvm.org/D90108) will error on such an overridden symbol
binding.
Commit ef1e03152cb0 ("x86/asm: Make some functions local") changed ENTRY in
arch/x86/lib/memcpy_64.S to SYM_FUNC_START_LOCAL, which was ineffective due to
the preceding .weak directive.
Use the appropriate SYM_FUNC_START_WEAK instead.
Fixes: 393f203f5fd5 ("x86_64: kasan: add interceptors for memset/memmove/memcpy functions")
Fixes: ef1e03152cb0 ("x86/asm: Make some functions local")
Reported-by: Sami Tolvanen <samitolvanen(a)google.com>
Signed-off-by: Fangrui Song <maskray(a)google.com>
Tested-by: Nathan Chancellor <natechancellor(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
---
arch/x86/lib/memcpy_64.S | 4 +---
arch/x86/lib/memmove_64.S | 4 +---
arch/x86/lib/memset_64.S | 4 +---
3 files changed, 3 insertions(+), 9 deletions(-)
diff --git a/arch/x86/lib/memcpy_64.S b/arch/x86/lib/memcpy_64.S
index 037faac46b0c..1e299ac73c86 100644
--- a/arch/x86/lib/memcpy_64.S
+++ b/arch/x86/lib/memcpy_64.S
@@ -16,8 +16,6 @@
* to a jmp to memcpy_erms which does the REP; MOVSB mem copy.
*/
-.weak memcpy
-
/*
* memcpy - Copy a memory block.
*
@@ -30,7 +28,7 @@
* rax original destination
*/
SYM_FUNC_START_ALIAS(__memcpy)
-SYM_FUNC_START_LOCAL(memcpy)
+SYM_FUNC_START_WEAK(memcpy)
ALTERNATIVE_2 "jmp memcpy_orig", "", X86_FEATURE_REP_GOOD, \
"jmp memcpy_erms", X86_FEATURE_ERMS
diff --git a/arch/x86/lib/memmove_64.S b/arch/x86/lib/memmove_64.S
index 7ff00ea64e4f..41902fe8b859 100644
--- a/arch/x86/lib/memmove_64.S
+++ b/arch/x86/lib/memmove_64.S
@@ -24,9 +24,7 @@
* Output:
* rax: dest
*/
-.weak memmove
-
-SYM_FUNC_START_ALIAS(memmove)
+SYM_FUNC_START_WEAK(memmove)
SYM_FUNC_START(__memmove)
mov %rdi, %rax
diff --git a/arch/x86/lib/memset_64.S b/arch/x86/lib/memset_64.S
index 9ff15ee404a4..0bfd26e4ca9e 100644
--- a/arch/x86/lib/memset_64.S
+++ b/arch/x86/lib/memset_64.S
@@ -6,8 +6,6 @@
#include <asm/alternative-asm.h>
#include <asm/export.h>
-.weak memset
-
/*
* ISO C memset - set a memory block to a byte value. This function uses fast
* string to get better performance than the original function. The code is
@@ -19,7 +17,7 @@
*
* rax original destination
*/
-SYM_FUNC_START_ALIAS(memset)
+SYM_FUNC_START_WEAK(memset)
SYM_FUNC_START(__memset)
/*
* Some CPUs support enhanced REP MOVSB/STOSB feature. It is recommended
--
2.29.1.341.ge80a0c044ae-goog