From: Jason Gunthorpe <jgg(a)nvidia.com>
Subject: mm: always have io_remap_pfn_range() set pgprot_decrypted()
The purpose of io_remap_pfn_range() is to map IO memory, such as a memory
mapped IO exposed through a PCI BAR. IO devices do not understand
encryption, so this memory must always be decrypted. Automatically call
pgprot_decrypted() as part of the generic implementation.
This fixes a bug where enabling AMD SME causes subsystems, such as RDMA,
using io_remap_pfn_range() to expose BAR pages to user space to fail. The
CPU will encrypt access to those BAR pages instead of passing unencrypted
IO directly to the device.
Places not mapping IO should use remap_pfn_range().
Link: https://lkml.kernel.org/r/0-v1-025d64bdf6c4+e-amd_sme_fix_jgg@nvidia.com
Fixes: aca20d546214 ("x86/mm: Add support to make use of Secure Memory Encryption")
Signed-off-by: Jason Gunthorpe <jgg(a)nvidia.com>
Cc: Tom Lendacky <thomas.lendacky(a)amd.com>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
CcK Arnd Bergmann <arnd(a)arndb.de>
Cc: Andrey Ryabinin <aryabinin(a)virtuozzo.com>
Cc: Borislav Petkov <bp(a)alien8.de>
Cc: Brijesh Singh <brijesh.singh(a)amd.com>
Cc: Jonathan Corbet <corbet(a)lwn.net>
Cc: Dmitry Vyukov <dvyukov(a)google.com>
Cc: "Dave Young" <dyoung(a)redhat.com>
Cc: Alexander Potapenko <glider(a)google.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk(a)oracle.com>
Cc: Andy Lutomirski <luto(a)kernel.org>
Cc: Larry Woodman <lwoodman(a)redhat.com>
Cc: Matt Fleming <matt(a)codeblueprint.co.uk>
Cc: Ingo Molnar <mingo(a)kernel.org>
Cc: "Michael S. Tsirkin" <mst(a)redhat.com>
Cc: Paolo Bonzini <pbonzini(a)redhat.com>
Cc: Peter Zijlstra <peterz(a)infradead.org>
Cc: Rik van Riel <riel(a)redhat.com>
Cc: Toshimitsu Kani <toshi.kani(a)hpe.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
include/linux/mm.h | 9 +++++++++
include/linux/pgtable.h | 4 ----
2 files changed, 9 insertions(+), 4 deletions(-)
--- a/include/linux/mm.h~mm-always-have-io_remap_pfn_range-set-pgprot_decrypted
+++ a/include/linux/mm.h
@@ -2759,6 +2759,15 @@ static inline vm_fault_t vmf_insert_page
return VM_FAULT_NOPAGE;
}
+#ifndef io_remap_pfn_range
+static inline int io_remap_pfn_range(struct vm_area_struct *vma,
+ unsigned long addr, unsigned long pfn,
+ unsigned long size, pgprot_t prot)
+{
+ return remap_pfn_range(vma, addr, pfn, size, pgprot_decrypted(prot));
+}
+#endif
+
static inline vm_fault_t vmf_error(int err)
{
if (err == -ENOMEM)
--- a/include/linux/pgtable.h~mm-always-have-io_remap_pfn_range-set-pgprot_decrypted
+++ a/include/linux/pgtable.h
@@ -1427,10 +1427,6 @@ typedef unsigned int pgtbl_mod_mask;
#endif /* !__ASSEMBLY__ */
-#ifndef io_remap_pfn_range
-#define io_remap_pfn_range remap_pfn_range
-#endif
-
#ifndef has_transparent_hugepage
#ifdef CONFIG_TRANSPARENT_HUGEPAGE
#define has_transparent_hugepage() 1
_
From: Zqiang <qiang.zhang(a)windriver.com>
Subject: kthread_worker: prevent queuing delayed work from timer_fn when it is being canceled
There is a small race window when a delayed work is being canceled and the
work still might be queued from the timer_fn:
CPU0 CPU1
kthread_cancel_delayed_work_sync()
__kthread_cancel_work_sync()
__kthread_cancel_work()
work->canceling++;
kthread_delayed_work_timer_fn()
kthread_insert_work();
BUG: kthread_insert_work() should not get called when work->canceling is
set.
Link: https://lkml.kernel.org/r/20201014083030.16895-1-qiang.zhang@windriver.com
Signed-off-by: Zqiang <qiang.zhang(a)windriver.com>
Reviewed-by: Petr Mladek <pmladek(a)suse.com>
Acked-by: Tejun Heo <tj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/kthread.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
--- a/kernel/kthread.c~kthread_worker-prevent-queuing-delayed-work-from-timer_fn-when-it-is-being-canceled
+++ a/kernel/kthread.c
@@ -897,7 +897,8 @@ void kthread_delayed_work_timer_fn(struc
/* Move the work from worker->delayed_work_list. */
WARN_ON_ONCE(list_empty(&work->node));
list_del_init(&work->node);
- kthread_insert_work(worker, work, &worker->work_list);
+ if (!work->canceling)
+ kthread_insert_work(worker, work, &worker->work_list);
raw_spin_unlock_irqrestore(&worker->lock, flags);
}
_
From: Oleg Nesterov <oleg(a)redhat.com>
Subject: ptrace: fix task_join_group_stop() for the case when current is traced
This testcase
#include <stdio.h>
#include <unistd.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/wait.h>
#include <pthread.h>
#include <assert.h>
void *tf(void *arg)
{
return NULL;
}
int main(void)
{
int pid = fork();
if (!pid) {
kill(getpid(), SIGSTOP);
pthread_t th;
pthread_create(&th, NULL, tf, NULL);
return 0;
}
waitpid(pid, NULL, WSTOPPED);
ptrace(PTRACE_SEIZE, pid, 0, PTRACE_O_TRACECLONE);
waitpid(pid, NULL, 0);
ptrace(PTRACE_CONT, pid, 0,0);
waitpid(pid, NULL, 0);
int status;
int thread = waitpid(-1, &status, 0);
assert(thread > 0 && thread != pid);
assert(status == 0x80137f);
return 0;
}
fails and triggers WARN_ON_ONCE(!signr) in do_jobctl_trap().
This is because task_join_group_stop() has 2 problems when current is traced:
1. We can't rely on the "JOBCTL_STOP_PENDING" check, a stopped tracee
can be woken up by debugger and it can clone another thread which
should join the group-stop.
We need to check group_stop_count || SIGNAL_STOP_STOPPED.
2. If SIGNAL_STOP_STOPPED is already set, we should not increment
sig->group_stop_count and add JOBCTL_STOP_CONSUME. The new thread
should stop without another do_notify_parent_cldstop() report.
To clarify, the problem is very old and we should blame
ptrace_init_task(). But now that we have task_join_group_stop() it makes
more sense to fix this helper to avoid the code duplication.
Link: https://lkml.kernel.org/r/20201019134237.GA18810@redhat.com
Signed-off-by: Oleg Nesterov <oleg(a)redhat.com>
Reported-by: syzbot+3485e3773f7da290eecc(a)syzkaller.appspotmail.com
Cc: Jens Axboe <axboe(a)kernel.dk>
Cc: Christian Brauner <christian(a)brauner.io>
Cc: "Eric W . Biederman" <ebiederm(a)xmission.com>
Cc: Zhiqiang Liu <liuzhiqiang26(a)huawei.com>
Cc: Tejun Heo <tj(a)kernel.org>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
kernel/signal.c | 19 ++++++++++---------
1 file changed, 10 insertions(+), 9 deletions(-)
--- a/kernel/signal.c~ptrace-fix-task_join_group_stop-for-the-case-when-current-is-traced
+++ a/kernel/signal.c
@@ -391,16 +391,17 @@ static bool task_participate_group_stop(
void task_join_group_stop(struct task_struct *task)
{
+ unsigned long mask = current->jobctl & JOBCTL_STOP_SIGMASK;
+ struct signal_struct *sig = current->signal;
+
+ if (sig->group_stop_count) {
+ sig->group_stop_count++;
+ mask |= JOBCTL_STOP_CONSUME;
+ } else if (!(sig->flags & SIGNAL_STOP_STOPPED))
+ return;
+
/* Have the new thread join an on-going signal group stop */
- unsigned long jobctl = current->jobctl;
- if (jobctl & JOBCTL_STOP_PENDING) {
- struct signal_struct *sig = current->signal;
- unsigned long signr = jobctl & JOBCTL_STOP_SIGMASK;
- unsigned long gstop = JOBCTL_STOP_PENDING | JOBCTL_STOP_CONSUME;
- if (task_set_jobctl_pending(task, signr | gstop)) {
- sig->group_stop_count++;
- }
- }
+ task_set_jobctl_pending(task, mask | JOBCTL_STOP_PENDING);
}
/*
_
From: Shijie Luo <luoshijie1(a)huawei.com>
Subject: mm: mempolicy: fix potential pte_unmap_unlock pte error
When flags in queue_pages_pte_range don't have MPOL_MF_MOVE or
MPOL_MF_MOVE_ALL bits, code breaks and passing origin pte - 1 to
pte_unmap_unlock seems like not a good idea.
queue_pages_pte_range can run in MPOL_MF_MOVE_ALL mode which doesn't
migrate misplaced pages but returns with EIO when encountering such a
page. Since commit a7f40cfe3b7a ("mm: mempolicy: make mbind() return -EIO
when MPOL_MF_STRICT is specified") and early break on the first pte in the
range results in pte_unmap_unlock on an underflow pte. This can lead to
lockups later on when somebody tries to lock the pte resp.
page_table_lock again..
Link: https://lkml.kernel.org/r/20201019074853.50856-1-luoshijie1@huawei.com
Fixes: a7f40cfe3b7a ("mm: mempolicy: make mbind() return -EIO when MPOL_MF_STRICT is specified")
Signed-off-by: Shijie Luo <luoshijie1(a)huawei.com>
Signed-off-by: Miaohe Lin <linmiaohe(a)huawei.com>
Reviewed-by: Oscar Salvador <osalvador(a)suse.de>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Feilong Lin <linfeilong(a)huawei.com>
Cc: Shijie Luo <luoshijie1(a)huawei.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/mempolicy.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
--- a/mm/mempolicy.c~mm-mempolicy-fix-potential-pte_unmap_unlock-pte-error
+++ a/mm/mempolicy.c
@@ -525,7 +525,7 @@ static int queue_pages_pte_range(pmd_t *
unsigned long flags = qp->flags;
int ret;
bool has_unmovable = false;
- pte_t *pte;
+ pte_t *pte, *mapped_pte;
spinlock_t *ptl;
ptl = pmd_trans_huge_lock(pmd, vma);
@@ -539,7 +539,7 @@ static int queue_pages_pte_range(pmd_t *
if (pmd_trans_unstable(pmd))
return 0;
- pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
+ mapped_pte = pte = pte_offset_map_lock(walk->mm, pmd, addr, &ptl);
for (; addr != end; pte++, addr += PAGE_SIZE) {
if (!pte_present(*pte))
continue;
@@ -571,7 +571,7 @@ static int queue_pages_pte_range(pmd_t *
} else
break;
}
- pte_unmap_unlock(pte - 1, ptl);
+ pte_unmap_unlock(mapped_pte, ptl);
cond_resched();
if (has_unmovable)
_
Commit
b9de06783f01 ("compiler.h: fix barrier_data() on clang")
moved the definition of barrier() into compiler.h.
This causes build failures at least on alpha, because there are files
that rely on barrier() being defined via the implicit include of
compiler_types.h.
Revert this portion of the commit to fix.
Link: https://lore.kernel.org/linux-mm/202010312104.Dk9VQJYb-lkp@intel.com/
Reported-by: kernel test robot <lkp(a)intel.com>
Signed-off-by: Arvind Sankar <nivedita(a)alum.mit.edu>
Fixes: b9de06783f01 ("compiler.h: fix barrier_data() on clang")
Cc: <stable(a)vger.kernel.org>
---
include/linux/compiler-clang.h | 6 ++++++
include/linux/compiler-gcc.h | 5 +++++
include/linux/compiler.h | 3 +--
3 files changed, 12 insertions(+), 2 deletions(-)
diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
index dd7233c48bf3..230604e7f057 100644
--- a/include/linux/compiler-clang.h
+++ b/include/linux/compiler-clang.h
@@ -60,6 +60,12 @@
#define COMPILER_HAS_GENERIC_BUILTIN_OVERFLOW 1
#endif
+/* The following are for compatibility with GCC, from compiler-gcc.h,
+ * and may be redefined here because they should not be shared with other
+ * compilers, like ICC.
+ */
+#define barrier() __asm__ __volatile__("" : : : "memory")
+
#if __has_feature(shadow_call_stack)
# define __noscs __attribute__((__no_sanitize__("shadow-call-stack")))
#endif
diff --git a/include/linux/compiler-gcc.h b/include/linux/compiler-gcc.h
index 50912ed00278..a572965c801a 100644
--- a/include/linux/compiler-gcc.h
+++ b/include/linux/compiler-gcc.h
@@ -15,6 +15,11 @@
# error Sorry, your version of GCC is too old - please use 4.9 or newer.
#endif
+/* Optimization barrier */
+
+/* The "volatile" is due to gcc bugs */
+#define barrier() __asm__ __volatile__("": : :"memory")
+
/*
* This macro obfuscates arithmetic on a variable address so that gcc
* shouldn't recognize the original var, and make assumptions about it.
diff --git a/include/linux/compiler.h b/include/linux/compiler.h
index b8fe0c23cfff..25c803f4222f 100644
--- a/include/linux/compiler.h
+++ b/include/linux/compiler.h
@@ -80,8 +80,7 @@ void ftrace_likely_update(struct ftrace_likely_data *f, int val,
/* Optimization barrier */
#ifndef barrier
-/* The "volatile" is due to gcc bugs */
-# define barrier() __asm__ __volatile__("": : :"memory")
+# define barrier() __memory_barrier()
#endif
#ifndef barrier_data
--
2.26.2
A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
A: Top-posting.
Q: What is the most annoying thing in e-mail?
A: No.
Q: Should I include quotations after my reply?
http://daringfireball.net/2007/07/on_top
On Thu, Oct 29, 2020 at 08:23:57AM -0700, Chris Ye wrote:
> Haha, thanks!
> I have fixed my git config and sent a new mail, can you check it?
Do you have a pointer to it on lore.kernel.org?
And you did properly version the patch, right?
thanks,
greg k-h