From: xinhui pan xinhui.pan@amd.com
[ Upstream commit ad2c28bd9a4083816fa45a7e90c2486cde8a9873 ]
A BO is added to the swap list when it is validated into the system domain. If the BO is later validated into a non-system domain, say the VRAM domain, it should no longer be on the swap list.
Signed-off-by: xinhui pan xinhui.pan@amd.com
Acked-by: Guchun Chen guchun.chen@amd.com
Acked-by: Alex Deucher alexander.deucher@amd.com
Reviewed-by: Christian König christian.koenig@amd.com
Link: https://patchwork.freedesktop.org/patch/msgid/20210224032808.150465-1-xinhui...
Signed-off-by: Christian König christian.koenig@amd.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/gpu/drm/ttm/ttm_bo.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/gpu/drm/ttm/ttm_bo.c b/drivers/gpu/drm/ttm/ttm_bo.c
index 101a68dc615b..799ec7a7caa4 100644
--- a/drivers/gpu/drm/ttm/ttm_bo.c
+++ b/drivers/gpu/drm/ttm/ttm_bo.c
@@ -153,6 +153,8 @@ void ttm_bo_move_to_lru_tail(struct ttm_buffer_object *bo,
 
 		swap = &ttm_bo_glob.swap_lru[bo->priority];
 		list_move_tail(&bo->swap, swap);
+	} else {
+		list_del_init(&bo->swap);
 	}
 
 	if (bdev->driver->del_from_lru_notify)
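The effect of the added else branch can be illustrated with a small userspace sketch. It is only a model under stated assumptions: the list helpers are simplified stand-ins for the kernel's list.h, and `struct sketch_bo`, `in_system_domain` and `sketch_move_to_lru_tail()` are hypothetical names. The real condition guarding the swap-list branch is not visible in the hunk above, so a plain boolean takes its place.

```c
#include <assert.h>
#include <stdbool.h>

/* Minimal userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static bool list_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Unlink the entry and leave it self-linked, so repeating the call is safe. */
static void list_del_init(struct list_head *e)
{
	e->prev->next = e->next;
	e->next->prev = e->prev;
	init_list_head(e);
}

/* Unlink the entry from wherever it is and append it at the tail of head. */
static void list_move_tail(struct list_head *e, struct list_head *head)
{
	list_del_init(e);
	e->prev = head->prev;
	e->next = head;
	head->prev->next = e;
	head->prev = e;
}

/* Hypothetical, simplified buffer object: only what the sketch needs. */
struct sketch_bo {
	bool in_system_domain;	/* stand-in for the real, elided condition */
	struct list_head swap;
};

/* Models the patched logic: swappable BOs go to the tail, others are removed. */
static void sketch_move_to_lru_tail(struct sketch_bo *bo,
				    struct list_head *swap_lru)
{
	if (bo->in_system_domain)
		list_move_tail(&bo->swap, swap_lru);
	else
		list_del_init(&bo->swap);
}
```

Without the else branch, a BO that had once been placed on the swap list would stay there after moving to another domain. Because list_del_init() leaves the entry self-linked, calling it on a BO that is not on any list is harmless.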
From: Nicholas Piggin npiggin@gmail.com
[ Upstream commit 2c8c89b95831f46a2fb31a8d0fef4601694023ce ]
The paravirt queued spinlock slow path adds itself to the queue then calls pv_wait to wait for the lock to become free. This is implemented by calling H_CONFER to donate cycles.
When hcall tracing is enabled, this H_CONFER call can lead to a spin lock being taken in the tracing code, which results in the lock being taken again. That second attempt also goes to the slow path, and because it queues behind itself it won't ever make progress.
An example trace of a deadlock:
  __pv_queued_spin_lock_slowpath
  trace_clock_global
  ring_buffer_lock_reserve
  trace_event_buffer_lock_reserve
  trace_event_buffer_reserve
  trace_event_raw_event_hcall_exit
  __trace_hcall_exit
  plpar_hcall_norets_trace
  __pv_queued_spin_lock_slowpath
  trace_clock_global
  ring_buffer_lock_reserve
  trace_event_buffer_lock_reserve
  trace_event_buffer_reserve
  trace_event_raw_event_rcu_dyntick
  rcu_irq_exit
  irq_exit
  __do_irq
  call_do_irq
  do_IRQ
  hardware_interrupt_common_virt
Fix this by introducing plpar_hcall_norets_notrace(), and using it for the SPLPAR virtual processor dispatching hcalls made by the paravirt spinlock code.
Signed-off-by: Nicholas Piggin npiggin@gmail.com
Reviewed-by: Naveen N. Rao naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Michael Ellerman mpe@ellerman.id.au
Link: https://lore.kernel.org/r/20210508101455.1578318-2-npiggin@gmail.com
Signed-off-by: Sasha Levin sashal@kernel.org
---
 arch/powerpc/include/asm/hvcall.h       |  3 +++
 arch/powerpc/include/asm/paravirt.h     | 22 +++++++++++++++++++---
 arch/powerpc/platforms/pseries/hvCall.S | 10 ++++++++++
 arch/powerpc/platforms/pseries/lpar.c   |  3 +--
 4 files changed, 33 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/include/asm/hvcall.h b/arch/powerpc/include/asm/hvcall.h
index ed6086d57b22..0c92b01a3c3c 100644
--- a/arch/powerpc/include/asm/hvcall.h
+++ b/arch/powerpc/include/asm/hvcall.h
@@ -446,6 +446,9 @@
  */
 long plpar_hcall_norets(unsigned long opcode, ...);
 
+/* Variant which does not do hcall tracing */
+long plpar_hcall_norets_notrace(unsigned long opcode, ...);
+
 /**
  * plpar_hcall: - Make a pseries hypervisor call
  * @opcode: The hypervisor call to make.
diff --git a/arch/powerpc/include/asm/paravirt.h b/arch/powerpc/include/asm/paravirt.h
index 5d1726bb28e7..bcb7b5f917be 100644
--- a/arch/powerpc/include/asm/paravirt.h
+++ b/arch/powerpc/include/asm/paravirt.h
@@ -28,19 +28,35 @@ static inline u32 yield_count_of(int cpu)
 	return be32_to_cpu(yield_count);
 }
 
+/*
+ * Spinlock code confers and prods, so don't trace the hcalls because the
+ * tracing code takes spinlocks which can cause recursion deadlocks.
+ *
+ * These calls are made while the lock is not held: the lock slowpath yields if
+ * it can not acquire the lock, and unlock slow path might prod if a waiter has
+ * yielded). So this may not be a problem for simple spin locks because the
+ * tracing does not technically recurse on the lock, but we avoid it anyway.
+ *
+ * However the queued spin lock contended path is more strictly ordered: the
+ * H_CONFER hcall is made after the task has queued itself on the lock, so then
+ * recursing on that lock will cause the task to then queue up again behind the
+ * first instance (or worse: queued spinlocks use tricks that assume a context
+ * never waits on more than one spinlock, so such recursion may cause random
+ * corruption in the lock code).
+ */
 static inline void yield_to_preempted(int cpu, u32 yield_count)
 {
-	plpar_hcall_norets(H_CONFER, get_hard_smp_processor_id(cpu), yield_count);
+	plpar_hcall_norets_notrace(H_CONFER, get_hard_smp_processor_id(cpu), yield_count);
 }
 
 static inline void prod_cpu(int cpu)
 {
-	plpar_hcall_norets(H_PROD, get_hard_smp_processor_id(cpu));
+	plpar_hcall_norets_notrace(H_PROD, get_hard_smp_processor_id(cpu));
 }
 
 static inline void yield_to_any(void)
 {
-	plpar_hcall_norets(H_CONFER, -1, 0);
+	plpar_hcall_norets_notrace(H_CONFER, -1, 0);
 }
 #else
 static inline bool is_shared_processor(void)
diff --git a/arch/powerpc/platforms/pseries/hvCall.S b/arch/powerpc/platforms/pseries/hvCall.S
index 2136e42833af..8a2b8d64265b 100644
--- a/arch/powerpc/platforms/pseries/hvCall.S
+++ b/arch/powerpc/platforms/pseries/hvCall.S
@@ -102,6 +102,16 @@ END_FTR_SECTION(0, 1);						\
 #define HCALL_BRANCH(LABEL)
 #endif
 
+_GLOBAL_TOC(plpar_hcall_norets_notrace)
+	HMT_MEDIUM
+
+	mfcr	r0
+	stw	r0,8(r1)
+	HVSC				/* invoke the hypervisor */
+	lwz	r0,8(r1)
+	mtcrf	0xff,r0
+	blr				/* return r3 = status */
+
 _GLOBAL_TOC(plpar_hcall_norets)
 	HMT_MEDIUM
 
diff --git a/arch/powerpc/platforms/pseries/lpar.c b/arch/powerpc/platforms/pseries/lpar.c
index cd38bd421f38..d4aa6a46e1fa 100644
--- a/arch/powerpc/platforms/pseries/lpar.c
+++ b/arch/powerpc/platforms/pseries/lpar.c
@@ -1830,8 +1830,7 @@ void hcall_tracepoint_unregfunc(void)
 
 /*
  * Since the tracing code might execute hcalls we need to guard against
- * recursion. One example of this are spinlocks calling H_YIELD on
- * shared processor partitions.
+ * recursion.
  */
 static DEFINE_PER_CPU(unsigned int, hcall_trace_depth);
 
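The re-entrancy described in the commit message can be modelled in userspace with a toy sketch. Everything here is hypothetical: the `sketch_*` functions only imitate the call shape of the lock slow path, the traced hcall wrapper and the untraced variant, and the recursion is capped at depth 2 so the model terminates, whereas the real slow path would queue behind itself and never make progress.

```c
#include <assert.h>
#include <stdbool.h>

static bool tracing_enabled;
static int slowpath_depth;	/* slow-path invocations currently active */
static int max_slowpath_depth;	/* high-water mark; above 1 means recursion */

static void sketch_lock_slowpath(bool use_notrace);

/* Stand-in for the tracing code, which itself has to take a spin lock. */
static void sketch_trace_hcall(void)
{
	/* Model a contended trace lock; cap the depth so the toy terminates. */
	if (slowpath_depth < 2)
		sketch_lock_slowpath(false);
}

/* Untraced variant: issues the (pretend) hcall and nothing else. */
static long sketch_hcall_norets_notrace(void)
{
	return 0;
}

/* Traced variant: same hcall, followed by the tracing hook when enabled. */
static long sketch_hcall_norets(void)
{
	long rc = sketch_hcall_norets_notrace();

	if (tracing_enabled)
		sketch_trace_hcall();
	return rc;
}

/* Stand-in for the queued spinlock slow path, which confers while waiting. */
static void sketch_lock_slowpath(bool use_notrace)
{
	slowpath_depth++;
	if (slowpath_depth > max_slowpath_depth)
		max_slowpath_depth = slowpath_depth;

	if (use_notrace)
		sketch_hcall_norets_notrace();
	else
		sketch_hcall_norets();

	slowpath_depth--;
}
```

With tracing enabled, calling sketch_lock_slowpath(false) re-enters the slow path from inside the tracing hook, while sketch_lock_slowpath(true) never does. That is the property the patch obtains by switching the spinlock code to plpar_hcall_norets_notrace().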
From: Oleg Nesterov oleg@redhat.com
[ Upstream commit dbb5afad100a828c97e012c6106566d99f041db6 ]
Suppose we have 2 threads, the group-leader L and a sub-thread T, both parked in ptrace_stop(). Debugger tries to resume both threads and does
	ptrace(PTRACE_CONT, T);
	ptrace(PTRACE_CONT, L);
If the sub-thread T execs in between, the 2nd PTRACE_CONT does not resume the old leader L, it resumes the post-exec thread T which was actually now stopped in PTRACE_EVENT_EXEC. In this case the PTRACE_EVENT_EXEC event is lost, and the tracer can't know that the tracee changed its pid.
This patch makes ptrace() fail in this case until the debugger does wait() and consumes PTRACE_EVENT_EXEC, which reports old_pid. This affects all ptrace requests except the "asynchronous" PTRACE_INTERRUPT/KILL.
The patch doesn't add the new PTRACE_ option to not complicate the API, and I _hope_ this won't cause any noticeable regression:
- If debugger uses PTRACE_O_TRACEEXEC and the thread did an exec and the tracer does a ptrace request without having consumed the exec event, it's 100% sure that the thread the ptracer thinks it is targeting does not exist anymore, or isn't the same as the one it thinks it is targeting.
- To some degree this patch adds nothing new. In the scenario above ptrace(L) can fail with -ESRCH if it is called after the execing sub-thread wakes the leader up and before it "steals" the leader's pid.
Test-case:
#include <stdio.h>
#include <unistd.h>
#include <signal.h>
#include <sys/ptrace.h>
#include <sys/wait.h>
#include <errno.h>
#include <pthread.h>
#include <assert.h>

void *tf(void *arg)
{
	execve("/usr/bin/true", NULL, NULL);
	assert(0);

	return NULL;
}

int main(void)
{
	int leader = fork();
	if (!leader) {
		kill(getpid(), SIGSTOP);

		pthread_t th;
		pthread_create(&th, NULL, tf, NULL);
		for (;;)
			pause();

		return 0;
	}

	waitpid(leader, NULL, WSTOPPED);

	ptrace(PTRACE_SEIZE, leader, 0,
			PTRACE_O_TRACECLONE | PTRACE_O_TRACEEXEC);
	waitpid(leader, NULL, 0);

	ptrace(PTRACE_CONT, leader, 0,0);
	waitpid(leader, NULL, 0);

	int status, thread = waitpid(-1, &status, 0);
	assert(thread > 0 && thread != leader);
	assert(status == 0x80137f);

	ptrace(PTRACE_CONT, thread, 0,0);
	/*
	 * waitid() because waitpid(leader, &status, WNOWAIT) does not
	 * report status. Why ????
	 *
	 * Why WEXITED? because we have another kernel problem connected
	 * to mt-exec.
	 */
	siginfo_t info;
	assert(waitid(P_PID, leader, &info, WSTOPPED|WEXITED|WNOWAIT) == 0);
	assert(info.si_pid == leader && info.si_status == 0x0405);

	/* OK, it sleeps in ptrace(PTRACE_EVENT_EXEC == 0x04) */
	assert(ptrace(PTRACE_CONT, leader, 0,0) == -1);
	assert(errno == ESRCH);

	assert(leader == waitpid(leader, &status, WNOHANG));
	assert(status == 0x04057f);

	assert(ptrace(PTRACE_CONT, leader, 0,0) == 0);

	return 0;
}
Signed-off-by: Oleg Nesterov oleg@redhat.com
Reported-by: Simon Marchi simon.marchi@efficios.com
Acked-by: "Eric W. Biederman" ebiederm@xmission.com
Acked-by: Pedro Alves palves@redhat.com
Acked-by: Simon Marchi simon.marchi@efficios.com
Acked-by: Jan Kratochvil jan.kratochvil@redhat.com
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 kernel/ptrace.c | 18 +++++++++++++++++-
 1 file changed, 17 insertions(+), 1 deletion(-)

diff --git a/kernel/ptrace.c b/kernel/ptrace.c
index 61db50f7ca86..5f50fdd1d855 100644
--- a/kernel/ptrace.c
+++ b/kernel/ptrace.c
@@ -169,6 +169,21 @@ void __ptrace_unlink(struct task_struct *child)
 	spin_unlock(&child->sighand->siglock);
 }
 
+static bool looks_like_a_spurious_pid(struct task_struct *task)
+{
+	if (task->exit_code != ((PTRACE_EVENT_EXEC << 8) | SIGTRAP))
+		return false;
+
+	if (task_pid_vnr(task) == task->ptrace_message)
+		return false;
+	/*
+	 * The tracee changed its pid but the PTRACE_EVENT_EXEC event
+	 * was not wait()'ed, most probably debugger targets the old
+	 * leader which was destroyed in de_thread().
+	 */
+	return true;
+}
+
 /* Ensure that nothing can wake it up, even SIGKILL */
 static bool ptrace_freeze_traced(struct task_struct *task)
 {
@@ -179,7 +194,8 @@ static bool ptrace_freeze_traced(struct task_struct *task)
 		return ret;
 
 	spin_lock_irq(&task->sighand->siglock);
-	if (task_is_traced(task) && !__fatal_signal_pending(task)) {
+	if (task_is_traced(task) && !looks_like_a_spurious_pid(task) &&
+	    !__fatal_signal_pending(task)) {
 		task->state = __TASK_TRACED;
 		ret = true;
 	}
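The magic numbers in the test case (0x04057f, 0x80137f, 0x0405) follow the usual layout of a ptrace-stop wait status: a low byte of 0x7f marks a stop, the next byte is the stop signal, and the bits above 16 carry the ptrace event. A minimal decoding sketch follows, with hypothetical `sketch_*` helper names and the event number spelled out locally instead of being taken from <sys/ptrace.h>:

```c
#include <assert.h>
#include <signal.h>
#include <stdbool.h>
#include <sys/wait.h>

/* Value of PTRACE_EVENT_EXEC in the Linux ptrace ABI, defined locally. */
enum { SKETCH_PTRACE_EVENT_EXEC = 4 };

/* Extract the ptrace event number stored above the stop signal. */
static int sketch_ptrace_event(int status)
{
	return (status >> 16) & 0xffff;
}

/* True if the wait status describes a PTRACE_EVENT_EXEC stop. */
static bool sketch_is_exec_stop(int status)
{
	return WIFSTOPPED(status) &&
	       WSTOPSIG(status) == SIGTRAP &&
	       sketch_ptrace_event(status) == SKETCH_PTRACE_EVENT_EXEC;
}
```

Under this reading, 0x04057f is an exec stop, and 0x0405 is the same (event << 8) | SIGTRAP value that looks_like_a_spurious_pid() compares against task->exit_code. A tracer that sees such a stop is expected to consume it with wait() before issuing further requests on that pid.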
From: Daniel Wagner dwagner@suse.de
[ Upstream commit 85428beac80dbcace5b146b218697c73e367dcf5 ]
Reset the ns->file value to NULL also in the error case in nvmet_file_ns_enable().
The ns->file variable either points to a file object or contains the error code after the filp_open() call. This can lead to the following problem:
When the user first sets up an invalid file backend and tries to enable the ns, it will fail. Then the user switches over to a bdev backend and successfully enables the ns. The first received I/O will crash the system because the I/O backend is chosen based on the ns->file value:
	static u16 nvmet_parse_io_cmd(struct nvmet_req *req)
	{
		[...]

		if (req->ns->file)
			return nvmet_file_parse_io_cmd(req);

		return nvmet_bdev_parse_io_cmd(req);
	}
Reported-by: Enzo Matsumiya ematsumiya@suse.com
Signed-off-by: Daniel Wagner dwagner@suse.de
Signed-off-by: Christoph Hellwig hch@lst.de
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/nvme/target/io-cmd-file.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/target/io-cmd-file.c b/drivers/nvme/target/io-cmd-file.c
index 715d4376c997..7fdbdc496597 100644
--- a/drivers/nvme/target/io-cmd-file.c
+++ b/drivers/nvme/target/io-cmd-file.c
@@ -49,9 +49,11 @@ int nvmet_file_ns_enable(struct nvmet_ns *ns)
 
 	ns->file = filp_open(ns->device_path, flags, 0);
 	if (IS_ERR(ns->file)) {
-		pr_err("failed to open file %s: (%ld)\n",
-				ns->device_path, PTR_ERR(ns->file));
-		return PTR_ERR(ns->file);
+		ret = PTR_ERR(ns->file);
+		pr_err("failed to open file %s: (%d)\n",
+			ns->device_path, ret);
+		ns->file = NULL;
+		return ret;
 	}
 
 	ret = nvmet_file_ns_revalidate(ns);
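Why a stale error pointer is dangerous can be shown with a userspace sketch of the error-pointer convention. The `sketch_*` helpers below are hypothetical re-implementations that only imitate ERR_PTR(), PTR_ERR() and IS_ERR(); `struct sketch_ns`, `sketch_open()` and the `reset_on_error` switch are invented for illustration and are not part of the nvmet code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_MAX_ERRNO 4095

/* Encode a negative errno value in a pointer, as ERR_PTR() does. */
static void *sketch_err_ptr(long err)
{
	return (void *)(intptr_t)err;
}

/* Recover the errno value from an error pointer, as PTR_ERR() does. */
static long sketch_ptr_err(const void *p)
{
	return (long)(intptr_t)p;
}

/* Error pointers occupy the top SKETCH_MAX_ERRNO addresses. */
static bool sketch_is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

/* Hypothetical namespace with just the field the sketch needs. */
struct sketch_ns {
	void *file;
};

/* Pretend open: returns a dummy object or an encoded -ENOENT. */
static void *sketch_open(bool ok)
{
	static int dummy_file;

	return ok ? (void *)&dummy_file : sketch_err_ptr(-ENOENT);
}

/* reset_on_error = false models the old error path, true the patched one. */
static int sketch_file_ns_enable(struct sketch_ns *ns, bool open_ok,
				 bool reset_on_error)
{
	ns->file = sketch_open(open_ok);
	if (sketch_is_err(ns->file)) {
		int ret = (int)sketch_ptr_err(ns->file);

		if (reset_on_error)
			ns->file = NULL;
		return ret;
	}
	return 0;
}

/* Same test the I/O path uses to pick a backend: non-NULL means "file". */
static bool sketch_uses_file_backend(const struct sketch_ns *ns)
{
	return ns->file != NULL;
}
```

With reset_on_error set to false, a failed enable leaves a non-NULL error pointer behind, so the backend check still selects the file path even though no file was opened. Resetting the field to NULL, as the patch does, makes the check fall through to the bdev path.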
From: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp
[ Upstream commit ffb324e6f874121f7dce5bdae5e05d02baae7269 ]
syzbot is reporting an OOB write at vga16fb_imageblit() [1], because resize_screen() from ioctl(VT_RESIZE) returns 0 without checking whether the requested rows/columns fit the amount of memory reserved for the graphical screen if the current mode is KD_GRAPHICS.
----------
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <sys/ioctl.h>
#include <linux/kd.h>
#include <linux/vt.h>

int main(int argc, char *argv[])
{
	const int fd = open("/dev/char/4:1", O_RDWR);
	struct vt_sizes vt = { 0x4100, 2 };

	ioctl(fd, KDSETMODE, KD_GRAPHICS);
	ioctl(fd, VT_RESIZE, &vt);
	ioctl(fd, KDSETMODE, KD_TEXT);
	return 0;
}
----------
Allow framebuffer drivers to return -EINVAL, by moving the vc->vc_mode != KD_GRAPHICS check from resize_screen() to fbcon_resize().
Link: https://syzkaller.appspot.com/bug?extid=1f29e126cf461c4de3b3 [1]
Reported-by: syzbot syzbot+1f29e126cf461c4de3b3@syzkaller.appspotmail.com
Suggested-by: Linus Torvalds torvalds@linux-foundation.org
Signed-off-by: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp
Tested-by: syzbot syzbot+1f29e126cf461c4de3b3@syzkaller.appspotmail.com
Signed-off-by: Linus Torvalds torvalds@linux-foundation.org
Signed-off-by: Sasha Levin sashal@kernel.org
---
 drivers/tty/vt/vt.c              | 2 +-
 drivers/video/fbdev/core/fbcon.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/tty/vt/vt.c b/drivers/tty/vt/vt.c
index 0cc360da5426..53cbf2c3f033 100644
--- a/drivers/tty/vt/vt.c
+++ b/drivers/tty/vt/vt.c
@@ -1171,7 +1171,7 @@ static inline int resize_screen(struct vc_data *vc, int width, int height,
 	/* Resizes the resolution of the display adapater */
 	int err = 0;
 
-	if (vc->vc_mode != KD_GRAPHICS && vc->vc_sw->con_resize)
+	if (vc->vc_sw->con_resize)
 		err = vc->vc_sw->con_resize(vc, width, height, user);
 
 	return err;
diff --git a/drivers/video/fbdev/core/fbcon.c b/drivers/video/fbdev/core/fbcon.c
index 3406067985b1..22bb3892f6bd 100644
--- a/drivers/video/fbdev/core/fbcon.c
+++ b/drivers/video/fbdev/core/fbcon.c
@@ -2019,7 +2019,7 @@ static int fbcon_resize(struct vc_data *vc, unsigned int width,
 			return -EINVAL;
 
 		pr_debug("resize now %ix%i\n", var.xres, var.yres);
-		if (con_is_visible(vc)) {
+		if (con_is_visible(vc) && vc->vc_mode == KD_TEXT) {
 			var.activate = FB_ACTIVATE_NOW |
 				FB_ACTIVATE_FORCE;
 			fb_set_var(info, &var);
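The shape of the change, validate in every mode but only reprogram the display in text mode, can be sketched in userspace as follows. This is a simplified model under stated assumptions: `struct sketch_vc` and the `sketch_*` functions are hypothetical, and the plain bounds check merely stands in for whatever validation the framebuffer driver actually performs.

```c
#include <assert.h>
#include <errno.h>

enum sketch_mode { SKETCH_KD_TEXT, SKETCH_KD_GRAPHICS };

/* Hypothetical console state: mode, a size limit, and the programmed size. */
struct sketch_vc {
	enum sketch_mode mode;
	unsigned int max_cols, max_rows;	/* stand-in for driver limits */
	unsigned int hw_cols, hw_rows;		/* last size sent to "hardware" */
};

/* Driver hook: always validates, touches the hardware only in text mode. */
static int sketch_con_resize(struct sketch_vc *vc, unsigned int cols,
			     unsigned int rows)
{
	if (cols > vc->max_cols || rows > vc->max_rows)
		return -EINVAL;

	if (vc->mode == SKETCH_KD_TEXT) {
		vc->hw_cols = cols;
		vc->hw_rows = rows;
	}
	return 0;
}

/* Core helper after the patch: no mode check, the hook is always consulted. */
static int sketch_resize_screen(struct sketch_vc *vc, unsigned int cols,
				unsigned int rows)
{
	return sketch_con_resize(vc, cols, rows);
}
```

In this model an oversized request such as the 0x4100 rows from the reproducer is rejected with -EINVAL even in graphics mode, while a valid request in graphics mode succeeds without changing the programmed size.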
On Mon, May 17, 2021 at 6:09 PM Sasha Levin sashal@kernel.org wrote:
> From: Tetsuo Handa penguin-kernel@I-love.SAKURA.ne.jp
> [ Upstream commit ffb324e6f874121f7dce5bdae5e05d02baae7269 ]
So I think the commit is fine, and yes, it should be applied to stable, but it's one of those "there were three different patches in as many days to fix the problem, and this is the right one, but maybe stable should hold off for a while to see that there aren't any problem reports".
I don't think there will be any problems from this, but while the patch is tiny, it's conceptually quite a big change to something that people haven't really touched for a long time.
So use your own judgement, but it might be a good idea to wait a week before backporting this to see if anything screams.
Linus
On Mon, May 17, 2021 at 06:35:24PM -0700, Linus Torvalds wrote:
> So I think the commit is fine, and yes, it should be applied to stable, but it's one of those "there were three different patches in as many days to fix the problem, and this is the right one, but maybe stable should hold off for a while to see that there aren't any problem reports".
> I don't think there will be any problems from this, but while the patch is tiny, it's conceptually quite a big change to something that people haven't really touched for a long time.
> So use your own judgement, but it might be a good idea to wait a week before backporting this to see if anything screams.
I was going to wait a few weeks for this, and the other vt patches that were marked with cc: stable@ before queueing them up.
thanks,
greg k-h
On Tue, May 18, 2021 at 07:45:59AM +0200, Greg KH wrote:
> I was going to wait a few weeks for this, and the other vt patches that were marked with cc: stable@ before queueing them up.
I'll drop it from my queue then.
On Tue, May 18, 2021 at 09:22:48AM -0400, Sasha Levin wrote:
> I'll drop it from my queue then.
Thanks!
On Tue, May 18, 2021 at 07:45:59AM +0200, Greg KH wrote:
> I was going to wait a few weeks for this, and the other vt patches that were marked with cc: stable@ before queueing them up.
I have now queued all of these up.
greg k-h
linux-stable-mirror@lists.linaro.org