Hi,
Two parts here:
1) The wakeup fix that went into 5.10-stable, but hadn't been done for
5.15-stable yet. It was the last 3 patches in the 5.10-stable backport
for io_uring
2) Other patches that were marked for stable or should go to stable, but
initially failed.
This gets us to basically parity on the regression test front for 5.15,
and have all been runtime tested.
Please queue up for the next 5.15-stable, thanks!
--
Jens Axboe
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
0d7c1153d929 ("io_uring: Clean up a false-positive warning from GCC 9.3.0")
d1fd1c201d75 ("io_uring: simplify selected buf handling")
3648e5265cfa ("io_uring: move up io_put_kbuf() and io_put_rw_kbuf()")
04c76b41ca97 ("io_uring: add option to skip CQE posting")
913a571affed ("io_uring: clean cqe filling functions")
7297ce3d5944 ("io_uring: improve send/recv error handling")
54daa9b2d80a ("io_uring: correct fill events helpers types")
867f8fa5aeb7 ("io_uring: inline io_req_needs_clean()")
d17e56eb4907 ("io_uring: remove struct io_completion")
d886e185a128 ("io_uring: control ->async_data with a REQ_F flag")
fff4e40e3094 ("io_uring: delay req queueing into compl-batch list")
51d48dab62ed ("io_uring: add more likely/unlikely() annotations")
7e3709d57651 ("io_uring: optimise kiocb layout")
30d51dd4ad20 ("io_uring: clean up buffer select")
a1cdbb4cb5f7 ("io_uring: comment why inline complete calls io_clean_op()")
ef05d9ebcc92 ("io_uring: kill off ->inflight_entry field")
6962980947e2 ("io_uring: restructure submit sqes to_submit checks")
d9f9d2842c91 ("io_uring: reshuffle queue_sqe completion handling")
f5ed3bcd5b11 ("io_uring: optimise batch completion")
b3fa03fd1b17 ("io_uring: convert iopoll_completed to store_release")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0d7c1153d9291197c1dc473cfaade77acb874b4b Mon Sep 17 00:00:00 2001
From: Alviro Iskandar Setiawan <alviro.iskandar(a)gmail.com>
Date: Mon, 7 Feb 2022 21:05:33 +0700
Subject: [PATCH] io_uring: Clean up a false-positive warning from GCC 9.3.0
In io_recv(), if import_single_range() fails, the @flags variable is
uninitialized, then it will goto out_free.
After the goto, the compiler doesn't know that (ret < min_ret) is
always true, so it thinks the "if ((flags & MSG_WAITALL) ..." path
could be taken.
The complaint comes from gcc-9 (Debian 9.3.0-22) 9.3.0:
```
fs/io_uring.c:5238 io_recvfrom() error: uninitialized symbol 'flags'
```
Fix this by bypassing the @ret and @flags check when
import_single_range() fails.
Reasons:
1. import_single_range() only returns -EFAULT when it fails.
2. At that point, @flags is uninitialized and shouldn't be read.
Reported-by: kernel test robot <lkp(a)intel.com>
Reported-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Reported-by: "Chen, Rong A" <rong.a.chen(a)intel.com>
Link: https://lore.gnuweeb.org/timl/d33bb5a9-8173-f65b-f653-51fc0681c6d6@intel.co…
Cc: Pavel Begunkov <asml.silence(a)gmail.com>
Suggested-by: Ammar Faizi <ammarfaizi2(a)gnuweeb.org>
Fixes: 7297ce3d59449de49d3c9e1f64ae25488750a1fc ("io_uring: improve send/recv error handling")
Signed-off-by: Alviro Iskandar Setiawan <alviro.iskandar(a)gmail.com>
Signed-off-by: Ammar Faizi <ammarfaizi2(a)gnuweeb.org>
Link: https://lore.kernel.org/r/20220207140533.565411-1-ammarfaizi2@gnuweeb.org
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 2e04f718319d..3445c4da0153 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -5228,7 +5228,6 @@ static int io_recv(struct io_kiocb *req, unsigned int issue_flags)
min_ret = iov_iter_count(&msg.msg_iter);
ret = sock_recvmsg(sock, &msg, flags);
-out_free:
if (ret < min_ret) {
if (ret == -EAGAIN && force_nonblock)
return -EAGAIN;
@@ -5236,9 +5235,9 @@ static int io_recv(struct io_kiocb *req, unsigned int issue_flags)
ret = -EINTR;
req_set_fail(req);
} else if ((flags & MSG_WAITALL) && (msg.msg_flags & (MSG_TRUNC | MSG_CTRUNC))) {
+out_free:
req_set_fail(req);
}
-
__io_req_complete(req, issue_flags, ret, io_put_kbuf(req));
return 0;
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
7cfe7a09489c ("io_uring: clear TIF_NOTIFY_SIGNAL if set and task_work not available")
46a525e199e4 ("io_uring: don't gate task_work run on TIF_NOTIFY_SIGNAL")
c0e0d6ba25f1 ("io_uring: add IORING_SETUP_DEFER_TASKRUN")
b4c98d59a787 ("io_uring: introduce io_has_work")
78a861b94959 ("io_uring: add sync cancelation API through io_uring_register()")
c34398a8c018 ("io_uring: remove __io_req_task_work_add")
ed5ccb3beeba ("io_uring: remove priority tw list optimisation")
625d38b3fd34 ("io_uring: improve io_run_task_work()")
4a0fef62788b ("io_uring: optimize io_uring_task layout")
253993210bd8 ("io_uring: introduce locking helpers for CQE posting")
305bef988708 ("io_uring: hide eventfd assumptions in eventfd paths")
affa87db9010 ("io_uring: fix multi ctx cancellation")
d9dee4302a7c ("io_uring: remove ->flush_cqes optimisation")
a830ffd28780 ("io_uring: move io_eventfd_signal()")
9046c6415be6 ("io_uring: reshuffle io_uring/io_uring.h")
d142c3ec8d16 ("io_uring: remove extra io_commit_cqring()")
68494a65d0e2 ("io_uring: introduce io_req_cqe_overflow()")
faf88dde060f ("io_uring: don't inline __io_get_cqe()")
d245bca6375b ("io_uring: don't expose io_fill_cqe_aux()")
9ca9fb24d5fe ("io_uring: mutex locked poll hashing")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7cfe7a09489c1cefee7181e07b5f2bcbaebd9f41 Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe(a)kernel.dk>
Date: Fri, 25 Nov 2022 09:36:29 -0700
Subject: [PATCH] io_uring: clear TIF_NOTIFY_SIGNAL if set and task_work not
available
With how task_work is added and signaled, we can have TIF_NOTIFY_SIGNAL
set and no task_work pending as it got run in a previous loop. Treat
TIF_NOTIFY_SIGNAL like get_signal(), always clear it if set regardless
of whether or not task_work is pending to run.
Cc: stable(a)vger.kernel.org
Fixes: 46a525e199e4 ("io_uring: don't gate task_work run on TIF_NOTIFY_SIGNAL")
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index cef5ff924e63..50bc3af44953 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -238,9 +238,14 @@ static inline unsigned int io_sqring_entries(struct io_ring_ctx *ctx)
static inline int io_run_task_work(void)
{
+ /*
+ * Always check-and-clear the task_work notification signal. With how
+ * signaling works for task_work, we can find it set with nothing to
+ * run. We need to clear it for that case, like get_signal() does.
+ */
+ if (test_thread_flag(TIF_NOTIFY_SIGNAL))
+ clear_notify_signal();
if (task_work_pending(current)) {
- if (test_thread_flag(TIF_NOTIFY_SIGNAL))
- clear_notify_signal();
__set_current_state(TASK_RUNNING);
task_work_run();
return 1;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
7cfe7a09489c ("io_uring: clear TIF_NOTIFY_SIGNAL if set and task_work not available")
46a525e199e4 ("io_uring: don't gate task_work run on TIF_NOTIFY_SIGNAL")
c0e0d6ba25f1 ("io_uring: add IORING_SETUP_DEFER_TASKRUN")
b4c98d59a787 ("io_uring: introduce io_has_work")
78a861b94959 ("io_uring: add sync cancelation API through io_uring_register()")
c34398a8c018 ("io_uring: remove __io_req_task_work_add")
ed5ccb3beeba ("io_uring: remove priority tw list optimisation")
625d38b3fd34 ("io_uring: improve io_run_task_work()")
4a0fef62788b ("io_uring: optimize io_uring_task layout")
253993210bd8 ("io_uring: introduce locking helpers for CQE posting")
305bef988708 ("io_uring: hide eventfd assumptions in eventfd paths")
affa87db9010 ("io_uring: fix multi ctx cancellation")
d9dee4302a7c ("io_uring: remove ->flush_cqes optimisation")
a830ffd28780 ("io_uring: move io_eventfd_signal()")
9046c6415be6 ("io_uring: reshuffle io_uring/io_uring.h")
d142c3ec8d16 ("io_uring: remove extra io_commit_cqring()")
68494a65d0e2 ("io_uring: introduce io_req_cqe_overflow()")
faf88dde060f ("io_uring: don't inline __io_get_cqe()")
d245bca6375b ("io_uring: don't expose io_fill_cqe_aux()")
9ca9fb24d5fe ("io_uring: mutex locked poll hashing")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 7cfe7a09489c1cefee7181e07b5f2bcbaebd9f41 Mon Sep 17 00:00:00 2001
From: Jens Axboe <axboe(a)kernel.dk>
Date: Fri, 25 Nov 2022 09:36:29 -0700
Subject: [PATCH] io_uring: clear TIF_NOTIFY_SIGNAL if set and task_work not
available
With how task_work is added and signaled, we can have TIF_NOTIFY_SIGNAL
set and no task_work pending as it got run in a previous loop. Treat
TIF_NOTIFY_SIGNAL like get_signal(), always clear it if set regardless
of whether or not task_work is pending to run.
Cc: stable(a)vger.kernel.org
Fixes: 46a525e199e4 ("io_uring: don't gate task_work run on TIF_NOTIFY_SIGNAL")
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index cef5ff924e63..50bc3af44953 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -238,9 +238,14 @@ static inline unsigned int io_sqring_entries(struct io_ring_ctx *ctx)
static inline int io_run_task_work(void)
{
+ /*
+ * Always check-and-clear the task_work notification signal. With how
+ * signaling works for task_work, we can find it set with nothing to
+ * run. We need to clear it for that case, like get_signal() does.
+ */
+ if (test_thread_flag(TIF_NOTIFY_SIGNAL))
+ clear_notify_signal();
if (task_work_pending(current)) {
- if (test_thread_flag(TIF_NOTIFY_SIGNAL))
- clear_notify_signal();
__set_current_state(TASK_RUNNING);
task_work_run();
return 1;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
55ba18dc62de ("octeontx2-pf: Fix the use of GFP_KERNEL in atomic context on rt")
4af1b64f80fb ("octeontx2-pf: Fix lmtst ID used in aura free")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 55ba18dc62deff5910c0fa64486dea1ff20832ff Mon Sep 17 00:00:00 2001
From: Kevin Hao <haokexin(a)gmail.com>
Date: Wed, 18 Jan 2023 15:13:00 +0800
Subject: [PATCH] octeontx2-pf: Fix the use of GFP_KERNEL in atomic context on
rt
The commit 4af1b64f80fb ("octeontx2-pf: Fix lmtst ID used in aura
free") uses the get/put_cpu() to protect the usage of percpu pointer
in ->aura_freeptr() callback, but it also unnecessarily disable the
preemption for the blockable memory allocation. The commit 87b93b678e95
("octeontx2-pf: Avoid use of GFP_KERNEL in atomic context") tried to
fix these sleep inside atomic warnings. But it only fix the one for
the non-rt kernel. For the rt kernel, we still get the similar warnings
like below.
BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:46
in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0
preempt_count: 1, expected: 0
RCU nest depth: 0, expected: 0
3 locks held by swapper/0/1:
#0: ffff800009fc5fe8 (rtnl_mutex){+.+.}-{3:3}, at: rtnl_lock+0x24/0x30
#1: ffff000100c276c0 (&mbox->lock){+.+.}-{3:3}, at: otx2_init_hw_resources+0x8c/0x3a4
#2: ffffffbfef6537e0 (&cpu_rcache->lock){+.+.}-{2:2}, at: alloc_iova_fast+0x1ac/0x2ac
Preemption disabled at:
[<ffff800008b1908c>] otx2_rq_aura_pool_init+0x14c/0x284
CPU: 20 PID: 1 Comm: swapper/0 Tainted: G W 6.2.0-rc3-rt1-yocto-preempt-rt #1
Hardware name: Marvell OcteonTX CN96XX board (DT)
Call trace:
dump_backtrace.part.0+0xe8/0xf4
show_stack+0x20/0x30
dump_stack_lvl+0x9c/0xd8
dump_stack+0x18/0x34
__might_resched+0x188/0x224
rt_spin_lock+0x64/0x110
alloc_iova_fast+0x1ac/0x2ac
iommu_dma_alloc_iova+0xd4/0x110
__iommu_dma_map+0x80/0x144
iommu_dma_map_page+0xe8/0x260
dma_map_page_attrs+0xb4/0xc0
__otx2_alloc_rbuf+0x90/0x150
otx2_rq_aura_pool_init+0x1c8/0x284
otx2_init_hw_resources+0xe4/0x3a4
otx2_open+0xf0/0x610
__dev_open+0x104/0x224
__dev_change_flags+0x1e4/0x274
dev_change_flags+0x2c/0x7c
ic_open_devs+0x124/0x2f8
ip_auto_config+0x180/0x42c
do_one_initcall+0x90/0x4dc
do_basic_setup+0x10c/0x14c
kernel_init_freeable+0x10c/0x13c
kernel_init+0x2c/0x140
ret_from_fork+0x10/0x20
Of course, we can shuffle the get/put_cpu() to only wrap the invocation
of ->aura_freeptr() as what commit 87b93b678e95 does. But there are only
two ->aura_freeptr() callbacks, otx2_aura_freeptr() and
cn10k_aura_freeptr(). There is no usage of perpcu variable in the
otx2_aura_freeptr() at all, so the get/put_cpu() seems redundant to it.
We can move the get/put_cpu() into the corresponding callback which
really has the percpu variable usage and avoid the sprinkling of
get/put_cpu() in several places.
Fixes: 4af1b64f80fb ("octeontx2-pf: Fix lmtst ID used in aura free")
Signed-off-by: Kevin Hao <haokexin(a)gmail.com>
Link: https://lore.kernel.org/r/20230118071300.3271125-1-haokexin@gmail.com
Signed-off-by: Paolo Abeni <pabeni(a)redhat.com>
diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c
index 497b777b6a34..8a41ad8ca04f 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c
@@ -1012,7 +1012,6 @@ static void otx2_pool_refill_task(struct work_struct *work)
rbpool = cq->rbpool;
free_ptrs = cq->pool_ptrs;
- get_cpu();
while (cq->pool_ptrs) {
if (otx2_alloc_rbuf(pfvf, rbpool, &bufptr)) {
/* Schedule a WQ if we fails to free atleast half of the
@@ -1032,7 +1031,6 @@ static void otx2_pool_refill_task(struct work_struct *work)
pfvf->hw_ops->aura_freeptr(pfvf, qidx, bufptr + OTX2_HEAD_ROOM);
cq->pool_ptrs--;
}
- put_cpu();
cq->refill_task_sched = false;
}
@@ -1387,9 +1385,7 @@ int otx2_sq_aura_pool_init(struct otx2_nic *pfvf)
err = otx2_alloc_rbuf(pfvf, pool, &bufptr);
if (err)
goto err_mem;
- get_cpu();
pfvf->hw_ops->aura_freeptr(pfvf, pool_id, bufptr);
- put_cpu();
sq->sqb_ptrs[sq->sqb_count++] = (u64)bufptr;
}
}
@@ -1435,21 +1431,18 @@ int otx2_rq_aura_pool_init(struct otx2_nic *pfvf)
if (err)
goto fail;
- get_cpu();
/* Allocate pointers and free them to aura/pool */
for (pool_id = 0; pool_id < hw->rqpool_cnt; pool_id++) {
pool = &pfvf->qset.pool[pool_id];
for (ptr = 0; ptr < num_ptrs; ptr++) {
err = otx2_alloc_rbuf(pfvf, pool, &bufptr);
if (err)
- goto err_mem;
+ return -ENOMEM;
pfvf->hw_ops->aura_freeptr(pfvf, pool_id,
bufptr + OTX2_HEAD_ROOM);
}
}
-err_mem:
- put_cpu();
- return err ? -ENOMEM : 0;
+ return 0;
fail:
otx2_mbox_reset(&pfvf->mbox.mbox, 0);
otx2_aura_pool_free(pfvf);
diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h
index 5bee3c3a7ce4..3d22cc6a2804 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.h
@@ -736,8 +736,10 @@ static inline void cn10k_aura_freeptr(void *dev, int aura, u64 buf)
u64 ptrs[2];
ptrs[1] = buf;
+ get_cpu();
/* Free only one buffer at time during init and teardown */
__cn10k_aura_freeptr(pfvf, aura, ptrs, 2);
+ put_cpu();
}
/* Alloc pointer from pool/aura */
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
87b93b678e95 ("octeontx2-pf: Avoid use of GFP_KERNEL in atomic context")
4af1b64f80fb ("octeontx2-pf: Fix lmtst ID used in aura free")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 87b93b678e95c7d93fe6a55b0e0fbda26d8c7760 Mon Sep 17 00:00:00 2001
From: Geetha sowjanya <gakula(a)marvell.com>
Date: Fri, 13 Jan 2023 11:49:02 +0530
Subject: [PATCH] octeontx2-pf: Avoid use of GFP_KERNEL in atomic context
Using GFP_KERNEL in preemption disable context, causing below warning
when CONFIG_DEBUG_ATOMIC_SLEEP is enabled.
[ 32.542271] BUG: sleeping function called from invalid context at include/linux/sched/mm.h:274
[ 32.550883] in_atomic(): 1, irqs_disabled(): 0, non_block: 0, pid: 1, name: swapper/0
[ 32.558707] preempt_count: 1, expected: 0
[ 32.562710] RCU nest depth: 0, expected: 0
[ 32.566800] CPU: 3 PID: 1 Comm: swapper/0 Tainted: G W 6.2.0-rc2-00269-gae9dcb91c606 #7
[ 32.576188] Hardware name: Marvell CN106XX board (DT)
[ 32.581232] Call trace:
[ 32.583670] dump_backtrace.part.0+0xe0/0xf0
[ 32.587937] show_stack+0x18/0x30
[ 32.591245] dump_stack_lvl+0x68/0x84
[ 32.594900] dump_stack+0x18/0x34
[ 32.598206] __might_resched+0x12c/0x160
[ 32.602122] __might_sleep+0x48/0xa0
[ 32.605689] __kmem_cache_alloc_node+0x2b8/0x2e0
[ 32.610301] __kmalloc+0x58/0x190
[ 32.613610] otx2_sq_aura_pool_init+0x1a8/0x314
[ 32.618134] otx2_open+0x1d4/0x9d0
To avoid use of GFP_ATOMIC for memory allocation, disable preemption
after all memory allocation is done.
Fixes: 4af1b64f80fb ("octeontx2-pf: Fix lmtst ID used in aura free")
Signed-off-by: Geetha sowjanya <gakula(a)marvell.com>
Signed-off-by: Sunil Kovvuri Goutham <sgoutham(a)marvell.com>
Reviewed-by: Leon Romanovsky <leonro(a)nvidia.com>
Signed-off-by: David S. Miller <davem(a)davemloft.net>
diff --git a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c
index 88f8772a61cd..497b777b6a34 100644
--- a/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c
+++ b/drivers/net/ethernet/marvell/octeontx2/nic/otx2_common.c
@@ -1370,7 +1370,6 @@ int otx2_sq_aura_pool_init(struct otx2_nic *pfvf)
if (err)
goto fail;
- get_cpu();
/* Allocate pointers and free them to aura/pool */
for (qidx = 0; qidx < hw->tot_tx_queues; qidx++) {
pool_id = otx2_get_pool_idx(pfvf, AURA_NIX_SQ, qidx);
@@ -1388,13 +1387,14 @@ int otx2_sq_aura_pool_init(struct otx2_nic *pfvf)
err = otx2_alloc_rbuf(pfvf, pool, &bufptr);
if (err)
goto err_mem;
+ get_cpu();
pfvf->hw_ops->aura_freeptr(pfvf, pool_id, bufptr);
+ put_cpu();
sq->sqb_ptrs[sq->sqb_count++] = (u64)bufptr;
}
}
err_mem:
- put_cpu();
return err ? -ENOMEM : 0;
fail:
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
c0737fa9a5a5 ("io_uring: fix double poll leak on repolling")
10c873334feb ("io_uring: allow re-poll if we made progress")
52dd86406dfa ("io_uring: enable EPOLLEXCLUSIVE for accept poll")
4d9237e32c5d ("io_uring: recycle apoll_poll entries")
cc3cec8367cb ("io_uring: speedup provided buffer handling")
d5ec1dfaf59b ("io-uring: add __fill_cqe function")
77bc59b49817 ("io_uring: avoid ring quiesce while registering/unregistering eventfd")
0d7c1153d929 ("io_uring: Clean up a false-positive warning from GCC 9.3.0")
42a7b4ed45e7 ("Merge tag 'for-5.17/io_uring-2022-01-11' of git://git.kernel.dk/linux-block")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c0737fa9a5a5cf5a053bcc983f72d58919b997c6 Mon Sep 17 00:00:00 2001
From: Pavel Begunkov <asml.silence(a)gmail.com>
Date: Wed, 22 Jun 2022 00:00:37 +0100
Subject: [PATCH] io_uring: fix double poll leak on repolling
We have re-polling for partial IO, so a request can be polled twice. If
it used two poll entries the first time then on the second
io_arm_poll_handler() it will find the old apoll entry and NULL
kmalloc()'ed second entry, i.e. apoll->double_poll, so leaking it.
Fixes: 10c873334feba ("io_uring: allow re-poll if we made progress")
Signed-off-by: Pavel Begunkov <asml.silence(a)gmail.com>
Link: https://lore.kernel.org/r/fee2452494222ecc7f1f88c8fb659baef971414a.16558522…
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/fs/io_uring.c b/fs/io_uring.c
index cb719a53b8bd..5c95755619e2 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -7208,6 +7208,7 @@ static int io_arm_poll_handler(struct io_kiocb *req, unsigned issue_flags)
mask |= EPOLLEXCLUSIVE;
if (req->flags & REQ_F_POLLED) {
apoll = req->apoll;
+ kfree(apoll->double_poll);
} else if (!(issue_flags & IO_URING_F_UNLOCKED) &&
!list_empty(&ctx->apoll_cache)) {
apoll = list_first_entry(&ctx->apoll_cache, struct async_poll,
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
c0737fa9a5a5 ("io_uring: fix double poll leak on repolling")
10c873334feb ("io_uring: allow re-poll if we made progress")
52dd86406dfa ("io_uring: enable EPOLLEXCLUSIVE for accept poll")
4d9237e32c5d ("io_uring: recycle apoll_poll entries")
cc3cec8367cb ("io_uring: speedup provided buffer handling")
d5ec1dfaf59b ("io-uring: add __fill_cqe function")
77bc59b49817 ("io_uring: avoid ring quiesce while registering/unregistering eventfd")
0d7c1153d929 ("io_uring: Clean up a false-positive warning from GCC 9.3.0")
42a7b4ed45e7 ("Merge tag 'for-5.17/io_uring-2022-01-11' of git://git.kernel.dk/linux-block")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c0737fa9a5a5cf5a053bcc983f72d58919b997c6 Mon Sep 17 00:00:00 2001
From: Pavel Begunkov <asml.silence(a)gmail.com>
Date: Wed, 22 Jun 2022 00:00:37 +0100
Subject: [PATCH] io_uring: fix double poll leak on repolling
We have re-polling for partial IO, so a request can be polled twice. If
it used two poll entries the first time then on the second
io_arm_poll_handler() it will find the old apoll entry and NULL
kmalloc()'ed second entry, i.e. apoll->double_poll, so leaking it.
Fixes: 10c873334feba ("io_uring: allow re-poll if we made progress")
Signed-off-by: Pavel Begunkov <asml.silence(a)gmail.com>
Link: https://lore.kernel.org/r/fee2452494222ecc7f1f88c8fb659baef971414a.16558522…
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/fs/io_uring.c b/fs/io_uring.c
index cb719a53b8bd..5c95755619e2 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -7208,6 +7208,7 @@ static int io_arm_poll_handler(struct io_kiocb *req, unsigned issue_flags)
mask |= EPOLLEXCLUSIVE;
if (req->flags & REQ_F_POLLED) {
apoll = req->apoll;
+ kfree(apoll->double_poll);
} else if (!(issue_flags & IO_URING_F_UNLOCKED) &&
!list_empty(&ctx->apoll_cache)) {
apoll = list_first_entry(&ctx->apoll_cache, struct async_poll,
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
0d7c1153d929 ("io_uring: Clean up a false-positive warning from GCC 9.3.0")
d1fd1c201d75 ("io_uring: simplify selected buf handling")
3648e5265cfa ("io_uring: move up io_put_kbuf() and io_put_rw_kbuf()")
04c76b41ca97 ("io_uring: add option to skip CQE posting")
913a571affed ("io_uring: clean cqe filling functions")
7297ce3d5944 ("io_uring: improve send/recv error handling")
54daa9b2d80a ("io_uring: correct fill events helpers types")
867f8fa5aeb7 ("io_uring: inline io_req_needs_clean()")
d17e56eb4907 ("io_uring: remove struct io_completion")
d886e185a128 ("io_uring: control ->async_data with a REQ_F flag")
fff4e40e3094 ("io_uring: delay req queueing into compl-batch list")
51d48dab62ed ("io_uring: add more likely/unlikely() annotations")
7e3709d57651 ("io_uring: optimise kiocb layout")
30d51dd4ad20 ("io_uring: clean up buffer select")
a1cdbb4cb5f7 ("io_uring: comment why inline complete calls io_clean_op()")
ef05d9ebcc92 ("io_uring: kill off ->inflight_entry field")
6962980947e2 ("io_uring: restructure submit sqes to_submit checks")
d9f9d2842c91 ("io_uring: reshuffle queue_sqe completion handling")
f5ed3bcd5b11 ("io_uring: optimise batch completion")
b3fa03fd1b17 ("io_uring: convert iopoll_completed to store_release")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0d7c1153d9291197c1dc473cfaade77acb874b4b Mon Sep 17 00:00:00 2001
From: Alviro Iskandar Setiawan <alviro.iskandar(a)gmail.com>
Date: Mon, 7 Feb 2022 21:05:33 +0700
Subject: [PATCH] io_uring: Clean up a false-positive warning from GCC 9.3.0
In io_recv(), if import_single_range() fails, the @flags variable is
uninitialized, then it will goto out_free.
After the goto, the compiler doesn't know that (ret < min_ret) is
always true, so it thinks the "if ((flags & MSG_WAITALL) ..." path
could be taken.
The complaint comes from gcc-9 (Debian 9.3.0-22) 9.3.0:
```
fs/io_uring.c:5238 io_recvfrom() error: uninitialized symbol 'flags'
```
Fix this by bypassing the @ret and @flags check when
import_single_range() fails.
Reasons:
1. import_single_range() only returns -EFAULT when it fails.
2. At that point, @flags is uninitialized and shouldn't be read.
Reported-by: kernel test robot <lkp(a)intel.com>
Reported-by: Dan Carpenter <dan.carpenter(a)oracle.com>
Reported-by: "Chen, Rong A" <rong.a.chen(a)intel.com>
Link: https://lore.gnuweeb.org/timl/d33bb5a9-8173-f65b-f653-51fc0681c6d6@intel.co…
Cc: Pavel Begunkov <asml.silence(a)gmail.com>
Suggested-by: Ammar Faizi <ammarfaizi2(a)gnuweeb.org>
Fixes: 7297ce3d59449de49d3c9e1f64ae25488750a1fc ("io_uring: improve send/recv error handling")
Signed-off-by: Alviro Iskandar Setiawan <alviro.iskandar(a)gmail.com>
Signed-off-by: Ammar Faizi <ammarfaizi2(a)gnuweeb.org>
Link: https://lore.kernel.org/r/20220207140533.565411-1-ammarfaizi2@gnuweeb.org
Signed-off-by: Jens Axboe <axboe(a)kernel.dk>
diff --git a/fs/io_uring.c b/fs/io_uring.c
index 2e04f718319d..3445c4da0153 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -5228,7 +5228,6 @@ static int io_recv(struct io_kiocb *req, unsigned int issue_flags)
min_ret = iov_iter_count(&msg.msg_iter);
ret = sock_recvmsg(sock, &msg, flags);
-out_free:
if (ret < min_ret) {
if (ret == -EAGAIN && force_nonblock)
return -EAGAIN;
@@ -5236,9 +5235,9 @@ static int io_recv(struct io_kiocb *req, unsigned int issue_flags)
ret = -EINTR;
req_set_fail(req);
} else if ((flags & MSG_WAITALL) && (msg.msg_flags & (MSG_TRUNC | MSG_CTRUNC))) {
+out_free:
req_set_fail(req);
}
-
__io_req_complete(req, issue_flags, ret, io_put_kbuf(req));
return 0;
}
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
0ddadc3a2208 ("drm/amdgpu: correct MEC number for gfx11 APUs")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 0ddadc3a2208aedb1b27dbb76d0b4e722b5b527a Mon Sep 17 00:00:00 2001
From: Lang Yu <Lang.Yu(a)amd.com>
Date: Wed, 11 Jan 2023 09:52:11 +0800
Subject: [PATCH] drm/amdgpu: correct MEC number for gfx11 APUs
There is only one MEC on these APUs.
Signed-off-by: Lang Yu <Lang.Yu(a)amd.com>
Reviewed-by: Aaron Liu <aaron.liu(a)amd.com>
Reviewed-by: Yifan Zhang <yifan1.zhang(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org # 6.1.x
diff --git a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
index a56c6e106d00..b9b57a66e113 100644
--- a/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
+++ b/drivers/gpu/drm/amd/amdgpu/gfx_v11_0.c
@@ -1287,10 +1287,8 @@ static int gfx_v11_0_sw_init(void *handle)
switch (adev->ip_versions[GC_HWIP][0]) {
case IP_VERSION(11, 0, 0):
- case IP_VERSION(11, 0, 1):
case IP_VERSION(11, 0, 2):
case IP_VERSION(11, 0, 3):
- case IP_VERSION(11, 0, 4):
adev->gfx.me.num_me = 1;
adev->gfx.me.num_pipe_per_me = 1;
adev->gfx.me.num_queue_per_pipe = 1;
@@ -1298,6 +1296,15 @@ static int gfx_v11_0_sw_init(void *handle)
adev->gfx.mec.num_pipe_per_mec = 4;
adev->gfx.mec.num_queue_per_pipe = 4;
break;
+ case IP_VERSION(11, 0, 1):
+ case IP_VERSION(11, 0, 4):
+ adev->gfx.me.num_me = 1;
+ adev->gfx.me.num_pipe_per_me = 1;
+ adev->gfx.me.num_queue_per_pipe = 1;
+ adev->gfx.mec.num_mec = 1;
+ adev->gfx.mec.num_pipe_per_mec = 4;
+ adev->gfx.mec.num_queue_per_pipe = 4;
+ break;
default:
adev->gfx.me.num_me = 1;
adev->gfx.me.num_pipe_per_me = 1;
[Public]
Hi,
AMDGPU changed the codebase to use IP block versions rather than device IDs.
Some products that contain IP blocks "GC 11.0.4", "PSP 13.0.11" and "NBIO 7.7.1" work with the code in 6.1.y, but the various switch/case statements for IP version match are missing for these.
So the following minor changes allow those products to work with acceleration on 6.1.y:
69dc98bbd441 ("drm/amdgpu/discovery: enable soc21 common for GC 11.0.4")
d5fd8c89ed20 ("drm/amdgpu/discovery: enable gmc v11 for GC 11.0.4")
b952d6b3d3ff ("drm/amdgpu/discovery: enable gfx v11 for GC 11.0.4")
6a6af77570ad ("drm/amdgpu/discovery: enable mes support for GC v11.0.4")
94ab70685844 ("drm/amdgpu: set GC 11.0.4 family")
dd2d9c7fd771 ("drm/amdgpu/discovery: set the APU flag for GC 11.0.4")
1763cb65e870 ("drm/amdgpu: add gfx support for GC 11.0.4")
311d52367d0a ("drm/amdgpu: add soc21 common ip block support for GC 11.0.4")
d0ca8248999e ("drm/amdgpu: add gmc v11 support for GC 11.0.4")
7c1389f1b122 ("drm/amdgpu/discovery: add PSP IP v13.0.11 support")
16412a94364d ("drm/amdgpu/pm: enable swsmu for SMU IP v13.0.11")
51e7a2168769 ("drm/amdgpu: add smu 13 support for smu 13.0.11")
9f83e61201bb ("drm/amdgpu/pm: add GFXOFF control IP version check for SMU IP v13.0.11")
18ad18853cf2 ("drm/amdgpu/soc21: add mode2 asic reset for SMU IP v13.0.11")
069a5af97ce3 ("drm/amdgpu/pm: use the specific mailbox registers only for SMU IP v13.0.4")
7308ceb44663 ("drm/amdgpu/discovery: enable nbio support for NBIO v7.7.1")
2a0fe2ca6e9c ("drm/amdgpu: Enable pg/cg flags on GC11_0_4 for VCN")
2c83e3fd928b ("drm/amdgpu: enable PSP IP v13.0.11 support")
f2b91e5a7cc0 ("drm/amdgpu: enable GFX IP v11.0.4 CG support")
a89e2965da6e ("drm/amdgpu: enable GFX Power Gating for GC IP v11.0.4")
f9caa237372b ("drm/amdgpu: enable GFX Clock Gating control for GC IP v11.0.4")
97074216917b ("drm/amdgpu: add tmz support for GC 11.0.1")
2aecbe492a3c ("drm/amdgpu: add tmz support for GC IP v11.0.4")
Can these please be brought into 6.1.y? As amdgpu switched from device IDs to IP blocks these are akin to "device ID" support.
Thanks,
Hi Greg and Sasha,
Clang 16 (current main, next major release) errors when offsetof() has a
type defintion in it, in response to language in newer C standards
stating it is undefined behavior.
https://github.com/llvm/llvm-project/commit/e327b52766ed497e4779f4e652b9ad2…https://reviews.llvm.org/D133574
While this might be eventually demoted to just a warning, the kernel has
already cleaned up places that had this construct, so we can apply them
to the stable trees and avoid the issue altogether.
Please find attached mbox files for all supported stable trees, which
fix up the relevant instances for each tree using the upstream commits:
55228db2697c ("x86/fpu: Use _Alignof to avoid undefined behavior in TYPE_ALIGN")
09794a5a6c34 ("tracing: Use alignof__(struct {type b;}) instead of offsetof()")
The fpu commit uses _Alignof, which as far as I can tell was only
supported in GCC 4.7.0+. This is not a problem for mainline due to
requiring GCC 5.1.0+ but it could be relevant for old trees like 4.14,
which have an older minimum supported version. I hope people are not
using ancient compilers like that but I suppose if they are using 4.14,
they might just be stuck with old software...
If there are any issues or comments, please let me know.
Nathan
Hi,
I'd like to ask that the oops_limit series get included in -stable
releases. It's a recommended defense developed while writing this
report:
https://googleprojectzero.blogspot.com/2023/01/exploiting-null-dereferences…
I've had a few people ask about having it in -stable, for example:
https://lore.kernel.org/lkml/20230119201023.4003-1-sj@kernel.org
This is the series:
9360d035a579 panic: Separate sysctl logic from CONFIG_SMP
d4ccd54d28d3 exit: Put an upper limit on how often we can oops
9db89b411170 exit: Expose "oops_count" to sysfs
de92f65719cd exit: Allow oops_limit to be disabled
79cc1ba7badf panic: Consolidate open-coded panic_on_warn checks
9fc9e278a5c0 panic: Introduce warn_limit
8b05aa263361 panic: Expose "warn_count" to sysfs
00dd027f721e docs: Fix path paste-o for /sys/kernel/warn_count
7535b832c639 exit: Use READ_ONCE() for all oops/warn limit reads
For v6.1.x they apply cleanly and behave as expected.
I'm hoping someone can step up and do backports for v5.15.x and earlier,
as there appear to be a number of conflicts and I'm swamped with other
stuff to do. :P
Thanks!
-Kees
--
Kees Cook
Stable team,
J. Pablo González <disablez(a)disablez.com> writes:
> On Mon, Jan 16, 2023 at 2:02 PM Paulo Alcantara <pc(a)cjr.nz> wrote:
>>
>> J. Pablo González <disablez(a)disablez.com> writes:
>>
>> > We’re experiencing some issues when accessing some mounts in a DFS
>> > share, which seem to happen since kernel 5.17.
>> >
>> > After some investigation, we’ve pinpointed the origin to commit
>> > a2809d0e16963fdf3984409e47f145
>> > cccb0c6821
>> > - Original BZ for that is https://bugzilla.kernel.org/show_bug.cgi?id=215440
>> > - Patch discussion is at
>> > https://patchwork.kernel.org/project/cifs-client/patch/YeHUxJ9zTVNrKveF@him…
>> > - Similar issues referenced in https://bugzilla.suse.com/show_bug.cgi?id=1198753
>>
>> 6.2-rc4 has
>>
>> c877ce47e137 ("cifs: reduce roundtrips on create/qinfo requests")
>>
>> which should fix your issue.
>>
>> Could you try it? Thanks.
>
> I'll still need to test it more thoroughly, but for now, this patch
> seems to have fixed all issues, including the "-o nodfs ones." Thank
> you!
> Any chance this could be formally backported to 6.1.x ? I see it's
> only tagged for 6.2-rc for now.
Could you please queue
c877ce47e137 ("cifs: reduce roundtrips on create/qinfo requests")
for 6.1.y as a fix for
a2809d0e1696 ("cifs: quirk for STATUS_OBJECT_NAME_INVALID returned for non-ASCII dfs refs")
Find attached a backportable version of such commit. Thanks.
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
c7bae4aaa560 ("drm/amdgpu: Correct the power calcultion for Renior/Cezanne.")
138292f1dc00 ("drm/amd/pm: update smartshift powerboost calc for smu12")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c7bae4aaa5609c1fa9761c35dbcc5fcc92915222 Mon Sep 17 00:00:00 2001
From: jie1zhan <jesse.zhang(a)amd.com>
Date: Fri, 13 Jan 2023 10:39:13 +0800
Subject: [PATCH] drm/amdgpu: Correct the power calcultion for Renior/Cezanne.
From smu firmware,the value of power is transferred in units of watts.
Bug: https://gitlab.freedesktop.org/drm/amd/-/issues/2321
Fixes: 137aac26a2ed ("drm/amdgpu/smu12: fix power reporting on renoir")
Acked-by: Alex Deucher <alexander.deucher(a)amd.com>
Signed-off-by: Jesse Zhang <Jesse.Zhang(a)amd.com>
Reviewed-by: Aaron Liu <aaron.liu(a)amd.com>
Signed-off-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: stable(a)vger.kernel.org
diff --git a/drivers/gpu/drm/amd/pm/swsmu/smu12/renoir_ppt.c b/drivers/gpu/drm/amd/pm/swsmu/smu12/renoir_ppt.c
index 85e22210963f..5cdc07165480 100644
--- a/drivers/gpu/drm/amd/pm/swsmu/smu12/renoir_ppt.c
+++ b/drivers/gpu/drm/amd/pm/swsmu/smu12/renoir_ppt.c
@@ -1171,6 +1171,7 @@ static int renoir_get_smu_metrics_data(struct smu_context *smu,
int ret = 0;
uint32_t apu_percent = 0;
uint32_t dgpu_percent = 0;
+ struct amdgpu_device *adev = smu->adev;
ret = smu_cmn_get_metrics_table(smu,
@@ -1196,7 +1197,11 @@ static int renoir_get_smu_metrics_data(struct smu_context *smu,
*value = metrics->AverageUvdActivity / 100;
break;
case METRICS_AVERAGE_SOCKETPOWER:
- *value = (metrics->CurrentSocketPower << 8) / 1000;
+ if (((adev->ip_versions[MP1_HWIP][0] == IP_VERSION(12, 0, 1)) && (adev->pm.fw_version >= 0x40000f)) ||
+ ((adev->ip_versions[MP1_HWIP][0] == IP_VERSION(12, 0, 0)) && (adev->pm.fw_version >= 0x373200)))
+ *value = metrics->CurrentSocketPower << 8;
+ else
+ *value = (metrics->CurrentSocketPower << 8) / 1000;
break;
case METRICS_TEMPERATURE_EDGE:
*value = (metrics->GfxTemperature / 100) *
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
d1d5101452ab ("drm/fb-helper: Set framebuffer for vga-switcheroo clients")
8ab59da26bc0 ("drm/fb-helper: Move generic fbdev emulation into separate source file")
e7c5c29a9eb1 ("drm/fb-helper: Set flag in struct drm_fb_helper for leaking physical addresses")
7ce19535e9b4 ("drm/fb-helper: Always initialize generic fbdev emulation")
983780918c75 ("drm/fb-helper: Perform all fbdev I/O with the same implementation")
3add5f97734d ("drm/fb-helper: Call fb_sync in I/O functions")
f231af498c29 ("drm/fb-helper: Disconnect damage worker from update logic")
afb0ff78c13c ("drm/fb-helper: Rename drm_fb_helper_unregister_fbi() to use _info postfix")
7fd50bc39d12 ("drm/fb-helper: Rename drm_fb_helper_alloc_fbi() to use _info postfix")
9877d8f6bc37 ("drm/fb_helper: Rename field fbdev to info in struct drm_fb_helper")
2b1966c65b6d ("Merge tag 'drm-misc-next-2022-10-27' of git://anongit.freedesktop.org/drm/drm-misc into drm-next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From d1d5101452ab04e5a3f010bdd200971d78956e5a Mon Sep 17 00:00:00 2001
From: Thomas Zimmermann <tzimmermann(a)suse.de>
Date: Mon, 16 Jan 2023 12:54:24 +0100
Subject: [PATCH] drm/fb-helper: Set framebuffer for vga-switcheroo clients
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Set the framebuffer info for drivers that support VGA switcheroo. Only
affects the amdgpu and nouveau drivers, which use VGA switcheroo and
generic fbdev emulation. For other drivers, this does nothing.
This fixes a potential regression in the console code. Both, amdgpu and
nouveau, invoked vga_switcheroo_client_fb_set() from their internal fbdev
code. But the call got lost when the drivers switched to the generic
emulation.
Fixes: 087451f372bf ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.")
Fixes: 4a16dd9d18a0 ("drm/nouveau/kms: switch to drm fbdev helpers")
Signed-off-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Reviewed-by: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Reviewed-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: Ben Skeggs <bskeggs(a)redhat.com>
Cc: Karol Herbst <kherbst(a)redhat.com>
Cc: Lyude Paul <lyude(a)redhat.com>
Cc: Thomas Zimmermann <tzimmermann(a)suse.de>
Cc: Javier Martinez Canillas <javierm(a)redhat.com>
Cc: Laurent Pinchart <laurent.pinchart(a)ideasonboard.com>
Cc: Jani Nikula <jani.nikula(a)intel.com>
Cc: Dave Airlie <airlied(a)redhat.com>
Cc: Evan Quan <evan.quan(a)amd.com>
Cc: Christian König <christian.koenig(a)amd.com>
Cc: Alex Deucher <alexander.deucher(a)amd.com>
Cc: Hawking Zhang <Hawking.Zhang(a)amd.com>
Cc: Likun Gao <Likun.Gao(a)amd.com>
Cc: "Christian König" <christian.koenig(a)amd.com>
Cc: Stanley Yang <Stanley.Yang(a)amd.com>
Cc: "Tianci.Yin" <tianci.yin(a)amd.com>
Cc: Xiaojian Du <Xiaojian.Du(a)amd.com>
Cc: Andrey Grodzovsky <andrey.grodzovsky(a)amd.com>
Cc: YiPeng Chai <YiPeng.Chai(a)amd.com>
Cc: Somalapuram Amaranath <Amaranath.Somalapuram(a)amd.com>
Cc: Bokun Zhang <Bokun.Zhang(a)amd.com>
Cc: Guchun Chen <guchun.chen(a)amd.com>
Cc: Hamza Mahfooz <hamza.mahfooz(a)amd.com>
Cc: Aurabindo Pillai <aurabindo.pillai(a)amd.com>
Cc: Mario Limonciello <mario.limonciello(a)amd.com>
Cc: Solomon Chiu <solomon.chiu(a)amd.com>
Cc: Kai-Heng Feng <kai.heng.feng(a)canonical.com>
Cc: Felix Kuehling <Felix.Kuehling(a)amd.com>
Cc: Daniel Vetter <daniel.vetter(a)ffwll.ch>
Cc: "Marek Olšák" <marek.olsak(a)amd.com>
Cc: Sam Ravnborg <sam(a)ravnborg.org>
Cc: Hans de Goede <hdegoede(a)redhat.com>
Cc: "Ville Syrjälä" <ville.syrjala(a)linux.intel.com>
Cc: dri-devel(a)lists.freedesktop.org
Cc: nouveau(a)lists.freedesktop.org
Cc: <stable(a)vger.kernel.org> # v5.17+
Link: https://patchwork.freedesktop.org/patch/msgid/20230116115425.13484-3-tzimme…
diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
index b3a731b9170a..0d0c26ebab90 100644
--- a/drivers/gpu/drm/drm_fb_helper.c
+++ b/drivers/gpu/drm/drm_fb_helper.c
@@ -30,7 +30,9 @@
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
#include <linux/console.h>
+#include <linux/pci.h>
#include <linux/sysrq.h>
+#include <linux/vga_switcheroo.h>
#include <drm/drm_atomic.h>
#include <drm/drm_drv.h>
@@ -1909,6 +1911,11 @@ static int drm_fb_helper_single_fb_probe(struct drm_fb_helper *fb_helper,
return ret;
strcpy(fb_helper->fb->comm, "[fbcon]");
+
+ /* Set the fb info for vgaswitcheroo clients. Does nothing otherwise. */
+ if (dev_is_pci(dev->dev))
+ vga_switcheroo_client_fb_set(to_pci_dev(dev->dev), fb_helper->info);
+
return 0;
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
a273e95721e9 ("drm/i915: Allow switching away via vga-switcheroo if uninitialized")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a273e95721e96885971a05f1b34cb6d093904d9d Mon Sep 17 00:00:00 2001
From: Thomas Zimmermann <tzimmermann(a)suse.de>
Date: Mon, 16 Jan 2023 12:54:23 +0100
Subject: [PATCH] drm/i915: Allow switching away via vga-switcheroo if
uninitialized
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Always allow switching away via vga-switcheroo if the display is
uninitalized. Instead prevent switching to i915 if the device has
not been initialized.
This issue was introduced by commit 5df7bd130818 ("drm/i915: skip
display initialization when there is no display") protected, which
protects code paths from being executed on uninitialized devices.
In the case of vga-switcheroo, we want to allow a switch away from
i915's device. So run vga_switcheroo_process_delayed_switch() and
test in the switcheroo callbacks if the i915 device is available.
Fixes: 5df7bd130818 ("drm/i915: skip display initialization when there is no display")
Signed-off-by: Thomas Zimmermann <tzimmermann(a)suse.de>
Reviewed-by: Alex Deucher <alexander.deucher(a)amd.com>
Cc: Radhakrishna Sripada <radhakrishna.sripada(a)intel.com>
Cc: Lucas De Marchi <lucas.demarchi(a)intel.com>
Cc: José Roberto de Souza <jose.souza(a)intel.com>
Cc: Jani Nikula <jani.nikula(a)intel.com>
Cc: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
Cc: Jani Nikula <jani.nikula(a)linux.intel.com>
Cc: Joonas Lahtinen <joonas.lahtinen(a)linux.intel.com>
Cc: Rodrigo Vivi <rodrigo.vivi(a)intel.com>
Cc: Tvrtko Ursulin <tvrtko.ursulin(a)linux.intel.com>
Cc: "Ville Syrjälä" <ville.syrjala(a)linux.intel.com>
Cc: Manasi Navare <manasi.d.navare(a)intel.com>
Cc: Stanislav Lisovskiy <stanislav.lisovskiy(a)intel.com>
Cc: Imre Deak <imre.deak(a)intel.com>
Cc: "Jouni Högander" <jouni.hogander(a)intel.com>
Cc: Uma Shankar <uma.shankar(a)intel.com>
Cc: Ankit Nautiyal <ankit.k.nautiyal(a)intel.com>
Cc: "Jason A. Donenfeld" <Jason(a)zx2c4.com>
Cc: Matt Roper <matthew.d.roper(a)intel.com>
Cc: Ramalingam C <ramalingam.c(a)intel.com>
Cc: Thomas Zimmermann <tzimmermann(a)suse.de>
Cc: Andi Shyti <andi.shyti(a)linux.intel.com>
Cc: Andrzej Hajda <andrzej.hajda(a)intel.com>
Cc: "José Roberto de Souza" <jose.souza(a)intel.com>
Cc: Julia Lawall <Julia.Lawall(a)inria.fr>
Cc: intel-gfx(a)lists.freedesktop.org
Cc: <stable(a)vger.kernel.org> # v5.14+
Link: https://patchwork.freedesktop.org/patch/msgid/20230116115425.13484-2-tzimme…
diff --git a/drivers/gpu/drm/i915/i915_driver.c b/drivers/gpu/drm/i915/i915_driver.c
index 69103ae37779..b5700681432d 100644
--- a/drivers/gpu/drm/i915/i915_driver.c
+++ b/drivers/gpu/drm/i915/i915_driver.c
@@ -1073,8 +1073,7 @@ static void i915_driver_lastclose(struct drm_device *dev)
intel_fbdev_restore_mode(dev);
- if (HAS_DISPLAY(i915))
- vga_switcheroo_process_delayed_switch();
+ vga_switcheroo_process_delayed_switch();
}
static void i915_driver_postclose(struct drm_device *dev, struct drm_file *file)
diff --git a/drivers/gpu/drm/i915/i915_switcheroo.c b/drivers/gpu/drm/i915/i915_switcheroo.c
index 23777d500cdf..f45bd6b6cede 100644
--- a/drivers/gpu/drm/i915/i915_switcheroo.c
+++ b/drivers/gpu/drm/i915/i915_switcheroo.c
@@ -19,6 +19,10 @@ static void i915_switcheroo_set_state(struct pci_dev *pdev,
dev_err(&pdev->dev, "DRM not initialized, aborting switch.\n");
return;
}
+ if (!HAS_DISPLAY(i915)) {
+ dev_err(&pdev->dev, "Device state not initialized, aborting switch.\n");
+ return;
+ }
if (state == VGA_SWITCHEROO_ON) {
drm_info(&i915->drm, "switched on\n");
@@ -44,7 +48,7 @@ static bool i915_switcheroo_can_switch(struct pci_dev *pdev)
* locking inversion with the driver load path. And the access here is
* completely racy anyway. So don't bother with locking for now.
*/
- return i915 && atomic_read(&i915->drm.open_count) == 0;
+ return i915 && HAS_DISPLAY(i915) && atomic_read(&i915->drm.open_count) == 0;
}
static const struct vga_switcheroo_client_ops i915_switcheroo_ops = {
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
a43866856125 ("mei: bus: fix unlink on bus in error path")
2cca3465147d ("mei: bus: add client dma interface")
e5617d2bf549 ("mei: bus: use zero vtag for bus clients.")
b5958faa34e2 ("mei: bus: move hw module get/put to probe/release")
34f1166afd67 ("mei: bus: need to unlink client before freeing")
69bf53130359 ("mei: bus: fix hw module get/put balance")
257355a44b99 ("mei: make module referencing local to the bus.c")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a43866856125c3c432e2fbb6cc63cee1539ec4a7 Mon Sep 17 00:00:00 2001
From: Alexander Usyskin <alexander.usyskin(a)intel.com>
Date: Tue, 13 Dec 2022 00:02:46 +0200
Subject: [PATCH] mei: bus: fix unlink on bus in error path
Unconditional call to mei_cl_unlink in mei_cl_bus_dev_release leads
to call of the mei_cl_unlink without corresponding mei_cl_link.
This leads to miscalculation of open_handle_count (decrease without
increase).
Call unlink in mei_cldev_enable fail path and remove blanket unlink
from mei_cl_bus_dev_release.
Fixes: 34f1166afd67 ("mei: bus: need to unlink client before freeing")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Alexander Usyskin <alexander.usyskin(a)intel.com>
Reviewed-by: Tomas Winkler <tomas.winkler(a)intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler(a)intel.com>
Link: https://lore.kernel.org/r/20221212220247.286019-1-tomas.winkler@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/misc/mei/bus.c b/drivers/misc/mei/bus.c
index 4a08b624910a..a81b890c7ee6 100644
--- a/drivers/misc/mei/bus.c
+++ b/drivers/misc/mei/bus.c
@@ -702,13 +702,15 @@ void *mei_cldev_dma_map(struct mei_cl_device *cldev, u8 buffer_id, size_t size)
if (cl->state == MEI_FILE_UNINITIALIZED) {
ret = mei_cl_link(cl);
if (ret)
- goto out;
+ goto notlinked;
/* update pointers */
cl->cldev = cldev;
}
ret = mei_cl_dma_alloc_and_map(cl, NULL, buffer_id, size);
-out:
+ if (ret)
+ mei_cl_unlink(cl);
+notlinked:
mutex_unlock(&bus->device_lock);
if (ret)
return ERR_PTR(ret);
@@ -758,7 +760,7 @@ int mei_cldev_enable(struct mei_cl_device *cldev)
if (cl->state == MEI_FILE_UNINITIALIZED) {
ret = mei_cl_link(cl);
if (ret)
- goto out;
+ goto notlinked;
/* update pointers */
cl->cldev = cldev;
}
@@ -785,6 +787,9 @@ int mei_cldev_enable(struct mei_cl_device *cldev)
}
out:
+ if (ret)
+ mei_cl_unlink(cl);
+notlinked:
mutex_unlock(&bus->device_lock);
return ret;
@@ -1277,7 +1282,6 @@ static void mei_cl_bus_dev_release(struct device *dev)
mei_cl_flush_queues(cldev->cl, NULL);
mei_me_cl_put(cldev->me_cl);
mei_dev_bus_put(cldev->bus);
- mei_cl_unlink(cldev->cl);
kfree(cldev->cl);
kfree(cldev);
}
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
a43866856125 ("mei: bus: fix unlink on bus in error path")
2cca3465147d ("mei: bus: add client dma interface")
e5617d2bf549 ("mei: bus: use zero vtag for bus clients.")
b5958faa34e2 ("mei: bus: move hw module get/put to probe/release")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a43866856125c3c432e2fbb6cc63cee1539ec4a7 Mon Sep 17 00:00:00 2001
From: Alexander Usyskin <alexander.usyskin(a)intel.com>
Date: Tue, 13 Dec 2022 00:02:46 +0200
Subject: [PATCH] mei: bus: fix unlink on bus in error path
Unconditional call to mei_cl_unlink in mei_cl_bus_dev_release leads
to call of the mei_cl_unlink without corresponding mei_cl_link.
This leads to miscalculation of open_handle_count (decrease without
increase).
Call unlink in mei_cldev_enable fail path and remove blanket unlink
from mei_cl_bus_dev_release.
Fixes: 34f1166afd67 ("mei: bus: need to unlink client before freeing")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Alexander Usyskin <alexander.usyskin(a)intel.com>
Reviewed-by: Tomas Winkler <tomas.winkler(a)intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler(a)intel.com>
Link: https://lore.kernel.org/r/20221212220247.286019-1-tomas.winkler@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/misc/mei/bus.c b/drivers/misc/mei/bus.c
index 4a08b624910a..a81b890c7ee6 100644
--- a/drivers/misc/mei/bus.c
+++ b/drivers/misc/mei/bus.c
@@ -702,13 +702,15 @@ void *mei_cldev_dma_map(struct mei_cl_device *cldev, u8 buffer_id, size_t size)
if (cl->state == MEI_FILE_UNINITIALIZED) {
ret = mei_cl_link(cl);
if (ret)
- goto out;
+ goto notlinked;
/* update pointers */
cl->cldev = cldev;
}
ret = mei_cl_dma_alloc_and_map(cl, NULL, buffer_id, size);
-out:
+ if (ret)
+ mei_cl_unlink(cl);
+notlinked:
mutex_unlock(&bus->device_lock);
if (ret)
return ERR_PTR(ret);
@@ -758,7 +760,7 @@ int mei_cldev_enable(struct mei_cl_device *cldev)
if (cl->state == MEI_FILE_UNINITIALIZED) {
ret = mei_cl_link(cl);
if (ret)
- goto out;
+ goto notlinked;
/* update pointers */
cl->cldev = cldev;
}
@@ -785,6 +787,9 @@ int mei_cldev_enable(struct mei_cl_device *cldev)
}
out:
+ if (ret)
+ mei_cl_unlink(cl);
+notlinked:
mutex_unlock(&bus->device_lock);
return ret;
@@ -1277,7 +1282,6 @@ static void mei_cl_bus_dev_release(struct device *dev)
mei_cl_flush_queues(cldev->cl, NULL);
mei_me_cl_put(cldev->me_cl);
mei_dev_bus_put(cldev->bus);
- mei_cl_unlink(cldev->cl);
kfree(cldev->cl);
kfree(cldev);
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
a43866856125 ("mei: bus: fix unlink on bus in error path")
2cca3465147d ("mei: bus: add client dma interface")
e5617d2bf549 ("mei: bus: use zero vtag for bus clients.")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a43866856125c3c432e2fbb6cc63cee1539ec4a7 Mon Sep 17 00:00:00 2001
From: Alexander Usyskin <alexander.usyskin(a)intel.com>
Date: Tue, 13 Dec 2022 00:02:46 +0200
Subject: [PATCH] mei: bus: fix unlink on bus in error path
Unconditional call to mei_cl_unlink in mei_cl_bus_dev_release leads
to call of the mei_cl_unlink without corresponding mei_cl_link.
This leads to miscalculation of open_handle_count (decrease without
increase).
Call unlink in mei_cldev_enable fail path and remove blanket unlink
from mei_cl_bus_dev_release.
Fixes: 34f1166afd67 ("mei: bus: need to unlink client before freeing")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Alexander Usyskin <alexander.usyskin(a)intel.com>
Reviewed-by: Tomas Winkler <tomas.winkler(a)intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler(a)intel.com>
Link: https://lore.kernel.org/r/20221212220247.286019-1-tomas.winkler@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/misc/mei/bus.c b/drivers/misc/mei/bus.c
index 4a08b624910a..a81b890c7ee6 100644
--- a/drivers/misc/mei/bus.c
+++ b/drivers/misc/mei/bus.c
@@ -702,13 +702,15 @@ void *mei_cldev_dma_map(struct mei_cl_device *cldev, u8 buffer_id, size_t size)
if (cl->state == MEI_FILE_UNINITIALIZED) {
ret = mei_cl_link(cl);
if (ret)
- goto out;
+ goto notlinked;
/* update pointers */
cl->cldev = cldev;
}
ret = mei_cl_dma_alloc_and_map(cl, NULL, buffer_id, size);
-out:
+ if (ret)
+ mei_cl_unlink(cl);
+notlinked:
mutex_unlock(&bus->device_lock);
if (ret)
return ERR_PTR(ret);
@@ -758,7 +760,7 @@ int mei_cldev_enable(struct mei_cl_device *cldev)
if (cl->state == MEI_FILE_UNINITIALIZED) {
ret = mei_cl_link(cl);
if (ret)
- goto out;
+ goto notlinked;
/* update pointers */
cl->cldev = cldev;
}
@@ -785,6 +787,9 @@ int mei_cldev_enable(struct mei_cl_device *cldev)
}
out:
+ if (ret)
+ mei_cl_unlink(cl);
+notlinked:
mutex_unlock(&bus->device_lock);
return ret;
@@ -1277,7 +1282,6 @@ static void mei_cl_bus_dev_release(struct device *dev)
mei_cl_flush_queues(cldev->cl, NULL);
mei_me_cl_put(cldev->me_cl);
mei_dev_bus_put(cldev->bus);
- mei_cl_unlink(cldev->cl);
kfree(cldev->cl);
kfree(cldev);
}
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
a43866856125 ("mei: bus: fix unlink on bus in error path")
2cca3465147d ("mei: bus: add client dma interface")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a43866856125c3c432e2fbb6cc63cee1539ec4a7 Mon Sep 17 00:00:00 2001
From: Alexander Usyskin <alexander.usyskin(a)intel.com>
Date: Tue, 13 Dec 2022 00:02:46 +0200
Subject: [PATCH] mei: bus: fix unlink on bus in error path
Unconditional call to mei_cl_unlink in mei_cl_bus_dev_release leads
to call of the mei_cl_unlink without corresponding mei_cl_link.
This leads to miscalculation of open_handle_count (decrease without
increase).
Call unlink in mei_cldev_enable fail path and remove blanket unlink
from mei_cl_bus_dev_release.
Fixes: 34f1166afd67 ("mei: bus: need to unlink client before freeing")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Alexander Usyskin <alexander.usyskin(a)intel.com>
Reviewed-by: Tomas Winkler <tomas.winkler(a)intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler(a)intel.com>
Link: https://lore.kernel.org/r/20221212220247.286019-1-tomas.winkler@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/misc/mei/bus.c b/drivers/misc/mei/bus.c
index 4a08b624910a..a81b890c7ee6 100644
--- a/drivers/misc/mei/bus.c
+++ b/drivers/misc/mei/bus.c
@@ -702,13 +702,15 @@ void *mei_cldev_dma_map(struct mei_cl_device *cldev, u8 buffer_id, size_t size)
if (cl->state == MEI_FILE_UNINITIALIZED) {
ret = mei_cl_link(cl);
if (ret)
- goto out;
+ goto notlinked;
/* update pointers */
cl->cldev = cldev;
}
ret = mei_cl_dma_alloc_and_map(cl, NULL, buffer_id, size);
-out:
+ if (ret)
+ mei_cl_unlink(cl);
+notlinked:
mutex_unlock(&bus->device_lock);
if (ret)
return ERR_PTR(ret);
@@ -758,7 +760,7 @@ int mei_cldev_enable(struct mei_cl_device *cldev)
if (cl->state == MEI_FILE_UNINITIALIZED) {
ret = mei_cl_link(cl);
if (ret)
- goto out;
+ goto notlinked;
/* update pointers */
cl->cldev = cldev;
}
@@ -785,6 +787,9 @@ int mei_cldev_enable(struct mei_cl_device *cldev)
}
out:
+ if (ret)
+ mei_cl_unlink(cl);
+notlinked:
mutex_unlock(&bus->device_lock);
return ret;
@@ -1277,7 +1282,6 @@ static void mei_cl_bus_dev_release(struct device *dev)
mei_cl_flush_queues(cldev->cl, NULL);
mei_me_cl_put(cldev->me_cl);
mei_dev_bus_put(cldev->bus);
- mei_cl_unlink(cldev->cl);
kfree(cldev->cl);
kfree(cldev);
}
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
a43866856125 ("mei: bus: fix unlink on bus in error path")
2cca3465147d ("mei: bus: add client dma interface")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From a43866856125c3c432e2fbb6cc63cee1539ec4a7 Mon Sep 17 00:00:00 2001
From: Alexander Usyskin <alexander.usyskin(a)intel.com>
Date: Tue, 13 Dec 2022 00:02:46 +0200
Subject: [PATCH] mei: bus: fix unlink on bus in error path
Unconditional call to mei_cl_unlink in mei_cl_bus_dev_release leads
to call of the mei_cl_unlink without corresponding mei_cl_link.
This leads to miscalculation of open_handle_count (decrease without
increase).
Call unlink in mei_cldev_enable fail path and remove blanket unlink
from mei_cl_bus_dev_release.
Fixes: 34f1166afd67 ("mei: bus: need to unlink client before freeing")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Alexander Usyskin <alexander.usyskin(a)intel.com>
Reviewed-by: Tomas Winkler <tomas.winkler(a)intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler(a)intel.com>
Link: https://lore.kernel.org/r/20221212220247.286019-1-tomas.winkler@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/misc/mei/bus.c b/drivers/misc/mei/bus.c
index 4a08b624910a..a81b890c7ee6 100644
--- a/drivers/misc/mei/bus.c
+++ b/drivers/misc/mei/bus.c
@@ -702,13 +702,15 @@ void *mei_cldev_dma_map(struct mei_cl_device *cldev, u8 buffer_id, size_t size)
if (cl->state == MEI_FILE_UNINITIALIZED) {
ret = mei_cl_link(cl);
if (ret)
- goto out;
+ goto notlinked;
/* update pointers */
cl->cldev = cldev;
}
ret = mei_cl_dma_alloc_and_map(cl, NULL, buffer_id, size);
-out:
+ if (ret)
+ mei_cl_unlink(cl);
+notlinked:
mutex_unlock(&bus->device_lock);
if (ret)
return ERR_PTR(ret);
@@ -758,7 +760,7 @@ int mei_cldev_enable(struct mei_cl_device *cldev)
if (cl->state == MEI_FILE_UNINITIALIZED) {
ret = mei_cl_link(cl);
if (ret)
- goto out;
+ goto notlinked;
/* update pointers */
cl->cldev = cldev;
}
@@ -785,6 +787,9 @@ int mei_cldev_enable(struct mei_cl_device *cldev)
}
out:
+ if (ret)
+ mei_cl_unlink(cl);
+notlinked:
mutex_unlock(&bus->device_lock);
return ret;
@@ -1277,7 +1282,6 @@ static void mei_cl_bus_dev_release(struct device *dev)
mei_cl_flush_queues(cldev->cl, NULL);
mei_me_cl_put(cldev->me_cl);
mei_dev_bus_put(cldev->bus);
- mei_cl_unlink(cldev->cl);
kfree(cldev->cl);
kfree(cldev);
}
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
36f78477ac2c ("usb: typec: tcpm: Fix altmode re-registration causes sysfs create fail")
ef52b4a9fcc2 ("usb: typec: tcpm: Raise vdm_sm_running flag only when VDM SM is running")
5571ea3117ca ("usb: typec: tcpm: Fix VDMs sometimes not being forwarded to alt-mode drivers")
2b537cf877ea ("usb: typec: tcpm: Relax disconnect threshold during power negotiation")
c34e85fa69b9 ("usb: typec: tcpm: Send DISCOVER_IDENTITY from dedicated work")
e00943e91678 ("usb: typec: tcpm: PD3.0 sinks can send Discover Identity even in device mode")
5e1d4c49fbc8 ("usb: typec: tcpm: Determine common SVDM Version")
31737c27d665 ("usb: pd: Make SVDM Version configurable in VDM header")
8d3a0578ad1a ("usb: typec: tcpm: Respond Wait if VDM state machine is running")
8dea75e11380 ("usb: typec: tcpm: Protocol Error handling")
0908c5aca31e ("usb: typec: tcpm: AMS and Collision Avoidance")
60e998d1c6d9 ("USB: typec: tcpm: Hard Reset after not receiving a Request")
3bac42f02d41 ("usb: typec: tcpm: Clear send_discover in tcpm_check_send_discover")
f321a02caebd ("usb: typec: tcpm: Implement enabling Auto Discharge disconnect support")
a30a00e37ceb ("usb: typec: tcpm: frs sourcing vbus callback")
8dc4bd073663 ("usb: typec: tcpm: Add support for Sink Fast Role SWAP(FRS)")
3ed8e1c2ac99 ("usb: typec: tcpm: Migrate workqueue to RT priority for processing events")
aefc66afe42b ("usb: typec: pd: Fix formatting in pd.h header")
6bbe2a90a0bb ("usb: typec: tcpm: During PR_SWAP, source caps should be sent only after tSwapSourceStart")
95b4d51c96a8 ("usb: typec: tcpm: Refactor tcpm_handle_vdm_request")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 36f78477ac2c89e9a2eed4a31404a291a3450b5d Mon Sep 17 00:00:00 2001
From: ChiYuan Huang <cy_huang(a)richtek.com>
Date: Mon, 9 Jan 2023 15:19:50 +0800
Subject: [PATCH] usb: typec: tcpm: Fix altmode re-registration causes sysfs
create fail
There's the altmode re-registeration issue after data role
swap (DR_SWAP).
Comparing to USBPD 2.0, in USBPD 3.0, it loose the limit that only DFP
can initiate the VDM command to get partner identity information.
For a USBPD 3.0 UFP device, it may already get the identity information
from its port partner before DR_SWAP. If DR_SWAP send or receive at the
mean time, 'send_discover' flag will be raised again. It causes discover
identify action restart while entering ready state. And after all
discover actions are done, the 'tcpm_register_altmodes' will be called.
If old altmode is not unregistered, this sysfs create fail can be found.
In 'DR_SWAP_CHANGE_DR' state case, only DFP will unregister altmodes.
For UFP, the original altmodes keep registered.
This patch fix the logic that after DR_SWAP, 'tcpm_unregister_altmodes'
must be called whatever the current data role is.
Reviewed-by: Macpaul Lin <macpaul.lin(a)mediatek.com>
Fixes: ae8a2ca8a221 ("usb: typec: Group all TCPCI/TCPM code together")
Reported-by: TommyYl Chen <tommyyl.chen(a)mediatek.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: ChiYuan Huang <cy_huang(a)richtek.com>
Acked-by: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Link: https://lore.kernel.org/r/1673248790-15794-1-git-send-email-cy_huang@richte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/typec/tcpm/tcpm.c b/drivers/usb/typec/tcpm/tcpm.c
index 904c7b4ce2f0..59b366b5c614 100644
--- a/drivers/usb/typec/tcpm/tcpm.c
+++ b/drivers/usb/typec/tcpm/tcpm.c
@@ -4594,14 +4594,13 @@ static void run_state_machine(struct tcpm_port *port)
tcpm_set_state(port, ready_state(port), 0);
break;
case DR_SWAP_CHANGE_DR:
- if (port->data_role == TYPEC_HOST) {
- tcpm_unregister_altmodes(port);
+ tcpm_unregister_altmodes(port);
+ if (port->data_role == TYPEC_HOST)
tcpm_set_roles(port, true, port->pwr_role,
TYPEC_DEVICE);
- } else {
+ else
tcpm_set_roles(port, true, port->pwr_role,
TYPEC_HOST);
- }
tcpm_ams_finish(port);
tcpm_set_state(port, ready_state(port), 0);
break;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
36f78477ac2c ("usb: typec: tcpm: Fix altmode re-registration causes sysfs create fail")
ef52b4a9fcc2 ("usb: typec: tcpm: Raise vdm_sm_running flag only when VDM SM is running")
5571ea3117ca ("usb: typec: tcpm: Fix VDMs sometimes not being forwarded to alt-mode drivers")
2b537cf877ea ("usb: typec: tcpm: Relax disconnect threshold during power negotiation")
c34e85fa69b9 ("usb: typec: tcpm: Send DISCOVER_IDENTITY from dedicated work")
e00943e91678 ("usb: typec: tcpm: PD3.0 sinks can send Discover Identity even in device mode")
5e1d4c49fbc8 ("usb: typec: tcpm: Determine common SVDM Version")
31737c27d665 ("usb: pd: Make SVDM Version configurable in VDM header")
8d3a0578ad1a ("usb: typec: tcpm: Respond Wait if VDM state machine is running")
8dea75e11380 ("usb: typec: tcpm: Protocol Error handling")
0908c5aca31e ("usb: typec: tcpm: AMS and Collision Avoidance")
60e998d1c6d9 ("USB: typec: tcpm: Hard Reset after not receiving a Request")
3bac42f02d41 ("usb: typec: tcpm: Clear send_discover in tcpm_check_send_discover")
f321a02caebd ("usb: typec: tcpm: Implement enabling Auto Discharge disconnect support")
a30a00e37ceb ("usb: typec: tcpm: frs sourcing vbus callback")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 36f78477ac2c89e9a2eed4a31404a291a3450b5d Mon Sep 17 00:00:00 2001
From: ChiYuan Huang <cy_huang(a)richtek.com>
Date: Mon, 9 Jan 2023 15:19:50 +0800
Subject: [PATCH] usb: typec: tcpm: Fix altmode re-registration causes sysfs
create fail
There's the altmode re-registeration issue after data role
swap (DR_SWAP).
Comparing to USBPD 2.0, in USBPD 3.0, it loose the limit that only DFP
can initiate the VDM command to get partner identity information.
For a USBPD 3.0 UFP device, it may already get the identity information
from its port partner before DR_SWAP. If DR_SWAP send or receive at the
mean time, 'send_discover' flag will be raised again. It causes discover
identify action restart while entering ready state. And after all
discover actions are done, the 'tcpm_register_altmodes' will be called.
If old altmode is not unregistered, this sysfs create fail can be found.
In 'DR_SWAP_CHANGE_DR' state case, only DFP will unregister altmodes.
For UFP, the original altmodes keep registered.
This patch fix the logic that after DR_SWAP, 'tcpm_unregister_altmodes'
must be called whatever the current data role is.
Reviewed-by: Macpaul Lin <macpaul.lin(a)mediatek.com>
Fixes: ae8a2ca8a221 ("usb: typec: Group all TCPCI/TCPM code together")
Reported-by: TommyYl Chen <tommyyl.chen(a)mediatek.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: ChiYuan Huang <cy_huang(a)richtek.com>
Acked-by: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Link: https://lore.kernel.org/r/1673248790-15794-1-git-send-email-cy_huang@richte…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/typec/tcpm/tcpm.c b/drivers/usb/typec/tcpm/tcpm.c
index 904c7b4ce2f0..59b366b5c614 100644
--- a/drivers/usb/typec/tcpm/tcpm.c
+++ b/drivers/usb/typec/tcpm/tcpm.c
@@ -4594,14 +4594,13 @@ static void run_state_machine(struct tcpm_port *port)
tcpm_set_state(port, ready_state(port), 0);
break;
case DR_SWAP_CHANGE_DR:
- if (port->data_role == TYPEC_HOST) {
- tcpm_unregister_altmodes(port);
+ tcpm_unregister_altmodes(port);
+ if (port->data_role == TYPEC_HOST)
tcpm_set_roles(port, true, port->pwr_role,
TYPEC_DEVICE);
- } else {
+ else
tcpm_set_roles(port, true, port->pwr_role,
TYPEC_HOST);
- }
tcpm_ams_finish(port);
tcpm_set_state(port, ready_state(port), 0);
break;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
1301c7b9f7ef ("usb: cdns3: remove fetched trb from cache before dequeuing")
64b558f597d1 ("usb: cdns3: Change file names for cdns3 driver.")
118b2a3237cf ("usb: cdnsp: Add tracepoints for CDNSP driver")
3d82904559f4 ("usb: cdnsp: cdns3 Add main part of Cadence USBSSP DRD Driver")
e93e58d27402 ("usb: cdnsp: Device side header file for CDNSP driver")
0b490046d8d7 ("usb: cdns3: Refactoring names in reusable code")
394c3a144de8 ("usb: cdns3: Moves reusable code to separate module")
f738957277ba ("usb: cdns3: Split core.c into cdns3-plat and core.c file")
db8892bb1bb6 ("usb: cdns3: Add support for DRD CDNSP")
d2a968dddf98 ("Merge tag 'usb-v5.11-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/peter.chen/usb into usb-next")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1301c7b9f7efad2f11ef924e317c18ebd714fc9a Mon Sep 17 00:00:00 2001
From: Pawel Laszczak <pawell(a)cadence.com>
Date: Tue, 15 Nov 2022 05:00:39 -0500
Subject: [PATCH] usb: cdns3: remove fetched trb from cache before dequeuing
After doorbell DMA fetches the TRB. If during dequeuing request
driver changes NORMAL TRB to LINK TRB but doesn't delete it from
controller cache then controller will handle cached TRB and packet
can be lost.
The example scenario for this issue looks like:
1. queue request - set doorbell
2. dequeue request
3. send OUT data packet from host
4. Device will accept this packet which is unexpected
5. queue new request - set doorbell
6. Device lost the expected packet.
By setting DFLUSH controller clears DRDY bit and stop DMA transfer.
Fixes: 7733f6c32e36 ("usb: cdns3: Add Cadence USB3 DRD Driver")
cc: <stable(a)vger.kernel.org>
Signed-off-by: Pawel Laszczak <pawell(a)cadence.com>
Acked-by: Peter Chen <peter.chen(a)kernel.org>
Link: https://lore.kernel.org/r/20221115100039.441295-1-pawell@cadence.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/cdns3/cdns3-gadget.c b/drivers/usb/cdns3/cdns3-gadget.c
index 5adcb349718c..ccfaebca6faa 100644
--- a/drivers/usb/cdns3/cdns3-gadget.c
+++ b/drivers/usb/cdns3/cdns3-gadget.c
@@ -2614,6 +2614,7 @@ int cdns3_gadget_ep_dequeue(struct usb_ep *ep,
u8 req_on_hw_ring = 0;
unsigned long flags;
int ret = 0;
+ int val;
if (!ep || !request || !ep->desc)
return -EINVAL;
@@ -2649,6 +2650,13 @@ int cdns3_gadget_ep_dequeue(struct usb_ep *ep,
/* Update ring only if removed request is on pending_req_list list */
if (req_on_hw_ring && link_trb) {
+ /* Stop DMA */
+ writel(EP_CMD_DFLUSH, &priv_dev->regs->ep_cmd);
+
+ /* wait for DFLUSH cleared */
+ readl_poll_timeout_atomic(&priv_dev->regs->ep_cmd, val,
+ !(val & EP_CMD_DFLUSH), 1, 1000);
+
link_trb->buffer = cpu_to_le32(TRB_BUFFER(priv_ep->trb_pool_dma +
((priv_req->end_trb + 1) * TRB_SIZE)));
link_trb->control = cpu_to_le32((le32_to_cpu(link_trb->control) & TRB_CYCLE) |
@@ -2660,6 +2668,10 @@ int cdns3_gadget_ep_dequeue(struct usb_ep *ep,
cdns3_gadget_giveback(priv_ep, priv_req, -ECONNRESET);
+ req = cdns3_next_request(&priv_ep->pending_req_list);
+ if (req)
+ cdns3_rearm_transfer(priv_ep, 1);
+
not_found:
spin_unlock_irqrestore(&priv_dev->lock, flags);
return ret;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
1301c7b9f7ef ("usb: cdns3: remove fetched trb from cache before dequeuing")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 1301c7b9f7efad2f11ef924e317c18ebd714fc9a Mon Sep 17 00:00:00 2001
From: Pawel Laszczak <pawell(a)cadence.com>
Date: Tue, 15 Nov 2022 05:00:39 -0500
Subject: [PATCH] usb: cdns3: remove fetched trb from cache before dequeuing
After doorbell DMA fetches the TRB. If during dequeuing request
driver changes NORMAL TRB to LINK TRB but doesn't delete it from
controller cache then controller will handle cached TRB and packet
can be lost.
The example scenario for this issue looks like:
1. queue request - set doorbell
2. dequeue request
3. send OUT data packet from host
4. Device will accept this packet which is unexpected
5. queue new request - set doorbell
6. Device lost the expected packet.
By setting DFLUSH controller clears DRDY bit and stop DMA transfer.
Fixes: 7733f6c32e36 ("usb: cdns3: Add Cadence USB3 DRD Driver")
cc: <stable(a)vger.kernel.org>
Signed-off-by: Pawel Laszczak <pawell(a)cadence.com>
Acked-by: Peter Chen <peter.chen(a)kernel.org>
Link: https://lore.kernel.org/r/20221115100039.441295-1-pawell@cadence.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/usb/cdns3/cdns3-gadget.c b/drivers/usb/cdns3/cdns3-gadget.c
index 5adcb349718c..ccfaebca6faa 100644
--- a/drivers/usb/cdns3/cdns3-gadget.c
+++ b/drivers/usb/cdns3/cdns3-gadget.c
@@ -2614,6 +2614,7 @@ int cdns3_gadget_ep_dequeue(struct usb_ep *ep,
u8 req_on_hw_ring = 0;
unsigned long flags;
int ret = 0;
+ int val;
if (!ep || !request || !ep->desc)
return -EINVAL;
@@ -2649,6 +2650,13 @@ int cdns3_gadget_ep_dequeue(struct usb_ep *ep,
/* Update ring only if removed request is on pending_req_list list */
if (req_on_hw_ring && link_trb) {
+ /* Stop DMA */
+ writel(EP_CMD_DFLUSH, &priv_dev->regs->ep_cmd);
+
+ /* wait for DFLUSH cleared */
+ readl_poll_timeout_atomic(&priv_dev->regs->ep_cmd, val,
+ !(val & EP_CMD_DFLUSH), 1, 1000);
+
link_trb->buffer = cpu_to_le32(TRB_BUFFER(priv_ep->trb_pool_dma +
((priv_req->end_trb + 1) * TRB_SIZE)));
link_trb->control = cpu_to_le32((le32_to_cpu(link_trb->control) & TRB_CYCLE) |
@@ -2660,6 +2668,10 @@ int cdns3_gadget_ep_dequeue(struct usb_ep *ep,
cdns3_gadget_giveback(priv_ep, priv_req, -ECONNRESET);
+ req = cdns3_next_request(&priv_ep->pending_req_list);
+ if (req)
+ cdns3_rearm_transfer(priv_ep, 1);
+
not_found:
spin_unlock_irqrestore(&priv_dev->lock, flags);
return ret;
Commit ce835dbd04d7b24f9fd50d9a9c59be46304aaa8a ("staging: mt7621-dts:
change some node hex addresses to lower case") upstream.
Please apply this commit to 5.15. It solves the regression the kernel
test robot has reported.
https://lore.kernel.org/all/aa73cfb8-99ed-20b4-071c-9858399aee0a@arinc9.com…
Arınç
The patch below does not apply to the 6.1-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
85e79ec7b78f ("btrfs: zoned: enable metadata over-commit for non-ZNS setup")
c52cc7b7acfb ("btrfs: add a BTRFS_FS_NEED_TRANS_COMMIT flag")
7966a6b5959b ("btrfs: move fs_info::flags enum to fs.h")
fc97a410bd78 ("btrfs: move mount option definitions to fs.h")
0d3a9cf8c306 ("btrfs: convert incompat and compat flag test helpers to macros")
ec8eb376e271 ("btrfs: move BTRFS_FS_STATE* definitions and helpers to fs.h")
9b569ea0be6f ("btrfs: move the printk helpers out of ctree.h")
e118578a8df7 ("btrfs: move assert helpers out of ctree.h")
c7f13d428ea1 ("btrfs: move fs wide helpers out of ctree.h")
63a7cb130718 ("btrfs: auto enable discard=async when possible")
f1e5c6185ca1 ("btrfs: move flush related definitions to space-info.h")
4300c58f8090 ("btrfs: move btrfs on-disk definitions out of ctree.h")
d60d956eb41f ("btrfs: remove unused set/clear_pending_info helpers")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 85e79ec7b78f863178ca488fd8cb5b3de6347756 Mon Sep 17 00:00:00 2001
From: Naohiro Aota <naohiro.aota(a)wdc.com>
Date: Tue, 10 Jan 2023 15:04:32 +0900
Subject: [PATCH] btrfs: zoned: enable metadata over-commit for non-ZNS setup
The commit 79417d040f4f ("btrfs: zoned: disable metadata overcommit for
zoned") disabled the metadata over-commit to track active zones properly.
However, it also introduced a heavy overhead by allocating new metadata
block groups and/or flushing dirty buffers to release the space
reservations. Specifically, a workload (write only without any sync
operations) worsen its performance from 343.77 MB/sec (v5.19) to 182.89
MB/sec (v6.0).
The performance is still bad on current misc-next which is 187.95 MB/sec.
And, with this patch applied, it improves back to 326.70 MB/sec (+73.82%).
This patch introduces a new fs_info->flag BTRFS_FS_NO_OVERCOMMIT to
indicate it needs to disable the metadata over-commit. The flag is enabled
when a device with max active zones limit is loaded into a file-system.
Fixes: 79417d040f4f ("btrfs: zoned: disable metadata overcommit for zoned")
CC: stable(a)vger.kernel.org # 6.0+
Reviewed-by: Johannes Thumshirn <johannes.thumshirn(a)wdc.com>
Signed-off-by: Naohiro Aota <naohiro.aota(a)wdc.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/fs.h b/fs/btrfs/fs.h
index a749367e5ae2..37b86acfcbcf 100644
--- a/fs/btrfs/fs.h
+++ b/fs/btrfs/fs.h
@@ -119,6 +119,12 @@ enum {
/* Indicate that we want to commit the transaction. */
BTRFS_FS_NEED_TRANS_COMMIT,
+ /*
+ * Indicate metadata over-commit is disabled. This is set when active
+ * zone tracking is needed.
+ */
+ BTRFS_FS_NO_OVERCOMMIT,
+
#if BITS_PER_LONG == 32
/* Indicate if we have error/warn message printed on 32bit systems */
BTRFS_FS_32BIT_ERROR,
diff --git a/fs/btrfs/space-info.c b/fs/btrfs/space-info.c
index d28ee4e36f3d..69c09508afb5 100644
--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -407,7 +407,8 @@ int btrfs_can_overcommit(struct btrfs_fs_info *fs_info,
return 0;
used = btrfs_space_info_used(space_info, true);
- if (btrfs_is_zoned(fs_info) && (space_info->flags & BTRFS_BLOCK_GROUP_METADATA))
+ if (test_bit(BTRFS_FS_NO_OVERCOMMIT, &fs_info->flags) &&
+ (space_info->flags & BTRFS_BLOCK_GROUP_METADATA))
avail = 0;
else
avail = calc_available_free_space(fs_info, space_info, flush);
diff --git a/fs/btrfs/zoned.c b/fs/btrfs/zoned.c
index a759668477bb..1f503e8e42d4 100644
--- a/fs/btrfs/zoned.c
+++ b/fs/btrfs/zoned.c
@@ -539,6 +539,8 @@ int btrfs_get_dev_zone_info(struct btrfs_device *device, bool populate_cache)
}
atomic_set(&zone_info->active_zones_left,
max_active_zones - nactive);
+ /* Overcommit does not work well with active zone tacking. */
+ set_bit(BTRFS_FS_NO_OVERCOMMIT, &fs_info->flags);
}
/* Validate superblock log */
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
8bb6898da627 ("btrfs: fix directory logging due to race with concurrent index key deletion")
6d3d970b2735 ("btrfs: fix missing error handling when logging directory items")
732d591a5d6c ("btrfs: stop copying old dir items when logging a directory")
a450a4af7433 ("btrfs: don't log unnecessary boundary keys when logging directory")
339d03542484 ("btrfs: only copy dir index keys when logging a directory")
1b2e5e5c7fea ("btrfs: fix missing last dir item offset update when logging directory")
9798ba24cb76 ("btrfs: remove root argument from drop_one_dir_item()")
dc2872247ec0 ("btrfs: keep track of the last logged keys when logging a directory")
086dcbfa50d3 ("btrfs: insert items in batches when logging a directory when possible")
eb10d85ee77f ("btrfs: factor out the copying loop of dir items from log_dir_items()")
90d04510a774 ("btrfs: remove root argument from btrfs_log_inode() and its callees")
289cffcb0399 ("btrfs: remove no longer needed checks for NULL log context")
cfd312695b71 ("btrfs: check for error when looking up inode during dir entry replay")
8dcbc26194eb ("btrfs: unify lookup return value when dir entry is missing")
52db77791fe2 ("btrfs: deal with errors when adding inode reference during log replay")
e15ac6413745 ("btrfs: deal with errors when replaying dir entry during log replay")
77a5b9e3d14c ("btrfs: deal with errors when checking if a dir entry exists during log replay")
a7d1c5dc8632 ("btrfs: introduce btrfs_lookup_match_dir")
b590b839720c ("btrfs: avoid unnecessary logging of xattrs during fast fsyncs")
54a40fc3a1da ("btrfs: fix removed dentries still existing after log is synced")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8bb6898da6271d82d8e76d8088d66b971a7dcfa6 Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Tue, 10 Jan 2023 14:56:35 +0000
Subject: [PATCH] btrfs: fix directory logging due to race with concurrent
index key deletion
Sometimes we log a directory without holding its VFS lock, so while we
logging it, dir index entries may be added or removed. This typically
happens when logging a dentry from a parent directory that points to a
new directory, through log_new_dir_dentries(), or when while logging
some other inode we also need to log its parent directories (through
btrfs_log_all_parents()).
This means that while we are at log_dir_items(), we may not find a dir
index key we found before, because it was deleted in the meanwhile, so
a call to btrfs_search_slot() may return 1 (key not found). In that case
we return from log_dir_items() with a success value (the variable 'err'
has a value of 0). This can lead to a few problems, specially in the case
where the variable 'last_offset' has a value of (u64)-1 (and it's
initialized to that when it was declared):
1) By returning from log_dir_items() with success (0) and a value of
(u64)-1 for '*last_offset_ret', we end up not logging any other dir
index keys that follow the missing, just deleted, index key. The
(u64)-1 value makes log_directory_changes() not call log_dir_items()
again;
2) Before returning with success (0), log_dir_items(), will log a dir
index range item covering a range from the last old dentry index
(stored in the variable 'last_old_dentry_offset') to the value of
'last_offset'. If 'last_offset' has a value of (u64)-1, then it means
if the log is persisted and replayed after a power failure, it will
cause deletion of all the directory entries that have an index number
between last_old_dentry_offset + 1 and (u64)-1;
3) We can end up returning from log_dir_items() with
ctx->last_dir_item_offset having a lower value than
inode->last_dir_index_offset, because the former is set to the current
key we are processing at process_dir_items_leaf(), and at the end of
log_directory_changes() we set inode->last_dir_index_offset to the
current value of ctx->last_dir_item_offset. So if for example a
deletion of a lower dir index key happened, we set
ctx->last_dir_item_offset to that index value, then if we return from
log_dir_items() because btrfs_search_slot() returned 1, we end up
returning from log_dir_items() with success (0) and then
log_directory_changes() sets inode->last_dir_index_offset to a lower
value than it had before.
This can result in unpredictable and unexpected behaviour when we
need to log again the directory in the same transaction, and can result
in ending up with a log tree leaf that has duplicated keys, as we do
batch insertions of dir index keys into a log tree.
So fix this by making log_dir_items() move on to the next dir index key
if it does not find the one it was looking for.
Reported-by: David Arendt <admin(a)prnet.org>
Link: https://lore.kernel.org/linux-btrfs/ae169fc6-f504-28f0-a098-6fa6a4dfb612@le…
CC: stable(a)vger.kernel.org # 4.14+
Reviewed-by: Josef Bacik <josef(a)toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 3ef0266e9527..c09daab3f19e 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -3857,17 +3857,26 @@ static noinline int log_dir_items(struct btrfs_trans_handle *trans,
btrfs_release_path(path);
/*
- * Find the first key from this transaction again. See the note for
- * log_new_dir_dentries, if we're logging a directory recursively we
- * won't be holding its i_mutex, which means we can modify the directory
- * while we're logging it. If we remove an entry between our first
- * search and this search we'll not find the key again and can just
- * bail.
+ * Find the first key from this transaction again or the one we were at
+ * in the loop below in case we had to reschedule. We may be logging the
+ * directory without holding its VFS lock, which happen when logging new
+ * dentries (through log_new_dir_dentries()) or in some cases when we
+ * need to log the parent directory of an inode. This means a dir index
+ * key might be deleted from the inode's root, and therefore we may not
+ * find it anymore. If we can't find it, just move to the next key. We
+ * can not bail out and ignore, because if we do that we will simply
+ * not log dir index keys that come after the one that was just deleted
+ * and we can end up logging a dir index range that ends at (u64)-1
+ * (@last_offset is initialized to that), resulting in removing dir
+ * entries we should not remove at log replay time.
*/
search:
ret = btrfs_search_slot(NULL, root, &min_key, path, 0, 0);
+ if (ret > 0)
+ ret = btrfs_next_item(root, path);
if (ret < 0)
err = ret;
+ /* If ret is 1, there are no more keys in the inode's root. */
if (ret != 0)
goto done;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
8bb6898da627 ("btrfs: fix directory logging due to race with concurrent index key deletion")
6d3d970b2735 ("btrfs: fix missing error handling when logging directory items")
732d591a5d6c ("btrfs: stop copying old dir items when logging a directory")
a450a4af7433 ("btrfs: don't log unnecessary boundary keys when logging directory")
339d03542484 ("btrfs: only copy dir index keys when logging a directory")
1b2e5e5c7fea ("btrfs: fix missing last dir item offset update when logging directory")
9798ba24cb76 ("btrfs: remove root argument from drop_one_dir_item()")
dc2872247ec0 ("btrfs: keep track of the last logged keys when logging a directory")
086dcbfa50d3 ("btrfs: insert items in batches when logging a directory when possible")
eb10d85ee77f ("btrfs: factor out the copying loop of dir items from log_dir_items()")
90d04510a774 ("btrfs: remove root argument from btrfs_log_inode() and its callees")
289cffcb0399 ("btrfs: remove no longer needed checks for NULL log context")
cfd312695b71 ("btrfs: check for error when looking up inode during dir entry replay")
8dcbc26194eb ("btrfs: unify lookup return value when dir entry is missing")
52db77791fe2 ("btrfs: deal with errors when adding inode reference during log replay")
e15ac6413745 ("btrfs: deal with errors when replaying dir entry during log replay")
77a5b9e3d14c ("btrfs: deal with errors when checking if a dir entry exists during log replay")
a7d1c5dc8632 ("btrfs: introduce btrfs_lookup_match_dir")
b590b839720c ("btrfs: avoid unnecessary logging of xattrs during fast fsyncs")
54a40fc3a1da ("btrfs: fix removed dentries still existing after log is synced")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8bb6898da6271d82d8e76d8088d66b971a7dcfa6 Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Tue, 10 Jan 2023 14:56:35 +0000
Subject: [PATCH] btrfs: fix directory logging due to race with concurrent
index key deletion
Sometimes we log a directory without holding its VFS lock, so while we
logging it, dir index entries may be added or removed. This typically
happens when logging a dentry from a parent directory that points to a
new directory, through log_new_dir_dentries(), or when while logging
some other inode we also need to log its parent directories (through
btrfs_log_all_parents()).
This means that while we are at log_dir_items(), we may not find a dir
index key we found before, because it was deleted in the meanwhile, so
a call to btrfs_search_slot() may return 1 (key not found). In that case
we return from log_dir_items() with a success value (the variable 'err'
has a value of 0). This can lead to a few problems, specially in the case
where the variable 'last_offset' has a value of (u64)-1 (and it's
initialized to that when it was declared):
1) By returning from log_dir_items() with success (0) and a value of
(u64)-1 for '*last_offset_ret', we end up not logging any other dir
index keys that follow the missing, just deleted, index key. The
(u64)-1 value makes log_directory_changes() not call log_dir_items()
again;
2) Before returning with success (0), log_dir_items(), will log a dir
index range item covering a range from the last old dentry index
(stored in the variable 'last_old_dentry_offset') to the value of
'last_offset'. If 'last_offset' has a value of (u64)-1, then it means
if the log is persisted and replayed after a power failure, it will
cause deletion of all the directory entries that have an index number
between last_old_dentry_offset + 1 and (u64)-1;
3) We can end up returning from log_dir_items() with
ctx->last_dir_item_offset having a lower value than
inode->last_dir_index_offset, because the former is set to the current
key we are processing at process_dir_items_leaf(), and at the end of
log_directory_changes() we set inode->last_dir_index_offset to the
current value of ctx->last_dir_item_offset. So if for example a
deletion of a lower dir index key happened, we set
ctx->last_dir_item_offset to that index value, then if we return from
log_dir_items() because btrfs_search_slot() returned 1, we end up
returning from log_dir_items() with success (0) and then
log_directory_changes() sets inode->last_dir_index_offset to a lower
value than it had before.
This can result in unpredictable and unexpected behaviour when we
need to log again the directory in the same transaction, and can result
in ending up with a log tree leaf that has duplicated keys, as we do
batch insertions of dir index keys into a log tree.
So fix this by making log_dir_items() move on to the next dir index key
if it does not find the one it was looking for.
Reported-by: David Arendt <admin(a)prnet.org>
Link: https://lore.kernel.org/linux-btrfs/ae169fc6-f504-28f0-a098-6fa6a4dfb612@le…
CC: stable(a)vger.kernel.org # 4.14+
Reviewed-by: Josef Bacik <josef(a)toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 3ef0266e9527..c09daab3f19e 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -3857,17 +3857,26 @@ static noinline int log_dir_items(struct btrfs_trans_handle *trans,
btrfs_release_path(path);
/*
- * Find the first key from this transaction again. See the note for
- * log_new_dir_dentries, if we're logging a directory recursively we
- * won't be holding its i_mutex, which means we can modify the directory
- * while we're logging it. If we remove an entry between our first
- * search and this search we'll not find the key again and can just
- * bail.
+ * Find the first key from this transaction again or the one we were at
+ * in the loop below in case we had to reschedule. We may be logging the
+ * directory without holding its VFS lock, which happen when logging new
+ * dentries (through log_new_dir_dentries()) or in some cases when we
+ * need to log the parent directory of an inode. This means a dir index
+ * key might be deleted from the inode's root, and therefore we may not
+ * find it anymore. If we can't find it, just move to the next key. We
+ * can not bail out and ignore, because if we do that we will simply
+ * not log dir index keys that come after the one that was just deleted
+ * and we can end up logging a dir index range that ends at (u64)-1
+ * (@last_offset is initialized to that), resulting in removing dir
+ * entries we should not remove at log replay time.
*/
search:
ret = btrfs_search_slot(NULL, root, &min_key, path, 0, 0);
+ if (ret > 0)
+ ret = btrfs_next_item(root, path);
if (ret < 0)
err = ret;
+ /* If ret is 1, there are no more keys in the inode's root. */
if (ret != 0)
goto done;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
8bb6898da627 ("btrfs: fix directory logging due to race with concurrent index key deletion")
6d3d970b2735 ("btrfs: fix missing error handling when logging directory items")
732d591a5d6c ("btrfs: stop copying old dir items when logging a directory")
a450a4af7433 ("btrfs: don't log unnecessary boundary keys when logging directory")
339d03542484 ("btrfs: only copy dir index keys when logging a directory")
1b2e5e5c7fea ("btrfs: fix missing last dir item offset update when logging directory")
9798ba24cb76 ("btrfs: remove root argument from drop_one_dir_item()")
dc2872247ec0 ("btrfs: keep track of the last logged keys when logging a directory")
086dcbfa50d3 ("btrfs: insert items in batches when logging a directory when possible")
eb10d85ee77f ("btrfs: factor out the copying loop of dir items from log_dir_items()")
90d04510a774 ("btrfs: remove root argument from btrfs_log_inode() and its callees")
289cffcb0399 ("btrfs: remove no longer needed checks for NULL log context")
cfd312695b71 ("btrfs: check for error when looking up inode during dir entry replay")
8dcbc26194eb ("btrfs: unify lookup return value when dir entry is missing")
52db77791fe2 ("btrfs: deal with errors when adding inode reference during log replay")
e15ac6413745 ("btrfs: deal with errors when replaying dir entry during log replay")
77a5b9e3d14c ("btrfs: deal with errors when checking if a dir entry exists during log replay")
a7d1c5dc8632 ("btrfs: introduce btrfs_lookup_match_dir")
b590b839720c ("btrfs: avoid unnecessary logging of xattrs during fast fsyncs")
54a40fc3a1da ("btrfs: fix removed dentries still existing after log is synced")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8bb6898da6271d82d8e76d8088d66b971a7dcfa6 Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Tue, 10 Jan 2023 14:56:35 +0000
Subject: [PATCH] btrfs: fix directory logging due to race with concurrent
index key deletion
Sometimes we log a directory without holding its VFS lock, so while we
logging it, dir index entries may be added or removed. This typically
happens when logging a dentry from a parent directory that points to a
new directory, through log_new_dir_dentries(), or when while logging
some other inode we also need to log its parent directories (through
btrfs_log_all_parents()).
This means that while we are at log_dir_items(), we may not find a dir
index key we found before, because it was deleted in the meanwhile, so
a call to btrfs_search_slot() may return 1 (key not found). In that case
we return from log_dir_items() with a success value (the variable 'err'
has a value of 0). This can lead to a few problems, specially in the case
where the variable 'last_offset' has a value of (u64)-1 (and it's
initialized to that when it was declared):
1) By returning from log_dir_items() with success (0) and a value of
(u64)-1 for '*last_offset_ret', we end up not logging any other dir
index keys that follow the missing, just deleted, index key. The
(u64)-1 value makes log_directory_changes() not call log_dir_items()
again;
2) Before returning with success (0), log_dir_items(), will log a dir
index range item covering a range from the last old dentry index
(stored in the variable 'last_old_dentry_offset') to the value of
'last_offset'. If 'last_offset' has a value of (u64)-1, then it means
if the log is persisted and replayed after a power failure, it will
cause deletion of all the directory entries that have an index number
between last_old_dentry_offset + 1 and (u64)-1;
3) We can end up returning from log_dir_items() with
ctx->last_dir_item_offset having a lower value than
inode->last_dir_index_offset, because the former is set to the current
key we are processing at process_dir_items_leaf(), and at the end of
log_directory_changes() we set inode->last_dir_index_offset to the
current value of ctx->last_dir_item_offset. So if for example a
deletion of a lower dir index key happened, we set
ctx->last_dir_item_offset to that index value, then if we return from
log_dir_items() because btrfs_search_slot() returned 1, we end up
returning from log_dir_items() with success (0) and then
log_directory_changes() sets inode->last_dir_index_offset to a lower
value than it had before.
This can result in unpredictable and unexpected behaviour when we
need to log again the directory in the same transaction, and can result
in ending up with a log tree leaf that has duplicated keys, as we do
batch insertions of dir index keys into a log tree.
So fix this by making log_dir_items() move on to the next dir index key
if it does not find the one it was looking for.
Reported-by: David Arendt <admin(a)prnet.org>
Link: https://lore.kernel.org/linux-btrfs/ae169fc6-f504-28f0-a098-6fa6a4dfb612@le…
CC: stable(a)vger.kernel.org # 4.14+
Reviewed-by: Josef Bacik <josef(a)toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 3ef0266e9527..c09daab3f19e 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -3857,17 +3857,26 @@ static noinline int log_dir_items(struct btrfs_trans_handle *trans,
btrfs_release_path(path);
/*
- * Find the first key from this transaction again. See the note for
- * log_new_dir_dentries, if we're logging a directory recursively we
- * won't be holding its i_mutex, which means we can modify the directory
- * while we're logging it. If we remove an entry between our first
- * search and this search we'll not find the key again and can just
- * bail.
+ * Find the first key from this transaction again or the one we were at
+ * in the loop below in case we had to reschedule. We may be logging the
+ * directory without holding its VFS lock, which happen when logging new
+ * dentries (through log_new_dir_dentries()) or in some cases when we
+ * need to log the parent directory of an inode. This means a dir index
+ * key might be deleted from the inode's root, and therefore we may not
+ * find it anymore. If we can't find it, just move to the next key. We
+ * can not bail out and ignore, because if we do that we will simply
+ * not log dir index keys that come after the one that was just deleted
+ * and we can end up logging a dir index range that ends at (u64)-1
+ * (@last_offset is initialized to that), resulting in removing dir
+ * entries we should not remove at log replay time.
*/
search:
ret = btrfs_search_slot(NULL, root, &min_key, path, 0, 0);
+ if (ret > 0)
+ ret = btrfs_next_item(root, path);
if (ret < 0)
err = ret;
+ /* If ret is 1, there are no more keys in the inode's root. */
if (ret != 0)
goto done;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
8bb6898da627 ("btrfs: fix directory logging due to race with concurrent index key deletion")
6d3d970b2735 ("btrfs: fix missing error handling when logging directory items")
732d591a5d6c ("btrfs: stop copying old dir items when logging a directory")
a450a4af7433 ("btrfs: don't log unnecessary boundary keys when logging directory")
339d03542484 ("btrfs: only copy dir index keys when logging a directory")
1b2e5e5c7fea ("btrfs: fix missing last dir item offset update when logging directory")
9798ba24cb76 ("btrfs: remove root argument from drop_one_dir_item()")
dc2872247ec0 ("btrfs: keep track of the last logged keys when logging a directory")
086dcbfa50d3 ("btrfs: insert items in batches when logging a directory when possible")
eb10d85ee77f ("btrfs: factor out the copying loop of dir items from log_dir_items()")
90d04510a774 ("btrfs: remove root argument from btrfs_log_inode() and its callees")
289cffcb0399 ("btrfs: remove no longer needed checks for NULL log context")
cfd312695b71 ("btrfs: check for error when looking up inode during dir entry replay")
8dcbc26194eb ("btrfs: unify lookup return value when dir entry is missing")
52db77791fe2 ("btrfs: deal with errors when adding inode reference during log replay")
e15ac6413745 ("btrfs: deal with errors when replaying dir entry during log replay")
77a5b9e3d14c ("btrfs: deal with errors when checking if a dir entry exists during log replay")
a7d1c5dc8632 ("btrfs: introduce btrfs_lookup_match_dir")
b590b839720c ("btrfs: avoid unnecessary logging of xattrs during fast fsyncs")
54a40fc3a1da ("btrfs: fix removed dentries still existing after log is synced")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8bb6898da6271d82d8e76d8088d66b971a7dcfa6 Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Tue, 10 Jan 2023 14:56:35 +0000
Subject: [PATCH] btrfs: fix directory logging due to race with concurrent
index key deletion
Sometimes we log a directory without holding its VFS lock, so while we
logging it, dir index entries may be added or removed. This typically
happens when logging a dentry from a parent directory that points to a
new directory, through log_new_dir_dentries(), or when while logging
some other inode we also need to log its parent directories (through
btrfs_log_all_parents()).
This means that while we are at log_dir_items(), we may not find a dir
index key we found before, because it was deleted in the meanwhile, so
a call to btrfs_search_slot() may return 1 (key not found). In that case
we return from log_dir_items() with a success value (the variable 'err'
has a value of 0). This can lead to a few problems, specially in the case
where the variable 'last_offset' has a value of (u64)-1 (and it's
initialized to that when it was declared):
1) By returning from log_dir_items() with success (0) and a value of
(u64)-1 for '*last_offset_ret', we end up not logging any other dir
index keys that follow the missing, just deleted, index key. The
(u64)-1 value makes log_directory_changes() not call log_dir_items()
again;
2) Before returning with success (0), log_dir_items(), will log a dir
index range item covering a range from the last old dentry index
(stored in the variable 'last_old_dentry_offset') to the value of
'last_offset'. If 'last_offset' has a value of (u64)-1, then it means
if the log is persisted and replayed after a power failure, it will
cause deletion of all the directory entries that have an index number
between last_old_dentry_offset + 1 and (u64)-1;
3) We can end up returning from log_dir_items() with
ctx->last_dir_item_offset having a lower value than
inode->last_dir_index_offset, because the former is set to the current
key we are processing at process_dir_items_leaf(), and at the end of
log_directory_changes() we set inode->last_dir_index_offset to the
current value of ctx->last_dir_item_offset. So if for example a
deletion of a lower dir index key happened, we set
ctx->last_dir_item_offset to that index value, then if we return from
log_dir_items() because btrfs_search_slot() returned 1, we end up
returning from log_dir_items() with success (0) and then
log_directory_changes() sets inode->last_dir_index_offset to a lower
value than it had before.
This can result in unpredictable and unexpected behaviour when we
need to log again the directory in the same transaction, and can result
in ending up with a log tree leaf that has duplicated keys, as we do
batch insertions of dir index keys into a log tree.
So fix this by making log_dir_items() move on to the next dir index key
if it does not find the one it was looking for.
Reported-by: David Arendt <admin(a)prnet.org>
Link: https://lore.kernel.org/linux-btrfs/ae169fc6-f504-28f0-a098-6fa6a4dfb612@le…
CC: stable(a)vger.kernel.org # 4.14+
Reviewed-by: Josef Bacik <josef(a)toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 3ef0266e9527..c09daab3f19e 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -3857,17 +3857,26 @@ static noinline int log_dir_items(struct btrfs_trans_handle *trans,
btrfs_release_path(path);
/*
- * Find the first key from this transaction again. See the note for
- * log_new_dir_dentries, if we're logging a directory recursively we
- * won't be holding its i_mutex, which means we can modify the directory
- * while we're logging it. If we remove an entry between our first
- * search and this search we'll not find the key again and can just
- * bail.
+ * Find the first key from this transaction again or the one we were at
+ * in the loop below in case we had to reschedule. We may be logging the
+ * directory without holding its VFS lock, which happen when logging new
+ * dentries (through log_new_dir_dentries()) or in some cases when we
+ * need to log the parent directory of an inode. This means a dir index
+ * key might be deleted from the inode's root, and therefore we may not
+ * find it anymore. If we can't find it, just move to the next key. We
+ * can not bail out and ignore, because if we do that we will simply
+ * not log dir index keys that come after the one that was just deleted
+ * and we can end up logging a dir index range that ends at (u64)-1
+ * (@last_offset is initialized to that), resulting in removing dir
+ * entries we should not remove at log replay time.
*/
search:
ret = btrfs_search_slot(NULL, root, &min_key, path, 0, 0);
+ if (ret > 0)
+ ret = btrfs_next_item(root, path);
if (ret < 0)
err = ret;
+ /* If ret is 1, there are no more keys in the inode's root. */
if (ret != 0)
goto done;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
8bb6898da627 ("btrfs: fix directory logging due to race with concurrent index key deletion")
6d3d970b2735 ("btrfs: fix missing error handling when logging directory items")
732d591a5d6c ("btrfs: stop copying old dir items when logging a directory")
a450a4af7433 ("btrfs: don't log unnecessary boundary keys when logging directory")
339d03542484 ("btrfs: only copy dir index keys when logging a directory")
1b2e5e5c7fea ("btrfs: fix missing last dir item offset update when logging directory")
9798ba24cb76 ("btrfs: remove root argument from drop_one_dir_item()")
dc2872247ec0 ("btrfs: keep track of the last logged keys when logging a directory")
086dcbfa50d3 ("btrfs: insert items in batches when logging a directory when possible")
eb10d85ee77f ("btrfs: factor out the copying loop of dir items from log_dir_items()")
90d04510a774 ("btrfs: remove root argument from btrfs_log_inode() and its callees")
289cffcb0399 ("btrfs: remove no longer needed checks for NULL log context")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 8bb6898da6271d82d8e76d8088d66b971a7dcfa6 Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Tue, 10 Jan 2023 14:56:35 +0000
Subject: [PATCH] btrfs: fix directory logging due to race with concurrent
index key deletion
Sometimes we log a directory without holding its VFS lock, so while we
logging it, dir index entries may be added or removed. This typically
happens when logging a dentry from a parent directory that points to a
new directory, through log_new_dir_dentries(), or when while logging
some other inode we also need to log its parent directories (through
btrfs_log_all_parents()).
This means that while we are at log_dir_items(), we may not find a dir
index key we found before, because it was deleted in the meanwhile, so
a call to btrfs_search_slot() may return 1 (key not found). In that case
we return from log_dir_items() with a success value (the variable 'err'
has a value of 0). This can lead to a few problems, specially in the case
where the variable 'last_offset' has a value of (u64)-1 (and it's
initialized to that when it was declared):
1) By returning from log_dir_items() with success (0) and a value of
(u64)-1 for '*last_offset_ret', we end up not logging any other dir
index keys that follow the missing, just deleted, index key. The
(u64)-1 value makes log_directory_changes() not call log_dir_items()
again;
2) Before returning with success (0), log_dir_items(), will log a dir
index range item covering a range from the last old dentry index
(stored in the variable 'last_old_dentry_offset') to the value of
'last_offset'. If 'last_offset' has a value of (u64)-1, then it means
if the log is persisted and replayed after a power failure, it will
cause deletion of all the directory entries that have an index number
between last_old_dentry_offset + 1 and (u64)-1;
3) We can end up returning from log_dir_items() with
ctx->last_dir_item_offset having a lower value than
inode->last_dir_index_offset, because the former is set to the current
key we are processing at process_dir_items_leaf(), and at the end of
log_directory_changes() we set inode->last_dir_index_offset to the
current value of ctx->last_dir_item_offset. So if for example a
deletion of a lower dir index key happened, we set
ctx->last_dir_item_offset to that index value, then if we return from
log_dir_items() because btrfs_search_slot() returned 1, we end up
returning from log_dir_items() with success (0) and then
log_directory_changes() sets inode->last_dir_index_offset to a lower
value than it had before.
This can result in unpredictable and unexpected behaviour when we
need to log again the directory in the same transaction, and can result
in ending up with a log tree leaf that has duplicated keys, as we do
batch insertions of dir index keys into a log tree.
So fix this by making log_dir_items() move on to the next dir index key
if it does not find the one it was looking for.
Reported-by: David Arendt <admin(a)prnet.org>
Link: https://lore.kernel.org/linux-btrfs/ae169fc6-f504-28f0-a098-6fa6a4dfb612@le…
CC: stable(a)vger.kernel.org # 4.14+
Reviewed-by: Josef Bacik <josef(a)toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index 3ef0266e9527..c09daab3f19e 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -3857,17 +3857,26 @@ static noinline int log_dir_items(struct btrfs_trans_handle *trans,
btrfs_release_path(path);
/*
- * Find the first key from this transaction again. See the note for
- * log_new_dir_dentries, if we're logging a directory recursively we
- * won't be holding its i_mutex, which means we can modify the directory
- * while we're logging it. If we remove an entry between our first
- * search and this search we'll not find the key again and can just
- * bail.
+ * Find the first key from this transaction again or the one we were at
+ * in the loop below in case we had to reschedule. We may be logging the
+ * directory without holding its VFS lock, which happen when logging new
+ * dentries (through log_new_dir_dentries()) or in some cases when we
+ * need to log the parent directory of an inode. This means a dir index
+ * key might be deleted from the inode's root, and therefore we may not
+ * find it anymore. If we can't find it, just move to the next key. We
+ * can not bail out and ignore, because if we do that we will simply
+ * not log dir index keys that come after the one that was just deleted
+ * and we can end up logging a dir index range that ends at (u64)-1
+ * (@last_offset is initialized to that), resulting in removing dir
+ * entries we should not remove at log replay time.
*/
search:
ret = btrfs_search_slot(NULL, root, &min_key, path, 0, 0);
+ if (ret > 0)
+ ret = btrfs_next_item(root, path);
if (ret < 0)
err = ret;
+ /* If ret is 1, there are no more keys in the inode's root. */
if (ret != 0)
goto done;
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
6d3d970b2735 ("btrfs: fix missing error handling when logging directory items")
732d591a5d6c ("btrfs: stop copying old dir items when logging a directory")
a450a4af7433 ("btrfs: don't log unnecessary boundary keys when logging directory")
339d03542484 ("btrfs: only copy dir index keys when logging a directory")
1b2e5e5c7fea ("btrfs: fix missing last dir item offset update when logging directory")
9798ba24cb76 ("btrfs: remove root argument from drop_one_dir_item()")
dc2872247ec0 ("btrfs: keep track of the last logged keys when logging a directory")
086dcbfa50d3 ("btrfs: insert items in batches when logging a directory when possible")
eb10d85ee77f ("btrfs: factor out the copying loop of dir items from log_dir_items()")
90d04510a774 ("btrfs: remove root argument from btrfs_log_inode() and its callees")
289cffcb0399 ("btrfs: remove no longer needed checks for NULL log context")
cfd312695b71 ("btrfs: check for error when looking up inode during dir entry replay")
8dcbc26194eb ("btrfs: unify lookup return value when dir entry is missing")
52db77791fe2 ("btrfs: deal with errors when adding inode reference during log replay")
e15ac6413745 ("btrfs: deal with errors when replaying dir entry during log replay")
77a5b9e3d14c ("btrfs: deal with errors when checking if a dir entry exists during log replay")
a7d1c5dc8632 ("btrfs: introduce btrfs_lookup_match_dir")
b590b839720c ("btrfs: avoid unnecessary logging of xattrs during fast fsyncs")
54a40fc3a1da ("btrfs: fix removed dentries still existing after log is synced")
64d6b281ba4d ("btrfs: remove unnecessary check_parent_dirs_for_sync()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6d3d970b2735b967650d319be27268fedc5598d1 Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Tue, 10 Jan 2023 14:56:34 +0000
Subject: [PATCH] btrfs: fix missing error handling when logging directory
items
When logging a directory, at log_dir_items(), if we get an error when
attempting to search the subvolume tree for a dir index item, we end up
returning 0 (success) from log_dir_items() because 'err' is left with a
value of 0.
This can lead to a few problems, specially in the case the variable
'last_offset' has a value of (u64)-1 (and it's initialized to that when
it was declared):
1) By returning from log_dir_items() with success (0) and a value of
(u64)-1 for '*last_offset_ret', we end up not logging any other dir
index keys that follow the missing, just deleted, index key. The
(u64)-1 value makes log_directory_changes() not call log_dir_items()
again;
2) Before returning with success (0), log_dir_items(), will log a dir
index range item covering a range from the last old dentry index
(stored in the variable 'last_old_dentry_offset') to the value of
'last_offset'. If 'last_offset' has a value of (u64)-1, then it means
if the log is persisted and replayed after a power failure, it will
cause deletion of all the directory entries that have an index number
between last_old_dentry_offset + 1 and (u64)-1;
3) We can end up returning from log_dir_items() with
ctx->last_dir_item_offset having a lower value than
inode->last_dir_index_offset, because the former is set to the current
key we are processing at process_dir_items_leaf(), and at the end of
log_directory_changes() we set inode->last_dir_index_offset to the
current value of ctx->last_dir_item_offset. So if for example a
deletion of a lower dir index key happened, we set
ctx->last_dir_item_offset to that index value, then if we return from
log_dir_items() because btrfs_search_slot() returned an error, we end up
returning without any error from log_dir_items() and then
log_directory_changes() sets inode->last_dir_index_offset to a lower
value than it had before.
This can result in unpredictable and unexpected behaviour when we
need to log again the directory in the same transaction, and can result
in ending up with a log tree leaf that has duplicated keys, as we do
batch insertions of dir index keys into a log tree.
Fix this by setting 'err' to the value of 'ret' in case
btrfs_search_slot() or btrfs_previous_item() returned an error. That will
result in falling back to a full transaction commit.
Reported-by: David Arendt <admin(a)prnet.org>
Link: https://lore.kernel.org/linux-btrfs/ae169fc6-f504-28f0-a098-6fa6a4dfb612@le…
Fixes: e02119d5a7b4 ("Btrfs: Add a write ahead tree log to optimize synchronous operations")
CC: stable(a)vger.kernel.org # 4.14+
Reviewed-by: Josef Bacik <josef(a)toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index fb52aa060093..3ef0266e9527 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -3826,7 +3826,10 @@ static noinline int log_dir_items(struct btrfs_trans_handle *trans,
path->slots[0]);
if (tmp.type == BTRFS_DIR_INDEX_KEY)
last_old_dentry_offset = tmp.offset;
+ } else if (ret < 0) {
+ err = ret;
}
+
goto done;
}
@@ -3846,7 +3849,11 @@ static noinline int log_dir_items(struct btrfs_trans_handle *trans,
*/
if (tmp.type == BTRFS_DIR_INDEX_KEY)
last_old_dentry_offset = tmp.offset;
+ } else if (ret < 0) {
+ err = ret;
+ goto done;
}
+
btrfs_release_path(path);
/*
@@ -3859,6 +3866,8 @@ static noinline int log_dir_items(struct btrfs_trans_handle *trans,
*/
search:
ret = btrfs_search_slot(NULL, root, &min_key, path, 0, 0);
+ if (ret < 0)
+ err = ret;
if (ret != 0)
goto done;
The patch below does not apply to the 4.19-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
6d3d970b2735 ("btrfs: fix missing error handling when logging directory items")
732d591a5d6c ("btrfs: stop copying old dir items when logging a directory")
a450a4af7433 ("btrfs: don't log unnecessary boundary keys when logging directory")
339d03542484 ("btrfs: only copy dir index keys when logging a directory")
1b2e5e5c7fea ("btrfs: fix missing last dir item offset update when logging directory")
9798ba24cb76 ("btrfs: remove root argument from drop_one_dir_item()")
dc2872247ec0 ("btrfs: keep track of the last logged keys when logging a directory")
086dcbfa50d3 ("btrfs: insert items in batches when logging a directory when possible")
eb10d85ee77f ("btrfs: factor out the copying loop of dir items from log_dir_items()")
90d04510a774 ("btrfs: remove root argument from btrfs_log_inode() and its callees")
289cffcb0399 ("btrfs: remove no longer needed checks for NULL log context")
cfd312695b71 ("btrfs: check for error when looking up inode during dir entry replay")
8dcbc26194eb ("btrfs: unify lookup return value when dir entry is missing")
52db77791fe2 ("btrfs: deal with errors when adding inode reference during log replay")
e15ac6413745 ("btrfs: deal with errors when replaying dir entry during log replay")
77a5b9e3d14c ("btrfs: deal with errors when checking if a dir entry exists during log replay")
a7d1c5dc8632 ("btrfs: introduce btrfs_lookup_match_dir")
b590b839720c ("btrfs: avoid unnecessary logging of xattrs during fast fsyncs")
54a40fc3a1da ("btrfs: fix removed dentries still existing after log is synced")
64d6b281ba4d ("btrfs: remove unnecessary check_parent_dirs_for_sync()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6d3d970b2735b967650d319be27268fedc5598d1 Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Tue, 10 Jan 2023 14:56:34 +0000
Subject: [PATCH] btrfs: fix missing error handling when logging directory
items
When logging a directory, at log_dir_items(), if we get an error when
attempting to search the subvolume tree for a dir index item, we end up
returning 0 (success) from log_dir_items() because 'err' is left with a
value of 0.
This can lead to a few problems, specially in the case the variable
'last_offset' has a value of (u64)-1 (and it's initialized to that when
it was declared):
1) By returning from log_dir_items() with success (0) and a value of
(u64)-1 for '*last_offset_ret', we end up not logging any other dir
index keys that follow the missing, just deleted, index key. The
(u64)-1 value makes log_directory_changes() not call log_dir_items()
again;
2) Before returning with success (0), log_dir_items(), will log a dir
index range item covering a range from the last old dentry index
(stored in the variable 'last_old_dentry_offset') to the value of
'last_offset'. If 'last_offset' has a value of (u64)-1, then it means
if the log is persisted and replayed after a power failure, it will
cause deletion of all the directory entries that have an index number
between last_old_dentry_offset + 1 and (u64)-1;
3) We can end up returning from log_dir_items() with
ctx->last_dir_item_offset having a lower value than
inode->last_dir_index_offset, because the former is set to the current
key we are processing at process_dir_items_leaf(), and at the end of
log_directory_changes() we set inode->last_dir_index_offset to the
current value of ctx->last_dir_item_offset. So if for example a
deletion of a lower dir index key happened, we set
ctx->last_dir_item_offset to that index value, then if we return from
log_dir_items() because btrfs_search_slot() returned an error, we end up
returning without any error from log_dir_items() and then
log_directory_changes() sets inode->last_dir_index_offset to a lower
value than it had before.
This can result in unpredictable and unexpected behaviour when we
need to log again the directory in the same transaction, and can result
in ending up with a log tree leaf that has duplicated keys, as we do
batch insertions of dir index keys into a log tree.
Fix this by setting 'err' to the value of 'ret' in case
btrfs_search_slot() or btrfs_previous_item() returned an error. That will
result in falling back to a full transaction commit.
Reported-by: David Arendt <admin(a)prnet.org>
Link: https://lore.kernel.org/linux-btrfs/ae169fc6-f504-28f0-a098-6fa6a4dfb612@le…
Fixes: e02119d5a7b4 ("Btrfs: Add a write ahead tree log to optimize synchronous operations")
CC: stable(a)vger.kernel.org # 4.14+
Reviewed-by: Josef Bacik <josef(a)toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index fb52aa060093..3ef0266e9527 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -3826,7 +3826,10 @@ static noinline int log_dir_items(struct btrfs_trans_handle *trans,
path->slots[0]);
if (tmp.type == BTRFS_DIR_INDEX_KEY)
last_old_dentry_offset = tmp.offset;
+ } else if (ret < 0) {
+ err = ret;
}
+
goto done;
}
@@ -3846,7 +3849,11 @@ static noinline int log_dir_items(struct btrfs_trans_handle *trans,
*/
if (tmp.type == BTRFS_DIR_INDEX_KEY)
last_old_dentry_offset = tmp.offset;
+ } else if (ret < 0) {
+ err = ret;
+ goto done;
}
+
btrfs_release_path(path);
/*
@@ -3859,6 +3866,8 @@ static noinline int log_dir_items(struct btrfs_trans_handle *trans,
*/
search:
ret = btrfs_search_slot(NULL, root, &min_key, path, 0, 0);
+ if (ret < 0)
+ err = ret;
if (ret != 0)
goto done;
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
6d3d970b2735 ("btrfs: fix missing error handling when logging directory items")
732d591a5d6c ("btrfs: stop copying old dir items when logging a directory")
a450a4af7433 ("btrfs: don't log unnecessary boundary keys when logging directory")
339d03542484 ("btrfs: only copy dir index keys when logging a directory")
1b2e5e5c7fea ("btrfs: fix missing last dir item offset update when logging directory")
9798ba24cb76 ("btrfs: remove root argument from drop_one_dir_item()")
dc2872247ec0 ("btrfs: keep track of the last logged keys when logging a directory")
086dcbfa50d3 ("btrfs: insert items in batches when logging a directory when possible")
eb10d85ee77f ("btrfs: factor out the copying loop of dir items from log_dir_items()")
90d04510a774 ("btrfs: remove root argument from btrfs_log_inode() and its callees")
289cffcb0399 ("btrfs: remove no longer needed checks for NULL log context")
cfd312695b71 ("btrfs: check for error when looking up inode during dir entry replay")
8dcbc26194eb ("btrfs: unify lookup return value when dir entry is missing")
52db77791fe2 ("btrfs: deal with errors when adding inode reference during log replay")
e15ac6413745 ("btrfs: deal with errors when replaying dir entry during log replay")
77a5b9e3d14c ("btrfs: deal with errors when checking if a dir entry exists during log replay")
a7d1c5dc8632 ("btrfs: introduce btrfs_lookup_match_dir")
b590b839720c ("btrfs: avoid unnecessary logging of xattrs during fast fsyncs")
54a40fc3a1da ("btrfs: fix removed dentries still existing after log is synced")
64d6b281ba4d ("btrfs: remove unnecessary check_parent_dirs_for_sync()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6d3d970b2735b967650d319be27268fedc5598d1 Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Tue, 10 Jan 2023 14:56:34 +0000
Subject: [PATCH] btrfs: fix missing error handling when logging directory
items
When logging a directory, at log_dir_items(), if we get an error when
attempting to search the subvolume tree for a dir index item, we end up
returning 0 (success) from log_dir_items() because 'err' is left with a
value of 0.
This can lead to a few problems, specially in the case the variable
'last_offset' has a value of (u64)-1 (and it's initialized to that when
it was declared):
1) By returning from log_dir_items() with success (0) and a value of
(u64)-1 for '*last_offset_ret', we end up not logging any other dir
index keys that follow the missing, just deleted, index key. The
(u64)-1 value makes log_directory_changes() not call log_dir_items()
again;
2) Before returning with success (0), log_dir_items(), will log a dir
index range item covering a range from the last old dentry index
(stored in the variable 'last_old_dentry_offset') to the value of
'last_offset'. If 'last_offset' has a value of (u64)-1, then it means
if the log is persisted and replayed after a power failure, it will
cause deletion of all the directory entries that have an index number
between last_old_dentry_offset + 1 and (u64)-1;
3) We can end up returning from log_dir_items() with
ctx->last_dir_item_offset having a lower value than
inode->last_dir_index_offset, because the former is set to the current
key we are processing at process_dir_items_leaf(), and at the end of
log_directory_changes() we set inode->last_dir_index_offset to the
current value of ctx->last_dir_item_offset. So if for example a
deletion of a lower dir index key happened, we set
ctx->last_dir_item_offset to that index value, then if we return from
log_dir_items() because btrfs_search_slot() returned an error, we end up
returning without any error from log_dir_items() and then
log_directory_changes() sets inode->last_dir_index_offset to a lower
value than it had before.
This can result in unpredictable and unexpected behaviour when we
need to log again the directory in the same transaction, and can result
in ending up with a log tree leaf that has duplicated keys, as we do
batch insertions of dir index keys into a log tree.
Fix this by setting 'err' to the value of 'ret' in case
btrfs_search_slot() or btrfs_previous_item() returned an error. That will
result in falling back to a full transaction commit.
Reported-by: David Arendt <admin(a)prnet.org>
Link: https://lore.kernel.org/linux-btrfs/ae169fc6-f504-28f0-a098-6fa6a4dfb612@le…
Fixes: e02119d5a7b4 ("Btrfs: Add a write ahead tree log to optimize synchronous operations")
CC: stable(a)vger.kernel.org # 4.14+
Reviewed-by: Josef Bacik <josef(a)toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index fb52aa060093..3ef0266e9527 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -3826,7 +3826,10 @@ static noinline int log_dir_items(struct btrfs_trans_handle *trans,
path->slots[0]);
if (tmp.type == BTRFS_DIR_INDEX_KEY)
last_old_dentry_offset = tmp.offset;
+ } else if (ret < 0) {
+ err = ret;
}
+
goto done;
}
@@ -3846,7 +3849,11 @@ static noinline int log_dir_items(struct btrfs_trans_handle *trans,
*/
if (tmp.type == BTRFS_DIR_INDEX_KEY)
last_old_dentry_offset = tmp.offset;
+ } else if (ret < 0) {
+ err = ret;
+ goto done;
}
+
btrfs_release_path(path);
/*
@@ -3859,6 +3866,8 @@ static noinline int log_dir_items(struct btrfs_trans_handle *trans,
*/
search:
ret = btrfs_search_slot(NULL, root, &min_key, path, 0, 0);
+ if (ret < 0)
+ err = ret;
if (ret != 0)
goto done;
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
6d3d970b2735 ("btrfs: fix missing error handling when logging directory items")
732d591a5d6c ("btrfs: stop copying old dir items when logging a directory")
a450a4af7433 ("btrfs: don't log unnecessary boundary keys when logging directory")
339d03542484 ("btrfs: only copy dir index keys when logging a directory")
1b2e5e5c7fea ("btrfs: fix missing last dir item offset update when logging directory")
9798ba24cb76 ("btrfs: remove root argument from drop_one_dir_item()")
dc2872247ec0 ("btrfs: keep track of the last logged keys when logging a directory")
086dcbfa50d3 ("btrfs: insert items in batches when logging a directory when possible")
eb10d85ee77f ("btrfs: factor out the copying loop of dir items from log_dir_items()")
90d04510a774 ("btrfs: remove root argument from btrfs_log_inode() and its callees")
289cffcb0399 ("btrfs: remove no longer needed checks for NULL log context")
cfd312695b71 ("btrfs: check for error when looking up inode during dir entry replay")
8dcbc26194eb ("btrfs: unify lookup return value when dir entry is missing")
52db77791fe2 ("btrfs: deal with errors when adding inode reference during log replay")
e15ac6413745 ("btrfs: deal with errors when replaying dir entry during log replay")
77a5b9e3d14c ("btrfs: deal with errors when checking if a dir entry exists during log replay")
a7d1c5dc8632 ("btrfs: introduce btrfs_lookup_match_dir")
b590b839720c ("btrfs: avoid unnecessary logging of xattrs during fast fsyncs")
54a40fc3a1da ("btrfs: fix removed dentries still existing after log is synced")
64d6b281ba4d ("btrfs: remove unnecessary check_parent_dirs_for_sync()")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6d3d970b2735b967650d319be27268fedc5598d1 Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Tue, 10 Jan 2023 14:56:34 +0000
Subject: [PATCH] btrfs: fix missing error handling when logging directory
items
When logging a directory, at log_dir_items(), if we get an error when
attempting to search the subvolume tree for a dir index item, we end up
returning 0 (success) from log_dir_items() because 'err' is left with a
value of 0.
This can lead to a few problems, specially in the case the variable
'last_offset' has a value of (u64)-1 (and it's initialized to that when
it was declared):
1) By returning from log_dir_items() with success (0) and a value of
(u64)-1 for '*last_offset_ret', we end up not logging any other dir
index keys that follow the missing, just deleted, index key. The
(u64)-1 value makes log_directory_changes() not call log_dir_items()
again;
2) Before returning with success (0), log_dir_items(), will log a dir
index range item covering a range from the last old dentry index
(stored in the variable 'last_old_dentry_offset') to the value of
'last_offset'. If 'last_offset' has a value of (u64)-1, then it means
if the log is persisted and replayed after a power failure, it will
cause deletion of all the directory entries that have an index number
between last_old_dentry_offset + 1 and (u64)-1;
3) We can end up returning from log_dir_items() with
ctx->last_dir_item_offset having a lower value than
inode->last_dir_index_offset, because the former is set to the current
key we are processing at process_dir_items_leaf(), and at the end of
log_directory_changes() we set inode->last_dir_index_offset to the
current value of ctx->last_dir_item_offset. So if for example a
deletion of a lower dir index key happened, we set
ctx->last_dir_item_offset to that index value, then if we return from
log_dir_items() because btrfs_search_slot() returned an error, we end up
returning without any error from log_dir_items() and then
log_directory_changes() sets inode->last_dir_index_offset to a lower
value than it had before.
This can result in unpredictable and unexpected behaviour when we
need to log again the directory in the same transaction, and can result
in ending up with a log tree leaf that has duplicated keys, as we do
batch insertions of dir index keys into a log tree.
Fix this by setting 'err' to the value of 'ret' in case
btrfs_search_slot() or btrfs_previous_item() returned an error. That will
result in falling back to a full transaction commit.
Reported-by: David Arendt <admin(a)prnet.org>
Link: https://lore.kernel.org/linux-btrfs/ae169fc6-f504-28f0-a098-6fa6a4dfb612@le…
Fixes: e02119d5a7b4 ("Btrfs: Add a write ahead tree log to optimize synchronous operations")
CC: stable(a)vger.kernel.org # 4.14+
Reviewed-by: Josef Bacik <josef(a)toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index fb52aa060093..3ef0266e9527 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -3826,7 +3826,10 @@ static noinline int log_dir_items(struct btrfs_trans_handle *trans,
path->slots[0]);
if (tmp.type == BTRFS_DIR_INDEX_KEY)
last_old_dentry_offset = tmp.offset;
+ } else if (ret < 0) {
+ err = ret;
}
+
goto done;
}
@@ -3846,7 +3849,11 @@ static noinline int log_dir_items(struct btrfs_trans_handle *trans,
*/
if (tmp.type == BTRFS_DIR_INDEX_KEY)
last_old_dentry_offset = tmp.offset;
+ } else if (ret < 0) {
+ err = ret;
+ goto done;
}
+
btrfs_release_path(path);
/*
@@ -3859,6 +3866,8 @@ static noinline int log_dir_items(struct btrfs_trans_handle *trans,
*/
search:
ret = btrfs_search_slot(NULL, root, &min_key, path, 0, 0);
+ if (ret < 0)
+ err = ret;
if (ret != 0)
goto done;
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
6d3d970b2735 ("btrfs: fix missing error handling when logging directory items")
732d591a5d6c ("btrfs: stop copying old dir items when logging a directory")
a450a4af7433 ("btrfs: don't log unnecessary boundary keys when logging directory")
339d03542484 ("btrfs: only copy dir index keys when logging a directory")
1b2e5e5c7fea ("btrfs: fix missing last dir item offset update when logging directory")
9798ba24cb76 ("btrfs: remove root argument from drop_one_dir_item()")
dc2872247ec0 ("btrfs: keep track of the last logged keys when logging a directory")
086dcbfa50d3 ("btrfs: insert items in batches when logging a directory when possible")
eb10d85ee77f ("btrfs: factor out the copying loop of dir items from log_dir_items()")
90d04510a774 ("btrfs: remove root argument from btrfs_log_inode() and its callees")
289cffcb0399 ("btrfs: remove no longer needed checks for NULL log context")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 6d3d970b2735b967650d319be27268fedc5598d1 Mon Sep 17 00:00:00 2001
From: Filipe Manana <fdmanana(a)suse.com>
Date: Tue, 10 Jan 2023 14:56:34 +0000
Subject: [PATCH] btrfs: fix missing error handling when logging directory
items
When logging a directory, at log_dir_items(), if we get an error when
attempting to search the subvolume tree for a dir index item, we end up
returning 0 (success) from log_dir_items() because 'err' is left with a
value of 0.
This can lead to a few problems, specially in the case the variable
'last_offset' has a value of (u64)-1 (and it's initialized to that when
it was declared):
1) By returning from log_dir_items() with success (0) and a value of
(u64)-1 for '*last_offset_ret', we end up not logging any other dir
index keys that follow the missing, just deleted, index key. The
(u64)-1 value makes log_directory_changes() not call log_dir_items()
again;
2) Before returning with success (0), log_dir_items(), will log a dir
index range item covering a range from the last old dentry index
(stored in the variable 'last_old_dentry_offset') to the value of
'last_offset'. If 'last_offset' has a value of (u64)-1, then it means
if the log is persisted and replayed after a power failure, it will
cause deletion of all the directory entries that have an index number
between last_old_dentry_offset + 1 and (u64)-1;
3) We can end up returning from log_dir_items() with
ctx->last_dir_item_offset having a lower value than
inode->last_dir_index_offset, because the former is set to the current
key we are processing at process_dir_items_leaf(), and at the end of
log_directory_changes() we set inode->last_dir_index_offset to the
current value of ctx->last_dir_item_offset. So if for example a
deletion of a lower dir index key happened, we set
ctx->last_dir_item_offset to that index value, then if we return from
log_dir_items() because btrfs_search_slot() returned an error, we end up
returning without any error from log_dir_items() and then
log_directory_changes() sets inode->last_dir_index_offset to a lower
value than it had before.
This can result in unpredictable and unexpected behaviour when we
need to log again the directory in the same transaction, and can result
in ending up with a log tree leaf that has duplicated keys, as we do
batch insertions of dir index keys into a log tree.
Fix this by setting 'err' to the value of 'ret' in case
btrfs_search_slot() or btrfs_previous_item() returned an error. That will
result in falling back to a full transaction commit.
Reported-by: David Arendt <admin(a)prnet.org>
Link: https://lore.kernel.org/linux-btrfs/ae169fc6-f504-28f0-a098-6fa6a4dfb612@le…
Fixes: e02119d5a7b4 ("Btrfs: Add a write ahead tree log to optimize synchronous operations")
CC: stable(a)vger.kernel.org # 4.14+
Reviewed-by: Josef Bacik <josef(a)toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana(a)suse.com>
Signed-off-by: David Sterba <dsterba(a)suse.com>
diff --git a/fs/btrfs/tree-log.c b/fs/btrfs/tree-log.c
index fb52aa060093..3ef0266e9527 100644
--- a/fs/btrfs/tree-log.c
+++ b/fs/btrfs/tree-log.c
@@ -3826,7 +3826,10 @@ static noinline int log_dir_items(struct btrfs_trans_handle *trans,
path->slots[0]);
if (tmp.type == BTRFS_DIR_INDEX_KEY)
last_old_dentry_offset = tmp.offset;
+ } else if (ret < 0) {
+ err = ret;
}
+
goto done;
}
@@ -3846,7 +3849,11 @@ static noinline int log_dir_items(struct btrfs_trans_handle *trans,
*/
if (tmp.type == BTRFS_DIR_INDEX_KEY)
last_old_dentry_offset = tmp.offset;
+ } else if (ret < 0) {
+ err = ret;
+ goto done;
}
+
btrfs_release_path(path);
/*
@@ -3859,6 +3866,8 @@ static noinline int log_dir_items(struct btrfs_trans_handle *trans,
*/
search:
ret = btrfs_search_slot(NULL, root, &min_key, path, 0, 0);
+ if (ret < 0)
+ err = ret;
if (ret != 0)
goto done;
The patch below does not apply to the 4.14-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
2efb6edd52dc ("comedi: adv_pci1760: Fix PWM instruction handling")
8ffdff6a8cfb ("staging: comedi: move out of staging directory")
5b7b4ce1d116 ("staging: comedi: tests: example_test: Rename to 'comedi_example_test'")
a98f670e41a9 ("Merge tag 'media/v5.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/mchehab/linux-media")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2efb6edd52dc50273f5e68ad863dd1b1fb2f2d1c Mon Sep 17 00:00:00 2001
From: Ian Abbott <abbotti(a)mev.co.uk>
Date: Tue, 3 Jan 2023 14:37:54 +0000
Subject: [PATCH] comedi: adv_pci1760: Fix PWM instruction handling
(Actually, this is fixing the "Read the Current Status" command sent to
the device's outgoing mailbox, but it is only currently used for the PWM
instructions.)
The PCI-1760 is operated mostly by sending commands to a set of Outgoing
Mailbox registers, waiting for the command to complete, and reading the
result from the Incoming Mailbox registers. One of these commands is
the "Read the Current Status" command. The number of this command is
0x07 (see the User's Manual for the PCI-1760 at
<https://advdownload.advantech.com/productfile/Downloadfile2/1-11P6653/PCI-1…>.
The `PCI1760_CMD_GET_STATUS` macro defined in the driver should expand
to this command number 0x07, but unfortunately it currently expands to
0x03. (Command number 0x03 is not defined in the User's Manual.)
Correct the definition of the `PCI1760_CMD_GET_STATUS` macro to fix it.
This is used by all the PWM subdevice related instructions handled by
`pci1760_pwm_insn_config()` which are probably all broken. The effect
of sending the undefined command number 0x03 is not known.
Fixes: 14b93bb6bbf0 ("staging: comedi: adv_pci_dio: separate out PCI-1760 support")
Cc: <stable(a)vger.kernel.org> # v4.5+
Signed-off-by: Ian Abbott <abbotti(a)mev.co.uk>
Link: https://lore.kernel.org/r/20230103143754.17564-1-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
diff --git a/drivers/comedi/drivers/adv_pci1760.c b/drivers/comedi/drivers/adv_pci1760.c
index fcfc2e299110..27f3890f471d 100644
--- a/drivers/comedi/drivers/adv_pci1760.c
+++ b/drivers/comedi/drivers/adv_pci1760.c
@@ -58,7 +58,7 @@
#define PCI1760_CMD_CLR_IMB2 0x00 /* Clears IMB2 */
#define PCI1760_CMD_SET_DO 0x01 /* Set output state */
#define PCI1760_CMD_GET_DO 0x02 /* Read output status */
-#define PCI1760_CMD_GET_STATUS 0x03 /* Read current status */
+#define PCI1760_CMD_GET_STATUS 0x07 /* Read current status */
#define PCI1760_CMD_GET_FW_VER 0x0e /* Read firmware version */
#define PCI1760_CMD_GET_HW_VER 0x0f /* Read hardware version */
#define PCI1760_CMD_SET_PWM_HI(x) (0x10 + (x) * 2) /* Set "hi" period */
On 22.01.23 05:28, Sasha Levin wrote:
> This is a note to let you know that I've just added the patch titled
>
> wifi: mac80211: Drop support for TX push path
>
> to the 6.1-stable tree which can be found at:
> http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=sum…
>
> The filename of the patch is:
> wifi-mac80211-drop-support-for-tx-push-path.patch
> and it can be found in the queue-6.1 subdirectory.
>
> If you, or anyone else, feels it should not be added to the stable tree,
> please let <stable(a)vger.kernel.org> know about it.
>
We should at least have a discussion about that.
While I think we have sorted out all related regressions it's still way
too early to be sure.
The patch is also changing most mac80211 driver interfaces from queuing
to non-queuing and is thus nothing I would do within a fix release.
All in all it's more likely to cause issues than fix them, at last at
this point in time.
So do we really want to backport that to (all) stable kernels?
I've also just backported the two for stable relevant patches which
depend on the iTXQ transformation:
https://lore.kernel.org/r/20230121223330.389255-2-alexander@wetzel-home.dehttps://lore.kernel.org/r/20230121223330.389255-2-alexander@wetzel-home.de
If there are more patches can point them out to me and I'll should be
able port them, too.
All in all I see no pressing need to retire the old push path for stable
kernels at that time.
Question is also where to stop if we back port it now:
The transition to iTXQ is only the first step to get rid of the old push
path in mac80211. Working patch titels are:
1) wifi: mac80211: Always provide the MMPDU TXQ
2) wifi: mac80211: Convert vif->txq to an array
3) wifi: mac80211: add new iTXQs to replace remaining legacy TX
4) wifi: mac80211: Stop using legacy TX path
5) wifi: mac80211: Cleanup legacy TX path - AMPDU
6) WIP: Drop pending
7) wifi: mac80211: integrate PS buffering into iTXQ
8) wifi: mac80211: handle filtered frames within iTXQs
Patch 6) is only a rough skeleton so far and 4-8 still need at least
some moderate work. All in all thinks seem to hash out quite well and
I'm hoping to get them merged for 6.4.
Together they are fundamentally altering the TX path and nothing I would
like to backport to stable.
Alexander
From: Sean Christopherson <seanjc(a)google.com>
Since VMX and SVM both would never update the control bits if exits
are disable after vCPUs are created, only allow setting exits
disable flag before vCPU creation.
Fixes: 4d5422cea3b6 ("KVM: X86: Provide a capability to disable MWAIT intercepts")
Signed-off-by: Sean Christopherson <seanjc(a)google.com>
Signed-off-by: Kechen Lu <kechenl(a)nvidia.com>
Cc: stable(a)vger.kernel.org
---
Documentation/virt/kvm/api.rst | 1 +
arch/x86/kvm/x86.c | 6 ++++++
2 files changed, 7 insertions(+)
diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 9807b05a1b57..fb0fcc566d5a 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -7087,6 +7087,7 @@ branch to guests' 0x200 interrupt vector.
:Architectures: x86
:Parameters: args[0] defines which exits are disabled
:Returns: 0 on success, -EINVAL when args[0] contains invalid exits
+ or if any vCPU has already been created
Valid bits in args[0] are::
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index da4bbd043a7b..c8ae9c4f9f08 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -6227,6 +6227,10 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
if (cap->args[0] & ~KVM_X86_DISABLE_VALID_EXITS)
break;
+ mutex_lock(&kvm->lock);
+ if (kvm->created_vcpus)
+ goto disable_exits_unlock;
+
if ((cap->args[0] & KVM_X86_DISABLE_EXITS_MWAIT) &&
kvm_can_mwait_in_guest())
kvm->arch.mwait_in_guest = true;
@@ -6237,6 +6241,8 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
if (cap->args[0] & KVM_X86_DISABLE_EXITS_CSTATE)
kvm->arch.cstate_in_guest = true;
r = 0;
+disable_exits_unlock:
+ mutex_unlock(&kvm->lock);
break;
case KVM_CAP_MSR_PLATFORM_INFO:
kvm->arch.guest_can_read_msr_platform_info = cap->args[0];
--
2.34.1
Hi,
Dig a final round of digging, and found two sets of missing backports:
1) File position fixes
2) fsnotify fix (original is in stable, not the fixup)
With these, I've verified that 5.10-stable and 5.15-stable both fully
pass the liburing regression suite.
Please queue up for 5.10-stable and 5.15-stable, thanks!
--
Jens Axboe
Hi,
Noticed one more missing patch, here's a backport that applies to both
the 5.10 and 5.15 stable branches. Please apply to both of them, thanks!
--
Jens Axboe
Hi,
Same series as I just sent for 5.15-stable, except 5.10-stable already
has the three wakeup patches from that series, and two patches were
missing from 5.10-stable that got auto-picked for 5.15-stable. Not
quite sure why, as they apply directly... Possibly because they
coincided with the move to the io_uring/ directory.
In fact the rest are identical, they apply directly to 5.10-stable.
Yay for a unified backport base! These have been runtime tested on
top of the current 5.10-stable tree, 5.10.164.
Please apply for the next 5.10-stable release, thanks!
--
Jens Axboe
I have a transaction which is of mutual benefits and I would like to share with you. if interested for more information please get back to me via my email: david.murray606(a)gmail.com
Regards.
David Murray
--
Este mensaje ha sido analizado por MailScanner
en busca de virus y otros contenidos peligrosos,
y se considera que está limpio.
I have a transaction which is of mutual benefits and I would like to share with you. if interested for more information please get back to me via my email: david.murray606(a)gmail.com
Regards.
David Murray
--
Este mensaje ha sido analizado por MailScanner
en busca de virus y otros contenidos peligrosos,
y se considera que está limpio.
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
272970be3dab ("Bluetooth: hci_qca: Fix driver shutdown on closed serdev")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 272970be3dabd24cbe50e393ffee8f04aec3b9a8 Mon Sep 17 00:00:00 2001
From: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Date: Thu, 29 Dec 2022 11:28:29 +0100
Subject: [PATCH] Bluetooth: hci_qca: Fix driver shutdown on closed serdev
The driver shutdown callback (which sends EDL_SOC_RESET to the device
over serdev) should not be invoked when HCI device is not open (e.g. if
hci_dev_open_sync() failed), because the serdev and its TTY are not open
either. Also skip this step if device is powered off
(qca_power_shutdown()).
The shutdown callback causes use-after-free during system reboot with
Qualcomm Atheros Bluetooth:
Unable to handle kernel paging request at virtual address
0072662f67726fd7
...
CPU: 6 PID: 1 Comm: systemd-shutdow Tainted: G W
6.1.0-rt5-00325-g8a5f56bcfcca #8
Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT)
Call trace:
tty_driver_flush_buffer+0x4/0x30
serdev_device_write_flush+0x24/0x34
qca_serdev_shutdown+0x80/0x130 [hci_uart]
device_shutdown+0x15c/0x260
kernel_restart+0x48/0xac
KASAN report:
BUG: KASAN: use-after-free in tty_driver_flush_buffer+0x1c/0x50
Read of size 8 at addr ffff16270c2e0018 by task systemd-shutdow/1
CPU: 7 PID: 1 Comm: systemd-shutdow Not tainted
6.1.0-next-20221220-00014-gb85aaf97fb01-dirty #28
Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT)
Call trace:
dump_backtrace.part.0+0xdc/0xf0
show_stack+0x18/0x30
dump_stack_lvl+0x68/0x84
print_report+0x188/0x488
kasan_report+0xa4/0xf0
__asan_load8+0x80/0xac
tty_driver_flush_buffer+0x1c/0x50
ttyport_write_flush+0x34/0x44
serdev_device_write_flush+0x48/0x60
qca_serdev_shutdown+0x124/0x274
device_shutdown+0x1e8/0x350
kernel_restart+0x48/0xb0
__do_sys_reboot+0x244/0x2d0
__arm64_sys_reboot+0x54/0x70
invoke_syscall+0x60/0x190
el0_svc_common.constprop.0+0x7c/0x160
do_el0_svc+0x44/0xf0
el0_svc+0x2c/0x6c
el0t_64_sync_handler+0xbc/0x140
el0t_64_sync+0x190/0x194
Fixes: 7e7bbddd029b ("Bluetooth: hci_qca: Fix qca6390 enable failure after warm reboot")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz(a)intel.com>
diff --git a/drivers/bluetooth/hci_qca.c b/drivers/bluetooth/hci_qca.c
index 6eddc23e49d9..bbe9cf1cae27 100644
--- a/drivers/bluetooth/hci_qca.c
+++ b/drivers/bluetooth/hci_qca.c
@@ -2164,10 +2164,17 @@ static void qca_serdev_shutdown(struct device *dev)
int timeout = msecs_to_jiffies(CMD_TRANS_TIMEOUT_MS);
struct serdev_device *serdev = to_serdev_device(dev);
struct qca_serdev *qcadev = serdev_device_get_drvdata(serdev);
+ struct hci_uart *hu = &qcadev->serdev_hu;
+ struct hci_dev *hdev = hu->hdev;
+ struct qca_data *qca = hu->priv;
const u8 ibs_wake_cmd[] = { 0xFD };
const u8 edl_reset_soc_cmd[] = { 0x01, 0x00, 0xFC, 0x01, 0x05 };
if (qcadev->btsoc_type == QCA_QCA6390) {
+ if (test_bit(QCA_BT_OFF, &qca->flags) ||
+ !test_bit(HCI_RUNNING, &hdev->flags))
+ return;
+
serdev_device_write_flush(serdev);
ret = serdev_device_write_buf(serdev, ibs_wake_cmd,
sizeof(ibs_wake_cmd));
The patch below does not apply to the 5.4-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
ab0c3f1251b4 ("mm/khugepaged: fix collapse_pte_mapped_thp() to allow anon_vma")
8d3c106e19e8 ("mm/khugepaged: take the right locks for page table retraction")
34488399fa08 ("mm/madvise: add file and shmem support to MADV_COLLAPSE")
58ac9a8993a1 ("mm/khugepaged: attempt to map file/shmem-backed pte-mapped THPs by pmds")
780a4b6fb865 ("mm/khugepaged: check compound_order() in collapse_pte_mapped_thp()")
b26e27015ec9 ("mm: thp: convert to use common struct mm_slot")
685405020b9f ("mm/khugepaged: stop using vma linked list")
7d2c4385c341 ("mm/khugepaged: rename prefix of shared collapse functions")
7d8faaf15545 ("mm/madvise: introduce MADV_COLLAPSE sync hugepage collapse")
507228044236 ("mm/khugepaged: record SCAN_PMD_MAPPED when scan_pmd() finds hugepage")
a7f4e6e4c47c ("mm/thp: add flag to enforce sysfs THP in hugepage_vma_check()")
50ad2f24b3b4 ("mm/khugepaged: propagate enum scan_result codes back to callers")
9710a78ab2ae ("mm/khugepaged: dedup and simplify hugepage alloc and charging")
34d6b470ab9c ("mm/khugepaged: add struct collapse_control")
c6a7f445a272 ("mm: khugepaged: don't carry huge page to the next loop for !CONFIG_NUMA")
1064026bab9f ("mm: khugepaged: reorg some khugepaged helpers")
7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()")
9fec51689ff6 ("mm: thp: kill transparent_hugepage_active()")
f707fa493784 ("mm: khugepaged: better comments for anon vma check in hugepage_vma_revalidate")
4fa6893faeaa ("mm: thp: consolidate vma size check to transhuge_vma_suitable")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ab0c3f1251b4670978fde0bd54161795a139b060 Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd(a)google.com>
Date: Thu, 22 Dec 2022 12:41:50 -0800
Subject: [PATCH] mm/khugepaged: fix collapse_pte_mapped_thp() to allow
anon_vma
uprobe_write_opcode() uses collapse_pte_mapped_thp() to restore huge pmd,
when removing a breakpoint from hugepage text: vma->anon_vma is always set
in that case, so undo the prohibition. And MADV_COLLAPSE ought to be able
to collapse some page tables in a vma which happens to have anon_vma set
from CoWing elsewhere.
Is anon_vma lock required? Almost not: if any page other than expected
subpage of the non-anon huge page is found in the page table, collapse is
aborted without making any change. However, it is possible that an anon
page was CoWed from this extent in another mm or vma, in which case a
concurrent lookup might look here: so keep it away while clearing pmd (but
perhaps we shall go back to using pmd_lock() there in future).
Note that collapse_pte_mapped_thp() is exceptional in freeing a page table
without having cleared its ptes: I'm uneasy about that, and had thought
pte_clear()ing appropriate; but exclusive i_mmap lock does fix the
problem, and we would have to move the mmu_notification if clearing those
ptes.
What this fixes is not a dangerous instability. But I suggest Cc stable
because uprobes "healing" has regressed in that way, so this should follow
8d3c106e19e8 into those stable releases where it was backported (and may
want adjustment there - I'll supply backports as needed).
Link: https://lkml.kernel.org/r/b740c9fb-edba-92ba-59fb-7a5592e5dfc@google.com
Fixes: 8d3c106e19e8 ("mm/khugepaged: take the right locks for page table retraction")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Jann Horn <jannh(a)google.com>
Cc: Yang Shi <shy828301(a)gmail.com>
Cc: Zach O'Keefe <zokeefe(a)google.com>
Cc: Song Liu <songliubraving(a)fb.com>
Cc: <stable(a)vger.kernel.org> [5.4+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 5cb401aa2b9d..9a0135b39b19 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1460,14 +1460,6 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
if (!hugepage_vma_check(vma, vma->vm_flags, false, false, false))
return SCAN_VMA_CHECK;
- /*
- * Symmetry with retract_page_tables(): Exclude MAP_PRIVATE mappings
- * that got written to. Without this, we'd have to also lock the
- * anon_vma if one exists.
- */
- if (vma->anon_vma)
- return SCAN_VMA_CHECK;
-
/* Keep pmd pgtable for uffd-wp; see comment in retract_page_tables() */
if (userfaultfd_wp(vma))
return SCAN_PTE_UFFD_WP;
@@ -1567,8 +1559,14 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
}
/* step 4: remove pte entries */
+ /* we make no change to anon, but protect concurrent anon page lookup */
+ if (vma->anon_vma)
+ anon_vma_lock_write(vma->anon_vma);
+
collapse_and_free_pmd(mm, vma, haddr, pmd);
+ if (vma->anon_vma)
+ anon_vma_unlock_write(vma->anon_vma);
i_mmap_unlock_write(vma->vm_file->f_mapping);
maybe_install_pmd:
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
ab0c3f1251b4 ("mm/khugepaged: fix collapse_pte_mapped_thp() to allow anon_vma")
8d3c106e19e8 ("mm/khugepaged: take the right locks for page table retraction")
34488399fa08 ("mm/madvise: add file and shmem support to MADV_COLLAPSE")
58ac9a8993a1 ("mm/khugepaged: attempt to map file/shmem-backed pte-mapped THPs by pmds")
780a4b6fb865 ("mm/khugepaged: check compound_order() in collapse_pte_mapped_thp()")
b26e27015ec9 ("mm: thp: convert to use common struct mm_slot")
685405020b9f ("mm/khugepaged: stop using vma linked list")
7d2c4385c341 ("mm/khugepaged: rename prefix of shared collapse functions")
7d8faaf15545 ("mm/madvise: introduce MADV_COLLAPSE sync hugepage collapse")
507228044236 ("mm/khugepaged: record SCAN_PMD_MAPPED when scan_pmd() finds hugepage")
a7f4e6e4c47c ("mm/thp: add flag to enforce sysfs THP in hugepage_vma_check()")
50ad2f24b3b4 ("mm/khugepaged: propagate enum scan_result codes back to callers")
9710a78ab2ae ("mm/khugepaged: dedup and simplify hugepage alloc and charging")
34d6b470ab9c ("mm/khugepaged: add struct collapse_control")
c6a7f445a272 ("mm: khugepaged: don't carry huge page to the next loop for !CONFIG_NUMA")
1064026bab9f ("mm: khugepaged: reorg some khugepaged helpers")
7da4e2cb8b1f ("mm: thp: kill __transhuge_page_enabled()")
9fec51689ff6 ("mm: thp: kill transparent_hugepage_active()")
f707fa493784 ("mm: khugepaged: better comments for anon vma check in hugepage_vma_revalidate")
4fa6893faeaa ("mm: thp: consolidate vma size check to transhuge_vma_suitable")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From ab0c3f1251b4670978fde0bd54161795a139b060 Mon Sep 17 00:00:00 2001
From: Hugh Dickins <hughd(a)google.com>
Date: Thu, 22 Dec 2022 12:41:50 -0800
Subject: [PATCH] mm/khugepaged: fix collapse_pte_mapped_thp() to allow
anon_vma
uprobe_write_opcode() uses collapse_pte_mapped_thp() to restore huge pmd,
when removing a breakpoint from hugepage text: vma->anon_vma is always set
in that case, so undo the prohibition. And MADV_COLLAPSE ought to be able
to collapse some page tables in a vma which happens to have anon_vma set
from CoWing elsewhere.
Is anon_vma lock required? Almost not: if any page other than expected
subpage of the non-anon huge page is found in the page table, collapse is
aborted without making any change. However, it is possible that an anon
page was CoWed from this extent in another mm or vma, in which case a
concurrent lookup might look here: so keep it away while clearing pmd (but
perhaps we shall go back to using pmd_lock() there in future).
Note that collapse_pte_mapped_thp() is exceptional in freeing a page table
without having cleared its ptes: I'm uneasy about that, and had thought
pte_clear()ing appropriate; but exclusive i_mmap lock does fix the
problem, and we would have to move the mmu_notification if clearing those
ptes.
What this fixes is not a dangerous instability. But I suggest Cc stable
because uprobes "healing" has regressed in that way, so this should follow
8d3c106e19e8 into those stable releases where it was backported (and may
want adjustment there - I'll supply backports as needed).
Link: https://lkml.kernel.org/r/b740c9fb-edba-92ba-59fb-7a5592e5dfc@google.com
Fixes: 8d3c106e19e8 ("mm/khugepaged: take the right locks for page table retraction")
Signed-off-by: Hugh Dickins <hughd(a)google.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Cc: Jann Horn <jannh(a)google.com>
Cc: Yang Shi <shy828301(a)gmail.com>
Cc: Zach O'Keefe <zokeefe(a)google.com>
Cc: Song Liu <songliubraving(a)fb.com>
Cc: <stable(a)vger.kernel.org> [5.4+]
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 5cb401aa2b9d..9a0135b39b19 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -1460,14 +1460,6 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
if (!hugepage_vma_check(vma, vma->vm_flags, false, false, false))
return SCAN_VMA_CHECK;
- /*
- * Symmetry with retract_page_tables(): Exclude MAP_PRIVATE mappings
- * that got written to. Without this, we'd have to also lock the
- * anon_vma if one exists.
- */
- if (vma->anon_vma)
- return SCAN_VMA_CHECK;
-
/* Keep pmd pgtable for uffd-wp; see comment in retract_page_tables() */
if (userfaultfd_wp(vma))
return SCAN_PTE_UFFD_WP;
@@ -1567,8 +1559,14 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
}
/* step 4: remove pte entries */
+ /* we make no change to anon, but protect concurrent anon page lookup */
+ if (vma->anon_vma)
+ anon_vma_lock_write(vma->anon_vma);
+
collapse_and_free_pmd(mm, vma, haddr, pmd);
+ if (vma->anon_vma)
+ anon_vma_unlock_write(vma->anon_vma);
i_mmap_unlock_write(vma->vm_file->f_mapping);
maybe_install_pmd:
The patch below does not apply to the 5.15-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
Possible dependencies:
b30c14cd6102 ("hugetlb: unshare some PMDs when splitting VMAs")
ecfbd733878d ("hugetlb: take hugetlb vma_lock when clearing vma_lock->vma pointer")
131a79b474e9 ("hugetlb: fix vma lock handling during split vma and range unmapping")
40549ba8f8e0 ("hugetlb: use new vma_lock for pmd sharing synchronization")
378397ccb8e5 ("hugetlb: create hugetlb_unmap_file_folio to unmap single file folio")
8d9bfb260814 ("hugetlb: add vma based lock for pmd sharing")
12710fd69634 ("hugetlb: rename vma_shareable() and refactor code")
c86272287bc6 ("hugetlb: create remove_inode_single_folio to remove single file folio")
7e1813d48dd3 ("hugetlb: rename remove_huge_page to hugetlb_delete_from_page_cache")
3a47c54f09c4 ("hugetlbfs: revert use i_mmap_rwsem for more pmd sharing synchronization")
188a39725ad7 ("hugetlbfs: revert use i_mmap_rwsem to address page fault/truncate race")
5e6b1bf1b5c3 ("hugetlb: remove meaningless BUG_ON(huge_pte_none())")
3a5497a2dae3 ("mm/hugetlb: fix missing call to restore_reserve_on_error()")
6614a3c3164a ("Merge tag 'mm-stable-2022-08-03' of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From b30c14cd61025eeea2f2e8569606cd167ba9ad2d Mon Sep 17 00:00:00 2001
From: James Houghton <jthoughton(a)google.com>
Date: Wed, 4 Jan 2023 23:19:10 +0000
Subject: [PATCH] hugetlb: unshare some PMDs when splitting VMAs
PMD sharing can only be done in PUD_SIZE-aligned pieces of VMAs; however,
it is possible that HugeTLB VMAs are split without unsharing the PMDs
first.
Without this fix, it is possible to hit the uffd-wp-related WARN_ON_ONCE
in hugetlb_change_protection [1]. The key there is that
hugetlb_unshare_all_pmds will not attempt to unshare PMDs in
non-PUD_SIZE-aligned sections of the VMA.
It might seem ideal to unshare in hugetlb_vm_op_open, but we need to
unshare in both the new and old VMAs, so unsharing in hugetlb_vm_op_split
seems natural.
[1]: https://lore.kernel.org/linux-mm/CADrL8HVeOkj0QH5VZZbRzybNE8CG-tEGFshnA+bG9…
Link: https://lkml.kernel.org/r/20230104231910.1464197-1-jthoughton@google.com
Fixes: 6dfeaff93be1 ("hugetlb/userfaultfd: unshare all pmds for hugetlbfs when register wp")
Signed-off-by: James Houghton <jthoughton(a)google.com>
Reviewed-by: Mike Kravetz <mike.kravetz(a)oracle.com>
Acked-by: Peter Xu <peterx(a)redhat.com>
Cc: Axel Rasmussen <axelrasmussen(a)google.com>
Cc: Muchun Song <songmuchun(a)bytedance.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index bd7d39227344..2ce912c915eb 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -94,6 +94,8 @@ static int hugetlb_acct_memory(struct hstate *h, long delta);
static void hugetlb_vma_lock_free(struct vm_area_struct *vma);
static void hugetlb_vma_lock_alloc(struct vm_area_struct *vma);
static void __hugetlb_vma_unlock_write_free(struct vm_area_struct *vma);
+static void hugetlb_unshare_pmds(struct vm_area_struct *vma,
+ unsigned long start, unsigned long end);
static inline bool subpool_is_free(struct hugepage_subpool *spool)
{
@@ -4834,6 +4836,25 @@ static int hugetlb_vm_op_split(struct vm_area_struct *vma, unsigned long addr)
{
if (addr & ~(huge_page_mask(hstate_vma(vma))))
return -EINVAL;
+
+ /*
+ * PMD sharing is only possible for PUD_SIZE-aligned address ranges
+ * in HugeTLB VMAs. If we will lose PUD_SIZE alignment due to this
+ * split, unshare PMDs in the PUD_SIZE interval surrounding addr now.
+ */
+ if (addr & ~PUD_MASK) {
+ /*
+ * hugetlb_vm_op_split is called right before we attempt to
+ * split the VMA. We will need to unshare PMDs in the old and
+ * new VMAs, so let's unshare before we split.
+ */
+ unsigned long floor = addr & PUD_MASK;
+ unsigned long ceil = floor + PUD_SIZE;
+
+ if (floor >= vma->vm_start && ceil <= vma->vm_end)
+ hugetlb_unshare_pmds(vma, floor, ceil);
+ }
+
return 0;
}
@@ -7322,26 +7343,21 @@ void move_hugetlb_state(struct folio *old_folio, struct folio *new_folio, int re
}
}
-/*
- * This function will unconditionally remove all the shared pmd pgtable entries
- * within the specific vma for a hugetlbfs memory range.
- */
-void hugetlb_unshare_all_pmds(struct vm_area_struct *vma)
+static void hugetlb_unshare_pmds(struct vm_area_struct *vma,
+ unsigned long start,
+ unsigned long end)
{
struct hstate *h = hstate_vma(vma);
unsigned long sz = huge_page_size(h);
struct mm_struct *mm = vma->vm_mm;
struct mmu_notifier_range range;
- unsigned long address, start, end;
+ unsigned long address;
spinlock_t *ptl;
pte_t *ptep;
if (!(vma->vm_flags & VM_MAYSHARE))
return;
- start = ALIGN(vma->vm_start, PUD_SIZE);
- end = ALIGN_DOWN(vma->vm_end, PUD_SIZE);
-
if (start >= end)
return;
@@ -7373,6 +7389,16 @@ void hugetlb_unshare_all_pmds(struct vm_area_struct *vma)
mmu_notifier_invalidate_range_end(&range);
}
+/*
+ * This function will unconditionally remove all the shared pmd pgtable entries
+ * within the specific vma for a hugetlbfs memory range.
+ */
+void hugetlb_unshare_all_pmds(struct vm_area_struct *vma)
+{
+ hugetlb_unshare_pmds(vma, ALIGN(vma->vm_start, PUD_SIZE),
+ ALIGN_DOWN(vma->vm_end, PUD_SIZE));
+}
+
#ifdef CONFIG_CMA
static bool cma_reserve_called __initdata;
The psp suspend & resume should be skipped to avoid destroy
the TMR and reload FWs again for IMU enabled APU ASICs.
Signed-off-by: Tim Huang <tim.huang(a)amd.com>
Acked-by: Alex Deucher <alexander.deucher(a)amd.com>
Reviewed-by: Mario Limonciello <mario.limonciello(a)amd.com>
---
drivers/gpu/drm/amd/amdgpu/amdgpu_device.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
index efd4f8226120..0f9a5b12c3a5 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_device.c
@@ -3036,6 +3036,18 @@ static int amdgpu_device_ip_suspend_phase2(struct amdgpu_device *adev)
(adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_SDMA))
continue;
+ /* Once swPSP provides the IMU, RLC FW binaries to TOS during cold-boot.
+ * These are in TMR, hence are expected to be reused by PSP-TOS to reload
+ * from this location and RLC Autoload automatically also gets loaded
+ * from here based on PMFW -> PSP message during re-init sequence.
+ * Therefore, the psp suspend & resume should be skipped to avoid destroy
+ * the TMR and reload FWs again for IMU enabled APU ASICs.
+ */
+ if (amdgpu_in_reset(adev) &&
+ (adev->flags & AMD_IS_APU) && adev->gfx.imu.funcs &&
+ adev->ip_blocks[i].version->type == AMD_IP_BLOCK_TYPE_PSP)
+ continue;
+
/* XXX handle errors */
r = adev->ip_blocks[i].version->funcs->suspend(adev);
/* XXX handle errors */
--
2.25.1
I have a business proposal in the region of $19.3million USD for you to handle
with me. I have the opportunity to transfer this abandoned fund to your bank
account in your country which belongs to our dead client.
I am inviting you in this transaction where this money can be shared
between us at the ratio of 50/50% and help the needy around us don’t be
afraid of anything I am with you and will instruct you what you will do
to maintain this fund.
below is my information.
Full name...Mr.Ibrahim idewu
country.....Burkina faso
Occupation.....Banker
Age...55years
Telephone number...+22665604193
send me your own so we can proceed.
THANKS.
REPLY ME
Good day to you,
I've viewed your LINKEDIN profile. I've a proposal that is in your name, kindly contact me on my private email: carolynclarke687(a)gmail.com for more update for your understanding,
Regards
Carolyn Clarke
carolynclarke687(a)gmail.com
TPM 1's support for its hardware RNG is broken across system suspends,
due to races or locking issues or something else that haven't been
diagnosed or fixed yet. These issues prevent the system from actually
suspending. So disable the driver in this case. Later, when this is
fixed properly, we can remove this.
Current breakage amounts to something like:
tpm tpm0: A TPM error (28) occurred continue selftest
...
tpm tpm0: A TPM error (28) occurred attempting get random
...
tpm tpm0: Error (28) sending savestate before suspend
tpm_tis 00:08: PM: __pnp_bus_suspend(): tpm_pm_suspend+0x0/0x80 returns 28
tpm_tis 00:08: PM: dpm_run_callback(): pnp_bus_suspend+0x0/0x10 returns 28
tpm_tis 00:08: PM: failed to suspend: error 28
PM: Some devices failed to suspend, or early wake event detected
This issue was partially fixed by 23393c646142 ("char: tpm: Protect
tpm_pm_suspend with locks"), in a last minute 6.1 commit that Linus took
directly because the TPM maintainers weren't available. However, it
seems like this just addresses the most common cases of the bug, rather
than addressing it entirely. So there are more things to fix still,
apparently.
The hwrng driver appears already to be occasionally disabled due to
other conditions, so this shouldn't be too large of a surprise.
Link: https://lore.kernel.org/lkml/7cbe96cf-e0b5-ba63-d1b4-f63d2e826efa@suse.cz/
Cc: stable(a)vger.kernel.org # 6.1+
Signed-off-by: Jason A. Donenfeld <Jason(a)zx2c4.com>
---
drivers/char/tpm/tpm-chip.c | 8 ++++++++
1 file changed, 8 insertions(+)
diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
index 741d8f3e8fb3..eed67ea8d3a7 100644
--- a/drivers/char/tpm/tpm-chip.c
+++ b/drivers/char/tpm/tpm-chip.c
@@ -524,6 +524,14 @@ static int tpm_add_hwrng(struct tpm_chip *chip)
if (!IS_ENABLED(CONFIG_HW_RANDOM_TPM) || tpm_is_firmware_upgrade(chip))
return 0;
+ /*
+ * This driver's support for using the RNG across suspend is broken on
+ * TPM1. Until somebody fixes this, just stop registering a HWRNG in
+ * that case.
+ */
+ if (!(chip->flags & TPM_CHIP_FLAG_TPM2) && IS_ENABLED(CONFIG_PM_SLEEP))
+ return 0;
+
snprintf(chip->hwrng_name, sizeof(chip->hwrng_name),
"tpm-rng-%d", chip->dev_num);
chip->hwrng.name = chip->hwrng_name;
--
2.39.0
Nathan reports that recent kernels built with LTO will crash when doing
EFI boot using Fedora's GRUB and SHIM. The culprit turns out to be a
misaligned load from the TPM event log, which is annotated with
READ_ONCE(), and under LTO, this gets translated into a LDAR instruction
which does not tolerate misaligned accesses.
Interestingly, this does not happen when booting the same kernel
straight from the UEFI shell, and so the fact that the event log may
appear misaligned in memory may be caused by a bug in GRUB or SHIM.
However, using READ_ONCE() to access firmware tables is slightly unusual
in any case, and here, we only need to ensure that 'event' is not
dereferenced again after it gets unmapped, so a compiler barrier should
be sufficient, and works around the reported issue.
Cc: <stable(a)vger.kernel.org>
Cc: Peter Jones <pjones(a)redhat.com>
Cc: Jarkko Sakkinen <jarkko(a)kernel.org>
Cc: Matthew Garrett <mjg59(a)srcf.ucam.org>
Reported-by: Nathan Chancellor <nathan(a)kernel.org>
Link: https://github.com/ClangBuiltLinux/linux/issues/1782
Signed-off-by: Ard Biesheuvel <ardb(a)kernel.org>
---
include/linux/tpm_eventlog.h | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/include/linux/tpm_eventlog.h b/include/linux/tpm_eventlog.h
index 20c0ff54b7a0d313..0abcc85904cba874 100644
--- a/include/linux/tpm_eventlog.h
+++ b/include/linux/tpm_eventlog.h
@@ -198,8 +198,10 @@ static __always_inline int __calc_tpm2_event_size(struct tcg_pcr_event2_head *ev
* The loop below will unmap these fields if the log is larger than
* one page, so save them here for reference:
*/
- count = READ_ONCE(event->count);
- event_type = READ_ONCE(event->event_type);
+ count = event->count;
+ event_type = event->event_type;
+
+ barrier();
/* Verify that it's the log header */
if (event_header->pcr_idx != 0 ||
--
2.39.0
A non-first waiter can potentially spin in the for loop of
rwsem_down_write_slowpath() without sleeping but fail to acquire the
lock even if the rwsem is free if the following sequence happens:
Non-first RT waiter First waiter Lock holder
------------------- ------------ -----------
Acquire wait_lock
rwsem_try_write_lock():
Set handoff bit if RT or
wait too long
Set waiter->handoff_set
Release wait_lock
Acquire wait_lock
Inherit waiter->handoff_set
Release wait_lock
Clear owner
Release lock
if (waiter.handoff_set) {
rwsem_spin_on_owner(();
if (OWNER_NULL)
goto trylock_again;
}
trylock_again:
Acquire wait_lock
rwsem_try_write_lock():
if (first->handoff_set && (waiter != first))
return false;
Release wait_lock
A non-first waiter cannot really acquire the rwsem even if it mistakenly
believes that it can spin on OWNER_NULL value. If that waiter happens
to be an RT task running on the same CPU as the first waiter, it can
block the first waiter from acquiring the rwsem leading to live lock.
Fix this problem by making sure that a non-first waiter cannot spin in
the slowpath loop without sleeping.
Fixes: d257cc8cb8d5 ("locking/rwsem: Make handoff bit handling more consistent")
Reviewed-and-tested-by: Mukesh Ojha <quic_mojha(a)quicinc.com>
Signed-off-by: Waiman Long <longman(a)redhat.com>
Cc: stable(a)vger.kernel.org
---
kernel/locking/rwsem.c | 19 +++++++++----------
1 file changed, 9 insertions(+), 10 deletions(-)
diff --git a/kernel/locking/rwsem.c b/kernel/locking/rwsem.c
index 44873594de03..be2df9ea7c30 100644
--- a/kernel/locking/rwsem.c
+++ b/kernel/locking/rwsem.c
@@ -624,18 +624,16 @@ static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
*/
if (first->handoff_set && (waiter != first))
return false;
-
- /*
- * First waiter can inherit a previously set handoff
- * bit and spin on rwsem if lock acquisition fails.
- */
- if (waiter == first)
- waiter->handoff_set = true;
}
new = count;
if (count & RWSEM_LOCK_MASK) {
+ /*
+ * A waiter (first or not) can set the handoff bit
+ * if it is an RT task or wait in the wait queue
+ * for too long.
+ */
if (has_handoff || (!rt_task(waiter->task) &&
!time_after(jiffies, waiter->timeout)))
return false;
@@ -651,11 +649,12 @@ static inline bool rwsem_try_write_lock(struct rw_semaphore *sem,
} while (!atomic_long_try_cmpxchg_acquire(&sem->count, &count, new));
/*
- * We have either acquired the lock with handoff bit cleared or
- * set the handoff bit.
+ * We have either acquired the lock with handoff bit cleared or set
+ * the handoff bit. Only the first waiter can have its handoff_set
+ * set here to enable optimistic spinning in slowpath loop.
*/
if (new & RWSEM_FLAG_HANDOFF) {
- waiter->handoff_set = true;
+ first->handoff_set = true;
lockevent_inc(rwsem_wlock_handoff);
return false;
}
--
2.31.1
In Google internal bug 265639009 we've received an (as yet) unreproducible
crash report from an aarch64 GKI 5.10.149-android13 running device.
AFAICT the source code is at:
https://android.googlesource.com/kernel/common/+/refs/tags/ASB-2022-12-05_1…
The call stack is:
ncm_close() -> ncm_notify() -> ncm_do_notify()
with the crash at:
ncm_do_notify+0x98/0x270
Code: 79000d0b b9000a6c f940012a f9400269 (b9405d4b)
Which I believe disassembles to (I don't know ARM assembly, but it looks sane enough to me...):
// halfword (16-bit) store presumably to event->wLength (at offset 6 of struct usb_cdc_notification)
0B 0D 00 79 strh w11, [x8, #6]
// word (32-bit) store presumably to req->Length (at offset 8 of struct usb_request)
6C 0A 00 B9 str w12, [x19, #8]
// x10 (NULL) was read here from offset 0 of valid pointer x9
// IMHO we're reading 'cdev->gadget' and getting NULL
// gadget is indeed at offset 0 of struct usb_composite_dev
2A 01 40 F9 ldr x10, [x9]
// loading req->buf pointer, which is at offset 0 of struct usb_request
69 02 40 F9 ldr x9, [x19]
// x10 is null, crash, appears to be attempt to read cdev->gadget->max_speed
4B 5D 40 B9 ldr w11, [x10, #0x5c]
which seems to line up with ncm_do_notify() case NCM_NOTIFY_SPEED code fragment:
event->wLength = cpu_to_le16(8);
req->length = NCM_STATUS_BYTECOUNT;
/* SPEED_CHANGE data is up/down speeds in bits/sec */
data = req->buf + sizeof *event;
data[0] = cpu_to_le32(ncm_bitrate(cdev->gadget));
My analysis of registers and NULL ptr deref crash offset
(Unable to handle kernel NULL pointer dereference at virtual address 000000000000005c)
heavily suggests that the crash is due to 'cdev->gadget' being NULL when executing:
data[0] = cpu_to_le32(ncm_bitrate(cdev->gadget));
which calls:
ncm_bitrate(NULL)
which then calls:
gadget_is_superspeed(NULL)
which reads
((struct usb_gadget *)NULL)->max_speed
and hits a panic.
AFAICT, if I'm counting right, the offset of max_speed is indeed 0x5C.
(remember there's a GKI KABI reservation of 16 bytes in struct work_struct)
It's not at all clear to me how this is all supposed to work...
but returning 0 seems much better than panic-ing...
Cc: Felipe Balbi <balbi(a)kernel.org>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Lorenzo Colitti <lorenzo(a)google.com>
Cc: Carlos Llamas <cmllamas(a)google.com>
Cc: stable(a)vger.kernel.org
Signed-off-by: Maciej Żenczykowski <maze(a)google.com>
---
drivers/usb/gadget/function/f_ncm.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/usb/gadget/function/f_ncm.c b/drivers/usb/gadget/function/f_ncm.c
index c36bcfa0e9b4..424bb3b666db 100644
--- a/drivers/usb/gadget/function/f_ncm.c
+++ b/drivers/usb/gadget/function/f_ncm.c
@@ -83,7 +83,9 @@ static inline struct f_ncm *func_to_ncm(struct usb_function *f)
/* peak (theoretical) bulk transfer rate in bits-per-second */
static inline unsigned ncm_bitrate(struct usb_gadget *g)
{
- if (gadget_is_superspeed(g) && g->speed >= USB_SPEED_SUPER_PLUS)
+ if (!g)
+ return 0;
+ else if (gadget_is_superspeed(g) && g->speed >= USB_SPEED_SUPER_PLUS)
return 4250000000U;
else if (gadget_is_superspeed(g) && g->speed == USB_SPEED_SUPER)
return 3750000000U;
--
2.39.0.314.g84b9a713c41-goog
Update the altmode "active" state when we receive Acks for Enter and
Exit Mode commands. Having the right state is necessary to change Pin
Assignments using the 'pin_assignment" sysfs file.
Fixes: 0e3bb7d6894d ("usb: typec: Add driver for DisplayPort alternate mode")
Cc: stable(a)vger.kernel.org
Cc: Heikki Krogerus <heikki.krogerus(a)linux.intel.com>
Signed-off-by: Prashant Malani <pmalani(a)chromium.org>
---
drivers/usb/typec/altmodes/displayport.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/drivers/usb/typec/altmodes/displayport.c b/drivers/usb/typec/altmodes/displayport.c
index 06fb4732f8cd..bc1c556944d6 100644
--- a/drivers/usb/typec/altmodes/displayport.c
+++ b/drivers/usb/typec/altmodes/displayport.c
@@ -277,9 +277,11 @@ static int dp_altmode_vdm(struct typec_altmode *alt,
case CMDT_RSP_ACK:
switch (cmd) {
case CMD_ENTER_MODE:
+ typec_altmode_update_active(alt, true);
dp->state = DP_STATE_UPDATE;
break;
case CMD_EXIT_MODE:
+ typec_altmode_update_active(alt, false);
dp->data.status = 0;
dp->data.conf = 0;
break;
--
2.39.0.314.g84b9a713c41-goog
The memory for "llcc_driv_data" is allocated by the LLCC driver. But when
it is passed as "pvt_info" to the EDAC core, it will get freed during the
qcom_edac driver release. So when the qcom_edac driver gets probed again,
it will try to use the freed data leading to the use-after-free bug.
Hence, do not pass "llcc_driv_data" as pvt_info but rather reference it
using the "platform_data" in the qcom_edac driver.
Cc: <stable(a)vger.kernel.org> # 4.20
Fixes: 27450653f1db ("drivers: edac: Add EDAC driver support for QCOM SoCs")
Tested-by: Steev Klimaszewski <steev(a)kali.org> # Thinkpad X13s
Tested-by: Andrew Halaney <ahalaney(a)redhat.com> # sa8540p-ride
Reported-by: Steev Klimaszewski <steev(a)kali.org>
Reviewed-by: Borislav Petkov (AMD) <bp(a)alien8.de>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org>
---
drivers/edac/qcom_edac.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/edac/qcom_edac.c b/drivers/edac/qcom_edac.c
index 9e77fa84e84f..3256254c3722 100644
--- a/drivers/edac/qcom_edac.c
+++ b/drivers/edac/qcom_edac.c
@@ -252,7 +252,7 @@ dump_syn_reg_values(struct llcc_drv_data *drv, u32 bank, int err_type)
static int
dump_syn_reg(struct edac_device_ctl_info *edev_ctl, int err_type, u32 bank)
{
- struct llcc_drv_data *drv = edev_ctl->pvt_info;
+ struct llcc_drv_data *drv = edev_ctl->dev->platform_data;
int ret;
ret = dump_syn_reg_values(drv, bank, err_type);
@@ -289,7 +289,7 @@ static irqreturn_t
llcc_ecc_irq_handler(int irq, void *edev_ctl)
{
struct edac_device_ctl_info *edac_dev_ctl = edev_ctl;
- struct llcc_drv_data *drv = edac_dev_ctl->pvt_info;
+ struct llcc_drv_data *drv = edac_dev_ctl->dev->platform_data;
irqreturn_t irq_rc = IRQ_NONE;
u32 drp_error, trp_error, i;
int ret;
@@ -358,7 +358,6 @@ static int qcom_llcc_edac_probe(struct platform_device *pdev)
edev_ctl->dev_name = dev_name(dev);
edev_ctl->ctl_name = "llcc";
edev_ctl->panic_on_ue = LLCC_ERP_PANIC_ON_UE;
- edev_ctl->pvt_info = llcc_driv_data;
rc = edac_device_add_device(edev_ctl);
if (rc)
--
2.25.1
Requesting an interrupt with IRQF_ONESHOT will run the primary handler
in the hard-IRQ context even in the force-threaded mode. The
force-threaded mode is used by PREEMPT_RT in order to avoid acquiring
sleeping locks (spinlock_t) in hard-IRQ context. This combination
makes it impossible and leads to "sleeping while atomic" warnings.
Use one interrupt handler for both handlers (primary and secondary)
and drop the IRQF_ONESHOT flag which is not needed.
Fixes: e359b4411c283 ("serial: stm32: fix threaded interrupt handling")
Reviewed-by: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Tested-by: Valentin Caron <valentin.caron(a)foss.st.com> # V3
Signed-off-by: Marek Vasut <marex(a)denx.de>
Cc: stable(a)vger.kernel.org
---
Cc: Alexandre Torgue <alexandre.torgue(a)foss.st.com>
Cc: Erwan Le Ray <erwan.leray(a)foss.st.com>
Cc: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Cc: Jiri Slaby <jirislaby(a)kernel.org>
Cc: Maxime Coquelin <mcoquelin.stm32(a)gmail.com>
Cc: Sebastian Andrzej Siewior <bigeasy(a)linutronix.de>
Cc: Thomas Gleixner <tglx(a)linutronix.de>
Cc: Valentin Caron <valentin.caron(a)foss.st.com>
Cc: linux-arm-kernel(a)lists.infradead.org
Cc: linux-stm32(a)st-md-mailman.stormreply.com
To: linux-serial(a)vger.kernel.org
---
V2: - Update patch subject, was:
serial: stm32: Move hard IRQ handling to threaded interrupt context
- Use request_irq() instead, rename the IRQ handler function
V3: - Update the commit message per suggestion from Sebastian
- Add RB from Sebastian
- Add Fixes tag
V4: - Remove uart_console() deadlock check from
stm32_usart_of_dma_rx_probe()
- Use plain spin_lock()/spin_unlock() instead of the
_irqsave/_irqrestore variants in IRQ handler
- Add TB from Valentin
V5: - Add CC stable@
- Do not move the sr variable, removes one useless hunk from the patch
V6: - Replace uart_unlock_and_check_sysrq_irqrestore with uart_unlock_and_check_sysrq
and drop last instance of flags
---
drivers/tty/serial/stm32-usart.c | 33 +++++---------------------------
1 file changed, 5 insertions(+), 28 deletions(-)
diff --git a/drivers/tty/serial/stm32-usart.c b/drivers/tty/serial/stm32-usart.c
index a1490033aa164..409e91d6829a5 100644
--- a/drivers/tty/serial/stm32-usart.c
+++ b/drivers/tty/serial/stm32-usart.c
@@ -797,25 +797,11 @@ static irqreturn_t stm32_usart_interrupt(int irq, void *ptr)
spin_unlock(&port->lock);
}
- if (stm32_usart_rx_dma_enabled(port))
- return IRQ_WAKE_THREAD;
- else
- return IRQ_HANDLED;
-}
-
-static irqreturn_t stm32_usart_threaded_interrupt(int irq, void *ptr)
-{
- struct uart_port *port = ptr;
- struct tty_port *tport = &port->state->port;
- struct stm32_port *stm32_port = to_stm32_port(port);
- unsigned int size;
- unsigned long flags;
-
/* Receiver timeout irq for DMA RX */
- if (!stm32_port->throttled) {
- spin_lock_irqsave(&port->lock, flags);
+ if (stm32_usart_rx_dma_enabled(port) && !stm32_port->throttled) {
+ spin_lock(&port->lock);
size = stm32_usart_receive_chars(port, false);
- uart_unlock_and_check_sysrq_irqrestore(port, flags);
+ uart_unlock_and_check_sysrq(port);
if (size)
tty_flip_buffer_push(tport);
}
@@ -1015,10 +1001,8 @@ static int stm32_usart_startup(struct uart_port *port)
u32 val;
int ret;
- ret = request_threaded_irq(port->irq, stm32_usart_interrupt,
- stm32_usart_threaded_interrupt,
- IRQF_ONESHOT | IRQF_NO_SUSPEND,
- name, port);
+ ret = request_irq(port->irq, stm32_usart_interrupt,
+ IRQF_NO_SUSPEND, name, port);
if (ret)
return ret;
@@ -1601,13 +1585,6 @@ static int stm32_usart_of_dma_rx_probe(struct stm32_port *stm32port,
struct dma_slave_config config;
int ret;
- /*
- * Using DMA and threaded handler for the console could lead to
- * deadlocks.
- */
- if (uart_console(port))
- return -ENODEV;
-
stm32port->rx_buf = dma_alloc_coherent(dev, RX_BUF_L,
&stm32port->rx_dma_buf,
GFP_KERNEL);
--
2.39.0
This is a note to let you know that I've just added the patch titled
mei: me: add meteor lake point M DID
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From 0c4d68261717f89fa8c4f98a6967c3832fcb3ad0 Mon Sep 17 00:00:00 2001
From: Alexander Usyskin <alexander.usyskin(a)intel.com>
Date: Tue, 13 Dec 2022 00:02:47 +0200
Subject: mei: me: add meteor lake point M DID
Add Meteor Lake Point M device id.
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Alexander Usyskin <alexander.usyskin(a)intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler(a)intel.com>
Link: https://lore.kernel.org/r/20221212220247.286019-2-tomas.winkler@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/misc/mei/hw-me-regs.h | 2 ++
drivers/misc/mei/pci-me.c | 2 ++
2 files changed, 4 insertions(+)
diff --git a/drivers/misc/mei/hw-me-regs.h b/drivers/misc/mei/hw-me-regs.h
index 99966cd3e7d8..bdc65d50b945 100644
--- a/drivers/misc/mei/hw-me-regs.h
+++ b/drivers/misc/mei/hw-me-regs.h
@@ -111,6 +111,8 @@
#define MEI_DEV_ID_RPL_S 0x7A68 /* Raptor Lake Point S */
+#define MEI_DEV_ID_MTL_M 0x7E70 /* Meteor Lake Point M */
+
/*
* MEI HW Section
*/
diff --git a/drivers/misc/mei/pci-me.c b/drivers/misc/mei/pci-me.c
index 704cd0caa172..5bf0d50d55a0 100644
--- a/drivers/misc/mei/pci-me.c
+++ b/drivers/misc/mei/pci-me.c
@@ -118,6 +118,8 @@ static const struct pci_device_id mei_me_pci_tbl[] = {
{MEI_PCI_DEVICE(MEI_DEV_ID_RPL_S, MEI_ME_PCH15_CFG)},
+ {MEI_PCI_DEVICE(MEI_DEV_ID_MTL_M, MEI_ME_PCH15_CFG)},
+
/* required last entry */
{0, }
};
--
2.39.1
This is a note to let you know that I've just added the patch titled
mei: bus: fix unlink on bus in error path
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From a43866856125c3c432e2fbb6cc63cee1539ec4a7 Mon Sep 17 00:00:00 2001
From: Alexander Usyskin <alexander.usyskin(a)intel.com>
Date: Tue, 13 Dec 2022 00:02:46 +0200
Subject: mei: bus: fix unlink on bus in error path
Unconditional call to mei_cl_unlink in mei_cl_bus_dev_release leads
to call of the mei_cl_unlink without corresponding mei_cl_link.
This leads to miscalculation of open_handle_count (decrease without
increase).
Call unlink in mei_cldev_enable fail path and remove blanket unlink
from mei_cl_bus_dev_release.
Fixes: 34f1166afd67 ("mei: bus: need to unlink client before freeing")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Alexander Usyskin <alexander.usyskin(a)intel.com>
Reviewed-by: Tomas Winkler <tomas.winkler(a)intel.com>
Signed-off-by: Tomas Winkler <tomas.winkler(a)intel.com>
Link: https://lore.kernel.org/r/20221212220247.286019-1-tomas.winkler@intel.com
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/misc/mei/bus.c | 12 ++++++++----
1 file changed, 8 insertions(+), 4 deletions(-)
diff --git a/drivers/misc/mei/bus.c b/drivers/misc/mei/bus.c
index 4a08b624910a..a81b890c7ee6 100644
--- a/drivers/misc/mei/bus.c
+++ b/drivers/misc/mei/bus.c
@@ -702,13 +702,15 @@ void *mei_cldev_dma_map(struct mei_cl_device *cldev, u8 buffer_id, size_t size)
if (cl->state == MEI_FILE_UNINITIALIZED) {
ret = mei_cl_link(cl);
if (ret)
- goto out;
+ goto notlinked;
/* update pointers */
cl->cldev = cldev;
}
ret = mei_cl_dma_alloc_and_map(cl, NULL, buffer_id, size);
-out:
+ if (ret)
+ mei_cl_unlink(cl);
+notlinked:
mutex_unlock(&bus->device_lock);
if (ret)
return ERR_PTR(ret);
@@ -758,7 +760,7 @@ int mei_cldev_enable(struct mei_cl_device *cldev)
if (cl->state == MEI_FILE_UNINITIALIZED) {
ret = mei_cl_link(cl);
if (ret)
- goto out;
+ goto notlinked;
/* update pointers */
cl->cldev = cldev;
}
@@ -785,6 +787,9 @@ int mei_cldev_enable(struct mei_cl_device *cldev)
}
out:
+ if (ret)
+ mei_cl_unlink(cl);
+notlinked:
mutex_unlock(&bus->device_lock);
return ret;
@@ -1277,7 +1282,6 @@ static void mei_cl_bus_dev_release(struct device *dev)
mei_cl_flush_queues(cldev->cl, NULL);
mei_me_cl_put(cldev->me_cl);
mei_dev_bus_put(cldev->bus);
- mei_cl_unlink(cldev->cl);
kfree(cldev->cl);
kfree(cldev);
}
--
2.39.1
There was a recent regression in btrfs/177 that started happening with
the size class patches. This however isn't a regression introduced by
those patches, but rather the bug was uncovered by a change in behavior
in these patches. The patches triggered more chunk allocations in the
^free-space-tree case, which uncovered a race with device shrink.
The problem is we will set the device total size to the new size, and
use this to find a hole for a device extent. However during shrink we
may have device extents allocated past this range, so we could
potentially find a hole in a range past our new shrink size. We don't
actually limit our found extent to the device size anywhere, we assume
that we will not find a hole past our device size. This isn't true with
shrink as we're relocating block groups and thus creating holes past the
device size.
Fix this by making sure we do not search past the new device size, and
if we wander into any device extents that start after our device size
simply break from the loop and use whatever hole we've already found.
cc: stable(a)vger.kernel.org
Signed-off-by: Josef Bacik <josef(a)toxicpanda.com>
---
fs/btrfs/volumes.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 675598c3cb35..707dd0456cea 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1600,7 +1600,7 @@ static int find_free_dev_extent_start(struct btrfs_device *device,
if (ret < 0)
goto out;
- while (1) {
+ while (search_start < search_end) {
l = path->nodes[0];
slot = path->slots[0];
if (slot >= btrfs_header_nritems(l)) {
@@ -1623,6 +1623,9 @@ static int find_free_dev_extent_start(struct btrfs_device *device,
if (key.type != BTRFS_DEV_EXTENT_KEY)
goto next;
+ if (key.offset > search_end)
+ break;
+
if (key.offset > search_start) {
hole_size = key.offset - search_start;
dev_extent_hole_check(device, &search_start, &hole_size,
@@ -1683,6 +1686,7 @@ static int find_free_dev_extent_start(struct btrfs_device *device,
else
ret = 0;
+ ASSERT(max_hole_start + max_hole_size <= search_end);
out:
btrfs_free_path(path);
*start = max_hole_start;
--
2.26.3
This is a note to let you know that I've just added the patch titled
comedi: adv_pci1760: Fix PWM instruction handling
to my char-misc git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/char-misc.git
in the char-misc-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From 2efb6edd52dc50273f5e68ad863dd1b1fb2f2d1c Mon Sep 17 00:00:00 2001
From: Ian Abbott <abbotti(a)mev.co.uk>
Date: Tue, 3 Jan 2023 14:37:54 +0000
Subject: comedi: adv_pci1760: Fix PWM instruction handling
(Actually, this is fixing the "Read the Current Status" command sent to
the device's outgoing mailbox, but it is only currently used for the PWM
instructions.)
The PCI-1760 is operated mostly by sending commands to a set of Outgoing
Mailbox registers, waiting for the command to complete, and reading the
result from the Incoming Mailbox registers. One of these commands is
the "Read the Current Status" command. The number of this command is
0x07 (see the User's Manual for the PCI-1760 at
<https://advdownload.advantech.com/productfile/Downloadfile2/1-11P6653/PCI-1…>.
The `PCI1760_CMD_GET_STATUS` macro defined in the driver should expand
to this command number 0x07, but unfortunately it currently expands to
0x03. (Command number 0x03 is not defined in the User's Manual.)
Correct the definition of the `PCI1760_CMD_GET_STATUS` macro to fix it.
This is used by all the PWM subdevice related instructions handled by
`pci1760_pwm_insn_config()` which are probably all broken. The effect
of sending the undefined command number 0x03 is not known.
Fixes: 14b93bb6bbf0 ("staging: comedi: adv_pci_dio: separate out PCI-1760 support")
Cc: <stable(a)vger.kernel.org> # v4.5+
Signed-off-by: Ian Abbott <abbotti(a)mev.co.uk>
Link: https://lore.kernel.org/r/20230103143754.17564-1-abbotti@mev.co.uk
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/comedi/drivers/adv_pci1760.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/comedi/drivers/adv_pci1760.c b/drivers/comedi/drivers/adv_pci1760.c
index fcfc2e299110..27f3890f471d 100644
--- a/drivers/comedi/drivers/adv_pci1760.c
+++ b/drivers/comedi/drivers/adv_pci1760.c
@@ -58,7 +58,7 @@
#define PCI1760_CMD_CLR_IMB2 0x00 /* Clears IMB2 */
#define PCI1760_CMD_SET_DO 0x01 /* Set output state */
#define PCI1760_CMD_GET_DO 0x02 /* Read output status */
-#define PCI1760_CMD_GET_STATUS 0x03 /* Read current status */
+#define PCI1760_CMD_GET_STATUS 0x07 /* Read current status */
#define PCI1760_CMD_GET_FW_VER 0x0e /* Read firmware version */
#define PCI1760_CMD_GET_HW_VER 0x0f /* Read hardware version */
#define PCI1760_CMD_SET_PWM_HI(x) (0x10 + (x) * 2) /* Set "hi" period */
--
2.39.1
Fabian has reported another regression in 6.1 due to ca3d76b0aa80 ("mm:
add merging after mremap resize"). The problem is that vma_merge() can
fail when vma has a vm_ops->close() method, causing is_mergeable_vma()
test to be negative. This was happening for vma mapping a file from
fuse-overlayfs, which does have the method. But when we are simply
expanding the vma, we never remove it due to the "merge" with the added
area, so the test should not prevent the expansion.
As a quick fix, check for such vmas and expand them using vma_adjust()
directly as was done before commit ca3d76b0aa80. For a more robust long
term solution we should try to limit the check for vma_ops->close only
to cases that actually result in vma removal, so that no merge would be
prevented unnecessarily.
Reported-by: Fabian Vogt <fvogt(a)suse.com>
Link: https://bugzilla.suse.com/show_bug.cgi?id=1206359#c35
Fixes: ca3d76b0aa80 ("mm: add merging after mremap resize")
Signed-off-by: Vlastimil Babka <vbabka(a)suse.cz>
Cc: Jakub Matěna <matenajakub(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Tested-by: Fabian Vogt <fvogt(a)suse.com>
---
Thorsten: this should be added to the previous regression which wasn't
fully fixed by the previous patch:
https://linux-regtracking.leemhuis.info/regzbot/regression/20221216163227.2…
mm/mremap.c | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/mm/mremap.c b/mm/mremap.c
index fe587c5d6591..1e234fd95547 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -1032,11 +1032,22 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len,
* the already existing vma (expand operation itself) and possibly also
* with the next vma if it becomes adjacent to the expanded vma and
* otherwise compatible.
+ *
+ * However, vma_merge() can currently fail due to is_mergeable_vma()
+ * check for vm_ops->close (see the comment there). Yet this should not
+ * prevent vma expanding, so perform a simple expand for such vma.
+ * Ideally the check for close op should be only done when a vma would
+ * be actually removed due to a merge.
*/
- vma = vma_merge(mm, vma, extension_start, extension_end,
+ if (!vma->vm_ops || !vma->vm_ops->close) {
+ vma = vma_merge(mm, vma, extension_start, extension_end,
vma->vm_flags, vma->anon_vma, vma->vm_file,
extension_pgoff, vma_policy(vma),
vma->vm_userfaultfd_ctx, anon_vma_name(vma));
+ } else if (vma_adjust(vma, vma->vm_start, addr + new_len,
+ vma->vm_pgoff, NULL)) {
+ vma = NULL;
+ }
if (!vma) {
vm_unacct_memory(pages);
ret = -ENOMEM;
--
2.38.1
This is a note to let you know that I've just added the patch titled
tty: serial: qcom-geni-serial: fix slab-out-of-bounds on RX FIFO
to my tty git tree which can be found at
git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/tty.git
in the tty-linus branch.
The patch will show up in the next release of the linux-next tree
(usually sometime within the next 24 hours during the week.)
The patch will hopefully also be merged in Linus's tree for the
next -rc kernel release.
If you have any questions about this process, please let me know.
From b8caf69a6946e18ffebad49847e258f5b6d52ac2 Mon Sep 17 00:00:00 2001
From: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Date: Wed, 21 Dec 2022 17:40:22 +0100
Subject: tty: serial: qcom-geni-serial: fix slab-out-of-bounds on RX FIFO
buffer
Driver's probe allocates memory for RX FIFO (port->rx_fifo) based on
default RX FIFO depth, e.g. 16. Later during serial startup the
qcom_geni_serial_port_setup() updates the RX FIFO depth
(port->rx_fifo_depth) to match real device capabilities, e.g. to 32.
The RX UART handle code will read "port->rx_fifo_depth" number of words
into "port->rx_fifo" buffer, thus exceeding the bounds. This can be
observed in certain configurations with Qualcomm Bluetooth HCI UART
device and KASAN:
Bluetooth: hci0: QCA Product ID :0x00000010
Bluetooth: hci0: QCA SOC Version :0x400a0200
Bluetooth: hci0: QCA ROM Version :0x00000200
Bluetooth: hci0: QCA Patch Version:0x00000d2b
Bluetooth: hci0: QCA controller version 0x02000200
Bluetooth: hci0: QCA Downloading qca/htbtfw20.tlv
bluetooth hci0: Direct firmware load for qca/htbtfw20.tlv failed with error -2
Bluetooth: hci0: QCA Failed to request file: qca/htbtfw20.tlv (-2)
Bluetooth: hci0: QCA Failed to download patch (-2)
==================================================================
BUG: KASAN: slab-out-of-bounds in handle_rx_uart+0xa8/0x18c
Write of size 4 at addr ffff279347d578c0 by task swapper/0/0
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 6.1.0-rt5-00350-gb2450b7e00be-dirty #26
Hardware name: Qualcomm Technologies, Inc. Robotics RB5 (DT)
Call trace:
dump_backtrace.part.0+0xe0/0xf0
show_stack+0x18/0x40
dump_stack_lvl+0x8c/0xb8
print_report+0x188/0x488
kasan_report+0xb4/0x100
__asan_store4+0x80/0xa4
handle_rx_uart+0xa8/0x18c
qcom_geni_serial_handle_rx+0x84/0x9c
qcom_geni_serial_isr+0x24c/0x760
__handle_irq_event_percpu+0x108/0x500
handle_irq_event+0x6c/0x110
handle_fasteoi_irq+0x138/0x2cc
generic_handle_domain_irq+0x48/0x64
If the RX FIFO depth changes after probe, be sure to resize the buffer.
Fixes: f9d690b6ece7 ("tty: serial: qcom_geni_serial: Allocate port->rx_fifo buffer in probe")
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Krzysztof Kozlowski <krzysztof.kozlowski(a)linaro.org>
Reviewed-by: Jiri Slaby <jirislaby(a)kernel.org>
Link: https://lore.kernel.org/r/20221221164022.1087814-1-krzysztof.kozlowski@lina…
Signed-off-by: Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
---
drivers/tty/serial/qcom_geni_serial.c | 18 ++++++++++++++++--
1 file changed, 16 insertions(+), 2 deletions(-)
diff --git a/drivers/tty/serial/qcom_geni_serial.c b/drivers/tty/serial/qcom_geni_serial.c
index b487823f0e61..49b9ffeae676 100644
--- a/drivers/tty/serial/qcom_geni_serial.c
+++ b/drivers/tty/serial/qcom_geni_serial.c
@@ -864,9 +864,10 @@ static irqreturn_t qcom_geni_serial_isr(int isr, void *dev)
return IRQ_HANDLED;
}
-static void get_tx_fifo_size(struct qcom_geni_serial_port *port)
+static int setup_fifos(struct qcom_geni_serial_port *port)
{
struct uart_port *uport;
+ u32 old_rx_fifo_depth = port->rx_fifo_depth;
uport = &port->uport;
port->tx_fifo_depth = geni_se_get_tx_fifo_depth(&port->se);
@@ -874,6 +875,16 @@ static void get_tx_fifo_size(struct qcom_geni_serial_port *port)
port->rx_fifo_depth = geni_se_get_rx_fifo_depth(&port->se);
uport->fifosize =
(port->tx_fifo_depth * port->tx_fifo_width) / BITS_PER_BYTE;
+
+ if (port->rx_fifo && (old_rx_fifo_depth != port->rx_fifo_depth) && port->rx_fifo_depth) {
+ port->rx_fifo = devm_krealloc(uport->dev, port->rx_fifo,
+ port->rx_fifo_depth * sizeof(u32),
+ GFP_KERNEL);
+ if (!port->rx_fifo)
+ return -ENOMEM;
+ }
+
+ return 0;
}
@@ -888,6 +899,7 @@ static int qcom_geni_serial_port_setup(struct uart_port *uport)
u32 rxstale = DEFAULT_BITS_PER_CHAR * STALE_TIMEOUT;
u32 proto;
u32 pin_swap;
+ int ret;
proto = geni_se_read_proto(&port->se);
if (proto != GENI_SE_UART) {
@@ -897,7 +909,9 @@ static int qcom_geni_serial_port_setup(struct uart_port *uport)
qcom_geni_serial_stop_rx(uport);
- get_tx_fifo_size(port);
+ ret = setup_fifos(port);
+ if (ret)
+ return ret;
writel(rxstale, uport->membase + SE_UART_RX_STALE_CNT);
--
2.39.1
Hi Marc Zyngier
We found the APPS Crash issue on MTBF test.
Brief crash information, APPS Crash - Kernel BUG at /sched/walt/sched_avg.c:281! in sched_update_nr_prod flow
[Test equipment]
1. Number of phone: 200 pcs
2. CPU info of phone: CPU architecture with Quad Cortex-A73 and Quad Cortex-A53
[Preset conditions]
1. Android 13 with kernel 5.15
2. Install application of top 20
3. Connected to Wi-Fi
4. Connect the adapter to charge the phone
5. Test duration 48 hours
[Expected results]
0 crash happened.
[Test results without [PATCH] clocksource/drivers/arm_arch_timer: Update sched_clock when non-boot CPUs need counter workaround ]
1. First round of test, found 3 phones with APPS Crash issue on /sched/walt/sched_avg.c:281! in sched_update_nr_prod flow
2. Second round of test, found 7 phones with APP Crash issue on /sched/walt/sched_avg.c:281! in sched_update_nr_prod flow
[Test results with [PATCH] clocksource/drivers/arm_arch_timer: Update sched_clock when non-boot CPUs need counter workaround ]
1. First round of test, 0 crash happened.
2. Second round of test, 0 crash happened.
Tested-by: wangfajie <wangfajie(a)longcheer.com>
Tested-by: liurenwang <liurenwang(a)longcheer.com>
Tested-by: zhanghui <zhanghui5(a)longcheer.com>
Tested-by: liangke <liangke1(a)xiaomi.com>
So we think the PATCH is working and it can fix APPS crash issue. Many thanks your time.
Best regards!
Wangfajie
-----邮件原件-----
发件人: Marc Zyngier [mailto:maz@kernel.org]
发送时间: 2023年1月19日 17:52
收件人: 刘琦 <liuqi405(a)icloud.com>
抄送: daniel.lezcano(a)kernel.org; linux-arm-kernel(a)lists.infradead.org; linux-kernel(a)vger.kernel.org; mark.rutland(a)arm.com; quic_ylal(a)quicinc.com; stable(a)vger.kernel.org; will(a)kernel.org; 王法杰 <wangfajie(a)longcheer.com>; 刘仁旺 <liurenwang(a)longcheer.com>; 张辉 <zhanghui5(a)longcheer.com>; liangke1(a)xiaomi.com
主题: Re: [PATCH] clocksource/drivers/arm_arch_timer: Update sched_clock when non-boot CPUs need counter workaround
On 2023-01-19 07:59, 刘琦 wrote:
> [Test Report]
> Result: Test Pass
>
> A total of two rounds of pending testing
> a. The first round of hanging test
> Number of machines: 200
> Hanging test duration: 48h
> Hanging test results: no walt crash problem
> b. The second round of hanging test
> Number of machines: 200
> Hanging test duration: 72h
> Hanging test results: no walt crash problem
>
> Tested-by: wangfajie <wangfajie(a)longcheer.com>
> Tested-by: liurenwang <liurenwang(a)longcheer.com>
> Tested-by: zhanghui <zhanghui5(a)longcheer.com>
> Tested-by: liangke <liangke1(a)xiaomi.com>
Thanks for this.
The only issue here is that that you don't explain what you tested, nor how you tested it.
It is also a patch that has known defects (you just have to read the thread for the details)... This makes this testing, no matter how thorough it is, rather ineffective.
M.
--
Jazz is not dead. It just smells funny...
--
Hello Dear
My name is Dr Ava Smith,a medical doctor from United States.
I have Dual citizenship which is English and French.
I will share pictures and more details about me as soon as i get
a response from you
Thanks
Ava
[Test Report]
Result: Test Pass
A total of two rounds of pending testing
a. The first round of hanging test
Number of machines: 200
Hanging test duration: 48h
Hanging test results: no walt crash problem
b. The second round of hanging test
Number of machines: 200
Hanging test duration: 72h
Hanging test results: no walt crash problem
Tested-by: wangfajie <wangfajie(a)longcheer.com>
Tested-by: liurenwang <liurenwang(a)longcheer.com>
Tested-by: zhanghui <zhanghui5(a)longcheer.com>
Tested-by: liangke <liangke1(a)xiaomi.com>
This is the start of the stable review cycle for the 5.10.164 release.
There are 64 patches in this series, all will be posted as a response
to this one. If anyone has any issues with these being applied, please
let me know.
Responses should be made by Thu, 19 Jan 2023 12:45:11 +0000.
Anything received after that time might be too late.
The whole patch series can be found in one patch at:
https://www.kernel.org/pub/linux/kernel/v5.x/stable-review/patch-5.10.164-r…
or in the git tree and branch at:
git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable-rc.git linux-5.10.y
and the diffstat can be found below.
thanks,
greg k-h
-------------
Pseudo-Shortlog of commits:
Greg Kroah-Hartman <gregkh(a)linuxfoundation.org>
Linux 5.10.164-rc2
Ferry Toth <ftoth(a)exalondelft.nl>
Revert "usb: ulpi: defer ulpi_register on ulpi_read_id timeout"
Jens Axboe <axboe(a)kernel.dk>
io_uring/io-wq: only free worker if it was allocated for creation
Jens Axboe <axboe(a)kernel.dk>
io_uring/io-wq: free worker if task_work creation is canceled
Rob Clark <robdclark(a)chromium.org>
drm/virtio: Fix GEM handle creation UAF
Johan Hovold <johan+linaro(a)kernel.org>
efi: fix NULL-deref in init error path
Mark Rutland <mark.rutland(a)arm.com>
arm64: cmpxchg_double*: hazard against entire exchange variable
Mark Rutland <mark.rutland(a)arm.com>
arm64: atomics: remove LL/SC trampolines
Mark Rutland <mark.rutland(a)arm.com>
arm64: atomics: format whitespace consistently
Peter Newman <peternewman(a)google.com>
x86/resctrl: Fix task CLOSID/RMID update race
Reinette Chatre <reinette.chatre(a)intel.com>
x86/resctrl: Use task_curr() instead of task_struct->on_cpu to prevent unnecessary IPI
Paolo Bonzini <pbonzini(a)redhat.com>
KVM: x86: Do not return host topology information from KVM_GET_SUPPORTED_CPUID
Paolo Bonzini <pbonzini(a)redhat.com>
Documentation: KVM: add API issues section
Christophe JAILLET <christophe.jaillet(a)wanadoo.fr>
iommu/mediatek-v1: Fix an error handling path in mtk_iommu_v1_probe()
Yong Wu <yong.wu(a)mediatek.com>
iommu/mediatek-v1: Add error handle for mtk_iommu_probe
Aaron Thompson <dev(a)aaront.org>
mm: Always release pages to the buddy allocator in memblock_free_late().
Gavin Li <gavinl(a)nvidia.com>
net/mlx5e: Don't support encap rules with gbp option
Rahul Rameshbabu <rrameshbabu(a)nvidia.com>
net/mlx5: Fix ptp max frequency adjustment range
Ido Schimmel <idosch(a)nvidia.com>
net/sched: act_mpls: Fix warning during failed attribute validation
Minsuk Kang <linuxlovemin(a)yonsei.ac.kr>
nfc: pn533: Wait for out_urb's completion in pn533_usb_send_frame()
Roger Pau Monne <roger.pau(a)citrix.com>
hvc/xen: lock console list traversal
Angela Czubak <aczubak(a)marvell.com>
octeontx2-af: Fix LMAC config in cgx_lmac_rx_tx_enable
Subbaraya Sundeep <sbhatta(a)marvell.com>
octeontx2-af: Map NIX block from CGX connection
Subbaraya Sundeep <sbhatta(a)marvell.com>
octeontx2-af: Update get/set resource count functions
Tung Nguyen <tung.q.nguyen(a)dektech.com.au>
tipc: fix unexpected link reset due to discovery messages
Emanuele Ghidoli <emanuele.ghidoli(a)toradex.com>
ASoC: wm8904: fix wrong outputs volume after power reactivation
Ricardo Ribalda <ribalda(a)chromium.org>
regulator: da9211: Use irq handler when ready
Eliav Farber <farbere(a)amazon.com>
EDAC/device: Fix period calculation in edac_device_reset_delay_period()
Peter Zijlstra <peterz(a)infradead.org>
x86/boot: Avoid using Intel mnemonics in AT&T syntax asm
Kajol Jain <kjain(a)linux.ibm.com>
powerpc/imc-pmu: Fix use of mutex in IRQs disabled section
Gavrilov Ilia <Ilia.Gavrilov(a)infotecs.ru>
netfilter: ipset: Fix overflow before widen in the bitmap_ip_create() function.
Nicolas Dichtel <nicolas.dichtel(a)6wind.com>
xfrm: fix rcu lock in xfrm_notify_userpolicy()
Ye Bin <yebin10(a)huawei.com>
ext4: fix uninititialized value in 'ext4_evict_inode'
Ferry Toth <ftoth(a)exalondelft.nl>
usb: ulpi: defer ulpi_register on ulpi_read_id timeout
Mathias Nyman <mathias.nyman(a)linux.intel.com>
xhci: Prevent infinite loop in transaction errors recovery for streams
Mathias Nyman <mathias.nyman(a)linux.intel.com>
xhci: move and rename xhci_cleanup_halted_endpoint()
Mathias Nyman <mathias.nyman(a)linux.intel.com>
xhci: store TD status in the td struct instead of passing it along
Mathias Nyman <mathias.nyman(a)linux.intel.com>
xhci: move xhci_td_cleanup so it can be called by more functions
Mathias Nyman <mathias.nyman(a)linux.intel.com>
xhci: Add xhci_reset_halted_ep() helper function
Mathias Nyman <mathias.nyman(a)linux.intel.com>
xhci: adjust parameters passed to cleanup_halted_endpoint()
Mathias Nyman <mathias.nyman(a)linux.intel.com>
xhci: get isochronous ring directly from endpoint structure
Mathias Nyman <mathias.nyman(a)linux.intel.com>
xhci: Avoid parsing transfer events several times
Li Jun <jun.li(a)nxp.com>
clk: imx: imx8mp: add shared clk gate for usb suspend clk
Li Jun <jun.li(a)nxp.com>
dt-bindings: clocks: imx8mp: Add ID for usb suspend clock
Lucas Stach <l.stach(a)pengutronix.de>
clk: imx8mp: add clkout1/2 support
Marek Vasut <marex(a)denx.de>
clk: imx8mp: Add DISP2 pixel clock
Kim Phillips <kim.phillips(a)amd.com>
iommu/amd: Fix ill-formed ivrs_ioapic, ivrs_hpet and ivrs_acpihid options
Suravee Suthikulpanit <suravee.suthikulpanit(a)amd.com>
iommu/amd: Add PCI segment support for ivrs_[ioapic/hpet/acpihid] commands
Qiang Yu <quic_qianyu(a)quicinc.com>
bus: mhi: host: Fix race between channel preparation and M0 event
Herbert Xu <herbert(a)gondor.apana.org.au>
ipv6: raw: Deduct extension header length in rawv6_push_pending_frames
Yang Yingliang <yangyingliang(a)huawei.com>
ixgbe: fix pci device refcount leak
Hans de Goede <hdegoede(a)redhat.com>
platform/x86: sony-laptop: Don't turn off 0x153 keyboard backlight during probe
Kuogee Hsieh <quic_khsieh(a)quicinc.com>
drm/msm/dp: do not complete dp_aux_cmd_fifo_tx() if irq is not for aux transfer
Konrad Dybcio <konrad.dybcio(a)linaro.org>
drm/msm/adreno: Make adreno quirks not overwrite each other
Volker Lendecke <vl(a)samba.org>
cifs: Fix uninitialized memory read for smb311 posix symlink create
Heiko Carstens <hca(a)linux.ibm.com>
s390/percpu: add READ_ONCE() to arch_this_cpu_to_op_simple()
Heiko Carstens <hca(a)linux.ibm.com>
s390/cpum_sf: add READ_ONCE() semantics to compare and swap loops
Brian Norris <computersforpeace(a)gmail.com>
ASoC: qcom: lpass-cpu: Fix fallback SD line index handling
Alexander Egorenkov <egorenar(a)linux.ibm.com>
s390/kexec: fix ipl report address for kdump
Adrian Hunter <adrian.hunter(a)intel.com>
perf auxtrace: Fix address filter duplicate symbol selection
Jonathan Corbet <corbet(a)lwn.net>
docs: Fix the docs build with Sphinx 6.0
Ard Biesheuvel <ardb(a)kernel.org>
efi: tpm: Avoid READ_ONCE() for accessing the event log
Marc Zyngier <maz(a)kernel.org>
KVM: arm64: Fix S1PTW handling on RO memslots
Luka Guzenko <l.guzenko(a)web.de>
ALSA: hda/realtek: Enable mute/micmute LEDs on HP Spectre x360 13-aw0xxx
Pablo Neira Ayuso <pablo(a)netfilter.org>
netfilter: nft_payload: incorrect arithmetics when fetching VLAN header bits
-------------
Diffstat:
Documentation/admin-guide/kernel-parameters.txt | 51 +++-
Documentation/sphinx/load_config.py | 6 +-
Documentation/virt/kvm/api.rst | 60 +++++
Makefile | 4 +-
arch/arm64/include/asm/atomic_ll_sc.h | 114 ++++----
arch/arm64/include/asm/atomic_lse.h | 16 +-
arch/arm64/include/asm/kvm_emulate.h | 22 +-
arch/powerpc/include/asm/imc-pmu.h | 2 +-
arch/powerpc/perf/imc-pmu.c | 136 +++++-----
arch/s390/include/asm/cpu_mf.h | 31 ++-
arch/s390/include/asm/percpu.h | 2 +-
arch/s390/kernel/machine_kexec_file.c | 5 +-
arch/s390/kernel/perf_cpum_sf.c | 101 ++++---
arch/x86/boot/bioscall.S | 4 +-
arch/x86/kernel/cpu/resctrl/rdtgroup.c | 26 +-
arch/x86/kvm/cpuid.c | 32 +--
drivers/bus/mhi/core/pm.c | 3 +-
drivers/clk/imx/clk-imx8mp.c | 23 +-
drivers/edac/edac_device.c | 17 +-
drivers/edac/edac_module.h | 2 +-
drivers/firmware/efi/efi.c | 9 +-
drivers/gpu/drm/msm/adreno/adreno_gpu.h | 10 +-
drivers/gpu/drm/msm/dp/dp_aux.c | 4 +
drivers/gpu/drm/virtio/virtgpu_ioctl.c | 10 +-
drivers/iommu/amd/init.c | 89 ++++--
drivers/iommu/mtk_iommu_v1.c | 26 +-
drivers/net/ethernet/intel/ixgbe/ixgbe_phy.c | 14 +-
drivers/net/ethernet/marvell/octeontx2/af/cgx.c | 17 +-
drivers/net/ethernet/marvell/octeontx2/af/cgx.h | 6 +-
drivers/net/ethernet/marvell/octeontx2/af/rvu.c | 134 ++++++++--
drivers/net/ethernet/marvell/octeontx2/af/rvu.h | 4 +
.../net/ethernet/marvell/octeontx2/af/rvu_cgx.c | 15 ++
.../net/ethernet/marvell/octeontx2/af/rvu_nix.c | 21 +-
.../ethernet/mellanox/mlx5/core/en/tc_tun_vxlan.c | 2 +
.../net/ethernet/mellanox/mlx5/core/lib/clock.c | 2 +-
drivers/nfc/pn533/usb.c | 44 ++-
drivers/platform/x86/sony-laptop.c | 21 +-
drivers/regulator/da9211-regulator.c | 11 +-
drivers/tty/hvc/hvc_xen.c | 46 ++--
drivers/usb/host/xhci-mem.c | 4 +
drivers/usb/host/xhci-ring.c | 297 +++++++++++----------
drivers/usb/host/xhci.h | 6 +-
fs/cifs/link.c | 1 +
fs/ext4/super.c | 1 +
include/dt-bindings/clock/imx8mp-clock.h | 10 +-
include/linux/tpm_eventlog.h | 4 +-
io_uring/io-wq.c | 6 +
mm/memblock.c | 8 +-
net/ipv6/raw.c | 4 +
net/netfilter/ipset/ip_set_bitmap_ip.c | 4 +-
net/netfilter/nft_payload.c | 2 +-
net/sched/act_mpls.c | 8 +-
net/tipc/node.c | 12 +-
net/xfrm/xfrm_user.c | 7 +-
sound/pci/hda/patch_realtek.c | 23 ++
sound/soc/codecs/wm8904.c | 7 +
sound/soc/qcom/lpass-cpu.c | 5 +-
tools/perf/util/auxtrace.c | 2 +-
58 files changed, 1015 insertions(+), 538 deletions(-)
Hi
Am 18.01.23 um 20:21 schrieb Rodrigo Vivi:
> On Thu, Jan 12, 2023 at 09:11:55PM +0100, Thomas Zimmermann wrote:
>> Set the framebuffer info for drivers that support VGA switcheroo. Only
>> affects the amdgpu and nouveau drivers, which use VGA switcheroo and
>> generic fbdev emulation. For other drivers, this does nothing.
>>
>> This fixes a potential regression in the console code. Both, amdgpu and
>> nouveau, invoked vga_switcheroo_client_fb_set() from their internal fbdev
>> code. But the call got lost when the drivers switched to the generic
>> emulation.
>>
>> Fixes: 087451f372bf ("drm/amdgpu: use generic fb helpers instead of setting up AMD own's.")
>> Fixes: 4a16dd9d18a0 ("drm/nouveau/kms: switch to drm fbdev helpers")
>> Signed-off-by: Thomas Zimmermann <tzimmermann(a)suse.de>
>> Reviewed-by: Daniel Vetter <daniel.vetter(a)ffwll.ch>
>> Reviewed-by: Alex Deucher <alexander.deucher(a)amd.com>
>> Cc: Ben Skeggs <bskeggs(a)redhat.com>
>> Cc: Karol Herbst <kherbst(a)redhat.com>
>> Cc: Lyude Paul <lyude(a)redhat.com>
>> Cc: Thomas Zimmermann <tzimmermann(a)suse.de>
>> Cc: Javier Martinez Canillas <javierm(a)redhat.com>
>> Cc: Laurent Pinchart <laurent.pinchart(a)ideasonboard.com>
>> Cc: Jani Nikula <jani.nikula(a)intel.com>
>> Cc: Dave Airlie <airlied(a)redhat.com>
>> Cc: Evan Quan <evan.quan(a)amd.com>
>> Cc: Christian König <christian.koenig(a)amd.com>
>> Cc: Alex Deucher <alexander.deucher(a)amd.com>
>> Cc: Hawking Zhang <Hawking.Zhang(a)amd.com>
>> Cc: Likun Gao <Likun.Gao(a)amd.com>
>> Cc: "Christian König" <christian.koenig(a)amd.com>
>> Cc: Stanley Yang <Stanley.Yang(a)amd.com>
>> Cc: "Tianci.Yin" <tianci.yin(a)amd.com>
>> Cc: Xiaojian Du <Xiaojian.Du(a)amd.com>
>> Cc: Andrey Grodzovsky <andrey.grodzovsky(a)amd.com>
>> Cc: YiPeng Chai <YiPeng.Chai(a)amd.com>
>> Cc: Somalapuram Amaranath <Amaranath.Somalapuram(a)amd.com>
>> Cc: Bokun Zhang <Bokun.Zhang(a)amd.com>
>> Cc: Guchun Chen <guchun.chen(a)amd.com>
>> Cc: Hamza Mahfooz <hamza.mahfooz(a)amd.com>
>> Cc: Aurabindo Pillai <aurabindo.pillai(a)amd.com>
>> Cc: Mario Limonciello <mario.limonciello(a)amd.com>
>> Cc: Solomon Chiu <solomon.chiu(a)amd.com>
>> Cc: Kai-Heng Feng <kai.heng.feng(a)canonical.com>
>> Cc: Felix Kuehling <Felix.Kuehling(a)amd.com>
>> Cc: Daniel Vetter <daniel.vetter(a)ffwll.ch>
>> Cc: "Marek Olšák" <marek.olsak(a)amd.com>
>> Cc: Sam Ravnborg <sam(a)ravnborg.org>
>> Cc: Hans de Goede <hdegoede(a)redhat.com>
>> Cc: "Ville Syrjälä" <ville.syrjala(a)linux.intel.com>
>> Cc: dri-devel(a)lists.freedesktop.org
>> Cc: nouveau(a)lists.freedesktop.org
>> Cc: <stable(a)vger.kernel.org> # v5.17+
>> ---
>> drivers/gpu/drm/drm_fb_helper.c | 8 ++++++++
>> 1 file changed, 8 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/drm_fb_helper.c b/drivers/gpu/drm/drm_fb_helper.c
>> index 427631706128..5e445c61252d 100644
>> --- a/drivers/gpu/drm/drm_fb_helper.c
>> +++ b/drivers/gpu/drm/drm_fb_helper.c
>> @@ -30,7 +30,9 @@
>> #define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
>>
>> #include <linux/console.h>
>> +#include <linux/pci.h>
>> #include <linux/sysrq.h>
>> +#include <linux/vga_switcheroo.h>
>>
>> #include <drm/drm_atomic.h>
>> #include <drm/drm_drv.h>
>> @@ -1940,6 +1942,7 @@ static int drm_fb_helper_single_fb_probe(struct drm_fb_helper *fb_helper,
>> int preferred_bpp)
>> {
>> struct drm_client_dev *client = &fb_helper->client;
>> + struct drm_device *dev = fb_helper->dev;
>
> On drm-tip, this commit has a silent conflict with
> cff84bac9922 ("drm/fh-helper: Split fbdev single-probe helper")
> that's already in drm-next.
>
> I had created a fix-up patch in drm-tip re-introducing this line.
Thank you. Is it fixed for now?
>
> We probably need a backmerge from drm-next into drm-misc-fixes with
> the resolution applied there. And probably propagated that resolution
> later...
Backmerging from -next into -fixes branches is a problem, as -fixes
should be close to the latest release.
Can we solve this by merging -fixes into upstream and backmerging this
into our -next branches?
Best regards
Thomas
>
>> struct drm_fb_helper_surface_size sizes;
>> int ret;
>>
>> @@ -1961,6 +1964,11 @@ static int drm_fb_helper_single_fb_probe(struct drm_fb_helper *fb_helper,
>> return ret;
>>
>> strcpy(fb_helper->fb->comm, "[fbcon]");
>> +
>> + /* Set the fb info for vgaswitcheroo clients. Does nothing otherwise. */
>> + if (dev_is_pci(dev->dev))
>> + vga_switcheroo_client_fb_set(to_pci_dev(dev->dev), fb_helper->info);
>> +
>> return 0;
>> }
>>
>> --
>> 2.39.0
>>
--
Thomas Zimmermann
Graphics Driver Developer
SUSE Software Solutions Germany GmbH
Maxfeldstr. 5, 90409 Nürnberg, Germany
(HRB 36809, AG Nürnberg)
Geschäftsführer: Ivo Totev
The quilt patch titled
Subject: mm/thp: check and bail out if page in deferred queue already
has been removed from the -mm tree. Its filename was
mm-thp-check-and-bail-out-if-page-in-deferred-queue-already.patch
This patch was dropped because it was merged into the mm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Yin Fengwei <fengwei.yin(a)intel.com>
Subject: mm/thp: check and bail out if page in deferred queue already
Date: Fri, 23 Dec 2022 21:52:07 +0800
Kernel build regression with LLVM was reported here:
https://lore.kernel.org/all/Y1GCYXGtEVZbcv%2F5@dev-arch.thelio-3990X/ with
commit f35b5d7d676e ("mm: align larger anonymous mappings on THP
boundaries"). And the commit f35b5d7d676e was reverted.
It turned out the regression is related with madvise(MADV_DONTNEED)
was used by ld.lld. But with none PMD_SIZE aligned parameter len.
trace-bpfcc captured:
531607 531732 ld.lld do_madvise.part.0 start: 0x7feca9000000, len: 0x7fb000, behavior: 0x4
531607 531793 ld.lld do_madvise.part.0 start: 0x7fec86a00000, len: 0x7fb000, behavior: 0x4
If the underneath physical page is THP, the madvise(MADV_DONTNEED) can
trigger split_queue_lock contention raised significantly. perf showed
following data:
14.85% 0.00% ld.lld [kernel.kallsyms] [k]
entry_SYSCALL_64_after_hwframe
11.52%
entry_SYSCALL_64_after_hwframe
do_syscall_64
__x64_sys_madvise
do_madvise.part.0
zap_page_range
unmap_single_vma
unmap_page_range
page_remove_rmap
deferred_split_huge_page
__lock_text_start
native_queued_spin_lock_slowpath
If THP can't be removed from rmap as whole THP, partial THP will be
removed from rmap by removing sub-pages from rmap. Even the THP head page
is added to deferred queue already, the split_queue_lock will be acquired
and check whether the THP head page is in the queue already. Thus, the
contention of split_queue_lock is raised.
Before acquire split_queue_lock, check and bail out early if the THP
head page is in the queue already. The checking without holding
split_queue_lock could race with deferred_split_scan, but it doesn't
impact the correctness here.
Test result of building kernel with ld.lld:
commit 7b5a0b664ebe (parent commit of f35b5d7d676e):
time -f "\t%E real,\t%U user,\t%S sys" make LD=ld.lld -skj96 allmodconfig all
6:07.99 real, 26367.77 user, 5063.35 sys
commit f35b5d7d676e:
time -f "\t%E real,\t%U user,\t%S sys" make LD=ld.lld -skj96 allmodconfig all
7:22.15 real, 26235.03 user, 12504.55 sys
commit f35b5d7d676e with the fixing patch:
time -f "\t%E real,\t%U user,\t%S sys" make LD=ld.lld -skj96 allmodconfig all
6:08.49 real, 26520.15 user, 5047.91 sys
Link: https://lkml.kernel.org/r/20221223135207.2275317-1-fengwei.yin@intel.com
Signed-off-by: Yin Fengwei <fengwei.yin(a)intel.com>
Tested-by: Nathan Chancellor <nathan(a)kernel.org>
Acked-by: David Rientjes <rientjes(a)google.com>
Reviewed-by: "Huang, Ying" <ying.huang(a)intel.com>
Cc: Feng Tang <feng.tang(a)intel.com>
Cc: Matthew Wilcox <willy(a)infradead.org>
Cc: Rik van Riel <riel(a)surriel.com>
Cc: Xing Zhengjun <zhengjun.xing(a)linux.intel.com>
Cc: Yang Shi <shy828301(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/huge_memory.c | 3 +++
1 file changed, 3 insertions(+)
--- a/mm/huge_memory.c~mm-thp-check-and-bail-out-if-page-in-deferred-queue-already
+++ a/mm/huge_memory.c
@@ -2835,6 +2835,9 @@ void deferred_split_huge_page(struct pag
if (PageSwapCache(page))
return;
+ if (!list_empty(page_deferred_list(page)))
+ return;
+
spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
if (list_empty(page_deferred_list(page))) {
count_vm_event(THP_DEFERRED_SPLIT_PAGE);
_
Patches currently in -mm which might be from fengwei.yin(a)intel.com are
The quilt patch titled
Subject: mm: memcontrol: deprecate charge moving
has been removed from the -mm tree. Its filename was
mm-memcontrol-deprecate-charge-moving.patch
This patch was dropped because it was merged into the mm-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Johannes Weiner <hannes(a)cmpxchg.org>
Subject: mm: memcontrol: deprecate charge moving
Date: Wed, 7 Dec 2022 14:00:39 +0100
Charge moving mode in cgroup1 allows memory to follow tasks as they
migrate between cgroups. This is, and always has been, a questionable
thing to do - for several reasons.
First, it's expensive. Pages need to be identified, locked and isolated
from various MM operations, and reassigned, one by one.
Second, it's unreliable. Once pages are charged to a cgroup, there isn't
always a clear owner task anymore. Cache isn't moved at all, for example.
Mapped memory is moved - but if trylocking or isolating a page fails,
it's arbitrarily left behind. Frequent moving between domains may leave a
task's memory scattered all over the place.
Third, it isn't really needed. Launcher tasks can kick off workload tasks
directly in their target cgroup. Using dedicated per-workload groups
allows fine-grained policy adjustments - no need to move tasks and their
physical pages between control domains. The feature was never
forward-ported to cgroup2, and it hasn't been missed.
Despite it being a niche usecase, the maintenance overhead of supporting
it is enormous. Because pages are moved while they are live and subject
to various MM operations, the synchronization rules are complicated.
There are lock_page_memcg() in MM and FS code, which non-cgroup people
don't understand. In some cases we've been able to shift code and cgroup
API calls around such that we can rely on native locking as much as
possible. But that's fragile, and sometimes we need to hold MM locks for
longer than we otherwise would (pte lock e.g.).
Mark the feature deprecated. Hopefully we can remove it soon.
And backport into -stable kernels so that people who develop against
earlier kernels are warned about this deprecation as early as possible.
[akpm(a)linux-foundation.org: fix memory.rst underlining]
Link: https://lkml.kernel.org/r/Y5COd+qXwk/S+n8N@cmpxchg.org
Signed-off-by: Johannes Weiner <hannes(a)cmpxchg.org>
Acked-by: Shakeel Butt <shakeelb(a)google.com>
Acked-by: Hugh Dickins <hughd(a)google.com>
Acked-by: Michal Hocko <mhocko(a)suse.com>
Cc: Muchun Song <songmuchun(a)bytedance.com>
Cc: Roman Gushchin <roman.gushchin(a)linux.dev>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
Documentation/admin-guide/cgroup-v1/memory.rst | 13 +++++++++++--
mm/memcontrol.c | 4 ++++
2 files changed, 15 insertions(+), 2 deletions(-)
--- a/Documentation/admin-guide/cgroup-v1/memory.rst~mm-memcontrol-deprecate-charge-moving
+++ a/Documentation/admin-guide/cgroup-v1/memory.rst
@@ -86,6 +86,8 @@ Brief summary of control files.
memory.swappiness set/show swappiness parameter of vmscan
(See sysctl's vm.swappiness)
memory.move_charge_at_immigrate set/show controls of moving charges
+ This knob is deprecated and shouldn't be
+ used.
memory.oom_control set/show oom controls.
memory.numa_stat show the number of memory usage per numa
node
@@ -717,8 +719,15 @@ NOTE2:
It is recommended to set the soft limit always below the hard limit,
otherwise the hard limit will take precedence.
-8. Move charges at task migration
-=================================
+8. Move charges at task migration (DEPRECATED!)
+===============================================
+
+THIS IS DEPRECATED!
+
+It's expensive and unreliable! It's better practice to launch workload
+tasks directly from inside their target cgroup. Use dedicated workload
+cgroups to allow fine-grained policy adjustments without having to
+move physical pages between control domains.
Users can move charges associated with a task along with task migration, that
is, uncharge task's pages from the old cgroup and charge them to the new cgroup.
--- a/mm/memcontrol.c~mm-memcontrol-deprecate-charge-moving
+++ a/mm/memcontrol.c
@@ -3919,6 +3919,10 @@ static int mem_cgroup_move_charge_write(
{
struct mem_cgroup *memcg = mem_cgroup_from_css(css);
+ pr_warn_once("Cgroup memory moving (move_charge_at_immigrate) is deprecated. "
+ "Please report your usecase to linux-mm(a)kvack.org if you "
+ "depend on this functionality.\n");
+
if (val & ~MOVE_MASK)
return -EINVAL;
_
Patches currently in -mm which might be from hannes(a)cmpxchg.org are
The quilt patch titled
Subject: mm/uffd: fix pte marker when fork() without fork event
has been removed from the -mm tree. Its filename was
mm-uffd-fix-pte-marker-when-fork-without-fork-event.patch
This patch was dropped because it was merged into the mm-hotfixes-stable branch
of git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
------------------------------------------------------
From: Peter Xu <peterx(a)redhat.com>
Subject: mm/uffd: fix pte marker when fork() without fork event
Date: Wed, 14 Dec 2022 15:04:52 -0500
Patch series "mm: Fixes on pte markers".
Patch 1 resolves the syzkiller report from Pengfei.
Patch 2 further harden pte markers when used with the recent swapin error
markers. The major case is we should persist a swapin error marker after
fork(), so child shouldn't read a corrupted page.
This patch (of 2):
When fork(), dst_vma is not guaranteed to have VM_UFFD_WP even if src may
have it and has pte marker installed. The warning is improper along with
the comment. The right thing is to inherit the pte marker when needed, or
keep the dst pte empty.
A vague guess is this happened by an accident when there's the prior patch
to introduce src/dst vma into this helper during the uffd-wp feature got
developed and I probably messed up in the rebase, since if we replace
dst_vma with src_vma the warning & comment it all makes sense too.
Hugetlb did exactly the right here (copy_hugetlb_page_range()). Fix the
general path.
Reproducer:
https://github.com/xupengfe/syzkaller_logs/blob/main/221208_115556_copy_pag…
Bugzilla report: https://bugzilla.kernel.org/show_bug.cgi?id=216808
Link: https://lkml.kernel.org/r/20221214200453.1772655-1-peterx@redhat.com
Link: https://lkml.kernel.org/r/20221214200453.1772655-2-peterx@redhat.com
Fixes: c56d1b62cce8 ("mm/shmem: handle uffd-wp during fork()")
Signed-off-by: Peter Xu <peterx(a)redhat.com>
Reported-by: Pengfei Xu <pengfei.xu(a)intel.com>
Acked-by: David Hildenbrand <david(a)redhat.com>
Reviewed-by: Miaohe Lin <linmiaohe(a)huawei.com>
Cc: Andrea Arcangeli <aarcange(a)redhat.com>
Cc: "Huang, Ying" <ying.huang(a)intel.com>
Cc: Nadav Amit <nadav.amit(a)gmail.com>
Cc: <stable(a)vger.kernel.org> # 5.19+
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
mm/memory.c | 8 ++------
1 file changed, 2 insertions(+), 6 deletions(-)
--- a/mm/memory.c~mm-uffd-fix-pte-marker-when-fork-without-fork-event
+++ a/mm/memory.c
@@ -828,12 +828,8 @@ copy_nonpresent_pte(struct mm_struct *ds
return -EBUSY;
return -ENOENT;
} else if (is_pte_marker_entry(entry)) {
- /*
- * We're copying the pgtable should only because dst_vma has
- * uffd-wp enabled, do sanity check.
- */
- WARN_ON_ONCE(!userfaultfd_wp(dst_vma));
- set_pte_at(dst_mm, addr, dst_pte, pte);
+ if (userfaultfd_wp(dst_vma))
+ set_pte_at(dst_mm, addr, dst_pte, pte);
return 0;
}
if (!userfaultfd_wp(dst_vma))
_
Patches currently in -mm which might be from peterx(a)redhat.com are
selftests-vm-remove-__use_gnu-in-hugetlb-madvisec.patch
mm-uffd-always-wr-protect-pte-in-ptepmd_mkuffd_wp.patch
mm-mprotect-use-long-for-page-accountings-and-retval.patch
mm-uffd-detect-pgtable-allocation-failures.patch
mm-hugetlb-let-vma_offset_start-to-return-start.patch
mm-hugetlb-dont-wait-for-migration-entry-during-follow-page.patch
mm-hugetlb-document-huge_pte_offset-usage.patch
mm-hugetlb-move-swap-entry-handling-into-vma-lock-when-faulted.patch
mm-hugetlb-make-userfaultfd_huge_must_wait-safe-to-pmd-unshare.patch
mm-hugetlb-make-hugetlb_follow_page_mask-safe-to-pmd-unshare.patch
mm-hugetlb-make-follow_hugetlb_page-safe-to-pmd-unshare.patch
mm-hugetlb-make-walk_hugetlb_range-safe-to-pmd-unshare.patch
mm-hugetlb-introduce-hugetlb_walk.patch
There is a lock inversion and rwsem read-lock recursion in the devfreq
target callback which can lead to deadlocks.
Specifically, ufshcd_devfreq_scale() already holds a clk_scaling_lock
read lock when toggling the write booster, which involves taking the
dev_cmd mutex before taking another clk_scaling_lock read lock.
This can lead to a deadlock if another thread:
1) tries to acquire the dev_cmd and clk_scaling locks in the correct
order, or
2) takes a clk_scaling write lock before the attempt to take the
clk_scaling read lock a second time.
Fix this by dropping the clk_scaling_lock before toggling the write
booster as was done before commit 0e9d4ca43ba8 ("scsi: ufs: Protect some
contexts from unexpected clock scaling").
While the devfreq callbacks are already serialised, add a second
serialising mutex to handle the unlikely case where a callback triggered
through the devfreq sysfs interface is racing with a request to disable
clock scaling through the UFS controller 'clkscale_enable' sysfs
attribute. This could otherwise lead to the write booster being left
disabled after having disabled clock scaling.
Also take the new mutex in ufshcd_clk_scaling_allow() to make sure that
any pending write booster update has completed on return.
Note that this currently only affects Qualcomm platforms since commit
87bd05016a64 ("scsi: ufs: core: Allow host driver to disable wb toggling
during clock scaling").
The lock inversion (i.e. 1 above) was reported by lockdep as:
======================================================
WARNING: possible circular locking dependency detected
6.1.0-next-20221216 #211 Not tainted
------------------------------------------------------
kworker/u16:2/71 is trying to acquire lock:
ffff076280ba98a0 (&hba->dev_cmd.lock){+.+.}-{3:3}, at: ufshcd_query_flag+0x50/0x1c0
but task is already holding lock:
ffff076280ba9cf0 (&hba->clk_scaling_lock){++++}-{3:3}, at: ufshcd_devfreq_scale+0x2b8/0x380
which lock already depends on the new lock.
[ +0.011606]
the existing dependency chain (in reverse order) is:
-> #1 (&hba->clk_scaling_lock){++++}-{3:3}:
lock_acquire+0x68/0x90
down_read+0x58/0x80
ufshcd_exec_dev_cmd+0x70/0x2c0
ufshcd_verify_dev_init+0x68/0x170
ufshcd_probe_hba+0x398/0x1180
ufshcd_async_scan+0x30/0x320
async_run_entry_fn+0x34/0x150
process_one_work+0x288/0x6c0
worker_thread+0x74/0x450
kthread+0x118/0x120
ret_from_fork+0x10/0x20
-> #0 (&hba->dev_cmd.lock){+.+.}-{3:3}:
__lock_acquire+0x12a0/0x2240
lock_acquire.part.0+0xcc/0x220
lock_acquire+0x68/0x90
__mutex_lock+0x98/0x430
mutex_lock_nested+0x2c/0x40
ufshcd_query_flag+0x50/0x1c0
ufshcd_query_flag_retry+0x64/0x100
ufshcd_wb_toggle+0x5c/0x120
ufshcd_devfreq_scale+0x2c4/0x380
ufshcd_devfreq_target+0xf4/0x230
devfreq_set_target+0x84/0x2f0
devfreq_update_target+0xc4/0xf0
devfreq_monitor+0x38/0x1f0
process_one_work+0x288/0x6c0
worker_thread+0x74/0x450
kthread+0x118/0x120
ret_from_fork+0x10/0x20
other info that might help us debug this:
Possible unsafe locking scenario:
CPU0 CPU1
---- ----
lock(&hba->clk_scaling_lock);
lock(&hba->dev_cmd.lock);
lock(&hba->clk_scaling_lock);
lock(&hba->dev_cmd.lock);
*** DEADLOCK ***
Fixes: 0e9d4ca43ba8 ("scsi: ufs: Protect some contexts from unexpected clock scaling")
Cc: stable(a)vger.kernel.org # 5.12
Cc: Can Guo <quic_cang(a)quicinc.com>
Tested-by: Andrew Halaney <ahalaney(a)redhat.com>
Signed-off-by: Johan Hovold <johan+linaro(a)kernel.org>
---
This issue has apparently been known for over a year [1] without anyone
bothering to fix it. Aside from the potential deadlocks, this also leads
to developers using Qualcomm platforms not being able to use lockdep to
prevent further issues like this from being introduced in other places.
Johan
[1] https://lore.kernel.org/lkml/1631843521-2863-1-git-send-email-cang@codeauro…
Changes in v2
- leave the write booster unchanged on errors (Bart)
drivers/ufs/core/ufshcd.c | 29 +++++++++++++++--------------
include/ufs/ufshcd.h | 2 ++
2 files changed, 17 insertions(+), 14 deletions(-)
diff --git a/drivers/ufs/core/ufshcd.c b/drivers/ufs/core/ufshcd.c
index bda61be5f035..3a1c4d31e010 100644
--- a/drivers/ufs/core/ufshcd.c
+++ b/drivers/ufs/core/ufshcd.c
@@ -1234,12 +1234,14 @@ static int ufshcd_clock_scaling_prepare(struct ufs_hba *hba)
* clock scaling is in progress
*/
ufshcd_scsi_block_requests(hba);
+ mutex_lock(&hba->wb_mutex);
down_write(&hba->clk_scaling_lock);
if (!hba->clk_scaling.is_allowed ||
ufshcd_wait_for_doorbell_clr(hba, DOORBELL_CLR_TOUT_US)) {
ret = -EBUSY;
up_write(&hba->clk_scaling_lock);
+ mutex_unlock(&hba->wb_mutex);
ufshcd_scsi_unblock_requests(hba);
goto out;
}
@@ -1251,12 +1253,16 @@ static int ufshcd_clock_scaling_prepare(struct ufs_hba *hba)
return ret;
}
-static void ufshcd_clock_scaling_unprepare(struct ufs_hba *hba, bool writelock)
+static void ufshcd_clock_scaling_unprepare(struct ufs_hba *hba, int err, bool scale_up)
{
- if (writelock)
- up_write(&hba->clk_scaling_lock);
- else
- up_read(&hba->clk_scaling_lock);
+ up_write(&hba->clk_scaling_lock);
+
+ /* Enable Write Booster if we have scaled up else disable it */
+ if (ufshcd_enable_wb_if_scaling_up(hba) && !err)
+ ufshcd_wb_toggle(hba, scale_up);
+
+ mutex_unlock(&hba->wb_mutex);
+
ufshcd_scsi_unblock_requests(hba);
ufshcd_release(hba);
}
@@ -1273,7 +1279,6 @@ static void ufshcd_clock_scaling_unprepare(struct ufs_hba *hba, bool writelock)
static int ufshcd_devfreq_scale(struct ufs_hba *hba, bool scale_up)
{
int ret = 0;
- bool is_writelock = true;
ret = ufshcd_clock_scaling_prepare(hba);
if (ret)
@@ -1302,15 +1307,8 @@ static int ufshcd_devfreq_scale(struct ufs_hba *hba, bool scale_up)
}
}
- /* Enable Write Booster if we have scaled up else disable it */
- if (ufshcd_enable_wb_if_scaling_up(hba)) {
- downgrade_write(&hba->clk_scaling_lock);
- is_writelock = false;
- ufshcd_wb_toggle(hba, scale_up);
- }
-
out_unprepare:
- ufshcd_clock_scaling_unprepare(hba, is_writelock);
+ ufshcd_clock_scaling_unprepare(hba, ret, scale_up);
return ret;
}
@@ -6066,9 +6064,11 @@ static void ufshcd_force_error_recovery(struct ufs_hba *hba)
static void ufshcd_clk_scaling_allow(struct ufs_hba *hba, bool allow)
{
+ mutex_lock(&hba->wb_mutex);
down_write(&hba->clk_scaling_lock);
hba->clk_scaling.is_allowed = allow;
up_write(&hba->clk_scaling_lock);
+ mutex_unlock(&hba->wb_mutex);
}
static void ufshcd_clk_scaling_suspend(struct ufs_hba *hba, bool suspend)
@@ -9793,6 +9793,7 @@ int ufshcd_init(struct ufs_hba *hba, void __iomem *mmio_base, unsigned int irq)
/* Initialize mutex for exception event control */
mutex_init(&hba->ee_ctrl_mutex);
+ mutex_init(&hba->wb_mutex);
init_rwsem(&hba->clk_scaling_lock);
ufshcd_init_clk_gating(hba);
diff --git a/include/ufs/ufshcd.h b/include/ufs/ufshcd.h
index 5cf81dff60aa..727084cd79be 100644
--- a/include/ufs/ufshcd.h
+++ b/include/ufs/ufshcd.h
@@ -808,6 +808,7 @@ struct ufs_hba_monitor {
* @urgent_bkops_lvl: keeps track of urgent bkops level for device
* @is_urgent_bkops_lvl_checked: keeps track if the urgent bkops level for
* device is known or not.
+ * @wb_mutex: used to serialize devfreq and sysfs write booster toggling
* @clk_scaling_lock: used to serialize device commands and clock scaling
* @desc_size: descriptor sizes reported by device
* @scsi_block_reqs_cnt: reference counting for scsi block requests
@@ -951,6 +952,7 @@ struct ufs_hba {
enum bkops_status urgent_bkops_lvl;
bool is_urgent_bkops_lvl_checked;
+ struct mutex wb_mutex;
struct rw_semaphore clk_scaling_lock;
unsigned char desc_size[QUERY_DESC_IDN_MAX];
atomic_t scsi_block_reqs_cnt;
--
2.38.2
From: Steven Rostedt <rostedt(a)goodmis.org>
There is a disconnect between the run_command function and the
wait_for_input. The wait_for_input has a default timeout of 2 minutes. But
if that happens, the run_command loop will exit out to the waitpid() of
the executing command. This fails in that it no longer monitors the
command, and also, the ssh to the test box can hang when its finished, as
it's waiting for the pipe it's writing to to flush, but the loop that
reads that pipe has already exited, leaving the command stuck, and the
test hangs.
Instead, make the default "wait_for_input" of the run_command infinite,
and allow the user to override it if they want with a default timeout
option "RUN_TIMEOUT".
But this fixes the hang that happens when the pipe is full and the ssh
session never exits.
Cc: stable(a)vger.kernel.org
Fixes: 6e98d1b4415fe ("ktest: Add timeout to ssh command")
Signed-off-by: Steven Rostedt <rostedt(a)goodmis.org>
---
tools/testing/ktest/ktest.pl | 20 ++++++++++++++++----
tools/testing/ktest/sample.conf | 5 +++++
2 files changed, 21 insertions(+), 4 deletions(-)
diff --git a/tools/testing/ktest/ktest.pl b/tools/testing/ktest/ktest.pl
index 78249c3a03a5..7c91d753a9f2 100755
--- a/tools/testing/ktest/ktest.pl
+++ b/tools/testing/ktest/ktest.pl
@@ -178,6 +178,7 @@ my $store_failures;
my $store_successes;
my $test_name;
my $timeout;
+my $run_timeout;
my $connect_timeout;
my $config_bisect_exec;
my $booted_timeout;
@@ -340,6 +341,7 @@ my %option_map = (
"STORE_SUCCESSES" => \$store_successes,
"TEST_NAME" => \$test_name,
"TIMEOUT" => \$timeout,
+ "RUN_TIMEOUT" => \$run_timeout,
"CONNECT_TIMEOUT" => \$connect_timeout,
"CONFIG_BISECT_EXEC" => \$config_bisect_exec,
"BOOTED_TIMEOUT" => \$booted_timeout,
@@ -1862,6 +1864,14 @@ sub run_command {
$command =~ s/\$SSH_USER/$ssh_user/g;
$command =~ s/\$MACHINE/$machine/g;
+ if (!defined($timeout)) {
+ $timeout = $run_timeout;
+ }
+
+ if (!defined($timeout)) {
+ $timeout = -1; # tell wait_for_input to wait indefinitely
+ }
+
doprint("$command ... ");
$start_time = time;
@@ -1888,13 +1898,10 @@ sub run_command {
while (1) {
my $fp = \*CMD;
- if (defined($timeout)) {
- doprint "timeout = $timeout\n";
- }
my $line = wait_for_input($fp, $timeout);
if (!defined($line)) {
my $now = time;
- if (defined($timeout) && (($now - $start_time) >= $timeout)) {
+ if ($timeout >= 0 && (($now - $start_time) >= $timeout)) {
doprint "Hit timeout of $timeout, killing process\n";
$hit_timeout = 1;
kill 9, $pid;
@@ -2066,6 +2073,11 @@ sub wait_for_input {
$time = $timeout;
}
+ if ($time < 0) {
+ # Negative number means wait indefinitely
+ undef $time;
+ }
+
$rin = '';
vec($rin, fileno($fp), 1) = 1;
vec($rin, fileno(\*STDIN), 1) = 1;
diff --git a/tools/testing/ktest/sample.conf b/tools/testing/ktest/sample.conf
index 2d0fe15a096d..f43477a9b857 100644
--- a/tools/testing/ktest/sample.conf
+++ b/tools/testing/ktest/sample.conf
@@ -817,6 +817,11 @@
# is issued instead of a reboot.
# CONNECT_TIMEOUT = 25
+# The timeout in seconds for how long to wait for any running command
+# to timeout. If not defined, it will let it go indefinitely.
+# (default undefined)
+#RUN_TIMEOUT = 600
+
# In between tests, a reboot of the box may occur, and this
# is the time to wait for the console after it stops producing
# output. Some machines may not produce a large lag on reboot
--
2.39.0
From: Steven Rostedt <rostedt(a)goodmis.org>
When monitoring the console output, the stdout is being redirected to do
so. If Ctrl^C is hit during this mode, the stdout is not back to the
console, the user does not see anything they type (no echo).
Add "end_monitor" to the SIGINT interrupt handler to give back the console
on Ctrl^C.
Cc: stable(a)vger.kernel.org
Fixes: 9f2cdcbbb90e7 ("ktest: Give console process a dedicated tty")
Signed-off-by: Steven Rostedt <rostedt(a)goodmis.org>
---
tools/testing/ktest/ktest.pl | 3 +++
1 file changed, 3 insertions(+)
diff --git a/tools/testing/ktest/ktest.pl b/tools/testing/ktest/ktest.pl
index f2f48ce6ac4d..78249c3a03a5 100755
--- a/tools/testing/ktest/ktest.pl
+++ b/tools/testing/ktest/ktest.pl
@@ -4205,6 +4205,9 @@ sub send_email {
}
sub cancel_test {
+ if ($monitor_cnt) {
+ end_monitor;
+ }
if ($email_when_canceled) {
my $name = get_test_name;
send_email("KTEST: Your [$name] test was cancelled",
--
2.39.0
From: Steven Rostedt <rostedt(a)goodmis.org>
In the "reboot" command, it does a check of the machine to see if it is
still alive with a simple "ssh echo" command. If it fails, it will assume
that a normal "ssh reboot" is not possible and force a power cycle.
In this case, the "start_monitor" is executed, but the "end_monitor" is
not, and this causes the screen will not be given back to the console. That
is, after the test, a "reset" command needs to be performed, as "echo" is
turned off.
Cc: stable(a)vger.kernel.org
Fixes: 6474ace999edd ("ktest.pl: Powercycle the box on reboot if no connection can be made")
Signed-off-by: Steven Rostedt <rostedt(a)goodmis.org>
---
tools/testing/ktest/ktest.pl | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/tools/testing/ktest/ktest.pl b/tools/testing/ktest/ktest.pl
index 6f9fff88cedf..f2f48ce6ac4d 100755
--- a/tools/testing/ktest/ktest.pl
+++ b/tools/testing/ktest/ktest.pl
@@ -1499,7 +1499,8 @@ sub reboot {
# Still need to wait for the reboot to finish
wait_for_monitor($time, $reboot_success_line);
-
+ }
+ if ($powercycle || $time) {
end_monitor;
}
}
--
2.39.0
The ACPI PRM address space handler calls efi_call_virt_pointer() to
execute PRM firmware code, but doing so is only permitted when the EFI
runtime environment is available. Otherwise, such calls are guaranteed
to result in a crash, and must therefore be avoided.
Given that the EFI runtime services may become unavailable after a crash
occurring in the firmware, we need to check this each time the PRM
address space handler is invoked. If the EFI runtime services were not
available at registration time to being with, don't install the address
space handler at all.
Cc: <stable(a)vger.kernel.org>
Cc: "Rafael J. Wysocki" <rafael(a)kernel.org>
Cc: Len Brown <lenb(a)kernel.org>
Cc: linux-acpi(a)vger.kernel.org
Signed-off-by: Ard Biesheuvel <ardb(a)kernel.org>
---
v2: check both at registration and at invocation time
drivers/acpi/prmt.c | 10 ++++++++++
1 file changed, 10 insertions(+)
diff --git a/drivers/acpi/prmt.c b/drivers/acpi/prmt.c
index 998101cf16e47145..3d4c4620f9f95309 100644
--- a/drivers/acpi/prmt.c
+++ b/drivers/acpi/prmt.c
@@ -236,6 +236,11 @@ static acpi_status acpi_platformrt_space_handler(u32 function,
efi_status_t status;
struct prm_context_buffer context;
+ if (!efi_enabled(EFI_RUNTIME_SERVICES)) {
+ pr_err_ratelimited("PRM: EFI runtime services no longer available\n");
+ return AE_NO_HANDLER;
+ }
+
/*
* The returned acpi_status will always be AE_OK. Error values will be
* saved in the first byte of the PRM message buffer to be used by ASL.
@@ -325,6 +330,11 @@ void __init init_prmt(void)
pr_info("PRM: found %u modules\n", mc);
+ if (!efi_enabled(EFI_RUNTIME_SERVICES)) {
+ pr_err("PRM: EFI runtime services unavailable\n");
+ return;
+ }
+
status = acpi_install_address_space_handler(ACPI_ROOT_OBJECT,
ACPI_ADR_SPACE_PLATFORM_RT,
&acpi_platformrt_space_handler,
--
2.39.0
The platforms based on SDM845 SoC locks the access to EDAC registers in the
bootloader. So probing the EDAC driver will result in a crash. Hence,
disable the creation of EDAC platform device on all SDM845 devices.
The issue has been observed on Lenovo Yoga C630 and DB845c.
Cc: <stable(a)vger.kernel.org> # 5.10
Reported-by: Steev Klimaszewski <steev(a)kali.org>
Signed-off-by: Manivannan Sadhasivam <manivannan.sadhasivam(a)linaro.org>
---
drivers/soc/qcom/llcc-qcom.c | 17 ++++++++++++-----
1 file changed, 12 insertions(+), 5 deletions(-)
diff --git a/drivers/soc/qcom/llcc-qcom.c b/drivers/soc/qcom/llcc-qcom.c
index 7b7c5a38bac6..8d840702df50 100644
--- a/drivers/soc/qcom/llcc-qcom.c
+++ b/drivers/soc/qcom/llcc-qcom.c
@@ -1012,11 +1012,18 @@ static int qcom_llcc_probe(struct platform_device *pdev)
drv_data->ecc_irq = platform_get_irq_optional(pdev, 0);
- llcc_edac = platform_device_register_data(&pdev->dev,
- "qcom_llcc_edac", -1, drv_data,
- sizeof(*drv_data));
- if (IS_ERR(llcc_edac))
- dev_err(dev, "Failed to register llcc edac driver\n");
+ /*
+ * The platforms based on SDM845 SoC locks the access to EDAC registers
+ * in bootloader. So probing the EDAC driver will result in a crash.
+ * Hence, disable the creation of EDAC platform device on SDM845.
+ */
+ if (!of_device_is_compatible(dev->of_node, "qcom,sdm845-llcc")) {
+ llcc_edac = platform_device_register_data(&pdev->dev,
+ "qcom_llcc_edac", -1, drv_data,
+ sizeof(*drv_data));
+ if (IS_ERR(llcc_edac))
+ dev_err(dev, "Failed to register llcc edac driver\n");
+ }
return 0;
err:
--
2.25.1
Commit 537d9ed2f6c1 ("drm/edid: convert add_cea_modes() to use cea db
iter") inadvertently moved the do_hdmi_vsdb_modes() call within the db
iteration loop, always passing NULL as the CTA VDB to
do_hdmi_vsdb_modes(), skipping a lot of stereo modes.
Move the call back outside of the loop.
This does mean only one CTA VDB and HDMI VSDB combination will be
handled, but it's an unlikely scenario to have more than one of either
block, and it was not accounted for before the regression either.
Fixes: 537d9ed2f6c1 ("drm/edid: convert add_cea_modes() to use cea db iter")
Cc: <stable(a)vger.kernel.org> # v6.0+
Cc: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula(a)intel.com>
---
drivers/gpu/drm/drm_edid.c | 22 ++++++++++------------
1 file changed, 10 insertions(+), 12 deletions(-)
diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 23f7146e6a9b..b94adb9bbefb 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -5249,13 +5249,12 @@ static int add_cea_modes(struct drm_connector *connector,
{
const struct cea_db *db;
struct cea_db_iter iter;
+ const u8 *hdmi = NULL, *video = NULL;
+ u8 hdmi_len = 0, video_len = 0;
int modes = 0;
cea_db_iter_edid_begin(drm_edid, &iter);
cea_db_iter_for_each(db, &iter) {
- const u8 *hdmi = NULL, *video = NULL;
- u8 hdmi_len = 0, video_len = 0;
-
if (cea_db_tag(db) == CTA_DB_VIDEO) {
video = cea_db_data(db);
video_len = cea_db_payload_len(db);
@@ -5271,18 +5270,17 @@ static int add_cea_modes(struct drm_connector *connector,
modes += do_y420vdb_modes(connector, vdb420,
cea_db_payload_len(db) - 1);
}
-
- /*
- * We parse the HDMI VSDB after having added the cea modes as we
- * will be patching their flags when the sink supports stereo
- * 3D.
- */
- if (hdmi)
- modes += do_hdmi_vsdb_modes(connector, hdmi, hdmi_len,
- video, video_len);
}
cea_db_iter_end(&iter);
+ /*
+ * We parse the HDMI VSDB after having added the cea modes as we will be
+ * patching their flags when the sink supports stereo 3D.
+ */
+ if (hdmi)
+ modes += do_hdmi_vsdb_modes(connector, hdmi, hdmi_len,
+ video, video_len);
+
return modes;
}
--
2.34.1
We try to avoid sending VICs defined in the later specs in AVI
infoframes to sinks that conform to the earlier specs, to not upset
them, and use 0 for the VIC instead. However, we do this detection and
conversion to 0 too early, as we'll need the actual VIC to figure out
the aspect ratio.
In particular, for a mode with 64:27 aspect ratio, 0 for VIC fails the
AVI infoframe generation altogether with -EINVAL.
Separate the VIC lookup from the "filtering", and postpone the
filtering, to use the proper VIC for aspect ratio handling, and the 0
VIC for the infoframe video code as needed.
Reported-by: William Tseng <william.tseng(a)intel.com>
Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/6153
References: https://lore.kernel.org/r/20220920062316.43162-1-william.tseng@intel.com
Cc: <stable(a)vger.kernel.org>
Cc: Ville Syrjälä <ville.syrjala(a)linux.intel.com>
Signed-off-by: Jani Nikula <jani.nikula(a)intel.com>
---
drivers/gpu/drm/drm_edid.c | 21 ++++++++++++---------
1 file changed, 12 insertions(+), 9 deletions(-)
diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c
index 3841aba17abd..23f7146e6a9b 100644
--- a/drivers/gpu/drm/drm_edid.c
+++ b/drivers/gpu/drm/drm_edid.c
@@ -6885,8 +6885,6 @@ static u8 drm_mode_hdmi_vic(const struct drm_connector *connector,
static u8 drm_mode_cea_vic(const struct drm_connector *connector,
const struct drm_display_mode *mode)
{
- u8 vic;
-
/*
* HDMI spec says if a mode is found in HDMI 1.4b 4K modes
* we should send its VIC in vendor infoframes, else send the
@@ -6896,13 +6894,18 @@ static u8 drm_mode_cea_vic(const struct drm_connector *connector,
if (drm_mode_hdmi_vic(connector, mode))
return 0;
- vic = drm_match_cea_mode(mode);
+ return drm_match_cea_mode(mode);
+}
- /*
- * HDMI 1.4 VIC range: 1 <= VIC <= 64 (CEA-861-D) but
- * HDMI 2.0 VIC range: 1 <= VIC <= 107 (CEA-861-F). So we
- * have to make sure we dont break HDMI 1.4 sinks.
- */
+/*
+ * Avoid sending VICs defined in HDMI 2.0 in AVI infoframes to sinks that
+ * conform to HDMI 1.4.
+ *
+ * HDMI 1.4 (CTA-861-D) VIC range: [1..64]
+ * HDMI 2.0 (CTA-861-F) VIC range: [1..107]
+ */
+static u8 vic_for_avi_infoframe(const struct drm_connector *connector, u8 vic)
+{
if (!is_hdmi2_sink(connector) && vic > 64)
return 0;
@@ -6978,7 +6981,7 @@ drm_hdmi_avi_infoframe_from_display_mode(struct hdmi_avi_infoframe *frame,
picture_aspect = HDMI_PICTURE_ASPECT_NONE;
}
- frame->video_code = vic;
+ frame->video_code = vic_for_avi_infoframe(connector, vic);
frame->picture_aspect = picture_aspect;
frame->active_aspect = HDMI_ACTIVE_ASPECT_PICTURE;
frame->scan_mode = HDMI_SCAN_MODE_UNDERSCAN;
--
2.34.1
This bug is marked as fixed by commit:
ext4: block range must be validated before use in ext4_mb_clear_bb()
But I can't find it in the tested trees[1] for more than 90 days.
Is it a correct commit? Please update it by replying:
#syz fix: exact-commit-title
Until then the bug is still considered open and new crashes with
the same signature are ignored.
Kernel: Android 5.10
Dashboard link: https://syzkaller.appspot.com/bug?extid=15cd994e273307bf5cfa
---
[1] I expect the commit to be present in:
1. android12-5.10-lts branch of
https://android.googlesource.com/kernel/common
This device has a trackstick that is sent through the same HID device
than the touchpad.
Unfortunately there are 2 mice attached to that device descriptor, with
the first one for the touchpad when the second is for the trackstick.
Force all devices to be exported.
Cc: stable(a)vger.kernel.org # 5.8+
Link: https://bugzilla.redhat.com/show_bug.cgi?id=2154204
Signed-off-by: Benjamin Tissoires <benjamin.tissoires(a)redhat.com>
---
drivers/hid/hid-multitouch.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/hid/hid-multitouch.c b/drivers/hid/hid-multitouch.c
index 91a4d3fc30e0..91ac72b32d45 100644
--- a/drivers/hid/hid-multitouch.c
+++ b/drivers/hid/hid-multitouch.c
@@ -1860,6 +1860,12 @@ static const struct hid_device_id mt_devices[] = {
USB_VENDOR_ID_ASUSTEK,
USB_DEVICE_ID_ASUSTEK_T304_KEYBOARD) },
+ /* Asus ExpertBook with trackstick */
+ { .driver_data = MT_CLS_WIN_8_FORCE_MULTI_INPUT,
+ HID_DEVICE(BUS_I2C, HID_GROUP_MULTITOUCH_WIN_8,
+ USB_VENDOR_ID_ELAN,
+ 0x3148) },
+
/* Atmel panels */
{ .driver_data = MT_CLS_SERIAL,
MT_USB_DEVICE(USB_VENDOR_ID_ATMEL,
--
2.38.1
The "duty > cycle" trick to force the pin level of a disabled TCU2
channel would only work when the channel had been enabled previously.
Address this issue by enabling the PWM mode in jz4740_pwm_disable
(I know, right) so that the "duty > cycle" trick works before disabling
the PWM channel right after.
This issue went unnoticed, as the PWM pins on the majority of the boards
tested would default to the inactive level once the corresponding TCU
clock was enabled, so the first call to jz4740_pwm_disable() would not
actually change the pin levels.
On the GCW Zero however, the PWM pin for the backlight (PWM1, which is
a TCU2 channel) goes active as soon as the timer1 clock is enabled.
Since the jz4740_pwm_disable() function did not work on channels not
previously enabled, the backlight would shine at full brightness from
the moment the backlight driver would probe, until the backlight driver
tried to *enable* the PWM output.
With this fix, the PWM pins will be forced inactive as soon as
jz4740_pwm_apply() is called (and might be reconfigured to active if
dictated by the pwm_state). This means that there is still a tiny time
frame between the .request() and .apply() callbacks where the PWM pin
might be active. Sadly, there is no way to fix this issue: it is
impossible to write a PWM channel's registers if the corresponding clock
is not enabled, and enabling the clock is what causes the PWM pin to go
active.
There is a workaround, though, which complements this fix: simply
starting the backlight driver (or any PWM client driver) with a "init"
pinctrl state that sets the pin as an inactive GPIO. Once the driver is
probed and the pinctrl state switches to "default", the regular PWM pin
configuration can be used as it will be properly driven.
Fixes: c2693514a0a1 ("pwm: jz4740: Obtain regmap from parent node")
Signed-off-by: Paul Cercueil <paul(a)crapouillou.net>
Cc: stable(a)vger.kernel.org
---
drivers/pwm/pwm-jz4740.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/drivers/pwm/pwm-jz4740.c b/drivers/pwm/pwm-jz4740.c
index a5fdf97c0d2e..228eb104bf1e 100644
--- a/drivers/pwm/pwm-jz4740.c
+++ b/drivers/pwm/pwm-jz4740.c
@@ -102,11 +102,14 @@ static void jz4740_pwm_disable(struct pwm_chip *chip, struct pwm_device *pwm)
struct jz4740_pwm_chip *jz = to_jz4740(chip);
/*
- * Set duty > period. This trick allows the TCU channels in TCU2 mode to
- * properly return to their init level.
+ * Set duty > period, then enable PWM mode and start the counter.
+ * This trick allows to force the inactive pin level for the TCU2
+ * channels.
*/
regmap_write(jz->map, TCU_REG_TDHRc(pwm->hwpwm), 0xffff);
regmap_write(jz->map, TCU_REG_TDFRc(pwm->hwpwm), 0x0);
+ regmap_set_bits(jz->map, TCU_REG_TCSRc(pwm->hwpwm), TCU_TCSR_PWM_EN);
+ regmap_write(jz->map, TCU_REG_TESR, BIT(pwm->hwpwm));
/*
* Disable PWM output.
--
2.35.1
On Wed, Jan 18, 2023 at 02:26:37AM +0530, mkv22(a)cantab.net wrote:
> OK. I just read about git bisect. Should I do it on the t2 github page from
> where things are downloaded and compiled and installed or on the mainline
> kernel github pages? Many thanks.
Nit, we don't use github for kernel development, but rather,
git.kernel.org :)
> It works fine till 6.1.5.
I recommend using the "t2 github" repo as odds are there are still
out-of-tree patches in there that you rely on.
Let us know how that goes, and watch out with sending html mail to our
mailing lists, they get automatically rejected.
thanks,
greg k-h
This series backports 6 commits with 'Cc stable' that had failed to be
applied, and 4 related commits that made the backports much easier.
Please apply this series to 5.15-stable.
I verified that this series does not cause any regressions with
'gce-xfstests -c ext4/fast_commit -g auto'. There is one test failure
both before and after (ext4/050).
Eric Biggers (5):
ext4: disable fast-commit of encrypted dir operations
ext4: don't set up encryption key during jbd2 transaction
ext4: add missing validation of fast-commit record lengths
ext4: fix unaligned memory access in ext4_fc_reserve_space()
ext4: fix off-by-one errors in fast-commit block filling
Jan Kara (1):
ext4: use ext4_debug() instead of jbd_debug()
Ritesh Harjani (1):
ext4: remove unused enum EXT4_FC_COMMIT_FAILED
Ye Bin (3):
ext4: introduce EXT4_FC_TAG_BASE_LEN helper
ext4: factor out ext4_fc_get_tl()
ext4: fix potential out of bound read in ext4_fc_replay_scan()
fs/ext4/balloc.c | 2 +-
fs/ext4/ext4.h | 4 +-
fs/ext4/ext4_jbd2.c | 3 +-
fs/ext4/fast_commit.c | 284 +++++++++++++++++++++---------------
fs/ext4/fast_commit.h | 7 +-
fs/ext4/indirect.c | 4 +-
fs/ext4/inode.c | 2 +-
fs/ext4/namei.c | 44 +++---
fs/ext4/orphan.c | 24 +--
fs/ext4/super.c | 2 +-
include/trace/events/ext4.h | 7 +-
11 files changed, 222 insertions(+), 161 deletions(-)
--
2.39.0
The patch titled
Subject: mm, mremap: fix mremap() expanding for vma's with vm_ops->close()
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
mm-mremap-fix-mremap-expanding-for-vmas-with-vm_ops-close.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: Vlastimil Babka <vbabka(a)suse.cz>
Subject: mm, mremap: fix mremap() expanding for vma's with vm_ops->close()
Date: Tue, 17 Jan 2023 11:19:39 +0100
Fabian has reported another regression in 6.1 due to ca3d76b0aa80 ("mm:
add merging after mremap resize"). The problem is that vma_merge() can
fail when vma has a vm_ops->close() method, causing is_mergeable_vma()
test to be negative. This was happening for vma mapping a file from
fuse-overlayfs, which does have the method. But when we are simply
expanding the vma, we never remove it due to the "merge" with the added
area, so the test should not prevent the expansion.
As a quick fix, check for such vmas and expand them using vma_adjust()
directly as was done before commit ca3d76b0aa80. For a more robust long
term solution we should try to limit the check for vma_ops->close only to
cases that actually result in vma removal, so that no merge would be
prevented unnecessarily.
Link: https://lkml.kernel.org/r/20230117101939.9753-1-vbabka@suse.cz
Fixes: ca3d76b0aa80 ("mm: add merging after mremap resize")
Signed-off-by: Vlastimil Babka <vbabka(a)suse.cz>
Reported-by: Fabian Vogt <fvogt(a)suse.com>
Link: https://bugzilla.suse.com/show_bug.cgi?id=1206359#c35
Tested-by: Fabian Vogt <fvogt(a)suse.com>
Cc: Jakub Mat��na <matenajakub(a)gmail.com>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
--- a/mm/mremap.c~mm-mremap-fix-mremap-expanding-for-vmas-with-vm_ops-close
+++ a/mm/mremap.c
@@ -1032,11 +1032,22 @@ SYSCALL_DEFINE5(mremap, unsigned long, a
* the already existing vma (expand operation itself) and possibly also
* with the next vma if it becomes adjacent to the expanded vma and
* otherwise compatible.
+ *
+ * However, vma_merge() can currently fail due to is_mergeable_vma()
+ * check for vm_ops->close (see the comment there). Yet this should not
+ * prevent vma expanding, so perform a simple expand for such vma.
+ * Ideally the check for close op should be only done when a vma would
+ * be actually removed due to a merge.
*/
- vma = vma_merge(mm, vma, extension_start, extension_end,
+ if (!vma->vm_ops || !vma->vm_ops->close) {
+ vma = vma_merge(mm, vma, extension_start, extension_end,
vma->vm_flags, vma->anon_vma, vma->vm_file,
extension_pgoff, vma_policy(vma),
vma->vm_userfaultfd_ctx, anon_vma_name(vma));
+ } else if (vma_adjust(vma, vma->vm_start, addr + new_len,
+ vma->vm_pgoff, NULL)) {
+ vma = NULL;
+ }
if (!vma) {
vm_unacct_memory(pages);
ret = -ENOMEM;
_
Patches currently in -mm which might be from vbabka(a)suse.cz are
revert-mm-compaction-fix-set-skip-in-fast_find_migrateblock.patch
mm-mremap-fix-mremap-expanding-for-vmas-with-vm_ops-close.patch
The patch titled
Subject: ia64: fix build error due to switch case label appearing next to declaration
has been added to the -mm mm-hotfixes-unstable branch. Its filename is
ia64-fix-build-error-due-to-switch-case-label-appearing-next-to-declaration.patch
This patch will shortly appear at
https://git.kernel.org/pub/scm/linux/kernel/git/akpm/25-new.git/tree/patche…
This patch will later appear in the mm-hotfixes-unstable branch at
git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
Before you just go and hit "reply", please:
a) Consider who else should be cc'ed
b) Prefer to cc a suitable mailing list as well
c) Ideally: find the original patch on the mailing list and do a
reply-to-all to that, adding suitable additional cc's
*** Remember to use Documentation/process/submit-checklist.rst when testing your code ***
The -mm tree is included into linux-next via the mm-everything
branch at git://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm
and is updated there every 2-3 working days
------------------------------------------------------
From: James Morse <james.morse(a)arm.com>
Subject: ia64: fix build error due to switch case label appearing next to declaration
Date: Tue, 17 Jan 2023 15:16:32 +0000
Since commit aa06a9bd8533 ("ia64: fix clock_getres(CLOCK_MONOTONIC) to
report ITC frequency"), gcc 10.1.0 fails to build ia64 with the gnomic:
| ../arch/ia64/kernel/sys_ia64.c: In function 'ia64_clock_getres':
| ../arch/ia64/kernel/sys_ia64.c:189:3: error: a label can only be part of a statement and a declaration is not a statement
| 189 | s64 tick_ns = DIV_ROUND_UP(NSEC_PER_SEC, local_cpu_data->itc_freq);
This line appears immediately after a case label in a switch.
Move the declarations out of the case, to the top of the function.
Link: https://lkml.kernel.org/r/20230117151632.393836-1-james.morse@arm.com
Fixes: aa06a9bd8533 ("ia64: fix clock_getres(CLOCK_MONOTONIC) to report ITC frequency")
Signed-off-by: James Morse <james.morse(a)arm.com>
Cc: ��meric Maschino <emeric.maschino(a)gmail.com>
Cc: matoro <matoro_mailinglist_kernel(a)matoro.tk>
Cc: Sergei Trofimovich <slyich(a)gmail.com>
Cc: John Paul Adrian Glaubitz <glaubitz(a)physik.fu-berlin.de>
Cc: <stable(a)vger.kernel.org>
Signed-off-by: Andrew Morton <akpm(a)linux-foundation.org>
---
--- a/arch/ia64/kernel/sys_ia64.c~ia64-fix-build-error-due-to-switch-case-label-appearing-next-to-declaration
+++ a/arch/ia64/kernel/sys_ia64.c
@@ -170,6 +170,9 @@ ia64_mremap (unsigned long addr, unsigne
asmlinkage long
ia64_clock_getres(const clockid_t which_clock, struct __kernel_timespec __user *tp)
{
+ struct timespec64 rtn_tp;
+ s64 tick_ns;
+
/*
* ia64's clock_gettime() syscall is implemented as a vdso call
* fsys_clock_gettime(). Currently it handles only
@@ -185,8 +188,8 @@ ia64_clock_getres(const clockid_t which_
switch (which_clock) {
case CLOCK_REALTIME:
case CLOCK_MONOTONIC:
- s64 tick_ns = DIV_ROUND_UP(NSEC_PER_SEC, local_cpu_data->itc_freq);
- struct timespec64 rtn_tp = ns_to_timespec64(tick_ns);
+ tick_ns = DIV_ROUND_UP(NSEC_PER_SEC, local_cpu_data->itc_freq);
+ rtn_tp = ns_to_timespec64(tick_ns);
return put_timespec64(&rtn_tp, tp);
}
_
Patches currently in -mm which might be from james.morse(a)arm.com are
ia64-fix-build-error-due-to-switch-case-label-appearing-next-to-declaration.patch