commit 11a1f4bc47362700fcbde717292158873fb847ed upstream.
Keith reports a use-after-free when a DPC event occurs concurrently to
hot-removal of the same portion of the hierarchy:
The dpc_handler() awaits readiness of the secondary bus below the
Downstream Port where the DPC event occurred. To do so, it polls the
config space of the first child device on the secondary bus. If that
child device is concurrently removed, accesses to its struct pci_dev
cause the kernel to oops.
That's because pci_bridge_wait_for_secondary_bus() neglects to hold a
reference on the child device. Before v6.3, the function was only
called on resume from system sleep or on runtime resume. Holding a
reference wasn't necessary back then because the pciehp IRQ thread
could never run concurrently. (On resume from system sleep, IRQs are
not enabled until after the resume_noirq phase. And runtime resume is
always awaited before a PCI device is removed.)
However starting with v6.3, pci_bridge_wait_for_secondary_bus() is also
called on a DPC event. Commit 53b54ad074de ("PCI/DPC: Await readiness
of secondary bus after reset"), which introduced that, failed to
appreciate that pci_bridge_wait_for_secondary_bus() now needs to hold a
reference on the child device because dpc_handler() and pciehp may
indeed run concurrently. The commit was backported to v5.10+ stable
kernels, so that's the oldest one affected.
Add the missing reference acquisition.
Abridged stack trace:
BUG: unable to handle page fault for address: 00000000091400c0
CPU: 15 PID: 2464 Comm: irq/53-pcie-dpc 6.9.0
RIP: pci_bus_read_config_dword+0x17/0x50
pci_dev_wait()
pci_bridge_wait_for_secondary_bus()
dpc_reset_link()
pcie_do_recovery()
dpc_handler()
Fixes: 53b54ad074de ("PCI/DPC: Await readiness of secondary bus after reset")
Closes: https://lore.kernel.org/r/20240612181625.3604512-3-kbusch@meta.com/
Link: https://lore.kernel.org/linux-pci/8e4bcd4116fd94f592f2bf2749f168099c480ddf.…
Reported-by: Keith Busch <kbusch(a)kernel.org>
Tested-by: Keith Busch <kbusch(a)kernel.org>
Signed-off-by: Lukas Wunner <lukas(a)wunner.de>
Signed-off-by: Krzysztof Wilczyński <kwilczynski(a)kernel.org>
Reviewed-by: Keith Busch <kbusch(a)kernel.org>
Reviewed-by: Mika Westerberg <mika.westerberg(a)linux.intel.com>
Cc: stable(a)vger.kernel.org # v5.10+
---
drivers/pci/pci.c | 19 ++++++++++++-------
1 file changed, 12 insertions(+), 7 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index 67216f4ea215..a88909f2ae65 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4916,7 +4916,7 @@ int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type,
int timeout)
{
struct pci_dev *child;
- int delay;
+ int delay, ret = 0;
if (pci_dev_is_disconnected(dev))
return 0;
@@ -4944,8 +4944,8 @@ int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type,
return 0;
}
- child = list_first_entry(&dev->subordinate->devices, struct pci_dev,
- bus_list);
+ child = pci_dev_get(list_first_entry(&dev->subordinate->devices,
+ struct pci_dev, bus_list));
up_read(&pci_bus_sem);
/*
@@ -4955,7 +4955,7 @@ int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type,
if (!pci_is_pcie(dev)) {
pci_dbg(dev, "waiting %d ms for secondary bus\n", 1000 + delay);
msleep(1000 + delay);
- return 0;
+ goto put_child;
}
/*
@@ -4976,7 +4976,7 @@ int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type,
* until the timeout expires.
*/
if (!pcie_downstream_port(dev))
- return 0;
+ goto put_child;
if (pcie_get_speed_cap(dev) <= PCIE_SPEED_5_0GT) {
pci_dbg(dev, "waiting %d ms for downstream link\n", delay);
@@ -4987,11 +4987,16 @@ int pci_bridge_wait_for_secondary_bus(struct pci_dev *dev, char *reset_type,
if (!pcie_wait_for_link_delay(dev, true, delay)) {
/* Did not train, no need to wait any further */
pci_info(dev, "Data Link Layer Link Active not set in 1000 msec\n");
- return -ENOTTY;
+ ret = -ENOTTY;
+ goto put_child;
}
}
- return pci_dev_wait(child, reset_type, timeout - delay);
+ ret = pci_dev_wait(child, reset_type, timeout - delay);
+
+put_child:
+ pci_dev_put(child);
+ return ret;
}
void pci_reset_secondary_bus(struct pci_dev *dev)
--
2.43.0
The patch below does not apply to the 6.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.10.y
git checkout FETCH_HEAD
git cherry-pick -x 2237ceb71f89837ac47c5dce2aaa2c2b3a337a3c
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024073021-strut-specimen-8aad@gregkh' --subject-prefix 'PATCH 6.10.y' HEAD^..
Possible dependencies:
2237ceb71f89 ("rbd: don't assume RBD_LOCK_STATE_LOCKED for exclusive mappings")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 2237ceb71f89837ac47c5dce2aaa2c2b3a337a3c Mon Sep 17 00:00:00 2001
From: Ilya Dryomov <idryomov(a)gmail.com>
Date: Tue, 23 Jul 2024 18:07:59 +0200
Subject: [PATCH] rbd: don't assume RBD_LOCK_STATE_LOCKED for exclusive
mappings
Every time a watch is reestablished after getting lost, we need to
update the cookie which involves quiescing exclusive lock. For this,
we transition from RBD_LOCK_STATE_LOCKED to RBD_LOCK_STATE_QUIESCING
roughly for the duration of rbd_reacquire_lock() call. If the mapping
is exclusive and I/O happens to arrive in this time window, it's failed
with EROFS (later translated to EIO) based on the wrong assumption in
rbd_img_exclusive_lock() -- "lock got released?" check there stopped
making sense with commit a2b1da09793d ("rbd: lock should be quiesced on
reacquire").
To make it worse, any such I/O is added to the acquiring list before
EROFS is returned and this sets up for violating rbd_lock_del_request()
precondition that the request is either on the running list or not on
any list at all -- see commit ded080c86b3f ("rbd: don't move requests
to the running list on errors"). rbd_lock_del_request() ends up
processing these requests as if they were on the running list which
screws up quiescing_wait completion counter and ultimately leads to
rbd_assert(!completion_done(&rbd_dev->quiescing_wait));
being triggered on the next watch error.
Cc: stable(a)vger.kernel.org # 06ef84c4e9c4: rbd: rename RBD_LOCK_STATE_RELEASING and releasing_wait
Cc: stable(a)vger.kernel.org
Fixes: 637cd060537d ("rbd: new exclusive lock wait/wake code")
Signed-off-by: Ilya Dryomov <idryomov(a)gmail.com>
Reviewed-by: Dongsheng Yang <dongsheng.yang(a)easystack.cn>
diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index c30d227753d7..ea6c592e015c 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -3457,6 +3457,7 @@ static void rbd_lock_del_request(struct rbd_img_request *img_req)
lockdep_assert_held(&rbd_dev->lock_rwsem);
spin_lock(&rbd_dev->lock_lists_lock);
if (!list_empty(&img_req->lock_item)) {
+ rbd_assert(!list_empty(&rbd_dev->running_list));
list_del_init(&img_req->lock_item);
need_wakeup = (rbd_dev->lock_state == RBD_LOCK_STATE_QUIESCING &&
list_empty(&rbd_dev->running_list));
@@ -3476,11 +3477,6 @@ static int rbd_img_exclusive_lock(struct rbd_img_request *img_req)
if (rbd_lock_add_request(img_req))
return 1;
- if (rbd_dev->opts->exclusive) {
- WARN_ON(1); /* lock got released? */
- return -EROFS;
- }
-
/*
* Note the use of mod_delayed_work() in rbd_acquire_lock()
* and cancel_delayed_work() in wake_lock_waiters().
@@ -4601,6 +4597,10 @@ static void rbd_reacquire_lock(struct rbd_device *rbd_dev)
rbd_warn(rbd_dev, "failed to update lock cookie: %d",
ret);
+ if (rbd_dev->opts->exclusive)
+ rbd_warn(rbd_dev,
+ "temporarily releasing lock on exclusive mapping");
+
/*
* Lock cookie cannot be updated on older OSDs, so do
* a manual release and queue an acquire.
The patch below does not apply to the 5.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-5.10.y
git checkout FETCH_HEAD
git cherry-pick -x c3281abea67c9c0dc6219bbc41d1feae05a16da3
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024073024-ruined-frightful-2dc2@gregkh' --subject-prefix 'PATCH 5.10.y' HEAD^..
Possible dependencies:
c3281abea67c ("remoteproc: stm32_rproc: Fix mailbox interrupts queuing")
35bdafda40cc ("remoteproc: stm32_rproc: Add mutex protection for workqueue")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From c3281abea67c9c0dc6219bbc41d1feae05a16da3 Mon Sep 17 00:00:00 2001
From: Gwenael Treuveur <gwenael.treuveur(a)foss.st.com>
Date: Tue, 21 May 2024 18:23:16 +0200
Subject: [PATCH] remoteproc: stm32_rproc: Fix mailbox interrupts queuing
Manage interrupt coming from coprocessor also when state is
ATTACHED.
Fixes: 35bdafda40cc ("remoteproc: stm32_rproc: Add mutex protection for workqueue")
Cc: stable(a)vger.kernel.org
Signed-off-by: Gwenael Treuveur <gwenael.treuveur(a)foss.st.com>
Acked-by: Arnaud Pouliquen <arnaud.pouliquen(a)foss.st.com>
Link: https://lore.kernel.org/r/20240521162316.156259-1-gwenael.treuveur@foss.st.…
Signed-off-by: Mathieu Poirier <mathieu.poirier(a)linaro.org>
diff --git a/drivers/remoteproc/stm32_rproc.c b/drivers/remoteproc/stm32_rproc.c
index 88623df7d0c3..8c7f7950b80e 100644
--- a/drivers/remoteproc/stm32_rproc.c
+++ b/drivers/remoteproc/stm32_rproc.c
@@ -294,7 +294,7 @@ static void stm32_rproc_mb_vq_work(struct work_struct *work)
mutex_lock(&rproc->lock);
- if (rproc->state != RPROC_RUNNING)
+ if (rproc->state != RPROC_RUNNING && rproc->state != RPROC_ATTACHED)
goto unlock_mutex;
if (rproc_vq_interrupt(rproc, mb->vq_id) == IRQ_NONE)
The patch below does not apply to the 6.6-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.6.y
git checkout FETCH_HEAD
git cherry-pick -x f70fd92df7529e7283e02a6c3a2510075f13ba30
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024073000-yelp-remnant-0e02@gregkh' --subject-prefix 'PATCH 6.6.y' HEAD^..
Possible dependencies:
f70fd92df752 ("MIPS: dts: loongson: Fix ls2k1000-rtc interrupt")
dbb69b9d6234 ("MIPS: dts: loongson: Fix liointc IRQ polarity")
d89a415ff8d5 ("MIPS: Loongson64: DTS: Fix PCIe port nodes for ls7a")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f70fd92df7529e7283e02a6c3a2510075f13ba30 Mon Sep 17 00:00:00 2001
From: Jiaxun Yang <jiaxun.yang(a)flygoat.com>
Date: Fri, 14 Jun 2024 16:40:11 +0100
Subject: [PATCH] MIPS: dts: loongson: Fix ls2k1000-rtc interrupt
The correct interrupt line for RTC is line 8 on liointc1.
Fixes: e47084e116fc ("MIPS: Loongson64: DTS: Add RTC support to Loongson-2K1000")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jiaxun Yang <jiaxun.yang(a)flygoat.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend(a)alpha.franken.de>
diff --git a/arch/mips/boot/dts/loongson/loongson64-2k1000.dtsi b/arch/mips/boot/dts/loongson/loongson64-2k1000.dtsi
index 3f5255584c30..c3a57a0befa7 100644
--- a/arch/mips/boot/dts/loongson/loongson64-2k1000.dtsi
+++ b/arch/mips/boot/dts/loongson/loongson64-2k1000.dtsi
@@ -92,8 +92,8 @@ liointc1: interrupt-controller@1fe11440 {
rtc0: rtc@1fe07800 {
compatible = "loongson,ls2k1000-rtc";
reg = <0 0x1fe07800 0 0x78>;
- interrupt-parent = <&liointc0>;
- interrupts = <60 IRQ_TYPE_LEVEL_HIGH>;
+ interrupt-parent = <&liointc1>;
+ interrupts = <8 IRQ_TYPE_LEVEL_HIGH>;
};
uart0: serial@1fe00000 {
The patch below does not apply to the 6.10-stable tree.
If someone wants it applied there, or to any other stable or longterm
tree, then please email the backport, including the original git commit
id to <stable(a)vger.kernel.org>.
To reproduce the conflict and resubmit, you may use the following commands:
git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-6.10.y
git checkout FETCH_HEAD
git cherry-pick -x f70fd92df7529e7283e02a6c3a2510075f13ba30
# <resolve conflicts, build, test, etc.>
git commit -s
git send-email --to '<stable(a)vger.kernel.org>' --in-reply-to '2024073000-directory-swimmable-5878@gregkh' --subject-prefix 'PATCH 6.10.y' HEAD^..
Possible dependencies:
f70fd92df752 ("MIPS: dts: loongson: Fix ls2k1000-rtc interrupt")
dbb69b9d6234 ("MIPS: dts: loongson: Fix liointc IRQ polarity")
d89a415ff8d5 ("MIPS: Loongson64: DTS: Fix PCIe port nodes for ls7a")
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From f70fd92df7529e7283e02a6c3a2510075f13ba30 Mon Sep 17 00:00:00 2001
From: Jiaxun Yang <jiaxun.yang(a)flygoat.com>
Date: Fri, 14 Jun 2024 16:40:11 +0100
Subject: [PATCH] MIPS: dts: loongson: Fix ls2k1000-rtc interrupt
The correct interrupt line for RTC is line 8 on liointc1.
Fixes: e47084e116fc ("MIPS: Loongson64: DTS: Add RTC support to Loongson-2K1000")
Cc: stable(a)vger.kernel.org
Signed-off-by: Jiaxun Yang <jiaxun.yang(a)flygoat.com>
Signed-off-by: Thomas Bogendoerfer <tsbogend(a)alpha.franken.de>
diff --git a/arch/mips/boot/dts/loongson/loongson64-2k1000.dtsi b/arch/mips/boot/dts/loongson/loongson64-2k1000.dtsi
index 3f5255584c30..c3a57a0befa7 100644
--- a/arch/mips/boot/dts/loongson/loongson64-2k1000.dtsi
+++ b/arch/mips/boot/dts/loongson/loongson64-2k1000.dtsi
@@ -92,8 +92,8 @@ liointc1: interrupt-controller@1fe11440 {
rtc0: rtc@1fe07800 {
compatible = "loongson,ls2k1000-rtc";
reg = <0 0x1fe07800 0 0x78>;
- interrupt-parent = <&liointc0>;
- interrupts = <60 IRQ_TYPE_LEVEL_HIGH>;
+ interrupt-parent = <&liointc1>;
+ interrupts = <8 IRQ_TYPE_LEVEL_HIGH>;
};
uart0: serial@1fe00000 {