5.4-stable review patch. If anyone has any objections, please let me know.
------------------
From: Chengfeng Ye <dg573847474@gmail.com>
[ Upstream commit 2f19c4b8395ccb6eb25ccafee883c8cfbe3fc193 ]
handle_receive_interrupt_napi_sp(), which runs inside an interrupt handler, could introduce inverse lock ordering between &dd->irq_src_lock and &dd->uctxt_lock if read_mod_write() is preempted by the ISR while it holds &dd->irq_src_lock.
          [CPU0]                                  |          [CPU1]
hfi1_ipoib_dev_open()                             |
--> hfi1_netdev_enable_queues()                   |
--> enable_queues(rx)                             |
--> hfi1_rcvctrl()                                |
--> set_intr_bits()                               |
--> read_mod_write()                              |
--> spin_lock(&dd->irq_src_lock)                  |
                                                  | hfi1_poll()
                                                  | --> poll_next()
                                                  | --> spin_lock_irq(&dd->uctxt_lock)
                                                  |
                                                  | --> hfi1_rcvctrl()
                                                  | --> set_intr_bits()
                                                  | --> read_mod_write()
                                                  | --> spin_lock(&dd->irq_src_lock)
<interrupt>                                       |
   --> handle_receive_interrupt_napi_sp()         |
   --> set_all_fastpath()                         |
   --> hfi1_rcd_get_by_index()                    |
   --> spin_lock_irqsave(&dd->uctxt_lock)         |
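For readers who want to see the inversion as a graph problem, here is a minimal userspace sketch. It records "lock B taken while lock A is held" edges and reports a cycle. Only the two lock names come from this patch; the enum, the struct and every function name are hypothetical, and this is not how lockdep or any particular analysis tool is implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock IDs for this sketch; only the names mirror the patch. */
enum lock_id { IRQ_SRC_LOCK, UCTXT_LOCK, NUM_LOCKS };

/* edge[a][b] is true when some context takes lock b while holding lock a. */
struct lock_graph {
	bool edge[NUM_LOCKS][NUM_LOCKS];
};

static void graph_init(struct lock_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Record one observed acquisition order: 'taken' acquired under 'held'. */
static void record_order(struct lock_graph *g, enum lock_id held,
			 enum lock_id taken)
{
	assert(held < NUM_LOCKS && taken < NUM_LOCKS);
	g->edge[held][taken] = true;
}

/* Depth-first search: is 'target' reachable from 'from'? */
static bool reaches(const struct lock_graph *g, int from, int target,
		    bool *seen)
{
	if (from == target)
		return true;
	seen[from] = true;
	for (int next = 0; next < NUM_LOCKS; next++)
		if (g->edge[from][next] && !seen[next] &&
		    reaches(g, next, target, seen))
			return true;
	return false;
}

/* An inversion exists if some edge a -> b also has a path from b back to a. */
static bool has_inversion(const struct lock_graph *g)
{
	for (int a = 0; a < NUM_LOCKS; a++)
		for (int b = 0; b < NUM_LOCKS; b++) {
			bool seen[NUM_LOCKS] = { false };

			if (g->edge[a][b] && reaches(g, b, a, seen))
				return true;
		}
	return false;
}
```

In terms of the trace above, CPU1 contributes the edge UCTXT_LOCK -> IRQ_SRC_LOCK, and the interrupt on CPU0 contributes IRQ_SRC_LOCK -> UCTXT_LOCK. The second edge is only possible while irq_src_lock is held with interrupts enabled.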
This flaw was found by an experimental static analysis tool I am developing for irq-related deadlock.
To prevent the potential deadlock, the patch uses spin_lock_irqsave() on &dd->irq_src_lock inside read_mod_write(), so the ISR cannot run on the local CPU while the lock is held.
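As a rough illustration of why the irqsave variant closes the window, the following userspace model simulates one CPU with an "interrupts enabled" flag and an interrupt raised in the middle of the critical section. It is only a sketch: struct cpu_model and all model_* helpers are hypothetical names and do not correspond to kernel APIs. The flag save/restore mimics the documented behaviour of spin_lock_irqsave() and spin_unlock_irqrestore().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-CPU model; none of these names exist in the kernel. */
struct cpu_model {
	bool irqs_enabled;           /* local interrupt state */
	bool irq_pending;            /* device has raised an interrupt */
	bool lock_held;              /* stands in for dd->irq_src_lock */
	bool handler_ran_under_lock; /* true if the ISR interrupted the holder */
};

/* Run the pending handler only if local interrupts are enabled. */
static void deliver_irq(struct cpu_model *cpu)
{
	if (!cpu->irq_pending || !cpu->irqs_enabled)
		return;
	cpu->irq_pending = false;
	if (cpu->lock_held)
		cpu->handler_ran_under_lock = true;
}

/* Models spin_lock(): interrupts stay as they were. */
static void model_lock(struct cpu_model *cpu)
{
	cpu->lock_held = true;
}

static void model_unlock(struct cpu_model *cpu)
{
	cpu->lock_held = false;
}

/* Models spin_lock_irqsave(): save the interrupt state, then disable. */
static bool model_lock_irqsave(struct cpu_model *cpu)
{
	bool flags = cpu->irqs_enabled;

	cpu->irqs_enabled = false;
	cpu->lock_held = true;
	return flags;
}

/* Models spin_unlock_irqrestore(): drop the lock, restore, then deliver. */
static void model_unlock_irqrestore(struct cpu_model *cpu, bool flags)
{
	cpu->lock_held = false;
	cpu->irqs_enabled = flags;
	deliver_irq(cpu);
}

/* Returns true if the handler ran while the lock was still held. */
static bool run_section(bool use_irqsave)
{
	struct cpu_model cpu = { .irqs_enabled = true };
	bool flags = true;

	if (use_irqsave)
		flags = model_lock_irqsave(&cpu);
	else
		model_lock(&cpu);

	cpu.irq_pending = true; /* interrupt arrives mid-section */
	deliver_irq(&cpu);

	if (use_irqsave)
		model_unlock_irqrestore(&cpu, flags);
	else
		model_unlock(&cpu);

	return cpu.handler_ran_under_lock;
}
```

Calling run_section(false) models the old code, where the handler can run on top of the lock holder. Calling run_section(true) models the patched code, where the handler is deferred until the lock has been released.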
Signed-off-by: Chengfeng Ye <dg573847474@gmail.com>
Link: https://lore.kernel.org/r/20230926101116.2797-1-dg573847474@gmail.com
Acked-by: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 drivers/infiniband/hw/hfi1/chip.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)
diff --git a/drivers/infiniband/hw/hfi1/chip.c b/drivers/infiniband/hw/hfi1/chip.c
index 65d6bf34614c8..6cf87fcfc4eb5 100644
--- a/drivers/infiniband/hw/hfi1/chip.c
+++ b/drivers/infiniband/hw/hfi1/chip.c
@@ -13067,15 +13067,16 @@ static void read_mod_write(struct hfi1_devdata *dd, u16 src, u64 bits,
 {
 	u64 reg;
 	u16 idx = src / BITS_PER_REGISTER;
+	unsigned long flags;
 
-	spin_lock(&dd->irq_src_lock);
+	spin_lock_irqsave(&dd->irq_src_lock, flags);
 	reg = read_csr(dd, CCE_INT_MASK + (8 * idx));
 	if (set)
 		reg |= bits;
 	else
 		reg &= ~bits;
 	write_csr(dd, CCE_INT_MASK + (8 * idx), reg);
-	spin_unlock(&dd->irq_src_lock);
+	spin_unlock_irqrestore(&dd->irq_src_lock, flags);
 }
 
 /**