On 2024-11-01 15:21:24 [-0400], Sasha Levin wrote:
commit 052382490ee4f0f6d783ddce02fe6f2d15e134b5
Author: Wander Lairson Costa <wander@redhat.com>
Date:   Mon Oct 21 16:26:24 2024 -0700

igb: Disable threaded IRQ for igb_msix_other

[ Upstream commit 338c4d3902feb5be49bfda530a72c7ab860e2c9f ]

During testing of SR-IOV, Red Hat QE encountered an issue where the ip link up command intermittently fails for the igbvf interfaces when using the PREEMPT_RT variant. Investigation revealed that e1000_write_posted_mbx returns an error due to the lack of an ACK from e1000_poll_for_ack.

The underlying issue arises from the fact that IRQs are threaded by default under PREEMPT_RT. While the exact hardware details are not available, it appears that the IRQ handled by igb_msix_other must be processed before e1000_poll_for_ack times out. However, e1000_write_posted_mbx is called with preemption disabled, leading to a scenario where the IRQ is serviced only after the failure of e1000_write_posted_mbx.

To resolve this, we set IRQF_NO_THREAD for the affected interrupt, ensuring that the kernel handles it immediately, thereby preventing the aforementioned error.

Wander, please send a revert of this patch. The ISR (E1000_ICR_TS set) may invoke igb_msg_task(), ptp_clock_event(), igb_perout() and igb_extts(), each of which acquires sleeping locks on PREEMPT_RT. Not sure if this improved the situation or not.

Sebastian