On 06.11.24 13:01, Greg Kroah-Hartman wrote:
6.11-stable review patch. If anyone has any objections, please let me know.
From: Wander Lairson Costa wander@redhat.com
[ Upstream commit 338c4d3902feb5be49bfda530a72c7ab860e2c9f ]
During testing of SR-IOV, Red Hat QE encountered an issue where the ip link up command intermittently fails for the igbvf interfaces when using the PREEMPT_RT variant. Investigation revealed that e1000_write_posted_mbx returns an error due to the lack of an ACK from e1000_poll_for_ack.
The underlying issue arises from the fact that IRQs are threaded by default under PREEMPT_RT. While the exact hardware details are not available, it appears that the IRQ handled by igb_msix_other must be processed before e1000_poll_for_ack times out. However, e1000_write_posted_mbx is called with preemption disabled, leading to a scenario where the IRQ is serviced only after the failure of e1000_write_posted_mbx.
To resolve this, we set IRQF_NO_THREAD for the affected interrupt, ensuring that the kernel handles it immediately, thereby preventing the aforementioned error.
Reproducer:
#!/bin/bash # echo 2 > /sys/class/net/ens14f0/device/sriov_numvfs ipaddr_vlan=3 nic_test=ens14f0 vf=${nic_test}v0 while true; do ip link set ${nic_test} mtu 1500 ip link set ${vf} mtu 1500 ip link set $vf up ip link set ${nic_test} vf 0 vlan ${ipaddr_vlan} ip addr add 172.30.${ipaddr_vlan}.1/24 dev ${vf} ip addr add 2021:db8:${ipaddr_vlan}::1/64 dev ${vf} if ! ip link show $vf | grep 'state UP'; then echo 'Error found' break fi ip link set $vf down done
Signed-off-by: Wander Lairson Costa wander@redhat.com Fixes: 9d5c824399de ("igb: PCI-Express 82575 Gigabit Ethernet driver") Reported-by: Yuying Ma yuma@redhat.com Reviewed-by: Przemek Kitszel przemyslaw.kitszel@intel.com Tested-by: Rafal Romanowski rafal.romanowski@intel.com Signed-off-by: Jacob Keller jacob.e.keller@intel.com Reviewed-by: Simon Horman horms@kernel.org Signed-off-by: Paolo Abeni pabeni@redhat.com Signed-off-by: Sasha Levin sashal@kernel.org
drivers/net/ethernet/intel/igb/igb_main.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c index f1d0881687233..b83df5f94b1f5 100644 --- a/drivers/net/ethernet/intel/igb/igb_main.c +++ b/drivers/net/ethernet/intel/igb/igb_main.c @@ -907,7 +907,7 @@ static int igb_request_msix(struct igb_adapter *adapter) int i, err = 0, vector = 0, free_vector = 0; err = request_irq(adapter->msix_entries[vector].vector,
igb_msix_other, 0, netdev->name, adapter);
if (err) goto err_out;igb_msix_other, IRQF_NO_THREAD, netdev->name, adapter);
This is scheduled for being reverted upstream [1]. Please drop from all stable queues.
(credits go to Florian for pointing me to this rt-suspicious patch)
Jan
[1] https://lore.kernel.org/all/20241104124050.22290-1-wander@redhat.com/