[ Upstream commit b46064a18810bad3aea089a79993ca5ea7a3d2b2 ]
It turns out that deferred default domain creation leaves a subtle race window during iommu_device_register() wherein a client driver may asynchronously probe in parallel and get as far as performing DMA API operations with dma-direct, only to be switched to iommu-dma underfoot once the default domain attachment finally happens, with obviously disastrous consequences. Even the wonky of_iommu_configure() path is at risk, since iommu_fwspec_init() will no longer defer client probe as the instance ops are (necessarily) already registered, and the "replay" iommu_probe_device() call can see dev->iommu_group already set and so think there's nothing to do either.
Fortunately we already have the right tool in the right place in the form of iommu_device_use_default_domain(), which just needs to ensure that said default domain is actually ready to *be* used. Deferring the client probe shouldn't have too much impact, given that this only happens while the IOMMU driver is probing, and thus due to kick the deferred probe list again once it finishes.
[ Backport: The above is true for mainline, but here we still have arch_setup_dma_ops() to worry about, which is not replayed if the default domain happens to be allocated *between* that call and subsequently reaching iommu_device_use_default_domain(), so we need an additional earlier check to cover that case. Also we're now back before the nominal commit 98ac73f99bc4 so we need to tweak the logic to depend on IOMMU_DMA as well, to avoid falsely deferring on architectures not using default domains. This then serves us back as far as f188056352bc, where this specific form of the problem first arises. ]
Reported-by: Charan Teja Kalla quic_charante@quicinc.com Fixes: 98ac73f99bc4 ("iommu: Require a default_domain for all iommu drivers") Fixes: f188056352bc ("iommu: Avoid locking/unlocking for iommu_probe_device()") Link: https://lore.kernel.org/r/e88b94c9b575034a2c98a48b3d383654cbda7902.174075326... Signed-off-by: Robin Murphy robin.murphy@arm.com --- drivers/iommu/iommu.c | 16 ++++++++++++++++ 1 file changed, 16 insertions(+)
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c index 3f1029c0825e..29b48a6a0275 100644 --- a/drivers/iommu/iommu.c +++ b/drivers/iommu/iommu.c @@ -566,6 +566,17 @@ int iommu_probe_device(struct device *dev) mutex_lock(&iommu_probe_device_lock); ret = __iommu_probe_device(dev, NULL); mutex_unlock(&iommu_probe_device_lock); + + /* + * The dma_configure replay paths need bus_iommu_probe() to + * finish before they can call arch_setup_dma_ops() + */ + if (IS_ENABLED(CONFIG_IOMMU_DMA) && !ret && dev->iommu_group) { + mutex_lock(&dev->iommu_group->mutex); + if (!dev->iommu_group->default_domain) + ret = -EPROBE_DEFER; + mutex_unlock(&dev->iommu_group->mutex); + } if (ret) return ret;
@@ -3149,6 +3160,11 @@ int iommu_device_use_default_domain(struct device *dev) return 0;
mutex_lock(&group->mutex); + /* We may race against bus_iommu_probe() finalising groups here */ + if (!group->default_domain) { + ret = -EPROBE_DEFER; + goto unlock_out; + } if (group->owner_cnt) { if (group->owner || !iommu_is_default_domain(group) || !xa_empty(&group->pasid_array)) {
[ Sasha's backport helper bot ]
Hi,
✅ All tests passed successfully. No issues detected. No action required from the submitter.
The upstream commit SHA1 provided is correct: b46064a18810bad3aea089a79993ca5ea7a3d2b2
Status in newer kernel trees: 6.14.y | Present (different SHA1: 32ed1cb03461) 6.12.y | Not found
Note: The patch differs from the upstream commit: --- 1: b46064a18810b < -: ------------- iommu: Handle race with default domain setup -: ------------- > 1: c855fb779034b iommu: Handle race with default domain setup ---
Results of testing on various branches:
| Branch | Patch Apply | Build Test | |---------------------------|-------------|------------| | stable/linux-6.6.y | Success | Success |
On Fri, Apr 25, 2025 at 03:19:22PM +0100, Robin Murphy wrote:
[ Upstream commit b46064a18810bad3aea089a79993ca5ea7a3d2b2 ]
<snip>
You seem to have forgotten about a backport for 6.12.y as well. You know we can't take backports that skip stable branches, otherwise users will have regressions when they upgrade.
Please resend this when you also send a 6.12.y backport, and we will be glad to queue both up.
thanks,
greg k-h
On 29/04/2025 8:35 am, Greg KH wrote:
On Fri, Apr 25, 2025 at 03:19:22PM +0100, Robin Murphy wrote:
[ Upstream commit b46064a18810bad3aea089a79993ca5ea7a3d2b2 ]
<snip>
You seem to have forgotten about a backport for 6.12.y as well. You know we can't take backports that skip stable branches, otherwise users will have regressions when they upgrade.
Please resend this when you also send a 6.12.y backport, and we will be glad to queue both up.
Hmm, 6.12.y should have had the mainline patch already, I wonder why it didn't get picked? Admittedly I didn't explicitly CC stable, but I'm so used to autosel picking up fixes tags these days... I resent anyway for good measure, given what else I also managed to mess up last time, sorry for any confusion.
Thanks, Robin.
linux-stable-mirror@lists.linaro.org