On 15/10/2024 2:31 pm, Pratyush Brahma wrote:
On 10/4/2024 2:34 PM, Pratyush Brahma wrote:
Null pointer dereference occurs due to a race between smmu driver probe and client driver probe, when of_dma_configure() for client is called after the iommu_device_register() for smmu driver probe has executed but before the driver_bound() for smmu driver has been called.
Following is how the race occurs:
T1:Smmu device probe T2: Client device probe
really_probe() arm_smmu_device_probe() iommu_device_register() really_probe() platform_dma_configure() of_dma_configure() of_dma_configure_id() of_iommu_configure() iommu_probe_device() iommu_init_device() arm_smmu_probe_device() arm_smmu_get_by_fwnode() driver_find_device_by_fwnode() driver_find_device() next_device() klist_next() /* null ptr assigned to smmu */ /* null ptr dereference while smmu->streamid_mask */ driver_bound() klist_add_tail()
When this null smmu pointer is dereferenced later in arm_smmu_probe_device, the device crashes.
Fix this by deferring the probe of the client device until the smmu device has bound to the arm smmu driver.
Fixes: 021bb8420d44 ("iommu/arm-smmu: Wire up generic configuration support") Cc: stable@vger.kernel.org Co-developed-by: Prakash Gupta quic_guptap@quicinc.com Signed-off-by: Prakash Gupta quic_guptap@quicinc.com Signed-off-by: Pratyush Brahma quic_pbrahma@quicinc.com
Changes in v2: Fix kernel test robot warning Add stable kernel list in cc Link to v1: https://lore.kernel.org/all/20241001055633.21062-1-quic_pbrahma@quicinc.com/
drivers/iommu/arm/arm-smmu/arm-smmu.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c index 723273440c21..7c778b7eb8c8 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c @@ -1437,6 +1437,9 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev) goto out_free; } else { smmu = arm_smmu_get_by_fwnode(fwspec->iommu_fwnode); + if (!smmu) + return ERR_PTR(dev_err_probe(dev, -EPROBE_DEFER, + "smmu dev has not bound yet\n")); } ret = -EINVAL;
Hi Can someone please review this patch? Let me know if any further information is required.
This really shouldn't be leaking into drivers... :(
Honestly, I'm now so fed up of piling on hacks around the fundamental mis-design here, I think it's finally time to blow everything else off and spend a few days figuring out the most expedient way to fix it properly once and for all. Watch this space...
Thanks, Robin.