Null pointer dereference occurs due to a race between smmu driver probe and client driver probe, when of_dma_configure() for client is called after the iommu_device_register() for smmu driver probe has executed but before the driver_bound() for smmu driver has been called.
Following is how the race occurs:
T1:Smmu device probe T2: Client device probe
really_probe() arm_smmu_device_probe() iommu_device_register() really_probe() platform_dma_configure() of_dma_configure() of_dma_configure_id() of_iommu_configure() iommu_probe_device() iommu_init_device() arm_smmu_probe_device() arm_smmu_get_by_fwnode() driver_find_device_by_fwnode() driver_find_device() next_device() klist_next() /* null ptr assigned to smmu */ /* null ptr dereference while smmu->streamid_mask */ driver_bound() klist_add_tail()
When this null smmu pointer is dereferenced later in arm_smmu_probe_device, the device crashes.
Fix this by deferring the probe of the client device until the smmu device has bound to the arm smmu driver.
Fixes: 021bb8420d44 ("iommu/arm-smmu: Wire up generic configuration support") Cc: stable@vger.kernel.org Co-developed-by: Prakash Gupta quic_guptap@quicinc.com Signed-off-by: Prakash Gupta quic_guptap@quicinc.com Signed-off-by: Pratyush Brahma quic_pbrahma@quicinc.com --- Changes in v2: Fix kernel test robot warning Add stable kernel list in cc Link to v1: https://lore.kernel.org/all/20241001055633.21062-1-quic_pbrahma@quicinc.com/
drivers/iommu/arm/arm-smmu/arm-smmu.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c index 723273440c21..7c778b7eb8c8 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c @@ -1437,6 +1437,9 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev) goto out_free; } else { smmu = arm_smmu_get_by_fwnode(fwspec->iommu_fwnode); + if (!smmu) + return ERR_PTR(dev_err_probe(dev, -EPROBE_DEFER, + "smmu dev has not bound yet\n")); }
ret = -EINVAL;
On 10/4/2024 2:34 PM, Pratyush Brahma wrote:
Null pointer dereference occurs due to a race between smmu driver probe and client driver probe, when of_dma_configure() for client is called after the iommu_device_register() for smmu driver probe has executed but before the driver_bound() for smmu driver has been called.
Following is how the race occurs:
T1:Smmu device probe T2: Client device probe
really_probe() arm_smmu_device_probe() iommu_device_register() really_probe() platform_dma_configure() of_dma_configure() of_dma_configure_id() of_iommu_configure() iommu_probe_device() iommu_init_device() arm_smmu_probe_device() arm_smmu_get_by_fwnode() driver_find_device_by_fwnode() driver_find_device() next_device() klist_next() /* null ptr assigned to smmu */ /* null ptr dereference while smmu->streamid_mask */ driver_bound() klist_add_tail()
When this null smmu pointer is dereferenced later in arm_smmu_probe_device, the device crashes.
Fix this by deferring the probe of the client device until the smmu device has bound to the arm smmu driver.
Fixes: 021bb8420d44 ("iommu/arm-smmu: Wire up generic configuration support") Cc: stable@vger.kernel.org Co-developed-by: Prakash Gupta quic_guptap@quicinc.com Signed-off-by: Prakash Gupta quic_guptap@quicinc.com Signed-off-by: Pratyush Brahma quic_pbrahma@quicinc.com
Changes in v2: Fix kernel test robot warning Add stable kernel list in cc Link to v1: https://lore.kernel.org/all/20241001055633.21062-1-quic_pbrahma@quicinc.com/
drivers/iommu/arm/arm-smmu/arm-smmu.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c index 723273440c21..7c778b7eb8c8 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c @@ -1437,6 +1437,9 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev) goto out_free; } else { smmu = arm_smmu_get_by_fwnode(fwspec->iommu_fwnode);
if (!smmu)
return ERR_PTR(dev_err_probe(dev, -EPROBE_DEFER,
}"smmu dev has not bound yet\n"));
ret = -EINVAL;
Hi Can someone please review this patch? Let me know if any further information is required.
Thanks Pratyush
On 15/10/2024 2:31 pm, Pratyush Brahma wrote:
On 10/4/2024 2:34 PM, Pratyush Brahma wrote:
Null pointer dereference occurs due to a race between smmu driver probe and client driver probe, when of_dma_configure() for client is called after the iommu_device_register() for smmu driver probe has executed but before the driver_bound() for smmu driver has been called.
Following is how the race occurs:
T1:Smmu device probe T2: Client device probe
really_probe() arm_smmu_device_probe() iommu_device_register() really_probe() platform_dma_configure() of_dma_configure() of_dma_configure_id() of_iommu_configure() iommu_probe_device() iommu_init_device() arm_smmu_probe_device() arm_smmu_get_by_fwnode() driver_find_device_by_fwnode() driver_find_device() next_device() klist_next() /* null ptr assigned to smmu */ /* null ptr dereference while smmu->streamid_mask */ driver_bound() klist_add_tail()
When this null smmu pointer is dereferenced later in arm_smmu_probe_device, the device crashes.
Fix this by deferring the probe of the client device until the smmu device has bound to the arm smmu driver.
Fixes: 021bb8420d44 ("iommu/arm-smmu: Wire up generic configuration support") Cc: stable@vger.kernel.org Co-developed-by: Prakash Gupta quic_guptap@quicinc.com Signed-off-by: Prakash Gupta quic_guptap@quicinc.com Signed-off-by: Pratyush Brahma quic_pbrahma@quicinc.com
Changes in v2: Fix kernel test robot warning Add stable kernel list in cc Link to v1: https://lore.kernel.org/all/20241001055633.21062-1-quic_pbrahma@quicinc.com/
drivers/iommu/arm/arm-smmu/arm-smmu.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c index 723273440c21..7c778b7eb8c8 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c @@ -1437,6 +1437,9 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev) goto out_free; } else { smmu = arm_smmu_get_by_fwnode(fwspec->iommu_fwnode); + if (!smmu) + return ERR_PTR(dev_err_probe(dev, -EPROBE_DEFER, + "smmu dev has not bound yet\n")); } ret = -EINVAL;
Hi Can someone please review this patch? Let me know if any further information is required.
This really shouldn't be leaking into drivers... :(
Honestly, I'm now so fed up of piling on hacks around the fundamental mis-design here, I think it's finally time to blow everything else off and spend a few days figuring out the most expedient way to fix it properly once and for all. Watch this space...
Thanks, Robin.
On 10/17/2024 7:24 PM, Robin Murphy wrote:
On 15/10/2024 2:31 pm, Pratyush Brahma wrote:
On 10/4/2024 2:34 PM, Pratyush Brahma wrote:
Null pointer dereference occurs due to a race between smmu driver probe and client driver probe, when of_dma_configure() for client is called after the iommu_device_register() for smmu driver probe has executed but before the driver_bound() for smmu driver has been called.
Following is how the race occurs:
T1:Smmu device probe T2: Client device probe
really_probe() arm_smmu_device_probe() iommu_device_register() really_probe() platform_dma_configure() of_dma_configure() of_dma_configure_id() of_iommu_configure() iommu_probe_device() iommu_init_device() arm_smmu_probe_device() arm_smmu_get_by_fwnode() driver_find_device_by_fwnode() driver_find_device() next_device() klist_next() /* null ptr assigned to smmu */ /* null ptr dereference while smmu->streamid_mask */ driver_bound() klist_add_tail()
When this null smmu pointer is dereferenced later in arm_smmu_probe_device, the device crashes.
Fix this by deferring the probe of the client device until the smmu device has bound to the arm smmu driver.
Fixes: 021bb8420d44 ("iommu/arm-smmu: Wire up generic configuration support") Cc: stable@vger.kernel.org Co-developed-by: Prakash Gupta quic_guptap@quicinc.com Signed-off-by: Prakash Gupta quic_guptap@quicinc.com Signed-off-by: Pratyush Brahma quic_pbrahma@quicinc.com
Changes in v2: Fix kernel test robot warning Add stable kernel list in cc Link to v1: https://lore.kernel.org/all/20241001055633.21062-1-quic_pbrahma@quicinc.com/
drivers/iommu/arm/arm-smmu/arm-smmu.c | 3 +++ 1 file changed, 3 insertions(+)
diff --git a/drivers/iommu/arm/arm-smmu/arm-smmu.c b/drivers/iommu/arm/arm-smmu/arm-smmu.c index 723273440c21..7c778b7eb8c8 100644 --- a/drivers/iommu/arm/arm-smmu/arm-smmu.c +++ b/drivers/iommu/arm/arm-smmu/arm-smmu.c @@ -1437,6 +1437,9 @@ static struct iommu_device *arm_smmu_probe_device(struct device *dev) goto out_free; } else { smmu = arm_smmu_get_by_fwnode(fwspec->iommu_fwnode); + if (!smmu) + return ERR_PTR(dev_err_probe(dev, -EPROBE_DEFER, + "smmu dev has not bound yet\n")); } ret = -EINVAL;
Hi Can someone please review this patch? Let me know if any further information is required.
This really shouldn't be leaking into drivers... :(
Honestly, I'm now so fed up of piling on hacks around the fundamental mis-design here, I think it's finally time to blow everything else off and spend a few days figuring out the most expedient way to fix it properly once and for all. Watch this space...
Thanks, Robin.
Hi Robin
Do you have any approaches in mind that you are currently considering or exploring?
Thanks Pratyush
On Fri, 04 Oct 2024 14:34:28 +0530, Pratyush Brahma wrote:
Null pointer dereference occurs due to a race between smmu driver probe and client driver probe, when of_dma_configure() for client is called after the iommu_device_register() for smmu driver probe has executed but before the driver_bound() for smmu driver has been called.
Following is how the race occurs:
[...]
Applied to will (for-joerg/arm-smmu/updates), thanks!
[1/1] iommu/arm-smmu: Defer probe of clients after smmu device bound https://git.kernel.org/will/c/229e6ee43d2a
Cheers,
linux-stable-mirror@lists.linaro.org