PASID (Process Address Space ID) is a PCIe extension that tags the DMA transactions issued by a physical device, and most modern IOMMU hardware supports PASID-granular address translation. A PASID-capable device can therefore be attached to multiple hwpts (a.k.a. domains), with each attachment tagged by a pasid.
This series is based on a preparation series [1]. It first adds a missing iommu API to replace the domain for a pasid. Based on the iommu pasid attach/replace/detach APIs, this series then adds iommufd APIs that let device drivers attach/replace/detach a pasid to/from a hwpt at userspace's request, and adds selftests to validate the iommufd APIs.
The complete code can be found at the link below [2]. Heads up! The existing iommufd selftest is broken. There is a fix [3] for it, but it has not been upstreamed yet. If you want to run the iommufd selftest, please apply that fix. Sorry for the inconvenience.
[1] https://lore.kernel.org/linux-iommu/20240628085538.47049-1-yi.l.liu@intel.co...
[2] https://github.com/yiliu1765/iommufd/tree/iommufd_pasid
[3] https://lore.kernel.org/linux-iommu/20240111073213.180020-1-baolu.lu@linux.i...
Change log:
v3:
 - Split the set_dev_pasid op enhancements for domain replacement into a separate series "Make set_dev_pasid op supporting domain replacement" [1]. The changes below are made in that separate series.
   *) The set_dev_pasid() callback should keep the old config if it fails to attach to a domain. This simplifies the caller a lot, as the caller does not need to attach it back to the old domain explicitly. This also avoids some corner cases in which the core may do a duplicated domain attachment, as described in the link below (Jason)
      https://lore.kernel.org/linux-iommu/BN9PR11MB52768C98314A95AFCD2FA6478C0F2@B...
   *) Drop patch 10 of v2 as it's a bug fix and can be submitted separately (Kevin)
   *) Rebase on top of Baolu's domain_alloc_paging refactor series (Jason)
 - Drop the attach_data, which includes attach_fn and pasid; instead pass the pasid through the device attach path. (Jason)
 - Add a pasid-num-bits property to the mock dev to make the pasid selftest work (Kevin)
v2: https://lore.kernel.org/linux-iommu/20240412081516.31168-1-yi.l.liu@intel.co...
 - Domain replace for pasid should be handled in the set_dev_pasid() callbacks instead of calling remove_dev_pasid and then set_dev_pasid afterward in the iommu layer (Jason)
 - Make xarray operations more self-contained in iommufd pasid attach/replace/detach (Jason)
 - Tweak dev_iommu_get_max_pasids() to allow the iommu driver to populate max_pasids. This makes it simpler for the iommufd selftest to meet the max_pasids check in iommu_attach_device_pasid() (Jason)
v1: https://lore.kernel.org/kvm/20231127063428.127436-1-yi.l.liu@intel.com/#r
 - Implement iommu_replace_device_pasid() to fall back to the original domain if the replacement fails (Kevin)
 - Add a check in do_attach() for the corresponding attach_fn per the pasid value.
rfc: https://lore.kernel.org/linux-iommu/20230926092651.17041-1-yi.l.liu@intel.co...
Regards, Yi Liu
Yi Liu (7):
  iommu: Introduce a replace API for device pasid
  iommufd: Pass pasid through the device attach/replace path
  iommufd: Support attach/replace hwpt per pasid
  iommufd/selftest: Add set_dev_pasid and remove_dev_pasid in mock iommu
  iommufd/selftest: Add a helper to get test device
  iommufd/selftest: Add test ops to test pasid attach/detach
  iommufd/selftest: Add coverage for iommufd pasid attach/detach
 drivers/iommu/iommu-priv.h                    |   3 +
 drivers/iommu/iommu.c                         |  80 ++++++-
 drivers/iommu/iommufd/Makefile                |   1 +
 drivers/iommu/iommufd/device.c                |  31 +--
 drivers/iommu/iommufd/iommufd_private.h       |  15 ++
 drivers/iommu/iommufd/iommufd_test.h          |  30 +++
 drivers/iommu/iommufd/pasid.c                 | 157 +++++++++++++
 drivers/iommu/iommufd/selftest.c              | 206 ++++++++++++++++-
 include/linux/iommufd.h                       |   6 +
 tools/testing/selftests/iommu/iommufd.c       | 207 ++++++++++++++++++
 .../selftests/iommu/iommufd_fail_nth.c        |  28 ++-
 tools/testing/selftests/iommu/iommufd_utils.h |  78 +++++++
 12 files changed, 808 insertions(+), 34 deletions(-)
 create mode 100644 drivers/iommu/iommufd/pasid.c
Provide a high-level API to allow replacements of one domain with another for specific pasid of a device. This is similar to iommu_group_replace_domain() and it is expected to be used only by IOMMUFD.
Co-developed-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Lu Baolu <baolu.lu@linux.intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 drivers/iommu/iommu-priv.h |  3 ++
 drivers/iommu/iommu.c      | 80 ++++++++++++++++++++++++++++++++++++--
 2 files changed, 79 insertions(+), 4 deletions(-)
diff --git a/drivers/iommu/iommu-priv.h b/drivers/iommu/iommu-priv.h
index 5f731d994803..0949c02cee93 100644
--- a/drivers/iommu/iommu-priv.h
+++ b/drivers/iommu/iommu-priv.h
@@ -20,6 +20,9 @@ static inline const struct iommu_ops *dev_iommu_ops(struct device *dev)
 int iommu_group_replace_domain(struct iommu_group *group,
 			       struct iommu_domain *new_domain);
 
+int iommu_replace_device_pasid(struct iommu_domain *domain,
+			       struct device *dev, ioasid_t pasid);
+
 int iommu_device_register_bus(struct iommu_device *iommu,
 			      const struct iommu_ops *ops,
 			      const struct bus_type *bus,
diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index b3a1dabed2dd..2d64582b7c43 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -3268,14 +3268,15 @@ bool iommu_group_dma_owner_claimed(struct iommu_group *group)
 EXPORT_SYMBOL_GPL(iommu_group_dma_owner_claimed);
 
 static int __iommu_set_group_pasid(struct iommu_domain *domain,
-				   struct iommu_group *group, ioasid_t pasid)
+				   struct iommu_group *group, ioasid_t pasid,
+				   struct iommu_domain *old)
 {
 	struct group_device *device, *last_gdev;
 	int ret;
 
 	for_each_group_device(group, device) {
 		ret = domain->ops->set_dev_pasid(domain, device->dev,
-						 pasid, NULL);
+						 pasid, old);
 		if (ret)
 			goto err_revert;
 	}
@@ -3289,7 +3290,20 @@ static int __iommu_set_group_pasid(struct iommu_domain *domain,
 
 		if (device == last_gdev)
 			break;
-		ops->remove_dev_pasid(device->dev, pasid, domain);
+		/* If no old domain, undo the succeeded devices/pasid */
+		if (!old) {
+			ops->remove_dev_pasid(device->dev, pasid, domain);
+			continue;
+		}
+
+		/*
+		 * Rollback the succeeded devices/pasid to the old domain.
+		 * And it is a driver bug to fail attaching with a previously
+		 * good domain.
+		 */
+		if (WARN_ON(old->ops->set_dev_pasid(old, device->dev,
+						    pasid, domain)))
+			ops->remove_dev_pasid(device->dev, pasid, domain);
 	}
 	return ret;
 }
@@ -3348,7 +3362,7 @@ int iommu_attach_device_pasid(struct iommu_domain *domain,
 		goto out_unlock;
 	}
 
-	ret = __iommu_set_group_pasid(domain, group, pasid);
+	ret = __iommu_set_group_pasid(domain, group, pasid, NULL);
 	if (ret)
 		xa_erase(&group->pasid_array, pasid);
 out_unlock:
@@ -3357,6 +3371,64 @@ int iommu_attach_device_pasid(struct iommu_domain *domain,
 }
 EXPORT_SYMBOL_GPL(iommu_attach_device_pasid);
 
+/**
+ * iommu_replace_device_pasid - Replace the domain that a pasid is attached to
+ * @domain: the new iommu domain
+ * @dev: the attached device.
+ * @pasid: the pasid of the device.
+ *
+ * This API allows the pasid to switch domains. Return 0 on success, or an
+ * error. The pasid will keep the old configuration if replacement failed.
+ */
+int iommu_replace_device_pasid(struct iommu_domain *domain,
+			       struct device *dev, ioasid_t pasid)
+{
+	/* Caller must be a probed driver on dev */
+	struct iommu_group *group = dev->iommu_group;
+	void *curr;
+	int ret;
+
+	if (!domain->ops->set_dev_pasid)
+		return -EOPNOTSUPP;
+
+	if (!group)
+		return -ENODEV;
+
+	if (!dev_has_iommu(dev) || dev_iommu_ops(dev) != domain->owner ||
+	    pasid == IOMMU_NO_PASID)
+		return -EINVAL;
+
+	mutex_lock(&group->mutex);
+	/*
+	 * The recorded domain is inconsistent with the domain pasid is
+	 * actually attached until pasid is attached to the new domain.
+	 * This has race condition with the paths that do not hold
+	 * group->mutex. E.g. the Page Request forwarding.
+	 */
+	curr = xa_store(&group->pasid_array, pasid, domain, GFP_KERNEL);
+	if (!curr) {
+		xa_erase(&group->pasid_array, pasid);
+		ret = -EINVAL;
+		goto out_unlock;
+	}
+
+	ret = xa_err(curr);
+	if (ret)
+		goto out_unlock;
+
+	if (curr == domain)
+		goto out_unlock;
+
+	ret = __iommu_set_group_pasid(domain, group, pasid, curr);
+	if (ret)
+		WARN_ON(domain != xa_store(&group->pasid_array, pasid,
+					   curr, GFP_KERNEL));
+out_unlock:
+	mutex_unlock(&group->mutex);
+	return ret;
+}
+EXPORT_SYMBOL_NS_GPL(iommu_replace_device_pasid, IOMMUFD_INTERNAL);
+
 /*
  * iommu_detach_device_pasid() - Detach the domain from pasid of device
  * @domain: the iommu domain.
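For readers following the rollback logic in __iommu_set_group_pasid(), here is a small userspace model. It is not kernel code: the model_* names, the fixed three-device group, and the fail_on_dev knob are all assumptions made for illustration. The sketch only mirrors the control flow of the patch: try every device in the group, and on failure return the devices that already succeeded to the old domain, or detach them when there is no old domain.

```c
#include <assert.h>
#include <stddef.h>

#define NDEV 3	/* hypothetical group size for the model */

/* Hypothetical stand-in for an iommu domain; not the kernel's type. */
struct model_domain {
	int id;
	int fail_on_dev;	/* device index at which attaching fails, or -1 */
};

struct model_group {
	/* attached[i] is the domain device i uses for the pasid, or NULL */
	const struct model_domain *attached[NDEV];
};

/* Models set_dev_pasid(): on failure the device keeps its old config. */
int model_set_dev_pasid(struct model_group *g, int dev,
			const struct model_domain *new_dom,
			const struct model_domain *old)
{
	assert(g->attached[dev] == old);
	if (new_dom->fail_on_dev == dev)
		return -1;
	g->attached[dev] = new_dom;
	return 0;
}

/* Models __iommu_set_group_pasid() with the extra "old" argument. */
int model_set_group_pasid(struct model_group *g,
			  const struct model_domain *new_dom,
			  const struct model_domain *old)
{
	int i, last, ret = 0;

	for (i = 0; i < NDEV; i++) {
		ret = model_set_dev_pasid(g, i, new_dom, old);
		if (ret)
			break;
	}
	if (!ret)
		return 0;

	/* Only devices before the failing one need to be reverted. */
	last = i;
	for (i = 0; i < last; i++) {
		if (!old) {
			g->attached[i] = NULL;	/* models remove_dev_pasid() */
			continue;
		}
		/* Go back to the old domain; detach if even that fails. */
		if (model_set_dev_pasid(g, i, old, new_dom))
			g->attached[i] = NULL;
	}
	return ret;
}
```

In this model a failed replace leaves every device on the old domain, which is the "keep the old configuration" behaviour the kernel-doc of iommu_replace_device_pasid() describes.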
From: Liu, Yi L yi.l.liu@intel.com Sent: Friday, June 28, 2024 5:06 PM
@@ -3289,7 +3290,20 @@ static int __iommu_set_group_pasid(struct iommu_domain *domain,
 		if (device == last_gdev)
 			break;
-		ops->remove_dev_pasid(device->dev, pasid, domain);
+		/* If no old domain, undo the succeeded devices/pasid */
+		if (!old) {
+			ops->remove_dev_pasid(device->dev, pasid, domain);
+			continue;
+		}
+
+		/*
+		 * Rollback the succeeded devices/pasid to the old domain.
+		 * And it is a driver bug to fail attaching with a previously
+		 * good domain.
+		 */
+		if (WARN_ON(old->ops->set_dev_pasid(old, device->dev,
+						    pasid, domain)))
+			ops->remove_dev_pasid(device->dev, pasid, domain);
I wonder whether @remove_dev_pasid() can be replaced by having blocking_domain support @set_dev_pasid?
+int iommu_replace_device_pasid(struct iommu_domain *domain,
+			       struct device *dev, ioasid_t pasid)
+{
+	/* Caller must be a probed driver on dev */
+	struct iommu_group *group = dev->iommu_group;
+	void *curr;
+	int ret;
+
+	if (!domain->ops->set_dev_pasid)
+		return -EOPNOTSUPP;
+
+	if (!group)
+		return -ENODEV;
+
+	if (!dev_has_iommu(dev) || dev_iommu_ops(dev) != domain->owner ||
+	    pasid == IOMMU_NO_PASID)
+		return -EINVAL;
+
+	mutex_lock(&group->mutex);
+	/*
+	 * The recorded domain is inconsistent with the domain pasid is
+	 * actually attached until pasid is attached to the new domain.
+	 * This has race condition with the paths that do not hold
+	 * group->mutex. E.g. the Page Request forwarding.
+	 */
so?
+	curr = xa_store(&group->pasid_array, pasid, domain, GFP_KERNEL);
+	if (!curr) {
+		xa_erase(&group->pasid_array, pasid);
+		ret = -EINVAL;
+		goto out_unlock;
+	}
+
+	ret = xa_err(curr);
+	if (ret)
+		goto out_unlock;
+
+	if (curr == domain)
+		goto out_unlock;
+
+	ret = __iommu_set_group_pasid(domain, group, pasid, curr);
+	if (ret)
+		WARN_ON(domain != xa_store(&group->pasid_array, pasid,
+					   curr, GFP_KERNEL));
above can follow Jason's suggestion to iommu_group_replace_domain () in Baolu's series, i.e. doing a xa_reserve() first.
On 2024/7/18 16:27, Tian, Kevin wrote:
From: Liu, Yi L yi.l.liu@intel.com Sent: Friday, June 28, 2024 5:06 PM
@@ -3289,7 +3290,20 @@ static int __iommu_set_group_pasid(struct iommu_domain *domain,
 		if (device == last_gdev)
 			break;
-		ops->remove_dev_pasid(device->dev, pasid, domain);
+		/* If no old domain, undo the succeeded devices/pasid */
+		if (!old) {
+			ops->remove_dev_pasid(device->dev, pasid, domain);
+			continue;
+		}
+
+		/*
+		 * Rollback the succeeded devices/pasid to the old domain.
+		 * And it is a driver bug to fail attaching with a previously
+		 * good domain.
+		 */
+		if (WARN_ON(old->ops->set_dev_pasid(old, device->dev,
+						    pasid, domain)))
+			ops->remove_dev_pasid(device->dev, pasid, domain);
I wonder whether @remove_dev_pasid() can be replaced by having blocking_domain support @set_dev_pasid?
how about your thought, @Jason?
+int iommu_replace_device_pasid(struct iommu_domain *domain,
+			       struct device *dev, ioasid_t pasid)
+{
+	/* Caller must be a probed driver on dev */
+	struct iommu_group *group = dev->iommu_group;
+	void *curr;
+	int ret;
+
+	if (!domain->ops->set_dev_pasid)
+		return -EOPNOTSUPP;
+
+	if (!group)
+		return -ENODEV;
+
+	if (!dev_has_iommu(dev) || dev_iommu_ops(dev) != domain->owner ||
+	    pasid == IOMMU_NO_PASID)
+		return -EINVAL;
+
+	mutex_lock(&group->mutex);
+	/*
+	 * The recorded domain is inconsistent with the domain pasid is
+	 * actually attached until pasid is attached to the new domain.
+	 * This has race condition with the paths that do not hold
+	 * group->mutex. E.g. the Page Request forwarding.
+	 */
so?
This is added per the comment below. Maybe I should have made it clearer: due to the order of the xa operations, the domain recorded in the xarray temporarily does not match the actual translation structure, but the two become consistent in the end.
https://lore.kernel.org/linux-iommu/20240429135512.GC941030@nvidia.com/
+	curr = xa_store(&group->pasid_array, pasid, domain, GFP_KERNEL);
+	if (!curr) {
+		xa_erase(&group->pasid_array, pasid);
+		ret = -EINVAL;
+		goto out_unlock;
+	}
+
+	ret = xa_err(curr);
+	if (ret)
+		goto out_unlock;
+
+	if (curr == domain)
+		goto out_unlock;
+
+	ret = __iommu_set_group_pasid(domain, group, pasid, curr);
+	if (ret)
+		WARN_ON(domain != xa_store(&group->pasid_array, pasid,
+					   curr, GFP_KERNEL));
above can follow Jason's suggestion to iommu_group_replace_domain () in Baolu's series, i.e. doing a xa_reserve() first.
Yeah, I noticed it. But there is a minor difference: in Baolu's series there is no need to retrieve the old domain, whereas this path needs to get it and pass it to set_dev_pasid().
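The ordering discussed in this exchange (record the new domain first, program the hardware second, and restore the record on failure) can be sketched in userspace as follows. This is only a model under stated assumptions: model_pasid_slot, its two fields, the program_fails flag, and the -EIO error value are hypothetical, and the xa_reserve() variant suggested above is not modelled.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical model: what the bookkeeping says vs. what hardware uses. */
struct model_pasid_slot {
	const void *recorded;	/* stands in for the group->pasid_array entry */
	const void *programmed;	/* stands in for the translation structure */
};

/*
 * Replace the domain for one pasid, updating the record before the
 * hardware. program_fails simulates set_dev_pasid() returning an error.
 */
int model_replace_record_first(struct model_pasid_slot *slot,
			       const void *new_domain, int program_fails)
{
	const void *old = slot->recorded;

	if (!old)
		return -EINVAL;		/* nothing attached: replace is invalid */
	if (old == new_domain)
		return 0;		/* already attached to this domain */

	slot->recorded = new_domain;	/* window opens: recorded != programmed */
	if (program_fails) {
		slot->recorded = old;	/* restore the record; hardware untouched */
		return -EIO;
	}
	slot->programmed = new_domain;	/* window closes: both agree again */
	return 0;
}
```

The window between the two assignments is the inconsistency that the code comment warns about for paths that do not hold group->mutex.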
On Fri, Aug 16, 2024 at 05:43:18PM +0800, Yi Liu wrote:
On 2024/7/18 16:27, Tian, Kevin wrote:
From: Liu, Yi L yi.l.liu@intel.com Sent: Friday, June 28, 2024 5:06 PM
@@ -3289,7 +3290,20 @@ static int __iommu_set_group_pasid(struct iommu_domain *domain,
 		if (device == last_gdev)
 			break;
-		ops->remove_dev_pasid(device->dev, pasid, domain);
+		/* If no old domain, undo the succeeded devices/pasid */
+		if (!old) {
+			ops->remove_dev_pasid(device->dev, pasid, domain);
+			continue;
+		}
+
+		/*
+		 * Rollback the succeeded devices/pasid to the old domain.
+		 * And it is a driver bug to fail attaching with a previously
+		 * good domain.
+		 */
+		if (WARN_ON(old->ops->set_dev_pasid(old, device->dev,
+						    pasid, domain)))
+			ops->remove_dev_pasid(device->dev, pasid, domain);
I wonder whether @remove_dev_pasid() can be replaced by having blocking_domain support @set_dev_pasid?
how about your thought, @Jason?
I think we talked about doing that once before, I forget why it was not done. Maybe there was an issue?
But it seems worth trying.
I would like to see set_dev_pasid pass in the old domain first:
int (*set_dev_pasid)(struct iommu_domain *new_domain, struct iommu_domain *old_domain, struct device *dev, ioasid_t pasid);
Replace includes the old_domain as an argument and it is necessary information..
A quick try on SMMUv3 seems reasonable:
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 9bc50bded5af72..f512bfe5cd202c 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2931,13 +2931,12 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master,
 	return ret;
 }
 
-static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid,
-				      struct iommu_domain *domain)
+static void arm_smmu_blocking_set_dev_pasid(struct iommu_domain *new_domain,
+					    struct iommu_domain *old_domain,
+					    struct device *dev, ioasid_t pasid)
 {
 	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
-	struct arm_smmu_domain *smmu_domain;
-
-	smmu_domain = to_smmu_domain(domain);
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(old_domain);
 
 	mutex_lock(&arm_smmu_asid_lock);
 	arm_smmu_clear_cd(master, pasid);
@@ -3039,6 +3038,7 @@ static int arm_smmu_attach_dev_blocked(struct iommu_domain *domain,
 
 static const struct iommu_domain_ops arm_smmu_blocked_ops = {
 	.attach_dev = arm_smmu_attach_dev_blocked,
+	.set_dev_pasid = arm_smmu_blocked_set_dev_pasid,
 };
 
 static struct iommu_domain arm_smmu_blocked_domain = {
@@ -3487,7 +3487,6 @@ static struct iommu_ops arm_smmu_ops = {
 	.device_group		= arm_smmu_device_group,
 	.of_xlate		= arm_smmu_of_xlate,
 	.get_resv_regions	= arm_smmu_get_resv_regions,
-	.remove_dev_pasid	= arm_smmu_remove_dev_pasid,
 	.dev_enable_feat	= arm_smmu_dev_enable_feature,
 	.dev_disable_feat	= arm_smmu_dev_disable_feature,
 	.page_response		= arm_smmu_page_response,
Jason
On 2024/8/16 21:02, Jason Gunthorpe wrote:
On Fri, Aug 16, 2024 at 05:43:18PM +0800, Yi Liu wrote:
On 2024/7/18 16:27, Tian, Kevin wrote:
From: Liu, Yi L yi.l.liu@intel.com Sent: Friday, June 28, 2024 5:06 PM
@@ -3289,7 +3290,20 @@ static int __iommu_set_group_pasid(struct iommu_domain *domain,
 		if (device == last_gdev)
 			break;
-		ops->remove_dev_pasid(device->dev, pasid, domain);
+		/* If no old domain, undo the succeeded devices/pasid */
+		if (!old) {
+			ops->remove_dev_pasid(device->dev, pasid, domain);
+			continue;
+		}
+
+		/*
+		 * Rollback the succeeded devices/pasid to the old domain.
+		 * And it is a driver bug to fail attaching with a previously
+		 * good domain.
+		 */
+		if (WARN_ON(old->ops->set_dev_pasid(old, device->dev,
+						    pasid, domain)))
+			ops->remove_dev_pasid(device->dev, pasid, domain);
I wonder whether @remove_dev_pasid() can be replaced by having blocking_domain support @set_dev_pasid?
how about your thought, @Jason?
I think we talked about doing that once before, I forget why it was not done. Maybe there was an issue?
But it seems worth trying.
Since remove_dev_pasid() does not return a result, the caller does not need to check it. If we want to replace it with blocked_domain->ops->set_dev_pasid(), shall we enforce that the set_dev_pasid() op of the blocked_domain always succeeds? Otherwise, this is not an apples-to-apples replacement.
I would like to see set_dev_pasid pass in the old domain first:
int (*set_dev_pasid)(struct iommu_domain *new_domain, struct iommu_domain *old_domain, struct device *dev, ioasid_t pasid);
Replace includes the old_domain as an argument and it is necessary information..
Sure. The Intel iommu driver should be able to support it as well, while the AMD iommu driver does not have the blocking domain stuff yet. Vasant may keep me honest.
A quick try on SMMUv3 seems reasonable:
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
index 9bc50bded5af72..f512bfe5cd202c 100644
--- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
+++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c
@@ -2931,13 +2931,12 @@ int arm_smmu_set_pasid(struct arm_smmu_master *master,
 	return ret;
 }
 
-static void arm_smmu_remove_dev_pasid(struct device *dev, ioasid_t pasid,
-				      struct iommu_domain *domain)
+static void arm_smmu_blocking_set_dev_pasid(struct iommu_domain *new_domain,
+					    struct iommu_domain *old_domain,
+					    struct device *dev, ioasid_t pasid)
 {
 	struct arm_smmu_master *master = dev_iommu_priv_get(dev);
-	struct arm_smmu_domain *smmu_domain;
-
-	smmu_domain = to_smmu_domain(domain);
+	struct arm_smmu_domain *smmu_domain = to_smmu_domain(old_domain);
 
 	mutex_lock(&arm_smmu_asid_lock);
 	arm_smmu_clear_cd(master, pasid);
@@ -3039,6 +3038,7 @@ static int arm_smmu_attach_dev_blocked(struct iommu_domain *domain,
 
 static const struct iommu_domain_ops arm_smmu_blocked_ops = {
 	.attach_dev = arm_smmu_attach_dev_blocked,
+	.set_dev_pasid = arm_smmu_blocked_set_dev_pasid,
 };
 
 static struct iommu_domain arm_smmu_blocked_domain = {
@@ -3487,7 +3487,6 @@ static struct iommu_ops arm_smmu_ops = {
 	.device_group		= arm_smmu_device_group,
 	.of_xlate		= arm_smmu_of_xlate,
 	.get_resv_regions	= arm_smmu_get_resv_regions,
-	.remove_dev_pasid	= arm_smmu_remove_dev_pasid,
 	.dev_enable_feat	= arm_smmu_dev_enable_feature,
 	.dev_disable_feat	= arm_smmu_dev_disable_feature,
 	.page_response		= arm_smmu_page_response,
Jason
On 9/6/24 12:21 PM, Yi Liu wrote:
On 2024/8/16 21:02, Jason Gunthorpe wrote:
On Fri, Aug 16, 2024 at 05:43:18PM +0800, Yi Liu wrote:
On 2024/7/18 16:27, Tian, Kevin wrote:
From: Liu, Yi L yi.l.liu@intel.com Sent: Friday, June 28, 2024 5:06 PM
@@ -3289,7 +3290,20 @@ static int __iommu_set_group_pasid(struct iommu_domain *domain,
 		if (device == last_gdev)
 			break;
-		ops->remove_dev_pasid(device->dev, pasid, domain);
+		/* If no old domain, undo the succeeded devices/pasid */
+		if (!old) {
+			ops->remove_dev_pasid(device->dev, pasid, domain);
+			continue;
+		}
+
+		/*
+		 * Rollback the succeeded devices/pasid to the old domain.
+		 * And it is a driver bug to fail attaching with a previously
+		 * good domain.
+		 */
+		if (WARN_ON(old->ops->set_dev_pasid(old, device->dev,
+						    pasid, domain)))
+			ops->remove_dev_pasid(device->dev, pasid, domain);
I wonder whether @remove_dev_pasid() can be replaced by having blocking_domain support @set_dev_pasid?
how about your thought, @Jason?
I think we talked about doing that once before, I forget why it was not done. Maybe there was an issue?
But it seems worth trying.
Since remove_dev_pasid() does not return a result, the caller does not need to check it. If we want to replace it with blocked_domain->ops->set_dev_pasid(), shall we enforce that the set_dev_pasid() op of the blocked_domain always succeeds?
Yes. The semantics of blocking domain is that the iommu driver must ensure successful completion.
Thanks, baolu
On 2024/9/6 12:33, Baolu Lu wrote:
On 9/6/24 12:21 PM, Yi Liu wrote:
On 2024/8/16 21:02, Jason Gunthorpe wrote:
On Fri, Aug 16, 2024 at 05:43:18PM +0800, Yi Liu wrote:
On 2024/7/18 16:27, Tian, Kevin wrote:
From: Liu, Yi L yi.l.liu@intel.com Sent: Friday, June 28, 2024 5:06 PM
@@ -3289,7 +3290,20 @@ static int __iommu_set_group_pasid(struct iommu_domain *domain,
 		if (device == last_gdev)
 			break;
-		ops->remove_dev_pasid(device->dev, pasid, domain);
+		/* If no old domain, undo the succeeded devices/pasid */
+		if (!old) {
+			ops->remove_dev_pasid(device->dev, pasid, domain);
+			continue;
+		}
+
+		/*
+		 * Rollback the succeeded devices/pasid to the old domain.
+		 * And it is a driver bug to fail attaching with a previously
+		 * good domain.
+		 */
+		if (WARN_ON(old->ops->set_dev_pasid(old, device->dev,
+						    pasid, domain)))
+			ops->remove_dev_pasid(device->dev, pasid, domain);
I wonder whether @remove_dev_pasid() can be replaced by having blocking_domain support @set_dev_pasid?
how about your thought, @Jason?
I think we talked about doing that once before, I forget why it was not done. Maybe there was an issue?
But it seems worth trying.
Since remove_dev_pasid() does not return a result, the caller does not need to check it. If we want to replace it with blocked_domain->ops->set_dev_pasid(), shall we enforce that the set_dev_pasid() op of the blocked_domain always succeeds?
Yes. The semantics of blocking domain is that the iommu driver must ensure successful completion.
great. thanks for the confirmation.
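To make the idea from this sub-thread concrete, here is a userspace sketch of detach expressed as "attach the blocking domain", using an op that takes both the new and the old domain. Everything here is an assumption for illustration: the model_* types, the slot pointer that stands in for a {device, pasid} pair, and the attach_count field are hypothetical and do not correspond to any driver's code. The one property taken from the discussion is that the blocking domain's op always succeeds, which is why the detach helper can stay void.

```c
#include <stddef.h>

struct model_domain;

/* Hypothetical mirror of the op signature proposed in the thread. */
struct model_domain_ops {
	int (*set_dev_pasid)(struct model_domain *new_domain,
			     struct model_domain *old_domain,
			     struct model_domain **slot);
};

struct model_domain {
	const struct model_domain_ops *ops;
	int attach_count;	/* how many slots currently point here */
};

/* Attach path for an ordinary (paging-like) domain in the model. */
int model_paging_set_dev_pasid(struct model_domain *new_domain,
			       struct model_domain *old_domain,
			       struct model_domain **slot)
{
	if (old_domain)
		old_domain->attach_count--;
	new_domain->attach_count++;
	*slot = new_domain;
	return 0;
}

/* Blocking domain: tears down the old attachment and never fails. */
int model_blocked_set_dev_pasid(struct model_domain *new_domain,
				struct model_domain *old_domain,
				struct model_domain **slot)
{
	(void)new_domain;
	if (old_domain)
		old_domain->attach_count--;
	*slot = NULL;	/* no translation installed for this pasid */
	return 0;
}

const struct model_domain_ops model_paging_ops = {
	.set_dev_pasid = model_paging_set_dev_pasid,
};

const struct model_domain_ops model_blocked_ops = {
	.set_dev_pasid = model_blocked_set_dev_pasid,
};

struct model_domain model_blocked_domain = { &model_blocked_ops, 0 };

/* Detach becomes a replace with the blocking domain; result is ignored. */
void model_detach_pasid(struct model_domain **slot)
{
	struct model_domain *old = *slot;

	if (!old)
		return;
	(void)model_blocked_domain.ops->set_dev_pasid(&model_blocked_domain,
						      old, slot);
}
```

Passing old_domain into the op is what lets the blocking path find the state it has to tear down, which matches the remark that the old domain is necessary information for a replace.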
Most of the core logic that runs before the actual device attach/replace operation can be shared with pasid attach/replace. So pass the pasid through the device attach/replace helpers to prepare for adding pasid attach/replace.
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 drivers/iommu/iommufd/device.c | 28 +++++++++++++++-------------
 1 file changed, 15 insertions(+), 13 deletions(-)
diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 873630c111c1..8f13aa94d3af 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -415,7 +415,7 @@ iommufd_hw_pagetable_detach(struct iommufd_device *idev)
 }
 
 static struct iommufd_hw_pagetable *
-iommufd_device_do_attach(struct iommufd_device *idev,
+iommufd_device_do_attach(struct iommufd_device *idev, u32 pasid,
 			 struct iommufd_hw_pagetable *hwpt)
 {
 	int rc;
@@ -469,7 +469,7 @@ iommufd_group_do_replace_paging(struct iommufd_group *igroup,
 }
 
 static struct iommufd_hw_pagetable *
-iommufd_device_do_replace(struct iommufd_device *idev,
+iommufd_device_do_replace(struct iommufd_device *idev, u32 pasid,
 			  struct iommufd_hw_pagetable *hwpt)
 {
 	struct iommufd_group *igroup = idev->igroup;
@@ -532,7 +532,8 @@ iommufd_device_do_replace(struct iommufd_device *idev,
 }
 
 typedef struct iommufd_hw_pagetable *(*attach_fn)(
-	struct iommufd_device *idev, struct iommufd_hw_pagetable *hwpt);
+	struct iommufd_device *idev, u32 pasid,
+	struct iommufd_hw_pagetable *hwpt);
 
 /*
  * When automatically managing the domains we search for a compatible domain in
@@ -540,7 +541,7 @@ typedef struct iommufd_hw_pagetable *(*attach_fn)(
  * Automatic domain selection will never pick a manually created domain.
  */
 static struct iommufd_hw_pagetable *
-iommufd_device_auto_get_domain(struct iommufd_device *idev,
+iommufd_device_auto_get_domain(struct iommufd_device *idev, u32 pasid,
 			       struct iommufd_ioas *ioas, u32 *pt_id,
 			       attach_fn do_attach)
 {
@@ -569,7 +570,7 @@ iommufd_device_auto_get_domain(struct iommufd_device *idev,
 		hwpt = &hwpt_paging->common;
 		if (!iommufd_lock_obj(&hwpt->obj))
 			continue;
-		destroy_hwpt = (*do_attach)(idev, hwpt);
+		destroy_hwpt = (*do_attach)(idev, pasid, hwpt);
 		if (IS_ERR(destroy_hwpt)) {
 			iommufd_put_object(idev->ictx, &hwpt->obj);
 			/*
@@ -596,7 +597,7 @@ iommufd_device_auto_get_domain(struct iommufd_device *idev,
 	hwpt = &hwpt_paging->common;
 
 	if (!immediate_attach) {
-		destroy_hwpt = (*do_attach)(idev, hwpt);
+		destroy_hwpt = (*do_attach)(idev, pasid, hwpt);
 		if (IS_ERR(destroy_hwpt))
 			goto out_abort;
 	} else {
@@ -617,8 +618,8 @@ iommufd_device_auto_get_domain(struct iommufd_device *idev,
 	return destroy_hwpt;
 }
 
-static int iommufd_device_change_pt(struct iommufd_device *idev, u32 *pt_id,
-				    attach_fn do_attach)
+static int iommufd_device_change_pt(struct iommufd_device *idev, u32 pasid,
+				    u32 *pt_id, attach_fn do_attach)
 {
 	struct iommufd_hw_pagetable *destroy_hwpt;
 	struct iommufd_object *pt_obj;
@@ -633,7 +634,7 @@ static int iommufd_device_change_pt(struct iommufd_device *idev, u32 *pt_id,
 		struct iommufd_hw_pagetable *hwpt =
 			container_of(pt_obj, struct iommufd_hw_pagetable, obj);
 
-		destroy_hwpt = (*do_attach)(idev, hwpt);
+		destroy_hwpt = (*do_attach)(idev, pasid, hwpt);
 		if (IS_ERR(destroy_hwpt))
 			goto out_put_pt_obj;
 		break;
@@ -642,8 +643,8 @@ static int iommufd_device_change_pt(struct iommufd_device *idev, u32 *pt_id,
 		struct iommufd_ioas *ioas =
 			container_of(pt_obj, struct iommufd_ioas, obj);
 
-		destroy_hwpt = iommufd_device_auto_get_domain(idev, ioas, pt_id,
-							      do_attach);
+		destroy_hwpt = iommufd_device_auto_get_domain(idev, pasid, ioas,
+							      pt_id, do_attach);
 		if (IS_ERR(destroy_hwpt))
 			goto out_put_pt_obj;
 		break;
@@ -680,7 +681,8 @@ int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id)
 {
 	int rc;
 
-	rc = iommufd_device_change_pt(idev, pt_id, &iommufd_device_do_attach);
+	rc = iommufd_device_change_pt(idev, IOMMU_PASID_INVALID, pt_id,
+				      &iommufd_device_do_attach);
 	if (rc)
 		return rc;
 
@@ -710,7 +712,7 @@ EXPORT_SYMBOL_NS_GPL(iommufd_device_attach, IOMMUFD);
  */
 int iommufd_device_replace(struct iommufd_device *idev, u32 *pt_id)
 {
-	return iommufd_device_change_pt(idev, pt_id,
+	return iommufd_device_change_pt(idev, IOMMU_PASID_INVALID, pt_id,
 					&iommufd_device_do_replace);
 }
 EXPORT_SYMBOL_NS_GPL(iommufd_device_replace, IOMMUFD);
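The shape of this refactor (one shared change_pt helper that forwards a pasid to whichever attach callback it is given) can be illustrated with a small userspace sketch. The names below are hypothetical stand-ins and not the iommufd code: model_dev, the integer hwpt ids, MODEL_PASID_INVALID (standing in for IOMMU_PASID_INVALID), and the -ENOENT check that plays the role of the shared lookup step are all assumptions.

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_PASID_INVALID ((unsigned int)-1)	/* stand-in for IOMMU_PASID_INVALID */
#define MODEL_MAX_PASID 4			/* arbitrary bound for the sketch */

struct model_dev {
	int rid_hwpt;				/* hwpt id for the device itself, 0 = none */
	int pasid_hwpt[MODEL_MAX_PASID];	/* hwpt id per pasid, 0 = none */
};

/* Same role as attach_fn: every callback now receives the pasid. */
typedef int (*model_attach_fn)(struct model_dev *dev, unsigned int pasid,
			       int hwpt_id);

/* Device path: the pasid argument is carried along but not used. */
int model_device_do_attach(struct model_dev *dev, unsigned int pasid,
			   int hwpt_id)
{
	(void)pasid;
	dev->rid_hwpt = hwpt_id;
	return 0;
}

/* Pasid path: the same signature, but the pasid selects the slot. */
int model_pasid_do_attach(struct model_dev *dev, unsigned int pasid,
			  int hwpt_id)
{
	if (pasid >= MODEL_MAX_PASID)
		return -EINVAL;
	if (dev->pasid_hwpt[pasid])
		return -EBUSY;
	dev->pasid_hwpt[pasid] = hwpt_id;
	return 0;
}

/* Shared front end: validate once, then call the supplied callback. */
int model_change_pt(struct model_dev *dev, unsigned int pasid, int hwpt_id,
		    model_attach_fn do_attach)
{
	if (hwpt_id <= 0)
		return -ENOENT;	/* models the shared object lookup failing */
	return do_attach(dev, pasid, hwpt_id);
}
```

A device-level caller would pass MODEL_PASID_INVALID with model_device_do_attach, while a pasid-level caller passes a real pasid with model_pasid_do_attach; both go through the same model_change_pt().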
This introduces three APIs for device drivers to manage pasid attach/replace/detach.
    int iommufd_device_pasid_attach(struct iommufd_device *idev, u32 pasid, u32 *pt_id);
    int iommufd_device_pasid_replace(struct iommufd_device *idev, u32 pasid, u32 *pt_id);
    void iommufd_device_pasid_detach(struct iommufd_device *idev, u32 pasid);
pasid operations have different implications compared to device operations:
- No connection to iommufd_group, since pasid is a device capability and can be enabled only in a singleton group;
- no reserved region per pasid, otherwise the SVA architecture is already broken (the CPU address space doesn't account for device reserved regions);
- accordingly no sw_msi trick;
- immediate_attach is not supported, expecting that the arm-smmu driver will already have removed that requirement before supporting this pasid operation. This avoids an unnecessary change in iommufd_hw_pagetable_alloc() to carry the pasid from device.c.
Given the above differences, this puts all pasid-related logic into a new pasid.c file.
Cache coherency enforcement is still applied to pasid operations since it is about memory accesses post page table walking (no matter the walk is per RID or per PASID).
Since the attach is per PASID, this introduces a pasid_hwpts xarray to track the per-pasid attach data.
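As a rough picture of the per-pasid tracking, the following userspace sketch models pasid_hwpts as a fixed array instead of an xarray. It is a simplification under stated assumptions: the model_* names and MODEL_MAX_PASID are hypothetical, the users counter only stands in for hwpt->obj.users, the hardware attach step is omitted, and the detach behaviour is a guess since the detach implementation is not shown in this excerpt.

```c
#include <errno.h>
#include <stddef.h>

#define MODEL_MAX_PASID 8	/* arbitrary bound for the sketch */

struct model_hwpt {
	int users;	/* stands in for hwpt->obj.users */
};

struct model_idev {
	struct model_hwpt *pasid_hwpts[MODEL_MAX_PASID];
};

/* Attach only succeeds on an empty slot (or the same hwpt again). */
int model_pasid_attach(struct model_idev *idev, unsigned int pasid,
		       struct model_hwpt *hwpt)
{
	if (pasid >= MODEL_MAX_PASID)
		return -EINVAL;
	if (idev->pasid_hwpts[pasid])
		return idev->pasid_hwpts[pasid] == hwpt ? 0 : -EBUSY;
	idev->pasid_hwpts[pasid] = hwpt;
	hwpt->users++;
	return 0;
}

/* Replace requires an existing attachment and hands back the old hwpt. */
int model_pasid_replace(struct model_idev *idev, unsigned int pasid,
			struct model_hwpt *hwpt, struct model_hwpt **old)
{
	struct model_hwpt *curr;

	*old = NULL;
	if (pasid >= MODEL_MAX_PASID)
		return -EINVAL;
	curr = idev->pasid_hwpts[pasid];
	if (!curr)
		return -EINVAL;	/* nothing attached yet */
	if (curr == hwpt)
		return 0;
	idev->pasid_hwpts[pasid] = hwpt;
	hwpt->users++;
	*old = curr;	/* caller is expected to drop the old reference */
	return 0;
}

/* Assumed detach semantics: clear the slot and drop the reference. */
void model_pasid_detach(struct model_idev *idev, unsigned int pasid)
{
	struct model_hwpt *curr;

	if (pasid >= MODEL_MAX_PASID)
		return;
	curr = idev->pasid_hwpts[pasid];
	if (!curr)
		return;
	idev->pasid_hwpts[pasid] = NULL;
	curr->users--;
}
```

The error values follow the patch where it shows them: -EBUSY when attaching over a different hwpt, and -EINVAL when replacing a pasid that was never attached.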
Signed-off-by: Kevin Tian <kevin.tian@intel.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 drivers/iommu/iommufd/Makefile          |   1 +
 drivers/iommu/iommufd/device.c          |  11 +-
 drivers/iommu/iommufd/iommufd_private.h |  15 +++
 drivers/iommu/iommufd/pasid.c           | 157 ++++++++++++++++++++++++
 include/linux/iommufd.h                 |   6 +
 5 files changed, 184 insertions(+), 6 deletions(-)
 create mode 100644 drivers/iommu/iommufd/pasid.c
diff --git a/drivers/iommu/iommufd/Makefile b/drivers/iommu/iommufd/Makefile
index 34b446146961..4b4d516b025c 100644
--- a/drivers/iommu/iommufd/Makefile
+++ b/drivers/iommu/iommufd/Makefile
@@ -6,6 +6,7 @@ iommufd-y := \
 	ioas.o \
 	main.o \
 	pages.o \
+	pasid.o \
 	vfio_compat.o
 
 iommufd-$(CONFIG_IOMMUFD_TEST) += selftest.o
diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 8f13aa94d3af..9933fc492207 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -136,6 +136,7 @@ void iommufd_device_destroy(struct iommufd_object *obj)
 	struct iommufd_device *idev =
 		container_of(obj, struct iommufd_device, obj);
 
+	WARN_ON(!xa_empty(&idev->pasid_hwpts));
 	iommu_device_release_dma_owner(idev->dev);
 	iommufd_put_group(idev->igroup);
 	if (!iommufd_selftest_is_mock_dev(idev->dev))
@@ -216,6 +217,8 @@ struct iommufd_device *iommufd_device_bind(struct iommufd_ctx *ictx,
 	/* igroup refcount moves into iommufd_device */
 	idev->igroup = igroup;
 
+	xa_init(&idev->pasid_hwpts);
+
 	/*
 	 * If the caller fails after this success it must call
 	 * iommufd_unbind_device() which is safe since we hold this refcount.
@@ -531,10 +534,6 @@ iommufd_device_do_replace(struct iommufd_device *idev, u32 pasid,
 	return ERR_PTR(rc);
 }
 
-typedef struct iommufd_hw_pagetable *(*attach_fn)(
-	struct iommufd_device *idev, u32 pasid,
-	struct iommufd_hw_pagetable *hwpt);
-
 /*
  * When automatically managing the domains we search for a compatible domain in
  * the iopt and if one is found use it, otherwise create a new domain.
@@ -618,8 +617,8 @@ iommufd_device_auto_get_domain(struct iommufd_device *idev, u32 pasid,
 	return destroy_hwpt;
 }
 
-static int iommufd_device_change_pt(struct iommufd_device *idev, u32 pasid,
-				    u32 *pt_id, attach_fn do_attach)
+int iommufd_device_change_pt(struct iommufd_device *idev, u32 pasid,
+			     u32 *pt_id, attach_fn do_attach)
 {
 	struct iommufd_hw_pagetable *destroy_hwpt;
 	struct iommufd_object *pt_obj;
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 991f864d1f9b..673ebf5dd0a5 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -394,6 +394,7 @@ struct iommufd_device {
 	struct list_head group_item;
 	/* always the physical device */
 	struct device *dev;
+	struct xarray pasid_hwpts;
 	bool enforce_cache_coherency;
 };
 
@@ -408,6 +409,20 @@ iommufd_get_device(struct iommufd_ucmd *ucmd, u32 id)
 void iommufd_device_destroy(struct iommufd_object *obj);
 int iommufd_get_hw_info(struct iommufd_ucmd *ucmd);
 
+typedef struct iommufd_hw_pagetable *(*attach_fn)(
+			struct iommufd_device *idev, u32 pasid,
+			struct iommufd_hw_pagetable *hwpt);
+
+int iommufd_device_change_pt(struct iommufd_device *idev, u32 pasid,
+			     u32 *pt_id, attach_fn do_attach);
+
+struct iommufd_hw_pagetable *
+iommufd_device_pasid_do_attach(struct iommufd_device *idev, u32 pasid,
+			       struct iommufd_hw_pagetable *hwpt);
+struct iommufd_hw_pagetable *
+iommufd_device_pasid_do_replace(struct iommufd_device *idev, u32 pasid,
+				struct iommufd_hw_pagetable *hwpt);
+
 struct iommufd_access {
 	struct iommufd_object obj;
 	struct iommufd_ctx *ictx;
diff --git a/drivers/iommu/iommufd/pasid.c b/drivers/iommu/iommufd/pasid.c
new file mode 100644
index 000000000000..2f0cb836955f
--- /dev/null
+++ b/drivers/iommu/iommufd/pasid.c
@@ -0,0 +1,157 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/* Copyright (c) 2024, Intel Corporation
+ */
+#include <linux/iommufd.h>
+#include <linux/iommu.h>
+#include "../iommu-priv.h"
+
+#include "iommufd_private.h"
+
+struct iommufd_hw_pagetable *
+iommufd_device_pasid_do_attach(struct iommufd_device *idev, u32 pasid,
+			       struct iommufd_hw_pagetable *hwpt)
+{
+	void *curr;
+	int rc;
+
+	refcount_inc(&hwpt->obj.users);
+	curr = xa_cmpxchg(&idev->pasid_hwpts, pasid, NULL, hwpt, GFP_KERNEL);
+	if (curr) {
+		if (curr == hwpt)
+			rc = 0;
+		else
+			rc = xa_err(curr) ? : -EBUSY;
+		goto err_put_hwpt;
+	}
+
+	rc = iommu_attach_device_pasid(hwpt->domain, idev->dev, pasid);
+	if (rc) {
+		xa_erase(&idev->pasid_hwpts, pasid);
+		goto err_put_hwpt;
+	}
+
+	return NULL;
+
+err_put_hwpt:
+	refcount_dec(&hwpt->obj.users);
+	return ERR_PTR(rc);
+}
+
+struct iommufd_hw_pagetable *
+iommufd_device_pasid_do_replace(struct iommufd_device *idev, u32 pasid,
+				struct iommufd_hw_pagetable *hwpt)
+{
+	void *curr;
+	int rc;
+
+	refcount_inc(&hwpt->obj.users);
+	curr = xa_store(&idev->pasid_hwpts, pasid, hwpt, GFP_KERNEL);
+	rc = xa_err(curr);
+	if (rc)
+		goto out_put_hwpt;
+
+	if (!curr) {
+		xa_erase(&idev->pasid_hwpts, pasid);
+		rc = -EINVAL;
+		goto out_put_hwpt;
+	}
+
+	if (curr == hwpt)
+		goto out_put_hwpt;
+
+	/*
+	 * After replacement, the reference on the old hwpt is retained
+	 * in this thread as caller would free it.
+	 */
+	rc = iommu_replace_device_pasid(hwpt->domain, idev->dev, pasid);
+	if (rc) {
+		WARN_ON(xa_err(xa_store(&idev->pasid_hwpts, pasid,
+					curr, GFP_KERNEL)));
+		goto out_put_hwpt;
+	}
+
+	/* Caller must destroy old_hwpt */
+	return curr;
+
+out_put_hwpt:
+	refcount_dec(&hwpt->obj.users);
+	return ERR_PTR(rc);
+}
+
+/**
+ * iommufd_device_pasid_attach - Connect a {device, pasid} to an iommu_domain
+ * @idev: device to attach
+ * @pasid: pasid to attach
+ * @pt_id: Input a IOMMUFD_OBJ_IOAS, or IOMMUFD_OBJ_HW_PAGETABLE
+ *         Output the IOMMUFD_OBJ_HW_PAGETABLE ID
+ *
+ * This connects a pasid of the device to an iommu_domain. Once this
+ * completes the device could do DMA with the pasid.
+ *
+ * This function is undone by calling iommufd_device_pasid_detach().
+ *
+ * iommufd does not handle race between iommufd_device_pasid_attach(),
+ * iommufd_device_pasid_replace() and iommufd_device_pasid_detach().
+ * So caller of them should guarantee no concurrent call on the same
+ * device and pasid.
+ */
+int iommufd_device_pasid_attach(struct iommufd_device *idev,
+				u32 pasid, u32 *pt_id)
+{
+	return iommufd_device_change_pt(idev, pasid, pt_id,
+					&iommufd_device_pasid_do_attach);
+}
+EXPORT_SYMBOL_NS_GPL(iommufd_device_pasid_attach, IOMMUFD);
+
+/**
+ * iommufd_device_pasid_replace - Change the {device, pasid}'s iommu_domain
+ * @idev: device to change
+ * @pasid: pasid to change
+ * @pt_id: Input a IOMMUFD_OBJ_IOAS, or IOMMUFD_OBJ_HW_PAGETABLE
+ *         Output the IOMMUFD_OBJ_HW_PAGETABLE ID
+ *
+ * This is the same as
+ *   iommufd_device_pasid_detach();
+ *   iommufd_device_pasid_attach();
+ *
+ * If it fails then no change is made to the attachment. The iommu driver may
+ * implement this so there is no disruption in translation. This can only be
+ * called if iommufd_device_pasid_attach() has already succeeded.
+ *
+ * iommufd does not handle race between iommufd_device_pasid_replace(),
+ * iommufd_device_pasid_attach() and iommufd_device_pasid_detach().
+ * So caller of them should guarantee no concurrent call on the same
+ * device and pasid.
+ */
+int iommufd_device_pasid_replace(struct iommufd_device *idev,
+				 u32 pasid, u32 *pt_id)
+{
+	return iommufd_device_change_pt(idev, pasid, pt_id,
+					&iommufd_device_pasid_do_replace);
+}
+EXPORT_SYMBOL_NS_GPL(iommufd_device_pasid_replace, IOMMUFD);
+
+/**
+ * iommufd_device_pasid_detach - Disconnect a {device, pasid} from an iommu_domain
+ * @idev: device to detach
+ * @pasid: pasid to detach
+ *
+ * Undo iommufd_device_pasid_attach(). This disconnects the idev/pasid from
+ * the previously attached pt_id.
+ *
+ * iommufd does not handle race between iommufd_device_pasid_detach(),
+ * iommufd_device_pasid_attach() and iommufd_device_pasid_replace().
+ * So caller of them should guarantee no concurrent call on the same
+ * device and pasid.
+ */ +void iommufd_device_pasid_detach(struct iommufd_device *idev, u32 pasid) +{ + struct iommufd_hw_pagetable *hwpt; + + hwpt = xa_erase(&idev->pasid_hwpts, pasid); + if (WARN_ON(!hwpt)) + return; + iommu_detach_device_pasid(hwpt->domain, idev->dev, pasid); + iommufd_hw_pagetable_put(idev->ictx, hwpt); +} +EXPORT_SYMBOL_NS_GPL(iommufd_device_pasid_detach, IOMMUFD); diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h index ffc3a949f837..0b007c376306 100644 --- a/include/linux/iommufd.h +++ b/include/linux/iommufd.h @@ -26,6 +26,12 @@ int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id); int iommufd_device_replace(struct iommufd_device *idev, u32 *pt_id); void iommufd_device_detach(struct iommufd_device *idev);
+int iommufd_device_pasid_attach(struct iommufd_device *idev, + u32 pasid, u32 *pt_id); +int iommufd_device_pasid_replace(struct iommufd_device *idev, + u32 pasid, u32 *pt_id); +void iommufd_device_pasid_detach(struct iommufd_device *idev, u32 pasid); + struct iommufd_ctx *iommufd_device_to_ictx(struct iommufd_device *idev); u32 iommufd_device_to_id(struct iommufd_device *idev);
The two callbacks are needed to complete the pasid attach/detach path for the mock device. A nop is enough for set_dev_pasid; a domain type check in remove_dev_pasid is also helpful.

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 drivers/iommu/iommufd/selftest.c | 39 ++++++++++++++++++++++++++++++++
 1 file changed, 39 insertions(+)
diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c index 7a2199470f31..1404eca156a8 100644 --- a/drivers/iommu/iommufd/selftest.c +++ b/drivers/iommu/iommufd/selftest.c @@ -514,6 +514,30 @@ static struct iommu_device *mock_probe_device(struct device *dev) return &mock_iommu_device; }
+static void mock_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid,
+					struct iommu_domain *domain)
+{
+	/* Domain type specific cleanup: */
+	if (domain) {
+		switch (domain->type) {
+		case IOMMU_DOMAIN_NESTED:
+		case IOMMU_DOMAIN_UNMANAGED:
+			break;
+		default:
+			/* should never reach here */
+			WARN_ON(1);
+			break;
+		}
+	}
+}
+
+static int mock_domain_set_dev_pasid_nop(struct iommu_domain *domain,
+					 struct device *dev, ioasid_t pasid,
+					 struct iommu_domain *old)
+{
+	return 0;
+}
+
 static const struct iommu_ops mock_ops = {
 	/*
 	 * IOMMU_DOMAIN_BLOCKED cannot be returned from def_domain_type()
@@ -529,6 +553,7 @@ static const struct iommu_ops mock_ops = {
 	.capable = mock_domain_capable,
 	.device_group = generic_device_group,
 	.probe_device = mock_probe_device,
+	.remove_dev_pasid = mock_iommu_remove_dev_pasid,
 	.default_domain_ops =
 		&(struct iommu_domain_ops){
 			.free = mock_domain_free,
@@ -536,6 +561,7 @@ static const struct iommu_ops mock_ops = {
 			.map_pages = mock_domain_map_pages,
 			.unmap_pages = mock_domain_unmap_pages,
 			.iova_to_phys = mock_domain_iova_to_phys,
+			.set_dev_pasid = mock_domain_set_dev_pasid_nop,
 		},
 };
@@ -600,6 +626,7 @@ static struct iommu_domain_ops domain_nested_ops = {
 	.free = mock_domain_free_nested,
 	.attach_dev = mock_domain_nop_attach,
 	.cache_invalidate_user = mock_domain_cache_invalidate_user,
+	.set_dev_pasid = mock_domain_set_dev_pasid_nop,
 };
 
 static inline struct iommufd_hw_pagetable *
@@ -660,6 +687,10 @@ static void mock_dev_release(struct device *dev)
 
 static struct mock_dev *mock_dev_create(unsigned long dev_flags)
 {
+	struct property_entry prop[] = {
+		PROPERTY_ENTRY_U32("pasid-num-bits", 20),
+		{},
+	};
 	struct mock_dev *mdev;
 	int rc;
 
@@ -685,6 +716,12 @@ static struct mock_dev *mock_dev_create(unsigned long dev_flags)
 	if (rc)
 		goto err_put;
 
+	rc = device_create_managed_software_node(&mdev->dev, prop, NULL);
+	if (rc) {
+		dev_err(&mdev->dev, "add pasid-num-bits property failed, rc: %d\n", rc);
+		goto err_put;
+	}
+
 	rc = device_add(&mdev->dev);
 	if (rc)
 		goto err_put;
@@ -1491,6 +1528,8 @@ int __init iommufd_test_init(void)
 				  &iommufd_mock_bus_type.nb);
 	if (rc)
 		goto err_sysfs;
+
+	mock_iommu_device.max_pasids = (1 << 20);
 	return 0;
 
 err_sysfs:
The selftest device (sobj->type == TYPE_IDEV) needs to be looked up in multiple places, so add a helper for it.

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 drivers/iommu/iommufd/selftest.c | 32 +++++++++++++++++++++-----------
 1 file changed, 21 insertions(+), 11 deletions(-)
diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c index 1404eca156a8..782e3c469530 100644 --- a/drivers/iommu/iommufd/selftest.c +++ b/drivers/iommu/iommufd/selftest.c @@ -802,29 +802,39 @@ static int iommufd_test_mock_domain(struct iommufd_ucmd *ucmd, return rc; }
-/* Replace the mock domain with a manually allocated hw_pagetable */ -static int iommufd_test_mock_domain_replace(struct iommufd_ucmd *ucmd, - unsigned int device_id, u32 pt_id, - struct iommu_test_cmd *cmd) +static struct selftest_obj * +iommufd_test_get_self_test_device(struct iommufd_ctx *ictx, u32 id) { struct iommufd_object *dev_obj; struct selftest_obj *sobj; - int rc;
/* * Prefer to use the OBJ_SELFTEST because the destroy_rwsem will ensure * it doesn't race with detach, which is not allowed. */ - dev_obj = - iommufd_get_object(ucmd->ictx, device_id, IOMMUFD_OBJ_SELFTEST); + dev_obj = iommufd_get_object(ictx, id, IOMMUFD_OBJ_SELFTEST); if (IS_ERR(dev_obj)) - return PTR_ERR(dev_obj); + return ERR_CAST(dev_obj);
sobj = container_of(dev_obj, struct selftest_obj, obj); if (sobj->type != TYPE_IDEV) { - rc = -EINVAL; - goto out_dev_obj; + iommufd_put_object(ictx, dev_obj); + return ERR_PTR(-EINVAL); } + return sobj; +} + +/* Replace the mock domain with a manually allocated hw_pagetable */ +static int iommufd_test_mock_domain_replace(struct iommufd_ucmd *ucmd, + unsigned int device_id, u32 pt_id, + struct iommu_test_cmd *cmd) +{ + struct selftest_obj *sobj; + int rc; + + sobj = iommufd_test_get_self_test_device(ucmd->ictx, device_id); + if (IS_ERR(sobj)) + return PTR_ERR(sobj);
rc = iommufd_device_replace(sobj->idev.idev, &pt_id); if (rc) @@ -834,7 +844,7 @@ static int iommufd_test_mock_domain_replace(struct iommufd_ucmd *ucmd, rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
out_dev_obj: - iommufd_put_object(ucmd->ictx, dev_obj); + iommufd_put_object(ucmd->ictx, &sobj->obj); return rc; }
This adds four test ops for pasid attach/replace/detach testing: ops to attach, replace, and detach a pasid, plus an op to check which domain is attached to a pasid.

Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 drivers/iommu/iommufd/iommufd_test.h |  30 ++++++
 drivers/iommu/iommufd/selftest.c     | 135 +++++++++++++++++++++++++++
 2 files changed, 165 insertions(+)
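The mock driver change in this patch also makes set_dev_pasid fail on every second call for pasid 1024, so that the rollback path can be exercised. A minimal userspace sketch of that toggle follows; `struct fault_model`, `model_set_dev_pasid()` and `model_remove_dev_pasid()` are hypothetical names used only for illustration, not kernel symbols.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace model of the mock driver's pasid 1024 behaviour. */
#define MODEL_FAULT_PASID 1024u

struct fault_model {
	bool attached;	/* plays the role of pasid_1024_attached */
};

/* Models set_dev_pasid: every second call on the special pasid fails. */
static int model_set_dev_pasid(struct fault_model *m, uint32_t pasid)
{
	if (pasid == MODEL_FAULT_PASID) {
		if (m->attached) {
			m->attached = false;
			return -ENOMEM;	/* injected failure */
		}
		m->attached = true;
	}
	return 0;
}

/* Models remove_dev_pasid: a detach re-arms the "next call succeeds" state. */
static void model_remove_dev_pasid(struct fault_model *m, uint32_t pasid)
{
	if (pasid == MODEL_FAULT_PASID)
		m->attached = false;
}
```

In this model, an attach followed by a replace on pasid 1024 yields success and then -ENOMEM, which is the sequence the negative replace test relies on. Other pasids are never affected.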
diff --git a/drivers/iommu/iommufd/iommufd_test.h b/drivers/iommu/iommufd/iommufd_test.h index e854d3f67205..ee6310f07749 100644 --- a/drivers/iommu/iommufd/iommufd_test.h +++ b/drivers/iommu/iommufd/iommufd_test.h @@ -22,6 +22,10 @@ enum { IOMMU_TEST_OP_MOCK_DOMAIN_FLAGS, IOMMU_TEST_OP_DIRTY, IOMMU_TEST_OP_MD_CHECK_IOTLB, + IOMMU_TEST_OP_PASID_ATTACH, + IOMMU_TEST_OP_PASID_REPLACE, + IOMMU_TEST_OP_PASID_DETACH, + IOMMU_TEST_OP_PASID_CHECK_DOMAIN, };
 enum {
@@ -127,6 +131,32 @@ struct iommu_test_cmd {
 			__u32 id;
 			__u32 iotlb;
 		} check_iotlb;
+		struct {
+			__u32 pasid;
+			__u32 pt_id;
+			/* @id is stdev_id for IOMMU_TEST_OP_PASID_ATTACH
+			 * pasid#1024 is for special test, avoid using it
+			 * in normal case.
+			 */
+		} pasid_attach;
+		struct {
+			__u32 pasid;
+			__u32 pt_id;
+			/* @id is stdev_id for IOMMU_TEST_OP_PASID_REPLACE
+			 * pasid#1024 is for special test, avoid using it
+			 * in normal case.
+			 */
+		} pasid_replace;
+		struct {
+			__u32 pasid;
+			/* @id is stdev_id for IOMMU_TEST_OP_PASID_DETACH */
+		} pasid_detach;
+		struct {
+			__u32 pasid;
+			__u32 hwpt_id;
+			__u64 out_result_ptr;
+			/* @id is stdev_id for IOMMU_TEST_OP_PASID_CHECK_DOMAIN */
+		} pasid_check;
 	};
 	__u32 last;
 };
diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c
index 782e3c469530..b44d56118e90 100644
--- a/drivers/iommu/iommufd/selftest.c
+++ b/drivers/iommu/iommufd/selftest.c
@@ -514,6 +514,8 @@ static struct iommu_device *mock_probe_device(struct device *dev)
 	return &mock_iommu_device;
 }
+static bool pasid_1024_attached;
+
 static void mock_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid,
 					struct iommu_domain *domain)
 {
@@ -522,6 +524,8 @@ static void mock_iommu_remove_dev_pasid(struct device *dev, ioasid_t pasid,
 		switch (domain->type) {
 		case IOMMU_DOMAIN_NESTED:
 		case IOMMU_DOMAIN_UNMANAGED:
+			if (pasid == 1024)
+				pasid_1024_attached = false;
 			break;
 		default:
 			/* should never reach here */
@@ -535,6 +539,20 @@ static int mock_domain_set_dev_pasid_nop(struct iommu_domain *domain,
 					 struct device *dev, ioasid_t pasid,
 					 struct iommu_domain *old)
 {
+	/*
+	 * The first attach with pasid 1024 succeeds, the second one fails,
+	 * and so on. This helps to test the case in which the iommu layer
+	 * needs to roll back to the old domain due to a driver failure.
+	 */
+	if (pasid == 1024) {
+		if (pasid_1024_attached) {
+			pasid_1024_attached = false;
+			/* Fake an error to fail the replacement */
+			return -ENOMEM;
+		}
+		pasid_1024_attached = true;
+	}
+
 	return 0;
 }
@@ -1422,6 +1440,114 @@ static int iommufd_test_dirty(struct iommufd_ucmd *ucmd, unsigned int mockpt_id, return rc; }
+static int iommufd_test_pasid_attach(struct iommufd_ucmd *ucmd, + struct iommu_test_cmd *cmd) +{ + struct selftest_obj *sobj; + int rc; + + sobj = iommufd_test_get_self_test_device(ucmd->ictx, cmd->id); + if (IS_ERR(sobj)) + return PTR_ERR(sobj); + + rc = iommufd_device_pasid_attach(sobj->idev.idev, + cmd->pasid_attach.pasid, + &cmd->pasid_attach.pt_id); + iommufd_put_object(ucmd->ictx, &sobj->obj); + return rc; +} + +static int iommufd_test_pasid_replace(struct iommufd_ucmd *ucmd, + struct iommu_test_cmd *cmd) +{ + struct selftest_obj *sobj; + int rc; + + sobj = iommufd_test_get_self_test_device(ucmd->ictx, cmd->id); + if (IS_ERR(sobj)) + return PTR_ERR(sobj); + + rc = iommufd_device_pasid_replace(sobj->idev.idev, + cmd->pasid_attach.pasid, + &cmd->pasid_attach.pt_id); + iommufd_put_object(ucmd->ictx, &sobj->obj); + return rc; +} + +static int iommufd_test_pasid_detach(struct iommufd_ucmd *ucmd, + struct iommu_test_cmd *cmd) +{ + struct selftest_obj *sobj; + + sobj = iommufd_test_get_self_test_device(ucmd->ictx, cmd->id); + if (IS_ERR(sobj)) + return PTR_ERR(sobj); + + iommufd_device_pasid_detach(sobj->idev.idev, + cmd->pasid_detach.pasid); + iommufd_put_object(ucmd->ictx, &sobj->obj); + return 0; +} + +static inline struct iommufd_hw_pagetable * +iommufd_get_hwpt(struct iommufd_ucmd *ucmd, u32 id) +{ + struct iommufd_object *pt_obj; + + pt_obj = iommufd_get_object(ucmd->ictx, id, IOMMUFD_OBJ_ANY); + if (IS_ERR(pt_obj)) + return ERR_CAST(pt_obj); + + if (pt_obj->type != IOMMUFD_OBJ_HWPT_NESTED && + pt_obj->type != IOMMUFD_OBJ_HWPT_PAGING) { + iommufd_put_object(ucmd->ictx, pt_obj); + return ERR_PTR(-EINVAL); + } + + return container_of(pt_obj, struct iommufd_hw_pagetable, obj); +} + +static int iommufd_test_pasid_check_domain(struct iommufd_ucmd *ucmd, + struct iommu_test_cmd *cmd) +{ + struct iommu_domain *attached_domain, *expect_domain = NULL; + struct iommufd_hw_pagetable *hwpt = NULL; + struct selftest_obj *sobj; + struct mock_dev *mdev; + bool result; + int 
rc = 0; + + sobj = iommufd_test_get_self_test_device(ucmd->ictx, cmd->id); + if (IS_ERR(sobj)) + return PTR_ERR(sobj); + + mdev = sobj->idev.mock_dev; + + attached_domain = iommu_get_domain_for_dev_pasid(&mdev->dev, + cmd->pasid_check.pasid, 0); + if (IS_ERR(attached_domain)) + attached_domain = NULL; + + if (cmd->pasid_check.hwpt_id) { + hwpt = iommufd_get_hwpt(ucmd, cmd->pasid_check.hwpt_id); + if (IS_ERR(hwpt)) { + rc = PTR_ERR(hwpt); + goto out_put_dev; + } + expect_domain = hwpt->domain; + } + + result = (attached_domain == expect_domain) ? 1 : 0; + if (copy_to_user(u64_to_user_ptr(cmd->pasid_check.out_result_ptr), + &result, sizeof(result))) + rc = -EFAULT; + if (hwpt) + iommufd_put_object(ucmd->ictx, &hwpt->obj); +out_put_dev: + iommufd_put_object(ucmd->ictx, &sobj->obj); + return rc; +} + void iommufd_selftest_destroy(struct iommufd_object *obj) { struct selftest_obj *sobj = container_of(obj, struct selftest_obj, obj); @@ -1497,6 +1623,14 @@ int iommufd_test(struct iommufd_ucmd *ucmd) cmd->dirty.page_size, u64_to_user_ptr(cmd->dirty.uptr), cmd->dirty.flags); + case IOMMU_TEST_OP_PASID_ATTACH: + return iommufd_test_pasid_attach(ucmd, cmd); + case IOMMU_TEST_OP_PASID_REPLACE: + return iommufd_test_pasid_replace(ucmd, cmd); + case IOMMU_TEST_OP_PASID_DETACH: + return iommufd_test_pasid_detach(ucmd, cmd); + case IOMMU_TEST_OP_PASID_CHECK_DOMAIN: + return iommufd_test_pasid_check_domain(ucmd, cmd); default: return -EOPNOTSUPP; } @@ -1540,6 +1674,7 @@ int __init iommufd_test_init(void) goto err_sysfs;
mock_iommu_device.max_pasids = (1 << 20); + pasid_1024_attached = false; return 0;
err_sysfs:
This tests iommufd pasid attach/replace/detach.
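The semantics these tests check can be summarized with a small userspace model. This is only a sketch under stated assumptions: `struct model_dev`, the `model_pasid_*()` helpers and `fail_next_replace` are hypothetical names, and the one-slot-per-pasid array stands in for the xarray used in the kernel.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace model, not kernel code: one slot per pasid. */
#define MODEL_NR_PASIDS 2048u

struct model_dev {
	const void *hwpt[MODEL_NR_PASIDS];	/* NULL: nothing attached */
	int fail_next_replace;			/* non-zero: inject -ENOMEM once */
};

static int model_pasid_attach(struct model_dev *d, uint32_t pasid,
			      const void *hwpt)
{
	if (pasid >= MODEL_NR_PASIDS || !hwpt)
		return -EINVAL;
	if (d->hwpt[pasid] == hwpt)
		return 0;		/* re-attaching the same hwpt succeeds */
	if (d->hwpt[pasid])
		return -EBUSY;		/* attach never overwrites; use replace */
	d->hwpt[pasid] = hwpt;
	return 0;
}

static int model_pasid_replace(struct model_dev *d, uint32_t pasid,
			       const void *hwpt)
{
	if (pasid >= MODEL_NR_PASIDS || !hwpt)
		return -EINVAL;
	if (!d->hwpt[pasid])
		return -EINVAL;		/* replace requires a prior attach */
	if (d->hwpt[pasid] == hwpt)
		return 0;
	if (d->fail_next_replace) {
		d->fail_next_replace = 0;
		return -ENOMEM;		/* the old attachment is kept */
	}
	d->hwpt[pasid] = hwpt;
	return 0;
}

static void model_pasid_detach(struct model_dev *d, uint32_t pasid)
{
	if (pasid < MODEL_NR_PASIDS)
		d->hwpt[pasid] = NULL;
}

/* Mirrors the check-domain op: expected == NULL means "no domain attached". */
static int model_pasid_check(const struct model_dev *d, uint32_t pasid,
			     const void *expected)
{
	return pasid < MODEL_NR_PASIDS && d->hwpt[pasid] == expected;
}
```

The selftest walks through the same cases: attach on an empty pasid, a repeated attach of the same hwpt, an attach that would overwrite (EBUSY), a replace without a prior attach (EINVAL), successful replaces, a failed replace that must leave the old domain in place, and a detach that leaves no domain.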
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
---
 tools/testing/selftests/iommu/iommufd.c       | 207 ++++++++++++++++++
 .../selftests/iommu/iommufd_fail_nth.c        |  28 ++-
 tools/testing/selftests/iommu/iommufd_utils.h |  78 +++++++
 3 files changed, 309 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c index edf1c99c9936..6db0eabb689e 100644 --- a/tools/testing/selftests/iommu/iommufd.c +++ b/tools/testing/selftests/iommu/iommufd.c @@ -2346,4 +2346,211 @@ TEST_F(vfio_compat_mock_domain, huge_map) } }
+FIXTURE(iommufd_device_pasid) +{ + int fd; + uint32_t ioas_id; + uint32_t hwpt_id; + uint32_t stdev_id; + uint32_t device_id; +}; + +FIXTURE_SETUP(iommufd_device_pasid) +{ + self->fd = open("/dev/iommu", O_RDWR); + ASSERT_NE(-1, self->fd); + test_ioctl_ioas_alloc(&self->ioas_id); + + test_cmd_mock_domain(self->ioas_id, &self->stdev_id, + &self->hwpt_id, &self->device_id); +} + +FIXTURE_TEARDOWN(iommufd_device_pasid) +{ + teardown_iommufd(self->fd, _metadata); +} + +TEST_F(iommufd_device_pasid, pasid_attach) +{ + if (self->device_id) { + struct iommu_hwpt_selftest data = { + .iotlb = IOMMU_TEST_IOTLB_DEFAULT, + }; + uint32_t nested_hwpt_id[2] = {}; + uint32_t parent_hwpt_id = 0; + uint32_t pasid = 100; + bool result; + + /* Allocate two nested hwpts sharing one common parent hwpt */ + test_cmd_hwpt_alloc(self->device_id, self->ioas_id, + IOMMU_HWPT_ALLOC_NEST_PARENT, + &parent_hwpt_id); + + test_cmd_hwpt_alloc_nested(self->device_id, parent_hwpt_id, 0, + &nested_hwpt_id[0], + IOMMU_HWPT_DATA_SELFTEST, + &data, sizeof(data)); + test_cmd_hwpt_alloc_nested(self->device_id, parent_hwpt_id, 0, + &nested_hwpt_id[1], + IOMMU_HWPT_DATA_SELFTEST, + &data, sizeof(data)); + + /* + * Attach ioas to pasid 100, should succeed, domain should + * be valid. + */ + test_cmd_pasid_attach(pasid, self->ioas_id); + ASSERT_EQ(0, + test_cmd_pasid_check_domain(self->fd, self->stdev_id, + pasid, self->hwpt_id, + &result)); + EXPECT_EQ(1, result); + + /* + * Try attach pasid 100 with self->ioas_id, should succeed + * as it is the same with existing hwpt. + */ + test_cmd_pasid_attach(pasid, self->ioas_id); + + /* + * Try attach pasid 100 with another hwpt, should FAIL + * as attach does not allow overwrite, use REPLACE instead. + */ + test_err_cmd_pasid_attach(EBUSY, pasid, nested_hwpt_id[0]); + + /* + * Detach hwpt from pasid 100, and check if the pasid 100 + * has null domain. Should be done before the next attach. 
+ */ + test_cmd_pasid_detach(pasid); + ASSERT_EQ(0, + test_cmd_pasid_check_domain(self->fd, self->stdev_id, + pasid, 0, &result)); + EXPECT_EQ(1, result); + + /* + * Attach nested hwpt to pasid 100, should succeed, domain + * should be valid. + */ + test_cmd_pasid_attach(pasid, nested_hwpt_id[0]); + ASSERT_EQ(0, + test_cmd_pasid_check_domain(self->fd, self->stdev_id, + pasid, nested_hwpt_id[0], + &result)); + EXPECT_EQ(1, result); + + /* + * Detach hwpt from pasid 100, and check if the pasid 100 + * has null domain + */ + test_cmd_pasid_detach(pasid); + ASSERT_EQ(0, + test_cmd_pasid_check_domain(self->fd, self->stdev_id, + pasid, 0, &result)); + EXPECT_EQ(1, result); + + /* Replace tests */ + pasid = 200; + + /* + * Replace pasid 200 without attaching it first, should + * fail with -EINVAL. + */ + test_err_cmd_pasid_replace(EINVAL, pasid, parent_hwpt_id); + + /* + * Attach a s2 hwpt to pasid 200, should succeed, domain should + * be valid. + */ + test_cmd_pasid_attach(pasid, parent_hwpt_id); + ASSERT_EQ(0, + test_cmd_pasid_check_domain(self->fd, self->stdev_id, + pasid, parent_hwpt_id, + &result)); + EXPECT_EQ(1, result); + + /* + * Replace pasid 200 with self->ioas_id, should succeed, + * and have valid domain. + */ + test_cmd_pasid_replace(pasid, self->ioas_id); + ASSERT_EQ(0, + test_cmd_pasid_check_domain(self->fd, self->stdev_id, + pasid, self->hwpt_id, + &result)); + EXPECT_EQ(1, result); + + /* + * Replace a nested hwpt for pasid 200, should succeed, + * and have valid domain. + */ + test_cmd_pasid_replace(pasid, nested_hwpt_id[0]); + ASSERT_EQ(0, + test_cmd_pasid_check_domain(self->fd, self->stdev_id, + pasid, nested_hwpt_id[0], + &result)); + EXPECT_EQ(1, result); + + /* + * Replace with another nested hwpt for pasid 200, should + * succeed, and have valid domain. 
+ */ + test_cmd_pasid_replace(pasid, nested_hwpt_id[1]); + ASSERT_EQ(0, + test_cmd_pasid_check_domain(self->fd, self->stdev_id, + pasid, nested_hwpt_id[1], + &result)); + EXPECT_EQ(1, result); + + /* + * Detach hwpt from pasid 200, and check if the pasid 200 + * has null domain. + */ + test_cmd_pasid_detach(pasid); + ASSERT_EQ(0, + test_cmd_pasid_check_domain(self->fd, self->stdev_id, + pasid, 0, &result)); + EXPECT_EQ(1, result); + + /* Negative Tests for pasid replace, use pasid 1024 */ + + /* + * Attach a s2 hwpt to pasid 1024, should succeed, domain should + * be valid. + */ + pasid = 1024; + test_cmd_pasid_attach(pasid, parent_hwpt_id); + ASSERT_EQ(0, + test_cmd_pasid_check_domain(self->fd, self->stdev_id, + pasid, parent_hwpt_id, + &result)); + EXPECT_EQ(1, result); + + /* + * Replace pasid 1024 with self->ioas_id, should fail, + * but have the old valid domain. + */ + test_err_cmd_pasid_replace(ENOMEM, pasid, self->ioas_id); + ASSERT_EQ(0, + test_cmd_pasid_check_domain(self->fd, self->stdev_id, + pasid, parent_hwpt_id, + &result)); + EXPECT_EQ(1, result); + + /* + * Detach hwpt from pasid 1024, and check if the pasid 1024 + * has null domain. + */ + test_cmd_pasid_detach(pasid); + ASSERT_EQ(0, + test_cmd_pasid_check_domain(self->fd, self->stdev_id, + pasid, 0, &result)); + EXPECT_EQ(1, result); + + test_ioctl_destroy(nested_hwpt_id[0]); + test_ioctl_destroy(nested_hwpt_id[1]); + test_ioctl_destroy(parent_hwpt_id); + } +} + TEST_HARNESS_MAIN diff --git a/tools/testing/selftests/iommu/iommufd_fail_nth.c b/tools/testing/selftests/iommu/iommufd_fail_nth.c index f590417cd67a..6d1b03e73b9d 100644 --- a/tools/testing/selftests/iommu/iommufd_fail_nth.c +++ b/tools/testing/selftests/iommu/iommufd_fail_nth.c @@ -206,12 +206,16 @@ FIXTURE(basic_fail_nth) { int fd; uint32_t access_id; + uint32_t stdev_id; + uint32_t pasid; };
FIXTURE_SETUP(basic_fail_nth) { self->fd = -1; self->access_id = 0; + self->stdev_id = 0; + self->pasid = 0; //test should use a non-zero value }
FIXTURE_TEARDOWN(basic_fail_nth) @@ -223,6 +227,8 @@ FIXTURE_TEARDOWN(basic_fail_nth) rc = _test_cmd_destroy_access(self->access_id); assert(rc == 0); } + if (self->pasid && self->stdev_id) + _test_cmd_pasid_detach(self->fd, self->stdev_id, self->pasid); teardown_iommufd(self->fd, _metadata); }
@@ -579,7 +585,6 @@ TEST_FAIL_NTH(basic_fail_nth, device) struct iommu_test_hw_info info; uint32_t ioas_id; uint32_t ioas_id2; - uint32_t stdev_id; uint32_t idev_id; uint32_t hwpt_id; __u64 iova; @@ -608,7 +613,7 @@ TEST_FAIL_NTH(basic_fail_nth, device)
fail_nth_enable();
- if (_test_cmd_mock_domain(self->fd, ioas_id, &stdev_id, NULL, + if (_test_cmd_mock_domain(self->fd, ioas_id, &self->stdev_id, NULL, &idev_id)) return -1;
@@ -619,11 +624,26 @@ TEST_FAIL_NTH(basic_fail_nth, device) IOMMU_HWPT_DATA_NONE, 0, 0)) return -1;
- if (_test_cmd_mock_domain_replace(self->fd, stdev_id, ioas_id2, NULL)) + if (_test_cmd_mock_domain_replace(self->fd, self->stdev_id, ioas_id2, NULL)) + return -1; + + if (_test_cmd_mock_domain_replace(self->fd, self->stdev_id, hwpt_id, NULL)) return -1;
- if (_test_cmd_mock_domain_replace(self->fd, stdev_id, hwpt_id, NULL)) + self->pasid = 200; + + /* Tests for pasid attach/replace/detach */ + if (_test_cmd_pasid_attach(self->fd, self->stdev_id, self->pasid, ioas_id)) return -1; + + if (_test_cmd_pasid_replace(self->fd, self->stdev_id, self->pasid, ioas_id2)) + return -1; + + if (_test_cmd_pasid_detach(self->fd, self->stdev_id, self->pasid)) + return -1; + + self->pasid = 0; + return 0; }
diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h index 8d2b46b2114d..9b112b108670 100644 --- a/tools/testing/selftests/iommu/iommufd_utils.h +++ b/tools/testing/selftests/iommu/iommufd_utils.h @@ -684,3 +684,81 @@ static int _test_cmd_get_hw_info(int fd, __u32 device_id, void *data,
#define test_cmd_get_hw_capabilities(device_id, caps, mask) \ ASSERT_EQ(0, _test_cmd_get_hw_info(self->fd, device_id, NULL, 0, &caps)) + +static int _test_cmd_pasid_attach(int fd, __u32 stdev_id, __u32 pasid, __u32 pt_id) +{ + struct iommu_test_cmd test_attach = { + .size = sizeof(test_attach), + .op = IOMMU_TEST_OP_PASID_ATTACH, + .id = stdev_id, + .pasid_attach = { + .pasid = pasid, + .pt_id = pt_id, + }, + }; + + return ioctl(fd, _IOMMU_TEST_CMD(IOMMU_TEST_OP_PASID_ATTACH), &test_attach); +} + +#define test_cmd_pasid_attach(pasid, hwpt_id) \ + ASSERT_EQ(0, _test_cmd_pasid_attach(self->fd, self->stdev_id, pasid, hwpt_id)) + +#define test_err_cmd_pasid_attach(_errno, pasid, hwpt_id) \ + EXPECT_ERRNO(_errno, \ + _test_cmd_pasid_attach(self->fd, self->stdev_id, pasid, hwpt_id)) + +static int _test_cmd_pasid_replace(int fd, __u32 stdev_id, __u32 pasid, __u32 pt_id) +{ + struct iommu_test_cmd test_replace = { + .size = sizeof(test_replace), + .op = IOMMU_TEST_OP_PASID_REPLACE, + .id = stdev_id, + .pasid_replace = { + .pasid = pasid, + .pt_id = pt_id, + }, + }; + + return ioctl(fd, _IOMMU_TEST_CMD(IOMMU_TEST_OP_PASID_REPLACE), &test_replace); +} + +#define test_cmd_pasid_replace(pasid, hwpt_id) \ + ASSERT_EQ(0, _test_cmd_pasid_replace(self->fd, self->stdev_id, pasid, hwpt_id)) + +#define test_err_cmd_pasid_replace(_errno, pasid, hwpt_id) \ + EXPECT_ERRNO(_errno, \ + _test_cmd_pasid_replace(self->fd, self->stdev_id, pasid, hwpt_id)) + +static int _test_cmd_pasid_detach(int fd, __u32 stdev_id, __u32 pasid) +{ + struct iommu_test_cmd test_detach = { + .size = sizeof(test_detach), + .op = IOMMU_TEST_OP_PASID_DETACH, + .id = stdev_id, + .pasid_detach = { + .pasid = pasid, + }, + }; + + return ioctl(fd, _IOMMU_TEST_CMD(IOMMU_TEST_OP_PASID_DETACH), &test_detach); +} + +#define test_cmd_pasid_detach(pasid) \ + ASSERT_EQ(0, _test_cmd_pasid_detach(self->fd, self->stdev_id, pasid)) + +static int test_cmd_pasid_check_domain(int fd, __u32 stdev_id, __u32 pasid, + __u32 hwpt_id, 
bool *result) +{ + struct iommu_test_cmd test_pasid_check = { + .size = sizeof(test_pasid_check), + .op = IOMMU_TEST_OP_PASID_CHECK_DOMAIN, + .id = stdev_id, + .pasid_check = { + .pasid = pasid, + .hwpt_id = hwpt_id, + .out_result_ptr = (__u64)result, + }, + }; + + return ioctl(fd, _IOMMU_TEST_CMD(IOMMU_TEST_OP_PASID_CHECK_DOMAIN), &test_pasid_check); +}