IOMMU hardwares that support nested translation would have two stages address translation (normally mentioned as stage-1 and stage-2). The page table formats of the stage-1 and stage-2 can be different. e.g., VT-d has different page table formats for stage-1 and stage-2.
Nested parent domain is the iommu domain used to represent the stage-2 translation. In IOMMUFD, both stage-1 and stage-2 translation are tracked as HWPT (a.k.a. iommu domain). Stage-2 HWPT is parent of stage-1 HWPT as stage-1 cannot work alone in nested translation. In the cases of stage-1 and stage-2 page table format are different, the parent HWPT should use exactly the stage-2 page table format. However, the existing kernel hides the format selection in iommu drivers, so the domain allocated via IOMMU_HWPT_ALLOC can use either stage-1 page table format or stage-2 page table format, there is no guarantees for it.
To enforce the page table format of the nested parent domain, this series introduces a new iommu op (domain_alloc_user) which can accept user flags to allocate domain as userspace requires. It also converts IOMMUFD to use the new domain_alloc_user op for domain allocation if supported, then extends the IOMMU_HWPT_ALLOC ioctl to pass down a NEST_PARENT flag to allocate a HWPT which can be used as parent. This series implements the new op in Intel iommu driver to have a complete picture. It is a preparation for adding nesting support in IOMMUFD/IOMMU.
Complete code can be found: https://github.com/yiliu1765/iommufd/tree/iommufd_alloc_user_v2
Change log:
v2: - Require domain_alloc_user op if IOMMU_HWPT_ALLOC passes non-zero flags (Kevin) - IOMMUFD core should check kernel known flags while iommu driver needs to check supported flags as well (Jason) - Minor tweaks per Baolu's comment
v1: https://lore.kernel.org/linux-iommu/20230919092523.39286-1-yi.l.liu@intel.co...
Regards, Yi Liu
Yi Liu (6): iommu: Add new iommu op to create domains owned by userspace iommufd/hw_pagetable: Use domain_alloc_user op for domain allocation iommufd/hw_pagetable: Accepts user flags for domain allocation iommufd/hw_pagetable: Support allocating nested parent domain iommufd/selftest: Add domain_alloc_user() support in iommu mock iommu/vt-d: Add domain_alloc_user op
drivers/iommu/intel/iommu.c | 28 +++++++++++++++++ drivers/iommu/iommufd/device.c | 2 +- drivers/iommu/iommufd/hw_pagetable.c | 31 ++++++++++++++----- drivers/iommu/iommufd/iommufd_private.h | 3 +- drivers/iommu/iommufd/selftest.c | 19 ++++++++++++ include/linux/iommu.h | 11 ++++++- include/uapi/linux/iommufd.h | 12 ++++++- tools/testing/selftests/iommu/iommufd.c | 24 +++++++++++--- .../selftests/iommu/iommufd_fail_nth.c | 2 +- tools/testing/selftests/iommu/iommufd_utils.h | 11 +++++-- 10 files changed, 124 insertions(+), 19 deletions(-)
Introduce a new iommu_domain op to create domains owned by userspace, e.g. through IOMMUFD. These domains have a few different properties compares to kernel owned domains:
- They may be UNMANAGED domains, but created with special parameters. For instance aperture size changes/number of levels, different IOPTE formats, or other things necessary to make a vIOMMU work
- We have to track all the memory allocations with GFP_KERNEL_ACCOUNT to make the cgroup sandbox stronger
- Device-specialty domains, such as NESTED domains can be created by IOMMUFD.
The new op clearly says the domain is being created by IOMMUFD, that the domain is intended for userspace use, and it provides a way to pass user flags or a driver specific uAPI structure to customize the created domain to exactly what the vIOMMU userspace driver requires.
iommu drivers that cannot support VFIO/IOMMUFD should not support this op. This includes any driver that cannot provide a fully functional UNMANAGED domain.
This new op for now is only supposed to be used by IOMMUFD, hence no wrapper for it. IOMMUFD would call the callback directly. As for domain free, IOMMUFD would use iommu_domain_free().
Suggested-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Lu Baolu baolu.lu@linux.intel.com Co-developed-by: Nicolin Chen nicolinc@nvidia.com Signed-off-by: Nicolin Chen nicolinc@nvidia.com Signed-off-by: Yi Liu yi.l.liu@intel.com --- include/linux/iommu.h | 11 ++++++++++- include/uapi/linux/iommufd.h | 12 +++++++++++- 2 files changed, 21 insertions(+), 2 deletions(-)
diff --git a/include/linux/iommu.h b/include/linux/iommu.h index c50a769d569a..3861d66b65c1 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -234,7 +234,15 @@ struct iommu_iotlb_gather { * op is allocated in the iommu driver and freed by the caller after * use. The information type is one of enum iommu_hw_info_type defined * in include/uapi/linux/iommufd.h. - * @domain_alloc: allocate iommu domain + * @domain_alloc: allocate and return an iommu domain if success. Otherwise + * NULL is returned. The domain is not fully initialized until + * the caller iommu_domain_alloc() returns. + * @domain_alloc_user: Allocate an iommu domain corresponding to the input + * parameters as defined in include/uapi/linux/iommufd.h. + * Unlike @domain_alloc, it is called only by IOMMUFD and + * must fully initialize the new domain before return. + * Upon success, a domain is returned. Upon failure, + * ERR_PTR must be returned. * @probe_device: Add device to iommu driver handling * @release_device: Remove device from iommu driver handling * @probe_finalize: Do final setup work after the device is added to an IOMMU @@ -267,6 +275,7 @@ struct iommu_ops {
/* Domain allocation and freeing by the iommu driver */ struct iommu_domain *(*domain_alloc)(unsigned iommu_domain_type); + struct iommu_domain *(*domain_alloc_user)(struct device *dev, u32 flags);
struct iommu_device *(*probe_device)(struct device *dev); void (*release_device)(struct device *dev); diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h index b4ba0c0cbab6..4a7c5c8fdbb4 100644 --- a/include/uapi/linux/iommufd.h +++ b/include/uapi/linux/iommufd.h @@ -347,10 +347,20 @@ struct iommu_vfio_ioas { }; #define IOMMU_VFIO_IOAS _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VFIO_IOAS)
+/** + * enum iommufd_hwpt_alloc_flags - Flags for HWPT allocation + * @IOMMU_HWPT_ALLOC_NEST_PARENT: If set, allocate a domain which can serve + * as the parent domain in the nesting + * configuration. + */ +enum iommufd_hwpt_alloc_flags { + IOMMU_HWPT_ALLOC_NEST_PARENT = 1 << 0, +}; + /** * struct iommu_hwpt_alloc - ioctl(IOMMU_HWPT_ALLOC) * @size: sizeof(struct iommu_hwpt_alloc) - * @flags: Must be 0 + * @flags: Combination of enum iommufd_hwpt_alloc_flags * @dev_id: The device to allocate this HWPT for * @pt_id: The IOAS to connect this HWPT to * @out_hwpt_id: The ID of the new HWPT
From: Liu, Yi L yi.l.liu@intel.com Sent: Thursday, September 28, 2023 3:15 PM
Introduce a new iommu_domain op to create domains owned by userspace, e.g. through IOMMUFD. These domains have a few different properties compares to kernel owned domains:
They may be UNMANAGED domains, but created with special parameters. For instance aperture size changes/number of levels, different IOPTE formats, or other things necessary to make a vIOMMU work
We have to track all the memory allocations with GFP_KERNEL_ACCOUNT to make the cgroup sandbox stronger
Device-specialty domains, such as NESTED domains can be created by IOMMUFD.
The new op clearly says the domain is being created by IOMMUFD, that the domain is intended for userspace use, and it provides a way to pass user flags or a driver specific uAPI structure to customize the created domain to exactly what the vIOMMU userspace driver requires.
iommu drivers that cannot support VFIO/IOMMUFD should not support this op. This includes any driver that cannot provide a fully functional UNMANAGED domain.
This new op for now is only supposed to be used by IOMMUFD, hence no wrapper for it. IOMMUFD would call the callback directly. As for domain free, IOMMUFD would use iommu_domain_free().
Suggested-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Lu Baolu baolu.lu@linux.intel.com Co-developed-by: Nicolin Chen nicolinc@nvidia.com Signed-off-by: Nicolin Chen nicolinc@nvidia.com Signed-off-by: Yi Liu yi.l.liu@intel.com
Reviewed-by: Kevin Tian kevin.tian@intel.com
On Thu, Sep 28, 2023 at 12:15:23AM -0700, Yi Liu wrote:
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h index b4ba0c0cbab6..4a7c5c8fdbb4 100644 --- a/include/uapi/linux/iommufd.h +++ b/include/uapi/linux/iommufd.h @@ -347,10 +347,20 @@ struct iommu_vfio_ioas { }; #define IOMMU_VFIO_IOAS _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VFIO_IOAS)
+/**
- enum iommufd_hwpt_alloc_flags - Flags for HWPT allocation
- @IOMMU_HWPT_ALLOC_NEST_PARENT: If set, allocate a domain which can serve
as the parent domain in the nesting
configuration.
I just noticed a nit here: we should probably align with other parts of this file by using "HWPT" v.s. "domain"? I.e.
+ * @IOMMU_HWPT_ALLOC_NEST_PARENT: If set, allocate a HWPT which can serve + * as the parent HWPT in the nesting + * configuration.
Thanks Nic
On Sun, Oct 15, 2023 at 12:14:01AM -0700, Nicolin Chen wrote:
On Thu, Sep 28, 2023 at 12:15:23AM -0700, Yi Liu wrote:
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h index b4ba0c0cbab6..4a7c5c8fdbb4 100644 --- a/include/uapi/linux/iommufd.h +++ b/include/uapi/linux/iommufd.h @@ -347,10 +347,20 @@ struct iommu_vfio_ioas { }; #define IOMMU_VFIO_IOAS _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VFIO_IOAS)
+/**
- enum iommufd_hwpt_alloc_flags - Flags for HWPT allocation
- @IOMMU_HWPT_ALLOC_NEST_PARENT: If set, allocate a domain which can serve
as the parent domain in the nesting
configuration.
I just noticed a nit here: we should probably align with other parts of this file by using "HWPT" v.s. "domain"? I.e.
- @IOMMU_HWPT_ALLOC_NEST_PARENT: If set, allocate a HWPT which can serve
as the parent HWPT in the nesting
configuration.
Yes
Jason
On Mon, Oct 16, 2023 at 09:04:54AM -0300, Jason Gunthorpe wrote:
On Sun, Oct 15, 2023 at 12:14:01AM -0700, Nicolin Chen wrote:
On Thu, Sep 28, 2023 at 12:15:23AM -0700, Yi Liu wrote:
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h index b4ba0c0cbab6..4a7c5c8fdbb4 100644 --- a/include/uapi/linux/iommufd.h +++ b/include/uapi/linux/iommufd.h @@ -347,10 +347,20 @@ struct iommu_vfio_ioas { }; #define IOMMU_VFIO_IOAS _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VFIO_IOAS)
+/**
- enum iommufd_hwpt_alloc_flags - Flags for HWPT allocation
- @IOMMU_HWPT_ALLOC_NEST_PARENT: If set, allocate a domain which can serve
as the parent domain in the nesting
configuration.
I just noticed a nit here: we should probably align with other parts of this file by using "HWPT" v.s. "domain"? I.e.
- @IOMMU_HWPT_ALLOC_NEST_PARENT: If set, allocate a HWPT which can serve
as the parent HWPT in the nesting
configuration.
Yes
Should we resend? Or would it be possible for you to update it in your for-next tree?
On Mon, Oct 16, 2023 at 10:46:10AM -0700, Nicolin Chen wrote:
On Mon, Oct 16, 2023 at 09:04:54AM -0300, Jason Gunthorpe wrote:
On Sun, Oct 15, 2023 at 12:14:01AM -0700, Nicolin Chen wrote:
On Thu, Sep 28, 2023 at 12:15:23AM -0700, Yi Liu wrote:
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h index b4ba0c0cbab6..4a7c5c8fdbb4 100644 --- a/include/uapi/linux/iommufd.h +++ b/include/uapi/linux/iommufd.h @@ -347,10 +347,20 @@ struct iommu_vfio_ioas { }; #define IOMMU_VFIO_IOAS _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VFIO_IOAS)
+/**
- enum iommufd_hwpt_alloc_flags - Flags for HWPT allocation
- @IOMMU_HWPT_ALLOC_NEST_PARENT: If set, allocate a domain which can serve
as the parent domain in the nesting
configuration.
I just noticed a nit here: we should probably align with other parts of this file by using "HWPT" v.s. "domain"? I.e.
- @IOMMU_HWPT_ALLOC_NEST_PARENT: If set, allocate a HWPT which can serve
as the parent HWPT in the nesting
configuration.
Yes
Should we resend? Or would it be possible for you to update it in your for-next tree?
At this point send a Fixes: patch
Jason
This makes IOMMUFD to use iommu_domain_alloc_user() for iommu_domain creation as IOMMUFD needs to support iommu_domain allocation with parameters from userspace in nested support. If the iommu driver doesn't provide domain_alloc_user callback then IOMMUFD falls back to use iommu_domain_alloc().
Suggested-by: Jason Gunthorpe jgg@nvidia.com Reviewed-by: Lu Baolu baolu.lu@linux.intel.com Reviewed-by: Kevin Tian kevin.tian@intel.com Co-developed-by: Nicolin Chen nicolinc@nvidia.com Signed-off-by: Nicolin Chen nicolinc@nvidia.com Signed-off-by: Yi Liu yi.l.liu@intel.com --- drivers/iommu/iommufd/hw_pagetable.c | 19 +++++++++++++++---- 1 file changed, 15 insertions(+), 4 deletions(-)
diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c index cf2c1504e20d..48874f896521 100644 --- a/drivers/iommu/iommufd/hw_pagetable.c +++ b/drivers/iommu/iommufd/hw_pagetable.c @@ -5,6 +5,7 @@ #include <linux/iommu.h> #include <uapi/linux/iommufd.h>
+#include "../iommu-priv.h" #include "iommufd_private.h"
void iommufd_hw_pagetable_destroy(struct iommufd_object *obj) @@ -74,6 +75,7 @@ struct iommufd_hw_pagetable * iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, struct iommufd_device *idev, bool immediate_attach) { + const struct iommu_ops *ops = dev_iommu_ops(idev->dev); struct iommufd_hw_pagetable *hwpt; int rc;
@@ -88,10 +90,19 @@ iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, refcount_inc(&ioas->obj.users); hwpt->ioas = ioas;
- hwpt->domain = iommu_domain_alloc(idev->dev->bus); - if (!hwpt->domain) { - rc = -ENOMEM; - goto out_abort; + if (ops->domain_alloc_user) { + hwpt->domain = ops->domain_alloc_user(idev->dev, 0); + if (IS_ERR(hwpt->domain)) { + rc = PTR_ERR(hwpt->domain); + hwpt->domain = NULL; + goto out_abort; + } + } else { + hwpt->domain = iommu_domain_alloc(idev->dev->bus); + if (!hwpt->domain) { + rc = -ENOMEM; + goto out_abort; + } }
/*
This extends iommufd_hw_pagetable_alloc() to accepts user flags.
Reviewed-by: Kevin Tian kevin.tian@intel.com Signed-off-by: Yi Liu yi.l.liu@intel.com --- drivers/iommu/iommufd/device.c | 2 +- drivers/iommu/iommufd/hw_pagetable.c | 9 ++++++--- drivers/iommu/iommufd/iommufd_private.h | 3 ++- 3 files changed, 9 insertions(+), 5 deletions(-)
diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c index ce78c3671539..e88fa73a45e6 100644 --- a/drivers/iommu/iommufd/device.c +++ b/drivers/iommu/iommufd/device.c @@ -540,7 +540,7 @@ iommufd_device_auto_get_domain(struct iommufd_device *idev, }
hwpt = iommufd_hw_pagetable_alloc(idev->ictx, ioas, idev, - immediate_attach); + 0, immediate_attach); if (IS_ERR(hwpt)) { destroy_hwpt = ERR_CAST(hwpt); goto out_unlock; diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c index 48874f896521..5be7a31cbd9c 100644 --- a/drivers/iommu/iommufd/hw_pagetable.c +++ b/drivers/iommu/iommufd/hw_pagetable.c @@ -61,6 +61,7 @@ int iommufd_hw_pagetable_enforce_cc(struct iommufd_hw_pagetable *hwpt) * @ictx: iommufd context * @ioas: IOAS to associate the domain with * @idev: Device to get an iommu_domain for + * @flags: Flags from userspace * @immediate_attach: True if idev should be attached to the hwpt * * Allocate a new iommu_domain and return it as a hw_pagetable. The HWPT @@ -73,7 +74,8 @@ int iommufd_hw_pagetable_enforce_cc(struct iommufd_hw_pagetable *hwpt) */ struct iommufd_hw_pagetable * iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, - struct iommufd_device *idev, bool immediate_attach) + struct iommufd_device *idev, u32 flags, + bool immediate_attach) { const struct iommu_ops *ops = dev_iommu_ops(idev->dev); struct iommufd_hw_pagetable *hwpt; @@ -91,7 +93,7 @@ iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, hwpt->ioas = ioas;
if (ops->domain_alloc_user) { - hwpt->domain = ops->domain_alloc_user(idev->dev, 0); + hwpt->domain = ops->domain_alloc_user(idev->dev, flags); if (IS_ERR(hwpt->domain)) { rc = PTR_ERR(hwpt->domain); hwpt->domain = NULL; @@ -166,7 +168,8 @@ int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd) }
mutex_lock(&ioas->mutex); - hwpt = iommufd_hw_pagetable_alloc(ucmd->ictx, ioas, idev, false); + hwpt = iommufd_hw_pagetable_alloc(ucmd->ictx, ioas, + idev, cmd->flags, false); if (IS_ERR(hwpt)) { rc = PTR_ERR(hwpt); goto out_unlock; diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h index 2c58670011fe..3064997a0181 100644 --- a/drivers/iommu/iommufd/iommufd_private.h +++ b/drivers/iommu/iommufd/iommufd_private.h @@ -242,7 +242,8 @@ struct iommufd_hw_pagetable {
struct iommufd_hw_pagetable * iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas, - struct iommufd_device *idev, bool immediate_attach); + struct iommufd_device *idev, u32 flags, + bool immediate_attach); int iommufd_hw_pagetable_enforce_cc(struct iommufd_hw_pagetable *hwpt); int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt, struct iommufd_device *idev);
On 9/28/23 3:15 PM, Yi Liu wrote:
This extends iommufd_hw_pagetable_alloc() to accepts user flags.
Reviewed-by: Kevin Tiankevin.tian@intel.com Signed-off-by: Yi Liuyi.l.liu@intel.com
drivers/iommu/iommufd/device.c | 2 +- drivers/iommu/iommufd/hw_pagetable.c | 9 ++++++--- drivers/iommu/iommufd/iommufd_private.h | 3 ++- 3 files changed, 9 insertions(+), 5 deletions(-)
Reviewed-by: Lu Baolu baolu.lu@linux.intel.com
Best regards, baolu
This extends IOMMU_HWPT_ALLOC to allocate domains used as parent (stage-2) in nested translation.
Signed-off-by: Yi Liu yi.l.liu@intel.com --- drivers/iommu/iommufd/hw_pagetable.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c index 5be7a31cbd9c..8b3d2875d642 100644 --- a/drivers/iommu/iommufd/hw_pagetable.c +++ b/drivers/iommu/iommufd/hw_pagetable.c @@ -83,6 +83,9 @@ iommufd_hw_pagetable_alloc(struct iommufd_ctx *ictx, struct iommufd_ioas *ioas,
lockdep_assert_held(&ioas->mutex);
+ if (flags && !ops->domain_alloc_user) + return ERR_PTR(-EOPNOTSUPP); + hwpt = iommufd_object_alloc(ictx, hwpt, IOMMUFD_OBJ_HW_PAGETABLE); if (IS_ERR(hwpt)) return hwpt; @@ -154,7 +157,7 @@ int iommufd_hwpt_alloc(struct iommufd_ucmd *ucmd) struct iommufd_ioas *ioas; int rc;
- if (cmd->flags || cmd->__reserved) + if ((cmd->flags & (~IOMMU_HWPT_ALLOC_NEST_PARENT)) || cmd->__reserved) return -EOPNOTSUPP;
idev = iommufd_get_device(ucmd, cmd->dev_id);
On 9/28/23 3:15 PM, Yi Liu wrote:
This extends IOMMU_HWPT_ALLOC to allocate domains used as parent (stage-2) in nested translation.
Signed-off-by: Yi Liuyi.l.liu@intel.com
drivers/iommu/iommufd/hw_pagetable.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-)
Reviewed-by: Lu Baolu baolu.lu@linux.intel.com
Best regards, baolu
From: Liu, Yi L yi.l.liu@intel.com Sent: Thursday, September 28, 2023 3:15 PM
This extends IOMMU_HWPT_ALLOC to allocate domains used as parent (stage-2) in nested translation.
Signed-off-by: Yi Liu yi.l.liu@intel.com
Reviewed-by: Kevin Tian kevin.tian@intel.com
This adds mock_domain_alloc_user function and also new test case for the new flag IOMMU_HWPT_ALLOC_NEST_PARENT.
Co-developed-by: Nicolin Chen nicolinc@nvidia.com Signed-off-by: Nicolin Chen nicolinc@nvidia.com Signed-off-by: Yi Liu yi.l.liu@intel.com --- drivers/iommu/iommufd/selftest.c | 19 +++++++++++++++ tools/testing/selftests/iommu/iommufd.c | 24 +++++++++++++++---- .../selftests/iommu/iommufd_fail_nth.c | 2 +- tools/testing/selftests/iommu/iommufd_utils.h | 11 ++++++--- 4 files changed, 48 insertions(+), 8 deletions(-)
diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c index 56506d5753f1..fe7e3c7d933a 100644 --- a/drivers/iommu/iommufd/selftest.c +++ b/drivers/iommu/iommufd/selftest.c @@ -146,6 +146,8 @@ static void *mock_domain_hw_info(struct device *dev, u32 *length, u32 *type) return info; }
+static const struct iommu_ops mock_ops; + static struct iommu_domain *mock_domain_alloc(unsigned int iommu_domain_type) { struct mock_iommu_domain *mock; @@ -162,10 +164,26 @@ static struct iommu_domain *mock_domain_alloc(unsigned int iommu_domain_type) mock->domain.geometry.aperture_start = MOCK_APERTURE_START; mock->domain.geometry.aperture_end = MOCK_APERTURE_LAST; mock->domain.pgsize_bitmap = MOCK_IO_PAGE_SIZE; + mock->domain.ops = mock_ops.default_domain_ops; + mock->domain.type = iommu_domain_type; xa_init(&mock->pfns); return &mock->domain; }
+static struct iommu_domain * +mock_domain_alloc_user(struct device *dev, u32 flags) +{ + struct iommu_domain *domain; + + if (flags & (~IOMMU_HWPT_ALLOC_NEST_PARENT)) + return ERR_PTR(-EOPNOTSUPP); + + domain = mock_domain_alloc(IOMMU_DOMAIN_UNMANAGED); + if (!domain) + domain = ERR_PTR(-ENOMEM); + return domain; +} + static void mock_domain_free(struct iommu_domain *domain) { struct mock_iommu_domain *mock = @@ -307,6 +325,7 @@ static const struct iommu_ops mock_ops = { .pgsize_bitmap = MOCK_IO_PAGE_SIZE, .hw_info = mock_domain_hw_info, .domain_alloc = mock_domain_alloc, + .domain_alloc_user = mock_domain_alloc_user, .capable = mock_domain_capable, .set_platform_dma_ops = mock_domain_set_plaform_dma_ops, .device_group = generic_device_group, diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c index 9f705c1ea30f..9c129e63d7c7 100644 --- a/tools/testing/selftests/iommu/iommufd.c +++ b/tools/testing/selftests/iommu/iommufd.c @@ -114,6 +114,7 @@ TEST_F(iommufd, cmd_length)
TEST_LENGTH(iommu_destroy, IOMMU_DESTROY); TEST_LENGTH(iommu_hw_info, IOMMU_GET_HW_INFO); + TEST_LENGTH(iommu_hwpt_alloc, IOMMU_HWPT_ALLOC); TEST_LENGTH(iommu_ioas_alloc, IOMMU_IOAS_ALLOC); TEST_LENGTH(iommu_ioas_iova_ranges, IOMMU_IOAS_IOVA_RANGES); TEST_LENGTH(iommu_ioas_allow_iovas, IOMMU_IOAS_ALLOW_IOVAS); @@ -1404,13 +1405,28 @@ TEST_F(iommufd_mock_domain, alloc_hwpt) int i;
for (i = 0; i != variant->mock_domains; i++) { + uint32_t hwpt_id[2]; uint32_t stddev_id; - uint32_t hwpt_id;
- test_cmd_hwpt_alloc(self->idev_ids[i], self->ioas_id, &hwpt_id); - test_cmd_mock_domain(hwpt_id, &stddev_id, NULL, NULL); + test_err_hwpt_alloc(EOPNOTSUPP, + self->idev_ids[i], self->ioas_id, + ~IOMMU_HWPT_ALLOC_NEST_PARENT, &hwpt_id[0]); + test_cmd_hwpt_alloc(self->idev_ids[i], self->ioas_id, + 0, &hwpt_id[0]); + test_cmd_hwpt_alloc(self->idev_ids[i], self->ioas_id, + IOMMU_HWPT_ALLOC_NEST_PARENT, &hwpt_id[1]); + + /* Do a hw_pagetable rotation test */ + test_cmd_mock_domain_replace(self->stdev_ids[i], hwpt_id[0]); + EXPECT_ERRNO(EBUSY, _test_ioctl_destroy(self->fd, hwpt_id[0])); + test_cmd_mock_domain_replace(self->stdev_ids[i], hwpt_id[1]); + EXPECT_ERRNO(EBUSY, _test_ioctl_destroy(self->fd, hwpt_id[1])); + test_cmd_mock_domain_replace(self->stdev_ids[i], self->ioas_id); + test_ioctl_destroy(hwpt_id[1]); + + test_cmd_mock_domain(hwpt_id[0], &stddev_id, NULL, NULL); test_ioctl_destroy(stddev_id); - test_ioctl_destroy(hwpt_id); + test_ioctl_destroy(hwpt_id[0]); } }
diff --git a/tools/testing/selftests/iommu/iommufd_fail_nth.c b/tools/testing/selftests/iommu/iommufd_fail_nth.c index a220ca2a689d..3d7838506bfe 100644 --- a/tools/testing/selftests/iommu/iommufd_fail_nth.c +++ b/tools/testing/selftests/iommu/iommufd_fail_nth.c @@ -615,7 +615,7 @@ TEST_FAIL_NTH(basic_fail_nth, device) if (_test_cmd_get_hw_info(self->fd, idev_id, &info, sizeof(info))) return -1;
- if (_test_cmd_hwpt_alloc(self->fd, idev_id, ioas_id, &hwpt_id)) + if (_test_cmd_hwpt_alloc(self->fd, idev_id, ioas_id, 0, &hwpt_id)) return -1;
if (_test_cmd_mock_domain_replace(self->fd, stdev_id, ioas_id2, NULL)) diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h index e0753d03ecaa..be4970a84977 100644 --- a/tools/testing/selftests/iommu/iommufd_utils.h +++ b/tools/testing/selftests/iommu/iommufd_utils.h @@ -103,10 +103,11 @@ static int _test_cmd_mock_domain_replace(int fd, __u32 stdev_id, __u32 pt_id, pt_id, NULL))
static int _test_cmd_hwpt_alloc(int fd, __u32 device_id, __u32 pt_id, - __u32 *hwpt_id) + __u32 flags, __u32 *hwpt_id) { struct iommu_hwpt_alloc cmd = { .size = sizeof(cmd), + .flags = flags, .dev_id = device_id, .pt_id = pt_id, }; @@ -120,8 +121,12 @@ static int _test_cmd_hwpt_alloc(int fd, __u32 device_id, __u32 pt_id, return 0; }
-#define test_cmd_hwpt_alloc(device_id, pt_id, hwpt_id) \ - ASSERT_EQ(0, _test_cmd_hwpt_alloc(self->fd, device_id, pt_id, hwpt_id)) +#define test_cmd_hwpt_alloc(device_id, pt_id, flags, hwpt_id) \ + ASSERT_EQ(0, _test_cmd_hwpt_alloc(self->fd, device_id, \ + pt_id, flags, hwpt_id)) +#define test_err_hwpt_alloc(_errno, device_id, pt_id, flags, hwpt_id) \ + EXPECT_ERRNO(_errno, _test_cmd_hwpt_alloc(self->fd, device_id, \ + pt_id, flags, hwpt_id))
static int _test_cmd_access_replace_ioas(int fd, __u32 access_id, unsigned int ioas_id)
From: Liu, Yi L yi.l.liu@intel.com Sent: Thursday, September 28, 2023 3:15 PM
This adds mock_domain_alloc_user function and also new test case for the new flag IOMMU_HWPT_ALLOC_NEST_PARENT.
Co-developed-by: Nicolin Chen nicolinc@nvidia.com Signed-off-by: Nicolin Chen nicolinc@nvidia.com Signed-off-by: Yi Liu yi.l.liu@intel.com
Reviewed-by: Kevin Tian kevin.tian@intel.com
This adds the domain_alloc_user op implementation. It supports allocating domains to be used as parent under nested translation.
Signed-off-by: Yi Liu yi.l.liu@intel.com --- drivers/iommu/intel/iommu.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 5db283c17e0d..017aed5813d8 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -4074,6 +4074,33 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type) return NULL; }
+static struct iommu_domain * +intel_iommu_domain_alloc_user(struct device *dev, u32 flags) +{ + struct iommu_domain *domain; + struct intel_iommu *iommu; + + if (flags & (~IOMMU_HWPT_ALLOC_NEST_PARENT)) + return ERR_PTR(-EOPNOTSUPP); + + iommu = device_to_iommu(dev, NULL, NULL); + if (!iommu) + return ERR_PTR(-ENODEV); + + if ((flags & IOMMU_HWPT_ALLOC_NEST_PARENT) && !ecap_nest(iommu->ecap)) + return ERR_PTR(-EOPNOTSUPP); + + /* + * domain_alloc_user op needs to fully initialize a domain + * before return, so uses iommu_domain_alloc() here for + * simple. + */ + domain = iommu_domain_alloc(dev->bus); + if (!domain) + domain = ERR_PTR(-ENOMEM); + return domain; +} + static void intel_iommu_domain_free(struct iommu_domain *domain) { if (domain != &si_domain->domain && domain != &blocking_domain) @@ -4807,6 +4834,7 @@ const struct iommu_ops intel_iommu_ops = { .capable = intel_iommu_capable, .hw_info = intel_iommu_hw_info, .domain_alloc = intel_iommu_domain_alloc, + .domain_alloc_user = intel_iommu_domain_alloc_user, .probe_device = intel_iommu_probe_device, .probe_finalize = intel_iommu_probe_finalize, .release_device = intel_iommu_release_device,
On 9/28/23 3:15 PM, Yi Liu wrote:
This adds the domain_alloc_user op implementation. It supports allocating domains to be used as parent under nested translation.
Signed-off-by: Yi Liuyi.l.liu@intel.com
drivers/iommu/intel/iommu.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+)
Reviewed-by: Lu Baolu baolu.lu@linux.intel.com
Best regards, baolu
From: Liu, Yi L yi.l.liu@intel.com Sent: Thursday, September 28, 2023 3:15 PM
This adds the domain_alloc_user op implementation. It supports allocating domains to be used as parent under nested translation.
Signed-off-by: Yi Liu yi.l.liu@intel.com
Reviewed-by: Kevin Tian kevin.tian@intel.com
On Thu, Sep 28, 2023 at 12:15:28AM -0700, Yi Liu wrote:
This adds the domain_alloc_user op implementation. It supports allocating domains to be used as parent under nested translation.
Signed-off-by: Yi Liu yi.l.liu@intel.com
drivers/iommu/intel/iommu.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 5db283c17e0d..017aed5813d8 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -4074,6 +4074,33 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type) return NULL; } +static struct iommu_domain * +intel_iommu_domain_alloc_user(struct device *dev, u32 flags) +{
- struct iommu_domain *domain;
- struct intel_iommu *iommu;
- if (flags & (~IOMMU_HWPT_ALLOC_NEST_PARENT))
return ERR_PTR(-EOPNOTSUPP);
- iommu = device_to_iommu(dev, NULL, NULL);
- if (!iommu)
return ERR_PTR(-ENODEV);
Why isn't this just
struct device_domain_info *info = dev_iommu_priv_get(dev) struct intel_iommu *iommu = info->iommu
???
Same question for almost all other calls to this function! The one in probe is reasonable, but I don't think it should be ever called again.
I'm going to leave this, but please make a series cleaning all the device_to_iommu() stuff next cycle..
Jason
On 10/11/23 12:43 AM, Jason Gunthorpe wrote:
On Thu, Sep 28, 2023 at 12:15:28AM -0700, Yi Liu wrote:
This adds the domain_alloc_user op implementation. It supports allocating domains to be used as parent under nested translation.
Signed-off-by: Yi Liuyi.l.liu@intel.com
drivers/iommu/intel/iommu.c | 28 ++++++++++++++++++++++++++++ 1 file changed, 28 insertions(+)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c index 5db283c17e0d..017aed5813d8 100644 --- a/drivers/iommu/intel/iommu.c +++ b/drivers/iommu/intel/iommu.c @@ -4074,6 +4074,33 @@ static struct iommu_domain *intel_iommu_domain_alloc(unsigned type) return NULL; } +static struct iommu_domain * +intel_iommu_domain_alloc_user(struct device *dev, u32 flags) +{
- struct iommu_domain *domain;
- struct intel_iommu *iommu;
- if (flags & (~IOMMU_HWPT_ALLOC_NEST_PARENT))
return ERR_PTR(-EOPNOTSUPP);
- iommu = device_to_iommu(dev, NULL, NULL);
- if (!iommu)
return ERR_PTR(-ENODEV);
Why isn't this just
struct device_domain_info *info = dev_iommu_priv_get(dev) struct intel_iommu *iommu = info->iommu
???
Same question for almost all other calls to this function! The one in probe is reasonable, but I don't think it should be ever called again.
I'm going to leave this, but please make a series cleaning all the device_to_iommu() stuff next cycle..
Sure. I also have a plan to do this. device_to_iommu() is only needed in the probe path. For runtime callbacks, we can just retrieve it from the device's iommu private data.
Best regards, baolu
On Thu, Sep 28, 2023 at 12:15:22AM -0700, Yi Liu wrote:
Yi Liu (6): iommu: Add new iommu op to create domains owned by userspace iommufd/hw_pagetable: Use domain_alloc_user op for domain allocation iommufd/hw_pagetable: Accepts user flags for domain allocation iommufd/hw_pagetable: Support allocating nested parent domain iommufd/selftest: Add domain_alloc_user() support in iommu mock iommu/vt-d: Add domain_alloc_user op
I copy edited the commit messages, and moved the hunk adding IOMMU_HWPT_ALLOC_NEST_PARENT from 'iommu: Add new iommu op to create domains owned by userspace' into 'iommufd: Support allocating nested parent domain'
Otherwise applied to iommufd for-next
Thanks, Jason
On 2023/10/11 00:47, Jason Gunthorpe wrote:
On Thu, Sep 28, 2023 at 12:15:22AM -0700, Yi Liu wrote:
Yi Liu (6): iommu: Add new iommu op to create domains owned by userspace iommufd/hw_pagetable: Use domain_alloc_user op for domain allocation iommufd/hw_pagetable: Accepts user flags for domain allocation iommufd/hw_pagetable: Support allocating nested parent domain iommufd/selftest: Add domain_alloc_user() support in iommu mock iommu/vt-d: Add domain_alloc_user op
I copy edited the commit messages, and moved the hunk adding IOMMU_HWPT_ALLOC_NEST_PARENT from 'iommu: Add new iommu op to create domains owned by userspace' into 'iommufd: Support allocating nested parent domain'
sure. this movement makes sense to me.
Otherwise applied to iommufd for-next
thanks. feel free to let me know if an updated version is needed.
linux-kselftest-mirror@lists.linaro.org