Following the previous vIOMMU series, this adds another vDEVICE structure, representing the association from an iommufd_device to an iommufd_viommu. This gives the whole architecture a new "v" layer: _______________________________________________________________________ | iommufd (with vIOMMU/vDEVICE) | | _____________ _____________ | | | | | | | | |----------------| vIOMMU |<---| vDEVICE |<------| | | | | | |_____________| | | | | ______ | | _____________ ___|____ | | | | | | | | | | | | | | | IOAS |<---|(HWPT_PAGING)|<---| HWPT_NESTED |<--| DEVICE | | | | |______| |_____________| |_____________| |________| | |______|________|______________|__________________|_______________|_____| | | | | | ______v_____ | ______v_____ ______v_____ ___v__ | struct | | PFN | (paging) | | (nested) | |struct| |iommu_device| |------>|iommu_domain|<----|iommu_domain|<----|device| |____________| storage|____________| |____________| |______|
This vDEVICE object is used to collect and store all vIOMMU-related device information/attributes in a VM. As an initial series for vDEVICE, add only the virt_id to the vDEVICE, which is a vIOMMU specific device ID in a VM: e.g. vSID of ARM SMMUv3, vDeviceID of AMD IOMMU, and vID of Intel VT-d to a Context Table. This virt_id helps IOMMU drivers to link the vID to a pID of the device against the physical IOMMU instance. This is essential for a vIOMMU-based invalidation, where the request contains a device's vID for a device cache flush, e.g. ATC invalidation.
Therefore, with this vDEVICE object, support a vIOMMU-based invalidation, by reusing IOMMUFD_CMD_HWPT_INVALIDATE for a vIOMMU object to flush cache with a given driver data.
As for the implementation of the series, add driver support in ARM SMMUv3 for a real world use case.
This series is on Github: https://github.com/nicolinc/iommufd/commits/iommufd_viommu_p2-v4
For testing, try this "with-rmr" branch: https://github.com/nicolinc/iommufd/commits/iommufd_viommu_p2-v4-with-rmr Paring QEMU branch for testing: https://github.com/nicolinc/qemu/commits/wip/for_iommufd_viommu_p2-v4
Changelog v4 * Added missing brackets in switch-case * Fixed the unreleased idev refcount issue * Reworked the iommufd_vdevice_alloc allocator * Dropped support for IOMMU_VIOMMU_TYPE_DEFAULT * Added missing TEST_LENGTH and fail_nth coverages * Added a verification to the driver-allocated vDEVICE object * Added an iommufd_vdevice_abort for a missing mutex protection * Added a u64 structure arm_vsmmu_invalidation_cmd for user command conversion v3 https://lore.kernel.org/all/cover.1728491532.git.nicolinc@nvidia.com/ * Added Jason's Reviewed-by * Split this invalidation part out of the part-1 series * Repurposed VDEV_ID ioctl to a wider vDEVICE structure and ioctl * Reduced viommu_api functions by allowing drivers to access viommu and vdevice structure directly * Dropped vdevs_rwsem by using xa_lock instead * Dropped arm_smmu_cache_invalidate_user v2 https://lore.kernel.org/all/cover.1724776335.git.nicolinc@nvidia.com/ * Limited vdev_id to one per idev * Added a rw_sem to protect the vdev_id list * Reworked driver-level APIs with proper lockings * Added a new viommu_api file for IOMMUFD_DRIVER config * Dropped useless iommu_dev point from the viommu structure * Added missing index numnbers to new types in the uAPI header * Dropped IOMMU_VIOMMU_INVALIDATE uAPI; Instead, reuse the HWPT one * Reworked mock_viommu_cache_invalidate() using the new iommu helper * Reordered details of set/unset_vdev_id handlers for proper lockings v1 https://lore.kernel.org/all/cover.1723061377.git.nicolinc@nvidia.com/
Thanks! Nicolin
Jason Gunthorpe (2): iommu: Add iommu_copy_struct_from_full_user_array helper iommu/arm-smmu-v3: Allow ATS for IOMMU_DOMAIN_NESTED
Nicolin Chen (12): iommufd/viommu: Introduce IOMMUFD_OBJ_VDEVICE and its related struct iommufd/viommu: Add IOMMU_VDEVICE_ALLOC ioctl iommufd/selftest: Add IOMMU_VDEVICE_ALLOC test coverage iommu/viommu: Add cache_invalidate to iommufd_viommu_ops iommufd/hw_pagetable: Enforce cache invalidation op on vIOMMU-based hwpt_nested iommufd: Allow hwpt_id to carry viommu_id for IOMMU_HWPT_INVALIDATE iommufd/viommu: Add vdev_to_dev helper iommufd/selftest: Add mock_viommu_cache_invalidate iommufd/selftest: Add IOMMU_TEST_OP_DEV_CHECK_CACHE test command iommufd/selftest: Add vIOMMU coverage for IOMMU_HWPT_INVALIDATE ioctl Documentation: userspace-api: iommufd: Update vDEVICE iommu/arm-smmu-v3: Add arm_vsmmu_cache_invalidate
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 9 +- drivers/iommu/iommufd/iommufd_private.h | 12 ++ drivers/iommu/iommufd/iommufd_test.h | 30 +++ include/linux/iommu.h | 49 ++++- include/linux/iommufd.h | 50 +++++ include/uapi/linux/iommufd.h | 61 +++++- tools/testing/selftests/iommu/iommufd_utils.h | 83 +++++++ .../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 162 +++++++++++++- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 32 ++- drivers/iommu/iommufd/device.c | 11 + drivers/iommu/iommufd/driver.c | 7 + drivers/iommu/iommufd/hw_pagetable.c | 36 +++- drivers/iommu/iommufd/main.c | 7 + drivers/iommu/iommufd/selftest.c | 115 +++++++++- drivers/iommu/iommufd/viommu.c | 108 ++++++++++ tools/testing/selftests/iommu/iommufd.c | 204 +++++++++++++++++- .../selftests/iommu/iommufd_fail_nth.c | 4 + Documentation/userspace-api/iommufd.rst | 41 +++- 18 files changed, 983 insertions(+), 38 deletions(-)
Introduce a new IOMMUFD_OBJ_VDEVICE to represent a physical device, i.e. iommufd_device (idev) object, against an iommufd_viommu (vIOMMU) object in the VM. This vDEVICE object (and its structure) holds all the information and attributes in a VM, regarding the device related to the vIOMMU.
As an initial patch, add a per-vIOMMU virtual ID. This can be: - Virtual StreamID on a nested ARM SMMUv3, an index to a Stream Table - Virtual DeviceID on a nested AMD IOMMU, an index to a Device Table - Virtual ID on a nested Intel VT-D IOMMU, an index to a Context Table Potentially, this vDEVICE structure can hold some vData for Confidential Compute Architecture (CCA).
Add a pair of vdevice_alloc and vdevice_free in struct iommufd_viommu_ops to allow driver-level vDEVICE structure allocations.
Similar to iommufd_viommu_alloc, add an iommufd_vdevice_alloc helper, so IOMMU drivers can allocate core-embedded style structures.
Signed-off-by: Nicolin Chen nicolinc@nvidia.com --- include/linux/iommufd.h | 32 ++++++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+)
diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h index 5c13c35952d8..5d61a1d2947a 100644 --- a/include/linux/iommufd.h +++ b/include/linux/iommufd.h @@ -31,6 +31,7 @@ enum iommufd_object_type { IOMMUFD_OBJ_ACCESS, IOMMUFD_OBJ_FAULT, IOMMUFD_OBJ_VIOMMU, + IOMMUFD_OBJ_VDEVICE, #ifdef CONFIG_IOMMUFD_TEST IOMMUFD_OBJ_SELFTEST, #endif @@ -92,6 +93,14 @@ struct iommufd_viommu { unsigned int type; };
+struct iommufd_vdevice { + struct iommufd_object obj; + struct iommufd_ctx *ictx; + struct iommufd_device *idev; + struct iommufd_viommu *viommu; + u64 id; /* per-vIOMMU virtual ID */ +}; + /** * struct iommufd_viommu_ops - vIOMMU specific operations * @free: Free all driver-specific parts of an iommufd_viommu. The memory of the @@ -101,12 +110,24 @@ struct iommufd_viommu { * must be defined in include/uapi/linux/iommufd.h. * It must fully initialize the new iommu_domain before * returning. Upon failure, ERR_PTR must be returned. + * @vdevice_alloc: Allocate a driver-managed iommufd_vdevice to init some driver + * specific structure or HW procedure. Note that the core-level + * structure is filled by the iommufd core after calling this op. + * It is suggested to call iommufd_vdevice_alloc() helper for + * a bundled allocation of the core and the driver structures, + * using the ictx pointer in the given @viommu. + * @vdevice_free: Free a driver-managed iommufd_vdevice to de-init its structure + * or HW procedure. The memory of the vdevice will be free-ed by + * iommufd core. */ struct iommufd_viommu_ops { void (*free)(struct iommufd_viommu *viommu); struct iommu_domain *(*domain_alloc_nested)( struct iommufd_viommu *viommu, const struct iommu_user_data *user_data); + struct iommufd_vdevice *(*vdevice_alloc)(struct iommufd_viommu *viommu, + struct device *dev, u64 id); + void (*vdevice_free)(struct iommufd_vdevice *vdev); };
#if IS_ENABLED(CONFIG_IOMMUFD) @@ -200,4 +221,15 @@ _iommufd_object_alloc(struct iommufd_ctx *ictx, size_t size, ret->member.ops = viommu_ops; \ ret; \ }) +#define iommufd_vdevice_alloc(ictx, drv_struct, member) \ + ({ \ + static_assert( \ + __same_type(struct iommufd_vdevice, \ + ((struct drv_struct *)NULL)->member)); \ + static_assert(offsetof(struct drv_struct, member.obj) == 0); \ + container_of(_iommufd_object_alloc(ictx, \ + sizeof(struct drv_struct), \ + IOMMUFD_OBJ_VDEVICE), \ + struct drv_struct, member.obj); \ + }) #endif
On Mon, Oct 21, 2024 at 05:20:10PM -0700, Nicolin Chen wrote:
struct iommufd_viommu_ops {
struct iommufd_vdevice *(*vdevice_alloc)(struct iommufd_viommu *viommu,
struct device *dev, u64 id);
void (*vdevice_free)(struct iommufd_vdevice *vdev);
...
+#define iommufd_vdevice_alloc(ictx, drv_struct, member) \
({ \
static_assert( \
__same_type(struct iommufd_vdevice, \
((struct drv_struct *)NULL)->member)); \
static_assert(offsetof(struct drv_struct, member.obj) == 0); \
container_of(_iommufd_object_alloc(ictx, \
sizeof(struct drv_struct), \
IOMMUFD_OBJ_VDEVICE), \
struct drv_struct, member.obj); \
})
Per discussion in vIRQ series [1], we might not need to expose struct iommufd_vdevice. So, dropping most of the changes here, and moving iommufd_device to the private header.
[1] https://lore.kernel.org/linux-iommu/ZxlGfgfwrGZGIbeF@Asurada-Nvidia/
Nicolin
On 22/10/24 11:20, Nicolin Chen wrote:
Introduce a new IOMMUFD_OBJ_VDEVICE to represent a physical device, i.e. iommufd_device (idev) object, against an iommufd_viommu (vIOMMU) object in the VM. This vDEVICE object (and its structure) holds all the information and attributes in a VM, regarding the device related to the vIOMMU.
As an initial patch, add a per-vIOMMU virtual ID. This can be:
- Virtual StreamID on a nested ARM SMMUv3, an index to a Stream Table
- Virtual DeviceID on a nested AMD IOMMU, an index to a Device Table
- Virtual ID on a nested Intel VT-D IOMMU, an index to a Context Table
Potentially, this vDEVICE structure can hold some vData for Confidential Compute Architecture (CCA).
Add a pair of vdevice_alloc and vdevice_free in struct iommufd_viommu_ops to allow driver-level vDEVICE structure allocations.
Similar to iommufd_viommu_alloc, add an iommufd_vdevice_alloc helper, so IOMMU drivers can allocate core-embedded style structures.
Signed-off-by: Nicolin Chen nicolinc@nvidia.com
include/linux/iommufd.h | 32 ++++++++++++++++++++++++++++++++ 1 file changed, 32 insertions(+)
diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h index 5c13c35952d8..5d61a1d2947a 100644 --- a/include/linux/iommufd.h +++ b/include/linux/iommufd.h @@ -31,6 +31,7 @@ enum iommufd_object_type { IOMMUFD_OBJ_ACCESS, IOMMUFD_OBJ_FAULT, IOMMUFD_OBJ_VIOMMU,
- IOMMUFD_OBJ_VDEVICE, #ifdef CONFIG_IOMMUFD_TEST IOMMUFD_OBJ_SELFTEST, #endif
@@ -92,6 +93,14 @@ struct iommufd_viommu { unsigned int type; }; +struct iommufd_vdevice {
- struct iommufd_object obj;
- struct iommufd_ctx *ictx;
- struct iommufd_device *idev;
- struct iommufd_viommu *viommu;
- u64 id; /* per-vIOMMU virtual ID */
+};
- /**
- struct iommufd_viommu_ops - vIOMMU specific operations
- @free: Free all driver-specific parts of an iommufd_viommu. The memory of the
@@ -101,12 +110,24 @@ struct iommufd_viommu {
must be defined in include/uapi/linux/iommufd.h.
It must fully initialize the new iommu_domain before
returning. Upon failure, ERR_PTR must be returned.
- @vdevice_alloc: Allocate a driver-managed iommufd_vdevice to init some driver
specific structure or HW procedure. Note that the core-level
structure is filled by the iommufd core after calling this op.
It is suggested to call iommufd_vdevice_alloc() helper for
a bundled allocation of the core and the driver structures,
using the ictx pointer in the given @viommu.
- @vdevice_free: Free a driver-managed iommufd_vdevice to de-init its structure
or HW procedure. The memory of the vdevice will be free-ed by
*/ struct iommufd_viommu_ops { void (*free)(struct iommufd_viommu *viommu); struct iommu_domain *(*domain_alloc_nested)( struct iommufd_viommu *viommu, const struct iommu_user_data *user_data);
iommufd core.
- struct iommufd_vdevice *(*vdevice_alloc)(struct iommufd_viommu *viommu,
struct device *dev, u64 id);
- void (*vdevice_free)(struct iommufd_vdevice *vdev); };
#if IS_ENABLED(CONFIG_IOMMUFD) @@ -200,4 +221,15 @@ _iommufd_object_alloc(struct iommufd_ctx *ictx, size_t size, ret->member.ops = viommu_ops; \ ret; \ }) +#define iommufd_vdevice_alloc(ictx, drv_struct, member) \
- ({ \
static_assert( \
__same_type(struct iommufd_vdevice, \
((struct drv_struct *)NULL)->member)); \
static_assert(offsetof(struct drv_struct, member.obj) == 0); \
container_of(_iommufd_object_alloc(ictx, \
sizeof(struct drv_struct), \
IOMMUFD_OBJ_VDEVICE), \
struct drv_struct, member.obj); \
- }) #endif
A nit: it hurts eyes to read:
mock_vdev = iommufd_vdevice_alloc(viommu->ictx, mock_vdevice, core);
vs.
mock_vdev = iommufd_vdevice_alloc(viommu->ictx, struct mock_vdevice, core);
as for the former I go searching for a "mock_vdevice" variable and for the latter it is clear it is 1) a macro 2) which does some type checking.
also, it makes it impossible to pass things like typeof(..) or a type from typedef. Thanks,
On Fri, Oct 25, 2024 at 06:53:01PM +1100, Alexey Kardashevskiy wrote:
+#define iommufd_vdevice_alloc(ictx, drv_struct, member) \
- ({ \
static_assert( \
__same_type(struct iommufd_vdevice, \
((struct drv_struct *)NULL)->member)); \
static_assert(offsetof(struct drv_struct, member.obj) == 0); \
container_of(_iommufd_object_alloc(ictx, \
sizeof(struct drv_struct), \
IOMMUFD_OBJ_VDEVICE), \
struct drv_struct, member.obj); \
- }) #endif
A nit: it hurts eyes to read:
mock_vdev = iommufd_vdevice_alloc(viommu->ictx, mock_vdevice, core);
vs.
mock_vdev = iommufd_vdevice_alloc(viommu->ictx, struct mock_vdevice, core);
as for the former I go searching for a "mock_vdevice" variable and for the latter it is clear it is 1) a macro 2) which does some type checking.
also, it makes it impossible to pass things like typeof(..) or a type from typedef. Thanks,
Makes sense to me
And the container_of() should not be used in these macros, the point was to avoid it to make the PTR_ERR behavior cleraer. Just put a force type cast
Jason
On Fri, Oct 25, 2024 at 10:20:54AM -0300, Jason Gunthorpe wrote:
On Fri, Oct 25, 2024 at 06:53:01PM +1100, Alexey Kardashevskiy wrote:
+#define iommufd_vdevice_alloc(ictx, drv_struct, member) \
- ({ \
static_assert( \
__same_type(struct iommufd_vdevice, \
((struct drv_struct *)NULL)->member)); \
static_assert(offsetof(struct drv_struct, member.obj) == 0); \
container_of(_iommufd_object_alloc(ictx, \
sizeof(struct drv_struct), \
IOMMUFD_OBJ_VDEVICE), \
struct drv_struct, member.obj); \
- }) #endif
A nit: it hurts eyes to read:
mock_vdev = iommufd_vdevice_alloc(viommu->ictx, mock_vdevice, core);
vs.
mock_vdev = iommufd_vdevice_alloc(viommu->ictx, struct mock_vdevice, core);
as for the former I go searching for a "mock_vdevice" variable and for the latter it is clear it is 1) a macro 2) which does some type checking.
also, it makes it impossible to pass things like typeof(..) or a type from typedef. Thanks,
Makes sense to me
Ack. Will change accordingly.
And the container_of() should not be used in these macros, the point was to avoid it to make the PTR_ERR behavior cleraer. Just put a force type cast
I recall that I changed it for a compiler complaint. But it seems to be gone now. Will change it back.
Thanks Nicolin
Introduce a new ioctl to allocate a vDEVICE object. Since a vDEVICE object is a connection of an iommufd_iommufd object and an iommufd_viommu object, require both as the ioctl inputs and take refcounts in the ioctl handler.
Add to the vIOMMU object a "vdevs" xarray, indexed by a per-vIOMMU virtual device ID, which can be: - Virtual StreamID on a nested ARM SMMUv3, an index to a Stream Table - Virtual DeviceID on a nested AMD IOMMU, an index to a Device Table - Virtual ID on a nested Intel VT-D IOMMU, an index to a Context Table
Then, let the idev structure hold the allocated vdev pointer with a proper locking protection.
Signed-off-by: Nicolin Chen nicolinc@nvidia.com --- drivers/iommu/iommufd/iommufd_private.h | 12 +++ include/linux/iommufd.h | 2 + include/uapi/linux/iommufd.h | 26 ++++++ drivers/iommu/iommufd/device.c | 11 +++ drivers/iommu/iommufd/main.c | 7 ++ drivers/iommu/iommufd/viommu.c | 108 ++++++++++++++++++++++++ 6 files changed, 166 insertions(+)
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h index 8c9ab35eaea5..05587c60d0d3 100644 --- a/drivers/iommu/iommufd/iommufd_private.h +++ b/drivers/iommu/iommufd/iommufd_private.h @@ -391,6 +391,7 @@ struct iommufd_device { struct iommufd_object obj; struct iommufd_ctx *ictx; struct iommufd_group *igroup; + struct iommufd_vdevice *vdev; struct list_head group_item; /* always the physical device */ struct device *dev; @@ -505,8 +506,19 @@ static inline int iommufd_hwpt_replace_device(struct iommufd_device *idev, return iommu_group_replace_domain(idev->igroup->group, hwpt->domain); }
+static inline struct iommufd_viommu * +iommufd_get_viommu(struct iommufd_ucmd *ucmd, u32 id) +{ + return container_of(iommufd_get_object(ucmd->ictx, id, + IOMMUFD_OBJ_VIOMMU), + struct iommufd_viommu, obj); +} + int iommufd_viommu_alloc_ioctl(struct iommufd_ucmd *ucmd); void iommufd_viommu_destroy(struct iommufd_object *obj); +int iommufd_vdevice_alloc_ioctl(struct iommufd_ucmd *ucmd); +void iommufd_vdevice_destroy(struct iommufd_object *obj); +void iommufd_vdevice_abort(struct iommufd_object *obj);
#ifdef CONFIG_IOMMUFD_TEST int iommufd_test(struct iommufd_ucmd *ucmd); diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h index 5d61a1d2947a..821816e3b1fd 100644 --- a/include/linux/iommufd.h +++ b/include/linux/iommufd.h @@ -90,6 +90,8 @@ struct iommufd_viommu {
const struct iommufd_viommu_ops *ops;
+ struct xarray vdevs; + unsigned int type; };
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h index 09c1b4ba46d8..407747785f6a 100644 --- a/include/uapi/linux/iommufd.h +++ b/include/uapi/linux/iommufd.h @@ -52,6 +52,7 @@ enum { IOMMUFD_CMD_HWPT_INVALIDATE = 0x8d, IOMMUFD_CMD_FAULT_QUEUE_ALLOC = 0x8e, IOMMUFD_CMD_VIOMMU_ALLOC = 0x8f, + IOMMUFD_CMD_VDEVICE_ALLOC = 0x90, };
/** @@ -896,4 +897,29 @@ struct iommu_viommu_alloc { __u32 out_viommu_id; }; #define IOMMU_VIOMMU_ALLOC _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VIOMMU_ALLOC) + +/** + * struct iommu_vdevice_alloc - ioctl(IOMMU_VDEVICE_ALLOC) + * @size: sizeof(struct iommu_vdevice_alloc) + * @viommu_id: vIOMMU ID to associate with the virtual device + * @dev_id: The pyhsical device to allocate a virtual instance on the vIOMMU + * @__reserved: Must be 0 + * @virt_id: Virtual device ID per vIOMMU, e.g. vSID of ARM SMMUv3, vDeviceID + * of AMD IOMMU, and vID of a nested Intel VT-d to a Context Table. + * @out_vdevice_id: Output virtual instance ID for the allocated object + * @__reserved2: Must be 0 + * + * Allocate a virtual device instance (for a physical device) against a vIOMMU. + * This instance holds the device's information (related to its vIOMMU) in a VM. + */ +struct iommu_vdevice_alloc { + __u32 size; + __u32 viommu_id; + __u32 dev_id; + __u32 __reserved; + __aligned_u64 virt_id; + __u32 out_vdevice_id; + __u32 __reserved2; +}; +#define IOMMU_VDEVICE_ALLOC _IO(IOMMUFD_TYPE, IOMMUFD_CMD_VDEVICE_ALLOC) #endif diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c index 5fd3dd420290..e50113305a9c 100644 --- a/drivers/iommu/iommufd/device.c +++ b/drivers/iommu/iommufd/device.c @@ -277,6 +277,17 @@ EXPORT_SYMBOL_NS_GPL(iommufd_ctx_has_group, IOMMUFD); */ void iommufd_device_unbind(struct iommufd_device *idev) { + u32 vdev_id = 0; + + /* idev->vdev object should be destroyed prior, yet just in case.. */ + mutex_lock(&idev->igroup->lock); + if (idev->vdev) + vdev_id = idev->vdev->obj.id; + mutex_unlock(&idev->igroup->lock); + /* Relying on xa_lock against a race with iommufd_destroy() */ + if (vdev_id) + iommufd_object_remove(idev->ictx, NULL, vdev_id, 0); + iommufd_object_destroy_user(idev->ictx, &idev->obj); } EXPORT_SYMBOL_NS_GPL(iommufd_device_unbind, IOMMUFD); diff --git a/drivers/iommu/iommufd/main.c b/drivers/iommu/iommufd/main.c index ab5ee325d809..696ac9e0e74b 100644 --- a/drivers/iommu/iommufd/main.c +++ b/drivers/iommu/iommufd/main.c @@ -322,6 +322,7 @@ union ucmd_buffer { struct iommu_option option; struct iommu_vfio_ioas vfio_ioas; struct iommu_viommu_alloc viommu; + struct iommu_vdevice_alloc vdev; #ifdef CONFIG_IOMMUFD_TEST struct iommu_test_cmd test; #endif @@ -375,6 +376,8 @@ static const struct iommufd_ioctl_op iommufd_ioctl_ops[] = { __reserved), IOCTL_OP(IOMMU_VIOMMU_ALLOC, iommufd_viommu_alloc_ioctl, struct iommu_viommu_alloc, out_viommu_id), + IOCTL_OP(IOMMU_VDEVICE_ALLOC, iommufd_vdevice_alloc_ioctl, + struct iommu_vdevice_alloc, __reserved2), #ifdef CONFIG_IOMMUFD_TEST IOCTL_OP(IOMMU_TEST_CMD, iommufd_test, struct iommu_test_cmd, last), #endif @@ -513,6 +516,10 @@ static const struct iommufd_object_ops iommufd_object_ops[] = { [IOMMUFD_OBJ_VIOMMU] = { .destroy = iommufd_viommu_destroy, }, + [IOMMUFD_OBJ_VDEVICE] = { + .destroy = iommufd_vdevice_destroy, + .abort = iommufd_vdevice_abort, + }, #ifdef CONFIG_IOMMUFD_TEST [IOMMUFD_OBJ_SELFTEST] = { .destroy = iommufd_selftest_destroy, diff --git a/drivers/iommu/iommufd/viommu.c b/drivers/iommu/iommufd/viommu.c index e612f3d539b7..77ea41f36b0f 100644 --- a/drivers/iommu/iommufd/viommu.c +++ b/drivers/iommu/iommufd/viommu.c @@ -12,6 +12,7 @@ void iommufd_viommu_destroy(struct iommufd_object *obj) if (viommu->ops && viommu->ops->free) viommu->ops->free(viommu); refcount_dec(&viommu->hwpt->common.obj.users); + xa_destroy(&viommu->vdevs); }
int iommufd_viommu_alloc_ioctl(struct iommufd_ucmd *ucmd) @@ -66,6 +67,7 @@ int iommufd_viommu_alloc_ioctl(struct iommufd_ucmd *ucmd) /* Assume physical IOMMUs are unpluggable (the most likely case) */ viommu->iommu_dev = __iommu_get_iommu_dev(idev->dev);
+ xa_init(&viommu->vdevs); refcount_inc(&viommu->hwpt->common.obj.users);
cmd->out_viommu_id = viommu->obj.id; @@ -83,3 +85,109 @@ int iommufd_viommu_alloc_ioctl(struct iommufd_ucmd *ucmd) iommufd_put_object(ucmd->ictx, &idev->obj); return rc; } + +void iommufd_vdevice_abort(struct iommufd_object *obj) +{ + struct iommufd_vdevice *old, + *vdev = container_of(obj, struct iommufd_vdevice, obj); + struct iommufd_viommu *viommu = vdev->viommu; + struct iommufd_device *idev = vdev->idev; + + lockdep_assert_held(&idev->igroup->lock); + + if (viommu->ops && viommu->ops->vdevice_free) + viommu->ops->vdevice_free(vdev); + + old = xa_cmpxchg(&viommu->vdevs, vdev->id, vdev, NULL, GFP_KERNEL); + if (old) + WARN_ON(old != vdev); + + refcount_dec(&viommu->obj.users); + refcount_dec(&idev->obj.users); + idev->vdev = NULL; +} + +void iommufd_vdevice_destroy(struct iommufd_object *obj) +{ + struct iommufd_vdevice *vdev = + container_of(obj, struct iommufd_vdevice, obj); + + mutex_lock(&vdev->idev->igroup->lock); + iommufd_vdevice_abort(obj); + mutex_unlock(&vdev->idev->igroup->lock); +} + +int iommufd_vdevice_alloc_ioctl(struct iommufd_ucmd *ucmd) +{ + struct iommu_vdevice_alloc *cmd = ucmd->cmd; + struct iommufd_vdevice *vdev, *curr; + struct iommufd_viommu *viommu; + struct iommufd_device *idev; + u64 virt_id = cmd->virt_id; + int rc = 0; + + if (virt_id > ULONG_MAX) + return -EINVAL; + + viommu = iommufd_get_viommu(ucmd, cmd->viommu_id); + if (IS_ERR(viommu)) + return PTR_ERR(viommu); + + idev = iommufd_get_device(ucmd, cmd->dev_id); + if (IS_ERR(idev)) { + rc = PTR_ERR(idev); + goto out_put_viommu; + } + + mutex_lock(&idev->igroup->lock); + if (idev->vdev) { + rc = -EEXIST; + goto out_unlock_igroup; + } + + if (viommu->ops && viommu->ops->vdevice_alloc) + vdev = viommu->ops->vdevice_alloc(viommu, idev->dev, virt_id); + else + vdev = iommufd_object_alloc(ucmd->ictx, vdev, + IOMMUFD_OBJ_VDEVICE); + if (IS_ERR(vdev)) { + rc = PTR_ERR(vdev); + goto out_unlock_igroup; + } + + rc = iommufd_verify_unfinalized_object(ucmd->ictx, &vdev->obj); + if (rc) { + kfree(vdev); + goto out_unlock_igroup; + } + + vdev->idev = idev; + vdev->id = virt_id; + vdev->viommu = viommu; + + idev->vdev = vdev; + refcount_inc(&idev->obj.users); + refcount_inc(&viommu->obj.users); + + curr = xa_cmpxchg(&viommu->vdevs, virt_id, NULL, vdev, GFP_KERNEL); + if (curr) { + rc = xa_err(curr) ?: -EBUSY; + goto out_abort; + } + + cmd->out_vdevice_id = vdev->obj.id; + rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd)); + if (rc) + goto out_abort; + iommufd_object_finalize(ucmd->ictx, &vdev->obj); + goto out_unlock_igroup; + +out_abort: + iommufd_object_abort_and_destroy(ucmd->ictx, &vdev->obj); +out_unlock_igroup: + mutex_unlock(&idev->igroup->lock); + iommufd_put_object(ucmd->ictx, &idev->obj); +out_put_viommu: + iommufd_put_object(ucmd->ictx, &viommu->obj); + return rc; +}
On 2024/10/22 8:20, Nicolin Chen wrote:
Introduce a new ioctl to allocate a vDEVICE object. Since a vDEVICE object is a connection of an iommufd_iommufd object and an iommufd_viommu object,
nit:
:s/iommufd_iommufd/iommufd_device/g
or not?
require both as the ioctl inputs and take refcounts in the ioctl handler.
Add to the vIOMMU object a "vdevs" xarray, indexed by a per-vIOMMU virtual device ID, which can be:
- Virtual StreamID on a nested ARM SMMUv3, an index to a Stream Table
- Virtual DeviceID on a nested AMD IOMMU, an index to a Device Table
- Virtual ID on a nested Intel VT-D IOMMU, an index to a Context Table
Then, let the idev structure hold the allocated vdev pointer with a proper locking protection.
Signed-off-by: Nicolin Chennicolinc@nvidia.com
drivers/iommu/iommufd/iommufd_private.h | 12 +++ include/linux/iommufd.h | 2 + include/uapi/linux/iommufd.h | 26 ++++++ drivers/iommu/iommufd/device.c | 11 +++ drivers/iommu/iommufd/main.c | 7 ++ drivers/iommu/iommufd/viommu.c | 108 ++++++++++++++++++++++++ 6 files changed, 166 insertions(+)
Thanks, baolu
On Tue, Oct 22, 2024 at 11:40:11AM +0800, Baolu Lu wrote:
On 2024/10/22 8:20, Nicolin Chen wrote:
Introduce a new ioctl to allocate a vDEVICE object. Since a vDEVICE object is a connection of an iommufd_iommufd object and an iommufd_viommu object,
nit:
:s/iommufd_iommufd/iommufd_device/g
or not?
Right, thanks!
Nicolin
Add a vdevice_alloc op to the viommu mock_viommu_ops for the coverage of IOMMU_VIOMMU_TYPE_SELFTEST allocations. Then, add a vdevice_alloc TEST_F to cover the IOMMU_VDEVICE_ALLOC ioctl.
Signed-off-by: Nicolin Chen nicolinc@nvidia.com --- tools/testing/selftests/iommu/iommufd_utils.h | 27 +++++++++++++++++++ drivers/iommu/iommufd/selftest.c | 17 ++++++++++++ tools/testing/selftests/iommu/iommufd.c | 20 ++++++++++++++ .../selftests/iommu/iommufd_fail_nth.c | 4 +++ 4 files changed, 68 insertions(+)
diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h index ca09308dad6a..5b17d7b2ac5c 100644 --- a/tools/testing/selftests/iommu/iommufd_utils.h +++ b/tools/testing/selftests/iommu/iommufd_utils.h @@ -790,3 +790,30 @@ static int _test_cmd_viommu_alloc(int fd, __u32 device_id, __u32 hwpt_id, EXPECT_ERRNO(_errno, \ _test_cmd_viommu_alloc(self->fd, device_id, hwpt_id, \ type, 0, viommu_id)) + +static int _test_cmd_vdevice_alloc(int fd, __u32 viommu_id, __u32 idev_id, + __u64 virt_id, __u32 *vdev_id) +{ + struct iommu_vdevice_alloc cmd = { + .size = sizeof(cmd), + .dev_id = idev_id, + .viommu_id = viommu_id, + .virt_id = virt_id, + }; + int ret; + + ret = ioctl(fd, IOMMU_VDEVICE_ALLOC, &cmd); + if (ret) + return ret; + if (vdev_id) + *vdev_id = cmd.out_vdevice_id; + return 0; +} + +#define test_cmd_vdevice_alloc(viommu_id, idev_id, virt_id, vdev_id) \ + ASSERT_EQ(0, _test_cmd_vdevice_alloc(self->fd, viommu_id, idev_id, \ + virt_id, vdev_id)) +#define test_err_vdevice_alloc(_errno, viommu_id, idev_id, virt_id, vdev_id) \ + EXPECT_ERRNO(_errno, \ + _test_cmd_vdevice_alloc(self->fd, viommu_id, idev_id, \ + virt_id, vdev_id)) diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c index 04dd95fe24ca..f401c565143f 100644 --- a/drivers/iommu/iommufd/selftest.c +++ b/drivers/iommu/iommufd/selftest.c @@ -136,6 +136,10 @@ struct mock_viommu { struct iommufd_viommu core; };
+struct mock_vdevice { + struct iommufd_vdevice core; +}; + enum selftest_obj_type { TYPE_IDEV, }; @@ -558,8 +562,21 @@ static void mock_viommu_free(struct iommufd_viommu *viommu) /* iommufd core frees mock_viommu and viommu */ }
+static struct iommufd_vdevice *mock_vdevice_alloc(struct iommufd_viommu *viommu, + struct device *dev, u64 id) +{ + struct mock_vdevice *mock_vdev; + + mock_vdev = iommufd_vdevice_alloc(viommu->ictx, mock_vdevice, core); + if (IS_ERR(mock_vdev)) + return ERR_CAST(mock_vdev); + + return &mock_vdev->core; +} + static struct iommufd_viommu_ops mock_viommu_ops = { .free = mock_viommu_free, + .vdevice_alloc = mock_vdevice_alloc, };
static struct iommufd_viommu *mock_viommu_alloc(struct device *dev, diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c index 49ac144cf5a4..a8c6bdf64825 100644 --- a/tools/testing/selftests/iommu/iommufd.c +++ b/tools/testing/selftests/iommu/iommufd.c @@ -129,6 +129,7 @@ TEST_F(iommufd, cmd_length) TEST_LENGTH(iommu_option, IOMMU_OPTION, val64); TEST_LENGTH(iommu_vfio_ioas, IOMMU_VFIO_IOAS, __reserved); TEST_LENGTH(iommu_viommu_alloc, IOMMU_VIOMMU_ALLOC, out_viommu_id); + TEST_LENGTH(iommu_vdevice_alloc, IOMMU_VDEVICE_ALLOC, __reserved2); #undef TEST_LENGTH }
@@ -2464,4 +2465,23 @@ TEST_F(iommufd_viommu, viommu_auto_destroy) { }
+TEST_F(iommufd_viommu, vdevice_alloc) +{ + uint32_t viommu_id = self->viommu_id; + uint32_t dev_id = self->device_id; + uint32_t vdev_id = 0; + + if (dev_id) { + /* Set vdev_id to 0x99, unset it, and set to 0x88 */ + test_cmd_vdevice_alloc(viommu_id, dev_id, 0x99, &vdev_id); + test_err_vdevice_alloc(EEXIST, + viommu_id, dev_id, 0x99, &vdev_id); + test_ioctl_destroy(vdev_id); + test_cmd_vdevice_alloc(viommu_id, dev_id, 0x88, &vdev_id); + test_ioctl_destroy(vdev_id); + } else { + test_err_vdevice_alloc(ENOENT, viommu_id, dev_id, 0x99, NULL); + } +} + TEST_HARNESS_MAIN diff --git a/tools/testing/selftests/iommu/iommufd_fail_nth.c b/tools/testing/selftests/iommu/iommufd_fail_nth.c index e9a980b7729b..28f11b26f836 100644 --- a/tools/testing/selftests/iommu/iommufd_fail_nth.c +++ b/tools/testing/selftests/iommu/iommufd_fail_nth.c @@ -583,6 +583,7 @@ TEST_FAIL_NTH(basic_fail_nth, device) uint32_t idev_id; uint32_t hwpt_id; uint32_t viommu_id; + uint32_t vdev_id; __u64 iova;
self->fd = open("/dev/iommu", O_RDWR); @@ -635,6 +636,9 @@ TEST_FAIL_NTH(basic_fail_nth, device) IOMMU_VIOMMU_TYPE_SELFTEST, 0, &viommu_id)) return -1;
+ if (_test_cmd_vdevice_alloc(self->fd, viommu_id, idev_id, 0, &vdev_id)) + return -1; + return 0; }
This per-vIOMMU cache_invalidate op is like the cache_invalidate_user op in struct iommu_domain_ops, but wider, supporting device cache (e.g. PCI ATC invaldiations).
Reviewed-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Nicolin Chen nicolinc@nvidia.com --- include/linux/iommufd.h | 10 ++++++++++ 1 file changed, 10 insertions(+)
diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h index 821816e3b1fd..559f274a26ea 100644 --- a/include/linux/iommufd.h +++ b/include/linux/iommufd.h @@ -15,6 +15,7 @@ struct device; struct file; struct iommu_group; struct iommu_user_data; +struct iommu_user_data_array; struct iommufd_access; struct iommufd_ctx; struct iommufd_device; @@ -121,6 +122,13 @@ struct iommufd_vdevice { * @vdevice_free: Free a driver-managed iommufd_vdevice to de-init its structure * or HW procedure. The memory of the vdevice will be free-ed by * iommufd core. + * @cache_invalidate: Flush hardware cache used by a vIOMMU. It can be used for + * any IOMMU hardware specific cache: TLB and device cache. + * The @array passes in the cache invalidation requests, in + * form of a driver data structure. A driver must update the + * array->entry_num to report the number of handled requests. + * The data structure of the array entry must be defined in + * include/uapi/linux/iommufd.h */ struct iommufd_viommu_ops { void (*free)(struct iommufd_viommu *viommu); @@ -130,6 +138,8 @@ struct iommufd_viommu_ops { struct iommufd_vdevice *(*vdevice_alloc)(struct iommufd_viommu *viommu, struct device *dev, u64 id); void (*vdevice_free)(struct iommufd_vdevice *vdev); + int (*cache_invalidate)(struct iommufd_viommu *viommu, + struct iommu_user_data_array *array); };
#if IS_ENABLED(CONFIG_IOMMUFD)
A vIOMMU-based hwpt_nested requires a cache invalidation op too, either using the one in iommu_domain_ops or the one in viommu_ops. Enforce that upon the allocated hwpt_nested.
Signed-off-by: Nicolin Chen nicolinc@nvidia.com --- drivers/iommu/iommufd/hw_pagetable.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c index 5314cd486ddb..7af6cbe6dbf2 100644 --- a/drivers/iommu/iommufd/hw_pagetable.c +++ b/drivers/iommu/iommufd/hw_pagetable.c @@ -300,7 +300,9 @@ iommufd_hwpt_nested_alloc_for_viommu(struct iommufd_viommu *viommu, } hwpt->domain->owner = viommu->iommu_dev->ops;
- if (WARN_ON_ONCE(hwpt->domain->type != IOMMU_DOMAIN_NESTED)) { + if (WARN_ON_ONCE(hwpt->domain->type != IOMMU_DOMAIN_NESTED || + (!viommu->ops->cache_invalidate && + !hwpt->domain->ops->cache_invalidate_user))) { rc = -EINVAL; goto out_abort; }
With a vIOMMU object, use space can flush any IOMMU related cache that can be directed via a vIOMMU object. It is similar to the IOMMU_HWPT_INVALIDATE uAPI, but can cover a wider range than IOTLB, e.g. device/desciprtor cache.
Allow hwpt_id of the iommu_hwpt_invalidate structure to carry a viommu_id, and reuse the IOMMU_HWPT_INVALIDATE uAPI for vIOMMU invalidations. Drivers can define different structures for vIOMMU invalidations v.s. HWPT ones.
Update the uAPI, kdoc, and selftest case accordingly.
Reviewed-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Nicolin Chen nicolinc@nvidia.com --- include/uapi/linux/iommufd.h | 9 ++++--- drivers/iommu/iommufd/hw_pagetable.c | 32 +++++++++++++++++++------ tools/testing/selftests/iommu/iommufd.c | 4 ++-- 3 files changed, 33 insertions(+), 12 deletions(-)
diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h index 407747785f6a..3486ae2d62d1 100644 --- a/include/uapi/linux/iommufd.h +++ b/include/uapi/linux/iommufd.h @@ -730,7 +730,7 @@ struct iommu_hwpt_vtd_s1_invalidate { /** * struct iommu_hwpt_invalidate - ioctl(IOMMU_HWPT_INVALIDATE) * @size: sizeof(struct iommu_hwpt_invalidate) - * @hwpt_id: ID of a nested HWPT for cache invalidation + * @hwpt_id: ID of a nested HWPT or a vIOMMU, for cache invalidation * @data_uptr: User pointer to an array of driver-specific cache invalidation * data. * @data_type: One of enum iommu_hwpt_invalidate_data_type, defining the data @@ -741,8 +741,11 @@ struct iommu_hwpt_vtd_s1_invalidate { * Output the number of requests successfully handled by kernel. * @__reserved: Must be 0. * - * Invalidate the iommu cache for user-managed page table. Modifications on a - * user-managed page table should be followed by this operation to sync cache. + * Invalidate iommu cache for user-managed page table or vIOMMU. Modifications + * on a user-managed page table should be followed by this operation, if a HWPT + * is passed in via @hwpt_id. Other caches, such as device cache or descriptor + * cache can be flushed if a vIOMMU is passed in via the @hwpt_id field. + * * Each ioctl can support one or more cache invalidation requests in the array * that has a total size of @entry_len * @entry_num. * diff --git a/drivers/iommu/iommufd/hw_pagetable.c b/drivers/iommu/iommufd/hw_pagetable.c index 7af6cbe6dbf2..cbc58e98d8b4 100644 --- a/drivers/iommu/iommufd/hw_pagetable.c +++ b/drivers/iommu/iommufd/hw_pagetable.c @@ -481,7 +481,7 @@ int iommufd_hwpt_invalidate(struct iommufd_ucmd *ucmd) .entry_len = cmd->entry_len, .entry_num = cmd->entry_num, }; - struct iommufd_hw_pagetable *hwpt; + struct iommufd_object *pt_obj; u32 done_num = 0; int rc;
@@ -495,17 +495,35 @@ int iommufd_hwpt_invalidate(struct iommufd_ucmd *ucmd) goto out; }
- hwpt = iommufd_get_hwpt_nested(ucmd, cmd->hwpt_id); - if (IS_ERR(hwpt)) { - rc = PTR_ERR(hwpt); + pt_obj = iommufd_get_object(ucmd->ictx, cmd->hwpt_id, IOMMUFD_OBJ_ANY); + if (IS_ERR(pt_obj)) { + rc = PTR_ERR(pt_obj); goto out; } + if (pt_obj->type == IOMMUFD_OBJ_HWPT_NESTED) { + struct iommufd_hw_pagetable *hwpt = + container_of(pt_obj, struct iommufd_hw_pagetable, obj); + + rc = hwpt->domain->ops->cache_invalidate_user(hwpt->domain, + &data_array); + } else if (pt_obj->type == IOMMUFD_OBJ_VIOMMU) { + struct iommufd_viommu *viommu = + container_of(pt_obj, struct iommufd_viommu, obj); + + if (!viommu->ops || !viommu->ops->cache_invalidate) { + rc = -EOPNOTSUPP; + goto out_put_pt; + } + rc = viommu->ops->cache_invalidate(viommu, &data_array); + } else { + rc = -EINVAL; + goto out_put_pt; + }
- rc = hwpt->domain->ops->cache_invalidate_user(hwpt->domain, - &data_array); done_num = data_array.entry_num;
- iommufd_put_object(ucmd->ictx, &hwpt->obj); +out_put_pt: + iommufd_put_object(ucmd->ictx, pt_obj); out: cmd->entry_num = done_num; if (iommufd_ucmd_respond(ucmd, sizeof(*cmd))) diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c index a8c6bdf64825..e09dba3588ee 100644 --- a/tools/testing/selftests/iommu/iommufd.c +++ b/tools/testing/selftests/iommu/iommufd.c @@ -362,9 +362,9 @@ TEST_F(iommufd_ioas, alloc_hwpt_nested) EXPECT_ERRNO(EBUSY, _test_ioctl_destroy(self->fd, parent_hwpt_id));
- /* hwpt_invalidate only supports a user-managed hwpt (nested) */ + /* hwpt_invalidate does not support a parent hwpt */ num_inv = 1; - test_err_hwpt_invalidate(ENOENT, parent_hwpt_id, inv_reqs, + test_err_hwpt_invalidate(EINVAL, parent_hwpt_id, inv_reqs, IOMMU_HWPT_INVALIDATE_DATA_SELFTEST, sizeof(*inv_reqs), &num_inv); assert(!num_inv);
From: Jason Gunthorpe jgg@nvidia.com
The iommu_copy_struct_from_user_array helper can be used to copy a single entry from a user array which might not be efficient if the array is big.
Add a new iommu_copy_struct_from_full_user_array to copy the entire user array at once. Update the existing iommu_copy_struct_from_user_array kdoc accordingly.
Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Nicolin Chen nicolinc@nvidia.com --- include/linux/iommu.h | 49 ++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 48 insertions(+), 1 deletion(-)
diff --git a/include/linux/iommu.h b/include/linux/iommu.h index 14f24b5cd16f..8105079ba5d9 100644 --- a/include/linux/iommu.h +++ b/include/linux/iommu.h @@ -493,7 +493,9 @@ static inline int __iommu_copy_struct_from_user_array( * @index: Index to the location in the array to copy user data from * @min_last: The last member of the data structure @kdst points in the * initial version. - * Return 0 for success, otherwise -error. + * + * Copy a single entry from a user array. Return 0 for success, otherwise + * -error. */ #define iommu_copy_struct_from_user_array(kdst, user_array, data_type, index, \ min_last) \ @@ -501,6 +503,51 @@ static inline int __iommu_copy_struct_from_user_array( kdst, user_array, data_type, index, sizeof(*(kdst)), \ offsetofend(typeof(*(kdst)), min_last))
+ +/** + * iommu_copy_struct_from_full_user_array - Copy iommu driver specific user + * space data from an iommu_user_data_array + * @kdst: Pointer to an iommu driver specific user data that is defined in + * include/uapi/linux/iommufd.h + * @kdst_entry_size: sizeof(*kdst) + * @user_array: Pointer to a struct iommu_user_data_array for a user space + * array + * @data_type: The data type of the @kdst. Must match with @user_array->type + * + * Copy the entire user array. kdst must have room for kdst_entry_size * + * user_array->entry_num bytes. Return 0 for success, otherwise -error. + */ +static inline int +iommu_copy_struct_from_full_user_array(void *kdst, size_t kdst_entry_size, + struct iommu_user_data_array *user_array, + unsigned int data_type) +{ + unsigned int i; + int ret; + + if (user_array->type != data_type) + return -EINVAL; + if (!user_array->entry_num) + return -EINVAL; + if (likely(user_array->entry_len == kdst_entry_size)) { + if (copy_from_user(kdst, user_array->uptr, + user_array->entry_num * + user_array->entry_len)) + return -EFAULT; + } + + /* Copy item by item */ + for (i = 0; i != user_array->entry_num; i++) { + ret = copy_struct_from_user( + kdst + kdst_entry_size * i, kdst_entry_size, + user_array->uptr + user_array->entry_len * i, + user_array->entry_len); + if (ret) + return ret; + } + return 0; +} + /** * struct iommu_ops - iommu ops and capabilities * @capable: check capability
This avoids a bigger trouble of moving the struct iommufd_device to the public header.
Reviewed-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Nicolin Chen nicolinc@nvidia.com --- include/linux/iommufd.h | 6 ++++++ drivers/iommu/iommufd/driver.c | 7 +++++++ 2 files changed, 13 insertions(+)
diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h index 559f274a26ea..2f4ec6e6df21 100644 --- a/include/linux/iommufd.h +++ b/include/linux/iommufd.h @@ -204,6 +204,7 @@ static inline int iommufd_vfio_compat_set_no_iommu(struct iommufd_ctx *ictx) struct iommufd_object *_iommufd_object_alloc(struct iommufd_ctx *ictx, size_t size, enum iommufd_object_type type); +struct device *vdev_to_dev(struct iommufd_vdevice *vdev); #else /* !CONFIG_IOMMUFD_DRIVER */ static inline struct iommufd_object * _iommufd_object_alloc(struct iommufd_ctx *ictx, size_t size, @@ -211,6 +212,11 @@ _iommufd_object_alloc(struct iommufd_ctx *ictx, size_t size, { return ERR_PTR(-EOPNOTSUPP); } + +static inline struct device *vdev_to_dev(struct iommufd_vdevice *vdev) +{ + return NULL; +} #endif /* CONFIG_IOMMUFD_DRIVER */
/* diff --git a/drivers/iommu/iommufd/driver.c b/drivers/iommu/iommufd/driver.c index c0876d3f91c7..a5d750d2cfaa 100644 --- a/drivers/iommu/iommufd/driver.c +++ b/drivers/iommu/iommufd/driver.c @@ -36,3 +36,10 @@ struct iommufd_object *_iommufd_object_alloc(struct iommufd_ctx *ictx, return ERR_PTR(rc); } EXPORT_SYMBOL_NS_GPL(_iommufd_object_alloc, IOMMUFD); + +/* Caller should xa_lock(&viommu->vdevs) to protect the return value */ +struct device *vdev_to_dev(struct iommufd_vdevice *vdev) +{ + return vdev ? vdev->idev->dev : NULL; +} +EXPORT_SYMBOL_NS_GPL(vdev_to_dev, IOMMUFD);
On Mon, Oct 21, 2024 at 05:20:17PM -0700, Nicolin Chen wrote:
+/* Caller should xa_lock(&viommu->vdevs) to protect the return value */ +struct device *vdev_to_dev(struct iommufd_vdevice *vdev) +{
return vdev ? vdev->idev->dev : NULL;
+} +EXPORT_SYMBOL_NS_GPL(vdev_to_dev, IOMMUFD);
Changing to not exposing the iommufd_vdevice structure, I reworked this helper to a vIOMMU-based one:
+/* Caller should xa_lock(&viommu->vdevs) to protect the return value */ +struct device *iommufd_viommu_find_dev(struct iommufd_viommu *viommu, + unsigned long vdev_id) +{ + struct iommufd_vdevice *vdev; + + lockdep_is_held(&viommu->vdevs.xa_lock); + + vdev = xa_load(&viommu->vdevs, vdev_id); + return vdev ? vdev->idev->dev : NULL; +} +EXPORT_SYMBOL_NS_GPL(iommufd_viommu_find_dev, IOMMUFD);
Nicolin
Similar to the coverage of cache_invalidate_user for iotlb invalidation, add a device cache and a viommu_cache_invalidate function to test it out.
Signed-off-by: Nicolin Chen nicolinc@nvidia.com --- drivers/iommu/iommufd/iommufd_test.h | 25 +++++++++ drivers/iommu/iommufd/selftest.c | 76 +++++++++++++++++++++++++++- 2 files changed, 100 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/iommufd/iommufd_test.h b/drivers/iommu/iommufd/iommufd_test.h index edced4ac7cd3..05f57a968d25 100644 --- a/drivers/iommu/iommufd/iommufd_test.h +++ b/drivers/iommu/iommufd/iommufd_test.h @@ -54,6 +54,11 @@ enum { MOCK_NESTED_DOMAIN_IOTLB_NUM = 4, };
+enum { + MOCK_DEV_CACHE_ID_MAX = 3, + MOCK_DEV_CACHE_NUM = 4, +}; + struct iommu_test_cmd { __u32 size; __u32 op; @@ -152,6 +157,7 @@ struct iommu_test_hw_info { /* Should not be equal to any defined value in enum iommu_hwpt_data_type */ #define IOMMU_HWPT_DATA_SELFTEST 0xdead #define IOMMU_TEST_IOTLB_DEFAULT 0xbadbeef +#define IOMMU_TEST_DEV_CACHE_DEFAULT 0xbaddad
/** * struct iommu_hwpt_selftest @@ -182,4 +188,23 @@ struct iommu_hwpt_invalidate_selftest {
#define IOMMU_VIOMMU_TYPE_SELFTEST 0xdeadbeef
+/* Should not be equal to any defined value in enum iommu_viommu_invalidate_data_type */ +#define IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST 0xdeadbeef +#define IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST_INVALID 0xdadbeef + +/** + * struct iommu_viommu_invalidate_selftest - Invalidation data for Mock VIOMMU + * (IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST) + * @flags: Invalidate flags + * @cache_id: Invalidate cache entry index + * + * If IOMMU_TEST_INVALIDATE_ALL is set in @flags, @cache_id will be ignored + */ +struct iommu_viommu_invalidate_selftest { +#define IOMMU_TEST_INVALIDATE_FLAG_ALL (1 << 0) + __u32 flags; + __u32 vdev_id; + __u32 cache_id; +}; + #endif diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c index f401c565143f..11e0c81c77c4 100644 --- a/drivers/iommu/iommufd/selftest.c +++ b/drivers/iommu/iommufd/selftest.c @@ -148,6 +148,7 @@ struct mock_dev { struct device dev; unsigned long flags; int id; + u32 cache[MOCK_DEV_CACHE_NUM]; };
struct selftest_obj { @@ -574,9 +575,80 @@ static struct iommufd_vdevice *mock_vdevice_alloc(struct iommufd_viommu *viommu, return &mock_vdev->core; }
+static int mock_viommu_cache_invalidate(struct iommufd_viommu *viommu, + struct iommu_user_data_array *array) +{ + struct iommu_viommu_invalidate_selftest *cmds; + struct iommu_viommu_invalidate_selftest *cur; + struct iommu_viommu_invalidate_selftest *end; + int rc; + + /* A zero-length array is allowed to validate the array type */ + if (array->entry_num == 0 && + array->type == IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST) { + array->entry_num = 0; + return 0; + } + + cmds = kcalloc(array->entry_num, sizeof(*cmds), GFP_KERNEL); + if (!cmds) + return -ENOMEM; + cur = cmds; + end = cmds + array->entry_num; + + static_assert(sizeof(*cmds) == 3 * sizeof(u32)); + rc = iommu_copy_struct_from_full_user_array( + cmds, sizeof(*cmds), array, + IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST); + if (rc) + goto out; + + while (cur != end) { + XA_STATE(xas, &viommu->vdevs, (unsigned long)cur->vdev_id); + struct mock_dev *mdev; + struct device *dev; + int i; + + if (cur->flags & ~IOMMU_TEST_INVALIDATE_FLAG_ALL) { + rc = -EOPNOTSUPP; + goto out; + } + + if (cur->cache_id > MOCK_DEV_CACHE_ID_MAX) { + rc = -EINVAL; + goto out; + } + + xa_lock(&viommu->vdevs); + dev = vdev_to_dev(xas_load(&xas)); + if (!dev) { + xa_unlock(&viommu->vdevs); + rc = -EINVAL; + goto out; + } + mdev = container_of(dev, struct mock_dev, dev); + + if (cur->flags & IOMMU_TEST_INVALIDATE_FLAG_ALL) { + /* Invalidate all cache entries and ignore cache_id */ + for (i = 0; i < MOCK_DEV_CACHE_NUM; i++) + mdev->cache[i] = 0; + } else { + mdev->cache[cur->cache_id] = 0; + } + xa_unlock(&viommu->vdevs); + + cur++; + } +out: + array->entry_num = cur - cmds; + kfree(cmds); + return rc; +} + static struct iommufd_viommu_ops mock_viommu_ops = { .free = mock_viommu_free, .vdevice_alloc = mock_vdevice_alloc, + .cache_invalidate = mock_viommu_cache_invalidate, };
static struct iommufd_viommu *mock_viommu_alloc(struct device *dev, @@ -752,7 +824,7 @@ static void mock_dev_release(struct device *dev) static struct mock_dev *mock_dev_create(unsigned long dev_flags) { struct mock_dev *mdev; - int rc; + int rc, i;
if (dev_flags & ~(MOCK_FLAGS_DEVICE_NO_DIRTY | MOCK_FLAGS_DEVICE_HUGE_IOVA)) @@ -766,6 +838,8 @@ static struct mock_dev *mock_dev_create(unsigned long dev_flags) mdev->flags = dev_flags; mdev->dev.release = mock_dev_release; mdev->dev.bus = &iommufd_mock_bus_type.bus; + for (i = 0; i < MOCK_DEV_CACHE_NUM; i++) + mdev->cache[i] = IOMMU_TEST_DEV_CACHE_DEFAULT;
rc = ida_alloc(&mock_dev_ida, GFP_KERNEL); if (rc < 0)
Similar to IOMMU_TEST_OP_MD_CHECK_IOTLB verifying a mock_domain's iotlb, IOMMU_TEST_OP_DEV_CHECK_CACHE will be used to verify a mock_dev's cache.
Signed-off-by: Nicolin Chen nicolinc@nvidia.com --- drivers/iommu/iommufd/iommufd_test.h | 5 ++++ tools/testing/selftests/iommu/iommufd_utils.h | 24 +++++++++++++++++++ drivers/iommu/iommufd/selftest.c | 22 +++++++++++++++++ tools/testing/selftests/iommu/iommufd.c | 7 +++++- 4 files changed, 57 insertions(+), 1 deletion(-)
diff --git a/drivers/iommu/iommufd/iommufd_test.h b/drivers/iommu/iommufd/iommufd_test.h index 05f57a968d25..b226636aa07a 100644 --- a/drivers/iommu/iommufd/iommufd_test.h +++ b/drivers/iommu/iommufd/iommufd_test.h @@ -23,6 +23,7 @@ enum { IOMMU_TEST_OP_DIRTY, IOMMU_TEST_OP_MD_CHECK_IOTLB, IOMMU_TEST_OP_TRIGGER_IOPF, + IOMMU_TEST_OP_DEV_CHECK_CACHE, };
enum { @@ -140,6 +141,10 @@ struct iommu_test_cmd { __u32 perm; __u64 addr; } trigger_iopf; + struct { + __u32 id; + __u32 cache; + } check_dev_cache; }; __u32 last; }; diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h index 5b17d7b2ac5c..3ae6cb5eed7d 100644 --- a/tools/testing/selftests/iommu/iommufd_utils.h +++ b/tools/testing/selftests/iommu/iommufd_utils.h @@ -234,6 +234,30 @@ static int _test_cmd_hwpt_alloc(int fd, __u32 device_id, __u32 pt_id, __u32 ft_i test_cmd_hwpt_check_iotlb(hwpt_id, i, expected); \ })
+#define test_cmd_dev_check_cache(device_id, cache_id, expected) \ + ({ \ + struct iommu_test_cmd test_cmd = { \ + .size = sizeof(test_cmd), \ + .op = IOMMU_TEST_OP_DEV_CHECK_CACHE, \ + .id = device_id, \ + .check_dev_cache = { \ + .id = cache_id, \ + .cache = expected, \ + }, \ + }; \ + ASSERT_EQ(0, ioctl(self->fd, \ + _IOMMU_TEST_CMD( \ + IOMMU_TEST_OP_DEV_CHECK_CACHE), \ + &test_cmd)); \ + }) + +#define test_cmd_dev_check_cache_all(device_id, expected) \ + ({ \ + int c; \ + for (c = 0; c < MOCK_DEV_CACHE_NUM; c++) \ + test_cmd_dev_check_cache(device_id, c, expected); \ + }) + static int _test_cmd_hwpt_invalidate(int fd, __u32 hwpt_id, void *reqs, uint32_t data_type, uint32_t lreq, uint32_t *nreqs) diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c index 11e0c81c77c4..6f524cea045f 100644 --- a/drivers/iommu/iommufd/selftest.c +++ b/drivers/iommu/iommufd/selftest.c @@ -1096,6 +1096,24 @@ static int iommufd_test_md_check_iotlb(struct iommufd_ucmd *ucmd, return rc; }
+static int iommufd_test_dev_check_cache(struct iommufd_ucmd *ucmd, u32 idev_id, + unsigned int cache_id, u32 cache) +{ + struct iommufd_device *idev; + struct mock_dev *mdev; + int rc = 0; + + idev = iommufd_get_device(ucmd, idev_id); + if (IS_ERR(idev)) + return PTR_ERR(idev); + mdev = container_of(idev->dev, struct mock_dev, dev); + + if (cache_id > MOCK_DEV_CACHE_ID_MAX || mdev->cache[cache_id] != cache) + rc = -EINVAL; + iommufd_put_object(ucmd->ictx, &idev->obj); + return rc; +} + struct selftest_access { struct iommufd_access *access; struct file *file; @@ -1605,6 +1623,10 @@ int iommufd_test(struct iommufd_ucmd *ucmd) return iommufd_test_md_check_iotlb(ucmd, cmd->id, cmd->check_iotlb.id, cmd->check_iotlb.iotlb); + case IOMMU_TEST_OP_DEV_CHECK_CACHE: + return iommufd_test_dev_check_cache(ucmd, cmd->id, + cmd->check_dev_cache.id, + cmd->check_dev_cache.cache); case IOMMU_TEST_OP_CREATE_ACCESS: return iommufd_test_create_access(ucmd, cmd->id, cmd->create_access.flags); diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c index e09dba3588ee..79b0739d586b 100644 --- a/tools/testing/selftests/iommu/iommufd.c +++ b/tools/testing/selftests/iommu/iommufd.c @@ -222,6 +222,8 @@ FIXTURE_SETUP(iommufd_ioas) for (i = 0; i != variant->mock_domains; i++) { test_cmd_mock_domain(self->ioas_id, &self->stdev_id, &self->hwpt_id, &self->device_id); + test_cmd_dev_check_cache_all(self->device_id, + IOMMU_TEST_DEV_CACHE_DEFAULT); self->base_iova = MOCK_APERTURE_START; } } @@ -1386,9 +1388,12 @@ FIXTURE_SETUP(iommufd_mock_domain)
ASSERT_GE(ARRAY_SIZE(self->hwpt_ids), variant->mock_domains);
- for (i = 0; i != variant->mock_domains; i++) + for (i = 0; i != variant->mock_domains; i++) { test_cmd_mock_domain(self->ioas_id, &self->stdev_ids[i], &self->hwpt_ids[i], &self->idev_ids[i]); + test_cmd_dev_check_cache_all(self->idev_ids[0], + IOMMU_TEST_DEV_CACHE_DEFAULT); + } self->hwpt_id = self->hwpt_ids[0];
self->mmap_flags = MAP_SHARED | MAP_ANONYMOUS;
Add a viommu_cache test function to cover vIOMMU invalidations using the updated IOMMU_HWPT_INVALIDATE ioctl, which now allows passing in a vIOMMU via its hwpt_id field.
Signed-off-by: Nicolin Chen nicolinc@nvidia.com --- tools/testing/selftests/iommu/iommufd_utils.h | 32 ++++ tools/testing/selftests/iommu/iommufd.c | 173 ++++++++++++++++++ 2 files changed, 205 insertions(+)
diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h index 3ae6cb5eed7d..aa458c80ad30 100644 --- a/tools/testing/selftests/iommu/iommufd_utils.h +++ b/tools/testing/selftests/iommu/iommufd_utils.h @@ -289,6 +289,38 @@ static int _test_cmd_hwpt_invalidate(int fd, __u32 hwpt_id, void *reqs, data_type, lreq, nreqs)); \ })
+static int _test_cmd_viommu_invalidate(int fd, __u32 viommu_id, void *reqs, + uint32_t data_type, uint32_t lreq, + uint32_t *nreqs) +{ + struct iommu_hwpt_invalidate cmd = { + .size = sizeof(cmd), + .hwpt_id = viommu_id, + .data_type = data_type, + .data_uptr = (uint64_t)reqs, + .entry_len = lreq, + .entry_num = *nreqs, + }; + int rc = ioctl(fd, IOMMU_HWPT_INVALIDATE, &cmd); + *nreqs = cmd.entry_num; + return rc; +} + +#define test_cmd_viommu_invalidate(viommu, reqs, lreq, nreqs) \ + ({ \ + ASSERT_EQ(0, \ + _test_cmd_viommu_invalidate(self->fd, viommu, reqs, \ + IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST, \ + lreq, nreqs)); \ + }) +#define test_err_viommu_invalidate(_errno, viommu_id, reqs, data_type, lreq, \ + nreqs) \ + ({ \ + EXPECT_ERRNO(_errno, _test_cmd_viommu_invalidate( \ + self->fd, viommu_id, reqs, \ + data_type, lreq, nreqs)); \ + }) + static int _test_cmd_access_replace_ioas(int fd, __u32 access_id, unsigned int ioas_id) { diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c index 79b0739d586b..00f08f79b0d9 100644 --- a/tools/testing/selftests/iommu/iommufd.c +++ b/tools/testing/selftests/iommu/iommufd.c @@ -2489,4 +2489,177 @@ TEST_F(iommufd_viommu, vdevice_alloc) } }
+TEST_F(iommufd_viommu, vdevice_cache) +{ + struct iommu_viommu_invalidate_selftest inv_reqs[2] = {}; + uint32_t viommu_id = self->viommu_id; + uint32_t dev_id = self->device_id; + uint32_t vdev_id = 0; + uint32_t num_inv; + + if (dev_id) { + test_cmd_vdevice_alloc(viommu_id, dev_id, 0x99, &vdev_id); + + test_cmd_dev_check_cache_all(dev_id, + IOMMU_TEST_DEV_CACHE_DEFAULT); + + /* Check data_type by passing zero-length array */ + num_inv = 0; + test_cmd_viommu_invalidate(viommu_id, inv_reqs, + sizeof(*inv_reqs), &num_inv); + assert(!num_inv); + + /* Negative test: Invalid data_type */ + num_inv = 1; + test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs, + IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST_INVALID, + sizeof(*inv_reqs), &num_inv); + assert(!num_inv); + + /* Negative test: structure size sanity */ + num_inv = 1; + test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs, + IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST, + sizeof(*inv_reqs) + 1, &num_inv); + assert(!num_inv); + + num_inv = 1; + test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs, + IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST, + 1, &num_inv); + assert(!num_inv); + + /* Negative test: invalid flag is passed */ + num_inv = 1; + inv_reqs[0].flags = 0xffffffff; + inv_reqs[0].vdev_id = 0x99; + test_err_viommu_invalidate(EOPNOTSUPP, viommu_id, inv_reqs, + IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST, + sizeof(*inv_reqs), &num_inv); + assert(!num_inv); + + /* Negative test: invalid data_uptr when array is not empty */ + num_inv = 1; + inv_reqs[0].flags = 0; + inv_reqs[0].vdev_id = 0x99; + test_err_viommu_invalidate(EINVAL, viommu_id, NULL, + IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST, + sizeof(*inv_reqs), &num_inv); + assert(!num_inv); + + /* Negative test: invalid entry_len when array is not empty */ + num_inv = 1; + inv_reqs[0].flags = 0; + inv_reqs[0].vdev_id = 0x99; + test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs, + IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST, + 0, &num_inv); + assert(!num_inv); + + /* Negative test: invalid cache_id */ + num_inv = 1; + inv_reqs[0].flags = 0; + inv_reqs[0].vdev_id = 0x99; + inv_reqs[0].cache_id = MOCK_DEV_CACHE_ID_MAX + 1; + test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs, + IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST, + sizeof(*inv_reqs), &num_inv); + assert(!num_inv); + + /* Negative test: invalid vdev_id */ + num_inv = 1; + inv_reqs[0].flags = 0; + inv_reqs[0].vdev_id = 0x9; + inv_reqs[0].cache_id = 0; + test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs, + IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST, + sizeof(*inv_reqs), &num_inv); + assert(!num_inv); + + /* + * Invalidate the 1st cache entry but fail the 2nd request + * due to invalid flags configuration in the 2nd request. + */ + num_inv = 2; + inv_reqs[0].flags = 0; + inv_reqs[0].vdev_id = 0x99; + inv_reqs[0].cache_id = 0; + inv_reqs[1].flags = 0xffffffff; + inv_reqs[1].vdev_id = 0x99; + inv_reqs[1].cache_id = 1; + test_err_viommu_invalidate(EOPNOTSUPP, viommu_id, inv_reqs, + IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST, + sizeof(*inv_reqs), &num_inv); + assert(num_inv == 1); + test_cmd_dev_check_cache(dev_id, 0, 0); + test_cmd_dev_check_cache(dev_id, 1, + IOMMU_TEST_DEV_CACHE_DEFAULT); + test_cmd_dev_check_cache(dev_id, 2, + IOMMU_TEST_DEV_CACHE_DEFAULT); + test_cmd_dev_check_cache(dev_id, 3, + IOMMU_TEST_DEV_CACHE_DEFAULT); + + /* + * Invalidate the 1st cache entry but fail the 2nd request + * due to invalid cache_id configuration in the 2nd request. + */ + num_inv = 2; + inv_reqs[0].flags = 0; + inv_reqs[0].vdev_id = 0x99; + inv_reqs[0].cache_id = 0; + inv_reqs[1].flags = 0; + inv_reqs[1].vdev_id = 0x99; + inv_reqs[1].cache_id = MOCK_DEV_CACHE_ID_MAX + 1; + test_err_viommu_invalidate(EINVAL, viommu_id, inv_reqs, + IOMMU_VIOMMU_INVALIDATE_DATA_SELFTEST, + sizeof(*inv_reqs), &num_inv); + assert(num_inv == 1); + test_cmd_dev_check_cache(dev_id, 0, 0); + test_cmd_dev_check_cache(dev_id, 1, + IOMMU_TEST_DEV_CACHE_DEFAULT); + test_cmd_dev_check_cache(dev_id, 2, + IOMMU_TEST_DEV_CACHE_DEFAULT); + test_cmd_dev_check_cache(dev_id, 3, + IOMMU_TEST_DEV_CACHE_DEFAULT); + + /* Invalidate the 2nd cache entry and verify */ + num_inv = 1; + inv_reqs[0].flags = 0; + inv_reqs[0].vdev_id = 0x99; + inv_reqs[0].cache_id = 1; + test_cmd_viommu_invalidate(viommu_id, inv_reqs, + sizeof(*inv_reqs), &num_inv); + assert(num_inv == 1); + test_cmd_dev_check_cache(dev_id, 0, 0); + test_cmd_dev_check_cache(dev_id, 1, 0); + test_cmd_dev_check_cache(dev_id, 2, + IOMMU_TEST_DEV_CACHE_DEFAULT); + test_cmd_dev_check_cache(dev_id, 3, + IOMMU_TEST_DEV_CACHE_DEFAULT); + + /* Invalidate the 3rd and 4th cache entries and verify */ + num_inv = 2; + inv_reqs[0].flags = 0; + inv_reqs[0].vdev_id = 0x99; + inv_reqs[0].cache_id = 2; + inv_reqs[1].flags = 0; + inv_reqs[1].vdev_id = 0x99; + inv_reqs[1].cache_id = 3; + test_cmd_viommu_invalidate(viommu_id, inv_reqs, + sizeof(*inv_reqs), &num_inv); + assert(num_inv == 2); + test_cmd_dev_check_cache_all(dev_id, 0); + + /* Invalidate all cache entries for nested_dev_id[1] and verify */ + num_inv = 1; + inv_reqs[0].vdev_id = 0x99; + inv_reqs[0].flags = IOMMU_TEST_INVALIDATE_FLAG_ALL; + test_cmd_viommu_invalidate(viommu_id, inv_reqs, + sizeof(*inv_reqs), &num_inv); + assert(num_inv == 1); + test_cmd_dev_check_cache_all(dev_id, 0); + test_ioctl_destroy(vdev_id); + } +} + TEST_HARNESS_MAIN
With the introduction of the new object and its infrastructure, update the doc and the vIOMMU graph to reflect that.
Reviewed-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Nicolin Chen nicolinc@nvidia.com --- Documentation/userspace-api/iommufd.rst | 41 +++++++++++++++++++------ 1 file changed, 32 insertions(+), 9 deletions(-)
diff --git a/Documentation/userspace-api/iommufd.rst b/Documentation/userspace-api/iommufd.rst index 92d16efad5b0..3c27cc92c2cb 100644 --- a/Documentation/userspace-api/iommufd.rst +++ b/Documentation/userspace-api/iommufd.rst @@ -94,6 +94,19 @@ Following IOMMUFD objects are exposed to userspace: backed by corresponding vIOMMU objects, in which case a guest OS would do the "dispatch" naturally instead of VMM trappings.
+ - IOMMUFD_OBJ_VDEVICE, representing a virtual device for an IOMMUFD_OBJ_DEVICE + against an IOMMUFD_OBJ_VIOMMU. This virtual device holds the device's virtual + information or attributes (related to the vIOMMU) in a VM. An immediate vDATA + example can be the virtual ID of the device on a vIOMMU, which is a unique ID + that VMM assigns to the device for a translation channel/port of the vIOMMU, + e.g. vSID of ARM SMMUv3, vDeviceID of AMD IOMMU, and vID of Intel VT-d to a + Context Table. Potential use cases of some advanced security information can + be forwarded via this object too, such as security level or realm information + in a Confidential Compute Architecture. A VMM should create a vDEVICE object + to forward all the device information in a VM, when it connects a device to a + vIOMMU, which is a separate ioctl call from attaching the same device to an + HWPT_PAGING that the vIOMMU holds. + All user-visible objects are destroyed via the IOMMU_DESTROY uAPI.
The diagrams below show relationships between user-visible objects and kernel @@ -133,16 +146,16 @@ creating the objects and links:: |____________| |____________| |______|
_______________________________________________________________________ - | iommufd (with vIOMMU) | + | iommufd (with vIOMMU/vDEVICE) | | | - | [5] | - | _____________ | - | | | | - | |----------------| vIOMMU | | - | | | | | - | | | | | - | | [1] | | [4] [2] | - | | ______ | | _____________ ________ | + | [5] [6] | + | _____________ _____________ | + | | | | | | + | |----------------| vIOMMU |<---| vDEVICE |<----| | + | | | | |_____________| | | + | | | | | | + | | [1] | | [4] | [2] | + | | ______ | | _____________ _|______ | | | | | | [3] | | | | | | | | | IOAS |<---|(HWPT_PAGING)|<---| HWPT_NESTED |<--| DEVICE | | | | |______| |_____________| |_____________| |________| | @@ -215,6 +228,15 @@ creating the objects and links:: the vIOMMU object and the HWPT_PAGING, then this vIOMMU object can be used as a nesting parent object to allocate an HWPT_NESTED object described above.
+6. IOMMUFD_OBJ_VDEVICE can be only manually created via the IOMMU_VDEVICE_ALLOC + uAPI, provided a viommu_id for an iommufd_viommu object and a dev_id for an + iommufd_device object. The vDEVICE object will be the binding between these + two parent objects. Another @virt_id will be also set via the uAPI providing + the iommufd core an index to store the vDEVICE object to a vDEVICE array per + vIOMMU. If necessary, the IOMMU driver may choose to implement a vdevce_alloc + op to init its HW for virtualization feature related to a vDEVICE. Successful + completion of this operation sets up the linkages between vIOMMU and device. + A device can only bind to an iommufd due to DMA ownership claim and attach to at most one IOAS object (no support of PASID yet).
@@ -228,6 +250,7 @@ User visible objects are backed by following datastructures: - iommufd_hwpt_paging for IOMMUFD_OBJ_HWPT_PAGING. - iommufd_hwpt_nested for IOMMUFD_OBJ_HWPT_NESTED. - iommufd_viommu for IOMMUFD_OBJ_VIOMMU. +- iommufd_vdevice for IOMMUFD_OBJ_VDEVICE
Several terminologies when looking at these datastructures:
Implement the vIOMMU's cache_invalidate op for user space to invalidate the IOTLB entries, Device ATS and CD entries that are still cached by hardware.
Add struct iommu_viommu_arm_smmuv3_invalidate defining invalidation entries that are simply in the native format of a 128-bit TLBI command. Scan those commands against the permitted command list and fix their VMID/SID fields.
Co-developed-by: Eric Auger eric.auger@redhat.com Signed-off-by: Eric Auger eric.auger@redhat.com Co-developed-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Nicolin Chen nicolinc@nvidia.com --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 5 + include/uapi/linux/iommufd.h | 24 ++++ .../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 131 ++++++++++++++++++ drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 6 +- 4 files changed, 163 insertions(+), 3 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index 5a025d310dbe..8bd740f537ee 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -529,6 +529,7 @@ struct arm_smmu_cmdq_ent { #define CMDQ_OP_TLBI_NH_ALL 0x10 #define CMDQ_OP_TLBI_NH_ASID 0x11 #define CMDQ_OP_TLBI_NH_VA 0x12 + #define CMDQ_OP_TLBI_NH_VAA 0x13 #define CMDQ_OP_TLBI_EL2_ALL 0x20 #define CMDQ_OP_TLBI_EL2_ASID 0x21 #define CMDQ_OP_TLBI_EL2_VA 0x22 @@ -949,6 +950,10 @@ void arm_smmu_attach_commit(struct arm_smmu_attach_state *state); void arm_smmu_install_ste_for_dev(struct arm_smmu_master *master, const struct arm_smmu_ste *target);
+int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu, + struct arm_smmu_cmdq *cmdq, u64 *cmds, int n, + bool sync); + #ifdef CONFIG_ARM_SMMU_V3_SVA bool arm_smmu_sva_supported(struct arm_smmu_device *smmu); bool arm_smmu_master_sva_supported(struct arm_smmu_master *master); diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h index 3486ae2d62d1..007e4b5ebe34 100644 --- a/include/uapi/linux/iommufd.h +++ b/include/uapi/linux/iommufd.h @@ -686,9 +686,11 @@ struct iommu_hwpt_get_dirty_bitmap { * enum iommu_hwpt_invalidate_data_type - IOMMU HWPT Cache Invalidation * Data Type * @IOMMU_HWPT_INVALIDATE_DATA_VTD_S1: Invalidation data for VTD_S1 + * @IOMMU_VIOMMU_INVALIDATE_DATA_ARM_SMMUV3: Invalidation data for ARM SMMUv3 */ enum iommu_hwpt_invalidate_data_type { IOMMU_HWPT_INVALIDATE_DATA_VTD_S1 = 0, + IOMMU_VIOMMU_INVALIDATE_DATA_ARM_SMMUV3 = 1, };
/** @@ -727,6 +729,28 @@ struct iommu_hwpt_vtd_s1_invalidate { __u32 __reserved; };
+/** + * struct iommu_viommu_arm_smmuv3_invalidate - ARM SMMUv3 cahce invalidation + * (IOMMU_VIOMMU_INVALIDATE_DATA_ARM_SMMUV3) + * @cmd: 128-bit cache invalidation command that runs in SMMU CMDQ. + * Must be little-endian. + * + * Supported command list only when passing in a vIOMMU via @hwpt_id: + * CMDQ_OP_TLBI_NSNH_ALL + * CMDQ_OP_TLBI_NH_VA + * CMDQ_OP_TLBI_NH_VAA + * CMDQ_OP_TLBI_NH_ALL + * CMDQ_OP_TLBI_NH_ASID + * CMDQ_OP_ATC_INV + * CMDQ_OP_CFGI_CD + * CMDQ_OP_CFGI_CD_ALL + * + * -EIO will be returned if the command is not supported. + */ +struct iommu_viommu_arm_smmuv3_invalidate { + __aligned_le64 cmd[2]; +}; + /** * struct iommu_hwpt_invalidate - ioctl(IOMMU_HWPT_INVALIDATE) * @size: sizeof(struct iommu_hwpt_invalidate) diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c index 70ad857a57b8..6f53a2928c36 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c @@ -184,9 +184,132 @@ arm_vsmmu_domain_alloc_nested(struct iommufd_viommu *viommu, return &nested_domain->domain; }
+static int arm_vsmmu_vsid_to_sid(struct arm_vsmmu *vsmmu, u32 vsid, u32 *sid) +{ + XA_STATE(xas, &vsmmu->core.vdevs, (unsigned long)vsid); + struct arm_smmu_master *master; + struct device *dev; + int ret = 0; + + xa_lock(&vsmmu->core.vdevs); + dev = vdev_to_dev(xas_load(&xas)); + if (!dev) { + ret = -EIO; + goto unlock; + } + master = dev_iommu_priv_get(dev); + + /* At this moment, iommufd only supports PCI device that has one SID */ + if (sid) + *sid = master->streams[0].id; +unlock: + xa_unlock(&vsmmu->core.vdevs); + return ret; +} + +/* This is basically iommu_viommu_arm_smmuv3_invalidate in u64 for conversion */ +struct arm_vsmmu_invalidation_cmd { + u64 cmd[2]; +}; + +/* + * Convert, in place, the raw invalidation command into an internal format that + * can be passed to arm_smmu_cmdq_issue_cmdlist(). Internally commands are + * stored in CPU endian. + * + * Enforce the VMID or SID on the command. + */ +static int arm_vsmmu_convert_user_cmd(struct arm_vsmmu *vsmmu, + struct arm_vsmmu_invalidation_cmd *cmd) +{ + /* Commands are le64 stored in u64 */ + cmd->cmd[0] = le64_to_cpu((__force __le64)cmd->cmd[0]); + cmd->cmd[1] = le64_to_cpu((__force __le64)cmd->cmd[1]); + + switch (cmd->cmd[0] & CMDQ_0_OP) { + case CMDQ_OP_TLBI_NSNH_ALL: + /* Convert to NH_ALL */ + cmd->cmd[0] = CMDQ_OP_TLBI_NH_ALL | + FIELD_PREP(CMDQ_TLBI_0_VMID, vsmmu->vmid); + cmd->cmd[1] = 0; + break; + case CMDQ_OP_TLBI_NH_VA: + case CMDQ_OP_TLBI_NH_VAA: + case CMDQ_OP_TLBI_NH_ALL: + case CMDQ_OP_TLBI_NH_ASID: + cmd->cmd[0] &= ~CMDQ_TLBI_0_VMID; + cmd->cmd[0] |= FIELD_PREP(CMDQ_TLBI_0_VMID, vsmmu->vmid); + break; + case CMDQ_OP_ATC_INV: + case CMDQ_OP_CFGI_CD: + case CMDQ_OP_CFGI_CD_ALL: { + u32 sid, vsid = FIELD_GET(CMDQ_CFGI_0_SID, cmd->cmd[0]); + + if (arm_vsmmu_vsid_to_sid(vsmmu, vsid, &sid)) + return -EIO; + cmd->cmd[0] &= ~CMDQ_CFGI_0_SID; + cmd->cmd[0] |= FIELD_PREP(CMDQ_CFGI_0_SID, sid); + break; + } + default: + return -EIO; + } + return 0; +} + +static int arm_vsmmu_cache_invalidate(struct iommufd_viommu *viommu, + struct iommu_user_data_array *array) +{ + struct arm_vsmmu *vsmmu = container_of(viommu, struct arm_vsmmu, core); + struct arm_smmu_device *smmu = vsmmu->smmu; + struct arm_vsmmu_invalidation_cmd *last; + struct arm_vsmmu_invalidation_cmd *cmds; + struct arm_vsmmu_invalidation_cmd *cur; + struct arm_vsmmu_invalidation_cmd *end; + int ret; + + cmds = kcalloc(array->entry_num, sizeof(*cmds), GFP_KERNEL); + if (!cmds) + return -ENOMEM; + cur = cmds; + end = cmds + array->entry_num; + + static_assert(sizeof(*cmds) == 2 * sizeof(u64)); + ret = iommu_copy_struct_from_full_user_array( + cmds, sizeof(*cmds), array, + IOMMU_VIOMMU_INVALIDATE_DATA_ARM_SMMUV3); + if (ret) + goto out; + + last = cmds; + while (cur != end) { + ret = arm_vsmmu_convert_user_cmd(vsmmu, cur); + if (ret) + goto out; + + /* FIXME work in blocks of CMDQ_BATCH_ENTRIES and copy each block? */ + cur++; + if (cur != end && (cur - last) != CMDQ_BATCH_ENTRIES - 1) + continue; + + /* FIXME always uses the main cmdq rather than trying to group by type */ + ret = arm_smmu_cmdq_issue_cmdlist(smmu, &smmu->cmdq, last->cmd, + cur - last, true); + if (ret) { + cur--; + goto out; + } + last = cur; + } +out: + array->entry_num = cur - cmds; + kfree(cmds); + return ret; +}
static const struct iommufd_viommu_ops arm_vsmmu_ops = { .domain_alloc_nested = arm_vsmmu_domain_alloc_nested, + .cache_invalidate = arm_vsmmu_cache_invalidate, };
struct iommufd_viommu *arm_vsmmu_alloc(struct device *dev, @@ -206,6 +329,14 @@ struct iommufd_viommu *arm_vsmmu_alloc(struct device *dev, if (!(smmu->features & ARM_SMMU_FEAT_NESTING)) return ERR_PTR(-EOPNOTSUPP);
+ /* + * FORCE_SYNC is not set with FEAT_NESTING. Some study of the exact HW + * defect is needed to determine if arm_vsmmu_cache_invalidate() needs + * any change to remove this. + */ + if (WARN_ON(smmu->options & ARM_SMMU_OPT_CMDQ_FORCE_SYNC)) + return ERR_PTR(-EOPNOTSUPP); + /* * Must support some way to prevent the VM from bypassing the cache * because VFIO currently does not do any cache maintenance. diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index de598d66b5c2..5c652e914a51 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -766,9 +766,9 @@ static void arm_smmu_cmdq_write_entries(struct arm_smmu_cmdq *cmdq, u64 *cmds, * insert their own list of commands then all of the commands from one * CPU will appear before any of the commands from the other CPU. */ -static int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu, - struct arm_smmu_cmdq *cmdq, - u64 *cmds, int n, bool sync) +int arm_smmu_cmdq_issue_cmdlist(struct arm_smmu_device *smmu, + struct arm_smmu_cmdq *cmdq, u64 *cmds, int n, + bool sync) { u64 cmd_sync[CMDQ_ENT_DWORDS]; u32 prod;
From: Jason Gunthorpe jgg@nvidia.com
Now, ATC invalidation can be done with the vIOMMU invalidation op. A guest owned IOMMU_DOMAIN_NESTED can do an ATS too. Allow it to pass in the EATS field via the vSTE words.
Signed-off-by: Jason Gunthorpe jgg@nvidia.com Signed-off-by: Nicolin Chen nicolinc@nvidia.com --- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 4 ++- include/uapi/linux/iommufd.h | 2 +- .../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 31 ++++++++++++++++--- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 26 +++++++++++++--- 4 files changed, 53 insertions(+), 10 deletions(-)
diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h index 8bd740f537ee..af25f092303f 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h @@ -305,7 +305,7 @@ static inline u32 arm_smmu_strtab_l2_idx(u32 sid) #define STRTAB_STE_1_NESTING_ALLOWED \ cpu_to_le64(STRTAB_STE_1_S1DSS | STRTAB_STE_1_S1CIR | \ STRTAB_STE_1_S1COR | STRTAB_STE_1_S1CSH | \ - STRTAB_STE_1_S1STALLD) + STRTAB_STE_1_S1STALLD | STRTAB_STE_1_EATS)
/* * Context descriptors. @@ -838,6 +838,7 @@ struct arm_smmu_domain { struct arm_smmu_nested_domain { struct iommu_domain domain; struct arm_vsmmu *vsmmu; + bool enable_ats : 1;
__le64 ste[2]; }; @@ -879,6 +880,7 @@ struct arm_smmu_master_domain { struct list_head devices_elm; struct arm_smmu_master *master; ioasid_t ssid; + bool nested_ats_flush : 1; };
static inline struct arm_smmu_domain *to_smmu_domain(struct iommu_domain *dom) diff --git a/include/uapi/linux/iommufd.h b/include/uapi/linux/iommufd.h index 007e4b5ebe34..c38ce44ae6f0 100644 --- a/include/uapi/linux/iommufd.h +++ b/include/uapi/linux/iommufd.h @@ -404,7 +404,7 @@ struct iommu_hwpt_vtd_s1 { * a user stage-1 Context Descriptor Table. Must be little-endian. * Allowed fields: (Refer to "5.2 Stream Table Entry" in SMMUv3 HW Spec) * - word-0: V, Cfg, S1Fmt, S1ContextPtr, S1CDMax - * - word-1: S1DSS, S1CIR, S1COR, S1CSH, S1STALLD + * - word-1: EATS, S1DSS, S1CIR, S1COR, S1CSH, S1STALLD * * -EIO will be returned if @ste is not legal or contains any non-allowed field. * Cfg can be used to select a S1, Bypass or Abort configuration. A Bypass diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c index 6f53a2928c36..59b8eb776f1f 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3-iommufd.c @@ -96,8 +96,6 @@ static int arm_smmu_attach_dev_nested(struct iommu_domain *domain, .master = master, .old_domain = iommu_get_domain_for_dev(dev), .ssid = IOMMU_NO_PASID, - /* Currently invalidation of ATC is not supported */ - .disable_ats = true, }; struct arm_smmu_ste ste; int ret; @@ -108,6 +106,15 @@ static int arm_smmu_attach_dev_nested(struct iommu_domain *domain, return -EBUSY;
mutex_lock(&arm_smmu_asid_lock); + /* + * The VM has to control the actual ATS state at the PCI device because + * we forward the invalidations directly from the VM. If the VM doesn't + * think ATS is on it will not generate ATC flushes and the ATC will + * become incoherent. Since we can't access the actual virtual PCI ATS + * config bit here base this off the EATS value in the STE. If the EATS + * is set then the VM must generate ATC flushes. + */ + state.disable_ats = !nested_domain->enable_ats; ret = arm_smmu_attach_prepare(&state, domain); if (ret) { mutex_unlock(&arm_smmu_asid_lock); @@ -132,8 +139,10 @@ static const struct iommu_domain_ops arm_smmu_nested_ops = { .free = arm_smmu_domain_nested_free, };
-static int arm_smmu_validate_vste(struct iommu_hwpt_arm_smmuv3 *arg) +static int arm_smmu_validate_vste(struct iommu_hwpt_arm_smmuv3 *arg, + bool *enable_ats) { + unsigned int eats; unsigned int cfg;
if (!(arg->ste[0] & cpu_to_le64(STRTAB_STE_0_V))) { @@ -150,6 +159,18 @@ static int arm_smmu_validate_vste(struct iommu_hwpt_arm_smmuv3 *arg) if (cfg != STRTAB_STE_0_CFG_ABORT && cfg != STRTAB_STE_0_CFG_BYPASS && cfg != STRTAB_STE_0_CFG_S1_TRANS) return -EIO; + + /* + * Only Full ATS or ATS UR is supported + * The EATS field will be set by arm_smmu_make_nested_domain_ste() + */ + eats = FIELD_GET(STRTAB_STE_1_EATS, le64_to_cpu(arg->ste[1])); + arg->ste[1] &= ~cpu_to_le64(STRTAB_STE_1_EATS); + if (eats != STRTAB_STE_1_EATS_ABT && eats != STRTAB_STE_1_EATS_TRANS) + return -EIO; + + if (cfg == STRTAB_STE_0_CFG_S1_TRANS) + *enable_ats = (eats == STRTAB_STE_1_EATS_TRANS); return 0; }
@@ -160,6 +181,7 @@ arm_vsmmu_domain_alloc_nested(struct iommufd_viommu *viommu, struct arm_vsmmu *vsmmu = container_of(viommu, struct arm_vsmmu, core); struct arm_smmu_nested_domain *nested_domain; struct iommu_hwpt_arm_smmuv3 arg; + bool enable_ats = false; int ret;
ret = iommu_copy_struct_from_user(&arg, user_data, @@ -167,7 +189,7 @@ arm_vsmmu_domain_alloc_nested(struct iommufd_viommu *viommu, if (ret) return ERR_PTR(ret);
- ret = arm_smmu_validate_vste(&arg); + ret = arm_smmu_validate_vste(&arg, &enable_ats); if (ret) return ERR_PTR(ret);
@@ -177,6 +199,7 @@ arm_vsmmu_domain_alloc_nested(struct iommufd_viommu *viommu,
nested_domain->domain.type = IOMMU_DOMAIN_NESTED; nested_domain->domain.ops = &arm_smmu_nested_ops; + nested_domain->enable_ats = enable_ats; nested_domain->vsmmu = vsmmu; nested_domain->ste[0] = arg.ste[0]; nested_domain->ste[1] = arg.ste[1] & ~cpu_to_le64(STRTAB_STE_1_EATS); diff --git a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c index 5c652e914a51..2a9f2d1d3ed9 100644 --- a/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c +++ b/drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c @@ -2107,7 +2107,16 @@ int arm_smmu_atc_inv_domain(struct arm_smmu_domain *smmu_domain, if (!master->ats_enabled) continue;
- arm_smmu_atc_inv_to_cmd(master_domain->ssid, iova, size, &cmd); + if (master_domain->nested_ats_flush) { + /* + * If a S2 used as a nesting parent is changed we have + * no option but to completely flush the ATC. + */ + arm_smmu_atc_inv_to_cmd(IOMMU_NO_PASID, 0, 0, &cmd); + } else { + arm_smmu_atc_inv_to_cmd(master_domain->ssid, iova, size, + &cmd); + }
for (i = 0; i < master->num_streams; i++) { cmd.atc.sid = master->streams[i].id; @@ -2631,7 +2640,7 @@ static void arm_smmu_disable_pasid(struct arm_smmu_master *master) static struct arm_smmu_master_domain * arm_smmu_find_master_domain(struct arm_smmu_domain *smmu_domain, struct arm_smmu_master *master, - ioasid_t ssid) + ioasid_t ssid, bool nested_ats_flush) { struct arm_smmu_master_domain *master_domain;
@@ -2640,7 +2649,8 @@ arm_smmu_find_master_domain(struct arm_smmu_domain *smmu_domain, list_for_each_entry(master_domain, &smmu_domain->devices, devices_elm) { if (master_domain->master == master && - master_domain->ssid == ssid) + master_domain->ssid == ssid && + master_domain->nested_ats_flush == nested_ats_flush) return master_domain; } return NULL; @@ -2671,13 +2681,18 @@ static void arm_smmu_remove_master_domain(struct arm_smmu_master *master, { struct arm_smmu_domain *smmu_domain = to_smmu_domain_devices(domain); struct arm_smmu_master_domain *master_domain; + bool nested_ats_flush = false; unsigned long flags;
if (!smmu_domain) return;
+ if (domain->type == IOMMU_DOMAIN_NESTED) + nested_ats_flush = to_smmu_nested_domain(domain)->enable_ats; + spin_lock_irqsave(&smmu_domain->devices_lock, flags); - master_domain = arm_smmu_find_master_domain(smmu_domain, master, ssid); + master_domain = arm_smmu_find_master_domain(smmu_domain, master, ssid, + nested_ats_flush); if (master_domain) { list_del(&master_domain->devices_elm); kfree(master_domain); @@ -2744,6 +2759,9 @@ int arm_smmu_attach_prepare(struct arm_smmu_attach_state *state, return -ENOMEM; master_domain->master = master; master_domain->ssid = state->ssid; + if (new_domain->type == IOMMU_DOMAIN_NESTED) + master_domain->nested_ats_flush = + to_smmu_nested_domain(new_domain)->enable_ats;
/* * During prepare we want the current smmu_domain and new
On 22/10/24 11:20, Nicolin Chen wrote:
Following the previous vIOMMU series, this adds another vDEVICE structure, representing the association from an iommufd_device to an iommufd_viommu. This gives the whole architecture a new "v" layer: _______________________________________________________________________ | iommufd (with vIOMMU/vDEVICE) | | _____________ _____________ | | | | | | | | |----------------| vIOMMU |<---| vDEVICE |<------| | | | | | |_____________| | | | | ______ | | _____________ ___|____ | | | | | | | | | | | | | | | IOAS |<---|(HWPT_PAGING)|<---| HWPT_NESTED |<--| DEVICE | | | | |______| |_____________| |_____________| |________| | |______|________|______________|__________________|_______________|_____| | | | | | ______v_____ | ______v_____ ______v_____ ___v__ | struct | | PFN | (paging) | | (nested) | |struct| |iommu_device| |------>|iommu_domain|<----|iommu_domain|<----|device| |____________| storage|____________| |____________| |______|
This vDEVICE object is used to collect and store all vIOMMU-related device information/attributes in a VM. As an initial series for vDEVICE, add only the virt_id to the vDEVICE, which is a vIOMMU specific device ID in a VM: e.g. vSID of ARM SMMUv3, vDeviceID of AMD IOMMU, and vID of Intel VT-d to a Context Table. This virt_id helps IOMMU drivers to link the vID to a pID of the device against the physical IOMMU instance. This is essential for a vIOMMU-based invalidation, where the request contains a device's vID for a device cache flush, e.g. ATC invalidation.
Therefore, with this vDEVICE object, support a vIOMMU-based invalidation, by reusing IOMMUFD_CMD_HWPT_INVALIDATE for a vIOMMU object to flush cache with a given driver data.
As for the implementation of the series, add driver support in ARM SMMUv3 for a real world use case.
This series is on Github: https://github.com/nicolinc/iommufd/commits/iommufd_viommu_p2-v4
For testing, try this "with-rmr" branch: https://github.com/nicolinc/iommufd/commits/iommufd_viommu_p2-v4-with-rmr
Is there any real example of a .vdevice_alloc hook, besides the selftests? It is not in iommufd_viommu_p2-v4-with-rmr, hence the question. I am trying to sketch something with this new machinery and less guessing would be nice. Thanks,
Paring QEMU branch for testing: https://github.com/nicolinc/qemu/commits/wip/for_iommufd_viommu_p2-v4
Changelog v4
- Added missing brackets in switch-case
- Fixed the unreleased idev refcount issue
- Reworked the iommufd_vdevice_alloc allocator
- Dropped support for IOMMU_VIOMMU_TYPE_DEFAULT
- Added missing TEST_LENGTH and fail_nth coverages
- Added a verification to the driver-allocated vDEVICE object
- Added an iommufd_vdevice_abort for a missing mutex protection
- Added a u64 structure arm_vsmmu_invalidation_cmd for user command conversion
v3 https://lore.kernel.org/all/cover.1728491532.git.nicolinc@nvidia.com/
- Added Jason's Reviewed-by
- Split this invalidation part out of the part-1 series
- Repurposed VDEV_ID ioctl to a wider vDEVICE structure and ioctl
- Reduced viommu_api functions by allowing drivers to access viommu and vdevice structure directly
- Dropped vdevs_rwsem by using xa_lock instead
- Dropped arm_smmu_cache_invalidate_user
v2 https://lore.kernel.org/all/cover.1724776335.git.nicolinc@nvidia.com/
- Limited vdev_id to one per idev
- Added a rw_sem to protect the vdev_id list
- Reworked driver-level APIs with proper lockings
- Added a new viommu_api file for IOMMUFD_DRIVER config
- Dropped useless iommu_dev point from the viommu structure
- Added missing index numnbers to new types in the uAPI header
- Dropped IOMMU_VIOMMU_INVALIDATE uAPI; Instead, reuse the HWPT one
- Reworked mock_viommu_cache_invalidate() using the new iommu helper
- Reordered details of set/unset_vdev_id handlers for proper lockings
v1 https://lore.kernel.org/all/cover.1723061377.git.nicolinc@nvidia.com/
Thanks! Nicolin
Jason Gunthorpe (2): iommu: Add iommu_copy_struct_from_full_user_array helper iommu/arm-smmu-v3: Allow ATS for IOMMU_DOMAIN_NESTED
Nicolin Chen (12): iommufd/viommu: Introduce IOMMUFD_OBJ_VDEVICE and its related struct iommufd/viommu: Add IOMMU_VDEVICE_ALLOC ioctl iommufd/selftest: Add IOMMU_VDEVICE_ALLOC test coverage iommu/viommu: Add cache_invalidate to iommufd_viommu_ops iommufd/hw_pagetable: Enforce cache invalidation op on vIOMMU-based hwpt_nested iommufd: Allow hwpt_id to carry viommu_id for IOMMU_HWPT_INVALIDATE iommufd/viommu: Add vdev_to_dev helper iommufd/selftest: Add mock_viommu_cache_invalidate iommufd/selftest: Add IOMMU_TEST_OP_DEV_CHECK_CACHE test command iommufd/selftest: Add vIOMMU coverage for IOMMU_HWPT_INVALIDATE ioctl Documentation: userspace-api: iommufd: Update vDEVICE iommu/arm-smmu-v3: Add arm_vsmmu_cache_invalidate
drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.h | 9 +- drivers/iommu/iommufd/iommufd_private.h | 12 ++ drivers/iommu/iommufd/iommufd_test.h | 30 +++ include/linux/iommu.h | 49 ++++- include/linux/iommufd.h | 50 +++++ include/uapi/linux/iommufd.h | 61 +++++- tools/testing/selftests/iommu/iommufd_utils.h | 83 +++++++ .../arm/arm-smmu-v3/arm-smmu-v3-iommufd.c | 162 +++++++++++++- drivers/iommu/arm/arm-smmu-v3/arm-smmu-v3.c | 32 ++- drivers/iommu/iommufd/device.c | 11 + drivers/iommu/iommufd/driver.c | 7 + drivers/iommu/iommufd/hw_pagetable.c | 36 +++- drivers/iommu/iommufd/main.c | 7 + drivers/iommu/iommufd/selftest.c | 115 +++++++++- drivers/iommu/iommufd/viommu.c | 108 ++++++++++ tools/testing/selftests/iommu/iommufd.c | 204 +++++++++++++++++- .../selftests/iommu/iommufd_fail_nth.c | 4 + Documentation/userspace-api/iommufd.rst | 41 +++- 18 files changed, 983 insertions(+), 38 deletions(-)
On Fri, Oct 25, 2024 at 03:54:44PM +1100, Alexey Kardashevskiy wrote:
On 22/10/24 11:20, Nicolin Chen wrote:
Following the previous vIOMMU series, this adds another vDEVICE structure, representing the association from an iommufd_device to an iommufd_viommu. This gives the whole architecture a new "v" layer: _______________________________________________________________________ | iommufd (with vIOMMU/vDEVICE) | | _____________ _____________ | | | | | | | | |----------------| vIOMMU |<---| vDEVICE |<------| | | | | | |_____________| | | | | ______ | | _____________ ___|____ | | | | | | | | | | | | | | | IOAS |<---|(HWPT_PAGING)|<---| HWPT_NESTED |<--| DEVICE | | | | |______| |_____________| |_____________| |________| | |______|________|______________|__________________|_______________|_____| | | | | | ______v_____ | ______v_____ ______v_____ ___v__ | struct | | PFN | (paging) | | (nested) | |struct| |iommu_device| |------>|iommu_domain|<----|iommu_domain|<----|device| |____________| storage|____________| |____________| |______|
This vDEVICE object is used to collect and store all vIOMMU-related device information/attributes in a VM. As an initial series for vDEVICE, add only the virt_id to the vDEVICE, which is a vIOMMU specific device ID in a VM: e.g. vSID of ARM SMMUv3, vDeviceID of AMD IOMMU, and vID of Intel VT-d to a Context Table. This virt_id helps IOMMU drivers to link the vID to a pID of the device against the physical IOMMU instance. This is essential for a vIOMMU-based invalidation, where the request contains a device's vID for a device cache flush, e.g. ATC invalidation.
Therefore, with this vDEVICE object, support a vIOMMU-based invalidation, by reusing IOMMUFD_CMD_HWPT_INVALIDATE for a vIOMMU object to flush cache with a given driver data.
As for the implementation of the series, add driver support in ARM SMMUv3 for a real world use case.
This series is on Github: https://github.com/nicolinc/iommufd/commits/iommufd_viommu_p2-v4
For testing, try this "with-rmr" branch: https://github.com/nicolinc/iommufd/commits/iommufd_viommu_p2-v4-with-rmr
Is there any real example of a .vdevice_alloc hook, besides the selftests? It is not in iommufd_viommu_p2-v4-with-rmr, hence the question. I am trying to sketch something with this new machinery and less guessing would be nice. Thanks,
No, I am actually dropping that one, and moving the vdevice struct to the private header, as there seems to be no use case: https://lore.kernel.org/linux-iommu/ZxsSYbK3gqyC84U7@Asurada-Nvidia/ https://lore.kernel.org/linux-iommu/ZxsTAANTTuQzQ9HR@Asurada-Nvidia/
Do you need vdevice_alloc in the driver for your sketch?
Thanks Nicolin
On 25/10/24 15:59, Nicolin Chen wrote:
On Fri, Oct 25, 2024 at 03:54:44PM +1100, Alexey Kardashevskiy wrote:
On 22/10/24 11:20, Nicolin Chen wrote:
Following the previous vIOMMU series, this adds another vDEVICE structure, representing the association from an iommufd_device to an iommufd_viommu. This gives the whole architecture a new "v" layer: _______________________________________________________________________ | iommufd (with vIOMMU/vDEVICE) | | _____________ _____________ | | | | | | | | |----------------| vIOMMU |<---| vDEVICE |<------| | | | | | |_____________| | | | | ______ | | _____________ ___|____ | | | | | | | | | | | | | | | IOAS |<---|(HWPT_PAGING)|<---| HWPT_NESTED |<--| DEVICE | | | | |______| |_____________| |_____________| |________| | |______|________|______________|__________________|_______________|_____| | | | | | ______v_____ | ______v_____ ______v_____ ___v__ | struct | | PFN | (paging) | | (nested) | |struct| |iommu_device| |------>|iommu_domain|<----|iommu_domain|<----|device| |____________| storage|____________| |____________| |______|
This vDEVICE object is used to collect and store all vIOMMU-related device information/attributes in a VM. As an initial series for vDEVICE, add only the virt_id to the vDEVICE, which is a vIOMMU specific device ID in a VM: e.g. vSID of ARM SMMUv3, vDeviceID of AMD IOMMU, and vID of Intel VT-d to a Context Table. This virt_id helps IOMMU drivers to link the vID to a pID of the device against the physical IOMMU instance. This is essential for a vIOMMU-based invalidation, where the request contains a device's vID for a device cache flush, e.g. ATC invalidation.
Therefore, with this vDEVICE object, support a vIOMMU-based invalidation, by reusing IOMMUFD_CMD_HWPT_INVALIDATE for a vIOMMU object to flush cache with a given driver data.
As for the implementation of the series, add driver support in ARM SMMUv3 for a real world use case.
This series is on Github: https://github.com/nicolinc/iommufd/commits/iommufd_viommu_p2-v4
For testing, try this "with-rmr" branch: https://github.com/nicolinc/iommufd/commits/iommufd_viommu_p2-v4-with-rmr
Is there any real example of a .vdevice_alloc hook, besides the selftests? It is not in iommufd_viommu_p2-v4-with-rmr, hence the question. I am trying to sketch something with this new machinery and less guessing would be nice. Thanks,
No, I am actually dropping that one, and moving the vdevice struct to the private header, as there seems to be no use case:
Why keep it then?
https://lore.kernel.org/linux-iommu/ZxsSYbK3gqyC84U7@Asurada-Nvidia/ https://lore.kernel.org/linux-iommu/ZxsTAANTTuQzQ9HR@Asurada-Nvidia/
Do you need vdevice_alloc in the driver for your sketch?
At the moment one of the things I am looking for is a place to add tsm_bind(host_bdfn, guest_bdfn, kvm_vmid). I assumed that this vdevice represents the guest's IOMMU attributes for a passed through device and naturally stores the guest_bdfn as an id (for AMD). I am still trying to wrap my head around these new things. Thanks,
On Fri, Oct 25, 2024 at 04:32:10PM +1100, Alexey Kardashevskiy wrote:
External email: Use caution opening links or attachments
On 25/10/24 15:59, Nicolin Chen wrote:
On Fri, Oct 25, 2024 at 03:54:44PM +1100, Alexey Kardashevskiy wrote:
On 22/10/24 11:20, Nicolin Chen wrote:
Following the previous vIOMMU series, this adds another vDEVICE structure, representing the association from an iommufd_device to an iommufd_viommu. This gives the whole architecture a new "v" layer: _______________________________________________________________________ | iommufd (with vIOMMU/vDEVICE) | | _____________ _____________ | | | | | | | | |----------------| vIOMMU |<---| vDEVICE |<------| | | | | | |_____________| | | | | ______ | | _____________ ___|____ | | | | | | | | | | | | | | | IOAS |<---|(HWPT_PAGING)|<---| HWPT_NESTED |<--| DEVICE | | | | |______| |_____________| |_____________| |________| | |______|________|______________|__________________|_______________|_____| | | | | | ______v_____ | ______v_____ ______v_____ ___v__ | struct | | PFN | (paging) | | (nested) | |struct| |iommu_device| |------>|iommu_domain|<----|iommu_domain|<----|device| |____________| storage|____________| |____________| |______|
This vDEVICE object is used to collect and store all vIOMMU-related device information/attributes in a VM. As an initial series for vDEVICE, add only the virt_id to the vDEVICE, which is a vIOMMU specific device ID in a VM: e.g. vSID of ARM SMMUv3, vDeviceID of AMD IOMMU, and vID of Intel VT-d to a Context Table. This virt_id helps IOMMU drivers to link the vID to a pID of the device against the physical IOMMU instance. This is essential for a vIOMMU-based invalidation, where the request contains a device's vID for a device cache flush, e.g. ATC invalidation.
Therefore, with this vDEVICE object, support a vIOMMU-based invalidation, by reusing IOMMUFD_CMD_HWPT_INVALIDATE for a vIOMMU object to flush cache with a given driver data.
As for the implementation of the series, add driver support in ARM SMMUv3 for a real world use case.
This series is on Github: https://github.com/nicolinc/iommufd/commits/iommufd_viommu_p2-v4
For testing, try this "with-rmr" branch: https://github.com/nicolinc/iommufd/commits/iommufd_viommu_p2-v4-with-rmr
Is there any real example of a .vdevice_alloc hook, besides the selftests? It is not in iommufd_viommu_p2-v4-with-rmr, hence the question. I am trying to sketch something with this new machinery and less guessing would be nice. Thanks,
No, I am actually dropping that one, and moving the vdevice struct to the private header, as there seems to be no use case:
Why keep it then?
We need that structure to store per-vIOMMU virtual ID. Hiding it in the core only means we need to provide another vIOMMU APIs for drivers to look up the ID, v.s. exposing it for drivers to access directly.
Thanks Nicolin
On 25/10/24 16:41, Nicolin Chen wrote:
On Fri, Oct 25, 2024 at 04:32:10PM +1100, Alexey Kardashevskiy wrote:
External email: Use caution opening links or attachments
On 25/10/24 15:59, Nicolin Chen wrote:
On Fri, Oct 25, 2024 at 03:54:44PM +1100, Alexey Kardashevskiy wrote:
On 22/10/24 11:20, Nicolin Chen wrote:
Following the previous vIOMMU series, this adds another vDEVICE structure, representing the association from an iommufd_device to an iommufd_viommu. This gives the whole architecture a new "v" layer: _______________________________________________________________________ | iommufd (with vIOMMU/vDEVICE) | | _____________ _____________ | | | | | | | | |----------------| vIOMMU |<---| vDEVICE |<------| | | | | | |_____________| | | | | ______ | | _____________ ___|____ | | | | | | | | | | | | | | | IOAS |<---|(HWPT_PAGING)|<---| HWPT_NESTED |<--| DEVICE | | | | |______| |_____________| |_____________| |________| | |______|________|______________|__________________|_______________|_____| | | | | | ______v_____ | ______v_____ ______v_____ ___v__ | struct | | PFN | (paging) | | (nested) | |struct| |iommu_device| |------>|iommu_domain|<----|iommu_domain|<----|device| |____________| storage|____________| |____________| |______|
This vDEVICE object is used to collect and store all vIOMMU-related device information/attributes in a VM. As an initial series for vDEVICE, add only the virt_id to the vDEVICE, which is a vIOMMU specific device ID in a VM: e.g. vSID of ARM SMMUv3, vDeviceID of AMD IOMMU, and vID of Intel VT-d to a Context Table. This virt_id helps IOMMU drivers to link the vID to a pID of the device against the physical IOMMU instance. This is essential for a vIOMMU-based invalidation, where the request contains a device's vID for a device cache flush, e.g. ATC invalidation.
Therefore, with this vDEVICE object, support a vIOMMU-based invalidation, by reusing IOMMUFD_CMD_HWPT_INVALIDATE for a vIOMMU object to flush cache with a given driver data.
As for the implementation of the series, add driver support in ARM SMMUv3 for a real world use case.
This series is on Github: https://github.com/nicolinc/iommufd/commits/iommufd_viommu_p2-v4
For testing, try this "with-rmr" branch: https://github.com/nicolinc/iommufd/commits/iommufd_viommu_p2-v4-with-rmr
Is there any real example of a .vdevice_alloc hook, besides the selftests? It is not in iommufd_viommu_p2-v4-with-rmr, hence the question. I am trying to sketch something with this new machinery and less guessing would be nice. Thanks,
No, I am actually dropping that one, and moving the vdevice struct to the private header, as there seems to be no use case:
Why keep it then?
We need that structure to store per-vIOMMU virtual ID. Hiding it in the core only means we need to provide another vIOMMU APIs for drivers to look up the ID, v.s. exposing it for drivers to access directly.
Sorry I lost you here. If we need it, then there should be an example of .vdevice_alloc() somewhere but you say they is not one. How do you test this, with just selftests? :) Thanks,
Thanks Nicolin
On Fri, Oct 25, 2024 at 04:58:33PM +1100, Alexey Kardashevskiy wrote:
Is there any real example of a .vdevice_alloc hook, besides the selftests? It is not in iommufd_viommu_p2-v4-with-rmr, hence the question. I am trying to sketch something with this new machinery and less guessing would be nice. Thanks,
No, I am actually dropping that one, and moving the vdevice struct to the private header, as there seems to be no use case:
Why keep it then?
We need that structure to store per-vIOMMU virtual ID. Hiding it in the core only means we need to provide another vIOMMU APIs for drivers to look up the ID, v.s. exposing it for drivers to access directly.
Sorry I lost you here. If we need it, then there should be an example of .vdevice_alloc() somewhere but you say they is not one. How do you test this, with just selftests? :) Thanks,
A vDEVICE object will be core-allocated and core-managed, while the vdevice_alloc is for driver-allocated purpose for which there is no use case (at least with this series). You can check the vdev ioctl in this version that has two pathways to allocate a vDEVICE object.
A vdev_id is used to index viommu's xarray for a driver to convert the id to a dev pointer via a vIOMMU API. Dropping .vdevice_alloc just means the driver only lost its direct access.
Nicolin
On Thu, Oct 24, 2024 at 11:14:21PM -0700, Nicolin Chen wrote:
On Fri, Oct 25, 2024 at 04:58:33PM +1100, Alexey Kardashevskiy wrote:
Is there any real example of a .vdevice_alloc hook, besides the selftests? It is not in iommufd_viommu_p2-v4-with-rmr, hence the question. I am trying to sketch something with this new machinery and less guessing would be nice. Thanks,
No, I am actually dropping that one, and moving the vdevice struct to the private header, as there seems to be no use case:
Why keep it then?
We need that structure to store per-vIOMMU virtual ID. Hiding it in the core only means we need to provide another vIOMMU APIs for drivers to look up the ID, v.s. exposing it for drivers to access directly.
Sorry I lost you here. If we need it, then there should be an example of .vdevice_alloc() somewhere but you say they is not one. How do you test this, with just selftests? :) Thanks,
A vDEVICE object will be core-allocated and core-managed, while the vdevice_alloc is for driver-allocated purpose for which there is no use case (at least with this series). You can check the vdev ioctl in this version that has two pathways to allocate a vDEVICE object.
A vdev_id is used to index viommu's xarray for a driver to convert the id to a dev pointer via a vIOMMU API. Dropping .vdevice_alloc just means the driver only lost its direct access.
I think the point here is this has to go in stages at the present moment the iommu drivers don't need to hook the vdevice object, so Nicolin should take it out of this series.
I would expect CC to need to be in this path, so we should bring it back in the CC series.
For CC I'm broadly expecting that creating the CC type vIOMMU will call a CC implementation, and then creating a vdevice against the vIOMMU will also call the CC implementation. The two callbacks would ask the secure world to create the relevant VM visible objects.
Jason
linux-kselftest-mirror@lists.linaro.org