Intel SIOV allows creating virtual devices whose vRID is represented by a pasid of a physical device. Such a device is called a SIOV virtual device in this series. It can be bound to an iommufd just as a physical device can, and later be attached to an IOAS/hwpt using that pasid. Such a pasid is called the default pasid.
iommufd already supports pasid attach[1]. So a simple way to support SIOV virtual device attachment is to let the device driver call iommufd_device_pasid_attach() and pass in the default pasid of the virtual device. This would work for now, but it becomes a problem if the iommufd core wants to differentiate default pasids from other kinds of pasids (e.g. a pasid given by userspace). For example, when page requests are later forwarded to userspace, default pasids are not supposed to be sent to userspace, since they are mainly used by the SIOV device driver.
For this reason, this series adds a new API to bind the default pasid to iommufd, and extends iommufd_device_attach() to convert the attachment into a pasid attach with the default pasid. Device drivers (e.g. VFIO) that support SIOV shall call the APIs below to interact with iommufd:
- iommufd_device_bind_pasid(): Bind virtual device (a pasid of a device) to iommufd;
- iommufd_device_attach(): Attach a SIOV virtual device to IOAS/HWPT;
- iommufd_device_replace(): Replace IOAS/HWPT of a SIOV virtual device;
- iommufd_device_detach(): Detach IOAS/HWPT of a SIOV virtual device;
- iommufd_device_unbind(): Unbind virtual device from iommufd;
For vfio devices, the device drivers that support SIOV should:
- register the vdev for the SIOV virtual device with vfio_register_pasid_iommu_dev()
- bind the vdev to iommufd with iommufd_device_bind_pasid() in the .bind_iommufd() callback
- allocate the pasid themselves before calling iommufd_device_bind_pasid()
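The call order described above can be sketched as a small userspace model. This is not kernel code: struct vdev_model, the vdev_* functions and the state names are hypothetical stand-ins that only mirror the sequence (allocate a pasid, bind, attach, optionally replace, detach, unbind). The real iommufd_device_*() functions take different arguments and some of them return void. Treating pasid 0 as "not allocated" is an assumption of this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of one SIOV virtual device as seen by its driver. */
enum vdev_state { VDEV_UNBOUND, VDEV_BOUND, VDEV_ATTACHED };

struct vdev_model {
	enum vdev_state state;
	uint32_t pasid; /* allocated by the driver before bind */
	uint32_t pt_id; /* IOAS/HWPT currently attached, 0 if none */
};

/* Stands in for iommufd_device_bind_pasid(): needs a driver-allocated pasid. */
static int vdev_bind_pasid(struct vdev_model *v, uint32_t pasid)
{
	if (v->state != VDEV_UNBOUND || pasid == 0)
		return -EINVAL;
	v->pasid = pasid;
	v->state = VDEV_BOUND;
	return 0;
}

/* Stands in for iommufd_device_attach(): only valid after bind. */
static int vdev_attach(struct vdev_model *v, uint32_t pt_id)
{
	if (v->state != VDEV_BOUND)
		return -EINVAL;
	v->pt_id = pt_id;
	v->state = VDEV_ATTACHED;
	return 0;
}

/* Stands in for iommufd_device_replace(): swaps the page table in place. */
static int vdev_replace(struct vdev_model *v, uint32_t pt_id)
{
	if (v->state != VDEV_ATTACHED)
		return -EINVAL;
	v->pt_id = pt_id;
	return 0;
}

/* Stands in for iommufd_device_detach(). */
static int vdev_detach(struct vdev_model *v)
{
	if (v->state != VDEV_ATTACHED)
		return -EINVAL;
	v->pt_id = 0;
	v->state = VDEV_BOUND;
	return 0;
}

/* Stands in for iommufd_device_unbind(): refused while still attached. */
static int vdev_unbind(struct vdev_model *v)
{
	if (v->state != VDEV_BOUND)
		return -EBUSY;
	v->pasid = 0;
	v->state = VDEV_UNBOUND;
	return 0;
}
```

The model only encodes ordering: attach before bind fails, and unbind is refused until the device has been detached.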
Complete code can be found at [2].
[1] https://lore.kernel.org/linux-iommu/20230926092651.17041-1-yi.l.liu@intel.co...
[2] https://github.com/yiliu1765/iommufd/tree/iommufd_pasid_siov
Regards,
Yi Liu

Kevin Tian (5):
  iommufd: Handle unsafe interrupts in a separate function
  iommufd: Introduce iommufd_alloc_device()
  iommufd: Add iommufd_device_bind_pasid()
  iommufd: Support attach/replace for SIOV virtual device {dev, pasid}
  vfio: Add vfio_register_pasid_iommu_dev()

Yi Liu (2):
  iommufd/selftest: Extend IOMMU_TEST_OP_MOCK_DOMAIN to pass in pasid
  iommufd/selftest: Add test coverage for SIOV virtual device

 drivers/iommu/iommufd/device.c                | 163 ++++++++++++++----
 drivers/iommu/iommufd/iommufd_private.h       |   7 +
 drivers/iommu/iommufd/iommufd_test.h          |   2 +
 drivers/iommu/iommufd/selftest.c              |  10 +-
 drivers/vfio/group.c                          |  18 ++
 drivers/vfio/vfio.h                           |   8 +
 drivers/vfio/vfio_main.c                      |  10 ++
 include/linux/iommufd.h                       |   3 +
 include/linux/vfio.h                          |   1 +
 tools/testing/selftests/iommu/iommufd.c       |  75 ++++++--
 .../selftests/iommu/iommufd_fail_nth.c        |  42 ++++-
 tools/testing/selftests/iommu/iommufd_utils.h |  21 ++-
 12 files changed, 296 insertions(+), 64 deletions(-)

From: Kevin Tian kevin.tian@intel.com
This wraps the unsafe interrupt handling into a helper, as the same check is also required when supporting iommufd_device_bind_pasid() later.
Signed-off-by: Kevin Tian kevin.tian@intel.com
Signed-off-by: Yi Liu yi.l.liu@intel.com
---
 drivers/iommu/iommufd/device.c | 36 ++++++++++++++++++++--------------
 1 file changed, 21 insertions(+), 15 deletions(-)

diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 6a6145b4a25e..ca3919fecc89 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -145,6 +145,25 @@ void iommufd_device_destroy(struct iommufd_object *obj)
 		iommufd_ctx_put(idev->ictx);
 }
 
+/*
+ * For historical compat with VFIO the insecure interrupt path is
+ * allowed if the module parameter is set. Insecure means that a MemWr
+ * operation from the device (eg a simple DMA) can trigger an
+ * interrupt outside this iommufd context.
+ */
+static int iommufd_allow_unsafe_interrupts(struct device *dev)
+{
+	if (!allow_unsafe_interrupts)
+		return -EPERM;
+
+	dev_warn(
+		dev,
+		"MSI interrupts are not secure, they cannot be isolated by the platform. "
+		"Check that platform features like interrupt remapping are enabled. "
+		"Use the \"allow_unsafe_interrupts\" module parameter to override\n");
+	return 0;
+}
+
 /**
  * iommufd_device_bind - Bind a physical device to an iommu fd
  * @ictx: iommufd file descriptor
@@ -179,24 +198,11 @@ struct iommufd_device *iommufd_device_bind(struct iommufd_ctx *ictx,
 	if (IS_ERR(igroup))
 		return ERR_CAST(igroup);
 
-	/*
-	 * For historical compat with VFIO the insecure interrupt path is
-	 * allowed if the module parameter is set. Secure/Isolated means that a
-	 * MemWr operation from the device (eg a simple DMA) cannot trigger an
-	 * interrupt outside this iommufd context.
-	 */
 	if (!iommufd_selftest_is_mock_dev(dev) &&
 	    !iommu_group_has_isolated_msi(igroup->group)) {
-		if (!allow_unsafe_interrupts) {
-			rc = -EPERM;
+		rc = iommufd_allow_unsafe_interrupts(dev);
+		if (rc)
 			goto out_group_put;
-		}
-
-		dev_warn(
-			dev,
-			"MSI interrupts are not secure, they cannot be isolated by the platform. "
-			"Check that platform features like interrupt remapping are enabled. "
-			"Use the \"allow_unsafe_interrupts\" module parameter to override\n");
 	}
 
 	rc = iommu_device_claim_dma_owner(dev, ictx);

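The gating behaviour of the new helper can be modelled in plain userspace C. A minimal sketch, assuming a global flag that plays the role of the allow_unsafe_interrupts module parameter; msi_model_allow_unsafe() and msi_model_check() are hypothetical names, and the dev_warn() call of the real helper is reduced to a comment.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Stand-in for the module parameter of the same name. */
static bool allow_unsafe_interrupts;

/* Mirrors the helper: refuse unless the override is set. */
static int msi_model_allow_unsafe(void)
{
	if (!allow_unsafe_interrupts)
		return -EPERM;
	/* The kernel helper prints a dev_warn() here before succeeding. */
	return 0;
}

/*
 * Hypothetical caller: the helper is consulted only for a non-mock
 * device that lacks isolated MSI, as in iommufd_device_bind().
 */
static int msi_model_check(bool is_mock_dev, bool has_isolated_msi)
{
	if (!is_mock_dev && !has_isolated_msi)
		return msi_model_allow_unsafe();
	return 0;
}
```

Returning the error code from the helper, rather than setting rc inline, is what lets a second caller reuse the same check.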
From: Kevin Tian kevin.tian@intel.com
This abstracts the common logic used by iommufd_device_bind() and the later iommufd_device_bind_pasid() into a helper.
Signed-off-by: Kevin Tian kevin.tian@intel.com
Signed-off-by: Yi Liu yi.l.liu@intel.com
---
 drivers/iommu/iommufd/device.c | 34 +++++++++++++++++++++++-----------
 1 file changed, 23 insertions(+), 11 deletions(-)

diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index ca3919fecc89..9dd76d92b7f6 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -164,6 +164,27 @@ static int iommufd_allow_unsafe_interrupts(struct device *dev)
 	return 0;
 }
 
+static struct iommufd_device *iommufd_alloc_device(struct iommufd_ctx *ictx,
+						   struct device *dev)
+{
+	struct iommufd_device *idev;
+
+	idev = iommufd_object_alloc(ictx, idev, IOMMUFD_OBJ_DEVICE);
+	if (IS_ERR(idev))
+		return idev;
+	idev->ictx = ictx;
+	if (!iommufd_selftest_is_mock_dev(dev))
+		iommufd_ctx_get(ictx);
+	idev->dev = dev;
+	idev->enforce_cache_coherency =
+		device_iommu_capable(dev, IOMMU_CAP_ENFORCE_CACHE_COHERENCY);
+	xa_init(&idev->pasid_hwpts);
+
+	/* The calling driver is a user until iommufd_device_unbind() */
+	refcount_inc(&idev->obj.users);
+	return idev;
+}
+
 /**
  * iommufd_device_bind - Bind a physical device to an iommu fd
  * @ictx: iommufd file descriptor
@@ -209,24 +230,15 @@ struct iommufd_device *iommufd_device_bind(struct iommufd_ctx *ictx,
 	if (rc)
 		goto out_group_put;
 
-	idev = iommufd_object_alloc(ictx, idev, IOMMUFD_OBJ_DEVICE);
+	idev = iommufd_alloc_device(ictx, dev);
 	if (IS_ERR(idev)) {
 		rc = PTR_ERR(idev);
 		goto out_release_owner;
 	}
-	idev->ictx = ictx;
-	if (!iommufd_selftest_is_mock_dev(dev))
-		iommufd_ctx_get(ictx);
-	idev->dev = dev;
-	idev->enforce_cache_coherency =
-		device_iommu_capable(dev, IOMMU_CAP_ENFORCE_CACHE_COHERENCY);
-	/* The calling driver is a user until iommufd_device_unbind() */
-	refcount_inc(&idev->obj.users);
+
 	/* igroup refcount moves into iommufd_device */
 	idev->igroup = igroup;
 
-	xa_init(&idev->pasid_hwpts);
-
 	/*
 	 * If the caller fails after this success it must call
 	 * iommufd_unbind_device() which is safe since we hold this refcount.

From: Kevin Tian kevin.tian@intel.com
Intel SIOV allows creating virtual devices whose vRID is represented by a pasid of a physical device. Such a device can be bound to an iommufd just as a physical device can, and later be attached to an IOAS/hwpt using that pasid.
Binding a virtual device has a different security contract compared to binding a physical device. There is no DMA ownership claim per pasid, since the parent device, including its entire pasid space, is already claimed by the parent driver. With that we simply store the pasid in the object once it passes the other checks.
Signed-off-by: Kevin Tian kevin.tian@intel.com
Signed-off-by: Yi Liu yi.l.liu@intel.com
---
 drivers/iommu/iommufd/device.c          | 72 ++++++++++++++++++++++++-
 drivers/iommu/iommufd/iommufd_private.h |  7 +++
 include/linux/iommufd.h                 |  3 ++
 3 files changed, 80 insertions(+), 2 deletions(-)

diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 9dd76d92b7f6..35c1419ee96b 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -5,6 +5,7 @@
 #include <linux/slab.h>
 #include <linux/iommu.h>
 #include <uapi/linux/iommufd.h>
+#include <linux/msi.h>
 #include "../iommu-priv.h"
 
 #include "io_pagetable.h"
@@ -139,8 +140,10 @@ void iommufd_device_destroy(struct iommufd_object *obj)
 	WARN_ON(!xa_empty(&idev->pasid_hwpts));
 	if (idev->has_user_data)
 		dev_iommu_ops(idev->dev)->unset_dev_user_data(idev->dev);
-	iommu_device_release_dma_owner(idev->dev);
-	iommufd_put_group(idev->igroup);
+	if (idev->igroup) {
+		iommu_device_release_dma_owner(idev->dev);
+		iommufd_put_group(idev->igroup);
+	}
 	if (!iommufd_selftest_is_mock_dev(idev->dev))
 		iommufd_ctx_put(idev->ictx);
 }
@@ -257,6 +260,71 @@ struct iommufd_device *iommufd_device_bind(struct iommufd_ctx *ictx,
 }
 EXPORT_SYMBOL_NS_GPL(iommufd_device_bind, IOMMUFD);
 
+/**
+ * iommufd_device_bind_pasid - Bind a virtual device to an iommu fd
+ * @ictx: iommufd file descriptor
+ * @dev: Pointer to the parent physical device struct
+ * @pasid: the pasid value representing vRID of this virtual device
+ * @id: Output ID number to return to userspace for this device
+ *
+ * The virtual device always tags its DMA with the provided pasid.
+ * A successful bind allows the pasid to be used in other iommufd
+ * operations e.g. attach/detach and returns struct iommufd_device
+ * pointer, otherwise returns error pointer.
+ *
+ * There is no ownership check per pasid. A driver using this API
+ * must already claim the DMA ownership over the parent device and
+ * the pasid is allocated by the driver itself.
+ *
+ * PASID is a device capability so unlike iommufd_device_bind() it
+ * has no iommu group associated.
+ *
+ * The caller must undo this with iommufd_device_unbind()
+ */
+struct iommufd_device *iommufd_device_bind_pasid(struct iommufd_ctx *ictx,
+						 struct device *dev,
+						 u32 pasid, u32 *id)
+{
+	struct iommufd_device *idev;
+	int rc;
+
+	/*
+	 * iommufd always sets IOMMU_CACHE because we offer no way for userspace
+	 * to restore cache coherency.
+	 */
+	if (!device_iommu_capable(dev, IOMMU_CAP_CACHE_COHERENCY))
+		return ERR_PTR(-EINVAL);
+
+	/*
+	 * No iommu supports pasid-granular msi message today. Here we
+	 * just check whether the parent device can do safe interrupts.
+	 * Isolation between virtual devices within the parent device
+	 * relies on the parent driver to enforce.
+	 */
+	if (!iommufd_selftest_is_mock_dev(dev) &&
+	    !msi_device_has_isolated_msi(dev)) {
+		rc = iommufd_allow_unsafe_interrupts(dev);
+		if (rc)
+			return ERR_PTR(rc);
+	}
+
+	idev = iommufd_alloc_device(ictx, dev);
+	if (IS_ERR(idev))
+		return idev;
+	idev->default_pasid = pasid;
+
+	/*
+	 * If the caller fails after this success it must call
+	 * iommufd_unbind_device() which is safe since we hold this refcount.
+	 * This also means the device is a leaf in the graph and no other
+	 * object can take a reference on it.
+	 */
+	iommufd_object_finalize(ictx, &idev->obj);
+	*id = idev->obj.id;
+	return idev;
+}
+EXPORT_SYMBOL_NS_GPL(iommufd_device_bind_pasid, IOMMUFD);
+
 /**
  * iommufd_ctx_has_group - True if any device within the group is bound
  *                         to the ictx
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index 06ebee4c87c5..7b3405fd6fd3 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -332,10 +332,17 @@ struct iommufd_group {
 struct iommufd_device {
 	struct iommufd_object obj;
 	struct iommufd_ctx *ictx;
+	/* valid if this is a physical device */
 	struct iommufd_group *igroup;
 	struct list_head group_item;
 	/* always the physical device */
 	struct device *dev;
+	/*
+	 * valid if this is a virtual device which gains pasid-granular
+	 * DMA isolation in IOMMU. The default pasid is used when attaching
+	 * this device to a IOAS/hwpt.
+	 */
+	u32 default_pasid;
 	struct xarray pasid_hwpts;
 	bool enforce_cache_coherency;
 	bool has_user_data;
diff --git a/include/linux/iommufd.h b/include/linux/iommufd.h
index 0b007c376306..402320d6eba1 100644
--- a/include/linux/iommufd.h
+++ b/include/linux/iommufd.h
@@ -20,6 +20,9 @@ struct iommu_group;
 
 struct iommufd_device *iommufd_device_bind(struct iommufd_ctx *ictx,
 					   struct device *dev, u32 *id);
+struct iommufd_device *iommufd_device_bind_pasid(struct iommufd_ctx *ictx,
+						 struct device *dev,
+						 u32 pasid, u32 *id);
 void iommufd_device_unbind(struct iommufd_device *idev);
 
 int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id);

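The ownership difference between the two bind paths can be illustrated with a small userspace model. It is only a sketch: struct idev_model, struct group_model and the idev_model_*() functions are hypothetical, and the single dma_owner_claimed flag stands in for iommu_device_claim_dma_owner() and iommu_device_release_dma_owner(). What it mirrors from the patch is that a virtual device carries no igroup, stores its pasid, and that teardown releases ownership and the group reference only when an igroup exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct iommufd_group: just a refcount. */
struct group_model {
	int refs;
};

/* Hypothetical stand-in for the relevant fields of struct iommufd_device. */
struct idev_model {
	struct group_model *igroup; /* non-NULL only for a physical device */
	uint32_t default_pasid;     /* meaningful only when igroup is NULL */
	bool dma_owner_claimed;     /* claimed per device, never per pasid */
};

/* Physical bind: take a group reference and claim DMA ownership. */
static void idev_model_bind(struct idev_model *idev, struct group_model *g)
{
	g->refs++;
	idev->igroup = g;
	idev->default_pasid = 0;
	idev->dma_owner_claimed = true;
}

/* Virtual bind: no group, no ownership claim, only remember the pasid. */
static void idev_model_bind_pasid(struct idev_model *idev, uint32_t pasid)
{
	idev->igroup = NULL;
	idev->default_pasid = pasid;
	idev->dma_owner_claimed = false; /* the parent driver already owns it */
}

/* Teardown: release ownership and the group only if there is a group. */
static void idev_model_destroy(struct idev_model *idev)
{
	if (idev->igroup) {
		idev->dma_owner_claimed = false;
		idev->igroup->refs--;
		idev->igroup = NULL;
	}
}
```

Using the igroup pointer as the discriminator means no extra "is virtual" flag is needed, at the cost of making the NULL check load-bearing wherever the group is dereferenced.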
> From: Liu, Yi L yi.l.liu@intel.com
> Sent: Monday, October 9, 2023 4:51 PM
>
> +struct iommufd_device *iommufd_device_bind_pasid(struct iommufd_ctx *ictx,
> +						 struct device *dev,
> +						 u32 pasid, u32 *id)
> +{
> +	struct iommufd_device *idev;
> +	int rc;
> +
> +	/*
> +	 * iommufd always sets IOMMU_CACHE because we offer no way for userspace
> +	 * to restore cache coherency.
> +	 */
> +	if (!device_iommu_capable(dev, IOMMU_CAP_CACHE_COHERENCY))
> +		return ERR_PTR(-EINVAL);
> +
> +	/*
> +	 * No iommu supports pasid-granular msi message today. Here we
> +	 * just check whether the parent device can do safe interrupts.
> +	 * Isolation between virtual devices within the parent device
> +	 * relies on the parent driver to enforce.
> +	 */
> +	if (!iommufd_selftest_is_mock_dev(dev) &&
> +	    !msi_device_has_isolated_msi(dev)) {
> +		rc = iommufd_allow_unsafe_interrupts(dev);
> +		if (rc)
> +			return ERR_PTR(rc);
> +	}

Only MemWr w/o pasid can be interpreted as an interrupt message then we need msi isolation to protect.

But for SIOV all MemWr's are tagged with a pasid hence can never trigger an interrupt. From this angle looks this check is unnecessary.

On 2023/10/10 16:19, Tian, Kevin wrote:
> Only MemWr w/o pasid can be interpreted as an interrupt message then we need msi isolation to protect.

yes.

> But for SIOV all MemWr's are tagged with a pasid hence can never trigger an interrupt. From this angle looks this check is unnecessary.

But the interrupts out from a SIOV virtual device do not have pasid (at least today). Seems still need a check here if we consider this bind for a SIOV virtual device just like binding a physical device.

From: Liu, Yi L yi.l.liu@intel.com Sent: Wednesday, November 8, 2023 3:45 PM
On 2023/10/10 16:19, Tian, Kevin wrote:
> > Only MemWr w/o pasid can be interpreted as an interrupt message then we need msi isolation to protect.
>
> yes.
>
> > But for SIOV all MemWr's are tagged with a pasid hence can never trigger an interrupt. From this angle looks this check is unnecessary.
>
> But the interrupts out from a SIOV virtual device do not have pasid (at least today). Seems still need a check here if we consider this bind for a SIOV virtual device just like binding a physical device.

this check assumes the device is trusted. as long as there is no way for malicious guest to generate arbitrary interrupt messages then it's fine.

for physical device a MemWr can be interpreted as interrupt so we need msi isolation.

for SIOV all MemWr has pasid then we don't have such worry. IMS is under host's control so interrupt messages are already sanitized.

On 2023/11/8 16:46, Tian, Kevin wrote:
From: Liu, Yi L yi.l.liu@intel.com Sent: Wednesday, November 8, 2023 3:45 PM
On 2023/10/10 16:19, Tian, Kevin wrote:
> > > Only MemWr w/o pasid can be interpreted as an interrupt message then we need msi isolation to protect.
> >
> > yes.
> >
> > > But for SIOV all MemWr's are tagged with a pasid hence can never trigger an interrupt. From this angle looks this check is unnecessary.
> >
> > But the interrupts out from a SIOV virtual device do not have pasid (at least today). Seems still need a check here if we consider this bind for a SIOV virtual device just like binding a physical device.
>
> this check assumes the device is trusted. as long as there is no way for malicious guest to generate arbitrary interrupt messages then it's fine.
>
> for physical device a MemWr can be interpreted as interrupt so we need msi isolation.
>
> for SIOV all MemWr has pasid then we don't have such worry. IMS is under host's control so interrupt messages are already sanitized.

sure. this makes sense to me now.:)
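The argument the exchange above converges on can be written down as a tiny predicate. This is a hedged illustration only: struct memwr_model and memwr_may_raise_interrupt() are hypothetical, and the 0xFEE00000-0xFEEFFFFF window is the x86 interrupt address range, used here as an assumption to make the example concrete (the thread itself does not name an address range). The point it encodes is that a MemWr tagged with a pasid is never interpreted as an interrupt message, so only untagged writes need MSI isolation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for illustration: the x86 interrupt address window. */
#define MEMWR_MODEL_MSI_BASE 0xFEE00000ULL
#define MEMWR_MODEL_MSI_END  0xFEEFFFFFULL

/* Hypothetical description of one memory write issued by a device. */
struct memwr_model {
	uint64_t addr;
	bool has_pasid; /* true for all DMA of a SIOV virtual device */
};

/*
 * A write can be taken as an interrupt message only if it carries no
 * pasid and targets the interrupt window.
 */
static bool memwr_may_raise_interrupt(const struct memwr_model *wr)
{
	if (wr->has_pasid)
		return false;
	return wr->addr >= MEMWR_MODEL_MSI_BASE &&
	       wr->addr <= MEMWR_MODEL_MSI_END;
}
```

Interrupts that a SIOV virtual device does raise go through IMS, which per the discussion is programmed by the host and so is outside what this predicate models.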
From: Kevin Tian kevin.tian@intel.com
A SIOV device allows its driver to tag the virtual devices within it with different PASIDs. Such a driver should call iommufd_device_bind_pasid() to connect the pasid of the device to iommufd; the driver is then able to attach the virtual device to an IOAS/HWPT with the iommufd_device_attach() API.
Unlike physical devices, for SIOV virtual devices iommufd_device_attach() eventually uses idev->default_pasid when the virtual device is attached to an IOAS/HWPT. Also, if an iommu domain is allocated in the attach/replace path, there is no need to do immediate_attach for that allocation, since the attach/replace is eventually a pasid attach/replace.
Signed-off-by: Kevin Tian kevin.tian@intel.com
Signed-off-by: Yi Liu yi.l.liu@intel.com
---
 drivers/iommu/iommufd/device.c | 21 +++++++++++++++++----
 1 file changed, 17 insertions(+), 4 deletions(-)

diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c
index 35c1419ee96b..4882e3106b2e 100644
--- a/drivers/iommu/iommufd/device.c
+++ b/drivers/iommu/iommufd/device.c
@@ -841,7 +841,11 @@ int iommufd_device_attach(struct iommufd_device *idev, u32 *pt_id)
 		.pasid = IOMMU_PASID_INVALID
 	};
 
-	rc = iommufd_device_change_pt(idev, pt_id, &data);
+	if (idev->igroup)
+		rc = iommufd_device_change_pt(idev, pt_id, &data);
+	else
+		/* SIOV device follows generic pasid attach flow */
+		rc = iommufd_device_pasid_attach(idev, idev->default_pasid, pt_id);
 	if (rc)
 		return rc;
 
@@ -876,7 +880,12 @@ int iommufd_device_replace(struct iommufd_device *idev, u32 *pt_id)
 		.pasid = IOMMU_PASID_INVALID
 	};
 
-	return iommufd_device_change_pt(idev, pt_id, &data);
+	if (idev->igroup) {
+		return iommufd_device_change_pt(idev, pt_id, &data);
+	} else {
+		/* SIOV device follows generic pasid replace flow */
+		return iommufd_device_pasid_replace(idev, idev->default_pasid, pt_id);
+	}
 }
 EXPORT_SYMBOL_NS_GPL(iommufd_device_replace, IOMMUFD);
 
@@ -891,8 +900,12 @@ void iommufd_device_detach(struct iommufd_device *idev)
 {
 	struct iommufd_hw_pagetable *hwpt;
 
-	hwpt = iommufd_hw_pagetable_detach(idev);
-	iommufd_hw_pagetable_put(idev->ictx, hwpt);
+	if (idev->igroup) {
+		hwpt = iommufd_hw_pagetable_detach(idev);
+		iommufd_hw_pagetable_put(idev->ictx, hwpt);
+	} else {
+		iommufd_device_pasid_detach(idev, idev->default_pasid);
+	}
 	refcount_dec(&idev->obj.users);
 }
 EXPORT_SYMBOL_NS_GPL(iommufd_device_detach, IOMMUFD);

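The path selection that this patch adds to attach, replace and detach can be reduced to one decision. The sketch below is a userspace model with hypothetical names (attach_model_pick(), ATTACH_MODEL_PASID_INVALID); the sentinel only plays the role of IOMMU_PASID_INVALID and its value is an assumption of this example, not the kernel's definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical sentinel playing the role of IOMMU_PASID_INVALID. */
#define ATTACH_MODEL_PASID_INVALID UINT32_MAX

enum attach_model_path {
	ATTACH_MODEL_RID,   /* physical device: iommufd_device_change_pt() */
	ATTACH_MODEL_PASID, /* SIOV virtual device: generic pasid flow */
};

struct attach_model_choice {
	enum attach_model_path path;
	uint32_t pasid; /* pasid handed to the chosen path */
};

/* Devices with an igroup use the RID path, the rest use default_pasid. */
static struct attach_model_choice attach_model_pick(bool has_igroup,
						    uint32_t default_pasid)
{
	struct attach_model_choice c;

	if (has_igroup) {
		c.path = ATTACH_MODEL_RID;
		c.pasid = ATTACH_MODEL_PASID_INVALID;
	} else {
		c.path = ATTACH_MODEL_PASID;
		c.pasid = default_pasid;
	}
	return c;
}
```

Because the same test appears in all three entry points, a caller never has to know which kind of device it holds; it keeps calling iommufd_device_attach(), iommufd_device_replace() and iommufd_device_detach().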
> From: Liu, Yi L yi.l.liu@intel.com
> Sent: Monday, October 9, 2023 4:51 PM
>
> From: Kevin Tian kevin.tian@intel.com
>
> SIOV devices allows driver to tag different PASIDs for the virtual devices within it. Such driver should call iommufd_device_bind_pasid() to connect the pasid of the device to iommufd, and then driver is able to attach the virtual device to IOAS/HWPT with the iommufd_device_attach() API.
>
> Unlike physical devices, for SIOV virtual devices, iommufd_device_attach() eventually uses the idev->default_pasid when the virtual device is attached

s/default_pasid/rid_pasid/? or just call it idev->pasid. 'default' adds slight confusion instead...

> to an IOAS/HWPT. Also, there is no need to do immediate_attach per iommu domain allocation in the attach/replace path if any iommu domain allocation happens since the attach/replace is eventually pasid attach/replace.

immediate_attach rationale belongs to earlier pasid attach series when iommufd_device_pasid_attach() was introduced. Not here.

On 2023/10/10 16:24, Tian, Kevin wrote:
> > Unlike physical devices, for SIOV virtual devices, iommufd_device_attach() eventually uses the idev->default_pasid when the virtual device is attached
>
> s/default_pasid/rid_pasid/? or just call it idev->pasid. 'default' adds slight confusion instead...

then let's use rid_pasid as it is used as vRID for virtual device.

> > to an IOAS/HWPT. Also, there is no need to do immediate_attach per iommu domain allocation in the attach/replace path if any iommu domain allocation happens since the attach/replace is eventually pasid attach/replace.
>
> immediate_attach rationale belongs to earlier pasid attach series when iommufd_device_pasid_attach() was introduced. Not here.

sure, will drop it.

This extends IOMMU_TEST_OP_MOCK_DOMAIN to accept a pasid from the caller, so that it can cover iommufd_device_bind_pasid() for SIOV virtual devices. pasid #0 is selected to mark physical devices; non-zero pasid values are treated as a SIOV virtual device bind. SIOV test cases will be added later.
Signed-off-by: Yi Liu yi.l.liu@intel.com
---
 drivers/iommu/iommufd/iommufd_test.h          |  2 ++
 drivers/iommu/iommufd/selftest.c              | 10 ++++++--
 tools/testing/selftests/iommu/iommufd.c       | 24 +++++++++----------
 .../selftests/iommu/iommufd_fail_nth.c        | 16 ++++++-------
 tools/testing/selftests/iommu/iommufd_utils.h | 21 +++++++++-------
 5 files changed, 42 insertions(+), 31 deletions(-)

diff --git a/drivers/iommu/iommufd/iommufd_test.h b/drivers/iommu/iommufd/iommufd_test.h
index cf10f250b0d2..64217f33f91a 100644
--- a/drivers/iommu/iommufd/iommufd_test.h
+++ b/drivers/iommu/iommufd/iommufd_test.h
@@ -62,6 +62,8 @@ struct iommu_test_cmd {
 			__aligned_u64 length;
 		} add_reserved;
 		struct {
+			/* #0 is invalid, any non-zero is meaningful */
+			__u32 default_pasid;
 			__u32 out_stdev_id;
 			__u32 out_hwpt_id;
 			/* out_idev_id is the standard iommufd_bind object */
diff --git a/drivers/iommu/iommufd/selftest.c b/drivers/iommu/iommufd/selftest.c
index 5fb025ab8677..60c6d76c82b4 100644
--- a/drivers/iommu/iommufd/selftest.c
+++ b/drivers/iommu/iommufd/selftest.c
@@ -638,8 +638,14 @@ static int iommufd_test_mock_domain(struct iommufd_ucmd *ucmd,
 		goto out_sobj;
 	}
 
-	idev = iommufd_device_bind(ucmd->ictx, &sobj->idev.mock_dev->dev,
-				   &idev_id);
+	if (!cmd->mock_domain.default_pasid)
+		idev = iommufd_device_bind(ucmd->ictx, &sobj->idev.mock_dev->dev,
+					   &idev_id);
+	else
+		idev = iommufd_device_bind_pasid(ucmd->ictx,
+						 &sobj->idev.mock_dev->dev,
+						 cmd->mock_domain.default_pasid,
+						 &idev_id);
 	if (IS_ERR(idev)) {
 		rc = PTR_ERR(idev);
 		goto out_mdev;
diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c
index be2a95163d10..9a1fbba89e96 100644
--- a/tools/testing/selftests/iommu/iommufd.c
+++ b/tools/testing/selftests/iommu/iommufd.c
@@ -219,7 +219,7 @@ FIXTURE_SETUP(iommufd_ioas)
 	}
 
 	for (i = 0; i != variant->mock_domains; i++) {
-		test_cmd_mock_domain(self->ioas_id, &self->stdev_id,
+		test_cmd_mock_domain(self->ioas_id, 0, &self->stdev_id,
 				     &self->hwpt_id, &self->device_id);
 		self->base_iova = MOCK_APERTURE_START;
 	}
@@ -450,9 +450,9 @@ TEST_F(iommufd_ioas, hwpt_attach)
 {
 	/* Create a device attached directly to a hwpt */
 	if (self->stdev_id) {
-		test_cmd_mock_domain(self->hwpt_id, NULL, NULL, NULL);
+		test_cmd_mock_domain(self->hwpt_id, 0, NULL, NULL, NULL);
 	} else {
-		test_err_mock_domain(ENOENT, self->hwpt_id, NULL, NULL);
+		test_err_mock_domain(ENOENT, self->hwpt_id, 0, NULL, NULL);
 	}
 }
 
@@ -902,7 +902,7 @@ TEST_F(iommufd_ioas, access_pin)
 			ASSERT_EQ(0, ioctl(self->fd,
 					   _IOMMU_TEST_CMD(IOMMU_TEST_OP_ACCESS_PAGES),
 					   &access_cmd));
-			test_cmd_mock_domain(self->ioas_id, &mock_stdev_id,
+			test_cmd_mock_domain(self->ioas_id, 0, &mock_stdev_id,
 					     &mock_hwpt_id, NULL);
 			check_map_cmd.id = mock_hwpt_id;
 			ASSERT_EQ(0, ioctl(self->fd,
@@ -1058,7 +1058,7 @@ TEST_F(iommufd_ioas, fork_gone)
 		 * If a domain already existed then everything was pinned within
 		 * the fork, so this copies from one domain to another.
 		 */
-		test_cmd_mock_domain(self->ioas_id, NULL, NULL, NULL);
+		test_cmd_mock_domain(self->ioas_id, 0, NULL, NULL, NULL);
 		check_access_rw(_metadata, self->fd, access_id,
 				MOCK_APERTURE_START, 0);
 
@@ -1067,7 +1067,7 @@ TEST_F(iommufd_ioas, fork_gone)
 		 * Otherwise we need to actually pin pages which can't happen
 		 * since the fork is gone.
 		 */
-		test_err_mock_domain(EFAULT, self->ioas_id, NULL, NULL);
+		test_err_mock_domain(EFAULT, self->ioas_id, 0, NULL, NULL);
 	}
 
 	test_cmd_destroy_access(access_id);
@@ -1107,7 +1107,7 @@ TEST_F(iommufd_ioas, fork_present)
 	ASSERT_EQ(8, read(efd, &tmp, sizeof(tmp)));
 
 	/* Read pages from the remote process */
-	test_cmd_mock_domain(self->ioas_id, NULL, NULL, NULL);
+	test_cmd_mock_domain(self->ioas_id, 0, NULL, NULL, NULL);
 	check_access_rw(_metadata, self->fd, access_id, MOCK_APERTURE_START, 0);
 
 	ASSERT_EQ(0, close(pipefds[1]));
@@ -1277,7 +1277,7 @@ FIXTURE_SETUP(iommufd_mock_domain)
 	ASSERT_GE(ARRAY_SIZE(self->hwpt_ids), variant->mock_domains);
 
 	for (i = 0; i != variant->mock_domains; i++)
-		test_cmd_mock_domain(self->ioas_id, &self->stdev_ids[i],
+		test_cmd_mock_domain(self->ioas_id, 0, &self->stdev_ids[i],
 				     &self->hwpt_ids[i], &self->idev_ids[i]);
 	self->hwpt_id = self->hwpt_ids[0];
 
@@ -1471,7 +1471,7 @@ TEST_F(iommufd_mock_domain, all_aligns_copy)
 
 			/* Add and destroy a domain while the area exists */
 			old_id = self->hwpt_ids[1];
-			test_cmd_mock_domain(self->ioas_id, &mock_stdev_id,
+			test_cmd_mock_domain(self->ioas_id, 0, &mock_stdev_id,
 					     &self->hwpt_ids[1], NULL);
 
 			check_mock_iova(buf + start, iova, length);
@@ -1609,7 +1609,7 @@ TEST_F(iommufd_mock_domain, alloc_hwpt)
 			test_cmd_mock_domain_replace(self->stdev_ids[i], self->ioas_id);
 		test_ioctl_destroy(hwpt_id[1]);
 
-		test_cmd_mock_domain(hwpt_id[0], &stddev_id, NULL, NULL);
+		test_cmd_mock_domain(hwpt_id[0], 0, &stddev_id, NULL, NULL);
 		test_ioctl_destroy(stddev_id);
 		test_ioctl_destroy(hwpt_id[0]);
 	}
@@ -1756,7 +1756,7 @@ FIXTURE_SETUP(vfio_compat_mock_domain)
 
 	/* Create what VFIO would consider a group */
 	test_ioctl_ioas_alloc(&self->ioas_id);
-	test_cmd_mock_domain(self->ioas_id, NULL, NULL, NULL);
+	test_cmd_mock_domain(self->ioas_id, 0, NULL, NULL, NULL);
 
 	/* Attach it to the vfio compat */
 	vfio_ioas_cmd.ioas_id = self->ioas_id;
@@ -2037,7 +2037,7 @@ FIXTURE_SETUP(iommufd_device_pasid)
 	ASSERT_NE(-1, self->fd);
 	test_ioctl_ioas_alloc(&self->ioas_id);
 
-	test_cmd_mock_domain(self->ioas_id, &self->stdev_id,
+	test_cmd_mock_domain(self->ioas_id, 0, &self->stdev_id,
 			     &self->hwpt_id, &self->device_id);
 }
 
diff --git a/tools/testing/selftests/iommu/iommufd_fail_nth.c b/tools/testing/selftests/iommu/iommufd_fail_nth.c
index f7f4b838c2d1..691903c63de0 100644
--- a/tools/testing/selftests/iommu/iommufd_fail_nth.c
+++ b/tools/testing/selftests/iommu/iommufd_fail_nth.c
@@ -321,7 +321,7 @@ TEST_FAIL_NTH(basic_fail_nth, map_domain)
 
 	fail_nth_enable();
 
-	if (_test_cmd_mock_domain(self->fd, ioas_id, &stdev_id, &hwpt_id, NULL))
+	if (_test_cmd_mock_domain(self->fd, ioas_id, 0, &stdev_id, &hwpt_id, NULL))
 		return -1;
 
 	if (_test_ioctl_ioas_map(self->fd, ioas_id, buffer, 262144, &iova,
@@ -332,7 +332,7 @@ TEST_FAIL_NTH(basic_fail_nth, map_domain)
 	if (_test_ioctl_destroy(self->fd, stdev_id))
 		return -1;
 
-	if (_test_cmd_mock_domain(self->fd, ioas_id, &stdev_id, &hwpt_id, NULL))
+	if (_test_cmd_mock_domain(self->fd, ioas_id, 0, &stdev_id, &hwpt_id, NULL))
 		return -1;
 	return 0;
 }
@@ -356,12 +356,12 @@ TEST_FAIL_NTH(basic_fail_nth, map_two_domains)
 	if (_test_ioctl_set_temp_memory_limit(self->fd, 32))
 		return -1;
 
-	if (_test_cmd_mock_domain(self->fd, ioas_id, &stdev_id, &hwpt_id, NULL))
+	if (_test_cmd_mock_domain(self->fd, ioas_id, 0, &stdev_id, &hwpt_id, NULL))
 		return -1;
 
 	fail_nth_enable();
 
-	if (_test_cmd_mock_domain(self->fd, ioas_id, &stdev_id2, &hwpt_id2,
+	if (_test_cmd_mock_domain(self->fd, ioas_id, 0, &stdev_id2, &hwpt_id2,
 				  NULL))
 		return -1;
 
@@ -376,9 +376,9 @@ TEST_FAIL_NTH(basic_fail_nth, map_two_domains)
 	if (_test_ioctl_destroy(self->fd, stdev_id2))
 		return -1;
 
-	if (_test_cmd_mock_domain(self->fd, ioas_id, &stdev_id, &hwpt_id, NULL))
+	if (_test_cmd_mock_domain(self->fd, ioas_id, 0, &stdev_id, &hwpt_id, NULL))
 		return -1;
-	if (_test_cmd_mock_domain(self->fd, ioas_id, &stdev_id2, &hwpt_id2,
+	if (_test_cmd_mock_domain(self->fd, ioas_id, 0, &stdev_id2, &hwpt_id2,
 				  NULL))
 		return -1;
 	return 0;
@@ -536,7 +536,7 @@ TEST_FAIL_NTH(basic_fail_nth, access_pin_domain)
 	if (_test_ioctl_set_temp_memory_limit(self->fd, 32))
 		return -1;
 
-	if (_test_cmd_mock_domain(self->fd, ioas_id, &stdev_id, &hwpt_id, NULL))
+	if (_test_cmd_mock_domain(self->fd, ioas_id, 0, &stdev_id, &hwpt_id, NULL))
 		return -1;
 
 	if (_test_ioctl_ioas_map(self->fd, ioas_id, buffer, BUFFER_SIZE, &iova,
@@ -613,7 +613,7 @@ TEST_FAIL_NTH(basic_fail_nth, device)
 
 	fail_nth_enable();
 
-	if (_test_cmd_mock_domain(self->fd, ioas_id, &self->stdev_id, NULL,
+	if (_test_cmd_mock_domain(self->fd, ioas_id, 0, &self->stdev_id, NULL,
 				  &idev_id))
 		return -1;
 
diff --git a/tools/testing/selftests/iommu/iommufd_utils.h b/tools/testing/selftests/iommu/iommufd_utils.h
index 8339925562f3..bc9080fc9c2f 100644
--- a/tools/testing/selftests/iommu/iommufd_utils.h
+++ b/tools/testing/selftests/iommu/iommufd_utils.h
@@ -44,14 +44,16 @@ static unsigned long PAGE_SIZE;
 				  &test_cmd)); \
 	})
 
-static int _test_cmd_mock_domain(int fd, unsigned int ioas_id, __u32 *stdev_id,
-				 __u32 *hwpt_id, __u32 *idev_id)
+static int _test_cmd_mock_domain(int fd, unsigned int ioas_id,
+				 unsigned int default_pasid,
+				 __u32 *stdev_id, __u32 *hwpt_id,
+				 __u32 *idev_id)
 {
 	struct iommu_test_cmd cmd = {
 		.size = sizeof(cmd),
 		.op = IOMMU_TEST_OP_MOCK_DOMAIN,
 		.id = ioas_id,
-		.mock_domain = {},
+		.mock_domain = { .default_pasid = default_pasid, },
 	};
 	int ret;
 
@@ -67,12 +69,13 @@ static int _test_cmd_mock_domain(int fd, unsigned int ioas_id, __u32 *stdev_id,
 		*idev_id = cmd.mock_domain.out_idev_id;
 	return 0;
 }
-#define test_cmd_mock_domain(ioas_id, stdev_id, hwpt_id, idev_id) \
-	ASSERT_EQ(0, _test_cmd_mock_domain(self->fd, ioas_id, stdev_id, \
-					   hwpt_id, idev_id))
-#define test_err_mock_domain(_errno, ioas_id, stdev_id, hwpt_id) \
-	EXPECT_ERRNO(_errno, _test_cmd_mock_domain(self->fd, ioas_id, \
-						   stdev_id, hwpt_id, NULL))
+#define test_cmd_mock_domain(ioas_id, pasid, stdev_id, hwpt_id, idev_id) \
+	ASSERT_EQ(0, _test_cmd_mock_domain(self->fd, ioas_id, pasid, \
+					   stdev_id, hwpt_id, idev_id))
+#define test_err_mock_domain(_errno, ioas_id, pasid, stdev_id, hwpt_id) \
+	EXPECT_ERRNO(_errno, _test_cmd_mock_domain(self->fd, ioas_id, \
+						   pasid, stdev_id, \
+						   hwpt_id, NULL))
 
 static int _test_cmd_mock_domain_replace(int fd, __u32 stdev_id, __u32 pt_id,
 					 __u32 *hwpt_id)

From: Liu, Yi L yi.l.liu@intel.com Sent: Monday, October 9, 2023 4:51 PM
This extends IOMMU_TEST_OP_MOCK_DOMAIN to accept a pasid from the caller, so that it can cover iommufd_device_bind_pasid() for SIOV virtual devices. pasid #0 is selected to mark physical devices; a non-zero pasid value is treated as a SIOV virtual device bind. SIOV test cases will be added later.
...
@@ -62,6 +62,8 @@ struct iommu_test_cmd {
 			__aligned_u64 length;
 		} add_reserved;
 		struct {
+			/* #0 is invalid, any non-zero is meaningful */
+			__u32 default_pasid;
#0 represents the physical device instead of being invalid.
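The convention under discussion (pasid 0 selects the physical device, any non-zero value selects a SIOV virtual device) can be sketched as a small standalone helper. This is only an illustration: `mock_bind_kind` and `mock_classify_default_pasid()` are hypothetical names and are not part of the series.

```c
#include <stdint.h>

/* Hypothetical illustration; these names do not exist in the series. */
enum mock_bind_kind {
	MOCK_BIND_PHYSICAL,	/* default_pasid == 0: bind the physical device */
	MOCK_BIND_SIOV_VDEV,	/* default_pasid != 0: bind a SIOV virtual device */
};

/* Classify a mock-domain request by the default_pasid the caller passed. */
static enum mock_bind_kind mock_classify_default_pasid(uint32_t default_pasid)
{
	return default_pasid ? MOCK_BIND_SIOV_VDEV : MOCK_BIND_PHYSICAL;
}
```

Under this reading, the existing callers that pass 0 keep their physical-device behaviour, and only a caller passing something like 600 would exercise the SIOV path.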
This adds test coverage for SIOV virtual devices by passing a non-zero pasid to the IOMMU_TEST_OP_MOCK_DOMAIN op and checking that the SIOV virtual device (a.k.a. a pasid of this device) is attached to the mock domain. It then tries to replace the hwpt with a new hwpt and with other types of hwpts, and checks that the domain attached to this virtual device is correct.
Signed-off-by: Yi Liu yi.l.liu@intel.com
---
 tools/testing/selftests/iommu/iommufd.c       | 53 ++++++++++++++++++-
 .../selftests/iommu/iommufd_fail_nth.c        | 26 +++++++++
 2 files changed, 77 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/iommu/iommufd.c b/tools/testing/selftests/iommu/iommufd.c
index 9a1fbba89e96..945ab07a8b84 100644
--- a/tools/testing/selftests/iommu/iommufd.c
+++ b/tools/testing/selftests/iommu/iommufd.c
@@ -2031,14 +2031,20 @@ FIXTURE(iommufd_device_pasid)
 	uint32_t device_id;
 };

+FIXTURE_VARIANT(iommufd_device_pasid)
+{
+	uint32_t pasid;
+};
+
 FIXTURE_SETUP(iommufd_device_pasid)
 {
 	self->fd = open("/dev/iommu", O_RDWR);
 	ASSERT_NE(-1, self->fd);
 	test_ioctl_ioas_alloc(&self->ioas_id);

-	test_cmd_mock_domain(self->ioas_id, 0, &self->stdev_id,
-			     &self->hwpt_id, &self->device_id);
+	test_cmd_mock_domain(self->ioas_id, variant->pasid,
+			     &self->stdev_id, &self->hwpt_id,
+			     &self->device_id);
 }

 FIXTURE_TEARDOWN(iommufd_device_pasid)
@@ -2046,6 +2052,12 @@ FIXTURE_TEARDOWN(iommufd_device_pasid)
 	teardown_iommufd(self->fd, _metadata);
 }

+/* For SIOV test */
+FIXTURE_VARIANT_ADD(iommufd_device_pasid, siov_pasid_600)
+{
+	.pasid = 600, //this is the default pasid for the SIOV virtual device
+};
+
 TEST_F(iommufd_device_pasid, pasid_attach)
 {
 	if (self->device_id) {
@@ -2071,6 +2083,43 @@ TEST_F(iommufd_device_pasid, pasid_attach)
 				    IOMMU_HWPT_ALLOC_DATA_SELFTEST,
 				    &data, sizeof(data));

+		if (variant->pasid) {
+			uint32_t new_hwpt_id = 0;
+
+			ASSERT_EQ(0,
+				  test_cmd_pasid_check_domain(self->fd,
+							      self->stdev_id,
+							      variant->pasid,
+							      self->hwpt_id,
+							      &result));
+			EXPECT_EQ(1, result);
+			test_cmd_hwpt_alloc(self->device_id, self->ioas_id,
+					    0, &new_hwpt_id);
+			test_cmd_mock_domain_replace(self->stdev_id,
+						     new_hwpt_id);
+			ASSERT_EQ(0,
+				  test_cmd_pasid_check_domain(self->fd,
+							      self->stdev_id,
+							      variant->pasid,
+							      new_hwpt_id,
+							      &result));
+			EXPECT_EQ(1, result);
+
+			/*
+			 * Detach hwpt from variant->pasid, and check if the
+			 * variant->pasid has null domain
+			 */
+			test_cmd_pasid_detach(variant->pasid);
+			ASSERT_EQ(0,
+				  test_cmd_pasid_check_domain(self->fd,
+							      self->stdev_id,
+							      variant->pasid,
+							      0, &result));
+			EXPECT_EQ(1, result);
+
+			test_ioctl_destroy(new_hwpt_id);
+		}
+
 		/*
 		 * Attach ioas to pasid 100, should succeed, domain should
 		 * be valid.
diff --git a/tools/testing/selftests/iommu/iommufd_fail_nth.c b/tools/testing/selftests/iommu/iommufd_fail_nth.c
index 691903c63de0..a5fb45d99869 100644
--- a/tools/testing/selftests/iommu/iommufd_fail_nth.c
+++ b/tools/testing/selftests/iommu/iommufd_fail_nth.c
@@ -644,6 +644,32 @@ TEST_FAIL_NTH(basic_fail_nth, device)

 	self->pasid = 0;

+	if (_test_ioctl_destroy(self->fd, self->stdev_id))
+		return -1;
+
+	self->pasid = 300;
+	self->stdev_id = 0;
+
+	/* Test for SIOV virtual devices attach */
+	if (_test_cmd_mock_domain(self->fd, ioas_id, self->pasid,
+				  &self->stdev_id, NULL, &idev_id))
+		return -1;
+
+	/* Test for SIOV virtual device replace */
+	if (_test_cmd_mock_domain_replace(self->fd, self->stdev_id,
+					  hwpt_id, NULL))
+		return -1;
+
+	if (_test_cmd_pasid_detach(self->fd, self->stdev_id, self->pasid))
+		return -1;
+
+	self->pasid = 0;
+
+	if (_test_ioctl_destroy(self->fd, self->stdev_id))
+		return -1;
+
+	self->stdev_id = 0;
+
 	return 0;
 }
From: Liu, Yi L yi.l.liu@intel.com Sent: Monday, October 9, 2023 4:51 PM
@@ -2071,6 +2083,43 @@ TEST_F(iommufd_device_pasid, pasid_attach)
IOMMU_HWPT_ALLOC_DATA_SELFTEST, &data, sizeof(data));
if (variant->pasid) {
uint32_t new_hwpt_id = 0;
ASSERT_EQ(0,
test_cmd_pasid_check_domain(self->fd,
self->stdev_id,
variant->pasid,
self->hwpt_id,
&result));
EXPECT_EQ(1, result);
test_cmd_hwpt_alloc(self->device_id, self->ioas_id,
0, &new_hwpt_id);
test_cmd_mock_domain_replace(self->stdev_id,
new_hwpt_id);
ASSERT_EQ(0,
test_cmd_pasid_check_domain(self->fd,
self->stdev_id,
variant->pasid,
new_hwpt_id,
&result));
EXPECT_EQ(1, result);
/*
* Detach hwpt from variant->pasid, and check if the
* variant->pasid has null domain
*/
test_cmd_pasid_detach(variant->pasid);
ASSERT_EQ(0,
test_cmd_pasid_check_domain(self->fd,
self->stdev_id,
variant->pasid,
0, &result));
EXPECT_EQ(1, result);
test_ioctl_destroy(new_hwpt_id);
}
I wonder whether the above would better reuse the device attach/replace cases, given that default_pasid is hidden inside iommufd_device. This pasid_attach case is more for testing user pasids on an iommufd_device, which isn't yet supported by SIOV devices?
On 2023/10/10 16:30, Tian, Kevin wrote:
I wonder whether the above would better reuse the device attach/replace cases, given that default_pasid is hidden inside iommufd_device. This pasid_attach case is more for testing user pasids on an iommufd_device, which isn't yet supported by SIOV devices?
Perhaps the way the above code checks the attached domain misled you. Actually, this is still testing the SIOV default_pasid. In the variant setup, the default_pasid is passed to the testing driver when creating the stdev. That's why the replace test does not require a pasid.
Maybe I can add a new selftest op that checks the attached domain for a given stdev, instead of reusing test_cmd_pasid_check_domain().
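To make the idea concrete, here is a rough userspace model of what checking the attached domain "for a given stdev" could mean: the caller passes only the stdev, and the check resolves the default pasid internally. Everything below (the `model_*` types and functions) is hypothetical and is not taken from the selftest code or from this series.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_MAX_PASIDS 4

/* Hypothetical userspace model of a mock device; not the selftest's types. */
struct model_pasid_attach {
	uint32_t pasid;
	uint32_t hwpt_id;	/* 0 means nothing attached */
};

struct model_stdev {
	uint32_t default_pasid;	/* 0: physical device, non-zero: SIOV vdev */
	uint32_t rid_hwpt_id;	/* hwpt attached for the physical device */
	struct model_pasid_attach pasids[MODEL_MAX_PASIDS];
	size_t nr_pasids;
};

/* Look up the hwpt attached to @pasid; 0 if the pasid has no attachment. */
static uint32_t model_pasid_hwpt(const struct model_stdev *sdev, uint32_t pasid)
{
	for (size_t i = 0; i < sdev->nr_pasids; i++)
		if (sdev->pasids[i].pasid == pasid)
			return sdev->pasids[i].hwpt_id;
	return 0;
}

/*
 * Check the domain attached to the stdev itself. The caller does not pass
 * a pasid: a SIOV virtual device is resolved through its default pasid.
 */
static bool model_stdev_check_domain(const struct model_stdev *sdev,
				     uint32_t expected_hwpt_id)
{
	uint32_t attached = sdev->default_pasid ?
		model_pasid_hwpt(sdev, sdev->default_pasid) :
		sdev->rid_hwpt_id;

	return attached == expected_hwpt_id;
}
```

With a helper shaped like this, the test body would no longer mention `variant->pasid` when verifying attach/replace, which may avoid the impression that a user pasid is being tested.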
From: Kevin Tian kevin.tian@intel.com
This adds vfio_register_pasid_iommu_dev() for device driver to register virtual devices which are isolated per PASID in physical IOMMU. The major usage is for the SIOV devices which allows device driver to tag the DMAs out of virtual devices within it with different PASIDs.
For a given vfio device, VFIO core creates both group user interface and device user interface (device cdev) if configured. However, for the virtual devices backed by PASID of the device, VFIO core shall only create device user interface as there is no plan to support such devices in the legacy vfio_iommu drivers which is a must if creating group user interface for such virtual devices. This introduces a VFIO_PASID_IOMMU group type for the device driver to register PASID virtual devices, and provides a wrapper API for it. In particular no iommu group (neither fake group or real group) exists per PASID, hence no group interface for this type.
Signed-off-by: Kevin Tian kevin.tian@intel.com
Signed-off-by: Yi Liu yi.l.liu@intel.com
---
 drivers/vfio/group.c     | 18 ++++++++++++++++++
 drivers/vfio/vfio.h      |  8 ++++++++
 drivers/vfio/vfio_main.c | 10 ++++++++++
 include/linux/vfio.h     |  1 +
 4 files changed, 37 insertions(+)

diff --git a/drivers/vfio/group.c b/drivers/vfio/group.c
index 610a429c6191..20771d0feb37 100644
--- a/drivers/vfio/group.c
+++ b/drivers/vfio/group.c
@@ -407,6 +407,9 @@ int vfio_device_block_group(struct vfio_device *device)
 	struct vfio_group *group = device->group;
 	int ret = 0;

+	if (!group)
+		return 0;
+
 	mutex_lock(&group->group_lock);
 	if (group->opened_file) {
 		ret = -EBUSY;
@@ -424,6 +427,8 @@ void vfio_device_unblock_group(struct vfio_device *device)
 {
 	struct vfio_group *group = device->group;

+	if (!group)
+		return;
 	mutex_lock(&group->group_lock);
 	group->cdev_device_open_cnt--;
 	mutex_unlock(&group->group_lock);
@@ -704,6 +709,10 @@ int vfio_device_set_group(struct vfio_device *device,
 {
 	struct vfio_group *group;

+	/* No group associate with a device with pasid */
+	if (type == VFIO_PASID_IOMMU)
+		return 0;
+
 	if (type == VFIO_IOMMU)
 		group = vfio_group_find_or_alloc(device->dev);
 	else
@@ -722,6 +731,9 @@ void vfio_device_remove_group(struct vfio_device *device)
 	struct vfio_group *group = device->group;
 	struct iommu_group *iommu_group;

+	if (!group)
+		return;
+
 	if (group->type == VFIO_NO_IOMMU || group->type == VFIO_EMULATED_IOMMU)
 		iommu_group_remove_device(device->dev);

@@ -766,6 +778,9 @@ void vfio_device_remove_group(struct vfio_device *device)

 void vfio_device_group_register(struct vfio_device *device)
 {
+	if (!device->group)
+		return;
+
 	mutex_lock(&device->group->device_lock);
 	list_add(&device->group_next, &device->group->device_list);
 	mutex_unlock(&device->group->device_lock);
@@ -773,6 +788,9 @@ void vfio_device_group_register(struct vfio_device *device)

 void vfio_device_group_unregister(struct vfio_device *device)
 {
+	if (!device->group)
+		return;
+
 	mutex_lock(&device->group->device_lock);
 	list_del(&device->group_next);
 	mutex_unlock(&device->group->device_lock);
diff --git a/drivers/vfio/vfio.h b/drivers/vfio/vfio.h
index d228cdb6b345..1ccc9aba6dc7 100644
--- a/drivers/vfio/vfio.h
+++ b/drivers/vfio/vfio.h
@@ -48,6 +48,14 @@ enum vfio_group_type {
 	 */
 	VFIO_IOMMU,

+	/*
+	 * Virtual device with IOMMU backing. The user of these devices can
+	 * trigger DMAs which are all tagged with a pasid. Pasid itself is
+	 * a device resource so there is no group associated. The VFIO core
+	 * doesn't create a vfio_group for such devices.
+	 */
+	VFIO_PASID_IOMMU,
+
 	/*
 	 * Virtual device without IOMMU backing. The VFIO core fakes up an
 	 * iommu_group as the iommu_group sysfs interface is part of the
diff --git a/drivers/vfio/vfio_main.c b/drivers/vfio/vfio_main.c
index 850bbaebdd29..362de0ad36ce 100644
--- a/drivers/vfio/vfio_main.c
+++ b/drivers/vfio/vfio_main.c
@@ -334,6 +334,16 @@ int vfio_register_emulated_iommu_dev(struct vfio_device *device)
 }
 EXPORT_SYMBOL_GPL(vfio_register_emulated_iommu_dev);

+/*
+ * Register a virtual device with IOMMU pasid protection. The user of
+ * this device can trigger DMA as long as all of its outgoing DMAs are
+ * always tagged with a pasid.
+ */
+int vfio_register_pasid_iommu_dev(struct vfio_device *device)
+{
+	return __vfio_register_dev(device, VFIO_PASID_IOMMU);
+}
+
 /*
  * Decrement the device reference count and wait for the device to be
  * removed. Open file descriptors for the device... */
diff --git a/include/linux/vfio.h b/include/linux/vfio.h
index 7b06d1bc7cb3..2662f2ece924 100644
--- a/include/linux/vfio.h
+++ b/include/linux/vfio.h
@@ -281,6 +281,7 @@ static inline void vfio_put_device(struct vfio_device *device)

 int vfio_register_group_dev(struct vfio_device *device);
 int vfio_register_emulated_iommu_dev(struct vfio_device *device);
+int vfio_register_pasid_iommu_dev(struct vfio_device *device);
 void vfio_unregister_group_dev(struct vfio_device *device);

 int vfio_assign_device_set(struct vfio_device *device, void *set_id);
From: Liu, Yi L yi.l.liu@intel.com Sent: Monday, October 9, 2023 4:51 PM
From: Kevin Tian kevin.tian@intel.com
This adds vfio_register_pasid_iommu_dev() for device driver to register virtual devices which are isolated per PASID in physical IOMMU. The major usage is for the SIOV devices which allows device driver to tag the DMAs out of virtual devices within it with different PASIDs.
For a given vfio device, VFIO core creates both group user interface and device user interface (device cdev) if configured. However, for the virtual devices backed by PASID of the device, VFIO core shall only create device user interface as there is no plan to support such devices in the legacy vfio_iommu drivers which is a must if creating group user interface for such virtual devices. This introduces a VFIO_PASID_IOMMU group type for the device driver to register PASID virtual devices, and provides a wrapper API for it. In particular no iommu group (neither fake group or real group) exists per PASID, hence no group interface for this type.
this commit msg needs some revision. The key is that there is no group per pasid *in concept* so it doesn't make sense to fake a group...
On 2023/10/10 16:33, Tian, Kevin wrote:
this commit msg needs some revision. The key is that there is no group per pasid *in concept* so it doesn't make sense to fake a group...
how about below?
This adds vfio_register_pasid_iommu_dev() for device driver to register virtual devices which are isolated per PASID in physical IOMMU. The major usage is for the SIOV devices which allows the device driver to tag the DMAs out of virtual devices within it with different PASIDs.
For the PASID virtual devices, there is no iommu group in concept. Hence the VFIO core only creates device user interface for such devices. This introduces a VFIO_PASID_IOMMU group type to differentiate from the other devices that have iommu group in concept.
On 10/9/2023 4:51 PM, Yi Liu wrote:
From: Kevin Tian kevin.tian@intel.com
This adds vfio_register_pasid_iommu_dev() for device driver to register virtual devices which are isolated per PASID in physical IOMMU. The major usage is for the SIOV devices which allows device driver to tag the DMAs out of virtual devices within it with different PASIDs.
For a given vfio device, VFIO core creates both group user interface and device user interface (device cdev) if configured. However, for the virtual devices backed by PASID of the device, VFIO core shall only create device user interface as there is no plan to support such devices in the legacy vfio_iommu drivers which is a must if creating group user interface for such virtual devices. This introduces a VFIO_PASID_IOMMU group type for the device driver to register PASID virtual devices, and provides a wrapper API for it. In particular no iommu group (neither fake group or real group) exists per PASID, hence no group interface for this type.
Signed-off-by: Kevin Tian kevin.tian@intel.com Signed-off-by: Yi Liu yi.l.liu@intel.com
+/*
+ * Register a virtual device with IOMMU pasid protection. The user of
+ * this device can trigger DMA as long as all of its outgoing DMAs are
+ * always tagged with a pasid.
+ */
+int vfio_register_pasid_iommu_dev(struct vfio_device *device)
+{
+	return __vfio_register_dev(device, VFIO_PASID_IOMMU);
+}
If the CONFIG_VFIO_GROUP kconfig is selected, vdev->group is accessed as shown below:

->__vfio_register_dev()
	->vfio_device_add()
		->vfio_device_is_noiommu() {
			return IS_ENABLED(CONFIG_VFIO_NOIOMMU) &&
			       vdev->group->type == VFIO_NO_IOMMU
		}

For SIOV virtual devices, the vfio group is not created and the vfio cdev is used. Thus vdev->group is NULL and this is a NULL pointer dereference.
Thanks. Yahui.
On 2023/11/16 13:35, Cao, Yahui wrote:
If the CONFIG_VFIO_GROUP kconfig is selected, vdev->group is accessed as shown below:

->__vfio_register_dev()
	->vfio_device_add()
		->vfio_device_is_noiommu() {
			return IS_ENABLED(CONFIG_VFIO_NOIOMMU) &&
			       vdev->group->type == VFIO_NO_IOMMU
		}

For SIOV virtual devices, the vfio group is not created and the vfio cdev is used. Thus vdev->group is NULL and this is a NULL pointer dereference.
yes. needs to be like below:
return IS_ENABLED(CONFIG_VFIO_NOIOMMU) && vdev->group && vdev->group->type == VFIO_NO_IOMMU;
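The effect of the extra check can be reproduced in a standalone userspace model. The types below are simplified stand-ins for the kernel structures (only the fields the check touches), and `MODEL_VFIO_NOIOMMU_ENABLED` is an assumed constant standing in for IS_ENABLED(CONFIG_VFIO_NOIOMMU); it is a sketch, not the kernel code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; only what the check needs. */
enum vfio_group_type {
	VFIO_IOMMU,
	VFIO_PASID_IOMMU,
	VFIO_EMULATED_IOMMU,
	VFIO_NO_IOMMU,
};

struct vfio_group {
	enum vfio_group_type type;
};

struct vfio_device {
	struct vfio_group *group;	/* NULL for PASID virtual devices */
};

/* Stand-in for IS_ENABLED(CONFIG_VFIO_NOIOMMU); assumed enabled here. */
#define MODEL_VFIO_NOIOMMU_ENABLED 1

/* The group pointer is tested before it is dereferenced. */
static bool vfio_device_is_noiommu(const struct vfio_device *vdev)
{
	return MODEL_VFIO_NOIOMMU_ENABLED && vdev->group &&
	       vdev->group->type == VFIO_NO_IOMMU;
}
```

Because `&&` short-circuits, a device registered without a group simply reports "not noiommu" instead of faulting.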
On Mon, Oct 09, 2023 at 01:51:16AM -0700, Yi Liu wrote:
Intel SIOV allows creating virtual devices of which the vRID is represented by a pasid of a physical device. It is called as SIOV virtual device in this series. Such devices can be bound to an iommufd as physical device does and then later be attached to an IOAS/hwpt using that pasid. Such PASIDs are called as default pasid.
I would want to see the idxd implementation too..
Jason
From: Jason Gunthorpe jgg@nvidia.com Sent: Monday, October 9, 2023 9:21 PM
I would want to see the idxd implementation too..
It still needs some time (and unfortunately the guy working on idxd is currently on a long vacation).
Instead of waiting we want to seek early comments on the iommufd changes given that part is relatively self-contained. Same as what Reinette is doing for IMS.
Certainly this is not for merging w/o having a driver user. 😊
On 10/9/2023 9:21 PM, Jason Gunthorpe wrote:
I would want to see the idxd implementation too..
Jason
Hey Jason,
The ice (E810 NIC) driver implementation for SIOV virtual devices is also a work in progress. We are working closely with Kevin and Yi on the patchset.
We'll send out the ice patch for SIOV as a LAN driver user example once it is available. (There was a formatting issue in the last email, so this is a resend; sorry for any inconvenience.)
Thanks. Yahui.