From: Liu, Yi L <yi.l.liu@intel.com>
Sent: Wednesday, August 16, 2023 8:14 PM
Under nested IOMMU translation, userspace owns the stage-1 translation table (e.g. the stage-1 page table of Intel VT-d or the context table of ARM SMMUv3, etc.). Stage-1 translation tables are vendor specific and need to be compatible with the underlying IOMMU hardware. Hence, userspace should know the IOMMU hardware capability before creating the stage-1 translation table and configuring it in the kernel.
This adds IOMMU_GET_HW_INFO ioctl to query the IOMMU hardware information (a.k.a capability) for a given device. The returned data is vendor specific, userspace needs to decode it with the structure by the output @out_data_type field.
"The format of the returned data is vendor specific and must be decoded according to @out_data_type field".
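To make the decode-by-type wording concrete, here is a minimal userspace sketch. It does not use the real uAPI header: the enum values, the vendor structure, and demo_decode_hw_info() are hypothetical stand-ins chosen only to show a caller switching on the type tag before interpreting the buffer.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for the uAPI type tags; not the real header values. */
enum demo_hw_info_type {
	DEMO_HW_INFO_TYPE_NONE = 0,
	DEMO_HW_INFO_TYPE_VENDOR_A = 1,
};

/* Hypothetical vendor-specific layout, used only for this illustration. */
struct demo_hw_info_vendor_a {
	uint64_t cap;
	uint64_t ecap;
};

/*
 * Decode a returned buffer according to its type tag.
 * Returns 0 on success, -1 if the type is unknown or the buffer is too short.
 */
static int demo_decode_hw_info(uint32_t type, const void *buf, size_t len,
			       struct demo_hw_info_vendor_a *out)
{
	switch (type) {
	case DEMO_HW_INFO_TYPE_VENDOR_A:
		if (len < sizeof(*out))
			return -1;
		memcpy(out, buf, sizeof(*out));
		return 0;
	case DEMO_HW_INFO_TYPE_NONE:
	default:
		/* No driver data, or a type this caller does not understand. */
		return -1;
	}
}
```

A caller would pass the value it got back in @out_data_type as the type argument and refuse to touch the buffer when the function fails.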
+int iommufd_get_hw_info(struct iommufd_ucmd *ucmd)
+{
+	struct iommu_hw_info *cmd = ucmd->cmd;
+	void __user *user_ptr = u64_to_user_ptr(cmd->data_uptr);
+	const struct iommu_ops *ops;
+	struct iommufd_device *idev;
+	unsigned int data_len;
+	unsigned int copy_len;
+	void *data = NULL;
+	int rc;
+
+	if (cmd->flags || cmd->__reserved)
+		return -EOPNOTSUPP;
+
+	idev = iommufd_get_device(ucmd, cmd->dev_id);
+	if (IS_ERR(idev))
+		return PTR_ERR(idev);
+
+	ops = dev_iommu_ops(idev->dev);
+	if (ops->hw_info) {
+		data = ops->hw_info(idev->dev, &data_len, &cmd->out_data_type);
+		if (IS_ERR(data)) {
+			rc = PTR_ERR(data);
+			goto err_put;
+		}
+
+		/*
+		 * drivers that have hw_info callback should have a unique
+		 * iommu_hw_info_type.
+		 */
+		if (WARN_ON_ONCE(cmd->out_data_type ==
+				 IOMMU_HW_INFO_TYPE_NONE)) {
+			rc = -ENODEV;
+			goto out;
+		}
+	} else {
+		cmd->out_data_type = IOMMU_HW_INFO_TYPE_NONE;
+		data_len = 0;
+		data = NULL;

data is already initialized as NULL.

+	/*
+	 * We return the length the kernel supports so userspace may know what
+	 * the kernel capability is. It could be larger than the input buffer.
+	 */
+	cmd->data_len = data_len;
+	rc = iommufd_ucmd_respond(ucmd, sizeof(*cmd));
+out:

out_free:

+	kfree(data);
+err_put:

out_put: (since this is also used in the success path)

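As a sketch of why the out_free/out_put names fit, the userspace model below has the success path fall through the same labels as the error paths. Everything in it (demo_get_info(), the fail_at parameter, the two counters) is hypothetical and only mimics the shape of the function under review. It is not the kernel code.

```c
#include <stdlib.h>

/* Hypothetical counters standing in for a held reference and a live buffer. */
static int demo_refs_held;
static int demo_bufs_live;

/*
 * fail_at: 0 = success, 1 = fail before allocating, 2 = fail after allocating.
 * The success path falls through the same labels as the error paths.
 */
static int demo_get_info(int fail_at)
{
	void *data = NULL;
	int rc;

	demo_refs_held++;		/* take the reference */

	if (fail_at == 1) {
		rc = -1;
		goto out_put;		/* nothing allocated yet */
	}

	data = malloc(16);
	if (!data) {
		rc = -1;
		goto out_put;
	}
	demo_bufs_live++;

	if (fail_at == 2) {
		rc = -2;
		goto out_free;
	}

	rc = 0;				/* success also reaches out_free below */
out_free:
	free(data);
	demo_bufs_live--;
out_put:
	demo_refs_held--;		/* drop the reference */
	return rc;
}
```

Because both labels run on success as well as on failure, an "out_" prefix describes them more accurately than "err_".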
+ * To capture an iommu type specific hardware information data, @data_uptr and
+ * its length @data_len must be provided. Trailing bytes will be zeroed if the
+ * user buffer is larger than the data that kernel has. Otherwise, kernel only
+ * fills the buffer using the given length in @data_len. If the ioctl succeeds,
+ * @data_len will be updated to the length that kernel actually supports,
+ * @out_data_type will be filled to decode the data filled in the buffer
+ * pointed by @data_uptr. Input @data_len == zero is allowed, no information
+ * data will be filled to user, but user space could get the iommu_hw_info_type
+ * filled in @out_data_type and the iommu hardware information data length
+ * supported by kernel filled in @data_len.

I'd just keep "Input @data_len == zero is allowed" and remove the trailing words, which only duplicate the preceding text.
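The buffer rule described in the quoted comment can be modelled in plain userspace C as below. This is only a sketch of the stated semantics, not the kernel implementation: demo_fill_hw_info() and its parameters are hypothetical names, and memcpy()/memset() stand in for the user-copy helpers the kernel would use.

```c
#include <stddef.h>
#include <string.h>

/*
 * Userspace model of the fill rule: copy at most user_len bytes, zero any
 * trailing bytes of a larger user buffer, and return the kernel-side length
 * so the caller can learn how much data the kernel has.
 */
static size_t demo_fill_hw_info(void *user_buf, size_t user_len,
				const void *kdata, size_t kdata_len)
{
	size_t copy_len = user_len < kdata_len ? user_len : kdata_len;

	if (copy_len)
		memcpy(user_buf, kdata, copy_len);
	if (user_len > copy_len)
		memset((char *)user_buf + copy_len, 0, user_len - copy_len);

	return kdata_len;
}
```

With user_len == 0 nothing is written, yet the returned length still tells the caller how large a buffer to allocate for a second call.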
Reviewed-by: Kevin Tian <kevin.tian@intel.com>