On Tue, Dec 19, 2023 at 05:26:21PM +0800, Yi Liu wrote:
On 2023/12/17 19:21, Joel Granados wrote:
Hey Yi
I have been working with https://protect2.fireeye.com/v1/url?k=b58750ce-ea1c9eaa-b586db81-000babda020...
good to know about it.
and have some questions regarding one of the commits in that series. I however cannot find it in lore.kernel.org. Can you please direct me to where the rfc was posted? If it has not been posted yet, do you have an alternate place for discussion?
the qemu series has not been posted yet as kernel side is still changing. It still needs some time to be ready for public review. Zhenzhong Duan is going to post it when it's ready. If you have questions to discuss, you can post your questions to Zhenzhong and me first. I guess it may be fine to cc Alex Williamson, Eric Auger, Nicolin Chen, Cédric Le Goater, Kevin Tian, Jason Gunthorpe and qemu mail list as this is discussion something that is going to be posted in public.
Thx for getting back to me. I'll direct my questions to these recipients.
Best
Best
On Fri, Nov 17, 2023 at 05:07:11AM -0800, Yi Liu wrote:
Nested translation is a hardware feature that is supported by many modern IOMMU hardwares. It has two stages (stage-1, stage-2) address translation to get access to the physical address. stage-1 translation table is owned by userspace (e.g. by a guest OS), while stage-2 is owned by kernel. Changes to stage-1 translation table should be followed by an IOTLB invalidation.
Take Intel VT-d as an example, the stage-1 translation table is I/O page table. As the below diagram shows, guest I/O page table pointer in GPA (guest physical address) is passed to host and be used to perform the stage-1 address translation. Along with it, modifications to present mappings in the guest I/O page table should be followed with an IOTLB invalidation.
.-------------. .---------------------------. | vIOMMU | | Guest I/O page table | | | '---------------------------' .----------------/ | PASID Entry |--- PASID cache flush --+ '-------------' | | | V | | I/O page table pointer in GPA '-------------'
Guest ------| Shadow |---------------------------|-------- v v v Host .-------------. .------------------------. | pIOMMU | | FS for GIOVA->GPA | | | '------------------------' .----------------/ | | PASID Entry | V (Nested xlate) '----------------.----------------------------------. | | | SS for GPA->HPA, unmanaged domain| | | '----------------------------------' '-------------' Where:
- FS = First stage page tables
- SS = Second stage page tables
<Intel VT-d Nested translation>
This series adds the cache invalidation path for the userspace to invalidate cache after modifying the stage-1 page table. This is based on the first part of nesting [1]
Complete code can be found in [2], QEMU could can be found in [3].
At last, this is a team work together with Nicolin Chen, Lu Baolu. Thanks them for the help. ^_^. Look forward to your feedbacks.
[1] https://lore.kernel.org/linux-iommu/20231026044216.64964-1-yi.l.liu@intel.co... - merged [2] https://protect2.fireeye.com/v1/url?k=38b56f01-672ea165-38b4e44e-000babda020... [3] https://protect2.fireeye.com/v1/url?k=d6e01ed1-897bd0b5-d6e1959e-000babda020...
Change log:
v6:
- No much change, just rebase on top of 6.7-rc1 as part 1/2 is merged
v5: https://lore.kernel.org/linux-iommu/20231020092426.13907-1-yi.l.liu@intel.co...
- Split the iommufd nesting series into two parts of alloc_user and invalidation (Jason)
- Split IOMMUFD_OBJ_HW_PAGETABLE to IOMMUFD_OBJ_HWPT_PAGING/_NESTED, and do the same with the structures/alloc()/abort()/destroy(). Reworked the selftest accordingly too. (Jason)
- Move hwpt/data_type into struct iommu_user_data from standalone op arguments. (Jason)
- Rename hwpt_type to be data_type, the HWPT_TYPE to be HWPT_ALLOC_DATA, _TYPE_DEFAULT to be _ALLOC_DATA_NONE (Jason, Kevin)
- Rename iommu_copy_user_data() to iommu_copy_struct_from_user() (Kevin)
- Add macro to the iommu_copy_struct_from_user() to calculate min_size (Jason)
- Fix two bugs spotted by ZhaoYan
v4: https://lore.kernel.org/linux-iommu/20230921075138.124099-1-yi.l.liu@intel.c...
- Separate HWPT alloc/destroy/abort functions between user-managed HWPTs and kernel-managed HWPTs
- Rework invalidate uAPI to be a multi-request array-based design
- Add a struct iommu_user_data_array and a helper for driver to sanitize and copy the entry data from user space invalidation array
- Add a patch fixing TEST_LENGTH() in selftest program
- Drop IOMMU_RESV_IOVA_RANGES patches
- Update kdoc and inline comments
- Drop the code to add IOMMU_RESV_SW_MSI to kernel-managed HWPT in nested translation, this does not change the rule that resv regions should only be added to the kernel-managed HWPT. The IOMMU_RESV_SW_MSI stuff will be added in later series as it is needed only by SMMU so far.
v3: https://lore.kernel.org/linux-iommu/20230724110406.107212-1-yi.l.liu@intel.c...
- Add new uAPI things in alphabetical order
- Pass in "enum iommu_hwpt_type hwpt_type" to op->domain_alloc_user for sanity, replacing the previous op->domain_alloc_user_data_len solution
- Return ERR_PTR from domain_alloc_user instead of NULL
- Only add IOMMU_RESV_SW_MSI to kernel-managed HWPT in nested translation (Kevin)
- Add IOMMU_RESV_IOVA_RANGES to report resv iova ranges to userspace hence userspace is able to exclude the ranges in the stage-1 HWPT (e.g. guest I/O page table). (Kevin)
- Add selftest coverage for the new IOMMU_RESV_IOVA_RANGES ioctl
- Minor changes per Kevin's inputs
v2: https://lore.kernel.org/linux-iommu/20230511143844.22693-1-yi.l.liu@intel.co...
- Add union iommu_domain_user_data to include all user data structures to avoid passing void * in kernel APIs.
- Add iommu op to return user data length for user domain allocation
- Rename struct iommu_hwpt_alloc::data_type to be hwpt_type
- Store the invalidation data length in iommu_domain_ops::cache_invalidate_user_data_len
- Convert cache_invalidate_user op to be int instead of void
- Remove @data_type in struct iommu_hwpt_invalidate
- Remove out_hwpt_type_bitmap in struct iommu_hw_info hence drop patch 08 of v1
v1: https://lore.kernel.org/linux-iommu/20230309080910.607396-1-yi.l.liu@intel.c...
Thanks, Yi Liu
Lu Baolu (1): iommu: Add cache_invalidate_user op
Nicolin Chen (4): iommu: Add iommu_copy_struct_from_user_array helper iommufd/selftest: Add mock_domain_cache_invalidate_user support iommufd/selftest: Add IOMMU_TEST_OP_MD_CHECK_IOTLB test op iommufd/selftest: Add coverage for IOMMU_HWPT_INVALIDATE ioctl
Yi Liu (1): iommufd: Add IOMMU_HWPT_INVALIDATE
drivers/iommu/iommufd/hw_pagetable.c | 35 ++++++++ drivers/iommu/iommufd/iommufd_private.h | 9 ++ drivers/iommu/iommufd/iommufd_test.h | 22 +++++ drivers/iommu/iommufd/main.c | 3 + drivers/iommu/iommufd/selftest.c | 69 +++++++++++++++ include/linux/iommu.h | 84 +++++++++++++++++++ include/uapi/linux/iommufd.h | 35 ++++++++ tools/testing/selftests/iommu/iommufd.c | 75 +++++++++++++++++ tools/testing/selftests/iommu/iommufd_utils.h | 63 ++++++++++++++ 9 files changed, 395 insertions(+)
-- 2.34.1
-- Regards, Yi Liu