Re: [PATCH] iommu/vt-d: do not use flush-queue when caching-mode is on

27 Jan 2021

...
On Jan 26, 2021, at 4:26 PM, Lu Baolu baolu.lu@linux.intel.com wrote:
Hi Nadav,
On 1/27/21 4:38 AM, Nadav Amit wrote:
...
From: Nadav Amit namit@vmware.com
When an Intel IOMMU is virtualized, and a physical device is
passed-through to the VM, changes of the virtual IOMMU need to be
propagated to the physical IOMMU. The hypervisor therefore needs to
monitor PTE mappings in the IOMMU page-tables. Intel specifications
provide "caching-mode" capability that a virtual IOMMU uses to report
that the IOMMU is virtualized and a TLB flush is needed after mapping to
allow the hypervisor to propagate virtual IOMMU mappings to the physical
IOMMU. To the best of my knowledge no real physical IOMMU reports
"caching-mode" as turned on.
Synchronizing the virtual and the physical TLBs is expensive if the
hypervisor is unaware which PTEs have changed, as the hypervisor is
required to walk all the virtualized tables and look for changes.
Consequently, domain flushes are much more expensive than page-specific
flushes on virtualized IOMMUs with passthrough devices. The kernel
therefore exploited the "caching-mode" indication to avoid domain
flushing and use page-specific flushing in virtualized environments. See
commit 78d5f0f500e6 ("intel-iommu: Avoid global flushes with caching
mode.")
This behavior changed after commit 13cf01744608 ("iommu/vt-d: Make use
of iova deferred flushing"). Now, when batched TLB flushing is used (the
default), full TLB domain flushes are performed frequently, requiring
the hypervisor to perform expensive synchronization between the virtual
TLB and the physical one.
Good catch. Thank you!
...
Getting batched TLB flushes to use in such circumstances page-specific
invalidations again is not easy, since the TLB invalidation scheme
assumes that "full" domain TLB flushes are performed for scalability.
Disable batched TLB flushes when caching-mode is on, as the performance
benefit from using batched TLB invalidations is likely to be much
smaller than the overhead of the virtual-to-physical IOMMU page-tables
synchronization.
Fixes: 78d5f0f500e6 ("intel-iommu: Avoid global flushes with caching mode.")
Signed-off-by: Nadav Amit namit@vmware.com
Cc: David Woodhouse dwmw2@infradead.org
Cc: Lu Baolu baolu.lu@linux.intel.com
Cc: Joerg Roedel joro@8bytes.org
Cc: Will Deacon will@kernel.org
Cc: stable@vger.kernel.org

drivers/iommu/intel/iommu.c | 26 +++++++++++++++++++++++++-
 1 file changed, 25 insertions(+), 1 deletion(-)

diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 788119c5b021..4e08f5e17175 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -5373,6 +5373,30 @@ intel_iommu_domain_set_attr(struct iommu_domain *domain,
   return ret;
 }
 +static int
+intel_iommu_domain_get_attr_use_flush_queue(struct iommu_domain *domain)
Just nit:
Can we just use this
static bool domain_use_flush_queue(struct iommu_domain *domain)
?
Sure.
...
...
+{

struct dmar_domain *dmar_domain = to_dmar_domain(domain);
struct intel_iommu *iommu = domain_get_iommu(dmar_domain);

if (intel_iommu_strict)
return 0;



/*
* The flush queue implementation does not perform page-selective


* invalidations that are required for efficient TLB flushes in virtual


* environments. The benefit of batching is likely to be much lower than


* the overhead of synchronizing the virtual and physical IOMMU


* page-tables.


*/


if (iommu && cap_caching_mode(iommu->cap)) {
pr_warn_once("IOMMU batching is partially disabled due to virtualization");


return 0;


}

domain_get_iommu() only returns the first iommu, and could return NULL
when this is called before domain attaching to any device. A better
choice could be check caching mode globally and return false if caching
mode is supported on any iommu.
  struct dmar_drhd_unit *drhd;
  struct intel_iommu *iommu;

  rcu_read_lock();
  for_each_active_iommu(iommu, drhd) {
           if (cap_caching_mode(iommu->cap))
                   return false;
   }
   rcu_read_unlock();

Err.. You are correct (thanks!).
To be frank, I do not like it. I once (10 years ago) implemented a VT-d DMAR
virtualization prototype (not for VMware), and it seemed to me that the best
practice to maximize performance would be to virtualize one IOMMU with
caching-mode turned off for emulated devices, and one IOMMU with
caching-mode turned on for passthrough devices.
If some hypervisor used such a scheme, it would be better to allow the VM to
use batched invalidations for domains that only hold emulated devices.
However, as you said, it is not simple, since we do not know which devices
will be attached to each domain. So I will use your scheme (with the
correction that you sent).
It is a shame the code got broken so long ago and I (and others) did not
notice.
Thanks again,
Nadav

    

2025

2024

2023

2022

2021

2020

2019

2018

2017

Re: [PATCH] iommu/vt-d: do not use flush-queue when caching-mode is on