The patch below does not apply to the 5.10-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to stable@vger.kernel.org.
thanks,
greg k-h
------------------ original commit in Linus's tree ------------------
From 29b32839725f8c89a41cb6ee054c85f3116ea8b5 Mon Sep 17 00:00:00 2001
From: Nadav Amit <namit@vmware.com>
Date: Wed, 27 Jan 2021 09:53:17 -0800
Subject: [PATCH] iommu/vt-d: Do not use flush-queue when caching-mode is on
When an Intel IOMMU is virtualized and a physical device is passed through to the VM, changes to the virtual IOMMU need to be propagated to the physical IOMMU. The hypervisor therefore needs to monitor PTE mappings in the IOMMU page-tables. Intel specifications provide a "caching-mode" capability that a virtual IOMMU uses to report that the IOMMU is virtualized and that a TLB flush is needed after each mapping, to allow the hypervisor to propagate virtual IOMMU mappings to the physical IOMMU. To the best of my knowledge, no real physical IOMMU reports "caching-mode" as turned on.
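[For reference, the caching-mode indication is a single bit (CM, bit 7) of the VT-d Capability Register, which the driver tests with its cap_caching_mode() macro. A minimal self-contained sketch of that check; iommu_is_virtualized() is an illustrative name, not a driver function:

#include <stdbool.h>
#include <stdint.h>

/* Mirrors cap_caching_mode() from include/linux/intel-iommu.h: the
 * Caching Mode (CM) field is bit 7 of the Capability Register. */
#define cap_caching_mode(c) (((c) >> 7) & 1)

/* CM=1 means the IOTLB must be invalidated even when a PTE changes
 * from not-present to present; in practice only virtual IOMMUs set
 * it, so that the hypervisor can observe new mappings. */
static bool iommu_is_virtualized(uint64_t cap_reg)
{
	return cap_caching_mode(cap_reg);
}
]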
Synchronizing the virtual and the physical IOMMU tables is expensive if the hypervisor is unaware which PTEs have changed, as the hypervisor is required to walk all the virtualized tables and look for changes. Consequently, domain flushes are much more expensive than page-specific flushes on virtualized IOMMUs with passthrough devices. The kernel therefore exploited the "caching-mode" indication to avoid domain flushing and use page-specific flushing in virtualized environments. See commit 78d5f0f500e6 ("intel-iommu: Avoid global flushes with caching mode.")
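[To illustrate the cost difference, the sketch below contrasts the two invalidation granularities. It is a stand-alone toy loosely modeled on the driver's queued-invalidation interface; flush_iotlb() and the enum names here are illustrative stand-ins, not the driver's actual code:

#include <stdint.h>
#include <stdio.h>

/* The two granularities of the VT-d IOTLB invalidate descriptor. */
enum flush_gran {
	DSI_FLUSH,	/* domain-selective: every mapping in the domain */
	PSI_FLUSH,	/* page-selective: one aligned IOVA range */
};

static void flush_iotlb(uint16_t did, uint64_t iova,
			unsigned int size_order, enum flush_gran gran)
{
	if (gran == PSI_FLUSH)
		/* A caching-mode hypervisor only re-syncs this range. */
		printf("PSI: domain %u, iova 0x%llx, 2^%u pages\n",
		       did, (unsigned long long)iova, size_order);
	else
		/* A caching-mode hypervisor must rescan the whole table. */
		printf("DSI: domain %u, all mappings\n", did);
}

int main(void)
{
	flush_iotlb(1, 0x100000, 0, PSI_FLUSH);	/* cheap when virtualized */
	flush_iotlb(1, 0, 0, DSI_FLUSH);	/* expensive when virtualized */
	return 0;
}
]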
This behavior changed after commit 13cf01744608 ("iommu/vt-d: Make use of iova deferred flushing"). Now, when batched TLB flushing is used (the default), full TLB domain flushes are performed frequently, requiring the hypervisor to perform expensive synchronization between the virtual TLB and the physical one.
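[The deferred ("flush queue") scheme batches freed IOVA ranges and releases them all behind one domain-wide flush; that single full flush is exactly what becomes expensive under caching-mode. A simplified stand-alone sketch of the idea with illustrative names (the real implementation introduced by commit 13cf01744608 uses per-CPU queues and a timer):

#include <stddef.h>
#include <stdint.h>

#define FQ_SIZE 256

struct fq_entry { uint64_t iova; size_t pages; };

struct flush_queue {
	struct fq_entry entries[FQ_SIZE];
	unsigned int count;
};

/* Stubs standing in for the driver's IOTLB flush and IOVA allocator. */
static void flush_iotlb_domain(void) { }
static void free_iova_range(uint64_t iova, size_t pages) { (void)iova; (void)pages; }

/* One full-domain flush covers every queued range: the batching win on
 * real hardware, and the expensive path on a caching-mode vIOMMU. */
static void fq_flush(struct flush_queue *fq)
{
	unsigned int i;

	flush_iotlb_domain();
	for (i = 0; i < fq->count; i++)
		free_iova_range(fq->entries[i].iova, fq->entries[i].pages);
	fq->count = 0;
}

/* Unmap path: park the freed range instead of flushing immediately. */
static void fq_defer_free(struct flush_queue *fq, uint64_t iova, size_t pages)
{
	if (fq->count == FQ_SIZE)
		fq_flush(fq);
	fq->entries[fq->count] = (struct fq_entry){ .iova = iova, .pages = pages };
	fq->count++;
}
]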
Getting batched TLB flushes to use page-specific invalidations again in such circumstances is not easy, since the TLB invalidation scheme assumes that "full" domain TLB flushes are performed for scalability.
Disable batched TLB flushes when caching-mode is on, as the performance benefit from using batched TLB invalidations is likely to be much smaller than the overhead of the virtual-to-physical IOMMU page-tables synchronization.
Fixes: 13cf01744608 ("iommu/vt-d: Make use of iova deferred flushing")
Signed-off-by: Nadav Amit <namit@vmware.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Will Deacon <will@kernel.org>
Cc: stable@vger.kernel.org
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Link: https://lore.kernel.org/r/20210127175317.1600473-1-namit@vmware.com
Signed-off-by: Joerg Roedel <jroedel@suse.de>
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index f665322a0991..06b00b5363d8 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -5440,6 +5440,36 @@ intel_iommu_domain_set_attr(struct iommu_domain *domain,
 	return ret;
 }
 
+static bool domain_use_flush_queue(void)
+{
+	struct dmar_drhd_unit *drhd;
+	struct intel_iommu *iommu;
+	bool r = true;
+
+	if (intel_iommu_strict)
+		return false;
+
+	/*
+	 * The flush queue implementation does not perform page-selective
+	 * invalidations that are required for efficient TLB flushes in
+	 * virtual environments. The benefit of batching is likely to be
+	 * much lower than the overhead of synchronizing the virtual and
+	 * physical IOMMU page-tables.
+	 */
+	rcu_read_lock();
+	for_each_active_iommu(iommu, drhd) {
+		if (!cap_caching_mode(iommu->cap))
+			continue;
+
+		pr_warn_once("IOMMU batching is disabled due to virtualization");
+		r = false;
+		break;
+	}
+	rcu_read_unlock();
+
+	return r;
+}
+
 static int
 intel_iommu_domain_get_attr(struct iommu_domain *domain,
 			    enum iommu_attr attr, void *data)
@@ -5450,7 +5480,7 @@ intel_iommu_domain_get_attr(struct iommu_domain *domain,
 	case IOMMU_DOMAIN_DMA:
 		switch (attr) {
 		case DOMAIN_ATTR_DMA_USE_FLUSH_QUEUE:
-			*(int *)data = !intel_iommu_strict;
+			*(int *)data = domain_use_flush_queue();
 			return 0;
 		default:
 			return -ENODEV;
Backporting requires disabling strict mode during initialization. Lu, can you ack this patch?
-- >8 --
From d5ce982ce6f6f869c53cc0ed496a6dd4c1309657 Mon Sep 17 00:00:00 2001
From: Nadav Amit <namit@vmware.com>
Date: Tue, 26 Jan 2021 12:03:11 -0800
Subject: [PATCH] iommu/vt-d: do not use flush-queue when caching-mode is on
When an Intel IOMMU is virtualized and a physical device is passed through to the VM, changes to the virtual IOMMU need to be propagated to the physical IOMMU. The hypervisor therefore needs to monitor PTE mappings in the IOMMU page-tables. Intel specifications provide a "caching-mode" capability that a virtual IOMMU uses to report that the IOMMU is virtualized and that a TLB flush is needed after each mapping, to allow the hypervisor to propagate virtual IOMMU mappings to the physical IOMMU. To the best of my knowledge, no real physical IOMMU reports "caching-mode" as turned on.
Synchronizing the virtual and the physical IOMMU tables is expensive if the hypervisor is unaware which PTEs have changed, as the hypervisor is required to walk all the virtualized tables and look for changes. Consequently, domain flushes are much more expensive than page-specific flushes on virtualized IOMMUs with passthrough devices. The kernel therefore exploited the "caching-mode" indication to avoid domain flushing and use page-specific flushing in virtualized environments. See commit 78d5f0f500e6 ("intel-iommu: Avoid global flushes with caching mode.")
This behavior changed after commit 13cf01744608 ("iommu/vt-d: Make use of iova deferred flushing"). Now, when batched TLB flushing is used (the default), full TLB domain flushes are performed frequently, requiring the hypervisor to perform expensive synchronization between the virtual TLB and the physical one.
Getting batched TLB flushes to use page-specific invalidations again in such circumstances is not easy, since the TLB invalidation scheme assumes that "full" domain TLB flushes are performed for scalability.
Disable batched TLB flushes when caching-mode is on, as the performance benefit from using batched TLB invalidations is likely to be much smaller than the overhead of the virtual-to-physical IOMMU page-tables synchronization.
Fixes: 13cf01744608 ("iommu/vt-d: Make use of iova deferred flushing")
Signed-off-by: Nadav Amit <namit@vmware.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Will Deacon <will@kernel.org>
Cc: stable@vger.kernel.org
---
 drivers/iommu/intel/iommu.c | 5 +++++
 1 file changed, 5 insertions(+)
diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
index 151243fa01ba..7e3db4c0324d 100644
--- a/drivers/iommu/intel/iommu.c
+++ b/drivers/iommu/intel/iommu.c
@@ -3350,6 +3350,11 @@ static int __init init_dmars(void)
 
 		if (!ecap_pass_through(iommu->ecap))
 			hw_pass_through = 0;
+
+		if (!intel_iommu_strict && cap_caching_mode(iommu->cap)) {
+			pr_warn("Disable batched IOTLB flush due to virtualization");
+			intel_iommu_strict = 1;
+		}
 		intel_svm_check(iommu);
 	}
On 2/5/21 2:04 AM, Nadav Amit wrote:
> Backporting requires disabling strict mode during initialization. Lu,
> can you ack this patch?
>
> -- >8 --
>
> From d5ce982ce6f6f869c53cc0ed496a6dd4c1309657 Mon Sep 17 00:00:00 2001
> From: Nadav Amit <namit@vmware.com>
> Date: Tue, 26 Jan 2021 12:03:11 -0800
> Subject: [PATCH] iommu/vt-d: do not use flush-queue when caching-mode is on
>
> [...]
>
> diff --git a/drivers/iommu/intel/iommu.c b/drivers/iommu/intel/iommu.c
> index 151243fa01ba..7e3db4c0324d 100644
> --- a/drivers/iommu/intel/iommu.c
> +++ b/drivers/iommu/intel/iommu.c
> @@ -3350,6 +3350,11 @@ static int __init init_dmars(void)
>  		if (!ecap_pass_through(iommu->ecap))
>  			hw_pass_through = 0;
> +
> +		if (!intel_iommu_strict && cap_caching_mode(iommu->cap)) {
> +			pr_warn("Disable batched IOTLB flush due to virtualization");
It doesn't mean something is wrong. Just optimized for virtualization, right? Is pr_info() sufficient?
Others look good to me.
Acked-by: Lu Baolu <baolu.lu@linux.intel.com>
Best regards,
baolu
> +			intel_iommu_strict = 1;
> +		}
>  		intel_svm_check(iommu);
>  	}
On Thu, Feb 04, 2021 at 06:04:13PM +0000, Nadav Amit wrote:
> Backporting requires disabling strict mode during initialization. Lu,
> can you ack this patch?
>
> -- >8 --
>
> From d5ce982ce6f6f869c53cc0ed496a6dd4c1309657 Mon Sep 17 00:00:00 2001
> From: Nadav Amit <namit@vmware.com>
> Date: Tue, 26 Jan 2021 12:03:11 -0800
> Subject: [PATCH] iommu/vt-d: do not use flush-queue when caching-mode is on
>
> [...]
This works for 5.10, thanks! But what about 4.9, 4.14, 4.19, and 5.4? Those also need this change, right?
thanks,
greg k-h
On Feb 5, 2021, at 12:54 AM, Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:

> On Thu, Feb 04, 2021 at 06:04:13PM +0000, Nadav Amit wrote:
>> Backporting requires disabling strict mode during initialization. Lu,
>> can you ack this patch?
>
> This works for 5.10, thanks! But what about 4.9, 4.14, 4.19, and 5.4?
> Those also need this change, right?
Thanks for taking the patch.
Yes, older kernels need to be patched too. I wanted Lu to ack the 5.10 patch first.
For 5.4 and older kernels, the patch is fundamentally the same as the one for 5.10. Yet the patch that I sent for 5.10 does not apply cleanly, so please use the following patch.
Please let me know if there is any problem.
-- >8 --
From 4abd08d6c3c997160257606a6c4057601d32dd7b Mon Sep 17 00:00:00 2001
From: Nadav Amit <namit@vmware.com>
Date: Mon, 1 Feb 2021 10:45:35 -0800
Subject: [PATCH] iommu/vt-d: do not use flush-queue when caching-mode is on
When an Intel IOMMU is virtualized and a physical device is passed through to the VM, changes to the virtual IOMMU need to be propagated to the physical IOMMU. The hypervisor therefore needs to monitor PTE mappings in the IOMMU page-tables. Intel specifications provide a "caching-mode" capability that a virtual IOMMU uses to report that the IOMMU is virtualized and that a TLB flush is needed after each mapping, to allow the hypervisor to propagate virtual IOMMU mappings to the physical IOMMU. To the best of my knowledge, no real physical IOMMU reports "caching-mode" as turned on.
Synchronizing the virtual and the physical IOMMU tables is expensive if the hypervisor is unaware which PTEs have changed, as the hypervisor is required to walk all the virtualized tables and look for changes. Consequently, domain flushes are much more expensive than page-specific flushes on virtualized IOMMUs with passthrough devices. The kernel therefore exploited the "caching-mode" indication to avoid domain flushing and use page-specific flushing in virtualized environments. See commit 78d5f0f500e6 ("intel-iommu: Avoid global flushes with caching mode.")
This behavior changed after commit 13cf01744608 ("iommu/vt-d: Make use of iova deferred flushing"). Now, when batched TLB flushing is used (the default), full TLB domain flushes are performed frequently, requiring the hypervisor to perform expensive synchronization between the virtual TLB and the physical one.
Getting batched TLB flushes to use page-specific invalidations again in such circumstances is not easy, since the TLB invalidation scheme assumes that "full" domain TLB flushes are performed for scalability.
Disable batched TLB flushes when caching-mode is on, as the performance benefit from using batched TLB invalidations is likely to be much smaller than the overhead of the virtual-to-physical IOMMU page-tables synchronization.
The backported patch checks in init_dmars() whether any IOMMU has caching-mode turned on and, if so, turns on strict mode (disabling the flush queue).
Fixes: 13cf01744608 ("iommu/vt-d: Make use of iova deferred flushing")
Signed-off-by: Nadav Amit <namit@vmware.com>
Cc: David Woodhouse <dwmw2@infradead.org>
Cc: Lu Baolu <baolu.lu@linux.intel.com>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Will Deacon <will@kernel.org>
Cc: stable@vger.kernel.org
---
 drivers/iommu/intel-iommu.c | 6 ++++++
 1 file changed, 6 insertions(+)
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index 984c7a6ea4fe..953d86ca6d2b 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -3285,6 +3285,12 @@ static int __init init_dmars(void)
 
 		if (!ecap_pass_through(iommu->ecap))
 			hw_pass_through = 0;
+
+		if (!intel_iommu_strict && cap_caching_mode(iommu->cap)) {
+			pr_info("Disable batched IOTLB flush due to virtualization");
+			intel_iommu_strict = 1;
+		}
+
 #ifdef CONFIG_INTEL_IOMMU_SVM
 		if (pasid_supported(iommu))
 			intel_svm_init(iommu);
On Fri, Feb 05, 2021 at 06:29:27PM +0000, Nadav Amit wrote:
> On Feb 5, 2021, at 12:54 AM, Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
>
>> On Thu, Feb 04, 2021 at 06:04:13PM +0000, Nadav Amit wrote:
>>> Backporting requires disabling strict mode during initialization. Lu,
>>> can you ack this patch?
>>
>> This works for 5.10, thanks! But what about 4.9, 4.14, 4.19, and 5.4?
>> Those also need this change, right?
>
> Thanks for taking the patch.
>
> Yes, older kernels need to be patched too. I wanted Lu to ack the 5.10
> patch first.
>
> For 5.4 and older kernels, the patch is fundamentally the same as the
> one for 5.10. Yet the patch that I sent for 5.10 does not apply
> cleanly, so please use the following patch.
>
> Please let me know if there is any problem.
That worked, thanks.
greg k-h