On Mon, Jul 14, 2025 at 03:19:17PM +0200, Uladzislau Rezki wrote:
On Mon, Jul 14, 2025 at 01:39:20PM +0100, David Laight wrote:
On Wed, 9 Jul 2025 11:22:34 -0700 Dave Hansen dave.hansen@intel.com wrote:
On 7/9/25 11:15, Jacob Pan wrote:
Is there a use case where an SVA user can access kernel memory in the first place?
No. It should be fully blocked.
Then I don't understand what "vulnerability condition" is being addressed here. We are talking about the KVA range here.
SVA users can't access kernel memory, but they can compel walks of kernel page tables, and the IOMMU caches those walks. The trouble starts if the kernel happens to free one of those page-table pages while the IOMMU keeps using the cached walk after the free.
That was covered in the changelog, but I guess it could be made a bit more succinct.
But does this really mean that every flush_tlb_kernel_range() should flush the IOMMU's page-table caches as well? AFAIU, the set_memory_*() helpers flush the TLB even when only protection bits in a PTE change, and that seems like overkill...
Is it worth just never freeing the page tables used for vmalloc() memory? After all they are likely to be reallocated again.
Do we free them? Maybe on some arches? According to tests (AMD x86-64) I did once upon a time, the PTE entries were not freed after vfree(). Freeing them could be expensive, due to the global "page_table_lock".
I see one place, though; it is in vmap_try_huge_pud():

	if (pud_present(*pud) && !pud_free_pmd_page(pud, addr))
		return 0;

This is the case where a PUD entry is replaced by a huge-page mapping.
There's also a place that replaces a PMD with a smaller huge page, but other than that vmalloc does not free page tables.
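For completeness, the pmd-level counterpart lives in vmap_try_huge_pmd(). Quoting the relevant check from memory of mm/vmalloc.c (the exact surrounding context may differ between kernel versions, so verify against your tree):

```c
/* mm/vmalloc.c, vmap_try_huge_pmd() (paraphrased from memory):
 * before installing a huge PMD mapping, any existing PTE page table
 * hanging off this pmd must be torn down first; bail out if that
 * fails. */
if (pmd_present(*pmd) && !pmd_free_pte_page(pmd, addr))
	return 0;
```

So, as with the pud case above, page-table pages are only freed when a table is being replaced by a huge mapping, not on an ordinary vfree().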
-- Uladzislau Rezki