On Mon, Jul 14, 2025 at 03:19:17PM +0200, Uladzislau Rezki wrote:
On Mon, Jul 14, 2025 at 01:39:20PM +0100, David Laight wrote:
On Wed, 9 Jul 2025 11:22:34 -0700 Dave Hansen dave.hansen@intel.com wrote:
On 7/9/25 11:15, Jacob Pan wrote:
Is there a use case where an SVA user can access kernel memory in the first place?
No. It should be fully blocked.
Then I don't understand what "vulnerability condition" is being addressed here. We are talking about the KVA range here.
SVA users can't access kernel memory, but they can compel walks of kernel page tables, and the IOMMU caches those walks. The trouble starts if the kernel happens to free one of those page-table pages while the IOMMU keeps using the cached walk after the free.
That was covered in the changelog, but I guess it could be made a bit more succinct.
But does this really mean that every flush_tlb_kernel_range() should flush the IOMMU's page-table caches as well? AFAIU, the set_memory_*() helpers flush the TLB even when only protection bits in a PTE change, and that seems like overkill...
Is it worth just never freeing the page tables used for vmalloc() memory? After all they are likely to be reallocated again.
Do we free them? Maybe on some arches? According to tests (AMD x86-64) I did once upon a time, the PTE entries were not freed after vfree(). Freeing them could be expensive, due to the global "page_table_lock".
I see one place, though; it is in vmap_try_huge_pud():

	if (pud_present(*pud) && !pud_free_pmd_page(pud, addr))
		return 0;

This is the case where a PUD entry is replaced by a huge-page mapping.
There's also a place that replaces a PMD with a smaller huge page, but other than that vmalloc does not free page tables.
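For completeness, the pmd-level counterpart lives in vmap_try_huge_pmd(). Quoting the relevant check from memory of mm/vmalloc.c (the exact surrounding context may differ between kernel versions, so verify against your tree):

```c
/* mm/vmalloc.c, vmap_try_huge_pmd() (paraphrased from memory):
 * before installing a huge PMD mapping, any existing PTE page table
 * hanging off this pmd must be torn down first; bail out if that
 * fails. */
if (pmd_present(*pmd) && !pmd_free_pte_page(pmd, addr))
	return 0;
```

So, as with the pud case above, page-table pages are only freed when a table is being replaced by a huge mapping, not on an ordinary vfree().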
-- Uladzislau Rezki