From: Jason Gunthorpe <jgg@nvidia.com>
Sent: Friday, August 8, 2025 3:52 AM
On Thu, Aug 07, 2025 at 10:40:39PM +0800, Baolu Lu wrote:
+static void kernel_pte_work_func(struct work_struct *work)
+{
+	struct page *page, *next;
+	iommu_sva_invalidate_kva_range(0, TLB_FLUSH_ALL);
+	guard(spinlock)(&kernel_pte_work.lock);
+	list_for_each_entry_safe(page, next, &kernel_pte_work.list, lru) {
+		list_del_init(&page->lru);
Please don't add new usages of lru, we are trying to get rid of this. :(
I think the memory should be struct ptdesc, use that..
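To make the suggestion concrete, here is a rough userspace model of draining the deferred list through a list node owned by the page-table descriptor rather than through page->lru. It is only a sketch: struct ptdesc_model, struct list_node and every helper below are hypothetical stand-ins written for illustration, not the kernel's types or list API, and the pt_list member is simply named after the list member that struct ptdesc carries.

```c
/* Userspace model only: these types are stand-ins, not the kernel's. */
#include <stddef.h>

struct list_node {
	struct list_node *prev, *next;
};

/* Stand-in for struct ptdesc: the descriptor owns its own list linkage. */
struct ptdesc_model {
	struct list_node pt_list;
	int freed;		/* set when the model "frees" the table */
};

static void list_init(struct list_node *n)
{
	n->prev = n->next = n;
}

static void list_add_tail_node(struct list_node *n, struct list_node *head)
{
	n->prev = head->prev;
	n->next = head;
	head->prev->next = n;
	head->prev = n;
}

static void list_del_node(struct list_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/* Recover the descriptor from its embedded list node. */
#define node_to_ptdesc(n) \
	((struct ptdesc_model *)((char *)(n) - offsetof(struct ptdesc_model, pt_list)))

/* Drain the deferred list; returns how many descriptors were released. */
static int drain_deferred(struct list_node *head)
{
	int count = 0;

	while (head->next != head) {
		struct ptdesc_model *pt = node_to_ptdesc(head->next);

		list_del_node(&pt->pt_list);
		pt->freed = 1;	/* the kernel would free the table here */
		count++;
	}
	return count;
}
```

The point of the model is only that nothing in the drain loop touches a struct page field; locking and the IOTLB flush are left out.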
btw with this change we should also defer free of the pmd page:
pud_free_pmd_page()
	...
	for (i = 0; i < PTRS_PER_PMD; i++) {
		if (!pmd_none(pmd_sv[i])) {
			pte = (pte_t *)pmd_page_vaddr(pmd_sv[i]);
			pte_free_kernel(&init_mm, pte);
		}
	}

	free_page((unsigned long)pmd_sv);
Otherwise the risk still exists if the pmd page is repurposed before the pte work is scheduled.
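As a rough illustration of the ordering being asked for, the userspace model below sends the pmd table down the same deferred path as the pte tables it pointed to, so neither can be reused until the flush step has run. All names here (struct table, struct deferred_queue, defer_free, free_pmd_and_ptes, deferred_work) are hypothetical and exist only for this sketch; it assumes a fixed-size queue and ignores locking, so it is not a proposal for the actual implementation.

```c
/* Userspace model only: names are hypothetical stand-ins, not kernel APIs. */
#include <stdbool.h>
#include <stddef.h>

#define MAX_DEFERRED 16

enum table_state { TABLE_LIVE, TABLE_DEFERRED, TABLE_FREED };

struct table {
	enum table_state state;
};

struct deferred_queue {
	struct table *items[MAX_DEFERRED];
	size_t count;
	unsigned int flushes;	/* number of simulated IOTLB flushes */
};

/* Queue a table instead of freeing it; returns false if the queue is full. */
static bool defer_free(struct deferred_queue *q, struct table *t)
{
	if (q->count == MAX_DEFERRED)
		return false;
	t->state = TABLE_DEFERRED;
	q->items[q->count++] = t;
	return true;
}

/*
 * Models the pud_free_pmd_page() path: the pte tables and the pmd table
 * that pointed at them are all deferred. On failure, tables queued so far
 * stay queued.
 */
static bool free_pmd_and_ptes(struct deferred_queue *q, struct table *pmd,
			      struct table *ptes, size_t nr_ptes)
{
	for (size_t i = 0; i < nr_ptes; i++)
		if (!defer_free(q, &ptes[i]))
			return false;
	return defer_free(q, pmd);
}

/* Models the work function: flush first, then release everything queued. */
static void deferred_work(struct deferred_queue *q)
{
	q->flushes++;	/* stands in for the IOTLB invalidation */
	for (size_t i = 0; i < q->count; i++)
		q->items[i]->state = TABLE_FREED;
	q->count = 0;
}
```

In this model the pmd table stays in TABLE_DEFERRED until deferred_work() runs, which is the property that an immediate free_page() of the pmd page would lose.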
another observation - pte_free_kernel() is not used in remove_pagetable() and __change_page_attr(). Is it straightforward to put it in those paths, or do we need to duplicate some deferring logic there?