On 8/25/25 19:49, Baolu Lu wrote:
> The three separate lists are needed because we're handling three distinct types of page deallocation. Grouping the pages this way allows the workqueue handler to free each type using the correct function.
> Please allow me to add more details.
Right, I know why it got added this way: it was the quickest way to hack together a patch that fixes the IOMMU issue without refactoring anything.
I agree that you have three cases:
 1. A full on 'struct ptdesc' that needs its destructor run
 2. An order-0 'struct page'
 3. A higher-order 'struct page'
Long-term, #2 and #3 probably need to get converted over to 'struct ptdesc'. They don't look _that_ hard to convert to me. Willy, Vishal, any other mm folks: do you agree?
Short-term, I'd just consolidate your issue down to a single list.
#1: For 'struct ptdesc', modify pte_free_kernel() to pass information in to pagetable_dtor_free() to tell it to use the deferred page table free list. Do this with a bit in the ptdesc or a new argument to pagetable_dtor_free().
#2: Just append these to the deferred page table free list. Easy.
#3: The hackiest way to do this is to treat the higher-order non-compound page as a run of order-0 pages and put them on the deferred page table free list one at a time. The other way to do it is to track down how this thing got allocated in the first place and make sure it's got __GFP_COMP metadata. If so, you can just use __free_pages() for everything.
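To make the single-list idea concrete, here is a rough userspace model of it. This is not kernel code and not a proposed patch: 'struct page_model', 'struct deferred_list', defer_one(), defer_high_order() and drain() are all made-up names for illustration, and the "needs_dtor" flag just stands in for the "bit in the ptdesc or a new argument" from #1. The counters exist only so the model can be checked.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for 'struct page' / 'struct ptdesc' (hypothetical). */
struct page_model {
	struct page_model *next;	/* link on the deferred free list */
	int needs_dtor;			/* case #1: destructor must run first */
};

/* One list for all three cases (hypothetical). */
struct deferred_list {
	struct page_model *head;
	size_t nr_dtor_run;		/* bookkeeping for the model only */
	size_t nr_freed;
};

/* Cases #1 and #2: queue a single order-0 page. */
static void defer_one(struct deferred_list *l, struct page_model *p,
		      int needs_dtor)
{
	assert(l && p);
	p->needs_dtor = needs_dtor;
	p->next = l->head;
	l->head = p;
}

/* Case #3, the hacky variant: queue 1 << order pages one at a time. */
static void defer_high_order(struct deferred_list *l,
			     struct page_model *pages, unsigned int order)
{
	size_t i, nr = (size_t)1 << order;

	for (i = 0; i < nr; i++)
		defer_one(l, &pages[i], 0);
}

/* Models the workqueue handler: one loop, no per-type lists. */
static void drain(struct deferred_list *l)
{
	struct page_model *p = l->head;

	l->head = NULL;
	while (p) {
		struct page_model *next = p->next;

		if (p->needs_dtor)
			l->nr_dtor_run++;	/* where the ptdesc dtor would run */
		l->nr_freed++;			/* where the page would be freed */
		p = next;
	}
}
```

The point of the model is only that the drain side ends up with one loop and one per-entry decision, instead of three lists each paired with its own free function.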
Yeah, it'll take a couple patches up front to refactor some things. But that refactoring will make things more consistent instead of adding complexity to deal with the inconsistency.