Is there any other way of handling this? E.g. not release the metadata in arch_swap_invalidate_page() but later in set_pte_at() once it was restored. But then we may leak this metadata if there's no set_pte_at() (the process mapping the swap entry died).
That was my immediate thought: do we really have to hook into swap_range_free() at all?
As I alluded to in another reply, without the hook in swap_range_free() I think we would either end up with a race or an effective memory leak in the arch code that maintains the metadata for swapped out pages, as there would be no way for the arch-specific code to know when it is safe to free it after swapin.
Agreed, hooking swap_range_free() is actually cleaner (also considering COW-shared pages).
And I also wondered why we have to do this from set_pte_at() and not do this explicitly (maybe that's the other arch_* callback on the swapin path).
I don't think it's necessary, as the set_pte_at() call sites for swapped in pages are known. I'd much rather do this via an explicit hook at those call sites, as the existing approach of implicit restoring seems too subtle and easy to be overlooked when refactoring, as we have seen with this bug. In the end we only have 3 call sites for the hook and hopefully the comments that I'm adding are sufficient to ensure that any new swapin code should end up with a call to the hook in the right place.
Agreed, much cleaner, thanks!