On Dec 22, 2020, at 11:44 AM, Andrea Arcangeli aarcange@redhat.com wrote:
On Mon, Dec 21, 2020 at 02:55:12PM -0800, Nadav Amit wrote:
wouldn’t mmap_write_downgrade() be executed before mprotect_fixup() (so
I assume you mean "in" mprotect_fixup, after change_protection.
If you would downgrade the mmap_lock to read there, then it'd severely slowdown the non contention case, if there's more than vma that needs change_protection.
You'd need to throw away the prev->vm_next info and you'd need to do a new find_vma after droping the mmap_lock for reading and re-taking the mmap_lock for writing at every iteration of the loop.
To do less harm to the non-contention case you could perhaps walk vma->vm_next and check if it's outside the mprotect range and only downgrade in such case. So let's assume we intend to optimize with mmap_write_downgrade only the last vma.
…
I read in detail your response and you make great points. To be fair, I did not think it through and just tried to make a point that not taking mmap_lock for write is an unrelated optimization.
So you make a great point that it is actually related and can only(?) benefit uffd and arguably soft-dirty, to which I am going to add mmap_write_lock().
Yet, my confidence in doing the optimization that you suggested (keep using mmap_read_lock()) as part of the bug fix is very low and just got lower since we discussed. So I do want in the future to try to avoid the overheads I introduce (sorry), but it requires a systematic solution and some thought.
Perhaps any change to PTE in a page-table should increase a page-table generation that we would save in the page-table page-struct (next to the PTL) and pte_same() would also look at the original and updated "page-table generation” when it checks if a PTE changed. So if a PTE went through not-writable -> writable -> not-writable cycle without a TLB flush this can go detected. Yet, there is still a question of how to detect pending TLB flushes in finer granularity to avoid frequent unwarranted TLB flushes while a TLB flush is pending.
It all requires some thought, and the fact that soft-dirty appears to be broken too indicates that bugs can get unnoticed for some time.
Sorry for being a chicken, but I prefer to be safe than sorry.