Hi Linus,
On Fri, May 19, 2023 at 3:21 PM Linus Torvalds torvalds@linux-foundation.org wrote:
On Fri, May 19, 2023 at 12:09 PM Joel Fernandes (Google) joel@joelfernandes.org wrote:
+static bool check_addr_in_prev(struct vm_area_struct *vma, unsigned long addr,
unsigned long mask)
+{
int addr_masked = addr & mask;
struct vm_area_struct *prev = NULL, *cur = NULL;
/* If the masked address is within vma, there is no prev mapping of concern. */
if (vma->vm_start <= addr_masked)
return false;
Hmm.
I should have caught this last time, but I didn't.
That test smells bad to me. Or maybe it's just the comment.
I *suspect* that the test is literally just for the stack movement case by execve, where it catches the case where we're doing the movement entirely within the one vma we set up.
Yes that's right, the test is only for the stack movement case. For the regular mremap case, I don't think there is a way for it to trigger.
But in the *general* case I think the above is horribly wrong: if you want to move pages within an existing mapping, the page moving code can't just randomly say "I'll expand the area you wanted to move". Again, in that general mremap() case (as opposed to the special stack moving case for execve), I *think* that the caller has already split the vma's at the point of the move, and this test simply cannot ever trigger.
Yes, the test simply cannot ever trigger for mremap() but we still need the test for the stack case. That's why I had considered adding it and I had indeed reviewed the stack case when adding the test. I could update the comment to mention that, if you want.
So I think the _code_ works, but I think the comment in particular is questionable, and I'm a bit surprised about the code too,. because I thought execve() only expanded to exactly the moving area.
It expands to cover both the new start and the old end of the stack AFAICS: /* * cover the whole range: [new_start, old_end) */ if (vma_expand(&vmi, vma, new_start, old_end, vma->vm_pgoff, NULL)) return -ENOMEM;
In this case, it will trigger for the stack because this same expanded vma is passed as both the new vma and the old vma to move_page_tables().
*/ if (length != move_page_tables(vma, old_start, vma, new_start, length, false)) return -ENOMEM;
So AFAICS, it is possible that old_start will start later than this newly expanded VMA. So for such a situation, old_start can be realigned to PMD and the test allows that by saying it need not worry about aligning down to an existing VMA.
End result: I think the patch on the whole looks nice, and smaller than I expected. I also suspect it works in practice, but I'd like that test clarified. Does it *actually* trigger for the stack moving case? Because I think it must (never* trigger for the mremap case?
You are right that the test will not trigger for the mremap case. But from a correctness standpoint, I thought of leaving it there for the stack (and who knows for what other future reasons the test may be needed). I can update the comment describing the stack if you like.
And maybe I'm the one confused here, and all I really need is an explanation with small words and simple grammar starting with "No, Linus, this is for case xyz"
Hopefully it is clear now and you agree. Let me know if you want me to do something more. I can make some time next week to trace the stack case a bit more if you like and report back on any behaviors, however the mremap tests I did are looking good and working as expected.
thanks,
- Joel