On Mon, Jun 22, 2020 at 04:15:40PM -0400, Jerome Glisse wrote:
On Mon, Jun 22, 2020 at 08:46:17AM -0300, Jason Gunthorpe wrote:
On Fri, Jun 19, 2020 at 04:31:47PM -0400, Jerome Glisse wrote:
Not doable as page refcount can change for things unrelated to GUP, with John changes we can identify GUP and we could potentialy copy GUPed page instead of COW but this can potentialy slow down fork() and i am not sure how acceptable this would be. Also this does not solve GUP against page that are already in fork tree ie page P0 is in process A which forks, we now have page P0 in process A and B. Now we have process A which forks again and we have page P0 in A, B, and C. Here B and C are two branches with root in A. B and/or C can keep forking and grow the fork tree.
For a long time now RDMA has broken COW pages when creating user DMA regions.
The problem has been that fork re-COW's regions that had their COW broken.
So, if you break the COW upon mapping and prevent fork (and others) from copying DMA pinned then you'd cover the cases.
I am not sure we want to prevent COW for pinned GUP pages, this would change current semantic and potentialy break/slow down existing apps.
Isn't that basically exactly what 17839856fd588 does? It looks like it uses the same approach RDMA does by sticking FOLL_WRITE even though it is a read action.
After that change the reamining bug is that fork can re-establish a COW./
Anyway i think we focus too much on fork/COW, it is just an unfixable broken corner cases, mmu notifier allows you to avoid it. Forcing real copy on fork would likely be seen as regression by most people.
If you don't copy the there are data corruption bugs though. Real apps probably don't hit a problem here as they are not forking while GUP's are active (RDMA excluded, which does do this)
I think that implementing page pinning by blocking mmu notifiers for the duration of the pin is a particularly good idea either, that actually seems a lot worse than just having the pin in the first place.
Particularly if it is only being done to avoid corner case bugs that already afflict other GUP cases :(
What do you mean 'GUP fast is still succeptible to this' ?
Not all GUP fast path are updated (intentionaly) __get_user_pages_fast()
Sure, that is is the 'raw' accessor
Jason