On 10/30/19 3:49 PM, John Hubbard wrote:
This also fixes one or two likely bugs.
Well, actually just one...
- Change vfio from get_user_pages(FOLL_LONGTERM) to pin_longterm_pages(),
  which sets both FOLL_LONGTERM and FOLL_PIN.
Note that this is a change in behavior, because the get_user_pages_remote() call was not setting FOLL_LONGTERM, but the new pin_user_pages_remote() call that replaces it, *is* setting FOLL_LONGTERM. It is important to set FOLL_LONGTERM, because the DMA case requires it. Please see the FOLL_PIN documentation in include/linux/mm.h, and Documentation/pin_user_pages.rst for details.
Correction: the above comment is stale and wrong. I wrote it before getting further into the details, and the patch doesn't do this.
Instead, it keeps exactly the old behavior: pin_longterm_pages_remote() is careful to avoid setting FOLL_LONGTERM. Instead of setting that flag, it drops in a "TODO" comment nearby. :)
I'll update the commit description in the next version of the series.
thanks,
John Hubbard
NVIDIA
- Because all FOLL_PIN-acquired pages must be released via
put_user_page(), also convert the put_page() call over to put_user_pages().
Note that this effectively changes the code's behavior in vfio_iommu_type1.c: put_pfn(): it now ultimately calls set_page_dirty_lock(), instead of set_page_dirty(). This is probably more accurate.
As Christoph Hellwig put it, "set_page_dirty() is only safe if we are dealing with a file backed page where we have reference on the inode it hangs off." [1]
[1] https://lore.kernel.org/r/20190723153640.GB720@lst.de
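To illustrate the pairing rule above, here is a minimal, hypothetical caller
(not part of this patch; demo_pin_for_dma() and its error handling are made
up for illustration). It uses the pin_longterm_pages() and
put_user_pages_dirty_lock() names as proposed in this series:

/*
 * Hypothetical caller, for illustration only. Assumes the
 * pin_longterm_pages() / put_user_pages_dirty_lock() APIs proposed in
 * this series, declared in include/linux/mm.h.
 */
static int demo_pin_for_dma(unsigned long vaddr, bool writable)
{
	unsigned int flags = writable ? FOLL_WRITE : 0;
	struct page *page;
	long ret;

	/* FOLL_PIN and FOLL_LONGTERM are set internally by this API. */
	ret = pin_longterm_pages(vaddr, 1, flags, &page, NULL);
	if (ret != 1)
		return ret < 0 ? ret : -EFAULT;

	/* ... map the page for DMA and use it ... */

	/*
	 * FOLL_PIN-acquired pages must not be released with put_page().
	 * This release path dirties the page via set_page_dirty_lock()
	 * when the final argument is true.
	 */
	put_user_pages_dirty_lock(&page, 1, writable);
	return 0;
}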
Cc: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/vfio/vfio_iommu_type1.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)
diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index d864277ea16f..795e13f3ef08 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -327,9 +327,8 @@ static int put_pfn(unsigned long pfn, int prot)
 {
 	if (!is_invalid_reserved_pfn(pfn)) {
 		struct page *page = pfn_to_page(pfn);
-		if (prot & IOMMU_WRITE)
-			SetPageDirty(page);
-		put_page(page);
+
+		put_user_pages_dirty_lock(&page, 1, prot & IOMMU_WRITE);
 		return 1;
 	}
 	return 0;
@@ -349,11 +348,11 @@ static int vaddr_get_pfn(struct mm_struct *mm, unsigned long vaddr,
 
 	down_read(&mm->mmap_sem);
 	if (mm == current->mm) {
-		ret = get_user_pages(vaddr, 1, flags | FOLL_LONGTERM, page,
-				     vmas);
+		ret = pin_longterm_pages(vaddr, 1, flags, page, vmas);
 	} else {
-		ret = get_user_pages_remote(NULL, mm, vaddr, 1, flags, page,
-					    vmas, NULL);
+		ret = pin_longterm_pages_remote(NULL, mm, vaddr, 1,
+						flags, page, vmas,
+						NULL);
 		/*
 		 * The lifetime of a vaddr_get_pfn() page pin is
 		 * userspace-controlled. In the fs-dax case this could
@@ -363,7 +362,7 @@ static int vaddr_get_pfn(struct mm_struct *mm, unsigned long vaddr,
 		 */
 		if (ret > 0 && vma_is_fsdax(vmas[0])) {
 			ret = -EOPNOTSUPP;
-			put_page(page[0]);
+			put_user_page(page[0]);
 		}
 	}
 	up_read(&mm->mmap_sem);