On Wed, Apr 04, 2018 at 07:40:36PM -0700, Linus Torvalds wrote:
On Wed, Apr 4, 2018 at 6:53 PM, Michael S. Tsirkin mst@redhat.com wrote:
Any feedback on this? As this fixes a bug in vhost, I'll merge through the vhost tree unless someone objects.
NAK.
__get_user_pages_fast() returns the number of pages it gets.
It has never returned an error code, and all the other versions of it (architecture-specific) don't either.
Thanks Linus. I can change the docs and all the callers.
I wonder however whether all the following should be changed then:
static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
...
if (!vma || check_vma_flags(vma, gup_flags)) return i ? : -EFAULT;
is this a bug in __get_user_pages?
Another example:
ret = get_gate_page(mm, start & PAGE_MASK, gup_flags, &vma, pages ? &pages[i] : NULL); if (ret) return i ? : ret;
and ret is -EFAULT on error.
Another example: switch (ret) { case 0: goto retry; case -EFAULT: case -ENOMEM: case -EHWPOISON: return i ? i : ret; case -EBUSY: return i; case -ENOENT: goto next_page; }
it looks like this will return -EFAULT/-ENOMEM/-EHWPOISON if i is 0.
If you ask for one page, and get zero pages, then that's an -EFAULT. Note that that's an EFAULT regardless of whether that zero page happened due to kernel addresses or just lack of mapping in user space.
The documentation is simply wrong if it says anything else. Fix the docs, and fix the users.
The correct use has always been to check the number of pages returned.
Just looking around, returning an error number looks like it could seriously confuse some things.
You have things like the kvm code that does the *right* thing:
unsigned long ... npinned ... npinned = get_user_pages_fast(uaddr, npages, write ?
FOLL_WRITE : 0, pages); if (npinned != npages) { ...
err: if (npinned > 0) release_pages(pages, npinned);
and the above code clearly depends on the actual behavior, not on the documentation.
This seems to work fine with my patch since it only changes the case where npinned == 0.
Any changes in this area would need some *extreme* care, exactly because of code like the above that clearly depends on the existing semantics.
In fact, the documentation really seems to be just buggy. The actual get_user_pages() function itself is expressly being careful *not* to return an error code, it even has a comment to the effect ("Have to be a bit careful with return values").
So the "If no pages were pinned, returns -errno" comment is just bogus.
Linus
I'd like to change the doc then, but it seems that I'll have to change the implementation in that case too.