On Thu, Apr 05, 2018 at 08:40:05AM -0700, Linus Torvalds wrote:
On Thu, Apr 5, 2018 at 7:17 AM, Michael S. Tsirkin mst@redhat.com wrote:
I wonder however whether all the following should be changed then:
static long __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
...
if (!vma || check_vma_flags(vma, gup_flags)) return i ? : -EFAULT;
is this a bug in __get_user_pages?
Note the difference between "get_user_pages()", and "get_user_pages_fast()".
It's the *fast* versions that just return the number of pages pinned.
The non-fast ones will return an error code for various cases.
Why?
The non-fast cases actually *have* various error cases. They can block and get interrupted etc.
The fast cases are basically "just get me the pages, dammit, and if you can't get some page, stop".
At least that's one excuse for the difference in behavior.
The real excuse is probably just "that's how it worked" - the fast case just walked the page tables and that was it.
Linus
I see, thanks for the clarification Linus.
to repeat what you are saying IIUC __get_user_pages_fast returns 0 if it can't pin any pages and that is by design. Returning 0 on error isn't usual I think so I guess this behaviour should we well documented.
That part of my patch was wrong and should be replaced with a doc update.
What about get_user_pages_fast though? That's the other part of the patch. Right now get_user_pages_fast does:
ret = get_user_pages_unlocked(start, nr_pages - nr, pages, write ? FOLL_WRITE : 0);
/* Have to be a bit careful with return values */ if (nr > 0) { if (ret < 0) ret = nr; else ret += nr; }
so an error on the 1st page gets propagated to the caller, and that get_user_pages_unlocked eventually calls __get_user_pages so it does return an error sometimes.
Would it be correct to apply the second part of the patch then (pasted below for reference) or should get_user_pages_fast and all its callers be changed to return 0 on error instead?
@@ -1806,9 +1809,12 @@ int get_user_pages_fast(unsigned long start, int nr_pages, int write, len = (unsigned long) nr_pages << PAGE_SHIFT; end = start + len;
+ if (nr_pages <= 0) + return 0; + if (unlikely(!access_ok(write ? VERIFY_WRITE : VERIFY_READ, (void __user *)start, len))) - return 0; + return -EFAULT;
if (gup_fast_permitted(start, nr_pages, write)) { local_irq_disable();