On 12/22/19 5:23 AM, Leon Romanovsky wrote:
On Fri, Dec 20, 2019 at 03:54:55PM -0800, John Hubbard wrote:
On 12/20/19 10:29 AM, Leon Romanovsky wrote: ...
$ ./build.sh
$ build/bin/run_tests.py
If you get things that far I think Leon can get a reproduction for you
I'm not so optimistic about that.
OK, I'm going to proceed for now on the assumption that I've got an overflow problem that happens when huge pages are pinned. If I can get more information, great, otherwise it's probably enough.
One thing: for your repro, if you know the huge page size, and the system page size for that case, that would really help. Also the number of pins per page, more or less, that you'd expect. Because Jason says that only 2M huge pages are used...
Because the other possibility is that the refcount really is going negative, likely due to a mismatched pin/unpin somehow.
If there's not an obvious repro case available, but you do have one (is it easy to repro, though?), then *if* you have the time, I could point you to a github branch that reduces GUP_PIN_COUNTING_BIAS by, say, 4x, by applying this:
diff --git a/include/linux/mm.h b/include/linux/mm.h
index bb44c4d2ada7..8526fd03b978 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1077,7 +1077,7 @@ static inline void put_page(struct page *page)
  * get_user_pages and page_mkclean and other calls that race to set up page
  * table entries.
  */
-#define GUP_PIN_COUNTING_BIAS (1U << 10)
+#define GUP_PIN_COUNTING_BIAS (1U << 8)
 
 void unpin_user_page(struct page *page);
 void unpin_user_pages_dirty_lock(struct page **pages, unsigned long npages,
If that fails to repro, then we would be zeroing in on the root cause.
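To make the overflow hypothesis concrete, here is a minimal userspace sketch of the arithmetic. It is not kernel code. It assumes a signed 32-bit page refcount, a 2M huge page made of 4K system pages (the system page size in the failing setup is not confirmed), and that every pinned subpage adds one GUP_PIN_COUNTING_BIAS to the head page's refcount. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the refcount is a signed 32-bit counter, so it turns
 * negative once it would exceed INT32_MAX. */
#define REFCOUNT_LIMIT ((uint64_t)INT32_MAX)

/* Hypothetical helper: refcount units added to the head page when every
 * subpage of one huge page is pinned exactly once. */
static uint64_t refs_per_full_pin(uint64_t huge_page_size,
                                  uint64_t base_page_size,
                                  uint64_t bias)
{
    assert(base_page_size != 0);
    assert(huge_page_size % base_page_size == 0);
    return (huge_page_size / base_page_size) * bias;
}

/* Hypothetical helper: number of times the whole huge page can be pinned
 * before the counter would pass REFCOUNT_LIMIT. */
static uint64_t max_full_pins(uint64_t huge_page_size,
                              uint64_t base_page_size,
                              uint64_t bias)
{
    uint64_t per_pin = refs_per_full_pin(huge_page_size, base_page_size, bias);

    assert(per_pin != 0);
    return REFCOUNT_LIMIT / per_pin;
}
```

Under those assumptions, max_full_pins(2097152, 4096, 1U << 10) gives 4095 and max_full_pins(2097152, 4096, 1U << 8) gives 16383, so the smaller bias leaves roughly 4x more headroom before a real overflow.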
The branch is here (I just tested it and it seems healthy):
git@github.com:johnhubbard/linux.git pin_user_pages_tracking_v11_with_diags
Hi,
We tested the following branch and here are the results:
Thanks for this testing run!
[root@server consume_mtts]# (master) $ grep foll_pin /proc/vmstat
nr_foll_pin_requested 0
nr_foll_pin_returned 0
Zero pinned pages!
...now I'm confused. Somehow FOLL_PIN and pin_user_pages*() calls are not happening. And although the backtraces below show some of my new routines (like try_grab_page), they also confirm the above: there is no pin_user_page*() call in the stack.
In particular, it looks like ib_umem_get() is calling through to get_user_pages*(), rather than pin_user_pages*(). I don't see how this is possible, because the code on my screen shows ib_umem_get() calling pin_user_pages_fast().
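For anyone repeating the counter check, here is a minimal sketch of how the two values could be compared programmatically. It assumes the text has the "name value" per-line layout of /proc/vmstat and uses the counter names from the grep output above, which come from the diagnostics branch. The helper names are hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: look for a line starting with "<name> " in
 * vmstat-style text and parse the number after it.
 * Returns 1 on success, 0 if the counter is absent or malformed. */
static int vmstat_counter(const char *text, const char *name,
                          unsigned long long *out)
{
    size_t len = strlen(name);
    const char *line = text;

    assert(text && name && out);
    while (line && *line) {
        if (strncmp(line, name, len) == 0 && line[len] == ' ')
            return sscanf(line + len, "%llu", out) == 1;
        line = strchr(line, '\n');
        if (line)
            line++; /* step past the newline to the next line */
    }
    return 0;
}

/* Hypothetical helper: pins requested minus pins returned.
 * Returns 1 and fills *balance on success, 0 if a counter is missing. */
static int pin_balance(const char *text, long long *balance)
{
    unsigned long long requested, returned;

    if (!vmstat_counter(text, "nr_foll_pin_requested", &requested) ||
        !vmstat_counter(text, "nr_foll_pin_returned", &returned))
        return 0;
    *balance = (long long)requested - (long long)returned;
    return 1;
}
```

Fed the contents of /proc/vmstat, a nonzero balance after the test exits would point toward a mismatched pin/unpin. Both counters reading zero, as in the output above, would mean the FOLL_PIN path was never taken.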
Any thoughts or ideas are welcome here.
However, glossing over all of that and assuming that the new GUP_PIN_COUNTING_BIAS of 256 is applied, it's interesting that we still see any overflow. I'm less confident now that this is a true refcount overflow.
Also, any information that would get me closer to being able to attempt my own reproduction of the problem is *very* welcome. :)
thanks,