Because bio_kmalloc uses inline iovecs, the limit on the number of entries is not BIO_MAX_PAGES but rather UIO_MAXIOV, which indeed is already checked in bio_kmalloc. This could cause SG_IO requests to be truncated and the HBA to report a DMA overrun.
Note that if the argument to iov_iter_npages were changed to UIO_MAXIOV, we would still truncate SG_IO requests beyond UIO_MAXIOV pages. Changing it to UIO_MAXIOV + 1 instead ensures that bio_kmalloc notices that the request is too big and blocks it.
Cc: stable@vger.kernel.org
Cc: Al Viro <viro@zeniv.linux.org.uk>
Fixes: b282cc766958 ("bio_map_user_iov(): get rid of the iov_for_each()", 2017-10-11)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
v1->v2: now with "git commit"
---
 block/bio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/bio.c b/block/bio.c
index 4db1008309ed..0914ae4adae9 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -1299,7 +1299,7 @@ struct bio *bio_map_user_iov(struct request_queue *q,
 	if (!iov_iter_count(iter))
 		return ERR_PTR(-EINVAL);
 
-	bio = bio_kmalloc(gfp_mask, iov_iter_npages(iter, BIO_MAX_PAGES));
+	bio = bio_kmalloc(gfp_mask, iov_iter_npages(iter, UIO_MAXIOV + 1));
 	if (!bio)
 		return ERR_PTR(-ENOMEM);
Hi Paolo,
On Wed, Apr 17, 2019 at 01:52:07PM +0200, Paolo Bonzini wrote:
> Because bio_kmalloc uses inline iovecs, the limit on the number of entries is not BIO_MAX_PAGES but rather UIO_MAXIOV, which indeed is already checked in bio_kmalloc. This could cause SG_IO requests to be truncated and the HBA to report a DMA overrun.
BIO_MAX_PAGES only limits a single bio's maximum vector count; if one bio can't hold the whole user-space request, a new bio will be allocated and appended to the passthrough request, as long as the queue limits aren't reached.

So I understand an SG_IO request shouldn't be truncated because of BIO_MAX_PAGES. Could you explain it in a bit more detail, or provide a reproducer?
> Note that if the argument to iov_iter_npages were changed to UIO_MAXIOV, we would still truncate SG_IO requests beyond UIO_MAXIOV pages. Changing it to UIO_MAXIOV + 1 instead ensures that bio_kmalloc notices that the request is too big and blocks it.
We should pass UIO_MAXIOV instead of UIO_MAXIOV + 1, otherwise bio_kmalloc() will fail. Apart from that, the patch looks fine, but it shouldn't be tagged as a fix if my analysis above is correct.
Thanks,
Ming
On Thu, Apr 18, 2019 at 10:19:04AM +0800, Ming Lei wrote:
> Hi Paolo,
>
> On Wed, Apr 17, 2019 at 01:52:07PM +0200, Paolo Bonzini wrote:
> > Because bio_kmalloc uses inline iovecs, the limit on the number of entries is not BIO_MAX_PAGES but rather UIO_MAXIOV, which indeed is already checked in bio_kmalloc. This could cause SG_IO requests to be truncated and the HBA to report a DMA overrun.
>
> BIO_MAX_PAGES only limits the single bio's max vector number, if one bio can't hold all user space request, new bio will be allocated and appended to the passthrough request if queue limits aren't reached.
>
> So I understand SG_IO request shouldn't be truncated because of BIO_MAX_PAGES, or could you explain it in a bit detail or provide a reproducer?
>
> > Note that if the argument to iov_iter_npages were changed to UIO_MAXIOV, we would still truncate SG_IO requests beyond UIO_MAXIOV pages. Changing it to UIO_MAXIOV + 1 instead ensures that bio_kmalloc notices that the request is too big and blocks it.
>
> We should pass UIO_MAXIOV instead of UIO_MAXIOV + 1, otherwise bio_kmalloc() will fail. Otherwise, the patch looks fine, but shouldn't be a fix if my above analysis is correct.
Also, we have enabled multipage bvecs for passthrough IO [1], so we shouldn't need to allocate such a big vector any more; a reasonable number is (PAGE_SIZE - sizeof(struct bio)) / sizeof(struct bio_vec), which avoids stressing mm with high-order allocations.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/axboe/linux-block.git/commit...
Thanks,
Ming
On 18/04/19 04:19, Ming Lei wrote:
> Hi Paolo,
>
> On Wed, Apr 17, 2019 at 01:52:07PM +0200, Paolo Bonzini wrote:
> > Because bio_kmalloc uses inline iovecs, the limit on the number of entries is not BIO_MAX_PAGES but rather UIO_MAXIOV, which indeed is already checked in bio_kmalloc. This could cause SG_IO requests to be truncated and the HBA to report a DMA overrun.
>
> BIO_MAX_PAGES only limits the single bio's max vector number, if one bio can't hold all user space request, new bio will be allocated and appended to the passthrough request if queue limits aren't reached.
Stupid question: where? I don't see any place starting at blk_rq_map_user_iov (and then __blk_rq_map_user_iov->bio_map_user_iov) that would allocate a second bio. The only bio_kmalloc in that path is the one I'm patching.
> So I understand SG_IO request shouldn't be truncated because of BIO_MAX_PAGES, or could you explain it in a bit detail or provide a reproducer?
Unfortunately I don't have a reproducer. Actually any userspace SG_IO above 4 MB triggers it, but you need the right HBA to ensure that it reaches bio_kmalloc.
Paolo
On Thu, Apr 18, 2019 at 10:42:21AM +0200, Paolo Bonzini wrote:
> On 18/04/19 04:19, Ming Lei wrote:
> > Hi Paolo,
> >
> > On Wed, Apr 17, 2019 at 01:52:07PM +0200, Paolo Bonzini wrote:
> > > Because bio_kmalloc uses inline iovecs, the limit on the number of entries is not BIO_MAX_PAGES but rather UIO_MAXIOV, which indeed is already checked in bio_kmalloc. This could cause SG_IO requests to be truncated and the HBA to report a DMA overrun.
> >
> > BIO_MAX_PAGES only limits the single bio's max vector number, if one bio can't hold all user space request, new bio will be allocated and appended to the passthrough request if queue limits aren't reached.
>
> Stupid question: where? I don't see any place starting at blk_rq_map_user_iov (and then __blk_rq_map_user_iov->bio_map_user_iov) that would allocate a second bio. The only bio_kmalloc in that path is the one I'm patching.
Each bio is created inside __blk_rq_map_user_iov() which is run inside a loop, and the created bio is added to request via blk_rq_append_bio(), see the following code:
blk_rq_map_user_iov
  __blk_rq_map_user_iov
    blk_rq_append_bio

blk_rq_map_user_iov():
	...
	do {
		ret = __blk_rq_map_user_iov(rq, map_data, &i, gfp_mask, copy);
		if (ret)
			goto unmap_rq;
		if (!bio)
			bio = rq->bio;
	} while (iov_iter_count(&i));
	...
Thanks,
Ming
On 18/04/19 11:29, Ming Lei wrote:
> On Thu, Apr 18, 2019 at 10:42:21AM +0200, Paolo Bonzini wrote:
> > On 18/04/19 04:19, Ming Lei wrote:
> > > Hi Paolo,
> > >
> > > On Wed, Apr 17, 2019 at 01:52:07PM +0200, Paolo Bonzini wrote:
> > > > Because bio_kmalloc uses inline iovecs, the limit on the number of entries is not BIO_MAX_PAGES but rather UIO_MAXIOV, which indeed is already checked in bio_kmalloc. This could cause SG_IO requests to be truncated and the HBA to report a DMA overrun.
> > >
> > > BIO_MAX_PAGES only limits the single bio's max vector number, if one bio can't hold all user space request, new bio will be allocated and appended to the passthrough request if queue limits aren't reached.
> >
> > Stupid question: where? I don't see any place starting at blk_rq_map_user_iov (and then __blk_rq_map_user_iov->bio_map_user_iov) that would allocate a second bio. The only bio_kmalloc in that path is the one I'm patching.
>
> Each bio is created inside __blk_rq_map_user_iov() which is run inside a loop, and the created bio is added to request via blk_rq_append_bio(), see the following code:
Uff, I can't read apparently. :( This is the commit that introduced it:
commit 4d6af73d9e43f78651a43ee4c5ad221107ac8365
Author: Christoph Hellwig <hch@lst.de>
Date:   Wed Mar 2 18:07:14 2016 +0100

    block: support large requests in blk_rq_map_user_iov

    This patch adds support for larger requests in blk_rq_map_user_iov by
    allowing it to build multiple bios for a request. This functionality
    used to exist for the non-vectored blk_rq_map_user in the past, and
    this patch reuses the existing functionality for it on the unmap side,
    which stuck around. Thanks to the iov_iter API supporting multiple
    bios is fairly trivial, as we can just iterate the iov until we've
    consumed the whole iov_iter.

    Signed-off-by: Christoph Hellwig <hch@lst.de>
    Reported-by: Jeff Lien <Jeff.Lien@hgst.com>
    Tested-by: Jeff Lien <Jeff.Lien@hgst.com>
    Reviewed-by: Keith Busch <keith.busch@intel.com>
    Signed-off-by: Jens Axboe <axboe@fb.com>
Paolo
On Thu, Apr 18, 2019 at 11:36:03AM +0200, Paolo Bonzini wrote:
> On 18/04/19 11:29, Ming Lei wrote:
> > On Thu, Apr 18, 2019 at 10:42:21AM +0200, Paolo Bonzini wrote:
> > > On 18/04/19 04:19, Ming Lei wrote:
> > > > Hi Paolo,
> > > >
> > > > On Wed, Apr 17, 2019 at 01:52:07PM +0200, Paolo Bonzini wrote:
> > > > > Because bio_kmalloc uses inline iovecs, the limit on the number of entries is not BIO_MAX_PAGES but rather UIO_MAXIOV, which indeed is already checked in bio_kmalloc. This could cause SG_IO requests to be truncated and the HBA to report a DMA overrun.
> > > >
> > > > BIO_MAX_PAGES only limits the single bio's max vector number, if one bio can't hold all user space request, new bio will be allocated and appended to the passthrough request if queue limits aren't reached.
> > >
> > > Stupid question: where? I don't see any place starting at blk_rq_map_user_iov (and then __blk_rq_map_user_iov->bio_map_user_iov) that would allocate a second bio. The only bio_kmalloc in that path is the one I'm patching.
> >
> > Each bio is created inside __blk_rq_map_user_iov() which is run inside a loop, and the created bio is added to request via blk_rq_append_bio(), see the following code:
>
> Uff, I can't read apparently. :( This is the commit that introduced it:
>
> commit 4d6af73d9e43f78651a43ee4c5ad221107ac8365
> Author: Christoph Hellwig <hch@lst.de>
> Date:   Wed Mar 2 18:07:14 2016 +0100
>
>     block: support large requests in blk_rq_map_user_iov
Exactly, the above commit starts to build multiple bios for a request.
Then I guess your issue is triggered on a kernel without that commit.
Thanks,
Ming