Commit d92c370a16cb ("block: really clone the block cgroup in bio_clone_blkg_association") changed bio_clone_blkg_association() to just clone bio->bi_blkg reference from source to destination bio. This is however wrong if the source and destination bios are against different block devices because struct blkcg_gq is different for each bdev-blkcg pair. This will result in IOs being accounted (and throttled as a result) multiple times against the same device (src bdev) while throttling of the other device (dst bdev) is ignored. In case of BFQ the inconsistency can even result in crashes in bfq_bic_update_cgroup(). Fix the problem by looking up correct blkcg_gq for the cloned bio.
Reported-by: Logan Gunthorpe <logang@deltatee.com>
Reported-by: Donald Buczek <buczek@molgen.mpg.de>
Fixes: d92c370a16cb ("block: really clone the block cgroup in bio_clone_blkg_association")
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
---
 block/blk-cgroup.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
index 40161a3f68d0..ecb4eaff6817 100644
--- a/block/blk-cgroup.c
+++ b/block/blk-cgroup.c
@@ -1975,10 +1975,9 @@ EXPORT_SYMBOL_GPL(bio_associate_blkg);
 void bio_clone_blkg_association(struct bio *dst, struct bio *src)
 {
 	if (src->bi_blkg) {
-		if (dst->bi_blkg)
-			blkg_put(dst->bi_blkg);
-		blkg_get(src->bi_blkg);
-		dst->bi_blkg = src->bi_blkg;
+		rcu_read_lock();
+		bio_associate_blkg_from_css(dst, bio_blkcg_css(src));
+		rcu_read_unlock();
 	}
 }
 EXPORT_SYMBOL_GPL(bio_clone_blkg_association);
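For readers following the accounting argument in the commit message, here is a minimal user-space sketch of the problem. It is a toy model, not kernel code: every `toy_*` name, the two-device/two-cgroup table, and the "charge one IO on submit" rule are assumptions made up for illustration. It only shows why copying the source's per-(bdev, cgroup) object charges the source device twice, while looking the object up again for the destination device charges each device once.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy user-space model; all names are hypothetical, not kernel APIs. */
enum { NR_BDEVS = 2, NR_CGROUPS = 2 };

/* Stands in for struct blkcg_gq: one object per (bdev, cgroup) pair. */
struct toy_blkg {
	int bdev;
	int cgroup;
	unsigned long ios;	/* IOs charged to this pair */
};

struct toy_bio {
	int bdev;		/* device the bio is submitted to */
	struct toy_blkg *blkg;	/* accounting object the bio is charged to */
};

static struct toy_blkg blkg_table[NR_BDEVS][NR_CGROUPS];

static struct toy_blkg *toy_blkg_lookup(int bdev, int cgroup)
{
	struct toy_blkg *blkg;

	assert(bdev >= 0 && bdev < NR_BDEVS);
	assert(cgroup >= 0 && cgroup < NR_CGROUPS);
	blkg = &blkg_table[bdev][cgroup];
	blkg->bdev = bdev;
	blkg->cgroup = cgroup;
	return blkg;
}

/* Old behaviour in this model: reuse the source's object as-is. */
static void toy_clone_pointer(struct toy_bio *dst, const struct toy_bio *src)
{
	dst->blkg = src->blkg;
}

/* Fixed behaviour in this model: same cgroup, but dst's own device. */
static void toy_clone_lookup(struct toy_bio *dst, const struct toy_bio *src)
{
	dst->blkg = toy_blkg_lookup(dst->bdev, src->blkg->cgroup);
}

static void toy_submit(struct toy_bio *bio)
{
	bio->blkg->ios++;
}

/* Submits one IO on bdev 0, clones it to bdev 1, submits the clone.
 * Returns the number of IOs charged to (bdev 1, cgroup 1). */
static unsigned long toy_run(int use_lookup)
{
	struct toy_bio src, dst;

	memset(blkg_table, 0, sizeof(blkg_table));
	src.bdev = 0;
	src.blkg = toy_blkg_lookup(0, 1);
	dst.bdev = 1;
	dst.blkg = NULL;

	toy_submit(&src);
	if (use_lookup)
		toy_clone_lookup(&dst, &src);
	else
		toy_clone_pointer(&dst, &src);
	toy_submit(&dst);

	return blkg_table[1][1].ios;
}
```

Calling `toy_run(0)` leaves both IOs charged to the source device's entry and none to the destination's; `toy_run(1)` charges one IO to each.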
On Wed, Jun 01, 2022 at 06:34:05PM +0200, Jan Kara wrote:
> --- a/block/blk-cgroup.c
> +++ b/block/blk-cgroup.c
> @@ -1975,10 +1975,9 @@ EXPORT_SYMBOL_GPL(bio_associate_blkg);
>  void bio_clone_blkg_association(struct bio *dst, struct bio *src)
>  {
>  	if (src->bi_blkg) {
> +		rcu_read_lock();
> +		bio_associate_blkg_from_css(dst, bio_blkcg_css(src));
> +		rcu_read_unlock();
Why do we even need the rcu critical section here?
Otherwise looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
On Wed 01-06-22 10:42:32, Christoph Hellwig wrote:
> On Wed, Jun 01, 2022 at 06:34:05PM +0200, Jan Kara wrote:
> > --- a/block/blk-cgroup.c
> > +++ b/block/blk-cgroup.c
> > @@ -1975,10 +1975,9 @@ EXPORT_SYMBOL_GPL(bio_associate_blkg);
> >  void bio_clone_blkg_association(struct bio *dst, struct bio *src)
> >  {
> >  	if (src->bi_blkg) {
> > +		rcu_read_lock();
> > +		bio_associate_blkg_from_css(dst, bio_blkcg_css(src));
> > +		rcu_read_unlock();
>
> Why do we even need the rcu critical section here?
Good question. I've just blindly copied it from bio_associate_blkg() but bio_blkcg_css(src) is safe without RCU (we hold object references for all the dereferences) and bio_associate_blkg_from_css() takes RCU lock in blkg_tryget_closest() which is the only place where it needs it. So no, we don't need the RCU lock there. Thanks for noticing. I'll send V2 shortly with your change and the added tags.
> Otherwise looks good:
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
Thanks for review!
Honza
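To make the locking argument above concrete, here is a small user-space sketch of the structure Honza describes: the helper that performs the lookup takes the read lock itself, so the caller needs no critical section of its own. This is an illustration under stated assumptions, not the V2 patch: all `toy_*` names are hypothetical stand-ins, and the "RCU lock" is just a nesting counter.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins; none of these are kernel APIs. */
static int toy_rcu_depth;		/* nesting depth of the fake read lock */
static int toy_lookup_was_protected;	/* set if the lookup ran under the lock */

static void toy_rcu_read_lock(void)
{
	toy_rcu_depth++;
}

static void toy_rcu_read_unlock(void)
{
	assert(toy_rcu_depth > 0);
	toy_rcu_depth--;
}

struct toy_css  { int id; };
struct toy_blkg { struct toy_css *css; int refs; };
struct toy_bio  { struct toy_blkg *blkg; };

/* Accounting object for the destination device in this model. */
static struct toy_blkg toy_dst_dev_blkg;

/* Models the role of blkg_tryget_closest(): the only step that needs
 * the read lock, and it takes the lock on its own. */
static struct toy_blkg *toy_tryget_closest(struct toy_css *css)
{
	struct toy_blkg *blkg;

	toy_rcu_read_lock();
	toy_lookup_was_protected = (toy_rcu_depth > 0);
	blkg = &toy_dst_dev_blkg;
	blkg->css = css;
	blkg->refs++;
	toy_rcu_read_unlock();
	return blkg;
}

/* Models bio_blkcg_css(): the bio holds a reference on its blkg, so
 * this dereference needs no lock. */
static struct toy_css *toy_bio_css(const struct toy_bio *bio)
{
	return bio->blkg->css;
}

/* Caller without its own critical section. */
static void toy_clone_association(struct toy_bio *dst, const struct toy_bio *src)
{
	if (src->blkg)
		dst->blkg = toy_tryget_closest(toy_bio_css(src));
}

/* Returns 1 if the lookup was protected, the lock is released afterwards,
 * and dst got its own object that refers to the same css as src. */
static int toy_demo(void)
{
	struct toy_css css = { .id = 1 };
	struct toy_blkg src_blkg = { .css = &css, .refs = 1 };
	struct toy_bio src = { .blkg = &src_blkg };
	struct toy_bio dst = { .blkg = NULL };

	toy_clone_association(&dst, &src);
	return toy_lookup_was_protected && toy_rcu_depth == 0 &&
	       dst.blkg != src.blkg && dst.blkg->css == src.blkg->css;
}
```

In this model `toy_demo()` returns 1 even though `toy_clone_association()` never touches the lock, which is the shape of the reasoning in the reply above.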
On 01.06.22 18:34, Jan Kara wrote:
> Commit d92c370a16cb ("block: really clone the block cgroup in bio_clone_blkg_association") changed bio_clone_blkg_association() to just clone bio->bi_blkg reference from source to destination bio. This is however wrong if the source and destination bios are against different block devices because struct blkcg_gq is different for each bdev-blkcg pair. This will result in IOs being accounted (and throttled as a result) multiple times against the same device (src bdev) while throttling of the other device (dst bdev) is ignored. In case of BFQ the inconsistency can even result in crashes in bfq_bic_update_cgroup(). Fix the problem by looking up correct blkcg_gq for the cloned bio.
>
> Reported-by: Logan Gunthorpe <logang@deltatee.com>
> Reported-by: Donald Buczek <buczek@molgen.mpg.de>
> Fixes: d92c370a16cb ("block: really clone the block cgroup in bio_clone_blkg_association")
> CC: stable@vger.kernel.org
> Signed-off-by: Jan Kara <jack@suse.cz>
> ---
>  block/blk-cgroup.c | 7 +++----
>  1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/block/blk-cgroup.c b/block/blk-cgroup.c
> index 40161a3f68d0..ecb4eaff6817 100644
> --- a/block/blk-cgroup.c
> +++ b/block/blk-cgroup.c
> @@ -1975,10 +1975,9 @@ EXPORT_SYMBOL_GPL(bio_associate_blkg);
>  void bio_clone_blkg_association(struct bio *dst, struct bio *src)
>  {
>  	if (src->bi_blkg) {
> -		if (dst->bi_blkg)
> -			blkg_put(dst->bi_blkg);
> -		blkg_get(src->bi_blkg);
> -		dst->bi_blkg = src->bi_blkg;
> +		rcu_read_lock();
> +		bio_associate_blkg_from_css(dst, bio_blkcg_css(src));
> +		rcu_read_unlock();
>  	}
>  }
>  EXPORT_SYMBOL_GPL(bio_clone_blkg_association);
Great. Fixed the problem for me. Thanks to you, also to Logan.
Tested-By: Donald Buczek <buczek@molgen.mpg.de>