bfq_setup_cooperator() uses bfqd->in_serv_last_pos to detect whether it makes sense to merge the current bfq queue with the in-service queue. However, if the in-service queue is freshly scheduled and has not dispatched any requests yet, bfqd->in_serv_last_pos is stale and still contains the value from the previously scheduled bfq queue, which can result in a bogus decision that the two queues should be merged. This bug can be observed, for example, with the following fio jobfile:
[global]
direct=0
ioengine=sync
invalidate=1
size=1g
rw=read

[reader]
numjobs=4
directory=/mnt
where the 4 processes end up in one shared bfq queue although they do IO to physically very distant files (for some reason I was able to observe this only with the slice_idle=1ms setting).
Fix the problem by invalidating bfqd->in_serv_last_pos when switching the in-service queue.
Fixes: 058fdecc6de7 ("block, bfq: fix in-service-queue check for queue merging")
CC: stable@vger.kernel.org
Signed-off-by: Jan Kara <jack@suse.cz>
---
 block/bfq-iosched.c | 1 +
 1 file changed, 1 insertion(+)
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 3d411716d7ee..50017275915f 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -2937,6 +2937,7 @@ static void __bfq_set_in_service_queue(struct bfq_data *bfqd,
 	}
 
 	bfqd->in_service_queue = bfqq;
+	bfqd->in_serv_last_pos = 0;
 }
 
 /*
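For context on why the stale value matters: bfqd->in_serv_last_pos is refreshed only while the in-service queue dispatches requests, and its consumer is a sector-proximity test in bfq_setup_cooperator(). Roughly (a paraphrased sketch, not the exact upstream code):

	/* bfq_update_peak_rate(): the position is refreshed only once the
	 * in-service queue actually dispatches, so before the first dispatch
	 * it still holds the previous queue's position unless it is reset
	 * as this patch does. */
	if (bfqd->in_service_queue == bfqq)
		bfqd->in_serv_last_pos = bfqd->last_position;

	/* bfq_setup_cooperator(): merge with the in-service queue only if
	 * the new IO is close to where that queue last dispatched. With a
	 * stale in_serv_last_pos this test compares against the wrong
	 * position and can green-light a bogus merge. */
	if (in_service_bfqq && in_service_bfqq != bfqq &&
	    bfq_rq_close_to_sector(io_struct, request, bfqd->in_serv_last_pos) &&
	    bfq_may_be_close_cooperator(bfqq, in_service_bfqq))
		return bfq_setup_merge(bfqq, in_service_bfqq);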
On 5 Jun 2020, at 16:16, Jan Kara <jack@suse.cz> wrote:
> bfq_setup_cooperator() uses bfqd->in_serv_last_pos to detect whether it
> makes sense to merge the current bfq queue with the in-service queue.
> However, if the in-service queue is freshly scheduled and has not
> dispatched any requests yet, bfqd->in_serv_last_pos is stale and still
> contains the value from the previously scheduled bfq queue, which can
> result in a bogus decision that the two queues should be merged.
Good catch!
> Fix the problem by invalidating bfqd->in_serv_last_pos when switching
> the in-service queue.
Apart from the nonexistent problem that even 0 is a valid LBA :)
Acked-by: Paolo Valente <paolo.valente@linaro.org>
On Thu 11-06-20 09:13:07, Paolo Valente wrote:
> Apart from the nonexistent problem that even 0 is a valid LBA :)
Yes, I was also thinking about that and decided 0 is "good enough" :). But I can just as well switch to (sector_t)-1 if you think it would be better.
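(Just a sketch for the record: the alternative being discussed would amount to the following in __bfq_set_in_service_queue(), instead of the 0 used in the posted patch.)

	bfqd->in_serv_last_pos = (sector_t)-1;	/* sentinel: no position dispatched yet */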
> Acked-by: Paolo Valente <paolo.valente@linaro.org>
Thanks!
Honza
On 11 Jun 2020, at 10:31, Jan Kara <jack@suse.cz> wrote:
> Yes, I was also thinking about that and decided 0 is "good enough" :). But I
> can just as well switch to (sector_t)-1 if you think it would be better.
0 is ok :)
Thanks, Paolo
On 11 Jun 2020, at 16:12, Paolo Valente <paolo.valente@linaro.org> wrote:
> 0 is ok :)
Hi Jan, I've finally tested this patch of yours. No regression.
Once again: Acked-by: Paolo Valente <paolo.valente@linaro.org>
Thanks, Paolo