Hello,
I have been investigating a read performance regression of dm-snapshot on top of loopback in which the read time for a dd command increased from 2min to 40min. I bisected the issue to dc5fc361d89 ("block: attempt direct issue of plug list"). I blktraced before and after this commit and the main difference I saw was that before this commit, when the performance was good, there were a lot of IO unplugs on the loop dev. After this commit, I saw 0 IO unplugs.
On mainline, I was also able to bisect to a commit which fixed the issue: 667ea36378cf ("loop: don't set QUEUE_FLAG_NOMERGES"). I also blktraced before and after this commit, and unsurprisingly, the main difference was that this commit resulted in IO merges whereas previously there were none.
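For context, my understanding is that the fix just deletes the line in the loop driver that sets the flag. A sketch from memory (the wrapper function name below is made up; the real deletion is in the loop device setup path in drivers/block/loop.c):

/*
 * Sketch from memory of the line 667ea36378cf removes; the wrapper
 * function here is hypothetical, the actual change is in the loop
 * device setup path in drivers/block/loop.c.
 */
static void loop_set_queue_flags_sketch(struct loop_device *lo)
{
	/*
	 * Deleted by the fix: with NOMERGES set, the block layer
	 * never merges adjacent bios/requests on the loop queue, so
	 * every small read is issued to the backing file on its own.
	 */
	blk_queue_flag_set(QUEUE_FLAG_NOMERGES, lo->lo_queue);
}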
I don't totally understand what is going on with the first commit, but I'd guess the change to the plug list behavior resulted in IO not getting merged/grouped, and once QUEUE_FLAG_NOMERGES was dropped, the block layer could merge the IO again and win the performance back through that mechanism. But 2min -> 40min seems like a huge performance drop just from merged vs. non-merged IO, no? So perhaps it's more complicated than that...
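To show what I think changed, here is a heavily simplified sketch of the plug flush path after dc5fc361d89 as I read it (not the actual upstream code; names and conditions are approximate):

/*
 * Heavily simplified sketch of the plug flush path after dc5fc361d89,
 * as I read it -- not the actual upstream code. When every request in
 * the plug targets the same queue and no elevator is attached, the
 * requests are issued directly one by one, bypassing the batched
 * dispatch path (and the block_unplug tracepoint it contains), which
 * would match the 0 unplug events I saw in blktrace.
 */
static void flush_plug_list_sketch(struct blk_plug *plug, bool from_schedule)
{
	if (!plug->multiple_queues && !plug->has_elevator && !from_schedule) {
		/* New fast path: issue each request directly. */
		blk_mq_plug_issue_direct(plug, false);
		if (rq_list_empty(plug->mq_list))
			return;
	}

	/* Older batched path; this is where trace_block_unplug() fires. */
	do {
		blk_mq_dispatch_plug_list(plug, from_schedule);
	} while (!rq_list_empty(plug->mq_list));
}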
For reference, the releases each commit first landed in:
dc5fc361d89 -> 5.16
667ea36378c -> 6.11
6.6.y and 6.1.y were both experiencing the performance issue. I tried porting 667ea36378 to these branches; it applied cleanly and resolved the issue for both. So perhaps we should consider it for the stable trees, but it'd be great if someone from the block layer could chime in with a better idea of what's going on here.
- leah
Hi Leah,
On Thu, Oct 03, 2024 at 02:13:52PM -0700, Leah Rumancik wrote:
Hello,
I have been investigating a read performance regression of dm-snapshot on top of loopback in which the read time for a dd command increased from 2min to 40min. I bisected the issue to dc5fc361d89 ("block: attempt direct issue of plug list"). I blktraced before and after this commit and the main difference I saw was that before this commit, when the performance was good, there were a lot of IO unplugs on the loop dev. After this commit, I saw 0 IO unplugs.
/me makes a note that it might make sense to enhance the tracing to show which of the trace_block_unplug call sites did a particular unplug, because that might be helpful here. I suspect the ones you saw were the ones from blk_mq_dispatch_plug_list, which now gets bypassed.
I'm not really sure how that changed things, except that I know in old kernels we had issues with reordering I/O in this path, and maybe your workload hit that? Did the I/O issue order change in the blktrace?
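Something like the following is what I have in mind for the tracing enhancement (an untested, hypothetical sketch: trace_block_unplug_site() and its extra string argument are made up, the tracepoint today only takes (q, depth, explicit)):

/*
 * Untested, hypothetical sketch of the tracing enhancement: pass a
 * call-site string into the unplug tracepoint so blktrace output can
 * show which unplug path ran. trace_block_unplug_site() does not
 * exist; the current call is trace_block_unplug(q, depth, explicit).
 */
static void blk_mq_dispatch_plug_list(struct blk_plug *plug, bool from_sched)
{
	struct request_queue *q = rq_list_peek(&plug->mq_list)->q;
	unsigned int depth = plug->rq_count;

	trace_block_unplug_site(q, depth, !from_sched,
				"blk_mq_dispatch_plug_list");

	/* ... batch the requests per hctx and dispatch (elided) ... */
}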
On mainline, I was also able to bisect to a commit which fixed the issue: 667ea36378cf ("loop: don't set QUEUE_FLAG_NOMERGES"). I also blktraced before and after this commit, and unsurprisingly, the main difference was that this commit resulted in IO merges whereas previously there were none.
Without QUEUE_FLAG_NOMERGES, even very basic merging is enabled, so that would fix any smaller amount of reordering. It is in fact a bit stupid that this ever got set by default on the loop driver.
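For reference, the gate is roughly the following (simplified from memory; the helper name and exact signatures may not match any given release of blk-mq.c):

/*
 * Roughly the merge gate in blk-mq.c, simplified from memory: with
 * QUEUE_FLAG_NOMERGES set, blk_queue_nomerges() is true and we never
 * even look for a merge candidate in the plug list or the scheduler.
 */
static bool attempt_bio_merge_sketch(struct request_queue *q,
				     struct bio *bio, unsigned int nr_segs)
{
	if (blk_queue_nomerges(q) || !bio_mergeable(bio))
		return false;
	if (blk_attempt_plug_merge(q, bio, nr_segs))
		return true;
	return blk_mq_sched_bio_merge(q, bio, nr_segs);
}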
6.6.y and 6.1.y were both experiencing the performance issue. I tried porting 667ea36378 to these branches; it applied cleanly and resolved the issue for both. So perhaps we should consider it for the stable trees, but it'd be great if someone from the block layer could chime in with a better idea of what's going on here.
I'm totally fine with backporting the patch.
Cool, thanks. I'll poke around some more next week, but sounds good, let's go ahead with 667ea36378 for 6.6 and 6.1 then.
- leah
On Thu, Oct 3, 2024 at 10:58 PM Christoph Hellwig <hch@lst.de> wrote:
Hi Leah,
On Thu, Oct 03, 2024 at 02:13:52PM -0700, Leah Rumancik wrote:
Hello,
I have been investigating a read performance regression of dm-snapshot on top of loopback in which the read time for a dd command increased from 2min to 40min. I bisected the issue to dc5fc361d89 ("block: attempt direct issue of plug list"). I blktraced before and after this commit and the main difference I saw was that before this commit, when the performance was good, there were a lot of IO unplugs on the loop dev. After this commit, I saw 0 IO unplugs.
/me makes a note that it might make sense to enhance the tracing to show which of the trace_block_unplug call sites did a particular unplug, because that might be helpful here. I suspect the ones you saw were the ones from blk_mq_dispatch_plug_list, which now gets bypassed.
I'm not really sure how that changed things, except that I know in old kernels we had issues with reordering I/O in this path, and maybe your workload hit that? Did the I/O issue order change in the blktrace?
On mainline, I was also able to bisect to a commit which fixed the issue: 667ea36378cf ("loop: don't set QUEUE_FLAG_NOMERGES"). I also blktraced before and after this commit, and unsurprisingly, the main difference was that this commit resulted in IO merges whereas previously there were none.
Without QUEUE_FLAG_NOMERGES, even very basic merging is enabled, so that would fix any smaller amount of reordering. It is in fact a bit stupid that this ever got set by default on the loop driver.
6.6.y and 6.1.y were both experiencing the performance issue. I tried porting 667ea36378 to these branches; it applied cleanly and resolved the issue for both. So perhaps we should consider it for the stable trees, but it'd be great if someone from the block layer could chime in with a better idea of what's going on here.
I'm totally fine with backporting the patch.
On 10/4/24 6:41 PM, Leah Rumancik wrote:
Cool, thanks. I'll poke around some more next week, but sounds good, let's go ahead with 667ea36378 for 6.6 and 6.1 then.
Greg, can you pick up 667ea36378cf for 6.1-stable and 6.6-stable?
On Fri, Oct 04, 2024 at 07:22:24PM -0600, Jens Axboe wrote:
On 10/4/24 6:41 PM, Leah Rumancik wrote:
Cool, thanks. I'll poke around some more next week, but sounds good, let's go ahead with 667ea36378 for 6.6 and 6.1 then.
Greg, can you pick up 667ea36378cf for 6.1-stable and 6.6-stable?
Queued up, thanks!
On 10/5/24 6:22 PM, Sasha Levin wrote:
On Fri, Oct 04, 2024 at 07:22:24PM -0600, Jens Axboe wrote:
On 10/4/24 6:41 PM, Leah Rumancik wrote:
Cool, thanks. I'll poke around some more next week, but sounds good, let's go ahead with 667ea36378 for 6.6 and 6.1 then.
Greg, can you pick up 667ea36378cf for 6.1-stable and 6.6-stable?
Queued up, thanks!
Thanks Sasha!