On Tue, May 27, 2025 at 05:01:24PM -0600, Uday Shankar wrote:
Currently, ublk_drv associates with each hardware queue (hctx) a unique task (called the queue's ubq_daemon), which is the only task allowed to issue COMMIT_AND_FETCH commands against the hctx. If any other task attempts to do so, the command fails immediately with EINVAL. Combined with the block layer architecture, the result is that for each CPU C on the system, there is exactly one ublk server thread allowed to handle I/O submitted on CPU C. This can lead to suboptimal performance under imbalanced load. For an extreme example, suppose all the load is generated on CPUs mapping to a single ublk server thread. That thread may then be fully utilized and become the bottleneck in the system, while the other ublk server threads sit totally idle.
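To make the constraint concrete, here is a minimal kernel-style sketch of the pre-patch ownership rule. The function name is illustrative and the body is simplified; it is not the literal ublk_drv source:

        /*
         * Simplified model of the old rule: the task that issued the
         * queue's FETCH_REQs is recorded as ubq->ubq_daemon, and any
         * command arriving from a different task is rejected outright.
         */
        static int ublk_check_cmd_task(const struct ublk_queue *ubq)
        {
                if (ubq->ubq_daemon != current)
                        return -EINVAL; /* only the queue's daemon may proceed */
                return 0;
        }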
This issue can also be addressed directly in the ublk server, without kernel support, by having threads dequeue I/Os and pass them around to even out the load. But this solution requires inter-thread communication at least twice per I/O (submission and completion), which is generally a bad pattern for performance. The problem gets even worse with zero copy, since additional inter-thread communication is required to make the buffer register/unregister calls come from the correct thread.
Therefore, address this issue in ublk_drv by allowing each I/O to have its own daemon task. Two I/Os in the same queue may now be serviced by different daemon tasks, which was not possible before. Imbalanced load can then be spread across all ublk server threads by having them issue their FETCH_REQs in a round-robin manner. As a small toy example, consider a system with a single ublk device having 2 queues, each of depth 4. A ublk server with 4 threads could issue its FETCH_REQs against this device as follows (each entry is the qid,tag pair that the FETCH_REQ targets):
ublk server thread:     T0      T1      T2      T3
                        0,0     0,1     0,2     0,3
                        1,3     1,0     1,1     1,2
This setup allows load that is concentrated on one hctx/ublk_queue to be spread out across all ublk server threads, alleviating the issue described above.
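One way to realize this assignment is a simple arithmetic mapping from (qid, tag) to a server thread. The helper below is illustrative (not taken from the patch), but it reproduces the table above exactly:

        /*
         * Round-robin FETCH_REQ placement: offsetting by qid makes
         * consecutive queues start on different threads, so no single
         * thread ends up owning a whole queue. For the 2-queue,
         * depth-4, 4-thread example this yields precisely the table
         * shown above.
         */
        static inline unsigned int io_to_thread(unsigned int qid,
                                                unsigned int tag,
                                                unsigned int nthreads)
        {
                return (qid + tag) % nthreads;
        }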
Add the new UBLK_F_PER_IO_DAEMON feature to ublk_drv, which ublk servers can use to essentially test for the presence of this change and tailor their behavior accordingly.
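For reference, a server can probe for the flag at runtime with the existing UBLK_U_CMD_GET_FEATURES control command. Below is a minimal liburing-based sketch; the function name is made up, and the setup and error handling are pared down compared to a real server:

        #include <fcntl.h>
        #include <stdint.h>
        #include <string.h>
        #include <unistd.h>
        #include <liburing.h>
        #include <linux/ublk_cmd.h>

        /* Returns 1 if UBLK_F_PER_IO_DAEMON is supported, 0 if not, <0 on error. */
        static int ublk_has_per_io_daemon(void)
        {
                struct io_uring ring;
                struct io_uring_sqe *sqe;
                struct io_uring_cqe *cqe;
                struct ublksrv_ctrl_cmd *cmd;
                __u64 features = 0;
                int fd, ret;

                fd = open("/dev/ublk-control", O_RDWR);
                if (fd < 0)
                        return -1;

                /* ublk control commands are 128-byte-SQE uring commands */
                ret = io_uring_queue_init(4, &ring, IORING_SETUP_SQE128);
                if (ret < 0)
                        goto out_close;

                sqe = io_uring_get_sqe(&ring);
                memset(sqe, 0, 128);    /* SQE128: clear header and payload */
                sqe->opcode = IORING_OP_URING_CMD;
                sqe->fd = fd;
                sqe->cmd_op = UBLK_U_CMD_GET_FEATURES;

                cmd = (struct ublksrv_ctrl_cmd *)sqe->cmd;
                cmd->dev_id = -1;       /* GET_FEATURES is not tied to a device */
                cmd->addr = (__u64)(uintptr_t)&features;
                cmd->len = sizeof(features);

                io_uring_submit(&ring);
                if (io_uring_wait_cqe(&ring, &cqe) == 0) {
                        ret = cqe->res < 0 ? cqe->res
                                           : !!(features & UBLK_F_PER_IO_DAEMON);
                        io_uring_cqe_seen(&ring, cqe);
                } else {
                        ret = -1;
                }
                io_uring_queue_exit(&ring);
        out_close:
                close(fd);
                return ret;
        }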
Signed-off-by: Uday Shankar <ushankar@purestorage.com>
Reviewed-by: Caleb Sander Mateos <csander@purestorage.com>
This patch looks close to ready, but there is one panic, triggered immediately by the following steps, which I think needs to be addressed first.
Maybe we need to add one such stress test for UBLK_F_PER_IO_DAEMON too.
1) run heavy IO:
[root@ktest-40 ublk]# ./kublk add -t null -q 2 --nthreads 4 --per_io_tasks
dev id 0: nr_hw_queues 2 queue_depth 128 block size 512 dev_capacity 524288000
        max rq size 1048576 daemon pid 1283 flags 0x2042 state LIVE
        queue 0: affinity(0 )
        queue 1: affinity(8 )
[root@ktest-40 ublk]#
[root@ktest-40 ublk]# ~/git/fio/t/io_uring -p 0 -n 8 /dev/ublkb0
Or
fio --numjobs=8 --ioengine=libaio --iodepth=128 --iodepth_batch_submit=32 \
    --iodepth_batch_complete_min=32
2) panic immediately:
[ 51.297750] BUG: kernel NULL pointer dereference, address: 0000000000000000
[ 51.298719] #PF: supervisor read access in kernel mode
[ 51.299403] #PF: error_code(0x0000) - not-present page
[ 51.300069] PGD 1161c8067 P4D 1161c8067 PUD 11a793067 PMD 0
[ 51.300825] Oops: Oops: 0000 [#1] SMP NOPTI
[ 51.301389] CPU: 0 UID: 0 PID: 1285 Comm: kublk Not tainted 6.15.0+ #288 PREEMPT(full)
[ 51.302375] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.3-1.fc39 04/01/2014
[ 51.303551] RIP: 0010:io_uring_cmd_done+0xa7/0x1d0
[ 51.304226] Code: 48 89 f1 48 89 f0 48 83 e1 bf 80 cc 01 48 81 c9 00 01 80 00 83 e6 40 48 0f 45 c1 48 89 43 48 44 89 6b 58 c7 43 5c 00 00 00 00 <8b> 07 f6 c4 08 74 12 48 89 93 e8 00 00 0
[ 51.306554] RSP: 0018:ffffd1da436e3a40 EFLAGS: 00010246
[ 51.307253] RAX: 0000000000000100 RBX: ffff8d9cd3737300 RCX: 0000000000000001
[ 51.308178] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 51.309333] RBP: 0000000000000001 R08: 0000000000000018 R09: 0000000000190015
[ 51.310744] R10: 0000000000190015 R11: 0000000000000035 R12: ffff8d9cd1c7c000
[ 51.311986] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[ 51.313386] FS:  00007f2c293916c0(0000) GS:ffff8da179df6000(0000) knlGS:0000000000000000
[ 51.314899] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 51.315926] CR2: 0000000000000000 CR3: 00000001161c9002 CR4: 0000000000772ef0
[ 51.317179] PKRU: 55555554
[ 51.317682] Call Trace:
[ 51.318040]  <TASK>
[ 51.318355]  ublk_cmd_list_tw_cb+0x30/0x40 [ublk_drv]
[ 51.319061]  __io_run_local_work_loop+0x72/0x80
[ 51.319696]  __io_run_local_work+0x69/0x1e0
[ 51.320274]  io_cqring_wait+0x8f/0x6a0
[ 51.320794]  __do_sys_io_uring_enter+0x500/0x770
[ 51.321422]  do_syscall_64+0x82/0x170
[ 51.321891]  ? __do_sys_io_uring_enter+0x500/0x770
Thanks,
Ming