This patch set aims to allow ublk server threads to better balance load amongst themselves by decoupling server threads from ublk_queues/hctxs, so that multiple threads can service I/Os that are issued from a single CPU. This can improve performance for workloads in which ublk server CPU is a bottleneck, and for which load is issued from CPUs which are not balanced across ublk_queues/hctxs.
Performance
-----------
First create two ublk devices with:
ublkb0: ./kublk add -t null -q 2 --nthreads 2
ublkb1: ./kublk add -t null -q 2 --nthreads 2 --per_io_tasks
Then run load with:
taskset -c 1 fio/t/io_uring -r5 -p0 /dev/ublkb0: 1.90M IOPS
taskset -c 1 fio/t/io_uring -r5 -p0 /dev/ublkb1: 2.18M IOPS
Since ublkb1 was created with --per_io_tasks, the second command is able to make use of both ublk server worker threads and therefore reaches a higher maximum throughput.
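
To illustrate why pinning the load generator to one CPU leaves a server thread idle in the per-queue case: blk-mq maps each submitting CPU to a single hctx, so all I/O from CPU 1 lands on one ublk_queue. The sketch below is a userspace model, not code from this series. The names (struct geom, thread_for_io_*) are hypothetical, the queue depth is chosen by the caller, and the round-robin dealing of (q_id, tag) pairs to threads is an assumed policy used only for illustration.

```c
/* Userspace model only. All names here are hypothetical, not kublk/ublk_drv. */

struct geom {
	unsigned nr_queues;	/* -q */
	unsigned queue_depth;	/* assumed value, chosen by the caller */
	unsigned nr_threads;	/* --nthreads, must be <= 64 in this sketch */
};

/* Per-queue daemon: every tag of a queue is served by the same thread. */
static unsigned thread_for_io_per_queue(const struct geom *g,
					unsigned q_id, unsigned tag)
{
	(void)tag;
	return q_id % g->nr_threads;
}

/*
 * Per-I/O daemon (assumed policy): order the (q_id, tag) pairs with q_id
 * as the major key and deal them out to the threads round-robin.
 */
static unsigned thread_for_io_per_io(const struct geom *g,
				     unsigned q_id, unsigned tag)
{
	unsigned idx = q_id * g->queue_depth + tag;

	return idx % g->nr_threads;
}

/* Count the distinct threads that serve the I/Os of a single queue. */
static unsigned threads_serving_queue(const struct geom *g, unsigned q_id,
				      int per_io)
{
	unsigned long long seen = 0;
	unsigned count = 0;

	for (unsigned tag = 0; tag < g->queue_depth; tag++) {
		unsigned t = per_io ? thread_for_io_per_io(g, q_id, tag)
				    : thread_for_io_per_queue(g, q_id, tag);

		seen |= 1ULL << t;
	}
	/* Clear the lowest set bit on each iteration to count set bits. */
	for (; seen; seen &= seen - 1)
		count++;
	return count;
}
```

With a geometry like the one used above (2 queues, 2 threads, and an assumed depth of 128), threads_serving_queue() reports one thread for queue 0 in the per-queue model and two in the per-I/O model, which is the effect the ublkb1 number relies on.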
Caveats:
- This testing was done on a system with 2 NUMA nodes, and the penalty of having I/O cross a NUMA (or LLC) boundary in the per_io_tasks case is quite high. These numbers were therefore obtained after moving all ublk server threads and the application threads to CPUs on the same NUMA node/LLC.
- One might expect the scaling to be linear: because ublkb1 can make use of twice as many ublk server threads, it should be able to drive twice the throughput. However, this is not the case (the improvement is ~15%), and this needs further investigation.

Signed-off-by: Uday Shankar <ushankar@purestorage.com>
---
Changes in v7:
- Fix queue_rqs batch dispatch for per-io daemons
- Kick round-robin tag allocation changes to a followup
- Add explicit feature flag for per-task daemons (Ming Lei, Caleb Sander Mateos)
- Move some variable assignments to avoid redundant computation (Caleb Sander Mateos)
- Switch from storing pointers in ublk_io to computing based on address with container_of in a couple places (Ming Lei)
- Link to v6: https://lore.kernel.org/r/20250507-ublk_task_per_io-v6-0-a2a298783c01@purest...

Changes in v6:
- Add a feature flag for this feature, called UBLK_F_RR_TAGS (Ming Lei)
- Add test for this feature (Ming Lei)
- Add documentation for this feature (Ming Lei)
- Link to v5: https://lore.kernel.org/r/20250416-ublk_task_per_io-v5-0-9261ad7bff20@purest...

Changes in v5:
- Set io->task before ublk_mark_io_ready (Caleb Sander Mateos)
- Set io->task atomically, read it atomically when needed
- Return 0 on success from command-specific helpers in __ublk_ch_uring_cmd (Caleb Sander Mateos)
- Rename ublk_handle_need_get_data to ublk_get_data (Caleb Sander Mateos)
- Link to v4: https://lore.kernel.org/r/20250415-ublk_task_per_io-v4-0-54210b91a46f@purest...

Changes in v4:
- Drop "ublk: properly serialize all FETCH_REQs" since Ming is taking it in another set
- Prevent data races by marking data structures which should be read-only in the I/O path as const (Ming Lei)
- Link to v3: https://lore.kernel.org/r/20250410-ublk_task_per_io-v3-0-b811e8f4554a@purest...

Changes in v3:
- Check for UBLK_IO_FLAG_ACTIVE on I/O again after taking lock to ensure that two concurrent FETCH_REQs on the same I/O can't succeed (Caleb Sander Mateos)
- Link to v2: https://lore.kernel.org/r/20250408-ublk_task_per_io-v2-0-b97877e6fd50@purest...

Changes in v2:
- Remove changes split into other patches
- To ease error handling/synchronization, associate each I/O (instead of each queue) to the last task that issues a FETCH_REQ against it. Only that task is allowed to operate on the I/O.
- Link to v1: https://lore.kernel.org/r/20241002224437.3088981-1-ushankar@purestorage.com
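
The ownership rule that the changelog describes (each I/O belongs to the last task that issued a FETCH_REQ against it, and io->task is set and read atomically) can be modeled in userspace roughly as follows. This is a sketch under stated assumptions, not the ublk_drv.c implementation: struct io_slot, io_fetch() and io_commit() are hypothetical names, tasks are represented as plain integers, and the -EINVAL return for a non-owner is an assumed choice.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical model of one I/O slot; 0 means "not fetched by any task". */
struct io_slot {
	_Atomic uintptr_t owner;
};

/* Model of FETCH_REQ: the most recent fetcher becomes the owner. */
static void io_fetch(struct io_slot *io, uintptr_t task)
{
	atomic_store_explicit(&io->owner, task, memory_order_release);
}

/*
 * Model of a later command on the same I/O (for example a commit).
 * Only the owning task may operate on it; the error code is assumed.
 */
static int io_commit(struct io_slot *io, uintptr_t task)
{
	if (atomic_load_explicit(&io->owner, memory_order_acquire) != task)
		return -EINVAL;
	return 0;
}
```

In this model a commit from task 2 fails after task 1 has fetched the slot, and succeeds only once task 2 has issued its own fetch. The model deliberately omits the locking and the UBLK_IO_FLAG_ACTIVE re-check mentioned in the v3 notes.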

---
Uday Shankar (8):
      ublk: have a per-io daemon instead of a per-queue daemon
      selftests: ublk: kublk: plumb q_id in io_uring user_data
      selftests: ublk: kublk: tie sqe allocation to io instead of queue
      selftests: ublk: kublk: lift queue initialization out of thread
      selftests: ublk: kublk: move per-thread data out of ublk_queue
      selftests: ublk: kublk: decouple ublk_queues from ublk server threads
      selftests: ublk: add test for per io daemons
      Documentation: ublk: document UBLK_F_PER_IO_DAEMON

 Documentation/block/ublk.rst                    |  35 ++-
 drivers/block/ublk_drv.c                        | 108 +++----
 include/uapi/linux/ublk_cmd.h                   |   9 +
 tools/testing/selftests/ublk/Makefile           |   1 +
 tools/testing/selftests/ublk/fault_inject.c     |   4 +-
 tools/testing/selftests/ublk/file_backed.c      |  20 +-
 tools/testing/selftests/ublk/kublk.c            | 345 ++++++++++++++-------
 tools/testing/selftests/ublk/kublk.h            |  73 +++--
 tools/testing/selftests/ublk/null.c             |  22 +-
 tools/testing/selftests/ublk/stripe.c           |  17 +-
 tools/testing/selftests/ublk/test_generic_12.sh |  55 ++++
 .../selftests/ublk/trace/count_ios_per_tid.bt   |  11 +
 12 files changed, 470 insertions(+), 230 deletions(-)
---
base-commit: 533c87e2ed742454957f14d7bef9f48d5a72e72d
change-id: 20250408-ublk_task_per_io-c693cf608d7a

Best regards,