On Mon, Jun 10, 2024 at 02:38:18PM +0200, Christian König wrote:
Am 10.06.24 um 14:16 schrieb Jason Gunthorpe:
On Mon, Jun 10, 2024 at 02:07:01AM +0100, Pavel Begunkov wrote:
On 6/10/24 01:37, David Wei wrote:
On 2024-06-07 17:52, Jason Gunthorpe wrote:
IMHO it seems to compose poorly if you can only use the io_uring lifecycle model with io_uring registered memory, and not with DMABUF memory registered through Mina's mechanism.
By this, do you mean io_uring must be exclusively used to use this feature?
And you'd rather see the two decoupled, so userspace can register w/ say dmabuf then pass it to io_uring?
Personally, I have no clue what Jason means. You can just as well say that it's poorly composable that write(2) to a disk cannot post a completion into an XDP ring, or a netlink socket, or io_uring's main completion queue, or name any other API.
There is no reason you shouldn't be able to use your fast io_uring completion and lifecycle flow with DMABUF backed memory. Those are not wildly different things and there is good reason they should work together.
Well, there is the fundamental problem that you can't use io_uring to implement the semantics necessary for a dma_fence.
That's why we had to reject the io_uring work on DMA-buf sharing from Google a few years ago.
But this only affects the dma_fence synchronization part of DMA-buf, *not* the general buffer sharing.
More precisely, it only impacts the userspace/data access implicit synchronization part of dma-buf. For tracking buffer movements like on invalidations/refault with a dynamic dma-buf importer/exporter I think the dma-fence rules are acceptable. At least they've been for rdma drivers.
But the escape hatch is to (temporarily) pin the dma-buf, which is exactly what direct I/O also does when accessing pages. So aside from the still unsolved question on how we should account/track pinned dma-buf, there shouldn't be an issue. Or at least I'm failing to see one.
And for synchronizing data access, the dma-fence stuff on dma-buf is anyway rather deprecated on the GPU side too, exactly because of all these limitations. On the GPU side we've been moving to free-standing drm_syncobj instead, but those are fairly GPU specific, and any other subsystem should be able to just reuse what it already has to signal transaction completions.
Cheers, Sima
Regards, Christian.
Pretending they are totally different just because two different people wrote them is a very siloed view.
The devmem TCP callback can implement it in a way feasible to the project, but it cannot directly post events to an unrelated API like io_uring. And devmem attaches buffers to a socket, for which a ring for returning buffers might even be a nuisance.
If you can't compose your io_uring completion mechanism with a DMABUF provided backing store then I think it needs more work.
Jason