On 6/10/24 16:16, David Ahern wrote:
On 6/10/24 6:16 AM, Jason Gunthorpe wrote:
On Mon, Jun 10, 2024 at 02:07:01AM +0100, Pavel Begunkov wrote:
On 6/10/24 01:37, David Wei wrote:
On 2024-06-07 17:52, Jason Gunthorpe wrote:
IMHO it seems to compose poorly if you can only use the io_uring lifecycle model with io_uring registered memory, and not with DMABUF memory registered through Mina's mechanism.
By this, do you mean io_uring must be exclusively used to use this feature?
And you'd rather see the two decoupled, so userspace can register w/ say dmabuf then pass it to io_uring?
Personally, I have no clue what Jason means. You can just as well say that it's poorly composable that write(2) to a disk cannot post a completion into a XDP ring, or a netlink socket, or io_uring's main completion queue, or name any other API.
There is no reason you shouldn't be able to use your fast io_uring completion and lifecycle flow with DMABUF backed memory. Those are not widly different things and there is good reason they should work together.
Let's not mix up devmem TCP and dmabuf specifically, as I see it your question was concerning the latter: "... DMABUF memory registered through Mina's mechanism". io_uring's zcrx can trivially get dmabuf support in future, as mentioned it's mostly the setup side. ABI, buffer workflow and some details is a separate issue, and I don't see how further integration aside from what we're already sharing is beneficial, on opposite it'll complicate things.
Pretending they are totally different just because two different people wrote them is a very siloed view.
io_uring zcrx and devmem? They are not, nobody is saying otherwise, _very_ similar approaches if anything but with different API, which is the reason we already use common infra.
The devmem TCP callback can implement it in a way feasible to the project, but it cannot directly post events to an unrelated API like io_uring. And devmem attaches buffers to a socket, for which a ring for returning buffers might even be a nuisance.
If you can't compose your io_uring completion mechanism with a DMABUF provided backing store then I think it needs more work.
As per above, it conflates devmem TCP with dmabuf.
exactly. io_uring, page_pool, dmabuf - all kernel building blocks for solutions. This why I was pushing for Mina's set not to be using the name `devmem` - it is but one type of memory and with dmabuf it should not matter if it is gpu or host (or something else later on - cxl?).