On 09.11.23 at 04:20, Mina Almasry wrote:
[SNIP]

      
But we might be able to do something like what folio is doing now: the
mm subsystem still sees 'struct folio/page', but other subsystems like
slab use 'struct slab', and there are still some common fields shared
between 'struct folio' and 'struct slab'.

In my eyes this is almost exactly what I suggested in RFC v1 and got
immediately nacked, with no room to negotiate. What we did for v1 was
to allocate struct pages for dma-bufs to make them look like struct
pages to the mm subsystem - almost exactly what you're describing
above. It's a no-go. I don't think renaming struct page to netmem is
going to move the needle (it also re-introduces code churn). What I
feel like I learnt is that dma-bufs are not struct pages and can't be
made to look like them.

Yeah, that pretty much hits the nail on the head. What hasn't been mentioned yet, and what you could potentially try, is to go the other way around.

In other words, instead of importing a DMA-buf file descriptor into the page-pool, you take some netdev page-pool pages and export them as a DMA-buf handle.
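
Just to illustrate the direction, a stripped-down exporter for that could look roughly like the following. The netpool_* names and the page array are completely made up; only the dma_buf_export()/dma_buf_ops machinery is the real interface:

#include <linux/dma-buf.h>
#include <linux/dma-mapping.h>
#include <linux/err.h>
#include <linux/fcntl.h>
#include <linux/scatterlist.h>
#include <linux/slab.h>

/* Made-up exporter state: pages still owned by the netdev page pool. */
struct netpool_dmabuf {
        struct page **pages;
        unsigned int nr_pages;
};

/* Map the pages for the importing device (e.g. the GPU). */
static struct sg_table *netpool_map(struct dma_buf_attachment *attach,
                                    enum dma_data_direction dir)
{
        struct netpool_dmabuf *priv = attach->dmabuf->priv;
        struct sg_table *sgt;
        int ret;

        sgt = kzalloc(sizeof(*sgt), GFP_KERNEL);
        if (!sgt)
                return ERR_PTR(-ENOMEM);

        ret = sg_alloc_table_from_pages(sgt, priv->pages, priv->nr_pages, 0,
                                        (unsigned long)priv->nr_pages << PAGE_SHIFT,
                                        GFP_KERNEL);
        if (ret)
                goto err_free;

        ret = dma_map_sgtable(attach->dev, sgt, dir, 0);
        if (ret)
                goto err_table;

        return sgt;

err_table:
        sg_free_table(sgt);
err_free:
        kfree(sgt);
        return ERR_PTR(ret);
}

static void netpool_unmap(struct dma_buf_attachment *attach,
                          struct sg_table *sgt,
                          enum dma_data_direction dir)
{
        dma_unmap_sgtable(attach->dev, sgt, dir, 0);
        sg_free_table(sgt);
        kfree(sgt);
}

static void netpool_release(struct dma_buf *dmabuf)
{
        /* Hand the pages back to the page pool here. */
}

static const struct dma_buf_ops netpool_dmabuf_ops = {
        .map_dma_buf   = netpool_map,
        .unmap_dma_buf = netpool_unmap,
        .release       = netpool_release,
};

static struct dma_buf *netpool_export(struct netpool_dmabuf *priv)
{
        DEFINE_DMA_BUF_EXPORT_INFO(exp_info);

        exp_info.ops   = &netpool_dmabuf_ops;
        exp_info.size  = (size_t)priv->nr_pages << PAGE_SHIFT;
        exp_info.flags = O_RDWR;
        exp_info.priv  = priv;

        return dma_buf_export(&exp_info);
}

The fd you hand to userspace or to the importing driver then simply comes from dma_buf_fd() on the returned object.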

All you then need to do is implement the necessary DMA-buf interface, e.g. wait for DMA-fences before accessing the buffer, and when you have an async operation in flight, install a DMA-fence so that others can wait for your operation to finish, etc.
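
For the RX case, where the NIC writes into the exported buffer, that boils down to something like this (netpool_begin_rx_dma() and the NIC fence are made up for illustration; the dma_resv_* calls on dmabuf->resv are the real interface):

#include <linux/dma-buf.h>
#include <linux/dma-fence.h>
#include <linux/dma-resv.h>
#include <linux/sched.h>

/*
 * Before the NIC DMAs into the exported buffer: wait for all other
 * users, then publish our own fence so importers (e.g. the GPU) can
 * wait for the network DMA to finish.
 */
static int netpool_begin_rx_dma(struct dma_buf *dmabuf,
                                struct dma_fence *nic_fence)
{
        long lret;
        int ret;

        ret = dma_resv_lock(dmabuf->resv, NULL);
        if (ret)
                return ret;

        /* We are going to write, so wait for all readers and writers. */
        lret = dma_resv_wait_timeout(dmabuf->resv, DMA_RESV_USAGE_READ,
                                     true, MAX_SCHEDULE_TIMEOUT);
        if (lret < 0) {
                ret = lret;
                goto out_unlock;
        }

        /* Install our fence; it must signal when the RX DMA has finished. */
        ret = dma_resv_reserve_fences(dmabuf->resv, 1);
        if (ret)
                goto out_unlock;
        dma_resv_add_fence(dmabuf->resv, nic_fence, DMA_RESV_USAGE_WRITE);

out_unlock:
        dma_resv_unlock(dmabuf->resv);
        return ret;
}

In a real driver you of course wouldn't block synchronously in the hot path but chain the fences up asynchronously; this is just to show which objects are involved.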

This still doesn't give you peer-to-peer over, for example, the PCIe bus, but it would give you zero copy in the sense that a GPU or other acceleration device directly writes its data into memory a network device can work with.

We already have some similar functionality for at least Intel and AMD hw, where a userptr mechanism is used to make malloced memory (backed by normal struct pages) available to the GPU. The problem with this approach is that most GPUs currently can't do good recoverable page faults, which in turn makes the whole MMU-notifier-based approach horribly inefficient.
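
Just so you can see where the pain comes from: the invalidation side of such a userptr mechanism is roughly the following (the userptr_* names are made up; mmu_interval_notifier_insert() and friends are the real API). Every time the CPU page tables change you get a callback and have to throw away the GPU mapping, and without recoverable page faults the GPU has to be stalled while you do that:

#include <linux/mm.h>
#include <linux/mmu_notifier.h>
#include <linux/sched.h>

/* Made-up per-buffer state for a userptr GPU mapping. */
struct userptr_range {
        struct mmu_interval_notifier notifier;
        /* driver-specific GPU page table state would live here */
};

static bool userptr_invalidate(struct mmu_interval_notifier *mni,
                               const struct mmu_notifier_range *range,
                               unsigned long cur_seq)
{
        /* Non-blockable contexts get retried by the core as blockable. */
        if (!mmu_notifier_range_blockable(range))
                return false;

        mmu_interval_set_seq(mni, cur_seq);

        /*
         * Stall the GPU and tear down its mapping of the range here.
         * This is the part that gets horribly expensive when the GPU
         * can't simply fault the pages back in later.
         */
        return true;
}

static const struct mmu_interval_notifier_ops userptr_notifier_ops = {
        .invalidate = userptr_invalidate,
};

static int userptr_register(struct userptr_range *r,
                            unsigned long start, unsigned long length)
{
        return mmu_interval_notifier_insert(&r->notifier, current->mm,
                                            start, length,
                                            &userptr_notifier_ops);
}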

Just giving a few more pointers you might want to look into...

Cheers,
Christian.