On 8/13/24 22:13, Mina Almasry wrote:
Implement a memory provider that allocates dmabuf devmem in the form of net_iov.
The provider receives a reference to the struct netdev_dmabuf_binding via the pool->mp_priv pointer. The driver needs to set this pointer for the provider in the net_iov.
The provider obtains a reference on the netdev_dmabuf_binding which guarantees the binding and the underlying mapping remains alive until the provider is destroyed.
Usage of PP_FLAG_DMA_MAP is required for this memory provider such that the page_pool can provide the driver with the dma-addrs of the devmem.
Support for PP_FLAG_DMA_SYNC_DEV is omitted for simplicity & p.order != 0.
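As a rough, hypothetical sketch of the driver side (not part of this patch; the helper name, the dma device, and the rx-queue hookup field are assumptions that have shifted across revisions of the series), a payload pool that can serve devmem might be set up along these lines:

/* Hypothetical driver-side sketch: a payload page_pool that may be backed
 * by the dmabuf devmem provider.  PP_FLAG_DMA_MAP is mandatory so the pool
 * hands the driver dma-addrs of the devmem; order must be 0;
 * PP_FLAG_ALLOW_UNREADABLE_NETMEM (added in v19) opts this pool in to
 * net_iov-backed memory, while a header pool would leave it unset.
 * The .queue hookup below is an assumption about this revision.
 */
static struct page_pool *my_create_payload_pool(struct device *dma_dev,
						struct netdev_rx_queue *rxq,
						unsigned int ring_size)
{
	struct page_pool_params pp = {
		.flags		= PP_FLAG_DMA_MAP | PP_FLAG_ALLOW_UNREADABLE_NETMEM,
		.order		= 0,
		.pool_size	= ring_size,
		.nid		= NUMA_NO_NODE,
		.dev		= dma_dev,
		.dma_dir	= DMA_FROM_DEVICE,
		.queue		= rxq,	/* lets page_pool_init() find the queue's binding */
	};

	return page_pool_create(&pp);
}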
Signed-off-by: Willem de Bruijn <willemb@google.com>
Signed-off-by: Kaiyuan Zhang <kaiyuanz@google.com>
Signed-off-by: Mina Almasry <almasrymina@google.com>
Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
v19:
- Add PP_FLAG_ALLOW_UNREADABLE_NETMEM flag. It serves 2 purposes, (a) it guards drivers that don't support unreadable netmem (net_iov backed) from accidentally getting exposed to it, and (b) drivers that wish to create header pools can unset it for that pool to force readable netmem.
- Add page_pool_check_memory_provider, which verifies that the driver has created a page_pool with the expected configuration. This is used to report to the user if the mp configuration succeeded, and also verify that the driver is doing the right thing.
- Don't reset niov->dma_addr on allocation/free.
v17:
- Use ASSERT_RTNL (Jakub)
v16:
- Add DEBUG_NET_WARN_ON_ONCE(!rtnl_is_locked()) to catch cases where page_pool_init is called without rtnl locking while a queue is provided. In that case the queue configuration may be changed while we're initing the page_pool, which could be a race. (A sketch of this guard follows the changelog.)
v13:
- Return on warning (Pavel).
- Fixed pool->recycle_stats not being freed on error (Pavel).
- Applied reviewed-by from Pavel.
v11:
- Rebase to not use the ops. (Christoph)
v8:
- Use skb_frag_size instead of frag->bv_len to fix patch-by-patch build error
v6:
- refactor new memory provider functions into net/core/devmem.c (Pavel)
v2:
- Disable devmem for p.order != 0
v1:
- static_branch check in page_is_page_pool_iov() (Willem & Paolo).
- PP_DEVMEM -> PP_IOV (David).
- Require PP_FLAG_DMA_MAP (Jakub).
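To make the v16/v17 locking notes above concrete, here is a minimal sketch of the guard as I read it; whether the queue arrives via a p.queue pointer or a netdev/queue-index pair, and whether the binding sits in rxq->mp_params, are assumptions about this revision:

	/* Sketch only (field names approximate): in page_pool_init(), if the
	 * pool was created with an rx queue attached it may pick up a devmem
	 * binding from that queue, so the queue configuration must not change
	 * underneath us.  v16 used DEBUG_NET_WARN_ON_ONCE(!rtnl_is_locked());
	 * v17 switched to ASSERT_RTNL() as suggested by Jakub.
	 */
	if (pool->p.queue) {
		ASSERT_RTNL();
		pool->mp_priv = pool->p.queue->mp_params.mp_priv;
	}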
...
diff --git a/net/core/devmem.c b/net/core/devmem.c
index 301f4250ca82..2f2a7f4dee4c 100644
--- a/net/core/devmem.c
+++ b/net/core/devmem.c
@@ -17,6 +17,7 @@
 #include <linux/genalloc.h>
 #include <linux/dma-buf.h>
 #include <net/devmem.h>
+#include <net/mp_dmabuf_devmem.h>
 #include <net/netdev_queues.h>
 
 #include "page_pool_priv.h"
@@ -153,6 +154,10 @@ int net_devmem_bind_dmabuf_to_queue(struct net_device *dev, u32 rxq_idx,
 	if (err)
 		goto err_xa_erase;
 
+	err = page_pool_check_memory_provider(dev, rxq, binding);
Frankly, I pretty much don't like it.
1. We do it after reconfiguring the queue just to fail and reconfigure it again.
2. It should be a part of the common path like netdev_rx_queue_restart(), not specific to devmem TCP.
These two can be fixed by moving the check into netdev_rx_queue_restart(), just after ->ndo_queue_mem_alloc, assuming that's the callback where we init page pools.
3. That implicit check gives me a bad feeling. Instead of getting direct feedback from the driver, whether a flag or a returned error, we have to try to figure out what exactly the driver did, with a high chance this inference will fail us at some point.
And page_pool_check_memory_provider() is not that straightforward: it doesn't walk through the pools of a queue. Without looking too deep, it seems the nested loop can be moved out with the same effect, so that it first looks for a matching pool in the device and then checks the bound_rxqs (a rough sketch of that de-nesting follows at the end of this list). And it seems the bound_rxqs check would always turn out true: the binding is added to the map in net_devmem_bind_dmabuf_to_queue() before the restart, so it will still be there after the restart for page_pool_check_memory_provider() to find. Maybe I missed something, but it's not super clear.
4. And the last thing Jakub mentioned is that we need to be prepared to expose a flag to userspace for whether a queue supports netiov. That's not really doable in a sane manner with such implicit post-configuration checks.
And that brings us back to the first approach I mentioned, where we have a flag in the queue structure, drivers set it, and netdev_rx_queue_restart() checks it before any callback. That's where the thread with Jakub stopped, and it reads like at least he's not against the idea.
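To illustrate the shape of it (the names are made up, and I'm assuming the binding is stashed in rxq->mp_params as in this series), something like the below, called from netdev_rx_queue_restart() before any ->ndo_queue_* callback:

/* Illustration only, names invented: the driver declares at queue setup
 * time whether this rx queue can take unreadable (net_iov backed) memory,
 * and the core rejects a devmem binding up front instead of inferring
 * support from the page pools after the restart.
 */
static int netdev_rx_queue_check_unreadable(struct netdev_rx_queue *rxq)
{
	if (rxq->mp_params.mp_priv && !rxq->unreadable_netmem_ok)
		return -EOPNOTSUPP;
	return 0;
}

A flag like that would also be straightforward to report to userspace, which ties back to point 4.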
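And roughly what I have in mind for point 3, just an untested sketch based on the hunk quoted below:

/* Untested sketch: the rxq lookup doesn't depend on which pool matched the
 * binding, so the nested loops can be flattened into two independent checks
 * with the same end result.
 */
static bool dev_has_pool_for_binding(struct net_device *dev,
				     struct net_devmem_dmabuf_binding *binding)
{
	struct page_pool *pool;
	struct hlist_node *n;

	hlist_for_each_entry_safe(pool, n, &dev->page_pools, user.list)
		if (pool->mp_priv == binding)
			return true;
	return false;
}

static bool binding_has_rxq(struct net_devmem_dmabuf_binding *binding,
			    struct netdev_rx_queue *rxq)
{
	struct netdev_rx_queue *binding_rxq;
	unsigned long xa_idx;

	xa_for_each(&binding->bound_rxqs, xa_idx, binding_rxq)
		if (rxq == binding_rxq)
			return true;
	return false;
}

int page_pool_check_memory_provider(struct net_device *dev,
				    struct netdev_rx_queue *rxq,
				    struct net_devmem_dmabuf_binding *binding)
{
	bool found;

	mutex_lock(&page_pools_lock);
	found = dev_has_pool_for_binding(dev, binding);
	mutex_unlock(&page_pools_lock);

	/* As said above, this second check looks like it always succeeds,
	 * since the binding is added to bound_rxqs before the restart.
	 */
	return found && binding_has_rxq(binding, rxq) ? 0 : -ENODATA;
}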
+	if (err)
+		goto err_xa_erase;
+
 	return 0;
 
 err_xa_erase:
@@ -305,4 +310,69 @@ void dev_dmabuf_uninstall(struct net_device *dev)
 			xa_erase(&binding->bound_rxqs, xa_idx);
 	}
 }
...
diff --git a/net/core/page_pool_user.c b/net/core/page_pool_user.c
index 3a3277ba167b..cbc54ee4f670 100644
--- a/net/core/page_pool_user.c
+++ b/net/core/page_pool_user.c
@@ -344,6 +344,32 @@ void page_pool_unlist(struct page_pool *pool)
 	mutex_unlock(&page_pools_lock);
 }
 
+int page_pool_check_memory_provider(struct net_device *dev,
+				    struct netdev_rx_queue *rxq,
+				    struct net_devmem_dmabuf_binding *binding)
+{
+	struct netdev_rx_queue *binding_rxq;
+	struct page_pool *pool;
+	struct hlist_node *n;
+	unsigned long xa_idx;
+
+	mutex_lock(&page_pools_lock);
+	hlist_for_each_entry_safe(pool, n, &dev->page_pools, user.list) {
+		if (pool->mp_priv != binding)
+			continue;
+
+		xa_for_each(&binding->bound_rxqs, xa_idx, binding_rxq) {
+			if (rxq != binding_rxq)
+				continue;
+
+			mutex_unlock(&page_pools_lock);
+			return 0;
+		}
+	}
+	mutex_unlock(&page_pools_lock);
+	return -ENODATA;
+}
+
 static void page_pool_unreg_netdev_wipe(struct net_device *netdev)
 {
 	struct page_pool *pool;