On Fri, Jun 7, 2024 at 8:47 AM Pavel Begunkov <asml.silence@gmail.com> wrote:
On 6/7/24 16:42, Pavel Begunkov wrote:
On 6/7/24 15:27, David Ahern wrote:
On 6/7/24 7:42 AM, Pavel Begunkov wrote:
I haven't seen any arguments against it from the (net) maintainers so far. Nor do I see any objection to callbacks from them (considering that either option adds an if).
I have said before that I do not understand why the dmabuf paradigm is not sufficient for both device memory and host memory. The tradeoff is a less-than-ideal control path that puts hostmem in a dmabuf wrapper vs. extra checks and changes in the datapath. The former should always be preferred.
If we're talking about types of memory specifically, I'm not strictly against wrapping it in a dmabuf in the kernel, but that just doesn't gain us anything.
And the reason I don't have too strong an opinion on that is mainly that it's just the setup/cleanup path.
I agree that wrapping io_uring memory in a dmabuf seems to be an unnecessary detour. I never understood the need or upside to do that, but it could be a lack of understanding on my part.
However, the concern that David brings up may materialize. I've had to spend a lot of time minimizing or justifying checks added to the code, using page pool benchmarks that detect even 1-cycle regressions. You may be asked to run the same benchmarks and minimize similar overhead.
The benchmark in question is Jesper's bench_page_pool_simple. I've forked it and applied it on top of net-next here: https://github.com/mina/linux/commit/927596f87ab5791a8a6ba8597ba2189747396e5...
As io_uring ZC comes close to merging, I suspect it would be good to run this to understand the regression in the fast path, if any. If there is little to no regression, I have no concerns over io_uring memory not being wrapped in dmabufs, and David may agree as well.
-- Thanks, Mina