On Mon, Jun 08, 2026 at 03:59:04PM +0200, Christian König wrote:
On 6/8/26 15:55, Bobby Eshleman wrote:
On Sun, Jun 7, 2026 at 11:42 PM Christian König <christian.koenig@amd.com> wrote:
On 6/5/26 20:44, Bobby Eshleman wrote:
> On Fri, Jun 05, 2026 at 11:30:07AM +0200, Christian König wrote:
>> On 6/4/26 02:42, Bobby Eshleman wrote:
>>> From: Bobby Eshleman <bobbyeshleman@meta.com>
>>>
>>> get_sg_table() emitted one PAGE_SIZE sg entry per page even when the
>>> underlying folio was larger.
>>>
>>> Instead, walk folios[] and emit one sg entry per folio. When folios
>>> represent large pages (as is for MFD_HUGETLB), each sg entry is a large
>>> page. Normal PAGE_SIZE sg tables are unchanged.
>>>
>>> Required by net/core/devmem to support rx-buf-size > PAGE_SIZE with
>>> udmabuf.
>>
>> That doesn't explain why this is required.
>
> Sure, can definitely add. Devmem currently requires dmabuf sg entries to
> be length and size aligned when it allocates niovs for NIC page pools.
> Though udmabuf is not violating any dmabuf contract by emitting
> PAGE_SIZE entries and the above restriction is probably more a
> shortfalling of devmem, by emitting a single entry per folio this patch
> allows udmabuf to be used by devmem for large pages.
>
>>
>> Please note that accessing the pages/folio of an sg-table returned by DMA-buf is illegal and strictly forbidden!
>>
>> Regards,
>> Christian.
>
> It seems both devmem and io_uring zcrx at least introspect through to
> the sg-table to build NIC page pools (not accessing the memory itself,
> however). Is there a better way?

That's an absolute NO-GO! We need to stop that immediately.

Touching the underlying struct page of an DMA-buf exported sg-table is strictly forbidden. We even have code to wrap the sg_table and hide the struct pages on debug builds to catch those issues, see function dma_buf_wrap_sg_table().

My last status is that the NIC page pools are build directly from the DMA addresses exposed by the sg_table. Was there any change I'm not aware of?

Regards,
Christian.

Oh no change, your mental model is still current. They just go through each sg and use sg_dma_address() on each.
Ah, thanks! That was a near heart attack :D
Yeah that is perfectly correct, question is do you then still really need this udmabuf change? I mean the DMA API usually merges together contiguous DMA addresses.
Regards, Christian.
Hey Christian, sorry for the delay, I just wanted to double-check what I'm seeing...
I reverted the udmabuf patch and confirmed that devmem still sees 4K pages even for a hugepage-backed udmabuf. The dma_map_direct() path is being taken, which, if I am reading the code correctly, results in sg_dma_len(sg) inheriting sg->length directly (set by udmabuf's sg_set_folio(..., PAGE_SIZE) call). The iommu_dma_map_phys() path, by contrast, looks like it does merge when possible.
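
The difference between the two paths can be sketched with a small userspace model. This is not kernel code: struct seg, map_direct_model(), map_merging_model() and fill_chunks() are hypothetical names made up for illustration, and the merging rule is simplified to "fold an entry into the previous one when it starts exactly where that one ends". It only assumes the behavior described above, namely that the direct path carries each entry's length through unchanged while a merging path may coalesce contiguous entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace model only. struct seg is a hypothetical stand-in for one
 * sg entry: an address plus a length in bytes. */
struct seg {
    uint64_t addr;
    size_t   len;
};

/* Describe one contiguous region starting at base as nents entries of
 * chunk bytes each (e.g. a 2 MiB huge page as 512 x 4 KiB entries). */
static void fill_chunks(struct seg *sg, size_t nents, uint64_t base,
                        size_t chunk)
{
    assert(chunk > 0);
    for (size_t i = 0; i < nents; i++) {
        sg[i].addr = base + (uint64_t)i * chunk;
        sg[i].len = chunk;
    }
}

/* Model of a direct (1:1) mapping: every input entry yields exactly one
 * output segment with the same length, so nothing is merged. */
static size_t map_direct_model(const struct seg *in, size_t n,
                               struct seg *out)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in[i];
    return n;
}

/* Model of a merging mapping: an entry that begins exactly at the end
 * of the previous output segment is folded into that segment. */
static size_t map_merging_model(const struct seg *in, size_t n,
                                struct seg *out)
{
    size_t nout = 0;

    for (size_t i = 0; i < n; i++) {
        if (nout > 0 &&
            out[nout - 1].addr + out[nout - 1].len == in[i].addr)
            out[nout - 1].len += in[i].len;
        else
            out[nout++] = in[i];
    }
    return nout;
}
```

In this model, a 2 MiB region described as 512 entries of 4 KiB stays at 512 segments through map_direct_model() but collapses to a single 2 MiB segment through map_merging_model(). Describing the same region as one entry per folio gives a single 2 MiB segment on either path, which is what the udmabuf change is after for consumers that only look at the resulting segment lengths.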
Best, Bobby