On Thu, Mar 12, 2020 at 11:19:28AM -0300, Jason Gunthorpe wrote:
> The non-page scatterlist is also a big concern for RDMA as we have drivers that want the page list, so even if we did as this series contemplates I'd still have to split the drivers and create the notion of a dma-only SGL.
The drivers I looked at want a list of IOVA addresses, aligned to the device "page size". What other data do drivers want? Except for the software protocol stack drivers, which of course need pages for the stack further down.
> I haven't used bio_vecs before, do they support chaining like SGL so they can be very big? RDMA dma maps gigabytes of memory.
bio_vecs themselves don't have chaining, but the bios built around them do. But each entry can map a huge pile. If needed we could use the same chaining scheme we use for scatterlists for bio_vecs as well, but let's see if we really end up needing that.
> So I'm guessing the path forward is something like
> - Add some generic dma_sg data structure and helper
> - Add dma mapping code to go from pages to dma_sg
That has been on my todo list for a while. All the DMA consolidation is to prepare for that and we're finally getting close.
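To make the "list of IOVA addresses, aligned to the device page size" idea concrete, here is a minimal userspace sketch. It is not kernel code and not the proposed dma_sg interface: `struct dma_seg` and `dma_segs_to_pages()` are hypothetical names invented for illustration, and the device page size is assumed to be a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical DMA-only segment: an IOVA range with no struct page behind it. */
struct dma_seg {
    uint64_t addr; /* start IOVA of the segment */
    uint64_t len;  /* length in bytes */
};

/*
 * Expand segments into device-page-aligned IOVA addresses.
 * page_size must be a non-zero power of two. At most max entries are
 * written to out; the return value is the total number of entries needed.
 */
static size_t dma_segs_to_pages(const struct dma_seg *segs, size_t nsegs,
                                uint64_t page_size,
                                uint64_t *out, size_t max)
{
    size_t n = 0;

    assert(page_size && (page_size & (page_size - 1)) == 0);

    for (size_t i = 0; i < nsegs; i++) {
        if (segs[i].len == 0)
            continue;

        /* Round the first and last byte of the segment down to a page. */
        uint64_t first = segs[i].addr & ~(page_size - 1);
        uint64_t last = (segs[i].addr + segs[i].len - 1) & ~(page_size - 1);

        for (uint64_t a = first; ; a += page_size) {
            if (n < max)
                out[n] = a;
            n++;
            if (a == last)
                break;
        }
    }
    return n;
}
```

A driver-like caller would pass its segment list and the device page size, then program the resulting addresses into the hardware page list. Nothing in this sketch needs a struct page, which is the point of a dma-only list.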
On 13.03.20 at 12:21, Christoph Hellwig wrote:
> On Thu, Mar 12, 2020 at 11:19:28AM -0300, Jason Gunthorpe wrote:
>> The non-page scatterlist is also a big concern for RDMA as we have drivers that want the page list, so even if we did as this series contemplates I'd still have to split the drivers and create the notion of a dma-only SGL.
> The drivers I looked at want a list of IOVA addresses, aligned to the device "page size". What other data do drivers want?
Well, for GPUs I have the requirement that those IOVA addresses allow random access.
That's the reason why we currently convert the sg_table into linear arrays of addresses and pages. To solve that, keeping the length in a separate optional array would be ideal for us.
But this is such a special use case that I'm not sure if we want to support this in the common framework or not.
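As a rough illustration of the layout described above (a linear address array with the lengths kept in a separate optional array), consider the following userspace sketch. `struct dma_linear` and its helpers are hypothetical names, not an existing kernel or driver API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flat view of a DMA mapping, indexable in O(1). */
struct dma_linear {
    const uint64_t *addrs; /* one IOVA per entry */
    const uint64_t *lens;  /* optional per-entry lengths, or NULL */
    uint64_t entry_size;   /* length of every entry when lens is NULL */
    size_t count;          /* number of entries */
};

/* Return the IOVA of entry idx; no list walk is needed. */
static uint64_t dma_linear_addr(const struct dma_linear *m, size_t idx)
{
    assert(idx < m->count);
    return m->addrs[idx];
}

/* Return the length of entry idx, falling back to the fixed entry size. */
static uint64_t dma_linear_len(const struct dma_linear *m, size_t idx)
{
    assert(idx < m->count);
    return m->lens ? m->lens[idx] : m->entry_size;
}
```

Users that only need fixed-size entries leave `lens` as NULL and pay nothing for it, while random access to any entry stays a plain array index instead of a scatterlist walk.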
> Except for the software protocol stack drivers, which of course need pages for the stack further down.
Yes, completely agree.
For the GPUs I will propose a patch to stop copying the pages from the sg_table over into our linear arrays and see if anybody starts to scream.
I don't think so, but probably better to double check.
Thanks, Christian.