On Thu, Aug 01, 2024 at 10:53:45AM +0800, Huan Yang wrote:
On 2024/8/1 4:46, Daniel Vetter wrote:
On Tue, Jul 30, 2024 at 08:04:04PM +0800, Huan Yang wrote:
On 2024/7/30 17:05, Huan Yang wrote:
On 2024/7/30 16:56, Daniel Vetter wrote:
On Tue, Jul 30, 2024 at 03:57:44PM +0800, Huan Yang wrote:
UDMA-BUF steps:
  1. memfd_create
  2. open file (buffer/direct)
  3. udmabuf create
  4. mmap memfd
  5. read file into memfd vaddr
Yeah this is really slow and the worst way to do it. You absolutely want to start _all_ the io before you start creating the dma-buf, ideally with everything running in parallel. But just starting the direct I/O with async and then creating the udmabuf should be a lot faster and avoid
That's great. Let me rephrase that, and please correct me if I'm wrong.
UDMA-BUF steps:
  1. memfd_create
  2. mmap memfd
  3. open file (buffer/direct)
  4. start thread to async read
  5. udmabuf create
With this, can improve
I just tested it. The steps are:
UDMA-BUF steps:
  1. memfd_create
  2. mmap memfd
  3. open file (buffer/direct)
  4. start thread to async read
  5. udmabuf create
  6. join wait
Reading the 3G file takes 1,527,103,431 ns for all the steps, which is great.
Ok that's almost the throughput of your patch set, which I think is close enough. The remaining difference is probably just the mmap overhead, not sure whether/how we can do direct i/o to an fd directly ... in principle it's possible for any file that uses the standard pagecache.
Yes. For mmap, IMO, we now get all the folios and pin them, which means all the PFNs are already known when the udmabuf is created.
So I think mapping through page faults does not help save memory; it only increases the cost of accessing the mapping (maybe it saves a little page-table memory).
I want to send a patchset that removes it and makes the code better suited to folio operations (and removes the unpin list). It will also contain some fix patches.
I'll send it once my testing shows it is good.
About fd operations for direct I/O, maybe use sendfile or copy_file_range?
sendfile is based on a pipe buffer, and its performance was low when I tested it.
copy_file_range doesn't work because the files are not on the same file system.
So I can't find another way to do it. Can someone give some suggestions?
Yeah direct I/O to pagecache without an mmap might be too niche to be supported. Maybe io_uring has something, but I guess as unlikely as anything else. -Sima