On Thu, Jan 28, 2021 at 11:00 AM Suren Baghdasaryan surenb@google.com wrote:
On Thu, Jan 28, 2021 at 10:19 AM Minchan Kim minchan@kernel.org wrote:
On Thu, Jan 28, 2021 at 09:52:59AM -0800, Suren Baghdasaryan wrote:
On Thu, Jan 28, 2021 at 1:13 AM Christoph Hellwig hch@infradead.org wrote:
On Thu, Jan 28, 2021 at 12:38:17AM -0800, Suren Baghdasaryan wrote:
Currently the system heap maps its buffers with the VM_PFNMAP flag using remap_pfn_range. As a result, such buffers are not accounted for in PSS calculations, because the VM treats this memory as having no page structs. Without page structs there are no counters representing how many processes are mapping a page, and therefore PSS calculation is impossible.

Historically, the ION driver mapped its buffers as VM_PFNMAP areas because of memory carveouts that did not have page structs [1]. That is no longer the case, and it seems there was a desire to move away from remap_pfn_range [2]. The dmabuf system heap design inherits this ION behavior and maps its pages using remap_pfn_range even though the allocated pages are backed by page structs.

Clear the VM_IO and VM_PFNMAP flags when mapping memory allocated by the system heap and replace remap_pfn_range with vm_insert_page, following Laura's suggestion in [1]. This would allow correct PSS calculation for dmabufs.
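To illustrate why the missing page structs matter for PSS, here is a rough userspace model, not kernel code. The names model_page and model_pss are hypothetical and exist only for this sketch; a NULL entry stands in for a PFN-only page that has no page struct and therefore no map count.

```c
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096UL	/* assumed page size for the sketch */

/* Hypothetical stand-in for "struct page": only the map count is modeled. */
struct model_page {
	unsigned int mapcount;	/* number of processes mapping this page */
};

/*
 * Proportional set size of one mapping, in bytes: every page contributes
 * MODEL_PAGE_SIZE divided by the number of processes mapping it.
 * A NULL entry models a PFN-only page; with no map count available it
 * cannot be attributed to anyone and contributes nothing.
 */
static unsigned long model_pss(struct model_page *const *pages, size_t n)
{
	unsigned long pss = 0;
	size_t i;

	for (i = 0; i < n; i++) {
		if (!pages[i] || pages[i]->mapcount == 0)
			continue;
		pss += MODEL_PAGE_SIZE / pages[i]->mapcount;
	}
	return pss;
}
```

In this model, a buffer with one page shared by two processes and one page mapped by a single process is charged 2048 + 4096 = 6144 bytes, while the same two pages seen as PFN-only entries are charged 0.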
[1] https://driverdev-devel.linuxdriverproject.narkive.com/v0fJGpaD/using-ion-me...
[2] http://driverdev.linuxdriverproject.org/pipermail/driverdev-devel/2018-Octob...
(sorry, could not find lore links for these discussions)
Suggested-by: Laura Abbott <labbott@kernel.org>
Signed-off-by: Suren Baghdasaryan <surenb@google.com>
 drivers/dma-buf/heaps/system_heap.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/drivers/dma-buf/heaps/system_heap.c b/drivers/dma-buf/heaps/system_heap.c
index 17e0e9a68baf..0e92e42b2251 100644
--- a/drivers/dma-buf/heaps/system_heap.c
+++ b/drivers/dma-buf/heaps/system_heap.c
@@ -200,11 +200,13 @@ static int system_heap_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma)
 	struct sg_page_iter piter;
 	int ret;

+	/* All pages are backed by a "struct page" */
+	vma->vm_flags &= ~VM_PFNMAP;
Why do we clear this flag? It shouldn't even be set here as far as I can tell.
Thanks for the question, Christoph. I tracked down that flag being set by drm_gem_mmap_obj() which DRM drivers use to "Set up the VMA to prepare mapping of the GEM object" (according to drm_gem_mmap_obj comments). I also see a pattern in several DRM drivers to call drm_gem_mmap_obj()/drm_gem_mmap(), then clear VM_PFNMAP and then map the VMA (for example here: https://elixir.bootlin.com/linux/latest/source/drivers/gpu/drm/rockchip/rock...). I thought that the dmabuf allocator (in this case the system heap) would be the right place to set these flags because it controls how memory is allocated before mapping. However, it's quite possible that I'm
However, you're not setting a flag but removing one from under the caller. That is different from appending more flags (removing a condition vs. adding more conditions). If the flag should be removed, the caller should not have set it in the first place. Hiding this inside the API will keep encouraging wrong use cases in the future.
Which takes us back to the question of why VM_PFNMAP is being set by the caller in the first place.
missing the real reason for VM_PFNMAP being set in drm_gem_mmap_obj() before dma_buf_mmap() is called. I could not find the answer to that, so I hope someone here can clarify that.
I guess DRM used carved-out, pure PFN memory a long time ago and switched to dmabuf at some point.
It would be really good to know the reason for sure to address the issue properly.
Whatever the history is, rather than removing the flag from under them, let's add WARN_ON(vma->vm_flags & VM_PFNMAP) so we can catch such callers, clean them up, and start a discussion.
The issue with not clearing the flag here is that vm_insert_page() has a BUG_ON(vma->vm_flags & VM_PFNMAP). If we do not clear this flag I suspect we will get many angry developers :) If your above guess is correct and we can mandate dmabuf heap users not to use VM_PFNMAP then I think the following code might be the best way forward:
+	bool pfn_requested = !!(vma->vm_flags & VM_PFNMAP);
+	WARN_ON_ONCE(pfn_requested);

 	for_each_sgtable_page(table, &piter, vma->vm_pgoff) {
 		struct page *page = sg_page_iter_page(&piter);

-		ret = remap_pfn_range(vma, addr, page_to_pfn(page), PAGE_SIZE,
-				      vma->vm_page_prot);
+		ret = pfn_requested ?
+			remap_pfn_range(vma, addr, page_to_pfn(page), PAGE_SIZE,
+					vma->vm_page_prot) :
+			vm_insert_page(vma, addr, page);
Folks, any objections to the approach above?
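For anyone who wants to reason about the behavior outside the kernel, the selection logic in the snippet can be modeled roughly as below. This is only a userspace sketch under stated assumptions: MODEL_VM_PFNMAP, model_vma, map_path and choose_map_path are hypothetical names, the flag value is arbitrary, and the static "warned" variable only approximates the warn-once behavior of WARN_ON_ONCE().

```c
#include <stdbool.h>

#define MODEL_VM_PFNMAP 0x1UL	/* hypothetical value, not the kernel's */

/* Which mapping primitive the model would pick for a VMA. */
enum map_path {
	MAP_VIA_INSERT_PAGE,	/* stands in for vm_insert_page() */
	MAP_VIA_REMAP_PFN	/* stands in for remap_pfn_range() */
};

/* Hypothetical stand-in for vm_area_struct: only the flags are modeled. */
struct model_vma {
	unsigned long vm_flags;
};

/* Number of warnings emitted so far; stays at most 1 (warn-once). */
static unsigned int pfnmap_warnings;

/*
 * Leave the caller's flags untouched. If the caller asked for a PFN
 * mapping, warn once and keep using the PFN path so that the insert-page
 * path is never reached with the flag still set.
 */
static enum map_path choose_map_path(const struct model_vma *vma)
{
	static bool warned;
	bool pfn_requested = !!(vma->vm_flags & MODEL_VM_PFNMAP);

	if (pfn_requested && !warned) {
		warned = true;
		pfnmap_warnings++;
	}
	return pfn_requested ? MAP_VIA_REMAP_PFN : MAP_VIA_INSERT_PAGE;
}
```

The point of the model is that the heap no longer modifies flags behind the caller's back: callers that still set VM_PFNMAP keep working and are reported once, and everyone else gets the page-backed path.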