DMA buffers allocated from the CMA dma-buf heap get counted under RssFile for processes that map them and trigger page faults. In addition to the incorrect accounting reported to userspace, reclaim behavior was influenced by the MM_FILEPAGES counter until linux 6.8, but this memory is not reclaimable. [1] Change the CMA dma-buf heap to set VM_PFNMAP on the VMA so MM does not poke at the memory managed by this dma-buf heap, and use vmf_insert_pfn to correct the RSS accounting.
The system dma-buf heap does not suffer from this issue since remap_pfn_range is used during the mmap of the buffer, which also sets VM_PFNMAP on the VMA.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/mm...
Fixes: b61614ec318a ("dma-buf: heaps: Add CMA heap to dmabuf heaps") Signed-off-by: T.J. Mercier tjmercier@google.com --- drivers/dma-buf/heaps/cma_heap.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/drivers/dma-buf/heaps/cma_heap.c b/drivers/dma-buf/heaps/cma_heap.c index ee899f8e6721..4a63567e93ba 100644 --- a/drivers/dma-buf/heaps/cma_heap.c +++ b/drivers/dma-buf/heaps/cma_heap.c @@ -168,10 +168,7 @@ static vm_fault_t cma_heap_vm_fault(struct vm_fault *vmf) if (vmf->pgoff > buffer->pagecount) return VM_FAULT_SIGBUS;
- vmf->page = buffer->pages[vmf->pgoff]; - get_page(vmf->page); - - return 0; + return vmf_insert_pfn(vma, vmf->address, page_to_pfn(buffer->pages[vmf->pgoff])); }
static const struct vm_operations_struct dma_heap_vm_ops = { @@ -185,6 +182,8 @@ static int cma_heap_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma) if ((vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) == 0) return -EINVAL;
+ vm_flags_set(vma, VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP); + vma->vm_ops = &dma_heap_vm_ops; vma->vm_private_data = buffer;
Am 17.01.24 um 19:11 schrieb T.J. Mercier:
DMA buffers allocated from the CMA dma-buf heap get counted under RssFile for processes that map them and trigger page faults. In addition to the incorrect accounting reported to userspace, reclaim behavior was influenced by the MM_FILEPAGES counter until linux 6.8, but this memory is not reclaimable. [1] Change the CMA dma-buf heap to set VM_PFNMAP on the VMA so MM does not poke at the memory managed by this dma-buf heap, and use vmf_insert_pfn to correct the RSS accounting.
The system dma-buf heap does not suffer from this issue since remap_pfn_range is used during the mmap of the buffer, which also sets VM_PFNMAP on the VMA.
Mhm, not an issue with this patch but Daniel wanted to add a check for VM_PFNMAP to dma_buf_mmap() which would have noted this earlier.
I don't fully remember the discussion but for some reason that was never committed. We should probably try that again.
[1]https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/mm...
Fixes: b61614ec318a ("dma-buf: heaps: Add CMA heap to dmabuf heaps") Signed-off-by: T.J. Merciertjmercier@google.com
Acked-by: Christian König christian.koenig@amd.com
drivers/dma-buf/heaps/cma_heap.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/drivers/dma-buf/heaps/cma_heap.c b/drivers/dma-buf/heaps/cma_heap.c index ee899f8e6721..4a63567e93ba 100644 --- a/drivers/dma-buf/heaps/cma_heap.c +++ b/drivers/dma-buf/heaps/cma_heap.c @@ -168,10 +168,7 @@ static vm_fault_t cma_heap_vm_fault(struct vm_fault *vmf) if (vmf->pgoff > buffer->pagecount) return VM_FAULT_SIGBUS;
- vmf->page = buffer->pages[vmf->pgoff];
- get_page(vmf->page);
- return 0;
- return vmf_insert_pfn(vma, vmf->address, page_to_pfn(buffer->pages[vmf->pgoff])); }
static const struct vm_operations_struct dma_heap_vm_ops = { @@ -185,6 +182,8 @@ static int cma_heap_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma) if ((vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) == 0) return -EINVAL;
- vm_flags_set(vma, VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP);
- vma->vm_ops = &dma_heap_vm_ops; vma->vm_private_data = buffer;
On Thu, Jan 18, 2024 at 11:02:22AM +0100, Christian König wrote:
Am 17.01.24 um 19:11 schrieb T.J. Mercier:
DMA buffers allocated from the CMA dma-buf heap get counted under RssFile for processes that map them and trigger page faults. In addition to the incorrect accounting reported to userspace, reclaim behavior was influenced by the MM_FILEPAGES counter until linux 6.8, but this memory is not reclaimable. [1] Change the CMA dma-buf heap to set VM_PFNMAP on the VMA so MM does not poke at the memory managed by this dma-buf heap, and use vmf_insert_pfn to correct the RSS accounting.
The system dma-buf heap does not suffer from this issue since remap_pfn_range is used during the mmap of the buffer, which also sets VM_PFNMAP on the VMA.
Mhm, not an issue with this patch but Daniel wanted to add a check for VM_PFNMAP to dma_buf_mmap() which would have noted this earlier.
I don't fully remember the discussion but for some reason that was never committed. We should probably try that again.
Iirc the issue is that dma_mmap is not guaranteed to give you a VM_SPECIAL mapping, at least on absolutely all architectures. That's why I defacto dropped that idea, but it would indeed be really great if we could resurrect it.
Maybe for x86 only? Or x86+armv8, I'm honestly not sure anymore which exact cases ended up with a VM_NORMAL mapping ... Would need a pile of digging.
I think all the other patches I've landed. -Sima
[1]https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/mm...
Fixes: b61614ec318a ("dma-buf: heaps: Add CMA heap to dmabuf heaps") Signed-off-by: T.J. Merciertjmercier@google.com
Acked-by: Christian König christian.koenig@amd.com
drivers/dma-buf/heaps/cma_heap.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/drivers/dma-buf/heaps/cma_heap.c b/drivers/dma-buf/heaps/cma_heap.c index ee899f8e6721..4a63567e93ba 100644 --- a/drivers/dma-buf/heaps/cma_heap.c +++ b/drivers/dma-buf/heaps/cma_heap.c @@ -168,10 +168,7 @@ static vm_fault_t cma_heap_vm_fault(struct vm_fault *vmf) if (vmf->pgoff > buffer->pagecount) return VM_FAULT_SIGBUS;
- vmf->page = buffer->pages[vmf->pgoff];
- get_page(vmf->page);
- return 0;
- return vmf_insert_pfn(vma, vmf->address, page_to_pfn(buffer->pages[vmf->pgoff])); } static const struct vm_operations_struct dma_heap_vm_ops = {
@@ -185,6 +182,8 @@ static int cma_heap_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma) if ((vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) == 0) return -EINVAL;
- vm_flags_set(vma, VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP);
- vma->vm_ops = &dma_heap_vm_ops; vma->vm_private_data = buffer;
On Thu, Jan 18, 2024 at 6:49 AM Daniel Vetter daniel@ffwll.ch wrote:
On Thu, Jan 18, 2024 at 11:02:22AM +0100, Christian König wrote:
Am 17.01.24 um 19:11 schrieb T.J. Mercier:
DMA buffers allocated from the CMA dma-buf heap get counted under RssFile for processes that map them and trigger page faults. In addition to the incorrect accounting reported to userspace, reclaim behavior was influenced by the MM_FILEPAGES counter until linux 6.8, but this memory is not reclaimable. [1] Change the CMA dma-buf heap to set VM_PFNMAP on the VMA so MM does not poke at the memory managed by this dma-buf heap, and use vmf_insert_pfn to correct the RSS accounting.
The system dma-buf heap does not suffer from this issue since remap_pfn_range is used during the mmap of the buffer, which also sets VM_PFNMAP on the VMA.
Mhm, not an issue with this patch but Daniel wanted to add a check for VM_PFNMAP to dma_buf_mmap() which would have noted this earlier.
I don't fully remember the discussion but for some reason that was never committed. We should probably try that again.
Iirc the issue is that dma_mmap is not guaranteed to give you a VM_SPECIAL mapping, at least on absolutely all architectures. That's why I defacto dropped that idea, but it would indeed be really great if we could resurrect it.
I actually had it in my head that it was a BUG_ON check for VM_PFNMAP in dma_buf_mmap and it was merged, so I was surprised to discover that it wasn't set for these CMA buffers.
Maybe for x86 only? Or x86+armv8, I'm honestly not sure anymore which exact cases ended up with a VM_NORMAL mapping ... Would need a pile of digging.
Looking back at the patch, the CI email at the end of the thread lists a bunch of now-broken links to DMESG-WARN test failures I assume pointed at a large chunk of them.
https://lore.kernel.org/all/166919750173.15575.2864736980735346730@emeril.fr...
[1]https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/mm...
Fixes: b61614ec318a ("dma-buf: heaps: Add CMA heap to dmabuf heaps") Signed-off-by: T.J. Merciertjmercier@google.com
Acked-by: Christian König christian.koenig@amd.com
Thanks Christian.
On Thu, Jan 18, 2024 at 08:57:16AM -0800, T.J. Mercier wrote:
On Thu, Jan 18, 2024 at 6:49 AM Daniel Vetter daniel@ffwll.ch wrote:
On Thu, Jan 18, 2024 at 11:02:22AM +0100, Christian König wrote:
Am 17.01.24 um 19:11 schrieb T.J. Mercier:
DMA buffers allocated from the CMA dma-buf heap get counted under RssFile for processes that map them and trigger page faults. In addition to the incorrect accounting reported to userspace, reclaim behavior was influenced by the MM_FILEPAGES counter until linux 6.8, but this memory is not reclaimable. [1] Change the CMA dma-buf heap to set VM_PFNMAP on the VMA so MM does not poke at the memory managed by this dma-buf heap, and use vmf_insert_pfn to correct the RSS accounting.
The system dma-buf heap does not suffer from this issue since remap_pfn_range is used during the mmap of the buffer, which also sets VM_PFNMAP on the VMA.
Mhm, not an issue with this patch but Daniel wanted to add a check for VM_PFNMAP to dma_buf_mmap() which would have noted this earlier.
I don't fully remember the discussion but for some reason that was never committed. We should probably try that again.
Iirc the issue is that dma_mmap is not guaranteed to give you a VM_SPECIAL mapping, at least on absolutely all architectures. That's why I defacto dropped that idea, but it would indeed be really great if we could resurrect it.
I actually had it in my head that it was a BUG_ON check for VM_PFNMAP in dma_buf_mmap and it was merged, so I was surprised to discover that it wasn't set for these CMA buffers.
Maybe for x86 only? Or x86+armv8, I'm honestly not sure anymore which exact cases ended up with a VM_NORMAL mapping ... Would need a pile of digging.
Looking back at the patch, the CI email at the end of the thread lists a bunch of now-broken links to DMESG-WARN test failures I assume pointed at a large chunk of them.
https://lore.kernel.org/all/166919750173.15575.2864736980735346730@emeril.fr...
I thought there was a more recent submission, where I at least fixed the various fallout in gem code. But maybe I only dreamed ...
Also I did the code grepping again, and at least iommu_dma_mmap() in drivers/iommu/dma-iommu.c and arm_iommu_mmap_attrs() for arm use vm_map_pages in certain cases, which is _not_ VM_PFNMAP.
Means really no cases where I think we can assume we'll always get VM_PFNMAP, and unfortunately we need VM_PFNMAP or VM_IO to prevent get_user_pages and similar bad things from happening to dma-buf mmaps.
So still no luck :-/ -Sima
[1]https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/mm...
Fixes: b61614ec318a ("dma-buf: heaps: Add CMA heap to dmabuf heaps") Signed-off-by: T.J. Merciertjmercier@google.com
Acked-by: Christian König christian.koenig@amd.com
Thanks Christian.
Hi TJ,
On Thu, 18 Jan 2024 at 15:32, Christian König christian.koenig@amd.com wrote:
Am 17.01.24 um 19:11 schrieb T.J. Mercier:
DMA buffers allocated from the CMA dma-buf heap get counted under RssFile for processes that map them and trigger page faults. In addition to the incorrect accounting reported to userspace, reclaim behavior was influenced by the MM_FILEPAGES counter until linux 6.8, but this memory is not reclaimable. [1] Change the CMA dma-buf heap to set VM_PFNMAP on the VMA so MM does not poke at the memory managed by this dma-buf heap, and use vmf_insert_pfn to correct the RSS accounting.
The system dma-buf heap does not suffer from this issue since remap_pfn_range is used during the mmap of the buffer, which also sets VM_PFNMAP on the VMA.
Mhm, not an issue with this patch but Daniel wanted to add a check for VM_PFNMAP to dma_buf_mmap() which would have noted this earlier.
I don't fully remember the discussion but for some reason that was never committed. We should probably try that again.
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/mm...
Fixes: b61614ec318a ("dma-buf: heaps: Add CMA heap to dmabuf heaps") Signed-off-by: T.J. Mercier tjmercier@google.com tjmercier@google.com
Acked-by: Christian König christian.koenig@amd.com christian.koenig@amd.com
Thanks for the patch; pushed to drm-misc-fixes.
Best, Sumit
drivers/dma-buf/heaps/cma_heap.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/drivers/dma-buf/heaps/cma_heap.c b/drivers/dma-buf/heaps/cma_heap.c index ee899f8e6721..4a63567e93ba 100644 --- a/drivers/dma-buf/heaps/cma_heap.c +++ b/drivers/dma-buf/heaps/cma_heap.c @@ -168,10 +168,7 @@ static vm_fault_t cma_heap_vm_fault(struct vm_fault *vmf) if (vmf->pgoff > buffer->pagecount) return VM_FAULT_SIGBUS;
- vmf->page = buffer->pages[vmf->pgoff];
- get_page(vmf->page);
- return 0;
- return vmf_insert_pfn(vma, vmf->address, page_to_pfn(buffer->pages[vmf->pgoff]));
}
static const struct vm_operations_struct dma_heap_vm_ops = { @@ -185,6 +182,8 @@ static int cma_heap_mmap(struct dma_buf *dmabuf, struct vm_area_struct *vma) if ((vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) == 0) return -EINVAL;
- vm_flags_set(vma, VM_IO | VM_PFNMAP | VM_DONTEXPAND | VM_DONTDUMP);
- vma->vm_ops = &dma_heap_vm_ops; vma->vm_private_data = buffer;
linaro-mm-sig@lists.linaro.org