Dear All,
During the Exynos DRM GEM rework and fixing the issues in the drm_prime_sg_to_page_addr_arrays() function [1] I've noticed that most drivers in DRM framework incorrectly use nents and orig_nents entries of the struct sg_table.
In case of the most DMA-mapping implementations exchanging those two entries or using nents for all loops on the scatterlist is harmless, because they both have the same value. There exists however a DMA-mapping implementations, for which such incorrect usage breaks things. The nents returned by dma_map_sg() might be lower than the nents passed as its parameter and this is perfectly fine. DMA framework or IOMMU is allowed to join consecutive chunks while mapping if such operation is supported by the underlying HW (bus, bridge, IOMMU, etc). Example of the case where dma_map_sg() might return 1 'DMA' chunk for the 4 'physical' pages is described here [2]
The DMA-mapping framework documentation [3] states that dma_map_sg() returns the numer of the created entries in the DMA address space. However the subsequent calls to dma_sync_sg_for_{device,cpu} and dma_unmap_sg must be called with the original number of entries passed to dma_map_sg. The common pattern in DRM drivers were to assign the dma_map_sg() return value to sg_table->nents and use that value for the subsequent calls to dma_sync_sg_* or dma_unmap_sg functions. Also the code iterated over nents times to access the pages stored in the processed scatterlist, while it should use orig_nents as the numer of the page entries.
I've tried to identify all such incorrect usage of sg_table->nents and this is a result of my research. It looks that the incorrect pattern has been copied over the many drivers mainly in the DRM subsystem. Too bad in most cases it even worked correctly if the system used simple, linear DMA-mapping implementation, for which swapping nents and orig_nents doesn't make any difference.
The biggest TODO is DRM/i915 driver and I don't feel brave enough to fix it fully. The driver creatively uses sg_table->orig_nents to store the size of the allocate scatterlist and ignores the number of the entries returned by dma_map_sg function. In this patchset I only fixed the sg_table objects exported by dmabuf related functions. I hope that I didn't break anything there.
As a follow-up for this patchset I will prepare a common dma_{map,sync,unmap}_sgtable() wrappers, which will use a proper sg_table entries and call respective DMA-mapping functions. I hope this will help to avoid issue like this in the future.
Patches are based on top of Linux next-20200504.
Best regards, Marek Szyprowski
References:
[1] https://lkml.org/lkml/2020/3/27/555 [2] https://lkml.org/lkml/2020/3/29/65 [3] Documentation/DMA-API-HOWTO.txt
Changelog:
v2: - dropped most of the changes to drm/i915 - added fixes for rcar-du, xen, media and ion - fixed a few issues pointed by kbuild test robot - added wide cc: list for each patch
v1: https://lore.kernel.org/linux-iommu/c01c9766-9778-fd1f-f36e-2dc7bd376ba4@arm... - initial version
Patch summary:
Marek Szyprowski (21): drm: core: fix sg_table nents vs. orig_nents misuse drm: amdgpu: fix sg_table nents vs. orig_nents misuse drm: armada: fix sg_table nents vs. orig_nents misuse drm: etnaviv: fix sg_table nents vs. orig_nents misuse drm: exynos: fix sg_table nents vs. orig_nents misuse drm: i915: fix sg_table nents vs. orig_nents misuse for dmabuf objects drm: lima: fix sg_table nents vs. orig_nents misuse drm: msm: fix sg_table nents vs. orig_nents misuse drm: panfrost: fix sg_table nents vs. orig_nents misuse drm: radeon: fix sg_table nents vs. orig_nents misuse drm: rockchip: fix sg_table nents vs. orig_nents misuse drm: tegra: fix sg_table nents vs. orig_nents misuse drm: virtio: fix sg_table nents vs. orig_nents misuse drm: vmwgfx: fix sg_table nents vs. orig_nents misuse drm: xen: fix sg_table nents vs. orig_nents misuse drm: host1x: fix sg_table nents vs. orig_nents misuse drm: rcar-du: fix sg_table nents vs. orig_nents misuse xen: gntdev: fix sg_table nents vs. orig_nents misuse dmabuf: fix sg_table nents vs. orig_nents misuse media: pci: fix common ALSA DMA-mapping related code staging: ion: fix sg_table nents vs. orig_nents misuse
drivers/dma-buf/heaps/heap-helpers.c | 7 ++++--- drivers/dma-buf/udmabuf.c | 5 +++-- drivers/gpu/drm/amd/amdgpu/amdgpu_dma_buf.c | 7 ++++--- drivers/gpu/drm/amd/amdgpu/amdgpu_ttm.c | 8 ++++---- drivers/gpu/drm/armada/armada_gem.c | 14 ++++++++----- drivers/gpu/drm/drm_cache.c | 2 +- drivers/gpu/drm/drm_gem_shmem_helper.c | 7 ++++--- drivers/gpu/drm/drm_prime.c | 9 +++++---- drivers/gpu/drm/etnaviv/etnaviv_gem.c | 10 ++++++---- drivers/gpu/drm/exynos/exynos_drm_g2d.c | 7 ++++--- drivers/gpu/drm/i915/gem/i915_gem_dmabuf.c | 9 +++++---- drivers/gpu/drm/i915/gem/selftests/mock_dmabuf.c | 5 +++-- drivers/gpu/drm/lima/lima_gem.c | 4 ++-- drivers/gpu/drm/msm/msm_gem.c | 8 ++++---- drivers/gpu/drm/msm/msm_iommu.c | 3 ++- drivers/gpu/drm/panfrost/panfrost_gem.c | 3 ++- drivers/gpu/drm/panfrost/panfrost_mmu.c | 4 +++- drivers/gpu/drm/radeon/radeon_ttm.c | 11 ++++++----- drivers/gpu/drm/rockchip/rockchip_drm_gem.c | 15 +++++++------- drivers/gpu/drm/tegra/gem.c | 25 ++++++++++++------------ drivers/gpu/drm/tegra/plane.c | 13 ++++++------ drivers/gpu/drm/virtio/virtgpu_object.c | 11 ++++++----- drivers/gpu/drm/virtio/virtgpu_vq.c | 8 ++++---- drivers/gpu/drm/vmwgfx/vmwgfx_ttm_buffer.c | 6 +++--- drivers/gpu/drm/xen/xen_drm_front_gem.c | 2 +- drivers/gpu/host1x/job.c | 17 ++++++++-------- drivers/media/pci/cx23885/cx23885-alsa.c | 2 +- drivers/media/pci/cx25821/cx25821-alsa.c | 2 +- drivers/media/pci/cx88/cx88-alsa.c | 2 +- drivers/media/pci/saa7134/saa7134-alsa.c | 2 +- drivers/media/platform/vsp1/vsp1_drm.c | 7 ++++--- drivers/staging/android/ion/ion.c | 17 ++++++++-------- drivers/staging/android/ion/ion_heap.c | 6 +++--- drivers/staging/android/ion/ion_system_heap.c | 2 +- drivers/xen/gntdev-dmabuf.c | 10 ++++++---- 35 files changed, 149 insertions(+), 121 deletions(-)