On Sat, Nov 1, 2025 at 2:12 AM Timur Tabi <ttabi@nvidia.com> wrote:
> Set the DMA mask before calling nvkm_device_ctor(), so that when the flush page is created in nvkm_fb_ctor(), the allocation will not fail if the page is outside of the DMA address space, which can easily happen if the IOMMU is disabled. In such situations, you will get an error like this:
>
>   nouveau 0000:65:00.0: DMA addr 0x0000000107c56000+4096 overflow (mask ffffffff, bus limit 0).
> Commit 38f5359354d4 ("drm/nouveau/pci: set streaming DMA mask early") set the mask after calling nvkm_device_ctor(), but back then there was no flush page being created, which might explain why the mask wasn't set earlier.
>
> Flush page allocation was added in commit 5728d064190e ("drm/nouveau/fb: handle sysmem flush page from common code"). nvkm_fb_ctor() calls alloc_page(), which can allocate a page anywhere in system memory, and then calls dma_map_page() on that page. Since the DMA mask is still set to 32 bits at that point, the mapping can fail if the page is allocated above 4GB. This is easy to reproduce on systems with a lot of memory and the IOMMU disabled.
>
> An alternative approach would be to force the allocation of the flush page into low memory by specifying __GFP_DMA32. However, that would always place the page in low memory, even when the hardware can access high memory.
So this caused a regression, because the sysmem flush page has to be within 40 bits.

Look in openrm: src/nvidia/src/kernel/gpu/mem_sys/arch/maxwell/kern_mem_sys_gm107.c:kmemsysInitFlushSysmemBuffer_GM107

The prop driver tries GFP_DMA32 first, then falls back to 40 bits, and the code is all horrible. It's probably fine for us to just set dma_bits to 40 here beforehand and then the full range after.
Dave.
> Fixes: 5728d064190e ("drm/nouveau/fb: handle sysmem flush page from common code")
> Signed-off-by: Timur Tabi <ttabi@nvidia.com>
> ---
>  .../gpu/drm/nouveau/nvkm/engine/device/pci.c | 24 +++++++++----------
>  1 file changed, 12 insertions(+), 12 deletions(-)
> diff --git a/drivers/gpu/drm/nouveau/nvkm/engine/device/pci.c b/drivers/gpu/drm/nouveau/nvkm/engine/device/pci.c
> index 8f0261a0d618..7cc5a7499583 100644
> --- a/drivers/gpu/drm/nouveau/nvkm/engine/device/pci.c
> +++ b/drivers/gpu/drm/nouveau/nvkm/engine/device/pci.c
> @@ -1695,6 +1695,18 @@ nvkm_device_pci_new(struct pci_dev *pci_dev, const char *cfg, const char *dbg,
>  	*pdevice = &pdev->device;
>  	pdev->pdev = pci_dev;
>  
> +	/* Set DMA mask based on capabilities reported by the MMU subdev. */
> +	if (pdev->device.mmu && !pdev->device.pci->agp.bridge)
> +		bits = pdev->device.mmu->dma_bits;
> +	else
> +		bits = 32;
> +
> +	ret = dma_set_mask_and_coherent(&pci_dev->dev, DMA_BIT_MASK(bits));
> +	if (ret && bits != 32) {
> +		dma_set_mask_and_coherent(&pci_dev->dev, DMA_BIT_MASK(32));
> +		pdev->device.mmu->dma_bits = 32;
> +	}
> +
>  	ret = nvkm_device_ctor(&nvkm_device_pci_func, quirk, &pci_dev->dev,
>  			       pci_is_pcie(pci_dev) ? NVKM_DEVICE_PCIE :
>  			       pci_find_capability(pci_dev, PCI_CAP_ID_AGP) ?
> @@ -1708,17 +1720,5 @@ nvkm_device_pci_new(struct pci_dev *pci_dev, const char *cfg, const char *dbg,
>  	if (ret)
>  		return ret;
>  
> -	/* Set DMA mask based on capabilities reported by the MMU subdev. */
> -	if (pdev->device.mmu && !pdev->device.pci->agp.bridge)
> -		bits = pdev->device.mmu->dma_bits;
> -	else
> -		bits = 32;
> -
> -	ret = dma_set_mask_and_coherent(&pci_dev->dev, DMA_BIT_MASK(bits));
> -	if (ret && bits != 32) {
> -		dma_set_mask_and_coherent(&pci_dev->dev, DMA_BIT_MASK(32));
> -		pdev->device.mmu->dma_bits = 32;
> -	}
> -
>  	return 0;
>  }
> base-commit: 18a7e218cfcdca6666e1f7356533e4c988780b57
> -- 
> 2.51.0