On Fri, Sep 23, 2022, at 2:56 PM, Linus Walleij wrote:
On Fri, Sep 23, 2022 at 1:58 PM Arnd Bergmann arnd@arndb.de wrote:
On Fri, Sep 23, 2022, at 1:38 PM, Linus Walleij wrote:
A recent change affecting the behaviour of phys_to_dma() to actually require the device tree ranges to work unmasked a bug in the Integrator DMA ranges.
The PL110 uses the CMA allocator to obtain coherent allocations from a dedicated 1MB video memory, leading to the following call chain:
drm_gem_cma_create() dma_alloc_attrs() dma_alloc_from_dev_coherent() __dma_alloc_from_coherent() dma_get_device_base() phys_to_dma() translate_phys_to_dma()
phys_to_dma() by way of translate_phys_to_dma() will nowadays not provide 1:1 mappings unless the ranges are properly defined in the device tree and reflected into the dev->dma_range_map.
I don't understand this yet, what did the kernel previously do to that allowed the correct DMA mapping when a wrong address was set in the DT?
Ah this goes way back. I think I need a different fixes tag. I hadn't tested this platform in a while.
What happens currently in translate_phys_to_dma() is that it returns DMA_MAPPING_ERROR (0xffffffff) because of the buggy device tree and that causes the problem.
Before the dma direct rework we selected ARCH_HAS_PHYS_TO_DMA and arch/arm/include/asm/dma-direct.h did this:
static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn) { if (dev) pfn -= dev->dma_pfn_offset; return (dma_addr_t)__pfn_to_bus(pfn); } (...) static inline dma_addr_t __phys_to_dma(struct device *dev, phys_addr_t paddr) { unsigned int offset = paddr & ~PAGE_MASK; return pfn_to_dma(dev, __phys_to_pfn(paddr)) + offset; }
As dma_pfn_offset was 0 this resulted in a 1:1 map.
Ah, I see. So the offset was 0 because the dma-ranges did not actually point to RAM then, right?
Odd thing: this flag has no device tree bindings I could find. Maybe it's in the really old DT docs?
No idea. I do remember that in the old times, whether DMA was coherent or not was just a compile-time option, so ppc was usually assumed to be coherent while Arm was not, but then we needed something to identify machines that were different.
Here the special memory appears in the CPU address map.
bus@c0000000 { compatible = "arm,integrator-ap-lm"; #address-cells = <1>; #size-cells = <1>; ranges = <0xc0000000 0xc0000000 0x40000000>; dma-ranges;
(...) lm0: bus@c0000000 { compatible = "simple-bus"; ranges = <0x00000000 0xc0000000 0x10000000>; dma-ranges = <0x00000000 0xc0000000 0x10000000>; reg = <0xc0000000 0x10000000>; #address-cells = <1>; #size-cells = <1>;
display@1000000 { compatible = "arm,pl110", "arm,primecell"; reg = <0x01000000 0x1000>;
(...) memory-region = <&impd1_ram>; dma-ranges; (...)
Here we find the reg properly at physical address 0xc1000000 thanks to the ranges: 0xc0000000 + 0x00000000 + 0x01000000 = 0xc1000000.
But the special memory region is in the root of the device tree, outside of the translation.
Now dma-ranges will assume that when we translate the memory region it is in the address space of the display controller, but it isn't, because the phandle goes to something in the root of the device tree.
0xc2000000 + 0x00000000 + 0x00000000 = 0xc2000000
Whenever I do ranges my head start spinning :/
If you have a better way to accomodate the DMA ranges I am happy to comply!
I'm still not sure I understand it. Is that reserved memory area in RAM or on the logic bus? The 'ranges' property makes it appear that all of 0xc0000000-0xcfffffff is the logic bus, but then the vram itself is also in that address range, which would indicate that it is behind that bus. However, using the 'reserved-memory' property indicates that this is actually just normal RAM that is carved out from what the OS manages.
Arnd