On Tuesday 20 May 2014 12:02:46 Catalin Marinas wrote:
On Mon, May 19, 2014 at 05:55:56PM +0100, Arnd Bergmann wrote:
On Monday 19 May 2014 16:56:08 Catalin Marinas wrote:
On Mon, May 19, 2014 at 10:44:51AM +0100, Arnd Bergmann wrote:
On Monday 19 May 2014 10:03:40 Catalin Marinas wrote:
We probably want to default to 32-bit for arm32 in the absence of dma-ranges. For arm64, I'd prefer if we could always mandate dma-ranges to be present for each bus, just like we mandate ranges to be present. I hope it's not too late for that.
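As an illustration of mandating dma-ranges per bus, a hypothetical DT fragment could look like the following (a simple-bus with 64-bit addressing whose child devices can only DMA to the low 4GB; all names and values here are made up for the example):

```
soc {
	compatible = "simple-bus";
	#address-cells = <2>;
	#size-cells = <2>;
	ranges;
	/* child bus address 0 maps to parent address 0, size 4GB:
	 * devices on this bus can only DMA into the first 4GB */
	dma-ranges = <0x0 0x0 0x0 0x0 0x1 0x0>;
};
```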
dma_set_mask should definitely look at the dma-ranges properties, and the helper that Santosh just introduced should give us all the information we need. We just need to decide on the correct behavior.
Last time I looked at Santosh's patches, I thought dma-ranges was per device rather than per bus. We could make it per bus only and let the device call dma_set_mask() explicitly if it wants to restrict it further.
Can you check again? I've read the code again yesterday to check this, and I concluded it was correctly doing this per bus.
While we currently don't have a set of swiotlb DMA ops on ARM32, we do have them on ARM64, and I think we should be using them properly. It should really not be hard to implement a proper dma_set_mask() function for ARM64 that is able to set up the swiotlb based on the dma-ranges properties and always returns success but leaves the mask unchanged.
The swiotlb bounce buffer needs to be pre-allocated at boot, otherwise we don't have any guarantees. Since we can't honour arbitrary masks anyway, we stick to ZONE_DMA, which currently sits within the 4G limit. But the driver calls dma_set_mask() too late for any further swiotlb setup.
With an IOMMU we can be more flexible around dma_set_mask(); it can be handled at run-time.
What we can do with swiotlb is to check whether the mask is smaller than ZONE_DMA. If it ever is, we have to fail dma_set_mask() and hope the driver can fall back to PIO mode, or it will have to fail in its probe() function.
dma_set_(coherent_)mask() checks swiotlb_dma_supported(), which returns false if io_tlb_end goes beyond the device mask. So we just need to ensure that the io_tlb buffer is allocated within ZONE_DMA.
Makes sense for dma_set_mask. Why do you do the same thing for coherent_mask? Shouldn't that check against ZONE_DMA instead?
For dma_set_coherent_mask(), we also have to fail any call that tries to set a mask larger than what the device hardware can do. dma_set_mask(), on the other hand, can succeed with any mask: we just have to enable swiotlb if the mask the driver wants is larger than what the hardware can do.
Currently we can't satisfy an arbitrarily small DMA mask even with swiotlb, since the bounce buffer is only guaranteed to be in ZONE_DMA. Swiotlb allows for smaller masks, but we would need to reserve the io_tlb buffer early during boot and at smaller addresses. For example, swiotlb_alloc_coherent() first tries __get_free_pages(GFP_DMA), and if the coherent_dma_mask isn't matched, it frees the pages and falls back to the io_tlb buffer. However, I don't think it's worth going for masks smaller than 32-bit on arm64.
Is that safe for noncoherent systems? I'd expect the io_tlb buffer to be cached there, which means we can't use it for coherent allocations.
CMA is pretty similar to swiotlb with regard to pre-allocated buffers for coherent DMA. We currently don't limit it on arm64, but I think we should restrict it to ZONE_DMA because we can't tell what masks the devices need. We could parse the DT for dma-ranges, but even then explicit dma_set_coherent_mask() calls can make the mask smaller.
Yet another issue is what we actually mean by ZONE_DMA. If we have devices with different dma_pfn_offset values (as per Santosh's patches), ZONE_DMA would mean different things for each of them, since phys_to_dma() may no longer be the same across a single SoC.
I never figured out how that works.
Arnd