It looks like you have simplified the arm_iommu_map_sg() function too much. The main advantage of the IOMMU is that it maps scattered memory pages into a contiguous DMA address space. DMA-mapping is allowed to merge consecutive entries in the scatter list if the hardware supports that. http://article.gmane.org/gmane.linux.kernel/1128416
I will update arm_iommu_map_sg() to coalesce the sg list again.
MMC drivers seem to be aware that SG entries can be coalesced, as they use sg_dma_len().
I have updated arm_iommu_map_sg() to coalesce again and fixed the issues with it. During testing, I found that the MMC host driver doesn't support buffers bigger than 64KB. To get the device working, I had to stop coalescing sg entries when dma_length is about to exceed 64KB. It looks like the MMC host driver (sdhci.c) needs to be fixed to handle buffers bigger than 64KB. Should clients be forced to handle bigger buffers, or is there a better way to handle this kind of issue?
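To make the 64KB cut-off concrete, here is a minimal userspace sketch of the coalescing rule described above. It is not the actual arm_iommu_map_sg() code: struct seg, coalesce_sg() and the max_seg parameter are hypothetical names made up for illustration, and the sketch assumes every input chunk is mapped back to back in the DMA address space and is itself no larger than the limit. In the kernel, a per-device limit of this kind is usually queried with dma_get_max_seg_size(); whether the real function should take its limit from there is an assumption, not something settled in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one mapped entry; NOT the kernel's struct scatterlist. */
struct seg {
    uint64_t dma_addr;  /* start of the segment in the contiguous DMA address space */
    size_t   dma_len;   /* length of the (possibly merged) segment in bytes */
};

/*
 * Lay out n chunks back to back starting at iova_base and merge neighbours
 * into one segment, starting a new segment whenever adding the next chunk
 * would push the current segment past max_seg bytes.
 *
 * out must have room for n entries. Returns the number of segments written.
 * Assumption: each chunk_len[i] is <= max_seg.
 */
static size_t coalesce_sg(const size_t *chunk_len, size_t n,
                          uint64_t iova_base, size_t max_seg,
                          struct seg *out)
{
    size_t nout = 0;
    uint64_t iova = iova_base;

    assert(max_seg > 0);

    for (size_t i = 0; i < n; i++) {
        assert(chunk_len[i] <= max_seg);

        if (nout > 0 && out[nout - 1].dma_len + chunk_len[i] <= max_seg) {
            /* Still within the limit: grow the current segment. */
            out[nout - 1].dma_len += chunk_len[i];
        } else {
            /* First chunk, or the limit would be exceeded: start a new segment. */
            out[nout].dma_addr = iova;
            out[nout].dma_len = chunk_len[i];
            nout++;
        }
        iova += chunk_len[i];
    }
    return nout;
}
```

With forty 4KB pages and a 64KB limit, this merges the list into segments of 64KB, 64KB and 32KB instead of one 160KB segment, which is the behaviour the MMC host driver needed in testing.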
-- nvpublic