It looks like you have simplified the arm_iommu_map_sg() function too much. The main advantage of the IOMMU is that it maps scattered memory pages into a contiguous DMA address space. DMA-mapping is allowed to merge consecutive entries in the scatter list if the hardware supports that. http://article.gmane.org/gmane.linux.kernel/1128416
I will update arm_iommu_map_sg() so that it coalesces the sg list again.
MMC drivers seem to be aware that SG entries may be coalesced, as they use sg_dma_len().
I have changed arm_iommu_map_sg() back to coalescing and fixed the issues with it. During testing, I found that the MMC host driver doesn't support buffers bigger than 64KB. To get the device working, I had to stop coalescing sg entries when dma_length is about to exceed 64KB. It looks like the MMC host driver (sdhci.c) needs to be fixed to handle buffers bigger than 64KB. Should clients be forced to handle bigger buffers, or is there a better way to handle this kind of issue?
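To make the behaviour described above concrete, here is a minimal userspace sketch of coalescing with a size cap. It is not the actual arm_iommu_map_sg() code: struct dma_seg and coalesce_segments() are hypothetical names, the IOVA space is modelled as a simple linear allocation from iova_base, and every input chunk is assumed to be no larger than max_seg.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one merged DMA segment as seen by the device. */
struct dma_seg {
    uint64_t addr; /* device-visible (IOVA) start address */
    size_t   len;  /* length of the merged segment in bytes */
};

/*
 * Merge consecutive chunks into DMA segments, starting a new segment
 * whenever adding the next chunk would push the current one past max_seg.
 * out[] must have room for n entries. Returns the number of segments used.
 */
size_t coalesce_segments(const size_t *lens, size_t n, uint64_t iova_base,
                         size_t max_seg, struct dma_seg *out)
{
    size_t count = 0;
    uint64_t next_iova = iova_base;

    assert(max_seg > 0);

    for (size_t i = 0; i < n; i++) {
        /* Assumption: no single chunk is larger than the limit. */
        assert(lens[i] <= max_seg);

        /* Open a new segment for the first chunk or when the cap is hit. */
        if (count == 0 || out[count - 1].len + lens[i] > max_seg) {
            out[count].addr = next_iova;
            out[count].len = 0;
            count++;
        }
        out[count - 1].len += lens[i];
        next_iova += lens[i]; /* IOVA space is contiguous in this model */
    }
    return count;
}
```

With a 64KB cap, chunks of 16KB, 16KB, 32KB, 4KB and 60KB would end up as two 64KB segments in this model. A real implementation would also have to respect page alignment before merging entries.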
There is a struct device_dma_parameters *dma_parms member in struct device. You can use it to specify the maximum segment size for the dma_map_sg function. This will of course complicate this function even more...
dma_get_max_seg_size() seems to take care of this issue already. It returns a default max_seg_size of 64K unless the device has defined its own size.
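The fallback described here can be modelled in plain userspace C as below. This is only a sketch of the lookup logic as stated in this thread: the model_ structs and model_get_max_seg_size() are hypothetical stand-ins for struct device, struct device_dma_parameters and dma_get_max_seg_size(), and the exact kernel helper may differ between versions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct device_dma_parameters. */
struct model_dma_parms {
    unsigned int max_segment_size; /* 0 means "not set" in this model */
};

/* Hypothetical stand-in for the relevant part of struct device. */
struct model_device {
    struct model_dma_parms *dma_parms; /* NULL if the driver set nothing */
};

#define MODEL_DEFAULT_MAX_SEG 65536u /* the 64K default mentioned above */

/* Return the device's own limit if it defined one, else the 64K default. */
unsigned int model_get_max_seg_size(const struct model_device *dev)
{
    assert(dev != NULL);

    if (dev->dma_parms && dev->dma_parms->max_segment_size)
        return dev->dma_parms->max_segment_size;
    return MODEL_DEFAULT_MAX_SEG;
}
```

A mapping function could query this limit once per call and stop merging entries when the next one would exceed it, so a driver that can handle larger buffers only needs to advertise a larger segment size.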
Best regards