 
            Linus Torvalds torvalds@linux-foundation.org writes:
On Thu, Mar 24, 2022 at 10:07 AM Toke Høiland-Jørgensen toke@toke.dk wrote:
Right, but is that sync_for_device call really needed?
Well, imagine that you have a non-cache-coherent DMA (not bounce buffers - just bad hardware)...
So the driver first does that dma_sync_single_for_cpu() for the CPU see the current state (for the non-cache-coherent case it would just invalidate caches).
The driver then examines the command buffer state, sees that it's still in progress, and does that return -EINPROGRESS.
It's actually very natural in that situation to flush the caches from the CPU side again. And so dma_sync_single_for_device() is a fairly reasonable thing to do in that situation.
But it doesn't seem *required*, no. The CPU caches only have a copy of the data in them, no writeback needed (and writeback wouldn't work since DMA from the device may be in progress).
So I don't think the dma_sync_single_for_device() is *wrong* per se, because the CPU didn't actually do any modifications.
But yes, I think it's unnecessary - because any later CPU accesses would need that dma_sync_single_for_cpu() anyway, which should invalidate any stale caches.
OK, the above was basically how I understood it. Thank you for confirming!
And it clearly doesn't work in a bounce-buffer situation, but honestly I don't think a "CPU modified buffers concurrently with DMA" can *ever* work in that situation, so I think it's wrong for a bounce buffer model to ever do anything in the dma_sync_single_for_device() situation.
Right.
Does removing that dma_sync_single_for_device() actually fix the problem for the ath driver?
I am hoping Oleksandr can help answer that since my own ath9k hardware is currently on strike :(
-Toke