On Mon, Mar 10, 2025 at 06:44:51PM +0000, Robin Murphy wrote:
On 2025-03-10 4:28 pm, Maxime Ripard wrote:
On Mon, Mar 10, 2025 at 02:56:37PM +0000, Robin Murphy wrote:
On 2025-03-10 12:06 pm, Maxime Ripard wrote:
Consumers of the direct DMA API will have to know which region their device allocates from in order to charge the memory allocation to the right one.
This doesn't seem to make much sense - dma-direct is not an allocator itself; it just provides the high-level dma_alloc_attrs/dma_alloc_pages/etc. interfaces wherein the underlying allocations _could_ come from CMA, but also a per-device coherent/restricted pool, or a global coherent/atomic pool, or the regular page allocator, or in one weird corner case the SWIOTLB buffer, or...
I guess it wasn't super clear, but what I meant is that it's an allocator to the consumer: it gets called, and returns a buffer. How it does so is transparent to the device, and on the other side of the abstraction.
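i.e. from the driver's side it's just something like:

        dma_addr_t dma_handle;
        void *vaddr;

        /* the driver neither knows nor cares which allocator backs this */
        vaddr = dma_alloc_coherent(dev, size, &dma_handle, GFP_KERNEL);
        if (!vaddr)
                return -ENOMEM;
        ...
        dma_free_coherent(dev, size, vaddr, dma_handle);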
I do agree that the logic is complicated to follow, and that's what I was getting at in the cover letter.
Right, but ultimately my point is that when we later end up with:
        struct dmem_cgroup_region *
        dma_get_dmem_cgroup_region(struct device *dev)
        {
                if (dma_alloc_direct(dev, get_dma_ops(dev)))
                        return dma_direct_get_dmem_cgroup_region(dev);
                ...

where dma_direct_get_dmem_cgroup_region() boils down to
dma_contiguous_get_dmem_cgroup_region(dev),
it's objectively wrong given what dma_alloc_direct() means in context:
        void *dma_alloc_attrs(...)
        {
                ...
                if (dma_alloc_direct(dev, ops))
                        cpu_addr = dma_direct_alloc(...);
where dma_direct_alloc() may then use at least 5 different allocation methods, only one of which is CMA. Accounting things which are not CMA to CMA seems to thoroughly defeat the purpose of having such fine-grained accounting at all.
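Roughly, and glossing over the exact conditions and helper names in kernel/dma/direct.c (so take this as a sketch rather than gospel), those candidates are:

        /*
         * Simplified view of the dispatch inside dma_direct_alloc():
         *
         *  1) dma_direct_alloc_no_mapping()    - plain pages when no kernel
         *                                        mapping is requested
         *  2) arch_dma_alloc()                 - arch-specific non-coherent
         *                                        allocator
         *  3) dma_alloc_from_global_coherent() - the global coherent pool
         *  4) dma_direct_alloc_from_pool()     - the atomic pool(s) for
         *                                        non-blocking allocations
         *  5) __dma_direct_alloc_pages()       - dma_alloc_contiguous(),
         *                                        i.e. CMA, falling back to
         *                                        the plain page allocator
         */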
This is why the very notion of "consumers of dma-direct" should fundamentally not be a thing IMO. Drivers consume the DMA API interfaces, and the DMA API ultimately consumes various memory allocators, but what happens in between is nobody else's business; dma-direct happens to represent *some* paths between the two, but there are plenty more paths to the same (and different) allocators through other DMA API implementations as well. Which route a particular call takes to end up at a particular allocator is not meaningful unless you are the DMA ops dispatch code.
Or to put it another way, to even go for the "dumbest possible correct solution", the plumbing of dma_get_dmem_cgroup_region() would need to be about as complex and widespread as the plumbing of dma_alloc_attrs() itself ;)
I largely agree with the sentiment, and I think the very idea of dma_get_dmem_cgroup_region() is a bad one for that reason. But since I wasn't too sure what a good one might look like, I figured it would still be a good way to start the discussion :)
I think I see why a simple DMA attribute couldn't be made to work, as dmem_cgroup_uncharge() can't simply look up the pool the same way dmem_cgroup_try_charge() found it, since we still need a cg for that and get_current_dmemcs() can't be assumed to be stable over time, right? At that point I'm probably starting to lean towards a whole new DMA op with a properly encapsulated return type (and maybe a long-term goal of consolidating the 3 or 4 different allocation types we already have)
It felt like a good solution to me too, and it's what I alluded to with struct page or folio. My feeling was that the best way to do it would be to encapsulate it into the structure returned by the dma_alloc_* API. That's a pretty large rework though, so I wanted to make sure I was on the right path before doing so.
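Something along these lines, purely as a sketch - the struct and function names below are made up, and the dmem_cgroup_uncharge() call is just my reading of the dmem API:

        /* Hypothetical sketch only, nothing like this exists today. */
        struct dma_buffer {
                void *cpu_addr;
                dma_addr_t dma_addr;
                size_t size;
                /* pool charged at allocation time, kept so free can uncharge it */
                struct dmem_cgroup_pool_state *dmem_pool;
        };

        void dma_free_buffer(struct device *dev, struct dma_buffer *buf)
        {
                if (buf->dmem_pool)
                        dmem_cgroup_uncharge(buf->dmem_pool, buf->size);
                /* ... then free the memory through whatever path allocated it ... */
        }

That way the free path never has to re-derive the pool from the current cgroup, which avoids the asymmetry you point out.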
or just have a single dmem region for "DMA API memory" and don't care where it came from (although I do see the issues with that too - you probably wouldn't want to ration a device-private pool the same way as global system memory, for example)
Yeah, the CMA pool is probably something you want to limit differently as well.
Maxime