On Tue, Mar 22, 2022 at 08:41:55AM -0700, "T.J. Mercier" <tjmercier@google.com> wrote:
> So "total" is used twice here in two different contexts. The first one
> is the global "GPU" cgroup context. As in any buffer that any exporter
> claims is a GPU buffer, regardless of where/how it is allocated. So
> this refers to the sum of all gpu buffers of any type/source. An
> exporter contributes to this total by registering a corresponding
> gpucg_device and making charges against that device when it exports.
>
> The second one is in a per device context. This allows us to make a
> distinction between different types of GPU memory based on who
> exported the buffer. A single process can make use of several
> different types of dma buffers (for example cached and uncached
> versions of the same type of memory), and it would be useful to have
> different limits for each. These are distinguished by the device name
> string chosen when the gpucg_device is first registered.

So is this understanding correct?
(if there was an analogous line in gpu.memory.current to gpu.memory.max)

	$ cat gpu.memory.current
	total T
	dev1 d1
	...
	devN dn

T = Σ di + RAM_backed_buffers
and that some of RAM_backed_buffers may be accounted also in memory.current (case by case, depending on allocator).
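
To make the question concrete, here is a rough Python sketch of how one could check that relation from userspace. It assumes the hypothetical "name value" layout shown above (gpu.memory.current does not necessarily have a "total" line today), and the function names, device names and byte counts are all made up for illustration. The remainder it computes only equals RAM_backed_buffers if the formula above is in fact correct.

```python
def parse_gpu_memory_current(text):
    """Parse hypothetical 'name value' lines into (total, per_device).

    The layout is an assumption taken from the example in this mail,
    not a documented kernel interface.
    """
    total = None
    per_device = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        name, value = fields[0], int(fields[1])
        if name == "total":
            total = value
        else:
            per_device[name] = value
    if total is None:
        raise ValueError("no 'total' line found")
    return total, per_device


def ram_backed_remainder(total, per_device):
    """Return T - sum(d_i), i.e. what the formula attributes to RAM-backed buffers."""
    return total - sum(per_device.values())


# Made-up sample contents; names and numbers are illustrative only.
sample = """total 4096
dev1 1024
dev2 2048
"""

total, per_device = parse_gpu_memory_current(sample)
remainder = ram_backed_remainder(total, per_device)
```

If the understanding is wrong and T is simply the sum of the per-device values, the remainder would always be zero.
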

Thanks,
Michal