Re: [RFC v3 1/8] gpu: rfc: Proposal for a GPU cgroup controller

23 Mar 2022


On Tue, Mar 22, 2022 at 08:41:55AM -0700, "T.J. Mercier" <tjmercier@google.com> wrote:
...
So "total" is used twice here in two different contexts.
The first one is the global "GPU" cgroup context: any buffer that any
exporter claims is a GPU buffer, regardless of where or how it is
allocated. So this refers to the sum of all GPU buffers of any
type/source. An exporter contributes to this total by registering a
corresponding gpucg_device and charging that device when it exports
a buffer.
The second one is in a per device context. This allows us to make a
distinction between different types of GPU memory based on who
exported the buffer. A single process can make use of several
different types of dma buffers (for example cached and uncached
versions of the same type of memory), and it would be useful to have
different limits for each. These are distinguished by the device name
string chosen when the gpucg_device is first registered.
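
To make the two contexts concrete, here is a minimal toy model of the
accounting described above. It is only a sketch, not the kernel
implementation: the class, its methods, and the device name strings in
the usage note are hypothetical and chosen purely for illustration.

```python
from collections import defaultdict


class GpuCgroupModel:
    """Toy model of one cgroup's GPU charges (hypothetical, not kernel code)."""

    def __init__(self):
        # Bytes charged per device, keyed by the device name string
        # an exporter would choose when registering its device.
        self.per_device = defaultdict(int)

    def charge(self, device_name, nbytes):
        """Record an export of nbytes against one device."""
        self.per_device[device_name] += nbytes

    def uncharge(self, device_name, nbytes):
        """Release nbytes previously charged against one device."""
        if nbytes > self.per_device[device_name]:
            raise ValueError("uncharge exceeds current charge")
        self.per_device[device_name] -= nbytes

    def device_total(self, device_name):
        """Per-device context: bytes charged to a single device."""
        return self.per_device.get(device_name, 0)

    def global_total(self):
        """Global context: sum over every registered device."""
        return sum(self.per_device.values())
```

With charges of 4096 bytes against a "cached-heap" device and 8192
bytes against an "uncached-heap" device (both names made up), the
per-device totals stay separate while the global total is 12288.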
So is the following understanding correct (assuming gpu.memory.current
had a "total" line analogous to the one in gpu.memory.max)?
    $ cat gpu.memory.current
    total T
    dev1  d1
    ...
    devN  dN
That is,
    T = Σ d_i + RAM_backed_buffers
and some of RAM_backed_buffers may also be accounted in
memory.current (case by case, depending on the allocator).
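
The relation can be checked mechanically. The sketch below assumes the
hypothetical file layout shown above (a "total" line followed by one
"name value" line per device, values in bytes); no such layout is
confirmed for gpu.memory.current, and the function names and sample
numbers are made up for illustration.

```python
def parse_gpu_memory_current(text):
    """Split the assumed 'name value' lines into (total, per-device dict)."""
    total = None
    devices = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        name, value = fields[0], int(fields[1])
        if name == "total":
            total = value
        else:
            devices[name] = value
    if total is None:
        raise ValueError("no 'total' line found")
    return total, devices


def ram_backed_remainder(text):
    """Return T minus the sum of d_i, i.e. the RAM_backed_buffers term."""
    total, devices = parse_gpu_memory_current(text)
    return total - sum(devices.values())


# Made-up sample in the assumed layout.
SAMPLE = """\
total 12288
dev1 4096
dev2 6144
"""
```

For this sample, ram_backed_remainder(SAMPLE) gives 2048, which under
the understanding above would be the RAM-backed part not attributed to
any device.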
Thanks,
Michal
