Hello.
On Fri, Aug 23, 2024 at 01:05:16PM GMT, Joshua Hahn <joshua.hahn6@gmail.com> wrote:
Niced CPU usage is a metric reported in host-level /proc/stat, but is not reported in cgroup-level statistics in cpu.stat. However, when a host contains multiple tasks across different workloads, it becomes difficult to gauge how much CPU time is being spent on niced processes based on /proc/stat alone, since host-level metrics do not provide this cgroup-level granularity.
The difference between the two metrics is in cputime.c:

	index = (task_nice(p) > 0) ? CPUTIME_NICE : CPUTIME_USER;
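
To make the split concrete, here is a minimal userspace sketch of that classification. It is not the kernel code: `struct cpu_acct`, `classify_user_time()` and `account_user_time_sketch()` are hypothetical names made up for illustration, and the enum is a cut-down stand-in for the kernel's cpustat indices. Only the `nice > 0` test mirrors the line quoted above.

```c
/* Cut-down stand-in for the kernel's cpustat indices (illustration only). */
enum cputime_index { CPUTIME_USER, CPUTIME_NICE, CPUTIME_COUNT };

/* Hypothetical per-group accumulator, not a kernel structure. */
struct cpu_acct {
	unsigned long long cpustat[CPUTIME_COUNT];
};

/* Same test as the quoted line: only a positive nice value counts as "niced". */
static enum cputime_index classify_user_time(int nice)
{
	return (nice > 0) ? CPUTIME_NICE : CPUTIME_USER;
}

/* Charge delta units of user-mode time to exactly one of the two buckets. */
static void account_user_time_sketch(struct cpu_acct *acct, int nice,
				     unsigned long long delta)
{
	acct->cpustat[classify_user_time(nice)] += delta;
}
```

Under this sketch, a task at nice 0 or at a negative nice value is charged to the user bucket, and only a task with nice > 0 lands in the nice bucket, so the two buckets never overlap.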
Exposing this metric will allow load balancers to correctly probe the niced CPU metric for each workload, and make more informed decisions when directing higher priority tasks.
How would this work? (E.g. if too little nice time -> reduce priority of high prio tasks?)
Thanks, Michal