On Tue, Jul 27, 2021 at 11:56:25AM -0400, Waiman Long wrote:
On 7/27/21 7:42 AM, Frederic Weisbecker wrote:
On Tue, Jul 20, 2021 at 10:18:31AM -0400, Waiman Long wrote:
Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=TBD
commit 994fb794cb252edd124a46ca0994e37a4726a100
Author: Waiman Long <longman@redhat.com>
Date: Sat, 19 Jun 2021 13:28:19 -0400

    cgroup/cpuset: Add a new isolated cpus.partition type

    Cpuset v1 uses the sched_load_balance control file to determine if
    load balancing should be enabled. Cpuset v2 gets rid of
    sched_load_balance as its use may require disabling load balancing
    at cgroup root.

    For workloads that require very low latency like DPDK, the latency
    jitters caused by periodic load balancing may exceed the desired
    latency limit.

    When cpuset v2 is in use, the only way to avoid this latency cost is
    to use the "isolcpus=" kernel boot option to isolate a set of CPUs.
    After the kernel boot, however, there is no way to add or remove
    CPUs from this isolated set. For workloads that are more dynamic in
    nature, that means users have to provision enough CPUs for the worst
    case situation resulting in excess idle CPUs.

    To address this issue for cpuset v2, a new cpuset.cpus.partition
    type "isolated" is added which allows the creation of a cpuset
    partition without load balancing. This will allow system
    administrators to dynamically adjust the size of isolated partition
    to the current need of the workload without rebooting the system.

    Signed-off-by: Waiman Long <longman@redhat.com>

Signed-off-by: Waiman Long <longman@redhat.com>
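
For readers who want to picture the administrator-side workflow the commit message describes, here is a minimal sketch, not part of the patch. It assumes the cgroup v2 hierarchy is mounted, that the cpuset controller is already enabled in the parent's cgroup.subtree_control, and that the kernel carries this patch so that "isolated" is an accepted value. The helper name make_isolated_partition and the example cgroup path are hypothetical.

```python
import os


def make_isolated_partition(cgroup_dir, cpus):
    """Request an isolated cpuset partition for the given cgroup directory.

    cgroup_dir: path of the child cgroup (hypothetical example:
                /sys/fs/cgroup/lowlat); created if it does not exist.
    cpus:       CPU list string such as "2-5".
    """
    # On cgroupfs, creating a directory creates the cgroup itself.
    os.makedirs(cgroup_dir, exist_ok=True)

    # Assign the CPUs first, then switch the partition type.
    with open(os.path.join(cgroup_dir, "cpuset.cpus"), "w") as f:
        f.write(cpus)

    # "isolated" is the new partition type proposed by this patch;
    # kernels without the patch are expected to reject the write.
    with open(os.path.join(cgroup_dir, "cpuset.cpus.partition"), "w") as f:
        f.write("isolated")
```

Resizing the partition later would then be a matter of rewriting cpuset.cpus, which is the dynamic adjustment that the boot-time "isolcpus=" option cannot offer.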
Nice! And while we are adding a new ABI, can we take advantage of that and add a specific semantic: if a new isolated partition matches a subset of "isolcpus=", it automatically maps to it? This means that any further modification to that isolated partition will also modify the associated isolcpus= subset.
Or to summarize, when we create a new isolated partition, remove the associated CPUs from isolcpus= ?
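
The set arithmetic behind that suggestion can be sketched as follows. This is only an illustration of the proposed semantic, not kernel code: the function names are hypothetical, the parser handles plain "a-b,c" CPU lists only, and any flags that can precede the list in "isolcpus=" are assumed to have been stripped already.

```python
def parse_cpu_list(text):
    """Parse a simple CPU list such as "2-5,8" into a set of ints."""
    cpus = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-", 1)
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def remaining_isolcpus(isolcpus, partition_cpus):
    """Return the isolcpus= set left after mapping a partition onto it.

    Returns None when the partition is not a subset of isolcpus=,
    i.e. when the implicit mapping would not apply.
    """
    iso = parse_cpu_list(isolcpus)
    part = parse_cpu_list(partition_cpus)
    if not part <= iso:
        return None
    return iso - part
```

For example, with isolcpus=2-7 and a new isolated partition on CPUs 4-5, the CPUs still owned by the boot-time set would be 2, 3, 6 and 7.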
We can certainly do that as a follow-on.
I'm just concerned that this feature gets merged before we add that new isolcpus= implicit mapping, which technically is a new ABI. Well I guess I should hurry up and try to propose a patchset quickly once I'm back from vacation :-)
Another idea that I have been thinking about is to automatically generate an isolated partition under root to match the given isolcpus parameter when the v2 filesystem is mounted. That needs more experimentation and testing to verify that it can work.
I thought about that too, mounting an "isolcpus" subdirectory within the top cpuset, but I was worried it could break userspace that wouldn't expect that new thing to show up.
Thanks.