On 8/27/21 7:35 PM, Tejun Heo wrote:
Hello,
On Fri, Aug 27, 2021 at 06:50:10PM -0400, Waiman Long wrote:
The cpu exclusivity rule is due to the setting of the CPU_EXCLUSIVE bit. This is a pre-existing condition unless you want to change how cpuset.cpu_exclusive works.
So the new rules will be:
- The "cpuset.cpus" is not empty and the list of CPUs is exclusive.
Empty cpu list can be considered an exclusive one.
It doesn't make sense to me to have a partition with no cpu configured at all. I very much prefer the users to set cpuset.cpus first before turning it into a partition.
- The parent cgroup is a partition root (can be an invalid one).
Does this mean a partition parent can't stop being a partition if one or more of its children become partitions? If so, it violates the rule that a descendant shouldn't be able to restrict what its ancestors can do.
No. As I said in the documentation, transitioning from partition root to member is allowed. Again, it is illogical to allow a cpuset to become a potential partition if its parent is not even a partition root at all. In the case that the parent is reverted back to a member, the child partitions will stay invalid forever unless the parent becomes a valid partition again.
- The "cpuset.cpus" is a subset of the parent's cpuset.cpus.allowed.
Why not just go by effective? This would mean that a parent can't withdraw CPUs from its allowed set once descendants are configured. Restrictions like this are fine when the entire hierarchy is configured by a single entity but become awkward when configurations are multi-tiered, automated and dynamic.
The original rule is to be based on effective cpus. However, to properly handle the case of allowing offlined cpus to be included in the partition, I have to change it to cpu_allowed instead. I can certainly change it back to effective if you prefer.
- No child cgroup with cpuset enabled.
idk, maybe? I'm having a hard time seeing the point in adding these restrictions when the state transitions are asynchronous anyway. Would it help if we try to separate what's absolutely and technically necessary from what seems reasonable or a high bar, and try to justify why each of the latter should be added?
This rule is there mainly for ease of implementation. Otherwise, I need to add additional code to handle the conversion of child cpusets which can be rather complex and require a lot more debugging. This rule will no longer apply once the cpuset becomes a partition root.
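For reference, the four proposed rules above can be modelled roughly as follows. This is only a toy sketch in Python, not kernel code: the Cpuset class, its fields and partition_violations() are hypothetical names made up for illustration. It also assumes the parent's allowed set is simply the parent's configured cpus, and that "child cgroup with cpuset enabled" means any child at all.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass(eq=False)
class Cpuset:
    """Toy model of a cpuset node (hypothetical; not the kernel's struct cpuset)."""
    name: str
    cpus: Set[int]                  # stands in for "cpuset.cpus"
    partition_root: bool = False    # True for a valid or invalid partition root
    parent: Optional["Cpuset"] = None
    children: List["Cpuset"] = field(default_factory=list)

    def add_child(self, child: "Cpuset") -> "Cpuset":
        child.parent = self
        self.children.append(child)
        return child


def partition_violations(cs: Cpuset) -> List[str]:
    """List the proposed rules that would stop cs from becoming a partition root."""
    parent = cs.parent
    if parent is None:
        # The top cpuset is not covered by this sketch.
        return ["no parent (the top cpuset is outside this sketch)"]

    problems = []
    # Rule 1: cpus must be non-empty and exclusive among siblings.
    if not cs.cpus:
        problems.append("cpus is empty")
    if any(cs.cpus & sib.cpus for sib in parent.children if sib is not cs):
        problems.append("cpus overlap a sibling")
    # Rule 2: the parent must be a partition root (possibly an invalid one).
    if not parent.partition_root:
        problems.append("parent is not a partition root")
    # Rule 3: cpus must be a subset of the parent's set (assumed: parent.cpus).
    if not cs.cpus <= parent.cpus:
        problems.append("cpus not a subset of parent's cpus")
    # Rule 4: no child cpusets at the time of conversion.
    if cs.children:
        problems.append("has child cpusets")
    return problems


root = Cpuset("root", {0, 1, 2, 3}, partition_root=True)
a = root.add_child(Cpuset("a", {0, 1}))
b = root.add_child(Cpuset("b", {1, 2}))
print(partition_violations(a))
```

An empty result means none of the proposed rules blocks the transition. Switching rule 3 to effective cpus, as discussed above, would only change which set of the parent is compared.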
Cheers, Longman