On Mon, Nov 15, 2021 at 04:10:29PM -0500, Waiman Long longman@redhat.com wrote:
On Mon, Oct 18, 2021 at 10:36:18AM -0400, Waiman Long longman@redhat.com wrote:
- scheduler. Tasks in such a partition must be explicitly bound
- to each individual CPU.
[...]
It can be a problem when one is trying to move a task laterally from one cgroup to another cgroup with non-overlapping cpus. However, if a task is initially from a parent cgroup with an affinity mask that includes cpus in the isolated child cgroup, I believe it should be able to move to the isolated child cgroup without problem. Otherwise, it is a bug that needs to be fixed.
    app_root      cpuset.cpus=0-3
    `- non_rt     cpuset.cpus=0-1  cpuset.cpus.partition=member
    `- rt         cpuset.cpus=2-3  cpuset.cpus.partition=isolated
The app_root would have cpuset.cpus.effective=0-1, so even a task in app_root can't sched_setaffinity() to cpus 2-3. But AFAICS, the migration calls set_cpus_allowed_ptr() anyway, so a task in the isolated partition needn't be bound explicitly with sched_setaffinity(). (It'd have two cpus available, so one more sched_setaffinity() or a migration into a single-cpu list is desirable.)
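To make the arithmetic behind cpuset.cpus.effective=0-1 concrete, here is a minimal sketch in Python. It is a simplified model, not the kernel's code: it assumes the parent's effective cpus are simply its cpuset.cpus minus the cpus handed to child partition roots, and it handles only plain "N" and "N-M" entries of the cpu list format. The helper names (parse_cpulist, format_cpulist, parent_effective) are made up for illustration.

```python
def parse_cpulist(text):
    """Parse a cpu list such as "0-3,8" into a set of ints (plain ranges only)."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def format_cpulist(cpus):
    """Format a set of ints back into compact "0-1,4" notation."""
    ranges = []
    for cpu in sorted(cpus):
        if ranges and cpu == ranges[-1][1] + 1:
            ranges[-1][1] = cpu          # extend the current run
        else:
            ranges.append([cpu, cpu])    # start a new run
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def parent_effective(parent_cpus, child_partition_cpus):
    """Assumed model: cpus granted to child partition roots leave the parent."""
    effective = parse_cpulist(parent_cpus)
    for child in child_partition_cpus:
        effective -= parse_cpulist(child)
    return format_cpulist(effective)


# app_root has cpuset.cpus=0-3; rt is an isolated partition with cpus 2-3.
print(parent_effective("0-3", ["2-3"]))  # prints 0-1
```

The non_rt child is a plain member, so it does not take cpus away from app_root in this model; only the isolated rt partition does.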
All in all, I think the behavior is OK and the explicit binding of tasks in an isolated cpuset is optional (not a must as worded currently).
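For the optional extra binding step, a task that already sits in the isolated cpuset could narrow itself to one cpu from userspace. The sketch below uses Python's os.sched_getaffinity() and os.sched_setaffinity(), which wrap the corresponding Linux syscalls; the function name pin_to_single_cpu and the choice of the lowest allowed cpu are assumptions for illustration only.

```python
import os


def pin_to_single_cpu(pid=0):
    """Narrow pid's affinity to the lowest cpu it is currently allowed to use.

    pid=0 means the calling process. Linux only.
    """
    allowed = os.sched_getaffinity(pid)   # cpus permitted by cpuset + affinity
    cpu = min(allowed)                    # arbitrary pick: lowest allowed cpu
    os.sched_setaffinity(pid, {cpu})
    return cpu
```

Run inside the rt cpuset from the example, this would be expected to leave the task on a single cpu out of 2-3; outside such a cpuset it just picks the lowest cpu in the current mask.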
I think the wording may be confusing. What I meant is that none of the requested cpus can be granted. So if at least one is granted, the effective cpus won't be empty.
Ack.
You currently cannot make a change to cpuset.cpus that violates the cpu exclusivity rule. The above constraints will not prevent you from making the change. They just affect the validity of the partition root.
Sibling exclusivity should be a validity condition regardless of whether the transition is allowed or not. (At least it looks simpler to me.)
Changing a partition root to "member" is always allowed.
If there are child partition roots underneath it, however,
they will be forced to be switched back to "member" too and
lose their partitions. So care must be taken to double check
for this condition before disabling a partition root.
(Or is this how delegation is intended?) However, AFAICS, parent still can't remove cpuset.cpus even when the child is a "member". Otherwise, I agree with the back-switch.
There are only 2 possibilities here. Either we force the child partitions to become members or to become invalid partition roots.
My point here was mostly about preempting the cpus (as a v2 specific feature). (I'm rather indifferent whether children turn into invalid roots or members.)
Thanks, Michal