When we migrate a task between two cgroups, one of the checks verifies whether we can modify the task's scheduler settings (cap_task_setscheduler()).
An implicit migration also occurs when enabling a controller on the unified hierarchy (think of a parent-to-child migration). The aforementioned check may be problematic if the caller of the migration (the one enabling the controller) has no permissions over the migrated tasks, for instance when a user's cgroup ends up running a process of a different user. Although the cgroup permissions are configured favorably, the enablement fails due to the foreign process [1].
Change the behavior by relaxing the permissions check on the unified hierarchy (or in v2 mode). This is in accordance with the unified hierarchy attachment behavior, where the permissions on the source and target cgroups are decisive whereas the migrated task is opaque (as opposed to the more restrictive check in __cgroup1_procs_write()).
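The effect of the relaxed check can be illustrated with a small userspace model. This is only a sketch under stated assumptions: struct mock_task, mock_setscheduler_check() and mock_can_attach() are hypothetical names invented for illustration, and the uid comparison is a simplification that stands in for the real rights-over-task check. It is not kernel code.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a task; not a kernel structure. */
struct mock_task {
	unsigned int owner_uid;
};

/*
 * Simplified stand-in for the scheduler-rights check (assumption): the
 * caller may only touch tasks it owns, unless it is root (uid 0).
 */
static int mock_setscheduler_check(unsigned int caller_uid,
				   const struct mock_task *task)
{
	if (caller_uid == 0 || caller_uid == task->owner_uid)
		return 0;
	return -1;	/* models a permission error */
}

/*
 * Models the per-task loop: in v2 mode the rights-over-task check is
 * skipped, otherwise a single foreign task fails the whole attach.
 */
static int mock_can_attach(unsigned int caller_uid,
			   const struct mock_task *tasks, size_t n,
			   bool v2_mode)
{
	for (size_t i = 0; i < n; i++) {
		int ret;

		if (v2_mode)
			continue;
		ret = mock_setscheduler_check(caller_uid, &tasks[i]);
		if (ret)
			return ret;
	}
	return 0;
}
```

In this model, a caller with uid 1000 attaching tasks owned by uids 1000 and 1001 is rejected when v2_mode is false and accepted when it is true, which mirrors the foreign-process case described above.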
[1] https://github.com/systemd/systemd/issues/18293#issuecomment-831205649
Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
 kernel/cgroup/cpuset.c | 7 +++++++
 1 file changed, 7 insertions(+)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index e4ca2dd2b764..3b5f87a9a150 100644
--- a/kernel/cgroup/cpuset.c
+++ b/kernel/cgroup/cpuset.c
@@ -2495,6 +2495,13 @@ static int cpuset_can_attach(struct cgroup_taskset *tset)
 		ret = task_can_attach(task, cs->effective_cpus);
 		if (ret)
 			goto out_unlock;
+
+		/*
+		 * Skip rights over task check in v2, migration permission derives
+		 * from hierarchy ownership in cgroup_procs_write_permission()).
+		 */
+		if (is_in_v2_mode())
+			continue;
 		ret = security_task_setscheduler(task);
 		if (ret)
 			goto out_unlock;