On 03/04/24 10:47, Waiman Long wrote:
On 4/3/24 10:26, Valentin Schneider wrote:
IIUC that was Thomas' suggestion [1], but I can't tell yet how bad it would be to change cgroup_lock() to also do a cpus_read_lock().
Changing the locking order is certainly doable. I have taken a cursory look at it and at least the following files need to be changed:
kernel/bpf/cgroup.c kernel/cgroup/cgroup.c kernel/cgroup/legacy_freezer.c mm/memcontrol.c
That requires a lot more testing to make sure that there won't be a regression. Given that the call to cgroup_transfer_tasks() should be rare these days as it will only apply in the case of cgroup v1 under certain condtion, I am not sure this requirement justifies making such extensive changes. So I kind of defer it until we reach a consensus that it is the right thing to do.
Yeah if we can avoid it initially I'd be up for it.
Just one thing that came to mind - there's no flushing of the cpuset_migrate_tasks_workfn() work, so the scheduler might move tasks itself before the cpuset does via:
balance_push() ->__balance_push_cpu_stop() -> select_fallback_rq()
But, given the current placement of cpuset_wait_for_hotplug(), I believe that's something we can already have, so we should be good.
Cheers, Longman