On 6/3/22 10:34, Tadeusz Struk wrote:
Syzbot found a corrupted list bug scenario that can be triggered from cgroup_subtree_control_write(cgrp). The reproducer writes to the cgroup.subtree_control file, which invokes cgroup_apply_control_enable()->css_create()->css_populate_dir(), which then fails with a fault-injected -ENOMEM. In that scenario css_killed_work_fn() is enqueued via cgroup_apply_control_disable(cgrp)->kill_css(css), and the write path bails out to cgroup_kn_unlock(). Then cgroup_kn_unlock() calls cgroup_put(cgrp)->css_put(&cgrp->self), which tries to enqueue css_release_work_fn() for the same css instance, causing a list_add corruption bug, as can be seen in the syzkaller report [1].
Fix this by synchronizing the css ref_kill and css_release jobs. The css_release() function will check whether css_killed_work_fn() has been scheduled for the css, and will enqueue css_release_work_fn() only if it has not. Otherwise css_release() will set the CSS_REL_LATER flag for that css, which causes css_release_work_fn() to be executed after css_killed_work_fn() has finished.
Two css flags have been introduced to implement this serialization mechanism:
- CSS_KILL_ENQED, which will be set when css_killed_work_fn() is enqueued, and
- CSS_REL_LATER, which, if set, will cause css_release_work_fn() to be scheduled after css_killed_work_fn() has finished.
There is also a new lock, which will protect the integrity of the css flags.
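The flag handshake described above can be sketched as a small single-threaded userspace model. This is not the patch itself: struct css_model, enqueue(), run_pending(), model_kill(), model_release() and run_scenario() are hypothetical names, the flag bit values are assumed, and the new lock is left out because the model never runs concurrently. The single "pending" slot stands in for the work item that both jobs share.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag names come from the commit message; the bit values are assumed. */
enum {
	CSS_KILL_ENQED = 1u << 0,
	CSS_REL_LATER  = 1u << 1,
};

struct css_model;
typedef void (*work_fn)(struct css_model *);

/* Hypothetical stand-in for a css; not the kernel structure. */
struct css_model {
	unsigned int flags;	/* the patch guards these with the new lock */
	work_fn pending;	/* one shared work slot, NULL when idle */
	int killed_runs;
	int release_runs;
};

static void enqueue(struct css_model *css, work_fn fn)
{
	/* Queuing onto an occupied slot models the list corruption. */
	assert(css->pending == NULL);
	css->pending = fn;
}

static void release_work_fn(struct css_model *css)
{
	css->release_runs++;
}

static void killed_work_fn(struct css_model *css)
{
	css->killed_runs++;
	css->flags &= ~CSS_KILL_ENQED;
	/* A release arrived while the kill job was queued: run it now. */
	if (css->flags & CSS_REL_LATER) {
		css->flags &= ~CSS_REL_LATER;
		enqueue(css, release_work_fn);
	}
}

static void model_kill(struct css_model *css)
{
	css->flags |= CSS_KILL_ENQED;
	enqueue(css, killed_work_fn);
}

static void model_release(struct css_model *css)
{
	if (css->flags & CSS_KILL_ENQED)
		css->flags |= CSS_REL_LATER;	/* defer behind the kill job */
	else
		enqueue(css, release_work_fn);
}

/* Run one queued job, if any; returns false when the slot is idle. */
static bool run_pending(struct css_model *css)
{
	work_fn fn = css->pending;

	if (fn == NULL)
		return false;
	css->pending = NULL;
	fn(css);
	return true;
}

/* Kill first, then release, as in the failing subtree_control write. */
struct css_model run_scenario(void)
{
	struct css_model css = {0};

	model_kill(&css);
	model_release(&css);
	while (run_pending(&css))
		;
	return css;
}
```

Without the CSS_KILL_ENQED check in model_release(), the second enqueue() would hit the occupied slot and trip the assertion, which is the model's analogue of the reported list_add corruption.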
[1] https://syzkaller.appspot.com/bug?id=e26e54d6eac9d9fb50b221ec3e4627b327465db...
Cc: Tejun Heo <tj@kernel.org>
Cc: Michal Koutny <mkoutny@suse.com>
Cc: Zefan Li <lizefan.x@bytedance.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Alexei Starovoitov <ast@kernel.org>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Martin KaFai Lau <kafai@fb.com>
Cc: Song Liu <songliubraving@fb.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: John Fastabend <john.fastabend@gmail.com>
Cc: KP Singh <kpsingh@kernel.org>
Cc: cgroups@vger.kernel.org
Cc: netdev@vger.kernel.org
Cc: bpf@vger.kernel.org
Cc: stable@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Reported-and-tested-by: syzbot+e42ae441c3b10acf9e9d@syzkaller.appspotmail.com
Fixes: 8f36aaec9c92 ("cgroup: Use rcu_work instead of explicit rcu and work item")
Signed-off-by: Tadeusz Struk <tadeusz.struk@linaro.org>
I just spotted an issue with this: I'm holding an invalid lock in css_killed_work_fn(). I will follow up with a v2 of the patch soon.