On Fri, Dec 01, 2023 at 05:21:32PM +0000, Carlos Llamas wrote:
Task A calls binder_update_page_range() to allocate and insert pages on a remote address space from Task B. For this, Task A pins the remote mm via mmget_not_zero() first. This can race with Task B do_exit() and the final mmput() refcount decrement will come from Task A.
        Task A          |       Task B
  ------------------+------------------
  mmget_not_zero()  |
                    |  do_exit()
                    |    exit_mm()
                    |      mmput()
  mmput()           |
    exit_mmap()     |
      remove_vma()  |
        fput()      |
In this case, the work of ____fput() from Task B is queued up in Task A as TWA_RESUME. So in theory, Task A returns to userspace and the cleanup work gets executed. However, Task A instead sleeps, waiting for a reply from Task B that never comes (it's dead).
This means the binder_deferred_release() is blocked until an unrelated binder event forces Task A to go back to userspace. All the associated death notifications will also be delayed until then.
To fix this, use mmput_async(), which schedules the work on the corresponding mm->async_put_work workqueue instead of on Task A.
Fixes: 457b9a6f09f0 ("Staging: android: add binder driver")
Reviewed-by: Alice Ryhl <aliceryhl@google.com>
Signed-off-by: Carlos Llamas <cmllamas@google.com>
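The ordering problem described above can be replayed with a small userspace model. This is only a sketch under stated assumptions: `toy_mm`, `toy_task`, `toy_mmput()`, `toy_mmput_async()` and `replay_race()` are hypothetical names made up for illustration, not kernel APIs, and the model reduces the whole teardown to a single "file released" flag. It shows only why deferring the final put to a workqueue does not depend on Task A returning to userspace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: none of these types or helpers exist in the kernel. */
enum { MAX_WORK = 8 };

struct work_queue {
	void (*fn[MAX_WORK])(void *);
	void *arg[MAX_WORK];
	size_t len;
};

struct toy_task {
	struct work_queue task_work;	/* runs only on return to userspace */
	bool sleeping;
};

struct toy_mm {
	int users;
	bool file_released;		/* stands in for the ____fput() work */
};

/* Stands in for a workqueue serviced independently of any task. */
static struct work_queue system_wq;

static void queue_item(struct work_queue *q, void (*fn)(void *), void *arg)
{
	assert(q->len < MAX_WORK);
	q->fn[q->len] = fn;
	q->arg[q->len] = arg;
	q->len++;
}

static void run_queue(struct work_queue *q)
{
	for (size_t i = 0; i < q->len; i++)
		q->fn[i](q->arg[i]);
	q->len = 0;
}

static void release_file(void *arg)
{
	((struct toy_mm *)arg)->file_released = true;
}

/* Synchronous put: the final release is queued on the calling task. */
static void toy_mmput(struct toy_mm *mm, struct toy_task *caller)
{
	if (--mm->users == 0)
		queue_item(&caller->task_work, release_file, mm);
}

/* Async put: the final release is handed to the workqueue instead. */
static void toy_mmput_async(struct toy_mm *mm)
{
	if (--mm->users == 0)
		queue_item(&system_wq, release_file, mm);
}

/* A sleeping task never gets to run its pending task work. */
static void task_return_to_user(struct toy_task *t)
{
	if (!t->sleeping)
		run_queue(&t->task_work);
}

/* Replays the race; returns true if the file was released while Task A sleeps. */
static bool replay_race(bool use_async)
{
	struct toy_task task_a = { .sleeping = false };
	struct toy_task task_b = { .sleeping = false };
	struct toy_mm mm = { .users = 1 };	/* Task B's own reference */

	system_wq.len = 0;
	mm.users++;				/* Task A: mmget_not_zero() */
	toy_mmput(&mm, &task_b);		/* Task B: do_exit() drops its ref */

	if (use_async)
		toy_mmput_async(&mm);		/* Task A drops the last ref */
	else
		toy_mmput(&mm, &task_a);

	task_a.sleeping = true;			/* Task A waits for dead Task B */
	task_return_to_user(&task_a);		/* no-op while sleeping */
	run_queue(&system_wq);			/* workqueue runs regardless */

	return mm.file_released;
}
```

In this model, `replay_race(false)` leaves the release stuck behind the sleeping Task A, while `replay_race(true)` completes it from the workqueue.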
Sorry, I forgot to Cc: stable@vger.kernel.org.
-- Carlos Llamas
On Thu, Jan 18, 2024 at 07:29:07PM +0000, Carlos Llamas wrote:
Sorry, I forgot to Cc: stable@vger.kernel.org.
<formletter>
This is not the correct way to submit patches for inclusion in the stable kernel tree. Please read: https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html for how to do this properly.
</formletter>
On Fri, Jan 19, 2024 at 06:48:43AM +0100, Greg Kroah-Hartman wrote:
<formletter>
This is not the correct way to submit patches for inclusion in the stable kernel tree. Please read: https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html for how to do this properly.
</formletter>
Oops, here is the complete info:
Commit ID: 9a9ab0d963621d9d12199df9817e66982582d5a5
Subject: "binder: fix race between mmput() and do_exit()"
Reason: Fixes a race condition in binder.
Versions: v4.19+
Note this will have a trivial conflict in v4.19 and v5.10 kernels as commit d8ed45c5dcd4 is not there. Please let me know if I should send those patches separately.
Thanks,
--
Carlos Llamas
On Fri, Jan 19, 2024 at 05:06:13PM +0000, Carlos Llamas wrote:
Oops, here is the complete info:
Commit ID: 9a9ab0d963621d9d12199df9817e66982582d5a5
Subject: "binder: fix race between mmput() and do_exit()"
Reason: Fixes a race condition in binder.
Versions: v4.19+
Note this will have a trivial conflict in v4.19 and v5.10 kernels as commit d8ed45c5dcd4 is not there. Please let me know if I should send those patches separately.
Thanks,
Carlos Llamas
Sigh, I meant to type "conflict in v4.19 and v5.4". The patch applies cleanly in v5.10+.
On Fri, Jan 19, 2024 at 05:37:22PM +0000, Carlos Llamas wrote:
Sigh, I meant to type "conflict in v4.19 and v5.4". The patch applies cleanly in v5.10+.
Yes, I need backported patches please.
thanks,
greg k-h
On Sat, Jan 20, 2024 at 07:37:25AM +0100, Greg Kroah-Hartman wrote:
Yes, I need backported patches please.
thanks,
greg k-h
Backports have been sent.
linux-4.19.y: https://lore.kernel.org/all/20240122174250.2123854-1-cmllamas@google.com/
linux-5.4.y: https://lore.kernel.org/all/20240122175751.2214176-1-cmllamas@google.com/
The patch should apply cleanly in remaining stable branches.
Thanks,
--
Carlos Llamas
linux-stable-mirror@lists.linaro.org