On Fri, Sep 27, 2024 at 06:15:40PM +0200, Alice Ryhl wrote:
On Fri, Sep 27, 2024 at 6:13 PM Yu-Ting Tseng yutingtseng@google.com wrote:
On Fri, Sep 27, 2024 at 12:19 AM Alice Ryhl aliceryhl@google.com wrote:
On Fri, Sep 27, 2024 at 1:37 AM Carlos Llamas cmllamas@google.com wrote:
Alice points out that binder_request_freeze_notification() should not return EINVAL when the relevant node is dead [1]. The node can die at any point even if the user input is valid. Instead, allow the request to be allocated but skip the initial notification for dead nodes. This avoids propagating unnecessary errors back to userspace.
Fixes: d579b04a52a1 ("binder: frozen notification") Cc: stable@vger.kernel.org Suggested-by: Alice Ryhl aliceryhl@google.com Link: https://lore.kernel.org/all/CAH5fLghapZJ4PbbkC8V5A6Zay-_sgTzwVpwqk6RWWUNKKyJ... [1] Signed-off-by: Carlos Llamas cmllamas@google.com
drivers/android/binder.c | 28 +++++++++++++--------------- 1 file changed, 13 insertions(+), 15 deletions(-)
diff --git a/drivers/android/binder.c b/drivers/android/binder.c index 73dc6cbc1681..415fc9759249 100644 --- a/drivers/android/binder.c +++ b/drivers/android/binder.c @@ -3856,7 +3856,6 @@ binder_request_freeze_notification(struct binder_proc *proc, { struct binder_ref_freeze *freeze; struct binder_ref *ref;
bool is_frozen; freeze = kzalloc(sizeof(*freeze), GFP_KERNEL); if (!freeze)
@@ -3872,32 +3871,31 @@ binder_request_freeze_notification(struct binder_proc *proc, }
binder_node_lock(ref->node);
if (ref->freeze || !ref->node->proc) {
binder_user_error("%d:%d invalid BC_REQUEST_FREEZE_NOTIFICATION %s\n",
proc->pid, thread->pid,
ref->freeze ? "already set" : "dead node");
if (ref->freeze) {
binder_user_error("%d:%d BC_REQUEST_FREEZE_NOTIFICATION already set\n",
proc->pid, thread->pid); binder_node_unlock(ref->node); binder_proc_unlock(proc); kfree(freeze); return -EINVAL; }
binder_inner_proc_lock(ref->node->proc);
is_frozen = ref->node->proc->is_frozen;
binder_inner_proc_unlock(ref->node->proc); binder_stats_created(BINDER_STAT_FREEZE); INIT_LIST_HEAD(&freeze->work.entry); freeze->cookie = handle_cookie->cookie; freeze->work.type = BINDER_WORK_FROZEN_BINDER;
freeze->is_frozen = is_frozen;
ref->freeze = freeze;
binder_inner_proc_lock(proc);
binder_enqueue_work_ilocked(&ref->freeze->work, &proc->todo);
binder_wakeup_proc_ilocked(proc);
binder_inner_proc_unlock(proc);
if (ref->node->proc) {
binder_inner_proc_lock(ref->node->proc);
freeze->is_frozen = ref->node->proc->is_frozen;
binder_inner_proc_unlock(ref->node->proc);
binder_inner_proc_lock(proc);
binder_enqueue_work_ilocked(&freeze->work, &proc->todo);
binder_wakeup_proc_ilocked(proc);
binder_inner_proc_unlock(proc);
This is not a problem with your change ... but, why exactly are we scheduling the BINDER_WORK_FROZEN_BINDER right after creating it? For death notications, we only schedule it immediately if the process is dead. So shouldn't we only schedule it if the process is not frozen?
For death notifications, we only care about a remote binder's death. Unlike freeze, in which we have a state that can toggle at any point. This is important for suspending and resuming transactions to a node.
Sending the freeze notification immediately allows for (1) userspace knowing the current state of the remote node and (2) avoiding a race with BINDER_FREEZE ioctl in which we could miss a freeze/thaw.
And if the answer is that frozen notifications are always sent immediately to notify about the current state, then we should also send one for a dead process ... maybe. I guess a dead process is not frozen?
Yes this is to immediately notify about the current state (frozen or unfrozen). A dead process is in neither state so it feels more correct not to send either?
Okay.
On the other hand, I can easily imagine userspace code being written with the assumption that it'll always get a notification immediately. That would probably result in deadlocks in the edge case where the process happens to be dead.
There are different ways to proceed with this dead node scenario:
1. return ESRCH 2. silently fail and don't allocate a ref->freeze 3. allocate a ref->freeze but don't notify the current state 4. allocate and send a "fake" state notification.
I like 1 just because it is technically the correct thing to do from the driver's perspective. However, it does complicate things in userspace as we've discussed. Option 2, could work but it would also fail with EINVAL if a "clear notification" is sent later anyway. Option 3 changes the behavior of guaranteeing a notification upon success. Option 4 can cause trouble on how a "not-frozen" notification is handled in userspace e.g start sending transactions.
As you can see there is no clear winner here, we have to compromise something and option #3 is the best we can do IMO.
-- Carlos Llamas