On Mon, Sep 13 2021 at 13:01, Sohil Mehta wrote:
> Add a new system call to allow applications to block in the kernel
> and wait for user interrupts.
>
> <The current implementation doesn't support waking up from other
> blocking system calls like sleep(), read(), epoll(), etc.
>
> uintr_wait() is a placeholder syscall while we decide on that
> behaviour.>
>
> When the application makes this syscall the notification vector is
> switched to a new kernel vector. Any new SENDUIPI will invoke the
> kernel interrupt which is then used to wake up the process.
>
> Currently, the task wait list is a global one. To make the
> implementation scalable there is a need to move to a distributed
> per-cpu wait list.

How are per cpu wait lists going to solve the problem?

> +/*
> + * Handler for UINTR_KERNEL_VECTOR.
> + */
> +DEFINE_IDTENTRY_SYSVEC(sysvec_uintr_kernel_notification)
> +{
> +        /* TODO: Add entry-exit tracepoints */
> +        ack_APIC_irq();
> +        inc_irq_stat(uintr_kernel_notifications);
> +        uintr_wake_up_process();

So this interrupt happens for any of those notifications. How are they differentiated?

> +int uintr_receiver_wait(void)
> +{
> +        struct uintr_upid_ctx *upid_ctx;
> +        unsigned long flags;
> +
> +        if (!is_uintr_receiver(current))
> +                return -EOPNOTSUPP;
> +
> +        upid_ctx = current->thread.ui_recv->upid_ctx;
> +        upid_ctx->upid->nc.nv = UINTR_KERNEL_VECTOR;
> +        upid_ctx->waiting = true;
> +        spin_lock_irqsave(&uintr_wait_lock, flags);
> +        list_add(&upid_ctx->node, &uintr_wait_list);
> +        spin_unlock_irqrestore(&uintr_wait_lock, flags);
> +
> +        set_current_state(TASK_INTERRUPTIBLE);

Because we do not have enough properly implemented wait primitives you
need to open code one which is blatantly wrong vs. a concurrent wake up?

> +        schedule();

How is that correct vs. a spurious wakeup? What takes care that the entry is removed from the list?
Again. We have proper wait primitives.

> +        return -EINTR;
> +}

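IOW, something like the completely untested sketch below, which keeps the
list handling as is and only replaces the open coded sleep/wake pairing.
The 'waitq' wait queue head in struct uintr_upid_ctx is made up for the
example; it does not exist in the posted series:

        /* Receiver side: instead of set_current_state() + schedule() */
        upid_ctx->upid->nc.nv = UINTR_KERNEL_VECTOR;
        WRITE_ONCE(upid_ctx->waiting, true);

        /* list_add() under uintr_wait_lock as before ... */

        if (wait_event_interruptible(upid_ctx->waitq,
                                     !READ_ONCE(upid_ctx->waiting)))
                return -EINTR;

        /* Waker side: instead of a bare wake_up_process() */
        WRITE_ONCE(upid_ctx->waiting, false);
        wake_up_interruptible(&upid_ctx->waitq);

wait_event_interruptible() sets the task state before it evaluates the
condition, so a wake up racing with the check cannot be lost, and a
spurious wakeup simply re-evaluates the condition instead of returning
to user space with the entry still on the list.
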
> +/*
> + * Runs in interrupt context.
> + * Scan through all UPIDs to check if any interrupt is ongoing.
> + */
> +void uintr_wake_up_process(void)
> +{
> +        struct uintr_upid_ctx *upid_ctx, *tmp;
> +        unsigned long flags;
> +
> +        spin_lock_irqsave(&uintr_wait_lock, flags);
> +        list_for_each_entry_safe(upid_ctx, tmp, &uintr_wait_list, node) {
> +                if (test_bit(UPID_ON, (unsigned long *)&upid_ctx->upid->nc.status)) {
> +                        set_bit(UPID_SN, (unsigned long *)&upid_ctx->upid->nc.status);
> +                        upid_ctx->upid->nc.nv = UINTR_NOTIFICATION_VECTOR;
> +                        upid_ctx->waiting = false;
> +                        wake_up_process(upid_ctx->task);
> +                        list_del(&upid_ctx->node);

So any of these notification interrupts does a global mass wake up? How does that make sense?

> +                }
> +        }
> +        spin_unlock_irqrestore(&uintr_wait_lock, flags);
> +}

> +/* Called when task is unregistering/exiting */
> +static void uintr_remove_task_wait(struct task_struct *task)
> +{
> +        struct uintr_upid_ctx *upid_ctx, *tmp;
> +        unsigned long flags;
> +
> +        spin_lock_irqsave(&uintr_wait_lock, flags);
> +        list_for_each_entry_safe(upid_ctx, tmp, &uintr_wait_list, node) {
> +                if (upid_ctx->task == task) {
> +                        pr_debug("wait: Removing task %d from wait\n",
> +                                 upid_ctx->task->pid);
> +                        upid_ctx->upid->nc.nv = UINTR_NOTIFICATION_VECTOR;
> +                        upid_ctx->waiting = false;
> +                        list_del(&upid_ctx->node);
> +                }

What? You have to do a global list walk to find the entry which you added yourself?
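
Uncompiled sketch again, just to illustrate: the task owns its UPID
context, so the removal can be done directly. The NULL check is an
assumption about how unregister/exit orders the teardown:

        static void uintr_remove_task_wait(struct task_struct *task)
        {
                struct uintr_upid_ctx *upid_ctx;
                unsigned long flags;

                if (!task->thread.ui_recv)
                        return;

                upid_ctx = task->thread.ui_recv->upid_ctx;

                spin_lock_irqsave(&uintr_wait_lock, flags);
                if (upid_ctx->waiting) {
                        upid_ctx->upid->nc.nv = UINTR_NOTIFICATION_VECTOR;
                        upid_ctx->waiting = false;
                        list_del(&upid_ctx->node);
                }
                spin_unlock_irqrestore(&uintr_wait_lock, flags);
        }

Whether checking 'waiting' under uintr_wait_lock is sufficient depends on
how this path is serialized against the waker, but the global list walk is
not needed in any case.
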
Thanks,
tglx