On Thu, Sep 30, 2021, at 12:29 PM, Thomas Gleixner wrote:
On Thu, Sep 30 2021 at 11:08, Andy Lutomirski wrote:
On Tue, Sep 28, 2021, at 9:56 PM, Sohil Mehta wrote: I think we have three choices:
Use a fancy wrapper around SENDUIPI. This is probably a bad idea.
Treat the NV-2 as a real interrupt and honor affinity settings. This will be annoying and slow, I think, if it's even workable at all.
We can make it a real interrupt in form of a per CPU interrupt, but affinity settings are not really feasible because the affinity is in the UPID.ndst field. So, yes we can target it to some CPU, but that's racy.
Handle this case with faults instead of interrupts. We could set a reserved bit in UPID so that SENDUIPI results in #GP, decode it, and process it. This puts the onus on the actual task causing trouble, which is nice, and it lets us find the UPID and target directly instead of walking all of them. I don't know how well it would play with hypothetical future hardware-initiated uintrs, though.
I thought about that as well and dismissed it due to the hardware initiated ones but thinking more about it, those need some translation unit (e.g. irq remapping) anyway, so it might be doable to catch those as well. So we could just ignore them for now and go for the #GP trick and deal with the device initiated ones later when they come around :)
Sounds good to me. In the long run, if Intel wants device initiated fancy interrupts to work well, they need a new design.
But even with that we still need to keep track of the armed ones per CPU so we can handle CPU hotunplug correctly. Sigh...
I don’t think any real work is needed. We will only ever have armed UPIDs (with notification interrupts enabled) for running tasks, and hot-unplugged CPUs don’t have running tasks. We do need a way to drain pending IPIs before we offline a CPU, but that’s a separate problem and may be unsolvable for all I know. Is there a magic APIC operation to wait until all initiated IPIs targeting the local CPU arrive? I guess we can also just mask the notification vector so that it won’t crash us if we get a stale IPI after going offline.
Thanks,
tglx