On Wed, Sep 8, 2021 at 3:03 AM Wei Liu wei.liu@kernel.org wrote:
Thanks for the heads-up. I found one instance of this bad practice in hv_apic.c. Presumably that's the one you were referring to.
Yeah, that must have been the one I saw.
However calling into the allocator from that IPI path seems very heavy weight. I will discuss with fellow engineers on how to fix it properly.
In other places, the options have been fairly straightforward:
(a) avoid the allocation entirely.
I think the main reason hyperv does it is because it wants to clear the "current cpu" from the cpumask for the ALL_BUT_SELF case, and if you can just instead keep track of that "all but self" bit separately and pass it down the call chain, you may not need that allocation at all.
(b) use a static percpu allocation
The IPI code generally wants interrupts disabled anyway, or it certainly can just do the required preemption disable to make sure that it has exclusive access to a percpu allocation.
(c) take advantage of any hypervisor limitations
If hyperv is limited to some smaller number of CPU's - google seems to imply 240 - maybe you can keep such a smaller allocation on the stack, but just verify that the incoming cpumask is less than whatever the hyperv limit is.
That (c) is obviously the worst choice in some sense, but in some cases limiting yourself to simplify things isn't wrong.
I suspect the percpu allocation model is the easiest one. It's what other IPI code does. See for example
arch/x86/kernel/apic/x2apic_cluster.c: __x2apic_send_IPI_mask()
and that percpu 'ipi_mask' thing.
That said, if you are already just iterating over the mask, doing (a) may be trivial. No allocation at all is even better than a percpu one.
I'm sure there are other options. The above is just the obvious ones that come to mind.
Linus