From: Vitaly Kuznetsov vkuznets@redhat.com Sent: Tuesday, May 16, 2023 2:12 AM
Michael Kelley mikelley@microsoft.com writes:
vmbus_wait_for_unload() may be called in the panic path after other CPUs are stopped. vmbus_wait_for_unload() currently loops through online CPUs looking for the UNLOAD response message. But the values of CONFIG_KEXEC_CORE and crash_kexec_post_notifiers affect the path used to stop the other CPUs, and in one of the paths the stopped CPUs are removed from cpu_online_mask. This removal happens in both x86/x64 and arm64 architectures. In such a case, vmbus_wait_for_unload() only checks the panic'ing CPU, and misses the UNLOAD response message except when the panic'ing CPU is CPU 0. vmbus_wait_for_unload() eventually times out, but only after waiting 100 seconds.
Fix this by looping through *present* CPUs in vmbus_wait_for_unload(). The cpu_present_mask is not modified by stopping the other CPUs in the panic path, nor should it be. Furthermore, the synic_message_page being checked in vmbus_wait_for_unload() is allocated in hv_synic_alloc() for all present CPUs. So looping through the present CPUs is more consistent.
For additional safety, also add a check for the message_page being NULL before looking for the UNLOAD response message.
Reported-by: John Starks jostarks@microsoft.com Fixes: cd95aad55793 ("Drivers: hv: vmbus: handle various crash scenarios")
I see you Cc:ed stable@ on the patch, should we also add
Cc: stable@vger.kernel.org
here explicitly so it gets picked up by various stable backporting scritps? I guess Wei can do it when picking the patch to the queue...
Yes, the kernel test robot has already warned me about not doing that right. :-(
Signed-off-by: Michael Kelley mikelley@microsoft.com
drivers/hv/channel_mgmt.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-)
diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c index 007f26d..df2ba20 100644 --- a/drivers/hv/channel_mgmt.c +++ b/drivers/hv/channel_mgmt.c @@ -829,11 +829,14 @@ static void vmbus_wait_for_unload(void) if (completion_done(&vmbus_connection.unload_event)) goto completed;
for_each_online_cpu(cpu) {
for_each_present_cpu(cpu) { struct hv_per_cpu_context *hv_cpu = per_cpu_ptr(hv_context.cpu_context, cpu); page_addr = hv_cpu->synic_message_page;
if (!page_addr)
continue;
In theory, synic_message_page for all present CPUs is permanently assigned in hv_synic_alloc() and we fail the whole thing if any of these allocations fail so page_addr == NULL is likely impossible today but there's certainly no harm in having this extra check here, this is not a hotpath.
But consider a CoCo VM where the allocation is not done in hv_synic_alloc(). In this case, synic_message_page is set in hv_synic_enable_regs(), which is called only when a CPU is brought online. If the CPUs that are brought online are less than all present CPUs because of kernel command line options, then we might have synic_message_page values for other present CPUs that don't get initialized and remain NULL.
I should probably tweak the commit message to call out this case explicitly.
msg = (struct hv_message *)page_addr + VMBUS_MESSAGE_SINT;
@@ -867,11 +870,14 @@ static void vmbus_wait_for_unload(void) * maybe-pending messages on all CPUs to be able to receive new * messages after we reconnect. */
- for_each_online_cpu(cpu) {
for_each_present_cpu(cpu) { struct hv_per_cpu_context *hv_cpu = per_cpu_ptr(hv_context.cpu_context, cpu);
page_addr = hv_cpu->synic_message_page;
if (!page_addr)
continue;
msg = (struct hv_message *)page_addr + VMBUS_MESSAGE_SINT; msg->header.message_type = HVMSG_NONE; }
Reviewed-by: Vitaly Kuznetsov vkuznets@redhat.com
Thanks for reviewing!