New subject: [EXTERNAL] [PATCH] Drivers: hv: kvp/vss: Avoid accessing a ringbuffer not initialized yet

9 Sep 2024

If the KVP (or VSS) daemon starts before the VMBus channel's ringbuffer is
fully initialized, we can hit the panic below:
hv_utils: Registering HyperV Utility Driver
hv_vmbus: registering driver hv_utils
...
BUG: kernel NULL pointer dereference, address: 0000000000000000
CPU: 44 UID: 0 PID: 2552 Comm: hv_kvp_daemon Tainted: G E 6.11.0-rc3+ #1
RIP: 0010:hv_pkt_iter_first+0x12/0xd0
Call Trace:
...
 vmbus_recvpacket
 hv_kvp_onchannelcallback
 vmbus_on_event
 tasklet_action_common
 tasklet_action
 handle_softirqs
 irq_exit_rcu
 sysvec_hyperv_stimer0
 </IRQ>
 <TASK>
 asm_sysvec_hyperv_stimer0
...
 kvp_register_done
 hvt_op_read
 vfs_read
 ksys_read
 __x64_sys_read
This can happen because the KVP/VSS channel callback can be invoked
even before the channel is fully opened:
1) as soon as hv_kvp_init() -> hvutil_transport_init() creates
/dev/vmbus/hv_kvp, the kvp daemon can open the device file immediately and
register itself to the driver by writing a message KVP_OP_REGISTER1 to the
file (which is handled by kvp_on_msg() ->kvp_handle_handshake()) and
reading the file for the driver's response, which is handled by
hvt_op_read(), which calls hvt->on_read(), i.e. kvp_register_done().
2) the problem with kvp_register_done() is that it can cause the
channel callback to be called even before the channel is fully opened,
and when the channel callback is starting to run, util_probe()->
vmbus_open() may have not initialized the ringbuffer yet, so the
callback can hit the panic of NULL pointer dereference.
To reproduce the panic consistently, we can add a "ssleep(10)" for KVP in
__vmbus_open(), just before the first hv_ringbuffer_init(), and then we
unload and reload the driver hv_utils, and run the daemon manually within
the 10 seconds.
Fix the panic by checking the channel state in the channel callback.
To avoid the race condition with __vmbus_open(), we disable and enable
the channel callback temporarily in __vmbus_open().
The channel callbacks of the other VMBus devices don't need to check
the channel state since they can't run before the channels are fully
initialized.
Note: we would also need to fix the fcopy driver code, but that has
been removed in commit ec314f61e4fc ("Drivers: hv: Remove fcopy driver") in
the mainline kernel since v6.10. For old 6.x LTS kernels, and the 5.x
and 4.x LTS kernels, the fcopy driver needs to be fixed when the
fix is backported to the stable kernel branches.
Fixes: e0fa3e5e7df6 ("Drivers: hv: utils: fix a race on userspace daemons registration")
Cc: stable@vger.kernel.org
Signed-off-by: Dexuan Cui decui@microsoft.com
---
 drivers/hv/channel.c     | 11 +++++++++++
 drivers/hv/hv_kvp.c      |  3 +++
 drivers/hv/hv_snapshot.c |  3 +++
 3 files changed, 17 insertions(+)

diff --git a/drivers/hv/channel.c b/drivers/hv/channel.c
index fb8cd8469328..685e407a3fdf 100644
--- a/drivers/hv/channel.c
+++ b/drivers/hv/channel.c
@@ -657,6 +657,14 @@ static int __vmbus_open(struct vmbus_channel *newchannel,
    		return -ENOMEM;
    }
+	/*
+	 * The channel callbacks of KVP/VSS may run before __vmbus_open()
+	 * finishes (see kvp_register_done() -> ... -> kvp_poll_wrapper()), so
+	 * they check newchannel->state to tell the ringbuffer has been fully
+	 * initialized or not. Disable and enable the tasklet to avoid the race.
+	 */
+	tasklet_disable(&newchannel->callback_event);
+
    newchannel->state = CHANNEL_OPENING_STATE;
    newchannel->onchannel_callback = onchannelcallback;
    newchannel->channel_callback_context = context;
@@ -750,6 +758,8 @@ static int __vmbus_open(struct vmbus_channel *newchannel,
    }
newchannel->state = CHANNEL_OPENED_STATE;
+	tasklet_enable(&newchannel->callback_event);
+
    kfree(open_info);
    return 0;
@@ -766,6 +776,7 @@ static int __vmbus_open(struct vmbus_channel *newchannel,
    hv_ringbuffer_cleanup(&newchannel->inbound);
    vmbus_free_requestor(&newchannel->requestor);
    newchannel->state = CHANNEL_OPEN_STATE;
+	tasklet_enable(&newchannel->callback_event);
    return err;
 }
diff --git a/drivers/hv/hv_kvp.c b/drivers/hv/hv_kvp.c
index d35b60c06114..ec098067e579 100644
--- a/drivers/hv/hv_kvp.c
+++ b/drivers/hv/hv_kvp.c
@@ -662,6 +662,9 @@ void hv_kvp_onchannelcallback(void *context)
    if (kvp_transaction.state > HVUTIL_READY)
    	return;
+	if (channel->state != CHANNEL_OPENED_STATE)
+		return;
+
    if (vmbus_recvpacket(channel, recv_buffer, HV_HYP_PAGE_SIZE * 4, &recvlen, &requestid)) {
    	pr_err_ratelimited("KVP request received. Could not read into recv buf\n");
    	return;
diff --git a/drivers/hv/hv_snapshot.c b/drivers/hv/hv_snapshot.c
index 0d2184be1691..f7924c2fc62e 100644
--- a/drivers/hv/hv_snapshot.c
+++ b/drivers/hv/hv_snapshot.c
@@ -301,6 +301,9 @@ void hv_vss_onchannelcallback(void *context)
    if (vss_transaction.state > HVUTIL_READY)
    	return;
+	if (channel->state != CHANNEL_OPENED_STATE)
+		return;
+
    if (vmbus_recvpacket(channel, recv_buffer, VSS_MAX_PKT_SIZE, &recvlen, &requestid)) {
    	pr_err_ratelimited("VSS request received. Could not read into recv buf\n");
    	return;
-- 
2.25.1