On Fri, Sep 12, 2025 at 03:33:32PM +0000, Jon Kohler wrote:
On Sep 12, 2025, at 11:30 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
On Fri, Sep 12, 2025 at 03:24:42PM +0000, Jon Kohler wrote:
On Sep 12, 2025, at 4:50 AM, Michael S. Tsirkin <mst@redhat.com> wrote:
On Fri, Sep 12, 2025 at 04:26:58PM +0800, Jason Wang wrote:
Commit 8c2e6b26ffe2 ("vhost/net: Defer TX queue re-enable until after sendmsg") tries to defer the notification enabling by moving the logic out of the loop, after vhost_tx_batch(), when nothing new is spotted. This has side effects, as the new post-loop logic is also reached by several other exit conditions.
One example is IOTLB: when there's an IOTLB miss, get_tx_bufs() may return -EAGAIN and exit the loop. The post-loop logic then sees that buffers are still available, so it queues the tx work again, and keeps doing so until userspace feeds the IOTLB entry correctly. This slows down tx processing and may trigger the TX watchdog in the guest.
It's not that it might. Pls clarify that it *has been reported* to do exactly that, and add a link to the report.
Fix this by sticking the notification enabling logic inside the loop when nothing new is spotted, and flushing the batched packets beforehand.
Reported-by: Jon Kohler <jon@nutanix.com>
Cc: stable@vger.kernel.org
Fixes: 8c2e6b26ffe2 ("vhost/net: Defer TX queue re-enable until after sendmsg")
Signed-off-by: Jason Wang <jasowang@redhat.com>
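[Editorial aside, not part of the original mail: a rough, simplified sketch of the loop-exit path as of 8c2e6b26ffe2, to show how an IOTLB miss turns into a requeue loop. Names follow handle_tx_copy() from memory; batching and most error handling are elided, so treat this as pseudocode rather than the actual source.]

	/* Rough sketch only; simplified from handle_tx_copy() as of
	 * 8c2e6b26ffe2, with batching and most error handling elided.
	 */
	bool busyloop_intr = false;

	do {
		busyloop_intr = false;
		head = get_tx_bufs(net, nvq, &msg, &out, &in, &len,
				   &busyloop_intr);
		if (head < 0)		/* e.g. -EAGAIN after an IOTLB miss */
			break;		/* notifications stay disabled */
		if (head == vq->num)	/* nothing new */
			break;		/* re-enable deferred to below */
		/* ... copy the packet, sendmsg(), batch ... */
	} while (likely(!vhost_exceeds_weight(vq, ++sent_pkts, total_len)));

	vhost_tx_batch(net, nvq, sock, &msg);
	if (unlikely(busyloop_intr))
		vhost_poll_queue(&vq->poll);
	else
		/* After an IOTLB miss the avail ring still looks non-empty,
		 * so this requeues the handler immediately; until userspace
		 * feeds the IOTLB entry, the handler keeps spinning here.
		 */
		vhost_net_busy_poll_try_queue(net, vq);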
So this is mostly a revert, but with vhost_tx_batch(net, nvq, sock, &msg); added in to avoid regressing performance.
If you do not want to structure it like this (revert+optimization), then pls make that clear in the message.
 drivers/vhost/net.c | 33 +++++++++++++--------------------
 1 file changed, 13 insertions(+), 20 deletions(-)
diff --git a/drivers/vhost/net.c b/drivers/vhost/net.c
index 16e39f3ab956..3611b7537932 100644
--- a/drivers/vhost/net.c
+++ b/drivers/vhost/net.c
@@ -765,11 +765,11 @@ static void handle_tx_copy(struct vhost_net *net, struct socket *sock)
 	int err;
 	int sent_pkts = 0;
 	bool sock_can_batch = (sock->sk->sk_sndbuf == INT_MAX);
-	bool busyloop_intr;
 	bool in_order = vhost_has_feature(vq, VIRTIO_F_IN_ORDER);
 
 	do {
-		busyloop_intr = false;
+		bool busyloop_intr = false;
+
 		if (nvq->done_idx == VHOST_NET_BATCH)
 			vhost_tx_batch(net, nvq, sock, &msg);
@@ -780,10 +780,18 @@ static void handle_tx_copy(struct vhost_net *net, struct socket *sock)
 			break;
 		/* Nothing new?  Wait for eventfd to tell us they refilled. */
 		if (head == vq->num) {
-			/* Kicks are disabled at this point, break loop and
-			 * process any remaining batched packets. Queue will
-			 * be re-enabled afterwards.
+			/* Flush batched packets before enabling
+			 * virqtueue notification to reduce
+			 * unnecssary virtqueue kicks.
typos: virtqueue, unnecessary
 			 */
+			vhost_tx_batch(net, nvq, sock, &msg);
+			if (unlikely(busyloop_intr)) {
+				vhost_poll_queue(&vq->poll);
+			} else if (unlikely(vhost_enable_notify(&net->dev,
+								vq))) {
+				vhost_disable_notify(&net->dev, vq);
+				continue;
+			}
 			break;
 		}
See my comment below, but how about something like this?

		if (head == vq->num) {
			/* Flush batched packets before enabling
			 * virtqueue notification to reduce
			 * unnecessary virtqueue kicks.
			 */
			vhost_tx_batch(net, nvq, sock, &msg);
			if (unlikely(busyloop_intr))
				/* If interrupted while doing busy polling,
				 * requeue the handler to be fair handle_rx
				 * as well as other tasks waiting on cpu.
				 */
				vhost_poll_queue(&vq->poll);
			else
				/* All of our work has been completed;
				 * however, before leaving the TX handler,
				 * do one last check for work, and requeue
				 * handler if necessary. If there is no work,
				 * queue will be reenabled.
				 */
				vhost_net_busy_poll_try_queue(net, vq);
I mean it's functionally equivalent, but vhost_net_busy_poll_try_queue checks the avail ring again and we just checked it. Why is this a good idea? This happens on the good path, so I dislike unnecessary work like this.
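[Editorial aside: for readers of the thread, vhost_net_busy_poll_try_queue() is, as best I recall, roughly the following; it requeues the handler when the avail ring is non-empty, and otherwise re-enables notification while handling the race where new buffers show up in the meantime. Check drivers/vhost/net.c for the authoritative version.]

	/* From memory, approximately what vhost_net_busy_poll_try_queue()
	 * does; not copied from the tree under discussion.
	 */
	static void vhost_net_busy_poll_try_queue(struct vhost_net *net,
						  struct vhost_virtqueue *vq)
	{
		if (!vhost_vq_avail_empty(&net->dev, vq)) {
			/* Work is already pending: run the handler again. */
			vhost_poll_queue(&vq->poll);
		} else if (unlikely(vhost_enable_notify(&net->dev, vq))) {
			/* New buffers appeared while enabling notification:
			 * disable it again and requeue the handler.
			 */
			vhost_disable_notify(&net->dev, vq);
			vhost_poll_queue(&vq->poll);
		}
	}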
For the sake of discussion, let's say vhost_tx_batch and the sendmsg within took one full second to complete. A lot could potentially happen in that amount of time. So sure, control-path-wise it looks like we just checked it, but time-wise that could have been ages ago.
Oh I forgot we had the tx batch in there. OK then, I don't have a problem with this.
However, what I like about Jason's patch is that it is actually simply a revert of your patch plus a single call to vhost_tx_batch(net, nvq, sock, &msg);
So it is a more obviously correct approach.
I'll be fine with doing what you propose on top, with testing showing that it is beneficial for performance.
			break;
		}
@@ -839,22 +847,7 @@ static void handle_tx_copy(struct vhost_net *net, struct socket *sock)
 		++nvq->done_idx;
 	} while (likely(!vhost_exceeds_weight(vq, ++sent_pkts, total_len)));
 
-	/* Kicks are still disabled, dispatch any remaining batched msgs. */
 	vhost_tx_batch(net, nvq, sock, &msg);
-
-	if (unlikely(busyloop_intr))
-		/* If interrupted while doing busy polling, requeue the
-		 * handler to be fair handle_rx as well as other tasks
-		 * waiting on cpu.
-		 */
-		vhost_poll_queue(&vq->poll);
-	else
-		/* All of our work has been completed; however, before
-		 * leaving the TX handler, do one last check for work,
-		 * and requeue handler if necessary. If there is no work,
-		 * queue will be reenabled.
-		 */
-		vhost_net_busy_poll_try_queue(net, vq);
Note: the use of vhost_net_busy_poll_try_queue was intentional in my patch, as it checks both conditions.
Can we simply hoist my logic up instead?
}
static void handle_tx_zerocopy(struct vhost_net *net, struct socket *sock)
2.34.1
Tested-by: Jon Kohler <jon@nutanix.com>
Tried this out on a 6.16 host/guest that locked up in the IOTLB miss loop; applied this patch and all was well.