This reverts commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device").
Virtio drivers and PCI devices have never fully supported true surprise (aka hot unplug) removal. Drivers historically continued processing and waiting for pending I/O and even continued synchronous device reset during surprise removal. Devices have also continued completing I/Os, doing DMA and allowing device reset after surprise removal to support such drivers.
Supporting it correctly would require a new device capability and driver negotiation in the virtio specification to safely stop I/O and free queue memory. Failure to do so either breaks all the existing drivers with call trace listed in the commit or crashes the host on continuing the DMA. Hence, until such specification and devices are invented, restore the previous behavior of treating surprise removal as graceful removal to avoid regressions and maintain system stability same as before the commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device").
As explained above, previous analysis of solving this only in driver was incomplete and non-reliable at [1] and at [2]; Hence reverting commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device") is still the best stand to restore failures of virtio net and block devices.
[1] https://lore.kernel.org/virtualization/CY8PR12MB719506CC5613EB100BC6C638DCBD... [2] https://lore.kernel.org/virtualization/20250602024358.57114-1-parav@nvidia.c...
Fixes: 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device") Cc: stable@vger.kernel.org Reported-by: lirongqing@baidu.com Closes: https://lore.kernel.org/virtualization/c45dd68698cd47238c55fb73ca9b4741@baid... Signed-off-by: Parav Pandit parav@nvidia.com --- drivers/virtio/virtio_pci_common.c | 7 ------- 1 file changed, 7 deletions(-)
diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c index d6d79af44569..dba5eb2eaff9 100644 --- a/drivers/virtio/virtio_pci_common.c +++ b/drivers/virtio/virtio_pci_common.c @@ -747,13 +747,6 @@ static void virtio_pci_remove(struct pci_dev *pci_dev) struct virtio_pci_device *vp_dev = pci_get_drvdata(pci_dev); struct device *dev = get_device(&vp_dev->vdev.dev);
- /* - * Device is marked broken on surprise removal so that virtio upper - * layers can abort any ongoing operation. - */ - if (!pci_device_is_present(pci_dev)) - virtio_break_device(&vp_dev->vdev); - pci_disable_sriov(pci_dev);
unregister_virtio_device(&vp_dev->vdev);
On Fri, Aug 22, 2025 at 12:17:06PM +0300, Parav Pandit wrote:
This reverts commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device").
Virtio drivers and PCI devices have never fully supported true surprise (aka hot unplug) removal. Drivers historically continued processing and waiting for pending I/O and even continued synchronous device reset during surprise removal. Devices have also continued completing I/Os, doing DMA and allowing device reset after surprise removal to support such drivers.
Supporting it correctly would require a new device capability
If a device is removed, it is removed. Windows drivers supported this since forever and it's just a Linux bug that it does not handle all the cases. This is not something you can handle with a capability.
and driver negotiation in the virtio specification to safely stop I/O and free queue memory. Failure to do so either breaks all the existing drivers with call trace listed in the commit or crashes the host on continuing the DMA.
If the device is gone, then DMA does not continue.
IIUC what is going on for you, is that you have developed a surprise removal emulation that pretends to remove the device but actually the device is doing DMA. So of course things break then.
Hence, until such specification and devices are invented, restore the previous behavior of treating surprise removal as graceful removal to avoid regressions and maintain system stability same as before the commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device").
As explained above, previous analysis of solving this only in driver was incomplete and non-reliable at [1] and at [2]; Hence reverting commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device") is still the best stand to restore failures of virtio net and block devices.
[1] https://lore.kernel.org/virtualization/CY8PR12MB719506CC5613EB100BC6C638DCBD...
I can only repeat what I said then, this is not how we do kernel development.
[2] https://lore.kernel.org/virtualization/20250602024358.57114-1-parav@nvidia.c...
What was missing here, is handling corner cases. So let us please try to handle them.
Here is how I would try to do it:
- add a new driver callback - start a periodic timer task in virtio core on remove - in the timer, probe that the device is still present. if not, invoke a driver callback - cancel the task on device reset
If you do not have the time, let me know and I will try to look into it.
Fixes: 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device") Cc: stable@vger.kernel.org Reported-by: lirongqing@baidu.com Closes: https://lore.kernel.org/virtualization/c45dd68698cd47238c55fb73ca9b4741@baid... Signed-off-by: Parav Pandit parav@nvidia.com
drivers/virtio/virtio_pci_common.c | 7 ------- 1 file changed, 7 deletions(-)
diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c index d6d79af44569..dba5eb2eaff9 100644 --- a/drivers/virtio/virtio_pci_common.c +++ b/drivers/virtio/virtio_pci_common.c @@ -747,13 +747,6 @@ static void virtio_pci_remove(struct pci_dev *pci_dev) struct virtio_pci_device *vp_dev = pci_get_drvdata(pci_dev); struct device *dev = get_device(&vp_dev->vdev.dev);
- /*
* Device is marked broken on surprise removal so that virtio upper
* layers can abort any ongoing operation.
*/
- if (!pci_device_is_present(pci_dev))
virtio_break_device(&vp_dev->vdev);
- pci_disable_sriov(pci_dev);
unregister_virtio_device(&vp_dev->vdev); -- 2.26.2
From: Michael S. Tsirkin mst@redhat.com Sent: 22 August 2025 03:52 PM
On Fri, Aug 22, 2025 at 12:17:06PM +0300, Parav Pandit wrote:
This reverts commit 43bb40c5b926 ("virtio_pci: Support surprise removal of
virtio pci device").
Virtio drivers and PCI devices have never fully supported true surprise (aka hot unplug) removal. Drivers historically continued processing and waiting for pending I/O and even continued synchronous device reset during surprise removal. Devices have also continued completing I/Os, doing DMA and allowing device reset after surprise removal to support such drivers.
Supporting it correctly would require a new device capability
If a device is removed, it is removed.
This is how it was implemented and none of the virtio drivers supported it. So vendors had stepped away from such device implementation. (not just us).
Windows drivers supported this since forever and it's just a Linux bug that it does not handle all the cases.
Internal tests showed the opposite that hotplug driver + virtio didn't handle the surprise removal either.
Can you please share the starting version of the Windows driver that actually does not wait for the CVQ commands and outstanding requests for block etc drivers?
We have even seen window driver that corrupts the CVQ command buffer when CVQ command times out!
This is not something you can handle with a capability.
The sad truth in my view is that capability is needed so that device can know how to do _sane_ stop as driver told exactly to do so.
For sure if device stops the DMA, the device is unusable until kernel 6.20 or whatever new kernel arrives.
and driver negotiation in the virtio specification to safely stop I/O and free queue memory. Failure to do so either breaks all the existing drivers with call trace listed in the commit or crashes the host on continuing the DMA.
If the device is gone, then DMA does not continue.
This is exactly how we built the device. And device behaved correctly, it resulted in the call trace on multiple driver hands in net, block area. Commit 43bb40c5b926 partially fixed it. However, device does not know when to behave correctly vs behave as driver expects. And to do this reliability a capability is needed.
IIUC what is going on for you, is that you have developed a surprise removal emulation that pretends to remove the device but actually the device is doing DMA. So of course things break then.
Not really. The device was built correctly to do surprise removal and stop the DMA. That caused the call traces. One of them listed in 43bb40c5b926.
So to maintain backward compatibility, device had to graceful removal as expected by these drivers.
Hence, until such specification and devices are invented, restore the previous behavior of treating surprise removal as graceful removal to avoid regressions and maintain system stability same as before the commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci
device").
As explained above, previous analysis of solving this only in driver was incomplete and non-reliable at [1] and at [2]; Hence reverting commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device") is still the best stand to restore failures of virtio net and block devices.
[1]
https://lore.kernel.org/virtualization/CY8PR12MB719506CC5613EB100BC6C6
38DCBD2@CY8PR12MB7195.namprd12.prod.outlook.com/#t
I can only repeat what I said then, this is not how we do kernel development.
Kernel development does not promote breaking users. Here users are affected with this incomplete kernel behaviour. Virtio subsystem (multiple drivers) including marked stable kernel lacks the contract on when to expect DMA completions and when not.
[2] https://lore.kernel.org/virtualization/20250602024358.57114-1-parav@nv idia.com/
What was missing here, is handling corner cases. So let us please try to handle them.
Here is how I would try to do it:
- add a new driver callback
- start a periodic timer task in virtio core on remove
- in the timer, probe that the device is still present. if not, invoke a driver callback
- cancel the task on device reset
Multiple users request to restore the stability before adding the surprise removal support in reliable way. Lets be practical. Each net, blk, fs, console, crypto has different way of flushing the pending requests. It is unlikely to find a single stable kernel where all the virtio devices would have support for above.
So until above grant work is added to new kernel, user deserves to have stability restored in stable kernels.
Hence this request to revert multiple times from the users.
If you do not have the time, let me know and I will try to look into it.
Fixes: 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device") Cc: stable@vger.kernel.org Reported-by: lirongqing@baidu.com Closes: https://lore.kernel.org/virtualization/c45dd68698cd47238c55fb73ca9b474 1@baidu.com/ Signed-off-by: Parav Pandit parav@nvidia.com
drivers/virtio/virtio_pci_common.c | 7 ------- 1 file changed, 7 deletions(-)
diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c index d6d79af44569..dba5eb2eaff9 100644 --- a/drivers/virtio/virtio_pci_common.c +++ b/drivers/virtio/virtio_pci_common.c @@ -747,13 +747,6 @@ static void virtio_pci_remove(struct pci_dev
*pci_dev)
struct virtio_pci_device *vp_dev = pci_get_drvdata(pci_dev); struct device *dev = get_device(&vp_dev->vdev.dev);
/*
* Device is marked broken on surprise removal so that virtio upper
* layers can abort any ongoing operation.
*/
if (!pci_device_is_present(pci_dev))
virtio_break_device(&vp_dev->vdev);
pci_disable_sriov(pci_dev);
unregister_virtio_device(&vp_dev->vdev);
-- 2.26.2
On Fri, Aug 22, 2025 at 12:22:50PM +0000, Parav Pandit wrote:
From: Michael S. Tsirkin mst@redhat.com Sent: 22 August 2025 03:52 PM
On Fri, Aug 22, 2025 at 12:17:06PM +0300, Parav Pandit wrote:
This reverts commit 43bb40c5b926 ("virtio_pci: Support surprise removal of
virtio pci device").
Virtio drivers and PCI devices have never fully supported true surprise (aka hot unplug) removal. Drivers historically continued processing and waiting for pending I/O and even continued synchronous device reset during surprise removal. Devices have also continued completing I/Os, doing DMA and allowing device reset after surprise removal to support such drivers.
Supporting it correctly would require a new device capability
If a device is removed, it is removed.
This is how it was implemented and none of the virtio drivers supported it. So vendors had stepped away from such device implementation. (not just us).
If the slot does not have a mechanical interlock, I can pull the device out. It's not up to a device implementation.
From: Michael S. Tsirkin mst@redhat.com Sent: 22 August 2025 06:34 PM
On Fri, Aug 22, 2025 at 12:22:50PM +0000, Parav Pandit wrote:
From: Michael S. Tsirkin mst@redhat.com Sent: 22 August 2025 03:52 PM
On Fri, Aug 22, 2025 at 12:17:06PM +0300, Parav Pandit wrote:
This reverts commit 43bb40c5b926 ("virtio_pci: Support surprise removal of
virtio pci device").
Virtio drivers and PCI devices have never fully supported true surprise (aka hot unplug) removal. Drivers historically continued processing and waiting for pending I/O and even continued synchronous device reset during surprise removal. Devices have also continued completing I/Os, doing DMA and allowing device reset after surprise removal to support such drivers.
Supporting it correctly would require a new device capability
If a device is removed, it is removed.
This is how it was implemented and none of the virtio drivers supported it. So vendors had stepped away from such device implementation. (not just us).
If the slot does not have a mechanical interlock, I can pull the device out. It's not up to a device implementation.
Sure yes, stack is not there yet to support it. Each of the virtio device drivers are not there yet. Lets build that infra, let device indicate it and it will be smooth ride for driver and device.
On Fri, Aug 22, 2025 at 01:49:36PM +0000, Parav Pandit wrote:
From: Michael S. Tsirkin mst@redhat.com Sent: 22 August 2025 06:34 PM
On Fri, Aug 22, 2025 at 12:22:50PM +0000, Parav Pandit wrote:
From: Michael S. Tsirkin mst@redhat.com Sent: 22 August 2025 03:52 PM
On Fri, Aug 22, 2025 at 12:17:06PM +0300, Parav Pandit wrote:
This reverts commit 43bb40c5b926 ("virtio_pci: Support surprise removal of
virtio pci device").
Virtio drivers and PCI devices have never fully supported true surprise (aka hot unplug) removal. Drivers historically continued processing and waiting for pending I/O and even continued synchronous device reset during surprise removal. Devices have also continued completing I/Os, doing DMA and allowing device reset after surprise removal to support such drivers.
Supporting it correctly would require a new device capability
If a device is removed, it is removed.
This is how it was implemented and none of the virtio drivers supported it. So vendors had stepped away from such device implementation. (not just us).
If the slot does not have a mechanical interlock, I can pull the device out. It's not up to a device implementation.
Sure yes, stack is not there yet to support it. Each of the virtio device drivers are not there yet. Lets build that infra, let device indicate it and it will be smooth ride for driver and device.
There is simply no way for the device to "support" for surprise removal, or lack such support thereof. The support is up to the slot, not the device. Any pci compliant device can be placed in a slot that allows surprise removal and that is all. The user can then remove the device. Software can then either recover gracefully - it should - or hang or crash - it does sometimes, now. The patch you are trying to revert is an attempt to move some use-cases from the 1st to the 2nd category.
But what is going on now, as far as I could tell, is that someone developed a surprise removal emulation that does not actually remove the device, and is using that for testing the code in linux that supports surprise removal. That weird emulation seems to lead to all kind of weird issues. You answer is to remove the existing code and tell your testing team "we do not support surprise removal".
But just go ahead and tell this to them straight away. You do not need this patch for this.
Or better still, let's fix the issues please.
From: Michael S. Tsirkin mst@redhat.com Sent: 22 August 2025 07:30 PM
On Fri, Aug 22, 2025 at 01:49:36PM +0000, Parav Pandit wrote:
From: Michael S. Tsirkin mst@redhat.com Sent: 22 August 2025 06:34 PM
On Fri, Aug 22, 2025 at 12:22:50PM +0000, Parav Pandit wrote:
From: Michael S. Tsirkin mst@redhat.com Sent: 22 August 2025 03:52 PM
On Fri, Aug 22, 2025 at 12:17:06PM +0300, Parav Pandit wrote:
This reverts commit 43bb40c5b926 ("virtio_pci: Support surprise removal of
virtio pci device").
Virtio drivers and PCI devices have never fully supported true surprise (aka hot unplug) removal. Drivers historically continued processing and waiting for pending I/O and even continued synchronous device reset during surprise removal. Devices have also continued completing I/Os, doing DMA and allowing device reset after surprise removal to support such drivers.
Supporting it correctly would require a new device capability
If a device is removed, it is removed.
This is how it was implemented and none of the virtio drivers supported it. So vendors had stepped away from such device implementation. (not just us).
If the slot does not have a mechanical interlock, I can pull the device out. It's not up to a device implementation.
Sure yes, stack is not there yet to support it. Each of the virtio device drivers are not there yet. Lets build that infra, let device indicate it and it will be smooth ride for driver
and device.
There is simply no way for the device to "support" for surprise removal, or lack such support thereof. The support is up to the slot, not the device. Any pci compliant device can be placed in a slot that allows surprise removal and that is all. The user can then remove the device. Software can then either recover gracefully - it should - or hang or crash - it does sometimes, now. The patch you are trying to revert is an attempt to move some use-cases from the 1st to the 2nd category.
It is the driver (and not the device) who needs to tell the device that it will do sane cleanup and not wait infinitely.
But what is going on now, as far as I could tell, is that someone developed a surprise removal emulation that does not actually remove the device, and is using that for testing the code in linux that supports surprise removal.
Nop. Your analysis is incorrect. And I explained you that already. The device implementation supports correct implementation where device stops all the dma and also does not support register access. And no single virtio driver supported that.
On a surprised removed device, driver expects I/Os to complete and this is beyond a 'bug fix' watermark.
That weird emulation seems to lead to all kind of weird issues. You answer is to remove the existing code and tell your testing team "we do not support surprise removal".
He he, it is no the device, it is the driver that does not support surprise removal as you can see in your proposed patches and other sw changes.
But just go ahead and tell this to them straight away. You do not need this patch for this.
It is needed until infrastructure in multiple subsystem is built.
Or better still, let's fix the issues please.
The implementation is more than a fix category for stable kernels. Hence, what is asked is to do proper implementation for future kernels and until that point restore the bad use experience.
-- MST
On Sun, Aug 24, 2025 at 02:36:23AM +0000, Parav Pandit wrote:
From: Michael S. Tsirkin mst@redhat.com Sent: 22 August 2025 07:30 PM
On Fri, Aug 22, 2025 at 01:49:36PM +0000, Parav Pandit wrote:
From: Michael S. Tsirkin mst@redhat.com Sent: 22 August 2025 06:34 PM
On Fri, Aug 22, 2025 at 12:22:50PM +0000, Parav Pandit wrote:
From: Michael S. Tsirkin mst@redhat.com Sent: 22 August 2025 03:52 PM
On Fri, Aug 22, 2025 at 12:17:06PM +0300, Parav Pandit wrote: > This reverts commit 43bb40c5b926 ("virtio_pci: Support > surprise removal of virtio pci device"). > > Virtio drivers and PCI devices have never fully supported true > surprise (aka hot unplug) removal. Drivers historically > continued processing and waiting for pending I/O and even > continued synchronous device reset during surprise removal. > Devices have also continued completing I/Os, doing DMA and > allowing device reset after surprise removal to support such drivers. > > Supporting it correctly would require a new device capability
If a device is removed, it is removed.
This is how it was implemented and none of the virtio drivers supported it. So vendors had stepped away from such device implementation. (not just us).
If the slot does not have a mechanical interlock, I can pull the device out. It's not up to a device implementation.
Sure yes, stack is not there yet to support it. Each of the virtio device drivers are not there yet. Lets build that infra, let device indicate it and it will be smooth ride for driver
and device.
There is simply no way for the device to "support" for surprise removal, or lack such support thereof. The support is up to the slot, not the device. Any pci compliant device can be placed in a slot that allows surprise removal and that is all. The user can then remove the device. Software can then either recover gracefully - it should - or hang or crash - it does sometimes, now. The patch you are trying to revert is an attempt to move some use-cases from the 1st to the 2nd category.
It is the driver (and not the device) who needs to tell the device that it will do sane cleanup and not wait infinitely.
You can invent a way for driver to tell the device that it is not broken. But even if the driver does not do it, nothing at all prevents users from removing the device.
But what is going on now, as far as I could tell, is that someone developed a surprise removal emulation that does not actually remove the device, and is using that for testing the code in linux that supports surprise removal.
Nop. Your analysis is incorrect. And I explained you that already. The device implementation supports correct implementation where device stops all the dma and also does not support register access. And no single virtio driver supported that.
On a surprised removed device, driver expects I/Os to complete and this is beyond a 'bug fix' watermark.
That weird emulation seems to lead to all kind of weird issues. You answer is to remove the existing code and tell your testing team "we do not support surprise removal".
He he, it is no the device, it is the driver that does not support surprise removal as you can see in your proposed patches and other sw changes.
Then fix the driver. Or don't, for that matter, if you lack the time.
But just go ahead and tell this to them straight away. You do not need this patch for this.
It is needed until infrastructure in multiple subsystem is built.
What I do not understand, is what good does the revert do. Sorry.
Or better still, let's fix the issues please.
The implementation is more than a fix category for stable kernels. Hence, what is asked is to do proper implementation for future kernels and until that point restore the bad use experience.
I am not at all interested in discussing ease of backporting fixes before they are developed. Not how we do kernel development, sorry.
-- MST
From: Michael S. Tsirkin mst@redhat.com Sent: 24 August 2025 08:03 PM To: Parav Pandit parav@nvidia.com Cc: virtualization@lists.linux.dev; jasowang@redhat.com; stefanha@redhat.com; pbonzini@redhat.com; xuanzhuo@linux.alibaba.com; stable@vger.kernel.org; Max Gurtovoy mgurtovoy@nvidia.com; NBU- Contact-Li Rongqing (EXTERNAL) lirongqing@baidu.com Subject: Re: [PATCH] Revert "virtio_pci: Support surprise removal of virtio pci device"
On Sun, Aug 24, 2025 at 02:36:23AM +0000, Parav Pandit wrote:
From: Michael S. Tsirkin mst@redhat.com Sent: 22 August 2025 07:30 PM
On Fri, Aug 22, 2025 at 01:49:36PM +0000, Parav Pandit wrote:
From: Michael S. Tsirkin mst@redhat.com Sent: 22 August 2025 06:34 PM
On Fri, Aug 22, 2025 at 12:22:50PM +0000, Parav Pandit wrote:
> From: Michael S. Tsirkin mst@redhat.com > Sent: 22 August 2025 03:52 PM > > On Fri, Aug 22, 2025 at 12:17:06PM +0300, Parav Pandit wrote: > > This reverts commit 43bb40c5b926 ("virtio_pci: Support > > surprise removal of > virtio pci device"). > > > > Virtio drivers and PCI devices have never fully supported > > true surprise (aka hot unplug) removal. Drivers > > historically continued processing and waiting for pending > > I/O and even continued synchronous device reset during surprise
removal.
> > Devices have also continued completing I/Os, doing DMA and > > allowing device reset after surprise removal to support such
drivers.
> > > > Supporting it correctly would require a new device > > capability > > If a device is removed, it is removed. This is how it was implemented and none of the virtio drivers
supported it.
So vendors had stepped away from such device implementation. (not just us).
If the slot does not have a mechanical interlock, I can pull the device out. It's not up to a device implementation.
Sure yes, stack is not there yet to support it. Each of the virtio device drivers are not there yet. Lets build that infra, let device indicate it and it will be smooth ride for driver
and device.
There is simply no way for the device to "support" for surprise removal, or lack such support thereof. The support is up to the slot, not the device. Any pci compliant device can be placed in a slot that allows surprise removal and that is all. The user can then remove the device. Software can then either recover gracefully - it should - or hang or crash - it does sometimes, now. The patch you are trying to revert is an attempt to move some use-cases from the 1st to the 2nd category.
It is the driver (and not the device) who needs to tell the device that it will
do sane cleanup and not wait infinitely.
You can invent a way for driver to tell the device that it is not broken. But even if the driver does not do it, nothing at all prevents users from removing the device.
Sure. Vendors will implement the device accordingly based on the driver's config. If driver tells that it is ready for surprise removal, the device will implement functionality accordingly. And user can do anything it wants. But user will have indication and knowledge from the device, when not to break the system.
But what is going on now, as far as I could tell, is that someone developed a surprise removal emulation that does not actually remove the device, and is using that for testing the code in linux that supports
surprise removal.
Nop. Your analysis is incorrect. And I explained you that already. The device implementation supports correct implementation where device
stops all the dma and also does not support register access.
And no single virtio driver supported that.
On a surprised removed device, driver expects I/Os to complete and this is
beyond a 'bug fix' watermark.
That weird emulation seems to lead to all kind of weird issues. You answer is to remove the existing code and tell your testing team "we do not support surprise removal".
He he, it is no the device, it is the driver that does not support surprise
removal as you can see in your proposed patches and other sw changes.
Then fix the driver. Or don't, for that matter, if you lack the time.
I explained, it is not a fix. It is a completely new implementation of the infrastructure that spans, virtio, pci or core subsystem and each individual device types.
But just go ahead and tell this to them straight away. You do not need this patch for this.
It is needed until infrastructure in multiple subsystem is built.
What I do not understand, is what good does the revert do. Sorry.
Let me explain. It prevents the issue of vblk requests being stuck due to broken VQ. It prevents the vnet driver start_xmit() to be not stuck on skb completions. And virtio stack works exactly the way it was working before the commit.
Or better still, let's fix the issues please.
The implementation is more than a fix category for stable kernels. Hence, what is asked is to do proper implementation for future kernels and
until that point restore the bad use experience.
I am not at all interested in discussing ease of backporting fixes before they are developed.
Your attribution of "fixing a problem" for the missing new implementation that span across multiple different subsystem (virtio generic, block net) is not accurate. I understand that such confusion arises from your previous email regarding physical removal of the device... Hope I explained that its virtual system where user has not removed the card physically. Driver is expecting requests to complete and attempting device reset too...
Not how we do kernel development, sorry.
The ask is to revert a patch that broke the virtio stack. No one is asking to backport new implementation and new spec to old kernels.
On Tue, Aug 26, 2025 at 06:52:03PM +0000, Parav Pandit wrote:
What I do not understand, is what good does the revert do. Sorry.
Let me explain. It prevents the issue of vblk requests being stuck due to broken VQ. It prevents the vnet driver start_xmit() to be not stuck on skb completions.
This is the part I don't get. In what scenario, before 43bb40c5b9265 start_xmit is not stuck, but after 43bb40c5b9265 it is stuck?
Once the device is gone, it is not using any buffers at all.
On Wed, Aug 27 2025, "Michael S. Tsirkin" mst@redhat.com wrote:
On Tue, Aug 26, 2025 at 06:52:03PM +0000, Parav Pandit wrote:
What I do not understand, is what good does the revert do. Sorry.
Let me explain. It prevents the issue of vblk requests being stuck due to broken VQ. It prevents the vnet driver start_xmit() to be not stuck on skb completions.
This is the part I don't get. In what scenario, before 43bb40c5b9265 start_xmit is not stuck, but after 43bb40c5b9265 it is stuck?
Once the device is gone, it is not using any buffers at all.
What I also don't understand: virtio-ccw does exactly the same thing (virtio_break_device(), added in 2014), and it supports surprise removal _only_, yet I don't remember seeing bug reports?
From: Cornelia Huck cohuck@redhat.com Sent: 27 August 2025 05:04 PM
On Wed, Aug 27 2025, "Michael S. Tsirkin" mst@redhat.com wrote:
On Tue, Aug 26, 2025 at 06:52:03PM +0000, Parav Pandit wrote:
What I do not understand, is what good does the revert do. Sorry.
Let me explain. It prevents the issue of vblk requests being stuck due to broken VQ. It prevents the vnet driver start_xmit() to be not stuck on skb completions.
This is the part I don't get. In what scenario, before 43bb40c5b9265 start_xmit is not stuck, but after 43bb40c5b9265 it is stuck?
Once the device is gone, it is not using any buffers at all.
What I also don't understand: virtio-ccw does exactly the same thing (virtio_break_device(), added in 2014), and it supports surprise removal _only_, yet I don't remember seeing bug reports?
I suspect that stress testing may not have happened for ccw with active vblk Ios and outstanding transmit pkt and cvq commands. Hard to say as we don't have ccw hw or systems.
On Thu, Aug 28 2025, Parav Pandit parav@nvidia.com wrote:
From: Cornelia Huck cohuck@redhat.com Sent: 27 August 2025 05:04 PM
On Wed, Aug 27 2025, "Michael S. Tsirkin" mst@redhat.com wrote:
On Tue, Aug 26, 2025 at 06:52:03PM +0000, Parav Pandit wrote:
What I do not understand, is what good does the revert do. Sorry.
Let me explain. It prevents the issue of vblk requests being stuck due to broken VQ. It prevents the vnet driver start_xmit() to be not stuck on skb completions.
This is the part I don't get. In what scenario, before 43bb40c5b9265 start_xmit is not stuck, but after 43bb40c5b9265 it is stuck?
Once the device is gone, it is not using any buffers at all.
What I also don't understand: virtio-ccw does exactly the same thing (virtio_break_device(), added in 2014), and it supports surprise removal _only_, yet I don't remember seeing bug reports?
I suspect that stress testing may not have happened for ccw with active vblk Ios and outstanding transmit pkt and cvq commands. Hard to say as we don't have ccw hw or systems.
cc:ing linux-s390 list. I'd be surprised if nobody ever tested surprise removal on a loaded system in the last 11 years.
On Thu, Aug 28, 2025 at 02:16:28PM +0200, Cornelia Huck wrote:
On Thu, Aug 28 2025, Parav Pandit parav@nvidia.com wrote:
From: Cornelia Huck cohuck@redhat.com Sent: 27 August 2025 05:04 PM
On Wed, Aug 27 2025, "Michael S. Tsirkin" mst@redhat.com wrote:
On Tue, Aug 26, 2025 at 06:52:03PM +0000, Parav Pandit wrote:
What I do not understand, is what good does the revert do. Sorry.
Let me explain. It prevents the issue of vblk requests being stuck due to broken VQ. It prevents the vnet driver start_xmit() to be not stuck on skb completions.
This is the part I don't get. In what scenario, before 43bb40c5b9265 start_xmit is not stuck, but after 43bb40c5b9265 it is stuck?
Once the device is gone, it is not using any buffers at all.
What I also don't understand: virtio-ccw does exactly the same thing (virtio_break_device(), added in 2014), and it supports surprise removal _only_, yet I don't remember seeing bug reports?
I suspect that stress testing may not have happened for ccw with active vblk Ios and outstanding transmit pkt and cvq commands. Hard to say as we don't have ccw hw or systems.
cc:ing linux-s390 list. I'd be surprised if nobody ever tested surprise removal on a loaded system in the last 11 years.
As it became very clear from follow up discussion, the issue is nothing to do with virtio, it is with a broken hypervisor that allows device to DMA into guest memory while also telling the guest that the device has been removed.
I guess s390 is just not broken like this.
On Thu, Aug 28 2025, "Michael S. Tsirkin" mst@redhat.com wrote:
On Thu, Aug 28, 2025 at 02:16:28PM +0200, Cornelia Huck wrote:
On Thu, Aug 28 2025, Parav Pandit parav@nvidia.com wrote:
From: Cornelia Huck cohuck@redhat.com Sent: 27 August 2025 05:04 PM
On Wed, Aug 27 2025, "Michael S. Tsirkin" mst@redhat.com wrote:
On Tue, Aug 26, 2025 at 06:52:03PM +0000, Parav Pandit wrote:
> What I do not understand, is what good does the revert do. Sorry. > Let me explain. It prevents the issue of vblk requests being stuck due to broken VQ. It prevents the vnet driver start_xmit() to be not stuck on skb completions.
This is the part I don't get. In what scenario, before 43bb40c5b9265 start_xmit is not stuck, but after 43bb40c5b9265 it is stuck?
Once the device is gone, it is not using any buffers at all.
What I also don't understand: virtio-ccw does exactly the same thing (virtio_break_device(), added in 2014), and it supports surprise removal _only_, yet I don't remember seeing bug reports?
I suspect that stress testing may not have happened for ccw with active vblk Ios and outstanding transmit pkt and cvq commands. Hard to say as we don't have ccw hw or systems.
cc:ing linux-s390 list. I'd be surprised if nobody ever tested surprise removal on a loaded system in the last 11 years.
As it became very clear from follow up discussion, the issue is nothing to do with virtio, it is with a broken hypervisor that allows device to DMA into guest memory while also telling the guest that the device has been removed.
I guess s390 is just not broken like this.
Ah good, I missed that -- that indeed sounds broken, and needs to be fixed there.
From: Cornelia Huck cohuck@redhat.com Sent: 28 August 2025 05:52 PM
On Thu, Aug 28 2025, "Michael S. Tsirkin" mst@redhat.com wrote:
On Thu, Aug 28, 2025 at 02:16:28PM +0200, Cornelia Huck wrote:
On Thu, Aug 28 2025, Parav Pandit parav@nvidia.com wrote:
From: Cornelia Huck cohuck@redhat.com Sent: 27 August 2025 05:04 PM
On Wed, Aug 27 2025, "Michael S. Tsirkin" mst@redhat.com wrote:
On Tue, Aug 26, 2025 at 06:52:03PM +0000, Parav Pandit wrote: > > What I do not understand, is what good does the revert do. Sorry. > > > Let me explain. > It prevents the issue of vblk requests being stuck due to broken VQ. > It prevents the vnet driver start_xmit() to be not stuck on skb
completions.
This is the part I don't get. In what scenario, before 43bb40c5b9265 start_xmit is not stuck, but after 43bb40c5b9265 it is
stuck?
Once the device is gone, it is not using any buffers at all.
What I also don't understand: virtio-ccw does exactly the same thing (virtio_break_device(), added in 2014), and it supports surprise removal _only_, yet I don't remember seeing bug reports?
I suspect that stress testing may not have happened for ccw with active
vblk Ios and outstanding transmit pkt and cvq commands.
Hard to say as we don't have ccw hw or systems.
cc:ing linux-s390 list. I'd be surprised if nobody ever tested surprise removal on a loaded system in the last 11 years.
As it became very clear from follow up discussion, the issue is nothing to do with virtio, it is with a broken hypervisor that allows device to DMA into guest memory while also telling the guest that the device has been removed.
I guess s390 is just not broken like this.
Ah good, I missed that -- that indeed sounds broken, and needs to be fixed there.
Nop. This is not the issue. You missed this focused on fixing the device.
The fact is: the driver is expecting the IOs and CVQ commands and DMA to succeed even after device is removed. The driver is expecting the device reset to also succeed. Stefan already pointed out this in the vblk driver patches.
This is why you see call traces on del_gendisk(), CVQ commands.
Again, it is the broken drivers not the device. Device can stop the DMA and stop responding to the requests and kernel 6.X will continue to hang as long as it has cited commit.
On Thu, Aug 28, 2025 at 12:33:58PM +0000, Parav Pandit wrote:
From: Cornelia Huck cohuck@redhat.com Sent: 28 August 2025 05:52 PM
On Thu, Aug 28 2025, "Michael S. Tsirkin" mst@redhat.com wrote:
On Thu, Aug 28, 2025 at 02:16:28PM +0200, Cornelia Huck wrote:
On Thu, Aug 28 2025, Parav Pandit parav@nvidia.com wrote:
From: Cornelia Huck cohuck@redhat.com Sent: 27 August 2025 05:04 PM
On Wed, Aug 27 2025, "Michael S. Tsirkin" mst@redhat.com wrote:
> On Tue, Aug 26, 2025 at 06:52:03PM +0000, Parav Pandit wrote: >> > What I do not understand, is what good does the revert do. Sorry. >> > >> Let me explain. >> It prevents the issue of vblk requests being stuck due to broken VQ. >> It prevents the vnet driver start_xmit() to be not stuck on skb
completions.
> > This is the part I don't get. In what scenario, before > 43bb40c5b9265 start_xmit is not stuck, but after 43bb40c5b9265 it is
stuck?
> > Once the device is gone, it is not using any buffers at all.
What I also don't understand: virtio-ccw does exactly the same thing (virtio_break_device(), added in 2014), and it supports surprise removal _only_, yet I don't remember seeing bug reports?
I suspect that stress testing may not have happened for ccw with active
vblk Ios and outstanding transmit pkt and cvq commands.
Hard to say as we don't have ccw hw or systems.
cc:ing linux-s390 list. I'd be surprised if nobody ever tested surprise removal on a loaded system in the last 11 years.
As it became very clear from follow up discussion, the issue is nothing to do with virtio, it is with a broken hypervisor that allows device to DMA into guest memory while also telling the guest that the device has been removed.
I guess s390 is just not broken like this.
Ah good, I missed that -- that indeed sounds broken, and needs to be fixed there.
Nop. This is not the issue. You missed this focused on fixing the device.
The fact is: the driver is expecting the IOs and CVQ commands and DMA to succeed even after device is removed. The driver is expecting the device reset to also succeed. Stefan already pointed out this in the vblk driver patches. This is why you see call traces on del_gendisk(), CVQ commands.
Again, it is the broken drivers not the device. Device can stop the DMA and stop responding to the requests and kernel 6.X will continue to hang as long as it has cited commit.
Parav, the issues you cite are real but unrelated and will hang anyway with or without the commit.
All you have to do is pull out the device while e.g. a command is in the process of being submitted.
All the commit you want to revert does, is in some instances instead of just hanging it will make queue as broken and release memory. Since you device is not really gone and keeps DMAing into memory, guest memory gets corrupted.
But your argument that the issue is that the fix is "incomplete" is bogus - when we make the fix complete it will become even worse for this broken devices.
From: Michael S. Tsirkin mst@redhat.com Sent: 28 August 2025 06:31 PM
On Thu, Aug 28, 2025 at 12:33:58PM +0000, Parav Pandit wrote:
From: Cornelia Huck cohuck@redhat.com Sent: 28 August 2025 05:52 PM
On Thu, Aug 28 2025, "Michael S. Tsirkin" mst@redhat.com wrote:
On Thu, Aug 28, 2025 at 02:16:28PM +0200, Cornelia Huck wrote:
On Thu, Aug 28 2025, Parav Pandit parav@nvidia.com wrote:
> From: Cornelia Huck cohuck@redhat.com > Sent: 27 August 2025 05:04 PM > > On Wed, Aug 27 2025, "Michael S. Tsirkin" mst@redhat.com
wrote:
> > > On Tue, Aug 26, 2025 at 06:52:03PM +0000, Parav Pandit wrote: > >> > What I do not understand, is what good does the revert do.
Sorry.
> >> > > >> Let me explain. > >> It prevents the issue of vblk requests being stuck due to broken
VQ.
> >> It prevents the vnet driver start_xmit() to be not stuck on > >> skb
completions.
> > > > This is the part I don't get. In what scenario, before > > 43bb40c5b9265 start_xmit is not stuck, but after > > 43bb40c5b9265 it is
stuck?
> > > > Once the device is gone, it is not using any buffers at all. > > What I also don't understand: virtio-ccw does exactly the same > thing (virtio_break_device(), added in 2014), and it supports > surprise removal _only_, yet I don't remember seeing bug reports?
I suspect that stress testing may not have happened for ccw with active
vblk Ios and outstanding transmit pkt and cvq commands.
Hard to say as we don't have ccw hw or systems.
cc:ing linux-s390 list. I'd be surprised if nobody ever tested surprise removal on a loaded system in the last 11 years.
As it became very clear from follow up discussion, the issue is nothing to do with virtio, it is with a broken hypervisor that allows device to DMA into guest memory while also telling the guest that the device has been removed.
I guess s390 is just not broken like this.
Ah good, I missed that -- that indeed sounds broken, and needs to be fixed there.
Nop. This is not the issue. You missed this focused on fixing the device.
The fact is: the driver is expecting the IOs and CVQ commands and DMA to
succeed even after device is removed.
The driver is expecting the device reset to also succeed. Stefan already pointed out this in the vblk driver patches. This is why you see call traces on del_gendisk(), CVQ commands.
Again, it is the broken drivers not the device. Device can stop the DMA and stop responding to the requests and kernel
6.X will continue to hang as long as it has cited commit.
Parav, the issues you cite are real but unrelated and will hang anyway with or without the commit.
How is it unrelated?
If it is going to hang anyway (in your view), and you proposed different callback etc as brand-new feature to Linux kernel, what is the objection to revert it? As you pointed out it will be in multiple subsystems (net, block, pci) etc, why not do the proper work?
Reverting at least helps those stable kernels to operate smoothly as before.
All you have to do is pull out the device while e.g. a command is in the process of being submitted.
All the commit you want to revert does, is in some instances instead of just hanging it will make queue as broken and release memory. Since you device is not really gone and keeps DMAing into memory, guest memory gets corrupted.
Nop. This is not the case. What is "some instance"? The virtio block driver is expecting the IOs to be completed without the cited commit. As you listed cross subsystem callbacks, such infrastructure != fix.
So to make things clear, as discussed. 1. have proper kernel infrastructure in placed as you outlined the design using callback 2. have the spec update to make sure drivers negotiate its readiness for surprise removal and do not expect to access the device.
Until that point, restore the stability of stable kernels.
But your argument that the issue is that the fix is "incomplete" is bogus -
when we make the fix complete it will become even worse for this broken devices.
I explained you that the device was doing the right thing and that is why exactly the call trace in the cited patch showed up.
Again quoting "broken device" is wrong. The drivers are broken trying to reset the removed device. And the ask is to do proper feature negotiation to get to that point.
When the reasonable workaround is suggested in previous email, you opted to not respond to it? This is not how broken user experience is restored for stable kernel.
Are you waiting for the test results if it works? If so, then yes, it makes sense. I will of course test and submit proper v2.
linux-stable-mirror@lists.linaro.org