This reverts commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device").
Virtio drivers and PCI devices have never fully supported true surprise (aka hot unplug) removal. Drivers historically continued processing and waiting for pending I/O and even continued synchronous device reset during surprise removal. Devices have also continued completing I/Os, doing DMA and allowing device reset after surprise removal to support such drivers.
Supporting it correctly would require a new device capability and driver negotiation in the virtio specification to safely stop I/O and free queue memory. Failure to do so either breaks all the existing drivers with call trace listed in the commit or crashes the host on continuing the DMA. Hence, until such specification and devices are invented, restore the previous behavior of treating surprise removal as graceful removal to avoid regressions and maintain system stability same as before the commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device").
As explained above, previous analysis of solving this only in driver was incomplete and non-reliable at [1] and at [2]; Hence reverting commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device") is still the best stand to restore failures of virtio net and block devices.
[1] https://lore.kernel.org/virtualization/CY8PR12MB719506CC5613EB100BC6C638 DCBD2@CY8PR12MB7195.namprd12.prod.outlook.com/#t [2] https://lore.kernel.org/virtualization/20250602024358.57114-1-parav@nvidia.c om/
Fixes: 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device") Cc: stable@vger.kernel.org Reported-by: lirongqing@baidu.com Closes: https://lore.kernel.org/virtualization/c45dd68698cd47238c55fb73ca9b4741@b aidu.com/ Signed-off-by: Parav Pandit parav@nvidia.com
Tested-by: Li RongQing lirongqing@baidu.com
Thanks
-Li
drivers/virtio/virtio_pci_common.c | 7 ------- 1 file changed, 7 deletions(-)
diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c index d6d79af44569..dba5eb2eaff9 100644 --- a/drivers/virtio/virtio_pci_common.c +++ b/drivers/virtio/virtio_pci_common.c @@ -747,13 +747,6 @@ static void virtio_pci_remove(struct pci_dev *pci_dev) struct virtio_pci_device *vp_dev = pci_get_drvdata(pci_dev); struct device *dev = get_device(&vp_dev->vdev.dev);
/*
* Device is marked broken on surprise removal so that virtio upper
* layers can abort any ongoing operation.
*/
if (!pci_device_is_present(pci_dev))
virtio_break_device(&vp_dev->vdev);
pci_disable_sriov(pci_dev);
unregister_virtio_device(&vp_dev->vdev);
-- 2.26.2
From: Li,Rongqing lirongqing@baidu.com Sent: 22 August 2025 03:57 PM
This reverts commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device").
Virtio drivers and PCI devices have never fully supported true surprise (aka hot unplug) removal. Drivers historically continued processing and waiting for pending I/O and even continued synchronous device reset during surprise removal. Devices have also continued completing I/Os, doing DMA and allowing device reset after surprise removal to support such drivers.
Supporting it correctly would require a new device capability and driver negotiation in the virtio specification to safely stop I/O and free queue
memory.
Failure to do so either breaks all the existing drivers with call trace listed in the commit or crashes the host on continuing the DMA. Hence, until such specification and devices are invented, restore the previous behavior of treating surprise removal as graceful removal to avoid regressions and maintain system stability same as before the commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci
device").
As explained above, previous analysis of solving this only in driver was incomplete and non-reliable at [1] and at [2]; Hence reverting commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device") is still the best stand to restore failures of virtio net and block
devices.
[1]
https://lore.kernel.org/virtualization/CY8PR12MB719506CC5613EB100BC6C6
38 DCBD2@CY8PR12MB7195.namprd12.prod.outlook.com/#t [2] https://lore.kernel.org/virtualization/20250602024358.57114-1-parav@nv idia.c om/
Fixes: 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device") Cc: stable@vger.kernel.org Reported-by: lirongqing@baidu.com Closes: https://lore.kernel.org/virtualization/c45dd68698cd47238c55fb73ca9b474 1@b aidu.com/ Signed-off-by: Parav Pandit parav@nvidia.com
Tested-by: Li RongQing lirongqing@baidu.com
Thanks
-Li
Multiple users are blocked to have this fix in stable kernel. Thanks a lot Li for the quick test.
drivers/virtio/virtio_pci_common.c | 7 ------- 1 file changed, 7 deletions(-)
diff --git a/drivers/virtio/virtio_pci_common.c b/drivers/virtio/virtio_pci_common.c index d6d79af44569..dba5eb2eaff9 100644 --- a/drivers/virtio/virtio_pci_common.c +++ b/drivers/virtio/virtio_pci_common.c @@ -747,13 +747,6 @@ static void virtio_pci_remove(struct pci_dev
*pci_dev)
struct virtio_pci_device *vp_dev = pci_get_drvdata(pci_dev); struct device *dev = get_device(&vp_dev->vdev.dev);
/*
* Device is marked broken on surprise removal so that virtio upper
* layers can abort any ongoing operation.
*/
if (!pci_device_is_present(pci_dev))
virtio_break_device(&vp_dev->vdev);
pci_disable_sriov(pci_dev);
unregister_virtio_device(&vp_dev->vdev);
-- 2.26.2
On Fri, Aug 22, 2025 at 12:24:06PM +0000, Parav Pandit wrote:
From: Li,Rongqing lirongqing@baidu.com Sent: 22 August 2025 03:57 PM
This reverts commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device").
Virtio drivers and PCI devices have never fully supported true surprise (aka hot unplug) removal. Drivers historically continued processing and waiting for pending I/O and even continued synchronous device reset during surprise removal. Devices have also continued completing I/Os, doing DMA and allowing device reset after surprise removal to support such drivers.
Supporting it correctly would require a new device capability and driver negotiation in the virtio specification to safely stop I/O and free queue
memory.
Failure to do so either breaks all the existing drivers with call trace listed in the commit or crashes the host on continuing the DMA. Hence, until such specification and devices are invented, restore the previous behavior of treating surprise removal as graceful removal to avoid regressions and maintain system stability same as before the commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci
device").
As explained above, previous analysis of solving this only in driver was incomplete and non-reliable at [1] and at [2]; Hence reverting commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device") is still the best stand to restore failures of virtio net and block
devices.
[1]
https://lore.kernel.org/virtualization/CY8PR12MB719506CC5613EB100BC6C6
38 DCBD2@CY8PR12MB7195.namprd12.prod.outlook.com/#t [2] https://lore.kernel.org/virtualization/20250602024358.57114-1-parav@nv idia.c om/
Fixes: 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device") Cc: stable@vger.kernel.org Reported-by: lirongqing@baidu.com Closes: https://lore.kernel.org/virtualization/c45dd68698cd47238c55fb73ca9b474 1@b aidu.com/ Signed-off-by: Parav Pandit parav@nvidia.com
Tested-by: Li RongQing lirongqing@baidu.com
Thanks
-Li
Multiple users are blocked to have this fix in stable kernel.
what are these users doing that is blocked by this fix?
From: Michael S. Tsirkin mst@redhat.com Sent: 22 August 2025 06:35 PM
On Fri, Aug 22, 2025 at 12:24:06PM +0000, Parav Pandit wrote:
From: Li,Rongqing lirongqing@baidu.com Sent: 22 August 2025 03:57 PM
This reverts commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device").
Virtio drivers and PCI devices have never fully supported true surprise (aka hot unplug) removal. Drivers historically continued processing and waiting for pending I/O and even continued synchronous device reset during surprise removal. Devices have also continued completing I/Os, doing DMA and allowing device reset after surprise
removal to support such drivers.
Supporting it correctly would require a new device capability and driver negotiation in the virtio specification to safely stop I/O and free queue
memory.
Failure to do so either breaks all the existing drivers with call trace listed in the commit or crashes the host on continuing the DMA. Hence, until such specification and devices are invented, restore the previous behavior of treating surprise removal as graceful removal to avoid regressions and maintain system stability same as before the commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci
device").
As explained above, previous analysis of solving this only in driver was incomplete and non-reliable at [1] and at [2]; Hence reverting commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device") is still the best stand to restore failures of virtio net and block
devices.
[1]
https://lore.kernel.org/virtualization/CY8PR12MB719506CC5613EB100BC6 C6
38 DCBD2@CY8PR12MB7195.namprd12.prod.outlook.com/#t [2] https://lore.kernel.org/virtualization/20250602024358.57114-1-para v@nv idia.c om/
Fixes: 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device") Cc: stable@vger.kernel.org Reported-by: lirongqing@baidu.com Closes: https://lore.kernel.org/virtualization/c45dd68698cd47238c55fb73ca9 b474 1@b aidu.com/ Signed-off-by: Parav Pandit parav@nvidia.com
Tested-by: Li RongQing lirongqing@baidu.com
Thanks
-Li
Multiple users are blocked to have this fix in stable kernel.
what are these users doing that is blocked by this fix?
Not sure I understand the question. Let me try to answer. They are unable to dynamically add/remove the virtio net, block, fs devices in their systems. Users have their networking applications running over NS network and database and file system through these devices. Some of them keep reverting the patch. Some are unable to. They are in search of stable kernel.
Did I understand your question?
-- MST
On Fri, Aug 22, 2025 at 01:53:02PM +0000, Parav Pandit wrote:
From: Michael S. Tsirkin mst@redhat.com Sent: 22 August 2025 06:35 PM
On Fri, Aug 22, 2025 at 12:24:06PM +0000, Parav Pandit wrote:
From: Li,Rongqing lirongqing@baidu.com Sent: 22 August 2025 03:57 PM
This reverts commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device").
Virtio drivers and PCI devices have never fully supported true surprise (aka hot unplug) removal. Drivers historically continued processing and waiting for pending I/O and even continued synchronous device reset during surprise removal. Devices have also continued completing I/Os, doing DMA and allowing device reset after surprise
removal to support such drivers.
Supporting it correctly would require a new device capability and driver negotiation in the virtio specification to safely stop I/O and free queue
memory.
Failure to do so either breaks all the existing drivers with call trace listed in the commit or crashes the host on continuing the DMA. Hence, until such specification and devices are invented, restore the previous behavior of treating surprise removal as graceful removal to avoid regressions and maintain system stability same as before the commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci
device").
As explained above, previous analysis of solving this only in driver was incomplete and non-reliable at [1] and at [2]; Hence reverting commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device") is still the best stand to restore failures of virtio net and block
devices.
[1]
https://lore.kernel.org/virtualization/CY8PR12MB719506CC5613EB100BC6 C6
38 DCBD2@CY8PR12MB7195.namprd12.prod.outlook.com/#t [2] https://lore.kernel.org/virtualization/20250602024358.57114-1-para v@nv idia.c om/
Fixes: 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device") Cc: stable@vger.kernel.org Reported-by: lirongqing@baidu.com Closes: https://lore.kernel.org/virtualization/c45dd68698cd47238c55fb73ca9 b474 1@b aidu.com/ Signed-off-by: Parav Pandit parav@nvidia.com
Tested-by: Li RongQing lirongqing@baidu.com
Thanks
-Li
Multiple users are blocked to have this fix in stable kernel.
what are these users doing that is blocked by this fix?
Not sure I understand the question. Let me try to answer. They are unable to dynamically add/remove the virtio net, block, fs devices in their systems. Users have their networking applications running over NS network and database and file system through these devices. Some of them keep reverting the patch. Some are unable to. They are in search of stable kernel.
Did I understand your question?
Not really, sorry.
Does the system or does it not have a mechanical interlock?
If it does, how does a user run into surprise removal issues without the ability to remove the device?
If it does not, and a user pull out the working device, how does your patch help?
From: Michael S. Tsirkin mst@redhat.com Sent: 22 August 2025 07:32 PM
On Fri, Aug 22, 2025 at 01:53:02PM +0000, Parav Pandit wrote:
From: Michael S. Tsirkin mst@redhat.com Sent: 22 August 2025 06:35 PM
On Fri, Aug 22, 2025 at 12:24:06PM +0000, Parav Pandit wrote:
From: Li,Rongqing lirongqing@baidu.com Sent: 22 August 2025 03:57 PM
This reverts commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device").
Virtio drivers and PCI devices have never fully supported true surprise (aka hot unplug) removal. Drivers historically continued processing and waiting for pending I/O and even continued synchronous device reset during surprise removal. Devices have also continued completing I/Os, doing DMA and allowing device reset after surprise
removal to support such drivers.
Supporting it correctly would require a new device capability and driver negotiation in the virtio specification to safely stop I/O and free queue
memory.
Failure to do so either breaks all the existing drivers with call trace listed in the commit or crashes the host on continuing the
DMA.
Hence, until such specification and devices are invented, restore the previous behavior of treating surprise removal as graceful removal to avoid regressions and maintain system stability same as before the commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci
device").
As explained above, previous analysis of solving this only in driver was incomplete and non-reliable at [1] and at [2]; Hence reverting commit 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device") is still the best stand to restore failures of virtio net and block
devices.
[1]
https://lore.kernel.org/virtualization/CY8PR12MB719506CC5613EB10 0BC6 C6
38 DCBD2@CY8PR12MB7195.namprd12.prod.outlook.com/#t [2] https://lore.kernel.org/virtualization/20250602024358.57114-1- para v@nv idia.c om/
Fixes: 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio pci device") Cc: stable@vger.kernel.org Reported-by: lirongqing@baidu.com Closes: https://lore.kernel.org/virtualization/c45dd68698cd47238c55fb7 3ca9 b474 1@b aidu.com/ Signed-off-by: Parav Pandit parav@nvidia.com
Tested-by: Li RongQing lirongqing@baidu.com
Thanks
-Li
Multiple users are blocked to have this fix in stable kernel.
what are these users doing that is blocked by this fix?
Not sure I understand the question. Let me try to answer. They are unable to dynamically add/remove the virtio net, block, fs devices in
their systems.
Users have their networking applications running over NS network and
database and file system through these devices.
Some of them keep reverting the patch. Some are unable to. They are in search of stable kernel.
Did I understand your question?
Not really, sorry.
Does the system or does it not have a mechanical interlock?
It is modern system beyond mechanical interlock but has the ability for surprise removal.
If it does, how does a user run into surprise removal issues without the ability to remove the device?
User has the ability to surprise removal a device from the slot via the slot's pci registers. Yet the device is capable enough to fulfil the needs of broken drivers which are waiting for the pending requests to arrive.
If it does not, and a user pull out the working device, how does your patch help?
A driver must tell that it will not follow broken ancient behaviour and at that point device would stop its ancient backward compatibility mode.
-- MST
On Sun, Aug 24, 2025 at 02:36:11AM +0000, Parav Pandit wrote:
From: Michael S. Tsirkin mst@redhat.com Sent: 22 August 2025 07:32 PM
On Fri, Aug 22, 2025 at 01:53:02PM +0000, Parav Pandit wrote:
From: Michael S. Tsirkin mst@redhat.com Sent: 22 August 2025 06:35 PM
On Fri, Aug 22, 2025 at 12:24:06PM +0000, Parav Pandit wrote:
From: Li,Rongqing lirongqing@baidu.com Sent: 22 August 2025 03:57 PM
> This reverts commit 43bb40c5b926 ("virtio_pci: Support > surprise removal of virtio pci device"). > > Virtio drivers and PCI devices have never fully supported true > surprise (aka hot > unplug) removal. Drivers historically continued processing and > waiting for pending I/O and even continued synchronous device > reset during surprise removal. Devices have also continued > completing I/Os, doing DMA and allowing device reset after > surprise
removal to support such drivers.
> > Supporting it correctly would require a new device capability > and driver negotiation in the virtio specification to safely > stop I/O and free queue memory. > Failure to do so either breaks all the existing drivers with > call trace listed in the commit or crashes the host on continuing the
DMA.
> Hence, until such specification and devices are invented, > restore the previous behavior of treating surprise removal as > graceful removal to avoid regressions and maintain system > stability same as before the commit 43bb40c5b926 ("virtio_pci: > Support surprise removal of virtio pci device"). > > As explained above, previous analysis of solving this only in > driver was incomplete and non-reliable at [1] and at [2]; > Hence reverting commit > 43bb40c5b926 ("virtio_pci: Support surprise removal of virtio > pci > device") is still the best stand to restore failures of virtio > net and block devices. > > [1] > https://lore.kernel.org/virtualization/CY8PR12MB719506CC5613EB10 0BC6 C6 > 38 DCBD2@CY8PR12MB7195.namprd12.prod.outlook.com/#t > [2] > https://lore.kernel.org/virtualization/20250602024358.57114-1- > para > v@nv > idia.c > om/ > > Fixes: 43bb40c5b926 ("virtio_pci: Support surprise removal of > virtio pci device") > Cc: stable@vger.kernel.org > Reported-by: lirongqing@baidu.com > Closes: > https://lore.kernel.org/virtualization/c45dd68698cd47238c55fb7 > 3ca9 > b474 > 1@b > aidu.com/ > Signed-off-by: Parav Pandit parav@nvidia.com
Tested-by: Li RongQing lirongqing@baidu.com
Thanks
-Li
Multiple users are blocked to have this fix in stable kernel.
what are these users doing that is blocked by this fix?
Not sure I understand the question. Let me try to answer. They are unable to dynamically add/remove the virtio net, block, fs devices in
their systems.
Users have their networking applications running over NS network and
database and file system through these devices.
Some of them keep reverting the patch. Some are unable to. They are in search of stable kernel.
Did I understand your question?
Not really, sorry.
Does the system or does it not have a mechanical interlock?
It is modern system beyond mechanical interlock but has the ability for surprise removal.
I am not sure what does "beyond" mean. I guess that it does not have it?
If it does, how does a user run into surprise removal issues without the ability to remove the device?
User has the ability to surprise removal a device from the slot via the slot's pci registers.
I don't know what this means. Surprise removal is done by removing the device. Not via pci registers.
Yet the device is capable enough to fulfil the needs of broken drivers which are waiting for the pending requests to arrive.
I don't know what this means. A removed device can not do anything at all.
If it does not, and a user pull out the working device, how does your patch help?
A driver must tell that it will not follow broken ancient behaviour and at that point device would stop its ancient backward compatibility mode.
I don't know what is "ancient backward compatibility mode".
-- MST
From: Michael S. Tsirkin mst@redhat.com Sent: 24 August 2025 08:00 PM
On Sun, Aug 24, 2025 at 02:36:11AM +0000, Parav Pandit wrote:
From: Michael S. Tsirkin mst@redhat.com Sent: 22 August 2025 07:32 PM
On Fri, Aug 22, 2025 at 01:53:02PM +0000, Parav Pandit wrote:
From: Michael S. Tsirkin mst@redhat.com Sent: 22 August 2025 06:35 PM
On Fri, Aug 22, 2025 at 12:24:06PM +0000, Parav Pandit wrote:
> From: Li,Rongqing lirongqing@baidu.com > Sent: 22 August 2025 03:57 PM > > > This reverts commit 43bb40c5b926 ("virtio_pci: Support > > surprise removal of virtio pci device"). > > > > Virtio drivers and PCI devices have never fully supported > > true surprise (aka hot > > unplug) removal. Drivers historically continued processing > > and waiting for pending I/O and even continued synchronous > > device reset during surprise removal. Devices have also > > continued completing I/Os, doing DMA and allowing device > > reset after surprise
removal to support such drivers.
> > > > Supporting it correctly would require a new device > > capability and driver negotiation in the virtio > > specification to safely stop I/O and free queue > memory. > > Failure to do so either breaks all the existing drivers > > with call trace listed in the commit or crashes the host > > on continuing the
DMA.
> > Hence, until such specification and devices are invented, > > restore the previous behavior of treating surprise removal > > as graceful removal to avoid regressions and maintain > > system stability same as before the commit 43bb40c5b926
("virtio_pci:
> > Support surprise removal of virtio pci > device"). > > > > As explained above, previous analysis of solving this only > > in driver was incomplete and non-reliable at [1] and at > > [2]; Hence reverting commit > > 43bb40c5b926 ("virtio_pci: Support surprise removal of > > virtio pci > > device") is still the best stand to restore failures of > > virtio net and block > devices. > > > > [1] > > > https://lore.kernel.org/virtualization/CY8PR12MB719506CC5613 > EB10 > 0BC6 > C6 > > 38 DCBD2@CY8PR12MB7195.namprd12.prod.outlook.com/#t > > [2] > > https://lore.kernel.org/virtualization/20250602024358.5711 > > 4-1- > > para > > v@nv > > idia.c > > om/ > > > > Fixes: 43bb40c5b926 ("virtio_pci: Support surprise removal > > of virtio pci device") > > Cc: stable@vger.kernel.org > > Reported-by: lirongqing@baidu.com > > Closes: > > https://lore.kernel.org/virtualization/c45dd68698cd47238c5 > > 5fb7 > > 3ca9 > > b474 > > 1@b > > aidu.com/ > > Signed-off-by: Parav Pandit parav@nvidia.com > > > > Tested-by: Li RongQing lirongqing@baidu.com > > Thanks > > -Li > Multiple users are blocked to have this fix in stable kernel.
what are these users doing that is blocked by this fix?
Not sure I understand the question. Let me try to answer. They are unable to dynamically add/remove the virtio net, block, fs devices in
their systems.
Users have their networking applications running over NS network and
database and file system through these devices.
Some of them keep reverting the patch. Some are unable to. They are in search of stable kernel.
Did I understand your question?
Not really, sorry.
Does the system or does it not have a mechanical interlock?
It is modern system beyond mechanical interlock but has the ability for
surprise removal.
I am not sure what does "beyond" mean. I guess that it does not have it?
Right.
If it does, how does a user run into surprise removal issues without the ability to remove the device?
User has the ability to surprise removal a device from the slot via the slot's
pci registers.
I don't know what this means. Surprise removal is done by removing the device. Not via pci registers.
They are many ways to surprise remove a device from a slot. Slots are capable to detect the device removal and when that occurs, the pcie switch/bridge updates the slot registers for the downstream port and indicate to the sw. Some slots are physical in nature. Some are virtual slots in the pcie switch/bridge. End result is, sw gets to know this sw PCIe registers in switch/bridge/port level.
Yet the device is capable enough to fulfil the needs of broken drivers which
are waiting for the pending requests to arrive.
I don't know what this means. A removed device can not do anything at all.
If this was really physically removed, yes. but in virtual system, the device is still present on the slot until it gets indication from the OS.
If it does not, and a user pull out the working device, how does your patch help?
A driver must tell that it will not follow broken ancient behaviour and at that
point device would stop its ancient backward compatibility mode.
I don't know what is "ancient backward compatibility mode".
Let me explain. Sadly, CSPs virtio pci device implementation is done such a way that, it works with ancient Linux kernel which does not have commit 43bb40c5b9265. Commit 43bb40c5b9265 was partial in nature. Hence request to revert and do proper implementation like you mentioned using extra callback and/or with spec extension.
-- MST
On Tue, Aug 26, 2025 at 06:52:11PM +0000, Parav Pandit wrote:
If it does not, and a user pull out the working device, how does your patch help?
A driver must tell that it will not follow broken ancient behaviour and at that
point device would stop its ancient backward compatibility mode.
I don't know what is "ancient backward compatibility mode".
Let me explain. Sadly, CSPs virtio pci device implementation is done such a way that, it works with ancient Linux kernel which does not have commit 43bb40c5b9265.
OK we are getting new information here.
So let me summarize. There's a virtual system that pretends, to the guest, that device was removed by surprise removal, but actually device is there and is still doing DMA. Is that a fair summary?
On Wed, Aug 27, 2025 at 06:21:28AM -0400, Michael S. Tsirkin wrote:
On Tue, Aug 26, 2025 at 06:52:11PM +0000, Parav Pandit wrote:
If it does not, and a user pull out the working device, how does your patch help?
A driver must tell that it will not follow broken ancient behaviour and at that
point device would stop its ancient backward compatibility mode.
I don't know what is "ancient backward compatibility mode".
Let me explain. Sadly, CSPs virtio pci device implementation is done such a way that, it works with ancient Linux kernel which does not have commit 43bb40c5b9265.
OK we are getting new information here.
So let me summarize. There's a virtual system that pretends, to the guest, that device was removed by surprise removal, but actually device is there and is still doing DMA. Is that a fair summary?
If that is the case, the thing to do would be to try and detect the fake removal and then work with device as usual - device not doing DMA after removal is pretty fundamental, after all.
For example, how about reading device control+status?
If we get all ones device has been removed If we get 0 in bus master: device has been removed but re-inserted Anything else is a fake removal
Hmm?
-- MST
From: Michael S. Tsirkin mst@redhat.com Sent: 27 August 2025 04:19 PM
On Wed, Aug 27, 2025 at 06:21:28AM -0400, Michael S. Tsirkin wrote:
On Tue, Aug 26, 2025 at 06:52:11PM +0000, Parav Pandit wrote:
If it does not, and a user pull out the working device, how does your patch help?
A driver must tell that it will not follow broken ancient behaviour and at that
point device would stop its ancient backward compatibility mode.
I don't know what is "ancient backward compatibility mode".
Let me explain. Sadly, CSPs virtio pci device implementation is done such a way that, it
works with ancient Linux kernel which does not have commit 43bb40c5b9265.
OK we are getting new information here.
So let me summarize. There's a virtual system that pretends, to the guest, that device was removed by surprise removal, but actually device is there and is still doing DMA. Is that a fair summary?
Yes.
If that is the case, the thing to do would be to try and detect the fake removal and then work with device as usual - device not doing DMA after removal is pretty fundamental, after all.
The issue is: one can build the device to stop the DMA. There is no predictable combination for the driver and device that can work for the user. For example, Device that stops the dma will not work before the commit 43bb40c5b9265. Device that continues the dma will not work with whatever new implementation done in future kernels.
Hence the capability negotiation would be needed so that device can stop the DMA, config interrupts etc.
For example, how about reading device control+status?
Most platforms read 0xffff on non-existing device, but not sure if this the standard or well defined.
If we get all ones device has been removed If we get 0 in bus master: device has been removed but re-inserted Anything else is a fake removal
Bus master check may pass, right returning all 1s, even if the device is removed, isn't it?
Hmm?
-- MST
On Thu, Aug 28, 2025 at 06:23:02AM +0000, Parav Pandit wrote:
From: Michael S. Tsirkin mst@redhat.com Sent: 27 August 2025 04:19 PM
On Wed, Aug 27, 2025 at 06:21:28AM -0400, Michael S. Tsirkin wrote:
On Tue, Aug 26, 2025 at 06:52:11PM +0000, Parav Pandit wrote:
> If it does not, and a user pull out the working device, how > does your patch help? > A driver must tell that it will not follow broken ancient behaviour and at that
point device would stop its ancient backward compatibility mode.
I don't know what is "ancient backward compatibility mode".
Let me explain. Sadly, CSPs virtio pci device implementation is done such a way that, it
works with ancient Linux kernel which does not have commit 43bb40c5b9265.
OK we are getting new information here.
So let me summarize. There's a virtual system that pretends, to the guest, that device was removed by surprise removal, but actually device is there and is still doing DMA. Is that a fair summary?
Yes.
If that is the case, the thing to do would be to try and detect the fake removal and then work with device as usual - device not doing DMA after removal is pretty fundamental, after all.
The issue is: one can build the device to stop the DMA. There is no predictable combination for the driver and device that can work for the user. For example, Device that stops the dma will not work before the commit 43bb40c5b9265. Device that continues the dma will not work with whatever new implementation done in future kernels.
Hence the capability negotiation would be needed so that device can stop the DMA, config interrupts etc.
So this is a broken implementation at the pci level. We really can't fix removal for this device at all, except by fixing the device. Whatever works, works by chance. Feature negotiation in spec is not the way to fix that, but some work arounds in the driver to skip the device are acceptable, mostly to not bother with it.
Pls document exactly how this pci looks. Does it have an id we can use to detect it?
For example, how about reading device control+status?
Most platforms read 0xffff on non-existing device, but not sure if this the standard or well defined.
IIRC it's in the pci spec as a note.
If we get all ones device has been removed If we get 0 in bus master: device has been removed but re-inserted Anything else is a fake removal
Bus master check may pass, right returning all 1s, even if the device is removed, isn't it?
So we check all ones 1st, only check bus master if not all ones?
Hmm?
-- MST
From: Michael S. Tsirkin mst@redhat.com Sent: 28 August 2025 12:04 PM
On Thu, Aug 28, 2025 at 06:23:02AM +0000, Parav Pandit wrote:
From: Michael S. Tsirkin mst@redhat.com Sent: 27 August 2025 04:19 PM
On Wed, Aug 27, 2025 at 06:21:28AM -0400, Michael S. Tsirkin wrote:
On Tue, Aug 26, 2025 at 06:52:11PM +0000, Parav Pandit wrote:
> > If it does not, and a user pull out the working device, > > how does your patch help? > > > A driver must tell that it will not follow broken ancient > behaviour and at that point device would stop its ancient backward compatibility mode.
I don't know what is "ancient backward compatibility mode".
Let me explain. Sadly, CSPs virtio pci device implementation is done such a way that, it
works with ancient Linux kernel which does not have commit 43bb40c5b9265.
OK we are getting new information here.
So let me summarize. There's a virtual system that pretends, to the guest, that device was removed by surprise removal, but actually device is there and is still doing DMA. Is that a fair summary?
Yes.
If that is the case, the thing to do would be to try and detect the fake removal and then work with device as usual - device not doing DMA after removal is pretty fundamental, after all.
The issue is: one can build the device to stop the DMA. There is no predictable combination for the driver and device that can work
for the user.
For example, Device that stops the dma will not work before the commit 43bb40c5b9265. Device that continues the dma will not work with whatever new
implementation done in future kernels.
Hence the capability negotiation would be needed so that device can stop the
DMA, config interrupts etc.
So this is a broken implementation at the pci level. We really can't fix removal for this device at all, except by fixing the device.
The device to be told how to behave with/without commit 43bb40c5b9265. Not sure what you mean by 'fix the device'.
Users are running stable kernel that has commit 43bb40c5b9265 and its broken setup for them.
Whatever works, works by chance. Feature negotiation in spec is not the way to fix that, but some work arounds in the driver to skip the device are acceptable, mostly to not bother with it.
Why not? It sounds like we need feature bit like VERSION_1 or ORDER_PLATFORM.
To _fix_ a stable kernel, if you have a suggestion, please suggest.
Pls document exactly how this pci looks. Does it have an id we can use to detect it?
CSPs have different device and vendor id for vnet, blk vfs. Is that what you mean by id?
For example, how about reading device control+status?
Most platforms read 0xffff on non-existing device, but not sure if this the
standard or well defined.
IIRC it's in the pci spec as a note.
Checking.
If we get all ones device has been removed If we get 0 in bus master: device has been removed but re-inserted Anything else is a fake removal
Bus master check may pass, right returning all 1s, even if the device is
removed, isn't it?
So we check all ones 1st, only check bus master if not all ones?
Pci subsystem typically checks the vendor and device ids, and if its not all 1s, its safe enough check.
How about a fix something like this:
--- a/drivers/virtio/virtio_pci_common.c +++ b/drivers/virtio/virtio_pci_common.c @@ -746,12 +746,16 @@ static void virtio_pci_remove(struct pci_dev *pci_dev) { struct virtio_pci_device *vp_dev = pci_get_drvdata(pci_dev); struct device *dev = get_device(&vp_dev->vdev.dev); + u32 v;
/* * Device is marked broken on surprise removal so that virtio upper * layers can abort any ongoing operation. + * Make sure that device is truly removed by directly interacting + * with the device (and not just depend on the slot registers). */ - if (!pci_device_is_present(pci_dev)) + if (!pci_device_is_present(pci_dev) && + !pci_bus_read_dev_vendor_id(pci_dev->bus, pci_dev->devfn, &v, 0)) virtio_break_device(&vp_dev->vdev);
So if the device is still there, it let it go through its usual cleanup flow. And post this fix, a proper implementation with callback etc that you described can be implemented.
On Thu, Aug 28, 2025 at 06:59:26AM +0000, Parav Pandit wrote:
From: Michael S. Tsirkin mst@redhat.com Sent: 28 August 2025 12:04 PM
On Thu, Aug 28, 2025 at 06:23:02AM +0000, Parav Pandit wrote:
From: Michael S. Tsirkin mst@redhat.com Sent: 27 August 2025 04:19 PM
On Wed, Aug 27, 2025 at 06:21:28AM -0400, Michael S. Tsirkin wrote:
On Tue, Aug 26, 2025 at 06:52:11PM +0000, Parav Pandit wrote:
> > > If it does not, and a user pull out the working device, > > > how does your patch help? > > > > > A driver must tell that it will not follow broken ancient > > behaviour and at that > point device would stop its ancient backward compatibility mode. > > > > I don't know what is "ancient backward compatibility mode". > Let me explain. Sadly, CSPs virtio pci device implementation is done such a way that, it
works with ancient Linux kernel which does not have commit 43bb40c5b9265.
OK we are getting new information here.
So let me summarize. There's a virtual system that pretends, to the guest, that device was removed by surprise removal, but actually device is there and is still doing DMA. Is that a fair summary?
Yes.
If that is the case, the thing to do would be to try and detect the fake removal and then work with device as usual - device not doing DMA after removal is pretty fundamental, after all.
The issue is: one can build the device to stop the DMA. There is no predictable combination for the driver and device that can work
for the user.
For example, Device that stops the dma will not work before the commit 43bb40c5b9265. Device that continues the dma will not work with whatever new
implementation done in future kernels.
Hence the capability negotiation would be needed so that device can stop the
DMA, config interrupts etc.
So this is a broken implementation at the pci level. We really can't fix removal for this device at all, except by fixing the device.
The device to be told how to behave with/without commit 43bb40c5b9265. Not sure what you mean by 'fix the device'.
Users are running stable kernel that has commit 43bb40c5b9265 and its broken setup for them.
Whatever works, works by chance. Feature negotiation in spec is not the way to fix that, but some work arounds in the driver to skip the device are acceptable, mostly to not bother with it.
Why not? It sounds like we need feature bit like VERSION_1 or ORDER_PLATFORM.
Because the device is out of spec (PCI spec which virtio references).
Besides the bug is not in the device, it's in the pci emulation.
To _fix_ a stable kernel, if you have a suggestion, please suggest.
Pls document exactly how this pci looks. Does it have an id we can use to detect it?
CSPs have different device and vendor id for vnet, blk vfs. Is that what you mean by id?
vendor id is one way, yes. maybe a revision check, too.
For example, how about reading device control+status?
Most platforms read 0xffff on non-existing device, but not sure if this the
standard or well defined.
IIRC it's in the pci spec as a note.
Checking.
If we get all ones device has been removed If we get 0 in bus master: device has been removed but re-inserted Anything else is a fake removal
Bus master check may pass, right returning all 1s, even if the device is
removed, isn't it?
So we check all ones 1st, only check bus master if not all ones?
Pci subsystem typically checks the vendor and device ids, and if its not all 1s, its safe enough check.
How about a fix something like this:
--- a/drivers/virtio/virtio_pci_common.c +++ b/drivers/virtio/virtio_pci_common.c @@ -746,12 +746,16 @@ static void virtio_pci_remove(struct pci_dev *pci_dev) { struct virtio_pci_device *vp_dev = pci_get_drvdata(pci_dev); struct device *dev = get_device(&vp_dev->vdev.dev);
u32 v; /* * Device is marked broken on surprise removal so that virtio upper * layers can abort any ongoing operation.
* Make sure that device is truly removed by directly interacting
* with the device (and not just depend on the slot registers). */
if (!pci_device_is_present(pci_dev))
if (!pci_device_is_present(pci_dev) &&
!pci_bus_read_dev_vendor_id(pci_dev->bus, pci_dev->devfn, &v, 0)) virtio_break_device(&vp_dev->vdev);
So if the device is still there, it let it go through its usual cleanup flow. And post this fix, a proper implementation with callback etc that you described can be implemented.
I don't have a big problem with this, but I don't understand the scenario now again. report_error_detected relies on dev->error_state and bus read. error_state is set on AER reporting an error. This is not what you described.
Does the patch actually solve the problem for you?
Also can we limit this to a specific vendor id, or something like that?
I also still like the idea of reading dev control and status, since it always bothered me that there's a theoretical chance that device is re-inserted and bus read will succeed. Or maybe I'm imagining it.
From: Michael S. Tsirkin mst@redhat.com Sent: 28 August 2025 02:53 PM
On Thu, Aug 28, 2025 at 06:59:26AM +0000, Parav Pandit wrote:
From: Michael S. Tsirkin mst@redhat.com Sent: 28 August 2025 12:04 PM
On Thu, Aug 28, 2025 at 06:23:02AM +0000, Parav Pandit wrote:
From: Michael S. Tsirkin mst@redhat.com Sent: 27 August 2025 04:19 PM
On Wed, Aug 27, 2025 at 06:21:28AM -0400, Michael S. Tsirkin wrote:
On Tue, Aug 26, 2025 at 06:52:11PM +0000, Parav Pandit wrote: > > > > If it does not, and a user pull out the working > > > > device, how does your patch help? > > > > > > > A driver must tell that it will not follow broken > > > ancient behaviour and at that > > point device would stop its ancient backward compatibility mode. > > > > > > > > I don't know what is "ancient backward compatibility mode". > > > Let me explain. > Sadly, CSPs virtio pci device implementation is done such a > way that, it
works with ancient Linux kernel which does not have commit 43bb40c5b9265.
OK we are getting new information here.
So let me summarize. There's a virtual system that pretends, to the guest, that device was removed by surprise removal, but actually device is there and is still doing DMA. Is that a fair summary?
Yes.
If that is the case, the thing to do would be to try and detect the fake removal and then work with device as usual - device not doing DMA after removal is pretty fundamental, after all.
The issue is: one can build the device to stop the DMA. There is no predictable combination for the driver and device that can work
for the user.
For example, Device that stops the dma will not work before the commit
43bb40c5b9265.
Device that continues the dma will not work with whatever new
implementation done in future kernels.
Hence the capability negotiation would be needed so that device can stop the
DMA, config interrupts etc.
So this is a broken implementation at the pci level. We really can't fix removal for this device at all, except by fixing the device.
The device to be told how to behave with/without commit 43bb40c5b9265. Not sure what you mean by 'fix the device'.
Users are running stable kernel that has commit 43bb40c5b9265 and its
broken setup for them.
Whatever works, works by chance. Feature negotiation in spec is not the way to fix that, but some work arounds in the driver to skip the device are acceptable, mostly to not bother with it.
Why not? It sounds like we need feature bit like VERSION_1 or ORDER_PLATFORM.
Because the device is out of spec (PCI spec which virtio references).
Besides the bug is not in the device, it's in the pci emulation.
To _fix_ a stable kernel, if you have a suggestion, please suggest.
Pls document exactly how this pci looks. Does it have an id we can use to detect it?
CSPs have different device and vendor id for vnet, blk vfs. Is that what you mean by id?
vendor id is one way, yes. maybe a revision check, too.
Vendor and device id are as defined in virtio spec as ID 0x1AF4 and respective device id.
For example, how about reading device control+status?
Most platforms read 0xffff on non-existing device, but not sure if this the
standard or well defined.
IIRC it's in the pci spec as a note.
Checking.
If we get all ones device has been removed If we get 0 in bus master: device has been removed but re-inserted Anything else is a fake removal
Bus master check may pass, right returning all 1s, even if the device is
removed, isn't it?
So we check all ones 1st, only check bus master if not all ones?
Pci subsystem typically checks the vendor and device ids, and if its not all 1s,
its safe enough check.
How about a fix something like this:
--- a/drivers/virtio/virtio_pci_common.c +++ b/drivers/virtio/virtio_pci_common.c @@ -746,12 +746,16 @@ static void virtio_pci_remove(struct pci_dev *pci_dev) { struct virtio_pci_device *vp_dev = pci_get_drvdata(pci_dev); struct device *dev = get_device(&vp_dev->vdev.dev);
u32 v; /* * Device is marked broken on surprise removal so that virtio upper * layers can abort any ongoing operation.
* Make sure that device is truly removed by directly interacting
* with the device (and not just depend on the slot registers). */
if (!pci_device_is_present(pci_dev))
if (!pci_device_is_present(pci_dev) &&
!pci_bus_read_dev_vendor_id(pci_dev->bus, pci_dev->devfn,
- &v, 0)) virtio_break_device(&vp_dev->vdev);
So if the device is still there, it let it go through its usual cleanup flow. And post this fix, a proper implementation with callback etc that you
described can be implemented.
I don't have a big problem with this, but I don't understand the scenario now again. report_error_detected relies on dev->error_state and bus read. error_state is set on AER reporting an error. This is not what you described.
When pci device is virtually removed from the slot error_state is updated using,
pci_dev_set_disconnected() pci_dev_set_io_state()
Does the patch actually solve the problem for you?
It should. I am going to check if this approach looks fine to you. Please let me know.
Also can we limit this to a specific vendor id, or something like that?
Its spec defined 0x1AF4.
I also still like the idea of reading dev control and status, since it always bothered me that there's a theoretical chance that device is re-inserted and bus read will succeed. Or maybe I'm imagining it.
Re-insertion cannot happen in same slot until the previous slot is properly cleaned up and bus number is not released. User may still attempt to plug in in same (virtual) or physical slot, but it will get different bus assigned as the previous one is not recycled yet because cleanup didnt finish yet.
-- MST
linux-stable-mirror@lists.linaro.org