This reverts commit d6af2ed29c7c1c311b96dac989dcb991e90ee195.
The statement "the hv_pci_bus_exit() call releases structures of all its child devices" in commit d6af2ed29c7c is not true: in the path hv_pci_probe() -> hv_pci_enter_d0() -> hv_pci_bus_exit(hdev, true): the parameter "keep_devs" is true, so hv_pci_bus_exit() does *not* release the child "struct hv_pci_dev *hpdev" that is created earlier in pci_devices_present_work() -> new_pcichild_device().
The commit d6af2ed29c7c was originally made in July 2020 for RHEL 7.7, where the old version of hv_pci_bus_exit() was used; when the commit was rebased and merged into the upstream, people didn't notice that it's not really necessary. The commit itself doesn't cause any issue, but it makes hv_pci_probe() more complicated. Revert it to facilitate some upcoming changes to hv_pci_probe().
Signed-off-by: Dexuan Cui decui@microsoft.com Reviewed-by: Michael Kelley mikelley@microsoft.com Acked-by: Wei Hu weh@microsoft.com Cc: stable@vger.kernel.org ---
v2: No change to the patch body. Added Wei Hu's Acked-by. Added Cc:stable
v3: Added Michael's Reviewed-by.
drivers/pci/controller/pci-hyperv.c | 71 ++++++++++++++--------------- 1 file changed, 34 insertions(+), 37 deletions(-)
diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c index 46df6d093d683..48feab095a144 100644 --- a/drivers/pci/controller/pci-hyperv.c +++ b/drivers/pci/controller/pci-hyperv.c @@ -3225,8 +3225,10 @@ static int hv_pci_enter_d0(struct hv_device *hdev) struct pci_bus_d0_entry *d0_entry; struct hv_pci_compl comp_pkt; struct pci_packet *pkt; + bool retry = true; int ret;
+enter_d0_retry: /* * Tell the host that the bus is ready to use, and moved into the * powered-on state. This includes telling the host which region @@ -3253,6 +3255,38 @@ static int hv_pci_enter_d0(struct hv_device *hdev) if (ret) goto exit;
+ /* + * In certain case (Kdump) the pci device of interest was + * not cleanly shut down and resource is still held on host + * side, the host could return invalid device status. + * We need to explicitly request host to release the resource + * and try to enter D0 again. + */ + if (comp_pkt.completion_status < 0 && retry) { + retry = false; + + dev_err(&hdev->device, "Retrying D0 Entry\n"); + + /* + * Hv_pci_bus_exit() calls hv_send_resource_released() + * to free up resources of its child devices. + * In the kdump kernel we need to set the + * wslot_res_allocated to 255 so it scans all child + * devices to release resources allocated in the + * normal kernel before panic happened. + */ + hbus->wslot_res_allocated = 255; + + ret = hv_pci_bus_exit(hdev, true); + + if (ret == 0) { + kfree(pkt); + goto enter_d0_retry; + } + dev_err(&hdev->device, + "Retrying D0 failed with ret %d\n", ret); + } + if (comp_pkt.completion_status < 0) { dev_err(&hdev->device, "PCI Pass-through VSP failed D0 Entry with status %x\n", @@ -3493,7 +3527,6 @@ static int hv_pci_probe(struct hv_device *hdev, struct hv_pcibus_device *hbus; u16 dom_req, dom; char *name; - bool enter_d0_retry = true; int ret;
/* @@ -3633,47 +3666,11 @@ static int hv_pci_probe(struct hv_device *hdev, if (ret) goto free_fwnode;
-retry: ret = hv_pci_query_relations(hdev); if (ret) goto free_irq_domain;
ret = hv_pci_enter_d0(hdev); - /* - * In certain case (Kdump) the pci device of interest was - * not cleanly shut down and resource is still held on host - * side, the host could return invalid device status. - * We need to explicitly request host to release the resource - * and try to enter D0 again. - * Since the hv_pci_bus_exit() call releases structures - * of all its child devices, we need to start the retry from - * hv_pci_query_relations() call, requesting host to send - * the synchronous child device relations message before this - * information is needed in hv_send_resources_allocated() - * call later. - */ - if (ret == -EPROTO && enter_d0_retry) { - enter_d0_retry = false; - - dev_err(&hdev->device, "Retrying D0 Entry\n"); - - /* - * Hv_pci_bus_exit() calls hv_send_resources_released() - * to free up resources of its child devices. - * In the kdump kernel we need to set the - * wslot_res_allocated to 255 so it scans all child - * devices to release resources allocated in the - * normal kernel before panic happened. - */ - hbus->wslot_res_allocated = 255; - ret = hv_pci_bus_exit(hdev, true); - - if (ret == 0) - goto retry; - - dev_err(&hdev->device, - "Retrying D0 failed with ret %d\n", ret); - } if (ret) goto free_irq_domain;
On Wed, Apr 19, 2023 at 07:40:35PM -0700, Dexuan Cui wrote:
This reverts commit d6af2ed29c7c1c311b96dac989dcb991e90ee195.
The statement "the hv_pci_bus_exit() call releases structures of all its child devices" in commit d6af2ed29c7c is not true: in the path hv_pci_probe() -> hv_pci_enter_d0() -> hv_pci_bus_exit(hdev, true): the parameter "keep_devs" is true, so hv_pci_bus_exit() does *not* release the child "struct hv_pci_dev *hpdev" that is created earlier in pci_devices_present_work() -> new_pcichild_device().
The commit d6af2ed29c7c was originally made in July 2020 for RHEL 7.7, where the old version of hv_pci_bus_exit() was used; when the commit was rebased and merged into the upstream, people didn't notice that it's not really necessary. The commit itself doesn't cause any issue, but it makes hv_pci_probe() more complicated. Revert it to facilitate some upcoming changes to hv_pci_probe().
If d6af2ed29c7c does not cause any issue this is not a fix and should be merged only with subsequent changes.
Signed-off-by: Dexuan Cui decui@microsoft.com Reviewed-by: Michael Kelley mikelley@microsoft.com Acked-by: Wei Hu weh@microsoft.com Cc: stable@vger.kernel.org
v2: No change to the patch body. Added Wei Hu's Acked-by. Added Cc:stable
v3: Added Michael's Reviewed-by.
drivers/pci/controller/pci-hyperv.c | 71 ++++++++++++++--------------- 1 file changed, 34 insertions(+), 37 deletions(-)
diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c index 46df6d093d683..48feab095a144 100644 --- a/drivers/pci/controller/pci-hyperv.c +++ b/drivers/pci/controller/pci-hyperv.c @@ -3225,8 +3225,10 @@ static int hv_pci_enter_d0(struct hv_device *hdev) struct pci_bus_d0_entry *d0_entry; struct hv_pci_compl comp_pkt; struct pci_packet *pkt;
- bool retry = true; int ret;
+enter_d0_retry: /* * Tell the host that the bus is ready to use, and moved into the * powered-on state. This includes telling the host which region @@ -3253,6 +3255,38 @@ static int hv_pci_enter_d0(struct hv_device *hdev) if (ret) goto exit;
- /*
* In certain case (Kdump) the pci device of interest was
* not cleanly shut down and resource is still held on host
* side, the host could return invalid device status.
* We need to explicitly request host to release the resource
* and try to enter D0 again.
*/
- if (comp_pkt.completion_status < 0 && retry) {
retry = false;
dev_err(&hdev->device, "Retrying D0 Entry\n");
/*
* Hv_pci_bus_exit() calls hv_send_resource_released()
* to free up resources of its child devices.
* In the kdump kernel we need to set the
* wslot_res_allocated to 255 so it scans all child
* devices to release resources allocated in the
* normal kernel before panic happened.
*/
hbus->wslot_res_allocated = 255;
ret = hv_pci_bus_exit(hdev, true);
if (ret == 0) {
kfree(pkt);
goto enter_d0_retry;
}
dev_err(&hdev->device,
"Retrying D0 failed with ret %d\n", ret);
- }
- if (comp_pkt.completion_status < 0) { dev_err(&hdev->device, "PCI Pass-through VSP failed D0 Entry with status %x\n",
@@ -3493,7 +3527,6 @@ static int hv_pci_probe(struct hv_device *hdev, struct hv_pcibus_device *hbus; u16 dom_req, dom; char *name;
- bool enter_d0_retry = true; int ret;
/* @@ -3633,47 +3666,11 @@ static int hv_pci_probe(struct hv_device *hdev, if (ret) goto free_fwnode; -retry: ret = hv_pci_query_relations(hdev); if (ret) goto free_irq_domain; ret = hv_pci_enter_d0(hdev);
- /*
* In certain case (Kdump) the pci device of interest was
* not cleanly shut down and resource is still held on host
* side, the host could return invalid device status.
* We need to explicitly request host to release the resource
* and try to enter D0 again.
* Since the hv_pci_bus_exit() call releases structures
* of all its child devices, we need to start the retry from
* hv_pci_query_relations() call, requesting host to send
* the synchronous child device relations message before this
* information is needed in hv_send_resources_allocated()
* call later.
*/
- if (ret == -EPROTO && enter_d0_retry) {
enter_d0_retry = false;
dev_err(&hdev->device, "Retrying D0 Entry\n");
/*
* Hv_pci_bus_exit() calls hv_send_resources_released()
* to free up resources of its child devices.
* In the kdump kernel we need to set the
* wslot_res_allocated to 255 so it scans all child
* devices to release resources allocated in the
* normal kernel before panic happened.
*/
hbus->wslot_res_allocated = 255;
ret = hv_pci_bus_exit(hdev, true);
if (ret == 0)
goto retry;
dev_err(&hdev->device,
"Retrying D0 failed with ret %d\n", ret);
- } if (ret) goto free_irq_domain;
2.25.1
From: Lorenzo Pieralisi lpieralisi@kernel.org Sent: Thursday, May 25, 2023 1:23 AM ... On Wed, Apr 19, 2023 at 07:40:35PM -0700, Dexuan Cui wrote:
This reverts commit d6af2ed29c7c1c311b96dac989dcb991e90ee195.
The statement "the hv_pci_bus_exit() call releases structures of all its child devices" in commit d6af2ed29c7c is not true: in the path hv_pci_probe() -> hv_pci_enter_d0() -> hv_pci_bus_exit(hdev, true): the parameter "keep_devs" is true, so hv_pci_bus_exit() does *not* release the child "struct hv_pci_dev *hpdev" that is created earlier in pci_devices_present_work() -> new_pcichild_device().
The commit d6af2ed29c7c was originally made in July 2020 for RHEL 7.7, where the old version of hv_pci_bus_exit() was used; when the commit was rebased and merged into the upstream, people didn't notice that it's not really necessary. The commit itself doesn't cause any issue, but it makes hv_pci_probe() more complicated. Revert it to facilitate some upcoming changes to hv_pci_probe().
If d6af2ed29c7c does not cause any issue this is not a fix and should be merged only with subsequent changes.
d6af2ed29c7c does not cause any functional issue, but it makes the code less readable, and so I'd like to not merge this patch with patch 5 -- this way people can easily know what the real change is in patch 5.
linux-stable-mirror@lists.linaro.org