Commit 414428c5da1c ("PCI: hv: Lock PCI bus on device eject") added pci_lock_rescan_remove() and pci_unlock_rescan_remove() in create_root_hv_pci_bus() and in hv_eject_device_work() to address the race between create_root_hv_pci_bus() and hv_eject_device_work(), but it turns that grabing the pci_rescan_remove_lock mutex is not enough: refer to the earlier fix "PCI: hv: Add a per-bus mutex state_lock".
Now with hbus->state_lock and other fixes, the race is resolved, so remove pci_{lock,unlock}_rescan_remove() in create_root_hv_pci_bus(): this removes the serialization in hv_pci_probe() and hence allows async-probing (PROBE_PREFER_ASYNCHRONOUS) to work.
Add the async-probing flag to hv_pci_drv.
pci_{lock,unlock}_rescan_remove() in hv_eject_device_work() and in hv_pci_remove() are still kept: according to the comment before drivers/pci/probe.c: static DEFINE_MUTEX(pci_rescan_remove_lock), "PCI device removal routines should always be executed under this mutex".
Signed-off-by: Dexuan Cui decui@microsoft.com
Reviewed-by: Michael Kelley mikelley@microsoft.com
Reviewed-by: Long Li longli@microsoft.com
Cc: stable@vger.kernel.org
---

v2:
  No change to the patch body.
  Improved the commit message [Michael Kelley]
  Added Cc:stable

v3:
  Added Michael's and Long Li's Reviewed-by.
  Fixed a typo in the commit message: grubing -> grabing [Thanks, Michael!]

 drivers/pci/controller/pci-hyperv.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index 3ae2f99dea8c2..2ea2b1b8a4c9a 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -2312,12 +2312,16 @@ static int create_root_hv_pci_bus(struct hv_pcibus_device *hbus)
 	if (error)
 		return error;
 
-	pci_lock_rescan_remove();
+	/*
+	 * pci_lock_rescan_remove() and pci_unlock_rescan_remove() are
+	 * unnecessary here, because we hold the hbus->state_lock, meaning
+	 * hv_eject_device_work() and pci_devices_present_work() can't race
+	 * with create_root_hv_pci_bus().
+	 */
 	hv_pci_assign_numa_node(hbus);
 	pci_bus_assign_resources(bridge->bus);
 	hv_pci_assign_slots(hbus);
 	pci_bus_add_devices(bridge->bus);
-	pci_unlock_rescan_remove();
 	hbus->state = hv_pcibus_installed;
 	return 0;
 }
@@ -4003,6 +4007,9 @@ static struct hv_driver hv_pci_drv = {
 	.remove = hv_pci_remove,
 	.suspend = hv_pci_suspend,
 	.resume = hv_pci_resume,
+	.driver = {
+		.probe_type = PROBE_PREFER_ASYNCHRONOUS,
+	},
 };
 
 static void __exit exit_hv_pci_drv(void)
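For readers following the locking argument, here is a minimal userspace sketch of the arrangement the commit message describes. It is not driver code: `model_hbus`, `model_probe()`, `model_eject()` and `model_rescan_remove_lock` are hypothetical stand-ins built on pthread mutexes, and the lock ordering (per-bus lock first, global lock inside it) is an assumption of the model. The probe path takes only the per-bus lock, so probes of different buses do not serialize on a global mutex, while the removal path still takes the global lock around the removal itself.

```c
#include <assert.h>
#include <pthread.h>

/* Userspace model only: every name here is hypothetical, not a kernel API. */
enum model_state { MODEL_INIT, MODEL_INSTALLED };

struct model_hbus {
	pthread_mutex_t state_lock;	/* models hbus->state_lock */
	enum model_state state;		/* models hbus->state */
	int ndevs;			/* child devices currently on the bus */
};

/* Models the global pci_rescan_remove_lock. */
static pthread_mutex_t model_rescan_remove_lock = PTHREAD_MUTEX_INITIALIZER;

static void model_hbus_init(struct model_hbus *hbus)
{
	int rc = pthread_mutex_init(&hbus->state_lock, NULL);

	assert(rc == 0);
	(void)rc;
	hbus->state = MODEL_INIT;
	hbus->ndevs = 0;
}

/*
 * Models the probe path: only the per-bus lock is taken, so probing two
 * different buses never contends on a shared mutex.
 * Returns 0 on success, -1 if the bus was already installed.
 */
static int model_probe(struct model_hbus *hbus, int ndevs)
{
	int ret = -1;

	pthread_mutex_lock(&hbus->state_lock);
	if (hbus->state == MODEL_INIT) {
		hbus->ndevs = ndevs;		/* "add devices" to the bus */
		hbus->state = MODEL_INSTALLED;
		ret = 0;
	}
	pthread_mutex_unlock(&hbus->state_lock);
	return ret;
}

/*
 * Models the eject path: state_lock excludes a concurrent probe of the
 * same bus, and the global lock is still held around the removal.
 * Returns 0 on success, -1 if there is nothing to eject.
 */
static int model_eject(struct model_hbus *hbus)
{
	int ret = -1;

	pthread_mutex_lock(&hbus->state_lock);
	if (hbus->state == MODEL_INSTALLED && hbus->ndevs > 0) {
		pthread_mutex_lock(&model_rescan_remove_lock);
		hbus->ndevs--;			/* "remove" one device */
		pthread_mutex_unlock(&model_rescan_remove_lock);
		ret = 0;
	}
	pthread_mutex_unlock(&hbus->state_lock);
	return ret;
}
```

In this model, an eject that arrives before the probe has finished simply finds the bus not yet installed and returns an error; what the real driver does in that window is outside the scope of the sketch.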
On Wed, Apr 19, 2023 at 07:40:37PM -0700, Dexuan Cui wrote:
Commit 414428c5da1c ("PCI: hv: Lock PCI bus on device eject") added pci_lock_rescan_remove() and pci_unlock_rescan_remove() in create_root_hv_pci_bus() and in hv_eject_device_work() to address the race between create_root_hv_pci_bus() and hv_eject_device_work(), but it turns that grabing the pci_rescan_remove_lock mutex is not enough:
nit: s/grabing/grabbing/g
refer to the earlier fix "PCI: hv: Add a per-bus mutex state_lock".
...
From: Simon Horman simon.horman@corigine.com
Sent: Sunday, April 23, 2023 12:11 PM
To: Dexuan Cui decui@microsoft.com
...
On Wed, Apr 19, 2023 at 07:40:37PM -0700, Dexuan Cui wrote:
Commit 414428c5da1c ("PCI: hv: Lock PCI bus on device eject") added pci_lock_rescan_remove() and pci_unlock_rescan_remove() in create_root_hv_pci_bus() and in hv_eject_device_work() to address the race between create_root_hv_pci_bus() and hv_eject_device_work(), but it turns that grabing the pci_rescan_remove_lock mutex is not enough:
nit: s/grabing/grabbing/g
Thanks for spotting this! I suppose the maintainer(s) would help fix this.
On Wed, Apr 19, 2023 at 07:40:37PM -0700, Dexuan Cui wrote:
Commit 414428c5da1c ("PCI: hv: Lock PCI bus on device eject") added pci_lock_rescan_remove() and pci_unlock_rescan_remove() in create_root_hv_pci_bus() and in hv_eject_device_work() to address the race between create_root_hv_pci_bus() and hv_eject_device_work(), but it turns that grabing the pci_rescan_remove_lock mutex is not enough: refer to the earlier fix "PCI: hv: Add a per-bus mutex state_lock".
This is meaningless for a commit log reader, there is nothing to refer to.
Now with hbus->state_lock and other fixes, the race is resolved, so
"other fixes" is meaningless too.
Explain the problem and how you fix it (this patch should be split because the Subject does not represent what you are doing precisely, see below).
remove pci_{lock,unlock}_rescan_remove() in create_root_hv_pci_bus(): this removes the serialization in hv_pci_probe() and hence allows async-probing (PROBE_PREFER_ASYNCHRONOUS) to work.
Add the async-probing flag to hv_pci_drv.
Adding the asynchronous probing should be a separate patch and I don't think you should send it to stable kernels straight away because a) it is not a fix b) it can trigger further regressions.
pci_{lock,unlock}_rescan_remove() in hv_eject_device_work() and in hv_pci_remove() are still kept: according to the comment before drivers/pci/probe.c: static DEFINE_MUTEX(pci_rescan_remove_lock), "PCI device removal routines should always be executed under this mutex".
This patch should be split, first thing is to fix and document what you are changing for pci_{lock,unlock}_rescan_remove() then add asynchronous probing.
Lorenzo
Signed-off-by: Dexuan Cui decui@microsoft.com
Reviewed-by: Michael Kelley mikelley@microsoft.com
Reviewed-by: Long Li longli@microsoft.com
Cc: stable@vger.kernel.org

v2:
  No change to the patch body.
  Improved the commit message [Michael Kelley]
  Added Cc:stable

v3:
  Added Michael's and Long Li's Reviewed-by.
  Fixed a typo in the commit message: grubing -> grabing [Thanks, Michael!]

 drivers/pci/controller/pci-hyperv.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index 3ae2f99dea8c2..2ea2b1b8a4c9a 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -2312,12 +2312,16 @@ static int create_root_hv_pci_bus(struct hv_pcibus_device *hbus)
 	if (error)
 		return error;
 
-	pci_lock_rescan_remove();
+	/*
+	 * pci_lock_rescan_remove() and pci_unlock_rescan_remove() are
+	 * unnecessary here, because we hold the hbus->state_lock, meaning
+	 * hv_eject_device_work() and pci_devices_present_work() can't race
+	 * with create_root_hv_pci_bus().
+	 */
 	hv_pci_assign_numa_node(hbus);
 	pci_bus_assign_resources(bridge->bus);
 	hv_pci_assign_slots(hbus);
 	pci_bus_add_devices(bridge->bus);
-	pci_unlock_rescan_remove();
 	hbus->state = hv_pcibus_installed;
 	return 0;
 }
@@ -4003,6 +4007,9 @@ static struct hv_driver hv_pci_drv = {
 	.remove = hv_pci_remove,
 	.suspend = hv_pci_suspend,
 	.resume = hv_pci_resume,
+	.driver = {
+		.probe_type = PROBE_PREFER_ASYNCHRONOUS,
+	},
 };
 
 static void __exit exit_hv_pci_drv(void)
--
2.25.1
From: Lorenzo Pieralisi lpieralisi@kernel.org
Sent: Wednesday, May 10, 2023 1:23 AM
To: Dexuan Cui decui@microsoft.com
...
On Wed, Apr 19, 2023 at 07:40:37PM -0700, Dexuan Cui wrote:
Commit 414428c5da1c ("PCI: hv: Lock PCI bus on device eject") added pci_lock_rescan_remove() and pci_unlock_rescan_remove() in create_root_hv_pci_bus() and in hv_eject_device_work() to address the race between create_root_hv_pci_bus() and hv_eject_device_work(), but it turns that grabing the pci_rescan_remove_lock mutex is not enough: refer to the earlier fix "PCI: hv: Add a per-bus mutex state_lock".
This is meaningless for a commit log reader, there is nothing to refer to.
Correct. Patch 5 ([PATCH v3 5/6] PCI: hv: Add a per-bus mutex state_lock) is not in any upstream tree yet, so I don't have a commit ID to refer to.
Now with hbus->state_lock and other fixes, the race is resolved, so
"other fixes" is meaningless too.
Ditto.
Explain the problem and how you fix it (this patch should be split because the Subject does not represent what you are doing precisely, see below).
Ok, I will better explain the boot time issue.
remove pci_{lock,unlock}_rescan_remove() in create_root_hv_pci_bus(): this removes the serialization in hv_pci_probe() and hence allows async-probing (PROBE_PREFER_ASYNCHRONOUS) to work.
Add the async-probing flag to hv_pci_drv.
Adding the asynchronous probing should be a separate patch and I don't think you should send it to stable kernels straight away because a) it is not a fix b) it can trigger further regressions.
Agreed. I'll remove the line "Cc: stable".
pci_{lock,unlock}_rescan_remove() in hv_eject_device_work() and in hv_pci_remove() are still kept: according to the comment before drivers/pci/probe.c: static DEFINE_MUTEX(pci_rescan_remove_lock), "PCI device removal routines should always be executed under this mutex".
This patch should be split, first thing is to fix and document what you are changing for pci_{lock,unlock}_rescan_remove() then add asynchronous probing.
Lorenzo
Ok, I'll split this patch into two.
Thanks for reviewing the patch. Can you please give an "Acked-by" or "Reviewed-by" to patch 1~5 if they look good to you? The first 5 patches have been there for a while, and they already got Michael's Reviewed-by.
I hope the first 5 patches can go through the hyperv-fixes branch in the hyperv tree https://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux.git/log/?h=hype... since they are specific to Hyper-V.
After the first 5 patches are in, I can refer to the commit IDs, and I will split this patch (patch 6).
Thanks, Dexuan
From: Dexuan Cui
Sent: Wednesday, May 10, 2023 10:12 AM
To: Lorenzo Pieralisi lpieralisi@kernel.org
... This patch should be split, first thing is to fix and document what you are changing for pci_{lock,unlock}_rescan_remove() then add asynchronous probing.
Lorenzo
Ok, I'll split this patch into two.
Thanks for reviewing the patch. Can you please give an "Acked-by" or "Reviewed-by" to patch 1~5 if they look good to you? The first 5 patches have been there for a while, and they already got Michael's Reviewed-by.
Hi Lorenzo, Bjorn and all, Ping -- it would be great to have your Acked-by or Reviewed-by for patch 1 to 5.
I hope the first 5 patches can go through the hyperv-fixes branch in the hyperv tree https://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux.git/log/?h=hyperv-fixes since they are specific to Hyper-V.
After the first 5 patches are in, I can refer to the commit IDs, and I will split this patch (patch 6).
Thanks, Dexuan
From: Dexuan Cui decui@microsoft.com
Sent: Tuesday, May 16, 2023 5:03 PM
...
From: Dexuan Cui
Sent: Wednesday, May 10, 2023 10:12 AM
To: Lorenzo Pieralisi lpieralisi@kernel.org
... This patch should be split, first thing is to fix and document what you are changing for pci_{lock,unlock}_rescan_remove() then add asynchronous probing.
Lorenzo
Ok, I'll split this patch into two.
Thanks for reviewing the patch. Can you please give an "Acked-by" or "Reviewed-by" to patch 1~5 if they look good to you? The first 5 patches have been there for a while, and they already got Michael's Reviewed-by.
Hi Lorenzo, Bjorn and all, Ping -- it would be great to have your Acked-by or Reviewed-by for patch 1 to 5.
Gentle ping.
I hope the first 5 patches can go through the hyperv-fixes branch in the hyperv tree https://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux.git/log/?h=hyperv-fixes since they are specific to Hyper-V.
After the first 5 patches are in, I can refer to the commit IDs, and I will split this patch (patch 6).
Thanks, Dexuan