We should select PCI_HYPERV here, otherwise it's possible for devices to not show up as expected, at least not in an orderly manner.
Cc: stable@vger.kernel.org
Cc: Wei Liu <wei.liu@kernel.org>
Signed-off-by: Hamza Mahfooz <hamzamahfooz@linux.microsoft.com>
---
 drivers/hv/Kconfig | 1 +
 1 file changed, 1 insertion(+)
diff --git a/drivers/hv/Kconfig b/drivers/hv/Kconfig
index 862c47b191af..6ee75b3f0fa6 100644
--- a/drivers/hv/Kconfig
+++ b/drivers/hv/Kconfig
@@ -9,6 +9,7 @@ config HYPERV
 	select PARAVIRT
 	select X86_HV_CALLBACK_VECTOR if X86
 	select OF_EARLY_FLATTREE if OF
+	select PCI_HYPERV if PCI
 	help
 	  Select this option to run Linux as a Hyper-V client operating
 	  system.
From: Hamza Mahfooz hamzamahfooz@linux.microsoft.com Sent: Monday, January 27, 2025 10:10 AM
We should select PCI_HYPERV here, otherwise it's possible for devices to not show up as expected, at least not in an orderly manner.
The commit message needs more precision: What does "not show up" mean, and what does "not in an orderly manner" mean? And "it's possible" is vague -- can you be more specific about the conditions? Also, avoid the use of personal pronouns like "we".
But the commit message notwithstanding, I don't think this is a change that should be made. CONFIG_PCI_HYPERV refers to the VMBus device driver for handling vPCI (a.k.a. PCI pass-thru) devices. It's perfectly possible and normal for a VM on Hyper-V to not have any such devices, in which case the driver isn't needed and should not be forced to be included. (See Documentation/virt/hyperv/vpci.rst for more on vPCI devices.)
There are other VMBus device drivers: storvsc, netvsc, the Hyper-V frame buffer driver, the "util" drivers for shutdown, KVP, etc., and more. These each have their own CONFIG_* entry, and current practice doesn't select them when CONFIG_HYPERV is set. I don't see a reason that the vPCI driver should be handled differently.
Also, different distro vendors take different approaches as to whether these drivers are built as modules or built into their kernel images. I'm not sure what the Kconfig tool does when a SELECT statement targets a tri-state symbol. Since CONFIG_HYPERV is tri-state, does the target of the SELECT get the same tri-state value as CONFIG_HYPERV? Again, that may not be what distro vendors want. They may choose to have some of the VMBus drivers built in and others built as modules. Distro vendors (and anyone doing a custom kernel build) should be allowed to make their choices just like for any other drivers.
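For illustration, here's a rough Kconfig sketch of the construct in question (not the actual drivers/hv/Kconfig text). Per Documentation/kbuild/kconfig-language.rst, select is a reverse dependency that forces the lower limit of the selected symbol to the value of the selecting symbol, so presumably CONFIG_HYPERV=y would force CONFIG_PCI_HYPERV=y, and CONFIG_HYPERV=m would force it to at least m:

# Sketch only -- illustrating the select semantics, not quoting the real Kconfig
config HYPERV
	tristate "Microsoft Hyper-V client drivers"
	select PCI_HYPERV if PCI	# lower limit of PCI_HYPERV becomes the value of HYPERV

# Per the documented select semantics: with CONFIG_HYPERV=y and CONFIG_PCI=y,
# PCI_HYPERV is forced to y; with CONFIG_HYPERV=m it is forced to at least m.
# Either way, the user/distro can no longer leave it set to n.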
If you've come across a situation where these considerations don't apply or are problematic, please provide more details. That's what a good commit message should do -- be convincing as to *why* the change should be made! :-)
Michael
On Mon, Jan 27, 2025 at 09:02:22PM +0000, Michael Kelley wrote:
From: Hamza Mahfooz hamzamahfooz@linux.microsoft.com Sent: Monday, January 27, 2025 10:10 AM
We should select PCI_HYPERV here, otherwise it's possible for devices to not show up as expected, at least not in an orderly manner.
The commit message needs more precision: What does "not show up" mean, and what does "not in an orderly manner" mean? And "it's possible" is vague -- can you be more specific about the conditions? Also, avoid the use of personal pronouns like "we".
But the commit message notwithstanding, I don't think this is a change that should be made. CONFIG_PCI_HYPERV refers to the VMBus device driver for handling vPCI (a.k.a. PCI pass-thru) devices. It's perfectly possible and normal for a VM on Hyper-V to not have any such devices, in which case the driver isn't needed and should not be forced to be included. (See Documentation/virt/hyperv/vpci.rst for more on vPCI devices.)
Yeah, we ran into an issue where CONFIG_NVME_CORE=y and CONFIG_PCI_HYPERV=m caused the passed-through SSDs not to show up (i.e., they aren't visible to userspace). I guess it's because PCI_HYPERV has to load before the NVMe stuff for that workload. So, I thought it was reasonable to select PCI_HYPERV here to prevent someone else from shooting themselves in the foot. Though, I guess it really is on the distro guys to get that right.
From: Hamza Mahfooz hamzamahfooz@linux.microsoft.com Sent: Monday, January 27, 2025 1:43 PM
On Mon, Jan 27, 2025 at 09:02:22PM +0000, Michael Kelley wrote:
From: Hamza Mahfooz hamzamahfooz@linux.microsoft.com Sent: Monday, January 27, 2025 10:10 AM
We should select PCI_HYPERV here, otherwise it's possible for devices to not show up as expected, at least not in an orderly manner.
The commit message needs more precision: What does "not show up" mean, and what does "not in an orderly manner" mean? And "it's possible" is vague -- can you be more specific about the conditions? Also, avoid the use of personal pronouns like "we".
But the commit message notwithstanding, I don't think this is a change that should be made. CONFIG_PCI_HYPERV refers to the VMBus device driver for handling vPCI (a.k.a. PCI pass-thru) devices. It's perfectly possible and normal for a VM on Hyper-V to not have any such devices, in which case the driver isn't needed and should not be forced to be included. (See Documentation/virt/hyperv/vpci.rst for more on vPCI devices.)
Yeah, we ran into an issue where CONFIG_NVME_CORE=y and CONFIG_PCI_HYPERV=m caused the passed-through SSDs not to show up (i.e., they aren't visible to userspace). I guess it's because PCI_HYPERV has to load before the NVMe stuff for that workload. So, I thought it was reasonable to select PCI_HYPERV here to prevent someone else from shooting themselves in the foot. Though, I guess it really is on the distro guys to get that right.
Hmmm. By itself, the combination of CONFIG_NVME_CORE=y and CONFIG_PCI_HYPERV=m should not cause a problem for an NVMe data disk. If you are seeing a problem with that combo for NVMe data disks, then maybe something else is going wrong.
However, things are trickier if the NVMe disk is the boot disk with the OS. In that case, that CONFIG_* combination is still OK, but the Hyper-V PCI driver module *must* be included in the initramfs image so that it can be loaded and used when finding and mounting the root file system. The same is true for Hyper-V storvsc when the boot disk is a SCSI disk -- the storvsc driver and the generic SCSI stack must either be built in, or the modules must be included in the initramfs.
The need to have NVME_CORE and the Hyper-V PCI driver available to mount an NVMe root disk is another case where different distros have taken different approaches. Some build them into the kernel image so they don't have to worry about the initramfs, while other distros build them as modules and include them in the initramfs.
Michael
On Mon, Jan 27, 2025 at 04:42:56PM -0500, Hamza Mahfooz wrote:
On Mon, Jan 27, 2025 at 09:02:22PM +0000, Michael Kelley wrote:
From: Hamza Mahfooz hamzamahfooz@linux.microsoft.com Sent: Monday, January 27, 2025 10:10 AM
We should select PCI_HYPERV here, otherwise it's possible for devices to not show up as expected, at least not in an orderly manner.
The commit message needs more precision: What does "not show up" mean, and what does "not in an orderly manner" mean? And "it's possible" is vague -- can you be more specific about the conditions? Also, avoid the use of personal pronouns like "we".
But the commit message notwithstanding, I don't think this is a change that should be made. CONFIG_PCI_HYPERV refers to the VMBus device driver for handling vPCI (a.k.a. PCI pass-thru) devices. It's perfectly possible and normal for a VM on Hyper-V to not have any such devices, in which case the driver isn't needed and should not be forced to be included. (See Documentation/virt/hyperv/vpci.rst for more on vPCI devices.)
Yeah, we ran into an issue where CONFIG_NVME_CORE=y and CONFIG_PCI_HYPERV=m caused the passed-through SSDs not to show up (i.e., they aren't visible to userspace). I guess it's because PCI_HYPERV has to load before the NVMe stuff for that workload. So, I thought it was reasonable to select PCI_HYPERV here to prevent someone else from shooting themselves in the foot. Though, I guess it really is on the distro guys to get that right.
Does inserting the PCI_HYPERV module trigger a (re)scanning of the (v)PCI bus? If so, the passed-through NVMe devices should show up just fine, I suppose.
I agree with Michael that we should not select PCI_HYPERV by default. In some environments, it is not needed at all.
Thanks, Wei.
From: Wei Liu wei.liu@kernel.org Sent: Monday, January 27, 2025 8:10 PM
On Mon, Jan 27, 2025 at 04:42:56PM -0500, Hamza Mahfooz wrote:
On Mon, Jan 27, 2025 at 09:02:22PM +0000, Michael Kelley wrote:
From: Hamza Mahfooz hamzamahfooz@linux.microsoft.com Sent: Monday, January 27, 2025 10:10 AM
We should select PCI_HYPERV here, otherwise it's possible for devices to not show up as expected, at least not in an orderly manner.
The commit message needs more precision: What does "not show up" mean, and what does "not in an orderly manner" mean? And "it's possible" is vague -- can you be more specific about the conditions? Also, avoid the use of personal pronouns like "we".
But the commit message notwithstanding, I don't think this is a change that should be made. CONFIG_PCI_HYPERV refers to the VMBus device driver for handling vPCI (a.k.a. PCI pass-thru) devices. It's perfectly possible and normal for a VM on Hyper-V to not have any such devices, in which case the driver isn't needed and should not be forced to be included. (See Documentation/virt/hyperv/vpci.rst for more on vPCI devices.)
Yeah, we ran into an issue where CONFIG_NVME_CORE=y and CONFIG_PCI_HYPERV=m caused the passed-through SSDs not to show up (i.e., they aren't visible to userspace). I guess it's because PCI_HYPERV has to load before the NVMe stuff for that workload. So, I thought it was reasonable to select PCI_HYPERV here to prevent someone else from shooting themselves in the foot. Though, I guess it really is on the distro guys to get that right.
Does inserting the PCI_HYPERV module trigger a (re)scanning of the (v)PCI bus? If so, the passed-through NVMe devices should show up just fine, I suppose.
vPCI devices are made available to a Hyper-V VM as VMBus offers of class "vPCI". For such an offer, the Linux device subsystem tries to find a matching VMBus driver. If the vPCI driver isn't available, the offer just stays in the VMBus code waiting for a driver. If the vPCI driver later shows up, the Linux device subsystem does the match, and VMBus device probing proceeds. The hv_pci_probe() function in the vPCI driver is called, and all the normal steps are taken so that the vPCI device appears and is functional in the VM. The details of "the normal steps" are in the documentation in Documentation/virt/hyperv/vpci.rst. See Sections "Device Presentation" and "PCI Device Setup". A key point is that each vPCI device is modelled in Linux to be a separate PCI domain with its own host bridge and root PCI bus.
So the outcome is as you describe and would expect, though the mechanism is not a scan of the virtual PCI bus.
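To make the matching mechanism concrete, here's a rough and heavily elided C sketch of how a VMBus driver such as pci-hyperv hooks into it. The shape (an hv_vmbus_device_id table keyed by the device-class GUID, plus vmbus_driver_register()) is the real mechanism, but the names and details below are from memory rather than quoted from the actual driver source:

/* Sketch, not the actual pci-hyperv source: a VMBus driver declares the
 * device-class GUID(s) it handles and registers with the VMBus core. Any
 * "vPCI" offer that arrived earlier and is still unbound gets matched and
 * probed as soon as this registration happens -- e.g. when the module is
 * loaded from the initramfs, or any time later.
 */
#include <linux/module.h>
#include <linux/hyperv.h>

/* PCI Express pass-through device class, as defined in linux/hyperv.h */
static const struct hv_vmbus_device_id hv_pci_id_table[] = {
	{ HV_PCIE_GUID, },
	{ },
};
MODULE_DEVICE_TABLE(vmbus, hv_pci_id_table);

static int hv_pci_probe(struct hv_device *hdev,
			const struct hv_vmbus_device_id *dev_id)
{
	/*
	 * Create a PCI host bridge and root bus for this vPCI device and
	 * enumerate it, so the pass-through device (e.g. an NVMe SSD)
	 * becomes visible to the normal PCI and NVMe stacks.
	 */
	return 0;	/* real setup steps elided in this sketch */
}

static struct hv_driver hv_pci_drv = {
	.name		= "hv_pci",
	.id_table	= hv_pci_id_table,
	.probe		= hv_pci_probe,
};

static int __init init_hv_pci_drv(void)
{
	return vmbus_driver_register(&hv_pci_drv);
}
module_init(init_hv_pci_drv);

static void __exit exit_hv_pci_drv(void)
{
	vmbus_driver_unregister(&hv_pci_drv);
}
module_exit(exit_hv_pci_drv);

MODULE_LICENSE("GPL");

The same registration pattern is what storvsc, netvsc, and the other VMBus drivers use, which is why building any of them as a module works as long as the module is available when its device is needed -- e.g. in the initramfs when it's the path to the root disk.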
Michael